What does "Position in a stream" mean in C#?

I have trouble understanding the phrase "position of a stream". My question is related to the stream method Seek(); it is kind of confusing to me what this method does. Its stated purpose is to set the position of the stream to a given value, yet its name describes a seek operation, not a set operation. Does anyone understand clearly what these two terms mean and how they work together? Thanks.

Think about a file as a sequence of bytes, and a stream as a view over that sequence, with a cursor marking the current position - so as you read data, the cursor is advanced. The Position property is simply the position of that cursor. So when you open a stream it's typically at 0, and as you read it increases. For seekable streams, you can "rewind" to the start of the stream with
stream.Position = 0;
or maybe skip 10 bytes using:
stream.Position += 10;
etc.
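The cursor behaviour described above can be sketched in a few lines. This is only an illustration: it uses a MemoryStream over made-up bytes (an assumption, chosen so the example needs no file on disk) and C# top-level statements; any seekable Stream should behave the same way.

```csharp
using System;
using System.IO;

// Five arbitrary bytes, purely for illustration.
using var stream = new MemoryStream(new byte[] { 10, 20, 30, 40, 50 });

Console.WriteLine(stream.Position);   // 0: a freshly opened stream starts at the beginning

stream.ReadByte();
stream.ReadByte();
Console.WriteLine(stream.Position);   // 2: each byte read advances the cursor

stream.Position = 0;                  // "rewind" to the start
Console.WriteLine(stream.ReadByte()); // 10: the first byte again, cursor is now at 1

stream.Position += 3;                 // skip three bytes: cursor moves from 1 to 4
Console.WriteLine(stream.ReadByte()); // 50: the byte at index 4
```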

A stream is basically a sequence of bytes; the position is the point in that sequence that the stream is currently at.
The point of Seek is to "jump" to a location in the stream, a specific index (aka Position), similar to the seeks done by a hard drive. With Seek you specify an origin to measure from (the beginning, the current position, or the end) and an offset, i.e. how many bytes to "jump" from that origin.

The following two statements do exactly the same thing:
s.Position = 100;
s.Seek(100, SeekOrigin.Begin);
And they both determine the position (as a byte count) where the next Read or Write will occur.
The Seek() name is very old: it goes back to C's fseek and Unix's lseek.

I actually agree that, at first glance, Seek is not the best name for what it does. SeekAndSet or SeekAndMove would make more sense to me because that is what the method does - it seeks the position you want in the stream and sets the cursor to that position.
However, when you think of Seek in Computer Science terms in relation to hard disk drives it becomes obvious what this method does. It seeks and moves to the position.

Related

How to move cursor back 1 reading with FileStream?

I read a FileStream byte by byte, using ReadByte.
I always need to check the next byte. I would like to be able, if I read a certain byte, to go back one byte.
The reason is that when I meet this byte, I need to pass the FileStream to another function, and that function needs to start reading at the previous position (back one byte).
How can I achieve this?
Indeed I searched https://www.bing.com/search?q=c%23+change+position+to+previous+stream+site%3astackoverflow.com but all questions suggest using Seek(offset, Beginning). Some user suggested a duplicate which shows how to use .Seek(0, SeekOrigin.Begin); - that is definitely what I want. I need to seek to the current position (for which I found a plausible method by searching for "C# filestream position": FileStream.Position) decreased by one.
There is the Seek method to set the stream to a given position. You can get the current position of the stream with the Position property.
It should be something like this:
fileStream.Seek(fileStream.Position - 1, SeekOrigin.Begin);
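A relative seek expresses the same "go back one byte" step without reading Position first. The sketch below is an illustration under assumptions: a MemoryStream stands in for the FileStream so the example is self-contained, and the marker value 0xFF is hypothetical.

```csharp
using System;
using System.IO;

const byte Marker = 0xFF; // hypothetical "special" byte that triggers the hand-off

using var stream = new MemoryStream(new byte[] { 1, 2, 0xFF, 4 });

int b;
while ((b = stream.ReadByte()) != -1)
{
    if (b == Marker)
    {
        // Step back one byte so the next reader sees the marker again.
        stream.Seek(-1, SeekOrigin.Current);
        break;
    }
}

Console.WriteLine(stream.Position);   // 2: the cursor sits just before the marker
Console.WriteLine(stream.ReadByte()); // 255: the marker is read again
```

The stream could now be passed to another method, which would start reading at the marker byte.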

position in a file stream

This is probably a silly question, but of the two following ideas, which is conceptually correct when dealing with streams:
1) position is between characters
(pos0)byte0(pos1)byte1(pos2)byte2
2) position is on the character
(pos0/byte0)(pos1/byte1)(pos2/byte2)
thanks
The position is before the byte you read. If the position is 0 and you read one byte, you've read the first byte. If you set the position to the stream size, you can't read anything, since there is nothing after it; you can only append to the file.
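The "position sits before the byte" model can be checked with a short experiment. This is a sketch using a MemoryStream and made-up byte values (both assumptions made to keep it self-contained); ReadByte returns -1 when there is nothing left to read.

```csharp
using System;
using System.IO;

// An expandable in-memory stream holding three arbitrary bytes.
using var stream = new MemoryStream();
stream.Write(new byte[] { 0xA0, 0xA1, 0xA2 }, 0, 3);

stream.Position = 1;                   // cursor sits between byte0 and byte1
Console.WriteLine(stream.ReadByte());  // 161 (0xA1): the byte right after the cursor
Console.WriteLine(stream.Position);    // 2: the cursor moved past the byte just read

stream.Position = stream.Length;       // cursor after the last byte
Console.WriteLine(stream.ReadByte());  // -1: nothing left to read

stream.WriteByte(0xA3);                // writing here appends
Console.WriteLine(stream.Length);      // 4
```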

Streams: Why use Seek(0L, SeekOrigin.Begin) instead of Position = 0, or vice versa

Could anyone please explain to me the differences, if any?
I tried to Google it but couldn't find much information. Maybe I didn't use correct keywords.
Any insight would be greatly appreciated.
stream.Seek(x, SeekOrigin.Begin); and stream.Position = x; both result in the stream position being set to x. The difference, at least in FileStream as implemented in the .NET Framework, is that the Position setter unconditionally discards any read buffer, while the Seek method attempts to retain the part of the buffer that is still relevant to the new position.
You'll have to test which one is faster for your scenario; there is definitely a performance difference, and neither is faster in all cases. I really wonder why this difference isn't documented.
In your example there is no difference.
The actual difference between Stream.Position and Stream.Seek is that Position uses an absolute offset whereas Seek uses an offset relative to the origin specified by the second argument.
As far as I can tell, at least for this specific case, nothing.
Both the Seek() method and the Position property require CanSeek to be true, so from what I see it's up to the implementer.
Seek is really there to let you move to an offset measured from a specified location (a SeekOrigin); the examples given on MSDN are somewhat convoluted but representative of the purpose: http://msdn.microsoft.com/en-us/library/system.io.filestream.seek.aspx.
Position is absolute and is obviously not meant for relative moves.
The case you mentioned just happens to be equivalent.
Personally, I'd use .Position = 0 to move to the beginning of the stream as that reads cleaner to me than "Seek using the beginning of the file as an origin and move this 0 offset of bytes."
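To make the absolute-versus-relative distinction concrete, here is a small sketch. The 10-byte MemoryStream is an assumption chosen for illustration; Seek returns the new absolute position, which makes the effect of each origin easy to see.

```csharp
using System;
using System.IO;

using var stream = new MemoryStream(new byte[10]); // 10 zero bytes

// Absolute positioning: these two lines end up in the same place.
stream.Position = 4;
long viaSeek = stream.Seek(4, SeekOrigin.Begin);
Console.WriteLine(viaSeek);                            // 4

// Relative positioning is only expressible through Seek's origin argument.
Console.WriteLine(stream.Seek(3, SeekOrigin.Current)); // 7: three bytes past the cursor
Console.WriteLine(stream.Seek(-2, SeekOrigin.End));    // 8: two bytes before the end

// Rewinding: both forms leave the cursor at 0.
stream.Position = 0;
Console.WriteLine(stream.Seek(0, SeekOrigin.Begin));   // 0
```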

Stitching together multiple streams in one Stream class

I want to make a class (let's call the class HugeStream) that takes an IEnumerable<Stream> in its constructor. This HugeStream should implement the Stream abstract class.
Basically, I have one or more UTF-8 streams coming from a DB that, when put together, make a gigantic XML document. The HugeStream needs to be file-backed so that I can seek back to position 0 of the whole stitched-together stream at any time.
Anyone know how to make a speedy implementation of this?
I saw something similar created at this page but it does not seem optimal for handling large numbers of large streams. Efficiency is the key.
On a side note, I'm having trouble visualizing Streams and am a little confused now that I need to implement my own Stream. If there's a good tutorial on implementing the Stream class that anyone knows of, please let me know; I haven't found any good articles browsing around. I just see a lot of articles on using already-existing FileStreams and MemoryStreams. I'm a very visual learner and for some reason can't find anything useful to study the concept.
Thanks,
Matt
If you only read data sequentially from the HugeStream, then it simply needs to read each child stream (and append it into a local file as well as returning the read data to the caller) until the child-stream is exhausted, then move on to the next child-stream. If a Seek operation is used to jump "backwards" in the data, you must start reading from the local cache file; when you reach the end of the cache file, you must resume reading the current child stream where you left off.
So far, this is all pretty straight-forward to implement - you just need to indirect the Read calls to the appropriate stream, and switch streams as each one runs out of data.
The inefficiency of the quoted article is that it runs through all the streams every time you read to work out where to continue reading from. To improve on this, you need to open the child streams only as you need them, and keep track of the currently-open stream so you can just keep reading more data from that current stream until it is exhausted. Then open the next stream as your "current" stream and carry on. This is pretty straight-forward, as you have a linear sequence of streams, so you just step through them one by one. i.e. something like:
int currentStreamIndex = 0;
Stream currentStream = childStreams[currentStreamIndex++];
...
public override int Read(byte[] buffer, int offset, int count)
{
    int totalBytesRead = 0;
    while (count > 0)
    {
        // Read what we can from the current stream
        int numBytesRead = currentStream.Read(buffer, offset, count);
        if (numBytesRead > 0)
        {
            totalBytesRead += numBytesRead;
            count -= numBytesRead;
            offset += numBytesRead;
            continue;
        }
        // A return value of 0 means the current child stream is exhausted.
        // If we have run out of child streams, we're at the end of the HugeStream, and there is no more data to read
        if (currentStreamIndex >= numberOfChildStreams)
            break;
        // Otherwise, close the current child stream and open the next one
        currentStream.Close();
        currentStream = childStreams[currentStreamIndex++];
    }
    // Here, you'd write the data you've just read (into buffer) to your local cache stream
    return totalBytesRead;
}
To allow seeking backwards, you just have to introduce a new local file stream that you copy all the data into as you read (see the comment in my pseudocode above). You need to introduce a state so you know that you are reading from the cache file rather than the current child stream, and then just directly access the cache (seeking etc is easy because the cache represents the entire history of the data read from the HugeStream, so the seek offsets are identical between the HugeStream and the Cache - you simply have to redirect any Read calls to get the data out of the cache stream)
If you read or seek back to the end of the cache stream, you need to resume reading data from the current child stream. Just go back to the logic above and continue appending data to your cache stream.
If you wish to be able to support full random access within the HugeStream you will need to support seeking "forwards" (beyond the current end of the cache stream). If you don't know the lengths of the child streams beforehand, you have no choice but to simply keep reading data into your cache until you reach the seek offset. If you know the sizes of all the streams, then you could seek directly and more efficiently to the right place, but you will then have to devise an efficient means for storing the data you read to the cache file and recording which parts of the cache file contain valid data and which have not actually been read from the DB yet - this is a bit more advanced.
I hope that makes sense to you and gives you a better idea of how to proceed...
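(Read and Seek carry the real logic here. Stream's other abstract members, such as CanRead, CanSeek, CanWrite, Length, Position, Flush, SetLength and Write, still have to be overridden, but they can be trivial or throw NotSupportedException.)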
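Putting the pieces of this answer together, a minimal sketch of such a class might look like the following. It is an illustration under several assumptions, not a tuned implementation: the cache is any seekable, writable Stream supplied by the caller (a temp-file FileStream in the file-backed case), the field and helper names are invented for the example, the total length is treated as unknown (so Length and SeekOrigin.End are unsupported), forward seeks simply read ahead into the cache, and Dispose handling is omitted for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch: stitches child streams together and mirrors everything
// read so far into a seekable cache stream.
public sealed class HugeStream : Stream
{
    private readonly IEnumerator<Stream> _children; // child streams, consumed lazily
    private readonly Stream _cache;                 // seekable cache, e.g. a temp FileStream
    private Stream _current;                        // child currently being read, or null
    private long _position;                         // logical position in the stitched stream

    public HugeStream(IEnumerable<Stream> children, Stream cache)
    {
        _children = children.GetEnumerator();
        _cache = cache;
    }

    public override bool CanRead => true;
    public override bool CanSeek => true;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => _position;
        set => Seek(value, SeekOrigin.Begin);
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int n;
        if (_position < _cache.Length)
        {
            // Data we have already seen: serve it from the cache.
            _cache.Position = _position;
            n = _cache.Read(buffer, offset, count);
        }
        else
        {
            // New data: pull it from the children and append it to the cache.
            n = ReadFromChildren(buffer, offset, count);
            _cache.Position = _cache.Length;
            _cache.Write(buffer, offset, n);
        }
        _position += n;
        return n;
    }

    private int ReadFromChildren(byte[] buffer, int offset, int count)
    {
        while (count > 0)
        {
            if (_current == null)
            {
                if (!_children.MoveNext()) return 0; // no children left: end of stream
                _current = _children.Current;
            }
            int n = _current.Read(buffer, offset, count);
            if (n > 0) return n;
            _current.Dispose(); // child exhausted, move on to the next one
            _current = null;
        }
        return 0;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        long target = origin switch
        {
            SeekOrigin.Begin => offset,
            SeekOrigin.Current => _position + offset,
            _ => throw new NotSupportedException("Total length is unknown."),
        };
        if (target < 0) throw new IOException("Cannot seek before the start.");

        // Backward seeks land inside the cache; forward seeks past the cache
        // read (and cache) data until the target is reached or the data ends.
        _position = Math.Min(target, _cache.Length);
        var scratch = new byte[4096];
        while (_position < target)
        {
            int n = Read(scratch, 0, (int)Math.Min(scratch.Length, target - _position));
            if (n == 0) break;
        }
        return _position;
    }

    public override void Flush() { }
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

A caller would construct it with the child streams and a cache, read through it once, and could then set Position = 0 to replay the same bytes from the cache without touching the DB again.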

memorystream question

After writing the XML document into the memory stream, I have to set the position back to 0 before I can use it with XmlDocument.Load.
I am wondering if there is any standard way to do it?
Well the simplest way is just:
stream.Position = 0;
I'm not sure what you're after beyond that. You can use the Seek method, but personally I find the Position property to be far simpler.
Do you definitely need to go via a stream in the first place? If you've already got the XmlDocument, why not just use that?
That's pretty much how you have to do it. The position must be set back to 0, because after writing the document into the stream, the stream is positioned at the end, ready to append more data. Setting the position to 0 effectively "rewinds" the stream, so that you will read it back in from the beginning.
This is a normal and expected usage pattern, if you are doing something like this anyway.
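The write-then-rewind pattern might look like this in practice. This is a sketch with a made-up XML snippet; it uses XmlDocument.Save(Stream) and XmlDocument.Load(Stream) with C# top-level statements.

```csharp
using System;
using System.IO;
using System.Xml;

// A small, made-up document to round-trip through the stream.
var source = new XmlDocument();
source.LoadXml("<root><item>1</item></root>");

using var stream = new MemoryStream();
source.Save(stream);   // after writing, the cursor is at the end of the stream

stream.Position = 0;   // rewind so that Load starts from the first byte

var copy = new XmlDocument();
copy.Load(stream);
Console.WriteLine(copy.DocumentElement.Name); // root
```

Without the rewind, Load would start reading at the end of the stream and find no document there.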
