I was searching for a BinaryReader.Skip function when I came across this feature request on MSDN.
He said you can provide your own BinaryReader.Skip() function by using this.
Only looking at this code, I'm wondering why he chose this way to skip a certain amount of bytes:
for (int i = 0; i < count; i++) {
    reader.ReadByte();
}
Is there a difference between that and:
reader.ReadBytes(count);
Even if it's just a small optimization, I'd like to understand it, because right now it doesn't make sense to me why you would use the for loop.
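// Extension method: must be declared inside a static class.
public static void Skip(this BinaryReader reader, int count) {
    if (reader.BaseStream.CanSeek) {
        // Seekable stream: just move the position forward.
        reader.BaseStream.Seek(count, SeekOrigin.Current);
    }
    else {
        // Non-seekable stream: read and discard one byte at a time.
        for (int i = 0; i < count; i++) {
            reader.ReadByte();
        }
    }
}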
No, there is no difference. EDIT: Assuming that the stream has enough bytes.
The ReadByte method simply forwards to the underlying Stream's ReadByte method.
The ReadBytes method calls the underlying stream's Read until it reads the required number of bytes.
It's defined like this:
public virtual byte[] ReadBytes(int count) {
if (count < 0) throw new ArgumentOutOfRangeException("count", Environment.GetResourceString("ArgumentOutOfRange_NeedNonNegNum"));
Contract.Ensures(Contract.Result<byte[]>() != null);
Contract.Ensures(Contract.Result<byte[]>().Length <= Contract.OldValue(count));
Contract.EndContractBlock();
if (m_stream==null) __Error.FileNotOpen();
byte[] result = new byte[count];
int numRead = 0;
do {
int n = m_stream.Read(result, numRead, count);
if (n == 0)
break;
numRead += n;
count -= n;
} while (count > 0);
if (numRead != result.Length) {
// Trim array. This should happen on EOF & possibly net streams.
byte[] copy = new byte[numRead];
Buffer.InternalBlockCopy(result, 0, copy, 0, numRead);
result = copy;
}
return result;
}
For most streams, ReadBytes will probably be faster.
ReadByte will throw an EndOfStreamException if the end of the stream is reached, whereas ReadBytes will not. It depends on whether you want Skip to throw if it cannot skip the requested number of bytes without reaching the end of the stream.
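To see that difference concretely, here is a minimal sketch that runs both approaches against a short MemoryStream. The class and method names (EndOfStreamDemo, ReadBytesCount, ReadByteLoopThrows) are made up for illustration and are not part of any library.

```csharp
using System;
using System.IO;

public static class EndOfStreamDemo
{
    // Asks ReadBytes for `requested` bytes and reports how many actually came back.
    public static int ReadBytesCount(byte[] data, int requested)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            return reader.ReadBytes(requested).Length;
        }
    }

    // Calls ReadByte `requested` times and reports whether EndOfStreamException was thrown.
    public static bool ReadByteLoopThrows(byte[] data, int requested)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            try
            {
                for (int i = 0; i < requested; i++)
                {
                    reader.ReadByte();
                }
                return false;
            }
            catch (EndOfStreamException)
            {
                return true;
            }
        }
    }
}
```

With a 3-byte stream and a request for 10 bytes, ReadBytesCount quietly returns 3, while ReadByteLoopThrows reports that the loop threw.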
ReadBytes is faster than multiple ReadByte calls.
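If you want the bulk-read speed but still want Skip to fail loudly at the end of the stream, one option is to read into a small scratch buffer. This is only a sketch: SkipBytes and the 4096-byte scratch size are my own choices, not something from the feature request.

```csharp
using System;
using System.IO;

public static class BinaryReaderSkipSketch
{
    // Discards `count` bytes from the reader; throws if the stream ends first.
    public static void SkipBytes(BinaryReader reader, int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));

        // Reused scratch buffer, so skipping a large range does not allocate a large array.
        var scratch = new byte[Math.Min(count, 4096)];
        int remaining = count;
        while (remaining > 0)
        {
            int read = reader.Read(scratch, 0, Math.Min(remaining, scratch.Length));
            if (read == 0) throw new EndOfStreamException();
            remaining -= read;
        }
    }
}
```

Unlike reader.ReadBytes(count), this never allocates more than the scratch buffer, and unlike the ReadByte loop it makes one call per chunk rather than one per byte.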
It's a very small optimization which will occasionally skip bytes (rather than reading them with ReadByte). Think of it this way:
if(vowel)
{
println(vowel);
}
else
{
nextLetter();
}
If you can prevent that extra function call, you save a little runtime.
Related
I currently have a memorystream with length of about 30000 (Named memStream here)
I wished to read this memorystream in chunks using the following code (I picked up on the net and modified somewhat):
byte[] chunk = new byte[4096];
bool hasNext = true;
while(hasNext)
{
int index = 0;
while (index < chunk.Length)
{
int bytesRead = memStream.Read(chunk, index, chunk.Length - index);
if (bytesRead == 0)
{
break;
}
index += bytesRead;
//Do something with this chunk
}
if (index != 0) // Our previous chunk may have been the last one
{
//Do something with the last chunk
}
if (index != chunk.Length) // We didn't read a full chunk: we're done
{
hasNext = false;
}
}
Yet the following Read() call doesn't appear to be working:
int bytesRead = memStream.Read(chunk, index, chunk.Length - index);
WHERE
chunk: new byte[4096]
index: 0
memstream: capacity & length : 34272
memstream: position 0 (according to VS watch)
Always returns
0 bytesRead
Chunk with all values containing '0'
Any idea why? Could this be a permissions issue?
Thank you for your time.
After creating and filling the MemoryStream, you need to set the read position back to the beginning, like so:
memStream.Seek(0, SeekOrigin.Begin);
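A small sketch of why the rewind matters, assuming the stream was filled with Write just before reading. The names MemoryStreamRewindDemo and ReadAfterWrite are hypothetical.

```csharp
using System;
using System.IO;

public static class MemoryStreamRewindDemo
{
    // Writes `payload` into a fresh MemoryStream, optionally rewinds it,
    // then returns how many bytes a single Read call delivers.
    public static int ReadAfterWrite(byte[] payload, bool rewind)
    {
        using (var memStream = new MemoryStream())
        {
            memStream.Write(payload, 0, payload.Length); // Position is now at the end

            if (rewind)
            {
                memStream.Seek(0, SeekOrigin.Begin);
            }

            var chunk = new byte[4096];
            return memStream.Read(chunk, 0, chunk.Length);
        }
    }
}
```

Without the rewind, Read starts at the end of the data and returns 0; with it, the written bytes come back.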
Here is the method I like to use. I believe there is nothing new in this code.
public static byte[] ReadFully(Stream stream, int initialLength)
{
// If we've been passed an unhelpful initial length, just
// use 1K.
if (initialLength < 1)
{
initialLength = 1024;
}
byte[] buffer = new byte[initialLength];
int read = 0;
int chunk;
while ((chunk = stream.Read(buffer, read, buffer.Length - read)) > 0)
{
read += chunk;
// If we've reached the end of our buffer, check to see if there's
// any more information
if (read == buffer.Length)
{
int nextByte = stream.ReadByte();
// End of stream? If so, we're done
if (nextByte == -1)
{
return buffer;
}
// Nope. Resize the buffer, put in the byte we've just
// read, and continue
byte[] newBuffer = new byte[buffer.Length * 2];
Array.Copy(buffer, newBuffer, buffer.Length);
newBuffer[read] = (byte)nextByte;
buffer = newBuffer;
read++;
}
}
// Buffer is now too big. Shrink it.
byte[] ret = new byte[read];
Array.Copy(buffer, ret, read);
return ret;
}
My goal is to read data sent from TCP Clients e.g. box{"id":1,"aid":1}
It is a command for my application to interpret, written as JSON-like text.
And this text is not necessarily at the same size each time.
Next time there can be run{"id":1,"aid":1,"opt":1}.
The method is called by this line:
var serializedMessageBytes = ReadFully(_receiveMemoryStream, 1024);
(Screenshot: received data in receiveMemoryStream.)
Although we can see the data in the stream,
in the ReadFully method, "chunk" is always 0 and the method returns {byte[0]}.
Any help is greatly appreciated.
Looking at your stream in the Watch window, the Position of the stream (19) is at the end of the data, hence there is nothing left to read. This is possibly because you have just written data to the stream and have not subsequently reset the position.
Add a stream.Position = 0; or stream.Seek(0, System.IO.SeekOrigin.Begin); statement at the start of the function if you are happy to always read from the start of the stream, or check the code that populates the stream. Note though that some stream implementations do not support seeking.
My goal is to have a file stream open a user-chosen file and then stream the file's bytes through in chunks (buffers) of about 4 MB (this can be changed; it's just for fun). As the bytes travel (in chunks) through the stream, I'd like a loop to check whether each byte's value is contained in an array I have declared elsewhere. (The code below builds a random array for replacing bytes, and the replacement loop could look something like the bottom for-loop.) As you can see, I'm fairly fluent in this language, but for some reason the editing and rewriting of chunks as they are read from one file into a new one is eluding me. Thanks in advance!
private void button2_Click(object sender, EventArgs e)
{
GenNewKey();
const int chunkSize = 4096; // read the file by chunks of 4KB
using (var file = File.OpenRead(textBox1.Text))
{
int bytesRead;
var buffer = new byte[chunkSize];
while ((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0)
{
byte[] newbytes = buffer;
int index = 0;
foreach (byte b in buffer)
{
for (int x = 0; x < 256; x++)
{
if (buffer[index] == Convert.ToByte(lst[x]))
{
try
{
newbytes[index] = Convert.ToByte(lst[256 - x]);
}
catch (System.Exception ex)
{
//just to show why the error was thrown, but not really helpful..
MessageBox.Show(index + ", " + newbytes.Count().ToString());
}
}
}
index++;
}
AppendAllBytes(textBox1.Text + ".ENC", newbytes);
}
}
}
private void GenNewKey()
{
Random rnd = new Random();
while (lst.Count < 256)
{
int x = rnd.Next(0, 255);
if (!lst.Contains(x))
{
lst.Add(x);
}
}
foreach (int x in lst)
{
textBox2.Text += ", " + x.ToString();
//just for me to see what was generated
}
}
public static void AppendAllBytes(string path, byte[] bytes)
{
if (!File.Exists(path + ".ENC"))
{
File.Create(path + ".ENC");
}
using (var stream = new FileStream(path, FileMode.Append))
{
stream.Write(bytes, 0, bytes.Length);
}
}
Where textbox1 holds the path and name of file to encrypt, textBox2 holds the generated cipher for personal debugging purposes, button two is the encrypt button, and of course I am using System.IO.
Indeed, you have an off-by-one error in newbytes[index] = Convert.ToByte(lst[256 - x]).
If x is 0 then you will have lst[256]; however, lst only goes from 0 to 255. Changing that to 255 should fix it.
The reason it freezes up is that your program is EXTREMELY inefficient and runs on the UI thread. It also has a few more errors. You should only go up to bytesRead when processing buffer; as written, that just gives you extra data in your output that should not be there. Also, you are reusing the same array for buffer and newbytes, so your inner for loop could modify the same index more than once: every time you do newbytes[index] = Convert.ToByte(lst[256 - x]) you are modifying buffer[index], which gets checked again on the next iteration of the for loop.
There are a lot of ways you can improve your code. Here is a snippet that does something similar to what you are doing (I don't do the whole "find the index and use the opposite location"; I just use the byte that is passed in as the index into the array).
while ((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0)
{
byte[] newbytes = new byte[bytesRead];
for(int i = 0; i < newbytes.Length; i++)
{
newbytes[i] = (byte)lst[buffer[i]];
}
AppendAllBytes(textBox1.Text + ".ENC", newbytes);
}
This may also lead to freezing, but not as much. To solve the freezing, you should put all of this code into a BackgroundWorker or similar so that it runs on another thread.
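For the lookup approach to be reversible, the table has to be a permutation of 0..255. Note that the upper bound of Random.Next(min, max) is exclusive, so rnd.Next(0, 255) never produces 255. Below is a rough sketch that builds the table with a Fisher-Yates shuffle instead of drawing random numbers until the list is full. SubstitutionSketch, BuildTable, and Substitute are hypothetical names, and a byte[] table stands in for the lst from the question.

```csharp
using System;

public static class SubstitutionSketch
{
    // Builds a random permutation of the values 0..255.
    public static byte[] BuildTable(Random rnd)
    {
        var table = new byte[256];
        for (int i = 0; i < 256; i++)
        {
            table[i] = (byte)i;
        }

        // Fisher-Yates shuffle: swap each slot with a random slot at or below it.
        for (int i = 255; i > 0; i--)
        {
            int j = rnd.Next(i + 1); // 0 <= j <= i (upper bound is exclusive)
            byte tmp = table[i];
            table[i] = table[j];
            table[j] = tmp;
        }
        return table;
    }

    // Maps only the first `bytesRead` bytes of buffer through the table.
    public static byte[] Substitute(byte[] buffer, int bytesRead, byte[] table)
    {
        var output = new byte[bytesRead];
        for (int i = 0; i < bytesRead; i++)
        {
            output[i] = table[buffer[i]];
        }
        return output;
    }
}
```

Substitute writes into a fresh array sized to bytesRead, which avoids both the shared-array problem and the trailing garbage from a partially filled buffer.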
This is a little more tricky than I first imagined. I'm trying to read n bytes from a stream.
The MSDN claims that Read does not have to return n bytes, it just must return at least 1 and up to n bytes, with 0 bytes being the special case of reaching the end of the stream.
Typically, I'm using something like
var buf = new byte[size];
var count = stream.Read (buf, 0, size);
if (count != size) {
buf = buf.Take (count).ToArray ();
}
yield return buf;
I'm hoping for exactly size bytes but by spec FileStream would be allowed to return a large number of 1-byte chunks as well. This must be avoided.
One way to solve this would be to have 2 buffers, one for reading and one for collecting the chunks until we got the requested number of bytes. That's a little cumbersome though.
I also had a look at BinaryReader but its spec also does not clearly state that n bytes will be returned for sure.
To clarify: Of course, upon the end of the stream the returned number of bytes may be less than size - that's not a problem. I'm only talking about not receiving n bytes even though they are available in the stream.
A slightly more readable version:
int offset = 0;
while (offset < count)
{
int read = stream.Read(buffer, offset, count - offset);
if (read == 0)
throw new System.IO.EndOfStreamException();
offset += read;
}
Or written as an extension method for the Stream class:
public static class StreamUtils
{
public static byte[] ReadExactly(this System.IO.Stream stream, int count)
{
byte[] buffer = new byte[count];
int offset = 0;
while (offset < count)
{
int read = stream.Read(buffer, offset, count - offset);
if (read == 0)
throw new System.IO.EndOfStreamException();
offset += read;
}
System.Diagnostics.Debug.Assert(offset == count);
return buffer;
}
}
Simply: you loop.
int read, offset = 0;
while(leftToRead > 0 && (read = stream.Read(buf, offset, leftToRead)) > 0) {
leftToRead -= read;
offset += read;
}
if(leftToRead > 0) throw new EndOfStreamException(); // not enough!
After this, buf should have been populated with exactly the right amount of data from the stream, or will have thrown an EOF.
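If a short result at the end of the stream is acceptable, as the question says, the same loop can return the number of bytes it got instead of throwing. Here is a sketch, together with a made-up OneByteStream that deliberately returns one byte per Read call to exercise the worst case the spec allows. Both type names and ReadUpTo are illustrative only.

```csharp
using System;
using System.IO;

// Hypothetical test stream: hands out at most one byte per Read call.
public sealed class OneByteStream : Stream
{
    private readonly MemoryStream _inner;

    public OneByteStream(byte[] data) { _inner = new MemoryStream(data); }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override void Flush() { }

    public override int Read(byte[] buffer, int offset, int count)
        => _inner.Read(buffer, offset, Math.Min(count, 1));

    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

public static class ReadLoopSketch
{
    // Fills buffer[0..count) unless the stream ends first; returns the bytes actually read.
    public static int ReadUpTo(Stream stream, byte[] buffer, int count)
    {
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0) break; // end of stream
            offset += read;
        }
        return offset;
    }
}
```

Newer .NET versions (7 and later) ship Stream.ReadExactly and Stream.ReadAtLeast, which cover the same ground if you can target them.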
Putting together the answers here, I came up with the following solution. It relies on the source stream's length. Works on .NET Core 3.1.
/// <summary>
/// Copy stream based on source stream length
/// </summary>
/// <param name="source"></param>
/// <param name="destination"></param>
/// <param name="bufferSize">
/// A value that is the largest multiple of 4096 and is still smaller than the LOH threshold (85K).
/// So the buffer is likely to be collected at Gen0, and it offers a significant improvement in Copy performance.
/// </param>
/// <returns></returns>
private async Task CopyStream(Stream source, Stream destination, int bufferSize = 81920)
{
var buffer = new byte[bufferSize];
var offset = 0;
while (offset < source.Length)
{
var leftToRead = source.Length - offset;
var lengthToRead = leftToRead - buffer.Length < 0 ? (int)(leftToRead) : buffer.Length;
var read = await source.ReadAsync(buffer, 0, lengthToRead).ConfigureAwait(false);
if (read == 0)
break;
// Write only the bytes actually read; ReadAsync may return fewer than requested.
await destination.WriteAsync(buffer, 0, read).ConfigureAwait(false);
offset += read;
}
destination.Seek(0, SeekOrigin.Begin);
}
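For streams that do not report a Length (network streams, for example), a copy loop can simply run until Read returns 0. This is a synchronous sketch to keep it short; CopySketch and CopyUntilEnd are hypothetical names.

```csharp
using System;
using System.IO;

public static class CopySketch
{
    // Copies everything from source to destination; returns the number of bytes copied.
    public static long CopyUntilEnd(Stream source, Stream destination, int bufferSize = 81920)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write only what this Read call actually delivered.
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

The framework's own Stream.CopyTo and Stream.CopyToAsync do this kind of loop already, so a hand-written version is mainly useful when you need to hook into each chunk.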
Best explained with code:
long pieceLength = Math.Pow(2,18); //simplification
...
public void HashFile(string path)
{
using (FileStream fin = File.OpenRead(path))
{
byte[] buffer = new byte[(int)pieceLength];
int pieceNum = 0;
long remaining = fin.Length;
int done = 0;
int offset = 0;
while (remaining > 0)
{
while (done < pieceLength)
{
int toRead = (int)Math.Min(pieceLength, remaining);
int read = fin.Read(buffer, offset, toRead);
//if read == 0, EOF reached
if (read == 0)
break;
offset += read;
done += read;
remaining -= read;
}
HashPiece(buffer, pieceNum);
done = 0;
pieceNum++;
buffer = new byte[(int)pieceLength];
}
}
}
This works fine if the file is smaller than pieceLength and only does the outer loop once. However, if the file is larger, the int read = fin.Read(buffer, offset, toRead); line throws this at me:
Unhandled Exception: System.ArgumentException: Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection.
at System.IO.FileStream.Read(Byte[] array, Int32 offset, Int32 count)
done, buffer DO get reinitialized properly. File is larger than 1 MB.
Thanks in advance
Well, at least one problem is that you're not taking into account the "piece already read" when you work out how much to read. Try this:
int toRead = (int) Math.Min(pieceLength - done, remaining);
And then also adjust where you're reading to within the buffer:
int read = fin.Read(buffer, done, toRead);
(as you're resetting done for the new buffer, but not offset).
Oh, and at that point offset is irrelevant, so remove it.
Then note djna's answer as well - consider the case where for whatever reason you read to the end of the file, but without remaining becoming zero. You may want to consider whether remaining is actually useful at all... why not just keep reading blocks until you get to the end of the stream?
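Taken together, those suggestions (index into the buffer with done, drop offset and remaining, read until the stream ends) could look roughly like the sketch below. PieceReaderSketch and ReadPieces are names I made up; each returned piece would then be passed to something like HashPiece.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PieceReaderSketch
{
    // Yields consecutive pieces of `pieceLength` bytes; the last piece may be shorter.
    public static IEnumerable<byte[]> ReadPieces(Stream stream, int pieceLength)
    {
        while (true)
        {
            var buffer = new byte[pieceLength];
            int done = 0;

            // Keep reading until this piece is full or the stream ends.
            while (done < pieceLength)
            {
                int read = stream.Read(buffer, done, pieceLength - done);
                if (read == 0) break; // end of stream
                done += read;
            }

            if (done == 0) yield break; // nothing left at all

            if (done < pieceLength)
            {
                // Final, partial piece: trim it to the bytes actually read.
                var last = new byte[done];
                Array.Copy(buffer, last, done);
                yield return last;
                yield break;
            }

            yield return buffer;
        }
    }
}
```

For a 10-byte stream and a piece length of 4, this yields pieces of 4, 4, and 2 bytes.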
You don't adjust the value of "remaining" in this case:
if (read == 0)
break;
The FileStream.Read method's Offset and Length parameters relate to positions in the buffer, not to positions in the file.
Basically, this should fix it:
int read = fin.Read(buffer, 0, toRead);