I have a socket-based application that exposes received data with a BinaryReader object on the client side. I've been trying to debug an issue where the data contained in the reader is not clean... i.e. the buffer that I'm reading contains old data past the size of the new data.
In the code below:
System.Diagnostics.Debug.WriteLine("Stream length: {0}", _binaryReader.BaseStream.Length);
byte[] buffer = _binaryReader.ReadBytes((int)_binaryReader.BaseStream.Length);
When I comment out the first line, the data doesn't end up dirty (or at least not as regularly) as when I have that WriteLine statement. As far as I can tell, the data is coming in cleanly from the server side, so it's possible that my socket implementation has some issues. But does anyone have any idea why adding that print statement would cause the data to be dirty more often?
Your binary reader looks like it is a private member variable (the leading underscore is a telltale sign).
Is your application multithreaded? You could be experiencing a race condition if another thread is also using your BinaryReader while you are reading from it. The fact that you experience issues even without that line seems quite suspect to me.
Are you sure that your reading logic is correct? Stream.Length indicates the length of the entire stream, not of the remaining data to be read.
Suppose that, initially, 100 bytes were available. Length is 100, and BinaryReader correctly reads 100 bytes and advances the stream position by 100. Then, another 20 bytes arrive. Length is now 120; however, your BinaryReader should only be reading 20 bytes, not 120. The 'extra' 100 bytes requested in the second read would either cause it to block or (if the stream is not implemented correctly) break.
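If the stream is seekable (which it appears to be, since Length works for you), a minimal sketch of reading only the unread remainder might look like this:

// Read only the bytes that have not been consumed yet.
// Assumes the underlying stream supports Length and Position (seekable);
// a raw NetworkStream would throw NotSupportedException here.
long remaining = _binaryReader.BaseStream.Length - _binaryReader.BaseStream.Position;
byte[] buffer = _binaryReader.ReadBytes((int)remaining);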
The problem was silly and unrelated. I believe my reading logic above is correct, however. The issue was that the _binaryReader I was using was a reference that was not owned by my class and hence the underlying stream was being rewritten with bad data.
I'm creating an application that will take an image in a certain format from one of a video game's files and convert it to a DDS. This requires me to build the DDS in a buffer and then write it out to a DDS file. This buffer is of type List<byte>.
I first write the magic number, which is just the text "DDS ", with this code:
ddsFile.AddRange(Encoding.ASCII.GetBytes("DDS "));
I then need to write the header size, which is always 0x7C000000 (124), and this is where I've hit a wall. I used this code to write it to the buffer:
ddsFile.AddRange(BitConverter.GetBytes(0x0000007C));
This made sense to me because BitConverter.GetBytes() says itself that it returns a byte[], and it happily accepts an int as a parameter. Additionally, this was what I saw recommended when looking for a way to add multi-byte values to a byte list. But for whatever reason, when the program tries to execute that line, this exception is thrown:
Unable to cast object of type 'System.Byte[]' to type 'System.IConvertible'.
What's even stranger, to the point of being ridiculous, is that when I inspect what did make it into the buffer, I can see the int actually was written, yet the exception still occurs for who knows what reason.
Bizarrely, even writing a single byte to the list after writing the magic number, e.g. ddsFile.Add((byte)0x00);, results in the same thing.
Any help in figuring out why this exception occurs and/or a solution would be greatly appreciated.
This is not an answer to the question but a suggestion to do it differently.
Instead of using a List<byte> and manually doing all the conversions (while certainly possible, it's cumbersome), use a stream and a BinaryWriter - the stream can be a memory stream if you want to buffer the image in memory or a file stream if you want to write it to disk right away.
Using a BinaryWriter against the stream makes the conversions a lot simpler (and you can still manually convert parts of the data easily, if you need to do so).
Here's a short example:
var ms = new MemoryStream();
var bw = new BinaryWriter(ms, Encoding.ASCII);
bw.Write("DDS ");
bw.Write(124); // writes 4 bytes
bw.Write((byte) 124); // writes 1 byte
...
Use whichever overload of Write() you need to output the right bytes. (This short example omits cleanup, but if you use a file stream you'll need to make sure you close it properly.)
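For illustration, here's a hedged sketch of the same idea with disposal handled and the buffered bytes pulled back out at the end (the values written are placeholders, not a complete DDS header):

// Requires System.IO and System.Text.
byte[] ddsBytes;
using (var ms = new MemoryStream())
using (var bw = new BinaryWriter(ms, Encoding.ASCII))
{
    bw.Write(Encoding.ASCII.GetBytes("DDS "));  // magic number as raw bytes
    bw.Write(124);                              // dwSize, stored little-endian as 7C 00 00 00
    // ... write the remaining header fields and pixel data here ...
    bw.Flush();
    ddsBytes = ms.ToArray();                    // or copy ms to a FileStream instead
}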
I am using a networkstream to pass short strings around the network.
Now, on the receiving side I have encountered an issue:
Normally I would do the reading like this
see if data is available at all
get count of data available
read that many bytes into a buffer
convert buffer content to string.
In code that assumes all of the methods involved work as one would probably intend, that would look something like this:
NetworkStream stream = someTcpClient.GetStream();
while(!stream.DataAvailable)
;
byte[] bufferByte = new byte[stream.Length];
stream.Read(bufferByte, 0, (int)stream.Length);
ASCIIEncoding enc = new ASCIIEncoding();
string result = enc.GetString(bufferByte);
However, MSDN says that NetworkStream.Length is not really implemented and will always throw an Exception when called.
Since the incoming data are of varying length I cannot hard-code the count of bytes to expect (which would also be a case of the magic-number antipattern).
Question:
If I cannot get an accurate count of the number of bytes available for reading, then how can I read from the stream properly, without risking all sorts of exceptions within NetworkStream.Read?
EDIT:
Although the provided answer leads to a better overall code I still want to share another option that I came across:
TcpClient.Available gives the number of bytes available to read. I knew there had to be a way to count the bytes waiting in one's own inbox.
There's no guarantee that calls to Read on one side of the connection will match up 1-1 with calls to Write from the other side. If you're dealing with variable length messages, it's up to you to provide the receiving side with this information.
One common way to do this is to first work out the length of the message you're going to send and then send that length information first. On the receiving side, you then obtain the length first and then you know how big a buffer to allocate. You then call Read in a loop until you've read the correct number of bytes. Note that, in your original code, you're currently ignoring the return value from Read, which tells you how many bytes were actually read. In a single call and return, this could be as low as 1, even if you're asking for more than 1 byte.
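A rough sketch of the length-prefix approach, assuming the sender writes a 4-byte little-endian length before each message (names here are illustrative):

// Requires System.IO and System.Net.Sockets.
// Reads one length-prefixed message: a 4-byte little-endian length, then the payload.
static byte[] ReadMessage(NetworkStream stream)
{
    byte[] lengthBytes = ReadExactly(stream, 4);
    int length = BitConverter.ToInt32(lengthBytes, 0);
    return ReadExactly(stream, length);
}

// Loops because a single Read call may return fewer bytes than requested.
static byte[] ReadExactly(NetworkStream stream, int count)
{
    byte[] buffer = new byte[count];
    int offset = 0;
    while (offset < count)
    {
        int read = stream.Read(buffer, offset, count - offset);
        if (read == 0)
            throw new EndOfStreamException("Connection closed before the full message arrived.");
        offset += read;
    }
    return buffer;
}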
Another common way is to decide on message "formats" - where e.g. message number 1 is always 32 bytes in length and has X structure, and message number 2 is 51 bytes in length and has Y structure. With this approach, rather than you sending the message length before sending the message, you send the format information instead - first you send "here comes a message of type 1" and then you send the message.
A further common way, if applicable, is to use some form of sentinels - if your messages will never contain, say, a byte with value 0xff then you scan the received bytes until you've received an 0xff byte, and then everything before that byte was the message you wanted to receive.
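For the sentinel approach, a minimal sketch (assuming a 0xff byte can never occur inside a message body):

// Requires System.Collections.Generic, System.IO and System.Net.Sockets.
// Reads bytes until the agreed sentinel (0xff); everything before it is the message.
static byte[] ReadUntilSentinel(NetworkStream stream)
{
    var message = new List<byte>();
    int b;
    while ((b = stream.ReadByte()) != -1)
    {
        if (b == 0xff)
            return message.ToArray();
        message.Add((byte)b);
    }
    throw new EndOfStreamException("Connection closed before the sentinel was received.");
}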
But, whatever you want to do, whether its one of the above approaches, or something else, it's up to you to have your sending and receiving sides work together to allow the receiver to discover each message.
A further way to change everything around, if you want to exchange messages and don't want to do any of the above fiddling, is to switch to something that works at a higher level - e.g. WCF, or HTTP, or something else - where the system already takes care of message framing and you can then just concentrate on what to do with your messages.
You could use a StreamReader to read the stream to the end:
var streamReader = new StreamReader(someTcpClient.GetStream(), Encoding.ASCII);
string result = streamReader.ReadToEnd();
I want to use the stream class to read/write data to/from a serial port. I use the BaseStream to get the stream (link below) but the Length property doesn't work. Does anyone know how can I read the full buffer without knowing how many bytes there are?
http://msdn.microsoft.com/en-us/library/system.io.ports.serialport.basestream.aspx
You can't. That is, you can't guarantee that you've received everything if all you have is the BaseStream.
There are two ways you can know if you've received everything:
Send a length word as the first 2 or 4 bytes of the packet. That says how many bytes will follow. Your reader then reads that length word, reads that many bytes, and knows it's done (see the sketch below).
Agree on a record separator. That works great for text. For example, you might decide that a null byte or an end-of-line character signals the end of the data. This is somewhat more difficult to do with arbitrary binary data, but possible.
Or, depending on your application, you can do some kind of timing. That is, if you haven't received anything new for X number of seconds (or milliseconds?), you assume that you've received everything. That has the obvious drawback of not working well if the sender is especially slow.
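If you go with the length-word option, a hedged sketch over the serial port's BaseStream might look like this (assuming the sender writes a 2-byte little-endian length word first; names are illustrative):

// Requires System.IO and System.IO.Ports.
// Reads one length-prefixed packet: a 2-byte little-endian length word, then the payload.
static byte[] ReadPacket(SerialPort port)
{
    Stream stream = port.BaseStream;

    byte[] header = new byte[2];
    ReadExactly(stream, header);
    int length = header[0] | (header[1] << 8);

    byte[] payload = new byte[length];
    ReadExactly(stream, payload);
    return payload;
}

// Read may return fewer bytes than asked for, so loop until the buffer is full.
static void ReadExactly(Stream stream, byte[] buffer)
{
    int offset = 0;
    while (offset < buffer.Length)
    {
        int read = stream.Read(buffer, offset, buffer.Length - offset);
        if (read == 0)
            throw new EndOfStreamException("Stream closed before the expected data arrived.");
        offset += read;
    }
}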
Maybe you can try the SerialPort.BytesToRead property.
I am working on improving a stream reader class that uses a BinaryReader. It consists of a while loop that uses .PeekChar() to check if more data exists to continue processing.
The very first operation is a .ReadInt32() which reads 4 bytes. What if PeekChar only "saw" one byte (or one bit)? This doesn't seem like a reliable way of checking for EOF.
The BinaryReader is constructed using its default parameters, which as I understand it, uses UTF8 as the default encoding. I assume that .PeekChar() checks for 8 bits but I really am not sure.
How many bits does .PeekChar() look for? (and what are some alternate methods to checking for EOF?)
In the documentation for BinaryReader.PeekChar I read:
ArgumentException: The current character cannot be decoded into the internal character buffer by using the Encoding selected for the stream.
This makes clear that the amount of data read depends on the Encoding applied to that stream.
EDIT
Actually, the definition according to MSDN is:
Returns the next available character and does not advance the byte or character position.
In fact, whether that is one byte or more depends on the encoding...
Hope this helps.
Making your Read*() calls blindly and handling any exceptions that are thrown is the normal method. I don't believe that the stream position is moved if anything goes wrong.
The PeekChar() method of BinaryReader is very buggy. Even when reading from a memory stream with UTF-8 encoded data, PeekChar() throws an exception after reading a particular length of the stream. The BCL team has acknowledged the issue but has not committed to resolving it; their only response is to avoid using PeekChar() if you can.
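If you can restructure away from PeekChar(), one common workaround (assuming the underlying stream is seekable, e.g. a FileStream or MemoryStream, and that reader is your BinaryReader) is to compare the position against the length instead:

// Loop while unread bytes remain, instead of calling PeekChar().
// Only valid when the underlying stream supports Length/Position.
while (reader.BaseStream.Position < reader.BaseStream.Length)
{
    int value = reader.ReadInt32();
    // ... process the record ...
}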
Let's assume we have a simple internet socket, and it's going to send 10 megabytes (because I want to ignore memory issues) of random data through.
Is there any performance difference or a best practice method that one should use for receiving data? The final output data should be represented by a byte[]. Yes I know writing an arbitrary amount of data to memory is bad, and if I was downloading a large file I wouldn't be doing it like this. But for argument's sake let's ignore that and assume it's a smallish amount of data. I also realise that the bottleneck here is probably not the memory management but rather the socket receiving. I just want to know what would be the most efficient method of receiving data.
A few dodgy ways I can think of are:
Have a List<byte> and a buffer; after the buffer is full, add its contents to the list, and at the end call list.ToArray() to get the byte[].
Write the buffer to a MemoryStream; after it's complete, construct a byte[] of stream.Length and read the stream back into it to get the byte[] output.
Is there a more efficient/better way of doing this?
Just write to a MemoryStream and then call ToArray - that does the business of constructing an appropriately-sized byte array for you. That's effectively what a List<byte> would be like anyway, but using a MemoryStream will be a lot simpler.
Well, Jon Skeet's answer is great (as usual), but there's no code, so here's my interpretation. (Worked fine for me.)
using (var mem = new MemoryStream())
{
using (var tcp = new TcpClient())
{
tcp.Connect(new IPEndPoint(IPAddress.Parse("192.0.0.192"), 8880));
tcp.GetStream().CopyTo(mem);
}
var bytes = mem.ToArray();
}
(Why not combine the two usings? Well, if you want to debug, you might want to release the tcp connection before taking your time looking at the bytes received.)
This code will receive multiple packets and aggregate their data, FYI. So it's a great way to simply receive all tcp data sent during a connection.
What is the encoding of your data? Is it plain ASCII, or is it something else, like UTF-8/Unicode?
If it is plain ASCII, you could just allocate a StringBuilder of the required size (get the size from the Content-Length header of the response) and keep appending your data to the builder, after converting it into a string using Encoding.ASCII.
If it is Unicode/UTF-8 then you have an issue - you cannot just call Encoding.GetString(buffer, 0, bytesRead) on the bytes read, because those bytes might not constitute a complete string fragment in that encoding. In that case you will need to buffer the entire entity body into memory (or a file), then read it back and decode it using the encoding.
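A minimal sketch of that buffer-then-decode approach, assuming UTF-8 and that the whole body fits comfortably in memory (networkStream is a stand-in for wherever your bytes come from):

// Accumulate all received bytes first, then decode once, so multi-byte
// UTF-8 sequences that were split across reads are handled correctly.
using (var buffer = new MemoryStream())
{
    byte[] chunk = new byte[8192];
    int bytesRead;
    while ((bytesRead = networkStream.Read(chunk, 0, chunk.Length)) > 0)
    {
        buffer.Write(chunk, 0, bytesRead);
    }
    string text = Encoding.UTF8.GetString(buffer.ToArray());
}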
You could write to a memory stream, then use a StreamReader or something like that to get the data. What are you doing with the data? I ask because it would be more efficient from a memory standpoint to write the incoming data to a file or database table as it is being received, rather than storing the entire contents in memory.
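If the data just needs to end up on disk, a hedged sketch of streaming it straight to a file rather than holding it all in memory (the client variable and file path are illustrative):

// Copy the incoming bytes directly to a file instead of buffering them in memory.
using (NetworkStream networkStream = tcpClient.GetStream())
using (FileStream fileStream = File.Create("received.dat"))
{
    networkStream.CopyTo(fileStream);
}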