Socket EndReceive Order / Data Issues - c#

Background:
The application I am programming uses async sockets (using BeginSend, EndSend, BeginReceive, EndReceive) to send data between each other. The sockets are TCP, no socket flags, on IPV4.
It uses the system where it sends a 4-byte (int) message, followed by a message with the length specified in the previous message. I use function helpers that handle the MessageLength, and the MessageBody. The flow is something like this
BeginReceive()
EndReceive()
MessageLengthReceived()
BeginReceive()
MessageBodyReceived()
Issue:
The issue arrives when I send file data, in chunks of 16kb (with an additional small overhead: offset, pieceIndex, etc). Occasionally, when receiving the MessageLength, it receives a data from a random part in the previous message, instead of the actual message length. Part of this issue is that it doesn't always happen at a set offset (eg beginning or end of file / piece / 16 kb chunk) and can happen with any file, but happens more if I send a lot more files / larger files.
There are internal messages that are sent (eg RequestMessages) that never experience this problem. All the internal messages are < 100 bytes.
I've tried waiting for the file chunk to save completely before requesting another chunk, but it still fails. I've also tried limiting how many chunks to send at a time, but this only resolves the issue when using 127.0.0.1 (local clients), and not cross network (LAN).
I've spent hours going through my application to see if there's any issues, but I have yet to see any where it would be sending the wrong data as a header. The issue always seems to inbetween the send and the receive of the two clients. Is there settings for socket / method of sending that I should be using? Or could it be some sort of race condition (I thought about race condition, but the fact that the data can be anywhere randomly in a file made me rethink this).

From the question, i guess the problem you are dealing with is inside the MonoTorrent library.
I myself has never encountered such problem. and by looking at the codes, i think the receive part is already ordered because the network IO will not try to receive a second message until the first one has been handled. PieceMessages' write requests are queued in DiskIO also so that should not be the problem.
however, in sending procedure, the ProcessQueue function can be called from several places. and the EnqueueSendMessage called by ProcessQueue indirectly doesn't actually enqueue the message to any queue. it just simply call the Socket.BeginSend. I don't know if Socket.BeginSend() has any queue mechanism inside. If there is not, this may bring some problem when multiple threads are trying to make the same socket "BeginSend" different data.

Related

Combining data before calling Socket.Send

I am using a System.Net.Sockets.Socket in TCP mode to send data to some other machine. I implemented a listener to receive data formatted in a certain way and it is all working nicely. Now I am looking for ways to optimize use of bandwidth and delivery speed.
I wonder if it is worth sending a single array of bytes (or Span in the newer frameworks) instead of using a consecutive series of Send calls. I now send my encoded message in parts (head, length, tail, end, using separate Send calls).
Sure I could test this and I will but I do not have much experience with this and may be missing important considerations.
I realize there is a nowait option that may have an impact and that the socket and whatever is there lower in the stack may apply its own optimizing policy.
In would like to be able to prioritize delivery time (the time between the call to Send and reception at the other end) over bandwidth use for those messages to which it matters, and be more lenient with messages that are not time critical. But then again, I create a new socket whenever I find there is something in my queue and then use that until the queue is empty for what could be more than one message so this may not always work. Ideally I would want to be lenient so the socket can optimize payload until a time critical message hits the queue and then tell the socket to hurry until no more time critical messages are in the queue.
So my primary question is should I build my message before calling Send once (would that potentially do any good or just waste CPU cycles) and are there any caveats an experienced TCP programmer could make me aware of?

protobuf-net.Grpc Flow/Congestion Control

I'm working on a program which will asynchronously load large amounts of data from a server via gRPC (both ends use protobuf-net.Grpc).
While I can simply throw large amounts of data at the client via IAsyncEnumerable, I sometimes want to prioritize sending certain parts earlier (where the priorities are determined on the fly, and not know at the start, kind of like sending a video stream and skipping ahead).
If I were to send data and wait for a response every time I'd be leaving a lot of bandwidth unused. The alternative would be throwing tons of data at the client, which could cause network congestion and delay prioritized packets by an indefinite amount.
Can I somehow use HTTPS/2s / TCPs flow/congestion control for myself here? Or will I need to implement a basic flow/congestion control system on top of gRPC?
To be slightly more precise: I'd like to send as much data as possible without filling up any internal buffers, causing delays down the line.

Named pipe details

I need to send messages from a C#.Net application to a C++ application on Windows. They'll be running on the same PC. After doing some research, it sounds like using a named pipe might work. But I'm still confused about several details. So if anyone can fill me in, I'd appreciate it.
It sounds like a named pipe is basically a type of file. If my .Net application keeps writing to the file, will it keep getting larger? Or will whatever I'm writing go away as soon as the C++ application has read it?
If I send a message with a single write() call, am I guaranteed that it will be read together, or could it be broken up? For example, if I send "hello", is it possible the timing will be such that I'll read "hel" and then "lo"?
Am I correct that if I send several messages before trying to read them, they just sit there and I can read several at once? Will it take multiple read() calls to get every message, or will they all come concatenated together?
Is there any way for the C++ application to know that a message is waiting? Or should I just have a loop going that tries to read a message, sleeps, then tries to read again?
It sounds like a named pipe is basically a type of file. If my .Net application keeps writing to the file, will it keep getting larger? Or will whatever I'm writing go away as soon as the C++ application has read it?
A pipe doesn't really have a size. There may be some number of bytes in it, and you could call that the size of the pipe. This would be cosmetic. Why do you care? If your concern is that pipes may be implemented terribly on your platform, then you should switch platforms.
If I send a message with a single write() call, am I guaranteed that it will be read together, or could it be broken up? For example, if I send "hello", is it possible the timing will be such that I'll read "hel" and then "lo"?
Pipes are streams of bytes. There is no such thing as a message on a pipe. (At least, as far as the pipe knows.)
Am I correct that if I send several messages before trying to read them, they just sit there and I can read several at once? Will it take multiple read() calls to get every message, or will they all come concatenated together?
There aren't messages. Pipes are streams of bytes. If you try to read 100 bytes, you will get 100 bytes, unless there are fewer than that number available.
Is there any way for the C++ application to know that a message is waiting? Or should I just have a loop going that tries to read a message, sleeps, then tries to read again?
You can have a thread block on reading the pipe. That thread can exist just to permit a simple way for you to query whether a message is waiting, for example, by feeding bytes read from the pipe into a thread-safe queue of some kind. It could include the application-level message protocol logic, so the queue would consist of complete application-level messages.
A pipe pretty much acts like a TCP connection as far as the read and write semantics go.

NetworkStream doesn't flush data

I'm writing a simple chat program using sockets. When I'm sending a long message, flush the stream and a short message afterwards, the end of the long message gets appended to the short message. It looks like this:
Send "aaasdsd"
Recieve "aaasdsd"
Send "bb"
Recieve "bbasdsd"
Through debugging I've found that the Flush method, that's supposed to clear all data from the stream, does not do that. According to mdsn, it is the expected behaviour, because NetworkStream is not bufferized. How do I clear the stream in that case? I could just follow every message with an empty (consisting of \0 chars) one of the same length, but I don't think it's correct to do that, also, it would screw up some features I need.
TCP doesn't work this way. It's as simple as that.
TCP is a stream-based protocol. That means that you shouldn't ever treat it as a message-based protocol (unlike, say, UDP). If you need to send messages over TCP, you have to add your own messaging protocol on top of TCP.
What you're trying to do here is send two separate messages, and receive two separate messages on the other side. This would work fine on UDP (which is message-based), but it will not work on TCP, because TCP is a stream with no organisation.
So yeah, Flush works just fine. It's just that no matter how many times you call Flush on one side, and how many times you call individual Sends, each Receive on the other end will get as much data as can fit in its buffer, with no respect to the Sends on the other side.
The solution you've devised (almost - just separate the strings with a single \0) is actually one of the proper ways to handle this. By doing that, you're working with messages on top of the stream again. This is called message framing - it allows you to tell individual messages apart. In your case, you've added delimiters between the messages. Think about writing the same data in a file - again, you'll need some way of your own to separate the individual messages (for example, using end lines).
Another way to handle message framing is using a length prefix - before you send the string itself, send it's length. Then, when you read on the other side, you know that between the strings, there should always be a length prefix, so the reader knows when the message ends.
Yet another way isn't probably very useful for your case - you can work with fixed-length data. So a message will always be exactly 100 bytes, for example. This is very powerful when combined with pre-defined message types - so message type 1 would contain exactly two integers, representing some coördinates, for example.
In either case, though, you'll need your own buffering on the receiving end. This is because (as you've already seen) a single receive can read multiple messages at once, and at the same time, it's not guaranteed to read the whole message in a single read. Writing your own networking is actually pretty tricky - unless you're doing this to actually learn network programming, I'd recommend using some ready technology - for example, Lindgren (a nice networking library, optimized for games but works fine for general networking as well) or WCF. For a chat system, simple HTTP (especially with the bi-directional WebSockets) might be just fine as well.
EDIT:
As Damien correctly noted, there seems to be another problem with your code - you seem to be ignoring the return value of Read. The return value tells you the amount of bytes you've actually read. Since you have a fixed-size persistent buffer on the receiving side (apparently), it means that every byte after the amount you've just read will still contain the old data. To fix this, just make sure you're only working with as much bytes as Read returned. Also, since this seems to indicate you're ignoring the Read return value altogether, make sure to properly handle the case when Read returns 0 - that means the other side has gracefully shutdown its connection - and the receiving side should do the same.

Does BeginReceive() get everything sent by BeginSend()?

I'm writing a program that will have both a server side and a client side, and the client side will connect to a server hosted by the same program (but by another instance of it, and usually on another machine). So basically, I have control over both aspects of the protocol.
I am using BeginReceive() and BeginSend() on both sides to send and receive data. My question is if these two statements are true:
Using a call to BeginReceive() will give me the entire data that was sent by a single call to BeginSend() on the other end when the callback function is called.
Using a call to BeginSend() will send the entire data I pass it to the other end, and it will all be received by a single call to BeginReceive() on the other end.
The two are basically the same in fact.
If the answer is no, which I'm guessing is the case based on what I've read about sockets, what is the best way to handle commands? I'm writing a game that will have commands such as PUT X Y. I was thinking of appending a special character (# for example) to the end of each command, and each time I receive data, I append it to a buffer, then parse it only after I encounter a #.
No, you can't expect BeginReceive to necessarily receive all of the data from one call to BeginSend. You can send a lot of data in one call to BeginSend, which could very well be split across several packets. You may receive each packet's data in a separate receive call.
The two main ways of splitting a stream into multiple chunks are:
Use a delimiter (as per your current suggestion). This has the benefit of not needing to know the size beforehand, but has the disadvantage of being relatively hard to parse, and potentially introducing requirements such as escaping the delimiter.
Prepend the size of each message before the message. The receiver can read the length first, and then know exactly how much data to expect.

Categories

Resources