I have two computers communicating over a serial modem.
I would like to have a reliability protocol on that line.
I have been looking into PPP, SLIP and RATP. Not all of them are the best fit, and I do not want to write all that code, especially if there is a good code base for that online.
Is there a library or code project in C# that can be used for that purpose?
If not what protocol should you recommend to implement?
The connection speed is 9600, but the amount of data sent is not very big, and speed is not a big issue. Simplicity and ease of development is!
I always just add a CRC to each message, but my higher level protocols are self-synchronizing and loss tolerant by virtue of unsolicited state reports -- if a command is lost and the state doesn't change, that becomes apparent on the next state report. Depending on whether your requirement is to detect or correct errors, and whether you can tolerate extra delays for retransmissions, you might need to look into a forward error correcting code.
Concepts of interest include message sequence numbers, acknowledgement, go-back-N vs selective retransmit, and minimum distance between codes (Hamming distance).
I definitely suggest you look at the design of TCP. The basics are really pretty minimal for guaranteed in-order delivery.
Related
I have question to ask.
I wonder, while sending data with TcpClient, is there any sort of CRC
checking algorithm that works automatically? Or I have to implement my own algorithm and resend the data if it doesn't arrive to remote host correctly?
Any ideas?
My best regards...
The TCP checksum is fairly weak, but considered good enough for a reliable stream protocol. Often, there are more robust checksums being performed at the data link layer - i.e., ethernet (IIRC) uses CRC-32.
If you really want to roll your own, here is a detailed guide to the theory and implementation.
There is a checksum in TCP. But, as the same article says,
The TCP checksum is a weak check by modern standards. Data Link Layers with high bit error rates may require additional link error correction/detection capabilities. The weak checksum is partially compensated for by the common use of a CRC or better integrity check at layer 2, below both TCP and IP, such as is used in PPP or the Ethernet frame.
So if you have concerns about that, maybe it won't hurt to add another checking.
TCP contains a checksum and the TCP/IP stack will detect broken packets. So you don't need to implement any error detection algorithms on your own unless you want to.
The whole point of using TCP is that the error checking is built in the protocol. So you don't have to worry about this kind of things.
I have a requirement to create a UDP file transfer system. I know TCP is guaranteed and much more reliable, but I need to transfer huge files between locations and I think the speed advantage in this project outweighs the benefits using TCP. I’m just starting this project, but would like some guidance if anyone has done this before. I will be writing both sides (client and server) so I don’t need to worry about feature limitations in other products.
In a nutshell I need to:
Take large files and send them in chunks
Be able to throttle bandwidth from the client
Create some kind of packet numbering system for errors,
retransmitions and assembling files by chunk on server (yes, all the
stuff we get from TCP for free :-)
Configurable datagram size – I think some firewalls complain if they
get too big?
Anything else I may be missing
I’m starting this journey using UdpClient and would like to write this app in C#. Any words of wisdom (other than to use TCP)?
It’s been done with huge success. We used to use RocketStream.com, but they sold their product to another company for internal use only. We typically get speeds that are 30X faster than FTP or raw TCP byte transfers.
in regards to
Configurable datagram size – I think some firewalls complain if they get too big?
one datagram could be up to 65,536 bytes. cosidering all the ip header information you'll end up with 65,507 bytes for payload. but you have to consider how all devices are configured along you network path. typically most devices have set an MTU-size of 1500 bytes so this will be typically your limit "on the internet". if you set up a dedicated network between your locations you can increase your MTU an all devices.
further in regards to
Create some kind of packet numbering system for errors, retransmitions and assembling files by chunk on server (yes, all the stuff we get from TCP for free :-)
i think the best thing in your case would be to implement a application level protocol. like
32 byte sequence number
8 byte crc32 checksum (correct me on the bytesize)
any bytes left can be used for data
hope this gives you some bit of a direction
::edit::
from experience i can tell you UDP is about 10-15% faster than TCP on dedicated and UDP-tuned networks.
I'm not convinced the speed gain will be tremendous, but an interesting experiment. Such a protocol will look and behave more like one of the traditional modem based protocols, and probably ZModem is one of the better examples to get some inspiration from (implements an ack window, adaptive block size, etc).
There are already some people who tried this, check out this site.
That would be cool if you succeed.
Don't go in it without WireShark. You'll need it.
For the algorithm, I guess that you have pretty much the idea of how to start. Maybe some pointers:
start with MTU that will be common to both endpoints, and use packets of only that size, so you'll have control over packet fragmentation (when you come down from TCP, I hope that this is for the more control over low level stuff).
you'll probably want to look into STUN or TURN for punching the holes into NATs.
look into ZModem - that also has a nostalgic value :)
since you want to squeeze maximum from you link, try to put as much as you can in the 'control packets' so you don't waste a single byte.
I wouldn't use any CRC on packet level, because I guess that networks underneath are handling that stuff.
I just had an idea...
break up a file in 16k chunks (length is arbitrary)
create HASH of each chunk
transmit all hashes of the chunks, using any protocol
at receiving end, prepare by hashing everything you have on your hard drive, network, I mean everything, in 16k chunks
compare received hashes to your local hashes and reconstruct the data you have
download the rest using any protocol
I know that I'm 6 months behind the schedule, but I just couldn't resist.
Others have said more interesting things, but I would like to point out that you need to make sure you use a good compression algorithm. That will make a world of difference.
Also I would recommend validating your assumptions as to the speed improvement possibility, make a trivial system of sending data (not worrying about loss, corruption, or other problems) and see what bandwidth you get. This will at least give you a realistic upper bound for what can be done.
Finally consider why you are taking on this task? Will the speed gains be worth it after the amount of time spent developing it?
I have a .NET service which need to feed live financial data to its clients. The output rate for this feed might get intense and I am looking for the best architecture to implement this type of service with low latency and high performance.
I was thinking of using some kind of a stream data provider, one that is used for audio or video, but send feed updates instead.
Would appreciate any thought on this subject, or any real world examples
Update:
I don't have to use WCF, that was only my first approach since it is the current technology. Any other implementation in C# is welcome.
Full Disclosure: I work for Informatica (formerly 29West) and am on the engineering team responsible for their messaging products. I am biased. I do, however, have a pretty good grasp of low-latency messaging in the financial market.
If you message rates are about 60 messages/sec. (as stated in a comment on Will Dean's answer), and they're being delivered to a GUI with a human sitting in front of it and reacting to the market at human-speed, it honestly doesn't matter a whole lot what software you use from a latency perspective. You might even be able to get away with using WCF (though I'd still recommend against it; we considered supporting it once and prototyped an adapter for it and it bloated latencies up by an order of magnitude - we decided not to bother with it at the time).
Now, Informatica's messaging software can pass messages between processes on the same machine in well under a microsecond, and if you want to buy some nice 10 gig-E NICs with kernel bypass or InfiniBand gear, you can pass millions of messages per second between machines with single-digit microseconds of latency. We'll also soon be releasing a new data serialization library that's supported in C/C++, Java, and .NET as part of the messaging product that in some cases is actually faster than Protocol Buffers (although Protocol Buffers are widely used and also a very good choice). Our .NET and Java APIs both have a feature called "ZOD" for "Zero Object Delivery", which is a kinda funny way of saying they generate no new objects during message delivery, meaning no garbage collection pauses & associated latency spikes/outliers. We've got another product called UMDS that's specifically designed to fan out high-speed backbone traffic to slower desktop apps without slowing down the backbone or other clients.
I could go on and on about how great Informatica's messaging software is and I do think it's worth checking out, but this already looks like a straight-up ad, and I'm an engineer, not a sales person. So here's a few pieces of more general advice:
If you have a lot of clients receiving the same data, you'll want some flavor of UDP multicast. You'll often want a reliable multicast transport of some kind - the well-known (and free) reliable multicast protocol is PGM. Windows includes an implementation of PGM that's usable in C#; I'll refer you to Mike Rettig's excellent blog post on how to use it if you want to try it out. (I happen to know Mike - he's a smart guy.) Protocol choice is an area in which you get what you pay for; Informatica's messaging includes a reliable multicast protocol loosely based off of PGM (our architect who designed it co-wrote the PGM RFC a long while back), but with a lot of major improvements. Plain PGM might be fine for what you need, though.
You want to go with a brokerless/serverless architecture. Have the apps communicate peer-to-peer with nothing in the middle. Avoid extra hops in the message path (which usually means avoid most JMS implementations, avoid almost anything with "queue" in the name somewhere, etc.).
Be mindful of how your system behaves when one individual client misbehaves. Can one slow consumer slow down everyone else?
There are a lot of OS tuning and BIOS tuning options that can benefit any sort of low-latency messaging, homegrown or bought - things like interrupt coalescing, tying NIC interrupts to a particular CPU core, receive-side scaling (which has historically been terrible when used with UDP on Windows, but should be getting much better in the future), disabling certain CPU power states, etc.
Resist the temptation to use built-in object serialization in .NET to send whole objects over the wire - it is orders of magnitude slower than using a simple binary format (like Protocol Buffers, or Informatica's serialization library, or your own binary format, etc.).
If you have more specific questions or need more detail on any of my advice, just let me know!
How low is 'low latency' and how busy is 'intense'? You need to have some idea of what you're aiming for to choose the right approach.
I could supply you some hardware which would respond to 100% of all requests within, say, 20us upto the full capacity of your network hardware, but it would not use WCF much at all.
To a very broad approximation, I would say that things like WCF are very high-level and trade-off ease-of-use and abstraction-for-the-benefit-of-the-programmer against performance (latency/throughput). Whether they trade it off too much for your application needs real numbers.
The lowest-latency, lowest-overhead IP-based protocol in widespread use is UDP - that's why it's used for things like DNS and NTP. It's very scalable at the server, because the server doesn't need to keep any state, and it's very simple to implement on almost any platform. But you do need to be thinking in terms of network packets rather than .NET objects. Do you get to supply the client-end software too?
Live financial data? Never rely on WCF on that. Instead, go with what other industries use. i.e. NASDAQ uses Real-Time Innovations - Data Distribution Service to deliver live stock ticks to users. They provide C/C++/C# api for their communications libraries, which is extremely easy to setup and use (compared to WCF).
In general, this sort of real-time data feeds use publish/subscribe paradigm which helps to make sure that the communication happens with minimal overhead. This sort of an approach is the main idea in message-oriented middle ware and it is exactly what financial services use for real-time stuff.
On a side node, you can deliver real-time audio-video packets using RTI-DDS library, as far as I know, unmanned aerial vehicles like MQ-9 uses again this library to deliver live video & geo-location information to the ground control stations.
There are also free data distribution service libraries but I've no experience in them. You just need to google for it.
Edit: I'm currently prototyping some HMI (human machine interface) software which uses aforementioned RTI-DDS libraries along with two other libraries which have such message oriented architectures, which did work a thread up to now for all my real-time communication needs. Here is a demo: http://epics.codeplex.com/ (It will be used in remotely controlling the equipment in our brand new nuclear research facility)
The more assumptions you make and features you cut out the faster you can make your system. The more robust and flexible you attempt to make things, the more your performance will suffer. I would suggest a few basic must haves:
A binary data serialization format.
Don't use XML or any other human
readable method of passing your
data.
A robust enough data
serialization format that it can
support cross-architecture,
cross-language endpoints. BER comes
to mind - C# seems to have support
A transport protocol that has
guaranteed delivery and data
integrity. If any type of
financial algorithm will be using
this data, even missing one tick
could mean the difference between
and order being triggered or missing
out on a price. Even if you are
going to aggregate ticks in your
server you still want control over
how the information is presented to
your clients. TCP works for distributed systems. However there are much faster alternatives if your clients are on the same machine as your server. UDP won't even garauntee order, which can be problematic (though not insurmountable).
With regard to internal processing:
Avoid strings and other classes that
add significant overhead to simple
tasks. Use basic character arrays
instead. I'm not sure what options
you have in C# or if you even have
lightweight alternatives. If so, use
them. This applies to data-structures as well.
Be aware of double/float comparison errors. Use comparisons that only check for the necessary level of precision. If possible convert everything to integers internally and provide enough metadata to convert back on the other end.
Use something similar to pooled allocators in C++. My lack of knowledge of C# prevents me from being more specific. Again C# probably isn't your best choice here. Bottom line is that you are going to be creating and destroying a lot of tick objects and there is no reason to ask the OS for the memory every time.
Only send out deltas, don't send information that your clients already have. This assumes you are using a transport with guaranteed delivery. If not you could end up displaying stale data for a long time.
This might be of interest although its specific to gaming ... Lowest Latency small size data Internet transfer protocol? c#
Here is a tutorial on UDP connection http://www.winsocketdotnetworkprogramming.com/clientserversocketnetworkcommunication8r.html
Another Article on UDP
http://msdn.microsoft.com/en-us/magazine/cc163648.aspx
You ask specifically about a "low latency User Feed". What do you really want with low latency, for 'Feed Only' (and especially if it does not generate revenue), could the Users wait a second; that is not low latency.
If you want to trade FAST then you need to physically move across the street from the Exchange (or nearby with an Optical Link). Next you need to 'Trade on the Card'; the Ethernet Card is 'smart' and is fed 'Trade Formulas' that program the Network Card to make a preprogrammed trade based on Data received (without pestering your Computer).
See: http://intelligenttradingtechnology.com/article/groundbreaking-results-high-performance-trading-fpga-and-x86-technologies
Learning to manipulate that Environment will buy you more than reinventing the wheel.
Ultra low latency is costly, but billions are at stake; your stakes (and pursuit of lower latency) with be throttled by $.
In the past i've used Tibco rv or raw sockets for streaming prices/ rates, where high frequency updates are expected. In this situation, it is often the client (or in in fact the user) who is the limitation (as there is only so many updates a user can process), and this is therefore an example of where you can 'lose' data. In this situation a client side service broker can be used to throttle updates.
If the system is used for automated trading or HFT then products like 29West LatencyBuster has been proven to work well and offers guaranteed messaging.
I'm reading a lossy bit stream and I need a way to recover as much usable data as possible. There can be 1's in place of 0's and 0's in palce of 1's, but accuracy is probably over 80%.
A bonus would be if the algorithm could compensate for missing/too many bits as well.
The source I'm reading from is analogue with noise (microphone via FFT), and the read timing could vary depending on computer speed.
I remember reading about algorithms used in CD-ROM's doing this in 3? layers, so I'm guessing using several layers is a good option. I don't remember the details though, so if anyone can share some ideas that would be great! :)
Edit: Added sample data
Best case data:
in: 0000010101000010110100101101100111000000100100101101100111000000100100001100000010000101110101001101100111000101110000001001111011001100110000001001100111011110110101011100111011000100110000001000010111
out: 0010101000010110100101101100111000000100100101101100111000000100100001100000010000101110101001101100111000101110000001001111011001100110000001001100111011110110101011100111011000100110000001000010111011
Bade case (timing is off, samples are missing):
out: 00101010000101101001011011001110000001001001011011001110000001001000011000000100001011101010011011001
in: 00111101001011111110010010111111011110000010010000111000011101001101111110000110111011110111111111101
Edit2: I am able to controll the data being sent. Currently attempting to implement simple XOR checking (though it won't be enough).
If I understand you correctly, you have two needs:
Modulate a signal into sound and then demodulate it.
Apply error correction since the channel is unreliable.
Modulation and demodulation is a wellknown application, with several ways to modulate the information.
Number two, error correction also is wellknown and have several possibilities. Which one is applicable depends on the error rate and whether you have duplex operation so that you can request resends. If you have decent quality and can request resends an approach like the one TCP is using is worth exploring.
Otherwise you will have to get down to error detection and error correction algorithms, like the one used on CDROMs.
Edit after the comment
Having the modulation/demodulation done and no resend possibilities narrows the problem. If you are having timing issues, I would still recommend that you read up on existing (de)modulation methods, as there are ways to automatically resynchronize with the sender and increase signal-to-noise ratio.
Down to the core problem; error correction you will have to add parity bits to your output stream in order to be able to detect the errors. Starting with the forward error correction article #Justin suggests, an scheme that looks quite simple, but still powerful is the Hamming(7,4) scheme.
You need to use forward error correction. An XOR parity check will only detect when an error occurs. A simple error correction algorithm would be to send each chunk of data multiple times (at least 3) and make a majority decision.
The choice of algorithm depends on several factors:
Channel utilization (if you have lots of free time, you don't need an efficient coding)
Error types: are the bad bits randomly spaced or do they usually occur in a row
Processing time: code complexity is limited if data transmission needs to be fast
There are lot of possibilities, see : http://en.wikipedia.org/wiki/Error_detection_and_correction
This can help you with changed bits, but may be unsuitable to check whenever you have all the bits.
In the end, it will probably take much more than few lines of simple code.
Suppose you were forced to use TCP sockets over UDP sockets (ie: something that Silverlight insists on). Would it be possible to create a multiplayer game that involves sending real time positional updates to up to say eight players so that each player could accurately see every other player in real time, even though UDP would be the better protocol to use? Given the option, would you wish to go as far as to select a different technology (ie: Java), simply to gain UDP support?
Thanks,
Nick
As long as a few milliseconds aren't important i see no reason to use UDP.
To receive UDP packets, you must have a public IP address.
To receive UDP packets, you need to be able to listen on a port. Not all frameworks in all environments can do this, often for security reasons and such.
As you describe Silverlight as a target platform, we can anticipate that this won't always be the case for your players.
Use TCP.
As an alternative to Silverlight, you might look at Haxe (or Flash).
(From the comments, there is mention of STUN and stuff; that's an interesting if difficult angle to pursue.)
It depends on how fast of real-time you are looking at. For example, if you try to make a space battle, and everyone is close, but moving at a high speed, then you may find that the milliseconds difference makes a difference, but, if you are doing something like an auto racing game then it won't make any difference, so TCP is fine.
So, try it, get some numbers and decide if it is acceptable.
The bigger problem will be the difference in bandwidth, so, if one person is playing over a really slow connection, and everyone else are on very fast connections, then that slower player will be a problem. You may need to scale the updates to the slowest connection, and you may find that TCP/UDP issues are not enough of a concern, as the difference in connection speeds are a far bigger problem.
So, test with various connection speeds, with differing numbers of users, each with their own connection speeds, and see if, as one user, the game is still enjoyable.
UPDATE
It is not bandwidth that will be the concern, but the latency, as was pointed out in a comment. I had picked the wrong term, as several people might be able to respond quickly and be closer to real-time, but one user may be much slower, perhaps on a congested network, slow computer, or whatever, but they may only send updates every 1000ms, whereas everyone else is doing it every 100ms.