I'm writing my first application with NetMQ (ZeroMQ implementation for .NET).
I also need to listen to information sent from a client using a traditional TCP socket (a.k.a. a non-0MQ socket).
I've seen references to the availability of this socket type in the official ZeroMQ documentation here (look for ZMQ_STREAM), but there are very few details on how to use it (and that doesn't help much either, since the .NET API is quite a bit different from the C++ API).
The official NetMQ documentation also makes no mention of the Streaming socket type.
Finally I had a look at the test suite for NetMQ on GitHub, and found a partial answer to my question in the method RawSocket.
The following snippet works:
using (NetMQContext context = NetMQContext.Create())
{
    using (var routerSocket = context.CreateRouterSocket())
    {
        routerSocket.Options.RouterRawSocket = true;
        routerSocket.Bind("tcp://127.0.0.1:5599");
        byte[] id = routerSocket.Receive();
        byte[] message = routerSocket.Receive();
        Console.WriteLine(Encoding.ASCII.GetString(id));
        Console.WriteLine(Encoding.ASCII.GetString(message));
    }
}
When using standard TCP/IP test-tools, the byte[] message is printed out nicely, e.g. like this:
Hello World!
but the byte[] id is printed out like this:
???♥
In other words, I have no clue what's up with the id part. Why is routerSocket.Receive called twice? What is contained within the id? Is this something ZeroMQ/NetMQ specific, or is some TCP/IP-specific information being extracted here?
Thanks to #Mangist for pointing this out.
The answer is in the RouterSocket documentation:
An identity, sometimes called an address, is just a binary string
with no meaning except "this is a unique handle to the connection".
Then, when you send a message via a ROUTER socket, you first send an
identity frame.
When receiving messages a ZMQ_ROUTER socket shall prepend a message
part containing the identity of the originating peer to the message
before passing it to the application. Messages received are
fair-queued from among all connected peers. When sending messages a
ZMQ_ROUTER socket shall remove the first part of the message and use
it to determine the identity of the peer the message shall be routed
to.
Identities are a difficult concept to understand, but it's essential
if you want to become a ZeroMQ expert. The ROUTER socket invents a
random identity for each connection with which it works. If there are
three REQ sockets connected to a ROUTER socket, it will invent three
random identities, one for each REQ socket.
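The id in the question most likely looks like garbage simply because an identity is opaque binary data rather than text. Here is a minimal, language-neutral sketch in Python; the byte values are invented for illustration and are not taken from NetMQ. It decodes such an identity as ASCII and then renders it as hex, which is usually the more useful way to log it.

```python
def describe_identity(identity: bytes) -> str:
    """Render a routing identity as hex, since it is opaque binary, not text."""
    return identity.hex()


# Made-up example identity: a zero byte, three bytes above 0x7F, then 0x03.
example_id = bytes([0x00, 0x80, 0x81, 0x82, 0x03])

# Decoding as ASCII cannot represent bytes above 0x7F, so each one is replaced,
# while control bytes such as 0x00 and 0x03 pass through unchanged.
as_text = example_id.decode("ascii", errors="replace")

print(repr(as_text))                  # unreadable when treated as text
print(describe_identity(example_id))  # readable hex form
```

Whatever the identity contains, the application only needs to store it and send it back as the first frame when replying to that connection.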
This image illustrates the core concept of the ID frames:
I'm currently learning how to use sockets in C#, and have a question about how the messages between the client and the server should be structured.
Currently I have a server application and a client application, and in each application I have some strings that are the commands. When, for example, the client needs the time from the server, I have a string like this:
public const string GET_TIME_COMMAND = "<GET_TIME_COMMAND>";
Then I have an if statement on the server that checks whether the message sent from the client starts with that string and, if so, sends another message to the client with another command and the time in a JSON string.
My question is, is this a good way to do it, and if not could you advise me on another way to go about this?
TCP
Keep in mind that TCP is a stream based connection. You may or may not get the complete command in one message. You may even get multiple commands in one read.
To solve this, messages sent over TCP usually have a unique start and stop sequence or byte that may not appear inside the message itself.
(SomeCommand)
Where ( is the start and ) is the stop symbol.
An alternative way is to prepend a header to the actual message that contains the message length.
11 S O M E M E S S A G E
Where 11 is the message length and somemessage is the actual message. You'd usually transmit the length as a byte or ushort, not a string literal.
In both cases you have to read over and over until you have one complete message - then you can dispatch it into the application.
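The length-header variant can be sketched roughly as follows. This is a language-neutral illustration in Python, not a definitive implementation; the 2-byte big-endian length header and the names `LengthPrefixedDecoder` and `encode` are assumptions made for this example.

```python
import struct
from typing import List


class LengthPrefixedDecoder:
    """Accumulates raw TCP bytes and returns only complete messages.

    Assumed wire format (hypothetical): 2-byte big-endian length, then payload.
    """

    HEADER = struct.Struct(">H")

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add whatever one read returned; get back zero or more full messages."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= self.HEADER.size:
            (length,) = self.HEADER.unpack_from(self._buffer, 0)
            end = self.HEADER.size + length
            if len(self._buffer) < end:
                break  # payload not complete yet, wait for the next read
            messages.append(bytes(self._buffer[self.HEADER.size:end]))
            del self._buffer[:end]
        return messages


def encode(payload: bytes) -> bytes:
    """Prefix a payload with its length (payloads up to 65535 bytes)."""
    return LengthPrefixedDecoder.HEADER.pack(len(payload)) + payload
```

Each chunk returned by a socket read is passed to `feed`, and only the messages it returns are dispatched into the application. A partial command stays in the buffer, and several commands arriving in one read come out as separate messages.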
Also TCP is connection based. You have to connect to the remote site. The advantage is that TCP makes sure that all messages are sent in the very order you put them in. TCP will also automatically re-send lost packets and you don't have to worry about that.
UDP
In contrast to that, UDP is message/packet based, but it is not reliable. You may or may not get the message and have to re-send it in some cases. Also UDP doesn't have a notion of a "session". You would have to do that yourself if required.
The answer to your question depends on the protocol used. For TCP this won't work well with your current message format. You'd probably have to prepend a header.
You could use UDP, but then you may have to detect and re-send messages that got lost.
I have a question using ZeroMQ with Dealer Sockets connecting to a peer instead of binding and having a peer connect to it.
Here is an example (this is not using a public C# binding of ZeroMQ):
var router = context.CreateRouterSocket();
var dealer = context.CreateDealerSocket();
var endpoint = ZeroTcpEndpoint.CreateLoopback(9000);
dealer.Connect(endpoint);
// ...
//
// If I don't bind the Router and I send a message on the Dealer,
// it doesn't block. It seems the message is dropped.
// router.Bind(endpoint);
dealer.SendMessage();
As you can see, if I have a Router Socket that I don't bind and I connect a Dealer Socket to it, SendMessage does not block. It seems the message is dropped.
This suggests that a Dealer only blocks on SendMessage, as per the documentation, if it binds and has a peer connect to it, and not when it itself connects to a peer.
Basically what I am trying to achieve is avoid having the message dropped if the peer suddenly goes offline for whatever reason and since there is no way to tell the difference between a dropped message and a sent message except through blocking when there is no peer, I thought a Dealer would be the best option.
If I can't do it with a Dealer though, as observed here, is there an alternative method I can use? There is a question here that observes something similar in the Python binding (PyZMQ PUSH socket does not block on send()) and it basically recommends that I set the Send High Watermark to 1. What are the pitfalls if I do that?
Thanks.
I just started to learn ZeroMQ and want to build a distributed web crawler as an example while learning.
My idea is to have a "server", written in PHP, which accepts a URL where the crawling should start.
Workers (C# CLI) will have to crawl that URL, extract links, and push them back into a stack on the server. The server keeps sending URLs from the stack to workers.
Perhaps Redis will keep track of all crawled URLs, so we don't crawl sites multiple times and have the ability to extract statistics of the current process.
I would like the server to distribute tasks evenly, be aware of new/missing workers and redistribute URLs when a worker doesn't respond.
Why PHP for the server: I'm just very comfortable with PHP, that is all. I don't want to make the example/testing project more complicated.
Why C# for the minions: because it runs on most Windows machines. I can give the executable to various friends who can just execute it and help me test my project.
The crawling process and Redis functionality are not part of my question.
My first approach was the PUSH/PULL pattern, which generally works for my scenario, but isn't aware of its minions. I think I need a DEALER/ROUTER broker in the middle and have to handle the worker-awareness myself.
I found this question but I'm not really sure if I understand the answer...
I'm asking for some hints on how to implement the zmq stuff. Is the dealer approach correct? Is there any way to get automatic worker-awareness? I think I need some resources/examples, or do you think that I just need to dig deeper into the zmq guide?
However, some hints towards the right direction would be great :)
Cheers
I'm building a job/task distributor that works the same as your crawler, in principle at least. Here are a few things I've learned:
Define All Events
Communication between server and crawlers will be based on different things happening in your system, such as dispatching work from server to crawler, or a crawler sending a heartbeat message to the server. Define the system's event types; they are the use cases:
DISPATCH_WORK_TO_CRAWLER_EVENT
CRAWLER_NODE_STATUS_EVENT
...
Define a Message Standard
All communication between server and crawlers should be done using ZMsg's, so define a standard that organizes your frames, something like this:
Frame1: "Crawler v1.0" //this is a static header
Frame2: <event type> //ex: "CRAWLER_NODE_STATUS_EVENT"
Frame3: <content xml/json/binary> //content that applies to this event (if any)
Now you can create message validators to validate ZMsgs received between peers since you have a standard convention all messages must follow.
Server
Use a single ROUTER on the server for asynchronous and bidirectional communication with the crawlers. Also, use a PUB socket for broadcasting heartbeat messages.
Don't block on the ROUTER socket; use a POLLER to loop every 5s or whatever. This allows the server to do other things periodically, like broadcast heartbeat events to the crawlers; something like this:
Socket rtr = .. //ZMQ.ROUTER
Socket pub = .. //ZMQ.PUB
//only the ROUTER is polled for input; the PUB socket is send-only
ZMQ.Poller poller = new ZMQ.Poller(1)
poller.register( rtr, ZMQ.Poller.POLLIN)
while (true) {
    ZMsg msg = null
    //wait up to 5s for a message from a crawler
    poller.poll(5000)
    if( poller.pollin(0)){
        //messages from crawlers
        msg = ZMsg.recvMsg(rtr)
    }
    //send heartbeat messages
    ZMsg heartbeatMsg = ...
    //create message content here,
    //publish to all crawlers
    heartbeatMsg.send(pub)
}
To address your question about worker awareness, a simple and effective method uses a FIFO queue along with the heartbeat messages; something like this:
server maintains a simple FIFO queue in memory
server sends out heartbeats; crawlers respond with their node name; the ROUTER automatically puts the address of the node in the message as well (read up on message enveloping)
for each response, push one object onto the queue containing the node name and node address
when the server wants to dispatch work to a crawler, just pop the next object from the queue, create the message and address it properly (using the node address), and off it goes to that worker
dispatch more work to other crawlers the same way; when a crawler responds back to the server, just push another object with node name/address back on the queue; the other workers won't be available until they respond, so we don't bother them.
This is a simple but effective method of distributing work based on worker availability instead of blindly sending out work. Check the lbbroker.php example, the concept is the same.
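The availability structure described above can be sketched without any ZeroMQ calls at all. The following Python sketch is only an illustration under stated assumptions: the class name `WorkerQueue`, its method names, and the duplicate check are not from the original answer, and the addresses stand in for the identity frames a ROUTER would supply.

```python
from collections import deque
from typing import Deque, Optional, Tuple

Worker = Tuple[str, bytes]  # (node name, ROUTER address frame)


class WorkerQueue:
    """First-in, first-out list of workers that have reported in."""

    def __init__(self) -> None:
        self._ready: Deque[Worker] = deque()

    def mark_ready(self, name: str, address: bytes) -> None:
        """Call when a worker answers a heartbeat or returns a result."""
        entry = (name, address)
        # Assumption for this sketch: a worker is listed at most once.
        if entry not in self._ready:
            self._ready.append(entry)

    def next_worker(self) -> Optional[Worker]:
        """Take the worker that has been waiting longest, or None if all are busy."""
        return self._ready.popleft() if self._ready else None


# Usage: two workers report in, then work is dispatched in arrival order.
queue = WorkerQueue()
queue.mark_ready("crawler-a", b"\x00addr-a")
queue.mark_ready("crawler-b", b"\x00addr-b")
first = queue.next_worker()
```

A worker that never answers simply never re-enters the queue, so no work is sent to it; redistributing a job it already holds would still need a separate timeout, which this sketch does not cover.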
Crawler (Worker)
The worker should use a single DEALER socket along with a SUB. The DEALER is the main socket for async communication, and the SUB subscribes to heartbeat messages from the server. When the worker receives a heartbeat message, it responds to the server on the DEALER socket.
Socket dlr = .. //ZMQ.DEALER
Socket sub = .. //ZMQ.SUB
ZMQ.Poller poller = new ZMQ.Poller(2)
poller.register( dlr, ZMQ.Poller.POLLIN)
poller.register( sub, ZMQ.Poller.POLLIN)
while (true) {
    ZMsg msg = null
    poller.poll(5000)
    if( poller.pollin(0)){
        //message from server
        msg = ZMsg.recvMsg(dlr)
    }
    if( poller.pollin(1)){
        //heartbeat message from server
        msg = ZMsg.recvMsg(sub)
        //reply back with status
        ZMsg statusMsg = ...
        statusMsg.send(dlr)
    }
}
The rest you can figure out on your own. Work through the PHP examples, build stuff, break it, build more, it's the only way you'll learn!
Have fun, hope it helps!
If I do the following:
UdpClient c = new UdpClient();
c.Connect(new System.Net.IPEndPoint(IPAddress.Parse("69.65.85.125"), 9900));
c.Send(new byte[] { 1,2,3,4,5 }, 5);
then I will be sending a packet to my router, and my router will then send that packet to the IP "69.65.85.125".
If I were to capture that packet on the computer that has the IP "69.65.85.125", I would be able to see the port that was opened by the router (client.RemoteEndpoint). How is it possible to see that information without capturing the packet at the other endpoint? Is there a way to query the router?
If your router supports it, you can query it via UPnP. Here is a wrapper library for UPnP I found for .NET; I have never used it, so I can't give you any advice on whether it is good or not.
Look at the ComponetsTest program for example code in the zip for the library. You will need to reference the UPnP documentation to find out what calls you will need to make to the service.
From the library's message board, in reply to someone asking how to find port mappings:
The WANPPPConnection and WANIPConnection services have actions called
GetSpecificPortMappingEntry, simply call this iterating through the
indexes from 0 until an error is returned, each call will return
another UPnP port mapping, you can also get the static mappings with a
different service.
In order to get the public IP, the remote device should respond by sending a UDP packet back to you that contains the IP address and port it saw. This is one of the most fundamental concepts behind a STUN server, commonly used in UDP hole-punching algorithms.
There are several free STUN servers available that do exactly this. Send one of them a "binding" request, and you will get back a response with your public IP address and port.
stun.l.google.com:19302
stun1.l.google.com:19302
stun2.l.google.com:19302
stun3.l.google.com:19302
stun4.l.google.com:19302
stun01.sipphone.com
stun.ekiga.net
stun.fwdnet.net
stun.ideasip.com
stun.iptel.org
stun.rixtelecom.se
stun.schlund.de
stunserver.org
stun.softjoys.com
stun.voiparound.com
stun.voipbuster.com
stun.voipstunt.com
stun.voxgratia.org
stun.xten.com
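To make the "binding" request concrete, here is a rough Python sketch of the exchange, based on the STUN message layout in RFC 5389. The function names are invented for this example, the default server is simply the first entry from the list above, and the sketch skips things a real client should do, such as checking that the transaction id in the response matches and retrying on loss.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: bytes) -> bytes:
    """Build the 20-byte STUN header of a Binding request with no attributes."""
    if len(transaction_id) != 12:
        raise ValueError("transaction id must be 12 bytes")
    return struct.pack(">HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def parse_xor_mapped_address(response: bytes):
    """Return (ip, port) from a Binding success response, or None if absent."""
    if len(response) < 20:
        return None
    msg_type, msg_len, cookie = struct.unpack_from(">HHI", response, 0)
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, min(len(response), 20 + msg_len)
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack_from(">HH", response, offset)
        value = response[offset + 4:offset + 4 + attr_len]
        # Family 0x01 is IPv4; port and address are XORed with the magic cookie.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            port = struct.unpack_from(">H", value, 2)[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack_from(">I", value, 4)[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack(">I", addr)), port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def query_public_endpoint(server=("stun.l.google.com", 19302), timeout=2.0):
    """Send one Binding request over UDP and parse the reply (needs network access)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_binding_request(os.urandom(12)), server)
        data, _ = sock.recvfrom(2048)
        return parse_xor_mapped_address(data)
```

The port reported is the one the router opened for the socket that sent the request, so the same local socket has to be reused afterwards for the mapping to be meaningful.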
If you are truly interested in doing proper UDP hole-punching, check out ICE (Interactive Connectivity Establishment). It's a brilliant algorithm that uses STUN and another protocol called TURN to guarantee a successful connection between peers. (Apple uses it for FaceTime video calls, among others.)
If you're interested, the company I work for has developed a product called IceLink that uses ICE/STUN/TURN to establish direct data streams between peers. SDKs are available for .NET, Mac, iOS, Android, Java, Windows Phone, Windows 8, Unity, Xamarin, and more, and it even includes full support for WebRTC audio/video streams.
Here I am troubleshooting a theoretical problem about HOW servers and clients work on machines. I know all NET Processes, but I am missing something when it comes to code. I was unable to find anything related to this.
I code in Visual C# 2008 and use the regular TcpClient / TcpListener with 2 different projects:
Project1 (Client)
Project2 (Server)
My issues are maybe quite simple:
1-> About how server receives data, event handlers are possible?
In my first server codes i used to make this loop:
while (true)
{
    if (NetworkStream.DataAvailable)
    {
        //stuff
    }
    Thread.Sleep(200);
}
I find this a poor way to handle the incoming data on the server, BUT the server is always ready to receive data.
My question: is there anything like...? ->
AcceptTcpClient();
I want a handler that waits until something happens, in this case data arriving on a specific socket.
2-> General networking I/O methods.
The problem (besides my being a noob) is how to handle multiple data writes.
If I use to send a lot of data in a byte array, the sending can break if I send more data. All data got joined and errors occurs when receiving. I want to handle multiple writes to send and receive.
Is this possible?
About how server receives data, event handlers are possible?
If you want to write call-back oriented server code, you may find MSDN's Asynchronous Server Socket Example exactly what you're looking for.
... the sending can break if I send more data. All data got joined and errors occurs when receiving.
That is the nature of TCP. The standardized Internet protocols fall into a few categories:
             block oriented   stream oriented
reliable     SCTP             TCP
unreliable   UDP              ---
If you really want to send blocks of data, you can use SCTP, but be aware that many firewalls DROP SCTP packets because they aren't "usual". I don't know if you can reliably route SCTP packets across the open Internet.
You can wrap your own content into blocks of data with your own headers or add other "synchronization" mechanisms to your system. Consider an HTTP server: it must wait until it reads an entire request like:
GET /index.html HTTP/1.1␍␊
Host: www.example.com␍␊
␍␊
Until the server sees the CRLFCRLF sequence, it must keep the partially-read data in a buffer. The bytes might come in one at a time in a dozen or more packets. Or, if the client is sending multiple requests in a single stream, a dozen requests might come in a single packet.
You just have to handle this.
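A rough sketch of what "handling this" can look like for the CRLFCRLF case, written in Python as a language-neutral illustration. The class name `DelimitedDecoder` is invented for this example, and a real HTTP server would also have to deal with request bodies, which this sketch ignores.

```python
from typing import List


class DelimitedDecoder:
    """Buffers incoming bytes and returns each chunk that ends with the delimiter."""

    def __init__(self, delimiter: bytes = b"\r\n\r\n") -> None:
        self._delimiter = delimiter
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add the bytes from one read; return zero or more complete requests."""
        self._buffer += chunk
        complete = []
        while True:
            index = self._buffer.find(self._delimiter)
            if index < 0:
                break  # no full request yet, keep the partial data buffered
            end = index + len(self._delimiter)
            complete.append(bytes(self._buffer[:end]))
            del self._buffer[:end]
        return complete


request = b"GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"

# Bytes trickling in one at a time still produce exactly one request.
decoder = DelimitedDecoder()
trickled = []
for i in range(len(request)):
    trickled += decoder.feed(request[i:i + 1])

# Two requests arriving in a single read are split apart.
batched = DelimitedDecoder().feed(request * 2)
```

The same decoder instance has to live as long as the connection does, because the leftover bytes from one read are the start of the next request.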