WCF Alternative for big messages - C#

I have a WCF service deployed in Azure, and a client consuming this service which runs on Windows Phone 7. Everything worked fine until I tried to send the server some larger files, or enumerables with lots of items; then errors occurred. I found out that the maximum message size, maximum array length, etc. can be configured in the configuration file. So I added a few zeros to the default values and it worked. However, I am not happy with this solution, because it is dirty.
My questions are:
1. What exactly are the disadvantages of mindlessly increasing the message size limits, and how does it affect the service?
2. What is the alternative for me instead of increasing the message size?
In particular, I need to send the server a GPS track, which consists of some metadata and a huge number of location points.
If I understand the concept correctly, WCF uses SOAP by default, which is XML-based, so the objects sent are encoded as XML (similar to XML serialization in .NET?). Can it somehow be switched to a binary mode to send BLOBs, or to upload large objects through streams? Or is my only option to bypass the WCF service completely and upload directly to server storage (like SQL Azure or the Azure Blob Service), which exposes an API to do so?
Thank you.

As Peretz mentioned in a comment, that's what is supposed to happen.
The defaults are just that--defaults. Not "recommended" settings, nor pseudo-max sizes. They're available to alter based on your needs (and should be).
You could use the net.tcp binding (if you're not already), which handles data a little better with regard to serialization, but what you're doing is well within the boundaries of WCF and its abilities.
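If you do raise the limits, it's cleaner to do it deliberately, with sizes matched to your real payloads, rather than by appending zeros. A minimal sketch in code, assuming a BasicHttpBinding and illustrative sizes:

```csharp
using System.ServiceModel;

class BindingSetup
{
    // A minimal sketch: raise the quotas to values you have actually
    // reasoned about, instead of int.MaxValue "just in case".
    static BasicHttpBinding CreateBinding()
    {
        var binding = new BasicHttpBinding();
        binding.MaxReceivedMessageSize = 4 * 1024 * 1024;         // 4 MB total message cap (illustrative)
        binding.ReaderQuotas.MaxArrayLength = 1024 * 1024;        // 1 MB per array
        binding.ReaderQuotas.MaxStringContentLength = 512 * 1024;
        return binding;
    }
}
```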

I had much the same problem with huge GPS tracks. I really suggest not using SOAP for this kind of task. You can use WebHttpBinding and implement streaming from your storage, or use something like ASP.NET Web API (it will ship with MVC 4 and can be hosted outside of IIS) for the low-level plumbing of streams to the client. This will let you implement multiple download streams and whatever else you might need for this kind of task.
As an overall design principle for such systems, try not to assume that one tool can solve all your problems; use the right tool for each task. If you have business tasks, implement transactional WS-* based services for them. To transfer huge amounts of data, use something like REST services. This will also make tracks easier to query.
For example: Tracks/{deviceid}/{trackDate}.{format}.
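A rough sketch of what such a streaming endpoint could look like with WebHttpBinding; the contract, storage path, and content type here are illustrative assumptions, not a definitive implementation:

```csharp
using System.IO;
using System.ServiceModel;
using System.ServiceModel.Web;

[ServiceContract]
public interface ITrackService
{
    [OperationContract]
    [WebGet(UriTemplate = "Tracks/{deviceId}/{trackDate}.{format}")]
    Stream GetTrack(string deviceId, string trackDate, string format);
}

public class TrackService : ITrackService
{
    public Stream GetTrack(string deviceId, string trackDate, string format)
    {
        // WCF pulls from the returned stream as the client reads, so the
        // whole track never has to sit in memory at once.
        WebOperationContext.Current.OutgoingResponse.ContentType = "application/octet-stream";
        var path = Path.Combine(@"C:\tracks", deviceId, trackDate + "." + format); // illustrative storage layout
        return File.OpenRead(path);
    }
}

// Host with streaming enabled so large responses are not buffered:
// var binding = new WebHttpBinding { TransferMode = TransferMode.Streamed };
```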
Feel free to ask=)

You should not arbitrarily increase message sizes by chucking zeros onto the end of these settings. Yes, they are defaults that can be changed, but increasing the message size whenever a limit is hit is not something you should always resort to. One of the reasons for having such small defaults is security: it prevents clients from flooding servers with huge messages and taking them down. It also encourages clients to send small messages, which helps with scalability.
There are different encodings you can use; it depends on the bindings involved. I thought WCF encoded SOAP as binary anyway... but I may be wrong; I haven't touched WCF for six months now.
In previous projects, whenever we hit size limits we looked at cutting our data into smaller chunks. One of the best things we ever did was implement pageable grids in our GUIs, which only fetched 10-20 or so records from the server at a time. Entity Framework was great in letting us write a single generic skip/take query to do this for all of our grids.
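A hedged sketch of that kind of generic skip/take helper (names are illustrative; the explicit ordering is required because Skip/Take need a stable sort in Entity Framework):

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class PagingExtensions
{
    // Fetch one page of any query: order it, skip past earlier pages,
    // and take a single page's worth of rows.
    public static IQueryable<T> GetPage<T, TKey>(
        this IQueryable<T> source,
        Expression<Func<T, TKey>> orderBy,
        int pageIndex,
        int pageSize)
    {
        return source.OrderBy(orderBy)
                     .Skip(pageIndex * pageSize)
                     .Take(pageSize);
    }
}

// Usage (hypothetical context): context.Orders.GetPage(o => o.Id, 0, 20);
```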
Just increasing sizes is an easy fix... until you can't increase any further. It's a brittle and broken approach.


Should I use WCF for a simple textual wire protocol?

I need to write a program that will communicate with other .NET programs ... but also a legacy VFP program over TCP. I need to choose a fairly simple TCP message format that the VFP programmer can use. It could be as simple as a sequence of small XML blobs delimited by... I dunno, a null character? Whatever.
I need to choose between TcpListener/TcpClient and WCF. I started researching WCF, but its architecture seems opaque, and the built-in Visual Studio templates are heavily biased toward making "web services" that act like a sort of RPC mechanism but require a special "host" or web server external to the application. And Microsoft's six-stage tutorial makes WCF sound pretty cumbersome (involving code generators, command-line tools, and XML cruft just to remotely subtract or multiply two numbers).
I want a self-contained app (no "host"), I want control of the wire protocol, and I want to understand how it works. WCF didn't seem to facilitate any of these things, so I abandoned it in favor of TcpListener/TcpClient.
However, the program is to serve as an intermediary between a single (VFP) server and many (.NET) clients, with communication in both directions and across different connections. Using TcpListener and TcpClient, the work of juggling the connections and threads is getting messy, I have no experience with IAsyncResult, and I'm just not confident in my code quality.
So I would like to solicit opinions again: should I still consider WCF instead? If yes, can you point me toward answers to the following questions?
Where on the web is a good explanation of WCF's architecture? Or do I need a book?
How is bi-directional communication done in WCF, where either side (of a single TCP connection) can send a message at any time?
How can I get past all the web-services and RPC mumbo-jumbo, and control the wire protocol?
In WCF, how do I shut down the app cleanly, closing all connections in parallel without hacky Thread.Abort() commands?
If no, how can I set up my code (which uses TcpListener/TcpClient/NetworkStream) so that I can read a message from a NetworkStream but also accept requests from other connections, shut down cleanly at any time, and avoid wasting CPU time polling queues and NetworkStreams that are inactive?
The short answer: go with WCF. While there's a good amount of tooling, code generation, and other bells and whistles around it, nothing prevents you from setting everything up in code (you can define your contracts, set up the endpoints, etc., all in code).
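For example, here is a minimal self-hosted service set up entirely in code; the contract, address, and port are illustrative, not a prescribed layout:

```csharp
using System;
using System.ServiceModel;

[ServiceContract]
public interface ICalc
{
    [OperationContract]
    int Subtract(int a, int b);
}

public class Calc : ICalc
{
    public int Subtract(int a, int b) { return a - b; }
}

class Program
{
    static void Main()
    {
        // No config file, no external host: the console app owns the lifetime.
        using (var host = new ServiceHost(typeof(Calc)))
        {
            host.AddServiceEndpoint(typeof(ICalc), new NetTcpBinding(),
                "net.tcp://localhost:9000/calc");
            host.Open();
            Console.WriteLine("Listening; press Enter to stop.");
            Console.ReadLine();
            host.Close(); // graceful: lets in-flight calls finish
        }
    }
}
```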
For your specific questions:
WCF Architecture - This is pretty basic, and it should get you up and running relatively quickly.
What you are looking for is duplex services. NetTcpBinding supports duplex services out of the box (you can also do duplex over HTTP, but you need a specific binding).
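A sketch of what a duplex contract looks like (interface names are illustrative); the callback contract is what lets the server push to the client at any time over the same connection:

```csharp
using System.ServiceModel;

// Implemented by the client; the server calls it to push messages.
public interface IClientCallback
{
    [OperationContract(IsOneWay = true)]
    void OnServerMessage(string payload);
}

[ServiceContract(CallbackContract = typeof(IClientCallback))]
public interface IDuplexService
{
    [OperationContract(IsOneWay = true)]
    void Send(string payload);
}

// Inside a service operation, the server grabs the current client's
// callback channel and can use it whenever it likes:
// var cb = OperationContext.Current.GetCallbackChannel<IClientCallback>();
// cb.OnServerMessage("hello from the server");
```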
If you want to control the wire format, you will want to create a custom encoder, but I have to strongly recommend against it. You want XML messages delimited by a null character? There's no need for that: the nature of XML is that you can nest child elements to perform the appropriate grouping, and there's no limit to how many elements you can nest.
To shut down, simply close the ServiceHost by calling Close; this allows all outstanding requests to complete and then shuts down gracefully. If you really want to tear down without concern, call Abort instead.
In the end, I'd strongly recommend that you not use NetTcpBinding; VFP will have a difficult time consuming that protocol. If you use an HTTP-based binding instead, there are tools that VFP can easily use to make the call and consume the contents (assuming you stick with XML).
Just to tack on a comment about using DCOM: VFP can use DCOM, but it has to be done with CreateObjectEx()... the only big difference is that you need to know the GUID of the class instance you are connecting to on whatever server it lives on, AND the name of the machine it is going to connect to.
The remote object then does its work via exposed functions, and VFP, calling it from some other machine on the network, treats it as if the function were being performed locally and gets whatever the return values are.
I've done DCOM with VFP, even as far back as 10 years ago, for an insurance company...

SQL Service Broker vs Custom Queue

I am creating a mass mailer application, where a web application sets up an email template and then queues a bunch of email addresses for sending. The other side will be a Windows service (or exe) that polls this queue, picking up the messages for sending.
My question is, what would the advantage be of using SQL Service Broker (or MSMQ) over just creating my own custom queue table?
Everything I'm reading suggests I use Service Broker, but I really don't see the huge advantage over a flat table (which would be a lot simpler for me to work with). For reference, the application will be used to send 50,000-100,000 emails almost daily.
Do you know how to implement a queue over a flat table? This is not a silly question; implementing a queue over a table correctly is much harder than it sounds. Queue-like tables are notoriously deadlock-prone, and you need to carefully consider the table design and the enqueue and dequeue operations. Also, do you know how to scale your polling of the table? And how are you going to handle retries and timeouts (i.e., what timers are used for)?
I'm not saying you should use SSB. The learning curve is very steep, and it is primarily a distributed application platform, not a local queueing product, so some features, like dialogs, will actually be obstacles for you rather than advantages. I'm just saying that you must also consider the difficulties of flat-table queues. If you have never implemented a flat-table queue, be warned: there are many dragons under that bridge.
50k-100k messages per day is nothing; that's only about one message per second. If you want 100k per minute, then we have something to talk about.
If you ever need to port to another vendor's database, you will have fewer problems if you used normal tables.
As you seem to have only one reader and one writer for your queue, I would tend to use a standard table until you hit problems. However, if you start to feel the need for "locking hints" and the like, that's the time to switch to Service Broker queues.
I would not use MSMQ if both the sender and the reader need a database connection to work. MSMQ would be a good fit if the sender did not talk to the database at all, as it lets the sender keep working when the database is down. However, having to set up and maintain both MSMQ and the database is likely to be more work than it is worth for most systems.
For advantages of Service Broker see this link:
http://msdn.microsoft.com/en-us/library/ms166063.aspx
In general we try to use a tool or standard functionality rather than building things ourselves. This lowers the cost and can make upgrading easier.
I know this is an old question, but it is sufficiently abstract to stay relevant for a long time.
After using both paradigms, I would suggest the flat table. It is surprisingly scalable and nifty, provided the correct hints are used.
Once the application goes distributed, or starts using multiple Always On availability groups with different read-write and read-only servers, Service Broker (or some other method of distributed communication) becomes a necessity.
Flat table
needs only a few hints (highly dependent on the isolation level) to work scalably and reliably in the consumer (READPAST, UPDLOCK, ROWLOCK); see the sketch after this list
the order of message processing is not set in stone
the consumer must make sure that the message stays in the queue if the processing fails
needs some polling mechanism (job, CDC (here lies madness :)), external application...)
turn off maintenance jobs and automatic statistics for the table
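A hedged sketch of the destructive-read pattern those hints enable (table and column names are illustrative); READPAST lets competing consumers skip each other's locked rows instead of blocking on them:

```csharp
using System.Data.SqlClient;

public static class FlatQueue
{
    // Atomically claim-and-remove the oldest row; other readers skip it.
    const string DequeueSql = @"
        WITH next AS (
            SELECT TOP (1) QueueId, Payload
            FROM dbo.EmailQueue WITH (ROWLOCK, UPDLOCK, READPAST)
            ORDER BY QueueId)
        DELETE FROM next
        OUTPUT deleted.QueueId, deleted.Payload;";

    public static string DequeueOne(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(DequeueSql, conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                // Wrap the dequeue and the processing in one transaction if
                // a failed send must leave the message in the queue.
                return reader.Read() ? reader.GetString(1) : null;
            }
        }
    }
}
```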
Service broker
needs extremely overblown "infrastructure" (message types, contracts, services, queues, activation procedures; it must be re-enabled after each server restart, and conversations need to be correctly created and dropped...)
is extremely opaque; we spent ages trying to make it run after it mysteriously stopped working
there is a predefined order of message processing
the tables it uses can cause deadlocks themselves if SB is overused
is the only way (except for linked servers...) to send messages directly from a database on the read-write server of one HA group to a database that is read-only in that HA group (without any external app)
is the only way to send messages between different servers (linked servers are a big NO-NO (unless they become a YES-YES; you know the drill: it depends)) (without any external app)

How do you access in-memory services from web applications?

Say I need to design an in-memory service because of a very high-load read/write system, and I want to dump the state of the objects to storage every 2 minutes. How would I access the in-memory objects/data from within a web application?
(I was thinking a Windows service would be running in the background handling the in-memory service etc.)
I want the fastest possible solution, and I would guess most people would say to use a web service. What other options would I have? I just don't understand how I could hook into the Windows service's objects.
(Please don't ask why I would want to do this, maybe you're right and it's a bad idea but I am also curious if this type of architecture is possible.)
Update
I was looking at swoopo.com, which I would think gets a lot of hits near the end of its auctions; and since the auction keeps resetting, the hits to the database would be just crazy. So I was wondering whether they keep it in memory and dump to the DB every x minutes...
What you're describing is called a cache, with a facade front-end.
You write a facade to which you commit your changes and from which you acquire your datasets. The facade queues up reads and writes and commits when the queue is full or after a certain amount of time has passed. Your web application has a single point of access to the data (the facade), and the facade is structured in such a way as to avoid writing to and reading from storage too often.
Most relational database management systems do this for you. They do this kind of optimization and queuing internally so writing another layer on top of it would only slow things down. So don't write a cache if you're using an RDBMS.
Regarding the specifics of accessing such a facade, you can treat it as just an object, and implement it however you want (its own thread, a thread pool, a Web service, a Windows service, whatever).
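A very rough sketch of such a facade, assuming a write-behind cache that flushes on size or on a timer (all names are illustrative, and a real version would need to think hard about durability):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public class WriteBehindFacade<T> : IDisposable
{
    private readonly ConcurrentQueue<T> _pending = new ConcurrentQueue<T>();
    private readonly Action<IReadOnlyList<T>> _flush; // e.g. a bulk insert to the DB
    private readonly int _maxQueued;
    private readonly Timer _timer;

    public WriteBehindFacade(Action<IReadOnlyList<T>> flush, int maxQueued, TimeSpan interval)
    {
        _flush = flush;
        _maxQueued = maxQueued;
        _timer = new Timer(_ => Flush(), null, interval, interval);
    }

    public void Write(T item)
    {
        _pending.Enqueue(item);
        if (_pending.Count >= _maxQueued) Flush(); // size-triggered flush
    }

    public void Flush()
    {
        var batch = new List<T>();
        T item;
        while (_pending.TryDequeue(out item)) batch.Add(item);
        if (batch.Count > 0) _flush(batch); // timer- or size-triggered
    }

    public void Dispose()
    {
        _timer.Dispose();
        Flush(); // push out whatever is still buffered
    }
}
```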
Any remoting technology would work such as sockets, pipes and the like.
Check out: www.remobjects.com
You could use Microsoft Message Queuing (MSMQ), a service bus, or even .NET Remoting.
See http://www.nservicebus.com/, or http://code.google.com/p/masstransit/.
You could hook into the Windows service's objects by using Remoting or WCF; both offer very fast interprocess communication. Sockets are fast too, but they are more cumbersome to program against than WCF, and there is a ton of WCF documentation and support online.
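For same-machine IPC, WCF's named pipe binding is the usual fast path; a minimal client-side sketch (contract and pipe address are illustrative assumptions):

```csharp
using System;
using System.ServiceModel;

[ServiceContract]
public interface ICacheService
{
    [OperationContract]
    string Get(string key);
}

class IpcClient
{
    static void Main()
    {
        // The Windows service would host this same contract at the pipe address.
        var factory = new ChannelFactory<ICacheService>(
            new NetNamedPipeBinding(),
            new EndpointAddress("net.pipe://localhost/cache"));
        ICacheService proxy = factory.CreateChannel();
        Console.WriteLine(proxy.Get("some-key"));
        factory.Close();
    }
}
```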
Databases provide a level of caching for you. The advantage of an in-memory golden copy such as the one you propose is that it never has to read from disk when a request comes in, and if you host it on the same machine as IIS (provided you have enough RAM for both) there is no extra network hop, making it much faster than querying a DB. The downside of this approach is that it does not scale as well if you need to add machines for load balancing.
Third party messaging providers such as TIBCO are also worth looking at.

What is the best way to scale out work to multiple machines?

We're developing a .NET app that must make up to tens of thousands of small web service calls to a third-party web service. We would prefer a chunkier call, but the third party does not support one. We've designed the client to use a configurable number of worker threads, and through testing have code that is fairly well optimized for one multicore machine. However, we still want to improve the speed and are looking at spreading the work across multiple machines. We're well versed in typical client/server/database apps, but new to designing for multiple machines. So, a few questions related to that:
Is there any other client-side optimization, besides multithreading, that we should look at that could improve the speed of an HTTP request/response? (I should note this is a non-standard web service, so it is implemented using WebClient rather than a WCF or SOAP client.)
Our current thinking is to use WCF to publish chunks of work to MSMQ and run clients on one or more machines to pull work off the queue. We have experience with WCF + MSMQ, but we want to be sure we're not missing better options. Are there other, better ways to do this today?
I've seen some third-party tools like Digipede and Microsoft's HPC offerings, but these seem like overkill. Any experience with those products, or reasons we should consider them over rolling our own?
It sounds like your goal is to execute all these web service calls as quickly as you can and then tabulate the results. Given that, your greatest efficiency lever is going to be scaling the number of concurrent requests you can make.
Be sure to look at your client-side connection limits. By default, I think the system allows only 2 connections per host. I haven't tried this myself, but by upping the number of connections with this property you should theoretically see a multiplier effect in the number of requests a single machine can generate. There's more info on the MS forums.
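The property in question is presumably ServicePointManager.DefaultConnectionLimit, which caps concurrent HTTP connections per host (the classic client default is 2):

```csharp
using System.Net;

class Startup
{
    static void Main()
    {
        // Allow many more concurrent connections per host than the default
        // of 2; the right number depends on your workload and the server.
        ServicePointManager.DefaultConnectionLimit = 32; // illustrative value
    }
}
```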
The MSMQ option works well. I'm running that configuration myself. ActiveMQ is also a fine solution, but MSMQ is already on the server.
You have a good starting point. Get that in operation, then move on to performance and throughput.
At CodeMash this year, Wesley Faler gave an interesting presentation on this sort of problem. His solution was to store "jobs" in a DB and then use clients to pull down work and mark its status when complete.
He then pushed the whole infrastructure up to Amazon's EC2.
Here are his slides from the presentation; they should give you the basic idea.
I've done something similar with multiple PCs locally; the basics of managing the workload were similar to Faler's approach.
If you have already optimized the code, you could look into optimizing the network side to minimize the number of packets sent (see the sketch after this list):
reuse HTTP sessions (i.e., run multiple transactions over one session by keeping the connection open, which reduces TCP overhead)
reduce the HTTP headers in the request to the minimum to save bandwidth
if the server supports it, use gzip to compress the body of the request (you need to balance the CPU cost of compression against the bandwidth saved)
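A hedged sketch of those items with HttpWebRequest (the URL is illustrative, and the server must actually accept gzip-compressed request bodies):

```csharp
using System;
using System.IO.Compression;
using System.Net;

class CompressedCall
{
    static void Post(byte[] body)
    {
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/api"); // illustrative
        request.Method = "POST";
        request.KeepAlive = true;                                   // reuse the TCP connection across calls
        request.AutomaticDecompression = DecompressionMethods.GZip; // accept gzip responses
        request.Headers["Content-Encoding"] = "gzip";               // we gzip the body we send
        using (var gzip = new GZipStream(request.GetRequestStream(), CompressionMode.Compress))
        {
            gzip.Write(body, 0, body.Length);
        }
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine(response.StatusCode);
        }
    }
}
```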
You might want to consider Rhino Service Bus instead of MSMQ. The source is available here.

What is the best solution for collecting vehicle tracking GPS data in C#?

I have several vehicles that send data to a server every minute. The server should listen for the data and decode it for storage in the database. There will be thousands of entries per minute. What is the best approach to this problem?
My personal favorite: a WCF or web service farm pumps the data into a Microsoft Message Queue (MSMQ), and one or more application servers convert the data and put it into the DB.
As you get deeper (if you ever need to), you can use the features of MSMQ to handle timeouts, load buffering, 'dead-letters', server failures, whatever. Consider this article.
On the web-facing side, because this layer is stateless and thin, you can easily scale it out without thinking about complex load balancing. You can use DNS load balancing to start and then move to a better solution when you need it.
As a further note, by using MSMQ you can also see how far "behind" the system is by looking at how many messages are in the queue. If that number is near 0, you're good. If it keeps rising non-stop, you need more throughput (add another application server).
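A minimal sketch of both halves with System.Messaging (queue path and payload type are illustrative):

```csharp
using System.Messaging;

public class GpsPipeline
{
    const string QueuePath = @".\private$\gpsReadings"; // illustrative local queue

    // Web-facing side: enqueue and return immediately.
    public void Enqueue(string rawReading)
    {
        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Send(rawReading);
        }
    }

    // Application-server side: block until a message arrives, then decode
    // and write to the database at its own pace.
    public void PumpOne()
    {
        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });
            var message = queue.Receive();
            var raw = (string)message.Body;
            // decode 'raw' and insert into the DB here
        }
    }
}
```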
We're doing exactly what Jason says, except using a direct TCP/UDP socket listener with a custom payload for higher performance.
How long do you expect each operation to take? From what you're saying, it seems like you can just write the data straight to the DB after processing, so you shouldn't have to synchronize your threads at all (the DB should take care of that for you).
