Architectural design of write heavy application [closed] - c#

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I am dealing with a live tracking system where one device can push around 2 million GPS points every year (i.e. 1 point every 5 seconds, operating for 8 hours a day, 365 days a year). If this operates on a global scale with thousands of devices, it results in billions of records per year.
I know SQL Server can handle it. But I need to be able to perform live tracking with thousands of devices performing concurrent writes. It works fine with a few devices but I can see this being CPU intensive when I open lots of tracking sites.
I am planning to try:
MongoDB
A socket approach with Kaazing.
Any alternative suggestions?

Given the information you have posted, there is nothing wrong with your architecture. The devil is in the details, though. For one, a lot depends on how well your database is designed: how well written your queries are, which indexes and triggers exist, and so on.
Also, if this is a mobile device of any type, you shouldn't be using a traditional socket-based connector, because you cannot depend on a stable TCP connection to a remote server. You should use a stateless architecture like REST to expose and write your data. REST is very easy to implement in .NET, by the way. This should move the scaling difficulty from the database to the web server.
Lastly, to minimize work done on the server, I would implement some sort of caching or buffer pool system to keep data on each device for reading, and create a write cache for sending data to the central server. The write cache will be vital, seeing as you cannot depend on a stable TCP connection with transaction management from the server. You need to maintain a cache of data to be written (i.e. a queue) and pop the queue only when you have confirmation from the server that it has received the data you wrote. The queue should be drained whenever there is data and a data connection. However, I would need to know more about your requirements before I could say for sure or give more details.
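The write cache described above could look roughly like the following. This is only a minimal sketch, not a definitive implementation: the `GpsPoint` fields, the `WriteCache` class, and the `trySend` delegate (which stands in for whatever REST call the device makes and is assumed to return true only when the server confirms receipt) are all hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical GPS sample; the field names are assumptions for illustration.
public sealed class GpsPoint
{
    public DateTime TimestampUtc { get; set; }
    public double Latitude { get; set; }
    public double Longitude { get; set; }
}

// Device-side write cache: a point stays queued until the server confirms it.
public sealed class WriteCache
{
    private readonly Queue<GpsPoint> _pending = new Queue<GpsPoint>();
    private readonly Func<GpsPoint, bool> _trySend; // assumed: true only on server confirmation
    private readonly object _gate = new object();      // protects the queue
    private readonly object _flushGate = new object(); // allows one flush at a time

    public WriteCache(Func<GpsPoint, bool> trySend)
    {
        if (trySend == null) throw new ArgumentNullException("trySend");
        _trySend = trySend;
    }

    public int PendingCount
    {
        get { lock (_gate) { return _pending.Count; } }
    }

    public void Enqueue(GpsPoint point)
    {
        lock (_gate) { _pending.Enqueue(point); }
    }

    // Call whenever there is data and a data connection.
    // Returns the number of points the server confirmed during this call.
    public int Flush()
    {
        lock (_flushGate)
        {
            int confirmed = 0;
            while (true)
            {
                GpsPoint next;
                lock (_gate)
                {
                    if (_pending.Count == 0) return confirmed;
                    next = _pending.Peek(); // peek only; do not pop yet
                }

                bool ok;
                try { ok = _trySend(next); }
                catch (Exception) { ok = false; } // a transport error counts as "not confirmed"

                if (!ok) return confirmed; // keep the point for the next attempt

                lock (_gate) { _pending.Dequeue(); } // pop only after confirmation
                confirmed++;
            }
        }
    }
}
```

Note that this gives at-least-once delivery: if a confirmation is lost on the way back, the same point is sent again, so the server side would probably need to tolerate duplicates (for example by keying on device id and timestamp).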

Related

Articles on how to organize background queue operations [closed]

Closed 10 years ago.
Now I'm thinking about how to organize the architecture of a system. The system will consist of a web site, where a user can upload some documents and then get them back processed, and a background daemon with a queue of tasks that processes the provided documents.
My question is:
Should I implement the daemon described above as a WCF service with only named pipes (no network access to this service is needed)?
Any suggestions or tips on that?
The data a user can provide is just a bunch of XML files. The ASP.NET web site will expose functionality to receive these XML files and then somehow should be able to pass them to the daemon.
Could you please point me to some articles on that topic?
Thanks in advance!
POST EDIT
After some hours exploring MSMQ, which was suggested here, my impression is that this technology is more suited to a distributed architecture (processing nodes located on separate machines and exchanging messages between different computers over the network).
At the moment, separating onto independent machines is not needed. There will be just one machine hosting both the ASP.NET website and the processing program.
Is using MSMQ really necessary?
POST EDIT #2
As I am using the .NET Framework here, please suggest only options that are compatible with .NET. There really are no other options here.
If your deployment will be on a single server, your initial idea of a WCF service is probably the way to go - see MSDN for a discussion regarding hosting in IIS or in a Windows Service.
As @JeffWatkins said, a good pattern to follow when calling the service is to simply pass it the location on disk of the file that needs processing. This will be much more efficient when dealing with large files.
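To make the "pass the file location, not the file" pattern concrete, here is a minimal in-process sketch of a background queue of file paths. It is not the WCF service itself: `DocumentQueue`, `Submit`, and the `processFile` callback are hypothetical names, and a real daemon would sit behind a service endpoint and probably persist the queue so that pending work survives a restart.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical background queue: the web site submits file paths and a single
// worker processes them one at a time, in submission order.
public sealed class DocumentQueue : IDisposable
{
    private readonly BlockingCollection<string> _paths = new BlockingCollection<string>();
    private readonly Task _worker;

    public DocumentQueue(Action<string> processFile)
    {
        if (processFile == null) throw new ArgumentNullException("processFile");

        _worker = Task.Factory.StartNew(() =>
        {
            // Blocks while the queue is empty; ends once CompleteAdding() has
            // been called and the remaining items are drained.
            foreach (string path in _paths.GetConsumingEnumerable())
            {
                try
                {
                    processFile(path);
                }
                catch (Exception ex)
                {
                    // One bad document should not stop the worker.
                    Console.Error.WriteLine("Failed to process " + path + ": " + ex.Message);
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Only the path is queued; the file itself stays on disk.
    public void Submit(string path)
    {
        _paths.Add(path);
    }

    // Stops accepting new work and waits for queued work to finish.
    public void Dispose()
    {
        _paths.CompleteAdding();
        _worker.Wait();
        _paths.Dispose();
    }
}
```

Queuing only the path keeps each queue entry tiny regardless of document size, which is the efficiency argument made above for large files.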
I think the precise approach taken here will depend on the nature of files you are receiving from users. In the case of quite small files you may find it more efficient to stream them to your service from your website such that they never touch the disk. In this case, your service would then expose an additional method that is used when dealing with small files.
Edit
Introducing a condition where the file may be streamed is probably a good idea, but it would be valuable for you to do some testing so you can figure out:
Whether it is worth doing
What the optimal size is for streaming versus writing to disk
My answer was based on the assumption that you were deploying to a single machine. If you are wanting something more scalable, then yes, using MSMQ would be a good way to scale your application.
See MSDN for some sample code for building a WCF/MSMQ demo app.
I've designed something similar. We used a WCF service as the connection point, then RabbitMQ for queuing up the messages. A separate service then works through the items in the queue, sending an async callback when the task is finished and thereby completing the WCF call (WCF has many built-in features for dealing with this).
You can set up timeouts on each side, or you can even choose to drop the WCF connection and use the async callback to notify the user that "processing is finished".
I had much better luck with RabbitMQ than MSMQ, FYI.
I don't have any links for you, as this is something our team came up with and has worked very well (1000 TPS with a 4 server pool, 100% stateless) - Just an Idea.
I would give a serious look to ServiceStack. This functionality is built-in, and you will have minimal programming to do. In addition, ServiceStack's architecture is very good and easy to debug if you do run into any issues.
https://github.com/ServiceStack/ServiceStack/wiki/Messaging-and-redis
On a related note, my company does a lot of asynchronous background processing with a web-based REST api front end (the REST service uses ServiceStack). We do use multiple machines and have implemented a RabbitMQ backend; however, the RabbitMQ .NET library is very poorly-designed and unnecessarily cumbersome. I did a redesign of the core classes to fix this issue, but have not been able to publish them to the community yet as we have not released our project to production.
Have a look at http://www.devx.com/dotnet/Article/27560
It's a little bit dated but can give you a headstart and basic understanding.

What is a thread-safe way to maintain a global state accessible by all threads [closed]

Closed 10 years ago.
I have a client/server setup with a proxy sitting between them. I'd like the proxy to behave a bit smarter so that when 5 clients are connected, certain packets can be relayed from client to client instead of client -> server -> client. Each client currently gets its own thread, so what is the best way to share a global state that holds references to the other clients? This state object would really just hold an id associated with each client and a reference to its socket so I can send on it.
Would something as simple as a dictionary work? My concern is that while I'm accessing the dictionary, another client might connect or disconnect, which would modify the dictionary as I'm accessing it.
Dictionary would work, if dictionary is the data structure suitable for your situation. This decision has little to do with multithreading.
That said, if you decide to use a dictionary, you should synchronize access to it, e.g. by:
using lock statements, or
switching to ConcurrentDictionary, a thread-safe version of Dictionary.
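As a rough illustration of the ConcurrentDictionary option, a shared client registry might look like the sketch below. `ClientRegistry` and its method names are hypothetical, and the connection type is left generic so that it could be a `Socket` or any wrapper around one.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical shared registry mapping client ids to their connections.
// ConcurrentDictionary makes each individual operation thread-safe, so
// clients can connect and disconnect while other threads do lookups.
public sealed class ClientRegistry<TConnection>
{
    private readonly ConcurrentDictionary<int, TConnection> _clients =
        new ConcurrentDictionary<int, TConnection>();

    // Returns false if the id is already registered.
    public bool Register(int clientId, TConnection connection)
    {
        return _clients.TryAdd(clientId, connection);
    }

    // Returns false if the id was not registered (e.g. already disconnected).
    public bool Unregister(int clientId)
    {
        TConnection removed;
        return _clients.TryRemove(clientId, out removed);
    }

    public bool TryGet(int clientId, out TConnection connection)
    {
        return _clients.TryGetValue(clientId, out connection);
    }

    // Point-in-time copy that is safe to iterate while clients come and go.
    public KeyValuePair<int, TConnection>[] Snapshot()
    {
        return _clients.ToArray();
    }
}
```

The dictionary only protects the map itself. If two threads can relay to the same client at once, writes to that client's socket would still need their own synchronization, and a lookup can succeed just before the client disconnects, so send failures have to be handled either way.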
You should read up on Semaphores.
But in the context of C#, you need to make sure you are carefully and appropriately using the lock keyword. This stuff is hard, and very easy to get wrong in catastrophic ways.
If your proxy is set up as a WCF service then you could use the ASP.NET cache. Session state in WCF is per call, but the cache is still there lurking behind the scenes. Note that this is simply going to give you caching - it is not going to give you any pre-rolled access control to the cache, so you can still get race issues.
For further info on using the Cache, check this previous SO question How to return HttpContext.Current.Cache from WCF? and this blog post Enable Session State in a WCF service .

Recommendations on a Client Host application in .Net [closed]

Closed 11 years ago.
My question is concerning an application that will use a host application where the database server is, and a few clients that will be sending information to the host.
Basically the host application receives some data from the clients, performs some calculations and error checking, and then (if all goes well) stores the information in the database. The data received could easily be serialized or sent as a character-separated string of fewer than 50 characters.
I know my basic option for developing this communication application is WCF, and I have worked with it before, but my concerns in this particular case are that:
The host and the clients will most of the time be connected to the internet through wireless USB modems, which as we all know do not provide the most reliable connection.
There will be many clients all sending information to the host at the same time, each with its own identification id, since that determines the type of the data received and what it represents.
Due to the not-so-reliable connections, I would like to be able to know whether a packet has been sent successfully and, if not, to keep trying until the communication is complete.
New data will be sent from each client every couple of minutes, and if, let's say, we have a connection failure for 5 minutes, I would like to be able to send all unsent information when the connection is restored.
Lastly, I'm trying to figure out how I would know where to contact the host, as the USB modems do not have a static IP and it could change from time to time.
My thought is either to establish communication through WCF services, where the clients would send all information to the host directly, or to serialize the data from the clients in XML format, upload it to a third server that is available all of the time, and then have the host application try every minute to synchronize with the information available on that third server.
I hope this lengthy post makes clear what I'm trying to accomplish, and I would really appreciate your thoughts on a project like this.
Instead of starting a discussion, I'll try to give you an answer.
I have implemented a system like the one you're describing. Based on that experience, I can tell you that you will want to look at a message-based system for the communication between your clients and host(s).
A message-based system allows you to handle the communication transparently. It allows you to resend a message in case it failed to transmit.
To keep it short, there are various message-based frameworks available to the .NET community. To name a few: NServiceBus, MassTransit, Rhino Service Bus, or maybe the more lightweight Agatha RRSL.
The point is, there are quite a few. It's up to you to research them and find out which one suits your needs best.

Compressing TCP in SQL Server 2008 [closed]

Closed 11 years ago.
We have a .NET 4.0 application using Entity Framework. The application connects remotely over TCP to SQL Server. On the LAN it's fast, but over the internet the traffic is very high.
All we would like to do is turn on some TCP compression, but it looks like SQL Server 2008 doesn't provide this feature.
What's the simplest solution to make TCP communication compressed?
You are trying to solve the problem at the wrong level / layer. Profile your communication with SQL Server and start thinking about optimizations. That is the only valid place to start. Incorrect usage of EF can lead to terribly chatty and slow communication with SQL Server, and that is something no compression will solve: chatty sequential communication means multiple unnecessary roundtrips to the database, and every single roundtrip adds its latency to the processing time. I have already seen solutions (and some of them I built myself) where incorrect usage of EF created thousands of lazy-loading queries within the processing of a single request.
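The cost of sequential roundtrips is easy to put numbers on. The helper below is a minimal sketch: `RoundTripCost` is a hypothetical name, the latencies in the usage note are assumed values rather than measurements, and the model ignores query execution and data transfer time entirely.

```csharp
using System;

// Hypothetical helper: pure network waiting time added by N sequential
// roundtrips. Query execution and data transfer time are ignored.
public static class RoundTripCost
{
    public static TimeSpan LatencyOverhead(int roundTrips, TimeSpan latencyPerRoundTrip)
    {
        if (roundTrips < 0) throw new ArgumentOutOfRangeException("roundTrips");
        // Sequential calls cannot overlap, so their latencies simply add up.
        return TimeSpan.FromTicks(latencyPerRoundTrip.Ticks * roundTrips);
    }
}
```

With an assumed 50 ms roundtrip over the internet, `RoundTripCost.LatencyOverhead(1000, TimeSpan.FromMilliseconds(50))` comes to 50 seconds of waiting for a request that triggers 1000 lazy-loading queries. The same request at an assumed 1 ms on a LAN waits 1 second. Compression shrinks the payload but leaves the number of roundtrips, and therefore this overhead, unchanged.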
And yes, it can end up with replacing part of your EF code with stored procedures and custom queries and, in the worst case, abandoning EF entirely.
If your problem is the amount of transferred data, it is again time to think about optimization: reduce the transferred data to only the subset you need, or do some preprocessing on SQL Server in a stored procedure or view. By the way, this kind of thinking belongs in application design, when you should consider the target environment in which the application will run.
Edit:
One more note. It is not very common to communicate with a database over a WAN. Usually such a requirement leads to implementing another tier, with business logic sitting on a server on the same LAN as SQL Server. This business logic tier exposes web services to clients on the WAN. This architecture can greatly reduce the latency of communicating with the database, and at the same time a well-designed message exchange between client and service can lead to additional improvements.
SQL Server uses the TDS protocol.
It doesn't care whether you use TCP or Named Pipes or "the force" to get data from A to B.
You'd need to enable or set up some kind of compression (Google search):
at the OS level
the interface/hardware level
the network level

Which .NET Memcached client do you use, EnyimMemcached vs. BeITMemcached? [closed]

Closed 10 years ago.
Seems like both EnyimMemcached (https://github.com/enyim/EnyimMemcached) and BeITMemcached (http://code.google.com/p/beitmemcached/) are popular .NET Memcached libraries. Both are reasonably active projects under development and have over a thousand downloads. Trying to figure out which one to use but found competing remarks! I did read another related post but still want to ask more people on this before making a decision.
EnyimMemcached claims on its project homepage (https://github.com/enyim/EnyimMemcached), that
based on our non-disclosed specially handcrafted in-house performance test we're the fastest C# client ever, using negative amount of system resources, be it memory or CPU time
and
we follow memcached's protocol specification as strictly as no one else: even the memcached guys ask us if they don't understand something
While BeITMemcached claims on its project wiki page (http://code.google.com/p/beitmemcached/wiki/Features) that
We have performed extensive functional testing and performance testing of the BeIT Memcached client and we are satisifed that it is working as it should. When we compared the performance against two other clients, the java port and the Enyim memcached client, our client consumed the least resources and had the best performance. It is also following the memcached protocol specifications more strictly, has the most memcached features, and is still much smaller in actual code size.
So for those who have experience on these or anything similar, which client did you choose to use and possibly why you chose the one you chose?
Thank you,
Ray.
We tested both and found Enyim to perform the best for our expected usage scenario: many (but not millions) cached objects, and millions of cache-get requests (average web site concurrency load = 16-20 requests.)
Our performance factor was measuring the time from making the request to having the object initialized in memory on the calling server. Both libraries would have sustained the job, but the enyim client was preferred in our testing.
There is a comparison between Enyim and BeIT at sysdot.wordpress.com/2011/03/08/memcached-clients-which-ones-best/
I have found Enyim to work the best.
It is easy to use, reliable and fast :)
The Enyim client's Store() sometimes does not work correctly. It happens when the key is not present in the cache, in most cases after a restart of the memcached service. This construction:
T val = _client.Get<T>(key);
if (val == null)
{
    // ... fill the val variable ...
    var result = _client.Store(StoreMode.Add, key, val);
    // ... result is sometimes false ...
}
works about half the time. The T entity is [Serializable].
