I need to build a server to accept client connections at a very high frequency and load (each user will send a request every 0.5 seconds and should get a response in under 800 ms; I should be able to support thousands of users on one server). The assumption is that the SQL Server is finely tuned and will not pose a problem (an assumption that, of course, might not be true).
I'm looking to write a non-blocking server to accomplish this. My back end is an SQL Server which is sitting on another machine. It doesn't have to be updated live - so I think I can cache most of the data in memory and dump it to the DB every 10-20 seconds.
Should I write the server using C# (which is more compatible with SQL Server)? Or maybe Python with Tornado? What should my considerations be when writing a high-performance server?
EDIT: (added more info)
The Application is a game server.
I don't really know the actual traffic - but this is the prognosis and the server should support it and scale well.
It's hosted "in the cloud" in a Datacenter.
Language doesn't really matter. Performance does. (a Web service can be exposed on the SQL Server to allow other languages than .NET)
The connections are very frequent but small (very little data is returned and little computations are necessary).
It should hold most of the data in the memory for fastest performance.
Any thoughts will be much appreciated :)
Thanks
Okay, if you REALLY need high performance, don't go for C# but for C/C++; it's obvious.
In any case, the fastest way to do server programming (as far as I know) is to use IOCP (I/O Completion Ports). That's what I used when I made an MMORPG server emulator, and it performed faster than the official C++ select-based servers.
Here's a very complete introduction to IOCP in C#:
http://www.codeproject.com/KB/IP/socketasynceventargs.aspx
Good luck!
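To give a feel for the API, here is a minimal sketch of an asynchronous accept loop built on SocketAsyncEventArgs (which rides on IOCP on Windows). The class name and port are made up, and buffer pooling, receive logic, and error handling are omitted:

using System;
using System.Net;
using System.Net.Sockets;

class IocpServer
{
    private readonly Socket _listener;

    public IocpServer(int port)
    {
        _listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        _listener.Bind(new IPEndPoint(IPAddress.Any, port));
        _listener.Listen(100);
    }

    public void Start()
    {
        var acceptArgs = new SocketAsyncEventArgs();
        acceptArgs.Completed += (sender, e) => ProcessAccept(e);
        StartAccept(acceptArgs);
    }

    private void StartAccept(SocketAsyncEventArgs e)
    {
        e.AcceptSocket = null;            // the args object is reused between accepts
        if (!_listener.AcceptAsync(e))    // false means it completed synchronously
            ProcessAccept(e);
    }

    private void ProcessAccept(SocketAsyncEventArgs e)
    {
        // e.AcceptSocket is the connected client; hand it to a pooled
        // SocketAsyncEventArgs and call ReceiveAsync on it here.
        StartAccept(e);                   // immediately post the next accept
    }

    static void Main()
    {
        new IocpServer(9000).Start();
        Console.ReadLine();               // keep the process alive; accepts complete via IOCP
    }
}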
Use the programming language that you know the most. It's a lot more expensive to hunt down performance issues in a large application that you do not fully understand.
It's a lot cheaper to buy more hardware.
People will say C++, because garbage collection in .NET could kill your latency. You could avoid garbage collection, though, if you were clever, by reusing existing managed objects.
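For illustration, a trivial object pool along those lines (a sketch, not production code; the buffer size is arbitrary):

using System.Collections.Concurrent;

class BufferPool
{
    private readonly ConcurrentBag<byte[]> _pool = new ConcurrentBag<byte[]>();

    public byte[] Rent()
    {
        byte[] buffer;
        // Hand out a recycled buffer when one is available, allocate otherwise.
        return _pool.TryTake(out buffer) ? buffer : new byte[4096];
    }

    public void Return(byte[] buffer)
    {
        _pool.Add(buffer);   // recycled instead of becoming garbage
    }
}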
Edit: your assumption about SQL Server is probably wrong. You need to store your state in memory for random access. If you need to persist changes, journal them to the filesystem and consolidate them with the database infrequently.
Edit 2: You will have a lot of different threads talking to the same data. To avoid blocking and deadlocks, learn about lock-free programming (Interlocked.CompareExchange etc.).
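A minimal example of the Interlocked.CompareExchange pattern (illustrative only; the field and method names are made up):

using System.Threading;

static class LockFreeCounter
{
    private static int _score;

    public static void AddScore(int points)
    {
        int current, desired;
        do
        {
            current = _score;              // snapshot the current value
            desired = current + points;    // compute the new value
        }
        // Publish only if no other thread changed _score in the meantime; retry otherwise.
        while (Interlocked.CompareExchange(ref _score, desired, current) != current);
    }
}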
I was part of a project that included very high-performance server code, which could respond with a TCP packet within 12 milliseconds or so.
We used C# and I must agree with jgauffin - a language that you know is much more important than just about anything.
Two tips:
Writing to the console (especially in color) can really slow things down.
If it's important for the server to be fast on the very first requests, you might want to use a pre-JIT compiler to avoid JIT compilation during those requests. See Ngen.exe (for example, running "ngen install YourServer.exe" after deployment, where YourServer.exe stands in for your actual assembly).
From a performance standpoint, is it more efficient to read large amounts of data from an XML file or to loop through an array?
I have around 2,000 datasets I need to loop through and do calculations with, so I'm just wondering if it would be better to import all XML data and process it as an array (single large import) or to import each dataset sequentially (many small imports).
Thoughts and suggestions?
If I have interpreted your question correctly, you need to load 2,000 sets of data from one file, and then process them all. So you have to read all the data and process all the data. At a basic level there is the same amount of work to do.
So I think the question is "How can I finish the same processing earlier?"
Consider:
How much memory will the data use? If it's going to be more than 1.5GB of RAM, then you will not be able to process it in a single pass on a 32-bit PC, and even on 64-bit PCs you're likely to see virtual memory paging killing performance. In either of these cases, streaming the data in smaller chunks is a necessity.
Conversely if the data is small (e.g. 2000 records might only be 200kB for all I know), then you may get better I/O performance by reading it in one chunk, or it will load so fast compared to the processing time that there is no point trying to optimise it.
Are the records independent? (so they don't need to be processed in a particular order, and you don't need one record present in memory in order to process another one) If so, and if the loading time is significant overall, then the "best" approach may be to parallelise the operation - If you can process some data while you are loading more data in the background, you will utilise the hardware better and do the same work in less time. So you probably want to consider splitting your loading and processing onto different threads.
But spreading the processing onto many threads might not help you if loading takes much longer than processing, as your processing threads may be starved of data while waiting for I/O - so using 1 processing thread may be just as fast as using 3 or 7. And there's no point in creating more threads than you have available CPU cores. If going multithreaded, I'd write it to use a configurable/dynamic number of threads and then do some testing to determine what the optimum approach will be.
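To sketch that load-in-the-background idea (with a placeholder loader standing in for the real XML reading, and console output standing in for the real calculations):

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

class PipelineDemo
{
    // Placeholder for however you stream records out of the XML file.
    static IEnumerable<string> LoadRecords()
    {
        for (int i = 0; i < 2000; i++) yield return "record " + i;
    }

    static void Main()
    {
        // Bounded queue: the loader blocks if it gets too far ahead of the workers.
        var queue = new BlockingCollection<string>(boundedCapacity: 100);

        var loader = Task.Factory.StartNew(() =>
        {
            foreach (var record in LoadRecords())
                queue.Add(record);
            queue.CompleteAdding();        // tell consumers no more data is coming
        });

        var worker = Task.Factory.StartNew(() =>
        {
            foreach (var record in queue.GetConsumingEnumerable())
                Console.WriteLine("processed " + record);   // stand-in for real work
        });

        Task.WaitAll(loader, worker);
    }
}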
But before you consider all of that, you might want to consider writing a brute force approach and see what the performance is like. Do you even need to optimise it?
And if the answer is "yes, I desperately need to optimise it", then can you reconsider the data format? XML is a very useful but grossly inefficient format. If you have a performance critical case, is there anything you can do to reduce the XML size (e.g. simply using shorter element names can make a massive difference on large files), or even use a much more compact and easily read binary format?
I'm currently writing an application that makes a huge number of calls to slow web services (I had no say in that design) that produce little output.
I'd like to make around 100 parallel calls (I know true parallelism can only go as far as the number of cores you have).
But I was wondering if there were performance differences between the different approaches.
I'm hesitating between:
Using Task.Factory.StartNew in a loop.
Using Parallel.For.
Using BackgroundWorker.
Using AsyncCallback.
...Others?
My main goal is to start as many web service calls as quickly as possible.
How should I proceed?
From a performance standpoint it's unlikely to matter. As you yourself have described, the bottleneck in your program is the network call to a slow-performing web service. Any differences in how long it takes you to spin up new threads or manage them will be completely overshadowed by the network interaction.
You should use the model/framework that you are most comfortable with and that will most effectively allow you to write code you know is correct. It's also important to note that you don't actually need to use multiple threads on your machine at all: you can send a number of asynchronous requests to the web service from the same thread, and even handle all of the callbacks on that same thread. Parallelizing the sending of requests is unlikely to have any meaningful performance impact. Because of this you don't really need any of the frameworks you have described, although the Task Parallel Library is highly effective at managing asynchronous operations even when those operations don't represent work on another thread. You don't need it, but it's certainly capable of helping.
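For illustration, here is a sketch that starts 100 asynchronous requests from a single thread by wrapping the Begin/End pattern in Tasks; the URL is a made-up placeholder for the real web service:

using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

class AsyncCalls
{
    static void Main()
    {
        var tasks = Enumerable.Range(0, 100).Select(i =>
        {
            // Hypothetical endpoint; substitute your real web service URL.
            var request = WebRequest.Create("http://example.com/service?id=" + i);
            return Task.Factory
                       .FromAsync<WebResponse>(request.BeginGetResponse,
                                               request.EndGetResponse, null)
                       .ContinueWith(t =>
                       {
                           using (var response = t.Result)
                           using (var reader = new StreamReader(response.GetResponseStream()))
                               return reader.ReadToEnd();   // small payload per call
                       });
        }).ToArray();

        Task.WaitAll(tasks);   // all 100 calls were in flight concurrently
        Console.WriteLine("Done: " + tasks.Length + " responses received.");
    }
}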
Following your advice, I switched to async calls (with I/O events) where I was previously using the TPL.
Async really does outperform Sync + Task usage.
I can now launch 100 requests (almost?) at the same time, and if the longest-running one takes 5 seconds, the whole process only lasts 7 seconds, whereas with Sync + TPL it took around 70 seconds.
In conclusion, the (auto-generated) async methods are really the way to go when consuming a lot of web services.
Thanks to you all.
Oh, and by the way, this would not be possible without raising the connection limit in the app.config (inside <system.net>):
<system.net>
  <connectionManagement>
    <add address="*" maxconnection="100" />
  </connectionManagement>
</system.net>
I have a method which calls a stored procedure 300 times in a for loop, and each time the stored procedure returns 1200 records. How can I improve this? I cannot eliminate the 300 calls, but are there any other ways I can try? I am using a REST service implemented through ASP.NET, with IBATIS for database connectivity.
I cannot eliminate the 300 calls
Eliminate the 300 calls.
Even if all you can do is to just add another stored procedure which calls the original stored procedure 300 times, aggregating the results, you should see a massive performance gain.
Even better if you can write a new stored procedure that replicates the original functionality but is structured more appropriately for your specific use case, and call that, once, instead.
Making 300 round trips between your code and your database quite simply is going to take time, even where the code and the database are on the same system.
Once this horrible bit is resolved, there will be other things you can look at optimising, if required.
Measure.
Measure the amount of time spent inside the server-side code. Measure the amount of that time that is spent in the stored procedure. Measure the amount of time spent at the client part. Do some math, and you have a rough estimate for network time and other overheads.
Returning 1200 records, I would expect network bandwidth to be one of the main issues; you could perhaps investigate whether a different serialization engine (with the same output type) might help, or perhaps whether adding compression (gzip / deflate) support would be beneficial (meaning: reduced bandwidth being more important than the increased CPU required).
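As a sketch of the client's side of that compression idea (assuming the server is configured to emit gzip/deflate; the URL is made up):

using System.IO;
using System.Net;

static class CompressedClient
{
    static string Fetch(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Sends Accept-Encoding and transparently decompresses the response,
        // trading a little CPU for much less bandwidth on a 1200-record payload.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        using (var response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            return reader.ReadToEnd();
    }
}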
Latency might be important if you are calling the REST service 300 times; maybe you can parallelize slightly, or make fewer big calls rather than lots of small calls.
You could batch the SQL code so you only make a few trips to the DB (calling the SP repeatedly in each batch) - that is perfectly possible; just use EXEC etc. (still using parameterization).
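For example, a minimal sketch of that batching idea; the procedure name GetRecords and its single parameter are hypothetical stand-ins for the real stored procedure:

using System.Data.SqlClient;
using System.Text;

static class BatchedCalls
{
    // Builds one command that runs the procedure many times in a single round trip.
    static SqlCommand BuildBatch(SqlConnection connection, int[] ids)
    {
        var sql = new StringBuilder();
        var command = new SqlCommand { Connection = connection };
        for (int i = 0; i < ids.Length; i++)
        {
            sql.AppendLine("EXEC GetRecords @id" + i + ";");
            command.Parameters.AddWithValue("@id" + i, ids[i]);
        }
        command.CommandText = sql.ToString();
        return command;
    }
}

Executing that command once returns multiple result sets; iterate over them with SqlDataReader.NextResult().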
You could look at how you are getting the data from ADO.NET to the REST layer. You mention IBATIS, but have you checked whether this is fast / slow compared to, say, "dapper" ?
Finally, the SP performance itself can be investigated; indexing or just a re-structuring of the SP's SQL may help.
Well, if you have to return 360,000 records, you have to return 360,000 records. But do you really need to return 360,000 records? Start there and work your way down.
Without knowing too much of the details, the architecture appears flawed. On one hand it's considered unreasonable to lock the tables for the 6 seconds it takes to retrieve the 360,000 records using a single stored procedure execution, but on the other it's fine to return a possibly inconsistent set of 360,000 records retrieved via multiple executions. It makes me wonder what exactly you are trying to implement and whether there is a better way to design the integration between the client and the server.
For instance, if the client is retrieving a set of records that have been created since the last request, then maybe a paged ATOM feed would be more appropriate.
Whatever it is you are doing, 360,000 records is a lot of data to move between the server and the client, and we should be looking at the architecture and purpose of that data transfer to make sure the current approach is appropriate.
I have an aspx website with many pages which accept user input, pull data from a SQL database, do some heavy data processing then finally present the data to the user.
The site is getting bigger and bigger and it is starting to put a lot of stress on the server.
What I want to do is to maybe separate things a bit:
Server-A will host the website, the site will accept input from users and pass those parameters to applications running on Server-B
Server-B will fetch data from SQL, do the heavy data processing, then pass a dataset or datatable object back to the website.
Is this possible?
Sure, this is called an N-Tier Architecture.
The most obvious thing to separate is one database server, tuned to meet the demands of a database (fast disks, lots of RAM) and one or more separate web servers.
You can expand on that by placing an application tier between the web server and the database server. The application tier can accept the user input that was collected in the web tier, interact with the database, do the heavy crunching, and return the result to the web tier. Most typically, you would use Windows Communication Foundation (WCF) to expose the functionality of the application tier to the web server(s). Application servers might often be tuned to have very fast CPU's and might have slower disks and possibly less memory than database servers, depending on exactly what they need to do. The beauty of this solution is that you can just add more and more identical application servers as the load on your application grows.
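As an illustrative sketch of such an application-tier WCF service (the contract, DTOs, and host address are made up for the example):

using System;
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract] public class InputDto  { [DataMember] public string Query { get; set; } }
[DataContract] public class ResultDto { [DataMember] public string Summary { get; set; } }

[ServiceContract]
public interface IProcessingService
{
    [OperationContract]
    ResultDto ProcessData(InputDto input);   // heavy crunching happens on Server-B
}

public class ProcessingService : IProcessingService
{
    public ResultDto ProcessData(InputDto input)
    {
        // ... fetch from SQL, do the heavy processing, return a small result ...
        return new ResultDto { Summary = "done" };
    }
}

// Hosted on Server-B, e.g.:
// using (var host = new ServiceHost(typeof(ProcessingService), new Uri("http://server-b:8000")))
// { host.Open(); Console.ReadLine(); }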
Depending on the business model, you need some caching strategy to avoid doing the heavy calculation for every input.
Consider a stock website. Although there are many transactions each minute, the market trend isn't recalculated for each of them. Updates can be scheduled based on something like defined intervals (hourly, daily...), a defined number of transactions, etc.
The heavy task should be done when the server load is low. This way visitors see a stock trend on the main page that is accurate enough.
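To sketch that caching idea (the names, interval, and trend type are made up; MemoryCache needs a reference to System.Runtime.Caching.dll):

using System;
using System.Runtime.Caching;

class MarketTrend { }   // placeholder for whatever the heavy calculation produces

static class TrendCache
{
    public static MarketTrend GetTrend()
    {
        var cache = MemoryCache.Default;
        var trend = (MarketTrend)cache.Get("market-trend");
        if (trend == null)
        {
            trend = ComputeTrend();   // the expensive part runs at most once per hour
            cache.Set("market-trend", trend, new CacheItemPolicy
            {
                AbsoluteExpiration = DateTimeOffset.Now.AddHours(1)
            });
        }
        return trend;
    }

    static MarketTrend ComputeTrend()
    {
        // ... heavy aggregation over the recent data ...
        return new MarketTrend();
    }
}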
For such heavy-load scenarios, good design is everything, since sometimes even expensive hardware will not help much.
If you like, share some info about what is being done.
I would use a load balancer like an F5; that way your architecture does not change. But I would also use the n-tier approach to split your site into a data layer and a presentation layer. The load balancer will then direct each request to the server with the lightest load.
I have a turn-based game in which two players may play against each other.
It's written in C# and uses XNA 4.0.
Currently multiplayer is implemented with TCP/IP. It works pretty nicely, but only if the players are within the same network and one of them knows the IP of the other.
So the question is: how should I implement the online play for this game? Is TCP a reasonable method for connecting two random players from opposite sides of the world without them having to deal with IP addresses and ports (or any other such technical details)?
To make this problem more challenging, I have no server for hosting the game matching service. (Well, I have an access to a virtual web server which I could use for sharing the IPs.)
To list questions:
Does .NET offer a better choice of communication method than TCP?
What would be the best way to deal with NATs in my case?
Is there a cheap way of getting my own server and run the TCP game matching service there?
TCP vs UDP.
TCP is a bit slower than UDP but more failsafe.
Since your game is turn-based, it will probably send minimal amounts of data between the client and server, and it is not really latency dependent, so I would say you might as well go for TCP.
To make this problem more challenging, I have no server for hosting the game matching service. (Well, I have an access to a virtual web server which I could use for sharing the IPs.)
If you are going to provide your players with a server browser or similar you will need to have a centralized server, a web server with a script/application built for this would do just fine.
Is there a cheap way of getting my own server and run the TCP game matching service there?
A web server or similar host would do just fine and is usually cheap. What you want is:
A function for a server to add itself to the server list.
A function for a client to retrieve the servers on the list.
Doing web requests with C# is no problem at all; the requests could look something like:
http://www.example.com/addToServerList.php?name=MyEpicServer&ip=213.0.0.ABC (adds this server to the list)
http://www.example.com/getOnlineServers.php (returns list of all the servers)
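For illustration, calling those (hypothetical) endpoints from C# could look like this sketch; the response format is assumed to be one entry per line:

using System;
using System.Net;

class MatchmakingClient
{
    const string Host = "http://www.example.com";

    public static void RegisterServer(string name, string ip)
    {
        using (var web = new WebClient())
            web.DownloadString(Host + "/addToServerList.php?name=" + name + "&ip=" + ip);
    }

    public static string[] GetOnlineServers()
    {
        using (var web = new WebClient())
            // Assumes one "name,ip" entry per line; parsing depends on your script's output.
            return web.DownloadString(Host + "/getOnlineServers.php")
                      .Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
    }
}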
You need to specify what kind of load and latency is expected and tolerated.
The general answer is:
For real-time games - UDP.
For Scrabble-like games - TCP.
Use your server to share IPs, as you said.
Minecraft uses TCP. It's good for traffic that must be transmitted and received AND can be queued a little.
UDP has only one-way error checking: the receiving side checks each packet and simply drops bad ones. This was needed with older, slower Ethernet technology, where a round trip to verify packets was too slow.
TCP is a very reliable protocol with a handshake, so the sending side knows whether the data was transmitted successfully. But due to the round trips, it puts a lot more overhead and lag on the transmission.
TCP also delivers packets in order, which UDP does not.
In some games losing packets does not matter (for example "streaming" data where objects move around and will simply be updated by the next round of packets anyway). There you can use UDP. But if it is critical that all the data arrives, go with TCP; otherwise you will spend a lot of time writing code to make sure that all the data is transmitted successfully.
Networks are quick enough nowadays, and with the internet built on TCP/IP, I recommend TCP, unless you really need very low-latency traffic.
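To illustrate the practical difference, a small sketch (host, port, and payloads are placeholders):

using System.Net.Sockets;

static class Senders
{
    // TCP: connected, ordered, acknowledged stream - right for moves that must arrive.
    static void SendTcp(string host, int port, byte[] move)
    {
        using (var client = new TcpClient(host, port))   // handshake happens here
        using (var stream = client.GetStream())
            stream.Write(move, 0, move.Length);
    }

    // UDP: connectionless fire-and-forget - the datagram may be lost or reordered.
    static void SendUdp(string host, int port, byte[] position)
    {
        using (var client = new UdpClient())
            client.Send(position, position.Length, host, port);
    }
}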
This website gives a good summary:
http://www.diffen.com/difference/TCP_vs_UDP
NAT: should not be a problem as long as your Time To Live (TTL) is big enough. Each router hop along the way (including NAT devices) decrements the TTL by one; when it reaches 0, the packet gets dropped.