MSMQ - sending messages to transactional queue became very slow

We have an intranet ASP.NET Web Forms application that performs the actual work via a service layer, implemented as .NET Remoting services. A couple of weeks ago we started getting timeout exceptions from IIS; it turned out that requests from the front end were not being processed within the default execution timeout (110 seconds for .NET 2.0+). During the investigation we found that the problem lies in sending messages to a transactional MSMQ queue (running on a W2K3 x64 server). We send the messages as follows: we fetch from the DB all the records we want to push to the queue, and then push every single record in a separate MSMQ transaction inside a foreach loop, like this:
using (MessageQueue queue = new MessageQueue(@".\private$\OurQueue"))
{
using (MessageQueueTransaction tran = new MessageQueueTransaction())
{
tran.Begin();
Message msg = new Message(BODY); // BODY is some class which holds few fields of type Guid, String and DateTime
msg.Label = "Some label for the message";
msg.UseDeadLetterQueue = true;
msg.TimeToBeReceived = new TimeSpan(7, 0, 0, 0);
msg.Priority = MessagePriority.Normal;
queue.Send(msg, tran);
tran.Commit();
queue.Close();
return msg.Id;
}
}
The number of messages to be sent can reach 30,000-100,000, which is why performance is crucial.
Experiments with the Stopwatch class showed that if you change the logic and send all messages in ONE transaction, like this:
using (MessageQueue queue = new MessageQueue(@".\private$\OurQueue"))
{
using (MessageQueueTransaction tran = new MessageQueueTransaction())
{
tran.Begin();
for (int i = 0; i < recordsNumber; i++)
{
Message msg = new Message(BODY); // BODY is some class which holds few fields of type Guid, String and DateTime
msg.Label = "Some label for the message";
msg.UseDeadLetterQueue = true;
msg.TimeToBeReceived = new TimeSpan(7, 0, 0, 0);
msg.Priority = MessagePriority.Normal;
queue.Send(msg, tran);
}
tran.Commit();
queue.Close();
}
}
the performance is much better: with separate transactions, pushing 30K records to MSMQ takes ~115 sec, while with one transaction it takes only ~16 sec. This is all guesswork to some extent, since the profiling and performance estimation were done on my development machine (I don't have access to the production servers), but it still makes me wonder: is this a correct comparison, and is it a good idea to send several tens of thousands of records to MSMQ in one transaction?
However, the main question I'm still looking for an answer to is this: how could it have worked for the last 7 years or so with separate MSMQ transactions without ever timing out, and then all of a sudden become so slow? Apart from moving from .NET 3.5 to 4.0 (not to 4.5, unfortunately, as we're still using old W2K3 servers), we didn't change anything in the code, and the farm admin claims no changes were made recently to the servers themselves (DB x 1, service with Remoting x 2, front end x 2).
Profiling on my dev machine showed that most of the work appears to be done in queue.Send, but that shouldn't cause the code to time out, since the call to Send is asynchronous by design, i.e. it should return to the caller immediately (as pointed out in the documentation). The code that sends messages to MSMQ runs inside the "main" unit of work, which is a DB transaction, but I doubt that the MSMQ transaction got promoted to MSDTC; I guess in that case it would always have been slow.
Does anybody know what I could miss here?
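One middle ground between the two variants above is to commit every N messages, so that neither the per-message commit cost nor a single very large transaction dominates. The following is only a sketch under assumptions: the helper name Chunk, the Record type and the batch size of 1000 are illustrative and not part of the original code, and the MSMQ calls in the trailing comment are the same ones already used in the question.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Splits a sequence into consecutive chunks of at most `size` items.
    // (Hypothetical helper; the name and signature are illustrative.)
    public static IEnumerable<List<T>> Chunk<T>(IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (size <= 0) throw new ArgumentOutOfRangeException("size");

        var chunk = new List<T>(size);
        foreach (T item in source)
        {
            chunk.Add(item);
            if (chunk.Count == size)
            {
                yield return chunk;
                chunk = new List<T>(size);
            }
        }
        // Emit the final, possibly shorter, chunk.
        if (chunk.Count > 0)
            yield return chunk;
    }
}

// Possible use with the queue from the question (not compiled here;
// `records` and `Record` are assumed names):
//
// foreach (List<Record> chunk in Batching.Chunk(records, 1000))
// {
//     using (MessageQueueTransaction tran = new MessageQueueTransaction())
//     {
//         tran.Begin();
//         foreach (Record r in chunk)
//             queue.Send(new Message(r), tran);
//         tran.Commit(); // one commit per chunk instead of one per message
//     }
// }
```

Whether 1000 is a sensible chunk size would have to be measured on the production servers; the sketch only shows the shape of the loop.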

Related

Receiving MSMQ Transactions As a Group

I'm currently having to send files over the size of 4MB through several servers using MSMQ. The files are initially sent in chunks, like so:
using (MessageQueueTransaction oTransaction = new MessageQueueTransaction())
{
// Begin the transaction
oTransaction.Begin();
// Start reading the file
using (FileStream oFile = File.OpenRead(PhysicalPath))
{
// Bytes read
int iBytesRead;
// Buffer for the file itself
var bBuffer = new byte[iMaxChunkSize];
// Read the file, a block at a time
while ((iBytesRead = oFile.Read(bBuffer, 0, bBuffer.Length)) > 0)
{
// Get the right length
byte[] bBody = new byte[iBytesRead];
Array.Copy(bBuffer, bBody, iBytesRead);
// New message
System.Messaging.Message oMessage = new System.Messaging.Message();
// Set the label
oMessage.Label = "TEST";
// Set the body
oMessage.BodyStream = new MemoryStream(bBody);
// Log
iByteCount = iByteCount + bBody.Length;
Log("Sending data (" + iByteCount + " bytes sent)", EventLogEntryType.Information);
// Transactional?
oQueue.Send(oMessage, oTransaction);
}
}
// Commit
oTransaction.Commit();
}
These messages are sent from Machine A to Machine B, and then forwarded to Machine C. However, I've noticed that the PeekCompleted event on Machine B is triggered before all messages are sent.
For example, a test run just now showed 8 messages sent, which were processed on Machine B in groups of 1, 1 and then 6.
I presume this is due to the transactional part ensuring the messages arrive in exactly the right order, but not guaranteeing they are all collected at exactly the same time.
The worry I have is that when Machine B passes the messages to Machine C, these now count as 3 separate transactions, and I'm unsure as to whether the transactions themselves are delivered in the correct order (for example, 1 then 6 then 1).
My question is, is it possible to receive messages using PeekCompleted by transaction (meaning, all 8 messages are collected first), and pass them on so Machine C gets all 8 messages together? Even in a system where multiple transactions are being sent at the same time?
Or are the transactions themselves guaranteed to arrive in the correct order?

I think I missed this when looking at the topic:
https://msdn.microsoft.com/en-us/library/ms811055.aspx
That these messages will either be sent together, in the order they
were sent, or not at all. In addition, consecutive transactions
initiated from the same machine to the same queue will arrive in the
order they were committed relative to each other. Moreover
So, no matter how diluted the transactions get, the order will never be affected.

Odd Behavior of Azure Service Bus ReceiveBatch()

I'm currently working with an Azure Service Bus Topic and running into an issue receiving my messages using the ReceiveBatch method. The results I expect are not the results I am getting. Here is the basic code setup; the use cases are below:
SubscriptionClient client = SubscriptionClient.CreateFromConnectionString(connectionString, convoTopic, subName);
IEnumerable<BrokeredMessage> messageList = client.ReceiveBatch(100);
foreach (BrokeredMessage message in messageList)
{
try
{
Console.WriteLine(message.GetBody<string>() + message.MessageId);
message.Complete();
}
catch (Exception ex)
{
message.Abandon();
}
}
client.Close();
MessageBox.Show("Done");
1. Using the above code, if I send 4 messages and then poll, on the first run I get only the first message. On the second run I get the other 3. I expect to get all 4 at once. It always seems to return a single message on the first poll and the rest on subsequent polls (the same happens with 3 and 5 messages: 1 message on the first try and the remaining n-1 on the second).
2. If I have 0 messages to receive, the operation takes roughly 30-60 seconds to return the messageList (which has a count of 0). I need this to return instantly.
3. If I change the code to IEnumerable<BrokeredMessage> messageList = client.ReceiveBatch(100, new TimeSpan(0,0,0)); then issue #2 goes away, but issue #1 still persists: I have to call the code twice to get all the messages.
I'm assuming that issue #2 is caused by a default timeout value, which I override in #3 (though I find it confusing that if a message is there, it responds immediately without waiting the default time). I am not sure, however, why I never receive the full number of messages in a single ReceiveBatch.

The way I got ReceiveBatch() to work properly was to do two things:
1. Disable Partitioning on the topic (I had to create a new topic for this, because you can't toggle it after creation).
2. Enable batching on each subscription created, like so:
SubscriptionDescription sd = new SubscriptionDescription(topicName, orgSubName);
sd.EnableBatchedOperations = true;
After I did those two things, I was able to get the topics to work as intended using IEnumerable<BrokeredMessage> messageList = client.ReceiveBatch(100, new TimeSpan(0,0,0));
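Independently of those settings, ReceiveBatch appears to treat the requested count as an upper bound rather than a guarantee, which matches the behavior described in the question. A caller that needs everything currently available can therefore loop until a call comes back empty. The sketch below is generic so that it stands on its own: BatchReceiver and Drain are hypothetical names, and the delegate stands in for client.ReceiveBatch(int, TimeSpan).

```csharp
using System;
using System.Collections.Generic;

public static class BatchReceiver
{
    // Calls `receiveBatch` repeatedly until it returns no messages
    // or `maxMessages` have been collected.
    public static List<T> Drain<T>(
        Func<int, TimeSpan, IEnumerable<T>> receiveBatch,
        int batchSize,
        TimeSpan serverWaitTime,
        int maxMessages)
    {
        var all = new List<T>();
        while (all.Count < maxMessages)
        {
            int before = all.Count;
            int wanted = Math.Min(batchSize, maxMessages - all.Count);
            IEnumerable<T> batch = receiveBatch(wanted, serverWaitTime);
            if (batch != null)
                all.AddRange(batch);
            if (all.Count == before)
                break; // an empty batch: nothing more is available right now
        }
        return all;
    }
}

// Possible use with the client from the question (not compiled here):
//
// List<BrokeredMessage> messages = BatchReceiver.Drain<BrokeredMessage>(
//     (count, wait) => client.ReceiveBatch(count, wait),
//     100, TimeSpan.FromSeconds(1), 1000);
```

A short non-zero wait time is used in the usage comment only as an example; with a zero wait the loop ends as soon as one call returns nothing.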

I'm having a similar problem with an ASB Queue. I discovered that I could mitigate it somewhat by increasing the PrefetchCount on the client prior to receiving the batch:
SubscriptionClient client = SubscriptionClient.CreateFromConnectionString(connectionString, convoTopic, subName);
client.PrefetchCount = 100;
IEnumerable<BrokeredMessage> messageList = client.ReceiveBatch(100);
From the Azure Service Bus Best Practices for Performance Improvements Using Service Bus Brokered Messaging:
Prefetching enables the queue or subscription client to load additional messages from the service when it performs a receive operation.
...
When using the default lock expiration of 60 seconds, a good value for
SubscriptionClient.PrefetchCount is 20 times the maximum processing rates of all receivers of the factory. For example, a factory creates 3 receivers, and each receiver can process up to 10 messages per second. The prefetch count should not exceed 20*3*10 = 600.
...
Prefetching messages increases the overall throughput for a queue or subscription because it reduces the overall number of message operations, or round trips. Fetching the first message, however, will take longer (due to the increased message size). Receiving prefetched messages will be faster because these messages have already been downloaded by the client.

Just a few more pieces to the puzzle. I still couldn't get it to work even after Enable Batching and Disable Partitioning - I still had to do two ReceiveBatch calls. I did find however:
Restarting the Service Bus services (I am using Service Bus for Windows Server) cleared up the issue for me.
Doing a single ReceiveBatch and taking no action (letting the message locks expire) and then doing another ReceiveBatch caused all of the messages to come through at the same time. (Doing an initial ReceiveBatch and calling Abandon on all of the messages didn't cause that behavior.)
So it appears to be some sort of corruption/bug in Service Bus's in-memory cache.

C# best way to implement TCP Client Server Application

I want to extend my experience with the .NET framework and want to build a client/server application.
Actually, the client/server is a small Point Of Sale system but first, I want to focus on the communication between server and client.
In the future, I want to make it a WPF application but for now, I simply started with a console application.
2 functionalities:
client(s) receive(s) a dataset and every 15/30min an update with changed prices/new products
(So the code will be in a Async method with a Thread.sleep for 15/30 mins).
when closing the client application, sending a kind of a report (for example, an xml)
On the internet I found lots of examples, but I can't decide which is the best/safest/most performant approach, so I need some advice on which techniques I should implement.
CLIENT/SERVER
I want 1 server application that handles at most 6 clients. I read that threads use a lot of memory, and that tasks with async/await functionality may be a better way.
Example with ASYNC/AWAIT
http://bsmadhu.wordpress.com/2012/09/29/simplify-asynchronous-programming-with-c-5-asyncawait/
Example with THREADS
mikeadev.net/2012/07/multi-threaded-tcp-server-in-csharp/
Example with SOCKETS
codereview.stackexchange.com/questions/5306/tcp-socket-server
This seems to be a great example of sockets; however, the revised code doesn't fully work because not all the classes are included.
msdn.microsoft.com/en-us/library/fx6588te(v=vs.110).aspx
This MSDN example does a lot more with buffer sizes and a signal for the end of a message. I don't know if this is just an "old way" of doing it, because in my previous examples they simply send a string from the client to the server and that's it.
.NET FRAMEWORK REMOTING/ WCF
I also found something about .NET Remoting and WCF, but I don't know if I need either of them, because I think the async/await example isn't bad.
SERIALIZED OBJECTS / DATASET / XML
What is the best way to send data between them? Just an XML serializer, or binary?
Example with Dataset -> XML
stackoverflow.com/questions/8384014/convert-dataset-to-xml
Example with Remoting
akadia.com/services/dotnet_dataset_remoting.html
If I should use the async/await approach, is it right to do something like this in the server application:
while(true)
{
string input = Console.ReadLine();
if(input == "products")
SendProductToClients(port);
if(input == "rapport")
{
string Example = Console.ReadLine();
}
}

Here are several things anyone writing a client/server application should consider:
Application layer packets may span multiple TCP packets.
Multiple application layer packets may be contained within a single TCP packet.
Encryption.
Authentication.
Lost and unresponsive clients.
Data serialization format.
Thread based or asynchronous socket readers.
Retrieving packets properly requires a wrapper protocol around your data. The protocol can be very simple. For example, it may be as simple as an integer that specifies the payload length. The snippet I have provided below was taken directly from the open source client/server application framework project DotNetOpenServer available on GitHub. Note this code is used by both the client and the server:
private byte[] buffer = new byte[8192];
private int payloadLength;
private int payloadPosition;
private MemoryStream packet = new MemoryStream();
private PacketReadTypes readState;
private Stream stream;
private void ReadCallback(IAsyncResult ar)
{
try
{
int available = stream.EndRead(ar);
int position = 0;
while (available > 0)
{
int lengthToRead;
if (readState == PacketReadTypes.Header)
{
lengthToRead = (int)packet.Position + available >= SessionLayerProtocol.HEADER_LENGTH ?
SessionLayerProtocol.HEADER_LENGTH - (int)packet.Position :
available;
packet.Write(buffer, position, lengthToRead);
position += lengthToRead;
available -= lengthToRead;
if (packet.Position >= SessionLayerProtocol.HEADER_LENGTH)
readState = PacketReadTypes.HeaderComplete;
}
if (readState == PacketReadTypes.HeaderComplete)
{
packet.Seek(0, SeekOrigin.Begin);
BinaryReader br = new BinaryReader(packet, Encoding.UTF8);
ushort protocolId = br.ReadUInt16();
if (protocolId != SessionLayerProtocol.PROTOCAL_IDENTIFIER)
throw new Exception(ErrorTypes.INVALID_PROTOCOL);
payloadLength = br.ReadInt32();
readState = PacketReadTypes.Payload;
}
if (readState == PacketReadTypes.Payload)
{
lengthToRead = available >= payloadLength - payloadPosition ?
payloadLength - payloadPosition :
available;
packet.Write(buffer, position, lengthToRead);
position += lengthToRead;
available -= lengthToRead;
payloadPosition += lengthToRead;
if (packet.Position >= SessionLayerProtocol.HEADER_LENGTH + payloadLength)
{
if (Logger.LogPackets)
Log(Level.Debug, "RECV: " + ToHexString(packet.ToArray(), 0, (int)packet.Length));
MemoryStream handlerMS = new MemoryStream(packet.ToArray());
handlerMS.Seek(SessionLayerProtocol.HEADER_LENGTH, SeekOrigin.Begin);
BinaryReader br = new BinaryReader(handlerMS, Encoding.UTF8);
if (!ThreadPool.QueueUserWorkItem(OnPacketReceivedThreadPoolCallback, br))
throw new Exception(ErrorTypes.NO_MORE_THREADS_AVAILABLE);
Reset();
}
}
}
stream.BeginRead(buffer, 0, buffer.Length, new AsyncCallback(ReadCallback), null);
}
catch (ObjectDisposedException)
{
Close();
}
catch (Exception ex)
{
ConnectionLost(ex);
}
}
private void Reset()
{
readState = PacketReadTypes.Header;
packet = new MemoryStream();
payloadLength = 0;
payloadPosition = 0;
}
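The reader above expects a header made of a 2-byte protocol identifier followed by a 4-byte payload length. To make that wrapper protocol concrete, here is a minimal blocking sketch of both sides of such framing. It is not taken from DotNetOpenServer: the class name Framing, the ProtocolId value and the 6-byte header constant are assumptions chosen to mirror the fields read in the snippet.

```csharp
using System;
using System.IO;

public static class Framing
{
    // Hypothetical values for illustration only.
    public const ushort ProtocolId = 0xC0DE;
    public const int HeaderLength = sizeof(ushort) + sizeof(int); // 2 + 4 bytes

    // Writes: [protocol id][payload length][payload bytes].
    public static void WritePacket(Stream stream, byte[] payload)
    {
        // The writer is deliberately not disposed, so the stream stays open.
        var writer = new BinaryWriter(stream);
        writer.Write(ProtocolId);
        writer.Write(payload.Length);
        writer.Write(payload);
        writer.Flush();
    }

    // Reads exactly one packet, however the bytes were split in transit.
    public static byte[] ReadPacket(Stream stream)
    {
        byte[] header = ReadExactly(stream, HeaderLength);
        var reader = new BinaryReader(new MemoryStream(header));
        ushort protocolId = reader.ReadUInt16();
        if (protocolId != ProtocolId)
            throw new InvalidDataException("Unexpected protocol identifier.");
        int length = reader.ReadInt32();
        if (length < 0)
            throw new InvalidDataException("Negative payload length.");
        return ReadExactly(stream, length);
    }

    // Stream.Read may return fewer bytes than requested, so loop until done.
    private static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed mid-packet.");
            offset += read;
        }
        return buffer;
    }
}
```

Because the length travels in the header, two packets written back to back can be read back one at a time, which is exactly the case of several application layer packets arriving in one TCP segment.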
If you're transmitting point of sale information, it should be encrypted. I suggest TLS, which is easily enabled in .NET. The code is very simple and there are quite a few samples out there, so for brevity I'm not going to show it here. If you are interested, you can find an example implementation in DotNetOpenServer.
All connections should be authenticated. There are many ways to accomplish this. I've used Windows Authentication (NTLM) as well as Basic. Although NTLM is powerful as well as automatic, it is limited to specific platforms. Basic authentication simply passes a username and password after the socket has been encrypted. Basic authentication can, however, still authenticate the username/password combination against the local server or domain controller, essentially impersonating NTLM. The latter method enables developers to easily create non-Windows client applications that run on iOS, Mac, Unix/Linux flavors as well as Java platforms (although some Java implementations support NTLM). Your server implementation should never allow application data to be transferred until after the session has been authenticated.
There are only a few things we can count on: taxes, networks failing and client applications hanging. It's just the nature of things. Your server should implement a method to clean up both lost and hung client sessions. I've accomplished this in many client/server frameworks through a keep-alive (AKA heartbeat) protocol. On the server side I implement a timer that is reset every time a client sends a packet, any packet. If the server doesn't receive a packet within the timeout, the session is closed. The keep-alive protocol is used to send packets when other application layer protocols are idle. Since your application only sends XML once every 15 minutes, sending a keep-alive packet once a minute would enable the server side to alert the administrator when a connection is lost before the 15 minute interval elapses, possibly enabling the IT department to resolve a network issue in a more timely fashion.
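The server-side timer described above could look roughly like this. It is a sketch, not the code from any of the frameworks mentioned: the class name IdleTimeout and its members are made up for illustration, and it assumes .NET 4.5 or later for Timeout.InfiniteTimeSpan.

```csharp
using System;
using System.Threading;

// One instance per client session (hypothetical helper).
public sealed class IdleTimeout : IDisposable
{
    private readonly Timer timer;
    private readonly TimeSpan timeout;

    // `onExpired` runs on a thread-pool thread if no packet arrives in time.
    public IdleTimeout(TimeSpan timeout, Action onExpired)
    {
        this.timeout = timeout;
        // Fire once after `timeout`; no periodic repeat.
        timer = new Timer(_ => onExpired(), null, timeout, Timeout.InfiniteTimeSpan);
    }

    // Call this whenever the client sends a packet, any packet
    // (application data or keep-alive).
    public void Reset()
    {
        timer.Change(timeout, Timeout.InfiniteTimeSpan);
    }

    public void Dispose()
    {
        timer.Dispose();
    }
}

// Possible use inside a session class (names are assumptions):
//
// idle = new IdleTimeout(TimeSpan.FromMinutes(2), () => CloseSession());
// ...
// void OnPacketReceived(...) { idle.Reset(); /* handle packet */ }
```

With a keep-alive packet every minute, a timeout of two or three minutes would tolerate one lost heartbeat before the session is closed.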
Next, data format. In your case XML is great. XML enables you to change up the payload however you want whenever you want. If you really need speed, then binary will always trump the bloated nature of string represented data.
Finally, as @NSFW already stated, threads versus asynchronous doesn't really matter in your case. I've written servers that scale to 10000 connections based on threads as well as asynchronous callbacks. It's all really the same thing when it comes down to it. As @NSFW said, most of us are using asynchronous callbacks now, and the latest server implementation I've written follows that model as well.

Threads are not terribly expensive, considering the amount of RAM available on modern systems, so I don't think it's helpful to optimize for a low thread count. Especially if we're talking about a difference between 1 thread and 2-5 threads. (With hundreds or thousands of threads, the cost of a thread starts to matter.)
But you do want to optimize for minimal blocking of whatever threads you do have. So for example instead of using Thread.Sleep to do work on 15 minute intervals, just set a timer, let the thread return, and trust the system to invoke your code 15 minutes later. And instead of blocking operations for reading or writing information over the network, use non-blocking operations.
The async/await pattern is the new hotness for asynchronous programming on .Net, and it is a big improvement over the Begin/End pattern that dates back to .Net 1.0. Code written with async/await is still using threads, it is just using features of C# and .Net to hide a lot of the complexity of threads from you - and for the most part, it hides the stuff that should be hidden, so that you can focus your attention on your application's features rather than the details of multi-threaded programming.
So my advice is to use the async/await approach for all of your IO (network and disk) and use timers for periodic chores like sending those updates you mentioned.
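As a rough illustration of the timer advice, here is a small wrapper around System.Threading.Timer. The class name PeriodicJob is hypothetical; the point is only that no thread sits in Thread.Sleep between runs.

```csharp
using System;
using System.Threading;

public sealed class PeriodicJob : IDisposable
{
    private readonly Timer timer;

    // Runs `work` on a thread-pool thread, first after `interval`
    // and then once every `interval`.
    public PeriodicJob(Action work, TimeSpan interval)
    {
        timer = new Timer(_ => work(), null, interval, interval);
    }

    // Stops future runs; a run already in progress is not interrupted.
    public void Dispose()
    {
        timer.Dispose();
    }
}

// Possible use for the price updates from the question
// (SendUpdatesToClients is an assumed method name):
//
// using (new PeriodicJob(SendUpdatesToClients, TimeSpan.FromMinutes(15)))
// {
//     Console.ReadLine(); // keep the process alive
// }
```

Two caveats with this sketch: if a run takes longer than the interval, runs can overlap, and an unhandled exception inside the callback is not caught here, so the work delegate should do its own error handling.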
And about serialization...
One of the biggest advantages of XML over binary formats is that you can save your XML transmissions to disk and open them up using readily-available tools to confirm that the payload really contains the data that you thought would be in there. So I tend to avoid binary formats unless bandwidth is scarce - and even then, it's useful to develop most of the app using a text-friendly format like XML, and then switch to binary after the basic mechanism of sending and receiving data have been fleshed out.
So my vote is for XML.
And regarding your code example, well, there's no async/await in it...
But first, note that a typical simple TCP server will have a small loop that listens for incoming connections and starts a thread to handle each new connection. The code for the connection thread will then listen for incoming data, process it, and send an appropriate response. So the listen-for-new-connections code and the handle-a-single-connection code are completely separate.
So anyway, the connection thread code might look similar to what you wrote, but instead of just calling ReadLine you'd do something like "string line = await ReadLine();" The await keyword is approximately where your code lets one thread exit (after invoking ReadLine) and then resumes on another thread (when the result of ReadLine is available). Except that awaitable methods should have a name that ends with Async, for example ReadLineAsync. Reading a line of text from the network is not a bad idea, but you'll have to write ReadLineAsync yourself, building upon the existing network API.
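For what it's worth, on .NET 4.5 and later StreamReader already provides ReadLineAsync, so a line-based handler can be sketched without writing that method by hand. The following is a minimal sketch under that assumption; LineServer, RunAsync, HandleConnectionAsync and the handleLine delegate are illustrative names, and the protocol (one request line in, one response line out) is made up for the example.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

public static class LineServer
{
    // Accept loop: kept separate from the per-connection code.
    // Ends with an exception when the listener is stopped.
    public static async Task RunAsync(TcpListener listener, Func<string, string> handleLine)
    {
        listener.Start();
        while (true)
        {
            TcpClient client = await listener.AcceptTcpClientAsync();
            // Not awaited on purpose, so that the loop can accept the next
            // client; errors in a connection only end that connection's task.
            Task ignored = HandleConnectionAsync(client, handleLine);
        }
    }

    // Per-connection code: read a line, answer with a line, repeat.
    public static async Task HandleConnectionAsync(TcpClient client, Func<string, string> handleLine)
    {
        using (client)
        using (NetworkStream stream = client.GetStream())
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        using (var writer = new StreamWriter(stream, new UTF8Encoding(false)) { AutoFlush = true })
        {
            string line;
            // ReadLineAsync returns null when the client closes the connection.
            while ((line = await reader.ReadLineAsync()) != null)
            {
                await writer.WriteLineAsync(handleLine(line));
            }
        }
    }
}
```

No thread is blocked while a connection is idle: each await hands the thread back until the next line or the next client arrives.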
I hope this helps.

Load testing a website

I am currently writing a small application to load test a website and am having a few problems.
List<string> pageUrls = new List<string>();
// NOT SHOWN ... populate the pageUrls with thousands of links
var parallelOptions = new System.Threading.Tasks.ParallelOptions();
parallelOptions.MaxDegreeOfParallelism = 100;
System.Threading.Tasks.Parallel.ForEach(pageUrls, parallelOptions, pageUrl =>
{
var startedOn = DateTime.UtcNow;
var request = System.Net.HttpWebRequest.Create(pageUrl);
var responseTimeBefore = DateTime.UtcNow;
try
{
var response = (System.Net.HttpWebResponse)request.GetResponse();
responseCode = response.StatusCode.ToString();
response.Close();
}
catch (System.Net.WebException ex)
{
// NOT SHOWN ... write to the error log
}
var responseTimeAfter = DateTime.UtcNow;
var responseDuration = responseTimeAfter - responseTimeBefore;
// NOT SHOWN ... write the response duration out to a file
var endedOn = DateTime.UtcNow;
var threadDuration = endedOn - startedOn;
// sleep for one second
var oneSecond = new TimeSpan(0, 0, 1);
if (threadDuration < oneSecond)
{
System.Threading.Thread.Sleep(oneSecond - threadDuration);
}
}
);
When I set MaxDegreeOfParallelism to a low value such as 10, everything works fine and the responseDuration stays between 1 and 3 seconds. If I increase the value to 100 (as in the example), the responseDuration climbs quickly until, after around 300 requests, it has reached 25 seconds (and is still climbing).
I thought I may be doing something wrong so I also ran Apache jMeter with the standard web test plan setup and set the users to 100. After about 300 samples the response times had rocketed to around 40 seconds.
I'm skeptical that my server is reaching its limit. The task manager on the server shows that only 2GB of the 16GB is being used and the processor hangs around 5% effort.
Could I be hitting some limit on the number of simultaneous connections on my client computer? If so, how do I change this?
Am I forgetting to do something in my code? Clean-up/close connections?
Could it be that my code is OK and it is in fact my server that just can't handle the traffic?
For reference my client computer that is running the code above is running Windows 7 and is on the same network as the server I am testing. The server is running Windows Server 2008 IIS 7.5 and is a dedicated 8-core 16GB RAM machine.

MaxDegreeOfParallelism should be used only when you are trying to limit the number of cores to be used as part of your program strategy.
By default, the Parallel library uses as many threads as are available, so setting this option to any number will mostly limit performance, depending on the environment it runs in.
I would suggest running this code without setting this option; that should improve performance.
ParallelOptions.MaxDegreeOfParallelism Property in MSDN - read remarks section for more information.

Several suggestions:
How large is your recorded JMeter test script, and did you insert some think time? The larger the test, the heavier the load.
Make sure the LAN is not in use by competing traffic during test runs. A Gigabit Ethernet switch should be mandatory.
Use 2-3 slave machines and avoid heavy results loggers in JMeter such as the results tree. You were right to minimize these graphs and results.

Optimizing download of multiple web pages. C#

I am developing an app where I need to download a bunch of web pages, preferably as fast as possible. The way that I do that right now is that I have multiple threads (100's) that have their own System.Net.HttpWebRequest. This sort of works, but I am not getting the performance I would like. Currently I have a beefy 600+ Mb/s connection to work with, and this is only utilized at most 10% (at peaks). I guess my strategy is flawed, but I am unable to find any other good way of doing this.
Also: If the use of HttpWebRequest is not a good way to download web pages, please say so :)
The code has been semi-auto-converted from Java.
Thanks :)
Update:
public String getPage(String link){
myURL = new System.Uri(link);
myHttpConn = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(myURL);
myStreamReader = new System.IO.StreamReader(new System.IO.StreamReader(myHttpConn.GetResponse().GetResponseStream(),
System.Text.Encoding.Default).BaseStream,
new System.IO.StreamReader(myHttpConn.GetResponse().GetResponseStream(),
System.Text.Encoding.Default).CurrentEncoding);
System.Text.StringBuilder buffer = new System.Text.StringBuilder();
//myLineBuff is a String
while ((myLineBuff = myStreamReader.ReadLine()) != null)
{
buffer.Append(myLineBuff);
}
return buffer.toString();
}

One problem is that it appears you're issuing each request twice:
myStreamReader = new System.IO.StreamReader(
new System.IO.StreamReader(
myHttpConn.GetResponse().GetResponseStream(),
System.Text.Encoding.Default).BaseStream,
new System.IO.StreamReader(myHttpConn.GetResponse().GetResponseStream(),
System.Text.Encoding.Default).CurrentEncoding);
It makes two calls to GetResponse. For reasons I fail to understand, you're also creating two stream readers. You can split that up and simplify it, and also do a better job of error handling...
var response = (HttpWebResponse)myHttpConn.GetResponse();
myStreamReader = new StreamReader(response.GetResponseStream(), Encoding.Default);
That should double your effective throughput.
Also, you probably want to make sure to dispose of the objects you're using. When you're downloading a lot of pages, you can quickly run out of resources if you don't clean up after yourself. In this case, you should call response.Close(). See http://msdn.microsoft.com/en-us/library/system.net.httpwebresponse.close.aspx
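Putting both points together (a single GetResponse call and deterministic cleanup), the method could be sketched as follows. The using blocks dispose the response, which has the same effect as calling Close. The Configure method is an addition that is not part of this answer: on the .NET Framework, client applications default to 2 concurrent connections per host, so raising ServicePointManager.DefaultConnectionLimit may matter when many threads hit the same server. The class and method names are illustrative.

```csharp
using System.IO;
using System.Net;
using System.Text;

public static class PageDownloader
{
    // Call once at startup, before any requests are created.
    // Assumption: many of the pages live on the same hosts.
    public static void Configure(int maxConnectionsPerHost)
    {
        ServicePointManager.DefaultConnectionLimit = maxConnectionsPerHost;
    }

    // Issues exactly one request and releases the connection when done.
    public static string GetPage(string link)
    {
        var request = (HttpWebRequest)WebRequest.Create(link);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (var reader = new StreamReader(body, Encoding.Default))
        {
            // Unlike the ReadLine loop in the question, this keeps line breaks.
            return reader.ReadToEnd();
        }
    }
}
```

Everything here is created per call, so the method can be called from several threads at once without sharing fields such as myHttpConn or myStreamReader.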

I am adding this answer as another possibility which people may encounter when
downloading from multiple servers using multi-threaded apps
using Windows XP or Vista as the operating system
The tcpip.sys driver for these operating systems has a limit of 10 outbound connections per second. This is a rate limit, not a connection limit, so you can have hundreds of connections, but you cannot initiate more than 10/s. The limit was imposed by Microsoft to curtail the spread of certain types of virus/worm. Whether such methods are effective is outside the scope of this answer.
In a multi-threaded application that downloads from multitudes of servers, this limitation can manifest as a series of timeouts. Windows puts into a queue all of the "half-open" (newly open but not yet established) connections once the 10/s limit is reached. In my application, for example, I had 20 threads ready to process connections, but I found that sometimes I would get timeouts from servers I knew were operating and reachable.
To verify that this is happening, check the operating system's event log, under System. The error is:
EventID 4226: TCP/IP has reached the security limit imposed on the number of concurrent TCP connect attempts.
There are many references to this error and plenty of patches and fixes to apply to remove the limit. However because this problem is frequently encountered by P2P (Torrent) users, there's quite a prolific amount of malware disguised as this patch.
I have a requirement to collect data from over 1200 servers (that are actually data sensors) on 5-minute intervals. I initially developed the application (on WinXP) to reuse 20 threads repeatedly to crawl the list of servers and aggregate the data into a SQL database. Because the connections were initiated based on a timer tick event, this error happened often because at their invocation, none of the connections are established, thus 10 are immediately queued.
Note that this isn't a problem necessarily, because as connections are established, those queued are then processed. However if non-queued connections are slow to establish, that time can negatively impact the timeout limits of the queued connections (in my experience). The result, looking at my application log file, was that I would see a batch of connections that timed out, followed by a majority of connections that were successful. Opening a web browser to test "timed out" connections was confusing, because the servers were available and quick to respond.
I decided to try HEX editing the tcpip.sys file, which was suggested on a guide at speedguide.net. The checksum of my file differed from the guide (I had SP3 not SP2) and comments in the guide weren't necessarily helpful. However, I did find a patch that worked for SP3 and noticed an immediate difference after applying it.
From what I can find, Windows 7 does not have this limitation, and since moving the application to a Windows 7-based machine, the timeout problem has remained absent.

I do this very same thing, but with thousands of sensors that provide XML and Text content. Factors that will definitely affect performance are not limited to the speed and power of your bandwidth and computer, but the bandwidth and response time of each server you are contacting, the timeout delays, the size of each download, and the reliability of the remote internet connections.
As comments indicate, hundreds of threads is not necessarily a good idea. Currently I've found that running between 20 and 50 threads at a time seems optimal. In my technique, as each thread completes a download, it is given the next item from a queue.
I run a custom ThreaderEngine Class on a separate thread that is responsible for maintaining the queue of work items and assigning threads as needed. Essentially it is a while loop that iterates through an array of threads. As the threads finish, it grabs the next item from the queue and starts the thread again.
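The ThreaderEngine class itself is not shown, so the following is only a sketch of the same idea using a different mechanism: a fixed number of worker threads that each pull the next item from a shared ConcurrentQueue until it is empty. The name WorkerPool and its signature are assumptions made for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class WorkerPool
{
    // Processes all items using `workerCount` threads and
    // returns once every item has been handled.
    public static void Run<T>(IEnumerable<T> items, int workerCount, Action<T> process)
    {
        var queue = new ConcurrentQueue<T>(items);
        var threads = new List<Thread>();

        for (int i = 0; i < workerCount; i++)
        {
            var thread = new Thread(() =>
            {
                T item;
                // Each worker takes the next item as soon as it is free.
                while (queue.TryDequeue(out item))
                    process(item);
            });
            thread.Start();
            threads.Add(thread);
        }

        foreach (Thread thread in threads)
            thread.Join();
    }
}

// Possible use (Sensor, sensors and Store are assumed names; FileDownload is the
// method from this answer):
//
// WorkerPool.Run(sensors, 30, s => Store(FileDownload(s.Ip, s.Port, s.File, 5000, 5000)));
```

In this sketch an exception thrown by process ends that worker thread, so the delegate should handle its own errors, as FileDownload does with its catch blocks.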
Each of my threads are actually downloading several separate items, but the method call is the same (.NET 4.0):
public static string FileDownload(string _ip, int _port, string _file, int Timeout, int ReadWriteTimeout, NetworkCredential _cred = null)
{
string uri = String.Format("http://{0}:{1}/{2}", _ip, _port, _file);
string Data = String.Empty;
try
{
HttpWebRequest Request = (HttpWebRequest)WebRequest.Create(uri);
if (_cred != null) Request.Credentials = _cred;
Request.Timeout = Timeout; // applies to .GetResponse()
Request.ReadWriteTimeout = ReadWriteTimeout; // applies to .GetResponseStream()
Request.Proxy = null;
Request.CachePolicy = new System.Net.Cache.RequestCachePolicy(System.Net.Cache.RequestCacheLevel.NoCacheNoStore);
using (HttpWebResponse Response = (HttpWebResponse)Request.GetResponse())
{
using (Stream dataStream = Response.GetResponseStream())
{
if (dataStream != null)
using (BufferedStream buffer = new BufferedStream(dataStream))
using (StreamReader reader = new StreamReader(buffer))
{
Data = reader.ReadToEnd();
}
}
return Data;
}
}
catch (AccessViolationException ave)
{
// ...
}
catch (Exception exc)
{
// ...
}
// Reached only when an exception was handled above; Data is still empty then.
return Data;
}
Using this I am able to download about 60KB each from 1200+ remote machines (72MB) in less than 5 minutes. The machine is a Core 2 Quad with 2GB RAM and utilizes four bonded T1 connections (~6Mbps).
