C# multithreaded incrementation [closed]

I'm trying to make an application (and learn C# along the way) that basically counts, as fast as possible.
That's why I initially thought of multiple threads. However, as I see it, that wouldn't work, because incrementing a single counter is inherently sequential, while the whole point of multithreading is to run code in parallel at the same time, right?
So, can I use multiple threads? Any general tips on making the increment faster?
Thank you.

You have to partition the number range. E.g., instead of incrementing from 0 to 999,999 in one thread, let four threads increment from 0 to 249,999, from 250,000 to 499,999, from 500,000 to 749,999 and from 750,000 to 999,999 respectively.
And have a look at Task Parallelism (Task Parallel Library).
Do not make the mistake of creating one million tasks that each increment once! The task-scheduling overhead would actually slow the process down considerably. You only gain speed if every task performs a substantial amount of work.
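As a rough sketch (the range and partition count are illustrative, not from the question), partitioned counting with tasks might look like this:

using System.Threading.Tasks;

const long Max = 1000000;
const int PartitionCount = 4;
long chunk = Max / PartitionCount;
var tasks = new Task<long>[PartitionCount];
for (int p = 0; p < PartitionCount; p++)
{
    long start = p * chunk;
    long end = (p == PartitionCount - 1) ? Max : start + chunk;
    tasks[p] = Task.Run(() =>
    {
        long local = 0;
        for (long i = start; i < end; i++)
            local++;                 // each task counts only its own partition
        return local;
    });
}
Task.WaitAll(tasks);
long total = 0;
foreach (var t in tasks)
    total += t.Result;               // combine the partial counts at the end

Each task keeps a private counter and the partial results are combined once at the end, so the tasks never contend on shared state.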

For simple incrementing of a single variable, the best-performing option is the often overlooked Interlocked class.
private long SingleVariable = 0;

public void MultiThreadedMethod()
{
    Interlocked.Increment(ref SingleVariable);   // atomic increment; note the ref keyword
}
That said, there would be little performance benefit from multithreading in this simple example, as I would expect cache-coherence traffic between cores to be the bottleneck in such a case.
However, such a pattern is often useful in the multithreaded world, where multiple threads each complete a unit of work and then increment a central counter to track total progress across the threads. The alternative to Interlocked would be Monitor.Enter (or the lock keyword in C#), which is relatively slower.
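For comparison, a lock-based version of the same counter might look like the sketch below (the _sync field is an assumed helper object, not part of the original example):

private readonly object _sync = new object();
private long SingleVariable = 0;

public void MultiThreadedMethod()
{
    lock (_sync)   // Monitor.Enter/Exit under the hood; heavier than Interlocked.Increment
    {
        SingleVariable++;
    }
}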

Related

XML vs Array Performance [closed]

From a performance standpoint, is it more beneficial to read large amounts of data from an XML file or to loop through an array?
I have around 2,000 datasets I need to loop through and do calculations with, so I'm just wondering if it would be better to import all XML data and process it as an array (single large import) or to import each dataset sequentially (many small imports).
Thoughts and suggestions?
If I have interpreted your question correctly, you need to load 2,000 sets of data from one file, and then process them all. So you have to read all the data and process all the data. At a basic level there is the same amount of work to do.
So I think the question is "How can I finish the same processing earlier?"
Consider:
How much memory will the data use? If it's going to be more than 1.5GB of RAM, then you will not be able to process it in a single pass on a 32-bit PC, and even on 64-bit PCs you're likely to see virtual memory paging killing performance. In either of these cases, streaming the data in smaller chunks is a necessity.
Conversely if the data is small (e.g. 2000 records might only be 200kB for all I know), then you may get better I/O performance by reading it in one chunk, or it will load so fast compared to the processing time that there is no point trying to optimise it.
Are the records independent? (i.e. they don't need to be processed in a particular order, and you don't need one record in memory in order to process another.) If so, and if the loading time is significant overall, then the "best" approach may be to parallelise the operation: if you can process some data while you are loading more data in the background, you will utilise the hardware better and do the same work in less time. So you probably want to consider splitting your loading and processing onto different threads.
But spreading the processing across many threads might not help if loading takes much longer than processing, as your processing threads may be starved of data while waiting for I/O; one processing thread may then be just as fast as three or seven. And there's no point in creating more threads than you have CPU cores available. If going multithreaded, I'd write it to use a configurable/dynamic number of threads, and then do some testing to determine the optimum (see the sketch below).
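As a hedged sketch of that load-in-the-background idea (the "record" element name, the processing stub and the worker count are all assumptions, not details from the question):

using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Xml;
using System.Xml.Linq;

class Pipeline
{
    // Streams one record element at a time instead of loading the whole document.
    static IEnumerable<XElement> LoadRecords(string path)
    {
        using (var reader = XmlReader.Create(path))
        {
            reader.MoveToContent();
            reader.Read();                                // step to the first child node
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "record")
                    yield return (XElement)XNode.ReadFrom(reader);   // ReadFrom advances the reader
                else
                    reader.Read();
            }
        }
    }

    static void Process(XElement record) { /* your calculations here */ }

    static void Run(string path, int workerCount)
    {
        // Bounded queue: the loader blocks if the workers fall behind.
        var queue = new BlockingCollection<XElement>(boundedCapacity: 100);

        var loader = Task.Run(() =>
        {
            foreach (var record in LoadRecords(path))
                queue.Add(record);
            queue.CompleteAdding();                       // signal that no more data is coming
        });

        var workers = new Task[workerCount];              // keep this configurable and measure
        for (int i = 0; i < workerCount; i++)
            workers[i] = Task.Run(() =>
            {
                foreach (var record in queue.GetConsumingEnumerable())
                    Process(record);
            });

        Task.WaitAll(workers);
        loader.Wait();
    }
}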
But before you consider all of that, you might want to write a brute-force version and see what the performance is like. Do you even need to optimise it?
And if the answer is "yes, I desperately need to optimise it", then can you reconsider the data format? XML is a very useful but grossly inefficient format. If you have a performance critical case, is there anything you can do to reduce the XML size (e.g. simply using shorter element names can make a massive difference on large files), or even use a much more compact and easily read binary format?

C# Multithread, which performs the best? [closed]

I'm currently writing an application that makes a huge number of calls to slow web services (I had no say in that design) that produce little output.
I'd like to make around 100 parallel calls (I know true parallelism can only go as far as the number of cores).
But I was wondering if there were performance differences between the different approaches.
I'm hesitating between:
Using Task.Factory.StartNew in a loop.
Using Parallel.For.
Using BackgroundWorker.
Using AsyncCallback.
...Others?
My main goal is to have as many web service calls started as quickly as possible.
How should I proceed?
From a performance standpoint it's unlikely to matter. As you yourself have described, the bottleneck in your program is the network call to a slow web service. Any differences in how long it takes to spin up or manage threads will be overshadowed by the network interaction.
You should use the model/framework that you are most comfortable with, and that will most effectively allow you to write code that you know is correct.
It's also important to note that you don't actually need multiple threads on your machine at all. You can send a number of asynchronous requests to the web service from a single thread, and even handle all of the callbacks on that same thread; parallelizing the sending of requests is unlikely to have any meaningful performance impact. Because of this you don't really need any of the frameworks you listed, although the Task Parallel Library is highly effective at managing asynchronous operations even when those operations don't represent work on another thread. You don't need it, but it's certainly capable of helping (see the sketch below).
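For instance, a hedged sketch of firing all the calls from one thread with asynchronous I/O (HttpClient and the URL list are stand-ins for whatever client your web service actually uses):

using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

static async Task<string[]> CallAllAsync(string[] urls)
{
    using (var client = new HttpClient())
    {
        // Select starts every request immediately; no thread sits blocked
        // on the network while the responses trickle in.
        Task<string>[] calls = urls.Select(u => client.GetStringAsync(u)).ToArray();
        return await Task.WhenAll(calls);
    }
}

The total time is then roughly that of the slowest call rather than the sum of all of them.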
Following your advice, I used async I/O where I was previously using the TPL with synchronous calls.
Async really does outperform sync + Task usage.
I can now launch 100 requests (almost?) at the same time, and if the longest-running one takes 5 seconds, the whole process lasts only about 7 seconds, whereas with sync + TPL it took me around 70 seconds.
In conclusion, the (auto-generated) async methods are really the way to go when consuming a lot of web services.
Thanks to you all.
Oh and by the way, this would not be possible without:
<!-- goes in app.config / web.config under <configuration> -->
<system.net>
  <connectionManagement>
    <add address="*" maxconnection="100" />
  </connectionManagement>
</system.net>
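(Equivalently, the same limit can be raised in code rather than config:)

using System.Net;

ServicePointManager.DefaultConnectionLimit = 100;   // same effect as maxconnection above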

improve performance of a REST Service [closed]

I have a method which calls a stored procedure 300 times in a for loop, and each time the stored procedure returns 1,200 records. How can I improve this? I cannot eliminate the 300 calls, but are there any other ways I can try? I am using a REST service implemented in ASP.NET, with IBATIS for database connectivity.
I cannot eliminate the 300 calls
Eliminate the 300 calls.
Even if all you can do is to just add another stored procedure which calls the original stored procedure 300 times, aggregating the results, you should see a massive performance gain.
Even better if you can write a new stored procedure that replicates the original functionality but is structured more appropriately for your specific use case, and call that, once, instead.
Making 300 round trips between your code and your database quite simply is going to take time, even where the code and the database are on the same system.
Once this horrible bit is resolved, there will be other things you can look at optimising, if required.
Measure.
Measure the amount of time spent inside the server-side code. Measure the amount of that time that is spent in the stored procedure. Measure the amount of time spent at the client part. Do some math, and you have a rough estimate for network time and other overheads.
Returning 1200 records, I would expect network bandwidth to be one of the main issues; you could perhaps investigate whether a different serialization engine (with the same output type) might help, or perhaps whether adding compression (gzip / deflate) support would be beneficial (meaning: reduced bandwidth being more important than the increased CPU required).
Latency might be important if you are calling the REST service 300 times; maybe you can parallelize slightly, or make fewer big calls rather than lots of small calls.
You could batch the SQL code, so you only make a few trips to the DB (calling the SP repeatedly in each) - that is perfectly possible; just use EXEC etc (still using parameterization).
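A hedged sketch of that batching idea (the procedure name dbo.GetRecords, the @id parameters, and the connectionString / ids variables are illustrative, not from the question):

using System.Data.SqlClient;
using System.Text;

using (var conn = new SqlConnection(connectionString))
using (var cmd = conn.CreateCommand())
{
    var sql = new StringBuilder();
    for (int i = 0; i < 50; i++)                      // 50 calls per round trip instead of 1
    {
        sql.AppendFormat("EXEC dbo.GetRecords @id{0};", i);
        cmd.Parameters.AddWithValue("@id" + i, ids[i]);
    }
    cmd.CommandText = sql.ToString();
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        do
        {
            while (reader.Read()) { /* consume one row */ }
        } while (reader.NextResult());                // one result set per EXEC
    }
}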
You could look at how you are getting the data from ADO.NET to the REST layer. You mention IBATIS, but have you checked whether this is fast / slow compared to, say, "dapper" ?
Finally, the SP performance itself can be investigated; indexing or just a re-structuring of the SP's SQL may help.
Well, if you have to return 360,000 records, you have to return 360,000 records. But do you really need to return 360,000 records? Start there and work your way down.
Without knowing too much of the details, the architecture appears flawed. On one hand it's considered unreasonable to lock the tables for the 6 seconds it takes to retrieve the 360,000 records in a single stored-procedure execution, but on the other it's fine to return a possibly inconsistent set of 360,000 records retrieved via multiple executions. It makes me wonder what exactly you are trying to implement and whether there is a better way to design the integration between the client and the server.
For instance, if the client is retrieving a set of records that have been created since the last request, then maybe a paged ATOM feed would be more appropriate.
Whatever it is you are doing, 360,000 records is a lot of data to move between the server and the client, and we should look at the architecture and purpose of that data transfer to make sure the current approach is appropriate.

C# when do I need locks? [closed]

I have read something about thread safety, but I want to understand which operations I need to lock.
For example, let's say I want a thread-safe queue.
If the dequeue operation returns the first element if there is one, when do I need a lock? Let's say I'm using an abstract linked list for the entries.
Should write actions be locked? Or reading ones? Or both?
I hope someone can explain this to me or give me some links.
Synchronization in concurrent scenarios is a very wide topic. Essentially whenever two or more threads have some shared state between them (counter, data structure) and at least one of them mutates this shared state concurrently with a read or another mutation from a different thread, the results may be inconsistent. In such cases you will need to use some form of synchronization (of which locks are a flavor).
Now, coming to your question, typical dequeue code looks like the following (assume queue is a Queue<T> and locker is a plain object used only for locking):
if (queue.Count > 0)
    queue.Dequeue();
which may be executed concurrently by multiple threads. Even if the queue implementation internally synchronizes both the Count check and the Dequeue operation, that is not enough: a thread executing the above code may be interrupted between the check and the actual dequeue, so some threads may find the queue empty when reaching the Dequeue even though the check returned true. A lock over the entire sequence is needed:
lock (locker)
{
    if (queue.Count > 0)
        queue.Dequeue();
}
Note that some data structures implement the above as a single thread-safe operation (see the sketch below), but I'm just trying to make a point here.
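For example, ConcurrentQueue<T> folds the emptiness check and the dequeue into one atomic call, so no external lock is needed for this particular pattern:

using System.Collections.Concurrent;

var queue = new ConcurrentQueue<int>();
queue.Enqueue(42);

int item;
if (queue.TryDequeue(out item))   // atomically checks for emptiness and removes
{
    // use item here; no lock required
}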
The best guide to locking and threading that I have found is this page (it's the text I consult when working with locking and threading):
http://www.albahari.com/threading/
You want the section "Locking and Thread Safety", but read the rest as well; it is very well written.
For a basic overview see MSDN: Thread Synchronization. For a more detailed introduction I recommend reading Amazon: Concurrent Programming on Windows.
You need locks around objects that are subject to non-atomic operations.
Adding an object to a list -> non-atomic
Assigning a value to a byte or an int -> atomic
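A small sketch of the difference (the counter field is illustrative):

using System.Threading;

class CounterDemo
{
    static int counter = 0;

    static void Demo()
    {
        counter = 5;                          // atomic: a single aligned 32-bit store
        counter++;                            // NOT atomic: a read, an add and a write
        Interlocked.Increment(ref counter);   // atomic read-modify-write, safe across threads
    }
}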
As the simplest rule of thumb, all shared mutable data requires taking a lock while you access it.
You need a lock when writing, because you need to ensure no two threads are writing the same fields at the same time.
You need a lock when reading, because another thread could be halfway through writing the data, leaving it in an inconsistent state. Inconsistent data can produce incorrect output, or crashes.
Locks have their own set of problems associated with them (Google for "dining philosophers"), so I tend to avoid explicit locks whenever possible. Higher-level building blocks, like ConcurrentQueue<>, are less error-prone, but you should still read the documentation.
Another simple way to avoid locks is to make a copy of the input data for your background process. Or even better, use immutable input (data that cannot change).
The basic rules of locking
Changing the same thing simultaneously does not fly
Reading a thing that is being changed does not fly
Reading the same thing simultaneously does fly
Changing different things simultaneously might fly
Locking needs to prevent the situations that do not fly. This can be done in many ways, and C# gives you a lot of tools for it: among others, the concurrent collection types like ConcurrentDictionary and ConcurrentQueue, but also ReaderWriterLockSlim and more (see the sketch below).
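As an illustration of those rules, here is a hedged sketch using ReaderWriterLockSlim (the SharedTable type is made up for the example): many readers may enter simultaneously, which "does fly", while a writer gets exclusive access.

using System.Collections.Generic;
using System.Threading;

class SharedTable
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<string, int> _data = new Dictionary<string, int>();

    public bool TryGet(string key, out int value)
    {
        _lock.EnterReadLock();            // any number of readers may hold this at once
        try { return _data.TryGetValue(key, out value); }
        finally { _lock.ExitReadLock(); }
    }

    public void Set(string key, int value)
    {
        _lock.EnterWriteLock();           // exclusive: blocks all readers and writers
        try { _data[key] = value; }
        finally { _lock.ExitWriteLock(); }
    }
}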
You might find this free PDF from Microsoft useful. It's called "An Introduction to Programming with C# Threads":
http://research.microsoft.com/pubs/70177/tr-2005-68.pdf
Or this somewhat more humorous read:
http://www.codeproject.com/Articles/114262/6-ways-of-doing-locking-in-NET-Pessimistic-and-opt

What to use for high-performance server coding? [closed]

I need to build a server that accepts client connections at a very high frequency and load (each user will send a request every 0.5 seconds and should get a response in under 800 ms; I should be able to support thousands of users on one server). The assumption is that the SQL Server is finely tuned and will not pose a problem (an assumption that, of course, might not be true).
I'm looking to write a non-blocking server to accomplish this. My back end is a SQL Server sitting on another machine. It doesn't have to be updated live, so I think I can cache most of the data in memory and flush it to the DB every 10-20 seconds.
Should I write the server in C# (which is more compatible with SQL Server)? Maybe Python with Tornado? What should my considerations be when writing a high-performance server?
EDIT: (added more info)
The Application is a game server.
I don't really know the actual traffic - but this is the prognosis and the server should support it and scale well.
It's hosted "in the cloud" in a Datacenter.
Language doesn't really matter. Performance does. (a Web service can be exposed on the SQL Server to allow other languages than .NET)
The connections are very frequent but small (very little data is returned and little computations are necessary).
It should hold most of the data in the memory for fastest performance.
Any thoughts will be much appreciated :)
Thanks
Okay, if you REALLY need high performance, don't go for C#, but C/C++, it's obvious.
In any case, the fastest way to do server programming (as far as I know) is to use IOCP (I/O Completion Ports). Well, that's what I used when I made an MMORPG server emulator, and it performed faster than the official C++ select-based servers.
Here's a very complete introduction to IOCP in C#:
http://www.codeproject.com/KB/IP/socketasynceventargs.aspx
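A minimal, hedged sketch of the accept/receive skeleton built on SocketAsyncEventArgs (the port, buffer size and the empty receive handler are placeholders, not a complete server):

using System.Net;
using System.Net.Sockets;

class IocpServer
{
    private readonly Socket _listener;

    public IocpServer(int port)
    {
        _listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        _listener.Bind(new IPEndPoint(IPAddress.Any, port));
        _listener.Listen(100);
    }

    public void Start()
    {
        var acceptArgs = new SocketAsyncEventArgs();
        acceptArgs.Completed += (s, e) => ProcessAccept(e);
        if (!_listener.AcceptAsync(acceptArgs))   // false = completed synchronously
            ProcessAccept(acceptArgs);
    }

    private void ProcessAccept(SocketAsyncEventArgs e)
    {
        Socket client = e.AcceptSocket;

        // Start an asynchronous receive on the new connection.
        var recvArgs = new SocketAsyncEventArgs();
        recvArgs.SetBuffer(new byte[4096], 0, 4096);
        recvArgs.Completed += (s, a) => { /* handle a.Buffer, a.BytesTransferred */ };
        client.ReceiveAsync(recvArgs);

        // Re-arm the accept for the next connection.
        e.AcceptSocket = null;
        if (!_listener.AcceptAsync(e))
            ProcessAccept(e);
    }
}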
Good luck !
Use the programming language that you know best. It's a lot more expensive to hunt down performance issues in a large application that you do not fully understand.
It's a lot cheaper to buy more hardware.
People will say C++, because garbage collection in .NET could kill your latency. You can avoid garbage collection, though, if you are clever about reusing existing managed objects.
Edit: your assumption about SQL Server is probably wrong. You need to store your state in memory for random access. If you need to persist changes, journal them to the filesystem and consolidate them with the database infrequently.
Edit 2: You will have a lot of different threads talking to the same data. In order to avoid blocking and deadlocks, learn about lock-free programming (Interlocked.CompareExchange etc.; see the sketch below).
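A hedged sketch of such a lock-free update (AddClamped is an invented example, not an existing API): keep retrying until no other thread changed the value between our read and our write.

using System;
using System.Threading;

static class LockFree
{
    // Adds delta to location, clamping at max, without ever taking a lock.
    public static int AddClamped(ref int location, int delta, int max)
    {
        while (true)
        {
            int current = location;                        // snapshot the current value
            int desired = Math.Min(current + delta, max);  // compute the replacement
            // Publish only if the value is still what we read; otherwise retry.
            if (Interlocked.CompareExchange(ref location, desired, current) == current)
                return desired;
        }
    }
}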
I was part of a project that included very high-performance server code, which actually included the ability to respond with a TCP packet within 12 milliseconds or so.
We used C# and I must agree with jgauffin - a language that you know is much more important than just about anything.
Two tips:
Writing to the console (especially in color) can really slow things down.
If it's important for the server to be fast at the first requests, you might want to use a pre-JIT compiler to avoid JIT compilation during the first requests. See Ngen.exe.
