Multithread + C# + Performance + SQLite

I have a Windows Forms application.
(C# with Visual Studio 2010, Framework: 4.0)
Database: a .db file database, accessed through SQLite.
// One thread per query; database() runs the given SQL query against SQLite.
Thread th = new Thread(() => database(node, xNodetcname, xNodesqlquery1, xNodeHint, e, inXml));
th.Name = thread; // 'thread' is a string holding the thread's display name
th.Start();
The code above creates one thread per query, and each thread processes the database() function in parallel. Each thread has one SQL query that fetches data from the database.
When I do not use multithreading the performance is better, but when I use multithreading the performance is worse.
Example:
Without multithreading, 3 queries take 1.5 minutes.
With multithreading, the same 3 queries take 1.9 minutes.
My aim is to reduce the processing time of the queries.

Then generally stay away from threads.
Some basic education: in most cases database performance is limited by IO, not CPU. IO can partially be mitigated by using a lot of memory as buffers, which is why large database servers have TONS of memory.
You run a small, lightweight database. It is likely not running on database-server-grade hardware or an SSD, so it has an IO problem; performance will be limited by IO.
Now, multiple threads make the IO side (the hard disc) inefficient, especially because a WinForms app does not normally run on a high-end IO subsystem.
Ergo, if you want a faster query:
Optimize the query. Missing index?
Get proper hardware and/or upgrade to a heavier setup. An SSD is a great help here.
Do not use multithreading; try to solve it at the SQL level, but accept that this may not be possible. There is a reason companies still use real database servers to handle large amounts of data. And SQLite may not be a good option - there is no way to say whether it is or not, because you leave that side out of your information.
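For the "missing index" point, here is a minimal sketch of how to check a query plan with System.Data.SQLite (the provider the question already uses); the file, table, and column names are made up for illustration:

using System;
using System.Data.SQLite;

class QueryPlanCheck
{
    static void Main()
    {
        using (var conn = new SQLiteConnection("Data Source=app.db;Version=3;"))
        {
            conn.Open();

            // EXPLAIN QUERY PLAN shows whether SQLite scans the whole table or
            // uses an index; "SCAN TABLE" in the detail column usually means a
            // missing index.
            using (var cmd = new SQLiteCommand(
                "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42", conn))
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine(reader["detail"]);
            }

            // If the plan reports a full scan, an index like this may help:
            using (var cmd = new SQLiteCommand(
                "CREATE INDEX IF NOT EXISTS idx_orders_customer ON orders(customer_id)", conn))
            {
                cmd.ExecuteNonQuery();
            }
        }
    }
}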

There are a few things you have to take into consideration here:
Is your DB connection in multi-threading mode (as described in this document)?
Is the database engine suitable for multithreading? (Hint: SQLite is not; see TomTom's answer.)
Are you using a thread pool (you are not), or are you initializing a new thread every time (which is rather slow)? A sketch of the pooled approach follows this list.
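On that last point, a rough sketch of the pooled alternative; database() and its arguments are the question's own names, reused here as placeholders:

using System.Threading.Tasks;

// One Task per query: the TPL schedules these on pooled threads instead of
// paying the cost of creating a fresh OS thread for each query.
var tasks = new[]
{
    Task.Factory.StartNew(() => database(node, xNodetcname, xNodesqlquery1, xNodeHint, e, inXml)),
    Task.Factory.StartNew(() => database(node, xNodetcname, xNodesqlquery2, xNodeHint, e, inXml)),
    Task.Factory.StartNew(() => database(node, xNodetcname, xNodesqlquery3, xNodeHint, e, inXml))
};
Task.WaitAll(tasks); // block until all three queries have finished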

Related

C# Windows Service Threading Process Improve Performance

We developed a Windows service that uses threads to consume database records (.NET 2.0). The code block is shown below.
for (int i = 0; i < ThreadCount; i++)
{
    ParameterizedThreadStart pts = new ParameterizedThreadStart(MyCode.DoWork);
    Thread t = new Thread(pts);
    t.Start(someObject);
}
ThreadCount is read from app.config. The MyCode.DoWork(Object someObject) method selects some data from SQL Server and performs some operations on it. In addition, we call a stored procedure, and the query contains WITH (ROWLOCK) hints and the like.
while (someObject.Running)
{
    // select some data
}
The main question is how to improve my Windows service. Some articles say that manually creating threads increases CPU cost, etc. So how can I improve my app's performance? Would using the Task Parallel Library bring any advantage, i.e. creating a task instead of creating a thread? Does Task itself create threads according to the available CPU count, or should I convert my loop to something like:
for (int i = 0; i < ThreadCount; i++)
{
    Task t = new Task(() => MyCode.Work());
    t.Start(); // without Start() the task never runs
}
To improve the performance of your application, you need to find out where the performance is lacking and then why it's lacking. Unfortunately we can't do that for you, because it needs access to the running C# and SQL code.
I suggest doing some performance profiling, either with a profiler tool (I use Redgate's tool) or by adding profiling code to your application. Once you can see the bottlenecks, you can make a theory about what's causing them.
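If you go the hand-rolled route, a minimal timing sketch (reusing the question's DoWork and someObject) might be:

using System;
using System.Diagnostics;

// Wrap the suspect section in a Stopwatch to see where the time actually goes.
var sw = Stopwatch.StartNew();
MyCode.DoWork(someObject); // the question's own worker method
sw.Stop();
Console.WriteLine("DoWork took " + sw.ElapsedMilliseconds + " ms");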
I would start with the database first - look at the cached execution plans to see if there are any clues. Try running the stored procedures separately from the service.
To make things simpler, and assuming the Windows service is the only client accessing the database:
1. Try to access the database from one thread. This reduces lock contention on DB tables, assuming there is contention in accessing the data.
2. Put the retrieved data into a queue and have TPL tasks process it, utilizing all CPU cores, assuming your performance bottleneck is the CPU (see the sketch below).
3. Then buy as many CPU cores as you can afford.
This is an intuitive pattern and there are many assumptions in it. You'll need to profile and analyze your program to know whether it is suitable.
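A minimal sketch of points 1 and 2, assuming a single reader and CPU-bound processing; Record, ReadRecordsFromDb, and Process are invented names standing in for your own types:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var queue = new BlockingCollection<Record>(boundedCapacity: 1000);

// Point 1: a single producer task is the only thing touching the database,
// so there is no lock contention between readers.
var producer = Task.Factory.StartNew(() =>
{
    foreach (var record in ReadRecordsFromDb()) // stand-in for your data access
        queue.Add(record);
    queue.CompleteAdding(); // lets consumers drain the queue and finish
});

// Point 2: one CPU-bound consumer per core.
var consumers = new Task[Environment.ProcessorCount];
for (int i = 0; i < consumers.Length; i++)
{
    consumers[i] = Task.Factory.StartNew(() =>
    {
        foreach (var record in queue.GetConsumingEnumerable())
            Process(record); // stand-in for the CPU-heavy work
    });
}

Task.WaitAll(consumers);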

Do SQL Server stored procedures perform better in network clusters?

From what I've read, there appear to be only marginal performance benefits to using stored procedures versus simply building the commands in C# and calling them explicitly in the program's code, at least when the application and DB engine share the same machine (and when the procedures are simple). Most people seem to think it's a 'preference issue', and add a few other minor benefits to justify their case.
However, one I couldn't find any information on, is the benefit of a stored procedure when the database engine is located on a separate physical machine from the main application.
If I am not mistaken, in a server farm, wouldn't a stored procedure offload some CPU threads' processing from the main server application and have the primary processing done on the DB engine server's CPU instead? Or is this already done on the DB engine's CPU anyway, when the C# libraries 'build' the information for the DB engine to process?
Specifically, I have a long-running transaction that I could do multiple calls in a C# transaction block, but I suspect that a stored proc will in fact have a huge performance benefit by reducing the network calls to the db engine, as well as guaranteeing the processing is not being done on the main server application.
Is this true?
Performance gains from a stored procedure (versus something like Dapper or an OR/M like Entity Framework) can vary anywhere from nearly identical to a very noticeable performance improvement. I don't think your question can be answered without seeing the code that would be translated to a stored procedure.
Having said that, in my experience making a single stored procedure call versus multiple statements from the application code, yes, it would likely be faster.
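For illustration, the difference is mostly in round trips; the table, procedure, and variable names below are hypothetical:

using System.Data;
using System.Data.SqlClient;

// Many round trips: each ExecuteNonQuery is a separate network call.
foreach (var item in items)
{
    using (var cmd = new SqlCommand("UPDATE Orders SET Status = @s WHERE Id = @id", conn))
    {
        cmd.Parameters.AddWithValue("@s", item.Status);
        cmd.Parameters.AddWithValue("@id", item.Id);
        cmd.ExecuteNonQuery();
    }
}

// One round trip: a stored procedure does the same work server-side.
using (var cmd = new SqlCommand("dbo.UpdateOrderStatuses", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.AddWithValue("@batchId", batchId);
    cmd.ExecuteNonQuery();
}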
If the SP is just a simple query (i.e. one SELECT statement), the performance gain is that an SP is pre-compiled. While the query is running you should not see any difference between a plain query and an SP.
I'm not sure of the effect if the SP is more complicated, because that would depend on the query.
The more important benefit of an SP is that all the data stays in the DBMS instead of being sent back and forth to the client. If you are dealing with a large amount of data the benefit is more evident. The difference grows if your DB is located on a different machine, and even more if the connection between them is slow.
On the other hand, consider that an SP is usually not compiled to machine code, so if the SP implements very complex logic it could be faster to implement the logic on the client.
You should also consider that moving business logic to the server is not good for code maintenance; you could be adding technical debt by implementing in the DB something that should be in your client code.
So there isn't a solution valid for all seasons, but usually a well-written SP is faster than the same code running on the client.
There are a few issues at play here. As others have said, it depends. The gain on a raw SELECT statement will be barely noticeable. If there's a hugely complex query, an SP can save a lot of repetitive parsing. If there's a lot of intermediate data, an SP keeps the data local, reducing network traffic. If your DB server has a higher spec than the client, it might run faster simply due to CPU horsepower.
Downsides can be things like bogging down the DB for everyone with processing that could be done on the client, generally when you're running an underpowered SQL Server. Another subtle side to this is that licensing costs for a multi-core DB server can be impressive: your cost per cycle on a SQL Server box can be many times what it is on your client.

SQL Server High Frequency Inserts

I have a system where data is inserted through an SP that is called via a WCF service.
The system currently has 12,000+ actively logged-in users, each calling the WCF service every 30 seconds (effectively at least 200 requests per second).
On the SQL Server side, CPU usage shoots to 100%, and when I examined it, more than 90% of the time was spent in DB writes. This affects overall server performance.
I need suggestions for resolving this issue so that we have fewer DB write operations and more CPU remains free.
I am open to integrating another DB server, Entity Framework, or any other ORM combination if needed. I need a solution to handle this issue.
Other information that might be helpful:
Table has no indexes defined
The database file growth increment is set to 200 MB.
SQL Server Version is 2012.
Simple solution: batch the writes. Do not call into SQL Server for every insert.
Make a service that collects the inserts and submits them more coarsely. The main problem is that transaction handling is somewhat costly; in cases like this it makes sense to batch them.
Do not call an SP for every row; load the rows into a temp table and then process them in bulk (or use a table variable to pass the SP multiple rows of information at once).
This gets rid of a lot of issues, including a ton of commits (you are basically asking for around 200 transactions per second, which is quite heavy and not needed here).
How you do that is up to you, but for something this heavy I would stay away from an ORM (Entity Framework hilariously batches nothing - it would issue tons of SP calls) and use hand-crafted SQL, at least for this part. I love ORMs, but it is always nice to have a high-performance hand-crafted approach when needed.
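A rough sketch of the table-variable idea using a table-valued parameter; the type and procedure names are invented, and a matching CREATE TYPE and CREATE PROCEDURE must already exist on the server:

using System.Data;
using System.Data.SqlClient;

// Collect many rows client-side, then hand the whole batch to one SP call.
var batch = new DataTable();
batch.Columns.Add("UserId", typeof(int));
batch.Columns.Add("Payload", typeof(string));
foreach (var item in pendingWrites) // hypothetical in-memory buffer
    batch.Rows.Add(item.UserId, item.Payload);

using (var cmd = new SqlCommand("dbo.InsertBatch", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    var p = cmd.Parameters.AddWithValue("@rows", batch);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.RowBatchType"; // the server-side table type
    cmd.ExecuteNonQuery();           // one commit instead of hundreds
}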

Performing just SELECT commands on SQLite

I have made a SQLite database (~700 MB, 3 tables, 3 indexes: 1 R-tree index and 2 primary keys). I have marked it as a read-only file (on Windows).
Is it safe and performant to execute only SELECT commands on this database from multiple threads?
If so, how can it be made more performant (any options or flags to enable, any small tunings)?
The application is in C# using System.Data.SQLite (1.0.82.0), compiled for .NET 4.0 on an x64 machine. It works fine (though not necessarily performant or correctly parallelized, because I cannot/do not know how to prove either). Currently I have no real bottleneck, but soon I will! I need to search the R-tree as fast as possible (on my machine: 4 GB, 2 cores). It sometimes takes more than 5 milliseconds to search the R-tree. I have made that part multithreaded to process my data in parallel. Given the structure of the R-tree (or R*-tree in SQLite's case), even if my database grows to some GB it should be no problem, because these trees have low depth and are fast on large datasets. But if any improvements are possible, they should be considered in this application.
I cannot be sure that the part I parallelized is really running in parallel; for example, SQLite (or System.Data.SQLite) may have an internal lock. In fact, in some tests the parallel version runs slower!
This should be safe, provided each thread has its own connection or you use locks to prevent multiple threads from using the same connection at the same time.
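A sketch of the one-connection-per-thread approach with System.Data.SQLite; the connection string, file name, and the queries list are illustrative:

using System.Data.SQLite;
using System.Threading.Tasks;

// Each parallel body opens its own connection, so no connection object is
// ever shared between threads. Read Only matches the read-only database file.
Parallel.For(0, queries.Count, i =>
{
    using (var conn = new SQLiteConnection("Data Source=spatial.db;Version=3;Read Only=True;"))
    {
        conn.Open();
        using (var cmd = new SQLiteCommand(queries[i], conn)) // hypothetical list of SELECTs
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                // consume the row
            }
        }
    }
});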
Is it safe and performant to execute just SELECT commands on this database from multiple threads?
Most likely
how can it be made more performant (if it is possible)?
What are your bottlenecks? Disk I/O? Processor? Memory?
Making an application more performant is best done by 1) identifying the pieces that are performing poorly (and can be improved) and 2) making those pieces more performant. There are a multitude of tools out there that will identify the slowest parts of your code so you know what to tackle first. It makes no sense to shave 10ms off of a query when the program takes the results of that query and spends 10 seconds writing it to disk.
There's not a "magic wand" that you can wave over an application (especially a database-driven application) and make it run faster. You need to know what to fix first.
You can set the threading support level: http://www.sqlite.org/threadsafe.html
SQLite supports three different threading modes:
Single-thread. In this mode, all mutexes are disabled and SQLite is unsafe to use in more than a single thread at once.
Multi-thread. In this mode, SQLite can be safely used by multiple threads provided that no single database connection is used simultaneously in two or more threads.
Serialized. In serialized mode, SQLite can be safely used by multiple threads with no restriction.
The threading mode can be selected at compile-time (when the SQLite library is being compiled from source code) or at start-time (when the application that intends to use SQLite is initializing) or at run-time (when a new SQLite database connection is being created). Generally speaking, run-time overrides start-time and start-time overrides compile-time. Except, single-thread mode cannot be overridden once selected.
The default mode is serialized.
The slowdown you are seeing is the serialization of requests. Change the threading mode and things will speed up. Keep in mind that "unsafe" probably means both readers and writers at the same time; I am not sure which mode is best for readers only.

Force simultaneous threads/tasks for C# load testing app?

Question:
Is there a way to force the Task Parallel Library to run multiple tasks simultaneously? Even if it means making the whole process run slower with all the added context switching on each core?
Background:
I'm fairly new to multithreading, so I could use some assistance. My initial research hasn't turned up much, but I also doubt I know what exactly to search for. Perhaps someone more experienced with multithreading can help me better understand TPL and/or find a better solution.
Our company is planning on deploying a piece of software to all users' machines that will connect to a central server a few times a day, and synchronize some files and MS Access data back to the user's machine. We would like to load-test this concept first and see how the Access DB holds up to lots of simultaneous connections.
I've been tasked with writing a .NET application that behaves like the client app (connecting & syncing with a network location), but does this on multiple threads simultaneously.
I've been getting familiar with the Task Parallel Library (TPL), as this seems like the best (newest) way to handle multithreading and to get return values back from each thread easily. However, as I understand it, the TPL decides how to run each "task" for the fastest execution possible, splitting the work among the available cores. So let's say I want to run 30 sync jobs on a 2-core machine... the TPL would run 15 on each core, sequentially. This would mean my load test would only be hitting the Access DB with at most 2 connections at the same time. I want to hit the database with lots of simultaneous connections.
You can force the TPL to do this by specifying TaskCreationOptions.LongRunning. According to Reflector (not according to the docs, though) this always creates a new thread. I consider relying on this safe for production use.
Normal tasks will not do, because they don't guarantee immediate execution. Setting ThreadPool.SetMinThreads is a horrible solution (for production) because you are changing a process-global setting to solve a local problem, and you are still not guaranteed success.
Of course, you can also start threads yourself. Tasks are more convenient, though, because of error handling. There is nothing wrong with using threads for this use case.
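A minimal load-test sketch along those lines; SyncJob is a placeholder for the client's connect-and-sync routine:

using System.Threading.Tasks;

const int clients = 30;
var tasks = new Task[clients];
for (int i = 0; i < clients; i++)
{
    // LongRunning hints the scheduler to give each task a dedicated thread,
    // so all 30 simulated clients really run at the same time.
    tasks[i] = Task.Factory.StartNew(
        () => SyncJob(), // placeholder for the connect-and-sync work
        TaskCreationOptions.LongRunning);
}
Task.WaitAll(tasks);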
Based on your comment, I think you should reconsider using Access in the first place. It doesn't scale well and has problems once the database grows to a certain size. Especially if this is simply served off some file share on your network.
You can try and simulate load from your single machine but I don't think that would be very representative of what you are trying to accomplish.
Have you considered using SQL Server Express? It's basically a de-tuned version of the full-blown SQL Server which might suit your needs better.
