One SQLiteConnection per thread? - c#

I am using SQLite from system.data.sqlite.org.
We need to access the database from many threads (for various reasons). I've read a lot about SQLite's thread-safety capabilities (the default synchronized access mode is fine for me).
I wonder if it is possible to simply open a connection per thread. Is something like this possible? I really don't care about race conditions (request something that hasn't been inserted yet). I am only interested in the fact that it is possible to access the data using one SQLiteConnection object per thread.

Yes. In fact, it's the proper way, since a single SQLiteConnection object should not be shared between threads. (The SQLite library itself is built thread safe by default; the threading mode is selected with the SQLITE_THREADSAFE compile-time option.) And just to reassure you that it works: SQLite is being used in some small websites, so multithreading is there :)
More information here: http://www.sqlite.org/faq.html#q6

Provided you use a separate connection per thread, you should be fine.
From the docs:
Note that SQLiteConnection instance is not guaranteed to be thread
safe. You should avoid using the same SQLiteConnection in several
threads at the same time. It is recommended to open a new connection
per thread and to close it when the work is done.

Related

Reading in 'parallel' from postgre using npgsql

Sorry if this question seems like a duplicate, but I was not able to figure out a solution from other answers.
I have a PostgreSQL DB that I am accessing using Npgsql.
I have multiple clients reading from the DB simultaneously.
I am getting an exception: Operation is in progress.
I know the reason behind it.
I am using:
public bool ReadRecord(arguments)
{
    ....
    NpgsqlDataReader reader = cmd.ExecuteReader();
    while (reader.Read())
    {
        ...
    }
    ...
}
Calling routine:
private void GetBundleIdAndIsWat(arguments)
{
    ...
    TaskQueue.GlobalInstance.AddTaskQueue(new Task<bool>(ReadRecord, InputData), tokenSource);
    ...
}
While the reader for one thread has not been disposed of, the other threads are not able to execute a command.
How do all multiple threads read from the DB simultaneously?
Does ExecuteReaderAsync allow only one thread to execute the command at a time?
In this case, I won't be able to read at the same time, right?
I read about Connection Pool but don't really know how to implement it.
How do all multiple threads read from the DB simultaneously?
It looks like you're sharing single database connection between parallel operations.
Just use separate connection per operation.
Note that:
The number of parallel connections is limited both by the connection pool size and by PostgreSQL. The connection pool size is limited to 100 connections by default, and you can change it by setting the MaxPoolSize parameter in the connection string. PostgreSQL limits connections through the max_connections parameter in postgresql.conf, which also defaults to 100. A hundred parallel connections is a rather large number, so don't increase it until you really need to.
You mentioned threads, but this is a good candidate for parallel tasks and asynchronous code instead of threads. That would also let you use connections more efficiently.
I read about Connection Pool but don't really know how to implement it
You don't need to implement connection pool yourself.
Any well-written ADO.NET provider already implements it.
Just use connections, commands, and readers as usual: create, use, dispose.
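As a rough illustration of "create, use, dispose" with one connection per operation, here is a minimal sketch written against the provider-neutral System.Data.Common base classes. The ParallelReads class, the CountRowsAsync method and the connectionFactory delegate are hypothetical names chosen for this example; with Npgsql the factory would presumably be something like () => new NpgsqlConnection(connectionString).

```csharp
using System;
using System.Data.Common;
using System.Threading.Tasks;

public static class ParallelReads
{
    // connectionFactory is a hypothetical delegate that returns a NEW, unopened
    // connection on every call, so no connection is shared between operations.
    public static async Task<int> CountRowsAsync(Func<DbConnection> connectionFactory, string sql)
    {
        using (DbConnection conn = connectionFactory())
        {
            // With pooling enabled, this normally reuses a pooled physical connection.
            await conn.OpenAsync();
            using (DbCommand cmd = conn.CreateCommand())
            {
                cmd.CommandText = sql;
                using (DbDataReader reader = await cmd.ExecuteReaderAsync())
                {
                    int rows = 0;
                    while (await reader.ReadAsync())
                    {
                        rows++;
                    }
                    return rows;
                }
            }
        } // Disposing the connection hands it back to the pool.
    }
}
```

Several such calls can then be started together and awaited with Task.WhenAll; each one works on its own connection, so one open reader does not block the others.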

Do we need thread safety when Cache is accessed from multiple processes (Redis)

I know what thread safety is, and in some scenarios it makes perfect sense. For instance, I understand that a logger needs to be thread safe; otherwise several threads might try to open and access the same file at once.
But I cannot see why thread safety is important when accessing a cache. How can get/set calls from multiple threads corrupt a cache?
Most importantly, if thread safety is required when accessing a cache, how can we achieve it when the cache is accessed from multiple processes? It would be nice if someone could answer in the context of Redis.
Thanks In Advance
Redis executes commands on a single thread, so each command is atomic. However, depending on how the client library is implemented, sharing a connection may still be problematic. Reads and writes could get out of sequence, so that one thread receives the reply another thread was supposed to get, causing problems on the client side. This could cause corruption through missed writes, or through invalid responses that trigger rewrites.
Thus the concern is not so much corrupting the data in Redis as leaking data on the client side. Think of a shopping cart with someone else's items being charged to you as an example. For this reason, among others, your client access needs to be thread safe.
I have not found any direct text about this, but it seems that locking (or some other form of synchronization) is applied on the server side, which makes sure the data is not corrupted by multiple threads or processes.
As for why it is important to make client libraries thread safe: they read and write on a TCP connection (via a network stream, I guess). If the same client is used by multiple threads, it should work correctly when the client is thread safe; otherwise it should be documented that the client must not be shared among multiple threads.
I am not marking this as a correct answer. If people up vote this and agree on that, then I will do that.

Is it OK to use Control.Invoke instead of using a lock?

I'll have a database object that can be accessed from multiple threads as well as from the main thread. I don't want them to access the underlying database object concurrently, so I'll write a set of thread safe public methods that can be accessed from multiple threads.
My first idea was to use a lock around my connection, such as lock(oleDbConnection), but the problem is that I would have to take the lock on the main thread too, since it is one more thread that can access the connection. That would mean rewriting lots of code.
But since these threads and the main thread won't access the database very often, how about just using the Invoke method of one of my controls (maybe the main form) every time I call any of the database methods from another thread? This way, as far as I understand, these methods would never be called concurrently, and I wouldn't need to worry about the main thread. I guess the only problem would be a slight loss of performance, but as I said, the database is not accessed that often; the reason I use threads is not so that they can access the database concurrently but so that they can perform other operations concurrently.
So does this sound like a good idea? Am I missing something? Sounds a bit too easy so I'm suspicious.
It sounds like it would work AFAIK, but it also sounds like a really bad idea.
The problem is that when writing lock you are saying "I want this code to be a critical section", whereas when writing Invoke you are saying "I want this to be executed on the UI thread". These two things are certainly not equivalent, which can lead to lots of problems. For example:
Invoke is normally used to access UI controls. What if a developer sees Invoke and nothing UI-related, and goes "gee, that's an unneeded Invoke; let's get rid of it"?
What if more than one UI thread ends up existing?
What if the database operation takes a long time (or times out)? Your UI would stop responding.
I would definitely go for the lock. You typically want the UI thread responsive when performing operations that may take time, which includes any sort of DB access; you don't know whether it's alive or not for instance.
Also, the typical way to handle connections is to create, use and dispose the connection for each request, rather than reusing the same connection. This might perhaps solve some of your concurrency problems.
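To make the lock-based option concrete, here is a minimal sketch of a wrapper that serializes every call, whichever thread it comes from. SerializedGate and Execute are hypothetical names, and the Func<T> argument stands in for whatever database call is being protected.

```csharp
using System;

// Hypothetical wrapper: every caller, UI thread or worker thread, goes through
// Execute, so the wrapped operations never run concurrently.
public sealed class SerializedGate
{
    private readonly object _sync = new object();

    public T Execute<T>(Func<T> operation)
    {
        if (operation == null) throw new ArgumentNullException("operation");
        lock (_sync)
        {
            // Critical section: only one thread at a time gets here.
            return operation();
        }
    }
}
```

Unlike Invoke, this states the intent (a critical section) directly and does not move the work onto the UI thread. A UI thread that calls Execute can still wait while another thread holds the lock, so long-running operations are better started from a worker thread.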
Why not use a connection pool? Every thread can do its work with a different DB connection and send the result to the main thread with Invoke. Connection pooling is a very common approach on servers.
See Using Connection Pooling with SQL Server

ReaderWriterLock vs lock{}

Please explain what the main differences are and when I should use which.
The focus is on multi-threaded web applications.
lock allows only one thread to execute the code at the same time. ReaderWriterLock may allow multiple threads to read at the same time or have exclusive access for writing, so it might be more efficient. If you are using .NET 3.5 ReaderWriterLockSlim is even faster. So if your shared resource is being read more often than being written, use ReaderWriterLockSlim. A good example for using it is a file that you read very often (on each request) and you update the contents of the file rarely. So when you read from the file you enter a read lock so that many requests can open it for reading and when you decide to write you enter a write lock. Using a lock on the file will basically mean that you can serve one request at a time.
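A minimal sketch of the read-mostly case described above, using an in-memory dictionary in place of the file. SettingsStore and its members are hypothetical names; the ReaderWriterLockSlim calls are the standard ones from System.Threading.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical read-mostly store guarded by ReaderWriterLockSlim.
public sealed class SettingsStore
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<string, string> _values = new Dictionary<string, string>();

    // Many threads may hold the read lock at the same time.
    public bool TryGet(string key, out string value)
    {
        _lock.EnterReadLock();
        try
        {
            return _values.TryGetValue(key, out value);
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }

    // The write lock is exclusive: it waits for current readers to leave
    // and keeps new readers out until it is released.
    public void Set(string key, string value)
    {
        _lock.EnterWriteLock();
        try
        {
            _values[key] = value;
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }
}
```

With a plain lock statement both TryGet and Set would take the same exclusive lock, so concurrent readers would queue behind each other.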
Consider using ReaderWriterLock if you have lots of threads that only need to read the data, these threads are getting blocked waiting for the lock, and you don't often need to change the data.
However ReaderWriterLock may block a thread that is waiting to write for a long time.
Therefore only use ReaderWriterLock after you have confirmed you get high contention for the lock in “real life” and you have confirmed you can’t redesign your locking design to reduce how long the lock is held for.
Also consider if you can't rather store the shared data in a database and let it take care of all the locking, as this is a lot less likely to give you a hard time tracking down bugs, iff a database is fast enough for your application.
In some cases you may also be able to use the ASP.NET cache to handle shared data, and just remove the item from the cache when the data changes. The next read can put a fresh copy in the cache.
Remember
"The best kind of locking is the
locking you don't need (i.e. don't
share data between threads)."
Monitor and the underlying "syncblock" that can be associated with any reference object—the underlying mechanism under C#'s lock—support exclusive execution. Only one thread can ever have the lock. This is simple and efficient.
ReaderWriterLock (or, in .NET 3.5, the better ReaderWriterLockSlim) provides a more complex model. Avoid it unless you know it will be more efficient (i.e. you have performance measurements to back that up).
The best kind of locking is the locking you don't need (i.e. don't share data between threads).
ReaderWriterLock allows you to have multiple threads hold the ReadLock at the same time... so that your shared data can be consumed by many threads at once. As soon as a WriteLock is requested no more ReadLocks are granted and the code waiting for the WriteLock is blocked until all the threads with ReadLocks have released them.
The WriteLock can only ever be held by one thread, allowing your 'data updates' to appear atomic from the point of view of the consuming parts of your code.
The Lock on the other hand only allows one thread to enter at a time, with no allowance for threads that are simply trying to consume the shared data.
ReaderWriterLockSlim is a new more performant version of ReaderWriterLock with better support for recursion and the ability to have a thread move from a Lock that is essentially a ReadLock to the WriteLock smoothly (UpgradeableReadLock).
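The upgradeable mode mentioned above can be sketched as a "look up, and add only if missing" operation. LazyMap and GetOrAddLength are hypothetical names; storing the key's length is just a stand-in for some computed value.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical cache that reads under an upgradeable lock and only
// upgrades to the write lock when it actually has to add an entry.
public sealed class LazyMap
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<string, int> _lengths = new Dictionary<string, int>();

    public int GetOrAddLength(string key)
    {
        _lock.EnterUpgradeableReadLock();
        try
        {
            int length;
            if (_lengths.TryGetValue(key, out length))
            {
                return length; // Found: no write lock needed.
            }

            // Not found: upgrade to the exclusive write lock.
            _lock.EnterWriteLock();
            try
            {
                length = key.Length;
                _lengths[key] = length;
                return length;
            }
            finally
            {
                _lock.ExitWriteLock();
            }
        }
        finally
        {
            _lock.ExitUpgradeableReadLock();
        }
    }
}
```

Only one thread at a time can be in upgradeable mode, although other threads may still hold ordinary read locks alongside it, so paths that never write are better served by EnterReadLock.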
ReaderWriterLock/Slim is specifically designed to help you efficiently lock in a multiple consumer/ single producer scenario. Doing so with the lock statement is possible, but not efficient. RWL/S gets the upper hand by being able to aggressively spinlock to acquire the lock. That also helps you avoid lock convoys, a problem with the lock statement where a thread relinquishes its thread quantum when it cannot acquire the lock, making it fall behind because it won't be rescheduled for a while.
It is true that ReaderWriterLockSlim is FASTER than ReaderWriterLock. But the memory consumption by ReaderWriterLockSlim is outright outrageous. Try attaching a memory profiler and see for yourself. I would pick ReaderWriterLock anyday over ReaderWriterLockSlim.
I would suggest looking through http://www.albahari.com/threading/part4.aspx#_Reader_Writer_Locks. It talks about ReaderWriterLockSlim (which you want to use instead of ReaderWriterLock).

only one of multiple threads to execute a particular code path

I have multiple threads starting at roughly the same time, all executing the same code path. Each thread needs to write records to a table in a database. If the table doesn't exist it should be created. Obviously two or more threads could see the table as missing, and try to create it.
What is the preferred approach to ensure that this particular block of code is executed only once, by only one thread?
While I'm writing in C# on .NET 2.0, I assume that the approach would be framework/language neutral.
Something like this should work...
// static, so that all instances (and therefore all threads) share one lock
private static readonly object lockObject = new object();

private void CreateTableIfNotPresent()
{
    lock (lockObject)
    {
        // check for table presence and create it if necessary,
        // all inside this block
    }
}
Have your threads call the CreateTableIfNotPresent function. The lock block ensures that only one thread at a time can execute the code inside it, so no thread can see the table as missing while another is creating it.
This is a classical application for either a Mutex or a Semaphore
A mutex ensures that a specific piece of code (or several pieces of code) can only be run by a single thread at a time. You could be clever and use a different mutex for each table, or simply constrain the whole initialisation block to one thread at a time.
A semaphore (or set of semaphores) could perform exactly the same function.
Most lock implementations will use a mutex internally, so look at what lock code is already available in the language or libraries you are using.
@ebpower has it right that in certain applications it would actually be more efficient to catch the exception caused by an attempt to create the same table multiple times, though this may not be the case in your example.
However there are many other ways of proceeding. For example, you could use a single-threaded ExecutorService (sorry, I could only find a Java reference) that has responsibility for creating any tables that your worker threads discover are missing. If it gets two requests for the same table, it simply ignores the later ones.
A variant on a Memoizer (remembering table references, creating them first if necessary) would also work under the circumstances. The book Java Concurrency In Practice walks through the implementation of a nice Memoizer class, but this would be pretty simple to port to any other language with effective concurrency building blocks.
This is what Semaphores are for.
You may not even need to bother with locks since your database shouldn't let you create multiple tables with the same name. Why not just catch the appropriate exceptions and if two threads try to create the same table, one wins and continues on, while the other recovers and continues on.
I'd use a thread sync object such as ManualResetEvent, though it sounds to me like you're willing a race condition, which may mean you have a design problem.
Some posts have suggested mutexes. That is overkill unless your threads are running in different processes.
Others have suggested using locks - this is fine but locking can lead to over-pessimistic locks on data which can negate the benefit of using threads in the first place.
A more fundamental question is why are you doing it this way at all? What benefit does threading bring to the problem domain? Does concurrency solve your problem?
You may want to try static constructors to get a reference of the table.
According to MSDN (.NET 2.0), a static constructor is used to initialize any static data, or to perform a particular action that needs to be performed once only.
Also, CLR automatically guarantees that a static constructor executes only once per AppDomain and is thread-safe.
For more info, check Chapter 8 of CLR via C# by Jeffrey Richter.
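A minimal sketch of the static-constructor approach follows. TableBootstrap, InitCount and EnsureInitialized are hypothetical names, and the counter exists only to make the once-only behaviour observable; the actual table check and creation would go where the comment indicates.

```csharp
using System;
using System.Threading;

public static class TableBootstrap
{
    // Counts how often initialization ran; present only for illustration.
    public static int InitCount;

    // The CLR runs this at most once per AppDomain, before the first
    // access to any static member of the type.
    static TableBootstrap()
    {
        Interlocked.Increment(ref InitCount);
        // Hypothetical: check for the table and create it here.
    }

    // Calling this (or touching any other member) guarantees that the
    // static constructor has already completed.
    public static void EnsureInitialized()
    {
    }
}
```

Each worker thread would call TableBootstrap.EnsureInitialized() before writing. One caveat to keep in mind: if the static constructor throws, the type stays unusable for the rest of the AppDomain's lifetime, so this fits best when a failed initialization is fatal anyway.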
