How to know if the HttpWebClientProtocol class is thread safe? - C#

I read this question, but the answers and discussion left me confused.
So I decided to check it myself, but how? How can I create a test that proves whether the HttpWebClientProtocol class is thread safe or not?
I have already done the following test:
I create one HttpWebClientProtocol instance to call a web service (WS).
I wrote the WS myself; it does nothing but a Thread.Sleep(30000).
Then I create two independent threads that call this HttpWebClientProtocol at the same time.
The result: both threads called the WS with no problems. (The second thread didn't have to wait for the first call to finish.)
With this test, have I proved that the object IS thread safe and that the "correct" answer to the other question is wrong?

Well... I have a better test for you.
HttpWebClientProtocol Class
Directly from MSDN. Here's a copy/pasta of what they have to say about thread safety:
Thread Safety
The properties on this class are copied into a new instance of a WebRequest object for each XML Web service method call. While you can call XML Web service methods on the same WebClientProtocol instance from different threads at the same time, there is no synchronization done to ensure that a consistent snapshot of the properties gets transferred to the WebRequest object. Therefore, if you need to modify the properties and make concurrent method calls from different threads you should use a different instance of the XML Web service proxy or provide your own synchronization.
About thread safety
It's not just about "being available". It's about making sure that data/state modified by one thread does not affect the correct execution of another thread.
If threads share data structures without synchronization, they are not thread-safe. The issue might not be immediately apparent, but in a system that uses the class heavily from many threads you could hit bugs/exceptions/weird behaviors that you will not be able to reproduce in a development environment and that "only happen in production".
That, my friend, is NOT thread safe.
About HttpWebClientProtocol and why it's not thread-safe
While the documentation is clear that you can call methods on the same HttpWebClientProtocol instance from multiple threads, it is important to understand that its properties are copied into each new WebRequest without any synchronization.
Meaning that if you have 2 threads playing with the Credentials property, you might end up with some requests carrying the wrong credentials. This would be bad in a web application with impersonation, where a request could go out under a different identity and you could end up with someone else's data.
However, if you only need to set the properties once, up front, then yes: you can reuse the instance.
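A minimal sketch of both safe patterns, assuming a wsdl.exe-generated proxy called MyServiceProxy with a CallService method (those names are illustrative, not from the question):

using System.Net;

// Pattern 1: configure once, up front, then share the instance freely.
static readonly MyServiceProxy SharedProxy = new MyServiceProxy
{
    Credentials = CredentialCache.DefaultCredentials,
    Timeout = 30000
};
// any thread may now call SharedProxy.CallService() concurrently

// Pattern 2: properties differ per call (e.g. impersonation), so give each
// call its own proxy instead of synchronizing a shared one.
static string CallAs(ICredentials callerCredentials)
{
    MyServiceProxy proxy = new MyServiceProxy();
    proxy.Credentials = callerCredentials;
    return proxy.CallService();
}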

Related

Trying to multithread using a global UdpClient object, are collision issues possible?

I'm making a project in a P2P sharing system which will open a lot of sockets on the same ports. Right now I'm using a global UdpClient whose receive and SendAsync methods are called on different threads with different endpoints. There is no mutex (or other synchronization) at the moment, which is why I'm asking whether collisions are possible with this object if I'm not changing the information inside it.
So far I have only tried one example and it doesn't seem to collide, although I don't trust a single example enough for a full answer.
As far as I can see, UdpClient is not thread safe. Thread safe objects should specifically mention that in the documentation, and UdpClient does not seem to do that.
So without any type of synchronization your code is most likely not safe. Testing is not sufficient since multi threading bugs are notorious for being difficult to reproduce. When you write multi threaded code you need to ensure any shared data is synchronized appropriately.
Using it within a lock is probably safe, but that is not a guarantee; some objects, such as UI objects, are only safe to use from the thread that created them, and unfortunately that is not always well documented. A problem with locks is that they block the thread, so locks are best used for very short and fast sections of code, not for long-running operations like IO. And the compiler won't even let you await inside a lock statement.
Another pattern is to use one or more concurrent queues, i.e. threads put messages on the queue, and another thread reads from the queue and sends the messages. There are many possible designs, and the best design will really depend on the specific application. However, designing concurrent systems is difficult, and I would recommend trying to create modules that are fairly independent, so you can understand and test a single module, without having to understand the entire program.
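A minimal sketch of that queue pattern, where one dedicated sender loop owns the UdpClient so nothing else ever touches the socket (the port numbers and payload are just placeholders):

using System.Collections.Concurrent;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Producer threads only enqueue; the sender loop below is the single owner of the
// UdpClient, so no lock is needed around the socket itself.
var outgoing = new BlockingCollection<(byte[] Payload, IPEndPoint Target)>();
var udp = new UdpClient(9000);

var senderLoop = Task.Run(async () =>
{
    foreach (var (payload, target) in outgoing.GetConsumingEnumerable())
    {
        await udp.SendAsync(payload, payload.Length, target);
    }
});

// any thread can safely do this:
outgoing.Add((new byte[] { 1, 2, 3 }, new IPEndPoint(IPAddress.Loopback, 9001)));

// on shutdown, let the loop drain and exit:
outgoing.CompleteAdding();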
Plain memory is safe to read concurrently, but the same does not extend to objects, since many objects mutate internal state even when you only "read" from them. Some types, like List<T>, specifically document that concurrent reads are safe. So make sure you check the documentation before using any object concurrently.

Do we need thread safety when Cache is accessed from multiple processes (Redis)

I know what thread safety is, and in some scenarios it makes perfect sense. For instance, I understand that a logger needs to be thread safe; otherwise multiple threads might try to open and write the same file at the same time.
But I cannot visualize why thread safety is important while accessing a cache. How can get/set calls from multiple threads corrupt the cache?
And most importantly, if thread safety is required while accessing a cache, how do we achieve it when the cache is accessed from multiple processes? It would be nice if someone could answer in the context of Redis.
Thanks in advance.
Redis is single-threaded, so all commands in Redis are atomic. However, depending on the implementation of the client library, sharing a connection may still be problematic. Reads and writes could get out of sequence, so that one thread receives the reply another thread was supposed to get, causing problems on the client side. This could cause corruption through missed writes or invalid responses triggering rewrites.
Thus the concern is not so much corrupting the data in Redis as leaking data on the client side. Think of a shopping cart with someone else's items being charged to you as an example. For this reason, among others, your client access needs to be thread safe.
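The usual fix is to make sure the write/read pair for each command is serialized on the shared connection. A minimal sketch of the idea; IRedisConnection here is a made-up interface for illustration, not a real library API:

public interface IRedisConnection
{
    void WriteCommand(string command);   // hypothetical: send the command bytes
    string ReadReply();                  // hypothetical: read the next reply
}

public sealed class SynchronizedRedisClient
{
    private readonly IRedisConnection _connection;
    private readonly object _gate = new object();

    public SynchronizedRedisClient(IRedisConnection connection)
    {
        _connection = connection;
    }

    public string Execute(string command)
    {
        // The lock guarantees a thread reads the reply to ITS OWN command,
        // never a reply meant for another thread sharing the connection.
        lock (_gate)
        {
            _connection.WriteCommand(command);
            return _connection.ReadReply();
        }
    }
}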
I have not found any direct documentation about this, but it seems that locking (or some other form of synchronization) is applied on the server end, which makes sure the data itself is not corrupted by multiple threads/processes.
The reason it is important to make client libraries thread safe is that they read and write on a TCP connection (via a network stream, I guess). If the same client instance is used by multiple threads, it should either work correctly (if the client is thread safe) or the documentation should state that the client must not be shared among multiple threads.
I am not marking this as the correct answer. If people upvote this and agree with it, then I will.

Retrieve data from threads

I am trying to run two separate threads, A and B. A and B run on totally different data, and A only needs a small part of the data from B. They both need to be running all the time. How can I retrieve the data from thread B without interrupting B's work?
I am new to multithreading; could you explain with examples?
That's not how threads work; threads don't “own” data (most of the time). You can access data that was used or created on another thread just like any other data, but it can be very dangerous to do so.
The problem is that most data structures are not ready to be accessed from more than one thread at the same time (they are not thread-safe). There are several ways to fix that:
1. Use lock (or some other synchronization construct) to access the shared resource. Doing this makes sure that only one thread accesses the resource at a time, so it's safe. This is the most general approach (it works every time), it's probably the most common solution and the one that is easiest to get right (just lock on the right lock object every time you access the resource). But it can hurt performance, because it can make threads wait on each other a lot.
2. Don't share data between threads. If you have several operations that you want to run in parallel, some requiring resource A and others requiring resource B, run those that require A on one thread and those that require B on another thread. This way, you can be sure that only one thread accesses A or B, so it's safe. Another variant of this is to give each thread its own copy of the resource.
3. Use special thread-safe data structures. For example, in .NET 4 there is a whole namespace of thread-safe collections: System.Collections.Concurrent.
4. Use immutable data structures. If the structure doesn't change, it's safe to access it from several threads at the same time. For example, this is why it's safe to share a string between several threads.
5. Use special constructs that avoid locking, like Interlocked operations or volatile operations. This is how most of the structures from #3 are implemented internally, and it can be much more performant than #1. But it's also very hard to do right, which is why you should avoid it unless you really know what you're doing.
You have several options, and it can all be confusing. But the best option usually is to just use a lock around the shared resource, or to use a thread-safe structure from a library; doing that is not hard. If you find that's not enough, you can go for the more advanced alternatives, but they will be harder to get right.
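A minimal sketch of options #1/#3 for this question: thread B publishes the small piece of data it produces into a thread-safe queue, and thread A picks it up whenever it likes, without ever pausing B (all names, and the work simulated with Sleep, are just placeholders):

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var results = new ConcurrentQueue<int>();

        // Thread B: does its own work and occasionally publishes a value.
        Task b = Task.Run(() =>
        {
            for (int i = 0; i < 5; i++)
            {
                Thread.Sleep(100);        // B's own work
                results.Enqueue(i * i);   // publish the part A cares about
            }
        });

        // Thread A: does its own work and picks up B's values when available.
        Task a = Task.Run(() =>
        {
            while (!b.IsCompleted || !results.IsEmpty)
            {
                int value;
                if (results.TryDequeue(out value))
                    Console.WriteLine("A received {0} from B", value);
                Thread.Sleep(50);         // A's own work
            }
        });

        Task.WaitAll(a, b);
    }
}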

How many instances of an Application object can run per application

I was reading the following post How to correctly use IHttpModule
"Now let's think of the word itself. Application pool. Yes, pool. It means that a certain web application is running multiple HttpApplication instances in one pool. Yes, multiple. Otherwise it wouldn't be called a pool. »How many?« you may ask. That doesn't really matter as long as you know there could be more than one. We trust IIS to do its job. And it obviously does it so well that it made this fact completely transparent for us developers, hence not many completely understand its inner workings. We rely on its robustness to provide the service. And it does. Each of these HttpApplication instances in the pool keeps its own list of HTTP modules that it uses with each request it processes."
My question is: under what scenario can multiple instances of an Application object run for a single application? Until now I thought that a single Application object exists per application. So I am curious to know whether it is true that multiple instances can run per application, and how that is decided.
Each HttpApplication object instance services a single request at a time. If your site is processing multiple requests in parallel, each one must have its own instance of HttpApplication. That object holds per-request state that must not change during the request's lifetime (including the body of the request and the response!).
The instances are pooled, as described in the article. Each one will be reused to service multiple subsequent requests, up to the limit set on the application pool, then it'll be allowed to die off.
Note that you're specifically asking about HttpApplication. This is distinct from the System.Windows.Forms.Application class, which is in fact a singleton class that only exists once per application.
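To make the per-instance behavior concrete, here is a small illustrative module (the class name is made up; register it in web.config as usual). Each pooled HttpApplication gets its own copy of the module, so the instance counter and the static counter tell different stories:

using System.Threading;
using System.Web;

public class InstanceCountingModule : IHttpModule
{
    private static int _moduleInstances;     // shared across the whole pool
    private int _requestsOnThisInstance;     // private to one HttpApplication's module

    public void Init(HttpApplication context)
    {
        // Init runs once per HttpApplication instance, so this tracks pool growth.
        Interlocked.Increment(ref _moduleInstances);

        context.BeginRequest += (sender, e) =>
        {
            // No lock needed: one HttpApplication services one request at a time.
            _requestsOnThisInstance++;
        };
    }

    public void Dispose() { }
}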

only one of multiple threads to execute a particular code path

I have multiple threads starting at roughly the same time, all executing the same code path. Each thread needs to write records to a table in a database. If the table doesn't exist it should be created. Obviously two or more threads could see the table as missing and try to create it.
What is the preferred approach to ensure that this particular block of code is executed only once, by only one thread?
While I'm writing in C# on .NET 2.0, I assume that the approach would be framework/language neutral.
Something like this should work...
private static readonly object lockObject = new object();

private void CreateTableIfNotPresent()
{
    // a static lock object so every thread (and every instance of this class)
    // serializes on the same lock
    lock (lockObject)
    {
        // check for table presence and create it if necessary,
        // all inside this block
    }
}
Have your threads call the CreateTableIfNotPresent function. The lock block ensures that no two threads can execute the code inside the block concurrently, so no thread will see the table as absent while another is creating it.
This is a classic application for either a Mutex or a Semaphore.
A mutex ensures that a specific piece of code (or several pieces of code) can only be run by a single thread at a time. You could be clever and use a different mutex for each table, or simply constrain the whole initialisation block to one thread at a time.
A semaphore (or set of semaphores) could perform exactly the same function.
Most lock implementations will use a mutex internally, so look at what lock code is already available in the language or libraries you are using.
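For example, in C# a named Mutex could fence off the initialisation block; a minimal sketch (the mutex name and method are illustrative, and the "Global\" prefix even makes it work across processes):

using System.Threading;

private static void EnsureTableExists()
{
    using (Mutex mutex = new Mutex(false, @"Global\MyApp.CreateRecordsTable"))
    {
        mutex.WaitOne();
        try
        {
            // check for table presence and create it if necessary
        }
        finally
        {
            mutex.ReleaseMutex();
        }
    }
}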
#ebpower has it right that in certain applications it would actually be more efficient to catch the exception caused by an attempt to create the same table twice, though this may not be the case in your example.
However there are many other ways of proceeding. For example, you could use a single-threaded ExecutorService (sorry, I could only find a Java reference) that has responsibility for creating any tables that your worker threads discover are missing. If it gets two requests for the same table, it simply ignores the later ones.
A variant on a Memoizer (remembering table references, creating them first if necessary) would also work under the circumstances. The book Java Concurrency In Practice walks through the implementation of a nice Memoizer class, but this would be pretty simple to port to any other language with effective concurrency building blocks.
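A C# equivalent of that Memoizer idea, using ConcurrentDictionary and Lazy<T> (this needs .NET 4, so it would not fit the .NET 2.0 constraint in the question; the helper names are illustrative):

using System;
using System.Collections.Concurrent;

static class TableCreator
{
    private static readonly ConcurrentDictionary<string, Lazy<bool>> _created =
        new ConcurrentDictionary<string, Lazy<bool>>();

    public static void EnsureCreated(string tableName)
    {
        Lazy<bool> lazy = _created.GetOrAdd(
            tableName,
            name => new Lazy<bool>(() =>
            {
                CreateTable(name);   // hypothetical helper that issues CREATE TABLE
                return true;
            }));

        // Reading Value runs the factory exactly once, even under contention;
        // later callers for the same table just get the cached result.
        bool created = lazy.Value;
    }

    private static void CreateTable(string name)
    {
        // issue the CREATE TABLE statement here
    }
}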
This is what Semaphores are for.
You may not even need to bother with locks, since your database shouldn't let you create two tables with the same name. Why not just catch the appropriate exception? If two threads try to create the same table, one wins and continues on, while the other recovers and continues on.
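A minimal sketch of that approach, assuming SQL Server, where error number 2714 means "There is already an object named ... in the database" (the table schema is just a placeholder):

using System.Data.SqlClient;

private void CreateTableOrIgnoreIfExists(SqlConnection connection)
{
    try
    {
        using (SqlCommand cmd = new SqlCommand(
            "CREATE TABLE Records (Id INT PRIMARY KEY, Payload NVARCHAR(MAX))",
            connection))
        {
            cmd.ExecuteNonQuery();
        }
    }
    catch (SqlException ex)
    {
        if (ex.Number != 2714)
            throw;
        // another thread created the table first; that's fine, carry on
    }
}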
I'd use a thread sync object such as ManualResetEvent, though it sounds to me like you're inviting a race condition, which may mean you have a design problem.
Some posts have suggested Mutexes - this is overkill unless your threads are running in different processes.
Others have suggested using locks - this is fine, but locking can lead to over-pessimistic locking of data, which can negate the benefit of using threads in the first place.
A more fundamental question is why are you doing it this way at all? What benefit does threading bring to the problem domain? Does concurrency solve your problem?
You may want to try a static constructor to get a reference to the table.
According to MSDN (.NET 2.0), a static constructor is used to initialize any static data, or to perform a particular action that needs to be performed only once.
Also, the CLR automatically guarantees that a static constructor executes only once per AppDomain and is thread-safe.
For more info, check Chapter 8 of CLR via C# by Jeffrey Richter.
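A minimal sketch of that idea (the class and helper names are made up): the CLR runs the static constructor exactly once per AppDomain, before the first use of the class, and blocks other threads until it finishes.

public static class RecordTable
{
    static RecordTable()
    {
        // runs once, even if many threads hit the class at the same time
        CreateTableIfMissing();
    }

    public static void Write(string record)
    {
        // by the time any thread gets here, the table is guaranteed to exist
    }

    private static void CreateTableIfMissing()
    {
        // check for table presence and create it if necessary
    }
}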
