Are fields marked with ThreadStaticAttribute automatically released when the thread dies? - c#

I discovered ThreadStaticAttribute, and I have a lot of questions about it:
All my previous thread-dependent static state was implemented as a static dictionary keyed by Thread; when I wanted to access it, I used Thread.CurrentThread, and that worked. But this requires maintenance: if a thread dies, I have to delete the corresponding entry from the dictionary, and I also need to consider thread safety and a lot of other matters.
By using ThreadStaticAttribute, all these matters seem to be solved, but I need to be sure of it. My questions are: do I need to somehow delete the instance held by a ThreadStatic-marked field before the thread dies? Where is the information in that field held? Is it in the Thread object instance, or somewhere similar, so that when it is no longer used the garbage collector automatically discards it? Are there performance penalties, and if so, what are they? Is it faster than using a keyed collection as I was doing?
I need clarification on how ThreadStaticAttribute works.

No, you do not need to delete instances of values held in a field tagged with [ThreadStatic]. The garbage collector will automatically collect them once neither the thread nor the object is reachable from a rooted object.
The only exception here is if the value implements IDisposable and you want to actively dispose of it. In general this is a hard problem to solve for a number of reasons. It's much simpler to not have values which implement IDisposable and are in a ThreadStatic field.
As to where this field is actually stored: it's somewhat irrelevant. All you need to be concerned about is that it behaves like any other object in .NET. The only two behavioral differences are:
The field will reference a different value per accessing thread.
The initializer for the field will only run once, for the first thread that accesses it (in practice, it's a bad idea to have one).
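Both behaviors can be demonstrated with a small sketch (class and method names are hypothetical):

```csharp
using System.Threading;

static class PerThreadCounter
{
    // Each thread sees its own copy, initialized to default(int), i.e. 0.
    // Do not use an inline initializer: it would run for only one thread.
    [ThreadStatic]
    private static int _count;

    public static int Increment() => ++_count;

    // Returns the value a brand-new thread observes after one increment.
    public static int IncrementOnFreshThread()
    {
        int result = 0;
        var t = new Thread(() => result = Increment());
        t.Start();
        t.Join();
        return result;
    }
}
```

No matter how many times the main thread has incremented, a fresh thread starts from the default value and sees 1 after its first increment.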

Marking a static member variable as [ThreadStatic] tells the runtime to allocate it in thread-local storage rather than in the normal static memory area. Thus, each thread will have its own copy (guaranteed to be initialized to the default value for that type, e.g. null, 0, false, etc.; do not use in-line initializers, as they will only run for one thread).
So, when the thread goes away, so does its thread-local storage, releasing the reference. Of course, if the value needs more immediate disposal (open file streams, etc.) than waiting for background garbage collection, you might want to make sure you do that before the thread exits.
There could be a limit to the amount of [ThreadStatic] space available, but it should be sufficient for sane uses. It should be somewhat faster than accessing a keyed collection (and is more easily thread-safe), and I think it's comparable to accessing a normal static variable.
Correction: I have since heard that accessing ThreadStatic variables is somewhat slower than accessing normal static variables. I'm not sure whether it is actually faster than accessing a keyed collection, but it does avoid the problem of orphaned entries (which was your question) and the locking a keyed-collection approach would need for thread safety.
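Worth noting: since .NET 4, ThreadLocal&lt;T&gt; provides the same per-thread storage but with a factory that runs once per thread, which sidesteps the in-line-initializer caveat above. A minimal sketch (names are hypothetical):

```csharp
using System.Collections.Generic;
using System.Threading;

static class ThreadLocalDemo
{
    // Unlike a [ThreadStatic] inline initializer, this factory runs
    // once per thread, so every thread gets its own non-null list.
    private static readonly ThreadLocal<List<string>> _log =
        new ThreadLocal<List<string>>(() => new List<string>());

    public static int LogAndCount(string message)
    {
        _log.Value.Add(message);
        return _log.Value.Count;
    }
}
```

ThreadLocal&lt;T&gt; also implements IDisposable, which helps if the per-thread values themselves need deterministic cleanup.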

Related

Is keeping track of c# GC objects (to avoid memory leaks) with a weak reference dictionary a good idea?

I have implemented a static class with a dictionary that tracks some objects in my application using weak references.
static Dictionary<Type,List<WeakReference>> Monitor;
On request (not in production), a garbage collection is forced and the dictionary is returned. By checking the number of "alive" objects for each type, I can quickly check whether I have a memory leak, because the number of alive objects would be higher than expected.
Given the simplicity and usefulness of this class of mine, I was wondering whether there is something better in the .NET Framework to keep track of certain objects; let's call them "observable".
Each contribution is highly appreciated.
Assuming that you do not want to use a profiler, your approach can work. But be aware that your solution itself presents a memory leak: every object you add leaves behind a WeakReference entry, even after the target object has been collected.
You need to periodically remove the weak references that are no longer valid; one approach is to prune on add, as shown here: https://www.chriswirz.com/software/weak-reference-lists-in-c-sharp.
Also remember that you are not using a ConcurrentDictionary, so you need to synchronize access to that structure.
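Both points (pruning dead references on add, and synchronizing access to the plain Dictionary) can be sketched as follows; the class and method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;

static class LeakMonitor
{
    private static readonly object _sync = new object();
    private static readonly Dictionary<Type, List<WeakReference>> _monitor =
        new Dictionary<Type, List<WeakReference>>();

    public static void Track(object obj)
    {
        lock (_sync)  // a plain Dictionary is not thread-safe
        {
            var type = obj.GetType();
            if (!_monitor.TryGetValue(type, out var list))
                _monitor[type] = list = new List<WeakReference>();

            // Prune entries whose targets have already been collected,
            // so the tracker itself does not leak WeakReference objects.
            list.RemoveAll(wr => !wr.IsAlive);
            list.Add(new WeakReference(obj));
        }
    }

    public static int AliveCount(Type type)
    {
        lock (_sync)
        {
            return _monitor.TryGetValue(type, out var list)
                ? list.FindAll(wr => wr.IsAlive).Count
                : 0;
        }
    }
}
```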
If you are okay with spawning another process, you might have success in using the profiling api.
https://learn.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/profiling-overview
You could spawn a process that attaches to the host, collects data and signals back with the results.

Non blocking variable updates

I'm going to implement a non-blocking write to a variable via Volatile.Write. Should I use Volatile.Read for all consumers of this variable, or is it not necessary? What kind of impact may occur if I read this variable as usual (without any kind of barrier)? And the same question about Interlocked.Exchange.
From the documentation of the Volatile class:
Calling one of these methods affects only a single memory access. To provide effective synchronization for a field, all access to the field must use Volatile.Read and Volatile.Write.
One of the things that may go wrong is that the compiler may emit code that reads the value of the variable into a register just once, and then keeps accessing this cached copy forever after, without ever checking to see whether the original value has changed.
Same thing with Interlocked.Exchange.
Generally, the best way to handle these kinds of situations is to fully encapsulate your variable inside a class exposing a property which accesses the variable via Volatile or Interlocked, thus guaranteeing that the variable will never be accessed by any other means.
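A minimal sketch of that encapsulation (the field and property names are hypothetical):

```csharp
using System.Threading;

// Routing every read and write through the property guarantees the
// field is never accessed without Volatile semantics.
sealed class StopSignal
{
    private bool _stopRequested;

    public bool StopRequested
    {
        get => Volatile.Read(ref _stopRequested);
        set => Volatile.Write(ref _stopRequested, value);
    }
}
```

Because the field is private, no other code path can bypass the barriers, which is exactly the guarantee the documentation asks you to maintain.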

Should I lock a DataTable in a multithreaded paradigm?

In a project of windows services (C# .Net Platform), I need a suggestion.
In the project I have a class named Cache, in which I keep some data that I need frequently. There is a thread that updates the cache every 30 minutes, whereas multiple other threads use the cache data.
In the cache class, there are getter and setter functions used by the user threads and the cache-updater thread respectively. No one uses data objects like tables directly, because they are private members.
From the above context, do you think I should use locking functionality in the cache class?
The effects of not using locks when writing to a shared memory location (like a cache) really depend on the application. If the code were used in banking software, the results could be catastrophic.
As a rule of thumb: when multiple threads access the same location, even if only one thread writes and all the others read, you should use locks (around the write operation). What can happen is that one thread starts reading data, gets swapped out by the updater thread, and ends up using a mixture of old and new data. Whether that actually has an impact depends on the application and how sensitive it is.
Key point: if you don't lock on the reads, there's a chance your read won't see the changes. A lock forces your read code to get values from main memory rather than from a CPU cache or register. To avoid actually locking, you could use Thread.MemoryBarrier(), which does the same job without the overhead of taking a lock.
Minor Points: Using lock would prevent a read from getting half old data and half new data. If you are reading more than one field, I recommend it. If you are really clever, you could keep all the data in an immutable object and return that object to anyone calling the getter and so avoid the need for a lock. (When new data comes in, you create a new immutable object, then replace the old with the new in one go. Use a lock for the actual write, or, if you're still feeling really clever, make the field referencing the object volatile.)
Also: when your getter is called, remember it's running on many other threads. There's a tendency to think that once something is running the Cache class's code it's all on the same thread, and it's not.
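The immutable-snapshot idea from the previous paragraph can be sketched like this (class and member names are hypothetical, and the snapshot is swapped via a volatile-style write rather than a lock):

```csharp
using System.Collections.Generic;
using System.Threading;

// The updater thread builds a complete new snapshot and swaps the
// reference in one atomic assignment; readers never see partial data.
sealed class SnapshotCache
{
    private IReadOnlyDictionary<string, decimal> _data =
        new Dictionary<string, decimal>();

    // Readers get whichever complete snapshot was current at the read.
    public IReadOnlyDictionary<string, decimal> Current =>
        Volatile.Read(ref _data);

    // Called by the single updater thread, e.g. every 30 minutes.
    public void Refresh(IDictionary<string, decimal> fresh) =>
        Volatile.Write(ref _data, new Dictionary<string, decimal>(fresh));
}
```

Since the old snapshot is never mutated, readers holding a reference to it keep seeing a consistent (if slightly stale) view, with no lock on the read path.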

C# locks and newbie multithreading questions

Some newbie questions about multi-threading in .NET which I think will help reinforce some concepts I'm trying to absorb. I've read several multi-threading materials (including the Albahari ebook) but feel I just need confirmation on some questions to help drive these concepts home.
A lock scope protects a shared region of code. Suppose there is a thread executing a method that increments a simple integer variable x in a loop; this won't protect code elsewhere that might also alter x, e.g. in another method on another thread.
Since these are two different regions of code potentially affecting the same variable, do we solve this by locking both regions using the same lock variable for both lock scopes around x? If you locked the two regions with different lock variables, this would not protect the variable, correct?
To further this example, using the same lock variable: what would happen if, for some reason, code in one method went into an infinite loop and never relinquished the lock? How could the second region of code in the other method detect this?
How does the choice of lock variable influence the behavior of the lock? I've read numerous posts on this subject already but can never seem to find a definitive answer - in some instances people explicitly use an object variable specifically for this purpose, other times people use lock(this) and finally there've been times I've seen people use a type object.
How do the different choices of lock variables influence the behavior / scope of the lock and what scenarios would it make sense to use one over the other?
Suppose you have a Hashtable wrapped in a class exposing Add, Remove, Get and some sort of Calculate method (say each object represents a quantity and this method sums each value), and all these methods are locked. However, once a reference to an object in that collection is made available to other code and passed around the application, that object (not the Hashtable) is now outside the lock scope surrounding the class's methods. How could you then protect access to and updates of the actual objects taken from the Hashtable, which could interfere with the Calculate method?
Appreciate any heuristics provided that would help reinforce these concepts for me - thanks!
1) Yes
2) That's a deadlock
3) The parts of your code you want to protect are an implementation detail of your class. Exposing the lock object by using lock(this) or lock(this.GetType()) is asking for trouble, since external code can now lock the same object and block your code unintentionally or maliciously. The lock object should be private.
4) It isn't very clear what you mean, you certainly wouldn't want to expose the Hashtable directly. Just keep it as a private field of the class, encapsulating it.
However, the odds that you can safely expose your class to client code using threads go down very rapidly with the number of public methods and properties you expose. You'll quickly get to a point where only the client code can properly take a lock. Fine-grained locking creates lots of opportunities for threading races when the client code is holding on to property values. Say a Count property value you return. By the time it uses the value, like in a for loop, the Count property might have changed. Only the most careful design can avoid these traps, a serious headache.
Furthermore, fine-grained locking is very inefficient since it inevitably is done in the most inner parts of your code. Locks are not that expensive, a rough 100 cpu cycles, but it quickly adds up. Especially wasted effort if the class object isn't actually used in multiple threads.
You then have no option but to declare your class thread-unsafe and the client code needs to use it in a thread-safe manner. Also the core reason that so many .NET classes are not thread-safe. This is the biggest reason that threading is so hard to get right, the programmer least likely to do it correctly is responsible for doing the most difficult thing.
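The private-lock-object advice from point 3 can be sketched as follows (the class and members are hypothetical):

```csharp
// A private lock object keeps the synchronization an implementation
// detail: no external code can take this lock, unlike lock(this)
// or lock(typeof(Account)).
sealed class Account
{
    private readonly object _sync = new object();
    private decimal _balance;

    public void Deposit(decimal amount)
    {
        lock (_sync) { _balance += amount; }
    }

    public decimal Balance
    {
        get { lock (_sync) { return _balance; } }
    }
}
```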
1)
You are correct. You must use the same lock object to protect two distinct areas of code that, for example, increment the variable x.
2)
This is known as a deadlock and is one of the difficulties of multithreaded programming. There are algorithms that can be used to prevent deadlocks, such as the Banker's Algorithm.
3)
Some languages make locking easy; for example, in .NET you can just create an object and use it as the shared lock. This is good for synchronising code within a given process. lock(this) just takes the lock on the object in question; try to avoid it and instead create a private object and lock on that, since lock(this) can lead to deadlocking situations. The lock object underneath is roughly a wrapper around a critical section. If you wanted to protect a resource across different processes, you would need a much heavier named Mutex; this requires a kernel object and is expensive, so do not use one unless you must.
4) You need to make sure locking is applied there as well. But presumably, when people call methods on this reference, they call the methods which employ synchronisation.
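On question 2: a thread cannot directly detect that another thread will never release the lock, but it can refuse to wait forever by using Monitor.TryEnter with a timeout instead of the lock statement. A sketch (the class and method are hypothetical):

```csharp
using System;
using System.Threading;

static class LockWithTimeout
{
    private static readonly object _sync = new object();

    // Returns false instead of blocking forever when the lock cannot
    // be acquired within the timeout, e.g. because another thread is
    // stuck in an infinite loop while holding it.
    public static bool TryDoWork(TimeSpan timeout, Action work)
    {
        if (!Monitor.TryEnter(_sync, timeout))
            return false;
        try
        {
            work();
            return true;
        }
        finally
        {
            Monitor.Exit(_sync);
        }
    }
}
```

This does not fix the stuck thread, but it lets the caller fail gracefully (log, retry, or abort) rather than hang.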

volatile for reference type in .net 4.0

I got confused on volatile for reference type .
I understand that for a primitive type, volatile can reflect value changes from another thread immediately. For a reference type, it can reflect address changes immediately. But what about the contents of the object? Are they still cached?
(Assuming List.Add() is an atomic operation)
For example, I have:
class A
{
    volatile List<String> list = new List<String>(); // initialized so AddValue cannot throw

    void AddValue()
    {
        list.Add("a value");
    }
}
If one thread calls the function AddValue, the address of list does not change, will another thread get updated about the "content" change of the list, or the content may be cached for each thread and it doesn't update for other threads?
I understand that for primitive type, volatile can reflect value changes from another thread immediately
You understand incorrectly in at least three ways. You should not attempt to use volatile until you deeply understand everything about weak memory models, acquire and release semantics, and how they affect your program.
First off, be clear that volatile affects variables, not values.
Second, volatile does not affect variables that contain values of value types any differently than it affects variables that contain references.
Third, volatile does not mean that value changes from other threads are visible immediately. Volatile means that variables have acquire and release semantics. Volatile affects the order in which side effects of memory mutations can be observed to happen from a particular thread. The idea that there exists a consistent universal order of mutations and that those mutations in that order can be observed instantaneously from all threads is not a guarantee made by the memory model.
However, what about the content of the object?
What about it? The storage location referred to by a volatile variable of reference type need not have any particular threading characteristics.
If one thread calls the function AddValue, the address of list does not change, will another thread get updated about the "content" change of the list.
Nope. Why would it? That other thread might be on a different processor, and that processor cache might have pre-loaded the page that contains the address of the array that is backing the list. Mutating the list might have changed the storage location that contains the address of the array to refer to some completely different location.
Of course, the list class is not threadsafe in the first place. If you're not locking access to the list then the list can simply crash and die when you try to do this.
You don't need volatile; what you need is to put thread locks around accesses to the list. Since thread locks induce full fences you should not need half fences introduced by volatile.
It's worse than that.
If you concurrently access an object that isn't thread-safe, your program may actually crash. Getting out-of-date information is not the worst potential outcome.
When sharing .NET base class library objects between threads, you really have no choice but to use locking. For lockless programming, you need invasive changes to your data structures at the lowest levels.
The volatile keyword has no impact on the content of the list (or, more precisely, the object being referenced).
Speaking about "updated / not updated for another thread" is an oversimplification of what's happening. You should use the lock statement to synchronize access to the shared list; otherwise you are effectively facing race conditions that may lead to a program crash. The List&lt;T&gt; class is not thread-safe by itself.
Look at http://www.albahari.com/threading/part4.aspx#_The_volatile_keyword for a good explanation about what volatile actually does and how it impacts fields.
The entire part of threading on that site is a must read anyway, it contains huge amounts of useful information that have proved very useful for me when I was designing multi threaded software.
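A minimal sketch of the lock-based approach recommended above (the wrapper class and its members are hypothetical):

```csharp
using System.Collections.Generic;

sealed class SynchronizedLog
{
    private readonly object _sync = new object();
    private readonly List<string> _list = new List<string>();

    // Every reader and writer takes the same lock; besides preventing
    // corruption of the list's internal state, the lock's full fences
    // also give the cross-thread visibility that volatile was being
    // asked to provide.
    public void Add(string value)
    {
        lock (_sync) { _list.Add(value); }
    }

    public int Count
    {
        get { lock (_sync) { return _list.Count; } }
    }
}
```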
