I'm trying to design a class, and I'm having trouble accessing some of the nested fields. I also have concerns about how thread-safe the whole design is. Does anyone have a better idea of how this should be designed, or are there any changes that should be made?
using System;
using System.Collections;
namespace SystemClass
{
public class Program
{
static void Main(string[] args)
{
System system = new System();
//Seems like an awkward way to access all the members
dynamic deviceInstance = (((DeviceType)((DeviceGroup)system.deviceGroups[0]).deviceTypes[0]).deviceInstances[0]);
Boolean checkLocked = deviceInstance.locked;
//Seems like this method for accessing fields might have problems with multithreading
foreach (DeviceGroup dg in system.deviceGroups)
{
foreach (DeviceType dt in dg.deviceTypes)
{
foreach (dynamic di in dt.deviceInstances)
{
checkLocked = di.locked;
}
}
}
}
}
public class System
{
public ArrayList deviceGroups = new ArrayList();
public System()
{
//API called to get names of all the DeviceGroups
deviceGroups.Add(new DeviceGroup("Motherboard"));
}
}
public class DeviceGroup
{
public ArrayList deviceTypes = new ArrayList();
public DeviceGroup() {}
public DeviceGroup(string deviceGroupName)
{
//API called to get names of all the Devicetypes
deviceTypes.Add(new DeviceType("Keyboard"));
deviceTypes.Add(new DeviceType("Mouse"));
}
}
public class DeviceType
{
public ArrayList deviceInstances = new ArrayList();
public bool deviceConnected;
public DeviceType() {}
public DeviceType(string DeviceType)
{
//API called to get hardwareIDs of all the device instances
deviceInstances.Add(new Mouse("0001"));
deviceInstances.Add(new Keyboard("0003"));
deviceInstances.Add(new Keyboard("0004"));
//Start thread CheckConnection that updates deviceConnected periodically
}
public void CheckConnection()
{
//API call to check connection and returns true
this.deviceConnected = true;
}
}
public class Keyboard
{
public string hardwareAddress;
public bool keypress;
public bool deviceConnected;
public Keyboard() {}
public Keyboard(string hardwareAddress)
{
this.hardwareAddress = hardwareAddress;
//Start thread to update deviceConnected periodically
}
public void CheckKeyPress()
{
//if API returns true
this.keypress = true;
}
}
public class Mouse
{
public string hardwareAddress;
public bool click;
public Mouse() {}
public Mouse(string hardwareAddress)
{
this.hardwareAddress = hardwareAddress;
}
public void CheckClick()
{
//if API returns true
this.click = true;
}
}
}
Making a class thread-safe is a heck of a difficult thing to do.
The first, naive way that many people attempt is to add a single lock and ensure that no code touches mutable data without holding it. By that I mean that everything in the class that is subject to change must first take the lock object before touching the data, whether it is just reading from it or writing to it.
However, if this is your solution, then you should probably not do anything at all to the code, just document that the class is not thread-safe and leave it to the programmer that uses it.
Why?
Because you've effectively just serialized all access to it. Two threads that try to use the class at the same time will block, even though they are touching separate parts of it. One of the threads will be given access; the other will wait until the first one is done.
This is actually discouraging multi-threaded usage of your class, so in this case you're adding overhead of locking to your class, and not actually getting any benefits from it. Yes, your class is now "thread safe", but it isn't actually a good thread-citizen.
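To make the single-lock approach concrete, here is a minimal sketch. `DeviceStatus`, `_gate` and the two members are hypothetical names chosen for illustration, not taken from the question. Every member takes the same lock, so callers that touch unrelated state still wait on each other.

```csharp
using System;

// Hypothetical class: one lock guards every piece of mutable state.
public class DeviceStatus
{
    private readonly object _gate = new object();
    private bool _connected;
    private int _keyPressCount;

    public bool Connected
    {
        get { lock (_gate) { return _connected; } }
        set { lock (_gate) { _connected = value; } }
    }

    // Unrelated to Connected, but it still contends for the same lock.
    public int RecordKeyPress()
    {
        lock (_gate) { return ++_keyPressCount; }
    }
}
```

The class is correct under concurrent use, but a thread calling `RecordKeyPress` is blocked for as long as another thread is inside `Connected`.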
The other way is to start adding granular locks, or to write lock-free constructs (seriously hard), so that if two parts of the object aren't always related, the code that accesses each part has its own lock. This allows multiple threads that access different parts of the data to run in parallel without blocking one another.
This becomes hard wherever you need to work on more than one part of the data at a time, as you need to be super-careful to take the locks in the right order, or suffer deadlocks. It should be your class' responsibility to ensure the locks are taken in the right order, not the code that uses the class.
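A rough sketch of granular locking with a fixed lock order, assuming a hypothetical `DeviceRegistry` class with two independent counters (all names here are made up for the example). The class itself decides the order in which the locks are taken, so callers cannot get it wrong.

```csharp
using System;

// Hypothetical class with two independently locked parts.
public class DeviceRegistry
{
    private readonly object _keyboardLock = new object();
    private readonly object _mouseLock = new object();
    private int _keyPresses;
    private int _clicks;

    // These two methods never block each other.
    public void RecordKeyPress() { lock (_keyboardLock) { _keyPresses++; } }
    public void RecordClick() { lock (_mouseLock) { _clicks++; } }

    // Needs both parts: always take _keyboardLock first, then _mouseLock.
    // Any other method that needs both must use the same order to avoid deadlock.
    public int TotalEvents()
    {
        lock (_keyboardLock)
        {
            lock (_mouseLock)
            {
                return _keyPresses + _clicks;
            }
        }
    }
}
```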
As for your specific example, it looks to me as though the parts that will change from background threads are only the "is the device connected" boolean values. In this case I would make that field volatile, and use a lock around each. If, however, the list of devices will change from background threads, you're going to run into problems pretty fast.
You should first try to identify all the parts that will be changed by background threads, and then devise scenarios for how you want the changes to propagate to other threads, how to react to the changes, etc.
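For the "is the device connected" flag specifically, a volatile field might look like the sketch below. `MonitoredDevice` and `MonitoredDeviceDemo` are hypothetical names, and `CheckConnection` is only a stand-in for the real API call from the question.

```csharp
using System;
using System.Threading;

// Hypothetical device whose connection flag is written by a background thread.
public class MonitoredDevice
{
    // volatile: the JIT may not cache this field in a register across reads,
    // so a write from the background thread becomes visible to readers.
    private volatile bool _deviceConnected;

    public bool DeviceConnected { get { return _deviceConnected; } }

    // Stand-in for the real API call mentioned in the question.
    public void CheckConnection() { _deviceConnected = true; }
}

public static class MonitoredDeviceDemo
{
    public static bool Run()
    {
        var device = new MonitoredDevice();
        var worker = new Thread(device.CheckConnection);
        worker.Start();
        worker.Join(); // wait for the background update
        return device.DeviceConnected;
    }
}
```

This only covers a single flag that is read and written as a whole. It does not help if the list of devices itself changes from a background thread.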
Every so often I hit upon this problem and ignore it, but it started gnawing at me today.
private readonly object _syncRoot = new object();
private List<int> NonconcurrentObject { get; } = new List<int>();
public void Fiddle()
{
lock (_syncRoot)
{
// ...some code...
NonconcurrentObject.Add(1);
Iddle();
}
}
public void Twiddle()
{
lock (_syncRoot)
{
// ...some different code...
NonconcurrentObject.Add(2);
Iddle();
}
}
private void Iddle()
{
// NOT THREADSAFE! DO NOT CALL THIS WITHOUT LOCKING ON _syncRoot
// ......lots of code......
NonconcurrentObject.Add(3);
}
I have multiple public methods of a class with some code that is not inherently threadsafe (the List above is a trivial example). I want to use helper methods for the code shared between them (as anyone would), but in splitting off the shared code I'm faced with a dilemma: do I use recursive locking in the helper methods or not? If I do, my code is wasteful and possibly less performant. If I don't (as above), the helper method is no longer threadsafe and open to a nasty race condition if called by some other method in the future.
How can I (elegantly and robustly) signal that a method isn't threadsafe?
You use doc comments.
///<remarks>not thread safe</remarks>
You could use custom attributes to mark methods that are not thread safe.
The advantage over comments is that it gives you options for further processing (via reflection) if you wish to do so at a later date.
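// By convention attribute classes end in "Attribute"; [NotThreadSafe] still resolves to this type.
[AttributeUsage(AttributeTargets.Method)]
public sealed class NotThreadSafeAttribute : Attribute
{
//...
}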
public class MyClass
{
[NotThreadSafe]
public void MyMethod()
{
//...
}
}
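As an illustration of the "further processing via reflection" mentioned above, here is a small sketch that lists the methods carrying such an attribute. `NotThreadSafeAttribute`, `Sample` and `ThreadSafetyReport` are names assumed for the example; only the reflection calls (`GetMethods`, `IsDefined`) are real framework APIs.

```csharp
using System;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Method)]
public sealed class NotThreadSafeAttribute : Attribute { }

// Hypothetical class with one marked and one unmarked method.
public class Sample
{
    [NotThreadSafe]
    public void Unsafe() { }

    public void Safe() { }
}

public static class ThreadSafetyReport
{
    // Returns the names of public instance methods declared on 'type'
    // that are marked with [NotThreadSafe].
    public static string[] UnsafeMethods(Type type)
    {
        return type
            .GetMethods(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .Where(m => m.IsDefined(typeof(NotThreadSafeAttribute), false))
            .Select(m => m.Name)
            .ToArray();
    }
}
```

A unit test or build step could call `ThreadSafetyReport.UnsafeMethods(typeof(Sample))` and flag any marked method that has become public.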
You could add the _Unsafe suffix to your utility methods that are not protected with locks.
Advantages: It reminds you that you are doing dangerous things, and so that you must be extra careful. A small mistake could cost you days of debugging in the future.
Disadvantages: Not very pretty, and can be confused with the unsafe keyword.
private void Iddle_Unsafe()
{
NonconcurrentObject.Add(3);
}
public void Twiddle()
{
lock (_syncRoot)
{
NonconcurrentObject.Add(2);
Iddle_Unsafe();
}
}
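A complementary idea that none of the answers above spell out: instead of only naming or documenting the helper, it can check at runtime that the caller holds the lock. This sketch assumes .NET 4.5 or later, where `Monitor.IsEntered` is available; the class and member names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class Fiddler
{
    private readonly object _syncRoot = new object();
    private readonly List<int> _items = new List<int>();

    public void Twiddle()
    {
        lock (_syncRoot)
        {
            _items.Add(2);
            IddleUnsafe();
        }
    }

    // Caller must hold _syncRoot; fail fast if it does not.
    private void IddleUnsafe()
    {
        if (!Monitor.IsEntered(_syncRoot))
            throw new InvalidOperationException("IddleUnsafe requires _syncRoot to be held.");
        _items.Add(3);
    }

    public int Count
    {
        get { lock (_syncRoot) { return _items.Count; } }
    }
}
```

The check costs little compared with taking the lock again, and a future caller that forgets the lock gets an immediate exception rather than an intermittent race.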
Every example I've ever seen of locking uses a private object to lock specific blocks of code, and Thread Synchronization (C#) gives the same kind of example, but also says "Strictly speaking, the object provided is used solely to uniquely identify the resource being shared among multiple threads, so it can be an arbitrary class instance. In practice, however, this object usually represents the resource for which thread synchronization is necessary." (Emphasis mine.) In my example here, and in my code, there is only one instance of "MyClass", which is running on its own thread, and a reference to it is passed around to various other classes.
Is it OK to lock on the MyClass reference and then call Ready(), or should I instead put a private object() within MyClass and lock on that, as shown in the LockedReady() method? Thank you for your answer, in advance.
using System.Threading;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var uc = new UserClass();
uc.DoThings();
}
}
public class MyClass
{
public bool Ready()
{
//determine if the class is ready to perform its function
//assumes that the instance of MyClass is locked,
//as shown in UserClass.DoThings
return true; //placeholder so the example compiles
}
private readonly object _readyLock = new object();
public bool LockedReady()
{
lock (_readyLock)
{
//determine if the class is ready to perform its function
//no assumption made that the object is locked, as
//shown in AnotherClass.DoAnotherThing()
return true; //placeholder so the example compiles
}
}
}
public class UserClass
{
private MyClass _myc;
public UserClass()
{
var t = new Thread(SetupMyClass);
t.Start();
}
private void SetupMyClass()
{
_myc = new MyClass();
}
public void DoThings()
{
lock(_myc)
{
if (_myc.Ready())
{
//Do things
}
}
}
public void DoOtherThings()
{
var ac = new AnotherClass(_myc);
ac.DoAnotherThing();
}
}
public class AnotherClass
{
private MyClass _myc;
public AnotherClass(MyClass myClass)
{
_myc = myClass;
}
public void DoAnotherThing()
{
if (_myc.LockedReady())
{
//do another thing
}
}
}
}
Functionally, it doesn't matter; one lock object doesn't perform better than another, unless that object is also used by other, unrelated locking code.
With C#, it isn't uncommon to lock on the actual domain object, rather than a surrogate object for the lock. It is also common to see a member object used, and a common legacy example is the SyncRoot object on the early System.Collections. Either way works, as long as you use a reference type.
However, the argument to be made for using an internal surrogate lock object is one of encapsulation. It eliminates the possibility of external interference if a user of your class decides to use your class as a lock. Using an internal lock object protects your locks from external interference, so one could argue that locking is an implementation detail that should be hidden.
The important thing is to ensure it is correct and appropriate. Make sure your locking is done at an appropriate granularity. (For example, using a static lock object probably isn't the best approach for a non-singleton, and probably not even most singletons). In cases where your class has multiple mutually exclusive threaded operations, you don't want to lock on "this" or you have unnecessary contention. That is like having one red light for 2 non-overlapping intersections.
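The encapsulation argument can be demonstrated with a short sketch. The two classes and the demo below are hypothetical; the point is only that an outside `lock` on the instance interferes with a class that locks on `this`, and not with one that uses a private lock object.

```csharp
using System;
using System.Threading;

public class LocksOnThis
{
    public bool TryWork(int timeoutMs)
    {
        // Same monitor that any outside caller can take with lock(instance).
        if (!Monitor.TryEnter(this, timeoutMs)) return false;
        try { return true; } finally { Monitor.Exit(this); }
    }
}

public class LocksOnPrivateObject
{
    private readonly object _gate = new object();

    public bool TryWork(int timeoutMs)
    {
        if (!Monitor.TryEnter(_gate, timeoutMs)) return false;
        try { return true; } finally { Monitor.Exit(_gate); }
    }
}

public static class LockTargetDemo
{
    // Holds an outside lock on each instance, then calls TryWork from another thread.
    public static bool[] Run()
    {
        var a = new LocksOnThis();
        var b = new LocksOnPrivateObject();
        var results = new bool[2];
        lock (a)
        {
            lock (b)
            {
                var t = new Thread(() =>
                {
                    results[0] = a.TryWork(100); // times out: the outside lock(a) is the same monitor
                    results[1] = b.TryWork(100); // succeeds: the outside lock(b) is a different monitor
                });
                t.Start();
                t.Join();
            }
        }
        return results;
    }
}
```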
Taking what I judged to be the best of all worlds from the excellent article Implementing the Singleton Pattern in C#, I have been successfully using the following class to persist user-defined data in memory (for very rarely modified data):
public class Params
{
static readonly Params Instance = new Params();
Params()
{
}
public static Params InMemory
{
get
{
return Instance;
}
}
private IEnumerable<Localization> _localizations;
public IEnumerable<Localization> Localizations
{
get
{
return _localizations ?? (_localizations = new Repository<Localization>().Get());
}
}
public int ChunkSize
{
get
{
// Loc uses the Localizations impl
return LC.Loc("params.chunksize").To<int>();
}
}
public void RebuildLocalizations()
{
_localizations = null;
}
// other similar values coming from the DB and staying in-memory,
// and their refresh methods
}
My usage would look something like this:
var allLocs = Params.InMemory.Localizations; //etc
Whenever I update the database, RebuildLocalizations gets called, so only part of my in-memory store is rebuilt. One production environment out of about 10 seems to misbehave when RebuildLocalizations is called: it does not refresh at all, but this also seems to be intermittent and very odd altogether.
My current suspicion is the singleton, although I think it does the job well, and all the unit tests show that the singleton mechanism, the refresh mechanism and the RAM performance work as expected.
That said, I am down to these possibilities:
This customer is lying when he says their environment is not using load balancing, which is a setup I would not expect the in-memory store to work properly under (right?)
There is some non-standard pool configuration in their IIS which I am testing against (maybe in a Web Garden setting?)
The singleton is failing somehow, but not sure how.
Any suggestions?
.NET 3.5 so not much parallel juice available, and not ready to use the Reactive Extensions for now
Edit1: as per the suggestions, would the getter look something like:
public IEnumerable<Localization> Localizations
{
get
{
lock(_localizations) {
return _localizations ?? (_localizations = new Repository<Localization>().Get());
}
}
}
To expand on my comment, here is how you might make the Localizations property thread safe:
public class Params
{
private object _lock = new object();
private IEnumerable<Localization> _localizations;
public IEnumerable<Localization> Localizations
{
get
{
lock (_lock) {
if ( _localizations == null ) {
_localizations = new Repository<Localization>().Get();
}
return _localizations;
}
}
}
public void RebuildLocalizations()
{
lock(_lock) {
_localizations = null;
}
}
// other similar values coming from the DB and staying in-memory,
// and their refresh methods
}
There is no point in creating a thread safe singleton, if your properties are not going to be thread safe.
You should either lock around assignment of the _localizations field, or instantiate it in your singleton's constructor (preferred). Any suggestion that applies to singleton instantiation also applies to this lazily instantiated property.
The same thing further applies to all properties (and their properties) of Localization. If this is a Singleton, it means that any thread can access it any time, and simply locking the getter will again do nothing.
For example, consider this case:
   Thread 1                               Thread 2
   // both threads access the singleton, but you are "safe" because you locked
1. var loc1 = Params.Localizations;       var loc2 = Params.Localizations;
   // do stuff                            // thread 2 calls the same property...
2. var value = loc1.ChunkSize;            var chunk = LC.Loc("params.chunksize");
   // invalidate                          // ...there is a slight pause here...
3. loc1.RebuildLocalizations();
                                          // ...and gets the wrong value
4.                                        var value = chunk.To();
If you are only reading these values, then it might not matter, but you can see how you can easily get in trouble with this approach.
Remember that with threading, you never know whether a different thread will execute something between two instructions. Only reads and writes of references and of simple types up to 32 bits (int, bool, float and so on) are guaranteed to be atomic; nothing else is.
This means that this line here:
return LC.Loc("params.chunksize").To<int>();
is, as far as threading is concerned, equivalent to:
var loc = LC.Loc("params.chunksize");
Thread.Sleep(1); // anything can happen here :-(
return loc.To<int>();
Any thread can jump in between Loc and To.
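The same reasoning applies to something as small as incrementing a counter. The sketch below is an illustration with made-up names: it uses `Interlocked.Increment` so that the read, add and write happen as one step, which is why the final total is exact.

```csharp
using System;
using System.Threading;

public static class AtomicityDemo
{
    private static int _safeCount;

    // Runs 'threads' workers that each increment the counter 'perThread' times.
    public static int Run(int threads, int perThread)
    {
        _safeCount = 0;
        var workers = new Thread[threads];
        for (int i = 0; i < threads; i++)
        {
            workers[i] = new Thread(() =>
            {
                for (int n = 0; n < perThread; n++)
                {
                    // A plain _safeCount++ is a read, an add and a write,
                    // so another thread could interleave between those steps.
                    Interlocked.Increment(ref _safeCount);
                }
            });
            workers[i].Start();
        }
        foreach (var w in workers) w.Join();
        return _safeCount;
    }
}
```

Replacing the `Interlocked.Increment` call with `_safeCount++` can lose updates, and how many are lost varies from run to run.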
I have a class that contains a static collection to store the logged-in users in an ASP.NET MVC application. I just want to know whether the code below is thread-safe. Do I need to lock the code whenever I add an item to or remove an item from the onlineUsers collection?
public class OnlineUsers
{
private static List<string> onlineUsers = new List<string>();
public static EventHandler<string> OnUserAdded;
public static EventHandler<string> OnUserRemoved;
private OnlineUsers()
{
}
static OnlineUsers()
{
}
public static int NoOfOnlineUsers
{
get
{
return onlineUsers.Count;
}
}
public static List<string> GetUsers()
{
return onlineUsers;
}
public static void AddUser(string userName)
{
if (!onlineUsers.Contains(userName))
{
onlineUsers.Add(userName);
if (OnUserAdded != null)
OnUserAdded(null, userName);
}
}
public static void RemoveUser(string userName)
{
if (onlineUsers.Contains(userName))
{
onlineUsers.Remove(userName);
if (OnUserRemoved != null)
OnUserRemoved(null, userName);
}
}
}
That is absolutely not thread safe. Any time 2 threads are doing something (very common in a web application), chaos is possible - exceptions, or silent data loss.
Yes you need some kind of synchronization such as lock; and static is usually a very bad idea for data storage, IMO (unless treated very carefully and limited to things like configuration data).
Also - static events are notorious as a way to keep object graphs alive unexpectedly. Treat those with caution too; if you subscribe only once, fine - but don't subscribe per request.
Also - it isn't just locking the operations, since this line:
return onlineUsers;
returns your list, now unprotected. All access to the list and its items must be synchronized. Personally I'd return a copy, i.e.
lock(syncObj) {
return onlineUsers.ToArray();
}
Finally, returning a .Count from such a collection can be confusing, as the value is not guaranteed to still be the count at any later point. It is informational for that point in time only.
Yes, you need to lock the onlineUsers to make that code threadsafe.
A few notes:
Using a HashSet<string> instead of the List<string> may be a good idea, since it is much more efficient for operations like this (Contains and Remove especially). This does not change anything on the locking requirements though.
You can declare a class as "static" if it has only static members.
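Putting these notes together, a locked `HashSet<string>` version could look roughly like this. It is a sketch only: `OnlineUserRegistry` and `SyncObj` are names chosen for the example, and the events from the question are left out to keep it short.

```csharp
using System;
using System.Collections.Generic;

public static class OnlineUserRegistry
{
    private static readonly object SyncObj = new object();
    private static readonly HashSet<string> Users = new HashSet<string>();

    // HashSet.Add returns false if the name was already present,
    // so the check and the insert happen as one step under the lock.
    public static bool AddUser(string userName)
    {
        lock (SyncObj) { return Users.Add(userName); }
    }

    public static bool RemoveUser(string userName)
    {
        lock (SyncObj) { return Users.Remove(userName); }
    }

    // Returns a copy, so callers never touch the shared set.
    public static string[] GetUsers()
    {
        lock (SyncObj)
        {
            var copy = new string[Users.Count];
            Users.CopyTo(copy);
            return copy;
        }
    }
}
```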
Yes you do need to lock your code.
private readonly object padlock = new object();
public bool Contains(T item)
{
lock (padlock)
{
return items.Contains(item);
}
}
Yes. You need to lock the collection before you read or write to the collection, since multiple users are potentially being added from different threadpool workers. You should probably also do it on the count as well, though if you're not concerned with 100% accuracy that may not be an issue.
As per Lucero's answer, you need to lock onlineUsers. Also be careful about what clients of your class will do with the onlineUsers list returned from GetUsers(). I suggest you change your interface - for example use IEnumerable<string> GetUsers() and make sure the lock is used in its implementation. Something like this:
public static IEnumerable<string> GetUsers() {
lock (...) {
foreach (var element in onlineUsers)
yield return element;
// We need foreach, just "return onlineUsers" would release the lock too early!
}
}
Note that this implementation can expose you to deadlocks if users try to call some other method of OnlineUsers that uses lock, while still iterating over the result of GetUsers().
That code is not thread-safe per se.
I will not make any suggestions about your "design", since you didn't ask for any. I'll assume you found good reasons for those static members and for exposing your list's contents as you did.
However, if you want to make your code thread-safe, you should basically use a lock object to lock on, and wrap the contents of your methods with a lock statement:
private readonly object syncObject = new object();
void SomeMethod()
{
lock (this.syncObject)
{
// Work with your list here
}
}
Beware that those events being raised have the potential to hold the lock for an extended period of time, depending on what the delegates do.
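One way to limit that risk is to change the list inside the lock and raise the event after the lock has been released. The sketch below assumes an instance class with hypothetical names (`UserTracker`, `UserEventArgs`) rather than the static class from the question.

```csharp
using System;
using System.Collections.Generic;

public class UserEventArgs : EventArgs
{
    public UserEventArgs(string userName) { UserName = userName; }
    public string UserName { get; private set; }
}

public class UserTracker
{
    private readonly object _sync = new object();
    private readonly List<string> _users = new List<string>();

    public event EventHandler<UserEventArgs> UserAdded;

    public void AddUser(string userName)
    {
        bool added = false;
        lock (_sync)
        {
            if (!_users.Contains(userName))
            {
                _users.Add(userName);
                added = true;
            }
        }

        // Raised after the lock is released, so a slow handler cannot block other callers.
        var handler = UserAdded; // local copy avoids a race with unsubscription
        if (added && handler != null)
            handler(this, new UserEventArgs(userName));
    }
}
```

The trade-off is that handlers may run slightly after the list has changed again, so they should treat the event as a notification and not as a snapshot of the list.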
You could omit the lock from the NoOfOnlineUsers property if a momentarily stale value is acceptable. (Declaring the list field volatile would only affect reads of the reference itself, not of the list's contents.) However, if you want the Count value to stay valid for as long as you are using it, use a lock there as well.
As others suggested here, exposing your list directly, even with a lock, will still pose a "threat" to its contents. I would go with returning a copy (and that should fit most purposes), as Marc Gravell advised.
Now, since you said you are using this in an ASP.NET environment, it is worth saying that local variables, and instance members of objects created per request, are not shared between requests and so need no locking; static members such as this list are shared by all requests.
In a C# app, suppose I have a single global class that contains some configuration items, like so :
public class Options
{
public int myConfigInt;
public string myConfigString;
// ..etc.
}
static Options GlobalOptions;
The members of this class will be used across different threads:
Thread1: GlobalOptions.myConfigString = blah;
while
Thread2: string thingie = GlobalOptions.myConfigString;
Using a lock for access to the GlobalOptions object would also block unnecessarily when 2 threads are accessing different members, but on the other hand creating a sync-object for every member seems a bit over the top too.
Also, using a lock on the global options would make my code less nice I think;
if I have to write
string stringiwanttouse;
lock(GlobalOptions)
{
stringiwanttouse = GlobalOptions.myConfigString;
}
everywhere (and is this thread-safe or is stringiwanttouse now just a pointer to myConfigString ? Yeah, I'm new to C#....) instead of
string stringiwanttouse = GlobalOptions.myConfigString;
it makes the code look horrible.
So...
What is the best (and simplest!) way to ensure thread-safety ?
You could wrap the field in question (myConfigString in this case) in a property, and have code in the get/set accessors that uses either a lock (Monitor.Enter/Monitor.Exit) or a Mutex. Then, accessing the property only locks that single field, and doesn't lock the whole class.
Edit: adding code
private static readonly object obj = new object(); // only used for locking
private static string myConfigString; // backing field guarded by obj
public static string MyConfigString {
get {
lock(obj)
{
return myConfigString;
}
}
set {
lock(obj)
{
myConfigString = value;
}
}
}
The following was written before the OP's edit:
public static class Options
{
private static int _myConfigInt;
private static string _myConfigString;
private static volatile bool _initialized = false; // volatile so the unlocked check sees the update
private static object _locker = new object();
private static void InitializeIfNeeded()
{
if (!_initialized) {
lock (_locker) {
if (!_initialized) {
ReadConfiguration();
_initialized = true;
}
}
}
}
private static void ReadConfiguration() { /* ... */ }
public static int MyConfigInt {
get {
InitializeIfNeeded();
return _myConfigInt;
}
}
public static string MyConfigString {
get {
InitializeIfNeeded();
return _myConfigString;
}
}
//..etc.
}
After that edit, I can say that you should do something like the above, and only set configuration in one place - the configuration class. That way, it will be the only class modifying the configuration at runtime, and only when a configuration option is to be retrieved.
Your configurations may be 'global', but they should not be exposed as a global variable. If configurations don't change, they should be used to construct the objects that need the information - either manually or through a factory object. If they can change, then an object that watches the configuration file/database/whatever and implements the Observer pattern should be used.
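For the case where the configuration does not change, a sketch of "construct the objects with the information they need" might look like this. `ReportWriter` is a hypothetical consumer invented for the example, and the property names simply mirror the fields in the question.

```csharp
using System;

// Immutable once constructed, so it can be shared across threads without locks.
public sealed class Options
{
    public Options(int myConfigInt, string myConfigString)
    {
        MyConfigInt = myConfigInt;
        MyConfigString = myConfigString;
    }

    public int MyConfigInt { get; }
    public string MyConfigString { get; }
}

// Hypothetical consumer: receives its configuration instead of reading a global.
public class ReportWriter
{
    private readonly Options _options;

    public ReportWriter(Options options) { _options = options; }

    public string Header()
    {
        return _options.MyConfigString + ":" + _options.MyConfigInt;
    }
}
```

Because nothing can modify an `Options` instance after construction, no reader ever needs a lock; a changed configuration would be represented by a new instance.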
Global variables (even those that happen to be a class instance) are a Bad Thing™
What do you mean by thread safety here? It's not the global object that needs to be thread safe, it is the accessing code. If two threads write to a member variable near the same instant, one of them will "win", but is that a problem? If your client code depends on the global value staying constant until it is done with some unit of processing, then you will need to create a synchronization object for each property that needs to be locked. There isn't any great way around that. You could just cache a local copy of the value to avoid problems, but the applicability of that fix will depend on your circumstances. Also, I wouldn't create a synch object for each property by default, but instead as you realize you will need it.