Suppose I have a static helper class that I'm using a lot in a web app. Suppose that the app receives about 20 requests per second for a sustained period of time and that, by magic, two requests ask the static class to do some work at the exact same nanosecond.
What happens when this happens?
To provide some context, the class is used to perform a LINQ-to-SQL query: it receives a few parameters, including the UserID, and returns a list of custom objects.
thanks.
It entirely depends on what your "some work" means. If it doesn't involve any shared state, it's absolutely fine. If it requires access to shared state, you'll need to work out how to handle that in a thread-safe way.
A general rule of thumb is that a class's public API should be thread-safe for static methods, but doesn't have to be thread-safe for instance methods - typically any one instance is only used within a single thread. Of course it depends on what your class is doing, and what you mean by thread-safe.
What happens when this happens?
If your methods are reentrant, they are thread safe and will most likely just work. If those static methods rely on some shared state and you haven't synchronized access to it, chances are this shared state will get corrupted. And you don't need 20 requests hitting the method in the same nanosecond to corrupt your shared state: two are more than enough if you don't synchronize.
So static methods by themselves are not evil (well actually they are as they are not unit test friendly but that's another topic), it's the way they are implemented that matters in a multithreaded environment. So you should make them thread safe.
UPDATE:
Because you mentioned LINQ-to-SQL in the comments section: as long as all variables used in the static method are local, the method is thread-safe. For example:
public static SomeEntity GetEntity(int id)
{
    using (var db = new SomeDbContext())
    {
        return db.SomeEntities.FirstOrDefault(x => x.Id == id);
    }
}
You must ensure your methods are thread safe, so don't use static fields to store any kind of state. If you are creating new objects inside the static method, there is no problem because each thread has its own object.
It depends if the static class has any state or not (i.e. static variables shared across all calls). If it does not, then it's fine. If it does, it's not good. Examples:
// Fine
static class Whatever
{
    public static string DoSomething() {
        return "something";
    }
}
// Death from above
static class WhateverUnsafe
{
    static int count = 0;
    public static int Count() {
        // ++count is a read-modify-write, so concurrent callers can lose updates
        return ++count;
    }
}
You can make the second one work using locks, but then you introduce contention and, once several locks are involved, the risk of deadlocks.
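For a plain counter like the one above, a lock is not strictly needed. Here is a minimal sketch using Interlocked, assuming the only shared state is that one int. The names SafeCounter and SafeCounterDemo are made up for illustration.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class SafeCounter
{
    private static int _count = 0;

    // Interlocked.Increment does the read-modify-write atomically
    // and returns the incremented value.
    public static int Next() => Interlocked.Increment(ref _count);

    public static int Current => Volatile.Read(ref _count);
}

public static class SafeCounterDemo
{
    // Hypothetical demo: 4 parallel workers, 1000 increments each.
    public static int Run()
    {
        Parallel.For(0, 4, _ =>
        {
            for (int i = 0; i < 1000; i++)
            {
                SafeCounter.Next();
            }
        });
        return SafeCounter.Current;
    }
}
```

With the unsafe ++count version, the same demo can come up short because two threads may read the same old value before either writes back.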
I have built massive web applications with static classes but they never have any shared state.
It crashes out in a nasty way (if you are doing this to share state), so avoid doing this in a web app. Alternatively, protect the reads/writes with a lock:
http://msdn.microsoft.com/en-us/library/system.threading.readerwriterlockslim.aspx
But honestly you really should avoid using statics unless you REALLY have to, and if you really have to, you have to be very careful with your locking strategy and test it to destruction to make sure you have managed to isolate reads and writes from each other.
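A rough sketch of what protecting reads and writes with ReaderWriterLockSlim could look like. SettingsCache and its string dictionary are invented for the example; the point is only the enter/exit pattern with try/finally.

```csharp
using System.Collections.Generic;
using System.Threading;

public static class SettingsCache
{
    private static readonly ReaderWriterLockSlim Lock = new ReaderWriterLockSlim();
    private static readonly Dictionary<string, string> Values = new Dictionary<string, string>();

    // Many readers may hold the read lock at the same time.
    public static bool TryGet(string key, out string value)
    {
        Lock.EnterReadLock();
        try
        {
            return Values.TryGetValue(key, out value);
        }
        finally
        {
            Lock.ExitReadLock();
        }
    }

    // A writer gets exclusive access: no readers, no other writers.
    public static void Set(string key, string value)
    {
        Lock.EnterWriteLock();
        try
        {
            Values[key] = value;
        }
        finally
        {
            Lock.ExitWriteLock();
        }
    }
}
```

Whether this beats a plain lock depends on how read-heavy the workload is, so it is worth measuring before committing to it.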
Related
I'm wondering will this scenario be thread safe and are there issues that I'm not currently seeing:
From ASP.net controller I call non-static method from non-static class (this class is in another project, and class is injected into controller).
This method (which is non-static) does some work and calls some other static method passing it userId
Finally static method does some work (for which userId is needed)
I believe this approach is thread safe, and that everything will be done properly if two users call this method at the same time (let's say in same nanosecond). Am I correct or completely wrong ? If I am wrong what would be correct way of using static methods within ASP.net project ?
EDIT
Here is code :)
This is call from the controller:
await _workoutService.DeleteWorkoutByIdAsync(AzureRedisFeedsConnectionMultiplexer.GetRedisDatabase(),AzureRedisLeaderBoardConnectionMultiplexer.GetRedisDatabase(), workout.Id, userId);
Here how DeleteWorkoutByIdAsync looks like:
public async Task<bool> DeleteWorkoutByIdAsync(IDatabase redisDb,IDatabase redisLeaderBoardDb, Guid id, string userId)
{
    using (var databaseContext = new DatabaseContext())
    {
        var workout = await databaseContext.Trenings.FindAsync(id);
        if (workout == null)
        {
            return false;
        }
        databaseContext.Trenings.Remove(workout);
        await databaseContext.SaveChangesAsync();
        await RedisFeedService.StaticDeleteFeedItemFromFeedsAsync(redisDb,redisLeaderBoardDb, userId, workout.TreningId.ToString());
    }
    return true;
}
As you can notice DeleteWorkoutByIdAsync calls static method StaticDeleteFeedItemFromFeedsAsync which looks like this:
public static async Task StaticDeleteFeedItemFromFeedsAsync(IDatabase redisDb,IDatabase redisLeaderBoardDd, string userId, string workoutId)
{
    var deleteTasks = new List<Task>();
    var feedAllRedisVals = await redisDb.ListRangeAsync("FeedAllWorkouts:" + userId);
    DeleteItemFromRedisAsync(redisDb, feedAllRedisVals, "FeedAllWorkouts:" + userId, workoutId, ref deleteTasks);
    await Task.WhenAll(deleteTasks);
}
And here is static method DeleteItemFromRedisAsync which is called in StaticDeleteFeedItemFromFeedsAsync:
private static void DeleteItemFromRedisAsync(IDatabase redisDb, RedisValue [] feed, string redisKey, string workoutId, ref List<Task> deleteTasks)
{
    var itemToRemove = "";
    foreach (var f in feed)
    {
        if (f.ToString().Contains(workoutId))
        {
            itemToRemove = f;
            break;
        }
    }
    if (!string.IsNullOrEmpty(itemToRemove))
    {
        deleteTasks.Add(redisDb.ListRemoveAsync(redisKey, itemToRemove));
    }
}
"Thread safe" isn't a standalone term. Thread Safe in the the face of what? What kind of concurrent modifications are you expecting here?
Let's look at a few aspects here:
Your own mutable shared state: You have no shared state whatsoever in this code; so it's automatically thread safe.
Indirect shared state: DatabaseContext. This looks like an SQL database, and those tend to be thread "safe", but what exactly that means depends on the database in question. For example, you're removing a Trenings row, and if some other thread also removes the same row, you're likely to get a (safe) concurrency violation exception. And depending on isolation level, you may get concurrency violation exceptions even for certain other mutations of "Trenings". At worst that means one failed request, but the database itself won't corrupt.
Redis is essentially single-threaded, so all operations are serialized and in that sense "thread safe" (which might not buy you much). Your delete code gets a set of keys, then deletes at most one of those. If two or more threads simultaneously attempt to delete the same key, it is possible that one thread will attempt to delete a non-existing key, and that may be unexpected to you (but it won't cause DB corruption).
Implicit consistency between redis+sql: It looks like you're using guids, so the chances of unrelated things clashing are small. Your example only contains a delete operation (which is unlikely to cause consistency issues), so it's hard to speculate whether redis and the sql database will stay consistent under all other circumstances. In general, if your IDs are never reused, you're probably safe - but keeping two databases in sync is a hard problem, and you're quite likely to make a mistake somewhere.
However, your code seems excessively complicated for what it's doing. I'd recommend you simplify it dramatically if you want to be able to maintain this in the long run.
Don't use ref parameters unless you really know what you're doing (and it's not necessary here).
Don't mix up strings with other data types, so avoid ToString() where possible. Definitely avoid nasty tricks like Contains to check for key equality. You want your code to break when something unexpected happens, because code that "limps along" can be virtually impossible to debug (and you will write bugs).
Don't effectively return an array of tasks if the only thing you can really do is wait for all of them - might as well do that in the callee to simplify the API.
Don't use redis. It's probably just a distraction here - you already have another database, so it's very unlikely you need it here, except for performance reasons, and it's extremely premature to go adding whole extra database engines for a hypothetical performance problem. There's a reasonable chance that the extra overhead of requiring extra connections may make your code slower than if you had just one db, especially if you can't save many sql queries.
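To make the first three suggestions concrete, here is a rough sketch of the delete with no ref parameter, an exact comparison instead of Contains, and the awaiting done inside the callee. IFeedStore, InMemoryFeedStore and FeedCleaner are hypothetical stand-ins so the snippet compiles without a Redis client, and it assumes each list item is the workout id itself, which may not match how the feed is actually stored.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical abstraction standing in for the Redis list calls in the question.
public interface IFeedStore
{
    Task<IReadOnlyList<string>> GetItemsAsync(string key);
    Task RemoveAsync(string key, string item);
}

public static class FeedCleaner
{
    // Returns true if an item was removed. No shared state, no ref parameter.
    public static async Task<bool> DeleteFeedItemAsync(IFeedStore store, string userId, string workoutId)
    {
        string key = "FeedAllWorkouts:" + userId;
        foreach (string item in await store.GetItemsAsync(key))
        {
            if (item == workoutId) // exact match (assumption: items are plain ids)
            {
                await store.RemoveAsync(key, item);
                return true;
            }
        }
        return false;
    }
}

// In-memory stand-in used only to exercise the sketch.
public sealed class InMemoryFeedStore : IFeedStore
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, List<string>> _lists = new Dictionary<string, List<string>>();

    public void Push(string key, string item)
    {
        lock (_gate)
        {
            if (!_lists.TryGetValue(key, out var list))
            {
                _lists[key] = list = new List<string>();
            }
            list.Add(item);
        }
    }

    public Task<IReadOnlyList<string>> GetItemsAsync(string key)
    {
        lock (_gate)
        {
            // Hand out a copy so callers never enumerate the live list.
            IReadOnlyList<string> copy = _lists.TryGetValue(key, out var list)
                ? list.ToArray()
                : new string[0];
            return Task.FromResult(copy);
        }
    }

    public Task RemoveAsync(string key, string item)
    {
        lock (_gate)
        {
            if (_lists.TryGetValue(key, out var list))
            {
                list.Remove(item);
            }
        }
        return Task.CompletedTask;
    }
}
```

If two requests race to delete the same item, the second one simply finds nothing to remove and returns false.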
Note: this answer was posted before the OP amended their question to add their code, revealing that this is actually a question of whether async/await is thread-safe.
Static methods are not a problem in and of themselves. If a static method is self-contained and manages to do its job using local variables only, then it is perfectly thread safe.
Problems arise if the static method is not self-contained, (delegates to thread-unsafe code,) or if it manipulates static state in a non-thread safe fashion, i.e. accesses static variables for both read and write outside of a lock() clause.
For example, int.Parse() and int.TryParse() are static, but perfectly thread safe. Imagine the horror if they were not thread-safe.
What you are doing here is synchronizing on a list (deleteTasks). If you do this, I would recommend one of two things.
1) Either use thread safe collections
https://msdn.microsoft.com/en-us/library/dd997305(v=vs.110).aspx
2) Let your DeleteItemFromRedisAsync return a task and await it.
Although I don't see any issues in this particular case, as soon as you refactor it so that DeleteItemFromRedisAsync can be called multiple times in parallel, you will have issues. The reason is that if multiple threads can modify your list of deleteTasks, you are no longer guaranteed to collect them all (https://msdn.microsoft.com/en-us/library/dd997373(v=vs.110).aspx: if two threads do an Add to the end of the list in a non-thread-safe way at the same time, one of them can be lost), so you might miss a task when waiting for all of them to finish.
Also, I would avoid mixing paradigms. Either use async/await, or keep track of a collection of tasks and let methods add to that list. Don't do both. This will help the maintainability of your code in the long run. (Note: methods can still return a task, which you collect and then wait for all of them. But then the collecting method is responsible for any threading issues, instead of that being hidden in the method that is being called.)
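A small sketch of option 1, assuming tasks really are added from several threads at once. TaskCollector is a made-up name, and Task.Run(() => 1) stands in for the real delete call.

```csharp
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class TaskCollector
{
    // Starts one task per producer from parallel threads and
    // returns how many of them were collected and completed.
    public static async Task<int> RunAsync(int producers)
    {
        // ConcurrentBag tolerates concurrent Add calls; List<T> does not.
        var pending = new ConcurrentBag<Task<int>>();

        Parallel.For(0, producers, i =>
        {
            pending.Add(Task.Run(() => 1)); // placeholder for the real work
        });

        int[] results = await Task.WhenAll(pending);
        return results.Sum();
    }
}
```

With option 2 the collection disappears entirely, because each method just returns its own task and the caller awaits it.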
I want to make sure that I always create only one instance of a Thread so I built this:
private static volatile Thread mdmFetchThread = null;
private static object Locker = new object();
public void myMethod(){
    string someParameter = getParameterDynamically();
    lock(Locker)
    {
        // If an mdmFetchThread is already running, we do not start a new one.
        if(mdmFetchThread != null && mdmFetchThread.ThreadState != ThreadState.Stopped)
        {
            // warn...
        }
        else
        {
            mdmFetchThread = new Thread(() => { doStuff(someParameter); });
            mdmFetchThread.Start();
        }
    }
}
Is this ok to do or what could be possible pitfalls?
//Edit: As requested below, a bit of context: doStuff() calls some external system. This call might time out, but I can't specify the timeout. So I call it in mdmFetchThread and do a mdmFetchThread.Join(20000) later. To avoid calling the external system twice, I created the static variable so that I can check whether a call is currently ongoing.
Storing a thread in a static variable is OK (if you need at most one such thread per AppDomain). You can store whatever you want in static storage.
The condition mdmFetchThread.ThreadState != ThreadState.Stopped is racy. You might find the thread still not stopped one nanosecond before it exits, in which case you accidentally do nothing. Maintain your own boolean status variable and synchronize properly. Abandon volatile because it is more complicated than necessary.
Consider switching to Task. It is more modern, with fewer pitfalls.
Consider using a Lazy<Task> to create the singleton behavior you want.
Add error handling. A crash in a background thread terminates the process without notifying the developer of the error.
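One possible shape for the points above, as a sketch only: a Task instead of a Thread, your own boolean guarded by a lock, and exceptions captured in the returned task rather than killing the process. SingleFlight and TryStart are invented names.

```csharp
using System;
using System.Threading.Tasks;

public static class SingleFlight
{
    private static readonly object Gate = new object();
    private static bool _running;

    // Starts work unless a previous run is still in flight.
    // Returns false (and an already completed task) when the call is skipped.
    public static bool TryStart(Action work, out Task task)
    {
        lock (Gate)
        {
            if (_running)
            {
                task = Task.CompletedTask;
                return false;
            }
            _running = true;
        }

        task = Task.Run(() =>
        {
            try
            {
                work();
            }
            finally
            {
                // Reset the flag even if work() throws; the exception itself
                // is stored in the task and surfaces when it is awaited.
                lock (Gate)
                {
                    _running = false;
                }
            }
        });
        return true;
    }
}
```

The caller can then use task.Wait(20000) for the timeout part. Note that Lazy<Task> would give run-once semantics, while this sketch allows a new run after the previous one has finished.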
Generally speaking if you are using statics to store state (such as a thread), then you might have a design flaw when attempting to scale out or when trying to manage the lifetime of the object. I usually try to avoid statics whenever possible.
An alternative might be to create a class that only manages a single thread to perform your task as an instance. This class might be responsible for passing data to your Thread or managing the state of it. For example, ensuring it is only run once, stopping the thread gracefully, or handling when the thread completes. If you wanted to scale out, then you'd just create multiple instances of your class each with their own thread that they manage. If you only wanted one, then just pass around a single instance.
If you're looking for ways to make this instance available to your entire application (which is usually the issue people are trying to solve when using static variables), then take a look into patterns like using ServiceContainers and IServiceProvider.
I have written a static class which is a repository of some functions which I am calling from different class.
public static class CommonStructures
{
    public struct SendMailParameters
    {
        public string To { get; set; }
        public string From { get; set; }
        public string Subject { get; set; }
        public string Body { get; set; }
        public string Attachment { get; set; }
    }
}
public static class CommonFunctions
{
    private static readonly object LockObj = new object();
    public static bool SendMail(SendMailParameters sendMailParam)
    {
        lock (LockObj)
        {
            try
            {
                //send mail
                return true;
            }
            catch (Exception ex)
            {
                //some exception handling
                return false;
            }
        }
    }
    private static readonly object LockObjCommonFunction2 = new object();
    public static int CommonFunction2(int i)
    {
        lock (LockObjCommonFunction2)
        {
            int returnValue = 0;
            try
            {
                //send operation
                return returnValue;
            }
            catch (Exception ex)
            {
                //some exception handling
                return returnValue;
            }
        }
    }
}
Question 1: For my second method CommonFunction2, do I use a new static lock i.e. LockObjCommonFunction2 in this example or can I reuse the same lock object LockObj defined at the beginning of the function.
Question 2: Is there anything which might lead to threading related issues or can I improve the code to be thread safe.
Question 3: Can there be any issues in passing a common class instead of a struct, in this example SendMailParameters (which I use to wrap up all parameters, instead of having multiple parameters to the SendMail function)?
Regards,
MH
Question 1: For my second method CommonFunction2, do I use a new static lock i.e. LockObjCommonFunction2 in this example or can I reuse the same lock object LockObj defined at the beginning of the function.
If you want to synchronize these two methods with each other, you need to use the same lock for both. For example, if thread1 is in Method1 and thread2 is in Method2, and you don't want them to run both bodies concurrently, use the same lock. But if you only want to prevent concurrent access within each method separately, use different locks.
Question 2: Is there anything which might lead to threading related issues or can I improve the code to be thread safe.
Always remember that shared resources (eg. static variables, files) are not thread-safe since they are easily accessed by all threads, thus you need to apply any kind of synchronization (via locks, signals, mutex, etc).
Question 3: Can there be any issues in passing a common class instead of a struct, in this example SendMailParameters (which I use to wrap up all parameters, instead of having multiple parameters to the SendMail function)?
As long as you apply proper synchronizations, it would be thread-safe. For structs, look at this as a reference.
The bottom line is that you need to apply correct synchronization for anything that is in shared memory. Also, you should always take note of the scope of the thread you are spawning and the state of the variables each method is using. Do they change the state or just depend on the internal state of the variable? Does the thread always create an object, although it's static/shared? If yes, then it should be thread-safe. Otherwise, if it just reuses that certain shared resource, then you should apply proper synchronization. And most of all, even without a shared resource, deadlocks could still happen, so remember the basic rules in C# to avoid deadlocks. P.S. thanks to Euphoric for sharing Eric Lippert's article.
But be careful with your synchronization. As much as possible, limit lock scopes to only where the shared resource is being modified, because overly broad locks can create bottlenecks that greatly affect your application's performance.
static readonly object _lock = new object();
static SomeClass sc = new SomeClass();
static void workerMethod()
{
    //assuming this method is called by multiple threads
    longProcessingMethod();
    modifySharedResource(sc);
}
static void modifySharedResource(SomeClass sc)
{
    //do something
    lock (_lock)
    {
        //where sc is modified
    }
}
static void longProcessingMethod()
{
    //a long process
}
You can reuse the same lock object as many times as you like, but that means that none of the areas of code surrounded by that same lock can be accessed at the same time by various threads. So you need to plan accordingly, and carefully.
Sometimes it's better to use one lock object for multiple location, if there are multiple functions which edit the same array, for instance. Other times, more than one lock object is better, because even if one section of code is locked, the other can still run.
Multi-threaded coding is all about careful planning...
To be super duper safe, at the expense of potentially writing much slower code... you can add an accessor to your static class surrounded by a lock. That way you can make sure that none of the methods of that class will ever be called by two threads at the same time. It's pretty brute force, and definitely a 'no-no' for professionals. But if you're just getting familiar with how these things work, it's not a bad place to start learning.
1) As to the first question, it depends on what you want:
As is (two separate lock objects), no two threads will execute the same method at the same time, but they can execute different methods at the same time.
If you change to a single lock object, then no two threads will execute any of the sections guarded by that shared lock object at the same time.
2) Nothing in your snippet strikes me as wrong, but there is not much code. If your repository calls its own methods, you can have a problem, and there is a world of issues that you can run into :)
3) As to structs, I would not use them. Use classes; it is better and easier that way. Structs bring another bag of issues that you just don't need.
The number of lock objects to use depends on what kind of data you're trying to protect. If you have several variables that are read/updated on multiple threads, you should use a separate lock object for each independent variable. So if you have 10 variables that form 6 independent variable groups (as far as how you intend to read / write them), you should use 6 lock objects for best performance. (An independent variable is one that's read / written on multiple threads without affecting the value of other variables. If 2 variables must be read together for a given action, they're dependent on each other so they'd have to be locked together. I hope this is not too confusing.)
Locked regions should be as short as possible for maximum performance - every time you lock a region of code, no other thread can enter that region until the lock is released. If you have a number of independent variables but use too few lock objects, your performance will suffer because your locked regions will grow longer.
Having more lock objects allows for higher parallelism since each thread can read / write a different independent variable - threads will only have to wait on each other if they try to read / write variables that are dependent on each other (and thus are locked through the same lock object).
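To illustrate the grouping idea, a minimal sketch with two independent groups and one lock each. The Stats class and its fields are made up: _sum and _count must be read together to compute an average, so they share a lock, while _lastName is unrelated and gets its own.

```csharp
public static class Stats
{
    // Group 1: _sum and _count depend on each other (the average needs both).
    private static readonly object TotalsLock = new object();
    private static long _sum;
    private static long _count;

    // Group 2: independent of the totals, so it has its own lock.
    private static readonly object NameLock = new object();
    private static string _lastName = "";

    public static void Record(long value)
    {
        lock (TotalsLock)
        {
            _sum += value;
            _count++;
        }
    }

    public static double Average()
    {
        lock (TotalsLock)
        {
            return _count == 0 ? 0.0 : (double)_sum / _count;
        }
    }

    public static void SetLastName(string name)
    {
        lock (NameLock) { _lastName = name; }
    }

    public static string GetLastName()
    {
        lock (NameLock) { return _lastName; }
    }
}
```

A thread calling SetLastName never waits for a thread inside Record, which is the extra parallelism described above.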
In your code you must be careful with your SendMailParameters input parameter - if this is a reference type (class, not struct) you must make sure that its properties are locked or that it isn't accessed on multiple threads. If it's a reference type, it's just a pointer and without locking inside its property getters / setters, multiple threads may attempt to read / write some properties of the same instance. If this happens, your SendMail() function may end up using a corrupted instance. It's not enough to simply have a lock inside SendMail() - properties and methods of SendMailParameters must be protected as well.
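One way to sidestep that concern, offered as a sketch and not as something the question requires, is to make the parameter class immutable, so there are no setters for another thread to call while SendMail() is reading it. The constructor shape below is an assumption.

```csharp
public sealed class SendMailParameters
{
    public SendMailParameters(string to, string from, string subject, string body, string attachment)
    {
        To = to;
        From = from;
        Subject = subject;
        Body = body;
        Attachment = attachment;
    }

    // Get-only properties: the instance cannot change after construction,
    // so sharing it between threads needs no locking.
    public string To { get; }
    public string From { get; }
    public string Subject { get; }
    public string Body { get; }
    public string Attachment { get; }
}
```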
I am trying to investigate locking to create a threadsafe class and have a couple of questions. Given the following class:
public class StringMe
{
    protected ArrayList _stringArrayList = new ArrayList();
    static readonly object _locker = new object();
    public void AddString(string stringToAdd)
    {
        lock (_locker) _stringArrayList.Add(stringToAdd);
    }
    public override string ToString()
    {
        lock (_locker)
        {
            return string.Join(",", (string[])_stringArrayList.ToArray(Type.GetType("System.String")));
        }
    }
}
1) Did I successfully make AddString and ToString threadsafe?
2) In the ToString method I've created is it necessary to lock there to make it threadsafe?
3) Is it only the methods that modify data that need to be locked or do both the read and write operations need to be locked to make it threadsafe?
Thank you so much for your time!
No, you haven't made those calls thread-safe - because the _stringArrayList field is protected. Subclasses could be doing whatever they like with it while AddString and ToString are being called.
For example (as the other answers claim that your code is thread-safe.)
public class BadStringMe : StringMe
{
    public void FurtleWithList()
    {
        while (true)
        {
            _stringArrayList.Add("Eek!");
            _stringArrayList.Clear();
        }
    }
}
Then:
BadStringMe bad = new BadStringMe();
new Thread(bad.FurtleWithList).Start();
bad.AddString("This isn't thread-safe");
Prefer private fields - it makes it easier to reason about your code.
Additionally:
Prefer List<T> to ArrayList these days
You're locking with a static variable for some reason... so even if you've got several instances of StringMe, only one thread can be in AddString at a time in total
Using typeof(string) is much cleaner than Type.GetType("System.String")
3) Is it only the methods that modify data that need to be locked or do both the read and write operations need to be locked to make it threadsafe?
All of them, assuming that there might be some write operations. If everything is just reading, you don't need any locks - but otherwise your reading threads could read two bits of data from the data structure which have been modified in between, even if there's only one writing thread. (There are also memory model considerations to bear in mind.)
1) Did I successfully make AddString and ToString threadsafe?
Yes, if you change _stringArrayList to be private
2) In the ToString method I've created is it necessary to lock there to make it threadsafe?
Yes
3) Is it only the methods that modify data that need to be locked or do both the read and write operations need to be locked to make it threadsafe?
Read and write.
Yes to all three (i.e. read/write to the last).
But there is more:
You make your lock object static, while the data you protect is a per-instance field. That means that all instances of StringMe are protected against each other, even though they have distinct data (i.e. instances of _stringArrayList). For the example you give, you can remove the static modifier from _locker. To be more precise, you typically define a "lock" for a set of data, or better yet, for the invariants you want to preserve. So usually, the lifetime (and scope) of the lock should equal that of the data.
Also, for good measure, you should not have a higher visibility on the data you protect than on the lock. In your example, a derived implementation could alter _stringArrayList (since it is protected) without acquiring the lock, thus breaking the invariant. I would make them both private and, if you must, only expose _stringArrayList through (properly locking) methods to derived classes.
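Putting the suggestions from these answers together, a sketch of how the class might look: private fields, a per-instance lock, and List<string> in place of ArrayList. This is one reasonable reading of the advice, not the only way to do it.

```csharp
using System.Collections.Generic;

public class StringMe
{
    // Both the data and its lock are private and per instance,
    // so subclasses cannot touch the list without going through the lock.
    private readonly List<string> _strings = new List<string>();
    private readonly object _locker = new object();

    public void AddString(string stringToAdd)
    {
        lock (_locker)
        {
            _strings.Add(stringToAdd);
        }
    }

    public override string ToString()
    {
        // Reads are locked too, so Join never enumerates a list mid-update.
        lock (_locker)
        {
            return string.Join(",", _strings);
        }
    }
}
```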
I have a method that is getting called from multiple threads. Each of the threads has its own instance of the class. What's the most straightforward way to synchronize access to the code?
I can't just use lock(obj) where obj is an instance member, but would it be sufficient to just declare obj as static on the class? So all calls to the method would be locking on the same object? A simple illustration follows:
class Foo
{
    static object locker = new object();
    public void Method()
    {
        lock(locker)
        {
            //do work
        }
    }
}
EDIT: The //do work bit is writing to a database. Why I need to serialize the writes would take 3 pages to explain in this particular instance, and I really don't want to relive all the specifics that lead me to this point. All I'm trying to do is make sure that each record has finished writing before writing the next one.
Why do you need any synchronization when the threads each have their own instance? Protect the resource that is shared, don't bother with unshared state. That automatically helps you find the best place for the locking object. If it is a static member that the objects have in common then you indeed need a static locking object as well.
Your example would certainly work, though there must be some resource that is being shared across the different instances of the class to make that necessary.
You left out the most important part: what data is involved in // do work
If // do work uses static data then you have the right solution.
If // do work only uses instance data then you can leave out the lock() {} altogether (because 1 instance belongs to 1 Thread) or use a non-static locker (1 instance, multiple threads).
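A small sketch of the static-data case, where every thread has its own instance but all instances write to one shared resource. RecordWriter is a made-up name and the static list is only a stand-in for the database.

```csharp
using System.Collections.Generic;

public class RecordWriter
{
    // Shared by every instance, so the lock guarding it is static as well.
    private static readonly object Locker = new object();
    private static readonly List<string> Written = new List<string>(); // stand-in for the database

    public void Write(string record)
    {
        lock (Locker)
        {
            // Only one thread at a time gets here, across all instances.
            Written.Add(record);
        }
    }

    public static int WrittenCount
    {
        get
        {
            lock (Locker)
            {
                return Written.Count;
            }
        }
    }
}
```

If Written were an instance field used by a single thread, the lock could be dropped entirely, as the answer above says.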