My question has to do with the following bit of code. I simplified the code to distill the question.
I understand that the lock keeps the foo Hashtable variable from being changed, but what about the variable outside the lock? We're seeing some odd behavior in code that looks like this and this was the question that came up. Thanks for any input.
using System.Collections;
namespace MultithreadScratch01
{
public class ThreadFoo
{
public void Foo(Stub2 stub2, Hashtable foo)
{
Stub1 bar;
var prop1 = stub2.Prop1;
var prop2 = stub2.Prop2;
var prop3 = stub2.Prop3;
var hash = string.Format("{0}_{1}_{2}", prop1, prop2, prop3);
lock(foo)
{
if(!foo.Contains(hash))
{
bar = new Stub1 {Foo = "some arbitrary string", Bar = 123};
foo.Add(hash, bar);
}
}
}
public class Stub1
{
public string Foo { get; set; }
public int Bar { get; set; }
}
public class Stub2
{
public string Prop1 { get; set; }
public string Prop2 { get; set; }
public string Prop3 { get; set; }
}
}
}
What you're doing here is a "worst practice" in multiple ways.
First off, there is no guarantee whatsoever that other threads that have reference to that hash table are also locking it before they read or write it. That's why this technique is so bad; it is difficult to make that guarantee.
Second, there is no guarantee whatsoever that other threads that have reference to that hash table instance are not locking it and failing to unlock it because they are buggy or hostile. You are potentially putting other people in charge of the correctness of your code, and that is a dangerous position to be in. A good rule is "never lock anything that external code can see". Never lock "this", never lock a Type object, and so on. There are times to make an exception to that rule, but I would want a good reason.
The right thing to do here is to use a ConcurrentDictionary in the first place. If you can't do that, then I would write a wrapper around Hashtable:
sealed class ThreadSafeHashTable
{
private readonly Hashtable hashTable = new Hashtable();
public void Add(object key, object value)
{
lock(this.hashTable)
{
...
Now the locking is (1) always done every time Add is called, and (2) only the code in this class can possibly take the lock, because the object that is locked is private and never passed out.
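The wrapper above is cut off at the "...", so here is a minimal sketch of what a complete version might look like. The member names TryAdd and Get, and the add-only-if-absent behaviour, are my assumptions rather than anything from the original snippet.

```csharp
using System.Collections;

// Sketch of a wrapper that owns both its Hashtable and the object it locks on.
public sealed class ThreadSafeHashtable
{
    // Private and never handed out, so only this class can take the lock.
    private readonly Hashtable hashtable = new Hashtable();

    // Hypothetical member: adds the pair only if the key is absent.
    // Returns true when the pair was added.
    public bool TryAdd(object key, object value)
    {
        lock (this.hashtable)
        {
            if (this.hashtable.ContainsKey(key))
            {
                return false;
            }
            this.hashtable.Add(key, value);
            return true;
        }
    }

    // Hypothetical member: reads take the same lock as writes.
    public object Get(object key)
    {
        lock (this.hashtable)
        {
            return this.hashtable[key];
        }
    }
}
```

Because the check and the add happen under one lock, two threads racing on the same key cannot both add it.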
Only if you could not do either of those things would I do this the hard way. If you want to do it the hard way then you are going to have to track down every single place the hash table could possibly be used and ensure that the correct locking code is written.
It isn't quite true to say that the lock
keeps the foo Hashtable variable from being changed
Simply put, no two threads locking on the same object can be inside the lock at the same time. If you have any code that doesn't lock on the same object (the hash table), it will walk right in and could well cause damage. Re the other variables: if anything is mutating the objects and isn't locking on the same lock object that you are, things could get funky. Actually, there are even some edge cases if it is locking on the same object (those would be solved by moving the property reads inside the lock). However, since those variables aren't "captured", once you have a snapshot of the values the snapshot will not change.
You are somewhat mistaken on what the lock is doing. It is not preventing other threads from accessing or modifying foo, but it is actually preventing other threads from entering the code block the lock surrounds. If you are modifying foo elsewhere, you might see issues.
Since hash is a local variable, you shouldn't see any problems with it.
You might want to try using a ConcurrentDictionary instead of a Hashtable; it is designed for multi-threaded access.
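As a rough sketch of that suggestion, the "check, then add" step from the question could be expressed with GetOrAdd. This assumes every caller can be changed to share a ConcurrentDictionary<string, Stub1> instead of the Hashtable; the StubCache class and GetOrCreate method are hypothetical names of mine.

```csharp
using System.Collections.Concurrent;

public class Stub1
{
    public string Foo { get; set; }
    public int Bar { get; set; }
}

public static class StubCache
{
    // Hypothetical helper: returns the entry stored for the key, creating it if needed.
    public static Stub1 GetOrCreate(ConcurrentDictionary<string, Stub1> cache, string hash)
    {
        // GetOrAdd makes "add if missing" safe without an explicit lock.
        // The factory delegate may run more than once under contention,
        // but only one result is ever stored and returned to all callers.
        return cache.GetOrAdd(hash, _ => new Stub1 { Foo = "some arbitrary string", Bar = 123 });
    }
}
```

Note that this only protects the dictionary itself; the Stub1 instances stored in it are still mutable.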
Locking on a parameter is a dangerous game as the guys above have alluded to since you don't know what is being done with that parameter outside of your method. Someone else could be modifying it at the same time.
You could wrap all the code in your method to make sure only one thread is in that code at a time, but since your Stub1 and Stub2 classes aren't thread safe (need to put locks on the properties) then even doing that wouldn't guarantee that the properties aren't being changed while you're reading them.
I have this code:
public static class ProcessClass
{
public static string MyProp {get;set;}
public static void ProcessMethod(InputObject input)
{
if(String.IsNullOrEmpty(MyProp))
MyProp = input.Name();
//do stuff
MyProp = null;
}
}
Now, everything except for MyProp was already in place. I needed a way to track the original input.Name throughout the code since it can change and also because ProcessMethod could get called again internally with a different input.Name value due to business rules, and I need to state that this second calling came from input.Name.
Obviously, this is bad since if two people do this at the same time, they will both share the same MyProp value, and simply making it null at the end seems dangerous and hacky. I don't really have the option of changing this method to be non-static because that would involve changing A LOT of the codebase, which isn't really an option.
What are ways around using a static property, but still being able to keep my original input.Name value without risking thread safety issues? One option I am thinking of is to have every method accept an input.Name() and track it that way (and remove MyProp), but I can picture that getting out of hand fast, and extremely messy.
I don't need to have this be a property, but if it is it obviously needs to be static in this class.
Since it may be a multi-user environment, then replace the string with a ConcurrentDictionary where you store the input.Name() as a Value and as Key the unique identifier of the user (an id, a name, etc).
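A minimal sketch of that idea follows. The userId parameter, the OriginalNames field, and the changed ProcessMethod signature are all assumptions of mine, and it only works if one user never has two requests running at the same time.

```csharp
using System.Collections.Concurrent;

public static class ProcessClass
{
    // Maps a user identifier to the name recorded by that user's outermost call.
    private static readonly ConcurrentDictionary<string, string> OriginalNames =
        new ConcurrentDictionary<string, string>();

    // Returns the original name for this user, purely for illustration.
    public static string ProcessMethod(string userId, string inputName)
    {
        // TryAdd succeeds only when no name is recorded yet for this user,
        // i.e. for the outermost call; nested calls leave the entry alone.
        bool isOutermost = OriginalNames.TryAdd(userId, inputName);
        try
        {
            // do stuff; nested calls for the same user still see the first name
            return OriginalNames[userId];
        }
        finally
        {
            // Only the call that recorded the name clears it again.
            if (isOutermost)
            {
                string removed;
                OriginalNames.TryRemove(userId, out removed);
            }
        }
    }
}
```

Each user gets a separate slot, so two users calling at once no longer overwrite each other's value.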
I wonder how to define a class properly and use it safely. I mean thread safely when thousands of concurrent calls are being made by every website visitor.
I made something like the code below myself, but I wonder whether it is properly built:
public static class csPublicFunctions
{
private static Dictionary<string, clsUserTitles> dicAuthorities;
static csPublicFunctions()
{
dicAuthorities = new Dictionary<string, clsUserTitles>();
using (DataTable dtTemp = DbConnection.db_Select_DataTable("select * from myTable"))
{
foreach (DataRow drw in dtTemp.Rows)
{
clsUserTitles tempCLS = new clsUserTitles();
tempCLS.irAuthorityLevel = Int32.Parse(drw["Level"].ToString());
tempCLS.srTitle_tr = drw["Title_tr"].ToString();
tempCLS.srTitle_en = drw["Title_en"].ToString();
dicAuthorities.Add(drw["authorityLevel"].ToString(), tempCLS);
}
}
}
public class clsUserTitles
{
private string Title_tr;
public string srTitle_tr
{
get { return Title_tr; }
set { Title_tr = value; }
}
private string Title_en;
public string srTitle_en
{
get { return Title_en; }
set { Title_en = value; }
}
private int AuthorityLevel;
public int irAuthorityLevel
{
get { return AuthorityLevel; }
set { AuthorityLevel = value; }
}
}
public static clsUserTitles returnUserTitles(string srUserAuthority)
{
return dicAuthorities[srUserAuthority];
}
}
Dictionary will be initialized only 1 time. No add remove update later.
Dictionary supports thread safe reading. Here is the proof from MSDN:
A Dictionary can support multiple readers concurrently,
as long as the collection is not modified. Even so, enumerating
through a collection is intrinsically not a thread-safe procedure. In
the rare case where an enumeration contends with write accesses, the
collection must be locked during the entire enumeration. To allow the
collection to be accessed by multiple threads for reading and writing,
you must implement your own synchronization.
So, if you are planning to only read data from it, it should work. However, I do not believe that your dictionary is filled only once and won't be modified while your application is running. In that case, all the other guys in this thread are correct: it is necessary to synchronize access to this dictionary, and it is best to use the ConcurrentDictionary object.
Now, I want to say a couple of words about the design itself. If you want to store shared data between users, use the ASP.NET Cache instead, which was designed for such purposes.
A quick look through your code and it seems to me that your first problem will be the publicly available dictionary dicAuthorities. Dictionaries are not thread safe. Depending on what you want to do with that Dictionary, you'll need to implement something that regulates access to it. See this related question:
Making dictionary access thread-safe?
As the others have said, Dictionary<TKey,TValue> is not inherently thread-safe. However, if your usage scenario is:
1. Fill the dictionary on startup
2. Use that dictionary as a lookup while the application is running
3. Never add or remove values after startup
then you should be fine.
However, if you use .NET 4.5, I would recommend making #3 explicit by using a ReadOnlyDictionary.
So, your implementation might look like this (I changed the coding style to be more C#-friendly):
private static readonly ReadOnlyDictionary<string, UserTitles> authorities;
static PublicFunctions()
{
Dictionary<string, UserTitles> authoritiesFill = new Dictionary<string, UserTitles>();
using (DataTable dtTemp = DbConnection.db_Select_DataTable("select * from myTable"))
{
foreach (DataRow drw in dtTemp.Rows)
{
UserTitles userTitle = new UserTitles
{
AuthorityLevel = Int32.Parse(drw["Level"].ToString()),
TitleTurkish = drw["Title_tr"].ToString(),
TitleEnglish = drw["Title_en"].ToString()
};
authoritiesFill.Add(drw["authorityLevel"].ToString(), userTitle);
}
}
authorities = new ReadOnlyDictionary<string, UserTitles>(authoritiesFill);
}
I've also added a readonly modifier to the declaration itself, because this way you can be sure that it won't be replaced at runtime by another dictionary.
No, your code is not thread safe.
[EDIT does not apply - set/created inside static constructor] Dictionary (as pointed out in System Down's answer) is not thread safe while being updated. Dictionary is not read-only, hence there is no way to guarantee that it is not modified over time.
[EDIT does not apply - set/created inside static constructor] Initialization is not protected by any locks, so you end up with multiple initializations at the same time.
Your entries are mutable, so it is very hard to reason about whether you get a consistent value of each entry.
[EDIT does not apply - only modified in static constructor] The field that holds the dictionary is not read-only; depending on the code you may end up with inconsistent data if you do not cache the pointer to the dictionary itself.
Side note: try to follow coding guidelines for C# and call classes starting with upper case MySpecialClass and have names that reflect purpose of the class (or clearly sample names).
EDIT: most of my points do not apply as the only initialization of the dictionary is inside static constructor. Which makes initialization safe from thread-safety point of view.
Note that initialization inside static constructor will happen at non-deterministic moment "before first use". It can lead to unexpected behavior - i.e. when access to DB may use wrong "current" user account.
The answer to your question is no, it's not thread safe. Dictionary is not a thread-safe collection. If you want to use a thread-safe dictionary then use ConcurrentDictionary.
Besides that, it's difficult to say whether your csPublicFunctions is thread-safe or not, because it depends on how you handle your database connections inside the call to DbConnection.db_Select_DataTable.
The thread-safety problem is not only with the Dictionary.
Yes, filling the dictionary is thread-safe. But any other modification of this dictionary is not thread-safe. As was written above, ConcurrentDictionary could help.
Another problem is that your class clsUserTitles is not thread-safe either.
If clsUserTitles is used only for reading, you could make each property setter of clsUserTitles private and initialize these properties from the clsUserTitles constructor.
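A small sketch of that last suggestion, assuming the renamed UserTitles class and property names used in the ReadOnlyDictionary answer above:

```csharp
// Sketch of a read-only entry type: values are set once in the constructor
// and cannot be changed afterwards, so concurrent readers always see the same data.
public sealed class UserTitles
{
    public UserTitles(int authorityLevel, string titleTurkish, string titleEnglish)
    {
        AuthorityLevel = authorityLevel;
        TitleTurkish = titleTurkish;
        TitleEnglish = titleEnglish;
    }

    // Private setters: only this class can assign the properties.
    public int AuthorityLevel { get; private set; }
    public string TitleTurkish { get; private set; }
    public string TitleEnglish { get; private set; }
}
```

With entries like this, a dictionary that is filled once and never modified can be read from any number of threads.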
I have a class with two properties and two methods. Like the one below for example. (please ignore the data types or return types, it's just a typical scenario)
// The methods could be invoked by multiple threads
public class Stock
{
private static int FaceValue { get; set; }
private static int Percent { get; set; }
// method that updates the two properties
public void UpdateStock()
{
FaceValue += 1;
Percent = FaceValue * 100;
}
// method that reads the two properties
public int[] GetStockQuote()
{
return new int[] { FaceValue, Percent};
}
}
I need to ensure this class is thread safe. I could use lock(obj) in both the methods as one technique to make it threadsafe but what would be the best technique to make it thread safe, considering the following:
There are only two properties that are read/updated. So, not sure if locking inside the methods is a good technique.
Will it be enough if I just make the properties thread safe rather than the methods or the class ?
Also, is there a way to make the whole class thread safe rather than individual methods or properties ? Any recommended lock techniques from .Net 4.0 ?
Just wondering if I am thinking through this right or may be I am over complicating it considering these. Many thanks in advance to help me get this clear.
Mani
In general, a lock is probably the simplest approach here.
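For reference, a minimal sketch of the lock-based version of the class from the question. The Sync field name is my own, and I have used plain static fields instead of the auto-properties.

```csharp
public class Stock
{
    // Private lock object, so no outside code can take this lock.
    private static readonly object Sync = new object();
    private static int faceValue;
    private static int percent;

    // Updates both values as one unit.
    public void UpdateStock()
    {
        lock (Sync)
        {
            faceValue += 1;
            percent = faceValue * 100;
        }
    }

    // Reads both values under the same lock, so a reader never sees
    // a new faceValue paired with an old percent.
    public int[] GetStockQuote()
    {
        lock (Sync)
        {
            return new int[] { faceValue, percent };
        }
    }
}
```

Locking only the individual properties would not be enough here, because the two values have to change together.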
A potentially better alternative would be to make this class immutable. If you make it so you can't change the values within the class once it's created, you no longer have to worry when reading the values, as there's no way for them to be modified.
In this case, that could be done by having a constructor that takes the two values, and changing UpdateStock to be more like:
public Stock GetUpdatedStock()
{
// Create a new instance here...
return new Stock(this.FaceValue + DateTime.Now.Millisecond, this.FaceValue * 100);
}
Edit:
Now that you've made FaceValue and Percent static, you will need synchronization. A lock is likely the simplest option here.
With a single value, you could potentially use the Interlocked class to handle updates atomically, but there is no way to do an atomic update of both values*, which is likely required for the thread safety to be done properly. In this case, synchronizing via a lock will solve your issue.
*Note: This could possibly be done without a lock via Interlocked.CompareExchange if you put both values within a class, and exchanged the entire class instance - but that's likely a lot more trouble than it's worth.
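To illustrate the note, here is a rough sketch of the lock-free variant, assuming both values are moved into a small immutable class (Quote is a name I made up) and that Volatile.Read from .NET 4.5 is available.

```csharp
using System.Threading;

public class Stock
{
    // Immutable snapshot holding both values together.
    private sealed class Quote
    {
        public readonly int FaceValue;
        public readonly int Percent;

        public Quote(int faceValue, int percent)
        {
            FaceValue = faceValue;
            Percent = percent;
        }
    }

    private static Quote current = new Quote(0, 0);

    public void UpdateStock()
    {
        while (true)
        {
            Quote seen = Volatile.Read(ref current);
            Quote next = new Quote(seen.FaceValue + 1, (seen.FaceValue + 1) * 100);

            // Publish the new snapshot only if no other thread replaced
            // the one we read; otherwise loop and try again.
            if (ReferenceEquals(Interlocked.CompareExchange(ref current, next, seen), seen))
            {
                return;
            }
        }
    }

    // Readers grab one snapshot, so the two values always belong together.
    public int[] GetStockQuote()
    {
        Quote seen = Volatile.Read(ref current);
        return new int[] { seen.FaceValue, seen.Percent };
    }
}
```

This allocates a new object per update and retries under contention, which is part of why a plain lock is usually the easier choice.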
There is no silver-bullet solution for thread safety; each scenario needs its own solution. The most obvious is to use a lock, but in your example you can simplify and use the Interlocked class, which makes the update an atomic operation:
using System.Threading;
public class Stock
{
// Interlocked needs a field it can take by ref, so this is a field rather than an auto-property
private static int faceValue;
public void UpdateStock()
{
// only a single value to update now
Interlocked.Increment(ref faceValue);
}
// method that reads the two values
public int[] GetStockQuote()
{
var currVal = faceValue;
return new int[] { currVal, currVal * 100 };
}
}
See Interlocked on MSDN.
Often I need to minimise object allocations within code that runs very frequently.
Of course I can use normal techniques like object pooling, but sometimes I just want something that's contained locally.
To try and achieve this, I came up with the below:
public static class Reusable<T> where T : new()
{
private static T _Internal;
private static Action<T> _ResetAction;
static Reusable()
{
_Internal = Activator.CreateInstance<T>();
}
public static void SetResetAction(Action<T> resetAction)
{
_ResetAction = resetAction;
}
public static T Get()
{
#if DEBUG
if (_ResetAction == null)
{
throw new InvalidOperationException("You must set the reset action first");
}
#endif
_ResetAction(_Internal);
return _Internal;
}
}
Currently, the usage would be:
// In initialisation function somewhere
Reusable<List<int>>.SetResetAction((l) => l.Clear());
....
// In loop
var list = Reusable<List<int>>.Get();
// Do stuff with list
What I'd like to improve, is the fact that the whole thing is not contained in one place (the .SetResetAction is separate to where it's actually used).
I'd like to get the code to something like below:
// In loop
var list = Reusable<List<int>>.Get((l) => l.Clear());
// Do stuff with list
The problem with this is that I get an object allocation (it creates an Action<T>) every loop.
Is it possible to get the usage I'm after without any object allocations?
Obviously I could create a ReusableList<T> which would have a built-in Action, but I want to allow for other cases where the action could vary.
Are you sure that creates a new Action<T> on each iteration? I suspect it actually doesn't, given that it doesn't capture any variables. I suspect if you look at the IL generated by the C# compiler, it will cache the delegate.
Of course, that's implementation-specific...
EDIT: (I was just leaving before I had time to write any more...)
As Eric points out in the comment, it's not a great idea to rely on this. It's not guaranteed, and it's easy to accidentally break it even when you don't change compiler.
Even the design of this looks worrying (thread safety?) but if you must do it, I'd probably turn it from a static class into a "normal" class which takes the reset method (and possibly the instance) in a constructor. That's a more flexible, readable and testable approach IMO.
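A minimal sketch of that "normal class" idea, assuming the instance and the reset action are both passed to the constructor. The delegate is created once, outside the loop, so nothing is allocated per iteration; like the original, this is not thread-safe on its own.

```csharp
using System;
using System.Collections.Generic;

public sealed class Reusable<T>
{
    private readonly T instance;
    private readonly Action<T> resetAction;

    // Both the instance and the way to reset it are fixed at construction time.
    public Reusable(T instance, Action<T> resetAction)
    {
        if (resetAction == null)
        {
            throw new ArgumentNullException("resetAction");
        }
        this.instance = instance;
        this.resetAction = resetAction;
    }

    // Resets the single shared instance and hands it out again.
    public T Get()
    {
        resetAction(instance);
        return instance;
    }
}

// Usage sketch:
//   var reusableList = new Reusable<List<int>>(new List<int>(), l => l.Clear());
//   // In loop
//   var list = reusableList.Get();
```

The reset action now sits next to the declaration of the reusable object, which keeps everything in one place.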
I learned about Lazy class in .Net recently and have been probably over-using it. I have an example below where things could have been evaluated in an eager fashion, but that would result in repeating the same calculation if called over and over. In this particular example the cost of using Lazy might not justify the benefit, and I am not sure about this, since I do not yet understand just how expensive lambdas and lazy invocation are. I like using chained Lazy properties, because I can break complex logic into small, manageable chunks. I also no longer need to think about where is the best place to initialize stuff - all I need to know is that things will not be initialized if I do not use them and will be initialized exactly once before I start using them. However, once I start using lazy and lambdas, what was a simple class is now more complex. I cannot objectively decide when this is justified and when this is an overkill in terms of complexity, readability, possibly speed. What would your general recommendation be?
// This is set once during initialization.
// The other 3 properties are derived from this one.
// Ends in .dat
public string DatFileName
{
get;
private set;
}
private Lazy<string> DatFileBase
{
get
{
// Removes .dat
return new Lazy<string>(() => Path.GetFileNameWithoutExtension(this.DatFileName));
}
}
public Lazy<string> MicrosoftFormatName
{
get
{
return new Lazy<string>(() => this.DatFileBase.Value + "_m.fmt");
}
}
public Lazy<string> OracleFormatName
{
get
{
return new Lazy<string>(() => this.DatFileBase.Value + "_o.fmt");
}
}
This is probably a little bit of overkill.
Lazy should usually be used when the generic type is expensive to create or evaluate, and/or when the generic type is not always needed in every usage of the dependent class.
More than likely, anything calling your getters here will need an actual string value immediately upon calling your getter. To return a Lazy in such a case is unnecessary, as the calling code will simply evaluate the Lazy instance immediately to get what it really needs. The "just-in-time" nature of Lazy is wasted here, and therefore, YAGNI (You Ain't Gonna Need It).
That said, the "overhead" inherent in Lazy isn't all that much. A Lazy is little more than a class referencing a lambda that will produce the generic type. Lambdas are relatively cheap to define and execute; they're just methods, which are given a generated name by the compiler. The instantiation of the extra class is the main kicker, and even then it's not terrible. However, it's unnecessary overhead from both a coding and performance perspective.
You said "I no longer need to think about where is the best place to initialize stuff".
This is a bad habit to get into. You should know exactly what's going on in your program.
You should use Lazy<> when there's an object that needs to be passed around but requires some computation to produce.
That way it is calculated only when it is actually used.
Besides that, you need to remember that the object you retrieve from the Lazy<> reflects the program's state at the time it is first used, not the state at the time the Lazy<> was requested.
This can be hard to debug later on if the object depends on state that matters to the program.
This does not appear to be using Lazy<T> for the purpose of saving creation/loading of an expensive object so much as it is to (perhaps unintentionally) be wrapping some arbitrary delegate for delayed execution. What you probably want/intend your derived property getters to return is a string, not a Lazy<string> object.
If the calling code looks like
string fileName = MicrosoftFormatName.Value;
then there is obviously no point, since you are "Lazy-Loading" immediately.
If the calling code looks like
var lazyName = MicrosoftFormatName; // Not yet evaluated
// some other stuff, maybe changing the value of DatFileName
string fileName2 = lazyName.Value;
then you can see there is a chance for fileName2 to not be determinable when the lazyName object is created.
It seems to me that Lazy<T> isn't best used for public properties; here your getters are returning new (as in brand new, distinct, extra) Lazy<string> objects, so each caller will (potentially) get a different .Value! All of your Lazy<string> properties depend on DatFileName being set at the time their .Value is first accessed, so you will always need to think about when that is initialized relative to the use of each of the derived properties.
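To make the "brand new Lazy per call" point concrete, here is a small sketch that counts how often the factory runs. LazyDemo and its members are hypothetical names for illustration only.

```csharp
using System;

public class LazyDemo
{
    // Counters showing how many times each factory delegate has run.
    public int FreshFactoryCalls;
    public int CachedFactoryCalls;

    private readonly Lazy<string> cached;

    public LazyDemo()
    {
        // One Lazy<string> for the lifetime of the object.
        cached = new Lazy<string>(() => { CachedFactoryCalls++; return "value"; });
    }

    // Pattern from the question: a new Lazy<string> on every property access,
    // so nothing is ever cached between accesses.
    public Lazy<string> FreshEachTime
    {
        get { return new Lazy<string>(() => { FreshFactoryCalls++; return "value"; }); }
    }

    // Private backing Lazy<string>, public plain string.
    public string CachedValue
    {
        get { return cached.Value; }
    }
}
```

Reading FreshEachTime.Value twice runs its factory twice, while reading CachedValue twice runs its factory once.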
See the MSDN article "Lazy Initialization" which creates a private Lazy<T> backing variable and a public property getter that looks like:
get { return _privateLazyObject.Value; }
Here is my guess at what your code should/might look like, using Lazy<string> to define your "set-once" base property:
// This is set up once (during object initialization) and
// evaluated once (the first time _datFileName.Value is accessed)
private Lazy<string> _datFileName = new Lazy<string>(() =>
{
string filename = null;
//Insert initialization code here to determine filename
return filename;
});
// The other 3 properties are derived from this one.
// Ends in .dat
public string DatFileName
{
get { return _datFileName.Value; }
private set { _datFileName = new Lazy<string>(() => value); }
}
private string DatFileBase
{
get { return Path.GetFileNameWithoutExtension(DatFileName); }
}
public string MicrosoftFormatName
{
get { return DatFileBase + "_m.fmt"; }
}
public string OracleFormatName
{
get { return DatFileBase + "_o.fmt"; }
}
Using Lazy for creating simple string properties is indeed overkill. Initializing an instance of Lazy with a lambda parameter is probably much more expensive than doing a single string operation. There's one more important argument that others haven't mentioned yet: remember that the lambda parameter is compiled into quite a complex structure, far more complex than a string concatenation.
The other area that is good to use lazy loading is in a type that can be consumed in a partial state. As an example, consider the following:
public class Artist
{
public string Name { get; set; }
public Lazy<Manager> Manager { get; internal set; }
}
In the above example, consumers may only need to use the Name property, so having to populate fields that may or may not be used could be a place for lazy loading. I say "could", not "should", as there are always situations where it may be more performant to load everything up front, depending on what your application needs to do.
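As a rough sketch of how such an Artist might be populated: the Manager class body, the ArtistLoader class, and the ManagerLoads counter are assumptions of mine, standing in for whatever expensive lookup would really be involved.

```csharp
using System;

public class Manager
{
    public string Name { get; set; }
}

public class Artist
{
    public string Name { get; set; }
    public Lazy<Manager> Manager { get; internal set; }
}

public static class ArtistLoader
{
    // Counts how many times the (pretend) expensive manager lookup has run.
    public static int ManagerLoads;

    public static Artist Load(string name)
    {
        return new Artist
        {
            Name = name,
            // The manager is not fetched here; the delegate runs only
            // when someone first reads artist.Manager.Value.
            Manager = new Lazy<Manager>(() =>
            {
                ManagerLoads++;
                return new Manager { Name = "manager of " + name };
            })
        };
    }
}
```

A consumer that only reads Name never pays for the manager lookup, and one that reads Manager.Value repeatedly pays for it once.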