I have this code:
public static class ProcessClass
{
public static string MyProp {get;set;}
public static void ProcessMethod(InputObject input)
{
if (String.IsNullOrEmpty(MyProp))
MyProp = input.Name();
//do stuff
MyProp = null;
}
}
Now, everything except for MyProp was already in place. I needed a way to track the original input.Name throughout the code, both because it can change and because ProcessMethod could get called again internally with a different input.Name value due to business rules, and I need to record that this second call originated from the first input.Name.
Obviously, this is bad: if two people do this at the same time, they will both share the same MyProp value, and simply setting it to null at the end seems dangerous and hacky. I don't really have the option of making this method non-static, because that would involve changing A LOT of the codebase.
What are ways around using a static property that still let me keep my original input.Name value without risking thread-safety issues? One option I am considering is to have every method accept an input.Name() and track it that way (and remove MyProp), but I can picture that getting out of hand fast and becoming extremely messy.
I don't need to have this be a property, but if it is it obviously needs to be static in this class.
Since this may be a multi-user environment, replace the string with a ConcurrentDictionary in which the key is a unique identifier for the user (an id, a name, etc.) and the value is the original input.Name().
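A minimal sketch of that idea, assuming each caller can supply some stable identifier. The class name OriginalNameTracker, the userId parameter, and the nestedName parameter (which only simulates the internal re-entrant call) are hypothetical and not part of the original code.

```csharp
using System.Collections.Concurrent;

public static class OriginalNameTracker
{
    // Key: hypothetical caller/user id. Value: the name seen by the outermost call.
    private static readonly ConcurrentDictionary<string, string> OriginalNames =
        new ConcurrentDictionary<string, string>();

    // Returns the original name as seen by the innermost call.
    public static string Process(string userId, string inputName, string nestedName = null)
    {
        // TryAdd succeeds only for the outermost call; nested calls leave the entry alone.
        bool isOutermost = OriginalNames.TryAdd(userId, inputName);
        try
        {
            if (nestedName != null)
            {
                // Simulates business rules re-entering with a different name.
                return Process(userId, nestedName);
            }
            return OriginalNames[userId];
        }
        finally
        {
            // Only the call that created the entry removes it.
            if (isOutermost)
            {
                string removed;
                OriginalNames.TryRemove(userId, out removed);
            }
        }
    }
}
```

Note that two simultaneous requests carrying the same id would still share one entry, so the key has to identify a single logical operation, not just a person.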
I think I need to reformulate my question.
My question is: what is the best way to get a single List that I can use throughout my whole project?
My code looks like this now:
public static class MessagingController
{
static List<MessagingDelivery> MessagingDeliveryList = Messaging.GetMessagingDeliveryList();
}
internal static class Messaging
{
static List<MessagingDelivery> MessagingDeliveryList;
static Messaging()
{ MessagingDeliveryList = new List<MessagingDelivery>(); }
internal static void CreateMessagingText(short reference, short number, string text)
{ MessagingDeliveryList.Add(new MessagingDelivery(reference, number, text)); }
internal static void ChangeMessagingDelivery(short reference, string status, string error)
{ MessagingDelivery.ChangeStatus(reference, status, error); }
internal static List<MessagingDelivery> GetMessagingDeliveryList()
{ return MessagingDeliveryList; }
}
Old question:
What is "best practice" for getting a static List<T>, and why?
Code 1:
public static List<MessagingDelivery> messagingDeliveryList
= Messaging.GetMessagingDeliveryList();
Code 2:
static List<MessagingDelivery> messagingDeliveryList
= Messaging.GetMessagingDeliveryList();
public static List<MessagingDelivery> MessagingDeliveryList
{ get { return messagingDeliveryList; } }
I assume Code 1 is the fastest way. Is there a good reason to use Code 2?
Neither. A static List<T> with a name that sounds like an actively used object (rather than, say, immutable configuration data) is not fast or slow: it is simply broken. It doesn't matter how fast broken code can run (although the faster it runs, the sooner and more often you will notice it break).
That aside, by the time the JIT has done inlining, there will rarely if ever be any appreciable difference between the 2 options shown.
Besides which: that simply isn't your bottleneck. For example, what are you going to do with the list? Search? Append? Remove from the right? From the left? Fetch by index? All these things are where the actual time is spent, not the list reference lookup.
While the first is going to be a hair faster, I would say that the second is going to be easier to maintain in the long-run by restricting access to the accessor.
To pick a mediocre example: If, in a few weeks, you suddenly need to deal with encryption or limited access rights, you only have one place to make the change. In the first example, you'd need to search the program for places that access your list, which is a far less effective use of your time. For security, especially, it might even be dangerous, if you start dumping access tokens or keys throughout the program.
So, it depends on what you need. In production, unless the few extra cycles for a method call/return are going to be significant for your purpose (which may well be the case in some situations), I'd go with the second.
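To illustrate the "one place to make the change" argument together with the thread-safety concern raised earlier, here is a rough sketch in which the list is never exposed directly. MessagingStore, Gate, and the use of string as the element type (standing in for MessagingDelivery, whose definition is not shown) are assumptions made for the example.

```csharp
using System.Collections.Generic;

public static class MessagingStore
{
    // Private lock object and private list: no outside code can touch either.
    private static readonly object Gate = new object();
    private static readonly List<string> Deliveries = new List<string>();

    public static void Add(string delivery)
    {
        lock (Gate)
        {
            Deliveries.Add(delivery);
        }
    }

    // Callers get a copy, so later Add calls cannot change what they hold.
    public static IReadOnlyList<string> Snapshot()
    {
        lock (Gate)
        {
            return Deliveries.ToArray();
        }
    }
}
```

Because every access goes through Add and Snapshot, the locking policy (or any later access check) lives in one class rather than at every call site.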
My application has an InstrumentFactory, the only place where I create Instrument instances. Each Instrument instance contains several fields, such as Ticker=MSFT and GateId=1, and also a unique Id=1.
I have now realized that I almost never need the Instrument instance. In 90% of cases I just need the Id. For example, I currently have this method:
public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
return instrumentInfos[instrument.Id];
}
We know that a parameter should not carry more information than the method requires. So this code should probably be refactored to:
public InstrumentInfo GetInstrumentInfo(int instrumentId)
{
return instrumentInfos[instrumentId];
}
90% of my code can now be refactored to use instrumentId instead of Instrument.
Should I do that? Changing Instrument to instrumentId everywhere will turn it into a hard requirement (each Instrument must have exactly one unique id). But what benefits will I get? In return for a "hard requirement" I want some benefit (speed, readability?), but I don't see any.
Using ids everywhere instead of the object is the wrong approach, and it goes against the spirit of OOP.
There are two big advantages to using the object itself:
It's type-safe. You can't accidentally pass something like Person to the first version, but you can accidentally pass person.Id to the second.
It makes your code easy to modify. If, in the future, you decide that you need long ids, or some other way to identify a unique Instrument, you won't need to change the calling code.
And you should probably change your dictionary too, it should be something like Dictionary<Instrument, InstrumentInfo>, not Dictionary<int, InstrumentInfo>, like you have now. This way, you get both of the advantages there too. To make it work, you need to implement equality in Instrument, which means properly overriding Equals() and GetHashCode() and ideally also implementing IEquatable<Instrument>.
It's always better to work in terms of objects than primitive values like integers. If tomorrow your requirements happen to change and you need more than just the ID, it is easy to add those to the Instrument object instead of changing all your code.
GetInstrumentInfo(int instrumentId);
This probably means that the client code has to have a:
GetInstrumentInfo(instrument.Id);
Don't let the users of your method worry about small details like that. Let them just pass the entire object and let your method do the work.
I don't see any major performance disadvantage either way, whether you pass an int or a reference to the actual object.
And if you later want to develop GetInstrumentInfo a bit more, it's easier to have access to the entire object than to just an int.
The first thing you need to ask yourself is this:
"If I have two instruments with ID == 53, then does that mean they are definitely the same instrument, no matter what? Or is there a meaningful case where they could be different?"
Assuming the answer is "they are both the same. If any other property differs, that is either a bug or because one such object was obtained after another, and that will resolve itself soon enough (when whatever thread of processing is using the older instrument, stops using it)" then:
First, internally, just use whatever you find handier. You'll quite likely find that this is to go by the int all the time, though you get some type-safety from insisting that an Instrument is passed to the method. This is especially true if all Instrument construction happens from an internal or private constructor accessed via factory methods, and there is no way for a user of the code to create a bogus Instrument with an id that doesn't match anything in your system.
Define equality as such:
public class Instrument : IEquatable<Instrument>
{
/* all the useful stuff you already have */
public bool Equals(Instrument other)
{
return other != null && Id == other.Id;
}
public override bool Equals(object other)
{
return Equals(other as Instrument);
}
public override int GetHashCode()
{
return Id;
}
}
Now, especially when we consider that the above is likely to be inlined most of the time, there is pretty much no implementation difference as to whether we use the ID or the object in terms of equality, and hence also in terms of using them as a key.
Now, you can define all of your public methods in any of the following means:
public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
return instrumentInfos[instrument];
}
Or:
public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
return instrumentInfos[instrument.Id];
}
Or:
public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
return GetInstrumentInfo(instrument.Id);
}
private InstrumentInfo GetInstrumentInfo(int instrumentID)
{
return instrumentInfos[instrumentID];
}
The performance impact will be the same, whichever you go for. The code presented to users will be type-safe and guarantee they don't pass in bogus values. The implementation picked can be simply that which you find more convenient for other reasons.
Since it won't cost you any more to use the instrument itself as a key internally, I'd still recommend you do that (the first of the three options above) as the type-safety and making it hard to pass in bogus values will then apply to your internal code too. If on the other hand you find that a set of calls keep just using the id anyway (if e.g. they are talking to a database layer to which only the ID means anything), then changing just those places becomes quick and easy for you, and hidden from the user.
You also give your users the ability to use your object as a key and to do quick equality comparisons, if it suits them to do so.
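A small sketch of what Id-based equality buys when the object itself is the dictionary key. The constructor, the Ticker property, the InstrumentKeyDemo class, and the use of string in place of InstrumentInfo are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;

public sealed class Instrument : IEquatable<Instrument>
{
    public Instrument(int id, string ticker)
    {
        Id = id;
        Ticker = ticker;
    }

    public int Id { get; private set; }
    public string Ticker { get; private set; }

    // Two instruments are equal exactly when their ids match.
    public bool Equals(Instrument other) { return other != null && Id == other.Id; }
    public override bool Equals(object obj) { return Equals(obj as Instrument); }
    public override int GetHashCode() { return Id; }
}

public static class InstrumentKeyDemo
{
    public static string Lookup()
    {
        var infos = new Dictionary<Instrument, string>();
        infos[new Instrument(1, "MSFT")] = "info for MSFT";

        // A different instance with the same Id finds the same entry.
        return infos[new Instrument(1, "MSFT")];
    }
}
```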
You could overload each function, one that takes an instrument and one that takes an id:
public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
// call GetInstrumentInfo passing the id of the object
return GetInstrumentInfo(instrument.Id);
}
public InstrumentInfo GetInstrumentInfo(int instrumentId)
{
return instrumentInfos[instrumentId];
}
This will give you enough flexibility so that while you go through any place that calls GetInstrumentInfo to change it to pass id, the current code will still function.
As to whether or not you "should" is purely up to you. You would have to weigh how much time it would take to change it versus the benefit of making the change in the code.
My question has to do with the following bit of code. I simplified the code to distill the question.
I understand that the lock keeps the foo Hashtable variable from being changed, but what about the variable outside the lock? We're seeing some odd behavior in code that looks like this and this was the question that came up. Thanks for any input.
using System.Collections;
namespace MultithreadScratch01
{
public class ThreadFoo
{
public void Foo(Stub2 stub2, Hashtable foo)
{
Stub1 bar;
var prop1 = stub2.Prop1;
var prop2 = stub2.Prop2;
var prop3 = stub2.Prop3;
var hash = string.Format("{0}_{1}_{2}", prop1, prop2, prop3);
lock(foo)
{
if(!foo.Contains(hash))
{
bar = new Stub1 {Foo = "some arbitrary string", Bar = 123};
foo.Add(hash, bar);
}
}
}
public class Stub1
{
public string Foo { get; set; }
public int Bar { get; set; }
}
public class Stub2
{
public string Prop1 { get; set; }
public string Prop2 { get; set; }
public string Prop3 { get; set; }
}
}
}
What you're doing here is a "worst practice" in multiple ways.
First off, there is no guarantee whatsoever that other threads that have reference to that hash table are also locking it before they read or write it. That's why this technique is so bad; it is difficult to make that guarantee.
Second, there is no guarantee whatsoever that other threads that have reference to that hash table instance are not locking it and failing to unlock it because they are buggy or hostile. You are potentially putting other people in charge of the correctness of your code, and that is a dangerous position to be in. A good rule is "never lock anything that external code can see". Never lock "this", never lock a Type object, and so on. There are times to make an exception to that rule, but I would want a good reason.
The right thing to do here is to use a ConcurrentDictionary in the first place. If you can't do that, then I would write a wrapper around Hashtable:
sealed class ThreadSafeHashTable
{
private readonly Hashtable hashTable = new Hashtable();
public void Add(object key, object value)
{
lock(this.hashTable)
{
...
Now (1) the locking is always done every time Add is called, and (2) only the code in this class can possibly take the lock, because the object that is locked is private and never passed out.
Only if you could not do either of those things would I do this the hard way. If you want to do it the hard way then you are going to have to track down every single place the hash table could possibly be used and ensure that the correct locking code is written.
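For readers who want to see the private-lock idea end to end, here is a separate, self-contained sketch. It is not a completion of the truncated snippet above: the class name LockedTable and the TryAdd and Get members are hypothetical choices made for this example.

```csharp
using System.Collections;

public sealed class LockedTable
{
    // The table is private and never handed out, so only this class can lock it.
    private readonly Hashtable table = new Hashtable();

    // Check-then-add happens atomically under the lock.
    public bool TryAdd(object key, object value)
    {
        lock (table)
        {
            if (table.Contains(key))
            {
                return false;
            }
            table.Add(key, value);
            return true;
        }
    }

    public object Get(object key)
    {
        lock (table)
        {
            return table[key];
        }
    }
}
```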
It isn't quite true to say that the lock "keeps the foo Hashtable variable from being changed". Put simply: no two threads locking on the same object can be inside the lock at the same time. If you have any code that doesn't lock on the same object (the hash table), it will walk right in and could well cause damage. As for the other variables: if anything is mutating the objects and isn't locking on the same lock object that you are, things could get funky. Actually, there are even some edge cases if it is locking on the same object (those would be solved by moving the property reads inside the lock). However, since those variables aren't "captured", once you have a snapshot of the values, the snapshot will not change.
You are somewhat mistaken on what the lock is doing. It is not preventing other threads from accessing or modifying foo, but it is actually preventing other threads from entering the code block the lock surrounds. If you are modifying foo elsewhere, you might see issues.
Since hash is a local variable, you shouldn't see any problems with it.
You might want to try using a ConcurrentDictionary instead of a Hashtable, it is designed for multi-threaded access.
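As a rough illustration of the ConcurrentDictionary suggestion, the check-then-add from the question can be collapsed into a single GetOrAdd call. StubCache and GetOrCreate are hypothetical names, and Stub1 is redeclared here only so the sketch stands alone.

```csharp
using System.Collections.Concurrent;

public class Stub1
{
    public string Foo { get; set; }
    public int Bar { get; set; }
}

public static class StubCache
{
    private static readonly ConcurrentDictionary<string, Stub1> Cache =
        new ConcurrentDictionary<string, Stub1>();

    public static Stub1 GetOrCreate(string prop1, string prop2, string prop3)
    {
        string key = string.Format("{0}_{1}_{2}", prop1, prop2, prop3);

        // The factory may run more than once under contention,
        // but only one value is ever stored and returned for a given key.
        return Cache.GetOrAdd(key, k => new Stub1 { Foo = "some arbitrary string", Bar = 123 });
    }
}
```

No explicit lock is needed, and there is no shared object that outside code could lock on.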
Locking on a parameter is a dangerous game as the guys above have alluded to since you don't know what is being done with that parameter outside of your method. Someone else could be modifying it at the same time.
You could wrap all the code in your method to make sure only one thread is in that code at a time, but since your Stub1 and Stub2 classes aren't thread safe (need to put locks on the properties) then even doing that wouldn't guarantee that the properties aren't being changed while you're reading them.
I learned about the Lazy class in .NET recently and have probably been over-using it. I have an example below where things could have been evaluated eagerly, but that would result in repeating the same calculation if called over and over. In this particular example the cost of using Lazy might not justify the benefit, and I am not sure about this, since I do not yet understand just how expensive lambdas and lazy invocation are. I like using chained Lazy properties, because I can break complex logic into small, manageable chunks. I also no longer need to think about where is the best place to initialize stuff: all I need to know is that things will not be initialized if I do not use them, and will be initialized exactly once before I start using them. However, once I start using Lazy and lambdas, what was a simple class becomes more complex. I cannot objectively decide when this is justified and when it is overkill in terms of complexity, readability, and possibly speed. What would your general recommendation be?
// This is set once during initialization.
// The other 3 properties are derived from this one.
// Ends in .dat
public string DatFileName
{
get;
private set;
}
private Lazy<string> DatFileBase
{
get
{
// Removes .dat
return new Lazy<string>(() => Path.GetFileNameWithoutExtension(this.DatFileName));
}
}
public Lazy<string> MicrosoftFormatName
{
get
{
return new Lazy<string>(() => this.DatFileBase.Value + "_m.fmt");
}
}
public Lazy<string> OracleFormatName
{
get
{
return new Lazy<string>(() => this.DatFileBase.Value + "_o.fmt");
}
}
This is probably a little bit of overkill.
Lazy should usually be used when the generic type is expensive to create or evaluate, and/or when the generic type is not always needed in every usage of the dependent class.
More than likely, anything calling your getters here will need an actual string value immediately upon calling your getter. To return a Lazy in such a case is unnecessary, as the calling code will simply evaluate the Lazy instance immediately to get what it really needs. The "just-in-time" nature of Lazy is wasted here, and therefore, YAGNI (You Ain't Gonna Need It).
That said, the "overhead" inherent in Lazy isn't all that much. A Lazy is little more than a class referencing a lambda that will produce the generic type. Lambdas are relatively cheap to define and execute; they're just methods, which are given a generated name by the compiler. The instantiation of the extra class is the main kicker, and even then it's not terrible. However, it's unnecessary overhead from both a coding and a performance perspective.
You said "I no longer need to think about where is the best place to initialize stuff".
This is a bad habit to get into. You should know exactly what's going on in your program.
You should use Lazy<> when there's an object that needs to be passed around but requires some computation to produce, so that it is calculated only when it is actually used.
Besides that, you need to remember that the object you retrieve from the Lazy does not reflect the program's state at the time it was requested; it reflects the state at the time .Value is first accessed.
This can be hard to debug later on if the objects you get that way are important to the program's state.
This does not appear to be using Lazy<T> for the purpose of saving creation/loading of an expensive object so much as it is to (perhaps unintentionally) be wrapping some arbitrary delegate for delayed execution. What you probably want/intend your derived property getters to return is a string, not a Lazy<string> object.
If the calling code looks like
string fileName = MicrosoftFormatName.Value;
then there is obviously no point, since you are "Lazy-Loading" immediately.
If the calling code looks like
var lazyName = MicrosoftFormatName; // Not yet evaluated
// some other stuff, maybe changing the value of DatFileName
string fileName2 = lazyName.Value;
then you can see there is a chance for fileName2 to not be determinable when the lazyName object is created.
It seems to me that Lazy<T> isn't best used for public properties; here your getters are returning new (as in brand new, distinct, extra) Lazy<string> objects, so each caller will (potentially) get a different .Value! All of your Lazy<string> properties depend on DatFileName being set at the time their .Value is first accessed, so you will always need to think about when that is initialized relative to the use of each of the derived properties.
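The difference between handing out a brand-new Lazy<string> on every access and caching a single one can be seen in a small sketch. The class FormatNames, its Evaluations counter, and the hard-coded "data_m.fmt" value are made up for the demonstration.

```csharp
using System;

public class FormatNames
{
    // Counts how many times a factory delegate actually ran.
    public int Evaluations;

    private readonly Lazy<string> cached;

    public FormatNames()
    {
        cached = new Lazy<string>(() => { Evaluations++; return "data_m.fmt"; });
    }

    // A new Lazy per access: every caller's .Value runs the factory again.
    public Lazy<string> Fresh
    {
        get { return new Lazy<string>(() => { Evaluations++; return "data_m.fmt"; }); }
    }

    // One private Lazy behind a plain string property: the factory runs at most once.
    public string Cached
    {
        get { return cached.Value; }
    }
}
```

Reading Fresh.Value twice evaluates the delegate twice, while reading Cached any number of times evaluates it once.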
See the MSDN article "Lazy Initialization" which creates a private Lazy<T> backing variable and a public property getter that looks like:
get { return _privateLazyObject.Value; }
Here is my guess at what your code should/might look like, using Lazy<string> to define your "set-once" base property:
// This is set up once (during object initialization) and
// evaluated once (the first time _datFileName.Value is accessed)
private Lazy<string> _datFileName = new Lazy<string>(() =>
{
string filename = null;
//Insert initialization code here to determine filename
return filename;
});
// The other 3 properties are derived from this one.
// Ends in .dat
public string DatFileName
{
get { return _datFileName.Value; }
private set { _datFileName = new Lazy<string>(() => value); }
}
private string DatFileBase
{
get { return Path.GetFileNameWithoutExtension(DatFileName); }
}
public string MicrosoftFormatName
{
get { return DatFileBase + "_m.fmt"; }
}
public string OracleFormatName
{
get { return DatFileBase + "_o.fmt"; }
}
Using Lazy to create simple string properties is indeed overkill. Initializing the Lazy instance with a lambda is probably much more expensive than doing a single string operation. There's one more important argument that others haven't mentioned yet: remember that the compiler turns the lambda into quite a complex structure, far more complex than a string concatenation.
The other area that is good to use lazy loading is in a type that can be consumed in a partial state. As an example, consider the following:
public class Artist
{
public string Name { get; set; }
public Lazy<Manager> Manager { get; internal set; }
}
In the above example, consumers may only need the Name property, so populating members that may or may not be used could be a place for lazy loading. I say "could", not "should", as there are always situations where it may be more performant to load everything up front, depending on what your application needs to do.
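One way such a partially loaded type might be populated is sketched below. The Manager class body, the ArtistLoader class, and its ManagerLoads counter are hypothetical stand-ins for whatever expensive lookup (a database call, say) would really sit behind the Lazy.

```csharp
using System;

public class Manager
{
    public string Name { get; set; }
}

public class Artist
{
    public string Name { get; set; }
    public Lazy<Manager> Manager { get; internal set; }
}

public static class ArtistLoader
{
    // Counts how often the (pretend) expensive manager lookup ran.
    public static int ManagerLoads;

    public static Artist Load(string name)
    {
        return new Artist
        {
            Name = name,
            // The delegate runs only if someone reads artist.Manager.Value.
            Manager = new Lazy<Manager>(() =>
            {
                ManagerLoads++;
                return new Manager { Name = "manager of " + name };
            })
        };
    }
}
```

A consumer that only reads Name never triggers the manager lookup; one that reads Manager.Value repeatedly triggers it once.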
For the first time, I created LINQ to SQL classes. I decided to look at the generated class and found this.
What... why is it doing if (sz != sz2) { sz = sz2; }? I don't understand. Why isn't the setter generated as this._Property1 = value?
private string _Property1;
[Column(Storage="_Property1", CanBeNull=false)]
public string Property1
{
get
{
return this._Property1;
}
set
{
if ((this._Property1 != value))
{
this._Property1 = value;
}
}
}
It only updates the property if it has changed. This is probably based on the assumption that a comparison is cheaper than updating the reference (and all the entailed memory management) that might be involved.
Where are you seeing that? The usual LINQ-to-SQL generated properties look like the following:
private string _Property1;
[Column(Storage="_Property1", CanBeNull=false)]
public string Property1 {
get {
return this._Property1;
}
set {
if ((this._Property1 != value)) {
this.OnProperty1Changing(value);
this.SendPropertyChanging();
this._Property1 = value;
this.SendPropertyChanged("Property1");
this.OnProperty1Changed();
}
}
}
And now it's very clear that the device is to avoid sending property changing/changed notifications when the property is not actually changing.
Now, it turns out that OnProperty1Changing and OnProperty1Changed are partial methods so that if you don't declare a body for them elsewhere the calls to those methods will not be compiled into the final assembly (so if, say, you were looking in Reflector you would not see these calls). But SendPropertyChanging and SendPropertyChanged are protected methods that can't be compiled out.
So, did you perhaps change a setting that prevents the property changing/changed notifications from being emitted by the code generator?
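The effect of the guard is easy to see outside of LINQ to SQL with a plain INotifyPropertyChanged class. The following is a hand-written sketch, not the generated code; the class name Row is an assumption.

```csharp
using System.ComponentModel;

public class Row : INotifyPropertyChanged
{
    private string property1;

    public event PropertyChangedEventHandler PropertyChanged;

    public string Property1
    {
        get { return property1; }
        set
        {
            // Without this check, every assignment would notify listeners,
            // even when the value stays the same.
            if (property1 != value)
            {
                property1 = value;
                PropertyChangedEventHandler handler = PropertyChanged;
                if (handler != null)
                {
                    handler(this, new PropertyChangedEventArgs("Property1"));
                }
            }
        }
    }
}
```

Assigning "a", then "a" again, then "b" raises the event twice rather than three times.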
Setting a field won't cause property change notifications, so that's not the reason.
I would guess that this design choice was driven by something like the following:
That string is an immutable reference type. Therefore the original and new instances are interchangeable. However the original instance may have been around longer and on average may therefore be slightly more expensive to collect (*). So performance may be better if the original instance is retained rather than being replaced by a new identical instance.
(*) The new value has in most cases only just been allocated, and won't be reused after the property is set. So it is very often a Gen0 object that is efficient to collect, whereas the original value's GC generation is unknown.
If this reasoning is correct, I wouldn't expect to see the same pattern for value-type properties (int, double, DateTime, ...).
But of course this is only speculation and I may be completely wrong.
Looks like there's persistence going on here. If something is using reflection (or a pointcut, or something) to create a SQL UPDATE query when _Property1 changes, then it'll be very much more expensive to update the field than to do the comparison.
It comes from Hejlsberg's Object Pascal roots... at least that's how most of the Borland Delphi VCL is implemented... ;)