Scenario
I have a line of code whereby I pass a good number of parameters into a method.
CODE as described above
foreach(Asset asset in assetList)
{
asset.ContributePrice(m_frontMonthPrice, m_Vol, m_divisor, m_refPrice, m_type,
m_overrideVol, i, m_decimalPlaces, metalUSDFID, metalEURFID);
}
What I really want to do...
What I really want to do is spawn a new thread every time I call this method so that the work gets done more quickly (there are a lot of assets).
Envisaged CODE
foreach(Asset asset in assetList)
{
Thread myNewThread =
new Thread(new ThreadStart(asset.ContributePrice (m_frontMonthPrice, m_Vol,
m_divisor, m_refPrice, m_type, m_overrideVol, i, m_decimalPlaces, metalUSDFID,
metalEURFID)));
myNewThread.Start();
}
ISSUES
This is something which has always bothered me: why can't I pass the parameters into the thread? What difference does it make?
I can't see a way around this that won't involve lots of refactoring.
This is an old application, built piece by piece as a result of feature creep.
Therefore, the code itself is messy and hard to read/follow.
I thought I had pinpointed an area to save some time and increase the processing speed but now I've hit a wall with this.
SUGGESTIONS?
Any help or suggestions would be greatly appreciated.
Cheers.
EDIT:
I'm using .NET 3.5. I could potentially update to .NET 4.0.
If you're using C# 3, the easiest way would be:
foreach(Asset asset in assetList)
{
Asset localAsset = asset;
ThreadStart ts = () => localAsset.ContributePrice (m_frontMonthPrice, m_Vol,
m_divisor, m_refPrice, m_type, m_overrideVol, i,
m_decimalPlaces, metalUSDFID, metalEURFID);
new Thread(ts).Start();
}
You need to take a "local" copy of the asset loop variable to avoid weird issues due to captured variables - Eric Lippert has a great blog entry on it.
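To make the captured-variable issue concrete, here is a minimal sketch using a plain `for` loop, where every lambda shares the one loop variable. The class and method names (`CaptureDemo`, `SharedVariable`, `LocalCopy`) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CaptureDemo
{
    // Every lambda captures the same variable i, so all of them
    // observe its final value once the loop has finished.
    public static List<int> SharedVariable()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            funcs.Add(() => i);
        return funcs.Select(f => f()).ToList();
    }

    // Each iteration declares a fresh local, so each lambda
    // captures its own copy.
    public static List<int> LocalCopy()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int local = i;
            funcs.Add(() => local);
        }
        return funcs.Select(f => f()).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", SharedVariable())); // 3,3,3
        Console.WriteLine(string.Join(",", LocalCopy()));      // 0,1,2
    }
}
```

From C# 5 onwards the `foreach` loop variable is scoped per iteration, so the local copy is only needed there for older compilers such as the C# 3 one discussed here; a `for` loop still behaves as shown.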
In C# 2 you could do the same with an anonymous method:
foreach(Asset asset in assetList)
{
Asset localAsset = asset;
ThreadStart ts = delegate { localAsset.ContributePrice(m_frontMonthPrice,
m_Vol, m_divisor, m_refPrice, m_type, m_overrideVol, i,
m_decimalPlaces, metalUSDFID, metalEURFID); };
new Thread(ts).Start();
}
In .NET 4 it would probably be better to use Parallel.ForEach. Even before .NET 4, creating a new thread for each item may well not be a good idea - consider using the thread pool instead.
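As a rough sketch of the .NET 4 suggestion, the loop could be handed to `Parallel.ForEach`. The `Asset` class below is a hypothetical stand-in with a shortened `ContributePrice` signature, and `ContributeAll` is an invented helper; the real method takes many more parameters.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the Asset class from the question.
class Asset
{
    public decimal LastPrice;

    public void ContributePrice(decimal frontMonthPrice, int vol)
    {
        LastPrice = frontMonthPrice * vol;
    }
}

static class ParallelDemo
{
    // Processes every asset on thread-pool threads and returns
    // how many assets were handled.
    public static int ContributeAll(List<Asset> assets, decimal price, int vol)
    {
        int processed = 0;
        Parallel.ForEach(assets, asset =>
        {
            asset.ContributePrice(price, vol);
            // The counter is shared between threads, so update it atomically.
            Interlocked.Increment(ref processed);
        });
        return processed;
    }

    static void Main()
    {
        var assets = new List<Asset>();
        for (int n = 0; n < 100; n++) assets.Add(new Asset());
        Console.WriteLine(ContributeAll(assets, 1.5m, 2)); // 100
    }
}
```

`Parallel.ForEach` returns only after all items are processed, and it passes each item to the lambda as a parameter, so no manual copy of the loop variable is required.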
Spawning a new thread for each task will most likely make the task run significantly slower. Use the thread pool for that as it amortizes the cost of creating new threads. If you're on .NET 4 take a look at the new Task class.
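For .NET 3.5, where `Task` is not available, one way to use the thread pool is `ThreadPool.QueueUserWorkItem` plus a counter that signals when the last item finishes. This is only a sketch: `PoolDemo` and `SumOfSquares` are invented names, and the squaring stands in for the real per-asset work.

```csharp
using System;
using System.Threading;

static class PoolDemo
{
    // Queues one work item per number and blocks until all have run.
    public static int SumOfSquares(int count)
    {
        if (count <= 0) return 0;

        int total = 0;
        int pending = count;
        using (var done = new ManualResetEvent(false))
        {
            for (int n = 1; n <= count; n++)
            {
                int local = n; // per-iteration copy for the closure
                ThreadPool.QueueUserWorkItem(state =>
                {
                    Interlocked.Add(ref total, local * local);
                    // The work item that brings the counter to zero wakes the caller.
                    if (Interlocked.Decrement(ref pending) == 0) done.Set();
                });
            }
            done.WaitOne();
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares(10)); // 385
    }
}
```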
If you need to pass parameters to a thread when starting it, you must use the ParameterizedThreadStart delegate. If you need to pass several parameters, consider encapsulating them in a type.
You could use ParameterizedThreadStart. You'll need to wrap all of your parameters into a single object. (Untested code below).
struct ContributePriceParams
{
public decimal FrontMonthPrice;
public int Vol;
//etc
}
//...
foreach(Asset asset in assetList)
{
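Asset localAsset = asset; // local copy so the delegate does not capture the loop variable
ContributePriceParams pStruct = new ContributePriceParams() {FrontMonthPrice = m_frontMonthPrice, Vol = m_Vol};
// ParameterizedThreadStart receives a single object, so unpack the struct inside the delegate
ParameterizedThreadStart pStart = delegate(object state)
{
ContributePriceParams p = (ContributePriceParams)state;
localAsset.ContributePrice(p.FrontMonthPrice, p.Vol /* etc */);
};
Thread newThread = new Thread(pStart);
newThread.Start(pStruct);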
}
Related
I have a method that takes an argument, runs it against a database, retrieves records, processes them, and saves the processed records to a new table. Running the method from the service with one parameter works. What I am trying to achieve now is to make the parameter dynamic. I have implemented a method to retrieve the parameters, and it works fine. Now I am trying to run the method in parallel for each parameter in the list. My current implementation is:
WorkerClass WorkerClass = new WorkerClass();
var ParametersList = WorkerClass.GetParams();
foreach (var item in ParametersList){
WorkerClass WorkerClass2 = new WorkerClass();
Parallel.Invoke(
()=>WorkerClass2.ProcessAndSaveMethod(item)
);
}
In the above implementation, I think creating a new WorkerClass2 for each item defeats the whole point of Parallel.Invoke, but I get mixed-up data when I reuse the already defined WorkerClass. The mix-up happens because the Oracle connection is opened inside the Init() method of the class and static DataTable DataCollectionList; is defined at class level, which creates the issue.
Inside the method ProcessAndSaveMethod(item) i have:
OracleCommand Command = new OracleCommand(Query, OracleConnection);
OracleDataAdapter Adapter = new OracleDataAdapter(Command);
Adapter.Fill(DataCollectionList);
Inside init():
try
{
OracleConnection = new OracleConnection(Passengers.OracleConString);
DataCollectionList = new DataTable();
OracleConnection.Open();
return true;
}
catch (Exception ex)
{
OracleConnection.Close();
DataCollectionList.Clear();
return false;
}
And the function isn't run in parallel as I intended. Is there another way to implement this?
To run it in parallel you need to call Parallel.Invoke only once with all the tasks to be completed:
Parallel.Invoke(
ParametersList.Select(item =>
new Action(() => new WorkerClass().ProcessAndSaveMethod(item)) // one worker per item, since WorkerClass2 only existed inside the loop
).ToArray()
);
If you have a list of somethings and want it processed in parallel, there really is no easier way than PLinq:
var parametersList = SomeObject.SomeFunction();
var resultList = parametersList.AsParallel()
.Select(item => new WorkerClass().ProcessAndSaveMethod(item))
.ToList();
The fact that you build up a new connection and use a lot of variables local to the one item you process is fine. It's actually the preferred way to do multi-threading: keep as much local to the thread as you can.
That said, you have to measure if multi-threading is actually the fastest way to solve your problem. Maybe you can do your processing sequentially and then do all your database stuff in one go with bulk inserts, temporary tables or whatever is suited to your specific problem. Splitting a task into smaller tasks for more processors to run is not always faster. It's a tool and you need to find out if that tool is helping in your specific situation.
I achieved parallel processing using the below code and also avoided null pointer exception from DbCon.open() caused by connection pooling using the max degree of parallelism parameter.
Parallel.ForEach(ParametersList , new ParallelOptions() { MaxDegreeOfParallelism = 5 }, item=>
{
WorkerClass Worker= new WorkerClass();
Worker.ProcessAndSaveMethod(item);
});
I'm writing a program that will analyze changes in the stock market.
Every time the candles on the stock charts are updated, my algorithm scans every chart for certain pieces of data. I've noticed that this process takes about 0.6 seconds each time, freezing my application. It's not getting stuck in a loop, and there are no other problems like exceptions slowing it down. It just takes a bit of time.
To solve this, I'm trying to see if I can thread the algorithm.
In order to call the algorithm to check over the charts, I have to call this:
checkCharts.RunAlgo();
As threads need an object, I'm trying to figure out how to run the RunAlgo(), but I'm not having any luck.
How can I have a thread run this method in my checkCharts object? Due to back propagating data, I can't start a new checkCharts object. I have to continue using that method from the existing object.
EDIT:
I tried this:
M4.ALProj.BotMain checkCharts = new ALProj.BotMain();
Thread algoThread = new Thread(checkCharts.RunAlgo);
For the checkCharts part of checkCharts.RunAlgo, it gives me, "An object reference is required for the non-static field, method, or property "M4.ALProj.BotMain"."
In a specific if statement, I was going to put the algoThread.Start(); Any idea what I did wrong there?
The answer to your question is actually very simple:
Thread myThread = new Thread(checkCharts.RunAlgo);
myThread.Start();
However, the more complex part is to make sure that when the method RunAlgo accesses variables inside the checkCharts object, this happens in a thread-safe manner.
See Thread Synchronization for help on how to synchronize access to data from multiple threads.
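As a small illustration of what thread-safe access can look like, the sketch below guards shared state with `lock`. `ChartChecker` is a hypothetical stand-in for the `BotMain` object, and its counter merely represents whatever fields `RunAlgo` really touches.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the BotMain object from the question.
class ChartChecker
{
    private readonly object _sync = new object();
    private int _scans;

    public void RunAlgo()
    {
        // Only one thread at a time may update the shared counter.
        lock (_sync) { _scans++; }
    }

    public int Scans
    {
        get { lock (_sync) { return _scans; } }
    }
}

static class LockDemo
{
    // Calls RunAlgo on one shared object from several threads.
    public static int RunConcurrently(int threadCount, int callsPerThread)
    {
        var checker = new ChartChecker();
        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() =>
            {
                for (int n = 0; n < callsPerThread; n++) checker.RunAlgo();
            });
            threads[t].Start();
        }
        foreach (var thread in threads) thread.Join();
        return checker.Scans;
    }

    static void Main()
    {
        Console.WriteLine(RunConcurrently(4, 1000)); // 4000
    }
}
```

Without the `lock`, the increments from different threads could interleave and the final count could come out lower than expected.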
I would rather use Task.Run than Thread. Task.Run utilizes the ThreadPool which has been optimized to handle various loads effectively. You will also get all the goodies of Task.
await Task.Run(() => checkCharts.RunAlgo());
Try this code block. It's basic boilerplate, but you can build on and extend it quite easily.
//If M4.ALProj.BotMain needs to be recreated for each run then comment this line and uncomment the one in DoRunParallel()
private static M4.ALProj.BotMain checkCharts = new M4.ALProj.BotMain();
private static object SyncRoot = new object();
private static System.Threading.Thread algoThread = null;
private static bool ReRunOnComplete = false;
public static void RunParallel()
{
lock (SyncRoot)
{
if (algoThread == null)
{
System.Threading.ThreadStart TS = new System.Threading.ThreadStart(DoRunParallel);
algoThread = new System.Threading.Thread(TS);
algoThread.Start(); // the thread does nothing until it is started
}
else
{
//Received a recalc call while still calculating
ReRunOnComplete = true;
}
}
}
public static void DoRunParallel()
{
bool ReRun = false;
try
{
//If M4.ALProj.BotMain needs to be recreated for each run then uncomment this line and comment private static version above
//M4.ALProj.BotMain checkCharts = new M4.ALProj.BotMain();
checkCharts.RunAlgo();
}
finally
{
lock (SyncRoot)
{
algoThread = null;
ReRun = ReRunOnComplete;
ReRunOnComplete = false;
}
}
if (ReRun)
{
RunParallel();
}
}
I have a Silverlight 5 app that depends on several asynchronous calls to web services to populate the attributes of newly created graphics. I am trying to find a way to handle those asynchronous calls synchronously. I have tried the suggestions listed in this article and this one. I have tried the many suggestions regarding the Dispatcher object. None have worked well, so I am clearly missing something...
Here is what I have:
public partial class MainPage : UserControl {
AutoResetEvent waitHandle = new AutoResetEvent(false);
private void AssignNewAttributeValuesToSplitPolygons(List<Graphic> splitGraphics)
{
for (int i = 0; i < splitGraphics.Count; i++)
{
Graphic g = splitGraphics[i];
Thread lookupThread1 = new Thread(new ParameterizedThreadStart(SetStateCountyUtm));
lookupThread1.Start(g);
waitHandle.WaitOne();
Thread lookupThread2 = new Thread(new ParameterizedThreadStart(SetCongressionalDistrict));
lookupThread2.Start(g);
waitHandle.WaitOne();
}
}
private void SetStateCountyUtm(object graphic)
{
this.Dispatcher.BeginInvoke(delegate() {
WrapperSetStateCountyUtm((Graphic)graphic);
});
}
private void WrapperSetStateCountyUtm(Graphic graphic)
{
GISQueryEngine gisQEngine = new GISQueryEngine();
gisQEngine.StateCountyUtmLookupCompletedEvent += new GISQueryEngine.StateCountyUtmLookupEventHandler(gisQEngine_StateCountyUtmLookupCompletedEvent);
gisQEngine.PerformStateCountyUtmQuery(graphic.Geometry, graphic.Attributes["clu_number"].ToString());
}
void gisQEngine_StateCountyUtmLookupCompletedEvent(object sender, StateCountyUtmLookupCompleted stateCountyUtmLookupEventArgs)
{
string fred = stateCountyUtmLookupEventArgs.
waitHandle.Set();
}
}
public class GISQueryEngine
{
public void PerformStateCountyUtmQuery(Geometry inSpatialQueryGeometry, string cluNumber)
{
QueryTask queryTask = new QueryTask(stateandCountyServiceURL);
queryTask.ExecuteCompleted += new EventHandler<QueryEventArgs>(queryTask_StateCountyLookupExecuteCompleted);
queryTask.Failed += new EventHandler<TaskFailedEventArgs>(queryTask_StateCountyLookupFailed);
Query spatialQueryParam = new ESRI.ArcGIS.Client.Tasks.Query();
spatialQueryParam.OutFields.AddRange(new string[] { "*" });
spatialQueryParam.ReturnGeometry = false;
spatialQueryParam.Geometry = inSpatialQueryGeometry;
spatialQueryParam.SpatialRelationship = SpatialRelationship.esriSpatialRelIntersects;
spatialQueryParam.OutSpatialReference = inSpatialQueryGeometry.SpatialReference;
queryTask.ExecuteAsync(spatialQueryParam, cluNumber);
}
//and a whole bunch of other stuff i can add if needed
}
If I leave the 'waitHandle.WaitOne()' method uncommented, no code beyond that method is ever called, at least that I can see with the step through debugger. The application just hangs.
If I comment out the 'waitHandle.WaitOne()', everything runs just fine - except asynchronously. In other words, when the app reads the Attribute values of the new graphics, those values may or may not be set depending on how quickly the asynch methods return.
Thanks for any help.
It's going to be rather difficult to work through a problem like this, as there are a few issues you'll need to address. Silverlight is asynchronous by nature, so forcing it to work synchronously is usually a very bad idea. You shouldn't do it unless it's absolutely necessary.
Is there a reason that you cannot wait for an async callback? From what I see, you appear to be making two calls for every state that is being rendered. I'm guessing the concern is that one call must complete before the second is made? In scenarios like this, I would kick off the first async call, and in its response kick off the second call, passing along the result you'll want to use from the first call. The second call's response updates the provided references.
However, in cases where you've got a significant number of states to update, this results in a rather chatty, and difficult to debug set of calls. I'd really be looking at creating a service call that can accept a set of state references and pass back a data structure set for the values to be updated all in one hit. (or at least grouping them up to one call per state if the batch will be too time consuming and you want to render/interact with visual elements as they load up.)
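The callback chaining described above might look roughly like the sketch below. Everything in it is hypothetical: `LookupStateCounty` and `LookupDistrict` simulate the two web-service calls on pool threads, and the strings they return are placeholders, not real service results.

```csharp
using System;
using System.Threading;

static class ChainDemo
{
    // Hypothetical async lookups: each invokes its callback on a pool thread.
    static void LookupStateCounty(string cluNumber, Action<string> completed)
    {
        ThreadPool.QueueUserWorkItem(_ => completed("county-for-" + cluNumber));
    }

    static void LookupDistrict(string county, Action<string> completed)
    {
        ThreadPool.QueueUserWorkItem(_ => completed("district-for-" + county));
    }

    // Starts the second lookup from inside the first lookup's callback,
    // so the calls run in order without blocking the caller.
    public static void LookupBoth(string cluNumber, Action<string> completed)
    {
        LookupStateCounty(cluNumber, county => LookupDistrict(county, completed));
    }

    // Console-only helper so the demo can print a result. Blocking like
    // this on a UI thread is what the chaining is meant to avoid.
    public static string LookupBothAndWait(string cluNumber)
    {
        string result = null;
        using (var done = new ManualResetEvent(false))
        {
            LookupBoth(cluNumber, district => { result = district; done.Set(); });
            done.WaitOne();
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(LookupBothAndWait("42")); // district-for-county-for-42
    }
}
```

In the real app the final callback would set the graphic's attributes instead of signalling a wait handle.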
Is it possible to instantiate a class in a separate thread without a compile time warning?
For example the below code gives the compile time error "Use of unassigned local variable BECheck". I would rather keep AvailabilityCheckBase abstract and not assign it some dummy variable. Creating both BTCheck and BECheck is slow which is why I need it threaded.
public static AvailabilityCheckBase ByDSL(string dsl)
{
AvailabilityCheckBase BECheck;
AvailabilityCheckBase BTCheck;
Thread BEThread = new Thread(new ThreadStart(() => BECheck = new BEAvailabilityCheck(dsl)));
Thread BTThread = new Thread(new ThreadStart(() => BTCheck = new BTAvailabilityCheck(dsl)));
BEThread.Start();
BTThread.Start();
BEThread.Join();
BTThread.Join();
return BECheck.Merge(BTCheck);
}
The language has no knowledge of the Thread constructor or the Join method: it can't tell that you will definitely assign values to both variables before Join returns. If you want to keep the current approach, you'll need to assign values to the variables first. I agree this is slightly ugly, but it's the only way of keeping the compiler happy here.
(It's not clear why you're creating two new threads here, given that your original thread is then blocking on both of them, by the way.)
A better approach if you're using .NET 4 would be to use Task<T>, which effectively gives you the "promise" of a value:
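// The explicit type argument is needed: Task<T> is not covariant, so a
// Task<BEAvailabilityCheck> cannot be assigned to Task<AvailabilityCheckBase>.
Task<AvailabilityCheckBase> beCheck =
Task.Factory.StartNew<AvailabilityCheckBase>(() => new BEAvailabilityCheck(dsl));
Task<AvailabilityCheckBase> btCheck =
Task.Factory.StartNew<AvailabilityCheckBase>(() => new BTAvailabilityCheck(dsl));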
return beCheck.Result.Merge(btCheck.Result);
It's worth becoming familiar with Task<T> and the TPL in general, as the new async features in C# 5 are heavily dependent on them.
Doesn't this fix your compile error? :
change
AvailabilityCheckBase BECheck;
AvailabilityCheckBase BTCheck;
to
AvailabilityCheckBase BECheck = null;
AvailabilityCheckBase BTCheck = null;
In order to call BECheck.Merge in your last line, BECheck should be initialized, and the compiler doesn't know it will be created before Thread.Join.
Try writing
AvailabilityCheckBase BECheck = null;
AvailabilityCheckBase BTCheck = null;
in the first lines.
If you initialise the variables to null, you should see the message disappear, and it is good practice anyway. There also doesn't appear to be any check that the initialisations worked: you should probably verify that BECheck and BTCheck are not still null at the end of the function, before you return, to avoid an exception being thrown.
Use tasks:
// StartNew both creates and starts each task; the lambdas return the new objects
Task<AvailabilityCheckBase> BETask = Task.Factory.StartNew<AvailabilityCheckBase>(() => new BEAvailabilityCheck(dsl));
Task<AvailabilityCheckBase> BTTask = Task.Factory.StartNew<AvailabilityCheckBase>(() => new BTAvailabilityCheck(dsl));
Task.WaitAll(BETask, BTTask); // WaitAll is a static method on Task
AvailabilityCheckBase BECheck = BETask.Result;
AvailabilityCheckBase BTCheck = BTTask.Result;
I wanted to parallelize a piece of code, but the code actually got slower, probably because of the overhead of Barrier and BlockingCollection. There would be 2 threads, where the first would find pieces of work which the second one would operate on. Neither operation is much work, so the overhead of switching safely would quickly outweigh the benefit of using two threads.
So I thought I would try to write some code myself to be as lean as possible, without using Barrier etc. It does not behave consistently, however. Sometimes it works, sometimes it does not, and I can't figure out why.
This code is just the mechanism I use to try to synchronize the two threads. It doesn't do anything useful, just the minimum amount of code you need to reproduce the bug.
So here's the code:
// node in linkedlist of work elements
class WorkItem {
public int Value;
public WorkItem Next;
}
static void Test() {
WorkItem fst = null; // first element
Action create = () => {
WorkItem cur=null;
for (int i = 0; i < 1000; i++) {
WorkItem tmp = new WorkItem { Value = i }; // create new comm class
if (fst == null) fst = tmp; // if it's the first add it there
else cur.Next = tmp; // else add to back of list
cur = tmp; // this is the current one
}
cur.Next = new WorkItem { Value = -1 }; // -1 means stop element
#if VERBOSE
Console.WriteLine("Create is done");
#endif
};
Action consume = () => {
//Thread.Sleep(1); // this also seems to cure it
#if VERBOSE
Console.WriteLine("Consume starts"); // especially this one seems to matter
#endif
WorkItem cur = null;
int tot = 0;
while (fst == null) { } // busy wait for first one
cur = fst;
#if VERBOSE
Console.WriteLine("Consume found first");
#endif
while (true) {
if (cur.Value == -1) break; // if stop element break;
tot += cur.Value;
while (cur.Next == null) { } // busy wait for next to be set
cur = cur.Next; // move to next
}
Console.WriteLine(tot);
};
try { Parallel.Invoke(create, consume); }
catch (AggregateException e) {
Console.WriteLine(e.Message);
foreach (var ie in e.InnerExceptions) Console.WriteLine(ie.Message);
}
Console.WriteLine("Consume done..");
Console.ReadKey();
}
The idea is to have a linked list of work items. One thread adds items to the back of that list, and another thread reads them, does something, and polls the Next field to see if it is set. As soon as it is set, it moves to the new item and processes it. It polls the Next field in a tight busy loop because it should be set very quickly. Going to sleep, context switching etc. would kill the benefit of parallelizing the code.
The time it takes to create a workitem would be quite comparable to executing it, so the cycles wasted should be quite small.
When I run the code in release mode, sometimes it works, sometimes it does nothing. The problem seems to be in the 'Consumer' thread, the 'Create' thread always seems to finish. (You can check by fiddling with the Console.WriteLines).
It has always worked in debug mode. In release it is about 50% hit and miss. Adding a few Console.WriteLines helps the success ratio, but even then it's not 100% (the #define VERBOSE stuff).
When I add the Thread.Sleep(1) in the 'Consumer' thread it also seems to fix it. But not being able to reproduce a bug is not the same thing as knowing for sure it's fixed.
Does anyone here have a clue as to what goes wrong here? Is it some optimization that creates a local copy or something that does not get updated? Something like that?
There's no such thing as a partial update, right? Like a data race, but where one thread is half done writing and the other thread reads the partially written memory? Just checking.
Looking at it, I think it should just work. I guess once every few times the threads arrive in a different order and that makes it fail, but I don't get how. And how could I fix this without slowing it down?
Thanks in advance for any tips,
Gert-Jan
I do my damn best to avoid the utter minefield of closure/stack interaction at all costs.
This is PROBABLY a (language-level) race condition, but without reflecting Parallel.Invoke I can't be sure. Basically, sometimes fst is being changed by create() and sometimes not. Ideally, it should NEVER be changed (if C# had good closure behaviour). It could be due to which thread Parallel.Invoke chooses to run create() and consume() on. If create() runs on the main thread, it might change fst before consume() takes a copy of it. Or create() might be running on a separate thread and taking a copy of fst. Basically, as much as I love C#, it is an utter pain in this regard, so just work around it and treat all variables involved in a closure as immutable.
To get it working:
//Replace
WorkItem fst = null
//with
WorkItem fst = WorkItem.GetSpecialBlankFirstItem();
//And
if (fst == null) fst = tmp;
//with
if (fst.Next == null) fst.Next = tmp;
A thread is allowed by the spec to cache a value indefinitely.
see Can a C# thread really cache a value and ignore changes to that value on other threads? and also http://www.yoda.arachsys.com/csharp/threads/volatility.shtml
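One possible way to stop the spin loops from reading a cached value is to mark the polled fields `volatile`. The sketch below is an assumption-laden rework of the question's code, not a verified fix for the original program: the head of the list moves from a captured local to a static field (locals cannot be `volatile`), explicit threads replace `Parallel.Invoke`, and `VolatileDemo`, `Run`, and `first` are invented names.

```csharp
using System;
using System.Threading;

class WorkItem
{
    public int Value;
    // volatile: each read in the spin loop goes to memory instead of
    // being hoisted out of the loop by the JIT.
    public volatile WorkItem Next;
}

static class VolatileDemo
{
    static volatile WorkItem first;

    public static int Run(int count)
    {
        first = null;
        int total = 0;

        var consumer = new Thread(() =>
        {
            while (first == null) { }          // busy wait for the first item
            WorkItem cur = first;
            while (cur.Value != -1)            // -1 marks the stop element
            {
                total += cur.Value;
                while (cur.Next == null) { }   // busy wait for the next item
                cur = cur.Next;
            }
        });
        consumer.Start();

        // Producer runs on the calling thread.
        WorkItem last = null;
        for (int i = 0; i < count; i++)
        {
            var tmp = new WorkItem { Value = i };
            if (last == null) first = tmp; else last.Next = tmp;
            last = tmp;
        }
        var stop = new WorkItem { Value = -1 };
        if (last == null) first = stop; else last.Next = stop;

        consumer.Join(); // after Join, the consumer's writes to total are visible
        return total;
    }

    static void Main()
    {
        Console.WriteLine(Run(1000)); // 499500
    }
}
```

Each item's `Value` is set before the item is published through a volatile write, so the consumer never sees a half-initialised node. Whether busy waiting beats `BlockingCollection` for the real workload would still need measuring.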