Bad performance with Parallel.ForEach - C#

I have a Parallel.ForEach loop that performs terribly. I don't understand what in the code makes it so slow; it ran for about 15 minutes before I stopped it. I'm not sure whether it is some kind of memory leak or something similar, but the sequential version I had before converting it to parallel worked fine with the same base code.
Note that sourceFileRead and outboundFileRead are two large collections of around 100,000 IrfRecord items each. That's why I wanted to run the comparison in parallel, but it seems I'm doing something very wrong.
private async Task Compare(ConcurrentBag<IrfRecord> sourceFileRead, ConcurrentBag<IrfRecord> outboundFileRead)
{
await Task.Run(() =>
{
Parallel.ForEach(sourceFileRead, (src) =>
{
var matched = outboundFileRead.FirstOrDefault(ob => ob.Key == src.Key);
if (matched == null)
{
ReportMissing(src.Key);
return;
}
CompareEntities(src, matched);
});
});
}
private void CompareEntities(IrfRecord src, IrfRecord matched)
{
for (var i = 0; i < src.Body.Count - 1; i++)
{
if (src.Body[i].Settings.ToIgnore) continue;
if (src.Body[i].Value != matched.Body[i].Value)
{
ReportDiff(src.Body[i], matched.Body[i], src.Key);
}
}
}
private void ReportMissing(string srcKey)
{
_differencesCount.AddOrUpdate("Missing", new List<string> { srcKey },
(key, existingVal) =>
{
existingVal.Add(srcKey);
return existingVal;
});
}
private void ReportDiff(IrfRecordProperty srcProp, IrfRecordProperty obProp, string srcKey)
{
_differencesCount.AddOrUpdate(srcProp.Settings.Name, new List<string> { srcKey + $" LC Value: {srcProp.Value}, D Value: {obProp.Value}" },
(key, existingVal) =>
{
existingVal.Add(srcKey + $" LC Value: {srcProp.Value}, D Value: {obProp.Value}");
return existingVal;
});
}
Am I using Parallel.ForEach the wrong way?
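One likely cause, offered as a guess rather than a measured diagnosis: FirstOrDefault scans outboundFileRead linearly for every source record, and enumerating a ConcurrentBag works on a moment-in-time snapshot of its contents, so roughly 100,000 x 100,000 comparisons are made. A minimal sketch of a keyed lookup follows. Record and FindMissing are hypothetical names, Record stands in for IrfRecord, and the sketch assumes keys are unique in the outbound collection.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's IrfRecord; only Key matters here.
public sealed class Record
{
    public string Key { get; set; }
}

public static class KeyedCompare
{
    // Returns the keys from source that have no match in outbound, sorted.
    public static List<string> FindMissing(IEnumerable<Record> source, IEnumerable<Record> outbound)
    {
        // Build the lookup once instead of scanning outbound for every source record.
        // Assumption: keys are unique; ToDictionary throws on duplicates.
        Dictionary<string, Record> byKey = outbound.ToDictionary(r => r.Key);

        var missing = new ConcurrentQueue<string>();
        Parallel.ForEach(source, src =>
        {
            // Concurrent reads of a Dictionary are safe as long as nothing writes to it.
            if (!byKey.TryGetValue(src.Key, out Record matched))
            {
                missing.Enqueue(src.Key);
            }
        });
        return missing.OrderBy(k => k).ToList();
    }
}
```

With a lookup like this the sequential loop may already be fast enough, so it is worth timing both versions before keeping the parallel one.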

Related

Comparing the data

I am trying to compare a set of integers (say alpha) with the result I am getting (say result).
For each value in alpha, the output should be A if the value is in result and B otherwise, as in the example below:
alpha = 0,1,2,3,4,5,6,7
result = 0,5,6
The final answer should be ABBBBAAB
and what I am getting is ABBBBBBB BBBBBABB BBBBBBAB
The code:
public static int[] alpha = new int[8]
{
0,1,2,3,4,5,6,7
};
public static void Main(string[] args)
{
// Lines of code
foreach (var jagged in manager.JaggedList)
{
// Lines of code
foreach (var item in Items)
{
Console.Write(item.Number); //For Ex output here is (0,5,6)
List<int> result = new List<int>();
result.Add(item.Number);
foreach (var Var in result)
{
for (int i = 0; i < alpha.Length; i++)
{
if (result.Contains(alpha[i]))
{
Console.Write(alpha[A]);
}
else
{
Console.Write(alpha[B]);
}
}
}
Console.WriteLine();
}
}
}
If LINQ is acceptable, just use:
alpha
.Select(a => result.Contains(a) ? "A" : "B")
.ToList()
.ForEach(x => Console.Write(x));
Or, using foreach loops:
foreach(var a in alpha) {
var found = false;
foreach(var r in result) {
if(a == r) {
found = true;
}
}
Console.Write(found ? "A" : "B");
}
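Both snippets above assume result already holds all three numbers. In the question's loop, result is created anew for every item, so it contains a single number each time and one 8-character group is printed per item. A minimal sketch that gathers the numbers first and marks alpha once; AlphaMarker and Mark are hypothetical names, and numbers stands in for the item.Number values collected from the Items loop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AlphaMarker
{
    // Returns one character per entry of alpha: 'A' if the entry occurs in numbers, otherwise 'B'.
    public static string Mark(IEnumerable<int> alpha, IEnumerable<int> numbers)
    {
        // Build the lookup set once, outside any per-item loop.
        var result = new HashSet<int>(numbers);
        return string.Concat(alpha.Select(a => result.Contains(a) ? 'A' : 'B'));
    }
}
```

Calling AlphaMarker.Mark(new[] { 0, 1, 2, 3, 4, 5, 6, 7 }, new[] { 0, 5, 6 }) returns "ABBBBAAB".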

How to get return value from call back method in c#

I passed the "GetFilesRevisions_Results" method to GetAsync as the callback that handles the result, but I want to return an integer from "GetFilesRevisions_Results". How can I achieve this?
Thanks in advance!
private int GetVersionNumber(string i_sFileName)
{
#region Get latest version no.
int nVerNo = 0;
// RequestResult result;
try
{
OAuthUtility.GetAsync
(
"https://api.dropboxapi.com/1/revisions/auto/",
new HttpParameterCollection
{
{ "path", i_sFileName },
{ "access_token", accessToken },
{ "rev_limit", 1 }
},
callback: GetFilesRevisions_Results // ??? How can I access the returned value?
);
}
catch
{
}
return nVerNo;
#endregion
}
private int GetFilesRevisions_Results(RequestResult result)
{
int nVerNo = 0;
if (result.StatusCode == 200)
{
dynamic dynJson = JsonConvert.DeserializeObject(Convert.ToString(result));
foreach (var item in dynJson)
{
nVerNo = Convert.ToInt32(item.rev);
}
}
else
{
throw new Exception("Failed to get revisions of files");
}
return nVerNo;
}
#endregion Get version Number
There's no way for you to know when your callback will be invoked, thus a return value is not the proper way to get your int.
You could use an event with an int argument and raise it from within GetFilesRevisions_Results, just before the return. Any of the event's listeners can then use the integer value.
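The event idea can be sketched roughly as follows. RevisionFetcher, VersionReceived and FakeGetAsync are hypothetical names; FakeGetAsync only imitates an asynchronous API that invokes a callback later and is not the real OAuthUtility.GetAsync signature.

```csharp
using System;
using System.Threading.Tasks;

public sealed class RevisionFetcher
{
    // Raised once the asynchronous request has produced a version number.
    public event Action<int> VersionReceived;

    // Hypothetical stand-in for the real request: invokes the callback later on another thread.
    private static Task FakeGetAsync(string path, Action<string> callback)
    {
        return Task.Run(() => callback("42"));
    }

    public Task RequestVersion(string fileName)
    {
        return FakeGetAsync(fileName, OnResult);
    }

    // The callback returns nothing; it publishes the value through the event instead.
    private void OnResult(string rev)
    {
        int version = Convert.ToInt32(rev);
        VersionReceived?.Invoke(version);
    }
}
```

A caller subscribes before starting the request, for example fetcher.VersionReceived += v => Console.WriteLine(v); followed by fetcher.RequestVersion(fileName). The handler runs on whichever thread the callback arrives on, so UI code would still need to marshal back to the UI thread.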
You probably want to use a wait handle:
AutoResetEvent waitHandle = new AutoResetEvent(false);
int nVerNoGlobalTempHolder = 0;
private int GetVersionNumber(string i_sFileName)
{
#region Get latest version no.
for (int i = 0; i < 10 && nVerNoGlobalTempHolder != 0; i++)
{
//Someone is waiting for this callback already...
//Do something like:
Thread.Sleep(500);
}
// Still non-zero after waiting: another call never finished, so give up.
if (nVerNoGlobalTempHolder != 0) throw new Exception("timeout");
// RequestResult result;
try
{
OAuthUtility.GetAsync
(
"https://api.dropboxapi.com/1/revisions/auto/",
new HttpParameterCollection
{
{ "path", i_sFileName },
{ "access_token", accessToken },
{ "rev_limit", 1 }
},
callback: GetFilesRevisions_Results
);
}
catch
{
}
waitHandle.WaitOne();
int nVerNo =nVerNoGlobalTempHolder;
nVerNoGlobalTempHolder = 0;//Reset this in case you have multiple thread calling it
return nVerNo;
#endregion
}
private void GetFilesRevisions_Results(RequestResult result)
{
if (result.StatusCode == 200)
{
dynamic dynJson = JsonConvert.DeserializeObject(Convert.ToString(result));
foreach (var item in dynJson)
{
nVerNoGlobalTempHolder = Convert.ToInt32(item.rev);
}
}
else
{
throw new Exception("Failed to get revisions of files");
}
waitHandle.Set();
}
This also implements very basic synchronization in case more than one thread calls it. If you don't need that, remove the for loop at the start.

Finish two tasks then printing something

I have three tasks: a producer, a consumer, and a last one that prints something after the first two finish. However, the code never reaches the last task, so nothing is printed.
while (true)
{
ThreadEvent.WaitOne(waitingTime, false);
lock (SyncVar)
{
collection = new BlockingCollection<string>(4);
Task producer = Task.Run(() =>
{
if (list.Count > 0)
Console.WriteLine("Block begin");
while (!collection.IsAddingCompleted)
{
var firstItem = list.FirstOrDefault();
collection.TryAdd(firstItem);
list.Remove(firstItem);
}
collection.CompleteAdding();
});
Task consumer = Task.Run(() => DoConsume());
Task endTask = consumer.ContinueWith(i => Console.WriteLine("Block end"));// not print this line, why?
Task.WaitAll(producer, consumer, endTask);
if (ThreadState != State.Running) break;
}
}
Please look at my code logic.
EDIT:
DoConsume is more complicated:
public void DoConsume()
{
if (collection.Count > 0)
Console.WriteLine("There are {0} channels to be processed.", collection.Count);
var workItemBlock = new ActionBlock<string>(
workItem =>
{
bool result =ProcessEachChannel(workItem);
});
foreach (var workItem in collection.GetConsumingEnumerable())
{
workItemBlock.Post(workItem);
}
workItemBlock.Complete();
}
The problem is that your producer will never complete:
// This will run until after CompleteAdding is called
while (!collection.IsAddingCompleted)
{
var firstItem = list.FirstOrDefault();
collection.TryAdd(firstItem);
list.Remove(firstItem);
}
//... which doesn't happen until after the loop
collection.CompleteAdding();
It looks like you're just trying to add all of the items in your list, which should be as simple as:
Task producer = Task.Run(() =>
{
if (list.Count > 0)
Console.WriteLine("Block begin");
while(list.Any())
{
var firstItem = list.First();
collection.TryAdd(firstItem);
list.Remove(firstItem);
}
collection.CompleteAdding();
});
Or, a simpler method:
Task producer = Task.Run(() =>
{
if (list.Count > 0)
Console.WriteLine("Block begin");
foreach(var item in list)
{
collection.TryAdd(item);
}
list.Clear();
collection.CompleteAdding();
});
I used Reed Copsey's code but the error is still there, and I can't figure out why.
I think the flaw in my code is at while (!collection.IsAddingCompleted).
Because the collection is bounded at 4, suppose there are two items left in the collection. The condition collection.IsAddingCompleted is never met, so the code cannot leave the while loop.
I rewrote the code and it seems fine now. It is similar to the MSDN example; I used Take to retrieve the elements from the collection.
while (true)
{
ThreadEvent.WaitOne(waitingTime, false);
lock (SyncVar)
{
collection = new BlockingCollection<string>(4);
DoWork dc = new DoWork();
Task consumer = Task.Run(() =>
{
while (!collection.IsCompleted)
{
string data = "";
try
{
if (collection.Count > 0)
data = collection.Take();
}
catch (InvalidOperationException e)
{
Console.WriteLine(e.Message);
}
if (data != "")
{
bool result = dc.DoConsume(data);
}
}
});
Task producer = Task.Run(() =>
{
if (list.Count > 0)
Console.WriteLine("Block begin");
foreach (var item in list)
{
collection.Add(item);
}
list.Clear();
collection.CompleteAdding();
});
Task endTask = consumer.ContinueWith(i => Console.WriteLine("Block end"));
Task.WaitAll(producer, consumer, endTask);
if (ThreadState != State.Running) break;
}
}
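The consumer above polls collection.Count in a tight loop while it waits. A minimal sketch of the same producer/consumer shape using GetConsumingEnumerable, which blocks while the collection is empty and ends once CompleteAdding has been called and the remaining items are drained. Pipeline and Run are hypothetical names, and adding to a list stands in for DoConsume.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class Pipeline
{
    // Pushes every item through a bounded collection and returns what the consumer saw.
    public static List<string> Run(IEnumerable<string> items)
    {
        var consumed = new List<string>();
        using (var collection = new BlockingCollection<string>(4))
        {
            Task consumer = Task.Run(() =>
            {
                // No polling: the enumeration waits for items and finishes after CompleteAdding.
                foreach (string item in collection.GetConsumingEnumerable())
                {
                    consumed.Add(item); // stand-in for the real per-item work
                }
            });

            Task producer = Task.Run(() =>
            {
                try
                {
                    foreach (string item in items)
                    {
                        collection.Add(item); // blocks while 4 items are already queued
                    }
                }
                finally
                {
                    collection.CompleteAdding(); // always release the consumer
                }
            });

            Task.WaitAll(producer, consumer);
        }
        Console.WriteLine("Block end");
        return consumed;
    }
}
```

Calling CompleteAdding in a finally block means the consumer is released even if the producer throws, which is the situation that left the original code waiting forever.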

Is using params in a function like this good practice in C#?

FUNCTION
private void GetComboxItems(params int[] type)
{
try
{
/* DEPARTMENT CODE */
if (type[0] == 1)
{
cmbDept.Items.Clear();
using (SFCDataContext SFC = new SFCDataContext())
{
var Dept = (from i in SFC.Systems_SettingsDepartments
orderby i.Department_ID
select i);
foreach (var q in Dept)
{
cmbDept.Items.Add(q.Department_ID);
}
SFC.Connection.Close();
}
}
/* CORRECTIVE ACTION RECORD CODE */
if (type[1] == 1)
{
cmbCARNo.Items.Clear();
using (SFCDataContext SFC = new SFCDataContext())
{
var CarNo = (from i in SFC.Systems_CARLogSheets
orderby i.CARDocNo
where i.PostStatus == 0
select new
{
Code = i.CARDocNo
});
foreach (var w in CarNo)
{
cmbCARNo.Items.Add(w.Code);
}
SFC.Connection.Close();
}
}
/* MEASUREMENT CODE */
if (type[2] == 1)
{
cmbMeas.Items.Clear();
using (SFCDataContext SFC = new SFCDataContext())
{
var Measure = (from i in SFC.Systems_SettingsMeasurements
orderby i.Measurement_ID
where i.CategoryType == "Measurement"
select new
{
DESC = i.Measurement
});
foreach (var e in Measure)
{
cmbMeas.Items.Add(e.DESC);
}
SFC.Connection.Close();
}
}
/* SUB-MEASUREMENT CODE */
if (type[3] == 1)
{
cmbSubMeas.Items.Clear();
using (SFCDataContext SFC = new SFCDataContext())
{
var SubMeas = (from i in SFC.Systems_SettingsMeasurements
orderby i.Measurement_ID
where i.CategoryType == "Sub-Measurement"
select new
{
DESC = i.Measurement
});
foreach (var r in SubMeas)
{
cmbSubMeas.Items.Add(r.DESC);
}
SFC.Connection.Close();
}
}
}
catch (Exception ex)
{ MessageBox.Show(ex.Message.ToString()); }
}
FORM LOAD
private void frmSQMProductivityReports_Load(object sender, EventArgs e)
{
GetComboxItems(1, 0, 1, 0);
}
With this code, the first if condition is true, so the code inside that if statement runs, as expected. The second if condition is false, so its body is skipped. The third if condition is true, so it should behave like the first, but I have checked several times and the body of that if statement is skipped. Why is that? Is there something wrong in my code? It looks fine to me.
According to your input, the if conditions that are met are the first and the third. Note that statements can be "skipped" if an exception is thrown, so placing breakpoints there or writing log output may help you understand what is happening.
Side notes:
The use of params seems redundant in this case, since the number of arguments is fixed (params is mostly used when an unknown number of arguments should be passed).
Use bool rather than int for flags.
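The two side notes can be sketched as a signature with named bool parameters. ComboLoader and Select are hypothetical names, and returning the selected names is a simplification: the real method would run the corresponding queries and fill the combo boxes.

```csharp
using System;
using System.Collections.Generic;

public static class ComboLoader
{
    // Named bool parameters replace the positional params int[] flags.
    // Returns the names of the lists that would be reloaded (hypothetical simplification).
    public static List<string> Select(bool departments = false, bool carNumbers = false,
                                      bool measurements = false, bool subMeasurements = false)
    {
        var selected = new List<string>();
        if (departments) selected.Add("Departments");
        if (carNumbers) selected.Add("CarNumbers");
        if (measurements) selected.Add("Measurements");
        if (subMeasurements) selected.Add("SubMeasurements");
        return selected;
    }
}
```

The call GetComboxItems(1, 0, 1, 0) would then read ComboLoader.Select(departments: true, measurements: true), which also removes the risk of indexing past the end of a too-short params array.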

Getting weird result while using Task Parallel Library?

I am trying to do some filter task using TPL. Here I am simplifying the code to filter number based on condition. Here is the code.
public static void Main (string[] args)
{
IEnumerable<int> allData = getIntData ();
Console.WriteLine ("Complete Data display");
foreach (var item in allData) {
Console.Write(item);
Console.Write(" | ");
}
Console.WriteLine ();
filterAllDatas (ref allData, getConditions ());
foreach (var item in allData) {
Console.Write(item);
Console.Write(" | ");
}
Console.WriteLine ();
}
static void filterAllDatas(ref IEnumerable<int> data, IEnumerable<Func<int,bool>> conditions)
{
List<int> filteredData = data.ToList ();
List<Task> tasks = new List<Task>();
foreach (var item in data.AsParallel()) {
foreach (var condition in conditions.AsParallel()) {
tasks.Add(Task.Factory.StartNew(() => {
if (condition(item)) {
filteredData.Remove(item);
}
}));
}
}
Task.WaitAll(tasks.ToArray());
data = filteredData.AsEnumerable ();
}
static IEnumerable<Func<int,bool>> getConditions()
{
yield return (a) => { Console.WriteLine("modulo by 2"); return a % 2 == 0;};
yield return (a) => { Console.WriteLine("modulo by 3"); Thread.Sleep(3000); return a % 3 == 0;};
}
static IEnumerable<int> getIntData ()
{
for (int i = 0; i < 10; i++) {
yield return i;
}
}
This is simple code to filter out integers that are divisible by two or three. If I remove the Thread.Sleep call the code works perfectly, but with it in place it does not.
Normally, that is without Thread.Sleep, both conditions execute 10 times, once for every number. But if I add Thread.Sleep, the first condition executes 7 times and the second executes 13 times, and because of this a few numbers skip a condition. I tried to debug but didn't find anything that points to the issue in my code.
Is there a good way to achieve this, so that the filter conditions run asynchronously and in parallel to improve performance?
The code is for demo purposes only.
FYI: Currently I am using Mono with Xamarin Studio on a Windows machine.
Please let me know if any further details are needed.
I would guess it has to do with how your task's lambda closes over the loop variable condition. Try changing it as follows:
foreach (var condition in conditions.AsParallel()) {
var tasksCondition = condition;
tasks.Add(Task.Factory.StartNew(() => {
if (tasksCondition(item)) {
filteredData.Remove(item);
}
}));
}
Note you're also closing over the loop variable item, which could cause similar problems.
First, you can change your getConditions method to see what's happening inside:
static IEnumerable<Func<int, bool>> getConditions()
{
yield return (a) => { Console.WriteLine(a + " modulo by 2"); return a % 2 == 0; };
yield return (a) => { Console.WriteLine(a + " modulo by 3"); Thread.Sleep(3000); return a % 3 == 0; };
}
And if you stop capturing the foreach variables, it will work:
static void filterAllDatas(ref IEnumerable<int> data, IEnumerable<Func<int, bool>> conditions)
{
List<int> filteredData = data.ToList();
List<Task> tasks = new List<Task>();
foreach (var item in data.AsParallel())
{
var i = item;
foreach (var condition in conditions.AsParallel())
{
var c = condition;
tasks.Add(Task.Factory.StartNew(() =>
{
if (c(i))
{
filteredData.Remove(i);
}
}));
}
}
Task.WaitAll(tasks.ToArray());
data = filteredData.AsEnumerable();
}
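Separately from the captured variables, the tasks above all call Remove on the same List<int>, and List<T> is not safe for concurrent writes. A minimal sketch of the same filter using PLINQ, which builds a new list instead of mutating a shared one. This swaps in a different technique from the answers above, and ParallelFilter and Filter are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelFilter
{
    // Keeps the items for which no condition returns true.
    public static List<int> Filter(IEnumerable<int> data, IEnumerable<Func<int, bool>> conditions)
    {
        // Materialize once so a lazy iterator is not re-run for every item.
        List<Func<int, bool>> checks = conditions.ToList();
        return data.AsParallel()
                   .AsOrdered() // keep the input order in the result
                   .Where(x => !checks.Any(c => c(x)))
                   .ToList();
    }
}
```

For the numbers 0 to 9 with the two conditions from the question (divisible by 2, divisible by 3), the result is 1, 5, 7. Any stops at the first matching condition, so the second condition is not evaluated for even numbers.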
