Local variables in parallel foreach loops in C#

I had two nested foreach loops that need to run through a significant amount of data and calculation. Regular foreach loops took far too long (several hours).
So, I looked up ways to speed it up and found Parallel.ForEach. This is my first time dealing with parallelisation but the examples seem easy enough.
Below is my current code. I think the problem is with local variables (at least, that's my guess): errors are added for nodes that work fine outside of the parallel loops.
Parallel.ForEach(allNodes, (startNode) =>
{
    Parallel.ForEach(allNodes, (endNode) =>
    {
        if (startNode != endNode)
        {
            List<Geo_Model_Struct> route = pathfinder.getRouteOptimised(startNode, endNode);
            if (route.Count <= 0)
            {
                // failed to find route
                errors.Add(string.Format("Could not find a route from {0} to {1}", startNode, endNode));
            }
            else
            {
                List<Geo_Model_Struct> accessibleRoute = accessiblePathfinder.getRouteOptimised(startNode, endNode);
                if (accessibleRoute.Count <= 0)
                {
                    // failed to find route
                    errors.Add(string.Format("Could not find an accessible route from {0} to {1}", startNode, endNode));
                }
            }
        }
        endCount++;
        System.Diagnostics.Debug.WriteLine("I: {0}/{1}\tJ: {2}/{3}", startCount, allNodes.Count - 1, endCount, allNodes.Count - 1);
    });
    startCount++;
});
I'm guessing it's something to do with the route local variable being altered when it shouldn't be, as nearly all checked routes fail. But I don't know how to reliably debug this kind of thing, so any help is appreciated.
Edit:
I am testing all possible routes to make sure they all work. route.Count should be > 0 for most tests. When using traditional foreach loops this is the case (e.g. route.Count <= 0 is true for about 15 out of 500 routes).
When using Parallel.ForEach, route.Count is 0 most of the time (somewhere in the region of 494 out of 500 times), so very few routes actually pass the test. Looking at the errors produced, most routes that fail in parallel pass using the traditional foreach.
Solved
I found a way to remove the need to get data from the database within the getRouteOptimised method. This fixed the issue. Still not sure exactly what it is about the db connection that caused the problem, but it works now.

Without seeing the rest of your code, I suspect the issue is with the pathfinder and accessiblePathfinder objects. They may not be thread-safe. A possible way to circumvent this is to create those variables locally within the inner foreach loop.
if (startNode != endNode)
{
    // Create and Initialise pathfinder here
    MyPathFinderObject pathfinder = new MyPathFinderObject(<parameters>);
    List<Geo_Model_Struct> route = pathfinder.getRouteOptimised(startNode, endNode);
    if (route.Count <= 0)
        .../...
    else
    {
        // Create and Initialise accessiblePathfinder here
        MyAccessiblePathFinderObject accessiblePathfinder = new MyAccessiblePathFinderObject(<parameters>);
        List<Geo_Model_Struct> accessibleRoute = accessiblePathfinder.getRouteOptimised(startNode, endNode);
        .../...
    }
}
However, there is no guarantee that this will work.
From the docs:
You must be extremely cautious when getting data from properties and methods. Large object models are known for sharing mutable state in unbelievably devious ways.
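If constructing a new pathfinder for every pair turns out to be too expensive, one hedged alternative (not part of the original answer) is the Parallel.ForEach overload that carries per-thread local state via localInit/localFinally, so each worker thread builds a single pathfinder. A minimal sketch, reusing the placeholder MyPathFinderObject from above:

Parallel.ForEach(
    allNodes,
    // localInit: runs once per worker thread, so each thread gets its own pathfinder
    () => new MyPathFinderObject(<parameters>),
    // body: receives the element, the loop state and this thread's pathfinder
    (endNode, loopState, localPathfinder) =>
    {
        if (startNode != endNode)
        {
            List<Geo_Model_Struct> route = localPathfinder.getRouteOptimised(startNode, endNode);
            // ... same route checks as in the question ...
        }
        return localPathfinder;
    },
    // localFinally: nothing to clean up in this sketch
    localPathfinder => { });

Note also that if errors is a plain List<string>, calling errors.Add from parallel iterations is itself not thread-safe; a ConcurrentBag<string> from System.Collections.Concurrent is the usual replacement (see the related question below).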

Related

Parallel.ForEach loop is not working "it skips some and double do others"

I have two methods that can do the work for me: one is serial and the other is parallel.
The reason for parallelization is that there are lots of iterations (about 100,000 or so).
For some reason, the parallel one skips some iterations and does others twice, and I don't have any clue how to debug it.
The serial method
for (int i = somenum; i >= 0; i--)
{
    foreach (var nue in nuelist)
    {
        foreach (var path in nue.pathlist)
        {
            foreach (var conn in nue.connlist)
            {
                Func(conn, path);
            }
        }
    }
}
The parallel method
for (int i = somenum; i >= 0; i--)
{
    Parallel.ForEach(nuelist, nue =>
    {
        Parallel.ForEach(nue.pathlist, path =>
        {
            Parallel.ForEach(nue.connlist, conn =>
            {
                Func(conn, path);
            });
        });
    });
}
Inside Path class
Nue firstnue;

public void Func(Conn conn, Path path)
{
    List<Conn> list = new() { conn };
    list.AddRange(path.list);
    _ = new Path(list);
}

public Path(List<Conn> list)
{
    // other things
    firstnue.pathlist.Add(this);
    /*
       firstnue is another nue that will be
       in the next iteration of the for loop
    */
}
They are both the same method except, of course, for the foreach versus Parallel.ForEach loops.
The full code is in the project here (GitHub page).
List<T>, which I assume you use with firstnue.pathlist, isn't thread-safe. That means that when you add/remove items on the same List<T> from multiple threads at the same time, your data will get corrupted. In order to avoid that problem, the simplest solution is to use a lock, so multiple threads don't try to modify the list at once.
However, a lock essentially serializes the list operations, and if the only thing you do in Func is to change a list, you may not gain much by parallelizing the code. But, if you still want to give it a try, you just need to change this:
firstnue.pathlist.Add(this);
to this:
lock (firstnue.pathlist)
{
    firstnue.pathlist.Add(this);
}
Thanks to sedat-kapanoglu, I found that the problem really is about thread safety. The solution was to change every List<T> to ConcurrentBag<T>.
For everyone who, like me, runs into "parallel not working with collections": the solution is to change from System.Collections.Generic to System.Collections.Concurrent.
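A minimal sketch of that change, independent of the project above; the element type and counts here are placeholders chosen only to make the example self-contained:

using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

class Demo
{
    static void Main()
    {
        // ConcurrentBag<T> accepts Add calls from many threads at once without a lock,
        // at the cost of not preserving any particular order.
        var results = new ConcurrentBag<int>();

        Parallel.ForEach(Enumerable.Range(0, 100_000), i =>
        {
            results.Add(i * 2); // safe to call concurrently
        });

        Console.WriteLine(results.Count); // 100000
    }
}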

How to get most recent database entry and compare it to a textfield

I'm having an ongoing problem with the project I'm working on. I'm fairly new to C# and ASP.NET.
I'm currently trying to get an entry from a text field and compare it to the last entry in my database. My business rule is that the new Reading must not be lower than the previous year's Reading. I will have multiple Readings from different machines.
meterReading is the instance of my class MeterReading
This is currently what I have:
var checkMeterReading = (from p in db.MeterReading
                         where (p.Reading < meterReading.Reading)
                         select p);

if (checkMeterReading.Count() > 0)
{
    if (!String.IsNullOrEmpty())
    {
        // saves into DB
    }
}
else
{
    TempData["Error"] = "Meter Reading must be higher than last actual";
}
Don't know if I'm doing anything stupid or not. Thanks in advance
You're currently checking whether any reading in the database is less than the current reading; that's clearly not right, as you could have stored readings of 200, 5000, 12005 and be testing against 9000. There are 2 readings less than 9000, so your code would allow you to insert the 9000 at the end. What you want to check is that all the readings are less, or equivalently: that no reading is higher:
var higherExists = db.MeterReading.Any(p => p.Reading > newReading);
if (higherExists)
{
    // danger!
}
else
{
    // do the insert... as long as you're dealing with race conditions :)
}
Note that a better approach IMO would be to compare using time, since errors and meter replacements mean that the readings are not necessarily monotonic. Then you'd do something like:
var lastRow = db.MeterReading.OrderByDescending(p => p.ReadingDate).FirstOrDefault();
if (lastRow == null || lastRow.Reading < newReading)
{
    // fine
}
else
{
    // danger
}
Note that your current code only supports one customer and meter. You probably also need to filter the table by those.
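Putting those two notes together, a hedged sketch of the time-based check filtered per meter; MeterId is a hypothetical column that does not appear in the question:

// Assumes MeterReading rows carry Reading, ReadingDate and a (hypothetical) MeterId
// identifying which machine the reading came from.
var lastRow = db.MeterReading
    .Where(p => p.MeterId == meterReading.MeterId)
    .OrderByDescending(p => p.ReadingDate)
    .FirstOrDefault();

if (lastRow == null || lastRow.Reading < meterReading.Reading)
{
    // save the new reading
}
else
{
    TempData["Error"] = "Meter Reading must be higher than last actual";
}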

Recursion algorithm Stops running after finding the first Leaf node

I apologise in advance if there are many variables in the following code sample whose types are not clear to you; it is a big library and I just can't put all of it in here, so think of it at a high level. The names of the variables are kind of helpful too...
Problem: A "concept" can have many "relations". Each of those relations can also have many concepts. For example, like a father and child: a father has many children, and a child may itself be a father and have more children, etc...
So I want to pass the root father, get the whole hierarchy, and write it to a file...
The high-level code I am using is below. THE PROBLEM IS THAT it crashes with a null exception when it gets to a child that has no more children, so its object is null in this line:
oUCMRConceptReltn = moTargetConceptList.ConceptReltns.get_ItemByIndex(i, false);
So I thought, well, let's put a not-null check around it. That fixes the crash, BUT after it sees the first leaf, it doesn't go any further and the algorithm stops.
So something is wrong with the way I am calling the recursion, but I can't figure it out.
private void MyLoadMethod(string sConceptCKI)
{
    UCMRConceptLib.UCMRConceptLoadQual oUCMRConceptLoadQual = new UCMRConceptLib.UCMRConceptLoadQual();

    // Fill out UCMRConceptLoadQual object to get new list of related concepts
    moTargetConceptList.Load(oUCMRConceptLoadQual);

    // WHEN IT IS ZERO, THERE ARE NO MORE CHILDREN.
    int numberofKids = moTargetConceptList.ConceptReltns.Count();
    if (numberofKids == 0)
        return;

    for (int i = 1; i <= numberofKids; i++)
    {
        oUCMRConceptReltn = moTargetConceptList.ConceptReltns.get_ItemByIndex(i, false);

        // Get the concept linked to the relation concept
        if (oUCMRConceptReltn.SourceCKI == sConceptCKI)
        {
            oConcept = moTargetConceptList.ItemByKeyConceptCKI(oUCMRConceptReltn.TargetCKI, false);
        }
        else
        {
            oConcept = moTargetConceptList.ItemByKeyConceptCKI(oUCMRConceptReltn.SourceCKI, false);
        }

        // Write its name to the file... now recursion: go and find its children.
        builder.AppendLine("\t" + oConcept.PrimaryCTerm.SourceString);
        MyLoadMethod(oConcept.ConceptCKI);
    }

    return;
}
Just as a side note, the check for number of kids being 0 is redundant, because you're never going to enter the loop.
The algorithm looks okay for what you want to do. You don't need to return anything in this case because your algorithm uses a side effect (the appendLine) to give your output.
I don't know C#, but it looks to me as if you're using some variables that are not local to the function, like oUCMRConceptReltn and oConcept. If they're not local to the function, different recursive invocations can change those values in unexpected ways. Recursive functions should almost never write to variables outside their own scope.
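As a minimal sketch of that "keep it local" advice, the loop body could declare both variables inside the method so each recursive call works on its own copies. The UCMRConceptLib members are assumed to behave as shown in the question, and this only addresses the shared-variable point, not the shared moTargetConceptList field:

for (int i = 1; i <= numberofKids; i++)
{
    // Local to this invocation: recursive calls cannot overwrite these.
    var oUCMRConceptReltn = moTargetConceptList.ConceptReltns.get_ItemByIndex(i, false);
    var oConcept = (oUCMRConceptReltn.SourceCKI == sConceptCKI)
        ? moTargetConceptList.ItemByKeyConceptCKI(oUCMRConceptReltn.TargetCKI, false)
        : moTargetConceptList.ItemByKeyConceptCKI(oUCMRConceptReltn.SourceCKI, false);

    builder.AppendLine("\t" + oConcept.PrimaryCTerm.SourceString);
    MyLoadMethod(oConcept.ConceptCKI);
}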
Most indexes in C-style languages are 0-based, so don't loop from 1 to numberofKids; loop from 0 to numberofKids - 1.
for (int i = 0; i < numberofKids; i++)

C#/SQL Checking for duplicate entries

I am working on a project, and I have hit a brick wall. My code adds dates with a type to a database, but I need to raise an error when a similar entry already exists. However, I cannot get my loop to check for duplicates, and instead it adds duplicates! I am not very good at loops, so I'm a bit stuck on this. Any help checking for duplicate entries and stopping it from creating too many would be greatly appreciated! I have changed my code within this text area, so the variable names are not exactly the same.
Here is my code: -
if (DT != null && DT.Length > 0 || DF != null && DF.Length > 0)
{
    for (int t = 0; t < Type.Length; t++)
    {
        DateTime checkDate;
        if (Type.IsSelectionValid(8, out typeError) && DateTime.TryParse(DF, out typeError) && DateTime.TryParse(DT, out checkDate))
        {
            TypeValid = true;
            error.Error = false;
        }
        else
        {
            error.Errors = "Type-Invalid";
            absenceTypeValid = false;
            break;
        }
    }
    else
    {
        error.Errors = "Type-Duplicate";
        TypeValid = false;
        break;
    }
}
}
I'm 'fairly' sure you are going out of your way to make a problem more difficult than it is here, but I can't say for sure since I'm not entirely sure what this is doing.
But here are the conditions that need to be met to get to your Type-Duplicate Error line:
1) Either DT or DF has to be non-empty to get past the first if statement.
2) Either IsSelectionValid() has to return false, or DT or DF has to be an invalid DateTime.
None of those things constitute a duplicate.
Let me try to explain what I see here:
I first see variables called DT, DF. I can see these are dates, but that's all I know about them. I see 'Type' which I understand even less about than DT and DF. I see that you are doing a loop for Type.Length number of iterations... but what does this mean to me if I don't have a clue what Type is?
If you had comments explaining what things are I 'might' be able to help you, but there's just really not enough information to know what's happening here.
If you simply want to know how to avoid adding duplicates to a database, then I would suggest adding a constraint or index to the column in the database and then you can just catch the exceptions that are thrown when you try to insert a duplicate and deal with it that way. Alternatively, account for it in your insert statement.
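As a sketch of that constraint-plus-catch route: assuming a UNIQUE constraint already exists on the relevant column(s), the insert can simply be attempted and the duplicate reported when the database rejects it. The table, column and variable names below are invented for illustration:

using System.Data.SqlClient;

try
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        "INSERT INTO Absence (AbsenceDate, AbsenceType) VALUES (@date, @type)", conn))
    {
        cmd.Parameters.AddWithValue("@date", checkDate);
        cmd.Parameters.AddWithValue("@type", absenceType);
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}
catch (SqlException ex) when (ex.Number == 2627 || ex.Number == 2601)
{
    // 2627 = unique constraint violation, 2601 = duplicate row on a unique index
    error.Errors = "Type-Duplicate";
}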

Trouble implementing linked list sorting in C#

I am having trouble implementing a sort algorithm (merge sort) for a singly linked list as defined below.
My MergeSort method always gives me null... I am not able to figure out what is wrong.
Can you guys help me out?
Node class
public class Node
{
    private int data;
    private Node next;
}
Linked List class
public class SSL
{
    private Node head;
}
My merge sort code
public static void MergeSort(SSL a)
{
    SSL x = new SSL();
    SSL y = new SSL();

    if (a.Head == null || a.Head.Next == null) // base case if list has 0 or 1 element
        return;

    AlternateSplitting(a, x, y);
    MergeSort(x);
    MergeSort(y);
    a = SortedMerge(x, y);
}
I implemented following helper methods to implement merge sort
AlternateSplitting: This method will split the list into 2 lists
public static void AlternateSplitting(SSL src, SSL odd, SSL even)
{
    while (src.Head != null)
    {
        MoveNode(odd, src);
        if (src.Head != null)
            MoveNode(even, src);
    }
} // end of AlternateSplitting
SortedMerge: This method will merge the two lists and return a new list
public static SSL SortedMerge(SSL a, SSL b)
{
    SSL c = new SSL();

    if (a.Head == null)
        return b;
    else if (b.Head == null)
        return a;
    else
    {
        bool flagA = false;
        bool flagB = false;
        Node currentA = new Node();
        Node currentB = new Node();

        while (!flagA && !flagB)
        {
            currentA = a.Head;
            currentB = b.Head;

            if (currentA.Data < currentB.Data)
            {
                MoveNodeToEnd(a, c);
                currentA = a.Head;
                if (currentA == null)
                    flagA = true;
            }
            else if (currentA.Data > currentB.Data)
            {
                MoveNodeToEnd(b, c);
                currentB = b.Head;
                if (currentB == null)
                    flagB = true;
            }
        } // end of while

        if (flagA)
        {
            while (currentB != null)
            {
                MoveNodeToEnd(b, c);
                currentB = b.Head;
            }
        }
        else if (flagB)
        {
            while (currentA != null)
            {
                MoveNodeToEnd(a, c);
                currentA = a.Head;
            }
        }

        return c;
    } // end of outer else
} // end of function sorted merge
I am not able to figure out what is wrong. Can you guys help me out?
Find a bug and you fix it for a day. Teach how to find bugs and believe me, it takes a lifetime to fix the bugs. :-)
Your fundamental problem is not that the algorithm is wrong -- though, since it gives incorrect results, it certainly is wrong. But that's not the fundamental problem. The fundamental problem is that you don't know how to figure out where a program goes wrong. Fix that problem first! Learn how to debug programs.
Being able to spot the defect in a program is an acquired skill like any other -- you've got to learn the basics and then practice for hundreds of hours. So learn the basics.
Start by becoming familiar with the basic functions of your debugger. Make sure that you can step through programs, set breakpoints, examine local variables, and so on.
Then write yourself some debugging tools. They can be slow -- you're only going to use them when debugging. You don't want your debugging tools in the production version of your code.
The first debugging tool I would write is a method that takes a particular Node and produces a comma-separated list of the integers that are in the list starting from that node. So you'd say DumpNode(currentB) and what would come back is, say "{10,20,50,30}". Obviously doing the same for SSL is trivial if you can do it for nodes.
I would also write tools that do things like count nodes in a list, tell you whether a given list is already sorted, and so on.
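A sketch of what those two helpers might look like, assuming Head, Next and Data are accessible the way the rest of the question's code uses them:

// Debug-only helpers; requires using System.Collections.Generic for List<string>.
public static string DumpNode(Node start)
{
    var parts = new List<string>();
    for (Node current = start; current != null; current = current.Next)
        parts.Add(current.Data.ToString());
    return "{" + string.Join(",", parts) + "}";
}

public static int CountList(SSL list)
{
    int count = 0;
    for (Node current = list.Head; current != null; current = current.Next)
        count++;
    return count;
}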
Now you have something you can type into the watch window to more easily observe the changes to your data structures as they flow by. (There are ways to make the debugger do this rendering automatically, but we're discussing the basics here, so let's keep it simple.)
That will help you understand the flow of data through the program more easily. And that might be enough to find the problem. But maybe not. The best bugs are the ones that identify themselves to you, by waving a big red flag that says "there's a bug over here". The tool that turns hard-to-find bugs into self-identifying bugs is the debug assertion.
When you're writing your algorithm, think "what must be true?" at various points. For example, before AlternateSplitting runs, suppose the list has 10 items. When it is done running, the two resulting lists had better have 5 items each. If they don't, if they have 10 items each or 0 items each or one has 3 and the other has 7, clearly you have a bug somewhere in there. So start writing debug-only code:
public static void AlternateSplitting(SSL src, SSL odd, SSL even)
{
#if DEBUG
    int srcCount = CountList(src);
#endif

    while (src.Head != null) { blah blah blah }

#if DEBUG
    int oddCount = CountList(odd);
    int evenCount = CountList(even);
    Debug.Assert(CountList(src) == 0);
    Debug.Assert(oddCount + evenCount == srcCount);
    Debug.Assert(oddCount == evenCount || oddCount == evenCount + 1);
#endif
}
Now AlternateSplitting will do work for you in the debug build to detect bugs in itself. If your bug is because the split is not working out correctly, you'll know immediately when you run it.
Do the same thing to the list merging algorithm -- figure out every point where "I know that X must be true at this point", and then write a Debug.Assert(X) at that point. Then run your test cases. If you have a bug, then the program will tell you and the debugger will take you right to it.
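For instance, a hedged sketch of the kind of assertions SortedMerge could carry, reusing CountList from above plus a hypothetical IsSorted helper:

// Hypothetical debug helper: true when the list is in non-decreasing order.
public static bool IsSorted(SSL list)
{
    for (Node current = list.Head; current != null && current.Next != null; current = current.Next)
    {
        if (current.Data > current.Next.Data)
            return false;
    }
    return true;
}

// Inside SortedMerge: capture the expected size up front...
//     int expected = CountList(a) + CountList(b);
// ...and assert just before returning c:
//     Debug.Assert(CountList(c) == expected, "Merge lost or duplicated nodes");
//     Debug.Assert(IsSorted(c), "Merged list is not sorted");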
Good luck!
