Parallelize transitive reduction - c#

I have a Dictionary<int, List<int>>, where the Key represents an element of a set (or a vertex in an oriented graph) and the List is a set of other elements which are in relation with the Key (so there are oriented edges from Key to Values). The dictionary is optimized for creating a Hasse diagram, so the Values are always smaller than the Key.
I have also a simple sequential algorithm, that removes all transitive edges (e.g. I have relations 1->2, 2->3 and 1->3. I can remove the edge 1->3, because I have a path between 1 and 3 via 2).
for(int i = 1; i < dictionary.Count; i++)
{
for(int j = 0; j < i; j++)
{
if(dictionary[i].Contains(j))
dictionary[i].RemoveAll(r => dictionary[j].Contains(r));
}
}
Would it be possible to parallelize the algorithm? I could do Parallel.For for the inner loop. However, this is not recommended (https://msdn.microsoft.com/en-us/library/dd997392(v=vs.110).aspx#Anchor_2) and the resulting speed would not increase significantly (+ there might be problems with locking). Could I parallelize the outer loop?

There is simple way to solve the parallelization problem, separate data. Read from original data structure and write to new. That way You can run it in parallel without even need to lock.
But probably the parallelization is not even necessary, the data structures are not efficient. You use dictionary where array would be sufficient (as I understand the code You have vertices 0..result.Count-1). And List<int> for lookups. List.Contains is very inefficient. HashSet would be better. Or, for more dense graphs, BitArray. So instead of Dictionary<int, List<int>> You can use BitArray[].
I rewrote the algorithm and made some optimizations. It does not make plain copy of the graph and delete edges, it just construct the new graph from only the right edges. It uses BitArray[] for input graph and List<int>[] for final graph, as the latter one is far more sparse.
int sizeOfGraph = 1000;
//create vertices of a graph
BitArray[] inputGraph = new BitArray[sizeOfGraph];
for (int i = 0; i < inputGraph.Length; ++i)
{
inputGraph[i] = new BitArray(i);
}
//fill random edges
Random rand = new Random(10);
for (int i = 1; i < inputGraph.Length; ++i)
{
BitArray vertex_i = inputGraph[i];
for(int j = 0; j < vertex_i.Count; ++j)
{
if(rand.Next(0, 100) < 50) //50% fill ratio
{
vertex_i[j] = true;
}
}
}
//create transitive closure
for (int i = 0; i < sizeOfGraph; ++i)
{
BitArray vertex_i = inputGraph[i];
for (int j = 0; j < i; ++j)
{
if (vertex_i[j]) { continue; }
for (int r = j + 1; r < i; ++r)
{
if (vertex_i[r] && inputGraph[r][j])
{
vertex_i[j] = true;
break;
}
}
}
}
//create transitive reduction
List<int>[] reducedGraph = new List<int>[sizeOfGraph];
Parallel.ForEach(inputGraph, (vertex_i, state, ii) =>
{
{
int i = (int)ii;
List<int> reducedVertex = reducedGraph[i] = new List<int>();
for (int j = i - 1; j >= 0; --j)
{
if (vertex_i[j])
{
bool ok = true;
for (int x = 0; x < reducedVertex.Count; ++x)
{
if (inputGraph[reducedVertex[x]][j])
{
ok = false;
break;
}
}
if (ok)
{
reducedVertex.Add(j);
}
}
}
}
});
MessageBox.Show("Finished, reduced graph has "
+ reducedGraph.Sum(s => s.Count()) + " edges.");
EDIT
I wrote this:
The code has some problems. With the direction i goes now, You can delete edges You would need and the result would be incorrect. This turned out to be a mistake. I was thinking this way, lets have a graph
1->0
2->1, 2->0
3->2, 3->1, 3->0
Vertex 2 gets reduced by vertex 1, so we have
1->0
2->1
3->2, 3->1, 3->0
Now vertex 3 gets reduced by vertex 2
1->0
2->1
3->2, 3->0
And we have a problem, as we can not reduce 3->0 which stayed here because of reduced 2->0. But it is my mistake, this would never happen. The inner cycle goes strictly from lower to higher, so instead
Vertex 3 gets reduced by vertex 1
1->0
2->1
3->2, 3->1
and now by vertex 2
1->0
2->1
3->2
And the result is correct. I apologize for the error.

Related

How to optimize this loop that purges duplicate vertices?

I have this bit of code here
var vertexIndexDictionary = new Dictionary<Vector3, int>();
for (int i = 0; i < triangles.Length; i++)
{
for (int j = 0; j < 3; j++)
{
var vertex = triangles[i][j];
if (!vertexIndexDictionary.ContainsKey(vertex))
{
vertexIndexDictionary.Add(vertex, vertexIndexDictionary.Count);
}
}
}
var vertices = vertexIndexDictionary.Keys.ToArray();
which goes through the triangles array and gets rid of duplicate vertices. This triangle array can get very large, and so the running time gets really long as well. Is there some way I can achieve the same thing but faster? E.g. with another data type?
Edit:
Triangles array is initialized like this:
var triangles = new Triangle[count];
and triangle struct is
struct Triangle {
public Vector3 a;
public Vector3 b;
public Vector3 c;
public Vector3 this[int i]
{
get
{
switch (i)
{
case 0:
return a;
case 1:
return b;
default:
return c;
}
}
}
}
You could get direct access to the entry stored inside the vertexIndexDictionary, with the low-level CollectionsMarshal.GetValueRefOrAddDefault method. Using this API you can reduce the dictionary searching operations from 2-3 to just 1 per iteration. Also you could consider avoiding using the LINQ ToArray method, because LINQ in general is not the tool of choice when performance is of uttermost importance. In this particular case avoiding the ToArray is unlikely to provide any tangible benefit (because the LINQ happens to be internally optimized for sources that implement the ICollection<T> interface), but I'll show you how to do it anyway:
for (int i = 0; i < triangles.Length; i++)
{
for (int j = 0; j < 3; j++)
{
Vector3 vertex = triangles[i][j];
ref int valueRef = ref CollectionsMarshal.GetValueRefOrAddDefault(
vertexIndexDictionary, vertex, out bool exists);
if (!exists) valueRef = vertexIndexDictionary.Count;
indices[i * 3 + j] = valueRef;
}
}
Vector3[] vertices = new Vector3[vertexIndexDictionary.Keys.Count];
vertexIndexDictionary.Keys.CopyTo(vertices, 0);

Cache friendly and faster way faster - `InvokeMe()`

I am trying to figure out, if this really is the fastest approach. I want this to be as fast as possible, cache friendly, and serve a good time complexity.
DEMO: https://dotnetfiddle.net/BUGz8s
private static void InvokeMe()
{
int hz = horizontal.GetLength(0) * horizontal.GetLength(1);
int vr = vertical.GetLength(0) * vertical.GetLength(1);
int hzcol = horizontal.GetLength(1);
int vrcol = vertical.GetLength(1);
//Determine true from Horizontal information:
for (int i = 0; i < hz; i++)
{
if(horizontal[i / hzcol, i % hzcol] == true)
System.Console.WriteLine("True, on position: {0},{1}", i / hzcol, i % hzcol);
}
//Determine true position from vertical information:
for (int i = 0; i < vr; i++)
{
if(vertical[i / vrcol, i % vrcol] == true)
System.Console.WriteLine("True, on position: {0},{1}", i / vrcol, i % vrcol);
}
}
Pages I read:
Is there a "faster" way to iterate through a two-dimensional array than using nested for loops?
Fastest way to loop through a 2d array?
Time Complexity of a nested for loop that parses a matrix
Determining the big-O runtimes of these different loops?
EDIT: The code example, is now, more towards what I am dealing with. It's about determining a true point x,y from a N*N Grid. The information available at disposal is: horizontal and vertical 2D arrays.
To NOT cause confusion. Imagine, that overtime, some positions in vertical or horizontal get set to True. This works currently perfectly well. All I am in for, is, the current approach of using one for-loop per 2D array like this, instead of using two for loops per 2D array.
Time complexity for approach with one loop and nested loops is the same - O(row * col) (which is O(n^2) for row == col as in your example for both cases) so the difference in the execution time will come from the constants for operations (since the direction of traversing should be the same). You can use BenchmarkDotNet to measure those. Next benchmark:
[SimpleJob]
public class Loops
{
int[, ] matrix = new int[10, 10];
[Benchmark]
public void NestedLoops()
{
int row = matrix.GetLength(0);
int col = matrix.GetLength(1);
for (int i = 0; i < row ; i++)
for (int j = 0; j < col ; j++)
{
matrix[i, j] = i * row + j + 1;
}
}
[Benchmark]
public void SingleLoop()
{
int row = matrix.GetLength(0);
int col = matrix.GetLength(1);
var l = row * col;
for (int i = 0; i < l; i++)
{
matrix[i / col, i % col] = i + 1;
}
}
}
Gives on my machine:
Method
Mean
Error
StdDev
Median
NestedLoops
144.5 ns
2.94 ns
4.58 ns
144.7 ns
SingleLoop
578.2 ns
11.37 ns
25.42 ns
568.6 ns
Making single loop actually slower.
If you will change loop body to some "dummy" operation - for example incrementing some outer variable or updating fixed (for example first) element of the martix you will see that performance of both loops is roughly the same.
Did you consider
for (int i = 0; i < row; i++)
{
for (int j = 0; j < col; j++)
{
Console.Write(string.Format("{0:00} ", matrix[i, j]));
Console.Write(Environment.NewLine + Environment.NewLine);
}
}
It is basically the same loop as yours, but without / and % that compiler may or may not optimize.

check if 2d array contains value on second spot

I have a 2D array for a lottery I am creating. Basically it's a set of 2 integers:
int[,] coupon = new int[rowAmount, numAmount];
Where rowAmount is the amount of rows, and numAmount is the amount of numbers in that row.
Now I need to select the numbers for each row, however there may not be duplicates of a number within a specific row.
for (int r = 0; r < rowAmount; ++r)
{
for (int n = 0; n < numAmount; ++n)
{
userNum = lotRng.Next(1, numAmount * rngMult);
while (COUPON CONTAINS DUPLICATE NUMBER ON SECOND SPOT )
{
userNum = lotRng.Next(1, numAmount * rngMult);
}
coupon[r, n] = userNum;
}
}
My issue is the while part, I cannot figure out how to check if coupon contains the userNum on the second slot(The numAmount slot). For lists and stuff I used to just do list.Contains() but that doesn't seem to work on here.
Depending on the size of your array is wether it makes sense to optimize performance.
Depending on that one possibility would be to sort the array and use Array.BinarySearch .
You have to sort your array for that.
https://msdn.microsoft.com/en-us/library/2cy9f6wb(v=vs.110).aspx
So you have a number of possibilities to optimize data structure.
Solution with an array of lists is one of my favourites for this. It's very similar to my other with jagged arrays but faster- because List search is most efficient and Linq searches are not.
const int rowAmount = 1000;
const int numAmount=1000;
const int rngMult = 10;
Random lotRng = new Random();
var coupon = new List<int>[rowAmount];
int userNum;
for (int r = 0; r < rowAmount; r++)
{
coupon[r]= new List<int>();
for (int n = 0; n < numAmount; ++n)
{
do userNum = lotRng.Next(1, numAmount * rngMult);
while (coupon[r].Contains(userNum));
coupon[r].Add(userNum);
}
}
Of course it would be also possible to use a list of lists (kind of 2D lists), if necessary.
var couponLoL = new List<List<int>>();
The following quick and dirty way show a possible way of copying a 2D array to a list, but not to recommend here for several reasons (loop, boxing for value types):
var coupon= new int[rowAmount,numAmount];
[..]
do userNum = lotRng.Next(1, numAmount * rngMult);
while (coupon.Cast<int>().ToList().Contains(userNum));
In this special case it makes even less sense, because this would look in the whole 2D array for the double value. But it is worth knowing how to convert from 2D to 1D array (and then in a list).
Solution with 2D jagged arrays: If you want to access rows and columns in C#, a jagged array is very convenient and unless you care very much how effient the array ist stored internally, jagged array are to recommend strongly for that.
Jagged arrays are simply arrays of arrays.
const int rowAmount = 1000;
const int numAmount=1000;
const int rngMult = 10;
int userNum;
Random lotRng = new Random();
var coupon = new int[rowAmount][];
for (int r = 0; r < rowAmount; r++)
{
coupon[r] = new int[numAmount];
for (int n = 0; n < numAmount; ++n)
{
do userNum = lotRng.Next(1, numAmount * rngMult);
while (Array.Exists(coupon[r], x => x == userNum));
coupon[r, n] = userNum;
}
}
The above function Array.Exists works only for 1 dimension what is enough here, and needs no Linq. The same as above with Linq method .Any :
while (coupon[r].Any(x => x == userNum));
If you would have to search in two dimensions for a double value, you would need a loop more again, but still on nested loop level less than without this.
Linq is elegant, but normally not the fastest method (but you would have to handle with very big arrays of multi-million sizes for that to matter).
For other possibilities of using Linq, look for example here:
How to use LINQ with a 2 dimensional array
Another idea would be to make one-dimensional array of size rowAmount*numAmount.
It needs a little bit of thinking, but allows most simple and fastest access of searching.
Solution with loops in an array. Not elegant, but you could refactor the search loops in an own method to look better. But as a second point, in this case a linear search like shown is not a really fast solution either.
Only inner part of the 2 for-loops, not full code (as in other answers by me here):
bool foundDup;
do
{
userNum = lotRng.Next(1, numAmount * rngMult);
foundDup = false;
for (var x = 0; x < coupon.GetLength(1); x++) //Iterate over second dimension only
if (coupon[r, x] == userNum)
{ foundDup = true;
break;
}
} while (foundDup);
coupon[r, n] = userNum;
In the special context of this question, you can optimize the loop:
for (var x = 0; x < n; x++)
As you say in your comment you need to check all n fields vs. your new userNum. You can solve this with the following code:
for (int r = 0; r < rowAmount; ++r)
{
for (int n = 0; n < numAmount; ++n)
{
userNum = lotRng.Next(1, numAmount * rngMult);
for (int x = 0; x < coupon.GetLength(1); x++) //Iterate over your second dimension again
{
while (coupon[r,x] == userNum)
{
userNum = lotRng.Next(1, numAmount * rngMult);
}
}
coupon[r, n] = userNum;
}
}

For loop - doesn't do last loop

I have this for loop. TicketList starts with 109 tickets. nColumns = 100. I calculate the number of rows I will need depending on the number of tickets. So in this case I need 2 rows. Row one will be full and row two will only have 9 entries. I have the loop below. It only runs one time for the NumOfRows and fills the first 100 and never loops.
What am I missing?
for (int j = 0; j < NumOfRows; j++)
{
for (int i = 0; i < nColumns; i++)
{
if (TicketList.Count() > 0)
{
t = rand.Next(0, TicketList.Count() - 1);
numbers[i, j] = TicketList[t];
TicketList.Remove(TicketList[t]);
}
}
}
Try changing your code to use a more LINQ-like, functional approach. If might make the logic easier. Something like this:
TicketList
.OrderBy(x => rand.Next())
.Select((ticket, n) => new
{
ticket,
j = n / NumOfRows,
i = n % NumOfRows
})
.ToList()
.ForEach(x =>
{
numbers[x.i, x.j] = x.ticket;
});
You may need to flip around x.i & x.j or use nColumns instead of NumOfRows - I wasn't sure what your logic was looking for - but this code might work better.
Other than a few poor choices, your loops appear to be fine. I would venture that NumOfRows is not being calculated correctly.
The expression NumOfRows = (TotalTickets + (Columns - 1)) / Columns; should calculate the correct number of rows.
Also, you should use the property version of Count rather than the Linq extension method and use IList<T>.RemoveAt() or List<T>.RemoveAt rather than Remove(TicketList[T]).
Using Remove() requires that the list be enumerated to locate the element to remove, which may not be the same index that you are targeting. Not to mention that you will scan 50% (on average) of the list for each Remove call, when you already know the correct index to remove.
The functional approach listed earlier seems like overkill.
I've attempted to replicate your issue, assuming certain facts about the various variables in use. The loop repeats the expected number of times.
static void TestMe ()
{
List<object> TicketList = new List<object>();
for (int index = 0; index < 109; index++)
TicketList.Add(new object());
var rand = new Random();
int nColumns = 100;
int NumOfRows = (TicketList.Count + (nColumns - 1)) / nColumns;
object[,] numbers;
int t;
numbers = new object[nColumns, NumOfRows];
for (int j = 0; j < NumOfRows; j++)
{
Console.WriteLine("OuterLoop");
for (int i = 0; i < nColumns; i++)
{
if (TicketList.Count > 0)
{
t = rand.Next(0, TicketList.Count - 1);
numbers[i, j] = TicketList[t];
TicketList.RemoveAt(t);
}
}
}
}
The problem that you are seeing must be the result of something that you have not included in your sample.

The negamax algorithm..what's wrong?

I'm trying to program a chess game and have spent days trying to fix the code. I even tried min max but ended with the same result. The AI always starts at the corner, and moves a pawn out of the way then the rook just moves back and forth with each turn. If it get's eaten, the AI moves every piece from one side to the other until all are eaten. Do you know what could be wrong with the following code?
public Move MakeMove(int depth)
{
bestmove.reset();
bestscore = 0;
score = 0;
int maxDepth = depth;
negaMax(depth, maxDepth);
return bestmove;
}
public int EvalGame() //calculates the score from all the pieces on the board
{
int score = 0;
for (int i = 0; i < 8; i++)
{
for (int j = 0; j < 8; j++)
{
if (AIboard[i, j].getPiece() != GRID.BLANK)
{
score += EvalPiece(AIboard[i, j].getPiece());
}
}
}
return score;
}
private int negaMax(int depth, int maxDepth)
{
if (depth <= 0)
{
return EvalGame();
}
int max = -200000000;
for (int i = 0; i < 8; i++)
{
for (int j = 0; j < 8; j++)
{
for (int k = 0; k < 8; k++)
{
for (int l = 0; l < 8; l++)
{
if(GenerateMove(i, j, k, l)) //generates all possible moves
{
//code to move the piece on the board
board.makemove(nextmove);
score = -negaMax(depth - 1, maxDepth);
if( score > max )
{
max = score;
if (depth == maxDepth)
{
bestmove = nextmove;
}
}
//code to undo the move
board.undomove;
}
}
}
}
}
return max;
}
public bool GenerateMove(int i, int j, int k, int l)
{
Move move;
move.moveFrom.X = i;
move.moveFrom.Y = j;
move.moveTo.X = k;
move.moveTo.Y = l;
if (checkLegalMoves(move.moveTo, move.moveFrom)) //if a legal move
{
nextMove = move;
return true;
}
return false;
}
This code:
public Move MakeMove(int depth)
{
bestscore = 0;
score = 0;
int maxDepth = depth;
negaMax(depth, maxDepth);
return bestmove;
}
Notice that the best move is never set! The return score of negaMax is compared to move alternatives. You're not even looping over the possible moves.
Also, it's really hard to look for errors, when the code you submit is not fully consistent. The negaMax method takes two arguments one place in your code, then it take four arguments in the recursive call?
I also recommend better abstraction in your code. Separate board representation, move representation, move generation, and the search algorithm. That will help you a lot. As an example: Why do you need the depth counter in the move generation?
-Øystein
You have two possible issues:
It is somewhat ambiguous as you don't show us your variable declarations, but I think you are using too many global variables. Negamax works by calculating best moves at each node, and so while searching the values and moves should be local. In any case, it is good practice to keep the scope of variables as tight as possible. It is harder to reason about the code when traversing the game tree changes so many variables. However, your search looks like it should return the correct values.
Your evaluation does not appear to discriminate which side is playing. I don't know if EvalPiece handles this, but in any case evaluation should be from the perspective of whichever side currently has the right to move.
You also have other issues that are not directly to your problem:
Your move generation is scary. You're pairwise traversing every possible pair of from/to squares on the board. This is highly inefficient and I don't understand how such a method would even work. You need only to loop through all the pieces on the board, or for a slower method, every square on the board (instead of 4096 squares).
MakeMove seems like it may be the place for the root node. Right now, your scheme works, in that the last node the search exits from will be root. However, it is common to use special routines at the root such as iterative deepening, so it may be good to have a separate loop at the root.

Categories

Resources