Need help for search optimization - c#

I am fairly new to programming and I need some help with optimizing.
Basically, part of my method does this:
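for (int i = 0; i < Tiles.Length; i++)
{
    x = Tiles[i].WorldPosition.x;
    y = Tiles[i].WorldPosition.y;
    z = Tiles[i].WorldPosition.z;
    // Look for the tile one unit above the current one.
    Tile topsearch = Array.Find(Tiles,
        search => search.WorldPosition == Tiles[i].WorldPosition +
                  new Vector3Int(0, 1, 0));
    // Array.Find returns null for a reference type when nothing matches.
    if (topsearch != null && topsearch.isEmpty)
    {
        // DoMyThing
    }
}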
So I am searching for a Tile at the position 1 unit above the current Tile.
My problem is that the whole method takes 0.1 seconds, which results in a small hiccup. Without Array.Find the method takes 0.01 seconds.
I also tried a for loop, but the result is still not great, because I need 3 more checks for the bottom, left and right.
Can somebody help me out and point me to a way of getting faster results?
Maybe I should go with something like threading?

You could create a 3-dimensional array (call it grid) so that you can look up the tile at a specific location by just looking at what's in grid[x, y + 1, z].
You can then iterate through your data in 2 loops: one to build up grid and one to do the checks you are doing in your code above, which would then just be:
for (int i = 0; i < Tiles.Length; i++)
{
    Vector3Int p = Tiles[i].WorldPosition;
    Tile toFind = grid[p.x, p.y + 1, p.z];
    if (toFind != null) ...
}
You would have to dimension the array with 1 extra row in y so that grid[x, y + 1, z] doesn't cause an index-out-of-range exception.
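The two-pass idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: Tile here is a hypothetical stand-in for the question's type, the coordinates are assumed to be non-negative and smaller than the given sizes (otherwise subtract the minimum coordinate first), and the names TileGrid, Build and CountEmptyAbove are made up for the example.

```csharp
using System;

// Hypothetical stand-in for the question's Tile; only the fields used here.
public class Tile
{
    public int X, Y, Z;
    public bool IsEmpty;
    public Tile(int x, int y, int z, bool isEmpty) { X = x; Y = y; Z = z; IsEmpty = isEmpty; }
}

public static class TileGrid
{
    // Pass 1: store every tile at grid[x, y, z]; y gets one extra layer
    // so that grid[x, y + 1, z] is always a valid index.
    public static Tile[,,] Build(Tile[] tiles, int sizeX, int sizeY, int sizeZ)
    {
        var grid = new Tile[sizeX, sizeY + 1, sizeZ];
        foreach (Tile t in tiles) grid[t.X, t.Y, t.Z] = t;
        return grid;
    }

    // Pass 2: count the tiles whose upper neighbour exists and is empty.
    public static int CountEmptyAbove(Tile[] tiles, Tile[,,] grid)
    {
        int count = 0;
        foreach (Tile t in tiles)
        {
            Tile above = grid[t.X, t.Y + 1, t.Z]; // direct index, no search
            if (above != null && above.IsEmpty) count++;
        }
        return count;
    }
}
```

Each neighbour check is then a single array access, so the whole pass is linear in the number of tiles rather than quadratic.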

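Adding to Roy's solution: if the space is not contiguous, as it might be, you could put a hash code of WorldPosition (the x, y and z coordinates) to good use here.
You could override WorldPosition's GetHashCode, together with Equals, which a Dictionary also relies on, with your own implementation like this:
public class WorldPosition
{
    public int X;
    public int Y;
    public int Z;
    public WorldPosition(int x, int y, int z) { X = x; Y = y; Z = z; }
    // Dictionary lookups need Equals as well as GetHashCode.
    public override bool Equals(object obj)
    {
        var other = obj as WorldPosition;
        return other != null && X == other.X && Y == other.Y && Z == other.Z;
    }
    public override int GetHashCode()
    {
        unchecked // let the multiplication wrap instead of throwing
        {
            int result = X.GetHashCode();
            result = (result * 397) ^ Y.GetHashCode();
            result = (result * 397) ^ Z.GetHashCode();
            return result;
        }
    }
}
See "Why is '397' used for ReSharper GetHashCode override?" for an explanation.
Then you can put your tiles in a Dictionary<WorldPosition, Tile>.
This would allow for quickly looking up dict[new WorldPosition(x, y, z + 1)] etc. A Dictionary locates keys by their hash code, so each lookup takes roughly constant time instead of scanning the whole array.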
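A minimal sketch of such a dictionary lookup is shown below. GridPos, SparseTiles and IsEmptyAbove are hypothetical names, and the dictionary maps a position straight to an "is empty" flag to keep the example short. If WorldPosition is actually Unity's Vector3Int, as the question's code suggests, that type should already provide value equality and a hash code, so it can probably serve as the key directly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value-type key; a struct avoids allocating a key object per lookup.
public readonly struct GridPos : IEquatable<GridPos>
{
    public readonly int X, Y, Z;
    public GridPos(int x, int y, int z) { X = x; Y = y; Z = z; }
    public bool Equals(GridPos other) => X == other.X && Y == other.Y && Z == other.Z;
    public override bool Equals(object obj) => obj is GridPos other && Equals(other);
    public override int GetHashCode()
    {
        unchecked { return (((X * 397) ^ Y) * 397) ^ Z; }
    }
}

public static class SparseTiles
{
    // True when a tile exists one unit above pos and is flagged empty.
    // TryGetValue avoids the KeyNotFoundException the indexer throws for missing keys.
    public static bool IsEmptyAbove(Dictionary<GridPos, bool> emptyByPos, GridPos pos)
    {
        var above = new GridPos(pos.X, pos.Y + 1, pos.Z);
        return emptyByPos.TryGetValue(above, out bool isEmpty) && isEmpty;
    }
}
```

Unlike the 3-dimensional array, this costs memory only for the tiles that exist, which matters when the occupied positions are sparse or the coordinates can be negative.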

First, like @Roy suggested, try storing the values in an array so you can access them by their x, y, z coordinates.
Another thing you could do is change the search to
Tile topsearch = Array.Find(Tiles,
    search => search.WorldPosition.x == Tiles[i].WorldPosition.x &&
              search.WorldPosition.y == (Tiles[i].WorldPosition.y + 1) &&
              search.WorldPosition.z == Tiles[i].WorldPosition.z);
This might be faster as well, depending on how many fields your WorldPosition has, although it is still a linear scan of the array for every tile.

Related

What is wrong with this implementation of an arcsine approximation in C#

This is a formula to approximate arcsine(x) using Taylor series from this blog
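Judging by the implementation below, the formula in question appears to be the standard Maclaurin series for arcsine, valid for |x| <= 1:

```latex
\arcsin x = \sum_{n=0}^{\infty} \frac{(2n)!}{4^{n}\,(n!)^{2}\,(2n+1)}\; x^{2n+1}
```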
This is my implementation in C#. I don't know where it goes wrong, but the code gives the wrong result when run:
When i = 0, the division will be 1/x. So I assign temp = 1/x at startup. For each iteration, I change "temp" after "i".
I loop until two successive values are very close together. When the difference between two successive values is very small, I return the value.
My test case:
The input is x = 1, so the expected arcsin(x) is arcsin(1) = PI/2 = 1.57079633 rad.
using System;

class Arc
{
    static double abs(double x)
    {
        return x >= 0 ? x : -x;
    }
    static double pow(double mu, long n)
    {
        double kq = mu;
        for (long i = 2; i <= n; i++)
        {
            kq *= mu;
        }
        return kq;
    }
    static long fact(long n)
    {
        long gt = 1;
        for (long i = 2; i <= n; i++)
        {
            gt *= i;
        }
        return gt;
    }
    #region arcsin
    static double arcsinX(double x)
    {
        int i = 0;
        double temp = 0;
        while (true)
        {
            //i++;
            var iFactSquare = fact(i) * fact(i);
            var tempNew = (double)fact(2 * i) / (pow(4, i) * iFactSquare * (2 * i + 1)) * pow(x, 2 * i + 1);
            if (abs(tempNew - temp) < 0.00000001)
            {
                return tempNew;
            }
            temp = tempNew;
            i++;
        }
    }
    #endregion
    public static void Main()
    {
        Console.WriteLine(arcsinX(1)); // the test case: x = 1
        Console.ReadLine();
    }
}

In many series evaluations, it is often convenient to use the quotient between terms to update the term. The quotient here is
\[
\frac{a_n}{a_{n-1}}
= \frac{(2n)!\,x^{2n+1}}{4^{n}\,(n!)^{2}\,(2n+1)} \cdot \frac{4^{n-1}\,((n-1)!)^{2}\,(2n-1)}{(2n-2)!\,x^{2n-1}}
= \frac{2n\,(2n-1)^{2}\,x^{2}}{4n^{2}\,(2n+1)}
= \frac{(2n-1)^{2}\,x^{2}}{2n\,(2n+1)}
\]
Thus a loop to compute the series value is
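double sum = 1;
double term = 1;
double n = 1;
// Stop once term is too small to change a value of magnitude 1.
while (1 != 1 + term) {
    term *= (n - 0.5) * (n - 0.5) * x * x / (n * (n + 0.5));
    sum += term;
    n += 1;
}
return x * sum;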
The convergence is only fast for abs(x) well below 1. At x = 1 the terms shrink so slowly that the loop above needs billions of iterations, so for the evaluation at x = 1 you have to employ angle halving, which in general is a good idea to speed up convergence.
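As a rough sketch of how the term-ratio loop and angle halving might fit together: the halving step below uses the identity arcsin(x) = 2 * arcsin(x / sqrt(2 * (1 + sqrt(1 - x^2)))), which maps any x in [-1, 1] to an argument of magnitude at most about 0.71, where the series converges geometrically. The class and method names are made up for the example, and applying the halving exactly once is just one possible choice.

```csharp
using System;

public static class ArcsinSeries
{
    // Sums the arcsine series with the term ratio
    // a[n]/a[n-1] = (n - 0.5)^2 * x^2 / (n * (n + 0.5)).
    // Intended for |x| <= ~0.71, where the terms shrink geometrically.
    static double Series(double x)
    {
        double sum = 1.0, term = 1.0;
        for (int n = 1; ; n++)
        {
            term *= (n - 0.5) * (n - 0.5) * x * x / (n * (n + 0.5));
            if (sum + term == sum) break; // term no longer changes the sum
            sum += term;
        }
        return x * sum;
    }

    // One angle-halving step, then the series on the reduced argument.
    public static double Arcsin(double x)
    {
        // Written as a negated range test so that NaN is rejected too.
        if (!(x >= -1.0 && x <= 1.0)) throw new ArgumentOutOfRangeException(nameof(x));
        double half = x / Math.Sqrt(2.0 * (1.0 + Math.Sqrt(1.0 - x * x)));
        return 2.0 * Series(half);
    }
}
```

With this reduction, Arcsin(1.0) sums a few dozen terms instead of billions, and the result can be compared against Math.Asin as a sanity check.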

You keep two values (temp and tempNew) to decide when further computation no longer matters. This is good, except that you never accumulate a running sum.
This is a summation: you need to add every newly calculated term to a total. You only keep track of the most recently calculated term, so all you can ever return is the last term of the series, which will always be an extremely small number. Turn this into a summation and the problem should go away.

NOTE: I've made this a community wiki answer because I was hardly the first person to think of this (just the first to put it down in a comment). If you feel that more needs to be added to make the answer complete, just edit it in!
The general suspicion is that this is down to integer overflow: one of your values (probably the return value of fact() or the iFactSquare product) is getting too big for the type you have chosen. It goes negative because you are using signed types: when the value exceeds the largest positive number, it wraps around into the negative range.
Try tracking how large i gets during your calculation, and figure out how big a number you would get from fact, pow and iFactSquare for that i. A C# long is always 64 bits (maximum 9,223,372,036,854,775,807) regardless of platform, so fact overflows from 21! onward, which fact(2 * i) reaches at i = 11. If that is what is happening, try using a double instead.
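One way to confirm the overflow suspicion is to run the factorial inside a checked expression, which makes C# throw an OverflowException instead of silently wrapping. The sketch below is illustrative only; FactorialOverflow, FactChecked and Overflows are hypothetical names.

```csharp
using System;

public static class FactorialOverflow
{
    // Factorial in long arithmetic; 'checked' turns silent wraparound into an exception.
    public static long FactChecked(long n)
    {
        long result = 1;
        for (long i = 2; i <= n; i++) result = checked(result * i);
        return result;
    }

    // True when n! no longer fits in a 64-bit signed long.
    public static bool Overflows(long n)
    {
        try { FactChecked(n); return false; }
        catch (OverflowException) { return true; }
    }
}
```

Here 20! still fits in a long, while 21! does not, so any series term that needs fact(21) or beyond has to be computed some other way, for instance in double or by updating the term with a ratio as in the answer above.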

Can't get cost function for logistic regression to work

I'm trying to implement logistic regression by myself writing code in C#. I found a library (Accord.NET) that I use to minimize the cost function. However I'm always getting different minimums. Therefore I think something may be wrong in the cost function that I wrote.
static double costfunction(double[] thetas)
{
    int i = 0;
    double sum = 0;
    double[][] theta_matrix_transposed = MatrixCreate(1, thetas.Length);
    while (i != thetas.Length) { theta_matrix_transposed[0][i] = thetas[i]; i++; }
    i = 0;
    while (i != m) // m is the number of examples
    {
        int z = 0;
        double[][] x_matrix = MatrixCreate(thetas.Length, 1);
        while (z != thetas.Length) { x_matrix[z][0] = x[z][i]; z++; } //Put values from the training set to the matrix
        double p = MatrixProduct(theta_matrix_transposed, x_matrix)[0][0];
        sum += y[i] * Math.Log(sigmoid(p)) + (1 - y[i]) * Math.Log(1 - sigmoid(p));
        i++;
    }
    double value = (-1 / m) * sum;
    return value;
}
static double sigmoid(double z)
{
    return 1 / (1 + Math.Exp(-z));
}
x is a list of lists that represent the training set, one list for each feature. What's wrong with the code? Why am I getting different results every time I run the L-BFGS? Thank you for your patience, I'm just getting started with machine learning!

That is very common with these optimization algorithms: the minimum you arrive at depends on your weight initialization. The fact that you are getting different minima doesn't necessarily mean something is wrong with your implementation. Instead, check that your gradients are correct using the finite differences method, and also look at your train/validation/test accuracy to see whether it is acceptable. That said, the logistic regression cost is convex, so large differences between runs are worth investigating.
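A finite-differences gradient check could look roughly like the sketch below. It does not use Accord.NET; the cost and gradient are written out by hand, and the data layout (x[i] is the feature vector of example i) is an assumption that differs from the question's feature-major lists. One more thing worth checking in the posted cost function: if m is declared as an int, (-1 / m) is integer division and evaluates to 0 for any m > 1, which would make the returned cost zero everywhere. The sketch divides in floating point instead.

```csharp
using System;

public static class GradientCheck
{
    static double Sigmoid(double z) => 1.0 / (1.0 + Math.Exp(-z));

    // Mean cross-entropy cost; x[i] is the feature vector of example i (assumed layout).
    public static double Cost(double[] theta, double[][] x, double[] y)
    {
        double sum = 0;
        for (int i = 0; i < x.Length; i++)
        {
            double p = 0;
            for (int j = 0; j < theta.Length; j++) p += theta[j] * x[i][j];
            double h = Sigmoid(p);
            sum += y[i] * Math.Log(h) + (1 - y[i]) * Math.Log(1 - h);
        }
        return -sum / x.Length; // floating-point division, not (-1 / m)
    }

    // Analytic gradient: (1/m) * sum over i of (h_i - y_i) * x_ij.
    public static double[] Gradient(double[] theta, double[][] x, double[] y)
    {
        var g = new double[theta.Length];
        for (int i = 0; i < x.Length; i++)
        {
            double p = 0;
            for (int j = 0; j < theta.Length; j++) p += theta[j] * x[i][j];
            double err = Sigmoid(p) - y[i];
            for (int j = 0; j < theta.Length; j++) g[j] += err * x[i][j] / x.Length;
        }
        return g;
    }

    // Central differences: (J(theta + eps*e_j) - J(theta - eps*e_j)) / (2 * eps).
    public static double[] NumericGradient(double[] theta, double[][] x, double[] y, double eps = 1e-5)
    {
        var g = new double[theta.Length];
        for (int j = 0; j < theta.Length; j++)
        {
            double saved = theta[j];
            theta[j] = saved + eps; double plus = Cost(theta, x, y);
            theta[j] = saved - eps; double minus = Cost(theta, x, y);
            theta[j] = saved; // restore before moving to the next component
            g[j] = (plus - minus) / (2 * eps);
        }
        return g;
    }
}
```

If the analytic and numeric gradients agree to several decimal places at a few random theta values, the gradient passed to the optimizer is probably fine and the cause of the differing minima lies elsewhere.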

2d-array with "wrapped edges" in C#

Warning: I'm a C# newb. Besides answering my questions, if you have any tips in general after seeing my code, they are welcome.
Let's say I define a two-dimensional array of size 10x10 in C#:
var arr = new int[10,10];
It is an error to access elements with indices out of the range 0-9. In some applications, (e.g. some games where the 2d-array represents a world) it's necessary to "wrap" the edges of the array. So for example
arr[-1, 0]
would actually refer to the element at arr[9,0].
One approach I've been using is the following class. I didn't subclass System.Array because C# apparently forbids doing so.
Example usage:
var grid = new Grid(10,10);
grid.set(-1, 0, 100); // Set element at (-1,0) to value 100.
grid.at(-1,0); // retrieve element at (-1,0)
The class itself:
class Grid
{
    public int[,] state;
    public int width { get { return state.GetLength(0); } }
    public int height { get { return state.GetLength(1); } }
    public Grid(int width_init, int height_init)
    {
        state = new int[width_init, height_init];
    }
    int mod(int a, int b)
    {
        if (a >= 0)
            return a % b;
        else
            return (b + a % b) % b;
    }
    int wrap_x(int x) { return mod(x, width); }
    int wrap_y(int y) { return mod(y, height); }
    public int at(int x, int y)
    {
        return state[wrap_x(x), wrap_y(y)];
    }
    public void set(int x, int y, int val)
    {
        state[wrap_x(x), wrap_y(y)] = val;
    }
    // more stuff here...
}
Question: Is there a game/creative-coding framework out there that provides this sort of class?
Question: Can you think of a simpler mod I can use in the above?
In order to process each element along with the corresponding "x" and "y", I use the following method:
public void each(Action<int, int, int> proc)
{
    for (int x = 0; x < width; x++)
        for (int y = 0; y < height; y++)
            proc(x, y, state[x, y]);
}
Question: I looked around for a similar method defined on System.Array but I didn't find one. Did I miss it?
Question: In the above, for(int x = 0; x < width; x++) is the common idiom expressing "go from zero up to N by 1". Is there a mechanism which expresses this in C#? I.e. I'd like to write the above as:
width.up_to((x) =>
    height.up_to((y) =>
        proc(x, y, state[x, y])));
where up_to would be a method on integers. Is there something like up_to already defined?
Similar to map from Scheme, here's a map method which applies a Func to each element and its corresponding indices. It returns a new Grid.
public Grid map(Func<int, int, int, int> proc)
{
    var grid = new Grid(width, height);
    each((x, y, val) => grid.state[x, y] = proc(x, y, val));
    return grid;
}
Question: Let's suppose I set up a subclass class World : Grid which adds additional instance variables. The trouble with the above map is that when called on an instance of World, you get a Grid, not a World. How should I fix this? Is this the wrong approach altogether? Perhaps a better design is to not subclass Grid but to keep it as an instance variable in World.
Sorry for the long submission. :-)
Update: I asked the question about upto separately and got some good answers.

One thing you can do to make your grid easier to reference is to define an indexer ([,]):
public int this[int x, int y]
{
    get { return state[wrap_x(x), wrap_y(y)]; }
    set { state[wrap_x(x), wrap_y(y)] = value; }
}
If you find the syntax more suitable, go for it.
Regarding your mod function, the best suggestion I can make is to make those variables (a and b) meaningful. index and maxSize oughta do it.
Other stuff:
Your state variable should be private.
Unless you absolutely, exclusively need ints, consider using a generic for the type of your state array. Your Grid class becomes Grid<T>.
With the bracket [,] overload, you can get rid of your at and set functions.
As for your World class, with a simplified Grid, the question to ask is this classic: is or has? IS your World a Grid or does it HAVE a Grid? Only you can answer that, but I'm leaning towards HAS.
Consider a Grid constructor which takes a ready-made 2d array as a parameter:
Example:
public Grid(int[,] state)
{
    this.state = state;
}
mod can be made valid for any value (multiple wrap around) with a slight modification.
Example:
int mod(int index, int maxSize)
{
    while (index < 0) index += maxSize;
    return index % maxSize;
}
Results:
mod(0,10) => 0
mod(1,10) => 1
mod(-1,10) => 9
mod(10,10) => 0
mod(-10,10) => 0
mod(11,10) => 1
mod(-11,10) => 9
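Putting several of these suggestions together (private state, a generic element type, an indexer, and a mod that handles any index), a sketch might look like the following. WrapGrid and Mod are hypothetical names, and the loop-free Mod is a swapped-in alternative to the while loop above: it relies on C#'s % keeping the sign of the dividend, so a negative remainder needs exactly one correction.

```csharp
using System;

// Sketch of a wrapping grid: private storage, generic element type, indexer access.
public class WrapGrid<T>
{
    private readonly T[,] state;

    public WrapGrid(int width, int height) { state = new T[width, height]; }

    public int Width => state.GetLength(0);
    public int Height => state.GetLength(1);

    // index % maxSize lies in (-maxSize, maxSize); shift negative results up once.
    private static int Mod(int index, int maxSize)
    {
        int r = index % maxSize;
        return r < 0 ? r + maxSize : r;
    }

    public T this[int x, int y]
    {
        get { return state[Mod(x, Width), Mod(y, Height)]; }
        set { state[Mod(x, Width), Mod(y, Height)] = value; }
    }
}
```

Usage then reads like a plain array: after grid[-1, 0] = 100 on a 10x10 WrapGrid<int>, grid[9, 0] returns 100 as well.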

Just access the array with the modulo % operator. For an N by M array with non-negative indices use the following.
int x = A[i % N, j % M];
Be careful with negative indices, though: in C# the % operator keeps the sign of the dividend, so -1 % 10 is -1, not 9, and arr[-1 % 10, 0 % 10] still throws. To cover them without a wrapper function, write A[((i % N) + N) % N, ((j % M) + M) % M].

Random directions, with no repeat.. (Bad Description)

Hey there, So I'm knocking together a random pattern generation thing.
My code so far:
int permutes = 100;
int y = 31;
int x = 63;
while (permutes > 0) {
    int rndTurn = random(1, 4);
    if (rndTurn == 1) { y = y - 1; } //go up
    if (rndTurn == 2) { y = y + 1; } //go down
    if (rndTurn == 3) { x = x - 1; } //go right
    if (rndTurn == 4) { x = x + 1; } //go left
    setP(x, y, 1);
    delay(250);
    permutes--; // count down so the loop ends after 100 steps
}
My question is, how would I go about getting the code to not go back on itself?
e.g. The code says "Go Left" but the next loop through it says "Go Right", how can I stop this?
NOTE: setP turns a specific pixel on.
Cheers peoples!

It depends on what you mean.
If you mean "avoid going back to the spot I was on most recently" then you have to remember the direction of the last movement. That is, if you move up, your next movement can't be down.
If you mean "avoid going back to a spot you've ever been on" then you're going to have to remember every spot you've been on. This can be implemented efficiently with a hash table keyed by a class representing a coordinate, with appropriate Equals/GetHashCode functions.
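For the second interpretation, a sketch in C# could keep the visited coordinates in a HashSet of value tuples, which already compare and hash by value, so no custom coordinate class is needed. Everything here is an assumption for illustration: the names SelfAvoidingWalk and Walk, the use of System.Random in place of the question's random function, and the absence of any display bounds.

```csharp
using System;
using System.Collections.Generic;

public static class SelfAvoidingWalk
{
    // Walks up to maxSteps from (startX, startY), never stepping onto a visited cell.
    // Stops early if every neighbouring cell has already been visited.
    public static List<(int X, int Y)> Walk(int startX, int startY, int maxSteps, Random rng)
    {
        var moves = new (int dx, int dy)[] { (0, -1), (0, 1), (-1, 0), (1, 0) };
        var path = new List<(int X, int Y)> { (startX, startY) };
        var visited = new HashSet<(int, int)> { (startX, startY) };
        int x = startX, y = startY;
        for (int step = 0; step < maxSteps; step++)
        {
            // Collect the neighbouring cells that have not been visited yet.
            var open = new List<(int, int)>();
            foreach (var (dx, dy) in moves)
                if (!visited.Contains((x + dx, y + dy))) open.Add((x + dx, y + dy));
            if (open.Count == 0) break; // boxed in, nowhere left to go
            (x, y) = open[rng.Next(open.Count)];
            visited.Add((x, y));
            path.Add((x, y));
        }
        return path;
    }
}
```

Choosing only among unvisited neighbours avoids re-rolling in a loop, and the early exit covers the case where the walk traps itself.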

Since each square corresponds to a pixel, your coordinate space must be finite, so you could keep track of coordinates you've already visited.
If there's a corresponding getP function to determine if a pixel has already been turned on, you could just use that.

Remember the last direction and, using random(1,3), pick one of the remaining three, then store that as the last one.
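A sketch of that idea, assuming the question's numbering (1 and 2 move along y in opposite directions, 3 and 4 along x) and using System.Random, whose Next(min, max) excludes the upper bound. NoReverseWalk, Opposite and NextDirection are hypothetical names.

```csharp
using System;

public static class NoReverseWalk
{
    // Directions follow the question: 1 = y - 1, 2 = y + 1, 3 = x - 1, 4 = x + 1.
    public static int Opposite(int dir)
    {
        switch (dir)
        {
            case 1: return 2;
            case 2: return 1;
            case 3: return 4;
            case 4: return 3;
            default: return 0; // no previous move yet
        }
    }

    // Picks uniformly among the directions that do not undo lastDir.
    public static int NextDirection(int lastDir, Random rng)
    {
        int banned = Opposite(lastDir);
        if (banned == 0) return rng.Next(1, 5);  // upper bound is exclusive: 1..4
        int pick = rng.Next(1, 4);               // 1..3
        return pick >= banned ? pick + 1 : pick; // skip over the banned value
    }
}
```

Drawing from three values and skipping the banned one needs exactly one random call per step, whereas re-rolling until the result is acceptable may take several.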

Not sure if this approach will work.
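Create a new int variable called lastRndTurn, and assign rndTurn to it after your if statements.
Then add a new while loop after your int rndTurn = random(1, 4) that re-rolls while the new turn is the opposite of the last one (1 and 2 are opposites, as are 3 and 4).
while ((lastRndTurn == 1 && rndTurn == 2) || (lastRndTurn == 2 && rndTurn == 1) ||
       (lastRndTurn == 3 && rndTurn == 4) || (lastRndTurn == 4 && rndTurn == 3))
{
    rndTurn = random(1, 4);
}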

C# Micro-Optimization Query: IEnumerable Replacement

Note: I'm optimizing because of past experience and due to profiler software's advice. I realize an alternative optimization would be to call GetNeighbors less often, but that is a secondary issue at the moment.
I have a very simple function described below. In general, I call it within a foreach loop. I call that function a lot (about 100,000 times per second). A while back, I coded a variation of this program in Java and was so disgusted by the speed that I ended up replacing several of the for loops which used it with 4 if statements. Loop unrolling seems ugly, but it did make a noticeable difference in application speed. So, I've come up with a few potential optimizations and thought I would ask for opinions on their merit and for suggestions:
1. Use four if statements and totally ignore the DRY principle. I am confident this will improve performance based on past experience, but it makes me sad. To clarify, the 4 if statements would be pasted anywhere I called getNeighbors() too frequently and would then have the inside of the foreach block pasted within them.
2. Memoize the results in some mysterious manner.
3. Add a "neighbors" property to all squares. Generate its contents at initialization.
4. Use a code generation utility to turn calls to GetNeighbors into if statements as part of compilation.
public static IEnumerable<Square> GetNeighbors(Model m, Square s)
{
    int x = s.X;
    int y = s.Y;
    if (x > 0) yield return m[x - 1, y];
    if (y > 0) yield return m[x, y - 1];
    if (x < m.Width - 1) yield return m[x + 1, y];
    if (y < m.Height - 1) yield return m[x, y + 1];
    yield break;
}
//The property of Model used to get elements.
private Square[,] grid;
//...
public Square this[int x, int y]
{
    get
    {
        return grid[x, y];
    }
}
Note: 20% of the time spent by the GetNeighbors function is spent on the call to m.get_Item, the other 80% is spent in the method itself.

Brian,
I've run into similar things in my code.
The two things I've found with C# that helped me the most:
First, don't be afraid necessarily of allocations. C# memory allocations are very, very fast, so allocating an array on the fly can often be faster than making an enumerator. However, whether this will help depends a lot on how you're using the results. The only pitfall I see is that, if you return a fixed size array (4), you're going to have to check for edge cases in the routine that's using your results.
Depending on how large your matrix of Squares is in your model, you may be better off doing 1 check up front to see if you're on the edge, and if not, precomputing the full array and returning it. If you're on an edge, you can handle those special cases separately (make a 1 or 2 element array as appropriate). This would put one larger statement in there, but that is often faster in my experience. If the model is large, I would avoid precomputing all of the neighbors. The overhead in the Squares may outweigh the benefits.
In my experience, as well, preallocating and returning vs. using yield makes the JIT more likely to inline your function, which can make a big difference in speed. If you can take advantage of the IEnumerable results and you are not always using every returned element, that is better, but otherwise, precomputing may be faster.
The other thing to consider: I don't know what information is stored in Square in your case, but if the object is relatively small, is used in a large matrix, and is iterated over many, many times, consider making it a struct. I had a routine similar to this (called hundreds of thousands or millions of times in a loop), and changing the class to a struct, in my case, sped up the routine by over 40%. This assumes you're using .NET 3.5 SP1, though, as the JIT does many more optimizations on structs in that release.
There are other potential pitfalls to switching to struct vs. class, of course, but it can have huge performance impacts.

I'd suggest making an array of Squares (capacity four) and returning that instead. I would be very suspicious about using iterators in a performance-sensitive context. For example:
// could return IEnumerable<Square> still instead if you preferred.
public static Square[] GetNeighbors(Model m, Square s)
{
    int x = s.X, y = s.Y, i = 0;
    // Always four slots; on an edge the unused trailing slots keep their
    // default value (null if Square is a class), so callers must skip them.
    var result = new Square[4];
    if (x > 0) result[i++] = m[x - 1, y];
    if (y > 0) result[i++] = m[x, y - 1];
    if (x < m.Width - 1) result[i++] = m[x + 1, y];
    if (y < m.Height - 1) result[i++] = m[x, y + 1];
    return result;
}
I wouldn't be surprised if that's much faster.
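A variation on the same idea, offered only as a sketch: let the caller pass in a reusable buffer and return the number of neighbours written, which avoids both the iterator and a fresh array per call. The Square and Model classes below are minimal hypothetical stand-ins for the question's types, and Neighbors.Fill is a made-up name. Whether this actually beats the array version would need profiling.

```csharp
using System;

// Minimal hypothetical stand-ins for the question's Model and Square.
public class Square
{
    public readonly int X, Y;
    public Square(int x, int y) { X = x; Y = y; }
}

public class Model
{
    private readonly Square[,] grid;
    public Model(int width, int height)
    {
        grid = new Square[width, height];
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                grid[x, y] = new Square(x, y);
    }
    public int Width => grid.GetLength(0);
    public int Height => grid.GetLength(1);
    public Square this[int x, int y] => grid[x, y];
}

public static class Neighbors
{
    // Writes the neighbours of s into buffer (length >= 4) and returns how many were written.
    public static int Fill(Model m, Square s, Square[] buffer)
    {
        int x = s.X, y = s.Y, count = 0;
        if (x > 0) buffer[count++] = m[x - 1, y];
        if (y > 0) buffer[count++] = m[x, y - 1];
        if (x < m.Width - 1) buffer[count++] = m[x + 1, y];
        if (y < m.Height - 1) buffer[count++] = m[x, y + 1];
        return count;
    }
}
```

The caller allocates one Square[4] up front and loops from 0 to the returned count, so edge squares need no null checks.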

I'm on a slippery slope, so insert disclaimer here.
I'd go with option 3. Fill in the neighbor references lazily and you've got a kind of memoization.
Another kind of memoization would be to return an array instead of a lazy IEnumerable, so that GetNeighbors becomes a pure function that is trivial to memoize. This amounts roughly to option 3 though.
In any case, but you know this, profile and re-evaluate every step of the way. I am for example unsure about the tradeoff between the lazy IEnumerable or returning an array of results directly. (you avoid some indirections but need an allocation).

Why not make the Square class responsible for returning its neighbours? Then you have an excellent place to do lazy initialisation without the extra overhead of memoization.
public class Square {
    private Model _model;
    private int _x;
    private int _y;
    private Square[] _neighbours;
    public Square(Model model, int x, int y) {
        _model = model;
        _x = x;
        _y = y;
        _neighbours = null;
    }
    public Square[] Neighbours {
        get {
            // Computed on first access, then cached.
            if (_neighbours == null) {
                _neighbours = GetNeighbours();
            }
            return _neighbours;
        }
    }
    private Square[] GetNeighbours() {
        // Each edge the square touches removes one neighbour.
        int len = 4;
        if (_x == 0) len--;
        if (_x == _model.Width - 1) len--;
        if (_y == 0) len--;
        if (_y == _model.Height - 1) len--;
        Square[] result = new Square[len];
        int i = 0;
        if (_x > 0) {
            result[i++] = _model[_x - 1, _y];
        }
        if (_x < _model.Width - 1) {
            result[i++] = _model[_x + 1, _y];
        }
        if (_y > 0) {
            result[i++] = _model[_x, _y - 1];
        }
        if (_y < _model.Height - 1) {
            result[i++] = _model[_x, _y + 1];
        }
        return result;
    }
}

Depending on how GetNeighbors is used, maybe some inversion of control could help:
public static void DoOnNeighbors(Model m, Square s, Action<Square> action) {
    int x = s.X;
    int y = s.Y;
    if (x > 0) action(m[x - 1, y]);
    if (y > 0) action(m[x, y - 1]);
    if (x < m.Width - 1) action(m[x + 1, y]);
    if (y < m.Height - 1) action(m[x, y + 1]);
}
But I'm not sure whether this performs better.
