Right now, I'm working on a QuadTree LOD system for planets. In general, everything works quite nicely.
As a rough description of how the mesh is generated:
Every node generated by the QuadTree contains a container called "NodeMeshData". It holds all the information needed to generate a simple Mesh:
public class NodeMeshData
{
public Vector3[] vertices;
public int[] indices;
public Vector2[] uvs;
public void Clear()
{
vertices = null;
indices = null;
uvs = null;
}
}
Whenever the mesh needs to be regenerated, all QuadTree nodes without any children (leaf nodes) are asked for their NodeMeshData. All of these are put into one array of NodeMeshData objects.
public void UpdateQuadTree(Vector3 playerPosition)
{
_quadTree.UpdateTree(playerPosition);
NodeMeshData[] data = _quadTree.GetNodeMeshData().ToArray();
CombinedMeshData.Combine(data);
}
(CombinedMeshData is a NodeMeshData property of the class that contains this method.)
The problem lies in combining all these NodeMeshData objects into one to generate the final mesh - the extension method "Combine". By process of elimination I found that this method must be the cause of my problem.
public static class NodeMeshHelper
{
public static void Combine(this NodeMeshData newData, IList<NodeMeshData> nodeMeshData)
{
int verticesCount = nodeMeshData.Sum(static nmd => nmd.vertices.Length);
int indicesCount = nodeMeshData.Sum(static nmd => nmd.indices.Length);
int uvsCount = nodeMeshData.Sum(static nmd => nmd.uvs.Length);
List<Vector3> vertices = new(verticesCount);
List<int> indices = new(indicesCount);
List<Vector2> uvs = new(uvsCount);
int lastIndex = 0;
foreach (var meshData in nodeMeshData)
{
vertices.AddRange(meshData.vertices);
int[] shiftedIndices = meshData.indices.Select(index => index + lastIndex).ToArray();
lastIndex += meshData.indices.Last() + 1;
indices.AddRange(shiftedIndices);
uvs.AddRange(meshData.uvs);
}
newData.vertices = vertices.ToArray();
newData.indices = indices.ToArray();
newData.uvs = uvs.ToArray();
}
}
Whenever it is called, RAM usage rises. After a few seconds, the garbage collector kicks in and cleans up all the unused data, as it is supposed to. But this results in a nasty frame drop every single time.
My idea is that all these declarations of new Lists, or the AddRange calls, cause the rising RAM usage. From asking my colleagues I learned that constructing Lists with an initial capacity should eliminate the internal array reallocations inside the List objects when AddRange is called. But as you can see, I already do that, and it doesn't help. I also tried changing NodeMeshHelper into a non-static class with fields for the lists or for the final NodeMeshData result. That didn't help either.
What can I do to fix this? Is there a simple way, e.g. changing the "Combine" method so that it doesn't allocate new memory every single time? Or do I need to completely rethink my QuadTree or mesh creation algorithm (I hope not)? What about compute shaders?
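To illustrate what I mean by "not allocating every time": an untested sketch that keeps persistent buffers and only grows them when a bigger mesh comes along (the buffer names are mine, and I shift by the vertex count, which seems safer than indices.Last() + 1):
public static class NodeMeshHelperPooled
{
    // persistent buffers, reused across calls and grown only when required
    private static Vector3[] _vertexBuffer = new Vector3[0];
    private static int[] _indexBuffer = new int[0];
    private static Vector2[] _uvBuffer = new Vector2[0];
    public static void CombineReusing(this NodeMeshData newData, NodeMeshData[] nodeMeshData)
    {
        int vertexCount = 0, indexCount = 0, uvCount = 0;
        foreach (var nmd in nodeMeshData)
        {
            vertexCount += nmd.vertices.Length;
            indexCount += nmd.indices.Length;
            uvCount += nmd.uvs.Length;
        }
        // grow-only reallocation: after the first few frames this allocates nothing
        if (_vertexBuffer.Length < vertexCount) _vertexBuffer = new Vector3[vertexCount];
        if (_indexBuffer.Length < indexCount) _indexBuffer = new int[indexCount];
        if (_uvBuffer.Length < uvCount) _uvBuffer = new Vector2[uvCount];
        int v = 0, x = 0, u = 0, shift = 0;
        foreach (var nmd in nodeMeshData)
        {
            System.Array.Copy(nmd.vertices, 0, _vertexBuffer, v, nmd.vertices.Length);
            System.Array.Copy(nmd.uvs, 0, _uvBuffer, u, nmd.uvs.Length);
            for (int i = 0; i < nmd.indices.Length; i++)
                _indexBuffer[x + i] = nmd.indices[i] + shift;
            v += nmd.vertices.Length;
            u += nmd.uvs.Length;
            x += nmd.indices.Length;
            shift += nmd.vertices.Length; // shift by vertex count, not by the last index value
        }
        newData.vertices = _vertexBuffer;
        newData.indices = _indexBuffer;
        newData.uvs = _uvBuffer;
        // caveat: the buffers can be larger than the current mesh, so the actual counts
        // would have to travel along with them (e.g. to count-taking Mesh setter overloads)
    }
}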
For anyone interested in the full code:
Github
UPDATE
I changed the NodeMeshHelper to no longer use Lists; it now uses arrays. I did that because it seems to improve performance a bit, and I hope it is a step in the right direction for fixing my major problem. Yes, I know there are still "new" statements for the arrays.
public static class NodeMeshHelper
{
public static void Combine(this NodeMeshData newData, NodeMeshData[] nodeMeshData)
{
int vertexIterator = 0;
int indexIterator = 0;
int uvIterator = 0;
int verticesCount = 0, indicesCount = 0, uvsCount = 0;
foreach (var nmd in nodeMeshData)
{
verticesCount += nmd.vertices.Length;
indicesCount += nmd.indices.Length;
uvsCount += nmd.uvs.Length;
}
Vector3[] vertices = new Vector3[verticesCount];
int[] indices = new int[indicesCount];
Vector2[] uvs = new Vector2[uvsCount];
int lastIndex = 0;
foreach (var meshData in nodeMeshData)
{
// vertices
foreach (Vector3 vertex in meshData.vertices)
{
vertices[vertexIterator] = vertex;
vertexIterator++;
}
// indices
foreach (int index in meshData.indices)
{
indices[indexIterator] = index + lastIndex;
indexIterator++;
}
lastIndex += meshData.indices[^1] + 1;
// uvs
foreach (Vector2 uv in meshData.uvs)
{
uvs[uvIterator] = uv;
uvIterator++;
}
}
newData.vertices = vertices;
newData.indices = indices;
newData.uvs = uvs;
}
}
UPDATE 2
I also tried to implement this with Unity's Job System. But unfortunately it does not support NativeArrays of NativeArrays, which (in my head) is necessary for my purposes.
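From what I've read, the usual workaround is to flatten everything into single NativeArrays plus an offset table instead of nesting them. Something like this sketch (untested, the names are mine):
using Unity.Collections;
using Unity.Jobs;

public struct ShiftIndicesJob : IJob
{
    [ReadOnly] public NativeArray<int> allIndices;   // every node's indices, concatenated
    [ReadOnly] public NativeArray<int> indexStart;   // where each node's slice begins in allIndices
    [ReadOnly] public NativeArray<int> vertexStart;  // each node's base vertex in the combined vertex array
    public NativeArray<int> shiftedIndices;          // output, same length as allIndices

    public void Execute()
    {
        for (int node = 0; node < indexStart.Length; node++)
        {
            int begin = indexStart[node];
            int end = node + 1 < indexStart.Length ? indexStart[node + 1] : allIndices.Length;
            for (int i = begin; i < end; i++)
                shiftedIndices[i] = allIndices[i] + vertexStart[node];
        }
    }
}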
Related
What I need:
a polygon with an arbitrary number of vertices (or at least up to some maximum number of vertices)
it should be a struct, so that it can be fast and can be assigned / passed by value
It seems like I can't use arrays or collections for storing vertices, because then my polygon struct would point to objects on the heap, and when one polygon is assigned to another by value, only a shallow copy would be performed and I would have both polygons pointing to the same vertex array. For example:
Polygon a = new Polygon();
Polygon b = a;
// both polygons would be changed
b.vertices[0] = 5;
Then how do I create a struct that can have an arbitrary number (or some fixed number) of vertices, but without using the heap at all?
I could just use lots of variables like v1, v2, v3 ... v10 etc., but I want to keep my code clean, more or less.
You have the option to define your array with the fixed keyword, which embeds the buffer directly in the struct (so a local instance lives on the stack).
But you cannot directly access the elements of the array unless you are in an unsafe context and use pointers.
To get the following behavior:
static void Main(string[] args)
{
FixedArray vertices = new FixedArray(10);
vertices[0] = 4;
FixedArray copy = vertices;
copy[0] = 8;
Debug.WriteLine(vertices[0]);
// 4
Debug.WriteLine(copy[0]);
// 8
}
Then use the following struct definition:
public unsafe struct FixedArray
{
public const int MaxSize = 100;
readonly int size;
fixed double data[MaxSize];
public FixedArray(int size) : this(new double[size])
{ }
public FixedArray(double[] values)
{
this.size = Math.Min(values.Length, MaxSize);
for (int i = 0; i < size; i++)
{
data[i] = values[i];
}
}
public double this[int index]
{
get
{
if (index >= 0 && index < size)
{
return data[index];
}
return 0;
}
set
{
if (index >= 0 && index < size)
{
data[index] = value;
}
}
}
public double[] ToArray()
{
var array = new double[size];
for (int i = 0; i < size; i++)
{
array[i] = data[i];
}
return array;
}
}
A couple of things to consider: the above needs to be compiled with the unsafe option. Also, MaxSize must be a constant, and the storage required cannot exceed this value. I am using an indexer this[int] to access the elements (instead of a field) and also have a ToArray() method to convert to a regular array. The constructor can also take a regular array, or it will use an empty one to initialize the values. This ensures that new FixedArray(10), for example, has at least 10 initialized values in the fixed buffer (instead of them being undefined, as is the default).
Read more about this usage of fixed from Microsoft or search for C# Fixed Size Buffers.
Heap array field
struct StdArray
{
int[] vertices;
public StdArray(int size) // a struct constructor must carry the struct's name
{
vertices = new int[size];
}
}
Stack array field
unsafe struct FixedArray
{
fixed int vertices[100];
int size;
public FixedArray(int size)
{
this.size = size;
// no initialization needed for `vertices`
}
}
If it suits your logic, you could use a Span<T> over stack-allocated memory (stackalloc). Read more here
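For example (C# 7.2+; a sketch, and note the span and its buffer cannot escape the method):
static int SpanExample()
{
    // stackalloc memory viewed through a Span<int>; no heap allocation involved
    Span<int> vertices = stackalloc int[10];
    vertices[0] = 4;

    Span<int> copy = stackalloc int[10];
    vertices.CopyTo(copy); // explicit element-wise copy, not a shared reference
    copy[0] = 8;           // does not affect vertices[0]

    return vertices[0] + copy[0]; // 12
}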
One other way is to copy the backing array in a copy constructor:
public Polygon(Polygon other)
{
this.vertices = other.vertices.Clone() as int[];
}
then
var a = new Polygon();
a.vertices[0] = 5;
var b = new Polygon(a);
Debug.WriteLine(a.vertices[0]);
// 5
Debug.WriteLine(b.vertices[0]);
// 5
b.vertices[0] = 10;
Debug.WriteLine(a.vertices[0]);
// 5
Debug.WriteLine(b.vertices[0]);
// 10
The following program returns whether a tree is balanced or not. A tree is said to be balanced if every path from the root to a leaf has the same length.
using System;
namespace BalancedTree
{
public class MainClass
{
static bool isBalanced(int[][] sons)
{
return isBalanced(sons, 0);
}
static bool isBalanced(int[][] sons, int startNode)
{
int[] children = sons[startNode];
int minHeight = int.MaxValue;
int maxHeight = int.MinValue;
bool allChildBalanced = true;
if(children.Length == 0)
return true;
else
{
foreach (int node in children)
{
int h = height(sons, node);
if(h > maxHeight)
maxHeight = h;
if(h < minHeight)
minHeight = h;
}
}
foreach (int node in children)
{
allChildBalanced = allChildBalanced && isBalanced(sons, node);
if(!allChildBalanced)
return false;
}
return Math.Abs(maxHeight - minHeight) < 2 && allChildBalanced;
}
static int height(int[][] sons, int startNode)
{
int maxHeight = 0;
foreach (int child in sons[startNode])
{
int thisHeight = height(sons, child);
if(thisHeight > maxHeight)
maxHeight = thisHeight;
}
return 1 + maxHeight;
}
public static void Main (string[] args)
{
int[][] sons = new int[6][];
sons[0] = new int[] { 1, 2, 4 };
sons[1] = new int[] { };
sons[2] = new int[] { 3, 5};
sons[3] = new int[] { };
sons[4] = new int[] { };
sons[5] = new int[] { };
Console.WriteLine (isBalanced(sons));
}
}
}
My problem is that my code is very inefficient, due to the recursive calls to the function
static int height(int[][] sons, int startNode)
which makes the time complexity exponential.
I know this can be optimised in the case of a binary tree, but I'm looking for a way to optimise my program for a general tree as described above.
One idea would be, for instance, to call the function 'height' from the current node instead of from startNode.
My only constraint is time complexity, which must be linear, but I can use additional memory.
Sorry, but I have never done C#. So, there will be no example code.
However, it shouldn't be too hard for you to do it.
Defining isBalanced() recursively will never give the best performance. The reason is simple: a tree can still be unbalanced even if all of its sub-trees are balanced. So you can't just traverse the tree once.
However, your height() function already does the right thing. It visits every node in the tree only once to find the height (i.e. maximum length from the root to a leaf).
All you have to do is write a minDistance() function that finds the minimum length from the root to a leaf. You can do this using almost the same code.
With these functions a tree is balanced if and only if height(...)==minDistance(...).
Finally, you can merge both functions into one that returns a (min, max) pair. This will not change the time complexity, but it could bring execution time down a bit, if returning pairs is not too expensive in C#.
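Sketched in C# (untested, as per the disclaimer above), the merged traversal could look like this; every node is visited exactly once, so it runs in linear time:
static (int min, int max) LeafDepths(int[][] sons, int node)
{
    if (sons[node].Length == 0)
        return (1, 1); // a leaf: the only path has length 1

    int min = int.MaxValue;
    int max = int.MinValue;
    foreach (int child in sons[node])
    {
        var (cMin, cMax) = LeafDepths(sons, child);
        if (cMin < min) min = cMin;
        if (cMax > max) max = cMax;
    }
    return (min + 1, max + 1);
}

static bool IsBalanced(int[][] sons)
{
    var (min, max) = LeafDepths(sons, 0);
    return min == max; // all root-to-leaf paths have the same length
}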
Basically I am using a MeshGeometry3D to load points, positions and normals from an STL file.
The STL file format duplicates points, so I want to search MeshGeometry3D.Positions for duplicates before adding each newly read point.
Mesh.Positions.IndexOf(somePoint3D) does not work, because it compares based on the object reference rather than the X, Y, Z values of the Point3D. This is why I iterate the entire collection to find duplicates manually:
//triangle is a custom class containing three vertices of type Point3D
//for each 3 points read from STL the triangle object is reinitialized
vertex1DuplicateIndex = -1;
vertex2DuplicateIndex = -1;
vertex3DuplicateIndex = -1;
for (int q = tempMesh.Positions.Count - 1; q >= 0; q--)
{
if (vertex1DuplicateIndex == -1)
if (tempMesh.Positions[q] == triangle.Vertex1)
vertex1DuplicateIndex = q;
if (vertex2DuplicateIndex == -1)
if (tempMesh.Positions[q] == triangle.Vertex2)
vertex2DuplicateIndex = q;
if (vertex3DuplicateIndex == -1)
if (tempMesh.Positions[q] == triangle.Vertex3)
vertex3DuplicateIndex = q;
if (vertex1DuplicateIndex != -1 && vertex2DuplicateIndex != -1 && vertex3DuplicateIndex != -1)
break;
}
This code is actually very efficient when duplicates are found, but when there is no duplicate the collection is iterated in its entirety, which is very slow for big meshes with more than a million positions.
Is there another approach on the search?
Is there a way to force Mesh.Positions.IndexOf(newPoint3D) to compare by value, like Mesh.Positions[index] == somePoint3D does, rather than the reference comparison it does now?
I don't know of a built-in way to do this, but you could use a hash map to cache the indices of the 3D vectors.
Depending on the quality of your hash function for the vectors, you'll get a 'sort of' constant-time lookup (collisions are possible, but it should still be faster than iterating through all the vertex data for each new triangle point).
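Roughly like this (a sketch; WPF's Point3D compares and hashes by value, so it can serve directly as a dictionary key):
var indexByPosition = new Dictionary<Point3D, int>();

int GetOrAddIndex(MeshGeometry3D mesh, Point3D point)
{
    // near-constant-time lookup instead of scanning mesh.Positions
    if (!indexByPosition.TryGetValue(point, out int index))
    {
        index = mesh.Positions.Count;
        mesh.Positions.Add(point);
        indexByPosition.Add(point, index);
    }
    return index;
}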
Using hakononakani's idea I've managed to speed things up, using a combination of a HashSet and a Dictionary. The following is a simplified version of my code:
class CustomTriangle
{
public Vector3D Normal { get; set; }
public Point3D Vertex1 { get; set; }
public Point3D Vertex2 { get; set; }
public Point3D Vertex3 { get; set; }
}
private void loadMesh()
{
CustomTriangle triangle;
MeshGeometry3D tempMesh = new MeshGeometry3D();
HashSet<string> meshPositionsHashSet = new HashSet<string>();
Dictionary<string, int> meshPositionsDict = new Dictionary<string, int>();
int vertex1DuplicateIndex, vertex2DuplicateIndex, vertex3DuplicateIndex;
int numberOfTriangles = GetNumberOfTriangles();
for (int i = 0, j = 0; i < numberOfTriangles; i++)
{
triangle = ReadTriangleDataFromSTLFile();
vertex1DuplicateIndex = -1;
if (meshPositionsHashSet.Add(triangle.Vertex1.ToString()))
{
tempMesh.Positions.Add(triangle.Vertex1);
meshPositionsDict.Add(triangle.Vertex1.ToString(), tempMesh.Positions.Count - 1); // the vertex was just appended, so its index is Count - 1 (avoids a linear IndexOf scan)
tempMesh.Normals.Add(triangle.Normal);
tempMesh.TriangleIndices.Add(j++);
}
else
{
vertex1DuplicateIndex = meshPositionsDict[triangle.Vertex1.ToString()];
tempMesh.TriangleIndices.Add(vertex1DuplicateIndex);
tempMesh.Normals[vertex1DuplicateIndex] += triangle.Normal;
}
//Do the same for vertex2 and vertex3
}
}
At the end tempMesh will have only unique points. All you have to do is normalize all Normals and you're ready to visualize.
The same can be achieved only with the Dictionary using:
if (!meshPositionsDict.ContainsKey(triangle.Vertex1.ToString()))
I just like using the HashSet, because it's fun to work with :)
In both cases the final result is an algorithm roughly 60 times faster than before!
In my code I have a nested loop whose inner loop does not iterate. An if statement inside it always executes, no matter what its condition is; without the if statement, the part of the for loop's code that iterates the loop becomes unreachable. No matter what I have tried, I have not been able to get the inner loop to iterate.
class Map
{
public int Width { get; set; }
public int Height { get; set; }
public Vector2[] positions = new Vector2[500*500];
private GroundVoxel[,] map = new GroundVoxel[500, 500];
private Vector2 voxelPosition = new Vector2(0,0);
private static int sizeX = 499, sizeY = 499, airLevel = 425;
private int positionX = 0, positionY = 0, vectorNumber = 0;
public Map()
{
}
public Vector2[] Initialize()
{
for (int i = 0; i <= sizeY; i++)
{
for (int j = 0; j <= sizeX; j++) <-- This does not iterate.
{
map[positionX, positionY] = new GroundVoxel(voxelPosition);
voxelPosition.X += 80;
positions[vectorNumber] = voxelPosition;
vectorNumber += 1;
if (j == sizeX) <-- This always executes even though j != sizeX.
{
break;
}
}
voxelPosition.Y += 80;
voxelPosition.X = 0;
}
return positions;
}
}
You have to use the fully qualified name to refer to a static class member variable like your sizeX and sizeY. Here is an article on the subject.
Hope this helps!
I think we'll need more code. I've copied your code into a basic WinForms test application, and both of my loops iterate as expected.
I'm not familiar with XNA or what a "VoxelPosition" is, but I think you have a lurking bug here:
voxelPosition.X += 80;
positions[vectorNumber] = voxelPosition;
You are simply storing the same pointer in a very large array -- all of the entries will point to the same object.
You will need to declare a new object every time through the loop to store individual vector entries.
Hope this helps?
I'm writing a sort of Geometry Wars inspired game, except with added 2D rigid body physics, AI pathfinding, waypoint analysis, line-of-sight checks, load balancing, etc. Even though it runs reasonably fast with around 80-100 enemies on screen and all of that enabled, the performance completely breaks down once I get to a total of about 250 objects (150 enemies). I've searched for any O(n^2) parts in the code, but there don't seem to be any left. I'm also using spatial grids.
Even if I disable pretty much all of the supposedly expensive AI-related processing, it doesn't seem to matter; it still breaks down at 150 enemies.
I implemented all the code from scratch (currently even the matrix multiplication code), and I'm relying almost completely on the GC, as well as using C# closures for some things, so I expect this to be far from optimized. Still, it doesn't make sense to me that with about 1/15 of the processing work but double the objects, the game suddenly slows to a crawl. Is this normal? How is the XNA platform supposed to scale with the number of objects being processed?
I remember a slerp-spinning-cube demo I did early on could handle more than 1000 objects at once, so I think I'm doing something wrong?
edit:
Here's the grid structure's class
public abstract class GridBase
{
public const int WORLDHEIGHT = (int)AIGridInfo.height;
public const int WORLDWIDTH = (int)AIGridInfo.width;
protected float cellwidth;
protected float cellheight;
int no_of_col_types;
// a dictionary of lists that gets cleared every frame
// 3 (=no_of_col_types) groups of objects (enemy side, players side, neutral)
// 4000 initial Dictionary hash positions for each group
// I have also tried using an array of lists of 100*100 cells
//with pretty much identical results
protected Dictionary<CoordsInt, List<Collidable>>[] grid;
public GridBase(float cellwidth, float cellheight, int no_of_col_types)
{
this.no_of_col_types = no_of_col_types;
this.cellheight = cellheight;
this.cellwidth = cellwidth;
grid = new Dictionary<CoordsInt, List<Collidable>>[no_of_col_types];
for (int u = 0; u < no_of_col_types; u++)
grid[u] = new Dictionary<CoordsInt, List<Collidable>>(4000);
}
public abstract void InsertCollidable(Collidable c);
public abstract void InsertCollidable(Grid_AI_Placeable aic);
//gets called in the update loop
public void Clear()
{
for (int u = 0; u < no_of_col_types; u++)
grid[u].Clear();
}
//gets the grid cell of the lower-left corner
protected void BaseCell(Vector3 v, out int gx, out int gy)
{
gx = (int)((v.X + (WORLDWIDTH / 2)) / cellwidth);
gy = (int)((v.Y + (WORLDHEIGHT / 2)) / cellheight);
}
//gets all cells covered by the AABB
protected void Extent(Vector3 pos, float aabb_width, float aabb_height, out int totalx, out int totaly)
{
var xpos = pos.X + (WORLDWIDTH / 2);
var ypos = pos.Y + (WORLDHEIGHT / 2);
totalx = -(int)((xpos / cellwidth)) + (int)((xpos + aabb_width) / cellwidth) + 1;
totaly = -(int)((ypos / cellheight)) + (int)((ypos + aabb_height) / cellheight) + 1;
}
}
public class GridBaseImpl1 : GridBase
{
public GridBaseImpl1(float widthx, float widthy)
: base(widthx, widthy, 3)
{
}
//adds a collidable to the grid and caches it for intersection tests
//checks whether it should be tested to prevent penetration, tests penetration,
//and updates the close / intersecting / touching lists
//Collidable is an interface for all objects that can be tested geometrically
//the dictionary is indexed by some simple struct that wraps the row and column number in the grid
public override void InsertCollidable(Collidable c)
{
//some tag so that objects don't get checked more than once
Grid_Query_Counter.current++;
//the AABB is allocated on the heap
var aabb = c.CollisionAABB;
if (aabb == null) return;
int gx, gy, totalxcells, totalycells;
BaseCell(aabb.Position, out gx, out gy);
Extent(aabb.Position, aabb.widthx, aabb.widthy, out totalxcells, out totalycells);
//gets which groups to test this object with in an IEnumerable (from a statically created array)
var groupstestedagainst = CollidableCalls.GetListPrevent(c.CollisionType).Select(u => CollidableCalls.group[u]);
var groups_tested_against = groupstestedagainst.Distinct();
var own_group = CollidableCalls.group[c.CollisionType];
foreach (var list in groups_tested_against)
for (int i = -1; i < totalxcells + 1; i++)
for (int j = -1; j < totalycells + 1; j++)
{
var index = new CoordsInt((short)(gx + i), (short)(gy + j));
if (grid[list].ContainsKey(index))
foreach (var other in grid[list][index])
{
if (Grid_Query_Counter.Check(other.Tag))
{
//marks the pair as close, I've tried only keeping the 20 closest but it's still slow
other.Close.Add(c);
c.Close.Add(other);
//caches the pair so that checking whether the pair intersects doesn't go through the grid structure loop again
c.CachedIntersections.Add(other);
var collision_function_table_id = c.CollisionType * CollidableCalls.size + other.CollisionType;
//gets the function to use on the pair for testing penetration
//the function is in a delegate array statically created to simulate multiple dispatch
//the function decides which coarse test to use before descending to a full geometric query
var prevent_delegate = CollidableCalls.preventfunctions[collision_function_table_id];
if (prevent_delegate == null) { Grid_Query_Counter.Put(other.Tag); continue; }
var a = CollidableCalls.preventfunctions[collision_function_table_id](c, other);
//if the query returns true mark as touching
if (a) { c.Contacted.Add(other); other.Contacted.Add(c); }
//marks it as tested in this query
Grid_Query_Counter.Put(other.Tag);
}
}
}
//adds it to the grid; if the key doesn't exist, it creates the list first
for (int i = -1; i < totalxcells + 1; i++)
for (int j = -1; j < totalycells + 1; j++)
{
var index = new CoordsInt((short)(gx + i), (short)(gy + j));
if (!grid[own_group].ContainsKey(index)) grid[own_group][index] = new List<Collidable>();
grid[own_group][index].Add(c);
}
}
[...]
}
First, profile your code, even if you just use manually inserted timestamps to surround the blocks you're interested in. I prefer to use the profiler that comes built into Visual Studio Pro.
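For example, with manual timestamps (a minimal sketch using System.Diagnostics.Stopwatch; the call inside is just a stand-in for whatever block you suspect):
var sw = System.Diagnostics.Stopwatch.StartNew();
grid.InsertCollidable(someCollidable); // the block you suspect
sw.Stop();
Console.WriteLine("InsertCollidable: {0:F3} ms", sw.Elapsed.TotalMilliseconds);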
However, based on your description, I would assume your problems are due to too many draw calls. Once you exceed 200-400 draw calls per frame, performance can drop dramatically. Try batching your rendering and see if that improves performance.
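In XNA that usually means pushing as many sprites as possible through one SpriteBatch.Begin/End pair, sorted by texture so state changes stay low (a sketch; enemyTexture and enemies are placeholders):
spriteBatch.Begin(SpriteSortMode.Texture, BlendState.AlphaBlend);
foreach (var enemy in enemies)
    spriteBatch.Draw(enemyTexture, enemy.Position, Color.White);
spriteBatch.End(); // one batch instead of one draw call per sprite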
You can use a profiler such as ANTS Profiler to see what might be the problem.
Without any code, there's not much I can do.