GC collecting...what? - c#

I am trying to optimize my engine (C# + SlimDX) to make as less allocations as possible (to prevent the GC from firing too often) using as guide a profiler that gives me where the garbaged object are generated. Its going pretty well (going down from like 20 MB garbaged every 5s to 8 MB garbaged every 1 minute and half (yep, it was very little optimized XD))
There is a method where I can't find anything declarated and i don't know what's to do. It seems this method generate 2 garbaged object per execution in its body (not on a called function) :
Can somebody guide me to understand why this function generate object to be garbaged? I really don't have a clue.
public override void Update()
{
base.Update();
if (LastCheckInstancesNumber != Instances.Count)
{
LastCheckInstancesNumber = Instances.Count;
_needToRegenerateUpdate = true;
}
// Crea byte array da usare nel prossimo draw.
if (_needToRegenerateUpdate)
{
Int32 PrimitivesCount = Instances.Count;
Int32 Size = PrimitivesCount * 80;
if ((ByteUpdateTemp != null) && (ByteUpdateTemp.Length < Size))
ByteUpdateTemp = new byte[Size];
int offset = 0;
PrimitivesCount = 0;
Int32 Count = Instances.Count;
for (int i = 0; i < Count; i++)
{
InstancedBase3DObjectInstanceValues ib = Instances[i];
if (ib.Process)
{
MathHelper.CopyMatrix(ref ib._matrix, ref MatrixTemp);
MathHelper.CopyVector(ref ib._diffuseColor, ref ColorTemp);
ObjectUpdateTemp[0] = MatrixTemp.M11;
ObjectUpdateTemp[1] = MatrixTemp.M12;
ObjectUpdateTemp[2] = MatrixTemp.M13;
ObjectUpdateTemp[3] = MatrixTemp.M14;
ObjectUpdateTemp[4] = MatrixTemp.M21;
ObjectUpdateTemp[5] = MatrixTemp.M22;
ObjectUpdateTemp[6] = MatrixTemp.M23;
ObjectUpdateTemp[7] = MatrixTemp.M24;
ObjectUpdateTemp[8] = MatrixTemp.M31;
ObjectUpdateTemp[9] = MatrixTemp.M32;
ObjectUpdateTemp[10] = MatrixTemp.M33;
ObjectUpdateTemp[11] = MatrixTemp.M34;
ObjectUpdateTemp[12] = MatrixTemp.M41;
ObjectUpdateTemp[13] = MatrixTemp.M42;
ObjectUpdateTemp[14] = MatrixTemp.M43;
ObjectUpdateTemp[15] = MatrixTemp.M44;
ObjectUpdateTemp[16] = ColorTemp.X;
ObjectUpdateTemp[17] = ColorTemp.Y;
ObjectUpdateTemp[18] = ColorTemp.Z;
ObjectUpdateTemp[19] = ColorTemp.W;
ByteConverter.WriteSingleArrayToByte(ref ObjectUpdateTemp, ref ByteUpdateTemp, offset);
offset += 20;
PrimitivesCount++;
}
}
SynchronizedObject so = SynchronizationEventWriter.LockData();
so.Synchronizedobject = ByteUpdateTemp;
SynchronizationEventWriter.Update();
SynchronizationEventWriter.UnlockData();
_needToRegenerateUpdate = false;
so = SynchronizationEventWriterNum.LockData();
so.Synchronizedobject = PrimitivesCount;
SynchronizationEventWriterNum.Update();
SynchronizationEventWriterNum.UnlockData();
}
}
Notes :
The new byte[Size] is NEVER called due to caching.
The MathHelper function simply copy each element (Single) from one object to another without creating anything.
The base.Update() does almost nothing (and anyway is derived from ALL object in my engine, but only here i have the garbage object)
Thanks!!!
EDIT:
internal void GetLock()
{
Monitor.Enter(InternalLock);
Value.Locked = true;
Value.LockOwner = Thread.CurrentThread;
}
public SynchronizedObject LockData()
{
Parent.GetLock();
return Parent.Value;
}
Here's the code of the LockData(). I don't think it generates anything :|

I've resolved!!!
The problem was that the so.Synchronizedobject = PrimitivesCount; was assigning an Int32 to an Object class. It seems that this replaces every time the object causing the old object to be garbaged.
I resolved by using a box class to enclose the Int32 object and simply change the value inside.

What's in base.Update(), anything?
Can your profiler dump the heap? If so why I'd put a breakpoint just before this method and dump the heap and then again straight afterwards. That way you'll be able to see what type of object has been created.
Short of that a brute force approach of commenting out line by line is another (horrible) idea.
Do your MathHelper methods create a temp object?

I'm just guessing, but it looks like you're creating two SynchronizedObject objects in the bottom nine lines of that function:
SynchronizedObject so = SynchronizationEventWriter.LockData();
and
so = SynchronizationEventWriterNum.LockData();
No detailed knowledge of SynchronizedObject or whether LockData() actually creates anything, but it's the only choice I can see in your code...

Related

how to add map box tiles to gmap?

I want to get map box tiles from database but it does not work. I get MBTilesMapProvider class from here.
It is invoked like below:
map.Manager.Mode = AccessMode.ServerAndCache;
map.MapProvider = new MBTilesMapProvider("C:\\Users\\NPC\\Desktop\\test\\ne.mbtiles");
result:
but if google maps used as map provider like below it works well
map.Manager.Mode = AccessMode.ServerAndCache;
map.MapProvider = GoogleSatelliteMapProvider.Instance;
When i debuged i noticed that GetTiles method is invoked never.
Note: I think there is no problem about finding database because it reads meta data from database.
I solved solution by make a bit changes at MBTilesHelper.cs.
First i realized that when reading metadata from database MinZoom and MaxZoom values, they always stay as zero because it does not contain MinZoom or MaxZoom. Therefore, i set them manually.
and secondly i changed a bit "getImage" method.
private PureImage getImage(GPoint pos, int zoom)
{
PureImage retval = null;
var resultImage = _mbtiles.GetTileStream(pos.X, pos.Y, zoom);
if (resultImage != null && resultImage.Length > 0)
{
//resultImage.Position = 0L;
retval = GetTileImageFromArray(resultImage);
}
return retval;
}

Unity: What's allocating memory in my function?

I am relatively new to memory and optimization. I was tasked to optimize an existing project and while looking at the profiler in Unity, I saw that there was a function producing a bunch of stuff for the GC to clean up.
The function is as such:
public bool AnyCanvasOpen()
{
for (int i = 0; i < canvasList.Count; i++)
if (canvasList[i].activeSelf && canvasList[i].name != "MainUICanvas")
return true;
return false;
}
where canvasList is just a list of GameObjects. What could be the cause of my 282B allocation?
The canvasList variable is a List of GameObject and Unity's GameObject derives from UnityEngine.Object which has the name property you are using.
The memory allocation came from that: canvasList[i].name != "MainUICanvas"
This is because the Object.name property needs to return a string from the native side and this means that a new string is created each time that property is accessed. This is why the GameObject.CompareTag function was added, to make it possible to compare Object name without allocation memory. The GameObject.CompareTag function will compare the object name on the native side without the need to create or return a new string on the C# side. This removes the memory allocation.
Unfortunately, most Unity code examples on their site and elsewhere uses Object.name instead of the GameObject.CompareTag function which causes more people to use the Object.name property more often.
This shouldn't allocate memory:
List<GameObject> canvasList = new List<GameObject>();
public bool AnyCanvasOpen()
{
for (int i = 0; i < canvasList.Count; i++)
if (canvasList[i].activeSelf && canvasList[i].CompareTag("MainUICanvas"))
return true;
return false;
}

Am I reusing OpenCL/Cloo(C#) objects correctly?

I'm experimenting with OpenCL (through Cloo's C# interface). To do so, I'm experimenting with the customary matrix-multiplication-on-the-GPU. The problem is, during my speed tests, the application crashes. I'm trying to be efficient regarding the the re-allocation of various OpenCL objects, and I'm wondering if I'm botching something in doing so.
I'll put the code in this question, but for a bigger picture, you can get the code from github here: https://github.com/kwende/ClooMatrixMultiply
My main program does this:
Stopwatch gpuSw = new Stopwatch();
gpuSw.Start();
for (int c = 0; c < NumberOfIterations; c++)
{
float[] result = gpu.MultiplyMatrices(matrix1, matrix2, MatrixHeight, MatrixHeight, MatrixWidth);
}
gpuSw.Stop();
So I'm basically doing the call NumberOfIterations times, and timing the average execution time.
Within the MultiplyMatrices call, the first time through, I call Initialize to setup all the objects I'm going to reuse:
private void Initialize()
{
// get the intel integrated GPU
_integratedIntelGPUPlatform = ComputePlatform.Platforms.Where(n => n.Name.Contains("Intel")).First();
// create the compute context.
_context = new ComputeContext(
ComputeDeviceTypes.Gpu, // use the gpu
new ComputeContextPropertyList(_integratedIntelGPUPlatform), // use the intel openCL platform
null,
IntPtr.Zero);
// the command queue is the, well, queue of commands sent to the "device" (GPU)
_commandQueue = new ComputeCommandQueue(
_context, // the compute context
_context.Devices[0], // first device matching the context specifications
ComputeCommandQueueFlags.None); // no special flags
string kernelSource = null;
using (StreamReader sr = new StreamReader("kernel.cl"))
{
kernelSource = sr.ReadToEnd();
}
// create the "program"
_program = new ComputeProgram(_context, new string[] { kernelSource });
// compile.
_program.Build(null, null, null, IntPtr.Zero);
_kernel = _program.CreateKernel("ComputeMatrix");
}
I then enter the main body of my function (the part that will be executed NumberOfIterations times).
ComputeBuffer<float> matrix1Buffer = new ComputeBuffer<float>(_context,
ComputeMemoryFlags.ReadOnly| ComputeMemoryFlags.CopyHostPointer,
matrix1);
_kernel.SetMemoryArgument(0, matrix1Buffer);
ComputeBuffer<float> matrix2Buffer = new ComputeBuffer<float>(_context,
ComputeMemoryFlags.ReadOnly | ComputeMemoryFlags.CopyHostPointer,
matrix2);
_kernel.SetMemoryArgument(1, matrix2Buffer);
float[] ret = new float[matrix1Height * matrix2Width];
ComputeBuffer<float> retBuffer = new ComputeBuffer<float>(_context,
ComputeMemoryFlags.WriteOnly | ComputeMemoryFlags.CopyHostPointer,
ret);
_kernel.SetMemoryArgument(2, retBuffer);
_kernel.SetValueArgument<int>(3, matrix1WidthMatrix2Height);
_kernel.SetValueArgument<int>(4, matrix2Width);
_commandQueue.Execute(_kernel,
new long[] { 0 },
new long[] { matrix2Width ,matrix1Height },
null, null);
unsafe
{
fixed (float* retPtr = ret)
{
_commandQueue.Read(retBuffer,
false, 0,
ret.Length,
new IntPtr(retPtr),
null);
_commandQueue.Finish();
}
}
The third or fourth time through (it's somewhat random, which hints at memory access issues), the program crashes. Here is my kernel (I'm sure there are faster implementations, but right now my goal is just to get something working without blowing up):
kernel void ComputeMatrix(
global read_only float* matrix1,
global read_only float* matrix2,
global write_only float* output,
int matrix1WidthMatrix2Height,
int matrix2Width)
{
int x = get_global_id(0);
int y = get_global_id(1);
int i = y * matrix2Width + x;
float value = 0.0f;
// row y of matrix1 * column x of matrix2
for (int c = 0; c < matrix1WidthMatrix2Height; c++)
{
int m1Index = y * matrix1WidthMatrix2Height + c;
int m2Index = c * matrix2Width + x;
value += matrix1[m1Index] * matrix2[m2Index];
}
output[i] = value;
}
Ultimately the goal here is to better understand the zero-copy features of OpenCL (since I'm using Intel's integrated GPU). I have been having trouble getting it to work and so wanted to step back a bit to see if I understood even more basic things...apparently I don't as I can't get even this to work without blowing up.
The only other thing I can think of is it's how I'm pinning the pointer to send it to the .Read() function. But I don't know of an alternative.
Edit:
For what it's worth, I updated the last part of code (the read code) to this, and it still crashes:
_commandQueue.ReadFromBuffer(retBuffer, ref ret, false, null);
_commandQueue.Finish();
Edit #2
Solution found by huseyin tugrul buyukisik (see comment below).
Upon placing
matrix1Buffer.Dispose();
matrix2Buffer.Dispose();
retBuffer.Dispose();
At the end, it all worked fine.
OpenCl resources like buffers, kernels and commandqueues should be released after other resources that they are bound-to are released. Re-creating without releasing depletes avaliable slots quickly.
You have been re-creating arrays in a method of gpu and that was the scope of opencl buffers. When it finishes, GC cannot track opencl's unmanaged memory areas and that causes leaks, which makes crashes.
Many opencl implementations use C++ bindings which needs explicit release commands by C#, Java and other environments.
Also the set-argument part is not needed many times when repeated kernel executions use exact same buffer order as parameters of kernel.

Creating an array of objects using initializers seems to fail

I have a GameObject we'll call the GM. Attached to it is a script that's meant to be the primary logical controller for a game.
In that script, somewhere, I have:
private dbEquipment equipment_database = new dbEquipment();
The relevant snippet from dbEquipment.cs:
public class dbEquipment {
private int total_items = 13;
private clEquipment[] _master_equipment_list;
public dbEquipment() {
_master_equipment_list = new clEquipment[total_items];
_master_equipment_list[0] = new clEquipment {
... //large amount of object initializing here
};
... //etc, for all 13 items
}
}
When I run Unity, I get:
NullReferenceException: Object reference not set to an instance of an object
Pointed at the line:
_master_equipment_list[0] = new clEquipment { ...
I tried running through the array and initializing every clEquipment object to an empty clEquipment() first:
for(int x = 0; x < total_items; x++) { _master_equipment_list[x] = new clEquipment(); }
just to be totally sure that the array was actually filled, but I got the same result.
I've also tried changing it to be a List<clEquipment>, and changing everything appropriately -- no dice.
Any ideas?
My guess is that you may been including a null reference in the section that says //large amount of object initializing here when you create a new clEquipment.
_master_equipment_list[0] = new clEquipment {
... //check for nulls here
};
You might want to post the code for the clEquipment class. You say you tried initializing every object...did you do that before the break line? If it didn't break, thats a good sign.
Also, hard to tell from your code, but do you need a "()" in the initialization where it breaks? Just a thought
_master_equipment_list[0] = new clEquipment () {

Recursive conditions

Sorry to put up yet another recursion question, but I've looked over a fair few on here and haven't found the solution for my problem.
I use the below function:
unsafe
{
// Allocate global memory space for the size of AccessibleContextInfo and store the address in acPtr
IntPtr acPtr = Marshal.AllocHGlobal(Marshal.SizeOf(new AccessibleContextInfo()));
try
{
Marshal.StructureToPtr(new AccessibleContextInfo(), acPtr, true);
if (WABAPI.getAccessibleContextInfo(vmID, ac, acPtr))
{
acInfo = (AccessibleContextInfo)Marshal.PtrToStructure(acPtr, typeof(AccessibleContextInfo));
if (!ReferenceEquals(acInfo, null))
{
AccessibleTextItemsInfo atInfo = new AccessibleTextItemsInfo();
if (acInfo.accessibleText)
{
IntPtr ati = Marshal.AllocHGlobal(Marshal.SizeOf(new AccessibleTextItemsInfo()));
WABAPI.getAccessibleTextItems(vmID, ac, ati, 0); //THIS IS WHERE WE DO IT
atInfo = (AccessibleTextItemsInfo)Marshal.PtrToStructure(ati, typeof(AccessibleTextItemsInfo));
if (ati != IntPtr.Zero)
{
Marshal.FreeHGlobal(ati);
}
}
AccessibleTreeItem newItem = BuildAccessibleTree(acInfo, atInfo, parentItem, acPtr);
newItem.setAccessibleText(atInfo);
if (!ReferenceEquals(newItem, null))
{
for (int i = 0; i < acInfo.childrenCount; i++)
{
//Used roles = text, page tab, push button
if (acInfo.role_en_US != "unknown" && acInfo.states_en_US.Contains("visible")) // Note the optomization here, I found this get me to an acceptable speed
{
AccessibleContextInfo childAc = new AccessibleContextInfo();
IntPtr childContext = WABAPI.getAccessibleChildFromContext(vmID, ac, i);
GetAccessibleContextInfo(vmID, childContext, out childAc, newItem);
if (childContext != IntPtr.Zero)
{
Settings.Save.debugLog("Releasing object " + childContext.ToString() + " from JVM: " + vmID);
WABAPI.releaseJavaObject(vmID, childContext);
childContext = IntPtr.Zero;
}
}
}
}
return newItem;
}
}
else
{
acInfo = new AccessibleContextInfo();
}
}
finally
{
if (acPtr != IntPtr.Zero)
Marshal.FreeHGlobal(acPtr);
}
}
return null;
}
To build an AccessibleTreeItem representing the entire GUI of a Java application. However, this function takes 5-6 seconds to run. I'm only looking for one particular subsection of the tree (Lets call it Porkchops).
What I'd like to do is prior to building the tree, get the values and as soon as acRole.name == "Porkchop", use that as the parent object and create an AccessibleTreeItem that represents the subtree.
How on earth do I manage this? If this is a simple question, apologies, but it's driving me crazy.
Edit 1 - The performance hit is encountered on releaseJavaObject(), as when I remove that line the function completes in less than a second, but it creates a horrible memory leak.
Therefore, I'm not really looking for alternative solutions, as I know that the above does work correctly. I just need some way to check the value of acInfo.name prior to creating the tree, and then using the correct acInfo node as the parent.
Edit 2 - See the attached image for a better explanation than my rambling. Currently, the function will pull this entire tree from the JVM. I've highlighted the appropriate section that I work with, and would like to know if there's a way that will allow me to get that information, without building the entire tree. Or even if I could just return the tree once all children of that node have been populated.

Categories

Resources