Is any difference (in speed of my program - execution time) between??
1 st option:
private void particleReInit(int loop)
{
this.particle[loop].active = true;
this.particle[loop].life = 1.0f;
this.particle[loop].fade = 0.3f * (float)(this.random.Next(100)) / 1000.0f + 0.003f;
this.particle[loop].r = colors[this.col][0]; // Select Red Rainbow Color
this.particle[loop].g = colors[this.col][1]; // Select Red Rainbow Color
this.particle[loop].b = colors[this.col][2]; // Select Red Rainbow Color
this.particle[loop].x = 0.0f;
this.particle[loop].y = 0.0f;
this.particle[loop].xi = 10.0f * (this.random.Next(120) - 60.0f);
this.particle[loop].yi = (-50.0f) * (this.random.Next(60)) - (30.0f);
this.particle[loop].xg = 0.0f;
this.particle[loop].yg = 0.8f;
this.particle[loop].size = 0.2f;
this.particle[loop].center = new PointF(particleTextures[0].Width / 2, particleTextures[0].Height / 2);
}
2 nd option:
Particle p = particle[loop];
p.active = true;
p.life = 1.0f;
...
Where Particle particle[] = new Particle[NumberOfParticles]; is just an array of Particles which have some properties like life, position.
I'm doing it in Visual Studio 2010 Like WFA (Windows Form Aplication) and need to enhance performance (we're not able to use OpenGL, so for more particles my program tends to be slow).
I would certainly expect there to be a difference in speed - it's doing more work, after all. Another thread may have changed the contents of the array between statements, which may (or may not) be visible within this thread. If these are properties rather than fields, the property setter could even conceivably change the values in the array within the same thread, which would have to be visible.
Whether the difference in speed is significant or not is a different matter, and one that we can't judge.
More importantly, I'd say that the second form is clearer than the existing code.
In fact, if this is actually meant to be reinitializing the whole element, I'd actually create a new Particle and then assign that to the element:
particle[loop] = new Particle {
active = true,
life = 1f,
// etc
};
... or create a separate method/constructor which created a particle in an appropriate state.
While there is likely to be a very small difference in speed, any difference in speed here is likely going to be negligible. I would use the version that is the easiest to read and maintain.
Personally, I prefer your 2nd option, as I find it much easier to read.
I'm doing it in Visual Studio 2010 Like WFA (Windows Form Aplication) and need to enhance performance (we're not able to use OpenGL .. so for more particles my program tends to be slow).
If your goal is performance, I highly doubt this routine is the core of your problem. You should run this under a profiler. Until you actually measure your performance, and find the real problem, you're just guessing and likely going to spend time optimizing the wrong thing.
It is far more likely that the rendering of the "particles" using GDI+ is likely more of a bottleneck, and changing this routine between your two options will have no impact on the perceived speed.
As an alternative - make Particle a struct, initialize it sequentially in loop and I bet it will be much faster as any of these 2 options.
Related
So, this seems to be a strange one (to me). I'm relatively new to Unity, so I'm sure this is me misunderstanding something.
I'm working on a VSEPR tutorial module in unity. VSEPR is the model by which electrons repel each other and form the shape (geometry) of atoms.
I'm simulating this by making (in this case) 4 rods (cylinder primitives) and using rigidbody.AddForce to apply an equal force against all of them. This works beautifully, as long as the force is equal. In the following image you'll see the rods are beautifully equi-distant at 109.47 degrees (actually you can "see" their attached objects, two lone pairs and two electron bonds...the rods are obscured in the atom shell.)
(BTW the atom's shell is just a sphere primitive - painted all pretty.)
HOWEVER, in the real world, the lone pairs actually exert SLIGHTLY more force...so when I add this additional force to the model...instead of just pushing the other electron rods a little farther away, it pushes the ENTIRE 4-bond structure outside the atom's shell.
THE REASON THIS SEEMS ODD IS...two things.
All the rods are children of the shell...so I thought that made them somewhat immobile compared to the shell (I.e. if they moved, the shell would move with them...which would be fine).
I have a ConfiguableJoint holding the rods to the center of the atoms shell ( 0,0,0). The x/y/z-motion is set to fixed. I thought this should keep the rods fairly immutably attached to the 0,0,0 center of the shell...but I guess not.
POINTS OF INTEREST:
The rods only push on each other, by a script attached to one rod adding force to an adjacent rod. they do not AddForce to the atom shell or anything else.
CODE:
void RepulseLike() {
if (control.disableRepulsionForce) { return; } // No force when dragging
//////////////////// DETERMINE IF FORCE APPLIED //////////////////////////////
// Scroll through each Collider that this.BondStick bumps
foreach (Collider found in Physics.OverlapSphere(transform.position, (1f))) {
// Don't repel self
if (found == this.collider) { continue; }
// Check for charged particle
if (found.gameObject.tag.IndexOf("Charge") < 0) { continue; }// No match "charge", not a charged particle
/////////////// APPLY FORCE ///////////////
// F = k(q1*q2/r^2)
// where
// k = Culombs constant which in this instance is represented by repulseChargeFactor
// r = distance
// q1 and q2 are the signed magnitudes of the charges, in this case -1 for electrons and +1 for protons
// Swap from local to global variable for other methods
other = found;
/////////////////////////////////
// Calculate pushPoints for SingleBonds
forceDirection = (other.transform.position - transform.position) * magnetism; //magnetism = 1
// F = k(q1*q2/distance^2), q1*q2 ia always one in this scenario. k is arbitrary in this scenario
force = control.repulseChargeFactor * (1 / (Mathf.Pow(distance, 2)));
found.rigidbody.AddForce(forceDirection.normalized * force);// * Time.fixedDeltaTime);
}//Foreach Collider
}//RepulseLike Method
You may want to use spheres to represent electrons, so apply forces onto them and re-orient(rotate) rods according to this sphere.
I think I found my own answer...unless someone has suggestions for a better way to handle this.
I simply reduced the mass of the electron rods, and increased the mass of the sphere. I'm not sure if this is the "best practices" solution or a kluge...so input still welcomed :-)
I am working on a system, which is distributing Commands from a HashSet to a Player. I want to distribute a Command to the Player, who is closest to the Command.
void AssignCommand(Player player, HashSet<Command> commandList) {
//Player assigned;
float min = float.MaxValue;
float dist;
foreach(Command command in commandList) {
dist = Vector3.Distance(command.Position, player.Position);
if(dist < min) {
//check if command already assigned to another player
assigned = command.assigned;
if(assigned != null) {
//reassign when distance is smaller
if(dist < command.Distance(assigned)) {
//mark previously assigned command as unassigned
if(player.activeCommand != null) player.activeCommand.assigned = null;
player.activeCommand = command;
command.assigned = player;
min = dist;
assigned.activeCommand = null;
AssignCommand(assigned, commandList);
}
}
else {
if(player.activeCommand != null) player.activeCommand.assigned = null;
player.activeCommand = command;
command.assigned = player;
min = dist;
}
}
}
}
My problem with this code is that if there are a lot of commands in the HashSet it takes quite a while and the framerate drops from ~60 to about ~30 fps on my machine. This is no surprise, because the Vector3.Distance method is just called for (every player) * (every command), which is way too much. I am looking now for a way to reduce the number of calls somehow to improve the performance. Any ideas here?
I also tried running this code in a different Thread, but I gave up, because this is changing and using too many Thread Unsafe values. My latest try brought me to the check if assigned != null throwing an error for comparing.
I would be very grateful for any hints either improving the overall speed of this code or how to run this in a ThreadPool. If required I can also post my JobClass I created for the Thread attempt.
All the threading solutions and optimizations are fine, but the biggest thing you want to keep in mind (for this and for the future) is: Do not use Vector3.Distance or Vector3.magnitude for this, ever. They are inefficient.
Instead, use Vector3.sqrMagnitude, which is the same thing (for distance comparison), without the sqrt (the most expensive part).
Another optimization is to write your own (square) distance calculation, throwing out the y value if you know you don't care about vertical distances. My distance comparison code was slow, so I tested this pretty carefully and found this is the fastest way (especially if you don't care about vertical positions): (EDIT: this was fastest in 2015. Test yourself for the fastest code on modern Unity.)
tempPosition = enemy.transform.position; // declared outside the loop, but AFAIK that shouldn't make any difference
float xD = targetPosition.x - tempPosition.x;
float yD = targetPosition.y - tempPosition.y; // optional
float zD = targetPosition.z - tempPosition.z;
float dist2 = xD*xD + yD*yD + zD*zD; // or xD*xD + zD*zD
Edit: Another optimization (that you're likely already doing) is to only recalculate when a player has moved. I like this one because it doesn't compromise the data at all.
I wrote my own version of System.Threading.Tasks for unity and added something like this for ordering workload based on distance from the camera.
Essentially whenever a task (or in your case command) was needed it passed off a position and the task to a TaskManager that would then each frame sort the items it had and run through them.
You could probably do the same / similar but instead of passing the command to some sort of CommmandManager like I did with a TaskManager do a lookup on creation and pass the command to the player closest to the point in question.
Most people these days are pulling their scene graphs in to something like a quad tree which should make finding the right player fairly fast then each player would be responsible for executing its own commands.
Okay after hours working on the issue I finally found a solution to improve the performance on the one hand and make it threadable on the other hand. I had problems with the threading, because player is a Unity object.. Instead of using the player object within the task I only get its position in the Start() method. Thus I managed to make it threadable, although it seemed strange to me, that I represented the players by its positions using Vector3 now.
Improving the performance did happen by adding another Dictionary storing the already calculated distances for each player. So when reassigning a player the distance did not need to be recalculated again... I am not sure how much of performance this brought, because I tested it together with the Thread, but at least I got rid of some calls of Distance and am back at stabel 60 fps!
Furthermore I messed something up with the recursions, so that I got up to 100.000 recursions for each player.. This should NOT happen o.O. The fix was easy enough. Simply added a minCommand command, which I set during the foreach and only assign commands and touch the Sets AFTER the foreach.. I bet the code would now run like sugar even without threading...
I am currently working on a project in C# where i play around with planetary gravitation, which i know is a hardcore topic to graps to it's fullest but i like challenges. I've been reading up on Newtons laws and Keplers Laws, but one thing i cannot figure out is how to get the correct gravitational direction.
In my example i only have 2 bodies. A Satellite and a Planet. This is to make is simplify it, so i can grasp it - but my plan is to have multiple objects that dynamically effect each other, and hopefully end up with a somewhat realistic multi-body system.
When you have an orbit, then the satellite has a gravitational force, and that is ofcourse in the direction of the planet, but that direction isn't a constant. To explain my problem better i'll try using an example:
let's say we have a satellite moving at a speed of 50 m/s and accelerates towards the planet at a speed of 10 m/s/s, in a radius of 100 m. (all theoretical numbers) If we then say that the framerate is at 1, then after one second the object will be 50 units forward and 10 units down.
As the satellite moves multiple units in a frame and about 50% of the radius, the gravitational direcion have shifted alot, during this frame, but the applied force have only been "downwards". this creates a big margin of error, especially if the object is moving a big percentage of the radius.
In our example we'd probably needed our graviational direction to be based upon the average between our current position and the position at the end of this frame.
How would one go about calculating this?
I have a basis understanding of trigonometry, but mainly with focus on triangles. Assume i am stupid, because compared to any of you, i probably am.
(I made a previous question but ended up deleting it as it created some hostility and was basicly not that well phrased, and was ALL to general - it wasn't really a specific question. i hope this is better. if not, then please inform me, i am here to learn :) )
Just for reference, this is the function i have right now for movement:
foreach (ExtTerBody OtherObject in UniverseController.CurrentUniverse.ExterTerBodies.Where(x => x != this))
{
double massOther = OtherObject.Mass;
double R = Vector2Math.Distance(Position, OtherObject.Position);
double V = (massOther) / Math.Pow(R,2) * UniverseController.DeltaTime;
Vector2 NonNormTwo = (OtherObject.Position - Position).Normalized() * V;
Vector2 NonNormDir = Velocity + NonNormTwo;
Velocity = NonNormDir;
Position += Velocity * Time.DeltaTime;
}
If i have phrased myself badly, please ask me to rephrase parts - English isn't my native language, and specific subjects can be hard to phrase, when you don't know the correct technical terms. :)
I have a hunch that this is covered in keplers second law, but if it is, then i'm not sure how to use it, as i don't understand his laws to the fullest.
Thank you for your time - it means alot!
(also if anyone see multi mistakes in my function, then please point them out!)
I am currently working on a project in C# where i play around with planetary gravitation
This is a fun way to learn simulation techniques, programming and physics at the same time.
One thing I cannot figure out is how to get the correct gravitational direction.
I assume that you are not trying to simulate relativistic gravitation. The Earth isn't in orbit around the Sun, the Earth is in orbit around where the sun was eight minutes ago. Correcting for the fact that gravitation is not instantaneous can be difficult. (UPDATE: According to commentary this is incorrect. What do I know; I stopped taking physics after second year Newtonian dynamics and have only the vaguest understanding of tensor calculus.)
You'll do best at this early stage to assume that the gravitational force is instantaneous and that planets are points with all their mass at the center. The gravitational force vector is a straight line from one point to another.
Let's say we have a satellite moving at a speed of 50 m/s ... If we then say that the framerate is one frame per second then after one second the object will be 50 units right and 10 units down.
Let's make that more clear. Force is equal to mass times acceleration. You work out the force between the bodies. You know their masses, so you now know the acceleration of each body. Each body has a position and a velocity. The acceleration changes the velocity. The velocity changes the position. So if the particle starts off having a velocity of 50 m/s to the left and 0 m/s down, and then you apply a force that accelerates it by 10 m/s/s down, then we can work out the change to the velocity, and then the change to the position. As you note, at the end of that second the position and the velocity will have both changed by a huge amount compared to their existing magnitudes.
As the satellite moves multiple units in a frame and about 50% of the radius, the gravitational direcion have shifted alot, during this frame, but the applied force have only been "downwards". this creates a big margin of error, especially if the object is moving a big percentage of the radius.
Correct. The problem is that the frame rate is enormously too low to correctly model the interaction you're describing. You need to be running the simulation so that you're looking at tenths, hundredths or thousanths of seconds if the objects are changing direction that rapidly. The size of the time step is usually called the "delta t" of the simulation, and yours is way too large.
For planetary bodies, what you're doing now is like trying to model the earth by simulating its position every few months and assuming it moves in a straight line in the meanwhile. You need to actually simulate its position every few minutes, not every few months.
In our example we'd probably needed our graviational direction to be based upon the average between our current position and the position at the end of this frame.
You could do that but it would be easier to simply decrease the "delta t" for the computation. Then the difference between the directions at the beginning and the end of the frame is much smaller.
Once you've got that sorted out then there are more techniques you can use. For example, you could detect when the position changes too much between frames and go back and redo the computations with a smaller time step. If the positions change hardly at all then increase the time step.
Once you've got that sorted, there are lots of more advanced techniques you can use in physics simulations, but I would start by getting basic time stepping really solid first. The more advanced techniques are essentially variations on your idea of "do a smarter interpolation of the change over the time step" -- you are on the right track here, but you should walk before you run.
I'll start with a technique that is almost as simple as the Euler-Cromer integration you've been using but is markedly more accurate. This is the leapfrog technique. The idea is very simple: position and velocity are kept at half time steps from one another.
The initial state has position and velocity at time t0. To get that half step offset, you'll need a special case for the very first step, where velocity is advanced half a time step using the acceleration at the start of the interval and then position is advanced by a full step. After this first time special case, the code works just like your Euler-Cromer integrator.
In pseudo code, the algorithm looks like
void calculate_accel (orbiting_body_collection, central_body) {
foreach (orbiting_body : orbiting_body_collection) {
delta_pos = central_body.pos - orbiting_body.pos;
orbiting_body.acc =
(central_body.mu / pow(delta_pos.magnitude(),3)) * delta_pos;
}
}
void leapfrog_step (orbiting_body_collection, central_body, delta_t) {
static bool initialized = false;
calculate_accel (orbiting_body_collection, central_body);
if (! initialized) {
initialized = true;
foreach orbiting_body {
orbiting_body.vel += orbiting_body.acc*delta_t/2.0;
orbiting_body.pos += orbiting_body.vel*delta_t;
}
}
else {
foreach orbiting_body {
orbiting_body.vel += orbiting_body.acc*delta_t;
orbiting_body.pos += orbiting_body.vel*delta_t;
}
}
}
Note that I've added acceleration as a field of each orbiting body. This was a temporary step to keep the algorithm similar to yours. Note also that I moved the calculation of acceleration to it's own separate function. That is not a temporary step. It is the first essential step to advancing to even more advanced integration techniques.
The next essential step is to undo that temporary addition of the acceleration. The accelerations properly belong to the integrator, not the body. On the other hand, the calculation of accelerations belongs to the problem space, not the integrator. You might want to add relativistic corrections, or solar radiation pressure, or planet to planet gravitational interactions. The integrator should be unaware of what goes into those accelerations are calculated. The function calculate_accels is a black box called by the integrator.
Different integrators have very different concepts of when accelerations need to be calculated. Some store a history of recent accelerations, some need an additional workspace to compute an average acceleration of some sort. Some do the same with velocities (keep a history, have some velocity workspace). Some more advanced integration techniques use a number of techniques internally, switching from one to another to provide the best balance between accuracy and CPU usage. If you want to simulate the solar system, you need an extremely accurate integrator. (And you need to move far, far away from floats. Even doubles aren't good enough for a high precision solar system integration. With floats, there's not much point going past RK4, and maybe not even leapfrog.)
Properly separating what belongs to whom (the integrator versus the problem space) makes it possible to refine the problem domain (add relativity, etc.) and makes it possible to easily switch integration techniques so you can evaluate one technique versus another.
So i found a solution, it might not be the smartest, but it works, and it's pretty came to mind after reading both Eric's answer and also reading the comment made by marcus, you could say that it's a combination of the two:
This is the new code:
foreach (ExtTerBody OtherObject in UniverseController.CurrentUniverse.ExterTerBodies.Where(x => x != this))
{
double massOther = OtherObject.Mass;
double R = Vector2Math.Distance(Position, OtherObject.Position);
double V = (massOther) / Math.Pow(R,2) * Time.DeltaTime;
float VRmod = (float)Math.Round(V/(R*0.001), 0, MidpointRounding.AwayFromZero);
if(V > R*0.01f)
{
for (int x = 0; x < VRmod; x++)
{
EulerMovement(OtherObject, Time.DeltaTime / VRmod);
}
}
else
EulerMovement(OtherObject, Time.DeltaTime);
}
public void EulerMovement(ExtTerBody OtherObject, float deltaTime)
{
double massOther = OtherObject.Mass;
double R = Vector2Math.Distance(Position, OtherObject.Position);
double V = (massOther) / Math.Pow(R, 2) * deltaTime;
Vector2 NonNormTwo = (OtherObject.Position - Position).Normalized() * V;
Vector2 NonNormDir = Velocity + NonNormTwo;
Velocity = NonNormDir;
//Debug.WriteLine("Velocity=" + Velocity);
Position += Velocity * deltaTime;
}
To explain it:
I came to the conclusion that if the problem was that the satellite had too much velocity in one frame, then why not seperate it into multiple frames? So this is what "it" does now.
When the velocity of the satellite is more than 1% of the current radius, it seperates the calculation into multiple bites, making it more precise.. This will ofcourse lower the framerate when working with high velocities, but it's okay with a project like this.
Different solutions are still very welcome. I might tweak the trigger-amounts, but the most important thing is that it works, then i can worry about making it more smooth!
Thank's everybody that took a look, and everyone who helped be find the conclusion myself! :) It's awesome that people can help like this!
First of all, I am aware that this question really sounds as if I didn't search, but I did, a lot.
I wrote a small Mandelbrot drawing code for C#, it's basically a windows form with a PictureBox on which I draw the Mandelbrot set.
My problem is, is that it's pretty slow. Without a deep zoom it does a pretty good job and moving around and zooming is pretty smooth, takes less than a second per drawing, but once I start to zoom in a little and get to places which require more calculations it becomes really slow.
On other Mandelbrot applications my computer does really fine on places which work much slower in my application, so I'm guessing there is much I can do to improve the speed.
I did the following things to optimize it:
Instead of using the SetPixel GetPixel methods on the bitmap object, I used LockBits method to write directly to memory which made things a lot faster.
Instead of using complex number objects (with classes I made myself, not the built-in ones), I emulated complex numbers using 2 variables, re and im. Doing this allowed me to cut down on multiplications because squaring the real part and the imaginary part is something that is done a few time during the calculation, so I just save the square in a variable and reuse the result without the need to recalculate it.
I use 4 threads to draw the Mandelbrot, each thread does a different quarter of the image and they all work simultaneously. As I understood, that means my CPU will use 4 of its cores to draw the image.
I use the Escape Time Algorithm, which as I understood is the fastest?
Here is my how I move between the pixels and calculate, it's commented out so I hope it's understandable:
//Pixel by pixel loop:
for (int r = rRes; r < wTo; r++)
{
for (int i = iRes; i < hTo; i++)
{
//These calculations are to determine what complex number corresponds to the (r,i) pixel.
double re = (r - (w/2))*step + zeroX ;
double im = (i - (h/2))*step - zeroY;
//Create the Z complex number
double zRe = 0;
double zIm = 0;
//Variables to store the squares of the real and imaginary part.
double multZre = 0;
double multZim = 0;
//Start iterating the with the complex number to determine it's escape time (mandelValue)
int mandelValue = 0;
while (multZre + multZim < 4 && mandelValue < iters)
{
/*The new real part equals re(z)^2 - im(z)^2 + re(c), we store it in a temp variable
tempRe because we still need re(z) in the next calculation
*/
double tempRe = multZre - multZim + re;
/*The new imaginary part is equal to 2*re(z)*im(z) + im(c)
* Instead of multiplying these by 2 I add re(z) to itself and then multiply by im(z), which
* means I just do 1 multiplication instead of 2.
*/
zRe += zRe;
zIm = zRe * zIm + im;
zRe = tempRe; // We can now put the temp value in its place.
// Do the squaring now, they will be used in the next calculation.
multZre = zRe * zRe;
multZim = zIm * zIm;
//Increase the mandelValue by one, because the iteration is now finished.
mandelValue += 1;
}
//After the mandelValue is found, this colors its pixel accordingly (unsafe code, accesses memory directly):
//(Unimportant for my question, I doubt the problem is with this because my code becomes really slow
// as the number of ITERATIONS grow, this only executes more as the number of pixels grow).
Byte* pos = px + (i * str) + (pixelSize * r);
byte col = (byte)((1 - ((double)mandelValue / iters)) * 255);
pos[0] = col;
pos[1] = col;
pos[2] = col;
}
}
What can I do to improve this? Do you find any obvious optimization problems in my code?
Right now there are 2 ways I know I can improve it:
I need to use a different type for numbers, double is limited with accuracy and I'm sure there are better non-built-in alternative types which are faster (they multiply and add faster) and have more accuracy, I just need someone to point me where I need to look and tell me if it's true.
I can move processing to the GPU. I have no idea how to do this (OpenGL maybe? DirectX? is it even that simple or will I need to learn a lot of stuff?). If someone can send me links to proper tutorials on this subject or tell me in general about it that would be great.
Thanks a lot for reading that far and hope you can help me :)
If you decide to move the processing to the gpu, you can choose from a number of options. Since you are using C#, XNA will allow you to use HLSL. RB Whitaker has the easiest XNA tutorials if you choose this option. Another option is OpenCL. OpenTK comes with a demo program of a julia set fractal. This would be very simple to modify to display the mandlebrot set. See here
Just remember to find the GLSL shader that goes with the source code.
About the GPU, examples are no help for me because I have absolutely
no idea about this topic, how does it even work and what kind of
calculations the GPU can do (or how is it even accessed?)
Different GPU software works differently however ...
Typically a programmer will write a program for the GPU in a shader language such as HLSL, GLSL or OpenCL. The program written in C# will load the shader code and compile it, and then use functions in an API to send a job to the GPU and get the result back afterwards.
Take a look at FX Composer or render monkey if you want some practice with shaders with out having to worry about APIs.
If you are using HLSL, the rendering pipeline looks like this.
The vertex shader is responsible for taking points in 3D space and calculating their position in your 2D viewing field. (Not a big concern for you since you are working in 2D)
The pixel shader is responsible for applying shader effects to the pixels after the vertex shader is done.
OpenCL is a different story, its geared towards general purpose GPU computing (ie: not just graphics). Its more powerful and can be used for GPUs, DSPs, and building super computers.
WRT coding for the GPU, you can look at Cudafy.Net (it does OpenCL too, which is not tied to NVidia) to start getting an understanding of what's going on and perhaps even do everything you need there. I've quickly found it - and my graphics card - unsuitable for my needs, but for the Mandelbrot at the stage you're at, it should be fine.
In brief: You code for the GPU with a flavour of C (Cuda C or OpenCL normally) then push the "kernel" (your compiled C method) to the GPU followed by any source data, and then invoke that "kernel", often with parameters to say what data to use - or perhaps a few parameters to tell it where to place the results in its memory.
When I've been doing fractal rendering myself, I've avoided drawing to a bitmap for the reasons already outlined and deferred the render phase. Besides that, I tend to write massively multithreaded code which is really bad for trying to access a bitmap. Instead, I write to a common store - most recently I've used a MemoryMappedFile (a builtin .Net class) since that gives me pretty decent random access speed and a huge addressable area. I also tend to write my results to a queue and have another thread deal with committing the data to storage; the compute times of each Mandelbrot pixel will be "ragged" - that is to say that they will not always take the same length of time. As a result, your pixel commit could be the bottleneck for very low iteration counts. Farming it out to another thread means your compute threads are never waiting for storage to complete.
I'm currently playing with the Buddhabrot visualisation of the Mandelbrot set, looking at using a GPU to scale out the rendering (since it's taking a very long time with the CPU) and having a huge result-set. I was thinking of targetting an 8 gigapixel image, but I've come to the realisation that I need to diverge from the constraints of pixels, and possibly away from floating point arithmetic due to precision issues. I'm also going to have to buy some new hardware so I can interact with the GPU differently - different compute jobs will finish at different times (as per my iteration count comment earlier) so I can't just fire batches of threads and wait for them all to complete without potentially wasting a lot of time waiting for one particularly high iteration count out of the whole batch.
Another point to make that I hardly ever see being made about the Mandelbrot Set is that it is symmetrical. You might be doing twice as much calculating as you need to.
For moving the processing to the GPU, you have lots of excellent examples here:
https://www.shadertoy.com/results?query=mandelbrot
Note that you need an WebGL capable browser to view that link. Works best in Chrome.
I'm no expert on fractals but you seem to have come far already with the optimizations. Going beyond that may make the code much harder to read and maintain so you should ask yourself it is worth it.
One technique I've often observed in other fractal programs is this: While zooming, calculate the fractal at a lower resolution and stretch it to full size during render. Then render at full resolution as soon as zooming stops.
Another suggestion is that when you use multiple threads you should take care that each thread don't read/write memory of other threads because this will cause cache collisions and hurt performance. One good algorithm could be split the work up in scanlines (instead of four quarters like you did now). Create a number of threads, then as long as there as lines left to process, assign a scanline to a thread that is available. Let each thread write the pixel data to a local piece of memory and copy this back to main bitmap after each line (to avoid cache collisions).
At the moment, I have a sprite that I have arbitrarily set to move 1 pixel per second. The code is basically this (The code isn't optimised at all, I could do it much nicer but it is the principle I am trying to solve first:):
private const long MOVEMENT_SPEED = 10000000; // Ticks in 1 second
private long movementTimeSpan = MOVEMENT_SPEED;
protected void PerformMovement(GameTime gameTime)
{
movementTimeSpan -= gameTime.ElapsedGameTime.Ticks;
if (movementTimeSpan <= 0)
{
// Do the movement of 1 pixel in here, and set movementTimeSpan back to MOVEMENT_SPEED
}
}
Perform movement is called in a loop as you'd expect, and it equates to updating around 10 times per second. So if I lower the MOVEMENT_SPEED, my sprite speeds up, but it never gets any faster than 10 pixels per second. For projectiles and other stuff I obviously want it to update much faster than this.
If I alter the movement to 2 pixels or more, it creates issues with calculating collisions and suchlike, but these might be possible to overcome.
The other alternative is to store x and y as a float rather than an int, and increase the values as a fraction of the number of elapsed ticks. I am not sure if this will create smooth movement or not as there still has to be some rounding involved.
So my question is, does anyone know the standard way?
Should I increase the amount to more than 1 pixel and update my collision detection to be recursive, should I store X,Y as floats and move as a % of elapsed time, or is there a 3rd better way of doing it?
The standard way is to not count down a timer to move, but instead the opposite:
private const float MOVEMENT_SPEED = 10.0f; //pixels per second
private float time;
protected void PerformMovement(GameTime gameTime)
{
time = (float)gameTime.ElapsedGameTime.TotalSeconds;
character.X += MOVEMENT_SPEED * time;
}
Make the movement based on the time elapsed. The reason floats are commonly used is to get the fractional value of motion. Fixed-point is another common fractional representation but uses ints instead.
As for collision, collision can be very tricky but in general you don't absolutely need to do it once per pixel of motion (as you suggested with recursion); that's overkill and will lead to terrible performance in no time. If you are currently having trouble with 2-pixel motion, I would reevaluate how you're doing your collisions. In general, it becomes problematic when you're moving very fast to the point of skipping over thin walls, or even passing over to the "wrong side" of a wall, depending on how your collision is set up. This is known as "tunnelling". There are many ways of solving this. Look here and scroll down to "Preventing Tunnelling". As the article states many people just cap their speed at a safe value. But another common method is to "step" through your algorithm in smaller time steps than is currently being passed in. For example, if the current elapsed time is 0.1, you could step by 0.01 within a loop and check each small step.
A way to do what you request, although not very recommended, is to increase your game's update frequency to a higher value than the usual 30 or 60 fps, but only draw to the screen every N frames. You can do it by just having your graphics engine ignore Draw calls until a count or timer reaches the desired value.
Of course, this solution should be avoided unless it is specifically desired, because performance can degrade quite fast as the number of updated elements increases.
For example, Proun (not an XNA game) uses this trick for exactly your reasons.
With the default of IsFixedTimeStep = true, XNA behaves in a similar fashion, skipping calls to Draw if Update takes too long.