How can I draw smoothed/rounded/curved line graphs? (C#) - c#

I'm measuring some system performance data to store it in a database. From those data points I'm drawing line graphs over time. By their nature, those data points are a bit noisy, i.e. every single point deviates at least a bit from the local mean value. When drawing the line graph straight from one point to the next, it produces jagged graphs. At a large time scale like > 10 data points per pixel, this noise is compressed into a wide jagged line area that is, say, 20px high instead of 1px as in smaller scales.
I've read about line smoothing, anti-aliasing, simplifying and all these things. But everything I've found seems to be about something else.
I don't need anti-aliasing, .NET already does that for me when drawing the line on the screen.
I don't want simplification. I need the extreme values to remain visible, at least most of them.
I think it goes in the direction of spline curves, but I couldn't find many example images to evaluate whether the described thing is what I want. I did find a highly scientific book at Google Books though, full of half-page long formulas, which I didn't feel like reading through right now...
To give you an example, just look at Linux/Gnome's system monitor application. It draws the recent CPU/memory/network usage with a smoothed line. This may be a bit oversimplified, but I'd give it a try and see if I can tweak it.
I'd prefer C# code but algorithms or code in other languages is fine, too, as long as I can port it to C# without external references.

You can do some data-smoothing. Instead of using the real data, apply a simple smoothing algorithm that keeps the peaks, like a Savitzky-Golay filter.
You can get the coefficients here.
The easiest to do is:
Take the top coefficients from the website I linked to:
// For np = 5 (5 data points)
var h = 35.0;
var coeff = new float[] { 17, 12, -3 }; // coefficients from the site
var easyCoeff = new float[] {-3, 12, 17, 12, -3}; // It's symmetrical
var center = 2; // = the center of the easyCoeff array
// now for every point from your data you calculate a smoothed point:
smoothed[x] =
((data[x - 2] * easyCoeff[center - 2]) +
(data[x - 1] * easyCoeff[center - 1]) +
(data[x - 0] * easyCoeff[center - 0]) +
(data[x + 1] * easyCoeff[center + 1]) +
(data[x + 2] * easyCoeff[center + 2])) / h;
The first 2 and last 2 points cannot be smoothed when using 5 points.
If you want your data to be more "smoothed" you can experiment with coefficients for a larger number of data points.
Now you can draw a line through your "smoothed" data. The larger your np (number of points), the smoother your data. You also lose some peak accuracy, though not as much as when simply averaging some points together.
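The steps above can be sketched as a small helper. This is only a minimal sketch: the class and method names are made up for illustration, and leaving the first and last two points unchanged is an assumption (the answer only says they cannot be smoothed with a 5-point window).

```csharp
using System;

public static class SavitzkyGolay
{
    // 5-point coefficients and normalisation factor quoted in the answer above.
    private static readonly double[] Coeff = { -3, 12, 17, 12, -3 };
    private const double Norm = 35.0;

    // Returns a smoothed copy of data. Points closer than 2 samples to either
    // end are copied unchanged (an assumption, see the lead-in).
    public static double[] Smooth(double[] data)
    {
        var smoothed = (double[])data.Clone();
        int half = Coeff.Length / 2;
        for (int x = half; x < data.Length - half; x++)
        {
            double sum = 0.0;
            for (int k = -half; k <= half; k++)
                sum += data[x + k] * Coeff[half + k];
            smoothed[x] = sum / Norm;
        }
        return smoothed;
    }
}
```

Calling SavitzkyGolay.Smooth(values) before plotting leaves straight-line data untouched and only damps the point-to-point jitter.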

You cannot fix this in the graphics code. If your data is noisy then the graph is going to be noisy as well, no matter what kind of line smoothing algorithm you use. You'll need to filter the data first. Create a second data set with points that are interpolated from the original data. A Least Squares fit is a common technique. Averaging is simple to implement but tends to hide extremes.

I think what you are looking for is a routine to provide 'splines'. Here is a link describing splines:
http://en.wikipedia.org/wiki/Spline_(mathematics)
If that is the case I don't have any recommendations for a spline library, but an initial google search turned up a bunch.
Sorry for no code, but hopefully knowing the terminology will aid you in your search.
Bob

Reduce the number of data points, using MIN/MAX/AVG before you display them. It'll look nicer and it'll be faster
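A rough sketch of that reduction, assuming one bucket per horizontal pixel; the names Downsampler and MinMaxAvg are hypothetical, and both the input array and the bucket count are assumed to be non-empty and positive.

```csharp
using System;

public static class Downsampler
{
    // Splits data into bucketCount consecutive buckets and returns
    // { min, max, average } for each bucket.
    public static double[][] MinMaxAvg(double[] data, int bucketCount)
    {
        var result = new double[bucketCount][];
        for (int b = 0; b < bucketCount; b++)
        {
            int start = (int)((long)b * data.Length / bucketCount);
            int end = (int)((long)(b + 1) * data.Length / bucketCount);
            if (end <= start) end = start + 1; // more buckets than samples

            double min = data[start], max = data[start], sum = 0.0;
            for (int i = start; i < end; i++)
            {
                if (data[i] < min) min = data[i];
                if (data[i] > max) max = data[i];
                sum += data[i];
            }
            result[b] = new[] { min, max, sum / (end - start) };
        }
        return result;
    }
}
```

Drawing a thin vertical line from min to max in each pixel column, plus a line through the averages, keeps the extremes visible while the average line stays smooth.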

Graphs of network traffic often use a weighted average. You can sample once per second into a circular list of length 10 and for the graph, at each sample, graph the average of the samples.
If 10 isn't enough you can store many more. You don't need to recalculate the average from scratch, either:
new_average = (old_average*10 - replaced_sample + new_sample)/10
If you don't want to store all 10, however, you can approximate with this:
new_average = old_average*9/10 + new_sample/10
Lots of routers use this to save on storage. This ramps toward the current traffic rate exponentially.
If you do implement this, do something like this:
n = min(9, number_of_samples)
new_average = (old_average*n + new_sample)/(n + 1)
number_of_samples++
to avoid the initial ramp-up (the first samples are then averaged normally until 9 have been seen). You should also adjust the 9/10, 1/10 ratio to actually reflect the time period of each sample because your timer won't fire exactly once per second.
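A minimal sketch of the storage-free running average with the warm-up handled, assuming equally spaced samples. The class name RunningAverage is hypothetical, and the warm-up here divides by the number of samples seen so far (capped at the window), which is one reasonable way to avoid the ramp-up.

```csharp
public sealed class RunningAverage
{
    private readonly int window; // e.g. 10 for the 9/10 + 1/10 weighting
    private int count;           // samples seen so far, capped at window

    public double Value { get; private set; }

    public RunningAverage(int window)
    {
        this.window = window;
    }

    // Feeds one sample and returns the updated average.
    public double Add(double sample)
    {
        // During warm-up this is a plain mean; afterwards it becomes
        // old * (window - 1) / window + sample / window.
        int n = System.Math.Min(window - 1, count);
        Value = (Value * n + sample) / (n + 1);
        if (count < window) count++;
        return Value;
    }
}
```

Create one instance per graphed series, call Add once per timer tick, and plot the returned value.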

Related

Find Offset in Grid

I'm sorry about the vague title, but I'm not really sure how to ask this without being very specific. If you suggest a title which is more clear, I'll change it as soon as I can.
Anyway, I don't think I can ask my question very succinctly without first providing a little background information. In a 2D space, I am creating "acres", which contain "tiles".
[One Acre with 64 Tiles]
For the sake of clarity, we'll assume that in this specific instance, there are 12 acres, four in the first row, four in the second, and four in the third. Each acre has 64 tiles in it, in an eight by eight grid.
[Twelve Acres, each with 64 Tiles]
I am generating a texture the width and height of the desired number of acres, multiplied by the number of tiles in each acre (in our example, the texture would be 32 pixels wide [the number of acres in a horizontal row {4} multiplied by the number of tiles in an acre {8}], and 24 pixels tall [the number of acres in a vertical column {3} multiplied by the number of tiles in an acre {8}]). The texture is then filled with perlin noise, which I would like to use to colour each tile.
[Single Acre, with 64 Tiles, next to the Perlin image generated for it (scaled up). This has a slight random colour variation applied to each tile.]
I would like to generate one image for all of the acres, and read from it each time a new acre is created, but therein lies the problem, and the subject of my question. How do I get the offset, so that each adjacent acre continues the pattern?
[What I want (to get this, I just created a single larger tile)]
The method I'm currently using doesn't seem to work, however, and ends up creating something like the following.
Strange Result http://2catstudios.github.io/images/StackOverflow/150113_Grid_Offset/Perlin_Twelve_Acres_NoSpace.png
[Strange Result]
Following is the code which I'm currently using to find the (incorrect, I assume) offset. The link directs to a Gist, where the perlin generation function, and acre/tile generation functions are pasted.
int xOffset = ( parentAcreXIndex * desiredWidth );
int yOffset = ( parentAcreYIndex * desiredHeight );
new Color ( 0.000f, 0.502f + ( parentWorld.worldPerlin.GetPixel ( xOffset + ( desiredWidth - tileXIndex ), yOffset + ( desiredHeight - tileYIndex )).grayscale * 0.3f ), 0.000f, 1 );
Full class (Links to GitHub's Gist), the above line is at 100
I don't really know what else to say; my mind is a bit "foggy" from trying to figure this out, so please forgive me if I've left something important out. Do let me know, and I'll update my post with the required information.
Also, I'm sorry about this question, it must be pretty hard to understand. I'm going to read over this a few times, after I publish it, to see if I can improve the wording.
Thank you for your time!
Michael
Edit
Thank you for taking a look at this! It turns out the problem was that the plane I was using for visualization was actually upside down. I'll make sure to check simple things like that in the future, sorry for the confusion! I have left the question up, because I was given enough points here to post images, and when I tried to delete it, the points were revoked. When I earn more points, I will come back to delete this. Thanks!
You seem to be asking this:
If I have a grid of pixels, which are grouped into 'acres' (8x8), how do I map from a given pixel (as the row and col within an acre) to the overall pixel?
Acres begin at every 8th pixel, so for a given acre (acreX, acreY)
acreOriginInTextureX = acreX * 8;
acreOriginInTextureY = acreY * 8;
So a given tile (tileX, tileY) within an acre will be:
tilePosInTextureX = acreOriginInTextureX + tileX
= acreX * 8 + tileX
tilePosInTextureY = acreY * 8 + tileY
Really this is:
tilePosInTextureX = acreX * tilesPerAcreX + tileX
... and same for Y.
NB: I'm assuming zero indexing everywhere. If not, you'll need to subtract 1 from acreX, acreY, tileX, tileY, but not tilesPerAcreX or Y.
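The mapping above fits in a one-line helper. This is only a sketch with zero-based indices; AcreGrid and TilePosInTexture are illustrative names, and the value 8 is taken from the question's eight-by-eight acres.

```csharp
public static class AcreGrid
{
    public const int TilesPerAcre = 8; // tiles along one side of an acre

    // Maps a zero-based acre index and a zero-based tile index within that
    // acre to a pixel coordinate in the shared texture (same formula for X and Y).
    public static int TilePosInTexture(int acreIndex, int tileIndex)
    {
        return acreIndex * TilesPerAcre + tileIndex;
    }
}
```

For the 4 x 3 acre example, the last tile of the last column is TilePosInTexture(3, 7), which is 31, the last column of the 32-pixel-wide texture.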

Making C# mandelbrot drawing more efficient

First of all, I am aware that this question really sounds as if I didn't search, but I did, a lot.
I wrote a small Mandelbrot drawing code for C#, it's basically a windows form with a PictureBox on which I draw the Mandelbrot set.
My problem is, is that it's pretty slow. Without a deep zoom it does a pretty good job and moving around and zooming is pretty smooth, takes less than a second per drawing, but once I start to zoom in a little and get to places which require more calculations it becomes really slow.
On other Mandelbrot applications my computer does really fine on places which work much slower in my application, so I'm guessing there is much I can do to improve the speed.
I did the following things to optimize it:
Instead of using the SetPixel GetPixel methods on the bitmap object, I used LockBits method to write directly to memory which made things a lot faster.
Instead of using complex number objects (with classes I made myself, not the built-in ones), I emulated complex numbers using 2 variables, re and im. Doing this allowed me to cut down on multiplications because squaring the real part and the imaginary part is something that is done a few time during the calculation, so I just save the square in a variable and reuse the result without the need to recalculate it.
I use 4 threads to draw the Mandelbrot, each thread does a different quarter of the image and they all work simultaneously. As I understood, that means my CPU will use 4 of its cores to draw the image.
I use the Escape Time Algorithm, which as I understood is the fastest?
Here is how I move between the pixels and calculate; it's commented, so I hope it's understandable:
//Pixel by pixel loop:
for (int r = rRes; r < wTo; r++)
{
    for (int i = iRes; i < hTo; i++)
    {
        //These calculations are to determine what complex number corresponds to the (r,i) pixel.
        double re = (r - (w/2))*step + zeroX ;
        double im = (i - (h/2))*step - zeroY;
        //Create the Z complex number
        double zRe = 0;
        double zIm = 0;
        //Variables to store the squares of the real and imaginary part.
        double multZre = 0;
        double multZim = 0;
        //Start iterating with the complex number to determine its escape time (mandelValue)
        int mandelValue = 0;
        while (multZre + multZim < 4 && mandelValue < iters)
        {
            /*The new real part equals re(z)^2 - im(z)^2 + re(c), we store it in a temp variable
            tempRe because we still need re(z) in the next calculation
            */
            double tempRe = multZre - multZim + re;
            /*The new imaginary part is equal to 2*re(z)*im(z) + im(c)
            * Instead of multiplying these by 2 I add re(z) to itself and then multiply by im(z), which
            * means I just do 1 multiplication instead of 2.
            */
            zRe += zRe;
            zIm = zRe * zIm + im;
            zRe = tempRe; // We can now put the temp value in its place.
            // Do the squaring now, they will be used in the next calculation.
            multZre = zRe * zRe;
            multZim = zIm * zIm;
            //Increase the mandelValue by one, because the iteration is now finished.
            mandelValue += 1;
        }
        //After the mandelValue is found, this colors its pixel accordingly (unsafe code, accesses memory directly):
        //(Unimportant for my question, I doubt the problem is with this because my code becomes really slow
        // as the number of ITERATIONS grow, this only executes more as the number of pixels grow).
        Byte* pos = px + (i * str) + (pixelSize * r);
        byte col = (byte)((1 - ((double)mandelValue / iters)) * 255);
        pos[0] = col;
        pos[1] = col;
        pos[2] = col;
    }
}
What can I do to improve this? Do you find any obvious optimization problems in my code?
Right now there are 2 ways I know I can improve it:
I need to use a different type for numbers, double is limited with accuracy and I'm sure there are better non-built-in alternative types which are faster (they multiply and add faster) and have more accuracy, I just need someone to point me where I need to look and tell me if it's true.
I can move processing to the GPU. I have no idea how to do this (OpenGL maybe? DirectX? is it even that simple or will I need to learn a lot of stuff?). If someone can send me links to proper tutorials on this subject or tell me in general about it that would be great.
Thanks a lot for reading that far and hope you can help me :)
If you decide to move the processing to the GPU, you can choose from a number of options. Since you are using C#, XNA will allow you to use HLSL. RB Whitaker has the easiest XNA tutorials if you choose this option. Another option is OpenCL. OpenTK comes with a demo program of a Julia set fractal. This would be very simple to modify to display the Mandelbrot set. See here
Just remember to find the GLSL shader that goes with the source code.
About the GPU, examples are no help for me because I have absolutely
no idea about this topic, how does it even work and what kind of
calculations the GPU can do (or how is it even accessed?)
Different GPU software works differently however ...
Typically a programmer will write a program for the GPU in a shader language such as HLSL, GLSL or OpenCL. The program written in C# will load the shader code and compile it, and then use functions in an API to send a job to the GPU and get the result back afterwards.
Take a look at FX Composer or RenderMonkey if you want some practice with shaders without having to worry about APIs.
If you are using HLSL, the rendering pipeline looks like this.
The vertex shader is responsible for taking points in 3D space and calculating their position in your 2D viewing field. (Not a big concern for you since you are working in 2D)
The pixel shader is responsible for applying shader effects to the pixels after the vertex shader is done.
OpenCL is a different story; it's geared towards general purpose GPU computing (i.e. not just graphics). It's more powerful and can be used for GPUs, DSPs, and building supercomputers.
WRT coding for the GPU, you can look at Cudafy.Net (it does OpenCL too, which is not tied to NVidia) to start getting an understanding of what's going on and perhaps even do everything you need there. I've quickly found it - and my graphics card - unsuitable for my needs, but for the Mandelbrot at the stage you're at, it should be fine.
In brief: You code for the GPU with a flavour of C (Cuda C or OpenCL normally) then push the "kernel" (your compiled C method) to the GPU followed by any source data, and then invoke that "kernel", often with parameters to say what data to use - or perhaps a few parameters to tell it where to place the results in its memory.
When I've been doing fractal rendering myself, I've avoided drawing to a bitmap for the reasons already outlined and deferred the render phase. Besides that, I tend to write massively multithreaded code which is really bad for trying to access a bitmap. Instead, I write to a common store - most recently I've used a MemoryMappedFile (a builtin .Net class) since that gives me pretty decent random access speed and a huge addressable area. I also tend to write my results to a queue and have another thread deal with committing the data to storage; the compute times of each Mandelbrot pixel will be "ragged" - that is to say that they will not always take the same length of time. As a result, your pixel commit could be the bottleneck for very low iteration counts. Farming it out to another thread means your compute threads are never waiting for storage to complete.
I'm currently playing with the Buddhabrot visualisation of the Mandelbrot set, looking at using a GPU to scale out the rendering (since it's taking a very long time with the CPU) and having a huge result-set. I was thinking of targetting an 8 gigapixel image, but I've come to the realisation that I need to diverge from the constraints of pixels, and possibly away from floating point arithmetic due to precision issues. I'm also going to have to buy some new hardware so I can interact with the GPU differently - different compute jobs will finish at different times (as per my iteration count comment earlier) so I can't just fire batches of threads and wait for them all to complete without potentially wasting a lot of time waiting for one particularly high iteration count out of the whole batch.
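Another point that I hardly ever see being made about the Mandelbrot Set is that it is symmetrical about the real axis, so a pixel at (re, im) has the same escape time as the one at (re, -im). Whenever your view straddles the real axis, you might be doing up to twice as much calculating as you need to.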
For moving the processing to the GPU, you have lots of excellent examples here:
https://www.shadertoy.com/results?query=mandelbrot
Note that you need a WebGL-capable browser to view that link. Works best in Chrome.
I'm no expert on fractals but you seem to have come far already with the optimizations. Going beyond that may make the code much harder to read and maintain, so you should ask yourself whether it is worth it.
One technique I've often observed in other fractal programs is this: While zooming, calculate the fractal at a lower resolution and stretch it to full size during render. Then render at full resolution as soon as zooming stops.
Another suggestion is that when you use multiple threads you should take care that each thread doesn't read/write memory of other threads, because this will cause cache collisions and hurt performance. One good algorithm could be to split the work up in scanlines (instead of four quarters like you do now). Create a number of threads, then as long as there are lines left to process, assign a scanline to a thread that is available. Let each thread write the pixel data to a local piece of memory and copy this back to the main bitmap after each line (to avoid cache collisions).
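A sketch of the scanline idea, assuming .NET 4 or later so that Parallel.For from System.Threading.Tasks can hand rows to worker threads as they become free. MandelbrotRenderer, Render and EscapeTime are illustrative names; the pixel-to-complex mapping follows the question's code but starts at row and column 0, and copying the rows into the locked bitmap is left out.

```csharp
using System;
using System.Threading.Tasks;

public static class MandelbrotRenderer
{
    // Computes escape-time values for a w x h image, one scanline per work item.
    // Each row is written into its own local buffer before being stored.
    public static int[][] Render(int w, int h, double step,
                                 double zeroX, double zeroY, int iters)
    {
        var rows = new int[h][];
        Parallel.For(0, h, i =>
        {
            var line = new int[w];
            double im = (i - h / 2) * step - zeroY;
            for (int r = 0; r < w; r++)
            {
                double re = (r - w / 2) * step + zeroX;
                line[r] = EscapeTime(re, im, iters);
            }
            rows[i] = line;
        });
        return rows;
    }

    // Same escape-time iteration as in the question, reusing the squares.
    public static int EscapeTime(double re, double im, int iters)
    {
        double zRe = 0, zIm = 0, zRe2 = 0, zIm2 = 0;
        int n = 0;
        while (zRe2 + zIm2 < 4 && n < iters)
        {
            zIm = (zRe + zRe) * zIm + im; // 2*re(z)*im(z) + im(c)
            zRe = zRe2 - zIm2 + re;       // re(z)^2 - im(z)^2 + re(c)
            zRe2 = zRe * zRe;
            zIm2 = zIm * zIm;
            n++;
        }
        return n;
    }
}
```

Because rows near the set take far longer than rows far from it, handing out one row at a time balances the load better than four fixed quarters.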

How can you stitch multiple heightmaps together to remove seams?

I am trying to write an algorithm (in c#) that will stitch two or more unrelated heightmaps together so there is no visible seam between the maps. Basically I want to mimic the functionality found on this page :
http://www.bundysoft.com/wiki/doku.php?id=tutorials:l3dt:stitching_heightmaps
(You can just look at the pictures to get the gist of what I'm talking about)
I also want to be able to take a single heightmap and alter it so it can be tiled, in order to create an endless world (All of this is for use in Unity3d). However, if I can stitch multiple heightmaps together, I should be able to easily modify the algorithm to act on a single heightmap, so I am not worried about this part.
Any kind of guidance would be appreciated, as I have searched and searched for a solution without success. Just a simple nudge in the right direction would be greatly appreciated! I understand that many image manipulation techniques can be applied to heightmaps, but have been unable to find a image processing algorithm that produces the results I'm looking for. For instance, image stitching appears to only work for images that have overlapping fields of view, which is not the case with unrelated heightmaps.
Would utilizing a FFT low pass filter in some way work, or would that only be useful in generating a single tileable heightmap?
Because the algorithm is to be used in Unity3d, any C# code will have to be confined to .NET 3.5, as I believe that's the latest version Unity uses.
Thanks for any help!
Okay, seems I was on the right track with my previous attempts at solving this problem. My initial attempt at stitching the heightmaps together involved the following steps for each point on the heightmap:
1) Find the average between a point on the heightmap and its opposite point. The opposite point is simply the first point reflected across either the x axis (if stitching horizontal edges) or the z axis (for the vertical edges).
2) Find the new height for the point using the following formula:
newHeight = oldHeight + (average - oldHeight)*((maxDistance-distance)/maxDistance);
Where distance is the distance from the point on the heightmap to the nearest horizontal or vertical edge (depending on which edge you want to stitch). Any point with a distance less than maxDistance (which is an adjustable value that affects how much of the terrain is altered) is adjusted based on this formula.
That was the old formula, and while it produced really nice results for most of the terrain, it was creating noticeable lines in the areas between the region of altered heightmap points and the region of unaltered heightmap points. I realized almost immediately that this was occurring because the slope of the altered regions was too steep in comparison to the unaltered regions, thus creating a noticeable contrast between the two. Unfortunately, I went about solving this issue the wrong way, looking for solutions on how to blur or smooth the contrasting regions together to remove the line.
After very little success with smoothing techniques, I decided to try and reduce the slope of the altered region, in the hope that it would better blend with the slope of the unaltered region. I am happy to report that this has improved my stitching algorithm greatly, removing 99% of the lines reported above.
The main culprit from the old formula was this part:
(maxDistance-distance)/maxDistance
which was producing a value between 0 and 1 linearly based on the distance of the point to the nearest edge. As the distance between the heightmap points and the edge increased, the heightmap points would utilize less and less of the average (as defined above), and shift more and more towards their original values. This linear interpolation was the cause of the too-steep slope, but luckily I found a built-in method in the Mathf class of Unity's API that allows for smooth (I believe cubic) interpolation. This is the SmoothStep method.
Using this method (I believe a similar method can be found in the XNA framework found here), the change in how much of the average is used in determining a heightmap value becomes very severe in middle distances, but that severity lessens smoothly the closer the distance gets to maxDistance, creating a less severe slope that better blends with the slope of the unaltered region. The new formula looks something like this:
//Using Mathf - Unity only?
float weight = Mathf.SmoothStep(1f, 0f, distance/maxDistance);
//Using XNA
float weight = MathHelper.SmoothStep(1f, 0f, distance/maxDistance);
//If you can't use either of the two methods above
float input = distance/maxDistance;
float weight = 1f + (-1f)*(3f*(float)Math.Pow(input, 2f) - 2f*(float)Math.Pow(input, 3f));
//Then calculate the new height using this weight
newHeight = oldHeight + (average - oldHeight)*weight;
There may be even better interpolation methods that produce better stitching. I will certainly update this question if I find such a method, so anyone else looking to do heightmap stitching can find the information they need. Kudos to rincewound for being on the right track with linear interpolation!
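For anyone without Unity or XNA at hand, the weight can be wrapped in a small framework-independent helper. This is only a sketch: StitchBlend, Weight and NewHeight are made-up names, and clamping distance/maxDistance to the range 0..1 is an added assumption so that points beyond maxDistance are left untouched.

```csharp
using System;

public static class StitchBlend
{
    // Weight of the edge average: 1 at the edge (distance 0),
    // falling to 0 at maxDistance and beyond, using 3t^2 - 2t^3.
    public static float Weight(float distance, float maxDistance)
    {
        float t = Math.Max(0f, Math.Min(1f, distance / maxDistance));
        return 1f - (3f * t * t - 2f * t * t * t);
    }

    // newHeight = oldHeight + (average - oldHeight) * weight
    public static float NewHeight(float oldHeight, float average,
                                  float distance, float maxDistance)
    {
        return oldHeight + (average - oldHeight) * Weight(distance, maxDistance);
    }
}
```

The weight curve has zero slope at both ends, which is why the altered region meets both the seam and the unaltered terrain without a visible crease.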
What is done in the images you posted looks a lot like simple linear interpolation to me.
So basically: You take two images (Left, Right) and define a stitching region. For linear interpolation you could take the leftmost pixel of the left image (in the stitching region) and the rightmost pixel of the right image (also in the stitching region). Then you fill the space in between with interpolated values.
Take this example - I'm using a single line here to show the idea:
Left = [11,11,11,10,10,10,10]
Right= [01,01,01,01,02,02,02]
Let's say our overlap is 4 pixels wide:
Left = [11,11,11,10,10,10,10]
Right=          [01,01,01,01,02,02,02]
                 ^  ^  ^  ^  overlap/stitching region.
The leftmost value of the left image would be 10
The rightmost value of the right image would be 1.
Now we interpolate linearly between 10 and 1 across the 4 overlap pixels (2 intermediate steps); our new stitching region looks as follows
stitch = [10, 07, 04, 01]
We end up with the following stitched line:
line = [11,11,11,10,07,04,01,02,02,02]
If you apply this to two complete images you should get a result similar to what you posted before.
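The single-line example can be reproduced with a short routine. It is a sketch for one row only: RowStitcher and Stitch are hypothetical names, heights are integers as in the example, and interpolated values are rounded to the nearest integer.

```csharp
using System;

public static class RowStitcher
{
    // Joins two rows that share `overlap` pixels and fills the shared region
    // by interpolating linearly from Left's value to Right's value.
    public static int[] Stitch(int[] left, int[] right, int overlap)
    {
        int start = left.Length - overlap; // first index of the stitching region
        var line = new int[left.Length + right.Length - overlap];
        Array.Copy(left, line, start);
        Array.Copy(right, overlap, line, left.Length, right.Length - overlap);

        double a = left[start];        // leftmost value of the region (from Left)
        double b = right[overlap - 1]; // rightmost value of the region (from Right)
        for (int k = 0; k < overlap; k++)
        {
            double t = overlap > 1 ? (double)k / (overlap - 1) : 0.0;
            line[start + k] = (int)Math.Round(a + (b - a) * t);
        }
        return line;
    }
}
```

With the Left and Right rows above and an overlap of 4 this yields 11, 11, 11, 10, 7, 4, 1, 2, 2, 2, matching the stitched line in the example.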

Alternatives to the Dynamic Time Warping (DTW) method

I am doing some research into methods of comparing time series data. One of the algorithms that I have found being used for matching this type of data is the DTW (Dynamic Time Warping) algorithm.
The data I have, resemble the following structure (this can be one path):
Path Event Time Location (x,y)
1 1 2:30:02 1,5
1 2 2:30:04 2,7
1 3 2:30:06 4,4
...
...
Now, I was wondering whether there are other algorithms that would be suitable to find the closest match for the given path.
The keyword you are looking for is "(dis-)similarity measures".
Euclidean Distance (ED), as referred to by Adam Mihalcin (first answer), is easily computable and somewhat reflects the natural understanding of the word distance in natural language. Yet when comparing two time series, DTW is to be preferred, especially when applied to real world data.
1) ED can only be applied to series of equal length. Therefore when points are missing, ED simply is not computable (unless you also cut the other sequence, thus losing more information).
2) ED does not allow time-shifting or time-warping, as opposed to all algorithms which are based on DTW.
Thus ED is not a real alternative to DTW, because the requirements and restrictions are much higher. But to answer your question, I'd recommend reading this paper:
Time-series clustering – A decade review
Saeed Aghabozorgi, Ali Seyed Shirkhorshidi, Teh Ying Wah
http://www.sciencedirect.com/science/article/pii/S0306437915000733
This paper gives an overview of (dis-)similarity measures used in time series clustering. Here is a short excerpt to motivate actually reading the paper:
If two paths are the same length, say n, then they are really points in a 2n-dimensional space. The first location determines the first two dimensions, the second location determines the next two dimensions, and so on. For example, if we just take the three points in your example, the path can be represented as the single 6-dimensional point (1, 5, 2, 7, 4, 4). If we want to compare this to another three-point path, we can compute either the Euclidean distance (square root of the sum of squares of per-dimension distances between the two points) or the Manhattan distance (sum of the absolute per-dimension differences).
For example, the boring path that stays at (0, 0) for all three times becomes the 6-dimensional point (0, 0, 0, 0, 0, 0). Then the Euclidean distance between this point and your example path is sqrt((1-0)^2 + (5-0)^2 + (2-0)^2 + (7-0)^2 + (4-0)^2 + (4-0)^2) = sqrt(111) = 10.54. The Manhattan distance is abs(1-0) + abs(5-0) + abs(2-0) + abs(7-0) + abs(4-0) + abs(4-0) = 23. This kind of a difference between the metrics is not unusual, since the Manhattan distance is provably at least as great as the Euclidean distance.
Of course one problem with this approach is that not all paths will be of the same length. However, you can easily cut off the longer path to the same length as the shorter path, or consider the shorter of the two paths to stay at the same location or moving in the same direction after measurements end, until both paths are the same length. Either approach will introduce some inaccuracies, but no matter what you do you have to deal with the fact that you are missing data on the short path and have to make up for it somehow.
EDIT:
Assuming that path1 and path2 are both List<Tuple<int, int>> objects containing the points, we can cut off the longer list to match the shorter list as:
// Enumerable.Zip stops when it finishes one of the sequences
List<Tuple<int, int, int, int>> matchingPoints = Enumerable.Zip(path1, path2,
    (tupl1, tupl2) =>
        Tuple.Create(tupl1.Item1, tupl1.Item2, tupl2.Item1, tupl2.Item2))
    .ToList(); // Zip returns an IEnumerable, so materialise it as a list
Then, you can use the following code to find the Manhattan distance:
int manhattanDistance = matchingPoints
    .Sum(tupl => Math.Abs(tupl.Item1 - tupl.Item3)
        + Math.Abs(tupl.Item2 - tupl.Item4));
With the same assumptions as for the Manhattan distance, we can generate the Euclidean distance as:
// Math.Pow returns double, so the sum of squares is a double as well
double euclideanDistanceSquared = matchingPoints
    .Sum(tupl => Math.Pow(tupl.Item1 - tupl.Item3, 2)
        + Math.Pow(tupl.Item2 - tupl.Item4, 2));
double euclideanDistance = Math.Sqrt(euclideanDistanceSquared);
There's another question here that might be of some help. If you already have a given path, you can find the closest match by using the cross-track distance algorithm; on the other hand, if you actually want to solve the pattern-recognition problem, you might want to find out more about Levenshtein distance and Elastic Matching (from Wikipedia: "Elastic matching can be defined as an optimization problem of two-dimensional warping specifying corresponding pixels between subjected images").

Fast sub-pixel laser dot detection

I am using XNA to build a project where I can draw "graffiti" on my wall using an LCD projector and a monochrome camera that is filtered to see only hand held laser dot pointers. I want to use any number of laser pointers -- don't really care about differentiating them at this point.
The wall is 10' x 10', and the camera is only 640x480 so I'm attempting to use sub-pixel measurement using a spline curve as outlined here: tpub.com
The camera runs at 120fps (8-bit), so my question to you all is: what is the fastest way to find that subpixel laser dot center? Currently I'm using a brute force 2D search to find the brightest pixel on the image (0 - 254) before doing the spline interpolation. That method is not very fast, and each frame takes longer to compute than the frames are coming in.
Edit: To clarify, in the end my camera data is represented by a 2D array of bytes indicating pixel brightness.
What I'd like to do is use an XNA shader to crunch the image for me. Is that practical? From what I understand, there really isn't a way to keep persistent variables in a Pixel Shader such as running totals, averages, etc.
But for argument's sake, let's say I found the brightest pixels using brute force, then stored them and their neighboring pixels for the spline curve into X number of vertices using texcoords. Is it practical then to use HLSL to compute a spline curve using texcoords?
I am also open to suggestions outside of my XNA box, be it DX10/DX11, maybe some sort of FPGA, etc. I just don't really have much experience with ways of crunching data in this way. I figure if they can do something like this on a Wii-Mote using 2 AA batteries then I'm probably going about this the wrong way.
Any ideas?
If by brute-forcing you mean looking at every pixel independently, it is basically the only way of doing it. You will have to scan through all of the image's pixels, no matter what you want to do with the image. Although you might not need to find the brightest pixels, you can filter the image by color (e.g. if you're using a red laser). This is easily done using an HSV color coded image. If you are looking for some faster algorithms, try OpenCV. It's been optimized again and again for image processing, and you can use it in C# via a wrapper:
http://www.codeproject.com/KB/cs/Intel_OpenCV.aspx
OpenCV can also help you easily find the point centers and track each point.
Is there a reason you are using a 120fps camera? You know the human eye can only see about 30fps, right? I'm guessing it's to follow very fast laser movements... You might want to consider bringing it down, because real-time processing of 120fps will be very hard to achieve.
Running through 640*480 bytes to find the highest byte should take about a millisecond, even on slow processors. No need to take the route of shaders.
I would advise optimizing your loop.
For instance, this is really slow (because it does a multiplication with every array lookup):
byte highest = 0;
int foundX = -1, foundY = -1;
for (int y = 0; y < 480; y++)
{
    for (int x = 0; x < 640; x++)
    {
        if (myBytes[x][y] > highest)
        {
            highest = myBytes[x][y];
            foundX = x;
            foundY = y;
        }
    }
}
This is much faster:
byte[] myBytes = new byte[640 * 480];
// fill it with your image
byte highest = 0;
int found = -1, foundX = -1, foundY = -1;
int len = 640 * 480;
for (int i = 0; i < len; i++)
{
    if (myBytes[i] > highest)
    {
        highest = myBytes[i];
        found = i;
    }
}
if (found != -1)
{
    // convert the flat index back to coordinates (640 pixels per row)
    foundX = found % 640;
    foundY = found / 640;
}
This is off the top of my head so sorry for errors ;^)
You're dealing with some pretty complex maths if you want sub-pixel accuracy. I think this paper is something to consider. Unfortunately, you'll have to pay to see it using that site. If you've got access to a suitable library, they may be able to get hold of it for you.
The link in the original post suggested doing 1000 spline calculations for each axis. It treated x and y independently, which is OK for circular images but is a bit off if the image is a skewed ellipse. You could use the following to get a reasonable estimate:
xc = sum (xn.f(xn)) / sum (f(xn))
where xc is the mean, xn is a point along the x-axis and f(xn) is the value at the point xn. So for this:
      *
    * *
    * *
    * *
    * *
    * *
    * * *
  * * * *
  * * * *
* * * * * *
-----------
2 3 4 5 6 7
gives:
sum (xn.f(xn)) = 2 * 1 + 3 * 3 + 4 * 9 + 5 * 10 + 6 * 4 + 7 * 1
sum (f(xn)) = 1 + 3 + 9 + 10 + 4 + 1
xc = 128 / 28 = 4.57
and repeat for the y-axis.
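The weighted mean above could be sketched in C# roughly like this. The class and method names are made up for illustration; the input is assumed to be one array of positions and one array of intensities of equal length (for example a column-sum profile of the region around the dot).

```csharp
using System;

static class Centroid
{
    // Intensity-weighted mean position: sum(x * f(x)) / sum(f(x)).
    // positions[i] is a coordinate, weights[i] the intensity found there.
    public static double WeightedMean(int[] positions, int[] weights)
    {
        if (positions.Length != weights.Length)
            throw new ArgumentException("Arrays must have the same length.");

        long weightedSum = 0, total = 0;
        for (int i = 0; i < positions.Length; i++)
        {
            weightedSum += (long)positions[i] * weights[i];
            total += weights[i];
        }

        if (total == 0)
            throw new InvalidOperationException("All weights are zero.");

        return (double)weightedSum / total;
    }
}
```

Calling Centroid.WeightedMean(new[] { 2, 3, 4, 5, 6, 7 }, new[] { 1, 3, 9, 10, 4, 1 }) reproduces the 128 / 28 figure from the worked example; running it a second time on the row profile gives the y coordinate.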
Brute force is the only real way; however, your idea of using a shader is good. You'd be offloading the brute-force check from the CPU, which can only look at a small number of pixels simultaneously (roughly 1 per core), to the GPU, which likely has 100+ dumb cores (pipelines) that can simultaneously compare pixels (your algorithm may need to be modified a bit to work well with the one-instruction, many-cores arrangement of a GPU).
The biggest issue I see is whether or not you can move that data to the GPU fast enough.
Another optimization to consider: if you're drawing, then the current location of the pointer is probably close to the last location of the pointer. Remember the last recorded position of the pointer between frames, and only scan a region close to that position... say a 1'x1' area. Only if the pointer isn't found in that area should you scan the whole surface.
Obviously, there will be a tradeoff between how quickly your program can scan, and how quickly you'll be able to move the pointer before the camera "loses" it and has to go to the slow, full-image scan. A little experimentation will probably reveal the optimum value.
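A rough sketch of that "look near the last position first" idea, assuming an 8-bit greyscale frame stored row by row in a flat byte array. The names PointerTracker, ScanRegion and Find, and the radius and threshold parameters, are hypothetical; the search window is expressed in pixels here, so the real-world size it covers depends on your camera setup.

```csharp
using System;

static class PointerTracker
{
    // Returns the flat index of the brightest pixel above `threshold` inside
    // the rectangle [x0, x1) x [y0, y1), or -1 if no pixel qualifies.
    static int ScanRegion(byte[] image, int width,
                          int x0, int y0, int x1, int y1, byte threshold)
    {
        int found = -1;
        byte highest = threshold;
        for (int y = y0; y < y1; y++)
        {
            int row = y * width;
            for (int x = x0; x < x1; x++)
            {
                if (image[row + x] > highest)
                {
                    highest = image[row + x];
                    found = row + x;
                }
            }
        }
        return found;
    }

    // Scans a small window around the last known position first and falls
    // back to the whole frame only when nothing bright enough is found there.
    // Pass lastX = lastY = -1 when there is no previous position.
    public static int Find(byte[] image, int width, int height,
                           int lastX, int lastY, int radius, byte threshold)
    {
        if (lastX >= 0 && lastY >= 0)
        {
            int x0 = Math.Max(0, lastX - radius);
            int y0 = Math.Max(0, lastY - radius);
            int x1 = Math.Min(width, lastX + radius + 1);
            int y1 = Math.Min(height, lastY + radius + 1);

            int near = ScanRegion(image, width, x0, y0, x1, y1, threshold);
            if (near != -1) return near;
        }
        return ScanRegion(image, width, 0, 0, width, height, threshold);
    }
}
```

Note that the window scan returns the brightest pixel near the old position even if something brighter exists elsewhere in the frame. That is the intended tradeoff, but it also means a stray reflection near the last position can hold the tracker.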
Cool project, by the way.
Put the camera slightly out of focus and bitblt against a neutral sample. You can then quickly scan rows for non-zero values. Also, if you are at 8 bits and pick up 4 bytes at a time, you can process the image faster. As others pointed out, you might reduce the frame rate. If you have less fidelity than the resulting image, there isn't much point in the high scan rate.
(The slight out of focus camera will help get just the brightest points and reduce false positives if you have a busy surface... of course assuming you are not shooting a smooth/flat surface)
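For the "pick up 4 bytes at a time" part, one possible sketch uses BitConverter.ToUInt32 to test four 8-bit pixels per comparison and then narrows down to the exact byte. RowScanner and FirstNonZero are illustrative names, and whether this actually beats a plain byte loop in managed code is something to measure rather than assume.

```csharp
using System;

static class RowScanner
{
    // Returns the index of the first non-zero byte in
    // [start, start + length), or -1 if the whole range is zero.
    public static int FirstNonZero(byte[] data, int start, int length)
    {
        int i = start;
        int end = start + length;

        // Skip four bytes per comparison while a full group of four
        // is still available and that group is entirely zero.
        while (i + 4 <= end && BitConverter.ToUInt32(data, i) == 0)
        {
            i += 4;
        }

        // Pin down the exact byte inside the group (this also covers the
        // last one to three bytes that do not fill a group of four).
        for (; i < end; i++)
        {
            if (data[i] != 0) return i;
        }
        return -1;
    }
}
```

Calling it once per row with start = y * width and length = width gives the first lit pixel in that row, which is enough to decide whether the row needs a closer look.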
Start with a black output buffer. Forget about subpixel for now. Every frame, every pixel, do this:
outbuff=max(outbuff,inbuff);
Do subpixel filtering to a third "clean" buffer when you're done with the image. Or do a chunk or a line of the screen at a time in real time. Advantage: real-time "rough" view of the drawing, cleaned up as you go.
When you convert from the rough output buffer to the "clean" third buffer, you can clear the rough to black. This lets you keep drawing forever without slowing down.
By drawing the "clean" over top the "rough," maybe in a slightly different color, you'll have the best of both worlds.
This is similar to what paint programs do--if you draw really fast, you see a rough version, then the paint program "cleans up" the image when it has time.
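The per-pixel max step could look something like the following for 8-bit greyscale frames held in byte arrays. StrokeBuffer and Accumulate are names invented for this sketch.

```csharp
using System;

static class StrokeBuffer
{
    // For every pixel: outBuff = max(outBuff, inBuff).
    // Anything the laser has ever lit up stays lit in the rough buffer.
    public static void Accumulate(byte[] outBuff, byte[] inBuff)
    {
        if (outBuff.Length != inBuff.Length)
            throw new ArgumentException("Buffers must have the same size.");

        for (int i = 0; i < outBuff.Length; i++)
        {
            if (inBuff[i] > outBuff[i])
                outBuff[i] = inBuff[i];
        }
    }
}
```

Call Accumulate once per camera frame. After the rough buffer has been converted into the clean one, Array.Clear(outBuff, 0, outBuff.Length) resets it to black for the next stretch of drawing.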
Some comments on the algorithm:
I've seen a lot of cheats in this arena. I've played Sonic on a Sega Genesis emulator that upsamples, and it has some pretty wild algorithms that work very well and are very fast.
You actually have some advantages you can gain because you might know the brightness and the radius of the dot.
You might just look at each pixel and its 8 neighbors and let those 9 pixels "vote" according to their brightness for where the subpixel lies.
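One way to read the 3x3 "vote" is as a brightness-weighted average of the nine pixel positions. Here is a small sketch under that assumption, for an 8-bit greyscale frame in a flat row-major array; SubPixel and Refine are hypothetical names.

```csharp
using System;

static class SubPixel
{
    // Refines the integer position (px, py) to a sub-pixel estimate by letting
    // the pixel and its (up to) 8 neighbours vote, weighted by brightness.
    public static void Refine(byte[] image, int width, int height,
                              int px, int py, out double sx, out double sy)
    {
        double sumW = 0, sumX = 0, sumY = 0;

        for (int dy = -1; dy <= 1; dy++)
        {
            for (int dx = -1; dx <= 1; dx++)
            {
                int x = px + dx, y = py + dy;
                if (x < 0 || y < 0 || x >= width || y >= height)
                    continue; // neighbours outside the frame do not vote

                double w = image[y * width + x];
                sumW += w;
                sumX += w * x;
                sumY += w * y;
            }
        }

        if (sumW == 0)
        {
            // Completely dark neighbourhood: keep the integer position.
            sx = px;
            sy = py;
            return;
        }

        sx = sumX / sumW;
        sy = sumY / sumW;
    }
}
```

With a centre pixel of brightness 200 at x = 1 and a right-hand neighbour of 100 at x = 2 (everything else black), the estimate lands at x = 400 / 300, about 1.33, a third of the way towards the brighter side.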
Other thoughts
Your hand is not that accurate when you control a laser pointer. Try getting all the dots every 10 frames or so, identifying which beams are which (based on previous motion, and accounting for new dots, turned-off lasers, and dots that have entered or left the visual field), then just drawing a high resolution curve. Don't worry about sub pixel in the input--just draw the curve into the high res output.
Use a Catmull-Rom spline, which goes through all control points.
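For reference, a minimal sketch of the uniform Catmull-Rom formula for one coordinate. The class and method names are illustrative. The curve segment runs from p1 to p2, with p0 and p3 only shaping the tangents, so you call it once for x and once for y.

```csharp
using System;

static class CatmullRom
{
    // Uniform Catmull-Rom interpolation between p1 (t = 0) and p2 (t = 1).
    // p0 and p3 are the neighbouring control points on either side.
    public static double Interpolate(double p0, double p1, double p2, double p3, double t)
    {
        double t2 = t * t;
        double t3 = t2 * t;
        return 0.5 * (2 * p1
                      + (-p0 + p2) * t
                      + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t2
                      + (-p0 + 3 * p1 - 3 * p2 + p3) * t3);
    }
}
```

To draw through a list of dots, slide a window of four consecutive points along the list and step t from 0 to 1 for each window. The first and last points can be duplicated so that the end segments also have four control points.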
