Optimize Chart with 10000+ points - c#

I have a chart that can contain a lot of points (10000+).
When I scale the chart to fit all points on screen, it takes some time to draw them.
Can you advise me on some optimization, so that not all points have to be drawn?

I'm not an expert with the listed technologies, but I would solve this by 'bucketing' your data points.
Your X axis is time, so determine the resolution for the current chart size. I.e., if you are seeing the entire chart, you might only need one data point per day. If you are zoomed in a long way, you might want a point per hour.
Now that you have determined the resolution, go through your data and find all the points that fall between two resolution points, i.e., all data that is >= 20th April 2011 at 4pm and < 20th April 2011 at 5pm if you are on an hourly resolution.
The type of data you are using will determine whether you want to average all the data points you have collected, or find the median (or use some other method, such as a candlestick chart to show the max/min values). Either way, pick the most relevant method, repeat for all buckets, and render the result with your new data.
Hope that's what you meant.
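As a rough sketch of the bucketing idea above, assuming the data is a list of (time, value) pairs and that averaging is the right aggregate for it (the class and method names here are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Bucketing
{
    // Groups (time, value) samples into fixed-width time buckets and returns
    // one averaged point per non-empty bucket, stamped with the bucket start.
    public static List<(DateTime Time, double Value)> AverageByBucket(
        IEnumerable<(DateTime Time, double Value)> points, TimeSpan resolution)
    {
        if (resolution <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(resolution));

        return points
            .GroupBy(p => p.Time.Ticks / resolution.Ticks)   // bucket index
            .OrderBy(g => g.Key)
            .Select(g => (new DateTime(g.Key * resolution.Ticks),
                          g.Average(p => p.Value)))
            .ToList();
    }
}
```

Calling `Bucketing.AverageByBucket(points, TimeSpan.FromHours(1))` would give the hourly resolution from the example; swapping `Average` for a median or a min/max pair is a matter of changing the `Select`.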

Seems like you should use some sort of level of detail (LoD) algorithm.
For example:
Always use at most a fixed number of points to represent all your actual points. By calculating local minima and maxima you can create a faithful representation of the point set at a given level of detail, depending on how far you are zoomed in.
Calculating these extrema can still prove to be slow, so you might need to cache them. You can calculate and cache this on the fly as new data arrives.
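One possible way to sketch the min/max idea, assuming the values are already ordered along the X axis and that `buckets` is roughly the pixel width of the chart (both the name `MinMaxDecimator` and that sizing rule are assumptions, and caching is left out):

```csharp
using System;
using System.Collections.Generic;

public static class MinMaxDecimator
{
    // Splits the samples into `buckets` contiguous chunks and keeps only the
    // minimum and maximum of each chunk, in their original order, so spikes
    // stay visible. Returns at most 2 * buckets values.
    public static List<double> Decimate(IReadOnlyList<double> values, int buckets)
    {
        if (buckets <= 0) throw new ArgumentOutOfRangeException(nameof(buckets));
        if (values.Count <= 2 * buckets) return new List<double>(values);

        var result = new List<double>(2 * buckets);
        for (int b = 0; b < buckets; b++)
        {
            // Integer arithmetic spreads the remainder evenly over the chunks.
            int start = (int)((long)b * values.Count / buckets);
            int end = (int)((long)(b + 1) * values.Count / buckets);
            int minIdx = start, maxIdx = start;
            for (int i = start + 1; i < end; i++)
            {
                if (values[i] < values[minIdx]) minIdx = i;
                if (values[i] > values[maxIdx]) maxIdx = i;
            }
            // Emit in index order to preserve the shape of the curve.
            result.Add(values[Math.Min(minIdx, maxIdx)]);
            if (minIdx != maxIdx) result.Add(values[Math.Max(minIdx, maxIdx)]);
        }
        return result;
    }
}
```

Unlike plain averaging, this keeps the extremes of each chunk, which is usually what you want to see when zoomed out.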

In addition to the other good suggestions, I would:
Do some random-pausing on it, to see if it's spending much time doing something else that could be avoided, such as maybe allocating new point structures all the time.
Rather than paint directly to the window, paint to a bitmap, and copy that to the window. It always looks faster, and sometimes it even is faster. (Be sure to stub out the method that clears the window background.)

I experienced a severe performance problem with thousands of Series added to the chart, rather than thousands of Points. The solution that worked for me was a flavor of the Flyweight pattern:
Instead of adding thousands of series, add just a single one.
At the end of each virtual series, i.e., when all points of that series have been added and it's time to move on to the next one, insert an empty point:
series.Points.Add(new DataPoint(0, 0) { IsEmpty = true });
Hope that helps somebody.

Related

Smoothing the results of a series server-side to return fewer data points (.NET)

I need to show a chart from data returned by an API.
This API could potentially return millions of results, but that would tax the server heavily.
Thus, I'm looking for a way to return fewer results and still show the trend in the chart. Basically, I'm looking to "smooth" the line of the graph by showing only relevant points.
Is there a .NET library that could help me implement this? Or perhaps a "smoothing" function that takes a limit on the number of points to return?

What would be your target number of results? One approach would be to just take a sampling of the points: for every 10 points you have, return 1, for instance. In that case, you could use LINQ to accomplish this: "Sampling a list with linq".
This doesn't address the "showing only relevant points" part of your question, though. That's a little harder to solve programmatically. What does "relevant" mean in your data? Exceeding a certain deviation?
So maybe an average over your data would work. Take 10 points at a time, average them, return 1 point. Like this example: "Smoothing data from a sensor".
With either of those approaches, you can trade off accuracy against 'smoothness' by varying the "10" in the above examples. The higher the number, the "smoother" your result.
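Both approaches can be sketched with LINQ in a few lines. This is only an illustration, assuming the series is a plain sequence of doubles; `Downsampling`, `EveryNth` and `BlockAverage` are made-up names, not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Downsampling
{
    // Keeps every n-th value (indices 0, n, 2n, ...).
    public static List<double> EveryNth(IEnumerable<double> values, int n)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));
        return values.Where((_, index) => index % n == 0).ToList();
    }

    // Averages consecutive blocks of n values; the last block may be shorter.
    public static List<double> BlockAverage(IEnumerable<double> values, int n)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));
        return values
            .Select((value, index) => (value, index))
            .GroupBy(x => x.index / n, x => x.value)
            .Select(g => g.Average())
            .ToList();
    }
}
```

To hit a target number of results, `n` could be chosen as the total count divided by the target, rounded up.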

Fourier Transform on varied time data

So... no clue where I should be asking this question but I'm hoping someone here can at least point me in the right direction. I have a time series that I would like to do spectral analysis on but I can't find any tools for doing FFT that accommodate a varied time difference between data points (they all assume dt is constant). Does anyone know of a tool that would work for this (I'm specifically looking for a periodogram or some other way to determine periodicity).
My only thought is to do linear interpolation between data points at a specific time interval to give the data a constant dt, but I'm worried that this will skew the spectral analysis.
Here is a small chunk of the data:
time data dt
39.630 49662.1 0.170
39.810 49582.5 0.180
40.150 49430.0 0.340
40.320 49413.8 0.170
40.490 49324.0 0.170
40.670 49092.5 0.180
40.830 49025.6 0.160
41.010 49101.5 0.180
any suggestions??

Most mathematical theory behind FFT requires a fixed sampling period.
I suggest the following:
Create a sequence of equally spaced time instants.
Calculate the value for each instant based on the neighboring points (linear or quadratic interpolation should do it). You can use as many points as you like in order to obtain the best approximation.
Depending on the degree of detail you want in your results, use a parametric or non-parametric method for estimating the spectrum: the Burg method can be useful, and an LPC/AR model can also be useful.
Check the links at Mathworks:
http://www.mathworks.com/help/signal/nonparametric-spectral-estimation.html
http://www.mathworks.com/help/signal/parametric-spectral-estimation.html
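The first two steps (equally spaced instants plus linear interpolation) could look roughly like this. It is a sketch only: `Resampler` is a made-up name, the times are assumed to be strictly increasing, and the choice of `dt` is up to you. Keep in mind that interpolation may attenuate the higher frequencies somewhat, so the worry in the question is not unfounded.

```csharp
using System;
using System.Collections.Generic;

public static class Resampler
{
    // Linearly interpolates irregular samples (times strictly increasing)
    // onto a uniform grid t0, t0 + dt, ... that stays within the data range.
    public static List<(double Time, double Value)> ToUniformGrid(
        IReadOnlyList<double> times, IReadOnlyList<double> values, double dt)
    {
        if (times.Count != values.Count || times.Count < 2)
            throw new ArgumentException("Need at least two matching samples.");
        if (dt <= 0) throw new ArgumentOutOfRangeException(nameof(dt));

        var result = new List<(double, double)>();
        int seg = 0;  // index of the segment [times[seg], times[seg + 1]]
        for (int k = 0; ; k++)
        {
            double t = times[0] + k * dt;
            if (t > times[times.Count - 1]) break;
            // Advance to the segment that contains t.
            while (seg < times.Count - 2 && times[seg + 1] < t) seg++;
            double w = (t - times[seg]) / (times[seg + 1] - times[seg]);
            result.Add((t, values[seg] + w * (values[seg + 1] - values[seg])));
        }
        return result;
    }
}
```

The resulting evenly spaced series can then be fed to any standard FFT or periodogram routine.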

Traveling salesman prob on 2d map with walls (obstacles) so pathfinding needed

I need to find the optimal path between a number of points on a 2d map.
The 2d map is of a building and will simply show where you cannot go (through walls) and all the points on the map. So it's not really a map, rather lines you cannot go through with points to pass through.
I have a number of points, say between 20 and 500.
I start with one that I select and then need the most optimal route calculated.
I would love hints on where to look for this travelling salesman problem with obstacles. Or even better, a ready-made library for doing it.
Bonuses
Things like doors can be weighted as they are less fun to pass through back and forth.
Possibility of prioritizing/Weighting the ability to end close to where you started.
Selecting areas as passable but annoying (weighting down)
.Net/C# code that I can use, I want to use this both on .NET MVC project and Xamarin mobile project so .net code would be great (if code exists)
Update example
In my example here we have an office. Now I have not thought every detail out so this is merely an example.
All the purple dots need to be checked
Yellow area could mean annoying to pass through but doable
Red could mean not active but can be passed if no other option exists.
Blue (walls) are impenetrable and can not be passed.
Green is doors, possibly weighted down as it's annoying to go through closed doors (usually this would probably make sense anyway, as the dots in a room would be easiest to check together).
The user would go to one dot, check it, then the software should tell him which one to do next until he is done.
Bonus could be given for ending close to start place. So for instance in this example, if the red area was normal and contained dots it would have been easy to make it a loop. (So the user comes back close to where he started)
Finally I suppose it would also be smart to differentiate outdoors areas as you would need to get dressed for outdoors, so you only want to go out once.
Also it could be smart to be able to prioritize ending on a point close to stairwell to next floor if they intend to check multiple floors at once.
Of course, real plans would be larger and more complex than this example.
Again, sorry for just brainstorming ideas, but I have never done this kind of work and am happy for any pointers :-)

Let N be the set of nodes to visit (purple points). For each i and j in N, let c(i,j) be the distance (or travel time) to get from i to j. These can be pre-computed based on actual distances plus walls, doors, other barriers, etc.
Now, you could then add a penalty to c(i,j) if the path from i to j goes through a door, "annoying" area, etc. But a more flexible way might be as follows:
Let k = 1,...,K be the various types of undesirable route attributes (doors, annoying areas, etc.). Let a_k(i,j) be the amount of each of these attributes on the path from i to j. (For example, suppose k=1 represents door, k=2 represents yellow areas, k=3 represents outside. Then from an i in the break area to j in the bathroom might have a_1(i,j) = 1, and from an i to a j both in the yellow areas would have a_2(i,j) = 0.5 or 2.0 or however annoying that area is, etc.)
Then, let p_k be a penalty for each unit of undesirable attribute k -- maybe p_1 = 0.1 if you don't mind going through doors too much but p_2 = 3.0 if you really don't like yellow areas.
Then, let c'(i,j) = c(i,j) + sum{k=1,...,K} p_k * a_k(i,j). In other words, replace the actual distance with the distance plus penalties for all the annoyances. The user can set the p_k values before the optimization in order to express his/her preferences among these. The final penalties p_k * a_k(i,j) should be commensurate with the distance units used for c(i,j), though -- you don't want distances of 100m but penalties of 1,000,000.
Now solve a TSP with distances given by c'(i,j).
The TSP requires you to start and end at the same node, so that preference is really a constraint. If you're going to solve for multiple floors simultaneously, then the stairway times would be in the c(i,j) so there's no need to explicitly encourage routes that end near a stairway -- the solution would tend to do that anyway since stairs are slow. If you're going to solve each floor independently, then just set the start node for each floor equal to the stairway.
I wouldn't do anything about the red (allowable but unused) areas -- that would already be baked into the c(i,j) calculations.
Hope this helps.
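To make the c'(i,j) formula concrete, here is a small sketch. It assumes c and each a_k are already available as matrices (computing them by pathfinding around the walls is a separate step), and it uses a greedy nearest-neighbour tour only as a stand-in for a real TSP solver; that heuristic is my substitution for illustration and will not generally give the optimal route. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class PenalizedTsp
{
    // c'(i,j) = c(i,j) + sum over k of p[k] * a[k][i,j]
    public static double[,] PenalizedCost(
        double[,] c, IReadOnlyList<double[,]> a, double[] p)
    {
        int n = c.GetLength(0);
        var cPrime = new double[n, n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
            {
                cPrime[i, j] = c[i, j];
                for (int k = 0; k < p.Length; k++)
                    cPrime[i, j] += p[k] * a[k][i, j];
            }
        return cPrime;
    }

    // Greedy nearest-neighbour tour from `start`: a heuristic, not an
    // optimal solver. The closing leg back to `start` is implied.
    public static List<int> NearestNeighbourTour(double[,] cost, int start)
    {
        int n = cost.GetLength(0);
        var visited = new bool[n];
        var tour = new List<int> { start };
        visited[start] = true;
        for (int step = 1; step < n; step++)
        {
            int current = tour[tour.Count - 1], best = -1;
            for (int j = 0; j < n; j++)
                if (!visited[j] && (best < 0 || cost[current, j] < cost[current, best]))
                    best = j;
            visited[best] = true;
            tour.Add(best);
        }
        return tour;
    }
}
```

Because the penalties p_k are plain inputs, the user can change them and re-run the solver without recomputing c or the a_k matrices.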

Generating isochrone maps from road networks

I wonder if there's any described algorithm that can convert isochrones into approximate area to show a range of some feature (in my problem this feature is a road network).
Example. I have something like on the image beneath:
It's a simple network (where I can arrive from the start point in X minutes or going Y kilometers). I have information of all the nodes and links. Now I need to create an isochrone map that show an approximate range where I can arrive.
Problems:
Convex hull - sucks because the approximation is too coarse,
I can create buffers on the roads - so I will get some polygon that shows the range, but I will also get holes wherever roads connect into loops.
What I need to obtain is something like this:
I've found some potentially useful information HERE, but there are only some ideas how it could be done. If anyone has any concept, please, help me to solve my problem.

Interesting problem. To get better answers, you might want to define exactly what this area that shows the range (isochrone map) will be used for. For example, is it illustrative? If you define what kind of approximation you want, it could help you solve the problem.
Now here are some ideas.
1) Find all the cycles in the graph (see link), then eliminate edges that are shared between two cycles. Finally, take the convex hull of the remaining cycles; this, together with all the roads (so that the outliers that do not form cycles are included), will give a good approximation of an isochrone map.
2) A simpler solution is to define a thickness around each point of every road. This thickness should be inversely proportional to how long it takes to arrive at that point from the starting point, i.e., the longer it takes to arrive at the point, the less thick. You can then scale the thickness of all points until all holes are filled, and then you will have an approximate isochrone map. One possible way of implementing this is to run an algorithm that takes all possible routes simultaneously from the starting point, branching off at every new intersection, while tracking how long it took to arrive at each point. During its execution, at every instant of time, all previously discovered routes should be thickened. At the end you can scale this thickness so as to fill all holes.
Hopefully this will be of some help. Good luck.

I have solved the problem (it's not so fast and robust, but has to be enough for now).
1. I generated my possible routes using the A* (A-Star) algorithm.
2. I used @Artur Gower's idea from point one to eliminate cycles and simplify my geometry.
Later I decided to generate 2 types of geometries (1st - like on the image, 2nd - simple buffers):
1st one:
3. Then I removed the rest of the unnecessary points using the Douglas-Peucker algorithm (very fast!).
4. In the end I used a Concave Hull algorithm (aka Alpha-Shapes or Non-Convex Hull).
2nd one:
3. Apply a buffer to the existing geometry and take the exterior ring (the JTS library made that really easy :)).
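For readers who have not met the Douglas-Peucker step mentioned above, a compact sketch follows. It is not the poster's code; it assumes a polyline of (X, Y) tuples and measures distance to the chord segment (a common variant of the usual perpendicular distance). The names are made up.

```csharp
using System;
using System.Collections.Generic;

public static class Simplifier
{
    // Douglas-Peucker: keep the endpoints, find the point farthest from the
    // chord between them, and recurse on both halves if it exceeds epsilon.
    public static List<(double X, double Y)> DouglasPeucker(
        IReadOnlyList<(double X, double Y)> pts, double epsilon)
    {
        if (pts.Count < 3) return new List<(double X, double Y)>(pts);
        var keep = new bool[pts.Count];
        keep[0] = keep[pts.Count - 1] = true;
        Mark(pts, 0, pts.Count - 1, epsilon, keep);

        var result = new List<(double X, double Y)>();
        for (int i = 0; i < pts.Count; i++)
            if (keep[i]) result.Add(pts[i]);
        return result;
    }

    private static void Mark(IReadOnlyList<(double X, double Y)> pts,
                             int first, int last, double epsilon, bool[] keep)
    {
        double maxDist = 0;
        int index = -1;
        for (int i = first + 1; i < last; i++)
        {
            double d = DistanceToSegment(pts[i], pts[first], pts[last]);
            if (d > maxDist) { maxDist = d; index = i; }
        }
        if (index < 0 || maxDist <= epsilon) return;  // everything in between is dropped
        keep[index] = true;
        Mark(pts, first, index, epsilon, keep);
        Mark(pts, index, last, epsilon, keep);
    }

    private static double DistanceToSegment(
        (double X, double Y) p, (double X, double Y) a, (double X, double Y) b)
    {
        double dx = b.X - a.X, dy = b.Y - a.Y;
        double len2 = dx * dx + dy * dy;
        // Clamp the projection so the closest point stays on the segment.
        double t = len2 == 0 ? 0
            : Math.Max(0, Math.Min(1, ((p.X - a.X) * dx + (p.Y - a.Y) * dy) / len2));
        double cx = a.X + t * dx, cy = a.Y + t * dy;
        return Math.Sqrt((p.X - cx) * (p.X - cx) + (p.Y - cy) * (p.Y - cy));
    }
}
```

A larger `epsilon` removes more points, so it is worth tuning against the scale of the road network before handing the result to the hull step.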

Dealing with imprecision in CAD drawing

I have a CAD application, that allows user to draw lines and polygons and all that.
One thorny problem I face is that user drawings can be highly imprecise. For example, a user might want to draw two rectangles that are connected to each other, so there should be one line shared by the two rectangles. However, instead of drawing one line, it's easy for the user to draw two lines that are so close to each other that on screen you would mistake them for the same line, except that they aren't when you zoom in a little.
My application requires the user to draw the lines properly (or my preprocessing must be able to auto-correct them), or else my internal algorithm (let's call it The Algorithm) will not be able to process the input correctly.
What is the best strategy to combat this kind of problem? I am thinking about rounding the point coordinates to a certain degree of precision. Although I can't pinpoint exactly what is wrong with this approach, I feel that it is not the correct way of doing things and that it will introduce a new set of problems.
Edit: For the sake of argument, snapping isn't an available option. For that matter, no sort of "input-side" guidance is available. The correction must be done via preprocessing in my code, when the drawing is finished, but just before I submit it to my algorithm.
Crazy restriction, you say. But users can construct their input either in my application, or in other CAD software and then submit it to my engine to do the calculation. I can't control how they input in other CAD software.
Edit 2: I can let the user specify the "cluster radius", but the important point is that I need to make sure my preprocessing algorithm is consistent and won't introduce a new set of problems.
Any idea?

One problem I see is that your clustering/snapping algorithm would have to decide on its own which point to move onto which other point.
During live input snapping is simple: the first point stays put, the second point is snapped onto the first. If in offline mode you get a bunch of points that you know should be snapped together, you have no idea where the resulting point should lie. Calculate the average, possibly resulting in a completely new point? Choose the most central point out of all the candidates? Pick one at random? Try to align your point with some other points on the x/y/z-axis?
If your program allows any user interaction at all, you could detect point clusters that might be candidates for merging, and present the user with different merge target points to choose from.
Otherwise, you could make this kind of behaviour configurable: take a merge radius ("if two or more points are within n units of one another...") and a merging algorithm ("... merge them into the most central of the points given") as parameters and read them from a config file.

Snapping points. The user should be able to snap to end points (and many more); then, when you detect a snap, just change the point the user clicked to the snap point. Check AutoCAD's line functions End, Middle and so on.
EDIT: If you want offline snapping, then you need to check whether pairs of points are near each other. Checking every pair naively is O(n^2), which can take a long time on large drawings (a spatial index such as a grid or k-d tree can cut this down). The kind of algorithm you need falls under "clustering".
EDIT2: I think you shouldn't assume that the input data is bad. But if you reallllllly want to do this, the simplest way is to take each point, check if there are other points within the user-defined radius, and if so, find the whole group that should merge into one point, compute the average of their coordinates, and move all of them to that point. But remember - most designers KNOW what snap points are for, and if they don't use them they have a valid reason for that.
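A minimal sketch of that offline merge, assuming 2D points as (X, Y) tuples and a user-supplied radius. It uses the naive all-pairs check and a small union-find to collect each group; `PointMerger` is a made-up name. Note that grouping is transitive here, so a long chain of closely spaced points collapses into one centroid, which may or may not be what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PointMerger
{
    // Groups points that are linked by a chain of neighbours no farther apart
    // than `radius` (naive O(n^2) pair check) and maps every input point to
    // the centroid of its group. Output order matches input order.
    public static List<(double X, double Y)> MergeClose(
        IReadOnlyList<(double X, double Y)> pts, double radius)
    {
        var parent = Enumerable.Range(0, pts.Count).ToArray();
        int Find(int i)
        {
            while (parent[i] != i)
            {
                parent[i] = parent[parent[i]];  // path halving
                i = parent[i];
            }
            return i;
        }

        for (int i = 0; i < pts.Count; i++)
            for (int j = i + 1; j < pts.Count; j++)
            {
                double dx = pts[i].X - pts[j].X, dy = pts[i].Y - pts[j].Y;
                if (dx * dx + dy * dy <= radius * radius)
                    parent[Find(i)] = Find(j);  // union the two groups
            }

        var centroids = Enumerable.Range(0, pts.Count)
            .GroupBy(i => Find(i))
            .ToDictionary(g => g.Key,
                          g => (g.Average(i => pts[i].X), g.Average(i => pts[i].Y)));
        return Enumerable.Range(0, pts.Count).Select(i => centroids[Find(i)]).ToList();
    }
}
```

Replacing the centroid with "the most central original point" would avoid creating coordinates that no user ever drew; which rule is better depends on the drawing.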

Your basic problem seems to me (I hope I understood correctly) to determine if two lines are the "same" line.
From my own experience, your feeling is correct: rounding the coordinates in the input might prove not to be a good idea.
Maybe you should leave the coordinates in the input as they are, but implement a function, let's name it IsSameLine, that you use in "The Algorithm" (which, among other things, determines whether two rectangles are connected, if I understood your description correctly).
IsSameLine could transform the endpoints of the input lines from source coordinates to screen coordinates, assuming a certain (possibly configurable) screen resolution, and check whether they are the same in screen coordinates.
I.e., let's say you have an input file with the following extent (lowerLeft), (upperRight): ((10,10), (24,53)). The question would be how far apart the points (11,15) and (11.1, 15.1) would be if drawn at "zoom to extents" level on a 1600x1200 pixel screen. So you can determine a transform from source coordinates to "screen coordinates". You then use this transformation in IsSameLine as described above.
I'm not sure however this would be actually a good solution for you.
Another (maybe better?) possibility is to implement IsSameLine to return true if the points of the two lines are at maximum epsilon distance apart. The epsilon could have a default value computed based on the extent of the input vector data and probably it would be a good idea to give the user the possibility to give another value for it.
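The epsilon variant might be sketched like this. The endpoint-matching rule and the default of 0.1% of the extent's diagonal are assumptions for illustration only, not something the approach prescribes:

```csharp
using System;

public static class LineComparer
{
    // Two segments count as "the same line" when their endpoints match
    // within epsilon, in either direction (A-B vs. C-D or A-B vs. D-C).
    public static bool IsSameLine(
        (double X, double Y) a1, (double X, double Y) a2,
        (double X, double Y) b1, (double X, double Y) b2, double epsilon)
    {
        return (Near(a1, b1, epsilon) && Near(a2, b2, epsilon))
            || (Near(a1, b2, epsilon) && Near(a2, b1, epsilon));
    }

    // Hypothetical default: a small fraction of the drawing's diagonal.
    public static double DefaultEpsilon(double minX, double minY,
                                        double maxX, double maxY,
                                        double fraction = 1e-3)
    {
        double w = maxX - minX, h = maxY - minY;
        return fraction * Math.Sqrt(w * w + h * h);
    }

    private static bool Near((double X, double Y) p, (double X, double Y) q, double eps)
    {
        double dx = p.X - q.X, dy = p.Y - q.Y;
        return dx * dx + dy * dy <= eps * eps;
    }
}
```

With the extent ((10,10), (24,53)) from the example, this default comes out at roughly 0.045 units, so (11,15) and (11.1, 15.1) would not be treated as the same point unless the user raises the epsilon.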
