I use the gdal_retile.py script (rewritten in C#) to cut rasters into tiles. Everything works fine, but I want the script to work a little differently: I want to change the first level's scale so that it is calculated using this pattern:
private const double MetersPerInch = 0.0254;
private const double DPI = 96;

private double GetScale(int meters, int pixels)
{
    // Cast to double first, otherwise meters / pixels truncates to an int.
    return (double)meters / pixels / MetersPerInch * DPI;
}
For example, if I get a raster of 4000 x 4000 px that covers 100 km, then:
scale = 100000 / 4000 / 0.0254 * 96 ≈ 94488
Now I need to find the first power of two greater than the computed scale; in this case that's 1:131072, and I should set it as the scale for the first level. Each following level's scale should be the next power of two: [1:262144, 1:524288, 1:1048576, ...].
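To make the rounding rule concrete, here is a minimal sketch (a hypothetical helper of mine, not part of gdal_retile.py):

// Smallest power of two that is >= the computed scale.
private static int NextPowerOfTwo(double value)
{
    int result = 1;
    while (result < value)
        result <<= 1;   // keep doubling until we reach or pass the target
    return result;
}

// NextPowerOfTwo(94488.19) == 131072, i.e. the 1:131072 first level above.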
Could anyone help me modify the script? I don't care about the language (it can be done in Python or C#).
Thanks in advance for any solutions!
Ok, I did it!
Here's the source code (maybe someone will need it in the future). Be careful: I made my changes on lines 370, 432 and 433. If you want to revert my changes to the original, just replace the _scaleFactor variable with 2 and remove all the _scaleFactor calculations: http://pastebin.com/0qUCVk9J
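The gist of the change, in sketch form (the names are mine, not the pastebin's; NextPowerOfTwo is the helper sketched above):

// First-level scale snapped up to a power of two, then doubled per level.
int levels = 4;
int firstScale = NextPowerOfTwo(GetScale(100000, 4000));   // 131072
int[] levelScales = new int[levels];
for (int i = 0; i < levels; i++)
    levelScales[i] = firstScale << i;   // 131072, 262144, 524288, 1048576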
I'm sorry about the vague title, but I'm not really sure how to ask this without being very specific. If you suggest a title which is more clear, I'll change it as soon as I can.
Anyway, I don't think I can ask my question very succinctly without first providing a little background information. In a 2D space, I am creating "acres", which contain "tiles".
[One Acre with 64 Tiles]
For the sake of clarity, we'll assume that in this specific instance, there are 12 acres, four in the first row, four in the second, and four in the third. Each acre has 64 tiles in it, in an eight by eight grid.
[Twelve Acres, each with 64 Tiles]
I am generating a texture whose width and height are the desired number of acres multiplied by the number of tiles in each acre. In our example, the texture would be 32 pixels wide (4 acres per horizontal row x 8 tiles per acre) and 24 pixels tall (3 acres per vertical column x 8 tiles per acre). The texture is then filled with Perlin noise, which I would like to use to colour each tile.
[Single Acre, with 64 Tiles, next to the Perlin image generated for it (scaled up). This has a slight random colour variation applied to each tile.]
I would like to generate one image for all of the acres, and read from it each time a new acre is created, but therein lies the problem, and the subject of my question. How do I get the offset, so that each adjacent acre continues the pattern?
[What I want (to get this, I just created a single larger tile)]
The method I'm currently using doesn't seem to work, however, and ends up creating something like the following.
[Strange Result: http://2catstudios.github.io/images/StackOverflow/150113_Grid_Offset/Perlin_Twelve_Acres_NoSpace.png]
Following is the code I'm currently using to find the (incorrect, I assume) offset. The link below leads to a Gist where the Perlin generation function and the acre/tile generation functions are pasted.
int xOffset = ( parentAcreXIndex * desiredWidth );
int yOffset = ( parentAcreYIndex * desiredHeight );
new Color ( 0.000f,
            0.502f + ( parentWorld.worldPerlin.GetPixel (
                xOffset + ( desiredWidth - tileXIndex ),
                yOffset + ( desiredHeight - tileYIndex )).grayscale * 0.3f ),
            0.000f,
            1 );
Full class (links to GitHub's Gist); the line above is at line 100.
I don't really know what else to say; my mind is a bit "foggy" from trying to figure this out, so please forgive me if I've left something important out. Do let me know, and I'll update my post with the required information.
Also, I'm sorry about this question, it must be pretty hard to understand. I'm going to read over this a few times, after I publish it, to see if I can improve the wording.
Thank you for your time!
Michael
Edit
Thank you for taking a look at this! It turns out the problem was that the plane I was using for visualization was actually upside down. I'll make sure to check simple things like that in the future, sorry for the confusion! I have left the question up, because I was given enough points here to post images, and when I tried to delete it, the points were revoked. When I earn more points, I will come back to delete this. Thanks!
You seem to be asking this:
If I have a grid of pixels grouped into 'acres' of 8x8 tiles, how do I map from a given tile (the row and column within an acre) to the corresponding pixel in the overall texture?
Acres begin at every 8th pixel, so for a given acre (acreX, acreY):
acreOriginInTextureX = acreX * 8;
acreOriginInTextureY = acreY * 8;
So a given tile (tileX, tileY) within an acre will be:
tilePosInTextureX = acreOriginInTextureX + tileX
= acreX * 8 + tileX
tilePosInTextureY = acreY * 8 + tileY
Really this is:
tilePosInTextureX = acreX * tilesPerAcreX + tileX
... and same for Y.
NB: I'm assuming zero indexing everywhere. If not, you'll need to subtract 1 from acreX, acreY, tileX, tileY, but not tilesPerAcreX or Y.
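A runnable sketch of the whole mapping (hypothetical names; zero-indexed as above):

// Maps a tile within an acre to its pixel in the overall texture.
static void TileToTexturePos(int acreX, int acreY, int tileX, int tileY,
                             int tilesPerAcre, out int texX, out int texY)
{
    texX = acreX * tilesPerAcre + tileX;
    texY = acreY * tilesPerAcre + tileY;
}

// Example: tile (2, 5) in acre (3, 1) with 8x8-tile acres -> pixel (26, 13).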
I am trying to find the coordinates of one image inside another using the AForge framework:
ExhaustiveTemplateMatching tm = new ExhaustiveTemplateMatching();
TemplateMatch[] matchings = tm.ProcessImage(new Bitmap("image.png"), new Bitmap(@"template.png"));
int x_coordinate = matchings[0].Rectangle.X;
ProcessImage takes about 2 minutes to complete.
Image's size is about 1600x1000 pixels
Template's size is about 60x60 pixels
Does anyone know how to speed up that process?
In addition to the other answers, I would say that for your case:
Image's size is about 1600x1000 pixels. Template's size is about 60x60 pixels.
this framework is not the best fit. What you are trying to achieve is more search-image-inside-another-image than compare-two-images-of-different-resolution (the way "Search Google for this image" works).
Regarding the so-called pyramid search: it's true that the approach pays off most on bigger images, and the image pyramid is itself built on template matching. If we take the most popular implementation (which I found and used):
private static bool IsSearchedImageFound(this Bitmap template, Bitmap image)
{
    const Int32 divisor = 4;
    const Int32 epsilon = 10;

    // Note AForge's argument order: ProcessImage(imageToSearchIn, templateToFind).
    // Both bitmaps are downscaled by the same divisor before matching.
    ExhaustiveTemplateMatching etm = new ExhaustiveTemplateMatching(0.90f);

    TemplateMatch[] tm = etm.ProcessImage(
        new ResizeNearestNeighbor(template.Width / divisor, template.Height / divisor).Apply(template),
        new ResizeNearestNeighbor(image.Width / divisor, image.Height / divisor).Apply(image)
        );

    if (tm.Length == 1)
    {
        Rectangle tempRect = tm[0].Rectangle;

        // Accept the match if its size is (within epsilon) the downscaled size.
        if (Math.Abs(image.Width / divisor - tempRect.Width) < epsilon
            &&
            Math.Abs(image.Height / divisor - tempRect.Height) < epsilon)
        {
            return true;
        }
    }

    return false;
}
It should give you a picture close to this one:
As a bottom line: try a different approach. Maybe something closer to Sikuli integration with .NET, or Accord.NET, the newer continuation of AForge.
If this is too much work, you can try to just extend your screenshot functionality with cropping of the required page element (Selenium example).
2 minutes seems too much for a recent CPU with the image and template sizes you are using, but there are a couple of ways to speed up the process. The first one is using a smaller scale; this is called pyramid search. You can divide the image and template by 4, so that you have a 400x250 image and a 15x15 template, and match that smaller template. This runs much faster but is also less accurate. You can then take the candidate pixels found with the 15x15 template and search the corresponding neighbourhood of the 1600x1000 image with the 60x60 template, instead of searching the whole image.
Depending on the template details you may try at an even lower scale (1/8) instead.
Another thing to know is that a bigger template runs faster. This is counter-intuitive, but with a bigger template there are fewer candidate positions to test. So if possible, use a bigger template; sometimes this optimization is not possible because your template is already as big as it can be.
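As a hedged sketch of that two-stage pyramid idea with AForge (the 0.80f coarse threshold and the 3x-template search margin are my own choices; I believe ProcessImage has an overload taking a search zone rectangle, but verify against your AForge version; uses AForge.Imaging and AForge.Imaging.Filters):

// Stage 1: match on quarter-scale copies to find a cheap candidate location.
// Stage 2: re-run the full-resolution match only in a region around it.
const int divisor = 4;
Bitmap image = new Bitmap("image.png");        // ~1600x1000
Bitmap template = new Bitmap("template.png");  // ~60x60

Bitmap smallImage = new ResizeNearestNeighbor(
    image.Width / divisor, image.Height / divisor).Apply(image);
Bitmap smallTemplate = new ResizeNearestNeighbor(
    template.Width / divisor, template.Height / divisor).Apply(template);

TemplateMatch[] coarse = new ExhaustiveTemplateMatching(0.80f)
    .ProcessImage(smallImage, smallTemplate);

if (coarse.Length > 0)
{
    // Map the coarse hit back to full resolution, with a safety margin.
    Rectangle hit = coarse[0].Rectangle;
    int rx = Math.Max(0, hit.X * divisor - template.Width);
    int ry = Math.Max(0, hit.Y * divisor - template.Height);
    int rw = Math.Min(image.Width - rx, template.Width * 3);
    int rh = Math.Min(image.Height - ry, template.Height * 3);

    TemplateMatch[] fine = new ExhaustiveTemplateMatching(0.90f)
        .ProcessImage(image, template, new Rectangle(rx, ry, rw, rh));
    // fine[0].Rectangle is the full-resolution match, if one was found.
}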
I've been looking around for a faster way to apply the "multiply" blend effect to my bitmaps. I have tried using PorterDuff.Multiply, but it doesn't achieve the desired result on bitmaps that contain alpha channels: anything with 0 alpha becomes black.
I've read around, and it seems the only way I can achieve the effect I'm after (Photoshop/GIMP's "multiply" layer blending) is by applying the effect per pixel.
OpenGL is not an option for the App.
I'm not sure I properly understand the algorithm Wikipedia suggests for the blend mode:
TopColour * BottomColour / 255
Would be:
// ColorA = Top, ColorB = Bottom, ColorC = Result.
// Note: this is pseudocode for readability; Android.Graphics.Color.A/R/G/B
// is not actually writeable.
ColorC.R = ColorA.R * ColorB.R / 255;
ColorC.G = ColorA.G * ColorB.G / 255;
ColorC.B = ColorA.B * ColorB.B / 255;
// Alpha = Alpha?
Would it be faster to convert each colour with Color.ToArgb and work with the integer?
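For what it's worth, a hedged sketch of the integer route (Xamarin.Android assumed; GetPixels/SetPixels move the whole bitmap as ARGB ints at once, so you avoid a method call per pixel; top, bottom and result are assumed same-sized bitmaps):

// The result keeps the top layer's alpha.
int w = top.Width, h = top.Height;
int[] topPx = new int[w * h];
int[] botPx = new int[w * h];
top.GetPixels(topPx, 0, w, 0, 0, w, h);
bottom.GetPixels(botPx, 0, w, 0, 0, w, h);

for (int i = 0; i < topPx.Length; i++)
{
    int t = topPx[i], b = botPx[i];
    int a = (t >> 24) & 0xFF;                                 // keep top alpha
    int r = (((t >> 16) & 0xFF) * ((b >> 16) & 0xFF)) / 255;  // R = Rt*Rb/255
    int g = (((t >> 8) & 0xFF) * ((b >> 8) & 0xFF)) / 255;    // G = Gt*Gb/255
    int bl = ((t & 0xFF) * (b & 0xFF)) / 255;                 // B = Bt*Bb/255
    topPx[i] = (a << 24) | (r << 16) | (g << 8) | bl;
}

result.SetPixels(topPx, 0, w, 0, 0, w, h);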
And finally, am I calculating the multiply effect correctly? It doesn't display properly :(
I'm stuck, any help would be greatly appreciated.
Thank you.
If you want fast, you need to look into RenderScript: http://developer.android.com/guide/topics/renderscript/index.html
This video from Google also shows you more or less everything you need: http://www.youtube.com/watch?v=gbQb1PVjfqM (Google I/O 2012 - Doing More With Less: Being a Good Android Citizen).
I have a legacy map viewer application using WinForms. It is sloooooow. (The speed used to be acceptable, but Google Maps and Google Earth came along and users got spoiled. Now I am permitted to make it faster :)
After doing all the obvious speed improvements (caching, parallel execution, not drawing what does not need to be drawn, etc.), my profiler shows me that the real choke point is the coordinate transformation converting points from map space to screen space.
Normally the conversion code looks like this:
public Point MapToScreen(PointF input)
{
    // Note that North is negative!
    var result = new Point(
        (int)((input.X - this.currentView.X) * this.Scale),
        (int)((input.Y - this.currentView.Y) * this.Scale));
    return result;
}
The real implementation is trickier. Latitudes/longitudes are represented as integers. To avoid losing precision, they are multiplied by 2^20 (~1 million). This is how a coordinate is represented:
public struct Position
{
    public const int PrecisionCompensationPower = 20;
    public const int PrecisionCompensationScale = 1048576; // 2^20

    public readonly int LatitudeInt;  // North is negative!
    public readonly int LongitudeInt;
}
It is important that the possible scale factors are also explicitly bound to powers of 2. This allows us to replace the multiplication with a bit shift. So the real algorithm looks like this:
public Point MapToScreen(Position input)
{
    Point result = new Point();
    result.X = (input.LongitudeInt - this.UpperLeftPosition.LongitudeInt) >>
               (Position.PrecisionCompensationPower - this.ZoomLevel);
    result.Y = (input.LatitudeInt - this.UpperLeftPosition.LatitudeInt) >>
               (Position.PrecisionCompensationPower - this.ZoomLevel);
    return result;
}
(UpperLeftPosition represents the upper-left corner of the screen in map space.)
I am thinking now of offloading this calculation to the GPU. Can anyone show me an example how to do that?
We use .NET 4.0, and the code should preferably run on Windows XP too. Furthermore, we cannot use libraries under the GPL.
I suggest you look at using OpenCL and Cloo to do this. Take a look at the vector add example, then change it to map the values by using two input ComputeBuffers (one each for LatitudeInt and LongitudeInt) and two output ComputeBuffers. I suspect the OpenCL code would look something like this:
// Note: scalar kernel arguments are passed by value; __constant applies
// to pointers, so the scalars below are plain ints.
__kernel void CoordTrans(__global int *lat,
                         __global int *lon,
                         const int ulpLat,
                         const int ulpLon,
                         const int zl,
                         __global int *outx,
                         __global int *outy)
{
    int i = get_global_id(0);
    const int pcp = 20;
    outx[i] = (lon[i] - ulpLon) >> (pcp - zl);
    outy[i] = (lat[i] - ulpLat) >> (pcp - zl);
}
but you would want to do more than one coordinate transform per work-item. I need to rush off; I recommend reading up on OpenCL before doing this.
Also, if the number of coords is reasonable (under roughly 100,000 to 1,000,000), the non-GPU solution will likely be faster.
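Host-side, the Cloo calls would look roughly like this (a sketch from memory, so check argument order and add error handling; assumes int[] lat/lon inputs, int[] outX/outY outputs of the same length, the scalars ulpLat/ulpLon/zoomLevel, and the kernel source in kernelSource):

// Runs the CoordTrans kernel above over lat.Length points.
var platform = ComputePlatform.Platforms[0];
var context = new ComputeContext(ComputeDeviceTypes.Gpu,
    new ComputeContextPropertyList(platform), null, IntPtr.Zero);
var queue = new ComputeCommandQueue(context, context.Devices[0],
    ComputeCommandQueueFlags.None);

var program = new ComputeProgram(context, kernelSource);
program.Build(null, null, null, IntPtr.Zero);
var kernel = program.CreateKernel("CoordTrans");

var latBuf = new ComputeBuffer<int>(context,
    ComputeMemoryFlags.ReadOnly | ComputeMemoryFlags.CopyHostPointer, lat);
var lonBuf = new ComputeBuffer<int>(context,
    ComputeMemoryFlags.ReadOnly | ComputeMemoryFlags.CopyHostPointer, lon);
var outXBuf = new ComputeBuffer<int>(context, ComputeMemoryFlags.WriteOnly, lat.Length);
var outYBuf = new ComputeBuffer<int>(context, ComputeMemoryFlags.WriteOnly, lat.Length);

kernel.SetMemoryArgument(0, latBuf);
kernel.SetMemoryArgument(1, lonBuf);
kernel.SetValueArgument(2, ulpLat);
kernel.SetValueArgument(3, ulpLon);
kernel.SetValueArgument(4, zoomLevel);
kernel.SetMemoryArgument(5, outXBuf);
kernel.SetMemoryArgument(6, outYBuf);

queue.Execute(kernel, null, new long[] { lat.Length }, null, null);
queue.ReadFromBuffer(outXBuf, ref outX, true, null);   // blocking reads
queue.ReadFromBuffer(outYBuf, ref outY, true, null);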
Now, one year later, the problem arose again and we found a very banal answer. I feel a bit stupid for not realizing it earlier. We draw the geographic elements to a bitmap via ordinary WinForms GDI, and GDI is hardware accelerated. All we have to do is NOT do the transformation ourselves, but instead set the transform of the System.Drawing.Graphics object:
Graphics.TranslateTransform(...) and Graphics.ScaleTransform(...)
We do not even need the trick with the bit shifting.
:)
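In sketch form (our handler and field names, assuming the same Scale and currentView as in the first snippet):

// Paint handler: let GDI+ do the map-to-screen conversion.
private void MapPanel_Paint(object sender, PaintEventArgs e)
{
    Graphics g = e.Graphics;
    // With the default (prepend) order, points are translated first and
    // then scaled: screen = Scale * (p - currentView), as in MapToScreen.
    g.ScaleTransform((float)this.Scale, (float)this.Scale);
    g.TranslateTransform(-(float)this.currentView.X, -(float)this.currentView.Y);

    // From here on, draw using raw map-space coordinates.
    // g.DrawLines(mapPen, mapSpacePoints);
}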
I'm coming from a CUDA background, and can only speak for NVIDIA GPUs, but here goes.
The problem with doing this on a GPU is the ratio of computation to transfer time.
You have on the order of 1 operation to perform per element. You'd really want to do more than this per element to get a real speed improvement. The bandwidth between global memory and the threads on a GPU is around 100 GB/s. So, if you have to load one 4-byte integer to do one FLOP, your theoretical maximum speed is 100/4 = 25 GFLOPS. This is far from the hundreds of GFLOPS advertised.
Note this is the theoretical maximum; the real result might be worse. And it is even worse if you're loading more than one element per operation. In your case, it looks like 2, so you might get a maximum of 12.5 GFLOPS. In practice, it will almost certainly be lower.
If this sounds ok to you though, then go for it!
XNA can be used to do all the transformations you require and gives very good performance. It can also be displayed inside a WinForms application: http://create.msdn.com/en-US/education/catalog/sample/winforms_series_1
I am using XNA to build a project where I can draw "graffiti" on my wall using an LCD projector and a monochrome camera that is filtered to see only hand held laser dot pointers. I want to use any number of laser pointers -- don't really care about differentiating them at this point.
The wall is 10' x 10', and the camera is only 640x480, so I'm attempting to use sub-pixel measurement with a spline curve, as outlined here: tpub.com
The camera runs at 120 fps (8-bit), so my question is: what is the fastest way to find that subpixel laser-dot center? Currently I'm using a brute-force 2D search to find the brightest pixel in the image (0 - 254) before doing the spline interpolation. That method is not very fast: each frame takes longer to process than the interval at which frames arrive.
Edit: To clarify, in the end my camera data is represented by a 2D array of bytes indicating pixel brightness.
What I'd like to do is use an XNA shader to crunch the image for me. Is that practical? From what I understand, there really isn't a way to keep persistent variables in a pixel shader, such as running totals, averages, etc.
But for argument's sake, let's say I found the brightest pixels using brute force, then stored them and their neighboring pixels (for the spline curve) into X number of vertices using texcoords. Is it practical then to use HLSL to compute a spline curve from texcoords?
I am also open to suggestions outside of my XNA box, be it DX10/DX11, maybe some sort of FPGA, etc. I just don't have much experience with crunching data this way. I figure if they can do something like this on a Wii remote using 2 AA batteries, then I'm probably going about this the wrong way.
Any ideas?
If by brute-forcing you mean looking at every pixel independently, it is basically the only way of doing it: you will have to scan through all of the image's pixels no matter what you want to do with the image. You might not need to find the brightest pixels, though; you can filter the image by color (e.g., if you're using a red laser), which is easily done with an HSV-coded image. If you are looking for faster algorithms, try OpenCV. It has been optimized again and again for image processing, and you can use it from C# via a wrapper:
http://www.codeproject.com/KB/cs/Intel_OpenCV.aspx
OpenCV can also help you easily find the point centers and track each point.
Is there a reason you are using a 120 fps camera? You know the human eye can only see about 30 fps, right? I'm guessing it's to follow very fast laser movements... You might want to consider bringing it down, because real-time processing at 120 fps will be very hard to achieve.
Running through 640*480 bytes to find the highest byte should complete within a millisecond, even on slow processors. No need to take the route of shaders.
I would advise optimizing your loop instead.
For instance, this is really slow (a 2D indexer does a multiplication for every array lookup):
byte[,] myBytes = new byte[480, 640];
// fill it with your image
byte highest = 0;
int foundX = -1, foundY = -1;
for (int y = 0; y < 480; y++)
{
    for (int x = 0; x < 640; x++)
    {
        if (myBytes[y, x] > highest)
        {
            highest = myBytes[y, x];
            foundX = x;
            foundY = y;
        }
    }
}
This is much faster (flat array, single index):
byte[] myBytes = new byte[640 * 480];
// fill it with your image
byte highest = 0;
int found = -1;
int len = 640 * 480;
for (int i = 0; i < len; i++)
{
    if (myBytes[i] > highest)
    {
        highest = myBytes[i];
        found = i;
    }
}
int foundX = -1, foundY = -1;
if (found != -1)
{
    foundX = found % 640;
    foundY = found / 640;
}
This is off the top of my head so sorry for errors ;^)
You're dealing with some pretty complex maths if you want true sub-pixel accuracy. I think this paper is something to consider; unfortunately, you'll have to pay to see it on that site. If you've got access to a suitable library, they may be able to get hold of it for you.
The link in the original post suggested doing 1000 spline calculations for each axis. It treats x and y independently, which is OK for circular images but a bit off if the image is a skewed ellipse. You could use the following to get a reasonable estimate of the centre:
xc = sum(xn * f(xn)) / sum(f(xn))
where xc is the mean, xn is a point along the x-axis, and f(xn) is the value at the point xn. So for this:
      *
    * *
    * *
    * *
    * *
    * *
    * * *
  * * * *
  * * * *
* * * * * *
-----------
2 3 4 5 6 7
gives:
sum(xn * f(xn)) = 2*1 + 3*3 + 4*9 + 5*10 + 6*4 + 7*1 = 128
sum(f(xn))      = 1 + 3 + 9 + 10 + 4 + 1 = 28
xc = 128 / 28 ≈ 4.57
and repeat for the y-axis.
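A small sketch of that weighted centroid over a window around the brightest pixel (the window bounds and names are arbitrary):

// xc = sum(x * f(x)) / sum(f(x)), where f(x) is the summed brightness of
// column x inside the window [x0..x1] x [y0..y1].
static double CentroidX(byte[,] pixels, int x0, int x1, int y0, int y1)
{
    double weighted = 0, total = 0;
    for (int x = x0; x <= x1; x++)
    {
        double f = 0;
        for (int y = y0; y <= y1; y++)
            f += pixels[y, x];
        weighted += x * f;
        total += f;
    }
    // Fall back to the window centre if the window is completely dark.
    return total > 0 ? weighted / total : (x0 + x1) / 2.0;
}
// Swap the roles of rows and columns for the y-axis.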
Brute force is the only real way, but your idea of using a shader is good: you'd be offloading the brute-force check from the CPU, which can only look at a small number of pixels simultaneously (roughly 1 per core), to the GPU, which likely has 100+ simple cores (pipelines) that can compare pixels simultaneously (your algorithm may need to be modified a bit to work well with the one-instruction, many-cores arrangement of a GPU).
The biggest issue I see is whether or not you can move that data to the GPU fast enough.
Another optimization to consider: if you're drawing, then the current location of the pointer is probably close to its last location. Remember the last recorded position of the pointer between frames, and only scan a region close to that position... say a 1' x 1' area. Only if the pointer isn't found in that area should you scan the whole surface.
Obviously, there will be a tradeoff between how quickly your program can scan and how quickly you can move the pointer before the camera "loses" it and has to fall back to the slow, full-image scan. A little experimentation will probably reveal the optimum region size.
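A rough sketch of that windowed search with a full-image fallback (the window size, threshold, and names are arbitrary):

// Scan a sub-rectangle for the brightest pixel above a threshold.
static bool ScanRegion(byte[] img, int w, int h, int x0, int y0,
                       int rw, int rh, byte threshold, out int px, out int py)
{
    px = -1; py = -1;
    byte best = threshold;
    int xEnd = Math.Min(w, x0 + rw), yEnd = Math.Min(h, y0 + rh);
    for (int y = Math.Max(0, y0); y < yEnd; y++)
        for (int x = Math.Max(0, x0); x < xEnd; x++)
            if (img[y * w + x] > best)
            {
                best = img[y * w + x];
                px = x; py = y;
            }
    return px >= 0;
}

// Look near the last known position first; fall back to the whole frame.
static bool FindDot(byte[] img, int w, int h, ref int lastX, ref int lastY)
{
    const byte threshold = 128;   // arbitrary brightness cutoff
    const int window = 64;        // half-size of the local search box
    int px = -1, py = -1;
    bool found = false;
    if (lastX >= 0)
        found = ScanRegion(img, w, h, lastX - window, lastY - window,
                           2 * window, 2 * window, threshold, out px, out py);
    if (!found)
        found = ScanRegion(img, w, h, 0, 0, w, h, threshold, out px, out py);
    if (found) { lastX = px; lastY = py; }
    return found;
}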
Cool project, by the way.
Put the camera slightly out of focus and bitblt against a neutral reference sample; then you can quickly scan rows for non-zero values. Also, since you are at 8 bits, picking up 4 bytes at a time lets you process the image faster. As others pointed out, you might reduce the frame rate: if you have less fidelity than the resulting image, there isn't much point in the high scan rate.
(The slightly out-of-focus camera will help capture just the brightest points and reduce false positives if you have a busy surface... assuming, of course, that you are not shooting a smooth/flat surface.)
Start with a black output buffer. Forget about subpixel for now. Every frame, for every pixel, do this:
outbuff=max(outbuff,inbuff);
Do the subpixel filtering into a third, "clean" buffer when you're done with the image, or process a chunk or a line of the screen at a time in real time. Advantage: a real-time "rough" view of the drawing, cleaned up as you go.
When you convert from the rough output buffer to the "clean" third buffer, you can clear the rough buffer to black. This lets you keep drawing forever without slowing down.
By drawing the "clean" over top the "rough," maybe in a slightly different color, you'll have the best of both worlds.
This is similar to what paint programs do--if you draw really fast, you see a rough version, then the paint program "cleans up" the image when it has time.
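In sketch form (byte buffers of width*height; the names are mine):

// Per frame: fold the new frame into the rough buffer with a running max.
static void AccumulateFrame(byte[] inBuff, byte[] roughBuff)
{
    for (int i = 0; i < roughBuff.Length; i++)
        if (inBuff[i] > roughBuff[i])
            roughBuff[i] = inBuff[i];   // outbuff = max(outbuff, inbuff)
}

// When there is time: filter rough -> clean, then reset rough so
// accumulation can continue indefinitely.
static void CleanPass(byte[] roughBuff, byte[] cleanBuff)
{
    // ... subpixel filtering from roughBuff into cleanBuff goes here ...
    Array.Clear(roughBuff, 0, roughBuff.Length);
}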
Some comments on the algorithm:
I've seen a lot of cheats in this arena. I've played Sonic on a Sega Genesis emulator that upsamples, and it has some pretty wild algorithms that work very well and are very fast.
You actually have some advantages you can exploit, because you might know the brightness and the radius of the dot.
You might just look at each pixel and its 8 neighbors and let those 9 pixels "vote" according to their brightness for where the subpixel lies.
Other thoughts
Your hand is not that accurate when you control a laser pointer. Try collecting all the dots every 10 frames or so, identifying which beams are which (based on previous motion, and accounting for new dots, turned-off lasers, and dots that have entered or left the visual field), then just draw a high-resolution curve. Don't worry about sub-pixel accuracy in the input; just draw the curve into the high-res output.
Use a Catmull-Rom spline, which goes through all control points.