I have two different designs for the logic in a single form (using C#). They are as follows:
First:
//Declaration is only one time
Bitmap a;
//This part of the code will be called many times
void reload()
{
    if(z==1)
    {
        //x and y are just strings representing the image path
        a = new Bitmap(x);
        pictureBox1.Image = a;
    }
    else
    {
        a = new Bitmap(y);
        pictureBox1.Image = a;
    }
}
Second:
//Declaration is only one time
Bitmap a;
Bitmap b;
//x and y are just strings representing the image path
a = new Bitmap(x);
b = new Bitmap(y);
//This part of the code will be called many times
void reload()
{
    if(z==1)
    {
        pictureBox1.Image = a;
    }
    else
    {
        pictureBox1.Image = b;
    }
}
My question is: which one is more memory efficient? I am developing for an embedded system (WinCE 6.0, CF 3.5) where memory is limited, and I need to deal with a lot of images in a single form (method two would force me to declare a lot of Bitmap objects).
Please advise, thanks.
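Assuming the Bitmaps are reasonably large, the first is better for memory efficiency. It does the same job with only one Bitmap instance alive at a time instead of two, provided the previous Bitmap is disposed before it is replaced (otherwise the old images linger until the garbage collector finalizes them).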
Now the question becomes, would you prefer to take the memory overhead and avoid instantiating them repeatedly? I find often in embedded systems you are very limited in both RAM and CPU time, and it really depends on what you're doing whether you'd rather use RAM to save CPU time or vice-versa.
As an example of the opposite trade-off, I once coded an embedded music synthesizer in school where CPU time was of the essence. To create a smooth tone, it was better to jam a precomputed envelope into RAM and sample from it than to calculate sine values on the fly.
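That RAM-for-CPU trade can be sketched as a lookup table. This is only an illustration, not the synthesizer's actual code: the class name SineTable, the table size of 1024, and the "phase in cycles" convention are all assumptions made for the example.

```csharp
using System;

// Hypothetical helper: spends RAM on one precomputed sine cycle so that
// producing a sample is an array read instead of a Math.Sin call.
public static class SineTable
{
    public const int Size = 1024;                 // assumed length; tune to available RAM
    private static readonly float[] Table = Build();

    private static float[] Build()
    {
        var t = new float[Size];
        for (int i = 0; i < Size; i++)
            t[i] = (float)Math.Sin(2.0 * Math.PI * i / Size);
        return t;
    }

    // phase is measured in cycles: 0.0 is the start of the wave, 1.0 one full period.
    public static float Sample(double phase)
    {
        double frac = phase - Math.Floor(phase);  // wrap into [0, 1)
        int index = (int)(frac * Size) % Size;    // nearest-lower table entry
        return Table[index];
    }
}
```

The table costs Size * 4 bytes of RAM; a larger table gives a cleaner tone, a smaller one saves memory, which is the same trade-off as caching Bitmaps versus reloading them.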
I am trying to compare two bitmaps to one another. One is premade; the other is a small capture of the main screen that has been filtered to remove everything except full white. I now need a way to compare the number of white pixels in the live Bitmap to the number in the premade Bitmap (101 white pixels). One way I know of would be the Bitmap.GetPixel/SetPixel methods, but they are really slow, and as this is for a fairly time-critical application, they are unsuitable.
Especially since I could cut down the filtering process by a factor of 70 by following this guide.
https://www.codeproject.com/articles/617613/fast-pixel-operations-in-net-with-and-without-unsa
I also can't just compare the two Bitmaps directly, as the live one will usually not have the pixels in the same positions, but it will have the same number of white pixels.
So yeah. It'd be great if one of you had a time effective solution to this problem.
Edit
Huge oversight on my part. When looking at the filtering method, it becomes apparent that one can just use a counter+=1 every time a pixel is not filtered out.
So I just changed this line of code in the filter function
row[rIndex] = row[bIndex] = row[gIndex] = distance > toleranceSquared ? unmatchingValue : matchingValue;
to this
if(distance > toleranceSquared)
{
    row[rIndex] = row[bIndex] = row[gIndex] = unmatchingValue;
}
else
{
    row[rIndex] = row[bIndex] = row[gIndex] = matchingValue;
    WhitePixelCount += 1;
}
I have been given the task of showing frames from a camera as fast as possible. The camera is a Basler Dart and it can produce more than 70 frames per second. Unfortunately it does not support stream output; it produces bitmaps.
Our current solution is similar to the example from Basler, and it uses a PictureBox to show frames. I've read somewhere that this is slow and that a better solution should be used. I agree that it is slow: it takes 25% of the CPU (for the camera alone; the rest of the app takes only 5%) when displaying all 70 fps. Unfortunately I haven't found a better solution.
private void OnImageGrabbed(Object sender, ImageGrabbedEventArgs e)
{
    if (InvokeRequired)
    {
        // If called from a different thread, we must use the Invoke method to marshal the call to the proper GUI thread.
        // The grab result will be disposed after the event call. Clone the event arguments for marshaling to the GUI thread.
        BeginInvoke( new EventHandler<ImageGrabbedEventArgs>( OnImageGrabbed ), sender, e.Clone() );
        return;
    }
    try
    {
        // Acquire the image from the camera. Only show the latest image. The camera may acquire images faster than the images can be displayed.
        // Get the grab result.
        IGrabResult grabResult = e.GrabResult;
        // Check if the image can be displayed.
        if (grabResult.IsValid)
        {
            // Reduce the number of displayed images to a reasonable amount if the camera is acquiring images very fast.
            if (!stopWatch.IsRunning || stopWatch.ElapsedMilliseconds > 100)
            {
                stopWatch.Restart();
                bool reqBitmapOldDispose=true;
                if(grConvBitmap==null || grConvBitmap.Width!=grabResult.Width || grConvBitmap.Height!=grabResult.Height)
                {
                    grConvBitmap = new Bitmap(grabResult.Width, grabResult.Height, PixelFormat.Format32bppRgb);
                    grConvRect=new Rectangle(0, 0, grConvBitmap.Width, grConvBitmap.Height);
                }
                else
                    reqBitmapOldDispose=false;
                // Lock the bits of the bitmap.
                BitmapData bmpData = grConvBitmap.LockBits(grConvRect, ImageLockMode.ReadWrite, grConvBitmap.PixelFormat);
                // Place the pointer to the buffer of the bitmap.
                converter.OutputPixelFormat = PixelType.BGRA8packed;
                IntPtr ptrBmp = bmpData.Scan0;
                converter.Convert( ptrBmp, bmpData.Stride * grConvBitmap.Height, grabResult );
                grConvBitmap.UnlockBits( bmpData );
                // Assign a temporary variable to dispose the bitmap after assigning the new bitmap to the display control.
                Bitmap bitmapOld = pictureBox.Image as Bitmap;
                // Provide the display control with the new bitmap. This action automatically updates the display.
                pictureBox.Image = grConvBitmap;
                if (bitmapOld != null && reqBitmapOldDispose)
                {
                    // Dispose the bitmap.
                    bitmapOld.Dispose();
                }
            }
        }
    }
    catch (Exception exception)
    {
        ShowException(exception, "OnImageGrabbed" );
    }
    finally
    {
        e.DisposeGrabResultIfClone();
    }
}
My idea was to move the load to the GPU. I've tried SharpGL (OpenGL for C#), but it seemed unable to consume 70 textures per second; I suspect that is because I put the solution together in 60 minutes while still learning the basics of OpenGL.
My question is: what should I use instead of PictureBox to improve performance and decrease CPU load? Should I use OpenGL, or just limit the displayed frames (as in the example)? The PC has only integrated graphics (Intel i3 6th gen).
Overall when it comes to performance I would recommend doing some profiling to see where the actual bottleneck is. It is much easier to find performance problems if you have some data to go on rather than just guessing at the problem. A good profiler should also tell you a bit about garbage-collection penalty & allocation rates.
Overall I would say the example code looks quite decent:
There is some rate control, even if the limit of 100ms/10fps looks rather low to me.
There does not seem to be much unnecessary copying going on as far as I can see.
It looks like you are reusing and updating the bitmap rather than recreating it every frame.
Some possible things you could try:
If the camera is monochrome you could probably skip the conversion stage, and just do a memory copy from the grab-buffer to the bitmap.
If the camera is a high resolution model you could consider binning pixels to reduce the resolution of the images.
We are using WriteableBitmap in WPF with fairly good performance. But I'm not sure how it compares to a WinForms PictureBox.
You could try doing the drawing yourself, i.e. attach an event handler to the Paint event of a panel and use one of the Graphics.DrawImage* methods to draw the latest bitmap. I have no idea if this will make any significant performance difference.
If the conversion takes any significant time you could try doing this on a background thread. If you do this you might need some way to handle bitmaps in such a way that no bitmap may be accessed from both the worker thread and UI thread at the same time.
The rate control could probably be improved to "process images as fast as the UI thread can handle" instead of a fixed rate.
Another thing to consider is whether hardware rendering is used. This is normally the case, but there are situations where Windows will fall back to software rendering, with very high CPU usage as a result:
Servers may lack a GPU at all
Virtualization might not virtualize the GPU
Remote desktop might use software rendering
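Returning to the rate-control suggestion above ("process images as fast as the UI thread can handle"), one way to sketch it is a single-slot mailbox: the camera callback always overwrites the slot with the newest frame, and the UI takes whatever is there when it is ready. This is an illustration only; LatestFrameSlot and its members are hypothetical names, and it is not part of the Basler API.

```csharp
using System;
using System.Threading;

// Hypothetical helper: keeps only the newest frame and lets the consumer
// (for example the UI thread) take it whenever it is ready.
public sealed class LatestFrameSlot<T> where T : class
{
    private T _pending;

    // Called by the producer (camera callback). Returns the frame that was
    // replaced without ever being shown, so the caller can dispose or reuse it.
    public T Publish(T frame)
    {
        return Interlocked.Exchange(ref _pending, frame);
    }

    // Called by the consumer. Returns null when nothing new has arrived.
    public T TryTake()
    {
        return Interlocked.Exchange(ref _pending, null);
    }
}
```

A possible use: the grab handler calls Publish and only posts a BeginInvoke when Publish returned null (meaning no display request is already queued), and the UI-side handler calls TryTake and shows the result. Frames that arrive while the UI is busy are dropped rather than queued, so the display rate follows what the UI can actually sustain.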
I've made a program that analyzes the first pixel of an image and records its values in a List, in order to check whether the image is black and white or in color. Does anyone know an efficient way of reading high-res images? Right now I'm using Bitmaps, but they are highly inefficient. The images are around 18 megapixels each and I want to analyze around 400 photos. Code below:
Bitmap b;
foreach (FileInfo f in files) {
    // HUGE MEMORY LEAK
    System.GC.Collect();
    System.GC.WaitForPendingFinalizers();
    b = (Bitmap)Bitmap.FromFile(f.FullName);
    // reading pixel (0,0) from bitmap
When I run my program it says:
"An unhandled exception of type 'System.OutOfMemoryException' occurred in System.Drawing.dll
Additional information: There is no available memory."
I've tried System.GC.Collect() to clean up, as you can see, but the exception doesn't go away. If I analyse a folder that contains only a few photos, the program runs fine and gladly does its job.
Using the first pixel of an image to check if it is colour or not is the wrong way to do this.
If you have an image with a black background (pixel value 0,0,0 in RGB), how do you know the image is black and white, and not colour with a black background?
Placing the bitmap in a using statement is correct, as it will then be disposed properly.
The following will do the trick.
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

class Program
{
    static void Main(string[] args) {
        List<String> ImageExtensions = new List<string> { ".JPG", ".JPE", ".BMP", ".GIF", ".PNG" };
        String rootDir = "C:\\Images";
        foreach (String fileName in Directory.EnumerateFiles(rootDir)) {
            if (ImageExtensions.Contains(Path.GetExtension(fileName).ToUpper())) {
                try {
                    //Image.FromFile will work just as well here.
                    //The using block disposes each image before the next one is loaded.
                    using (Image i = Bitmap.FromFile(fileName)) {
                        if (i.PixelFormat == PixelFormat.Format16bppGrayScale) {
                            //Grey scale...
                        } else if (i.PixelFormat == PixelFormat.Format1bppIndexed) {
                            //1bit colour (possibly b/w, but could be other indexed colours)
                        }
                    }
                } catch (Exception e) {
                    Console.WriteLine("Error - " + e.Message);
                }
            }
        }
    }
}
The reference for PixelFormat is found here - https://msdn.microsoft.com/en-us/library/system.drawing.imaging.pixelformat%28v=vs.110%29.aspx
By default, a single object in .NET is limited to 2 GB, so I doubt that an individual image is causing the problem (an 18-megapixel image is roughly 72 MB at 32 bits per pixel).
I also would suggest that you should NEVER manually call the GC to solve a memory leak (though this is not technically a leak, just heavy memory usage).
using statements are perfect for ensuring that an object is disposed as soon as the block exits, and the GC is very good at cleaning up.
We perform intensive image processing in our software, and have never had issues with memory using the approach I have shown.
While simply reading the header to find image data is a perfectly correct solution, it does mean a lot of extra work to decode different file types, which is not necessary unless you are working with vast amounts of images in very small memory (although if that is your aim, straight C is a better way to do it rather than C#. Horses for courses and all that jazz!)
EDIT - I just ran this on a directory containing over 5000 high-res TIFFs with no memory issues. The slowest part of the process was the console output!
I would guess that if you only need the first pixel, it is not necessary to read the whole file. You could take just the first pixel from the bitmap's byte array and work with that. You would have to locate the pixel array manually and take its first entry.
How do you find the pixel array? It depends on the file format. You should find the specification for each format you need to support and follow it. Here is an example for a BMP reader
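As a rough illustration of the header-reading idea (not the reader linked above), here is a minimal sketch for uncompressed 24bpp or 32bpp BMP files only. The names BmpPeek and ReadTopLeftPixel are made up for the example. Compressed formats such as JPEG cannot be handled this way, because their pixel data has to be decoded before any single pixel is available.

```csharp
using System;
using System.IO;

public static class BmpPeek
{
    // Returns the top-left pixel of an uncompressed 24bpp or 32bpp BMP as
    // (R, G, B), reading only the header and three pixel bytes.
    public static (byte R, byte G, byte B) ReadTopLeftPixel(Stream s)
    {
        var reader = new BinaryReader(s);          // BinaryReader is little-endian, like BMP
        if (reader.ReadByte() != 'B' || reader.ReadByte() != 'M')
            throw new InvalidDataException("Not a BMP file.");

        s.Seek(10, SeekOrigin.Begin);
        uint dataOffset = reader.ReadUInt32();     // offset 10: where the pixel array starts
        uint headerSize = reader.ReadUInt32();     // offset 14: size of the info header
        if (headerSize < 40) throw new NotSupportedException("Old-style BMP header.");
        int width = reader.ReadInt32();            // offset 18
        int height = reader.ReadInt32();           // offset 22: negative means top-down rows
        reader.ReadUInt16();                       // offset 26: colour planes (unused here)
        int bpp = reader.ReadUInt16();             // offset 28: bits per pixel
        uint compression = reader.ReadUInt32();    // offset 30: 0 means uncompressed
        if ((bpp != 24 && bpp != 32) || compression != 0)
            throw new NotSupportedException("Only uncompressed 24/32bpp is handled.");

        // Each stored row is padded to a multiple of 4 bytes.
        long stride = ((long)width * bpp + 31) / 32 * 4;
        // In the usual bottom-up layout the top image row is the LAST stored row.
        long topRow = height > 0 ? height - 1 : 0;
        s.Seek(dataOffset + topRow * stride, SeekOrigin.Begin);

        byte b = reader.ReadByte(), g = reader.ReadByte(), r = reader.ReadByte();
        return (r, g, b);
    }
}
```

Called with a FileStream, this touches only a few dozen bytes of the file regardless of the image size.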
I'm writing a CAD-like application which needs to display maybe tens of thousands of lines when stuff is zoomed out.
At the moment I use C++ and Direct2D, which works quite smoothly: I can draw 100000 lines in something like 16 milliseconds. So I know my (average) machine can handle that.
I'm trying to move to WPF but I'm finding the performance disappointing. With the code below the redraw takes nearly one second (when I resize the window for example).
Profiling says the bottleneck is somewhere in [wpfgfx_v0400.dll], but I can't see exactly which functions.
So my questions: What am I doing wrong? How can I improve the performance of the code below?
public partial class MainWindow : Window
{
    public MainWindow ()
    {
        InitializeComponent ();
        var gg = new GeometryGroup ();
        Random random = new Random ();
        for (int i = 0; i < 1000; i++)
        {
            Point p0 = new Point (random.Next (1000), random.Next (1000));
            Point p1 = new Point (random.Next (1000), random.Next (1000));
            var lineGeometry = new LineGeometry (p0, p1);
            gg.Children.Add (lineGeometry);
        }
        var stroke = new SolidColorBrush (Colors.Red);
        gg.Freeze ();
        stroke.Freeze ();
        this.Content = new Path () { Data = gg, Stroke = stroke };
    }
}
Drawing things in WPF is slower than you'd expect! A line with 1000 points shouldn't be a problem in general, but I wouldn't hope to draw anywhere near 100000 points using WPF geometry primitives - instead you'll eventually have to drop down to either rasterizing yourself or using some raw DirectX embedded in your WPF application.
Specifically, the case of a single line of 1000 points should be well within the bounds of reason. Some thoughts:
With random data, if the line goes back and forth to cover a lot of area, this can lead to some very ugly performance. This is not simply a case of creating a large area to redraw - going back and forth seems to cause some bad situation with WPF geometries
If you're drawing a single line, try using a Polyline or PolyLineSegment in your Path
If you do actually want a number of disconnected lines, rather than a GeometryGroup full of LineGeometry, use a Path with many PathFigures.
For the general case, as other answers have pointed out, you can look into OnRender or DrawingVisual type approaches; depending on what you're doing, the performance may or may not be better. I think you'll still struggle with the number of elements you're suggesting, unless you can do some kind of virtualization.
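The "one Path with many PathFigures" suggestion above could look roughly like this. It is a sketch under assumptions: LineBatch and Build are hypothetical names, the lines are random as in the question, and whether this is actually faster than the GeometryGroup version needs to be measured on the target machine.

```csharp
using System;
using System.Windows;
using System.Windows.Media;

public static class LineBatch
{
    // Builds ONE frozen PathGeometry holding 'count' disconnected random lines,
    // one PathFigure per line, instead of a GeometryGroup of LineGeometry objects.
    public static PathGeometry Build(int count, int extent, Random random)
    {
        var geometry = new PathGeometry();
        for (int i = 0; i < count; i++)
        {
            var p0 = new Point(random.Next(extent), random.Next(extent));
            var p1 = new Point(random.Next(extent), random.Next(extent));
            var segment = new LineSegment(p1, true);   // true = the segment is stroked
            // false = the figure is not closed, so it stays a plain line
            geometry.Figures.Add(new PathFigure(p0, new PathSegment[] { segment }, false));
        }
        geometry.Freeze();                             // freeze once, as in the question's code
        return geometry;
    }
}
```

In the question's constructor this would replace the GeometryGroup, e.g. Data = LineBatch.Build (1000, 1000, random) on the Path.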
First of all, I am aware that this question really sounds as if I didn't search, but I did, a lot.
I wrote a small Mandelbrot drawing code for C#, it's basically a windows form with a PictureBox on which I draw the Mandelbrot set.
My problem is that it's pretty slow. Without a deep zoom it does a pretty good job, and moving around and zooming is pretty smooth, taking less than a second per drawing, but once I start to zoom in a little and get to places which require more calculations it becomes really slow.
On other Mandelbrot applications my computer does really fine on places which work much slower in my application, so I'm guessing there is much I can do to improve the speed.
I did the following things to optimize it:
Instead of using the SetPixel/GetPixel methods on the bitmap object, I used the LockBits method to write directly to memory, which made things a lot faster.
Instead of using complex number objects (with classes I made myself, not the built-in ones), I emulated complex numbers using 2 variables, re and im. Doing this allowed me to cut down on multiplications, because squaring the real part and the imaginary part is something that is done a few times during the calculation, so I just save the square in a variable and reuse the result without the need to recalculate it.
I use 4 threads to draw the Mandelbrot, each thread does a different quarter of the image and they all work simultaneously. As I understood, that means my CPU will use 4 of its cores to draw the image.
I use the Escape Time Algorithm, which as I understood is the fastest?
Here is how I move between the pixels and calculate; it's commented, so I hope it's understandable:
//Pixel by pixel loop:
for (int r = rRes; r < wTo; r++)
{
    for (int i = iRes; i < hTo; i++)
    {
        //These calculations are to determine what complex number corresponds to the (r,i) pixel.
        double re = (r - (w/2))*step + zeroX ;
        double im = (i - (h/2))*step - zeroY;
        //Create the Z complex number
        double zRe = 0;
        double zIm = 0;
        //Variables to store the squares of the real and imaginary part.
        double multZre = 0;
        double multZim = 0;
        //Start iterating with the complex number to determine its escape time (mandelValue)
        int mandelValue = 0;
        while (multZre + multZim < 4 && mandelValue < iters)
        {
            /*The new real part equals re(z)^2 - im(z)^2 + re(c), we store it in a temp variable
              tempRe because we still need re(z) in the next calculation
            */
            double tempRe = multZre - multZim + re;
            /*The new imaginary part is equal to 2*re(z)*im(z) + im(c)
             * Instead of multiplying these by 2 I add re(z) to itself and then multiply by im(z), which
             * means I just do 1 multiplication instead of 2.
             */
            zRe += zRe;
            zIm = zRe * zIm + im;
            zRe = tempRe; // We can now put the temp value in its place.
            // Do the squaring now, they will be used in the next calculation.
            multZre = zRe * zRe;
            multZim = zIm * zIm;
            //Increase the mandelValue by one, because the iteration is now finished.
            mandelValue += 1;
        }
        //After the mandelValue is found, this colors its pixel accordingly (unsafe code, accesses memory directly):
        //(Unimportant for my question, I doubt the problem is with this because my code becomes really slow
        // as the number of ITERATIONS grow, this only executes more as the number of pixels grow).
        Byte* pos = px + (i * str) + (pixelSize * r);
        byte col = (byte)((1 - ((double)mandelValue / iters)) * 255);
        pos[0] = col;
        pos[1] = col;
        pos[2] = col;
    }
}
What can I do to improve this? Do you find any obvious optimization problems in my code?
Right now there are 2 ways I know I can improve it:
I need to use a different type for numbers. double is limited in accuracy, and I'm sure there are better non-built-in alternative types which are faster (they multiply and add faster) and more accurate; I just need someone to point me to where I should look and tell me if that's true.
I can move processing to the GPU. I have no idea how to do this (OpenGL maybe? DirectX? is it even that simple or will I need to learn a lot of stuff?). If someone can send me links to proper tutorials on this subject or tell me in general about it that would be great.
Thanks a lot for reading that far and hope you can help me :)
If you decide to move the processing to the GPU, you can choose from a number of options. Since you are using C#, XNA will allow you to use HLSL. RB Whitaker has the easiest XNA tutorials if you choose this option. Another option is OpenCL. OpenTK comes with a demo program of a Julia set fractal. This would be very simple to modify to display the Mandelbrot set. See here
Just remember to find the GLSL shader that goes with the source code.
About the GPU, examples are no help for me because I have absolutely
no idea about this topic, how does it even work and what kind of
calculations the GPU can do (or how is it even accessed?)
Different GPU software works differently however ...
Typically a programmer will write a program for the GPU in a shader language such as HLSL, GLSL or OpenCL. The program written in C# will load the shader code and compile it, and then use functions in an API to send a job to the GPU and get the result back afterwards.
Take a look at FX Composer or RenderMonkey if you want some practice with shaders without having to worry about APIs.
If you are using HLSL, the rendering pipeline looks like this.
The vertex shader is responsible for taking points in 3D space and calculating their position in your 2D viewing field. (Not a big concern for you since you are working in 2D)
The pixel shader is responsible for applying shader effects to the pixels after the vertex shader is done.
OpenCL is a different story: it's geared towards general-purpose GPU computing (i.e. not just graphics). It's more powerful and can be used for GPUs, DSPs, and building supercomputers.
WRT coding for the GPU, you can look at Cudafy.Net (it does OpenCL too, which is not tied to NVidia) to start getting an understanding of what's going on and perhaps even do everything you need there. I've quickly found it - and my graphics card - unsuitable for my needs, but for the Mandelbrot at the stage you're at, it should be fine.
In brief: You code for the GPU with a flavour of C (Cuda C or OpenCL normally) then push the "kernel" (your compiled C method) to the GPU followed by any source data, and then invoke that "kernel", often with parameters to say what data to use - or perhaps a few parameters to tell it where to place the results in its memory.
When I've been doing fractal rendering myself, I've avoided drawing to a bitmap for the reasons already outlined and deferred the render phase. Besides that, I tend to write massively multithreaded code which is really bad for trying to access a bitmap. Instead, I write to a common store - most recently I've used a MemoryMappedFile (a builtin .Net class) since that gives me pretty decent random access speed and a huge addressable area. I also tend to write my results to a queue and have another thread deal with committing the data to storage; the compute times of each Mandelbrot pixel will be "ragged" - that is to say that they will not always take the same length of time. As a result, your pixel commit could be the bottleneck for very low iteration counts. Farming it out to another thread means your compute threads are never waiting for storage to complete.
I'm currently playing with the Buddhabrot visualisation of the Mandelbrot set, looking at using a GPU to scale out the rendering (since it's taking a very long time with the CPU) and having a huge result-set. I was thinking of targetting an 8 gigapixel image, but I've come to the realisation that I need to diverge from the constraints of pixels, and possibly away from floating point arithmetic due to precision issues. I'm also going to have to buy some new hardware so I can interact with the GPU differently - different compute jobs will finish at different times (as per my iteration count comment earlier) so I can't just fire batches of threads and wait for them all to complete without potentially wasting a lot of time waiting for one particularly high iteration count out of the whole batch.
Another point that I hardly ever see made about the Mandelbrot Set is that it is symmetrical about the real axis. Whenever your view includes that axis, you might be doing up to twice as much calculating as you need to.
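A minimal sketch of exploiting that symmetry, assuming the view is centred vertically on the real axis (when it is not, only the rows that have a mirror image inside the view can be copied). MandelbrotSymmetry, EscapeTime and RenderMirrored are hypothetical names for the illustration.

```csharp
using System;

public static class MandelbrotSymmetry
{
    // Plain escape-time iteration for c = re + im*i.
    public static int EscapeTime(double re, double im, int maxIters)
    {
        double zRe = 0, zIm = 0, zRe2 = 0, zIm2 = 0;
        int n = 0;
        while (zRe2 + zIm2 < 4 && n < maxIters)
        {
            zIm = 2 * zRe * zIm + im;      // uses the old zRe and zIm
            zRe = zRe2 - zIm2 + re;        // uses the old squares
            zRe2 = zRe * zRe;
            zIm2 = zIm * zIm;
            n++;
        }
        return n;
    }

    // Fills a height x width grid centred on the real axis, computing only the
    // top half and mirroring each value into the bottom half.
    public static int[,] RenderMirrored(int width, int height, double minRe, double maxRe, double maxIm, int maxIters)
    {
        var result = new int[height, width];
        for (int row = 0; row < (height + 1) / 2; row++)
        {
            // Sample at pixel centres so row and (height - 1 - row) have opposite im.
            double im = maxIm - (row + 0.5) * (2 * maxIm / height);
            for (int col = 0; col < width; col++)
            {
                double re = minRe + (col + 0.5) * ((maxRe - minRe) / width);
                int value = EscapeTime(re, im, maxIters);
                result[row, col] = value;
                result[height - 1 - row, col] = value;   // mirror across the real axis
            }
        }
        return result;
    }
}
```

The mirroring works because the escape time of c and of its complex conjugate are the same, so the iteration loop runs for only half of the pixels.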
For moving the processing to the GPU, you have lots of excellent examples here:
https://www.shadertoy.com/results?query=mandelbrot
Note that you need a WebGL-capable browser to view that link. Works best in Chrome.
I'm no expert on fractals, but you seem to have come far already with the optimizations. Going beyond that may make the code much harder to read and maintain, so you should ask yourself whether it is worth it.
One technique I've often observed in other fractal programs is this: While zooming, calculate the fractal at a lower resolution and stretch it to full size during render. Then render at full resolution as soon as zooming stops.
Another suggestion is that when you use multiple threads you should take care that each thread doesn't read or write memory used by other threads, because this will cause cache collisions and hurt performance. One good algorithm could be to split the work up into scanlines (instead of four quarters like you do now). Create a number of threads, then as long as there are lines left to process, assign a scanline to a thread that is available. Let each thread write the pixel data to a local piece of memory and copy this back to the main bitmap after each line (to avoid cache collisions).
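The scanline scheme can be sketched with Parallel.For, which is swapped in here for hand-managed threads because it already hands out loop indices to whichever worker is free. ScanlineRenderer and the shadeRow callback are hypothetical names; the callback stands in for the per-row Mandelbrot loop, and one byte per pixel is assumed for brevity.

```csharp
using System;
using System.Threading.Tasks;

public static class ScanlineRenderer
{
    // Renders 'height' rows of 'width' bytes. 'shadeRow' fills one row buffer
    // for the given row index; rows are handed out to workers dynamically.
    public static byte[] Render(int width, int height, Action<int, byte[]> shadeRow)
    {
        var image = new byte[width * height];
        Parallel.For(0, height,
            () => new byte[width],                 // one private row buffer per worker
            (y, state, rowBuffer) =>
            {
                shadeRow(y, rowBuffer);            // compute into worker-local memory
                // Publish the finished row in one copy; rows never overlap in 'image'.
                Buffer.BlockCopy(rowBuffer, 0, image, y * width, width);
                return rowBuffer;                  // reuse the same buffer for the next row
            },
            rowBuffer => { });                     // nothing to clean up
        return image;
    }
}
```

Because rows deep inside the set take far longer than rows that escape quickly, handing out one scanline at a time keeps all cores busy, whereas fixed quarters can leave three threads idle while the fourth finishes.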