Is it possible to multi thread something that calls GPU?

I have a lighting system in my XNA game that loops through each light and adds it to a final light map, which contains all the lights.
Creating these lights involves many calls on the graphics device, such as using effects/shaders, drawing to render targets, calling GraphicsDevice.Clear to clear the render target, etc.
So my question is: would it be possible to multithread each light? Or is this not possible because there is only one graphics device, and only one thread can use it at a time? If it is possible, would it improve performance?

Basically no. The GraphicsDevice in XNA is single-threaded for rendering. You can send resources (textures, vertex buffers, etc) to the GPU from multiple threads. But you can only call Draw (and other rendering functions like Present) from your main thread.
I have heard of people having success doing rendering-type-things from multiple threads with the appropriate locking in place. But that seems like "bad voodoo". As the linked post says: "the XNA Framework documentation doesn’t make any promises here". Not to mention: even getting the locking right is tricky.
I'm not really sure about making multiple graphics devices - I've not tried it myself. I think that it is possible, but that you can't share resources between the devices - making it fairly useless. Probably not worth the effort.
As jalf mentioned in a comment on your question - once you get to the GPU everything is handled in parallel already. So this would only be useful if you are CPU limited due to hitting the batch limit (because almost everything that isn't your batches can be moved to another thread). And in that case there are many optimisations to consider first to reduce the number of batches - before trying a crazy scheme like this. (And you have measured your performance, right?)
It sounds like what you might be trying to do is render a fairly complicated scene to a render target in the background, spreading the load across many frames. In that case - if performance requirements dictate it - you could perhaps render across multiple frames, on the main thread, scheduling it manually. Don't forget to set RenderTargetUsage.PreserveContents so the render target doesn't get cleared each time you set it on the graphics device.
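As a rough sketch of the "scheduling it manually" idea: the class below queues per-light jobs and runs only a fixed number of them per frame, all on the calling (main) thread. The name FrameBudgetScheduler and the jobs-per-frame budget are assumptions for illustration; it uses no XNA types, so the actual drawing into the render target would live inside each queued job.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of XNA): spreads queued jobs over several frames.
public sealed class FrameBudgetScheduler
{
    private readonly Queue<Action> _pending = new Queue<Action>();
    private readonly int _jobsPerFrame;

    public FrameBudgetScheduler(int jobsPerFrame)
    {
        if (jobsPerFrame <= 0) throw new ArgumentOutOfRangeException(nameof(jobsPerFrame));
        _jobsPerFrame = jobsPerFrame;
    }

    public int PendingCount => _pending.Count;

    // Queue one unit of work, e.g. "draw light i into the light map".
    public void Enqueue(Action job) => _pending.Enqueue(job);

    // Call once per frame from the main thread.
    // Runs at most jobsPerFrame jobs and returns true once the queue is empty.
    public bool RunFrame()
    {
        for (int i = 0; i < _jobsPerFrame && _pending.Count > 0; i++)
            _pending.Dequeue()();
        return _pending.Count == 0;
    }
}
```

When RunFrame returns true, the light map is complete and can be used for the final composite; until then you would keep showing the previous one.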

Related

Threading UI Application

To start, I'm new to threads and have never worked with them before. Since my current project is severely impacted by slow runtimes, I wanted to take a peek into multithreading. This post asks whether it is possible given my current project, and how I would approach it.
First, you must understand that I have a Windows Forms UI with a button that, when clicked, runs an algorithm that generates an image. This is done in 3 steps: extracting patterns (only if the settings have changed), running the algorithm, and displaying the result bitmap.
The average time spent on each part is as follows:
Pattern extraction takes up the biggest chunk, usually 60%+ of the running time.
The Algorithm itself takes up 40% of the running time.
However, if no settings have been changed, simply re-running won't require the recalculation of the patterns and hence it's way faster.
The displaying of the result bitmap, due to the bitmap being rescaled, takes a fixed ~200ms (Which I think can be optimized but IDK how).
The problem I'm having with trying to grasp the threading issue is that the algorithm is based on the patterns extracted from the first step, and the resulting bitmap is dependent on the algorithm.
My algorithm does, however, compute each pixel one by one, so I was wondering if it was possible to, once a single pixel has been calculated, already display it, such that the displaying of the image and the calculation of the others can be done in parallel.
If anything is unclear, please feel free to ask any questions.
Thank you!

"current project is severely impacted by slow runtimes"
I would advise that you start by doing some measurements/profiling before doing anything else. It is not uncommon for programs to waste the vast majority of their time doing useless stuff. Avoiding such unnecessary work can give a much larger performance improvement than multithreading.
The typical method for moving processing-intensive work to a background thread is to use Task.Run, with async/await for processing the result. Note that using a background thread to avoid blocking the UI thread is different from doing the processing in parallel to improve performance, but both methods can be combined if needed.
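A minimal sketch of the Task.Run plus async/await pattern. GeneratePixels is a made-up stand-in for the real algorithm, and the WinForms handler is shown only as a comment, since the form and its controls are not part of the question.

```csharp
using System;
using System.Threading.Tasks;

public static class BackgroundWork
{
    // Hypothetical stand-in for the CPU-heavy image algorithm.
    public static int[] GeneratePixels(int width, int height)
    {
        var pixels = new int[width * height];
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                pixels[y * width + x] = (x ^ y) & 0xFF;
        return pixels;
    }

    // Runs the algorithm on a thread-pool thread so the caller is not blocked.
    public static Task<int[]> GenerateAsync(int width, int height)
    {
        return Task.Run(() => GeneratePixels(width, height));
    }
}

// Usage from a WinForms click handler (illustrative, not compiled here):
//
// private async void generateButton_Click(object sender, EventArgs e)
// {
//     int[] pixels = await BackgroundWork.GenerateAsync(640, 480);
//     // Execution resumes on the UI thread here, so updating controls is safe.
// }
```

This keeps the UI responsive but does not make the algorithm itself any faster; that would need the work to be split across several threads.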
"My algorithm does, however, compute each pixel one by one, so I was wondering if it was possible to, once a single pixel has been calculated, already display it, such that the displaying of the image and the calculation of the others can be done in parallel."
Updating the displayed image for every pixel is probably not the way to go, since that would be rather costly. And you are typically not allowed to touch objects used by the UI from a background thread.
One way to manage things like this would be to have a timer that updates the UI every so often, and a shared buffer for the processed data. Once the update is triggered, a method copies the shared buffer to the displayed bitmap. Without locks this does not guarantee that the latest values are included, but for simply showing progress it might be good enough.
You could also consider splitting the image into individual chunks, so that you can process complete chunks on the background thread and then put them in an output queue for the UI thread to pick up and display. See, for example, channels (System.Threading.Channels).
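The chunk-queue idea could look roughly like this, assuming .NET Core 3.0 or later for System.Threading.Channels and await foreach. Chunk, ChunkPipeline and the placeholder pixel computation are invented for the example; in a real UI the consumer would run on the UI thread and draw each chunk instead of copying it into an array.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical unit of finished work: a band of rows starting at FirstRow.
public sealed class Chunk
{
    public int FirstRow { get; }
    public int[] Pixels { get; }
    public Chunk(int firstRow, int[] pixels) { FirstRow = firstRow; Pixels = pixels; }
}

public static class ChunkPipeline
{
    // Producer: computes the image band by band on a background thread.
    public static Task ProduceAsync(ChannelWriter<Chunk> writer, int width, int height, int rowsPerChunk)
    {
        return Task.Run(() =>
        {
            try
            {
                for (int first = 0; first < height; first += rowsPerChunk)
                {
                    int rows = Math.Min(rowsPerChunk, height - first);
                    var pixels = new int[rows * width];
                    for (int i = 0; i < pixels.Length; i++)
                        pixels[i] = first * width + i; // placeholder for the real per-pixel algorithm
                    writer.TryWrite(new Chunk(first, pixels)); // cannot fail on an open unbounded channel
                }
            }
            finally
            {
                writer.Complete(); // tells the consumer that no more chunks will arrive
            }
        });
    }

    // Consumer: picks up finished chunks and copies them into the full image.
    public static async Task<int[]> ConsumeAsync(ChannelReader<Chunk> reader, int width, int height)
    {
        var image = new int[width * height];
        await foreach (Chunk chunk in reader.ReadAllAsync())
            Array.Copy(chunk.Pixels, 0, image, chunk.FirstRow * width, chunk.Pixels.Length);
        return image;
    }

    public static async Task<int[]> RunAsync(int width, int height, int rowsPerChunk)
    {
        Channel<Chunk> channel = Channel.CreateUnbounded<Chunk>();
        Task producer = ProduceAsync(channel.Writer, width, height, rowsPerChunk);
        int[] image = await ConsumeAsync(channel.Reader, width, height);
        await producer; // surfaces any exception thrown by the producer
        return image;
    }
}
```

Because each chunk is handed over whole, the consumer never sees half-written data, which avoids the "no locks" caveat of the shared-buffer approach.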

Sprite Batching in Non-OpenGL/XNA Environment

I've been working on a game in Visual C# (Not the best platform, I know), and, as would perhaps be expected, it has started to run rather slowly. Running some tests has shown that the main hold up is in drawing images. I've been told that Sprite Batching is a good fix for that.
Problem is, I can't find anything on sprite batching that isn't specific to XNA or OpenGL. I know little to nothing about the process, and I was hoping to get some information on whether such a thing can be implemented using Visual Studio's Visual C#, and (if so) where I can go to learn more about it. If not, are there any other useful methods of speeding the process up a bit? Thanks!

It basically comes down to batching together calls to save on state switches (textures, fill rate) and draw calls (sending a draw call 50,000 times isn't as efficient as sending a single draw call, surprisingly enough). You're going to have to check, when calling the equivalent of a SpriteBatch.Draw(...), the following:
Whether the batch has reached its internal 'max size' (if so, flush it)
Whether the texture has changed (if so, flush your buffer, i.e. draw whatever you have)
Whether SpriteBatch.End(...) has been called (you're done; flush the buffer and draw)
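The three checks above can be sketched without committing to any graphics API. Here a texture is just an int id and the real draw call is a delegate you supply; SimpleSpriteBatch and its members are made up for the example and are not XNA's or MonoGame's SpriteBatch.

```csharp
using System;
using System.Collections.Generic;

// Minimal batching sketch: collects sprites and submits them in groups.
public sealed class SimpleSpriteBatch
{
    private readonly int _maxSprites;
    private readonly Action<int, IReadOnlyList<(float X, float Y)>> _submit;
    private readonly List<(float X, float Y)> _buffer = new List<(float X, float Y)>();
    private int _currentTexture = -1;

    // submit stands in for the single real draw call issued per batch.
    public SimpleSpriteBatch(int maxSprites, Action<int, IReadOnlyList<(float X, float Y)>> submit)
    {
        if (maxSprites <= 0) throw new ArgumentOutOfRangeException(nameof(maxSprites));
        _maxSprites = maxSprites;
        _submit = submit ?? throw new ArgumentNullException(nameof(submit));
    }

    public void Draw(int textureId, float x, float y)
    {
        // Flush when the texture switches or the batch is full.
        bool textureChanged = textureId != _currentTexture;
        bool full = _buffer.Count >= _maxSprites;
        if (_buffer.Count > 0 && (textureChanged || full))
            Flush();

        _currentTexture = textureId;
        _buffer.Add((x, y));
    }

    // End of the frame's sprite drawing: submit whatever is left.
    public void End() => Flush();

    private void Flush()
    {
        if (_buffer.Count == 0) return;
        _submit(_currentTexture, _buffer.ToArray()); // one "draw call" for the whole batch
        _buffer.Clear();
    }
}
```

Sorting sprites by texture before calling Draw makes texture switches, and therefore flushes, much rarer.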
If you're still having trouble, feel free to check out MonoGame's implementation.
Edit: found a great gamedev question about this.

C# Loading screen threading loading and animation

I'm making a loading screen for a game in c#. Do I need to create a thread for drawing the spinning animation as well as a thread for loading the level?
I'm a bit confused as to how it works. I've spent quite a few hours messing with it to no avail. Any help would be appreciated.

Short of anything that XNA may provide for you, anytime you require doing multiple units of work at once, multiple threads are usually required - and almost certainly if you want to benefit from multiple CPUs. Depending upon exactly what you're looking to do, you're already in one thread (for your main method / program execution) - so you don't likely need to create 2 additional threads - but just one additional for either the loading of your level, or for the animation.
Alternatively, as was probably more common-place in older development when developers weren't concerned with multi-core CPUs, etc., you could use tricks such as doing both the level loading and the animation in the same thread - but at the expense of additional complexity for combining both concerns into the same unit of processing. (In every x # of lines of processing for loading the level, add code to update the loading animation.) However, given today's technology, you are almost certainly better off using multiple threads for this.
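A small sketch of the "one additional thread" setup: the level loads on a worker thread via Task.Run while the calling thread keeps drawing spinner frames until the load finishes. LoadLevel, the sleep durations and the drawSpinnerFrame callback are placeholders; in a real game the loop would be your normal update/draw loop rather than a while loop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LoadingScreen
{
    // Hypothetical stand-in for slow level loading.
    public static string LoadLevel()
    {
        Thread.Sleep(50);
        return "level-1";
    }

    // Loads on a worker thread while the calling thread animates.
    public static (string Level, int FramesDrawn) RunWithSpinner(Func<string> load, Action<int> drawSpinnerFrame)
    {
        Task<string> loading = Task.Run(load);
        int frame = 0;
        while (!loading.IsCompleted)
        {
            drawSpinnerFrame(frame++);
            Thread.Sleep(5); // stand-in for waiting on the next frame
        }
        // Result rethrows (wrapped in AggregateException) if loading failed.
        return (loading.Result, frame);
    }
}
```

Only the main thread touches the drawing code here; the worker thread does nothing but load, which sidesteps most locking problems.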

Loading takes time because it involves long-running work, and long-running work is usually done on a different thread so that the program won't freeze.
So the answer is yes.

How to quickly generate images with .NET

I've become rather familiar with the System.Drawing namespace in terms of knowing the basic steps of how to generate an image, draw on it, write text on it, etc. However, it's so darn slow for anything approaching print-quality. I have seen some suggestions using COM to talk to native Windows GDI to do this quicker but I was wondering if there were any optimizations I could make to allow for high speed, high quality image generation. I have tried playing with the anti-aliasing options and such immediately available to the Graphics, Bitmap and Image objects but are there any other techniques I can employ to do this high speed?
Writing this, I just had the thought to use the task library in .NET 4 to do MORE work in parallel, even though each generation task wouldn't be any quicker.
Anyway, thoughts and comments appreciated.
Thanks!

If you want raw speed, the best option is to use DirectX. The next best is probably to use GDI from C++ and provide a managed interface to call it. Then probably to p/invoke to GDI directly from C#, and last to use GDI+ in C#. But depending on what you're doing you may not see a huge difference. Multithreading may not help you if you're limited by the speed at which the graphics card can be driven by GDI+, but could be beneficial if you're processor bound while working out "what to draw". If you're printing many images in sequence, you may gain by running precalculation, rendering, and printing "phases" on separate threads.
However, there are a number of things you can do to optimise redraw speed, both optimisations and compromises, and these approaches will apply to any rendering system you choose. Indeed, most of these stem from the same principles used when optimising code.
How can you minimise the amount that you draw?
Firstly, eliminate unnecessary work. Think about each element you are drawing - is it really necessary? Often a simpler design can actually look better while saving a lot of rendering effort. Consider whether a gradient fill can be replaced by a flat fill, or whether a rounded rectangle will look acceptable as a plain rectangle (and test whether doing this even provides any speed benefit on your hardware before throwing it away!)
Challenge your assumptions - e.g. the "high resolution" requirement - often if you're printing on something like a dye-sub printer (which is a process that introduces a bit of colour bleed) or a CMYK printer that uses any form of dithering to mix colours (which has a much lower practical resolution than the dot pitch the printer can resolve), a relatively low resolution anti-aliased image can often produce just as good a result as a super-high-res one. If you're outputting to a 2400dpi black and white printer, you may still find that 1200dpi or even 600dpi is acceptable (you get ever decreasing returns as you increase the resolution, and most people won't notice the difference between 600dpi and 2400dpi). Just print some typical examples out using different source resolutions to see what the results are like. If you can halve the resolution you could potentially render as much as 4x faster.
Generally try to avoid overdrawing the same area - If you want to draw a black frame around a region, you could draw a white rectangle inside a black rectangle, but this means filling all the pixels in the middle twice. You may improve the performance by drawing 4 black rectangles around the outside to draw the frame exactly. Conversely, if you have a lot of drawing primitives, can you reduce the number of primitives you're drawing? e.g. If you are drawing a lot of stripes, you can draw alternating rectangles of different colours (= 2n rectangles), or you can fill the entire background with one colour and then only draw the rectangles for the second colour (= n+1 rectangles). Reducing the number of individual calls to GDI+ methods can often provide significant gains, especially if you have fast graphics hardware.
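To make the stripe arithmetic concrete, here is a sketch that builds both rectangle lists for n pairs of stripes and rasterises them into a one-row colour buffer so the results can be compared. The Stripes class and its tuple layout are invented for the example; no drawing API is involved.

```csharp
using System;
using System.Collections.Generic;

public static class Stripes
{
    // Alternating rectangles in two colours: 2n rectangles, no overdraw.
    public static List<(int X, int Width, int Colour)> Alternating(int pairs, int stripeWidth)
    {
        var rects = new List<(int X, int Width, int Colour)>();
        for (int i = 0; i < 2 * pairs; i++)
            rects.Add((i * stripeWidth, stripeWidth, i % 2));
        return rects;
    }

    // One background fill plus the second colour only: n + 1 rectangles, some overdraw.
    public static List<(int X, int Width, int Colour)> BackgroundFirst(int pairs, int stripeWidth)
    {
        var rects = new List<(int X, int Width, int Colour)> { (0, 2 * pairs * stripeWidth, 0) };
        for (int i = 0; i < pairs; i++)
            rects.Add(((2 * i + 1) * stripeWidth, stripeWidth, 1));
        return rects;
    }

    // Paints the rectangles in order into a single row of colour indices.
    public static int[] Rasterise(List<(int X, int Width, int Colour)> rects, int totalWidth)
    {
        var row = new int[totalWidth];
        foreach (var r in rects)
            for (int x = r.X; x < r.X + r.Width; x++)
                row[x] = r.Colour;
        return row;
    }
}
```

Which variant is faster depends on whether the per-call overhead or the number of pixels filled dominates, so it is worth measuring both.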
If you draw any portion of the image more than once, consider caching it (render it into a bitmap and then blit it to your final image when needed). The more complex this sub-image is, the more likely it is that caching it will pay off. For example, if you have a repeating pattern like a ruler, don't draw every ruler marking as a separate line - render a repeating section of the ruler (e.g. 10 lines or 50) and then blit this prerendered section only a few times to draw the final ruler.
Similarly, avoid doing lots of unnecessary work, like many MeasureString calls for values that could be precalculated once or even approximated. Or, if you're stepping through a lot of Y values, try to do it by adding an offset on each iteration rather than recalculating the absolute position using multiples every time.
Try to "batch" drawing to minimise the number of state changes and/or drawing method calls that are necessary - e.g. draw all the elements that are in one colour/texture/brush before you move on to the next colour. Use "batch" rendering calls (e.g. Draw a polyline primitive once rather than calling DrawLine 100 times).
If you're doing any per-pixel operations, then it's usually a lot faster to grab the raw image buffer and manipulate it directly than to call GetPixel/SetPixel methods.
And as you've already mentioned, you can turn off expensive operations such as anti-aliasing and font smoothing that won't be of any/much benefit in your specific case.
And of course, look at the code you're rendering with - profile it and apply the usual optimisations to help it flow efficiently.
Lastly, there comes a point where you should consider whether a hardware upgrade might be a cheap and effective solution - if you have a slow PC and a low end graphics card there may be a significant gain to be had by just buying a new PC with a better graphics card in it. Or if the images are huge, you may find a couple of GB more RAM eliminates virtual memory paging overheads. It may sound expensive, but there comes a point where the cost/benefit of new hardware is better than ploughing more money into additional work on optimisations (and their ever decreasing returns).

I have a few ideas:
1. Look at the code in Paint.NET, a paint program written in C# whose earlier versions were released with source code. It could give you some good ideas. You could certainly do this in conjunction with ideas 2 and 3.
2. If the jobs need to be done in an "immediate" way, you could use something asynchronous to create the images. Depending on the scope of the overall application, you might even use something like NServiceBus to queue an image processing component with the task. Once the task is completed, the sending component would receive a notification via subscribing to the message published upon completion.
3. The task-based solution is good for delayed processing. You could batch the creation of the images and use either the Task approach or something called Quartz.net (http://quartznet.sourceforge.net). It's an open source job scheduler that I use for all my time-based jobs.

You can create a new Bitmap image and do LockBits(...), specifying the pixel format you want. Once you have the bits, if you want to use unmanaged code to draw into it, bring in the library, pin the data in memory, and use that library against it. I think you can use GDI+ against raw pixel data, but I was under the impression that System.Drawing is already a thin layer on top of GDI+. Anyways, whether or not I'm wrong about that, with LockBits, you have direct access to pixel data, which can be as fast or slow as you program it.
Once you're done with drawing, you can call UnlockBits and voilà, you have the new image.
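A rough sketch of the LockBits / UnlockBits round trip using System.Drawing (on modern .NET this assumes Windows and the System.Drawing.Common package). RawPixels, FillGradient and the grey gradient are made up for illustration; the buffer layout assumed is 32bpp ARGB, which is stored in memory as B, G, R, A.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

public static class RawPixels
{
    // Pure buffer work: writes a horizontal grey gradient into a 32bpp BGRA buffer.
    public static void FillGradient(byte[] buffer, int width, int height, int stride)
    {
        for (int y = 0; y < height; y++)
        {
            int row = y * stride;
            for (int x = 0; x < width; x++)
            {
                byte v = (byte)(255 * x / Math.Max(1, width - 1));
                int i = row + x * 4;
                buffer[i] = v;       // blue
                buffer[i + 1] = v;   // green
                buffer[i + 2] = v;   // red
                buffer[i + 3] = 255; // alpha (opaque)
            }
        }
    }

    // Locks a new bitmap, writes the managed buffer into it, and unlocks it.
    public static Bitmap CreateGradient(int width, int height)
    {
        var bitmap = new Bitmap(width, height, PixelFormat.Format32bppArgb);
        BitmapData data = bitmap.LockBits(
            new Rectangle(0, 0, width, height),
            ImageLockMode.WriteOnly,
            PixelFormat.Format32bppArgb);
        try
        {
            // Stride is the byte length of one row, including any padding.
            var buffer = new byte[data.Stride * height];
            FillGradient(buffer, width, height, data.Stride);
            Marshal.Copy(buffer, 0, data.Scan0, buffer.Length);
        }
        finally
        {
            bitmap.UnlockBits(data);
        }
        return bitmap;
    }
}
```

Copying through a managed array keeps the example free of unsafe code; writing through a pointer to Scan0 avoids the extra copy if you are willing to compile with /unsafe.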

low performance when an image with high resolution loaded

I am developing a utility that behaves like Adobe Photoshop. The user can draw rectangles, circles, etc. with the mouse pointer and then move or resize them. To support this, each shape is an object stored in a generic collection in a container object. When the user wants to change anything, I work out where they clicked, select the target object behind the scenes, and so on.
This approach has a problem when there are a lot of objects on screen or the user loads a high-resolution picture.
What's your opinion?
How can I solve it?

This makes sense because the larger the data set, the more RAM and CPU will be required to handle it.
While performance issues are important to solve, a lot of it may be perceived performance, caused by something like a threading issue - where one thread is trying to process the information and blocks the UI thread, which makes it look like the system is freezing.
There is a lot of information on StackOverflow that you may want to look at:
C# Performance Optimization
C# Performance Best Practices
C# Performance Multi threading
C# Performance Collections (Since you said you were using a collection)

Use a profiler such as dotTrace and find out which method is the one most called and the one that takes the most amount of time to process. Those are the ones you want to try to optimize. Other than that, you may have to go down to the GPU to try to speed things up.

For this kind of problem, think about the Parallel Extensions:
http://msdn.microsoft.com/en-us/concurrency/default.aspx
The more CPU cores you have, the faster work that can be split up in parallel will run.
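As a small illustration of the Parallel Extensions, the sketch below uses Parallel.For to process an 8-bit greyscale image one row per iteration. The inversion operation and the ParallelRows name are just examples; the pattern only works this simply because each row is independent of the others.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelRows
{
    // Inverts an 8-bit greyscale image in place, one row per parallel iteration.
    public static void Invert(byte[] pixels, int width, int height)
    {
        Parallel.For(0, height, y =>
        {
            int start = y * width;
            for (int x = 0; x < width; x++)
                pixels[start + x] = (byte)(255 - pixels[start + x]);
        });
    }
}
```

For small images the scheduling overhead can outweigh the gain, so measure before and after.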

The thing is that at high resolution the computer needs to use the processor more, which is why this happens. Remember that this also occurs in GIMP, and even in Adobe Photoshop.
Regards.

Look into using a performance analysis tool (such as ANTS Profiler) to help you pinpoint exactly where the bottlenecks are occurring. True graphical computations on a high-res photo require a lot of resources, but I would assume the logic you are using to manage and find your objects requires some tuning as well.

A high-resolution image takes up a lot of memory (more pixels). As such, any operation that you do to it means more bits to manipulate.
Does your program utilise "layers"?
If not, then I'm guessing you are adding components directly to the image - which means each operation has to manipulate the bits. So if you aren't using layers, then you should definitely start. Layers will allow you to draw operations to the screen but only merge them into the base high-resolution image once - when you save!
What library from Windows are you using to open the image?
If you are using System.Drawing then you are actually using GDI+, as it is a wrapper on top of it. GDI+ is nice for a lot of things because it simplifies tons of operations; however, it isn't the fastest in the world. For example, using the [Get|Set]Pixel methods is MUCH slower than working directly on the BitmapData. As there are tons of articles on speeding up operations on top of GDI+, I will not reiterate them here.
I hope the information I've provided answers some of your questions and prompts new ones. Good luck!
