Sprite Batching in a Non-OpenGL/XNA Environment - C#

I've been working on a game in Visual C# (not the best platform, I know), and, as would perhaps be expected, it has started to run rather slowly. Running some tests has shown that the main hold-up is in drawing images. I've been told that sprite batching is a good fix for that.
Problem is, I can't find anything on sprite batching that isn't specific to XNA or OpenGL. I know little to nothing about the process, and I was hoping to get some information on whether such a thing can be implemented using Visual Studio's Visual C#, and (if so) where I can go to learn more about it. If not, are there any other useful methods of speeding the process up a bit? Thanks!

It basically comes down to batching calls together to save on state switches (textures, fill rate) and draw calls (sending a draw call 50,000 times isn't as efficient as sending a single draw call, surprisingly enough). When calling the equivalent of a SpriteBatch.Draw(...), you're going to have to check the following (a rough sketch follows the list):
An internal 'max size' of your batch
If the texture switches, flush your buffer (i.e. draw whatever you have)
If SpriteBatch.End(...) has been called (you're done; flush the buffer and draw)
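To make that bookkeeping concrete, here is a minimal sketch of a batcher. It is not any particular API: ITexture, SpriteInfo and IQuadRenderer are hypothetical stand-ins for whatever your engine uses to describe textures, sprites, and the actual "draw a pile of quads in one call" step.

using System.Collections.Generic;
using System.Drawing;

// Hypothetical stand-ins for whatever your engine provides.
public interface ITexture { }

public struct SpriteInfo
{
    public ITexture Texture;
    public RectangleF Destination;
}

public interface IQuadRenderer
{
    // Draws every sprite in the list with one underlying draw call.
    void DrawQuads(ITexture texture, IList<SpriteInfo> sprites);
}

public class SimpleSpriteBatch
{
    private const int MaxBatchSize = 2048;                 // internal 'max size'
    private readonly List<SpriteInfo> buffer = new List<SpriteInfo>();
    private readonly IQuadRenderer renderer;
    private ITexture currentTexture;

    public SimpleSpriteBatch(IQuadRenderer renderer)
    {
        this.renderer = renderer;
    }

    public void Draw(ITexture texture, RectangleF destination)
    {
        // Texture switch or full buffer -> flush what we have so far.
        if ((currentTexture != null && texture != currentTexture) ||
            buffer.Count >= MaxBatchSize)
        {
            Flush();
        }

        currentTexture = texture;
        buffer.Add(new SpriteInfo { Texture = texture, Destination = destination });
    }

    public void End()
    {
        // Done for this frame: draw whatever is left in the buffer.
        Flush();
    }

    private void Flush()
    {
        if (buffer.Count == 0)
            return;

        // One call for a whole run of same-texture sprites,
        // instead of one draw call per sprite.
        renderer.DrawQuads(currentTexture, buffer);
        buffer.Clear();
        currentTexture = null;
    }
}

The important part is that Draw() only records work; the expensive call to the renderer happens once per texture run, when the buffer fills, or at End().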
If you're still having trouble, feel free to check out MonoGame's implementation.
Edit: found a great gamedev question about this.

Related

A few Questions about implementing a render loop in Windows Forms

Guten Tag zusammen! (German: Good Day everyone!)
Please excuse my English, it is not my mother tongue. Since I have already found a lot of good answers to my questions here I would like now to ask you some.
In the last few days I have done a little research into the question "What is the best way to implement a game loop in Windows Forms?", and I have found a well-explained solution from the SlimDX team, which is based on the work of Tom Miller.
My Questions are:
First: When I use this solution for a game loop, what is the best way to redraw the Form after I have rendered a frame? A way I have often found is to call Invalidate(), but this does not look like a good idea to me. Wouldn't this add a message to the message queue and break the while-loop every frame?
Second: To my understanding this loop will consume an entire CPU thread (core). Is there a good way to slow it down to a target frame rate that does not consume an entire CPU thread?
Third: Is GDI+ capable of rendering a simple 2D game? When does drawing become so complex that it is advisable to use hardware-accelerated drawing with a DirectX or OpenGL wrapper?
A way I have often found is to call Invalidate(), but this does not look like a good idea to me. Wouldn't this add a message to the message queue and break the while-loop every frame?
Yes, that's a bad idea. You don't control the frequency at which the messages are delivered to your window, which means you shouldn't rely on it.
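For reference, here is a minimal sketch of the idle-based loop the question refers to (in the spirit of the Tom Miller / SlimDX approach): keep rendering from Application.Idle while the message queue is empty, and leave Invalidate() out of it entirely. The PeekMessage declaration is the standard one; Render() stands in for your own frame code.

using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

// Minimal idle-based render loop in the spirit of Tom Miller's / SlimDX's MessagePump.
static class GameLoop
{
    [StructLayout(LayoutKind.Sequential)]
    private struct NativeMessage
    {
        public IntPtr hWnd;
        public uint msg;
        public IntPtr wParam;
        public IntPtr lParam;
        public uint time;
        public System.Drawing.Point p;
    }

    [DllImport("user32.dll")]
    private static extern bool PeekMessage(out NativeMessage message, IntPtr hWnd,
                                           uint filterMin, uint filterMax, uint flags);

    // True while the message queue is empty, i.e. we are free to render.
    private static bool AppStillIdle
    {
        get
        {
            NativeMessage msg;
            return !PeekMessage(out msg, IntPtr.Zero, 0, 0, 0);
        }
    }

    public static void Run(Form form, Action render)
    {
        Application.Idle += (sender, e) =>
        {
            // Keep rendering until a Windows message arrives,
            // then let the normal message loop handle it.
            while (AppStillIdle)
                render();
        };
        Application.Run(form);
    }
}

You would then call GameLoop.Run(myForm, RenderFrame) in place of a plain Application.Run(myForm).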
To my understanding this loop will consume an entire CPU thread (core). Is there a good way to slow it down to a target frame rate that does not consume an entire CPU thread?
You can use the SpinWait structure (NOT Thread.SpinWait) to wait for short amounts of time. Base the waiting time on the duration of the current frame and the desired frame rate.
But usually the game loop taking an entire core isn't a problem. It may even be desired if you want to maximize your framerate.
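As a rough illustration of the SpinWait approach (the class and method names here are just placeholders), a frame limiter can look like this:

using System;
using System.Diagnostics;
using System.Threading;

// Rough sketch of a frame limiter: spin-wait until the target frame time has passed.
class FrameLimiter
{
    private readonly TimeSpan targetFrameTime;
    private readonly Stopwatch frameTimer = Stopwatch.StartNew();

    public FrameLimiter(double targetFps)
    {
        targetFrameTime = TimeSpan.FromSeconds(1.0 / targetFps);
    }

    // Call once at the end of each frame.
    public void WaitForNextFrame()
    {
        var spinner = new SpinWait();
        while (frameTimer.Elapsed < targetFrameTime)
        {
            // SpinOnce backs off gradually (and eventually yields),
            // so this is cheaper than a pure busy loop.
            spinner.SpinOnce();
        }
        frameTimer.Restart();
    }
}

Calling WaitForNextFrame() at the end of each pass through the loop caps it at roughly the target rate; SpinWait is more precise for these short waits than Thread.Sleep.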
Is GDI+ capable of rendering a simple 2D game?
Yes, although it would have to be a really simple game. GDI+ is slow. I advise you to go with a hardware-accelerated solution right from the start; that way you won't have to rewrite everything if GDI+ proves to be a bottleneck.

Is it possible to multi thread something that calls GPU?

I have a lighting system in my XNA game that loops through each light and adds them to a final light map, which contains all the lights.
The process of creating these lights involves many functions that have to do with the graphics device, such as using effects/shaders, drawing to render targets, using GraphicsDevice.Clear to clear the render target, etc.
So my question is, would it be possible to multi thread each light? Or would this not be possible because there is only 1 graphics device, and only 1 thread can use it at a time? If it is possible, would it improve performance?
Basically no. The GraphicsDevice in XNA is single-threaded for rendering. You can send resources (textures, vertex buffers, etc) to the GPU from multiple threads. But you can only call Draw (and other rendering functions like Present) from your main thread.
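For the part that is allowed, a rough sketch (assuming XNA 4; the file path and class names are illustrative) of creating a texture on a worker thread while the main thread keeps drawing:

using System.IO;
using System.Threading;
using Microsoft.Xna.Framework.Graphics;

// Sketch: create a texture on a worker thread, but only *use* it from the
// main thread once the worker has finished.
public class BackgroundTextureLoader
{
    private readonly GraphicsDevice device;
    private volatile Texture2D loadedTexture;   // null until the worker finishes

    public BackgroundTextureLoader(GraphicsDevice device)
    {
        this.device = device;
    }

    public void StartLoading(string path)       // 'path' is illustrative
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            using (var stream = File.OpenRead(path))
            {
                // Resource creation is allowed off the main thread in XNA 4;
                // Draw/Present still belong to the main thread only.
                loadedTexture = Texture2D.FromStream(device, stream);
            }
        });
    }

    // Called from the main thread inside Draw(); returns null until ready.
    public Texture2D TryGetTexture()
    {
        return loadedTexture;
    }
}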
I have heard of people having success doing rendering-type-things from multiple threads with the appropriate locking in place. But that seems like "bad voodoo". As the linked post says: "the XNA Framework documentation doesn’t make any promises here". Not to mention: even getting the locking right is tricky.
I'm not really sure about making multiple graphics devices - I've not tried it myself. I think that it is possible, but that you can't share resources between the devices - making it fairly useless. Probably not worth the effort.
As jalf mentioned in a comment on your question - once you get to the GPU everything is handled in parallel already. So this would only be useful if you are CPU limited due to hitting the batch limit (because almost everything that isn't your batches can be moved to another thread). And in that case there are many optimisations to consider first to reduce the number of batches - before trying a crazy scheme like this. (And you have measured your performance, right?)
It sounds like what you might be trying to do is render a fairly complicated scene to a render target in the background, and spreading the load across many frames. In that case - if performance requirements dictate it - you could perhaps render across multiple frames, on the main thread, scheduling it manually. Don't forget to set RenderTargetUsage.PreserveContents so it doesn't get cleared each time you put it on the graphics device.
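A very rough sketch of that "spread it over several frames" idea, assuming XNA 4's RenderTarget2D; the per-light drawing is passed in as delegates because the details of your lighting code aren't shown here:

using System;
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Graphics;

// Sketch: accumulate a few lights per frame into a persistent light map
// instead of rendering every light every frame.
public class IncrementalLightMap
{
    private readonly GraphicsDevice device;
    private readonly RenderTarget2D lightMap;
    private int nextLight;

    public IncrementalLightMap(GraphicsDevice device, int width, int height)
    {
        this.device = device;
        // PreserveContents keeps what was drawn in previous frames.
        lightMap = new RenderTarget2D(device, width, height, false,
                                      SurfaceFormat.Color, DepthFormat.None, 0,
                                      RenderTargetUsage.PreserveContents);
    }

    // Call once per frame from the main thread. Each Action draws one light
    // (your existing effect/render-target code goes inside those actions).
    public void Update(Action[] drawLight, int lightsPerFrame)
    {
        device.SetRenderTarget(lightMap);

        if (nextLight == 0)
            device.Clear(Color.Black);   // start a fresh accumulation pass

        int end = Math.Min(nextLight + lightsPerFrame, drawLight.Length);
        for (int i = nextLight; i < end; i++)
            drawLight[i]();

        nextLight = (end >= drawLight.Length) ? 0 : end;
        device.SetRenderTarget(null);
    }

    // Use this as a texture in the rest of your pipeline.
    public Texture2D LightMap { get { return lightMap; } }
}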

Swapchain.Present() taking far too long, causing lag

I've recently been getting a bit of lag since I moved all of my C# SlimDX DX11 rendering code from my Form (yes, I'm a lazy developer) to bespoke classes. I whacked my program into EQATEC Profiler, and its output pointed at postRender() as the major contributor to my lag.
It's clear that whatever's in postRender() is really hogging the precious milliseconds. In fact, whatever crazy, convoluted code I have in there is effectively reducing my frame rate to ~15 FPS on its own.
So what's in postRender()? Just one line of code:
swapChain.Present(0, PresentFlags.None);
I just have no idea what's caused it to slow down so much; I've not made any changes to the swapchain code at all. All I've altered is the screen resolution (1680x1050), but that should be absolutely fine (for reference, this machine can run Crysis 2 at maximum settings at that resolution without breaking a sweat).
Does anybody have any idea what might cause a swapchain to take so long on presenting or where I should look next for problems?
EDIT:
Looking at the structure of my code, my RenderFrame() function is as follows:
preRender();
DeferredRender(preShader);
//Composite scene to output image
CompositeScene(compositeShader);
//Post Process
PostProcess(postProcShader);
//Depth of Field
DoF(dofShader);
//Present the swapchain
postRender();
The results of some of these functions depend on the ones before them (for example, DeferredRender uses four render targets to capture Diffuse lighting, Normals, Positions and Color in a per-pixel manner, and CompositeScene then puts them all together). This requires the GPU to have computed the previous step before it can continue. The whole process carries on like this, with DoF requiring the results of PostProcess, etc. Therefore the only shader that could possibly be holding SwapChain.Present() up must be the one that runs in the DoF function, as all the other shaders cause the CPU to lock until they're finished. Correct?
There are a few reasons why you might find Present() taking up that much time in your frame. The Present call is the primary method of synchronization between the CPU and the GPU; if the CPU is generating frames much faster than the GPU can process them, they'll get queued up. Once the buffer gets full, Present() turns into a glorified Sleep() call while it waits for the queue to empty out.
Of course, it's pretty much impossible to say with the little information that you've given here. Just because a machine runs Crysis doesn't mean you can get away with throwing anything you want at the card. Double check that you're not expecting to render crazy amounts of geometry and that your shaders aren't abnormally long and complex.
Also take a look at your app using one of the available GPU profilers; PIX is good as a base point, while NVIDIA and AMD each have their own more specific offerings for their own products. Finally, make sure your drivers are updated. If there's a bug in the driver, any chance you have at reasoning about the issue goes out the window.
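One quick way to see whether the CPU is simply outrunning the GPU is to time each stage on the CPU, including Present. Because D3D calls are asynchronous, a stage that submits a lot of GPU work can still return quickly; if nearly all of the measured time lands in Present, the work is queued up on the GPU side rather than in Present itself. A rough sketch, reusing the stage names from the RenderFrame() above:

using System;
using System.Diagnostics;

// Sketch: CPU-side timing of each stage of the frame.
static class FrameTimer
{
    private static readonly Stopwatch timer = new Stopwatch();

    public static void Time(string stageName, Action stage)
    {
        timer.Restart();
        stage();
        Debug.WriteLine(stageName + ": " + timer.Elapsed.TotalMilliseconds.ToString("F2") + " ms");
    }
}

// Usage inside RenderFrame(), wrapping the existing calls:
//   FrameTimer.Time("DeferredRender", () => DeferredRender(preShader));
//   FrameTimer.Time("CompositeScene", () => CompositeScene(compositeShader));
//   FrameTimer.Time("PostProcess",    () => PostProcess(postProcShader));
//   FrameTimer.Time("DoF",            () => DoF(dofShader));
//   FrameTimer.Time("Present",        () => postRender());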

Trying to determine FPS in a C# WPF program

I'm writing an application for a touch table using WPF and C#. (And, I'm not terribly familiar with WPF. At all.) We suspect we're not getting the framerate we're "requesting" from the API so I'm trying to write my own FPS calculator. I'm aware of the math to calculate it based on the internal clock, I just don't know how to access a timer within the Windows/WPF API.
What library/commands do I need to get access to a timer?
Although you could use a DispatcherTimer (which marshals its ticks onto the UI thread, causing relativity problems), or a System.Threading.Timer (which might throw an exception if you try to touch any UI controls), I'd recommend you just use the WPF profiling tools :)
I think you're looking for the Stopwatch class. Just initialize it and reset it at the start of each iteration. At the end of an iteration, do your calculation.
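A minimal sketch of that idea (the class name is illustrative):

using System.Diagnostics;

// Sketch: derive FPS from the time between successive iterations.
public class FpsCounter
{
    private readonly Stopwatch stopwatch = Stopwatch.StartNew();

    // Call once per iteration/frame; returns the instantaneous FPS.
    public double Tick()
    {
        double seconds = stopwatch.Elapsed.TotalSeconds;
        stopwatch.Restart();
        return seconds > 0 ? 1.0 / seconds : 0.0;
    }
}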
First of all, are you aware that Microsoft provides a free diagnostic tool that will tell you the frame rate at which WPF is updating the screen? I guess if you're not convinced you're getting the framerate you're asking for, then perhaps you might not trust it, but I've found it to be a reliable tool. It's called Perforator, and it's part of the WPF Performance Suite, which you can get by following the instructions here: http://msdn.microsoft.com/library/aa969767
That's probably simpler than writing your own.
Also, how exactly are you "requesting" a frame rate? What API are you using? Are you using the Timeline's DesiredFrameRate property? If so, this is more commonly used to reduce the frame rate than increase it. (The docs also talk about increasing the frame rate to avoid tearing, but that doesn't really make sense - tearing is caused by presenting frames out of sync with the monitor, and isn't an artifact of slow frame rates. In any case, on Vista or Windows 7, you won't get tearing with the DWM enabled.) It's only a hint, and WPF does not promise to match the suggested frame rate.
As for the measurement technique, there are a number of ways you could go. If you're just trying to work out whether the frame rate is in the right ballpark, you could just increment a counter once per frame (which you'd typically do in an event handler for CompositionTarget.Rendering), and set up a DispatcherTimer to fire once a second, and have it show the value in the UI, and then reset the counter. It'll be somewhat rough and ready as DispatcherTimer isn't totally accurate, but it'll show you whether you've got 15fps when you were expecting 30fps, for example.
If you're trying to get a more precise view (e.g., you want to try to work out whether frames are being rendered constantly, or if you seem to be getting lost frames from time to time), then that gets a bit more complex. But I'll wait to see if Perforator does the trick for you before making more suggestions.
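For reference, a rough sketch of that counter approach; it assumes a WPF window whose XAML contains a TextBlock named fpsText (that name, and the window class, are illustrative):

using System;
using System.Windows;
using System.Windows.Media;
using System.Windows.Threading;

// Sketch: count CompositionTarget.Rendering events and show the total once a second.
public partial class MainWindow : Window
{
    private int frameCount;

    public MainWindow()
    {
        InitializeComponent();

        CompositionTarget.Rendering += (s, e) => frameCount++;

        var timer = new DispatcherTimer { Interval = TimeSpan.FromSeconds(1) };
        timer.Tick += (s, e) =>
        {
            fpsText.Text = frameCount + " fps";   // 'fpsText' is a TextBlock in the XAML
            frameCount = 0;                       // reset for the next second
        };
        timer.Start();
    }
}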
You want to either wrap the win32 timing calls you'd normally call (such as QueryPerformanceCounter), by using p/Invoke, or use something in .NET that already wraps them.
You could use DateTime.Now.Ticks, but it's probably not high enough resolution. The Stopwatch class uses QueryPerformanceCounter under the covers.
If you want something that's reusable on a lot of systems, rather than a simple diagnostic, be warned about processor related issues w/ QPC and Stopwatch. See this question: Can the .NET Stopwatch class be THIS terrible?

low performance when an image with high resolution loaded

I'm developing a utility that behaves like Adobe Photoshop. The user can draw a rectangle, circle, etc. with the mouse pointer and then move or resize it. For this, I treat each shape as an object stored in a generic collection inside a container object. When the user wants to change anything, I work out where they clicked and, behind the scenes, select the target object, and so on.
This approach has a problem when there are a lot of objects on screen or the user loads a high-resolution picture.
What's your opinion?
How can I solve it?
This makes sense because the larger the data set, the more RAM and CPU will be required to handle it.
While performance issues are important to solve, a lot of it may be perceived performance, e.g. a threading issue where one thread is processing the information and blocking the UI thread, which makes it look like the system is freezing.
There is a lot of information on StackOverflow that you may want to look at
C# Performance Optimization
C# Performance Best Practices
C# Performance Multi threading
C# Performance Collections (Since you said you were using a collection)
Use a profiler such as dotTrace and find out which method is the one most called and the one that takes the most amount of time to process. Those are the ones you want to try to optimize. Other than that, you may have to go down to the GPU to try to speed things up.
For this kind of problem, think about the Parallel Extensions:
http://msdn.microsoft.com/en-us/concurrency/default.aspx
The more CPU cores you have, the faster your program can run.
The thing is that at high resolution the computer needs to use the processor more, which is why this happens; remember that this also occurs in GIMP and even in Adobe Photoshop.
Regards.
Look into using a performance analysis tool (such as ANTS Profiler) to help you pinpoint exactly where the bottlenecks are occurring. True, graphical computations on a high-res photo require a lot of resources, but I would assume the logic you are using to manage and find your objects requires some tuning up as well.
A high-resolution image takes up a lot of memory (more pixels to store). As such, any operation that you do to it means more bits to manipulate.
Does your program utilise "layers"?
If not, then I'm guessing you are adding components directly to the image - which means each operation has to manipulate the bits. So if you aren't using layers, then you should definitely start. Layers will allow you to draw operations to the screen but only merge them into the base high-resolution image once - when you save!
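A rough sketch of that overlay idea in System.Drawing terms (all names are illustrative): the loaded bitmap is never touched while editing; shapes are redrawn in OnPaint, and they only get merged into a copy of the image when you save.

using System.Collections.Generic;
using System.Drawing;
using System.Windows.Forms;

// Sketch: keep the base image untouched and draw shapes as an overlay.
public class CanvasControl : Control
{
    private readonly Bitmap baseImage;                  // the high-resolution picture
    private readonly List<Rectangle> shapes = new List<Rectangle>();

    public CanvasControl(Bitmap image)
    {
        baseImage = image;
        DoubleBuffered = true;                          // avoid flicker on redraw
    }

    public void AddShape(Rectangle shape)
    {
        shapes.Add(shape);
        Invalidate();                                   // cheap: only the overlay is redrawn
    }

    protected override void OnPaint(PaintEventArgs e)
    {
        base.OnPaint(e);
        e.Graphics.DrawImageUnscaled(baseImage, Point.Empty);
        foreach (var shape in shapes)
            e.Graphics.DrawRectangle(Pens.Red, shape);  // overlay only, image untouched
    }

    // Merge the overlay into a copy of the image only when saving.
    public Bitmap Flatten()
    {
        var result = new Bitmap(baseImage);
        using (var g = Graphics.FromImage(result))
        {
            foreach (var shape in shapes)
                g.DrawRectangle(Pens.Red, shape);
        }
        return result;
    }
}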
What library from Windows are you using to open the image?
If you are using System.Drawing then you are actually using GDI+, since System.Drawing is a wrapper on top of it. GDI+ is nice for a lot of things because it simplifies tons of operations; however, it isn't the fastest in the world. For example, using the [Get|Set]Pixel methods is MUCH slower than working directly on the BitmapData. As there are tons of articles on speeding up operations on top of GDI+, I will not reiterate them here.
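As one concrete illustration of the difference (a hedged sketch, assuming a 32bpp ARGB bitmap): inverting an image via LockBits and the raw pixel buffer instead of per-pixel GetPixel/SetPixel calls.

using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

// Sketch: work on the raw pixel buffer via LockBits instead of Get/SetPixel.
static class FastPixels
{
    public static void Invert(Bitmap bitmap)            // assumes Format32bppArgb
    {
        var rect = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
        BitmapData data = bitmap.LockBits(rect, ImageLockMode.ReadWrite,
                                          PixelFormat.Format32bppArgb);
        try
        {
            int byteCount = data.Stride * data.Height;
            byte[] pixels = new byte[byteCount];
            Marshal.Copy(data.Scan0, pixels, 0, byteCount);

            // BGRA layout: invert the colour channels, leave alpha alone.
            for (int i = 0; i < byteCount; i += 4)
            {
                pixels[i]     = (byte)(255 - pixels[i]);      // B
                pixels[i + 1] = (byte)(255 - pixels[i + 1]);  // G
                pixels[i + 2] = (byte)(255 - pixels[i + 2]);  // R
            }

            Marshal.Copy(pixels, 0, data.Scan0, byteCount);
        }
        finally
        {
            bitmap.UnlockBits(data);
        }
    }
}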
I hope the information I've provided answers some of your questions and perhaps raises new ones. Good luck!
