I've got a fairly large Unity application with a lot of things happening concurrently. I realized that having all my evaluation, iteration, enumeration, decision-making, and arithmetic logic done in or called from Update() is starting to have a performance impact. I've since offloaded most of this logic to Coroutines, which has won back some performance, but I'm trying to figure out what I can do to make this better.
Coroutines win back performance when iterating over a large collection because I can yield at the end of each loop iteration, but when the collection is massive, processing one item per frame makes the task considerably slow. This is where I'm instead starting to offload these sorts of tasks to Threads. The problem with using Threads in Unity is that it throws a fit if you call the Unity API from a Thread. To work around that, I came up with this design pattern: the non-Unity logic is done on a thread, and once a decision or result has been made, any additional Unity-specific logic based on that result is done on Unity's main thread.
using System;
using System.Collections.Generic;
using System.Threading;
using UnityEngine;

public class Evaluator : MonoBehaviour {

    public static Evaluator Instance;

    static Queue<Func<Action>> evaluationQueue = new Queue<Func<Action>>();
    static Queue<Action> actionQueue = new Queue<Action>();
    static Thread evaluatorLoop;

    public static void EvalLogic (Func<Action> evaluation) {
        if (Instance == null) {
            NanoDebug.LogError("EVALUATOR IS NOT INITIALIZED!");
            return;
        }

        evaluationQueue.Enqueue(evaluation);

        // Spin up the worker thread on demand.
        if (evaluatorLoop == null) {
            evaluatorLoop = new Thread(EvaluatorLoop);
            evaluatorLoop.IsBackground = true;
            evaluatorLoop.Start();
        }
    }

    // Runs on the background thread: evaluate until the queue is drained.
    static void EvaluatorLoop () {
        for (;;) {
            if (evaluationQueue.Count > 0) {
                Func<Action> evaluation = evaluationQueue.Dequeue();
                Action action = evaluation();
                actionQueue.Enqueue(action);
            }
            else {
                break;
            }
        }
        evaluatorLoop = null;
    }

    void Awake () {
        Instance = this;
    }

    // Runs on the main thread: apply one queued action per physics step.
    void FixedUpdate () {
        if (actionQueue.Count > 0) {
            actionQueue.Dequeue()?.Invoke();
        }
    }
}
In some regular script, I would use it like so:
void OnSomeGameEvent (GameEventBigData bigData) {
    Func<Action> checkData = () => {
        // Iterate over some big collection.
        // Do X if some condition is met during iteration.
        // Do some arithmetic with data.
        // Evaluate some large if-statement or switch statement.
        // Return this Action based on the decision or result of the above.
        return () => {
            // Mutate the state of some object involving the Unity API using evaluated data.
        };
    };
    Evaluator.EvalLogic(checkData);
}
I've run into a similar problem with large iterations using this pattern. The Thread gets through the iteration almost instantly, but if I need to mutate the state of every Game Object in a collection, the action queue above fills up to the size of the collection, and it's then up to the pace of FixedUpdate() to apply the selected changes. When it comes to complex decision-making or math, though, this pattern seems to help Unity not skip a beat.
So my question is: Am I really gaining performance this way, or am I better off sticking to just Coroutines and intelligently yielding when necessary?
Here's my understanding of what Update() is. It's a render loop, right? If you want to change the state of a Game Object in a way that affects its visual appearance or location, you do it in Update(). That being said, doesn't it make more sense to offload ALL non-render-related tasks from Update()?
tl;dr: Depends on your specific use case.
Am I really gaining performance this way, or am I better off sticking to just Coroutines and intelligently yielding when necessary?
That depends entirely on what is actually done by the threads / Coroutines, how many of them you are using, etc.
Using threads for performance-intensive things (mostly file I/O, parsing operations on large or many strings, or large data processing in general) is definitely a performance gain wherever you can apply it. However, keep in mind that creating threads also has a certain overhead. If you would be creating them on a regular basis, it is better to have one persistent thread and pass it work packages to solve. While there are no packages it should sleep for a certain amount of time (e.g. 17 ms, which is about one frame at 60 FPS, or even longer - I usually used 50 ms and was quite happy with the results so far) so it doesn't eat up 100% of CPU time while it has nothing to do.
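A minimal sketch of that persistent-worker idea (the class and member names here are only placeholders; the 50 ms idle sleep is the value mentioned above):

using System;
using System.Collections.Concurrent;
using System.Threading;

public static class PersistentWorker
{
    static readonly ConcurrentQueue<Action> workQueue = new ConcurrentQueue<Action>();
    static Thread worker;

    // Assumes Enqueue is only ever called from the main thread.
    public static void Enqueue(Action workPackage)
    {
        workQueue.Enqueue(workPackage);
        if (worker == null)
        {
            worker = new Thread(WorkerLoop) { IsBackground = true };
            worker.Start();
        }
    }

    static void WorkerLoop()
    {
        while (true)
        {
            if (workQueue.TryDequeue(out var workPackage))
            {
                workPackage();       // heavy, non-Unity-API work
            }
            else
            {
                Thread.Sleep(50);    // idle: don't burn a full CPU core
            }
        }
    }
}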
It's a render loop, right?
Not really, no. Update is just a message that gets invoked once each frame. What you do in there is of course up to you. As you said yourself, most of the Unity API (namely the parts that directly affect the scene or assets, or rely on them) can only be used on the Unity main thread, i.e. from within one of Unity's built-in messages (Update, FixedUpdate, etc.).
That being said, doesn't it make more sense to then offload ALL non-render related tasks from Update()?
No! That would be overdoing it a little. Heavy things, yes, why not. But keep in mind that your device only has a limited number of CPU cores to work off all the Threads and Tasks in parallel. At a certain point the overhead just grows bigger than the actual work being done. I wouldn't move each and every Vector3 calculation into a thread / work package, but if you have to do 1000 of them, again, sure, why not.
So, as said, it basically all comes down to: it depends on your specific use case.
As an alternative to Threads, depending on your needs, you might also want to look into the Unity Job System and the Burst compiler, which let you do a lot of things asynchronously that are usually somehow bound to the Unity main thread.
In particular it is used a lot in combination with the new Mesh API.
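For a feel of what that looks like, here is a minimal IJob sketch; the SquareNumbersJob and its data are made up for illustration, and you would need the Jobs/Burst/Collections packages set up as described in the manual:

using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

public class JobExample : MonoBehaviour
{
    [BurstCompile]
    struct SquareNumbersJob : IJob
    {
        public NativeArray<float> values;   // only blittable data, no managed objects

        public void Execute()
        {
            for (int i = 0; i < values.Length; i++)
                values[i] *= values[i];
        }
    }

    void Start()
    {
        var values = new NativeArray<float>(1000, Allocator.TempJob);
        var handle = new SquareNumbersJob { values = values }.Schedule();

        // ... other main-thread work can happen while the job runs ...

        handle.Complete();   // wait (or poll later) before reading the results
        // use values[...] here
        values.Dispose();
    }
}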
Yet another alternative for heavy but generic mathematical operations is Compute Shaders, i.e. handing all the work over to the GPU. This stuff can get extremely complex, but at the same time extremely fast and powerful. (Again, see the examples and timings in the Mesh API examples.)
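On the C# side, dispatching compute work looks roughly like this; the .compute asset and its CSMain kernel (with a thread group size of 64) are assumed to exist and are not shown here:

using UnityEngine;

public class ComputeExample : MonoBehaviour
{
    public ComputeShader shader;   // assign the .compute asset in the inspector

    void Start()
    {
        var data = new float[1024];
        for (int i = 0; i < data.Length; i++) data[i] = i;

        var buffer = new ComputeBuffer(data.Length, sizeof(float));   // 4 bytes per float
        buffer.SetData(data);

        int kernel = shader.FindKernel("CSMain");
        shader.SetBuffer(kernel, "Result", buffer);
        shader.Dispatch(kernel, data.Length / 64, 1, 1);   // 64 = thread group size in the kernel

        buffer.GetData(data);   // blocks until the GPU work is finished
        buffer.Release();
    }
}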
And in general, yes, this kind of dispatching of actions/results back onto the main thread is the typical solution. Just make sure to either lock your queues or rather use a ConcurrentQueue so you don't run into threading issues.
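Applied to your posted Evaluator, that would mean swapping both queues for ConcurrentQueue and using TryDequeue; only the relevant members are shown here:

using System;
using System.Collections.Concurrent;
using UnityEngine;

public class Evaluator : MonoBehaviour
{
    // Both queues are shared between the worker thread and the main thread,
    // so both should be thread-safe.
    static readonly ConcurrentQueue<Func<Action>> evaluationQueue = new ConcurrentQueue<Func<Action>>();
    static readonly ConcurrentQueue<Action> actionQueue = new ConcurrentQueue<Action>();

    // Worker-thread side: TryDequeue instead of Count/Dequeue.
    static void EvaluatorLoop()
    {
        while (evaluationQueue.TryDequeue(out var evaluation))
        {
            actionQueue.Enqueue(evaluation());
        }
    }

    // Main-thread side.
    void FixedUpdate()
    {
        if (actionQueue.TryDequeue(out var action))
            action?.Invoke();
    }
}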
Additionally, you can have a Coroutine for working off the queued packages that skips to the next frame once too much time has passed, so you can still hit a certain target framerate. You could achieve this e.g. using a Stopwatch. This of course requires implementing certain points where it may skip (yield), for example after each processed package / Action.
const float targetFrameRate = 60;

// yes, if you make Start return IEnumerator then Unity automatically
// runs it as a Coroutine
private IEnumerator Start()
{
    // -> about 17 ms -> actual frame-rate might be only 58 or lower
    var maxMS = Mathf.RoundToInt(1000f / targetFrameRate);

    var stopWatch = new System.Diagnostics.Stopwatch();
    stopWatch.Start();

    // This is ok in a Coroutine as long as you yield inside
    // in order to at least render one frame eventually
    while (true)
    {
        if (actionQueue.Count > 0)
        {
            actionQueue.Dequeue()?.Invoke();

            // after each invoked action check if the time budget is already used up
            if (stopWatch.ElapsedMilliseconds >= maxMS)
            {
                // render this frame and continue in the next
                yield return null;
                // and restart the timer
                stopWatch.Restart();
            }
        }
        else
        {
            // Otherwise directly go to the next frame
            yield return null;
            // and restart the timer
            stopWatch.Restart();
        }
    }
}
If this still isn't enough and your Action itself is too intensive, you could split it up into multiple Actions in order to allow skipping between them. E.g. instead of doing
Enqueue(() => LoopThroughObjectsAndDoStuff(thousandObjects));
you would rather do
foreach (var x in thousandObjects)
{
    Enqueue(() => DoStuffForObject(x));
}
Keep in mind though that there is a tradeoff, and you have to play around a bit with the target frame-rate versus the real time until everything is done.
There are situations where you prefer that a user effectively waits longer until something is finished, but meanwhile the application runs at a good frame-rate. And there are other situations where you accept that the app runs at a lower frame-rate, or even freezes briefly, but the process finishes faster.
Related
I'm implementing image processing algorithms in C# using .NET Framework 4.7.2 and need to decrease the computation time. Overall the code is sequential, but there are quite a few methods with parameters that do not depend on each other. For example, it might be something like this:
public void Algorithm(Object x, Object y) {
    x = Filter(x);
    x = Morphology(x);

    y = Filter(y);
    y = Morphology(y);

    var z = Add(x, y);

    // Similar pattern of separate operations that are then combined.
}
These functions generally take around 100ms to 500ms. They can be parallelised, and my approach has been something like this:
public void Algorithm(Object x, Object y) {
    var xTask = Task.Run(() => {
        x = Filter(x);
        x = Morphology(x);
    });

    var yTask = Task.Run(() => {
        y = Filter(y);
        y = Morphology(y);
    });

    Task.WaitAll(xTask, yTask);

    var z = Add(x, y);
}
It seems to work; a similar bit of code runs approximately twice as fast. (Note that the whole thing is wrapped in another Task.Run in the top-most-level function, which is why I'm not awaiting here.)
Question: Is this a valid approach, or is there another method for parallelising lots of little method calls that is more safe or efficient?
Update: This is not for parallelising processing a batch of images. It is about processing a single image as quick as possible.
This is valid enough - if you can process your workload in parallel then you should. You just need to be very aware of WHEN your workload can and should be parallel - and when it needs to be performed in order.
You also need to consider the cost of creating a new task, versus the benefits of doing so (i.e. sometimes avoid very small, very fast tasks).
I would strongly recommend you create additional methods and collections for managing your tasks - tracking when they complete, running lots of separate sets in parallel, avoiding locking, managing shared memory/variables, etc. For example, are you only ever processing one image at a time, or can you start processing the next one if you have cores available?
You need to be very careful with Task.WaitAll() - obviously you need to draw all your work together at some point, but be careful not to lock or block other work.
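As a small illustration of keeping the data flow explicit, the posted code could also let each task return its result instead of mutating the captured x and y (Filter, Morphology, and Add are the question's own methods):

public void Algorithm(Object x, Object y)
{
    // Each branch returns its result rather than writing back to a captured variable.
    Task<Object> xTask = Task.Run(() => Morphology(Filter(x)));
    Task<Object> yTask = Task.Run(() => Morphology(Filter(y)));

    Task.WaitAll(xTask, yTask);

    var z = Add(xTask.Result, yTask.Result);
}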
There's lots of articles out there on the various patterns you can use (pipelines sounds like a good match here).
Here's a few starters:
https://learn.microsoft.com/en-us/dotnet/standard/parallel-programming/tpl-and-traditional-async-programming
https://learn.microsoft.com/en-us/dotnet/standard/parallel-programming/potential-pitfalls-in-data-and-task-parallelism
I'm trying to understand Unity coroutines more deeply. They can pause execution by yielding null or a wait instruction, but they are not really new threads.
In my case Unity should read info from a database, which can take some time, and it is a synchronous operation. This single line of code may potentially block execution for seconds.
Part of me tells just to start new thread. But I'm wondering whether it can be achieved with Unity-style coroutines.
private IEnumerator FetchPlayerInfo(int id)
{
    // fetch player from DB
    using (var session...)
    {
        // this line is NHibernate SQL query, may take long time
        player = session.QueryOver<Player>()...;
    }

    // raise event to notify listeners that player info is fetched
}
I just don't see where to put yield. Does anyone know?
Thx in advance.
You can only return yield instructions when the control flow is in your own coroutine. If you have to run a long synchronous operation and it has no asynchronous API (which is something to be expected from a database), then you really are better off starting another thread.
However, be aware that using other threads in Unity is a little bit tricky: you won't be able to use any of the Unity API from them, and you'll have to check on the main thread for when the worker thread has a result ready. Consider looking at ready-made solutions, such as Loom.
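A minimal sketch of how a coroutine can wait for such a worker thread, assuming the query is pushed onto the thread pool via Task.Run; OpenSession() and the Where filter are made up for illustration, while Player and the QueryOver call come from the question:

// requires: using System.Collections; using System.Threading.Tasks;
private IEnumerator FetchPlayerInfo(int id)
{
    Player player = null;

    // The blocking NHibernate query runs on a thread-pool thread.
    var task = Task.Run(() =>
    {
        using (var session = OpenSession())   // however you create your sessions
        {
            player = session.QueryOver<Player>()
                            .Where(p => p.Id == id)   // hypothetical filter
                            .SingleOrDefault();
        }
    });

    // The coroutine yields until the worker is done, so frames keep rendering.
    while (!task.IsCompleted)
        yield return null;

    // Back on the main thread: raise the event / touch the Unity API here.
}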
You can think of yielding as breaking a large task into chunks. After each chunk you yield to let other things happen, then come back and do another chunk. In your case you would load X number of rows per chunk until all your data is loaded.
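If the data source supports paging, that chunked approach could look roughly like this; LoadRows(offset, count) and ProcessRows are hypothetical helpers standing in for a paged query and your own handling code:

private IEnumerator FetchPlayersChunked(int chunkSize)
{
    int offset = 0;
    while (true)
    {
        // Hypothetical paged query (e.g. built with Skip/Take); note that each
        // individual chunk still blocks the main thread for its own duration.
        var rows = LoadRows(offset, chunkSize);
        if (rows.Count == 0)
            break;

        ProcessRows(rows);        // whatever you do with the loaded players
        offset += rows.Count;

        yield return null;        // give Unity a frame between chunks
    }

    // notify listeners that everything is loaded
}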
I have been working on async calls and found that the async version of a method runs much slower than the sync version. Can anyone comment on what I may be missing? Thanks.
Statistics
Sync method time is 00:00:23.5673480
Async method time is 00:01:07.1628415
Total Records/Entries returned per call = 19972
Below is the code that I am running.
-------------------- Test class ----------------------
[TestMethod]
public void TestPeoplePerformanceSyncVsAsync()
{
    DateTime start;
    DateTime end;

    start = DateTime.Now;
    for (int i = 0; i < 10; i++)
    {
        using (IPersonRepository repository = kernel.Get<IPersonRepository>())
        {
            IList<IPerson> people1 = repository.GetPeople();
            IList<IPerson> people2 = repository.GetPeople();
        }
    }
    end = DateTime.Now;

    var diff = end - start;
    Console.WriteLine(diff);

    start = DateTime.Now;
    for (int i = 0; i < 10; i++)
    {
        using (IPersonRepository repository = kernel.Get<IPersonRepository>())
        {
            Task<IList<IPerson>> people1 = GetPeopleAsync();
            Task<IList<IPerson>> people2 = GetPeopleAsync();
            Task.WaitAll(new Task[] { people1, people2 });
        }
    }
    end = DateTime.Now;

    diff = end - start;
    Console.WriteLine(diff);
}

private async Task<IList<IPerson>> GetPeopleAsync()
{
    using (IPersonRepository repository = kernel.Get<IPersonRepository>())
    {
        return await repository.GetPeopleAsync();
    }
}
-------------------------- Repository ----------------------------
public IList<IPerson> GetPeople()
{
    List<IPerson> people = new List<IPerson>();
    using (PersonContext context = new PersonContext())
    {
        people.AddRange(context.People);
    }
    return people;
}

public async Task<IList<IPerson>> GetPeopleAsync()
{
    List<IPerson> people = new List<IPerson>();
    using (PersonContext context = new PersonContext())
    {
        people.AddRange(await context.People.ToListAsync());
    }
    return people;
}
So we've got a whole bunch of issues here; I'll just say right off the bat that this isn't going to be an exhaustive list.
First off, the point of asynchrony is not strictly to improve performance. It can be used, in certain contexts, to improve performance, but that's not necessarily its goal. It can also be used to keep a UI responsive, for example. Parallelization is usually used to increase performance, but parallelization and asynchrony aren't equivalent. On top of that, parallelization has an overhead: you're spending time creating threads, scheduling them, synchronizing data between them, etc. The benefit of performing some operations in parallel may or may not outweigh this overhead. If it doesn't, a synchronous solution may well be more performant.
Next, your "asynchronous" example isn't asynchronous "all the way up". You're calling WaitAll on the tasks inside the loop, which blocks. For the example to be properly asynchronous, it would need to be asynchronous all the way up to a single top-level operation, typically some form of message loop.
Next, the two versions aren't doing the exact same thing asynchronously and synchronously. They are doing different things, which will obviously affect performance:
Your "asynchronous" solution creates 3 repositories. Your synchronous solution creates one. There is going to be some overhead here.
GetPeopleAsync takes a list, then pulls all of the items out of the list and puts them into another list. That's unnecessary overhead.
Then there are problems with your benchmarking:
You're using DateTime.Now, which is not designed for timing how long an operation takes; its precision isn't particularly high, for example. You should use a Stopwatch to time how long code takes (see the sketch after this list).
You aren't performing all that many iterations. There's plenty of opportunity for the variation to affect the results here.
You aren't accounting for the fact that the first few runs through a section of code will take longer. The JITter needs to "warm up".
Garbage collections can be affecting your timings, namely that the objects created in the first test can end up being cleaned up during the second test.
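To illustrate the Stopwatch point from the list above, a measurement loop could look like this; RunScenario() is a placeholder for whichever of the two variants you want to time:

using System;
using System.Diagnostics;

// Warm up once so JIT compilation doesn't count towards the measurement.
RunScenario();

const int iterations = 100;
var sw = Stopwatch.StartNew();
for (int i = 0; i < iterations; i++)
{
    RunScenario();
}
sw.Stop();
Console.WriteLine($"Average: {sw.ElapsedMilliseconds / (double)iterations} ms per iteration");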
It may depend on your data, or rather the amount of it. You didn't post what test metrics you're using to run your tests but this is my experience:
Usually when you see a slowdown in the performance of parallel algorithms where you're expecting an improvement, it's because the overhead of loading the extra libraries, spawning threads, etc. slows down the parallel algorithm and makes it look like the linear/single-threaded version is performing better.
A greater amount of data should show better performance. Also try running the same test twice when all the libraries are loaded to avoid the load overhead.
If you don't see improvement, something is seriously wrong.
Note: You're getting voted down, I'm guessing, because you posted much more code than context, metrics etc. in the OP. IMO, very few SOers will actually bother to read and grok even that much code without being able to execute it while also being presented with metrics that are not at all useful!
Why I didn't read the code: When I see a code block with scroll bars along with the kind of text that was present in the original OP, my brain says: Don't bother. I think many if not most, probably do this.
Things to try:
Two different sync times do not amount to statistically significant data. You should run each algorithm a number of times (at least 5) to see whether you're experiencing anomalies. If your results for the same algorithm vary wildly, then you may have other issues such as bandwidth restrictions, server load, etc., and the problem is external.
Try a .NET performance and/or memory profiler to help you track down the issue.
See @servy's great answer for more clues. It seems that he actually took the time to look at your code more closely.
I have several actions that I want to execute in the background, but they have to be executed sequentially, one after the other.
I was wondering if it's a good idea to use the Task.ContinueWith method to achieve this. Do you foresee any problems with this?
My code looks something like this:
private object syncRoot = new object();
private Task latestTask;

public void EnqueueAction(System.Action action)
{
    lock (syncRoot)
    {
        if (latestTask == null)
            latestTask = Task.Factory.StartNew(action);
        else
            latestTask = latestTask.ContinueWith(tsk => action());
    }
}
There is one flaw with this, which I recently discovered myself because I am also using this method of ensuring tasks execute sequentially.
In my application I had thousands of instances of these mini-queues and quickly discovered I was having memory issues. Since these queues were often idle, I was holding onto the last completed task object for a long time and preventing garbage collection. Since the result object of the last completed task was often over 85,000 bytes, it was allocated on the Large Object Heap (which does not perform compaction during garbage collection). This resulted in fragmentation of the LOH and the process continuously growing in size.
As a hack to avoid this, you can schedule a no-op task right after the real one within your lock. For a real solution, I will need to move to a different method of controlling the scheduling.
This should work as designed (using the fact that TPL will schedule the continuation immediately if the corresponding task already has completed).
Personally, in this case I would just use a dedicated thread with a concurrent queue (ConcurrentQueue) to draw tasks from - this is more explicit but easier to follow when reading the code, especially if you want to find out e.g. how many tasks are currently queued.
I used this snippet and it seems to work as designed.
The number of instances in my case does not run into the thousands, but is in the single digits.
Nevertheless, no issues so far.
I would be interested in the ConcurrentQueue example, if there is any?
Thanks
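Since a ConcurrentQueue-based example was asked for, here is a minimal sketch of the dedicated-thread approach described in the answer above, using a BlockingCollection (which wraps a ConcurrentQueue by default); SequentialExecutor is just an illustrative name:

using System;
using System.Collections.Concurrent;
using System.Threading;

public class SequentialExecutor : IDisposable
{
    private readonly BlockingCollection<Action> queue = new BlockingCollection<Action>();
    private readonly Thread worker;

    public SequentialExecutor()
    {
        worker = new Thread(() =>
        {
            // GetConsumingEnumerable blocks while the queue is empty
            // and ends once CompleteAdding() has been called.
            foreach (var action in queue.GetConsumingEnumerable())
                action();
        })
        { IsBackground = true };
        worker.Start();
    }

    public void EnqueueAction(Action action) => queue.Add(action);

    public void Dispose()
    {
        queue.CompleteAdding();
        worker.Join();
    }
}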
I am working on a robot that is capable of motion detection using a webcam. I am doing this in C#.
The robot moves too fast, so I want to turn it on/off at short time intervals in order to reduce its speed.
For example, it will start the motor, wait 0.5 seconds, and turn it off, and this cycle repeats every 2 seconds. This way, its speed won't be too high. I would like to wrap this in a single function called Move().
I just don't know how to do this, especially because my motion detection code runs about 20 times a second. Depending on the position of the obstacle, I may need to disable the Move() function and activate other functions that let the robot move in other directions.
Any ideas/suggestions on where am I supposed to start?
Thanks a lot!
First of all, we need to establish how your program flows.
Does it execute one command, wait, then execute the next command? Or does it execute commands simultaneously? (for example, move and do something else)
I imagine you will want it to execute commands in sequence, rather than some complex threading which your motor system may not support.
In order to get your robot to move slowly, I would suggest creating a Move() method that takes a parameter, the amount of time you want it to spend moving, like this:
public void Move(int numberOfSeconds)
{
    while (numberOfSeconds > 0)
    {
        myRobot.MotorOn();
        Thread.Sleep(2000);
        myRobot.MotorOff();
        Thread.Sleep(500);
        numberOfSeconds -= 2;
    }
}
It's not exact, but that is one way of doing it.
If you then call Move(10) for example, your robot will move for 10 seconds, and pause every 2 seconds for half a second.
Regarding your program flow questions, you may want to think of it as a list of instructions:
MOVE FORWARD
STOP
CHECK FOR OBJECT
ROTATE TO AIM AT OBJECT
MOVE FORWARD
STOP
etc.
So in your main program control loop, assuming the calls are synchronous (i.e. your program stops while it is executing a command, like in the above Move method), you could simply have a bunch of if statements (or a switch):
public void Main()
{
    // What calculations should the robot do?
    if (someCalculations == someValue)
    {
        // Rotate the robot to face the object
        robot.RotateRight(10);
    }
    else if (someOtherCalculation == someValue)
    {
        // We are on course, so move forward
        Move(10);
    }
}
That might help you get started.
If, however, your robot is asynchronous (that is, the code keeps running while the robot is doing things, such as constantly getting feedback from the motion sensors), you will have to structure your program differently. The Move() method may still work, but your program flow should be slightly different. You can use a variable to keep track of the state:
public enum RobotStates
{
    Searching,
    Waiting,
    Hunting,
    Busy
}
Then in your main loop, you can examine the state:
if (myRobotState != RobotStates.Busy)
{
    // Do something
}
Remember to change the state when your actions complete.
It's entirely possible you will have to use threading for an asynchronous solution, so your method that receives feedback from your sensor doesn't get stuck waiting for the robot to move but can continue to poll. Threading is beyond the scope of this answer, but there are plenty of resources out there.
You've encountered a very common problem, which is how to control a process when your actuator only has ON and OFF states. The solution you've proposed is a common solution, which is to set a 'duty cycle' by switching a motor on/off. In most cases, you can buy motor controllers that will do this for you, so that you don't have to worry about the details. Generally you want the pulsing to be a higher frequency, so that there's less observable 'stutter' in the motion.
If this is an educational project, you may be interested in theory on Motor Controllers. You might also want to read about Control Theory (in particular, PID control), as you can use other feedback (do you have some way to sense your speed?) to automatically control the motor to maintain the speed you want.
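For reference, the core of a PID controller is only a few lines. This is a generic sketch rather than anything robot-specific; the gains and the speed measurement are placeholders you would have to provide and tune:

public class PidController
{
    // Tuning gains: placeholders you would have to tune for your robot.
    public double Kp = 1.0, Ki = 0.1, Kd = 0.05;

    private double integral, previousError;

    // Call once per control-loop iteration; dt is the elapsed time in seconds.
    // The returned correction could be mapped onto the motor's on/off duty cycle.
    public double Step(double targetSpeed, double measuredSpeed, double dt)
    {
        double error = targetSpeed - measuredSpeed;
        integral += error * dt;
        double derivative = (error - previousError) / dt;
        previousError = error;
        return Kp * error + Ki * integral + Kd * derivative;
    }
}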
Thread.Sleep() may not be what you want, because if your hardware allows it you want to keep running your sensors while moving, etc. The first solution that comes to mind for me is to use a timer. That way you can keep processing and handle your movement when needed.
(I haven't tested this code, but it gets the idea across.)
// requires: using System.Timers;
System.Timers.Timer Timer = new System.Timers.Timer();
bool Moving;

void init()
{
    Timer.AutoReset = false;
    Timer.Elapsed += OnMoveTimerEvent;
    Moving = false;
}

void MainLoop()
{
    // stuff
    if (shouldMove)        // placeholder for "should move"
    {
        Timer.Start();
    }
    if (shouldStopMoving)  // placeholder for "should stop moving"
    {
        Timer.Stop();
    }
}

void OnMoveTimerEvent(object source, ElapsedEventArgs e)
{
    if (!Moving)
    {
        // start motor
        Timer.Interval = 500;
        Moving = true;
        Timer.Start();
    }
    else
    {
        // stop motor
        Moving = false;
        Timer.Interval = 2000;
        Timer.Start();
    }
}
I'd suggest you look into Microsoft Robotics Studio. I've never used it, but it might actually address this kind of issue, and probably others you haven't encountered yet.
Another option is to write the app using XNA's timing mechanisms. Only instead of rendering to the screen, you'd be rendering to your robot, sort of.
You can try using -
Thread.Sleep(500);
This piece of code will put the currently running thread to sleep for 500 milliseconds.
One way of adding a pause to the work is to sleep between calls, like this: System.Threading.Thread.Sleep(5000); where 5000 means 5000 milliseconds.
It sounds like you want the main thread to execute functions periodically. If your robot has the .NET Framework installed, you can use the threading library.
while (condition)
{
    // Wait a number of ms (2000 ms = 2 seconds)
    Thread.Sleep(2000);

    // Do something here, e.g. move
    Thread.Sleep(500);

    // ...
}
But yours is a very difficult question. Could you please specify what kind of operating system and/or environment (libraries, frameworks, ...) your robot has?
As far as motion detection and obstacle detection are concerned, your code should be using complex threading. For your motor speed problem, have you tried making the Move() call on a separate thread,
and in a for loop using Thread.Sleep(2 * 1000) after every MotorOff() call?
Regards.