During my current project, I'm calling a solution (which I'll refer to as solution2) every time the user presses a button (as long as the variables are correct). I'm torn between calling a method inside solution2 on each correct user input, or writing everything in the Start method and simply "activating" solution2 for each correct user input. I'm not too bothered about which one is easier (unless one of them would cause major difficulty); I'm only looking for the most optimised way to do it. Thank you for your help. -TAG
If you're torn between two different means to solve a problem, and optimization is your main concern, measure it! This is a good use of the Stopwatch class, but even just recording your current time and subtracting the time after function completion to get a diff will help you out. Make a (Release!) build for each solution, and run them each a large number of times to establish which one is faster on average.
Once you've determined the most performant solution, keep that one, and consider leaving the performance tracking in so that you can identify bottlenecks in your code. This will allow you to isolate and correct performance problems with confidence. Ideally you can separate your implementation details into their own class so you can refactor and optimize freely without needing to change the rest of your code.
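As a rough sketch of the measurement idea, a tiny Stopwatch harness might look like this (Solution2A and Solution2B are placeholder names for your two candidate implementations, not anything from the question):

```csharp
using System;
using System.Diagnostics;

static class Benchmark
{
    // Runs an action many times and reports the average elapsed time per call.
    static TimeSpan Measure(Action action, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();
        return TimeSpan.FromTicks(sw.Elapsed.Ticks / iterations);
    }

    static void Solution2A() { /* variant 1: call a method on each input */ }
    static void Solution2B() { /* variant 2: "activate" the pre-built object */ }

    static void Main()
    {
        const int runs = 100_000;
        Console.WriteLine($"Variant A avg: {Measure(Solution2A, runs).TotalMilliseconds} ms");
        Console.WriteLine($"Variant B avg: {Measure(Solution2B, runs).TotalMilliseconds} ms");
    }
}
```

Remember to run this in a Release build, outside the debugger, or the numbers will be misleading.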
I'm getting to grips with the async/await functionality in C# right now.
I think I know what it's good for. But I've encountered places where I don't want every method up the call chain that uses a library function of mine to have to become "async"-aware.
Consider this (rough pseudo-code, not really representing the real thing, it's just about the context):
string jokeOfTheHour;

public string GiveJokeOfTheHour()
{
    if (HourIsOver)
    {
        jokeOfTheHour = thirdPartyLibrary.GetNewJoke().GetAwaiter().GetResult();
    }
    return jokeOfTheHour;
}
I have a web-back-end library function which is called up to a million times per hour (or even more).
Exactly once in those million calls per hour, the logic inside uses a third-party library that only supports async calls for the methods I want to use from it.
I don't want the users of my library to even think it would make sense for them to run any code asynchronously when calling my library function, because the overwhelming majority of the time it would only generate unnecessary overhead in their code and at runtime.
The reasons I would state here are:
Separation of concerns. I know how I work; my user does not need to.
Context is everything. As a developer, having background knowledge is how I know which cases I need to consider when writing code and which I don't. That enables me to omit writing hundreds of lines of code for things that should never happen.
Now, I want to know what general rules there are for doing this. Sadly, browsing the web, I can't find simple statements or rules where anybody says, "In this, this, and this situation, you can stop the async keyword bubbling up your method call tree." I've only seen people (some of them Microsoft MVPs) say that there absolutely are situations where this should be done, also stating that you should then use .GetAwaiter().GetResult() as a best practice, but they are never specific about the situations themselves.
What I am looking for is a down-to-the-ground general rule in which I can say:
Even though I might call third-party functions that are async, I do not execute asynchronously and do not want to appear as if I do. I'm a bottom-level function serving from a cache 99.99999% of the time. I don't need my users to implement the async methodology all the way up to the point where they decide where the async execution stops (which would make the users who are supposed to benefit from my library write more code and spend more execution time).
I would really be thankful for your help :)
You seem to want your method to introduce itself with: "I'm fast". The truth is that from time to time it can actually be (very) slow. This potentially has serious consequences.
The statement
"I'm a bottom level function using caches 99.99999% of the time"
is not correct if you call your method once an hour.
It is better for consumers of your method to see "I can be slow, but if you call me often, I cache the result, so I will return fast" (which would be GiveJokeOfTheHourAsync() with a comment.)
If you want your method to always be fast I would suggest one of these options:
Have an UpdateJokeAsync method that you call without awaiting it inside your if (HourIsOver). This would mean returning a stale result until you fetch a new one.
Update your joke using a timer.
Make 'get' always get the last known and have UpdateJokeAsync to update the joke.
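A sketch of the timer-based option under some assumptions: ThirdPartyLibrary here is a stub standing in for the real async-only dependency, and the hourly interval is hard-coded. Note that the async lambda passed to the timer is effectively async void, so a production version would want its own error handling.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the real async-only third-party call.
static class ThirdPartyLibrary
{
    public static Task<string> GetNewJokeAsync() => Task.FromResult("A new joke");
}

public class JokeCache
{
    private readonly Timer _timer;
    private volatile string _jokeOfTheHour = "No joke yet";

    public JokeCache()
    {
        // Refresh the cached joke once per hour on a background thread;
        // callers never pay the cost of the async third-party call.
        _timer = new Timer(async _ => await UpdateJokeAsync(),
                           null, TimeSpan.Zero, TimeSpan.FromHours(1));
    }

    // Always fast: returns the last known joke, possibly slightly stale.
    public string GiveJokeOfTheHour() => _jokeOfTheHour;

    private async Task UpdateJokeAsync()
    {
        _jokeOfTheHour = await ThirdPartyLibrary.GetNewJokeAsync();
    }
}
```

This keeps the public surface fully synchronous while confining the async work to the background refresh.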
I am solving a pick and delivery problem. I was testing OR-Tools to know how good it is by the following example:
1. Two vehicles at same start, two pickup locations (one for each customer) that are actually the same point in terms of geolocation, two customers having the same geolocation too.
2. No demand or capacity, just a time dimension between points and constraints to satisfy pickup and delivery.
3. The objective is to reduce the global span of the cumulative time
It's obvious that the optimal solution should use both vehicles, but it doesn't! I tried a lot of settings to make it escape from a local optimum, but it still doesn't; it doesn't even try to use the time at hand to reach a better solution and just finishes in a couple of seconds.
So, how can I force it to continue search even if it thinks that the solution at hand is enough?
BTW: I checked that my logic is correct by giving it the optimal route as an initial route, and when I do that, it uses it. It also indicated that the objective value of the optimal route is less than that of the original route, so I guess there are no bugs in the code.
I am interning for a company this summer, and I got passed down this program which is a total piece. It does very computationally intensive operations throughout most of its duration. It takes about 5 minutes to complete a run on a small job, and the guy I work with said that the larger jobs have taken up to 4 days to run. My job is to find a way to make it go faster. My idea was that I could split the input in half and pass the halves to two new threads or processes, I was wondering if I could get some feedback on how effective that might be and whether threads or processes are the way to go.
Any inputs would be welcomed.
Hunter
I'd take a strong look at the TPL that was introduced in .NET 4 :) PLINQ might be especially useful for easy speedups.
Generally speaking, splitting into different processes (exe files) is inadvisable for performance, since starting processes is expensive. It does have other merits, such as isolation (if part of a program crashes), but I don't think they are applicable to your problem.
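As a hedged sketch of how small the PLINQ version can be: ProcessItem here is a toy stand-in for whatever per-item computation the real job does, and whether this helps depends on the work being CPU-bound and independent per item.

```csharp
using System;
using System.Linq;

class Program
{
    // Placeholder for the real computationally intensive per-item work.
    static int ProcessItem(int x) => x * x;

    static void Main()
    {
        int[] input = Enumerable.Range(0, 1_000_000).ToArray();

        // AsParallel() partitions the input across the available cores;
        // AsOrdered() keeps results in the original input order.
        int[] results = input.AsParallel().AsOrdered()
                             .Select(ProcessItem)
                             .ToArray();

        Console.WriteLine(results[3]); // 9
    }
}
```

The speedup is roughly bounded by the number of cores, so profile first to confirm the work is actually parallelizable.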
If the jobs are splittable, then going multithreaded/multiprocessed will bring better speed. That is assuming, of course, that the computer they run on actually has multiple cores/cpus.
Threads or processes doesn't really matter regarding speed (if the threads don't share data). The only reason to use processes that I know of is when a job is likely to crash an entire process, which is not likely in .NET.
Use threads if there's lots of memory sharing in your code, but if you think you'd like to scale the program to run across multiple computers (when required cores > 16), then develop it using processes with a client/server model.
The best way when optimising code, always, is to profile it to find out where the logjams are, IMO.
Sometimes you can find non-obvious, huge speed increases with little effort.
EQATEC and SlimTune are two free C# profilers which may be worth trying out.
(Of course the other comments about which parallelization architecture to use are spot on; it's just that I prefer analysis first.)
Have a look at the Task Parallel Library -- this sounds like a prime candidate problem for using it.
As for the threads vs processes dilemma: threads are fine unless there is a specific reason to use processes (e.g. if you were using buggy code that you couldn't fix, and you did not want a bad crash in that code to bring down your whole process).
Well, if the problem has a parallel solution, then this is the right way to (ideally, though not always) significantly increase performance.
However, you don't control making additional processes except for running an app that launches multiple mini apps ... which is not going to help you with this problem.
You are going to need to utilize multiple threads. There is a pretty cool library added to .NET for parallel programming you should take a look at. I believe its namespace is System.Threading.Tasks or System.Threading with the Parallel class.
Edit: I would definitely suggest, though, that you think about whether or not a linear solution may fit better. Sometimes parallel solutions take even longer. It all depends on the problem in question.
If you need to communicate/pass data, go with threads (and if you can go .Net 4, use the Task Parallel Library as others have suggested). If you don't need to pass info that much, I suggest processes (scales a bit better on multiple cores, you get the ability to do multiple computers in a client/server setup [server passes info to clients and gets a response, but other than that not much info passing], etc.).
Personally, I would invest my effort into profiling the application first. You can gain a much better awareness of where the problem spots are before attempting a fix. You can parallelize this problem all day long, but it will only give you a linear improvement in speed (assuming that it can be parallelized at all). But, if you can figure out how to transform the solution into something that only takes O(n) operations instead of O(n^2), for example, then you have hit the jackpot. I guess what I am saying is that you should not necessarily focus on parallelization.
You might find spots that are looping through collections to find specific items. Instead you can transform these loops into hash table lookups. You might find spots that do frequent sorting. Instead you could convert those frequent sorting operations into a single binary search tree (SortedDictionary) which maintains a sorted collection efficiently through the many add/remove operations. And maybe you will find spots that repeatedly make the same calculations. You can cache the results of already made calculations and look them up later if necessary.
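The loop-to-lookup transformation above can be sketched like this (the Item record and the data are invented for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    record Item(int Id, string Name);

    static void Main()
    {
        var items = new List<Item> { new(1, "a"), new(2, "b"), new(3, "c") };

        // O(n) scan, repeated every time you need an item by id:
        Item slow = items.First(i => i.Id == 2);

        // Build the dictionary once; every subsequent lookup is O(1):
        Dictionary<int, Item> byId = items.ToDictionary(i => i.Id);
        Item fast = byId[2];

        Console.WriteLine(slow == fast); // True
    }
}
```

The same idea applies to the sorting case: replace repeated sorts with a SortedDictionary that stays ordered as you add and remove entries.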
EDIT
Hey guys, thanks for your fast answers. I've gotten a lot nearer to the problem; there seems to be a command that makes the devices answer faster (since I wait for every device's answer, this was slowing the whole thing down massively).
I'll keep on trying, or kill the device :P
Thanks
Sorry for the title...I simply couldn't think of a better one. I'm currently starting to think I'm going mad or something like that.... :)
I'll first try to describe the problem with words...if nobody has an idea i'll try to extract the important parts...
Imagine the following:
My app sends info via RS232 ports, so via a communications-monitor program I can watch exactly what data is sent and how much time passes in between.
Now I have 2 functions; to isolate the problem, they currently contain exactly the same while loops with exactly the same code (I actually copied it!). Maybe the only difference is the code before the while loops are reached, but even that isn't really anything big. So once the while loops are entered, they run infinitely (just at the moment, of course).
As I said, they contain the exact same sending routine.
But one is half as fast as the other one.... (WTF??????) :)
Now my first attempt was to place breakpoints 1 line above the "while"-line
and one line beneath - The program never left the while loops...
So...can you guys maybe think of any other reasons for this? Or ways to find out what it is? I'm kind of out of ideas after the breakpoint experiment showed that nothing but the same few lines are executed...
Oh btw no threads or something like that used here...so that can't be the reason either.
I'm looking forward to your ideas...thanks
Use a profiler to find where the extra cycles are going.
Given the exact same statement(s) run multiple times, there are absolutely no guarantees that they will have the same execution time. There are a lot of external factors that can influence this, like the CPU clock, for instance.
You said that you don't have threads, but your program is not the only one in the OS, and you can't know exactly what the other processes do or control the time slices allocated to them.
You could set the priority of your process higher, but that still won't provide any guarantees and isn't generally recommended.
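For completeness, raising the priority looks like this; as noted above, it only biases the OS scheduler and guarantees nothing, and High or RealTime can starve other processes:

```csharp
using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        // Raise the scheduling priority of the current process.
        // This is a hint to the scheduler, not a timing guarantee.
        using Process current = Process.GetCurrentProcess();
        current.PriorityClass = ProcessPriorityClass.AboveNormal;

        Console.WriteLine(current.PriorityClass); // AboveNormal
    }
}
```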
I'm building an application that is seriously slower than it should be (a process takes 4 seconds when it should take only .1 seconds, that's my goal at least).
I have a bunch of methods that pass an array from one to the other. This has kept my code nice and organized, but I'm worried that it's killing the efficiency of my code.
Can anyone confirm if this is the case?
Also, I have all of my code contained in a class separate from my UI. Is this going make things run significantly slower than if I had my code contained in the Form1.cs file?
Edit: There are about 95000 points that need to be calculated, each point goes through 7 methods that does additional calculations.
Have you tried any profiling or performance tools to narrow down why the slowdown occurs?
It might show you ways that you could use to refactor your code and improve performance.
This question asked by another user has several options that you can choose from:
Good .Net Profilers
No. This is not what is killing your code's speed, unless "many methods" means something like a million. You probably have more things iterating through your array than you need or realize, and the array itself may have a larger memory footprint than you realize.
Perhaps you should look into a design where, instead of passing the array to 7 methods, you iterate the array once, passing each member to the 7 methods; this will minimize the number of times you iterate through the 95000 members.
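The restructuring suggested above might be sketched like this (StepA/StepB are invented placeholders for the question's 7 calculation methods, and double is assumed as the point type):

```csharp
using System;

class Program
{
    static double StepA(double p) => p + 1; // placeholder calculation
    static double StepB(double p) => p * 2; // placeholder calculation
    // ...StepC through StepG would follow the same pattern.

    static void Main()
    {
        double[] points = new double[95000];

        // One pass over the array: each element flows through every step,
        // instead of seven separate passes over all 95000 points.
        for (int i = 0; i < points.Length; i++)
        {
            double p = points[i];
            p = StepA(p);
            p = StepB(p);
            points[i] = p;
        }

        Console.WriteLine(points[0]); // 2
    }
}
```

Beyond touching memory fewer times, the per-element loop body is also a natural unit to hand to Parallel.For later if profiling justifies it.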
In general, function calls are basic enough to be highly optimized by any compiler (or interpreter), so they do not produce much blow-up in run time. In fact, if you wrapped your problem into, say, some fancy iterative solution, you would save the stack handling but instead have to manage some iteration variables, which would not be much cheaper.
I know there have been programmers who wondered why their recursive algorithms were so slow, until someone told them not to pass array entries by value.
You should provide some sample code. In general, you should look for other bottlenecks or find another algorithm.
Just need to run it against a good profiling tool. I've got some stuff I wished only took 4 seconds - works with upwards of a hundred million records in a pass.
An array is a reference type, not a value type. Therefore you never pass the array itself; you are actually passing a reference to the array in memory. So passing the array isn't your issue. Most likely there is an issue with what you do with your array. You need to do what Jamie Keeling said and run it through a profiler, or even just debug it and see if you get stuck in some big loops.
Why are you loading them all into an array and doing each method in turn rather than iterating through them as loaded?
If you can obtain them (from whatever input source), deal with them, and output them (whether to screen, file, or wherever), this will inevitably use less memory and reduce start-up time, at the very least.
If this answer is applicable to your situation, start by changing your methods to deal with enumerations rather than arrays (non-breaking change, since arrays are enumerations), then change your input method to yield return items as loaded rather than loading an entire array.
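A minimal sketch of that streaming shape, with LoadPoints and Transform as hypothetical stand-ins for the real input method and one of the calculation methods:

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // Streams items one at a time instead of materializing an array first.
    static IEnumerable<double> LoadPoints()
    {
        for (int i = 0; i < 5; i++)
            yield return i * 1.5; // stand-in for reading from the input source
    }

    static IEnumerable<double> Transform(IEnumerable<double> points)
    {
        foreach (double p in points)
            yield return p + 1; // stand-in for one of the calculation steps
    }

    static void Main()
    {
        // Nothing is loaded until enumeration begins; each point is loaded,
        // transformed, and written out before the next one is even read.
        foreach (double result in Transform(LoadPoints()))
            Console.WriteLine(result);
    }
}
```

Because arrays already implement IEnumerable<T>, changing the method signatures this way is non-breaking, exactly as the answer notes.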
Sorry for posting an old link (.NET 1.1), but it was referenced in a VS2010 article, so:
Here you can read about method costs. (Initial link)
Also, if you start your code from VS (even in Release mode), the VS debugger attaches to your code and slows it down.
I know I'll get downvoted for this advice, but... maximum performance can be achieved with unsafe operations on arrays (yes, it's UNSAFE, but when performance is the deal, so be it...).
And last: refactor your code to use a minimum of methods working with your arrays. It will improve the performance.