How expensive are exceptions in C#? It seems like they are not incredibly expensive as long as the stack is not deep; however I have read conflicting reports.
Is there a definitive report that hasn't been rebutted?
Having read that exceptions are costly in terms of performance, I threw together a simple measurement program, very similar to the one Jon Skeet published years ago. I mention it here mainly to provide updated numbers.
It took the program below 29914 milliseconds to process one million exceptions, which amounts to 33 exceptions per millisecond. That is fast enough to make exceptions a viable alternative to return codes for most situations.
Please note, though, that with return codes instead of exceptions the same program runs in less than one millisecond, which means exceptions are at least 30,000 times slower than return codes. As stressed by Rico Mariani, these numbers are also minimum numbers; in practice, throwing and catching an exception will take more time.
Measured on a laptop with an Intel Core 2 Duo T8100 @ 2.1 GHz, on .NET 4.0, in a release build, not run under a debugger (which would make it way slower).
This is my test code:
static void Main(string[] args)
{
int iterations = 1000000;
Console.WriteLine("Starting " + iterations.ToString() + " iterations...\n");
var stopwatch = new Stopwatch();
// Test exceptions
stopwatch.Reset();
stopwatch.Start();
for (int i = 1; i <= iterations; i++)
{
try
{
TestExceptions();
}
catch (Exception)
{
// Do nothing
}
}
stopwatch.Stop();
Console.WriteLine("Exceptions: " + stopwatch.ElapsedMilliseconds.ToString() + " ms");
// Test return codes
stopwatch.Reset();
stopwatch.Start();
int retcode;
for (int i = 1; i <= iterations; i++)
{
retcode = TestReturnCodes();
if (retcode == 1)
{
// Do nothing
}
}
stopwatch.Stop();
Console.WriteLine("Return codes: " + stopwatch.ElapsedMilliseconds.ToString() + " ms");
Console.WriteLine("\nFinished.");
Console.ReadKey();
}
static void TestExceptions()
{
throw new Exception("Failed");
}
static int TestReturnCodes()
{
return 1;
}
I guess I'm in the camp that says if the performance of exceptions impacts your application, then you're throwing WAY too many of them. Exceptions should be for exceptional conditions, not routine error handling.
That said, my recollection of how exceptions are handled is that the runtime essentially walks up the stack looking for a catch that matches the type of the thrown exception. So performance will be impacted most by how deep you are from the catch and by how many catch statements you have. A sketch of the kind of micro-benchmark that shows the depth effect follows below.
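A minimal sketch of my own (names and loop counts are arbitrary, not from a published report) that makes the depth effect measurable:
using System;
using System.Diagnostics;

class StackDepthDemo
{
    // Throws at the bottom of a call chain 'depth' frames deep.
    static void Recurse(int depth)
    {
        if (depth == 0) throw new InvalidOperationException("boom");
        Recurse(depth - 1);
    }

    static double Measure(int depth, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            try { Recurse(depth); }
            catch (InvalidOperationException) { /* swallow */ }
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds / iterations;
    }

    static void Main()
    {
        Console.WriteLine("depth 1:   " + Measure(1, 10000) + " ms/exception");
        Console.WriteLine("depth 100: " + Measure(100, 10000) + " ms/exception");
    }
}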
In my case, exceptions were very expensive. I rewrote this:
public BlockTemplate this[int x,int y, int z]
{
get
{
try
{
return Data.BlockTemplate[World[Center.X + x, Center.Y + y, Center.Z + z]];
}
catch(IndexOutOfRangeException e)
{
return Data.BlockTemplate[BlockType.Air];
}
}
}
Into this:
public BlockTemplate this[int x,int y, int z]
{
get
{
int ix = Center.X + x;
int iy = Center.Y + y;
int iz = Center.Z + z;
if (ix < 0 || ix >= World.GetLength(0)
|| iy < 0 || iy >= World.GetLength(1)
|| iz < 0 || iz >= World.GetLength(2))
return Data.BlockTemplate[BlockType.Air];
return Data.BlockTemplate[World[ix, iy, iz]];
}
}
And I noticed a good speedup: about 30 seconds saved. This function gets called at least 32,000 times at startup. The new code isn't as clear about its intention, but the cost savings were huge.
I did my own measurements to find out how serious the performance implications of exceptions were. I didn't try to measure the absolute time for throwing/catching an exception; I was mostly interested in how much slower a loop becomes if an exception is thrown on each pass. The measuring code looks like this:
for(; ; ) {
iValue = Level1(iValue);
lCounter += 1;
if(DateTime.Now >= sFinish)
break;
}
vs.
for(; ; ) {
try {
iValue = Level3Throw(iValue);
}
catch(InvalidOperationException) {
iValue += 3;
}
lCounter += 1;
if(DateTime.Now >= sFinish)
break;
}
The difference is a factor of 20: the second snippet runs 20 times slower.
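The Level1 and Level3Throw helpers are not shown here; the following reconstruction is my own assumption based on their names - a plain computation versus a call three frames deep that ends in a throw:
static int Level1(int i)
{
    return i + 1; // plain computation, no exception
}

static int Level3Throw(int i)
{
    return Level2Throw(i); // two more frames before the throw
}

static int Level2Throw(int i)
{
    return Level1Throw(i);
}

static int Level1Throw(int i)
{
    throw new InvalidOperationException();
}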
Bare-bones exception objects in C# are fairly lightweight; it's usually the ability to encapsulate an InnerException that makes an exception heavy, when the object tree becomes too deep.
As for a definitive report, I'm not aware of any, although a cursory dotTrace profile (or any other profiler) for memory consumption and speed will be fairly easy to do.
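As a small illustration of that point, each wrapper keeps the whole inner chain (including its captured stack traces) alive; LoadData and DataLoadException below are hypothetical names:
try
{
    LoadData();
}
catch (IOException io)
{
    // io travels along as InnerException; wrap again higher up,
    // and the object tree grows another level.
    throw new DataLoadException("Could not load data", io);
}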
Just to give my personal experience:
I'm working on a program that parses JSON files and extracts data from them, with Newtonsoft (Json.NET).
I rewrote this:
Option 1, with exceptions
try
{
name = rawPropWithChildren.Value["title"].ToString();
}
catch(System.NullReferenceException)
{
name = rawPropWithChildren.Name;
}
To this:
Option 2, without exceptions
if(rawPropWithChildren.Value["title"] == null)
{
name = rawPropWithChildren.Name;
}
else
{
name = rawPropWithChildren.Value["title"].ToString();
}
Of course, you don't really have the context to judge it, but here are my results (in debug mode):
Option 1, with exceptions.
38.50 seconds
Option 2, without exceptions.
06.48 seconds
To give a little bit of context, I'm working with thousands of JSON properties that can be null. Exceptions were being thrown far too often - perhaps during 15% of the execution time. That figure isn't precise, but they were thrown far too many times.
I wanted to fix this, so I changed my code, and at first I did not understand why execution was so much faster: it was because of my poor exception handling.
So, what I've learned from this: I should use exceptions only in particular cases, for things that can't be tested with a simple conditional statement, and they should be thrown as rarely as possible.
This is kind of a random story for you, but I will definitely think twice before using exceptions in my code from now on!
The performance hit with exceptions seems to come at the point of generating the exception object (albeit too small to cause any concern 90% of the time). The recommendation, therefore, is to profile your code: if exceptions are causing a performance hit, write a new high-performance method that does not use exceptions. (An example that comes to mind is TryParse, introduced to overcome the performance issues of Parse, which throws on invalid input.)
That said, exceptions do not cause significant performance hits in most situations, so the Microsoft design guideline is to report failures by throwing exceptions.
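For reference, this is the pattern the guideline describes, using the framework's own Parse/TryParse pair; the Try variant reports failure through its return value instead of throwing:
string input = "not a number";

// Parse: throws FormatException on malformed input.
int slow;
try { slow = int.Parse(input); }
catch (FormatException) { slow = 0; }

// TryParse: no exception, just a bool.
int fast = int.TryParse(input, out var parsed) ? parsed : 0;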
TL;DR:
The answer should always start by answering "expensive compared to what?"
Exceptions are likely orders of magnitude faster than any connected service or data call, so it's unlikely that avoiding them provides a tangible benefit that outweighs the improved information and control flow they give you.
Throwing an exception can be measured in MICROseconds (but it depends on stack depth):
(Chart omitted here.) The numbers come from the article ".Net exceptions performance", which also posts the testing code.
Do you generally get what you pay for? Most of the time yes.
Longer Explanation:
I'm quite interested in the origins of this question. As far as I can tell, it is residual distaste for the marginally useful exceptions of C++. .NET exceptions carry a wealth of information and allow for neat and tidy code without excessive checks for success and logging. I explain much of the benefit of exceptions in another answer.
In 20 years of programming, I've never removed a throw or a catch to make something faster (not to say that I couldn't, just to say that there was lower-hanging fruit, and after that, nobody complained).
There is a separate question with competing answers: one catching an exception (the built-in method provided no "Try" variant) and one avoiding exceptions.
I decided to do a head-to-head performance comparison of the two. For a smaller number of columns the exception-avoiding version was faster, but the exception version scaled better and eventually outperformed it:
The LINQPad code for that test is below (including the graph rendering).
The point here, though, is that the idea that "exceptions are slow" raises the question: slower than what? If a deep-stack exception costs 500 microseconds, does it matter when it occurs in response to a unique-constraint violation that took the database 3,000 microseconds to detect? In any case, this demonstrates that avoiding exceptions wholesale for performance reasons will not necessarily yield more performant code.
Code for performance test:
void Main()
{
var loopResults = new List<Results>();
var exceptionResults = new List<Results>();
var totalRuns = 10000;
for (var colCount = 1; colCount < 20; colCount++)
{
using (var conn = new SqlConnection(@"Data Source=(localdb)\MSSQLLocalDb;Initial Catalog=master;Integrated Security=True;"))
{
conn.Open();
//create a dummy table where we can control the total columns
var columns = String.Join(",",
(new int[colCount]).Select((item, i) => $"'{i}' as col{i}")
);
var sql = $"select {columns} into #dummyTable";
var cmd = new SqlCommand(sql,conn);
cmd.ExecuteNonQuery();
var cmd2 = new SqlCommand("select * from #dummyTable", conn);
var reader = cmd2.ExecuteReader();
reader.Read();
Func<Func<IDataRecord, String, Boolean>, List<Results>> test = funcToTest =>
{
var results = new List<Results>();
Random r = new Random();
for (var faultRate = 0.1; faultRate <= 0.5; faultRate += 0.1)
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
var faultCount=0;
for (var testRun = 0; testRun < totalRuns; testRun++)
{
if (r.NextDouble() <= faultRate)
{
faultCount++;
if(funcToTest(reader, "colDNE"))
throw new ApplicationException("Should have thrown false");
}
else
{
for (var col = 0; col < colCount; col++)
{
if(!funcToTest(reader, $"col{col}"))
throw new ApplicationException("Should have thrown true");
}
}
}
stopwatch.Stop();
results.Add(new UserQuery.Results{
ColumnCount = colCount,
TargetNotFoundRate = faultRate,
NotFoundRate = faultCount * 1.0f / totalRuns,
TotalTime=stopwatch.Elapsed
});
}
return results;
};
loopResults.AddRange(test(HasColumnLoop));
exceptionResults.AddRange(test(HasColumnException));
}
}
"Loop".Dump();
loopResults.Dump();
"Exception".Dump();
exceptionResults.Dump();
var combinedResults = loopResults.Join(exceptionResults,l => l.ResultKey, e=> e.ResultKey, (l, e) => new{ResultKey = l.ResultKey, LoopResult=l.TotalTime, ExceptionResult=e.TotalTime});
combinedResults.Dump();
combinedResults
.Chart(r => r.ResultKey, r => r.LoopResult.Milliseconds * 1.0 / totalRuns, LINQPad.Util.SeriesType.Line)
.AddYSeries(r => r.ExceptionResult.Milliseconds * 1.0 / totalRuns, LINQPad.Util.SeriesType.Line)
.Dump();
}
public static bool HasColumnLoop(IDataRecord dr, string columnName)
{
for (int i = 0; i < dr.FieldCount; i++)
{
if (dr.GetName(i).Equals(columnName, StringComparison.InvariantCultureIgnoreCase))
return true;
}
return false;
}
public static bool HasColumnException(IDataRecord r, string columnName)
{
try
{
return r.GetOrdinal(columnName) >= 0;
}
catch (IndexOutOfRangeException)
{
return false;
}
}
public class Results
{
public double NotFoundRate { get; set; }
public double TargetNotFoundRate { get; set; }
public int ColumnCount { get; set; }
public double ResultKey {get => ColumnCount + TargetNotFoundRate;}
public TimeSpan TotalTime { get; set; }
}
I recently measured C# exceptions (throw and catch) in a summation loop that threw an arithmetic overflow on every addition. The throw and catch of an arithmetic overflow took around 8.5 microseconds, i.e. roughly 117,000 exceptions per second, on a quad-core laptop.
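A minimal sketch of that kind of measurement (the loop count, helper names, and output format are my own):
using System;
using System.Diagnostics;

class OverflowBench
{
    static void Main()
    {
        const int iterations = 1_000_000;
        int acc = int.MaxValue;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            try
            {
                checked { acc += 1; } // overflows and throws on every pass
            }
            catch (OverflowException)
            {
                // recover and keep looping
            }
        }
        sw.Stop();
        Console.WriteLine($"{sw.Elapsed.TotalMilliseconds * 1000 / iterations:F2} µs per throw/catch");
    }
}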
Exceptions are expensive, but there is more to it when you want to choose between exceptions and return codes.
Historically speaking, the argument was: exceptions ensure that code is forced to handle the situation, whereas return codes can be ignored. I never favoured these arguments, as no programmer wants to ignore and break their code on purpose - and a good test team, or a well-written test case, will certainly not ignore return codes.
From a modern programming-practices point of view, exception management needs to be evaluated not only for its cost, but also for its viability.
First
Most front ends are disconnected from the API that throws the exception: for example, a mobile app consuming a REST API, or the same API serving an Angular-based web front end.
Either scenario will prefer return codes over exceptions.
Second
Nowadays, attackers routinely probe every web-facing utility. If they keep hammering your app's login API and the app keeps throwing exceptions, you will end up dealing with thousands of exceptions a day. Of course, many will say that a firewall will take care of such attacks. However, not everyone spends money on a dedicated firewall or an expensive anti-spam service; it is better if your code is prepared for these scenarios.
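A minimal sketch of what that preparation can look like (all names here are mine, not from any particular framework): a failed login is reported through a status value, so a flood of bad requests does not become a flood of throws:
public enum LoginResult { Success, InvalidCredentials, AccountLocked }

public LoginResult Login(string userName, string password)
{
    var user = _userStore.Find(userName);      // _userStore is a hypothetical repository
    if (user == null || !user.VerifyPassword(password))
        return LoginResult.InvalidCredentials; // cheap return, no stack unwinding
    if (user.IsLocked)
        return LoginResult.AccountLocked;
    return LoginResult.Success;
}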
Related
I am building software to evaluate many possible solutions, and I am trying to introduce parallel processing to speed up the calculations. My first attempt was to build a DataTable with each row being a solution to evaluate, but building the DataTable takes quite some time, and I run into memory issues when the number of possible solutions goes into the millions.
The problem that generates these solutions is structured as follows:
There is a range of dates for x events, which must occur in order. The solutions to evaluate could look as follows, with each row being a solution, the events being the columns, and the day numbers being the values.
Given 3 days (0 to 2) and three events:
0 0 0
0 0 1
0 0 2
0 1 1
0 1 2
0 2 2
1 1 1
1 1 2
1 2 2
2 2 2
My new plan was to use recursion and evaluate the solutions as I go, rather than building a solution set and then evaluating it.
for (int day = 0; day < maxdays; day++)
{
    List<int> mydays = new List<int>();
    mydays.Add(day);
    EvalEvent(0, day, mydays);
}

private void EvalEvent(int eventnum, int day, List<int> mydays)
{
    // events must be on the same day as, or after, previous events
    Parallel.For(day, maxdays, day2 =>
    {
        List<int> mydays2 = new List<int>();
        for (int a = 0; a < mydays.Count; a++)
        {
            mydays2.Add(mydays[a]);
        }
        mydays2.Add(day2);
        if (eventnum < eventcount - 1) // proceed to next event
        {
            EvalEvent(eventnum + 1, day2, mydays2);
        }
        else
        {
            EvalSolution(mydays2);
        }
    });
}
My question is whether this is actually an efficient use of parallel processing, or whether it will spawn too many threads and slow things down. Should the parallel loop only be applied to the last (or last few) values of eventnum, or is there a better way to approach the problem?
The old code that was requested is pretty much as follows:
private int daterange;
private int events;
private DataTable Options;

private void ScheduleIt()
{
    daterange = 10;
    events = 6;
    CreateSolutions();
    int best = GetBest();
}

private bool CreateSolutions()
{
    Options = new DataTable();
    Options.Columns.Add();
    for (int day1 = 0; day1 <= daterange; day1++)
    {
        Options.Rows.Add(day1);
    }
    for (int ev = 1; ev < events; ev++) // 'event' is a reserved word in C#
    {
        Options.Columns.Add();
        foreach (DataRow dr in Options.Rows)
        {
            dr[Options.Columns.Count - 1] = dr[Options.Columns.Count - 2];
        }
        int rows = Options.Rows.Count;
        for (int day1 = 1; day1 <= daterange; day1++)
        {
            for (int i = 0; i < rows; i++)
            {
                if (day1 > Convert.ToInt32(Options.Rows[i][Options.Columns.Count - 2]))
                {
                    try
                    {
                        Options.Rows.Add();
                        for (int col = 0; col < Options.Columns.Count - 1; col++)
                        {
                            Options.Rows[Options.Rows.Count - 1][col] = Options.Rows[i][col];
                        }
                        Options.Rows[Options.Rows.Count - 1][Options.Columns.Count - 1] = day1;
                    }
                    catch (Exception)
                    {
                        return false;
                    }
                }
            }
        }
    }
    return true;
}

private int GetBest()
{
    int bestopt = 0;
    double bestscore = 999999999;
    Parallel.For(0, Options.Rows.Count, opt =>
    {
        double score = 0;
        for (int i = 0; i < Options.Columns.Count; i++)
        {
            score += Convert.ToInt32(Options.Rows[opt][i]); // just a stand-in calc for a score
        }
        if (score < bestscore) // note: unsynchronized shared state
        {
            bestscore = score;
            bestopt = opt;
        }
    });
    return bestopt;
}
Even if done without errors, this cannot significantly speed up your solution.
It looks like each level of recursion starts multiple (say up to k) next-level calls, across (say) n levels. That essentially means the code is O(k^n), which grows very fast; for example, k = 10 days and n = 6 events already gives on the order of a million evaluations. Non-algorithmic speedups of an O(k^n) solution are essentially useless (unless both k and n are very small). In particular, running the code in parallel only gives you a constant factor of speedup (roughly the number of threads your CPUs support), which does not change the complexity at all.
Indeed, creating an exponentially large number of requests for new threads will likely cause more problems by itself, just from managing the threads.
In addition to not significantly improving performance, parallel code is harder to write, as it needs proper synchronization or clever data partitioning - neither of which seems to be present in your case.
Parallelization works best when the workload is bulky and balanced. Ideally you would like your work split into as many independent partitions as your machine has logical processors, with all partitions approximately the same size. That way all available processors work at maximum efficiency for approximately the same duration, and you get the results in the shortest time possible.
Of course, you should start with a working, bug-free serial implementation, and then think about ways to partition your work. The easiest way is usually not optimal. For example, an easy path is to convert your work to a LINQ query and then parallelize it with AsParallel() (making it PLINQ). This usually results in too granular a partitioning, which introduces too much overhead. If you can't find ways to improve it, you can then go the way of Parallel.For or Parallel.ForEach, which is a bit more complex; a sketch contrasting the two appears after the iterator example below.
A LINQ implementation should probably start by creating an iterator that produces all your units of work (Events or Solutions, it's not very clear to me).
public static IEnumerable<Solution> GetAllSolutions()
{
for (int day = 0; day < 3; day++)
{
for (int ev = 0; ev < 3; ev++)
{
yield return new Solution(); // ???
}
}
}
It will certainly be helpful if you have created concrete classes to represent the entities you are dealing with.
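Building on that iterator, here is a minimal sketch of my own contrasting the two routes mentioned above; Evaluate is a hypothetical scoring function standing in for the real work:
using System;
using System.Linq;
using System.Threading.Tasks;

// PLINQ: one line, but the partitioning may be too fine-grained.
public static double FindBestScorePlinq() =>
    GetAllSolutions().AsParallel().Min(s => Evaluate(s));

// Parallel.ForEach with a thread-local minimum: coarser partitions, less overhead.
public static double FindBestScoreParallelFor()
{
    double best = double.MaxValue;
    object gate = new object();
    Parallel.ForEach(
        GetAllSolutions(),
        () => double.MaxValue,                                    // per-thread init
        (s, state, localMin) => Math.Min(localMin, Evaluate(s)),  // per-item body
        localMin => { lock (gate) { if (localMin < best) best = localMin; } });
    return best;
}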
I stumbled upon this effect while debugging an application - see the repro code below.
It gives me the following results:
Data init, count: 100,000 x 10,000, 4.6133365 secs
Perf test 0 (False): 5.8289565 secs
Perf test 0 (True): 5.8485172 secs
Perf test 1 (False): 32.3222312 secs
Perf test 1 (True): 217.0089923 secs
As far as I understand, array store operations shouldn't normally have such a drastic performance effect (32 vs 217 seconds). I wonder whether anyone understands what effects are at play here?
UPD: extra test added. Perf test 0 shows the results as expected; Perf test 1 shows the performance anomaly.
class Program
{
static void Main(string[] args)
{
var data = InitData();
TestPerf0(data, false);
TestPerf0(data, true);
TestPerf1(data, false);
TestPerf1(data, true);
if (Debugger.IsAttached)
Console.ReadKey();
}
private static string[] InitData()
{
var watch = Stopwatch.StartNew();
var data = new string[100_000];
var maxString = 10_000;
for (int i = 0; i < data.Length; i++)
{
data[i] = new string('-', maxString);
}
watch.Stop();
Console.WriteLine($"Data init, count: {data.Length:n0} x {maxString:n0}, {watch.Elapsed.TotalSeconds} secs");
return data;
}
private static void TestPerf1(string[] vals, bool testStore)
{
var watch = Stopwatch.StartNew();
var counters = new int[char.MaxValue];
int tmp = 0;
for (var j = 0; ; j++)
{
var allEmpty = true;
for (var i = 0; i < vals.Length; i++)
{
var val = vals[i];
if (j < val.Length)
{
allEmpty = false;
var ch = val[j];
var count = counters[ch];
tmp ^= count;
if (testStore)
counters[ch] = count + 1;
}
}
if (allEmpty)
break;
}
// prevent the compiler from optimizing away our computations
tmp.GetHashCode();
watch.Stop();
Console.WriteLine($"Perf test 1 ({testStore}): {watch.Elapsed.TotalSeconds} secs");
}
private static void TestPerf0(string[] vals, bool testStore)
{
var watch = Stopwatch.StartNew();
var counters = new int[65536];
int tmp = 0;
for (var i = 0; i < 1_000_000_000; i++)
{
var j = i % counters.Length;
var count = counters[j];
tmp ^= count;
if (testStore)
counters[j] = count + 1;
}
// prevent the compiler from optimizing away our computations
tmp.GetHashCode();
watch.Stop();
Console.WriteLine($"Perf test 0 ({testStore}): {watch.Elapsed.TotalSeconds} secs");
}
}
After testing your code for quite some time, my best guess is, as was already said in the comments, that you experience a lot of cache misses with your current solution. The line:
if (testStore)
counters[ch] = count + 1;
might force the CPU to load a whole new cache line into memory and displace the current content. There might also be some problems with branch prediction in this scenario. This is highly hardware-dependent, and I'm not aware of a really good way to test it in any interpreted language (it's also quite hard in compiled languages, where the hardware is fixed and well known).
Going through the disassembly, you can clearly see that the store also introduces a whole bunch of new instructions, which might increase the aforementioned problems further.
Overall, I'd advise you to rewrite the complete algorithm, as there are better places to improve performance than this one little assignment. These are the optimizations I'd suggest (they also improve readability):
Invert your i and j loops. This will remove the allEmpty variable completely.
Cast ch to int with var ch = (int) val[j]; because you ALWAYS use it as an index.
Think about why the store might be a problem at all. You introduce a new instruction, and any instruction comes at a cost. If this is really the primary hot spot of your code, you can start to think about better solutions (remember: "premature optimization is the root of all evil").
As the name testStore suggests, this is a test setting. Is it important at all? If not, just remove it.
EDIT: Why did I suggest inverting the loops? With this little rearrangement of the code:
foreach (var val in vals)
{
foreach (int ch in val)
{
var count = counters[ch];
tmp ^= count;
if (testStore)
{
counters[ch] = count + 1;
}
}
}
I went from the original runtimes to runtimes that were orders of magnitude lower (the before/after timing screenshots are omitted here). Do you still think it's not worth a try? I saved several orders of magnitude and nearly eliminated the effect of the if (to be clear, all optimizations were disabled in the settings). If there are special reasons not to do this, you should tell us more about the context in which this code will be used.
EDIT 2: the in-depth answer. My best explanation for why this problem occurs is that you keep displacing your cache lines. In these lines:
for (var i = 0; i < vals.Length; i++)
{
var val = vals[i];
you load a really massive dataset, far bigger than a cache line itself, so it will most likely have to be loaded fresh from memory into a new cache line on every iteration (displacing the old content). This is also known as cache thrashing, if I remember correctly. Thanks to @mjwills for pointing this out in his comment.
In my suggested solution, on the other hand, the content of a cache line can stay alive as long as the inner loop does not exceed its boundaries (which happens a lot less often with this direction of memory access).
This is the closest explanation for why my code runs that much faster, and it also supports the assumption that you have serious caching problems with your code.
I have a case which I know will happen, but only very rarely - say, once in every ten thousand runs of the code.
I can check for this case with a simple if, but that if will run many times without any effect.
On the other hand, I could place the code in a try-catch block and, when that special case happens, do whatever is needed to recover.
The question is which one is better. I know that, generally speaking, try-catch should not be used for known cases because of the overhead, and that application logic should not rely on the catch path - but running an if many times has its own performance cost. I have tested this using this small test code:
static void Main(string[] args)
{
Stopwatch sc = new Stopwatch();
var list = new List<int>();
var rnd = new Random();
for (int i = 0; i < 100000000; i++)
{
list.Add(rnd.Next());
}
sc.Start();
DoWithIf(list);
sc.Stop();
Console.WriteLine($"Done with IFs in {sc.ElapsedMilliseconds} milliseconds");
sc.Restart();
DoWithTryCatch(list);
sc.Stop();
Console.WriteLine($"Done with TRY-CATCH in {sc.ElapsedMilliseconds} milliseconds");
Console.ReadKey();
}
private static int[] DoWithTryCatch(List<int> list)
{
var res = new int[list.Count - 1]; // deliberately one element too small, so the final copy throws once
try
{
for (int i = 0; i < list.Count; i++)
{
res[i] = list[i];
}
return res;
}
catch
{
return res;
}
}
private static int[] DoWithIf(List<int> list)
{
var res = new int[list.Count - 1];
for (int i = 0; i < list.Count; i++)
{
if (i < res.Length)
res[i] = list[i];
}
return res;
}
This code simply copies a lot of numbers into an array that is not big enough. On my machine, checking the array bounds each time takes around 210 milliseconds, while the try-catch version, which hits the catch once, runs in around 190 milliseconds.
Also, if you think it depends on the case: my case is that I receive push notifications in an app and check whether I already have the topic of the message. If not, I fetch and store the topic information for subsequent messages. There are many messages in just a few topics.
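For that specific scenario, a minimal sketch (TopicInfo and FetchTopicInfo are my hypothetical names) that avoids both the repeated if and the try-catch: pay the lookup cost once per topic, not once per message:
using System.Collections.Concurrent;

private static readonly ConcurrentDictionary<string, TopicInfo> _topics =
    new ConcurrentDictionary<string, TopicInfo>();

private TopicInfo GetTopic(string topicId)
{
    // First message of a topic fetches and caches; later messages hit the cache.
    return _topics.GetOrAdd(topicId, id => FetchTopicInfo(id));
}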
So, it would be accurate to say that in your test, the if option was slower than the try...catch option by 20 milliseconds, over a loop of 100,000,000 iterations.
That translates to 20 / 100,000,000, i.e. 0.0000002 milliseconds per iteration.
Do you really think that kind of nano-optimization is worth writing code that goes against proper design standards?
Exceptions are for exceptional cases, the things that you can't control or can't test in advance - for instance, when you are reading data from a database and the connection terminates in the middle - stuff like that.
Using exceptions for things that can be easily tested with simple code - well, that's just plain wrong.
If, for instance, you would have demonstrated a meaningful performance difference between these two options then perhaps you could justify using try...catch instead of if - but that's clearly not the case here.
So, to summarize - use if, not try...catch.
You should design your code for clarity, not for performance.
Write code that conveys the algorithm it is implementing in the clearest way possible.
Set performance goals and measure your code's performance against them.
If your code doesn't measure up to your performance goals, find the bottlenecks and treat them.
Don't go wasting your time on nano-optimizations when you design the code.
In your case, you have somehow missed the obvious optimization: if you worry that evaluating an if 100,000,000 times is too much... just don't:
private static int[] DoWithIf(List<int> list)
{
var res = new int[list.Count - 1];
var bounds = Math.Min(res.Length, list.Count);
for (int i = 0; i < bounds; i++)
{
res[i] = list[i];
}
return res;
}
So I know this is only a test case, but the answer is: optimize when you need it, and for what you need it. If something in a loop is supposedly costly, try to move it out of the loop. Optimize based on logic, not on compiler constructs. And if you are down to optimizing compiler constructs, you should probably not be coding in a managed and/or high-level language anyway.
I have an idea of how I can improve the performance with dynamic code generation, but I'm not sure which is the best way to approach this problem.
Suppose I have a class
class Calculator
{
int Value1;
int Value2;
//..........
int ValueN;
void DoCalc()
{
if (Value1 > 0)
{
DoValue1RelatedStuff();
}
if (Value2 > 0)
{
DoValue2RelatedStuff();
}
//....
//....
//....
if (ValueN > 0)
{
DoValueNRelatedStuff();
}
}
}
The DoCalc method is at the lowest level and is called many times during the calculation. Another important aspect is that the ValueN fields are set only at the beginning and do not change during the calculation. So many of the ifs in the DoCalc method are unnecessary, as many of the ValueN are 0. I was hoping that dynamic code generation could help to improve performance.
For instance if I create a method
void DoCalc_Specific()
{
const int Value1 = 0;
const int Value2 = 0;
const int ValueN = 1;
if (Value1 > 0)
{
DoValue1RelatedStuff();
}
if (Value2 > 0)
{
DoValue2RelatedStuff();
}
....
....
....
if (ValueN > 0)
{
DoValueNRelatedStuff();
}
}
and compile it with optimizations switched on, the C# compiler is smart enough to keep only the necessary code. So I would like to create such a method at run time, based on the values of ValueN, and use the generated method during the calculations.
I guess I could use expression trees for that, but expression trees work only with simple lambda functions, so I cannot use things like if, while, etc. inside the function body. So in this case I would need to restructure the method in an appropriate way.
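For what it's worth, since .NET 4 the Expression API itself does support blocks and conditionals (Expression.Block, Expression.IfThen), even though the C# lambda syntax won't produce them. Below is a minimal sketch of the idea, with all names my own: the branches whose guards are known to be false are simply never emitted, so there is nothing left to optimize away.
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

static class CalcCompiler
{
    // Builds a delegate containing only the branches whose guard is true
    // at build time; values[i] plays the role of ValueI.
    public static Action Build(int[] values, Action[] relatedStuff)
    {
        var steps = new List<Expression>();
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] > 0)
                steps.Add(Expression.Invoke(Expression.Constant(relatedStuff[i])));
        }
        Expression body = steps.Count > 0
            ? (Expression)Expression.Block(steps)
            : Expression.Empty();
        return Expression.Lambda<Action>(body).Compile();
    }
}
Compiling once at startup and reusing the delegate amortizes the one-time compilation cost over the many DoCalc calls.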
Another possibility is to create the necessary code as a string and compile it dynamically. But it would be much better for me if I could take the existing method and modify it accordingly.
There's also Reflection.Emit, but I don't want to stick with it as it would be very difficult to maintain.
BTW. I'm not restricted to C#. So I'm open to suggestions of programming languages that are best suited for this kind of problem. Except for LISP for a couple of reasons.
One important clarification: DoValue1RelatedStuff() is not a method call in my algorithm. It's just some formula-based calculation, and it's pretty fast. I should have written it like this:
if (Value1 > 0)
{
// Do Value1 Related Stuff
}
I have run some performance tests, and I can see that with two ifs, one of which is disabled, the optimized method is about twice as fast as the one with the redundant if.
Here's the code I used for testing:
public class Program
{
static void Main(string[] args)
{
int x = 0, y = 2;
var if_st = DateTime.Now.Ticks;
for (var i = 0; i < 10000000; i++)
{
WithIf(x, y);
}
var if_et = DateTime.Now.Ticks - if_st;
Console.WriteLine(if_et.ToString());
var noif_st = DateTime.Now.Ticks;
for (var i = 0; i < 10000000; i++)
{
Without(x, y);
}
var noif_et = DateTime.Now.Ticks - noif_st;
Console.WriteLine(noif_et.ToString());
Console.ReadLine();
}
static double WithIf(int x, int y)
{
var result = 0.0;
for (var i = 0; i < 100; i++)
{
if (x > 0)
{
result += x * 0.01;
}
if (y > 0)
{
result += y * 0.01;
}
}
return result;
}
static double Without(int x, int y)
{
var result = 0.0;
for (var i = 0; i < 100; i++)
{
result += y * 0.01;
}
return result;
}
}
I would usually not even think about such an optimization. How much work does DoValueXRelatedStuff() do? More than 10 to 50 processor cycles? Yes? That means you are going to build quite a complex system to save less than 10% execution time (and even that seems quite optimistic to me). This can easily go down to less than 1%.
Is there no room for other optimizations? Better algorithms? And do you really need to eliminate single branches that take only a single processor cycle (if the branch prediction is correct)? Yes? Shouldn't you then think about writing your code in assembler or something else more machine-specific, instead of using .NET?
Could you give the order of N, the complexity of a typical method, and the ratio of expressions usually evaluating to true?
It would surprise me to find a scenario where the overhead of evaluating the if statements is large enough to be worth the effort of dynamically emitting code.
Modern CPU's support branch prediction and branch predication, which makes the overhead for branches in small segments of code approach zero.
Have you tried benchmarking two hand-coded versions of the code: one that keeps all the if statements in place but supplies zero values for most of them, and one with all of those same if branches removed?
If you are really into code optimisation: before you do anything, run the profiler! It will show you where the bottleneck is and which areas are worth optimising.
Also, if the language choice is not limited (except for LISP), then nothing will beat assembler in terms of performance ;)
I remember achieving some performance magic by rewriting some inner functions (like the one you have) in assembler.
Before you do anything, do you actually have a problem? That is, does it run long enough to bother you?
If so, find out what is actually taking the time, not what you guess. The quick, dirty, and highly effective method I use to see where the time goes is random pausing: break into the program under a debugger a few times and look at the call stacks.
Now, you are talking about interpreting versus compiling. Interpreted code is typically one to two orders of magnitude slower than compiled code, because interpreters continually figure out what to do next and then forget it, while compiled code just knows.
If you are in that situation, it may make sense to pay the price of translation to get the speed of compiled code.