I have a program that will write a series of files in a loop. The filename is constructed using a parameter from an object supplied to the method.
ANTS Performance Profiler says this is dog slow and I'm not sure why:
public string CreateFilename(MyObject obj)
{
return "sometext-" + obj.Name + ".txt";
}
Is there a more performant way of doing this? The method is hit thousands of times and I don't know of a good way outside of having a discrete method for this purpose since the input objects are out of my control and regularly change.
The compiler will optimize your two concats into one call to:
String.Concat("sometext-", obj.Name, ".txt")
There is no faster way to do this.
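A quick way to convince yourself of that lowering (a minimal sketch; the local `name` stands in for `obj.Name`):

```csharp
using System;

class ConcatDemo
{
    static void Main()
    {
        string name = "MyName"; // stand-in for obj.Name

        // The C# compiler lowers the chained + into a single String.Concat
        // call, so both lines below produce the same result (and the same IL).
        string viaOperator = "sometext-" + name + ".txt";
        string viaConcat = String.Concat("sometext-", name, ".txt");

        Console.WriteLine(viaOperator == viaConcat); // True
    }
}
```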
If you instead compute the filename within the class itself, it will run much faster, in exchange for decreased performance when modifying the object. Mind you, I'd be very concerned if computing a filename was a bottleneck; writing to the file is way slower than coming up with its name.
See code samples below. When I benchmarked them with optimizations on (in LINQPad 5), Test2 ran about 15x faster than Test1. Among other things, Test2 doesn't constantly generate/discard tiny string objects.
void Main()
{
Test1();
Test1();
Test1();
Test2();
Test2();
Test2();
}
void Test1()
{
System.Diagnostics.Stopwatch sw = new Stopwatch();
MyObject1 mo = new MyObject1 { Name = "MyName" };
sw.Start();
long x = 0;
for (int i = 0; i < 10000000; ++i)
{
x += CreateFileName(mo).Length;
}
Console.WriteLine(x); //Sanity Check, prevent clever compiler optimizations
sw.ElapsedMilliseconds.Dump("Test1");
}
public string CreateFileName(MyObject1 obj)
{
return "sometext-" + obj.Name + ".txt";
}
void Test2()
{
System.Diagnostics.Stopwatch sw = new Stopwatch();
MyObject2 mo = new MyObject2 { Name = "MyName" };
sw.Start();
long x = 0;
for (int i = 0; i < 10000000; ++i)
{
x += mo.FileName.Length;
}
Console.WriteLine(x); //Sanity Check, prevent clever compiler optimizations
sw.ElapsedMilliseconds.Dump("Test2");
}
public class MyObject1
{
public string Name;
}
public class MyObject2
{
public string FileName { get; private set;}
private string _name;
public string Name
{
get
{
return _name;
}
set
{
_name=value;
FileName = "sometext-" + _name + ".txt";
}
}
}
I also tested adding memoization to CreateFileName, but it barely improved performance over Test1, and it couldn't possibly beat out Test2, since it performs the equivalent steps with additional overhead for hash lookups.
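For reference, the memoized variant I tried looked roughly like this (the cache layout and names here are a reconstruction, not the exact benchmark code):

```csharp
using System;
using System.Collections.Generic;

public class MyObject1
{
    public string Name;
}

class MemoDemo
{
    // Cache keyed on the object's Name; this is the "additional overhead
    // for hash lookups" mentioned above.
    static readonly Dictionary<string, string> cache = new Dictionary<string, string>();

    public static string CreateFileName(MyObject1 obj)
    {
        string result;
        if (!cache.TryGetValue(obj.Name, out result))
        {
            result = "sometext-" + obj.Name + ".txt";
            cache[obj.Name] = result;
        }
        return result;
    }

    static void Main()
    {
        var mo = new MyObject1 { Name = "MyName" };
        Console.WriteLine(CreateFileName(mo)); // sometext-MyName.txt
    }
}
```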
Related
I have something like this:
List<string> listUser = new List<string>();
listUser.Add("user1");
listUser.Add("user2");
listUser.Add("userhacker");
listUser.Add("user1other");
List<string> key_blacklist = new List<string>();
key_blacklist.Add("hacker");
key_blacklist.Add("other");
foreach (string user in listUser)
{
foreach (string key in key_blacklist)
{
if (user.Contains(key))
{
// remove it in listUser
}
}
}
The desired result for listUser is: user1, user2.
The problem is that if I have a huge listUser (more than 10 million entries) and a huge key_blacklist (100,000 entries), that code is very, very slow.
Is there any way to make it faster?
UPDATE: I found a new solution, described here:
http://cc.davelozinski.com/c-sharp/fastest-way-to-check-if-a-string-occurs-within-a-string
Hope that helps someone who ends up here! :)
If you don't have much control over how the list of users is constructed, you can at least test each item in the list in parallel, which on modern machines with multiple cores will speed up the checking a fair bit.
var filteredUsers = listUser.AsParallel().Where(
s =>
{
foreach (var key in key_blacklist)
{
if (s.Contains(key))
{
return false; //Not to be included
}
}
return true; //To be included, as no match with the blacklist
}).ToList();
Also - do you have to use .Contains? .Equals is going to be much, much quicker, because in almost all cases a non-match can be rejected when the hash codes differ, which comes down to an integer comparison. Super quick.
If you do need .Contains, you may want to think about restructuring the app. What do these strings in the list really represent? Separate sub-groups of users? Can you test each string, at the time it's added, for whether it represents a user on the blacklist?
UPDATE: In response to #Rawling's comment below - If you know that there is a finite set of usernames which have, say, "hacker" as a substring, that set would have to be pretty large before running a .Equals test of each username against a candidate would be slower than running .Contains on the candidate. This is because HashCode is really quick.
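To illustrate the restructuring idea: if the blacklist can be expressed as exact names rather than substrings (an assumption; the question's keys are substrings), a HashSet makes each check a single hash lookup:

```csharp
using System;
using System.Collections.Generic;

class BlacklistDemo
{
    static void Main()
    {
        // Exact-match blacklist: O(1) lookup per user instead of scanning
        // every key with .Contains. The names below are adapted from the
        // question's sample data for illustration.
        var blacklist = new HashSet<string> { "userhacker", "user1other" };

        var users = new List<string> { "user1", "user2", "userhacker", "user1other" };
        users.RemoveAll(u => blacklist.Contains(u));

        Console.WriteLine(string.Join(", ", users)); // user1, user2
    }
}
```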
If you are using Entity Framework or LINQ to SQL, then building the query in LINQ and sending it to the server can improve performance.
Then, instead of removing the items, you actually query for the items that fulfil the requirements, i.e. users whose name doesn't contain a banned expression:
listUser.Where(u => !key_blacklist.Any(u.Contains)).ToList();
A possible solution is to use a tree-like data structure.
The basic idea is to have the blacklisted words organised like this:
+ h
| + ha
| + hac
| - hacker
| - [other words beginning with hac]
|
+ f
| + fu
| + fuk
| - fukoff
| - [other words beginning with fuk]
Then, when you check for blacklisted words, you avoid searching the whole list of words beginning with "hac" if you find out that your user string does not even contain "h".
In the example I provided, with your sample data, this of course makes no difference, but with real data sets it should significantly reduce the number of Contains calls, since you don't check against the full list of blacklisted words every time.
Here is a code example (please note that the code is pretty rough; it is just to illustrate the idea):
using System;
using System.Collections.Generic;
using System.Linq;
class Program {
class Blacklist {
public string Start;
public int Level;
const int MaxLevel = 3;
public Dictionary<string, Blacklist> SubBlacklists = new Dictionary<string, Blacklist>();
public List<string> BlacklistedWords = new List<string>();
public Blacklist() {
Start = string.Empty;
Level = 0;
}
Blacklist(string start, int level) {
Start = start;
Level = level;
}
public void AddBlacklistedWord(string word) {
if (word.Length > Level && Level < MaxLevel) {
string index = word.Substring(0, Level + 1);
Blacklist sublist = null;
if (!SubBlacklists.TryGetValue(index, out sublist)) {
sublist = new Blacklist(index, Level + 1);
SubBlacklists[index] = sublist;
}
sublist.AddBlacklistedWord(word);
} else {
BlacklistedWords.Add(word);
}
}
public bool ContainsBlacklistedWord(string wordToCheck) {
if (wordToCheck.Length > Level && Level < MaxLevel) {
foreach (var sublist in SubBlacklists.Values) {
if (wordToCheck.Contains(sublist.Start)) {
return sublist.ContainsBlacklistedWord(wordToCheck);
}
}
}
return BlacklistedWords.Any(x => wordToCheck.Contains(x));
}
}
static void Main(string[] args) {
List<string> listUser = new List<string>();
listUser.Add("user1");
listUser.Add("user2");
listUser.Add("userhacker");
listUser.Add("userfukoff1");
Blacklist blacklist = new Blacklist();
blacklist.AddBlacklistedWord("hacker");
blacklist.AddBlacklistedWord("fukoff");
foreach (string user in listUser) {
if (blacklist.ContainsBlacklistedWord(user)) {
Console.WriteLine("Contains blacklisted word: {0}", user);
}
}
}
}
You are using the wrong data structure. If you have a lot of data, you should be using either HashSet<T> or SortedSet<T>. If you don't need the data sorted, go with HashSet<T>. Here is a program I wrote to demonstrate the time differences:
class Program
{
private static readonly Random random = new Random((int)DateTime.Now.Ticks);
static void Main(string[] args)
{
Console.WriteLine("Creating Lists...");
var stringList = new List<string>();
var hashList = new HashSet<string>();
var sortedList = new SortedSet<string>();
var searchWords1 = new string[3];
int ndx = 0;
for (int x = 0; x < 1000000; x++)
{
string str = RandomString(10);
if (x == 5 || x == 500000 || x == 999999)
{
str = "Z" + str;
searchWords1[ndx] = str;
ndx++;
}
stringList.Add(str);
hashList.Add(str);
sortedList.Add(str);
}
Console.WriteLine("Lists created!");
var sw = new Stopwatch();
sw.Start();
bool search1 = stringList.Contains(searchWords1[2]);
sw.Stop();
Console.WriteLine("List<T> {0} ==> {1}ms", search1, sw.ElapsedMilliseconds);
sw.Reset();
sw.Start();
search1 = hashList.Contains(searchWords1[2]);
sw.Stop();
Console.WriteLine("HashSet<T> {0} ==> {1}ms", search1, sw.ElapsedMilliseconds);
sw.Reset();
sw.Start();
search1 = sortedList.Contains(searchWords1[2]);
sw.Stop();
Console.WriteLine("SortedSet<T> {0} ==> {1}ms", search1, sw.ElapsedMilliseconds);
}
private static string RandomString(int size)
{
var builder = new StringBuilder();
char ch;
for (int i = 0; i < size; i++)
{
ch = Convert.ToChar(Convert.ToInt32(Math.Floor(26 * random.NextDouble() + 65)));
builder.Append(ch);
}
return builder.ToString();
}
}
On my machine, I got the following results:
Creating Lists...
Lists created!
List<T> True ==> 15ms
HashSet<T> True ==> 0ms
SortedSet<T> True ==> 0ms
As you can see, List<T> was extremely slow compared to HashSet<T> and SortedSet<T>. Those were almost instantaneous.
For the following scenario, is there any difference regarding thread-safety, results and performance between using MemoryBarrier
private SomeType field;
public SomeType Property
{
get
{
Thread.MemoryBarrier();
SomeType result = field;
Thread.MemoryBarrier();
return result;
}
set
{
Thread.MemoryBarrier();
field = value;
Thread.MemoryBarrier();
}
}
and lock statement (Monitor.Enter and Monitor.Exit)
private SomeType field;
private readonly object syncLock = new object();
public SomeType Property
{
get
{
lock (syncLock)
{
return field;
}
}
set
{
lock (syncLock)
{
field = value;
}
}
}
Because reference assignment is atomic, I think that in this scenario we don't need any locking mechanism.
Performance
The MemoryBarrier version is about 2x faster than the lock implementation in Release builds. Here are my test results:
Lock
Normally: 5397 ms
Passed as interface: 5431 ms
Double Barrier
Normally: 2786 ms
Passed as interface: 3754 ms
volatile
Normally: 250 ms
Passed as interface: 668 ms
Volatile Read/Write
Normally: 253 ms
Passed as interface: 697 ms
ReaderWriterLockSlim
Normally: 9272 ms
Passed as interface: 10040 ms
Single Barrier: freshness of Property
Normally: 1491 ms
Passed as interface: 2510 ms
Single Barrier: other not reordering
Normally: 1477 ms
Passed as interface: 2275 ms
Here is how I tested it in LINQPad (with optimization set in Preferences):
void Main()
{
"Lock".Dump();
string temp;
var a = new A();
var watch = Stopwatch.StartNew();
for (int i = 0; i < 100000000; ++i)
{
temp = a.Property;
a.Property = temp;
}
Console.WriteLine("Normally: " + watch.ElapsedMilliseconds + " ms");
Test(a);
"Double Barrier".Dump();
var b = new B();
watch.Restart();
for (int i = 0; i < 100000000; ++i)
{
temp = b.Property;
b.Property = temp;
}
Console.WriteLine("Normally: " + watch.ElapsedMilliseconds + " ms");
Test(b);
"volatile".Dump();
var c = new C();
watch.Restart();
for (int i = 0; i < 100000000; ++i)
{
temp = c.Property;
c.Property = temp;
}
Console.WriteLine("Normally: " + watch.ElapsedMilliseconds + " ms");
Test(c);
"Volatile Read/Write".Dump();
var d = new D();
watch.Restart();
for (int i = 0; i < 100000000; ++i)
{
temp = d.Property;
d.Property = temp;
}
Console.WriteLine("Normally: " + watch.ElapsedMilliseconds + " ms");
Test(d);
"ReaderWriterLockSlim".Dump();
var e = new E();
watch.Restart();
for (int i = 0; i < 100000000; ++i)
{
temp = e.Property;
e.Property = temp;
}
Console.WriteLine("Normally: " + watch.ElapsedMilliseconds + " ms");
Test(e);
"Single Barrier: freshness of Property".Dump();
var f = new F();
watch.Restart();
for (int i = 0; i < 100000000; ++i)
{
temp = f.Property;
f.Property = temp;
}
Console.WriteLine("Normally: " + watch.ElapsedMilliseconds + " ms");
Test(f);
"Single Barrier: other not reordering".Dump();
var g = new G();
watch.Restart();
for (int i = 0; i < 100000000; ++i)
{
temp = g.Property;
g.Property = temp;
}
Console.WriteLine("Normally: " + watch.ElapsedMilliseconds + " ms");
Test(g);
}
void Test(I a)
{
string temp;
var watch = Stopwatch.StartNew();
for (int i = 0; i < 100000000; ++i)
{
temp = a.Property;
a.Property = temp;
}
Console.WriteLine("Passed as interface: " + watch.ElapsedMilliseconds + " ms\n");
}
interface I
{
string Property { get; set; }
}
class A : I
{
private string field;
private readonly object syncLock = new object();
public string Property
{
get
{
lock (syncLock)
{
return field;
}
}
set
{
lock (syncLock)
{
field = value;
}
}
}
}
class B : I
{
private string field;
public string Property
{
get
{
Thread.MemoryBarrier();
string result = field;
Thread.MemoryBarrier();
return result;
}
set
{
Thread.MemoryBarrier();
field = value;
Thread.MemoryBarrier();
}
}
}
class C : I
{
private volatile string field;
public string Property
{
get
{
return field;
}
set
{
field = value;
}
}
}
class D : I
{
private string field;
public string Property
{
get
{
return Volatile.Read(ref field);
}
set
{
Volatile.Write(ref field, value);
}
}
}
class E : I
{
private string field;
private ReaderWriterLockSlim locker = new ReaderWriterLockSlim();
public string Property
{
get
{
locker.EnterReadLock();
string result = field;
locker.ExitReadLock();
return result;
}
set
{
locker.EnterWriteLock();
field = value;
locker.ExitWriteLock();
}
}
}
class F : I
{
private string field;
public string Property
{
get
{
Thread.MemoryBarrier();
return field;
}
set
{
field = value;
Thread.MemoryBarrier();
}
}
}
class G : I
{
private string field;
public string Property
{
get
{
string result = field;
Thread.MemoryBarrier();
return result;
}
set
{
Thread.MemoryBarrier();
field = value;
}
}
}
is there any difference regarding thread-safety?
Both ensure that appropriate barriers are set up around the read and write.
result?
In both cases two threads can race to write a value. However, reads and writes cannot move forwards or backwards in time past either the lock or the full fences.
performance?
You've written the code both ways. Now run it. If you want to know which is faster, run it and find out! If you have two horses and you want to know which is faster, race them. Don't ask strangers on the Internet which horse they think is faster.
That said, a better technique is to set a performance goal, write the code to be clearly correct, and then test to see if you met your goal. If you did, don't waste your valuable time trying to optimize further code that is already fast enough; spend it optimizing something else that isn't fast enough.
A question you didn't ask:
What would you do?
I'd not write a multithreaded program, that's what I'd do. I'd use processes as my unit of concurrency if I had to.
If I had to write a multithreaded program then I would use the highest-level tool available. I'd use the Task Parallel Library, I'd use async-await, I'd use Lazy<T> and so on. I'd avoid shared memory; I'd treat threads as lightweight processes that returned a value asynchronously.
If I had to write a shared-memory multithreaded program then I would lock everything, all the time. We routinely write programs these days that fetch a billion bytes of video over a satellite link and send it to a phone. Twenty nanoseconds spent taking a lock isn't going to kill you.
I am not smart enough to try to write low-lock code, so I wouldn't do that at all. If I had to then I would use that low-lock code to build a higher-level abstraction and use that abstraction. Fortunately I don't have to because someone already has built the abstractions I need.
As long as the variable in question is one of the limited set of variables that can be fetched/set atomically (such as reference types), then yes, the two solutions apply the same thread-related constraints.
That said, I would honestly expect the MemoryBarrier solution to perform worse than a lock. Acquiring an uncontended lock is very fast; it has been optimized specifically for that case. A memory barrier, on the other hand, constrains not just access to that one variable, as a lock does, but all memory, which could easily have significant negative performance implications throughout other parts of the application. You would of course need to do some testing to be sure, and to test your real application: benchmarking these two in isolation won't reveal that the memory barrier forces the rest of the application's memory to be synchronized, not just this one variable.
There is no difference as far as thread safety goes. However, I would prefer:
private SomeType field
public SomeType Property
{
get
{
return Volatile.Read(ref field);
}
set
{
Volatile.Write(ref field, value);
}
}
Or,
private volatile SomeType field
public SomeType Property
{
get
{
return field;
}
set
{
field = value;
}
}
I believe Microsoft claims that generics are faster than plain polymorphism when dealing with reference types. However, the following simple test (64-bit, VS2012) would indicate otherwise. I typically get 10% faster stopwatch times using polymorphism. Am I misinterpreting the results?
public interface Base { Int64 Size { get; } }
public class Derived : Base { public Int64 Size { get { return 10; } } }
public class GenericProcessor<TT> where TT : Base
{
private Int64 sum;
public GenericProcessor(){ sum = 0; }
public void process(TT o){ sum += o.Size; }
public Int64 Sum { get { return sum; } }
}
public class PolymorphicProcessor
{
private Int64 sum;
public PolymorphicProcessor(){ sum = 0; }
public void process(Base o){ sum += o.Size; }
public Int64 Sum { get { return sum; } }
}
static void Main(string[] args)
{
var generic_processor = new GenericProcessor<Derived>();
var polymorphic_processor = new PolymorphicProcessor();
Stopwatch sw = new Stopwatch();
int N = 100000000;
var derived = new Derived();
sw.Start();
for (int i = 0; i < N; ++i) generic_processor.process(derived);
sw.Stop();
Console.WriteLine("Sum ="+generic_processor.Sum + " Generic performance = " + sw.ElapsedMilliseconds + " millisec");
sw.Restart();
sw.Start();
for (int i = 0; i < N; ++i) polymorphic_processor.process(derived);
sw.Stop();
Console.WriteLine("Sum ="+polymorphic_processor.Sum+ " Poly performance = " + sw.ElapsedMilliseconds + " millisec");
}
Even more surprising (and confusing) is that if I add a type cast to the polymorphic version of processor as follows, it then runs consistently ~20% faster than the generic version.
public void process(Base trade)
{
sum += ((Derived)trade).Size; // cast not needed - just an experiment
}
What's going on here? I understand generics can help avoid costly boxing and unboxing when dealing with primitive types, but I'm dealing strictly with reference types here.
Execute the test under .NET 4.5 x64 with Ctrl-F5 (without debugger). Also with N increased by 10x. That way the results reliably reproduce, no matter what order the tests are in.
With generics on ref types you still get the same vtable/interface lookup because there's just one compiled method for all ref types. There's no specialization for Derived. Performance of executing the callvirt should be the same based on this.
Furthermore, generic methods have a hidden method argument that is typeof(T) (because this allows you to actually write typeof(T) in generic code!). This is additional overhead explaining why the generic version is slower.
Why is the cast faster than the interface call? The cast is just a pointer compare and a perfectly predictable branch. After the cast the concrete type of the object is known, allowing for a faster call.
if (trade.GetType() != typeof(Derived)) throw;
Derived.Size(trade); //calling directly the concrete method, potentially inlining it
All of this is educated guessing. Validate by looking at the disassembly.
If you add the cast you get the following assembly:
My assembly skills are not enough to fully decode this. However:
#16: loads the vtable pointer of Derived
#22 and #25: the branch testing the vtable; this completes the cast
#32: the cast is done. Note that after this point there is no call; Size was inlined
#35: a lea implements the add
#39: store back to this.sum
The same trick works with the generic version (((Derived)(Base)o).Size).
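A sketch of what that looks like in context (the double cast via Base is needed because TT is not statically known to be Derived):

```csharp
using System;

public interface Base { Int64 Size { get; } }
public class Derived : Base { public Int64 Size { get { return 10; } } }

public class GenericProcessor<TT> where TT : Base
{
    private Int64 sum;
    public void process(TT o)
    {
        // TT is not statically known to be Derived, so the cast has to go
        // through (Base) first; after it, the JIT knows the concrete type
        // and can inline the Size getter, as in the polymorphic version.
        sum += ((Derived)(Base)o).Size;
    }
    public Int64 Sum { get { return sum; } }
}

class CastDemo
{
    static void Main()
    {
        var p = new GenericProcessor<Derived>();
        p.process(new Derived());
        Console.WriteLine(p.Sum); // 10
    }
}
```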
I believe Servy was correct: it is a problem with your test. I reversed the order of the tests (just a hunch):
internal class Program
{
public interface Base
{
Int64 Size { get; }
}
public class Derived : Base
{
public Int64 Size
{
get
{
return 10;
}
}
}
public class GenericProcessor<TT>
where TT : Base
{
private Int64 sum;
public GenericProcessor()
{
sum = 0;
}
public void process(TT o)
{
sum += o.Size;
}
public Int64 Sum
{
get
{
return sum;
}
}
}
public class PolymorphicProcessor
{
private Int64 sum;
public PolymorphicProcessor()
{
sum = 0;
}
public void process(Base o)
{
sum += o.Size;
}
public Int64 Sum
{
get
{
return sum;
}
}
}
private static void Main(string[] args)
{
var generic_processor = new GenericProcessor<Derived>();
var polymorphic_processor = new PolymorphicProcessor();
Stopwatch sw = new Stopwatch();
int N = 100000000;
var derived = new Derived();
sw.Start();
for (int i = 0; i < N; ++i) polymorphic_processor.process(derived);
sw.Stop();
Console.WriteLine(
"Sum =" + polymorphic_processor.Sum + " Poly performance = " + sw.ElapsedMilliseconds + " millisec");
sw.Restart();
sw.Start();
for (int i = 0; i < N; ++i) generic_processor.process(derived);
sw.Stop();
Console.WriteLine(
"Sum =" + generic_processor.Sum + " Generic performance = " + sw.ElapsedMilliseconds + " millisec");
Console.Read();
}
}
In this case the polymorphic version is slower in my tests, which shows that whichever test runs first is significantly slower than the second. It could be class loading on first use, preemption, who knows...
I just want to note that I am not arguing that generics are faster or as fast. I'm simply trying to prove that these kinds of tests don't make a case one way or the other.
In the following program, DummyMethod always prints 5. But if we use the commented code instead, we get different values (i.e. 1, 2, 3, 4). Can anybody please explain why this is happening?
delegate int Methodx(object obj);
static int DummyMethod(int i)
{
Console.WriteLine("In DummyMethod method i = " + i);
return i + 10;
}
static void Main(string[] args)
{
List<Methodx> methods = new List<Methodx>();
for (int i = 0; i < 5; ++i)
{
methods.Add(delegate(object obj) { return DummyMethod(i); });
}
//methods.Add(delegate(object obj) { return DummyMethod(1); });
//methods.Add(delegate(object obj) { return DummyMethod(2); });
//methods.Add(delegate(object obj) { return DummyMethod(3); });
//methods.Add(delegate(object obj) { return DummyMethod(4); });
foreach (var method in methods)
{
int c = method(null);
Console.WriteLine("In main method c = " + c);
}
}
Also if the following code is used, I get the desired result.
for (int i = 0; i < 5; ++i)
{
int j = i;
methods.Add(delegate(object obj) { return DummyMethod(j); });
}
The problem is that you're capturing the same variable i in every delegate - which by the end of the loop just has the value 5.
Instead, you want each delegate to capture a different variable, which means declaring a new variable in the loop:
for (int i = 0; i < 5; ++i)
{
int localCopy = i;
methods.Add(delegate(object obj) { return DummyMethod(localCopy); });
}
This is a pretty common "gotcha" - you can read a bit more about captured variables and closures in my closures article.
This article will probably help you understand what is happening (i.e. what a closure is): http://blogs.msdn.com/oldnewthing/archive/2006/08/02/686456.aspx
If you look at the code generated (using Reflector) you can see the difference:
private static void Method2()
{
List<Methodx> list = new List<Methodx>();
Methodx item = null;
<>c__DisplayClassa classa = new <>c__DisplayClassa();
classa.i = 0;
while (classa.i < 5)
{
if (item == null)
{
item = new Methodx(classa.<Method2>b__8);
}
list.Add(item);
classa.i++;
}
foreach (Methodx methodx2 in list)
{
Console.WriteLine("In main method c = " + methodx2(null));
}
}
When you use the initial code, the compiler creates a temporary class in the background; this class holds a reference to the "i" variable, so, as per Jon's answer, you only ever see its final value.
private sealed class <>c__DisplayClassa
{
// Fields
public int i;
// Methods
public <>c__DisplayClassa();
public int <Method2>b__8(object obj);
}
I really recommend looking at the code in Reflector to see what's going on; it's how I made sense of captured variables. Make sure you set the Optimization of the code to ".NET 1.0" in the Options menu, otherwise it'll hide all the behind-the-scenes stuff.
I think it is because the variable i is hoisted to the heap (it's a captured variable).
Take a look at this answer.
The docs for both DynamicInvoke and DynamicInvokeImpl say:
Dynamically invokes (late-bound) the
method represented by the current
delegate.
I notice that DynamicInvoke and DynamicInvokeImpl take an array of objects instead of a specific list of arguments (which is the late-bound part I'm guessing). But is that the only difference? And what is the difference between DynamicInvoke and DynamicInvokeImpl.
The main difference between calling it directly (which is shorthand for Invoke(...)) and using DynamicInvoke is performance; a factor of more than 700x by my measure (below).
With the direct/Invoke approach, the arguments are already pre-validated via the method signature, and the code already exists to pass those into the method directly (I would say "as IL", but I seem to recall that the runtime provides this directly, without any IL). With DynamicInvoke it needs to check them from the array via reflection (i.e. are they all appropriate for this call; do they need unboxing, etc); this is slow (if you are using it in a tight loop), and should be avoided where possible.
Example; results first (I increased the LOOP count from the previous edit, to give a sensible comparison):
Direct: 53ms
Invoke: 53ms
DynamicInvoke (re-use args): 37728ms
DynamicInvoke (per-call args): 39911ms
With code:
static void DoesNothing(int a, string b, float? c) { }
static void Main() {
Action<int, string, float?> method = DoesNothing;
int a = 23;
string b = "abc";
float? c = null;
const int LOOP = 5000000;
Stopwatch watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++) {
method(a, b, c);
}
watch.Stop();
Console.WriteLine("Direct: " + watch.ElapsedMilliseconds + "ms");
watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++) {
method.Invoke(a, b, c);
}
watch.Stop();
Console.WriteLine("Invoke: " + watch.ElapsedMilliseconds + "ms");
object[] args = new object[] { a, b, c };
watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++) {
method.DynamicInvoke(args);
}
watch.Stop();
Console.WriteLine("DynamicInvoke (re-use args): "
+ watch.ElapsedMilliseconds + "ms");
watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++) {
method.DynamicInvoke(a,b,c);
}
watch.Stop();
Console.WriteLine("DynamicInvoke (per-call args): "
+ watch.ElapsedMilliseconds + "ms");
}
Coincidentally I have found another difference.
If Invoke throws an exception it can be caught by the expected exception type.
However DynamicInvoke throws a TargetInvocationException. Here is a small demo:
using System;
using System.Collections.Generic;
namespace DynamicInvokeVsInvoke
{
public class StrategiesProvider
{
private readonly Dictionary<StrategyTypes, Action> strategies;
public StrategiesProvider()
{
strategies = new Dictionary<StrategyTypes, Action>
{
{StrategyTypes.NoWay, () => { throw new NotSupportedException(); }}
// more strategies...
};
}
public void CallStrategyWithDynamicInvoke(StrategyTypes strategyType)
{
strategies[strategyType].DynamicInvoke();
}
public void CallStrategyWithInvoke(StrategyTypes strategyType)
{
strategies[strategyType].Invoke();
}
}
public enum StrategyTypes
{
NoWay = 0,
ThisWay,
ThatWay
}
}
While the second test goes green, the first one faces a TargetInvocationException.
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using SharpTestsEx;
namespace DynamicInvokeVsInvoke.Tests
{
[TestClass]
public class DynamicInvokeVsInvokeTests
{
[TestMethod]
public void Call_strategy_with_dynamic_invoke_can_be_catched()
{
bool catched = false;
try
{
new StrategiesProvider().CallStrategyWithDynamicInvoke(StrategyTypes.NoWay);
}
catch(NotSupportedException exc)
{
/* Fails because the NotSupportedException is wrapped
* inside a TargetInvocationException! */
catched = true;
}
catched.Should().Be(true);
}
[TestMethod]
public void Call_strategy_with_invoke_can_be_catched()
{
bool catched = false;
try
{
new StrategiesProvider().CallStrategyWithInvoke(StrategyTypes.NoWay);
}
catch(NotSupportedException exc)
{
catched = true;
}
catched.Should().Be(true);
}
}
}
Really there is no functional difference between the two. If you pull up the implementation in Reflector, you'll notice that DynamicInvoke just calls DynamicInvokeImpl with the same set of arguments. No extra validation is done, and it's a non-virtual method, so there is no chance for its behavior to be changed by a derived class. DynamicInvokeImpl is a virtual method where all of the actual work is done.
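The shape described here, a thin non-virtual forwarder over a virtual worker, can be mocked up as follows (FakeDelegate and its members are hypothetical stand-ins for illustration; the real implementation lives in the BCL):

```csharp
using System;

// Mock of the forwarder/worker split described above. This is NOT the
// real Delegate source; it only illustrates why a derived class can
// change DynamicInvokeImpl's behavior but not DynamicInvoke's.
abstract class FakeDelegate
{
    // Non-virtual: derived classes cannot replace this method,
    // only what DynamicInvokeImpl does underneath it.
    public object DynamicInvoke(params object[] args)
    {
        return DynamicInvokeImpl(args);
    }

    protected virtual object DynamicInvokeImpl(object[] args)
    {
        return "base impl, " + args.Length + " arg(s)";
    }
}

class CustomDelegate : FakeDelegate
{
    protected override object DynamicInvokeImpl(object[] args)
    {
        return "derived impl, " + args.Length + " arg(s)";
    }
}

class Demo
{
    static void Main()
    {
        FakeDelegate d = new CustomDelegate();
        // The call goes through the non-virtual forwarder into the override.
        Console.WriteLine(d.DynamicInvoke(1, 2, 3)); // derived impl, 3 arg(s)
    }
}
```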