Cost of mapping POCOs in a high-load system in C#

I have a POCO that needs to be mapped to another POCO in a high-traffic system. I intend to map these objects together with a simple mapper similar to this:
public class A
{
    public int MyValue { get; set; }
    public string YAV { get; set; }
}

public class B
{
    public int aTestValue { get; set; }
    public string YetAnotherValue { get; set; }
}

public class Mapper
{
    public static B MapIt(A a)
    {
        return new B { aTestValue = a.MyValue, YetAnotherValue = a.YAV };
    }
}
How much does a mapping like this really affect performance? Ignore the fact that we'll have to write a mapping for all our types and just focus on the performance lost doing the actual mapping.

How much does a mapping like this really affect performance?
I would say that such a mapping wouldn't affect performance even in a high-traffic system. The cost of calling the getters and setters will probably be negligible compared to the other operations you'll be doing.
Obviously that's just my two cents; if you want real numbers, write performance benchmarks and measure the difference with and without the mapping.
At least that's what I would do: build something that meets the requirements, then benchmark it. Then there are two possibilities: either you are satisfied with the results, so you ship to production and enjoy life, or you are not satisfied, and the benchmarks have allowed you to identify this part as the bottleneck of your application, so you refactor the code and start optimizing it. But never do premature optimization or you will struggle to meet the project deadlines.
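If you want to measure it, here is a minimal benchmark sketch using BenchmarkDotNet (the A, B and Mapper types come from the question; the benchmark harness itself is my assumption, not part of the original post):
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class MappingBenchmark
{
    private A source;

    [GlobalSetup]
    public void Setup()
    {
        source = new A { MyValue = 42, YAV = "hello" };
    }

    // baseline: hand the object through untouched
    [Benchmark(Baseline = true)]
    public A NoMapping() => source;

    // the manual mapping under test
    [Benchmark]
    public B WithMapping() => Mapper.MapIt(source);
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<MappingBenchmark>();
}
Running this gives the per-call cost of MapIt, which you can then weigh against the rest of the request pipeline.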

From our experience, the overhead won't be much. I tested this recently by retrieving 75,000 rows of data using LINQ to SQL and then mapping the L2S entities to POCO entities using mapping code we wrote. The cost of doing this was amazingly small. If I recall correctly, it was something like 75 to 100 ms to map 75K rows.

It's almost impossible to know how this will affect performance without knowing something about the scale of the system, the looping structure in which this mapping occurs, etc.
In general, these types of simple mappings are quick, but you can always run into problems tied to the scaling factors I mentioned when things such as serialization are involved.

The best thing to do is hook it up to a profiler and take some measurements. A manual mapping like that is a fairly lightweight way to do it, so it shouldn't be significant. The AutoMapper tool is also available and will reduce the coding time, but it has a little more overhead as it provides other services besides just mapping:
Analyzing AutoMapper Performance
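For context, a typical AutoMapper configuration for the A/B pair above could look like the sketch below (this assumes a classic MapperConfiguration-based setup; the member mappings are inferred from the question's classes and are not part of the linked article):
using AutoMapper;

public static class AutoMapperSetup
{
    public static IMapper Build()
    {
        var config = new MapperConfiguration(cfg =>
            cfg.CreateMap<A, B>()
               .ForMember(d => d.aTestValue, opt => opt.MapFrom(s => s.MyValue))
               .ForMember(d => d.YetAnotherValue, opt => opt.MapFrom(s => s.YAV)));
        return config.CreateMapper();
    }
}

// usage:
// var mapper = AutoMapperSetup.Build();   // build once, reuse everywhere
// B b = mapper.Map<B>(someA);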

How about using conversion operators? Only worry about their performance if a profiler shows them to be a bottleneck.
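As an illustration (this is a sketch of the idea only; the explicit operator is my addition, not code from the question), B could define a conversion from A:
public class B
{
    public int aTestValue { get; set; }
    public string YetAnotherValue { get; set; }

    // lets callers write (B)someA instead of calling a mapper
    public static explicit operator B(A a)
    {
        return new B { aTestValue = a.MyValue, YetAnotherValue = a.YAV };
    }
}

// usage:
// B b = (B)someA;
The generated code is essentially the same as the static MapIt method, so the choice is about readability rather than speed.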

Related

Protobuf-net is spending excessive time in RuntimeTypeModel.TakeLock

My issue is very similar to the one in this question: protobuf-net concurrent performance issue in TakeLock. The difference is that in our case we are using CompressedSerializer and deserializing data. The following code is called from 8 different threads (on an 8-core CPU) each time we need to deserialize data.
var result = new ProtoCompressedSerializer().Deserialize<Dictionary<string, CustomStruct>>(...)
Here's the result from ANTS Performance Profiler:
The first node in the graph is our method, which calls Deserialize using the code above. The number on the left in each box is time in seconds, and the one on the right is the hit count. As you can see, RuntimeTypeModel.TakeLock is taking a lot of time. In the linked question above, the suggestion was to precompile the model. Is that possible for CompressedSerializer and its Deserialize method? From a performance perspective, is it better to create one serializer and share it among all threads? Is it thread-safe?
After trying out several options and comparing their performance with a profiler, I've come to the conclusion that Dictionary<T,K> is not correctly supported by protobuf-net and even initializing the serializer beforehand won't help.
The solution is to wrap the Dictionary in a new class as its only data member. This not only improves performance several-fold (we saw a 3X improvement) but also virtually eliminates the use of TakeLock and the associated timeout issues.
Example:
[Serializable]
[ProtoContract]
[DataContract]
public class ModelADict
{
    [ProtoMember(1)]
    [DataMember]
    public Dictionary<string, ModelA> sub { get; set; }

    public ModelADict()
    {
        sub = new Dictionary<string, ModelA>();
    }
}
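On the precompilation question: a common pattern (this is a sketch and an assumption on my part, not part of the answer above) is to warm the model up once at startup so the hot deserialization path no longer needs to take the lock to build type metadata:
using ProtoBuf;
using ProtoBuf.Meta;

public static class ProtoWarmup
{
    // call once at application startup, before the worker threads start deserializing
    public static void Run()
    {
        // builds and caches the serializer for the wrapper type up front
        Serializer.PrepareSerializer<ModelADict>();

        // optionally compile the whole default model in place
        RuntimeTypeModel.Default.CompileInPlace();
    }
}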

Best data structure for high-performance seek in C#

I was wondering which data structure would offer the best performance for my scenario.
My requirements are:
A possibly huge data set, several million records. I am going to write it only once and I am not going to change it any more during the execution lifetime; I don't need it to be stored in a sorted way.
I was thinking of going with List, but if I use a LINQ query and call InRange in the where condition, performance is very bad; if I do a foreach, performance is not great either. I am pretty sure there is a better way to do it (I was thinking of using a struct and/or implementing IEquatable, but performance is not improving).
Which is the quickest data structure in C# for querying my ranges with optimal performance?
What I want is a data structure to store several million instances of the class Range:
class Range
{
    public int Low { get; set; }
    public int High { get; set; }
    public bool InRange(int val) { return val >= Low && val <= High; }
}
A logical example would be List, but I am afraid the List class is not optimized for my requirements, since it is sorted and I don't need sorting, and that affects performance a lot.
Thanks for the help!
I think you may want an interval tree. Stack Overflow user alan2here has recently asked several questions regarding a project he's working on; Eric Lippert pointed him towards the interval tree structure in one of them.
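To give a flavour of the idea, here is a minimal centered-interval-tree sketch built around the Range class from the question. The node layout and the median-of-lows split are my assumptions for illustration, not something taken from the referenced answers:
using System.Collections.Generic;
using System.Linq;

class IntervalTreeNode
{
    private readonly int center;
    private readonly IntervalTreeNode left;
    private readonly IntervalTreeNode right;
    private readonly Range[] byLow;   // ranges overlapping center, sorted by Low ascending
    private readonly Range[] byHigh;  // the same ranges, sorted by High descending

    public IntervalTreeNode(IList<Range> ranges)
    {
        // split around the median Low value
        var lows = ranges.Select(r => r.Low).OrderBy(x => x).ToArray();
        center = lows[lows.Length / 2];

        var overlapping = new List<Range>();
        var leftOf = new List<Range>();
        var rightOf = new List<Range>();
        foreach (var r in ranges)
        {
            if (r.High < center) leftOf.Add(r);
            else if (r.Low > center) rightOf.Add(r);
            else overlapping.Add(r);
        }

        byLow = overlapping.OrderBy(r => r.Low).ToArray();
        byHigh = overlapping.OrderByDescending(r => r.High).ToArray();
        if (leftOf.Count > 0) left = new IntervalTreeNode(leftOf);
        if (rightOf.Count > 0) right = new IntervalTreeNode(rightOf);
    }

    // collects every Range that contains val
    public void Query(int val, List<Range> results)
    {
        if (val < center)
        {
            // only ranges starting at or before val can contain it
            foreach (var r in byLow)
            {
                if (r.Low > val) break;
                results.Add(r);
            }
            if (left != null) left.Query(val, results);
        }
        else if (val > center)
        {
            // only ranges ending at or after val can contain it
            foreach (var r in byHigh)
            {
                if (r.High < val) break;
                results.Add(r);
            }
            if (right != null) right.Query(val, results);
        }
        else
        {
            results.AddRange(byLow); // every range stored here contains the center
        }
    }
}
You build the tree once after loading the data, which matches the write-once requirement, and each lookup then only touches the nodes whose ranges can actually contain the value instead of scanning all several million records.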

C# Speeding Up a Parser with Constants? Abstract class though

I have a series of parsers which parse the same basic sort of text for relevant data, but they come from various sources so they differ subtly. I am parsing millions of documents per day, so any speed optimizations help.
Here is a simplified example to show the fundamental issue. The parser is set up such that there is a base abstract parser that actual parsers implement:
abstract class BaseParser
{
    protected abstract string SomeRegex { get; }

    public string ParseSomethingCool(string text)
    {
        return Regex.Match(text, SomeRegex).Value;
    }

    // ...
}

class Parser1 : BaseParser
{
    protected override string SomeRegex { get { return "^.*"; } } // example regex
    // ...
}

class Parser2 : BaseParser
{
    protected override string SomeRegex { get { return "^[0-9]+"; } } // example regex
    // ...
}
So my questions are:
If I were to make the things returned in the get constants would it speed things up?
Theoretically if it didn't use a property and everything was a straight up constant would that speed things up more?
What sort of speed increases if any could I see?
Am I just clutching at straws?
I don't think converting the properties to constants will give you any appreciable performance boost. The JIT-ed code probably has those inlined anyway (since you return constant strings).
I think the best approach is to profile your code first and see which parts have the most potential for optimization. My suggestions of things to look at:
RegEx - as you already know, a well-constructed RegEx expression sometimes spells the difference between fast and extremely slow. It's really a case-by-case basis, depending on the expression used and the text you feed it.
Alternatives - I'm not sure what kind of matching you perform, but it might be worth considering other approaches, especially if what you are trying to match is not that complex. Then benchmark the results.
Other parts of your code - see where the bottleneck occurs. Is it in disk IO, or CPU? See if more threads will help, or maybe revisit the function that reads the file contents.
Whatever you end up doing, it's always a big help to measure. Identify the areas with opportunity, find a faster way to do it, then measure again to verify that it is indeed faster.
The things in the get are already constants.
I bet the jitter is already optimizing away the property accessors, so you probably won't see much performance gain by refactoring them out.
I don't think you'd see appreciable speed improvements from this kind of optimisation. Your best bet, though, is to try it and benchmark the results.
One change that would make a difference is to not use Regex if you can get away without it. Regex is a pretty big and useful hammer, but not every nail needs a hammer that big.
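To illustrate (the helper below is my own sketch, not code from the answer), a pattern like "^[0-9]+" can be matched with a plain character scan, which avoids the Regex machinery entirely:
// returns the leading run of digits, i.e. what Regex.Match(text, "^[0-9]+").Value would return
public static string ParseLeadingDigits(string text)
{
    int i = 0;
    while (i < text.Length && text[i] >= '0' && text[i] <= '9')
    {
        i++;
    }
    return text.Substring(0, i);
}
Whether this is worth the loss of flexibility is something a quick benchmark against the Regex version will tell you.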
From the code you show, it's not clear why you need an abstract class and inheritance.
Using virtual members is slower. Moreover, your child classes aren't sealed.
Why don't you do something like this:
public class Parser
{
    private Regex regex;

    public Parser(string someRegex)
    {
        regex = new Regex(someRegex, RegexOptions.Compiled);
    }

    public string ParseSomethingCool(string text)
    {
        return regex.Match(text).Value;
    }
}
or like this
public static class Parser
{
    public static string ParseSomethingCool(string text, string someRegex)
    {
        return Regex.Match(text, someRegex).Value;
    }
}
However, I think the greatest gain in performance would come from multi-threading. Probably you already do this; if you don't, take a look at the Task Parallel Library.
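A minimal sketch of what that could look like with the non-abstract Parser above (the documents collection and the result handling are my assumptions for illustration):
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelParsing
{
    public static List<string> ParseAll(IEnumerable<string> documents, Parser parser)
    {
        var results = new List<string>();
        var gate = new object();

        // Regex instance methods are thread-safe, so a single parser
        // wrapping a compiled Regex can be shared across the loop.
        Parallel.ForEach(documents, text =>
        {
            var parsed = parser.ParseSomethingCool(text);
            lock (gate)
            {
                results.Add(parsed);
            }
        });

        return results;
    }
}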

C# performance question

My quandary is: which of the following two methods performs best?
Goal - get an object of type Wrapper (defined below)
Criteria - speed over storage
No. of records - about 1,000 to 2,000, max about 6K
Choices - create the object on the fly or do a lookup from a dictionary
Execution speed - called x times per second
NB - I need to deliver the working code first and then go for optimization, so if any theorists can provide glimpses of the behind-the-scenes details, that'll help before I get to the actual performance test, possibly by EOD Thursday.
Definitions -
class Wrapper
{
    public readonly DataRow Row;

    public Wrapper(DataRow dr)
    {
        Row = dr;
    }

    public string ID { get { return Row["id"].ToString(); } }
    public string ID2 { get { return Row["id2"].ToString(); } }
    public string ID3 { get { return Row["id3"].ToString(); } }
    public double Dbl1 { get { return (double)Row["dbl1"]; } }
    // ... total about 12 such fields!
}
Dictionary<string,Wrapper> dictWrappers;
Method 1
    Wrapper o = new Wrapper(dr);
    // some action with o
    myMethod(o);
Method 2
    Wrapper o;
    if (!dictWrappers.TryGetValue(dr["id"].ToString(), out o))
    {
        o = new Wrapper(dr);
        dictWrappers.Add(o.ID, o);
    }
    // some action with o
    myMethod(o);
Never optimize without profiling first.
Never profile unless the code does not meet specifications/expectations.
If you need to profile this code, write it both ways and benchmark it with your expected load.
EDIT: I try to favor the following over optimization unless performance is unacceptable:
Simplicity
Readability
Maintainability
Testability
I've (recently) seen highly-optimized code that was very difficult to debug. I refactored it to simplify it, then ran performance tests. The performance was unacceptable, so I profiled it, found the bottlenecks, and optimized only those. I re-ran the performance tests, and the new code was comparable to the highly-optimized version. And it's now much easier to maintain.
Here's a free profiling tool.
The first one would be faster, since it isn't actually doing a lookup; it is just doing a simple allocation and an assignment.
The two segments of code are not really equivalent in function, however, because Method 1 could create many duplicates.
Without actually testing, I would expect that caching the field values in Wrapper (that is, avoiding all the ToString calls and casts) would probably have more of an impact on performance.
Then, once you are caching those values, you will probably want to keep instances of Wrapper around rather than frequently recreate them.
Assuming that you're really worried about perf (hey, it happens), then your underlying wrapper itself could be improved. You're doing field lookups by string. If you're going to make the call a lot with the same field set in the row, it's actually faster to cache the ordinals and look up by ordinal.
Of course this only matters if you really, really need to worry about performance, and the instances where this would make a difference are fairly rare (though on embedded devices it's not as rare as on the desktop).
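As a sketch of the caching idea (the column names come from the question; caching DataColumn references and lazily converted values is my own illustration of one way to do it):
using System.Data;

class CachedWrapper
{
    public readonly DataRow Row;

    // cache the column objects once so repeated reads skip the string lookup
    private readonly DataColumn idCol;
    private readonly DataColumn dbl1Col;

    // cache the converted values so ToString()/casts happen only once
    private string id;
    private double? dbl1;

    public CachedWrapper(DataRow dr)
    {
        Row = dr;
        idCol = dr.Table.Columns["id"];
        dbl1Col = dr.Table.Columns["dbl1"];
    }

    public string ID { get { return id ?? (id = Row[idCol].ToString()); } }
    public double Dbl1 { get { return dbl1 ?? (dbl1 = (double)Row[dbl1Col]).Value; } }
    // ... and similarly for the other fields
}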

To cache or not to cache - GetCustomAttributes

I currently have a function:
public static Attribute GetAttribute(MemberInfo Member, Type AttributeType)
{
    Object[] Attributes = Member.GetCustomAttributes(AttributeType, true);
    if (Attributes.Length > 0)
        return (Attribute)Attributes[0];
    else
        return null;
}
I am wondering if it would be worthwhile to cache all the attributes on a property into an
Attribute = _cache[MemberInfo][Type] dictionary.
This would require using GetCustomAttributes without any type parameter and then enumerating over the result. Is it worth it?
You will get better bang for your buck if you replace the body of your method with this:
return Attribute.GetCustomAttribute(Member, AttributeType, false); // only look at the current member and don't go up the inheritance tree
If you really need to cache on a per-type basis:
public static class MyCacheFor<T>
{
    static MyCacheFor()
    {
        // grab the data
        Value = ExtractExpensiveData(typeof(T));
    }

    public static readonly MyExpensiveToExtractData Value;

    private static MyExpensiveToExtractData ExtractExpensiveData(Type type)
    {
        // ...
    }
}
Beats dictionary lookups every time. Plus, it's thread-safe :)
Cheers,
Florian
PS: It depends how often you call this. I had some cases where doing a lot of serialization using reflection really called for caching. As usual, you want to measure the performance gain versus the memory usage increase. Instrument your memory use and profile your CPU time.
The only way you can know for sure is to profile it. I am sorry if this sounds like a cliché, but the reason a saying becomes a cliché is often because it's true.
Caching the attributes actually makes the code more complex and more error prone, so you might want to take this, your development time, into account before you decide.
So, like any optimization, don't do it unless you have to.
From my experience (I am talking about an AutoCAD-like Windows application, with a lot of click-edit GUI operations and heavy number crunching), reading custom attributes has never, even once, been the performance bottleneck.
I just had a scenario where GetCustomAttributes turned out to be the performance bottleneck. In my case it was getting called hundreds of thousands of times in a dataset with many rows, which made the problem easy to isolate. Caching the attributes solved the problem.
Preliminary testing showed a barely noticeable performance hit at about 5,000 calls on a modern machine, and it became drastically more noticeable as the dataset size increased.
I generally agree with the other answers about premature optimization; however, on a scale from CPU instruction to DB call, I'd suggest that GetCustomAttributes leans more towards the latter.
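For reference, a cache along those lines could be as simple as the sketch below (the tuple key and ConcurrentDictionary are my assumptions, not the poster's code):
using System;
using System.Collections.Concurrent;
using System.Reflection;

public static class AttributeCache
{
    private static readonly ConcurrentDictionary<(MemberInfo, Type), Attribute> cache =
        new ConcurrentDictionary<(MemberInfo, Type), Attribute>();

    public static Attribute GetAttribute(MemberInfo member, Type attributeType)
    {
        // the reflection call only runs on the first request for each member/type pair
        return cache.GetOrAdd((member, attributeType), key =>
        {
            var attributes = key.Item1.GetCustomAttributes(key.Item2, true);
            return attributes.Length > 0 ? (Attribute)attributes[0] : null;
        });
    }
}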
Your question is a case of premature optimization.
You don't know the inner workings of the reflection classes and therefore are making assumptions about the performance implications of calling GetCustomAttributes multiple times. The method itself could well cache its output already, meaning your code would actually add overhead with no performance improvement.
Save your brain cycles for thinking about things which you already know are problems!
Old question, but GetCustomAttributes is costly/heavyweight, so using a cache can be a good idea if it is causing performance problems.
The article I linked (Dodge Common Performance Pitfalls to Craft Speedy Applications) was taken down, but here is a link to an archived version:
https://web.archive.org/web/20150118044646/http://msdn.microsoft.com:80/en-us/magazine/cc163759.aspx
Are you actually having a performance problem? If not, don't do it until you need to.
It might help, depending on how often you call the method with the same parameters. If you only call it once per MemberInfo/Type combination, then it won't do any good. Even if you do cache it, you are trading speed for memory consumption. That might be fine for your application.
