C# - Optimizing a function by pre-loading it

I have a function that is very small, but is called so many times that my profiler marks it as time consuming. It is the following one:
private static XmlElement SerializeElement(XmlDocument doc, String nodeName, String nodeValue)
{
    XmlElement newElement = doc.CreateElement(nodeName);
    newElement.InnerXml = nodeValue;
    return newElement;
}
The second line (where it assigns nodeValue) is the one that takes some time.
The thing is, I don't think it can be optimized code-wise; I'm still open to suggestions on that part, though.
However, I remember reading or hearing somewhere that you could tell the compiler to flag this function, so that it is loaded in memory when the program starts and it runs faster.
Is this just my imagination or such a flag exists?
Thanks,
FB.

There are ways you can cause it to be JIT-compiled early, but it's not the JIT time that's hurting you here.
If you're having performance problems related to Xml serialization, you might consider using XmlWriter rather than XmlDocument, which is fairly heavy. Also, most automatic serialization systems (including the built-in .NET XML Serialization) will emit code dynamically to perform the serialization, which can then be cached and re-used. Most of this has to do with avoiding the overhead of reflection, however, rather than the overhead of the actual XML writing/parsing.
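As a rough sketch of the XmlWriter route (the element names here are placeholders, not from the original code):

using System.Text;
using System.Xml;

static class XmlWriterSketch
{
    static string BuildXml()
    {
        // Streaming elements out with XmlWriter avoids creating DOM nodes and
        // re-parsing value strings the way XmlDocument.InnerXml does.
        var sb = new StringBuilder();
        using (var writer = XmlWriter.Create(sb))
        {
            writer.WriteStartElement("root");
            writer.WriteElementString("someNode", "someValue"); // written as escaped text, never parsed
            writer.WriteEndElement();
        }
        return sb.ToString();
    }
}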

I don't think this can be solved by any kind of caching or inlining, and I believe the flag is your imagination, at least the part about performance. What you have in mind is pre-JIT-ing your code. That technique removes the wait for the JITer the first time the function is called, but only for that first call; it has no performance effect on subsequent calls.
As the documentation states, setting InnerXml parses the assigned string as XML, and parsing an XML string can be an expensive operation, especially if the XML in the string is complex. The documentation even has this line:
InnerXml is not an efficient way to modify the DOM. There may be performance issues when replacing complex nodes. It is more efficient to construct nodes and use methods such as InsertBefore, InsertAfter, AppendChild, and RemoveChild to modify the Xml document.
So, if you are creating a complex XML structure this way, it would be wise to build it by hand.
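For the helper in the original question, assuming nodeValue is plain text rather than an XML fragment (my assumption, not stated by the asker), a hedged sketch that avoids InnerXml could look like this:

private static XmlElement SerializeElement(XmlDocument doc, string nodeName, string nodeValue)
{
    XmlElement newElement = doc.CreateElement(nodeName);
    // Appending a text node writes the value directly; no XML parsing happens here.
    // If nodeValue can contain markup, this is NOT equivalent to setting InnerXml.
    newElement.AppendChild(doc.CreateTextNode(nodeValue));
    return newElement;
}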

Related

C# Performance issue about List<T> Count property

The project I am working on requests an XML document from a web service; the server side constructs the XML. The XML may have many nodes, so the performance is not that good.
I used the Visual Studio 2010 profiler to analyze the performance issue.
It found that the most time-consuming function is System.Collections.Generic.ICollection`1.get_Count(), which is actually the Count property of the generic List. This function is called about 9000 times.
The performance data shown as below:
The Elapsed exclusive time is 4154.14(ms), while the Application exclusive time is just 0.52(ms).
I know the difference between Elapsed exclusive time and Application exclusive time.
Application exclusive time excludes the time spent on context switching.
How could context switching happen when the code just reads the Count property of the generic List?
I am very confused by the profiler data. Can anyone provide some information? Thanks a lot!
Actually the decompiled sources show the following for List<T>:
[__DynamicallyInvokable]
public int Count
{
    [__DynamicallyInvokable, TargetedPatchingOptOut("Performance critical to inline this type of method across NGen image boundaries")]
    get
    {
        return this._size;
    }
}
It's literally returning the value of a field and doing nothing else. I'd suggest your performance hit is elsewhere, or you're misinterpreting your profiler's output.

Read a value multiple times or store as a variable first time round?

Basically, is it better practice to store a value into a variable at the first run through, or to continually use the value? The code will explain it better:
TextWriter tw = null;

if (!File.Exists(ConfigurationManager.AppSettings["LoggingFile"]))
{
    // ...
    tw = File.CreateText(ConfigurationManager.AppSettings["LoggingFile"]);
}
or
TextWriter tw = null;
string logFile = ConfigurationManager.AppSettings["LoggingFile"].ToString();

if (!File.Exists(logFile))
{
    // ...
    tw = File.CreateText(logFile);
}
Clarity is important, and DRY (don't repeat yourself) is important. This is a micro-abstraction: hiding a small, but still significant, piece of functionality behind a variable. The performance difference is negligible, but the positive impact on clarity can't be overstated. Use a well-named variable to hold the value once it's been acquired.
The 2nd solution is better for me because:
the dictionary lookup has a cost
it's more readable
Or you can have a singleton object with a private constructor that populates all the configuration data you need once.
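A minimal sketch of that singleton idea, reusing the LoggingFile key from the question (the type and member names are made up for illustration):

using System.Configuration;

public sealed class LogSettings
{
    public static readonly LogSettings Instance = new LogSettings();

    public string LoggingFile { get; private set; }

    private LogSettings()
    {
        // Read the setting once; callers reuse the cached value via Instance.LoggingFile.
        LoggingFile = ConfigurationManager.AppSettings["LoggingFile"];
    }
}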
Second one would be the best choice.
Imagine this situation: the settings are updated by another thread, and since the setting value isn't locked, it changes to another value between your two reads.
In the first version, your execution can fail, or it will run fine but the code checked for a file with one name and later saves to a file that isn't the one it checked. That would be bad, wouldn't it?
Another benefit is that you're not retrieving the value twice. You get it once and use it wherever your code needs to read that setting.
I'm pretty sure the second one is more readable. But if you're talking about performance: don't optimize at an early stage, and not without a profiler.
I must agree with the others. Readability and DRY are important, and the cost of the variable is very low, considering that you will usually just be storing an object reference rather than a copy of the value itself.
There might be exceptions for special or large objects. You must keep in mind whether the value you cache might change in between, and whether you would want (most times you would not!) to see the new value within your code. In your example, think about what might happen if ConfigurationManager.AppSettings["LoggingFile"] changes between the two calls (due to accessor logic, another thread, or the value always being re-read from a file on disk).
In summary: about 99% of the time you will want the second method, i.e. the cached value!
IMO that would depend on what you are trying to cache. Caching a setting from App.config might not be as beneficial (apart from code readability) as caching the results of a web service call over a GPRS connection.

To cache or not to cache - GetCustomAttributes

I currently have a function:
public static Attribute GetAttribute(MemberInfo Member, Type AttributeType)
{
    Object[] Attributes = Member.GetCustomAttributes(AttributeType, true);

    if (Attributes.Length > 0)
        return (Attribute)Attributes[0];
    else
        return null;
}
I am wondering if it would be worthwhile to cache all the attributes on a property into a dictionary along the lines of
Attribute = _cache[MemberInfo][Type]
This would require using GetCustomAttributes without any type parameter and then enumerating over the result. Is it worth it?
You will get better bang for your buck if you replace the body of your method with this:
return Attribute.GetCustomAttribute(Member, AttributeType, false); // only look in the current member and don't go up the inheritance tree.
If you really need to cache on a type-basis:
public static class MyCacheFor<T>
{
    static MyCacheFor()
    {
        // grab the data
        Value = ExtractExpensiveData(typeof(T));
    }

    public static readonly MyExpensiveToExtractData Value;

    private static MyExpensiveToExtractData ExtractExpensiveData(Type type)
    {
        // ...
    }
}
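A hypothetical call site (Customer is just a placeholder type) would simply touch the static field:

// The first access for a given T runs the static constructor exactly once;
// every later read is a plain static field access.
MyExpensiveToExtractData data = MyCacheFor<Customer>.Value;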
Beats dictionary lookups every time. Plus it's thread-safe :)
Cheers,
Florian
PS: It depends how often you call this. I had some cases where doing a lot of serialization using reflection really called for caching. As usual, you want to measure the performance gain versus the memory usage increase: instrument your memory use and profile your CPU time.
The only way you can know for sure is to profile it. I am sorry if this sounds like a cliché, but the reason a saying becomes a cliché is often because it's true.
Caching the attributes actually makes the code more complex and more error prone, so you might want to take that (your development time) into account before you decide.
So like optimization, don't do it unless you have to.
From my experience (I am talking about an AutoCAD-like Windows application, with a lot of click-and-edit GUI operations and heavy number crunching), reading custom attributes has never, not even once, been the performance bottleneck.
I just had a scenario where GetCustomAttributes turned out to be the performance bottleneck. In my case it was getting called hundreds of thousands of times in a dataset with many rows and this made the problem easy to isolate. Caching the attributes solved the problem.
Preliminary testing led to a barely noticeable performance hit at about 5000 calls on a modern day machine. (And it became drastically more noticeable as the dataset size increased.)
I generally agree with the other answers about premature optimization, however, on a scale of CPU instruction to DB call, I'd suggest that GetCustomAttributes would lean more towards the latter.
Your question is a case of premature optimization.
You don't know the inner workings of the reflection classes and therefore are making assumptions about the performance implications of calling GetCustomAttributes multiple times. The method itself could well cache its output already, meaning your code would actually add overhead with no performance improvement.
Save your brain cycles for thinking about things which you already know are problems!
Old question, but GetCustomAttributes is costly/heavyweight.
Using a cache, if it is causing performance problems, can be a good idea.
The article I linked (Dodge Common Performance Pitfalls to Craft Speedy Applications) was taken down, but here is a link to an archived version:
https://web.archive.org/web/20150118044646/http://msdn.microsoft.com:80/en-us/magazine/cc163759.aspx
Are you actually having a performance problem? If not then don't do it until you need it.
It might help, depending on how often you call the method with the same parameters. If you only call it once per MemberInfo/Type combination, then it won't do any good. Even if you do cache it, you are trading speed for memory consumption. That might be fine for your application.
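If caching does turn out to be worth it, one possible sketch (not taken from any of the answers above) keyed on the member/type pair could look like this:

using System;
using System.Collections.Concurrent;
using System.Reflection;

public static class AttributeCache
{
    // Cache keyed by (member, attribute type). A null result is cached too,
    // so repeated misses don't hit reflection again.
    private static readonly ConcurrentDictionary<Tuple<MemberInfo, Type>, Attribute> Cache =
        new ConcurrentDictionary<Tuple<MemberInfo, Type>, Attribute>();

    public static Attribute GetAttribute(MemberInfo member, Type attributeType)
    {
        return Cache.GetOrAdd(
            Tuple.Create(member, attributeType),
            key => Attribute.GetCustomAttribute(key.Item1, key.Item2, false));
    }
}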

Is there an easy way to compare if 2 XDocuments are equal ignoring element/attribute order?

Unit testing my serialization code, I found one test failed because the attributes were listed in a different order (I'm just comparing the XDocument.ToString() values). While I could fix that, it really doesn't matter to me in what order the elements or attributes appear, as long as they're all there with the right name at the right level of the hierarchy. I could probably write a method to do this, but I'm wondering if there's an easy built-in way I'm not aware of.
XNode has a DeepEquals function that should do the trick.
http://msdn.microsoft.com/en-us/library/system.xml.linq.xnode.deepequals.aspx
Update:
It appears that the DeepEquals function doesn't always work correctly. You may be best off implementing your own comparison routine.
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=400469
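If the order sensitivity is what trips DeepEquals up, one hedged workaround is to normalize both trees before comparing. This sketch sorts attributes and child elements by name and ignores mixed content (a simplification of mine, not a built-in API):

using System.Linq;
using System.Xml.Linq;

public static class XmlComparison
{
    public static bool AreEquivalent(XDocument a, XDocument b)
    {
        return XNode.DeepEquals(Normalize(a.Root), Normalize(b.Root));
    }

    private static XElement Normalize(XElement element)
    {
        // Rebuild the element with attributes and child elements sorted by name,
        // so document order no longer matters. Elements that have children lose
        // any interleaved text, which is fine for typical serialization output.
        object content = element.HasElements
            ? (object)element.Elements().OrderBy(e => e.Name.ToString()).Select(Normalize)
            : element.Value;

        return new XElement(
            element.Name,
            element.Attributes().OrderBy(a => a.Name.ToString()),
            content);
    }
}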
Try the Microsoft XML Diff and patch utility here
or google "Xml Diff"

What is the most efficient way to deserialize an XML file

Aloha,
I have an 8 MB XML file that I wish to deserialize.
I'm using this code:
public static T Deserialize<T>(string xml)
{
    TextReader reader = new StringReader(xml);
    Type type = typeof(T);
    XmlSerializer serializer = new XmlSerializer(type);
    T obj = (T)serializer.Deserialize(reader);
    return obj;
}
This code runs in about a minute, which seems rather slow to me. I've tried to use sgen.exe to precompile the serialization dll, but this didn't change the performance.
What other options do I have to improve performance?
[edit] I need the object that is created by the deserialization to perform (basic) transformations on. The XML is received from an external webservice.
The XmlSerializer uses reflection and is therefore not the best choice if performance is an issue.
You could build up a DOM of your XML document using the XmlDocument or XDocument classes and work with that, or, even faster use an XmlReader. The XmlReader however requires you to write any object mapping - if needed - yourself.
Which approach is best depends strongly on what you want to do with the XML data. Do you simply need to extract certain values, or do you have to work with and edit the whole document object model?
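For the "extract certain values" case, a minimal XmlReader sketch might look like this (the element name "Price" is a placeholder; the XML string is the one received from the web service):

using System.IO;
using System.Xml;

public static class XmlScan
{
    public static void ReadPrices(string xml)
    {
        // Stream through the document without building a DOM or an object graph.
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.ReadToFollowing("Price"))
            {
                string price = reader.ReadElementContentAsString();
                // ... apply the (basic) transformation here
            }
        }
    }
}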
Yes, it does use reflection, but performance is a gray area. When talking about an 8 MB file... yes, it will be much slower. But when dealing with a small file it will not be.
I would NOT say reading the file via XmlReader or XPath would be easier or really any faster. What is easier than telling something to turn your XML into an object, or your object into XML? Not much.
Now if you need fine-grained control then maybe you need to do it by hand.
Personally, the choice is like this: I am willing to give up a bit of speed to save a TON of ugly, nasty code.
Like everything else in software development there are trade offs.
You can try implementing IXmlSerializable in your "T" class and write custom logic to process the XML.
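A bare-bones sketch of that idea; the type, property, and attribute names are invented for illustration, and a real ReadXml implementation must consume exactly the element it is handed:

using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class Item : IXmlSerializable
{
    public string Name { get; set; }

    public XmlSchema GetSchema()
    {
        return null; // by convention, schema-less serialization returns null here
    }

    public void ReadXml(XmlReader reader)
    {
        // Hand-rolled reading: grab what we need, then skip past the element.
        Name = reader.GetAttribute("Name");
        reader.Skip();
    }

    public void WriteXml(XmlWriter writer)
    {
        writer.WriteAttributeString("Name", Name);
    }
}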
