Reflection is too slow while deserialising JSON strings into .NET objects - c#

I'm having some issues with System.Reflection in C#. I'm pulling data from a database and retrieving it as a JSON string. I've made my own implementation for mapping the JSON data into my self-declared objects using reflection. However, since I usually get a JSON string with an array of around 50 - 100 objects, my program runs really slowly because of the loops I'm using with reflection.
I've heard that reflection is slow, but it shouldn't be this slow. I feel something is not right in my implementation, since I have a different project where I use the JSON.NET serializer and instantiate my objects a bit differently with reflection; that version runs just fine on the same input (less than a second), while my slow program takes about 10 seconds for 50 objects.
Below are the classes I'm using to store the data:
class DC_Host
{
public string name;
public void printProperties()
{
//Prints all properties of a class using reflection
//Doesn't really matter, since I'm not using this for processing
}
}
class Host : DC_Host
{
public string asset_tag;
public string assigned;
public string assigned_to;
public string attributes;
public bool? can_print;
public string category;
public bool? cd_rom;
public int? cd_speed;
public string change_control;
public string chassis_type;
//And some more properties (around 70 - 80 fields in total)
}
Below you'll find my methods for processing the information into the objects, which are stored inside a List. The JSON data is kept in a dictionary that contains another dictionary for every array object defined in the JSON input. Deserialising the JSON itself happens in a matter of milliseconds, so the problem shouldn't be there.
public List<DC_Host> readJSONTtoHost(ref Dictionary<string, dynamic> json)
{
bool array = isContainer();
List<DC_Host> hosts = new List<DC_Host>();
//Do different processing on objects depending on table type (array/single)
if (array)
{
foreach (Dictionary<string, dynamic> obj in json[json.First().Key])
{
hosts.Add(reflectToObject(obj));
}
}
else
{
hosts.Add(reflectToObject(json[json.First().Key]));
}
return hosts;
}
private DC_Host reflectToObject(Dictionary<string,dynamic> obj)
{
Host h = new Host();
FieldInfo[] fields = h.GetType().GetFields();
foreach (FieldInfo f in fields)
{
Object value = null;
/* If there are values that are not in the dictionary, or where the wrong conversion is
 * used, the values will not be processed and therefore not inserted into the
 * host object (just ignored). At a later stage I might log specific error messages
 * in the catch block. */
/* TODO : Optimize and find out why this is soo slow */
try
{
value = obj[convTable[f.Name]];
}
catch { }
if (value == null)
{
f.SetValue(h, null);
continue;
}
// The system works with list containers, BUT then there must not be any loose values,
// so this depends very strongly on the ServiceNow implementation.
if (f.FieldType == typeof(List<int?>)) //Arrays for strings, ints and bools still need to be defined
{
int count = obj[convTable[f.Name]].Count;
List<int?> temp = new List<int?>();
for (int i = 0; i < count; i++)
{
temp.Add(obj[convTable[f.Name]][i]);
f.SetValue(h, temp);
}
}
else if (f.FieldType == typeof(int?))
f.SetValue(h, int.Parse((string)value));
else if (f.FieldType == typeof(bool?))
f.SetValue(h, bool.Parse((string)value));
else
f.SetValue(h, (string)value);
}
Console.WriteLine("Processed " + h.name);
return h;
}
I'm not sure how JSON.NET uses reflection behind the scenes, but I'm assuming they do something I'm missing to optimise it.

Basically, high-performance code like this tends to use meta-programming extensively; lots of ILGenerator etc (or Expression / CodeDom if you find that scary). PetaPoco showed a similar example earlier today: prevent DynamicMethod VerificationException - operation could destabilize the runtime
You could also look at the code of other serialization engines, such as protobuf-net, which has crazy amounts of meta-programming.
If you don't want to go quite that far, you could look at FastMember, which handles the crazy stuff for you, so you just have to worry about object/member-name/value.
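For instance, here is a rough sketch of what the member assignment could look like with FastMember's TypeAccessor. This is just an illustration, not the exact code to use: the Host, obj and convTable names come from the question, convTable is assumed to be a Dictionary<string, string> mapping member names to JSON keys, and the public fields on Host are assumed to be visible to FastMember. The type conversion for int?/bool?/lists would still be needed as before.
using FastMember;

// Create the accessor once per type and reuse it; this is where the
// IL-generation cost is paid, so per-member assignments stay cheap.
var accessor = TypeAccessor.Create(typeof(Host));
var h = new Host();
foreach (var map in convTable) // map.Key = member name, map.Value = JSON key (as in the question)
{
    dynamic value;
    if (obj.TryGetValue(map.Value, out value) && value != null)
    {
        // Assign by member name instead of calling FieldInfo.SetValue.
        accessor[h, map.Key] = (object)value;
    }
}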

For people running into this question, I'll post the solution to my problem here.
The issue wasn't really related to reflection. There are ways to improve reflection speed, as CodesInChaos and Marc Gravell mentioned, and Marc even created a very useful library (FastMember) for people without much experience in low-level reflection.
The solution, however, was unrelated to reflection itself. I had a try/catch statement to check whether values exist in my dictionary. Using try/catch to handle program flow is not a good idea: handling exceptions is heavy on performance, and especially when you're running under the debugger, thrown exceptions can drastically kill your performance.
//New implementation: use Dictionary.TryGetValue to check for existing values.
dynamic value = null;
obj.TryGetValue(convTable[f.Name], out value);
My program runs perfectly fine now that I've replaced the try/catch with TryGetValue.
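For completeness, here is roughly what the reworked field loop looks like without the try/catch. This is only a sketch; convTable is assumed to be a Dictionary<string, string> mapping field names to JSON keys, as in the code above.
foreach (FieldInfo f in fields)
{
    string jsonKey;
    dynamic value = null;
    // Neither lookup throws for a missing key, so the common
    // "field not present in the JSON" case no longer pays the exception cost.
    if (convTable.TryGetValue(f.Name, out jsonKey))
    {
        obj.TryGetValue(jsonKey, out value);
    }
    if (value == null)
    {
        continue; // leave the field at its default value
    }
    // ...same type-specific conversion and f.SetValue(h, ...) as before
}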

Related

Do references to collections cause any trouble with threads?

I have something like the following code:
public class MainAppClass : BaseClass
{
public IList<Token> TokenList
{
get;
set;
}
// This is executed before any thread is created
public override void OnStart()
{
MyDataBaseContext dbcontext = new MyDataBaseContext();
this.TokenList = dbcontext.GetTokenList();
}
// After this the application will create a list of many items to be iterated
// and will create as many threads as are defined in the configuration (5 at the moment),
// then it will distribute those items among the threads for parallel processing.
// The OnProcessItem will be executed for every item and could be running on different threads
protected override void OnProcessItem(AppItem processingItem)
{
string expression = getExpressionFromItem();
expression = Utils.ReplaceTokens(processingItem, expression, this);
}
}
public class Utils
{
public static string ReplaceTokens(AppItem currentProcessingItem, string expression, MainAppClass mainAppClass)
{
Regex tokenMatchExpression = new Regex(@"\[[^+~][^$*]+?\]", RegexOptions.IgnoreCase);
Match tokenMatch = tokenMatchExpression.Match(expression);
if(tokenMatch.Success == false)
{
return expression;
}
string tokenName = tokenMatch.Value;
// This line is my principal suspect of messing in some way with the multiple threads
Token tokenDefinition = mainAppClass.TokenList.Where(x => x.Name == tokenName).First();
Regex tokenElementExpression = new Regex(tokenDefinition.Value);
MyRegexSearchResult evaluationResult = Utils.GetRegexMatches(currentProcessingItem, tokenElementExpression).FirstOrDefault();
string tokenValue = string.Empty;
if (evaluationResult != null && evaluationResult.match.Groups.Count > 1)
{
tokenValue = evaluationResult.match.Groups[1].Value;
}
else if (evaluationResult != null && evaluationResult.match.Groups.Count == 1)
{
tokenValue = evaluationResult.match.Groups[0].Value;
}
expression = expression.Replace("[" + tokenName + "]", tokenValue);
return expression;
}
}
The problem I have right now is that for some reason the value of the token replaced in the expression gets confused with one from another thread, resulting in an incorrect replacement where it should be a different value, i.e.:
Expression: Hello [Name]
Expected result for item 1: Hello Nick
Expected result for item 2: Hello Sally
Actual result for item 1: Hello Nick
Actual result for item 2: Hello Nick
The actual result is not always the same: sometimes it's the expected one, sometimes both expressions are replaced with the value expected for item 1, and sometimes both are replaced with the value expected for item 2.
I'm not able to find what's wrong with the code, as I was expecting all the variables within the static method to be in their own scope for every thread, but that doesn't seem to be the case.
Any help will be much appreciated!
Yeah, static objects only have one instance throughout the program - creating new threads doesn't create separate instances of those objects.
You've got a couple different ways of dealing with this.
Door #1. If the threads need to operate on different instances, you'll need to un-static the appropriate places. Give each thread its own instance of the object you need it to modify.
Door #2. Thread-safe objects (as mentioned by Fildor). I'll admit I'm a bit less familiar with this door, but it's probably the right approach if you can get it to work (less complexity in code is awesome).
Door #3. Lock on the object directly. One option is, when modifying the global static, to put the modification inside a lock(myObject) { } block. Locks are pretty simple and straightforward (so much simpler than the old C/C++ days), and they'll ensure that concurrent modifications don't screw the object up.
Door #4. Padlock the encapsulated class. Don't allow outside callers to modify the static variable at all. Instead, they have to call global getters/setters. Then, have a private object inside the class that serves simply as a lockable object - and have the getters/setters lock that lockable object whenever they're reading/writing it.
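As a rough sketch of doors #3 and #4 combined, using the Token type from the question: the TokenStore name and the private lock object are made up purely for illustration.
public class TokenStore
{
    // Private lock object: outside callers can't lock on it (or on the dictionary) directly.
    private readonly object _sync = new object();
    private readonly Dictionary<string, Token> _tokens = new Dictionary<string, Token>();

    public void Add(Token token)
    {
        lock (_sync) // all writes go through here
        {
            _tokens[token.Name] = token;
        }
    }

    public Token Find(string name)
    {
        lock (_sync) // ...and all reads too, so readers never see a half-updated dictionary
        {
            Token token;
            _tokens.TryGetValue(name, out token);
            return token;
        }
    }
}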
The tokenValue that you're replacing the token with is coming from evaluationResult.
evaluationResult is based on Utils.GetRegexMatches(currentProcessingItem, tokenElementExpression).
You might want to check GetRegexMatches to see if it's using any static resources, but my best guess is that it's being passed the same currentProcessingItem value in multiple threads.
Look at the code that splits up the AppItems. You may have an "access to modified closure" in there. For example:
for(int i = 0; i < appItems.Length; i++)
{
var thread = new Thread(() => {
// Since the variable `i` is shared across all of the
// iterations of this loop, `appItems[i]` is going to be
// based on the value of `i` at the time that this line
// of code is run, not at the time when the thread is created.
var appItem = appItems[i];
...
});
...
}
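If that is what the splitting code looks like, the usual fix is to copy the loop variable into a local inside the loop, so each thread captures its own value. ProcessItem here is just a stand-in for the real per-item work:
for (int i = 0; i < appItems.Length; i++)
{
    // The local is re-created on every iteration, so the lambda below
    // captures this iteration's item rather than the shared counter `i`.
    var appItem = appItems[i];
    var thread = new Thread(() => ProcessItem(appItem));
    thread.Start();
}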

Looking for better understanding on the coding standards

I installed CodeCracker
This is my original method.
//Add
public bool AddItemToMenu(MenuMapper mapperObj)
{
using (fb_databaseContext entities = new fb_databaseContext())
{
try
{
FoodItem newItem = new FoodItem();
newItem.ItemCategoryID = mapperObj.ItemCategory;
newItem.ItemName = mapperObj.ItemName;
newItem.ItemNameInHindi = mapperObj.ItemNameinHindi;
entities.FoodItems.Add(newItem);
entities.SaveChanges();
return true;
}
catch (Exception ex)
{
//handle exception
return false;
}
}
}
This is the recommended method by CodeCracker.
public static bool AddItemToMenu(MenuMapper mapperObj)
{
using (fb_databaseContext entities = new fb_databaseContext())
{
try
{
var newItem = new FoodItem
{
ItemCategoryID = mapperObj.ItemCategory,
ItemName = mapperObj.ItemName,
ItemNameInHindi = mapperObj.ItemNameinHindi,
};
entities.FoodItems.Add(newItem);
entities.SaveChanges();
return true;
}
catch (Exception ex)
{
//handle exception
return false;
}
}
}
As far as I know, static methods occupy memory when the application initializes, irrespective of whether they are called or not.
When I already know the return type, why should I use the var keyword?
Why is this way of object initialization better?
I am very curious to get these answers, as they can guide me a long way.
Adding one more method:
private string GeneratePaymentHash(OrderDetailMapper order)
{
var payuBizzString = string.Empty;
payuBizzString = "hello|" + order.OrderID + "|" + order.TotalAmount + "|FoodToken|" + order.CustomerName + "|myemail#gmail.com|||||||||||10000";
var sha1 = System.Security.Cryptography.SHA512Managed.Create();
var inputBytes = Encoding.ASCII.GetBytes(payuBizzString);
var hash = sha1.ComputeHash(inputBytes);
var sb = new StringBuilder();
for (var i = 0; i < hash.Length; i++)
{
sb.Append(hash[i].ToString("X2"));
}
return sb.ToString().ToLower();
}
As far as I know, static methods occupy memory when the application initializes, irrespective of whether they are called or not.
All methods do that. You are probably confusing this with static fields, which occupy memory even when no instances of the class are created. Generally, if a method can be made static, it should be made static, except when it is an implementation of an interface.
When I already know the return type, why should I use the var keyword?
To avoid specifying the type twice on the same line of code.
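For example, both of the following declare the same statically typed variable; the second simply avoids repeating the type:
FoodItem newItem = new FoodItem(); // type written twice
var newItem = new FoodItem();      // equivalent; newItem is still statically typed as FoodItem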
Why is this way of object initialization better?
Because it groups the assignments visually, and reduces the clutter around them, making it easier to read.
Static methods don't occupy any more memory than instance methods. Additionally, your method should be static because it doesn't rely in any way on accessing itself (this) as an instance.
Using var is most likely for readability. var is always only 3 letters while many types are much longer and can force the name of the variable much further along the line.
The object initializer is, again, most likely for readability by not having the variable name prefix all the attributes. It also means all your assignments are done at once.
In most cases, this tool you're using seems to be about making code more readable and clean. There may be certain cases where changes will boost performance by hinting to the compiler about your intentions, but generally, this is about being able to understand the code at a glance.
Only concern yourself with performance if you're actually experiencing performance issues. If you are experiencing performance issues then use some profiling tools to measure your application performance and find out which parts of your code are running slowly.
As far as I know, static methods occupy memory when the application initializes, irrespective of whether they are called or not.
This is true for all kind of methods, so that's irrelevant.
When I already know the return type, why should I use the var keyword?
var is a personal preference (it's syntactic sugar). This analyzer probably reasons that, since the return type is already known, there is no need to state the type explicitly, so it recommends using var instead. Personally, I use var as much as possible. For this issue, you might want to read Use of var keyword in C#.
Why is this way of object initialization better?
I can't say an object initializer is always better, but it does guarantee that your newItem is either null or fully initialized, since your
var newItem = new FoodItem
{
ItemCategoryID = mapperObj.ItemCategory,
ItemName = mapperObj.ItemName,
ItemNameInHindi = mapperObj.ItemNameinHindi,
};
is actually equivalent to
var temp = new FoodItem();
temp.ItemCategoryID = mapperObj.ItemCategory;
temp.ItemName = mapperObj.ItemName;
temp.ItemNameInHindi = mapperObj.ItemNameinHindi;
var newItem = temp;
so this is not the same as your first version. There is a nice answer on Code Review about this subject: https://codereview.stackexchange.com/a/4330/6136. Also, you might want to check: http://community.bartdesmet.net/blogs/bart/archive/2007/11/22/c-3-0-object-initializers-revisited.aspx
A lot of these are personal preferences but most coding standards allow other programmers to read your code easier.
Changing the static method to an instance method takes more advantage of OO concepts; it limits the amount of mixed state and also allows you to add interfaces so you can mock out the class for testing.
A var declaration is still statically typed, but since we should concentrate on giving our objects meaningful names, explicitly declaring the type becomes redundant.
As for the object initialisation, it just groups everything that is required to set up the object, which makes it a little easier to read.
As far as I know, static methods occupy memory when the application initializes, irrespective of whether they are called or not.
Methods that are never called may or may not be optimized away, depending on the compiler, debug vs. release and such. Static vs. non-static does not matter.
A method that doesn't need a this reference can (and IMO should) be static.
When I already know the return type, why should I use the var keyword?
No reason. There's no difference; do whatever you prefer.
Why is this way of object initialization better?
The object initializer syntax generates the same code for most practical purposes (see @SonerGönül's answer for the details). Mostly it's a matter of preference; personally, I find the object initializer syntax easier to read and maintain.

Protobuf-net v2 and large Dictionaries

I have a weird situation happening that I'm not quite understanding.
I have a 'dataset' class that holds various metadata about a monitoring buoy including a list of 'sensors'.
Each 'sensor' has a current 'sensorstate'.
Each 'sensorstate' has a bit of metadata about it (timestamp, reason for change etc) but most importantly it has a Dictionary<DateTime,float> of values.
These sensors generally have upwards of 50k data points (years' worth of 15-minute readings), so I wanted something a bit faster at serialising than the default .NET BinaryFormatter, and set up protobuf-net, which serializes fantastically fast.
Unfortunately, my problem occurs on deserialization, when my dictionary of values throws an exception because an item with the same key has already been added. The only way I can get it to deserialise is to enable 'OverwriteList'. I'm a little unsure why: there aren't any duplicate keys when serializing (it's a dictionary), so why are there duplicate keys when I deserialize? This also brings up data integrity issues.
Any help in explaining this would be highly appreciated.
(On a side note: when giving ProtoMember attribute ids, do they need to be unique to the class or to the whole project? I'm also looking for lossless compression recommendations to use in conjunction with protobuf-net, as the files are getting pretty large.)
Edit:
I've just put my source up on GitHub and here is the class in question
SensorState (Note: it currently has OverwriteList = true in order to have it working for other development)
Here is an example raw data file
I had already tried using the SkipConstructor flag, but even with it set to true I get an exception unless OverwriteList is also true for the values dictionary.
If OverwriteList fixes it, then it suggests to me that the dictionary has some data in it by default, perhaps via a constructor or similar. If it is indeed coming from the constructor, you can disable that with [ProtoContract(SkipConstructor=true)].
If I have misunderstood the above, it may help to illustrate with a reproducible example, if possible.
With regard to the ids, they only need to be unique inside each type, and it is recommended to keep them small (due to "varint" encoding of tags, small keys are "cheaper" than large keys).
If you want to really minimise size, I would actually suggest looking at the content of the data, too. For example, you say that this is 15 minute readings... well, I'm guessing there are occasional gaps, but could you do, for example:
Block (class)
Start Time (DateTime)
Values (float[])
and have a Block for every contiguous bunch of 15-minute values (the assumption here is that every value is 15 minutes after the last, else a new block is started). So you are storing multiple Block instances in place of a single dictionary (see the sketch after this list). This has the following advantages:
far fewer DateTime values to store
you can use "packed" encoding on the floats, which means it doesn't need to add all the intermediate tags; you do this by marking an array/list as ([ProtoMember({key}, IsPacked = true)]) - noting that it only works on a few basic data-types (not sub-objects)
combined, these two tweaks could yield significant savings
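A minimal sketch of that Block idea; the field numbers and the StartTime/Values names are just for illustration:
[ProtoContract]
class Block
{
    [ProtoMember(1)]
    public DateTime StartTime;        // timestamp of the first reading in this contiguous run

    [ProtoMember(2, IsPacked = true)] // packed encoding: no per-element tags for the floats
    public float[] Values;            // one reading per 15-minute slot after StartTime
}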
If the data has a lot of strings, you could try GZIP/DEFLATE. You can of course try these either way, but without large amounts of string data I would be cautious of expecting too much extra from compression.
As an update based on the supplied (CSV) data file, there is no inherent problem here handling the dictionary - as shown:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using ProtoBuf;
class Program
{
static void Main()
{
var data = new Data
{
Points =
{
{new DateTime(2009,09,1,0,0,0), 11.04F},
{new DateTime(2009,09,1,0,15,0), 11.04F},
{new DateTime(2009,09,1,0,30,0), 11.01F},
{new DateTime(2009,09,1,0,45,0), 11.01F},
{new DateTime(2009,09,1,1,0,0), 11F},
{new DateTime(2009,09,1,1,15,0), 10.98F},
{new DateTime(2009,09,1,1,30,0), 10.98F},
{new DateTime(2009,09,1,1,45,0), 10.92F},
{new DateTime(2009,09,1,2,00,0), 10.09F},
}
};
var ms = new MemoryStream();
Serializer.Serialize(ms, data);
ms.Position = 0;
var clone = Serializer.Deserialize<Data>(ms);
Console.WriteLine("{0} points:", clone.Points.Count);
foreach(var pair in clone.Points.OrderBy(x => x.Key))
{
float orig;
data.Points.TryGetValue(pair.Key, out orig);
Console.WriteLine("{0}: {1}", pair.Key, pair.Value == orig ? "correct" : "FAIL");
}
}
}
[ProtoContract]
class Data
{
private readonly Dictionary<DateTime, float> points = new Dictionary<DateTime, float>();
[ProtoMember(1)]
public Dictionary<DateTime, float> Points { get { return points; } }
}
This is where I apologize for ever suggesting the problem had anything to do with code that wasn't my own. And while I'm here, mad props to the team behind protobuf, and to Marc Gravell for protobuf-net; it's seriously fast.
What was happening was that in the Sensor class I had some logic to never let a couple of properties be null.
[ProtoMember(12)]
public SensorState CurrentState
{
get { return (_currentState == null) ? RawData : _currentState; }
set { _currentState = value; }
}
[ProtoMember(16)]
public SensorState RawData
{
get { return _rawData ?? (_rawData = new SensorState(this, DateTime.Now, new Dictionary<DateTime, float>(), "", true, null)); }
private set { _rawData = value; }
}
While this works fantastically when I'm using the properties, it messes up the serialization process.
The simple fix was to instead mark the underlying objects for serialization instead.
[ProtoMember(16)]
private SensorState _rawData;
[ProtoMember(12)]
private SensorState _currentState;

Lucene serializer in C#, need performance advice

I'm trying to build a Lucene serializer class that serializes/deserializes objects (classes) whose properties are decorated with DataMember and a special attribute with instructions on how to store the property/field in a Lucene index.
The class works fine when I need to retrieve a single object by a certain key/value pair.
But I noticed that if I need to retrieve all items, and there are, let's say, 100,000 documents, then MySQL does it about 10 times faster... for some reason.
Could you please review this code (Lucene experts) and suggest any possible performance-related improvements?
public IEnumerable<T> LoadAll()
{
IndexReader reader = IndexReader.Open(this.PathToLuceneIndex);
int itemsCount = reader.NumDocs();
for (int i = 0; i < itemsCount; i++)
{
if (!reader.IsDeleted(i))
{
Document doc = reader.Document(i);
if (doc != null)
{
T item = Deserialize(doc);
yield return item;
}
}
}
if (reader != null) reader.Close();
}
private T Deserialize(Document doc)
{
T itemInstance = Activator.CreateInstance<T>();
foreach (string fieldName in fieldTypes.Keys)
{
Field myField = doc.GetField(fieldName);
//Not every document may have the full collection of indexable fields
if (myField != null)
{
object fieldValue = myField.StringValue();
Type fieldType = fieldTypes[fieldName];
if (fieldType == typeof(bool))
fieldValue = fieldValue == "1" ? true : false;
if (fieldType == typeof(DateTime))
fieldValue = DateTools.StringToDate((string)fieldValue);
pF.SetValue(itemInstance, fieldName, fieldValue);
}
}
return itemInstance;
}
Thank you in advance!
Here are some tips:
First, don't use IndexReader.Open(string path). Not only will it be removed in the next major release of Lucene.net, it's generally not your best option. There's actually a ton of unnecessary code called when you let Lucene generate the directory for you. I suggest:
var dir = new SimpleFSDirectory(new DirectoryInfo(path));
var reader = IndexReader.Open(dir, true);
You should also do as I did above, and open the IndexReader as readonly, if you don't absolutely need to write to it, as it will be quicker in multi-threaded environments especially.
If you know the size of your index is not more than you can hold in memory (ie less than 500-600 MB and not compressed), you can use a RAMDirectory instead. This will load the entire index into memory, allowing you to bypass most of the costly IO operations you would incur if you left the index on disk. It should greatly improve your speed, especially if you combine it with the other suggestions below.
If the index is too large to fit in memory, you either need to split the index up into chunks (ie an index every n MBs) or just continue to read it from disk.
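For example, here is a sketch of loading the on-disk index into memory first; it assumes the Lucene.Net 2.9/3.x API used in the snippet above.
// Copy the on-disk index into RAM once, then open a read-only reader over it;
// subsequent reads avoid disk IO entirely.
var fsDir = new SimpleFSDirectory(new DirectoryInfo(path));
var ramDir = new RAMDirectory(fsDir);
var reader = IndexReader.Open(ramDir, true); // readonly, as recommended above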
Also, I know you can't yield return in a try...catch, but you can in a try...finally, and I would recommend wrapping your logic in LoadAll() into a try...finally, like
IndexReader reader = null;
try
{
//logic here...
}
finally
{
if (reader != null) reader.Close();
}
Now, when it comes to your actual Deserialize code, you're probably doing it in nearly the fastest way possible, except that you are boxing the string when you don't need to. Lucene only stores the field as a byte[] array or a string. Since you're calling StringValue(), you know it will always be a string, and should only have to box it if absolutely necessary. Change it to this:
string fieldValue = myField.StringValue();
That will at least sometimes save you a minor boxing cost. (really, not much)
On the topic of boxing, we're working on a branch of Lucene you can pull from SVN that changes the internals of Lucene from using boxing containers (ArrayLists, non-generic Lists and HashTables) to a version that uses generics and more .NET-friendly things. This is the 2.9.4g branch. .NET'ified, as we like to say. We haven't officially benchmarked it, but developer tests have shown it, in some cases, to be around 200% faster than older versions.
The other thing to keep in mind: Lucene is great as a search engine, but you may find that in some cases it doesn't stack up to MySQL. Really, though, the only way to know for sure is to test and try to find performance bottlenecks like some of the ones I mentioned above.
Hope that helps! Don't forget about the Lucene.Net mailing list (lucene-net-dev@lucene.apache.org), either, if you have any questions. The other committers and I are generally quick to answer questions.

C# - Is a collection enough, or will combining it with LINQ improve performance?

According to the requirements, we have to return a collection either in reverse order or as it is. We, as beginner-level programmers, designed the collection as follows (a sample is given):
namespace Linqfying
{
class linqy
{
static void Main()
{
InvestigationReport rpt=new InvestigationReport();
// rpt.GetDocuments(true) refers
// to return the collection in reverse order
foreach( EnquiryDocument doc in rpt.GetDocuments(true) )
{
// printing document title and author name
}
}
}
class EnquiryDocument
{
string _docTitle;
string _docAuthor;
// properties to get and set doc title and author name goes below
public EnquiryDocument(string title,string author)
{
_docAuthor = author;
_docTitle = title;
}
public EnquiryDocument(){}
}
class InvestigationReport
{
EnquiryDocument[] docs=new EnquiryDocument[3];
public IEnumerable<EnquiryDocument> GetDocuments(bool IsReverseOrder)
{
/* some business logic to retrieve the document
docs[0]=new EnquiryDocument("FundAbuse","Margon");
docs[1]=new EnquiryDocument("Sexual Harassment","Philliphe");
docs[2]=new EnquiryDocument("Missing Resource","Goel");
*/
//if reverse order is preferred
if(IsReverseOrder)
{
for (int i = docs.Length; i != 0; i--)
yield return docs[i-1];
}
else
{
foreach (EnquiryDocument doc in docs)
{
yield return doc;
}
}
}
}
}
Questions:
Can we use another collection type to improve efficiency?
Would mixing the collection with LINQ reduce the code? (We are not familiar with LINQ.)
Looks fine to me. Yes, you could use the Reverse extension method... but that won't be as efficient as what you've got.
How much do you care about the efficiency though? I'd go with the most readable solution (namely Reverse) until you know that efficiency is a problem. Unless the collection is large, it's unlikely to be an issue.
If you've got the "raw data" as an array, then your use of an iterator block will be more efficient than calling Reverse. The Reverse method will buffer up all the data before yielding it one item at a time - just like your own code does, really. However, simply calling Reverse would be a lot simpler...
Aside from anything else, I'd say it's well worth you learning LINQ - at least LINQ to Objects. It can make processing data much, much cleaner than before.
Two questions:
Does the code you currently have work?
Have you identified this piece of code as being your performance bottleneck?
If the answer to either of those questions is no, don't worry about it. Just make it work and move on. There's nothing grossly wrong about the code, so no need to fret! Spend your time building new functionality instead. Save LINQ for a new problem you haven't already solved.
Actually this task seems pretty straightforward. I'd just use the Reverse method on a generic List.
This should already be well-optimized.
Your GetDocuments method has a return type of IEnumerable, so there is no need to even loop over your array when IsReverseOrder is false; you can just return it as-is, since an array is IEnumerable...
As for when IsReverseOrder is true, you can use either Array.Reverse or the LINQ Reverse() extension method to reduce the amount of code.
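For instance, a sketch of the simplified method, keeping the existing field and parameter names (Reverse() needs a using System.Linq directive):
public IEnumerable<EnquiryDocument> GetDocuments(bool IsReverseOrder)
{
    // ...business logic to fill docs, as before...
    if (IsReverseOrder)
        return docs.Reverse(); // LINQ buffers the array and yields it back-to-front
    return docs;               // an array already implements IEnumerable<EnquiryDocument>
}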
