Populating and Using Dynamic Classes in C#/.NET 4.0

In our application we're considering using dynamically generated classes to hold a lot of our data. The reason for doing this is that we have customers with tables that have different structures. So you could have a customer table called "DOG" (just making this up) that contains the columns "DOGID", "DOGNAME", "DOGTYPE", etc. Customer #2 could have the same table "DOG" with the columns "DOGID", "DOG_FIRST_NAME", "DOG_LAST_NAME", "DOG_BREED", and so on. We can't create classes for these at compile time as the customer can change the table schema at any time.
At the moment I have code that can generate a "DOG" class at run-time using reflection. What I'm trying to figure out is how to populate this class from a DataTable (or some other .NET mechanism) without extreme performance penalties. We have one table that contains ~20 columns and ~50k rows. Doing a foreach over all of the rows and columns to create the collection takes about a minute, which is a little too long.
Am I trying to come up with a solution that's too complex, or am I on the right track? Has anyone else experienced a problem like this? Creating dynamic classes was the solution that a developer at Microsoft proposed. If we can just populate this collection and use it efficiently, I think it could work.

What you're trying to do will get fairly complicated in .NET 3.5. You're probably better off creating a type with a Dictionary<> of key/value pairs for the data in your model.
If you can use .NET 4.0, you could look at using dynamic and ExpandoObject. The dynamic keyword lets you create references to truly dynamic objects, and ExpandoObject allows you to easily add properties and methods on the fly - all without complicated reflection or codegen logic. The benefit of dynamic types is that the DLR performs some sophisticated caching of the runtime binding information, allowing the types to be more performant than regular reflection allows.
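A minimal sketch of that approach (the "DOG" member names here just follow the question's made-up example):

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

static class ExpandoDemo
{
    // Build a row object whose members are only known at run-time.
    public static dynamic MakeDog()
    {
        dynamic dog = new ExpandoObject();
        dog.DOGID = 1;        // members can be attached directly...
        dog.DOGNAME = "Rex";

        // ...or by key, since ExpandoObject implements
        // IDictionary<string, object> -- handy when the column names
        // come from the customer's schema at run-time.
        var cells = (IDictionary<string, object>)dog;
        cells["DOG_BREED"] = "Beagle";
        return dog;
    }
}
```

Repeated member access on a `dynamic` reference is call-site cached by the DLR, which is where the advantage over plain per-access reflection comes from.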
If all you need to do is load map the data to existing types, then you should consider using something like EntityFramework or NHibernate to provide ORM behavior for your types.
To deal with the data loading performance, I would suggest you consider bypassing the DataTable altogether and loading your records directly into the types you are generating. I suspect that most of the performance issues are in reading and casting the values from the DataTable where they originally reside.

You should use a profiler to figure out what exactly takes the time, and then optimize that. If you are using reflection when setting the data, it will be slow. Much can be saved by caching the reflected type data.
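As an illustration of that caching idea (a hand-rolled sketch, not any particular library):

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Look up a type's properties once and reuse the result for every
// row, instead of calling GetProperties() per cell or per row.
static class PropertyCache
{
    static readonly Dictionary<Type, PropertyInfo[]> Cache =
        new Dictionary<Type, PropertyInfo[]>();

    public static PropertyInfo[] For(Type type)
    {
        PropertyInfo[] props;
        if (!Cache.TryGetValue(type, out props))
        {
            props = type.GetProperties(BindingFlags.Public | BindingFlags.Instance);
            Cache[type] = props;
        }
        return props;
    }
}
```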
I am curious: how will you use these classes if the members are not known at compile time? Perhaps you would be better off with just a plain DataTable for this scenario?

Check out PicoPoco and Massive. These are micro-ORMs (lightweight open-source ORM frameworks) that you use by simply copying a single code file into your project.
Massive uses ExpandoObject and does conversions from IDataRecord to the ExpandoObject, which I think is exactly what you want.
PicoPoco takes an existing plain class and dynamically and efficiently generates a method (which it then caches, by the way) to load it from a database.
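The IDataRecord-to-ExpandoObject conversion boils down to something like this (a simplified sketch of the idea, not Massive's actual code):

```csharp
using System.Collections.Generic;
using System.Data;
using System.Dynamic;

static class RecordExtensions
{
    // Copy every field of the current record into a dynamic object,
    // keyed by column name; DBNull becomes null.
    public static dynamic ToExpando(this IDataRecord record)
    {
        var expando = new ExpandoObject();
        var cells = (IDictionary<string, object>)expando;
        for (int i = 0; i < record.FieldCount; i++)
            cells[record.GetName(i)] = record.IsDBNull(i) ? null : record.GetValue(i);
        return expando;
    }
}
```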

Related

Code Generator for populating a POCO class of table 'OrderHeader' based on ADO.Net SqlReader

I have a table called 'OrderHeader' in a SQL Server 2008 R2 database.
While I have found tools to generate the POCO class for this table in an automated manner, I cannot find a tool that will also generate the code to populate this class using SqlDataReader. This would save me a lot of time, since I am writing low-level ADO.NET code in my current project that uses SqlDataReader to populate data classes.
My question is: is there any such tool?
There are several free libraries that do this sort of mapping from SQL query results to POCO as a part of their broader functionality. AutoMapper for instance can convert your SqlDataReader to an appropriate IList<T> without too much difficulty, as can NHibernate apparently.
Another option is to use a framework like LinqToSQL or the EntityFramework to encapsulate your data, then build your queries against those. There are any number of good reasons to use EF or L2S for this, including (probably) useful things like lazy loading.
Or you could learn to use reflection and build your mappings at runtime. I've used Linq Expression objects to do this in the past, but it's a lot of work to get only a little gain. It's fun to learn though :P
I'd suggest reading this article for more on reflection. It even has a simple example of using reflection to map an IDataReader to any type. It's not a complete solution, since it doesn't handle DBNull, and any difference between your POCO and the data returned from the server will throw exceptions. Make sure you read the whole article, then go investigate anything that still isn't clear.
If you want to go deeper than that, you can use Linq Expressions to build mapping functions for your types, then compile those expressions to Func<IDataReader, T> objects. Much easier than using System.Reflection.Emit... although still not simple.
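A condensed sketch of that technique (the Materializer name and the Dog type are illustrative; a real mapper would also handle DBNull and column/property mismatches):

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq.Expressions;

public class Dog
{
    public int DOGID { get; set; }
    public string DOGNAME { get; set; }
}

static class Materializer
{
    // Build record => new T { Prop = (PropType)record["Prop"], ... }
    // once, then reuse the compiled delegate for every row.
    public static Func<IDataRecord, T> Build<T>() where T : new()
    {
        var record = Expression.Parameter(typeof(IDataRecord), "record");
        var indexer = typeof(IDataRecord).GetMethod("get_Item", new[] { typeof(string) });
        var bindings = new List<MemberBinding>();

        foreach (var prop in typeof(T).GetProperties())
        {
            // record[prop.Name] comes back boxed as object;
            // Expression.Convert unboxes it to the property type.
            var cell = Expression.Call(record, indexer, Expression.Constant(prop.Name));
            bindings.Add(Expression.Bind(prop, Expression.Convert(cell, prop.PropertyType)));
        }

        var body = Expression.MemberInit(Expression.New(typeof(T)), bindings);
        return Expression.Lambda<Func<IDataRecord, T>>(body, record).Compile();
    }
}
```

The expensive part (reflection plus Compile) happens once per type; after that each row is materialized at roughly the cost of hand-written mapping code.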

What is more efficient, a datatable or a custom object?

Normally, when working with data from a database with a forms application, I keep it in a dataset or datatable and pull data out as needed. Now, I am working with WPF and trying to conform more to the MVVM pattern. Converting these datatables to objects make it a little easier to use with MVVM.
For example, if I had a table filled by a query like so -
Select p.first_name, p.last_name, p.phone,p.email from person as p where p.first_name = 'Bob'
Instead of keeping the datatable, I would just now convert that into a person object.
From a performance standpoint, is there any downfall to making objects, or should I stick with datasets and datatables?
The performance impact of using an ORM such as EF (or rolling your own) instead of DataTable/DataSet is negligible in applications like you're describing, but it depends on how it's implemented.
The main advantage of using an ORM is ensuring type-safety and not having to perform type-casting when retrieving data from a DataTable object. There are also benefits from using lazy-loading in Linq.
I don't think that using entity objects everywhere is necessarily a solution. There's nothing really wrong with using a DataTable object in a ViewModel (although I'm unsure how you'd use it with respect to the data synchronisation features of the DataTable class, but you're free to not use it at all).
In a new project, I'd use EF, only because I like having the typing taken care of for me, but if you've got an older project using tables that works fine, I'd stick with it.
Just make sure that your objects contain only the necessary fields, and embrace the objects if they make the solution clearer.
Any eventual (minimal) performance downfall will be compensated by the potential long-term performance benefits of a cleaner and clearer solution.

What is more convenient resource-wise: generate values at runtime or save generated values to the database?

So I have a design decision to make. I'm building a website, so speed is the most important thing. I have values that depend on other values. I have two options:
1- Retrieve my objects from the database, and then generate the dependent values/objects.
2- Retrieve the objects with the dependent values already stored in the database.
I'm using ASP.NET MVC with Entity Framework.
What considerations should I have in making that choice?
You will almost certainly see no performance benefit in storing the derived values. Obviously this can change if the dependency is incredibly complex or relies on a huge amount of data, but you don't mention anything specific about the data so I can only speak in generalities.
In other words, don't store values that are completely derivative as they introduce update anomalies (in other words, someone has to have knowledge about and code for these dependencies when updating your data, rather than it being as self-explanatory and clear as possible).
Ask yourself this question:
Are the dependent values based on business rules?
If so, then don't store them in the database - not because you can't or shouldn't, but because it is good practice - you should only have business rules in the database if that is the best or only place to have them, not just because you can.
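For instance, a dependent value that comes from a business rule can simply live as a computed property on the entity (the OrderLine type here is made up for illustration):

```csharp
public class OrderLine
{
    public decimal UnitPrice { get; set; }
    public int Quantity { get; set; }

    // Derived value: computed on demand instead of being stored in a
    // column, so it can never drift out of sync with its inputs.
    public decimal LineTotal
    {
        get { return UnitPrice * Quantity; }
    }
}
```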
Serializing your objects to the database will usually be slower than creating the objects in normal compiled code. Database access is normally pretty quick, it is the act of serialization that is slow. However if you have a complicated object creation process that is time consuming then serialization could end up quicker, especially if you use a custom serialization method.
Sooooo.... if your 'objects' are relatively normal data objects with some calculated/derived values then I would suggest that you store the values of the 'objects' in the database, read those values from the database and map them to data objects created in the compiled code*, then calculate your dependent values.
*Note that this is standard data retrieval - some people use an ORM, some manually map the values to objects.

How to create a C# class whose attributes are rows in a database table with ADO.NET?

Is it possible?
Please note I am not using LINQ nor Entity Framework.
You could also check out Dapper-Dot-Net - a very lightweight and very capable "micro ORM" which - incidentally - is used to run this site here.
It's quite fast: a single *.cs file, it works with your usual T-SQL commands and returns objects. Works like a charm - very easy to understand, no big overhead - just use it and enjoy!
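Typical usage looks roughly like this (the OrderHeader type, column names and connection string are assumptions for the sake of the example):

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Dapper; // the single-file micro-ORM

public class OrderHeader
{
    public int OrderHeaderId { get; set; }
    public string CustomerName { get; set; }
}

static class OrderRepository
{
    // Dapper's Query<T> extension maps each result row to an
    // OrderHeader by matching column names to property names.
    public static IList<OrderHeader> LoadAll(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            return conn.Query<OrderHeader>(
                "select OrderHeaderId, CustomerName from OrderHeader").ToList();
        }
    }
}
```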
My personal favorite is done using the dynamic object featured in .NET4 via Rob Conery's Massive library. Like Dapper-Dot-Net it is small.
By going old school, you can use DataSets to create strongly typed data table classes that mirror your database entirely, right down to the relationships. They're a precursor to LINQ/EF that auto-generates a lot of bloated code, but they're very handy for maintaining your field names, data types and data constraints, and for performing easily configured rapid updates.
http://msdn.microsoft.com/en-us/library/esbykkzb(v=VS.100).aspx

Query SQL Server Dynamically

I'm working on creating a dashboard. I have started to refactor the application so methods relating to querying the database are generic or dynamic.
I'm fairly new to the concept of generics and still an amateur programmer, but I have done some searching and tried to come up with a solution. The problem is not really building the query string dynamically; I'm fine with concatenating string literals and variables, and I don't really need anything more complex. The bigger issue for me is, once I create this query, getting back the data and assigning it to the correct variables in a dynamic way.
Let's say I have a table of defects, another for test cases and another for test runs. I want to create a method that looks something like:
public void QueryDatabase<T>(ref List<T> entityList, List<string> columns, string query) where T : Defect, new()
Now this is not perfect but you get the idea. Not everything about defects, test cases and test runs are the same but, I'm looking for a way to dynamically assign the retrieved columns to its "correct" variable.
If more information is needed I can provide it.
You're re-inventing the wheel. Use an ORM, like Entity Framework or NHibernate. You will find it's much more flexible, and tools like that will continue to evolve over time and add new features and improve performance while you can focus on more important things.
EDIT:
Although I think overall it's important to learn to use tools for something like this (I'm personally a fan of Entity Framework and have used it successfully on several projects now, and used LINQ to SQL before that), it can still be valuable as a learning exercise to understand how to do this. The ORMs I have experience with use XML to define the data model, and use code generation on the XML file to create the class model. LINQ to SQL uses custom attributes on the code-generated classes to define the source table and columns for each class and property, and reflection at run-time to map the data returned from a SqlDataReader to the properties on your class object. Entity Framework can behave differently depending on the version you're using, whether you use the "default" or "POCO" templates, but ultimately does basically the same thing (using reflection to map the database results to properties on your class), it just may or not use custom attributes to determine the mapping. I assume NHibernate does it the same way as well.
You are reinventing the wheel, yes, it's true. You are best advised to use an object-relational mapper off the "shelf". But I think you also deserve an answer to your question: to assign the query results dynamically to the correct properties, you would use reflection. See the documentation for the System.Reflection namespace if you want more information.
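In its simplest form that looks something like this (a sketch; production code would also cache the PropertyInfo lookups and handle type mismatches):

```csharp
using System.Data;
using System.Reflection;

public class TestRun
{
    public int Id { get; set; }
    public string Result { get; set; }
}

static class ReflectionMapper
{
    // For each column in the record, find the property with the
    // same name and assign the value to it via reflection.
    public static T Map<T>(IDataRecord record) where T : new()
    {
        var item = new T();
        for (int i = 0; i < record.FieldCount; i++)
        {
            PropertyInfo prop = typeof(T).GetProperty(record.GetName(i));
            if (prop != null && !record.IsDBNull(i))
                prop.SetValue(item, record.GetValue(i), null);
        }
        return item;
    }
}
```

The TestRun type above is just a stand-in for whichever entity (defect, test case, test run) the query returns.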

Categories

Resources