I've already searched Stack Overflow (and other websites) for ways to transform a DataTable into a List using reflection in C#.
My results so far are pretty good: I can map 200k rows in 3.5 seconds with reflection (0.5 seconds with hardcoded mapping).
But my entities (the classes that represent my data, but I think you already know that) follow this pattern:
My database has columns like this (I don't actually do this, but you'll get the idea):
Table: Clients
Columns:
ClientID, ClientName, ClientPhone, CityID[FK]
I'm using SqlConnection (MySqlConnection), so I have to hardcode my entities and transform the database result into a list of these entities. For example:
Select cli.*, cit.CityName, cou.* from Clients cli
Inner join Cities cit on (cit.CityID = cli.CityID)
Inner join Countries cou on (cou.CountryID = cit.CountryID)
I don't know if this SQL is correct, but I think you got the idea. This should return some fields like this:
ClientID, ClientName, ClientPhone, CityID, CityName, CountryID, CountryName
This should result in a List<Client>.
Here's the problem: I have 2 inner joins, and I represent this data in my entities like this (I like the expression "like this"):
public class Client
{
public int ClientID { get; set; }
public string ClientName { get; set; }
public string ClientPhone { get; set; }
public City ClientCity { get; set; }
}
public class City
{
public int CityID { get; set; }
public string CityName { get; set; }
public Country CityCountry { get; set; }
}
public class Country
{
public int CountryID { get; set; }
public string CountryName { get; set; }
}
So, if I have a Client object, I would get its country name with the expression client.ClientCity.CityCountry.CountryName. I call it a 3-level property accessor.
And I want to reflect it properly. Here is the main method to transform the DataTable into a List. My native language is Portuguese, but I tried to translate my comments to match my description above.
The idea of this code is: I try to find in the main class the column I have to set. If I don't find it, I search the property in the properties that are objects. Like CityName inside ClientCity inside Client. This code is a mess.
public List<T> ToList<T>(DataTable dt) where T : new()
{
Type type = typeof(T);
ReflectionHelper h = new ReflectionHelper(type);
insertPropInfo(type); //a pre-reflection step: I cache some delegates, etc.
List<T> list = new List<T>();
DataTableReader dtr = dt.CreateDataReader();
while (dtr.Read())
{
T obj = new T();
for (int i = 0; i < dtr.FieldCount; i++)
{
GetObject(ref obj, type, dtr.GetName(i), dtr.GetValue(i));
}
list.Add(obj);
}
return list;
}
//ref T obj: the object I create before calling this method
//Type classType: the type of the object (say, Client)
//string colName: this is the Database Column i'm trying to fill. Like ClientID or CityName or CountryName.
//colLineData: the data I want to put in the colName.
public void GetObject<T>(ref T obj, Type classType, string colName, object colLineData) where T : new()
{
//I do some caching to reflect just once, and after the first iteration, I think all the reflection I need is already done.
foreach (PropertyInfo info in _classPropInfos[classType])
{
//If the current PropertyInfo is a valuetype (like int, int64) or string, and so on
if (info.PropertyType.IsValueType || info.PropertyType == typeof(string))
{
//I think string.Equals is a little faster, but I saw little difference using "string" == "string"
if (info.Name.Equals(colName)) //did I find the property?
{
if (info.PropertyType != typeof(char)) //I have to convert the type if this is a char. MySQL returns char as string.
{
_delegateSetters[info](obj, colLineData); //if it isn't a char, just set it.
}
else
{
_delegateSetters[info](obj, Convert.ChangeType(colLineData, typeof(char)));
}
break; //stop searching once the property has been set
}
}
else //BUT, if the property is a class, like ClientCity:
{
//I reflect the City class, if it isn't reflected yet:
if (!_classPropInfos.ContainsKey(info.PropertyType))
{
insertPropInfo(info.PropertyType);
}
//now I search for the property:
Boolean foundProperty = false;
object instance = _delegateGetters[info](obj); //Get the existing instance of ClientCity, so I can fill the CityID and CityName in the same object.
foreach (PropertyInfo subInfo in _classPropInfos[info.PropertyType])
{
if (subInfo.Name.Equals(colName)) //did I find the property?
{
if (instance == null)
{
//This happens when I'm setting the first property of the class, like CityID. I have to instantiate it, so that in the next iteration it won't be null and will have its CityID filled.
instance = _initializers[info.PropertyType](); //A very fast object initializer. I'm worried about the Dictionary lookups, but I have no other idea how to cache it.
}
_delegateSetters[subInfo](instance, colLineData); //set the data. This method is very fast. Search for lambda getters & setters using System.Linq.Expressions.
foundProperty = true;
break; //I break out of the loop when I find the property, so it won't iterate anymore.
}
}
if (foundProperty)//if I found the property in the code above, I set the instance of ClientCity to the Client object.
{
_delegateSetters[info](obj, instance);
break;
}
}
}
}
There is a problem with this code: I can reach CityID and CityName and fill them, but CountryID and CountryName are never filled. This code can only do a 2-level reflection, so I need some recursive approach to fill as many levels as I need. I tried to do this, BUT I got so many stack overflows and null reference exceptions that I almost gave up.
This code would make it much easier to fetch database rows. Have you already found a library or anything else that does what I want? If not, how could I achieve an n-level reflection to build a proper List from a DataTable?
Your problem is really common, and practically every ORM in circulation addresses it.
Of course, changing an already written application to take advantage of an ORM is often impractical, but there are some simple ORMs that are really easy to add to an existing application and that let you replace the already written code incrementally.
One of these ORMs is Dapper. It consists of just one source file that you can include directly in the same project with your POCO classes and repository methods (or you can just reference the compiled assembly). It is really easy to learn, and it is incredibly fast considering the complexity of the work to be carried out. Not to mention that the authors of this little gem are regularly on this site answering questions about their work. Just do a search with the #dapper tag.
The only nuisances I have found to date are the required one-to-one match between your POCO property names and the field names, and the sometimes elusive rules relating PK and FK when your keys are not named ID. But that may just be me; I still haven't fully understood these rules.
Consider using Entity Framework. It will automate all this work.
This is based on you getting a DataSet with the 3 tables and creating the proper DataRelations.
For your particular case (200k rows) I don't know how it will perform, but it shouldn't be that bad :).
Your calling code could be something like this:
List<Client> clients = Test.CreateListFromTable<Client>(ds.Tables["Clients"]);
Remember, as I said, it relies on you fetching the DataSet and creating the relations.
Next, here is the class with the methods in question (ClientsToCity and CityToCountry are the names of the DataRelations; you can use your own):
public class Test
{
// function that set the given object from the given data row
public static void SetItemFromRow<T>(T item, DataRow row) where T : new()
{
foreach (DataColumn c in row.Table.Columns)
{
PropertyInfo prop = item.GetType().GetProperty(c.ColumnName);
if (prop != null && row[c] != DBNull.Value)
{
prop.SetValue(item, row[c], null);
}
else
{
if (c.ColumnName == "CityID")
{
object obj = Activator.CreateInstance(typeof(City));
SetItemFromRow<City>(obj as City, row.GetChildRows("ClientsToCity")[0]);
PropertyInfo nestedprop = item.GetType().GetProperty("ClientCity");
nestedprop.SetValue(item, obj, null);
}
else if (c.ColumnName == "CountryID")
{
object obj = Activator.CreateInstance(typeof(Country));
SetItemFromRow<Country>(obj as Country, row.GetChildRows("CityToCountry")[0]);
PropertyInfo nestedprop = item.GetType().GetProperty("CityCountry");
nestedprop.SetValue(item, obj, null);
}
}
}
}
// function that creates an object from the given data row
public static T CreateItemFromRow<T>(DataRow row) where T : new()
{
T item = new T();
SetItemFromRow(item, row);
return item;
}
// function that creates a list of an object from the given data table
public static List<T> CreateListFromTable<T>(DataTable tbl) where T : new()
{
List<T> lst = new List<T>();
foreach (DataRow r in tbl.Rows)
{
lst.Add(CreateItemFromRow<T>(r));
}
return lst;
}
}
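For the flat, joined result the question describes (one row holding ClientID through CountryName), the n-level fill can also be sketched recursively, without hardcoded column names. This is only a minimal sketch under assumptions: NestedMapper and its methods are hypothetical names, property names are assumed to match column names, nested entities are assumed to have a public parameterless constructor, and plain reflection is used instead of the cached delegates from the question. The set of types on the current path is there to stop the infinite recursion (and stack overflow) that a self-referencing entity would otherwise cause.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Reflection;

public class Country { public int CountryID { get; set; } public string CountryName { get; set; } }
public class City { public int CityID { get; set; } public string CityName { get; set; } public Country CityCountry { get; set; } }
public class Client
{
    public int ClientID { get; set; }
    public string ClientName { get; set; }
    public string ClientPhone { get; set; }
    public City ClientCity { get; set; }
}

// Hypothetical helper, not part of any library.
public static class NestedMapper
{
    // Fills 'target' from 'row' and recurses into class-typed properties.
    // Returns true if at least one column was assigned somewhere in the graph.
    public static bool Fill(object target, DataRow row, HashSet<Type> path = null)
    {
        path = path ?? new HashSet<Type>();
        Type type = target.GetType();
        if (!path.Add(type)) return false; // type already on the current path: stop recursing

        bool assigned = false;
        foreach (PropertyInfo prop in type.GetProperties())
        {
            if (!prop.CanWrite || prop.GetIndexParameters().Length > 0) continue;
            Type propType = prop.PropertyType;
            if (propType.IsValueType || propType == typeof(string))
            {
                // Simple property: copy the column with the same name, if present and not NULL.
                if (row.Table.Columns.Contains(prop.Name) && !row.IsNull(prop.Name))
                {
                    Type conversionType = Nullable.GetUnderlyingType(propType) ?? propType;
                    prop.SetValue(target, Convert.ChangeType(row[prop.Name], conversionType), null);
                    assigned = true;
                }
            }
            else if (!propType.IsAbstract && propType.GetConstructor(Type.EmptyTypes) != null)
            {
                // Nested entity: build it and keep it only if some column landed in it.
                object child = Activator.CreateInstance(propType);
                if (Fill(child, row, path))
                {
                    prop.SetValue(target, child, null);
                    assigned = true;
                }
            }
        }
        path.Remove(type);
        return assigned;
    }

    public static List<T> ToList<T>(DataTable table) where T : class, new()
    {
        var list = new List<T>();
        foreach (DataRow row in table.Rows)
        {
            T item = new T();
            Fill(item, row);
            list.Add(item);
        }
        return list;
    }
}
```

Calling NestedMapper.ToList<Client>(dt) on such a table should give clients whose ClientCity.CityCountry is populated. Convert.ChangeType does not cover every type (enums and Guid, for example), so a real mapper would need more conversion cases, and caching the PropertyInfo lookups would matter at 200k rows.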
I am working on an assignment for school and trying to implement as many features as possible just for the sake of learning. Hence I've made a generic mapper that maps database tables to objects to see what's possible. The DB in this case is local. I know I'm making loads and loads of calls and should go about this very differently, but...
Everything works as intended except when a class has a collection of another class.
Example:
class Student {
public int Id { get; set; }
public string Name { get; set; }
}
My method for filling a list of all the students in the database.
public List<TModel> MapEntitiesFromDb<TModel>(string tablename, string customquery = "") where TModel : class, new()
{
try
{
sql = ValidateSelectSql(tablename, customquery);
}
catch (AccessViolationException ex) { Console.WriteLine(ex.Message); }
command.CommandText = sql;
command.Connection = conn;
List<TModel> list = new List<TModel>();
try
{
using (conn)
{
Type t = new TModel().GetType();
conn.Open();
using (reader = command.ExecuteReader())
{
if (t.GetProperties().Length != reader.FieldCount)
throw new Exception("There is a mismatch between the amount of properties and the database columns. Please check the input code and try again.");
//Possible check is to store each column and property name in arrays and match them to a new boolean array, if there's 1 false throw an exception.
string columnname;
string propertyname;
//Pairing properties with columns
while (reader.Read())
{
TModel obj = new TModel();
for (int i = 0; i < reader.FieldCount; i++)
{
columnname = reader.GetName(i).ToString().ToLower();
PropertyInfo[] properties = t.GetProperties();
foreach (PropertyInfo propertyinfo in properties)
{
propertyname = propertyinfo.Name.ToLower();
if (propertyname == columnname)
{
propertyinfo.SetValue(obj, reader.GetValue(i));
break;
}
}
}
list.Add(obj);
}
}
}
}
catch (Exception ex) { Console.WriteLine(ex.Message); }
return list;
}
My ValidateSelectSql just returns the sql string that needs to be used in the query.
After calling:
List<Student> students = MapEntitiesFromDb<Student>("students");
It will return a list with all the students like intended.
Things go wrong when I add a collection for example:
class Student {
public Student()
{
this.Courses = new List<Course>();
string customsqlquery = ::: this works and is tested! :::
Courses = MapEntitiesFromDb<Course>("", customsqlquery);
}
public int Id { get; set; }
public string Name { get; set; }
public ICollection<Course> Courses;
}
The Courses list came back empty, and with some help from the debugger I found out that at the time the object is created the Id property is of course 0. My query filters on the student Id, but when the method that fills the Courses list runs in the constructor, the student's Id is always 0 because it is set at a later stage, so the result is a list with no courses.
I'm wondering whether I should check for an ICollection property after the other properties are set and, if there is one, call a method on the object that in turn runs the code that is now inside the constructor.
I can't call any methods on TModel; otherwise it would be as simple as checking whether TModel has a collection property and calling obj.FillCollection(); after the Id property has been assigned in the MapEntitiesFromDb method.
I was also thinking about recursion. Again, I'd have to find out whether obj has a collection property and then call MapEntitiesFromDb, but that seems undoable because I also need to find out the type between the <> and I can't pass in any custom query from the outside...
Maybe I should tackle it from a whole other perspective?
I could really use some advice on how to tackle this problem.
The most straightforward way to approach this would be to have the collection property lazy load what it needs. I would additionally recommend that you use IEnumerable<T> instead of ICollection<T> because this represents a read-only view of what's currently in the database, nobody should be modifying it in any way.
public class Student
{
private readonly Lazy<IEnumerable<Course>> courses;
public int Id { get; set; }
public IEnumerable<Course> Courses => this.courses.Value;
public Student()
{
this.courses = new Lazy<IEnumerable<Course>>(LoadCourses);
}
private IEnumerable<Course> LoadCourses()
{
var sql = "custom SQL query that uses this.Id after it's loaded";
return MapEntitiesFromDb<Course>("", sql);
}
}
I'm only recommending this approach because you mentioned that this is just an academic exercise to help you learn about the tools available to you. In an actual production environment this approach would very quickly become unwieldy and I would recommend using Entity Framework instead (which may be something else that you might want to learn about).
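To see why the lazy version avoids the "Id is still 0" problem, here is a small self-contained sketch. It assumes an in-memory CourseStore as a stand-in for the database query (a hypothetical name, as are StudentId and QueryCount); the real code would run the custom SQL instead. The loader only runs on the first access to Courses, by which time the mapper has already assigned Id.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Course
{
    public int StudentId { get; set; }
    public string Title { get; set; }
}

// Hypothetical stand-in for the database, so the sketch runs without a connection.
public static class CourseStore
{
    public static readonly List<Course> Rows = new List<Course>
    {
        new Course { StudentId = 1, Title = "Math" },
        new Course { StudentId = 1, Title = "Physics" },
        new Course { StudentId = 2, Title = "History" }
    };

    // Counts how often the "query" ran, to show that loading is deferred and cached.
    public static int QueryCount;

    public static IEnumerable<Course> ForStudent(int studentId)
    {
        QueryCount++;
        return Rows.Where(c => c.StudentId == studentId).ToList();
    }
}

public class Student
{
    private readonly Lazy<IEnumerable<Course>> courses;

    public Student()
    {
        // Nothing is loaded here; Id is still 0 at this point.
        courses = new Lazy<IEnumerable<Course>>(() => CourseStore.ForStudent(Id));
    }

    public int Id { get; set; }
    public string Name { get; set; }

    // The first access runs the loader with the Id that has been set by then.
    public IEnumerable<Course> Courses => courses.Value;
}
```

Because Lazy<T> caches its value, the query runs once per Student instance; changing Id after the first access to Courses would not reload the list.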
I'm using a Data Access Layer based on PetaPoco (DotNetNuke 7.0). I've used it successfully when working with relatively simple objects but I now have to insert an object which contains at least one property which is a List of other objects.
For example:
class Person
{
public Person(){}
public string name { get; set; }
public List<Address> addresses { get; set; }
}
class Address
{
...
}
The actual object I'm working with is much more complex than the example above - there are at least four composite List objects in the object to be inserted into the repository.
What I'd like to be able to do is define the table in SQL and to be able to make a simple call to PetaPoco like this:
public static void AddOrder(Person person)
{
using (IDataContext context = DataContext.Instance())
{
var repository = context.GetRepository<Person>();
repository.Insert(person);
}
}
The background to this is that the object is passed in to a web service from a Knockout/jQuery front-end so a JSON string is converted to a data object which must then be stored on the database.
I think there are three questions really:
1. How do I write the SQL table which represents Person and the contained Addresses list?
2. How do I write the necessary PetaPoco code to insert the Person object together with any objects it contains?
3. Should I forget about trying to store the object in the database and just store the JSON string in the database instead?
Thanks for looking :)
I haven't installed DotNetNuke 7 yet, however I examined the source code at codeplex and I think you can do it this way:
public static void AddOrder(Person person)
{
using (IDataContext context = DataContext.Instance())
{
var repositoryPerson = context.GetRepository<Person>();
var repositoryAddress = context.GetRepository<Address>();
context.BeginTransaction();
try
{
repositoryPerson.Insert(person);
foreach(var address in person.addresses)
{
repositoryAddress.Insert(address);
}
context.Commit();
}
catch (Exception)
{
context.RollbackTransaction();
throw;
}
}
}
I haven't tested it so I can't guarantee it works, however this seems right to me.
Let's imagine that we have a model:
public class InheritModel
{
public int Id { get; set; }
public string Name { get; set; }
public string Description { get; set; }
public string OtherData { get; set; }
}
We have a controller with a view that presents this model:
private InheritModel GetAll()
{
return new InheritModel
{
Name = "name1",
Description = "decs 1",
OtherData = "other"
};
}
public ActionResult Index()
{
return View(GetAll());
}
Now we can edit this in the view, change some data and post it back to the server:
[HttpPost]
public ActionResult Index(InheritModel model)
{
var merged = new MergeModel();
return View(merged.Merge(model, GetAll()));
}
What I need to do:
In the view we have a representation of the model
The user changes something and posts
The Merge method needs to compare the posted model and the previous model field by field
The Merge method creates a new InheritModel with the data that was changed in the posted model; all other data should be null
Can somebody help me write this Merge method?
UPDATE(!)
It's not a trivial task. An approach like:
public InheritModel Merge(InheritModel current, InheritModel orig)
{
var result = new InheritModel();
if (current.Id != orig.Id)
{
result.Id = current.Id;
}
}
That is not applicable. It should be a generic solution. We have more than 200 properties in the model, and the first model is built from several tables in the DB.
public InheritModel Merge(InheritModel current, InheritModel orig)
{
var result = new InheritModel();
if (current.Id != orig.Id)
{
result.Id = current.Id;
}
if (current.Name != orig.Name)
{
result.Name = current.Name;
}
... for the other properties
return result;
}
Another possibility is to use reflection and loop through all properties and set their values:
public InheritModel Merge(InheritModel current, InheritModel orig)
{
var result = new InheritModel();
var properties = TypeDescriptor.GetProperties(typeof(InheritModel));
foreach (PropertyDescriptor property in properties)
{
var currentValue = property.GetValue(current);
if (!Equals(currentValue, property.GetValue(orig))) // compare values; != on boxed objects only compares references
{
property.SetValue(result, currentValue);
}
}
return result;
}
Obviously this works only for 1 level nesting of properties.
Judging by the topic, it seems that what you want is a sort of "change tracking" mechanism, which is definitely not trivial or simple by any means. It probably makes sense to use a modern ORM solution to do that for you, doesn't it?
Otherwise you need to develop something that maintains the "context" (the first-level object cache), like EF's ObjectContext or NH's Session, and that would be the generic solution.
Also, there is no information on what happens at the lower level: how do you actually save the data? Do you already have some mechanism that saves the object by traversing its non-null properties?
I have had similar project experience, which made me think a lot about the original design. Consider the following question:
You have a view that represents a model. Users modify
something in the model through the view, all the CHANGES are posted to
the server, the model is modified, and then it is probably saved to the
database. What on earth is posted to the server?
An instance of InheritModel? No. You want only the changes. It's actually a part of InheritModel: it's an InheritModel updater, an instance of Updater<InheritModel>. And in your question you need to merge two models because your Update method looks like:
public InheritModel Update(InheritModel newModel)
{
//assign the properties of newModel to the old model, and save it to the db
//return the latest version of the InheritModel
}
Now ask yourself: why do I need a whole instance of InheritModel when I just want to update a single property?
So my final solution is: post the changes to the controller, where the argument is something like an Updater<TModel>, not TModel itself. The Updater<TModel> can be applied to a TModel: the properties mentioned in the updater are assigned and saved. There shouldn't be a MERGE operation.
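The answer does not show what an Updater<TModel> looks like, so the following is only one possible sketch of the idea. The class shape, the Set and ApplyTo methods and the ChangedProperties property are all assumptions made for illustration: the updater records just the properties that were changed and later applies them to an existing model through reflection.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class InheritModel
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public string OtherData { get; set; }
}

// Hypothetical Updater<TModel>: holds only the changed properties, not a whole model.
public class Updater<TModel>
{
    private readonly Dictionary<string, object> changes =
        new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);

    // Records a change after checking that the property exists and is writable.
    public Updater<TModel> Set(string propertyName, object value)
    {
        PropertyInfo prop = typeof(TModel).GetProperty(
            propertyName,
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase);
        if (prop == null || !prop.CanWrite)
        {
            throw new ArgumentException("Unknown or read-only property: " + propertyName);
        }
        changes[prop.Name] = value;
        return this;
    }

    public IEnumerable<string> ChangedProperties
    {
        get { return changes.Keys; }
    }

    // Assigns the recorded values to an existing model; untouched properties keep their values.
    public void ApplyTo(TModel target)
    {
        foreach (KeyValuePair<string, object> change in changes)
        {
            typeof(TModel).GetProperty(change.Key).SetValue(target, change.Value, null);
        }
    }
}
```

A controller action could then accept the posted changes, build an updater such as new Updater<InheritModel>().Set("Name", "renamed"), and apply it to the model loaded from the database. Values whose type does not match the property make SetValue throw, so a real implementation would convert or validate them first.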
I was looking to map my database query results to strongly typed objects in my C# code. So I wrote a quick and dirty extension method on the SqlConnection class which runs the query against the database and uses reflection to map the record columns to the object properties. The code is below:
public static T Query<T>(this SqlConnection conn, string query) where T : new()
{
T obj = default(T);
using (SqlCommand command = new SqlCommand(query, conn))
{
using (SqlDataReader reader = command.ExecuteReader())
{
while (reader.Read())
{
obj = new T();
PropertyInfo[] propertyInfos;
propertyInfos = typeof(T).GetProperties();
for (int i = 0; i < reader.FieldCount; i++)
{
var name = reader.GetName(i);
foreach (var item in propertyInfos)
{
if (item.Name.Equals(name, StringComparison.InvariantCultureIgnoreCase) && item.CanWrite)
{
item.SetValue(obj, reader[i], null);
}
}
}
}
}
}
return obj;
}
public class User
{
public int id { get; set; }
public string firstname { get; set; }
public string lastname { get; set; }
public DateTime signupDate { get; set; }
public int age { get; set; }
public string gender { get; set; }
}
var user = conn.Query<User>("select id,firstname,lastname from users");
I just wanted a second opinion on my approach above of using reflection to tie the values together, and on whether there's anything I can do better in the code above. Or is there some other, totally different approach I can take to get the same result?
I think I can probably improve the code in the helper method by removing the loop over propertyInfos and using a dictionary instead. Is there anything else that needs to be tweaked?
P.S.: I'm aware of Dapper; I just wanted to implement something similar on my own to help me learn better.
What you've done is basically what linq-to-sql or other OR-mappers do under the hood. To learn the details of how it works it's always a good idea to write something from scratch.
If you want more inspiration or want to have something that's ready for production use out-of-the-box I'd recommend reading up on linq-to-sql. It is lightweight, yet competent.
There are a few things I can think of:
1. I think that in order to skip the loop you can use:
reader[item.Name]
2. I've done something similar myself, but I never ran into Dapper. I'm not sure if it uses reflection, but it's always a good idea to read someone else's code to sharpen your skills (Scott Hanselman frequently recommends doing so).
3. You can also look at:
http://www.codeproject.com/KB/database/metaquery_part1.aspx
4. You can implement an attribute that maps a field to a database column, but that's just for fun.
Edit:
5. You can also skip the while loop over the reader and just take the first row, and document the fact that your query returns only one object, so that it doesn't pull a thousand rows if the query returns a thousand rows.
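The dictionary idea raised in the question can be sketched as follows. This is a minimal sketch under assumptions: ReaderMapper and MapAll are hypothetical names, the helper takes an IDataReader so it can be tried against an in-memory DataTable, and it returns every row instead of only the last one. The name-to-property lookup is built once, and each column is resolved to its property once before any row is read.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Reflection;

public class User
{
    public int id { get; set; }
    public string firstname { get; set; }
    public string lastname { get; set; }
}

// Hypothetical helper, not part of any library.
public static class ReaderMapper
{
    public static List<T> MapAll<T>(IDataReader reader) where T : class, new()
    {
        // Build the name -> property lookup once, ignoring case.
        Dictionary<string, PropertyInfo> props = typeof(T).GetProperties()
            .Where(p => p.CanWrite)
            .ToDictionary(p => p.Name, StringComparer.OrdinalIgnoreCase);

        // Resolve each column to its property (or null) before reading any row.
        var columns = new PropertyInfo[reader.FieldCount];
        for (int i = 0; i < reader.FieldCount; i++)
        {
            props.TryGetValue(reader.GetName(i), out columns[i]);
        }

        var result = new List<T>();
        while (reader.Read())
        {
            T obj = new T();
            for (int i = 0; i < columns.Length; i++)
            {
                // Skip unmapped columns and database NULLs.
                if (columns[i] != null && !reader.IsDBNull(i))
                {
                    columns[i].SetValue(obj, reader.GetValue(i), null);
                }
            }
            result.Add(obj);
        }
        return result;
    }
}
```

With a real connection this would be called as ReaderMapper.MapAll<User>(command.ExecuteReader()). ToDictionary throws if two properties differ only by case, and SetValue throws when the column type does not match the property type, so both cases would need handling in production code.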
I have a database to which I have to connect through ODBC.
The data fetch takes approx. 2 minutes, and the resulting DataTable has 350,000 records.
I am trying to transform the DataTable into this object graph. The result set has no primary key; the primary key is specified through the view from which I fetch the data.
public class PriceCurve
{
public PriceCurve(DataTable dt)
{
this.Id = int.Parse(dt.AsEnumerable().First()["ID"].ToString());
this.Prices = new List<Price>();
GetPrices(dt);
}
public int Id { get; private set; }
public IList<Price> Prices { get; set; }
private void GetPrices(DataTable dt)
{
foreach (DataColumn column in dt.Columns)
{
switch (this.GetPriceProviderType(column)) // parses ColumnName to Enum
{
case Price.PriceProvider.A:
{
this.Prices.Add(new Price(Price.PriceProvider.A, dt.AsEnumerable()));
}
break;
case Price.PriceProvider.B:
{
this.Prices.Add(new Price(Price.PriceProvider.B, dt.AsEnumerable()));
}
break;
}
}
}
}
public class Price
{
public enum PriceProvider
{
A, B
}
public Price(PriceProvider type, IEnumerable<DataRow> dt)
{
this.Type = type;
this.TradingDates = new List<TradingDate>();
this.GetTradingDates(type, dt);
}
public IList<TradingDate> TradingDates { get; set; }
public PriceProvider Type { get; set; }
private void GetTradingDates(PriceProvider type, IEnumerable<DataRow> dt)
{
var data = dt.Select(column => column["TRADING_DATE"]).Distinct();
foreach (var date in data)
{
this.TradingDates.Add(new TradingDate(date.ToString(), type, dt));
}
}
public class TradingDate
{
public TradingDate(string id, PriceProvider type, IEnumerable<DataRow> dt)
{
this.Id = id;
this.DeliveryPeriodValues = new Dictionary<int, double?>();
this.GetDeliveryPeriodValues(type, dt);
}
public string Id { get; set; }
public IDictionary<int, double?> DeliveryPeriodValues { get; set; }
private void GetDeliveryPeriodValues(PriceProvider type, IEnumerable<DataRow> dt)
{
foreach (var row in dt.Where(column => column["TRADING_DATE"].ToString() == this.Id))
{
try
{
this.DeliveryPeriodValues.Add(
int.Parse(row["DELIVERY_PERIOD"].ToString()),
double.Parse(row[Enum.GetName(typeof(Price.PriceProvider), type)].ToString()));
}
catch (FormatException e)
{
this.DeliveryPeriodValues.Add(
int.Parse(row["DELIVERY_PERIOD"].ToString()),
null);
}
}
}
}
}
I create one object, which contains a list with two objects. Each of these two objects contains a list with 1000 objects. Each of these 1000 objects contains a dictionary with 350 pairs.
It either crashes Visual Studio 2010 during debugging, fails because of OutOfMemory, or takes minutes (unacceptable) to execute.
What is the best approach to this problem? I am new to C# and do not know how to optimize the looping through this huge data set or my object graph. Any help is appreciated.
It either crashes visual studio 2010 during debug, fails because of OutOfMemory or takes minutes
(unacceptable) to execute.
You made me laugh. Really.
350,000 nodes is challenging on a 32-bit machine with .NET. Add some overhead and you are dead. Use objects, not a DataTable, which is VERY memory-hungry.
"Takes minutes" is pretty much down to your decisions and programming. Use a list of objects, not a DataTable. Use a profiler. Don't make beginner mistakes like:
var data = dt.Select(column => column["TRADING_DATE"]).Distinct();
No need for that; deal with duplicates later in the code. Distinct is expensive. Profile it.
foreach (var row in dt.Where(column => column["TRADING_DATE"].ToString() == this.Id))
That is 350,000 row lookups by name to get the index of the column, each compared via ToString, and it is repeated for every trading date.
Get a profiler and find out exactly where you spend your time. Please get rid of the table and use objects: DataTable is a memory hog and SLOW compared to a list of objects. And yes, it will take minutes. Main reasons:
Your programming. No shame in that. Just learn. Go objects / structs NOW.
ODBC. It takes time just to load the data, especially as you don't process while loading (DataReader) but wait for everything to have loaded, and ODBC is NOT fast. 350,000 rows over a good network directly from SQL Server is maybe 30 seconds; on the same machine, less.
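The advice above (process rows while they stream from a DataReader, resolve column positions once, and avoid rescanning all rows per trading date) can be sketched as a single pass. This is a minimal sketch under assumptions: PriceLoader and Load are hypothetical names, the column names TRADING_DATE and DELIVERY_PERIOD are taken from the question, and the result is a plain nested dictionary in place of the Price and TradingDate classes.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

// Hypothetical helper, not part of any library.
public static class PriceLoader
{
    // Returns trading date -> (delivery period -> value) for one provider column.
    public static Dictionary<string, Dictionary<int, double?>> Load(IDataReader reader, string providerColumn)
    {
        // Resolve column positions once instead of looking them up by name for every row.
        int dateCol = reader.GetOrdinal("TRADING_DATE");
        int periodCol = reader.GetOrdinal("DELIVERY_PERIOD");
        int valueCol = reader.GetOrdinal(providerColumn);

        var result = new Dictionary<string, Dictionary<int, double?>>();
        while (reader.Read())
        {
            string date = Convert.ToString(reader.GetValue(dateCol), CultureInfo.InvariantCulture);

            // Find or create the bucket for this trading date; no Distinct and no rescan needed.
            Dictionary<int, double?> periods;
            if (!result.TryGetValue(date, out periods))
            {
                periods = new Dictionary<int, double?>();
                result[date] = periods;
            }

            int period = Convert.ToInt32(reader.GetValue(periodCol), CultureInfo.InvariantCulture);
            string raw = Convert.ToString(reader.GetValue(valueCol), CultureInfo.InvariantCulture);

            // TryParse replaces the try/catch around double.Parse; unparsable values become null.
            double parsed;
            periods[period] = double.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out parsed)
                ? parsed
                : (double?)null;
        }
        return result;
    }
}
```

Each row is touched once, so the work grows with the number of rows, not with rows times trading dates. Unlike the Add call in the question, the indexer assignment silently overwrites a repeated delivery period for the same date; whether that is acceptable depends on the data.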