Each row of a DataTable as a Class? - c#

I have a "location" class. This class basically holds addresses, hence the term "location". I have a datatable that returns multiple records which I want to be "locations".
Right now, I have a "load" method in the "location" class, that gets a single "location" by ID. But what do I do when I want to have a collection of "location" objects from multiple datatable rows? Each row would be a "location".
I don't want to go to the database for each record, for obvious reasons. Do I simply create a new instance of the location class for each row, assigning values to its properties while looping through the rows in the datatable and bypassing the "load" method? It seems logical to me, but I am not sure if this is the correct or most efficient approach.

What you describe is pretty much how a row (or a collection of rows) of data gets mapped to C# business objects. But to save yourself a lot of work, consider one of the many existing ORM (object-relational mapper) frameworks such as NHibernate, Entity Framework, or Castle ActiveRecord.
Most ORMs will generate all the boilerplate code in which rows and fields are mapped to your .NET object properties and vice versa. (Yes, ORMs let you add, update and delete database data just as easily as retrieving and mapping it.) Do give the ORMs a look. The small amount of learning (there is some learning curve with each) will pay off very shortly. ORMs are also becoming quite standard, and indeed expected, in any application that touches an RDBMS.
Additionally these links may be of interest (ORM-related):
Wikipedia article on ORMs
SO Discussion on different ORMs
Many different .NET ORMs listed

You're on the right track; getting all the locations you need in one trip to the database is best in terms of performance.
To make your code cleaner/shorter, make a constructor of your Location class that takes a DataRow, which will then set your properties accordingly. By doing this, you'll centralize your mapping from columns to properties in one place in your code base, which will be easy to maintain.
Then, it's totally acceptable to loop through the rows in your data table and call your constructor.
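A minimal sketch of that approach (the Location class, its properties, and the column names here are assumptions for illustration):

// Hypothetical Location class; adjust the properties and column names to your schema.
public class Location
{
    public int Id { get; set; }
    public string Address { get; set; }

    public Location(DataRow row)
    {
        // All column-to-property mapping lives in this one constructor.
        Id = (int)row["Id"];
        Address = (string)row["Address"];
    }
}

// Looping through the DataTable and calling the constructor for each row:
var locations = new List<Location>();
foreach (DataRow row in locationsTable.Rows)
{
    locations.Add(new Location(row));
}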
You could also use an object-relational mapper, like Entity Framework, to do your database interaction.

Create a method that returns an IEnumerable<Location>. In this method do your database work; I often pass the SqlDataReader into the Location constructor. So I would have something like this:
public static IEnumerable<Location> GetLocations()
{
    List<Location> retval = new List<Location>();
    using (SqlConnection conn = new SqlConnection(connectionString)) // your connection string here
    {
        conn.Open();
        SqlCommand command = new SqlCommand("spLoadData", conn);
        command.CommandType = CommandType.StoredProcedure;
        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                retval.Add(new Location(reader));
            }
        }
    }
    return retval;
}
That's just to give you an idea; adapt the connection string, stored procedure name and Location constructor to your own setup.
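The Location constructor taking the reader is assumed above; a hedged sketch of what it could look like (the column names are hypothetical and must match what the stored procedure returns):

public Location(SqlDataReader reader)
{
    // Map reader columns to properties; rename to match your schema.
    Id = (int)reader["Id"];
    Address = (string)reader["Address"];
}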
An ORM could save you loads of time if you have a lot of this to do, however!

Related

Migrating Db and Entity framework

I have to take data from an existing database and move it into a new database that has a new design. So the new database has other columns and tables than the old one.
So basically I need to read tables from the old database and put that data into the new structure, some data won't be used anymore and other data will be placed in other columns or tables etc.
My plan was to just read the data from the old database with basic queries like
Select * from mytable
and use Entity Framework to map the new database structure. Then I can basically do something similar to this:
while (result.Read())
{
    context.Customer.Add(new Customer
    {
        Description = (string) result["CustomerDescription"],
        Address = (string) result["CuAdress"],
        // and so on for all properties
    });
}
context.SaveChanges();
I think it is more convenient to do it like this to avoid writing massive INSERT statements and so on, but are there any problems with doing it this way? Is this considered bad for some reason that I don't understand, such as poor performance or other pitfalls? If anyone has any input on this it would be appreciated, so I don't start with this and have it turn out to be a big no-no for some reason.
Something that you could perhaps also try is simply to write a new DbContext class for the new target database.
Then simply write a console application with a static method which copies entities and properties from the one context to the other.
This will ensure that your referential integrity remains intact and saves you a lot of hassle in terms of having to write SQL code, since EF does all the heavy lifting for you in this regard.
If the DbContext contains a lot of entity DbSets, I recommend that you use some sort of automapper.
But this depends on the amount of data that you are trying to move. If we are talking terabytes, I would suggest you do not take this approach.
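A rough sketch of that console-application approach, assuming hypothetical OldDbContext/NewDbContext classes and a Customer entity on each side (the property mapping is illustrative only):

// Hypothetical contexts and entity names; adjust to your actual models.
using (var oldDb = new OldDbContext())
using (var newDb = new NewDbContext())
{
    foreach (var oldCustomer in oldDb.Customers)
    {
        newDb.Customers.Add(new Customer
        {
            Description = oldCustomer.CustomerDescription,
            Address = oldCustomer.CuAdress
            // map or drop the remaining columns as the new design requires
        });
    }
    newDb.SaveChanges();   // EF generates the INSERTs for you
}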

Efficient way to sync a SQL table with a DataTable

I need to sync a SQL table with data from a DataTable (which is a modified copy of the SQL table). I'd like to update/delete/insert only the differences, so I need to compare both and find the key value (in my case ID) and the change type. Is there an efficient way, perhaps via some built-in method? I'd like to hit the database as little as possible.
Create a data adapter, set its commands, and fill your DataTable. Work with your DataTable.
Then get a DataTable containing only the changes:
DataTable updateDt = originalDt.GetChanges();
dataAdapter.Update(updateDt);
This is the basic logic of working in disconnected mode and updating database.
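A minimal end-to-end sketch of that disconnected pattern, assuming a SqlDataAdapter and a hypothetical table MyTable with a primary key:

// Hypothetical connection string and table name.
using (var conn = new SqlConnection(connectionString))
{
    var adapter = new SqlDataAdapter("SELECT * FROM MyTable", conn);
    // The command builder derives INSERT/UPDATE/DELETE commands from the SELECT
    // (this requires the table to have a primary key).
    var builder = new SqlCommandBuilder(adapter);

    var originalDt = new DataTable();
    adapter.Fill(originalDt);

    // ... modify rows in originalDt (your in-memory copy) ...

    DataTable updateDt = originalDt.GetChanges();   // null if nothing changed
    if (updateDt != null)
    {
        adapter.Update(updateDt);       // writes only the changed rows back
        originalDt.AcceptChanges();
    }
}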
I can't recommend a specific practical strategy that you can code yourself, but I recommend that you look at the services provided by the Entity Framework (SaveChanges, the ObjectContext, adding/removing entities from the ObjectContext, the Include keyword, navigation and association properties of related entities, etc.) as an estimate of the range and complexity of issues you need to solve.
The SQL Server replication framework can also give you some hints (merge conflict resolution strategies).

How to create a business model wrapper for a generic database approach?

I'm currently facing a performance problem with creating POCO objects from my database. I'm using Entity Framework 4 as OR-Mapper.
The whole application is a prototype for now.
Let's assume I want to have some business objects like classes 'Printer' or 'Scanner'. Both classes inherit from a BaseClass called Product.
The business classes exist.
I am trying to use a more generic database approach. I don't want to create tables for "Printer" or "Scanner". I want to have three tables: one called Product, plus Property and PropertyValue (which stores all values assigned to a specific Product).
In my business layer I do create a specific object like this:
public Printer GetPrinter(int IDProduct)
{
    Printer item = new Printer();
    // get the product object with EF
    // get all PropertyValues
    // (with Reflection) foreach property in item.GetType().GetProperties()
    // {
    //     property.SetValue("specific value")
    // }
    return item;
}
The EF model (diagram not reproduced here) contains the Product, Property and PropertyValue entities.
Works fine so far. For now I'm doing performance tests for retrieving multiple sets.
I've created a prototype and improved it several times to increase the performance. It is still far away from being usable.
It takes 919 ms to create 300 objects that only contain 3 properties.
The reason for choosing such a DB design is to have a generic database design: adding new properties should only require changes in the business model.
Am I just failing to find a performant way of retrieving this many objects, or is my approach totally wrong? As far as I understand OR mappers, they are basically doing the same thing?
I think you missed the whole point of ORM. The reason people use an ORM is to be able to persist business objects and easily retrieve them again. You are using the ORM just to get data for your business objects' factories. The factories then use reflection to build business objects from the materialized classes retrieved by the ORM. This will always be very slow because:
Query compilation is slow (you can precompile it)
Object materialization is slow (you can't avoid it)
Reflection is slow (you can't avoid it)
IMO, if you want to follow this DB design with generic tables absolutely independent of your business objects, you don't need an ORM, or at least you don't need EF.
The reason for your performance problems is that the generic approach is not followed in your business model, so somewhere you must convert generic data to specific data, which is a slow operation.
If you want to improve performance, define a set of shared properties and place them in Product. Then either use your current PropertyValue and Property tables for additional non-shared properties, or simply use an ExtendedProperties table storing key-value pairs. Your entities will be of type Product, with an inner type property, the shared properties, and a collection of extended properties. That is the generic approach.
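A hedged sketch of what that entity shape might look like (the class, property and table names below are assumptions for illustration, not taken from the question):

// Hypothetical entities for the "shared properties plus extended properties" layout.
public class Product
{
    public int Id { get; set; }
    public string ProductType { get; set; }      // e.g. "Printer" or "Scanner"
    public string Name { get; set; }             // shared property
    public decimal Price { get; set; }           // shared property
    public virtual ICollection<ExtendedProperty> ExtendedProperties { get; set; }
}

public class ExtendedProperty
{
    public int Id { get; set; }
    public int ProductId { get; set; }
    public string Key { get; set; }
    public string Value { get; set; }
}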
Firstly, it's not clear to me what you have in the way of POCOs. Did you hand-code these and your context, or were they T4-generated? There are some great articles that benchmark performance with no POCOs, T4-generated POCOs/context, and hand-coded POCOs/context. As expected, there are huge performance savings (more than a 15-fold boost in that benchmark) going the POCO route rather than using the objects the Entity Framework generates. You don't say which DBMS; if MSSQL, have you turned on the profiler to see what SQL is being generated?

Database Best-Practices for Beginners

So, I am a fairly new programmer working towards an undergraduate Comp Sci degree with a very small amount of work experience. In looking for internship-type jobs for my program, I have noticed that what I've heard from several profs -- "working with databases makes up 90% of all modern computer science jobs" -- looks like it is actually true. However, my program doesn't really have any courses with databases until 3rd year, so I'm trying to at least learn some things myself in the mean time.
I've seen very little on SO and the internet in general for somebody like myself. There seem to be tons of tutorials on the mechanics of how to read and write data in a database, but little on the associated best practices. To demonstrate what I am talking about, and to help get across my actual question, here is what can easily be found on the internet:
public static void Main()
{
    var results = new DataTable();
    using (var conn = new OdbcConnection())
    {
        var command = new OdbcCommand();
        command.Connection = conn;
        command.CommandText = "SELECT * FROM Customer WHERE id = 1";
        var dbAdapter = new OdbcDataAdapter();
        dbAdapter.SelectCommand = command;
        dbAdapter.Fill(results);
    }
    // then you would do something like
    string customerName = (string) results.Rows[0]["name"];
}
And so forth. This is pretty simple to understand but obviously full of problems. I started out with code like this and quickly started saying things like "well it seems dumb to just have SQL all over the place, I should put all that in a constants file." And then I realized that it was silly to have those same lines of code all over the place and just put all that stuff with connection objects etc inside a method:
public DataTable GetTableFromDB(string sql)
{
    // code similar to first sample
}

string getCustomerSql = String.Format(Constants.SelectAllFromCustomer, customerId);
DataTable customer = GetTableFromDB(getCustomerSql);
string customerName = (string) customer.Rows[0]["name"];
This seemed to be a big improvement. Now it's super-easy to, say, change from an OdbcConnection to an SQLiteConnection. But that last line, accessing the data, still seemed awkward; and it is still a pain to change a field name (like going from "name" to "CustName" or something). I started reading about using typed Data sets or custom business objects. I'm still kind of confused by all the terminology, but decided to look into it anyway. I figure that it is stupid to rely on a shiny Database Wizard to do all this stuff for me (like in the linked articles) before I actually learn what is going on, and why. So I took a stab at it myself and started getting things like:
public class Customer
{
    public string Name { get; set; }
    public int Id { get; set; }

    public void Populate()
    {
        string getCustomerSql = String.Format(Constants.SelectAllFromCustomer, this.Id);
        DataTable customer = GetTableFromDB(getCustomerSql);
        this.Name = (string) customer.Rows[0]["name"];
    }

    public static IEnumerable<Customer> GetAll()
    {
        foreach ( ... )
        {
            // blah blah
            yield return customer;
        }
    }
}
to hide the ugly table stuff and provide some strong typing, allowing outside code to just do things like
var customer = new Customer(custId);
customer.Populate();
string customerName = customer.Name;
which is really nice. And if the Customer table changes, changes in the code only need to happen in one place: inside the Customer class.
So, at the end of all this rambling, my question is this. Has my slow evolution of database code been going in the right direction? And where do I go next? This style is all well and good for small-ish databases, but when there are tons of different tables, writing out all those classes for each one would be a pain. I have heard about software that can generate that type of code for you, but am kind of still confused by the DAL/ORM/LINQ2SQL/etc jargon and those huge pieces of software are kind of overwhelming. I'm looking for some good not-overwhelmingly-complex resources that can point me in the right direction. All I can find on this topic are complex articles that go way over my head, or articles that just show you how to use the point-and-click wizards in Visual Studio and such. Also note that I'm looking for information on working with Databases in code, not information on Database design/normalization...there's lots of good material on that out there.
Thanks for reading this giant wall of text.
Very good question indeed and you are certainly on the right track!
Being a computer engineer myself, databases and how to write code to interact with them were never a big part of my university degree either, and sure enough I'm responsible for all the database code at work.
Here's my experience, using legacy technology from the early 90s on one project and modern technology with C# and WPF on another.
I'll do my best to explain terminology as I go but I'm certainly not an expert myself yet.
Tables, Objects, and Mappings Oh My!
A database contains tables, but what really is a table? It's just flat data related to other flat data, and if you dive in and start grabbing things it's going to get messy quickly! Strings will be all over the place, SQL statements repeated, records loaded twice, etc. It's therefore generally good practice to represent each table record (or collection of table records, depending on their relationships) as a single object, generally referred to as a Model. This helps to encapsulate the data and provides functionality for maintaining and updating its state.
In your posting your Customer class would act as the Model! So you've already realized that benefit.
Now there are a variety of tools/frameworks (LINQ2SQL, dotConnect, Mindscape LightSpeed) that will write all your Model code for you. In the end they are mapping objects to relational tables, or O/R mapping as it is referred to.
As expected, when your database changes so do your O/R mappings. As you touched on, if your Customer changes you only have to fix it in one place, which is again why we put things in classes. In the case of my legacy project, updating models consumed a lot of time because there were so many, while in my newer project it's a few clicks, but ultimately the result is the same.
Who should know what?
In my two projects there have been two different approaches to how objects interact with their tables.
In some camps, Models should know everything about their tables: how to save themselves, have direct shared access to the connection/session, and be able to perform actions like Customer.Delete() and Customer.Save() all by themselves.
Other camps put the reading, writing and deleting logic in a managing class, for example MySessionManager.Save(myCustomer). This methodology has the advantage that change tracking is easy to implement and all objects can be made to reference the same underlying table record. Implementing it, however, is more complex than the previously mentioned method of localized class/table logic.
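A rough, hedged sketch contrasting the two styles (the class and method names are illustrative, not from any particular framework):

// Style 1: the model knows how to persist itself.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }

    public void Save()   { /* write this record via the shared connection */ }
    public void Delete() { /* delete this record via the shared connection */ }
}

// Style 2: a managing class owns all persistence; the model stays plain.
public class SessionManager
{
    public void Save(Customer customer)   { /* track changes and persist */ }
    public void Delete(Customer customer) { /* remove the underlying record */ }
}

// Usage of style 2:
// var manager = new SessionManager();
// manager.Save(myCustomer);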
Conclusion
You're on the right track and in my opinion interacting with databases is extremely rewarding. I can remember my head spinning when I first started doing research myself.
I would recommend experimenting a bit, start a small project maybe a simple invoicing system, and try writing the models yourself. After that try another small project and try leveraging a database O/R mapping tool and see the difference.
Your evolution is definitely in the right direction. A few more things to consider:
Use prepared statements (parameterized commands) instead of String.Format to bind your parameters. This will protect you from SQL injection attacks; see the sketch after this list.
Use the DbProviderFactory and System.Data.Common interfaces to further decouple your implementation from a specific database.
After that, look at methods to generate your SQL commands and map data into objects automatically. If you don't want to jump into a big complex ORM, look for simple examples: ADO.NET ORM in 10 minutes, Light ORM library, or Creating an ORM in .NET. If you decide to go this route, you'll ultimately be better served by a mature library like the Entity Framework, Hibernate, or SubSonic.
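A minimal sketch of a parameterized, provider-agnostic query, assuming the Customer table from the question (the helper name and connection string are hypothetical, and the parameter marker syntax varies by provider):

// Uses only System.Data.Common types so the provider can be swapped.
public static DataTable GetCustomerById(DbProviderFactory factory, string connectionString, int customerId)
{
    using (DbConnection conn = factory.CreateConnection())
    {
        conn.ConnectionString = connectionString;

        DbCommand command = conn.CreateCommand();
        // "@id" is the SQL Server style marker; ODBC uses "?" and Oracle uses ":id".
        command.CommandText = "SELECT * FROM Customer WHERE id = @id";

        DbParameter idParam = command.CreateParameter();
        idParam.ParameterName = "@id";
        idParam.Value = customerId;
        command.Parameters.Add(idParam);

        var results = new DataTable();
        using (DbDataAdapter adapter = factory.CreateDataAdapter())
        {
            adapter.SelectCommand = command;
            adapter.Fill(results);   // Fill opens and closes the connection as needed
        }
        return results;
    }
}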
My advice if you want to learn about databases: the first step is to forget about the programming language; next, forget about which database you are using, and learn SQL. Sure, there are many differences between MySQL, MS SQL Server and Oracle, but there is so much that is the same.
Learn about joins, column aliases (SELECT ... AS), date formats, and normalization. Learn what happens when you have millions and millions of records and things start to slow down, then learn to fix it.
Create a test project related to something that interests you, for example a bike store. See what happens when you add a few million products, and a few million customers and think of all the ways the data needs to relate.
Use a desktop app for running queries on a local database (Sequel Pro, MySQL Workbench, etc.) as it's much quicker than uploading source code to a server. And have fun with it!
IMHO, you're definitely going in the right direction toward maintainable code that's really nice to work with! However, I'm not convinced the approach will scale to a real app. A few thoughts that may be helpful:
While the code you're writing will be really nice to work with and really maintainable, it involves a lot of work up front; this is part of the reason the wizards are so popular. They aren't the nicest thing to work with, but they save a lot of time.
Querying from the database is just the beginning; another reason for the use of typed DataSets and wizards in general is that in most applications, users are at some stage going to edit your information and send it back for updating. Single records are fine, but what if your data is best represented in a normalised way with a hierarchy of tables four deep? Writing code by hand to generate the update/insert/delete statements for all of that can be hellish, so tools are the only way forward. Typed DataSets will generate all the code to perform these updates for you and have some very powerful functionality for handling disconnected (e.g. client-side) updates/rollbacks of recent modifications.
Heed what the other answers said about SQL injection (which is a SERIOUSLY big deal in industry): protect yourself by using a DbCommand object and adding DbParameters.
In general there's a really big problem in going from code to databases, referred to as the impedance mismatch. Bridging the gap is very tricky, and that's why the majority of the industry relies on tools to do the heavy lifting. My advice would be to try the wizards out, because while stepping through a wizard is no test of skill, learning all their drawbacks/bugs and their various workarounds is a really useful skill in industry, and will allow you to get to some more advanced scenarios in data management more quickly (e.g. the disconnected update of a 4-deep table hierarchy I mentioned).
If you're a bit scared of things like Linq to SQL and the Entity Framework, you could step half way in between and explore something like iBATIS.NET. It is simply a data mapper tool that takes some of the pain of the database connection management and mapping your result sets to custom domain objects.
You still have to write all of your object classes and SQL, but it maps all of your data to the classes for you using reflection, and you don't have to worry about all of the underlying connectivity (you could easily write a tool to generate your classes). When you're up and running with iBATIS (assuming you might be interested), your code will start to look like this:
var customer = Helpers.Customers.SelectByCustomerID(1);
That SelectByCustomerID function exists inside the Customers mapper, whose definition might look like:
public Customer SelectByCustomerID(int id)
{
    return Mapper.QueryForObject<Customer>("Customers.SelectByID", id);
}
The "Customers.SelectByID" maps to an XML statement definition where "Customers" is the namespace and "SelectByID" is the ID of the map containing your SQL:
<statements>
  <select id="SelectByID" parameterClass="int" resultClass="Customer">
    SELECT * FROM Customers WHERE ID = #value#
  </select>
</statements>
Or when you want to change a customer you can do things like:
customer.FirstName = "George";
customer.LastName = "Costanza";
Helpers.Customers.Update(customer);
LINQ to SQL and the Entity Framework get fancier by producing the SQL for you automatically. I like iBATIS because I still have full control of the SQL and what my domain objects look like.
Check out iBATIS (now migrated to Google under the name MyBatis.NET). Another great package is NHibernate, which is a few steps ahead of iBATIS and closer to a full ORM.
Visual database page with just a ComboBox and a DataGrid
namespace TestDatabase.Model
{
    class Database
    {
        private MySqlConnection connecting;
        private MySqlDataAdapter adapter;

        public Database()
        {
            connecting = new MySqlConnection("server=;uid=;pwd=;database=;");
            connecting.Open();
        }

        public DataTable GetTable(string tableName)
        {
            adapter = new MySqlDataAdapter("SELECT * FROM " + tableName, connecting);
            DataSet ds = new DataSet();
            adapter.Fill(ds);
            adapter.UpdateCommand = new MySqlCommandBuilder(adapter).GetUpdateCommand();
            adapter.DeleteCommand = new MySqlCommandBuilder(adapter).GetDeleteCommand();
            ds.Tables[0].RowChanged += new DataRowChangeEventHandler(Rowchanged);
            ds.Tables[0].RowDeleted += new DataRowChangeEventHandler(Rowchanged);
            return ds.Tables[0];
        }

        public void Rowchanged(object sender, DataRowChangeEventArgs args)
        {
            adapter.Update(sender as DataTable);
        }
    }
}

NHibernate and Modular Code

We're developing an application using NHibernate as the data access layer.
One of the things I'm struggling with is finding a way to map 2 objects to the same table.
We have an object which is suited to data entry, and another which is used in more of a batch process.
The table contains all the columns for the data entry and some additional info for the batch processes.
When it's in a batch process I don't want to load all the data just a subset, but I want to be able to update the values in the table.
Does NHibernate support multiple objects mapped to the same table? And what is the feature that allows this?
I tried it a while ago, and I remember that if you do a query for one of the objects it loads double the number of objects, but I'm not so sure I didn't miss something.
e.g. 10 data entry objects + 10 batch objects, so 20 objects instead of 10.
Can anyone shed any light on this?
I should clarify that these are two different objects which, in my mind, should not be polymorphic in behaviour. They do, however, point at the same database record; it's more that the record has a dual purpose within the application, and for the sake of logical partitioning they should be kept separate. (A change to one domain object should not blow up numerous screens in other modules, etc.)
Thanks
Pete
An easy way to map multiple objects to the same table is by using a discriminator column. Add an extra column to the table and have it contain a value declaring it as type "Data Entry" or "Batch Process".
You'd create two classes, one for Data Entry and one for Batch Process. I'm not entirely sure how you express that in regular NHibernate XML mapping; I use Castle ActiveRecord for annotating, so you'd mark up your objects like so:
[ActiveRecord("[Big Honking Table]",
    DiscriminatorColumn = "Type",
    DiscriminatorType = "String",
    DiscriminatorValue = "Data Entry")]
public class DataEntry : ActiveRecordBase
{
    // Your stuff here!
}

[ActiveRecord("[Big Honking Table]",
    DiscriminatorColumn = "Type",
    DiscriminatorType = "String",
    DiscriminatorValue = "Batch Process")]
public class BatchProcess : ActiveRecordBase
{
    // Also your stuff!
}
Here's the way to do it with NHibernate + Castle ActiveRecord: http://www.castleproject.org/activerecord/documentation/trunk/usersguide/typehierarchy.html
Note that they use a parent object - I don't think that's necessary but I haven't implemented a discriminator column exactly the way you're describing, so it might be.
And here's the mapping in XML: https://www.hibernate.org/hib_docs/nhibernate/html/inheritance.html
You can also, through the mapping, let NHibernate know which columns to load / update - if you end up just making one big object.
I suspect you might be over-engineering this just a little bit:
If you're worried about performance, that's premature optimization (besides, retrieving fewer columns is not much faster, and for saving you can enable dynamic updates so that only the columns that changed are written).
If you're trying to protect the programmer from himself by locking down his choices, you're complicating your design for a not-so-noble cause.
In short, based on my 10+ years of experience and my somewhat limited understanding of your problem, I recommend you think again about doing what you want to do.
