Normally, when working with data from a database in a forms application, I keep it in a DataSet or DataTable and pull data out as needed. Now I am working with WPF and trying to conform more to the MVVM pattern. Converting these DataTables to objects makes it a little easier to work with MVVM.
For example, if I had a table filled by a query like so -
Select p.first_name, p.last_name, p.phone, p.email from person as p where p.first_name = 'Bob'
Instead of keeping the DataTable, I would now just convert that into a Person object.
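For example, the conversion I have in mind looks roughly like this (a minimal sketch; Person is just a class I'd write by hand, and "table" is the DataTable filled by the query above):

    using System.Collections.Generic;
    using System.Data;   // Field<T>/AsEnumerable need a reference to System.Data.DataSetExtensions
    using System.Linq;

    public class Person
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string Phone { get; set; }
        public string Email { get; set; }
    }

    // "table" is the DataTable filled by the query above.
    List<Person> people = table.AsEnumerable()
        .Select(row => new Person
        {
            FirstName = row.Field<string>("first_name"),
            LastName = row.Field<string>("last_name"),
            Phone = row.Field<string>("phone"),
            Email = row.Field<string>("email")
        })
        .ToList();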
From a performance standpoint, is there any downfall to making objects, or should I stick with DataSets and DataTables?
The performance impact of using an ORM such as EF (or rolling your own) instead of DataTable/DataSet is negligible in applications like the one you're describing, but it depends on how it's implemented.
The main advantage of using an ORM is ensuring type-safety and not having to perform type-casting when retrieving data from a DataTable object. There are also benefits from using lazy-loading in Linq.
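To make the casting point concrete, a rough comparison (the Person property names are just illustrative):

    // DataTable: stringly-typed column name plus a cast that can fail at run time
    string name = (string)table.Rows[0]["first_name"];

    // Mapped object: the compiler checks both the member and its type
    string name2 = bob.FirstName;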
I don't think that using entity objects everywhere is necessarily a solution. There's nothing really wrong with using a DataTable object in a ViewModel (although I'm unsure how you'd use it with respect to the data synchronisation features of the DataTable class, but you're free to not use it at all).
In a new project, I'd use EF, only because I like having the typing taken care of for me, but if you've got an older project using tables that works fine, I'd stick with it.
Just make sure that your objects contain only the necessary fields, and embrace the objects if they make the solution clearer.
Any eventual performance downfall (minimal) will be compensated by the potential long-term performance benefits of a cleaner and clearer solution.
Is it possible?
Please note I am not using LINQ nor Entity Framework.
You could also check out Dapper-Dot-Net - a very lightweight and very capable "micro ORM" which - incidentally - is used to run this site here.
It's quite fast, just a single *.cs file, works with your usual T-SQL commands and returns objects - works like a charm, very easy to understand, no big overhead - just use it and enjoy!
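As a rough illustration of how Dapper reads, assuming a Person class like the one in the first question and a connection string you supply (the column aliases are there because Dapper maps result columns to properties by name):

    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.Linq;
    using Dapper;   // adds Query<T>() as an extension method on IDbConnection

    public static List<Person> GetPeopleNamedBob(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            // Plain T-SQL in, strongly typed Person objects out.
            return connection.Query<Person>(
                "select first_name as FirstName, last_name as LastName, " +
                "phone as Phone, email as Email " +
                "from person where first_name = @FirstName",
                new { FirstName = "Bob" }).ToList();
        }
    }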
My personal favorite uses the dynamic object feature introduced in .NET 4, via Rob Conery's Massive library. Like Dapper-Dot-Net, it is small.
By going old school you can use DataSets to create strongly typed data table classes that mirror your database entirely, right down to the relationships. They're a precursor to LINQ/EF that auto-generates a lot of bloated code, but they're very handy for maintaining your field names, data types and data constraints, and for performing easily configured rapid updates.
http://msdn.microsoft.com/en-us/library/esbykkzb(v=VS.100).aspx
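For reference, consuming a designer-generated typed DataSet looks roughly like this; the PersonDataSet and PersonTableAdapter names below are hypothetical stand-ins for whatever the .xsd designer generates in your project:

    // Hypothetical designer-generated classes.
    var dataSet = new PersonDataSet();
    var adapter = new PersonDataSetTableAdapters.PersonTableAdapter();

    adapter.Fill(dataSet.Person);

    // Columns surface as strongly typed properties, so there is no casting from object.
    foreach (PersonDataSet.PersonRow row in dataSet.Person.Rows)
    {
        Console.WriteLine("{0} {1}", row.first_name, row.last_name);
    }

    // Edits made to the typed rows can be pushed back through the same adapter.
    adapter.Update(dataSet.Person);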
I'm working on creating a dashboard. I have started to refactor the application so methods relating to querying the database are generic or dynamic.
I'm fairly new to the concept of generics and still an amateur programmer, but I have done some searching and tried to come up with a solution. The problem is not really building the query string in a dynamic way; I'm fine with concatenating string literals and variables, and I don't really need anything more complex. The bigger issue for me is, once I have created this query, getting back the data and assigning it to the correct variables in a dynamic way.
Let's say I have a table of defects, another for test cases and another for test runs. I want to create a method that looks something like:
public void QueryDatabase<T>(ref List<T> Entitylist, List<string> Columns, string query) where T: Defect, new()
Now this is not perfect, but you get the idea. Not everything about defects, test cases and test runs is the same, but I'm looking for a way to dynamically assign the retrieved columns to their "correct" variables.
If more information is needed I can provide it.
You're re-inventing the wheel. Use an ORM, like Entity Framework or NHibernate. You will find it's much more flexible, and tools like that will continue to evolve over time and add new features and improve performance while you can focus on more important things.
EDIT:
Although I think overall it's important to learn to use tools for something like this (I'm personally a fan of Entity Framework and have used it successfully on several projects now, and used LINQ to SQL before that), it can still be valuable as a learning exercise to understand how to do this. The ORMs I have experience with use XML to define the data model, and use code generation on the XML file to create the class model. LINQ to SQL uses custom attributes on the code-generated classes to define the source table and columns for each class and property, and reflection at run-time to map the data returned from a SqlDataReader to the properties on your class object. Entity Framework can behave differently depending on the version you're using and whether you use the "default" or "POCO" templates, but ultimately it does basically the same thing (using reflection to map the database results to properties on your class); it just may or may not use custom attributes to determine the mapping. I assume NHibernate does it the same way as well.
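For example, the attribute-based mapping LINQ to SQL emits looks roughly like this (heavily simplified; the real generated classes also carry storage fields and change-notification plumbing):

    using System.Data.Linq.Mapping;

    [Table(Name = "person")]
    public class Person
    {
        [Column(Name = "first_name")]
        public string FirstName { get; set; }

        [Column(Name = "last_name")]
        public string LastName { get; set; }
    }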
You are reinventing the wheel, yes, it's true. You are best advised to use an object-relational mapper off the shelf. But I think you also deserve an answer to your question: to assign the query results dynamically to the correct properties, you would use reflection. See the documentation for the System.Reflection namespace if you want more information.
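A minimal sketch of that reflection approach, assuming the column names returned by the query match the property names on T:

    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Reflection;

    public static List<T> MapAll<T>(IDataReader reader) where T : new()
    {
        var results = new List<T>();
        PropertyInfo[] properties = typeof(T).GetProperties();

        while (reader.Read())
        {
            var item = new T();
            foreach (PropertyInfo property in properties)
            {
                int ordinal;
                try
                {
                    // Skip properties the query did not return a column for.
                    ordinal = reader.GetOrdinal(property.Name);
                }
                catch (IndexOutOfRangeException)
                {
                    continue;
                }

                object value = reader.GetValue(ordinal);
                if (value != DBNull.Value)
                    property.SetValue(item, value, null);
            }
            results.Add(item);
        }
        return results;
    }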
I'm really late to the .NET game and struggling to learn ADO.NET. I prefer to learn how to do data access the "right way". Somewhere I've picked up on the idea that it's considered superior to manually code your own connections, data adapters, DataSets, DataTables, and even command statements for updating, adding, and deleting, rather than using Visual Studio's data wizards. I understand from my reading that there are some things you can only do by writing your own command statements, but it isn't completely clear to me what those might be.
Should I always code my own connections, data adapters, datasets and datatables? What about my update, insert, and delete command statements? How do I know when I should code those manually?
There is no right or wrong way. However I would suggest you first do things the "hard way" in that you write your own code for each of the data access routines you need. Of course that would mean you'll also need to know and understand SQL. Eventually you could use/build tools that generate all of your code just the way you need it.
Preferably you'll use stored procedures instead of SQL statements in code, because stored procedures provide an additional level of abstraction, abstracting your database schema from even your data layer and of course your business layer.
I'd use ADO.NET core (that is, writing your own code for data access and such). I'd use DataSets/DataTables (if you have to) purely as in-memory data structures, without using them to do automatic updates/deletes and the like. Stick to DataReaders to the extent possible, converting them over to DTOs (for data retrieval methods). For data modification methods, your data layer should take DTOs as parameters (or simple data types as parameters if there are just one or two).
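A rough sketch of that style; the stored procedure name, DTO and connection string below are placeholders, not part of any particular library:

    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;

    public class PersonDto
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }

    public static List<PersonDto> GetPeopleByFirstName(string connectionString, string firstName)
    {
        var people = new List<PersonDto>();

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("usp_GetPeopleByFirstName", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@FirstName", firstName);

            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Map the reader straight into a plain DTO; no DataTable in the middle.
                    people.Add(new PersonDto
                    {
                        FirstName = reader.GetString(reader.GetOrdinal("first_name")),
                        LastName = reader.GetString(reader.GetOrdinal("last_name"))
                    });
                }
            }
        }
        return people;
    }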
Personally I use tools to generate the data access layer code that uses ADO.NET core (and not EF or LINQ2SQL and such). That is my personal preference, and depending on the size of your application it goes a very long way towards performance, and it requires in-depth knowledge of only two things: your database and SQL, and your C# code, without also having to learn about the nuances of abstraction layers and specialized languages (in some cases).
In large projects (and teams), leaving the database schema and stored procedures to people specialized in that area becomes a necessity, and in those cases using ADO.NET core also becomes a requirement.
On my blog I have posted an article wherein I introduce a tool that generates all of the code. The tool and source code are available for download. The tool also generates code for strongly typed DataReaders; that is, under the covers you're using a DataReader, while in code it looks and feels like a DTO in terms of strongly typed properties.
Data Access Layer CodeGen
DataReader Wrappers - TypeSafe
In my own experience, it is preferable to always hand-code the data access instead of using the smart control wizards.
I think you should learn how it's done under the covers first and then pick your own abstraction layer, of which there are many.
LINQ to SQL does a great job of automating common DB tasks. All your basic CRUD (Create, Read, Update, Delete) operations will be much easier to code by using a DataContext dbml file. The code is much easier to write, does not rely on strings, is compatible with other ADO.NET commands (you can execute a direct DbCommand against your DataContext), and it is more highly optimized than anything most people will write (especially a beginner!). You will save yourself a whole lot of time by using something like LINQ to SQL or another ORM. Unless your objective is pure learning, you would be best off creating a working DataContext and analyzing the source to see how it is working, instead of teaching yourself ADO.NET. The fact that you are at a point where you need to ask this question probably indicates that you will not add value to your application by writing your own boilerplate DB access code.
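To give a feel for what that buys you, basic CRUD against a DataContext reads roughly like this; MyDataContext, its Persons table and the Person class stand in for whatever your .dbml generates:

    using System.Linq;

    public static void BasicCrud(string connectionString)
    {
        using (var db = new MyDataContext(connectionString))
        {
            // Read: the query is translated to parameterised SQL, no string building in your code.
            var bobs = db.Persons.Where(p => p.FirstName == "Bob").ToList();

            // Create
            db.Persons.InsertOnSubmit(new Person { FirstName = "Alice", LastName = "Smith" });

            // Update: just modify a tracked object.
            var first = bobs.FirstOrDefault();
            if (first != null)
                first.Phone = "555-0100";

            // Delete would be db.Persons.DeleteOnSubmit(rowToRemove);

            db.SubmitChanges();   // change tracking pushes all pending changes in one go
        }
    }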
It looks like a lot of people are recommending that you hand-code your DAL first, before you use an ORM like LINQ to SQL. I would just like to point out that the logic involved in this line of thinking would necessitate that we also learn to code in IL before writing C# code, build a computer before we use one, and sail across the ocean before we take an international airplane.
There's not really going to be a black-and-white answer for this, but in my experience, I've always been better off coding my own stuff. This has largely been because I'm just an anal-retentive obsessive-compulsive control freak, and I just don't trust wizards to write code the way I want it written. I'm sure that many people agree with me, just as I'm sure that many people disagree with me.
The fact that OR/Ms exist is plenty of proof to prove that you don't always need to roll your own code. The fact that it's not mandatory is also proof that you aren't compelled to use it.
Do whatever feels right and meets the needs of your solution and its time and budgetary constraints.
In our application we're considering using dynamically generated classes to hold a lot of our data. The reason for doing this is that we have customers with tables that have different structures. So you could have a customer table called "DOG" (just making this up) that contains the columns "DOGID", "DOGNAME", "DOGTYPE", etc. Customer #2 could have the same table "DOG" with the columns "DOGID", "DOG_FIRST_NAME", "DOG_LAST_NAME", "DOG_BREED", and so on. We can't create classes for these at compile time as the customer can change the table schema at any time.
At the moment I have code that can generate a "DOG" class at run-time using reflection. What I'm trying to figure out is how to populate this class from a DataTable (or some other .NET mechanism) without extreme performance penalties. We have one table that contains ~20 columns and ~50k rows. Doing a foreach over all of the rows and columns to create the collection takes about a minute, which is a little too long.
Am I trying to come up with a solution that's too complex, or am I on the right track? Has anyone else experienced a problem like this? Creating dynamic classes was the solution that a developer at Microsoft proposed. If we can just populate this collection and use it efficiently, I think it could work.
What you're trying to do will get fairly complicated in .NET 3.5. You're probably better off creating a type with a Dictionary<> of key/value pairs for the data in your model.
If you can use .NET 4.0, you could look at using dynamic and ExpandoObject. The dynamic keyword lets you create references to truly dynamic objects, and ExpandoObject allows you to easily add properties and methods on the fly, all without complicated reflection or codegen logic. The benefit of dynamic types is that the DLR performs some sophisticated caching of the runtime binding information, which allows the types to be more performant than regular reflection allows.
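A minimal sketch of that idea, loading rows from a data reader into ExpandoObjects so the column names become properties at run time (no reflection or codegen involved):

    using System.Collections.Generic;
    using System.Data;
    using System.Dynamic;

    public static List<dynamic> LoadDynamic(IDataReader reader)
    {
        var rows = new List<dynamic>();
        while (reader.Read())
        {
            // ExpandoObject implements IDictionary<string, object>, so columns can be added
            // by name here and then read back later as properties: row.DOGNAME, row.DOG_BREED, ...
            IDictionary<string, object> row = new ExpandoObject();
            for (int i = 0; i < reader.FieldCount; i++)
                row[reader.GetName(i)] = reader.IsDBNull(i) ? null : reader.GetValue(i);
            rows.Add(row);
        }
        return rows;
    }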
If all you need to do is map the data to existing types, then you should consider using something like Entity Framework or NHibernate to provide ORM behavior for your types.
To deal with the data loading performance, I would suggest you consider bypassing the DataTable altogether and loading your records directly into the types you are generating. I suspect that most of the performance issues are in reading and casting the values from the DataTable where they originally reside.
You should use a profiler to figure out what exactly takes the time, and then optimize that. If you are using reflection when setting the data, it will be slow. Much can be saved by caching the reflected type data.
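One simple form of that caching is to look the properties up once per type and reuse the array for every row (a sketch; note it is not thread-safe as written):

    using System;
    using System.Collections.Generic;
    using System.Reflection;

    static class PropertyCache
    {
        private static readonly Dictionary<Type, PropertyInfo[]> _cache =
            new Dictionary<Type, PropertyInfo[]>();

        public static PropertyInfo[] For(Type type)
        {
            PropertyInfo[] properties;
            if (!_cache.TryGetValue(type, out properties))
            {
                // GetProperties is the expensive reflection call; do it once per type, not per row.
                properties = type.GetProperties();
                _cache[type] = properties;
            }
            return properties;
        }
    }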
I am curious: how will you use these classes if the members are not known at compile time? Perhaps you would be better off with just a plain DataTable for this scenario?
Check out PicoPoco and Massive. These are micro ORMs (lightweight open-source ORM frameworks) that you use by simply copying a single code file into your project.
Massive uses the ExpandoObject and does conversions from IDataRecord to the ExpandoObject, which I think is exactly what you want.
PicoPoco takes an existing plain class and dynamically and efficiently generates a method (which it then caches, by the way) to load it from a database.
I have been developing many applications and have been confused about using DataSets.
To date I don't use DataSets; my applications work directly against the database, using queries and procedures that run on the database engine.
But I would like to know which is the better practice:
Using a DataSet?
or
Working directly on the database?
Please also give me specific cases for when to use a DataSet for operations (insert/update).
Can we set read/write locks on a DataSet with respect to our database?
You should either embrace stored procedures, or make your database dumb. That means that you have no logic whatsoever in your db, only CRUD operations. If you go with the dumb database model, Datasets are bad. You are better off working with real objects so you can add business logic to them. This approach is more complicated than just operating directly on your database with stored procs, but you can manage complexity better as your system grows. If you have large system with lots of little rules, stored procedures become very difficult to manage.
In ye olde times before MVC was a mere twinkle in Haack's eye, it was jolly handy to have DataSet handle sorting, multiple relations and caching and whatnot.
Us real developers didn't care about such trivia as locks on the database. No, we had conflict resolution strategies that generally just stamped all over the most recent edits. User friendliness? < Pshaw >.
But in these days of decent generic collections, a plethora of ORMs and an awareness of separation of concerns they really don't have much place any more. It would be fair to say that whenever I've seen a DataSet recently I've replaced it. And not missed it.
As a rule of thumb, I would put logic that refers to data consistency, integrity etc. as close to that data as possible - i.e. in the database. Also, if I am having to fetch my data in a way that is interdependent (i.e. fetch from tables A, B and C where the relationship between A, B and C's contribution is known at request time), then it makes sense to save on callout overhead and do it one go, via a database object such as a function, procedure (as already pointed out by OMGPonies). For logic that is a level or two removed, it makes sense to have it where dealing with it "procedurally" is a bit more intuitive, such as in a dataset. Having said all that, rules of thumb are sometimes what their acronym infers...ROT!
In past .NET projects I've often done data imports/transformations (e.g. for bank transaction data files) in the database (one callout, all logic is encapsulated in a procedure and is transaction-protected), but have "parsed" items from that same data in a second stage in my .NET code, using DataTables and the like (although these days I would most likely skip the DataSet stage and work on them from a higher level of abstraction, using class objects).
I have seen DataSets used well in only one application, and that is over 7 years of development on quite a few different applications (at least double figures).
There are so many best practices around these days that point towards developing with objects rather than DataSets for enterprise development. Objects along with an ORM like NHibernate or Entity Framework can be very powerful and take a lot of the grunt work out of creating CRUD stored procedures. This is the way I favour developing applications, as I can separate business logic nicely this way in a domain layer.
That is not to say that DataSets don't have their place; I am sure in certain circumstances they may be a better fit than objects, but for me I would need to be very sure of this before going with them.
I have also been wondering about this, since I haven't needed DataSets in my source code for months.
Actually, if your objects are O/R-mapped, and use serialization and generics, you would never need DataSets.
But DataSets have a great use in generating reports.
This is because reports have no specific structure that can or should be O/R-mapped.
I only use DataSets in tandem with reporting tools.