What is the better practice: working with a DataSet or directly with the database? - C#

I have been developing many applications and have been confused about whether to use DataSets.
To date I don't use DataSets; my applications work directly against the database, using queries and procedures that run on the database engine.
But I would like to know: what is the good practice?
Using a DataSet?
or
Working directly on the database?
Please also give me specific cases of when to use a DataSet, along with the operation (Insert/Update).
Can we set a read/write lock on a DataSet with respect to our database?

You should either embrace stored procedures, or make your database dumb. That means you have no logic whatsoever in your DB, only CRUD operations. If you go with the dumb-database model, DataSets are bad. You are better off working with real objects so you can add business logic to them. This approach is more complicated than just operating directly on your database with stored procs, but you can manage complexity better as your system grows. If you have a large system with lots of little rules, stored procedures become very difficult to manage.
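To make the "real objects with business logic" point concrete, here is a minimal sketch (the `Order` class and its discount rule are invented for illustration): the rule lives on the object itself, rather than in a stored procedure or in code that pokes at untyped rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var order = new Order();
order.AddLine("Widget", quantity: 3, unitPrice: 40m);
order.AddLine("Gadget", quantity: 1, unitPrice: 100m);

// The discount rule is encapsulated in the object, in one place,
// so it is easy to find, test and change as the system grows.
Console.WriteLine(order.Total()); // order.Total() == 198m (220 with the 10% discount applied)

class Order
{
    private readonly List<(string Name, int Quantity, decimal UnitPrice)> _lines = new();

    public void AddLine(string name, int quantity, decimal unitPrice) =>
        _lines.Add((name, quantity, unitPrice));

    // Hypothetical business rule: orders over 200 get 10% off.
    public decimal Total()
    {
        var subtotal = _lines.Sum(l => l.Quantity * l.UnitPrice);
        return subtotal > 200m ? subtotal * 0.9m : subtotal;
    }
}
```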

In ye olde times before MVC was a mere twinkle in Haack's eye, it was jolly handy to have DataSet handle sorting, multiple relations and caching and whatnot.
Us real developers didn't care about such trivia as locks on the database. No, we had conflict resolution strategies that generally just stamped all over the most recent edits. User friendliness? < Pshaw >.
But in these days of decent generic collections, a plethora of ORMs and an awareness of separation of concerns they really don't have much place any more. It would be fair to say that whenever I've seen a DataSet recently I've replaced it. And not missed it.

As a rule of thumb, I would put logic that refers to data consistency, integrity etc. as close to that data as possible - i.e. in the database. Also, if I am having to fetch my data in a way that is interdependent (i.e. fetch from tables A, B and C where the relationship between A, B and C's contributions is known at request time), then it makes sense to save on callout overhead and do it in one go, via a database object such as a function or procedure (as already pointed out by OMGPonies). For logic that is a level or two removed, it makes sense to have it where dealing with it "procedurally" is a bit more intuitive, such as in a dataset. Having said all that, rules of thumb are sometimes what their acronym implies... ROT!
In past .NET projects I've often done data imports/transformations (e.g. for bank transaction data files) in the database (one callout, all logic is encapsulated in one procedure and is transaction-protected), but have "parsed" items from that same data in a second stage, in my .NET code, using DataTables and the like (although these days I would most likely skip the dataset stage and work on them from a higher level of abstraction, using class objects).
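As an illustration of that second, "procedural" stage, here is a minimal in-memory sketch using DataSet and DataRelation (the table and column names are made up; in the real case the tables would be filled in one callout, e.g. by a SqlDataAdapter over a stored procedure):

```csharp
using System;
using System.Data;

var ds = new DataSet();

var accounts = ds.Tables.Add("Account");
accounts.Columns.Add("Id", typeof(int));
accounts.Columns.Add("Name", typeof(string));

var txns = ds.Tables.Add("Transaction");
txns.Columns.Add("AccountId", typeof(int));
txns.Columns.Add("Amount", typeof(decimal));

// One relation lets us walk parent -> children without further queries.
var rel = ds.Relations.Add("AccountTxns",
    accounts.Columns["Id"], txns.Columns["AccountId"]);

accounts.Rows.Add(1, "Current");
txns.Rows.Add(1, 100m);
txns.Rows.Add(1, -40m);

// Second stage: work on the fetched rows procedurally, in memory.
foreach (DataRow account in accounts.Rows)
{
    decimal balance = 0m;
    foreach (DataRow t in account.GetChildRows(rel))
        balance += (decimal)t["Amount"];
    Console.WriteLine($"{account["Name"]}: {balance}"); // Current: 60
}
```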

I have seen datasets used well in only one application, and that is over 7 years of development on quite a few different applications (at least double figures).
There are so many best practices around these days that point towards developing with objects rather than datasets for enterprise development. Objects along with an ORM like NHibernate or Entity Framework can be very powerful and take a lot of the grunt work out of creating CRUD stored procedures. This is the way I favour developing applications, as I can separate business logic nicely this way in a domain layer.
That is not to say that datasets don't have their place, I am sure in certain circumstances they may be a better fit than Objects but for me I would need to be very sure of this before going with them.

I have also been wondering this when I never needed DataSets in my source code for months.
Actually, if your objects are O/R-mapped, and use serialization and generics, you would never need DataSets.
But DataSet has a great use in generating reports.
This is because reports have no specific structure that can be, or should be, O/R-mapped.
I only use DataSets in tandem with reporting tools.


What should I use for performance sensitive data access?

So I have an application which requires very fast access to large volumes of data, and we're at the stage where we're undergoing a large re-design of the database, which gives a good opportunity to re-write the data access layer if necessary!
Currently in our data access layer we use manually created entities along with plain SQL to fill them. This is pretty fast, but this technology is really getting old, and I'm concerned we're missing out on a newer framework or data access method which could be better in terms of neatness and maintainability.
We've seen the Entity Framework, but after some research it just seems that the benefit of the ORM it gives is not enough to justify the lower performance and as some of our queries are getting complex I'm sure performance with the EF would become more of an issue.
So it is a case of sticking with our current methods of data access, or is there something a bit neater than manually creating and maintaining entities?
I guess the thing that's bugging me is just opening our data layer solution and seeing lots of entities, all of which need to be maintained exactly in line with the database, which sometimes can be a lot of work, but then maybe this is the price we pay for performance?
Any ideas, comments and suggestions are very appreciated! :)
Thanks,
Andy.
** Update **
Forgot to mention that we really need to be able to handle using Azure (a client requirement), which currently stops us from using stored procedures.
** Update 2 **
Actually, we have an interface layer for our DAL, which means we can create an Azure implementation that just overrides the data access methods from the local implementation that aren't suitable for Azure. So I guess we could use stored procedures for the performance-sensitive local databases, with EF for the cloud.
I would use an ORM layer (Entity Framework, NHibernate etc) for management of individual entities. For example, I would use the ORM / entities layers to allow users to make edits to entities. This is because thinking of your data as entities is conceptually simpler and the ORMs make it pretty easy to code this stuff without ever having to program any SQL.
For the bulk reporting side of things, I would definitely not use an ORM layer. I would probably create a separate class library specifically for standard reports, which creates SQL statements itself or calls sprocs. ORMs are not really for bulk reporting and you'll never get the same flexibility of querying through the ORM as through hand-coded SQL.
Stored procedures for performance. ORMs for ease of development
Do you feel up to troubleshooting some opaque generated SQL when it runs badly...? That generates several round trips where one would do? Or insists on using wrong datatypes?
You could try using mybatis (previously known as iBATIS). It allows you to map SQL statements to domain objects. This way you keep full control over the SQL being executed and get a cleanly defined domain model at the same time.
Don't rule out plain old ADO.NET. It may not be as hip as EF4, but it just works.
With ADO.NET you know what your SQL queries are going to look like because you get 100% control over them. ADO.NET forces developers to think about SQL instead of falling back on the ORM to do the magic.
If performance is high on your list, I'd be reluctant to take a dependency on any ORM especially EF which is new on the scene and highly complex. ORM's speed up development (a little) but are going to make your SQL query performance hard to predict, and in most cases slower than hand rolled SQL/Stored Procs.
You can also unit test SQL/Stored Procs independently of the application and therefore isolate performance issues as either DB/query related or application related.
I guess you are using ADO.NET in your DAL already, so I'd suggest investing the time and effort in refactoring it rather than throwing it out.

What are the issues ORMs are trying to solve that a database does not?

I work in a team with 15 developers in a large enterprise. We deal with many tables that have millions of records on the OLTP database. The data warehouse database is much larger.
We're embarking on a new system that needs to be developed that will be going against a very similar sized database. Each one of us is highly proficient and very comfortable with SQL, stored procedure etc. examining execution plans, defining the correct indexes etc. We're all also very comfortable in .NET C# and ASP.NET.
However, we've each been looking into ORMs independently, and none of us is able to understand the real problem they solve. On the contrary, what we do see are the performance issues people have and all the tweaks that need to be done to compensate for what they lack.
Another aspect seems to be that people use ORMs so as not to have to get their hands dirty dealing with the database and SQL, but in fact it seems you can't escape that for long, especially when it comes to performance.
So my question is: what are the problems ORMs solve (or attempt to solve)?
I should note that we have about 900 tables and over 2,000 stored procedures in our OLTP database, that our data layer is auto-generated from our stored procedures, and that we currently use core ADO.NET.
ORMs attempt to solve the object-relational impedance mismatch.
That is, since relational databases are ... relational in nature, the data modeling used for them is very different from the type of modeling you would use in OOP.
This difference is known as the "object-relational impedance mismatch". ORMs try to let you use OOP without consideration of how the database is modeled.
ORMs are for object-oriented people who think that everything works better when they're in objects. They have their best opportunity on projects where OO skills are stronger and more plentiful than SQL and relational databases.
If you're writing in an object-oriented language you'll have to deal with objects at one time or another. Whether you use ORM or not, you'll have to get that data out of the tables and into objects on the middle tier so you can work with them. ORM can help you with the tedium of mapping. But it's not 100% necessary. It's a choice like any other.
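The "tedium of mapping" mentioned above is exactly this sort of boilerplate. A hand-rolled sketch (the types and column names are invented for illustration) of turning a flat, joined result set into objects on the middle tier, which is the work an ORM automates:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Pretend these tuples came back from a JOIN of Customer and Order rows.
var rows = new[]
{
    (CustomerId: 1, CustomerName: "Acme",    OrderId: 10, Total: 250m),
    (CustomerId: 1, CustomerName: "Acme",    OrderId: 11, Total: 75m),
    (CustomerId: 2, CustomerName: "Initech", OrderId: 12, Total: 19m),
};

// Manual object-relational mapping: flat result set -> object graph.
List<Customer> customers = rows
    .GroupBy(r => (r.CustomerId, r.CustomerName))
    .Select(g => new Customer
    {
        Id = g.Key.CustomerId,
        Name = g.Key.CustomerName,
        Orders = g.Select(r => new Order { Id = r.OrderId, Total = r.Total }).ToList(),
    })
    .ToList();

Console.WriteLine(customers.Count); // 2 customers, the first with 2 orders

class Customer
{
    public int Id;
    public string Name;
    public List<Order> Orders = new();
}

class Order
{
    public int Id;
    public decimal Total;
}
```

Writing (and maintaining) this by hand for hundreds of tables is the tedium; whether that cost justifies an ORM is the choice.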
Don't make the mistake of writing a client/server app with objects. If your objects are little more than carriers of data from the database to the client, I'd say you're doing something wrong. There's value in encapsulating behavior in objects where it makes sense.
I think, as with all these things, the answer to the question of whether you should use an ORM is 'it depends'. In a lot of cases, people may be writing relatively simple applications with relatively small databases in the background.
In these cases, an ORM makes sense, as it allows for easy maintenance (adding a column is a one place change to have it ripple through you application) and quick turnaround.
However, if you are dealing with very large databases and complex data manipulation, then an ORM is possibly not for you. That said, tables with millions of rows should still not be a problem for an ORM, it all depends on how you return and use the data - a well structured database should allow for reasonable performance.
In your case, you can't see the benefit because it's perhaps not suitable for your application; it is for some others.
BTW, what you describe in your question (stored procedures used to generate classes used in your business layer) is essentially what a good ORM does: it gets you away from writing the boiler-plate data access code so you can work on the business logic.
Very good question. It depends on what you are building.
If you have complex object structures (objects with many relationships and encapsulated objects) and you manipulate these objects inside memory before committing the transaction, it is much easier to use ORM like Hibernate, because you don't need to worry about SQL, Caching, Just-In-Time object loading, etc. As a bonus feature you will get pretty good database independence.
If you have very simple objects in your application, without much functionality/methods, you can of course use plain SQL/DB connection. However I would recommend you to use ORM anyway, because you will be database independent, more portable and you will be ready to grow, when your system will need just-in-time object loading, long transactions (with caching), etc.
I have worked with many persistence frameworks in my life and I would recommend Hibernate.
In short, it's really trying to make the DB appear as objects to the programmer, so the programmer can be more productive. The only benefit (besides assisting dev laziness) is that the columns mapped to classes are strongly typed: no more trying to fetch a string into an integer variable!
From all the years of ORM libraries that have come and gone, they all seem (IMHO) to be just another programmer toy. Ultimately, it's another abstraction that might help you when you start out, or when writing small apps. My opinion here is that once you learn a good way of accessing the DB, you might as well continue to use that way rather than learn many ways of doing the same task; I always feel more productive being a specialist, but that could just be me.
The tools required to create and maintain the mapping, generate the classes, etc. are another nuisance. In this regard a built-in framework (e.g. Ruby on Rails' ActiveRecord approach) is much better.
Performance can be an issue, as can the SQL that is generated at the back end: you will nearly always be fetching much more data than you need when using an ORM, compared to the small SQL statements you might otherwise write.
The strong typing is good though, and I would praise ORMs for that.

Database Design In SQL Server or C#?

Should a database be designed on SQL Server or C#?
I always thought it was more appropriate to design it on SQL Server, but recently I started reading a book (Pro ASP.NET MVC Framework) which, to my understanding, basically says that it's probably a better idea to write it in C# since you will be accessing the model through C#, which does make sense.
I was wondering what everyone else's opinion on this matter was...
I mean, for example, do you consider it "correct" to have a table that specifies constants (like an AccessLevel table that is always supposed to contain:
1 Everyone
2 Developers
3 Administrators
4 Supervisors
5 Restricted
Wouldn't it be more robust and streamlined to just have an enum for that same purpose?
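For reference, the enum version of that same lookup would be just this (mirroring the values listed above); a common compromise is to keep both, with the table enforcing referential integrity in the database and the enum giving compile-time checking in code:

```csharp
using System;

// The values match the AccessLevel table rows from the question,
// so the same constants are now checked at compile time.
Console.WriteLine((int)AccessLevel.Administrators); // 3

enum AccessLevel
{
    Everyone       = 1,
    Developers     = 2,
    Administrators = 3,
    Supervisors    = 4,
    Restricted     = 5,
}
```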
A database schema should be designed on paper or with an ERD tool.
It should be implemented in the database.
Are you thinking about ORMs like Entity Framework that let you use code to generate the database?
Personally, I would rather think through my design on paper before committing it to a DB myself. I would be happy to use an ORM or class generator from this DB later on.
Before VS.NET 2010 I was using SQL Server Management Studio to design my databases, now I am using EF 4.0 designer, for me it's the best way to go.
If your problem domain is complex, or its complexity grows as the system evolves, you'll soon discover you need some metadata to make life easier. C# can be a good choice as a host language for such stuff, as you can use its type system to enforce some invariants (like char-column lengths, null/not-null restrictions or check constraints; you can declare them as consts, enums, etc.). Unfortunately I don't know of utilities that can do it out of the box (sqlmetal.exe can export some metadata, but only as XML), although some CASE tools can probably be customized. I'd go for a custom-made generator to produce the DB schema from C# (just a few hours' work compared to learning, for example, the customization options offered by Sybase PowerDesigner).
ORMs have their place; that place is NOT database design. There are many considerations in designing a database that need to be thought through, not automatically generated, no matter how appealing the idea of not thinking about design might be. There are often many things that need to be considered that have nothing to do with the application: things like data integrity, reporting, audit tables and data imports. Using an ORM to create a database that looks like an object model may not be the best design for performance, and may not have the things you really need in terms of data integrity. Remember, even if you think nothing except the application will ever touch the database, this is not true. At some point the database will need someone to do a major data revision (to fix a problem) directly on the database, not through the application. At some point you are going to need to import a million records from some other company you just bought, and you are going to need an ETL process outside the application. Putting all your hopes and dreams for the database (as well as your data integrity rules) into the application is short-sighted.

pluggable data store architectures

I have a pluggable system management tool. The architecture of this kind of thing is well understood (interfaces, publish/ subscribe, ....). How about the data store though. What do people do?
I need plugins to be able to add new entities, extend existing entities, establish new relationships, etc.
My thoughts (SQL), not necessarily well thought out
each plugin simply extends the schema when it is installed. In the old days changing the schema was a big no-no; now databases are much more relaxed about this
plugins have their own tables. If two of them have an entity (say) Person, then there are two tables, p1_person and p2_person
plugins have their own database
invent some sort of flexible scheme where the tables are softly typed. Maybe many attributes packed into a single attribute. The ultimate is one big table called data, keyed by table name and column name, holding a single data value.
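That last option is the classic entity-attribute-value (EAV) design. A toy in-memory sketch of what reading it feels like (the names are invented) shows the cost: every value degrades to a string, and every access needs a lookup plus a conversion that can only fail at run time:

```csharp
using System;
using System.Collections.Generic;

// One big "data" table: (table, key, column) -> value, everything a string.
var data = new Dictionary<(string Table, int Key, string Column), string>
{
    [("person", 1, "name")] = "Ada",
    [("person", 1, "age")]  = "36",
};

// Every read is a lookup plus a parse; a typo in "age" or a bad
// conversion surfaces at run time, never at compile time.
string name = data[("person", 1, "name")];
int age = int.Parse(data[("person", 1, "age")]);
Console.WriteLine($"{name}, {age}"); // Ada, 36
```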
Not SQL
object DB. I have no experience with these; anybody care to pass on their experience? db4o, for example. Can I change the 'schema' of objects as the app evolves?
NO-SQL
this is 'where it's at' at the moment. Most of these seem to be aimed slightly differently from my needs. Anybody want to pass on experience with these?
Apologies for the open ended question
My suggestion is to go read about the Entity Framework;
a lot of the situations you are describing can be solved (very elegantly) using table inheritance.
Your idea of one big table called data makes the hamsters in my computer cry ;)
The general trend is away from weakly typed schemas, because they cannot be debugged at compile time. What you get from something like the Entity Framework is a strongly typed, extensible schema that you can code against using LINQ.
Object databases:
like you, I haven't played with them massively; however, the time when I was considering them was a time when there was no good ORM for .NET, and writing ADO.NET code was slowly killing me.
as for NoSQL, these are databases that meet a performance need. SQL performs badly in situations where there are lots of small writes occurring. I say badly tongue in cheek: it performs very well, but when you scale to millions of concurrent users everything changes. My understanding of NoSQL is that it is a non-relational format designed for lots of small, fast writes and reads. The scale of sites that use these is usually very large.
OK - in response
I am currently lucky enough to be on a greenfield project, so I am using EF to generate my schema.
On non-greenfield projects I use SQL scripts to update my table structures. As for implementing table inheritance in SQL, it's very easy once you know the concept: it's essentially a one-to-many relationship with a constraint that there will only ever be 0 or 1 children.
I wouldn't write .net code that updates the database structure ... that sounds like a disaster waiting to happen to me.
I am beginning to think I have misunderstood what you are looking for. I find databases second nature, as I have spent so long with them.
I haven't found a replacement for being meticulous about script management.

When to use Stored Procedures instead of using any ORM with programming logic?

Hi all. I wanted to know when I should prefer writing stored procedures over writing programming logic and pulling data using an ORM or something else.
Stored procedures are executed on server side.
This means that processing large amounts of data does not require passing these data over the network connection.
Also, with stored procedures, you can build consistent complicated business logic.
Say, you need to update the account balance each time you insert a transaction, and you need to insert many transactions at once.
Instead of doing this with triggers (which in many systems are implemented with an inefficient record-by-record approach), you can pass a table variable or temporary table with the inputs and issue a set-based SQL statement inside the procedure. This will be much more efficient.
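An in-memory analogue of that set-based balance update (inside the procedure the real thing would be a single UPDATE joined to the table variable; the names here are illustrative) shows the shape of the saving: one aggregate and one write per account, rather than a trigger firing per inserted row.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var balances = new Dictionary<int, decimal> { [1] = 500m, [2] = 0m };

// The batch of incoming transactions: (AccountId, Amount).
var txns = new[]
{
    (Acct: 1, Amount: -120m),
    (Acct: 1, Amount: 30m),
    (Acct: 2, Amount: 75m),
};

// Set-based: aggregate once per account, then apply one update each,
// instead of running per-row logic for every inserted transaction.
foreach (var g in txns.GroupBy(t => t.Acct))
    balances[g.Key] += g.Sum(t => t.Amount);

Console.WriteLine($"{balances[1]} {balances[2]}"); // 410 75
```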
I prefer SPs over programming logic mainly for two reasons:
Performance: anything that will reduce the result set, or that can be done more effectively on the server, e.g.:
paging
filtering
ordering (on indexed columns)
Security: if someone has gained the application's access to the database and wants to wipe out all your records, having to execute Row_Delete for each single one of them instead of DELETE FROM Rows already sounds good.
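The paging item above is a good illustration: done on the server (e.g. with OFFSET ... FETCH in T-SQL inside a procedure), only one page of rows crosses the wire. The arithmetic is the same as the in-memory equivalent below (table and variable names are illustrative):

```csharp
using System;
using System.Linq;

int page = 2, pageSize = 10; // 1-based page number

// Server-side this would be something like:
//   ORDER BY Id
//   OFFSET (@page - 1) * @pageSize ROWS FETCH NEXT @pageSize ROWS ONLY
// so only pageSize rows ever leave the server.
var ids = Enumerable.Range(1, 55);
var pageRows = ids.Skip((page - 1) * pageSize).Take(pageSize).ToArray();

Console.WriteLine($"{pageRows.First()}..{pageRows.Last()}"); // 11..20
```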
Never unless you identify a performance issue. (largely opinion)
(a Jeff blog post!)
http://www.codinghorror.com/blog/2004/10/who-needs-stored-procedures-anyways.html
If you see stored procs as optimizations:
http://en.wikipedia.org/wiki/Program_optimization#When_to_optimize
When appropriate.
complex data validation/checking logic
avoid several round trips to do one action in the DB
several clients
anything that should be set based
You can't say "never" or "always".
There is also the case where the database engine will outlive your client code. I bet there's more DAL or ORM upgrading/refactoring going on than DB engine upgrading/refactoring.
Finally, why can't I encapsulate code in a stored proc? Isn't that a good thing?
As ever, much of your decision as to which to use will depend on your application and its environment.
There are a couple of schools of thought here, and this debate always arouses strong sentiments on both sides.
The advantages of stored procedures (as well as the large data moves that Quassnoi has mentioned) are that the logic is tied down in the database, and therefore potentially more secure. It is also only ever in one place.
However, there will be others who believe that the place for application logic is in the application, especially if you are planning to access other types of databases (for which you will often have to write different SPs).
Another consideration may be the skills of the resources you have to implement your application.
The point at which stored procedures become preferable to an ORM is that point at which you have multiple applications talking to the same database. At this point, you want your query logic embedded in one place, rather than once per application. And even here, you might want to prefer a service layer (which can scale horizontally) instead of the database (which only scales vertically).
