Keeping an application database agnostic (ADO.NET vs encapsulating DB logic)

We are making a fairly serious application that needs to remain agnostic to the DB a client wants to use. Initially we plan on supporting MySQL, Oracle & SQL Server. The tables & views are simple as are the queries (no real fancy SQL), therefore the question:
1. Use native DB drivers (MySQLDbConnection etc.) and encapsulate the logic of executing queries and processing results, or
2. Use a generic OleDbConnection.
Obviously option 2 involves no overhead, but I'm presuming the performance is not as great as with native access?

Note: This answer is relevant if you decide to use basic ADO.NET 2 functionality instead of an ORM (such as Entity Framework or NHibernate) or LINQ to SQL.
Let's assume you've got a connection string defined in your app.config:
<connectionStrings>
  <add name="SomeConnection"
       providerName="System.Data.SqlClient"
       connectionString="..." />
</connectionStrings>
Notice the presence of the providerName attribute and its value. You could also put in a value for another DB provider, e.g. System.Data.SQLite.
(Note that non-standard providers, i.e. those that are not in the .NET Framework by default, need to be registered first, either in app.config or in the client machine's machine.config.)
Now, you can work with the specified database in a completely provider-agnostic fashion as follows:
using System.Configuration;  // for ConfigurationManager
using System.Data;           // for all interface types
using System.Data.Common;    // for DbProviderFactories
var cs = ConfigurationManager.ConnectionStrings["SomeConnection"];
//                                              ^^^^^^^^^^^^^^^^
var factory = DbProviderFactories.GetFactory(cs.ProviderName);
//                                           ^^^^^^^^^^^^^^^
using (IDbConnection connection = factory.CreateConnection())
{
    connection.ConnectionString = cs.ConnectionString;
    //                            ^^^^^^^^^^^^^^^^^^^
    connection.Open();
    try
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            ... // do something with the database
        }
    }
    finally
    {
        connection.Close();
    }
}
Note how this code only works with interface types. The only place where you indicate a particular DB provider is through the providerName attribute value in the app.config file. (I've marked all the places where a setting from app.config is taken with ^^^s.)
Further reading:
Generic Coding with the ADO.NET 2.0 Base Classes and Factories: similar to my answer, but goes into more detail.
ADO.NET Managed Providers and DataSet Developer Center: includes, among other things, an index of available ADO.NET database providers.

IMHO using an ORM is a good design decision if you want a database-agnostic application. Switching databases might be as easy as changing a config setting and connection string.

You don't need OleDbConnection to access nonspecific ADO.NET providers. Just use DbConnection et al. See DbProviderFactories on MSDN for more info.

By including Oracle in that list, you've guaranteed that nothing will be simple.
Oracle uses a different prefix character (colon) for parameters, as compared to SQL Server that uses an "at" symbol.
Oracle uses a single data type (number) for long, int, short, boolean, float, and decimal; your code will have to be sure that you map these properly.
You must parameterize Oracle date and time values; if you try to use strings for dates in your SQL statements, you will go insane because of Oracle's date format. (Oracle uses a three-character month abbreviation; the format is 01-JAN-2010.)
Basic SQL functions for handling nulls can be different, particularly for null coalescing ("NVL" versus "COALESCE").
Oracle is much pickier about reserved words.
Oracle does not have native identity column support. Workarounds involve sequences, triggers, and requiring transactions just to retrieve an identity value from a new row.
In other words, your app can't be DB-agnostic. If you don't use an ORM, you will definitely want to build a data access layer that hides all these things from the rest of the application.
Voice of experience here. Just sayin'. For a common schema across SQL Server and Oracle, we've had to build most of the infrastructure of an ORM, while avoiding the aspects that can degrade performance. Interesting, but non-trivial, definitely!
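To make a couple of the differences above concrete, here is a minimal sketch of how a data access layer might hide the parameter prefix and the null-coalescing function behind one small interface. All names here (ISqlDialect, SqlServerDialect, OracleDialect, Queries, and the customers table with its columns) are hypothetical and only for illustration; they are not part of ADO.NET or any vendor library.

```csharp
using System;

// Hypothetical abstraction: each supported database gets one implementation.
public interface ISqlDialect
{
    // Character that marks a named parameter inside SQL text.
    string ParameterPrefix { get; }

    // Vendor-specific null-coalescing expression.
    string Coalesce(string expression, string fallback);
}

public sealed class SqlServerDialect : ISqlDialect
{
    public string ParameterPrefix { get { return "@"; } }

    public string Coalesce(string expression, string fallback)
    {
        return "COALESCE(" + expression + ", " + fallback + ")";
    }
}

public sealed class OracleDialect : ISqlDialect
{
    public string ParameterPrefix { get { return ":"; } }

    public string Coalesce(string expression, string fallback)
    {
        return "NVL(" + expression + ", " + fallback + ")";
    }
}

public static class Queries
{
    // Builds the same logical query for whichever dialect is passed in.
    // Table and column names are made up for the example.
    public static string FindCustomerByName(ISqlDialect dialect)
    {
        return "SELECT id, " + dialect.Coalesce("nickname", "name") +
               " FROM customers WHERE name = " + dialect.ParameterPrefix + "name";
    }
}
```

The rest of the application would only ever ask for Queries.FindCustomerByName(dialect) and never see the vendor-specific text. Values such as dates would still be passed as typed parameters rather than formatted into the SQL string.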

LINQ to SQL is a highly regarded .NET ORM, partly because you can use it alongside stored procedures. The problem is that it's SQL Server only, but people are working to provide similar functionality for Oracle & MySQL.
For database & query optimizations, I cringe at the idea of using an ORM. Data types, functions & overall syntax are not very portable in SQL. The most performant means of interacting with each database will be to tailor the model & queries to each one, but that takes expertise, time and money. If need be, focus on one database vendor, with the code set up to support swapping vendors, and add support for other databases as necessary.

There's no good reason to avoid the most generic interfaces with the broadest support - OleDb and even ODBC if you're comfortable with them. Anything beyond that reduces the pool of products/languages/platforms/tools/developers you can work with. Being closest to the SQL metal, the vendor isn't going to introduce much inefficiency - certainly less than the more esoteric options. They've been around a long, long time to wring out any problems.
If you're going to add an abstraction layer (your own or someone else's), then that should be decided based on the merits of the abstractions introduced in your particular context, not just to have an abstraction layer (which is just more support unless there's an intentional benefit.)
As you can see, everyone's mileage varies. :) But in general, I think simpler is better.

Why not use the Microsoft Patterns & Practices Enterprise Library Data Access Application Block? There's minimal overhead and switching providers is a snap.
Quote:
The Data Access Application Block takes advantage of these classes and provides a model that further supports encapsulation of database type-specific features, such as parameter discovery and type conversions. Because of this, applications can be ported from one database type to another without modifying the client code.

You can always make part of the application database agnostic by having the bulk of the application use the DAL as a bunch of interfaces. The DAL itself would then provide a concrete implementation for the target database.
This way, the rest of the application is decoupled from the database through the DAL interfaces, while the DAL implementation is still free to use performance improvements or vendor-specific constructs.
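A minimal sketch of that idea, assuming a made-up Customer type: the application code (GreetingService) depends only on an interface, and each target database would get its own implementation behind it. The in-memory implementation below is only a stand-in so the example is self-contained; every name here is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public sealed class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// The bulk of the application sees only this interface.
public interface ICustomerRepository
{
    Customer FindById(int id);
    void Add(Customer customer);
}

// Stand-in implementation. A real DAL would ship e.g. a SQL Server version
// and an Oracle version, each free to use vendor-specific SQL internally.
public sealed class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly Dictionary<int, Customer> _rows = new Dictionary<int, Customer>();

    public Customer FindById(int id)
    {
        Customer found;
        return _rows.TryGetValue(id, out found) ? found : null;
    }

    public void Add(Customer customer)
    {
        if (customer == null) throw new ArgumentNullException("customer");
        _rows[customer.Id] = customer;
    }
}

// Application code: no reference to any database vendor.
public sealed class GreetingService
{
    private readonly ICustomerRepository _customers;

    public GreetingService(ICustomerRepository customers)
    {
        _customers = customers;
    }

    public string Greet(int id)
    {
        Customer customer = _customers.FindById(id);
        return customer == null ? "Hello, stranger" : "Hello, " + customer.Name;
    }
}
```

Which concrete repository gets constructed could then be decided in one place at startup, for example from a configuration setting.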

Related

Dual-coding for SQL Server and SQLite

I want to code an application for both SQL Server and SQLite, but not have to duplicate all database access (that would clearly be unworkable!).
Is there any way of creating a class where the DB access is handled as required, but presented to the rest of the application as a set of more generic common objects (i.e. a DataSet, a DataTable, etc.) irrespective of which DB the data was retrieved from?

Yes, you have some pretty good starting points in the .NET framework. Almost all database actions are abstracted away in interfaces, like IDbConnection, IDbCommand, IDataReader, etc. It is pretty easy to do if you rely on the interfaces (I have done it for our company supporting a lot of database platforms).
There are two common problems you have to tackle:
Specifying and creating the connection (in our case, finding and loading some drivers too). The DbProviderFactory can be of help;
Different database implementations of SQL. Consensus: rely on the shared subset only, or create an intermediary layer that abstracts this away.
We have written our own, but there are also frameworks that do this all for you already, like Entity Framework.
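As a rough sketch of what relying on the interfaces looks like in practice, the helpers below accept any IDataReader and hand back generic objects (a DataTable or a list), so the caller never learns which provider produced the data. The class and method names are hypothetical; DataTable.Load and the IDataReader members used are standard ADO.NET.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderHelpers
{
    // Copies every remaining row of any provider's reader into a DataTable.
    public static DataTable ToDataTable(IDataReader reader)
    {
        var table = new DataTable();
        table.Load(reader);
        return table;
    }

    // Reads one string column from every remaining row; database NULLs become null.
    public static List<string> ReadColumn(IDataReader reader, string columnName)
    {
        var values = new List<string>();
        int ordinal = reader.GetOrdinal(columnName);
        while (reader.Read())
        {
            values.Add(reader.IsDBNull(ordinal) ? null : reader.GetString(ordinal));
        }
        return values;
    }
}
```

The reader itself would come from command.ExecuteReader() on whichever connection the provider factory created. For trying the helpers without a database, DataTable.CreateDataReader() yields a reader over in-memory rows.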

Entity Framework - Interact with Oracle and SQL Server

I am working on a .NET Web API service (with OData support) to support a mobile client. The service should support both Oracle and SQL Server databases, but only one database type will be used at a time, according to whichever database technology the client is using.
How do I create a database-agnostic data access layer? I don't want to write the code twice, once for SQL Server and once for Oracle.
Also, it seems that in order to support Oracle in EF, third-party Oracle drivers are required, either from Devart or Oracle's ODP.NET.
I am debating whether I should use old-style ADO.NET or EF for building the data access layer.
I will appreciate any help on this.
Thanks!

Your question seems to revolve around multiple concerns; I'll answer each based on my own views:
1.- How can you create a database (DB engine) agnostic DAL?
A: One approach is to follow the Repository pattern and/or use interfaces to decouple the code that manipulates the data from the code that retrieves and inserts it. The implementation behind those interfaces can itself be tailored to be DB engine agnostic: if you're going to use ADO.NET, check out the Enterprise Library for some very useful DB-engine-agnostic code. Entity Framework is also compatible with different DB engines but, as you mentioned, can only interact with one DB at a time, so whenever you generate the model, you tie it to the specifics of the DB engine that hosts your DB. This is related to another concern in your question:
2.- Should you use plain old ADO.NET or EF?
A: This is a very good question, which I'm sure has been asked many times before. Given that both approaches give you the same practical result of being able to retrieve and manipulate data, the real question becomes: what are your personal coding preferences and the time/resource constraints of the project?
IMO, Entity Framework is best suited for Code-First projects, and for cases where your business logic doesn't require complex logging, transactions, or other security or performance constraints on the DB side. That is not because EF is incapable of meeting these requirements, but because doing so becomes rather convoluted and impractical, and I personally believe that defeats the purpose of EF: to provide you with a tool that allows for rapid development.
So, if the people involved in the project are not very comfortable writing stored procedures in SQL, and the data manipulation will revolve mostly around your service without the need for very complex operations on the DB side, then EF is a suitable approach. You can leverage the Repository pattern as well as interfaces to implement "DbContext" objects that will allow you to create a DB-agnostic DAL.
However, if you are required to implement transactions, security, and extensive logging, and are more comfortable writing SQL stored procedures, Entity Framework will often prove to be a burden simply because it is not yet suited for advanced tasks. For example:
Imagine you have a User table with multiple fields (address, phone, etc.) that are not always necessary for all user-related operations (such as authentication). Trying to map an entity to the results of a stored procedure that omits any of the fields the entity contains will result in an error, so you will either need to create different models with more or fewer members, or return additional columns from the SP that you might not need for a particular operation, increasing bandwidth consumption unnecessarily.
Another situation is taking advantage of features such as Table-Valued Parameters in SQL Server to optimize sending multiple records at once to the DB. Entity Framework does not include anything that will automatically optimize operations with multiple records, so in order to use TVPs you will need to define that operation manually, much like you would if you had gone the ADO.NET route.
Eventually, you will have to weigh the considerations of your project against what your alternatives provide. ADO.NET gives you the best performance and customization for your DB operations; it is highly scalable and allows optimizations, but it takes more time to code. EF is very straightforward and practical for manipulating objects, and though it is constantly evolving and improving, its performance and capabilities are not quite on par with ADO.NET yet.
And regarding the drivers issue, it shouldn't weigh too much in the matter since even Oracle encourages you to use their driver instead of the default one provided by Microsoft.

Best practices when using oracle DB and .NET

What are the best practices or pitfalls that we need to be aware of when using the Microsoft Oracle provider in a web-service-centric .NET application?

Some practices we employ based on our production experience:
Validate connections when retrieving them from the connection pool.
Write your service code to not assume that connections are valid. Failure to do so can cause quite a bit of grief, especially in production environments.
Wherever possible, explicitly close and dispose connections after using them (using(conn){} blocks work well).
In a service, you should use connections for the shortest time possible - particularly if you are looking to create a scalable solution.
Consider using explicit timeouts on requests, appropriate to the typical duration of a request. The last thing you want is for one type of request to hang and potentially block your whole system.
Wherever possible use bind variables to avoid hard parses at the database (this can be a performance nightmare if you don't start out with this practice). Using bind variables also protects you from basic SQL-injection attacks.
Make sure you have adequate diagnostic support built into your system - consider creating a wrapper around the Oracle ADO calls so that you can instrument, log, and locate all of them.
Consider using stored procedures or views when possible to push query semantics and knowledge of the data model into the database. This allows easier profiling and query tuning.
Alternatively, consider using a good ORM library (EF, Hibernate, etc.) to encapsulate data access - particularly if you perform both read and write operations.
Extending on the above - don't pepper your code with dozens of individually written SQL fragments. This quickly becomes a maintainability nightmare.
If you are committed to Oracle as a database, don't be afraid to use Oracle-specific features. The ODP library provides access to most features - such as returning table cursors, batch operations, etc.
Oracle treats empty strings ("") and NULLs as equivalent - .NET does not. Normalize your string treatment as appropriate for Oracle.
Consider using NVARCHAR2 instead of VARCHAR2 if you will store Unicode .NET strings directly in your database. Otherwise, convert all Unicode strings to conform to the core ASCII subset. Failure to do so can cause all sorts of confusing and evil data corruption problems.
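For the empty-string point, one possible way to normalize is to funnel every string through a pair of helpers at the data access boundary. This is only a sketch under the assumption that your application is happy to treat null and "" as the same thing; OracleStrings and its method names are hypothetical.

```csharp
using System;

public static class OracleStrings
{
    // Writing: send both null and "" as DBNull, since Oracle stores them identically.
    public static object ToDbValue(string value)
    {
        return string.IsNullOrEmpty(value) ? (object)DBNull.Value : value;
    }

    // Reading: map a database NULL back to one agreed-upon .NET value
    // (the empty string here; picking null instead would work equally well).
    public static string FromDbValue(object dbValue)
    {
        return (dbValue == null || dbValue is DBNull) ? string.Empty : (string)dbValue;
    }
}
```

The result of ToDbValue would be assigned to a bind variable's Value, which also keeps the string out of the SQL text itself.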

Some more tips:
Avoid using the Microsoft Oracle provider because it is going out of support (http://blogs.msdn.com/adonet/archive/2009/06/15/system-data-oracleclient-update.aspx)
If you're committed to Oracle, use Oracle-specific features and link the Oracle.DataAccess assembly to your code
If you're not sure and want to be flexible, use System.Data.Common classes and load the Oracle provider through

The Oracle Providers work fine in an ASP.NET application, but be aware of:
Matching the right version of the Oracle client (32-bit or 64-bit) with your application pool: a 32-bit client for a 32-bit app pool, a 64-bit client for a 64-bit app pool.
Permissions - grant the app pool user rights to the Oracle client directory (c:\oracle\product\10.2.0\client_1)
This doesn't have anything to do with ASP.NET, but it is important to note that Oracle stores empty string and null both as null, so if you need to know that something was empty and not null, you need to add an additional column to track that...
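If a 32-bit/64-bit mismatch is suspected, a small diagnostic along these lines could be logged at startup to show which client the worker process actually needs. Environment.Is64BitProcess is a standard .NET property (available from .NET 4.0); the class and method names are made up for this sketch.

```csharp
using System;

public static class BitnessCheck
{
    // Reports which Oracle client bitness matches the current process.
    public static string RequiredOracleClient()
    {
        return Environment.Is64BitProcess ? "64-bit" : "32-bit";
    }
}
```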

Beginner's database questions

I am a beginner to databases, and am very inexperienced at programming generally. For a C# console app (I'm writing with VS Express), which after testing will have a UI added, I need to use a database to store its data.
Can someone tell me, or point me to, bare beginner's explanations, and pros and cons, of these database access methods so I can decide which I should use?
SQLClient
ORM
OleDB
ODBC
ADO.NET
nHibernate
MS Enterprise Library

Quite the mix there... first some explanations:
1) SQL Client
An SQL client is an application that connects to an SQL database for the purpose of querying, managing, or otherwise working with the data in it (any program accessing a database: phpMyAdmin, SQLite Administrator, etc.).
2) ORM is object-relational mapping. It's a way to convert data between representations whose types are incompatible. Think about a car class that incorporates four instances of a tire class. This type of structure doesn't translate directly to the types available in database design, which may be a reason to use an ORM: to relate the objects (car, tires, etc.) to the plain database types (integer, float, blob, etc.).
3) OLE (pronounced "Olay") DB is the Microsoft method (API) for connecting to databases using COM. OLE DB is part of the MDAC stack (a grouping of MS technologies working together in a framework for data access).
4) ODBC is Open Database Connectivity, and it's an alternative API for database management systems (DBMS). Where OLE DB is a COM (Component Object Model) way to integrate with databases, ODBC aims to be language independent.
5) ADO.NET is a set of base classes (API) for use in the .NET languages to connect to and communicate with Databases.
I would suggest starting with ADO.NET given your C# background; OLE DB is typically for older (classic VB) applications. There is a good beginner tutorial here: http://www.csharp-station.com/Tutorials/AdoDotNet/Lesson01.aspx
Don't let all the terminology scare you off; once you jump in and start tinkering, you will understand all of the answers provided better.
Best of Luck in your coding!! :-)
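To illustrate the car-and-tires example from the ORM explanation, here is a hand-written sketch of the kind of mapping an ORM automates: an object that owns a list of other objects is flattened into plain rows that carry a foreign key. All type names and the implied table layout are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public sealed class Tire
{
    public string Position { get; set; }
    public double PressureBar { get; set; }
}

public sealed class Car
{
    public int Id { get; set; }
    public string Model { get; set; }
    public List<Tire> Tires { get; private set; }

    public Car()
    {
        Tires = new List<Tire>();
    }
}

// One flat row per tire, as it might sit in a hypothetical "tires" table:
// only plain column types, plus the owning car's id as a foreign key.
public sealed class TireRow
{
    public int CarId { get; set; }
    public string Position { get; set; }
    public double PressureBar { get; set; }
}

public static class CarMapper
{
    // Object graph -> rows. An ORM generates this (and the reverse) for you.
    public static List<TireRow> ToRows(Car car)
    {
        var rows = new List<TireRow>();
        foreach (Tire tire in car.Tires)
        {
            rows.Add(new TireRow
            {
                CarId = car.Id,
                Position = tire.Position,
                PressureBar = tire.PressureBar
            });
        }
        return rows;
    }
}
```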

SQLClient, OleDB, ODBC are the DBMS drivers/ADO.NET implementations for different DBMSs (err, hope that makes sense). For example, SQLClient is the ADO.NET implementation for connecting to a SQL Server database. The choice between these drivers just comes down to which database you want to use. For a beginner, I would suggest SQL Server as you probably already have some version of that installed.
ORM is Object-Relational Mapping. This is a code-based implementation of an auto-mapping between your code-based models and the database that stores them. If you don't want to manually touch the database for the time being as you are learning, this is a good option - it is useful for pros and beginners alike, as it allows you to not worry about the underlying database implementation, or about writing CRUD (create, read, update, delete) functionality yourself. Take a look at ActiveRecord for .NET (http://www.castleproject.org/activerecord/index.html)

If you are looking for an easy introduction to databases in C# you want to use LINQ and a data context.
Simply add a "Data Context" to your project. Double-click the file to open the designer for the LINQ data context. Open the "Server Explorer" in Visual Studio (under View) and connect to your SQL Server. From there you can drag and drop your tables onto the LINQ designer.
Jump on Google and have a look at using LINQ with a data context to do work on your DB.
I'll jump in here with LINQ to say that it encourages you to write better database code: code that doesn't pull the whole dataset out in one go and operate on it. Queries are deferred, and you can benefit greatly from the functional infrastructure built upon it.
But this has a big learning curve; the best way through it is to try different kinds of code and see which ones make sense to you.

Application that works in both SQL Server And Oracle Databases

What is the best approach to build a small (but scalable) application that works with SQL Server or Oracle?
I'm interested in building apps that support multiple databases, and in the process behind that feature.

Using an ORM that supports multiple databases is the first step here. You could look at either NHibernate or Entity Framework, for example - both have Oracle and SQL Server support. That way you should just have to swap out the database mappings to get the application to work on either DBMS.
Edit - thanks to tvanfosson, added the 'new' link for NHibernate.

In addition to the ORM comments; sometimes life is not that simple.
You must keep separate scripts for generating your tables, views, and stored procedures on both systems as they will differ.
You may have the need to do something tricky for performance reasons that is specific to one database platform. For example, making a new partition in Oracle.
You should try to do it at this level by encapsulating it in a view or stored procedure.
Your client code can call the stored procedure with the same signature on any database. You can write a stored procedure that does nothing or lots, depending on what that database requires.

My suggestion would be to use an existing (free) framework, like NHibernate, which abstracts out the dependence on the database for you. Alternatively, you'll need to define your own abstraction layer that is able to interact with drivers for either of the two databases.

As a complement to the other answers, you should take a look at the DbProviderFactories architecture in ADO.NET. It keeps a bit of a low profile, but it may be useful for you.

As many people have pointed out, using an ORM could solve your problem. I've used LLBLGen with great success. Alternatively, you can use the interfaces IDbConnection, IDbCommand and so on to roll your own ConnectionFactory.

I would use an OR/M. Most of these support many different database vendors and have a database-agnostic language for querying and the like.
I can recommend NHibernate for C#.
