Query multiple databases with a single ADO.NET query - C#

I have a distributed DB architecture where data is stored across multiple SQL Servers.
How can I do a select/update/delete by running a single query? For example, "select * from employees" should return data from all the databases I have.
How can I write a single query that runs across multiple SQL Servers and returns a single consolidated view to my web server?
NOTE: Since the number of SQL Servers may change over time, I am looking for something other than linked servers, because managing linked servers at scale (up or down) is a big pain.

To talk to different databases / connections, you'll need a distributed transaction via TransactionScope; fortunately, this is actually easier to use than database transactions (although you need a reference to System.Transactions.dll):
using (TransactionScope tran = new TransactionScope())
{
    // lots of code talking to different databases / connections
    tran.Complete();
}
Additionally, TransactionScope nests naturally, and SqlConnection enlists in the ambient transaction automatically, making it really easy to use.
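For instance, a minimal sketch of that nesting (connStr1/connStr2 are placeholder connection strings; opening the second connection is the point where the transaction escalates to MSDTC):
using (TransactionScope outer = new TransactionScope())
{
    using (SqlConnection conn1 = new SqlConnection(connStr1))
    {
        conn1.Open(); // enlists in the ambient transaction automatically
        // ... work against server 1 ...
    }

    using (TransactionScope inner = new TransactionScope()) // joins the outer scope
    using (SqlConnection conn2 = new SqlConnection(connStr2))
    {
        conn2.Open(); // second server: escalates to a distributed (MSDTC) transaction
        // ... work against server 2 ...
        inner.Complete();
    }

    outer.Complete(); // every scope must call Complete(), or the whole thing rolls back
}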

Use TransactionScope.
If you open connections to different servers within the scope, the transaction will be escalated to a distributed transaction.
Example:
using (TransactionScope scope = new TransactionScope())
{
    conn1.Open(); // Open connection to db1
    conn2.Open(); // Open connection to db2

    // Don't forget to complete the scope so the transaction won't roll back
    scope.Complete();
}

You can't do what you're after with a single query unless you're willing to insert an intermediary of some kind, such as a SQL Express instance that mediates with the other servers, perhaps using SQL CLR. But that's messy.
It would be much easier to simply issue a bunch of async requests and then merge the responses into a single DataTable (or equivalent) as they arrive. By using native ADO.NET async calls, all of the queries can run in parallel. You will of course need to use a lock while reading the data into the single DataTable.
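A minimal sketch of that fan-out-and-merge approach, using modern async/await in place of the older Begin/End-style ADO.NET calls (the method name and connection-string list are illustrative):
// Requires System.Data, System.Data.SqlClient, System.Linq, System.Threading.Tasks
async Task<DataTable> QueryAllServersAsync(IEnumerable<string> connectionStrings, string sql)
{
    var merged = new DataTable();
    var gate = new object();

    var tasks = connectionStrings.Select(async cs =>
    {
        using (var conn = new SqlConnection(cs))
        using (var cmd = new SqlCommand(sql, conn))
        {
            await conn.OpenAsync();
            using (var reader = await cmd.ExecuteReaderAsync())
            {
                var local = new DataTable();
                local.Load(reader);   // each task loads into its own table
                lock (gate)           // only the merge step needs serializing
                {
                    merged.Merge(local);
                }
            }
        }
    });

    await Task.WhenAll(tasks);
    return merged;
}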

The best solution here is to use a Virtual DBMS to blend your multiple back-ends into a single apparent backend -- so your query goes to the Virtual DBMS which then relays it appropriately to the actual data stores.
OpenLink Virtuoso is one option. Virtuoso opens connections to any ODBC-accessible (including JDBC-accessible, via an ODBC-to-JDBC Bridge) data source.
Your data consuming applications can connect to Virtuoso via ODBC, JDBC, OLE-DB, or ADO.NET as needed. All remote linked objects (Tables, Views, Stored Procedures, etc.) are available through all data access mechanisms.
While you can achieve similar results using the other techniques outlined here, those require the end user to know all about the back-end data structures, and to optimize queries themselves. With Virtuoso, a built-in Cost-based Optimizer will re-write queries to deliver the fastest possible results, with the least possible network traffic, based on the Virtual Schema constructed when you link in the remote objects.
Disclaimer: I work for OpenLink Software, but do not directly benefit from anyone choosing to use our products.

If you do not have access or privileges to create a linked server and build a view that consolidates the JOINed query across all the SQL Servers, you can run the same query statement against every SQL Server instance and union the results: loop through all the database connections and merge the collected data into one consolidated data structure. In this example I chose a DataTable:
DataTable consolidatedEmployees = new DataTable();

foreach (ConnectionStringSettings cs in ConfigurationManager.ConnectionStrings)
{
    consolidatedEmployees.Merge(
        SelectTransaction("select * from employees", cs.ConnectionString));
}
Use this helper method to query any SQL Server database with ADO.NET:
/// <summary>
/// Executes a SQL query statement inside a TransactionScope,
/// using the ReadCommitted isolation level.
/// </summary>
/// <param name="query">SQL query statement</param>
/// <param name="connString">Connection string</param>
internal DataTable SelectTransaction(string query, string connString)
{
    DataTable tableResult = null;
    SqlCommand cmd = null;
    SqlConnection conn = null;
    SqlDataAdapter adapter = null;

    TransactionOptions tranOpt = new TransactionOptions();
    tranOpt.IsolationLevel = IsolationLevel.ReadCommitted;

    using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required, tranOpt))
    {
        tableResult = new DataTable();
        try
        {
            conn = new SqlConnection(connString);
            conn.Open();
            cmd = new SqlCommand(query, conn);
            adapter = new SqlDataAdapter(cmd);
            adapter.Fill(tableResult);
        }
        catch (Exception ex)
        {
            // Leaving the scope without calling Complete() rolls the transaction back
            throw new Exception("Error during the database transaction.", ex);
        }
        finally
        {
            if (null != adapter)
            {
                adapter.Dispose();
            }
            if (null != cmd)
            {
                cmd.Dispose();
            }
            if (null != conn)
            {
                conn.Close();
                conn.Dispose();
            }
        }
        scope.Complete();
    }
    return tableResult;
}
With this solution, just pay attention to duplicated data; you may need to apply a DISTINCT to the consolidated result afterwards.
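For example, a short sketch of the de-duplication step (the column names here are hypothetical; DataView.ToTable requires them to be listed explicitly):
DataTable distinctEmployees = consolidatedEmployees.DefaultView
    .ToTable(true, "EmployeeId", "Name"); // true = return distinct rows only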

Related

Strange timeouts with READ UNCOMMITTED

In our application, we need exclusive access to a certain resource. It is a legacy application with no communication between client and server; all clients talk directly to the database.
Edit: this is an implementation of a semaphore for SQL Server.
What we had been using without any problem was this:
using (var sqlConnection = new SqlConnection(Parameters.ConnectionStringUncommited))
{
    sqlConnection.Open();
    using (var sqlTransaction = sqlConnection.BeginTransaction(IsolationLevel.ReadUncommitted))
    {
        using (var sqlCommand = new SqlCommand($"SELECT COUNT(*) FROM Locks WHERE Id = {IDTOCHECK}", sqlConnection, sqlTransaction))
        {
            if ((int)sqlCommand.ExecuteScalar() > 0)
                ShowMessage("Locked");
            else
            {
                sqlCommand.CommandText = $"INSERT INTO Locks(Id) VALUES({IDTOCHECK})";
                sqlCommand.ExecuteNonQuery();
                // Here we do a lot of work; it can stay locked for a long time, even minutes, because we show windows and so on
                // We never commit the transaction, we just roll it back
                sqlTransaction.Rollback();
            }
        }
    }
}
This worked perfectly in our previous version with SQL Server 2008 R2 and .NET Framework 3.5. We just upgraded to SQL Server 2014 and .NET Framework 4.8, and we now sometimes get timeouts on the SELECT COUNT(*) statement. The problem only happens at our customers' sites, so debugging is very hard.
I really don't get it: the transaction has an isolation level of READ UNCOMMITTED and the code has not changed. What is really happening here?
Just to update this: with sp_getapplock I have no problems, so I am marking this as the answer.
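For completeness, a minimal sketch of the sp_getapplock approach (the resource name and timeout are illustrative; a transaction-owned application lock is released automatically on commit or rollback, so it fits the rollback-based pattern above):
using (var sqlConnection = new SqlConnection(Parameters.ConnectionStringUncommited))
{
    sqlConnection.Open();
    using (var sqlTransaction = sqlConnection.BeginTransaction())
    {
        using (var sqlCommand = new SqlCommand("sp_getapplock", sqlConnection, sqlTransaction))
        {
            sqlCommand.CommandType = CommandType.StoredProcedure;
            sqlCommand.Parameters.AddWithValue("@Resource", "MyExclusivePlace");
            sqlCommand.Parameters.AddWithValue("@LockMode", "Exclusive");
            sqlCommand.Parameters.AddWithValue("@LockTimeout", 0); // fail immediately instead of queuing
            var returnValue = sqlCommand.Parameters.Add("@ReturnValue", SqlDbType.Int);
            returnValue.Direction = ParameterDirection.ReturnValue;
            sqlCommand.ExecuteNonQuery();

            if ((int)returnValue.Value < 0) // negative return values mean the lock was not granted
                ShowMessage("Locked");
            else
            {
                // do the long exclusive work here...
            }
        }
        sqlTransaction.Rollback(); // releases the app lock
    }
}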

What is the benefit of caching an IReader?

Reviewing some legacy code, there is a commonly used table that gets updated very infrequently.
To avoid constantly going to the database for the same data, it seems the developer was trying to cache it. The code looks like this:
private static IDataReader _cachedCheckList;

public override IDataReader GetDataReader()
{
    if (_cachedCheckList == null)
    {
        using (var oneTimeRead = base.GetDataReader())
        {
            _cachedCheckList = new CachedDataReader(oneTimeRead);
        }
    }
    return _cachedCheckList ?? base.GetDataReader();
}
Then elsewhere in the system the function that uses this follows the pattern of:
IDataReader reader = new CheckList().GetDataReader();
while (reader.Read())
{
    [snip]
}
By loading the IReader into memory, I don't think this provides much of a performance increase.
I'm trying to understand the developer's reasoning behind this code. What is the benefit of caching the IReader?
Update: The CachedDataReader() method is basically:
SqlConnection connection = new SqlConnection(ConnectionString);
connection.Open();
var command = new SqlCommand(commandText, connection);
command.CommandType = CommandType.StoredProcedure;
return command.ExecuteReader();
I'd not seen anyone cache a DataReader before and was wondering whether there was a good reason to do this before refactoring the code.
Maybe they cached the DataReader for the following reasons:
A DataReader is read-only and forward-only. It fetches records from the database into the network buffer and hands them out as they are requested. It streams rows as the query executes instead of waiting for the entire result set, so it is very fast compared to a DataSet; each row is released as Read() is called.
A DataReader should not be cached, though. You should fetch the data into a DataSet or DataTable and cache that instead:
Cache["Data"] = DataTable;
Never cache DataReader objects. Because a DataReader object holds an open connection to the database, caching the object extends the lifetime of the connection, affecting other users of the database. Also, because the DataReader is a forward-only stream of data, after a client has read the information, the information cannot be accessed again. Caching it would be futile.
Caching DataReader objects disastrously affects the scalability of your applications. You may hold connections open and eventually cache all available connections, making the database unusable until the connections are closed. Never cache DataReader objects, no matter what caching technology you are using.
The quotes above are taken from https://forums.asp.net/post/3224692.aspx
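A short sketch of that advice, assuming ASP.NET's HttpRuntime.Cache and a made-up cache key; the point is that only a fully materialized DataTable, never an open reader, ends up in the cache:
DataTable checkList = HttpRuntime.Cache["CheckList"] as DataTable;
if (checkList == null)
{
    using (var connection = new SqlConnection(ConnectionString))
    using (var command = new SqlCommand(commandText, connection))
    {
        command.CommandType = CommandType.StoredProcedure;
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            checkList = new DataTable();
            checkList.Load(reader); // materialize everything, then the connection closes
        }
    }
    HttpRuntime.Cache.Insert("CheckList", checkList); // no open connection is cached
}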

TransactionScope with MySQL and Distributed Transactions

I have an ASP.NET Web API instance set up that uses a MySQL database for storage. I have written an ActionFilter that handles creating a TransactionScope for the lifetime of a single endpoint request.
public async Task<HttpResponseMessage> ExecuteActionFilterAsync(
    HttpActionContext actionContext,
    CancellationToken cancellationToken,
    Func<Task<HttpResponseMessage>> continuation)
{
    var transactionScopeOptions = new TransactionOptions { IsolationLevel = IsolationLevel.ReadUncommitted };
    using (var transaction = new TransactionScope(TransactionScopeOption.RequiresNew, transactionScopeOptions, TransactionScopeAsyncFlowOption.Enabled))
    {
        var handledTask = await continuation();
        transaction.Complete();
        return handledTask;
    }
}
Then throughout the endpoints I have different queries/commands that open and close connections, relying on the AutoEnlist=true behaviour of DbConnection. An example endpoint might be:
public async Task<IHttpActionResult> CreateStuffAsync()
{
    var query = await this.queryService.RetrieveAsync();
    // logic to do stuff
    var update = this.updateService.Update(query);
    return this.Ok();
}
I don't create a single DbConnection and pass it around from the top, as this is a simplified example; in practice, passing the connection between services would require a large refactor (although it can be done if necessary). I have also read that it is better to open/close connections as needed, i.e. keep them open for as little time as possible. The queryService and updateService open and close DbConnections via using statements:
var factory = DbProviderFactories.GetFactory("MySql.Data.MySqlClient");
using (var connection = factory.CreateConnection())
{
    connection.ConnectionString = "Data Source=localhost;Initial Catalog=MyDatabase;User ID=user;Password=password;Connect Timeout=300;AutoEnlist=true;";
    if (connection.State != ConnectionState.Open)
    {
        connection.Open();
    }
    var result = await connection.QueryAsync(Sql).ConfigureAwait(false);
    return result;
}
The same DbConnection is generally not used for multiple queries within the same API endpoint request, but the same connection string is.
Intermittently I see an exception thrown when attempting to open a connection:
"ExceptionType": "System.NotSupportedException",
"ExceptionMessage": "System.NotSupportedException: MySQL Connector/Net does not currently support distributed transactions.\r\n at MySql.Data.MySqlClient.ExceptionInterceptor.Throw(Exception exception)\r\n at MySql.Data.MySqlClient.MySqlConnection.EnlistTransaction(Transaction transaction)\r\n at MySql.Data.MySqlClient.MySqlConnection.Open()"
I do not understand why it is attempting to escalate the transaction to a distributed one when all of the connections are against the same database. Or am I misunderstanding/misusing the TransactionScope and DbConnection instances?
The System.Transactions.Transaction object makes the determination of whether to escalate to a distributed transaction based on how many separate "resource managers" (e.g., a database) have enlisted in the transaction.
It does not draw a distinction between connections to different physical databases (that do require a distributed transaction) and multiple MySqlConnection connections that have the same connection string and connect to the same database (which might not). (It would be very difficult for it to determine that two separate "resource managers" ① represent the same physical DB and ② are being used sequentially, not in parallel.) As a result, when multiple MySqlConnection objects enlist in a transaction, it will always escalate to a distributed transaction.
When this happens, you run into MySQL bug #70587: distributed transactions aren't supported in Connector/NET.
Workarounds would be:
1. Make sure only one MySqlConnection object is opened within any TransactionScope (a sketch follows below).
2. Change to a separate connector that does support distributed transactions. You could use MySqlConnector (NuGet, GitHub) as a drop-in replacement for Connector/NET. I've heard that dotConnect for MySQL supports them also (but haven't tried that one).
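A sketch of the first workaround, assuming the services can be refactored to accept an injected connection so that only one MySqlConnection ever enlists per scope (the RetrieveAsync/UpdateAsync signatures here are hypothetical):
using (var scope = new TransactionScope(
    TransactionScopeOption.RequiresNew,
    new TransactionOptions { IsolationLevel = IsolationLevel.ReadUncommitted },
    TransactionScopeAsyncFlowOption.Enabled))
using (var connection = new MySqlConnection(connectionString))
{
    await connection.OpenAsync(); // the one and only enlistment: no escalation
    var query = await queryService.RetrieveAsync(connection);  // hypothetical refactored signatures
    await updateService.UpdateAsync(connection, query);
    scope.Complete();
}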

How to implement single SQL transaction for multiple ObjectContexts in EF4

I have a fairly big database with tables created for different business modules.
We decided to create different edmx files for the different modules.
However, how can I prevent the use of MSDTC when implementing a TransactionScope for a logical action that writes to multiple tables across different edmx files? Again, the underlying database is the same; I wouldn't want to use MSDTC for this scenario.
Is there any way to pass in an already-opened SQL connection with an active transaction?
Thanks for the help in advance.
Regards,
William
TransactionScope escalates to MSDTC when the databases are different and/or the connection strings are different.
Rick Strahl has a great article on this (his perspective is LINQ to SQL, but it's applicable to EF). The money paragraphs:
TransactionScope is a high level Transaction wrapper that makes it real easy to wrap any code into a transaction without having to track transactions manually. Traditionally TransactionScope was a .NET wrapper around the Distributed Transaction Coordinator (DTC) but its functionality has expanded somewhat. One concern is that the DTC is rather expensive in terms of resource usage and it requires that the DTC service is actually running on the machine (yet another service which is especially bothersome on a client installation).

However, recent updates to TransactionScope and the SQL Server client drivers make it possible to use the TransactionScope class and the ease of use it provides without requiring DTC, as long as you are running against a single database and with a single consistent connection string. In the example above, since the transaction works with a single instance of a DataContext, the transaction actually works without involving DTC. This is in SQL Server 2008.
See also this SO question/answer where I found the link to Rick's blog.
So if you're connecting to the same database and are using the same connection string, the DTC should not be involved.
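In sketch form (hedged: ModuleAEntities/ModuleBEntities are hypothetical context names, and this relies on SQL Server 2008+ reusing the pooled connection when the contexts share an identical connection string and only one connection is open at a time):
using (var scope = new TransactionScope())
{
    using (var moduleA = new ModuleAEntities())
    {
        // ... changes to module A's tables ...
        moduleA.SaveChanges();
    } // the connection closes before the next one opens

    using (var moduleB = new ModuleBEntities())
    {
        // ... changes to module B's tables ...
        moduleB.SaveChanges();
    }

    scope.Complete();
}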
Thanks for all the replies above!
By the way, I just managed to find a solution, which is to use EntityConnection and EntityTransaction explicitly. A sample looks like this:
string theSqlConnStr = "data source=TheSource;initial catalog=TheCatalog;persist security info=True;user id=TheUserId;password=ThePassword";

EntityConnectionStringBuilder theEntyConnectionBuilder = new EntityConnectionStringBuilder();
theEntyConnectionBuilder.Provider = "System.Data.SqlClient";
theEntyConnectionBuilder.ProviderConnectionString = theSqlConnStr;
theEntyConnectionBuilder.Metadata = @"res://*/";

using (EntityConnection theConnection = new EntityConnection(theEntyConnectionBuilder.ToString()))
{
    theConnection.Open();
    EntityTransaction theET = null;
    try
    {
        theET = theConnection.BeginTransaction();

        DataEntities1 DE1 = new DataEntities1(theConnection);
        //DE1 does some things...
        DataEntities2 DE2 = new DataEntities2(theConnection);
        //DE2 does some things...
        DataEntities3 DE3 = new DataEntities3(theConnection);
        //DE3 does some things...

        theET.Commit();
    }
    catch (Exception ex)
    {
        if (theET != null) { theET.Rollback(); }
    }
    finally
    {
        theConnection.Close();
    }
}
With explicit use of EntityConnection and EntityTransaction, I can share a single connection and transaction across multiple ObjectContexts for a single database, without incurring the use of MSDTC.
Hope this info is helpful. Good luck!!

How to roll back multiple queries on different database servers in case of any error

I am using different SQL procedures in an application.
The first procedure inserts some rows, then there is some processing in C# code; a second procedure does some updates, then more code processing; a third procedure deletes some records and then inserts new ones. When all of this is done on Server 1, data is fetched from that server and sent to Server 2, where records are deleted and new ones inserted.
If there is an error at any stage, on any server, in any procedure, I want to roll back all the records.
I can't use BEGIN TRAN because the processing takes time and I can't block the tables, since other users are working with the same tables in parallel. So kindly tell me how I can achieve this without blocking the tables for other users.
Thanks in advance.
Edited (added code example):
I tried TransactionScope, but I am getting an exception while opening the connection. I configured MSDTC, but maybe it is not configured properly:
"Network access for Distributed Transaction Manager (MSDTC) has been disabled. Please enable DTC for network access in the security configuration for MSDTC using the Component Services Administrative tool."
using (TransactionScope ts = new TransactionScope(TransactionScopeOption.Required))
{
    try
    {
        dl.SetBookReadyToLive(13570, false);
        //SetBookReadyToLive
        dl.AddTestSubmiitedTitleID(23402);
        dl.AddBookAuthorAtLIve(13570, 1);
        ts.Complete();
    }
    catch (Exception ex)
    {
        Response.Write(ex.Message);
    }
}
public void SetBookReadyToLive(long BookID, bool status)
{
    try
    {
        if (dbConMeta.State != ConnectionState.Open)
            dbConMeta.Open();

        SqlCommand cmd = new SqlCommand("spSetBookReadyToLive", dbConMeta);
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Clear();
        cmd.Parameters.AddWithValue("@BookID", BookID);
        cmd.Parameters.AddWithValue("@status", status);
        cmd.ExecuteNonQuery();

        if (dbConMeta.State == ConnectionState.Open)
            dbConMeta.Close();
    }
    catch
    {
        // Note: swallowing the exception here hides failures from the TransactionScope above
        if (dbConMeta.State == ConnectionState.Open)
            dbConMeta.Close();
    }
}
I get the exception when opening the connection inside the method.
I am using SQL Server 2000. I have set up the MSDTC configuration on the machine where SQL Server is installed and also on my PC, from which I am running the code, but I still get the same exception.
Kindly help me configure it.
You can use the TransactionScope class. It generally works well, but in the case of distributed SQL Servers, as in your case, it requires MSDTC to be enabled and properly configured on both servers (security has to be granted for the execution of network transactions, distributed ones and so on...).
Here is a copy-paste of an example from MSDN; you could "almost" use it like this... :)
// Create the TransactionScope to execute the commands, guaranteeing
// that both commands can commit or roll back as a single unit of work.
using (TransactionScope scope = new TransactionScope())
{
    using (SqlConnection connection1 = new SqlConnection(connectString1))
    {
        // Opening the connection automatically enlists it in the
        // TransactionScope as a lightweight transaction.
        connection1.Open();

        // Create the SqlCommand object and execute the first command.
        SqlCommand command1 = new SqlCommand(commandText1, connection1);
        returnValue = command1.ExecuteNonQuery();
        writer.WriteLine("Rows to be affected by command1: {0}", returnValue);

        // If you get here, this means that command1 succeeded. By nesting
        // the using block for connection2 inside that of connection1, you
        // conserve server and network resources as connection2 is opened
        // only when there is a chance that the transaction can commit.
        using (SqlConnection connection2 = new SqlConnection(connectString2))
        {
            // The transaction is escalated to a full distributed
            // transaction when connection2 is opened.
            connection2.Open();

            // Execute the second command in the second database.
            returnValue = 0;
            SqlCommand command2 = new SqlCommand(commandText2, connection2);
            returnValue = command2.ExecuteNonQuery();
            writer.WriteLine("Rows to be affected by command2: {0}", returnValue);
        }
    }

    // The Complete method commits the transaction. If an exception has been thrown,
    // Complete is not called and the transaction is rolled back.
    scope.Complete();
}
source: TransactionScope Class
To minimize locking, you can specify the isolation level with the constructor overload that takes a TransactionOptions; the default is Serializable, and if you are fine with that you could set it to ReadCommitted.
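For instance, to drop from the default Serializable down to ReadCommitted:
TransactionOptions options = new TransactionOptions { IsolationLevel = IsolationLevel.ReadCommitted };
using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required, options))
{
    // ... commands ...
    scope.Complete();
}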
Note: personally I would not use this one unless absolutely needed, because it's a bit of a pain to keep the DTC configured everywhere, and distributed transactions are in general slower than local ones, but it really depends on your BL/DAL logic.
Short answer: the same way you would do it in MS SQL Management Studio.
1. You open a connection to a server.
2. You open a transaction on that specific server.
3. You run your queries related to this server.
4. You make sure to keep your connection alive while you... [go back to step 1 for the next server]
5. If all your queries worked, commit all your changes.
6. Else, roll back all your queries.
Warning: the first table will most likely stay locked until you're done with all your servers/queries. Something that can help here: if you have a lot of data, transfer it to temporary tables on every server before doing step 2; once that is done, open the transactions, do the fast part, then commit/rollback as soon as possible.
Note: I know you asked how to achieve this without locking the tables, hence the idea in the warning above. A rough sketch of the whole pattern follows.
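A sketch of those steps for two servers (connection strings and command texts are placeholders; note that this is not truly atomic, since a failure between the two Commit() calls can still leave the servers inconsistent, which is exactly the gap MSDTC closes):
var connections = new List<SqlConnection>();
var transactions = new List<SqlTransaction>();
try
{
    foreach (var cs in new[] { connStr1, connStr2 })
    {
        var conn = new SqlConnection(cs);           // step 1: connect
        conn.Open();
        transactions.Add(conn.BeginTransaction());  // step 2: per-server transaction
        connections.Add(conn);
    }

    // step 3: run each server's work inside its own transaction
    using (var cmd1 = new SqlCommand("UPDATE ...", connections[0], transactions[0]))
        cmd1.ExecuteNonQuery();
    using (var cmd2 = new SqlCommand("DELETE ...", connections[1], transactions[1]))
        cmd2.ExecuteNonQuery();

    foreach (var t in transactions) t.Commit();     // step 5: everything worked
}
catch
{
    foreach (var t in transactions) t.Rollback();   // step 6: any failure undoes all servers
    throw;
}
finally
{
    foreach (var c in connections) c.Dispose();     // step 4's "keep alive" ends here
}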
