I have an SSIS package with multiple tasks. I am loading files, and if the File System Task at the end fails I want to be able to roll back the transaction. My package looks like this.
I would like to be able to roll back all the operations the SSIS script has done. To do that I need the SSIS script to enlist in the transaction created by the BEGIN_TRANSACTION SQL task. How can I do that?
In SSIS, to get hold of the transaction I do:
object rawConnection = Dts.Connections["destination_ado"].AcquireConnection(Dts.Transaction);
myADONETConnection = (SqlConnection)rawConnection;
Then I do a BulkCopy:
using (SqlBulkCopy sbc = new SqlBulkCopy(myADONETConnection))
{
    sbc.DestinationTableName = "[" + SCHEMA_DESTINATION + "].[" + TABLE_DESTINATION + "]";
    // Number of records to be processed in one go
    sbc.BatchSize = 10000;
    // Finally write to server
    sbc.WriteToServer(destination);
}
myADONETConnection.Close();
How do I tell SqlBulkCopy to use the existing transaction?
In the connection's options in SSIS I set RetainSameConnection to True.
Thanks for all your thoughts,
Vincent
Looking at your package, I see that you are iterating through a bunch of files and, for each iteration, loading the file's contents into your destination tables.
You want all your data loads to be atomic, i.e. fully loaded or not loaded at all.
With this in mind, I would like to suggest the following approaches; in none of them is there any need for a Script Task or explicit Begin/End Transaction blocks:
Use a Data Flow Task and, in its properties, set TransactionOption to Required. This will do the job of enabling a transaction for that block.
Add error redirection at the destination to an error table, in a batch-wise manner, so as to reduce errors to the lowest minimum possible (see http://agilebi.com/jwelch/2008/09/05/error-redirection-with-the-ole-db-destination/). We used batch sizes of 100k, 50k, and 1 successfully when doing data loads of over a million rows per day. You can then deal with those errors separately.
If the use case is such that the whole file has to fail together, just redirect the failed records and move the file to a 'failed' folder using a File System Task (FST). Have a DFT following the FST perform a lookup on the destination and then delete all of those records.
So I found a solution.
In the first Script Task (extract and load) I create a transaction with this code:
SqlTransaction tran = myADONETConnection.BeginTransaction(IsolationLevel.ReadCommitted);
Then I use this transaction in the SqlBulkCopy this way:
using (SqlBulkCopy sbc = new SqlBulkCopy(myADONETConnection, SqlBulkCopyOptions.Default, tran))
I pass the transaction object to an SSIS variable:
Dts.Variables["User::transaction_object"].Value = tran;
Then, in my two blocks at the end, Commit transaction and Rollback transaction, I use a Script Task that reads the variable and either commits or rolls back the transaction:
SqlTransaction tran = (SqlTransaction)Dts.Variables["User::transaction_object"].Value;
tran.Commit();
As a result, if a file cannot be moved to the Archive folder its data doesn't get loaded twice: a transaction is started for each file, so if a file can't be moved only that file's data gets rolled back, and the enumerator keeps going to the next one.
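Putting the pieces together, here is a condensed sketch of the whole pattern (not a drop-in script task; SCHEMA_DESTINATION, TABLE_DESTINATION and the destination reader are the names from the snippets above, and RetainSameConnection must be True so the same SqlConnection survives across tasks):
// First script task (extract and load): acquire the connection, start a
// transaction, bulk load inside it, then park the transaction in an SSIS
// variable (which must be listed in the task's ReadWriteVariables).
SqlConnection myADONETConnection =
    (SqlConnection)Dts.Connections["destination_ado"].AcquireConnection(Dts.Transaction);
SqlTransaction tran = myADONETConnection.BeginTransaction(IsolationLevel.ReadCommitted);

using (SqlBulkCopy sbc = new SqlBulkCopy(myADONETConnection, SqlBulkCopyOptions.Default, tran))
{
    sbc.DestinationTableName = "[" + SCHEMA_DESTINATION + "].[" + TABLE_DESTINATION + "]";
    sbc.BatchSize = 10000;
    sbc.WriteToServer(destination);
}
Dts.Variables["User::transaction_object"].Value = tran;

// Later script task, "Commit transaction" (the "Rollback transaction" task
// is identical except it calls Rollback()):
SqlTransaction tran = (SqlTransaction)Dts.Variables["User::transaction_object"].Value;
tran.Commit();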
Related
I am working on a program in C# with a layout as shown in the image below:
The purpose of the program is to perform data archiving in SQL Server. If I choose "Create Tables", it generates new tables in my database (it should generate about 40 tables, in order) with the same structure (columns, constraints, triggers, etc.) as the original tables in that same database. The way this works is that I execute the SQL scripts in C# and call them (all 40 scripts) to create the tables.
Right now, I have added another button, "Transfer data", which selects specific data (based on date) from the old tables and transfers it into the new tables I created. I will use an INSERT INTO ... SELECT FROM query to transfer the data.
My question is: should I create SQL scripts for transferring the data and execute them in C#, or just put the SQL queries inside my C# code?
If I go with SQL scripts, should I split them into 40 scripts as well, or place all the queries inside one script? I know it will be tedious if I put everything in one script, because if an error occurs it's hard to trace the source of the problem.
Below is a sample of what the SQL query looks like:
SET IDENTITY_INSERT Kiosk_Log_New ON
INSERT INTO Kiosk_Log_New(LOGID,
logAPPID,
logLOGTID,
logGUID,
logOriginator,
logReference,
logAssemblyName,
logFunctionName,
logMessage,
logException,
CreatedBy,
CreatedDate)
SELECT LOGID,
logAPPID,
logLOGTID,
logGUID,
logOriginator,
logReference,
logAssemblyName,
logFunctionName,
logMessage,
logException,
CreatedBy,
CreatedDate FROM Kiosk_Log
WHERE CreatedDate BETWEEN '2015-01-01' AND GETDATE()
EDIT: Since many suggested that a stored procedure is the best option, this is my create-tables script:
string constr = ConfigurationManager.ConnectionStrings["constr"].ConnectionString;
/* open sql connection to execute SQL script: PromotionEvent_New */
try
{
    using (SqlConnection con = new SqlConnection(constr))
    {
        con.Open();
        FileInfo file = new FileInfo("C:\\Users\\88106221\\Documents\\SQL Server Management Studio\\PromotionEvent_New.sql");
        string script;
        using (StreamReader reader = file.OpenText()) // dispose the reader once the script has been read
        {
            script = reader.ReadToEnd();
        }
        Server server = new Server(new ServerConnection(con));
        server.ConnectionContext.ExecuteNonQuery(script);
        Display("PromotionEvent_New table has been created successfully");
        // no explicit con.Close() needed; the using block disposes the connection
    }
}
catch (Exception ex)
{
    textBox1.AppendText(string.Format("{0}", Environment.NewLine));
    textBox1.AppendText(string.Format("{0} MainPage_Load() exception - {1}{2}", _strThisAppName, ex.Message, Environment.NewLine));
    Display(ex.Message + " - PromotionEvent_New could not be created");
    textBox1.AppendText(string.Format("{0}", Environment.NewLine));
    Debug.WriteLine(string.Format("{0} MainPage_Load() exception - {1}", _strThisAppName, ex.Message));
}
It's best to use a stored procedure with a transaction to execute all your INSERT queries.
Submitting the queries from your C# code is not advisable, as explained in the last post by John Ephraim Tugado, for a number of reasons, the most important being:
easier maintenance of the INSERT queries
minimal bandwidth consumption between the web server and the database server
Sending long query strings from C# code consumes more bandwidth between the web server and the database server, and could slow the database's response in a high-traffic scenario.
You can execute the following T-SQL code against your database to create a stored procedure for transferring/archiving data to archived tables. This procedure makes sure that all your INSERTS are executed within a transaction, that ensures you don't end up with orphaned tables and unnecessary headaches down the road.
Stored Procedure for transferring data
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Lord Cookie
-- Create date: 11/01/2017
-- Description: Transfers data to existing archived tables
-- =============================================
CREATE PROCEDURE dbo.ArchiveData
AS
BEGIN
SET NOCOUNT ON;
BEGIN TRY
--use transaction when inserting data else you may end up with orphaned data and hard to debug issues later on
BEGIN TRANSACTION
--add your INSERT queries one after the other below
SET IDENTITY_INSERT Kiosk_Log_New ON
INSERT INTO Kiosk_Log_New (LOGID,
logAPPID,
logLOGTID,
logGUID,
logOriginator,
logReference,
logAssemblyName,
logFunctionName,
logMessage,
logException,
CreatedBy,
CreatedDate)
SELECT
LOGID
,logAPPID
,logLOGTID
,logGUID
,logOriginator
,logReference
,logAssemblyName
,logFunctionName
,logMessage
,logException
,CreatedBy
,CreatedDate
FROM Kiosk_Log
WHERE CreatedDate BETWEEN '2015-01-01' AND GETDATE()
--turn IDENTITY_INSERT back off; only one table per session can have it ON
SET IDENTITY_INSERT Kiosk_Log_New OFF
--add more of your insert queries below
-- finally commit transaction
COMMIT TRANSACTION
END TRY
BEGIN CATCH
DECLARE @errorDetails NVARCHAR(MAX);
SET @errorDetails = 'Error ' + CONVERT(VARCHAR(50), ERROR_NUMBER()) +
    ', Severity ' + CONVERT(VARCHAR(5), ERROR_SEVERITY()) +
    ', State ' + CONVERT(VARCHAR(5), ERROR_STATE()) +
    ', Line ' + CONVERT(VARCHAR(5), ERROR_LINE());
--roll back the transaction
IF XACT_STATE() <> 0
BEGIN
ROLLBACK TRANSACTION
END
--you can log the above error message and/or re-throw the error so your C# code will see an error
--but do this only after rolling back
END CATCH;
END
GO
You can then call the above stored procedure from C#, as shown in the sample code below.
Calling the stored procedure from C#
using (SqlConnection sqlConn = new SqlConnection("Your database Connection String"))
{
    using (SqlCommand cmd = new SqlCommand())
    {
        cmd.CommandText = "dbo.ArchiveData";
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Connection = sqlConn;
        sqlConn.Open();
        cmd.ExecuteNonQuery();
    }
}
Depending on the table naming and design, I would suggest writing a script that generates a stored procedure like this for each of your tables. I'm no expert in scripting, but it's the same idea as a script that generates an audit trail for each of your tables, or at least for the ones you define in the script.
Hard-coding this inside your C# application is a big NO, since the database can change; we want the app to be flexible to change with the least amount of effort.
If generating the script that creates the stored procedures is hard for you, I would still recommend manually creating stored procedures for this task.
I have a C# WinForms application which has a facility to back up databases, with their data, to a script file while the application is running.
I used the following code to script the database using SMO:
public StringCollection GenerateDatabaseScript(string databaseName)
{
//Validate database name goes here
StringCollection dbScript = new StringCollection();
//Create db connection
sqlDataAccess.DBConnect(databaseName); //Custom class to do SQL data operations (sqlDataAccess)
//Create server and database objects
var serverConn = new ServerConnection(sqlDataAccess.Connection);
var dbServer = new Server(serverConn);
var database = dbServer.Databases[databaseName];
//Set script database options here
//--
//Set script database tables option here
//--
//Script database creation
//I also use a method 'ScriptObjectWithBatchDelimiter' to add GO delimiter for each command manually.
dbScript.AddRange(ScriptObjectWithBatchDelimiter(database.Script(dbScriptingOptions)).ToArray());
//Set focus to new db
dbScript.Add(string.Format("USE [{0}]", databaseName));
dbScript.Add("GO");
foreach (Table table in database.Tables)
{
//Skip scripting system tables
if (table.IsSystemObject)
continue;
//Script table
dbScript.AddRange(ScriptObjectWithBatchDelimiter(table.EnumScript(tableScriptingOptions)).ToArray());
}
return dbScript;
}
The problem occurs at this line, when it encounters a table in the database whose data is not committed (ROWLOCK):
table.EnumScript(tableScriptingOptions)
The problem is: how can I script the data with READ UNCOMMITTED? Are there any properties I can set to achieve this?
The same question is asked here, but the only answer provided is not suitable.
UPDATE: I tried the following code (assumed relevant because of the 'Isolation' part in the name), but it still does not work:
database.SetSnapshotIsolation(true);
I think what you are looking for is this:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
msdn: "Specifies that statements can read rows that have been modified by other transactions but not yet committed."
So wrap your scripting work inside a transaction with isolation level ReadUncommitted.
This is a good example of how to use transactions in C#:
https://msdn.microsoft.com/en-us/library/5ha4240h(v=vs.110).aspx
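Since the isolation level is scoped to the session, one hedged alternative for the SMO code in the question (untested against EnumScript, but ServerConnection.ExecuteNonQuery is a standard SMO call) is to issue the SET statement on the scripting connection before any Script()/EnumScript() calls, without an explicit transaction:
// Reuse the connection the scripting will run on and switch its session
// to dirty reads before scripting anything.
var serverConn = new ServerConnection(sqlDataAccess.Connection);
serverConn.ExecuteNonQuery("SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;");

var dbServer = new Server(serverConn);
var database = dbServer.Databases[databaseName];
// ...script the database and tables as in the original method...
One caveat: if SMO decides to open additional connections internally, the session-level SET will not apply to those, so verify the behavior against your server.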
I have this class A that begins an EF transaction, where UserDb is my DbContext:
using (DbContextTransaction dbTransaction = UserDb.Database.BeginTransaction(IsolationLevel.ReadUncommitted))
Then I have several inserts, and then I need to call another library (which essentially lives on the same server, in the bin folder) to do another insert:
new ExtLibrary().CreatePoweruser(3, UserDb);
As you can see, I am passing the same context. And this statement is within the top using, which I thought would mean that everything is in the same transaction.
ExtLibrary code:
Data.Entities.User UserEntity = new Data.Entities.User
{
    UserTypeId = 34,
    CreatedDate = DateTime.Now,
    CreatedBy = "mk92Test",
};
UserDb.Users.Add(UserEntity);
UserDb.SaveChanges();
Everything works unless the ExtLibrary insert fails. Control comes back to the parent class, which has rollback code in its exception handler, and I get 'The underlying provider failed on Rollback'. But the first set of inserts certainly does get rolled back, even after this exception.
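For reference, here is the overall flow condensed into one place, as described above (names as in the snippets):
using (DbContextTransaction dbTransaction =
    UserDb.Database.BeginTransaction(IsolationLevel.ReadUncommitted))
{
    try
    {
        // ...several inserts on UserDb, each followed by SaveChanges()...
        new ExtLibrary().CreatePoweruser(3, UserDb); // also calls SaveChanges()
        dbTransaction.Commit();
    }
    catch
    {
        // This is where "The underlying provider failed on Rollback" is
        // thrown when CreatePoweruser fails, even though the earlier
        // inserts do get rolled back on the server.
        dbTransaction.Rollback();
        throw;
    }
}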
Please advise.
I'm using SMO to execute a batch SQL script. In Management Studio, the script executes in about 2 seconds. With the following code, it takes about 15 seconds.
var connectionString = GetConnectionString();
// need to use master because the DB in the connection string no longer exists
// because we dropped it already
var builder = new SqlConnectionStringBuilder(connectionString)
{
InitialCatalog = "master"
};
using (var sqlConnection = new SqlConnection(builder.ToString()))
{
var serverConnection = new ServerConnection(sqlConnection);
var server = new Server(serverConnection);
// hangs here for about 12 -15 seconds
server.ConnectionContext.ExecuteNonQuery(sql);
}
The script creates a new database and inserts a few thousand rows across a few tables. The resulting DB size is about 5MB.
Anyone have any experience with this or have a suggestion on why this might be running so slowly with SMO?
SMO does lots of weird stuff in the background, which is the price you pay for the ability to treat server/database objects in an object-oriented way.
Since you're not using the OO capabilities of SMO, why not ignore SMO completely and simply run the script through plain ADO.NET?
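For example, a rough sketch of the plain ADO.NET route; sql and builder are the variables from the question's code, and the GO split below is naive (it assumes GO never appears inside strings or comments):
// SqlCommand does not understand the "GO" batch separator (it is a
// client-tool convention, not T-SQL), so split the script into batches.
// Requires System.Text.RegularExpressions.
string[] batches = Regex.Split(sql, @"^\s*GO\s*$",
    RegexOptions.Multiline | RegexOptions.IgnoreCase);

using (var sqlConnection = new SqlConnection(builder.ToString()))
{
    sqlConnection.Open();
    foreach (string batch in batches)
    {
        if (string.IsNullOrWhiteSpace(batch))
            continue;
        using (var cmd = new SqlCommand(batch, sqlConnection))
        {
            cmd.ExecuteNonQuery();
        }
    }
}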
The best and fastest way to upload records into a database is SqlBulkCopy.
Particularly when your scripts are ~1000 records or more, this will make a significant speed improvement.
You will need to do a little work to get your data into a DataSet, but this can easily be done using the DataSet XML functions, as in the sketch below.
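A small sketch of that idea, with assumed file, connection-string, and table names; WriteXml/ReadXml round-trips the data, and SqlBulkCopy then streams the resulting DataTable to the server:
var dataSet = new DataSet();
dataSet.ReadXml("script-data.xml"); // hypothetical file written earlier with DataSet.WriteXml

using (var connection = new SqlConnection(connectionString)) // connectionString assumed
using (var bulkCopy = new SqlBulkCopy(connection))
{
    connection.Open();
    bulkCopy.DestinationTableName = "dbo.TargetTable"; // assumed destination
    bulkCopy.WriteToServer(dataSet.Tables[0]);
    // add bulkCopy.ColumnMappings entries if the XML column order does not
    // match the destination table
}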
I am trying to set up a synchronization routine in C# to send data from an MS Access database to SQL Server. MS Access is not my choice; it's just the way it is.
I am able to query the MS Access database and get an OleDbDataReader record set. I could potentially read each individual record and insert it into SQL Server, but it seems so wasteful.
Is there a better way to do this? I know I could do it in MS Access by linking to SQL Server and performing the update easily, but this is for end users and I don't want them messing with Access.
EDIT:
Just looking at SqlBulkCopy, I think that may be the answer if I can get my results into a DataRow[].
I found a solution in .NET that I am very happy with. It allows me to give access to the sync routine to any user within my program. It involves the SqlBulkCopy class.
private static void BulkCopyAccessToSQLServer
    (CommandType commandType, string sql, string destinationTable)
{
    using (DataTable dt = new DataTable())
    {
        using (OleDbConnection conn = new OleDbConnection(Settings.Default.CurriculumConnectionString))
        using (OleDbCommand cmd = new OleDbCommand(sql, conn))
        using (OleDbDataAdapter adapter = new OleDbDataAdapter(cmd))
        {
            cmd.CommandType = commandType;
            cmd.Connection.Open();
            adapter.SelectCommand.CommandTimeout = 240;
            adapter.Fill(dt);
        }
        using (SqlConnection conn2 = new SqlConnection(Settings.Default.qlsdat_extensionsConnectionString))
        {
            conn2.Open();
            using (SqlBulkCopy copy = new SqlBulkCopy(conn2))
            {
                copy.DestinationTableName = destinationTable;
                copy.BatchSize = 1000;
                copy.BulkCopyTimeout = 240;
                // NotifyAfter only has an effect if set before WriteToServer
                // (and only matters with a SqlRowsCopied event handler)
                copy.NotifyAfter = 1000;
                copy.WriteToServer(dt);
            }
        }
    }
}
Basically this puts the data from MS Access into a DataTable; it then uses the second connection, conn2, and the SqlBulkCopy class to send the data from that DataTable to SQL Server. It's probably not the best code, but it should give anyone reading this the idea.
You should harness the power of SET-based queries over RBAR (row-by-agonizing-row) efforts.
Look into an SSIS solution to synchronize the data, then schedule the package to run at regular intervals using SQL Server Agent.
You can call an SSIS package from the command line, so you can effectively do it from MS Access or from C#; see the sketch below.
Also, SQL Server, the MS Access DB, and the SSIS package do not have to be on the same machine. As long as your calling program can see the SSIS package, and the package can connect to the SQL Server and the MS Access DB, you can transfer data from one place to another.
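A hedged sketch of kicking off the package from C# by shelling out to dtexec, the SSIS command-line runner (the package path is a placeholder; requires System.Diagnostics):
var psi = new ProcessStartInfo
{
    FileName = "dtexec",
    Arguments = "/F \"C:\\Packages\\SyncAccessToSql.dtsx\"", // hypothetical package path
    UseShellExecute = false
};
using (Process process = Process.Start(psi))
{
    process.WaitForExit();
    // dtexec returns 0 on success; non-zero exit codes indicate failure
    bool succeeded = process.ExitCode == 0;
}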
It sounds like what you are doing is ETL. There are several tools that are built to do this and to me, there is little reason to reinvent the functionality. You have SQL Server, therefore you have SSIS. It has a ton of tools for automated transformations, cleanups, lookups, etc. that you can use out of the box.
Unless this is a real cut-and-dry data load and there is absolutely no scope for the complexity of the upload to increase later on (yeah, right!) I would go with a tried and tested ETL tool.
If SQL Server Integration Services isn't an option, you could write the data that you read from Access out to a temporary text file and then call bcp.exe to load it into the database; a rough sketch follows.
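The sketch below uses placeholder database, table, and server names; -S (server), -T (trusted connection), -c (character mode), and -t (field terminator) are standard bcp switches:
// Write the Access rows to a delimited temp file, then shell out to bcp.
// Requires System.Diagnostics and System.IO.
string tempFile = Path.Combine(Path.GetTempPath(), "access_export.txt");
// ...write one line per Access record to tempFile, fields separated by '|'...

var bcp = Process.Start("bcp.exe", string.Format(
    "MyDb.dbo.TargetTable in \"{0}\" -S myServer -T -c -t \"|\"", tempFile));
bcp.WaitForExit(); // exit code 0 means the load succeeded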
I have done something like this before.
I used
OleDbConnection aConnection = new OleDbConnection(String.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}", fileName));
aConnection.Open();
to open the Access DB. Then
OleDbCommand aCommand = new OleDbCommand(String.Format("select * from {0}", accessTable), aConnection);
OleDbDataReader aReader = aCommand.ExecuteReader();
to execute the read from the table. Then
int fieldCount = aReader.FieldCount;
to get the field count
while (aReader.Read())
to loop over the records, and
object[] values = new object[fieldCount];
aReader.GetValues(values);
to retrieve the values. The pieces are assembled into one method below.
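Assembled into one self-contained method, the snippets above look like this (requires System.Data.OleDb):
static void ReadAccessTable(string fileName, string accessTable)
{
    using (var aConnection = new OleDbConnection(
        String.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}", fileName)))
    {
        aConnection.Open();
        using (var aCommand = new OleDbCommand(
            String.Format("select * from {0}", accessTable), aConnection))
        using (OleDbDataReader aReader = aCommand.ExecuteReader())
        {
            int fieldCount = aReader.FieldCount;
            while (aReader.Read())
            {
                object[] values = new object[fieldCount];
                aReader.GetValues(values);
                // ...insert 'values' into SQL Server here...
            }
        }
    }
}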
There are several ways to sync, but they can run into problems when you change a field name in SQL Server, or add or delete a column. The best option would be:
Create connections for SQL Server and OleDb.
Write a custom query to fetch records from one connection and save them to the other.
Before executing, make sure your program updates all the table definitions.
In my case this helped because it brought the load on SQL Server down.
Can you not transfer the Access file to the server and delete it once the sync is complete?
You could create a Windows service for that.