I have a database with a large amount of data (millions of rows), and it is also updated with large volumes of data during the day. I keep a backup of this database for reporting, so that reporting queries do not affect the performance of the main database.
To sync the backup database with the main database, I wrote a Windows service that queries the main database and inserts the new data into the backup database. Each query fetches 5000 rows from the main database.
EDIT:
The query looks like this:
const string cmdStr = "SELECT * FROM [RLCConvertor].[dbo].[RLCDiffHeader] WHERE ID >= @Start and ID <= @End";
Here is the code:
using (var conn = new SqlConnection(_connectionString))
{
    conn.Open();
    using (var cmd = new SqlCommand(cmdStr, conn))
    {
        cmd.Parameters.AddWithValue("@Start", start);
        cmd.Parameters.AddWithValue("@End", end);
        using (SqlDataReader reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
        {
            while (reader.Read())
            {
                var rldDiffId = Convert.ToInt32(reader["ID"].ToString());
                var rlcDifHeader = new RLCDiffHeader
                {
                    Tech_head_Type = long.Parse(reader["Tech_head_Type"].ToString()),
                    ItemCode = long.Parse(reader["ItemCode"].ToString()),
                    SessionNumber = long.Parse(reader["SessionNumber"].ToString()),
                    MarketFeedCode = reader["MarketFeedCode"].ToString(),
                    MarketPlaceCode = reader["MarketPlaceCode"].ToString(),
                    FinancialMarketCode = reader["FinancialMarketCode"].ToString(),
                    CIDGrc = reader["CIDGrc"].ToString(),
                    InstrumentID = reader["InstrumentID"].ToString(),
                    CValMNE = reader["CValMNE"].ToString(),
                    DEven = reader["DEven"].ToString(),
                    HEven = reader["HEven"].ToString(),
                    MessageCodeType = reader["MessageCodeType"].ToString(),
                    SEQbyINSTandType = reader["SEQbyINSTandType"].ToString()
                };
                newRLCDiffHeaders.Add(rldDiffId, rlcDifHeader);
            }
        }
    }
}
But when I started the service, the performance of the main database got worse. Is this code inefficient? Is there a better way? I searched and found that a DataReader is supposed to be best for this case, or should I use a DataTable and SqlDataAdapter?
Please don't treat this as a definitive answer or solution to your problem.
Since this got too big for a comment, I am providing it as a suggestion.
Can you try using the concept of ad hoc distributed queries?
Using this you can query another database in the following way:
SELECT a.*
FROM OPENROWSET('SQLNCLI', 'Server=Seattle1;Trusted_Connection=yes;',
'SELECT GroupName, Name, DepartmentID
FROM AdventureWorks2012.HumanResources.Department
ORDER BY GroupName, Name') AS a;
Read more
http://technet.microsoft.com/en-us/library/ms187569.aspx
http://technet.microsoft.com/en-us/library/ms190312.aspx
Since you are using a service, the service account surely has access to read the main DB and insert into the report DB. I suggest you create a stored procedure in your report DB that reads from the main DB using OPENROWSET and inserts into the report table.
The query will be similar to this:
Insert into tbl
SELECT a.*
FROM OPENROWSET('SQLNCLI', 'Server=Seattle1;Trusted_Connection=yes;',
'SELECT GroupName, Name, DepartmentID
FROM AdventureWorks2012.HumanResources.Department
ORDER BY GroupName, Name') AS a;
From the service, you need to invoke the SP.
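A minimal sketch of invoking such an SP from the service (the procedure name usp_SyncFromMainDb and the reportDbConnectionString variable are placeholders, not names from your system):

using System.Data;
using System.Data.SqlClient;

using (var conn = new SqlConnection(reportDbConnectionString))
using (var cmd = new SqlCommand("dbo.usp_SyncFromMainDb", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    conn.Open();
    // the SP does the OPENROWSET SELECT + INSERT entirely server-side
    cmd.ExecuteNonQuery();
}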
We had a similar issue and solved it with OPENROWSET, though I can't say how much of a performance gain it will give you. I suggest you do a POC and analyze it.
Once again, please consider this a suggestion.
The application I am developing is meant to be a quick and easy tool to import data to our main app. So the user loads in a CSV, meddles with the data a little and pushes it up to the database.
Before the data is pushed to the database, I have a verification check going on which basically says, "Does a customer exist in the database with the same name, account and sort codes? If so, put their guid (which is known already) into a list."
The problem is, the result variable is always 0, despite the fact that there is duplicate test data already in my database which should produce a match. On top of that, using SQL Profiler, I can't see a query actually being executed against the database.
I'm sure that ExecuteScalar() is what I should be doing, so my attention turns to the parameters I'm adding to the SqlCommand... but I'll be blowed if I can figure it out. Any thoughts?
string connectionString = Generic.GetConnectionString("myDatabase");
using (SqlConnection conn = new SqlConnection(connectionString))
{
    conn.Open();
    SqlCommand check = new SqlCommand("select COUNT(*) from Customers C1 INNER JOIN CustomerBank B1 ON C1.Id = B1.CustomerId WHERE C1.Name = @Name AND B1.SortCode = @SortCode AND B1.AccountNumber = @AccountNumber", conn);
    foreach (DataRow row in importedData.Rows)
    {
        check.Parameters.Clear();
        check.Parameters.Add("@Name", SqlDbType.NVarChar).Value = row["Name"].ToString();
        check.Parameters.Add("@SortCode", SqlDbType.NVarChar).Value = row["SortCode"].ToString();
        check.Parameters.Add("@AccountNumber", SqlDbType.NVarChar).Value = row["AccountNumber"].ToString();
        Object result = check.ExecuteScalar();
        int count = (Int32)result;
        if (count > 0)
        {
            DuplicateData.Add((Guid)row["BureauCustomerId"]);
        }
    }
}
Clarification: importedData is a DataTable of the user's data held in this C# application. During the foreach loop, each row has various columns, a few of which are Name, SortCode and AccountNumber. The values seem to get set in the parameters (but I will verify now).
My Access database has 700 records, and every record has 50 fields. When I query it through ODBC from PHP, the query is very fast, but the same ODBC query from C# is very slow. Code below:
m_conn = new OdbcConnection("DSN=real"); // this DSN is set up through the Windows control panel, ODBC manager, system DSN
m_conn.Open();
string sqlstr = "select * from table1 where id = 1";
OdbcCommand selectCMD = new OdbcCommand(sqlstr, m_conn);
OdbcDataReader myreader = selectCMD.ExecuteReader();
if (myreader == null || !myreader.Read())
    return null;
string s = myreader["field"].ToString(); // here, execution speed is very slow. Why?
Thanks for any help.
Don't select more data for your application than you need to process.
Your statement select * from table1 where id = 1 selects all fields. If you only need the field named field, then change your select statement to select field from table1 where id = 1.
If you provide additional information on your database structure I may be able to be of more help.
There are a couple of suggestions here. I am guessing that you are running this code multiple times, which is where the slowness comes in. It is most likely caused by not disposing/closing the connections properly. Below is the code refactored with the using clause, which guarantees Dispose is called when the block exits.
Also, instead of using the asterisk, specifying the names of the columns will help Access optimize the query.
Finally, if you are only concerned with one value, consider changing the query to return only the value you are looking for, and changing the call to an ExecuteScalar() call.
// Consider specifying the fields you are concerned with
string sqlstr = "select * from table1 where id = 1";
using (var m_conn = new OdbcConnection("DSN=real")) // this DSN is set up through the Windows control panel, ODBC manager, system DSN
using (var selectCMD = new OdbcCommand(sqlstr, m_conn))
{
    m_conn.Open();
    using (var myreader = selectCMD.ExecuteReader())
    {
        if (myreader == null || !myreader.Read())
            return null;
        string s = myreader["field"].ToString();
    }
}
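For the ExecuteScalar() variant, a sketch (reusing the table and field names from the question) could look like this. ExecuteScalar returns the first column of the first row, so no reader is needed:

string sqlstr = "select field from table1 where id = 1";
using (var m_conn = new OdbcConnection("DSN=real"))
using (var selectCMD = new OdbcCommand(sqlstr, m_conn))
{
    m_conn.Open();
    // returns the first column of the first row, or null if there are no rows
    object result = selectCMD.ExecuteScalar();
    return result == null ? null : result.ToString();
}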
I don't know how to do this query in C#.
There are two databases, and each one has a table required for this query. I need to take the data from one database's table and update the other database's table with the corresponding PayrollID.
I have two tables in separate databases: Employee, which is in the techData database, and strStaff in the QLS database. In the Employee table I have StaffID, but I need to pull the PayrollID from strStaff.
Insert the PayrollID into Employee where the staffID from strStaff equals the staffID from Employee.
However, I need to get the StaffID and PayrollID from strStaff before I can do the insert query.
This is what I have got so far, but it won't work.
cn.ConnectionString = ConfigurationManager.ConnectionStrings["PayrollPlusConnectionString"].ConnectionString;
cmd.Connection = cn;
cmd.CommandText = "Select StaffId, PayrollID From [strStaff] Where (StaffID = @StaffID)";
cmd.Parameters.AddWithValue("@StaffID", staffID);
// Open the connection to the database
cn.Open();
// Execute the sql.
dr = cmd.ExecuteReader();
// Read all of the rows generated by the command (in this case only one row).
while (dr.Read())
{
    cmd.CommandText = "Insert into Employee, where StaffID = @StaffID";
}
// Close your connection to the DB.
dr.Close();
cn.Close();
Assuming you want to update data in an existing table, you have to use an UPDATE ... FROM statement (as I mentioned in a comment to the question). It might look like:
UPDATE emp SET payrollID = sta.payrollID
FROM Employee AS emp INNER JOIN strStaff AS sta ON emp.staffID = sta.staffID
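Run from C#, such a statement needs only ExecuteNonQuery, since no rows come back. A sketch, assuming both databases live on the same server so strStaff can be reached with a three-part name (adjust [QLS].[dbo].[strStaff] and the connection string name to your setup):

string connStr = ConfigurationManager.ConnectionStrings["PayrollPlusConnectionString"].ConnectionString;
using (var cn = new SqlConnection(connStr))
using (var cmd = new SqlCommand(
    "UPDATE emp SET emp.PayrollID = sta.PayrollID " +
    "FROM Employee AS emp INNER JOIN [QLS].[dbo].[strStaff] AS sta ON emp.StaffID = sta.StaffID", cn))
{
    cn.Open();
    int rowsUpdated = cmd.ExecuteNonQuery(); // number of Employee rows that received a PayrollID
}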
I have added some clarity to your question: the essential part is that you want to create a C# procedure to accomplish the task (not using SQL Server Management Studio, SSIS, bulk insert, etc.). Accordingly, there will be two different connection objects and two different SQL statements to execute on those connections.
The first task is retrieving the data from the first DB (let's call it the source DB/table) using a SELECT statement and storing it in a temporary data structure, either row by row (as in your code) or as an entire table using a .NET DataTable object, which gives a substantial performance boost. For this purpose, use the first connection object, pointing at the source DB/table (btw, you can close that connection as soon as you have the data).
The second task is inserting the data into the second DB (the target DB/table), though from your business logic it is a bit unclear how to handle possible data conflicts if records with identical IDs already exist in the target DB/table (some clarity is needed). To complete this operation, use the second connection object and the second SQL query.
A sample code snippet for the first task, which retrieves the entire result set into a .NET/C# DataTable object in a single pass, is shown below:
private static DataTable SqlReadDB(string ConnString, string SQL)
{
    DataTable _dt;
    try
    {
        using (SqlConnection _connSql = new SqlConnection(ConnString))
        {
            using (SqlCommand _commandSql = new SqlCommand(SQL, _connSql))
            {
                _commandSql.CommandType = CommandType.Text;
                _connSql.Open();
                using (SqlDataReader _dataReaderSql = _commandSql.ExecuteReader(CommandBehavior.CloseConnection))
                {
                    _dt = new DataTable();
                    _dt.Load(_dataReaderSql);
                }
            }
            return _dt;
        }
    }
    catch { return null; }
}
The second part (adding the data to the target DB/table) you should code based on the clarified business logic (i.e. conflict resolution: do you want to update existing records or skip them, etc.). Just iterate through the data rows in the DataTable object and perform either INSERT or UPDATE SQL operations, as in the sketch below.
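A sketch of that second task in its simplest form, reusing the Employee/StaffID/PayrollID names from the question (plain per-row UPDATEs, no conflict handling; SqlDbType.Int for both columns is an assumption):

private static void SqlWriteDB(string ConnString, DataTable dt)
{
    using (SqlConnection _connSql = new SqlConnection(ConnString))
    using (SqlCommand _commandSql = new SqlCommand(
        "UPDATE Employee SET PayrollID = @PayrollID WHERE StaffID = @StaffID", _connSql))
    {
        _commandSql.Parameters.Add("@StaffID", SqlDbType.Int);   // assumed column type
        _commandSql.Parameters.Add("@PayrollID", SqlDbType.Int); // assumed column type
        _connSql.Open();
        foreach (DataRow row in dt.Rows) // one statement per row from the source table
        {
            _commandSql.Parameters["@StaffID"].Value = row["StaffID"];
            _commandSql.Parameters["@PayrollID"].Value = row["PayrollID"];
            _commandSql.ExecuteNonQuery();
        }
    }
}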
Hope this may help. Kind regards,
I am working on a SQL Server monitoring product, and I have a database query that fetches details of all tables in all the databases on a SQL Server instance.
For this I have two options.
Option 1: From code, fire select name from [master].sys.sysdatabases to get the name of every database first, and then fire my main query on each DB
using "USE <fetched DB name>;" + "mainQuery";
Please check the following code for the same.
public DataTable GetResultsOfAllDB(string query)
{
    SqlConnection con = new SqlConnection(_ConnectionString);
    string localQuery = "select name from [master].sys.sysdatabases";
    DataTable dtResult = new DataTable("Result");
    SqlCommand cmdData = new SqlCommand(localQuery, con);
    cmdData.CommandTimeout = 0;
    SqlDataAdapter adapter = new SqlDataAdapter(cmdData);
    DataTable dtDataBases = new DataTable("DataBase");
    adapter.Fill(dtDataBases);
    foreach (DataRow drDB in dtDataBases.Rows)
    {
        if (dtResult.Rows.Count >= 15000)
            break;
        localQuery = " Use [" + Convert.ToString(drDB[0]) + "]; " + query;
        cmdData = new SqlCommand(localQuery, con);
        adapter = new SqlDataAdapter(cmdData);
        DataTable dtTemp = new DataTable();
        adapter.Fill(dtTemp);
        dtResult.Merge(dtTemp);
    }
    return dtResult;
}
Option 2: Use the system stored procedure EXEC sp_MSforeachdb; the fetched data is stored in a table variable and then selected from it.
Check the following query for the same:
Declare @TableDetail table
(
    field1 varchar(500),
    field2 int,
    field3 varchar(500),
    field4 varchar(500),
    field5 decimal(18,2),
    field6 decimal(18,2)
)

INSERT @TableDetail EXEC sp_MSforeachdb 'USE [?]; QUERY/COMMAND FOR ALL DATABASES'

Select field1, field2, field3, field4, field5, field6 FROM @TableDetail
Note: with the second option the query takes time, because if the number of databases and tables is huge, it waits until every database has finished.
Now my question is: which of the two options above is better, and why? Or is there another solution for the same?
Thanks in advance.
One key difference is that the second option blocks until everything is done, with all of the work happening on the SQL Server side. That has the drawback of not being able to give feedback to the user as it runs, and it can potentially time out and is not resilient to network blips. On the plus side, it can be used as a pure SQL script (some SQL admins like that), whereas the first needs a program.
In the first example, the client is doing iterative, more granular tasks where you can supply feedback to the user. You can also retry after a network blip without redoing all of the work. In the first example you can also use SqlConnectionStringBuilder to set the target database instead of concatenating USE statements.
If performance is a concern, you could also potentially parallelize the first one with some locking around adapter.Fill
Both suck - they are both serial.
Use the first, get rid of the ridiculous objects (DataSet) and use TASKS to parallelize X databases at the same time, X determined by trying out how much load the server can handle.
Finished.
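A rough sketch of that idea (GetResultForDatabase is a placeholder that opens its own connection, runs the per-database query, and returns a DataTable; the degree of parallelism of 4 is arbitrary and should be tuned):

using System.Collections.Concurrent;
using System.Data;
using System.Threading.Tasks;

var results = new ConcurrentBag<DataTable>();
Parallel.ForEach(databaseNames,
    new ParallelOptions { MaxDegreeOfParallelism = 4 }, // tune to what the server can handle
    dbName =>
    {
        // each call opens its own SqlConnection, so no locking around Fill is needed
        results.Add(GetResultForDatabase(dbName, query));
    });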
If your queries are simple enough, you can try to generate a single script instead of executing the queries in each DB one by one:
select 'DB1' as DB, Field1, Field2, ...
from [DB1]..[TableOrViewName]
union all
select 'DB2' as DB, Field1, Field2, ...
from [DB2]..[TableOrViewName]
union all
...
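If the database list is long, the script itself can be generated in C#, roughly like this (a sketch; Field1/Field2 and TableOrViewName are placeholders as above, and the database names are assumed to come from sysdatabases):

using System.Collections.Generic;
using System.Text;

var dbNames = new List<string> { "DB1", "DB2" }; // e.g. filled from sysdatabases
var sb = new StringBuilder();
for (int i = 0; i < dbNames.Count; i++)
{
    if (i > 0)
        sb.AppendLine("union all");
    // names come from sysdatabases, so bracket-quoting them is assumed safe here
    sb.AppendLine("select '" + dbNames[i] + "' as DB, Field1, Field2 from [" + dbNames[i] + "]..[TableOrViewName]");
}
string script = sb.ToString(); // run with one SqlCommand / SqlDataAdapter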
Everything looks fine. I just want to add using statements for the IDisposable objects:
public DataTable GetResultsOfAllDB(string query)
{
    using (SqlConnection con = new SqlConnection(_ConnectionString))
    {
        string localQuery = "select name from [master].sys.sysdatabases";
        DataTable dtResult = new DataTable("Result");
        using (SqlCommand cmdData = new SqlCommand(localQuery, con))
        {
            cmdData.CommandTimeout = 0;
            using (SqlDataAdapter adapter = new SqlDataAdapter(cmdData))
            using (DataTable dtDataBases = new DataTable("DataBase"))
            {
                adapter.Fill(dtDataBases);
                foreach (DataRow drDB in dtDataBases.Rows)
                {
                    if (dtResult.Rows.Count >= 15000)
                        break;
                    localQuery = " Use [" + Convert.ToString(drDB[0]) + "]; " + query;
                    // a fresh command/adapter per database, each disposed by its own using
                    using (SqlCommand cmdDb = new SqlCommand(localQuery, con))
                    using (SqlDataAdapter adapterDb = new SqlDataAdapter(cmdDb))
                    using (DataTable dtTemp = new DataTable())
                    {
                        adapterDb.Fill(dtTemp);
                        dtResult.Merge(dtTemp);
                    }
                }
                return dtResult;
            }
        }
    }
}
I am wondering: is there a way to do batch updating? I am using MS SQL Server 2005.
I saw a way with the SqlDataAdapter, but it seems you first have to run the select statement with it, then fill a dataset and make changes to the dataset.
Now I am using LINQ to SQL to do the select, so I want to try to keep it that way. However, it is too slow for massive updates. So is there a way I can keep my LINQ to SQL (for the select part) but use something different to do the mass update?
Thanks
Edit
I am interested in this staging-table way, but I am not sure how to do it, and it is still not clear to me how it will be faster, since I don't understand how the update part works.
So can anyone show me how this would work, and how to deal with concurrent connections?
Edit2
This was my latest attempt at doing a mass update using XML; however, it uses too many resources and my shared hosting does not allow it to go through. So I need a different way, which is why I am now looking into a staging table.
using (TestDataContext db = new TestDataContext())
{
    UserTable[] testRecords = new UserTable[2];
    for (int count = 0; count < 2; count++)
    {
        UserTable testRecord = new UserTable();
        if (count == 1)
        {
            testRecord.CreateDate = new DateTime(2050, 5, 10);
            testRecord.AnotherField = true;
        }
        else
        {
            testRecord.CreateDate = new DateTime(2015, 5, 10);
            testRecord.AnotherField = false;
        }
        testRecords[count] = testRecord;
    }

    StringBuilder sBuilder = new StringBuilder();
    System.IO.StringWriter sWriter = new System.IO.StringWriter(sBuilder);
    XmlSerializer serializer = new XmlSerializer(typeof(UserTable[]));
    serializer.Serialize(sWriter, testRecords);

    using (SqlConnection con = new SqlConnection(connectionString))
    {
        string sprocName = "spTEST_UpdateTEST_TEST";
        using (SqlCommand cmd = new SqlCommand(sprocName, con))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            SqlParameter param1 = new SqlParameter("@UpdatedProdData", SqlDbType.VarChar, int.MaxValue);
            param1.Value = sBuilder.Remove(0, 41).ToString(); // strip the XML declaration line
            cmd.Parameters.Add(param1);
            con.Open();
            int result = cmd.ExecuteNonQuery();
            con.Close();
        }
    }
}
@Fredrik Johansson: I am not sure what you're suggesting will work. It seems to me you want me to make an update statement for each record. I can't do that, since I will need to update 1 to 50,000+ records, and I will not know the count until that point.
Edit 3
So this is my SP now. I think it should be able to handle concurrent connections, but I wanted to make sure.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[sp_MassUpdate]
    @BatchNumber uniqueidentifier
AS
BEGIN
    update prod
    set ProductQty = 50
    from Product prod
    join StagingTbl stage on prod.ProductId = stage.ProductId
    where stage.BatchNumber = @BatchNumber

    DELETE FROM StagingTbl
    WHERE BatchNumber = @BatchNumber
END
You can use the SqlDataAdapter to do a batch update. It doesn't matter how you fill your dataset, L2SQL or whatever; you can use different methods to do the update. Just define the query to run using the data in your datatable.
The key here is the UpdateBatchSize. The data adapter will send the updates in batches of whatever size you define. You need to experiment with this value to see what number works best, but typically values of 500-1000 do best. SQL Server can then optimize the update and execute a little faster. Note that when doing batch updates, you cannot update the row source of the datatable.
I use this method to do updates of 10-100K rows, and it usually runs in under 2 minutes. It will depend on what you are updating, though.
Sorry, this is in VB...
Using da As New SqlDataAdapter
    da.UpdateCommand = conn.CreateCommand
    da.UpdateCommand.CommandTimeout = 300
    da.AcceptChangesDuringUpdate = False
    da.ContinueUpdateOnError = False
    da.UpdateBatchSize = 1000 'Experiment for best performance
    da.UpdateCommand.UpdatedRowSource = UpdateRowSource.None 'Needed if UpdateBatchSize > 1

    sql = "UPDATE YourTable"
    sql += " SET YourField = @YourField"
    sql += " WHERE ID = @ID"
    da.UpdateCommand.CommandText = sql

    da.UpdateCommand.Parameters.Clear()
    da.UpdateCommand.Parameters.Add("@YourField", SqlDbType.SmallDateTime).SourceColumn = "YourField"
    da.UpdateCommand.Parameters.Add("@ID", SqlDbType.Int).SourceColumn = "ID"

    da.Update(ds.Tables("YourTable"))
End Using
Another option is to bulk copy to a staging table, and then run a query to update the main table from it. This may be faster.
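A sketch of that approach, reusing the StagingTbl and sp_MassUpdate names from Edit 3 above (the stagingTable DataTable's columns are assumed to match the staging table's schema, and batchNumber is a Guid identifying this upload):

using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    using (var bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "StagingTbl";
        bulk.BatchSize = 5000;
        bulk.WriteToServer(stagingTable); // DataTable matching StagingTbl's schema
    }
    using (var cmd = new SqlCommand("dbo.sp_MassUpdate", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@BatchNumber", batchNumber);
        cmd.ExecuteNonQuery(); // set-based update from staging, then cleanup
    }
}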
As allonym said, use SqlBulkCopy, which is very fast (I found speed improvements of over 200x: from 1500 secs to 6 s). However, you can use the DataTable and DataRow classes to provide data to SqlBulkCopy (which seems easier). Using SqlBulkCopy this way has the added advantage of being .NET 3.0 compliant as well (LINQ was added only in 3.5).
Check out http://msdn.microsoft.com/en-us/library/ex21zs8x%28v=VS.100%29.aspx for some sample code.
Use SqlBulkCopy, which is lightning-fast. You'll need a custom IDataReader implementation that enumerates over your LINQ query results. Look at http://code.msdn.microsoft.com/LinqEntityDataReader for more info and some potentially suitable IDataReader code.
You have to work with the expression trees directly, but it's doable. In fact, it's already been done for you, you just have to download the source:
Batch Updates and Deletes with LINQ to SQL
The alternative is to just use stored procedures or ad-hoc SQL queries using the ExecuteMethodCall and ExecuteCommand methods of the DataContext.
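For the ad-hoc route, DataContext.ExecuteCommand takes a parameterized command directly; a minimal sketch (TestDataContext and the Product columns are borrowed from the question, productId is a placeholder):

using (var db = new TestDataContext())
{
    // one round trip per statement; parameters are passed positionally as {0}, {1}, ...
    int rows = db.ExecuteCommand(
        "UPDATE Product SET ProductQty = {0} WHERE ProductId = {1}", 50, productId);
}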
You can use SqlDataAdapter to do a batch update even if the datatable is filled manually/programmatically (from LINQ or any other source).
Just remember to manually set the RowState for the rows in the datatable. Use dataRow.SetModified() for this.
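A minimal sketch of that (the table, columns and linqResults source are illustrative; da is a SqlDataAdapter configured with an UpdateCommand as in the VB answer above):

var dt = new DataTable("YourTable");
dt.Columns.Add("ID", typeof(int));
dt.Columns.Add("YourField", typeof(DateTime));

foreach (var item in linqResults) // e.g. results of a LINQ to SQL query
{
    DataRow row = dt.Rows.Add(item.Id, item.YourField);
    row.AcceptChanges(); // reset RowState from Added to Unchanged first...
    row.SetModified();   // ...then flag it Modified so da.Update issues an UPDATE
}
da.Update(dt); // sends the batched UPDATE statements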