I am developing an application that processes data in parallel and reads from MySQL. Here is the piece of code where I ran into a problem:
public ConcurrentDictionary<string, Info> GetDatabaseForCurrentDay(System.DateTime day)
{
string[] date = day.ToShortDateString().Split('.');
string sqlQuery = "SELECT * FROM testtable WHERE Date ='" + date[2] + "-" + date[1] + "-" + date[0] + "';";
ConcurrentDictionary<string, Info> info = new ConcurrentDictionary<string, Info>();
Info[] dayInfo = null;
Parallel.ForEach(ReadData(ConnectionString, sqlQuery), data =>
{
int num = 2;
string[] dataPieces = data.Split(new char[] { ',' }, num);
FileHelpers.FileHelperEngine<Info> engine = new FileHelpers.FileHelperEngine<Info>();
dayInfo = engine.ReadString(dataPieces[1], int.MaxValue);
info.TryAdd(dataPieces[0], dayInfo[0]);
});
return info;
}
Apart from this fragment, the function ReadData(ConnectionString, sqlQuery) is also worth mentioning, since it supplies the sequence that Parallel.ForEach iterates over:
public IEnumerable<string> ReadData(string connectionString, string queryString)
{
using (MySqlConnection conn = new MySqlConnection(connectionString))
{
using (MySqlCommand comm = new MySqlCommand(queryString, conn))
{
conn.Open();
string command2 = "USE testdatabase;";
MySqlCommand commandUse = new MySqlCommand(command2, conn);
commandUse.ExecuteNonQuery();
comm.CommandTimeout = 0;
MySqlDataReader reader = comm.ExecuteReader();
if (reader.HasRows)
{
while (reader.Read())
{
StringBuilder sb = new StringBuilder();
sb.Append(reader.GetString(0) + ",");
sb.Append(reader.GetDateTime(1).ToString("yyyy-MM-dd") + ",");
sb.Append(reader.GetDouble(2).ToString().Replace(',', '.') + ",");
sb.Append(reader.GetDouble(3).ToString().Replace(',', '.') + ",");
sb.Append(reader.GetDouble(4).ToString().Replace(',', '.') + ",");
sb.Append(reader.GetDouble(5).ToString().Replace(',', '.') + ",");
sb.Append(reader.GetUInt64(6) + ",");
sb.Append(reader.GetDouble(7).ToString().Replace(',', '.'));
yield return sb.ToString();
}
}
}
}
}
Now, back to the problem. The code compiles and runs, but the results it returns are incorrect. I noticed that the ConcurrentDictionary contains keys with the wrong values. In a nutshell, info.TryAdd(dataPieces[0], dayInfo[0]) may insert a key from one thread together with a value from another thread, so the data may be corrupted. I understand that this behaviour is a drawback of parallel processing, but I can't drop the parallel approach. I tried different ways to fix the problem, but nothing worked and the data was still wrong. Is there a solution that keeps the execution speed of this code and preserves the data?
You need to move dayInfo into your Parallel.ForEach loop. It is a shared variable that keeps getting overwritten by each of the tasks, giving you garbage results. If you declare it inside the delegate, each iteration gets its own private variable and it will not get clobbered:
// Info[] dayInfo = null; <--Remove this
Parallel.ForEach(ReadData(ConnectionString, sqlQuery), data =>
{
int num = 2;
string[] dataPieces = data.Split(new char[] { ',' }, num);
FileHelpers.FileHelperEngine<Info> engine = new FileHelpers.FileHelperEngine<Info>();
//declare dayInfo locally within this scope instead
var dayInfo = engine.ReadString(dataPieces[1], int.MaxValue);
info.TryAdd(dataPieces[0], dayInfo[0]);
});
My problem is the one in the title. I have an allCodes array and a codes TextBox (kodTxtBox).
I split the TextBox content so that each line becomes one element, then query every element in a for loop.
When I run it, the message box shows the query result only for the last element of the allCodes array;
the others go into the else branch and show the error message box.
Some of the words in my code are Turkish:
aciklama = description
birim = unit (named monad in the code)
birimFiyat = price per unit
ürünler = products
ürünler.sipariskod = products.ordercode, etc.
I have tried many approaches, including foreach. All the variables are of type string.
allCodes = kodTxtBox.Text.Split('\n');
for (int i = 0; i < allCodes.Length; i++)
{
queryString = "SELECT ürünler.siparisKod, ürünler.aciklama, ürünler.birim, ürünler.fGrup, ürünler.birimfiyat FROM ürünler WHERE (((ürünler.siparisKod)=\"" + allCodes[i] + "\"));";
using (OleDbCommand query = new OleDbCommand(queryString))
{
query.Connection = connection;
reader = query.ExecuteReader();
if (reader.Read())
{
MessageBox.Show(allCodes[i] + " Succesful");
var desc = reader["aciklama"].ToString();
var monad = reader["birim"].ToString();
var sellPrice = reader["birimFiyat"].ToString();
MessageBox.Show("Açıklama: " + desc + " Birim: " + monad + " Satış Fiyatı: " + sellPrice);
reader.Close();
}
else
{
MessageBox.Show("Hata");
}
}
}
I solved the problem by making a single query instead of multiple queries. I saved the values returned by the query in a list, and at the end I ran the necessary for loop over the elements of the list.
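A minimal sketch of the single-query approach, under a few assumptions: the helper names SplitCodes and BuildInQuery are hypothetical, the column list is copied from the question, and OleDb's positional ? placeholders replace the string concatenation. It also splits on both \r and \n, because a multi-line TextBox typically separates lines with \r\n; splitting on '\n' alone leaves a trailing \r on every code except the last, which may be why only the last element matched.

```csharp
using System;
using System.Linq;

public static class CodeQuery
{
    // Splits the TextBox content into clean codes: handles \r\n and \n,
    // drops empty lines and trims stray whitespace.
    public static string[] SplitCodes(string text)
    {
        return text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
                   .Select(code => code.Trim())
                   .Where(code => code.Length > 0)
                   .ToArray();
    }

    // Builds one SELECT with a positional placeholder per code.
    // Table and column names are taken from the question.
    public static string BuildInQuery(int codeCount)
    {
        if (codeCount < 1)
            throw new ArgumentOutOfRangeException(nameof(codeCount));

        string placeholders = string.Join(", ", Enumerable.Repeat("?", codeCount));
        return "SELECT siparisKod, aciklama, birim, fGrup, birimfiyat " +
               "FROM ürünler WHERE siparisKod IN (" + placeholders + ")";
    }
}
```

Each code would then be bound in order, for example with query.Parameters.AddWithValue("?", code), and all rows read in a single while (reader.Read()) loop into a list.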
I am reading from a MySQL database using C#. Because I do some data processing on the data coming out of the database, I have to store it in order, so I use the Queue<> class.
I am trying to store the data in the queue in string[] arr format,
but I found that although the queue grows each time I store data, the last array of data is repeated throughout the queue, so I believe I am doing something wrong.
I believe it is clear in the attached image.
Notes:
I am calling the Queue.Enqueue(arr) method inside another async method, which I believe is a very problematic approach, as shown below:
public async void DBReadAsync(Calculations.RetStartEndDates f, DataBufferCls.DataBuffer dt)
{
int i;
Task t1 = Task.Factory.StartNew(() =>
{
DB.ReadFromDB3(f, dt);
}
) ;
await t1;
}
public string[] ReadFromDB3(RetStartEndDates WeekStrtEnd, DataBuffer dt)
{
string s;
string[] arr = new string[8];
string fromtime = WeekStrtEnd.StartTime_Str;/// "8:40:30";"10:09:00"; //
string FromDate = WeekStrtEnd.StartDateDBFormate_Str; //"2022-10-14 "; //"2022-11-06 "; //
string FromTD = FromDate + " " + fromtime;
string Totime = WeekStrtEnd.EndTime_Str; //"11:19:22"; //
string ToDate = WeekStrtEnd.EndDateDBFormate_Str; //"2022-11-07 "; //
string ToTD = ToDate + " " + Totime;
Console.WriteLine($"Start date is {FromTD}");
Console.WriteLine($"End Date is {ToTD}");
string query = "SELECT * FROM data_46";
//Open connection
if (this.IsConnect())
{
StartReadingFromDB_Event("Hold");
dt.Monitortest();
//Create Command
MySqlCommand cmd = new MySqlCommand(query, Connection);
//Create a data reader and Execute the command
MySqlDataReader dataReader = cmd.ExecuteReader();
//Read the data and store them in the list
while (dataReader.Read())
{
arr[0] = dataReader[0].ToString();
arr[1] = dataReader[1].ToString();
arr[2] = dataReader[2].ToString();
arr[3] = dataReader[3].ToString();
arr[4] = dataReader[4].ToString();
arr[5] = dataReader[5].ToString();
arr[6] = dataReader[6].ToString();
arr[7] = dataReader[7].ToString();
dt.WrToBuffer3(arr);
Thread.Sleep(10);
}
[attached image](https://i.stack.imgur.com/aAJnB.png)
I am trying to understand the reason for that.
I am reading a Word file and adding data once a line starting with FAVOUR: is found. I only want to add the data after FAVOUR is found. The first record is added perfectly with this code, but the second time it gives a syntax error (missing operator). Please help me add all records correctly.
private void button1_Click(object sender, EventArgs e)
{
try
{
Microsoft.Office.Interop.Word.Application app = new Microsoft.Office.Interop.Word.Application();
object nullobj = System.Reflection.Missing.Value;
object file = openFileDialog1.FileName;
Document doc = app.Documents.Open(@"C:\Users\juilee Raut\Downloads\ITCL-CAES 1 (1).docx");
doc.ActiveWindow.Selection.WholeStory();
doc.ActiveWindow.Selection.Copy();
IDataObject da = Clipboard.GetDataObject();
string text = da.GetData(DataFormats.Text).ToString();
richTextBox1.Text = text;
string data = string.Empty;
string[] data1 = richTextBox1.Lines;
List<string> Info = new List<string>();
int i = 0;
int j = 0;
int m = 0;
while (i < data1.Length)
{
if (data1[i].StartsWith("FAVOUR:"))
{
j++;
if (m == 0)
{
data = data + data1[i].ToString() + Environment.NewLine;
string inf = string.Join(Environment.NewLine, Info.ToArray());
con.Open();
OleDbCommand cmd = con.CreateCommand();
cmd.CommandType = CommandType.Text;
cmd.CommandText = "INSERT into AllData(FullJud) VALUES('" + inf + "')";
cmd.ExecuteNonQuery();
con.Close();
Info.Clear();
inf = string.Empty;
m = 1;
}
}
else
{
m = 0;
if (data1[i] != "")
{
if (data1[i].EndsWith("2017") && data1[i].Length == 10 || data1[i].EndsWith("2016") && data1[i].Length == 10)
{
data = data + data1[i].ToString() + Environment.NewLine + "##ln##" + Environment.NewLine;
Info.Add(data1[i]);
Info.Add("##ln##");
}
else if(data1[i].StartsWith("SECTION:") || data1[i].StartsWith("Section:") || data1[i].StartsWith("SECTION-") || data1[i].Contains("SUBJECT:") || data1[i].StartsWith("Subject:") || data1[i].StartsWith("SUBJECT-") || data1[i].StartsWith("SUBJECTS:"))
{
data = data + data1[i].ToString() + Environment.NewLine;
}
else if(data1[i].EndsWith("Respondent.") || data1[i].EndsWith("Petitioner.") || data1[i].EndsWith("Appellant.") || data1[i].EndsWith("Appellant") || data1[i].EndsWith("Respondent") || data1[i].EndsWith("Counsel,"))
{
data = data + data1[i].ToString() + Environment.NewLine;
}
else
{
data = data + data1[i].ToString() + Environment.NewLine;
Info.Add(data1[i]);
}
}
}
i++;
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
}
This is the error I get:
"Syntax error (missing operator) in query expression ''391 ITR 382 (BOM): 88 TAXMANN.COM 556\r\nHIGH COURT OF BOMBAY \r\nM.S. SANKLECHA AND A.K. MENON, JJ.\r\nMalay N. Sanghvi v/s. Income Tax Officer\r\nIT APPEAL NO. 1342 OF 2014\r\n31.01.2017\r\n##ln##\r\nSection 80-IB of the Income-tax Act, 1961 - Deductions - Profits a'."
What if Info contains several values, let's say:
Info = { "value1", "value2" }
Then, inf would be:
inf = "value1\r\nvalue2"
Therefore, cmd.CommandText would be:
cmd.CommandText = "INSERT into AllData(FullJud) VALUES('value1\r\nvalue2')";
and I'm quite sure this is not the wanted behaviour.
Edit
What if one value in Info contained a ' character?
Info = { "val'ue" }
Then, inf would be:
inf = "val'ue"
Therefore, cmd.CommandText would be:
cmd.CommandText = "INSERT into AllData(FullJud) VALUES('val'ue')";
// SQL won't understand that part --------------------------|||
and that's where you get an error.
Moreover, what if Info had the following value instead:
Info = { "value1');DROP TABLE [anytable];--" }
That's typical SQL Injection.
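The usual remedy for both the stray-quote error and the injection risk is a parameterized command, where the text travels as a value rather than as part of the SQL string. The sketch below assumes the OleDb provider from the question; the class and method names are hypothetical, and OleDbType.LongVarWChar is only a guess at a suitable type for a long text column.

```csharp
using System.Data.OleDb;

public static class JudgmentWriter
{
    // Builds an INSERT whose value is passed as a parameter, so quotes
    // and SQL fragments inside fullText are never parsed as SQL.
    public static OleDbCommand BuildInsert(OleDbConnection con, string fullText)
    {
        OleDbCommand cmd = con.CreateCommand();

        // OleDb binds parameters by position; "?" marks the slot.
        cmd.CommandText = "INSERT INTO AllData (FullJud) VALUES (?)";

        // The parameter name is informational only for OleDb.
        cmd.Parameters.Add("@FullJud", OleDbType.LongVarWChar).Value = fullText;
        return cmd;
    }
}
```

On an open connection, calling ExecuteNonQuery() on the returned command would insert "val'ue" unchanged, with no need to replace or escape the quote.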
Some questions/comments:
What is j used for?
What is the purpose of inf = string.Empty;? It is a local variable and will be garbage collected.
What is the purpose of data? Will you even use it at some point?
You are using a while loop when you could be using a for(int i=0;i<data1.Length;i++) loop.
What if data1 contains two consecutive strings starting with "FAVOUR:"? Why would you insert only the first one, and not the second one?
else
{
data.Replace("'", "/");
data1[i] = data1[i].Replace("'","/");
Info.Add(data1[i]);
data = data + data1[i].ToString() + Environment.NewLine;
}
In the else branch I just replace ' with / and my problem is solved.
I am in a bit of a pickle regarding a consolidation application we use in our company. We create a CSV file from a Progress database; this CSV file has 14 columns and no header.
The CSV file contains payments (around 173 thousand rows). Most of these rows are the same except for the amount column (the last column).
Example:
2014;MONTH;;SC;10110;;;;;;;;EUR;-6500000
2014;01;;SC;10110;;;;;;;;EUR;-1010665
2014;01;;LLC;11110;;;;;;;;EUR;-6567000
2014;01;;SC;10110;;;;;;;;EUR;-1110665
2014;01;;LLC;11110;;;;;;;;EUR;65670.00
2014;01;;SC;10110;;;;;;;;EUR;-11146.65
(around 174000 rows)
As you can see, some of these lines are the same except for the amount column. What I need is to sort all rows, add up the amounts and save one unique row instead of 1100 rows with different amounts.
My coding skills are not enough to get the job done within the available timeframe; maybe one of you can push me in the right direction.
Example code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string input = File.ReadAllText(@"c:\temp\test.txt");
string inputLine = "";
StringReader reader = new StringReader(input);
List<List<string>> data = new List<List<string>>();
while ((inputLine = reader.ReadLine()) != null)
{
if (inputLine.Trim().Length > 0)
{
string[] inputArray = inputLine.Split(new char[] { ';' });
data.Add(inputArray.ToList());
}
}
//sort data by every column
for (int sortCol = data[0].Count() - 1; sortCol >= 0; sortCol--)
{
data.OrderBy(x => x[sortCol]);
}
//delete duplicate rows
for (int rowCount = data.Count - 1; rowCount >= 1; rowCount--)
{
Boolean match = true;
for (int colCount = 0; colCount < data[rowCount].Count - 2; colCount++)
{
if(data[rowCount][colCount] != data[rowCount - 1][colCount])
{
match = false;
break;
}
}
if (match == true)
{
decimal previousValue = decimal.Parse(data[rowCount - 1][data[rowCount].Count - 1]);
decimal currentValue = decimal.Parse(data[rowCount][data[rowCount].Count - 1]);
string newStrValue = (previousValue + currentValue).ToString();
data[rowCount - 1][data[rowCount].Count - 1] = newStrValue;
data.RemoveAt(rowCount);
}
}
string output = string.Join("\r\n",data.AsEnumerable()
.Select(x => string.Join(";",x.Select(y => y).ToArray())).ToArray());
File.WriteAllText(@"c:\temp\test1.txt",output);
}
}
}
Read the CSV file line by line, and build an in-memory dictionary in which you keep the totals (and other information you require). As most of the lines belong to the same key, it will probably not cause out of memory issues. Afterwards, generate a new CSV based on the information in the dictionary.
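A minimal sketch of that dictionary approach. The assumptions are mine: the key is everything before the last ';', fields are never quoted, and amounts use '.' as the decimal separator as in the sample rows. CsvAggregator and Aggregate are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CsvAggregator
{
    // Sums the last column per unique combination of the other columns.
    // Rows come back in the order in which each key was first seen.
    public static List<string> Aggregate(IEnumerable<string> lines)
    {
        var totals = new Dictionary<string, decimal>();
        var keyOrder = new List<string>();
        const NumberStyles style =
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint;

        foreach (string rawLine in lines)
        {
            string line = rawLine.Trim();
            int cut = line.LastIndexOf(';');
            if (cut < 0) continue; // blank or malformed line

            string key = line.Substring(0, cut);
            string amountText = line.Substring(cut + 1);
            decimal amount;
            if (!decimal.TryParse(amountText, style, CultureInfo.InvariantCulture, out amount))
                continue; // skip rows whose amount cannot be parsed

            decimal running;
            if (totals.TryGetValue(key, out running))
            {
                totals[key] = running + amount;
            }
            else
            {
                totals[key] = amount;
                keyOrder.Add(key);
            }
        }

        var result = new List<string>();
        foreach (string key in keyOrder)
            result.Add(key + ";" + totals[key].ToString(CultureInfo.InvariantCulture));
        return result;
    }
}
```

With files this might be used as File.WriteAllLines(outputPath, CsvAggregator.Aggregate(File.ReadLines(inputPath))), where both paths are placeholders. File.ReadLines streams the input, so only the dictionary of unique keys is held in memory.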
As I interpret your question, you are asking how to take your input, which is in the form of
@"2014;MONTH;;SC;10110;;;;;;;;EUR;-6500000
2014;01;;SC;10110;;;;;;;;EUR;-1010665
2014;01;;LLC;11110;;;;;;;;EUR;-6567000
2014;01;;SC;10110;;;;;;;;EUR;-1110665
2014;01;;LLC;11110;;;;;;;;EUR;65670.00
2014;01;;SC;10110;;;;;;;;EUR;-11146.65"
and then get the last column and sum it up? If so, this is actually very easy to do with something like this:
public static void Main()
{
string input = @"2014;MONTH;;SC;10110;;;;;;;;EUR;-6500000
2014;01;;SC;10110;;;;;;;;EUR;-1010665
2014;01;;LLC;11110;;;;;;;;EUR;-6567000
2014;01;;SC;10110;;;;;;;;EUR;-1110665
2014;01;;LLC;11110;;;;;;;;EUR;65670.00
2014;01;;SC;10110;;;;;;;;EUR;-11146.65";
var rows = input.Split('\n');
decimal totalValue = 0m;
foreach(var row in rows)
{
var transaction = row.Substring(row.LastIndexOf(';') +1);
decimal val = 0m;
if(decimal.TryParse(transaction, out val))
totalValue += val;
}
Console.WriteLine(totalValue);
}
But maybe I have misunderstood what you were asking for?
Sorry for answering my own post so late, but this is my final solution.
I replace all " characters and write the output to a StreamWriter (going from a 25 MB to a 15 MB file). Then I copy my CSV file to the SQL Server so I can bulk insert it. After the insert I just query the table and write the result set to a new file. My new file is only about 700 KB!
The FillData() method fills a DataGridView in my application so you can review the result instead of opening the file in Excel.
I am new to C#; I am currently writing a new solution that queries the CSV file directly or in memory and writes it back to a new file.
Method1:
string line;
StreamWriter sw = new StreamWriter(insertFile);
using (StreamReader sr = new StreamReader(sourcePath))
{
while ((line = sr.ReadLine()) != null)
{
sw.WriteLine(line.Replace("\"", ""));
}
sr.Close();
sw.Close();
sr.Dispose();
sw.Dispose();
File.Copy(insertFile, @"\\SQLSERVER\C$\insert.csv");
}
Method2:
var destinationFile = @"c:\insert.csv";
var querieImportCSV = "BULK INSERT dbo.TABLE FROM '" + destinationFile + "' WITH ( FIELDTERMINATOR = ';', ROWTERMINATOR = '\n', FIRSTROW = 1)";
var truncate = @"TRUNCATE TABLE dbo.TABLE";
string queryResult =
@"SELECT [Year]
,[Month]
,[Week]
,[Entity]
,[Account]
,[C11]
,[C12]
,[C21]
,[C22]
,[C3]
,[C4]
,[CTP]
,[VALUTA]
,SUM(AMOUNT) as AMOUNT
,[CURRENCY_ORIG]
,[AMOUNTEXCH]
,[AGENTCODE]
FROM dbo.TABLE
GROUP BY YEAR, MONTH, WEEK, Entity, Account, C11, C12, C21, C22, C3, C4, CTP, VALUTA, CURRENCY_ORIG, AMOUNTEXCH, AGENTCODE
ORDER BY Account";
var conn = new SqlConnection(connectionString);
conn.Open();
SqlCommand commandTruncate = new SqlCommand(truncate, conn);
commandTruncate.ExecuteNonQuery();
SqlCommand commandInsert = new SqlCommand(querieImportCSV, conn);
SqlDataReader readerInsert = commandInsert.ExecuteReader();
readerInsert.Close();
FillData();
SqlCommand commandResult = new SqlCommand(queryResult, conn);
SqlDataReader readerResult = commandResult.ExecuteReader();
StringBuilder sb = new StringBuilder();
while (readerResult.Read())
{
sb.Append(readerResult["Year"] + ";" + readerResult["Month"] + ";" + readerResult["Week"] + ";" + readerResult["Entity"] + ";" + readerResult["Account"] + ";" +
readerResult["C11"] + ";" + readerResult["C12"] + ";" + readerResult["C21"] + ";" + readerResult["C22"] + ";" + readerResult["C3"] + ";" + readerResult["C4"] + ";" +
readerResult["CTP"] + ";" + readerResult["Valuta"] + ";" + readerResult["Amount"] + ";" + readerResult["CURRENCY_ORIG"] + ";" + readerResult["AMOUNTEXCH"] + ";" + readerResult["AGENTCODE"]);
}
sb.Replace("\"","");
StreamWriter sw = new StreamWriter(homedrive);
sw.WriteLine(sb);
readerResult.Close();
conn.Close();
sw.Close();
sw.Dispose();
My program is now still running to import data from a log file into a remote SQL Server Database. The log file is about 80MB in size and contains about 470000 lines, with about 25000 lines of data. My program can import only 300 rows/second, which is really bad. :(
public static int ImportData(string strPath)
{
//NameValueCollection collection = ConfigurationManager.AppSettings;
using (TextReader sr = new StreamReader(strPath))
{
sr.ReadLine(); //ignore three first lines of log file
sr.ReadLine();
sr.ReadLine();
string strLine;
var cn = new SqlConnection(ConnectionString);
cn.Open();
while ((strLine = sr.ReadLine()) != null)
{
{
if (strLine.Trim() != "") //if not a blank line, then import into database
{
InsertData(strLine, cn);
_count++;
}
}
}
cn.Close();
sr.Close();
return _count;
}
}
InsertData is just a normal insert method using ADO.NET. It uses a parsing method:
public Data(string strLine)
{
string[] list = strLine.Split(new[] {'\t'});
try
{
Senttime = DateTime.Parse(list[0] + " " + list[1]);
}
catch (Exception)
{
}
Clientip = list[2];
Clienthostname = list[3];
Partnername = list[4];
Serverhostname = list[5];
Serverip = list[6];
Recipientaddress = list[7];
Eventid = Convert.ToInt16(list[8]);
Msgid = list[9];
Priority = Convert.ToInt16(list[10]);
Recipientreportstatus = Convert.ToByte(list[11]);
Totalbytes = Convert.ToInt32(list[12]);
Numberrecipient = Convert.ToInt16(list[13]);
DateTime temp;
if (DateTime.TryParse(list[14], out temp))
{
OriginationTime = temp;
}
else
{
OriginationTime = null;
}
Encryption = list[15];
ServiceVersion = list[16];
LinkedMsgid = list[17];
MessageSubject = list[18];
SenderAddress = list[19];
}
InsertData method:
private static void InsertData(string strLine, SqlConnection cn)
{
var dt = new Data(strLine); //parse the log line into proper fields
const string cnnStr =
"INSERT INTO LOGDATA ([SentTime]," + "[client-ip]," +
"[Client-hostname]," + "[Partner-Name]," + "[Server-hostname]," +
"[server-IP]," + "[Recipient-Address]," + "[Event-ID]," + "[MSGID]," +
"[Priority]," + "[Recipient-Report-Status]," + "[total-bytes]," +
"[Number-Recipients]," + "[Origination-Time]," + "[Encryption]," +
"[service-Version]," + "[Linked-MSGID]," + "[Message-Subject]," +
"[Sender-Address]) " + " VALUES ( " + "@Senttime," + "@Clientip," +
"@Clienthostname," + "@Partnername," + "@Serverhostname," + "@Serverip," +
"@Recipientaddress," + "@Eventid," + "@Msgid," + "@Priority," +
"@Recipientreportstatus," + "@Totalbytes," + "@Numberrecipient," +
"@OriginationTime," + "@Encryption," + "@ServiceVersion," +
"@LinkedMsgid," + "@MessageSubject," + "@SenderAddress)";
var cmd = new SqlCommand(cnnStr, cn) {CommandType = CommandType.Text};
cmd.Parameters.AddWithValue("@Senttime", dt.Senttime);
cmd.Parameters.AddWithValue("@Clientip", dt.Clientip);
cmd.Parameters.AddWithValue("@Clienthostname", dt.Clienthostname);
cmd.Parameters.AddWithValue("@Partnername", dt.Partnername);
cmd.Parameters.AddWithValue("@Serverhostname", dt.Serverhostname);
cmd.Parameters.AddWithValue("@Serverip", dt.Serverip);
cmd.Parameters.AddWithValue("@Recipientaddress", dt.Recipientaddress);
cmd.Parameters.AddWithValue("@Eventid", dt.Eventid);
cmd.Parameters.AddWithValue("@Msgid", dt.Msgid);
cmd.Parameters.AddWithValue("@Priority", dt.Priority);
cmd.Parameters.AddWithValue("@Recipientreportstatus", dt.Recipientreportstatus);
cmd.Parameters.AddWithValue("@Totalbytes", dt.Totalbytes);
cmd.Parameters.AddWithValue("@Numberrecipient", dt.Numberrecipient);
if (dt.OriginationTime != null)
cmd.Parameters.AddWithValue("@OriginationTime", dt.OriginationTime);
else
cmd.Parameters.AddWithValue("@OriginationTime", DBNull.Value);
//if OriginationTime was null, then insert with null value to this column
cmd.Parameters.AddWithValue("@Encryption", dt.Encryption);
cmd.Parameters.AddWithValue("@ServiceVersion", dt.ServiceVersion);
cmd.Parameters.AddWithValue("@LinkedMsgid", dt.LinkedMsgid);
cmd.Parameters.AddWithValue("@MessageSubject", dt.MessageSubject);
cmd.Parameters.AddWithValue("@SenderAddress", dt.SenderAddress);
cmd.ExecuteNonQuery();
}
How can my program run faster?
Thank you so much!
Use SqlBulkCopy.
Edit: I created a minimal implementation of IDataReader and created a Batch type so that I could insert arbitrary in-memory data using SqlBulkCopy. Here is the important bit:
IDataReader dr = batch.GetDataReader();
using (SqlTransaction tx = _connection.BeginTransaction())
{
try
{
using (SqlBulkCopy sqlBulkCopy =
new SqlBulkCopy(_connection, SqlBulkCopyOptions.Default, tx))
{
sqlBulkCopy.DestinationTableName = TableName;
SetColumnMappings(sqlBulkCopy.ColumnMappings);
sqlBulkCopy.WriteToServer(dr);
tx.Commit();
}
}
catch
{
tx.Rollback();
throw;
}
}
The rest of the implementation is left as an exercise for the reader :)
Hint: the only bits of IDataReader you need to implement are Read, GetValue and FieldCount.
Hmmm, let's break this down a little bit.
In pseudocode, what you did is the following:
Open the file
Open a connection
For every line that has data:
Parse the string
Save the data in SQL Server
Close the connection
Close the file
Now the fundamental problems in doing it this way are:
You are keeping a SQL connection open while waiting for your line parsing (pretty susceptible to timeouts and stuff)
You might be saving the data line by line, each in its own transaction. We won't know until you show us what the InsertData method is doing
Consequently you are keeping the file open while waiting for SQL to finish inserting
The optimal way of doing this is to parse the file as a whole, and then insert them in bulk. You can do this with SqlBulkCopy (as suggested by Matt Howells), or with SQL Server Integration Services.
If you want to stick with ADO.NET, you can pool your INSERT statements together and pass them off as one large SqlCommand, instead of setting up one SqlCommand object per INSERT statement as you do now.
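The "parse everything first, then insert in bulk" idea can be sketched with a DataTable handed to SqlBulkCopy.WriteToServer. This is only an illustration under stated assumptions: LogBulkLoader is a hypothetical name, only three of the question's columns are shown (a real load would need every required column of LOGDATA), the batch size is arbitrary, and the field positions follow the question's parsing code. On newer .NET versions the namespace would be Microsoft.Data.SqlClient rather than System.Data.SqlClient.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class LogBulkLoader
{
    // Step 1: parse all tab-separated lines into an in-memory table.
    // Column names match the destination columns in the question.
    public static DataTable ToTable(IEnumerable<string> lines)
    {
        var table = new DataTable();
        table.Columns.Add("client-ip", typeof(string));
        table.Columns.Add("Event-ID", typeof(short));
        table.Columns.Add("total-bytes", typeof(int));

        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue; // skip blank lines
            string[] fields = line.Split('\t');
            table.Rows.Add(fields[2],
                           Convert.ToInt16(fields[8]),
                           Convert.ToInt32(fields[12]));
        }
        return table;
    }

    // Step 2: send the whole table to the server in batches.
    public static void Load(string connectionString, DataTable table)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "LOGDATA";
            bulk.BatchSize = 5000; // assumed value; tune for your server

            // Map by name so column order in the table does not matter.
            foreach (DataColumn column in table.Columns)
                bulk.ColumnMappings.Add(column.ColumnName, column.ColumnName);

            bulk.WriteToServer(table);
        }
    }
}
```

Parsing finishes before the connection is opened, so the connection is held only for the duration of the bulk write.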
You create the SqlCommand object for every row of data. The simplest improvement would therefore be to create a
private static SqlCommand cmdInsert
and declare the parameters with the Parameters.Add() method. Then for each data row, set the parameter values using
cmdInsert.Parameters["@paramXXX"].Value = valueXXX;
A second performance improvement might be to skip creation of Data objects for each row, and assign Parameter values directly from the list[] array.
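Both suggestions can be sketched together: build the command once with Parameters.Add, then only overwrite the parameter values for each row, reading straight from the split array. The class name, the two-column subset and the NVARCHAR length of 50 are assumptions for illustration; the column names and array positions come from the question.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public static class InsertCommandFactory
{
    // Called once: declares the statement and its typed parameters.
    public static SqlCommand Create(SqlConnection cn)
    {
        var cmd = new SqlCommand(
            "INSERT INTO LOGDATA ([client-ip], [Event-ID]) " +
            "VALUES (@Clientip, @Eventid)", cn);
        cmd.Parameters.Add("@Clientip", SqlDbType.NVarChar, 50); // length assumed
        cmd.Parameters.Add("@Eventid", SqlDbType.SmallInt);
        return cmd;
    }

    // Called per row: assigns values directly from the split line,
    // without creating an intermediate Data object.
    public static void Fill(SqlCommand cmd, string[] list)
    {
        cmd.Parameters["@Clientip"].Value = list[2];
        cmd.Parameters["@Eventid"].Value = Convert.ToInt16(list[8]);
    }
}
```

Inside the read loop this would become Fill(cmd, strLine.Split('\t')) followed by cmd.ExecuteNonQuery(), with Create called once before the loop.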