Is it possible with LINQ to SQL to search the entire database (obviously only the parts that are mapped in the .dbml file) for a string match? I'm trying to write a function that takes a "Search Term" string, searches all mapped entities, and returns a List(Of Object) that can contain a mixture of entities. For example, if I have a table "Foo" and a table "Bar" and search for "wibble", and there is a row in "Foo" and one in "Bar" that contain "wibble", I would like to get back a List(Of Object) containing a "Foo" object and a "Bar" object.
Is this possible?
Ask your boss the following:
"Boss, when you go to the library to find a book about widgets, do you walk up to the first shelf and start reading every book to see if it is relevant, or do you use some sort of pre-compiled index that the librarian has helpfully configured for you, ahead of time?"
If he says "Well, I would use the index" then you need a Full Text index.
If he says "Well, I would start reading every book, one by one" then you need a new job, a new boss, or both :-)
LINQ to SQL, ORMs in general, even SQL itself are a bad match for such a query. You are describing a full-text search, so you should use SQL Server's full-text search functionality. Full Text Search is available in all versions and editions since 2000, including SQL Server Express. You need to create an FTS catalog and index, and use the CONTAINS and FREETEXT predicates in your queries.
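For illustration, once the index exists you can still run the full-text query through LINQ to SQL. A minimal sketch, assuming a hypothetical generated MyDataContext and a full-text indexed Foo table:
using (var context = new MyDataContext(connectionString))
{
    // CONTAINS(*, ...) searches every full-text indexed column of Foo;
    // {0} is substituted as a SQL parameter by LINQ to SQL
    var foos = context.ExecuteQuery<Foo>(
        "SELECT * FROM Foo WHERE CONTAINS(*, {0})", "wibble").ToList();
}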
Why do you need such functionality? Unless you specifically want to FTS-enable your application, this is a ... strange ... way to access your data.
It's probably 'possible', but most databases are accessed over the web or a network, so it's a very expensive operation. It sounds like bad design.
There is also the problem of table and column names; this is probably your biggest issue. It's possible to get the column names through reflection, but I don't know about table names:
foreach (PropertyInfo property in typeof(TEntity).GetProperties())
yield return property.Name;
edit: @Ben, you're right, my mistake.
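For what it's worth, LINQ to SQL's mapping metadata can enumerate the table names without raw reflection. A minimal sketch, assuming any mapped DataContext and using System.Data.Linq.Mapping:
// List every table mapped in the .dbml together with its CLR row type
foreach (MetaTable table in context.Mapping.GetTables())
{
    Console.WriteLine(table.TableName + " -> " + table.RowType.Type.Name);
}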
This can be done but will not be pretty. There are several possible solutions.
1. Write the queries for every table yourself and execute them all in your query method.
var users = context.Users
.Where(x => x.FirstName.Contains(txt) || x.LastName.Contains(txt))
.ToList();
var products = context.Products
.Where(x => x.ProductName.Contains(txt));
var result = users.Cast<Object>().Concat(products.Cast<Object>());
2. Fetch all (relevant) tables into memory and perform the search using reflection. Less code to write, paid for with a huge performance impact.
3. Build the expression trees for the searches using reflection. This is probably the best solution, but it is also challenging to realize; see the sketch after this list.
4. Use something designed for full-text search - for example full-text search integrated into SQL Server or Apache Lucene.
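For option 3, here is a minimal sketch of a reflection-built predicate; BuildContainsPredicate is a hypothetical helper, and it assumes only string properties are searched:
using System.Linq.Expressions;
static Expression<Func<T, bool>> BuildContainsPredicate<T>(string term)
{
    // Builds x => x.Prop1.Contains(term) || x.Prop2.Contains(term) || ...
    var parameter = Expression.Parameter(typeof(T), "x");
    var contains = typeof(string).GetMethod("Contains", new[] { typeof(string) });
    Expression body = null;
    foreach (var property in typeof(T).GetProperties()
                                      .Where(p => p.PropertyType == typeof(string)))
    {
        var call = Expression.Call(Expression.Property(parameter, property),
                                   contains, Expression.Constant(term));
        body = body == null ? (Expression)call : Expression.OrElse(body, call);
    }
    // No string properties: match nothing
    return Expression.Lambda<Func<T, bool>>(body ?? Expression.Constant(false), parameter);
}
It composes with option 1, e.g. context.Users.Where(BuildContainsPredicate<User>(txt)).Cast<object>().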
Any LINQ solution will (probably) require one query per table, which imposes a non-negligible performance impact if you have many tables. Here one should look for a way to batch these queries into a single one. One of our projects using LINQ to SQL used a library for batching queries, but I don't know what its name was or exactly what it could do, because I worked mostly in the front-end team.
Possible, but from my point of view it is not recommended. Consider having 1,000K records across 100 tables: the performance will be slow. If you do it with LINQ to SQL, do it by creating a stored procedure at the database level and calling it through your entities. That will be much faster than what you are trying to achieve. =)
Late answer, but since I just had to come up with something for myself, here goes. I wrote the following to search all columns of all tables for a string match. This is related to a data forensics task that was given to me: find all occurrences of a string match in a database weighing around 24GB. At this size, you can imagine that using cursors or single-threaded queries would be rather slow, and searching the entire database would take ages. I wrote the following CLR stored procedure to do the work for me server side and return results in XML, while forcing parallelization. It is impressively fast: a database-wide search on the standard AdventureWorks2017 database completes in less than 2 seconds. Enjoy!
Example usages:
Using all available processors on the server:
EXEC [dbo].[SearchAllTables] @valueSearchTerm = 'john michael'
Limiting the server to 4 concurrent threads:
EXEC [dbo].[SearchAllTables] @valueSearchTerm = 'john michael', @maxDegreeOfParallelism = 4
Using logical operators in search terms:
EXEC [dbo].[SearchAllTables] @valueSearchTerm = '(john or michael) and not jack', @tablesSearchTerm = 'not contact'
Limiting search to table names and/or column names containing some search terms:
EXEC [dbo].[SearchAllTables] @valueSearchTerm = 'john michael', @tablesSearchTerm = 'person contact', @columnsSearchTerm = 'address name'
Limiting search results to the first row of each table where the terms are found:
EXEC [dbo].[SearchAllTables] @valueSearchTerm = 'john michael', @getOnlyFirstRowPerTable = 1
Limiting the search to the schema only automatically returns only the first row for each table:
EXEC [dbo].[SearchAllTables] @tablesSearchTerm = 'person contact'
Only return the search queries:
EXEC [dbo].[SearchAllTables] @valueSearchTerm = 'john michael', @tablesSearchTerm = 'person contact', @onlyOutputQueries = 1
Capturing results into temporary table and sorting:
CREATE TABLE #temp (Result NVARCHAR(MAX));
INSERT INTO #temp
EXEC [dbo].[SearchAllTables] @valueSearchTerm = 'john';
SELECT * FROM #temp ORDER BY Result ASC;
DROP TABLE #temp;
https://pastebin.com/RRTrt8ZN
I ended up writing this little custom Gem (finds all matching records given a search term):
namespace SqlServerMetaSearchScan
{
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Threading;
using System.Xml;
public class Program
{
#region Ignition
public static void Main(string[] args)
{
// Defaulting
SqlConnection connection = null;
try
{
// Questions
ColorConsole.Print("SQL Connection String> ");
string connectionString = Console.ReadLine();
ColorConsole.Print("Search Term (Case Ignored)> ");
string searchTerm = Console.ReadLine();
ColorConsole.Print("Skip Databases (Comma Delimited)> ");
List<string> skipDatabases = Console.ReadLine().Split(',').Where(item => item.Trim() != string.Empty).ToList();
// Search
connection = new SqlConnection(connectionString);
connection.Open();
// Each database
List<string> databases = new List<string>();
string databasesLookup = "SELECT name FROM master.dbo.sysdatabases";
SqlDataReader reader = new SqlCommand(databasesLookup, connection).ExecuteReader();
while (reader.Read())
{
// Capture
databases.Add(reader.GetValue(0).ToString());
}
// Build quintessential folder
string logsDirectory = @"E:\Logs";
if (!Directory.Exists(logsDirectory))
{
// Build
Directory.CreateDirectory(logsDirectory);
}
string baseFolder = @"E:\Logs\SqlMetaProbeResults";
if (!Directory.Exists(baseFolder))
{
// Build
Directory.CreateDirectory(baseFolder);
}
// Close reader
reader.Close();
// Sort databases
databases.Sort();
// New space
Console.WriteLine(Environment.NewLine + " Found " + databases.Count + " Database(s) to Scan" + Environment.NewLine);
// Deep scan
foreach (string databaseName in databases)
{
// Skip skip databases
if (skipDatabases.Contains(databaseName))
{
// Skip
continue;
}
// Select the database
new SqlCommand("USE " + databaseName, connection).ExecuteNonQuery();
// Table count
int tablePosition = 1;
try
{
// Defaulting
List<string> tableNames = new List<string>();
// Schema examination
DataTable table = connection.GetSchema("Tables");
// Query tables
string tablesLookup = "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES";
using (SqlDataReader databaseReader = new SqlCommand(tablesLookup, connection).ExecuteReader())
{
// Get data
while (databaseReader.Read())
{
// Push
if (databaseReader.GetValue(0).ToString().Trim() != string.Empty)
{
tableNames.Add(databaseReader.GetValue(0).ToString());
}
}
// Bail
databaseReader.Close();
}
// Sort
tableNames.Sort();
// Cycle tables
foreach (string tableName in tableNames)
{
// Build data housing
string databasePathName = @"E:\Logs\SqlMetaProbeResults\" + databaseName;
string tableDirectoryPath = @"E:\Logs\SqlMetaProbeResults\" + databaseName + @"\" + tableName;
// Count first
int totalEntityCount = 0;
int currentEntityPosition = 0;
string countQuery = "SELECT count(*) FROM " + databaseName + ".dbo." + tableName;
using (SqlDataReader entityCountReader = new SqlCommand(countQuery, connection).ExecuteReader())
{
// Query count
while (entityCountReader.Read())
{
// Capture
totalEntityCount = int.Parse(entityCountReader.GetValue(0).ToString());
}
// Close
entityCountReader.Close();
}
// Write the objects into the housing
string jsonLookupQuery = "SELECT * FROM " + databaseName + ".dbo." + tableName;
using (SqlDataReader tableReader = new SqlCommand(jsonLookupQuery, connection).ExecuteReader())
{
// Defaulting
List<string> fieldValueListing = new List<string>();
// Read continue
while (tableReader.Read())
{
// Increment
currentEntityPosition++;
// Defaulting
string identity = null;
// Gather data
for (int i = 0; i < tableReader.FieldCount; i++)
{
// Set
if (tableReader.GetName(i).ToUpper() == "ID")
{
identity = tableReader.GetValue(i).ToString();
}
else
{
// Build column data entry
string thisColumn = tableReader.GetValue(i) != null ? "'" + tableReader.GetValue(i).ToString().Trim() + "'" : string.Empty;
// Piece
fieldValueListing.Add(thisColumn);
}
}
// Path-centric
string explicitIdentity = identity ?? Guid.NewGuid().ToString().Replace("-", string.Empty).ToLower();
string filePath = tableDirectoryPath + @"\" + "Obj." + explicitIdentity + ".json";
string reStringed = JsonConvert.SerializeObject(fieldValueListing, Newtonsoft.Json.Formatting.Indented);
string percentageMark = ((double)tablePosition / (double)tableNames.Count * 100).ToString("#00.0") + "%";
string thisMarker = Guid.NewGuid().ToString().Replace("-", string.Empty).ToLower();
string entityPercentMark = string.Empty;
if (totalEntityCount != 0 && currentEntityPosition != 0)
{
// Percent mark
entityPercentMark = ((double)currentEntityPosition / (double)totalEntityCount * 100).ToString("#00.0") + "%";
}
// Search term verify
if (searchTerm.Trim() != string.Empty)
{
// Search term scenario
if (reStringed.ToLower().Trim().Contains(searchTerm.ToLower().Trim()))
{
// Lazy build
if (!Directory.Exists(tableDirectoryPath))
{
// Build
Directory.CreateDirectory(tableDirectoryPath);
}
// Has the term
string idMolding = identity == null || identity == string.Empty ? "No Identity" : identity;
File.WriteAllText(filePath, reStringed);
ColorConsole.Print(percentageMark + " => " + databaseName + "." + tableName + "." + idMolding + "." + thisMarker + " (" + entityPercentMark + ")", ConsoleColor.Green, ConsoleColor.Black, true);
}
else
{
// Show progress
string idMolding = identity == null || identity == string.Empty ? "No Identity" : identity;
ColorConsole.Print(percentageMark + " => " + databaseName + "." + tableName + "." + idMolding + "." + thisMarker + " (" + entityPercentMark + ")", ConsoleColor.Yellow, ConsoleColor.Black, true);
}
}
}
// Close
tableReader.Close();
}
// Increment
tablePosition++;
}
}
catch (Exception err)
{
ColorConsole.Print("DB.Tables!: " + err.Message, ConsoleColor.Red, ConsoleColor.White, false);
}
}
}
catch (Exception err)
{
ColorConsole.Print("KABOOM!: " + err.ToString(), ConsoleColor.Red, ConsoleColor.White, false);
}
finally
{
try { connection.Close(); }
catch { }
}
// Await
ColorConsole.Print("Done.");
Console.ReadLine();
}
#endregion
#region Cores
public static string GenerateHash(string inputString)
{
// Defaulting
string calculatedChecksum = null;
// Calculate
SHA256Managed checksumBuilder = new SHA256Managed();
string hashString = string.Empty;
byte[] hashBytes = checksumBuilder.ComputeHash(Encoding.ASCII.GetBytes(inputString));
foreach (byte theByte in hashBytes)
{
hashString += theByte.ToString("x2");
}
calculatedChecksum = hashString;
// Return
return calculatedChecksum;
}
#endregion
#region Colors
public class ColorConsole
{
#region Defaulting
public static ConsoleColor DefaultBackground = ConsoleColor.DarkBlue;
public static ConsoleColor DefaultForeground = ConsoleColor.Yellow;
public static string DefaultBackPorch = " ";
#endregion
#region Printer Cores
public static void Print(string phrase)
{
// Use primary
Print(phrase, DefaultForeground, DefaultBackground, false);
}
public static void Print(string phrase, ConsoleColor customForecolor)
{
// Use primary
Print(phrase, customForecolor, DefaultBackground, false);
}
public static void Print(string phrase, ConsoleColor customBackcolor, bool inPlace)
{
// Use primary
Print(phrase, DefaultForeground, customBackcolor, inPlace);
}
public static void Print(string phrase, ConsoleColor customForecolor, ConsoleColor customBackcolor)
{
// Use primary
Print(phrase, customForecolor, customBackcolor, false);
}
public static void Print(string phrase, ConsoleColor customForecolor, ConsoleColor customBackcolor, bool inPlace)
{
// Capture settings
ConsoleColor captureForeground = Console.ForegroundColor;
ConsoleColor captureBackground = Console.BackgroundColor;
// Change colors
Console.ForegroundColor = customForecolor;
Console.BackgroundColor = customBackcolor;
// Write
if (inPlace)
{
// From beginning of this line + padding
Console.Write("\r" + phrase + DefaultBackPorch);
}
else
{
// Normal write
Console.Write(phrase);
}
// Revert
Console.ForegroundColor = captureForeground;
Console.BackgroundColor = captureBackground;
}
#endregion
}
#endregion
}
}
Related
My problem is in the title. I have an allCodes array and a codes TextBox (kodTxtBox). I split the textbox contents into one element per line and query each element in a for loop. When I run it, only the last element of the allCodes array shows its query result in the message box; the others go into the else branch and show the error message box.
There are some Turkish words in my code, so:
aciklama = description
birim = unit
birimFiyat = price per 1 unit
ürünler = products
ürünler.siparisKod = products.orderCode, etc.
I have tried a lot of ways to do this, including foreach; all the variables are of type string.
allCodes = kodTxtBox.Text.Split('\n');
for (int i = 0; i < allCodes.Length; i++)
{
queryString = "SELECT ürünler.siparisKod, ürünler.aciklama, ürünler.birim, ürünler.fGrup, ürünler.birimfiyat FROM ürünler WHERE (((ürünler.siparisKod)=\"" + allCodes[i] + "\"));";
using (OleDbCommand query = new OleDbCommand(queryString))
{
query.Connection = connection;
reader = query.ExecuteReader();
if (reader.Read())
{
MessageBox.Show(allCodes[i] + " Succesful");
var desc = reader["aciklama"].ToString();
var monad = reader["birim"].ToString();
var sellPrice = reader["birimFiyat"].ToString();
MessageBox.Show("Açıklama: " + desc + " Birim: " + monad + " Satış Fiyatı: " + sellPrice);
reader.Close();
}
else
{
MessageBox.Show("Hata");
}
}
}
I solved the problem by making a single query instead of multiple queries. I saved the values returned by that single query into a list, and at the end I made the necessary for loop over the elements of the list.
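A minimal sketch of that single-query approach for the case above (illustrative; note that OleDb binds parameters by position, and that Trim() also removes the stray '\r' that Split('\n') leaves behind on Windows line endings):
var codes = kodTxtBox.Text.Split('\n')
    .Select(c => c.Trim())
    .Where(c => c != string.Empty)
    .ToList();
string placeholders = string.Join(", ", codes.Select(c => "?"));
string sql = "SELECT siparisKod, aciklama, birim, birimFiyat FROM ürünler " +
             "WHERE siparisKod IN (" + placeholders + ")";
var rows = new List<string[]>();
using (var query = new OleDbCommand(sql, connection))
{
    foreach (string code in codes)
        query.Parameters.AddWithValue("?", code); // positional binding
    using (var rdr = query.ExecuteReader())
    {
        while (rdr.Read())
            rows.Add(new[] { rdr["aciklama"].ToString(),
                rdr["birim"].ToString(), rdr["birimFiyat"].ToString() });
    }
}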
I have a small exec (an old one) that handles adding members to a table in the DB. If the member does not exist in the DB, it inserts a new member into the AllMember table. If the member already exists in the DB, it updates the values that differ. What exists in the code already is not updating all the members as I want, and I want to rewrite it efficiently. For every update I am fetching all of the members from the DB (6,000), and if I have an Excel file with 4,000 members that makes 24,000,000 comparisons, and it will only increase with time.
Getting all the members:
public static IEnumerable<AllMember> GetAllMembersList()
{
string connection = ConfigurationManager.ConnectionStrings["connectionString"].ToString();
using (var dataAccess = new DataAccessDataContext(connection))
{
var v = (from row in dataAccess.AllMembers
//where row.PremiumType.HasValue && row.PremiumType.Value == 100
select row);
return v.ToList();
}
//#TODO fun
}
Handle the file of new\update members
internal override void ProcessFile()
{
StringBuilder CheckMembersList = new StringBuilder();
CheckMembersList.Clear();
ErrorFounds = false;
UpdateQuery = new StringBuilder();
if (!System.IO.File.Exists(InputFile))
{
Mail.InsertNewMail("שגיאה בתהליך קליטת פרטי משתמשים ", "הקובץ " + InputFile + " לא נמצא "); // Hebrew: subject "Error in the user-details import process", body "The file <InputFile> was not found"
return;
}
CsvReader fileReader = new CsvReader(InputFile, FileEncoding, false, false);
DataTable fileContentTable = fileReader.ReadFile();
FileInfo fileInfo = new FileInfo(InputFile);
UpdateDB(fileContentTable, CheckMembersList);
WriteResponseFile(fileContentTable);
}
Updating the DB:
private void UpdateDB(DataTable inputTable, StringBuilder CheckMembersList)
{
IEnumerable<AllMember> allMembersList = Utilities.GetAllMembersList();
DBUpdateStatus updateStatus = DBUpdateStatus.NO_CHANGE;
bool x;
bool newMember;
int rowIndex=0 ;
for (int i = 1; i < inputTable.Rows.Count; i++)
{
rowIndex = i;
DataRow fileRow = inputTable.Rows[i];
newMember = true;
foreach (AllMember membersRow in allMembersList)
{
if (!(String.IsNullOrEmpty(membersRow.TZ))) /*&& (fileRow[ConstDBRow.TZ].ToString().Trim().PadLeft(9, '0') == membersRow.TZ.ToString().Trim().PadLeft(9, '0')))*/ // NB: the actual TZ comparison is commented out, so the first member with a non-empty TZ always matches
{
newMember = false;
updateStatus = UpdateMemberDetails(fileRow, membersRow);
break;
}
}
if (newMember == true)
updateStatus = InsertNewMember(fileRow);
var memberId = GetMemberId(fileRow[ConstDBRow.TZ].ToString().Trim().PadLeft(9, '0'));
if (updateStatus != DBUpdateStatus.NO_CHANGE)
QueryBuilder.InsertRequest(memberId, updateStatus);
fileRow["UPDATE_STATUS"] = Utilities.GetStatusString(updateStatus);
//append to CheckMembersList for sending members list through email
CheckMembersList.AppendLine("Row Index: " + Convert.ToString(rowIndex + 1) +", Identification number: " + (fileRow[ConstDBRow.TZ].ToString().Trim().PadLeft(9, '0')) + ", First Name: " + fileRow[ConstDBRow.FIRST_NAME].ToString().Replace("'","''") + ", Last Name: " + fileRow[ConstDBRow.LAST_NAME].ToString().Replace("'","''") + ", Update Status: " + fileRow["UPDATE_STATUS"].ToString().Replace("'", "''") + "<br/>");
}
}
How can I do this efficiently? Is Entity Framework a good option, or should I fetch the list of all members differently?
I would leave it to the DB to compare the records and insert/update using a MERGE SQL statement.
There is MERGE in SQL Server; I hope it is available on other DB servers too: https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017
As a note: are you doing an insert/update request for each of your records? Try to perform one DB call.
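A hedged sketch of the MERGE idea from C# (table and column names are illustrative; the point is to bulk-load the file rows into a staging table first and then reconcile everything in one server-side statement):
const string mergeSql = @"
MERGE AllMember AS target
USING StagedMembers AS source ON target.TZ = source.TZ
WHEN MATCHED AND (target.FirstName <> source.FirstName
               OR target.LastName <> source.LastName) THEN
    UPDATE SET FirstName = source.FirstName, LastName = source.LastName
WHEN NOT MATCHED BY TARGET THEN
    INSERT (TZ, FirstName, LastName)
    VALUES (source.TZ, source.FirstName, source.LastName);";
using (var cn = new SqlConnection(connectionString))
{
    cn.Open();
    // 1. Load the parsed file rows into StagedMembers (e.g. with SqlBulkCopy).
    // 2. One round trip then performs all inserts and updates server side.
    new SqlCommand(mergeSql, cn).ExecuteNonQuery();
}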
So I have grown tired of using the import wizard to create a table from a new tab-delimited TXT file every time I get a new text file to analyze. Most of what I am doing is rough data analysis, but I have grown tired of Access SQL, and the documents are sometimes large enough that Excel analysis won't cut it. I haven't found a great way to import data from multiple files into SQL Server quickly. I know of bulk import, but from my understanding you need a table already made to use it. My text files are ALWAYS different, so I can't just create a generic table and reuse it.
I am writing C# code that hopefully will eventually take a file path, take a copy of every text file in the path, cut each one down to the first {CR}{LF}, let me define my delimiter ('\t', ',', '|', etc.), and build a CREATE TABLE statement for every file in the path. I want to do this so I can then do a simple bulk import for each file and be done.
This is what I have so far; I'm trying to get it to work on a SINGLE tab-delimited file:
private void button4_Click(object sender, EventArgs e)
{
string pathOfFile = textBox2.Text;
string origFileText = File.ReadAllText(pathOfFile);
int intIndexofCRLF = origFileText.IndexOf(Environment.NewLine);
string strIndexfCRLF = intIndexofCRLF.ToString();
string strJustHeader = origFileText.Substring(0, intIndexofCRLF);
string[] splitarray = strJustHeader.Split('\t');
string tablename = textBox3.Text;
string SQLPart1 = "CREATE TABLE " + tablename + "( ";
string sqlbody = "";
for (int i = 0; i < splitarray.Length; i++)
{
sqlbody = sqlbody + "[" + splitarray[i] + "] " + "varchar(255), " ;
}
string SQLpart2 = sqlbody.Substring(0, sqlbody.Length - 2); // trim the trailing ", "
string SQLPart3 = ");";
MessageBox.Show(SQLPart1 + SQLPart2 + SQLPart3);
}
For some reason, my array is messing up when I do this.
My input is D:\newtext.txt
abd abc ans azd
1 2 3 4
My desired output is
CREATE TABLE newtext ( abd varchar(255), abc varchar(255), ans varchar(255), azd varchar(255));
Thanks for the help!
I would improve a few things in this code:
1. StringBuilders are more efficient than just concatenating strings; you should use them.
2. Handle both Windows ("\r\n") and Unix ("\n") line endings.
3. Read only the first line of the file from disk instead of reading the whole file into memory and then taking the first line from it.
Here is an example that contains those improvements:
string tableName = "myTable";
string delimeter = " ";
string line = null;
using (Stream stream = File.OpenRead("FilePath"))
using (StreamReader sr = new StreamReader(stream))
{
line = sr.ReadLine();
}
string fileHeader = line.Replace("\r", string.Empty).Replace("\n", string.Empty);
string[] fileHeaderSegments = fileHeader.Split(new string[] { delimeter }, StringSplitOptions.None);
StringBuilder sb = new StringBuilder(string.Format("CREATE TABLE {0} (", tableName));
for (int i = 0; i < fileHeaderSegments.Length; i++)
{
if (i != 0)
{
sb.Append(",");
}
sb.Append(fileHeaderSegments[i]);
sb.Append(" varchar(255)");
}
sb.Append(");");
Console.WriteLine(sb.ToString());
Console.ReadKey();
Struggling with a C# component. What I am trying to do is take a column in my input source that is ntext and delimited with pipes, and then write the array to a text file. When I run my component, my output looks like this:
DealerID,StockNumber,Option
161552,P1427,Microsoft.SqlServer.Dts.Pipeline.BlobColumn
I've been working with the GetBlobData method and I'm struggling with it. Any help will be greatly appreciated! Here is the full script:
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
string vehicleoptionsdelimited = Row.Options.ToString();
//string OptionBlob = Row.Options.GetBlobData(int ;
//string vehicleoptionsdelimited = System.Text.Encoding.GetEncoding(Row.Options.ColumnInfo.CodePage).GetChars(OptionBlob);
string[] option = vehicleoptionsdelimited.Split('|');
string path = @"C:\Users\User\Desktop\Local_DS_CSVs\";
string[] headerline =
{
"DealerID" + "," + "StockNumber" + "," + "Option"
};
System.IO.File.WriteAllLines(path + "OptionInput.txt", headerline);
using (System.IO.StreamWriter file = new System.IO.StreamWriter(path + "OptionInput.txt", true))
{
foreach (string s in option)
{
file.WriteLine(Row.DealerID.ToString() + "," + Row.StockNumber.ToString() + "," + s);
}
}
}
Try using
BlobToString(Row.Options)
using this function:
private string BlobToString(BlobColumn blob)
{
string result = "";
try
{
if (blob != null)
{
result = System.Text.Encoding.Unicode.GetString(blob.GetBlobData(0, Convert.ToInt32(blob.Length)));
}
}
catch (Exception ex)
{
result = ex.Message;
}
return result;
}
Adapted from:
http://mscrmtech.com/201001257/converting-microsoftsqlserverdtspipelineblobcolumn-to-string-in-ssis-using-c
Another very easy solution to this problem, because it is a total PITA, is to route the error output to a Derived Column component and cast your blob data to a STR or WSTR as a new column.
Route the output of that to your script component and the data will come in as an additional column on the pipeline ready for you to parse.
This will probably only work if your data is less than 8000 characters long.
My program is still running right now, importing data from a log file into a remote SQL Server database. The log file is about 80MB in size and contains about 470,000 lines, of which about 25,000 are lines of data. My program can import only 300 rows/second, which is really bad. :(
public static int ImportData(string strPath)
{
//NameValueCollection collection = ConfigurationManager.AppSettings;
using (TextReader sr = new StreamReader(strPath))
{
sr.ReadLine(); //ignore three first lines of log file
sr.ReadLine();
sr.ReadLine();
string strLine;
var cn = new SqlConnection(ConnectionString);
cn.Open();
while ((strLine = sr.ReadLine()) != null)
{
{
if (strLine.Trim() != "") //if not a blank line, then import into database
{
InsertData(strLine, cn);
_count++;
}
}
}
cn.Close();
sr.Close();
return _count;
}
}
InsertData is just a normal insert method using ADO.NET. It uses a parsing method:
public Data(string strLine)
{
string[] list = strLine.Split(new[] {'\t'});
try
{
Senttime = DateTime.Parse(list[0] + " " + list[1]);
}
catch (Exception)
{
}
Clientip = list[2];
Clienthostname = list[3];
Partnername = list[4];
Serverhostname = list[5];
Serverip = list[6];
Recipientaddress = list[7];
Eventid = Convert.ToInt16(list[8]);
Msgid = list[9];
Priority = Convert.ToInt16(list[10]);
Recipientreportstatus = Convert.ToByte(list[11]);
Totalbytes = Convert.ToInt32(list[12]);
Numberrecipient = Convert.ToInt16(list[13]);
DateTime temp;
if (DateTime.TryParse(list[14], out temp))
{
OriginationTime = temp;
}
else
{
OriginationTime = null;
}
Encryption = list[15];
ServiceVersion = list[16];
LinkedMsgid = list[17];
MessageSubject = list[18];
SenderAddress = list[19];
}
InsertData method:
private static void InsertData(string strLine, SqlConnection cn)
{
var dt = new Data(strLine); //parse the log line into proper fields
const string cnnStr =
"INSERT INTO LOGDATA ([SentTime]," + "[client-ip]," +
"[Client-hostname]," + "[Partner-Name]," + "[Server-hostname]," +
"[server-IP]," + "[Recipient-Address]," + "[Event-ID]," + "[MSGID]," +
"[Priority]," + "[Recipient-Report-Status]," + "[total-bytes]," +
"[Number-Recipients]," + "[Origination-Time]," + "[Encryption]," +
"[service-Version]," + "[Linked-MSGID]," + "[Message-Subject]," +
"[Sender-Address]) " + " VALUES ( " + "#Senttime," + "#Clientip," +
"#Clienthostname," + "#Partnername," + "#Serverhostname," + "#Serverip," +
"#Recipientaddress," + "#Eventid," + "#Msgid," + "#Priority," +
"#Recipientreportstatus," + "#Totalbytes," + "#Numberrecipient," +
"#OriginationTime," + "#Encryption," + "#ServiceVersion," +
"#LinkedMsgid," + "#MessageSubject," + "#SenderAddress)";
var cmd = new SqlCommand(cnnStr, cn) {CommandType = CommandType.Text};
cmd.Parameters.AddWithValue("#Senttime", dt.Senttime);
cmd.Parameters.AddWithValue("#Clientip", dt.Clientip);
cmd.Parameters.AddWithValue("#Clienthostname", dt.Clienthostname);
cmd.Parameters.AddWithValue("#Partnername", dt.Partnername);
cmd.Parameters.AddWithValue("#Serverhostname", dt.Serverhostname);
cmd.Parameters.AddWithValue("#Serverip", dt.Serverip);
cmd.Parameters.AddWithValue("#Recipientaddress", dt.Recipientaddress);
cmd.Parameters.AddWithValue("#Eventid", dt.Eventid);
cmd.Parameters.AddWithValue("#Msgid", dt.Msgid);
cmd.Parameters.AddWithValue("#Priority", dt.Priority);
cmd.Parameters.AddWithValue("#Recipientreportstatus", dt.Recipientreportstatus);
cmd.Parameters.AddWithValue("#Totalbytes", dt.Totalbytes);
cmd.Parameters.AddWithValue("#Numberrecipient", dt.Numberrecipient);
if (dt.OriginationTime != null)
cmd.Parameters.AddWithValue("#OriginationTime", dt.OriginationTime);
else
cmd.Parameters.AddWithValue("#OriginationTime", DBNull.Value);
//if OriginationTime was null, then insert with null value to this column
cmd.Parameters.AddWithValue("#Encryption", dt.Encryption);
cmd.Parameters.AddWithValue("#ServiceVersion", dt.ServiceVersion);
cmd.Parameters.AddWithValue("#LinkedMsgid", dt.LinkedMsgid);
cmd.Parameters.AddWithValue("#MessageSubject", dt.MessageSubject);
cmd.Parameters.AddWithValue("#SenderAddress", dt.SenderAddress);
cmd.ExecuteNonQuery();
}
How can my program run faster?
Thank you so much!
Use SqlBulkCopy.
Edit: I created a minimal implementation of IDataReader and created a Batch type so that I could insert arbitrary in-memory data using SqlBulkCopy. Here is the important bit:
IDataReader dr = batch.GetDataReader();
using (SqlTransaction tx = _connection.BeginTransaction())
{
try
{
using (SqlBulkCopy sqlBulkCopy =
new SqlBulkCopy(_connection, SqlBulkCopyOptions.Default, tx))
{
sqlBulkCopy.DestinationTableName = TableName;
SetColumnMappings(sqlBulkCopy.ColumnMappings);
sqlBulkCopy.WriteToServer(dr);
tx.Commit();
}
}
catch
{
tx.Rollback();
throw;
}
}
The rest of the implementation is left as an exercise for the reader :)
Hint: the only bits of IDataReader you need to implement are Read, GetValue and FieldCount.
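If hand-rolling an IDataReader feels like too much ceremony, SqlBulkCopy also accepts a DataTable. A minimal sketch (column list abbreviated; dataLines stands in for the parsed log lines):
var table = new DataTable();
table.Columns.Add("SentTime", typeof(DateTime));
table.Columns.Add("client-ip", typeof(string));
// ... one column per destination field ...
foreach (string line in dataLines)
{
    var dt = new Data(line);
    table.Rows.Add(dt.Senttime, dt.Clientip /*, ... */);
}
using (var bulk = new SqlBulkCopy(ConnectionString))
{
    bulk.DestinationTableName = "LOGDATA";
    bulk.WriteToServer(table); // one bulk round trip instead of ~25,000 INSERTs
}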
Hmmm, let's break this down a little bit.
In pseudocode, what you did is the following:
Open the file
Open a connection
For every line that has data:
Parse the string
Save the data in SQL Server
Close the connection
Close the file
Now the fundamental problems in doing it this way are:
You are keeping a SQL connection open while waiting for your line parsing (pretty susceptible to timeouts and stuff)
You might be saving the data line by line, each in its own transaction. We won't know until you show us what the InsertData method is doing
Consequently you are keeping the file open while waiting for SQL to finish inserting
The optimal way of doing this is to parse the file as a whole, and then insert them in bulk. You can do this with SqlBulkCopy (as suggested by Matt Howells), or with SQL Server Integration Services.
If you want to stick with ADO.NET, you can pool your INSERT statements together and then pass them off in one large SqlCommand, instead of setting up one SqlCommand object per INSERT statement as you do now.
You create a SqlCommand object for every row of data. The simplest improvement would therefore be to create a single
private static SqlCommand cmdInsert
and declare the parameters with the Parameters.Add() method. Then for each data row, set the parameter values using
cmdInsert.Parameters["#paramXXX"].Value = valueXXX;
A second performance improvement might be to skip the creation of a Data object for each row, and assign the parameter values directly from the list[] array.
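A minimal sketch of that reuse (parameter declarations abbreviated; the names follow the code above):
var cmdInsert = new SqlCommand(cnnStr, cn);
cmdInsert.Parameters.Add("@Senttime", SqlDbType.DateTime);
cmdInsert.Parameters.Add("@Clientip", SqlDbType.NVarChar, 50);
// ... declare the remaining parameters once ...
cmdInsert.Prepare();
while ((strLine = sr.ReadLine()) != null)
{
    string[] list = strLine.Split('\t');
    cmdInsert.Parameters["@Senttime"].Value = DateTime.Parse(list[0] + " " + list[1]);
    cmdInsert.Parameters["@Clientip"].Value = list[2];
    // ... set the rest directly from list[] ...
    cmdInsert.ExecuteNonQuery();
}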