I have a requirement to copy a file, parse its contents removing the line breaks, split the content on the pipe characters, and then store the resulting string[] in the database. My files can contain as many as 65,000 valid records each, so performance is essential.
Below is what I currently have. The problem is that it is extremely slow (3 hours to process 65,000 lines). I would appreciate any help optimizing this piece so my runs can be significantly faster.
public void ReadFileLinesIntoRows()
{
    try
    {
        using (var reader = new TextFieldParser(FileName))
        {
            reader.HasFieldsEnclosedInQuotes = false;
            reader.TextFieldType = FieldType.Delimited;
            reader.SetDelimiters("|");
            String[] currentRow;
            while (!reader.EndOfData)
            {
                try
                {
                    currentRow = reader.ReadFields();
                    int rowcount = currentRow.Length;
                    // If the row has fewer fields than we need, pad it.
                    if (rowcount < 190)
                    {
                        Array.Resize<string>(ref currentRow, 190);
                    }
                    rows.Add(currentRow);
                }
                catch (MalformedLineException mex)
                {
                    unreadlines.Add(reader.ErrorLine); // continue afterwards
                }
            }
            this.TotalRowCount = rows.Count();
        }
    }
    catch (Exception)
    {
        throw; // rethrow without resetting the stack trace
    }
}
public void cleanfilecontent(String tempfilename, Boolean? HeaderIncluded)
{
    try
    {
        // Remove the empty lines in the file
        using (var sr = new StreamReader(tempfilename))
        {
            // Write new file
            using (var sw = new StreamWriter(CleanedCopy))
            {
                using (var smove = new StreamWriter(duptempfileremove))
                {
                    string line;
                    Boolean skippedheader = false;
                    while ((line = sr.ReadLine()) != null)
                    {
                        // Look for text to remove
                        if (line.Contains("----------------------------------"))
                        {
                            smove.Write(line);
                        }
                        else if (HeaderIncluded.HasValue && HeaderIncluded.Value && !skippedheader)
                        {
                            smove.Write(line);
                            skippedheader = true;
                        }
                        else if (skippedheader)
                        {
                            // Keep lines that do not match
                            sw.WriteLine(line);
                        }
                    }
                    smove.Flush();
                }
                sw.Flush();
            }
        }
    }
    catch (Exception)
    {
        throw; // rethrow without resetting the stack trace
    }
}
65,000 records isn't all that big. If you have sufficient memory, I suggest reading the whole file into memory, doing your parsing, constructing your rows, and then using a batch insert to commit the data to the database. This will be the fastest approach. I suspect most of your 3 hours is spent inserting database records one at a time.
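For example, here is a minimal sketch of the batch-insert idea using SqlBulkCopy, assuming the target is SQL Server; the destination table name MyRecords, the column names, and the connectionString variable are all assumptions, while rows is the list your parsing code fills in:

using System.Data;
using System.Data.SqlClient;

// Copy the parsed rows into a DataTable, then push them to the server in bulk.
var table = new DataTable();
for (int i = 0; i < 190; i++)
    table.Columns.Add("Col" + i, typeof(string)); // hypothetical column names

foreach (string[] row in rows)
    table.Rows.Add(row); // each padded string[] becomes one row

using (var bulk = new SqlBulkCopy(connectionString)) // connectionString is assumed
{
    bulk.DestinationTableName = "MyRecords"; // hypothetical table name
    bulk.BatchSize = 5000; // commit in chunks rather than one row per round trip
    bulk.WriteToServer(table);
}

This sends the server a handful of bulk batches instead of 65,000 individual INSERT statements, which is typically orders of magnitude faster than row-by-row inserts.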
In this button click event I am trying to count the strings in a text file that match the text in my textboxes, then display the counts in labels. My problem is that I have no idea how to count them; I'm talking about the code inside the if-statements. I would really appreciate any help.
private void btnCalculate_Click(object sender, EventArgs e)
{
    string openFileName;
    using (OpenFileDialog ofd = new OpenFileDialog())
    {
        if (ofd.ShowDialog() != DialogResult.OK)
        {
            MessageBox.Show("You did not select OK");
            return;
        }
        openFileName = ofd.FileName;
    }
    FileStream fs = null;
    StreamReader sr = null;
    try
    {
        fs = new FileStream(openFileName, FileMode.Open, FileAccess.Read);
        fs.Seek(0, SeekOrigin.Begin);
        sr = new StreamReader(fs);
        string s = sr.ReadLine();
        while (s != null)
        {
            s = sr.ReadLine();
        }
        if (s.Contains(tbFirstClub.Text))
        {
            s.Count = lblResult1.Text; //problem is here
        }
        else if (s.Contains(tbSecondClub.Text))
        {
            s.Count = lblResult2.Text; //problem is here
        }
    }
    catch (IOException)
    {
        MessageBox.Show("Error reading file");
    }
    catch (Exception)
    {
        MessageBox.Show("Something went wrong");
    }
    finally
    {
        if (sr != null)
        {
            sr.Close();
        }
    }
}
Thanks in advance.
s.Count = lblResult1.Text; //problem is here
Wait... you are saying here:
you have a variable (s),
you access its property (Count),
and then you set it to the label text (lblResult1.Text).
Is that what you're trying to do? Because the reverse seems more likely.
Using LINQ you can count the matching lines, like below:
int numOfOccurrences = File.ReadLines(openFileName).Count(line => line.Contains(tbFirstClub.Text));
lblResult1.Text = numOfOccurrences.ToString();
Welcome to Stack Overflow.
I want to point out something you said.
else if (s.Contains(tbSecondClub.Text))
{
    s.Count = lblResult2.Text; //problem is here
}
s is the string that we just read from the file.
You're assigning s.Count (the length of the string) to the label text.
I don't think this is what you want. We want to return the number of times the specified strings show up in a specified file.
Let's refactor this (and add some tricks along the way).
// Let's create a dictionary to store all of our desired texts and their counts.
var textAndCounts = new Dictionary<string, int>();
textAndCounts.Add(tbFirstClub.Text, 0); // Assuming the type of Text is string; change accordingly
textAndCounts.Add(tbSecondClub.Text, 0);
// We added both of our text fields to the dictionary with a value of 0.

// Read all the lines from the file.
var allLines = File.ReadAllLines(openFileName); /* using System.IO */
foreach (var line in allLines)
{
    if (line.Contains(tbFirstClub.Text))
    {
        textAndCounts[tbFirstClub.Text] += 1; // Go to where we stored the count for this text and increment it
    }
    if (line.Contains(tbSecondClub.Text))
    {
        textAndCounts[tbSecondClub.Text] += 1;
    }
}
This should solve your problem, but it's still pretty brittle. Ideally, we want to design a solution that works for any number of strings and counts them all.
So how would I do it?
public Dictionary<string, int> GetCountsPerStringInFile(IEnumerable<string> textsToSearch, string filePath)
{
    // Let's use LINQ to create a dictionary, assuming all strings are unique.
    // This means: create a dictionary from the list, where the key is each value in the list and the value is 0 <Text, 0>.
    var textsAndCounts = textsToSearch.ToDictionary(text => text, count => 0);
    var allLines = File.ReadAllLines(filePath);
    foreach (var line in allLines)
    {
        // You didn't specify whether a line could contain multiple values, so let's handle that here.
        // ToList() snapshots the keys so we can safely update the counts while looping.
        var keysContained = textsAndCounts.Keys.Where(c => line.Contains(c)).ToList(); // take all the keys the line contains
        foreach (var key in keysContained)
        {
            textsAndCounts[key] += 1; // increment the count associated with that string
        }
    }
    return textsAndCounts;
}
The above code allows us to return a data structure with a count for any number of strings.
I think this is a good example to save you some headaches going forward, and it's probably a good first toe-dip into design patterns. I'd suggest looking up some material on data structures and their use cases.
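A hypothetical call site, wiring the method up to the textboxes and labels from the question (control names taken from the original code; openFileName is assumed to hold the chosen file path):

var counts = GetCountsPerStringInFile(new[] { tbFirstClub.Text, tbSecondClub.Text }, openFileName);
lblResult1.Text = counts[tbFirstClub.Text].ToString();
lblResult2.Text = counts[tbSecondClub.Text].ToString();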
I have a web application where I parse a csv file that can have over 200k records in it. I parse each line for information, verify that the key does not exist in the database, and then add it to the context. When the count reaches 10,000 records it calls the SaveChanges routine. The problem is that there can be duplicates in the context and it errors out. This is running on an Azure VM communicating with an Azure SQL server.
Two questions: how do I handle the duplicate issue, and is there any way I can improve the speed, as it takes several hours to run?
using (LoanFileEntities db = new LoanFileEntities())
{
    db.Configuration.AutoDetectChangesEnabled = false; // 1. this is a huge time saver
    db.Configuration.ValidateOnSaveEnabled = false; // 2. this can also save time
    while (parser.Read())
    {
        counter++;
        int loan_code = 0;
        string loan_code_string = parser["LoanId"];
        string dateToParse = parser["PullDate"].Trim();
        DateTime date_pulled;
        try
        {
            date_pulled = DateTime.Parse(dateToParse, CultureInfo.InvariantCulture);
        }
        catch (Exception)
        {
            throw new Exception("No Pull Date for line " + counter);
        }
        string originationdate = parser["OriginationDate"].Trim();
        DateTime date_originated;
        try
        {
            date_originated = DateTime.Parse(originationdate, CultureInfo.InvariantCulture);
        }
        catch (Exception)
        {
            throw new Exception("No Origination Date for line " + counter);
        }
        dateToParse = parser["DueDate"].Trim();
        DateTime date_due;
        try
        {
            date_due = DateTime.Parse(dateToParse, CultureInfo.InvariantCulture);
        }
        catch (Exception)
        {
            throw new Exception("No Due Date for line " + counter);
        }
        string region = parser["Region"].Trim();
        string source = parser["Channel"].Trim();
        string password = parser["FilePass"].Trim();
        decimal principalAmt = Convert.ToDecimal(parser["Principal"].Trim());
        decimal totalDue = Convert.ToDecimal(parser["TotalDue"].Trim());
        string vitaLoanId = parser["VitaLoanId"];
        var toAdd =
            db.dfc_LoanRecords.Any(
                x => x.loan_code_string == loan_code_string);
        if (!toAdd)
        {
            dfc_LoanRecords loan = new dfc_LoanRecords();
            loan.loan_code = loan_code;
            loan.loan_code_string = loan_code_string;
            loan.loan_principal_amt = principalAmt;
            loan.loan_due_date = date_due;
            loan.date_pulled = date_pulled;
            loan.date_originated = date_originated;
            loan.region = region;
            loan.source = source;
            loan.password = password;
            loan.loan_amt_due = totalDue;
            loan.vitaLoanId = vitaLoanId;
            loan.load_file = fileName;
            loan.load_date = DateTime.Now;
            switch (loan.region)
            {
                case "UK":
                    if (location.Equals("UK"))
                    {
                        //db.dfc_LoanRecords.Add(loan);
                        if (loan.source == "Online")
                        {
                            counter_new_uk_online++;
                        }
                        else
                        {
                            counter_new_uk_retail++;
                        }
                    }
                    break;
                case "US":
                    if (location.Equals("US"))
                    {
                        db.dfc_LoanRecords.Add(loan);
                        if (loan.source == "Online")
                        {
                            counter_new_us_online++;
                        }
                        else
                        {
                            counter_new_us_retail++;
                        }
                    }
                    break;
                case "Canada":
                    if (location.Equals("US"))
                    {
                        db.dfc_LoanRecords.Add(loan);
                        if (loan.source == "Online")
                        {
                            counter_new_cn_online++;
                        }
                        else
                        {
                            counter_new_cn_retail++;
                        }
                    }
                    break;
            }
            // delay save to speed up load. 3. also saves transactional time
            if (counter % 10000 == 0)
            {
                db.SaveChanges();
            }
        }
    } // end of parser read
    db.SaveChanges();
}
}
}
I would suggest removing duplicates in the code before sending it over to .SaveChanges().
Instead of going into detail about duplicate removal, I've put together this list of links to existing questions and answers on StackOverflow that may help:
Delete duplicates using Lambda
Using DISTINCT on a subquery to remove duplicates in Entity Framework
Using LINQ to find / delete duplicates
Hope that helps!
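As a minimal sketch of the de-duplication idea (variable names assumed from the question's code): load the existing keys once, then track every key seen during the run in a HashSet, so the context never receives the same loan_code_string twice. This also replaces the per-row Any() query, which is likely where most of the several hours go.

// One query up front instead of one db.dfc_LoanRecords.Any(...) per row.
var seenKeys = new HashSet<string>(db.dfc_LoanRecords.Select(x => x.loan_code_string));
while (parser.Read())
{
    counter++;
    string loan_code_string = parser["LoanId"];
    if (!seenKeys.Add(loan_code_string))
        continue; // duplicate within the file or already in the database; skip it
    // ... build the dfc_LoanRecords entity and db.dfc_LoanRecords.Add(loan) as before ...
    if (counter % 10000 == 0)
        db.SaveChanges();
}
db.SaveChanges();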
I do not want to read the whole file at any point. I know there are answers to that question; I want to read only the first or last line.
I know that my code locks the file it's reading, for two reasons: 1) the application that writes to the file crashes intermittently when I run my little app with this code, but never crashes when I am not running it, and 2) there are a few articles that will tell you that File.ReadLines locks the file.
There are some similar questions, but the answers seem to involve reading the whole file, which is slow for large files and therefore not what I want to do. My requirement to read only the last line most of the time is also different from what I have read about.
I need to know how to read the first line (header row) and the last line (latest row). I do not want to read all lines at any point in my code because this file can become huge and reading the entire file would become slow.
I know that
line = File.ReadLines(fullFilename).First().Replace("\"", "");
... is the same as ...
FileStream fs = new FileStream(@fullFilename, FileMode.Open, FileAccess.Read, FileShare.Read);
My question is: how can I repeatedly read the first and last lines of a file which may be being written to by another application, without locking it in any way? I have no control over the application that is writing to the file. It is a data log which can be appended to at any time. The reason I am listening in this way is that this log can be appended to for days on end. I want to see the latest data in this log in my own C# programme without waiting for the log to finish being written.
My code to call the reading / listening function ...
// Start listening to the "data log"
private void btnDeconstructCSVFile_Click(object sender, EventArgs e)
{
    MySandbox.CopyCSVDataFromLogFile copyCSVDataFromLogFile = new MySandbox.CopyCSVDataFromLogFile();
    copyCSVDataFromLogFile.checkForLogData();
}
My class which does the listening. For now it simply adds the data to 2 generic lists ...
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using MySandbox.Classes;
using System.IO;

namespace MySandbox
{
    public class CopyCSVDataFromLogFile
    {
        static private List<LogRowData> listMSDataRows = new List<LogRowData>();
        static String fullFilename = string.Empty;
        static LogRowData previousLineLogRowList = new LogRowData();
        static LogRowData logRowList = new LogRowData();
        static LogRowData logHeaderRowList = new LogRowData();
        static Boolean checking = false;

        public void checkForLogData()
        {
            // Initialise
            string[] logHeaderArray = new string[] { };
            string[] badDataRowsArray = new string[] { };

            // Get the latest full filename (file with new data).
            // Assumption: only 1 file is written to at a time in this directory.
            String directory = "C:\\TestDir\\";
            string pattern = "*.csv";
            var dirInfo = new DirectoryInfo(directory);
            var file = (from f in dirInfo.GetFiles(pattern) orderby f.LastWriteTime descending select f).First();
            fullFilename = directory + file.ToString(); // This is the full file path and name of the latest file in the directory!

            if (logHeaderArray.Length == 0)
            {
                // Populate the header row
                logHeaderRowList = getRow(fullFilename, true);
            }

            LogRowData tempLogRowList = new LogRowData();
            if (!checking)
            {
                // Read the latest data in an asynchronous loop
                callDataProcess();
            }
        }

        private async void callDataProcess()
        {
            checking = true; // Begin checking
            await checkForNewDataAndSaveIfFound();
        }

        private static Task checkForNewDataAndSaveIfFound()
        {
            return Task.Run(() => // Call the async "Task"
            {
                while (checking) // Loop (asynchronously)
                {
                    LogRowData tempLogRowList = new LogRowData();
                    if (logHeaderRowList.ValueList.Count == 0)
                    {
                        // Populate the header row
                        logHeaderRowList = getRow(fullFilename, true);
                    }
                    else
                    {
                        // Populate the data row
                        tempLogRowList = getRow(fullFilename, false);
                        if ((!Enumerable.SequenceEqual(tempLogRowList.ValueList, previousLineLogRowList.ValueList)) &&
                            (!Enumerable.SequenceEqual(tempLogRowList.ValueList, logHeaderRowList.ValueList)))
                        {
                            logRowList = getRow(fullFilename, false);
                            listMSDataRows.Add(logRowList);
                            previousLineLogRowList = logRowList;
                        }
                    }
                    //System.Threading.Thread.Sleep(10); // Wait for next row.
                }
            });
        }

        private static LogRowData getRow(string fullFilename, bool isHeader)
        {
            string line;
            string[] logDataArray = new string[] { };
            LogRowData logRowListResult = new LogRowData();
            try
            {
                if (isHeader)
                {
                    // Assign first (header) row data.
                    // Works but seems to block writing to the file!!!
                    line = File.ReadLines(fullFilename).First().Replace("\"", "");
                }
                else
                {
                    // Assign data as last row (default behaviour).
                    line = File.ReadLines(fullFilename).Last().Replace("\"", "");
                }
                logDataArray = line.Split(',');

                // Copy the array to a generic list, dropping the last value if it's empty.
                for (int i = 0; i < logDataArray.Length; i++)
                {
                    if (i < logDataArray.Length - 1)
                    {
                        // Value is not at the end; from observation these always have a value (even if it's zero), so store it.
                        logRowListResult.ValueList.Add(logDataArray[i]);
                    }
                    else
                    {
                        // This is the last value
                        if (logDataArray[i].Replace("\"", "").Trim().Length > 0)
                        {
                            // In this case the last value is not empty; store it as normal.
                            logRowListResult.ValueList.Add(logDataArray[i]);
                        }
                        else { /* The last value is empty, e.g. "123,456,"; the final comma denotes another field but it is empty, so ignore it. */ }
                    }
                }
            }
            catch (Exception ex)
            {
                if (ex.Message == "Sequence contains no elements")
                { /* Empty file, no problem. The code will safely loop and pick up the header when it appears. */ }
                else
                {
                    //TODO: catch this error properly
                    Int32 problemID = 10; // Unknown ERROR.
                }
            }
            return logRowListResult;
        }
    }
}
I found the answer in a combination of other questions: one answer explaining how to read from the end of a file, which I adapted so that it reads only 1 line from the end, and another explaining how to read the entire file without locking it (I did not want to read the entire file, but the not-locking part was useful). So now you can read the last line of the file (if it contains end-of-line characters) without locking it. For other end-of-line delimiters, just replace my 10 and 13 with your end-of-line character bytes...
Add the method below to the public class CopyCSVDataFromLogFile:
private static string Reverse(string str)
{
    char[] arr = new char[str.Length];
    for (int i = 0; i < str.Length; i++)
        arr[i] = str[str.Length - 1 - i];
    return new string(arr);
}
and replace this line ...
line = File.ReadLines(fullFilename).Last().Replace("\"", "");
with this code block ...
Int32 endOfLineCharacterCount = 0;
Int32 previousCharByte = 0;
Int32 currentCharByte = 0;
// Read the file from the end, for 1 line, while allowing other programmes to access it for read and write!
using (FileStream reader = new FileStream(fullFilename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 0x1000, FileOptions.SequentialScan))
{
    int i = 0;
    StringBuilder lineBuffer = new StringBuilder();
    int byteRead;
    while ((-i < reader.Length) /* Belt and braces: if there were no end-of-line characters, reading beyond the start of the file would give a catastrophic error here (to be avoided thus). */
        && (endOfLineCharacterCount < 2) /* Exit condition */)
    {
        reader.Seek(--i, SeekOrigin.End);
        byteRead = reader.ReadByte();
        currentCharByte = byteRead;
        // Exit condition: the first 2 characters we read (reading backwards, remember) were the end-of-line bytes (LF then CR).
        // So when we read the second end of line, we have read 1 whole line (the last line in the file)
        // and we must exit now.
        if (currentCharByte == 13 && previousCharByte == 10)
        {
            endOfLineCharacterCount++;
        }
        if (byteRead == 10 && lineBuffer.Length > 0)
        {
            line += Reverse(lineBuffer.ToString());
            lineBuffer.Remove(0, lineBuffer.Length);
        }
        lineBuffer.Append((char)byteRead);
        previousCharByte = byteRead;
    }
}
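Note that the header read in getRow still goes through File.ReadLines, which is exactly the call the question says locks the file. A minimal sketch of the same non-locking approach for the first line, reusing the fullFilename field from the class above:

// Read only the first line without taking a write lock on the log file.
string headerLine;
using (var fs = new FileStream(fullFilename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (var sr = new StreamReader(fs))
{
    headerLine = (sr.ReadLine() ?? string.Empty).Replace("\"", "");
}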
I am running a loop in C# that reads a file and makes updates to a MySQL database with the MySQL ODBC 5.1 driver in a Windows 8 64-bit environment.
The operation is simple:
Count +1
See if file exists
Load XML file(XDocument)
Fetch data from XDocument
Open ODBCConnection
Run a couple of Stored Procedures against the MySQL database to store data
Close ODBCConnection
The problem is that after a while it will hang, for example on an OdbcCommand.ExecuteNonQuery. It is not always the same stored procedure that it hangs on.
This is a real problem: I need to loop over 60,000 files, but it only lasts around 1,000 at a time.
Edit 1:
The problem seems to occur here every time:
public bool addPublisherToGame(int inPublisherId, int inGameId)
{
    string sqlStr;
    OdbcCommand commandObj;
    try
    {
        sqlStr = "INSERT INTO games_publisher_binder (gameId, publisherId) VALUE(?,?)";
        commandObj = new OdbcCommand(sqlStr, mainConnection);
        commandObj.Parameters.Add("@gameId", OdbcType.Int).Value = inGameId;
        commandObj.Parameters.Add("@publisherId", OdbcType.Int).Value = inPublisherId;
        if (Convert.ToInt32(executeNonQuery(commandObj)) > 0)
            return true;
        else
            return false;
    }
    catch (Exception ex)
    {
        throw (loggErrorMessage(this.ToString(), "addPublisherToGame", ex, -1, "", ""));
    }
    finally
    {
    }
}

protected object executeNonQuery(OdbcCommand inCommandObj)
{
    try
    {
        //FileStream file = new FileStream("d:\\test.txt", FileMode.Append, FileAccess.Write);
        //System.IO.StreamWriter stream = new System.IO.StreamWriter(file);
        //stream.WriteLine(DateTime.Now.ToString() + " - " + inCommandObj.CommandText);
        //stream.Close();
        //file.Close();
        //mainConnection.Open();
        return inCommandObj.ExecuteNonQuery();
    }
    catch (Exception)
    {
        throw; // rethrow without resetting the stack trace
    }
}
I can see that the in parameters are correct.
The open and close of the connection is done in a top-level method for every loop (with finally).
Edit 2:
This is the method that extracts the information and saves it to the database:
public Boolean addBoardgameToDatabase(XElement boardgame, GameFactory gameFactory)
{
    int incomingGameId = -1;
    XElement tmpElement;
    string primaryName = string.Empty;
    List<string> names = new List<string>();
    GameStorage externalGameStorage;
    int retry = 3;
    try
    {
        if (boardgame.FirstAttribute != null &&
            boardgame.FirstAttribute.Value != null)
        {
            while (retry > -1)
            {
                try
                {
                    incomingGameId = int.Parse(boardgame.FirstAttribute.Value);

                    #region Find primary name
                    tmpElement = boardgame.Elements("name").Where(c => c.Attribute("primary") != null).FirstOrDefault(a => a.Attribute("primary").Value.Equals("true"));
                    if (tmpElement != null)
                        primaryName = tmpElement.Value;
                    else
                        return false;
                    #endregion

                    externalGameStorage = new GameStorage(incomingGameId,
                        primaryName,
                        string.Empty,
                        getDateTime("1/1/" + boardgame.Element("yearpublished").Value),
                        getInteger(boardgame.Element("minplayers").Value),
                        getInteger(boardgame.Element("maxplayers").Value),
                        boardgame.Element("playingtime").Value,
                        0, 0, false);

                    gameFactory.updateGame(externalGameStorage);
                    gameFactory.updateGameGrade(incomingGameId);
                    gameFactory.removeDesignersFromGame(externalGameStorage.id);
                    foreach (XElement designer in boardgame.Elements("boardgamedesigner"))
                    {
                        gameFactory.updateDesigner(int.Parse(designer.FirstAttribute.Value), designer.Value);
                        gameFactory.addDesignerToGame(int.Parse(designer.FirstAttribute.Value), externalGameStorage.id);
                    }
                    gameFactory.removePublishersFromGame(externalGameStorage.id);
                    foreach (XElement publisher in boardgame.Elements("boardgamepublisher"))
                    {
                        gameFactory.updatePublisher(int.Parse(publisher.FirstAttribute.Value), publisher.Value, string.Empty);
                        gameFactory.addPublisherToGame(int.Parse(publisher.FirstAttribute.Value), externalGameStorage.id);
                    }
                    foreach (XElement element in boardgame.Elements("name").Where(c => c.Attribute("primary") == null))
                        names.Add(element.Value);
                    gameFactory.removeGameNames(incomingGameId);
                    foreach (string name in names)
                        if (name != null && name.Length > 0)
                            gameFactory.addGameName(incomingGameId, name);
                    return true;
                }
                catch (Exception)
                {
                    retry--;
                    if (retry < 0)
                        return false;
                }
            }
        }
        return false;
    }
    catch (Exception ex)
    {
        throw (new Exception(this.ToString() + ".addBoardgameToDatabase : " + ex.Message, ex));
    }
}
And then we go one step higher, to the method that triggers addBoardgameToDatabase:
private void StartThreadToHandleXmlFile(int gameId)
{
    FileInfo fileInfo;
    XDocument xmlDoc;
    Boolean gameAdded = false;
    GameFactory gameFactory = new GameFactory();
    try
    {
        fileInfo = new FileInfo(_directory + "\\" + gameId.ToString() + ".xml");
        if (fileInfo.Exists)
        {
            xmlDoc = XDocument.Load(fileInfo.FullName);
            if (addBoardgameToDatabase(xmlDoc.Element("boardgames").Element("boardgame"), gameFactory))
            {
                gameAdded = true;
                fileInfo.Delete();
            }
            else
                return;
        }
        if (!gameAdded)
        {
            gameFactory.InactivateGame(gameId);
            fileInfo.Delete();
        }
    }
    catch (Exception)
    { throw; }
    finally
    {
        if (gameFactory != null)
            gameFactory.CloseConnection();
    }
}
And then finally the top level :
public void UpdateGames(string directory)
{
    DirectoryInfo dirInfo;
    FileInfo fileInfo;
    Thread thread;
    int gameIdToStartOn = 1;
    dirInfo = new DirectoryInfo(directory);
    if (dirInfo.Exists)
    {
        _directory = directory;
        fileInfo = dirInfo.GetFiles("*.xml").OrderBy(c => int.Parse(c.Name.Replace(".xml", ""))).FirstOrDefault();
        gameIdToStartOn = int.Parse(fileInfo.Name.Replace(".xml", ""));
        for (int gameId = gameIdToStartOn; gameId < 500000; gameId++)
        {
            try
            { StartThreadToHandleXmlFile(gameId); }
            catch (Exception) { }
        }
    }
}
Use SQL connection pooling by adding "Pooling=true" to your connection string.
Make sure you properly close the connection AND the file.
You can create one large query and execute it only once; I think that would be a lot faster than 60,000 separate queries! A sketch of the idea follows below.
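A minimal sketch of that idea, assuming a hypothetical batch list of (gameId, publisherId) pairs and the same mainConnection as in the question (requires using System.Text for StringBuilder): build one multi-row INSERT instead of one round trip per row, and dispose the command when done.

// batch is a hypothetical List<Tuple<int, int>> of (gameId, publisherId) pairs.
var sql = new StringBuilder("INSERT INTO games_publisher_binder (gameId, publisherId) VALUES ");
using (var cmd = new OdbcCommand())
{
    cmd.Connection = mainConnection;
    for (int i = 0; i < batch.Count; i++)
    {
        if (i > 0) sql.Append(", ");
        sql.Append("(?, ?)");
        cmd.Parameters.Add("@g" + i, OdbcType.Int).Value = batch[i].Item1;
        cmd.Parameters.Add("@p" + i, OdbcType.Int).Value = batch[i].Item2;
    }
    cmd.CommandText = sql.ToString();
    cmd.ExecuteNonQuery(); // one round trip for the whole batch
}

The using block matters on its own: the original code never disposes its OdbcCommand objects, and leaked statement handles are one plausible explanation for a hang after roughly 1,000 files.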
Can you show a bit of your code?
I have a netbook with 1.20Ghz Processor & 1GB Ram.
I'm running a C# WinForms app on it which, at 5 minute intervals, reads every line of a text file and, depending on the content of that line, either skips it or writes it to an xml file. Sometimes it may be processing about 2000 lines.
When it begins this task, the processor gets maxed out at 100% use. However, on my desktop with a 2.40GHz processor and 3GB RAM it's untouched (for obvious reasons)... is there any way I can reduce this processor load dramatically? The code isn't complex, I'm not bad at coding either, and I'm not constantly opening the file, reading and writing... it's all done in one fell swoop.
Any help greatly appreciated!
Sample Code
The timer:
#region Timers Setup
aTimer.Tick += new EventHandler(OnTimedEvent);
aTimer.Interval = 60000;
aTimer.Enabled = true;
aTimer.Start();
radioButton60Mins.Checked = true;
#endregion Timers Setup

private void OnTimedEvent(object source, EventArgs e)
{
    string msgLoggerMessage = "Checking For New Messages " + DateTime.Now;
    listBoxActivityLog.Items.Add(msgLoggerMessage);
    MessageLogger messageLogger = new MessageLogger();
    messageLogger.LogMessage(msgLoggerMessage);
    if (radioButton1Min.Checked)
    {
        aTimer.Interval = 60000;
    }
    if (radioButton60Mins.Checked)
    {
        aTimer.Interval = 3600000;
    }
    if (radioButton5Mins.Checked)
    {
        aTimer.Interval = 300000;
    }
    // split the file into a list of sms messages
    List<SmsMessage> messages = smsPar.ParseFile(smsPar.CopyFile());
    // sanitize the list to get rid of stuff we don't want
    smsPar.SanitizeSmsMessageList(messages);
    ApplyAppropriateColoursToRecSMSListinDGV();
}
public List<SmsMessage> ParseFile(string filePath)
{
    List<SmsMessage> list = new List<SmsMessage>();
    using (StreamReader file = new StreamReader(filePath))
    {
        string line;
        while ((line = file.ReadLine()) != null)
        {
            var sms = ParseLine(line);
            list.Add(sms);
        }
    }
    return list;
}

public SmsMessage ParseLine(string line)
{
    string[] words = line.Split(',');
    for (int i = 0; i < words.Length; i++)
    {
        words[i] = words[i].Trim('"');
    }
    SmsMessage msg = new SmsMessage();
    msg.Number = int.Parse(words[0]);
    msg.MobNumber = words[1];
    msg.Message = words[4];
    msg.FollowedUp = "Unassigned";
    msg.Outcome = string.Empty;
    try
    {
        // DateTime conversion
        string[] splitWords = words[2].Split('/');
        string year = splitWords[0].Replace("09", "20" + splitWords[0]);
        string dateString = splitWords[2] + "/" + splitWords[1] + "/" + year;
        string timeString = words[3];
        string wholeDT = dateString + " " + timeString;
        DateTime dateTime = DateTime.Parse(wholeDT);
        msg.Date = dateTime;
    }
    catch (Exception e)
    {
        MessageBox.Show(e.ToString());
        Application.Exit();
    }
    return msg;
}
public void SanitizeSmsMessageList(List<SmsMessage> list)
{
    // strip out unwanted messages
    // list.Remove(some_message); etc...
    List<SmsMessage> remove = new List<SmsMessage>();
    foreach (SmsMessage message in list)
    {
        if (message.Number > 1)
        {
            remove.Add(message);
        }
    }
    foreach (SmsMessage msg in remove)
    {
        list.Remove(msg);
    }
    // Write received messages to the xml doc
    ParseSmsToXMLDB(list);
}

public void ParseSmsToXMLDB(List<SmsMessage> list)
{
    try
    {
        if (File.Exists(WriteDirectory + SaveName))
        {
            xmlE.AddXMLElement(list, WriteDirectory + SaveName);
        }
        else
        {
            xmlE.CreateNewXML(WriteDirectory + SaveName);
            xmlE.AddXMLElement(list, WriteDirectory + SaveName);
        }
    }
    catch (Exception e)
    {
        MessageBox.Show(e.ToString());
        Application.Exit();
    }
}
public void CreateNewXML(string writeDir)
{
    try
    {
        XElement Database = new XElement("Database");
        Database.Save(writeDir);
    }
    catch (Exception e)
    {
        MessageBox.Show(e.ToString());
    }
}

public void AddXMLElement(List<SmsMessage> messages, string writeDir)
{
    try
    {
        XElement Database = XElement.Load(writeDir);
        foreach (SmsMessage msg in messages)
        {
            if (!DoesExist(msg.MobNumber, writeDir))
            {
                Database.Add(new XElement("SMS",
                    new XElement("Number", msg.MobNumber),
                    new XElement("DateTime", msg.Date),
                    new XElement("Message", msg.Message),
                    new XElement("FollowedUpBy", msg.FollowedUp),
                    new XElement("Outcome", msg.Outcome),
                    new XElement("Quantity", msg.Quantity),
                    new XElement("Points", msg.Points)));
                EventNotify.SendNotification("A New Message Has Arrived!", msg.MobNumber);
            }
        }
        Database.Save(writeDir);
        EventNotify.UpdateDataGridView();
        EventNotify.UpdateStatisticsDB();
    }
    catch (Exception e)
    {
        MessageBox.Show(e.ToString());
    }
}

public bool DoesExist(string number, string writeDir)
{
    XElement main = XElement.Load(writeDir);
    return main.Descendants("Number")
        .Any(element => element.Value == number);
}
Use a profiler and/or Performance Monitor and/or \\live.sysinternals.com\tools\procmon.exe and/or Resource Monitor to determine what's going on.
If the 5 minute process is a background task, you can make use of Thread Priority.
MSDN here.
If you do the processing on a separate thread and change your timer to a System.Threading.Timer with a callback, you should be able to set a lower priority on that thread than on the rest of your application; a sketch of this follows below.
Inside your ParseFile loop you could try adding a Thread.Sleep and/or an Application.DoEvents() call to see if that helps. It's better to do this if the parsing is on a separate thread, but at least you can try this simple test to see if it helps.
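A minimal sketch of the lower-priority idea, reusing the smsPar calls from the tick handler in the question (any UI updates from inside the worker would still need Invoke/BeginInvoke):

using System.Threading;

// Run the parse on a background thread that yields the CPU to everything else.
var worker = new Thread(() =>
{
    List<SmsMessage> messages = smsPar.ParseFile(smsPar.CopyFile());
    smsPar.SanitizeSmsMessageList(messages);
})
{
    IsBackground = true,
    Priority = ThreadPriority.BelowNormal // keeps the netbook responsive
};
worker.Start();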
Might be that the MessageBoxes in your catches are running into cross-thread problems. Try swapping them out for writing to the trace output.
In any case, you've posted an entire (little) program, which will not help you get specific advice. Try deleting method bodies -- one at a time, scientifically -- and try to get the problem to occur/stop occurring. This will help you to locate the problem and eliminate the irrelevant parts of your question (both for yourself and for SO).
Your current processing model is batch-based: do the parsing, then process the messages, and so on.
You'll likely reduce the memory overhead if you switch to a LINQ-style "pull" approach.
For example, you could convert your ParseFile() method in this way:
public IEnumerable<SmsMessage> ParseFile(string filePath)
{
    using (StreamReader file = new StreamReader(filePath))
    {
        string line;
        while ((line = file.ReadLine()) != null)
        {
            var sms = ParseLine(line);
            yield return sms;
        }
    }
}
The advantage is that each SmsMessage can be handled as it is generated, instead of parsing all of the messages at once and then handling all of them.
This lowers your memory overhead, which is one of the most likely causes for the performance difference between your netbook and your desktop.
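A hypothetical way to consume the streaming version, folding the sanitize step into the pull (the Number <= 1 filter mirrors what SanitizeSmsMessageList keeps):

// Filter while streaming, so the full raw list never sits in memory at once.
var wanted = ParseFile(filePath).Where(sms => sms.Number <= 1).ToList();
ParseSmsToXMLDB(wanted);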