CSV modify file - C#

I am in a bit of a pickle regarding a consolidation application we are using in our company. We create a CSV file from a Progress database; this CSV file has 14 columns and NO header.
The CSV file contains payments (around 173 thousand rows). Most of these rows are the same except for the amount column (the last one).
Example:
2014;MONTH;;SC;10110;;;;;;;;EUR;-6500000
2014;01;;SC;10110;;;;;;;;EUR;-1010665
2014;01;;LLC;11110;;;;;;;;EUR;-6567000
2014;01;;SC;10110;;;;;;;;EUR;-1110665
2014;01;;LLC;11110;;;;;;;;EUR;65670.00
2014;01;;SC;10110;;;;;;;;EUR;-11146.65
(around 174000 rows)
As you can see, some of these lines are the same except for the amount column. What I need is to sort all rows, add up the amounts, and save one unique row instead of, say, 1100 rows with different amounts. In the example above, the three SC;10110 rows for month 01 would collapse into a single row with amount -2132476.65.
My coding skills are not getting the job done within the required timeframe; maybe one of you can push me in the right direction.
Example code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string input = File.ReadAllText(@"c:\temp\test.txt");
string inputLine = "";
StringReader reader = new StringReader(input);
List<List<string>> data = new List<List<string>>();
while ((inputLine = reader.ReadLine()) != null)
{
if (inputLine.Trim().Length > 0)
{
string[] inputArray = inputLine.Split(new char[] { ';' });
data.Add(inputArray.ToList());
}
}
//sort data by every column, least significant first (OrderBy is stable,
//so repeated sorts from the last column to the first give a full sort)
for (int sortCol = data[0].Count() - 1; sortCol >= 0; sortCol--)
{
data = data.OrderBy(x => x[sortCol]).ToList();
}
//delete duplicate rows
for (int rowCount = data.Count - 1; rowCount >= 1; rowCount--)
{
Boolean match = true;
for (int colCount = 0; colCount < data[rowCount].Count - 1; colCount++) // compare every column except the amount
{
if(data[rowCount][colCount] != data[rowCount - 1][colCount])
{
match = false;
break;
}
}
if (match == true)
{
decimal previousValue = decimal.Parse(data[rowCount - 1][data[rowCount].Count - 1]);
decimal currentValue = decimal.Parse(data[rowCount][data[rowCount].Count - 1]);
string newStrValue = (previousValue + currentValue).ToString();
data[rowCount - 1][data[rowCount].Count - 1] = newStrValue;
data.RemoveAt(rowCount);
}
}
string output = string.Join("\r\n", data.Select(x => string.Join(";", x)));
File.WriteAllText(@"c:\temp\test1.txt", output);
}
}
}

Read the CSV file line by line, and build an in-memory dictionary in which you keep the totals (and whatever other information you require). As many of the lines share the same key, it will probably not cause out-of-memory issues. Afterwards, generate a new CSV based on the information in the dictionary.
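For example, a minimal sketch of that approach (assuming the first 13 columns form the key and the 14th is the amount; paths are taken from the question, and CultureInfo.InvariantCulture from System.Globalization is used so the decimal point parses regardless of locale):

var totals = new Dictionary<string, decimal>();
foreach (string line in File.ReadLines(@"c:\temp\test.txt"))
{
    if (line.Trim().Length == 0) continue;
    int cut = line.LastIndexOf(';');
    string key = line.Substring(0, cut);   // every column except the amount
    decimal amount = decimal.Parse(line.Substring(cut + 1), CultureInfo.InvariantCulture);
    totals[key] = totals.TryGetValue(key, out decimal sum) ? sum + amount : amount;
}
File.WriteAllLines(@"c:\temp\test1.txt",
    totals.Select(kv => kv.Key + ";" + kv.Value.ToString(CultureInfo.InvariantCulture)));

The dictionary lookup is O(1) per line, so 174,000 lines should complete in well under a second.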

As I interpret your question, you are asking how to take input in the form of
#"2014;MONTH;;SC;10110;;;;;;;;EUR;-6500000
2014;01;;SC;10110;;;;;;;;EUR;-1010665
2014;01;;LLC;11110;;;;;;;;EUR;-6567000
2014;01;;SC;10110;;;;;;;;EUR;-1110665
2014;01;;LLC;11110;;;;;;;;EUR;65670.00
2014;01;;SC;10110;;;;;;;;EUR;-11146.65"
get the last column, and then sum it up? If so, this is actually very easy to do with something like this:
public static void Main()
{
string input = @"2014;MONTH;;SC;10110;;;;;;;;EUR;-6500000
2014;01;;SC;10110;;;;;;;;EUR;-1010665
2014;01;;LLC;11110;;;;;;;;EUR;-6567000
2014;01;;SC;10110;;;;;;;;EUR;-1110665
2014;01;;LLC;11110;;;;;;;;EUR;65670.00
2014;01;;SC;10110;;;;;;;;EUR;-11146.65";
var rows = input.Split('\n');
decimal totalValue = 0m;
foreach(var row in rows)
{
var transaction = row.Substring(row.LastIndexOf(';') +1);
decimal val = 0m;
if(decimal.TryParse(transaction, out val))
totalValue += val;
}
Console.WriteLine(totalValue);
}
But maybe I have misunderstood what you were asking for?

Sorry for answering my own post so late, but this is my final solution.
I replace all " characters and write the output to a StreamWriter (going from a 25 MB to a 15 MB file). Then I copy the CSV file to the SQL Server so I can bulk insert. After the insert I just query the table and write the result set to a new file. The new file is only +/- 700 KB!
The FillData() method fills a DataGridView in my application so you can review the result instead of opening the file in Excel.
I am new to C#; I am currently writing a new solution to query the CSV file directly or in memory and write it back to a new file.
Method 1:
string line;
using (StreamReader sr = new StreamReader(sourcePath))
using (StreamWriter sw = new StreamWriter(insertFile))
{
    while ((line = sr.ReadLine()) != null)
    {
        sw.WriteLine(line.Replace("\"", ""));
    }
}
File.Copy(insertFile, @"\\SQLSERVER\C$\insert.csv");
Method 2:
var destinationFile = @"c:\insert.csv";
var querieImportCSV = "BULK INSERT dbo.TABLE FROM '" + destinationFile + "' WITH ( FIELDTERMINATOR = ';', ROWTERMINATOR = '\n', FIRSTROW = 1)";
var truncate = @"TRUNCATE TABLE dbo.TABLE";
string queryResult =
#"SELECT [Year]
,[Month]
,[Week]
,[Entity]
,[Account]
,[C11]
,[C12]
,[C21]
,[C22]
,[C3]
,[C4]
,[CTP]
,[VALUTA]
,SUM(AMOUNT) as AMOUNT
,[CURRENCY_ORIG]
,[AMOUNTEXCH]
,[AGENTCODE]
FROM dbo.TABLE
GROUP BY YEAR, MONTH, WEEK, Entity, Account, C11, C12, C21, C22, C3, C4, CTP, VALUTA, CURRENCY_ORIG, AMOUNTEXCH, AGENTCODE
ORDER BY Account";
var conn = new SqlConnection(connectionString);
conn.Open();
SqlCommand commandTruncate = new SqlCommand(truncate, conn);
commandTruncate.ExecuteNonQuery();
SqlCommand commandInsert = new SqlCommand(querieImportCSV, conn);
commandInsert.ExecuteNonQuery(); // BULK INSERT returns no result set
FillData();
SqlCommand commandResult = new SqlCommand(queryResult, conn);
SqlDataReader readerResult = commandResult.ExecuteReader();
StringBuilder sb = new StringBuilder();
while (readerResult.Read())
{
sb.AppendLine(readerResult["Year"] + ";" + readerResult["Month"] + ";" + readerResult["Week"] + ";" + readerResult["Entity"] + ";" + readerResult["Account"] + ";" +
readerResult["C11"] + ";" + readerResult["C12"] + ";" + readerResult["C21"] + ";" + readerResult["C22"] + ";" + readerResult["C3"] + ";" + readerResult["C4"] + ";" +
readerResult["CTP"] + ";" + readerResult["Valuta"] + ";" + readerResult["Amount"] + ";" + readerResult["CURRENCY_ORIG"] + ";" + readerResult["AMOUNTEXCH"] + ";" + readerResult["AGENTCODE"]);
}
sb.Replace("\"","");
StreamWriter sw = new StreamWriter(homedrive);
sw.WriteLine(sb);
readerResult.Close();
conn.Close();
sw.Close();
sw.Dispose();
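For the in-memory variant mentioned above, a minimal LINQ sketch of the idea (paths are placeholders; it groups on every column except the last and sums that column, mirroring the SQL GROUP BY):

var merged = File.ReadLines(@"c:\insert.csv")
    .Where(l => l.Trim().Length > 0)
    .Select(l => l.Split(';'))
    .GroupBy(f => string.Join(";", f.Take(f.Length - 1)))
    .Select(g => g.Key + ";" + g.Sum(f =>
        decimal.Parse(f[f.Length - 1], CultureInfo.InvariantCulture))
        .ToString(CultureInfo.InvariantCulture));
File.WriteAllLines(@"c:\merged.csv", merged);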


Accessing List values when having few strings inside a value

So I have the following code:
void ReadFromCsv()
{
using (var reader = new StreamReader(#"d:\test.csv", Encoding.Default))
{
List<string> listA = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(';');
listA.Add(values[0]);
}
Console.WriteLine(string.Join(", ", listA)); // join the values; printing the list object only shows its type name
}
}
which reads from my CSV file, and an example of a line I get is:
50,2,10,201,10,9090339,24-OCT-21 09.38.38.679000 AM,123456789,24/10/2021 09:39:23,22/10/2021 09:39:37,Sm123456789-SM-20211031-VSR-000123.pdf,,,,,26/01/2022 13:08:58,,2,,0
First of all, why are there so many commas near the end of the line?
Second of all, what if I wanted to access the value "10" (which is the 5th value) of that string line; is that possible?
Going further, my task is to check that 5th value: if it's 5, for example, I want to take every row whose 5th value is 5 and create a CSV for them; if the 5th value is 10, I want to create a CSV for those records; and so on. But one task at a time: how do I access that value?
1: Consecutive commas near the end of the line mean those fields are empty ("").
2: You can get the 5th value as below:
string _list = "50,2,10,201,10,9090339,24-OCT-21 09.38.38.679000 AM,123456789,24/10/2021 09:39:23,22/10/2021 09:39:37,Sm123456789-SM-20211031-VSR-000123.pdf,,,,,26/01/2022 13:08:58,,2,,0";
var fiveIndex = _list.Split(',')[4];
3: Then you can get the fields that have the same value as fiveIndex:
var result = _list.Split(',').Select((v, i) => new { value = v, index = i }).Where(item => item.value == fiveIndex);
In your example, the 3rd and the 5th fields have a value of 10 (index = 2 and index = 4). Then you can save the matching lines in a CSV file.
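If the goal is to keep the whole rows whose 5th field matches, a sketch along these lines may help (path and target value are examples):

// Keep every full row whose 5th comma-separated field equals "10"
// and write those rows to their own CSV file.
string target = "10";
var matching = File.ReadLines(@"d:\test.csv", Encoding.Default)
    .Where(l => { var f = l.Split(','); return f.Length > 4 && f[4] == target; });
File.WriteAllLines(@"d:\matching-" + target + ".csv", matching);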
I ended up doing:
string chargeMonth = DateTime.Now.ToString("yyyyMM");
var fileCreationDate = DateTime.Now.ToString("yyyyMMdd");
string fileCreationTime = DateTime.Now.ToString("HHmmss");
string constVal = "MLL";
string fileType = "HIYUV-CHEVRA";
string[] values;
string header, sumRow;
string line, compId;
string inputFile = "records.CSV";
Dictionary<string, System.IO.StreamWriter> outputFiles = new Dictionary<string, System.IO.StreamWriter>();
using (System.IO.StreamReader file = new System.IO.StreamReader("D:\\" + inputFile, Encoding.Default))
{
header = file.ReadLine();
while ((line = file.ReadLine()) != null)
{
values = line.Split(",".ToCharArray());
compId = values[3];
if (!outputFiles.ContainsKey(compId))
{
string outputFileName = constVal + "-" + fileType + "-" + (String.Format("{0:00000}", Int32.Parse(compId))) + "-" + chargeMonth + "-" + fileCreationDate + "-" + fileCreationTime + ".CSV";
outputFiles.Add(compId, new System.IO.StreamWriter("D:\\" + outputFileName));
outputFiles[compId].WriteLine(header);
}
outputFiles[compId].WriteLine(line);
}
}
foreach (System.IO.StreamWriter outputFile in outputFiles.Values)
{
outputFile.Close();
}
and the mission is done.

C# compare id from text file in filestream

I need to fill a text file with information about workers. Then I need to read from the file and search for the ID the user is trying to find. For example, my file contains the IDs 1, 2 and 3; if I search for ID 3 and it matches, then all of that worker's information is written to the console. Otherwise it writes the text "A worker cannot be found."
using System;
using System.IO;
class Program
{
static void Main(string[] args)
{
string file = "C:\\Temp\\registery.txt";
FileStream fOutStream = File.Open(file, FileMode.Append, FileAccess.Write);
StreamWriter sWriter = new StreamWriter(fOutStream);
int[] id = { 1, 2, 3 };
string[] name = { "John", "Carl", "Thomas" };
float[] salary = { 3500, 4800, 2100 };
for (int i = 0; i < id.Length; i++)
{
sWriter.WriteLine(id[i] + " " + name[i] + " " + salary[i]);
}
sWriter.Flush();
sWriter.Close();
FileStream fInStream = File.OpenRead(file);
StreamReader sReader = new StreamReader(fInStream);
int id2;
Console.WriteLine("Type worker's id");
id2 = int.Parse(Console.ReadLine());
bool a;
a = sReader.ReadToEnd().Contains(id2.ToString()); // Contains expects a string, not an int
Console.WriteLine(a);
sReader.Close();
}
}
If you want the text file to be searchable, it should be delimited by a separator like a comma or a tab.
So modify your code:
sWriter.WriteLine(id[i] + "," + name[i] + "," + salary[i]);
To search your text file by id/name/whatever and use AND/OR, you can use the method described here:
How would I convert data in a .txt file into xml? c#
BTW: refactor your code so that creating the file happens in one method and the search in another.
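For instance, a rough sketch of that split (method names are my own):

// Hypothetical refactoring: one method writes the file, another searches it.
static void CreateFile(string file, int[] ids, string[] names, float[] salaries)
{
    using (StreamWriter writer = new StreamWriter(file, true)) // true = append
    {
        for (int i = 0; i < ids.Length; i++)
            writer.WriteLine(ids[i] + "," + names[i] + "," + salaries[i]);
    }
}

static string FindWorkerById(string file, int id)
{
    foreach (string line in File.ReadLines(file))
    {
        string[] fields = line.Split(',');
        if (fields.Length > 0 && fields[0] == id.ToString())
            return line; // the whole record for this worker
    }
    return null; // not found
}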
I found a solution to my problem myself and it worked well enough. It might not be the best solution. I removed the bool logic and replaced the whole thing with this:
string line;
while ((line = sReader.ReadLine()) != null)
{
if (line.Contains("id: " + id2))
{
Console.WriteLine(line);
break;
}
else if ((line = sReader.ReadLine()) == null)
{
Console.WriteLine("Worker not found with id " + id2);
}
}
And I fixed the upper for loop to look like this:
sWriter.WriteLine("id: " + id[i] + " name: " + name[i] + " salary: " + salary[i]);

Dumping SQL table to .csv C#

I am trying to implement a script in my application that will dump the entire contents of an SQL database (running MS SQL Server Express 2014) to a .csv file. For now it grabs everything, but I am trying to write the code so that I can easily customize it to grab only certain columns.
Here is the code I have written currently:
public void doCsvWrite(string timeStamp){
try {
//specify file name of log file (csv).
string newFileName = "C:/TestDirectory/DataExport-" + timeStamp + ".csv";
//check to see if file exists, if not create an empty file with the specified file name.
if (!File.Exists(newFileName)) {
FileStream fs = new FileStream(newFileName, FileMode.CreateNew);
fs.Close();
//define header of new file, and write header to file.
string csvHeader = "ITEM1,ITEM2,ITEM3,ITEM4,ITEM5";
using (FileStream fsWHT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
using(StreamWriter swT = new StreamWriter(fsWHT))
{
swT.WriteLine(csvHeader.ToString());
}
}
//set up connection to database.
SqlConnection myDEConnection;
String cDEString = "Data Source=localhost\\NAMEDPIPE;Initial Catalog=db;User Id=user;Password=pwd";
String strDEStatement = "SELECT * FROM table";
try
{
myDEConnection = new SqlConnection(cDEString);
}
catch (Exception ex)
{
//error handling here.
return;
}
try
{
myDEConnection.Open();
}
catch (Exception ex)
{
//error handling here.
return;
}
SqlDataReader reader = null;
SqlCommand myDECommand = new SqlCommand(strDEStatement, myDEConnection);
try
{
reader = myDECommand.ExecuteReader();
while (reader.Read())
{
for (int i = 0; i < reader.FieldCount; i++)
{
if(reader["Column1"].ToString() == "") {
//does nothing if the current line is "bugged" (containing no values at all, typically happens after reboot of 3rd party equipment).
}
else {
//grab relevant tag data and set the csv line for the current row.
string csvDetails = reader["Column1"] + "," + reader["Column2"] + "," + String.Format("{0:0.0}", reader["Column3"]) + "," + String.Format("{0:0.000}", reader["Column4"]) + "," + reader["Column5"];
using (FileStream fsWDT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
using(StreamWriter swDT = new StreamWriter(fsWDT))
{
//write csv line to file.
swDT.WriteLine(csvDetails.ToString());
}
}
}
}
}
catch (Exception ex)
{
//error handling here.
myDEConnection.Close();
return;
}
myDEConnection.Close();
}
catch (Exception ex)
{
//error handling here.
MessageBox.Show(ex.Message);
}
}
Now, this was working fine when I was using it with a 3rd-party SQLite-based database, but the output I'm getting after modifying this for my MSSQL db looks something like this (ITEM1 is the primary key, a standard auto-incrementing ID field):
ITEM1,ITEM2,ITEM3,ITEM4,ITEM5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
....
So it seems that it writes several entries of the same row, where I would just like one single line for each row. Any suggestions?
Thanks in advance.
edit: Thanks everyone for your answers!
The for loop isn't needed in the section below. Because it loops from 0 to FieldCount, I assume it was originally meant to append the text from each column, but inside the loop a single line already concatenates all the columns and assigns them to csvDetails.
try
{
reader = myDECommand.ExecuteReader();
while (reader.Read())
{
for (int i = 0; i < reader.FieldCount; i++)
{
if(reader["Column1"].ToString() == "") {
//does nothing if the current line is "bugged" (containing no values at all, typically happens after reboot of 3rd party equipment).
}
else {
//grab relevant tag data and set the csv line for the current row.
string csvDetails = reader["Column1"] + "," + reader["Column2"] + "," + String.Format("{0:0.0}", reader["Column3"]) + "," + String.Format("{0:0.000}", reader["Column4"]) + "," + reader["Column5"];
using (FileStream fsWDT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
using(StreamWriter swDT = new StreamWriter(fsWDT))
{
//write csv line to file.
swDT.WriteLine(csvDetails.ToString());
}
}
}
}
}
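With the loop removed, the reading section reduces to something like this (same names as the original; opening the output file once outside the loop is an extra tweak, not strictly required):

reader = myDECommand.ExecuteReader();
using (FileStream fsWDT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
using (StreamWriter swDT = new StreamWriter(fsWDT))
{
    while (reader.Read())
    {
        // skip "bugged" rows that contain no values at all
        if (reader["Column1"].ToString() == "")
            continue;
        swDT.WriteLine(reader["Column1"] + "," + reader["Column2"] + "," +
            String.Format("{0:0.0}", reader["Column3"]) + "," +
            String.Format("{0:0.000}", reader["Column4"]) + "," + reader["Column5"]);
    }
}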
Usually, we use specially designed export/import utilities for dumping data.
However, if you have to implement your own routine, I suggest decomposing it:
private static IEnumerable<IDataRecord> SourceData(String sql) {
using (SqlConnection con = new SqlConnection(ConnectionStringHere)) {
con.Open();
using (SqlCommand q = new SqlCommand(sql, con)) {
using (var reader = q.ExecuteReader()) {
while (reader.Read()) {
//TODO: you may want to add additional conditions here
yield return reader;
}
}
}
}
}
private static IEnumerable<String> ToCsv(IEnumerable<IDataRecord> data) {
foreach (IDataRecord record in data) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < record.FieldCount; ++i) {
String chunk = Convert.ToString(record.GetValue(i));
if (i > 0)
sb.Append(',');
if (chunk.Contains(',') || chunk.Contains(';'))
chunk = "\"" + chunk.Replace("\"", "\"\"") + "\"";
sb.Append(chunk);
}
yield return sb.ToString();
}
}
Having SourceData and ToCsv you can easily implement
private static void WriteMyCsv(String fileName) {
var source = SourceData("SELECT * FROM table");
File.WriteAllLines(fileName, ToCsv(source));
}
You have a for loop which is looping over the field count:
for (int i = 0; i < reader.FieldCount; i++)
I think it will work if you remove the loop, as you don't need to iterate through the columns.
It happens because the output is placed inside the for loop
for (int i = 0; i < reader.FieldCount; i++)
so every record repeats FieldCount times.
Complete example, verified working on .NET 4.8, May '22. Code simplified for the demo.
Why the DataTable? Under some circumstances it is useful: if you are converting hundreds of files at once with multithreading, it works as a large buffer, and you can do pretty complex data mangling at the same time, should you need it.
Unfortunately, Microsoft tries to detect the column types, and if your data does not comply with that mechanism you end up with hard-to-correct errors. In that case use the second solution.
// Get the data from SQLite
SqliteConnection SQLiDataCon = new SqliteConnection(@"Data Source=c:\sqlite.db3");
SQLiDataCon.Open();
SqliteDataReader SQLiDtaReader = new SqliteCommand(@"SELECT * FROM stats;", SQLiDataCon).ExecuteReader();
// Load data to DataTable
DataTable csvTable = new DataTable();
csvTable.Load(SQLiDtaReader);
// Get "one" string with column names
string csvFields = @"""" + String.Join(@""",""", csvTable.Columns.Cast<DataColumn>().Select(dc => dc.ColumnName).ToArray()) + @"""";
// Prep "in memory the entire content of the CSV"
StringBuilder csvString = new StringBuilder();
// Write the header in
csvString.AppendLine(csvFields);
// Write the rows in
foreach (DataRow dr in csvTable.Rows)
{
csvString.AppendLine(@"""" + String.Join(@""",""", dr.ItemArray) + @"""");
}
// Save to file
StreamWriter csvFile = new StreamWriter(@"c:\stats.csv");
csvFile.Write(csvString);
csvFile.Close(); // flush the writer so the file is actually written
Without DataTable.
// SQLITE
SqliteConnection SQLiDataCon = new SqliteConnection(@"Data Source=c:\sqlite.db3");
SQLiDataCon.Open();
StringBuilder csvString = new StringBuilder();
StreamWriter csvFile;
Object[] csvRow;
SqliteDataReader SQLiDtaReader = new SqliteCommand(@"SELECT * FROM sometable;", SQLiDataCon).ExecuteReader();
// CSV HEADER
csvString.AppendLine(@"""" + String.Join(@""",""", SQLiDtaReader.GetSchemaTable().AsEnumerable().Select(dr => dr.Field<string>("ColumnName")).ToArray<string>()) + @"""");
// CSV BODY
while (SQLiDtaReader.Read())
{
SQLiDtaReader.GetValues(csvRow = new Object[SQLiDtaReader.FieldCount]);
csvString.AppendLine(@"""" + String.Join(@""",""", csvRow) + @"""");
}
// WRITE IT
csvFile = new StreamWriter(@"C:\somecsvfile.csv");
csvFile.Write(csvString);
csvFile.Close(); // flush the writer so the file is actually written

Union of million-line URLs in 2 files

Files A and B each contain a million URLs.
1. Go through the URLs in file A one by one.
2. Extract subdomain.com (http://subdomain.com/path/file).
3. If subdomain.com exists in file B, save it to file C.
What is the quickest way to get file C with C#?
Thanks.
When I use ReadLine, it is not much different:
// stat
DateTime start = DateTime.Now;
int totalcount = 0;
int n1;
if (!int.TryParse(num1.Text, out n1))
n1 = 0;
// memory
dZLinklist = new Dictionary<string, string>();
// read file
string fileName = openFileDialog1.FileName; // get file name
textBox1.Text = fileName;
StreamReader sr = new StreamReader(textBox1.Text);
string fullfile = File.ReadAllText(textBox1.Text);
string[] sArray = fullfile.Split( '\n');
//IEnumerable<string> sArray = tool.GetSplit(fullfile, '\n');
//string sLine = "";
//while (sLine != null)
foreach ( string sLine in sArray)
{
totalcount++;
//sLine = sr.ReadLine();
if (sLine != null)
{
//string reg = "http[s]*://.*?/";
//Regex R = new Regex(reg, RegexOptions.Compiled);
//Match m = R.Match(sLine);
//if(m.Success)
int length = sLine.IndexOf(' ', n1); // default http://
if(length > 0)
{
//string urls = sLine.Substring(0, length);
dZLinklist[sLine.Substring(0,length)] = sLine;
}
}
}
TimeSpan time = DateTime.Now - start;
int count = dZLinklist.Count;
double sec = Math.Round(time.TotalSeconds,2);
label1.Text = "(" + totalcount + ")" + count.ToString() + " / " + sec + " = " + (Math.Round(count / sec,2)).ToString();
sr.Close();
I would go for using Microsoft LogParser for processing big files: MS LogParser. Are you limited to implementing it only in the way described?
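If it has to stay in C#, a HashSet join is usually quick enough; a rough sketch (file paths are placeholders, and using Uri.Host for the subdomain.com part is my assumption):

// Collect the hosts from file B, then stream file A and keep the URLs
// whose host also occurs in B.
var hostsInB = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
foreach (string line in File.ReadLines(@"c:\fileB.txt"))
{
    if (Uri.TryCreate(line.Trim(), UriKind.Absolute, out Uri u))
        hostsInB.Add(u.Host);
}
using (var fileC = new StreamWriter(@"c:\fileC.txt"))
{
    foreach (string line in File.ReadLines(@"c:\fileA.txt"))
    {
        if (Uri.TryCreate(line.Trim(), UriKind.Absolute, out Uri u) &&
            hostsInB.Contains(u.Host))
            fileC.WriteLine(line.Trim());
    }
}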

Reading a large CSV file and processing in C#. Any suggestions?

I have a large CSV file, around 25 GB. I need to parse each line, which has around 10 columns, do some processing, and finally save it to a new file with the parsed data.
I am using a dictionary as my data structure. To avoid memory overflow I am writing the file out after every 500,000 records and clearing the dictionary.
Can anyone suggest whether this is a good way of doing it? If not, is there a better way? Right now it is taking 30 minutes to process the 25 GB file.
Here is the code
private static void ReadData(string filename, FEnum fileType)
{
var resultData = new ResultsData
{
DataColumns = new List<string>(),
DataRows = new List<Dictionary<string, Results>>()
};
resultData.DataColumns.Add("count");
resultData.DataColumns.Add("userid");
Console.WriteLine("Start Processing : " + DateTime.Now);
const long processLimit = 100000;
//ProcessLimit : 500000, TimeElapsed : 30 Mins;
//ProcessLimit : 100000, TimeElaspsed - Overflow
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Dictionary<string, Results> parsedData = new Dictionary<string, Results>();
FileStream fileStream = new FileStream(filename, FileMode.Open, FileAccess.Read);
using (StreamReader streamReader = new StreamReader(fileStream))
{
string charsRead = streamReader.ReadLine();
int count = 0;
long linesProcessed = 0;
while (!String.IsNullOrEmpty(charsRead))
{
string[] columns = charsRead.Split(',');
string eventsList = columns[0] + ";" + columns[1] + ";" + columns[2] + ";" + columns[3] + ";" +
columns[4] + ";" + columns[5] + ";" + columns[6] + ";" + columns[7];
if (parsedData.ContainsKey(columns[0]))
{
Results results = parsedData[columns[0]];
results.Count = results.Count + 1;
results.Conversion = results.Count;
results.EventList.Add(eventsList);
parsedData[columns[0]] = results;
}
else
{
Results results = new Results {
Count = 1, Hash_Person_Id = columns[0], Tag_Id = columns[1], Conversion = 1,
Campaign_Id = columns[2], Inventory_Placement = columns[3], Action_Id = columns[4],
Creative_Group_Id = columns[5], Creative_Id = columns[6], Record_Time = columns[7]
};
results.EventList = new List<string> {eventsList};
parsedData.Add(columns[0], results);
}
charsRead = streamReader.ReadLine();
linesProcessed++;
if (linesProcessed == processLimit)
{
linesProcessed = 0;
SaveParsedValues(filename, fileType, parsedData);
//Clear Dictionary
parsedData.Clear();
}
}
}
// save the final partial batch that never reached processLimit
if (parsedData.Count > 0)
SaveParsedValues(filename, fileType, parsedData);
stopwatch.Stop();
Console.WriteLine(@"File : {0} Batch Limit : {1} Time elapsed : {2} ", filename + Environment.NewLine, processLimit + Environment.NewLine, stopwatch.Elapsed + Environment.NewLine);
}
Thank you
The Microsoft.VisualBasic.FileIO.TextFieldParser class looks like it could do the job. Try it, it may speed things up.
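For example, a minimal sketch (requires a reference to the Microsoft.VisualBasic assembly; filename is the same parameter as in the question):

using (var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(filename))
{
    parser.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited;
    parser.SetDelimiters(",");
    while (!parser.EndOfData)
    {
        string[] columns = parser.ReadFields(); // handles quoted fields correctly
        // ...build the dictionary entry from columns as in the original loop...
    }
}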
