CSV data to .txt and sum all the amounts
string fileName = "../../TechFiles/";
using (var reader = new StreamReader(ConfigurationManager.AppSettings["ConfigurationSource"]))
{
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
string fullFileName = fileName + values[4] + ".txt";
List<helper> package = new List<helper>
{
new helper() { bankName = values[4], amount = double.Parse(values[6])}
};
List<ResultLine> result = package.GroupBy(i => i.bankName)
.SelectMany(cl => cl.Select(
csLine => new ResultLine
{
bankName = csLine.bankName,
Quantity = cl.Count().ToString(),
amount = cl.Sum(c => c.amount),
}))
.ToList<ResultLine>();
List<string> listA = new List<string>();
foreach (var book in result)
{
if (!listA.Contains(book.bankName))
{
listA.Add(book.bankName);
File.WriteAllText(fullFileName,
book.bankName + " " + book.amount + " " + book.Quantity);
}
}
}
}
I put the CSV path in app.config and then wrote all the data to the text file. The problem is that I want to use values[4] as a header and add the sum of all the amounts to that header, but it only returns a single amount. I need a way to pass all the amounts at the same time so that I can sum the total.
You'll either need to loop through all the values twice (once to calculate the sum and once to write the text file), or sum the values as you write them to the text file and then insert the header row to the text file after you have written all the details.
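The first option (one pass to total, one pass to write) could look roughly like the sketch below. It assumes the column indexes from the question (4 for the bank name, 6 for the amount); the class name BankTotals and its method names are made up for illustration, and decimal is swapped in for the question's double to avoid rounding drift.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

public static class BankTotals
{
    // Pass 1: accumulate the row count and the amount total per bank name.
    // Column 4 = bank name and column 6 = amount, as in the question.
    public static Dictionary<string, (int Count, decimal Total)> Summarize(IEnumerable<string> lines)
    {
        var totals = new Dictionary<string, (int Count, decimal Total)>();
        foreach (var line in lines)
        {
            var values = line.Split(',');
            if (values.Length <= 6) continue; // skip blank or short lines
            if (!decimal.TryParse(values[6], NumberStyles.Number, CultureInfo.InvariantCulture, out var amount))
                continue; // skip a header row or a non-numeric amount

            totals.TryGetValue(values[4], out var current); // (0, 0) when the bank is new
            totals[values[4]] = (current.Count + 1, current.Total + amount);
        }
        return totals;
    }

    // Pass 2: write one text file per bank, after every amount has been seen.
    public static void WriteFiles(Dictionary<string, (int Count, decimal Total)> totals, string folder)
    {
        foreach (var entry in totals)
        {
            var path = Path.Combine(folder, entry.Key + ".txt");
            var text = string.Format(CultureInfo.InvariantCulture, "{0} {1} {2}",
                entry.Key, entry.Value.Total, entry.Value.Count);
            File.WriteAllText(path, text);
        }
    }
}
```

Calling BankTotals.Summarize(File.ReadLines(csvPath)) and then BankTotals.WriteFiles(totals, "../../TechFiles/") would replace the per-line GroupBy in the question, which only ever sees one row at a time.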
So I have the following code:
void ReadFromCsv()
{
using (var reader = new StreamReader(@"d:\test.csv", Encoding.Default))
{
List<string> listA = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(';');
listA.Add(values[0]);
}
Console.WriteLine(listA);
}
}
which is reading from my csv file and an example of a line I get is:
50,2,10,201,10,9090339,24-OCT-21 09.38.38.679000 AM,123456789,24/10/2021 09:39:23,22/10/2021 09:39:37,Sm123456789-SM-20211031-VSR-000123.pdf,,,,,26/01/2022 13:08:58,,2,,0
First of all, why are there so many commas near the end of the line?
Second, what if I wanted to access the value "10" (the 5th value) of that line? Is that possible?
Going further, my task is to check that 5th value: if it is 5, for example, I want to take every row where the 5th value is 5 and create a CSV for them; if the 5th value is 10, I want to create a CSV for those records, and so on. But one task at a time: how do I access that value?
1: Consecutive commas near the end of the line mean that the fields between them are empty ("").
2: You can get the 5th value as below:
string _list = "50,2,10,201,10,9090339,24-OCT-21 09.38.38.679000 AM,123456789,24/10/2021 09:39:23,22/10/2021 09:39:37,Sm123456789-SM-20211031-VSR-000123.pdf,,,,,26/01/2022 13:08:58,,2,,0";
var fiveIndex = _list.Split(',')[4];
3:
The snippet below selects the fields of that one line whose value equals fiveIndex:
var result = _list.Split(',').Select((v, i) => new { value = v, index = i }).Where(item => item.value == fiveIndex);
In your example, the fields at index 2 and index 4 both hold 10. To pick out whole rows instead, apply the same check to values[4] of every line you read, then save the matching lines to a CSV file.
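For the follow-up task (one CSV per distinct 5th value), a grouping step along these lines might help. This is only a sketch: RowGrouping and GroupByColumn are illustrative names, and it assumes a plain comma-separated file with no quoted fields that contain commas.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowGrouping
{
    // Groups whole CSV lines by the value found in one column (zero-based index).
    public static Dictionary<string, List<string>> GroupByColumn(IEnumerable<string> lines, int columnIndex)
    {
        return lines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => new { Line = line, Fields = line.Split(',') })
            .Where(row => row.Fields.Length > columnIndex)   // ignore rows that are too short
            .GroupBy(row => row.Fields[columnIndex])
            .ToDictionary(g => g.Key, g => g.Select(row => row.Line).ToList());
    }
}
```

With groups = RowGrouping.GroupByColumn(File.ReadLines(path), 4), each entry could then be saved with File.WriteAllLines(entry.Key + ".csv", entry.Value).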
I ended up doing:
string chargeMonth = DateTime.Now.ToString("yyyyMM");
var fileCreationDate = DateTime.Now.ToString("yyyyMMdd");
string fileCreationTime = DateTime.Now.ToString("HHmmss");
string constVal = "MLL";
string fileType = "HIYUV-CHEVRA";
string[] values;
string header, sumRow;
string line, compId;
string inputFile = "records.CSV";
Dictionary<string, System.IO.StreamWriter> outputFiles = new Dictionary<string, System.IO.StreamWriter>();
using (System.IO.StreamReader file = new System.IO.StreamReader("D:\\" + inputFile, Encoding.Default))
{
header = file.ReadLine();
while ((line = file.ReadLine()) != null)
{
values = line.Split(",".ToCharArray());
compId = values[3];
if (!outputFiles.ContainsKey(compId))
{
string outputFileName = constVal + "-" + fileType + "-" + (String.Format("{0:00000}", Int32.Parse(compId))) + "-" + chargeMonth + "-" + fileCreationDate + "-" + fileCreationTime + ".CSV";
outputFiles.Add(compId, new System.IO.StreamWriter("D:\\" + outputFileName));
outputFiles[compId].WriteLine(header);
}
outputFiles[compId].WriteLine(line);
}
}
foreach (System.IO.StreamWriter outputFile in outputFiles.Values)
{
outputFile.Close();
}
and the mission is done.
I have written a very simple program in C#, using a NuGet package, that reads in two CSV files, fuzzy matches them, and outputs a new CSV file with all the matches. The problem is that I need the program to be able to read files of up to 700k and compare them to 100k. I haven't been able to find a way to speed up the process. Is there any way I can do this? I will even use another language if need be.
You can ignore all the commented-out code; it is only there from when I was testing. Sorry, I'm a newer programmer.
The ReadCSV function reads in the CSV. The rest is code inside another function, where I pass in the string arrays to run them through the fuzzy match.
static string[] ReadCSV(string path)
{
List<string> name = new List<string>();
List<string> address = new List<string>();
List<string> city = new List<string>();
List<string> state = new List<string>();
List<string> zip = new List<string>();
using (var reader = new StreamReader(path))
{
reader.ReadLine();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
name.Add(values[0] +", "+ values[1]);
//address.Add(values[1]);
//city.Add(values[2]);
//state.Add(values[3]);
//zip.Add(values[4]);
}
}
string[] name1 = name.ToArray();
return name1;
//foreach (var item in name)
//{
// Console.WriteLine(item.ToString());
//}
}
StringBuilder csvcontent = new StringBuilder();
string csvpath = @"C:\Users\bigel\Documents\outputtest.csv";
csvcontent.AppendLine("Name,Address,Match");
//Console.WriteLine("Levenshtein Edit Distance:");
int x = 1;
foreach (var name in string1)
{
for (int i = 0; i < length; i++)
{
int leven = match[i].LevenshteinDistance(name);
//Console.WriteLine(match[i] + "\t{0} against {1}", leven, name);
if (leven <= 7)
{
output[i] = input[i] + ",match";
csvcontent.AppendLine(output[i]);
//Console.WriteLine(match[i] + " " + leven + " against " + name + " is a Match");
//Console.WriteLine(output[i]);
}
else
{
if (i == 500)
{
Console.WriteLine(x);
x++;
}
}
}
}
File.AppendAllText(csvpath, csvcontent.ToString());
So I've been trying to figure out how to bring in entire lines of a .csv file, but only the ones whose first string matches another one.
This is what I've got so far; all I'm getting back in my listbox is info from the same random line.
If you can help me with the logic it would help a lot, thanks.
cbocustinfo.Items.Clear();
lstcustinfo.Items.Clear();
StreamReader infile, transdata;
infile = File.OpenText(@"E:\AS2customers.csv");
transdata= File.OpenText(@"E:\AS2data.csv");
string[] custinfo, names;
string[] custtrans;
do
{
custtrans = transdata.ReadLine().Split(',');
if (custinfo[1] == custtrans[0])
{
lstcustinfo.Items.Add(custtrans[3] + " " + custtrans[4]);
}
}
while (transdata.EndOfStream != true);
infile.Close();
transdata.Close();
Here is where I initialize custinfo
do
{
custinfo = infile.ReadLine().Split(',');
names = custinfo[0].Split(' ');
cbocustinfo.Items.Add(names[0] +" "+ names[1]+ " " + custinfo[1]);
}
while (infile.EndOfStream != true);
If I understand what you're trying to do correctly, maybe it would be easier to just read the files into two strings, then do the splitting and looping over those. I don't know your file formats, so this may be doing unnecessary processing (looping through all the transactions for every customer).
For example:
cbocustinfo.Items.Clear();
lstcustinfo.Items.Clear();
var customers = File.ReadAllText(@"E:\AS2customers.csv")
.Split(new []{Environment.NewLine}, StringSplitOptions.None);
var transactions = File.ReadAllText(@"E:\AS2data.csv")
.Split(new []{Environment.NewLine}, StringSplitOptions.None);
foreach (var customer in customers)
{
var custInfo = customer.Split(',');
var names = custInfo[0].Split(' ');
cbocustinfo.Items.Add(names[0] + " " + names[1]+ " " + custInfo[1]);
foreach (var transaction in transactions)
{
var transInfo = transaction.Split(',');
if (custInfo[1] == transInfo[0])
{
lstcustinfo.Items.Add(transInfo[3] + " " + transInfo[4]);
}
}
}
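If looping through all the transactions for every customer turns out to be too slow, one possible alternative is to index the transactions once by their first column and then look each customer up. The sketch below assumes, as the code above does, that column 0 of the transaction file holds the customer id; TransactionIndex is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TransactionIndex
{
    // Builds a lookup from customer id (column 0) to the split fields of each of its rows.
    public static ILookup<string, string[]> Build(IEnumerable<string> transactionLines)
    {
        return transactionLines
            .Where(line => !string.IsNullOrWhiteSpace(line)) // ignore a trailing empty line
            .Select(line => line.Split(','))
            .ToLookup(fields => fields[0]);
    }
}
```

Inside the customer loop, the inner foreach would then become foreach (var transInfo in index[custInfo[1]]); a lookup returns an empty sequence for ids that have no transactions, so no extra check is needed.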
I am in a bit of a pickle regarding a consolidation application we use in our company. We create a CSV file from a Progress database; this CSV file has 14 columns and NO header.
The CSV file contains payments (around 173 thousand rows). Most of these rows are the same except for the amount column (the last column).
Example:
2014;MONTH;;SC;10110;;;;;;;;EUR;-6500000
2014;01;;SC;10110;;;;;;;;EUR;-1010665
2014;01;;LLC;11110;;;;;;;;EUR;-6567000
2014;01;;SC;10110;;;;;;;;EUR;-1110665
2014;01;;LLC;11110;;;;;;;;EUR;65670.00
2014;01;;SC;10110;;;;;;;;EUR;-11146.65
(around 174000 rows)
As you can see, some of these lines are the same except for the amount column. What I need is to sort all rows, add up the amounts, and save one unique row instead of 1100 rows with different amounts.
My coding skills are failing me in getting the job done within the timeframe I have; maybe one of you can push me in the right direction to solve this problem.
Example code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string input = File.ReadAllText(@"c:\temp\test.txt");
string inputLine = "";
StringReader reader = new StringReader(input);
List<List<string>> data = new List<List<string>>();
while ((inputLine = reader.ReadLine()) != null)
{
if (inputLine.Trim().Length > 0)
{
string[] inputArray = inputLine.Split(new char[] { ';' });
data.Add(inputArray.ToList());
}
}
//sort data by every column
for (int sortCol = data[0].Count() - 1; sortCol >= 0; sortCol--)
{
// OrderBy returns a new sequence, so the result must be assigned back
data = data.OrderBy(x => x[sortCol]).ToList();
}
//delete duplicate rows
for (int rowCount = data.Count - 1; rowCount >= 1; rowCount--)
{
Boolean match = true;
// compare every column except the last one (the amount)
for (int colCount = 0; colCount < data[rowCount].Count - 1; colCount++)
{
if(data[rowCount][colCount] != data[rowCount - 1][colCount])
{
match = false;
break;
}
}
if (match == true)
{
decimal previousValue = decimal.Parse(data[rowCount - 1][data[rowCount].Count - 1]);
decimal currentValue = decimal.Parse(data[rowCount][data[rowCount].Count - 1]);
string newStrValue = (previousValue + currentValue).ToString();
data[rowCount - 1][data[rowCount].Count - 1] = newStrValue;
data.RemoveAt(rowCount);
}
}
string output = string.Join("\r\n",data.AsEnumerable()
.Select(x => string.Join(";",x.Select(y => y).ToArray())).ToArray());
File.WriteAllText(@"c:\temp\test1.txt",output);
}
}
}
Read the CSV file line by line, and build an in-memory dictionary in which you keep the totals (and other information you require). As most of the lines belong to the same key, it will probably not cause out of memory issues. Afterwards, generate a new CSV based on the information in the dictionary.
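A rough sketch of that dictionary approach is below. It assumes the amount is always the last ';'-separated column and uses everything before it as the key; PaymentConsolidator is an illustrative name, and lines whose amount cannot be parsed are simply skipped, which may or may not be what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class PaymentConsolidator
{
    // Sums the last column for all rows that agree on every other column.
    public static List<string> Consolidate(IEnumerable<string> lines)
    {
        var totals = new Dictionary<string, decimal>();
        foreach (var line in lines)
        {
            int split = line.LastIndexOf(';');
            if (split < 0) continue;                         // not a data row
            string key = line.Substring(0, split);           // all columns except the amount
            string amountText = line.Substring(split + 1).Trim();
            if (!decimal.TryParse(amountText, NumberStyles.Number, CultureInfo.InvariantCulture, out var amount))
                continue;                                    // assumption: skip unparsable amounts

            totals.TryGetValue(key, out var sum);            // 0 when the key is new
            totals[key] = sum + amount;
        }

        // One output row per unique key, sorted by key, amount appended last.
        return totals
            .OrderBy(pair => pair.Key, StringComparer.Ordinal)
            .Select(pair => pair.Key + ";" + pair.Value.ToString(CultureInfo.InvariantCulture))
            .ToList();
    }
}
```

Feeding it File.ReadLines(inputPath) keeps only one dictionary entry per unique key in memory, and the result can be written out with File.WriteAllLines(outputPath, rows).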
As I interpret your question, you want to take input in the form of
@"2014;MONTH;;SC;10110;;;;;;;;EUR;-6500000
2014;01;;SC;10110;;;;;;;;EUR;-1010665
2014;01;;LLC;11110;;;;;;;;EUR;-6567000
2014;01;;SC;10110;;;;;;;;EUR;-1110665
2014;01;;LLC;11110;;;;;;;;EUR;65670.00
2014;01;;SC;10110;;;;;;;;EUR;-11146.65"
get the last column, and then sum it up? If so, this is actually very easy to do with something like this:
public static void Main()
{
string input = @"2014;MONTH;;SC;10110;;;;;;;;EUR;-6500000
2014;01;;SC;10110;;;;;;;;EUR;-1010665
2014;01;;LLC;11110;;;;;;;;EUR;-6567000
2014;01;;SC;10110;;;;;;;;EUR;-1110665
2014;01;;LLC;11110;;;;;;;;EUR;65670.00
2014;01;;SC;10110;;;;;;;;EUR;-11146.65";
var rows = input.Split('\n');
decimal totalValue = 0m;
foreach(var row in rows)
{
var transaction = row.Substring(row.LastIndexOf(';') +1);
decimal val = 0m;
if(decimal.TryParse(transaction, out val))
totalValue += val;
}
Console.WriteLine(totalValue);
}
But maybe I have misunderstood what you were asking for?
Sorry for answering my own post so late, but this is my final solution.
I replace all " characters and write the output with a StreamWriter (going from a 25 MB to a 15 MB file). Then I copy the CSV file to the SQL Server so I can bulk insert it. After the insert I query the table and write the result set to a new file. The new file is only about 700 KB!
The FillData() method fills a DataGridView in my application so you can review the result instead of opening the file in Excel.
I am new to C#; I am currently writing a new solution that queries the CSV file directly or in memory and writes it back to a new file.
Method1:
string line;
StreamWriter sw = new StreamWriter(insertFile);
using (StreamReader sr = new StreamReader(sourcePath))
{
while ((line = sr.ReadLine()) != null)
{
sw.WriteLine(line.Replace("\"", ""));
}
sr.Close();
sw.Close();
sr.Dispose();
sw.Dispose();
File.Copy(insertFile, @"\\SQLSERVER\C$\insert.csv");
}
Method2:
var destinationFile = @"c:\insert.csv";
var querieImportCSV = "BULK INSERT dbo.TABLE FROM '" + destinationFile + "' WITH ( FIELDTERMINATOR = ';', ROWTERMINATOR = '\n', FIRSTROW = 1)";
var truncate = @"TRUNCATE TABLE dbo.TABLE";
string queryResult =
@"SELECT [Year]
,[Month]
,[Week]
,[Entity]
,[Account]
,[C11]
,[C12]
,[C21]
,[C22]
,[C3]
,[C4]
,[CTP]
,[VALUTA]
,SUM(AMOUNT) as AMOUNT
,[CURRENCY_ORIG]
,[AMOUNTEXCH]
,[AGENTCODE]
FROM dbo.TABLE
GROUP BY YEAR, MONTH, WEEK, Entity, Account, C11, C12, C21, C22, C3, C4, CTP, VALUTA, CURRENCY_ORIG, AMOUNTEXCH, AGENTCODE
ORDER BY Account";
var conn = new SqlConnection(connectionString);
conn.Open();
SqlCommand commandTruncate = new SqlCommand(truncate, conn);
commandTruncate.ExecuteNonQuery();
SqlCommand commandInsert = new SqlCommand(querieImportCSV, conn);
SqlDataReader readerInsert = commandInsert.ExecuteReader();
readerInsert.Close();
FillData();
SqlCommand commandResult = new SqlCommand(queryResult, conn);
SqlDataReader readerResult = commandResult.ExecuteReader();
StringBuilder sb = new StringBuilder();
while (readerResult.Read())
{
// AppendLine so that each result row ends up on its own line in the output file
sb.AppendLine(readerResult["Year"] + ";" + readerResult["Month"] + ";" + readerResult["Week"] + ";" + readerResult["Entity"] + ";" + readerResult["Account"] + ";" +
readerResult["C11"] + ";" + readerResult["C12"] + ";" + readerResult["C21"] + ";" + readerResult["C22"] + ";" + readerResult["C3"] + ";" + readerResult["C4"] + ";" +
readerResult["CTP"] + ";" + readerResult["Valuta"] + ";" + readerResult["Amount"] + ";" + readerResult["CURRENCY_ORIG"] + ";" + readerResult["AMOUNTEXCH"] + ";" + readerResult["AGENTCODE"]);
}
sb.Replace("\"","");
StreamWriter sw = new StreamWriter(homedrive);
sw.WriteLine(sb);
readerResult.Close();
conn.Close();
sw.Close();
sw.Dispose();
I am getting data from a CSV file through my Web Api with this code
private List<Item> items = new List<Item>();
public ItemRepository()
{
string filename = HttpRuntime.AppDomainAppPath + "App_Data\\items.csv";
var lines = File.ReadAllLines(filename).Skip(1).ToList();
for (int i = 0; i < lines.Count; i++)
{
var line = lines[i];
var columns = line.Split('$');
//get rid of newline characters in the middle of data lines
while (columns.Length < 9)
{
i += 1;
line = line.Replace("\n", " ") + lines[i];
columns = line.Split('$');
}
//Remove Starting and Trailing open quotes from fields
columns = columns.Select(c => { if (string.IsNullOrEmpty(c) == false) { return c.Substring(1, c.Length - 2); } return string.Empty; }).ToArray();
var temp = columns[5].Split('|', '>');
items.Add(new Item()
{
Id = int.Parse(columns[0]),
Name = temp[0],
Description = columns[2],
Photo = columns[7]
});
}
}
But the CSV file returned data with special characters instead of an apostrophe.
For example, in the CSV file there are values such as There’s which should be "There's" or "John’s" which should be "John's".
This ’ is there instead of an apostrophe.
How do I get rid of this to just show my apostrophe.
This kind of data is being returned in
Name = temp[0],
Description = columns[2],
You can use the HttpUtility.HtmlDecode to convert the characters. Here's an example:
var withEncodedChars = "For example in the CSV file the are values such as There’s which should be There's or John’s which should be John's. This ’ is there instead of an apostrophe.";
Console.WriteLine(HttpUtility.HtmlDecode(withEncodedChars));
If you run this in a console app it outputs:
For example in the CSV file the are values such as There's which should be There's or John's which should be John's. This ' is there instead of an apostrophe.