Issue renaming two columns in a CSV file instead of one - c#

I need to be able to rename the column in a spreadsheet from 'idn_prod' to 'idn_prod1', but there are two columns with this name.
I have tried implementing code from similar posts, but I've only been able to update both columns. Below you'll find the code I have that just renames both columns.
//locate and edit column in csv
string file1 = @"C:\Users\username\Documents\AppDevProjects\import.csv";
string[] lines = System.IO.File.ReadAllLines(file1);
System.IO.StreamWriter sw = new System.IO.StreamWriter(file1);
foreach(string s in lines)
{
sw.WriteLine(s.Replace("idn_prod", "idn_prod1"));
}
I expect only the 2nd column to be renamed, but the actual output is that both are renamed.
Here are the first couple rows of the CSV:

I'm assuming that you only need to update the column header, the actual rows need not be updated.
// Requires: using System.Linq; (for Skip)
var file1 = @"test.csv";
var lines = System.IO.File.ReadAllLines(file1);
var columnHeaders = lines[0];
var textToReplace = "idn_prod";
var newText = "idn_prod1";
// LastIndexOf ensures that you pick the second (last) idn_prod in the header.
var indexToReplace = columnHeaders.LastIndexOf(textToReplace);
// Remove the second idn_prod and insert the updated value in its place.
columnHeaders = columnHeaders
    .Remove(indexToReplace, textToReplace.Length)
    .Insert(indexToReplace, newText);
using (System.IO.StreamWriter sw = new System.IO.StreamWriter(file1))
{
    sw.WriteLine(columnHeaders);
    foreach (var str in lines.Skip(1))
    {
        sw.WriteLine(str);
    }
}

Replace the foreach (string s in lines) loop with a for loop over lines.Length so that you have the line index, and rename only the 2nd column in the header line (index 0).

I believe the only way to handle this properly is to crack the header line (first string that has column names) into individual parts, separated by commas or tabs or whatever, and run through the columns one at a time yourself.
Your loop would consider the first line from the file, use the Split function on the delimiter, and look for the column you're interested in:
bool headerSeen = false;
foreach (string s in lines)
{
if (!headerSeen)
{
// special: this is the header
string[] parts = s.Split('\t');
for (int i = 0; i < parts.Length; i++)
{
if (parts[i] == "idn_prod")
{
// only fix the *first* one seen
parts[i] = "idn_prod1";
break;
}
}
sw.WriteLine( string.Join("\t", parts));
headerSeen = true;
}
else
{
sw.WriteLine( s );
}
}
The only reason this is even remotely possible is that it's the header and not the individual lines; headers tend to be more predictable in format, and you worry less about quoting and fields that contain the delimiter, etc.
Trying this on the individual data lines will rarely work reliably: if your delimiter is a comma, what happens if an individual field contains a comma? Then you have to worry about quoting, and that opens up all kinds of fun.
For doing any real CSV work in C#, it's really worth looking into a package that specializes in this, and I've been thrilled with CsvHelper from Josh Close. Highly recommended.
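Since the question asks for the second idn_prod column rather than the first, here is a minimal sketch of the same split-the-header idea with a configurable occurrence. The class name HeaderRenamer, the method RenameNth and its parameters are made up for illustration, and it assumes the header contains no quoted fields.

```csharp
using System;

public static class HeaderRenamer
{
    // Renames the nth (1-based) column whose name equals oldName exactly.
    // Returns the header unchanged if there are fewer than n such columns.
    // Assumption: the header has no quoted fields containing the delimiter.
    public static string RenameNth(string headerLine, char delimiter,
                                   string oldName, string newName, int occurrence)
    {
        string[] parts = headerLine.Split(delimiter);
        int seen = 0;
        for (int i = 0; i < parts.Length; i++)
        {
            if (parts[i] == oldName && ++seen == occurrence)
            {
                parts[i] = newName;
                break;
            }
        }
        return string.Join(delimiter.ToString(), parts);
    }
}
```

You would call it on lines[0] only, for example RenameNth(lines[0], ',', "idn_prod", "idn_prod1", 2), and then write all the lines back unchanged.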

Related

How Can I Read a Multiline Field from a CSV Without Altering It?

I have a CSV that looks like this. My goal is to extract each entry (notice I said entry, not line), where an entry starts from the first column and stretches to the last column, and may span multiple lines. I'd like to extract an entry without ruining the formatting. For example, I do not want the following to be considered four separate lines,
Eg. 1, One Column Multiple Lines
...,"1. copy ctor
2. copy ctor
3. declares function
4. default ctor",... // Where ... represents the columns before and after
but rather a column in one entry that can be represented as such
Eg. 2, One Column Single Line
"1. copy ctor\n2. copy ctor\n3. declares function\n4. default ctor"
When I iterate over the CSV, as such, I get Eg. 1. I'm not sure why splitting on a comma is treating a new line as a comma.
using (var streamReader = new StreamReader("results-survey111101.csv"))
{
string line;
while ((line = streamReader.ReadLine()) != null)
{
string[] splitLine = line.Split(',');
foreach (var column in splitLine)
Console.WriteLine(column);
}
}
If someone can show me what I need to do to get these multi line CSV columns into one line that maintains the formatting (e.g. adds \t or \n where necessary) that would be great. Thanks!
Assuming your source file is valid CSV, variability in the data is really hard to account for. That's all I'll say, but I'll link you to another SO answer if you need convincing that writing your own CSV parser is a horrible task: "Reading CSV files using C#".
Let's assume you are going to take advantage of an existing CSV reader library. I'll use TextFieldParser from the Microsoft.VisualBasic library as is used in the example answer I linked.
Your task is to read your source file line by line, and validate whether the line is a complete CSV entry on its own, or if it forms part of a broken line.
If it forms part of a broken line, we need to remember the line and add the next line to it before attempting validation again.
For this we need to know one thing:
What is the expected number of fields each data entry row should have?
int expectedFieldCount = 7;
string brokenLine = "";
using (var streamReader = new StreamReader("results-survey111101.csv"))
{
string line;
while ((line = streamReader.ReadLine()) != null) // read the next line
{
// if the previous line was incomplete, add it to the current line,
// otherwise use the current line
string csvLineData = (brokenLine.Length > 0) ? brokenLine + line : line;
try
{
using (StringReader stringReader = new StringReader(csvLineData ))
using (TextFieldParser parser = new TextFieldParser(stringReader))
{
parser.SetDelimiters(",");
while (!parser.EndOfData)
{
string[] fields = parser.ReadFields(); // tests if the line is valid csv
if (expectedFieldCount == fields.Length)
{
// do whatever you want with the fields now.
foreach (var field in fields)
{
Console.WriteLine(field);
}
brokenLine = ""; // reset the brokenLine
}
else // it was valid csv, but we don't have the required number of fields yet
{
brokenLine += line + @"\r\n";
break;
}
}
}
}
catch (Exception ex) // the current line is NOT valid csv, update brokenLine
{
brokenLine += (line + @"\r\n");
}
}
}
I am replacing the line breaks that broken lines contain with \r\n literals. You can display these in your resulting one-liner field however you want. But you shouldn't expect to be able to copy paste the result into notepad and see line breaks.
One assumes you have the same number of columns in each record. Therefore in your code where you do your Split you can merely sum the length of splitLine into a running columnsReadCount until they equal the desired columnsPerRecordCount. At that point you have read all the record and can reset the running columnsReadCount back to zero ready for the next record to read.
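If you don't know the field count up front, another option is to glue physical lines together until the double quotes balance. This is only a sketch: it assumes multi-line fields are wrapped in double quotes and that embedded quotes are escaped by doubling them, so a complete record always contains an even number of quote characters. CsvRecordReader and ReadRecords are names I made up.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvRecordReader
{
    // Yields one logical record per iteration. A record is treated as complete
    // once it contains an even number of double-quote characters.
    public static IEnumerable<string> ReadRecords(TextReader reader)
    {
        var record = new StringBuilder();
        int quoteCount = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (record.Length > 0)
                record.Append(@"\n"); // literal backslash-n marks the original line break
            record.Append(line);

            foreach (char c in line)
                if (c == '"') quoteCount++;

            if (quoteCount % 2 == 0)
            {
                yield return record.ToString();
                record.Clear();
                quoteCount = 0;
            }
        }
        // Input ended inside a quoted field: return what we have.
        if (record.Length > 0)
            yield return record.ToString();
    }
}
```

Each string it returns is one full entry on a single line, which you can then hand to a real CSV parser such as TextFieldParser to split into fields.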

Convert txt with different number of spaces into xls file

I tried searching for a solution here but I can't seem to find any answers. I have a textfile that appears like this:
Nmr_test 101E-6 PASSED PASSED PASSED PASSED
Dc_volts 10V_100 CAL_+10V +9.99999000 +10.0000100 +9.99999740 +9.99999727
Dcv_lin 10V_6U 11.5 +0.0000E+000 +7.0000E+000 +2.0367E+001 +2.7427E+001
Dcv_lin 10V_6U 3 +0.0000E+000 +5.0000E+000 +1.3331E+001 +1.8872E+001
I have to convert this textfile to an Excel/xls file but I can't figure out how to insert them to the correct excel columns as they have different number of spaces in between columns. I've tried using this code below which is using space as a separator but it fails of course due to the varying number of spaces between the columns:
var lines = File.ReadAllLines(string.Concat(Directory.GetCurrentDirectory(), "\\Temp_textfile.txt"));
var rowcounter = 1;
foreach(var line in lines)
{
var columncounter = 1;
var values = line.Split(' ');
foreach(var value in values)
{
excelworksheet.Cells[rowcounter, columncounter] = new Cell(value);
columncounter++;
}
rowcounter++;
}
excelworkbook.Worksheets.Add(excelworksheet);
excelworkbook.Save(string.Concat(Directory.GetCurrentDirectory(), "\\Exported_excelfile.xls"));
Any advice?
EDIT: Got it working using Substring to select each column by its fixed width.
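For anyone whose file has no empty columns and no values containing spaces, splitting on runs of whitespace may be enough. A minimal sketch (WhitespaceSplitter and SplitColumns are hypothetical names); note that it cannot detect a blank column, which is where the fixed-width Substring approach is the safer choice.

```csharp
using System;

public static class WhitespaceSplitter
{
    public static string[] SplitColumns(string line)
    {
        // A null separator array makes Split break on any whitespace character;
        // RemoveEmptyEntries collapses a run of spaces into a single boundary.
        return line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

The resulting array can be fed to the same inner foreach that fills the worksheet cells.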

Error trying to read csv file

Good day,
I am having trouble reading CSV files in my ASP.NET project.
It always returns the error "index out of range: cannot find column 6".
Before I go on explaining what I did, here is the code:
string savepath;
HttpPostedFile postedFile = context.Request.Files["Filedata"];
savepath = context.Server.MapPath("files");
string filename = postedFile.FileName;
todelete = savepath + @"\" + filename;
string forex = savepath + @"\" + filename;
postedFile.SaveAs(savepath + @"\" + filename);
DataTable tblcsv = new DataTable();
tblcsv.Columns.Add("latitude");
tblcsv.Columns.Add("longitude");
tblcsv.Columns.Add("mps");
tblcsv.Columns.Add("activity_type");
tblcsv.Columns.Add("date_occured");
tblcsv.Columns.Add("details");
string ReadCSV = File.ReadAllText(forex);
foreach (string csvRow in ReadCSV.Split('\n'))
{
if (!string.IsNullOrEmpty(csvRow))
{
//Adding each row into datatable
tblcsv.Rows.Add();
int count = 0;
foreach (string FileRec in csvRow.Split('-'))
{
tblcsv.Rows[tblcsv.Rows.Count - 1][count] = FileRec;
count++;
}
}
}
I tried using comma-separated columns, but the text itself contains commas, so I tried the - symbol instead to make sure there are no excess commas in the text file. The same error still pops up.
Am I doing something wrong?
Thank you in advance.
Your file might have more than 6 fields in one or more rows. In that case the split in the inner foreach produces more values than tblcsv has columns, so there is no column to assign the extra values to.
Try something like this:
foreach (string FileRec in csvRow.Split('-'))
{
    if (count > 5)
        break; // ignore extra fields; a return here would abandon the whole file
    tblcsv.Rows[tblcsv.Rows.Count - 1][count] = FileRec;
    count++;
}
However it would be better if you check for additional columns before processing and handle the issue.
StringBuilder errors = new StringBuilder(); // collects the rows that have more than 6 fields
foreach (string csvRow in ReadCSV.Split('\n'))
{
    if (!string.IsNullOrEmpty(csvRow))
    {
        DataRow dr = tblcsv.NewRow();
        bool rowIsValid = true;
        int count = 0;
        foreach (string FileRec in csvRow.Split('-'))
        {
            try
            {
                dr[count] = FileRec;
            }
            catch (IndexOutOfRangeException)
            {
                // More fields than columns: remember the row and skip it.
                errors.AppendLine(csvRow);
                rowIsValid = false;
                break;
            }
            count++;
        }
        // Add the row once, after all of its fields have been set.
        if (rowIsValid)
            tblcsv.Rows.Add(dr);
    }
}
Now we know which CSV rows are causing the errors, and the rest are processed successfully. Check each row recorded in errors to see whether it is the input you intended; if not, correct the value in the CSV file.
You can't treat the file as a CSV if the delimiter appears inside a field. In this case you can use a regular expression to extract the first five fields up to the dash, then read the rest of the line as the sixth field. With a regex you can match the entire string and even avoid splitting lines.
Regular expressions can also be faster than splits and use less memory on large files, because they don't have to create a temporary string for every field. That's one reason they are used extensively to parse log files. The ability to capture fields by name doesn't hurt either.
The following sample parses the entire file and captures each field in a named group. The last field captures everything to the end of the line:
var pattern = "^(?<latitude>.*?)-(?<longitude>.*?)-(?<mps>.*?)-(?<activity_type>.*?)-" +
              "(?<date_occured>.*?)-(?<details>.*)$";
var regex = new Regex(pattern, RegexOptions.Multiline);
var matches = regex.Matches(ReadCSV); // match against the file contents, not the file path
foreach (Match match in matches)
{
    DataRow dr = tblcsv.NewRow();
    dr["latitude"] = match.Groups["latitude"].Value;
    dr["longitude"] = match.Groups["longitude"].Value;
    ...
    tblcsv.Rows.Add(dr);
}
The (?<latitude>.*?)- pattern captures everything up to the first dash into a group named latitude. The .*? pattern means the matching isn't greedy, i.e. it won't try to capture everything to the end of the line but will stop when the first - is encountered.
The column names match the field names, which means you can add all fields with a loop:
foreach (Match match in matches)
{
    var row = tblcsv.NewRow();
    // Assumes every column has a named group with the same name
    // (so the last group must be called details, like the column).
    foreach (DataColumn col in tblcsv.Columns)
    {
        row[col.ColumnName] = match.Groups[col.ColumnName].Value;
    }
    tblcsv.Rows.Add(row);
}
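Putting the pieces together, a self-contained sketch of the regex approach might look like the following. DashDelimitedParser and Parse are names I invented. The pattern also adds an optional \r before the end of the line, on the assumption that the file may use Windows line endings. It still assumes the first five fields never contain a dash themselves, so negative coordinates or dates written as 2015-01-02 would break it.

```csharp
using System;
using System.Data;
using System.Text.RegularExpressions;

public static class DashDelimitedParser
{
    private static readonly string[] ColumnNames =
        { "latitude", "longitude", "mps", "activity_type", "date_occured", "details" };

    // Five lazily matched fields separated by dashes; the sixth group takes
    // the rest of the line, including any further dashes.
    private static readonly Regex LinePattern = new Regex(
        @"^(?<latitude>.*?)-(?<longitude>.*?)-(?<mps>.*?)-(?<activity_type>.*?)-" +
        @"(?<date_occured>.*?)-(?<details>.*?)\r?$",
        RegexOptions.Multiline);

    public static DataTable Parse(string text)
    {
        var table = new DataTable();
        foreach (string name in ColumnNames)
            table.Columns.Add(name);

        foreach (Match match in LinePattern.Matches(text))
        {
            DataRow row = table.NewRow();
            foreach (string name in ColumnNames)
                row[name] = match.Groups[name].Value;
            table.Rows.Add(row);
        }
        return table;
    }
}
```

Lines with fewer than five dashes simply don't match and are skipped, so you may want to count matches against lines if silent skipping is a problem.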

How to display data from text file into many columns?

I have a text file which consists of many rows and 18 columns of data separated by tabs. I used this code and it displays the entire data in a single column. What I need is for the data to be displayed in columns.
public static List<string> ReadDelimitedFile(string docPath)
{
var sepList = new List<string>();
// Read the file and display it line by line.
using (StreamReader file = new StreamReader(docPath))
{
string line;
while ((line = file.ReadLine()) != null)
{
var delimiters = new char[] { '\t' };
var segments = line.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
foreach (var segment in segments)
{
//Console.WriteLine(segment);
sepList.Add(segment);
}
}
file.Close();
}
// Suspend the screen.
Console.ReadLine();
return sepList;
}
You're outputting everything in one column like this (pseudo-code, to illustrate structure):
while (reading lines)
for (reading entries)
WriteLine(entry)
That is, for every line in the file and for every entry in that line, you output a new line. Instead, you want to only write a new line for every line in the file, and write the entries with separators (tabs?). Something more like this:
while (reading lines)
for (reading entries)
Write(entry)
WriteLine(newline)
That way all the entries for any given line in the file are on the same line in the output.
How you delimit those entries in the output is up to you, of course. And writing the line break could be as simple as Console.WriteLine(), though I bet there are lots of other ways to do it.
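In C#, the second structure could be sketched roughly like this. ColumnPrinter and Print are hypothetical names, and padding each entry to 12 characters is just an assumed way to line the columns up; pick whatever separator suits your data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ColumnPrinter
{
    // Writes every tab-separated entry of a line on the same output line,
    // then ends the output line once per input line.
    public static void Print(IEnumerable<string> lines, TextWriter output)
    {
        foreach (string line in lines)
        {
            string[] segments = line.Split(new[] { '\t' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string segment in segments)
            {
                output.Write(segment.PadRight(12)); // Write, not WriteLine
            }
            output.WriteLine(); // one line break per input line
        }
    }
}
```

Calling ColumnPrinter.Print(File.ReadLines(docPath), Console.Out) would print the file to the console in rows and columns.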
18 columns would seem to be served best by using a DataGridView.
// Create your DataGridView with the 18 columns using your designer.
int col = 0;
foreach (var segment in segments)
{
    //Console.WriteLine(segment);
    //sepList.Add(segment);
    dataGridView1.Rows[whateverRow].Cells[col].Value = segment;
    col++; // move to the next column for the next segment
}
So according to your code, you have a following loop:
while{
<reads the lines one by one>
for each line{
<reading each segment and adding to the list.>
}
}
Your code reads each segment of a line and appends it to a single list. Ideally you should have 18 lists for the 18 columns. In Java this problem can be solved with a HashMap:
Map<String, List<String>> hmp = new HashMap<>();
String line;
while ((line = reader.readLine()) != null) { // reader is a BufferedReader over the file
    String[] segments = line.split("\t");
    for (int i = 0; i < segments.length; i++) {
        // column1 collects the first segment of every line, column2 the second, and so on
        hmp.computeIfAbsent("column" + (i + 1), k -> new ArrayList<>()).add(segments[i]);
    }
}
return hmp;
so hmp.get("column1") holds every value from the first column, hmp.get("column2") every value from the second, and so on.
Hope it helps.
You should be using DataTable or similar type for that but if you want to use List you can "emulate" rows and columns like this:
var rows = new List<List<string>>();
foreach(var line in File.ReadAllLines(docPath))
{
var columns = line.Split(new char[] { '\t' }, StringSplitOptions.RemoveEmptyEntries).ToList();
rows.Add(columns);
}
That will give you row/column like structure
foreach(var row in rows)
{
foreach(var column in row)
{
Console.Write(column + ",");
}
Console.WriteLine();
}

Code does not execute

I know I've been a bit of pain, the last couple of days, that is, with all my questions, but I've been developing this project and I'm (figuratively) inches away from finishing it.
That being said, I would like your help on one more matter. It kind of relates to my previous questions, but you do not need the code for those. The problem lies exactly on this bit of code. What I want from you is to help me identify it and, consequently, solve it.
Before I show you the code I'd been working on, I'd like to say a few extra things:
My application has a file merging feature, merging two files together and handling duplicate entries.
In any given file, each line can have one of these four formats (the last three are optional): Card Name|Amount, .Card Name|Amount, ..Card Name|Amount, _Card Name|Amount.
If a line is not appropriately formatted, the program will skip it (ignore it altogether).
So, basically, a sample file could be as follows:
Blue-Eyes White Dragon|3
..Blue-Eyes Ultimate Dragon|1
.Dragon Master Knight|1
_Kaibaman|1
Now, when it comes to using the file merger, if a line starts with one of the special characters . .. _, it should act accordingly. For ., it operates normally. For lines starting with .., it moves the index to the second dot and, finally, it ignores _ lines completely (they have another use not related to this discussion).
Here is my code for the merge function (for some odd reason, the code inside the second loop won't execute at all):
if (openFileDialog1.ShowDialog() == DialogResult.OK)
{
// Save file names to array.
string[] fileNames = openFileDialog1.FileNames;
// Loop through the files.
foreach (string fileName in fileNames)
{
// Save all lines of the current file to an array.
string[] lines = File.ReadAllLines(fileName);
// Loop through the lines of the file.
for (int i = 0; i < lines.Length; i++)
{
// Split current line.
string[] split = lines[i].Split('|');
// If the current line is badly formatted, skip to the next one.
if (split.Length != 2)
continue;
string title = split[0];
string times = split[1];
if (lines[i].StartsWith("_"))
continue;
// If newFile (list used to store contents of the card resource file) contains the current line of the file that we're currently looping through...
for (int k = 0; k < newFile.Count; k++)
{
if (lines[i].StartsWith(".."))
{
string newTitle = lines[i].Substring(
lines[i].IndexOf("..") + 1);
if (newFile[k].Contains(newTitle))
{
// Split the line once again.
string[] secondSplit = newFile.ElementAt(
newFile.IndexOf(newFile[k])).Split('|');
string secondTimes = secondSplit[1];
// Replace the newFile element at the specified index.
newFile[newFile.IndexOf(newFile[k])] =
string.Format("{0}|{1}", newTitle, int.Parse(times) + int.Parse(secondTimes));
}
// If newFile does not contain the current line of the file we're looping through, just add it to newFile.
else
newFile.Add(string.Format(
"{0}|{1}",
newTitle, times));
continue;
}
if (newFile[k].Contains(title))
{
string[] secondSplit = newFile.ElementAt(
newFile.IndexOf(newFile[k])).Split('|');
string secondTimes = secondSplit[1];
newFile[newFile.IndexOf(newFile[k])] =
string.Format("{0}|{1}", title, int.Parse(times) + int.Parse(secondTimes));
}
else
{
newFile.Add(string.Format("{0}|{1}", title, times));
}
}
}
}
// Overwrite resources file with newFile.
using (StreamWriter sw = new StreamWriter("CardResources.ygodc"))
{
foreach (string line in newFile)
sw.WriteLine(line);
}
I know this is quite a long piece of code, but I believe all of it is relevant to a point. I skipped some unimportant bits (after all of this is executed) as they are completely irrelevant.
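One thing worth checking: if newFile is empty when the merge starts, k < newFile.Count is false on the very first check, so the body of the for (int k ...) loop never runs and nothing is ever added to newFile. Whether that is what happens here is an assumption, since the code that fills newFile is not shown. A sketch that sidesteps the inner loop by keeping totals in a dictionary is below. CardMerger and Merge are made-up names, and it matches titles exactly rather than with Contains, which is a deliberate change from the code above.

```csharp
using System;
using System.Collections.Generic;

public static class CardMerger
{
    // Adds the amounts from "Title|Amount" lines into totals, keyed by title.
    public static Dictionary<string, int> Merge(IEnumerable<string> lines,
                                                Dictionary<string, int> totals = null)
    {
        totals = totals ?? new Dictionary<string, int>();
        foreach (string line in lines)
        {
            // Lines starting with _ are ignored completely.
            if (line.StartsWith("_", StringComparison.Ordinal))
                continue;

            // Badly formatted lines are skipped.
            string[] split = line.Split('|');
            int amount;
            if (split.Length != 2 || !int.TryParse(split[1], out amount))
                continue;

            // For ".." lines, move past the first dot so the title keeps one dot.
            string title = split[0].StartsWith("..", StringComparison.Ordinal)
                ? split[0].Substring(1)
                : split[0];

            int current;
            totals.TryGetValue(title, out current); // current stays 0 if the title is new
            totals[title] = current + amount;
        }
        return totals;
    }
}
```

You could call Merge once per selected file, passing the same dictionary each time, and then write each entry back out as title + "|" + amount.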
