Error trying to read csv file - c#

Good day,
I am having trouble reading CSV files in my ASP.NET project.
It always fails with the error "index out of range, cannot find column 6".
Before I explain what I tried, here is the code:
string savepath;
HttpPostedFile postedFile = context.Request.Files["Filedata"];
savepath = context.Server.MapPath("files");
string filename = postedFile.FileName;
todelete = savepath + @"\" + filename;
string forex = savepath + @"\" + filename;
postedFile.SaveAs(savepath + @"\" + filename);
DataTable tblcsv = new DataTable();
tblcsv.Columns.Add("latitude");
tblcsv.Columns.Add("longitude");
tblcsv.Columns.Add("mps");
tblcsv.Columns.Add("activity_type");
tblcsv.Columns.Add("date_occured");
tblcsv.Columns.Add("details");
string ReadCSV = File.ReadAllText(forex);
foreach (string csvRow in ReadCSV.Split('\n'))
{
if (!string.IsNullOrEmpty(csvRow))
{
//Adding each row into datatable
tblcsv.Rows.Add();
int count = 0;
foreach (string FileRec in csvRow.Split('-'))
{
tblcsv.Rows[tblcsv.Rows.Count - 1][count] = FileRec;
count++;
}
}
}
I first tried comma-separated columns, but the text in the last field contains commas, so I switched to the - symbol to make sure there are no excess delimiters in the text file. The same error still pops up.
Am I doing something wrong?
Thank you in advance.

Your CSV file might have more than 6 fields in one or more rows. In that case the split in the inner foreach produces more values than tblcsv has columns, so there is no column to assign the extra values to.
Try something like this:
foreach (string FileRec in csvRow.Split('-'))
{
if (count > 5)
break; // stop at the sixth column; return would leave the whole method
tblcsv.Rows[tblcsv.Rows.Count - 1][count] = FileRec;
count++;
}
However it would be better if you check for additional columns before processing and handle the issue.

StringBuilder errors = new StringBuilder(); // collects the rows that have more than 6 fields
foreach (string csvRow in ReadCSV.Split('\n'))
{
if (!string.IsNullOrEmpty(csvRow))
{
// Build the row first and add it to the table only if every field fits
DataRow dr = tblcsv.NewRow();
bool valid = true;
int count = 0;
foreach (string FileRec in csvRow.Split('-'))
{
try
{
dr[count] = FileRec;
}
catch (IndexOutOfRangeException)
{
errors.AppendLine(csvRow);
valid = false;
break;
}
count++;
}
if (valid)
{
tblcsv.Rows.Add(dr);
}
}
}
Now we know which CSV rows cause the error, and the remaining rows are processed successfully. Check whether each row collected in errors is the input you intended; if not, correct the values in the CSV file.

You can't treat the file as a CSV if the delimiter appears inside a field. In this case you can use a regular expression to extract the first five fields up to the dash, then read the rest of the line as the sixth field. With a regex you can match the entire string and even avoid splitting lines.
Regular expressions can also be fast and light on memory because they don't have to create a temporary array of strings for every line, which is one reason they are used extensively to parse log files. The ability to capture fields by name doesn't hurt either.
The following sample parses the entire file and captures each field in a named group. The last field captures everything to the end of the line:
var pattern="^(?<latitude>.*?)-(?<longitude>.*?)-(?<mps>.*?)-(?<activity_type>.*?)-" +
"(?<date_occured>.*?)-(?<details>.*)$";
var regex=new Regex(pattern,RegexOptions.Multiline);
// Match against the file contents (ReadCSV), not the file path (forex)
var matches=regex.Matches(ReadCSV);
foreach (Match match in matches)
{
DataRow dr = tblcsv.NewRow();
dr["latitude"]=match.Groups["latitude"].Value;
dr["longitude"]=match.Groups["longitude"].Value;
...
tblcsv.Rows.Add(dr);
}
The (?<latitude>.*?)- pattern captures everything up to the first dash into a group named latitude. The .*? pattern means the matching isn't greedy ie it won't try to capture everything to the end of the line but will stop when the first - is encountered.
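To make the difference between the two quantifiers concrete, here is a minimal sketch. The class name, method name, and sample line are made up for illustration; only Regex.Match and named groups come from the answer above.

```csharp
using System.Text.RegularExpressions;

public static class LazyVsGreedy
{
    // Returns the text captured before a dash, using a lazy or greedy quantifier.
    public static string FirstField(string line, bool lazy)
    {
        // ".*?" stops at the first dash; ".*" runs on to the last dash.
        string pattern = lazy ? "^(?<field>.*?)-" : "^(?<field>.*)-";
        return Regex.Match(line, pattern).Groups["field"].Value;
    }
}
```

For the hypothetical line "14.5-121.0-rest", the lazy form captures "14.5" while the greedy form captures "14.5-121.0".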
The column names match the field names, which means you can add all fields with a loop:
foreach (Match match in matches)
{
var row = tblcsv.NewRow();
// Each group is named after a column, so look the value up by column name
foreach (DataColumn col in tblcsv.Columns)
{
row[col.ColumnName]=match.Groups[col.ColumnName].Value;
}
tblcsv.Rows.Add(row);
}
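The same "first five fields, then the rest of the line" idea can also be sketched without a regex, using the Split overload that takes a maximum count. This is only a sketch under stated assumptions: only the last field may contain dashes (so no negative coordinates or dashed dates in the earlier fields), and the class and method names are hypothetical.

```csharp
using System;
using System.Data;

public static class DashDelimitedLoader
{
    // Splits each line into at most table.Columns.Count parts, so any extra
    // dashes stay inside the last field instead of overflowing the table.
    public static DataTable Load(string text, DataTable table)
    {
        var lines = text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            string[] parts = line.Split(new[] { '-' }, table.Columns.Count);
            DataRow row = table.NewRow();
            for (int i = 0; i < parts.Length; i++)
            {
                row[i] = parts[i];
            }
            table.Rows.Add(row);
        }
        return table;
    }
}
```

Because the count limit equals the number of columns, the index can never run past the last column, which is what caused the original "cannot find column 6" error.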

Related

Issue renaming two columns in a CSV file instead of one

I need to be able to rename the column in a spreadsheet from 'idn_prod' to 'idn_prod1', but there are two columns with this name.
I have tried implementing code from similar posts, but I've only been able to update both columns. Below you'll find the code I have that just renames both columns.
//locate and edit column in csv
string file1 = @"C:\Users\username\Documents\AppDevProjects\import.csv";
string[] lines = System.IO.File.ReadAllLines(file1);
System.IO.StreamWriter sw = new System.IO.StreamWriter(file1);
foreach(string s in lines)
{
sw.WriteLine(s.Replace("idn_prod", "idn_prod1"));
}
I expect only the 2nd column to be renamed, but the actual output is that both are renamed.
Here are the first couple rows of the CSV:
I'm assuming that you only need to update the column header, the actual rows need not be updated.
var file1 = @"test.csv";
var lines = System.IO.File.ReadAllLines(file1);
var columnHeaders = lines[0];
var textToReplace = "idn_prod";
var newText = "idn_prod1";
var indexToReplace = columnHeaders
.LastIndexOf(textToReplace);//LastIndexOf ensures that you pick the second idn_prod
columnHeaders = columnHeaders
.Remove(indexToReplace,textToReplace.Length)
.Insert(indexToReplace, newText);//I'm removing the second idn_prod and replacing it with the updated value.
using (System.IO.StreamWriter sw = new System.IO.StreamWriter(file1))
{
sw.WriteLine(columnHeaders);
foreach (var str in lines.Skip(1))
{
sw.WriteLine(str);
}
sw.Flush();
}
Replace the foreach (string s in lines) loop with a for loop over the line count, and rename only the 2nd column in the header line.
I believe the only way to handle this properly is to crack the header line (first string that has column names) into individual parts, separated by commas or tabs or whatever, and run through the columns one at a time yourself.
Your loop would consider the first line from the file, use the Split function on the delimiter, and look for the column you're interested in:
bool headerSeen = false;
foreach (string s in lines)
{
if (!headerSeen)
{
// special: this is the header
string [] parts = s.Split("\t");
for (int i = 0; i < parts.Length; i++)
{
if (parts[i] == "idn_prod")
{
// only fix the *first* one seen
parts[i] = "idn_prod1";
break;
}
}
sw.WriteLine( string.Join("\t", parts));
headerSeen = true;
}
else
{
sw.WriteLine( s );
}
}
The only reason this is even remotely possible is that it's the header and not the individual lines; headers tend to be more predictable in format, and you worry less about quoting and fields that contain the delimiter, etc.
Trying this on the individual data lines will rarely work reliably: if your delimiter is a comma, what happens if an individual field contains a comma? Then you have to worry about quoting, and this enters all kinds of fun.
For doing any real CSV work in C#, it's really worth looking into a package that specializes in this, and I've been thrilled with CsvHelper from Josh Close. Highly recommended.
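For the header-only case discussed above, a small helper can rename a chosen occurrence of a column name rather than the first one. This is a sketch, not a general CSV parser: it assumes the header contains no quoted fields that include the delimiter, and RenameNth is a made-up name.

```csharp
using System;

public static class HeaderRenamer
{
    // Renames the n-th (1-based) column called oldName in a delimited header line.
    // Returns the line unchanged when there are fewer than n such columns.
    public static string RenameNth(string headerLine, string oldName, string newName, int n, char delimiter)
    {
        string[] parts = headerLine.Split(delimiter);
        int seen = 0;
        for (int i = 0; i < parts.Length; i++)
        {
            if (parts[i] == oldName && ++seen == n)
            {
                parts[i] = newName;
                break;
            }
        }
        return string.Join(delimiter.ToString(), parts);
    }
}
```

Calling it with n = 2 on the first line of the file, then writing the remaining lines unchanged, renames only the second idn_prod column.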

Matching cell in rows with an input string

I am basically trying to read from an excel file and find a certain ID Number in that file. Right now it is printing all of the rows as a match and I would like help figuring out why.
// input to search for
string value = textBox3.Text;
// verify the input is the correct format
Match match = Regex.Match(value, @".*[0-9].*");
Match myMatch = Regex.Match(value, textBox3.Text);
Console.WriteLine(value);
foreach (DataRow row in xlsDs.Rows)
{
if (match.Success && myMatch.Success)
{
Console.WriteLine(textBox3);
Console.Write(row.ItemArray.ToString());
Console.WriteLine("This was found");
}
}
int rowCount = xlsDs.Rows.Count;
I would still use a foreach loop, then add a simple counter and increment it via counter++ every time you loop. When you find a match, you can add the counter value and the row data to a collection so you can reference them later on.
foreach is much safer than a for loop. There are times when a for loop is preferable, but I don't see this being one of those times.
You can solve it with the following code.
I think you want to match the value against one particular Excel column (e.g. ID), so put the comparison inside the for loop and read that column from each row:
string value = textBox3.Text;
Match match = Regex.Match(value, @".*[0-9].*");
Console.WriteLine(value);
int TotalRows = xlsDs.Rows.Count;
for(int i=0;i<TotalRows;i++)
{
DataRow row = xlsDs.Rows[i];
String row_Val=row["Cell_Name"].ToString();//Put Cell you want to match IE ID
Match myMatch = Regex.Match(row_Val, textBox3.Text);
if (match.Success && myMatch.Success)
{
Console.WriteLine(textBox3.Text);
// ItemArray.ToString() only prints the type name, so join the values instead
Console.WriteLine(string.Join(", ", row.ItemArray));
//Console.WriteLine(row["Cell_Name"]);//if you want to print a specific cell
Console.WriteLine("This was found at row "+i);
}
}
Your error isn't the for vs foreach loop, it's the matching you're doing. Try this instead.
You also were not reading the rows in correctly, you should only look at the one column that you need. Change the column variable below to the correct column.
The primary difference between this and your code is that you want to check each row in the iteration and then if it is a match, print a line saying so. This is compared to what you did originally, where you compare one string once and if that is a match, print that over and over for each row.
string columnName = "Employee ID"; // change to the correct header
// Check the ID from the textbox to make sure it is valid?
Match match = Regex.Match(textBox3.Text, @".*[0-9].*");
for(int i = 0; i < xlsDs.Rows.Count; i++)
{
// get the current row
DataRow row = xlsDs.Rows[i];
// get the ID from the row
string idValue = row[columnName].ToString();
// check if the row value is equal to the textbox entry
bool myMatch = idValue.Equals(textBox3.Text);
// if both of the above are true, do this
if (match.Success && myMatch == true)
{
Console.Write(idValue);
Console.WriteLine(" -This id was found");
}
}
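The per-row comparison described above can be pulled into a small helper that returns the indexes of the matching rows, so they can be used later. This is a sketch that assumes the data sits in a DataTable (as xlsDs appears to) and that an exact, case-sensitive match is wanted; RowFinder and FindRows are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class RowFinder
{
    // Returns the indexes of rows whose value in columnName equals id exactly.
    public static List<int> FindRows(DataTable table, string columnName, string id)
    {
        var found = new List<int>();
        for (int i = 0; i < table.Rows.Count; i++)
        {
            // Trim so that padded cells still compare equal to the typed ID.
            string cell = table.Rows[i][columnName].ToString().Trim();
            if (string.Equals(cell, id, StringComparison.Ordinal))
            {
                found.Add(i);
            }
        }
        return found;
    }
}
```

An empty result means no row matched, which avoids the original problem of every row being reported as a match.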

C# removing last delimited field from a string split

I am a beginner C# programmer and have a quick question about an application I am building. My process reads in multiple files with the purpose of stripping out specific records based on a pipe-delimited field in the text file that holds 1 or 0. It is actually the last delimited field in the file. If it is a 0, I write the record to a temp file (which will later replace the original that I read); if it is anything else, I do not. There are two types of records in the file: a header row, followed by a few supp rows. The header row is the only one that has the flag, so as you can tell from the code below, if the bool gets set to a good record by the flag being 0, it writes the header record along with all supp records below it until it hits a bad header, in which case it stops writing until the next good one.
However, what I am trying to do now (and would like to know the easiest way to do) is write the header record without the last pipe-delimited field (i.e. the flag). Since it should always be the last 2 characters of the row (for example "0|" or "1|", as the preceding pipe is needed), should it be a string trim on my inputRecord string? Is there an easier way? Is there a way to do a split on the record but not actually include the last field (in this case, field 36)? Any advice would be appreciated. Thank you.
static void Main(string[] args)
{
try
{
string executionDirectory = RemoveFlaggedRecords.Properties.Settings.Default.executionDirectory;
string workDirectory = RemoveFlaggedRecords.Properties.Settings.Default.workingDirectory;
string[] files = Directory.GetFiles(executionDirectory, "FilePrefix*");
foreach (string file in files)
{
string tempFile = Path.Combine(workDirectory,Path.GetFileName(file));
using (StreamReader sr = new StreamReader(file,Encoding.Default))
{
StreamWriter sw = new StreamWriter(tempFile);
string inputRecord = sr.ReadLine();
bool goodRecord = false;
bool isheaderRecord = false;
while (inputRecord != null)
{
string[] fields = inputRecord.Split('|');
if (fields[0].ToString().ToUpper() == "HEADER")
{
goodRecord = Convert.ToInt32(fields[36]) == 0;
isheaderRecord = true;
}
if (goodRecord == true && isheaderRecord == true)
{
// I'm not sure what to do here to write the string without the 36th field***
}
else if (goodRecord == true)
{
sw.WriteLine(inputRecord);
}
inputRecord = sr.ReadLine();
}
sr.Close();
sw.Close();
sw = null;
}
}
string[] newFiles = Directory.GetFiles(workDirectory, "fileprefix*");
foreach (string file in newFiles)
{
string tempFile = Path.Combine(workDirectory, Path.GetFileName(file));
string destFile = Path.Combine(executionDirectory, Path.GetFileName(file));
File.Copy(tempFile, destFile, true);
if (File.Exists(destFile))
{
File.Delete(tempFile);
}
}
}
catch (Exception ex)
{
Console.WriteLine(ex);
}
finally
{
// not done
}
}
One way you could do this - if what you want at that point in the code is to always write all but the final element in your string[] - is construct a for loop that terminates before the last item:
for (int i = 0; i < fields.Length - 1; i++)
{
// write your field here
}
This is assuming that you want to write each field individually, and that you want to iterate through fields in the first place. If all you want to do is just write a single string to a single line without using a loop, you could do this:
var truncatedFields = fields.Take(fields.Length - 1);
And then just write the truncatedFields sequence (Take returns an IEnumerable<string>, not a string[]) as you see fit. One way you could accomplish all this in a single line might look like so:
sw.WriteLine(String.Join("|", fields.Take(fields.Length - 1)));
goodRecord = fields.Last().Trim() == "0";
// keeps everything before the last pipe
if (inputRecord.Contains("|")) { string outputRecord = inputRecord.Substring(0, inputRecord.LastIndexOf("|")); }
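Since the question says each header row ends with the flag followed by a pipe ("0|" or "1|"), splitting on '|' leaves an empty element after the flag, so dropping only the last element may not remove the flag itself. The following is a sketch of a string-based alternative, assuming the flag is never empty and the row always ends with a single trailing pipe; FlagStripper is a made-up name.

```csharp
using System;

public static class FlagStripper
{
    // Removes the last pipe-delimited field from a record that ends with a
    // trailing pipe, e.g. "HEADER|a|b|0|" becomes "HEADER|a|b|".
    public static string RemoveTrailingFlag(string record)
    {
        // Ignore the final pipe, then cut back to the pipe before the flag.
        string trimmed = record.TrimEnd('|');
        int lastPipe = trimmed.LastIndexOf('|');
        return lastPipe < 0 ? record : trimmed.Substring(0, lastPipe + 1);
    }
}
```

In the question's loop, the header branch could then write sw.WriteLine(FlagStripper.RemoveTrailingFlag(inputRecord)) in place of the placeholder comment.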

How to display data from text file into many columns?

I have a text file which consists of many rows and 18 columns of data separated by tabs. I used this code, and it displays the entire data in a single column. What I need is for the data to be displayed in columns.
public static List<string> ReadDelimitedFile(string docPath)
{
var sepList = new List<string>();
// Read the file and display it line by line.
using (StreamReader file = new StreamReader(docPath))
{
string line;
while ((line = file.ReadLine()) != null)
{
var delimiters = new char[] { '\t' };
var segments = line.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
foreach (var segment in segments)
{
//Console.WriteLine(segment);
sepList.Add(segment);
}
}
file.Close();
}
// Suspend the screen.
Console.ReadLine();
return sepList;
}
You're outputting everything in one column like this (pseudo-code, to illustrate structure):
while (reading lines)
for (reading entries)
WriteLine(entry)
That is, for every line in the file and for every entry in that line, you output a new line. Instead, you want to only write a new line for every line in the file, and write the entries with separators (tabs?). Something more like this:
while (reading lines)
for (reading entries)
Write(entry)
WriteLine(newline)
That way all the entries for any given line in the file are on the same line in the output.
How you delimit those entries in the output is up to you, of course. And to write a carriage return could be as simple as Console.WriteLine(string.Empty), though I bet there are lots of other ways to do it.
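The second pseudo-code structure above might look like this in C#. It is only a sketch: the separator is the caller's choice, and TabFilePrinter and FormatLines are hypothetical names, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TabFilePrinter
{
    // Reads tab-delimited text and returns one output line per input line,
    // with the entries joined by the given separator.
    public static List<string> FormatLines(TextReader reader, string separator)
    {
        var output = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string[] segments = line.Split('\t');
            output.Add(string.Join(separator, segments));
        }
        return output;
    }
}
```

Passing a StreamReader opened on docPath and writing each returned string with Console.WriteLine gives one output line per line of the file.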
18 columns would seem to be served best by using a DataGridView.
// Create your DataGridView with the 18 columns using your designer.
int col = 0;
foreach (var segment in segments)
{
//Console.WriteLine(segment);
//sepList.Add(segment);
dataGridView1.Rows[whateverRow].Cells[col].Value = segment;
col++; // move on to the next column
}
So according to your code, you have the following loop:
while{
<reads the lines one by one>
for each line{
<reading each segment and adding to the list.>
}
}
Your code reads each segment of a line and appends it to a single list. Ideally you should have 18 lists for the 18 columns. In Java this problem can be solved with a HashMap:
Map<String, List<String>> hmp = new HashMap<>();
String line;
while ((line = reader.readLine()) != null) { // reader: a BufferedReader over the file
String[] segments = line.split("\t");
for (int i = 0; i < segments.length; i++) {
// one list per column, keyed "column1", "column2", ...
hmp.computeIfAbsent("column" + (i + 1), k -> new ArrayList<>()).add(segments[i]);
}
}
return hmp;
So the map holds one list for column1, one for column2, and so on.
Hope it helps.
You should be using DataTable or similar type for that but if you want to use List you can "emulate" rows and columns like this:
var rows = new List<List<string>>();
foreach(var line in File.ReadAllLines(docPath))
{
var columns = line.Split(new char[] { '\t' }, StringSplitOptions.RemoveEmptyEntries).ToList();
rows.Add(columns);
}
That will give you row/column like structure
foreach(var row in rows)
{
foreach(var column in row)
{
Console.Write(column + ",");
}
Console.WriteLine();
}
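For the DataTable route mentioned at the start of this answer, a minimal sketch could look like the following. The generic column names and the TabTableLoader class are assumptions made for illustration; the file's real column headings are not known from the question.

```csharp
using System;
using System.Data;
using System.IO;

public static class TabTableLoader
{
    // Builds a DataTable with generic column names ("Column1", "Column2", ...)
    // from tab-delimited text, adding columns as wider rows appear.
    public static DataTable Load(TextReader reader)
    {
        var table = new DataTable();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Keep empty entries so values stay in their original columns.
            string[] fields = line.Split('\t');
            while (table.Columns.Count < fields.Length)
            {
                table.Columns.Add("Column" + (table.Columns.Count + 1));
            }
            table.Rows.Add(fields);
        }
        return table;
    }
}
```

Unlike the List version, this keeps empty entries, so a blank cell does not shift later values into the wrong column. The resulting table can also be assigned to a DataGridView's DataSource if a grid display is wanted.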

C# Text file and regular expressions

I seem to be having a problem with the following file:
User Type 0: Database Administrator
Users of this Type:
Database Administrator DBA Can Authorise:Y Administrator:Y
DM3 Admin Account DM3 Can Authorise:Y Administrator:Y
Permissions for these users:
Data - Currencies Parameters - Database Add FRA Deal Reports - Confirmation Production
Add Currency Amend Database Parameters Cancel FRA Deal Reports - System Printer Definitions
Delete Currency Parameters - Data Retention Amend FRA Deal Save System Printers
Amend Currency Amend Data Retention Parameters Amend Settlements Only Custom Confs/Tickets
Amend Currency Rates Data - Rate References Verify FRA Deal Add Custom Confs/Tickets
Amend Currency Holidays Add Rate Reference Add FRA Deal (Restricted) Delete Custom Confs/Tickets
Add Nostro Delete Rate Reference Release FRA Deal Amend Custom Confs/Tickets
Amend Nostro Amend Rate Reference Deal - IRS Reports - System Report Batches
Delete Nostro Deal - Call Accounts Add IRS Deal Save System Batches
Data - Currency Pairs Open Call Account Cancel IRS Deal Reports - View Reports Spooled
Add Currency Pair Amend Call Account Amend IRS Deal View - Audits
Delete Currency Pair Close Call Account Amend Settlements Only Print Audit
Amend Currency Pair Amend Settlements Only Verify IRS Deal Print Audit Detail
Data - Books Data - Sales Relationship Mgrs Add IRS Deal (Restricted) Filter Audit
I am using a regular expression to check each line for a pattern. In total there are three patterns that need to match. If you look at the first three lines, that is all the information that needs to be taken from the file. The problem I'm having is that my regex is not matching. The information also needs to be taken from between two lines. How do I do that?
This is the code i have so far:
string path = @"C:/User Permissions.txt";
string t = File.ReadAllText(path);
//Uses regular expression check to match the specified string pattern
string pattern1 = @"User Type ";
string pattern2 = @"Users of this Type:";
string pattern3 = @"Permissions for these users:";
Regex rgx1 = new Regex(pattern1);
Regex rgx2 = new Regex(pattern2);
Regex rgx3 = new Regex(pattern3);
MatchCollection matches = rgx1.Matches(t);
List<string[]> test = new List<string[]>();
foreach (var match in matches)
{
string[] newString = match.ToString().Split(new string[] { @"User Type ", }, StringSplitOptions.RemoveEmptyEntries);
for (int i = 3; i <= newString.Length; i++)
{
test.Add(new string[] { newString[0], newString[1], newString[i - 1] });
}
}
MatchCollection matches2 = rgx2.Matches(t);
List<string[]> test2 = new List<string[]>();
foreach (var match2 in matches2)
{
string[] newString = match2.ToString().Split(new string[] { @"Permissions for these users: ", }, StringSplitOptions.RemoveEmptyEntries);
for (int i = 3; i <= newString.Length; i++)
{
test2.Add(new string[] { newString[0], newString[1], newString[i - 1] });
}
}
MatchCollection matches3 = rgx3.Matches(t);
List<string[]> test3 = new List<string[]>();
foreach (var match3 in matches3)
{
string[] newString = match3.ToString().Split(new string[] { @"Users of this Type: ", }, StringSplitOptions.RemoveEmptyEntries);
for (int i = 3; i <= newString.Length; i++)
{
test3.Add(new string[] { newString[0], newString[1], newString[i - 1] });
}
}
foreach (var line in test)
{
Console.WriteLine(line[0]);
Console.ReadLine();
}
Console.ReadLine();
Guffa's code seems very efficient compared to mine. The only problem I'm having now is how to extract the lines between "Users of this Type:" and "Permissions for these users:". How would I go about doing this? Obviously, checking whether the name begins on a new line won't help.
No, you are not checking each line for a pattern, you are looking for the pattern in the entire file as a single string, and you only get the exact text that matches, so when you split each result you end up with an array containing two empty strings.
If I understand correctly, each line consists of a key and a value, so there is not really any point in using regular expressions for this. Just loop through the lines and compare strings.
Here is a start:
string[] lines = File.ReadAllLines(@"C:/User Permissions.txt");
foreach (string line in lines) {
if (line.StartsWith("User Type ")) {
Console.WriteLine("User type:" + line.Substring(10));
} else if (line.StartsWith("Users of this Type:")) {
Console.WriteLine("Users:" + line.Substring(19));
} else if (line.StartsWith("Permissions for these users:")) {
Console.WriteLine("Permissions:" + line.Substring(28));
}
}
Edit:
Here is how to use a regular loop instead of a foreach, so that you can use an inner loop that reads lines:
string[] lines = File.ReadAllLines(@"C:/User Permissions.txt");
int line = 0;
while (line < lines.Length) {
if (lines[line].StartsWith("User Type ")) {
Console.WriteLine("User type:" + lines[line].Substring(10));
} else if (lines[line].StartsWith("Users of this Type:")) {
line++;
while (line < lines.Length && !lines[line].StartsWith("Permissions for these users:")) {
Console.WriteLine("User: " + lines[line]);
line++;
}
continue; // the inner loop stopped on the permissions line; handle it on the next pass
} else if (lines[line].StartsWith("Permissions for these users:")) {
Console.WriteLine("Permissions:" + lines[line].Substring(28));
}
line++;
}
You are not going to succeed in extracting the data that you want from this txt dump using regular expressions (and hardly with any other technique without investing too much effort).
The most important obstacle to using a regex that I can see is the fact that the information is actually listed in columns across the txt file.
The problem is best illustrated with the fact that the category
Data - Sales Relationship Mgrs
is in one column whereas all the permissions for that category are in the next column.
Please investigate whether this information can be obtained in a different way.
Still, here is a rough algorithmic strategy for dealing with the file as is:
Read the file line by line,
Look at predefined offsets into the line for the information you are interested in.
When you get to the information stacked in columns, you could temporarily append each column to separate collections as you parse each line
Finally attempt to extract the privileges from a concatenation of all the temporary columns.
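The "predefined offsets" step of the strategy above can be sketched as a small slicing helper. The offsets are hypothetical and would have to be measured from the real file, which presumably pads its columns with spaces; they are assumed to be in ascending order. FixedWidthSlicer is a made-up name.

```csharp
using System;

public static class FixedWidthSlicer
{
    // Cuts a line into columns that start at the given character offsets.
    // The offsets are hypothetical; measure them from the real file.
    public static string[] Slice(string line, int[] offsets)
    {
        var columns = new string[offsets.Length];
        for (int i = 0; i < offsets.Length; i++)
        {
            // Clamp to the line length so short lines yield empty columns.
            int start = Math.Min(offsets[i], line.Length);
            int end = i + 1 < offsets.Length ? Math.Min(offsets[i + 1], line.Length) : line.Length;
            columns[i] = line.Substring(start, end - start).Trim();
        }
        return columns;
    }
}
```

Each sliced column can then be appended to its own collection while reading the file line by line, as the strategy suggests.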
