I am creating a CSV (comma-separated) import tool. I am trying to make it as generic as possible, so that it can process any CSV file.
I have almost finished the tool, but I came across one file that I am finding difficult to process.
How can I process the file with data in following format?
column1,column2,column3,column4,column5
----------
alex,p,22323,23232,hello
mike,t,"121212,232323,4343434",33432,hi
guna,s,"2423,2332",whats
cena,a,34443,33432,up
Since the file is comma separated, and one of its values is itself a comma-separated list wrapped in quotes ("value,value,value"), I am finding it difficult to process.
How can I tackle this issue?
I do not have control over the CSV file, so I can't change the format.
As per @dtb... use a CSV parser. If you reference Microsoft.VisualBasic then you can:
var data=@"column1,column2,column3,column4,column5
----------
alex,p,22323,23232,hello
mike,t,""121212,232323,4343434"",33432,hi
guna,s,""2423,2332"",whats
cena,a,34443,33432,up";
using (var sr = new StringReader(data))
using (var parser =
new TextFieldParser(sr)
{
TextFieldType = FieldType.Delimited,
Delimiters = new[] { "," },
CommentTokens = new[] { "--" }
})
{
while (!parser.EndOfData)
{
string[] fields;
fields = parser.ReadFields();
//yummy
}
}
This deals with quotes correctly.
Related
string[] splittedText = File.ReadAllLines(@"file.txt");//.Split(',');
foreach (string data in splittedText)
{
}
I want to read a file in C# into an array of strings. Then I will iterate over the array to fetch the data I need.
If you want to read a CSV file, you should use a CSV parser. Values in a CSV file are separated by commas, and in some cases a value can itself contain a comma; such values are wrapped in double quotes. The following simple solution will not handle that scenario.
var splittedText = File.ReadAllText("E:\\Test.txt").Split(',');
foreach (string data in splittedText)
{
Console.WriteLine(data.Trim());
}
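To make the limitation concrete, here is a minimal sketch that compares a plain Split with TextFieldParser on a line that has a quoted field. It assumes a C# 9+ top-level program on a runtime where Microsoft.VisualBasic is available; ParseCsvLine is a hypothetical helper name, not something from the original posts.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper: parses a single CSV line with TextFieldParser.
static string[] ParseCsvLine(string line)
{
    using var parser = new TextFieldParser(new StringReader(line))
    {
        TextFieldType = FieldType.Delimited,
        Delimiters = new[] { "," },
        HasFieldsEnclosedInQuotes = true
    };
    return parser.ReadFields() ?? Array.Empty<string>();
}

string sample = "mike,t,\"121212,232323,4343434\",33432,hi";

// Plain Split cuts the quoted field into three pieces.
string[] naive = sample.Split(',');

// The parser treats the quoted text as one field and removes the quotes.
string[] parsed = ParseCsvLine(sample);

Console.WriteLine(naive.Length);   // 7
Console.WriteLine(parsed.Length);  // 5
Console.WriteLine(parsed[2]);      // 121212,232323,4343434
```

The plain split reports seven pieces for a five-column row, which is exactly the failure described above.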
Hint: whether to read the file line by line or to read the whole content at once depends on your use case. The snippet below may give you an idea of how to split the content.
Please try it.
var inputtext = File.ReadAllText(@"inpufile.txt");
inputtext.Replace("\n", "")
.Split(',',
StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
.ToList().ForEach(t =>
{
System.Console.WriteLine(t);
//Other manipulations
});
If you want to split on multiple characters, pass a character array to Split().
new char[] { ',', ':' };
Thank you.
You need to change File.ReadAllLines to File.ReadAllText(path); then you can use the Split method.
Hi, my application reads a CSV file which will always have the same format, and I need it to create a CSV file with different formatting. Reading and writing the CSV file is not the issue. The problem is getting the amount values, because these are formatted with a comma as the thousands separator in the CSV file (e.g. 4,500). As a result, they are being split when writing to the CSV file.
Example: from the line below, how can I get the full numbers, i.e. 2241.84 and 1072809.33?
line = "\"02 MAY 18\",\"TTEWTWTE\",\"GRHGWHWH\",\"02 MAY 18\",\"2,241.84\",\"\",\"1,072,809.33\""
This is how I am reading from CSV file.
openFileDialog1.ShowDialog();
var reader = new StreamReader(File.OpenRead(openFileDialog1.FileName));
List<string> searchList = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
searchList.Add(line);
}
So far I have tried the code below, which gets \"2,241.84\" (which is correct), but when writing to the CSV file I only get 2:
searchList[2].Split(',')[1].Replace("\"", "")
Let me visualize contents in another way:
"
\"02 MAY 18\",
\"TTEWTWTE\",
\"GRHGWHWH\",
\"02 MAY 18\",
\"2,241.84\",
\"\",
\"1,072,809.33\"
"
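It seems that your separator is \",\" (quote, comma, quote) rather than a bare comma. Change searchList[2].Split(',')[1].Replace("\"", "") to searchList[2].Split(new string[] { "\",\"" }, StringSplitOptions.None). Note that the first field keeps its leading quote and the last field keeps its trailing quote.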
In your case you can use this:
var result = searchList[2].Split(new string[] { "\",\"" }, StringSplitOptions.None)[4].Replace("\"", "");
Split your string on the "," separator (quote, comma, quote) instead of on a bare comma.
I don't know why you are using hard-coded numbers for the indexes, but I will assume it's for test purposes.
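Once the quoted fields are extracted intact, the amounts still contain thousands separators. A minimal sketch of turning them into numbers follows. It assumes a C# 9+ top-level program, uses TextFieldParser instead of the manual split, and assumes the amounts use the invariant-culture format (comma for thousands, dot for decimals).

```csharp
using System;
using System.Globalization;
using System.IO;
using Microsoft.VisualBasic.FileIO;

string line = "\"02 MAY 18\",\"TTEWTWTE\",\"GRHGWHWH\",\"02 MAY 18\",\"2,241.84\",\"\",\"1,072,809.33\"";

string[] fields;
using (var parser = new TextFieldParser(new StringReader(line)))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    parser.HasFieldsEnclosedInQuotes = true;
    fields = parser.ReadFields() ?? Array.Empty<string>();
}

// NumberStyles.Number accepts thousands separators and a decimal point.
decimal first = decimal.Parse(fields[4], NumberStyles.Number, CultureInfo.InvariantCulture);
decimal second = decimal.Parse(fields[6], NumberStyles.Number, CultureInfo.InvariantCulture);

Console.WriteLine(first.ToString(CultureInfo.InvariantCulture));   // 2241.84
Console.WriteLine(second.ToString(CultureInfo.InvariantCulture));  // 1072809.33
```

Writing the parsed decimal back out with the invariant culture avoids reintroducing the comma in the output file.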
I'm reading a CSV file and changing the delimiter from "," to "|". However, I've noticed that my data (which I have no control over) sometimes contains quoted values with a comma inside them. What is the best way to avoid replacing the commas inside those quoted values?
For example:
ABSON TE,Wick Lane,"Abson, Pucklechurch",Bristol,Avon,ENGLAND,BS16 9SD,37030,17563,BS0001A1,,
Should be changed to:
ABSON TE|Wick Lane|"Abson, Pucklechurch"|Bristol|Avon|ENGLAND|BS16 9SD|37030|17563|BS0001A1||
The code to read and replace the CSV file is this:
var contents = File.ReadAllText(filePath).Split(new string[] { "\n", "\r\n" }, StringSplitOptions.RemoveEmptyEntries).ToArray();
var formattedContents = contents.Select(line => line.Replace(',', '|'));
For anyone else struggling with this, I ended up using the built-in .NET CSV parser. See here for more details and an example: http://coding.abel.nu/2012/06/built-in-net-csv-parser/
My specific code:
// Create new parser object and setup parameters
var parser = new TextFieldParser(new StringReader(File.ReadAllText(filePath)))
{
HasFieldsEnclosedInQuotes = true,
Delimiters = new string[] { "," },
TrimWhiteSpace = true
};
var csvSplitList = new List<string>();
// Reads all fields on the current line of the CSV file and returns as a string array
// Joins each field together with new delimiter "|"
while (!parser.EndOfData)
{
csvSplitList.Add(String.Join("|", parser.ReadFields()));
}
// Newline characters added to each line and flattens List<string> into single string
var formattedCsvToSave = String.Join(Environment.NewLine, csvSplitList.Select(x => x));
// Write single string to file
File.WriteAllText(filePathFormatted, formattedCsvToSave);
parser.Close();
I am using the TextFieldParser class to parse the file. I want to eliminate or ignore a complete column if the "entire column" is empty (which means single empty cell of a particular row should be considered). Is this possible?
Note: as per the functionality, I need to use data copied to the clipboard, so I cannot pass a file path directly to the parser.
TextFieldParser parser = new TextFieldParser(new StringReader(row));
string[] delimiters = { ",", "\t" };
parser.SetDelimiters(delimiters);
string[] columns = null;
while (!parser.EndOfData)
{
columns = parser.ReadFields();
}
Appreciate your help.
After reading through the TextFieldParser class page on MSDN, I see nothing there to suggest that this class can ignore a whole column. That is something you would have to do manually. Furthermore, your code does not seem right: every iteration assigns to the same variable, so once the loop ends only the last row's fields remain:
while (!parser.EndOfData)
{
columns = parser.ReadFields();
}
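One way to do the manual step is to collect every row first and then keep only the columns that have a value in at least one row. The sketch below assumes a C# 9+ top-level program and that the clipboard text has already been obtained as a string; ReadWithoutEmptyColumns and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper: parses delimited text and drops columns that are blank in every row.
static List<string[]> ReadWithoutEmptyColumns(string text)
{
    var rows = new List<string[]>();
    using (var parser = new TextFieldParser(new StringReader(text)))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",", "\t");
        while (!parser.EndOfData)
        {
            // Add each row to the list instead of overwriting one variable.
            rows.Add(parser.ReadFields() ?? Array.Empty<string>());
        }
    }

    int width = rows.Count == 0 ? 0 : rows.Max(r => r.Length);

    // A column is kept if at least one row has a non-blank value in it.
    var keep = Enumerable.Range(0, width)
        .Where(c => rows.Any(r => c < r.Length && !string.IsNullOrWhiteSpace(r[c])))
        .ToList();

    return rows
        .Select(r => keep.Select(c => c < r.Length ? r[c] : "").ToArray())
        .ToList();
}

// Sample input: the middle column is empty in every row.
string clipboardText = "name,,age\nalex,,22\nmike,,\n";
List<string[]> cleaned = ReadWithoutEmptyColumns(clipboardText);

foreach (var row in cleaned)
{
    Console.WriteLine(string.Join("|", row));
}
```

In the sample, the middle column is dropped because it is blank in every row, while the last column survives even though one of its cells is empty.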
To read a CSV file, I use the following statement:
var query = from line in rawLines
let data = line.Split(';')
select new
{
col01 = data[0],
col02 = data[1],
col03 = data[2]
};
The CSV file I want to read is malformed in that an entry can contain the separator ; itself as data when surrounded by quotation marks.
Example:
col01;col02;col03
data01;"data02;";data03
My read statement above does not work here, since it interprets the second row as four columns.
Question: Is there an easy way to handle this malformed CSV correctly? Perhaps with another LINQ query?
Just use a CSV parser and STOP ROLLING YOUR OWN:
using (var parser = new TextFieldParser("test.csv"))
{
parser.CommentTokens = new string[] { "#" };
parser.SetDelimiters(new string[] { ";" });
parser.HasFieldsEnclosedInQuotes = true;
// Skip over header line.
parser.ReadLine();
while (!parser.EndOfData)
{
string[] fields = parser.ReadFields();
Console.WriteLine("{0} {1} {2}", fields[0], fields[1], fields[2]);
}
}
TextFieldParser is built into .NET. Just add a reference to the Microsoft.VisualBasic assembly and you are good to go. A real CSV parser will happily handle this situation.
Parsing CSV files manually can always lead to issues like this. I would advise that you use a third-party tool like CsvHelper to handle the parsing.
Furthermore, it's not a good idea to hard-code the comma as the separator, as the list separator can be overridden in your computer's regional settings.
Let me know if I can help further,
Matt
Not very elegant, but after using your method you can check whether any colxx contains an unclosed quotation mark and, if so, join it with the next colxx.
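A minimal sketch of that join-after-split idea, assuming a C# 9+ top-level program. SplitAndRejoin is a hypothetical helper; it treats an odd number of quotation marks as "still inside a quoted field" and does not handle escaped (doubled) quotes, so a real CSV parser remains the safer choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: splits on the separator, then re-joins pieces that
// belong to the same quoted field.
static List<string> SplitAndRejoin(string line, char separator)
{
    var result = new List<string>();
    string buffer = "";
    bool inQuotes = false;

    foreach (string piece in line.Split(separator))
    {
        // While a quote is still open, glue the piece back on with the separator.
        buffer = inQuotes ? buffer + separator + piece : piece;

        // An odd number of quotation marks means the quoted field is not closed yet.
        inQuotes = buffer.Count(ch => ch == '"') % 2 == 1;

        if (!inQuotes)
        {
            result.Add(buffer.Trim('"'));
        }
    }

    // Unterminated quote at end of line: keep the remainder as-is.
    if (inQuotes)
    {
        result.Add(buffer);
    }
    return result;
}

List<string> cols = SplitAndRejoin("data01;\"data02;\";data03", ';');
Console.WriteLine(string.Join(" | ", cols)); // data01 | data02; | data03
```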