C# - Splitting by columns on a CSV file - c#

I want to import some data from a csv file, but I've encountered a small problem I can't really figure out.
The person who gave me this file put comma-separated values inside some cells, so when I split a line on commas, each of those values ends up as a separate entry in the list. Instead, I would like to get the whole content of each column as a single string, but I can't figure out how.
For example, the column I'm talking about lists the days a restaurant is open. This can be Mo, Tu, We, Su, but it can also be Mo, Tu.
Is there a way I can just loop over the values per column, instead of over the comma-separated values?
I'm currently doing it like this, but this just adds each day to the total list of values:
var i = 0; // line counter, used to skip the first line
using (var fs = File.OpenRead(csvUrl))
using (var reader = new StreamReader(fs, Encoding.UTF8))
{
    while (!reader.EndOfStream)
    {
        var line = reader.ReadLine();
        if (i > 0)
        {
            var values = line.Split(',');
        }
        i++;
    }
}

Use TextFieldParser to parse CSV files:
TextFieldParser parser = new TextFieldParser(new StringReader(lineContent));
parser.SetDelimiters(",");
string[] rawFields = parser.ReadFields();
lineContent is a string with the content of the current line in your file.
TextFieldParser is available in the namespace:
Microsoft.VisualBasic.FileIO
Don't mind the Visual Basic part; it works fine in C#.
EDIT
In your code you could implement it like this:
var i = 0; // line counter, used to skip the first line
using (var fs = File.OpenRead(csvUrl))
using (var reader = new StreamReader(fs, Encoding.UTF8))
{
    while (!reader.EndOfStream)
    {
        var line = reader.ReadLine();
        if (i > 0)
        {
            // Parse the current line; quoted cells that contain commas stay in one field
            TextFieldParser parser = new TextFieldParser(new StringReader(line));
            parser.SetDelimiters(",");
            string[] rawFields = parser.ReadFields();
        }
        i++;
    }
}
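As a minimal sketch of the idea above, the parser can also be pointed at the whole CSV text rather than at one line at a time. The class name OpeningDaysDemo, the method name ParseRows and the sample data in the usage note are made up for illustration, and the sketch assumes the Microsoft.VisualBasic assembly is referenced:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper, not part of the original answer.
public static class OpeningDaysDemo
{
    // Parses CSV text and returns one string[] per row.
    // A quoted cell such as "Mo, Tu, We, Su" comes back as a single field.
    public static List<string[]> ParseRows(string csvText)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

Calling OpeningDaysDemo.ParseRows("Name,OpenDays\r\nCafe A,\"Mo, Tu, We, Su\"") should give two rows, and the second field of the second row should be the whole string Mo, Tu, We, Su. This only works if the cells that contain commas are enclosed in quotes in the file.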

The best solution I have found so far for dealing with CSV values is to use the parser built into .NET.
It is explained in my Stack Overflow answer here:
Reading CSV file and storing values into an array
For easy reference, I am including the code here as well.
using Microsoft.VisualBasic.FileIO;
var path = @"C:\Person.csv"; // Habeeb, "Dubai Media City, Dubai"
using (TextFieldParser csvParser = new TextFieldParser(path))
{
csvParser.CommentTokens = new string[] { "#" };
csvParser.SetDelimiters(new string[] { "," });
csvParser.HasFieldsEnclosedInQuotes = true;
// Skip the row with the column names
csvParser.ReadLine();
while (!csvParser.EndOfData)
{
// Read current line fields, pointer moves to the next line.
string[] fields = csvParser.ReadFields();
string Name = fields[0];
string Address = fields[1];
}
}
More details about the parser are given here: http://codeskaters.blogspot.ae/2015/11/c-easiest-csv-parser-built-in-net.html

Related

Is there a way to read data from csv and write data to csv in c# without using a data table?

While developing a tool, I need to read data from a CSV file, create another CSV file, and write the data to different places (different columns) there. Is there a way I can do it without using a data table?
You do not have to use a data table.
You can also just read it with a StreamReader and split each line. In the same way, you can write it with, for example, a StringBuilder.
Example of the read:
using(var reader = new StreamReader(@"C:\document.csv"))
{
List<object> list = new List<object>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(';');
list.Add(new {a = values[0], b = values[1]});
}
}
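The writing half mentioned above could look roughly like this. It is only a sketch: the class name ColumnRewriter and the choice to swap the first two columns are made up for illustration, and like the read example it uses a plain Split, so it assumes no field contains the separator inside quotes:

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class ColumnRewriter
{
    // Reads semicolon-separated rows from input and returns CSV text
    // in which the first two columns have swapped places.
    public static string SwapFirstTwoColumns(TextReader input)
    {
        var output = new StringBuilder();
        string line;
        while ((line = input.ReadLine()) != null)
        {
            var values = line.Split(';');
            if (values.Length < 2)
            {
                output.AppendLine(line); // nothing to swap; copy the line unchanged
                continue;
            }
            var first = values[0];
            values[0] = values[1];
            values[1] = first;
            output.AppendLine(string.Join(";", values));
        }
        return output.ToString();
    }
}
```

The returned string can then be written out in one go, for example with File.WriteAllText and a target path of your choice.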

How to convert csv file's specific column to string List

Does anyone know how to convert a specific column in a csv file to string List ?
I am trying to have a 'contact number' list from csv file. (Please see my csv file in below)
Intending to do
List<string> contactNumberList = new List<string>();
-- contactNumberList.Add("1888714"); (Manual)
---contactNumberList.Add("1888759");(Manual)
In my CSV
"Email","Opt In Date","Opted Out","Opt In Details","Email Type","Opted Out Date","Opt Out Details","Contact Number","Salutation"
"test1@testApp.com","05/01/2014 11:23 AM","F","User Name: capn@goldpop.org.uk. IP Address: 62.213.118.139","0","","","1888714","Mrs Hall"
"test2@testApp.com","05/01/2014 11:23 AM","F","User Name: capntransfer@goldpop.org.uk. IP Address: 62.213.118.139","0","","","1888759","Mrs Heyworth"
For parsing CSV files I suggest using a library, as simply splitting on line and column separators can lead to errors when values are quoted or escaped.
For example,
"Mr John, Bob Smith"
is a valid CSV field because it is enclosed in quotes, but a plain Split call will break it into two pieces at the comma.
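To make the problem concrete, here is a small sketch of what a plain Split does to such a line. The class name NaiveSplitDemo and the sample line in the usage note are made up for illustration:

```csharp
using System;

// Hypothetical helper, only meant to show the failure mode.
public static class NaiveSplitDemo
{
    // Splits one CSV line on every comma, ignoring quotes entirely.
    public static string[] SplitOnCommas(string csvLine)
    {
        return csvLine.Split(',');
    }
}
```

For the line "Mr John, Bob Smith",1888714 this returns three pieces instead of the two fields the line actually contains, and the first two pieces still carry a stray quote character each.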
One valid choice is LumenWorks (you can find it in NuGet).
For example:
// r is a TextReader over the CSV file
using (var csv = new CsvReader(r, false, csvParserConfiguration.ColumnSeparator, '"', '"', '#', ValueTrimmingOptions.None))
{
    // Read lines
    while (csv.ReadNextRecord())
    {
        // Index 7 is the "Contact Number" column
        contactNumberList.Add(csv[7]);
    }
}
Read the file line by line, Split() by commas, select the desired column, trim quotes off and add to a list?
Try this:
int columnId = 3;
string[] lines = File.ReadAllLines(@"C:\test.csv");
var list = lines.Select(line =>
{ var values = line.Split(';');
return values[columnId];
});
You could try the following:
var reader = new StreamReader(File.OpenRead(@"C:\test.csv"));
List<string> contactNumbersList = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
contactNumbersList.Add(values[7]);
}
but better yet you could use a dedicated library like CSV Reader.
IEnumerable<string> strings = System.IO.File.ReadAllLines({filepath})
.Select(x => x.Split(';')[{columnId}])

Find and delete a value in CSV

These are the contents of a CSV file -
altitudeAscent,altitudeDescent,altitudeavg,altitudecurrent,altitudeend,altitudemax,altitudemin,altitudestart,calories,className,createAppVersion,createDevice,createIOSVersion,distance,eaType,ecomodule,elapsedSeconds,fatCalories,footpodElapsedSecondsInterval,gpsSensor,gps_distance,gps_maxspeed,heartSensor,isCompletedAssessment,isLocationIndoors,isgpsonly,maxBPM,maxspeed,minBPM,otherWorkoutTypeName,otherzonetimes,primarySourceSpeedDistance,readCountBPM,routineds,routineid,startLat,startLon,startTime,totalBeats,totalDurationSeconds,totalPausedSeconds
0,2.37,-999999,33.66,33.66,42.76,28.21,36.47,410.08,ActiveWorkout,7.00,iPhone5,1,7.0,0.27,0,1024,2657,207.77,2655,ios,0.27,3.69,ble,NO,NO,YES,186,3.69,87,Weight Lifting,85,gps,884,4.000000,TEMP-1663889272,37.10730362,-76.50557709,2013-08-26T18:48:21,110056,0,0
I need to delete the '7.00' value from the CSV file. How do I do that in VB or C#?
string names;
List<string> values;
using (var stream = new StreamReader("path/to/file.csv"))
{
    names = stream.ReadLine(); // header row
    values = stream.ReadLine().Split(',').ToList(); // data row
}
values.RemoveAt(10); // '7.00' is the 11th value, so its zero-based index is 10
string firstPartOfFile = names; // header row, unchanged
string secondPartOfFile = string.Join(",", values);
This works for me:
using (var sr = new StreamReader("file.csv"))
using (var writeToFile = new StreamWriter("out.csv"))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        // Use the line that was just read; calling ReadLine() again here would skip every other line
        writeToFile.WriteLine(line.Replace(",7.00", ""));
    }
}
I would connect to the file using a connection string found here, or an example here, load it into a DataTable object, and then do this (assuming you called the variable "dt"):
dt.Rows[rowNumber]["columnName"] = string.Empty;
If the value is always equal to 7.00, you could instead read the whole file into a string and then do something like yourFile = yourFile.Replace(",7.00,", ",,");
It depends on your scenario, maybe you could even get your data in a more suitable format for working, like a JSON object or something else.
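If the goal is to drop the whole column rather than blank out a single value, looking the column up by its header name avoids counting indexes by hand. The following is a sketch under that assumption; the class name CsvColumnRemover is made up, and the plain Split means it only suits files like the one in the question, where no field contains a comma inside quotes:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not taken from any of the answers above.
public static class CsvColumnRemover
{
    // Removes the column called columnName from the header row and every data row.
    // The first element of lines must be the header row.
    public static List<string> RemoveColumn(IEnumerable<string> lines, string columnName)
    {
        var rows = lines.Select(l => l.Split(',').ToList()).ToList();
        if (rows.Count == 0)
        {
            return new List<string>();
        }
        int index = rows[0].IndexOf(columnName);
        if (index < 0)
        {
            throw new ArgumentException("Column not found: " + columnName);
        }
        foreach (var row in rows)
        {
            if (index < row.Count)
            {
                row.RemoveAt(index); // rows that are too short are left alone
            }
        }
        return rows.Select(r => string.Join(",", r)).ToList();
    }
}
```

For the file in the question, passing File.ReadLines on the input path together with "createAppVersion" should remove both that header and the 7.00 value beneath it.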

How do I pass a collection of strings as a TextReader?

I am using the CSVHelper library, which can extract a list of objects from a CSV file with just three lines of code:
var streamReader = // Create a reader to your CSV file.
var csvReader = new CsvReader( streamReader );
List<MyCustomType> myData = csvReader.GetRecords<MyCustomType>();
However, my file has nonsense lines and I need to skip the first ten lines in the file. I thought it would be nice to use LINQ to ensure 'clean' data, and then pass that data to CsvReader, like so:
public TextReader GetTextReader(IEnumerable<string> lines)
{
// Some magic here. Don't want to return null;
return TextReader.Null;
}
public IEnumerable<T> ExtractObjectList<T>(string filePath) where T : class
{
var csvLines = File.ReadLines(filePath)
.Skip(10)
.Where(l => !l.StartsWith(",,,"));
var textReader = GetTextReader(csvLines);
var csvReader = new CsvReader(textReader);
csvReader.Configuration.ClassMapping<EventMap, Event>();
return csvReader.GetRecords<T>();
}
But I'm really stuck on pushing a 'static' collection of strings through a stream like a TextReader.
My alternative here is to process the CSV file line by line through CsvReader and examine each line before extracting an object, but I find that somewhat clumsy.
The StringReader Class provides a TextReader that wraps a String. You could simply join the lines and wrap them in a StringReader:
public TextReader GetTextReader(IEnumerable<string> lines)
{
return new StringReader(string.Join("\r\n", lines));
}
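Put together with the filtering from the question, the StringReader approach might be sketched like this. The class name LineFilter and the skip parameter are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper combining the question's filter with the StringReader idea.
public static class LineFilter
{
    // Drops the first skip lines and any line starting with ",,,",
    // then exposes the remaining lines as a TextReader.
    public static TextReader ToCleanReader(IEnumerable<string> lines, int skip)
    {
        var clean = lines.Skip(skip).Where(l => !l.StartsWith(",,,"));
        return new StringReader(string.Join(Environment.NewLine, clean));
    }
}
```

Note that string.Join builds the whole remaining file as one string in memory, so this is convenient for modest file sizes but gives up the streaming behavior of File.ReadLines.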
An easier way would be to use CsvHelper to skip the lines.
// Skip rows.
csvReader.Configuration.IgnoreBlankLines = false;
csvReader.Configuration.IgnoreQuotes = true;
for (var i = 0; i < 10; i++)
{
    csvReader.Read();
}
// Restore normal behavior.
csvReader.Configuration.IgnoreBlankLines = true;
csvReader.Configuration.IgnoreQuotes = false;
// Carry on as normal.
var myData = csvReader.GetRecords<MyCustomType>();
IgnoreBlankLines is turned off in case any of those first 10 rows are blank. Quote handling is turned off (IgnoreQuotes = true) so you don't get any BadDataExceptions if those rows contain a ". You can restore both settings afterwards for normal functionality again.
If you don't know the amount of rows and need to test based on row data, you can just test csvReader.Context.Record and see if you need to stop. In this case, you would probably need to manually call csvReader.ReadHeader() before calling csvReader.GetRecords<MyCustomType>().

Reading a CSV file in .NET?

How do I read a CSV file using C#?
A choice, without using third-party components, is to use the class Microsoft.VisualBasic.FileIO.TextFieldParser (http://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser.aspx) . It provides all the functions for parsing CSV. It is sufficient to import the Microsoft.VisualBasic assembly.
var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(file);
parser.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited;
parser.SetDelimiters(new string[] { ";" });
while (!parser.EndOfData)
{
string[] row = parser.ReadFields();
/* do something */
}
You can use the Microsoft.VisualBasic.FileIO.TextFieldParser class in C#:
using System;
using System.Data;
using Microsoft.VisualBasic.FileIO;
static void Main()
{
string csv_file_path = @"C:\Users\Administrator\Desktop\test.csv";
DataTable csvData = GetDataTableFromCSVFile(csv_file_path);
Console.WriteLine("Rows count:" + csvData.Rows.Count);
Console.ReadLine();
}
private static DataTable GetDataTableFromCSVFile(string csv_file_path)
{
DataTable csvData = new DataTable();
try
{
using(TextFieldParser csvReader = new TextFieldParser(csv_file_path))
{
csvReader.SetDelimiters(new string[] { "," });
csvReader.HasFieldsEnclosedInQuotes = true;
string[] colFields = csvReader.ReadFields();
foreach (string column in colFields)
{
DataColumn datacolumn = new DataColumn(column);
datacolumn.AllowDBNull = true;
csvData.Columns.Add(datacolumn);
}
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();
//Making empty value as null
for (int i = 0; i < fieldData.Length; i++)
{
if (fieldData[i] == "")
{
fieldData[i] = null;
}
}
csvData.Rows.Add(fieldData);
}
}
}
catch (Exception ex)
{
    // Errors are silently swallowed here; log or rethrow ex in real code
}
return csvData;
}
You could try CsvHelper, which is a project I work on. Its goal is to make reading and writing CSV files as easy as possible, while being very fast.
Here are a few ways you can read from a CSV file.
// By type
var records = csv.GetRecords<MyClass>();
var records = csv.GetRecords( typeof( MyClass ) );
// Dynamic
var records = csv.GetRecords<dynamic>();
// Using anonymous type for the class definition
var anonymousTypeDefinition = new
{
Id = default( int ),
Name = string.Empty,
MyClass = new MyClass()
};
var records = csv.GetRecords( anonymousTypeDefinition );
I usually use a simplistic approach like this one:
var path = Server.MapPath("~/App_Data/Data.csv");
var csvRows = System.IO.File.ReadAllLines(path, Encoding.Default).ToList();
foreach (var row in csvRows.Skip(1))
{
var columns = row.Split(';');
var field1 = columns[0];
var field2 = columns[1];
var field3 = columns[2];
}
I just used this library in my application. http://www.codeproject.com/KB/database/CsvReader.aspx. Everything went smoothly using this library, so I'm recommending it. It is free under the MIT License, so just include the notice with your source files.
I didn't display the CSV in a browser, but the author has some samples for Repeaters or DataGrids. I did run one of his test projects to test a Sort operation I have added and it looked pretty good.
You can try Cinchoo ETL - an open source lib for reading and writing CSV files.
Here are a couple of ways you can read CSV files. Given this sample file:
Id, Name
1, Tom
2, Mark
This is how you can use this library to read it
using (var reader = new ChoCSVReader("emp.csv").WithFirstLineHeader())
{
foreach (dynamic item in reader)
{
Console.WriteLine(item.Id);
Console.WriteLine(item.Name);
}
}
If you have a POCO class defined to match the CSV file, like below
public class Employee
{
public int Id { get; set; }
public string Name { get; set; }
}
You can parse the same file using this POCO class as below
using (var reader = new ChoCSVReader<Employee>("emp.csv").WithFirstLineHeader())
{
foreach (var item in reader)
{
Console.WriteLine(item.Id);
Console.WriteLine(item.Name);
}
}
Please check out articles at CodeProject on how to use it.
Disclaimer: I'm the author of this library
I recommend Angara.Table. Its save/load documentation is at http://predictionmachines.github.io/Angara.Table/saveload.html.
It infers column types, can save CSV files and is much faster than TextFieldParser. It follows RFC4180 for CSV format and supports multiline strings, NaNs, and escaped strings containing the delimiter character.
The library is under MIT license. Source code is https://github.com/Microsoft/Angara.Table.
Though its API is focused on F#, it can be used from any .NET language, just not as succinctly as in F#.
Example:
using Angara.Data;
using System.Collections.Immutable;
...
var table = Table.Load("data.csv");
// Print schema:
foreach(Column c in table)
{
string colType;
if (c.Rows.IsRealColumn) colType = "double";
else if (c.Rows.IsStringColumn) colType = "string";
else if (c.Rows.IsDateColumn) colType = "date";
else if (c.Rows.IsIntColumn) colType = "int";
else colType = "bool";
Console.WriteLine("{0} of type {1}", c.Name, colType);
}
// Get column data:
ImmutableArray<double> a = table["a"].Rows.AsReal;
ImmutableArray<string> b = table["b"].Rows.AsString;
Table.Save(table, "data2.csv");
You might be interested in the Linq2Csv library on CodeProject. One thing you would need to check is whether it reads the data lazily, only as it is needed, so that it doesn't require a lot of memory when working with bigger files.
As for displaying the data in the browser, there are many ways to do it. If you were more specific about your requirements, the answer could be more specific too, but here are some things you could do:
1. Use HttpListener class to write simple web server (you can find many samples on net to host mini-http server).
2. Use Asp.Net or Asp.Net Mvc, create a page, host it using IIS.
Seems like there are quite a few projects on CodeProject or CodePlex for CSV Parsing.
Here is another CSV Parser on CodePlex
http://commonlibrarynet.codeplex.com/
This library has components for CSV parsing, INI file parsing, Command-Line parsing as well. It's working well for me so far. Only thing is it doesn't have a CSV Writer.
This is just for parsing the CSV. For displaying it in a web page, it is simply a matter of taking the list and rendering it however you want.
Note: This code example does not handle the situation where the input string line contains newlines.
public List<string> SplitCSV(string line)
{
    if (string.IsNullOrEmpty(line))
        throw new ArgumentException();
    List<string> result = new List<string>();
    int index = 0;
    int start = 0;
    bool inQuote = false;
    // parse line
    foreach (char c in line)
    {
        switch (c)
        {
            case '"':
                inQuote = !inQuote;
                break;
            case ',':
                if (!inQuote)
                {
                    result.Add(line.Substring(start, index - start)
                        .Replace("\"", ""));
                    start = index + 1;
                }
                break;
        }
        index++;
    }
    // Add the last field; it is empty when the line ends with a comma
    result.Add(line.Substring(start).Replace("\"", ""));
    return result;
}
I have been maintaining an open source project called FlatFiles for several years now. It's available for .NET Core and .NET 4.5.1.
Unlike most of the alternatives, it allows you to define a schema (similar to the way EF code-first works) with an extreme level of precision, so you aren't fighting conversion issues all the time. You can map directly to your data classes, and there is also support for interfacing with older ADO.NET classes.
Performance-wise, it's been tuned to be one of the fastest parsers for .NET, with a plethora of options for quirky format differences. There's also support for fixed-length files, if you need it.
You can use this library: Sky.Data.Csv
https://www.nuget.org/packages/Sky.Data.Csv/
This is a really fast CSV reader library, and it's really easy to use:
using Sky.Data.Csv;
var readerSettings = new CsvReaderSettings { Encoding = Encoding.UTF8 };
using (var reader = CsvReader.Create("path-to-file", readerSettings))
{
    foreach (var row in reader)
    {
        // do something with the data
    }
}
It also supports reading typed objects with the CsvReader<T> class, which has the same interface.
