How to read a CSV file into a List<MyClass>? - C#

Is it possible to create a List of my own class from a CSV file?
The file looks like this, for example:
ID;NAME
1;Foo
2;Bar
Then I have a class like this:
class MyClass
{
public int id { get; set; }
public string name { get; set; }
}
Is it possible to generate a list of this class out of the CSV file? Maybe with some library?

You could use a CSV parser such as FileHelpers or FastCSV. If you don't want to use third-party libraries, you may take a look at the built-in TextFieldParser class (in the Microsoft.VisualBasic.FileIO namespace), which could be used like this:
public IEnumerable<MyClass> Parse(string path)
{
using (TextFieldParser parser = new TextFieldParser(path))
{
parser.CommentTokens = new string[] { "#" };
parser.SetDelimiters(new string[] { ";" });
parser.HasFieldsEnclosedInQuotes = true;
// Skip over header line.
parser.ReadLine();
while (!parser.EndOfData)
{
string[] fields = parser.ReadFields();
yield return new MyClass()
{
id = int.Parse(fields[0]), // id is an int, so the text field must be converted
name = fields[1]
};
}
}
}
and then:
List<MyClass> list = Parse("data.csv").ToList();
But please, never roll your own CSV parser, as other answers to this question suggest.
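To make the warning concrete, here is a minimal sketch of what goes wrong with a hand-rolled split. It assumes a line in which one field is quoted and contains the delimiter; the class and method names (QuotedFieldDemo, NaiveSplit, ParseLine) are made up for illustration.

```csharp
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuotedFieldDemo
{
    // Naive approach: splits on every semicolon, including ones inside quotes.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(';');
    }

    // TextFieldParser treats a quoted field as a single value and strips the quotes.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.SetDelimiters(";");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }
}
```

For the input line 3;"Foo;Bar", NaiveSplit cuts the quoted name into two pieces (three fields in total), while ParseLine returns the two fields you would expect.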

Related

CsvHelper PrepareHeaderForMatch returns Context as one-item array

I have been using CsvHelper version 6.0.0, decided to upgrade to the latest version (currently 12.3.2), and found that csv.Configuration.PrepareHeaderForMatch now takes a second lambda parameter, index (Func<string,int,string>).
The code for v6.0.0 looked like this:
csv.Configuration.PrepareHeaderForMatch = header => Regex.Replace(header, @"\/", string.Empty);
With the previous line, IReadingContext.Record returns an array with multiple entries, one for each column.
The code for v12.3.2 looks like this:
csv.Configuration.PrepareHeaderForMatch = (header, index) => Regex.Replace(header, @"\/", string.Empty);
But ReadingContext.Record now returns an array with all columns in a single entry. I used the exact same file for both versions. I tried changing the lambda, but the outcome is the same. How can I get the columns as separate entries in the Record array?
Thanks in advance!
Update: This is an issue with the default delimiter, which has changed since version 6.0.0. The default delimiter now uses CultureInfo.CurrentCulture.TextInfo.ListSeparator. Since I'm in the United States, my ListSeparator is "," so both examples work for me. For many countries the ListSeparator is ";", which is why only one column was found for @dzookatz with version 12.3.2. The solution is to specify the delimiter in the configuration.
csv.Configuration.PrepareHeaderForMatch = (header, index) => Regex.Replace(header, @"\/", string.Empty);
csv.Configuration.Delimiter = ",";
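If you want to check which separator a culture-dependent default would pick up on a given machine, a small sketch like the following may help. It uses only System.Globalization; the class name DelimiterCheck is hypothetical, and the returned values depend on the runtime and operating system, so print them rather than assume them.

```csharp
using System.Globalization;

public static class DelimiterCheck
{
    // Returns the list separator reported for a named culture, e.g. "en-US" or "de-DE".
    public static string SeparatorFor(string cultureName)
    {
        return CultureInfo.GetCultureInfo(cultureName).TextInfo.ListSeparator;
    }

    // Returns the separator of the culture the current thread is running under.
    public static string CurrentSeparator()
    {
        return CultureInfo.CurrentCulture.TextInfo.ListSeparator;
    }
}
```

Comparing DelimiterCheck.CurrentSeparator() with the delimiter actually used in the file is a quick way to tell whether an explicit Delimiter setting is needed.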
I must be missing something. I get the same result for var record whether using version 6.0.0 or 12.3.2. I'm guessing there is more going on with your data that I'm not seeing.
Version 6.0.0
class Program
{
static void Main(string[] args)
{
var fooString = $"Id,First/Name{Environment.NewLine}1,David";
using (var reader = new StringReader(fooString))
using (var csv = new CsvReader(reader))
{
csv.Configuration.PrepareHeaderForMatch = header => Regex.Replace(header, @"\/", string.Empty);
csv.Read();
csv.ReadHeader();
while (csv.Read())
{
var record = csv.Context.Record;
}
}
}
}
public class Foo
{
public int Id { get; set; }
public string FirstName { get; set; }
}
Version 12.3.2
public class Program
{
public static void Main(string[] args)
{
var fooString = $"Id,First/Name{Environment.NewLine}1,David";
using (var reader = new StringReader(fooString))
using (var csv = new CsvReader(reader))
{
csv.Configuration.PrepareHeaderForMatch = (header, index) => Regex.Replace(header, @"\/", string.Empty);
csv.Read();
csv.ReadHeader();
while (csv.Read())
{
var record = csv.Context.Record;
}
}
}
}
public class Foo
{
public int Id { get; set; }
public string FirstName { get; set; }
}

Create JSON Dynamically using variables containing element names with Dot notation

I have two CSV files in Azure Blob Storage, and I am using C# to parse them.
The first CSV file is small and contains a mapping from the fields of the second CSV file to API fields. The records look like this:
csvField1, apiField1.subfield1
csvField2, apiField2
csvField3, apiField5
csvField6, apiField1.subfield2
The second CSV file is big, so I will read it as a stream. It has a header with the following column names:
csvField1, csvField2, csvField4, csvField5, csvField6, csvField7
I want the output to be JSON like the following:
{
apiField1:{
subfield1: value(csvField1)
subfield2: value(csvField6)
},
apiField2:value(csvField2),
apiField5: value(csvField3)
}
A workable solution with the Newtonsoft.Json library:
A) Create data class for apiField1
class ApiField1
{
public ApiField1(string s1, string s2)
{
subfield1 = s1;
subfield2 = s2;
}
public string subfield1 { get;}
public string subfield2 { get;}
}
B) Create data class for API record
class ApiRecord
{
public ApiRecord(string[] s)
{
apiField1 = new ApiField1(s[0], s[5]);
apiField2 = s[1];
apiField5 = s[2];
}
public ApiField1 apiField1 { get; }
public string apiField2 { get; }
public string apiField5 { get; }
}
C) Test
class Program
{
static void Main(string[] args)
{
ApiRecord a = new ApiRecord("0,1,2,3,4,5".Split(','));
Console.WriteLine(JsonConvert.SerializeObject(a));
Console.ReadLine();
}
}
Result:
{"apiField1":{"subfield1":"0","subfield2":"5"},"apiField2":"1","apiField5":"2"}
I only tested with the simple string "0,1,2,3,4,5". In your case, you can read each line from the stream of the CSV file.
Or, you can use a dictionary:
Dictionary<string, string> apiField1 = new Dictionary<string, string>();
apiField1.Add("subfield1", "value(csvField1)");
apiField1.Add("subfield2", "value(csvField6)");
Dictionary<string, object> apiRecord = new Dictionary<string, object>();
apiRecord.Add("apiField2", "value(csvField2)");
apiRecord.Add("apiField5", "value(csvField3)");
apiRecord.Add("apiField1", apiField1);
Console.WriteLine(JsonConvert.SerializeObject(apiRecord));
Output:
{"apiField2":"value(csvField2)","apiField5":"value(csvField3)","apiField1":{"subfield1":"value(csvField1)","subfield2":"value(csvField6)"}}
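Neither approach above reads the dotted names from the mapping file, so here is a rough sketch of how the nesting could be built dynamically. It assumes the mapping and one CSV row have already been loaded into dictionaries (with names trimmed), and it uses System.Text.Json instead of Newtonsoft.Json only to stay dependency-free. The names DottedJsonBuilder, Build and ToJson are made up for this example.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class DottedJsonBuilder
{
    // mapping: CSV column name -> API path such as "apiField1.subfield1"
    // row:     CSV column name -> value for one record
    public static Dictionary<string, object> Build(
        IDictionary<string, string> mapping,
        IDictionary<string, string> row)
    {
        var root = new Dictionary<string, object>();
        foreach (var pair in mapping)
        {
            // Skip mapped columns that are missing from this row.
            if (!row.TryGetValue(pair.Key, out var value)) continue;

            var parts = pair.Value.Split('.');
            var current = root;
            // Walk (or create) one nested dictionary per path segment except the last.
            for (int i = 0; i < parts.Length - 1; i++)
            {
                if (!current.TryGetValue(parts[i], out var child) ||
                    !(child is Dictionary<string, object>))
                {
                    child = new Dictionary<string, object>();
                    current[parts[i]] = child;
                }
                current = (Dictionary<string, object>)child;
            }
            // The last segment holds the actual value.
            current[parts[parts.Length - 1]] = value;
        }
        return root;
    }

    // Serializes the nested dictionaries; values are written using their runtime type.
    public static string ToJson(Dictionary<string, object> root)
    {
        return JsonSerializer.Serialize(root);
    }
}
```

Called once per row while streaming the big file, Build returns a nested dictionary in which apiField1 contains subfield1 and subfield2. Passing the same dictionary to JsonConvert.SerializeObject should work just as well if you prefer to stay with Newtonsoft.Json.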

How to use FileHelpers with records with a hierarchical structure?

I have a class like this
public class Foo
{
public string Name { get; set; }
private List<string> _bar = new List<string>();
public List<string> Bar
{
get { return _bar; }
set { _bar = value; }
}
}
My .csv file looks like this:
Foo;Bar
A;B
A;C
So I want to have a Foo object with Name "A" and a Bar-list (or an array) with "B" and "C".
Is it possible to simply parse my .csv file with FileHelpers? Or with some other library? If it would take a more complex solution (with for loops or the like) on top of a third-party tool, don't bother; in that case I will write a standard solution with plain .NET.
Your FileHelpers class should look like this:
[DelimitedRecord(";")]
[IgnoreFirst] // skip the "Foo;Bar" header line
public class FooSpec
{
public string Name;
public string Bar;
}
Read all the records into an array and regroup them with LINQ.
var engine = new FileHelperEngine<FooSpec>();
// ReadFile returns an array of FooSpec
var records = engine.ReadFile(filename);
var fooRecords = records
.GroupBy(x => x.Name)
.Select(x =>
new Foo() {
Name = x.Key,
Bar = x.Select(y => y.Bar).ToList()
});
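The grouping step does not depend on FileHelpers, so it can be tried on its own. The following sketch assumes the flat rows are already available as (Name, Bar) pairs; FooGrouping is a hypothetical helper, and Foo is a trimmed-down copy of the class from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public string Name { get; set; }
    public List<string> Bar { get; set; } = new List<string>();
}

public static class FooGrouping
{
    // rows: one (Name, Bar) pair per line of the flat file.
    public static List<Foo> Group(IEnumerable<(string Name, string Bar)> rows)
    {
        return rows
            .GroupBy(r => r.Name)                       // one group per distinct Name
            .Select(g => new Foo
            {
                Name = g.Key,
                Bar = g.Select(r => r.Bar).ToList()     // collect the Bar values of the group
            })
            .ToList();
    }
}
```

For the rows ("A", "B") and ("A", "C") this yields a single Foo named "A" whose Bar list holds "B" and "C".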

Reading file during LINQ query

I have a simple class, and I want to store the results of these calls (which are correct so far):
Console.WriteLine(f.temp1);
Console.WriteLine(f.temp2);
in my Definitions class, with temp1 as Name and temp2 as Id:
public class Definitions
{
public string Name { get; set; }
public string Id { get; set; }
}
class Program
{
static void Main()
{
ReadDefinitions();
}
public static void ReadDefinitions()
{
var files = from name in Directory.EnumerateFiles(Settings.Folder)
from id in File.ReadLines(name).Skip(2).Take(1)
select new
{
temp1= Path.GetFileNameWithoutExtension(name),
temp2= id
};
foreach (var f in files)
{
Console.WriteLine(f.temp1);
Console.WriteLine(f.temp2);
}
foreach (var f in files)
{
Console.WriteLine(f.temp1);
Console.WriteLine(f.temp2);
}
}
}
I know this temp stuff is silly, but I could not manage to do it directly. :(
The goal is to:
Read the directory with many thousands of files...
Put the name into Definitions.Name
Put line 3 of every file into Definitions.Id
So that I can access them anytime in my program.
(I still need to trim the 3 leftmost and the 4 rightmost characters of the line, but I'll probably manage that myself.)
If I understand correctly, you just need to do this:
var files = from name in Directory.EnumerateFiles(Settings.Folder)
select new
{
temp1= Path.GetFileNameWithoutExtension(name),
temp2= File.ReadLines(name).Skip(2).First()
};
If you want to skip the temp stuff then you can:
var files = from name in Directory.EnumerateFiles(Settings.Folder)
select new Definitions
{
Name = Path.GetFileNameWithoutExtension(name),
Id = File.ReadLines(name).Skip(2).First()
};
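One thing to keep in mind: a LINQ query like the ones above is lazy, so every foreach over it re-reads the files. Since the goal is to access the data at any time, it is probably worth materializing the result once. Here is a sketch under that assumption, including the trimming mentioned in the question; DefinitionLoader, TrimId and Load are made-up names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class Definitions
{
    public string Name { get; set; }
    public string Id { get; set; }
}

public static class DefinitionLoader
{
    // Drops the first 3 and the last 4 characters; returns "" when the line is too short.
    public static string TrimId(string line)
    {
        return line.Length > 7 ? line.Substring(3, line.Length - 7) : string.Empty;
    }

    // Reads each file once and keeps the results in memory.
    public static List<Definitions> Load(string folder)
    {
        return Directory.EnumerateFiles(folder)
            .Select(path => new Definitions
            {
                Name = Path.GetFileNameWithoutExtension(path),
                // FirstOrDefault avoids an exception for files with fewer than 3 lines.
                Id = TrimId(File.ReadLines(path).Skip(2).FirstOrDefault() ?? string.Empty)
            })
            .ToList(); // forces the query to run now instead of on every enumeration
    }
}
```

With that in place, something like var definitions = DefinitionLoader.Load(Settings.Folder); gives you a list you can loop over as often as you like without touching the disk again.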

Importing CSV data into C# classes

I know how to read and display a line of a .csv file. Now I would like to parse that file, store its contents in arrays, and use those arrays as values for some classes I created.
I'd like to learn how though.
Here is an example:
basketball,2011/01/28,Rockets,Blazers,98,99
baseball,2011/08/22,Yankees,Redsox,4,3
As you can see, the fields are separated by commas. I've created Basketball and Baseball classes that extend the Sport class, which has these fields:
private string sport;
private string date;
private string team1;
private string team2;
private string score;
I understand that this is simplistic, and that there are better ways of storing this info (creating classes for each team, making the date a DateTime, and so on), but I'd like to know how to get this information into the classes.
I'm assuming this has something to do with getters and setters... I've also read of dictionaries and collections, but I'd like to start simple by storing them all in arrays... (If that makes sense... Feel free to correct me).
Here is what I have so far. All it does is read the csv and parrot out its contents on the Console:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace Assign01
{
class Program
{
static void Main(string[] args)
{
string line;
FileStream aFile = new FileStream("../../sportsResults.csv", FileMode.Open);
StreamReader sr = new StreamReader(aFile);
// read data in line by line
while ((line = sr.ReadLine()) != null)
{
Console.WriteLine(line);
}
sr.Close();
}
}
}
Help would be much appreciated.
For a resilient, fast, and low-effort solution, you can use CsvHelper, which handles a lot of the code and edge cases for you and has pretty good documentation.
First, install the CsvHelper package from NuGet.
a) CSV with Headers
If your csv has headers like this:
sport,date,team 1,team 2,score 1,score 2
basketball,2011/01/28,Rockets,Blazers,98,99
baseball,2011/08/22,Yankees,Redsox,4,3
You can add attributes to your class to map the field names to your class names like this:
public class SportStats
{
[Name("sport")]
public string Sport { get; set; }
[Name("date")]
public DateTime Date { get; set; }
[Name("team 1")]
public string TeamOne { get; set; }
[Name("team 2")]
public string TeamTwo { get; set; }
[Name("score 1")]
public int ScoreOne { get; set; }
[Name("score 2")]
public int ScoreTwo { get; set; }
}
And then invoke like this:
List<SportStats> records;
using (var reader = new StreamReader(@".\stats.csv"))
using (var csv = new CsvReader(reader))
{
records = csv.GetRecords<SportStats>().ToList();
}
b) CSV without Headers
If your csv doesn't have headers like this:
basketball,2011/01/28,Rockets,Blazers,98,99
baseball,2011/08/22,Yankees,Redsox,4,3
You can add attributes to your class and map to the CSV ordinally by position like this:
public class SportStats
{
[Index(0)]
public string Sport { get; set; }
[Index(1)]
public DateTime Date { get; set; }
[Index(2)]
public string TeamOne { get; set; }
[Index(3)]
public string TeamTwo { get; set; }
[Index(4)]
public int ScoreOne { get; set; }
[Index(5)]
public int ScoreTwo { get; set; }
}
And then invoke like this:
List<SportStats> records;
using (var reader = new StreamReader(@".\stats.csv"))
using (var csv = new CsvReader(reader))
{
csv.Configuration.HasHeaderRecord = false;
records = csv.GetRecords<SportStats>().ToList();
}
Further Reading
Reading CSV file and storing values into an array (295🡅)
Parsing CSV files in C#, with header (245🡅)
Import CSV file to strongly typed data structure in .Net (104🡅)
Reading a CSV file in .NET? (45🡅)
Is there a “proper” way to read CSV files (17🡅)
... many more
Creating an array to hold the information is not a very good idea, as you don't know how many lines the input file will have. What would the initial size of your array be? I would advise you to use a generic list (e.g. List<T>) instead.
You can also add a constructor to your Sport class that accepts an array (the result of the Split call described in the answer above).
Additionally, you can perform some conversions there:
public class Sport
{
private string sport;
private DateTime date;
private string team1;
private string team2;
private string score;
public Sport(string[] csvArray)
{
this.sport = csvArray[0];
this.team1 = csvArray[2];
this.team2 = csvArray[3];
this.date = Convert.ToDateTime(csvArray[1]);
this.score = String.Format("{0}-{1}", csvArray[4], csvArray[5]);
}
}
Just for simplicity I used the Convert method, but keep in mind that this is not a very safe approach unless you are sure that the date field always contains valid dates and the score fields always contain numeric values. You can try safer methods such as TryParse, or add some exception handling.
In all honesty, I must add that although the above solution is simple (as requested), on a conceptual level I would advise against it. Putting the mapping logic between the attributes and the CSV file in the class makes the Sport class too dependent on the file itself and thus less reusable. Any later change in the file structure then has to be reflected in your class, which is easily overlooked. It would therefore be wiser to put your mapping and conversion logic in the main program and keep your class as clean as possible.
(I handled your score issue by formatting the two scores as a single string joined with a hyphen.)
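As a rough illustration of the safer TryParse route mentioned above, the helpers below return false instead of throwing when a field is malformed. They assume dates in the yyyy/MM/dd form shown in the question; SafeFieldParsing and its method names are hypothetical.

```csharp
using System;
using System.Globalization;

public static class SafeFieldParsing
{
    // Parses dates written like 2011/01/28; returns false on anything else.
    public static bool TryParseDate(string text, out DateTime date)
    {
        return DateTime.TryParseExact(
            text, "yyyy/MM/dd", CultureInfo.InvariantCulture,
            DateTimeStyles.None, out date);
    }

    // Parses a whole-number score; returns false for empty or non-numeric text.
    public static bool TryParseScore(string text, out int score)
    {
        return int.TryParse(
            text, NumberStyles.Integer, CultureInfo.InvariantCulture, out score);
    }
}
```

The calling code can then decide what to do with a bad row (skip it, log it, or stop), instead of letting Convert.ToDateTime throw in the middle of the import.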
Splitting the string into arrays to get the data can be error-prone and slow. Try using an OLE DB data provider to read the CSV as if it were a table in an SQL database; this way you can use a WHERE clause to filter the results.
App.Config:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
<connectionStrings>
<add name="csv" providerName="System.Data.OleDb" connectionString="Provider=Microsoft.Jet.OLEDB.4.0;Data Source='C:\CsvFolder\';Extended Properties='text;HDR=Yes;FMT=Delimited';" />
</connectionStrings>
</configuration>
program.cs:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.OleDb;
using System.Configuration;
using System.Data;
using System.Data.Common;
namespace CsvImport
{
class Stat
{
public string Sport { get; set; }
public DateTime Date { get; set; }
public string TeamOne { get; set; }
public string TeamTwo { get; set; }
public int Score { get; set; }
}
class Program
{
static void Main(string[] args)
{
ConnectionStringSettings csv = ConfigurationManager.ConnectionStrings["csv"];
List<Stat> stats = new List<Stat>();
using (OleDbConnection cn = new OleDbConnection(csv.ConnectionString))
{
cn.Open();
using (OleDbCommand cmd = cn.CreateCommand())
{
cmd.CommandText = "SELECT * FROM [Stats.csv]";
cmd.CommandType = CommandType.Text;
using (OleDbDataReader reader = cmd.ExecuteReader(CommandBehavior.CloseConnection))
{
int fieldSport = reader.GetOrdinal("sport");
int fieldDate = reader.GetOrdinal("date");
int fieldTeamOne = reader.GetOrdinal("teamone");
int fieldTeamTwo = reader.GetOrdinal("teamtwo");
int fieldScore = reader.GetOrdinal("score");
foreach (DbDataRecord record in reader)
{
stats.Add(new Stat
{
Sport = record.GetString(fieldSport),
Date = record.GetDateTime(fieldDate),
TeamOne = record.GetString(fieldTeamOne),
TeamTwo = record.GetString(fieldTeamTwo),
Score = record.GetInt32(fieldScore)
});
}
}
}
}
foreach (Stat stat in stats)
{
Console.WriteLine("Sport: {0}", stat.Sport);
}
}
}
}
Here's how the csv should look
stats.csv:
sport,date,teamone,teamtwo,score
basketball,28/01/2011,Rockets,Blazers,98
baseball,22/08/2011,Yankees,Redsox,4
While there are a lot of libraries that make CSV reading easy, all you need to do now that you have the line is split it.
string[] csvFields = line.Split(',');
Now assign each field to the appropriate member:
sport = csvFields[0];
date = csvFields[1];
//and so on
This will however overwrite the values each time you read a new line, so you need to pack the values into a class and save the instances of that class to a list.
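Packing the values into a class and collecting the instances could look roughly like this. It is only a sketch for simple files without quoted fields; SportResult and SportResultReader are made-up names, and the two score columns are joined with a hyphen purely for illustration.

```csharp
using System.Collections.Generic;

public class SportResult
{
    public string Sport { get; set; }
    public string Date { get; set; }
    public string Team1 { get; set; }
    public string Team2 { get; set; }
    public string Score { get; set; }
}

public static class SportResultReader
{
    // Turns each CSV line into one SportResult and collects them in a list.
    public static List<SportResult> FromLines(IEnumerable<string> lines)
    {
        var results = new List<SportResult>();
        foreach (var line in lines)
        {
            var fields = line.Split(',');
            if (fields.Length < 6) continue; // skip blank or malformed lines

            results.Add(new SportResult
            {
                Sport = fields[0],
                Date = fields[1],
                Team1 = fields[2],
                Team2 = fields[3],
                Score = fields[4] + "-" + fields[5]
            });
        }
        return results;
    }
}
```

You would call it with the lines of your file, for example SportResultReader.FromLines(File.ReadLines("../../sportsResults.csv")), and get one object per row back.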
LINQ also has a solution for this, and you can define your output as either a List or an array. The example below assumes a class (TestModel) that defines the data and its data types.
var modelData = File.ReadAllLines(dataFile)
.Skip(1)
.Select(x => x.Split(','))
.Select(dataRow => new TestModel
{
Column1 = dataRow[0],
Column2 = dataRow[1],
Column3 = dataRow[2],
Column4 = dataRow[3]
}).ToList(); // Or you can use .ToArray()
// Requires a reference to Microsoft.VisualBasic.dll
using System;
using Microsoft.VisualBasic.FileIO;
class Program {
static void Main(string[] args){
using(var csvReader = new TextFieldParser(@"sportsResults.csv")){
csvReader.SetDelimiters(new string[] {","});
string [] fields;
while(!csvReader.EndOfData){
fields = csvReader.ReadFields();
// Replace this line with code that creates an instance from the fields
Console.WriteLine(String.Join(",",fields));
}
}
}
}
Below is a solution for beginners who like to learn by trial and error.
Please don't forget to add System.Core.dll to your references.
Import the namespace in your .cs file: using System.Linq;
An iterator might make for better code:
private static IEnumerable<string> GetDataPerLines()
{
// The using block closes the reader even if the caller stops enumerating early.
using (StreamReader sr = new StreamReader("sportsResults.csv"))
{
string line;
while ((line = sr.ReadLine()) != null)
{
yield return line;
}
}
}
static void Main(string[] args)
{
var query = from data in GetDataPerLines()
let splitChr = data.Split(',')
select new Sport
{
sport = splitChr[0],
date = splitChr[1]
// ... and so on for the remaining fields
};
foreach (var item in query)
{
Console.WriteLine(" Sport = {0}, in date when {1}", item.sport, item.date);
}
}
Something like this. The sample above creates its own iterator using yield (please see the MSDN documentation for that) and builds a collection from your strings.
Let me know if the code is wrong, since I didn't have Visual Studio at hand when I wrote this answer.
For your information, a one-dimensional array such as Sport[] implements IEnumerable<Sport> in the CLR.
