Reading multiple classes from a single CSV file using CsvHelper - C#

I've been using Josh Close's CsvHelper a bit recently to parse CSV files, and I quite like the fluent API for class mapping.
I'm trying to map a CSV file which contains multiple record types. The file structure is:
C,Comment,Timestamp
I,Class1,Header1,Header2
D,Class1,Data1,Data2
D,Class1,Data1,Data2
...
I,Class2,Header1,Header2,Header3
D,Class2,Data1,Data2,Data3
D,Class2,Data1,Data2,Data3
...
C,Checksum
Is this something which can be handled by CsvHelper? I've written a custom parser which basically works, but all it really does is filter out the Header and Data fields for a specific class. I'd really like to be able to do something like:
csv.Configuration.RegisterClassMap<Class1>();
csv.Configuration.RegisterClassMap<Class2>();
var data1 = csv.GetRecords<Class1>().ToList();
var data2 = csv.GetRecords<Class2>().ToList();
and read the file in one pass. Is this possible, or am I using the wrong parser?
Regards
Dave

There is a way to do this; you just have to do it manually:
1. Manually read the CSV file row by row.
2. Inspect the first column for the discriminator that indicates you need to map the row to a class.
3. Inspect the second column for the class to map to.
4. Map the entire row to that class.
public static void ReadMultiClassCsv()
{
    var class1Data = new List<Class1>();
    var class2Data = new List<Class2>();

    using (StreamReader reader = File.OpenText(@"C:\filename.csv"))
    using (var csvReader = new CsvReader(reader))
    {
        // 1. Manually read the CSV file row by row.
        while (csvReader.Read())
        {
            var discriminator = csvReader.GetField<string>(0);

            // 2. Inspect the first column for the discriminator.
            if (discriminator == "D")
            {
                var classType = csvReader.GetField<string>(1);

                // 3. Inspect the second column for the class to map to.
                switch (classType)
                {
                    // 4. Map the entire row to that class.
                    case "Class1":
                        class1Data.Add(csvReader.GetRecord<Class1>());
                        break;
                    case "Class2":
                        class2Data.Add(csvReader.GetRecord<Class2>());
                        break;
                    default:
                        break;
                }
            }
        }
    }
}
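For GetRecord<Class1>() to map those rows, the class maps need to account for the two leading columns (discriminator and class name) so the data fields start at index 2. A minimal sketch of what such an index-based map might look like; the Data1/Data2 property names are assumptions, and depending on your CsvHelper version the base class is ClassMap<T> or CsvClassMap<T>:
public sealed class Class1Map : ClassMap<Class1>
{
    public Class1Map()
    {
        // Columns 0 and 1 hold the "D" discriminator and the class name,
        // so the actual data fields start at index 2.
        Map(m => m.Data1).Index(2);
        Map(m => m.Data2).Index(3);
    }
}
Register the maps on the reader before the loop, e.g. csvReader.Configuration.RegisterClassMap<Class1Map>();, and GetRecord<Class1>() will then pick the right columns out of each "D" row.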

Related

Basic Read CSV File Questions

Thanks in advance, C# newb here having a few issues.
I have this CSV file provided daily; it's large and has no header. I only need certain items out of this file.
Here is the code I have so far.
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    HasHeaderRecord = false,
};

using (var reader = new StreamReader(iFile.FileName))
using (var csv = new CsvReader(reader, config))
{
    var records = new List<BQFile>();
    csv.Read();
    csv.ReadHeader();
    while (csv.Read())
    {
        var record = new BQFile()
        {
            SNumber = csv.GetField<string>("SNumber"),
            FOBPoint = csv.GetField<string>("FOBPoint")
        };
    }
}
What I'm not understanding, since this CSV file has 150+ fields, is how to grab the correct data. For example, SNumber is column 46 and FOBPoint is column 123. I'm finding the CsvHelper documentation a little limited.
Any help is appreciated.
What I'm not understanding, since this CSV file has 150+ fields, is how to grab the correct data
By index, because there is no header
In your BQFile, decorate the properties with an [Index(NNN)] attribute, where NNN is the column number (0-based). The IndexAttribute is found in the CsvHelper.Configuration.Attributes namespace - I mention this because Entity Framework also has an Index attribute; be sure you use the correct one.
public class BQFile
{
    [Index(46)]
    public string SNumber { get; set; }
    ...
}
Then do:
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    HasHeaderRecord = false,
};

using (var reader = new StreamReader(iFile.FileName))
using (var csv = new CsvReader(reader, config))
{
    var records = csv.GetRecords<BQFile>();
    ...
records is an enumeration on top of the file stream (via CsvHelper, which reads records as it goes and creates instances of BQFile). You can only enumerate it once, and after you're done enumerating it the file stream will be at the end; if you wanted to re-read the file you'd have to Seek the stream or renew the reader. Also, the file is only read (in chunks, progressively) as you enumerate. If you return records somewhere, so that you drop out of the using and thus dispose the reader, you'll get an error when you try to start reading from records (because it's disposed).
To work with records, you either foreach it, processing the objects you get as you go:
foreach(BQFile bqf in records){
//do stuff with each BQFile here
}
Or, if you want to load it all into memory, you can ToList() it so you end up with a List of BQFile, which you can then access randomly, read over and over, etc.:
var bqfs = records.ToList();
PS: I don't know whether "it's column 46" is counting from 1 or 0, so you might have to adjust your 46.
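Putting it together, a minimal end-to-end sketch (assuming the 0-based positions really are 46 and 123, per the PS above):
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.Configuration.Attributes;

public class BQFile
{
    [Index(46)]
    public string SNumber { get; set; }

    [Index(123)]
    public string FOBPoint { get; set; }
}

public static class BQFileReader
{
    public static List<BQFile> ReadAll(string path)
    {
        var config = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            HasHeaderRecord = false,
        };

        using (var reader = new StreamReader(path))
        using (var csv = new CsvReader(reader, config))
        {
            // Materialize with ToList() so the records survive disposing the reader.
            return csv.GetRecords<BQFile>().ToList();
        }
    }
}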

How do I create column headers using CsvHelper?

So I just installed CsvHelper because that's the one I heard was the better one to use.
I looked at the documentation and tried to figure out how to accomplish what I want, which is to create one column and fill it with values that are separated by commas (,),
and then a cell under that one with the corresponding values, like this:
ID,Type,Name,Description,Image
11,Variation,MyCoolProduct,A super cool product, Image1 | Image2
I don't want different columns to the side; I want one column with a string inside that is formatted like that.
This is what I did, which didn't work - you can't even open the file because you get a SYLK format issue:
var records = new List<Columns>
{
    new Columns
    {
        ID = 12,
        Type = "Variation",
        Description = "Simple product with different colors",
        Images = "Image1 | Image 2",
        Price = 19.99d
    }
};

using (StreamWriter sw = new StreamWriter("Testfile.csv"))
{
    var writer = new CsvWriter(sw);
    writer.WriteRecords(records);
}
UPDATE
I've mapped it like this now; how do I write it out to a text file?
public sealed class MyClassMap : ClassMap<Columns>
{
    public MyClassMap()
    {
        Map(m => m.ID);
    }
}
You should generally create a CsvClassMap class and map your class to the CSV format you need.
It's really simple and has a nice fluent interface. Just do something like:
public class ColMap : CsvClassMap<Columns>
{
    public ColMap()
    {
        Map(m => m.ID).Name("ID").Index(0);
        .
        .
        .
    }
}
Some of the other options after the .Index call will allow you to further configure the format of each of your columns.
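To answer the UPDATE (writing it out to a text file): register the map on the writer's configuration and call WriteRecords. A minimal sketch, reusing the records list from above; note that newer CsvHelper versions also require a CultureInfo argument to the CsvWriter constructor:
using (var sw = new StreamWriter("Testfile.csv"))
using (var csv = new CsvWriter(sw))
{
    csv.Configuration.RegisterClassMap<MyClassMap>();
    csv.WriteRecords(records);
}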

Enforce LF line endings with CsvHelper

If I have some LF-converted (using N++) CSV files, every time I write data to them using JoshClose's CsvHelper the line endings are back to CRLF.
Since I'm having problems with CRLF row terminators in SQL Server, I wish to keep the line endings as they were in the initial file.
I couldn't find it in the culture settings, so I compile my own version of the library.
How do I proceed?
Missing or incorrect Newline characters when using CsvHelper is a common problem with a simple but poorly documented solution. The other answers to this SO question are correct but are missing one important detail.
Configuration allows you to choose from one of four available alternatives:
// Pick one of these alternatives
csvWriter.Configuration.NewLine = NewLine.CR;
csvWriter.Configuration.NewLine = NewLine.LF;
csvWriter.Configuration.NewLine = NewLine.CRLF;
csvWriter.Configuration.NewLine = NewLine.Environment;
However, many people are tripped up by the fact that (by design) CsvWriter does not emit any newline character when you write the header using CsvWriter.WriteHeader() nor when you write a single record using CsvWriter.WriteRecord(). The reason is so that you can write additional header elements or additional record elements, as you might do when your header and row data comes from two or more classes rather than from a single class.
CsvWriter does emit the defined type of newline when you call CsvWriter.NextRecord(), and the author, JoshClose, states that you are supposed to call NextRecord() after you are done with the header and after you are done with each individual row added using WriteRecord. See CsvHelper GitHub issue 929.
When you are writing multiple records using WriteRecords() CsvWriter automatically emits the defined type of newline at the end of each record.
In my opinion this ought to be much better documented, but there it is.
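In code, the pattern looks something like this (a sketch; Foo stands in for your record type):
csv.WriteHeader<Foo>();
csv.NextRecord(); // newline after the header must be emitted explicitly

foreach (var record in records)
{
    csv.WriteRecord(record);
    csv.NextRecord(); // ditto after each individual row
}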
From what I can tell, the line terminator isn't controlled by CsvHelper. I've gotten it to work by adjusting the TextWriter I pass to CsvWriter.
TextWriter tw = File.CreateText(filepathname);
tw.NewLine = "\n";
CsvWriter csvw = new CsvWriter(tw);
csvw.WriteRecords(records);
csvw.Dispose();
Might be useful for somebody:
public static void AppendToCsv(ShopDataModel shopRecord)
{
    using (var writer = new StreamWriter(DestinationFile, true))
    {
        using (var csv = new CsvWriter(writer))
        {
            csv.WriteRecord(shopRecord);
            writer.Write("\n");
        }
    }
}
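On versions where Configuration.NewLine is available (13.0.0 and later, per the answer below), the hard-coded writer.Write("\n") could presumably be replaced by letting CsvHelper emit the configured line ending:
csv.WriteRecord(shopRecord);
csv.NextRecord(); // honors Configuration.NewLine instead of hard-coding "\n"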
As of CsvHelper 13.0.0, line-endings are now configurable via the NewLine configuration property.
E.g.:
using CsvHelper;
using CsvHelper.Configuration;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

void Main()
{
    using (var writer = new StreamWriter(@"my-file.csv"))
    {
        using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
        {
            csv.Configuration.HasHeaderRecord = false;
            csv.Configuration.NewLine = NewLine.LF; // <<####################

            var records = new List<Foo>
            {
                new Foo { Id = 1, Name = "one" },
                new Foo { Id = 2, Name = "two" },
            };

            csv.WriteRecords(records);
        }
    }
}

private class Foo
{
    public int Id { get; set; }
    public string Name { get; set; }
}

FileHelpers - Column mapping

Quick question regarding the FileHelpers library:
I have used the FileHelpers engine to read a stream and do my validation, but if the CSV file has no header we need to match/map the columns to my model, i.e.:
id, name, age, phone, sex,
but the CSV might not come in this format/order all the time, and we need to match the columns using a drop-down list for each column.
Is there any way I can do this?
Thanks,
The short answer: no. BUT you can create a dependent class dynamically.
Since you have the list of possible fields in your JSON file, I would recommend doing a basic System.IO ReadLine for the first data row, and then parsing on your delimiter to get the individual headers, i.e.:
string headerString;
var headers = new List<string>();

var file = new System.IO.StreamReader("C:\\myFile.txt");
headerString = file.ReadLine();
file.Close();

headers = headerString.Split(',').ToList();
Now you have the list of header strings from the first row to match against your JSON file. You can then create your dependent class using System.Reflection.Emit (see the link below):
typeBuilder.SetParent(typeof(MyFileHelperBaseClass));

// can place the property definitions in a loop over your headers
foreach (string h in headers)
{
    typeBuilder.DefineProperty("<header/col#>", ..., typeof(System.Int32), null);
}
See Stack Overflow question 14724822: How can I add properties to a class at runtime in C#?
File Helpers gets a little finicky at times, so it will take some tweaking.
Hope this helps
You can use File.ReadLines(@"C:\myfile.txt").First() to read the first line and get the headers.
Then you can just use a FileHelpers CodeBuilder to build your runtime class. From the example for a delimited csv file:
DelimitedClassBuilder cb = new DelimitedClassBuilder("Customers", ",");
cb.IgnoreFirstLines = 1;
cb.IgnoreEmptyLines = true;

cb.AddField("BirthDate", typeof(DateTime));
cb.LastField.TrimMode = TrimMode.Both;
cb.LastField.FieldNullValue = DateTime.Today;

cb.AddField("Name", typeof(string));
cb.LastField.FieldQuoted = true;
cb.LastField.QuoteChar = '"';

cb.AddField("Age", typeof(int));

var engine = new FileHelperEngine(cb.CreateRecordClass());

DataTable dt = engine.ReadFileAsDT("testCustomers.txt");
Then you can traverse the resulting data table.
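Combining the two answers, here is a hedged sketch of building the record class from whatever header order actually arrives, assuming every column can be read as a string and the headers are valid identifiers (the DelimitedClassBuilder namespace is FileHelpers.Dynamic in recent releases, FileHelpers.RunTime in older ones):
using System.Data;
using System.IO;
using System.Linq;
using FileHelpers;
using FileHelpers.Dynamic;

public static DataTable ReadWithDynamicColumns(string path)
{
    // Read the header row and split it into column names.
    var headers = File.ReadLines(path).First().Split(',');

    var cb = new DelimitedClassBuilder("DynamicRecord", ",");
    cb.IgnoreFirstLines = 1;
    cb.IgnoreEmptyLines = true;

    // Add one string field per header, in the order the columns arrive.
    foreach (var header in headers)
    {
        cb.AddField(header.Trim(), typeof(string));
    }

    var engine = new FileHelperEngine(cb.CreateRecordClass());
    return engine.ReadFileAsDT(path);
}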

How do I pass a collection of strings as a TextReader?

I am using the CSVHelper library, which can extract a list of objects from a CSV file with just three lines of code:
var streamReader = // Create a reader to your CSV file.
var csvReader = new CsvReader(streamReader);
List<MyCustomType> myData = csvReader.GetRecords<MyCustomType>().ToList();
However, my file has nonsense lines and I need to skip the first ten lines of the file. I thought it would be nice to use LINQ to ensure 'clean' data, and then pass that data to CsvReader, like so:
public TextReader GetTextReader(IEnumerable<string> lines)
{
    // Some magic here. Don't want to return null;
    return TextReader.Null;
}

public IEnumerable<T> ExtractObjectList<T>(string filePath) where T : class
{
    var csvLines = File.ReadLines(filePath)
                       .Skip(10)
                       .Where(l => !l.StartsWith(",,,"));

    var textReader = GetTextReader(csvLines);
    var csvReader = new CsvReader(textReader);
    csvReader.Configuration.ClassMapping<EventMap, Event>();
    return csvReader.GetRecords<T>();
}
But I'm really stuck on pushing a 'static' collection of strings through a stream like a TextReader.
My alternative here is to process the CSV file line by line through CsvReader and examine each line before extracting an object, but I find that somewhat clumsy.
The StringReader Class provides a TextReader that wraps a String. You could simply join the lines and wrap them in a StringReader:
public TextReader GetTextReader(IEnumerable<string> lines)
{
    return new StringReader(string.Join("\r\n", lines));
}
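Usage, with a hypothetical file path (materialize the deferred GetRecords enumeration, e.g. with ToList(), before the readers are disposed):
var csvLines = File.ReadLines(@"C:\myfile.csv").Skip(10);

using (var textReader = GetTextReader(csvLines))
using (var csvReader = new CsvReader(textReader))
{
    var myData = csvReader.GetRecords<MyCustomType>().ToList();
}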
An easier way would be to use CsvHelper to skip the lines.
// Skip rows.
csvReader.Configuration.IgnoreBlankLines = false;
csvReader.Configuration.IgnoreQuotes = true;

for (var i = 0; i < 10; i++)
{
    csvReader.Read();
}

csvReader.Configuration.IgnoreBlankLines = true;
csvReader.Configuration.IgnoreQuotes = false;

// Carry on as normal.
var myData = csvReader.GetRecords<MyCustomType>();
IgnoreBlankLines is turned off in case any of those first 10 rows are blank. IgnoreQuotes is turned on so you don't get any BadDataExceptions if those rows contain a ". You can turn them back afterwards for normal functionality again.
If you don't know the number of rows and need to test based on row data, you can just check csvReader.Context.Record and see if you need to stop. In that case, you would probably need to manually call csvReader.ReadHeader() before calling csvReader.GetRecords<MyCustomType>().
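A sketch of that variant; the "Id" first-header check is a placeholder for whatever marks your real header row:
// Read until the row that looks like the real header, then consume it
// as the header and carry on as normal.
while (csvReader.Read())
{
    if (csvReader.Context.Record[0] == "Id")
    {
        break;
    }
}

csvReader.ReadHeader();
var myData = csvReader.GetRecords<MyCustomType>().ToList();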
