C# Linq doesn't recognize Czech characters while reading from .csv file - c#

Basically, when I read a .csv file into a list using LINQ, every character with a diacritic turns into a <?> character. What should I do to make the code keep them as they are in the .csv file?
using (StreamReader ctec = new StreamReader(souborovejmeno))
{
var lines = File.ReadAllLines(souborovejmeno).Select(a => a.Split('\t'));
var csv = from line in lines
select (from piece in line
select piece).ToList();
foreach (var c in csv)
{
hraci.Add(new Hrac(c[0], c[1]));
listBox1.Items.Add(c[0]);
}
}
Thanks in advance for any answers. Sorry if this is a dumb question; I am not very experienced in coding.

I think your problem is a missing encoding. I see you already have an answer above that works.
var lines = File.ReadAllLines(path, Encoding.UTF8).Select(a => a.Split('\t'));
But I strongly recommend using CsvHelper:
dotnet add package CsvHelper
And use something like this (adjust the Record property types, the delimiter and the header setting to match your file):
public class Record
{
[Index(0)]
public int Key { get; set; }
[Index(1)]
public string Value { get; set; }
}
....
using (var reader = new StreamReader(souborovejmeno, Encoding.UTF8))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
var records = csv.GetRecords<Record>();
foreach(var record in records) {
hraci.Add(new Hrac(record.Key, record.Value));
listBox1.Items.Add(record.Key);
}
}
...

Try specifying an encoding, like this:
var lines = File.ReadAllLines(path, Encoding.UTF8).Select(a => a.Split('\t'));
Make sure to import System.Text.
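To make the encoding point concrete, here is a minimal, self-contained sketch. It assumes the file is saved as UTF-8; the class name EncodingDemo, the helper ReadTsv and the sample data are made up for illustration and are not from the question.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: reads a tab-separated file with an explicit
    // encoding and splits every line into its columns.
    public static string[][] ReadTsv(string path, Encoding encoding) =>
        File.ReadAllLines(path, encoding)
            .Select(line => line.Split('\t'))
            .ToArray();

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Sample data with Czech diacritics, written as UTF-8 without a BOM.
            File.WriteAllText(path, "Dvořák\tPraha\nŠťastný\tBrno\n", new UTF8Encoding(false));

            // Reading with the same encoding keeps the diacritics intact.
            string[][] rows = ReadTsv(path, Encoding.UTF8);
            Console.WriteLine(rows[0][0]);
            Console.WriteLine(rows[1][0]);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If the file was not saved as UTF-8 (for example, a legacy Windows-1250 export), the matching encoding has to be passed instead, e.g. Encoding.GetEncoding(1250). On .NET Core and later this needs Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) to be called first.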

Related

How to parse CSV to object fields with different type C#?

I don't see this specific question answered anywhere. How do you parse a CSV file and store it as a model when one of the fields is not of the same type, because it requires a little extra logic and a conversion? See below:
CSV rows/cols
"EmployeeId,OrganizationIdList"
"12345,987/654/321"
"54321,123/456/789"
Model
public class Employee
{
public long EmployeeId { get; set; }
public List<long> OrganizationIdList { get; set; }
}
Code I'm thinking about?
using (var stream = xxx(I get my stream using Azure))
using (var reader = new StreamReader(stream))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
var employees = new List<employees>();
var records = csv.GetRecords<Employee>().ToList();
foreach (var record in records)
{
employees.Add(new Employee { EmployeeId = record.EmployeeId, OrganizationIdList = record.OrganizationIdList });
}
}
Now, as you and I know, OrganizationIdList comes back null, so how do I split the incoming string to get the longs? I can't figure this out. I need it not to be hard-coded, so that the columns or the rows can switch places; I have already figured out how to do it by index. Any and all help is welcome!
Edit
I need to split 987/654/321 and convert (parse) the pieces into longs, but how?
Parse the second column as a string, then split it and convert each piece:
string inp = "987/654/321";
var bits = inp.Split('/');
var orgs = bits.Select(b => long.Parse(b)).ToList(); // long, to match List<long> in the model
You will need using System.Linq.
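Putting the split together with the requirement that column order should not be hard-coded, one possible sketch looks up each column by its header name. It assumes plain, unquoted lines as shown in the question and does not use CsvHelper; EmployeeParser and its Parse method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Employee
{
    public long EmployeeId { get; set; }
    public List<long> OrganizationIdList { get; set; } = new List<long>();
}

public static class EmployeeParser
{
    // Hypothetical helper: the first line is the header; columns are located
    // by name, so they may appear in any order.
    public static List<Employee> Parse(IEnumerable<string> lines)
    {
        List<string[]> rows = lines.Select(l => l.Split(',')).ToList();
        if (rows.Count == 0) return new List<Employee>();

        string[] header = rows[0];
        int idCol = Array.IndexOf(header, "EmployeeId");
        int orgCol = Array.IndexOf(header, "OrganizationIdList");
        if (idCol < 0 || orgCol < 0)
            throw new FormatException("Missing an expected column.");

        return rows.Skip(1)
            .Select(r => new Employee
            {
                EmployeeId = long.Parse(r[idCol], CultureInfo.InvariantCulture),
                // "987/654/321" -> [987, 654, 321]
                OrganizationIdList = r[orgCol]
                    .Split('/')
                    .Select(s => long.Parse(s, CultureInfo.InvariantCulture))
                    .ToList()
            })
            .ToList();
    }

    public static void Main()
    {
        var lines = new[]
        {
            "OrganizationIdList,EmployeeId", // columns deliberately swapped
            "987/654/321,12345",
            "123/456/789,54321"
        };
        foreach (Employee e in Parse(lines))
            Console.WriteLine(e.EmployeeId + ": " + string.Join(", ", e.OrganizationIdList));
    }
}
```

With CsvHelper itself, the same conversion would normally live in a custom type converter registered through a class map; the sketch above only illustrates the parsing logic.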

C# - Splitting by columns on a CSV file

I want to import some data from a CSV file, but I've encountered a small problem I can't figure out.
The person who gave me this file added comma-separated values inside cells, so when I split a line on commas, each of those values is added to the list separately. Instead, I would like to get the whole value of each column as a single string, but I can't figure out how.
For example, the column in question lists the days a restaurant is open. This can be Mo, Tu, We, Su, but it can also be Mo, Tu.
Is there a way to loop over the values per column, instead of over the comma-separated values?
I'm currently doing it like this, but this just adds each day to the total list of values:
using (var fs = File.OpenRead(csvUrl))
using (var reader = new StreamReader(fs, Encoding.UTF8))
{
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (i > 0)
{
var values = line.Split(',');
}
}
}
Use TextFieldParser to parse CSV files:
TextFieldParser parser = new TextFieldParser(new StringReader(lineContent));
parser.SetDelimiters(",");
string[] rawFields = parser.ReadFields();
lineContent is a string with the content of the current line in your file.
TextFieldParser is available in the namespace:
Microsoft.VisualBasic.FileIO
Don't mind the Visual Basic part; it works fine in C#.
EDIT
In your code you could implement it like this:
using (var fs = File.OpenRead(csvUrl))
using (var reader = new StreamReader(fs, Encoding.UTF8))
{
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (i > 0)
{
TextFieldParser parser = new TextFieldParser(new StringReader(line)); // parse the line that was just read
parser.SetDelimiters(",");
string[] rawFields = parser.ReadFields();
}
}
}
The best solution I have found so far for dealing with CSV values is to use the built-in .NET libraries.
It is explained in my Stack Overflow answer here:
Reading CSV file and storing values into an array
For easy reference, I am including the code here as well.
using Microsoft.VisualBasic.FileIO;
var path = @"C:\Person.csv"; // Habeeb, "Dubai Media City, Dubai"
using (TextFieldParser csvParser = new TextFieldParser(path))
{
csvParser.CommentTokens = new string[] { "#" };
csvParser.SetDelimiters(new string[] { "," });
csvParser.HasFieldsEnclosedInQuotes = true;
// Skip the row with the column names
csvParser.ReadLine();
while (!csvParser.EndOfData)
{
// Read current line fields, pointer moves to the next line.
string[] fields = csvParser.ReadFields();
string Name = fields[0];
string Address = fields[1];
}
}
More details about the parser are given here: http://codeskaters.blogspot.ae/2015/11/c-easiest-csv-parser-built-in-net.html
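As a small illustration of the TextFieldParser approach from the answers above, the following sketch parses in-memory text in which the opening days are quoted, so they stay in one field. It assumes the multi-value cells are enclosed in double quotes in the file, which is how CSV exports usually write them; the class name QuotedCsvDemo and the sample rows are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuotedCsvDemo
{
    // Hypothetical helper: returns one string[] per row, keeping quoted
    // fields such as "Mo, Tu" together as a single value.
    public static List<string[]> Parse(string csvText)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }

    public static void Main()
    {
        string csv = "Name,OpenDays\n" +
                     "Cafe One,\"Mo, Tu, We, Su\"\n" +
                     "Cafe Two,\"Mo, Tu\"\n";

        List<string[]> rows = Parse(csv);
        // Start at 1 to skip the header row.
        for (int i = 1; i < rows.Count; i++)
        {
            Console.WriteLine(rows[i][0] + " -> " + rows[i][1]);
        }
    }
}
```

If the cells are not quoted in the file, no parser can tell a column separator from a comma inside a value, and the file would have to be exported again with quoting or with a different delimiter.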

Enforce LF line endings with CsvHelper

I have some CSV files that I converted to LF line endings (using Notepad++), but every time I write data to them using JoshClose's CsvHelper, the line endings go back to CRLF.
Since I'm having problems with CRLF ROWTERMINATORs in SQL Server, I wish to keep the line endings as they were in the original file.
I couldn't find a setting for this in the culture settings, and I compile my own version of the library.
How should I proceed?
Missing or incorrect Newline characters when using CsvHelper is a common problem with a simple but poorly documented solution. The other answers to this SO question are correct but are missing one important detail.
Configuration allows you to choose from one of four available alternatives:
// Pick one of these alternatives
CsvWriter.Configuration.NewLine = NewLine.CR;
CsvWriter.Configuration.NewLine = NewLine.LF;
CsvWriter.Configuration.NewLine = NewLine.CRLF;
CsvWriter.Configuration.NewLine = NewLine.Environment;
However, many people are tripped up by the fact that (by design) CsvWriter does not emit any newline character when you write the header using CsvWriter.WriteHeader() nor when you write a single record using CsvWriter.WriteRecord(). The reason is so that you can write additional header elements or additional record elements, as you might do when your header and row data comes from two or more classes rather than from a single class.
CsvWriter does emit the defined type of newline when you call CsvWriter.NextRecord(), and the author, JoshClose, states that you are supposed to call NextRecord() after you are done with the header and after you are done with each individual row added using WriteRecord(). See issue 929 in the project's GitHub issues list.
When you are writing multiple records using WriteRecords() CsvWriter automatically emits the defined type of newline at the end of each record.
In my opinion this ought to be much better documented, but there it is.
From what I can tell, the line terminator isn't controlled by CsvHelper. I've gotten it to work by adjusting the file writer I pass to CsvWriter.
TextWriter tw = File.CreateText(filepathname);
tw.NewLine = "\n";
CsvWriter csvw = new CsvWriter(tw);
csvw.WriteRecords(records);
csvw.Dispose();
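The writer-level setting this answer relies on can be checked without CsvHelper. The sketch below only shows that TextWriter.NewLine controls what WriteLine emits; whether a given CsvHelper version uses the writer's NewLine or its own configuration is not verified here, and NewLineDemo is a made-up name.

```csharp
using System;
using System.IO;

public static class NewLineDemo
{
    // Hypothetical helper: writes the given lines with the chosen terminator
    // and returns the resulting text.
    public static string WriteLines(string newLine, params string[] lines)
    {
        using (var writer = new StringWriter())
        {
            writer.NewLine = newLine; // same property as on StreamWriter
            foreach (string line in lines)
            {
                writer.WriteLine(line);
            }
            return writer.ToString();
        }
    }

    public static void Main()
    {
        string text = WriteLines("\n", "1,one", "2,two");
        Console.WriteLine(text.Contains("\r") ? "contains CR" : "LF only"); // prints "LF only"
    }
}
```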
Might be useful for somebody:
public static void AppendToCsv(ShopDataModel shopRecord)
{
using (var writer = new StreamWriter(DestinationFile, true))
{
using (var csv = new CsvWriter(writer))
{
csv.WriteRecord(shopRecord);
writer.Write("\n");
}
}
}
As of CsvHelper 13.0.0, line-endings are now configurable via the NewLine configuration property.
E.g.:
using CsvHelper;
using CsvHelper.Configuration;
using System.Globalization;
void Main()
{
using (var writer = new StreamWriter(@"my-file.csv"))
{
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
csv.Configuration.HasHeaderRecord = false;
csv.Configuration.NewLine = NewLine.LF; // <<####################
var records = new List<Foo>
{
new Foo { Id = 1, Name = "one" },
new Foo { Id = 2, Name = "two" },
};
csv.WriteRecords(records);
}
}
}
private class Foo
{
public int Id { get; set; }
public string Name { get; set; }
}

Get a single element of CSV file

I'm trying to add some csv elements to a list of Alimento, where Alimento is declared as:
namespace ContaCarboidrati
{
class Alimento
{
public virtual string Codice { get; set; }
public virtual string Descrizione { get; set; }
public virtual int Carboidrati { get; set; }
}
}
My csv looks something like this:
"C00, Pasta, 75".
Here's the method that should create the list from the csv:
private static List<Alimento> CreaListaAlimentiDaCsv()
{
List<Alimento> listaCsv = new List<Alimento>();
StreamReader sr = new StreamReader(@"C:\Users\Alex\Documents\RecordAlimenti.csv");
string abc = sr.ReadLine();
//listaCsv = abc.Split(",");
}
abc is "C00, Pasta, 75". I want to get a single element and add it to the list, or add all three elements to the list; I thought that a single element would be easier to do.
Sorry for my bad English
Thanks in advance
Alex
You are on the right track, but you cannot just create an Alimento of three strings, which is what you will get if you do abc.Split(","). You need to create a new Alimento object for each item (line) in the csv file and initialize each object correctly. Something like this:
var item = abc.Split(',');
listaCsv.Add(new Alimento() { Codice = item[0], Descrizione = item[1],
Carboidrati = int.Parse(item[2]) });
Also, your csv seems to include spaces after the commas which you might want to get rid of. You could use string.Trim() to get rid of leading/trailing spaces. You also have to make sure the third item is actually an integer and take action if that is not the case (i.e. add some error handling).
As a side note, implementing a csv reader is not as trivial as one may think, but there are several free C# implementations out there. If you need something a bit more advanced than just reading a simple (and strictly one-line-per-item) csv, try one of these:
http://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader
http://www.filehelpers.com/
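The trimming and error handling mentioned in this answer might look like the following sketch. It assumes exactly three comma-separated values per line with no quoted fields; AlimentoParser and TryParseLine are hypothetical names, and Alimento is reduced to plain properties.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class Alimento
{
    public string Codice { get; set; }
    public string Descrizione { get; set; }
    public int Carboidrati { get; set; }
}

public static class AlimentoParser
{
    // Hypothetical helper: returns false instead of throwing when a line does
    // not have three fields or the third field is not an integer.
    public static bool TryParseLine(string line, out Alimento alimento)
    {
        alimento = null;
        if (string.IsNullOrWhiteSpace(line)) return false;

        string[] items = line.Split(',');
        if (items.Length != 3) return false;

        if (!int.TryParse(items[2].Trim(), NumberStyles.Integer,
                          CultureInfo.InvariantCulture, out int carboidrati))
        {
            return false;
        }

        alimento = new Alimento
        {
            Codice = items[0].Trim(),      // drop the spaces after the commas
            Descrizione = items[1].Trim(),
            Carboidrati = carboidrati
        };
        return true;
    }

    public static void Main()
    {
        var listaCsv = new List<Alimento>();
        foreach (string line in new[] { "C00, Pasta, 75", "C01, Riso, abc" })
        {
            if (TryParseLine(line, out Alimento a)) listaCsv.Add(a);
            else Console.WriteLine("Skipping invalid line: " + line);
        }
        Console.WriteLine(listaCsv.Count); // prints 1
    }
}
```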
You can parse the file with LINQ:
var listaCsv = (from line in File.ReadAllLines("RecordAlimenti.csv")
let items = line.Split(',')
select new Alimento {
Codice = items[0],
Descrizione = items[1],
Carboidrati = Int32.Parse(items[2])
}).ToList();
You can parse it pretty easily, assuming your data isn't bad.
private IEnumerable<Alimento> CreaListaAlimentiDaCsv(string fileName)
{
return File.ReadLines(fileName) //@"C:\Users\Alex\Documents\RecordAlimenti.csv"
// Split each line and trim the spaces around every value
.Select(line => line.Split(',').Select(value => value.Trim()).ToArray())
.Select(
values =>
new Alimento
{
Codice = values[0],
Descrizione = values[1],
Carboidrati = Convert.ToInt32(values[2])
});
}
You can also use LINQ on the method's result, such as:
//Takes one line without iterating the entire file
CreaListaAlimentiDaCsv(@"C:\Users\Alex\Documents\RecordAlimenti.csv").Take(1);
//Skips the first line and takes the second line, reading two lines total
CreaListaAlimentiDaCsv(@"C:\Users\Alex\Documents\RecordAlimenti.csv").Skip(1).Take(1);

Reading a CSV file in .NET?

How do I read a CSV file using C#?
A choice, without using third-party components, is to use the class Microsoft.VisualBasic.FileIO.TextFieldParser (http://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser.aspx) . It provides all the functions for parsing CSV. It is sufficient to import the Microsoft.VisualBasic assembly.
var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(file);
parser.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited;
parser.SetDelimiters(new string[] { ";" });
while (!parser.EndOfData)
{
string[] row = parser.ReadFields();
/* do something */
}
You can use the Microsoft.VisualBasic.FileIO.TextFieldParser class in C#:
using System;
using System.Data;
using Microsoft.VisualBasic.FileIO;
static void Main()
{
string csv_file_path = #"C:\Users\Administrator\Desktop\test.csv";
DataTable csvData = GetDataTableFromCSVFile(csv_file_path);
Console.WriteLine("Rows count:" + csvData.Rows.Count);
Console.ReadLine();
}
private static DataTable GetDataTableFromCSVFile(string csv_file_path)
{
DataTable csvData = new DataTable();
try
{
using(TextFieldParser csvReader = new TextFieldParser(csv_file_path))
{
csvReader.SetDelimiters(new string[] { "," });
csvReader.HasFieldsEnclosedInQuotes = true;
string[] colFields = csvReader.ReadFields();
foreach (string column in colFields)
{
DataColumn datacolumn = new DataColumn(column);
datacolumn.AllowDBNull = true;
csvData.Columns.Add(datacolumn);
}
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();
//Making empty value as null
for (int i = 0; i < fieldData.Length; i++)
{
if (fieldData[i] == "")
{
fieldData[i] = null;
}
}
csvData.Rows.Add(fieldData);
}
}
}
catch (Exception ex)
{
}
return csvData;
}
You could try CsvHelper, which is a project I work on. Its goal is to make reading and writing CSV files as easy as possible, while being very fast.
Here are a few ways you can read from a CSV file.
// By type
var records = csv.GetRecords<MyClass>();
var records = csv.GetRecords( typeof( MyClass ) );
// Dynamic
var records = csv.GetRecords<dynamic>();
// Using anonymous type for the class definition
var anonymousTypeDefinition = new
{
Id = default( int ),
Name = string.Empty,
MyClass = new MyClass()
};
var records = csv.GetRecords( anonymousTypeDefinition );
I usually use a simplistic approach like this one:
var path = Server.MapPath("~/App_Data/Data.csv");
var csvRows = System.IO.File.ReadAllLines(path, Encoding.Default).ToList();
foreach (var row in csvRows.Skip(1))
{
var columns = row.Split(';');
var field1 = columns[0];
var field2 = columns[1];
var field3 = columns[2];
}
I just used this library in my application. http://www.codeproject.com/KB/database/CsvReader.aspx. Everything went smoothly using this library, so I'm recommending it. It is free under the MIT License, so just include the notice with your source files.
I didn't display the CSV in a browser, but the author has some samples for Repeaters or DataGrids. I did run one of his test projects to test a Sort operation I have added and it looked pretty good.
You can try Cinchoo ETL - an open source lib for reading and writing CSV files.
Couple of ways you can read CSV files
Id, Name
1, Tom
2, Mark
This is how you can use this library to read it
using (var reader = new ChoCSVReader("emp.csv").WithFirstLineHeader())
{
foreach (dynamic item in reader)
{
Console.WriteLine(item.Id);
Console.WriteLine(item.Name);
}
}
If you have POCO object defined to match up with CSV file like below
public class Employee
{
public int Id { get; set; }
public string Name { get; set; }
}
You can parse the same file using this POCO class as below
using (var reader = new ChoCSVReader<Employee>("emp.csv").WithFirstLineHeader())
{
foreach (var item in reader)
{
Console.WriteLine(item.Id);
Console.WriteLine(item.Name);
}
}
Please check out articles at CodeProject on how to use it.
Disclaimer: I'm the author of this library
I recommend Angara.Table; its save/load documentation is at http://predictionmachines.github.io/Angara.Table/saveload.html.
It infers column types, can save CSV files, and is much faster than TextFieldParser. It follows RFC 4180 for the CSV format and supports multiline strings, NaNs, and escaped strings containing the delimiter character.
The library is under the MIT license. The source code is at https://github.com/Microsoft/Angara.Table.
Though its API is focused on F#, it can be used from any .NET language, just not as succinctly as in F#.
Example:
using Angara.Data;
using System.Collections.Immutable;
...
var table = Table.Load("data.csv");
// Print schema:
foreach(Column c in table)
{
string colType;
if (c.Rows.IsRealColumn) colType = "double";
else if (c.Rows.IsStringColumn) colType = "string";
else if (c.Rows.IsDateColumn) colType = "date";
else if (c.Rows.IsIntColumn) colType = "int";
else colType = "bool";
Console.WriteLine("{0} of type {1}", c.Name, colType);
}
// Get column data:
ImmutableArray<double> a = table["a"].Rows.AsReal;
ImmutableArray<string> b = table["b"].Rows.AsString;
Table.Save(table, "data2.csv");
You might be interested in the Linq2Csv library on CodeProject. One thing to check is whether it reads the data lazily, only when needed, so that it doesn't need a lot of memory when working with bigger files.
As for displaying the data in the browser, there are many ways to do it. If you were more specific about your requirements, the answer could be more specific, but here are two options:
1. Use the HttpListener class to write a simple web server (you can find many samples online for hosting a mini HTTP server).
2. Use ASP.NET or ASP.NET MVC, create a page, and host it using IIS.
Seems like there are quite a few projects on CodeProject or CodePlex for CSV Parsing.
Here is another CSV Parser on CodePlex
http://commonlibrarynet.codeplex.com/
This library has components for CSV parsing, INI file parsing, Command-Line parsing as well. It's working well for me so far. Only thing is it doesn't have a CSV Writer.
This is just for parsing the CSV. For displaying it in a web page, it is simply a matter of taking the list and rendering it however you want.
Note: This code example does not handle the situation where the input string line contains newlines.
public List<string> SplitCSV(string line)
{
if (string.IsNullOrEmpty(line))
throw new ArgumentException();
List<string> result = new List<string>();
int index = 0;
int start = 0;
bool inQuote = false;
StringBuilder val = new StringBuilder();
// parse line
foreach (char c in line)
{
switch (c)
{
case '"':
inQuote = !inQuote;
break;
case ',':
if (!inQuote)
{
result.Add(line.Substring(start, index - start)
.Replace("\"",""));
start = index + 1;
}
break;
}
index++;
}
if (start < index)
{
result.Add(line.Substring(start, index - start).Replace("\"",""));
}
return result;
}
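For comparison, here is a variant sketch of the same quote-aware, single-line idea that builds each field in a StringBuilder. Unlike the method above, it keeps an empty trailing field and treats a doubled quote inside a quoted field as a literal quote. It still does not handle newlines inside fields, and the name CsvLineSplitter is made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvLineSplitter
{
    // Splits one CSV line on commas that are outside double quotes.
    public static List<string> Split(string line)
    {
        if (line == null) throw new ArgumentNullException(nameof(line));

        var result = new List<string>();
        var field = new StringBuilder();
        bool inQuote = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                if (inQuote && i + 1 < line.Length && line[i + 1] == '"')
                {
                    field.Append('"'); // "" inside a quoted field is a literal quote
                    i++;
                }
                else
                {
                    inQuote = !inQuote; // opening or closing quote, not part of the value
                }
            }
            else if (c == ',' && !inQuote)
            {
                result.Add(field.ToString()); // field boundary
                field.Clear();
            }
            else
            {
                field.Append(c);
            }
        }

        result.Add(field.ToString()); // last field, possibly empty
        return result;
    }

    public static void Main()
    {
        foreach (string value in Split("a,\"b,c\",\"say \"\"hi\"\"\","))
        {
            Console.WriteLine("[" + value + "]");
        }
    }
}
```

For real files, one of the parsers recommended in the other answers is usually the safer choice; this sketch is only meant to show what the hand-written approach has to account for.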
I have been maintaining an open source project called FlatFiles for several years now. It's available for .NET Core and .NET 4.5.1.
Unlike most of the alternatives, it allows you to define a schema (similar to the way EF code-first works) with an extreme level of precision, so you aren't fighting conversion issues all the time. You can map directly to your data classes, and there is also support for interfacing with older ADO.NET classes.
Performance-wise, it's been tuned to be one of the fastest parsers for .NET, with a plethora of options for quirky format differences. There's also support for fixed-length files, if you need it.
You can use this library: Sky.Data.Csv
https://www.nuget.org/packages/Sky.Data.Csv/
This is a really fast CSV reader library, and it's really easy to use:
using Sky.Data.Csv;
var readerSettings = new CsvReaderSettings{Encoding = Encoding.UTF8};
using(var reader = CsvReader.Create("path-to-file", readerSettings)){
foreach(var row in reader){
//do something with the data
}
}
It also supports reading typed objects with the CsvReader<T> class, which has the same interface.
