I'm writing a program to read in CSV files and validate the data. The csv file is comma delimited.
The csv file contains a sales order that is retrieved online so we can't actually edit the CSV file itself. I need to read in the file and split it into the cells. However, the product description will contain further commas which is affecting how I access the data.
My code for pulling the values out is below.
private void csvParse()
{
List<string> products = new List<string>();
List<string> quantities = new List<string>();
List<string> price = new List<string>();
using (var reader = new StreamReader(txt_filePath.Text.ToString()))
{
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
products.Add(values[0]);
quantities.Add(values[2]);
values[3] = values[3].Substring(4);
price.Add(values[3]);
}
}
if (validateData(products, quantities, price) != "")
{
MessageBox.Show(validateData(products, quantities, price));
}
}
Is there anyway to ignore the columns in a set cell or can the columns distinguished by another delimiter?
A snippet of a row in my csv file is below.
The raw CSV data is below:
TO12345,"E45 Dermatological Moisturising Lotion, 500 ml",765,GBP 1.75
You can use LinqToCSV from nuGet. ie:
void Main()
{
List<MyData> sample = new List<MyData> {
new MyData {Id=1, Name="Hammer", Description="Everything looks like a nail to a hammer, doesn't it?"},
new MyData {Id=2, Name="C#", Description="A computer language."},
new MyData {Id=3, Name="Go", Description="Yet another language, from Google, cross compiles natively."},
new MyData {Id=3, Name="BlahBlah"},
};
string fileName = #"c:\temp\MyCSV.csv";
File.WriteAllText(fileName,"Id,My Product Name,Ignore1,Ignore2,Description\n");
File.AppendAllLines(fileName, sample.Select(s => $#"{s.Id},""{s.Name}"",""ignore this"",""skip this too"",""{s.Description}"""));
CsvContext cc = new CsvContext();
CsvFileDescription inputFileDescription = new CsvFileDescription
{
SeparatorChar = ',',
FirstLineHasColumnNames = true,
IgnoreUnknownColumns=true
};
IEnumerable<MyData> fromCSV = cc.Read<MyData>(fileName, inputFileDescription);
foreach (var d in fromCSV)
{
Console.WriteLine($#"ID:{d.Id},Name:""{d.Name}"",Description:""{d.Description}""");
}
}
public class MyData
{
[CsvColumn(FieldIndex = 1, Name="Id", CanBeNull = false)]
public int Id { get; set; }
[CsvColumn(FieldIndex = 2, Name="My Product Name",CanBeNull = false, OutputFormat = "C")]
public string Name { get; set; }
[CsvColumn(FieldIndex = 5, Name="Description",CanBeNull = true, OutputFormat = "C")]
public string Description { get; set; }
}
It should work..:)
var csvSplit = new Regex("(?:^|,)(\"(?:[^\"]+|\"\")*\"|[^,]*)", RegexOptions.Compiled);
string[] csvlines = File.ReadAllLines(txt_filePath.Text.ToString());
var query = csvlines.Select(csvline => new
{
data = csvSplit.Matches(csvline)
}).Select(t => t.data);
var row = query.Select(matchCollection =>
(from Match m in matchCollection select (m.Value.Contains(',')) ? m.Value.Replace(",", "") : m.Value)
.ToList()).ToList();
You can also use the Microsoft.VisualBasic.FileIO.TextFieldParser class. More detailed answer here: TextFieldParser
Related
How I can directly jump to a row of a csv file ? Please see the below shown structure of csv. I want to jump directly to the row 3 (skip 0,1,2 rows) using csvhelper class ? RowId is a column in the csv file.
RowId Name
----- ----
0 Raju
1 Sabu
2 Ravi
3 Lal
4 Babu
Here is how I read csv file :
CsvConfiguration csvConfiguration = new CsvConfiguration(CultureInfo.InvariantCulture)
{
HasHeaderRecord = true,
Delimiter = ",",
PrepareHeaderForMatch = args => args.Header.ToUpper(),
IgnoreBlankLines = true,
IgnoreReferences = true,
MissingFieldFound = null,
UseNewObjectForNullReferenceMembers = true
};
CsvReader csv = new CsvReader(File.OpenText(FileNameWithPath), csvConfiguration);
csv.Read();
csv.ReadHeader();
while (csv.Read())
{
string d0 = csv.GetField<string>("Name");
var d5 = csv.GetField<int>("Class");
}
And thiscsv file contains more than 2000 records. So If I want to jump to a row (eg: 250 rowid) by skipping all the rows just above the rowid=230, how i can do that ?
After finding this rowid = 230, i also want to start reading from row id 230 onwards.
How we can do that using csvhelper with c#?
One way is to use ShouldSkipRecords.
void Main()
{
var s = new StringBuilder();
s.Append("RowId,Name\r\n");
s.Append("0,zero\r\n");
s.Append("1,one\r\n");
s.Append("2,two\r\n");
s.Append("3,three\r\n");
s.Append("4,four\r\n");
s.Append("5,five\r\n");
s.Append("6,six\r\n");
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
ShouldSkipRecord = args => args.Row.Parser.Row > 1 && args.Row.GetField<int>("RowId") < 3,
};
using (var reader = new StringReader(s.ToString()))
using (var csv = new CsvReader(reader, config))
{
csv.GetRecords<Foo>().ToList().Dump();
}
}
private class Foo
{
public int RowId { get; set; }
public string Name { get; set; }
}
Another is to read manually.
void Main()
{
var s = new StringBuilder();
s.Append("RowId,Name\r\n");
s.Append("0,zero\r\n");
s.Append("1,one\r\n");
s.Append("2,two\r\n");
s.Append("3,three\r\n");
s.Append("4,four\r\n");
s.Append("5,five\r\n");
s.Append("6,six\r\n");
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
};
using (var reader = new StringReader(s.ToString()))
using (var csv = new CsvReader(reader, config))
{
var records = new List<Foo>();
csv.Read();
csv.ReadHeader();
while (csv.Read())
{
var rowId = csv.GetField<int>("RowId");
if (rowId < 3)
{
continue;
}
records.Add(csv.GetRecord<Foo>());
}
records.Dump();
}
}
private class Foo
{
public int RowId { get; set; }
public string Name { get; set; }
}
this is my .csv file:
Apple,rose,tiger
Mango,lily,cheetah
Banana,sunflower,lion
Apple,marigold,cat
input: Mango (i write it in the text box)
desired output:
Flower = lily; Animal = cheetah
similarly,
input: Apple
desired output:
Flower = rose,marigold; Animal = tiger,cat
this is the code i have written:
private void button1_Click(object sender, EventArgs e)
{
using (var reader = new StreamReader(#"C:\asp_net\abc.csv"))
{
List<string> listA = new List<string>();
List<string> listB = new List<string>();
List<string> listC = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
listA.Add(values[0]);
listB.Add(values[1]);
listC.Add(values[2]);
}
string checkThis = obj.SearchSenSig(textBox1.Text);
if (listA.Any(checkThis.Contains))
{
int count = listA.Where(x => x.Equals(checkThis)).Count();
if (count == 1)
{
int index = listA.IndexOf(checkThis);
var firstItem = listB.ElementAt(index);
var secondItem = listC.ElementAt(index);
MessageBox.Show(String.Format("receiver = {0}, url = {1}", firstItem, secondItem));
}
else
{
foreach (string item in listA)
{
int i = listA.IndexOf(item);
bool result = item.Equals(checkThis);
if (result)
{
List<string> myCollection1 = new List<string>();
myCollection1.Add(listB.ElementAt(i));
string firstItem = string.Join(",", myCollection1);
List<string> myCollection2 = new List<string>();
myCollection2.Add(listC.ElementAt(i));
string secondItem = string.Join(",", myCollection2);
MessageBox.Show(String.Format("Flower = {0}, Animal = {1}", firstItem, secondItem));
}
}
}
}
else
{
MessageBox.Show("The search element does not exists.");
}
}
Still i am not getting the desired output. Please help.
Instead of creating a different list for each column, create a single class to hold an entire row data:
class Data // I'll bet you can find a better name for this class...
{
public string Fruit {get;set;}
public string Flower {get;set;}
public string Animal {get;set;}
}
and populate a list of this class:
private List<Data> data = new List<Data>(); // note: this is a field, not a local variable.
Populating this list should be done only once, in the constructor or in the form_load event:
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
data.Add(
new Data()
{
Fruit = values[0],
Flower = values[1],
Animal = values[2]
}
);
}
Now all you have to do in the button_click event handler is get all the items corresponding to your search string. Assuming you are only searching for fruits using the FindAll method and display the results:
var result = data.FindAll(d => d.Fruit == searchString);
This will return a list of Data where the Fruit property contains the same string as searchString. use linq and string.Join to format the results into a string:
var resultString = $"Flower = {string.Join(",", result.Select(r => r.Flower))}; Animal = {string.Join(",", result.Select(r => r.Animal))}";
I am trying to use csv helper libary to parse my csv. But I am having an issue it says that the itemcode does not exist when its there in the file.
// Adding stock item code
Sage.Accounting.Stock.StockItem stockItem = new Sage.Accounting.Stock.StockItem();
string line = null;
public void ImportCsv(string filename)
{
TextReader reader = File.OpenText(filename);
var csv = new CsvReader(reader);
csv.Configuration.HasHeaderRecord = true;
csv.Read();
// Dynamic
// Using anonymous type for the class definition
var anonymousTypeDefinition = new
{
Itemcode = string.Empty,
Barcode = string.Empty
};
var records = csv.GetRecords(anonymousTypeDefinition);
}
This is the csv structure
"Itemcode","Barcode","description"
"P4S100001","303300054486","Test Product"
This is my first time using the csvhelper as showing here at https://joshclose.github.io/CsvHelper/
You are better off creating a strongly typed model to hold the data if one does not already exist
public class Item {
public string Itemcode { get; set; }
public string Barcode { get; set; }
public string description { get; set; }
}
and using GetRecords<T>() to read the records by type
TextReader reader = File.OpenText(filename);
var csv = new CsvReader(reader);
var records = csv.GetRecords<Item>();
Your GetRecords function needs a type specifier like so:
var records = csv.GetRecords<type>();
Also you may want to put csv.Read() in a while loop depending on your need.
Since all your values have quotes you need to specify it in the config. Working with quotes in csvHelper is frustrating. if not all if the values have quotes there are ways to handle that as well but not as nicely as this
var csv = new CsvReader(reader,new CsvHelper.Configuration.Configuration
{
HasHeaderRecord = true,
QuoteAllFields = true
});
var anonymousTypeDefinition = new
{
Itemcode = string.Empty,
Barcode = string.Empty
};
var records = csv.GetRecords(anonymousTypeDefinition);
I need to read data from a .csv file and store the header and the content in my object in the following format. A list of the below mentioned class.
public class MappingData
{
private string ColumnName { get; set; }
private List<string> Data { get; set; }
}
So for example say I have a table like as shown below,
| Name | Phone | City |
|:-----------|------------:|:------------:|
| Yassser | 32342342234 | Mumbai
| Sachin | 32342342234 | Surat
| Will | 32342342234 | London
So for the above data my class should have 3 objects, first object will have the following details
ColumnName : 'Name'
Data: ['Yasser', 'Sachin', 'Will']
so, this is what I am trying to do, Below is what I have started with. I am reading the file using stream reader and spliting each line with comma seperator.
private List<MappingData> GetData(string filename)
{
var data = new List<MappingData>();
string fullPath = GetFilePath(filename);
StreamReader reader = new StreamReader(fullPath);
while (!reader.EndOfStream)
{
string line = reader.ReadLine();
if (!String.IsNullOrWhiteSpace(line))
{
string[] values = line.Split(',');
}
}
return data;
}
can some one please help me mold this data into the required format. Thanks.
You should create a small int variable (or bool if you prefer) to determine whether the first row has been completed :
And create each of the list objects you will need (mpName, mpPhone, mpCity), on the first row you set the ColumnName property and on the subsequent rows you add to the MappingData's data list.
Then you add each of the lists (mpName, mpPhone, mpCity) to the data list for the method and return this.
private List<MappingData> GetData(string filename) {
List<MappingData> data = new List<MappingData>();
int NumRow = 0;
MappingData mpName = new MappingData();
MappingData mpPhone = new MappingData();
MappingData mpCity = new MappingData();
string fullPath = GetFilePath(filename);
StreamReader reader = new StreamReader(fullPath);
while (!reader.EndOfStream) {
string line = reader.ReadLine();
if (!String.IsNullOrWhiteSpace(line)) {
string[] values = line.Split(',');
if (NumRow == 0) {
mpName.ColumnName = values[0];
mpPhone.ColumnName = values[1];
mpCity.ColumnName = values[2];
NumRow = 1;
} else {
mpName.Data.Add(values[0]);
mpPhone.Data.Add(values[1]);
mpCity.Data.Add(values[2]);
}
}
}
data.Add(mpName);
data.Add(mpPhone);
data.Add(mpCity);
return data;
}
Hope this helps.
There's an excellent library for processing the CSV files.
KentBoogard
It is very easy with the above third party library to read content from the CSV files.
I would suggest this because you just seem to be starting and don't re invent the wheel.
Still, if you want to process the file in your own way, here's one working implementation. Enjoy
var csvData = File.ReadAllLines("d:\\test.csv");
var dataRows = csvData.Skip(1).ToList();
var csvHeaderColumns = csvData.First().Split(',').ToList();
var outputList = new List<MappingData>();
foreach (var columnName in csvHeaderColumns)
{
var obj = new MappingData { columnName = columnName, data = new List<string>() };
foreach (var rowStrings in dataRows.Select(dataRow => dataRow.Split(',').ToList()))
{
obj.data.Add(rowStrings[csvHeaderColumns.IndexOf(columnName)]);
}
outputList.Add(obj);
}
This will populate your MappingData class.
Assuming the rest of your method is correct then try this:
private List<MappingData> GetData(string filename)
{
var raw = new List<string[]>();
var data = new List<MappingData>();
string fullPath = GetFilePath(filename);
using(var reader = new StreamReader(fullPath))
{
while (!reader.EndOfStream)
{
string line = reader.ReadLine();
if (!String.IsNullOrWhiteSpace(line))
{
raw.Add(line.Split(','));
}
}
}
Func<int, MappingData> extract =
n => new MappingData()
{
ColumnName = raw[0][n],
Data = raw.Skip(1).Select(x => x[n]).ToList(),
};
data.Add(extract(0));
data.Add(extract(1));
data.Add(extract(2));
return data;
}
You'd have to make you MappingData properties accessible though.
Take a look and let me know what the hell i'm derping on ;)
[HttpPost]
public ActionResult Upload(HttpPostedFileBase File)
{
HttpPostedFileBase csvFile = Request.Files["adGroupCSV"];
byte[] buffer = new byte[csvFile.ContentLength];
csvFile.InputStream.Read(buffer, 0, csvFile.ContentLength);
string csvString = System.Text.Encoding.UTF8.GetString(buffer);
string[] lines = Regex.Split(csvString, "\r");
List<string[]> csv = new List<string[]>();
foreach (string line in lines)
{
csv.Add(line.Split(','));
}
string json = new System.Web.Script.Serialization.JavaScriptSerializer().Serialize(csv);
ViewData["CSV"] = json;
return View(ViewData);
}
This is how it is coming across:
json = "[[\"Col1\",\"Col2\",\"Col3\",\"Col4\",\"Col5\",\"Col6\"],[\"test\",\"test\",\"test\",\"test\",\"http://www.test.com/\",\"test/\"],[\"test\",\"test\",\"test\",\"test\",\"http://www.test.com...
This is how I want it:
{"Col1":"test","Col2":"test","Col3":"test","Col4":"test","Col5":"http://www.test.com/","Col6":"test/"}
Here is an example of the CSV
Col1,Col2,Col3,Col4,Col5,Col6
Test1,Test1,Test1,test1,test.test,test/
Test2,Test2,Test2,test2,test.test,test/
Test3,Test3,Test3,test3,test.test,test/
Test4,Test4,Test4,test4,test.test,test/
You need a dictionary. Just replace
List<string[]> csv = new List<string[]>();
foreach (string line in lines)
{
csv.Add(line.Split(','));
}
with
var csv = lines.Select(l => l.Split(',')
.Select((s,i)=>new {s,i})
.ToDictionary(x=>"Col" + (x.i+1), x=>x.s));
It should work...
EDIT
var lines = csvString.Split(new char[] { '\n', '\r' }, StringSplitOptions.RemoveEmptyEntries);
var cols = lines[0].Split(',');
var csv = lines.Skip(1)
.Select(l => l.Split(',')
.Select((s, i) => new {s,i})
.ToDictionary(x=>cols[x.i],x=>x.s));
var json = new JavaScriptSerializer().Serialize(csv);
It looks like you have a an array of string arrays when you just want one object with all your columns as properties on it.
Instead of building up your
List<string[]> csv = new List<string[]>();
Can you make a new object from your JSON, like this:
public class UploadedFileObject
{
public string Col1 { get; set; }
public string Col2 { get; set; }
public string Col3 { get; set; }
public string Col4 { get; set; }
public string Col5 { get; set; }
public string Col6 { get; set; }
}
[HttpPost] public ActionResult Upload(HttpPostedFileBase File)
{
HttpPostedFileBase csvFile = Request.Files["adGroupCSV"];
byte[] buffer = new byte[csvFile.ContentLength];
csvFile.InputStream.Read(buffer, 0, csvFile.ContentLength);
string csvString = System.Text.Encoding.UTF8.GetString(buffer);
string[] lines = Regex.Split(csvString, "\r");
List<UploadedFileObject> returnObject = new List<UploadedFileObject>();
foreach (string line in lines)
{
String[] lineParts = line.Split(',');
UploadedFileObject lineObject = new UploadedFileObject();
lineObject.Col1 = lineParts[0];
lineObject.Col2 = lineParts[1];
lineObject.Col3 = lineParts[2];
lineObject.Col4 = lineParts[3];
lineObject.Col5 = lineParts[4];
lineObject.Col6 = lineParts[5];
returnObject.add(lineObject);
}
string json = new System.Web.Script.Serialization.JavaScriptSerializer().Serialize(returnObject);
ViewData["CSV"] = json;
return View(ViewData);
}
In line with the previous answer, perhaps you could use ServiceStack to deserialize the CSV into an array of objects, then re-serialize it using ServiceStack's JSON serializer?
http://www.servicestack.net/docs/text-serializers/json-csv-jsv-serializers
You need to ensure you are using the format that the JavaScriptSerializer expects you to.
Since you are giving it a List<string[]> it is formatting it as:
"[[list1Item1,list1Item2...],[list2Item1,list2Item2...]]"
Which would correlate in your file as row1 -> list1 etc.
However what you want is the first item from the first list, lining up with the first item in the second list and so on.
I don't know about the exact workings of JavaScriptSerializer, but you could try giving it a Dictionary instead of a List<string[]>, assuming you only had one line of data (two lines total).
This would involve caching the first two lines and doing the following:
for (int i = 0; i < first.Length; i++)
{
dict.Add(first[i],second[i]);
}