CSV file to class via Linq - c#

With the code below, I get an exception in the foreach.
If I place a breakpoint on csv (second line) and expand the result, I see 2 entries, which is fine.
When I do the same on csv in the foreach, I get an exception: cannot read from a closed TextReader.
Any idea?
Thanks,
My CSV file:
A0;A1;A2;A3;A4
B0;B1;B2;B3;B4
The code:
var lines = File.ReadLines("filecsv").Select(a => a.Split(';'));
IEnumerable<IEnumerable<MyClass>> csv =
    from line in lines
    select (from piece in line
            select new MyClass
            {
                Field0 = piece[0].ToString(),
                Field1 = piece[1].ToString()
            }
           ).AsEnumerable<MyClass>();
foreach (MyClass myClass in csv)
    Console.WriteLine(myClass.Field0);
Console.ReadLine();
MyClass:
public class MyClass
{
    public string Field0 { get; set; }
    public string Field1 { get; set; }
}

Perhaps something like this instead will give you exactly what you want:
var jobs = File.ReadLines("filecsv")
    .Select(line => line.Split(';'))
    .Select(tokens => new MyClass { Field0 = tokens[0], Field1 = tokens[1] })
    .ToList();
The problem is that you're saving the IEnumerable, which has deferred execution. When you look at it in the debugger, the debugger enumerates it: it reads through the file, does all the work and disposes of the reader. Then your foreach tries to enumerate it again.
The above code achieves what you currently want, is somewhat cleaner, and forces conversion to a list so the lazy behaviour is gone.
Note also that I can't see how your from piece in line could work correctly as it currently stands: line is a string[], so piece is a single string and piece[0] is just its first character.
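To illustrate that last point, here is a minimal sketch that runs the same nested query shape against the two sample rows held in memory rather than in a file. The names NestedQueryDemo and Flatten are made up for this illustration; MyClass is the class from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass
{
    public string Field0 { get; set; }
    public string Field1 { get; set; }
}

public static class NestedQueryDemo
{
    // Hypothetical helper: same query shape as in the question, fed from memory.
    public static List<MyClass> Flatten(IEnumerable<string> rawLines)
    {
        var lines = rawLines.Select(a => a.Split(';'));
        var csv = from line in lines
                  select (from piece in line   // piece is one string such as "A0"
                          select new MyClass
                          {
                              Field0 = piece[0].ToString(), // first character of the piece
                              Field1 = piece[1].ToString()  // second character of the piece
                          });
        // Flatten the nested sequences so the objects can be inspected.
        return csv.SelectMany(inner => inner).ToList();
    }

    public static void Main()
    {
        var result = Flatten(new[] { "A0;A1;A2;A3;A4", "B0;B1;B2;B3;B4" });
        Console.WriteLine(result.Count);     // 10: one object per piece, not one per line
        Console.WriteLine(result[0].Field0); // A
        Console.WriteLine(result[0].Field1); // 0
    }
}
```

So the nested query yields one object per field, with single characters as values, which is probably not what was intended.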

Perhaps it is because LINQ does not read all the items up front; it only sets up the query and reads items when they are needed.
You could try materializing the result:
var lines = File.ReadLines("filecsv").Select(a => a.Split(';')).ToArray();

I suspect it is a combination of the yield keyword (used in Select()) and the internal text reader (in ReadLines) not "agreeing".
Change the lines variable to var lines = File.ReadLines("filecsv").Select(a => a.Split(';')).ToArray();
That should sort it.

Related

IEnumerable performs differently on Array vs List

This question is more of a "is my understanding accurate", and if not, please help me get my head around it. I have this bit of code to explain my question:
class Example
{
    public string MyString { get; set; }
}
var wtf = new[] { "string1", "string2" };
IEnumerable<Example> transformed = wtf.Select(s => new Example { MyString = s });
IEnumerable<Example> transformedList = wtf.Select(s => new Example { MyString = s }).ToList();
foreach (var i in transformed)
    i.MyString = "somethingDifferent";
foreach (var i in transformedList)
    i.MyString = "somethingDifferent";
foreach (var i in transformed)
    Console.WriteLine(i.MyString);
foreach (var i in transformedList)
    Console.WriteLine(i.MyString);
It outputs:
string1
string2
somethingDifferent
somethingDifferent
Both Select() methods at first glance return IEnumerable<Example>. However, the underlying types are WhereSelectArrayIterator<string, Example> and List<Example>.
This is where my sanity started to come into question. From my understanding the difference in output above is because of the way both underlying types implement the GetEnumerator() method.
Using this handy website, I was able to (I think) track down the bit of code that was causing the difference.
class WhereSelectArrayIterator<TSource, TResult> : Iterator<TResult>
{ }
Looking at that on line 169 points me to Iterator<TResult>, since that's where it appears GetEnumerator() is called.
Starting on line 90 I see:
public IEnumerator<TSource> GetEnumerator() {
    if (threadId == Thread.CurrentThread.ManagedThreadId && state == 0) {
        state = 1;
        return this;
    }
    Iterator<TSource> duplicate = Clone();
    duplicate.state = 1;
    return duplicate;
}
What I gather from that is when you enumerate over it, you're actually enumerating over a cloned source (as written in the WhereSelectArrayIterator class' Clone() method).
This will satisfy my need to understand for now, but as a bonus, I'd appreciate it if someone could help me figure out why this isn't returned the first time I enumerate over the data. From what I can tell, the state should be 0 on the first pass, unless perhaps there is magic happening under the hood that is calling the same method from different threads.
Update
At this point I'm thinking my 'findings' were a bit misleading (damn Clone method taking me down the wrong rabbit hole) and it was indeed due to deferred execution. I mistakenly thought that even though I deferred execution, once it was enumerated the first time it would store those values in my variable. I should have known better; after all I was using the new keyword in the Select. That said, it still did open my eyes to the idea that a particular class' GetEnumerator() implementation could still return a clone which would present a very similar problem. It just so happened that my problem was different.
Update2
This is an example of what I thought my problem was. Thanks everyone for the information.
IEnumerable<Example> friendly = new FriendlyExamples();
IEnumerable<Example> notFriendly = new MeanExamples();
foreach (var example in friendly)
    example.MyString = "somethingDifferent";
foreach (var example in notFriendly)
    example.MyString = "somethingDifferent";
foreach (var example in friendly)
    Console.WriteLine(example.MyString);
foreach (var example in notFriendly)
    Console.WriteLine(example.MyString);
// somethingDifferent
// somethingDifferent
// string1
// string2
Supporting classes:
class Example
{
    public string MyString { get; set; }
    public Example(Example example)
    {
        MyString = example.MyString;
    }
    public Example(string s)
    {
        MyString = s;
    }
}
class FriendlyExamples : IEnumerable<Example>
{
    Example[] wtf = new[] { new Example("string1"), new Example("string2") };
    public IEnumerator<Example> GetEnumerator()
    {
        return wtf.Cast<Example>().GetEnumerator();
    }
    IEnumerator IEnumerable.GetEnumerator()
    {
        return wtf.GetEnumerator();
    }
}
class MeanExamples : IEnumerable<Example>
{
    Example[] wtf = new[] { new Example("string1"), new Example("string2") };
    public IEnumerator<Example> GetEnumerator()
    {
        return wtf.Select(e => new Example(e)).Cast<Example>().GetEnumerator();
    }
    IEnumerator IEnumerable.GetEnumerator()
    {
        return wtf.Select(e => new Example(e)).GetEnumerator();
    }
}
LINQ works by making each function return another IEnumerable that is typically a deferred processor. No actual execution occurs until the finally returned IEnumerable is enumerated. This allows for the creation of efficient pipelines.
When you do
var transformed = wtf.Select(s => new Example { MyString = s });
the Select code has not actually executed yet. Only when you finally enumerate transformed will the select be done, i.e. here:
foreach (var i in transformed)
    i.MyString = "somethingDifferent";
Note that if you then do
foreach (var i in transformed)
    i.MyString = "somethingDifferent";
a second time, the pipeline will be executed again. Here that is not a big deal, but it can be huge if IO is involved.
This line
var transformedList = wtf.Select(s => new Example { MyString = s }).ToList();
is the same as
var transformedList = transformed.ToList();
The real eye-opener is to place debug statements or breakpoints inside a Where or Select to actually see the deferred pipeline execution.
Reading the implementation of LINQ is useful. Here is Select: https://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,5c652c53e80df013,references
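As a rough way to see this without a debugger, the sketch below counts how often the Select lambda runs. The class name DeferredDemo and the counter are invented for this illustration; the behaviour shown is that of standard LINQ to Objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Example
{
    public string MyString { get; set; }
}

public static class DeferredDemo
{
    public static int Projections; // how many times the Select lambda has run

    public static IEnumerable<Example> Build(string[] source)
    {
        // Nothing runs here; this only describes the pipeline.
        return source.Select(s => { Projections++; return new Example { MyString = s }; });
    }

    public static void Main()
    {
        var transformed = Build(new[] { "string1", "string2" });
        Console.WriteLine(Projections); // 0: the query has not executed yet

        foreach (var e in transformed)
            e.MyString = "somethingDifferent"; // mutates objects that are then discarded
        Console.WriteLine(Projections); // 2

        // Enumerating again re-runs the lambda and creates fresh objects.
        Console.WriteLine(transformed.First().MyString); // string1

        // ToList runs the pipeline once and keeps the resulting objects.
        List<Example> list = transformed.ToList();
        foreach (var e in list)
            e.MyString = "somethingDifferent";
        Console.WriteLine(list[0].MyString); // somethingDifferent
    }
}
```

The list keeps its changes because it holds on to the objects; the deferred query builds new ones on every enumeration.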

Why is the delimiter excluded in this String.Join? C#

I have an ObservableCollection<T> that I use for binding and that I want to put into a String.Join statement, but I don't understand why it is giving the results I am getting or how to fix it.
This is the code I am using to get the result.
First I get the data I need via this LINQ query:
public static IEnumerable<string> GetNursingHomeNames(string home)
{
    return DataContext.NursingHomeNameServerTables.Where(p => p.Nursing_Home_Section == home)
        .Select(p => p.Nursing_Home_Name).Distinct();
}
I then put it into the ObservableCollection<T> (You may be wondering why I am not using an ObservableCollection<T> with the LINQ query, but I have my reasons.)
public static void NursingHomeNamesCollection(ObservableCollection<string> nursingHomeNames, string nursingHomeSection)
{
    var homeNames = GetNursingHomeNames(nursingHomeSection);
    if (homeNames == null)
    {
        return;
    }
    foreach (var item in homeNames)
    {
        nursingHomeNames.Add(item);
    }
}
This is the property in the main window:
public ObservableCollection<string> NursingHomeNames { get; set; } =
    new ObservableCollection<string>();
Then I use Join to get the results for a specific purpose I need:
var result = String.Join(@",", NursingHomeNames.ToList());
And this gives the following result, where there is no delimiter, only a space:
foo bar bat baz
However, if I just do this,
ObservableCollection<string> observableCol = new ObservableCollection<string>() { "foo", "bar", "bat", "baz" };
var result = String.Join(@",", observableCol.ToList());
the result is displayed with the delimiters in place:
foo,bar,bat,baz
Why is it doing this, and is there a way to ensure the delimiters are correctly placed?
I know I have to work on my naming conventions.
EDIT: In the debugger, this is what I see,
When assigning the collection to a variable named data and viewing the results in the Watch Window
var data = NursingHomeNames.ToList();
Count = 4
[0] "foo"
[1] "bar"
[2] "bat"
[3] "baz"
However, I cannot reproduce this using any other code that does not use the LINQ query that pulls the data from the database. I tried making a new list and passing that list through the same code, but the error does not occur. I am sorry, but I can't post an example that can be reproduced.
As it turns out, after weeks of effort to figure this out, the answer was in the comments that @Panagiotis Kanavos and @CodeCaster made.
I was using a Unicode character that looked like a comma, and it therefore produced different behavior than I was expecting.
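One way to catch this kind of problem is to print the code points of the delimiter. The sketch below is only an illustration: the thread does not say which character was actually involved, so U+201A (single low-9 quotation mark) is an assumed stand-in, and DelimiterCheck and Describe are invented names.

```csharp
using System;
using System.Linq;

public static class DelimiterCheck
{
    // Hypothetical helper: renders each character of a string as U+XXXX.
    public static string Describe(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    public static void Main()
    {
        string asciiComma = ",";
        string lookAlike = "\u201A"; // assumed example of a comma look-alike

        Console.WriteLine(Describe(asciiComma)); // U+002C
        Console.WriteLine(Describe(lookAlike));  // U+201A

        // The two delimiters look similar on screen but are different strings.
        Console.WriteLine(asciiComma == lookAlike); // False
    }
}
```

Typing the delimiter as an escape sequence or pasting it through such a check makes it clear which character is really in the source file.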
In the method
public static void NursingHomeNamesCollection(string nursingHomeSection)
you take a string parameter as input. Within this method's scope you then add this string to the static collection, but item in this scope is a char.
foreach (var item in homeNames)
You're trying to add one character at a time and then join them into one big string. You need to pass the collection as input to this method.

Convert Text file to Dictionary of Objects

I have the following class:
class Car
{
    public string Make { get; set; }
    public string Model { get; set; }
    public string Year { get; set; }
    public string Color { get; set; }
    public Car(string make, string model, string year, string color)
    {
        this.Make = make;
        this.Model = model;
        this.Year = year;
        this.Color = color;
    }
}
I have the following text file "Carlist.txt":
Id,Make,Model,Year,Color
0,Toyoa,Corola,2000,Blue
1,Honda,Civic,2005,Red
I want to have a dictionary of the form:
Dictionary<string, Car>
Here is my code to read the text file and parse out the elements into a dictionary but I am not able to get this to work:
Dictionary<string, Car> result =
    File.ReadLines("Carlist.txt")
        .Select(line => line.Split(','))
        .ToDictionary(split => split[0],
                      new Car(split => split[1],
                              split => split[2],
                              split => split[3],
                              split => split[4]));
What am I doing wrong? I keep getting the following error on each of the split elements in new Car(
Error CS1660 Cannot convert lambda expression to type 'string' because it is not a delegate type
Update:
Here is my current code with an auto increment key (variable i):
int i = 0;
Dictionary<int, Car> result =
    File.ReadLines(path + "Carlist.txt")
        .Select(line => line.Split(','))
        .Where(split => split[0] != "Make")
        .ToDictionary(split => i++,
                      split => new Car(split[0],
                                       split[1],
                                       split[2],
                                       split[3]));
Thus my textfile now looks like this:
Make,Model,Year,Color
Toyoa,Corola,2000,Blue
Honda,Civic,2005,Red
There are a couple of issues you need to solve.
Firstly, each parameter of the ToDictionary method is a single delegate. The syntax for this is:
.ToDictionary(split => int.Parse(split[0]),
              split => new Car(split[1], split[2], split[3], split[4]));
as opposed to trying to pass a delegate into each parameter of your Car constructor (as in the original code).
The second is that you will read your header line and try to create a Car with the headers as values (and int.Parse will throw on "Id"). You will want to exclude this line; one way is to add this above your ToDictionary:
.Where(split => split[0] != "Id")
Here's a version that should do what you want:
var result = File.ReadLines("Carlist.txt")
    .Select(line => line.Split(','))
    .Where(split => split[0] != "Id")
    .ToDictionary(split => int.Parse(split[0]), split => new Car(split[1], split[2], split[3], split[4]));
File.ReadLines returns a lazily evaluated sequence of strings (IEnumerable<string>). You then split each string, which gives you an array of strings. A string is not an int, so you have to parse it. Also, your second lambda was all messed up. Something like:
Dictionary<int, Car> result =
    File.ReadLines("Carlist.txt")
        .Select(line => line.Split(','))
        .ToDictionary(split => int.Parse(split[0]),
                      split => new Car(split[1],
                                       split[2],
                                       split[3],
                                       split[4]));
A couple of things to note: this will fail if that first element can't be parsed as an integer (which, if your file actually does include those headers, it can't). So you'll need to skip the first row and/or add some error handling.
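A minimal sketch of both suggestions (skipping the header row and tolerating bad ids) might look like the following. It reads from an in-memory array instead of Carlist.txt so it can run on its own; CarParser and Parse are names made up here, and silently skipping malformed rows is just one possible policy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Car
{
    public string Make { get; }
    public string Model { get; }
    public string Year { get; }
    public string Color { get; }

    public Car(string make, string model, string year, string color)
    {
        Make = make;
        Model = model;
        Year = year;
        Color = color;
    }
}

public static class CarParser
{
    // Hypothetical helper: builds the dictionary, ignoring the header and bad rows.
    public static Dictionary<int, Car> Parse(IEnumerable<string> lines)
    {
        var result = new Dictionary<int, Car>();
        foreach (var split in lines.Skip(1).Select(line => line.Split(',')))
        {
            // Skip rows with too few fields or a non-numeric id.
            if (split.Length < 5 || !int.TryParse(split[0], out int id))
                continue;
            result[id] = new Car(split[1], split[2], split[3], split[4]);
        }
        return result;
    }

    public static void Main()
    {
        var lines = new[]
        {
            "Id,Make,Model,Year,Color",
            "0,Toyoa,Corola,2000,Blue",
            "1,Honda,Civic,2005,Red",
            "oops,not,a,valid,row" // invented malformed row for the demo
        };
        var cars = Parse(lines);
        Console.WriteLine(cars.Count);   // 2
        Console.WriteLine(cars[1].Make); // Honda
    }
}
```

With a real file, File.ReadLines("Carlist.txt") could be passed to Parse in place of the array.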
For others who come here looking for CSV deserialization:
The approach of reading CSV and splitting on commas works for many scenarios, but will also fail in many others, for example CSV fields that contain a comma, quoted fields, or fields with escaped quotation marks. These are all very common, standard and valid CSV fields, used for example by Excel.
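To make the failure mode concrete, here is a small sketch with an invented row whose second field contains a comma inside quotes. NaiveSplitDemo and NaiveSplit are illustrative names only.

```csharp
using System;

public static class NaiveSplitDemo
{
    // The naive approach: split on every comma, ignoring quoting.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(',');
    }

    public static void Main()
    {
        // Invented sample row: four logical fields, one of them quoted.
        string line = "1,\"Civic, EX\",2005,Red";
        string[] fields = NaiveSplit(line);

        Console.WriteLine(fields.Length); // 5, although the row has 4 logical fields
        foreach (var field in fields)
            Console.WriteLine(field);     // the quoted field is cut in two
    }
}
```

Every column after the quoted field ends up shifted by one, which is exactly the kind of bug a proper CSV parser avoids.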
Using a library that fully supports CSV is both simpler and more compatible. One such library is CsvHelper. It has support for a wide variety of mappings if you need manual control, but in the case described by the OP it is as simple as:
public class Car
{
    public string Make { get; set; }
    public string Model { get; set; }
    public int Year { get; set; }
    public string Color { get; set; }
}
void Main()
{
    List<Car> cars;
    using (var fileReader = File.OpenText("Cars.txt"))
    {
        // Note: recent CsvHelper versions require a CultureInfo as a second
        // constructor argument, e.g. CultureInfo.InvariantCulture.
        using (var csv = new CsvHelper.CsvReader(fileReader))
        {
            cars = csv.GetRecords<Car>().ToList();
        }
    }
    // cars now contains a list of Car objects read from CSV.
    // Header fields (first line of CSV) have been automatically matched to property names.
    // Set up the dictionary. Note that the key must be unique.
    var carDict = cars.ToDictionary(c => c.Make);
}

C# file line remove with arraylist

I am trying to make a class which will help me delete one specific line from a file. So I came up with the idea to put all lines in an ArrayList, remove from the list the line I don't need, wipe my .txt file and write back the remaining objects of the list. My problem is that I encounter some sort of logical error I can't find: the line is not removed from the ArrayList and gets written back again. Here's my code:
public class delete
{
    public void removeline(string line_to_delete)
    {
        string[] lines = File.ReadAllLines("database.txt");
        ArrayList list = new ArrayList(lines);
        list.Remove(line_to_delete);
        File.WriteAllText("database.txt", String.Empty);
        using (var writer = new StreamWriter("database.txt"))
        {
            foreach (object k in lines)
            {
                writer.WriteLine(k);
            }
        }
    }
}
What is it that I am missing? I tried lots of things for removing a line from a text file that did not work. I tried this because it has the fewest file operations so far.
Thanks!
You can do:
var line_to_delete = "line to delete";
var lines = File.ReadAllLines("database.txt");
File.WriteAllLines("database.txt", lines.Where(line => line != line_to_delete));
File.WriteAllLines will overwrite the existing file.
Do not use ArrayList; there is a generic alternative, List<T>. ArrayList.Remove removes only the first line matching the criteria, whereas with List<T> you can use RemoveAll to remove all the lines matching a predicate. Note also that your loop writes back lines (the original array) rather than list, so the removal never reaches the file.
If you want a case-insensitive comparison, you can do:
File.WriteAllLines("database.txt", lines.Where(line =>
    !String.Equals(line, line_to_delete, StringComparison.InvariantCultureIgnoreCase)));
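For completeness, a sketch of the List<T>.RemoveAll variant mentioned above could look like this. LineRemover and its RemoveLine method are hypothetical names, and the demo works on a temporary file rather than database.txt.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineRemover
{
    // Hypothetical helper: removes every exact match and returns how many were removed.
    public static int RemoveLine(string path, string lineToDelete)
    {
        var lines = new List<string>(File.ReadAllLines(path));
        int removed = lines.RemoveAll(l => l == lineToDelete);
        File.WriteAllLines(path, lines); // overwrites the file with the remaining lines
        return removed;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "keep 1", "drop me", "keep 2", "drop me" });

        Console.WriteLine(RemoveLine(path, "drop me"));               // 2
        Console.WriteLine(string.Join("|", File.ReadAllLines(path))); // keep 1|keep 2

        File.Delete(path);
    }
}
```

The important detail is that the list that was modified is also the one written back.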
I believe you're intending:
public static void RemoveLine(string line)
{
    // ToList() is needed because arrays have no Remove method.
    var document = File.ReadAllLines("...").ToList();
    document.Remove(line);
    File.WriteAllLines("..", document);
}
That would physically remove the line from the collection. Another approach would be to simply filter out that line with LINQ:
var filter = document.Where(l => String.Compare(l, line, true) != 0);
You could also do the Remove on an ArrayList; the documentation describes how. Your code should actually work; all of these answers are more about semantics. The issue could be due to the actual data; could you provide a sample? I can't reproduce it with your code.

Creation and recalling of objects in a foreach loop?

Beginner programmer here so please keep (explanation of) answers as simple as possible.
For an assignment we got a text file that contains a large amount of lines.
Each line is a different 'Order' with an ordernumber, location, frequency (amount of times per week), and a few other arguments.
Example:
18; New York; 3PWK; ***; ***; etc
I've made it so that each line is read and stored in a string array with
string[] orders = System.IO.File.ReadAllLines(@<filepath here>);
And I've made a separate class called "Order" which has a get+set for all the properties like ordernumber, etc.
Now I'm stuck on how to actually get the values in there. I know how string splitting works but not how I can make unique objects in a loop.
I'd need something that created a new Order for every line and then assign Order.ordernumber, Order.location etc.
Any help would be MUCH appreciated!
An easy approach will be to make a class to define the orders like this:
public class Order{
    public string OrderNumber{get;set;}
    public string OrderId{get;set;}
    public string OrderSomeThingElse{get;set;}
}
Then initialize a List:
var orderList = new List<Order>();
Then loop through and populate it:
foreach( var order in orders ){
    var splitString = order.Split(';');
    orderList.Add( new Order{
        OrderNumber = splitString[0],
        OrderId = splitString[1],
        OrderSomeThingElse = splitString[2]
    });
}
If you want an easy, but not that elegant approach, this is it.
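Since the sample line in the question has a space after each semicolon, trimming the pieces is probably worthwhile. The sketch below shows one way to do that; the property names Location and Frequency, the OrderParser class and the second sample row are assumptions made for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public string OrderNumber { get; set; }
    public string Location { get; set; }   // assumed property name
    public string Frequency { get; set; }  // assumed property name
}

public static class OrderParser
{
    // Hypothetical helper: one Order per non-empty line with at least three fields.
    public static List<Order> Parse(IEnumerable<string> lines)
    {
        return lines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split(';').Select(p => p.Trim()).ToArray())
            .Where(parts => parts.Length >= 3)
            .Select(parts => new Order
            {
                OrderNumber = parts[0],
                Location = parts[1],
                Frequency = parts[2]
            })
            .ToList();
    }

    public static void Main()
    {
        // The second row is invented sample data.
        var orders = Parse(new[] { "18; New York; 3PWK", "", "19; Boston; 1PWK" });
        Console.WriteLine(orders.Count);       // 2
        Console.WriteLine(orders[0].Location); // New York
    }
}
```

Each pass through the Select creates a new Order object, so every line ends up as its own instance in the list.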
In addition to all the good answers you've already received, I recommend using File.ReadLines() instead of File.ReadAllLines(), because you are reading a large file.
The ReadLines and ReadAllLines methods differ as follows: When you use ReadLines, you can start enumerating the collection of strings before the whole collection is returned; when you use ReadAllLines, you must wait for the whole array of strings be returned before you can access the array. Therefore, when you are working with very large files, ReadLines can be more efficient. MSDN
Unless I misunderstand... do you mean something like this?
var ordersCollection = new List<Order>();
foreach (var order in orders)
{
    var o = new Order();
    o.PropertyName = ""; // Assign your property values like this
    ordersCollection.Add(o);
}
// ordersCollection is now full of your orders.
