Convert Text file to Dictionary of Objects

I have the following class:
class Car
{
public string Make { get; set; }
public string Model { get; set; }
public string Year { get; set; }
public string Color { get; set; }
public Car(string make, string model, string year, string color)
{
this.Make= make;
this.Model= model;
this.Year= year;
this.Color= color;
}
}
I have the following text file "Carlist.txt":
Id,Make,Model,Year,Color
0,Toyoa,Corola,2000,Blue
1,Honda,Civic,2005,Red
I want to have a dictionary of the form:
Dictionary<string, Car>
Here is my code to read the text file and parse out the elements into a dictionary but I am not able to get this to work:
Dictionary<string, Car> result =
File.ReadLines("Carlist.txt")
.Select(line => line.Split(','))
.ToDictionary(split => split[0],
new Car(split => split[1],
split => split[2],
split => split[3],
split => split[4]));
What am I doing wrong? I keep getting the following error on each of the split elements in new Car(
Error CS1660 Cannot convert lambda expression to type 'string' because it is not a delegate type
Update:
Here is my current code with an auto increment key (variable i):
int i = 0;
Dictionary<int, Car> result =
File.ReadLines(path + "Carlist.txt")
.Select(line => line.Split(','))
.Where(split => split[0] != "Make")
.ToDictionary(split => i++,
split => new Car(split[0],
split[1],
split[2],
split[3]));
Thus my textfile now looks like this:
Make,Model,Year,Color
Toyoa,Corola,2000,Blue
Honda,Civic,2005,Red

There are a couple of issues you need to solve.
Firstly, each parameter of the ToDictionary method is a single delegate. The syntax for this is:
.ToDictionary(split => int.Parse(split[0]),
split => new Car(split[1], split[2], split[3], split[4]));
This is as opposed to passing a separate lambda to each parameter of the Car constructor, as in the original code.
Secondly, the header line will be read as well, which would create a Car with the headers as values (and int.Parse would fail on "Id"). You will want to exclude it; one way is to add this above your ToDictionary:
.Where( split => split[0] != "Id" )
Here's a version that should do what you want:
var result = File.ReadLines("Carlist.txt")
.Select(line => line.Split(','))
.Where( split => split[0] != "Id" )
.ToDictionary(split => int.Parse(split[0]), split => new Car(split[1], split[2], split[3], split[4]));

File.ReadLines returns a sequence of strings (an IEnumerable<string>). You then split each string, which gives you an array of strings. A string is not an int, so you have to parse it. Also, your second lambda was all messed up. Try something like:
Dictionary<int, Car> result =
File.ReadLines("Carlist.txt")
.Select(line => line.Split(','))
.ToDictionary(split => int.Parse(split[0]),
split => new Car(split[1],
split[2],
split[3],
split[4]));
A couple of things to note: this will fail if the first element can't be parsed as an integer (which, if your file actually does include those headers, it can't). So you'll need to skip the first row and/or add some error handling.
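Here is a minimal sketch of the "skip the first row and add some error handling" idea. The CarListParser class, its Parse method, and the decision to silently skip malformed rows are assumptions made for illustration; Car is the class from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Car
{
    public string Make { get; set; }
    public string Model { get; set; }
    public string Year { get; set; }
    public string Color { get; set; }

    public Car(string make, string model, string year, string color)
    {
        Make = make;
        Model = model;
        Year = year;
        Color = color;
    }
}

// Hypothetical helper, not part of the original answers.
public static class CarListParser
{
    // Turns "Id,Make,Model,Year,Color" rows into a dictionary keyed by Id.
    public static Dictionary<int, Car> Parse(IEnumerable<string> lines)
    {
        var result = new Dictionary<int, Car>();

        // Skip(1) drops the header row before any parsing happens.
        foreach (var split in lines.Skip(1).Select(line => line.Split(',')))
        {
            // Assumption: rows with too few fields or a non-numeric Id are ignored.
            if (split.Length < 5 || !int.TryParse(split[0], out int id))
                continue;

            // The indexer overwrites on duplicate Ids (last row wins) instead of throwing.
            result[id] = new Car(split[1], split[2], split[3], split[4]);
        }

        return result;
    }
}
```

It could be called as CarListParser.Parse(File.ReadLines("Carlist.txt")). Whether bad rows should be skipped, logged, or rejected depends on how much you trust the file.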

For others who come here looking for CSV deserialization:
The approach of reading CSV and splitting on commas works for many scenarios, but also fails in many others, for example CSV fields that contain commas, quoted fields, or fields with escaped quotation marks. These are all very common, standard and valid CSV fields, used for example by Excel.
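To make the failure concrete, here is a tiny sketch of what a plain Split does to a quoted field. The NaiveCsvSplit class and the sample line are made up for illustration.

```csharp
using System;

// Hypothetical helper that mirrors the line.Split(',') calls used in the answers above.
public static class NaiveCsvSplit
{
    // Splits on every comma, with no awareness of CSV quoting rules.
    public static string[] Split(string line)
    {
        return line.Split(',');
    }
}
```

For the three-field line 1,"Smith, John",2005 this returns four pieces, because the comma inside the quotes is treated as a separator and the quotation marks are left in the data.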
Using a library that fully supports CSV is both simpler and more compatible. One such library is CsvHelper. It has support for a wide variety of mappings if you need manual control, but in the case described by op it is as simple as:
public class Car
{
public string Make { get; set;}
public string Model { get; set; }
public int Year { get; set; }
public string Color { get; set; }
}
void Main()
{
List<Car> cars;
using (var fileReader = File.OpenText("Cars.txt"))
{
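// Note: recent CsvHelper versions require a CultureInfo argument; older versions accepted just the reader.
using (var csv = new CsvHelper.CsvReader(fileReader, System.Globalization.CultureInfo.InvariantCulture))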
{
cars = csv.GetRecords<Car>().ToList();
}
}
// cars now contains a list of Car-objects read from CSV.
// Header fields (first line of CSV) has been automatically matched to property names.
// Set up the dictionary. Note that the key must be unique.
var carDict = cars.ToDictionary(c => c.Make);
}

Related

c# Appropriate data structure for storing values from csv file. Specific Case

I'm writing a program that will simply read 2 different .csv files containing following information:
file 1      file 2
AA,2.34     BA,6.45
AB,1.46     BB,5.45
AC,9.69     BC,6.21
AD,3.6      AC,7.56
Where first column is string, second is double.
So far I have no difficulty in reading those files and placing values to the List:
firstFile = new List<KeyValuePair<string, double>>();
secondFile = new List<KeyValuePair<string, double>>();
I'm trying to instruct my program:
to take first value from the first column from the first row of the first file (in this case AA)
and look if there might be a match in the entire first column in the second file.
If string match is found, compare their corresponding second values (double in this case), and if in this case match found, add the entire row to the separate List.
Something similar to the below pseudo-code:
for(var i=0;i<firstFile.Count;i++)
{
firstFile.Column[0].value[i].SearchMatchesInAnotherFile(secondFile.Column[0].values.All);
if(MatchFound)
{
CompareCorrespondingDoubles();
if(true)
{
AddFirstValueToList();
}
}
}
Instead of a List I tried to use a Dictionary, but that data structure is not sorted and there is no way to access a key by index.
I'm not asking for exact code; rather, the question is:
What would you suggest to use as an appropriate data structure for this program so that I can investigate myself further?

KeyValuePair is actually only used for Dictionaries. I suggest creating your own custom type:
public class MyRow
{
public string StringValue {get;set;}
public double DoubleValue {get;set;}
public override bool Equals(object o)
{
MyRow r = o as MyRow;
if (ReferenceEquals(r, null)) return false;
return r.StringValue == this.StringValue && r.DoubleValue == this.DoubleValue;
}
public override int GetHashCode()
{
unchecked { return StringValue.GetHashCode() ^ DoubleValue.GetHashCode(); }
}
}
And store the files in lists of this type:
List<MyRow> firstFile = ...
List<MyRow> secondFile = ...
Then you can determine the intersection (all elements that occur in both lists) via LINQ's Intersect method:
var result = firstFile.Intersect(secondFile).ToList();
It's necessary to override Equals and GetHashCode, because otherwise Intersect would only make a reference comparison. Alternatively, you could implement your own IEqualityComparer<MyRow> that does the comparison and pass it to the appropriate Intersect overload.
But if you can ensure that the keys (the string values) are unique, you can also use a
Dictionary<string, double> firstFile = ...
Dictionary<string, double> secondFile = ...
And in this case use this LINQ statement:
var result = firstFile
.Where(x => secondFile.TryGetValue(x.Key, out var value) && value == x.Value)
.ToDictionary(x => x.Key, x => x.Value);
This has an expected time complexity of O(m+n) (m and n being the row counts of the two files), because each dictionary lookup takes constant time on average. The Intersect approach above is hash-based as well, so it should also be roughly linear rather than O(m*n).
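A self-contained sketch of the dictionary-based matching, in case it helps further investigation. The RowMatcher class, the MatchingRows method and the tolerance parameter are assumptions added for illustration; the tolerance is there because values parsed from text may not always compare exactly equal as doubles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class RowMatcher
{
    // Returns the rows of 'first' whose key also exists in 'second'
    // with a value that differs by at most 'tolerance'.
    public static Dictionary<string, double> MatchingRows(
        IReadOnlyDictionary<string, double> first,
        IReadOnlyDictionary<string, double> second,
        double tolerance = 1e-9)
    {
        return first
            // TryGetValue is a hash lookup, so each row costs O(1) on average.
            .Where(row => second.TryGetValue(row.Key, out double other)
                          && Math.Abs(other - row.Value) <= tolerance)
            .ToDictionary(row => row.Key, row => row.Value);
    }
}
```

With the sample data from the question, AC appears in both files but with different values (9.69 and 7.56), so it would not be part of the result.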

LINQ - Removing items in a List<T> that contain one item of an Array in any order or position

I am trying to write something similar to the following with LINQ:
var media = from s in db.Media select s;
string[] criteria = {"zombies", "horror"};
mediaList.RemoveAll(media.Where(s => s.description.Intersect(criteria).Any()));
//mediaList is a List(T) containing instances of the Media model.
I thought linq where list contains any in list's solution would apply in this case but my compiler complains that "string does not contain a definition for Intersect".
The behaviour I am expecting is for Media items that contain the words zombies or horror but not both in their description to be taken out of the list i.e.
A horror movie.
A movie with a lot of zombies.
But items like the following should stay in the list:
A horror movie with zombies.
The best zombies and the best horror.
The Media class:
public class Media
{
public int mediaID { get; set; }
public string type { get; set; }
public string description { get; set; }
}
The description field contains very long paragraphs. I am afraid the solution is very obvious but for the life of me I cannot work it out.
EDIT: added a better explanation of the behaviour expected.

You're confusing some methods here.
List<T>.RemoveAll() takes a Predicate<T> as its parameter and removes all elements from the list for which this predicate returns true. So what you want could be something like this:
mediaList.RemoveAll(m => criteria.Any(crit => m.description.Contains(crit)));
But note that this will also remove "A movie about nonzombies".
UPDATE after your clarification:
mediaList.RemoveAll(m =>
{
int count = criteria.Count(crit => m.description.Contains(crit));
return count > 0 && count < criteria.Length;
});
This removes all entries that contain at least one word of criteria, but not all of them. (It still does not match "whole words only", though.)

You should use
var result = mediaList.RemoveAll(media => criteria.Any(c => media.description.Contains(c)));

You can't intersect a string with a string[] array, but you could split the description string into words and then do the intersection:
mediaList.RemoveAll(entry => entry.description.Split(new string[]{" "},
StringSplitOptions.None).Intersect(criteria).Any());
This avoids matching words that merely contain one of the criterion strings as a substring.
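One caveat with splitting on spaces only is that a word followed by punctuation (for example "zombies." at the end of a sentence) no longer equals the criterion. A rough sketch that combines whole-word matching with the "at least one but not all" rule might look like this. The MediaFilter class, the ContainsSomeButNotAll method, the separator list and the case-insensitive comparison are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answers.
public static class MediaFilter
{
    // Assumed set of word separators; extend it to suit your descriptions.
    private static readonly char[] Separators =
        { ' ', '.', ',', ';', ':', '!', '?', '\t', '\r', '\n' };

    // True when the description contains at least one criterion as a whole word, but not all of them.
    public static bool ContainsSomeButNotAll(string description, IReadOnlyCollection<string> criteria)
    {
        // Collect the distinct words once so each criterion is a hash lookup.
        var words = new HashSet<string>(
            description.Split(Separators, StringSplitOptions.RemoveEmptyEntries),
            StringComparer.OrdinalIgnoreCase);

        int matches = criteria.Count(c => words.Contains(c));
        return matches > 0 && matches < criteria.Count;
    }
}
```

It would then be used as mediaList.RemoveAll(m => MediaFilter.ContainsSomeButNotAll(m.description, criteria));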

Get every item in a list that contains every string in an other list

I have basically two string-lists and want to get the elements of the first list that contain every word of the second list.
List<Sentence> sentences = new List<Sentence> { many elements };
List<string> keyWords= new List<string>{"cat", "the", "house"};
class Sentence
{
public string shortname {get; set; }
}
Now, how do I perform a contain-check for every element of the keyWords-List for a sentence? Something like
var found = sentences.Where(x => x.shortname.ContainsAll(keyWords));

Try this:
var found = sentences.Where(x=> keyWords.All(y => x.shortname.Contains(y)));
The All method returns true only when every keyword satisfies the condition, so Where keeps only those sentences that contain all keywords from the list.

Use All:
sentences.Where(x => keyWords.All(k => x.shortname.Contains(k)));
If you find this to be a common search, you could create your own extension method (it must be declared in a static class):
public static bool ContainsAll(this string src, IEnumerable<string> target)
{
return target.All(x => src.Contains(x));
}
This would allow you to write the code as you originally expressed it:
sentences.Where(x => x.shortname.ContainsAll(keyWords));

sentences.Where(s => keyWords.All(kw => s.shortname.Contains(kw)));
Use All; it returns true only if all elements in the sequence satisfy the condition.
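Note that string.Contains is case-sensitive, so "The cat" does not contain "the". If that matters, a case-insensitive variant could be sketched as follows. The KeywordSearch class and ContainsAllIgnoreCase method are hypothetical names, not something from the answers above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration.
public static class KeywordSearch
{
    // True when 'text' contains every keyword as a substring, ignoring case.
    // All returns true for an empty keyword list, so every text matches in that case.
    public static bool ContainsAllIgnoreCase(string text, IEnumerable<string> keyWords)
    {
        return keyWords.All(kw => text.IndexOf(kw, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

Usage would be sentences.Where(s => KeywordSearch.ContainsAllIgnoreCase(s.shortname, keyWords)). This still matches substrings, not whole words.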

CSV file to class via Linq

With the code below, I get an exception in the foreach.
When I place a breakpoint on csv (second statement) and expand the result, I see 2 entries, which is OK.
When I do the same on csv in the foreach, I get an exception: can't read from closed text reader.
Any idea?
Thanks,
My CSV file :
A0;A1;A2;A3;A4
B0;B1;B2;B3;B4
The code
var lines = File.ReadLines("filecsv").Select(a => a.Split(';'));
IEnumerable<IEnumerable<MyClass>> csv =
from line in lines
select (from piece in line
select new MyClass
{
Field0 = piece[0].ToString(),
Field1 = piece[1].ToString()
}
).AsEnumerable<MyClass>();
foreach (MyClass myClass in csv)
Console.WriteLine(myClass.Field0);
Console.ReadLine();
MyClass :
public class MyClass
{
public string Field0 { get; set; }
public string Field1 { get; set; }
}

Perhaps something like this instead, will give you exactly what you want:
var jobs = File.ReadLines("filecsv")
.Select(line => line.Split(';'))
.Select(tokens => new MyClass { Field0 = tokens[0], Field1 = tokens[1] })
.ToList();
The problem you have is that you're saving the Enumerable, which has delayed execution. You're then looking at it through the debugger, which loops through the file, does all the work and disposes of it. Then you try and do it again.
The above code achieves what you currently want, is somewhat cleaner, and forces conversion to a list so the lazy behaviour is gone.
Note also that I can't see how your from piece in line could work correctly as it currently stands.

Perhaps it is because LINQ does not read all the items immediately; it only sets up the query and reads items when they are needed.
You could try materializing the result with ToArray:
var lines = File.ReadLines("filecsv").Select(a => a.Split(';')).ToArray();

I suspect it is a combination of the yield keyword (used in Select()) and the internal text reader (in ReadLines) not "agreeing".
Change the lines variable to var lines = File.ReadLines("filecsv").Select(a => a.Split(';')).ToArray();
That should sort it.
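The lazy behaviour discussed in these answers can be made visible without any file at all. The sketch below counts how often the Select projection runs when a query is enumerated twice, with and without materializing it first. DeferredExecutionDemo and CountProjections are made-up names, and an in-memory array stands in for the file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class for illustration.
public static class DeferredExecutionDemo
{
    // Returns how many times the projection ran after enumerating the query twice.
    public static int CountProjections(bool materialize)
    {
        int calls = 0;
        string[] source = { "A0;A1;A2", "B0;B1;B2" };   // stands in for the CSV lines

        IEnumerable<string[]> query = source.Select(line =>
        {
            calls++;                  // side effect, only to make the laziness visible
            return line.Split(';');
        });

        if (materialize)
            query = query.ToList();   // runs the projection once and stores the results

        foreach (var tokens in query) { }   // first enumeration
        foreach (var tokens in query) { }   // second enumeration

        return calls;
    }
}
```

Without ToList the projection runs again for every enumeration (the debugger expanding the result counts as one), which is why a query over a reader that has already been closed fails the second time.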

Concat all strings inside a List<string> using LINQ

Is there any easy LINQ expression to concatenate my entire List<string> collection items to a single string with a delimiter character?
What if the collection is of custom objects instead of string? Imagine I need to concatenate on object.Name.

string result = String.Join(delimiter, list);
is sufficient.

Warning - Serious Performance Issues
Though this answer does produce the desired result, it suffers from poor performance compared to other answers here. Be very careful about deciding to use it.
Using LINQ, this should work:
string delimiter = ",";
List<string> items = new List<string>() { "foo", "boo", "john", "doe" };
Console.WriteLine(items.Aggregate((i, j) => i + delimiter + j));
class description:
public class Foo
{
public string Boo { get; set; }
}
Usage:
class Program
{
static void Main(string[] args)
{
string delimiter = ",";
List<Foo> items = new List<Foo>() { new Foo { Boo = "ABC" }, new Foo { Boo = "DEF" },
new Foo { Boo = "GHI" }, new Foo { Boo = "JKL" } };
Console.WriteLine(items.Aggregate((i, j) => new Foo{Boo = (i.Boo + delimiter + j.Boo)}).Boo);
Console.ReadKey();
}
}
And here is my best :)
items.Select(i => i.Boo).Aggregate((i, j) => i + delimiter + j)

Note: This answer does not use LINQ to generate the concatenated string. Using LINQ to turn enumerables into delimited strings can cause serious performance problems.
Modern .NET (since .NET 4)
This is for an array, list or any type that implements IEnumerable:
string.Join(delimiter, enumerable);
And this is for an enumerable of custom objects:
string.Join(delimiter, enumerable.Select(i => i.Boo));
Old .NET (before .NET 4)
This is for a string array:
string.Join(delimiter, array);
This is for a List<string>:
string.Join(delimiter, list.ToArray());
And this is for a list of custom objects:
string.Join(delimiter, list.Select(i => i.Boo).ToArray());

using System.Linq;
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
}
List<Person> persons = new List<Person>();
string listOfPersons = string.Join(",", persons.Select(p => p.FirstName));

Good question. I've been using
List<string> myStrings = new List<string>{ "ours", "mine", "yours"};
string joinedString = string.Join(", ", myStrings.ToArray());
It's not LINQ, but it works.

You can simply use:
List<string> items = new List<string>() { "foo", "boo", "john", "doe" };
Console.WriteLine(string.Join(",", items));
Happy coding!

I think that if you define the logic in an extension method the code will be much more readable:
public static class EnumerableExtensions {
public static string Join<T>(this IEnumerable<T> self, string separator) {
return String.Join(separator, self.Select(e => e.ToString()).ToArray());
}
}
public class Person {
public string FirstName { get; set; }
public string LastName { get; set; }
public override string ToString() {
return string.Format("{0} {1}", FirstName, LastName);
}
}
// ...
List<Person> people = new List<Person>();
// ...
string fullNames = people.Join(", ");
string lastNames = people.Select(p => p.LastName).Join(", ");

List<string> strings = new List<string>() { "ABC", "DEF", "GHI" };
string s = strings.Aggregate((a, b) => a + ',' + b);

I have done this using LINQ:
var oCSP = (from P in db.Products select new { P.ProductName });
string joinedString = string.Join(",", oCSP.Select(p => p.ProductName));

Put String.Join into an extension method. Here is the version I use, which is less verbose than Jordao's version.
It returns an empty string "" when the list is empty, whereas Aggregate would throw an exception.
It probably performs better than Aggregate.
It is easier to read when combined with other LINQ methods than a plain String.Join().
Usage
var myStrings = new List<string>() { "a", "b", "c" };
var joinedStrings = myStrings.Join(","); // "a,b,c"
Extensionmethods class
public static class ExtensionMethods
{
public static string Join(this IEnumerable<string> texts, string separator)
{
return String.Join(separator, texts);
}
}

This answer aims to extend and improve some mentions of LINQ-based solutions. It is not an example of a "good" way to solve this per se. Just use string.Join as suggested when it fits your needs.
Context
This answer is prompted by the second part of the question (a generic approach) and some comments expressing a deep affinity for LINQ.
The currently accepted answer does not seem to work with empty or singleton sequences. It also suffers from a performance issue.
The currently most upvoted answer does not explicitly address the generic string conversion requirement, when ToString does not yield the desired result. (This can be remedied by adding a call to Select.)
Another answer includes a note that may lead some to believe that the performance issue is inherent to LINQ. ("Using LINQ to turn enumerables into delimited strings can cause serious performance problems.")
I noticed this comment about sending the query to the database.
Given that there is no answer matching all these requirements, I propose an implementation that is based on LINQ, running in linear time, works with enumerations of arbitrary length, and supports generic conversions to string for the elements.
So, LINQ or bust? Okay.
static string Serialize<T>(IEnumerable<T> enumerable, char delim, Func<T, string> toString)
{
return enumerable.Aggregate(
new StringBuilder(),
(sb, t) => sb.Append(toString(t)).Append(delim),
sb =>
{
if (sb.Length > 0)
{
sb.Length--;
}
return sb.ToString();
});
}
This implementation is more involved than many alternatives, predominantly because we need to manage the boundary conditions for the delimiter (separator) in our own code.
It should run in linear time, traversing the elements at most twice.
Once for generating all the strings to be appended in the first place, and zero to one time while generating the final result during the final ToString call. This is because the latter may be able to just return the buffer that happened to be large enough to contain all the appended strings from the get go, or it has to regenerate the full thing (unlikely), or something in between. See e.g. What is the Complexity of the StringBuilder.ToString() on SO for more information.
Final Words
Just use string.Join as suggested if it fits your needs, adding a Select when you need to massage the sequence first.
This answer's main intent is to illustrate that it is possible to keep the performance in check using LINQ. The result is (probably) too verbose to recommend, but it exists.

You can use Aggregate to concatenate the strings into a single, character-separated string, but it will throw an InvalidOperationException if the collection is empty.
To avoid that, you can use the Aggregate overload that takes a seed string:
var seed = string.Empty;
var seperator = ",";
var cars = new List<string>() { "Ford", "McLaren Senna", "Aston Martin Vanquish"};
var carAggregate = cars.Aggregate(seed,
(partialPhrase, word) => $"{partialPhrase}{seperator}{word}").TrimStart(',');
Alternatively, you can use string.Join, which doesn't care if you pass it an empty collection:
var seperator = ",";
var cars = new List<string>() { "Ford", "McLaren Senna", "Aston Martin Vanquish"};
var carJoin = string.Join(seperator, cars);
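The difference on an empty collection can be checked with a small sketch like the one below. The JoinVersusAggregate class and its two methods are hypothetical wrappers around the calls already shown in the answers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrappers for comparison purposes.
public static class JoinVersusAggregate
{
    // string.Join returns an empty string for an empty sequence.
    public static string WithJoin(IEnumerable<string> items, string separator)
    {
        return string.Join(separator, items);
    }

    // Aggregate without a seed throws InvalidOperationException for an empty sequence.
    public static string WithAggregate(IEnumerable<string> items, string separator)
    {
        return items.Aggregate((a, b) => a + separator + b);
    }
}
```

Both produce the same string for a non-empty list, so the choice mostly comes down to the empty case and to performance on long lists.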
