Why isn't this LINQ parsing a file properly?

I have a file with a simple key,value format, one per line.
e.g:
word1,filepath1
word2,filepath2
word3,filepath5
I'm trying to read this into a Dictionary<string,string> in one go with LINQ. There are some duplicates in the file (where the first part - the first string - is the duplicate). In this case, I'm ok with dropping the duplicates.
This is my LINQ which isn't working:
var indexes = File.ReadAllLines(indexFileName)
.Select(x => x.Split(','))
.GroupBy(x=>x[0])
.ToDictionary(x => x.Key, x => x.ElementAt(1));
The ToDictionary part is confusing me, how do I retrieve the first value from the group and assign it to the value of the dictionary?
I get a System.ArgumentOutOfRangeException: 'Specified argument was out of the range of valid values.' exception.

var indexes = File.ReadAllLines(indexFileName)
.Select(x => x.Split(','))
.GroupBy(x => x[0])
.ToDictionary(x => x.Key, x => x.First()[1]);

So the problem here is that you're grouping arrays, not strings. Therefore the group objects you're dealing with in the ToDictionary() lambda are enumerations of arrays, not of strings. g.ElementAt(0) isn't a string. It's the first array of strings:
When
g.Key == "word1"
then g.ElementAt(0) is...
{ "word1", "filepath1" }
So you want g.ElementAt(0).ElementAt(1), or g.First()[1], or something to that effect.
That seems painfully obvious in hindsight, but unfortunately only in hindsight, for me.
I would suggest that after you accept Matthew Whited's answer, you clarify the code by turning the split lines into anonymous objects as soon as you can. ElementAt(1) doesn't communicate much.
var indexes =
File.ReadAllLines(indexFileName)
.Where(s => !String.IsNullOrEmpty(s))
.Select(x => x.Split(','))
// Turn the array into something self-documenting
.Select(a => new { Word = a[0], Path = a[1] })
.GroupBy(o => o.Word)
.ToDictionary(g => g.Key, g => g.First().Path)
;
Converting each line to an object makes it easier for me to think about, and Intellisense starts playing on your team as well.
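For what it's worth, here is a minimal sketch of the same idea wrapped in a method that takes lines already read into memory, so it can be tried without a file. The names IndexParser and BuildIndex are made up for illustration, and skipping lines that contain no comma is an assumption about how malformed input should be handled, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexParser
{
    // Hypothetical helper: builds word -> path, keeping the first path seen for each word.
    public static Dictionary<string, string> BuildIndex(IEnumerable<string> lines)
    {
        return lines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            // Split on the first comma only, so a path that contains commas stays intact.
            .Select(line => line.Split(new[] { ',' }, 2))
            // Assumption: a line without a comma is malformed and is skipped.
            .Where(parts => parts.Length == 2)
            .Select(parts => new { Word = parts[0], Path = parts[1] })
            .GroupBy(entry => entry.Word)
            // Each group holds every entry for one word; First() keeps the earliest.
            .ToDictionary(g => g.Key, g => g.First().Path);
    }
}
```

It could be called as IndexParser.BuildIndex(File.ReadAllLines(indexFileName)).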

Related

How can I split a List<T> into two lists, one containing all duplicate values and the other containing the remainder?

I have a basic class for an Account (other properties removed for brevity):
public class Account
{
public string Email { get; set; }
}
I have a List<T> of these accounts.
I can remove duplicates based on the e-mail address easily:
var uniques = list.GroupBy(x => x.Email).Select(x => x.First()).ToList();
The list named 'uniques' now contains only one of each account based on e-mail address, any that were duplicates were discarded.
I want to do something a little different and split the list into two.
One list will contain only 'true' unique values, the other list will contain all duplicates.
For example the following list of Account e-mails:
unique@email.com
dupe@email.com
dupe@email.com
Would be split into two lists:
Unique
unique@email.com
Duplicates
dupe@email.com
dupe@email.com
I have been able to achieve this already by creating a list of unique values using the example at the top. I then use .Except() on the original list to get the differences which are the duplicates. Lastly I can loop over each duplicate to 'pop' it out of the unique list and move it to the duplicate list.
Here is a working example on .NET Fiddle
Can I split the list in a more efficient or syntactically sugary way?
I'd be happy to use a third party library if necessary but I'd rather just stick to pure LINQ.
I'm aware of CodeReview but feel the question also fits here.

var groups = list.GroupBy(x => x.Email)
.GroupBy(g => g.Count() == 1 ? 0 : 1)
.OrderBy(g => g.Key)
.Select(g => g.SelectMany(x => x))
.ToList();
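groups[0] will be the unique ones and groups[1] will be the non-unique ones, provided the list contains both kinds. If it contains only uniques or only duplicates, groups has a single element, so check groups.Count before indexing.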
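A possible variation on the nested grouping, sketched here with ToLookup so that a missing category comes back as an empty list rather than a missing index. AccountSplitter and SplitByEmail are illustrative names, and the tuple return type assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Account
{
    public string Email { get; set; }
}

public static class AccountSplitter
{
    // Hypothetical helper: accounts whose e-mail occurs once, and all accounts whose e-mail repeats.
    public static (List<Account> Uniques, List<Account> Duplicates) SplitByEmail(IEnumerable<Account> accounts)
    {
        // The lookup key is true when an e-mail occurs exactly once.
        var lookup = accounts
            .GroupBy(a => a.Email)
            .ToLookup(g => g.Count() == 1);

        // Indexing a lookup with an absent key yields an empty sequence, not an exception.
        var uniques = lookup[true].SelectMany(g => g).ToList();
        var duplicates = lookup[false].SelectMany(g => g).ToList();
        return (uniques, duplicates);
    }
}
```

The list is grouped only once, and both results are materialized, so they can be enumerated repeatedly without re-running the query.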

var duplicates = list.GroupBy(x => x) // or x.Property if you are grouping by some property.
.Where(g => g.Count() > 1)
.SelectMany(g => g);
var uniques = list.GroupBy(x => x) // or x.Property if you are grouping by some property.
.Where(g => g.Count() == 1)
.SelectMany(g => g);
Alternatively, once you get one list, you can get the other one using Except:
var uniques = list.Except(duplicates);
// or
var duplicates = list.Except(uniques);

Another way to do it would be to get uniques, and then for duplicates simply get the elements in the original list that aren't in uniques.
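// Materialize the uniques once, so the grouping is not redone for every element.
var uniques = list.GroupBy(x => x.Email)
    .Where(g => g.Count() == 1)
    .SelectMany(g => g)
    .ToList();
// Everything that is not unique is a duplicate.
var dupes = list.Where(d => !uniques.Contains(d)).ToList();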

c# - Linq Query to retrieve all objects with a max value

Currently I have a List of objects in which I need to find all occurrences that have the maximum value.
Currently my solution to this has been:
Foo maxFoo = list.OrderByDescending(foo => foo.A).First();
List<Foo> maxFoos = new List<Foo>();
foreach(Foo foo in list) {
if (foo.A.Equals(maxFoo.A)) {
maxFoos.Add(foo);
}
}
However I want to know if there is a way to do this in a single Linq expression.
All the resources I have read only refer to getting the max value for one object.
Note: For the time being, I want to know a solution which doesn't rely on MoreLinq

You can group by the property, then order the groups by key, and take the content of the first one, like this:
var res = list
.GroupBy(item => item.A)
.OrderByDescending(g => g.Key)
.First()
.ToList();

You could group by A, order the group, and get the elements in the first group, which corresponds to the elements with the max value of A:
list
.GroupBy(x => x.A)
.OrderByDescending(grp=> grp.Key)
.First()
.Select(x => x);

This works:
var maxFoos =
list
.OrderByDescending(foo => foo.A)
.GroupBy(foo => foo.A)
.Take(1)
.SelectMany(foo => foo)
.ToList();
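If a single expression is not a hard requirement, one alternative sketch computes the maximum first and then filters, which avoids sorting or grouping the whole list. The Foo class below is a stand-in that assumes A is an int (the question does not say), and MaxFinder and AllWithMaxA are illustrative names.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    // Assumption: the question does not show the type of A; int is used for illustration.
    public int A { get; set; }
}

public static class MaxFinder
{
    // Hypothetical helper: returns every element whose A equals the largest A in the list.
    public static List<Foo> AllWithMaxA(IReadOnlyCollection<Foo> list)
    {
        // Max() throws on an empty sequence of non-nullable values, so guard first.
        if (list.Count == 0) return new List<Foo>();

        var max = list.Max(foo => foo.A);
        return list.Where(foo => foo.A == max).ToList();
    }
}
```

This makes two passes over the list, one for Max and one for Where.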

Order By on the Basis of Integer present in string

I have a problem in my C# application. I have some school classes in a database, for example 8-B, 9-A, 10-C, 11-C and so on. When I use an order by clause to sort them, the string comparison gives these results:
10-C
11-C
8-B
9-A
but I want integer sorting on the basis of first integer present in string...
i.e.
8-B
9-A
10-C
11-C
hope you'll understand...
I've tried this, but it throws an exception:
var query = cx.Classes.Select(x=>x.Name)
.OrderBy( x=> new string(x.TakeWhile(char.IsDigit).ToArray()));
Please help me... want ordering on the basis of classes ....

Maybe Split will do?
.OrderBy(x => Convert.ToInt32(x.Split('-')[0]))
.ThenBy(x => x.Split('-')[1])

If the input is well-formed enough, this would do:
var maxLen = cx.Classes.Max(x => x.Name.Length);
var query = cx.Classes.Select(x => x.Name).OrderBy(x => x.PadLeft(maxLen));

You can left-pad each name with '0' to a fixed length that is long enough for your data, for example 6:
.OrderBy(x => x.PadLeft(6, '0'))

This is fundamentally the same approach as Andrius's answer, written out more explicitly:
var names = new[] { "10-C", "8-B", "9-A", "11-C" };
var sortedNames =
(from name in names
let parts = name.Split('-')
select new {
fullName = name,
number = Convert.ToInt32(parts[0]),
letter = parts[1]
})
.OrderBy(x => x.number)
.ThenBy(x => x.letter)
.Select(x => x.fullName);
It's my naive assumption that this would be more efficient because the Split is only processed once in the initial select rather than in both OrderBy and ThenBy, but for all I know the extra "layers" of LINQ may outweigh any gains from that.
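Along the same lines, a small sketch that parses the leading digits once per name and tolerates names without a numeric prefix. It assumes the names have already been loaded into memory (for example with ToList()), so it runs as LINQ to Objects. ClassNameSorter, LeadingNumber and SortByLeadingNumber are made-up names, and sorting unparseable names last is an arbitrary choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClassNameSorter
{
    // Hypothetical helper: the leading digits as an int, or int.MaxValue when they cannot be parsed.
    private static int LeadingNumber(string name)
    {
        var digits = new string(name.TakeWhile(char.IsDigit).ToArray());
        return int.TryParse(digits, out var number) ? number : int.MaxValue;
    }

    // Sorts by the numeric prefix first, then by the full name for ties.
    public static List<string> SortByLeadingNumber(IEnumerable<string> names)
    {
        return names
            .OrderBy(name => LeadingNumber(name))
            .ThenBy(name => name, StringComparer.Ordinal)
            .ToList();
    }
}
```

OrderBy evaluates its key selector once per element, so the prefix is not re-parsed during comparisons.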

Remove duplicates of a List, selecting by a property value in C#?

I have a list of objects that I need some duplicates removed from. We consider them duplicates if they have the same Id and prefer the one whose booleanValue is false. Here's what I have so far:
objects.GroupBy(x => x.Id).Select(x => x.Where(y => !y.booleanValue));
I've determined that GroupBy is doing no such grouping, so I don't see if any of the other functions are working. Any ideas on this? Thanks in advance.

You can do this:
var results =
from x in objects
group x by x.Id into g
select g.OrderBy(y => y.booleanValue).First();
For every Id it finds in objects, it will select the first element where booleanValue == false, or the first one (if none of them have booleanValue == false).
If you prefer fluent syntax:
var results = objects.GroupBy(x => x.Id)
.Select(g => g.OrderBy(y => y.booleanValue).First());
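The reason this works is that false sorts before true and OrderBy is a stable sort. A self-contained sketch with a made-up Item class (the Label property and the Deduplicator name are only there for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Id { get; set; }
    public bool booleanValue { get; set; }
    // Hypothetical property, used only to tell the sample items apart.
    public string Label { get; set; }
}

public static class Deduplicator
{
    // Hypothetical helper: one item per Id, preferring an item whose booleanValue is false.
    public static List<Item> PreferFalse(IEnumerable<Item> objects)
    {
        return objects
            .GroupBy(x => x.Id)
            // false orders before true, and ties keep their original order,
            // so First() is the earliest false item, or the earliest item if none is false.
            .Select(g => g.OrderBy(y => y.booleanValue).First())
            .ToList();
    }
}
```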

Something like this should work:
var result =
objects.GroupBy(x => x.Id).Select(g =>
g.FirstOrDefault(y => !y.booleanValue) ?? g.First());
This assumes that your objects are of a reference type.
Another possibility might be to use Distinct() with a custom IEqualityComparer<>.

This partially answers the question above, but I just needed a really basic solution:
objects.GroupBy(x => x.Id)
.Select(x => x.First())
.ToArray();
The key to getting the original object from the GroupBy() is the Select() getting the First() and the ToArray() gets you an array of your objects, not a Linq object.

How can I order a Dictionary<string,string> by a substring within the value?

I have a Dictionary <string, string> where the value is a concatenation of substrings delimited with a :. For example, 123:456:Bob:Smith.
I would like to order the dictionary by the last substring (Smith) ascending, and preferably like this:
orderedDictionary = unordered
.OrderBy(x => x.Value)
.ToDictionary(x => x.Key, x => x.Value);
So, I need to somehow treat the x.Value as a string and sort by extracting the fourth substring. Any ideas?

var ordered = unordered.OrderBy(x => x.Value.Split(':').Last())
.ToDictionary(x => x.Key, x => x.Value);

Try
orderedDictionary = unordered.OrderBy(x => x.Value.Substring(x.Value.LastIndexOf(":"))).ToDictionary(x => x.Key, x => x.Value);

Take a look at the OrderBy extension method, which works on a dictionary because it is enumerable, specifically this one http://msdn.microsoft.com/en-us/library/bb549422.aspx noting the comparer parameter. That should point you in the right direction and I think you'll find learning the remainder of benefit.
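As a rough sketch of what passing a comparer could look like: the version below sorts by the last colon-delimited field, ignoring case, and returns a list of pairs. It returns a list rather than a dictionary because the documentation for Dictionary<TKey,TValue> describes its enumeration order as undefined, so a sorted sequence copied into one with ToDictionary is not guaranteed to stay sorted. DictionarySorter and OrderByLastField are illustrative names, and the case-insensitive comparer is just one possible choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionarySorter
{
    // Hypothetical helper: pairs ordered by the last ':'-delimited field of the value.
    public static List<KeyValuePair<string, string>> OrderByLastField(IDictionary<string, string> unordered)
    {
        return unordered
            // The second argument is the comparer applied to the extracted keys.
            .OrderBy(pair => pair.Value.Split(':').Last(), StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```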
