Get a list of unique strings from duplicate entries - c#

I'm looking for a LINQ function that returns a list of unique strings from a list of objects which contain these strings. The strings of the objects are not unique. Like this:
List before:
name="abc",value=3
name="xyz",value=5
name="abc",value=9
name="hgf",value=0
List this function would return:
"abc","xyz","hgf"
Does such a function even exist? Of course I know how I could implement this manually, but I was curious if LINQ can do this for me.

var foo = list.Select(p => p.name).Distinct().ToList();

You could use the Distinct extension method. So basically you will first project the original objects into a collection of strings and then apply the Distinct method:
string[] result = source.Select(x => x.name).Distinct().ToArray();

(from obj in objectList
select obj.name).Distinct();
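For reference, here is a minimal, self-contained sketch of the Select + Distinct approach from the answers above. The tuple list is only a hypothetical stand-in for the asker's objects. Note that the documentation for Distinct does not promise any particular ordering of the result, so the sketch avoids relying on one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data mirroring the question; tuples stand in for the real objects.
var list = new List<(string name, int value)>
{
    ("abc", 3), ("xyz", 5), ("abc", 9), ("hgf", 0),
};

// Project each item to its name, then drop the repeated names.
List<string> uniqueNames = list.Select(p => p.name).Distinct().ToList();

// Contains abc, xyz and hgf once each.
Console.WriteLine(string.Join(",", uniqueNames));
```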

Related

How do I use Any() instead of RemoveAll() to exclude list items?

ListWithAllItems contains two types of items: ones I want to select, and ones that I don't.
listForExcluding contains the items that I should exclude:
List<string> listForExcluding = ...;
So I do it in two lines:
List<string> x = ListWithAllItems.ToList();
x.RemoveAll(p => listForExcluding.Any(itemForExclude => itemForExclude == p));
How do I use Any() instead of RemoveAll() to get this query with one line?
Any doesn't make sense here, just use Except:
var filtered = ListWithAllItems.Except(listForExcluding);
Call ToList only if you actually need a list at the end; otherwise don't materialize the IEnumerable for no reason (it causes an extra enumeration).
If you really want the RemoveAll version for some reason, use Contains (this is also how to do it using Where):
x.RemoveAll(p => listForExcluding.Contains(p));
There are a number of other valid one-liners, but seriously, just go with Except.
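One difference worth knowing before swapping RemoveAll for Except: Except computes a set difference, so repeated items in the source come out only once, whereas a Where filter keeps them. The sketch below uses made-up sample strings to show both; the HashSet is an optional addition that avoids rescanning the exclusion list for every item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; note the repeated "a".
var listWithAllItems = new List<string> { "a", "b", "a", "c", "d" };
var listForExcluding = new List<string> { "b", "d" };

// Except is a set difference, so the repeated "a" appears only once.
List<string> viaExcept = listWithAllItems.Except(listForExcluding).ToList();

// Where + HashSet.Contains keeps duplicates and the original order.
var excluded = new HashSet<string>(listForExcluding);
List<string> viaWhere = listWithAllItems.Where(p => !excluded.Contains(p)).ToList();

Console.WriteLine(string.Join(",", viaExcept)); // a,c
Console.WriteLine(string.Join(",", viaWhere));  // a,a,c
```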

How to group a List<T> that has a property of List<string> by that nested List?

I have a custom data type that contains a List<string>.
I wish to group a List of CustomDataType by that nested List<string>.
I have tried the following
compoundSchedules.GroupBy(a => a.Timepoints);
Where Timepoints is a list of dates represented as strings. Where any CustomDataTypes have identical timepoints, I wish them to be grouped together.
Using the code above, it does not group them and instead just repeats the List of CustomDataType with its timepoint list as the IGrouping Key.
Thanks.
You should create an IEqualityComparer<List<string>> that checks that the lists have the same length and contents, and use this overload of Enumerable.GroupBy:
compoundSchedules.GroupBy(a => a.Timepoints, myComparer);
Either that, or create your own class to be a list of Timepoints, and have it implement GetHashCode and Equals (and/or implement IEquatable<T>), which are used by the default comparer.
Two things come to mind. First is to create a Timepoints class and implement IComparable. Then you can compare each element of each list to see if they are equivalent. Alternatively, you could create a new property on your compound schedule that holds the hash code for the list.
Since Timepoints is a List<string> - I would construct one aggregated string and group by that. You can accomplish this by using the Aggregate method
compoundSchedules.GroupBy(a => a.Timepoints.Aggregate( (x,y) => x + y));
You could group by the joined list of strings:
var grouped = compoundSchedules.GroupBy(cs => string.Join("", cs.Timepoints));
This would also take care of the same order (if this is desired).
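A caveat with grouping by a concatenated string: with an empty separator, different lists can produce the same key. The sketch below shows the collision and one way around it, assuming a separator character ("|" here) that never occurs inside a timepoint. The tuple type is a hypothetical stand-in for CustomDataType.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for CustomDataType: a name plus its timepoints.
var compoundSchedules = new List<(string Name, List<string> Timepoints)>
{
    ("A", new List<string> { "1", "23" }),
    ("B", new List<string> { "12", "3" }),
    ("C", new List<string> { "1", "23" }),
};

// With no separator, "1"+"23" and "12"+"3" both become "123",
// so A, B and C all land in a single group.
int groupsWithoutSeparator = compoundSchedules
    .GroupBy(cs => string.Join("", cs.Timepoints))
    .Count();

// A separator assumed never to appear inside a timepoint keeps the keys distinct.
var grouped = compoundSchedules
    .GroupBy(cs => string.Join("|", cs.Timepoints))
    .ToList();

Console.WriteLine(groupsWithoutSeparator); // 1
foreach (var g in grouped)
{
    Console.WriteLine($"{g.Key}: {string.Join(",", g.Select(cs => cs.Name))}");
}
```

If no safe separator exists, a custom IEqualityComparer<List<string>> based on SequenceEqual avoids the problem altogether.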

Removing a list of objects from another list

I've been looking for something like that for days. I'm trying to remove all the elements from a bigger list A according to a list B.
Suppose I have a general list of 100 elements with different IDs, and another list of specific elements with just 10 records. I need to remove all the elements from the first list that don't exist in the second list.
I'll try to show the code; I don't know why it doesn't work.
List<Obj> listA = new List<Obj>();
List<Obj> listB = new List<Obj>();
//here I load my first list with many elements
//here I load my second list with some specific elements
listA.RemoveAll(x => !listB.Contains(x));
I don't know why, but it's not working. If I try this example with a List<int>, it works nicely, but I'd like to do it with my own object. This object has an ID, but I don't know how to use this ID inside the LINQ expression.
You need to compare the IDs:
listA.RemoveAll(x => !listB.Any(y => y.ID == x.ID));
See List<T>.RemoveAll.
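I believe you can use the Except extension method for this kind of task. Note that Except returns the elements of listA that are not in listB; to keep only the elements that also exist in listB, as the question asks, Intersect is the matching operation.
var result = listA.Except(listB);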
Reference: http://www.dotnetperls.com/except
If you want to remove a list of objects (listB) from another list (listA) use:
listA = listA.Except(listB).ToList();
Remember to use ToList() to convert IEnumerable<Obj> to List<Obj>.
For anyone viewing this now: I think var result = listA.Intersect(listB) will give the values common to both lists.
According to the documentation on MSDN ( http://msdn.microsoft.com/en-us/library/bhkz42b3.aspx ), Contains uses the default equality comparer to determine equality, so you could implement IEquatable<T>'s Equals method on your Obj class to make it work. HiperiX mentions the ref comparison above.
How to add the IEquatable interface: http://msdn.microsoft.com/en-us/library/ms131190.aspx
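As a rough sketch of the ID-based approach, the snippet below collects the IDs from listB into a HashSet first, so each element of listA is checked with a single lookup instead of a scan of listB. The tuple type is a hypothetical stand-in for Obj, and the sample values are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Obj: an ID plus some other data.
var listA = new List<(int ID, string Name)>
{
    (1, "one"), (2, "two"), (3, "three"), (4, "four"),
};
var listB = new List<(int ID, string Name)>
{
    (2, "TWO"), (4, "FOUR"),
};

// Collect the IDs to keep once, then remove everything whose ID is not among them.
var idsToKeep = new HashSet<int>(listB.Select(y => y.ID));
int removed = listA.RemoveAll(x => !idsToKeep.Contains(x.ID));

Console.WriteLine(removed);                                       // 2
Console.WriteLine(string.Join(",", listA.Select(x => x.Name)));   // two,four
```

Because the comparison is on ID only, this works even when the objects in the two lists are different instances or differ in their other fields.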

Sorting a list in .Net by one field, then another

I have list of objects I need to sort based on some of their properties. This works fine to sort it by one field:
reportDataRows.Sort((x, y) => x["Comment1"].CompareTo(y["Comment1"]));
foreach (var row in reportDataRows) {
...
}
I see lots of examples on here that do this with only one field. But how do I sort by one field, then another? Or how about a list of many fields? It seems like using LINQ orderby/thenby would be best, but I don't know enough about it to know how to use it.
For the parameters, something like this that supports any number of fields to sort by would be nice:
var sortBy = new List<string>(){"Comment1","Time"};
I don't want to be writing code to do this in every one of my apps. I plan on moving this sort code to the class that holds the data so that it can do more advanced things like using a list of parameters and implicitly recognizing that the field is a date and sorting it as a date instead of a string. The reportDataRow object contains fields with this information, so I don't have to do any messy checks to find out if the field is supposed to be a date.
Yes, I think it makes more sense to use OrderBy and ThenBy:
foreach (var row in reportDataRows.OrderBy(x => x["Comment1"]).ThenBy(x => x["Comment2"]))
{
...
}
This assumes the other thing you want to order by is "Comment2".
Try this:
reportDataRows.Sort((x, y) =>
{
var compare = x["Comment1"].CompareTo(y["Comment1"]);
if(compare != 0)
return compare;
return x["Comment2"].CompareTo(y["Comment2"]);
});
You may want to look at this previous answer where I posted an extension method which handles multiple order by's in LINQ. This allows this sort of syntax:
myList.OrderByMany(x => x.Field1, x => x.Field2);
Look at the example for ThenBy on MSDN.
If you're comparing your own objects, then you can implement the IComparable interface.
Otherwise, you can use the IComparer interface.
Using LINQ method syntax:
var sortedRows = reportDataRows.OrderBy(r => r["Comment1"])
.ThenBy(r => r["AnotherField"]);
foreach (var row in sortedRows) {
...
}
And even more readable using query comprehension syntax:
var sortedRows = from r in reportDataRows
orderby r["Comment1"], r["Comment2"]
select r;
foreach (var row in sortedRows) {
...
}
You got it. Enumerable.OrderBy().ThenBy() is your ticket. It works exactly like it looks; elements are sorted by each projection, with ties decided by comparing the next projection. You can chain as many ThenBys as you want, and there are also OrderByDescending and ThenByDescending methods that will sort that projection in descending order.
As Albin has pointed out, an OrderBy chain does not touch the original list unless you assign the result of the ordering back to the original variable, like this:
reportDataRows = reportDataRows.OrderBy(x=>x.Comment1).ThenBy(x=>x.Comment2).ToList();
As a rule, OrderBy performs slightly slower than List.Sort(); it is designed to work on any IEnumerable sequence, so in order to sort (which requires knowing every element) it first copies its entire source into a new array. However, OrderBy has a distinct advantage over Sort in that it is a "stable" sort: elements that compare equal keep their relative order in the sorted result (whichever of the two you'd encounter first when iterating the unsorted list is also encountered first in the sorted one).
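For the "any number of fields" part of the question, one possible sketch is to start with OrderBy on the first field name and fold the remaining names in with ThenBy. The rows here are plain dictionaries standing in for the asker's reportDataRow type (an assumption), the Row helper is hypothetical, and every field is compared as an ordinal string; date-aware sorting would need the values parsed first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that builds a row; a dictionary stands in for reportDataRow.
Dictionary<string, string> Row(string comment, string time) =>
    new Dictionary<string, string> { ["Comment1"] = comment, ["Time"] = time };

var reportDataRows = new List<Dictionary<string, string>>
{
    Row("b", "2"), Row("a", "2"), Row("a", "1"),
};
var sortBy = new List<string> { "Comment1", "Time" }; // must contain at least one field

// Sort by the first field, then append one ThenBy per remaining field.
string first = sortBy[0];
IOrderedEnumerable<Dictionary<string, string>> ordered =
    reportDataRows.OrderBy(r => r[first], StringComparer.Ordinal);
foreach (string field in sortBy.Skip(1))
{
    ordered = ordered.ThenBy(r => r[field], StringComparer.Ordinal);
}

List<Dictionary<string, string>> sorted = ordered.ToList();
Console.WriteLine(string.Join(" ", sorted.Select(r => r["Comment1"] + r["Time"]))); // a1 a2 b2
```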

How to sort a list<datarow>?

I have a datatable which I will convert to a list<datarow>. How do I sort the list based on some datatable column? I think it's something like list = list.Sort(p=>p.Field() but I am not sure about the syntax.
I am interested in using LINQ heavily so I was wondering if I should convert the datatable to a strongly typed generic list and take it from there instead of using linq to datasets. The only reason I didn't do this now is for performance reason. The datatable consists of about 20,000 records.
I'd recommend you convert the dataset to a generic list and use LINQ to sort:
var collection =
from c in people
orderby c.Name ascending
select c;
return collection.ToList();
Where people is the generic List. There are other ways to sort using linq.
Are you always going to return 20k records? The way I understand it, there's more overhead with datatables/sets/rows than with generic lists due to all the built-in .NET functions and methods...but I might be wrong.
In general if you want to sort a List<T> you could do something like
list.Sort((a, b) => String.Compare(a.StringValue, b.StringValue));
Obviously this example is sorting string values alphabetically, but you should get the idea of the syntax to use.
You could use something like:
list.Sort((x,y) => (int)x[i] > (int)y[i] ? 1 : ((int)x[i] < (int)y[i] ? -1 : 0));
Where i is the index of the column you want to compare. In the example above the i column contains integers. If it contained strings you could do something like
list.Sort((x,y) => string.Compare((string)x[i], (string)y[i]));
Note that the Sort method sorts the list in place, it does not create a new list.
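Putting the pieces together for a List<DataRow>, here is a small sketch using the Field<T> extension method to read typed column values by name. The table, its column names, and its contents are made up for illustration. On .NET Framework, AsEnumerable and Field<T> additionally require a reference to System.Data.DataSetExtensions.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical table with a string column and an int column.
var table = new DataTable();
table.Columns.Add("Name", typeof(string));
table.Columns.Add("Age", typeof(int));
table.Rows.Add("Carol", 41);
table.Rows.Add("Alice", 30);
table.Rows.Add("Bob", 25);

List<DataRow> rows = table.AsEnumerable().ToList();

// In-place sort by the int column.
rows.Sort((x, y) => x.Field<int>("Age").CompareTo(y.Field<int>("Age")));
Console.WriteLine(string.Join(",", rows.Select(r => r.Field<string>("Name")))); // Bob,Alice,Carol

// LINQ alternative: leaves rows untouched and returns a new list sorted by the string column.
List<DataRow> byName = rows
    .OrderBy(r => r.Field<string>("Name"), StringComparer.Ordinal)
    .ToList();
Console.WriteLine(string.Join(",", byName.Select(r => r.Field<string>("Name")))); // Alice,Bob,Carol
```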
