I have a list of duplicate names and I want to get the list without the duplicates.
CSVCategories = from line in File.ReadAllLines(path).Skip(1)
let columns = line.Split(',')
select new Category
{
Name = columns[9]
};
var results = CSVCategories.GroupBy(x => x.Name)
.Select(g => g.FirstOrDefault())
.ToList();
I try to look at the elements and debug using the following loop, but it still returns the duplicates from the list including empty strings for null values:
foreach(var item in results)
{
Console.WriteLine(item.Name);
}
Calling Distinct does not work most likely because your Category class does not have proper implementation of Equals and GetHashCode.
You have two options: properly override the Equals and GetHashCode methods, or use a HashSet to check whether a Name has already been added.
var uniqueNames = new HashSet<string>();
// Original select statement
CSVCategories = CSVCategories.Where(x => uniqueNames.Add(x.Name)).ToList();
LINQ encourages immutability, so it never modifies your input collection. Distinct() therefore returns a new sequence rather than modifying the collection in place. Try:
foreach(var item in CSVCategories.Distinct())
{
Console.WriteLine(item.Name);
}
I noticed that the results variable gave me back a list that still contained duplicates, but only ones that differed in their casing.
E.g. my original list CSVCategories contained the elements: ["Home", "home", "EmptyString", "home", "Town", "Town", "Park"]
When de-duplicating with GroupBy, the results query returned ["Home", "home", "EmptyString", "Town", "Park"], so it kind of worked, but it kept the empty values and the values that differ only in casing.
Now I need to find a way to remove casing duplicates and empty strings.
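One possible way to do this, sketched under assumptions: filter out blank names first, then group with a case-insensitive comparer. The Category class below is a minimal stand-in for the one in the question, and CategoryCleaner and DistinctByName are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the Category class from the question (assumption).
public class Category
{
    public string Name { get; set; }
}

public static class CategoryCleaner
{
    // Hypothetical helper: drops null/blank names, then keeps the first
    // category for each name, comparing names case-insensitively.
    public static List<Category> DistinctByName(IEnumerable<Category> categories)
    {
        return categories
            .Where(c => !string.IsNullOrWhiteSpace(c.Name))
            .GroupBy(c => c.Name, StringComparer.OrdinalIgnoreCase)
            .Select(g => g.First())
            .ToList();
    }
}
```

Called with names "Home", "home", "", "home", "Town", "Town", "Park", this should keep "Home", "Town" and "Park". Whether OrdinalIgnoreCase or a culture-aware comparer is the right choice depends on the data.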
I have a List<Map> and I want to update the Map.Target property based on a matching value from another List<Map>.
Basically, the logic is:
If mapsList1.Name is equal to mapsList2.Name
Then mapsList1.Target = mapsList2.Name
The structure of the Map class looks like this:
public class Map {
public Guid Id { get; set; }
public string Name { get; set; }
public string Target { get; set; }
}
I tried the following but obviously it's not working:
List<Map> mapsList1 = new List<Map>();
List<Map> mapsList2 = new List<Map>();
// populate the 2 lists here
mapsList1.Where(m1 => mapsList2.Where(m2 => m1.Name == m2.Name) ) // don't know what to do next
The count of items in list 1 will be always greater than or equal to the count of items in list 2. No duplicates in both lists.
Assuming there are a small number of items in the lists and only one item in list 1 that matches:
list2.ForEach(l2m => list1.First(l1m => l1m.Name == l2m.Name).Target = l2m.Target);
If more than one item in list1 must be updated, enumerate all of list1, doing a FirstOrDefault on list2:
list1.ForEach(l1m => l1m.Target = list2.FirstOrDefault(l2m => l1m.Name == l2m.Name)?.Target ?? l1m.Target);
If there are a large number of items in list2, turn it into a dictionary:
var d = list2.ToDictionary(m => m.Name);
list1.ForEach(m => m.Target = d.ContainsKey(m.Name) ? d[m.Name].Target : m.Target);
(Presumably list2 doesn't contain any repeated names)
If list1's names are unique and everything in list2 is in list1, you could even turn list1 into a dictionary and enumerate list2:
var d=list1.ToDictionary(m => m.Name);
list2.ForEach(m => d[m.Name].Target = m.Target);
If list2 has entries that are not in list1, or list1 has duplicate names, you could use a Lookup instead. You'd just have to do something to avoid the "Collection was modified; enumeration operation may not execute" exception you'd get if you tried to modify the list it returns in response to a name.
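A minimal sketch of the Lookup idea, assuming the Map class from the question. MapUpdater and CopyTargets are hypothetical names. The sketch only sets a property on the matched objects and never adds to or removes from any collection, so it does not touch the sequence being enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Map
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Target { get; set; }
}

public static class MapUpdater
{
    // Hypothetical helper: copies Target from list2 onto every list1 item
    // with the same Name. list1 may contain duplicate names, and list2 may
    // contain names that list1 lacks.
    public static void CopyTargets(List<Map> list1, List<Map> list2)
    {
        ILookup<string, Map> byName = list1.ToLookup(m => m.Name);

        foreach (Map source in list2)
        {
            // Indexing a lookup with a missing key yields an empty sequence,
            // so no ContainsKey check is needed.
            foreach (Map match in byName[source.Name])
            {
                match.Target = source.Target;
            }
        }
    }
}
```

Usage would be MapUpdater.CopyTargets(mapsList1, mapsList2); after both lists are populated.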
mapsList1.Where(m1 => mapsList2.Where(m2 => m1.Name == m2.Name) ) // don't know what to do next
LINQ Where doesn't really work like that / that's not a statement in itself. The m1 is the entry from list1, and the inner Where would produce an enumerable of list 2 items, but it doesn't result in the Boolean the outer Where is expecting, nor can you do anything to either of the sequences because LINQ operations are not supposed to have side effects. The only thing you can do with a Where is capture or use the sequence it returns in some other operation (like enumerating it), so Where isn't really something you'd use for this operation unless you use it to find all the objects you need to alter. It's probably worth pointing out that ForEach is a list thing, not a LINQ thing, and is basically just another way of writing foreach(var item in someList)
If the collections are big enough, a better approach would be to create a dictionary to look up the targets:
List<Map> mapsList1 = new List<Map>();
List<Map> mapsList2 = new List<Map>();
var dict = mapsList2
.GroupBy(map => map.Name)
.ToDictionary(maps => maps.Key, maps => maps.First().Target);
foreach (var map in mapsList1)
{
if (dict.TryGetValue(map.Name, out var target))
{
map.Target = target;
}
}
Note that this will discard any possible name duplicates from mapsList2.
I am trying to concatenate List<> as follows:
List<Student> finalList = new List<Student>();
var sortedDict = dictOfList.OrderBy(k => k.Key);
foreach (KeyValuePair<int, List<Student>> entry in sortedDict) {
List<Student> ListFromDict = (List<Student>)entry.Value;
finalList.Concat(ListFromDict);
}
But no concatenation happens. finalList remains empty. Any help?
A call to Concat does not modify the original list; instead it returns a new list. Or, to be totally accurate, it returns an IEnumerable<Student> that will produce the contents of both lists concatenated, without modifying either of them.
You probably want to use AddRange which does what you want:
List<Student> ListFromDict = (List<Student>)entry.Value;
finalList.AddRange(ListFromDict);
Or even shorter (in one line of code):
finalList.AddRange((List<Student>)entry.Value);
And because entry.Value is already of type List<Student>, you can use just this:
finalList.AddRange(entry.Value);
Other answers have explained why Concat isn't helping you - but they've all kept your original loop. There's no need for that - LINQ has you covered:
List<Student> finalList = dictOfList.OrderBy(k => k.Key)
.SelectMany(pair => pair.Value)
.ToList();
To be clear, this replaces the whole of your existing code, not just the body of the loop.
Much simpler :) Whenever you find yourself using a foreach loop which does nothing but build another collection, it's worth seeing whether you can eliminate that loop using LINQ.
You may want to read up on the documentation for Enumerable.Concat:
Return Value
Type: System.Collections.Generic.IEnumerable<TSource>
An IEnumerable<T> that contains the concatenated elements of the two input sequences.
So you may want to use the return value, which holds the new elements.
As an alternative, you can use List<T>.AddRange, which adds the elements of the specified collection to the end of the List<T>.
As an aside, you can also achieve your goal with a simple LINQ query:
var finalList = dictOfList.OrderBy(k => k.Key)
.SelectMany(k => k.Value)
.ToList();
As specified here, Concat generates a new sequence whereas AddRange actually adds the elements to the list. You thus should rewrite it to:
List<Student> finalList = new List<Student>();
var sortedDict = dictOfList.OrderBy(k => k.Key);
foreach (KeyValuePair<int, List<Student>> entry in sortedDict) {
List<Student> ListFromDict = (List<Student>)entry.Value;
finalList.AddRange(ListFromDict);
}
Furthermore you can improve the efficiency a bit, by omitting the cast to a List<T> object since entry.Value is already a List<T> (and technically only needs to be an IEnumerable<T>):
var sortedDict = dictOfList.OrderBy(k => k.Key);
foreach (KeyValuePair<int, List<Student>> entry in sortedDict) {
finalList.AddRange(entry.Value);
}
The Concat method does not modify the original collection; instead it returns a brand new sequence with the concatenation result. So either try finalList = finalList.Concat(ListFromDict).ToList() or use the AddRange method, which modifies the target list.
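The difference between the two calls can be seen in a small sketch. ConcatDemo and its Run method are hypothetical names, and plain integers stand in for Student objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConcatDemo
{
    // Hypothetical demo: returns the list size after Concat, the size of the
    // sequence Concat returned, and the list size after AddRange.
    public static (int CountAfterConcat, int ConcatenatedCount, int CountAfterAddRange) Run()
    {
        var finalList = new List<int>();
        var extra = new List<int> { 1, 2, 3 };

        // Concat builds a new lazy sequence; finalList itself is untouched.
        IEnumerable<int> combined = finalList.Concat(extra);
        int countAfterConcat = finalList.Count;

        // Enumerate the lazy sequence now, before finalList changes.
        int concatenatedCount = combined.Count();

        // AddRange appends to finalList in place.
        finalList.AddRange(extra);
        int countAfterAddRange = finalList.Count;

        return (countAfterConcat, concatenatedCount, countAfterAddRange);
    }
}
```

Because Concat is evaluated lazily, its result reflects the state of both inputs at the time it is enumerated, which is why the sketch counts it before calling AddRange.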
Consider the following code snippet:
List<Order> orderList; // This list is pre-populated
foreach (System.Web.UI.WebControls.ListItem item in OrdersChoiceList.Items) // OrdersChoiceList is of type System.Web.UI.WebControls.CheckBoxList
{
foreach (Order o in orderList)
{
if (item.id == o.id)
{
item.Selected = scopeComputer.SelectedBox;
break;
}
}
}
There are thousands of items in the list, hence these loops are time-consuming. How can we optimize this?
Also, how can we do the same thing with LINQ? I tried using a join operation, but I was not able to set the value of the "Selected" variable based on "SelectedBox". For now I have hardcoded the value in the select clause to "true". How can we pass and use the SelectedBox value in the select clause?
var v = (from c in ComputersChoiceList.Items.Cast<ListItem>()
join s in scopeComputers on c.Text equals s.CName
select c).Select(x=>x.Selected = true);
I think you need to eliminate the nested iteration. As you state, both lists have a large set of items. If they both have 5,000 items, then you're looking at 25,000,000 iterations in the worst case.
There's no need to continually re-iterate orderList for every single ListItem. Instead create an ID lookup so you have fast O(1) lookups for each ID. Not sure what work is involved hitting scopeComputer.SelectedBox, but that may as well be resolved once outside the loop as well.
bool selectedState = scopeComputer.SelectedBox;
HashSet<int> orderIDs = new HashSet<int>(orderList.Select(o => o.id));
foreach (System.Web.UI.WebControls.ListItem item in OrdersChoiceList.Items)
{
if (orderIDs.Contains(item.id))
item.Selected = selectedState;
}
Using a HashSet lookup, you're now really only iterating 5,000 times plus a super-fast lookup.
EDIT: From what I can tell, there's no id property on ListItem, but I'm assuming that the code you've posted is condensed for brevity, but largely representative of your overall process. I'll keep my code API/usage to match what you have there; I'm assuming it's translatable back to your specific implementation.
EDIT: Based on your edited question, I think you're doing yet another lookup/iteration on retrieving the scopeComputer reference. Similarly, you can make another lookup for this:
HashSet<int> orderIDs = new HashSet<int>(orderList.Select(o => o.id));
Dictionary<string, bool> scopeComputersSelectedState =
scopeComputers.ToDictionary(s => s.CName, s => s.Selected);
foreach (System.Web.UI.WebControls.ListItem item in OrdersChoiceList.Items)
{
if (orderIDs.Contains(item.id))
item.Selected = scopeComputersSelectedState[item.Text];
}
Again, I'm not sure of the exact types/usage you have. You could also condense this down into a single LINQ query, but I don't think you will see much of an improvement performance-wise. I'm also assuming that there is a matching ScopeComputer for every ListItem.Text entry; otherwise you'll get an exception when accessing scopeComputersSelectedState[item.Text]. If not, then it should be a trivial exercise for you to change it to perform a TryGetValue lookup instead.
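The TryGetValue variant might look roughly like the sketch below. ChoiceItem is a hypothetical stand-in for System.Web.UI.WebControls.ListItem so that the example compiles on its own, and SelectionSync and Apply are likewise illustrative names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ListItem, with only the members used here.
public class ChoiceItem
{
    public string Text { get; set; }
    public bool Selected { get; set; }
}

public static class SelectionSync
{
    // Sets Selected on each item that has an entry in the dictionary;
    // items without a matching key are left unchanged instead of throwing.
    public static void Apply(
        IEnumerable<ChoiceItem> items,
        IReadOnlyDictionary<string, bool> selectedByName)
    {
        foreach (ChoiceItem item in items)
        {
            if (selectedByName.TryGetValue(item.Text, out bool selected))
            {
                item.Selected = selected;
            }
        }
    }
}
```

A Dictionary<string, bool> built with ToDictionary, as in the answer above, can be passed directly as the second argument.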
I have a list containing the following structure.
class CompareDesignGroup
{
string FieldId;
string Caption;
}
The list contains items of the above structure.
Is it possible to retrieve an element of the list if FieldId is known?
You can use the Find method on the generic list class. The find method takes a predicate that lets you filter/search the list for a single item.
List<CompareDesignGroup> list = // ..;
CompareDesignGroup item = list.Find(c => c.FieldId == "SomeFieldId");
item will be null if there is no matching item in the list.
If you need to find more than one item you can use the FindAll method:
List<CompareDesignGroup> list = // ..;
List<CompareDesignGroup> result= list.FindAll(c => c.FieldId == "SomeFieldId");
You can use LINQ like this:
CompareDesignGroup result = yourList.FirstOrDefault(x => x.FieldId == yourKnownId);
If you use the FirstOrDefault method, the result will be null when the list doesn't contain a record with the known id. So check that result is not null before using it.
There are a plethora of methods to find an item inside a list.
LINQ provides extension methods that are useful for working with collections that do not provide their own search features (or when you do not have the collection itself but a generic interface like IEnumerable<T>). If you have a List<CompareDesignGroup> object and you'll work on that object, you can use the methods provided by that class (specialized methods are almost always faster than LINQ methods, because they know the collection's internal structure and do not have to rely on many abstraction layers).
In all examples I'll perform a culture invariant and case sensitive comparison for FieldId to a hypothetical id parameter. This may not be what you need and you may have to change according to your requirements.
Using List<T>
Given a list declared as:
List<CompareDesignGroup> list = new List<CompareDesignGroup>();
To find first element that matches the search criteria (it'll return null if no items have been found):
CompareDesignGroup item = list.Find(
x => String.Equals(x.FieldId, id, StringComparison.InvariantCulture));
To find all the elements that match the search criteria:
List<CompareDesignGroup> items = list.FindAll(
x => String.Equals(x.FieldId, id, StringComparison.InvariantCulture));
Using IEnumerable<T> (or IList<T>, for example)
Given a list declared as:
IEnumerable<CompareDesignGroup> list = ...
To find first element that matches the search criteria (null if no items have been found):
CompareDesignGroup item = list.FirstOrDefault(
x => String.Equals(x.FieldId, id, StringComparison.InvariantCulture));
To find the first element that matches the search criteria (or throw an exception if no items have been found):
CompareDesignGroup item = list.First(
x => String.Equals(x.FieldId, id, StringComparison.InvariantCulture));
To find all elements that match the search criteria:
IEnumerable<CompareDesignGroup> items = list.Where(
x => String.Equals(x.FieldId, id, StringComparison.InvariantCulture));
There are many LINQ extension methods. I suggest taking a look at them all to find the one that best suits your needs.
You can use Where and then FirstOrDefault. That is a LINQ expression.
var ls = new List<CompareDesignGroup>();
var result = ls.Where(a => a.FieldId=="123").FirstOrDefault();
Or SingleOrDefault to get the item you want. Like this:
var ls = new List<CompareDesignGroup>();
var result = ls.Where(a => a.FieldId=="123").SingleOrDefault();
Or even simpler:
var result = ls.SingleOrDefault(a => a.FieldId=="123");
var result2 = ls.FirstOrDefault(a => a.FieldId=="123");
Yes. Use LINQ or the built-in functionality of List.
List<CompareDesignGroup> listData = new List<CompareDesignGroup>(); // init the data
var result = listData.Where(x => String.Equals(x.FieldId, "FIELDID KNOWN VALUE")); // gets all data
var first = listData.FirstOrDefault(x => String.Equals(x.FieldId, "FIELDID KNOWN VALUE")); // gets first search result
I'm getting a table Tags from the db.
The table has columns ID and TagName.
I'm doing something like this to get a list of strings:
var tagList = Model.Tags.Select(x => x.TagName.ToLower()).ToArray();
then I'm comparing against another string array to get the strings that occur in both:
var intersectList = tagList.Intersect(anotherList);
I have my list, but now I also want the ID of each item remaining in the intersect list that corresponds to the tagList. (can just be an int array)
Can anyone help with a good way to do this?
Don't use Intersect; it only works for collections of the same type. You could do a simple join or another form of filtering. It would be easiest to throw the string list into a HashSet and filter for tags whose TagName is in that set. This way, you keep your tags unprojected, so they keep their ids and other properties.
var stringSet = anotherList.ToHashSet(StringComparer.OrdinalIgnoreCase);
var tagList = Model.Tags.Where(t => stringSet.Contains(t.TagName)).ToList();
And put them into a list. Don't throw them into an array unless you specifically need an array (for use in a method that expects an array).
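Put together, retrieving just the ids could look like the following in-memory sketch. The Tag class is a minimal assumption based on the ID and TagName columns from the question, and TagMatcher and MatchingIds are hypothetical names. Whether a database-backed query provider can translate the same filter is not something this sketch shows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the Tags table row (assumption).
public class Tag
{
    public int Id { get; set; }
    public string TagName { get; set; }
}

public static class TagMatcher
{
    // Returns the ids of all tags whose TagName appears in names,
    // ignoring case.
    public static List<int> MatchingIds(IEnumerable<Tag> tags, IEnumerable<string> names)
    {
        var nameSet = new HashSet<string>(names, StringComparer.OrdinalIgnoreCase);

        return tags
            .Where(t => nameSet.Contains(t.TagName))
            .Select(t => t.Id)
            .ToList();
    }
}
```

Because the set uses a case-insensitive comparer, the tag names no longer need to be lowercased before the comparison.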
Could you do:
var intersectIds = Model.Tags
.Where(tag => anotherList.Contains(tag.TagName))
.Select(tag => tag.Id)
.ToList();
Maybe use Dictionary<int, string> instead of Array?