Sort a C# list by word - c#

I want to sort a C# list by word. Assume I have a C# list (of objects) which contains the following words:
[{id:1, name: "ABC"},
{id:2, name: "XXX"},
{id:3, name: "Mille"},
{id:4, name: "YYY"},
{id:5, name: "Mill"},
{id:6, name: "Millen"},
{id:7, name: "OOO"},
{id:8, name: "GGGG"},
{id:9, name: null},
{id:10, name: "XXX"},
{id:11, name: "mil"}]
If the user passes Mil as a search key, I want to return all the words starting with the search key first, then all the words that do not match the criteria, and have them sorted alphabetically.
The easiest way I can think of is to run a for loop over the result set, put all the words starting with the search key into one list, and put the remaining words into another list. Then sort the second list and combine both lists to return the result.
I wonder if there is a smarter or inbuilt way to get the desired result.

Sure! You will sort by the presence of a match, then by the name, like this:
var results = objects.OrderByDescending(o => o.Name.StartsWith(searchKey))
.ThenBy(o => o.Name);
Note that false comes before true in a sort, so you'll need to use OrderByDescending.
As AlexD points out, the name can be null. You'll have to decide how you want to treat this. The easiest way would be to use o.Name?.StartsWith(searchKey) ?? false, but you'll have to decide based on your needs. Also, not all Linq scenarios support null propagation (Linq To Entities comes to mind).
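A minimal sketch of a null-safe, case-insensitive variant of the ordering above, for LINQ to Objects. The Item class, the SortByPrefix helper, and the decision to treat a null name as a non-match are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type standing in for the objects in the question.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class PrefixSort
{
    // Items whose Name starts with searchKey come first; each group is ordered by Name.
    // Assumption: a null Name counts as "no match".
    public static List<Item> SortByPrefix(IEnumerable<Item> items, string searchKey)
    {
        return items
            .OrderByDescending(o => o.Name != null &&
                o.Name.StartsWith(searchKey, StringComparison.OrdinalIgnoreCase))
            .ThenBy(o => o.Name, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }

    public static void Main()
    {
        var items = new List<Item>
        {
            new Item { Id = 1, Name = "ABC" },
            new Item { Id = 3, Name = "Mille" },
            new Item { Id = 5, Name = "Mill" },
            new Item { Id = 9, Name = null },
            new Item { Id = 11, Name = "mil" }
        };

        foreach (var item in SortByPrefix(items, "Mil"))
            Console.WriteLine($"{item.Id}: {item.Name ?? "(null)"}");
    }
}
```

With this comparer a null name sorts ahead of the other non-matching names; if you would rather drop nulls or push them to the end, filter or add another ordering key first.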

This should do it, but there's probably a faster way, maybe using GroupBy somehow.
var sorted = collection
.Where(x => x.Name.StartsWith(criteria))
.OrderBy(x => x.Name)
.Concat(collection
.Where(x => !x.Name.StartsWith(criteria))
.OrderBy(x => x.Name));

You can try GroupBy like this:
var sorted = collection
.GroupBy(item => item.Name.StartsWith(criteria))
.OrderByDescending(chunk => chunk.Key)
.SelectMany(chunk => chunk
.OrderBy(item => item.Name));
Separate items into two groups (meets and doesn't meet the criteria)
Order the groups as whole (1st that meets)
Order items within each group
Finally combine the items

There's nothing C#-specific to solve this, but it sounds like you're really looking for algorithm design guidance.
You should sort the list first. If this is a static list you should just keep it sorted all the time. If the list is large, you may consider using a different data structure (Binary Search Tree, Skip List, etc.) which is more optimized for this scenario.
Once it's sorted, finding matching elements becomes a simple binary search. Move the matching elements to the beginning of the result set, then return.
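The binary-search idea can be sketched roughly as follows. This assumes a list of non-null strings that is already sorted with ordinal (case-sensitive) comparison; the MatchesFirst helper and its name are hypothetical. Under ordinal ordering, all strings sharing a prefix sit next to each other, so one search finds where the block starts.

```csharp
using System;
using System.Collections.Generic;

public static class SortedPrefixSearch
{
    // Assumes sortedNames is sorted with StringComparer.Ordinal and contains no nulls.
    public static List<string> MatchesFirst(List<string> sortedNames, string prefix)
    {
        int start = sortedNames.BinarySearch(prefix, StringComparer.Ordinal);
        if (start < 0) start = ~start; // not found: the complement is the insertion point

        // BinarySearch may land on any of several equal entries; step back to the first one.
        while (start > 0 && string.Equals(sortedNames[start - 1], prefix, StringComparison.Ordinal))
            start--;

        // Walk forward over the contiguous block of entries that share the prefix.
        int end = start;
        while (end < sortedNames.Count &&
               sortedNames[end].StartsWith(prefix, StringComparison.Ordinal))
            end++;

        var result = new List<string>(sortedNames.Count);
        result.AddRange(sortedNames.GetRange(start, end - start));            // matches
        result.AddRange(sortedNames.GetRange(0, start));                      // entries before the block
        result.AddRange(sortedNames.GetRange(end, sortedNames.Count - end));  // entries after the block
        return result;
    }

    public static void Main()
    {
        var names = new List<string> { "ABC", "XXX", "Mille", "YYY", "Mill", "Millen", "OOO" };
        names.Sort(StringComparer.Ordinal);
        Console.WriteLine(string.Join(", ", MatchesFirst(names, "Mil")));
    }
}
```

Because the comparison is ordinal, "mil" would not count as a match for "Mil" here; a case-insensitive version needs the list sorted with a matching case-insensitive comparer.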

Add an indicator of a match into the select, and then sort on that:
void Main()
{
word[] Words = new word[11]
{new word {id=1, name= "ABC"},
new word {id=2, name= "XXX"},
new word {id=3, name= "Mille"},
new word {id=4, name= "YYY"},
new word {id=5, name= "Mill"},
new word {id=6, name= "Millen"},
new word {id=7, name= "OOO"},
new word {id=8, name= "GGGG"},
new word {id=9, name= null},
new word {id=10, name= "XXX"},
new word {id=11, name= "mil"}};
var target = "mil";
var comparison = StringComparison.InvariantCultureIgnoreCase;
var q = (from w in Words
where w.name != null
select new {
Match = w.name.StartsWith(target, comparison)?1:2,
name = w.name})
.OrderBy(w=>w.Match).ThenBy(w=>w.name);
q.Dump();
}
public struct word
{
public int id;
public string name;
}

It is probably not easier but you could create a class that implements IComparable Interface and have a property Mil that is used by CompareTo.
Then you could just call List.Sort(). And you can pass an IComparer to List.Sort.
It would probably be the most efficient and you can sort in place rather than producing a new List.
On average, this method is an O(n log n) operation, where n is Count;
in the worst case it is an O(n ^ 2) operation.
public int CompareTo(object obj)
{
if (obj == null) return 1;
Temperature otherTemperature = obj as Temperature;
if (otherTemperature != null)
{
// Mil holds the search key; with no key, fall back to plain name order
if (string.IsNullOrEmpty(Mil))
return this.Name.CompareTo(otherTemperature.Name);
bool thisMatches = this.Name.StartsWith(Mil);
bool otherMatches = otherTemperature.Name.StartsWith(Mil);
// both match or neither matches: order by name
if (thisMatches == otherMatches)
return this.Name.CompareTo(otherTemperature.Name);
// exactly one matches: the matching item sorts first
return thisMatches ? -1 : 1;
}
else
throw new ArgumentException("Object is not a Temperature");
}
You will need to add how you want a null Name to sort.
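As a sketch of the IComparer route mentioned above, the comparer below can be passed to List.Sort to sort in place. The Word and PrefixFirstComparer types are hypothetical, and treating a null name as a non-match that sorts ahead of other non-matches is an assumption you may want to change.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element type for illustration.
public class Word
{
    public int Id;
    public string Name;
}

// Prefix matches sort first; within each group, names are compared case-insensitively.
public class PrefixFirstComparer : IComparer<Word>
{
    private readonly string _prefix;

    public PrefixFirstComparer(string prefix)
    {
        _prefix = prefix ?? "";
    }

    // Assumption: a null item or null Name never matches.
    private bool Matches(Word w) =>
        w?.Name != null && w.Name.StartsWith(_prefix, StringComparison.OrdinalIgnoreCase);

    public int Compare(Word x, Word y)
    {
        bool xMatches = Matches(x), yMatches = Matches(y);
        if (xMatches != yMatches) return xMatches ? -1 : 1;
        // string.Compare accepts nulls and orders them before any non-null string.
        return string.Compare(x?.Name, y?.Name, StringComparison.OrdinalIgnoreCase);
    }
}

public static class Program
{
    public static void Main()
    {
        var words = new List<Word>
        {
            new Word { Id = 1, Name = "ABC" },
            new Word { Id = 3, Name = "Mille" },
            new Word { Id = 5, Name = "Mill" },
            new Word { Id = 9, Name = null },
            new Word { Id = 11, Name = "mil" }
        };

        words.Sort(new PrefixFirstComparer("Mil")); // sorts the existing list in place
        foreach (var w in words)
            Console.WriteLine(w.Name ?? "(null)");
    }
}
```

List.Sort is not a stable sort, so items with equal names may end up in either relative order.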

First create a list of the words that match, sorted.
Then add to that list all of the words that weren't added to the first list, also sorted.
public IEnumerable<Word> GetSortedByMatches(string keyword, Word[] words)
{
var result = new List<Word>(words.Where(word => word.Name.StartsWith(keyword))
.OrderBy(word => word.Name));
result.AddRange(words.Except(result).OrderBy(word => word.Name));
return result;
}
Some of the comments suggest that it should be case-insensitive. That would be
public IEnumerable<Word> GetSortedByMatches(string keyword, Word[] words)
{
var result = new List<Word>(
words.Where(word => word.Name.StartsWith(keyword, StringComparison.CurrentCultureIgnoreCase)) //<-- ignore case
.OrderBy(word => word.Name));
result.AddRange(words.Except(result).OrderBy(word => word.Name));
return result;
}

Related

Get the matching index of a value in a list

So I've got the following code:
string matchingName = "Bob";
List<string> names = GetAllNames();
if (names.Contains(matchingName))
// Get the index/position in the list of names where Bob exists
Is it possible to do this with a couple of lines of code, rather than iterating through the list to get the index or position?
If you have multiple matching instances and want to get all the indices you can use this:
var result = Enumerable.Range(0, names.Count).Where(i => names[i] == matchingName);
If it is just one index you want, then this will work:
int result = names.IndexOf(matchingName);
If there is no matching instance in names, the former solution will yield an empty enumeration, while the latter will give -1.
var index = names.IndexOf(matchingName);
if (index != -1)
{
// do something with index
}
If you want to look for a single match, then IndexOf will suit your purposes.
If you want to look for multiple matches, consider:
var names = new List<string> {"Bob", "Sally", "Hello", "Bob"};
var bobIndexes = names.Select((value, index) => new {value, index})
.Where(z => z.value == "Bob")
.Select(z => z.index);
Console.WriteLine(string.Join(",", bobIndexes)); // this outputs 0,3
The use of (value, index) within Select gives you access to both the element and its index.

Linq items in a list exist in another list

I have 2 lists
List 1
var hashTags = new List<HashTag>();
hashTags.Add(new HashTag
{
Name = "#HashTag1",
Index = 1
});
hashTags.Add(new HashTag
{
Name = "#HashTag2",
Index = 2
});
hashTags.Add(new HashTag
{
Name = "#HashTag3",
Index = 3
});
hashTags.Add(new HashTag
{
Name = "#HashTag4",
Index = 4
});
List 2
var hashTags2 = new List<HashTag>();
hashTags2.Add(new HashTag
{
Name = "#HashTag1",
Index = 1
});
hashTags2.Add(new HashTag
{
Name = "#HashTag3",
Index = 3
});
hashTags2.Add(new HashTag
{
Name = "#HashTag4",
Index = 4
});
How do I check if all the elements in hashTags2 exist in hashTags? The index can be ignored; only the name match matters. I can write a for loop to check each element, but I am looking for a LINQ solution.
Simple linq approach.
hashTags2.All(h=> hashTags.Any(h1 => h1.Name == h.Name))
As only equality of the names is to be taken into account, the problem can be solved by first mapping to the names and then checking containment as follows.
var hashTags2Names = hashTags2.Select( iItem => iItem.Name );
var hashTagsNames = hashTags.Select( iItem => iItem.Name );
var Result = !hashTags2Names.Except( hashTagsNames ).Any(); // true when no name in hashTags2 is missing from hashTags
So you want a boolean linq expression that returns true if the name of every element in hashTags2 exists in hashTags?
For this you want the function Enumerable.All, you want that every Hashtag in hashTags2 ...
bool result = hashTags2.All(hashTag => ...)
what do you want to check for every hashTag in hashTags2? That the name is a name in hashTags. So we need the names of hashTags:
IEnumerable<string> names = hashTags.Select(hashTag => hashTag.Name);
and to check if an item is in a sequence: Enumerable.Contains.
Put it all together:
IEnumerable<string> names = hashTags.Select(hashTag => hashTag.Name);
bool result = hashTags2.All(hashTag => names.Contains(hashTag.Name));
Or if you want one fairly unreadable expression:
bool result = hashTags2.All(hashTagX =>
hashTags.Select(hashTagY => hashTagY.Name)
.Contains(hashTagX.Name));
Because of delayed execution there is no difference between the first and the second method. The first one will be more readable.
With LINQ to Objects you will need at least one IEqualityComparer to
tell LINQ how to compare objects and to determine when they are equal.
A simple comparer would be the following that uses the Name property to determine equality of your HashTag.
public class NameEquality : IEqualityComparer<HashTag>
{
public bool Equals(HashTag tag, HashTag tag2)
{
return tag.Name == tag2.Name;
}
public int GetHashCode(HashTag tag)
{
return tag.Name.GetHashCode();
}
}
With this equality comparer you can use the LINQ method Except() to get all elements from your list hashTags that are not part of hashTags2.
hashTags.Except(hashTags2, new NameEquality())
I prefer the join operator, however it is just a matter of taste, I guess:
var hashMatched = hashTags.Join(hashTags2,_o => _o.Name, _i => _i.Name, (_o,_i) => _o);
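For larger lists, the name check can be sketched with a HashSet so that each lookup is a hash probe rather than a scan. The AllNamesExist helper is a hypothetical name, and the HashTag class is assumed to have the Name and Index properties used in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the HashTag class from the question.
public class HashTag
{
    public string Name { get; set; }
    public int Index { get; set; }
}

public static class HashTagCheck
{
    // Returns true when every name in candidates also appears in source (Index is ignored).
    public static bool AllNamesExist(IEnumerable<HashTag> candidates, IEnumerable<HashTag> source)
    {
        var sourceNames = new HashSet<string>(source.Select(h => h.Name));
        return candidates.All(h => sourceNames.Contains(h.Name));
    }

    public static void Main()
    {
        var hashTags = new List<HashTag>
        {
            new HashTag { Name = "#HashTag1", Index = 1 },
            new HashTag { Name = "#HashTag2", Index = 2 },
            new HashTag { Name = "#HashTag3", Index = 3 },
            new HashTag { Name = "#HashTag4", Index = 4 }
        };
        var hashTags2 = new List<HashTag>
        {
            new HashTag { Name = "#HashTag1", Index = 1 },
            new HashTag { Name = "#HashTag3", Index = 3 },
            new HashTag { Name = "#HashTag4", Index = 4 }
        };

        Console.WriteLine(AllNamesExist(hashTags2, hashTags)); // True
    }
}
```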

Sorting a list of objects based on another

public class Product
{
public string Code { get; private set; }
public Product(string code)
{
Code = code;
}
}
List<Product> sourceProductsOrder =
new List<Product>() { new Product("BBB"), new Product("QQQ"),
new Product("FFF"), new Product("HHH"),
new Product("PPP"), new Product("ZZZ")};
List<Product> products =
new List<Product>() { new Product("ZZZ"), new Product("BBB"),
new Product("HHH")};
I have two product lists and I want to reorder the second one with the same order as the first.
How can I reorder the products list so that the result would be : "BBB", "HHH", "ZZZ"?
EDIT: Changed Code property to public as @juharr mentioned
You would use IndexOf:
var sourceCodes = sourceProductsOrder.Select(s => s.Code).ToList();
products = products.OrderBy(p => sourceCodes.IndexOf(p.Code)).ToList();
The only catch is that if the second list has items that are not in the first list, IndexOf returns -1 for them, so they will go to the beginning of the second list.
You could try something like this
products.OrderBy(p => sourceProductsOrder.IndexOf(p))
if it is the same Product object. Otherwise, you could try something like:
products.OrderBy(p => GetIndex(sourceProductsOrder, p))
and write a small GetIndex helper method. Or create a Index() extension method for List<>, which would yield
products.OrderBy(p => sourceProductsOrder.Index(p))
The GetIndex method is rather simple so I omit it here.
(I have no PC to run the code so please excuse small errors)
Here is an efficient way to do this:
var lookup = sourceProductsOrder.Select((p, i) => new { p.Code, i })
.ToDictionary(x => x.Code, x => x.i);
products = products.OrderBy(p => lookup[p.Code]).ToList();
This should have a running time complexity of O(N log N), whereas an approach using IndexOf() would be O(N^2).
This assumes the following:
there are no duplicate product codes in sourceProductsOrder
sourceProductsOrder contains all of the product codes in products
you make the Code field/property non-private
If needed, you can create a safeguard against the first bullet by replacing the first statement with this:
var lookup = sourceProductsOrder.GroupBy(p => p.Code)
.Select((g, i) => new { g.Key, i })
.ToDictionary(x => x.Key, x => x.i);
You can account for the second bullet by replacing the second statement with this:
products = products.OrderBy(p =>
lookup.ContainsKey(p.Code) ? lookup[p.Code] : Int32.MaxValue).ToList();
And you can use both if you need to. These will slow down the algorithm a bit, but it should continue to have an O(N log N) running time even with these alterations.
I would implement a compare function that does a lookup of the order from sourceProductsOrder using a hash table. The lookup table would look like
(key) : (value)
"BBB" : 1
"QQQ" : 2
"FFF" : 3
"HHH" : 4
"PPP" : 5
"ZZZ" : 6
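Your compare could then look up the order of the two elements and compare the two values (a sketch, assuming lookupTable is a Dictionary<string, int> keyed by product code):
int compareFunction(Product a, Product b){
// negative when a comes first, positive when b comes first, zero when equal
return lookupTable[a.Code].CompareTo(lookupTable[b.Code]);
}
Building the hash table would be linear and doing the sort would generally be O(n log n).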
Easy come easy go:
IEnumerable<Product> result =
products.OrderBy(p => sourceProductsOrder.IndexOf(sourceProductsOrder.FirstOrDefault(p2 => p2.Code == p.Code)));
This will provide the desired result. Objects with ProductCodes not available in the source list will be placed at the beginning of the resultset. This will perform just fine for a couple of hundred of items I suppose.
If you have to deal with thousands of objects, then an answer like @Jon's will likely perform better. There you first create a kind of lookup value / score for each item and then use that for sorting / ordering.
The approach I described is O(n^2).

Linq intersect to filter multiple criteria against list

I'm trying to filter users by department. The filter may contain multiple departments, the users may belong to multiple departments (n:m). I'm fiddling around with LINQ, but can't find the solution. Following example code uses simplified Tuples just to make it runnable, of course there are some real user objects.
Also on CSharpPad, so you have some runnable code: http://csharppad.com/gist/34be3e2dd121ffc161c4
string Filter = "Dep1"; //can also contain multiple filters
var users = new List<Tuple<string, string>>
{
Tuple.Create("Meyer", "Dep1"),
Tuple.Create("Jackson", "Dep2"),
Tuple.Create("Green", "Dep1;Dep2"),
Tuple.Create("Brown", "Dep1")
};
//this is the line I can't get to work like I want to
var tuplets = users.Where(u => u.Item2.Intersect(Filter).Any());
if (tuplets.Distinct().ToList().Count > 0)
{
foreach (var item in tuplets) Console.WriteLine(item.ToString());
}
else
{
Console.WriteLine("No results");
}
Right now it returns:
(Meyer, Dep1)
(Jackson, Dep2)
(Green, Dep1;Dep2)
(Brown, Dep1)
What I want it to return is: Meyer, Green, Brown. If Filter were set to "Dep1;Dep2", I would want to do an or-comparison and find "Meyer, Jackson, Green, Brown" (distinct, as I don't want Green twice). If Filter were set to "Dep2", I would only want Jackson and Green. I also played around with .Split(';'), but it got me nowhere.
Am I making sense? I have Users with single/multiple departments and want filtering for those departments. In my output I want to have all users from the specified department(s). The LINQ-magic is not so strong on me.
Since string implements IEnumerable, what you're doing right now is an Intersect on a IEnumerable<char> (i.e. you're checking each letter in the string). You need to split on ; both on Item2 and Filter and intersect those.
var tuplets = users.Where(u =>
u.Item2.Split(new []{';'})
.Intersect(Filter.Split(new []{';'}))
.Any());
string[] Filter = {"Dep1","Dep2"}; //Easier if this is an enumerable
var users = new List<Tuple<string, string>>
{
Tuple.Create("Meyer", "Dep1"),
Tuple.Create("Jackson", "Dep2"),
Tuple.Create("Green", "Dep1;Dep2"),
Tuple.Create("Brown", "Dep1")
};
//I would use Any/Split/Contains
var tuplets = users.Where(u => Filter.Any(y=> u.Item2.Split(';').Contains(y)));
if (tuplets.Distinct().ToList().Count > 0)
{
foreach (var item in tuplets) Console.WriteLine(item.ToString());
}
else
{
Console.WriteLine("No results");
}
In addition to the other answers, the Contains extension method may also be a good fit for what you're trying to do if you're matching on a value:
var result = list.Where(x => filter.Contains(x.Value));
Otherwise, the Any method will accept a delegate:
var result = list.Where(x => filter.Any(y => y.Value == x.Value));
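As a rough sketch of the split-and-intersect idea from the answers above, the filter can be split once into a HashSet and each user's departments tested with Overlaps. The UsersInDepartments helper and its name are assumptions for illustration; both the filter and Item2 are assumed to be ';'-separated with no extra whitespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DepartmentFilter
{
    // Returns the names of users that belong to at least one department in the filter.
    public static List<string> UsersInDepartments(
        IEnumerable<Tuple<string, string>> users, string filter)
    {
        // Split the filter once instead of once per user.
        var wanted = new HashSet<string>(filter.Split(';'));

        return users
            .Where(u => wanted.Overlaps(u.Item2.Split(';'))) // true if any department is shared
            .Select(u => u.Item1)
            .ToList();
    }

    public static void Main()
    {
        var users = new List<Tuple<string, string>>
        {
            Tuple.Create("Meyer", "Dep1"),
            Tuple.Create("Jackson", "Dep2"),
            Tuple.Create("Green", "Dep1;Dep2"),
            Tuple.Create("Brown", "Dep1")
        };

        Console.WriteLine(string.Join(",", UsersInDepartments(users, "Dep1")));      // Meyer,Green,Brown
        Console.WriteLine(string.Join(",", UsersInDepartments(users, "Dep2")));      // Jackson,Green
        Console.WriteLine(string.Join(",", UsersInDepartments(users, "Dep1;Dep2"))); // Meyer,Jackson,Green,Brown
    }
}
```

Each user is tested once, so a user in several matching departments still appears only once in the result.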

Determine if string appears more than once in string array (C#)

I have an array of strings, e.g.
string [] letters = { "a", "a", "b", "c" };
I need to find a way to determine if any string in the array appears more than once.
I thought the best way is to make a new string-array without the string in question and to use Contains,
foreach (string letter in letters)
{
string [] otherLetters = //?
if (otherLetters.Contains(letter))
{
//etc.
}
}
but I cannot figure out how.
If anyone has a solution for this or a better approach, please answer.
The easiest way is to use GroupBy:
var lettersWithMultipleOccurences = letters.GroupBy(x => x)
.Where(g => g.Count() > 1)
.Select(g => g.Key);
This will first group your array using the letters as keys. It then returns only those groups with multiple entries and returns the key of these groups. As a result, you will have an IEnumerable<string> containing all letters that occur more than once in the original array. In your sample, this is only "a".
Beware: Because LINQ is implemented using deferred execution, enumerating lettersWithMultipleOccurences multiple times will perform the grouping and filtering multiple times. To avoid this, call ToList() on the result:
var lettersWithMultipleOccurences = letters.GroupBy(x => x)
.Where(g => g.Count() > 1)
.Select(g => g.Key)
.ToList();
lettersWithMultipleOccurences will now be of type List<string>.
You can use the LINQ extension methods:
if (letters.Distinct().Count() == letters.Count()) {
// no duplicates
}
Enumerable.Distinct removes duplicates. Thus, letters.Distinct() would return three elements in your example.
Create a HashSet from the array and compare their sizes:
var set = new HashSet<string>(letters);
bool hasDoubleLetters = set.Count != letters.Length; // the set is smaller when duplicates were dropped
A HashSet will give you good performance:
HashSet<string> hs = new HashSet<string>();
foreach (string letter in letters)
{
if (hs.Contains(letter))
{
// etc. (this letter occurs more than once)
}
else
{
hs.Add(letter);
}
}
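The loop above can be shortened a little, because HashSet<T>.Add returns false when the value is already in the set. A minimal sketch, with HasDuplicates as a hypothetical helper name:

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateCheck
{
    // Returns true as soon as a value is seen for the second time.
    public static bool HasDuplicates(IEnumerable<string> values)
    {
        var seen = new HashSet<string>();
        foreach (var value in values)
        {
            // Add returns false when the value was already present.
            if (!seen.Add(value))
                return true;
        }
        return false;
    }

    public static void Main()
    {
        string[] letters = { "a", "a", "b", "c" };
        Console.WriteLine(HasDuplicates(letters)); // True
    }
}
```

This stops at the first repeated value instead of scanning the whole array.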
