Query IEnumerable as IEnumerable<Type> - c#

I have a problem I need to solve efficiently.
I need the index of an element in an IEnumerable source. One way I could do this is with the following:
var items = source.Cast<ObjectType>().Where(obj => obj.Start == forDate);
This would give me an IEnumerable of all the items that match the predicate.
if(items != null && items.Any()){
// I now need the ordinal from the original list
return source.IndexOf(items[0]);
}
However, the list could be vast and the operation will be carried out many times. I believe this is inefficient and there must be a better way to do this.
I would be grateful if anyone can point me in the correct direction.

Sometimes, it's good to forget about Linq and go back to basics:
int index = 0;
foreach (ObjectType element in source)
{
if (element.Start == forDate)
{
return index;
}
index++;
}
// No element found
return -1;

Using Linq, you can take the index of each object before filtering them:
source
.Cast<ObjectType>()
.Select((obj, i) => new { Obj = obj, I = i })
.Where(x => x.Obj.Start == forDate)
.Select(x => x.I)
.FirstOrDefault();
However, this is not really efficient; the following does the same without allocations:
int i = 0;
foreach (ObjectType obj in source)
{
if (obj.Start == forDate)
{
return i;
}
i++;
}
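One caveat with the LINQ version above: FirstOrDefault() yields 0 when nothing matches, which cannot be told apart from a match at index 0. Below is a minimal sketch of a variant that returns -1 instead. The class and method names (IndexSketch, IndexOfFirst) are hypothetical, not part of any library.

```csharp
using System;
using System.Collections;
using System.Linq;

public static class IndexSketch
{
    // Hypothetical helper: index of the first element matching the predicate,
    // or -1 when no element matches.
    public static int IndexOfFirst<T>(IEnumerable source, Func<T, bool> predicate)
    {
        return source
            .Cast<T>()
            .Select((item, index) => new { item, index })
            .Where(x => predicate(x.item))
            .Select(x => x.index)
            .DefaultIfEmpty(-1)   // supplies -1 only when the filtered sequence is empty
            .First();
    }
}
```

It still allocates one anonymous object per element, so the plain foreach loop remains the cheaper option.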

Your second code sample was invalid: since items is an IEnumerable, you cannot call items[0]. You can use First(). Anyway:
var items = source.Cast<ObjectType>()
.Select((item, index) => new KeyValuePair<int, ObjectType>(index, item))
.Where(obj => obj.Value.Start == forDate);
and then:
if (items != null && items.Any()) {
return items.First().Key;
}

If you need to do this multiple times I would create a lookup for the indices.
ILookup<DateTime, int> lookup =
source
.Cast<ObjectType>()
.Select((e, i) => new { e, i })
.ToLookup(x => x.e.Start, x => x.i);
Now given a forDate you can do this:
IEnumerable<int> indices = lookup[forDate];
Since the lookup is basically a dictionary that returns multiple values per key, building it costs one pass over the source, and each query afterwards is a hash lookup rather than a scan. So repeating this for multiple values is very fast.
And since this returns IEnumerable<int> you know when there are duplicate values within the source list. If you only need the first one then just do a .First().

Related

Sublists of consecutive elements that fit a condition in a list c# linq

Suppose we have a parking area represented as a Dictionary<int, bool>:
Every parking lot has an id and a boolean (true = free, false = filled).
Like this:
Dictionary<int,bool> parking..
parking[0]= true // means that the first parking lot is free
My question: I want to get all the sublists of consecutive elements that match a condition (the parking lot is free).
First, I can easily get the elements that fit this condition:
parking.Where(X => X.Value).Select(x => x.Key).ToList();
But I don't know how to use LINQ operations to split that result into lists of consecutive ids.
Can I do this without a thousand foreach/while loops checking the items one by one? Is there an easier way with LINQ?
This method gets a list of consecutive free parking lots
data:
0-free,
1-free,
2-filled ,
3-free
The results will be two lists:
First One will contain => 0 ,1
Second One will contain=> 3
These are the list of consecutive of parking lots that are free.
public List<List<int>> ConsecutiveParkingLotFree(int numberOfConsecutive){}
You can always write your own helper function to do things like this. For example
public static IEnumerable<List<T>> GroupSequential<T>(
this IEnumerable<T> self,
Func<T, bool> condition)
{
var list = new List<T>();
using var enumerator = self.GetEnumerator();
if (enumerator.MoveNext())
{
var current = enumerator.Current;
var oldValue = condition(current);
if (oldValue)
{
list.Add(current);
}
while (enumerator.MoveNext())
{
current = enumerator.Current;
var newValue = condition(current);
if (newValue)
{
list.Add(current);
}
else if (oldValue)
{
yield return list;
list = new List<T>();
}
oldValue = newValue;
}
if (list.Count > 0)
{
yield return list;
}
}
}
This will put all the items with a true-value in a list. When a true->false transition is encountered the list is returned and recreated. I would expect that there are more compact ways to write functions like this, but it should do the job.
You can apply a GroupWhile solution here (GroupWhile is not part of standard LINQ; it has to be supplied as a custom extension method).
parking.Where(X => X.Value)
.Select(x => x.Key)
.GroupWhile((x, y) => y - x == 1)
.ToList()
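Since GroupWhile is not built in, here is a minimal sketch of what such an extension might look like. The name GroupWhileSketch and the parameter name belongsTogether are assumptions for illustration; the answer does not say which implementation it had in mind.

```csharp
using System;
using System.Collections.Generic;

public static class GroupWhileSketch
{
    // Hypothetical extension: keeps adding items to the current group while
    // belongsTogether(previous, current) holds, and starts a new group otherwise.
    public static IEnumerable<List<T>> GroupWhile<T>(
        this IEnumerable<T> source, Func<T, T, bool> belongsTogether)
    {
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext())
                yield break;            // empty input produces no groups

            var group = new List<T> { e.Current };
            var previous = e.Current;
            while (e.MoveNext())
            {
                if (!belongsTogether(previous, e.Current))
                {
                    yield return group; // close the finished run
                    group = new List<T>();
                }
                group.Add(e.Current);
                previous = e.Current;
            }
            yield return group;         // the last run is always non-empty
        }
    }
}
```

Dictionary<TKey, TValue> does not document an enumeration order, so it is probably safer to sort the free ids (for example with OrderBy(id => id)) before grouping them.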

Is it possible to always take 3 objects and if only 2 exists it returns 3 but one has null values in?

I need to return the 3 latest elements in a collection. If I use LINQ, e.g. .OrderByDescending(a => a.Year).Take(3), this is fine as long as the collection contains at least 3 elements. What I want is for it to always return 3, so for example if there are only 2 items then the last item would be a blank/initialised element (ideally where I could configure what was returned).
Is this possible?
You can concatenate the sequence with another (lazily created) sequence of 3 elements:
var result = query
.OrderByDescending(a => a.Year)
.Concat(Enumerable.Range(0, 3).Select(_ => new ResultElement()))
.Take(3);
Or perhaps:
var result = query
.OrderByDescending(a => a.Year)
.Concat(Enumerable.Repeat(new ResultElement(), 3))
.Take(3);
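(The latter will end up with duplicate references and will always create an empty element, so I'd probably recommend the former... but it depends on the context. You might want to use Enumerable.Repeat<ResultElement>(null, 3) and handle null elements instead; the explicit type argument is needed because a bare null gives the compiler nothing to infer the element type from.)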
You could write your own extension method:
public static IEnumerable<T> TakeAndCreate<T>(this IEnumerable<T> input, int amount, Func<T> defaultElement)
{
int counter = 0;
foreach(T element in input.Take(amount))
{
yield return element;
counter++;
}
for(int i = 0; i < amount - counter; i++)
{
yield return defaultElement.Invoke();
}
}
Usage is
var result = input.OrderByDescending(a => a.Year).TakeAndCreate(3, () => new ResultElement());
One advantage of this solution is that it creates new elements only if they are actually needed, which might be good for performance if you have a lot of elements to create or their creation is not trivial.
Online demo: https://dotnetfiddle.net/HHexGd

Is there best practice to obtain elements/variables from collection based on different conditions

Or in general how to filter some elements from collection based on different and complex conditions in single pass
Let's say we have collection of elements
var cats = new List<Cat>{ new Cat("Fluffy"), new Cat("Meowista"), new Cat("Scratchy")};
And somewhere we use this collection
public CatFightResult MarchBoxing(List<Cat> cats, string redCatName, string blueCatName)
{
var redCat = cats.First(cat => cat.Name == redCatName);
var blueCat = cats.First(cat => cat.Name == blueCatName);
var redValue = redCat.FightValue();
var blueValue = blueCat.FightValue();
if (Cat.FightValuesEqualWithEpsilon(redValue, blueValue))
return new CatFightResult { IsDraw = true };
return new CatFightResult { Winner = redValue > blueValue ? redCat : blueCat };
}
Question: Is there a nice way to obtain multiple variables from collection based on some condition(s)?
The question probably requires some sort of uniqueness in collection, let's first assume there is some (i.e. HashSet/Dictionary)
AND preferably:
SINGLE pass/cycle on collection (the most important reason of question, as you can see there are 2 filter operations in above method)
oneliner or like that, with readability, and the shorter the better
generic way (IEnumerable<T> I think, or ICollection<T>)
resistant to typos and safe under changes/additions (minimal use of the actual conditions in code, preferably checked)
null/exception handling, because I intend null to be a valid result for an obtained variable
It would also be cool to be able to provide custom conditions, which could probably be done via Func parameters, but I haven't tested that yet.
There are my attempts, which I've posted in my repo https://github.com/phomm/TreeBalancer/blob/master/TreeTraverse/Program.cs
Here is the adaptation to example with Cats:
public CatFightResult MarchBoxing(List<Cat> cats, string redCatName, string blueCatName)
{
Cat redCat = null;
Cat blueCat = null;
//1 kind of a one-liner, but hard to read and not error-proof
foreach (var c in cats) _ = c.Name == redCatName ? redCat = c : c.Name == blueCatName ? blueCat = c : null;
//2 a very good try, because it is error-proof and easy to read (mistakes in assignment are easy to spot), but it is not a one-liner and not elegant (though fast), fetches redundantly, and is not single pass at all: up to O(N*N) with FirstOrDefault
var filter = new [] { redCatName, blueCatName }.ToDictionary(x => x, x => cats.FirstOrDefault(n => n.Name == x));
redCat = filter[redCatName];
blueCat = filter[blueCatName];
//3 readable, and checks for mistakenly written search keys (the dictionary's internal duplicate key check), but not a one-liner and not actually single pass
var dic = new Dictionary<string, Func<Cat, Cat>> { { redCatName, n => redCat = n }, { blueCatName, n => blueCat = n } };
cats.All(n => dic.TryGetValue(n.Name, out var func) ? func(n) is null : true);
//4 best approach, BUT not generic (ofc one can write simple generic IEnumerable<T> ForEach extension method, and it would be strong candidate to win)
cats.ForEach(n => _ = n.Name == redCatName ? redCat = n : n.Name == blueCatName ? blueCat = n : null);
//5 nice approach, but not single pass, enumerating collection twice
cats.Zip(cats, (n, s) => n.Name == redCatName ? redCat = n : n.Name == blueCatName ? blueCat = n : null);
//6 the one I prefer best, however it's arguable due to breaking functional approach of Linq, causing side effects
cats.All(n => (n.Name == redCatName ? redCat = n : n.Name == blueCatName ? blueCat = n : null) is null);
}
All the options with ternary op are not extensible easily and relatively error-prone, but are quite short and Linq-ish, they also rely (some trade-off with confusion) on not returning/using actual results of ternary (with discard "_" or "is null" as bool). I think the approach with Dictionary of Funcs is a good candidate to implement custom conditions, just bake-in them with variables.
Thank you, looking forward to your solutions! :)
I'm not sure if it's possible with Linq out of the box but if writing a custom extension once is an option for you, retrieving some values from a collection with arbitrary number of conditions may later be put in pretty concise manner.
For example, you may write something like
var (redCat, blueCat) = cats.FindFirsts(
x => x.Name == redCatName,
x => x.Name == blueCatName);
If you introduce the FindFirsts() extension as follows:
public static class FindExtensions
{
public static T[] FindFirsts<T>(this IEnumerable<T> collection,
params Func<T, bool>[] conditions)
{
if (conditions.Length == 0)
return new T[] { };
var unmatchedConditions = conditions.Length;
var lookupWork = conditions
.Select(c => (
value: default(T),
found: false,
cond: c
))
.ToArray();
foreach (var item in collection)
{
for (var i = 0; i < lookupWork.Length; i++)
{
if (!lookupWork[i].found && lookupWork[i].cond(item))
{
lookupWork[i].found = true;
lookupWork[i].value = item;
unmatchedConditions--;
}
}
if (unmatchedConditions <= 0)
break;
}
return lookupWork.Select(x => x.value).ToArray();
}
}
The full demo can be found here: https://dotnetfiddle.net/QdVJUd
Note: In order to deconstruct the result array (i.e. use var (redCat, blueCat) = ...), you have to define a deconstruction extension. I borrowed some code from this thread to do so.
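For readers without access to the linked thread, a deconstruction extension for arrays could look roughly like the sketch below. This is an assumption about its shape, not the code the answer borrowed; the class name ArrayDeconstructSketch is hypothetical, and only the two-element case is covered.

```csharp
public static class ArrayDeconstructSketch
{
    // Hypothetical extension: lets `var (a, b) = someArray;` compile (C# 7+).
    // Missing positions fall back to default(T), which is null for reference types.
    public static void Deconstruct<T>(this T[] items, out T first, out T second)
    {
        first = items.Length > 0 ? items[0] : default(T);
        second = items.Length > 1 ? items[1] : default(T);
    }
}
```

One overload is needed per arity, so deconstructing three results would require a second method with three out parameters.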

customize OrderBy for a List?

I have a list of items and I want to create two ways to sort them, Alphabetically and Last Modified.
Here's what I did:
// Alphabetically
tableItems = tableItems.OrderBy (MyTableItem => MyTableItem.ItemName).ToList();
reloadTable(tableItems);
// Last Modified
tableItems = tableItems.OrderBy (MyTableItem => MyTableItem.Timestamp).ToList();
reloadTable(tableItems);
and this works perfectly fine.
My problem is that I want this to happen to all items in the list except for one.
This one item will always be constant and I want to make sure it's ALWAYS on the top of the list.
What would I need to do for that?
if it matters, c# is the lang.
Thank you for your time.
tableItems = tableItems.OrderBy(i => i.ItemName != "yourexceptitem").ThenBy(i => i.Timestamp).ToList();
EDIT:
If you want to sort by item name while keeping one item on top, do it like this:
tableItems = tableItems.OrderBy(i => i.ItemName != "TestSubject3").ThenBy(i => i.ItemName).ToList();
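The trick relies on bool ordering: false sorts before true, so the one item for which the comparison is false lands first. Here is a self-contained sketch on plain strings; PinnedSortSketch and PinnedThenAlphabetical are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PinnedSortSketch
{
    // Hypothetical helper: puts `pinned` first, then sorts the rest alphabetically.
    public static List<string> PinnedThenAlphabetical(IEnumerable<string> names, string pinned)
    {
        return names
            .OrderBy(n => n != pinned)               // false (the pinned name) sorts before true
            .ThenBy(n => n, StringComparer.Ordinal)  // remaining names in ordinal order
            .ToList();
    }
}
```

OrderBy is a stable sort, so when ThenBy is omitted the remaining items simply keep their original relative order.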
Other, generic solution:
public static IEnumerable<T> OrderByExcept<T>(
this IEnumerable<T> source,
Predicate<T> exceptPredicate,
Func<IEnumerable<T>, IOrderedEnumerable<T>> projection)
{
var rest = new List<T>();
using (var enumerator = source.GetEnumerator())
{
while (enumerator.MoveNext())
{
if (exceptPredicate(enumerator.Current))
{
yield return enumerator.Current;
}
else
{
rest.Add(enumerator.Current);
}
}
}
foreach (var elem in projection(rest))
{
yield return elem;
}
}
Usage:
tableItems = tableItems.OrderByExcept(
item => item.ItemName == "TestSubject3",
items => items.OrderBy(MyTableItem => MyTableItem.ItemName)
.ThenBy(MyTableItem => MyTableItem.Timestamp))
.ToList();
Items that meet the predicate will always be at the top of the list; the projection is applied to the rest of the elements.

Checking a list with null values for duplicates in C#

In C#, I can use something like:
List<string> myList = new List<string>();
if (myList.Count != myList.Distinct().Count())
{
// there are duplicates
}
to check for duplicate elements in a list. However, when there are null items in the list this produces a false positive. I can do this using some sluggish code, but is there a concise way to check for duplicates in a list while disregarding null values?
If you're worried about performance, the following code will stop as soon as it finds the first duplicate item - all the other solutions so far require the whole input to be iterated at least once.
var hashset = new HashSet<string>();
if (myList.Where(s => s != null).Any(s => !hashset.Add(s)))
{
// there are duplicates
}
hashset.Add returns false if the item already exists in the set, and Any returns true as soon as the first true value occurs, so this will only search the input as far as the first duplicate.
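The same idea can be wrapped in a reusable method. This is a minimal sketch; the names DuplicateCheckSketch and HasDuplicatesIgnoringNulls are hypothetical, and the class constraint is an assumption made so that the null check is meaningful.

```csharp
using System.Collections.Generic;

public static class DuplicateCheckSketch
{
    // Hypothetical helper: true as soon as a non-null item is seen a second time.
    public static bool HasDuplicatesIgnoringNulls<T>(IEnumerable<T> source) where T : class
    {
        var seen = new HashSet<T>();
        foreach (var item in source)
        {
            // HashSet<T>.Add returns false when the item is already in the set.
            if (item != null && !seen.Add(item))
                return true;
        }
        return false;
    }
}
```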
I'd do this differently:
Given Linq statements will be evaluated lazily, the .Any will short-circuit - meaning you don't have to iterate & count the entire list, if there are duplicates - and as such, should be more efficient.
var dupes = myList
.Where(item => item != null)
.GroupBy(item => item)
.Any(g => g.Count() > 1);
if(dupes)
{
//there are duplicates
}
EDIT: http://pastebin.com/b9reVaJu Some Linqpad benchmarking that seems to conclude GroupBy with Count() is faster
EDIT 2: Rawling's answer below seems at least 5x faster than this approach!
var nonNulls = myList.Where(x => x != null);
if (nonNulls.Count() != nonNulls.Distinct().Count())
{
// there are duplicates
}
Well, two nulls are duplicates, aren't they?
Anyway, compare the list without nulls:
var denullified = myList.Where(l => l != null);
if(denullified.Count() != denullified.Distinct().Count()) ...
EDIT my first attempt sucks because it is not deferred.
instead,
var duplicates = myList
.Where(item => item != null)
.GroupBy(item => item)
.Any(g => g.Skip(1).Any());
poorer implementation deleted.
