How to optimize a LINQ query with a minimum and an additional condition - c#

Assume we have a list of objects (to keep it clear, no properties etc. are used):
public class SomeObject{
public bool IsValid;
public int Height;
}
List<SomeObject> objects = new List<SomeObject>();
Now I want only the object from the list that is both valid and has the lowest height.
Classically I would have used something like:
SomeObject temp = null;
foreach(SomeObject so in objects)
{
if(so.IsValid)
{
if (null == temp)
temp = so;
else if (temp.Height > so.Height)
temp = so;
}
}
return temp;
I was thinking that it can be done more clearly with LINQ.
The first approach which came to my mind was:
List<SomeObject> sos = objects.Where(obj => obj.IsValid).ToList();
if(sos.Count>0)
{
return sos.OrderBy(obj => obj.Height).FirstOrDefault();
}
return null;
But then I was thinking: in the foreach approach I go through the list once. With LINQ I would go through the list once for filtering and once for ordering, even though I do not need to order the list completely.
Would something like
return objects.OrderBy(obj => obj.Height).FirstOrDefault(o => o.IsValid);
also go twice through the list?
Can this be optimized somehow, so that the LINQ query also only needs to run once through the list?

You can use GroupBy:
IEnumerable<SomeObject> validLowestHeights = objects
.Where(o => o.IsValid)
.GroupBy(o => o.Height)
.OrderBy(g => g.Key)
.First();
This group contains all valid objects with the lowest height. Note that First() throws if there is no valid object at all.

The most efficient way to do this with Linq is as follows:
var result = objects.Aggregate(
default(SomeObject),
(acc, current) =>
!current.IsValid ? acc :
acc == null ? current :
current.Height < acc.Height ? current :
acc);
This will loop over the collection only once.
However, you said "I was thinking that it can be done more clearly with LinQ." Whether this is more clear or not, I leave that up to you to decide.

You can try this one:
return (from o in objects where o.IsValid orderby o.Height select o).FirstOrDefault();
or
return objects.Where(o => o.IsValid).OrderBy(o => o.Height).FirstOrDefault();

Would something like
return objects.OrderBy(obj => obj.Height).FirstOrDefault(o => o.IsValid);
also go twice through the list?
OrderBy has to read and sort the whole list before it can return its first element. FirstOrDefault then walks the sorted result and stops as soon as a valid element is found, so that second walk covers the whole list only in the worst case, where the first valid object is the last in order of obj.Height (or there is none to be found).
Can this be somehow optimized, so that the LINQ query also only needs to run once through the list?
I'm afraid you'd have to make your own extension method. Considering what I've written above though, I'd consider it pretty optimized as it is.
**UPDATE**
Actually, the following would be a bit faster, as we'd avoid sorting invalid items:
return objects.Where(o => o.IsValid).OrderBy(o => o.Height).FirstOrDefault();
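The "own extension method" mentioned above could look roughly like the following. This is only a minimal sketch: the name MinByOrDefault and its signature are made up for illustration, and SomeObject is repeated from the question so the snippet stands on its own. On .NET 6 or later, objects.Where(o => o.IsValid).MinBy(o => o.Height) should give a comparable single pass without any custom code.

```csharp
using System;
using System.Collections.Generic;

public class SomeObject
{
    public bool IsValid;
    public int Height;
}

public static class MinExtensions
{
    // Hypothetical helper (not part of LINQ): returns the element that passes
    // `predicate` and has the smallest key, or null when nothing passes.
    // The source sequence is enumerated exactly once and never sorted.
    public static T MinByOrDefault<T, TKey>(
        this IEnumerable<T> source,
        Func<T, bool> predicate,
        Func<T, TKey> keySelector) where T : class
    {
        var comparer = Comparer<TKey>.Default;
        T best = null;
        TKey bestKey = default(TKey);

        foreach (var item in source)
        {
            if (!predicate(item))
                continue; // skip invalid items without comparing them

            var key = keySelector(item);
            if (best == null || comparer.Compare(key, bestKey) < 0)
            {
                best = item;
                bestKey = key;
            }
        }
        return best;
    }
}
```

With this in place, the call from the question would read objects.MinByOrDefault(o => o.IsValid, o => o.Height). When several valid objects share the lowest height, the first one encountered is returned.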

Related

Set several class values using LINQ expression

I have the following two LINQ statements which set different values in the same item in a list
List<MyClass> myList = GetList();
myList.Where(x => x.Name == "someName").Select(x => x.MyArray = someList.ToArray()).ToList();
myList.Where(x => x.Name == "someName").Select( x => x.AnotherValue = GetValue()).ToList();
Is it possible to combine this so both are set in the one expression?
myList
.Where(x => x.Name == "someName")
.ToList()
.ForEach(x => {
x.MyArray = someList.ToArray();
x.AnotherValue = GetValue();
});
Why are you calling ToList() at the end of each of those expressions and discarding the result?
Also, Jon Skeet is right that this is an abuse of LINQ, and especially so in your original form: It's explicit that LINQ expressions aren't even necessarily expected to be fully enumerated. The fact that you needed those ToList() calls to make anything happen should have given you a grave and queasy sense that you were misusing a language feature. When you have to do something weird to use your chosen construct instead of the usual way of doing it, finish getting it to work (because weird is cool), and then go back and redo it the boring, lame way before you check it in.
What advantage do you see in the LINQ + ForEach() version above, compared to this version?
foreach (var x in myList.Where(x => x.Name == "someName"))
{
x.MyArray = someList.ToArray();
x.AnotherValue = GetValue();
}
The old-style loop version is shorter, instantly understandable because it's the default idiom, and IMO cleaner. You don't have to do everything with LINQ.
N.B., ForEach() isn't LINQ; it's a member of List<T>. That's why you have to call ToList() to use it.
Just use a statement lambda, which lets you put several statements inside a {...} block:
myList.Where(x => x.Name == "someName").Select(x => { x.MyArray = someList.ToArray(); x.AnotherValue = GetValue(); return x;}).ToList();

How do I use Linq with a HashSet of Integers to pull multiple items from a list of Objects?

I have a HashSet of ID numbers, stored as integers:
HashSet<int> IDList; // Assume that this is created with a new statement in the constructor.
I have a SortedList of objects, indexed by the integers found in the HashSet:
SortedList<int,myClass> masterListOfMyClass;
I want to use the HashSet to create a List as a subset of masterListOfMyClass.
After wasting all day trying to figure out the Linq query, I eventually gave up and wrote the following, which works:
public List<myClass> SubSet {
get {
List<myClass> xList = new List<myClass>();
foreach (int x in IDList) {
if (masterListOfMyClass.ContainsKey(x)) {
xList.Add(masterListOfMyClass[x]);
}
}
return xList;
}
private set { }
}
So, I have two questions here:
What is the appropriate Linq query? I'm finding Linq extremely frustrating to try to figure out. Just when I think I've got it, it turns around and "goes on strike".
Is a Linq query any better -- or worse -- than what I have written here?
var xList = IDList
.Where(masterListOfMyClass.ContainsKey)
.Select(x => masterListOfMyClass[x])
.ToList();
If your lists both have equally large numbers of items, you may wish to consider inverting the query (i.e. iterate through masterListOfMyClass and query IDList) since a HashSet is faster for random queries.
Edit:
It's less neat, but you could save a lookup into masterListOfMyClass with the following query, which would be a bit faster:
var xList = IDList
.Select(x => { myClass y; masterListOfMyClass.TryGetValue(x, out y); return y; })
.Where(x => x != null)
.ToList();
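The single-lookup idea can also be written as an iterator method, which keeps TryGetValue out of a lambda. The sketch below is an assumption-laden illustration: MyClass, SubsetLookup and Subset are placeholder names, not anything from the question. Unlike the query above, it also keeps entries whose stored value happens to be null.

```csharp
using System.Collections.Generic;

// Placeholder for the question's myClass type.
public class MyClass
{
    public string Name;
}

public static class SubsetLookup
{
    // Lazily yields the values whose keys appear in `ids`,
    // doing one TryGetValue lookup per id instead of ContainsKey + indexer.
    public static IEnumerable<MyClass> Subset(
        IEnumerable<int> ids, IDictionary<int, MyClass> master)
    {
        foreach (int id in ids)
        {
            MyClass value;
            if (master.TryGetValue(id, out value))
                yield return value;
        }
    }
}
```

SortedList<int, MyClass> implements IDictionary<int, MyClass>, so it can be passed as master directly. Nothing is looked up until the result is enumerated, so the caller decides whether to call ToList() on it.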
foreach (int x in IDList.Where(x => masterListOfMyClass.ContainsKey(x)))
{
xList.Add(masterListOfMyClass[x]);
}
This is the appropriate LINQ query for your loop.
In my opinion, though, the LINQ query will not be more effective here.
Here is the Linq expression:
List<myClass> xList = masterListOfMyClass
.Where(x => IDList.Contains(x.Key))
.Select(x => x.Value).ToList();
There is no big difference in performance in such a small example. LINQ is generally somewhat slower, since it also iterates under the hood. What you get with LINQ is, imho, clearer code and execution that is deferred until it is needed (though not in my example, since I call .ToList()).
Another option would be (which is intentionally the same as Sankarann's first answer)
return (
from x in IDList
where masterListOfMyClass.ContainsKey(x)
select masterListOfMyClass[x]
).ToList();
However, are you sure you want a List to be returned? Usually, when working with IEnumerable<> you should chain your calls using IEnumerable<> until the point where you actually need the data. There you can decide to e.g. loop once (use the iterator) or actually pull the data in some sort of cache using the ToList(), ToArray() etc. methods.
Also, exposing a List<> to the public implies that modifying this list has an impact on the calling class. I would leave it to the user of the property to decide to make a local copy or continue using the IEnumerable<>.
Second, as your private setter is empty, setting the 'SubSet' has no impact on the functionality. This again is confusing and I would avoid it.
An alternative (and maybe less confusing) declaration of your property might look like this:
public IEnumerable<myClass> SubSet {
get {
return from x in IDList
where masterListOfMyClass.ContainsKey(x)
select masterListOfMyClass[x];
}
}

Fetch Multiple items from linq

I have the legacy code below in my application and would like to optimize it. arrayOfAttrValue has unique attributes. Can I use LINQ to achieve the loop optimization? If so, can you please show me how?
foreach (AttrValue attr in arrayOfAttrValue)
{
switch(attr.Attribute)
{
case Constants.Gender:
mymodel.Gender = attr.Value;
break;
case Constants.Identifier:
mymodel.AppIdentifier = attr.Value;
break;
}
}
My intention is not necessarily to use LINQ only. Any other way to minimize the loop would also help.
Thanks.
No, you can't do it in "true" LINQ because LINQ is about producing new objects from old objects. Here mymodel is a preexisting object that you want to modify.
You could use Array.ForEach or List.ForEach, but they aren't "true" LINQ, and the resulting code would be equivalent (a little slower because there would be a delegate).
Still, the downvoter probably wanted some LINQ, so I'll give some LINQ:
arrayOfAttrValue.All(attr => {
mymodel.Gender = attr.Attribute == Constants.Gender ? attr.Value : mymodel.Gender;
mymodel.AppIdentifier = attr.Attribute == Constants.Identifier ? attr.Value : mymodel.AppIdentifier;
return true;
});
One less line, ignoring the {} lines.
You have a list of attributes, which represent key-value pairs. The natural way to keep such data is a dictionary. So, convert your input data to a dictionary:
var attributes = arrayOfAttrValue.ToDictionary(a => a.Attribute, a => a.Value);
Or, if the attributes are not unique in your array, creating the dictionary is a bit more involved, but that's the data format you have. To make working with your data easier, you should convert it to a handy format (here the last value for each attribute wins):
var attributes = arrayOfAttrValue
.GroupBy(a => a.Attribute)
.ToDictionary(g => g.Key, g => g.Select(a => a.Value).Last());
After creating the attributes dictionary, you can simply check whether you have a value for an attribute and assign that value to the model property. Retrieving attributes is now simple and clear for any reader:
if (attributes.ContainsKey(Constants.Gender))
mymodel.Gender = attributes[Constants.Gender];
if (attributes.ContainsKey(Constants.Identifier))
mymodel.AppIdentifier = attributes[Constants.Identifier];
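The ContainsKey check followed by the indexer looks each key up twice; TryGetValue does both in one step. Below is a minimal self-contained sketch of the dictionary approach using it. The AttrValue and MyModel types and the two constants are assumptions standing in for the question's types, which were not shown.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shapes for the question's types.
public class AttrValue
{
    public string Attribute;
    public string Value;
}

public class MyModel
{
    public string Gender;
    public string AppIdentifier;
}

public static class AttributeMapper
{
    // Assumed stand-ins for Constants.Gender and Constants.Identifier.
    public const string Gender = "Gender";
    public const string Identifier = "Identifier";

    public static void Apply(IEnumerable<AttrValue> attrs, MyModel model)
    {
        // Last value per attribute wins, like the original loop.
        var attributes = attrs
            .GroupBy(a => a.Attribute)
            .ToDictionary(g => g.Key, g => g.Last().Value);

        // TryGetValue leaves the model untouched when the key is missing.
        string value;
        if (attributes.TryGetValue(Gender, out value))
            model.Gender = value;
        if (attributes.TryGetValue(Identifier, out value))
            model.AppIdentifier = value;
    }
}
```

Properties that have no matching attribute keep whatever value they had before the call, which matches the behaviour of the switch in the question.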
No need to do the loop manually in code, you can do it simply with .LastOrDefault():
mymodel.Gender = arrayOfAttrValue
.Where(attr => attr.Attribute == Constants.Gender)
.Select(attr => attr.Value).LastOrDefault() ?? mymodel.Gender;
mymodel.AppIdentifier = arrayOfAttrValue
.Where(attr => attr.Attribute == Constants.Identifier)
.Select(attr => attr.Value).LastOrDefault() ?? mymodel.AppIdentifier;
The ?? mymodel.Gender makes sure we're not setting it to default(T) (i.e. null) in a situation where it was previously set to a value. This then matches the functional logic of the loop in your question.
Doing it this way makes it very clear what you are trying to do. Of course, this approach means that you're looping over the array twice; however, if your array is actually an array, this performance cost will be very small.
If you still have performance issues with this, you probably want to consider a better data structure than arrayOfAttrValue (something that is accessible by Attribute, such as a Dictionary<,>).

how to use linq to retrieve values from a 2 dimensional generic list

I have a generic List List[int, myClass], and I would like to find the smallest int value, and retrieve the items from the list that match this.
I am generating this from another LINQ statement
var traysWithExtraAisles = (from t in poolTrays
where t.TrayItems.Select(i=>i.Aisle)
.Any(a=> ! selectedAisles.Contains(a))
select new
{
count= t.TrayItems.Select(i=>i.Aisle)
.Count(a=> !selectedAisles.Contains(a)),
tray=t
}).ToList();
this gives me my anonymous List of [count, Tray], but now I want to figure out the smallest count, and return a sublist for all the counts that match this.
Can anyone help me out with this?
var smallestGroup = traysWithExtraAisles
.GroupBy(x => x.count)
.OrderBy(g => g.Key)
.First();
foreach(var x in smallestGroup)
{
var poolTray = x.tray;
}
You can use SelectMany to "flatten" your list. Meaning, combine all of the lists into one, then take the Min. So;
int minimum = poolTrays.SelectMany(x => x).Min(x => x.TheIntegerIWantMinOf);
Will give you the smallest value contained in the sub lists. I'm not entirely sure this is what you're asking for but if your goal is simply to find the smallest element in the collection then I would scrap the code you posted and use this instead.
Right, I now realise this is actually incredibly easy to do with a bit more fiddling around. I have gone with
int minCount = traysWithExtraAisles.Min(x=>x.count);
var minAislesList = (from t in traysWithExtraAisles
where t.count==minCount
select t).ToList();
I imagine it is probably possible to do this in one statement.
You can use GroupBy as answered by Tim, or OrderBy as follows:
var sorted = traysWithExtraAisles.OrderBy(x=>x.count).ToList();
var result = sorted.TakeWhile(x => x.count == sorted[0].count).ToList();
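If sorting or grouping the whole list feels like too much work just to get the entries with the smallest count, a plain loop can collect them in one pass. The helper below is a hypothetical sketch (the names MinGroup and AllWithMin are invented here); it works with the anonymous type from the question through type inference.

```csharp
using System;
using System.Collections.Generic;

public static class MinGroup
{
    // Returns every element whose key equals the smallest key,
    // in source order, enumerating the source exactly once.
    public static List<T> AllWithMin<T>(IEnumerable<T> source, Func<T, int> key)
    {
        var result = new List<T>();
        int min = int.MaxValue;

        foreach (var item in source)
        {
            int k = key(item);
            if (k < min)
            {
                // New minimum found: discard everything collected so far.
                min = k;
                result.Clear();
                result.Add(item);
            }
            else if (k == min)
            {
                result.Add(item);
            }
        }
        return result;
    }
}
```

For the list in the question the call would be MinGroup.AllWithMin(traysWithExtraAisles, x => x.count). An empty input simply yields an empty list.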

Can I use linq to achieve the same thing this foreach loop does?

Here's the c# code that I have:
private double get806Fees (Loan loan)
{
Loan.Fee.Items class806;
foreach (Loan.Fee.Item currentFee in loan.Item.Fees)
{
if (currentFee.Classification == 806) class806.Add(currentFee);
}
// then down here I will return the sum of all items in class806
}
Can I do this using LINQ? If so, how? I have never used LINQ, and I've read in several places that using LINQ instead of a foreach loop is faster. Is this true?
Similar to some existing answers, but doing the projection in the query, to make the Sum call a lot simpler:
var sum = (from fee in loan.Item.Fees
where fee.Classification == 806
select fee.SomeValueToSum).Sum();
loan.Item.Fees.
Where(x => x.Classification == 806).
Sum(x => x.SomeValueProperty)
Whether it is faster or not is debatable. IMO, both complexities are the same, the non-LINQ version may be faster.
var q =
from currentFee in loan.Item.Fees
where currentFee.Classification == 806
select currentFee;
var sum = q.Sum(currentFee => currentFee.Fee);
private double get806Fees(Loan loan)
{
return loan.Item.Fees.
Where(f => f.Classification == 806).
Sum(f => f.ValueToCalculateSum);
}
I'm assuming here that ValueToCalculateSum is also a double. If it's not then you have to convert it before it is returned.
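As an illustration of that conversion, here is a small self-contained sketch in which the summed value is a decimal and is converted once, after summing. The FeeItem type and its Amount field are assumptions, since the question does not show the real Loan classes.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed stand-in for the question's Loan.Fee.Item type.
public class FeeItem
{
    public int Classification;
    public decimal Amount;
}

public static class FeeCalculator
{
    // Sums the amounts of all fees classified as 806.
    // The sum is done in decimal and converted to double once at the end.
    public static double Sum806(IEnumerable<FeeItem> fees)
    {
        return (double)fees
            .Where(f => f.Classification == 806)
            .Sum(f => f.Amount);
    }
}
```

Sum returns 0 for an empty sequence, so a loan without any 806 fees gives 0.0 rather than an exception.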
All of the answers so far are assuming that you're summing up loan.Fees. But the code you actually posted calls Items.Add() to add each Item in loan.Fees.Items to an Items object, and it's that Items object (and not loan.Fees, which is also an Items object) that you say you want to sum up.
Now, if Items is just a simple collection class, then there's no need to do anything other than what people are suggesting here. But if there's some side-effect of the Add method that we don't know about (or, worse, that you don't know about), simply summing up a filtered list of Item objects might not give you the results you're looking for.
You could still use Linq:
foreach (Loan.Fee.Item currentFee in loan.Item.Fees.Where(x => x.Classification == 806))
{
class806.Add(currentFee);
}
return class806.Sum(x => x.Fee);
I'll confess that I'm a little perplexed by the class hierarchy implied here, though, in which the Loan.Item.Fees property is a collection of Loan.Fee.Item objects. I don't know if what I'm seeing is a namespace hierarchy that conflicts with a class hierarchy, or if you're using nested classes, or what. I know I don't like it.
