LINQ Lambda efficiency of code groupby orderby - c#

I have this code, but I think it could run faster, or at least I hope so. I have a lot of data, so I'd like it to be as efficient as possible.
Here is the code:
(It needs to return the newest translations of words (Language and Value) from resources, grouped by resource and language, based on Expression<Func<ResourcesTranslation, bool>> ConditionExpression.)
KeyValues = item.Resources
.Where(ConditionExpression)
.GroupBy(g => new { g.ResourceId, g.Language })
.Select(m => m.OrderByDescending(o => o.Changed ?? o.Created))
.Select( s => new KeyValues
{
Language = s.FirstOrDefault().Language,
KeyValue = s.FirstOrDefault().Value
}).ToList();

As you need only one element after grouping, you can return it right in the GroupBy call, which simplifies your code:
KeyValues = item.Resources
.Where(ConditionExpression)
.GroupBy(g => new { g.ResourceId, g.Language },
(x, y) => new { Max = y.OrderByDescending(o => o.Changed ?? o.Created).First() })
.Select(s => new KeyValues
{
Language = s.Max.Language,
KeyValue = s.Max.Value
})
.ToList();
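If the filtered data is already in memory (LINQ to Objects), a possible variation on the query above is to pick the newest entry of each group in a single pass with Aggregate instead of sorting the whole group. This is only a minimal sketch under that assumption: the sample rows and the tuple shape are hypothetical stand-ins for ResourcesTranslation, and a database provider such as Entity Framework may not be able to translate Aggregate to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample rows standing in for ResourcesTranslation.
var rows = new List<(int ResourceId, string Language, string Value, DateTime Created, DateTime? Changed)>
{
    (1, "en", "Hello (old)", new DateTime(2020, 1, 1), (DateTime?)null),
    (1, "en", "Hello",       new DateTime(2020, 1, 1), (DateTime?)new DateTime(2021, 6, 1)),
    (1, "de", "Hallo",       new DateTime(2020, 3, 1), (DateTime?)null),
    (2, "en", "Bye",         new DateTime(2019, 5, 1), (DateTime?)null),
};

// For each (ResourceId, Language) group, keep the row with the greatest
// timestamp (Changed if present, otherwise Created) without sorting the group.
var newest = rows
    .GroupBy(r => (r.ResourceId, r.Language),
             (key, group) => group.Aggregate((best, next) =>
                 (next.Changed ?? next.Created) > (best.Changed ?? best.Created) ? next : best))
    .Select(r => (r.Language, r.Value))
    .ToList();

foreach (var (language, value) in newest)
    Console.WriteLine($"{language}: {value}");
```

Whether this is measurably faster than OrderByDescending(...).First() depends on group sizes, so it is worth profiling with real data before adopting it.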

You can gain some performance by removing the first, unneeded Select (depending on the volume of data, the improvement could be anywhere from minimal to moderate), like this:
KeyValues = item.Resources
.Where(ConditionExpression)
.GroupBy(g => new { g.ResourceId, g.Language })
// Order within each group and project only the newest entry.
.Select(g => g.OrderByDescending(o => o.Changed ?? o.Created)
.Select(s => new KeyValues
{
Language = s.Language,
KeyValue = s.Value
})
.FirstOrDefault())
.ToList();
Beyond that, it depends on your case:
If your data is in a database, you can make database-level improvements such as adding indexes, updating statistics, or using hints.
If this is local data, you can use some strategy to split new and old data between separate enumerables.
There is no other way to significantly improve your LINQ query; you need to find another strategy to achieve that.

I found out that Visual Studio translates it into selects, so I realized that the best solution for cases like this is to create a view. I'm posting this answer to my own question for the benefit of others.

Related

Remove sub-domains from list of domains using LINQ

I have a list of strings like this:
a@domain.com
b@sub.domain.com
c@sub.sub.domain.com
d@sub.domain2.com
I want to remove the subdomains and leave only domain.com, domain2.com, etc.
What I have tried so far, without success:
string[] campusCup(string[] emails)
{
var emailList = emails.Select(x => x.Split('@').Last())
.Distinct()
.Select(x => x.Where(y => x.Split('.').Length > 2).Select(z => x.Split('.').Reverse().Take(2).Reverse()))
.Select(x => x)
.Distinct();
return emailList.ToArray();
}
Any help solving the task, or an explanation of what I am doing wrong and how I can fix it, is appreciated. Thank you.
You could first use MailAddress to get the host, then some string methods to keep only the last two parts:
// Requires: using System.Linq; using System.Net.Mail;
string[] domains = emails
.Select(e => new MailAddress(e).Host.Split('.'))
.Select(arr => String.Join(".", arr.Skip(arr.Length - 2)))
.Distinct()
.ToArray();
This seems to work for me given your data set:
var domains = emails.Select(e => e.Split('@')[1]).Select(d =>
{
var parts = d.Split('.');
return string.Join(".", parts.Skip(parts.Length - 2));
}).Distinct();
If you just want to learn about LINQ, as you mention in the comments of your question, here is another fun option:
// Requires: using System.Linq; using System.Text.RegularExpressions;
var reg = new Regex(@"[a-z0-9\.]+@[a-z0-9\.]*?(?<domain>[a-z0-9]+\.[a-z0-9]+)$");
var secondLevelDomains = emails.SelectMany(email => reg.Matches(email).Cast<Match>()
.Select(m => m.Groups["domain"])
.Select(g => g.Value))
.Distinct();
It uses matching groups in regular expressions to parse the domain names, and several of the more interesting LINQ functions, like Cast (for converting older collections into LINQ-friendly enumerables), SelectMany (to merge enumerable properties of multiple items), and Distinct (to return only unique entries).
This is probably not the ideal way to do this in a real application, but it exposes a lot of LINQ functionality for learning purposes.

Improve Linq query performance that use ToList()

This code was written by @Rahul Singh in the post "Convert TSQL to Linq to Entities":
var result = _dbContext.ExtensionsCategories.ToList().GroupBy(x => x.Category)
.Select(x =>
{
var files = _dbContext.FileLists.Count(f => x.Select(z => z.Extension).Contains(f.Extension));
return new
{
Category = x.Key,
TotalFileCount = files
};
});
However, this code has a problem when used inside a database context. To fix the "Only primitive types or enumeration types are supported in this context" error, we have to use ToList() like this:
var files = _dbContext.FileLists.Count(f => x.Select(z => z.Extension).ToList().Contains(f.Extension));
The problem is that ToList() fetches all the records and reduces performance, so I wrote my own code:
var categoriesByExtensionFileCount =
_dbContext.ExtensionsCategories.Select(
ec =>
new
{
Category = ec.Category,
TotalSize = _dbContext.FileLists.Count(w => w.Extension == ec.Extension)
});
var categoriesTOtalFileCount =
categoriesByExtensionFileCount.Select(
se =>
new
{
se.Category,
TotalCount =
categoriesByExtensionFileCount.Where(w => w.Category == se.Category).Sum(su => su.TotalSize)
}).GroupBy(x => x.Category).Select(y => y.FirstOrDefault());
The performance of this code is better, but it takes many more lines. Any ideas on how to improve the performance of the first version or shorten the second? :D
Regards, Mojtaba
You should have a navigation property from ExtensionsCategories to FileLists. If you are using DB First and have your foreign key constraints set up in the database, this should be generated for you automatically.
If you supply your table designs (or model classes), it would help a lot too.
Lastly, you can replace .ToList().Contains(...) with .Any(), which should solve your immediate issue. Something like:
_dbContext.FileLists.Count(f => x.Any(z => z.Extension == f.Extension));
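To illustrate the shape of the result being asked for (a file count per category), here is a small in-memory sketch that joins extensions to files and sums per category. The arrays are hypothetical stand-ins for the ExtensionsCategories and FileLists tables, and it runs on LINQ to Objects only; whether an equivalent query against a real context is translated into a single SQL statement depends on the provider and version, so treat that part as an assumption to verify.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for ExtensionsCategories and FileLists.
var extensionsCategories = new[]
{
    (Extension: ".jpg", Category: "Image"),
    (Extension: ".png", Category: "Image"),
    (Extension: ".txt", Category: "Text"),
};
var fileLists = new[] { ".jpg", ".jpg", ".png", ".txt", ".exe" };

// Group-join each extension to its files, then sum the counts per category.
// Extensions with no files contribute 0, so their category still appears.
var totals = extensionsCategories
    .GroupJoin(fileLists,
               ec => ec.Extension,
               f => f,
               (ec, files) => (ec.Category, Count: files.Count()))
    .GroupBy(x => x.Category)
    .Select(g => (Category: g.Key, TotalFileCount: g.Sum(x => x.Count)))
    .ToList();

foreach (var (category, total) in totals)
    Console.WriteLine($"{category}: {total}");
```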

Create Multiple Objects Single LINQ EF Method

List<MyObject> objects = await item.tables.ToAsyncEnumerable()
.Where(p => p.field1 == value)
.Select(p => new MyObject(p.field1,p.field2))
.ToList();
^ I have something like that, but what I'm wondering is whether there is any way to add a second object creation in the same Select, e.g. new MyObject(p.field3,p.field4), and add it to the same list. Order does not matter.
I know I could do this with multiple calls to the database or by splitting the lists into sections, but is there a way to do this in a single statement?
You could create it as a tuple.
List<Tuple<MyObject1, MyObject2>> pairs = query.Select(x => Tuple.Create(
new MyObject1
{
// fields
},
new MyObject2
{
//fields
}))
.ToList();
From my testing in Linqpad, it seems that this will only hit the database once.
Alternatively, you could just select all the fields you know you'll need from the database to create both:
var myList = query.Select(x => new { FieldA = x.FieldA, FieldB = x.FieldB }).ToList(); //hits db once
var object1s = myList.Select(x => new MyObject1(x.FieldA));
var object2s = myList.Select(x => new MyObject1(x.FieldB));
var bothLists = object1s.Concat(object2s).ToList();
What you want is the SelectMany method in LINQ, which flattens a sequence of collections into a single sequence. The array can be created inline, as shown below.
List<MyObject> objects = await item.tables.ToAsyncEnumerable()
.Where(p => p.field1 == value)
.SelectMany(p => new []{new MyObject(p.field1,p.field2), new MyObject(p.field3,p.field4)})
.ToList();
Hope that solves your problem!
If you use query syntax instead of method chaining, you can use the let clause to accomplish this. Note that the generated SQL may not be particularly performant, as this article shows, but it should work for you if you're after a subquery.
You could try creating an array of objects and then flattening with SelectMany:
List<MyObject> objects = await item.tables.ToAsyncEnumerable()
.Where(p => p.field1 == value)
.Select(p => new [] {
new MyObject(p.field1,p.field2),
new MyObject(p.field3,p.field4)
})
.SelectMany(g => g)
.ToList();
But I suspect you'll have problems getting EF to translate that to a query.

combining one observable with latest from another observable

I'm trying to combine two observables whose values share some key.
I want to produce a new value whenever the first observable produces a new value, combined with the latest value from a second observable, where the value selected from the second observable depends on the latest value from the first.
pseudo code example:
var obs1 = Observable.Interval(TimeSpan.FromSeconds(1)).Select(x => Tuple.Create(SomeKeyThatVaries, x));
var obs2 = Observable.Interval(TimeSpan.FromMilliseconds(1)).Select(x => Tuple.Create(SomeKeyThatVaries, x));
from x in obs1
let latestFromObs2WhereKeyMatches = …
select Tuple.Create(x, latestFromObs2WhereKeyMatches)
Any suggestions?
Clearly this could be implemented by subscribing to the second observable and creating a dictionary with the latest values indexed by the key. But I'm looking for a different approach.
Usage scenario: one minute price bars computed from a stream of stock quotes. In this case the key is the ticker and the dictionary contains latest ask and bid prices for concrete tickers, which are then used in the computation.
(By the way, thank you Dave and James this has been a very fruitful discussion)
(sorry about the formatting, hard to get right on an iPad..)
...Why are you looking for a different approach? It sounds like you are on the right lines to me. It's short, simple code; roughly speaking, it will be something like:
var cache = new ConcurrentDictionary<long, long>();
obs2.Subscribe(x => cache[x.Item1] = x.Item2);
var results = obs1.Select(x => new {
obs1 = x.Item2,
// Anonymous-type members need a name; fall back to 0 when no obs2 value has been seen for this key.
obs2 = cache.TryGetValue(x.Item1, out var latest) ? latest : 0L
});
At the end of the day, C# is an OO language and the heavy lifting of the thread-safe mutable collections is already all done for you.
There may be a fancy Rx approach (it feels like joins might be involved)... but how maintainable will it be? And how will it perform?
$0.02
I'd like to know the purpose of such a query. Would you mind describing the usage scenario a bit?
Nevertheless, it seems like the following query may solve your problem. The initial projections aren't necessary if you already have some way of identifying the origin of each value, but I've included them for the sake of generalization, to be consistent with your extremely abstract mode of questioning. ;-)
Note: I'm assuming that someKeyThatVaries is not shared data as you've shown it, which is why I've also included the term anotherKeyThatVaries; otherwise, the entire query really makes no sense to me.
var obs1 = Observable.Interval(TimeSpan.FromSeconds(1))
.Select(x => Tuple.Create(someKeyThatVaries, x));
var obs2 = Observable.Interval(TimeSpan.FromSeconds(.25))
.Select(x => Tuple.Create(anotherKeyThatVaries, x));
var results = obs1.Select(t => new { Key = t.Item1, Value = t.Item2, Kind = 1 })
.Merge(
obs2.Select(t => new { Key = t.Item1, Value = t.Item2, Kind = 2 }))
.GroupBy(t => t.Key, t => new { t.Value, t.Kind })
.SelectMany(g =>
g.Scan(
new { X = -1L, Y = -1L, Yield = false },
(acc, cur) => cur.Kind == 1
? new { X = cur.Value, Y = acc.Y, Yield = true }
: new { X = acc.X, Y = cur.Value, Yield = false })
.Where(s => s.Yield)
.Select(s => Tuple.Create(s.X, s.Y)));

Linq using Distinct() in C# Lambda Expression

SFC.OrderFormModifiedMonitoringRecords
.SelectMany(q => q.TimeModify, w => w.DateModify)
.Distinct()
.OrderBy(t => t)
.SelectMany(t => new { RowID = t.rowID, OFnum = t.OFNo });
This gives an error. Did I miss something, or is it coded completely the wrong way? After this, I'm going to use the result in a foreach loop to gather multiple records without duplicates.
The delegate you pass to SelectMany must return an IEnumerable and is for collapsing multiple collections into one. So yes, something's definitely wrong here. I think you've confused it with Select which simply maps one collection to another.
Without knowing what your goal is, it's hard to know exactly how to fix it, but I'm guessing you want something like this:
SFC.OrderFormModifiedMonitoringRecords
.OrderBy(t => t.DateModify)
.ThenBy(t => t.TimeModify)
.Select(t => new { RowID = t.rowID, OFnum = t.OFNo })
.Distinct();
Or in query syntax:
(from t in SFC.OrderFormModifiedMonitoringRecords
orderby t.DateModify, t.TimeModify
select new { RowID = t.rowID, OFnum = t.OFNo })
.Distinct();
This will order the records by DateModify then by TimeModify, select two properties, rowID and OFNo and return only distinct pairs of values.
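The difference between Select and SelectMany described above can be seen in a small in-memory sketch. The orders array and its Tags field are hypothetical and have nothing to do with the actual OrderFormModifiedMonitoringRecords schema; they are only there to show a one-to-one mapping versus flattening.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: each order carries a small collection of tags.
var orders = new[]
{
    (Id: 1, Tags: new[] { "a", "b" }),
    (Id: 2, Tags: new[] { "b", "c" }),
};

// Select maps each element to exactly one result.
var ids = orders.Select(o => o.Id).ToList();

// SelectMany expects a sequence per element and flattens them into one;
// Distinct then removes the duplicate "b".
var tags = orders.SelectMany(o => o.Tags).Distinct().ToList();

Console.WriteLine(string.Join(", ", ids));
Console.WriteLine(string.Join(", ", tags));
```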
