I'm using a few functions like
ICollection<ICache> caches = new HashSet<ICache>();

ICollection<T> Matches<T>(string dependentVariableName)
{
    return caches
        .Where(x => x.GetVariableName() == dependentVariableName)
        .Where(x => typeof(T).IsAssignableFrom(x.GetType()))
        .Select(x => (T) x)
        .ToList();
}
in my current class design. They work wonderfully from an architecture perspective: I can arbitrarily add objects of various related types (in this case ICache implementations) and retrieve them as collections of concrete types.
An issue is that the framework here is a scientific package, and these sorts of functions lie on very hot code paths that get called thousands of times over a few-minute period. The result, per the profiler output: functions like the one above are the main consumers of COMDelegate::DelegateConstruct.
As you can see from the relative distribution of the sample %, this isn't a deal breaker, but it would be fantastic to reduce the overhead a bit!
Thanks in advance.
1) I don't see how the code you posted is related to the performance data... the functions listed don't look like they are called from this code at all. So really I can't answer your question, except to say that maybe you are interpreting the performance report wrong.
2) Don't call .ToList() at the end... just return the IEnumerable; that will help performance. Only call ToList() when you really do need a list that you can later add to, remove from, or sort (see the sketch after point 3).
3) I don't have enough context, but it seems this method could be eliminated by making use of the dynamic keyword.
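For point 2, a deferred version of the method from the question might look something like this (just a sketch; it assumes ICache exposes GetVariableName() as shown above and constrains T to ICache):

// Deferred variant: nothing is filtered or cast until the caller enumerates the result.
IEnumerable<T> Matches<T>(string dependentVariableName) where T : ICache
{
    return caches
        .Where(x => x.GetVariableName() == dependentVariableName)
        .OfType<T>();   // OfType<T> combines the IsAssignableFrom check and the cast
}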
I ran into an issue where the following code results in an exception.
public IList<Persoon> GetPersonenWithCurrentWorkScheme(int clientId)
{
    //SELECT N+1
    Session.QueryOver<Werkschema>()
        .Fetch(ws => ws.Labels).Eager
        .JoinQueryOver<WerkschemaHistory>(p => p.Histories)
        .JoinQueryOver<ArbeidsContract>(a => a.ArbeidsContract)
        .JoinQueryOver<Persoon>(p => p.Persoon)
        .Where(p => p.Klant.ID == clientId)
        .Future();

    var result = Session.QueryOver<Persoon>()
        .Fetch(p => p.ArbeidsContracten).Eager
        .Where(p => p.Klant.ID == clientId)
        .Future();

    return result.ToList();
}
When I comment out the first part of the method (the Session.QueryOver<Werkschema>... block), the code runs fine. When it is not commented out, the first NHibernate commit that occurs throws an exception. (Something to do with a date-time conversion, but that's not really what I'm worried about.)
My question: Is the first bit of code useful? Does it do anything? The result is not stored in a variable which is used later, so to me it looks like dead code. Or is this some NHibernate dark magic which actually does something useful?
Future() is an optimization over the existing API provided by NHibernate.
One of the nicest new features in NHibernate 2.1 is the Future() and FutureValue() functions. They essentially function as a way to defer query execution to a later date, at which point NHibernate will have more information about what the application is supposed to do, and can optimize for it accordingly. This builds on an existing feature of NHibernate, Multi Queries, but does so in a way that is easy to use and almost seamless.
This will execute multiple queries in a single round trip to the database. If it is not possible to get all the needed data in a single round trip, multiple calls will be executed as expected; but it still helps in many cases.
The database call is triggered when the first round trip is actually requested.
In your case, that first round trip is requested by calling result.ToList(), and the batch also includes the first part of your code.
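To illustrate, a minimal sketch of how Future() defers and batches (using the entity types from the question; the exact mappings are assumed):

var werkschemas = Session.QueryOver<Werkschema>().Future();   // deferred, no SQL yet
var personen = Session.QueryOver<Persoon>().Future();         // deferred, no SQL yet

// Enumerating either future result triggers a single round trip
// that executes both queries as one multi-query batch.
var list = personen.ToList();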
As you suspect, the output of the first part, though retrieved, is never used. So in my view, that part can safely be commented out. But this is based only on the code you posted in the question.
It may be that the data loaded by that call saves round trips in another part of the code. Even in that case, though, the code should be cleaned up and moved to the proper location.
In C#, I have a collection of objects that I want to transform to a different type. This conversion, which I would like to do with LINQ Select(), requires multiple operations in sequence. To perform these operations, is it better to chain together multiple Select() queries like
resultSet.Select(stepOneDelegate).Select(stepTwoDelegate).Select(stepThreeDelegate);
or instead to perform these three steps in a single call?
resultSet.Select(item => stepThree(stepTwo(stepOne(item))));
Note: The three steps themselves are not necessarily functions. They are meant to be a concise demonstration of the problem. If that has an effect on the answer please include that information.
Any performance difference would be negligible, but the definitive answer to that is simply "test it". The question is more one of readability: which one is easier to understand and makes it clearer what is going on.
Cases where I have needed to project on a projection include working with EF LINQ expressions where I ultimately need to do something that isn't supported by EF's LINQ provider, so I materialize a projection (usually to an anonymous type) and then finish the expression before selecting the final output. In these cases you would need to use the first example.
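A hedged sketch of that pattern (Order, OrderSummary and the property names here are hypothetical, not from the question):

class OrderSummary { public int Id; public string Display; }

var summaries = context.Orders
    .Select(o => new { o.Id, o.Total, CustomerName = o.Customer.Name }) // translated to SQL
    .AsEnumerable()                                                     // switch to LINQ to Objects
    .Select(x => new OrderSummary
    {
        Id = x.Id,
        // string formatting like this is typically not translatable by the provider,
        // which is why the projection is materialized first
        Display = string.Format("{0}: {1:C}", x.CustomerName, x.Total)
    })
    .ToList();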
Personally I'd probably just stick with the first scenario, as to me it is easy to understand what is going on, and it easily supports additions such as filtering with Where or using OrderBy. The second scenario only really comes up when I'm building a composite model:
.Select(x => new OuterModel
{
    Id = x.Id,
    InnerModel = x.Inner.Select(i => new InnerModel
    {
        // ...
    })
}) // ...
In most cases, though, this can be handled through AutoMapper.
I'd be wary of any code that I felt needed a long chain of Select expressions, as it smells like trying to do too much in one expression chain. Making something easy to understand, even if it involves a few extra steps and adds a few milliseconds to the execution, is far better than risking bugs that someone introduces because they misunderstood complex-looking code.
I got into a discussion with two colleagues regarding the setup for an iteration over an IEnumerable (the contents of which will not be altered in any way during the operation). There are three conflicting theories on which is the optimal approach. Both of the others (and I as well) are very certain, and that got me unsure, so for the sake of clarity I want to check with an external source.
The scenario is as follows. We had the code below as a starting point and discovered that some of the hazaas need not be acted upon. So, starting with the code below, we started to add a blocker for the action.
foreach(Hazaa hazaa in hazaas) ;
My suggestion is as follows.
foreach(Hazaa hazaa in hazaas.Where(element => condition)) ;
One of the guys wants to resolve it in a more explicit form, claiming that LINQ is not appropriate in this case (I'm not sure why that would be, but he seems very convinced). His solution is this.
foreach(Hazaa hazaa in hazaas)
    if(condition) ;
The other counter-suggestion is supported by the claim that Where risks repeating the filtering process needlessly, and that it's safer to minimize the computational workload by picking out the appropriate elements once and for all with Select.
foreach(Hazaa hazaa in hazaas.Select(element => condition)) ;
I argue that the first is obsolete, since LINQ can handle data objects quite well.
I also believe that Select-ing is in this case just as fast as Where-ing and that no needless steps will be taken (e.g. the condition will only be evaluated once per element). If anything, it should be faster using Where, because we won't be creating an extra instance of anything.
Who's right?
Select is inappropriate. It doesn't filter anything.
if is a possible solution, but Where is just as explicit.
Where executes the condition exactly once per item, just as the if does. Additionally, it is important to note that the call to Where doesn't iterate the list by itself; execution is deferred. So, using Where, you iterate the list exactly once, just like when using if.
I think you are discussing this with one person who didn't understand LINQ (the guy who wants to use Select) and one who doesn't like the functional aspect of LINQ.
I would go with Where.
The .Where() and the if(condition) approaches will be the same.
But since LINQ is nicely readable, I'd prefer that.
The approach with .Select() is nonsense, since it will not return the Hazaa objects, but an IEnumerable<Boolean>.
To be clear about the functions:
myEnumerable.Where(a => isTrueFor(a)) //This is filtering
myEnumerable.Select(a => a.b) //This is projection
Where() will run a function that returns a Boolean for each item of the enumerable, and will yield the item or skip it depending on the result of that Boolean function.
Select() will run a function for every item in the list and return the result of the function, without doing any filtering.
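To make the difference concrete, a small sketch (Hazaa and its IsActive flag are illustrative stand-ins for the real type and condition):

// assuming something like: class Hazaa { public bool IsActive { get; set; } }

IEnumerable<Hazaa> filtered = hazaas.Where(h => h.IsActive);   // filtering: yields Hazaa items
IEnumerable<bool> flags = hazaas.Select(h => h.IsActive);      // projection: yields booleans

// Both are lazy; the condition runs exactly once per item, during this loop.
foreach (Hazaa hazaa in filtered)
{
    // act on hazaa
}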
I have never heard of it, but people are referring to an issue in an application as an "N+1 problem". They are doing a LINQ to SQL based project, and a performance problem has been identified by someone. I don't quite understand it, but hopefully someone can steer me.
It seems that they are trying to get a list of objects, and then the foreach after that is causing too many database hits:
From what I understand, the second part of the data is only being loaded in the foreach.
So, list of items loaded:
var program = programRepository.SingleOrDefault(r => r.ProgramDetailId == programDetailId);
And then later, we make use of this list:
foreach (var phase in program.Program_Phases)
{
    phase.Program_Stages.AddRange(stages.Where(s => s.PhaseId == phase.PhaseId));
    phase.Program_Stages.ForEach(s =>
    {
        s.Program_Modules.AddRange(modules.Where(m => m.StageId == s.StageId));
    });
    phase.Program_Modules.AddRange(modules.Where(m => m.PhaseId == phase.PhaseId));
}
It seems the problem identified is that they expected 'program' to contain its children. But when we refer to the children in the query, it reloads the program:
program.Program_Phases
They're expecting program to be fully loaded and in memory, and the profiler seems to indicate that the program table, with all the joins, is being queried on each 'foreach'.
Does this make sense?
(EDIT: I found this link:
Does linq to sql automatically lazy load associated entities?
This might answer my question, but... they're using that nicer (from person in ...) notation, as opposed to this strange (x => x...) one. So if this link is the answer, i.e. we need to 'join' in the query, can that be done?)
In ORM terminology, the 'N+1 select problem' typically occurs when you have an entity that has nested collection properties. It refers to the number of queries that are needed to completely load the entity data into memory when using lazy-loading. In general, the more queries, the more round-trips from client to server and the more work the server has to do to process the queries, and this can have a huge impact on performance.
There are various techniques for avoiding this problem. I am not familiar with LINQ to SQL, but NHibernate supports eager fetching, which helps in some cases. If you do not need to load the entire entity instance then you could also consider doing a projection query. Another possibility is to change your entity model to avoid having nested collections.
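For illustration, the N+1 shape with lazy loading looks something like this (a sketch; the context and entity names are hypothetical, loosely following the question):

var phases = context.Program_Phases.ToList();   // query 1: load the N parent rows

foreach (var phase in phases)
{
    // With lazy loading, touching the child collection fires one more query
    // per parent, so the loop costs N additional round trips.
    Console.WriteLine(phase.Program_Stages.Count);
}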
For performant LINQ, first work out exactly which properties you actually care about. The one advantage that LINQ has performance-wise is that you can easily leave out retrieval of data you won't use (you can always hand-code something that does better than LINQ does, but LINQ makes it easy to do this without creating a library full of hundreds of classes for slight variants of what you leave out each time).
You say "list" a few times. Don't go obtaining lists if you don't need to, only if you'll re-use the same list more than once. Otherwise working one item at a time will perform better 99% of the time.
The "nice" and "strange" syntaxes as you put it are different ways to say the same thing. There are some things that can only be expressed with methods and lambdas, but the other form can always be expressed as such - indeed is after compilation. The likes of from b in someSource where b.id == 21 select b.name becomes compiled as someSource.Where(b => b.id == 21).Select(b => b.name) anyway.
You can set DataContext.LoadOptions to define exactly which entities you want loaded with which object (and best of all, set it differently in different uses quite easily). This can solve your N+1 issue.
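A minimal sketch of that, assuming LINQ to SQL entities named as in the question (Program, Program_Phase) and a placeholder MyDataContext:

var options = new System.Data.Linq.DataLoadOptions();
options.LoadWith<Program>(p => p.Program_Phases);
options.LoadWith<Program_Phase>(ph => ph.Program_Stages);

using (var context = new MyDataContext())
{
    context.LoadOptions = options;   // must be assigned before the first query runs
    var program = context.Programs
        .SingleOrDefault(r => r.ProgramDetailId == programDetailId);
    // program.Program_Phases and their Program_Stages are now loaded eagerly
    // as part of the initial load instead of one query per access.
}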
Alternatively, you might find it easier to update existing entities or insert new ones by setting the appropriate id yourself.
Even if you don't find it easier, it can be faster in some cases. Normally I'd say go with whatever is easier and forget the performance as far as this choice goes, but if performance is a concern here (and you say it is), then it can be worthwhile profiling to see which of the two works better.
Can someone help change the following to select unique Model values from the Product table?
var query = from Product in ObjectContext.Products.Where(p => p.BrandId == BrandId & p.ProdDelOn == null)
            orderby Product.Model
            select Product;
I'm guessing that you still want to filter based on your existing Where() clause. I think this should take care of it for you (and will include the ordering as well):
var query = ObjectContext.Products
    .Where(p => p.BrandId == BrandId && p.ProdDelOn == null)
    .Select(p => p.Model)
    .Distinct()
    .OrderBy(m => m);
But, depending on how you read the post, it could also be taken as you're trying to get a single unique Model out of the results (which is a different query):
var model = ObjectContext.Products
    .Where(p => p.BrandId == BrandId && p.ProdDelOn == null)
    .Select(p => p.Model)
    .First();
Change the & to && and add the following line:
query = query.Distinct();
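If you prefer to keep the query-expression style from the question, the distinct-Model version might look like this (a sketch along the lines of the first answer):

var models = (from p in ObjectContext.Products
              where p.BrandId == BrandId && p.ProdDelOn == null
              select p.Model)
             .Distinct()
             .OrderBy(m => m);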
I'm afraid I can't answer the question - but I want to comment on it nonetheless.
IMHO, this is an excellent example of what's wrong with the direction the .NET Framework has been going in for the last few years. I cannot stand LINQ, nor do I feel too warmly about extension methods, anonymous methods, lambda expressions, and so on.
Here's why: I have yet to see a situation where any of these things actually contributes to solving real-world programming problems. LINQ is certainly no replacement for SQL, so you (or at least the project) still need to master that. Writing the LINQ statements is not any simpler than writing the SQL, but it does add run-time processing to build an expression tree and "compile" it into an SQL statement. Now, if you could solve complex problems more easily with LINQ than with SQL directly, or if it meant you didn't need to also know SQL, or if you could trust LINQ to produce good-enough SQL all the time, it might still be worth using. But NONE of these preconditions are met, so I'm left wondering what the benefit is supposed to be.
Of course, in good old-fashioned SQL the statement would be
SELECT DISTINCT [Model]
FROM [Product]
WHERE [BrandID] = @brandID AND [ProdDelOn] IS NULL
ORDER BY [Model]
In many cases the statements can be easily generated with dev tools and encapsulated by stored procedures. This would perform better, but I'll grant that for many things the performance difference between LINQ and the more straightforward stored procs would be totally irrelevant. (On the other hand, performance problems do have a tendency to sneak in, as we devs often work with totally unrealistic amounts of data and on environments that have little in common with those hosting our software in real production systems.) But the advantages of just not using LINQ are HUGE:
1) Fewer skills required (since you must use SQL anyway)
2) All data access can be performed in one way (since you need SQL anyway)
3) Some control over HOW to get data and not just what
4) Less chance of being rightfully accused of writing bloatware (more efficient)
Similar things can be said with respect to many of the new language features introduced since C# 2.0, though I do appreciate and use some of them. The "var" keyword with type inference is great for initializing locals: it's not much use stating the same type information twice on the same line. But let's not pretend this somehow helps one bit if you have a problem to solve. Same for anonymous types: nested private types served the same purpose with hardly any more code, and I've found NO use for this feature since trying it out when it was new and shiny. Extension methods ARE in fact just plain old utility methods, and I have yet to hear any good explanation of why one should use the SAME syntax for instance methods and static methods invoked on another class! This actually means that adding a method to a class and getting no build warnings or errors can break an application. (In case you doubt it: if you had an extension method Bar() for your Foo type, then the day you introduce an instance method with the same signature, Foo.Bar() invokes a completely different implementation, which may or may not do something similar to what your extension method Bar() did. It'll build, and it may crash at runtime.)
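For illustration, the shadowing scenario described above could look like this (Foo and Bar are the hypothetical names from the paragraph):

using System;

public class Foo
{
    // Added later: for any call site that is recompiled, this silently wins
    // over the extension method below, with no warning or error.
    public void Bar() { Console.WriteLine("instance Bar"); }
}

public static class FooExtensions
{
    public static void Bar(this Foo foo) { Console.WriteLine("extension Bar"); }
}

// new Foo().Bar();  // now resolves to the instance method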
Sorry to rant like this, and maybe there is a better place to post this than in response to a question. But I really think anyone starting out with LINQ is wasting their time - unless it's in preparation for an MS certification exam, which AFAIU is also something a bit removed from reality.