I've got the following code, and I just need to make sure this is the correct way to do this. It works, but it isn't as quick as I'd expect. I've timed each individual call, and the longest is nowhere near the time the whole method takes to run.
public async Task<Result[]> DoSomethingGoodAsync()
{
    List<Product> productList = getproducts();
    IEnumerable<Task<Result>> list =
        from p in productList select DoSomethingAsync(p);
    Task<Result>[] slist = list.ToArray();
    return await Task.WhenAll(slist);
}
Now my question again, is this correct? Is there a better and more efficient way to do this? DoSomethingAsync is an awaitable method also which calls another async method.
Edit: My question.. Is this the correct way to build up a collection of awaitable methods that I want to execute together?
Inside DoSomethingAsync():
scrapeResult = await UrlScraper.ScrapeAsync(product.ProductUrl);
model = this.ProcessCheckStock(model, scrapeResult, product);
It appears that getproducts returns a type assignable to IList<T>. This means that getproducts materializes the result set before you call DoSomethingAsync on each item returned from getproducts.
The time it takes to yield each item and materialize the set is more than likely what has the most impact.
That said, you should change the getproducts method to return an IEnumerable<T> implementation. Changing the return type alone is not enough, though: you also need to remove the materialization (more than likely a call to ToList) and use yield return instead.
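A rough sketch of the difference this makes, under stated assumptions: GetProductsLazily and this DoSomethingAsync are hypothetical stand-ins (the real getproducts and DoSomethingAsync are not shown in the question), and products are reduced to plain ints. The log records the order in which items are produced and tasks are started.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class LazySourceDemo
{
    // Hypothetical stand-in for getproducts: yields each product as it becomes available.
    public static IEnumerable<int> GetProductsLazily(List<string> log)
    {
        for (int id = 1; id <= 3; id++)
        {
            log.Add($"produce {id}");
            yield return id;
        }
    }

    // Hypothetical stand-in for DoSomethingAsync; logs synchronously before its first await.
    public static async Task<int> DoSomethingAsync(int id, List<string> log)
    {
        log.Add($"start {id}");
        await Task.Yield();
        return id * 10;
    }

    public static async Task<(int[] results, List<string> log)> RunAsync(bool materializeFirst)
    {
        var log = new List<string>();
        IEnumerable<int> products = GetProductsLazily(log);
        if (materializeFirst)
        {
            // Every product is produced before any task is started.
            products = products.ToList();
        }

        // ToArray enumerates the query, which starts one task per product.
        Task<int>[] tasks = products.Select(p => DoSomethingAsync(p, log)).ToArray();
        int[] results = await Task.WhenAll(tasks);
        return (results, log);
    }

    public static async Task Main()
    {
        var lazy = await RunAsync(materializeFirst: false);
        Console.WriteLine(string.Join(", ", lazy.log));

        var eager = await RunAsync(materializeFirst: true);
        Console.WriteLine(string.Join(", ", eager.log));
    }
}
```

With the lazy source, each task starts as soon as its product has been produced; with the materialized source, all products are produced first. Whether that matters in practice depends on how expensive producing each product really is.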
I'm trying to iterate over a collection and call TransformCvlValue for each record.
fields?.Select(x => TransformCvlValue(x, cvl)).ToList();
If I call .ToList() it works as expected.
Why does .ToList() need to be called?
Is there another way of doing this?
Calling Select() on an IEnumerable<T> does not immediately execute the action but builds a new IEnumerable<T> with the specified transform / action.
Generally, the delegates passed to LINQ extension methods are only invoked when the IEnumerable<T> is materialized, for example by iterating over it in a foreach or by calling .ToList().
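A minimal sketch of this behavior. Transform is a hypothetical stand-in for TransformCvlValue that simply counts how often it is invoked.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSelectDemo
{
    public static int Calls;

    // Hypothetical stand-in for TransformCvlValue; counts its invocations.
    public static string Transform(string field)
    {
        Calls++;
        return field.ToUpperInvariant();
    }

    public static void Main()
    {
        var fields = new List<string> { "a", "b", "c" };

        // Select only builds the query; Transform has not been called yet.
        IEnumerable<string> query = fields.Select(Transform);
        Console.WriteLine(Calls); // 0

        // ToList enumerates the query, which invokes Transform once per element.
        List<string> results = query.ToList();
        Console.WriteLine(Calls); // 3

        Console.WriteLine(string.Join(",", results)); // A,B,C
    }
}
```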
Select() should mostly be used when you really want to project elements from one type to another, e.g. by applying a projection to an element. It should not be used when you want to call a method for every element in an IEnumerable<T>.
Probably the most readable and straightforward way, for me, would be to simply iterate over the fields:
if (fields != null)
{
    foreach (var field in fields)
    {
        TransformCvlValue(field, cvl);
    }
}
This makes clear what you intend the code to do and is easy to understand when you or your colleagues have to maintain the code in the future.
Here's my situation:
I have a C# API Get function that uses a global variable (to be returned at the end) and calls several functions, as below:
public Dictionary<...,...> Get()
{
    getDataFromLinQ(); //inside this, there is a LinQ select
    processRawLinQData(); //this processes raw data, and store it into dictResult
    return dictResult;
}
If I write it this way, it returns no result, as it seems that it doesn't wait for the LINQ select to finish first.
However, if I write it this way:
public Dictionary<...,...> Get()
{
    //execute linq directly, but not inside a separate function
    IQueryable<table1> linQResult = from t1 in db.table1
                                    where t1...
                                    select t1;
    foreach(table1 x in linQResult)
    {
        //do processing and store in some variables.
    }
    processRawLinQData();
    return dictResult;
}
This will work. But why?
Is a LINQ select an asynchronous method, or does it behave differently if I put it inside another function?
P.S.:
1. I prefer method 1 (using a function) as the code is more readable.
2. I see this behavior on the development/live server. On my local computer, both versions work.
LINQ is lazy: unless you enumerate the query, or call ToList() to force a result, it may never be executed.
So your first method builds the query, which in effect says "I'll run when someone needs me", and then returns without running it.
The next method then assumes the query has finished and produced something, but nothing has been executed yet.
I'm not sure I'm conveying my point, but maybe it'll help you out a bit.
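To make that concrete, here is a small sketch. The method names and the RawData list are hypothetical (the question does not show the body of getDataFromLinQ), and an in-memory array stands in for the database table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyMethodDemo
{
    // Hypothetical stand-in for the state that the helper method is supposed to fill.
    public static readonly List<int> RawData = new List<int>();

    // Builds a query whose selector has a side effect, but never enumerates it.
    public static void GetDataWithoutEnumerating(IEnumerable<int> rows)
    {
        IEnumerable<int> query = rows.Select(r => { RawData.Add(r); return r; });
        // The method returns here; the selector above has not run a single time.
    }

    // Enumerates the query with ToList before returning, so the data is really there.
    public static void GetDataAndEnumerate(IEnumerable<int> rows)
    {
        RawData.AddRange(rows.Where(r => r > 0).ToList());
    }

    public static void Main()
    {
        var rows = new[] { 1, 2, 3 };

        GetDataWithoutEnumerating(rows);
        Console.WriteLine(RawData.Count); // 0

        GetDataAndEnumerate(rows);
        Console.WriteLine(RawData.Count); // 3
    }
}
```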
So this question is about .Select() statements on a static collection (i.e. not the result of a Select(), Where() or other LINQ operation, for example a List or array).
I was under the impression that when using .Select() or other non-filtering, non-sorting methods, a .ElementAt() would take the element from the original collection and run it through the .Select. I saw this as the best way as the .ElementAt() only returns one element and LINQ does not cache anything, so the other generated items get thrown away.
To provide an example:
var original = Enumerable.Range(0, 1000);
var listWithADifficultSelect = original.Select(aMethod);
var onlyOneItem = listWithADifficultSelect.ElementAt(898);
object aMethod(int number) {
    // Gets the item from some kind of database, difficult operation
    // Takes at least a few milliseconds
    return new object();
}
To see this in the bigger picture, if I have a list of 20K items and I only need the nth item, but I perform a pretty heavy .Select(), I would expect the .Select() to only project that one item from the list.
So I have a two-fold question here:
Why is this built this way?
Is there a way to build an improved .Select() that does what I want it to do?
A universal solution that would even translate well to SQL (if that's an issue) would be to use Skip and Take. To get the item that ElementAt(n) would return, skip the first n items (ElementAt is zero-based) and then take 1 from your original IEnumerable (or IQueryable).
var original = Enumerable.Range(0, 1000);
var onlyOneItem = original.Skip(898).Take(1).Select(aMethod).Single();
Skip and Take are LINQ's equivalent of SQL's OFFSET and LIMIT.
In a simplified case like your example, you won't see any improvement in performance, but if you have an expensive query in your actual application, this way you can avoid fetching any unnecessary elements.
If I understand your problem correctly, you don't want LINQ to call aMethod for the first 898 elements if you only need the one at index 898.
So why don't you call it like that:
var onlyOneItem = aMethod(original.ElementAt(898));
If you want to get several specific elements and just don't want LINQ to re-evaluate aMethod all the time, then turn your result into a List or array:
var listWithADifficultSelect = original.Select(aMethod).ToList(); // or ToArray();
So the Select with all its aMethod calls is only executed once and you can access all your elements without re-calling aMethod.
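Both suggestions can be checked with a call counter. This is only a sketch: AMethod here is a cheap hypothetical stand-in for the expensive aMethod from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementAtDemo
{
    public static int Calls;

    // Hypothetical stand-in for the expensive aMethod; counts its invocations.
    public static string AMethod(int number)
    {
        Calls++;
        return "item " + number;
    }

    public static void Main()
    {
        IEnumerable<int> original = Enumerable.Range(0, 1000);

        // Pick the element first, then project only that one.
        Calls = 0;
        string onlyOneItem = AMethod(original.ElementAt(898));
        Console.WriteLine($"{onlyOneItem}, calls: {Calls}"); // item 898, calls: 1

        // Materialize once; indexing the list afterwards does not call AMethod again.
        Calls = 0;
        List<string> cached = original.Select(AMethod).ToList();
        string a = cached[898];
        string b = cached[10];
        Console.WriteLine(Calls); // 1000
    }
}
```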
If you want to write your own LINQ methods that do more what you want than LINQ already does, you can easily implement your own extensions:
public static class MyLinq
{
    public static IEnumerable<TResult> MySelect<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        // implement yourself
        throw new NotImplementedException();
    }
    public static TSource MyElementAt<TSource>(this IEnumerable<TSource> source, int index)
    {
        // implement yourself
        throw new NotImplementedException();
    }
}
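One possible way to fill in such extensions, as a sketch rather than a reference implementation. MySelect is a plain lazy projection written as an iterator, and SelectElementAt is a hypothetical helper (not part of LINQ) that picks the element first and projects only that one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MyLinqSketch
{
    // Lazy projection: the selector runs only while the result is being enumerated.
    public static IEnumerable<TResult> MySelect<TSource, TResult>(
        this IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (selector == null) throw new ArgumentNullException(nameof(selector));
        return Iterator();

        // Local iterator, so the argument checks above run eagerly.
        IEnumerable<TResult> Iterator()
        {
            foreach (TSource item in source)
            {
                yield return selector(item);
            }
        }
    }

    // Hypothetical helper: fetch the element at index, then project just that element.
    public static TResult SelectElementAt<TSource, TResult>(
        this IEnumerable<TSource> source, int index, Func<TSource, TResult> selector)
    {
        if (selector == null) throw new ArgumentNullException(nameof(selector));
        return selector(source.ElementAt(index));
    }
}
```

Used as original.SelectElementAt(898, aMethod), the selector is invoked exactly once, regardless of how many elements precede the requested index.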
In my application there are a fair number of existing "service commands" which generally return a List<TEntity>. However, I wrote them in such a way that any queries would not be evaluated until the very last statement, when they are converted with ToList() (or at least I think I did).
Now I need to start obtaining some "context-specific" information from the commands, and I am thinking about doing the following:
Keep existing commands largely the same as they are today, but make sure they return an IEnumerable<TEntity> rather than an IList<TEntity>.
Create new commands that call the old commands but return IEnumerable<TResult> where TResult is not an entity but rather a view model, result model, etc - some representation of the data that is useful for the application.
The first case in which I have needed this is while doing a search for a Group entity. In my schema, Groups come with User-specific permissions, but it is not realistic for me to spit out the entire list of users and permissions in my result - first, because there could be many users, second, because there are many permissions, and third, because that information should not be available to insufficiently-privileged users (ie a "guest" should not be able to see what a "member" can do).
So, I want to be able to take the result of the original command, an IEnumerable<Group>, and describe how each Group ought to be transformed into a GroupResult, given a specific input of User (by Username in this case).
If I try to iterate over the result of the original command with ForEach I know this will force the execution of the result and therefore potentially result in a needlessly longer execution time. What if I wanted to further compose the result of the "new" command (that returns GroupResult) and filter out certain groups? Then maybe I would be calculating a ton of privileges for the inputted user, only to filter out the parent GroupResult objects later on anyway!
I guess my question boils down to... how do I tell C# how I'd like to transform each member of the IEnumerable without necessarily doing it at the time the method is run?
To lazily cast an enumerable from one type to another you do this:
IEnumerable<TResult> result = source.Cast<TResult>();
This assumes that the elements of the source enumerable can be cast to TResult. If they can't you need to use a standard projection with .Select(x => ... ).
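A small sketch of both options, using a plain object[] as a hypothetical source. It shows that Cast is lazy (an invalid element only fails once the sequence is enumerated) and that a Select projection handles elements a cast cannot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CastDemo
{
    // Returns true if enumerating the lazily cast sequence throws InvalidCastException.
    public static bool CastThrowsWhenEnumerated(object[] source)
    {
        IEnumerable<string> cast = source.Cast<string>(); // lazy: nothing is cast yet
        try
        {
            cast.ToList(); // the casts happen here, element by element
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        object[] mixed = { "a", "b", 3 };
        Console.WriteLine(CastThrowsWhenEnumerated(mixed)); // True: 3 is not a string

        // A projection can convert every element, and is just as lazy as Cast.
        IEnumerable<string> projected = mixed.Select(x => x.ToString());
        Console.WriteLine(string.Join(",", projected)); // a,b,3
    }
}
```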
Also, be careful when returning IEnumerable<T> from a service or database: there are often resources that must be open to obtain the data, so you would need to make sure those resources are still open whenever you evaluate the enumerable. Keeping a database connection open is a bad idea. I would be more inclined to return an array that you've cast as an IEnumerable<>.
However, if you really want to get an IEnumerable<> from a service or database that is truly lazy and will automatically refresh the data, then you could try the "Interactive Extensions" from Microsoft's Reactive Framework team.
They have a nice IEnumerable<> extension called Using that makes a "hot" enumerable that opens a resource for each iteration.
It would look something like this:
var d =
    EnumerableEx
        .Using(
            () => new DB(),
            db => db.Data.Where(x => x == 2));
It creates a new DB instance every time the enumerable is iterated and will dispose of the database when the enumerable is completed. Something worth considering.
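To illustrate the idea without the library, here is a hand-written stand-in built on an iterator block. It is not the Interactive Extensions implementation, and FakeDb and UsingSketch are hypothetical names; the sketch only mirrors the behavior described above (a resource is created per enumeration and disposed when that enumeration ends).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical resource that counts how often it is opened and disposed.
public sealed class FakeDb : IDisposable
{
    public static int Opened;
    public static int Disposed;
    public readonly int[] Data = { 1, 2, 2, 3 };

    public FakeDb() { Opened++; }
    public void Dispose() { Disposed++; }
}

public static class UsingSketch
{
    // Creates the resource when enumeration starts and disposes it when enumeration ends.
    public static IEnumerable<T> Using<TResource, T>(
        Func<TResource> factory, Func<TResource, IEnumerable<T>> query)
        where TResource : IDisposable
    {
        using (TResource resource = factory())
        {
            foreach (T item in query(resource))
            {
                yield return item;
            }
        }
    }

    public static void Main()
    {
        var d = Using(() => new FakeDb(), db => db.Data.Where(x => x == 2));
        Console.WriteLine(FakeDb.Opened); // 0: nothing has been enumerated yet

        Console.WriteLine(d.Count()); // 2
        Console.WriteLine($"{FakeDb.Opened}/{FakeDb.Disposed}"); // 1/1

        d.ToList(); // a second enumeration opens and disposes a second FakeDb
        Console.WriteLine($"{FakeDb.Opened}/{FakeDb.Disposed}"); // 2/2
    }
}
```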
Use NuGet and look for "Ix-Main" for the Interactive Extensions.
You're looking for the yield return command.
When you define a method returning an IEnumerable, and return its data by yield return, the return value is iterated over in the consuming method. This is what it could look like:
IEnumerable<GroupResult> GetGroups(string userName)
{
    foreach (var group in context.Groups.Where(g => <some user-specific condition>))
    {
        var result = new GroupResult();
        ... // Further compose the result.
        yield return result;
    }
}
In consuming code:
var groups = GetGroups("tacos");
// At this point no enumeration has occurred yet. Any breakpoints in GetGroups
// have not been hit.
foreach (var g in groups)
{
    // Now iteration in GetGroups starts!
}
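The interleaving between the iterator and its consumer can be seen in a self-contained sketch. GetNumbers is a hypothetical iterator standing in for GetGroups; it writes to a log so that the execution order is visible without a debugger.

```csharp
using System;
using System.Collections.Generic;

public static class YieldOrderDemo
{
    // Hypothetical iterator; logs each step so the execution order can be inspected.
    public static IEnumerable<int> GetNumbers(List<string> log)
    {
        log.Add("iterator started");
        for (int i = 1; i <= 2; i++)
        {
            log.Add($"yielding {i}");
            yield return i;
        }
        log.Add("iterator finished");
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        IEnumerable<int> numbers = GetNumbers(log);
        log.Add("after call"); // the iterator body has not started yet

        foreach (int n in numbers)
        {
            log.Add($"consumed {n}");
        }
        return log;
    }

    public static void Main()
    {
        // Prints: after call, iterator started, yielding 1, consumed 1,
        // yielding 2, consumed 2, iterator finished (one entry per line).
        foreach (string entry in Run())
        {
            Console.WriteLine(entry);
        }
    }
}
```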
I was refactoring some code earlier and I came across an implementation of an iterator block I wasn't too sure about. In an integration layer of a system where the client is calling an external API for some data, I have a set of translators that take the data returned from the API and translate it into collections of business entities used in the logic layer. A common translator class will look like this:
// translate a collection of entities coming back from an external source into business entities
public static IEnumerable<MyBusinessEnt> Translate(IEnumerable<My3rdPartyEnt> ents) {
    // for each 3rd party ent, create business ent and return collection
    return from ent in ents
           select new MyBusinessEnt {
               Id = ent.Id,
               Code = ent.Code
           };
}
Today I came across the following code. Again, it's a translator class; its purpose is to translate the collection in the parameter into the method return type. However, this time it's an iterator block:
// same implementation of a translator but as an iterator block
public static IEnumerable<MyBusinessEnt> Translate(IEnumerable<My3rdPartyEnt> ents) {
    foreach(var ent in ents)
    {
        yield return new MyBusinessEnt {
            Id = ent.Id,
            Code = ent.Code
        };
    }
}
My question is: is this a valid use of an iterator block? I can't see the benefit of creating a translator class in this way. Could this result in some unexpected behaviour?
Your two samples do pretty much exactly the same thing. The query version will be rewritten into a call to Select, and Select is written exactly like your second example; it iterates over each element in the source collection and yield-returns a transformed element.
This is a perfectly valid use of an iterator block, though of course it is no longer necessary to write your own iterator blocks like this because you can just use Select.
Yes, that's valid. The foreach has the advantage of being debuggable, so I tend to prefer that design.
The first example is not an iterator. It just creates and returns an IEnumerable<MyBusinessEnt>.
The second is an iterator and I don't see anything wrong with it. Each time the caller iterates over the return value of that method, the yield will return a new element.
Yes, that works fine, and the result is very similar.
Both create an object that is capable of returning the result. Both rely on the source enumerable to remain intact until the result is completed (or cut short). Both use deferred execution, i.e. the objects are created one at a time when you iterate the result.
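A short sketch of that shared behavior, with the entities simplified to ints and strings (the real MyBusinessEnt and My3rdPartyEnt types are not shown in the question). An item added to the source after Translate has been called still shows up in both results, because neither version reads the source until it is iterated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TranslateDemo
{
    // Query-expression version (compiles to a call to Select).
    public static IEnumerable<string> TranslateWithQuery(IEnumerable<int> ents)
    {
        return from ent in ents select "ent " + ent;
    }

    // Iterator-block version.
    public static IEnumerable<string> TranslateWithIterator(IEnumerable<int> ents)
    {
        foreach (int ent in ents)
        {
            yield return "ent " + ent;
        }
    }

    public static void Main()
    {
        var source = new List<int> { 1, 2 };

        IEnumerable<string> viaQuery = TranslateWithQuery(source);
        IEnumerable<string> viaIterator = TranslateWithIterator(source);

        // Neither translator has read the source yet, so this item is still picked up.
        source.Add(3);

        Console.WriteLine(string.Join(", ", viaQuery));    // ent 1, ent 2, ent 3
        Console.WriteLine(string.Join(", ", viaIterator)); // ent 1, ent 2, ent 3
    }
}
```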
There is a difference in that the first returns an expression that uses library methods to produce an enumerator, while the second creates a custom enumerator.
There is no real difference in when each version runs. The first one is delayed until the return value is iterated, and so is the second one: the body of an iterator block, including its foreach loop, does not start executing when Translate is called, only when the caller begins iterating the returned IEnumerable<T>. The loop does not force the iteration to run up front.
This does not provide any benefit over simple Select. yield's real power is when there is a conditional involved:
foreach(var ent in ents)
{
    if(someCondition)
        yield return new MyBusinessEnt {
            Id = ent.Id,
            Code = ent.Code
        };
}
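For comparison, a simple per-element condition like this one can also be written with Where followed by Select. The sketch below uses a hypothetical condition (keep even numbers) and simplified types, and shows the two forms producing the same sequence; an iterator block tends to pay off more when the logic between yields is harder to express as a chain of operators.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConditionalYieldDemo
{
    // Iterator block with a condition: only matching elements are yielded.
    public static IEnumerable<string> WithIterator(IEnumerable<int> ents)
    {
        foreach (int ent in ents)
        {
            if (ent % 2 == 0) // hypothetical condition
            {
                yield return "ent " + ent;
            }
        }
    }

    // The same filter and projection expressed with LINQ operators.
    public static IEnumerable<string> WithLinq(IEnumerable<int> ents)
    {
        return ents.Where(ent => ent % 2 == 0).Select(ent => "ent " + ent);
    }

    public static void Main()
    {
        int[] ents = { 1, 2, 3, 4 };
        Console.WriteLine(string.Join(", ", WithIterator(ents))); // ent 2, ent 4
        Console.WriteLine(WithIterator(ents).SequenceEqual(WithLinq(ents))); // True
    }
}
```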