Is Linq Select asynchronous inside a separate function?

Here's my situation:
I have a C# API Get function that uses a global variable (returned at the end) and calls several functions, as below:
public Dictionary<...,...> Get()
{
    getDataFromLinQ(); //inside this, there is a LinQ select
    processRawLinQData(); //this processes raw data, and store it into dictResult
    return dictResult;
}
If I write it this way, it returns no result; it seems that it doesn't wait for the LINQ select to finish first.
However, if I write it this way:
public Dictionary<...,...> Get()
{
    //execute linq directly, but not inside a separate function
    IQueryable<table1> linQResult = from t1 in db.table1
                                    where t1...
                                    select t1;
    foreach(table1 x in linQResult)
    {
        //do processing and store in some variables.
    }
    processRawLinQData();
    return dictResult;
}
This works. But why?
Is a LINQ select an asynchronous method, or does it behave differently if I put it inside another function?
P.S.:
1. I prefer method 1 (using a function) because the code is more readable.
2. I noticed this behavior on the development/live server. On my local computer, both versions work.

LINQ is lazy: if you never enumerate the query, or call ToList() to force a result, it might never be executed.
So your first method goes in, builds the LINQ query, effectively says "I'll run this when I need to," and returns.
The next method then assumes the query has finished and returned something, but by then you're in a completely different place.
I'm not sure I'm conveying my point well, but maybe this helps a bit.
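A minimal sketch of that behavior, assuming an in-memory array instead of the database table. BuildQuery is a hypothetical stand-in for getDataFromLinQ(), and the log list exists only to reveal when the selector actually runs:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();
var source = new[] { 1, 2, 3 };

// Hypothetical stand-in for getDataFromLinQ(): builds the query but never enumerates it.
IEnumerable<int> BuildQuery() =>
    source.Select(n =>
    {
        log.Add($"selected {n}");   // side effect that shows when the lambda runs
        return n * 10;
    });

IEnumerable<int> query = BuildQuery();
Console.WriteLine(log.Count);                        // prints 0: nothing has run yet

List<int> materialized = query.ToList();             // ToList() enumerates, so the lambda runs now
Console.WriteLine(log.Count);                        // prints 3
Console.WriteLine(string.Join(",", materialized));   // prints 10,20,30
```

If the helper function is supposed to fill a shared variable, calling ToList() (or looping with foreach) inside that function is what makes the data exist by the time the next function reads it.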

Related

LINQ Select only throws IOException when the Enumerable is looked at

I'm currently using LINQ to load a list of files into XDocuments, like so:
var fileNames = new List<string>(){ @"C:\file.xml" };
var xDocs = fileNames.Select(XDocument.Load);
var xDocs2 = xDocs.ToList(); // Crashes here
If I deliberately 'lock' one of the files with a different process, the IOException is only thrown when I actually start to look at the XDocuments I've been generating, i.e. when ToList() is called.
Can anyone explain why this is, and how best to handle this error? I'd like to have access to the working XDocuments still, if possible.

Can anyone explain why this is
As many have pointed out, this is because of the so-called deferred execution of many LINQ methods. For instance, the Enumerable.Select method documentation states
This method is implemented by using deferred execution. The immediate return value is an object that stores all the information that is required to perform the action. The query represented by this method is not executed until the object is enumerated either by calling its GetEnumerator method directly or by using foreach in Visual C# or For Each in Visual Basic.
while the Enumerable.ToList documentation contains
The ToList<TSource>(IEnumerable<TSource>) method forces immediate query evaluation and returns a List that contains the query results. You can append this method to your query in order to obtain a cached copy of the query results.
So the XDocument.Load is really executed for each file name during the ToList call. I guess that covers the why part.
and how best to handle this error? I'd like to have access to the working XDocuments still, if possible.
I don't know what "best" means in this context, but if you want to ignore the errors and include the "working XDocuments", then you can use something like this
var xDocs = fileNames.Select(fileName =>
{
    try { return XDocument.Load(fileName); }
    catch { return null; }
});
and then either append .Where(doc => doc != null), or account for null documents when processing the list.
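Here is a small sketch of both points, under the assumption that int.Parse can stand in for XDocument.Load (so the example needs no files on disk) and FormatException for IOException. It shows that the exception surfaces only when the query is enumerated, and that per-item handling keeps the items that worked:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var inputs = new List<string> { "1", "oops", "3" };

// Building the query does not call int.Parse, so nothing can throw here.
IEnumerable<int> parsed = inputs.Select(s => int.Parse(s));

bool threwOnEnumeration = false;
try
{
    parsed.ToList();                 // enumeration runs int.Parse and hits "oops"
}
catch (FormatException)
{
    threwOnEnumeration = true;
}
Console.WriteLine(threwOnEnumeration);       // prints True

// Per-item handling: a failed item becomes null and is filtered out afterwards.
List<int> good = inputs
    .Select(s =>
    {
        try { return (int?)int.Parse(s); }
        catch (FormatException) { return null; }
    })
    .Where(v => v.HasValue)
    .Select(v => v.Value)
    .ToList();
Console.WriteLine(string.Join(",", good));   // prints 1,3
```

Catching a specific exception type, as here, is usually safer than a bare catch, since it does not hide unrelated failures.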

This is because LINQ's Select returns a lazy IEnumerable: the delegate is first called for each element when you convert the IEnumerable to a List, because that is the point at which all of the elements are enumerated.

How should I use Task When All?

I've got the following code, and I just need to make sure this is the correct way to do it. It works, but it isn't as quick as I'd expect. I've timed each individual call, and the longest is nowhere near the total time it takes to run.
public async Task<Result[]> DoSomethingGoodAsync()
{
    List<Product> productList = getproducts();
    IEnumerable<Task<Result>> list =
        from p in productList select DoSomethingAsync(p);
    Task<Result>[] slist = list.ToArray();
    return await Task.WhenAll(slist);
}
Now my question again: is this correct? Is there a better, more efficient way to do this? DoSomethingAsync is also an awaitable method, which calls another async method.
Edit: My question is: is this the correct way to build up a collection of awaitable methods that I want to execute together?
Inside DoSomethingAsync():
scrapeResult = await UrlScraper.ScrapeAsync(product.ProductUrl);
model = this.ProcessCheckStock(model, scrapeResult, product);

It appears that getproducts returns a type assignable to IList<T>. This means that getproducts materializes the result set before you call DoSomethingAsync on each item returned from getproducts.
The time it takes to yield each item and materialize the set is more than likely what has the most impact.
That said, you should change the getproducts method to return an IEnumerable<T> implementation. You need to do more than change the return type, though: you also need to remove the materialization (more than likely a call to ToList) and use yield instead.
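To see where the work actually starts in code shaped like the question's, here is a small sketch. DoSomethingAsync below is a hypothetical stand-in that only delays, and the products are plain integers. The point is that the query alone starts nothing; calling ToArray() is what invokes the async method for every item, after which Task.WhenAll awaits tasks that are already running:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

int started = 0;

// Hypothetical stand-in for DoSomethingAsync: counts invocations, then simulates I/O.
async Task<int> DoSomethingAsync(int p)
{
    started++;                 // runs synchronously, before the first await
    await Task.Delay(50);
    return p * 2;
}

var products = new List<int> { 1, 2, 3 };

// Deferred: no task has been created yet.
IEnumerable<Task<int>> lazy = products.Select(p => DoSomethingAsync(p));
Console.WriteLine(started);                      // prints 0

// ToArray() enumerates the query, which starts all three operations.
Task<int>[] tasks = lazy.ToArray();
Console.WriteLine(started);                      // prints 3

// WhenAll preserves the order of the tasks it was given.
int[] results = await Task.WhenAll(tasks);
Console.WriteLine(string.Join(",", results));    // prints 2,4,6
```

With this shape, the total time is roughly that of the slowest call plus whatever runs synchronously before each method's first await, so a slow synchronous prefix (or a slow getproducts) is worth measuring separately.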

cannot reach my inner breakpoint

I'm running the following code:
hashedUrlDataList.Select(mDalHashedUrlData.Save);
I have put a breakpoint in the called delegate,
public HashedUrlData Save(HashedUrlData item)
{
    //breakpoint here
}
but it doesn't stop there.
How can I fix this?

Your method will be called when you enumerate the result of Select(), not when the query is declared.

Enumerable.Select is lazy.
Try this and tell me whether your breakpoint is hit:
hashedUrlDataList.Select(mDalHashedUrlData.Save).ToList();
Or, more basic:
hashedUrlDataList.Select(mDalHashedUrlData.Save).GetEnumerator().MoveNext();
This only works if the list has at least one element.
You can also do:
hashedUrlDataList.Select(mDalHashedUrlData.Save).Any();
Any() does the same thing as GetEnumerator().MoveNext().
I think what you actually want is:
List<HashedUrlData> hashedUrlDataList = new List<HashedUrlData>();
// A lambda is needed here: Save returns a value, so the method group does not convert to Action<T>.
hashedUrlDataList.ForEach(item => mDalHashedUrlData.Save(item));

LINQ is for querying data; it's not intended to cause side effects.
If you're more interested in the side effects of your Save method than in the HashedUrlData instance it returns, you should really be calling
foreach (HashedUrlData h in hashedUrlDataList)
{
    mDalHashedUrlData.Save(h);
}
If you will eventually be using the returned values and this is just an intermediate/debugging stage, then by all means use LINQ. Just be aware that Save will only be called as you access each returned value, or when you call something else that enumerates the whole sequence, as the other answers have shown.
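The "called as you access each returned value" part can be made visible by stepping the enumerator by hand. This is only a sketch: the local Save function is a hypothetical stand-in for mDalHashedUrlData.Save that counts its own invocations, and strings replace HashedUrlData:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var items = new List<string> { "a", "b", "c" };
int saveCalls = 0;

// Hypothetical stand-in for mDalHashedUrlData.Save: counts how often it is invoked.
string Save(string item)
{
    saveCalls++;
    return item;
}

IEnumerable<string> query = items.Select(x => Save(x));
Console.WriteLine(saveCalls);       // prints 0: declaring the query calls nothing

using (IEnumerator<string> e = query.GetEnumerator())
{
    e.MoveNext();                   // first element requested
    Console.WriteLine(saveCalls);   // prints 1
    e.MoveNext();                   // second element requested
    Console.WriteLine(saveCalls);   // prints 2
}

// Plain foreach over the list: one call per element, no laziness involved.
saveCalls = 0;
foreach (string item in items)
{
    Save(item);
}
Console.WriteLine(saveCalls);       // prints 3
```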

LINQ to SQL use query after DataContext Dispose

I'm writing a DLL for a project and have just started using LINQ to SQL. After moving all my methods to this new DLL, I discovered that I can't access the DataContext because it has been disposed. I understand why, but I'm not sure how to access the query results from my main project's method:
My method in DLL
public static IEnumerable<Problem> GetProblemsNewest(int howMuch)
{
    using (ProblemClassesDataContext context = new ProblemClassesDataContext())
    {
        var problems = (from p in context.Problems orderby p.DateTimeCreated select p).Take(howMuch);
        return problems;
    }
}
Calling it:
IEnumerable<Problem> problems = ProblemsManipulation.GetProblemsNewest(10);
//Error: can't access it because the context was disposed
This is just the first method; I have larger ones, so I really need a way to do this. There must be a way to use LINQ to SQL in a DLL? I know I can do something like .ToList or .ToArray, but then I wouldn't be able to access row properties directly and would have to reference rows as problem[0], problem[1], etc., which is even messier than having a ton of code in the main project.

Once execution leaves the using statement, the context is automatically disposed, so by the time the IEnumerable is actually enumerated the context is already gone.
Therefore you need to tell LINQ to go ahead and actually retrieve the values from the DB while you're still inside the using statement. You can do so via ToList() or ToArray() (among others).
See the updated code:
public static IList<Problem> GetProblemsNewest(int howMuch)
{
    using (ProblemClassesDataContext context = new ProblemClassesDataContext())
    {
        var problems = (from p in context.Problems orderby p.DateTimeCreated select p).Take(howMuch);
        return problems.ToList();
    }
}

Change this:
return problems;
to this:
return problems.ToList();
Explanation:
The ToList() will iterate through the results and pull them all into memory. Because this happens inside the using statement, you're fine. And because it creates a list, your values will be returned.
You could do this in other ways. The basic idea is to actually retrieve the results before the using statement closes.
An alternate solution would be to avoid returning from inside the using statement and instead create an iterator that owns the context and disposes it once the last item has been iterated past.
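A rough sketch of that iterator idea, with the loud assumption that a StringReader stands in for the DataContext (both are IDisposable and both fail when used after disposal). BrokenLines has the same shape as the question's method; OwnedLines keeps the using block inside an iterator, so the resource lives exactly as long as the enumeration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Lazy helper: reads nothing until the caller enumerates.
IEnumerable<string> ReadAll(TextReader reader)
{
    for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
        yield return line;
}

// Same shape as the question: the reader is disposed before the lazy sequence runs.
IEnumerable<string> BrokenLines(string text)
{
    using (var reader = new StringReader(text))
    {
        return ReadAll(reader);
    }
}

// Iterator that owns the reader: the using block is entered on the first MoveNext
// and exited when enumeration finishes or the enumerator is disposed.
IEnumerable<string> OwnedLines(string text)
{
    using (var reader = new StringReader(text))
    {
        for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
            yield return line;
    }
}

bool brokenThrew = false;
try { BrokenLines("a\nb").ToList(); }
catch (ObjectDisposedException) { brokenThrew = true; }
Console.WriteLine(brokenThrew);                 // prints True

List<string> lines = OwnedLines("a\nb").ToList();
Console.WriteLine(string.Join(",", lines));     // prints a,b
```

A foreach loop disposes the enumerator even on an early break, which triggers the cleanup. A caller that takes the enumerator manually and abandons it would leave the resource open, which is the main trade-off against simply calling ToList() inside the using block.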

You could handle it by calling .ToList() on the IEnumerable before exiting the using block. This will retrieve the records and populate the list. Depending on your scenario, though, this might not be optimal in terms of performance (you lose the possibility of lazy retrieval and of additional filtering of the query).

You have to finish the query inside the using clause, i.e. use ToList() or First() or Count(), etc...

Currently you are returning the query itself. The database connection is closed before you use it, so you get an exception. Just do:
return problems.ToList();
This is because of LINQ's deferred execution. Your problems object is just a query, and you have to convert it to objects before you can use it elsewhere.

You may not want to use the context in a using; the problem is then you can't use the navigation properties later on, because you'll have the same "object disposed" issue when it tries to load the related data. What you need to do is let the context live, and return the results directly. Or, when you return the results, call ToList(), and later on query all related data.

Correct usage of iterator blocks

I was refactoring some code earlier and I came across an implementation of an iterator block I wasn't too sure about. In an integration layer of a system where the client is calling an external API for some data, I have a set of translators that take the data returned from the API and translate it into collections of business entities used in the logic layer. A common translator class will look like this:
// translate a collection of entities coming back from an external source into business entities
public static IEnumerable<MyBusinessEnt> Translate(IEnumerable<My3rdPartyEnt> ents) {
    // for each 3rd party ent, create business ent and return collection
    return from ent in ents
           select new MyBusinessEnt {
               Id = ent.Id,
               Code = ent.Code
           };
}
Today I came across the following code. Again, it's a translator class; its purpose is to translate the collection in the parameter into the method return type. However, this time it's an iterator block:
// same implementation of a translator but as an iterator block
public static IEnumerable<MyBusinessEnt> Translate(IEnumerable<My3rdPartyEnt> ents) {
    foreach(var ent in ents)
    {
        yield return new MyBusinessEnt {
            Id = ent.Id,
            Code = ent.Code
        };
    }
}
My question is: is this a valid use of an iterator block? I can't see the benefit of creating a translator class in this way. Could this result in some unexpected behaviour?

Your two samples do pretty much exactly the same thing. The query version will be rewritten into a call to Select, and Select is written exactly like your second example; it iterates over each element in the source collection and yield-returns a transformed element.
This is a perfectly valid use of an iterator block, though of course it is no longer necessary to write your own iterator blocks like this because you can just use Select.

Yes, that's valid. The foreach has the advantage of being debuggable, so I tend to prefer that design.

The first example is not an iterator. It just creates and returns an IEnumerable<MyBusinessEnt>.
The second is an iterator and I don't see anything wrong with it. Each time the caller iterates over the return value of that method, the yield will return a new element.

Yes, that works fine, and the result is very similar.
Both create an object that is capable of returning the result. Both rely on the source enumerable remaining intact until the result is completed (or cut short). Both use deferred execution, i.e. the objects are created one at a time when you iterate the result.
There is a difference in that the first returns an expression that uses library methods to produce an enumerator, while the second creates a custom enumerator.
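The deferred execution of both forms can be checked with a short sketch. ViaSelect and ViaIterator are hypothetical reductions of the two Translate methods to integers, and Track is a helper added purely to record when each element is produced:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();

// Records that an element was produced, then returns the "translated" value.
int Track(string who, int value)
{
    log.Add($"{who} {value}");
    return value * 2;
}

// Query-expression style: the compiler rewrites this into a call to Select.
IEnumerable<int> ViaSelect(IEnumerable<int> ents) =>
    from e in ents
    select Track("select", e);

// Iterator-block style: the body does not start until the first MoveNext.
IEnumerable<int> ViaIterator(IEnumerable<int> ents)
{
    foreach (int e in ents)
    {
        yield return Track("yield", e);
    }
}

int[] source = { 1, 2 };
IEnumerable<int> a = ViaSelect(source);
IEnumerable<int> b = ViaIterator(source);
Console.WriteLine(log.Count);                  // prints 0: neither method body has produced anything

List<int> fromSelect = a.ToList();
List<int> fromIterator = b.ToList();
Console.WriteLine(string.Join(" | ", log));    // prints select 1 | select 2 | yield 1 | yield 2
```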

There is no real difference in when the code runs. Neither version does any work when the method is called: the body of an iterator block does not start executing until the return value is iterated, just like the Select query. The foreach loop inside it does not force the iteration to run; it only advances the source as the caller asks for each item.
This does not provide any benefit over simple Select. yield's real power is when there is a conditional involved:
foreach(var ent in ents)
{
    if(someCondition)
        yield return new MyBusinessEnt {
            Id = ent.Id,
            Code = ent.Code
        };
}
