documents is a IDictionary<string, string> where the parameters are <filename, fileUrl>
DocumentHandler.Download() returns a Task<Memorystram>
This code works:
foreach (var x in documents.Keys)
{
var result = await DocumentHandler.Download(new Uri(documents[x]));
// other code
}
however it rund synchronously.
In order to run it all async i wrote this code:
var keys =
documents.Keys.Select(async x =>
{
return Tuple.Create(x, await DocumentHandler.Download(new Uri(documents[x])));
});
await Task.WhenAll(keys);
foreach (var item in keys)
{
var tpl = item.Result;
// other code
}
It doesn't work, it crashes without showing an exception on the last line var tpl = item.Result; Why?
Your keys variable will create a new set of tasks every time you evaluate it... so after waiting for the first set of tasks to complete, you're iterating over a new set of unfinished tasks. The simple fix for this is to add a call to ToList():
var keys = documents
.Keys
.Select(async x => Tuple.Create(x, await DocumentHandler.Download(new Uri(documents[x]))))
.ToList();
await Task.WhenAll(keys);
foreach (var item in keys)
{
var tpl = item.Result;
// other code
}
Related
var fdPositions = dbContext.FdPositions.Where(s => s.LastUpdated > DateTime.UtcNow.AddDays(-1));
foreach (JProperty market in markets)
{
// bunch of logic that is irrelevant here
var fdPosition = fdPositions.Where(s => s.Id == key).FirstOrDefault();
if (fdPosition is not null)
{
fdPosition.OddsDecimal = oddsDecimal;
fdPosition.LastUpdated = DateTime.UtcNow;
}
else
{
// bunch of logic that is irrelevant here
}
}
await dbContext.SaveChangesAsync();
This block of code will make 1 database call on this line
var fdPosition = fdPositions.Where(s => s.Id == key).FirstOrDefault();
for each value in the loop, there will be around 10,000 markets to loop through.
What I thought would happen, and what I would like to happen, is 1 database call is made
var fdPositions = dbContext.FdPositions.Where(s => s.LastUpdated > DateTime.UtcNow.AddDays(-1));
on this line, then in the loop, it is checking against the local table I thought I pulled on the first line, making sure I still properly am updating the DB Object in this section though
if (fdPosition is not null)
{
fdPosition.OddsDecimal = oddsDecimal;
fdPosition.LastUpdated = DateTime.UtcNow;
}
So my data is properly propagated to the DB when I call
await dbContext.SaveChangesAsync();
How can I update my code to accomplish this so I am making 1 DB call to get my data rather than 10,000 DB calls?
Define your fdPositions variable as a Dictionary<int, T>, in your query do a GroupBy() on Id, then call .ToDictionary(). Now you have a materialized dictionary that lets you index by key quickly.
var fdPositions = context.FdPositions.Where(s => s.LastUpdatedAt > DateTime.UtcNow.AddDays(-1))
.GroupBy(x=> x.Id)
.ToDictionary(x=> x.Key, x=> x.First());
//inside foreach loop:
// bunch of logic that is irrelevant here
bool found = fdPositions.TryGetValue(key, out var item);
I have these example code:
private async Task<IEnumerable<long>> GetValidIds1(long[] ids)
{
var validIds = new List<long>();
var results = await Task.WhenAll(ids.Select(i => CheckValidIdAsync(i)))
.ConfigureAwait(false);
for (int i = 0; i < ids.Length; i++)
{
if (results[i])
{
validIds.Add(ids[i]);
}
}
return validIds;
}
private async Task<IEnumerable<long>> GetValidIds2(long[] ids)
{
var validIds = new ConcurrentBag<long>();
await Task.WhenAll(ids.Select(async i =>
{
var valid = await CheckValidIdAsync(i);
if (valid)
validIds.Add(i);
})).ConfigureAwait(false);
return validIds;
}
private async Task<bool> CheckValidIdAsync(long id);
I currently use GetValidIds1() but it has inconvenience of having to tie input ids to result using index at the end.
GetValidIds2() is what i want to write but there are a few concerns:
I have 'await' in select lambda expression. Because LINQ is lazy evaluation, I don't think it would block other CheckValidIdAsync() calls from starting but exactly who's context does it suspend? Per MSDN doc
The await operator suspends evaluation of the enclosing async method until the asynchronous operation represented by its operand completes.
So in this case, the enclosing async method is lambda expression itself so it doesn't affect other calls?
Is there a better way to process result of async method and collect output of that process in a list?
Another way to do it is to project each long ID to a Task<ValueTuple<long, bool>>, instead of projecting it to a Task<bool>. This way you'll be able to filter the results using pure LINQ:
private async Task<long[]> GetValidIds3(long[] ids)
{
IEnumerable<Task<(long Id, bool IsValid)>> tasks = ids
.Select(async id =>
{
bool isValid = await CheckValidIdAsync(id).ConfigureAwait(false);
return (id, isValid);
});
var results = await Task.WhenAll(tasks).ConfigureAwait(false);
return results
.Where(e => e.IsValid)
.Select(e => e.Id)
.ToArray();
}
The above GetValidIds3 is equivalent with the GetValidIds1 in your question. It returns the filtered IDs in the same order as the original ids. On the contrary the GetValidIds2 doesn't guarantee any order. If you have to use a concurrent collection, it's better to use a ConcurrentQueue<T> instead of a ConcurrentBag<T>, because the former preserves the insertion order. Even if the order is not important, preserving it makes the debugging easier.
I have trouble to decide which way is the best to query my SQL Server on my WebAPI 2 Backend
I am trying to use async/await as often as possible, but i found that when i return the whole collection, there is no async option available.
Which way would be the best one?
[ResponseType(typeof(List<ExposedPublisher>))]
[HttpGet]
public async Task<IHttpActionResult> GetPublisher()
{
var list = new List<PublisherWithMedia>();
foreach (var publisher in _db.Publisher.Where(e => e.IsDeleted == false))
{
var pub = new PublisherWithMedia()
{
Id = publisher.Id,
Name = publisher.Name,
Mediae = new List<WebClient.Models.Media>()
};
foreach (var media in publisher.Media)
{
pub.Mediae.Add(ApiUtils.GetMedia(media));
}
list.Add(pub);
}
return Ok(list);
}
or
[ResponseType(typeof(List<PublisherWithMedia>))]
[HttpGet]
public async Task<IHttpActionResult> GetPublisher()
{
var list = new List<PublisherWithMedia>();
var entity = await _db.Publisher.ToListAsync();
foreach (var publisher in entity.Where(e => e.IsDeleted == false))
{
var pub = new PublisherWithMedia()
{
Id = publisher.Id,
Name = publisher.Name,
Mediae = new List<WebClient.Models.Media>()
};
foreach (var media in publisher.Media)
{
pub.Mediae.Add(ApiUtils.GetMedia(media));
}
list.Add(pub);
}
return Ok(list);
}
The operation could potentially result in a very large resultset, so it would make sense to filter directly on the database, especially since with time, the amount of deleted records could surpass the amount of undeleted ones.
However, with the large result and the query of children (Media), it makes also sense to have the operation work asynchronously, as it should be quite time-consuming.
Sadly there is no async Where() in this context.
Or is there a third way i am not aware of?
You can have the best of both worlds: filter in the database and asynchronously execute the query:
var publishers = await _db.Publisher
.Where(e => !e.IsDeleted)
.ToListAsync();
That said, depending on what ApiUtils.GetMedia(media) does, you could even perform the projection on the query:
var publishers = await _db.Publisher
.Where(e => !e.IsDeleted)
.Select(e => new PublisherWithMedia()
{
Id = e.Id,
Name = e.Name,
Mediae = e.Mediae.Select(m => ApiUtils.GetMedia(m)).ToList()
};
.ToListAsync();
I have a list of headlines that I am trying to process by using a Task that updates a parameter of the headline object. The code I am trying to do it with does not actually populate the parameter properly. When I debug it, I can see that the setters are being activated and properly updating the backing fields, but when examined after Task.WhenAll , none of the properties are in fact set to their expected values.
//**Relevant signatures are:**
class Headline{
public Uri Uri { get; set; }
public IEnumerable<string> AttachedEmails { get; set; } = Enumerable.Empty<string>();
}
async Task<IEnumerable<string>> GetEmailsFromHeadline(Uri headlineUri) {
//bunch of async fetching logic that populates emailNodes correctly
return emailNodes.Select(e => e.InnerText).ToList();
}
//**Problem Area**
//Initiate tasks that start fetching information
var taskList =
postData
.Select(e => new HttpRequest() { Uri = actionUri, PostData = e })
.Select(e => Task.Run(() => GetHeadlines(e)))
.ToList();
//Wait till complete
await Task.WhenAll(taskList);
//Flatten list
var allHeadlines =
taskList
.SelectMany(e => e.Result.ToList());
//After this section of code I expect every member AttachedEmails property to be properly updated but this is not what is happening.
var headlineProcessTaskList =
allHeadlines
.Select(e => Task.Run( new Func<Task>( async () => e.AttachedEmails = await GetEmailsFromHeadline(e.Uri) ) ) )
.ToList();
await Task.WhenAll(headlineProcessTaskList);
There isn't enough code to reproduce the problem. However, we can guess that:
allHeadlines is IEnumerable<Headline>
Since this is IEnumerable<T>, it is possible that it is a LINQ query that has been deferred. Thus, when you enumerate it once in the ToList call, it creates Headline instances that have their AttachedEmails set. But when you enumerate it again later, it creates new Headline instances.
If this is correct, then one solution would be to change the type of allHeadlines to List<Headline>.
A similar problem could occur in GetEmailsFromHeadline, which presumably returns Task<IEnumerable<string>> and not IEnumerable<string> as stated. If the enumerable is deferred, then the only asynchronous part is defining the query; executing it would be after the fact - more specifically, outside the Task.Run. You might want to consider using ToList there, too.
On a side note, the new Func is just noise, and the wrapping of an asynchronous task in Task.Run is quite unusual. This would be more idiomatic:
var headlineProcessTaskList =
allHeadlines
.Select(async e => e.AttachedEmails = await GetEmailsFromHeadline(e.Uri) )
.ToList();
or, if the Task.Run is truly necessary:
var headlineProcessTaskList =
allHeadlines
.Select(e => Task.Run( () => e.AttachedEmails = (await GetEmailsFromHeadline(e.Uri)).ToList() ) )
.ToList();
I want to use TASK for running parallel activities.This task returns a value.How do i use a FUNC inside the task for returning a value?.I get an error at Func.
Task<Row> source = new Task<Row>(Func<Row> Check = () =>
{
var sFields = (
from c in XFactory.Slot("IdentifierA")
where c["SlotID"] == "A100"
select new
{
Field = c["Test"]
});
});
I change my code to
Task source = Task.Factory.StartNew(() =>
{
return
(from c in XFactory.Slot("IdentifierA")
where c["SlotID"] == "A100"
select new
{
Field = c["Test"]
});
}
);
Your basic pattern would look like:
Task<Row> source = Task.Factory.StartNew<Row>(() => ...; return someRow; );
Row row = source.Result; // sync and exception handling