This is an observable sequence that retrieves paginated data from a web service. Each web service response contains a nextRecordsUrl that indicates where to get the next set of records. What is the best way to convert this Observable to something more reusable?
Web services setup:
var auth = new AuthenticationClient ();
await auth.UsernamePasswordAsync (consumerKey, consumerSecret, userName, password + passwordSecurityToken);
var forceClient = new ForceClient (auth.InstanceUrl, auth.AccessToken, auth.ApiVersion);
The Observable:
var observable = Observable.Create<QueryResult<Account>> (async (IObserver<QueryResult<Account>> o) =>
{
try
{
var queryResult = await forceClient.QueryAsync<Account> ("SELECT Id, Name from Account");
if (queryResult != null)
{
o.OnNext (queryResult);
while (!string.IsNullOrEmpty (queryResult.nextRecordsUrl))
{
queryResult = await forceClient.QueryContinuationAsync<Account> (queryResult.nextRecordsUrl);
if (queryResult != null)
{
o.OnNext (queryResult);
}
}
}
o.OnCompleted ();
}
catch (Exception ex)
{
o.OnError (ex);
}
return () => {};
});
Subscribing to the observable and collecting the results:
var accounts = new List<Account> ();
observable.Subscribe (
observer => accounts.AddRange (observer.records),
ex => Console.WriteLine (ex.Message),
() => {});
EDIT: Using Brandon's solution, I can now generate the list of results with Aggregate:
List<Account> accounts = await forceClient.QueryPages<Account> ("SELECT Id, Name from Account")
.Aggregate (new List<Account> (), (list, value) =>
{
list.AddRange (value.records);
return list;
});
Believe it or not, the Rx-Experimental library (also maintained by MS) has an operator for this called Expand. Expand takes each element from an observable and runs it through a function that produces another observable of the same type. That observable is then flattened into the original, and each item from it goes through the same process.
Imagine being given a tree node with an observable of child nodes. You could use expand to easily traverse this tree. Since a linked-list is just a constrained version of a tree, and since what you have is effectively a linked list where each node is an observable, you can use expand.
public static IObservable<QueryResult<TResult>> QueryPages<TResult>(this ForceClient forceClient, string query)
{
return Observable.FromAsync(() => forceClient.QueryAsync<TResult>(query))
.Where(QueryResultIsValid)
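.Expand(result =>
// Stop expanding once there is no further page to request.
string.IsNullOrEmpty(result.nextRecordsUrl)
? Observable.Empty<QueryResult<TResult>>()
: Observable.FromAsync(() => forceClient.QueryContinuationAsync<TResult>(result.nextRecordsUrl))
.Where(QueryResultIsValid)
);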
}
public static bool QueryResultIsValid<TResult>(QueryResult<TResult> result)
{
return result != null;
}
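To make the Expand idea concrete without pulling in Rx, here is a minimal synchronous sketch over IEnumerable. The Page type, the in-memory dictionary standing in for the web service, and this Expand helper are all hypothetical and written for illustration; they are not the Rx operator itself, only an analogue of its "yield the item, then expand it" behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory page: records plus the key of the next page (null on the last page).
public sealed class Page
{
    public int[] Records;
    public string NextUrl;
}

public static class ExpandSketch
{
    // Synchronous stand-in for Expand: yield each item, then pass it to the
    // selector and process whatever the selector returns in the same way.
    public static IEnumerable<T> Expand<T>(this IEnumerable<T> source, Func<T, IEnumerable<T>> selector)
    {
        var queue = new Queue<T>(source);
        while (queue.Count > 0)
        {
            var item = queue.Dequeue();
            yield return item;
            foreach (var child in selector(item))
                queue.Enqueue(child);
        }
    }

    // Walks the linked pages, starting from the page stored under firstUrl.
    public static List<int> AllRecords(IDictionary<string, Page> service, string firstUrl)
    {
        return new[] { service[firstUrl] }
            .Expand(page => page.NextUrl == null
                ? Enumerable.Empty<Page>()          // last page: nothing more to expand
                : new[] { service[page.NextUrl] })  // otherwise "fetch" the next page
            .SelectMany(page => page.Records)
            .ToList();
    }

    public static void Main()
    {
        var service = new Dictionary<string, Page>
        {
            { "/p1", new Page { Records = new[] { 1, 2 }, NextUrl = "/p2" } },
            { "/p2", new Page { Records = new[] { 3, 4 }, NextUrl = "/p3" } },
            { "/p3", new Page { Records = new[] { 5 }, NextUrl = null } },
        };
        Console.WriteLine(string.Join(", ", AllRecords(service, "/p1"))); // 1, 2, 3, 4, 5
    }
}
```

Each page produces at most one child, so the expansion degenerates into a linked-list walk, which is exactly the paginated case.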
Is something like this what you are looking for?
public static IObservable<QueryResult<TResult>> QueryPages<TResult>(this ForceClient forceClient, string query)
{
return Observable.Create<QueryResult<TResult>> (async (observer, token) =>
{
// No need for try/catch. Create() will call OnError if your task fails.
// Also no need for OnCompleted(). Create() calls it when your task completes
var queryResult = await forceClient.QueryAsync<TResult> (query);
while (queryResult != null)
{
observer.OnNext (queryResult);
// check the token *after* we call OnNext
// because if an observer unsubscribes
// it typically occurs during the notification
// e.g. they are using .Take(..) or
// something.
if (string.IsNullOrEmpty(queryResult.nextRecordsUrl) ||
token.IsCancellationRequested)
{
break;
}
queryResult = await forceClient.QueryContinuationAsync<TResult> (queryResult.nextRecordsUrl);
}
// No need to return anything. Just the Task itself is all that Create() wants.
});
}
// Usage:
var forceClient = // ...
var foos = forceClient.QueryPages<Foo>("SELECT A, B, C FROM Foo");
Notice I switched it to the overload that provides a cancellation token so that you can stop fetching pages if the observer unsubscribes (your original version would have continued fetching pages even though the observer had stopped listening). Also note that the async Create awaits your Task and calls OnError or OnCompleted for you, so you do not need to worry about that most of the time.
Related
I am trying to determine if there is a linq expression equivalent of the following foreach statement below?
Two of the several solutions I've tried are commented out, with their results, below the loop. I thought the first example or a variation of it would work, but the type is always IEnumerable<Task> unless I call it synchronously, which just feels wrong.
public async Task<IEnumerable<CompanySettings>> GetClientSettingsAsync()
{
var settings = new List<CompanySettings>();
foreach(var company in await GetParticipatingCompaniesAsync())
{
settings.Add(new CompanySettings(company, await GetSyncDataAsync(company)));
}
// a's Type is IEnumerable<Task<CompanySettings>> and not IEnumerable<CompanySettings>
// var a = (await GetParticipatingCompaniesAsync())
// .Select(async x => new CompanySettings(x, await GetSyncDataAsync(x)));
// return a;
// b's Type is correct it's also synchronous. Does that matter?
// var b = (GetParticipatingCompaniesAsync()).Result
// .Select(x => new CompanySettings(x, GetSyncDataAsync(x).Result));
//return b;
return settings;
}
Signatures of the methods:
private async Task<IEnumerable<UpdoxCompany>> GetParticipatingCompaniesAsync();
private async Task<UpdoxSyncData> GetUpdoxSyncDataAsync(UpdoxCompany company);
You are close with your first attempt; you just need to await the tasks that are returned:
async Task<IEnumerable<CompanySettings>> GetClientSettingsAsync() {
var companies = await GetParticipatingCompaniesAsync();
var companySettings = companies.Select(async company =>
new CompanySettings(company, await GetSyncDataAsync(company))
);
return await Task.WhenAll(companySettings);
}
Yes, you can use Select to achieve the same result.
Here is one way to do it:
public async Task<IEnumerable<CompanySettings>> GetClientSettingsAsync()
{
var companies = await GetParticipatingCompaniesAsync();
var companySettingsTasks = companies.Select(async company =>
new CompanySettings(company, await GetSyncDataAsync(company)));
var settings = await Task.WhenAll(companySettingsTasks);
return settings;
}
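For readers who want to run the Select plus Task.WhenAll pattern in isolation, here is a small self-contained sketch. GetLengthAsync is a hypothetical stand-in for an async lookup such as GetSyncDataAsync, and the names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Hypothetical stand-in for an async lookup such as GetSyncDataAsync.
    public static async Task<int> GetLengthAsync(string name)
    {
        await Task.Delay(10); // simulate I/O
        return name.Length;
    }

    public static async Task<KeyValuePair<string, int>[]> GetAllAsync(IEnumerable<string> names)
    {
        // Select is lazy: each task is created when WhenAll enumerates the sequence.
        IEnumerable<Task<KeyValuePair<string, int>>> tasks = names.Select(async name =>
            new KeyValuePair<string, int>(name, await GetLengthAsync(name)));

        // WhenAll awaits all tasks together and returns the results in source order.
        return await Task.WhenAll(tasks);
    }

    public static void Main()
    {
        var results = GetAllAsync(new[] { "Acme", "Initech" }).GetAwaiter().GetResult();
        foreach (var pair in results)
            Console.WriteLine(pair.Key + ": " + pair.Value);
        // Acme: 4
        // Initech: 7
    }
}
```

Unlike the foreach version, the lookups here run concurrently, so this is only appropriate when the underlying service tolerates parallel calls.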
I have an issue with an endpoint blocking calls from other endpoints in my app. When we call this endpoint, this basically blocks all other api calls from executing, and they need to wait until this is finished.
public async Task<ActionResult> GrantAccesstoUsers()
{
// other operations
var grantResult = await
this._workSpaceProvider.GrantUserAccessAsync(this.CurrentUser.Id).ConfigureAwait(false);
return this.Ok(grantResult);
}
The GrantUserAccessAsync method creates a set of tasks that are meant to run in parallel.
public async Task<List<WorkspaceDetail>> GrantUserAccessAsync(string currentUser)
{
var responselist = new List<WorkspaceDetail>();
try
{
// calling these up front so the results can be reused once the tasks are created
// inexpensive calls
var properlyNamedWorkSpaces = await this._helper.GetProperlyNamedWorkspacesAsync(true).ConfigureAwait(false);
var dbGroups = await this._reportCatalogProvider.GetWorkspaceFromCatalog().ConfigureAwait(false);
var catalogInfo = await this._clientServiceHelper.GetDatabaseConfigurationAsync("our-service").ConfigureAwait(false);
if (properlyNamedWorkSpaces != null && properlyNamedWorkSpaces.Count > 0)
{
// these methods returns tasks for parallel processing
var grantUserContributorAccessTaskList = await this.GrantUserContributorAccessTaskList(properlyNamedWorkSpaces, currentUser, dbGroups, catalogInfo).ConfigureAwait(false);
var grantUserAdminAccessTaskList = await this.GrantUserAdminAccessTaskList(properlyNamedWorkSpaces, currentUser, dbGroups, catalogInfo).ConfigureAwait(false);
var removeInvalidUserAndSPNTaskList = await this.RemoveAccessRightsToWorkspaceTaskList(properlyNamedWorkSpaces, dbGroups, currentUser, catalogInfo).ConfigureAwait(false);
var tasklist = new List<Task<WorkspaceDetail>>();
tasklist.AddRange(grantUserContributorAccessTaskList);
tasklist.AddRange(grantUserAdminAccessTaskList);
tasklist.AddRange(removeInvalidUserAndSPNTaskList);
// Start running Parallel Task
Parallel.ForEach(tasklist, task =>
{
Task.Delay(this._config.CurrentValue.PacingDelay);
task.Start();
});
// Get All Client Workspace Processing Results
var clientWorkspaceProcessingResult = await Task.WhenAll(tasklist).ConfigureAwait(false);
// Populate result
responselist.AddRange(clientWorkspaceProcessingResult.ToList());
}
}
catch (Exception)
{
throw;
}
return responselist;
}
These methods are basically identical in structure and they look like this:
private async Task<List<Task<WorkspaceDetail>>> GrantUserContributorAccessTaskList(List<Group> workspaces, string currentUser, List<WorkspaceManagement> dbGroups, DatabaseConfig catalogInfo)
{
var tasklist = new List<Task<WorkspaceDetail>>();
foreach (var workspace in workspaces)
{
tasklist.Add(new Task<WorkspaceDetail>(() =>
this.GrantContributorAccessToUsers(workspace, currentUser, dbGroups, catalogInfo).Result));
// i added a delay here because we encountered an issue before in production and this seems to solve the problem. this is set to 4ms.
Task.Delay(this._config.CurrentValue.DelayInMiliseconds);
}
return tasklist;
}
The other methods called here look like this:
private async Task<WorkspaceDetail> GrantContributorAccessToUsers(Group workspace, string currentUser, List<Data.ReportCatalogDB.WorkspaceManagement> dbGroups, DatabaseConfig catalogInfo)
{
// This prevents other thread or task to start and prevents exceeding the number of threads allowed
await this._batchProcessor.WaitAsync().ConfigureAwait(false);
var result = new WorkspaceDetail();
try
{
var contributorAccessresult = await this.helper.GrantContributorAccessToUsersAsync(workspace, this._powerBIConfig.CurrentValue.SPNUsers).ConfigureAwait(false);
if (contributorAccessresult != null
&& contributorAccessresult.Count > 0)
{
// do something
}
else
{
// do something
}
// this is done to reuse the call that is being executed in the helper above. it's an expensive call from an external endpoint so we opted to reuse what was used in the initial call, instead of calling it again for this process
var syncWorkspaceAccessToDb = await this.SyncWorkspaceAccessAsync(currentUser, workspace.Id, contributorAccessresult, dbGroups, catalogInfo).ConfigureAwait(false);
foreach (var dbResponse in syncWorkspaceAccessToDb) {
result.ResponseMessage += dbResponse.ResponseMessage;
}
}
catch (Exception ex)
{
this._loghelper.LogEvent(this._logger, logEvent, OperationType.GrantContributorAccessToWorkspaceManager, LogEventStatus.FAIL);
}
finally
{
this._batchProcessor.Release();
}
return result;
}
The last method called writes the record in a database table:
private async Task<List<WorkspaceDetail>> SyncWorkspaceAccessAsync(string currentUser,
Guid workspaceId,
List<GroupUser> groupUsers,
List<WorkspaceManagement> dbGroups,
DatabaseConfig catalogInfo) {
var result = new List<WorkspaceDetail>();
var tasklist = new List<Task<WorkspaceDetail>>();
// get active workspace details from the db
var workspace = dbGroups.Where(x => x.PowerBIGroupId == workspaceId).FirstOrDefault();
try
{
// to auto dispose the provider, we are creating this for each instance because
// having only one instance creates an error when the other task starts running
using (var contextProvider = this._contextFactory.GetReportCatalogProvider(
catalogInfo.Server,
catalogInfo.Database,
catalogInfo.Username,
catalogInfo.Password,
this._dbPolicy))
{
if (workspace != null)
{
// get current group users in the db from the workspace object
var currentDbGroupUsers = workspace.WorkspaceAccess.Where(w => w.Id == workspace.Id
&& w.IsDeleted == false).ToList();
#region identify to process
#region users to add
// identify users to add
var usersToAdd = groupUsers.Where(g => !currentDbGroupUsers.Any(w => w.Id == workspace.Id ))
.Select(g => new WorkspaceAccess
{
// class properties
}).ToList();
#endregion
var addTasks = await this.AddWorkspaceAccessToDbTask(catalogProvider, usersToAdd, workspace.PowerBIGroupId, workspace.WorkspaceName).ConfigureAwait(false);
tasklist.AddRange(addTasks);
// this is a potential fix that i did, hoping adding another parallel thread can solve the problem
Parallel.ForEach(tasklist, new ParallelOptions { MaxDegreeOfParallelism = this._config.CurrentValue.MaxDegreeOfParallelism }, task =>
{
Task.Delay(this._config.CurrentValue.PacingDelay);
task.Start();
});
var processResult = await Task.WhenAll(tasklist).ConfigureAwait(false);
// Populate result
result.AddRange(processResult.ToList());
}
}
}
catch (Exception ex)
{
// handle error
}
return result;
}
I have already tried some potential fixes. For example, the methods here were originally written with Task.FromResult instead of async, so I changed that. The reference is this thread:
Using Task.FromResult v/s await in C#
I also thought it was similar to an issue we faced before, when we were creating the multiple db context connections needed to run parallel tasks, so I added a small delay on the tasks, but that didn't solve the problem.
Task.Delay(this._config.CurrentValue.DelayInMiliseconds);
Any help would be much appreciated.
I assume your this._batchProcessor is an instance of SemaphoreSlim. If your other endpoints somehow call
await this._batchProcessor.WaitAsync()
that means they can't proceed until the semaphore is released.
Another thing I'd like to mention: please avoid using Parallel.ForEach with async/await. The Parallel class is not designed to work with async/await; here is a good answer explaining why you should avoid using them together: Nesting await in Parallel.ForEach
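As a rough illustration of the alternative, here is a minimal sketch that throttles async work with a SemaphoreSlim and Task.WhenAll instead of Parallel.ForEach and Task.Start. RunThrottledAsync and its parameters are hypothetical names chosen for this example, and the semaphore is deliberately local to the call (an assumption, not something taken from the question) so that one request cannot hold up another.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Runs `work` for every item with at most `maxConcurrency` operations in flight.
    public static async Task<TResult[]> RunThrottledAsync<T, TResult>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();   // asynchronous wait: no thread is blocked
                try { return await work(item); }
                finally { gate.Release(); }
            }).ToList();                  // materialize so each task is created exactly once

            // All tasks have finished by the time the semaphore is disposed.
            return await Task.WhenAll(tasks);
        }
    }

    public static void Main()
    {
        var squares = RunThrottledAsync(
            Enumerable.Range(1, 5), 2,
            async n => { await Task.Delay(10); return n * n; }).GetAwaiter().GetResult();
        Console.WriteLine(string.Join(", ", squares)); // 1, 4, 9, 16, 25
    }
}
```

Because the tasks are ordinary async methods, there is no need for new Task(...), Task.Start, or .Result, and an un-awaited Task.Delay is no longer relied on for pacing.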
I am trying to achieve the following functionality using this code:
1. I have a list of items and I want to process them in parallel to speed things up.
2. I also want to wait until all the items in the list have been processed, and then update the status in the database.
private async Task<bool> ProceeData<T>(IList<T> items,int typeId,Func<T, bool> updateRequestCheckPredicate, Func<T, bool> newRequestCheckPredicate)
{
continueFlag = scripts.Count >= 12;
await ProcessItems(items, updateRequestCheckPredicate, newRequestCheckPredicate);
//Wait Until all items get processed and Update Status in database
var updateStatus =UpdateStatus(typeId,DateTime.Now);
return continueFlag;
}
private async Task ProcessItems<T>(IList<T> items,Func<T, bool> updateRequestCheckPredicate, Func<T, bool> newRequestCheckPredicate)
{
var itemsToCreate = items.Where(newRequestCheckPredicate).ToList();
var createTask = scripts
.AsParallel().Select(item => CrateItem(item))
.ToArray();
var createTaskComplete = await Task.WhenAll(createTask);
var itemsToUpdate = items.Where(updateRequestCheckPredicate).ToList();
var updateTask = scripts
.AsParallel().Select(item => UpdateItem(item))
.ToArray();
var updateTaskComplete = await Task.WhenAll(updateTask);
}
private async Task<ResponseResult> CrateItem<T>(T item)
{
var response = new ResponseResult();
Guid requestGuid = Guid.NewGuid();
auditSave = SaveAuditData(requestGuid);
if (auditSaveInfo.IsUpdate)
{
response = await UpdateItem(item);
}
response = await CreateTicket<T>(item);
// Wait response
UpdateAuditData(response);
return response;
}
private async Task<ServiceNowResponseResult> CreateTicket<T>(T item)
{
// Rest call and need to wait for result
var response = await CreateNewTicket<T>(scriptObj, serviceRequestInfo);
return response;
}
I am new to the async/await concept, so could anyone please advise whether this is the right approach? If it is wrong, please help me with some sample code.
All these AsParallel are not needed or desired, but you'd need to change the signature of your callbacks to be async.
Here's an example
async Task ProcessAllItems<T>(IEnumerable<T> items,
Func<T, Task<bool>> checkItem, // an async callback
Func<T, Task> processItem)
{
// if you want to group all the checkItem before any processItem is called
// then do WhenAll(items.Select(checkItem).ToList()) and inspect the result
// the code below executes all checkItem->processItem chains independently
List<Task> checkTasks = items
.Select(i => checkItem(i)
.ContinueWith(t =>
{
if (t.Result)
return processItem(i);
// Unwrap cancels the outer task when the inner task is null, so return a completed task instead
return Task.CompletedTask;
}).Unwrap()) // .Unwrap takes the inner task of a Task<Task<>>
.ToList(); // when making collections of tasks ALWAYS materialize with ToList or ToArray to avoid accidental multiple executions
await Task.WhenAll(checkTasks);
}
And here's how to use it:
var items = Enumerable.Range(0, 10).ToList();
var process = ProcessAllItems(items,
checkItem: async (x) =>
{
await Task.Delay(5);
return x % 2 == 0;
},
processItem: async (x) =>
{
await Task.Delay(1);
Console.WriteLine(x);
});
I'm trying to implement this trivial task of listing all objects in an AmazonS3 bucket with paged requests asynchronously in C#4. I have it working in C#5 using the following snippet:
var listRequest = new ListObjectsRequest().WithBucketName(bucketName);
ListObjectsResponse listResponse = null;
var list = new List<List<S3Object>>();
while (listResponse == null || listResponse.IsTruncated)
{
listResponse = await Task<ListObjectsResponse>.Factory.FromAsync(
client.BeginListObjects, client.EndListObjects, listRequest, null);
list.Add(listResponse.S3Objects);
if (listResponse.IsTruncated)
{
listRequest.Marker = listResponse.NextMarker;
}
}
return list.SelectMany(l => l);
I'm calling the BeginListObjects/EndListObjects pair asynchronously, but I have to repeat that call every time the response says it's truncated. This piece of code works for me.
However, I now want to do this in C#4's TPL, where I don't have the luxury of using async/await and want to understand if this can be done using continuations.
How do I do this same thing in C#4?
Okay, so rather than putting the items into a list with each task/continuation it's easier in a non-await model to just have each task/continuation return the entire sequence. Given that, I used the following helper method to add each one's iterative results onto the aggregate total.
public static Task<IEnumerable<T>> Concat<T>(Task<IEnumerable<T>> first
, Task<IEnumerable<T>> second)
{
return Task.Factory.ContinueWhenAll(new[] { first, second }, _ =>
{
return first.Result.Concat(second.Result);
});
}
Next, I used the following method to take a task of a single result and turn it into a task of a sequence (containing just that one item).
public static Task<IEnumerable<T>> ToSequence<T>(this Task<T> task)
{
var tcs = new TaskCompletionSource<IEnumerable<T>>();
task.ContinueWith(_ =>
{
if (task.IsCanceled)
tcs.SetCanceled();
else if (task.IsFaulted)
tcs.SetException(task.Exception);
else
tcs.SetResult(Enumerable.Repeat(task.Result, 1));
});
return tcs.Task;
}
Note here that you have some fields/locals not defined; I'm assuming you can add them to the appropriate method without difficulty.
private Task<IEnumerable<S3Object>> method(object sender, EventArgs e)
{
ListObjectsResponse listResponse = null;
return Task<ListObjectsResponse>.Factory.FromAsync(
client.BeginListObjects, client.EndListObjects, listRequest, null)
.ToSequence()
.ContinueWith(continuation);
}
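Here is where the real magic happens. Basically, the continuation checks whether the previous response was truncated and, if so, requests the next batch and concatenates it with the results so far: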
public Task<IEnumerable<S3Object>> continuation(Task<IEnumerable<S3Object>> task)
{
if (task.Result == null) //not quite sure what null means here//may need to edit this recursive case
{
return Task<ListObjectsResponse>.Factory.FromAsync(
client.BeginListObjects, client.EndListObjects, listRequest, null)
.ToSequence()
.ContinueWith(continuation);
}
else if (task.Result.First().IsTruncated)
{
//if the results were truncated then concat those results with the next batch
//TODO modify the request marker here; either create a new one or store the request as a field and mutate.
Task<IEnumerable<S3Object>> nextBatch = Task<ListObjectsResponse>.Factory.FromAsync(
client.BeginListObjects, client.EndListObjects, listRequest, null)
.ToSequence()
.ContinueWith(continuation);
return Concat(nextBatch, task);//recursive continuation call
}
else //if we're done it means the existing results are sufficient
{
return task;
}
}
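For comparison, here is a self-contained sketch of paging with continuations only (no async/await). It swaps in a different technique from the one above: instead of concatenating task results, it accumulates items in a list and completes a TaskCompletionSource when the last page arrives. PageResult, ListAll, and fetchPage are hypothetical names standing in for ListObjectsResponse and the BeginListObjects/EndListObjects call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical page type standing in for ListObjectsResponse.
public sealed class PageResult
{
    public List<string> Items;
    public string NextMarker; // null when the listing is complete
}

public static class ContinuationPaging
{
    // Fetches every page without async/await: each continuation either
    // requests the next page or completes the TaskCompletionSource.
    public static Task<List<string>> ListAll(Func<string, Task<PageResult>> fetchPage)
    {
        var tcs = new TaskCompletionSource<List<string>>();
        var all = new List<string>();
        Action<string> fetchNext = null;
        fetchNext = marker =>
        {
            Task<PageResult> pageTask;
            try { pageTask = fetchPage(marker); }
            catch (Exception ex) { tcs.TrySetException(ex); return; }

            pageTask.ContinueWith(t =>
            {
                if (t.IsFaulted) { tcs.TrySetException(t.Exception.InnerExceptions); return; }
                if (t.IsCanceled) { tcs.TrySetCanceled(); return; }
                try
                {
                    all.AddRange(t.Result.Items);
                    if (t.Result.NextMarker != null)
                        fetchNext(t.Result.NextMarker); // more pages: chain another request
                    else
                        tcs.TrySetResult(all);          // last page: publish everything
                }
                catch (Exception ex) { tcs.TrySetException(ex); }
            });
        };
        fetchNext(null); // null marker means "start from the beginning"
        return tcs.Task;
    }

    public static void Main()
    {
        var pages = new Dictionary<string, PageResult>
        {
            { "", new PageResult { Items = new List<string> { "a", "b" }, NextMarker = "m1" } },
            { "m1", new PageResult { Items = new List<string> { "c" }, NextMarker = null } },
        };
        Func<string, Task<PageResult>> fetch = marker =>
            Task.Factory.StartNew(() => pages[marker ?? ""]);
        Console.WriteLine(string.Join(", ", ListAll(fetch).Result)); // a, b, c
    }
}
```

Pages are requested strictly one after another, so the shared list is never touched by two continuations at once.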
I'm learning Reactive Extensions, and I've been trying to find out if it's a match for a task like this.
I have a Process() method that processes a batch of requests as a unit of work and invokes a callback when all requests have completed.
The important thing here is that each request will call the callback either synchronously or asynchronously depending on its implementation, and the batch processor must be able to handle both.
But no threads are started from the batch processor; any new threads (or other async execution) will be initiated from inside the request handlers if necessary. I don't know if this matches the use cases of Rx.
My current working code looks (almost) like this:
public void Process(ICollection<IRequest> requests, Action<List<IResponse>> onCompleted)
{
IUnitOfWork uow = null;
try
{
uow = unitOfWorkFactory.Create();
var responses = new List<IResponse>();
var outstandingRequests = requests.Count;
foreach (var request in requests)
{
var correlationId = request.CorrelationId;
Action<IResponse> requestCallback = response =>
{
response.CorrelationId = correlationId;
responses.Add(response);
outstandingRequests--;
if (outstandingRequests != 0)
return;
uow.Commit();
onCompleted(responses);
};
requestProcessor.Process(request, requestCallback);
}
}
catch(Exception)
{
if (uow != null)
uow.Rollback();
}
if (uow != null)
uow.Commit();
}
How would you implement this using rx? Is it reasonable?
Note, that the unit of work is to be committed synchronously even if there are async requests that have not yet returned.
My approach to this is two-step.
First create a general-purpose operator that turns Action<T, Action<R>> into Func<T, IObservable<R>>:
public static class ObservableEx
{
public static Func<T, IObservable<R>> FromAsyncCallbackPattern<T, R>(
this Action<T, Action<R>> call)
{
if (call == null) throw new ArgumentNullException("call");
return t =>
{
var subject = new AsyncSubject<R>();
try
{
Action<R> callback = r =>
{
subject.OnNext(r);
subject.OnCompleted();
};
call(t, callback);
}
catch (Exception ex)
{
return Observable.Throw<R>(ex, Scheduler.ThreadPool);
}
return subject.AsObservable<R>();
};
}
}
Next, turn the call void Process(ICollection<IRequest> requests, Action<List<IResponse>> onCompleted) into IObservable<IResponse> Process(IObservable<IRequest> requests):
public IObservable<IResponse> Process(IObservable<IRequest> requests)
{
Func<IRequest, IObservable<IResponse>> rq2rp =
ObservableEx.FromAsyncCallbackPattern
<IRequest, IResponse>(requestProcessor.Process);
var query = (
from rq in requests
select rq2rp(rq)).Concat();
var uow = unitOfWorkFactory.Create();
var subject = new ReplaySubject<IResponse>();
query.Subscribe(
r => subject.OnNext(r),
ex =>
{
uow.Rollback();
subject.OnError(ex);
},
() =>
{
uow.Commit();
subject.OnCompleted();
});
return subject.AsObservable();
}
Now, not only does this run the processing async, but it also ensures the correct order of the results.
In fact, since you are starting with a collection, you could even do this:
var rqs = requests.ToObservable();
var rqrps = rqs.Zip(Process(rqs),
(rq, rp) => new
{
Request = rq,
Response = rp,
});
Then you would have an observable that pairs up each request/response without the need for a CorrelationId property.
I hope this helps.
This is part of the genius of Rx, as you're free to return results either synchronously or asynchronously:
public IObservable<int> AddNumbers(int a, int b) {
return Observable.Return(a + b);
}
public IObservable<int> AddNumbersAsync(int a, int b) {
return Observable.Start(() => a + b, Scheduler.NewThread);
}
They both have the IObservable type, so they work identically. If you want to find out when all IObservables complete, Aggregate will do this, as it will turn 'n' items in an Observable into 1 item that is returned at the end:
IObservable<int>[] listOfObservables = // ...
listOfObservables.ToObservable()
.Merge()
.Aggregate(0, (acc, x) => acc+1)
.Subscribe(x => Console.WriteLine("{0} items were run", x));