I am using ManualResetEventSlim as a signaling mechanism in my application, and it works great at around 100 requests/sec. As I increase the requests/sec, it gets worse.
Example:
100 requests/sec -> 90% of transactions complete in 250 ms, and throughput (successful requests/sec) is 134.
150 requests/sec -> 90% of transactions complete in 34,067 ms, and throughput (successful requests/sec) is 2.2.
I use a ConcurrentDictionary as given below:
// <key, (responseString,ManualResetEventSlim) >
private static ConcurrentDictionary<string, (string, ManualResetEventSlim)> EventsDict = new ConcurrentDictionary<string, (string, ManualResetEventSlim)>();
The process given below describes the need for ManualResetEventSlim (Api Solution 1 and Api Solution 2 are completely separate solutions):
Api Solution 1 (REST API) receives a request, adds an element (null, ManualResetEventSlim) to the ConcurrentDictionary against a key, and calls a third-party service (SOAP) using async/await. The third-party SOAP API returns an acknowledgement response while the actual response is still pending. After getting the acknowledgement response, the code blocks on ManualResetEventSlim.Wait().
Once the third party has processed the request, it calls Api Solution 2 (SOAP) through an exposed method and sends the actual response. Api Solution 2 sends the response to Api Solution 1 (REST API) by making an HTTP request and then inserts the data into the database for the audit log.
Api Solution 1 gets the key from the response string, updates the response string in the ConcurrentDictionary, and sets the signal.
Api Solution 1 disposes the ManualResetEventSlim object before returning the response to the client.
I think you should be able to get rid of the blocking code by replacing (string, ManualResetEventSlim) with TaskCompletionSource<string>:
In Solution 1, you would do something along these lines:
TaskCompletionSource<string> tcs = new TaskCompletionSource<string>();
EventsDict.TryAdd( key, tcs ); // the dictionary now maps key -> TaskCompletionSource<string>
await KickOffSolution2ThirdParty( /*...*/ );
string result = await tcs.Task; // <-- now not blocking any thread anymore
And the counterpart:
void CallbackFromSolution2( string key, string result )
{
if( EventsDict.TryRemove( key, out TaskCompletionSource<string> tcs ) )
{
tcs.SetResult(result);
}
}
This is of course only a coarse outline of the idea. But hopefully enough to make my line of thought understandable. I cannot test this right now, so any improvements/corrections welcome.
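One refinement worth considering (my assumption, not something from the question): construct the TaskCompletionSource with TaskCreationOptions.RunContinuationsAsynchronously, so that the thread handling the callback from Solution 2 does not end up running the awaiting request's continuation inline when it calls SetResult:
// Sketch: prevents SetResult from executing the awaiting continuation
// synchronously on the callback thread, which can hurt throughput under load.
var tcs = new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);
EventsDict.TryAdd(key, tcs);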
Related
I have a web app that connects to an external API.
That API has a limit of 3 connections per second.
I have a method that gets employee data for a whole factory.
It works fine, but I've found that if a particular factory has a lot of employees, I hit the API connection limit and get an error.
(429) API calls exceeded...maximum 3 per Second
So I decided to use await Task.Delay(1000) to add a one-second delay every time this method is used.
Now it seems to have reduced the number of errors I get, but I am still getting a few limit errors.
Is there another method I could use to ensure my limit is not reached?
Here is my code:
public async Task<YourSessionResponder> GetAll(Guid factoryId)
{
UserSession.AuthData sessionManager = new UserSession.AuthData
{
UserName = "xxxx",
Password = "xxxx"
};
ISessionHandler sessionMgr = new APIclient();
YourSessionResponder response;
response = await sessionMgr.GetDataAsync(sessionManager, new ListerRequest
{
FactoryId = factoryId
});
await Task.Delay(1000);
return response;
}
I call it like this:
var yourEmployees = GetAll(factoryId);
I have a web app that connects to an external API.
Your current code limits the number of outgoing requests made by a single incoming request to your API. What you need to do is limit all of your outgoing requests, app-wide.
It's possible to do this using a SemaphoreSlim:
private static readonly SemaphoreSlim Mutex = new(1);
public async Task<YourSessionResponder> GetAll(Guid factoryId)
{
...
YourSessionResponder response;
await Mutex.WaitAsync();
try
{
response = await sessionMgr.GetDataAsync(...);
await Task.Delay(1000);
}
finally
{
Mutex.Release();
}
return response;
}
But I would take a different approach...
Is there another method I could use to ensure my limit is not reached?
Generally, I recommend just retrying on 429 errors, using de-correlated jittered exponential backoff (see Polly for an easy implementation). That way, when you're "under budget" for the time period, your requests go through immediately, and they only slow down when you hit your API limit.
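As a rough sketch of that idea (assuming the Polly and Polly.Contrib.WaitAndRetry NuGet packages, with an HttpClient call standing in for your sessionMgr.GetDataAsync call):
// Sketch: retry on HTTP 429 using decorrelated jittered backoff (Polly).
IEnumerable<TimeSpan> delays = Backoff.DecorrelatedJitterBackoffV2(
    medianFirstRetryDelay: TimeSpan.FromSeconds(1), retryCount: 5);
IAsyncPolicy<HttpResponseMessage> retryPolicy = Policy
    .HandleResult<HttpResponseMessage>(r => (int)r.StatusCode == 429)
    .WaitAndRetryAsync(delays);
HttpResponseMessage response = await retryPolicy.ExecuteAsync(
    () => httpClient.GetAsync(requestUri));
The delays only kick in after a 429, so requests that are under the limit are not slowed down at all.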
From a comment on the question:
I am calling it like this: var yourEmployees = GetAll(factoryId);
Then you're not awaiting the task. While there's a 1-second delay after each network operation, you're still firing off all of the network operations in rapid succession. You need to await the task before moving on to the next one:
var yourEmployees = await GetAll(factoryId);
Assuming that this is happening in some kind of loop or repeated operation, of course. Otherwise, where would all of these different network tasks be coming from? Whatever high-level logic is invoking the multiple network operations, that logic needs to await one before moving on to the next.
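For instance, if the calls are driven by a loop over factories (factoryIds here is an assumed collection, not something from the question), the awaited version would look like this:
// Sketch: awaiting each call means the 1-second delay inside GetAll
// actually spaces out the outgoing requests.
foreach (var factoryId in factoryIds)
{
    var yourEmployees = await GetAll(factoryId);
    // ... use yourEmployees ...
}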
I am currently working on a WebApi using .NET Core. One of my API methods calls a number of other (third-party) APIs, and it takes some time to return a response, but I don't want our API consumers to wait for it; instead I want to return an early response, i.e. "the operation has started", and provide an endpoint through which consumers can get the status of that operation. For example, a consumer calls the API to generate 100k records, for which my API makes around 20 parallel calls to the third-party API. So I don't want the consumer to wait for these 20 API responses.
Currently I have this code:
public async Task<ActionResult> GenerateVouchers([FromBody][Required]CreateVoucherRequestModel request, string clientId)
{
_logger.LogInformation(Request.Method, Request.Path);
// await _voucherService.ValidateIdempotedKeyWithStatus(clientId, _idempotentHeader);
//TODO: Check Voucher type & Status before Generating Voucher
var watch = Stopwatch.StartNew();
var vouchers = new List<VoucherCreateResponseModel>();
var batchSize = 5000;
int numberOfBatches = (int)Math.Ceiling((double)request.quantity / batchSize);
int totalVoucherQuantity = request.quantity;
request.quantity = 5000;
var tasks = new List<Task<VoucherCreateResponseModel>>();
for (int i = 0; i < numberOfBatches; i++)
{
tasks.Add(_client.GenerateVoucher($"CouponsCreate", request));
vouchers.AddRange(await Task.WhenAll(tasks).ConfigureAwait(false));
}
// await _voucherService.GenerateVouchers(request, clientId, _idempotentHeader);
watch.Stop();
var totalMS = watch.ElapsedMilliseconds;
return Ok();
}
But the issue with the above code is that, even though I have ConfigureAwait(false), it waits for all 20 requests to execute, and only when the responses of all requests are returned does the API consumer get a response. Each of these 20 requests takes around 5 seconds to execute, so our consumers may get a request timeout while waiting.
How can I fix this issue in .NET Core?
It's not a good practice to wait for a long-running process inside a controller.
My opinion is:
Put the data necessary for the long-running process (something like an Id for a batch) into an Azure queue within the API.
Trigger a Function App from that particular queue, so the API's only responsibility is putting the data into the queue.
From there on it's the Function App's responsibility to complete the process.
Maybe using something like SignalR you can notify the frontend when the process is completed (a minimal sketch of the API side follows).
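A minimal sketch of that shape, assuming the Azure.Storage.Queues package plus a hypothetical status store; _queueClient, _statusStore, and the route names are placeholders, not part of the original code:
[HttpPost]
public async Task<ActionResult> GenerateVouchers(
    [FromBody][Required] CreateVoucherRequestModel request, string clientId)
{
    var operationId = Guid.NewGuid();
    // Persist an initial "Started" status so the consumer can poll it later (hypothetical store).
    await _statusStore.SetStatusAsync(operationId, "Started");
    // Hand the work to the queue; a Function App (or hosted service) picks it up.
    var message = JsonSerializer.Serialize(new { operationId, clientId, request.quantity }); // System.Text.Json
    await _queueClient.SendMessageAsync(message); // Azure.Storage.Queues QueueClient
    // Return immediately with a pointer to the status endpoint.
    return Accepted(new { operationId });
}
[HttpGet("vouchers/status/{operationId}")]
public async Task<ActionResult> GetStatus(Guid operationId) =>
    Ok(await _statusStore.GetStatusAsync(operationId));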
Background
We have a service operation that can receive concurrent asynchronous requests and must process those requests one at a time.
In the following example, the UploadAndImport(...) method receives concurrent requests on multiple threads, but its calls to the ImportFile(...) method must happen one at a time.
Layperson Description
Imagine a warehouse with many workers (multiple threads). People (clients) can send the warehouse many packages (requests) at the same time (concurrently). When a package comes in a worker takes responsibility for it from start to finish, and the person who dropped off the package can leave (fire-and-forget). The workers' job is to put each package down a small chute, and only one worker can put a package down a chute at a time, otherwise chaos ensues. If the person who dropped off the package checks in later (polling endpoint), the warehouse should be able to report on whether the package went down the chute or not.
Question
The question then is how to write a service operation that...
can receive concurrent client requests,
receives and processes those requests on multiple threads,
processes requests on the same thread that received the request,
processes requests one at a time,
is a one way fire-and-forget operation, and
has a separate polling endpoint that reports on request completion.
We've tried the following and are wondering two things:
Are there any race conditions that we have not considered?
Is there a more canonical way to code this scenario in C#.NET with a service oriented architecture (we happen to be using WCF)?
Example: What We Have Tried
This is the service code that we have tried. It works, though it feels like somewhat of a hack or kludge.
static ImportFileInfo _inProgressRequest = null;
static readonly ConcurrentDictionary<Guid, ImportFileInfo> WaitingRequests =
new ConcurrentDictionary<Guid, ImportFileInfo>();
public void UploadAndImport(ImportFileInfo request)
{
// Receive the incoming request
WaitingRequests.TryAdd(request.OperationId, request);
while (null != Interlocked.CompareExchange(ref _inProgressRequest, request, null))
{
// Wait for any previous processing to complete
Thread.Sleep(500);
}
// Process the incoming request
ImportFile(request);
Interlocked.Exchange(ref _inProgressRequest, null);
WaitingRequests.TryRemove(request.OperationId, out _);
}
public bool UploadAndImportIsComplete(Guid operationId) =>
!WaitingRequests.ContainsKey(operationId);
This is example client code.
private static async Task UploadFile(FileInfo fileInfo, ImportFileInfo importFileInfo)
{
using (var proxy = new Proxy())
using (var stream = new FileStream(fileInfo.FullName, FileMode.Open, FileAccess.Read))
{
importFileInfo.FileByteStream = stream;
proxy.UploadAndImport(importFileInfo);
}
await Task.Run(() => Poller.Poll(timeoutSeconds: 90, intervalSeconds: 1, func: () =>
{
using (var proxy = new Proxy())
{
return proxy.UploadAndImportIsComplete(importFileInfo.OperationId);
}
}));
}
It's hard to write a minimum viable example of this in a Fiddle, but here is a start that gives a sense of it and that compiles.
As before, the above seems like a hack/kludge, and we are asking both about potential pitfalls in its approach and for alternative patterns that are more appropriate/canonical.
A simple solution using the producer-consumer pattern to pipe requests, in case of thread-count restrictions.
You still have to implement a simple progress reporter or event. I suggest replacing the expensive polling approach with asynchronous communication, which is offered by Microsoft's SignalR library. It uses WebSockets to enable async behavior. The client and server can register their callbacks on a hub. Using RPC, the client can invoke server-side methods and vice versa. You would post progress to the client by using the hub (client side). In my experience SignalR is very simple to use and very well documented. It has libraries for all the popular server-side languages (e.g. Java).
Polling, in my understanding, is the total opposite of fire-and-forget: you can't forget, because you have to check something on an interval. Event-based communication, like SignalR, is fire-and-forget, since you fire and then get a reminder (because you forgot). The "event side" invokes your callback instead of you waiting to do it yourself!
Requirement 5 is ignored, since I didn't see any reason for it; waiting for a thread to complete would eliminate the fire-and-forget character.
private BlockingCollection<ImportFileInfo> requestQueue = new BlockingCollection<ImportFileInfo>();
private bool isServiceEnabled;
private const int maxNumberOfThreads = 8;
private Semaphore semaphore = new Semaphore(maxNumberOfThreads, maxNumberOfThreads); // initial and maximum count
private readonly object syncLock = new object();
public void UploadAndImport(ImportFileInfo request)
{
// Start the request handler background loop
if (!this.isServiceEnabled)
{
this.requestQueue?.Dispose();
this.requestQueue = new BlockingCollection<ImportFileInfo>();
// Fire and forget (requirement 4)
Task.Run(() => HandleRequests());
this.isServiceEnabled = true;
}
// Cache multiple incoming client requests (requirement 1) (and enable throttling)
this.requestQueue.Add(request);
}
private void HandleRequests()
{
while (!this.requestQueue.IsCompleted)
{
// Wait while thread limit is exceeded (some throttling)
this.semaphore.WaitOne();
// Process the incoming requests in a dedicated thread (requirement 2) until the BlockingCollection is marked completed.
Task.Run(() => ProcessRequest());
}
// Reset the request handler after BlockingCollection was marked completed
this.isServiceEnabled = false;
this.requestQueue.Dispose();
}
private void ProcessRequest()
{
ImportFileInfo request = this.requestQueue.Take();
UploadFile(request);
// You updated your question saying the method "ImportFile()" requires synchronization.
// This a bottleneck and will significantly drop performance, when this method is long running.
lock (this.syncLock)
{
ImportFile(request);
}
this.semaphore.Release();
}
Remarks:
BlockingCollection is an IDisposable.
TODO: You have to "close" the BlockingCollection by marking it completed ("BlockingCollection.CompleteAdding()"), or the handler will loop indefinitely, waiting for further requests (a minimal shutdown sketch follows these remarks). Maybe introduce an additional request method that lets the client cancel and/or update the process and mark adding to the BlockingCollection as completed. Or use a timer that waits for an idle period before marking it completed. Or make your request-handler thread block or spin.
Replace Take() and Add(...) with TryTake(...) and TryAdd(...) if you want cancellation support
Code is not tested
Your "ImportFile()" method is a bottleneck in your multi threading environment. I suggest to make it thread safe. In case of I/O that requires synchronization, I would cache the data in a BlockingCollection and then write them to I/O one by one.
The problem is that your total bandwidth is very small (only one job can run at a time) and you want to handle parallel requests. That means queue time could vary wildly. It may not be the best choice to implement your job queue in memory, as it would make your system much more brittle and more difficult to scale out when your business grows.
A traditional, scaleable way to architect this would be:
An HTTP service to accept requests, load balanced/redundant, with no session state.
A SQL Server database to persist the requests in a queue, returning a persistent unique job ID.
A Windows service to process the queue, one job at a time, and mark jobs as complete. The worker process for the service would probably be single-threaded.
This solution requires you to choose a web server. A common choice is IIS running ASP.NET. On that platform, each request is guaranteed to be handled in a single-threaded manner (i.e. you don't need to worry about race conditions too much), but due to a feature called thread agility the request might end on a different thread than it started on, albeit in the original synchronization context, which means you will probably never notice unless you are debugging and inspecting thread IDs.
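For illustration only, the Windows-service worker under that architecture might look roughly like this; IJobQueueRepository, ImportJob, and their members are hypothetical placeholders for whatever persistence you choose:
// Sketch: a single-threaded worker draining a persisted job queue, one job at a time.
async Task RunWorkerAsync(IJobQueueRepository jobs, CancellationToken cancellationToken)
{
    while (!cancellationToken.IsCancellationRequested)
    {
        ImportJob job = await jobs.TryClaimNextPendingJobAsync(); // returns null when the queue is empty
        if (job == null)
        {
            await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken); // idle poll
            continue;
        }
        ImportFile(job.Request);              // the existing single-threaded import
        await jobs.MarkCompleteAsync(job.Id); // lets the polling endpoint report completion
    }
}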
Given the constraints and context of our system, this is the implementation we ended up using:
static ImportFileInfo _importInProgressItem = null;
static readonly ConcurrentQueue<ImportFileInfo> ImportQueue =
new ConcurrentQueue<ImportFileInfo>();
public void UploadAndImport(ImportFileInfo request) {
UploadFile(request);
ImportFileSynchronized(request);
}
// Synchronize the file import,
// because the database allows a user to perform only one write at a time.
private void ImportFileSynchronized(ImportFileInfo request) {
ImportQueue.Enqueue(request);
do {
ImportQueue.TryPeek(out var next);
if (null != Interlocked.CompareExchange(ref _importInProgressItem, next, null)) {
// Queue processing is already under way in another thread.
return;
}
ImportFile(next);
ImportQueue.TryDequeue(out _);
Interlocked.Exchange(ref _importInProgressItem, null);
}
while (ImportQueue.Any());
}
public bool UploadAndImportIsComplete(Guid operationId) =>
ImportQueue.All(waiting => waiting.OperationId != operationId);
This solution works well for the loads we are expecting. That load involves a maximum of about 15-20 concurrent PDF file uploads. The batch of up to 15-20 files tends to arrive all at once, and then things go quiet for several hours until the next batch arrives.
Criticism and feedback is most welcome.
I have two internal processes which I use to upload long sdos strings to an API. Process 1 reads these from another stream. Process 1 (client) sends strings to process 2 (server) via a [ServiceContract] and a [MessageContract]. Process 2 then sends this to an API which in turn processes the sdos and uploads to a server.
[MessageContract]
public class CallRequestMessage
{
[MessageHeader]
public string Sdos;
[MessageHeader]
public int ArrayLength;
[MessageBodyMember]
public Stream SdosStream;
}
[MessageContract]
public class CallResponseMessage
{
[MessageHeader]
public Task<ResultCode> Task;
}
Since the bulk of the time processing the string is in the API, I want to try and return a Task<ResultCode> from my server that will get a result from the API once the processing has concluded. Then my threads can work on client-side processing (in this case, reading the sdos strings from a stream input).
My problem is that the tasks returned to the client seem to be different to the ones that I create on the server. On the server I have the code
task = Task<ResultCode>.Factory.StartNew(() =>
{
ResultCode res;
lock (SyncObject)
res = upload(/* input */);
return res;
});
// ...other code
return new CallResponseMessage { Task = task };
where upload is a method in the API, accessed by process 2 by using a [DllImportAttribute].
Using logs I have seen that the task does complete on the server (all sdos are uploaded); however, on the client side all tasks appear not to have started, so retrieving the results directly is not possible.
An alternative approach that I thought of would be to return nothing from the server, and add a separate method that retrospectively goes to the server, awaits the tasks, and returns an aggregated result. I would like to try and get the task back directly, though, as this implementation may be a model for future services in my projects.
Thank you for any help.
There are no Task instances across process boundaries. The server's task is the task that sends the data to the client; the client's task is the task that receives the data. If you use the async methods on the auto-generated WCF clients, by default WCF will not stream the data from server to client, so your normal flow will be:
Start client task -> Send request -> Start server task -> End server task -> Send response -> End client task
In order for the server-side work to be performed asynchronously, you can design your service methods with the task-based asynchronous pattern (TAP). This example is from the official documentation:
public class SampleService:ISampleService
{
// ...
public async Task<string> SampleMethodTaskAsync(string msg)
{
return await Task<string>.Factory.StartNew(() =>
{
return msg;
});
}
// ...
}
The benefit of tasks on the client and server is not so much that the client can receive while the server sends the data, but that the server can process more incoming requests while other requests are waiting for long-running operations (e.g. data access), and the client can do something useful while the data is received.
Your options are:
Use separate asynchronous server and client operations
Unless you are transferring large amounts of data and performance is critical, there is nothing wrong with the current situation. You can still use tasks for async programming. However, your approach of returning a task won't work. Use the described combination of async service methods and the auto-generated async client methods. You will essentially achieve the same result, which is that both the client and the server perform the operation asynchronously.
Stream the data
If you must start processing on the client while the server is sending the data (which only brings a practical benefit for large amounts of data), you can stream the data from the server. This topic is too large to cover here, but a good starting point is the official documentation.
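As an untested sketch of the streaming option (illustrative names; the binding values are assumptions, not recommendations):
// Sketch: a streamed contract plus a binding configured for streamed transfer.
[ServiceContract]
public interface ISdosService
{
    [OperationContract]
    Stream DownloadResults(string key); // the returned Stream becomes the message body
}
public static class SdosBindingFactory
{
    public static BasicHttpBinding CreateStreamedBinding() => new BasicHttpBinding
    {
        TransferMode = TransferMode.Streamed,        // stream instead of buffering whole messages
        MaxReceivedMessageSize = 1024L * 1024 * 1024 // allow large payloads (1 GB)
    };
}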
Background: We have import-functions that can take anywhere from a few seconds to 1-2 hours to run depending on the file being imported. We want to expose a new way of triggering imports, via a REST request.
Ideally the REST service would be called, trigger the import, and reply with a result when done. My question is: since it can take up to two hours to run, is it possible to reply like this, or will the request time out for the caller? Is there a better way to handle this kind of operation?
What I use in these cases is an asynchronous operation that returns no result (a void result in the case of a C# Web API), and then sends the result asynchronously using a message queue.
E.g.
[HttpPut]
[Route("update")]
public void Update()
{
var task = Task.Run(() => this.engine.Update());
task.ContinueWith(t => publish(t, "Update()"));
}