Polly rate limiting too early [duplicate] - c#

I'm trying to get my head around the Polly rate-limit policy.
public class RateLimiter
{
    private readonly AsyncRateLimitPolicy _throttlingPolicy;
    private readonly Action<string> _rateLimitedAction;

    public RateLimiter(int numberOfExecutions, TimeSpan perTimeSpan, Action<string> rateLimitedAction)
    {
        _throttlingPolicy = Policy.RateLimitAsync(numberOfExecutions, perTimeSpan);
        _rateLimitedAction = rateLimitedAction;
    }

    public async Task<T> Throttle<T>(Func<Task<T>> func)
    {
        var result = await _throttlingPolicy.ExecuteAndCaptureAsync(func);
        if (result.Outcome == OutcomeType.Failure)
        {
            var retryAfter = (result.FinalException as RateLimitRejectedException)?.RetryAfter ?? TimeSpan.FromSeconds(1);
            _rateLimitedAction($"Rate limited. Should retry in {retryAfter}.");
            return default;
        }
        return result.Result;
    }
}
In my console application, I'm instantiating a RateLimiter with up to 5 calls per 10 seconds.
var rateLimiter = new RateLimiter(5, TimeSpan.FromSeconds(10), err => Console.WriteLine(err));
var rdm = new Random();

while (true)
{
    var result = await rateLimiter.Throttle(() => Task.FromResult(rdm.Next(1, 10)));
    if (result != default) Console.WriteLine($"Result: {result}");
    await Task.Delay(200);
}
I would expect to see 5 results and be rate limited on the 6th one, but this is what I get:
Result: 9
Rate limited. Should retry in 00:00:01.7744615.
Rate limited. Should retry in 00:00:01.5119933.
Rate limited. Should retry in 00:00:01.2313921.
Rate limited. Should retry in 00:00:00.9797322.
Rate limited. Should retry in 00:00:00.7309150.
Rate limited. Should retry in 00:00:00.4812646.
Rate limited. Should retry in 00:00:00.2313643.
Result: 7
Rate limited. Should retry in 00:00:01.7982864.
Rate limited. Should retry in 00:00:01.5327321.
Rate limited. Should retry in 00:00:01.2517093.
Rate limited. Should retry in 00:00:00.9843077.
Rate limited. Should retry in 00:00:00.7203371.
Rate limited. Should retry in 00:00:00.4700262.
Rate limited. Should retry in 00:00:00.2205184.
I've also tried to use ExecuteAsync instead of ExecuteAndCaptureAsync and it didn't change the results.
public async Task<T> Throttle<T>(Func<Task<T>> func)
{
    try
    {
        var result = await _throttlingPolicy.ExecuteAsync(func);
        return result;
    }
    catch (RateLimitRejectedException ex)
    {
        _rateLimitedAction($"Rate limited. Should retry in {ex.RetryAfter}.");
        return default;
    }
}
This doesn't make any sense to me. Is there something I'm missing?

The rate limiter works a bit differently than you might expect. The intuitively expected behaviour would be the following:
Let's suppose I have 500 requests and I want to throttle them to 50 per minute
In that case, after the first 50 executions the rate limiter should kick in, provided they were executed in less than a minute
This intuitive approach does not take into account the even distribution of the incoming load. That might lead to the following observable behaviour:
Let's suppose the first 50 executions took 30 seconds
Then you have to wait another 30 seconds to execute the 51st request
Polly's rate limiter uses the Leaky bucket algorithm
This works in the following way:
The bucket has a fixed capacity
The bucket has a leak at the bottom
Water drops are leaving the bucket on a given frequency
The bucket can receive new water drops from top
The bucket can overflow if the incoming frequency is greater than the outgoing
So, technically speaking:
it is a fixed-size queue
the dequeue is called periodically
if the queue is full then the enqueue throws an exception
The most important information from the above description is the following: the leaky bucket algorithm uses a constant rate to empty the bucket.
UPDATE 14/11/22
Let me correct myself: Polly's rate limiter uses a token bucket, not a leaky bucket. There are also other algorithms, like fixed window counter, sliding window log, or sliding window counter. You can read about the alternatives here or in chapter 4 of the System Design Interview Volume 1 book.
So, let's talk about the token bucket algorithm:
The bucket has a fixed capacity
Tokens are put into the bucket at a fixed periodic rate
If the bucket is full, no more tokens are added to it (overflow)
Each request tries to consume a single token
If there is at least one then the request consumes it and the request is allowed
If there isn't at least one token inside the bucket then the request is dropped
(Source)
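To make the algorithm more tangible, here is a minimal token-bucket sketch. It only illustrates the idea described above (the names are made up); it is not Polly's actual lock-free implementation:

using System;

// Refills fractional tokens based on elapsed time (one token every "onePer"),
// caps them at "capacity", and lets each call consume a single token.
public sealed class SimpleTokenBucket
{
    private readonly TimeSpan _onePer;
    private readonly int _capacity;
    private readonly object _gate = new object();
    private double _tokens;
    private DateTime _lastRefill = DateTime.UtcNow;

    public SimpleTokenBucket(TimeSpan onePer, int capacity)
    {
        _onePer = onePer;
        _capacity = capacity;
        _tokens = capacity;
    }

    public bool TryAcquire(out TimeSpan retryAfter)
    {
        lock (_gate)
        {
            // Top up the bucket according to how much time has passed.
            var now = DateTime.UtcNow;
            _tokens = Math.Min(_capacity, _tokens + (now - _lastRefill).Ticks / (double)_onePer.Ticks);
            _lastRefill = now;

            if (_tokens >= 1)
            {
                _tokens -= 1; // the request consumes a token and is allowed
                retryAfter = TimeSpan.Zero;
                return true;
            }

            // No token available: report how long until the next one arrives.
            retryAfter = TimeSpan.FromTicks((long)((1 - _tokens) * _onePer.Ticks));
            return false;
        }
    }
}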
If we scrutinise the implementation then we can see the following things:
The RateLimiterPolicy calls the RateLimiterEngine's static method
The RateLimiterEngine calls a method on an IRateLimiter interface
There is only one class (at the time of writing) which implements this interface
The RateLimiterFactory exposes a method to create LockFreeTokenBucketRateLimiter
public static IRateLimiter Create(TimeSpan onePer, int bucketCapacity)
=> new LockFreeTokenBucketRateLimiter(onePer, bucketCapacity);
Please be aware of how the parameters are named (onePer and bucketCapacity)!
If you are interested in the actual implementation, you can find it here. (Almost every line is commented.)
I want to emphasize one more thing. The rate limiter does not perform any retry. If you want to continue the execution after the penalty time is over then you have to do it yourself. Either by writing some custom code or by combining a retry policy with the rate limiter policy.
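For example, here is a sketch of combining the two: an outer retry that waits for the RetryAfter reported by the inner rate limiter. The retry count and the RetryAfter-aware WaitAndRetryAsync overload are illustrative only; adjust them to your own needs:

// Outer policy: retry up to 3 times, sleeping for the RetryAfter the limiter reports.
var retryPolicy = Policy
    .Handle<RateLimitRejectedException>()
    .WaitAndRetryAsync(
        retryCount: 3,
        sleepDurationProvider: (attempt, exception, context) =>
            (exception as RateLimitRejectedException)?.RetryAfter ?? TimeSpan.FromSeconds(1),
        onRetryAsync: (exception, sleep, attempt, context) => Task.CompletedTask);

// Inner policy: the rate limiter itself.
var rateLimitPolicy = Policy.RateLimitAsync(5, TimeSpan.FromSeconds(10));

// Rejected executions bubble up to the retry, which waits and tries again.
var combined = Policy.WrapAsync(retryPolicy, rateLimitPolicy);
var value = await combined.ExecuteAsync(() => Task.FromResult(42));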

There is an overload accepting a third parameter, maxBurst:
The maximum number of executions that will be permitted in a single burst (for example if none have been executed for a while).
The default value is 1. If you set it to numberOfExecutions you will see the desired effect for the first execution, though after that it will deteriorate to a pattern similar to the one you observe. I would guess this comes from how the limiter "frees" the resources and from the var onePer = TimeSpan.FromTicks(perTimeSpan.Ticks / numberOfExecutions); calculation. I have not dug too deep, but based on the docs and code it seems that rate limiting happens at a rate of "1 execution per perTimeSpan/numberOfExecutions" rather than "numberOfExecutions in any selected perTimeSpan":
_throttlingPolicy = Policy.RateLimitAsync(numberOfExecutions, perTimeSpan, numberOfExecutions);
Adding a periodic wait of several seconds brings back the "bursts" though.
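For instance, an illustrative variant of the original loop (assuming the RateLimiter above is constructed with the maxBurst overload) that pauses long enough for the bucket to refill between bursts:

var rateLimiter = new RateLimiter(5, TimeSpan.FromSeconds(10), Console.WriteLine);
var rdm = new Random();
while (true)
{
    // The first 5 calls of each iteration succeed as a burst.
    for (var i = 0; i < 5; i++)
    {
        var result = await rateLimiter.Throttle(() => Task.FromResult(rdm.Next(1, 10)));
        if (result != default) Console.WriteLine($"Result: {result}");
    }
    // Waiting a full perTimeSpan lets the bucket refill completely.
    await Task.Delay(TimeSpan.FromSeconds(10));
}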
Also see:
docs
allow for bursts part of the doc
issue about rate limiter engine.

Related

Polly.Contrib.WaitAndRetry to "funnel" all requests when hitting rate limit

We're using the Dropbox API wrapped in Polly to handle retries.
We have it set up as an exponential back-off, like explained here.
The issue we have is that we make plenty of concurrent calls.
When the API starts throwing rate limit exceptions, each individual caller backs off
but new callers will still call the API and "steal" the retry of callers that are waiting.
That means that on high load we are experiencing failed API calls and errors.
What we would like to achieve is that on rate limit errors all calls (including new callers) to the API are synchronized and wait for the rate limit to expire.
Then calls can resume (ideally in sequence to make sure the calls don't return rate limit exceptions anymore).
Is there a Polly-supported way of achieving that?
According to my understanding you want to have the following:
The downstream system can throttle incoming requests
1.1 The system is smart enough to provide a RetryAfter time span
You want to avoid flooding the downstream system if you already know that you are throttled
But you don't want to lose any incoming request; rather, you prefer that all of them are eventually processed
Let's put together a working example
#1 - Downstream system
Here we will implement a super simple mock which can mimic throttling.
Let's start with the exception
public class DownstreamServiceException : Exception
{
    public TimeSpan RetryAfter { get; set; }
}
Now, let's see the service code
public class DownstreamService
{
    private readonly CancellationTokenSource initCompletionSignal;
    private readonly TimeSpan initDuration;
    private bool isAvailable = false;
    private DateTime initEstimatedEnd;

    public DownstreamService()
    {
        initDuration = TimeSpan.FromSeconds(10);
        initCompletionSignal = new CancellationTokenSource(initDuration);
        initCompletionSignal.Token.Register(() => isAvailable = true);
        initEstimatedEnd = DateTime.UtcNow.Add(initDuration);
    }

    public Task<string> GetAsync()
    {
        if (!isAvailable) throw new DownstreamServiceException { RetryAfter = initEstimatedEnd - DateTime.UtcNow };
        return Task.FromResult("Available");
    }
}
For the sake of simplicity I've made the service unavailable for the first 10 seconds
I've used a CancellationTokenSource as a timer to make the service available
If GetAsync is called while the service is not available (we are throttled), it throws an exception; otherwise it returns the "Available" string
#2 - Avoid flooding if the downstream is not available
Here we will define a Circuit Breaker to short-circuit the requests if the downstream is not available (we are throttled)
var throttledPolicy = Policy<string>
    .Handle<DownstreamServiceException>()
    .CircuitBreakerAsync(1, TimeSpan.FromSeconds(0),
        onBreak: (result, state, _, __) =>
        {
            if (state == CircuitState.Open) return;
            Console.WriteLine("onBreak");
            throw result.Exception;
        },
        onReset: (_) => Console.WriteLine("onReset"),
        onHalfOpen: () => { });
The Circuit Breaker will transition from Closed to Open when we receive the first DownstreamServiceException
The duration of break (TimeSpan.FromSeconds(0)) does not matter here
We will control the Circuit Breaker's state from the Retry logic
if (state == CircuitState.Open): This will be explained under the retry section
And finally re-throw the original exception (I know, I know ... it should be avoided, but it keeps our example application simple)
#3 - Retry until eventually processed
This is the most complicated part of the solution, because this retry policy handles the different exceptions (DownstreamServiceException, IsolatedCircuitException) in different ways
CancellationTokenSource throttlingEndSignal;

var retryPolicy = Policy<string>
    .Handle<DownstreamServiceException>()
    .Or<IsolatedCircuitException>()
    .WaitAndRetryForeverAsync(_ => TimeSpan.FromSeconds(3),
        onRetry: (dr, __) =>
        {
            Console.WriteLine($"onRetry caused by {dr.Exception.GetType().Name}");
            if (dr.Exception is DownstreamServiceException dse)
            {
                throttledPolicy.Isolate();
                throttlingEndSignal = new(dse.RetryAfter);
                throttlingEndSignal.Token.Register(() => throttledPolicy.Reset());
            }
        });
Let's start with the DownstreamServiceException
We will receive this exception because we are going to chain together the two policies and Circuit Breaker's onBreak delegate re-throws the received exception
Inside the onRetry we have a guard expression for DownstreamServiceException
Here we call Isolate on the Circuit Breaker, which tries to transition from the Open state to the Isolated state and thereby calls the onBreak delegate
That is why we had the if (state == CircuitState.Open) return; guard there: to avoid an infinite loop
We do the same timer trick here with the CancellationTokenSource: whenever the throttling ends we push the Circuit Breaker back to the Closed state (Reset)
The IsolatedCircuitException case is much simpler
We receive this exception whenever we try to perform a retry attempt but the Circuit Breaker is in the Isolated state
So, the Circuit Breaker short-circuits the execution, and thanks to the WaitAndRetryForever call we will eventually succeed
Put things together
var combinedPolicy = Policy.WrapAsync(retryPolicy, throttledPolicy);
var result = await combinedPolicy.ExecuteAsync(async () => await service.GetAsync());
Please note the following:
This solution works well with multiple concurrent requests as well, because the Circuit Breaker is shared (see the sketch below)
This solution is a workaround, because we cannot set the duration of break dynamically
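To illustrate the shared Circuit Breaker, here is a hypothetical snippet with a few concurrent callers going through the same combinedPolicy:

// All callers share the same policy instances, so once one of them trips the breaker,
// the others are short-circuited too until the RetryAfter period elapses and Reset is called.
var service = new DownstreamService();
var callers = Enumerable.Range(1, 3).Select(async id =>
{
    var result = await combinedPolicy.ExecuteAsync(() => service.GetAsync());
    Console.WriteLine($"Caller {id} received: {result}");
});
await Task.WhenAll(callers);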
I hope you found this little sample application useful :)

Azure EventHub: Send Async performance

I have pretty naive code:
public async Task Produce(string topic, object message, MessageHeader messageHeaders)
{
    try
    {
        var producerClient = _EventHubProducerClientFactory.Get(topic);
        var eventData = CreateEventData(message, messageHeaders);
        messageHeaders.Times?.Add(DateTime.Now);
        await producerClient.SendAsync(new EventData[] { eventData });
        messageHeaders.Times?.Add(DateTime.Now);
        //.....
        Log.Info($"Milliseconds spent: {(messageHeaders.Times[1] - messageHeaders.Times[0]).TotalMilliseconds}");
    }
}
private EventData CreateEventData(object message, MessageHeader messageHeaders)
{
    var eventData = new EventData(Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(message)));
    eventData.Properties.Add("CorrelationId", messageHeaders.CorrelationId);
    if (messageHeaders.DateTime != null)
        eventData.Properties.Add("DateTime", messageHeaders.DateTime?.ToString("s"));
    if (messageHeaders.Version != null)
        eventData.Properties.Add("Version", messageHeaders.Version);
    return eventData;
}
In the logs I see values of almost 1 second (~800 milliseconds).
What could be a reason for such long execution time?
The EventHubProducerClient opens connections to the Event Hubs service lazily, waiting until the first time an operation requires it. In your snippet, the call to SendAsync triggers an AMQP connection to be created, an AMQP link to be created, and authentication to be performed.
Unless the client is closed, most future calls won't incur that overhead as the connection and link are persistent. Most being an important distinction in that statement, as the client may need to reconnect in the face of a network error, when activity is low and the connection idles out, or if the Event Hubs service terminates the connection/link.
As Serkant mentions, if you're looking to understand timings, you'd probably be best served using a library like Benchmark.NET that works over a large number of iterations to derive statistically meaningful results.
You are measuring the first 'Send'. That will incur some overhead that other Sends won't. So, always warm up first, for example by sending a single event, and then measure the next one.
Another important thing: it is not right to measure just a single 'Send' call. Measure a bunch of calls instead and calculate a latency percentile. That should provide a better figure for your tests.
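A rough sketch of that advice, assuming Azure.Messaging.EventHubs; connectionString and hubName are placeholders, and the percentile calculation is deliberately naive:

// Warm up the connection with one send, then measure a batch of sends
// and look at a percentile rather than a single sample.
await using var producer = new EventHubProducerClient(connectionString, hubName);
await producer.SendAsync(new[] { new EventData(Encoding.UTF8.GetBytes("warm-up")) });

var timings = new List<double>();
for (var i = 0; i < 100; i++)
{
    var sw = Stopwatch.StartNew();
    await producer.SendAsync(new[] { new EventData(Encoding.UTF8.GetBytes($"message {i}")) });
    sw.Stop();
    timings.Add(sw.Elapsed.TotalMilliseconds);
}

timings.Sort();
Console.WriteLine($"p95 send latency: {timings[(int)(timings.Count * 0.95)]:F1} ms");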

Ensure Polly Policy Runs at Least Once

So, I'm writing some retry logic for acquiring a lock using Polly. The overall timeout value will be provided by the API caller. I know I can wrap a policy in an overall timeout. However, if the supplied timeout value is too low, is there a way I can ensure that the policy is executed at least once?
Obviously I could call the delegate separately before the policy is executed, but I was just wondering if there was a way to express this requirement in Polly.
var result = Policy.Timeout(timeoutFromApiCaller)
    .Wrap(Policy.HandleResult(false)
        .WaitAndRetryForever(_ => TimeSpan.FromMilliseconds(500)))
    .Execute(() => this.TryEnterLock());
If timeoutFromApiCaller is, say, 1 tick, there's a good chance it takes longer than that just to reach the timeout policy, so the delegate wouldn't get called (the policy would time out and throw TimeoutRejectedException).
What I'd like to happen can be expressed as:
var result = this.TryEnterLock();
if (!result)
{
    result = Policy.Timeout(timeoutFromApiCaller)
        .Wrap(Policy.HandleResult(false)
            .WaitAndRetryForever(_ => TimeSpan.FromMilliseconds(500)))
        .Execute(() => this.TryEnterLock());
}
But it'd be really nice if it could be expressed in pure-Polly...
To be honest, I don't understand what 1 tick means in your case. Is it a nanosecond or greater than that? Your global timeout should be greater than your local timeout.
But as far as I can see, you have not specified a local one. TryEnterLock should receive a TimeSpan in order not to block the caller for an infinite time. If you look at the built-in sync primitives, most of them provide such a capability: Monitor.TryEnter, SpinLock.TryEnter, WaitHandle.WaitOne, etc.
So, just to wrap it up:
var timeoutPolicy = Policy.Timeout(TimeSpan.FromMilliseconds(1000));
var retryPolicy = Policy.HandleResult(false)
.WaitAndRetryForever(_ => TimeSpan.FromMilliseconds(500));
var resilientStrategy = Policy.Wrap(timeoutPolicy, retryPolicy);
var result = resilientStrategy.Execute(() => this.TryEnterLock(TimeSpan.FromMilliseconds(100)));
The timeout and delay values should be adjusted to your business needs. I highly encourage you to log when the global Timeout fires (onTimeout / onTimeoutAsync) and when the retries happen (onRetry / onRetryAsync), to be able to fine-tune / calibrate these values.
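For example, a sketch of wiring up those logging hooks (the values and messages are illustrative):

// Log when the global timeout fires and when a retry happens,
// so the timeout/delay values can be calibrated from real measurements.
var timeoutPolicy = Policy.Timeout(TimeSpan.FromMilliseconds(1000),
    onTimeout: (context, timeout, abandonedTask) =>
        Console.WriteLine($"Global timeout of {timeout} fired."));

var retryPolicy = Policy.HandleResult(false)
    .WaitAndRetryForever(
        _ => TimeSpan.FromMilliseconds(500),
        (outcome, delay) => Console.WriteLine($"Lock not acquired, retrying in {delay}."));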
EDIT: Based on the comments of this post
As it turned out there is no control over the timeoutFromApiCaller, so it can be arbitrarily small. (In the given example it is just a few nanoseconds, with the intent to emphasize the problem.) So, in order to have an at-least-one-call guarantee, we have to make use of the Fallback policy.
Instead of manually calling TryEnterLock upfront, outside the policies, we should call it as the last action to satisfy the requirement. Policies use escalation: whenever the inner one fails, it delegates the problem to the next outer policy.
So, if the provided timeout is so tiny that the action cannot finish within that period, a TimeoutRejectedException will be thrown. With the Fallback we can handle that, and the action can be performed again, but now without any timeout constraint. This provides us with the desired at-least-once guarantee.
var atLeastOnce = Policy.Handle<TimeoutRejectedException>()
    .Fallback((ct) => this.TryEnterLock());
var globalTimeout = Policy.Timeout(TimeSpan.FromMilliseconds(1000));
var foreverRetry = Policy.HandleResult(false)
    .WaitAndRetryForever(_ => TimeSpan.FromMilliseconds(500));
var resilientStrategy = Policy.Wrap(atLeastOnce, globalTimeout, foreverRetry);

var result = resilientStrategy.Execute(() => this.TryEnterLock());

How to use Reactive Extensions to throttle client requests

I have a server receiving many objects from many clients and that fires ObjectReceived event each time it receives an object, including in the arguments who sent what.
Problem: There is a client annoying my server with requests but my server always responds.
I'd like to throttle requests based on who made them. For example, if I receive 100 requests in 1 second from 100 different clients and each client has made a different request, I respond to every client who has made a request; but if I receive 100 requests in 1 second from 2 clients and each client has made the same request 50 times, I only respond twice: once to client A and once to client B.
Is it possible in Rx?
Yes, one way is to group the requests by client id and selectively apply a throttle.
Say you had an event like this:
public class MyEvent
{
    public int ClientId { get; set; }

    public override string ToString()
    {
        return ClientId.ToString();
    }
}
Let's set up slow and fast clients:
var slow = Observable.Interval(TimeSpan.FromSeconds(2))
.Select(_ => new MyEvent { ClientId = 1 });
var fast = Observable.Interval(TimeSpan.FromSeconds(0.5))
.Select(_ => new MyEvent { ClientId = 2 });
var all = slow.Merge(fast);
Now throttle selectively like this:
var throttled = all.GroupBy(x => x.ClientId).Select(
// apply the throttle here, this could even test the key
// property to apply different throttles to different clients
x => x.Throttle(TimeSpan.FromSeconds(1)))
.SelectMany(x => x);
And test it:
throttled.Subscribe(x => Console.WriteLine(x.ToString()));
With this throttle the fast client will never get a response: Throttle will suppress its requests indefinitely because they are less than a second apart. You can use other operators to suppress in different ways; e.g. Sample can pick out a single request over a given time interval.
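For example, a variant using Sample instead of Throttle, so the fast client gets at most one response per second rather than none:

// Sample emits the most recent request per client in each 1-second window,
// instead of suppressing rapid-fire requests entirely like Throttle does.
var sampled = all.GroupBy(x => x.ClientId)
    .Select(x => x.Sample(TimeSpan.FromSeconds(1)))
    .SelectMany(x => x);

sampled.Subscribe(x => Console.WriteLine(x.ToString()));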
After your question edit
You can apply rules other than grouping by ClientId and using Throttle; for example, you can use DistinctUntilChanged() on a client stream to weed out duplicate requests.
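A sketch of that idea, assuming MyEvent also carried a hypothetical RequestId property identifying the request payload:

// Per client, drop consecutive duplicates of the same request.
// RequestId is a made-up discriminator for "the same request".
var deduplicated = all.GroupBy(x => x.ClientId)
    .Select(x => x.DistinctUntilChanged(e => e.RequestId))
    .SelectMany(x => x);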
Slightly different question: Best way to implement request throttling in ASP.NET MVC?
In any case, a typical algorithm that performs really well here is the Hierarchical Token Bucket.
The hierarchical token bucket (HTB) is a faster replacement for the class-based queueing (CBQ) queuing discipline in Linux.
HTBs help in controlling the use of the outbound bandwidth on a given link. HTB allows using one single physical link to simulate multiple slower links and to send different kinds of traffic on different simulated links. In both cases, one has to specify how to divide the physical link into simulated links and how to decide which simulated link a given packet is to be sent across.
In other words, HTB is very useful to limit a client's download/upload rate. Thus, the limited client cannot saturate the total bandwidth.

C# Threadpooling HttpWebRequests

I've read and looked at quite a few examples of thread pooling, but I just can't seem to understand it the way I need to. What I have managed to get working is not really what I need; it just runs the function in its own thread.
public static void Main()
{
    while (true)
    {
        try
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(Process));
            Console.WriteLine("ID has been queued for fetching");
        }
        catch (Exception ex)
        {
            Console.WriteLine("Error: " + ex.Message);
        }
        Console.ReadLine();
    }
}
public static void Process(object state)
{
    var s = StatsFecther("byId", "0"); //returns all player stats
    Console.WriteLine("Account: " + s.nickname);
    Console.WriteLine("ID: " + s.account_id);
    Console.ReadLine();
}
What I'm trying to do is have about 50 threads going (maybe more) that fetch serialized PHP data containing player stats, starting from user 0 all the way up to a user ID I specify (300,000). My question is not about how to fetch the stats; I know how to get the stats and read them. It is about how to write a thread pool that keeps fetching stats until it gets to the 300,000th user ID, without stepping on the toes of the other threads, and saves the stats to a database as it retrieves them.
static int _globalId = 0;

public static void Process(object state)
{
    // each queued Process call gets its own player ID to fetch
    var processId = Interlocked.Increment(ref _globalId);
    var s = StatsFecther("byId", processId.ToString()); //returns all player stats
    Console.WriteLine("Account: " + s.nickname);
    Console.WriteLine("ID: " + s.account_id);
    Console.ReadLine();
}
This is the simplest thing to do, but it is far from optimal. You are using synchronous calls, you are relying on the ThreadPool to throttle your call rate, you have no retry policy for failed calls, and your application will behave extremely badly under error conditions (when the web calls are failing).
First you should consider using the async methods of WebRequest: BeginGetRequestStream (if you POST and have a request body) and/or BeginGetResponse. These methods scale much better and you'll get a higher throughput for less CPU (if the back end can keep up, of course).
Second you should consider self-throttling. On a similar project I used a pending request count. On success, each call would submit 2 more calls, capped at the throttling count. On failure the call would not submit anything. If no calls are pending, a timer-based retry submits a new call every minute. This way you only attempt once per minute when the service is down, saving your own resources from spinning without traction, and you increase the throughput back up to the throttling cap when the service is up.
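As a rough, simplified sketch of self-throttling in more modern terms (using HttpClient and SemaphoreSlim rather than the pending-count scheme or BeginGetResponse described above; the URL and the concurrency cap are placeholders):

// Cap the number of in-flight requests with a semaphore instead of
// relying on the ThreadPool, and walk the whole ID range asynchronously.
const int MaxConcurrency = 50;
const int MaxId = 300_000;

using var http = new HttpClient();
using var gate = new SemaphoreSlim(MaxConcurrency);

var tasks = Enumerable.Range(0, MaxId).Select(async id =>
{
    await gate.WaitAsync();
    try
    {
        // Hypothetical endpoint returning the serialized stats for one player.
        var payload = await http.GetStringAsync($"https://example.com/stats?id={id}");
        // Parse payload and save it to the database here (ideally also async).
    }
    catch (HttpRequestException)
    {
        // A real implementation would log the failure and schedule a retry.
    }
    finally
    {
        gate.Release();
    }
});

await Task.WhenAll(tasks);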
You should also know that the .NET Framework will limit the number of concurrent connections it makes to any resource. You must find your destination ServicePoint and change the ConnectionLimit from its default value (2) to the max value you are willing to throttle on.
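For example (the URL is a placeholder):

// Raise the per-host connection limit from its old default of 2
// so the concurrency cap you choose can actually be reached.
var servicePoint = ServicePointManager.FindServicePoint(new Uri("https://example.com/"));
servicePoint.ConnectionLimit = 50;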
About the database update part: there are way too many variables at play and way too little information to give any meaningful advice. Some general advice: use asynchronous methods for the database calls as well, size your connection pool to allow for your throttling cap, and make sure your updates use the player ID as a key so you don't deadlock on updating the same record from different threads.
How do you determine the user ID? One option is to segment all the threads so that thread X deals with IDs from 0 to N, and so on, as a fraction of how many threads you have.
