I've ran into an issue which i'm struggling to decide the best way to solve. Perhaps my software articheture needs to change?
I have a cron job which hits my website method every 10 seconds and then on my website the method then makes an API call each time to an API however the API is rate limited x amount in a minute and y amount a day
Currently i'm exceeding the API limits and need to control this in the website method somehow. I've thought storing in a file perhaps but seems hacky similary to a database as I don't currently use one for this project.
I've tried this package: https://github.com/David-Desmaisons/RateLimiter but alas it doesn't work in my scenario and I think it would work if I did one request with a loop as provided in his examples. I noticed he had a persistent timer(PersistentCountByIntervalAwaitableConstraint) but he has no documentation or examples for it(I emailed him incase). I've done a lot of googling around and can't find any examples of this only server rate limiting which is the other way around server limiting client and not client limiting requests to server
How can I solve my issue without changing the cronjobs? What does everyone think the best solution to this is?
Assuming that you don't want to change the clients generating the load, there is no choice but to implement rate limiting on the server.
Since an ASP.NET application can be restarted at any time, the state used for that rate-limiting must be persisted somewhere. You can choose any data store you like for that.
In this case you have two limits: One per minute and one per day. If you simply apply two separate rate limiters you will end up with the daily limit being exceeded fairly quickly. After that, there will be no further access for the rest of the day. Likely, this is undesirable.
It seems better to only apply the daily limit because it is more restrictive. A simple solution would be to calculate how far apart requests must be to meet the daily limit. Then, you store the date of the last request. Any new incoming request is immediately failed if not enough time has passed.
Let me know if this helps you.
Related
I have a feed reader running every minute (it's picking up a feed that gets updated often). But I seem to be running into getting blocked by Akamai when accessing a few websites. Perhaps they think I'm up to something, but I'm not - I just want to get the feed.
Any thoughts on how to either play nice with Akamai or code this differently? From what I know, I can't know when the feed is updated other than polling it - but is there a preferred way - like checking a cache? This is coded in c# though I doubt that makes a difference.
Without more of a context it is hard to ascertain why you are being blocked. Is it because of rate limits or other access control measures?
Assuming it is rate limits, there is not much you can do. I would recommend you to first verify that the robots.txt allows you to crawl the URL and if allowed use some sort of exponential back off. Helps to play nice by providing a meaningful User-Agent so that when they do update their rules they might want to consider whitelisting legitimate requests such as yourself.
I'm writing an application in C# that allows people to track the amount of time they spend on tasks. It can be used by a single person to track their own personal time, but it will also be able to work in, for example, a company - like, if they want to track the amount of time spend on some project.
The data being stored by this program is pretty simple - a collection of all the tasks and each "block" of time that was spent on it (including date, start/stop time, and length of time spent).
For the multiuser functionality, my plan was to have a single server that the clients send updates to the tracked time. I don't think the clients will need a continuous connection as the updates would typically be pretty far apart.
Additionally, as both the server and the client will store a copy of the data, either of them can ask for a copy from the other if there's a data loss on either. Femaref has informed me that this is a poor idea, so I've removed it.
So, my question is, how should I approach this? I've seen some C# client/server tutorials, but those seem to be geared towards continuous connections.
Your best bet is to track the data separately. First Allow users to track there own time, and just store that in a local db (you can use something like csharp-sqlite ), then when the user connects sync what data you want to keep on server.
For data that you want to track sever side your just going to want the app to sign in and say its starting a task and then sign out when its stopping a task(then have the server side hit the db functions)(your going to want to keep the user data, and the server data separate, so you know what you can trust, and what implications there are for using what data ) .
Obviously, your going to want to handle situations where a task goes on longer then expected. For example someone forgets to say there done with the task(like there computer just crashes)(you can do this by having your app just say its still working on a task every so often).
The best way I have found to get around issues that are caused by trusting peoples input is to just tie into something like your local A.D or LDAP and allow management control(because in the end they are the ones that sort out any messes that come from people having the wrong hours) thats all handled server side. If you don't have A.D or LDAP, you might have to consider implementing some kind of RSA key mechanism for authentication and authority chains.
For talking to the server side process on the client, I suggest something like SOAP (SOAP using C#). That way you can move your server language to what ever makes your feel all warm and fuzzy.
This is a bit of a broad question so its hard to cover everything, but it should give you some leads in the right direction.
The ReadWriteTimeout for HttpWebRequests seems to be defaulted to 5 minutes.
Is there a reason why it is that high? I was trying to set the timeout of an API call to 10 seconds, but it was spinning for a over 2 minutes.
WHen I set this to 30 seconds, it times out in a reasonable amount of time now.
Is it dangerous to set this too low?
I can't imagine something taking longer than 20-30 seconds in my application (small 2-30kb payloads).
Reference: http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.readwritetimeout.aspx
Sure there's a reason for a 5 minute time-out. It looks like this:
This contraption is a robotic tape retrieval system, used by the International Centre for Radio Astronomy Research. It stores 32.5 petabytes of historical data. When its server gets an HttpWebRequest, the machine sends the robot on its way to retrieve the tape with the data. This takes a while, as you might imagine.
These systems were quite common a decade ago, around the time .NET was designed. Not so much today, the unrelenting improvements in hard disk storage capacity made them close to obsolete. Although more than 5 petabyte of SAN storage still sets you back a rather major chunk of money. If speed is not essential then tape is hard to beat.
Clearly .NET cannot possibly reliably declare a timeout when it doesn't know anything about what's happening on the other end of the wire. So the default is high. If you have good reasons to believe that there's an upper limit on your particular setup then don't hesitate to lower it. Do make it an editable setting, you can't predict the future.
You can't possibly know what connection speed the users have that connect to your website. And as the creator of this framework you can't know either what the developer will host. This class already existed in .NET 1.1, so for a very long time. And back then the users had slower speed too.
Finding a good default value is very difficult. You don't want to set it too high to prevent security flaws, and you don't want to set it too low because this would result in a million (exaggerated) threads and requests about aborted requests.
I'm sorry I can't give you any official sources, but this is just reasonable.
Why 5 minutes? Why not?
JustAnotherUserYouMayKnow explained it to you pretty good.
But as usual, you have the freedom to change this default value to a value that suits to your very case, so feel free to follow the path that Christian pointed out.
Setting a default value is not an easy task at all when we are talking about millions of users and maybe millions of billions of possible scenarios involved.
The bootom line is that it isn't that much important why it's 5 minutes but rather how you can adjust it to your very needs.
Well by setting it that low you may or may introduce a series of issues. As you may be able to reach the site within a reasonable time, others may not.
A perfect example is Verizon, they invoke a series of Proxy Servers which can drastically slow a connection down. The reason I brought such an example up; is our application specified a one-minute Timeout before it throws an exception.
Our server has no issues with large amounts of request, it handles them quite easily. However, some of our users throughout the world receive this error: Error 10060.
The issue can route from a incorrect Proxy Configuration or Invalid Registry Key which actually handles the Timeout request.
You'd think that one minute would indeed be fast enough, but it actually isn't. As with this customers particular network it doesn't siphon through the data quick enough- thus causing an error.
So you asked:
Why is the HttpWebRequest ReadWrite Timeout Defaulted to five minutes?
They are attempting to account for the lowest common denominator.
Simply, each network and client may have a vast degree of traffic or delays as it moves to the desired location. If it can't get to the destination within your ports ideal socket request your user will experience an exception.
Some really important things to know about a network:
Some networks that are configured have a limited hop count / time to live.
Proxies and Firewalls which are heavy in filtering data and security, may delay your traffic.
Some areas do not have Fiber or Cable high-speed. They may rely on Satellite or DSL.
Each network protocol is different.
Those are a few variables that you have to consider. If we are talking about an internet; each client has a home network; which connects to ISP; which connects to the Internet; which connects to you. So you have several forms of traffic to be aggregated.
If we are talking about an Intranet, with most modern day technology the odds of your time being an issue are slim but still possible.
Also each individual computer can partake or cause an issue. In Windows 8 the default Timeout specified for the browser is one minute; in some cases those users may experience exceptions with your application, your site, or others. So you'd manually alter the ServerTimeOut and TimeOut key in the registry to assign a longer value.
In short:
Client Machines may pose a problem in reaching your site within your allocated time.
Network / ISP may incur a problem for some users.
Your Server may be configured incorrectly or not allocate the right amount of time.
These are all variables that need to be accounted for; as they will impact access to your application. Unfortunately you won't know for certain until it's launched and users begin to utilize your site.
Unfortunately you won't know if your time you specified will be enough; but it defaults to a higher number because there is so much variation across the world that it is trying to consider the lowest common denominator. As your goal is to reach as many people as possible.
By the way very nice question, and some great answers so far as well.
I'm trying to create a website similar to BidCactus and LanceLivre.
The specific part I'm having trouble with is the seconds aspect of the timer.
When an auction starts, a timer of 15 seconds starts counting down, and every time a person bids, the timer is reset and the price of the item is increased by 0,01$.
I've tried using SignalR for this bit, and while it does work well during trials runs in the office, it's just not good enough for real world usage where seconds count. I would get HTTP 503 errors when too many users were bidding and idling on the site.
How can I make the timer on the clients end shows the correct remaining time?
Would HTTP GETting that information with AJAX every second allow me to properly display the missing time? That's a request each second!
And not only that, but when a user requests that GET, I calculate remaining seconds, but until the user see's that response, that time is no longer useful as a second or more might pass between processing and returning. Do you see my conundrum?
Any suggestions on how to approach this problem?
There are a couple problems with the solution you described:
It is extremely wasteful. There is already a fairly high accuracy clock built into every computer on the Internet.
The Internet always has latency. By the time the packet reaches the client, it will be old.
The Internet is a variable-latency network, so the time update packets you get could be as high or higher than one second behind for one packet, and as low as 20ms behind for another packet.
It takes complicated algorithms to deal with #2 and #3.
If you actually need second-level accuracy
There is existing Internet-standard software that solves it - the Network Time Protocol.
Use a real NTP client (not the one built into Windows - it only guarantees it will be accurate to within a couple seconds) to synchronize your server with national standard NTP servers, and build a real NTP client into your application. Sync the time on your server regularly, and sync the time on the client regularly (possibly each time they log in/connect? Maybe every hour?). Then simply use the system clock for time calculations.
Don't try to sync the client's system time - they may not have access to do so, and certainly not from the browser. Instead, you can get a reference time relative to the system time, and simply add the difference as an offset on client-side calculations.
If you don't actually need second-level accuracy
You might not really need to guarantee accuracy to within a second.
If you make this decision, you can simplify things a bit. Simply transmit a relative finish time to the client for each auction, rather than an absolute time. Re-request it on the client side every so often (e.g. every minute). Their global system time may be out of sync, but the second-hand on their clock should pretty accurately tick down seconds.
If you want to make this a little more slick, you could try to determine the (relative) latency for each call to the server. Keep track of how much time has passed between calls to the server, and the time-left value from the previous call. Compare them. Then, calculate whichever is smaller, and base your new time off that calculation.
I'd be careful when engineering such a solution, though. If you get the calculations wrong, or are dealing with inaccurate system clocks, you could break your whole syncing model, or unintentionally cause the client to prefer the higest latency call. Make sure you account for all cases if you write the "slick" version of this code :)
One way to get really good real-time communication is to open a connection from the browser to a special tcp/ip socket server that you write on the server. This is how a lot of chat packages on the web work.
Duplex sockets allow you to push data both directions. Because the connection is already open, you can send quite a bit of very fast data across.
In the past, you needed to use Adobe Flash to accomplish this. I'm not sure if browsers have advanced enough to handle this without a plugin (eg, websockets?)
Another approach worth looking at is long polling. In concept, a connection is made to the server that just doesn't die, and it gives you the opportunity on the server to trickle bits of realtime data down to the clients.
Just some pointers. I have written web software using JavaScript <-> Flash <-> Python/PHP, and was please with how it worked.
Good luck.
Possible/partial duplicates:
What’s a good rate limiting algorithm?
Throttling method calls to M requests in N seconds
Best way to implement request throttling in ASP.NET MVC?
I am looking for the best way to implement a moving time window rate limiting algorithm for a web application to reduce spam or brute force attacks.
Examples of use would be "Maximum number of failed login attempts from a given IP in the last 5 minutes", "Maximum number of (posts/votes/etc...) in the last N minutes".
I would prefer to use a moving time window algorithm, rather than a hard reset of statistics every X minutes (like twitter api).
This would be for a C#/ASP.Net app.
We found out Token Bucket is better algorithm for this kind of rate-limiting. It's widely used in routers/switches so our operation folks are more familiar with the concept.
Just to add a more 'modern' answer to this problem: For .NET WebAPI, WebApiThrottle is excellent and probably does everything you want out of the box.
It's also available on NuGet.
Implementation takes only a minute or so and it's highly customisable:
config.MessageHandlers.Add(new ThrottlingHandler()
{
Policy = new ThrottlePolicy(perSecond: 1, perMinute: 30, perHour: 500, perDay:2000)
{
IpThrottling = true,
ClientThrottling = true,
EndpointThrottling = true
},
Repository = new CacheRepository()
});
Use a fast memory-based hashtable like memcached. The keys will be the target you are limiting (e.g. an IP) and the expiration of each stored value should be the maximum limitation time.
The values stored for each key will contain a serialized list of the last N attempts they made at performing the action, along with the time for each attempt.
I just added the answer to the question Block API requests for 5 mins if API rate limit exceeds.
I used HttpRuntime.Cache to allow only 60 requests per minute. Exceeding the limit will block the API for next 5 minutes.
You find this page to be an interesting read:
http://www.codeproject.com/KB/aspnet/10ASPNetPerformance.aspx
The section to look out for starts as follows:
Prevent Denial of Service (DOS) Attack
Web services are the most attractive target for hackers because even a pre-school hacker can bring down a server by repeatedly calling a Web service which does expensive work.
EDIT: Similar question here:
Best way to implement request throttling in ASP.NET MVC?
I have been working on a new redis-based rate-limiting approach: http://blog.jnbrymn.com/2021/03/18/estimated-average-recent-request-rate-limiter.html
It is simpler than many other approaches that I've seen in that it doesn't require you to constantly create new redis keys (e.g. instead of one per user per minute window, it's just one per user). It has some nice properties regarding "forgetfulness and forgiveness" so that, for example, abusive users can't reoffend in the next minute window. It also has a nice interpretation as the state of the rate-limiter corresponds to an estimate of the user's recent request rate.