How to design a precisely timed remote data fetching algorithm? - c#

There is a server that publishes some XML data every 5 seconds for GET fetch. The URL is simple and does not change, like www.XXX.com/fetch-data. The data is published in a loop every 5 seconds precisely, and IS NOT guaranteed to be unique every time (but does change quite often anyway). Apart from that, I can also fetch XML at www.XXX.com/fetch-time, where server time is stored, in unix time format. So, the fetch-time resolution is unfortunately just in seconds.
What I need is a way to synchronize my client code so that it fetches the data as soon as possible after it is published. If I just naively fetch in a loop every 5 seconds and get really unlucky, my loop might start right before the server loop ends, so I will basically always end up with data that is 5 seconds old. I need a mechanism to keep the server and client loops in tandem. I also need to compensate for lag (ping), so that the fetch request is actually sent a little before the server publishes new data.
The server code is proprietary and can't be changed, so all the hard work must be done by the client. There are many other questions about high-precision time measurements and sleep functions, so you can abstract from those and take them for granted. Any help with the algorithm would be much appreciated.
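To make the lag compensation concrete, here is a minimal sketch of the scheduling arithmetic only. It assumes you already have an estimate of one publish instant on the client clock (how to obtain that estimate is left open) and a measured round-trip time, and it treats half the round trip as the one-way lag. FetchScheduler and its parameter names are hypothetical and not part of the question.

```csharp
using System;

// Hypothetical helper: computes when to send the next request.
public static class FetchScheduler
{
    // now:         current client clock reading.
    // lastPublish: an estimated publish instant on the client clock (assumed known, not after now).
    // period:      the publish interval (5 seconds in the question).
    // roundTrip:   a measured request/response time; half of it is used as one-way lag.
    public static DateTime NextSendTime(DateTime now, DateTime lastPublish, TimeSpan period, TimeSpan roundTrip)
    {
        TimeSpan oneWay = TimeSpan.FromTicks(roundTrip.Ticks / 2);

        // Whole periods elapsed since the known publish instant.
        long elapsedPeriods = (now - lastPublish).Ticks / period.Ticks;
        DateTime nextPublish = lastPublish + TimeSpan.FromTicks((elapsedPeriods + 1) * period.Ticks);

        // Send early by the one-way lag so the request arrives around the publish.
        DateTime send = nextPublish - oneWay;

        // Too late for that publish: aim for the following one.
        if (send <= now) send += period;
        return send;
    }
}
```

In practice the lag varies from request to request, so it is probably safer to aim slightly after the publish instant than exactly at it; a request that arrives early gets the old data.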

Related

Best way to rate limit clientside api in C#

I've run into an issue and I'm struggling to decide the best way to solve it. Perhaps my software architecture needs to change?
I have a cron job which hits a method on my website every 10 seconds, and that method makes a call to an external API each time. However, the API is rate limited to x requests per minute and y requests per day.
Currently I'm exceeding the API limits and need to control this in the website method somehow. I've thought about storing the state in a file, but that seems hacky, and the same goes for a database, as I don't currently use one for this project.
I've tried this package: https://github.com/David-Desmaisons/RateLimiter but alas it doesn't work in my scenario. I think it would work if I did one request with a loop, as in his examples. I noticed he has a persistent timer (PersistentCountByIntervalAwaitableConstraint), but there is no documentation or examples for it (I emailed him in case). I've done a lot of googling and can't find any examples of this, only server rate limiting, which is the other way around: the server limiting the client, not the client limiting its requests to a server.
How can I solve my issue without changing the cronjobs? What does everyone think the best solution to this is?
Assuming that you don't want to change the clients generating the load, there is no choice but to implement rate limiting on the server.
Since an ASP.NET application can be restarted at any time, the state used for that rate-limiting must be persisted somewhere. You can choose any data store you like for that.
In this case you have two limits: One per minute and one per day. If you simply apply two separate rate limiters you will end up with the daily limit being exceeded fairly quickly. After that, there will be no further access for the rest of the day. Likely, this is undesirable.
It seems better to only apply the daily limit because it is more restrictive. A simple solution would be to calculate how far apart requests must be to meet the daily limit. Then, you store the date of the last request. Any new incoming request is immediately failed if not enough time has passed.
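A minimal sketch of that spacing approach follows. The class name DailySpacingLimiter and the two delegates used to load and save the last request time are hypothetical; they stand in for whatever data store you choose. The lock only protects callers within one process.

```csharp
using System;

// Hypothetical sketch: spreads a daily allowance evenly and fails fast in between.
public sealed class DailySpacingLimiter
{
    private readonly TimeSpan _minGap;
    private readonly Func<DateTime?> _loadLast;   // reads the persisted time of the last request
    private readonly Action<DateTime> _saveLast;  // persists the time of the last request
    private readonly object _lock = new object();

    public DailySpacingLimiter(int requestsPerDay, Func<DateTime?> loadLast, Action<DateTime> saveLast)
    {
        if (requestsPerDay <= 0) throw new ArgumentOutOfRangeException(nameof(requestsPerDay));
        // How far apart requests must be to stay within the daily limit.
        _minGap = TimeSpan.FromTicks(TimeSpan.TicksPerDay / requestsPerDay);
        _loadLast = loadLast;
        _saveLast = saveLast;
    }

    public TimeSpan MinGap => _minGap;

    // Returns true if the API call may go out now; false means reject immediately.
    public bool TryAcquire(DateTime utcNow)
    {
        lock (_lock)
        {
            DateTime? last = _loadLast();
            if (last.HasValue && utcNow - last.Value < _minGap) return false;
            _saveLast(utcNow);
            return true;
        }
    }
}
```

The website method would call TryAcquire(DateTime.UtcNow) before each API call and skip the call when it returns false. If the resulting gap is at least 60/x seconds, the per-minute limit is respected as well.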
Let me know if this helps you.

Time on server since client sent packet

I'm currently working on a game with a client and server, and am trying to figure out a way to tell the amount of time from a client sending a packet and the server receiving it (so I can check where the enemies were at that point).
I attempted sending
DateTime.Now.Subtract(DateTime.MinValue.AddYears(1969)).TotalMilliseconds
with the client, then checking that same value on the server when it receives the packet and subtracting the two. The issue is that time zones could completely break this if the client and server are in different time zones. It also did not seem very accurate.
Is there a "proper" way to do this?
Well, sending the epoch time will not account for leap seconds, but time zones should not affect the result if you use DateTime.UtcNow and do all processing in UTC. Using this method would allow users to manipulate that number, since it is based on the computer's time setting. There is no real proper way to handle this. Look at the many games with latency issues. This occurs for both client-side and server-side processing.
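As an illustration of the UTC suggestion, here is a small sketch using DateTimeOffset.ToUnixTimeMilliseconds (available since .NET Framework 4.6). PacketClock and its method names are hypothetical. The computed delay is only as good as the two machines' clock synchronisation, and the client value can be forged.

```csharp
using System;

// Hypothetical helper for stamping packets with a time-zone independent value.
public static class PacketClock
{
    // Milliseconds since the Unix epoch for any instant, regardless of its UTC offset.
    public static long ToUnixMs(DateTimeOffset instant) => instant.ToUnixTimeMilliseconds();

    // Current time as milliseconds since the Unix epoch.
    public static long NowUnixMs() => ToUnixMs(DateTimeOffset.UtcNow);

    // Server side: apparent delay of a packet stamped by the client.
    // Can be negative or inflated if the two clocks disagree.
    public static long ApparentDelayMs(long clientSentUnixMs, long serverReceivedUnixMs)
        => serverReceivedUnixMs - clientSentUnixMs;
}
```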
The other issue with this method, depending on the type of game, is that the reaction of a user depends on events in real time. So if you reverse time for a calculation, the result of that could have affected another players actions.
For a complex handling, I think the game 'Eve online' will slow down 'Game Time' for large fights.

What is the fastest way to persistently increment a list of numbers from multiple threads?

My application has different tasks, each one posting XML documents through HTTP POST to a different endpoint. For every task I need to keep count of the messages I have sent, each of which is identified by a unique incremental number.
I need a mechanism that, after a message has been received by the endpoint will save the last message id sent, so that if there is a problem and the application needs to restart it won't send the same message again, and will restart from where it currently was.
If I don't persist the counters, on my laptop I can manage to obtain a throughput of about 100 messages processed per second for every queue with 5 tasks running. My goal is to achieve no more than a 10/15% reduction in throughput by persisting the counters.
Using SQL Server to save the counters, with a row for every task, gives me a 50% decrease in throughput. Saving the counter value to a text file for every task is a bit faster but still far from my goal. I am looking for a way to persist this information that gets me as close as possible to my goal. I thought that appending the last processed id rather than updating it might help me avoid write locks. The bottom line is that, for the sake of performance, I don't care if I have to waste disk space or accept a longer startup time for reading the last counter.
In your experience what might be a fast way to avoid contentions and safely persist data from multiple tasks even at the cost of more disk space?
You can get pretty good performance with an ESENT storage, via the ManagedEsent - PersistentDictionary wrapper.
The PersistentDictionary class is concurrent and provides real concurrent access to the ESENT backend. You would represent everything in key-value pair format.
Give it a try, it is not much code to write.
ESENT is an in-process database engine, disk based + in-memory caching, used throughout several Windows components (Search, Exchange, etc). It does provide transactional support, which is what you're after.
It has been included in all versions of Windows since 2000 so you don't need to install any dependencies other than ManagedEsent.
You would probably want to define something like this:
var dictionary = new PersistentDictionary<Guid, int>("ThreadStorage");
The key, I assume, should be something unique (maybe even the service endpoint) so that you are able to re-map it after a restart. The value is the last message identifier.
I am pasting below, shamelessly, their performance benchmarks:
Sequential inserts: 32,000 entries/second
Random inserts: 17,000 entries/second
Random updates: 36,000 entries/second
Random lookups (database cached in memory): 137,000 entries/second
Linq queries (range of records): 14,000 queries/second
You fit in the Random Updates case, which as you can see offers a really good throughput.
I faced the same issue as OP asked.
I used SQL server Sequence Numbers (with CREATE SEQUENCE).
However, the accepted answer is a good solution to avoid using SQL server.
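For comparison, the append-only idea raised in the question could be sketched as below: one file per task, one line per sent id, and a startup read that takes the last complete line. AppendOnlyCounter is a hypothetical name, and this is not measured against the throughput goal. Flush(true) asks the OS to write through its cache, which is likely the dominant cost.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical sketch: one append-only file per task, one id per line.
public sealed class AppendOnlyCounter : IDisposable
{
    private readonly FileStream _stream;

    public AppendOnlyCounter(string path)
    {
        _stream = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read);
    }

    // Appends the id as a newline-terminated line and flushes it to disk.
    public void Record(long id)
    {
        byte[] line = Encoding.ASCII.GetBytes(id.ToString(CultureInfo.InvariantCulture) + "\n");
        _stream.Write(line, 0, line.Length);
        _stream.Flush(true);
    }

    // On startup (before opening the writer): returns the last fully written id, or null.
    public static long? ReadLast(string path)
    {
        if (!File.Exists(path)) return null;
        string[] parts = File.ReadAllText(path).Split('\n');
        // The final segment is either empty or a torn, unterminated write; skip it.
        for (int i = parts.Length - 2; i >= 0; i--)
        {
            if (long.TryParse(parts[i], NumberStyles.Integer, CultureInfo.InvariantCulture, out long id))
                return id;
        }
        return null;
    }

    public void Dispose() => _stream.Dispose();
}
```

The file grows without bound, which the question says is acceptable; truncating it at startup after reading the last id would keep the startup read short.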

Need help with the architecture for a penny bidding website

I'm trying to create a website similar to BidCactus and LanceLivre.
The specific part I'm having trouble with is the seconds aspect of the timer.
When an auction starts, a timer of 15 seconds starts counting down. Every time a person bids, the timer is reset and the price of the item is increased by $0.01.
I've tried using SignalR for this bit, and while it does work well during trial runs in the office, it's just not good enough for real-world usage where seconds count. I would get HTTP 503 errors when too many users were bidding and idling on the site.
How can I make the timer on the client's end show the correct remaining time?
Would HTTP GETting that information with AJAX every second allow me to properly display the remaining time? That's a request each second!
And not only that, but when a user requests that GET, I calculate the remaining seconds, and by the time the user sees that response the value is no longer useful, as a second or more might pass between processing and returning. Do you see my conundrum?
Any suggestions on how to approach this problem?
There are a couple problems with the solution you described:
1. It is extremely wasteful. There is already a fairly high accuracy clock built into every computer on the Internet.
2. The Internet always has latency. By the time the packet reaches the client, it will be old.
3. The Internet is a variable-latency network, so the time update packets you get could be as high or higher than one second behind for one packet, and as low as 20ms behind for another packet.
It takes complicated algorithms to deal with #2 and #3.
If you actually need second-level accuracy
There is existing Internet-standard software that solves it - the Network Time Protocol.
Use a real NTP client (not the one built into Windows - it only guarantees it will be accurate to within a couple seconds) to synchronize your server with national standard NTP servers, and build a real NTP client into your application. Sync the time on your server regularly, and sync the time on the client regularly (possibly each time they log in/connect? Maybe every hour?). Then simply use the system clock for time calculations.
Don't try to sync the client's system time - they may not have access to do so, and certainly not from the browser. Instead, you can get a reference time relative to the system time, and simply add the difference as an offset on client-side calculations.
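The offset idea can be sketched as follows. This is a simplified single-sample estimate in the style of Cristian's algorithm, not an NTP implementation: it assumes the network delay is the same in both directions, so the server's timestamp corresponds to the midpoint of the round trip. ClockOffset and its method names are hypothetical, and the sketch is in C# for consistency with the rest of the page; a browser client would do the same arithmetic in JavaScript.

```csharp
using System;

// Hypothetical sketch: estimates how far the server clock is ahead of the local clock.
public static class ClockOffset
{
    // requestSent / responseReceived: local clock readings around one round trip.
    // serverTime: the server's clock reading carried in the response.
    public static TimeSpan Estimate(DateTime requestSent, DateTime responseReceived, DateTime serverTime)
    {
        // Assume the server stamped its time halfway through the round trip.
        TimeSpan half = TimeSpan.FromTicks((responseReceived - requestSent).Ticks / 2);
        DateTime localMidpoint = requestSent + half;
        return serverTime - localMidpoint;
    }

    // Applies the offset to a local reading instead of changing the system clock.
    public static DateTime ToServerTime(DateTime local, TimeSpan offset) => local + offset;
}
```

Taking several samples and keeping the one with the shortest round trip usually gives a better estimate than any single call.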
If you don't actually need second-level accuracy
You might not really need to guarantee accuracy to within a second.
If you make this decision, you can simplify things a bit. Simply transmit a relative finish time to the client for each auction, rather than an absolute time. Re-request it on the client side every so often (e.g. every minute). Their global system time may be out of sync, but the second-hand on their clock should pretty accurately tick down seconds.
If you want to make this a little more slick, you could try to determine the (relative) latency for each call to the server. Keep track of how much time has passed between calls to the server, and the time-left value from the previous call. Compare them. Then, calculate whichever is smaller, and base your new time off that calculation.
I'd be careful when engineering such a solution, though. If you get the calculations wrong, or are dealing with inaccurate system clocks, you could break your whole syncing model, or unintentionally cause the client to prefer the highest latency call. Make sure you account for all cases if you write the "slick" version of this code :)
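One possible reading of the "slick" version, as a sketch only: on each response, compare the server-reported time left with the previous value minus the locally elapsed time, and keep the smaller, since a larger value most likely just reflects a slower response. CountdownEstimate is a hypothetical name, and the code is in C# to match the rest of the page. The estimate has to be discarded with Reset whenever the auction timer itself restarts, otherwise the minimum would be wrong.

```csharp
using System;

// Hypothetical sketch: keeps the lowest-latency view of a countdown.
public sealed class CountdownEstimate
{
    private TimeSpan _remaining;
    private DateTime _asOf;
    private bool _hasValue;

    // reportedRemaining: time left as reported by the server.
    // localNow: local clock reading when the response arrived.
    public void Update(TimeSpan reportedRemaining, DateTime localNow)
    {
        if (_hasValue)
        {
            // What the previous estimate says is left by now.
            TimeSpan projected = _remaining - (localNow - _asOf);
            if (projected < reportedRemaining) reportedRemaining = projected;
        }
        _remaining = reportedRemaining;
        _asOf = localNow;
        _hasValue = true;
    }

    // Time left at a given local instant, never negative.
    public TimeSpan RemainingAt(DateTime localNow)
    {
        TimeSpan left = _remaining - (localNow - _asOf);
        return left < TimeSpan.Zero ? TimeSpan.Zero : left;
    }

    // Call when the countdown restarts (for example after a bid).
    public void Reset() => _hasValue = false;
}
```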
One way to get really good real-time communication is to open a connection from the browser to a special tcp/ip socket server that you write on the server. This is how a lot of chat packages on the web work.
Duplex sockets allow you to push data both directions. Because the connection is already open, you can send quite a bit of very fast data across.
In the past, you needed to use Adobe Flash to accomplish this. I'm not sure if browsers have advanced enough to handle this without a plugin (eg, websockets?)
Another approach worth looking at is long polling. In concept, a connection is made to the server that just doesn't die, and it gives you the opportunity on the server to trickle bits of realtime data down to the clients.
Just some pointers. I have written web software using JavaScript <-> Flash <-> Python/PHP, and was pleased with how it worked.
Good luck.

Performance for reading files and inserting contents into database

I'm developing a system that isn't real time but there's an intervening standalone server between the end user machines and the database. The idea is that instead of burdening the database server every time a user sends something up, a windows service on the database machine sweeps the relay server at regular intervals and updates the database, deleting the temporary files on the relay box.
There is a scenario where the client software installed on thousands of machines sends up information at nearly the same time. The following hold true:
The above scenario won't occur often but could occur once every other week.
For each machine, 24 bytes of data (4k on the disk) is written on the relay server, which we then want to pick up and update the database with. So although it's fine while the user base is only a few thousand, it may grow to millions over time.
I was thinking of a batch operation that only picks up some 15,000 - 20,000 files at a time and runs at a configurable interval (amendable from app.config). The problem is that if the user base grows to a few million, that will take days to complete. Yes, it doesn't have to be real-time information, but waiting for days for all the data to reach the database isn't ideal either.
I think there will always be a bottleneck if the relay box is hammered, but are there better ways to improve performance and get the data across at a reasonable time (a day, two tops)?
Regards,
F.
To avoid hammering the disk, you might consider having only one thread read the files. It hands off processing to multiple threads that write to the database, and control returns to the disk thread to delete the files after commit. The number of DB threads could be "amendable from app.config" so you can find the best value for your hardware config.
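A rough sketch of that layout using BlockingCollection from System.Collections.Concurrent. RelaySweep and the three delegates are hypothetical placeholders for the real file reading, database write and file deletion, and error handling is left out. The calling thread acts as the disk thread: it reads, and it also performs the deletes for items the writers have finished.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical sketch: one disk thread, several database writer threads.
public static class RelaySweep
{
    // readAll:     enumerated on the calling (disk) thread only.
    // writeToDb:   called concurrently by writerCount workers.
    // onCommitted: called on the disk thread after writeToDb returned for an item.
    public static void Run<T>(IEnumerable<T> readAll, Action<T> writeToDb,
                              Action<T> onCommitted, int writerCount, int capacity = 1000)
    {
        using (var queue = new BlockingCollection<T>(capacity)) // bounded: reader waits if writers lag
        using (var done = new BlockingCollection<T>())
        {
            var writers = new Task[writerCount];
            for (int i = 0; i < writerCount; i++)
            {
                writers[i] = Task.Run(() =>
                {
                    foreach (T item in queue.GetConsumingEnumerable())
                    {
                        writeToDb(item);
                        done.Add(item);
                    }
                });
            }

            foreach (T item in readAll)
            {
                queue.Add(item);
                // Delete whatever has been committed so far.
                while (done.TryTake(out T finished)) onCommitted(finished);
            }

            queue.CompleteAdding();
            Task.WaitAll(writers);
            done.CompleteAdding();
            foreach (T finished in done.GetConsumingEnumerable()) onCommitted(finished);
        }
    }
}
```

writerCount is the value that would come from app.config. Batching several items into one database transaction inside writeToDb would likely matter more for throughput than the thread count.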
Just my 2 cents to get you thinking.
