I'm accessing a MySQL database using the standard MySql.Data package from Oracle. Every few releases of the application, we need to tweak the database schema (e.g. client wanted DECIMAL(10,2) changed to DECIMAL(10,3)) which the application handles by sending the necessary SQL statement. This works except that on a large database, the schema update can be a rather lengthy operation and times out.
The obvious solution is to crank up the timeout, but that results in a relatively poor user experience - I can put up a dialog that says "updating, please wait" and then just sit there with no kind of progress indicator.
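Something like the following is what I mean by cranking up the timeout, assuming MySql.Data's MySqlCommand (the table and column names are made up, and `connection` is an already-open MySqlConnection):

// using MySql.Data.MySqlClient;
// Lift the timeout just for the schema-change command (0 = wait indefinitely).
using (var cmd = new MySqlCommand("ALTER TABLE prices MODIFY amount DECIMAL(10,3)", connection))
{
    cmd.CommandTimeout = 0;   // seconds; default is 30
    cmd.ExecuteNonQuery();
}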
Is there a way to get some kind of feedback from the MySQL server that it's 10% complete, 20% complete, etc., that I could pass on to the user?
There are two ways to approach this problem.
The first is the easy way, as you've suggested: just use a progress bar that bounces back and forth (an indeterminate/marquee bar). It's not great and it's not the best user experience, but it's better than locking up the application, and at least it gives some feedback. I also assume this isn't something that happens regularly, just a one-off annoyance every now and again, so it's not something I'd really worry about.
However, if you really are worried about the user experience and want to give better feedback, then you're going to need some metrics. Taking your DECIMAL example, time the change against different row counts: 100,000 rows, a million rows, and so on. That gives you a back-of-the-napkin guess at how long it might take. With different hardware and other things running on the machine you're never going to get it exact, but you'll have an estimate.
Once you have an estimate and you know the row count, you can drive a real progress bar from it. And if it gets to 100% and the real operation hasn't completed, or if it finishes before you get to 100% (and you can insta-jump the bar!), it's... something.
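If you do go the estimate route, here's a rough sketch of what I mean. The per-million-row cost, GetApproximateRowCount, RunAlterStatement and reportProgress are all made-up placeholders you'd swap for your own benchmark numbers and UI:

// Sketch: drive a determinate progress bar from a benchmarked cost-per-row estimate.
void RunWithEstimatedProgress(Action<int> reportProgress)
{
    long rowCount = GetApproximateRowCount("prices");     // e.g. SELECT COUNT(*) beforehand
    double secondsPerMillionRows = 12.0;                  // guess taken from your own timing runs
    double estimatedSeconds = rowCount / 1000000.0 * secondsPerMillionRows;

    var sw = System.Diagnostics.Stopwatch.StartNew();
    var alter = System.Threading.Tasks.Task.Run(() => RunAlterStatement());  // the long-running ALTER

    while (!alter.IsCompleted)
    {
        // Never report 100% until the ALTER actually returns.
        reportProgress((int)Math.Min(99, sw.Elapsed.TotalSeconds / estimatedSeconds * 100));
        System.Threading.Thread.Sleep(500);               // or a UI timer if this runs on the UI thread
    }
    reportProgress(100);                                  // insta-jump if it finishes early
}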
Personally I'd go with option one, and perhaps add the kind of helpful message Windows commonly shows: "This may take a few minutes". Maybe add "Now's a great time for coffee!" And a nice little animated GIF :)
I have a feed reader running every minute (it's picking up a feed that gets updated often). But I seem to be running into getting blocked by Akamai when accessing a few websites. Perhaps they think I'm up to something, but I'm not - I just want to get the feed.
Any thoughts on how to either play nice with Akamai or code this differently? From what I know, I can't know when the feed is updated other than by polling it - but is there a preferred way, like checking a cache? This is coded in C#, though I doubt that makes a difference.
Without more context it is hard to ascertain why you are being blocked. Is it because of rate limits or other access-control measures?
Assuming it is rate limits, there is not much you can do. I would recommend first verifying that robots.txt allows you to crawl the URL and, if it does, using some sort of exponential back-off. It also helps to play nice by providing a meaningful User-Agent, so that when they do update their rules they might consider whitelisting legitimate requests such as yours.
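A rough sketch of that back-off plus User-Agent idea in C# (the User-Agent string, URL handling and retry cap are placeholders, not anything Akamai-specific):

using System;
using System.Net.Http;
using System.Threading.Tasks;

class FeedPoller
{
    static readonly HttpClient Client = new HttpClient();

    static FeedPoller()
    {
        // A descriptive User-Agent so the operator can identify (and possibly whitelist) you.
        Client.DefaultRequestHeaders.UserAgent.ParseAdd("MyFeedReader/1.0 (+https://example.com/contact)");
    }

    public static async Task<string> FetchWithBackoffAsync(string url)
    {
        int delaySeconds = 60;                         // normal 1-minute polling interval
        for (int attempt = 0; attempt < 5; attempt++)  // arbitrary retry cap
        {
            var response = await Client.GetAsync(url);
            if (response.IsSuccessStatusCode)
                return await response.Content.ReadAsStringAsync();

            // Blocked or throttled: wait progressively longer before trying again.
            await Task.Delay(TimeSpan.FromSeconds(delaySeconds));
            delaySeconds *= 2;
        }
        throw new HttpRequestException("Feed still unreachable after backing off.");
    }
}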
With my WCF service, I am solving an issue that has both performance and design effects.
The service is a stateless RESTful PerCall service that does a lot of simple and common things, which all work fine and dandy.
But there is one operation that has started to scare me a lot recently, so here is the problem:
Clients make parametrized calls to the operation, and computing the result takes a long time. But the result of a call with identical parameters will always be the same until the data on the server change, and clients make an awful LOT of calls with exactly the same parameters. The server, however, cannot predict which parameters the users will ask for, so sadly the results cannot be precomputed.
So I came up with a caching layer and store the result object as a key-value pair, where the key represents the parameters that led to that result. If the relevant data change, I just flush the cache. Still simple, and no problems with this.
A client calls the service; the server receives the call, checks whether the result is already cached, and returns it if so. But if the result is not cached yet, that call starts the computation. The computation may take up to 2 minutes (average 10-15 seconds) to finish, and in the meantime other clients may come in; because the result is still not in the cache, each of them would start their own computation, which is NOT what we want. So there is a flag indicating that someone has already started the computation with these parameters, and this is the place in the code where the other callers stop and wait for the computation to finish and be inserted into the cache, from where each of the invoked instances grabs the result, returns it to the client, and disposes.
And this is the part, which I am really struggling with.
By now, my solution looks something like this (before you read further, I want to warn you that my experience is nowhere near a decent level and I am still a big noob in C#, WCF and related stuff... no need to tell me I'm a noob, because I am fully aware of that):
// Poll (the ugly part): wait until the result appears in the cache or we give up.
Stopwatch sw = Stopwatch.StartNew();
while (!Cache.Contains(parameters) && sw.Elapsed <= threshold)
{
    Thread.Sleep(100);   // periodic check while the first caller is still computing
}
...do relevant stuff here
As you see, there are more problems with this solution:
Having the loop, the check and all this stuff not only feels ugly; with many clients waiting this way, resource usage tends to jump up.
If the operation fails (the initial caller's computation fails to deliver within the threshold), I don't really know which client should be the next to try the computation, or how, or even whether I should run the operation again or return a fault to the client...
EDIT: This is not about synchronization; I am aware of the need for locking in some parts of my application, so my concerns are not synchronization-related.
What should I do when the relevant server-side data change while the invoked code is still performing the computation (so that the result would be wrong)? ... Moreover, this has some other horrible effects on the application, but yeah, I am getting to the question here:
So, as usual, I did my homework and googled around before asking, but did not succeed in finding guidance that I would either understand or that would suit my issues and domain.
I have a strong feeling that I need to introduce some kind of (static?) event-based and/or asynchronous class (call it a layer if you will) that organizes and manages all these things in a register-to-me-and-I-will-give-you-a-poke / poke-all-registered-threads manner. But despite being able (to a certain extent) to use the newer Tasks, the TPL, and async-await, I not only have very limited experience in this field but, more sadly, really need help understanding how it could come together with events (or whether I even need them). When I try little things in a test console application I might succeed, but bringing them into the bigger environment of my WCF application, I struggle to get a clue.
So I will gladly welcome any relevant thoughts, advice, guidance, links, code and criticism on this topic.
I am aware that it might be confusing and will do my best to clear up any misunderstandings and tricky parts - just ask me to.
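To make the shape of what I'm after concrete, here is a rough sketch of the kind of thing I imagine (not working code from my service, just my guess at the pattern; GetOrComputeAsync and the class name are made up):

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// One computation per distinct parameter set; later callers await the same Task.
class ResultCache<TKey, TResult>
{
    private readonly ConcurrentDictionary<TKey, Lazy<Task<TResult>>> _cache =
        new ConcurrentDictionary<TKey, Lazy<Task<TResult>>>();

    public Task<TResult> GetOrComputeAsync(TKey key, Func<TKey, Task<TResult>> compute)
    {
        // Lazy guarantees the factory runs only once, even if two callers race on the same key.
        Lazy<Task<TResult>> lazy = _cache.GetOrAdd(
            key, k => new Lazy<Task<TResult>>(() => compute(k)));

        Task<TResult> task = lazy.Value;

        // If the computation faulted, evict it so a later caller can retry
        // instead of everyone being stuck with the same failed result.
        task.ContinueWith(t =>
        {
            Lazy<Task<TResult>> removed;
            _cache.TryRemove(key, out removed);
        }, TaskContinuationOptions.OnlyOnFaulted);

        return task;
    }

    // Call this when the relevant server-side data change: the next call recomputes.
    public void Flush()
    {
        _cache.Clear();
    }
}

The idea would be that each WCF call does something like cache.GetOrComputeAsync(parameters, p => Task.Run(() => Compute(p))) and either awaits it or blocks on it with the threshold as a wait timeout. Is something along these lines the right direction, or am I missing a cleaner event-based way?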
Thanks for help!
I have a C# application that generates data every second (stock tick data) which can be discarded after each iteration.
I would like to pass this data to a ColdFusion (10) application. I have considered having the C# application write the data to a file every second and then having the ColdFusion application read that data, but this is most likely going to cause issues, with the potential for both applications trying to read or write the file at the same time.
I was wondering if using Memory Mapped Files would be a better approach ? If so, how could I access the memory mapped file from Coldfusion ?
Any advice would be greatly appreciated. Thanks.
We have produced a number of stock applications that include tick-by-tick tracking of watchlists, charting, etc. I think a file is probably not a great idea unless you are talking about a single stock at regular intervals. In my experience, a change every "second" is probably way understating the case. Some stocks (AAPL or GOOG are good examples) have hundreds of "ticks" per second during peak times.
So if you are NOT taking every tick but really are "updating the file" every second, then your idea has some merit in that you could use a file-watching gateway to fire events for you and "see" that the file is updated.
But keep in mind that you are in effect introducing something "in the middle". A file now stands between your applications (C# and CF) and the quote engine. That's going to introduce latency no matter what you do (getting and releasing file handles, etc.), and the locks of one process may interfere with the other.
When you are dealing with Facebook updates, milliseconds don't really matter much - in spite of all the teenage girls who probably disagree with me :) With stock quotes, however, half the task is shaving off milliseconds to get your processes as close to real time as possible.
Our choice is usually sockets instead of something in the middle bridging the data. The quote engine then keeps its watchlist and updates its arrays as normal, but also sends any updates downstream to the socket engine, which pushes them to something that can handle them (a charting application, a watchlist, a socket gateway for a web page, etc.).
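For what it's worth, here's a bare-bones sketch of the C# side of that "push to a socket" idea; the port, the tick format and the class names are placeholders, and the ColdFusion/web end would need its own socket gateway or client to consume it:

using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Text;

// The quote engine pushes each tick line down to any connected consumer.
class TickPublisher
{
    private readonly List<TcpClient> _clients = new List<TcpClient>();
    private readonly TcpListener _listener = new TcpListener(IPAddress.Loopback, 9500); // placeholder port

    public void Start()
    {
        _listener.Start();
        _listener.BeginAcceptTcpClient(OnAccept, null);
    }

    private void OnAccept(IAsyncResult ar)
    {
        lock (_clients) _clients.Add(_listener.EndAcceptTcpClient(ar));
        _listener.BeginAcceptTcpClient(OnAccept, null);   // keep accepting new consumers
    }

    // Called by the quote engine whenever a tick (or a 1-second batch) is ready.
    public void Publish(string symbol, decimal price)
    {
        byte[] line = Encoding.UTF8.GetBytes(symbol + "," + price + "\n");
        lock (_clients)
        {
            _clients.RemoveAll(c => !TrySend(c, line));   // drop consumers that have gone away
        }
    }

    private static bool TrySend(TcpClient client, byte[] data)
    {
        try { client.GetStream().Write(data, 0, data.Length); return true; }
        catch { return false; }
    }
}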
Hope this helps - it's not a clear answer but more of a clarification to the hurdles you face.
Ok, I'm currently writing a scheduling web app for a large company, and it needs to be fast. Normal fast (<1s) doesn't cut it with these guys, so we're aiming for <0.5s, which is hard to achieve when using postbacks.
My question is: does anyone have a suggestion of how best to buffer calendar/schedule data to speed load times?
My plan is to load the selected week's data plus a week on either side, and use these to buffer the output: i.e. it will never have to load the week you've asked for, it'll always have that in memory, and it'll buffer the weeks on either side for when you next change.
However, I'm not sure exactly how to achieve this. The async loading is simple using AJAX page methods; it's a question of where to store the data (temporarily) after it loads. I am currently using a static class with a dictionary to do it, but this is probably not the best way when it comes to scaling to a large user base.
Any suggestions?
EDIT
The amount of data loaded is not particularly high: there are a few fields on each appointment, which are converted into a small container class and have some processing done on them to organise the dates and calculate the concurrent appointments, and it's unlikely there'll be more than ~30 appointments a week given the domain. However, the database is under very high load from other areas of the application (this is a very large-scale system with thousands of users transferring a large volume of information around).
So are you putting your buffered content on the client or the server here? I would think the thing to do would be to chuck the data for the previous and next weeks into a JavaScript data structure on the page and let the client arrange it for you. Then you could just bounce back to the server asynchronously for the next week whenever one of your buffered neighbour weeks is opened, so you're always a week ahead as you said - assuming that the data will only be accessed week by week. (There's a server-side sketch of that round-trip just below.)
I would also, for the sake of experimentation, see what happens if you put a lot more calendar data into the page for JavaScript to process - this type of data can often be pretty small, even a lot of information barely adding up to the equivalent of a small image in terms of data transfer - and you may well find that you can have quite a bit of information cached ahead of time.
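If you go that route, the server side can stay a thin page method that hands the client one week at a time to stash in its own structure. Everything here (Appointment, ScheduleRepository.GetAppointmentsForWeek) is a made-up name for illustration:

using System;
using System.Collections.Generic;
using System.Web.Services;

public partial class SchedulePage : System.Web.UI.Page
{
    // Called from the client via AJAX when a buffered neighbour week is opened,
    // so the browser can fetch the *next* neighbour in the background.
    [WebMethod]
    public static List<Appointment> GetWeek(DateTime weekStart)
    {
        // hypothetical data-access call; returns the ~30 appointments for that week
        return ScheduleRepository.GetAppointmentsForWeek(weekStart);
    }
}

With EnablePageMethods switched on in the ScriptManager, the client can call PageMethods.GetWeek(...) asynchronously and keep the result in its own JavaScript cache.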
It can be really easy to assume that because you have a tool like Ajax you should be using it the whole time, but then I do use a hammer for pretty much all jobs around the home, so I'm a fine one to talk on that front.
The buffering won't help on the first page, though - only on subsequent back/forward requests.
Tbh I don't think there's much point, as you'll want to support hyperlinks and redirects from other sources as much as or more than just back/forward. You might also want to "jump" to a month. Forcing users to page back and forwards to get to the month they want is actually going to take longer and be more frustrating than a <1s response time to go straight to the page they want.
You're better off caching data generally (using something like Velocity) so that you almost never hit the db, but even that's going to be hard with lots of users.
My recommendation is to get it working, then use a profiling tool (like ANTS Profiler) to see which bits of code you can optimise once it's functionally correct.
Does anyone have any experience with receiving and updating a large volume of data, storing it, sorting it, and visualizing it very quickly?
Preferably, I'm looking for a .NET solution, but that may not be practical.
Now for the details...
I will receive roughly 1000 updates per second - some updates to existing records, some new rows of data. But it can also be very burst-driven, with sometimes 5000 updates and new rows at once.
By the end of the day, I could have 4 to 5 million rows of data.
I have to both store them and also show the user updates in the UI. The UI allows the user to apply a number of filters to the data to just show what they want. I need to update all the records plus show the user these updates.
I have a visual update rate of 1 fps.
Anyone have any guidance or direction on this problem? I can't imagine I'm the first one to have to deal with something like this...
At first thought, some sort of in-memory database, I would think - but will it be fast enough for querying for updates near the end of the day once I have a large enough data set? Or is that all dependent on smart indexing and queries?
Thanks in advance.
It's a very interesting and also challenging problem.
I would approach this with a pipeline design, with processors implementing sorting, filtering, aggregation, etc. The pipeline needs an async (thread-safe) input buffer that is processed in a timely manner (under a second, per your 1 fps requirement). If you can't manage that, you need to queue the data somewhere, on disk or in memory, depending on the nature of your problem.
Consequently, the UI needs to be implemented in a pull style rather than push: you only want to update it every second.
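A minimal sketch of that buffer-then-pull shape, assuming a simple Update type (the field names and the applyBatch callback are placeholders): producers enqueue ticks as fast as they arrive, and one consumer drains whatever accumulated and hands the UI a single batch per second.

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class UpdatePipeline
{
    private readonly ConcurrentQueue<Update> _buffer = new ConcurrentQueue<Update>();

    public void Enqueue(Update u)            // called ~1000-5000 times per second by the feed
    {
        _buffer.Enqueue(u);
    }

    public void Start(Action<List<Update>> applyBatch, CancellationToken token)
    {
        Task.Run(() =>
        {
            while (!token.IsCancellationRequested)
            {
                Thread.Sleep(1000);          // matches the 1 fps UI requirement

                var batch = new List<Update>();
                Update u;
                while (_buffer.TryDequeue(out u))
                    batch.Add(u);

                if (batch.Count > 0)
                    applyBatch(batch);       // sort/filter/aggregate, then update UI and datastore
            }
        }, token);
    }
}

class Update                                  // assumed shape of a single tick/update
{
    public string Symbol;
    public decimal Price;
    public DateTime Timestamp;
}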
For the datastore you have several options. Using a database is not a bad idea, since you need the data persisted (and, I guess, queryable) anyway. If you are using an ORM, you may find NHibernate in combination with its superior second-level cache a decent choice.
Many of the considerations might also be similar to those Ayende made when designing NHProf, a realtime profiler for NHibernate. He has written a series of posts about them on his blog.
Maybe Oracle is a more appropriate RDBMS solution for you. The problem with your question is that at these "critical" levels there are too many variables and conditions you need to deal with - not only software, but the hardware you can get (it costs :)), connection speed, your expected common user system setup, and more and more and more...
Good Luck.