I am calling a third-party REST API. This API returns more than two thousand records, which takes a long time to retrieve. I want to show these records on an ASP.NET page.
The records display on the page successfully, but the problem is that retrieving and displaying them takes too much time.
I call this API on a dropdown change, so the API may be called again and again; it is not a one-time call. I also can't cache the information, because it may change on the server end.
So how can I reduce the download time, or is there any trick (paging is not supported)? Any suggestions?
Thanks in advance
Well, there's always a workaround for any given problem, even when it doesn't come to mind at first sight.
I had one situation like this, and this is what we did:
First, keep in mind that real-time consistency is simply not achievable without writing extensive code. For example, if a user requests data from your service and that data is then rendered on the page, the user could be looking at out-of-date data if the source is updated at that precise moment.
So we need to accept that the data will only be eventually consistent.
Then consider the following: replicate the data into your own database periodically, using an external service.
Expose this local data with paging support and consume it from your client code, as in the sketch below.
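A minimal sketch of that replication step, assuming you plug in your own fetch and store logic (every name here is hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical replication job: pulls the full record set from the slow
// third-party service on a schedule and overwrites the fast local copy.
public class ReplicationJob<T> : IDisposable
{
    private readonly Timer _timer;

    public ReplicationJob(Func<IList<T>> fetchFromService,
                          Action<IList<T>> replaceLocalCopy,
                          TimeSpan interval)
    {
        // Page requests read the local copy; only this timer pays the
        // price of the expensive external call.
        _timer = new Timer(_ => replaceLocalCopy(fetchFromService()),
                           null, TimeSpan.Zero, interval);
    }

    public void Dispose()
    {
        _timer.Dispose();
    }
}
```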
As with any design decision, there are trade-offs you should evaluate before making your final choice. These are the problems we had with a design like this:
The data will potentially not be consistent when the user requests it. You can poll the server for fresh data every xxx minutes, depending on how often it changes. (Note: usually we as developers are not happy with this kind of data inconsistency; we would like to see everything in real time. But the truth is that end users can most of the time live with this kind of latency, because that's the way they are used to working in real life. Consult your end users to gather their points of view on this. You can read more about this from Eric Evans in the Blue Book.)
You have to write extra code to maintain the data locally
Sadly, when you are dealing with a third-party service under these conditions, there are few options to choose from. I recommend evaluating them all with your end users, explaining the trade-offs, to choose the solution that best fits your needs.
Alternatives:
Since I already said the data will not be consistent all the time anyway, perhaps you could consider caching the call to the service =)
Usually the easiest way is the best way to face something like this, and since you didn't mention it in your post I will bring it up here: have you tried simply asking the third-party organization to add paging support? Get in touch with your third-party partner and talk about this.
If you can, evaluate other sources (other third-party companies providing a similar service).
So how can I reduce the download time, or is there any trick (paging is not supported)? Any suggestions?
If the API doesn't support paging, there's not much you can do from the consumer's perspective other than caching the data to avoid the expensive HTTP call.
There's no miracle. There's no client.MaxDownloadTimeMs property that you could set to 5, for example, and get your huge data set in less than 5 milliseconds.
Your only option would be to cache the data for some amount of time. If the third-party API has a call that is more generic than your dropdown change, you could make that call before the page loads to get all possible dropdown values and store the results for a certain period. Then just return the appropriate data each time.
Also, check whether the third-party API has any option for gzipping (compressing) the data before sending it down to you. I have built several APIs and have offered this kind of option for large datasets. On the consumer side, the request might look something like the sketch below.
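A sketch, assuming the server honors Accept-Encoding (the URL is made up):

```csharp
using System.IO;
using System.Net;

// If the third-party server supports compression, asking for a gzipped
// response is a one-liner on the consumer side.
public static class CompressedDownload
{
    public static string Fetch(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Sets the Accept-Encoding header and transparently decompresses.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;

        using (var response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd(); // already decompressed here
        }
    }
}
```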
I've built a small C# application that was able to get all the values I wanted (specifically, products) from a web service API. I have just switched to a larger database (accessed via the same web service API) and am now finding that I get blocked while iterating through the requests, since I've gone from 25 products to 1,000 products.
Well, honestly, I'm guessing that's the cause, but the database design is the same in both of my trials, so what has changed is just the amount of data.
I might mention that I'm new to the whole concept of using web services/APIs.
I have not tried anything specific yet. I guess one could use a timer between the requests, to let the server rest in between.
Currently I'm iterating through the list of all products, getting each product's URI, and then calling each URI to get the specifics of that product.
In the best of worlds, I would like either to find a solution that lets me make fewer requests than today, or a way to give the service a rest between requests (i.e., compare it to SQL's SELECT * FROM dbTable: make one request, then use the result). The timer idea would look roughly like the sketch below.
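Purely for illustration (the two delegates stand in for my actual web service calls, which I can't share here):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ProductFetcher
{
    public static List<string> FetchAll(Func<IEnumerable<string>> getProductUris,
                                        Func<string, string> getProductDetails,
                                        int delayMs)
    {
        var products = new List<string>();
        foreach (var uri in getProductUris())
        {
            products.Add(getProductDetails(uri)); // one request per product URI
            Thread.Sleep(delayMs);                // let the server rest in between
        }
        return products;
    }
}
```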
Does anyone have any good suggestions? Thanks :)
Thanks to Jeremy, I understand that my background info was incomplete. My program's purpose is to fetch all the data (products) from a webshop and distribute it to other systems, so this iteration process will happen ever so often in order to detect changes made to the products.
I have a requirement similar to Stack Overflow's: showing a number of metrics on a page in my asp.net-mvc site that are very expensive to calculate. Stack Overflow shows a lot of metrics on each page (like user accept rate, etc.) which clearly are not being calculated on the fly at page-request time, given that it would be too slow.
What is a recommended practice for serving up calculated data really fast without the performance penalty (assuming we can accept that this data may be a little out of date)?
Is this stored in some caching layer, or stored in some other "results" database table, so that every day a job calculates the data and stores the results where they can be queried directly?
Assuming I am happy to deal with the delay of having this data as a snapshot, what is the best solution for this type of problem?
They may well be relying on the Redis data store for such calculations and caching. This post from marcgravell may help.
Yes, the answer is caching; how you do it is (or can be) the complicated part. If you are using NHibernate, adding caching is really easy: it is part of your configuration, and on the queries you just add .Cacheable() and NHibernate manages the cache for you. Caching also depends on the type of environment: whether you're using a single worker, a web farm, or a web garden, you may have to build a caching layer that accommodates your scenario. With NHibernate's LINQ provider it looks roughly like the sketch below.
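A minimal sketch, assuming the second-level cache is already enabled in your NHibernate configuration (Metric is a made-up entity):

```csharp
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Linq;

// Made-up entity purely for illustration.
public class Metric
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
    public virtual decimal Value { get; set; }
}

public static class MetricQueries
{
    public static IList<Metric> GetAll(ISession session)
    {
        return session.Query<Metric>()
                      .Cacheable()   // NHibernate serves repeat calls from the cache
                      .ToList();
    }
}
```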
Although it is a somewhat recent technique, one really great way to structure your system so that things like this are possible is Command and Query Responsibility Segregation, more often referred to as CQRS.
OK, this is a very "generic" question. We currently have a SQL Server database for which we need to develop an application in ASP.NET, which will contain all the business logic in C# web services.
The thing is that, architecturally speaking, I'm not sure how to design the web service and the data management. There are many things to consider:
We need very rapid access to the data. Right now, we have over a million "sales" and "purchases" records, from which we often need to calculate and load the current stock for a given day according to a series of parameters. I'm not sure how we should preload the data and keep it in the web service. Doing the stock calculation within a SQL query would be very lengthy. They currently have a stock-calculation application that preloads all sales and purchases for the day and afterwards calculates the stock on the code side.
We want to develop powerful reporting tools. We want to implement a "pivot table", but we are not sure how to implement it with good performance.
For the reasons above, I'm not sure how to design the data model.
How would you manage the display of the current stock, considering that "stock" is actually purchases minus sales, and that you have to consider all rows to calculate it? Would you cache "stock" data in the database to optimize performance, even though it's redundant data?
Can anybody give me guidelines on how to start, or share their personal experiences (what have you done in the past)?
I'm not sure if it's possible to put a bounty on the question even though it is new (I'd put 300 rep on it, since I really need something). If you know how, let me know.
Thanks
The first suggestion is to not use legacy ASMX web services. Use WCF, which Microsoft says should be used for all new web service development.
Second, are you sure you can't optimize the database, place it on faster hardware, or move it nearer to the web server?
I'm not sure you're going to get that much data in memory at once. If you can, then you could load it into a DataSet and use LINQ to DataSets to query against it, along the lines of the sketch below.
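A sketch of what that in-memory stock calculation could look like (the table and column names are assumptions):

```csharp
using System.Data;
using System.Linq;

public static class StockCalculator
{
    // Stock = purchases - sales, computed in memory over preloaded rows.
    public static decimal StockFor(DataTable purchases, DataTable sales, int productId)
    {
        decimal bought = purchases.AsEnumerable()
            .Where(r => r.Field<int>("ProductId") == productId)
            .Sum(r => r.Field<decimal>("Quantity"));

        decimal sold = sales.AsEnumerable()
            .Where(r => r.Field<int>("ProductId") == productId)
            .Sum(r => r.Field<decimal>("Quantity"));

        return bought - sold;
    }
}
```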
I hope I'm misunderstanding what you wrote, but if by
contain all the business logic in C# Web Services
you mean something like this, then you're already headed in the anti-pattern direction. Accessing your data from an ASP.NET application over web services would just make you incur the serialization/deserialization penalty for pretty much no gain.
A better approach would be to organize the services you want to make available into a common layer that your applications are built on, access them directly from your ASP.NET application, and perhaps also expose them as web services so that external sources can consume the data.
You could also look into exposing data that is expensive to compute through a data warehouse that is updated at regular intervals (once or a couple of times a day). This would give you much better read performance, as long as you're willing to accept data that is a bit stale.
Is that the kind of information you're looking for?
I've been implementing MS Search Server 2010, and so far it's really good. I'm doing the search queries via its web service, but due to the inconsistent results, I'm thinking about caching the results instead.
The site is a small intranet (500 employees), so there shouldn't be any problems, but I'm curious what approach you would take if it were a bigger site.
I've googled a bit, but haven't really come across anything specific. So, a few questions:
What other approaches are there? And why are they better?
How much does it cost to store a dataview of 400-500 rows? What sizes are feasible?
What other points should I take into consideration?
Any input is welcome :)
You need to employ many techniques to pull this off successfully.
First, you need some sort of persistence layer. If you are using a plain old website, then the user's session is the most logical layer to use. If you are using web services (meaning session-less) and just making calls through a client, then you still need some sort of application layer (a kind of shared session) for your services. Why? This layer will be home to your database result cache.
Second, you need a way of caching your results in whatever container you are using (the session, or the application layer in the case of web services). You can do this in a couple of ways. If the query is something that any user can run, then a simple hash of the query will work, and you can share the stored result among users. You probably still want some sort of GUID for the result, so that you can pass it around in your client application, but having a hash lookup from queries to results will be useful. If the queries are unique, you can just use a unique GUID for each query result and pass it along to the client application. This is so you can perform your caching functionality.
The caching mechanism can incorporate some sort of fixed-length buffer or queue, so that old results automatically get cleaned out as new ones are added. Then, if a query comes in that is a cache miss, it gets executed normally and added to the cache, something like the sketch below.
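A minimal, not thread-safe sketch of that bounded cache idea:

```csharp
using System;
using System.Collections.Generic;

public class BoundedResultCache<TResult>
{
    private readonly int _capacity;
    private readonly Dictionary<string, TResult> _results = new Dictionary<string, TResult>();
    private readonly Queue<string> _insertionOrder = new Queue<string>();

    public BoundedResultCache(int capacity)
    {
        _capacity = capacity;
    }

    public TResult GetOrAdd(string query, Func<string, TResult> runQuery)
    {
        string key = query.GetHashCode().ToString(); // simple hash of the query text
        TResult cached;
        if (_results.TryGetValue(key, out cached))
            return cached;                           // cache hit: shared result

        if (_insertionOrder.Count >= _capacity)      // cache full: evict the oldest
            _results.Remove(_insertionOrder.Dequeue());

        TResult fresh = runQuery(query);             // cache miss: run it normally
        _results[key] = fresh;
        _insertionOrder.Enqueue(key);
        return fresh;
    }
}
```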
Third, you are going to want some way to page your result object. The Iterator pattern works well here, though something simpler might do, like "fetch X results starting at point Y". However, the Iterator pattern would be better, as you could then remove your caching mechanism later and page directly from the database if you so desired.
Fourth, you need some sort of pre-fetch mechanism (as others suggested). You should launch a thread that does the full search, and in your main thread just do a quick search for the top X items. Hopefully, by the time the user tries paging, the second thread will have finished and your full result will be in the cache; if the result isn't ready, you can just incorporate some simple loading-screen logic. The idea looks roughly like the sketch below.
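A rough sketch of the pre-fetch step, reusing the bounded cache above (ISearchService is a placeholder for your actual search client):

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public interface ISearchService
{
    IList<string> QuickSearch(string query, int top); // fast, top-X only
    IList<string> FullSearch(string query);           // slow, complete result
}

public class PrefetchingSearch
{
    private readonly BoundedResultCache<IList<string>> _cache;
    private readonly ISearchService _service;

    public PrefetchingSearch(BoundedResultCache<IList<string>> cache, ISearchService service)
    {
        _cache = cache;
        _service = service;
    }

    public IList<string> Search(string query)
    {
        // Kick off the expensive full search in the background...
        Task.Factory.StartNew(() => _cache.GetOrAdd(query, _service.FullSearch));

        // ...and hand back the quick top-20 result immediately.
        return _service.QuickSearch(query, 20);
    }
}
```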
This should get you some of the way... let me know if you want clarification/more details about any particular part.
I'll leave you with some more tips...
You don't want to send the entire result to the client app (if you are using Ajax or something like an iPhone app). Why? Because that is a huge waste. The user likely isn't going to page through all of the results; you would be sending 2 MB of result fields over the wire for nothing.
JavaScript is an awesome language, but remember that it is still a client-side scripting language; you don't want to slow the user experience down by sending massive amounts of data for your Ajax client to handle. Just send the prefetched result to your client, then send additional pages of results as the user pages.
Abstraction, abstraction, abstraction: you want to abstract away the cache, the querying, the paging, the prefetching, as much of it as you can. Why? Well, let's say you want to switch databases, or to page directly from the database instead of using a result object in the cache; if you do it right, this is much easier to change later on. Also, if you're using web services, many other applications can make use of this logic later on.
Now, I probably suggested an over-engineered solution for what you need :). But, if you can pull this off using all the right techniques, you will learn a ton and have a very good base in case you want to extend functionality or reuse this code.
Let me know if you have questions.
It sounds like the slow part of the search is the full-text searching, not the result retrieval. How about caching the resulting resource-record IDs? Also, since search queries are often duplicated, store a hash of the search query together with the query and the matching resources. Then you can retrieve the next page of results by ID. This works with AJAX too.
Since it's an intranet and you may control the searched resources, you could even pre-compute a new or updated resource's matches against popular queries during idle time.
I have to admit that I am not terribly familiar with MS Search Server, so this may not apply. However, I have often had situations where an application had to search through hundreds of millions of records for result sets that needed to be sorted, paginated, and sub-searched in SQL Server.
Generally, I take a two-step approach. First, I grab the first X results that need to be displayed and send them to the browser for a quick display. Second, on another thread, I finish the full query and move the results to a temp table where they can be stored and retrieved quickly. Any given query may have thousands or tens of thousands of results, but in comparison to the hundreds of millions or even billions of total records, this smaller subset can be manipulated very easily from the temp table. It also puts less stress on the other tables as queries happen. If the user needs a second page of records, wants to sort them, or just wants a subset of the original query, it is all pulled from the temp table.
Logic then needs to be put into place to check for outdated temp tables and remove them. This is simple enough, and I let SQL Server handle that functionality. Finally, logic has to be put into place for when the original query changes significantly (significant parameter changes), so that a new data set can be pulled and placed into a new temp table for further querying. All of this is relatively simple; the core of it looks something like the sketch below.
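A rough sketch of the two statements involved (table and column names are invented, and ROW_NUMBER assumes SQL Server 2005 or later):

```csharp
// The staging table name would normally be derived from a hash of the query.
public static class ResultStaging
{
    // Step 1: run the expensive query once and stage the matching IDs.
    public const string Materialize = @"
        SELECT ROW_NUMBER() OVER (ORDER BY Relevance DESC) AS RowNum, RecordId
        INTO   ResultCache_ab12cd        -- one staging table per distinct query
        FROM   BigTable
        WHERE  SearchField LIKE @terms;";

    // Step 2: paging, sorting, and sub-searching all hit the small table.
    public const string GetPage = @"
        SELECT RecordId
        FROM   ResultCache_ab12cd
        WHERE  RowNum BETWEEN @from AND @to
        ORDER  BY RowNum;";
}
```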
Users are accustomed to split-second response times from places like Google, and this model gives me enough flexibility to actually achieve that without needing the specialized software and hardware they use.
Hope this helps a little.
Tim's answer is a great way to handle things if you have the ability to run the initial query in a second thread, and if the logic (paging / sorting / filtering) to be applied to the results requires action on the server. Otherwise:
If you can use AJAX, a 500-row result set could be called into the page and paged or sorted on the client. This can lead to some really interesting features; check out the datagrid solutions from jQuery UI and Dojo for inspiration!
And for really intensive features like arbitrary regex filters and drag-and-drop column reordering, you can free the server entirely.
Loading the data into the browser all at once also lets you call in supporting data (page previews, etc.) as the user "requests" them.
The main issue is limiting the data you return per result to what you'll actually use for your sorts and filters.
The possibilities are endless :)
I'm facing a potential performance problem using the XtraReports tool together with a web service, in a Windows Forms app.
I know XtraReport handles a large data set (I understand a large data set as 10,000+ rows) by loading the first pages and then continuing to load the rest of the pages in the background, but all of this is done with a data source in hand. So what happens if this data source has to pass through a web service, which will need to serialize the data in order to send it to the client?
The scenario is the following:
I have a thin Windows Forms client that makes calls to a web service, which takes the call and, via reflection, instantiates the corresponding class and calls the required method (please note that this architecture is inherited; I have almost no choice here, I have to use it). So I'll have a class that gets the data from the database and sends it to the client through the web service interface. This data can be a DataSet, a SqlDataReader (also note that we're using SQL Server 2000, though it could be 2008 by the end of the year), a DataTable, XML, etc.
If the resulting data set is large, the serialization plus transfer time can be considerable, and rendering the report adds some more time, degrading the overall performance.
I know there is such a thing as streaming video, and streaming data through a web service sounds analogous, but I have no leads for trying something along those lines.
What do you think about this? Please let me know any questions you may have, or whether I should write more info for a better statement of the problem.
Thanks!
I'll give you an answer you probably don't want to hear.
Set expectations.
Reports are typically slow because they have to churn through a lot of data. There just isn't a good way to get around it. But barring that, I'd do the following:
Serialize the data to a binary state, convert it to something transferable via SOAP (base64, for instance), and transfer that. This way you'll avoid a lot of useless angle brackets; see the sketch below.
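A minimal sketch of that step for a DataSet payload (the method name is mine):

```csharp
using System;
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// RemotingFormat = Binary makes the DataSet serialize without its
// verbose XML representation.
public static class ReportTransfer
{
    public static string ToBase64(DataSet data)
    {
        data.RemotingFormat = SerializationFormat.Binary; // no angle brackets
        using (var stream = new MemoryStream())
        {
            new BinaryFormatter().Serialize(stream, data);
            return Convert.ToBase64String(stream.ToArray()); // SOAP-friendly string
        }
    }
}
```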
Precache as much data on the client as possible.
Focus on the perceived performance of the application. For instance, throw the report data gathering onto a background thread, so the user can go and do other work, then show the user a notification when the report is available.
Sometimes it is possible to generate the report for the most frequently used criteria ahead of time and provide that when asked.
Is there a way you can partition the data, i.e., return a small subset of it?
If there's no absolute requirement to return all the data at the same time, then dividing the data into smaller pieces will make a huge difference when reading from the database, serializing, and transporting through the web service.
Edit: I had deleted this answer because you had already mentioned partitioning. I'm undeleting it now; maybe it will serve some use as the starting point of a discussion...
As for your point about how to work with paging: I think you're on the right track with the "Show next 100 results" approach. I don't know how XtraReport works or what its requirements for a data source are. There are three issues I see:
Server support for partitioned data (e.g., your web service should support returning just "page 3"'s data; see the sketch after this list).
Summary data: does your report have a row of totals or averages? Are those calculated by the XtraReport control? Does it require a complete dataset to display those results? Could you provide the summaries to the control yourself (and find a more efficient way to calculate them, without returning the entire data set)?
XtraReport support for a data source that uses paging.
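What the first point could look like as a service contract (every name here is illustrative, not from XtraReports or any real product):

```csharp
using System;
using System.Collections.Generic;

[Serializable]
public class ReportPage
{
    public int PageNumber;
    public int TotalRows;        // lets the client size its pager/scrollbar
    public List<object[]> Rows;  // one object[] per report row
}

public interface IReportService
{
    // The server slices the result; the full set never crosses the wire.
    ReportPage GetPage(string reportId, int pageNumber, int pageSize);
}
```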
Transferring DataSets is a bad idea; a DataSet carries a lot of useless data. Use simple objects instead, and use an ORM on the server side.
Also, you can precalculate some data. Reference data can be cached on the client and then joined with the server data. A simple object for the transfer might look like the sketch below.
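A sketch of such a "simple object" (the fields are invented for illustration):

```csharp
using System;

// A plain DTO carries only the fields the client needs, with none of a
// DataSet's schema and change-tracking baggage.
[Serializable]
public class ProductDto
{
    public int ProductId;
    public string Name;
    public decimal UnitPrice;
    public int CategoryId; // join against reference data cached on the client
}
```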