Design of queryable web based results

Design of queryable web based results - c#

I'm trying to accomplish something like a facebook news feed wall, loading N number of results from the overall dataset, starting with the most recent, date descending. When you click “more”, it displays the next N underneath and so on until you finish the dataset.
I’m struggling to come up with the best design to accomplish this. Ive always been told that stateless web services are the only way to build a scalable enterprise application, which means that as I understand it, keeping the whole results object cached serverside on the first call to the page, and just taking N results from it with each subsequent web service call is a no no?
If that’s the case, then something like GetResults(int pageindex, int pagesize) would work.... and thats how I WAS going to do it but then I realised it would not work if someone added a new DB record in between calls. Eg you start with 23 wall feed items in the DB and want to display them 10 at a time.
First call, page 1, page size 10 will return results 14-23 (most recent first)
Someone then adds 2 new posts, so you have 25 now in the DB
Second call, page 2, page size 10 will return results 6-15, two of which were already returned in the first call.
So this offsetting approach doesn’t work because you can’t guarantee the underlying dataset will remain the same between calls.
Im confused, how do I accomplish this?
Edit: Sorry a little more info. To avoid the problem of huge data table lookups, I had considered the option of pre-populating a "transient" table with the last few days data for that user when you first load the screen, then just reading the results a page at a time from that transient table to make it faster reading, with a slightly slower load time. Then when you exhaust that data, you bring in the next period (say 2 weeks) into the transient table and continue reading.
The difficulty is that users will "Post" items which then automatically will be picked up by users who match their search criteria. Eg if your criteria state you want to meet people between 25 and 32 and within 50 miles of you, then when you load up your news feed, you want it to show posts from all users who match your criteria. Kindof like a dynamic friends list.
How I was going to achieve this was at time of login, a stored proc would run which would populate a transient table in the DB by selecting all users and filtering down based on age and location criteria which I have in static lookup tables (postcode distances etc), then it will save the list of Users who match your criteria to this transient table for use whenever you then need to filter posts or search users. If you update your preferences, it will also recalculate this but only when you update prefs or re-login. So any new users signing up won't appear until you next login, which is fine I think.
Then when it comes time to display your news feed, all it does is retrieves this list of User Ids from the DB who match your criteria, then brings back all NewsFeedPosts which were posted by those users. Hey presto, dynamic news feed!
But obviously this is a subset of the entire NewsFeedPost table which is generated on the fly, so it doesn't make sense to recalculate this every time a user clicks "more", so this was how I was thinking about implementing it.
Tables - NewsFeedCurrent, NewsFeedRecent, NewsFeedArchive
New posts are created in the current table. Every night a batch job runs that moves all data from current that is 2 days old, to the recent table, and any data in the recent table that is a week old to the archive table.
The thinking being that 90% of the time, the user will only be interested in the last 2 days of data. So keep table small for access time. Another 9% of the time the user may want the last weeks data. So keep that separate in a secondary table. Then only 1% of the time the user wants data more than a week old so keep that in a larger, slow archive table that will be slower, but gives you performance boost by keeping current and recent tables small.
So when you first hit the news feed page, what it was going to do is take the pre-generated user list for your account and pull out all NewsFeedCurrent items and put them in a transient table, say TempNewsFeed under your user ID. You can then work with this resultset just by pulling back everything for your user id, no filtering required for items you arent interested in as they are pre-filtered. this will add a second or so to the page load but will improve response time when fetching results. Then when that data is exhausted, it will then - again using the list of users matching your criteria - pull out all relevant data from the Recent table, adding it to the TempNewsFeed table, allowing you to continue fetching data up to a week old. When thats exhausted, it will finally go to the archive table and using the user id list, pull out all data matching this and put in the temp table, allowing you to continue navigating the remaining data. This will give a fairly significant delay as it populates the archive data but if you are going back a week, then you will have to accept 5-10 seconds wait while it populates the data and says "loading data...". Once it has though, navigating historical data will be just as quick as recent data as it will all be in the transient table.
If you refresh the screen or go back onto it from another screen, it clears out the transient table and starts again from the Current table data.

Hope my answer makes sense, makes the right assumptions ...
I would divide the news feed into two sections. The first is for incoming news - which would be powered with AJAX calls. It is constantly saying "What is new?" The second section is for older news, where the user can lazily load more news by scrolling down.
Newest News Items
The important point is to make note of the maximum news feed id on your page. Let's imagine that is 10000. When the user loaded the page, news feed id 10000 was the latest news item.
When the new section is updated with AJAX, we simply ask, "What is newer than id 10000?" and we load those items onto the page. After we load them, we also increment the id on the page. For example, if we start with id 10000 and we load five new news items, the new id would be 10005. The next call would ask, "What is newer than 10005?"
Older News Items
The older section would keep track of the oldest news item on the page. Let's imagine they scroll back for a weeks worth of news. The minimum news item id would be 9000. When they want to scroll back further, we simply ask, "What is older than 9000?"
The idea then is to maintain on the page the maximum news item id and the minimum news item id and then keep loading from that reference point.

Related

Assign unique numbers in database when saving the form

I have a form where you can create an order and when you save it, is checking in the database (using oracle) for the last order number and is assigning the next one to the currently saved order. What I found is that if two users are saving a new order both in the same time or at few seconds apart, because of the connection speed my app is unable to assign different numbers for the newly two created orders. The problem is that both are checking in the same time the last assigned number and both orders get the same number..
I have some ideas but all of them have advantages and disadvantages..
To have the system wait a few seconds and check the order number when the user saves the order. But if both saved in the same time, the check will be done in the same time later and I guess that I will end up with the same problem..
To have the system check the order number (a check is run every time the treeview is refreshed) and see if it’s been duplicated and then let the user know via the treeview with some highlight, that it’s been duplicated. But if any documents are assigned to the order before the check, then I will end up with documents having a different number in the name and inside from the order to which is assigned..
To have the system check all order numbers periodically and give one of the duplicates a new order number, but Here is the same problem with the documents as at #2.. And also might cause some performance issue..
Assigning the order number when a user requests a new order not when he saves the order. I could have the system do Solution #1 along with this solution and recheck to see if the number is being used within the database and then reassign it a new one. Once again, if documents get assigned, someone has to go fix those.
One way of possibly stopping the documents from being assigned to duplicates is that the user is only allowed put some of the information and then save it or apply it and it does the recheck of #1, and then if it doesn't find anything, allow the user to add documents. This part of the solution could be applied possibly to any of the above but I don't want to delay the users work while is checking the numbers..
Please if you see any improvements to the ideas above or if you have new ones, let me know.
I need to find the best solution and as much as possible not to affect the user's current workflow..

If your Order ID is only a number you can use Oracle Sequence.
CREATE SEQUENCE order_id;
And before you save the record get a new order number.
SELECT order_id.NEXTVAL FROM DUAL;
See also Oracle/PLSQL: Sequences (Autonumber)

Persisting data for catch-all search

We have a set of catch all search pages we're creating in our ASP.NET application. We have an initial search page, a SERP, and then a single item details page. All 3 pages have a search bar with initial criteria, more criteria, and advanced criteria choices.
When we put all of our criteria together, in addition to the main search box we have 20 different criteria parameters (from price, to price, sale item, date created, etc.) and then three collections of parameter IDs. These collections are from a list of the Manufacturers, Product Lines, and Categories our users can search from. So we have this fixed set of 20 fields and then 3 collections that could have a manufacturer or two, or could hold a collection of 100 Guids for the lines whose checkboxes they selected and want to search through.
In our old system we had a single form solution and we just posted back and submitted everything to our business object, passing it into a method that returned the results. In this new form we need to submit the results from page to page and persist this criteria. We're trying to figure out the best way to persist the data, when I say best I mean most efficient.
Querystring - This isn't going to work with large collections of Guid values for the 3 collections.
Session - We would create a criteria object and store it in the Session. As they move from page to page we can pull it out. At our peak we probable have 200-300 people using the server concurrently and the search is our most used form. I'm worried about performance with all those session variables.
Database - We were thinking of serializing and stashing this criteria object into the database (SQL Server 2k5) and the users would always have a current Search or last Search in the database. This eliminates some of the web server load from the Session solution but I'm worried this object load, serialization, db round trip, and unload is going to slow the forms down and affect user experience.
I'm looking for advice on which method is going to work most efficiently for us or if there is an accepted best practice or pattern I've overlooked.

With HTML5 you can use localStorage and sessionStorage, which makes the client keep the information in their browser.
http://www.w3schools.com/html/html5_webstorage.asp

storing dataset of entire table and doing query on copy then updating GridView with results of query

I'm new to n-tier enterprise development. I just got quite a tutorial just reading threw the 'questions that may already have your answer' but didn't find what I was looking for. I'm doing a geneology site that starts off with the first guy that came over on the boat, you click on his name and the grid gets populated with all his children, then click on one of his kids that has kids and the grid gets populated with his kids and so forth. Each record has an ID and a ParentID. When you choose any given person, the ID is stored and then used in a search for all records that match the ParentID which returns all the kids. The data is never changed (at least by the user) so I want to just do one database access, fill all fields into one datatable and then do a requery of it each time to get the records to display. In the DAL I put all the records into a List which, in the ObjectDataSource the function that fills the GridView just returns the List of all entries. What I want to do is requery the datatable, fill the list back up with the new query and display in the GridView. My code is in 3 files here
(I can't get the backticks to show my code in this window) All I need is to figure out how to make a new query on the existing DataTable and copy it to a new DataTable. Hope this explains it well enough.
[edit: It would be easier to just do a new query from the database each time and it would be less resource intensive (in the future if the database gets too large) to store in memory, but I just want to know if I can do it this way - that is, working from 1 copy of the entire table] Any ideas...

Your data represents a tree structure by nature.
A grid to display it may not be my first choice...
Querying all data in one query can be done by using a complex SP.
But you are already considering performance. Thats always a good thing to keep in mind when coming up with a design. But creating something, improve it and only then start to optimize seems a better to go.
Since relational databases are not real good on hierarchical data, consider a nosql (graph)database. As you mentioned there are almost no writes to the DB, nosql shines here.

Need to display data as it is returned using web service/jquery

I have an operation (That I can't change) that starts threads that make calls to our Oracle database to see if a certain hotel(s) has availability on a certain date.
If a date/hotel combination has availability, that thread returns information about the date/hotel in the form of a DataTable that is merged into a Main DataTable of results. Yes, I know ... I inherited this.
So I am trying to re-write this operation. I still must query Oracle in threads to get the availability information, but I want to display the data as it is returned (in chunks of 5, 10? I'm flexible), instead of having the user sit in front of the screen for up to 4 minutes before a complete result is spat out into a GridView.
How do I do this directly from an .aspx page so I can make a web service call and populate a grid (JqGrid?) with the results?
If I haven't provided enough information or described what I am trying to achieve, please let me know and I will elaborate.

Oracle provides a field on each row called "rowid"
(http://www.adp-gmbh.ch/ora/concepts/rowid.html)
The first time you send the query, send in the int (x) to define what the highest rownumber you want is. Have the service return the total number of rows and the first x rows.
Then, the 2nd time you send the query, get the next x rows, rinse and repeat.
Basically, you need to send an ajax query for rows x through y each time until you have them all loaded.
I would recommend paging as well, since users typically don't want to see hundreds of results at a time.

Keeping Track of User Points (Like SO)

I want to be able to keep track of user points earned on my website. It isn't really like SO but the point system is similar in that I want each user to have a total and then I want to keep track of the transactions that got them to that total.
Should I keep a user total in the User table or should I just pull all the transactions that affect the User in questions point total, sum them and show the point total?
Seems like the latter is more work than needs to be done just to get the total. But then again I cringe at the idea of keeping the same data(more or less) in two different places.
What's the right way to design this?
EDIT: Took the advice. Using both and recalcs. I added a RecalcDate column, and if its over a day old it gets recalced. The total also get recalculated everytime a user does something that should affect their point total.

Both
You need to have a way of recalculating totals when things go wrong, say you add a new feature, or someone learns to exploit the system. You can keep a current total on the user table and a record of transactions to recalculate that total when needed...not every time you need the value to display.
You're not storing duplicate data so much as the audit history to fall back on, the only duplicate is one number in one column on the User table...the alternative is a user exploits the system, there's no way to roll it back. The same thing happened in the early days of SO, but they had the history and could recalculate totals without a sweat.

You should probably do a mix of both.
Keep a running total on the User table and also keep a log of each transaction that affects the user total, that way you don't need to do a sum of all the records, but you'll have them just in case.
The numbers may get out of sync, which is why you might need to do a recalc every now and then. (StackOverflow calls it a recalc, where they go through and update your reputation to what you should have).

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.