Caching a dynamically changing list of objects - C#

I have a question about how to efficiently cache a list of objects. I have a sample table of Trips; each Trip has a DateFrom and a DateTo. I want to cache a list of Trips.
The first approach I'm considering is getting the Trips from the database and caching a list of Trip ValueObjects (by ValueObjects I mean all the data needed to display a Trip in a list) for x minutes. This approach is very simple, but it has a few disadvantages:
- dealing with pagination - if I store them page by page (for example under keys TripsP1, TripsP2, etc.), then whenever the page size in the GUI changes I have to keep a separate set of Trips under different keys (for example TripsP1Size1, TripsP1Size2, etc.)
- how do I deal with sorted lists? Keep a set of Trips under a different key for each filter combination?
The second approach I'm considering is hitting the database on each request, but selecting only Trip.Id. I then get each Trip ValueObject from the cache. After creating or modifying a Trip I put it into the cache with its TripId as the key. And if for some reason I can't find a Trip ValueObject in the cache, I fetch it from the database and put it into the cache.
Which of these two approaches is better? Or can you suggest a more efficient way?
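To make the second approach concrete, here is a minimal cache-aside sketch using System.Runtime.Caching; TripRepository, TripFilter, and TripValueObject are hypothetical names standing in for your own data-access types:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.Caching;

public class TripCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;
    private readonly TripRepository _repository; // hypothetical data-access class

    public TripCache(TripRepository repository) { _repository = repository; }

    // The DB query returns only the matching ids (cheap to filter, sort, page);
    // the heavier value objects come from the cache, falling back to the DB.
    public List<TripValueObject> GetPage(TripFilter filter, int page, int pageSize)
    {
        List<int> ids = _repository.GetTripIds(filter, page, pageSize);
        return ids.Select(GetOrLoad).ToList();
    }

    private TripValueObject GetOrLoad(int tripId)
    {
        string key = "Trip:" + tripId;
        var cached = Cache.Get(key) as TripValueObject;
        if (cached != null)
            return cached;

        TripValueObject trip = _repository.LoadValueObject(tripId);
        Cache.Set(key, trip, DateTimeOffset.Now.AddMinutes(10));
        return trip;
    }

    // Call this after creating or modifying a Trip so readers see the new value.
    public void Put(TripValueObject trip)
    {
        Cache.Set("Trip:" + trip.Id, trip, DateTimeOffset.Now.AddMinutes(10));
    }
}

Sorting and paging stay in SQL, so there is no per-page or per-sort key explosion; the cache holds exactly one entry per Trip.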

I don't see why you need to hold those Trips in the form of pages.
Why not load all the Trips into a List (which can be sorted, depending on your needs) and build an in-memory index for LINQ search queries? For example, this lib.
You can create in-memory indexes over your cache for very quick searches, depending on your application's query patterns.
If you are worried about Trips being changed while a transaction is in progress (a point all caches miss), you should implement it using an immutable collection (and make Trip an immutable type).
This way, when the user asks the server for items matching specific criteria, you can easily select those items and expose an IQueryable (an IEnumerable sequence) for streaming the data.
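For illustration, a sketch of the shape I mean, using System.Collections.Immutable; the Trip type and its Destination property are just assumed examples:

using System.Collections.Immutable;
using System.Linq;

public sealed class TripCatalog
{
    // Readers always see one consistent snapshot; a refresh builds a new
    // catalog and swaps the reference, so no locks are needed on reads.
    public ImmutableList<Trip> Trips { get; private set; }
    public ILookup<string, Trip> ByDestination { get; private set; } // example index

    public TripCatalog(ImmutableList<Trip> trips)
    {
        Trips = trips;
        ByDestination = trips.ToLookup(t => t.Destination);
    }

    public IQueryable<Trip> Query()
    {
        return Trips.AsQueryable();
    }
}

On refresh, build a new TripCatalog from the database and replace the old reference in a single assignment; queries already in flight keep working against the old snapshot.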
Hope this helps, Ofir.

Related

Best approach to track Amount field on Invoice table when InvoiceItem items change?

I'm building an app where I need to store invoices from customers so we can track who has paid and who has not, and if not, see how much they owe in total. Right now my schema looks something like this:
Customer
- Id
- Name
Invoice
- Id
- CreatedOn
- PaidOn
- CustomerId
InvoiceItem
- Id
- Amount
- InvoiceId
Normally I'd fetch all the data using Entity Framework and calculate everything in my C# service (or even do the calculation on SQL Server), something like so:
var amountOwed = Invoice.Where(i => i.CustomerId == customer.Id)
                        .SelectMany(i => i.InvoiceItems)
                        .Select(ii => ii.Amount)
                        .Sum();
But calculating everything every time I need to generate a report doesn't feel like the right approach this time, because down the line I'll have to generate reports that calculate what all the customers owe (and sometimes go even higher up the hierarchy).
For this scenario I was thinking of adding an Amount field to my Invoice table, and possibly an AmountOwed field to my Customer table, which would be updated or populated via the InvoiceService whenever I insert/update/delete an InvoiceItem. This should be safe enough and make report querying much faster.
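As a rough sketch of that service-layer idea (not the asker's actual code; AppDbContext and the method name are assumed, with entities from the schema above):

using System.Linq;

public class InvoiceService
{
    private readonly AppDbContext _db; // hypothetical EF context

    public InvoiceService(AppDbContext db) { _db = db; }

    public void UpdateItemAmount(int invoiceItemId, decimal newAmount)
    {
        using (var tx = _db.Database.BeginTransaction())
        {
            var item = _db.InvoiceItems.Find(invoiceItemId);
            item.Amount = newAmount;
            _db.SaveChanges(); // flush so the SQL sum below sees the change

            // Recompute from the items rather than applying a delta, so a
            // missed event can never leave the stored total out of sync.
            var invoice = _db.Invoices.Find(item.InvoiceId);
            invoice.Amount = _db.InvoiceItems
                .Where(ii => ii.InvoiceId == invoice.Id)
                .Sum(ii => (decimal?)ii.Amount) ?? 0m;
            _db.SaveChanges(); // flush so the customer sum sees the new total

            // Assumption: an unpaid invoice is one whose PaidOn is null.
            var customer = _db.Customers.Find(invoice.CustomerId);
            customer.AmountOwed = _db.Invoices
                .Where(i => i.CustomerId == customer.Id && i.PaidOn == null)
                .Sum(i => (decimal?)i.Amount) ?? 0m;
            _db.SaveChanges();

            tx.Commit(); // all three updates land together, or none do
        }
    }
}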
But I've also been searching some on this subject, and another recommended approach is using triggers in the database. I like this method best because even if I were to modify a value directly using SQL, bypassing the app services, the other tables would still update automatically.
My question is:
How do I add a trigger to update all the parent tables whenever an InvoiceItem is changed?
And from your experience, is this the best (safest, least error-prone) solution to this problem, or am I missing something?
There are many examples of triggers that you can find on the web. Many are poorly written unfortunately. And for future reference, post DDL for your tables, not some abbreviated list. No one should need to ask about the constraints and relationships you have (or should have) defined.
To start, how would you write a query to calculate the total amount at the invoice level? Presumably you know the T-SQL to do that. So write it, test it, verify it. Then add your amount column to the Invoice table. Now, how would you write an update statement to set that new amount column to the sum of the associated item rows? Again: write it, test it, verify it. At this point you have all the code you need to implement your trigger.
Since this process involves changes to the item table, you will need to write triggers to handle all three types of DML statements - insert, update, and delete. Write a separate trigger for each to simplify your learning and debugging. Triggers have access to special tables (inserted and deleted) - go learn about them. And go unlearn the false assumption that a trigger fires once per row - it doesn't. Triggers must be written to work correctly whether 0 (yes, zero), 1, or many rows are affected.
In an insert statement, the inserted table holds all the rows inserted by the statement that caused the trigger to execute. So you merely sum the values (using the appropriate grouping logic) and update the appropriate rows in the Invoice table. Having written the update statement mentioned in the previous paragraph, this should be a relatively simple change to that query. But since you can insert a new row for an old invoice, remember to add the summed amount to the value already stored in the Invoice table. This should be enough direction to get you started.
And to answer your second question: the safest and easiest way is to calculate the value every time. I fear you are trying to solve a problem that you do not have and may never have. Generally speaking, no one cares about invoices of "significant" age. You might care about unpaid invoices for a period of time, but eventually you write these things off (especially if the amounts are not significant). Another relatively easy approach is to create an indexed view to calculate and materialize the total amount. But remember - nothing is free. An indexed view must be maintained, and it adds extra processing to DML statements affecting the item table. Indexed views also have limitations - which are documented.
And one last comment. I would strongly hesitate to maintain a total amount at any level higher than the invoice. Above that level, one frequently wants to filter the results in many ways - by date, location, type, customer, etc. At that point you are approaching data-warehouse functionality, which is not appropriate for an OLTP system.
First of all, never use triggers for business logic. Triggers are tricky and easily forgotten, and they make an application hard to maintain.
In most cases you can easily populate your reporting data via Entity Framework or a SQL query. But if that requires lots of joins, you should consider using staging tables, because reporting calls for denormalized data. To populate the staging tables you can use SQL jobs or another scheduling mechanism (Azure Scheduler, perhaps). This way you won't need to work with lots of joins, and your reports will populate faster.

Persisting data for catch-all search

We have a set of catch-all search pages we're creating in our ASP.NET application: an initial search page, a SERP, and a single-item details page. All 3 pages have a search bar with initial criteria, more criteria, and advanced criteria choices.
When we put all of our criteria together, in addition to the main search box we have 20 different criteria parameters (from-price, to-price, sale item, date created, etc.) and then three collections of parameter IDs. These collections come from the lists of Manufacturers, Product Lines, and Categories our users can search from. So we have this fixed set of 20 fields, plus 3 collections that could hold a manufacturer or two, or a collection of 100 Guids for the lines whose checkboxes the user selected and wants to search through.
In our old system we had a single-form solution: we just posted back and submitted everything to our business object, passing it into a method that returned the results. In this new design we need to carry the criteria from page to page and persist it. We're trying to figure out the best way to persist the data; by best I mean most efficient.
Querystring - This isn't going to work with large collections of Guid values for the 3 collections.
Session - We would create a criteria object and store it in the Session. As the user moves from page to page we can pull it out. At our peak we probably have 200-300 people using the server concurrently, and the search is our most-used form. I'm worried about performance with all those session variables.
Database - We were thinking of serializing and stashing this criteria object in the database (SQL Server 2k5), so users would always have a current or last search in the database. This takes some load off the web server compared to the Session solution, but I'm worried that the object load, serialization, DB round trip, and unload will slow the forms down and hurt the user experience.
I'm looking for advice on which method will work most efficiently for us, or whether there is an accepted best practice or pattern I've overlooked.
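For what it's worth, a minimal sketch of the Session option; SearchCriteria and its members are invented names, not from the question:

using System;
using System.Collections.Generic;

// Small, serializable criteria object: [Serializable] keeps the door open
// for an out-of-proc session store or for persisting it to the database.
[Serializable]
public class SearchCriteria
{
    public string Keywords;
    public decimal? FromPrice;
    public decimal? ToPrice;
    public List<Guid> ManufacturerIds = new List<Guid>();
    public List<Guid> ProductLineIds = new List<Guid>();
    public List<Guid> CategoryIds = new List<Guid>();
}

// In the page code-behind: stash it under a fixed key and read it back.
Session["SearchCriteria"] = criteria;
var current = (SearchCriteria)(Session["SearchCriteria"] ?? new SearchCriteria());

Even at 300 concurrent users, one such object per session is only a few KB of memory (a 100-Guid list is 1.6 KB), so in-proc Session is unlikely to be the bottleneck here.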
With HTML5 you can use localStorage and sessionStorage, which let the client keep the information in the browser.
http://www.w3schools.com/html/html5_webstorage.asp

Improving nested objects filtering speed

Here's a problem I experience (simplified example):
Let's say I have several tables:
One customer can have many products, and a product can have multiple features.
On my ASP.NET front-end I have a grid with customer info, something like this:
Name Address
John 222 1st st
Mark 111 2nd st
What I need is the ability to filter customers by feature. So I have a dropdown list of the available features that are connected to a customer.
What I currently do:
1. I return a DataTable of Customers from a stored procedure and store it in ViewState.
2. I return a DataTable of features connected to customers from a stored procedure and store it in ViewState.
3. On filter selection, I run the stored procedure again with the new feature_id filter, where I do the joins again to show only the customers that have the selected feature.
My problem: It is very slow.
I think that possible solutions would be:
1. On page load, return ALL the data in one ViewState variable - basically three lists of nested objects. This will make my page load slowly.
2. Perform async loading in some smart way. How?
Any better solutions?
Edit:
this is a simplified example; I also need to filter customers by a property that is connected to the Customer table through 6 tables.
The way I deal with these scenarios is by passing XML to SQL and then running a join against that. The XML would look something like:
<Features><Feat Id="2" /><Feat Id="5" /><Feat Id="8" /></Features>
Then you can pass that XML into SQL (depending on your version of SQL Server there are different ways), but in the newer versions it's a lot easier than it used to be:
http://www.codeproject.com/Articles/20847/Passing-Arrays-in-SQL-Parameters-using-XML-Data-Ty
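On the C# side, the calling code could look roughly like this (the procedure and parameter names are made up; selectedFeatureIds is the user's checkbox selection):

using System;
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Xml.Linq;

// Build <Features><Feat Id="..." /> ... </Features> from the selected ids.
XElement xml = new XElement("Features",
    selectedFeatureIds.Select(id => new XElement("Feat", new XAttribute("Id", id))));

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("dbo.GetCustomersByFeatures", conn)) // hypothetical proc
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add("@FeatureXml", SqlDbType.Xml).Value = xml.ToString();

    conn.Open();
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // materialize only the customers that have the selected features
        }
    }
}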
Also, don't put any of that in ViewState; there's really no reason for that.
Storing an entire list of customers in ViewState is going to be hideously slow; storing all information for all customers in ViewState is going to be worse, unless your entire customer base is very very small, like about 30 records.
For a start, why are you loading all the customers into ViewState? If you have any significant number of customers, load the data a page at a time. That will at least reduce the amount of data flowing over the wire and might speed up your stored procedure as well.
In your position, I would focus on optimizing the data retrieval first (including minimizing the amount you return), and then worry about faster ways to store and display it. If you're up against unusual constraints that prevent this (very slow database; no profiling tools; not allowed to change stored procedures), then please let us know.
Solution 1: Include whatever criteria you need to filter on in your query, and only return and render the requested records. No need to use ViewState.
Solution 2: Retrieve some reasonable page limit of customers and filter in the browser with JavaScript. Allow easy navigation to the next page.

Is it faster to query a List<T> or a database?

I have recently had several situations where I need different data from the same table. One example is where I would loop through each "delivery driver" and generate a printable PDF file for each customer they are to deliver to.
In this situation, I pulled all customers and stored them into
List<Customer> AllCustomersList = customers.GetAllCustomers();
As I looped through the delivery drivers, I'd do something like this:
// Note the added ToList(): Where() alone returns an IEnumerable<Customer>.
List<Customer> DeliveryCustomers = AllCustomersList.Where(a => a.DeliveryDriverID == DriverID).ToList();
My question: Is the way I'm doing it by querying the List object faster than querying the database each time for customer records associated with the delivery driver?
There isn't a precise number of rows beyond which you should query the DB instead of an in-memory List<T>.
But the rule of thumb is that databases are designed to work with large amounts of data and have optimization mechanisms, while an in-memory list has no such machinery.
So you will need to benchmark it to see whether the round trip to the DB is worth it for that number of rows, in each case where it matters to you.
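A crude way to check, reusing the code from the question (GetCustomersByDriver is a hypothetical DB-backed call; a real benchmark would warm up and repeat the measurements):

using System;
using System.Diagnostics;
using System.Linq;

var sw = Stopwatch.StartNew();
var inMemory = AllCustomersList.Where(a => a.DeliveryDriverID == DriverID).ToList();
sw.Stop();
Console.WriteLine("List<T> filter: " + sw.ElapsedMilliseconds + " ms");

sw.Restart();
var fromDb = customers.GetCustomersByDriver(DriverID); // hypothetical DB-backed call
sw.Stop();
Console.WriteLine("DB query:       " + sw.ElapsedMilliseconds + " ms");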
"We should forget about small efficiencies, say about 97% of the time: premature
optimization is the root of all evil"
Avoiding round trips to the DB is one of the major rules of database performance tuning, especially when the DB is on the network and has multiple users accessing it.
On the other hand, bringing large result sets into memory, as your customer data appears to be, is not efficient, and probably no faster than a trip to the DB when you need the data.
A good use of in-memory collections to avoid round trips is for your lookup tables (e.g. customer categories, customer regions), which don't change often. That way you avoid joins in your main customer SELECT query, making it even faster.
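For example, a hedged sketch of caching a lookup table with System.Runtime.Caching (the key and the loader delegate are whatever fits your app):

using System;
using System.Collections.Generic;
using System.Runtime.Caching;

public static class LookupCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    // Returns the cached lookup list, loading it from the DB only on a miss.
    public static List<T> GetOrLoad<T>(string key, Func<List<T>> loadFromDb)
    {
        var cached = Cache.Get(key) as List<T>;
        if (cached != null)
            return cached;

        List<T> items = loadFromDb();
        Cache.Set(key, items, DateTimeOffset.Now.AddMinutes(30)); // refresh every 30 min
        return items;
    }
}

// Usage: var regions = LookupCache.GetOrLoad("CustomerRegions", LoadRegionsFromDb);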
Why not use Redis? It's an in-memory database, and it's very fast.

How to fill master-detail object collections - C#/SQL 2005

I'm using collections of business objects.
(Not using DataSets; generic collections only.)
Collections of business objects are filled using a SqlDataReader.
I'd like to know your opinion on the best approach to filling master-detail (or parent-child) collections.
Assume I have 2 objects: Invoice and Invoice_Details
The Invoice object has a generic collection "Details" (of type Invoice_Details).
What would be the best approach to working with / filling both collections?
(E.g., I'd like to read all invoices from the year 2008 and present them in the GUI.)
Do you read all invoices for the selected date range, then all the children, and populate each Invoice's Details?
Or read invoices one by one along with their related details (e.g. using multiple result sets)?
I've also noticed an approach based on a BindingSource - read the children only when the current record position changes...
I'm very interested in your opinion on what would be the best/fastest scenario.
This depends on what you're trying to do.
If you need all of the invoice details whenever you need the invoice, then issue a query for the details when you first access the invoice (unless you know you'll always need both, in which case issue a single batch with two result sets).
Similar patterns apply to other cases.
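A rough sketch of that single batch with two result sets, using a plain SqlDataReader (the table, column, and type names are guesses based on the question):

using System.Collections.Generic;
using System.Data.SqlClient;

string sql =
    @"SELECT Id, CreatedOn FROM Invoice WHERE YEAR(CreatedOn) = 2008;
      SELECT Id, InvoiceId, Amount FROM Invoice_Details
      WHERE InvoiceId IN (SELECT Id FROM Invoice WHERE YEAR(CreatedOn) = 2008);";

var invoices = new Dictionary<int, Invoice>();
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(sql, conn))
{
    conn.Open();
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read()) // first result set: the invoices
        {
            var inv = new Invoice
            {
                Id = reader.GetInt32(0),
                CreatedOn = reader.GetDateTime(1),
                Details = new List<Invoice_Details>()
            };
            invoices.Add(inv.Id, inv);
        }

        reader.NextResult(); // second result set: all their details
        while (reader.Read())
        {
            var detail = new Invoice_Details
            {
                Id = reader.GetInt32(0),
                InvoiceId = reader.GetInt32(1),
                Amount = reader.GetDecimal(2)
            };
            invoices[detail.InvoiceId].Details.Add(detail);
        }
    }
}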
To minimize the impact, I'd load the Invoice records in one go and then lazy-load the Invoice_Details only when they are needed. This has the advantage of being the quickest way to load all of the invoice data while keeping memory usage at its lowest.
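And a sketch of that lazy-load shape (the loader delegate is assumed to run the per-invoice details query):

using System;
using System.Collections.Generic;

public class Invoice
{
    private readonly Lazy<List<Invoice_Details>> _details;

    public Invoice(Func<int, List<Invoice_Details>> loadDetails)
    {
        // The details query runs only on the first access to Details.
        _details = new Lazy<List<Invoice_Details>>(() => loadDetails(Id));
    }

    public int Id { get; set; }
    public List<Invoice_Details> Details
    {
        get { return _details.Value; }
    }
}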
