I have been using ServiceStack.Redis for a couple of days, and the last puzzle in my app is searching the cache.
I have a simple object:
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Surname { get; set; }
    public int Age { get; set; }
    public string Profession { get; set; }
}
e.g. I want to return all persons whose Name is Joe and who are older than 10 years.
Which is better speed-wise?
To run a query against the database that returns a list of ids, and then get the matching records from Redis via the GetByIds function,
or
as RedisClient doesn't have native LINQ support (it doesn't expose AsEnumerable, only an IList), to run GetAll() and then perform further filtering in memory?
Does anyone have experience with this?
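For concreteness, here's a minimal sketch of the two approaches with ServiceStack's typed client; GetMatchingIdsFromDb is a hypothetical database call, not part of the original post:

var people = redis.As<Person>();

// Option 1: the database resolves the filter, Redis hydrates the records.
List<int> ids = GetMatchingIdsFromDb("Joe", 10); // hypothetical DB query returning ids
IList<Person> fromIds = people.GetByIds(ids);

// Option 2: pull everything from Redis and filter in memory with LINQ.
IList<Person> all = people.GetAll();
List<Person> filtered = all.Where(p => p.Name == "Joe" && p.Age > 10).ToList();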
I've been struggling with the same problem. My approach was to save a "light" set of data that holds just the attributes I need to identify a whole record, or the attributes I need to filter the whole bunch of data, and then go to the database for the rest if necessary.
I just started using Redis and I know this is probably not the best option, but it's better than going to the database each time, even just for filtering information.
Hope to know if you found a better solution :)
I think Redis is not a good candidate for such queries: it doesn't have indexes, so you might end up building your own in order to meet speed requirements, which is not a good idea at all. So I would go with a SQL db, which can help me with such queries, or even more complex ones, on the Person type.
You can then use the Redis cache only to store the query results, so you can easily move through them for things like paging or sorting.
At least this is how we do it in our apps.
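A minimal sketch of that pattern, assuming the SQL query already produced the matches; the key naming and the QueryPeopleFromSql call are illustrative, not from the original answer:

List<Person> matchedPeople = QueryPeopleFromSql("Joe", 10); // hypothetical SQL query

// Cache the results under a key derived from the query, with an expiry:
string cacheKey = "search:name=Joe:minAge=10";
redis.Set(cacheKey, matchedPeople, TimeSpan.FromMinutes(10));

// Later requests page/sort over the cached results without hitting SQL:
var cached = redis.Get<List<Person>>(cacheKey);
var page = cached.OrderBy(p => p.Surname).Skip(0).Take(20).ToList();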
Related
RE: CRUD operations... Is it pulling more data than is needed a bad thing?
Let me preface this by saying I really did search for this answer, on and off for some time now. I'm certain it's been asked/answered before but I can't seem to find it. Most articles seem to be geared towards how to perform basic CRUD operations; I'm really wanting to get deeper into best practices. Having said that, here's a model I mocked up for example purposes.
public class Book
{
    public long Id { get; set; }
    public string Name { get; set; }
    public decimal AverageRating { get; set; }
    public decimal ArPoints { get; set; }
    public decimal BookLevel { get; set; }
    public string Isbn { get; set; }
    public DateTime CreatedAt { get; set; }
    public DateTime PublishedAt { get; set; }
    public Author Author { get; set; }
    public IEnumerable<Genre> Genres { get; set; }
}
I'm using ServiceStack's OrmLite, migrating string queries to object model binding wherever possible. It's a C# MVC.NET project, using Controller/Service/Repository layers with DI. My biggest problem is with Read and Update operations. Take Reads for example. Here are two methods (I only wrote what I thought was germane) for example purposes.
public class BookRepository
{
    public Book Single(long id)
    {
        return _db.SingleById<Book>(id);
    }

    public IEnumerable<Book> List()
    {
        return _db.Select<Book>();
    }
}
Regardless of how this would need to change for the real world, the problem is simply that too much information is returned. Say I were displaying a list of books to the user. Even if the List method were written so that it didn't pull in the nested objects (Author & Genres), it would still return data for properties that are never used.
It seems like I could either learn to live with getting data I don't need, or write a bunch of extra methods that change which properties are pulled. Using the Single method, here are a few examples...
public Book SinglePublic(long id): Returns a few properties
public Book SingleSubscribed(long id): Returns most properties
public Book SingleAdmin(long id): Returns all properties
Having to write out methods like this for most tables doesn't seem very maintainable to me. But then, almost always getting unused information on most calls has to affect performance, right? I have to be missing something. Any help would be GREATLY appreciated. Feel free to just share a link, give me a PluralSight video to watch, recommend a book, whatever. I'm open to anything. Thank you.
As a general rule you should avoid premature optimization and always start with the simplest & most productive solution first, as avoiding complexity & large code-base sizes should be your first priority.
If you're only fetching a single row, you should definitely start by using a single API that fetches the full Book entity. I'd personally also avoid the Repository abstraction, which I view as an additional unnecessary layer, so I'd just use OrmLite APIs directly in your Controller or Service, e.g.:
Book book = db.SingleById<Book>(id);
You're definitely not going to notice the additional unused fields next to the I/O cost of the RDBMS network call: the round-trip latency & bandwidth between your App and your RDBMS cost far more than a little additional info on the wire. Having multiple APIs for the sake of reducing unused fields adds unnecessary complexity, increases code-base size / technical debt, and reduces the reusability, cacheability & refactorability of your code.
Times when to consider multiple DB calls for a single entity:
You've received feedback & given a task to improve the performance of a page/service
Your entity contains large blobbed text or binary fields like images
The first speaks to avoiding premature optimization: first focus on simplicity & productivity, then optimize to resolve known, realizable performance issues. In that case, first profile the code; then, if profiling shows the issue is the DB query, you can optimize it to return only the data that's necessary for that API/page.
To improve performance I'd typically first evaluate whether caching is viable, as it's usually the least-effort / max-value solution. You can easily cache APIs with a [CacheResponse] attribute, which caches the optimal API output for the specified duration, or take advantage of HTTP's caching primitives to avoid returning non-modified resources over the wire at all.
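As a rough sketch of the attribute-based approach (the GetBook request DTO is illustrative, not from the original answer; Duration is in seconds):

[CacheResponse(Duration = 10 * 60)] // cache the service output for 10 minutes
public class BookServices : Service
{
    public object Any(GetBook request) => Db.SingleById<Book>(request.Id);
}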
For the second issue, instead of maintaining different queries that exclude the large blobbed data, I would extract it out into a separate 1:1 row & only retrieve it when it's needed, as large row sizes hurt the overall performance of accessing that table.
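A hedged sketch of that split; the heavy BookContent fields below are hypothetical, not from the original Book model:

public class BookContent // 1:1 row holding the heavy data
{
    [PrimaryKey, ForeignKey(typeof(Book))]
    public long BookId { get; set; }
    public byte[] CoverImage { get; set; }
    public string FullDescription { get; set; }
}

// Most queries touch only the light Book row; fetch the heavy row on demand:
var book = db.SingleById<Book>(id);
var content = db.SingleById<BookContent>(id);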
Custom Results for Summary Data
So it's very rare that I'd have different APIs for accessing different fields of a single entity (it's more likely due to additional joins), but for returning multiple results of the same entity I would have a different optimized view with just the data required. This existing answer shows some ways to retrieve custom result sets with OrmLite (see also Dynamic Result Sets in the OrmLite docs).
I'll generally prefer to use a custom typed POCO with just the fields I want the RDBMS to return, e.g. a summary BookResult entity:
var q = db.From<Book>()
          .Where(x => ...);

var results = db.Select<BookResult>(q);
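where BookResult could be a hypothetical summary POCO like the one below; OrmLite populates only the columns whose names match:

public class BookResult
{
    public long Id { get; set; }
    public string Name { get; set; }
    public decimal AverageRating { get; set; }
}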
This is all relative to the task at hand: the fewer results returned, or the fewer concurrent users accessing the Page/API, the less reason there is to use multiple optimized queries; whereas for public APIs with 1000's of concurrent users of frequently accessed features, I'd definitely be profiling frequently & optimizing every query. Those cases would typically be made clear by stakeholders who maintain "performance is a feature" as a primary objective & allocate time & resources accordingly.
I can't speak to OrmLite, but with Entity Framework the ORM analyzes the query and only returns the columns necessary to fulfill the subsequent execution. If you couple this with view models, you're in a pretty good spot. So, for example, let's say you have a grid displaying the titles of your books. You only need a subset of columns from the database to do so. You could create a view model like this:
public class BookListViewItem
{
    public long Id { get; set; }
    public string Title { get; set; }
}
And then, when you need it, fill it like this:
var viewModel = dbcontext.Books
    .Where(i => i.whateverFilter)
    // Member initializers (rather than a constructor call) let EF translate
    // the projection into SQL instead of materializing full Book entities.
    .Select(i => new BookListViewItem { Id = i.Id, Title = i.Name })
    .ToList();
That should limit the generated SQL to only request the Id and Name columns.
In Entity Framework, this is called 'projection'. See:
https://social.technet.microsoft.com/wiki/contents/articles/53881.entity-framework-core-3-projections.aspx
I'm having trouble finding a standard for what such an update should look like. I have this model (simplified); bear in mind that a Team is allowed to have no players, and a Team can have up to 500 players:
public class Team
{
    public int TeamId { get; set; }
    public string Name { get; set; }
    public string City { get; set; }
    public List<Player> Players { get; set; }
}

public class Player
{
    public int PlayerId { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}
and these endpoints:
Partial Team Update (without players): [PATCH] /api/teams/{teamId}. Lets me update particular fields of the team, but no players.
Update Team (with players): [PUT] /api/teams/{teamId}. In payload data I pass json with entire Team object, including collection of players.
Update Player alone: [PUT] /api/teams/{teamId}/players/{playerId}
I started wondering if I need endpoint #2 at all. Its only advantage is that I can update many players in one request, and I can delete or add many players at once as well. So I started looking for a standard for how such a popular scenario is handled in the real world.
I have two options:
Keep endpoint #2 to be able to update/add/remove many child records at the same time.
Remove endpoint #2. Allow changing the Team only via PATCH, with no ability to manipulate the Player collection. The Player collection can then be changed only via these endpoints:
[POST] /api/teams/{teamId}/players
[PUT] /api/teams/{teamId}/players/{playerId}
[DELETE] /api/teams/{teamId}/players/{playerId}
Which option is a better practice? Is there a standard how to handle Entity with Collection situation?
Thanks.
This one here https://softwareengineering.stackexchange.com/questions/232130/what-is-the-best-pattern-for-adding-an-existing-item-to-a-collection-in-rest-api could really help you.
In essence it says that POST is the real append verb. If you are not really updating the player resource as a whole, then you are appending just another player to the list.
The main argument with which I agree, is that the PUT verb requires the entire representation of what you are updating.
PATCH, on the other hand, I would use to update a bunch of resources at the same time.
There is no really right or wrong way to do it; it depends on how you view the domain at the end of the day.
You can have bulk operations and I would certainly use POST with that. There are some things to consider though.
How to handle partial success: would one failure fail the others? If not, what is your response?
How will you send back the new resources url? The new resources should be easily discoverable.
Apart from these design considerations: if you are talking about many inserts at a time, you'd better do it in bulk; if it's a couple at a time, save yourself and the people who will consume the API some effort and go one by one.
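A minimal Web API sketch of what the bulk append could look like; the ITeamService dependency and the "GetPlayerById" route name are illustrative, not from the original question:

using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Web.Http;

public class TeamPlayersController : ApiController
{
    private readonly ITeamService _teamService; // hypothetical service layer

    public TeamPlayersController(ITeamService teamService)
    {
        _teamService = teamService;
    }

    [HttpPost, Route("api/teams/{teamId}/players")]
    public IHttpActionResult AddPlayers(int teamId, List<Player> players)
    {
        // Decide up front how partial success is handled; here the service
        // is assumed to apply the whole batch or throw.
        List<Player> created = _teamService.AddPlayers(teamId, players);

        // Make the new resources easily discoverable by returning their URLs.
        var urls = created
            .Select(p => Url.Link("GetPlayerById", new { teamId, playerId = p.PlayerId }))
            .ToList();
        return Content(HttpStatusCode.Created, urls);
    }
}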
I'm creating an ASP.NET MVC application where users upload multiple files.
The data will be compared with db data, processed, and exported later. Paging matters as well.
When displaying these data, sorting and filtering are important.
When data is uploaded, some of it will be stored in the db, some will be displayed as not found in the db, some will be modified and stored... etc.
My question is: what is the best way to store the uploaded data so that it's available to be processed or viewed?
Load it in memory
Create temp tables for every session? (I don't even know if that's possible)
A different storage that can be queried (access the data using LINQ) (JSON??)
Some other option.
The source files are CSV or Excel.
An example of one of the files:
Name    Age  Street  City     Country     Code  VIP
---------------------------------------------------
Mike    42   AntwSt  Leuven   Belgium     T5Df  No
Peter   32   Ut123   Utricht  Netherland  T666  Yes
Example classes:

public class User
{
    public string Name { get; set; }
    public Address Address { get; set; } // street, city, country
    public Info Info { get; set; }       // age and Cres
}

public class Info
{
    public int Age { get; set; }
    public Cres Cres { get; set; }
}

public class Cres
{
    public string Code { get; set; }
    public bool VIP { get; set; }
}
There are a variety of strategies for handling this (I actually just wrote an entire dissertation on the subject), and there are many different considerations you'll need to take into account.
Depending on the amount of data present, and what you're doing with it, it may be enough to simply store the information in a session store. How you actually implement the session store is up to you, and there are pros and cons to each choice.
I would personally recommend a server-side session store to handle everything, and there are a variety of options for how to do that, for example SSDB and Redis.
Then from there, you'll need a way of communicating to clients what has actually happened to their data. If multiple clients need to access the same data set and a single user uploads a change, how will you alert every user of this change? Again, there are a lot of options: you can use a Pub/Sub framework to alert all listening clients, or you could tap into Microsoft's SignalR framework to handle this.
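For example, a minimal SignalR sketch of broadcasting a data-set change to listening clients; the hub name and the client method are illustrative:

using Microsoft.AspNet.SignalR;

public class UploadsHub : Hub { }

// From the server-side code that applied the change:
var hub = GlobalHost.ConnectionManager.GetHubContext<UploadsHub>();
string dataSetId = "imports/42"; // identifies the changed data set
hub.Clients.All.dataSetChanged(dataSetId); // clients handle "dataSetChanged"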
There are a lot of different ifs, buts, and maybes to this question, and unfortunately I don't believe there is any one perfect solution to your problem without knowing exactly what you're trying to achieve.
If the data size is small and you just need it to exist temporarily, feel free to store it in memory and thus cut all the overhead you would have with other solutions.
You just need to be aware that the data in memory will be gone if the server or the app goes down for whatever reason.
It's also worth considering what happens if the same user performs the operation a second time while the operation on the first data set has not completed yet. If this can happen to you (it usually does), make sure to use good synchronization mechanisms to prevent race conditions, as in the sketch below.
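A minimal sketch of one such mechanism, gating processing per user with a SemaphoreSlim; all names here are illustrative:

using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public class UploadProcessor
{
    // One gate per user, so a second upload waits for the first to finish.
    private static readonly ConcurrentDictionary<string, SemaphoreSlim> Gates =
        new ConcurrentDictionary<string, SemaphoreSlim>();

    public async Task ProcessAsync(string userId, Stream file)
    {
        SemaphoreSlim gate = Gates.GetOrAdd(userId, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync();
        try
        {
            // Parse the file, compare with db data, store the results...
        }
        finally
        {
            gate.Release();
        }
    }
}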
I'm creating an e-commerce site with products having their own fields (Id, Name).
This is the object I have in C#:
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}
This is my code to create a Product node in Neo4j from C#:
Console.WriteLine("Generate node: ");

var newProduct = new Product { Id = 666, Name = "Banana" };
client.Cypher
    .Create("(product:Product {newProduct})")
    .WithParams(new { newProduct })
    .ExecuteWithoutResults();
Supposing a user or I need to add some other attribute, such as a price, to the Product node, the first thing is to add a new property to the class:
..
public int Price { get; set; }
..
And then modify the Cypher code to create the product with the new attribute/property.
Clearly this is a hardcoded approach, not good for a dynamic db/site.
Since I'm used to RDBMSs, where this type of problem could only be solved with EAV and numerous pivots, I was hoping that NoSQL (i.e. Neo4j) could help me deal with variable attribute fields without EAV.
Could code that generates code be a solution?
What comes to mind is using dynamic code/variables or CodeDOM; is this the way to go? Are there other, more elegant solutions?
Please provide some explanation or topics to study.
NoSQL is supposed to be schema-less, but applying schema-less-ness isn't so easy, am I correct?
In a schema-free database the schema lives in the applications that use it.
You can also make schema changes on the database side with a tool like Liquigraph.
If you change your objects, you will have code that uses these new properties, so you have to adapt your code anyway, right?
You can write some code (or use the library, if it supports it) to consume and hydrate arbitrary objects.
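For example, a hedged sketch of creating a node with arbitrary attributes by passing a dictionary as the properties map; the parameter name is illustrative, and the exact Cypher parameter syntax depends on your Neo4j/Neo4jClient versions:

var props = new Dictionary<string, object>
{
    ["Id"] = 666,
    ["Name"] = "Banana",
    ["Price"] = 12 // attributes can vary per product without touching a class
};

client.Cypher
    .Create("(product:Product $props)") // older servers/clients use {props}
    .WithParam("props", props)
    .ExecuteWithoutResults();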
I'm working on an internet website that provides some services to internet users. We have an administration system where my colleagues on the business team can get the information they want, e.g. how many new users registered in the last 3 days? Or how many articles were posted with the tag "joke"? Etc. Thus the administration system has a few pages for searching certain tables with conditions. These pages are quite alike:
UserID:[--------------] Nick Keyword:[------------] Registered Time:[BEGIN]~[END] [Search]
The search results are listed here
The class User has more properties than just UserID/Nick/RegisterTime (as does the user table), but only these 3 properties are treated as conditions. So I have a UserSearchCriteria class like:
public class UserSearchCriteria
{
    public long UserID { get; set; }
    public string NickKeyword { get; set; }
    public DateTime RegisteredTimeStart { get; set; }
    public DateTime RegisteredTimeEnd { get; set; }
}
Then in the data access layer, the search method takes an argument of type UserSearchCriteria and builds the corresponding Expression<Func<User, bool>> to query with. Outside the DAL, other developers can only search the user table with the 3 conditions provided by the criteria; for example, they can't search for users whose City property is "New York" (usually because that property has no index in the DB, so searching on it is slow).
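A minimal sketch of how the DAL might build that expression from the criteria (the original post doesn't show this code, so treat the member checks as illustrative):

public Expression<Func<User, bool>> ToExpression(UserSearchCriteria c)
{
    return u =>
        (c.UserID == 0 || u.UserID == c.UserID) &&
        (string.IsNullOrEmpty(c.NickKeyword) || u.Nick.Contains(c.NickKeyword)) &&
        u.RegisterTime >= c.RegisteredTimeStart &&
        u.RegisterTime < c.RegisteredTimeEnd;
}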
Question 1: This implementation of enclosing the search is correct or not? Any suggestions?
Question 2: Now I'm finding more criteria classes in the project, like ArticleSearchCriteria, FavouriteSearchCriteria, etc., and I think the number of criteria classes will keep growing. They have almost the same working mechanism, but I need to repeat the code. Is there a better solution?
P.S. If you need these info: jQuery + ASP.NET MVC 3 + MongoDB
Makes perfect sense to me. If the user can't search by "anything", then using a search-by-template sort of approach doesn't make any sense. Also, if you try to make this more generic, it will get downright confusing. E.g., I would hate to code against something like:
class SearchCriteria
{
    Dictionary<object, object> KeyValuePairs;
    EntityKind Entity;
}
to be used like this:
SearchCriteria sc = new SearchCriteria();
sc.KeyValuePairs.Add("UserId", 32);
sc.Entity = EntityKind.User;
Eww. No compile time type checking, no checking to see if the entity and property match up, etc.
So, my answer is, yes :), I would use the design pattern you are currently using. Makes sense to me, and seems straightforward for anyone to see what you're doing and get up to speed.