SetFields in Mongo C# driver - c#

I'm using the C# Mongo driver and I have a users collection like below:
public class User
{
public string Name { get; set; }
public DateField Date { get; set; }
/*
* Some more properties
*/
public List<string> Slugs { get; set; } //I just need to return this property
}
I'm writing a query that should return only the Slugs property.
To do this I'm trying to use the SetFields(...) method from the Mongo driver. SetFields returns a cursor of the User type, but I'm expecting something of my Slugs property type, so that I don't return the whole set of properties when I need just one.
Is it possible?

Yes and no. You can use the aggregation framework's projection operator $project to change the structure of the data, but I wouldn't do that for two reasons:
MongoDB generally tries to preserve the structure unless you force it to, particularly because it makes it easier to work with statically typed languages (the old object/relational mismatch: SQL queries don't 'answer' in users or blog post, but some wild Chimaera of properties collected from various tables, which might require additional DTOs depending on the query itself, which is all a bit ugly).
Aggregation framework queries are a bit more complicated and a bit slower, and I wouldn't let the urge to do some micro-optimization dictate a lot of unnecessary complexity.
After all, omitting a few fields is a micro-optimization already (setting index covered queries aside), but on the client-side the cost of empty fields should be next to none.

Related

Best practices for my dynamic queries in Entity Framework

I am building a web application that is a recreation of an older system, and I am trying to build it in an architected, yet pragmatic and maintainable way (unlike the old system). Anyways, I am currently designing my queries for my models in my application. The old system allows developers to assign any field through a boolean to be a searchable value from a table, meaning a single view for maintaining some models' records might contain 20 searchable fields in the front-end and doing that only requires ticking a single box.
Now I would like to implement something similar in this new system in C#, with a backend using EF as the data mapper, but I am not sure which approach is the most maintainable. In my current approach the filters are sent by the client as a record that (at most) contains all the possible filterable fields, e.g.:
public record GetOrderQuery()
{
public string OrderReference { get; set;}
public string OrdererName { get; set; }
public int ItemCount { get; set; }
//etc...
}
I am fine with the record limiting which filters can be applied (should I instead have the record contain an iterable property of objects that each carry fieldName, fieldValue and queryType?), but I would like to streamline the actual filtering. Basically, if the client sends any of the above fields in the request (as JSON, and none are required), the filtering is applied to those fields. I am currently thinking that I could implement this with reflection: I try to find a property in the actual model whose name is the same as in the record, then I construct the predicate for the Where() by chaining expressions.
I construct an expression for each property that has a value in the query and can be found through reflection (a property with the same name), then I link those together using binary expressions, combining all of the filters into a single expression. I am not sure if this is the best approach, or even what a good way to implement it would be (performance-wise, maintainability-wise or just in general). Are there any other ways to implement this, are there any pitfalls I should look out for, any resources I should read? Thanks!
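The approach described above can be sketched roughly as follows. This is only a minimal illustration, not a recommended final design: the Order and OrderFilter classes and the FilterBuilder helper are hypothetical names, the value-typed filter field is assumed to be nullable so that "not sent" can be detected, and only equality filters are handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity; in the real application this would be the EF model.
public class Order
{
    public string OrderReference { get; set; }
    public string OrdererName { get; set; }
    public int ItemCount { get; set; }
}

// Hypothetical filter DTO. Assumption: value types are nullable so that
// a field the client did not send shows up as null.
public class OrderFilter
{
    public string OrderReference { get; set; }
    public string OrdererName { get; set; }
    public int? ItemCount { get; set; }
}

public static class FilterBuilder
{
    // Builds "m => m.A == a && m.B == b ..." for every non-null filter
    // property that has a same-named property on the model.
    public static Expression<Func<TModel, bool>> Build<TModel, TFilter>(TFilter filter)
    {
        ParameterExpression model = Expression.Parameter(typeof(TModel), "m");
        Expression body = null;

        foreach (var filterProp in typeof(TFilter).GetProperties())
        {
            object value = filterProp.GetValue(filter);
            if (value == null) continue;                 // field not sent

            var modelProp = typeof(TModel).GetProperty(filterProp.Name);
            if (modelProp == null) continue;             // no matching model property

            // Expression.Constant throws if the value cannot be assigned
            // to the model property's type.
            Expression test = Expression.Equal(
                Expression.Property(model, modelProp),
                Expression.Constant(value, modelProp.PropertyType));

            body = body == null ? test : Expression.AndAlso(body, test);
        }

        if (body == null) body = Expression.Constant(true); // no filters: match all
        return Expression.Lambda<Func<TModel, bool>>(body, model);
    }
}

public static class Program
{
    public static void Main()
    {
        var orders = new List<Order>
        {
            new Order { OrderReference = "A-1", OrdererName = "Ann", ItemCount = 2 },
            new Order { OrderReference = "B-7", OrdererName = "Bob", ItemCount = 2 },
        };

        var filter = new OrderFilter { OrdererName = "Ann", ItemCount = 2 };
        var predicate = FilterBuilder.Build<Order, OrderFilter>(filter);

        // With EF the same predicate would be passed to a DbSet's Where().
        foreach (var order in orders.AsQueryable().Where(predicate))
            Console.WriteLine(order.OrderReference);
    }
}
```

Matching on property names keeps the sketch short, but it also means a renamed model property silently drops a filter, which is one of the pitfalls to weigh against writing the Where() clauses by hand.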

RE: CRUD operations. Is it pulling more data than is needed a bad thing?

Let me preface this with saying I really did search for this answer. On and off for some time now. I'm certain it's been asked/answered before but I can't seem to find it. Most articles seem to be geared towards how to perform basic CRUD operations. I'm really wanting to get deeper into best practices. Having said that, here's an example model I mocked up for example purposes.
public class Book
{
public long Id { get; set; }
public string Name { get; set; }
public decimal AverageRating { get; set; }
public decimal ArPoints { get; set; }
public decimal BookLevel { get; set; }
public string Isbn { get; set; }
public DateTime CreatedAt { get; set; }
public DateTime PublishedAt { get; set; }
public Author Author { get; set; }
public IEnumerable<Genre> Genres { get; set; }
}
I'm using ServiceStack's OrmLite, migrating string queries to object model binding wherever possible. It's a C# MVC.NET project, using Controller/Service/Repository layers with DI. My biggest problem is with Read and Update operations. Take Reads for example. Here are two methods (only wrote what I thought was germane) for example purposes.
public class BookRepository
{
public Book Single(long id)
{
return _db.SelectById<Book>(id);
}
public IEnumerable<Book> List()
{
return _db.Select<Book>();
}
}
Regardless of how this would need to change for the real world, the problem is simply that too much information is returned. Say I were displaying a list of books to the user. Even if the List method were written so that it didn't pull nested objects (Author & Genres), it would still have data for properties that were not used.
It seems like I could either learn to live with getting data I don't need or write a bunch of extra methods that changes what properties are pulled. Using the Single method, here's a few examples...
public Book SinglePublic(long id): Returns a few properties
public Book SingleSubscribed(long id): Returns most properties
public Book SingleAdmin(long id): Returns all properties
Having to write out methods like this for most tables doesn't seem very maintainable to me. But then, almost always getting unused information on most calls has to affect performance, right? I have to be missing something. Any help would be GREATLY appreciated. Feel free to just share a link, give me a PluralSight video to watch, recommend a book, whatever. I'm open to anything. Thank you.
As a general rule you should avoid premature optimization and always start with the simplest & most productive solution first, as avoiding complexity & large code-base sizes should be your first priority.
If you're only fetching a single row, you should definitely start by only using a single API and fetch the full Book entity, I'll personally also avoid the Repository abstraction which I view as an additional unnecessary abstraction, so I'd just be using OrmLite APIs directly in your Controller or Service, e.g:
Book book = db.SingleById<Book>(id);
You're definitely not going to notice the additional unused fields over the I/O cost of the RDBMS network call and the latency & bandwidth between your App and your RDBMS is much greater than additional info on the wire over the Internet. Having multiple APIs for the sake of reducing unused fields adds unnecessary complexity, increases code-base size / technical debt, reduces reusability, cacheability & refactorability of your code.
Times when to consider multiple DB calls for a single entity:
You've received feedback & given a task to improve the performance of a page/service
Your entity contains large blobbed text or binary fields like images
The first speaks to avoiding premature optimization by first focusing on simplicity & productivity before optimizing to resolve known, realizable performance issues. In that case first profile the code, then if it shows the issue is with the DB query you can optimize for only returning the data that's necessary for that API/page.
To improve performance I'd typically first evaluate whether caching is viable as it's typically the least effort / max value solution where you can easily cache APIs with a [CacheResponse] attribute which will cache the optimal API output for the specified duration or you can take advantage of caching primitives in HTTP to avoid needing to return any non-modified resources over the wire.
To avoid the second issue of having different queries without large blobbed data, I would extract it out into a different 1:1 row & only retrieve it when it's needed as large row sizes hurts overall performance in accessing that table.
Custom Results for Summary Data
So it's very rare that I'd have different APIs for accessing different fields of a single entity (more likely due to additional joins) but for returning multiple results of the same entity I would have a different optimized view with just the data required. This existing answer shows some ways to retrieve custom resultsets with OrmLite (See also Dynamic Result Sets in OrmLite docs).
I'll generally prefer to use a custom Typed POCO with just the fields I want the RDBMS to return, e.g. in a summary BookResult Entity:
var q = db.From<Book>()
.Where(x => ...);
var results = db.Select<BookResult>(q);
This is all relative to the task at hand, e.g. the fewer results returned or fewer concurrent users accessing the Page/API the less likely you should be to use multiple optimized queries whereas for public APIs with 1000's of concurrent users of frequently accessed features I'd definitely be looking to profiling frequently & optimizing every query. Although these cases would typically be made clear from stakeholders who'd maintain "performance is a feature" as a primary objective & allocate time & resources accordingly.
I can't speak to OrmLite, but for Entity Framework the ORM will look ahead and only return the columns that are necessary to fulfill subsequent execution. If you couple this with view models, you are in a pretty good spot. So, for example, let's say you have a grid to display the titles of your books. You only need a subset of columns from the database to do so. You could create a view model like this:
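public class BookListViewItem
{
public int Id { get; set; }
public string Title { get; set; }
}
And then, when you need it, fill it like this:
var viewModel = dbcontext.Books
.Where(i => i.whateverFilter)
.Select(i => new BookListViewItem { Id = i.Id, Title = i.Title })
.ToList();
That should limit the generated SQL to only request the id and title columns. Note that the projection uses an object initializer rather than a constructor that takes the whole Book: passing the entire entity to a constructor hides which properties are read, so the ORM either cannot translate the query or has to load every column.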
In Entity Framework, this is called 'projection'. See:
https://social.technet.microsoft.com/wiki/contents/articles/53881.entity-framework-core-3-projections.aspx

MongoDB: How to define a dynamic entity in my own domain class?

New to MongoDB. Set up a C# web project in VS 2013.
Need to insert data as document into MongoDB. The number of Key-Value pair every time could be different.
For example,
document 1: Id is "1", data is one pair key-value: "order":"shoes"
document 2: Id is "2", data is a 3-pair key-value: "order":"shoes", "package":"big", "country":"Norway"
The "Getting Started" guide says "because it is so much easier to work with your own domain classes this quick-start will assume that you are going to do that", and suggests making our own class like:
public class Entity
{
public ObjectId Id { get; set; }
public string Name { get; set; }
}
then use it like:
var entity = new Entity { Name = "Tom" };
...
entity.Name = "Dick";
collection.Save(entity);
Well, it defeats the idea of no-fixed columns, right?
So, I guess BsonDocument is the model to use. Are there any good samples for beginners?
I'm amazed how often this topic comes up... Essentially, this is more of a 'statically typed language limitation' than a MongoDB issue:
Schemaless doesn't mean you don't have any schema per se, it basically means you don't have to tell the database up front what you're going to store. It's basically "code first" - the code just writes to the database like it would to RAM, with all the flexibility involved.
Of course, the typical application will have some sort of recurring data structure, some classes, some object-oriented paradigm in one way or another. That is also true for the indexes: indexes are (usually) 'static' in the sense that you do have to tell MongoDB up front which field to index.
However, there is also the use case where you don't know what to store. If your data is really that unforeseeable, it makes sense to think "code first": what would you do in C#? Would you use the BsonDocument? Probably not. Maybe an embedded Dictionary does the trick, e.g.
public class Product {
public ObjectId Id {get;set;}
public decimal Price {get;set;}
public Dictionary<string, string> Attributes {get;set;}
// ...
}
This solution can also work with multikeys to simulate a large number of indexes to make queries on the attributes reasonably fast (though the lack of static typing makes range queries tricky). See
It really depends on your needs. If you want to have nested objects and static typing, things get a lot more complicated than this. Then again, the consumer of such a data structure (i.e. the frontend or client application) often needs to make assumptions that make it easy to digest this information, so it's often not possible to make this type safe anyway.
Other options include indeed using the BsonDocument, which I find too invasive in the sense that you make your business models depend on the database driver implementation; or using a common base class like ProductAttributes that can be extended by classes such as ProductAttributesShoes, etc. This question really revolves around the whole system design - do you know the properties at compile time? Do you have dropdowns for the property values in your frontend? Where do they come from?
If you want something reusable and flexible, you could simply use a JSON library, serialize the object to string and store that to the database. In any case, the interaction with such objects will be ugly from the C# side because they're not statically typed.
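As a rough sketch of the dictionary idea applied to the two documents from the question: the OrderDocument class below is a hypothetical name, its Id is a plain string (instead of the driver's ObjectId) so the example runs without the MongoDB driver, and System.Text.Json stands in for "a JSON library".

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical domain class: fixed fields are typed, everything
// unforeseeable goes into the Attributes dictionary.
public class OrderDocument
{
    public string Id { get; set; }
    public Dictionary<string, string> Attributes { get; set; }
        = new Dictionary<string, string>();
}

public static class Program
{
    public static void Main()
    {
        // Document 1: a single key-value pair.
        var first = new OrderDocument { Id = "1" };
        first.Attributes["order"] = "shoes";

        // Document 2: three key-value pairs, same class.
        var second = new OrderDocument
        {
            Id = "2",
            Attributes =
            {
                ["order"] = "shoes",
                ["package"] = "big",
                ["country"] = "Norway",
            },
        };

        foreach (var doc in new[] { first, second })
        {
            // Serializing to a string is the "store JSON" option; with the
            // driver the dictionary could be stored as an embedded document.
            string json = JsonSerializer.Serialize(doc.Attributes);
            Console.WriteLine($"{doc.Id}: {doc.Attributes.Count} attribute(s) -> {json}");
        }
    }
}
```

Both documents share one statically typed class, and only the contents of Attributes vary, which keeps the business model independent of the driver's BsonDocument type.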

So many criteria

I'm working on a website that provides some services to internet users. We have an administration system where my colleagues from the business team can get the information they want, e.g. how many new users registered in the last 3 days, or how many articles were posted with the tag "joke", etc. Thus in the administration system there are a few pages for searching some tables with conditions. These pages are quite alike:
UserID:[--------------] Nick Keyword:[------------] Registered Time:[BEGIN]~[END] [Search]
The searching results are listed here
The class User has more properties than just UserID/Nick/RegisterTime (as well as the user table), but only the 3 properties are treated as conditions. So I have a UserSearchCriteria class like:
public class UserSearchCriteria
{
public long UserID { get; set; }
public string NickKeyword { get; set; }
public DateTime RegisteredTimeStart { get; set; }
public DateTime RegisteredTimeEnd { get; set; }
}
Then in the data access layer, the search method takes an argument of type UserSearchCriteria and builds the corresponding Expression<Func<User, bool>> to query. Outside of the DAL, other developers can only search the user table with the 3 conditions provided by the criteria; for example, they can't search for users whose City property is "New York" (this is usually because the property has no index in the DB, so searching with it is slow).
Question 1: Is this way of encapsulating the search correct? Any suggestions?
Question 2: Now I find more criteria classes in the project, like ArticleSearchCriteria, FavouriteSearchCriteria, etc., and I think there will be more and more of them in the future. They have almost the same working mechanism, but I need to repeat the code. Is there a better solution?
P.S. If you need these info: jQuery + ASP.NET MVC 3 + MongoDB
Makes perfect sense to me. If the user can't search by "anything" then using a search-by-template sort of approach doesn't make any sense. Also, if you try to make this more generic, it will get downright confusing. E.g., I would hate to code to something like:
class SearchCriteria{
Dictionary<object,object> KeyValuePairs;
EntityKind Entity;
}
to be used like this:
SearchCriteria sc = new SearchCriteria();
sc.KeyValuePairs.Add("UserId",32);
sc.Entity = EntityKind.User;
Eww. No compile time type checking, no checking to see if the entity and property match up, etc.
So, my answer is, yes :), I would use the design pattern you are currently using. Makes sense to me, and seems straightforward for anyone to see what you're doing and get up to speed.
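For illustration, a typed criteria class of this kind could be applied along the following lines. This is a sketch under assumptions: the criteria fields are made nullable so that "not specified" can be expressed, the UserSearch helper and the trimmed-down User class are hypothetical, and chained Where() calls are used in place of a single hand-built Expression<Func<User, bool>>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down, hypothetical version of the User entity.
public class User
{
    public long UserID { get; set; }
    public string Nick { get; set; }
    public DateTime RegisteredTime { get; set; }
    public string City { get; set; }   // deliberately not searchable
}

// Assumption: nullable fields mean "condition not specified".
public class UserSearchCriteria
{
    public long? UserID { get; set; }
    public string NickKeyword { get; set; }
    public DateTime? RegisteredTimeStart { get; set; }
    public DateTime? RegisteredTimeEnd { get; set; }
}

public static class UserSearch
{
    // Only the conditions the criteria class exposes can ever be applied.
    public static IQueryable<User> Apply(IQueryable<User> users, UserSearchCriteria criteria)
    {
        if (criteria.UserID.HasValue)
        {
            long id = criteria.UserID.Value;
            users = users.Where(u => u.UserID == id);
        }
        if (!string.IsNullOrEmpty(criteria.NickKeyword))
        {
            string keyword = criteria.NickKeyword;
            users = users.Where(u => u.Nick.Contains(keyword));
        }
        if (criteria.RegisteredTimeStart.HasValue)
        {
            DateTime start = criteria.RegisteredTimeStart.Value;
            users = users.Where(u => u.RegisteredTime >= start);
        }
        if (criteria.RegisteredTimeEnd.HasValue)
        {
            DateTime end = criteria.RegisteredTimeEnd.Value;
            users = users.Where(u => u.RegisteredTime <= end);
        }
        return users;
    }
}

public static class Program
{
    public static void Main()
    {
        var users = new List<User>
        {
            new User { UserID = 1, Nick = "joker", RegisteredTime = new DateTime(2012, 1, 5) },
            new User { UserID = 2, Nick = "batman", RegisteredTime = new DateTime(2012, 3, 1) },
        }.AsQueryable();

        var criteria = new UserSearchCriteria { NickKeyword = "jok" };
        foreach (var user in UserSearch.Apply(users, criteria))
            Console.WriteLine(user.Nick);
    }
}
```

The compile-time checking the answer values is kept: a typo in a property name fails to build, and there is no way to ask for a search on City.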

When to use Properties and Methods?

I'm new to the .NET world having come from C++ and I'm trying to better understand properties. I noticed in the .NET framework Microsoft uses properties all over the place. Is there an advantage for using properties rather than creating get/set methods? Is there a general guideline (as well as naming convention) for when one should use properties?
It is pure syntactic sugar. On the back end, it is compiled into plain get and set methods.
Use it because of convention, and that it looks nicer.
Some guidelines are that when it has a high risk of throwing Exceptions or going wrong, don't use properties but explicit getters/setters. But generally even then they are used.
Properties are get/set methods; simply, it formalises them into a single concept (for read and write), allowing (for example) metadata against the property, rather than individual members. For example:
[XmlAttribute("foo")]
public string Name {get;set;}
This is a get/set pair of methods, but the additional metadata applies to both. It also, IMO, simply makes it easier to use:
someObj.Name = "Fred"; // clearly a "set"
DateTime dob = someObj.DateOfBirth; // clearly a "get"
We haven't duplicated the fact that we're doing a get/set.
Another nice thing is that it allows simple two-way data-binding against the property ("Name" above), without relying on any magic patterns (except those guaranteed by the compiler).
There is an entire book dedicated to answering these sorts of questions: Framework Design Guidelines from Addison-Wesley. See section 5.1.3 for advice on when to choose a property vs a method.
Much of the content of this book is available on MSDN as well, but I find it handy to have it on my desk.
Consider reading Choosing Between Properties and Methods. It has a lot of information on .NET design guidelines.
properties are get/set methods
Properties are set and get methods as people around here have explained, but the idea of having them is making those methods the only ones playing with the private values (for instance, to handle validations).
The whole other logic should be done against the properties, but it's always easier mentally to work with something you can handle as a value on the left and right side of operations (properties) and not having to even think it is a method.
I personally think that's the main idea behind properties.
I always think that properties are the nouns of a class, whereas methods are the verbs...
First of all, the naming convention is: use PascalCase for the property name, just like with methods. Also, properties should not contain very complex operations. These should be kept in methods.
In OOP, you would describe an object as having attributes and functionality. You do that when designing a class. Consider designing a car. Examples for functionality could be the ability to move somewhere or activate the wipers. Within your class, these would be methods. An attribute would be the number of passengers within the car at a given moment. Without properties, you would have two ways to implement the attribute:
Make a variable public:
// class Car
public int passengerCount = 4;
// calling code
int count = myCar.passengerCount;
This has several problems. First of all, it is not really an attribute of the vehicle. You have to update the value from inside the Car class to have it represent the vehicle's true state. Second, the variable is public and could also be written to.
The second variant is one widely used, e.g. in Java, where you do not have properties like in C#:
Use a method to encapsulate the value and maybe perform a few operations first.
// class Car
public int GetPassengerCount()
{
// perform some operation
int result = CountAllPassengers();
// return the result
return result;
}
// calling code
int count = myCar.GetPassengerCount();
This way you manage to get around the problems with a public variable. By asking for the number of passengers, you can be sure to get the most recent result since you recount before answering. Also, you cannot change the value since the method does not allow it. The problem is, though, that you actually wanted the amount of passengers to be an attribute, not a function of your car.
The second approach is not necessarily wrong, it just does not read quite right. That's why some languages include ways of making attributes look like variables, even though they work like methods behind the scenes. Actionscript for example also includes syntax to define methods that will be accessed in a variable-style from within the calling code.
Keep in mind that this also brings responsibility. The calling user will expect it to behave like an attribute, not a function. So if just asking a car how many passengers it has takes 20 seconds to load, then you should probably put that in a real method, since the caller will expect functions to take longer than accessing an attribute.
EDIT:
I almost forgot to mention this: The ability to actually perform certain checks before letting a variable be set. By just using a public variable, you could basically write anything into it. The setter method or property give you a chance to check it before actually saving it.
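A minimal sketch of such a check, continuing the car example. The seat limit of 5 and the exception type are assumptions made for the illustration, not part of the original answer.

```csharp
using System;

public class Car
{
    private const int Seats = 5;      // hypothetical capacity
    private int _passengerCount;

    // Reads like a variable for the caller, but the setter can reject
    // values that would leave the object in an invalid state.
    public int PassengerCount
    {
        get { return _passengerCount; }
        set
        {
            if (value < 0 || value > Seats)
                throw new ArgumentOutOfRangeException(
                    nameof(value), "Passenger count must be between 0 and " + Seats + ".");
            _passengerCount = value;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var myCar = new Car();
        myCar.PassengerCount = 4;                 // accepted
        Console.WriteLine(myCar.PassengerCount);

        try
        {
            myCar.PassengerCount = -1;            // rejected by the setter
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("Invalid passenger count was rejected.");
        }
    }
}
```

With a public field the second assignment would simply have succeeded, and the invalid state would only surface later.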
Properties simply save you some time from writing the boilerplate that goes along with get/set methods.
That being said, a lot of .NET stuff handles properties differently- for example, a Grid will automatically display properties but won't display a function that does the equivalent.
This is handy, because you can make get/set methods for things that you don't want displayed, and properties for those you do want displayed.
The compiler actually emits get_MyProperty and set_MyProperty methods for each property you define.
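This can be checked with a few lines of reflection. The Person class is a made-up example; the accessor names are read from the property's metadata rather than assumed.

```csharp
using System;
using System.Reflection;

public class Person
{
    public string Name { get; set; }
}

public static class Program
{
    public static void Main()
    {
        PropertyInfo property = typeof(Person).GetProperty("Name");

        // The accessors are ordinary methods with compiler-generated names.
        Console.WriteLine(property.GetMethod.Name);   // get_Name
        Console.WriteLine(property.SetMethod.Name);   // set_Name

        // They can even be invoked like any other method.
        var person = new Person();
        property.SetMethod.Invoke(person, new object[] { "Fred" });
        Console.WriteLine(person.Name);               // Fred
    }
}
```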
Although it is not a hard and fast rule and, as others have pointed out, Properties are implemented as Get/Set pairs 'behind the scenes' - typically Properties surface encapsulated/protected state data whereas Methods (aka Procedures or Functions) do work and yield the result of that work.
As such, Methods will often take arguments that they might merely consume, but also may return in an altered state, or they may produce a new object or value as a result of the work done.
Generally speaking, if you need a way of controlling access to data or state, then Properties allow the implementation of that access in a defined, validatable and optimised way (allowing access restriction, range & error-checking, creation of backing-store on demand and a way of avoiding redundant setting calls).
In contrast, methods transform state and give rise to new values internally and externally without necessarily repeatable results.
Certainly if you find yourself writing procedural or transformative code in a property, you are probably really writing a method.
Also note that properties are available via reflection. While methods are, too, properties represent "something interesting" about the object. If you are trying to display a grid of properties of an object-- say, something like the Visual Studio form designer-- then you can use reflection to query the properties of a class, iterate through each property, and interrogate the object for its value.
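A small sketch of that kind of property listing, using a made-up Book class with sample values; a real grid would of course render the pairs instead of printing them.

```csharp
using System;
using System.Reflection;

public class Book
{
    public string Name { get; set; }
    public string Isbn { get; set; }
    public decimal AverageRating { get; set; }

    // A method is not picked up by GetProperties().
    public string Describe() { return Name + " (" + Isbn + ")"; }
}

public static class Program
{
    public static void Main()
    {
        var book = new Book { Name = "Example", Isbn = "000-0", AverageRating = 4.5m };

        // Query the public instance properties and interrogate the object
        // for the current value of each one.
        foreach (PropertyInfo property in
                 typeof(Book).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            Console.WriteLine(property.Name + " = " + property.GetValue(book));
        }
    }
}
```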
Think of it this way: Properties encapsulate your fields (commonly marked private) while at the same time allowing your fellow developers to either set or get the field value. You can even perform routine validation in the property's set method should you desire.
Properties are not just syntactic sugar - they are important if you need to create object-relational mappings (Linq2Sql or Linq2Entities), because they behave just like variables while it is possible to hide the implementation details of the object-relational mapping (persistence). It is also possible to validate a value being assigned to it in the setter of the property and protect it against assigning unwanted values.
You can't do this with the same elegance with methods. I think it is best to demonstrate this with a practical example.
In one of his articles, Scott Gu creates classes which are mapped to the Northwind database using the "code first" approach. One short example taken from Scott's blog (with a little modification, the full article can be read at Scott Gu's blog here):
public class Product
{
[Key]
public int ProductID { get; set; }
public string ProductName { get; set; }
public Decimal? UnitPrice { get; set; }
public bool Discontinued { get; set; }
public virtual Category category { get; set; }
}
// class Category omitted in this example
public class Northwind : DbContext
{
public DbSet<Product> Products { get; set; }
public DbSet<Category> Categories { get; set; }
}
You can use entity sets Products, Categories and the related classes Product and Category just as if they were normal objects containing variables: You can read and write them and they behave just like normal variables. But you can also use them in Linq queries, persist them (store them in the database and retrieve them).
Note also how easy it is to use annotations (C# attributes) to define the primary key (in this example ProductID is the primary key for Product).
While the properties are used to define a representation of the data stored in the database, there are some methods defined in the entity set class which control the persistence: For example, the method Remove() marks a given entity as deleted, while Add() adds a given entity, SaveChanges() makes the changes permanent. You can consider the methods as actions (i.e. you control what you want to do with the data).
Finally I give you an example how naturally you can use those classes:
// instantiate the database as object
var nw = new Northwind();
// select product
var product = nw.Products.Single(p => p.ProductName == "Chai");
// 1. modify the price
product.UnitPrice = 2.33M;
// 2. store a new category
var c = new Category();
c.CategoryName = "Example category"; // assumed property name: a member cannot share the name of its class
c.Description = "Show how to persist data";
nw.Categories.Add(c);
// Save changes (1. and 2.) to the Northwind database
nw.SaveChanges();
