How to fetch the first 1000 documents that don't have a certain field? - c#

Say I have a collection People. How should I fetch the first 1000 documents that don't have a field Phone? As I understand it, I should use $exists; however, I cannot figure out how to use it from the .NET driver, and there is next to no info on that topic on the internet. Any help will be appreciated. Thanks!

Assume your model class is Model and the collection name is "Model".
var coll = db.GetCollection<Model>("Model");
var ret = coll.Find(Builders<Model>.Filter.Exists(d => d.Phone, false))
.Limit(1000)
.ToList();
ToList returns a list that is already fully loaded. Sometimes it's better to use ToEnumerable and iterate over the results as an enumerable instead.

Related

Retrieving and deserializing large lists of Redis-cache items

We're planning to add a Redis-cache to an existing solution.
We have this core entity which is fetched a lot, several times per session. The entity consists of 13 columns where the majority is less than 20 characters. Typically it's retrieved by parent id, but sometimes as a subset that is fetched by a list of ids. To solve this we're thinking of implementing the solution below, but the question is if it's a good idea? Typically the list is around 400 items, but in some cases it could be up to 3000 items.
We would store the instances in the list with this key pattern: EntityName:{ParentId}:{ChildId}, where ParentId and ChildId are ints.
Then to retrieve the list based on ParentId we would call the below method with EntityName:{ParentId}:* as the value of the pattern-argument:
public async Task<List<T>> GetMatches<T>(string pattern)
{
var keys = _multiPlexer.GetServer(_multiPlexer.GetEndPoints(true)[0]).Keys(pattern: pattern).ToArray();
var values = await Db.StringGetAsync(keys: keys);
var result = new List<T>();
foreach (var value in values.Where(x => x.HasValue))
{
result.Add(JsonSerializer.Deserialize<T>(value));
}
return result;
}
And to retrieve a specific list of items we would call the below method with a list of exact keys:
public async Task<List<T>> GetList<T>(string[] keys)
{
var values = await Db.StringGetAsync(keys: keys.Select(x => (RedisKey)x).ToArray());
var result = new List<T>();
foreach (var value in values.Where(x => x.HasValue))
{
result.Add(JsonSerializer.Deserialize<T>(value));
}
return result;
}
The obvious worry here is the number of objects to deserialize and the performance of System.Text.Json.
An alternative to this would be to store the data twice, both as a list and on its own, but that would only help in the case where we're fetching by ParentId. We could also store the data only as a list and retrieve the whole list every time, even when we only need a subset.
Is there a better way to tackle this?
All input is greatly appreciated! Thanks!
Edit
I wrote a small console application to load test the alternatives. Fetching 2000 items 100 times took 2020ms with the pattern matching, and fetching the list took 1568ms. I think we can live with that difference and go with the pattern matching.
It seems like @Xerillio was right. I did some load testing using hosted services, and there it was almost three times slower to fetch the list using the pattern matching, slower than retrieving the list directly from SQL. So, to answer my own question of whether it's a good idea: no, it isn't. The majority of the added time came not from deserialization but from fetching the keys using the pattern matching.
Here's the result from fetching 2000 items 100 times in a loop:
Fetch directly from db = 8625ms
Fetch using list of exact keys = 5663ms
Fetch using match = 13098ms
Fetch full list = 5352ms

Linq: Get Item which is in a list which also is in a list

I'm currently trying to get a list of products which are in a list of stores but only if the product name is the same.
I always get 0 Items back.
I tried to solve the problem using two different approaches, which are below.
// First approach: returns 0 items
var stores = Store.ReadAll().Where(prods =>
prods.Products.Contains(product));
// Second approach: doesn't compile, but it shows what I want to do
var stores = Store.ReadAll().Where(prods =>
prods.Products.Where(p => p.ProductName == productName));
Help appreciated :)
What you are looking for is Any instead of Where:
var products = Store.ReadAll().Where(prods => prods.Products.Any(p => p.ProductName == productName));
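As a minimal in-memory sketch of why Any works here: Where expects a predicate that returns bool, and Any returns bool, while an inner Where returns a sequence. The Product and Store classes and the StoreQueries helper below are hypothetical stand-ins, since the question does not show the real types. Note also that Contains(product) compares by reference unless Product overrides Equals, which is one possible reason the first approach finds nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's entities; the real classes are not shown.
public class Product
{
    public string ProductName { get; set; } = "";
}

public class Store
{
    public string Name { get; set; } = "";
    public List<Product> Products { get; set; } = new List<Product>();
}

public static class StoreQueries
{
    // Returns the stores that carry at least one product with the given name.
    // Any(...) yields a bool, so it is a valid predicate for the outer Where.
    public static List<Store> StoresSelling(IEnumerable<Store> stores, string productName)
    {
        return stores
            .Where(s => s.Products.Any(p => p.ProductName == productName))
            .ToList();
    }
}
```

Calling StoreQueries.StoresSelling(allStores, "Milk") on an in-memory list returns only the stores whose Products list contains a product named "Milk".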
So I partly solved my problem, and apparently it wasn't really a LINQ problem; it was a database problem. I have a list of objects, each of which contains a list of objects, which gives me an M:M relation. But the .NET Entity Framework didn't recognize the change when I modified the list using list.Add(item), so my list was always empty.
But anyway, thanks for the help ! :)
Assuming stores is an IEnumerable type, the following should work. It's important to pass the second parameter to the Contains method so that the comparison is not case-sensitive; otherwise it will return nothing if the case differs between product names.
I have assumed in my answer that the Store type has a property called Products of type List<string>.
var matchingStores = stores.Where(s=> s.Products.Contains(productName,
StringComparer.OrdinalIgnoreCase));
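A small sketch of the comparer overload in isolation, assuming (as the answer does) that product names are plain strings. NameMatching is a hypothetical helper name. This is LINQ to Objects; a database provider may not be able to translate the comparer overload.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NameMatching
{
    // True when any name in the sequence equals productName, ignoring case.
    // Uses Enumerable.Contains with an explicit IEqualityComparer<string>.
    public static bool HasProduct(IEnumerable<string> productNames, string productName)
    {
        return productNames.Contains(productName, StringComparer.OrdinalIgnoreCase);
    }
}
```

Without the comparer argument, Contains falls back to the default string equality, which is case-sensitive.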

LINQ optimization for searching if an object exists in a list within a list

Currently I have 7,000 video entries and I have a hard time optimizing it to search for Tags and Actress.
This is the code I am trying to modify. I tried using HashSet; it is my first time using it, and I don't think I am doing it right.
Dictionary dictTag = JsonPairtoDictionary(tagsId,tagsName);
Dictionary dictActresss = JsonPairtoDictionary(actressId, actressName);
var listVid = new List<VideoItem>(db.VideoItems.ToList());
HashSet<VideoItem> lll = new HashSet<VideoItem>(listVid);
foreach (var tags in dictTag)
{
lll = new HashSet<VideoItem>(lll.Where(q => q.Tags.Exists(p => p.Id == tags.Key)));
}
foreach (var actress in dictActresss)
{
listVid = listVid.Where(q => q.Actress.Exists(p => p.Id == actress.Key)).ToList();
}
First, I get all the videos in the database by using db.VideoItems.ToList().
Then it goes through a loop to check whether a tag exists.
Each VideoItem has a List<Tags>, and I use Exists to check whether a tag matches.
Then the same thing is done with Actress.
I am not sure if it's because I am in Debug mode and Application Insights is active, but it is slow. I get about 10-15 events per second with baseType:RemoteDependencyData, and I am not sure whether that means it is still connected to the database (it should not be, since I should only be working with a new list of all videos) or something else.
After 7 minutes it is still processing, and that's the longest I have waited.
I am afraid to put this on my live site, since it will eat up my resources like candy.
Instead of optimizing the LINQ, you should optimize your database query.
Databases are great at optimized searches and creating subsets, and will most likely be faster than anything you write. If you need to create a subset based on more than one database parameter, I would recommend creating some indexes and using those.
Edit:
Example of db query that would eliminate first for loop (which is actually multiple nested loops and where the time delay comes from):
select * from videos where tag in [list of tags]
Edit2
To make sure this is most efficient, require the database to index on the TAGS column. To create the index:
CREATE INDEX video_tags_idx ON videos (tag)
Use EXPLAIN to see if the index is being used automatically (it should be):
explain select * from videos where tag in [list of tags]
If it doesn't show your index as being used you can look up the syntax to force the use of it.
The problem was not optimization; it was that my code did not make use of Microsoft SQL Server through my ApplicationDbContext.
I realized this when I found PredicateBuilder: http://www.albahari.com/nutshell/predicatebuilder.aspx
With keyword search there can be multiple keywords, and the code I wrote above doesn't push the filtering down to SQL, which caused the long execution time.
Using the predicate builder, it is possible to create dynamic conditions in LINQ.
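For illustration only, here is a sketch of building conditions dynamically on an IQueryable by chaining Where calls. This is not PredicateBuilder itself but a simpler technique that covers the case where every condition must hold (AND); combining conditions with OR is where PredicateBuilder is typically needed. The Tag and VideoItem classes and the VideoSearch helper are hypothetical stand-ins for the question's model. The key point is that the query stays an IQueryable and is not materialized with ToList() before filtering, so a provider such as Entity Framework gets the chance to run the filter in the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's model classes.
public class Tag
{
    public int Id { get; set; }
}

public class VideoItem
{
    public int Id { get; set; }
    public List<Tag> Tags { get; set; } = new List<Tag>();
}

public static class VideoSearch
{
    // Adds one Where clause per requested tag id; the clauses are ANDed,
    // so only videos carrying every requested tag remain.
    // Nothing is executed until the returned query is enumerated.
    public static IQueryable<VideoItem> WithAllTags(IQueryable<VideoItem> videos, IEnumerable<int> tagIds)
    {
        foreach (var tagId in tagIds)
        {
            videos = videos.Where(v => v.Tags.Any(t => t.Id == tagId));
        }
        return videos;
    }
}
```

With a DbContext this would be called as VideoSearch.WithAllTags(db.VideoItems, ids), assuming db.VideoItems is an IQueryable<VideoItem>; for an in-memory list, pass list.AsQueryable().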

Two LINQ queries returning different data types

I come to you in desperate need of help today. I have two tables named setup and channels which have a many-to-one relationship. Many channels belong to one setup. At any given time only one setup is active. I am trying to query my database and get only the channels belonging to the active setup. The query I have works, but returns an IEnumerable when I need a list. Yet I use a similar query to just get all the channels, and it returns a list. What can I do to get a list?
This query returns a List like I expect:
var q = (from a in DTE.channels
select a).ToList();
this.myChannels = new ObservableCollection<channel>(q);
This query, which provides the filtered data I want, returns an IEnumerable even though I call ToList(), and my ObservableCollection doesn't like IEnumerables:
var f = (DTE.setups.Where(m => m.CurrentSetup == true).Select(m => m.channels)).ToList();
this.myChannels = new ObservableCollection<channel>(f);
Any help is very, very appreciated. Thank you and have a nice day.
Posting my comment as answer as it helped.
Your problem is that each setup has its own collection of channels, so .Select(m => m.channels) does not return a List<channel> but a list of lists (effectively List<List<channel>>): each matching record contributes its own list, and Select just returns those lists.
You just need to use .SelectMany(m => m.channels).ToList() instead of .Select. SelectMany takes the returned lists and combines them into one list with all the data, so you get the List<channel> you need.
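The difference can be seen in a small in-memory sketch. The Channel and Setup classes and the ChannelQueries helper are hypothetical stand-ins for the generated entity types in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's entity types.
public class Channel
{
    public string Name { get; set; } = "";
}

public class Setup
{
    public bool CurrentSetup { get; set; }
    public List<Channel> Channels { get; set; } = new List<Channel>();
}

public static class ChannelQueries
{
    // Select would produce one List<Channel> per matching setup;
    // SelectMany flattens those into a single sequence of Channel.
    public static List<Channel> ActiveChannels(IEnumerable<Setup> setups)
    {
        return setups
            .Where(s => s.CurrentSetup)
            .SelectMany(s => s.Channels)
            .ToList();
    }
}
```

The result is a List<Channel>, which can be passed straight to the ObservableCollection constructor.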

LINQ Select() function throws out loaded values (loadoptions)

I have a model where a Product can have multiple PriceDrops. I'm trying to generate a list of products with the most recent price drops.
Getting the most recent price drops with the products loaded is easy enough, and I thought it would be the best way to start:
dlo.LoadWith<PriceDrop>(pd => pd.Product);
db.LoadOptions = dlo;
return db.PriceDrops.OrderBy(d=>d.CreatedTime);
Works great for a list of recent price drops, but I want a list of products. If I append a ".Select(d=>d.Product)" I get a list of Products back - which is perfect - but they are no longer associated with the PriceDrops. That is, if I call .HasLoadedOrAssignedValues on the products, it returns false. If I try to interrogate the Price Drops, it tries to go back to the DB for them.
Is there a way around this, or do I have to craft a query starting with Products and not use the Select modifier? I was trying to avoid that, because in some cases I want a list of PriceDrops, and I wanted to re-use as much logic as possible (I left out the where clause and other filter code from the sample above, for clarity).
Thanks,
Tom
Try loading the Products, ordered by their latest PriceDrop:
dlo.LoadWith<Product>(p => p.PriceDrops);
db.LoadOptions = dlo;
return db.Products.OrderBy(d => d.PriceDrops.Max(pd => pd.CreatedTime));
I understand from your question that you're trying to avoid this, why?
I think what you need here is the AssociateWith method, also on the DataLoadOptions class:
dlo.AssociateWith<Product>(p => p.PriceDrops.OrderBy(d => d.CreatedTime));
