Understanding MongoDB Aggregate and GroupBy - c#

I'm trying to do a query in MongoDB to first group by id and then sort descending. I have a functional LINQ expression here:
var list = this.GetPackages().ToList();
list = list.OrderByDescending(package => package.PackageVersion)
.GroupBy(g => g.PackageId)
.Select(packages => packages.First()).ToList();
return list;
But I can't seem to come up with the equivalent MongoDB expression, nor can I even get the $project function to work:
db.packages.aggregate([
{
$sort : { packageVersion : -1 }
},
{
$group: { _id: "$PackageId" }
},
{
$project: { PackageVersion: 1, Title: 1 }
}
])
My result is this:
{ "_id" : "e3afb1fe-dce7-476e-8372-cd8201abc131" }
{ "_id" : "e3722179-0903-4894-9a86-3a3ffd94de83" }
{ "_id" : "3e65e93a-4c2c-4a02-8b21-e5858a4058dd" }
Is the MongoDB query of the correct format, and is there an equivalent way to do this using the C# MongoDB driver?

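The $group stage outputs only _id plus the accumulator fields you define, so the documents it emits no longer contain PackageVersion or Title and the later $project has nothing to include. Use the $first operator with the $$ROOT variable to keep the first document of each group.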
$$ROOT is a system variable that:
References the root document, i.e. the top-level document, currently
being processed in the aggregation pipeline stage.
Then project the first document.
db.packages.aggregate([
{
$sort : { PackageVersion : -1 }
},
{
$group: { "_id": "$PackageId","firstPackage":{$first:"$$ROOT"}}
},
{
$project: { "firstPackage": 1, "_id": 0}
}
])
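To see why sorting first and then taking $first per group gives the newest package for each id, here is a minimal in-memory sketch in plain JavaScript. It only mimics the $sort, $group and $project stages on an array; the sample documents and the helper name latestPerPackage are made up for illustration, and this is not driver code.

```javascript
// Hypothetical sample documents; field names follow the question.
const packages = [
  { PackageId: "a", PackageVersion: "1.0.0", Title: "A old" },
  { PackageId: "b", PackageVersion: "2.1.0", Title: "B new" },
  { PackageId: "a", PackageVersion: "1.2.0", Title: "A new" },
  { PackageId: "b", PackageVersion: "2.0.0", Title: "B old" },
];

function latestPerPackage(docs) {
  // $sort: { PackageVersion: -1 } (plain string comparison, descending)
  const sorted = [...docs].sort((x, y) =>
    x.PackageVersion < y.PackageVersion ? 1 : x.PackageVersion > y.PackageVersion ? -1 : 0
  );

  // $group with { $first: "$$ROOT" }: keep the first whole document seen per id
  const firstById = new Map();
  for (const doc of sorted) {
    if (!firstById.has(doc.PackageId)) firstById.set(doc.PackageId, doc);
  }

  // $project: { firstPackage: 1, _id: 0 }
  return [...firstById.values()].map((doc) => ({ firstPackage: doc }));
}

const latest = latestPerPackage(packages);
```

Because the whole document is stored under firstPackage, every original field (Title, PackageVersion and so on) is still available after the grouping.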

Related

MongoDB .NET Driver - Use $in in the match stage

I have two collections, one for posts (PostInfo) and one for users (UserInfo). I join the two collections and want to find the posts where the given UserId is in AsUser.Friends:
var docs = await _dbContext.PostInfos.Aggregate()
.Lookup("UserInfo", "UserId", "UserId", "AsUser")
.Unwind("AsUser")
.Match(
new BsonDocument() {
{ "$expr", new BsonDocument() {
{ "$in", new BsonArray(){ "$AsUser.Friends", BsonArray.Create(user.UserId) } }
}
}
}
)
.As<PostInfo>()
.Project<PostInfo>(Builders<PostInfo>.Projection.Exclude("AsUser"))
.ToListAsync();
This is a UserInfo document:
{
"_id" : ObjectId("62d64398772c29b212332ec2"),
"UserId" : "18F1FDB9-E5DE-4116-9486-271FE6738785",
"IsDeleted" : false,
"UserName" : "kaveh",
"Followers" : [],
"Followings" : [],
"Friends" : [
"9e3163b9-1ae6-4652-9dc6-7898ab7b7a00",
"2B5F6867-E804-48AF-BED3-672EBD770D10"
],
}
I am having a problem working with the $in operator.
Update
Also, I think this would work too:
db.inventory.find( { tags: { $eq: [ "A", "B" ] } } )
But I can't convert this to C# format.
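The arguments to $in are in the wrong order. In an aggregation expression, $in takes the value to look for first and the array to search second, so you should check whether the UserId is in the AsUser.Friends array, as below: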
{
$match: {
$expr: {
$in: [
"9e3163b9-1ae6-4652-9dc6-7898ab7b7a00", // UserId
"$AsUser.Friends"
]
}
}
}
With the MongoDB C# driver, the same stage is:
.Match(
new BsonDocument()
{
{
"$expr", new BsonDocument()
{
{ "$in", new BsonArray() { user.UserId, "$AsUser.Friends" } }
}
}
}
)
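The argument order matters because the expression form of $in is [value, array]. A small plain-JavaScript sketch of that rule follows; inExpr is a hypothetical helper that only imitates the operator, and the ids are taken from the sample document above.

```javascript
// Mimics the aggregation expression { $in: [ <value>, <array> ] }.
function inExpr(value, array) {
  if (!Array.isArray(array)) {
    // MongoDB also rejects a second argument that is not an array.
    throw new TypeError("$in requires an array as the second argument");
  }
  return array.includes(value);
}

const asUser = {
  Friends: [
    "9e3163b9-1ae6-4652-9dc6-7898ab7b7a00",
    "2B5F6867-E804-48AF-BED3-672EBD770D10",
  ],
};
const userId = "9e3163b9-1ae6-4652-9dc6-7898ab7b7a00";

// Correct order: is userId one of the friends?
const correct = inExpr(userId, asUser.Friends);

// Swapped order, as in the question: is the whole Friends array
// an element of [userId]? It never is, so nothing matches.
const swapped = inExpr(asUser.Friends, [userId]);
```

Note that the comparison is case-sensitive, so the stored ids and the id you pass in need to use the same casing.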

Project single field in array of subdocument returns more than one

I'm having difficulty writing code that returns a single element from an array of subdocuments. I am actually trying to flatten a document into a new, strongly typed document. My document looks like this:
{
"_id" : BinData(3, "7FRf4nbe60ev6XmGKBBW4Q=="),
"status" : NumberInt(1),
"title":"Central station",
"attributes" : [
{
"defId" : BinData(3, "QFDtR03NbkqwuhhG76wS8g=="),
"value" : "388",
"name" : null
},
{
"defId" : BinData(3, "RE3MT3clb0OdLEkkqhpFOg=="),
"value" : "",
"name" : null
},
{
"defId" : BinData(3, "pPgJR50h8kGdDaCcH2o17Q=="),
"value" : "Merkez",
"name" : null
}
]}
What I am trying to achieve is:
{
"title":"Central station",
"value":"388"
}
What I've done already:
using (_dbContext)
{
var filter = Builders<CustomerModel>.Filter.Eq(q => q.Id, Guid.Parse("30B59585-CBFC-4CD5-A43E-0FDB0AE3167A")) &
Builders<CustomerModel>.Filter.ElemMatch(f => f.Attributes, q => q.DefId == Guid.Parse("47ED5040-CD4D-4A6E-B0BA-1846EFAC12F2"));
var projection = Builders<CustomerModel>.Projection.Include(f => f.Title).Include("attributes.value");
var document = _dbContext.Collection<CustomerModel>().Find(filter).Project(projection).FirstOrDefault();
if (document == null)
return null;
return BsonSerializer.Deserialize<TitleAndValueViewModel>(document);
}
Note: TitleAndValueViewModel contains title and value properties.
This block of code returns:
{{ "_id" : CSUUID("30b59585-cbfc-4cd5-a43e-0fdb0ae3167a"), "title" : "388 güvenevler", "attributes" : [{ "value" : "388" }, { "value" : "" }, { "value" : "Merkez " }] }}
I am trying to get only "value":"388", but I also get the other two value properties even though the ElemMatch filter is applied to the subdocument.
Thank you in advance for your help.
Note: I am looking for answers that use the C# MongoDB driver.
Option 1 (via aggregation):
db.collection.aggregate([
{
$match: {
_id: 5,
"attributes.defId": 1
}
},
{
"$addFields": {
"attributes": {
"$filter": {
"input": "$attributes",
"as": "a",
"cond": {
$eq: [
"$$a.defId",
1
]
}
}
}
}
},
{
$unwind: "$attributes"
},
{
$project: {
_id: 0,
title: 1,
value: "$attributes.value"
}
}
])
Explained:
Match the document (it is a good idea to index the matched fields)
Filter the array down to the attribute you need
Unwind to convert the array into a single object
Project only the necessary output
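The four steps can be followed on a single document with a short plain-JavaScript sketch. It imitates the stages in memory rather than calling MongoDB; the sample document uses the simplified numeric ids from the pipeline above, and titleAndValue is a hypothetical helper name.

```javascript
const doc = {
  _id: 5,
  title: "Central station",
  attributes: [
    { defId: 1, value: "388", name: null },
    { defId: 2, value: "", name: null },
    { defId: 3, value: "Merkez", name: null },
  ],
};

function titleAndValue(document, defId) {
  // $addFields + $filter: keep only the attributes whose defId matches
  const matching = document.attributes.filter((a) => a.defId === defId);

  // $unwind: one output document per remaining array element
  // $project: keep title and lift attributes.value to the top level
  return matching.map((a) => ({ title: document.title, value: a.value }));
}

const flattened = titleAndValue(doc, 1);
```

With this sample the result is a single flat object holding title and value, which is the shape the view model expects.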
Option 2 (find with $elemMatch):
db.collection.find({
_id: 5,
attributes: {
"$elemMatch": {
"defId": 1
}
}
},
{
_id: 0,
title: 1,
"attributes": {
"$elemMatch": {
"defId": 1
}
}
})
Explained:
Match the document via _id and $elemMatch on the attribute
Project the necessary fields. (Note that $elemMatch is also needed in the projection to return only the matching attribute.)
(Note that this version returns only the first matching element, so it will not reveal a second attribute with the same defId. The projected attributes field is also still an array with a single element, which the application has to handle.)
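The two caveats are easier to see in a tiny plain-JavaScript imitation of the $elemMatch projection. This is only a sketch: elemMatchProjection is a hypothetical helper, and the sample document deliberately contains a duplicate defId.

```javascript
// Imitates projecting { title: 1, attributes: { $elemMatch: { defId } } }.
function elemMatchProjection(document, defId) {
  // Only the FIRST matching element is returned.
  const first = document.attributes.find((a) => a.defId === defId);
  const out = { title: document.title };
  // The field stays an array; it is omitted when nothing matches.
  if (first !== undefined) out.attributes = [first];
  return out;
}

const station = {
  _id: 5,
  title: "Central station",
  attributes: [
    { defId: 1, value: "388" },
    { defId: 2, value: "" },
    { defId: 1, value: "duplicate" },
  ],
};

const projected = elemMatchProjection(station, 1);
const noMatch = elemMatchProjection(station, 99);
```

Here projected.attributes holds one element even though two attributes share defId 1, so the application still has to read attributes[0].value.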
Alternatively, filter the array by specifying the defId:
db.collection.aggregate(
[{
$project: {
title: '$title',
attributes: {
$filter: {
input: '$attributes',
as: 'element',
cond: { $eq: ['$$element.defId', BinData(3, 'QFDtR03NbkqwuhhG76wS8g==')] }
}
}
}
}, {
$project: {
_id: 0,
title: '$title',
value: { $first: '$attributes.value' }
}
}])
Result:
{
"title": "Central station",
"value": "388"
}

Replace Root with C# driver for .net core AggregateExpressionDefinition

I'm trying to perform a simple unwind and replace root in .net core 2.2.
I've already tried the query in MongoDB and it works, but I'm finding it difficult to translate it fully to C# without using magic strings.
This is my document:
{
"_id" : ObjectId("5cb6475b20b49a5cec99eb89"),
"name" : "Route A",
"isActive" : true,
"buses" : [
{
"capacity" : "15",
"time" : "08:00:00",
"direction" : "Inbound"
},
{
"capacity" : "15",
"time" : "08:30:00",
"direction" : "Inbound"
},
{
"capacity" : "15",
"time" : "08:00:00",
"direction" : "Outbound"
},
{
"capacity" : "15",
"time" : "08:30:00",
"direction" : "Outbound"
}
]
}
I also have a class for the root document called Route and another one for the subdocument called Bus.
The query I'm running in mongo is this one:
db.routes.aggregate([
{ $match : { "_id" : ObjectId("5cb4e818cb95b3572c8f0f2c") } },
{ $unwind: "$buses" },
{ $replaceRoot: { "newRoot": "$buses" } }
])
The expected result is a simple array of buses. So far I'm getting it with this query in C#:
_routes.Aggregate()
.Match(r => r.Id == routeId)
.Unwind<Route, UnwoundRoute>(r => r.Buses)
.ReplaceRoot<Bus>("$buses")
.ToListAsync();
I want to know if it's possible to replace the string "$buses" with something that's not hardcoded.
I've tried using the AggregateExpressionDefinition class which is one of the possible parameters that the ReplaceRoot method can receive but I wasn't able to understand it completely.
Any help will be appreciated.
Posting this here in case someone ends up making the same mistakes I did.
I basically created a new "UnwoundRoute" entity to hold the results of the unwind operation and then used a simple LINQ expression. Thanks to the reddit user u/Nugsly for the suggestion about changing the way I should call unwind.
This works:
_routes.Aggregate()
.Match(r => r.Id == routeId)
.Unwind<Route, UnwoundRoute>(r => r.Buses)
.ReplaceRoot(ur => ur.Buses)
.ToListAsync();
You can also filter the result of the replace root afterwards:
_routes.Aggregate()
.Match(r => r.Id == routeId)
.Unwind<Route, UnwoundRoute>(r => r.Buses)
.ReplaceRoot(ur => ur.Buses)
.Match(b => b.Direction == direction)
.ToListAsync();
And it will return an array of documents.
{
"capacity" : "15",
"time" : "08:00:00",
"direction" : "Inbound"
},
{
"capacity" : "15",
"time" : "08:30:00",
"direction" : "Inbound"
}
Also, if you add the result type argument to ReplaceRoot, Visual Studio reports an error saying that the lambda expression cannot be converted because it is not a delegate type.
This doesn't work (which is what I had in the beginning):
_routes.Aggregate()
.Match(r => r.Id == routeId)
.Unwind<Route, UnwoundRoute>(r => r.Buses)
.ReplaceRoot<Bus>(ur => ur.Buses)
.ToListAsync();
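For anyone wondering why a separate UnwoundRoute type is needed at all: after $unwind the buses field holds a single bus instead of an array, so the document no longer has the shape of Route. The plain-JavaScript sketch below imitates the stages in memory; the sample data is a shortened copy of the document in the question, and I am assuming UnwoundRoute simply mirrors Route with a single Bus in place of the array.

```javascript
const route = {
  _id: "5cb6475b20b49a5cec99eb89",
  name: "Route A",
  isActive: true,
  buses: [
    { capacity: "15", time: "08:00:00", direction: "Inbound" },
    { capacity: "15", time: "08:30:00", direction: "Inbound" },
    { capacity: "15", time: "08:00:00", direction: "Outbound" },
  ],
};

// $unwind: "$buses" -> one copy of the route per bus,
// with buses now holding a single object (the UnwoundRoute shape)
const unwound = route.buses.map((bus) => ({ ...route, buses: bus }));

// $replaceRoot: { newRoot: "$buses" } -> promote the bus to the top level
const buses = unwound.map((doc) => doc.buses);

// A later $match then filters plain bus documents
const inbound = buses.filter((bus) => bus.direction === "Inbound");
```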
I can offer you a solution that uses MongoDAL, a wrapper around the C# driver that hides most of the driver's complexity.
using System;
using System.Linq;
using MongoDAL;
namespace BusRoutes
{
class Route : Entity
{
public string name { get; set; }
public bool isActive { get; set; }
public Bus[] buses { get; set; }
}
class Bus
{
public int capacity { get; set; }
public string time { get; set; }
public string direction { get; set; }
}
class Program
{
static void Main(string[] args)
{
new DB("busroutes");
var routeA = new Route
{
name = "Route A",
buses = new Bus[]
{
new Bus { capacity = 15, direction = "Inbound", time = "8:00:00"},
new Bus { capacity = 25, direction = "Outbound", time = "9:00:00" },
new Bus { capacity = 35, direction = "Inbound", time = "10:00:00" }
}
};
routeA.Save();
var query = routeA.Collection()
.Where(r => r.ID == routeA.ID)
.SelectMany(r => r.buses);
Console.WriteLine(query.ToString());
var busesOfRouteA = query.ToArray();
foreach (var bus in busesOfRouteA)
{
Console.WriteLine(bus.time.ToString());
}
Console.ReadKey();
}
}
}

MongoDb - Joining ObjectId references in the list with related collections [duplicate]

I have the following MongoDb query working:
db.Entity.aggregate(
[
{
"$match":{"Id": "12345"}
},
{
"$lookup": {
"from": "OtherCollection",
"localField": "otherCollectionId",
"foreignField": "Id",
"as": "ent"
}
},
{
"$project": {
"Name": 1,
"Date": 1,
"OtherObject": { "$arrayElemAt": [ "$ent", 0 ] }
}
},
{
"$sort": {
"OtherObject.Profile.Name": 1
}
}
]
)
This retrieves a list of objects joined with a matching object from another collection.
Does anybody know how I can use this in C# using either LINQ or by using this exact string?
I tried using the following code but it can't seem to find the types for QueryDocument and MongoCursor - I think they've been deprecated?
BsonDocument document = MongoDB.Bson.Serialization.BsonSerializer.Deserialize<BsonDocument>("{ name : value }");
QueryDocument queryDoc = new QueryDocument(document);
MongoCursor toReturn = _connectionCollection.Find(queryDoc);
There is no need to parse the JSON. Everything here can actually be done directly with either LINQ or the Aggregate Fluent interfaces.
I'm using some demonstration classes here because the question does not give much to go on.
Setup
Basically we have two collections here, being
entities
{ "_id" : ObjectId("5b08ceb40a8a7614c70a5710"), "name" : "A" }
{ "_id" : ObjectId("5b08ceb40a8a7614c70a5711"), "name" : "B" }
and others
{
"_id" : ObjectId("5b08cef10a8a7614c70a5712"),
"entity" : ObjectId("5b08ceb40a8a7614c70a5710"),
"name" : "Sub-A"
}
{
"_id" : ObjectId("5b08cefd0a8a7614c70a5713"),
"entity" : ObjectId("5b08ceb40a8a7614c70a5711"),
"name" : "Sub-B"
}
And a couple of classes to bind them to, just as very basic examples:
public class Entity
{
public ObjectId id;
public string name { get; set; }
}
public class Other
{
public ObjectId id;
public ObjectId entity { get; set; }
public string name { get; set; }
}
public class EntityWithOthers
{
public ObjectId id;
public string name { get; set; }
public IEnumerable<Other> others;
}
public class EntityWithOther
{
public ObjectId id;
public string name { get; set; }
public Other others;
}
Queries
Fluent Interface
var listNames = new[] { "A", "B" };
var query = entities.Aggregate()
.Match(p => listNames.Contains(p.name))
.Lookup(
foreignCollection: others,
localField: e => e.id,
foreignField: f => f.entity,
@as: (EntityWithOthers eo) => eo.others
)
.Project(p => new { p.id, p.name, other = p.others.First() } )
.Sort(new BsonDocument("other.name",-1))
.ToList();
Request sent to server:
[
{ "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
{ "$lookup" : {
"from" : "others",
"localField" : "_id",
"foreignField" : "entity",
"as" : "others"
} },
{ "$project" : {
"id" : "$_id",
"name" : "$name",
"other" : { "$arrayElemAt" : [ "$others", 0 ] },
"_id" : 0
} },
{ "$sort" : { "other.name" : -1 } }
]
Probably the easiest to understand since the fluent interface is basically the same as the general BSON structure. The $lookup stage has all the same arguments and the $arrayElemAt is represented with First(). For the $sort you can simply supply a BSON document or other valid expression.
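If it helps to see what those four stages compute, here is a minimal in-memory sketch in plain JavaScript. It is not driver code; the numeric ids stand in for the ObjectId values of the demonstration collections.

```javascript
const entities = [
  { _id: 1, name: "A" },
  { _id: 2, name: "B" },
];
const others = [
  { _id: 10, entity: 1, name: "Sub-A" },
  { _id: 11, entity: 2, name: "Sub-B" },
];
const listNames = ["A", "B"];

const joinedFirst = entities
  // $match: { name: { $in: [...] } }
  .filter((e) => listNames.includes(e.name))
  // $lookup: collect every "others" document whose entity equals _id
  .map((e) => ({ ...e, others: others.filter((o) => o.entity === e._id) }))
  // $project with $arrayElemAt: [ "$others", 0 ] (First() in C#)
  .map((e) => ({ id: e._id, name: e.name, other: e.others[0] }))
  // $sort: { "other.name": -1 }
  .sort((x, y) =>
    x.other.name < y.other.name ? 1 : x.other.name > y.other.name ? -1 : 0
  );
```

The sketch assumes every entity has at least one match; with an empty lookup result the other field would simply be absent.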
An alternative is the newer expressive form of $lookup with a sub-pipeline statement for MongoDB 3.6 and above.
BsonArray subpipeline = new BsonArray();
subpipeline.Add(
new BsonDocument("$match",new BsonDocument(
"$expr", new BsonDocument(
"$eq", new BsonArray { "$$entity", "$entity" }
)
))
);
var lookup = new BsonDocument("$lookup",
new BsonDocument("from", "others")
.Add("let", new BsonDocument("entity", "$_id"))
.Add("pipeline", subpipeline)
.Add("as","others")
);
var query = entities.Aggregate()
.Match(p => listNames.Contains(p.name))
.AppendStage<EntityWithOthers>(lookup)
.Unwind<EntityWithOthers, EntityWithOther>(p => p.others)
.SortByDescending(p => p.others.name)
.ToList();
Request sent to server:
[
{ "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
{ "$lookup" : {
"from" : "others",
"let" : { "entity" : "$_id" },
"pipeline" : [
{ "$match" : { "$expr" : { "$eq" : [ "$$entity", "$entity" ] } } }
],
"as" : "others"
} },
{ "$unwind" : "$others" },
{ "$sort" : { "others.name" : -1 } }
]
The Fluent "Builder" does not support the syntax directly yet, nor do LINQ Expressions support the $expr operator, however you can still construct using BsonDocument and BsonArray or other valid expressions. Here we also "type" the $unwind result in order to apply a $sort using an expression rather than a BsonDocument as shown earlier.
Aside from other uses, a primary task of a "sub-pipeline" is to reduce the documents returned in the target array of $lookup. Also the $unwind here serves a purpose of actually being "merged" into the $lookup statement on server execution, so this is typically more efficient than just grabbing the first element of the resulting array.
Queryable GroupJoin
var query = entities.AsQueryable()
.Where(p => listNames.Contains(p.name))
.GroupJoin(
others.AsQueryable(),
p => p.id,
o => o.entity,
(p, o) => new { p.id, p.name, other = o.First() }
)
.OrderByDescending(p => p.other.name);
Request sent to server:
[
{ "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
{ "$lookup" : {
"from" : "others",
"localField" : "_id",
"foreignField" : "entity",
"as" : "o"
} },
{ "$project" : {
"id" : "$_id",
"name" : "$name",
"other" : { "$arrayElemAt" : [ "$o", 0 ] },
"_id" : 0
} },
{ "$sort" : { "other.name" : -1 } }
]
This is almost identical but just using the different interface and produces a slightly different BSON statement, and really only because of the simplified naming in the functional statements. This does bring up the other possibility of simply using an $unwind as produced from a SelectMany():
var query = entities.AsQueryable()
.Where(p => listNames.Contains(p.name))
.GroupJoin(
others.AsQueryable(),
p => p.id,
o => o.entity,
(p, o) => new { p.id, p.name, other = o }
)
.SelectMany(p => p.other, (p, other) => new { p.id, p.name, other })
.OrderByDescending(p => p.other.name);
Request sent to server:
[
{ "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
{ "$lookup" : {
"from" : "others",
"localField" : "_id",
"foreignField" : "entity",
"as" : "o"
}},
{ "$project" : {
"id" : "$_id",
"name" : "$name",
"other" : "$o",
"_id" : 0
} },
{ "$unwind" : "$other" },
{ "$project" : {
"id" : "$id",
"name" : "$name",
"other" : "$other",
"_id" : 0
}},
{ "$sort" : { "other.name" : -1 } }
]
Normally placing an $unwind directly following $lookup is actually an "optimized pattern" for the aggregation framework. However the .NET driver does mess this up in this combination by forcing a $project in between rather than using the implied naming on the "as". If not for that, this is actually better than the $arrayElemAt when you know you have "one" related result. If you want the $unwind "coalescence", then you are better off using the fluent interface, or a different form as demonstrated later.
Queryable Natural
var query = from p in entities.AsQueryable()
where listNames.Contains(p.name)
join o in others.AsQueryable() on p.id equals o.entity into joined
select new { p.id, p.name, other = joined.First() }
into p
orderby p.other.name descending
select p;
Request sent to server:
[
{ "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
{ "$lookup" : {
"from" : "others",
"localField" : "_id",
"foreignField" : "entity",
"as" : "joined"
} },
{ "$project" : {
"id" : "$_id",
"name" : "$name",
"other" : { "$arrayElemAt" : [ "$joined", 0 ] },
"_id" : 0
} },
{ "$sort" : { "other.name" : -1 } }
]
All pretty familiar and really just down to functional naming. Just as with using the $unwind option:
var query = from p in entities.AsQueryable()
where listNames.Contains(p.name)
join o in others.AsQueryable() on p.id equals o.entity into joined
from sub_o in joined.DefaultIfEmpty()
select new { p.id, p.name, other = sub_o }
into p
orderby p.other.name descending
select p;
Request sent to server:
[
{ "$match" : { "name" : { "$in" : [ "A", "B" ] } } },
{ "$lookup" : {
"from" : "others",
"localField" : "_id",
"foreignField" : "entity",
"as" : "joined"
} },
{ "$unwind" : {
"path" : "$joined", "preserveNullAndEmptyArrays" : true
} },
{ "$project" : {
"id" : "$_id",
"name" : "$name",
"other" : "$joined",
"_id" : 0
} },
{ "$sort" : { "other.name" : -1 } }
]
Which actually is using the "optimized coalescence" form. The translator still insists on adding a $project since we need the intermediate select in order to make the statement valid.
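The preserveNullAndEmptyArrays option is what turns the join into a left outer join, which is what DefaultIfEmpty() asks for. A small plain-JavaScript sketch of the $unwind behaviour follows; unwind is a hypothetical helper that imitates the stage in memory, and the entity named "C" is made up to show an unmatched row.

```javascript
// Imitates { $unwind: { path: "$<field>", preserveNullAndEmptyArrays } }.
function unwind(docs, field, preserveNullAndEmptyArrays = false) {
  return docs.flatMap((doc) => {
    const value = doc[field];
    const isEmpty =
      value === undefined ||
      value === null ||
      (Array.isArray(value) && value.length === 0);

    if (isEmpty) {
      if (!preserveNullAndEmptyArrays) return []; // document is dropped
      const { [field]: _ignored, ...rest } = doc; // empty array: field is removed
      return [value === null ? { ...rest, [field]: null } : rest];
    }

    // A non-array value is treated like a single-element array.
    const items = Array.isArray(value) ? value : [value];
    return items.map((item) => ({ ...doc, [field]: item }));
  });
}

// Output of a $lookup: "C" has no related documents.
const lookedUp = [
  { name: "A", joined: [{ name: "Sub-A" }] },
  { name: "C", joined: [] },
];

const innerJoin = unwind(lookedUp, "joined");
const leftJoin = unwind(lookedUp, "joined", true);
```

Without the option, "C" disappears from the result; with it, "C" is kept and just has no joined field.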
Summary
So there are quite a few ways to essentially arrive at what is basically the same query statement with exactly the same results. Whilst you "could" parse the JSON to BsonDocument form and feed this to the fluent Aggregate() command, it's generally better to use the natural builders or the LINQ interfaces as they do easily map onto the same statement.
The options with $unwind are shown mainly because, even with a "singular" match, that "coalescence" form is actually far more optimal than using $arrayElemAt to take the "first" array element. This becomes even more important when you consider things like the BSON limit, where the $lookup target array could cause the parent document to exceed 16MB without further filtering. There is another post, "Aggregate $lookup Total size of documents in matching pipeline exceeds maximum document size", where I discuss how to avoid hitting that limit by using such options or other Lookup() syntax that is available only to the fluent interface at this time.

Using Facets in the Aggregation Framework C#

I would like to create an Aggregation on my data to get the total amount of counts for specific tags for a collection of books in my .Net application.
I have the following Book class.
public class Book
{
public string Id { get; set; }
public string Name { get; set; }
[BsonDictionaryOptions(DictionaryRepresentation.Document)]
public Dictionary<string, string> Tags { get; set; }
}
And when the data is saved, it is stored in the following format in MongoDB.
{
"_id" : ObjectId("574325a36fdc967af03766dc"),
"Name" : "My First Book",
"Tags" : {
"Edition" : "First",
"Type" : "HardBack",
"Published" : "2017",
}
}
I've been using facets directly in MongoDB and I am able to get the results that I need by using the following query:
db.{myCollection}.aggregate(
[
{
$match: {
"Name" : "SearchValue"
}
},
{
$facet: {
"categorizedByTags" : [
{
$project :
{
Tags: { $objectToArray: "$Tags" }
}
},
{ $unwind : "$Tags"},
{ $sortByCount : "$Tags"}
]
}
},
]
);
However I am unable to transfer this over to the .NET C# Driver for Mongo. How can I do this using the .NET C# driver?
Edit - I will ultimately be looking to query the DB on other properties of the books as part of a faceted book listings page, such as Publisher, Author, Page count etc... hence the usage of $facet, unless there is a better way of doing this?
I would personally not use $facet here since you've only got one pipeline which kind of defeats the purpose of $facet in the first place...
The following is simpler and scales better ($facet will create one single potentially massive document).
db.collection.aggregate([
{
$match: {
"Name" : "My First Book"
}
}, {
$project: {
"Tags": {
$objectToArray: "$Tags"
}
}
}, {
$unwind: "$Tags"
}, {
$sortByCount: "$Tags"
}, {
$group: { // not really needed unless you need to have all results in one single document
"_id": null,
"categorizedByTags": {
$push: "$$ROOT"
}
}
}, {
$project: { // not really needed, either: remove _id field
"_id": 0
}
}])
This could be written using the C# driver as follows:
var collection = new MongoClient().GetDatabase("test").GetCollection<Book>("test");
var pipeline = collection.Aggregate()
.Match(b => b.Name == "My First Book")
.Project("{Tags: { $objectToArray: \"$Tags\" }}")
.Unwind("Tags")
.SortByCount<BsonDocument>("$Tags");
var output = pipeline.ToList().ToJson(new JsonWriterSettings {Indent = true});
Console.WriteLine(output);
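To make the shape of the output concrete, here is what the $project, $unwind and $sortByCount stages compute, imitated in plain JavaScript. It is only an in-memory sketch; the second book and the helper name countTags are made up for illustration.

```javascript
const books = [
  { Name: "My First Book", Tags: { Edition: "First", Type: "HardBack" } },
  { Name: "My First Book", Tags: { Edition: "Second", Type: "HardBack" } },
];

function countTags(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // $objectToArray turns { Edition: "First" } into [{ k: "Edition", v: "First" }];
    // looping over the entries plays the role of $unwind.
    for (const [k, v] of Object.entries(doc.Tags)) {
      const key = JSON.stringify({ k, v });
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  // $sortByCount: group by the { k, v } pair, count, sort by count descending
  return [...counts.entries()]
    .map(([key, count]) => ({ _id: JSON.parse(key), count }))
    .sort((a, b) => b.count - a.count);
}

const tagCounts = countTags(books);
```

Each result has the { k, v } pair as _id and the number of books carrying it as count, which is the same structure AggregateSortByCountResult exposes in C#.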
Here's the version using a facet:
var collection = new MongoClient().GetDatabase("test").GetCollection<Book>("test");
var project = PipelineStageDefinitionBuilder.Project<Book, BsonDocument>("{Tags: { $objectToArray: \"$Tags\" }}");
var unwind = PipelineStageDefinitionBuilder.Unwind<BsonDocument, BsonDocument>("Tags");
var sortByCount = PipelineStageDefinitionBuilder.SortByCount<BsonDocument, BsonDocument>("$Tags");
var pipeline = PipelineDefinition<Book, AggregateSortByCountResult<BsonDocument>>.Create(new IPipelineStageDefinition[] { project, unwind, sortByCount });
// string based alternative version
//var pipeline = PipelineDefinition<Book, BsonDocument>.Create(
// "{ $project :{ Tags: { $objectToArray: \"$Tags\" } } }",
// "{ $unwind : \"$Tags\" }",
// "{ $sortByCount : \"$Tags\" }");
var facetPipeline = AggregateFacet.Create("categorizedByTags", pipeline);
var aggregation = collection.Aggregate().Match(b => b.Name == "My First Book").Facet(facetPipeline);
var output = aggregation.Single().Facets.ToJson(new JsonWriterSettings { Indent = true });
Console.WriteLine(output);
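For the faceted listing page mentioned in the edit, the point of $facet is that several sub-pipelines run over the same matched documents and come back together in one document. A rough plain-JavaScript sketch of that idea follows; the Publisher field, the sample books and both helper names are assumptions for illustration only.

```javascript
const books = [
  { Name: "Book 1", Publisher: "Acme", Tags: { Type: "HardBack" } },
  { Name: "Book 2", Publisher: "Acme", Tags: { Type: "PaperBack" } },
  { Name: "Book 3", Publisher: "Zenith", Tags: { Type: "HardBack" } },
];

// Counts documents per key, sorted by count descending (like $sortByCount).
function sortByCount(docs, keyOf) {
  const counts = new Map();
  for (const doc of docs) {
    const key = keyOf(doc);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts]
    .map(([_id, count]) => ({ _id, count }))
    .sort((a, b) => b.count - a.count);
}

// $facet: every named sub-pipeline sees the same input,
// and the stage emits ONE document holding all the result arrays.
function facet(docs, pipelines) {
  const result = {};
  for (const [name, run] of Object.entries(pipelines)) result[name] = run(docs);
  return [result];
}

const faceted = facet(books, {
  byPublisher: (docs) => sortByCount(docs, (d) => d.Publisher),
  byType: (docs) => sortByCount(docs, (d) => d.Tags.Type),
});
```

Since everything ends up in that single document, very large facet results can run into the document size limit, which is why a plain pipeline is preferable when there is only one facet.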
