Using MongoDB C# Driver write ElemMatch with Regex query - c#

I need to construct the following query using MongoDB C# driver
db.Notes.find({ "Group._id" : 74, "CustomFields" : { "$elemMatch" : { "Value" : /batch/i } }, "IsDeleted" : false }).sort({ "CreatedDateTimeUtc" : -1 })
I used a query like this
builder.ElemMatch(x => x.CustomFields, x => x.Value.Contains(filterValue))
It generated the mongo query as
db.Notes.find({ "Group._id" : 74, "CustomFields" : { "$elemMatch" : { "Value" : /batch/s } }, "IsDeleted" : false }).sort({ "CreatedDateTimeUtc" : -1 })
If you notice, it is appending the s option (/batch/s) instead of i (/batch/i).
How can I get this to work? I need to do this for filters like:
contains, using .Contains()
equals, thinking of using .Equals()
doesn't contain, thinking of using !Field.Contains(value)
not equals to
starts with
ends with
Can I do something like this, so that I can apply all my regex patterns for all of the above filters?
builder.Regex(x => x.CustomFields[-1].Value, new BsonRegularExpression($"/{filterValue}/i"));
This converts the query to the one below, but it doesn't return any results
db.Notes.find({ "Project._id" : 74, "CustomFields.$.Value" : /bat/i, "IsDeleted" : false }).sort({ "CreatedDateTimeUtc" : -1 })
FYI: builder is FilterDefinition<Note>
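Each of those filters can be expressed as a regex pattern (with the negated cases handled by wrapping the same pattern in $not on the MongoDB side). A rough sketch of the mapping, shown in plain JavaScript regex terms since the same patterns carry over to BsonRegularExpression; the helper names here are mine, not part of the driver:

```javascript
// Sketch: mapping common string filters to case-insensitive regex patterns.
// escapeRegex and buildPattern are illustrative helpers, not driver APIs.
function escapeRegex(s) {
  // Escape characters that have special meaning in a regex.
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function buildPattern(filter, value) {
  const v = escapeRegex(value);
  switch (filter) {
    case 'contains':   return new RegExp(v, 'i');             // /value/i
    case 'equals':     return new RegExp('^' + v + '$', 'i'); // /^value$/i
    case 'startsWith': return new RegExp('^' + v, 'i');       // /^value/i
    case 'endsWith':   return new RegExp(v + '$', 'i');       // /value$/i
    default: throw new Error('unknown filter: ' + filter);
  }
}

console.log(buildPattern('contains', 'batch').test('BATCH processing')); // true
console.log(buildPattern('equals', 'batch').test('my batch'));           // false
console.log(buildPattern('startsWith', 'bat').test('Batch'));            // true
console.log(buildPattern('endsWith', 'tch').test('dispatch'));           // true
```

"doesn't contain" and "not equals" reuse the contains/equals patterns under a $not, so no separate pattern is needed.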
My sample Notes Collection is like this:
{
  Name: "",
  Email: "",
  Tel: "",
  Date: 02/21/1945,
  CustomFields: [
    {
      Name: "",
      Value: "",
      IsSearchable: true
    },
    {
      Name: "",
      Value: "",
      IsSearchable: true
    },
    {
      Name: "",
      Value: "",
      IsSearchable: true
    },
    {
      Name: "",
      Value: "",
      IsSearchable: true
    }
  ]
}

It sounds like all you're missing is the case-insensitive part. Have you tried this?
ToLower, ToLowerInvariant, ToUpper, ToUpperInvariant (string method)
These methods are used to test whether a string field or property of
the document matches a value in a case-insensitive manner.
According to the 1.1 documentation here, these allow you to perform a case-insensitive regex match.
The current documentation doesn't mention it, so just to be sure, I checked GitHub and the code to create a case-insensitive match is still there.
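For what it's worth, the difference between the two options is easy to see in plain JavaScript regex terms: s (singleline/dotall, which the driver emits for a plain Contains) only changes how "." matches newlines, while i is what makes the match case-insensitive:

```javascript
// The "s" (dotall) flag does not affect case sensitivity; only "i" does.
console.log(/batch/s.test('BATCH jobs')); // false (still case-sensitive)
console.log(/batch/i.test('BATCH jobs')); // true  (case-insensitive)
console.log(/batch/i.test('my Batch'));   // true
```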

Related

C# MongoDB driver [2.7.0] CountDocumentsAsync unexpected native query

I encountered a weird thing when using the C# MongoDB CountDocumentsAsync function. I enabled query logging on MongoDB and this is what I got:
{
  "op" : "command",
  "ns" : "somenamespace",
  "command" : {
    "aggregate" : "reservations",
    "pipeline" : [
      {
        "some_query_key" : "query_value"
      },
      {
        "$group" : {
          "_id" : null,
          "n" : {
            "$sum" : 1
          }
        }
      }
    ],
    "cursor" : {}
  },
  "keyUpdates" : 0,
  "writeConflicts" : 0,
  "numYield" : 9,
  "locks" : {
    "Global" : {
      "acquireCount" : {
        "r" : NumberLong(24)
      }
    },
    "Database" : {
      "acquireCount" : {
        "r" : NumberLong(12)
      }
    },
    "Collection" : {
      "acquireCount" : {
        "r" : NumberLong(12)
      }
    }
  },
  "responseLength" : 138,
  "protocol" : "op_query",
  "millis" : 2,
  "execStats" : {},
  "ts" : ISODate("2018-09-27T14:08:48.099Z"),
  "client" : "172.17.0.1",
  "allUsers" : [ ],
  "user" : ""
}
A simple count is converted into an aggregate.
More interestingly, when I use the CountAsync function (which, by the way, is marked obsolete with a remark that I should be using CountDocumentsAsync), it produces:
{
  "op" : "command",
  "ns" : "somenamespace",
  "command" : {
    "count" : "reservations",
    "query" : {
      "query_key" : "query_value"
    }
  },
  "keyUpdates" : 0,
  "writeConflicts" : 0,
  "numYield" : 9,
  "locks" : {
    "Global" : {
      "acquireCount" : {
        "r" : NumberLong(20)
      }
    },
    "Database" : {
      "acquireCount" : {
        "r" : NumberLong(10)
      }
    },
    "Collection" : {
      "acquireCount" : {
        "r" : NumberLong(10)
      }
    }
  },
  "responseLength" : 62,
  "protocol" : "op_query",
  "millis" : 2,
  "execStats" : {},
  "ts" : ISODate("2018-09-27T13:58:27.758Z"),
  "client" : "172.17.0.1",
  "allUsers" : [ ],
  "user" : ""
}
which is what I would expect. Does anyone know what might be the reason for this behavior? I browsed the documentation but didn't find anything regarding it.
This is the documented behaviour for drivers supporting 4.0 features. The reason for the change is to remove confusion and make it clear when an estimate is used and when it is not.
When counting based on a query filter (rather than just counting the entire collection) both methods will cause the server to iterate over matching documents to count them and therefore have similar performance.
From MongoDb docs: db.collection.count()
NOTE:
MongoDB drivers compatible with the 4.0 features deprecate their
respective cursor and collection count() APIs in favor of new APIs for
countDocuments() and estimatedDocumentCount(). For the specific API
names for a given driver, see the driver documentation.
From MongoDb docs: db.collection.countDocuments()
db.collection.countDocuments(query, options)
New in version 4.0.3.
Returns the count of documents that match the query for a collection
or view. The method wraps the $group aggregation stage with a $sum
expression to perform the count and is available for use in
Transactions.
A more detailed explanation for this change in API can be found on the MongoDb JIRA site:
Drivers supporting MongoDB 4.0 must deprecate the count() helper and
add two new helpers - estimatedDocumentCount() and countDocuments().
Both helpers are supported with MongoDB 2.6+.
The names of the new helpers were chosen to make it clear how they
behave and exactly what they do. The estimatedDocumentCount helper
returns an estimate of the count of documents in the collection using
collection metadata, rather than counting the documents or consulting
an index. The countDocuments helper counts the documents that match
the provided query filter using an aggregation pipeline.
The count() helper is deprecated. It has always been implemented using
the count command. The behavior of the count command differs depending
on the options passed to it and the topology in use and may or may not
provide an accurate count. When no query filter is provided the count
command provides an estimate using collection metadata. Even when
provided with a query filter the count command can return inaccurate
results with a sharded cluster if orphaned documents exist or if a
chunk migration is in progress. The countDocuments helper avoids these
sharded cluster problems entirely when used with MongoDB 3.6+, and
when using Primary read preference with older sharded clusters.
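The shape of the aggregation that countDocuments sends can be sketched as a small pipeline builder (the function name is mine; it just mirrors the shape of the logged command above):

```javascript
// Sketch: countDocuments(filter) is roughly a $match for the filter
// followed by a $group/$sum stage that counts the matching documents.
function countDocumentsPipeline(filter) {
  const pipeline = [];
  if (filter && Object.keys(filter).length > 0) {
    pipeline.push({ $match: filter });
  }
  pipeline.push({ $group: { _id: null, n: { $sum: 1 } } });
  return pipeline;
}

console.log(JSON.stringify(countDocumentsPipeline({ some_query_key: 'query_value' }), null, 2));
```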

Can't use ElemMatch() if array elements are value types? C# MongoDB [duplicate]

If I have this schema...
person = {
  name : String,
  favoriteFoods : Array
}
... where the favoriteFoods array is populated with strings. How can I find all persons that have "sushi" as their favorite food using mongoose?
I was hoping for something along the lines of:
PersonModel.find({ favoriteFoods : { $contains : "sushi" } }, function(...) {...});
(I know that there is no $contains in mongodb, just explaining what I was expecting to find before knowing the solution)
As favouriteFoods is a simple array of strings, you can just query that field directly:
PersonModel.find({ favouriteFoods: "sushi" }, ...); // favouriteFoods contains "sushi"
But I'd also recommend making the string array explicit in your schema:
person = {
  name : String,
  favouriteFoods : [String]
}
The relevant documentation can be found here: https://docs.mongodb.com/manual/tutorial/query-arrays/
There is no $contains operator in mongodb.
You can use the answer from JohnnyHK as that works. The closest analogy to contains that mongo has is $in, using this your query would look like:
PersonModel.find({ favouriteFoods: { "$in" : ["sushi"]} }, ...);
I feel like $all would be more appropriate in this situation. If you are looking for a person who is into sushi, you do:
PersonModel.find({ favoriteFood : { $all : ["sushi"] }, ...})
That way you can also filter your search further, like so:
PersonModel.find({ favoriteFood : { $all : ["sushi", "bananas"] }, ...})
$in is like OR and $all like AND. Check this : https://docs.mongodb.com/manual/reference/operator/query/all/
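That OR/AND distinction can be illustrated with plain array predicates (hypothetical helper names, just to show the semantics of $in vs $all on an array field):

```javascript
// $in: the array field matches if it contains ANY of the given values (OR).
// $all: the array field matches if it contains ALL of the given values (AND).
const matchesIn  = (arr, values) => values.some(v => arr.includes(v));
const matchesAll = (arr, values) => values.every(v => arr.includes(v));

const favoriteFoods = ['sushi', 'bananas'];
console.log(matchesIn(favoriteFoods, ['sushi', 'pizza']));    // true  (has sushi)
console.log(matchesAll(favoriteFoods, ['sushi', 'pizza']));   // false (no pizza)
console.log(matchesAll(favoriteFoods, ['sushi', 'bananas'])); // true
```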
In case the array contains objects, for example if favouriteFoods is an array of objects like the following:
{
  name: 'Sushi',
  type: 'Japanese'
}
you can use the following query:
PersonModel.find({"favouriteFoods.name": "Sushi"});
In case you need to find documents which contain NULL elements inside an array of sub-documents, I've found this query which works pretty well:
db.collection.find({"keyWithArray":{$elemMatch:{"$in":[null], "$exists":true}}})
This query is taken from this post: MongoDb query array with null values
It was a great find and it works much better than my own initial and wrong version (which turned out to work fine only for arrays with one element):
.find({
  'MyArrayOfSubDocuments': { $not: { $size: 0 } },
  'MyArrayOfSubDocuments._id': { $exists: false }
})
In case lookup_food_array is an array:
match_stage["favoriteFoods"] = {'$elemMatch': {'$in': lookup_food_array}}
In case you have a single string, lookup_food_string:
match_stage["favoriteFoods"] = {'$elemMatch': lookup_food_string}
Though I agree that find() is most effective in your use case, there is also $match in the aggregation framework. It eases querying a big number of entries while generating a low number of results that hold value to you, especially for grouping and creating new fields.
PersonModel.aggregate([
  {
    "$match": {
      $and: [{ 'favouriteFoods': { $exists: true, $in: ['sushi'] } }, ........ ]
    }
  },
  { $project: { "_id": 0, "name": 1 } }
]);
There are several ways to achieve this. The first one is the $elemMatch operator:
const docs = await Documents.find({category: { $elemMatch: {$eq: 'yourCategory'} }});
// you may need to convert 'yourCategory' to ObjectId
The second one is the $in or $all operators:
const docs = await Documents.find({category: { $in: [yourCategory] }});
or
const docs = await Documents.find({category: { $all: [yourCategory] }});
// you can give more categories with these two approaches
// and again you may need to convert yourCategory to ObjectId
$in is like OR and $all like AND. For further details check this link : https://docs.mongodb.com/manual/reference/operator/query/all/
The third one is the aggregate() function:
const docs = await Documents.aggregate([
  { $unwind: '$category' },
  { $match: { 'category': mongoose.Types.ObjectId(yourCategory) } }
]);
With aggregate() you get only one category id in your category array.
I got these code snippets from projects where I had to find docs with a specific category/categories, so you can easily customize them according to your needs.
For Loopback 3, none of the examples given worked for me (at least not via the REST API), but they helped me figure out the exact answer I needed.
{"where":{"arrayAttribute":{ "all" :[String]}}}
In case you are searching in an array of objects, you can use $elemMatch. For example:
PersonModel.find({ favoriteFoods : { $elemMatch: { name: "sushiOrAnytthing" }}});
With populate & $in, this code will be useful (note that $in takes an array of values):
ServiceCategory.find().populate({
  path: "services",
  match: { zipCodes: { $in: ["10400"] } },
  populate: [
    {
      path: "offers",
    },
  ],
});
If you want something like a "contains" operator in JavaScript, you can always use a regular expression for that...
e.g.
Say you want to retrieve a customer having "Bartolomew" as name
async function getBartolomew() {
  const custStartWith_Bart = await Customers.find({ name: /^Bart/ });    // Starts with Bart
  const custEndWith_lomew = await Customers.find({ name: /lomew$/ });    // Ends with lomew
  const custContains_rtol = await Customers.find({ name: /.*rtol.*/ });  // Contains rtol

  console.log(custStartWith_Bart);
  console.log(custEndWith_lomew);
  console.log(custContains_rtol);
}
I know this topic is old, but for future readers wondering the same question, another incredibly inefficient solution could be to do:
PersonModel.find({$where : 'this.favouriteFoods.indexOf("sushi") != -1'});
This avoids all optimisations by MongoDB so do not use in production code.

How to Append to a Field in MongoDB C# [duplicate]

I have a document with a field containing a very long string. I need to concatenate another string to the end of the string already contained in the field.
The way I do it now is that, from Java, I fetch the document, extract the string in the field, append the string to the end and finally update the document with the new string.
The problem: The string contained in the field is very long, which means that it takes time and resources to retrieve and work with this string in Java. Furthermore, this is an operation that is done several times per second.
My question: Is there a way to concatenate a string to an existing field, without having to fetch (db.<doc>.find()) the contents of the field first? In reality all I want is (field.contents += new_string).
I already made this work using Javascript and eval, but as I found out, MongoDB locks the database when it executes javascript, which makes the overall application even slower.
Starting with Mongo 4.2, db.collection.updateMany() can accept an aggregation pipeline, finally allowing the update of a field based on its current value:
// { a: "Hello" }
db.collection.updateMany(
{},
[{ $set: { a: { $concat: [ "$a", "World" ] } } }]
)
// { a: "HelloWorld" }
The first part, {}, is the match query, filtering which documents to update (in this case, all documents).
The second part, [{ $set: { a: { $concat: [ "$a", "World" ] } } }], is the update aggregation pipeline (note the square brackets signifying the use of an aggregation pipeline). $set (an alias of $addFields) is a new aggregation operator which in this case replaces the field's value (by concatenating a itself with the suffix "World"). Note how a is modified directly based on its own value ($a).
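What that pipeline update does can be simulated on plain objects (a sketch of the semantics only; the real $concat operator also has its own handling of null and missing fields):

```javascript
// Simulate [{ $set: { a: { $concat: ["$a", "World"] } } }] over in-memory docs:
// each document gets field <field> replaced by its current value plus a suffix.
function applyConcatSet(docs, field, suffix) {
  return docs.map(doc => ({ ...doc, [field]: doc[field] + suffix }));
}

const updated = applyConcatSet([{ a: 'Hello' }, { a: 'Hi' }], 'a', 'World');
console.log(updated); // [ { a: 'HelloWorld' }, { a: 'HiWorld' } ]
```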
For example (this one appends to the start, but it's the same story):
before
{ "_id" : ObjectId("56993251e843bb7e0447829d"), "name" : "London City", "city" : "London" }
db.airports
  .find({ $text: { $search: "City" } })
  .forEach(function(e) {
    e.name = 'Big ' + e.name;
    db.airports.save(e);
  })
after:
{ "_id" : ObjectId("56993251e843bb7e0447829d"), "name" : "Big London City", "city" : "London" }
Old topic, but I had the same problem.
Since Mongo 2.4, you can use $concat from the aggregation framework.
Example
Consider these documents:
{
  "_id" : ObjectId("5941003d5e785b5c0b2ac78d"),
  "title" : "cov"
}
{
  "_id" : ObjectId("594109b45e785b5c0b2ac97d"),
  "title" : "fefe"
}
Append "fefe" to the title field:
db.getCollection('test_append_string').aggregate([
  { $project: { title: { $concat: [ "$title", "fefe" ] } } }
])
The result of the aggregation will be:
{
  "_id" : ObjectId("5941003d5e785b5c0b2ac78d"),
  "title" : "covfefe"
}
{
  "_id" : ObjectId("594109b45e785b5c0b2ac97d"),
  "title" : "fefefefe"
}
You can then save the results with a bulk, see this answer for that.
This is a sample of one document I have:
{
  "_id" : 1,
  "s" : 1,
  "ser" : 2,
  "p" : "9919871172",
  "d" : ISODate("2018-05-30T05:00:38.057Z"),
  "per" : "10"
}
To append a string to any field, you can run a forEach loop through all documents and then update the desired field:
db.getCollection('jafar').find({}).forEach(function(el) {
  db.getCollection('jafar').update(
    { p: el.p },
    { $set: { p: '98' + el.p } }
  )
})
This would not be possible.
One optimization you can do is create batches of updates.
i.e. fetch 10K documents, append relevant strings to each of their keys,
and then save them as a single batch.
Most mongodb drivers support batch operations.
db.getCollection('<collection>').update(
  // query
  {},
  // update
  {
    $set: { <field>: this.<field> + "<new string>" }
  },
  // options
  {
    "multi" : true,   // update all documents that match the query
    "upsert" : false  // do not insert a new document if no existing document matches the query
  });

Elasticsearch Nest query to trim white spaces while comparing fields with provided value

I am working with the NEST API for Elasticsearch and I am looking for a solution to trim whitespace when comparing fields with a provided value.
Problem:
The Elasticsearch index has the field "customField1" = "Jinesh " (with a trailing space) and I am passing the search value "Jinesh", which does not match and returns no results.
What I am looking for:
It should match the exact provided search value, ignoring whitespace in the Elasticsearch field values.
Any help would be much appreciated.
Thanks.
There are a couple of ways to solve your issue depending on your requirements. The one that best fits your description, in my opinion, is using the Regexp query:
var result = await client.SearchAsync<object>(searchDescriptor =>
    searchDescriptor.Query(queryDescriptor =>
        queryDescriptor.Regexp(
            regex => regex.OnField("customField1").Value(" *Jinesh *"))));
Other options would be using Prefix, Wildcard or MatchPhrasePrefix.
However, this goes against Elasticsearch best practices.
The "Elasticsearch way" of doing this would be to analyze the property using an analyzer that strips the whitespace characters (meaning it'll be saved in the database without the whitespace). A couple of analyzers that do that are the standard analyzer (default analyzer) or the whitespace analyzer. You could also add a custom analyzer and use the Trim Token Filter with your tokenizer.
You can do that by configuring your index.
If you require a particular analyzer that doesn't allow any whitespace trimming, Elasticsearch suggests adding to your index a property that is simply a copy of the property in question (i.e. "customField1"), which could then use a better-suited analyzer for this scenario.
By default, a string property on your POCO will be indexed as an analyzed string field in 2.x, or as an analyzed text field in 5.x with a not_analyzed keyword subfield. The analyzer in both versions is the Standard Analyzer which, amongst other things, splits the input character stream on whitespace characters and removes them when generating tokens.
You can see the effect of an analyzer on a given string input with the Analyze API. In Sense/Console
GET _analyze
{
  "text": ["Jinesh "],
  "analyzer": "standard"
}
returns
{
  "tokens": [
    {
      "token": "jinesh",
      "start_offset": 0,
      "end_offset": 6,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}
These are the tokens that would be stored in the inverted index and searched against.
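A rough approximation of what the standard analyzer does to that input (the real tokenizer's rules are considerably more involved than a split on non-alphanumerics, so treat this only as an illustration):

```javascript
// Approximate the standard analyzer: split on non-alphanumeric characters,
// drop empty tokens, and lowercase what remains.
function standardAnalyzeApprox(text) {
  return text
    .split(/[^A-Za-z0-9]+/)
    .filter(token => token.length > 0)
    .map(token => token.toLowerCase());
}

console.log(standardAnalyzeApprox('Jinesh ')); // [ 'jinesh' ] (trailing space is gone)
```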
To then find a match for this with NEST, you can use the match query
void Main()
{
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var defaultIndex = "default-index";
    var connectionSettings = new ConnectionSettings(pool)
        .DefaultIndex(defaultIndex);
    var client = new ElasticClient(connectionSettings);

    client.CreateIndex(defaultIndex, c => c
        .Mappings(m => m
            .Map<Person>(mm => mm
                .AutoMap()
            )
        )
    );

    client.Index(new Person
    {
        Name = "Jinesh "
    }, i => i.Refresh(Refresh.WaitFor));

    var searchResponse = client.Search<Person>(s => s
        .Query(q => q
            .Match(m => m
                .Field(f => f.Name)
                .Query("Jinesh")
            )
        )
    );
}

public class Person
{
    public string Name { get; set; }
}
The response from the search is
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "default-index",
        "_type" : "person",
        "_id" : "AVjeLMxUCwxm5eXshs-y",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "Jinesh "
        }
      }
    ]
  }
}

Exact Contains Match on a Sub Collection

Using mongodb with the NoRM driver I have this document:
{
  "_id" : ObjectId("0758030341b870c019591900"),
  "TmsId" : "EP000015560091",
  "RootId" : "1094362",
  "ConnectorId" : "SH000015560000",
  "SeasonId" : "7894681",
  "SeriesId" : "184298",
  "Titles" : [
    {
      "Size" : 120,
      "Type" : "full",
      "Lang" : "en",
      "Description" : "House"
    },
    {
      "Size" : 10,
      "Type" : "red",
      "Lang" : "en",
      "Description" : "House M.D."
    }
  ], yadda yadda yadda
and I am querying like:
var query = new Expando();
query["Titles.Description"] = Q.In(showNames);
var fuzzyMatches = db.GetCollection<Program>("program").Find(query).ToList();
where showNames is a string[] contain something like {"House", "Glee", "30 Rock"}
My results contain fuzzy matches. For example, the term "House" returns every show with a Title containing the word House (like it's doing a Contains).
What I would like is straight matches. So if document.Titles contains "A big blue House" it does not return a match. Only if Titles.Description contains "House" would I like a match.
I haven't been able to reproduce the problem, perhaps because we're using different versions of MongoDB and/or NoRM. However, here are some steps that may help you to find the origin of the fuzzy results.
Turn on profiling, using the MongoDB shell:
> db.setProfilingLevel(2)
Run your code again.
Set the profiling level back to 0.
Review the queries that were executed:
> db.system.profile.find()
The profiling information should look something like this:
{
  "ts" : "Wed Dec 08 2010 09:13:13 GMT+0100",
  "info" : "query test.program ntoreturn:2147483647 reslen:175 nscanned:3 \nquery: { query: { Titles.Description: { $in: [ \"House\", \"Glee\", \"30 Rock\" ] } } } nreturned:1 bytes:159",
  "millis" : 0
}
The actual query is in the info property and should be:
{ Titles.Description: { $in: [ "House", "Glee", "30 Rock" ] } }
If your query looks different, then the 'problem' is in the NoRM driver. For example, if NoRM translates your code to the following regex query, it will do a substring match:
{ Titles.Description: { $in: [ /House/, /Glee/, /30 Rock/ ] } }
I have used NoRM myself, but I haven't come across a setting to control this. Perhaps you're using a different version, that does come with such functionality.
If your query isn't different from what it should be, try running the query from the shell. If it still comes up with fuzzy results, then we're definitely using different versions of MongoDB ;)
in shell syntax:
db.mycollection.find( { "Titles.Description" : "House" } )
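The fuzzy-vs-exact behaviour comes down to whether the value is compared as an unanchored regex (a substring match, which is what a regex-translating driver would produce) or as a plain string. In isolation:

```javascript
const title = 'A big blue House';

console.log(/House/.test(title));     // true  (unanchored regex: substring match)
console.log(title === 'House');       // false (exact string comparison)
console.log(/^House$/.test(title));   // false (anchored regex behaves like exact match)
console.log(/^House$/.test('House')); // true
```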
