MongoDB bulk write UpdateOne w/upsert option inserts new documents (c# driver) - c#

I have a collection that requires frequent updating. The structure is as follows:
{
    ObjectId _Id;
    string EntryKey;
    string Description;
    string SearchTerm;
}
There is a unique index on the EntryKey field. I am creating upsert operations like so:
var filter = filterBuilder.Eq(e => e.EntryKey, model.EntryKey);
var update = updateBuilder.Combine(
    updateBuilder.SetOnInsert(e => e._Id, ObjectId.GenerateNewId()),
    updateBuilder.SetOnInsert(e => e.EntryKey, model.EntryKey.ToLower()),
    updateBuilder.Set(e => e.Description, model.Description),
    updateBuilder.Set(e => e.SearchTerm, model.SearchTerm));
var op = new UpdateOneModel<SearchEntry>(filter, update) { IsUpsert = true, Hint = "EntryKey_1" };
bulkList.Add(op);
Using the exact same input data for each test with a fresh Mongo instance, the first iteration succeeds and the second one fails with an E11000 duplicate key error collection on the EntryKey field.
When I remove the unique constraint from the index, duplicate documents are created in the collection (identical except for _Id).
When I run the command in the shell using the same failed ID, the command succeeds and the document is updated.
db.search.bulkWrite([
    {
        updateOne: {
            "filter": { "EntryKey": "ad544f72-496f-4eee-bf53-ffdda57de824" },
            "update": {
                $setOnInsert: {
                    _id: ObjectId(),
                    "EntryKey": "ad544f72-496f-4eee-bf53-ffdda57de824"
                },
                $set: {
                    "Description": "This is a description",
                    "SearchTerm": "search"
                }
            },
            "upsert": true
        }
    }
]);
I expect the documents that match the filter predicate to get updated instead of either throwing a duplicate key error (when the unique index is enforced) or insert essentially duplicate documents.
I see in this question the accepted answer is to separate updates from inserts. If that's the case, then what's the point of upsert if it cannot be used in the manner being tried here?

When the code runs the first time, it will create a document where EntryKey is set to model.EntryKey.ToLower().
In the second run, EntryKey is compared with model.EntryKey. Since it was downcased in the upsert, this will only match if there are no uppercase letters in model.EntryKey.
If there are any, the filter fails to match, so the operation falls through to an insert; the insert then collides with the already-existing lowercased EntryKey and throws the duplicate key error.
To make that consistent, also downcase in the filter:
var filter = filterBuilder.Eq(e => e.EntryKey, model.EntryKey.ToLower());
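Putting it together, a minimal sketch of the corrected operation (same model and builders as above, with the key downcased once so the filter and the $setOnInsert always agree):

// Downcase once; use the same value in the filter and in $setOnInsert.
var key = model.EntryKey.ToLower();
var filter = filterBuilder.Eq(e => e.EntryKey, key);
var update = updateBuilder.Combine(
    updateBuilder.SetOnInsert(e => e._Id, ObjectId.GenerateNewId()),
    updateBuilder.SetOnInsert(e => e.EntryKey, key),
    updateBuilder.Set(e => e.Description, model.Description),
    updateBuilder.Set(e => e.SearchTerm, model.SearchTerm));
bulkList.Add(new UpdateOneModel<SearchEntry>(filter, update) { IsUpsert = true, Hint = "EntryKey_1" });

With this, the second run matches the existing lowercased document and updates it instead of attempting a conflicting insert.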

Related

Mongo query Limit not showing in operation

I'm using MongoDB.Driver for .NET to query Mongo (version 3.0.11). This is my code to query a field and limit the query to 200 documents.
BsonDocument bson = new BsonDocument();
bson.Add("Field", "Value");
BsonDocumentFilterDefinition<ResultClass> filter = new BsonDocumentFilterDefinition<ResultClass>(bson);
FindOptions queryOptions = new FindOptions() { BatchSize = 200 };
List<ResultClass> result = new List<ResultClass>();
result.AddRange(myCollection.Find<ResultClass>(filter, queryOptions).Limit(200).ToList());
My issue is that when I check the database's current operations, the operation query field shows only:
{ Field : "Value" }
Which is different from the query using "AsQueryable" below:
List<ResultClass> result = myCollection.AsQueryable<ResultClass>().Where(t => t.Field == "Value").Take(200).ToList();
Query operation using "AsQueryable"
{ aggregate: "CollectionName", pipeline: [ { $match: { Field:
"Value" } }, { $limit: 200 } ], cursor: {} }
Why can't I see the limit in the query using Find? Is the limit being handled on the client side instead of the server?
I need the limit applied on the server side, but I can't use the second query because the field to search is only known as a string, which can't be expressed with AsQueryable.
Using Limit in the first piece of code applies the limit to a cursor object, which stays server-side until you actually request the documents by invoking ToList(); at that point only 200 documents go over the wire to your application.
It looks like AsQueryable executes an aggregation pipeline, which is what shows up in currentOp, but both are essentially the same.
I'm not sure if there is a performance impact for either one, though.
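If you want the limit to travel with the command itself, the driver also accepts a limit through FindOptions<TDocument> when using FindAsync. A minimal sketch, assuming the same filter as above; I have not verified how this renders in currentOp on a 3.0-era server:

// Limit is part of the find options sent to the server,
// so the server stops producing results after 200 matches.
var options = new FindOptions<ResultClass> { Limit = 200 };
using (var cursor = await myCollection.FindAsync(filter, options))
{
    List<ResultClass> result = await cursor.ToListAsync();
}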

Documentdb stored proc cross partition query

I have a stored procedure which gives me a document count (count.js on github). I have partitioned my collection. Due to this, I now have to pass the partition key in as an option to run the stored procedure.
Can I, and how should I, enable cross-partition queries in the stored procedure (i.e., collection(EnableCrossPartitionQuery = true)) so that I don't have to specify the partition key?
There is no way to do fan-out stored procedure execution in DocumentDB. They run against a single partition. I ran into this dilemma when trying to switch to partitioned collections and had to make some adjustments. Here are some options:
Return a 1 for every record and sum/count them client-side
Rerun the stored procedure for each unique partition key (see the sketch after this list). In my case, this was not as bad as it sounds since the partition key is a tenantID and I only have a dozen of those and only expect a few hundred max.
I'm not sure about this one since I haven't tried it with partitioned collections, but each query now returns the resource usage of the collection in the x-ms-resource-usage header. That header has a documentsSize sub-header. You could use that divided by the average size of your documents to get an approximate count. There may even be a count record in that header information by now.
Also, there is an x-ms-item-count header but I'm not sure how it behaves. If you send a query for all the records in the entire partitioned collection and set the max-item-count to 1, you'll only get back one record and it shouldn't cost you a lot in RUs, but I don't know how that header behaves. Does it return a 1 in that case? Or does it return the total number of documents that all the pages of the query would eventually return if you bothered to request every page? A quick experiment should confirm this.
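For the second option, the per-partition fan-out can be driven from the client. A rough sketch with the DocumentDB .NET SDK (Microsoft.Azure.Documents.Client); client, sprocUri, and tenantIds are placeholders for your own setup, and the stored procedure is assumed to return a count:

// Run the counting sproc once per known partition key and sum the results.
long total = 0;
foreach (var tenantId in tenantIds)
{
    var options = new RequestOptions { PartitionKey = new PartitionKey(tenantId) };
    var response = await client.ExecuteStoredProcedureAsync<long>(sprocUri, options);
    total += response.Response;
}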
Below you can find some example code that should allow you to read all records cross-partition. The magic is inside the doForAll function, and at the top you can see how it is called.
// SAMPLE STORED PROCEDURE
function sample(prefix) {
    var share = { counter: 0, hasEntityName: 0, isXXX: 0, partitions: {}, prefix };
    doForAll({
        filter: function limiter(record) {
            if (record && record.entityName === 'XXX') return true;
            else return false;
        },
        callback: function handleRecord(record) {
            // Keep track of this partition...
            let partitionKey = record.partitionKey;
            if (share.partitions[partitionKey])
                share.partitions[partitionKey]++;
            else
                share.partitions[partitionKey] = 1;
            // update some counters...
            share.counter++;
            if (record.entityName !== undefined) share.hasEntityName++;
            if (record.entityName === 'XXX') share.isXXX++;
        },
        finaly: function whenAllIsDone() {
            console.log("counter = " + share.counter + ". ");
            console.log("has entity name: " + share.hasEntityName + ". ")
            console.log("is XXX: " + share.isXXX + ". ")
            var parts = Object.getOwnPropertyNames(share.partitions)
            console.log("partition keys: " + parts.length + " ...");
            getContext()
                .getResponse()
                .setBody(share);
        }
    });
    // The magic function...
    // also see: https://azure.github.io/azure-cosmosdb-js-server/Collection.html
    function doForAll(task, ctoken) {
        if (!task) throw "Expected one parameter of type: { filter?: (rec?)=>boolean, callback?: (rec?) => void, finaly?: () => void }";
        // Note: the "__" symbol is an alias for getContext().getCollection()
        var result = getContext()
            .getCollection()
            .chain()
            .filter(task.filter || function (rec) { return true; })
            .map(task.callback || function (rec) { return undefined; })
            .value({ continuation: ctoken }, function afterBatchCallback(err, feed, options) {
                if (err) throw err;
                if (options.continuation)
                    doForAll(task, options.continuation);
                else if (task.finaly)
                    task.finaly();
            });
        if (!result.isAccepted)
            throw "catastrophic failure";
    }
}
PS: it may help to know what the data used in this example looks like.
This is an example of such a document:
{
    "id": "123",
    "partitionKey": "PART_1",
    "entityName": "EXAMPLE_ENTITY",
    "veryInterestingInfo": "The 'id' property is also the collection's id, the 'partitionKey' property happens to be the collection's partition key, and all the records in this collection have an 'entityName' property which contains a (non-unique) string"
}

How to use MongoDB C# Driver Aggregate to get the latest item with specific values on 2 fields?

thanks for stopping by :)
Basically, I have a MongoDB with a collection of documents with these properties:
String group;
String key;
String value;
DateTime timestamp;
I want to get a list of all the most recent documents with unique (distinct) combination of group and key.
There are documents with the same group and key but different values; I only want the one with the most recent timestamp.
So if I have:
Document A: Group = 2; Key = Something; Value = 123; Timestamp = 20160621;
Document B: Group = 2; Key = Something; Value = 888; Timestamp = 20160622;
I want only document B (with value 888) to be retrieved by the query.
This is what I got so far:
var aggregate = collection.Aggregate()
    .Match(new BsonDocument { { "deviceid", deviceid } })
    .Sort(new BsonDocument { { "timestamp", -1 } })
    .Group(new BsonDocument { { "groupkey", "$group" }, { "latestvalue", "$first.value" } });
This however results in the following exception: "Command aggregate failed: the group aggregate field 'groupkey' must be defined as an expression inside an object"
1) Any tips on how to use Aggregate to do what I want? (basically, sort descending by timestamp and distinct based on 2 fields)
2) Any tips on how to convert the aggregate to a list or similar? (I need to use LINQ on all the results to find the value of a document with a specific group and key)
When using the aggregation framework, I suggest first executing the same query in the mongo shell/Robomongo to work out the correct syntax.
Please find below the fixed syntax of your query:
var aggregate = collection.Aggregate()
    .Match(x => x.key == "k10")
    .SortByDescending(x => x.timestamp)
    .Group(BsonDocument.Parse("{ '_id':'$group', 'latestvalue':{$first:'$value'} }")).ToList();
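If you would rather stay strongly typed than parse a BsonDocument, the driver also has an expression-based Group overload. A sketch, assuming your POCO exposes the group, key, value, and timestamp fields shown in the question:

// Group by 'group'; because of the preceding sort, First() picks the latest value.
var latest = collection.Aggregate()
    .Match(x => x.key == "k10")
    .SortByDescending(x => x.timestamp)
    .Group(x => x.group, g => new { Group = g.Key, LatestValue = g.First().value })
    .ToList();

The anonymous results can then be filtered with LINQ to pick out the value for a specific group and key.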

How can I delete nested array element in a mongodb document with the c# driver

I am new to the MongoDB world and am struggling with how to delete or update an element in a nested array field of a document. Here is my sample document:
{
    "_id" : ObjectId("55f354533dd61e5004ca5208"),
    "Name" : "Hand made products for real!",
    "Description" : "Products all made by hand",
    "Products" : [
        {
            "Identifier" : "170220151653",
            "Price" : 20.5,
            "Name" : "Leather bracelet",
            "Description" : "The bracelet was made by hand",
            "ImageUrl" : "https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcQii6JCvXtx0iJGWgpvSl-KrdZONKYzDwS0U8uDvUunjO6BO9Aj"
        }
    ]
}
In my method, I get the id of the document and the id (Identifier) of the product that I want to delete. Can anyone tell me how I can delete the element with Identifier 170220151653 from the Products field?
I tried:
var query = Query.And(Query.EQ("_id", categoryId), Query.EQ("Products.Identifier", productId));
var update = Update.Pull("Products", new BsonDocument() { { "Identifier", productId } });
myDb.Applications().Update(query, update);
as suggested here: MongoDB remove a subdocument document from a subdocument
But I get an error at myDb.Applications(): it just can't be found.
SOLVED:
var pull = Update<Category>.Pull(x => x.Products, builder => builder.EQ(q => q.Identifier, productId));
collection.Update(Query.And(Query.EQ("_id", ObjectId.Parse(categoryId)), Query.EQ("Products.Identifier", productId)), pull);
You are calling the method Pull(string name, MongoDB.Bson.BsonValue value), which according to the docs
Removes all values from the named array element that are equal to some value (see $pull)
and you provide { "Identifier", productId } as the value. I guess Mongo does not find that exact value.
Try to use the second overload of Pull with query-condition instead of exact value
Removes all values from the named array element that match some query
(see $pull).
var update = Update.Pull("Products", Query.EQ("Identifier", productId));
UPDATE
Since you mention the Category entity, I can suggest using a lambda instead of Query.EQ:
var pull = Update<Category>.Pull(x => x.Products, builder => builder.Where(q => q.Identifier == productId));
Solution with the C# MongoDB driver, deleting a single nested element:
var filter = Builders<YourModel>.Filter.Where(ym => ym.Id == ymId);
var update = Builders<YourModel>.Update.PullFilter(ym => ym.NestedItems, Builders<NestedModel>.Filter.Where(nm => nm.Id == nestedItemId));
_repository.Update(filter, update);
I was also facing the same problem, and after a lot of research I found that you have to use PullFilter instead of Pull when you want to delete using a filter.
I had the same problem of deleting elements from the nested array, and after research I found this piece of working code.
var update = Builders<Category>.Update.PullFilter(y => y.Products, builder => builder.Identifier == productId);
var result = await _context.Category.UpdateOneAsync(filter, update);
return result.IsAcknowledged && result.ModifiedCount > 0;
Hi, as per my understanding you want to remove the whole matched element for a given id and Identifier, so the query below will solve your problem; it uses Mongo's $pull method, though I don't know how to convert it into C#.
db.collectionName.update({"_id" : ObjectId("55f354533dd61e5004ca5208")}, {"$pull":{"Products":{"Identifier":"170220151653"}}})
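For reference, that shell $pull translates to C# roughly like this with the 2.x driver's PullFilter (a sketch, assuming a Category class whose Id maps to _id and which has a Products list as in the document above):

var filter = Builders<Category>.Filter.Eq(c => c.Id, ObjectId.Parse("55f354533dd61e5004ca5208"));
var update = Builders<Category>.Update.PullFilter(c => c.Products, p => p.Identifier == "170220151653");
await collection.UpdateOneAsync(filter, update);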
Solution for the C# MongoDB driver: you can set the nested array to an empty [].
var filter = Builders<MyUser>.Filter.Where(mu => mu.Id == "my user id");
var update = Builders<MyUser>.Update.Set(mu => mu.Phones, new List<Phone>());
_repository.Update(filter, update);

Set operations in RavenDB

I read this article on RavenDB set operations, but it didn't show me exactly how to update a set of documents via C#. I would like to update a field on all documents that match a certain criterion. Or to put it another way, I would like to take this C# and make it more efficient:
var session = db.GetSession();
foreach (var data in session.Query<Data>().Where(d => d.Color == "Red"))
{
    data.Color = "Green";
    session.Store(data);
}
session.SaveChanges();
See http://ravendb.net/docs/2.5/faq/denormalized-updates
The first parameter is the name of the index you wish to update.
The second parameter is the index query, which lets you specify your where clause; the query syntax is the Lucene syntax (http://lucene.apache.org/java/2_4_0/queryparsersyntax.html). The third parameter is the update clause. The fourth parameter is whether you want to allow stale results.
documentStore.DatabaseCommands.UpdateByIndex("DataByColor",
    new IndexQuery
    {
        Query = "Color:red"
    },
    new[]
    {
        new PatchRequest
        {
            Type = PatchCommandType.Set,
            Name = "Color",
            Value = "Green"
        }
    },
    allowStale: false);
