I'm using MongoDB.Driver for .NET to query MongoDB (version 3.0.11). This is my code to query a field and limit the query to 200 documents.
BsonDocument bson = new BsonDocument();
bson.Add("Field", "Value");
BsonDocumentFilterDefinition<ResultClass> filter = new BsonDocumentFilterDefinition<ResultClass>(bson);
FindOptions queryOptions = new FindOptions() { BatchSize = 200 };
List<ResultClass> result = new List<ResultClass>();
result.AddRange(myCollection.Find<ResultClass>(filter, queryOptions).Limit(200).ToList());
My issue is that when I check the database's current operations, the operation's query field shows only:
{ Field : "Value" }
Which is different from the query using "AsQueryable" below:
List<ResultClass> result = myCollection.AsQueryable<ResultClass>().Where(t => t.Field == "Value").Take(200).ToList();
Query operation using "AsQueryable"
{ aggregate: "CollectionName", pipeline: [ { $match: { Field:
"Value" } }, { $limit: 200 } ], cursor: {} }
Why can't I see the limit in the query using Find? Is the limit being handled on the client side instead of the server?
I need to apply the limit on the server side, but I can't use the second query because the field being searched needs to be a string, which can't be done using AsQueryable.
Using Limit in the first piece of code applies the limit to a cursor object, which is still server-side until you actually request the documents by invoking ToList(), at which point only 200 documents will go over the wire to your application.
It looks like AsQueryable executes an aggregation pipeline, which is what shows up in currentOp, but both are essentially the same.
I'm not sure if there is a performance impact for either one, though.
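If you want to confirm what the driver actually sends, one option is to subscribe to the driver's command monitoring events and log the raw commands. This is only a minimal sketch (the connection string is a placeholder), not code from the question:

using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Core.Events;

var settings = MongoClientSettings.FromConnectionString("mongodb://localhost:27017");
settings.ClusterConfigurator = cb =>
    cb.Subscribe<CommandStartedEvent>(e =>
    {
        // Log find/aggregate commands as they are sent to the server
        if (e.CommandName == "find" || e.CommandName == "aggregate")
            Console.WriteLine(e.Command.ToJson());
    });
var client = new MongoClient(settings);

Running the Find(...).Limit(200).ToList() call with this in place should show a find command that contains the limit, i.e. the limit is part of the command sent to the server rather than being applied client-side.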
I have a collection that requires frequent updating. The structure is as follows:
{
ObjectId _Id;
string EntryKey;
string Description;
string SearchTerm;
}
There is a unique index on the EntryKey field. I am creating upsert operations like so:
var filter = filterBuilder.Eq(e => e.EntryKey, model.EntryKey);
var update = updateBuilder.Combine(
updateBuilder.SetOnInsert(e => e._Id, ObjectId.GenerateNewId()),
updateBuilder.SetOnInsert(e => e.EntryKey, model.EntryKey.ToLower()),
updateBuilder.Set(e => e.Description, model.Description),
updateBuilder.Set(e => e.SearchTerm, model.SearchTerm));
var op = new UpdateOneModel<SearchEntry>(filter, update){ IsUpsert = true, Hint = "EntryKey_1" };
bulkList.Add(op);
Using the exact same input data for each test with a fresh Mongo instance, the first iteration succeeds and the second fails with an E11000 duplicate key error on the EntryKey index.
When I remove the unique constraint from the index, duplicate documents are created in the collection (identical except for _Id).
When I run the command in the shell using the same failed ID, the command succeeds and the document is updated.
db.search.bulkWrite([
{
updateOne: {
"filter": { "EntryKey": "ad544f72-496f-4eee-bf53-ffdda57de824" },
"update": {
$setOnInsert: {
_id: ObjectId(),
"EntryKey": "ad544f72-496f-4eee-bf53-ffdda57de824"
},
$set: {
"Description": "This is a description",
"SearchTerm",: "search"
}
},
"upsert": true
}
}
]);
I expect the documents that match the filter predicate to be updated, instead of either throwing a duplicate key error (when the unique index is enforced) or inserting essentially duplicate documents.
I see in this question the accepted answer is to separate updates from inserts. If that's the case, then what's the point of upsert if it cannot be used in the manner being tried here?
When the code runs the first time, it will create a document where EntryKey is set to model.EntryKey.ToLower().
In the second run, EntryKey is compared with model.EntryKey. Since the stored value was downcased by the upsert, the filter will only match if there are no uppercase letters in model.EntryKey.
If there are any, the filter fails to match, so the operation attempts an insert, which then collides with the existing lowercased key and fails on the unique index.
To make this consistent, downcase the value in the filter as well, like
var filter = filterBuilder.Eq(e => e.EntryKey, model.EntryKey.ToLower());
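For completeness, here is a sketch of the whole upsert with the lowercased key used consistently in both the filter and the update (reusing the SearchEntry model, builders, and bulk list from the question):

var key = model.EntryKey.ToLower();  // normalize once, use the same value everywhere
var filter = Builders<SearchEntry>.Filter.Eq(e => e.EntryKey, key);
var update = Builders<SearchEntry>.Update.Combine(
    Builders<SearchEntry>.Update.SetOnInsert(e => e._Id, ObjectId.GenerateNewId()),
    Builders<SearchEntry>.Update.SetOnInsert(e => e.EntryKey, key),
    Builders<SearchEntry>.Update.Set(e => e.Description, model.Description),
    Builders<SearchEntry>.Update.Set(e => e.SearchTerm, model.SearchTerm));
bulkList.Add(new UpdateOneModel<SearchEntry>(filter, update) { IsUpsert = true, Hint = "EntryKey_1" });

This way the value written on insert and the value used for matching on the next run are always identical.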
Basically, this is just a query (get) operation in Azure Cosmos DB based on machine ID and datetime.
I am really stuck on querying Cosmos DB based on datetime. I can run the same query and get results in the Azure portal, but from code I am not getting any results.
A few more details:
SELECT *
FROM c
WHERE c.IoTHub.ConnectionDeviceId IN ('hub20') AND c.MACHINE_ID = 'TAP_20' AND (c.EventEnqueuedUtcTime >= '2021-02-03T10:40:42.5180000Z' AND c.EventEnqueuedUtcTime <= '2021-02-03T10:40:42.5180000Z')
QueryDefinition queryDefinition = new QueryDefinition(sqlQueryText);
FeedIterator<dynamic> queryResultSetIterator = container.GetItemQueryIterator<dynamic>(queryDefinition);
FeedResponse<dynamic> currentResultSet;
while (queryResultSetIterator.HasMoreResults)
{
currentResultSet = await queryResultSetIterator.ReadNextAsync();
}
I am able to get all the data when filtering up to MACHINE_ID, but as soon as I apply the c.EventEnqueuedUtcTime condition I am not able to get the data. I have tried every possible solution. The c.EventEnqueuedUtcTime value arrives as a string and is also saved as a string in the database, as you can see in the sample document below.
{
"MESSAGE_GROUP_ID": "24c9e3ad-4fd6-4abb-88d8-eafb9060884e",
"TYPE": "Gauges",
"MACHINE_ID": "TAP_20",
"Gauges": {
"OVERRIDE": 85.8
},
"EventProcessedUtcTime": "2021-02-03T10:41:48.0493615Z",
"PartitionId": 3,
"EventEnqueuedUtcTime": "2021-02-03T10:40:42.5180000Z",
"IoTHub": {
"MessageId": "498b7df3-55e6-4b3f-a18e-698fd991e526",
"CorrelationId": null,
"ConnectionDeviceId": "hub20",
"ConnectionDeviceGenerationId": "637332663098221999",
"EnqueuedTime": "2021-02-03T10:41:11.2570000Z"
},
"id": "498b7df3-55e6-4b3f-a18e-698fd991e526"
}
Any lead will be appreciated.
Thanks
Thank you user2911592 for sharing the resolution steps. Posting them as an answer to help other community members.
Initially the query/code was saved incorrectly using casting; fixing that resolved the issue.
user2911592: feel free to add further details.
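Since EventEnqueuedUtcTime is stored as a string, one way to keep the comparison purely string-based and avoid accidental casting while building the query is to pass the bounds as parameters. This is only a sketch against the document shown above (the bound values are placeholders), not the poster's actual fix:

var query = new QueryDefinition(
        "SELECT * FROM c " +
        "WHERE c.IoTHub.ConnectionDeviceId IN ('hub20') " +
        "AND c.MACHINE_ID = @machineId " +
        "AND c.EventEnqueuedUtcTime >= @from AND c.EventEnqueuedUtcTime <= @to")
    .WithParameter("@machineId", "TAP_20")
    .WithParameter("@from", "2021-02-03T00:00:00.0000000Z")  // passed as strings, matching
    .WithParameter("@to", "2021-02-03T23:59:59.9999999Z");   // how the values are stored

var iterator = container.GetItemQueryIterator<dynamic>(query);
var results = new List<dynamic>();
while (iterator.HasMoreResults)
{
    results.AddRange(await iterator.ReadNextAsync());
}

Because the stored timestamps are ISO 8601 strings in UTC with a fixed format, lexicographic string comparison orders them the same way as the underlying instants, so the range filter behaves as expected.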
I am quite new to MongoDB (and the MongoDB C# driver), and lately we have been trying to implement an update where we use the value of a field and a variable (method parameter) to update several fields.
Basically, our doc is something like this
public class Inventory {
public string _id;
public decimal Quantity;
public decimal TotalCost;
}
What we want to do is to update the quantity and totalCost based on a passed value (qty).
(1) TotalCost -= (qty * (TotalCost / Quantity))
(2) Quantity -= Qty
The logic behind this is that we want to retain the average cost of our item.
Take note: the Quantity value used in step (1) should be the original value, not the result of step (2).
We could implement this using two queries, but in our case we need to execute this logic in a single call, as different threads may update the same item.
I have read the docs about Aggregate and Projection (and using Expressions) but I cannot seem to figure out how to use or combine the result of projection into the aggregate update.
I tried this projection to return the value that should be deducted from TotalCost:
Builders<Inventory>.Projection.Expression(e => (e.TotalCost / e.Quantity) * -qty);
Thank you and hope you guys can point us in the right direction.
Here is an equivalent of what we are trying to achieve in mongo shell, provided that qty = 500.
db.inventory.updateOne(
{ _id: "1" },
[
{ "$set": { "TotalCost": { "$add": ["$TotalCost", { "$multiply": [-500, { "$divide": ["$TotalCost", "$Quantity"] }] }] } }
}
] )
There are currently no type-safe helper methods for creating update pipelines. However, you can use the same JSON-ish syntax that you use in the shell. Here's a bit of example C# code.
// Some id to filter on
var id = ObjectId.GenerateNewId();
var db = client.GetDatabase("test");
var inventory = db.GetCollection<Inventory>("inventory");
var filter = Builders<Inventory>.Filter
    .Eq(x => x.Id, id);
var update = Builders<Inventory>.Update.Pipeline(
new PipelineStagePipelineDefinition<Inventory, Inventory>(
new PipelineStageDefinition<Inventory, Inventory>[]
{
#"{ ""$set"": { ""TotalCost"": { ""$add"": [""$TotalCost"", { ""$multiply"": [-500, { ""$divide"": [""$TotalCost"", ""$Quantity""] }] }] } } }",
}));
await inventory.UpdateOneAsync(filter, update)
.ConfigureAwait(false);
I am trying to do a cross-partition query on Azure Cosmos DB without a partition key. The throughput is set to 4000 RU/s, which gives me 250 RU/s per partition key range.
My Cosmos DB collection has about 1 million documents and is roughly 70 GB in total. They are spread evenly across approximately 40,000 logical partitions, and the JSON documents are on average 100 KB in size. This is what the structure of my JSON documents looks like:
"ArrayOfObjects": [
{
// other properties omitted for brevity
"SubId": "ed2a49fb-51d4-45b4-9690-df0721d6a32f"
},
{
"SubId": "35c87833-9bea-4151-86da-4d9c482ae1fe"
},
"ParitionKey": "b42"
This is how I am querying currently without a partition key:
public async Task<ResponseModel> GetBySubId(string subId)
{
var collectionId = _cosmosClient.CollectionId;
var query = $#"SELECT * FROM {collectionId} c
WHERE ARRAY_CONTAINS(c.ArrayOfObjects, {{'SubId': '{subId}'}}, true)";
var feedOptions = new FeedOptions { EnableCrossPartitionQuery = true };
var docQuery = _cosmosClient.Client.CreateDocumentQuery(
_collectionUri,
query,
feedOptions)
.AsDocumentQuery();
var results = new List<ResponseModel>();
while (docQuery.HasMoreResults)
{
var executedQuery = await docQuery.ExecuteNextAsync<ResponseModel>();
if (executedQuery.Count != 0)
{
results.AddRange(executedQuery.ToList());
}
}
if (results.Count == 0)
{
return null;
}
return results.FirstOrDefault();
}
I am expecting to be able to retrieve the document via one of its SubIds right after inserting it. What actually happens is that the query cannot find the document and returns null, even after it finishes execution by draining all continuation tokens. The issue is intermittent and inconsistent: sometimes the document can be retrieved right after it is inserted, other times not.
For those documents that fail to be retrieved after being inserted, if you wait some time (usually a couple of minutes) and repeat the query with the same SubId, the document is then returned. There seems to be a delay.
I have checked the Cosmos DB metrics in the Azure portal; they indicate that I have not exceeded the provisioned RU/s per partition and that my requests have not been rate limited (HTTP 429).
Given the above why am I still seeing issues with cross partition querying even when there is enough throughput provisioned?
I read this article on RavenDB set operations, but it didn't show me exactly how to update a set of documents via C#. I would like to update a field on all documents that match certain criteria. Or, to put it another way, I would like to take this C# and make it more efficient:
var session = db.GetSession();
foreach(var data in session.Query<Data>().Where(d => d.Color == "Red"))
{
data.Color = "Green";
session.Store(data);
}
session.SaveChanges();
See http://ravendb.net/docs/2.5/faq/denormalized-updates
The first parameter is the name of the index you wish to update.
The second parameter is the index query, which lets you specify your where clause; the query syntax is the Lucene syntax (http://lucene.apache.org/java/2_4_0/queryparsersyntax.html). The third parameter is the update clause. The fourth parameter specifies whether stale results are allowed.
documentStore.DatabaseCommands.UpdateByIndex("DataByColor",
new IndexQuery
{
Query = "Color:red"
}, new[]
{
new PatchRequest
{
Type = PatchCommandType.Set,
Name = "Color",
Value = "Green"
}
},
allowStale: false);