Composite Mongo filter against CosmosDB - C#

We are using the MongoDB C# driver. Locally my backend is a real MongoDB; in production on Azure it is MS CosmosDB with the Mongo interface.
My Mongo document has a version. I read the document, modify it, increase the version, and write it back, and I want to be sure that nobody has changed the document between the read and the write. So I use the version in the update filter:
var builder = Builders<SettingsStorage>.Filter;
var filter = builder.Eq(c => c.Id, myId) & builder.Eq(c => c.Version, versionAsReadBeforeUpdate);
await this.configurations.FindOneAndUpdateAsync(filter, updateDef);
Or this, just to be sure:
var filter1 = Builders<SettingsStorage>.Filter.Eq(c => c.Id, myId);
var filter2 = Builders<SettingsStorage>.Filter.Eq(c => c.Version, versionAsReadBeforeUpdate);
var filter = Builders<SettingsStorage>.Filter.And(filter1, filter2);
await this.configurations.FindOneAndUpdateAsync(filter, updateDef);
So if somebody changed the document in between, the version will also have changed and the filter will fail. I'll get a "Command findAndModify failed: E11000 duplicate key error collection: configurations Failed _id or unique key constraint" exception and will be able to run retry policies etc.
Now the thing is, it works perfectly with the Mongo backend, but almost always throws this exception when running against CosmosDB, both when deployed and from the same local environment. It's the same call, and there is for sure only one simultaneous caller. So how come? Does the C# driver act differently for CosmosDB? What could I try, or how can this be explained?
Note: with a normal filter, i.e. just builder.Eq(c => c.Id, myId), both environments behave the same way and work properly.

CosmosDB is not real MongoDB. It emulates MongoDB. As a result, the semantics can differ considerably between real MongoDB and CosmosDB. If you want identical semantics, run MongoDB Atlas, which also runs in the Azure cloud. The mongod process running locally will be identical to the mongod in the Atlas cloud if you are running the same version of MongoDB. You can choose which version of MongoDB to run when you build your first cluster.
There is a free tier for beginners that is free forever and doesn't require a credit card, so give it a spin and see if you get the same semantics.
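The version-checked update described in the question is a form of optimistic concurrency. The following is only a minimal in-memory sketch of that pattern, not the MongoDB driver API: SettingsDoc, InMemoryStore and OptimisticUpdate are hypothetical names made up for illustration, and the sketch says nothing about how MongoDB or CosmosDB evaluates the filter on the server.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the SettingsStorage document from the question.
public sealed record SettingsDoc(string Id, int Version, string Value);

// In-memory stand-in for a collection (assumption: NOT the MongoDB driver API).
public sealed class InMemoryStore
{
    private readonly Dictionary<string, SettingsDoc> docs = new();
    private readonly object gate = new();

    public void Insert(SettingsDoc doc)
    {
        lock (gate) { docs.Add(doc.Id, doc); }
    }

    public SettingsDoc Read(string id)
    {
        lock (gate) { return docs[id]; }
    }

    // Mirrors an update filtered on (Id == id AND Version == expectedVersion).
    // Returns false when nothing matched, i.e. the document is missing
    // or another writer has already bumped the version.
    public bool TryUpdate(string id, int expectedVersion, string newValue)
    {
        lock (gate)
        {
            if (!docs.TryGetValue(id, out var current) || current.Version != expectedVersion)
                return false;
            docs[id] = current with { Version = expectedVersion + 1, Value = newValue };
            return true;
        }
    }
}

public static class OptimisticUpdate
{
    // Read, modify, write; on a failed version check, start again from a fresh read.
    public static bool UpdateWithRetry(
        InMemoryStore store, string id, Func<string, string> modify, int maxAttempts = 3)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            SettingsDoc snapshot = store.Read(id);
            if (store.TryUpdate(id, snapshot.Version, modify(snapshot.Value)))
                return true;
        }
        return false;
    }
}
```

With the real MongoDB C# driver, the analogous "nothing matched" signal is UpdateResult.MatchedCount being 0 after UpdateOneAsync, or a null return value from FindOneAndUpdateAsync when upsert is not enabled.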

Related

Getting DbUpdateConcurrencyException, but only in production. Cannot reproduce in dev

I'm working on a Quartz.NET hosted job as part of a Blazor Server web application. This job downloads a list of products from a warehouse, and caches them in a local DbContext. In order to optimize performance due to the large number of entries, I'm inserting the new items in batches of 1000 (with DbSet.AddRange()), and after each batch I submit the changes with database.SaveChangesAsync().
The following code is a simplified version of this process:
public async Task Execute(IJobExecutionContext context)
{
    // databaseFactory is injected as a dependency (IDbContextFactory<AppDatabase>)
    await using var database = await databaseFactory.CreateDbContextAsync();
    int page = 0;
    while (true)
    {
        List<WarehouseEntry> dbItems = warehouse.QueryPage(page, 1000); // Retrieves 1000 entries
        if (dbItems.Count == 0)
            break;
        database.ProductWarehouseCache.AddRange(dbItems.Select(p => new CachedWarehouseProduct {
            Sku = p.Name,
            Barcode = p.Barcode,
            Stock = p.Stock,
            Price = p.Price
        }));
        await database.SaveChangesAsync(); // <---- DbUpdateConcurrencyException here
        page++;
    }
}
Note that there is absolutely no concurrency in the code above. The IJob class has the [DisallowConcurrentExecution] attribute, meaning that even if I accidentally trigger this procedure multiple times simultaneously, only one instance will be executing at any given time. So despite the exception message, this is not a concurrency issue. It's also important to note that nothing else is updating or querying the database while this code is running.
This works as intended on my local development machine. However when I tried to deploy the application to a production server for the first time, I've found that this specific part of the code fails with a DbUpdateConcurrencyException. Normally with an exception like this, I would look for concurrency issues, or DbContexts that are used by multiple threads at the same time, or aren't disposed properly. However, as I have explained above, this is not the case here.
The following is the full exception message:
Microsoft.EntityFrameworkCore.DbUpdateConcurrencyException:
The database operation was expected to affect 1 row(s), but actually affected 0 row(s);
data may have been modified or deleted since entities were loaded.
See http://go.microsoft.com/fwlink/?LinkId=527962 for information on
understanding and handling optimistic concurrency exceptions.
What could be causing an exception like this, when there is no concurrency whatsoever? And what could cause this to only happen on the production server, but never in the development workspace?
Additional information:
dotnet 6
EF Core 6.0.6
Local/Dev Database: MySQL 8.0.31
Local/Dev OS: Windows 11
Remote/Prod Database+OS: MySQL 8.0.30-0ubuntu0.20.04.2
I have fixed the issue. I was using DataGrip's built-in import/export tools to clone my database's DDL from the local dev DB to the remote prod DB. Apparently these tools don't replicate the DDL exactly as they should, which leads to EF Core throwing unexpected errors such as this.
To fix it, I rewrote my deployment pipeline to use the dotnet ef migrations script --idempotent command to generate an .sql file that applies any missing migrations to the production database. Since switching to this dotnet tool, I am no longer getting the exception.

How to set up a local Mongo server using the MongoDB .NET driver

I want to add an option to save data locally in my application using the MongoDB database tools, and I want to configure all the server information from within my application.
I have 2 questions.
The following code works only after manually setting up a MongoDB localhost database in this way:
On a computer where the database has not been set up, the code does not work.
My code is:
public void createDB()
{
    MongoClient client = new MongoClient();
    var db = client.GetDatabase("TVDB");
    var coll = db.GetCollection<Media>("Movies");
    Media video = new Media("", "");
    video.Name = "split";
    coll.InsertOne(video);
}
This code works only after manually setting up the database as in the picture above.
Without it, I get a timeout exception on the last line.
How can I configure it from my application to make it work (define the server)?
Must the user install the MongoDB software on their PC, or is the API package enough to use the database?
Many thanks!
By using that command you're not "configuring the database", you're running it.
If you don't want to manually run it, but want it to be always running, you should install it as a Windows Service as explained in How to run MongoDB as Windows service?.
You need to install and/or run a MongoDB server in order to use it. Using the API alone is not enough, it's not like SQLite.
The code you are using looks for a local MongoDB server: new MongoClient() with no arguments connects to localhost on port 27017.

MongoDB C# driver to do CRUD operations in the Azure Cosmos DB Emulator

I am using the Azure Cosmos DB Emulator to do CRUD operations on MongoDB using the MongoDB C# driver.
I am able to create a database and a collection in the emulator using C#. This is my sample code to create them:
IMongoDatabase db = dbClient.GetDatabase("<My DB name>");
db.CreateCollection("<Collection Name>");
These calls work fine, but when I try to insert sample data into this collection, it throws the error below:
Command insert failed: Unknown server error occurred when processing this request..
My sample code to insert sample data is
IMongoCollection<UserProfile> collection = db.GetCollection<UserProfile>("<Collection Name>");
UserProfile c = new UserProfile();
c.ID = 21;
c.UserName = "<Some Name> ";
c.Email = "<Email ID>";
collection.InsertOne(c);
How can I use the MongoDB C# driver to do CRUD operations in the Azure Cosmos DB Emulator? And how can I run Mongo queries in the emulator instead of SQL queries?
Thanks in Advance
The UI for MongoDB API in Emulator is not yet implemented (it's coming though), but everything else should work. There are two tutorials you need to combine for your use case:
https://learn.microsoft.com/en-us/azure/cosmos-db/local-emulator
(look for MongoDB section there)
https://learn.microsoft.com/en-us/azure/cosmos-db/create-mongodb-dotnet
- Build and run the sample, make sure it works with the new connection string for the emulator, and then just add your own code; it will work.

Azure MongoDB Api Condition not supported

The following is my code to remove documents:
var filterAddInfo = builder.Lte("Claim_Date", branchEntity.Report_Date);
mongoDB.BranchPerformance.FindOneAndUpdate(
filterMain,
Builders<BsonDocument>.Update.PullFilter("Add_Info", filterAddInfo));
It works with MongoDB, but it does not work when I connect to the Azure MongoDB API. It reports:
Command findAndModify failed: Operator ''OPERATOR_PULL' with condition' is not supported..
It seems that a condition (e.g. lte) is not supported in the Azure MongoDB API. Is there an alternative way to change my code to cater for this condition?
We do not yet support the pull operator with a condition specified. Please reach out to askcosmosmongoapi [at] microsoft [dot] com with a sample document, and we'll be happy to work with you on a workaround.

Submit a Spark job from C# and get results

As per the title, I would like to request a calculation from a Spark cluster (local/HDInsight in Azure) and get the results back in a C# application.
I am aware of Livy, which I understand is a REST API application sitting on top of Spark to query it, but I have not found a standard C# API package. Is this the right tool for the job? Is it just missing a well-known C# API?
The Spark cluster needs to access Azure Cosmos DB, therefore I need to be able to submit a job including the connector jar library (or its path on the cluster driver) in order for Spark to read data from Cosmos.
As a .NET Spark connector to query data did not seem to exist, I wrote one:
https://github.com/UnoSD/SparkSharp
It is just a quick implementation, but it does have also a way of querying Cosmos DB using Spark SQL
It's just a C# client for Livy but it should be more than enough.
using (var client = new HdInsightClient("clusterName", "admin", "password"))
using (var session = await client.CreateSessionAsync(config))
{
    var sum = await session.ExecuteStatementAsync<int>("val res = 1 + 1\nprintln(res)");
    const string sql = "SELECT id, SUM(json.total) AS total FROM cosmos GROUP BY id";
    var cosmos = await session.ExecuteCosmosDbSparkSqlQueryAsync<IEnumerable<Result>>
    (
        "cosmosName",
        "cosmosKey",
        "cosmosDatabase",
        "cosmosCollection",
        "cosmosPreferredRegions",
        sql
    );
}
If you're just looking for a way to query your Spark cluster using Spark SQL, then this is a way to do it from C#:
https://github.com/Azure-Samples/hdinsight-dotnet-odbc-spark-sql/blob/master/Program.cs
The console app requires an ODBC driver installed. You can find that here:
https://www.microsoft.com/en-us/download/details.aspx?id=49883
Also the console app has a bug: add this line to the code after the part where the connection string is generated.
Immediately after this line:
connectionString = GetDefaultConnectionString();
Add this line
connectionString = connectionString + "DSN=Sample Microsoft Spark DSN";
If you changed the name of the DSN when you installed the Spark ODBC driver, you will need to change the name in the line above accordingly.
Since you need to access data from Cosmos DB, you could open a Jupyter Notebook on your cluster and ingest data into spark (create a permanent table of your data there) and then use this console app/your c# app to query that data.
If you have a spark job written in scala/python and need to submit it from a C# app then I guess LIVY is the best way to go. I am unsure if Mobius supports that.
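Submitting a job through Livy, including an extra connector jar, could look roughly like the sketch below. It assumes Livy's POST /batches endpoint with the file, className and jars fields; the LivyBatch class, the paths and the class name are hypothetical placeholders. Authentication and any extra headers a given cluster may require (for example X-Requested-By) are left out and would need to be added for a real deployment.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper; not part of any official Livy or Spark package.
public static class LivyBatch
{
    // Builds the JSON body for a Livy batch submission.
    // "file" is the application jar or script, "jars" are extra jars
    // (e.g. the Cosmos DB connector) to put on the job's classpath.
    public static string BuildRequestBody(string file, string className, string[] jars)
    {
        var payload = new Dictionary<string, object>
        {
            ["file"] = file,
            ["className"] = className,
            ["jars"] = jars
        };
        return JsonSerializer.Serialize(payload);
    }

    // Posts the body to <livyBaseUri>/batches and returns the raw JSON response.
    // livyBaseUri must end with '/' so that the relative path is appended
    // instead of replacing the last path segment.
    public static async Task<string> SubmitAsync(HttpClient http, Uri livyBaseUri, string body)
    {
        using var content = new StringContent(body, Encoding.UTF8, "application/json");
        using var response = await http.PostAsync(new Uri(livyBaseUri, "batches"), content);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

The response from Livy is itself JSON describing the batch (including its id and state), so the caller would typically poll the batch until it finishes and then read the job's output from wherever the Spark job wrote it.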
Microsoft just released a dataframe based .NET support for Apache Spark via the .NET Foundation OSS. See http://dot.net/spark and http://github.com/dotnet/spark for more details. It is now available in HDInsight per default if you select the correct HDP/Spark version (currently 3.6 and 2.3, soon others as well).
UPDATE:
Long ago I said a clear no to this question.
However, times have changed and Microsoft has made an effort.
Please check out https://dotnet.microsoft.com/apps/data/spark
https://github.com/dotnet/spark
// Create a Spark session
var spark = SparkSession
    .Builder()
    .AppName("word_count_sample")
    .GetOrCreate();
Writing spark applications in C# now is that easy!
OUTDATED:
No, C# is not the tool you should choose if you would like to work with Spark! However, if you really want to do the job with it, try Mobius, as mentioned above:
https://github.com/Microsoft/Mobius
Spark has 4 main languages with APIs for them: Scala, Java, Python, R.
If you are looking for a language for production, I would not suggest the R API. The other 3 work well.
For the Cosmos DB connection I would suggest: https://github.com/Azure/azure-cosmosdb-spark
