I developed a console application to transfer data to an API. The application is called via a SQL trigger.
I use a DbContext to get the data. It worked, but suddenly it doesn't work anymore.
Here is my code:
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder.UseSqlServer(
        @"Data Source=.\SQLEXPRESS;Initial Catalog=mydatabase;User ID=sa;Password=password;Trust Server Certificate=true;Encrypt=true;Connection Timeout=300",
        providerOptions =>
        {
            providerOptions.CommandTimeout(180);
        });
}
Then in my Program.cs:
using (DB_DONNNESContext context = new DB_DONNNESContext())
{
    ListeLot lot = new ListeLot();
    try
    {
        lot = context.ListeLots.Where(e => e.Id == Id).First();
        Log.Information("Get lot " + Lot.Id);
        Log.CloseAndFlush();
        return;
    }
    catch (Exception ex)
    {
        Log.Error($"Error cannot get lot {Id} " + ex.Message + " " + " / " + context.ContextId);
        Log.CloseAndFlush();
        return;
    }
}
What's wrong? My query is very simple... Yesterday it worked once, and then I got this message. I have 6 GB free on the hard disk.
When I test in debug mode it works, but when I try to update via a SQL query I get this message in the log file:
Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. / Microsoft.EntityFrameworkCore.Infrastructure.DatabaseFacade
It's strange because I do get a ContextId (I tried logging it).
I really need your help :)
Have a nice day.
You may be encountering a deadlock. Querying a table as part of a trigger can be dangerous depending on your database isolation level.
If this code represents all you need to do, then loading the entire entity is unnecessary and is likely contributing to a deadlock scenario.
lot = context.ListeLots.Where(e => e.Id == Id).First();
Log.Information("Get lot " + Lot.Id);
This code tells EF to load an entity from the database, and there appears to be a typo or a reference issue: the variable being loaded is "lot", but your Log.Information call references "Lot". This could be a typo introduced when pasting your code into the StackOverflow question, or your code may be referencing a module-level/global property called "Lot". C# is case sensitive, but you can run into issues like this quite easily when reusing variable names that differ only by case.
If you just want to check if a Lot record exists without loading the entire thing into memory (triggering a read lock for the row which may be deadlocking with the trigger or other DB operations) then instead use the following:
var lotExists = context.ListeLots.Any(e => e.Id == Id);
if (lotExists)
    Log.Information($"Get lot {Id}");
else
    Log.Warning($"Lot {Id} not found.");
Log.CloseAndFlush();
Rather than loading the Lot entity, an Any check just returns whether the row exists. This should amount to an index scan/seek and is unlikely to deadlock.
Edit: If you need to get a couple of columns from the query and want to improve your chances of avoiding the deadlock, one option is to add those columns as included fields on an index. For instance, if you want to get a Name column from a lot, create an index on Id with Name as an included column. Then fetch your Id and Name from ListeLots:
var lot = context.ListeLots
.Where(e => e.Id == Id)
.Select(e => new { e.Id, e.Name })
.SingleOrDefault();
When EF composes this to SQL and executes it on the server, SQL Server should optimize the query to pull the values from the index rather than the table. One caveat: if you are searching for a data row that has just been updated (i.e. the row that the trigger is executing after an update on), then this will most likely still deadlock, especially if the Name had been updated. Indexes can help bypass deadlock scenarios due to row locks, but they aren't immune to them. They can generally improve query performance, but at a cost of storage/memory use on the DB server as well as update performance. Generally it is a good trade-off for a small number of commonly queried, small fields. However, as soon as you request a column not in the index, the query falls back to the table data.
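If you are using EF Core's SQL Server provider, the covering index itself can be declared in the model. A minimal sketch, assuming the entity is ListeLot with Id and Name properties (the index name is illustrative):
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // IncludeProperties is specific to the SQL Server provider (EF Core 3.0+).
    modelBuilder.Entity<ListeLot>()
        .HasIndex(e => e.Id)
        .IncludeProperties(e => e.Name)   // covering column so the query never touches the table
        .HasDatabaseName("IX_ListeLots_Id_Name");
}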
Related
Hi, could someone guide me through the following problem? There must be tons of guides on it, but for some reason I can't get Google to find a nice how-to to follow.
I'm implementing this in an ASP.NET Core API, but I think the problem/solution applies to any language.
The problem: I have to call a view in a database that is painfully slow; it takes about 15-30 seconds to return the ~300 rows.
It only returns the fields that are required. It joins a lot of tables across multiple databases. (Other applications update the data; I'm only interested in reading the result.)
The DBA says there is nothing he can do, so I have to find a solution, and why not, it could be fun.
Now the real problem: there are about 250 autonomous clients requesting data, each client requests data about every 2 minutes, and with the time it takes to select the data it doesn't take long for the system to become unresponsive. The response is the same data for all requests.
It would be acceptable to cache the rows for 5 minutes. How would I implement it so that only one request selects from the database and updates the cache, while all the others read from the cache, perhaps waiting briefly if the cache is empty while new data is being loaded from the view?
(I could write a script scheduled to execute every x minutes, but it would be more fun to solve this in the application.)
I could perhaps make some cache tables in the database and have the API call check whether the cache table is empty; if it is, get the data from the slow view, populate the cache table and return the result. But then what would be a good solution for emptying and repopulating the cache only once, and not multiple times, when multiple requests come in during the time it takes to load data from the view?
And perhaps there are better alternatives than caching in a database table?
Hope anyone can help.
Your question is not specific to a technology, so you are really asking about a concept. In general:
check cache without locking
return data if it is up to date
perform lock
check cache again
update cache
unlock and return data
You may read/use https://learn.microsoft.com/en-us/dotnet/core/extensions/caching
// pseudo code
async Task<Result> QueryFromCache()
{
    // check cache is up to date - without lock
    var cacheData = await GetCacheData(); // latest data or null
    if (cacheData == null)
    {
        // wait for cache data
        cacheData = await UpdateCache();
        return cacheData.Data;
    }
    // data is still up to date?
    if (cacheData.UpdateDate.AddMinutes(5) > DateTime.Now)
    {
        return cacheData.Data;
    }
    // cache update is necessary
    // Option 1: start a separate "fire and forget" Task to fill the cache, but return the old data
    _ = Task.Run(() => UpdateCache()); // do not await
    return cacheData.Data; // return immediately with data older than 5 minutes
    // Option 2: wait for the update instead
    // cacheData = await UpdateCache();
    // return cacheData.Data;
}
async Task<CacheData> UpdateCache()
{
    var cacheLock = GetLock(); // lock, semaphore, database lock => depends on your architecture
    try
    {
        // double-check the cache is up to date - with lock
        var cacheData = await GetCacheData();
        if (cacheData != null && cacheData.UpdateDate.AddMinutes(5) > DateTime.Now)
        {
            return cacheData;
        }
        // query data
        var result = await PerformLongQuery();
        // update cache
        cacheData = new CacheData
        {
            UpdateDate = DateTime.Now,
            Data = result
        };
        await SetCacheData(cacheData);
        return cacheData;
    }
    finally
    {
        cacheLock.Release();
    }
}
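To make the pattern concrete, here is a minimal in-process sketch using IMemoryCache and SemaphoreSlim. It assumes a single API instance and that 5 minutes of staleness is acceptable; the type and member names (SlowViewCache, RowDto, performLongQuery) are illustrative, not taken from your code.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

public record RowDto(int Id, string Name);

public class SlowViewCache
{
    private readonly IMemoryCache _cache = new MemoryCache(new MemoryCacheOptions());
    private readonly SemaphoreSlim _lock = new SemaphoreSlim(1, 1);
    private const string Key = "slow-view-rows";

    public async Task<IReadOnlyList<RowDto>> GetRowsAsync(Func<Task<IReadOnlyList<RowDto>>> performLongQuery)
    {
        // Fast path: no locking while the cached entry is still fresh.
        if (_cache.TryGetValue(Key, out IReadOnlyList<RowDto> rows))
            return rows;

        await _lock.WaitAsync();
        try
        {
            // Double-check: another request may have refreshed the cache while we waited.
            if (_cache.TryGetValue(Key, out rows))
                return rows;

            rows = await performLongQuery();                // the 15-30 second view query
            _cache.Set(Key, rows, TimeSpan.FromMinutes(5)); // acceptable staleness window
            return rows;
        }
        finally
        {
            _lock.Release();
        }
    }
}
With this, only the first request after expiry pays the 15-30 second cost; every other request either gets cached data immediately or waits on the semaphore for that single refresh.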
I'm using the .NET Connector to access a MySQL database from my C# program. All my queries are done with MySqlCommand.BeginExecuteReader, with the IAsyncResults held in a list so I can check them periodically and invoke appropriate callbacks whenever they finish, fetching the data via MySqlCommand.EndExecuteReader. I am careful never to hold one of these readers open while attempting to read results from something else.
This mostly works fine. But I find that if I start two queries at the same time, then I get the dreaded MySqlException: There is already an open DataReader associated with this Connection which must be closed first exception in EndExecuteReader. And this is happening the first time I invoke EndExecuteReader. So the error message is full of baloney; there is no other open DataReader at that point, unless the connector has somehow opened one behind the scenes without me calling EndExecuteReader. So what's going on?
Here's my update loop, including copious logging:
for (int i=queries.Count-1; i>=0; i--) {
Debug.Log("Checking query: " + queries[i].command.CommandText);
if (!queries[i].operation.IsCompleted) continue;
var q = queries[i];
queries.RemoveAt(i);
Debug.Log("Finished, opening Reader for " + q.command.CommandText);
using (var reader = q.command.EndExecuteReader(q.operation)) {
try {
q.callback(reader, null);
} catch (System.Exception ex) {
Logging.LogError("Exception while processing: " + q.command.CommandText);
Logging.LogError(ex.ToString());
q.callback(null, ex.ToString());
}
}
Debug.Log("And done with callback for: " + q.command.CommandText);
}
And here's the log:
As you can see, I start both queries in rapid succession. (This is the first thing my program does after opening the DB connection, just to pin down what's happening.) Then the first one I check says it's done, so I call EndExecuteReader on it, and boom -- already it claims there's another open one. This happens immediately, before it even gets to my callback method. How can that be?
Is it not valid to have two open queries at once, even if I only call EndExecuteReader on one at a time?
When you run two queries concurrently, you must use two Connection objects. Why? Each connection can only handle one query at a time. It looks like your code got into a race condition where some of your concurrent queries worked and then a pair of them collided and failed.
At any rate, your system will be more resilient in production if you keep your startup sequence simple. If I were you, I'd run one query after another rather than trying to run them all at once. (Obviously, if that causes real performance problems you'll have to run them concurrently, but keep it simple until you need it to be complex.)
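If you do need the queries to run concurrently, here is a minimal sketch of the one-connection-per-query approach (connection string and SQL below are placeholders):
using System.Collections.Generic;
using System.Threading.Tasks;
using MySql.Data.MySqlClient;

static async Task<List<string>> RunQueryAsync(string connectionString, string sql)
{
    // Each concurrent query gets its own connection; connection pooling keeps this cheap.
    using (var conn = new MySqlConnection(connectionString))
    using (var cmd = new MySqlCommand(sql, conn))
    {
        await conn.OpenAsync();
        var rows = new List<string>();
        using (var reader = await cmd.ExecuteReaderAsync())
        {
            while (await reader.ReadAsync())
                rows.Add(reader.GetString(0));
        }
        return rows;
    }
}

// Usage: the two queries no longer share a connection, so they cannot collide.
// var t1 = RunQueryAsync(cs, "SELECT name FROM table_a");
// var t2 = RunQueryAsync(cs, "SELECT name FROM table_b");
// await Task.WhenAll(t1, t2);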
When I try to query a PostgreSQL database through EntityFramework6.Npgsql with the following code:
using (MyDbContext context = new MyDbContext())
{
var res = (from b in context.mytable select new { b.Name, b.Age });
foreach (var row in res)
{
Console.WriteLine(row.Name + " - " + row.Age);
}
}
I get a timeout exception after fetching a few rows, with the following error:
[Npgsql.NpgsqlException] : {"57014: canceling statement due to
statement timeout"}
Message: 57014: canceling statement due to
statement timeout
When I execute the same operation while fetching all the data to a List, the code works fine:
using (MyDbContext context = new MyDbContext())
{
var res = (from b in context.mytable select new { b.Name, b.Age }).ToList();
foreach (var row in res)
{
Console.WriteLine(row.Name + " - " + row.Age);
}
}
I suspect that it is related to the way PostgreSQL manages its connection pool but I don't know how I could handle it correctly through Entity Framework.
This is probably related to the way Npgsql manages timeouts. In current versions, Npgsql sets the PostgreSQL statement_timeout variable which causes PostgreSQL to generate a timeout error after some time. The problem with this method is that statement_timeout is unreliable for this: it includes network time, client processing time, etc. so too much time spent on the client could make the server generate the error.
In your example, calling ToList() means that you immediately download all results, rather than iterate over them little by little. I do admit it's strange that such short client processing (i.e. Console.WriteLine) could introduce a delay sufficient to trigger a backend timeout (what is the command timeout set to?).
Note that the next major version of Npgsql will remove backend timeouts entirely because of the unreliable nature of statement_timeout - see https://github.com/npgsql/npgsql/issues/689. For now you can manually disable backend timeouts by setting the Backend Timeouts connection string parameter to false (see http://www.npgsql.org/doc/3.0/connection-string-parameters.html).
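A minimal sketch of that connection string change (the Backend Timeouts parameter is the one described above; host, database and credentials are placeholders):
// Npgsql 2.x/3.0: disable the server-side statement_timeout that Npgsql would otherwise set.
var connectionString =
    "Host=myhost;Database=mydb;Username=myuser;Password=mypass;Backend Timeouts=false";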
There are many articles here on EF taking a long time to save, but I've looked through them and used their answers and still seem to get very slow results.
My code looks like so:
using (MarketingEntities1 db = new MarketingEntities1())
{
//using (var trans = db.Database.BeginTransaction(IsolationLevel.ReadUncommitted))
//{
int count = 0;
db.Configuration.AutoDetectChangesEnabled = false;
db.Configuration.ValidateOnSaveEnabled = false;
while (count < ranges.Count)
{
if (bgw != null)
{
bgw.ReportProgress(0, "Saving count: " + count.ToString());
}
db.Set<xGeoIPRanx>().AddRange(ranges.Skip(count).Take(BATCHCOUNT));
db.SaveChanges();
count+=BATCHCOUNT;
}
//trans.Commit();
//}
}
Each batch takes 30+ seconds to complete. BATCHCOUNT is 1000. I know EF isn't that slow. You can see that I've stopped using a transaction and turned change tracking off; none of it seemed to help.
Some more info:
xGeoIPRanx is an empty table with no PK (I'm not sure how much that would help). I'm trying to insert about 10 million ranges.
Edit:
I feel stupid, but I'm trying to use BulkInsert and I keep getting "this entity doesn't exist" errors. I'm looking at this code:
using (var ctx = GetContext())
{
using (var transactionScope = new TransactionScope())
{
// some stuff in dbcontext
ctx.BulkInsert(entities);
ctx.SaveChanges();
transactionScope.Complete();
}
}
What is "entities" I tried a list of my entities, that doesnt work, what data type is that?
nvm it works as expected it was a strange error due to how i generated the edmx file
Pause the debugger 10 times under load and look at the stack including
external code. Where does it stop most often?
It's taking a long time on the .SaveChanges(). Just from some quick tests, ADO.NET code
That means network latency and server execution time are causing this. For inserts, server execution time is usually not that high. You cannot do anything about network latency with EF because it sends one batch per insert. (Yes, this is a deficiency of the framework.)
Don't use EF for bulk work. Consider using table-valued parameters or SqlBulkCopy or any other means of bulk inserting, such as Aducci's proposal from the comments.
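A minimal sketch of the SqlBulkCopy route, with illustrative column names (StartIp, EndIp) since the real xGeoIPRanx schema isn't shown:
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

static void BulkInsertRanges(string connectionString, IEnumerable<(string StartIp, string EndIp)> ranges)
{
    // Stage the rows in a DataTable whose columns match the destination table.
    var table = new DataTable();
    table.Columns.Add("StartIp", typeof(string));
    table.Columns.Add("EndIp", typeof(string));
    foreach (var r in ranges)
        table.Rows.Add(r.StartIp, r.EndIp);

    using (var bulk = new SqlBulkCopy(connectionString))
    {
        bulk.DestinationTableName = "xGeoIPRanx";
        bulk.BatchSize = 10000;    // one round trip per 10,000 rows instead of one per row
        bulk.WriteToServer(table); // single bulk operation
    }
}
For 10 million rows you would likely stream the data in batches rather than build one giant DataTable, but even this naive version should be dramatically faster than per-entity SaveChanges.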
This makes no sense to me but maybe someone with keener eyes can spot the problem.
I have a Windows service that uses FileSystemWatcher. It processes some files and uploads data to an MSSQL database. It works totally fine on my machine -- detached from Visual Studio (i.e. not debugging) and running as a service. If I copy this compiled code to our server, and have it point to the same database, and even the same files (!), I get this error every single time:
System.InvalidOperationException: Invalid operation. The connection is closed.
at System.Data.SqlClient.SqlConnection.GetOpenTdsConnection()
at System.Data.SqlClient.SqlBulkCopy.CopyBatchesAsyncContinuedOnError(Boolean cleanupParser)
at System.Data.SqlClient.SqlBulkCopy.<>c__DisplayClass30.<CopyBatchesAsyncContinuedOnSuccess>b__2c()
at System.Data.SqlClient.AsyncHelper.<>c__DisplayClass9.<ContinueTask>b__8(Task tsk)
I have tried pointing my local code to the server's files and it works fine. .Net 4.5.1 is on both machines. Services are both running under the same domain user. It is baffling. Perhaps there is something I don't understand about SqlBulkCopy.WriteToServerAsync()? Does it automatically share connections or something? Does it close in between calls or something? Here's the relevant code:
private static void ProcessFile(FileInfo fileInfo)
{
using (var bulkCopy = new SqlBulkCopy("Data Source=myserver;Initial Catalog=mydb;Persist Security Info=True;User ID=myusr;Password=mypwd;"))
using (var objRdr = ObjectReader.Create(ReadLogFile(fileInfo)
.Where(x => !string.IsNullOrEmpty(x.Level)),
"Id", "AppId", "AppDomain", "AppMachine",
"LocalDate", "UtcDate", "Thread", "Level", "Logger", "Usrname",
"ClassName", "MethodName", "LineNo", "Message", "Exception",
"StackTrace", "Properties"))
{
bulkCopy.DestinationTableName = "EventLog";
bulkCopy.BulkCopyTimeout = 600;
bulkCopy.EnableStreaming = true;
bulkCopy.BatchSize = AppConfig.WriteBatchSize;
bulkCopy.WriteToServerAsync(objRdr).ContinueWith(t =>
{
if (t.Status == TaskStatus.Faulted)
{
CopyToFailedDirectory(fileInfo);
_log.Error(
string.Format(
"Error copying logs to database for file {0}. File has been copied to failed directory for inspection.",
fileInfo.FullName), t.Exception.InnerException ?? t.Exception);
Debug.WriteLine("new handle error {0}",
(t.Exception.InnerException ?? t.Exception).Message);
}
if (t.Status == TaskStatus.RanToCompletion)
{
_log.InfoFormat("File {0} logs have been copied to database.", fileInfo.FullName);
Debug.WriteLine("Yay, finished {0}!", fileInfo.Name);
}
// if this is the last one, delete the original file
if (t.Status == TaskStatus.Faulted || t.Status == TaskStatus.RanToCompletion)
{
Debug.WriteLine("deleting file {0}", fileInfo.Name);
PurgeFile(fileInfo);
}
});
}
}
Couple notes in case you ask:
ObjectReader is a FastMember IDataReader implementation. CRAZY fast. It reads the file into custom objects with the properties you see listed.
It throws the error for every single file.
Again, this works on my machine, both as a service and as a console app. I even had it working once on the server. It threw the error and never worked again.
Any ideas?
Looks like an issue with it being async.
Please let me know if I'm wrong, but what I noticed is that you have your SqlBulkCopy and ObjectReader in using statements, which is great; however, you are doing all the processing asynchronously. Once you call WriteToServerAsync and it starts doing work, your using statements dispose of your objects, which also kills your connection.
The odd thing is that it sounds like it works sometimes, but perhaps it just becomes a race condition at that point.
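One way to keep those objects alive until the copy completes is to await it inside the using blocks. A sketch adapted from the code in the question, assuming the method can be made async (ReadLogFile, CopyToFailedDirectory, PurgeFile and _log are the question's own members):
private static async Task ProcessFileAsync(FileInfo fileInfo)
{
    using (var bulkCopy = new SqlBulkCopy("Data Source=myserver;Initial Catalog=mydb;User ID=myusr;Password=mypwd;"))
    using (var objRdr = ObjectReader.Create(
        ReadLogFile(fileInfo).Where(x => !string.IsNullOrEmpty(x.Level)),
        "Id", "AppId", "AppDomain", "AppMachine",
        "LocalDate", "UtcDate", "Thread", "Level", "Logger", "Usrname",
        "ClassName", "MethodName", "LineNo", "Message", "Exception",
        "StackTrace", "Properties"))
    {
        bulkCopy.DestinationTableName = "EventLog";
        bulkCopy.BulkCopyTimeout = 600;
        bulkCopy.EnableStreaming = true;
        try
        {
            // Awaiting here means the using blocks cannot dispose the connection
            // while the copy is still in flight.
            await bulkCopy.WriteToServerAsync(objRdr);
            _log.InfoFormat("File {0} logs have been copied to database.", fileInfo.FullName);
        }
        catch (Exception ex)
        {
            CopyToFailedDirectory(fileInfo);
            _log.Error(string.Format(
                "Error copying logs to database for file {0}. File has been copied to failed directory for inspection.",
                fileInfo.FullName), ex);
        }
        // Whether it succeeded or failed, remove the original file as before.
        PurgeFile(fileInfo);
    }
}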
Way late to throw this one on here but I was having what I thought was the same issue with SqlBulkCopy. Tried some of the steps in the other answers but with no luck. Turns out in my case that the actual error was caused by a string in the data going above the max length on one of the varchar columns, but for some reason the only error I was getting was the one about the closed connection.
Strangely, my coworker tried the same thing, and got an actual error message about the varchar being out of bounds. So we fixed the data and everything worked, but if you're here because of this error and nothing else works, you might want to start looking for different issues in your data.
This looks to me to be a bug in the SqlBulkCopy implementation. If you run a (large) number of bulk copies in parallel in separate tasks concurrently, disable your network connection and then trigger a full garbage collection, you will reliably get this exception thrown on the GC's finalizer thread. It is completely unavoidable.
That shouldn't happen because you are continuing the WriteToServerAsync task and handling the fault. But in the implementation, on error they start a new task that they don't continue or await.
This still seems to be a bug in .NET 4.6.2.
The only fix I can see is to subscribe to TaskScheduler.UnobservedTaskException and look for something in the stacktrace that identifies the issue. That isn't a fix by the way, it is a hack.
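If you do go the UnobservedTaskException route, a minimal sketch of that hack (the stack trace filter string is illustrative):
TaskScheduler.UnobservedTaskException += (sender, e) =>
{
    // Observe only the known SqlBulkCopy failure so it doesn't escalate from the
    // finalizer thread; let everything else surface normally.
    if (e.Exception.ToString().Contains("SqlBulkCopy"))
    {
        e.SetObserved();
        // log e.Exception somewhere for later inspection
    }
};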