I have some tasks (nWorkers = 3):
var taskFactory = new TaskFactory(cancellationTokenSource.Token,
TaskCreationOptions.LongRunning, TaskContinuationOptions.LongRunning,
TaskScheduler.Default);
for (int i = 0; i < nWorkers; i++)
{
var task = taskFactory.StartNew(() => this.WorkerMethod(parserItems,
cancellationTokenSource));
tasks[i] = task;
}
And the following method called by the tasks:
protected override void WorkerMethod(BlockingCollection<ParserItem> parserItems,
CancellationTokenSource cancellationTokenSource)
{
//...log-1...
using (var connection = new OracleConnection(connectionString))
{
OracleTransaction transaction = null;
try
{
cancellationTokenSource.Token.ThrowIfCancellationRequested();
connection.Open();
//...log-2...
transaction = connection.BeginTransaction();
//...log-3...
using (var cmd = connection.CreateCommand())
{
foreach (var parserItem in parserItems.GetConsumingEnumerable(
cancellationTokenSource.Token))
{
cancellationTokenSource.Token.ThrowIfCancellationRequested();
try
{
foreach (var statement in this.ProcessRecord(parserItem))
{
cmd.CommandText = statement;
try
{
cmd.ExecuteNonQuery();
}
catch (OracleException ex)
{
//...log-4...
if (!this.acceptedErrorCodes.Contains(ex.Number))
{
throw;
}
}
}
}
catch (FormatException ex)
{
log.Warn(ex.Message);
}
}
if (!cancellationTokenSource.Token.IsCancellationRequested)
{
transaction.Commit();
}
else
{
throw new Exception("DBComponent has been canceled");
}
}
}
catch (Exception ex)
{
//...log-5...
cancellationTokenSource.Cancel();
if (transaction != null)
{
try
{
transaction.Rollback();
//...log-6...
}
catch (Exception rollbackException)
{
//...log-7...
}
}
throw;
}
finally
{
if (transaction != null)
{
transaction.Dispose();
}
connection.Close();
//...log-8...
}
}
//...log-9...
}
There is a producer of ParserItem objects and these are the consumers. Normally it works fine; sometimes there is an Oracle connection timeout, but in those cases I can see the exception message and everything works as designed.
But sometimes the process gets stuck. When it gets stuck, in the log file I can see the log-1 message and, more or less 15 seconds later, the log-8 message. What is driving me nuts is that I can see neither the exception message (log-5) nor the log-9 message.
Since the cancellationTokenSource.Cancel() method is never called, the producer of items for the bounded collection is stuck until a timeout two hours later.
It is compiled for .NET Framework 4 and I'm using the Oracle.ManagedDataAccess library for the Oracle connection.
Any help would be greatly appreciated.
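As an aside for anyone debugging something similar: an exception thrown inside a task started with StartNew is stored on the Task object and only surfaces when the task is awaited or Wait()ed; if the tasks array is never observed, a worker can die without leaving any trace beyond the log. A minimal, self-contained sketch:

```csharp
using System;
using System.Threading.Tasks;

var task = Task.Factory.StartNew(() =>
{
    // this does NOT crash the process or print anything by itself
    throw new InvalidOperationException("boom");
});

string observed = null;
try
{
    task.Wait(); // the stored exception only surfaces here
}
catch (AggregateException ae)
{
    observed = ae.InnerException.Message;
}
Console.WriteLine(observed); // boom
```

This is why it is worth calling Task.WaitAll(tasks) (or awaiting them) somewhere and logging any AggregateException.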
You should never dispose a transaction or connection explicitly when it is already inside a using scope. Second, you should rarely rely on an exception-based programming style. Your code rewritten below:
using (var connection = new OracleConnection(connectionString))
{
connection.Open();
//...log-2...
using (var transaction = connection.BeginTransaction())
{
using (var cmd = connection.CreateCommand())
{
foreach (var parserItem in parserItems.GetConsumingEnumerable(cancellationTokenSource.Token))
{
if (!cancellationTokenSource.IsCancellationRequested)
{
try
{
foreach (var statement in ProcessRecord(parserItem))
{
cmd.CommandText = statement;
try
{
cmd.ExecuteNonQuery();
}
catch (OracleException ex)
{
//...log-4...
if (!acceptedErrorCodes.Contains(ex.Number))
{
log.Warn(ex.Message);
}
}
}
}
catch (FormatException ex)
{
log.Warn(ex.Message);
}
}
}
if (!cancellationTokenSource.IsCancellationRequested)
{
transaction.Commit();
}
else
{
transaction.Rollback();
throw new Exception("DBComponent has been canceled");
}
}
}
}
//...log-9...
Let me know if this helps.
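For background, the consumer pattern both versions rely on reduces to a few lines (names are illustrative): GetConsumingEnumerable(token) throws OperationCanceledException the moment the token is cancelled, so every consumer must be prepared to observe it:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

var items = new BlockingCollection<int>(boundedCapacity: 10);
var cts = new CancellationTokenSource();

items.Add(1);
items.Add(2);
cts.CancelAfter(100); // simulate another worker calling Cancel()

int consumed = 0;
bool sawCancellation = false;
try
{
    // the collection is never completed via CompleteAdding(), so after
    // draining the two items the loop blocks until the token is cancelled
    foreach (var item in items.GetConsumingEnumerable(cts.Token))
    {
        consumed++;
    }
}
catch (OperationCanceledException)
{
    sawCancellation = true;
}
Console.WriteLine($"consumed={consumed}, cancelled={sawCancellation}");
```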
I can confirm everything you're saying (program stuck, low CPU usage, Oracle connection timeouts, etc.).
One workaround is to use Threads instead of Tasks.
UPDATE: after careful investigation I found out that when you use a high number of Tasks, the ThreadPool is slow to start the worker threads queued by the Oracle driver, which ends up causing a (fake) connection timeout.
A couple of solutions for this:
Solution 1: Increase the ThreadPool's minimum number of threads, e.g.:
ThreadPool.SetMinThreads(50, 50); // YMMV
OR
Solution 2: Configure your connection to use pooling and set its minimum size appropriately.
var ocsb = new OracleConnectionStringBuilder();
ocsb.DataSource = "myhost:1521/MYSERVICE"; // hypothetical; use your real data source
ocsb.UserID = "myuser";
ocsb.Password = "secret";
ocsb.Pooling = true;
ocsb.MinPoolSize = 20; // YMMV
IMPORTANT: before calling any routine that creates a high number of tasks, open a single connection to "warm up" the pool:
using(var oc = new OracleConnection(ocsb.ToString()))
{
oc.Open();
oc.Close();
}
Note: Oracle indexes the connection pools by the connection string (with the password removed), so if you want to open additional connections you must always use the exact same connection string.
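Both solutions amount to having threads or connections ready before the burst of tasks arrives. On the ThreadPool side, note that SetMinThreads returns false if the requested values are rejected, so checking the result is worthwhile; a minimal sketch:

```csharp
using System;
using System.Threading;

ThreadPool.GetMinThreads(out int worker, out int io);
Console.WriteLine($"default minimums: worker={worker}, io={io}");

// Raise the floor so a burst of tasks doesn't wait on the pool's
// slow thread-injection heuristic.
bool ok = ThreadPool.SetMinThreads(50, 50);

ThreadPool.GetMinThreads(out worker, out io);
Console.WriteLine($"ok={ok}, worker={worker}, io={io}");
```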
So I know this question has been asked here before, but the situation here is a little different.
I have a service Application that spawns worker threads. The main service thread is organized as so:
public void PollCrunchFilesTask()
{
try
{
var stuckDeletedServTableFiles = MaintenanceDbContext.stuckDeletedServTableFiles;
var stuckErrorStatusFiles = MaintenanceDbContext.stuckErrorStatusFiles;
while (_signalPollAutoEvent.WaitOne())
{
try
{
Poll();
lock (stuckDelLock)
{
if(stuckDeletedServTableFiles.Count > 0)
MaintenanceDbContext.DeleteFilesToBeDeletedInServiceTable(stuckDeletedServTableFiles);
}
lock (errorStatusLock)
{
if (stuckErrorStatusFiles.Count > 0)
MaintenanceDbContext.UpdateStuckErrorServiceLogEntries(stuckErrorStatusFiles);
}
}
catch (Exception ex)
{
}
}
}
catch (Exception ex)
{
}
}
Inside Poll you have this logic:
public void Poll()
{
try
{
if (ProducerConsumerQueue.Count() == 0 && ThreadCount_Diff_ActiveTasks > 0)
{
var dequeuedItems = MetadataDbContext.UpdateOdfsServiceEntriesForProcessingOnPollInterval(ThreadCount_Diff_ActiveTasks);
var handlers = Producer.GetParserHandlers(dequeuedItems);
foreach (var handler in handlers)
{
ProducerConsumerQueue.EnqueueTask(handler.Execute, CancellationTokenSource.Token);
}
}
}
catch (Exception ex)
{
}
}
That ProducerConsumerQueue.EnqueueTask(handler.Execute, CancellationTokenSource.Token) call launches 4 worker threads, and inside any one of these threads the following function can be called at any time:
public static int DeleteServiceEntry(string logFileName)
{
int rowsAffected = 0;
var stuckDeletedServTableFiles = MaintenanceDbContext.stuckDeletedServTableFiles;
try
{
string connectionString = GetConnectionString();
throw new Exception($"Testing Del HashSet");
using (SqlConnection connection = new SqlConnection())
{
//Attempt some query
}
}
catch (Exception ex)
{
lock (stuckDelLock)
{
stuckDeletedServTableFiles.Add(logFileName);
}
}
return rowsAffected;
}
Now I am testing the stuckDeletedServTableFiles hashset, which is only populated when there is an exception during a query; this is why I purposefully throw an exception. That hashset is the one operated on by the main service thread in the function DeleteFilesToBeDeletedInServiceTable(), whose excerpt is defined below:
public static int DeleteFilesToBeDeletedInServiceTable(HashSet<string> stuckDeletedServTableFiles)
{
int rowsAffected = 0;
string logname = String.Empty; //used to collect error log
var removedHashset = new HashSet<string>();
try
{
var dbConnString = MetadataDbContext.GetConnectionString();
string serviceTable = Constants.SERVICE_LOG_TBL;
using (SqlConnection connection = new SqlConnection(dbConnString))
{
SqlCommand cmd = new SqlCommand();
cmd.CommandType = CommandType.Text;
cmd.CommandText = $"DELETE FROM {serviceTable} WHERE LOGNAME = @LOGNAME";
cmd.Parameters.Add("@LOGNAME", SqlDbType.NVarChar);
cmd.Connection = connection;
connection.Open();
foreach (var logFname in stuckDeletedServTableFiles)
{
cmd.Parameters["@LOGNAME"].Value = logFname;
logname = logFname;
int currRowsAffected = cmd.ExecuteNonQuery();
rowsAffected += currRowsAffected;
if (currRowsAffected == 1)
{
removedHashset.Add(logFname);
Logger.Info($"Removed Stuck {logFname} Marked for Deletion from {serviceTable}");
}
}
Logger.Info($"Removed {rowsAffected} stuck files Marked for Deletion from {serviceTable}");
}
stuckDeletedServTableFiles.ExceptWith(removedHashset);
}
catch (Exception ex)
{
}
return rowsAffected;
}
Given that the hashset stuckDeletedServTableFiles can be accessed by multiple threads at the same time, including the main service thread, I put a lock on the main service thread before it is operated on by DeleteFilesToBeDeletedInServiceTable() and in the function DeleteServiceEntry(). I am new to C#, but I assumed this would be enough. I assumed that since a lock is taken on the main service thread around the call to DeleteFilesToBeDeletedInServiceTable(), that lock would prevent anything else from using the hashset while the function operates on it. Why am I getting this error?
Note that I am not modifying the hashset in the for loop; I only do that after the loop is done. I am getting the error while I loop through the hashset, presumably because another thread is attempting to modify it. The question then is: why is a thread able to modify the hashset when I have a lock around the call at the service level?
For now, I fixed it by surrounding the for loop in DeleteFilesToBeDeletedInServiceTable() with a lock and also taking the same lock around the statement stuckDeletedServTableFiles.ExceptWith(removedHashset) inside that function. I am not sure what this costs, but it seems it will work, and given how infrequently this issue occurs in practice, I suppose it won't cost much, especially since the function is not called at busy times: the file crunching will have finished by then and the main caller is utilizing another thread.
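The underlying rule is that a lock statement only excludes other code that locks the same object; it does not make the collection itself thread-safe. Every read and write of the shared HashSet, including enumeration, must take the same lock, which is exactly what the fix described above does. A minimal sketch of the pattern (names are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var stuckFiles = new HashSet<string>();
var gate = new object(); // ONE lock object shared by all accessors

// writer side (worker threads)
void RecordStuckFile(string name)
{
    lock (gate) { stuckFiles.Add(name); }
}

// reader side (main service thread): enumeration must hold the lock too,
// otherwise a concurrent Add can throw "Collection was modified"
List<string> SnapshotStuckFiles()
{
    lock (gate) { return new List<string>(stuckFiles); }
}

Parallel.For(0, 1000, i => RecordStuckFile("file" + i));
var snapshot = SnapshotStuckFiles();
Console.WriteLine(snapshot.Count); // 1000
```

Taking a snapshot under the lock and iterating the copy afterwards also keeps the critical section short, so slow database work in the loop does not block the writers.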
So I am using a blocking collection to store a custom class, Node, which holds a database connection string and a query to run over that connection. The collection is complete before it is consumed from. However, if I pull a Node from the collection and try to run it, it may fail, and I would like to re-add it to the collection to be rerun later.
I have two working solutions, but I don't like either of them, and I was hoping someone could give me a cleaner solution or some ideas on how to improve this.
1st:
Parallel.Foreach on the collection, anything that fails gets added to a new blocking collection which is recursively called.
Parallel.ForEach(NodeList, node => {
try {
using (NpgsqlConnection conn = new NpgsqlConnection(node.ConnectionString)) {
conn.Open();
using (NpgsqlCommand npgQuery = new NpgsqlCommand(node.Query, conn)) {
using (NpgsqlDataReader reader = npgQuery.ExecuteReader()) {
while (reader.Read()) {
//Do stuff
}
}
}
}
} catch (Exception e){
retryNodes.Add(node);
}
});
retryNodes.CompleteAdding();
NodeList = retryNodes.ToList<Node>();
try {
ExecuteNodes();
} catch (Exception) {
throw; // rethrow without resetting the stack trace
}
I don't like this because it means that as the original collection gets toward the end, threads are wasted waiting for the new collection to be started.
2nd:
Manually start tasks for each item in the collection:
int totalToFinish = NodeList.Count;
while (totalToFinish > 0) {
while (threadsRunning < MaxAllowedThreads && totalToFinish > 0) {
Interlocked.Increment(ref threadsRunning);
Task.Factory.StartNew(() => {
if (NodeList.Count == 0) {
Interlocked.Decrement(ref threadsRunning);
return;
}
Node node;
if (!NodeList.TryTake(out node, 1000)) {
Interlocked.Decrement(ref threadsRunning);
return;
}
if (node.Attempts++ >= RetryAttempts) {
throw new Exception("Failed after allowed attempts of: " + RetryAttempts);
}
try {
using (NpgsqlConnection conn = new NpgsqlConnection(node.ConnectionString)) {
conn.Open();
using (NpgsqlCommand npgQuery = new NpgsqlCommand(node.Query, conn)) {
using (NpgsqlDataReader reader = npgQuery.ExecuteReader()) {
while (reader.Read()) {
//Do stuff
}
}
}
Interlocked.Decrement(ref totalToFinish);
}
} catch (Exception e) {
NodeList.Add(node);
}
Interlocked.Decrement(ref threadsRunning);
});
}
}
This way works a lot better in terms of performance, but it has massive overhead and I feel like it's not a good way to do it.
If anyone could help me with this it would be greatly appreciated.
Thanks.
I see that you're setting a limit on the retry count in the second algorithm; you can simplify the first algorithm by including that retry loop inside the Parallel.ForEach body:
Parallel.ForEach(NodeList, node => {
while(true) {
try {
using (NpgsqlConnection conn = new NpgsqlConnection(node.ConnectionString)) {
conn.Open();
using (NpgsqlCommand npgQuery = new NpgsqlCommand(node.Query, conn)) {
using (NpgsqlDataReader reader = npgQuery.ExecuteReader()) {
while (reader.Read()) {
//Do stuff
}
}
}
}
break; // success: exit the retry loop
} catch (Exception e){
node.Attempts++;
if(node.Attempts >= RetryAttempts) {
throw new Exception("Too many retries");
}
}
}
});
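One caveat when throwing out of the retry loop: exceptions that escape a Parallel.ForEach body are collected and rethrown by the Parallel.ForEach call itself as a single AggregateException, so the caller should catch that rather than the inner exception type:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

var nodes = new[] { 1, 2, 3 };
string caught = null;
try
{
    Parallel.ForEach(nodes, n =>
    {
        if (n == 2) throw new InvalidOperationException("Too many retries");
    });
}
catch (AggregateException ae)
{
    // one inner exception per failed body invocation
    caught = ae.InnerExceptions.First().Message;
}
Console.WriteLine(caught); // Too many retries
```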
When I do the following, I get a warning that not all code paths return a value if the catch block doesn't return an int; when that catch block does return an int, the capturedException null test becomes unreachable unless it is placed inside a finally block. Is putting the throw inside finally {} acceptable? Is the connection closed automatically, as it would be in a synchronous method that employs the using syntax?
public async Task<int> Write2Log()
{
ExceptionDispatchInfo capturedException = null;
using (SqlConnection conn = new SqlConnection(connectionString))
{
using (SqlCommand cmd = new SqlCommand(commandText, conn))
{
try
{
await cmd.Connection.OpenAsync();
return await cmd.ExecuteNonQueryAsync();
}
catch (Exception ex)
{
capturedException=ExceptionDispatchInfo(ex);
return -1;
}
// unreachable unless placed inside `finally` block
if (capturedException != null)
{
capturedException.Throw();
}
}
}
}
Your code in the question seems to be incorrect, but in any case you need not return in the catch; you can return at the end of the method, since the successful scenario will never reach the end of the method.
public async Task<int> Write2Log()
{
ExceptionDispatchInfo capturedException = null;
using (SqlConnection conn = new SqlConnection(connectionString))
{
using (SqlCommand cmd = new SqlCommand(commandText, conn))
{
try
{
await cmd.Connection.OpenAsync();
return await cmd.ExecuteNonQueryAsync();
}
catch (Exception ex)
{
capturedException = ExceptionDispatchInfo.Capture(ex);
}
if (capturedException != null)
{
capturedException.Throw();
}
}
}
return -1;
}
The code you're clumsily trying to write is equivalent to the much simpler
public async Task<int> Write2Log()
{
using (var conn = new SqlConnection(connectionString))
{
using (var cmd = new SqlCommand(commandText, conn))
{
await conn.OpenAsync();
return await cmd.ExecuteNonQueryAsync();
}
}
}
If you don't see why, you might want to refresh your knowledge about how .NET exceptions work, and how await works (and in particular, how it handles exceptions).
If you also want to observe the exception (e.g. for logging), you can use throw; to rethrow the exception at the end of the catch. There's no need to use ExceptionDispatchInfo, and there's no point in storing the exception in a local to rethrow later.
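For completeness, when ExceptionDispatchInfo is genuinely needed (rethrowing outside the catch block while preserving the original stack trace), it is created via the static Capture method, not a constructor, which is the compile error in the question's code:

```csharp
using System;
using System.Runtime.ExceptionServices;

ExceptionDispatchInfo captured = null;
try
{
    throw new InvalidOperationException("original failure");
}
catch (Exception ex)
{
    captured = ExceptionDispatchInfo.Capture(ex); // note: Capture, not a constructor
}

// ...later, outside the catch block:
string rethrownMessage = null;
try
{
    captured.Throw(); // rethrows with the original stack trace intact
}
catch (InvalidOperationException ex)
{
    rethrownMessage = ex.Message;
}
Console.WriteLine(rethrownMessage); // original failure
```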
If you're instead trying to return -1 when there's an error, just do
public async Task<int> Write2Log()
{
using (var conn = new SqlConnection(connectionString))
{
using (var cmd = new SqlCommand(commandText, conn))
{
try
{
await conn.OpenAsync();
return await cmd.ExecuteNonQueryAsync();
}
catch
{
return -1;
}
}
}
}
A method cannot throw an exception and return a value at the same time. You have to choose one.
I have a code that adds data to two EntityFramework 6 DataContexts, like this:
using(var scope = new TransactionScope())
{
using(var requestsCtx = new RequestsContext())
{
using(var logsCtx = new LogsContext())
{
var req = new Request { Id = 1, Value = 2 };
requestsCtx.Requests.Add(req);
var log = new LogEntry { RequestId = 1, State = "OK" };
logsCtx.Logs.Add(log);
try
{
requestsCtx.SaveChanges();
}
catch(Exception ex)
{
log.State = "Error: " + ex.Message;
}
logsCtx.SaveChanges();
}
}
}
There is an insert trigger on the Requests table that rejects some values using RAISERROR. This situation is normal and should be handled by the try-catch block where the SaveChanges method is invoked. If the second SaveChanges call fails, however, the changes to both DataContexts must be reverted entirely; hence the transaction scope.
Here goes the error: when requestsCtx.SaveChanges() throws an exception, the whole Transaction.Current has its state set to Aborted, and the subsequent logsCtx.SaveChanges() fails with the following:
TransactionException:
The operation is not valid for the state of the transaction.
Why is this happening, and how do I tell EF that the first exception is not critical?
Really not sure if this will work, but it might be worth trying.
private void SaveChanges()
{
using(var scope = new TransactionScope())
{
var log = CreateRequest();
bool saveLogSuccess = CreateLogEntry(log);
if (saveLogSuccess)
{
scope.Complete();
}
}
}
private LogEntry CreateRequest()
{
var req = new Request { Id = 1, Value = 2 };
var log = new LogEntry { RequestId = 1, State = "OK" };
using(var requestsCtx = new RequestsContext())
{
requestsCtx.Requests.Add(req);
try
{
requestsCtx.SaveChanges();
}
catch(Exception ex)
{
log.State = "Error: " + ex.Message;
}
return log;
}
}
private bool CreateLogEntry(LogEntry log)
{
using(var logsCtx = new LogsContext())
{
try
{
logsCtx.Logs.Add(log);
logsCtx.SaveChanges();
}
catch (Exception)
{
return false;
}
return true;
}
}
from the documentation on transactionscope: http://msdn.microsoft.com/en-us/library/system.transactions.transactionscope%28v=vs.110%29.aspx
If no exception occurs within the transaction scope (that is, between
the initialization of the TransactionScope object and the calling of
its Dispose method), then the transaction in which the scope
participates is allowed to proceed. If an exception does occur within
the transaction scope, the transaction in which it participates will
be rolled back.
Basically, as soon as an exception is encountered, the transaction is rolled back (as it seems you're aware). I think this might work, but I'm really not sure and can't test to confirm. It seems like this goes against the intended use of TransactionScope, and I'm not familiar enough with exception handling/bubbling, but maybe it will help! :)
I think I finally figured it out. The trick was to use an isolated transaction for the first SaveChanges:
using(var requestsCtx = new RequestsContext())
using(var logsCtx = new LogsContext())
{
var req = new Request { Id = 1, Value = 2 };
requestsCtx.Requests.Add(req);
var log = new LogEntry { RequestId = 1, State = "OK" };
logsCtx.Logs.Add(log);
using(var outerScope = new TransactionScope())
{
using(var innerScope = new TransactionScope(TransactionScopeOption.RequiresNew))
{
try
{
requestsCtx.SaveChanges();
innerScope.Complete();
}
catch(Exception ex)
{
log.State = "Error: " + ex.Message;
}
}
logsCtx.SaveChanges();
outerScope.Complete();
}
}
Warning: most articles about the RequiresNew mode discourage using it for performance reasons. It works perfectly for my scenario; however, if there are any side effects that I'm unaware of, please let me know.
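The behavior this relies on can be observed without a database: disposing an inner RequiresNew scope without calling Complete rolls back only that inner transaction, while the outer ambient transaction stays active (with the default Required option, the failed inner scope would instead doom the outer transaction):

```csharp
using System;
using System.Transactions;

TransactionStatus outerStatus;
using (var outerScope = new TransactionScope())
{
    using (var innerScope = new TransactionScope(TransactionScopeOption.RequiresNew))
    {
        // no innerScope.Complete(): the inner transaction rolls back here
    }
    // the outer ambient transaction is unaffected by the inner rollback
    outerStatus = Transaction.Current.TransactionInformation.Status;
    outerScope.Complete();
}
Console.WriteLine(outerStatus); // Active
```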
I have a method called inside a main one. I need the child method to be able to roll back if the parent method fails. The two data connections use different servers. Before I added the transaction scopes, they worked well, but when I tie them together, the child method aborts.
Edit: Error message: Network access for distributed transaction Manager(MSDTC) has been disabled. Please enable DTC for network access in the security configuration for MSDTC using Component Service Administrative tool.
public static void LoopStudent()
{
try
{
using(TransactionScope scope = new TransactionScope())
{
String connString = ConfigurationManager.AppSettings["DBConnection"];
using(SqlConnection webConn = new SqlConnection(connString))
{
webConn.Open();
String sql = "select * from students";
using(SqlCommand webComm = new SqlCommand(sql, webConn))
{
using(SqlDataReader webReader = webComm.ExecuteReader())
{
if (webReader.HasRows)
{
while (webReader.Read())
{
int i = GetNextId();
}
}
else
Console.WriteLine("wrong");
}
}
}
scope.Complete();
}
}
catch (Exception ex)
{
Console.WriteLine("Error " + ex.Message);
}
} //End LoopStudent
public static int GetNextId()
{
int nextId = 0;
String connString = ConfigurationManager.AppSettings["SecondDBConnection"];
try
{
using(TransactionScope scope = new TransactionScope())
{
using(SqlConnection webConn = new SqlConnection(connString))
{
webConn.Open();
using(SqlCommand webComm = new SqlCommand("GetNextId", webConn))
{
//do things
}
}
scope.Complete();
}
}
catch (TransactionAbortedException ex)
{
Console.WriteLine("TransactionAbortedException Message: {0}", ex.Message);
}
catch (ApplicationException ex)
{
Console.WriteLine("ApplicationException Message: {0}", ex.Message);
}
return nextId;
} //End GetNextId
If you do not use RequiresNew in your inner method, the inner method will automatically be rolled back if the parent fails to commit the transaction.
What error are you getting?
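To see why the inner method aborts along with the parent: with the default TransactionScopeOption.Required, the inner scope joins the outer ambient transaction instead of starting its own, and once connections to two different servers enlist in that single transaction it escalates to a distributed (MSDTC) transaction, which produces the error in the question's edit when MSDTC network access is disabled. The join behavior can be verified without a database:

```csharp
using System;
using System.Transactions;

bool innerJoinedOuter;
using (var outer = new TransactionScope())
{
    string outerId = Transaction.Current.TransactionInformation.LocalIdentifier;
    using (var inner = new TransactionScope()) // default: Required
    {
        // Required reuses the ambient transaction instead of creating one
        innerJoinedOuter =
            Transaction.Current.TransactionInformation.LocalIdentifier == outerId;
        inner.Complete();
    }
    outer.Complete();
}
Console.WriteLine(innerJoinedOuter); // True
```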