So I am using a BlockingCollection to store a custom class, Node, which holds a database connection string and a query to run over that connection. The collection is fully populated before it is consumed. However, if I pull a Node from the collection and try to run it, it may fail, and I would like to re-add it to the collection to be rerun later.
I have two working solutions, but I don't like either of them, and I was hoping someone could give me a cleaner solution or some ideas on how to improve this.
1st:
Parallel.ForEach on the collection; anything that fails gets added to a new blocking collection, which is then processed recursively.
Parallel.ForEach(NodeList, node => {
    try {
        using (NpgsqlConnection conn = new NpgsqlConnection(node.ConnectionString)) {
            conn.Open();
            using (NpgsqlCommand npgQuery = new NpgsqlCommand(node.Query, conn)) {
                using (NpgsqlDataReader reader = npgQuery.ExecuteReader()) {
                    while (reader.Read()) {
                        //Do stuff
                    }
                }
            }
        }
    } catch (Exception e){
        retryNodes.Add(node);
    }
});
retryNodes.CompleteAdding();
NodeList = retryNodes.ToList<Node>();
try {
    ExecuteNodes();
} catch (Exception e) {
    throw e;
}
I don't like this because, as the original collection nears the end, threads sit idle waiting for the retry collection to be started.
2nd:
Manually start tasks for each item in the collection:
int totalToFinish = NodeList.Count;
while (totalToFinish > 0) {
    while (threadsRunning < MaxAllowedThreads && totalToFinish > 0) {
        Interlocked.Increment(ref threadsRunning);
        Task.Factory.StartNew(() => {
            if (NodeList.Count == 0) {
                Interlocked.Decrement(ref threadsRunning);
                return;
            }
            Node node;
            NodeList.TryTake(out node, 1000);
            if (node.Attempts++ >= RetryAttempts) {
                throw new Exception("Failed after allowed attempts of: " + RetryAttempts);
            }
            try {
                using (NpgsqlConnection conn = new NpgsqlConnection(node.ConnectionString)) {
                    conn.Open();
                    using (NpgsqlCommand npgQuery = new NpgsqlCommand(node.Query, conn)) {
                        using (NpgsqlDataReader reader = npgQuery.ExecuteReader()) {
                            while (reader.Read()) {
                                //Do stuff
                            }
                        }
                    }
                    Interlocked.Decrement(ref totalToFinish);
                }
            } catch (Exception e) {
                NodeList.Add(node);
            }
            Interlocked.Decrement(ref threadsRunning);
        });
    }
}
This way performs a lot better, but it carries a lot of overhead and I feel like it's not a good way to do it.
If anyone could help me with this it would be greatly appreciated.
Thanks.
I see that you're setting a limit to the retry count in the second algorithm - you can simplify the first algorithm by including this retry loop
Parallel.ForEach(NodeList, node => {
    while (true) {
        try {
            using (NpgsqlConnection conn = new NpgsqlConnection(node.ConnectionString)) {
                conn.Open();
                using (NpgsqlCommand npgQuery = new NpgsqlCommand(node.Query, conn)) {
                    using (NpgsqlDataReader reader = npgQuery.ExecuteReader()) {
                        while (reader.Read()) {
                            //Do stuff
                        }
                    }
                }
            }
            break; // success - break out of the retry loop
        } catch (Exception e){
            node.Attempts++;
            if (node.Attempts >= RetryAttempts) {
                throw new Exception("Too many retries");
            }
        }
    }
});
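If you also want a pause between attempts (some of the breathing room your second approach gets from re-queuing), you can sleep inside the catch before looping again. A minimal sketch of that variation; RunNode is a hypothetical method holding the Npgsql work from the snippet above, RetryDelay is a hypothetical TimeSpan, and Thread.Sleep needs System.Threading:
Parallel.ForEach(NodeList, node => {
    while (true) {
        try {
            RunNode(node); // hypothetical: the connection/command/reader work factored out
            break;         // success - stop retrying this node
        } catch (Exception) {
            node.Attempts++;
            if (node.Attempts >= RetryAttempts) {
                throw new Exception("Too many retries");
            }
            Thread.Sleep(RetryDelay); // hypothetical delay, e.g. TimeSpan.FromSeconds(1)
        }
    }
});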
So I know this question has been asked here before, but the situation here is a little different.
I have a service application that spawns worker threads. The main service thread is organized like so:
public void PollCrunchFilesTask()
{
    try
    {
        var stuckDeletedServTableFiles = MaintenanceDbContext.stuckDeletedServTableFiles;
        var stuckErrorStatusFiles = MaintenanceDbContext.stuckErrorStatusFiles;
        while (_signalPollAutoEvent.WaitOne())
        {
            try
            {
                Poll();
                lock (stuckDelLock)
                {
                    if (stuckDeletedServTableFiles.Count > 0)
                        MaintenanceDbContext.DeleteFilesToBeDeletedInServiceTable(stuckDeletedServTableFiles);
                }
                lock (errorStatusLock)
                {
                    if (stuckErrorStatusFiles.Count > 0)
                        MaintenanceDbContext.UpdateStuckErrorServiceLogEntries(stuckErrorStatusFiles);
                }
            }
            catch (Exception ex)
            {
            }
        }
    }
    catch (Exception ex)
    {
    }
}
Inside Poll you have this logic:
public void Poll()
{
    try
    {
        if (ProducerConsumerQueue.Count() == 0 && ThreadCount_Diff_ActiveTasks > 0)
        {
            var dequeuedItems = MetadataDbContext.UpdateOdfsServiceEntriesForProcessingOnPollInterval(ThreadCount_Diff_ActiveTasks);
            var handlers = Producer.GetParserHandlers(dequeuedItems);
            foreach (var handler in handlers)
            {
                ProducerConsumerQueue.EnqueueTask(handler.Execute, CancellationTokenSource.Token);
            }
        }
    }
    catch (Exception ex)
    {
    }
}
That ProducerConsumerQueue.EnqueueTask(handler.Execute, CancellationTokenSource.Token); call launches 4 worker threads, and inside any one of those threads the following function can be called at any time:
public static int DeleteServiceEntry(string logFileName)
{
    int rowsAffected = 0;
    var stuckDeletedServTableFiles = MaintenanceDbContext.stuckDeletedServTableFiles;
    try
    {
        string connectionString = GetConnectionString();
        throw new Exception($"Testing Del HashSet");
        using (SqlConnection connection = new SqlConnection())
        {
            //Attempt some query
        }
    }
    catch (Exception ex)
    {
        lock (stuckDelLock)
        {
            stuckDeletedServTableFiles.Add(logFileName);
        }
    }
    return rowsAffected;
}
Now I am testing the stuckDeletedServTableFiles HashSet, which is only added to when there is an exception during a query; this is why I purposely throw an exception. That HashSet is the one operated on by the main service thread in the function DeleteFilesToBeDeletedInServiceTable(), an excerpt of which is shown below:
public static int DeleteFilesToBeDeletedInServiceTable(HashSet<string> stuckDeletedServTableFiles)
{
    int rowsAffected = 0;
    string logname = String.Empty; //used to collect error log
    var removedHashset = new HashSet<string>();
    try
    {
        var dbConnString = MetadataDbContext.GetConnectionString();
        string serviceTable = Constants.SERVICE_LOG_TBL;
        using (SqlConnection connection = new SqlConnection(dbConnString))
        {
            SqlCommand cmd = new SqlCommand();
            cmd.CommandType = CommandType.Text;
            cmd.CommandText = $"DELETE FROM {serviceTable} WHERE LOGNAME = @LOGNAME";
            cmd.Parameters.Add("@LOGNAME", SqlDbType.NVarChar);
            cmd.Connection = connection;
            connection.Open();
            foreach (var logFname in stuckDeletedServTableFiles)
            {
                cmd.Parameters["@LOGNAME"].Value = logFname;
                logname = logFname;
                int currRowsAffected = cmd.ExecuteNonQuery();
                rowsAffected += currRowsAffected;
                if (currRowsAffected == 1)
                {
                    removedHashset.Add(logFname);
                    Logger.Info($"Removed Stuck {logFname} Marked for Deletion from {serviceTable}");
                }
            }
            Logger.Info($"Removed {rowsAffected} stuck files Marked for Deletion from {serviceTable}");
        }
        stuckDeletedServTableFiles.ExceptWith(removedHashset);
    }
    catch (Exception ex)
    {
    }
    return rowsAffected;
}
Given that the HashSet stuckDeletedServTableFiles can be accessed by multiple threads at the same time, including the main service thread, I take a lock on the main service thread before it is operated on by DeleteFilesToBeDeletedInServiceTable(), and again in the function DeleteServiceEntry(). I am new to C#, but I assumed this would be enough. I assumed that, since a lock is taken on the main service thread around the call to DeleteFilesToBeDeletedInServiceTable(), that lock would prevent anything else from using the HashSet while the function operates on it. Why am I getting this error?
Note that I am not modifying the HashSet inside the for loop; I only do that after the loop is done, yet I get the error while I loop through the HashSet. I suppose that is because another thread is attempting to modify it. The question then is: why is a thread able to modify the HashSet when I hold a lock around the call at the service level?
For now, I fixed it by surrounding the for loop in DeleteFilesToBeDeletedInServiceTable() with a lock and also taking the same lock around the statement stuckDeletedServTableFiles.ExceptWith(removedHashset); inside that function. I am not sure what this costs, but it seems to work, and given how infrequently this issue occurs in practice, I suppose it won't cost much, especially since whenever that function is called we are not doing intensive work: the file crunching will have finished by then and the main caller is on another thread.
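The rule of thumb is that a lock only helps if every reader and writer of the shared HashSet takes the same lock object, including the code that enumerates it. A minimal sketch of that idea, using hypothetical names (_stuckFilesLock guarding the set) rather than your actual field layout:
using System.Collections.Generic;
using System.Linq;

// Every access to the shared HashSet goes through the same lock object.
private static readonly object _stuckFilesLock = new object();
private static readonly HashSet<string> stuckDeletedServTableFiles = new HashSet<string>();

public static void AddStuckFile(string logFileName)
{
    lock (_stuckFilesLock)
    {
        stuckDeletedServTableFiles.Add(logFileName);
    }
}

public static List<string> SnapshotStuckFiles()
{
    lock (_stuckFilesLock)
    {
        // Copy under the lock so the caller can enumerate (and run slow SQL)
        // without holding the lock or racing concurrent Add() calls.
        return stuckDeletedServTableFiles.ToList();
    }
}
DeleteFilesToBeDeletedInServiceTable() would then iterate the snapshot and remove the successfully deleted names under the lock afterwards; alternatively, a concurrent collection such as ConcurrentDictionary<string, byte> avoids the explicit lock entirely.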
I am trying to build a queue to send data to an API after the API gives a sign of life.
I get a System.InvalidOperationException in the following code:
private void sendHandler()
{
    while (true)
    {
        if (!sendQueueActive && sendQueue.Count >= 1)
        {
            sendQueueActive = true;
            foreach (relays relays in sendQueue)
            {
                dynamic result = IoLogikApiConnector.put("io/relay", relays);
                int code = result.error.code;
                if (code != 0)
                {
                    _log.logErrorToApi("Cannot write to IoLogik", "Error code:" + result, _deviceID);
                    _device.logErrorToApi();
                    sendQueue.Remove(relays);
                }
                else
                {
                    _device.logConnectedToApi();
                    sendQueue.Remove(relays);
                }
                sendQueueActive = false;
            }
        }
        else
        {
            Thread.Sleep(20);
        }
    }
}
You are removing items from the queue whilst using a foreach. Never a good thing.
Better to write
using System.Linq;
using System.Collections.Generic;
using System.Collections;

private void sendHandler()
{
    while (true)
    {
        if (!sendQueueActive && sendQueue.Count >= 1)
        {
            sendQueueActive = true;
            // MAKE A COPY FIRST
            var sendQueueCopy = sendQueue.ToList();
            foreach (relays relays in sendQueueCopy)
            {
                dynamic result = IoLogikApiConnector.put("io/relay", relays);
                int code = result.error.code;
                if (code != 0)
                {
                    _log.logErrorToApi("Cannot write to IoLogik", "Error code:" + result, _deviceID);
                    _device.logErrorToApi();
                    sendQueue.Remove(relays);
                }
                else
                {
                    _device.logConnectedToApi();
                    sendQueue.Remove(relays);
                }
                sendQueueActive = false;
            }
        }
        else
        {
            Thread.Sleep(20);
        }
    }
}
But even better, use a thread-safe queue.
https://msdn.microsoft.com/en-us/library/dd997371(v=vs.110).aspx
Here's the cut and paste example from the above link
// A bounded collection. It can hold no more
// than 100 items at once.
BlockingCollection<Data> dataItems = new BlockingCollection<Data>(100);

// A simple blocking consumer with no cancellation.
Task.Run(() =>
{
    while (!dataItems.IsCompleted)
    {
        Data data = null;
        // Blocks if number.Count == 0
        // IOE means that Take() was called on a completed collection.
        // Some other thread can call CompleteAdding after we pass the
        // IsCompleted check but before we call Take.
        // In this example, we can simply catch the exception since the
        // loop will break on the next iteration.
        try
        {
            data = dataItems.Take();
        }
        catch (InvalidOperationException) { }
        if (data != null)
        {
            Process(data);
        }
    }
    Console.WriteLine("\r\nNo more items to take.");
});

// A simple blocking producer with no cancellation.
Task.Run(() =>
{
    while (moreItemsToAdd)
    {
        Data data = GetData();
        // Blocks if numbers.Count == dataItems.BoundedCapacity
        dataItems.Add(data);
    }
    // Let consumer know we are done.
    dataItems.CompleteAdding();
});
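Applied to your sendHandler, a ConcurrentQueue<T> lets you drop both the sendQueueActive flag and the Remove calls. This is only a rough sketch, assuming your existing relays type, IoLogikApiConnector, _log, _device and _deviceID members, and that producers switch from Add to Enqueue:
using System.Collections.Concurrent;
using System.Threading;

private readonly ConcurrentQueue<relays> sendQueue = new ConcurrentQueue<relays>();

private void sendHandler()
{
    while (true)
    {
        // TryDequeue is thread-safe, so producers can keep calling
        // sendQueue.Enqueue(item) while this loop drains the queue.
        if (sendQueue.TryDequeue(out relays item))
        {
            dynamic result = IoLogikApiConnector.put("io/relay", item);
            int code = result.error.code;
            if (code != 0)
            {
                _log.logErrorToApi("Cannot write to IoLogik", "Error code:" + result, _deviceID);
                _device.logErrorToApi();
            }
            else
            {
                _device.logConnectedToApi();
            }
        }
        else
        {
            Thread.Sleep(20); // nothing queued, back off briefly
        }
    }
}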
Is it possible to run the following method 3 times (specifically the code within the try) before throwing an error message (i.e. fail-retry, fail-retry, fail-retry, throw error message), with a break of 1 second between each attempt?
public static int CheckBidDuplicate(int plotId, int operatorId)
{
    const string query = "SELECT Count(*) FROM bid WHERE plot_id=@plot_id AND operator_id=@operator_id GROUP BY plot_id;";
    using (var cmd = new MySqlCommand(query, DbConnect.Connection))
    {
        cmd.Parameters.AddWithValue("@plot_id", plotId);
        cmd.Parameters.AddWithValue("@operator_id", operatorId);
        try
        {
            var da = new MySqlDataAdapter(cmd);
            var dtCounts = new DataTable();
            da.Fill(dtCounts);
            var count = dtCounts.Rows.Count;
            return count;
        }
        catch (Exception ex)
        {
            ErrorHandlingComponent.LogError(ex.ToString());
            throw;
        }
    }
}
This pattern of execution is reasonably common. I would say in most cases, people implement a helper type to encapsulate the redundant part. For example:
static class Retry
{
    public static T Invoke<T>(Func<T> func, int tryCount, TimeSpan tryInterval)
    {
        if (tryCount < 1)
        {
            throw new ArgumentOutOfRangeException("tryCount");
        }
        while (true)
        {
            try
            {
                return func();
            }
            catch (Exception e)
            {
                if (--tryCount > 0)
                {
                    Thread.Sleep(tryInterval);
                    continue;
                }
                ErrorHandlingComponent.LogError(e.ToString());
                throw;
            }
        }
    }
}
Used like this:
int result = Retry.Invoke(
() => CheckBidDuplicate(plotId, operatorId), 3, TimeSpan.FromSeconds(1));
In my experience, you may wind up wanting additional features: customization of logging, customization of exception handling (e.g. don't bother retrying on certain exceptions known to be terminally fatal), default try-count and try-interval values, that sort of thing.
But the above should be a good starting point.
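As one illustration of that kind of customization (only a sketch, not part of the answer above), an overload that takes a predicate lets callers skip retries for exceptions they consider fatal:
// Added alongside the 3-parameter Invoke in the same Retry class.
public static T Invoke<T>(Func<T> func, int tryCount, TimeSpan tryInterval,
    Func<Exception, bool> shouldRetry)
{
    if (tryCount < 1)
    {
        throw new ArgumentOutOfRangeException("tryCount");
    }
    while (true)
    {
        try
        {
            return func();
        }
        catch (Exception e)
        {
            // Give up immediately on exceptions the caller deems not worth retrying.
            if (!shouldRetry(e) || --tryCount <= 0)
            {
                ErrorHandlingComponent.LogError(e.ToString());
                throw;
            }
            Thread.Sleep(tryInterval);
        }
    }
}
Called, for example, as Retry.Invoke(() => CheckBidDuplicate(plotId, operatorId), 3, TimeSpan.FromSeconds(1), e => e is MySqlException), so that a programming bug fails immediately instead of being retried three times.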
I have some tasks (nWorkers = 3):
var taskFactory = new TaskFactory(cancellationTokenSource.Token,
    TaskCreationOptions.LongRunning, TaskContinuationOptions.LongRunning,
    TaskScheduler.Default);

for (int i = 0; i < nWorkers; i++)
{
    var task = taskFactory.StartNew(() => this.WorkerMethod(parserItems,
        cancellationTokenSource));
    tasks[i] = task;
}
And the following method called by the tasks:
protected override void WorkerMethod(BlockingCollection<ParserItem> parserItems,
    CancellationTokenSource cancellationTokenSource)
{
    //...log-1...
    using (var connection = new OracleConnection(connectionString))
    {
        OracleTransaction transaction = null;
        try
        {
            cancellationTokenSource.Token.ThrowIfCancellationRequested();
            connection.Open();
            //...log-2...
            transaction = connection.BeginTransaction();
            //...log-3...
            using (var cmd = connection.CreateCommand())
            {
                foreach (var parserItem in parserItems.GetConsumingEnumerable(
                    cancellationTokenSource.Token))
                {
                    cancellationTokenSource.Token.ThrowIfCancellationRequested();
                    try
                    {
                        foreach (var statement in this.ProcessRecord(parserItem))
                        {
                            cmd.CommandText = statement;
                            try
                            {
                                cmd.ExecuteNonQuery();
                            }
                            catch (OracleException ex)
                            {
                                //...log-4...
                                if (!this.acceptedErrorCodes.Contains(ex.Number))
                                {
                                    throw;
                                }
                            }
                        }
                    }
                    catch (FormatException ex)
                    {
                        log.Warn(ex.Message);
                    }
                }
                if (!cancellationTokenSource.Token.IsCancellationRequested)
                {
                    transaction.Commit();
                }
                else
                {
                    throw new Exception("DBComponent has been canceled");
                }
            }
        }
        catch (Exception ex)
        {
            //...log-5...
            cancellationTokenSource.Cancel();
            if (transaction != null)
            {
                try
                {
                    transaction.Rollback();
                    //...log-6...
                }
                catch (Exception rollbackException)
                {
                    //...log-7...
                }
            }
            throw;
        }
        finally
        {
            if (transaction != null)
            {
                transaction.Dispose();
            }
            connection.Close();
            //...log-8...
        }
    }
    //...log-9...
}
There is a producer of ParserItem objects and these are the consumers. Normally it works fine; sometimes there is an Oracle connection timeout, but in those cases I can see the exception message and everything works as designed.
But sometimes the process gets stuck. When it gets stuck, in the log file I can see the log-1 message and then (more or less 15 seconds later) the log-8 message, but what is driving me nuts is that I can see neither the exception message log-5 nor the log-9 message.
Since the cancellationTokenSource.Cancel() method is never called, the producer of items for the bounded collection is stuck until a timeout two hours later.
It is compiled for NET Framework 4 and I'm using Oracle.ManagedDataAccess libraries for the Oracle connection.
Any help would be greatly appreciated.
You don't need to dispose a transaction or connection explicitly when it is already wrapped in a using scope. Second, you should rarely rely on an exception-based programming style. Your code rewritten below:
using (var connection = new OracleConnection(connectionString))
{
    connection.Open();
    //...log-2...
    using (var transaction = connection.BeginTransaction())
    {
        using (var cmd = connection.CreateCommand())
        {
            foreach (var parserItem in parserItems.GetConsumingEnumerable(cancellationTokenSource.Token))
            {
                if (!cancellationTokenSource.IsCancellationRequested)
                {
                    try
                    {
                        foreach (var statement in ProcessRecord(parserItem))
                        {
                            cmd.CommandText = statement;
                            try
                            {
                                cmd.ExecuteNonQuery();
                            }
                            catch (OracleException ex)
                            {
                                //...log-4...
                                if (!acceptedErrorCodes.Contains(ex.Number))
                                {
                                    log.Warn(ex.Message);
                                }
                            }
                        }
                    }
                    catch (FormatException ex)
                    {
                        log.Warn(ex.Message);
                    }
                }
            }
            if (!cancellationTokenSource.IsCancellationRequested)
            {
                transaction.Commit();
            }
            else
            {
                transaction.Rollback();
                throw new Exception("DBComponent has been canceled");
            }
        }
    }
}
//...log-9...
Let me know if this helps.
I can confirm everything you're saying. (program stuck, low CPU usage, oracle connection timeouts, etc.)
One workaround is to use Threads instead of Tasks.
UPDATE: after careful investigation I found out that when you use a high number of Tasks, the ThreadPool worker threads queued by the Oracle driver become slow to start, which ends up causing a (fake) connect timeout.
A couple of solutions for this:
Solution 1: Increase the ThreadPool's minimum number of threads, e.g.:
ThreadPool.SetMinThreads(50, 50); // YMMV
OR
Solution 2: Configure your connection to use pooling and set its minimum size appropriately.
var ocsb = new OracleConnectionStringBuilder();
ocsb.DataSource = "myserver"; // placeholder: your TNS name or EZConnect string
ocsb.UserID = "myuser";
ocsb.Password = "secret";
ocsb.Pooling = true;
ocsb.MinPoolSize = 20; // YMMV
IMPORTANT: before calling any routine that creates a high number of tasks, open (and close) a single connection using that connection string; this will "warm up" the pool:
using (var oc = new OracleConnection(ocsb.ToString()))
{
    oc.Open();
    oc.Close();
}
Note: Oracle indexes the connection pools by the connect string (with the password removed), so if you want to open additional connections you must always use the exact same connect string.
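For example (just a sketch following on from the builder above, with the same placeholder credentials), build the connect string once and hand the identical string to every worker:
using Oracle.ManagedDataAccess.Client;

static class Db
{
    // Built once; every OracleConnection then draws from the same warmed-up
    // pool because the connect string is byte-for-byte identical.
    public static readonly string ConnectString = BuildConnectString();

    private static string BuildConnectString()
    {
        var ocsb = new OracleConnectionStringBuilder
        {
            DataSource = "myserver", // placeholder, as above
            UserID = "myuser",
            Password = "secret",
            Pooling = true,
            MinPoolSize = 20
        };
        return ocsb.ToString();
    }
}

// Workers then do: using (var connection = new OracleConnection(Db.ConnectString)) { ... }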
I have an Access DB, and one of my functions (C#.NET) needs to execute a SQL statement more than 4000 times inside a transaction.
It seems that after execution the DB file stays opened exclusively, because there is a *.ldb file, and that file stays there for a long time.
Is that caused by disposing resources incorrectly?
private int AmendUniqueData(Trans trn)
{
    int reslt = 0;
    foreach (DataRow dr in _dt.Rows)
    {
        OleDbParameter[] _params = {
            new OleDbParameter("@templateId", dr["Id"].ToString()),
            new OleDbParameter("@templateNumber", dr["templateNumber"].ToString())
        };
        string sqlUpdateUnique = "UPDATE " + dr["proformaNo"].ToString().Substring(0,2) + "_unique SET templateId = @templateId WHERE templateNumber=@templateNumber";
        reslt = OleDBHelper.ExecSqlWithTran(sqlUpdateUnique, trn, _params);
        if (reslt < 0)
        {
            throw new Exception(dr["id"].ToString());
        }
    }
    return reslt;
}
the transaction:
using (Trans trn = new Trans())
{
    try
    {
        int reslt = AmendUniqueData(trn);
        trn.Commit();
        return reslt;
    }
    catch
    {
        trn.RollBack();
        throw;
    }
    finally
    {
        trn.Close();
    }
}
It looks like you forgot to close the database connection.
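The OleDbConnection behind the helper has to be closed (or disposed) once the transaction is finished; until then the Jet engine keeps the .ldb lock file alive. A rough sketch of what that could look like, assuming a hypothetical OleDBHelper.Connection property since the helper's internals aren't shown:
using (Trans trn = new Trans())
{
    try
    {
        int reslt = AmendUniqueData(trn);
        trn.Commit();
        return reslt;
    }
    catch
    {
        trn.RollBack();
        throw;
    }
    finally
    {
        trn.Close();
        // Hypothetical: however the helper exposes its connection,
        // make sure it is closed so Access releases the .ldb file.
        OleDBHelper.Connection.Close();
    }
}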