DropQueue mechanism for Rx.NET - C#

I came across a back pressure issue with Rx.NET that I can't find a solution for. I have an observable real-time stream of log messages.
var logObservable = /* Observable stream of log messages */
I want to expose it via a TCP interface that serializes the real-time log messages from the logObservable before they are sent over the wire. So I do the following:
foreach (var message in logObservable.ToEnumerable())
{
    // 1. Serialize message
    // 2. Send it over the wire.
}
The problem arises with .ToEnumerable() if a back pressure scenario happens, e.g. if the client on the other end pauses the stream: .ToEnumerable() caches the items, which results in a lot of memory usage. I'm looking for a mechanism, something like a DropQueue, which only buffers, let's say, the last 10 messages, e.g.
var observableStream = logObservable.DropQueue(10).ToEnumerable();
Is this the right way to solve this issue? And do you know how to implement such a mechanism to avoid a possible back pressure issue?

My DropQueue implementation:
public static IEnumerable<TSource> ToDropQueue<TSource>(
    this IObservable<TSource> source,
    int queueSize,
    Action backPressureNotification = null,
    CancellationToken token = default(CancellationToken))
{
    var queue = new BlockingCollection<TSource>(new ConcurrentQueue<TSource>(), queueSize);
    var isBackPressureNotified = false;
    var subscription = source.Subscribe(
        item =>
        {
            var isBackPressure = queue.Count == queue.BoundedCapacity;
            if (isBackPressure)
            {
                queue.Take(); // Dequeue an item to make space for the next one
                // Fire back-pressure notification if defined
                if (!isBackPressureNotified && backPressureNotification != null)
                {
                    backPressureNotification();
                    isBackPressureNotified = true;
                }
            }
            else
            {
                isBackPressureNotified = false;
            }
            queue.Add(item);
        },
        exception => queue.CompleteAdding(),
        () => queue.CompleteAdding());
    token.Register(() => { subscription.Dispose(); });
    using (new CompositeDisposable(subscription, queue))
    {
        foreach (var item in queue.GetConsumingEnumerable())
        {
            yield return item;
        }
    }
}
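For completeness, here is how the TCP send loop from the question would look on top of this extension. This is only a sketch: SerializeMessage and SendOverWire are hypothetical placeholders for whatever serialization and transport code you already have.
// SerializeMessage and SendOverWire are assumed/hypothetical helpers.
foreach (var message in logObservable.ToDropQueue(10, () => Console.WriteLine("Back pressure: oldest message dropped")))
{
    var bytes = SerializeMessage(message); // 1. Serialize message
    SendOverWire(bytes);                   // 2. Send it over the wire.
}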

Related

Hangfire in single application on multiple physical servers

I am running Hangfire in a single web application; my application runs on 2 physical servers, but Hangfire uses 1 database.
At the moment, I am creating a Hangfire server for each queue, because each queue needs to run 1 worker at a time and its jobs must run in order. I set them up like this:
// core
services.AddHangfire(options =>
{
    options.SetDataCompatibilityLevel(CompatibilityLevel.Version_170);
    options.UseSimpleAssemblyNameTypeSerializer();
    options.UseRecommendedSerializerSettings();
    options.UseSqlServerStorage(appSettings.Data.DefaultConnection.ConnectionString, storageOptions);
});
// add multiple servers, this way we get to control how many workers are in each queue
services.AddHangfireServer(options =>
{
    options.ServerName = "workflow-queue";
    options.WorkerCount = 1;
    options.Queues = new string[] { "workflow-queue" };
    options.SchedulePollingInterval = TimeSpan.FromSeconds(10);
});
services.AddHangfireServer(options =>
{
    options.ServerName = "alert-schedule";
    options.WorkerCount = 1;
    options.Queues = new string[] { "alert-schedule" };
    options.SchedulePollingInterval = TimeSpan.FromMinutes(1);
});
services.AddHangfireServer(options =>
{
    options.ServerName = "trigger-schedule";
    options.WorkerCount = 1;
    options.Queues = new string[] { "trigger-schedule" };
    options.SchedulePollingInterval = TimeSpan.FromMinutes(1);
});
services.AddHangfireServer(options =>
{
    options.ServerName = "report-schedule";
    options.WorkerCount = 1;
    options.Queues = new string[] { "report-schedule" };
    options.SchedulePollingInterval = TimeSpan.FromMinutes(1);
});
services.AddHangfireServer(options =>
{
    options.ServerName = "maintenance";
    options.WorkerCount = 5;
    options.Queues = new string[] { "maintenance" };
    options.SchedulePollingInterval = TimeSpan.FromMinutes(10);
});
My problem is that it is generating multiple queues on the servers, with different ports.
In my code I am then trying to stop jobs from running if they are queued/retrying, but if the job is being run on a different physical server, it is not found and is queued again.
Here is the code to check if it's running already:
public async Task<bool> IsAlreadyQueuedAsync(PerformContext context)
{
    var disableJob = false;
    var monitoringApi = JobStorage.Current.GetMonitoringApi();
    // get the jobId, method and queue using performContext
    var jobId = context.BackgroundJob.Id;
    var methodInfo = context.BackgroundJob.Job.Method;
    var queueAttribute = (QueueAttribute)Attribute.GetCustomAttribute(context.BackgroundJob.Job.Method, typeof(QueueAttribute));
    // enqueuedJobs
    var enqueuedjobStatesToCheck = new[] { "Processing" };
    var enqueuedJobs = monitoringApi.EnqueuedJobs(queueAttribute.Queue, 0, 1000);
    var enqueuedJobsAlready = enqueuedJobs.Count(e => e.Key != jobId && e.Value != null && e.Value.Job != null && e.Value.Job.Method.Equals(methodInfo) && enqueuedjobStatesToCheck.Contains(e.Value.State));
    if (enqueuedJobsAlready > 0)
        disableJob = true;
    // scheduledJobs
    if (!disableJob)
    {
        // check if there are any scheduledJobs that are processing
        var scheduledJobs = monitoringApi.ScheduledJobs(0, 1000);
        var scheduledJobsAlready = scheduledJobs.Count(e => e.Key != jobId && e.Value != null && e.Value.Job != null && e.Value.Job.Method.Equals(methodInfo));
        if (scheduledJobsAlready > 0)
            disableJob = true;
    }
    // failedJobs
    if (!disableJob)
    {
        var failedJobs = monitoringApi.FailedJobs(0, 1000);
        var failedJobsAlready = failedJobs.Count(e => e.Key != jobId && e.Value != null && e.Value.Job != null && e.Value.Job.Method.Equals(methodInfo));
        if (failedJobsAlready > 0)
            disableJob = true;
    }
    // if disableJob is true, remove the current job; otherwise it would write a "successful" message in the logs
    if (disableJob)
    {
        // use hangfire delete, for cleanup
        BackgroundJob.Delete(jobId);
        // create our sqlBuilder to remove the entries altogether including the count
        var sqlBuilder = new SqlBuilder()
            .DELETE_FROM("Hangfire.[Job]")
            .WHERE("[Id] = {0};", jobId);
        sqlBuilder.Append("DELETE TOP(1) FROM Hangfire.[Counter] WHERE [Key] = 'stats:deleted' AND [Value] = 1;");
        using (var cmd = _context.CreateCommand(sqlBuilder))
            await cmd.ExecuteNonQueryAsync();
        return true;
    }
    return false;
}
Each method has something like the following attributes as well
public interface IAlertScheduleService
{
    [Hangfire.Queue("alert-schedule")]
    [Hangfire.DisableConcurrentExecution(60 * 60 * 5)]
    Task RunAllAsync(PerformContext context);
}
Simple implementation of the interface
public class AlertScheduleService : IAlertScheduleService
{
    public async Task RunAllAsync(PerformContext context)
    {
        if (await IsAlreadyQueuedAsync(context))
            return;
        // guess it isn't queued, so run it here....
    }
}
Here is how I am adding my scheduled jobs:
//// our recurring jobs
//// set these to run hourly, so they can play "catch-up" if needed
RecurringJob.AddOrUpdate<IAlertScheduleService>(e => e.RunAllAsync(null), Cron.Hourly(0), queue: "alert-schedule");
Why does this happen? How can I stop it from happening?
Somewhat of a blind shot: preventing a job from being queued if a job is already queued in the same queue.
The try-catch logic is quite ugly, but I have no better idea right now...
Also, I'm really not sure the lock logic always prevents having two jobs in EnqueuedState, but it should help anyway. Maybe mix it with an IApplyStateFilter.
public class DoNotQueueIfAlreadyQueued : IElectStateFilter
{
    public void OnStateElection(ElectStateContext context)
    {
        if (context.CandidateState is EnqueuedState)
        {
            EnqueuedState es = context.CandidateState as EnqueuedState;
            IDisposable distributedLock = null;
            try
            {
                // retry until the distributed lock is acquired
                while (distributedLock == null)
                {
                    try
                    {
                        distributedLock = context.Connection.AcquireDistributedLock($"{nameof(DoNotQueueIfAlreadyQueued)}-{es.Queue}", TimeSpan.FromSeconds(1));
                    }
                    catch { }
                }
                var m = context.Storage.GetMonitoringApi();
                if (m.EnqueuedCount(es.Queue) > 0)
                {
                    context.CandidateState = new DeletedState();
                }
            }
            finally
            {
                distributedLock?.Dispose(); // may still be null if acquisition never succeeded
            }
        }
    }
}
The filter can be declared as in this answer
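For reference, a global filter like this is typically registered once at application startup via Hangfire's standard GlobalJobFilters collection (a minimal sketch):
// Register the filter once at application startup,
// e.g. alongside the AddHangfire configuration shown above.
GlobalJobFilters.Filters.Add(new DoNotQueueIfAlreadyQueued());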
There seems to be a bug with your currently used hangfire storage implementation:
https://github.com/HangfireIO/Hangfire/issues/1025
The current options are:
Switching to HangFire.LiteDB, as commented here: https://github.com/HangfireIO/Hangfire/issues/1025#issuecomment-686433594
Implementing your own logic to enqueue a job, but this would take more effort.
Making your job execution idempotent to avoid side effects in case it's executed multiple times.
Whichever option you choose, you should still apply DisableConcurrentExecution and make your job execution idempotent as explained below, so I think you can just go with the last option:
Applying DisableConcurrentExecution is necessary, but it's not enough, because there are no reliable automatic failure detectors in distributed systems. That's the nature of distributed systems: we usually have to rely on timeouts to detect failures, and timeouts are not reliable.
Hangfire is designed to run with at-least-once execution semantics. Explained below:
One of your servers may be executing the job but be detected as failed for various reasons, for example because it does not send heartbeats in time due to a temporary network issue or temporary high load.
When the current processing server is assumed to have failed (even though it hasn't), the job will be scheduled to another server, which causes it to be executed more than once.
The solution is still to apply the DisableConcurrentExecution attribute as a best effort to prevent multiple executions of the same job, but the main thing is that you need to make the execution of the job idempotent, so that it does not cause side effects in case it's executed multiple times.
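To illustrate what idempotent means here, the job body can guard its side effects with a check against a record of completed runs. This is only a sketch: the _db context, the AlertRuns guard table, and DoActualWorkAsync are hypothetical stand-ins.
public async Task RunAllAsync(PerformContext context)
{
    // Hypothetical guard table with a unique constraint on JobId:
    // a second execution of the same job id becomes a harmless no-op.
    if (await _db.AlertRuns.AnyAsync(r => r.JobId == context.BackgroundJob.Id))
        return;
    await DoActualWorkAsync(); // hypothetical: the real side effects
    _db.AlertRuns.Add(new AlertRun { JobId = context.BackgroundJob.Id });
    await _db.SaveChangesAsync(); // the unique constraint rejects a concurrent duplicate
}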
Please refer to some quotes from https://docs.hangfire.io/en/latest/background-processing/throttling.html:
Throttlers apply only to different background jobs, and there’s no reliable way to prevent multiple executions of the same background job other than by using transactions in background job method itself. DisableConcurrentExecution may help a bit by narrowing the safety violation surface, but it heavily relies on an active connection, which may be broken (and lock is released) without any notification for our background job.

As there are no reliable automatic failure detectors in distributed systems, it is possible that the same job is being processed on different workers in some corner cases. Unlike OS-based mutexes, mutexes in this package don’t protect from this behavior so develop accordingly.

DisableConcurrentExecution filter may reduce the probability of violation of this safety property, but the only way to guarantee it is to use transactions or CAS-based operations in our background jobs to make them idempotent.
You can also refer to this, as Hangfire timeout behavior seems to be storage-dependent as well: https://github.com/HangfireIO/Hangfire/issues/1960#issuecomment-962884011

DataflowBlock ITargetSource.AsObservable() not triggering OnNext()

I'm trying to use a DataflowBlock and I need to spy on the items passing through for unit testing.
To do this, I'm using the AsObservable() method on the ISourceBlock<T> side of my TransformBlock<Tinput, T>,
so I can check after execution that each block of my pipeline has generated the expected values.
Pipeline
{
    ...
    var observer = new MyObserver<string>();
    _block = new TransformManyBlock<string, string>(MyHandler, options);
    _block.LinkTo(_nextBlock);
    _block.AsObservable().Subscribe(observer);
    _block.Post("Test");
    ...
}
MyObserver
public class MyObserver<T> : IObserver<T>
{
    public List<Exception> Errors = new List<Exception>();
    public bool IsComplete = false;
    public List<T> Values = new List<T>();
    public void OnCompleted()
    {
        IsComplete = true;
    }
    public void OnNext(T value)
    {
        Values.Add(value);
    }
    public void OnError(Exception e)
    {
        Errors.Add(e);
    }
}
So basically I subscribe my observer to the TransformBlock, and I expect each value passing through to get registered in my observer's Values list.
But, while IsComplete is set to true and the OnError() method successfully registers exceptions,
the OnNext() method never gets called unless it is the last block of the pipeline...
I can't figure out why, because the _nextBlock linked to this source block successfully receives the data, proving that data is exiting the block.
From what I understand, AsObservable is supposed to report every value exiting the block, not only the values that have not been consumed by other linked blocks...
What am I doing wrong?
Your messages are being consumed by _nextBlock before you get a chance to read them.
If you comment out the line _block.LinkTo(_nextBlock); it would likely work.
AsObservable's sole purpose is to allow a block to be consumed from Rx. It doesn't change the internal workings of the block to broadcast messages to multiple targets. You need a special block for that: BroadcastBlock.
I would suggest broadcasting to another block and using that to subscribe:
BroadcastBlock’s mission in life is to enable all targets linked from the block to get a copy of every element published
var options = new DataflowLinkOptions { PropagateCompletion = true };
var broadcastBlock = new BroadcastBlock<string>(x => x);
var bufferBlock = new BufferBlock<string>();
var actionBlock = new ActionBlock<string>(s => Console.WriteLine("Action " + s));
broadcastBlock.LinkTo(bufferBlock, options);
broadcastBlock.LinkTo(actionBlock, options);
bufferBlock.AsObservable().Subscribe(s => Console.WriteLine("peek " + s));
for (var i = 0; i < 5; i++)
    await broadcastBlock.SendAsync(i.ToString());
broadcastBlock.Complete();
await actionBlock.Completion;
Output
peek 0
Action 0
Action 1
Action 2
Action 3
Action 4
peek 1
peek 2
peek 3
peek 4

Process batch request callback asynchronous in .NET 4 and WebForms

In the example below you can see I'm executing a batch request on a button click. After that I need to use the information provided in the callback, but without freezing the WebForms page. I thought the callback would be asynchronous by itself, but obviously I'm wrong, because the page stays frozen until the callback is processed.
batchRequest.Queue<Google.Apis.Calendar.v3.Data.Event>(
    addRequest,
    (content, error, i, message) => // callback
    {
        using (dbContext)
        {
            Event eventToUpdate = dbContextNewInstance.Events.FirstOrDefault(x => x.Id == dbObj.Id);
            if (eventToUpdate != null)
            {
                eventToUpdate.GoogleCalendarMappingId = content.Id;
                dbContextNewInstance.SubmitChanges();
            }
        }
    });
batchRequest.ExecuteAsync();
UPDATE:
I have made this implementation and it worked! I am still worried, though: is everything done the correct way, with no thread or DB connection left unmanaged, guys?
batchRequest.Queue<Google.Apis.Calendar.v3.Data.Event>(
    addRequest,
    (content, error, i, message) => // callback
    {
        idsToMap[dbObj.Id] = content.Id; // A dictionary mapping my dbObj Id to the Id I receive from the remote API in the callback
    });
Thread batchThread = new Thread(() => SubmitBatchRequest(batchRequest, idsToMap, connectionString));
batchThread.Start();
And the method with the Thread:
private static void SubmitBatchRequest(BatchRequest batchRequest, Dictionary<Guid, string> ids, string connectionString)
{
    Thread.CurrentThread.IsBackground = true;
    batchRequest.ExecuteAsync().GetAwaiter().GetResult(); // Send the batch request and block this worker thread until it completes
    using (DataContext db = new DataContext(connectionString))
    {
        foreach (Guid dbObjId in ids.Keys)
        {
            Event eventToUpdate = db.Events.FirstOrDefault(x => x.Id == dbObjId);
            if (eventToUpdate != null)
            {
                eventToUpdate.GoogleCalendarMappingId = ids[dbObjId];
            }
        }
        // Thread.Sleep(50000);
        db.SubmitChanges();
    }
}
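One thing worth flagging about the update: a raw Thread is invisible to the ASP.NET runtime, so an app-pool recycle can tear it down mid-write. On .NET 4.5.2 or later, one alternative is to let ASP.NET track the work item instead (a sketch only, reusing the names from the code above):
// Sketch (requires .NET 4.5.2+): ASP.NET knows about this work item
// and will try to delay shutdown until it finishes.
HostingEnvironment.QueueBackgroundWorkItem(async cancellationToken =>
{
    await batchRequest.ExecuteAsync();
    using (DataContext db = new DataContext(connectionString))
    {
        foreach (Guid dbObjId in idsToMap.Keys)
        {
            Event eventToUpdate = db.Events.FirstOrDefault(x => x.Id == dbObjId);
            if (eventToUpdate != null)
                eventToUpdate.GoogleCalendarMappingId = idsToMap[dbObjId];
        }
        db.SubmitChanges();
    }
});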

BlockingCollection Consumer is repeating output

TL;DR I have an application that is reading messages from a USB device in the background, and displaying the messages on the screen. I am using a BlockingCollection, as I need to read messages quickly so the device does not get a BufferOverflow.
I am reading messages like this (my producer):
private void ReadMessages(BlockingCollection<object> logMessages)
{
    uint numMsgs;
    Status status;
    Message[] msgs = new Message[10];
    while (!logMessages.IsAddingCompleted)
    {
        numMsgs = (uint)msgs.Length;
        status = readMessages(channel, msgs, ref numMsgs, 1000);
        if (status == Status.ERR_BUFFER_OVERFLOW)
        {
            logMessages.Add("BUFFER OVERFLOW - MESSAGES LOST!");
            logMessages.Add(CopyMessages(msgs, numMsgs));
        }
        else if (status == Status.STATUS_NOERROR)
        {
            logMessages.Add(CopyMessages(msgs, numMsgs));
        }
        else
        {
            throw new Exception("Error");
        }
    }
}
The readMessages() method will fill the msgs array with the Message objects read, and the numMsgs reference holds the number of messages that were read (up to 10). I use a function called CopyMessages() so I only pass a Message[] that is the right size, i.e. if 5 messages are read, I send a Message[5] instead of a Message[10].
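(For reference, CopyMessages is not shown in the question; it is assumed to be a straightforward right-sizing helper along these lines:)
// Assumed helper: copy only the first count entries into a right-sized array.
private static Message[] CopyMessages(Message[] source, uint count)
{
    var copy = new Message[count];
    Array.Copy(source, copy, (int)count);
    return copy;
}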
I read the messages (my consumer) like this:
private void DisplayMessages(BlockingCollection<object> messages)
{
    string[] msgs;
    try
    {
        foreach (var item in messages.GetConsumingEnumerable(_cancellationTokenSource.Token))
        {
            if (item is string)
            {
                msgs = new string[] { item.ToString() };
            }
            else if (item is PassThruMsg[])
            {
                msgs = FormatMessages((PassThruMsg[])item);
            }
            else
            {
                msgs = new string[0];
            }
            Task.Factory.StartNew(new Action(() => outputTextBox.AppendText(String.Join(Environment.NewLine, msgs) + Environment.NewLine)), _cancellationTokenSource.Token, TaskCreationOptions.None, uiContext);
        }
    }
    catch (OperationCanceledException)
    {
        //TODO:
    }
}
I start the tasks inside a button click, like this:
var results = new BlockingCollection<object>();
var display = Task.Factory.StartNew(() => DisplayMessages(results));
var readMessages = Task.Factory.StartNew(() => ReadMessages(results));
Task[] tasks = new Task[] { display, readMessages };
try
{
    await Task.Factory.ContinueWhenAll(tasks, result => { results.CompleteAdding(); }, _cancellationTokenSource.Token, TaskContinuationOptions.None, uiContext);
}
catch (TaskCanceledException)
{
    //TODO:
}
This works fine, and when running idle it prints the messages from the device without a problem. However, once the device starts doing work under a really heavy load (the consumer is called so quickly that it temporarily locks the UI), I notice that the output textbox repeats values. It is my understanding that GetConsumingEnumerable() also removes items from the blocking collection, so I don't know why else I would see the messages printed multiple times. Each message has a timestamp, and reading messages from the device clears its buffer, so I know that I am not reading the same message multiple times.
Am I missing something here? Is there a better way to handle this producer/consumer scenario to ensure accurate data? I have looked to see if there are references somewhere that may be overlapping, but I don't see it.
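One thing worth checking, as a hedged guess from the code shown: msgs is declared once outside the foreach loop in DisplayMessages, and the lambda passed to Task.Factory.StartNew captures the variable, not its value at that iteration. Because the UI tasks run later on uiContext, a burst of items can all be rendered with whatever batch msgs was last reassigned to, which would look exactly like repeated output under heavy load. A minimal fix is to capture a per-iteration local:
// Inside the foreach loop: copy into a fresh local so each UI task
// sees the batch from its own iteration, not the latest assignment.
var localMsgs = msgs;
Task.Factory.StartNew(
    new Action(() => outputTextBox.AppendText(String.Join(Environment.NewLine, localMsgs) + Environment.NewLine)),
    _cancellationTokenSource.Token, TaskCreationOptions.None, uiContext);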

Queue to ConcurrentQueue

I have a regular Queue object in C# (4.0), and I'm using BackgroundWorkers that access this Queue.
The code I was using is as follows:
do
{
    while (dataQueue.Peek() == null   // nothing waiting yet
           && isBeingLoaded == true)  // and worker 1 still actively adding stuff
        System.Threading.Thread.Sleep(100);
    // otherwise ready to do something:
    if (dataQueue.Peek() != null) // because maybe the queue is complete and also empty
    {
        string companyId = dataQueue.Dequeue();
        processLists(companyId);
        // use up the stuff here //
    } // otherwise nothing was there yet, it will resolve on the next loop.
} while (isBeingLoaded == true         // still have stuff coming at us
         || dataQueue.Peek() != null); // still have stuff we haven’t done
However, I guess when dealing with threads I should be using a ConcurrentQueue.
I was wondering if there are examples of how to use a ConcurrentQueue in a do/while loop like the above?
Everything I tried with TryPeek wasn't working...
Any ideas?
You can use a BlockingCollection<T> as a producer-consumer queue.
My answer makes some assumptions about your architecture, but you can probably mold it as you see fit:
public void Producer(BlockingCollection<string> ids)
{
    // assuming this.CompanyRepository exists
    foreach (var id in this.CompanyRepository.GetIds())
    {
        ids.Add(id);
    }
    ids.CompleteAdding(); // nothing left for our workers
}
public void Consumer(BlockingCollection<string> ids)
{
    while (true)
    {
        string id = null;
        try
        {
            id = ids.Take();
        }
        catch (InvalidOperationException)
        {
        }
        if (id == null) break;
        processLists(id);
    }
}
You could spin up as many consumers as you need:
var companyIds = new BlockingCollection<string>();
Producer(companyIds);
Action process = () => Consumer(companyIds);
// 2 workers
Parallel.Invoke(process, process);
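If you would rather keep the do/while shape from the question with a plain ConcurrentQueue, the key difference is that TryPeek/TryDequeue return false on an empty queue instead of handing you a null. A sketch, reusing dataQueue and isBeingLoaded from the question:
// Sketch: poll with TryDequeue, which atomically tests and removes an item.
string companyId;
do
{
    while (dataQueue.TryDequeue(out companyId))
    {
        processLists(companyId); // use up the stuff here
    }
    System.Threading.Thread.Sleep(100); // nothing waiting yet
} while (isBeingLoaded || !dataQueue.IsEmpty);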
