I've looked into the documentation on scheduling with Azure Service Bus, but I am not clear on how exactly to send a scheduled message from a "disconnected" bus.
Here's how I've configured the service that processes messages on the server:
builder.AddMassTransit(mt =>
{
    mt.AddConsumers(cqrsAssembly);
    mt.AddBus(context => Bus.Factory.CreateUsingAzureServiceBus(x =>
    {
        x.RequiresSession = true;
        x.MaxConcurrentCalls = 500;
        x.MessageWaitTimeout = TimeSpan.FromMinutes(5);
        x.UseRenewLock(TimeSpan.FromMinutes(4));
        x.UseServiceBusMessageScheduler();

        var host = x.Host(serviceUri, h =>
        {
            h.SharedAccessSignature(s =>
            {
                s.KeyName = "key-name";
                s.SharedAccessKey = "access-key";
                s.TokenTimeToLive = TimeSpan.FromDays(1);
                s.TokenScope = TokenScope.Namespace;
            });
            h.OperationTimeout = TimeSpan.FromMinutes(2);
        });

        x.ReceiveEndpoint(host, "mt.myqueue", ep =>
        {
            ep.RequiresSession = true;
            ep.MaxConcurrentCalls = 500;
            ep.RemoveSubscriptions = true;

            ep.UseMessageRetry(r =>
            {
                r.Interval(4, TimeSpan.FromSeconds(30));
                r.Handle<TransientCommandException>();
            });

            ep.ConfigureConsumers(context);
        });
    }));
});
I've explicitly called UseServiceBusMessageScheduler().
In the project that creates and sends messages to the queue (it runs in a different context, so the bus is configured as "send only"), we have this:
var bus = Bus.Factory.CreateUsingAzureServiceBus(x =>
{
    x.RequiresSession = true;
    x.MessageWaitTimeout = TimeSpan.FromMinutes(5);
    x.UseRenewLock(TimeSpan.FromMinutes(4));
    x.Send<ICommand>(s => s.UseSessionIdFormatter(ctx => ctx.Message.SessionId ?? Guid.NewGuid().ToString()));

    var host = x.Host(serviceUri, h =>
    {
        h.SharedAccessSignature(s =>
        {
            s.KeyName = "key-name";
            s.SharedAccessKey = "key";
            s.TokenTimeToLive = TimeSpan.FromDays(1);
            s.TokenScope = TokenScope.Namespace;
        });
        h.OperationTimeout = TimeSpan.FromMinutes(2);
    });

    EndpointConvention.Map<ICommand>(new Uri($"{serviceUri.ToString()}mt.myqueue"));
    EndpointConvention.Map<Command>(new Uri($"{serviceUri.ToString()}mt.myqueue"));
});
Now, to send a scheduled message, we do this:
var dest = "what?";
await bus.ScheduleSend(dest, scheduledEnqueueTimeUtc.Value, message);
I am unsure of what needs to be passed into the destinationAddress.
I've tried:
- serviceUri
- $"{serviceUri}mt.myqueue"
But checking the queues, I don't see my message in the base queue, the skipped queue, or the error queue.
Am I missing some other configuration, and if not, how does one determine the destination queue?
I'm using version 5.5.4 of MassTransit, and every overload of ScheduleSend() requires a destination address.
First of all, yes, your Uri format is correct. In the end, after formatting, you need something like this:
new Uri(@"sb://yourdomain.servicebus.windows.net/yourapp/your_message_queue")
Also make sure you added the following when you configured your endpoint (see the link below):
configurator.UseServiceBusMessageScheduler();
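Putting the two together, the call from the send-only bus might look like the following sketch (assuming serviceUri is the namespace URI with a trailing slash, and mt.myqueue is the receive endpoint name used by the consuming service):

// Sketch only: build the full queue address and schedule against it.
var destination = new Uri($"{serviceUri}mt.myqueue");
await bus.ScheduleSend(destination, scheduledEnqueueTimeUtc.Value, message);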
If you follow the Mass-Transit documentation, scheduling is done from a ConsumeContext. See Mass-Transit Azure Scheduling
public class ScheduleNotificationConsumer :
    IConsumer<AssignSeat>
{
    Uri _schedulerAddress;
    Uri _notificationService;

    public async Task Consume(ConsumeContext<AssignSeat> context)
    {
        if (context.Message.ReservationTime - DateTime.Now < TimeSpan.FromHours(8))
        {
            // assign the seat for the reservation
        }
        else
        {
            // seats can only be assigned eight hours before the reservation
            await context.ScheduleMessage(context.Message.ReservationTime - TimeSpan.FromHours(8), context.Message);
        }
    }
}
However, in a use case we faced this week, we needed to schedule from outside a ConsumeContext, or simply didn't want to forward the context down to where we scheduled. When using IBusControl.ScheduleSend we get no error feedback, but we also don't actually get any scheduling done.
After looking at what Mass-Transit does, it turns out that from an IBusControl it creates a new scheduling provider, whereas from the ConsumeContext it uses the ServiceBusScheduleMessageProvider.
So what we're doing now, until we clean up this bit, is calling the ServiceBusScheduleMessageProvider directly.
await new ServiceBusScheduleMessageProvider(_busControl).ScheduleSend(destinationUri
    , scheduleDateTime.UtcDateTime
    , Task.FromResult<T>(message)
    , Pipe.Empty<SendContext>()
    , default);
Hope it makes sense and helps a bit.
Related
I have an extremely simple setup for sending messages to Kafka:
var producerConfig = new ProducerConfig
{
    BootstrapServers = "www.example.com",
    SecurityProtocol = SecurityProtocol.SaslSsl,
    SaslMechanism = SaslMechanism.ScramSha512,
    SaslUsername = _options.SaslUsername,
    SaslPassword = _options.SaslPassword,
    MessageTimeoutMs = 1
};

var producerBuilder = new ProducerBuilder<Null, string>(producerConfig);
using var producer = producerBuilder.Build();

producer.Produce("Some Topic", new Message<Null, string>()
{
    Timestamp = Timestamp.Default,
    Value = "hello"
});
Before, this code was working fine. Today it has decided to stop working and I'm trying to figure out why. I'm trying to get the Producer to throw an exception when failing to deliver a message, but it never seems to crash. Even when I fill in a wrong username and password, the producer still doesn't crash. Not even a logline in my local output window. How can I debug my Kafka connection when the producer never shows any problems?
You can add SetErrorHandler() to the ProducerBuilder. It would look like this:
var producerBuilder = new ProducerBuilder<Null, string>(producerConfig)
    .SetErrorHandler((_, error) => { /* set a breakpoint here and inspect error.Reason */ });
Set a breakpoint in that lambda and you can break on errors.
Produce is asynchronous and non-blocking; the function signature is:
void Produce(string topic, Message<TKey, TValue> message, Action<DeliveryReport<TKey, TValue>> deliveryHandler = null)
In order to verify that a message was delivered without error, you can add a delivery report handler function, e.g.:
private void DeliveryReportHandler(DeliveryReport<int, T> deliveryReport)
{
    if (deliveryReport.Status == PersistenceStatus.NotPersisted)
    {
        _logger.LogError($"Failed message delivery: error reason:{deliveryReport.Error?.Reason}");
        _messageWasNotDelivered = true;
    }
}
_messageWasNotDelivered = false;

_producer.Produce(topic,
    new Message<int, T>
    {
        Key = key,
        Value = entity
    },
    DeliveryReportHandler);

_producer.Flush(); // Wait until all outstanding produce requests and delivery report callbacks are completed

if (_messageWasNotDelivered)
{
    // handle non delivery
}
This code can be trivially adjusted for batch producing, like this:
_messageWasNotDelivered = false;

foreach (var entity in entities)
{
    _producer.Produce(topic,
        new Message<int, T>
        {
            Key = entity.Id,
            Value = entity
        },
        DeliveryReportHandler);
}

_producer.Flush(); // Wait until all outstanding produce requests and delivery report callbacks are completed

if (_messageWasNotDelivered)
{
    // handle non delivery
}
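If you would rather have failures surface as exceptions instead of a shared flag, here is a sketch of the awaitable variant (assuming the same producer, topic, key, and entity as above):

try
{
    // ProduceAsync completes when the broker acknowledges (or rejects) the message,
    // so delivery failures surface as exceptions instead of silent drops.
    var result = await _producer.ProduceAsync(topic, new Message<int, T> { Key = key, Value = entity });
    _logger.LogInformation($"Delivered to {result.TopicPartitionOffset}");
}
catch (ProduceException<int, T> ex)
{
    _logger.LogError($"Delivery failed: {ex.Error.Reason}");
}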
I have an Azure Function that is triggered by Event Hubs and sends data in batches. Inside the function, there are multiple calls to insert data into Cosmos DB. I have added the following code as part of Application Insights monitoring.
builder.Services.AddApplicationInsightsTelemetry();

builder.Services.ConfigureTelemetryModule<DependencyTrackingTelemetryModule>((module, o) =>
{
    module.EnableW3CHeadersInjection = true;
});

builder.Services.ConfigureTelemetryModule<EventCounterCollectionModule>((module, o) =>
{
    module.Counters.Add(new EventCounterCollectionRequest("System.Runtime", "gen-0-size"));
});
I can see the total response time in Application Insights, but I cannot figure out how to track and send the time spent by each insert query in Cosmos DB.
Here is the C# code within the Azure Function:
var watch = System.Diagnostics.Stopwatch.StartNew();

var DemoContainerData = new
{
    id = Guid.NewGuid().ToString(),
    UserId = userId,
    // other properties
};
_demoContainer.CreateItemAsync<object>(DemoContainerData);

var DemoContainerData2 = new
{
    id = Guid.NewGuid().ToString(),
    ProductId = productId,
    // other properties
};
_productContainer.CreateItemAsync<object>(DemoContainerData2);

/* var dependency = new DependencyTelemetry
{
    Name = "",
    Target = "",
    Data = "",
    Timestamp = start,
    Duration = DateTime.UtcNow - start,
    Success = true
};
this._telemetryClient.TrackDependency(dependency);
*/

watch.Stop();
var elapsed = watch.Elapsed.TotalMilliseconds;
log.LogInformation("Total Items {0} - Total Time {1}", Items.Length, elapsed);
Your code is not awaiting the async operations; you should be doing:
ItemResponse<object> response = await _demoContainer.CreateItemAsync<object>(DemoContainerData);
From the response, you can measure the client latency:
var elapsedTimeForOperation = response.Diagnostics.GetClientElapsedTime();
What we recommend if you want to investigate high latency is to log the Diagnostics when the request goes above some threshold, for example:
if (response.Diagnostics.GetClientElapsedTime() > ConfigurableSlowRequestTimeSpan)
{
    // Log the diagnostics and add any additional info necessary to correlate to other logs
    log.LogWarning("Slow request {0}", response.Diagnostics);
}
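To get the per-insert time into Application Insights as its own dependency entry, one option is to record it manually. This is only a sketch, assuming _telemetryClient is an injected TelemetryClient and reusing the container and payload names from the question:

var start = DateTimeOffset.UtcNow;
var watch = System.Diagnostics.Stopwatch.StartNew();
ItemResponse<object> response = await _demoContainer.CreateItemAsync<object>(DemoContainerData);
watch.Stop();

// Emits one dependency row per insert, so each call shows up with its own duration.
_telemetryClient.TrackDependency(
    "Azure DocumentDB",             // dependency type name; Cosmos DB calls are commonly grouped under this
    "CreateItem DemoContainer",     // dependency name shown in the portal (illustrative)
    DemoContainerData.id,           // data field; the item id here is just illustrative
    start,
    watch.Elapsed,
    response.StatusCode == System.Net.HttpStatusCode.Created);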
For best latency, make sure you are following https://learn.microsoft.com/azure/cosmos-db/sql/troubleshoot-dot-net-sdk-slow-request?tabs=cpu-new#application-design (mainly, make sure you are using a singleton client, and use ApplicationRegion or ApplicationPreferredRegions to define the preferred region to connect to, which hopefully is the same region the Function is deployed to).
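As a sketch of that last point (the connection string variable and region name are placeholders, not from the question):

// A single CosmosClient instance shared for the lifetime of the Function app,
// pinned to the region the app runs in.
private static readonly CosmosClient Client = new CosmosClient(
    cosmosConnectionString,
    new CosmosClientOptions { ApplicationRegion = Regions.WestEurope });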
I am running Hangfire in a single web application. My application is running on 2 physical servers, but Hangfire uses 1 database.
At the moment, I am creating a server for each queue, because for each queue I need to run 1 worker at a time and jobs must run in order. I set them up like this:
// core
services.AddHangfire(options =>
{
    options.SetDataCompatibilityLevel(CompatibilityLevel.Version_170);
    options.UseSimpleAssemblyNameTypeSerializer();
    options.UseRecommendedSerializerSettings();
    options.UseSqlServerStorage(appSettings.Data.DefaultConnection.ConnectionString, storageOptions);
});

// add multiple servers, this way we get to control how many workers are in each queue
services.AddHangfireServer(options =>
{
    options.ServerName = "workflow-queue";
    options.WorkerCount = 1;
    options.Queues = new string[] { "workflow-queue" };
    options.SchedulePollingInterval = TimeSpan.FromSeconds(10);
});

services.AddHangfireServer(options =>
{
    options.ServerName = "alert-schedule";
    options.WorkerCount = 1;
    options.Queues = new string[] { "alert-schedule" };
    options.SchedulePollingInterval = TimeSpan.FromMinutes(1);
});

services.AddHangfireServer(options =>
{
    options.ServerName = "trigger-schedule";
    options.WorkerCount = 1;
    options.Queues = new string[] { "trigger-schedule" };
    options.SchedulePollingInterval = TimeSpan.FromMinutes(1);
});

services.AddHangfireServer(options =>
{
    options.ServerName = "report-schedule";
    options.WorkerCount = 1;
    options.Queues = new string[] { "report-schedule" };
    options.SchedulePollingInterval = TimeSpan.FromMinutes(1);
});

services.AddHangfireServer(options =>
{
    options.ServerName = "maintenance";
    options.WorkerCount = 5;
    options.Queues = new string[] { "maintenance" };
    options.SchedulePollingInterval = TimeSpan.FromMinutes(10);
});
My problem is that it is generating multiple queues on the servers, with different ports.
In my code I am then trying to stop jobs from running if they are queued/retrying, but if the job is being run on a different physical server, it is not found and is queued again.
Here is the code to check if it is already running:
public async Task<bool> IsAlreadyQueuedAsync(PerformContext context)
{
    var disableJob = false;
    var monitoringApi = JobStorage.Current.GetMonitoringApi();

    // get the jobId, method and queue using performContext
    var jobId = context.BackgroundJob.Id;
    var methodInfo = context.BackgroundJob.Job.Method;
    var queueAttribute = (QueueAttribute)Attribute.GetCustomAttribute(context.BackgroundJob.Job.Method, typeof(QueueAttribute));

    // enqueuedJobs
    var enqueuedjobStatesToCheck = new[] { "Processing" };
    var enqueuedJobs = monitoringApi.EnqueuedJobs(queueAttribute.Queue, 0, 1000);
    var enqueuedJobsAlready = enqueuedJobs.Count(e => e.Key != jobId && e.Value != null && e.Value.Job != null && e.Value.Job.Method.Equals(methodInfo) && enqueuedjobStatesToCheck.Contains(e.Value.State));

    if (enqueuedJobsAlready > 0)
        disableJob = true;

    // scheduledJobs
    if (!disableJob)
    {
        // check if there are any scheduledJobs that are processing
        var scheduledJobs = monitoringApi.ScheduledJobs(0, 1000);
        var scheduledJobsAlready = scheduledJobs.Count(e => e.Key != jobId && e.Value != null && e.Value.Job != null && e.Value.Job.Method.Equals(methodInfo));

        if (scheduledJobsAlready > 0)
            disableJob = true;
    }

    // failedJobs
    if (!disableJob)
    {
        var failedJobs = monitoringApi.FailedJobs(0, 1000);
        var failedJobsAlready = failedJobs.Count(e => e.Key != jobId && e.Value != null && e.Value.Job != null && e.Value.Job.Method.Equals(methodInfo));

        if (failedJobsAlready > 0)
            disableJob = true;
    }

    // if disableJob is true, then let's remove the currently running job, else it will write a "successful" message in the logs
    if (disableJob)
    {
        // use hangfire delete, for cleanup
        BackgroundJob.Delete(jobId);

        // create our sqlBuilder to remove the entries altogether including the count
        var sqlBuilder = new SqlBuilder()
            .DELETE_FROM("Hangfire.[Job]")
            .WHERE("[Id] = {0};", jobId);

        sqlBuilder.Append("DELETE TOP(1) FROM Hangfire.[Counter] WHERE [Key] = 'stats:deleted' AND [Value] = 1;");

        using (var cmd = _context.CreateCommand(sqlBuilder))
            await cmd.ExecuteNonQueryAsync();

        return true;
    }

    return false;
}
Each method has something like the following attributes as well
public interface IAlertScheduleService
{
    [Hangfire.Queue("alert-schedule")]
    [Hangfire.DisableConcurrentExecution(60 * 60 * 5)]
    Task RunAllAsync(PerformContext context);
}
Simple implementation of the interface
public class AlertScheduleService : IAlertScheduleService
{
    public async Task RunAllAsync(PerformContext context)
    {
        if (await IsAlreadyQueuedAsync(context))
            return;

        // guess it isn't queued, so run it here....
    }
}
Here is how I am adding my scheduled jobs:
//// our recurring jobs
//// set these to run hourly, so they can play "catch-up" if needed
RecurringJob.AddOrUpdate<IAlertScheduleService>(e => e.RunAllAsync(null), Cron.Hourly(0), queue: "alert-schedule");
Why does this happen? How can I stop it from happening?
Somewhat of a blind shot: preventing a job from being queued if a job is already queued in the same queue.
The try-catch logic is quite ugly, but I have no better idea right now...
Also, I'm really not sure the lock logic always prevents having two jobs in EnqueuedState, but it should help anyway. Maybe mix it with an IApplyStateFilter.
public class DoNotQueueIfAlreadyQueued : IElectStateFilter
{
    public void OnStateElection(ElectStateContext context)
    {
        if (context.CandidateState is EnqueuedState)
        {
            EnqueuedState es = context.CandidateState as EnqueuedState;
            IDisposable distributedLock = null;
            try
            {
                while (distributedLock == null)
                {
                    try
                    {
                        distributedLock = context.Connection.AcquireDistributedLock($"{nameof(DoNotQueueIfAlreadyQueued)}-{es.Queue}", TimeSpan.FromSeconds(1));
                    }
                    catch { }
                }

                var m = context.Storage.GetMonitoringApi();
                if (m.EnqueuedCount(es.Queue) > 0)
                {
                    context.CandidateState = new DeletedState();
                }
            }
            finally
            {
                distributedLock.Dispose();
            }
        }
    }
}
The filter can be declared as in this answer
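For example, registering it globally at startup (a sketch; the filter could equally be applied per class or per method):

// Register the state election filter for all background jobs.
GlobalJobFilters.Filters.Add(new DoNotQueueIfAlreadyQueued());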
There seems to be a bug in the Hangfire storage implementation you are currently using:
https://github.com/HangfireIO/Hangfire/issues/1025
The current options are:
- Switching to HangFire.LiteDB, as commented here: https://github.com/HangfireIO/Hangfire/issues/1025#issuecomment-686433594
- Implementing your own logic to enqueue a job, but this would take more effort.
- Making your job execution idempotent to avoid side effects in case it's executed multiple times.
With any of these options, you should still apply DisableConcurrentExecution and make your job execution idempotent as explained below, so I think you can just go with the last option:
Applying DisableConcurrentExecution is necessary, but it's not enough, as there are no reliable automatic failure detectors in distributed systems. That's the nature of distributed systems: we usually have to rely on timeouts to detect failures, and that's not reliable.
Hangfire is designed to run with at-least-once execution semantics, as explained below:
One of your servers may be executing the job, but it's detected as being failed due to various reasons. For example: your current processing server does not send heartbeats in time due to a temporary network issue or due to temporary high load.
When the current processing server is assumed to be failed (but it's not), the job will be scheduled to another server which causes it to be executed more than once.
The solution is still to apply the DisableConcurrentExecution attribute as a best effort to prevent multiple executions of the same job, but the main thing is that you need to make the execution of the job idempotent, so that it does not cause side effects in case it's executed multiple times.
Please refer to some quotes from https://docs.hangfire.io/en/latest/background-processing/throttling.html:
Throttlers apply only to different background jobs, and there’s no reliable way to prevent multiple executions of the same background job other than by using transactions in background job method itself. DisableConcurrentExecution may help a bit by narrowing the safety violation surface, but it heavily relies on an active connection, which may be broken (and lock is released) without any notification for our background job.

As there are no reliable automatic failure detectors in distributed systems, it is possible that the same job is being processed on different workers in some corner cases. Unlike OS-based mutexes, mutexes in this package don’t protect from this behavior so develop accordingly. DisableConcurrentExecution filter may reduce the probability of violation of this safety property, but the only way to guarantee it is to use transactions or CAS-based operations in our background jobs to make them idempotent.
You can also refer to this, as Hangfire timeout behavior seems to depend on the storage implementation as well: https://github.com/HangfireIO/Hangfire/issues/1960#issuecomment-962884011
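As a rough sketch of what a CAS-style guard inside the job could look like (the table, column, and connection-string names are illustrative, not from the question):

// using System.Data.SqlClient (or Microsoft.Data.SqlClient)
public async Task ProcessOrderAsync(int orderId)
{
    using (var connection = new SqlConnection(_connectionString))
    {
        await connection.OpenAsync();

        // Conditional UPDATE acts as a compare-and-swap: only one execution can claim the order.
        var claim = new SqlCommand(
            "UPDATE Orders SET ProcessedAtUtc = SYSUTCDATETIME() WHERE Id = @id AND ProcessedAtUtc IS NULL;",
            connection);
        claim.Parameters.AddWithValue("@id", orderId);

        if (await claim.ExecuteNonQueryAsync() != 1)
            return; // another execution already processed this order

        // ... do the actual processing here ...
    }
}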
I have set up a recurring job that simply processes some orders:
{
    RecurringJob.AddOrUpdate(
        hangFireJob.JobId,
        () => hangFireJob.Execute(),
        hangFireJob.Schedule);
}
The issue I'm running into is that we have multiple "backup" servers all running this same code. They are all reaching out to the same Hangfire database.
I'm seeing the same job run multiple times, which obviously gives us errors since the orders are already processed.
I want the backup servers to recognize that the recurring job has already been queued and not queue it again. I figured it would use the job name (the first parameter above) to do this. What am I missing here?
I included the Hangfire server setup below:
private IEnumerable<IDisposable> GetHangfireServers()
{
    var configSettings = new Settings();

    GlobalConfiguration.Configuration
        .UseNinjectActivator(_kernel)
        .SetDataCompatibilityLevel(CompatibilityLevel.Version_170)
        .UseSimpleAssemblyNameTypeSerializer()
        .UseRecommendedSerializerSettings()
        .UseSqlServerStorage(configSettings.HangfireConnectionString, new SqlServerStorageOptions
        {
            CommandBatchMaxTimeout = TimeSpan.FromMinutes(5),
            SlidingInvisibilityTimeout = TimeSpan.FromMinutes(5),
            QueuePollInterval = TimeSpan.Zero,
            UseRecommendedIsolationLevel = true,
            DisableGlobalLocks = false,
            SchemaName = "LuxAdapter",
            PrepareSchemaIfNecessary = false
        });

    yield return new BackgroundJobServer();
}
Not sure if you have tried this already, but look into using the DisableConcurrentExecution attribute on the job; it will prevent multiple executions of the same job.
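A sketch of applying it to the job method from the question (the timeout value is illustrative):

// Prevents the same job method from running concurrently; another invocation
// waits up to the timeout for the distributed lock before failing.
[DisableConcurrentExecution(timeoutInSeconds: 10 * 60)]
public void Execute()
{
    // process orders
}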
Since the default values provide uniqueness only at the process level, you should handle it manually if you want to run different server instances inside the same process:
var options = new BackgroundJobServerOptions
{
    ServerName = String.Format(
        "{0}.{1}",
        Environment.MachineName,
        Guid.NewGuid().ToString())
};

var server = new BackgroundJobServer(options);
// or
app.UseHangfireServer(options);
I have an async method that is called in a loop:
private Task<HttpResponseMessage> GetResponseMessage(Region region, DateTime startDate, DateTime endDate)
{
    var longLatString = $"q={region.LongLat.Lat},{region.LongLat.Long}";
    var startDateString = $"{startDateQueryParam}={ConvertDateTimeToApixuQueryString(startDate)}";
    var endDateString = $"{endDateQueryParam}={ConvertDateTimeToApixuQueryString(endDate)}";
    var url = $"http://api?key={Config.Key}&{longLatString}&{startDateString}&{endDateString}";

    return Client.GetAsync(url);
}
I then take the response and save it to my EF Core database; however, in some instances I get this exception message: "The operation was canceled".
I really don't understand that. Is this a TCP handshake issue?
Edit:
For context, I am making many of these calls and passing each response to the method that writes to the DB (which is also unbelievably slow):
private async Task<int> WriteResult(Response apiResponse, Region region)
{
    // Since DbContext is not thread safe, we ensure we have a new one for each insert;
    // a .NET Core app can insert data at the same time from different users, so different
    // instances of the context must be used.
    using (var context = new DalContext(ContextOptions))
    {
        var batch = new List<HistoricalWeather>();
        foreach (var forecast in apiResponse.Forecast.Forecastday)
        {
            // avoid inserting duplicates
            var existingRecord = context.HistoricalWeather
                .FirstOrDefault(x => x.RegionId == region.Id &&
                                     IsOnSameDate(x.Date.UtcDateTime, forecast.Date));
            if (existingRecord != null)
            {
                continue;
            }

            var newHistoricalWeather = new HistoricalWeather
            {
                RegionId = region.Id,
                CelsiusMin = forecast.Day.Mintemp_c,
                CelsiusMax = forecast.Day.Maxtemp_c,
                CelsiusAverage = forecast.Day.Avgtemp_c,
                MaxWindMph = forecast.Day.Maxwind_mph,
                PrecipitationMillimeters = forecast.Day.Totalprecip_mm,
                AverageHumidity = forecast.Day.Avghumidity,
                AverageVisibilityMph = forecast.Day.Avgvis_miles,
                UvIndex = forecast.Day.Uv,
                Date = new DateTimeOffset(forecast.Date),
                Condition = forecast.Day.Condition.Text
            };

            batch.Add(newHistoricalWeather);
        }

        context.HistoricalWeather.AddRange(batch);
        var inserts = await context.SaveChangesAsync();
        return inserts;
    }
}
Edit: I am making 150,000 calls. I know this is questionable, since it all goes into memory before even doing a save, but this is where I got to in trying to make this run faster... only I guess my actual writing code is blocking :/
var dbInserts = await Task.WhenAll(
    getTasks // the list of all api get requests
        .Select(async x => {
            // parsed can be null if the get failed
            var parsed = await ParseApixuResponse(x.Item1); // ReadContentAsync and just return the deserialized json
            return new Tuple<ApiResult, Region>(parsed, x.Item2);
        })
        .Select(async x => {
            var finishedGet = await x;
            if (finishedGet.Item1 == null)
            {
                return 0;
            }
            return await writeResult(finishedGet.Item1, finishedGet.Item2);
        })
    );
.NET Core has a DefaultConnectionLimit setting, as answered in the comments.
This limits outgoing connections to specific domains, to ensure all ports are not exhausted, etc.
I did my parallel work incorrectly, causing it to go over the limit (which everything I read says should not be 2 on .NET Core, but it was), and that caused connections to close before receiving responses.
I made the limit greater, did the parallel work correctly, then lowered it again.
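For reference, a sketch of where those knobs live (assuming the standard .NET APIs; the values are illustrative):

// Classic knob used by HttpWebRequest and older HttpClient handler paths.
System.Net.ServicePointManager.DefaultConnectionLimit = 100;

// On .NET Core, the per-host limit can also be set on the handler of a shared HttpClient.
var handler = new SocketsHttpHandler { MaxConnectionsPerServer = 100 };
var client = new HttpClient(handler);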