What is the best approach for knowing when a long-running Elasticsearch request is complete?
Today I have a process that periodically purges ~100K documents from an AWS hosted ES that contains a total of ~60M documents.
var settings = new ConnectionSettings(new Uri("https://mycompany.es.aws.com"));
settings.RequestTimeout(TimeSpan.FromMinutes(3)); // not sure this helps
var client = new ElasticClient(settings);
var request = new DeleteByQueryRequest("MyIndex") { ... };
// this call returns IsValid = true with HTTP status 504 after ~60s
var response = await client.DeleteByQueryAsync(request);
Even with the timeout set to 3 minutes, the call always returns after ~60s with an empty response and a 504 status code. However, in Kibana I can see that the delete continues (and completes correctly) over the next several minutes.
Is there a better way to request and monitor (wait for completion of) a long-running ES request?
UPDATE
Based on Simon Lang's response I updated my code to make use of ES Tasks. The final solution looks something like this...
var settings = new ConnectionSettings(new Uri("https://mycompany.es.aws.com"));
settings.RequestTimeout(TimeSpan.FromMinutes(3)); // not sure this helps
var client = new ElasticClient(settings);

var request = new DeleteByQueryRequest("MyIndex")
{
    Query = ...,
    WaitForCompletion = false
};

var response = await client.DeleteByQueryAsync(request);
if (response.IsValid)
{
    var taskCompleted = false;
    while (!taskCompleted)
    {
        var taskResponse = await client.GetTaskAsync(response.Task);
        taskCompleted = taskResponse.Completed;
        if (!taskCompleted)
        {
            await Task.Delay(5000);
        }
    }
}
I agree with @LeBigCat that the timeout comes from AWS and is not a NEST problem.
But to address your question:
The _delete_by_query request supports the wait_for_completion parameter. If you set it to false, the request returns immediately with a task id. You can then query the task's status via the Tasks API.
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
This is not a NEST/Elasticsearch problem; the default timeout for a NEST query is 0 (no timeout).
You are getting the timeout from the AWS load balancer (60s by default).
https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/ts-elb-error-message.html
This link explains everything you need to know :)
Regarding @Simon Lang's answer, this also applies to the _update_by_query API. For those unfamiliar with the _tasks API, you can query for your task in Kibana. The response returned by the update or delete by query will be of the form:
{
  "task" : "nodeId:taskId"
}
and you can view the status of the task using this command in Kibana:
GET _tasks/nodeId:taskId
I created a timeout middleware that works basically like this:
public async Task InvokeAsync(HttpContext httpContext)
{
    var stopwatch = Stopwatch.StartNew();
    using (var timeoutTS = CancellationTokenSource.CreateLinkedTokenSource(httpContext.RequestAborted))
    {
        var delayTask = Task.Delay(config.Timeout);
        var res = await Task.WhenAny(delayTask, _next(httpContext));
        Trace.WriteLine("Time taken = " + stopwatch.ElapsedMilliseconds);
        if (res == delayTask)
        {
            timeoutTS.Cancel();
            httpContext.Response.StatusCode = 408;
        }
    }
}
In order to test it, I created a controller action:
[HttpGet]
public async Task<string> Get(string timeout)
{
    var result = DateTime.Now.ToString("mm:ss.fff");
    if (timeout != null)
    {
        await Task.Delay(2000);
    }
    var rng = new Random();
    result = result + " - " + DateTime.Now.ToString("mm:ss.fff");
    return result;
}
The configured timeout is 500ms, and the reported "Time taken" is usually 501-504ms, which is a perfectly acceptable skew.
The problem is that every now and then I see an error in the output window saying that the response had already started. And I thought to myself: this can't be! It's happening a full second before the Task.Delay in the corresponding controller action ends.
So I opened up Fiddler and, to my surprise, several requests are returning in 1.3-1.7 seconds WITH A FULL RESPONSE BODY.
By comparing the time written in the response body with the timestamp in Fiddler's "Statistics" tab, I can guarantee that the response I'm looking at does not belong to the request at hand!
Does anyone know what's going on? Why is this "jumbling" happening?
Frankly, you're not using middleware the way it is designed to be used.
You might want to read the middleware docs:
The ASP.NET Core request pipeline consists of a sequence of request delegates, called one after the other.
In your case, your middleware is running in parallel with the next middleware.
When a middleware short-circuits, it's called a terminal middleware because it prevents further middleware from processing the request.
If I understand you correctly, you want to create such a terminal middleware, but your current one clearly is not one.
In your case you have already invoked the _next middleware, which means the request has been handed off to the next middleware in the pipeline. The subsequent middleware components can start the response before your timeout has elapsed, i.e. there is a race condition between your middleware and the subsequent middleware.
To avoid the race condition, you should always check HasStarted before assigning the status code. And if the response has already started, about all you can do is abort the request if you don't want the client to wait too long.
static void ResetOrAbort(HttpContext httpContext)
{
    var resetFeature = httpContext.Features.Get<IHttpResetFeature>();
    if (resetFeature is not null)
    {
        resetFeature.Reset(2);
    }
    else
    {
        httpContext.Abort();
    }
}
app.Use(next =>
{
    return async context =>
    {
        var nextTask = next(context);
        var t = await Task.WhenAny(nextTask, Task.Delay(100));
        if (t != nextTask)
        {
            var response = context.Response;
            // If the response has not started, return 408
            if (!response.HasStarted)
            {
                // NOTE: you can still get the same exception,
                // because the check above does not eliminate
                // the race condition
                try
                {
                    response.StatusCode = StatusCodes.Status408RequestTimeout;
                    await response.StartAsync();
                }
                catch
                {
                    ResetOrAbort(context);
                }
            }
            // Otherwise, abort the request
            else
            {
                ResetOrAbort(context);
            }
        }
    };
});
So I'm using Microsoft's Bot Framework and the Direct Line API to talk to it. I do this because I need to send a notification to the bot. The class below is called by an endpoint in my backend: when I call my notify endpoint, this class is invoked and is supposed to start a conversation with the bot to trigger certain events in it. The problem is that it doesn't work as expected. When I run the code and make a request to my endpoint, it gets stuck at var conversation = await client.Conversations.StartConversationAsync();
The await keyword suspends execution until the call finishes; the problem is that it never finishes. BUT I can see in the debug window that the request is sent and returns a 201 Created status code, so it should finish, but it never does. Not sure what to do here.
private static async Task StartBotConversation()
{
    string directLineSecret = "SECRET";
    string fromUser = "DirectLineSampleClientUser";
    DirectLineClient client = new DirectLineClient(directLineSecret);

    Debug.WriteLine("Before starting con");
    var conversation = await client.Conversations.StartConversationAsync();
    Debug.WriteLine("After starting con");

    Activity userMessage = new Activity
    {
        From = new ChannelAccount(fromUser),
        Text = "ERROR1337",
        Type = ActivityTypes.Trigger
    };

    Debug.WriteLine("Before posting activity");
    await client.Conversations.PostActivityAsync(conversation.ConversationId, userMessage);
    Debug.WriteLine("After posting activity");
}
Do this:
BotConversation = await Client.Conversations
    .StartConversationAsync().ConfigureAwait(false);
ConfigureAwait(false) keeps the continuation from being posted back to a blocked synchronization context, which is the usual cause of this kind of sync-over-async hang.
It worked for me, I hope it helps you.
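For context, a minimal sketch of the same change applied to the method from the question (sketch only; it assumes the DirectLineClient setup shown above):

    // Apply ConfigureAwait(false) to both awaits so the continuations
    // do not need to resume on the original (possibly blocked) context.
    var conversation = await client.Conversations
        .StartConversationAsync()
        .ConfigureAwait(false);

    await client.Conversations
        .PostActivityAsync(conversation.ConversationId, userMessage)
        .ConfigureAwait(false);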
I have the following piece of code that retrieves transactions from a Dynamics CRM (querying with OData):
public async Task<IEnumerable<Transaccion>> GetTransactions()
{
    var tableName = Transaccion.CrmTableName;
    var request = new RestRequest($"/api/data/v8.0/{tableName}");
    request.AddHeader("Prefer", "odata.maxpagesize=500");

    var responseData = await client.ExecuteGetTaskAsync<ODataResponse<List<Transaccion>>>(request);
    var transactions = responseData.Data.Value;

    while (responseData.Data.NextLink != null)
    {
        request = new RestRequest(responseData.Data.NextLink);
        request.AddHeader("Prefer", "odata.maxpagesize=500");
        responseData = await client.ExecuteGetTaskAsync<ODataResponse<List<Transaccion>>>(request);
        transactions.AddRange(responseData.Data.Value);
    }

    return transactions;
}
Once I execute the first ExecuteGetTaskAsync, I get, as expected, a NextLink attribute that points to the next set of entities I need to retrieve. However, when I try to perform the next RestRequest, I don't get JSON back but an HTML page corresponding to a redirect, where I can read the error message "".
It's odd, since the first call works correctly, so the RestClient is properly authenticated.
What's going on? How can I do paging with Dynamics CRM in .NET using the NextLink?
In my case, the URL in @odata.nextLink contained an error.
How it was:
http://[Organization URI]/api/data/v8.2/[entity]/(68e95f08-d372-e711-966b-defe0719ce9e)/[relation entity]?$select=ne_name
And that did not work, but this did:
http://[Organization URI]/api/data/v8.2/[entity](68e95f08-d372-e711-966b-defe0719ce9e)/[relation entity]?$select=ne_name
There is no "/" between [entity] and (id)
The OData nextLink returns the full URL of the next request, so you'll need to parse it to keep only the /api/** portion.
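A minimal sketch of that parsing, reusing the RestSharp client and ODataResponse wrapper from the question (the property names are the question's own, not a library API):

    // Keep only the path and query of @odata.nextLink, since the RestClient
    // already has the organization base URL configured.
    var nextLink = responseData.Data.NextLink;        // full URL returned by the service
    var relative = new Uri(nextLink).PathAndQuery;    // e.g. "/api/data/v8.0/...?..."
    request = new RestRequest(relative);
    request.AddHeader("Prefer", "odata.maxpagesize=500");
    responseData = await client.ExecuteGetTaskAsync<ODataResponse<List<Transaccion>>>(request);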
I have a REST endpoint in ASP.NET MVC:
[HttpGet]
public List<SomeEntity> Get()
{
    // Let's imagine that this operation lasts for 1 minute
    var someSlowOperationResult = SomeWhere.GetDataSlow();
    return someSlowOperationResult;
}
On the front end I have the following JavaScript:
var promise = $.get("/SomeEntities");
setTimeout(function(){promise.abort()}, 100);
How can I force the thread to stop after the abort call, to prevent the slow calculation from running to completion?
Thanks in advance.
I found that Response has an IsClientConnected property, so we can use the following approach:
[HttpGet]
public List<SomeEntity> Get()
{
    var gotResult = false;
    var result = new List<SomeEntity>();
    var tokenSource2 = new CancellationTokenSource();
    CancellationToken ct = tokenSource2.Token;

    Task.Factory.StartNew(() =>
    {
        // Do something with the cancellation token to break the current operation
        result = SomeWhere.GetSomethingReallySlow();
        gotResult = true;
    }, ct);

    while (!gotResult)
    {
        if (!Response.IsClientConnected)
        {
            tokenSource2.Cancel();
            return result;
        }
        Thread.Sleep(100);
    }

    return result;
}
Can we? Or am I missing something?
UPDATE:
Yes, it works
The backend has no idea that you have called abort(), and if the request has already been sent, the server-side logic will run until it completes. To stop it, you would have to send another request to your controller notifying it that you've aborted; the controller would then have to reach the instance that is currently running your slow operation, and that instance would need a way to cancel the calculation.
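A minimal sketch of that idea, under the assumption that the client sends a correlation id with both requests (the operationId parameter, the shared registry, and the GetSomethingReallySlow overload that accepts a token are all hypothetical; requires System.Collections.Concurrent and System.Threading):

    // Hypothetical registry shared between the "start" and "cancel" actions.
    private static readonly ConcurrentDictionary<string, CancellationTokenSource> Operations =
        new ConcurrentDictionary<string, CancellationTokenSource>();

    [HttpGet]
    public List<SomeEntity> Get(string operationId)
    {
        var cts = Operations.GetOrAdd(operationId, _ => new CancellationTokenSource());
        try
        {
            // The slow operation must observe the token for cancellation to take effect.
            return SomeWhere.GetSomethingReallySlow(cts.Token);
        }
        finally
        {
            Operations.TryRemove(operationId, out _);
        }
    }

    [HttpPost]
    public void Cancel(string operationId)
    {
        if (Operations.TryRemove(operationId, out var cts))
        {
            cts.Cancel();
        }
    }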
I need to integrate a third party's Web API methods into a WCF service.
The WCF Service will be called by another external party, and they expect the call and return to be synchronous.
I am able to get results from the API with the usual RestClient.ExecuteAsync.
I put together the following for synchronous calls:
public static List<Books> GetSyncBooks(int companyId)
{
    var response = GetSynchronousBooks(companyId);
    var content = response.Result.Content;
    List<Books> result = new List<Books>();
    return Helpers.JSONSerialiser.Deserialize<BookList>(content);
}

async private static Task<IRestResponse> GetSynchronousBooks(int companyId)
{
    var request = BuildGETRequest("Book", companyId);
    var response = await RestSharpHelper.ExecuteSynchronousRequest(request);
    return response;
}

public static Task<IRestResponse> ExecuteSynchronousRequest(RestRequest request)
{
    var client = new RestClient(BaseUrl);
    client.AddHandler("application/json", new RestSharpJsonDotNetDeserializers());

    var tcs = new TaskCompletionSource<IRestResponse>(TaskCreationOptions.AttachedToParent);
    client.ExecuteAsync(request, (restResponse, asyncHandle) =>
    {
        if (restResponse.ResponseStatus == ResponseStatus.Error)
            tcs.SetException(restResponse.ErrorException);
        else
            tcs.SetResult(restResponse);
    });

    return tcs.Task;
    // BREAKPOINT here shows TASK value as
    // Id = 1, Status = WaitingForActivation, Method = "{null}", Result = "{Not yet computed}"
}
The problem, however, is that I never get a result using this. The response content is always null. What am I doing wrong?
EDIT: Thanks Stephen. I have seen your name on some of the questions here on this subject: your score seems to indicate you know your way around this. I have indeed implemented the wcf service calls as you indicated based on your answer at another question. Can I ask you a related follow-up question? How scalable are async WCF service calls like this example? Would it "just work" for multiple simultaneous calls in the range of 10 to 100 per second, ignoring the processing overhead downstream?
I assume that your WCF service is hosted in ASP.NET.
Your problem is here: response.Result. I explain this deadlock situation on my blog. In summary, await will capture the current "context" (in this case, an ASP.NET request context) and will use that to resume the async method. However, the ASP.NET request context only allows one thread at a time, so if the request thread is blocked (calling response.Result), then the async method can never continue and you get a deadlock.
The solution is to correct this misunderstanding:
The WCF Service will be called by another external party, and they expect the call and return to be synchronous.
Since you're dealing with a client/server scenario, you don't have to make it synchronous. The asynchrony of the client is completely independent from the asynchrony of the server.
So, just implement your WCF service asynchronously:
public static async Task<List<Books>> GetBooksAsync(int companyId)
{
    // Await the task-based helper from the question instead of blocking on .Result
    var response = await GetSynchronousBooks(companyId);
    var content = response.Content;
    return Helpers.JSONSerialiser.Deserialize<BookList>(content);
}
The client can still call it synchronously if they wish to.
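For illustration, a minimal sketch of what the asynchronous service side could look like, assuming a hypothetical IBookService contract and a hypothetical BookRepository class holding the static method above (WCF supports Task-returning operations, and a proxy generated from this contract can still expose a synchronous call to the external party):

    using System.Collections.Generic;
    using System.ServiceModel;
    using System.Threading.Tasks;

    // Hypothetical contract; the names are illustrative only.
    [ServiceContract]
    public interface IBookService
    {
        [OperationContract]
        Task<List<Books>> GetBooksAsync(int companyId);
    }

    public class BookService : IBookService
    {
        // Delegates to the task-based method shown above.
        public Task<List<Books>> GetBooksAsync(int companyId)
            => BookRepository.GetBooksAsync(companyId);
    }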