I have built an API for my application, and I'm using a controller to upload files with FlowJS. It works without problems on my local machine and on my development server, but on my client's staging (homologation) server I have a problem with the requests.
After a random number of requests (sometimes 5, sometimes 10), the next request just freezes and never gets a response. If I replay that same request from the Chrome dev console, it works.
This is a screenshot of the requests:
The controller and the service it calls only perform I/O operations, and I log everything that happens once a request reaches the controller. When a request gets stuck in the pending state, nothing is logged for it, so it apparently never reaches the controller.
I have also tried to work around my DNS to check whether it was a connection problem caused by the number of requests, but the problem remains the same. What else could be happening here?
I am working on a .NET Core 3.1 web api application and am having a terrible time with something closing my long running API request.
I have a POST endpoint that looks like this (slightly simplified):
    [HttpPost]
    [IgnoreAntiforgeryToken]
    [ActionName("LoadDataIntoCache")]
    public async Task<IActionResult> LoadDataIntoCache([FromQuery] string filter)
    {
        // long running process (15-20 mins)
        var success = await _riskService.LoadDataIntoCache(filter);
        if (success == false)
        {
            return StatusCode(StatusCodes.Status500InternalServerError);
        }
        return Ok();
    }
This endpoint works fine when I test it locally via Postman. However, when I push it to the server (IIS) and hit the endpoint via Postman, it produces an error after 5 minutes: "Error: read ECONNRESET".
No more details are produced than this. Checking the application logs, no exception is thrown; in fact, the long-running process appears to keep running as if nothing were wrong. It's as if the connection itself is being closed by something while the application keeps working fine.
I have also tried calling this endpoint from C# instead of Postman. My calling code produced the exception message "Processing of the HTTP request resulted in an exception." and additionally "The underlying connection was closed: A connection that was expected to be kept alive was closed by the server."
I have checked the IIS timeout, which is set to 120s and so does not match the 5 minutes I am seeing. I have checked a bunch of timeout settings on the .NET side, but my understanding is that .NET Core 3.1 does not need these settings because it will wait forever by default? The application is also set to run inProcess, if that is significant.
I am really scratching my head on this one. Any pointers would be much appreciated.
I have a C# application in which the client uses WCF to talk to the server. In the background, every X seconds, the client calls a Ping method on the server (through WCF). The following error has occurred a couple of times (for different method calls):
System.ServiceModel.ProtocolException: A reply message was received for operation 'MyMethodToServer' with action 'http://tempuri.org/IMyInterface/PingServerResponse'. However, your client code requires action 'http://tempuri.org/IMyInterface/MyMethodToServerResponse'.
The failing operation is not always MyMethodToServer; the error shows up on different methods.
How can this happen that a request receives a different response?
I think you have a messy problem with asynchronous communication. Since the question isn't very clear, my main suggestion is to give every request an identifier, track each call and wait for its own reply, and check how your asynchronous calls interact with threading.
As you present it, this is a typical architecture problem.
If you post more code, I can suggest specific fixes and will gladly update my answer.
If this occurs randomly and not consistently, you might be running in a load-balanced setup and have deployed an update to only one of the servers?
Wild guess: your client uses the same connection to send two requests in parallel. So what happens is:
Thread 1 sends request ARequest
Thread 2 sends request BRequest
Server sends reply BReply
Thread 1 receives reply BReply while expecting AReply
If you have request logs on the server, this will be easy to confirm: you'll likely see two requests arriving with a short delay between them from the client host experiencing the issue.
I think MaxConcurrentCalls and ConcurrencyMode may be relevant here (although I have not touched WCF for a long while).
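If the parallel-use guess above is right, one way to test it is to serialize calls so that only one request is in flight on the shared connection at a time. The sketch below is only an illustration under that assumption: `SerializedChannel` and the delegate it wraps are hypothetical stand-ins for the real WCF proxy, which the question does not show.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper: allows only one in-flight request on a shared connection.
public sealed class SerializedChannel
{
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private readonly Func<string, Task<string>> _send; // stands in for the real proxy call

    public SerializedChannel(Func<string, Task<string>> send)
    {
        _send = send;
    }

    public async Task<string> CallAsync(string request)
    {
        await _gate.WaitAsync();      // wait until the previous request has its reply
        try
        {
            return await _send(request);
        }
        finally
        {
            _gate.Release();          // let the next caller use the connection
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        // Fake transport that answers each request after a short delay.
        var channel = new SerializedChannel(async request =>
        {
            await Task.Delay(50);
            return request + "Response";
        });

        // The background ping and a regular call start in parallel,
        // but the gate sends them over the connection one at a time.
        Task<string> ping = channel.CallAsync("PingServer");
        Task<string> call = channel.CallAsync("MyMethodToServer");
        await Task.WhenAll(ping, call);

        Console.WriteLine(ping.Result);
        Console.WriteLine(call.Result);
    }
}
```

If the mismatched replies disappear once calls are serialized like this, that supports the shared-connection explanation; giving the background ping its own proxy instance would be another way to test the same idea.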
I have a REST service in a self hosted ASP.Net WebApi application (Console).
Some clients poll the server in specific intervals to fetch new data. In general all is working fine.
The problem is, that the server stops responding to requests after some random duration (~30mins - 2.5 hours). All client requests start to time out.
The weird thing is that the server doesn't seem to receive the requests anymore (no controller method is invoked). The server didn't throw any exceptions and the console app is still responsive, so I can only suppose the problem occurs before the request reaches the API controller.
In the debugger everything seems fine.
How can I diagnose such an issue?
What else can I try to fix the described behavior?
Notes:
Tested on multiple systems
.Net 4.5.1
Asp.Net WebApi 5.1.2
I have found the issue: it is caused by connection leaks. If you open connections and don't close them correctly, either after the request has finished or when an exception is thrown, the number of open connections will eventually reach its maximum. Either raise the maximum number of open connections in the connection string or (the preferred way) make sure your code always closes the connection:
    SqlConnection myConnection = new SqlConnection(ConnectionString);
    try
    {
        myConnection.Open();
        someCall(myConnection);
    }
    finally
    {
        // Runs on success and on exceptions, so the connection always returns to the pool.
        myConnection.Close();
    }
Credit goes to "How can I solve a connection pool problem between ASP.NET and SQL Server?", where you can read more about this.
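A more compact form of the try/finally above is a `using` block, which disposes (and so closes) the connection even when an exception is thrown. The following is a minimal sketch; `FakeConnection` is a hypothetical stand-in for `SqlConnection`, used only so the example runs without a database.

```csharp
using System;

// Hypothetical stand-in for SqlConnection; counts how many connections are open.
public sealed class FakeConnection : IDisposable
{
    public static int OpenCount;
    private bool _isOpen;

    public void Open()
    {
        _isOpen = true;
        OpenCount++;
    }

    // Like SqlConnection.Dispose, this closes the connection if it is still open.
    public void Dispose()
    {
        if (_isOpen)
        {
            _isOpen = false;
            OpenCount--;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        try
        {
            using (var connection = new FakeConnection())
            {
                connection.Open();
                throw new InvalidOperationException("query failed");
            }
        }
        catch (InvalidOperationException)
        {
            // Dispose has already run by the time control reaches this handler.
        }

        Console.WriteLine(FakeConnection.OpenCount); // prints 0: nothing leaked
    }
}
```

With the real `SqlConnection` the shape is the same: wrap the `new SqlConnection(...)` in `using` and drop the explicit `Close()` call.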
In my case, the issue was caused by never-ending tasks. Due to misuse of the Reactive Extensions API, I randomly created tasks that never completed. It seems that at some point the task scheduler simply couldn't handle them anymore, although I'm not completely sure about that.
Lesson learned: it seems that by doing bad things in your app code (too many tasks, leaked SQL connections, ...) you can kill the WebApi infrastructure so that it no longer handles requests at any level.
I have developed a server application in C# and a client application in Flash ActionScript 3.0. When the Flash socket is used from a browser, it asks for a policy file by sending the message
<policy-file-request/>
Everything is normal so far. My server waits for this message and sends the client a policy file string like this:
    public const String POLICY_FILE = "<?xml version=\"1.0\"?>\n" +
        "<!DOCTYPE cross-domain-policy SYSTEM \"http://www.adobe.com/xml/dtds/cross-domain-policy.dtd\">\n" +
        "<cross-domain-policy>" +
        "<allow-access-from domain=\"*\" to-ports=\"*\" />" +
        "</cross-domain-policy>\u0000";
this string is being sent this way:
    if (message.Contains("policy-file-request"))
    {
        client.Send(Encoding.ASCII.GetBytes(Statics.POLICY_FILE));
        return;
    }
I'm pretty sure this used to work, and I really don't know what changed to make it stop. Previously, when the Flash client received this message from the server, the connection succeeded and everything went as expected. But now the Flash client waits 20 seconds (the Flash socket timeout) and throws a security exception:
[SecurityErrorEvent type="securityError" bubbles=false cancelable=false eventPhase=2 text="Error #2048"]
I'm stuck and can't move forward. I'm listening on port 963, and the server machine's fully qualified name is "mypc.domain.local", which is accessible across my network. IIS is also running on this machine, and the Flash application is hosted there.
http://mypc.domain.local:90/page.html
This is how I call my Flash application, and
mypc.domain.local:963
is the address of the running server. I am also working on this machine. I tried loading the page as http://localhost:90/page.html or http://127.0.0.1:90/page.html, and also tried connecting to the server as localhost:963 or 127.0.0.1:963. The result is the same for every combination.
What is wrong here? What could have changed to break my previously working code?
Thanks.
It's hard to tell without more code, but based on what you've shown, it appears that when that request comes in, you respond with the contents of the policy file, which isn't an actual valid HTTP response. My guess for the 20 second timeout would be that it's still waiting for the HTTP headers.
If possible, try to use the HTTP classes already in the BCL instead of doing http 'manually', but if you have to do the socket stuff yourself, then use something like Fiddler during debugging since it's great for identifying violations of the HTTP protocol.
All,
I have a WCF web service (let's call it service "B") hosted under IIS using a service account (VM, Windows 2003 SP2). The service exposes an endpoint that uses WSHttpBinding with the default values, except for maxReceivedMessageSize, maxBufferPoolSize, maxBufferSize and some of the timeouts, which have been increased.
The web service has been load tested using the Visual Studio Load Test framework with around 800 concurrent users and passed all tests with no exceptions being thrown. The proxy in the unit test was created from configuration.
There is a SharePoint application that uses the Office SharePoint Server Search service to call web services "A" and "B". The application gets data from service "A" to create a request that is sent to service "B". The response from service "B" is indexed for search. The proxy is created programmatically using the ChannelFactory.
When service "A" takes less than 10 minutes, the calls to service "B" are successful. But when service "A" takes longer (~20 minutes), the calls to service "B" throw the following exception:
Exception Message: An unsecured or incorrectly secured fault was received from the other party. See the inner FaultException for the fault code and detail
Inner Exception Message: The message could not be processed. This is most likely because the action 'namespace/OperationName' is incorrect or because the message contains an invalid or expired security context token or because there is a mismatch between bindings. The security context token would be invalid if the service aborted the channel due to inactivity. To prevent the service from aborting idle sessions prematurely increase the Receive timeout on the service endpoint's binding.
The binding settings are the same on both sides. The clocks on the client server and the web service server are synchronized with the Windows Time service and are in the same time zone.
When I look at the server where web service "B" is hosted, I can see the following security errors being logged:
Source: Security
Category: Logon/Logoff
Event ID: 537
User NT AUTHORITY\SYSTEM
Logon Failure:
Reason: An error occurred during logon
Logon Type: 3
Logon Process: Kerberos
Authentication Package: Kerberos
Status code: 0xC000006D
Substatus code: 0xC0000133
According to some of the blogs online, the status code means STATUS_LOGON_FAILURE and the substatus code means STATUS_TIME_DIFFERENCE_AT_DC, but I have already checked both the server and client clocks and they are synchronized.
I also noticed that the security token seems to be cached somewhere on the client server: they have another process that calls web service "B" using the same service account, and it successfully gets data the first time it is called. Then they start the process that updates the Office SharePoint Server Search service indexes, and it fails. If they then call the first process again, it fails too.
Has anyone experienced this type of problems or have any ideas?
Regards,
--Damian
10 minutes is the default receive timeout. If a proxy sits idle for more than 10 minutes, the server aborts that proxy's security session. Enable logging and you will see this in the server's diagnostics log. The error message you reported fits this behavior.
Search your system diagnostic file for "SessionIdleManager". If you find it, the above is your problem.
Give it a whirl and set the establishSecurityContext="false" for the client and the server.
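For a proxy that is built programmatically (as with the ChannelFactory in the question), the same two knobs can be set in code. This is a hedged sketch, assuming a message-security WSHttpBinding and a reference to System.ServiceModel; `BindingFactory` and the 30-minute value are illustrative choices, not taken from the original setup.

```csharp
using System;
using System.ServiceModel;

public static class BindingFactory
{
    // Builds a WSHttpBinding without a security session and with a longer idle window.
    public static WSHttpBinding Create()
    {
        var binding = new WSHttpBinding(SecurityMode.Message);

        // Code equivalent of establishSecurityContext="false" in configuration:
        // no security context token is negotiated, so there is none to expire.
        binding.Security.Message.EstablishSecurityContext = false;

        // Assumed value for illustration; the default receive timeout is 10 minutes.
        binding.ReceiveTimeout = TimeSpan.FromMinutes(30);

        return binding;
    }
}
```

The client and the service need matching security settings, so the service endpoint's binding would have to be changed in the same way. The receive timeout that governs idle sessions is the one on the service side.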
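Don't call the service operation in a using statement. Instead, use a pattern such as:
    var client = new ServiceClient("Ws<binding>");
    try
    {
        client.Operation(x, y);
        client.Close();
    }
    catch
    {
        // Abort tears the channel down immediately instead of attempting a graceful close.
        client.Abort();
        throw; // rethrow so the caller still sees the failure
    }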
I don't understand why this works but I would guess that when the proxy goes out of scope in the using statement, Close isn't called. The service then waits until receiveTimeout (on the binding) has expired and then aborts the connection causing subsequent calls to fail.
What I believe is happening here is that your channel is timing out (as you suspect).
If I understand correctly, it is not the calls to service A that are timing out, but rather to service B, before you call your operation.
I'm guessing that you are creating your channel before you call service A, rather than just in time (i.e. before calling service B). You should create the channel (proxy, service client) just before you use it like:
    AResponse aResp = null;
    BResponse bResp = null;
    using (ServiceAProxy proxyA = new ServiceAProxy())
    {
        aResp = proxyA.DoServiceAWork();
        // Create the proxy for service B only after service A has returned.
        using (ServiceBProxy proxyB = new ServiceBProxy())
        {
            bResp = proxyB.DoOtherWork(aResp);
        }
    }
    return bResp;
I believe, however, that once you get past that problem (service B timing out), you'll find that the SharePoint app's proxy (the one that called service A) will time out.
To solve that, you may wish to change your service model from request-response to publish-subscribe.
With long-running services, you'll want your SharePoint app to subscribe to service A, and have service A publish its results when it is ready to do so, regardless of how long it takes.
Programming WCF Services (O'Reilly) by Juval Lowy has a great explanation, and IDesign (Juval's company) has published a great set of coding standards for WCF, as well as the code for a great Publish-Subscribe Framework.
Hope this helps,
Assaf.
I actually triggered this error just now by doing something silly. I have a unit test that modifies the system date in order to test some time-based features. And I guess the apparent time difference between when I created the context and when I called my method (because of the changes to the system date), caused something to expire.