I am working on a .NET API that runs inside of a docker container. At some point it makes a call to a Python Flask API that is also running in a container.
var response = await httpClient.GetAsync("http://service-name:8000/actual/url");
which then produces the following error:
System.Net.Http.HttpRequestException: Resource temporarily unavailable
---> System.Net.Sockets.SocketException (11): Resource temporarily unavailable
at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
Has anyone had experience with this before and potentially knows a solution? I can't find much on the web about it at all. I have seen some mentions of the issue potentially being related to the Flask API not using async methods, but that doesn't make sense to me.
The Flask API produces the appropriate responses when accessed through a web browser or Postman using localhost:8000/actual/url and the container logs these responses. I have tried using the localhost URL in the .NET API but that does not work either.
If any more information is needed, please leave a comment and I will do my best to update the post quickly.
-- Christie
TLDR
One cause of the "Resource temporarily unavailable" error is that, during name resolution, the DNS server responds with RCODE 2 (Server failure).
Long answer
I noticed the same behavior in a dotnet application running in a dotnet runtime alpine docker container. Here are the results of my investigation:
The error message "Resource temporarily unavailable" corresponds to the EAGAIN error code, which is returned by various functions in the C standard library. At first I suspected the connect() function, because the C# stack trace indicates that the error happens during the ConnectAsync() call of the C# socket. And indeed the EAGAIN error code appears in the man page of connect() with this description: "No more free local ports or insufficient entries in the routing cache".
I simulated a system with depleted local ports and noticed that a different exception gets thrown in that case, which rules out local port availability as a root cause for the original exception. Regarding the other cause mentioned in the man page, it turns out that the routing cache was removed from Linux in 2012 (commit).
I started to look around for EAGAIN in the source of the musl C library, which is used in the dotnet runtime alpine docker container. After a while I finally noticed the gethostbyname2_r function, which is used for resolving a domain name to an IP address via DNS. During System.Net.Sockets.Socket.ConnectAsync() the hostname is still a string, and the name resolution happens in native code using the gethostbyname2_r function (or one of its variations).
The final question is: when does gethostbyname2_r return the EAGAIN error code? It does so when the RCODE field in the header of the DNS response has the value 2, which stands for "Server failure" (source, line 166).
To verify this result I ran a simple mock DNS server which always returns RCODE 2 in the DNS response. The resulting C# exception, along with the stack trace, matched the original exception exactly.
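A minimal sketch of how one might check whether name resolution is the failing step: resolve the host name explicitly before making the HTTP call and report the socket error. DnsProbe and TryResolveAsync are hypothetical names used only for illustration; Dns.GetHostAddressesAsync and SocketException.SocketErrorCode are standard .NET APIs. The exact error code reported depends on the platform and the C library.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original application.
public static class DnsProbe
{
    // Returns null when the name resolves to at least one address,
    // otherwise a short description of the resolution failure.
    public static async Task<string> TryResolveAsync(string host)
    {
        try
        {
            IPAddress[] addresses = await Dns.GetHostAddressesAsync(host);
            return addresses.Length > 0 ? null : "no addresses returned";
        }
        catch (SocketException ex)
        {
            // A failure here happens before any TCP connection is attempted,
            // so it points at DNS rather than at the target service.
            return $"{ex.SocketErrorCode} ({ex.ErrorCode}): {ex.Message}";
        }
    }
}
```

Calling DnsProbe.TryResolveAsync("service-name") from inside the container just before the GetAsync call would show whether the exception comes from the lookup or from the connection itself.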
Related
I'm using the NCryptoki dll to manage the access to our HSMs.
I use a C# Windows service. This service is a socket: it listens for requests and accesses the HSMs to process them.
Using my code to access the HSM, I randomly get this message:
Cryptware.NCryptoki.CryptokiException: Error n. 145
Only a few calls out of the total get this message, but it is quite annoying. Do you know why this is happening?
I found 145 is 0x00000091 CKR_OPERATION_NOT_INITIALIZED: There is no active operation of an appropriate type in the specified session
I get this error, for example, when I call the find method:
Cryptware.NCryptoki.CryptokiException: Error n. 145 at Cryptware.NCryptoki.CryptokiObjects.Find(CryptokiCollection attList, Int32 nMaxCount)
It seems like the session isn't valid.
Our service is a listening socket. It gets a big load of requests, and a few of them fail with this message. Do you know why?
The weird point is that the same request rarely fails; all the other times it works.
You are most likely not using the PKCS#11 library and PKCS#11 sessions correctly in a multi-threaded environment. See my older answer to a similar question for more details.
Okay, I tried to play a little bit with the StatsManager, but I always got an exception when trying to use anything with it when it comes to:
Set a stat
Get a stat
Because I doubted myself, I had the idea to just use the UWPIntegration sample that is on GitHub. I also added the Leaderboard items to my own project so the code works with my test sandbox. Logging in works as it should; only StatsManager causes the issues.
But as with my own code I just get the same error / exception which is the following. I assume there is a bug in the code provided or the service configuration is not working as intended.
System.AggregateException occurred
HResult=0x80131500
Message=One or more errors occurred.
Source=
StackTrace:
at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at Microsoft.Xbox.Services.XboxLiveHttpRequest.<>c__DisplayClass35_0.<GetResponseWithAuth>b__1(Task`1 getResponseTask) in D:\Data\VisualStudio\Projects\xbox-live-api-csharp\Source\api\XboxLiveHttpRequest.cs:line 117
at System.Threading.Tasks.Task.Execute()
Inner Exception 1: AggregateException: One or more errors occurred.
Inner Exception 2: WebException: The remote server returned an error: (404) Not Found.
Issue was found. My service.config used a wrong parameter name, see below in the comments of the solution.
There are a few different reasons why this might be the case. Not surprisingly, it means the cloud can't find the stat you've requested.
If you use Fiddler, you can capture the call and share with me the correlationID header. If you don't know Fiddler, let me know and I can help you.
However, here are some ideas off the top of my head:
Make sure that you're in development mode - your sandbox is the one from the dev center site. If you aren't sure, you can use the Windows Device Portal to see what your sandbox is - just click on Xbox Live in the left hand navigation.
Make sure you have hit "Test" on the dev center page where you defined your featured stats and leaderboards.
Make sure you are requesting the stat by the ID name you specified in the config window, not the display name.
This kind of question has been asked several times, and I understand why it happens, and probably nothing we can do about it except retry.
I do have one question on name resolution though.
I am using the AWS SDK for .NET (the .NET 3.5 version). I am uploading a big file (>500MB up to 1.5GB, medical images). I call the TransferUtility.Upload() method.
For the most part the program works great.
Occasionally we get this error in the middle of the upload. It usually happens when the internet is slow.
I can catch the exception and retry, which means retrying from the beginning, since the exception happens inside the AWS code.
My question is: if the program has resolved the S3 bucket name and has been uploading for a while, why would it give me a name resolution error instead of just using the cached resolved name?
Does each thread resolve the name independently, and one of the threads is failing since the network is saturated? Is this a computer setting? We were able to reproduce this error pretty consistently on a Windows 10 machine with Charter as ISP, uploading an 800MB file.
The error occurred after about 250MB upload was done.
This is the actual exception
Exception during upload :Amazon.Runtime.AmazonServiceException:
A WebException with status NameResolutionFailure was thrown. --->
System.Net.WebException: The remote name could not be resolved: 'my-bucket.s3.amazonaws.com'
This web exception is telling you that there was an issue with the "Name Resolution". What it doesn't tell you is that the "name" it's referring to is the "RegionEndpoint", for example: USEast1, USEast2 etc.
When using the Amazon.S3.Transfer.TransferUtility it's crucial that the RegionEndpoint you use in the Upload call MATCHES that of the bucket you're uploading into.
In my case using RegionEndpoint.GetBySystemName("USEast1") vs RegionEndpoint.GetBySystemName("US-East-1") was the difference maker.
Another cause for this issue could be DNS resolution. If your system is not able to perform DNS resolves it will give you this same error.
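Since the question already settles on catching the exception and retrying, here is a minimal sketch of such a retry loop with a growing delay. UploadRetry, Retry and IsNameResolutionFailure are hypothetical names, not part of the AWS SDK; the only assumption about the SDK is what the exception above shows, namely that the WebException with status NameResolutionFailure arrives as an inner exception. The helper is synchronous because TransferUtility.Upload() is called synchronously in the question.

```csharp
using System;
using System.Net;
using System.Threading;

// Hypothetical helper, not part of the AWS SDK.
public static class UploadRetry
{
    // Walks the InnerException chain looking for a DNS resolution failure.
    public static bool IsNameResolutionFailure(Exception ex)
    {
        for (Exception e = ex; e != null; e = e.InnerException)
        {
            if (e is WebException web &&
                web.Status == WebExceptionStatus.NameResolutionFailure)
            {
                return true;
            }
        }
        return false;
    }

    // Runs the action, retrying only on name resolution failures and
    // doubling the delay after each failed attempt.
    public static void Retry(Action action, int maxAttempts, TimeSpan initialDelay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        TimeSpan delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return;
            }
            catch (Exception ex) when (attempt < maxAttempts && IsNameResolutionFailure(ex))
            {
                Thread.Sleep(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

A call such as UploadRetry.Retry(() => transferUtility.Upload(filePath, bucketName), 5, TimeSpan.FromSeconds(2)) would restart the whole upload on each attempt, which matches the limitation described in the question. Any other exception, or a failure on the last attempt, propagates to the caller unchanged.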
Essentially, while using the .NET version of the Email Migration v2 Google API, our application is sending too many requests per second to a single Google Apps mailbox/user (more than 1 request per second). A GoogleApiException is returned, which is fine and expected. However, the body of the error message states that a service unavailable (503) error has occurred, yet the "HttpStatusCode" property of that same GoogleApiException instance is equal to an HTTP status code of Gone (410). I will include a code snippet and some log output below. At this point, see the questions section at the bottom or read on for more detail.
What steps will reproduce the problem?
Create a process/application that does the following:
Create a Google.Apis.Admin.email_migration_v2.AdminService object, properly initialize it using your OAuth2.0 credentials.
For each message that needs to be sent to Google Apps, create a Google.Apis.Admin.email_migration_v2.MailResource.InsertMediaUpload instance by calling AdminService.Mail.Insert() on the AdminService object from above, providing proper parameters.
Call MailResource.InsertMediaUpload.UploadAsync while catching any errors that occur.
Do the following:
Begin sending messages at an exponential rate by spinning up hundreds of instances of this process/application, all pointing to the same user and using the same OAuth2.0 credentials.
Sit back, sip your Mountain Dew, and wait for the [503] errors to roll in...
Once errors start rolling in, close down your applications. There is no need to hammer the poor Google servers other than for testing the application's exception handling.
What is the expected output? What do you see instead?
Using an instance of the following type: Google.GoogleApiException
One would expect to see the instance's HttpStatusCode property be equivalent to System.Net.HttpStatusCode.ServiceUnavailable instead of System.Net.HttpStatusCode.Gone
What version of the product are you using?
Google.Apis.Admin.Email_Migration_v2 (1.8.1.20)
What is your operating system?
Windows Server 2008 R2 Enterprise (SP1)
What is your IDE?
Visual Studio 2013 Premium
What is the .NET framework version?
4.0.30319
Please provide any additional information below.
Here is a code snippet of the method being called for uploading purposes:
UploadStatus TryUpload(MailResource.InsertMediaUpload insertMediaUpload)
{
    try
    {
        // Task.Result blocks this thread until the upload has completed.
        IUploadProgress uploadProgress = insertMediaUpload.UploadAsync(_cancellationToken).Result;
        if (uploadProgress != null && uploadProgress.Exception != null)
        {
            // Display additional information on any of the various exceptions that can be returned by the upload call.
            HandleUploadProgressException(uploadProgress);
        }
        return uploadProgress != null ? uploadProgress.Status : UploadStatus.Failed;
    }
    // (catch blocks are not shown in the original post)
}
Here is a code snippet from the method that is displaying the output of the exception.
void HandleUploadProgressException(IUploadProgress uploadProgress)
{
    if (uploadProgress.Exception is GoogleApiException)
    {
        GoogleApiException gApiEx = uploadProgress.Exception as GoogleApiException;
        // Wrap the Google exception, mapping its HttpStatusCode to an internal event.
        throw new vsEventException(
            vsMapEvent.MapHttpError(gApiEx.HttpStatusCode, vsEventMessages.Id.errGmailUnidentifiableGoogleApiException),
            String.Format("GoogleApiException handled in GmailMessenger.HandleUploadProgressException. HttpStatusCode: {0}", gApiEx.HttpStatusCode),
            gApiEx);
    }
}
Here is paraphrased output of the GoogleApiException handled by the HandleUploadProgressException method (note: using a custom logging class, outputting to DebugView):
** Context Info **
Error attempting to write item to Gmail...
** Event Details **
VS-EventID: 30003(errGmailTryUpload) GoogleApiException handled in GmailMessenger.HandleUploadProgressException. HttpStatusCode: Gone
** Inner Exception Details **
The service admin has thrown an exception: Google.GoogleApiException: Google.Apis.Requests.RequestError
Service unavailable. Please try again [503]
Errors [ Message[Service unavailable. Please try again] Location[ - ] Reason[backendError] Domain[global] ]
at Microsoft.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at Microsoft.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccess(Task task)
at Google.Apis.Upload.ResumableUpload`1.d__e.MoveNext()
Questions:
Can anyone shed some light on if this is expected behavior or a bug?
If this is the expected result, how should the application handle a 410-based error? I know that when most, if not all, 4xx errors are encountered, processing of that particular item should stop, but this does not seem to be a local issue; it looks more like a server-side issue.
I appreciate any responses returned, I know it can be hard for questions as specific as this.
I want to monitor a WCF server and send an email notification if the server is down. To accomplish that, I am writing a console app that periodically sends a dummy request to the server and checks whether a response is sent back. When the console app receives an exception, the server has issues, which includes the server being down.
However, the problem is that I receive a different exception depending on the state of the server. Below are the exceptions returned in different states. All of them seem to belong to the "server down" category. Any ideas?
When IIS is turned off
System.ServiceModel.EndpointNotFoundException,
Message:
There was no endpoint listening at http://localhost/service.svc that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details.
Inner Exception Message:The remote server returned an error: (404) Not Found
When a Web.config file is deliberately changed to a wrong name:
System.ServiceModel.ServiceActivationException
Link:
http://localhost/service.svc
Message:
The requested service, 'http://localhost/service.svc' could not be activated. See the server's diagnostic trace logs for more information.
For other unknown reason
System.ServiceModel.ServerTooBusyException
Message:
The HTTP service located at http://localhost/service.svc is too busy.
Message:
The remote server returned an error: (503) Server Unavailable.
Update 1
The exception does NOT always return http status code.
Update 2
Apart from using WCF proxy to call the service, I have to use WebRequest too, as below:
try
{
    WebRequest webRequest = WebRequest.Create(uri);
    webRequest.Method = "GET";
    HttpWebResponse httpWebResponse = (HttpWebResponse)webRequest.GetResponse();
}
catch (Exception ex) // which exception type will tell me the server is down?
{
    // ...
}
The actual content of the error shouldn't really be of consequence, unless you're monitoring individual operations on the service (i.e. should a POST with some data to a particular URL return a specific response). Realistically, then, you're just going to be looking at the status code itself, and for that you want to look through all the HTTP status codes and pick out those which look like errors as far as you're concerned.
As a good starting point - you might want to consider nearly all of the 5xx codes; as they are all connected with server errors.
You might also want to consider some of the 4xx codes (although these are usually connected with clients, so be ruthless). In particular:
400 - Bad Request - so long as you can be sure that the server should be able to understand the request
404 - Not Found - if you're sure that the given URL should be present
405 - Method Not Allowed - if you're sure that the given HTTP verb should be supported (e.g. a POST or DELETE)
Some of the narrower 4xx codes, e.g. 413 Request Entity Too Large or 414 Request-URI Too Long, could conceivably appear after days or months of normal operation due to things like security updates. In that case you're not necessarily identifying that the service is down as such, but you might be anticipating it being unable to perform its intended function.
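As a rough sketch of the status-code approach described above, the check could look like the following. ServiceHealth, LooksDown and Probe are hypothetical names, and the set of 4xx codes treated as "down" (400, 404, 405) simply mirrors the list in this answer; adjust it to what your service should support. A missing response (no HTTP status at all) is treated as down, which covers the case from Update 1 where the exception carries no status code.

```csharp
using System;
using System.Net;

// Hypothetical helper for the monitoring console app.
public static class ServiceHealth
{
    // statusCode is null when no HTTP response was received at all.
    public static bool LooksDown(HttpStatusCode? statusCode)
    {
        if (statusCode == null) return true;

        int code = (int)statusCode.Value;
        if (code >= 500 && code <= 599) return true;    // server-side errors
        return code == 400 || code == 404 || code == 405; // selected client-side codes
    }

    // Sends a GET request and returns true when the service looks healthy.
    public static bool Probe(string uri)
    {
        try
        {
            var request = (HttpWebRequest)WebRequest.Create(uri);
            request.Method = "GET";
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return !LooksDown(response.StatusCode);
            }
        }
        catch (WebException ex)
        {
            // For 4xx/5xx answers GetResponse() throws, and the response
            // (if any) is attached to the exception.
            var response = ex.Response as HttpWebResponse;
            return !LooksDown(response?.StatusCode);
        }
    }
}
```

Keeping the classification in LooksDown separate from the request makes it easy to reuse the same rule for status codes extracted from WCF exceptions.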
Any HTTP status result code in the 400 or 500 series is a problem that will prevent your request from processing. All of these errors derive from System.ServiceModel.CommunicationException, so check for that.