Extending SoapHttpClientProtocol to correct faulty server Content-Length


Recently I needed to access a web service created using SOAP::Lite. It's really messy to use since there's no WSDL and it doesn't return reasonable data types, so I started out using the provided sample code.
Right from the start I had problems: requests were timing out, sometimes often and sometimes rarely, but never with no failures at all. After sniffing the traffic with Fiddler and searching around, it seems there is (or was) a bug in SOAP::Lite that miscalculates the Content-Length header when dealing with UTF-8 encoded data. This seems plausible, since my analysis points to the timeouts being caused by the client waiting for more data (as promised by Content-Length) while the server had already finished sending (the real data).
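As a side illustration of why Content-Length and UTF-8 can disagree: the header counts bytes, not characters, so any code that derives it from a character count is off as soon as the body contains non-ASCII text. Below is a minimal C# sketch. The sample string and the helper name Utf8ByteCount are made up for illustration, and whether this is exactly the mistake SOAP::Lite makes is an assumption.

```csharp
using System;
using System.Text;

public static class ContentLengthDemo
{
    // Number of bytes the body occupies on the wire when encoded as UTF-8.
    // This is the value a correct Content-Length header would carry.
    public static int Utf8ByteCount(string body)
    {
        return Encoding.UTF8.GetByteCount(body);
    }

    public static void Main()
    {
        // \u00FC is a precomposed "u with umlaut": one character, two UTF-8 bytes.
        string body = "<name>M\u00FCller</name>";

        Console.WriteLine("characters:  " + body.Length);          // 19
        Console.WriteLine("UTF-8 bytes: " + Utf8ByteCount(body));  // 20
    }
}
```

For pure ASCII bodies the two numbers are equal, which would fit the observation that the timeouts only show up some of the time.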
So now I need a way to counter this erroneous header field and either:
Provide the correct Content-Length or
Pad the payload to match the Content-Length
The problem is that I never get the chance to use a SoapExtension or any other modification, since Invoke() eventually throws an IOException or WebException before parsing commences. Also, the web service is not mine and, I presume, pretty much unchangeable.
I also tried overriding SoapHttpClientProtocol.GetWebResponse() to do an async request, but that didn't help either: I couldn't get hold of the response stream before calling HttpWebRequest.EndGetResponse(), and that call always threw an exception.
Does anyone have an idea how I could approach this?
UPDATE: By now I've also tried WCF and came across this post at MSDN; the answer is not very uplifting. Basically, this happens far too deep in the plumbing to be accessible from user code. My best bet now seems to be a Fiddler script that corrects the Content-Length header, which is perhaps not trivial since this web service is only available over HTTPS.
/Dan

Related

AspNet.Core returns 200 OK and invalid Json if there is an exception while iterating an IEnumerable (returned from controller)

It seems that ASP.NET Core starts sending a response whose body is an IEnumerable right away, without first iterating over the whole collection. E.g.:
    [HttpGet("")]
    public async Task<IActionResult> GetData()
    {
        IEnumerable<MyData> result = await _service.GetData();
        return Ok(result.Select(_mapper.MapMyDataToMyDataWeb));
    }
Now an exception happens during the mapping of one of the elements, so I would expect a 500 response, but what actually happens is that I get a 200 with only partial (and invalid) JSON.
I assume this behavior is a feature and not a bug in ASP.NET Core, and it is relatively easy to fix by calling e.g. ToList(). Still, I am wondering whether there is some kind of flag that prevents this situation, since the behavior does not really make sense for e.g. an API project with standard JSON responses.
I was not able to find anything in documentation that describes this behavior and how to prevent it.
P.S. I have verified that calling ToList() fixes the issue and the response is 500 with correct exception (with UseDeveloperExceptionPage)
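The difference between the two variants can be reproduced without ASP.NET Core at all. Here is a minimal sketch, assuming a hypothetical Map function that stands in for MapMyDataToMyDataWeb and fails on one element. Select only builds a lazy query, so nothing throws while the result is being built; ToList() enumerates immediately, so the exception surfaces at that point (in the controller, that would be while the action is still running and no headers have been sent).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredMappingDemo
{
    // Hypothetical mapper that fails on one element.
    public static string Map(int value)
    {
        if (value == 3) throw new InvalidOperationException("cannot map 3");
        return "item-" + value;
    }

    // Returns true if merely building the result throws,
    // i.e. before any consumer has started enumerating it.
    public static bool ThrowsWhileBuilding(Func<IEnumerable<string>> build)
    {
        try
        {
            build();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        int[] source = { 1, 2, 3, 4 };

        // Lazy: Map has not run yet, so no exception here.
        Console.WriteLine(ThrowsWhileBuilding(() => source.Select(x => Map(x))));

        // Buffered: ToList() runs Map for every element right away.
        Console.WriteLine(ThrowsWhileBuilding(() => source.Select(x => Map(x)).ToList()));
    }
}
```

This prints False and then True: in the lazy case the failure is postponed until something (here, the JSON serializer) walks the sequence.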

It seems that this is actually "by design"; the issue has been raised a few times on the ASP.NET Core GitHub repository.
What happens is that the headers with the 200 status have already been sent, while the body has not. While I would think that enumeration should complete before the headers are sent, the ASP.NET team says that this would use more resources on the server, which is why it works the way it does.
Here is a quote:
It is very likely the case that your exception is thrown while writing
to the body, after headers have already been sent to the client, so
there's no take-backs on the 200 that was already sent as part of the
response. The client will see an error because the body will come back
as incomplete.
If you want to deterministically report a 500 when this happens you'll
need to either:
Buffer your IEnumerable as part of the action (.ToList())
Buffer the response body - https://github.com/aspnet/BasicMiddleware/tree/dev/src/Microsoft.AspNetCore.Buffering
Obviously both of these things require more server-side resources,
which is why we don't have this kind of behavior by default.
I can confirm that this solution worked:
Reference Microsoft.AspNetCore.Buffering package
Write app.UseResponseBuffering() before app.UseMvc()

Is it good practice to use HttpStatusCode for application errors? [closed]

We are currently in the process of rebuilding the internal company structure and are talking to a supplier (of a machine) who offers a Http-based interface (if we pay of course). I am not allowed to disclose too much but the interface is supposed to perform an action and it's possible that this action might fail for various reasons. I'll give an example
public HttpResponseMessage MoveItem(int item, string option)
Now this call can either succeed or fail for (for example) 3 reasons:
Succeed: It worked, the machine is in the state I want it to be
Reason A: The machine was currently performing an operation
Reason B: The item was unavailable
Reason C: Another issue occurred
Now the supplier offered the following return values for this function call:
Succeed: HttpStatusCode.OK
Reason A: HttpStatusCode.Conflict (409)
Reason B: HttpStatusCode.NoContent (204)
Reason C: HttpStatusCode.InternalServerError (500)
Each of those would be accompanied by a human readable message.
Now I have a feeling that this is kind of a strange approach. As I understand it (and please correct me if I'm wrong!), our supplier is actually misusing HTTP status codes for application-side errors. If it were up to me, I would always throw an exception if the call fails and provide detailed info within that exception. Yet I have to admit that I haven't worked with HttpClient until today, so my knowledge is kind of limited at the moment (this will change, no worries).
As for the main question: is our supplier's approach considered good practice nowadays with the use of HttpClient and is our supplier right or is my feeling right and this seems a little like mis-using Http code as application error codes?

Using http codes to indicate the type of error is common, but using codes other than 5xx to indicate problems within the server is not.
In general, my inclination on the specifics aligns with yours, that their error codes are suspect.
It's difficult to say for sure without more internal details, and a lot of decisions around HTTP return codes are judgement calls, so I can't give a definitive correct answer here. That said, I compared your descriptions to those in RFC 2616:
Reason A: The machine was currently performing an operation
If "currently performing an operation" means an operation on the same resource, and this creates some sort of conflict, then 409 seems appropriate to me. If the machine is simply busy, that sounds like exactly what 503 is supposed to indicate.
10.5.4 503 Service Unavailable
The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. If no Retry-After is given, the client SHOULD handle the response as it would for a 500 response.
10.4.10 409 Conflict
The request could not be completed due to a conflict with the current state of the resource. This code is only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request. The response body SHOULD include enough information for the user to recognize the source of the conflict. Ideally, the response entity would include enough information for the user or user agent to fix the problem; however, that might not be possible and is not required.
Conflicts are most likely to occur in response to a PUT request. For example, if versioning were being used and the entity being PUT included changes to a resource which conflict with those made by an earlier (third-party) request, the server might use the 409 response to indicate that it can't complete the request. In this case, the response entity would likely contain a list of the differences between the two versions in a format defined by the response Content-Type.
Reason B: The item was unavailable
A 204 might be adequate if the server wants to indicate that a request succeeded, but there was nothing found. If unavailability is a temporary situation, I would think a 5xx or 3xx might be more appropriate, since 5xx can indicate a temporary (internal) problem, and 3xx indicates the client needs to take additional action.
10.2.5 204 No Content
The server has fulfilled the request but does not need to return an entity-body, and might want to return updated metainformation. The response MAY include new or updated metainformation in the form of entity-headers, which if present SHOULD be associated with the requested variant.
If the client is a user agent, it SHOULD NOT change its document view from that which caused the request to be sent. This response is primarily intended to allow input for actions to take place without causing a change to the user agent's active document view, although any new or updated metainformation SHOULD be applied to the document currently in the user agent's active view.
The 204 response MUST NOT include a message-body, and thus is always terminated by the first empty line after the header fields.
10.3 Redirection 3xx
This class of status code indicates that further action needs to be taken by the user agent in order to fulfill the request. The action required MAY be carried out by the user agent without interaction with the user if and only if the method used in the second request is GET or HEAD. A client SHOULD detect infinite redirection loops, since such loops generate network traffic for each redirection.
Reason C: Another issue occurred
This looks legit. 500 is often used as a catch-all for "some sort of problem we didn't think to handle".

If you throw an exception everything comes back to the caller as a 500 and the only way to distinguish why the error occurred is by parsing the error message. That's not a great way to handle things. By sending back different HTTP status codes the calling application can take action based on the feedback from the service. For example, if you get back a 409 you might wait a few seconds and try again, whereas if you get back a 500 there's no point in trying again.
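To make that concrete, here is a minimal sketch of how a caller might branch on the supplier's proposed codes. The enum MoveItemOutcome and the helpers Interpret and ShouldRetry are hypothetical names, not part of the supplier's interface, and retrying only on 409 is an assumption taken from the example above.

```csharp
using System;
using System.Net;

// Hypothetical application-level view of the supplier's responses.
public enum MoveItemOutcome { Succeeded, MachineBusy, ItemUnavailable, Failed }

public static class MoveItemClient
{
    // Maps the status codes proposed by the supplier to an outcome.
    public static MoveItemOutcome Interpret(HttpStatusCode status)
    {
        switch (status)
        {
            case HttpStatusCode.OK:        return MoveItemOutcome.Succeeded;       // 200
            case HttpStatusCode.Conflict:  return MoveItemOutcome.MachineBusy;     // 409, Reason A
            case HttpStatusCode.NoContent: return MoveItemOutcome.ItemUnavailable; // 204, Reason B
            default:                       return MoveItemOutcome.Failed;          // 500 and anything else
        }
    }

    // Only a busy machine is worth waiting for and trying again.
    public static bool ShouldRetry(MoveItemOutcome outcome)
    {
        return outcome == MoveItemOutcome.MachineBusy;
    }

    public static void Main()
    {
        MoveItemOutcome outcome = Interpret(HttpStatusCode.Conflict);
        Console.WriteLine(outcome + ", retry: " + ShouldRetry(outcome));
    }
}
```

With a real HttpResponseMessage the argument would simply be response.StatusCode; no parsing of the human-readable message is needed to pick the branch.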
My one potential beef would be that they used 204 instead of 404 to indicate that something was not found, but that's assuming that "item was unavailable" is the same thing as "item was not found/located." I may be misunderstanding the semantics of "unavailable" in which case my comment would be moot.
UPDATE
Comments and other posts talk about "hijacking" the error codes. If this were a true REST implementation I would agree. But it's not. It's more of an RPC call. Quite frankly, the vast majority of so-called "REST" APIs that I've encountered are simply JSON-over-HTTP (i.e. basically SOAP without the ceremony and substituting JSON for XML).
Given all that, the vendor's implementation seems fine to me. They are using discrete codes to indicate the status of the operation. As long as those codes remain consistent from release to release it's effectively the same thing. Any argument beyond that is academic and a matter of personal preference (vague warnings about how "it will cause problems the further you go into the project" notwithstanding).

I agree with you that this way of doing things is kind of kludgy. The http status codes are designed for the http resources, that is for handling the status of the request for a resource, not for passing back the status of some physical machine being controlled by that resource. It is the status of the transport mechanism. Because of that, they are going to be using some codes to mean things they are not meant for. If your client application consumes those codes the way they are using them, then I guess it works, but it is definitely a kludgy way of doing things, and it will cause problems the further you go into the project.
I would, like you, send back a custom exception and detail the problem in that exception. Some comments objected to having to drill down into the exception to get the actual error data. But that is a non-issue. The client application would basically be getting a success or fail, and on fail, the details of that fail. It's not like getting the details of an exception is any less efficient than getting a status code.

C# HttpClient Post choosing the right Timeout

It is valid behavior that an HTTP (TCP) request can get lost without the listeners being informed. See here for the discussion on that:
C# httpClient (block for async call) deadlock
Problem
We are using HttpClient.PostAsJsonAsync to upload a Json File to a server. However in worst case scenarios this upload can take several hours.
That's why just using HttpClient.Timeout does not work for us: it is a hard timeout, and we would need to set it to a huge value.
So what do we do when the TCP connection is gone and the client does not detect that? With our huge timeout we are stuck for a long time. Is there any other timeout we can use in such cases? Any other ideas or best practices?
I also looked into TCP socket keep-alive, but that doesn't seem to be an option.

After some research, I finally found an article which describes the issue and provides a workaround:
http://www.thomaslevesque.com/2014/01/14/tackling-timeout-issues-when-uploading-large-files-with-httpwebrequest/
According to this article, there is a design flaw in HttpWebRequest, which I was able to reproduce. It seems ridiculous that the timeout also affects the upload.
However, I can live with the provided workaround (WebRequestExtensions) since our code is synchronous anyway.

HttpListener : writing to outputstream slow depending on content?

I removed the old question and rewrote it completely, because I've worked on this quite a bit to pinpoint the problem. I'm writing a custom CMS with a custom server, with very high speed/throughput as a goal. However, I'm noticing that some data or data patterns cause major slowdowns (response time goes from 0 to 55+ ms). I really need help from someone more experienced, as I'm absolutely clueless about what is going on. I suspect a bug in the .NET Framework but have no idea where it could be; the little .NET code browsing I did didn't suggest that the output stream does anything data-specific.
Things that i've tested and am sure aren't the issue:
Size of the content (larger content is faster)
Type of the content (difference between the same content types)
Most of the surrounding code (made a minimalist project to reproduce the bug, standing at around 15 lines, find the link at the bottom of the post, includes data to reproduce it, run it, test with 2 URL, see for yourself).
Not an issue with webpages / cache etc, issue reproduced with a single image and CTRL+F5 in Firefox, removing the last few bytes of the image fixes it 100% of the time, adding them back causes the issue again
Not an issue that exists outside of the outputstream (replacing it with a target memorystream doesn't show the issue)
How to reproduce the issue:
Download & run the project
Use your favorite browser and go to localhost:8080/magicnumber
replace magicnumber in that url by what you want, you will receive the image back minus that amount of bytes
My result:
Constant 50ms or so with that image
Getting the magic number up to 1000 doesn't affect this at all
A bit further (I think around 1080 or so?) it suddenly drops to 0 ms
I'm not sure what is going on, but it seems there are 2 requests per request, at least when using CTRL+F5 in Firefox. In the correct case both take 0 ms; in the error case the first remains at 0 ms but the other becomes 50 ms. I'm assuming the first one is simply checking whether the file cache is OK and I'm still answering, but Firefox closes the connection or something?
Any help is much appreciated, placing all my rep on Bounty there as i really need to know if i go down this path / get more info to report this or if i go lower level and do my own http.sys interop (and, most of all, if the bug is only on the .net side or lower level & going lower level won't fix it!)
The sample file is a gzipped array; my content is pre-cached and pre-compressed in the DB, so this is representative of the data I need to send.
https://www.dropbox.com/s/ao63d7din939new/StackOverFlowSlowServerBug.zip
Edit: If I have Fiddler open, the problematic test goes back to 0 ms, and I'm not sure what to make of it. So far this means I'm getting a major slowdown when sending some data, which isn't determined by the type of data but by the actual data, and this doesn't happen if I have Fiddler in between. I'm at a loss!
Edit 2: I tested with another browser just to be sure and it's actually back to 0 ms in IE, so I'm assuming it may not be an HttpListener bug but a Firefox bug instead. I will edit my question and tags toward that if no one suggests otherwise. If this is the case, does anyone know where I should be looking in Firefox's code to understand the issue? (It definitely is an issue even if it's on their side, since once again I'm comparing 2 files, one larger than the other, same file format: the larger one always takes 0 ms, the smaller one always 55 ms!)

Two requests problem:
Chrome:
First request = favicon
Second request = image
Firefox:
First request = image for the tab
Second request = image
More on this:
http://forums.mozillazine.org/viewtopic.php?t=341179
https://bugzilla.mozilla.org/show_bug.cgi?id=583351
IE:
Only appears to make the one request
If you send the requests through fiddler you never get two coming through.
Performance problem:
Firstly, there's a problem with the timer in your demo app. It's restarted every time the async request handler fires, meaning that the timer started for request A will be restarted when request B is received, possibly before request A is finished, so you won't be getting the correct values. Create the stopwatch inside the ContinueWith callback instead.
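A minimal sketch of the per-request stopwatch follows. It does not use HttpListener: the incoming Task<int> is a stand-in for the task returned by the listener, and Thread.Sleep simulates writing the response, so the names and delays here are assumptions for illustration only.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class PerRequestTiming
{
    // 'incoming' stands in for the task that completes when a request arrives;
    // its result is the number of milliseconds of simulated work.
    public static Task<long> HandleAsync(Task<int> incoming)
    {
        return incoming.ContinueWith(t =>
        {
            // Local stopwatch: each request measures only itself,
            // so a later request cannot restart an earlier one's timer.
            Stopwatch stopwatch = Stopwatch.StartNew();
            Thread.Sleep(t.Result); // simulated response writing
            stopwatch.Stop();
            return stopwatch.ElapsedMilliseconds;
        });
    }

    public static void Main()
    {
        Task<long> a = HandleAsync(Task.FromResult(200)); // slow request A
        Task<long> b = HandleAsync(Task.FromResult(20));  // fast request B, overlaps with A
        Task.WaitAll(a, b);
        Console.WriteLine("A: " + a.Result + " ms, B: " + b.Result + " ms");
    }
}
```

With a single shared stopwatch, request B arriving mid-way would reset the measurement for A; here A still reports its full duration.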
Secondly, I can't see any way that "magicnumber" would really affect performance (unless it causes an exception to be thrown, I guess). The only way I can cause performance to degrade is by issuing a lot of concurrent requests and causing the wait lock to be continually hit.
In summary: I don't think there's a problem with the HttpListener class

How to implement a SoapHttpClientProtocol

This question tells me that SoapHttpClientProtocol is not thread safe, and my real-life testing confirms it: my SoapHeader properties keep getting mixed up between calls. Is there a way to make sure that I can use this across threads and keep my properties correct, and to make sure I don't run into the example given in that question, where one thread thinks the connection is open when another thread has closed it? Do I need to worry about the SOAP header values after my request has been made? How can I verify that the properties stay as I set them until the request has been issued?

The first thing I would ask is does your service work correctly if you do not make it multi-threaded. If you make subsequent calls do they all work correctly and give you the desired results? If not then there is a problem on the server side more than likely.
To see what you are sending you could serialize down the soap message before it goes. Make sure it's getting generated correctly.
My job blocks access to a lot of websites but CodeProject has some examples if I remember correctly.
If the single thread works have the serialization layer in place and have it write the files to disk in your multi-threaded scenario. Then you can see what is working and what is not by what your code thinks it's sending.
More than likely your calls are getting mixed up by the server, since you are trying to establish multiple connections while it may be seeing your endpoint as one value, kind of like being behind a NAT firewall. That means you may be getting a connection, but one of your other threads gets its message through first. If that is the case, you could try spinning each thread up in its own app domain and see if it does anything for you. I'm not saying that it will work, but I'm not sure off the top of my head what else may be available for you to try.
