We have some resources that contain links to external sites. We want to avoid dead links, so we have implemented a ping routine written in C# on .NET 6.
We loop through all the links and send a HEAD and a GET request with HttpClient. Most sites return 200 OK, but some return 400 Bad Request, 403 Forbidden and so forth. Yet if we open the same link in a browser, the site works as expected.
If we get a 404 we mark the link as dead so that someone can update it manually. We have already added a User-Agent header to the HttpClient.
How can we avoid the bad requests returned to the httpclient?
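A minimal sketch of such a check, assuming .NET 6 and HttpClient as described above. The names LinkChecker, LinkState, Classify, BuildRequest and CheckAsync, the header values, and the choice to treat only 404 and 410 as dead are illustrative assumptions, not the original routine. Browser-like request headers may help with sites that reject bare clients, but some sites will still refuse automated requests, so refusals such as 400 or 403 are reported here as inconclusive rather than dead.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public enum LinkState { Alive, Dead, Inconclusive }

public static class LinkChecker
{
    // Only a definite "not there" answer counts as dead. Refusals such as
    // 400, 403 or 429 show that the host is up but rejected this client.
    public static LinkState Classify(HttpStatusCode status)
    {
        int code = (int)status;
        if (code >= 200 && code < 400) return LinkState.Alive;
        if (status == HttpStatusCode.NotFound || status == HttpStatusCode.Gone) return LinkState.Dead;
        return LinkState.Inconclusive;
    }

    // Header values are assumptions chosen to resemble a browser request.
    public static HttpRequestMessage BuildRequest(HttpMethod method, Uri url)
    {
        var request = new HttpRequestMessage(method, url);
        request.Headers.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (compatible; LinkChecker/1.0)");
        request.Headers.TryAddWithoutValidation("Accept", "text/html,application/xhtml+xml,*/*;q=0.8");
        request.Headers.TryAddWithoutValidation("Accept-Language", "en-US,en;q=0.9");
        return request;
    }

    public static async Task<LinkState> CheckAsync(HttpClient client, Uri url)
    {
        try
        {
            using var headRequest = BuildRequest(HttpMethod.Head, url);
            using var head = await client.SendAsync(headRequest);
            if (head.IsSuccessStatusCode) return LinkState.Alive;

            // Some servers reject HEAD, so retry with GET but read only the headers.
            using var getRequest = BuildRequest(HttpMethod.Get, url);
            using var get = await client.SendAsync(getRequest, HttpCompletionOption.ResponseHeadersRead);
            return Classify(get.StatusCode);
        }
        catch (HttpRequestException)
        {
            return LinkState.Inconclusive; // DNS, TLS or connection failure: retry later
        }
        catch (TaskCanceledException)
        {
            return LinkState.Inconclusive; // timeout
        }
    }
}
```

With this split, only links classified as Dead would go to the manual queue, while Inconclusive links could be retried later or spot-checked in a browser.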
Related
We had a little mishap where an https binding was created for a website without a hostname (just an ip), and then another website was created with only an http binding to a hostname using the same ip as the first site.
So the problem is that when you navigate to the 2nd site over https, instead of getting an error you are served the first website. As a result, Google was able to access the first site through the 2nd site's host name over https, and now we have lots of duplicate links out in Google land.
I've already stopped the bleeding, but now I need to 301 all the bad links that were created by Google for the 2nd site. My plan is that, going forward, anytime a 404 error is encountered in the 2nd site then it will call for just the header from the same link on the 1st site. If the header returns with an OK status then it will do a permanent redirect to the 1st site.
There's just one part of that plan I don't know how to do off the top of my head... what's the best way to intercept the 404s in such a way I can run my code to determine whether it should be 301'd or not?
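The check described in the plan (ask the 1st site for just the headers of the same path, and redirect only on an OK status) can be sketched independently of where the 404 is intercepted. This is a rough sketch under assumptions: DuplicateLinkRedirector, MapToPrimary and FindRedirectTargetAsync are hypothetical names, and the hook that calls it (custom error page, module or middleware) depends on the hosting stack and is not shown.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class DuplicateLinkRedirector
{
    // Rebuilds the requested path and query on the 1st (primary) site's host.
    public static Uri MapToPrimary(Uri primaryBase, string pathAndQuery)
    {
        return new Uri(primaryBase, pathAndQuery);
    }

    // Returns the 301 target if the primary site serves the page, otherwise null.
    public static async Task<Uri> FindRedirectTargetAsync(
        HttpClient client, Uri primaryBase, string pathAndQuery)
    {
        Uri candidate = MapToPrimary(primaryBase, pathAndQuery);
        using var request = new HttpRequestMessage(HttpMethod.Head, candidate);
        try
        {
            using var response = await client.SendAsync(request);
            return response.IsSuccessStatusCode ? candidate : null;
        }
        catch (HttpRequestException)
        {
            return null; // primary site unreachable: fall through to the normal 404
        }
    }
}
```

A caller would issue a permanent redirect to the returned Uri when it is not null and serve the usual 404 page otherwise.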
I'm writing an app to ensure my website is always up to date with our supplier's products. I can get the categories but not the sub-categories.
Basically, a web request for "xxxx/products/8-propagation/?sub-category=96" always returns "xxxx/products/8-propagation/". I used the Firefox console to see which headers are sent when browsing; I didn't see anything unusual, but I emulated them anyway.
Is there any way to retrieve PHP requests from URLs, or is this something server-side only?
I have tried numerous ways of doing this, all the same result.
Show us your server-side code. I think this is a problem with the routing in your controller.
I have written an application that starts with making a WCF call to login. I generated the client code with a service reference. It works fine for clients who have their services installed locally to their network. However, there is a saas environment, as well, where these same services are controlled by the corporate powers that be. In the saas environment, I was informed that the login was failing. Investigating using Fiddler, I found that the service call to login is returning HTML, specifically, the web page listing all the available methods from the .asmx.
The saas environment has one little quirk which may be causing the problem here, but I don't know how to verify that this is the problem, nor how to solve it if it is the problem. The quirk is that the server redirects (302) the call.
The client code:
client.Endpoint.Address = new EndpointAddress("http://" + settings.MyURL + "/myProduct/webservices/webapi.asmx");
client.DoLogin(username, password);
The raw data sent to the server, before the redirect, includes the s:Envelope XML tag. Notice the missing s:Envelope XML tag when sending to the redirected server:
GET https://www.myurl.com/myProduct/webservices/webapi.asmx HTTP/1.1
Content-Type: text/xml; charset=utf-8
VsDebuggerCausalityData: uIDPo7TgjY1gCLFLu6UXF8SWAoEAAAAAQxHTAupeAkWz2p2l3jFASiUPHh+L/1xNpTd0YqI2o+wACQAA
SOAPAction: "http://Interfaces.myProduct.myCompany.com/DoLogin"
Accept-Encoding: gzip, deflate
Host: www.gotimeforce2.com
Connection: Keep-Alive
How do I get this silly thing working?
Edit: It is worth noting that I am using WCF/svcutil.exe/service-reference rather than the older ASMX/wsdl.exe/web-reference. Otherwise, for future readers of this topic, the wsdl solution suggested by Raj would have been a great solution. If you are seeing this issue and are using the wsdl technique, see Raj's excellent answer.
Edit2: After doing a bunch of research into WCF and 302, it sounds like they just don't play well together, nor does there appear to be a simple way of giving the WCF api custom code to handle the situation. As I have no control over the server, I have sucked it up and re-generated my api as a web-reference and am using Raj's solution.
Edit3: Updated the title to better reflect the solution, now that the cause of the issue is understood. Original title: Why would WCF not include s:Envelope on a redirect?
OK, so I did some digging on this and tried to replicate the issue on my side. I was able to replicate it and find a solution as well. However, I'm not sure how well this will apply in your case, since it depends on working with the server team that manages the load balancer. Here are the findings.
Looking at http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html you notice the following addendum in the explanation for HTTP Status Codes 302 and 303.
302 Found
Note: RFC 1945 and RFC 2068 specify that the client is not allowed
to change the method on the redirected request. However, most
existing user agent implementations treat 302 as if it were a 303
response, performing a GET on the Location field-value regardless
of the original request method. The status codes 303 and 307 have
been added for servers that wish to make unambiguously clear which
kind of reaction is expected of the client.
303 See Other
Note: Many pre-HTTP/1.1 user agents do not understand the 303
status. When interoperability with such clients is a concern, the
302 status code may be used instead, since most user agents react
to a 302 response as described here for 303.
Further looking at http://en.wikipedia.org/wiki/List_of_HTTP_status_codes you notice the following explanation for the HTTP status codes 302, 303 and 307.
302 Found :
This is an example of industry practice contradicting the standard. The HTTP/1.0 specification (RFC 1945) required the client to perform a temporary redirect (the original describing phrase was "Moved Temporarily"), but popular browsers implemented 302 with the functionality of a 303 See Other. Therefore, HTTP/1.1 added status codes 303 and 307 to distinguish between the two behaviors. However, some Web applications and frameworks use the 302 status code as if it were the 303.
303 See Other (since HTTP/1.1):
The response to the request can be found under another URI using a GET method. When received in response to a POST (or PUT/DELETE), it should be assumed that the server has received the data and the redirect should be issued with a separate GET message.
(Figure: the basic flow in a normal client/server interaction.)
307 Temporary Redirect (since HTTP/1.1):
In this case, the request should be repeated with another URI; however, future requests should still use the original URI. In contrast to how 302 was historically implemented, the request method is not allowed to be changed when reissuing the original request. For instance, a POST request should be repeated using another POST request.
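This explains the behavior of the WCF call. On the 302 redirect the client acts as if it had received a 303 and reissues the request as a GET, so the POST body, including the s:Envelope, is dropped. The redirected server answers the GET with the .asmx help page, and the call fails on the client side.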
The easiest way of fixing this is to have the server return a 307 Temporary Redirect instead of a 302 Found status code in the response. Which is where you need the help of the Server Team that manages the redirect rules on the load balancer. I tested this out locally and the client code consuming the service with the Service Reference seamlessly executes the call even with the 307 Temporary Redirect.
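The difference between the status codes can be summarised as a small lookup. This is only a sketch of the common client behavior described in the notes quoted above, not the actual WCF implementation; RedirectRules and FollowUpMethod are hypothetical names.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class RedirectRules
{
    // Returns the method a typical client uses for the follow-up request.
    public static HttpMethod FollowUpMethod(HttpStatusCode status, HttpMethod original)
    {
        switch ((int)status)
        {
            case 301:
            case 302:
                // Historical behavior: a POST becomes a GET and its body is dropped.
                return original == HttpMethod.Post ? HttpMethod.Get : original;
            case 303:
                // See Other: always fetch the new location with GET (HEAD stays HEAD).
                return original == HttpMethod.Head ? HttpMethod.Head : HttpMethod.Get;
            case 307:
            case 308:
                // The method, and therefore the body, must be preserved.
                return original;
            default:
                throw new ArgumentException("Not a redirect status handled by this sketch.", nameof(status));
        }
    }
}
```

Under these rules a SOAP POST that hits a 302 turns into a body-less GET, while the same POST that hits a 307 is repeated unchanged, which matches the behavior observed above.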
In fact, you could test all of this with the solution I've uploaded to GitHub. I've updated it to illustrate the use of a service reference instead of a wsdl-generated proxy class to consume the asmx service.
However if the change from 302 Found to 307 Temporary Redirect is not feasible in your environment, then I would suggest using either Solution 1 (which shouldn't have a problem whether it is a 302 or 307 status code in the response) or using my original answer which would resolve this by directly accessing the service at the correct URL based on the setting in the config file. Hope this helps!
Solution 1
If you do not have access to the config files on production, or if you just plain don't want to use multiple URLs in the config file, you could use the following approach. A sample solution is in the GitHub repo.
Basically, if you look at the file auto-generated by wsdl.exe, you will notice that the service proxy class derives from System.Web.Services.Protocols.SoapHttpClientProtocol. This class has a protected method System.Net.WebRequest GetWebRequest(Uri uri) that you can override. In there you can add a method that checks whether HttpWebRequest.GetResponse() returns a 302 redirect. If so, you can set the Url to the new URL returned in the Location header of the response, as follows.
this.Url = new Uri(uri, response.Headers["Location"]).ToString();
So create a class called SoapHttpClientProtocolWithRedirect as follows.
using System;
using System.Net;

public class SoapHttpClientProtocolWithRedirect :
    System.Web.Services.Protocols.SoapHttpClientProtocol
{
    // Set once the redirect has been resolved, so the probe runs only once per proxy instance.
    private bool _redirectFixed = false;

    protected override System.Net.WebRequest GetWebRequest(Uri uri)
    {
        if (!_redirectFixed)
        {
            FixRedirect(new Uri(this.Url));
            _redirectFixed = true;
            return base.GetWebRequest(new Uri(this.Url));
        }
        return base.GetWebRequest(uri);
    }

    // Probes the service URL without following redirects and, if the server
    // redirects, points this.Url at the Location it returned.
    private void FixRedirect(Uri uri)
    {
        var request = (HttpWebRequest)WebRequest.Create(uri);
        request.CookieContainer = new CookieContainer();
        request.AllowAutoRedirect = false;
        // Dispose the response so the underlying connection is released.
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            switch (response.StatusCode)
            {
                case HttpStatusCode.Redirect:          // 302
                case HttpStatusCode.TemporaryRedirect: // 307
                case HttpStatusCode.MovedPermanently:  // 301
                    this.Url = new Uri(uri, response.Headers["Location"]).ToString();
                    break;
            }
        }
    }
}
Now comes the part that illustrates the advantage of using a proxy class manually generated with wsdl.exe instead of a service reference. In the manually created proxy class, modify the class declaration from
public partial class WebApiProxy : System.Web.Services.Protocols.SoapHttpClientProtocol
to
public partial class WebApiProxy : SoapHttpClientProtocolWithRedirect
Now invoke the DoLogin method as follows.
var apiClient = new WebApiProxy(GetServiceUrl());
//TODO: Add any required headers etc.
apiClient.DoLogin(username,password);
You will notice that the 302 redirect is handled smoothly by the code in your SoapHttpClientProtocolWithRedirect class.
One other advantage is that, by doing this, you will not have to fear that some other developer is going to refresh the service reference and lose the changes that you made to the proxy class since you had manually generated it. Hope this helps.
Original Answer
Why don't you just include the entire url for production/local service in the config file? That way you can initiate the call with the appropriate url in the appropriate location.
Also, I would refrain from using a service reference in any code destined for production. One way of using the asmx service without a service reference would be to generate the WebApiProxy.cs file using the wsdl.exe tool. Now you can just include the WebApiProxy.cs file in your project and instantiate as shown below.
var apiClient = new WebApiProxy(GetServiceUrl());
//TODO: Add any required headers etc.
apiClient.DoLogin(username,password);
Here is the GetServiceUrl() method. Use a Configuration Repository to further decouple and improve testability.
private string GetServiceUrl()
{
    try
    {
        return
            _configurationRepository.AppSettings[
                _configurationRepository.AppSettings["WebApiInstanceToUse"]];
    }
    catch (NullReferenceException ex)
    {
        //TODO: Log error
        return string.Empty;
    }
}
Then your config file can contain the following information in the appSettings section.
<add key="StagingWebApiInstance" value="http://mystagingserver/myProduct/webservices/webapi.asmx"/>
<add key="ProductionWebApiInstance" value="https://www.myurl.com/myProduct/webservices/webapi.asmx"/>
<!-- Identify which WebApi.asmx instance to Use-->
<add key="WebApiInstanceToUse" value="ProductionWebApiInstance"/>
Also, I would be careful about concatenating strings with the + operator. A single concatenation like the one above has no measurable performance impact, but if you build strings piece by piece in loops throughout the code, using a StringBuilder can make a big difference in execution time. Check http://msdn.microsoft.com/en-us/library/ms228504.aspx for more information on why using a StringBuilder improves performance.
This should be an easy question, but I've been unable to solve it. I'm trying to change the Referer header prior to redirecting the page of an HttpResponse object. I know this can be done with an HttpWebRequest, but I can't get it to work for a standard Page.Response.
I'm trying to just set the Referer header to look like it originated from a temp page on my site (this is for analytics tracking for an external system).
Is this possible?
I've tried to use the code below (as well as variations such as Response.AppendHeader and Response.AddHeader), however the Referer always shows as the page that the Request initiated from.
Response.Headers.Add("Referer", "http://test.local/fromA");
Response.Redirect(HttpContext.Current.Request.Url.AbsoluteUri);
If not via .net can this be accomplished via js?
Thanks!
Referer is controlled (and sent) by the client. You can't affect it server-side. There may be some JavaScript that you could emit that'd get the client to do it - but it's probably considered a security flaw, so I wouldn't count on it.
The referrer is set by the client, not the server. It is useful to include in a request and not a response as it points to the URL where the request came from.
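To illustrate that the header belongs to the request, here is a minimal sketch of the one case where your own code does control it: when your server itself acts as the client and calls the external system. RefererExample and BuildTrackedRequest are hypothetical names. This does not change what a browser sends after a Response.Redirect.

```csharp
using System;
using System.Net.Http;

public static class RefererExample
{
    // Builds an outgoing request whose Referer header is chosen by the caller.
    public static HttpRequestMessage BuildTrackedRequest(Uri target, Uri referrer)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, target);
        request.Headers.Referrer = referrer; // sent on the wire as "Referer: ..."
        return request;
    }
}
```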
I have a Silverlight (v3) application that uses WebRequest to make an HTTP POST request to a webpage on the same website as the Silverlight app. This HTTP request gets back a 302 (a redirect) to another page on the same website, which HttpWebRequest is automatically supposed to follow (according to the documentation).
There's nothing particularly special about the code that makes the request (it uses the browser's HTTP stack, it is not configured to use the alternate inbuilt Silverlight HTTP stack):
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(String.Format("{0}?name={1}&size={2}", _UploadUrl, Uri.EscapeUriString(Name), TotalBytes));
request.Method = "POST";
All this works fine in Firefox and Chrome; Silverlight makes the POST HTTP request, receives a 302 response and automatically does a GET HTTP request of the specified redirect URL and returns that to me (I know this because I used Fiddler to watch the HTTP requests going on).
However, in Internet Explorer (v8), Silverlight does the POST HTTP request and then throws a WebException with a 404 error code!
Using Fiddler, I can see that Silverlight/Internet Explorer was successfully returned the 302 status code for the request, and I assume that the 404 status code (and associated WebException) that I get in Silverlight is because as far as I know HTTP requests that are done via the browser stack can only return 200 or 404 due to limitations. The real question is why does Internet Explorer not follow through the redirect like the other browsers?
Thanks in advance for any help!
EDIT: I would prefer not to use the Silverlight client HTTP stack because to my knowledge requests issued by it do not include cookies that are a part of the browser's session, critically including the ASP.NET authentication cookie that I need to be attached to the HTTP requests being made by the Silverlight control.
EDIT 2: I have discovered that Internet Explorer only exhibits this behaviour when you do a POST request. A GET request redirects successfully. This seems like pretty bad behaviour considering how many websites now do things in the Post-Redirect-Get style.
IE is closer to the specification, in that in responding to a 302 for a POST the user agent should send a POST (though it should not do so without user confirmation).
On the other hand, FF and Chrome are deliberately wrong, in copying ways in which user agents were frequently wrong some considerable time ago (the problem started in the early days of HTTP).
For this reason, 307 was introduced in HTTP/1.1 to be clearer that the same HTTP method should be used (i.e. in this case, it should be a POST) while 303 has always meant that one should use GET.
Therefore, instead of calling Response.Redirect, which results in a 302 that different user agents handle in different ways, send a 303. The following code does so (and includes a valid entity body just to be within the letter of the spec). There is an overload so you can call it with either a Uri or a string:
private void SeeOther(Uri uri)
{
    if (!uri.IsAbsoluteUri)
        uri = new Uri(Request.Url, uri);
    Response.StatusCode = 303;
    Response.AddHeader("Location", uri.AbsoluteUri);
    Response.ContentType = "text/uri-list";
    Response.Write(uri.AbsoluteUri);
    Context.ApplicationInstance.CompleteRequest();
}

private void SeeOther(string relUri)
{
    SeeOther(new Uri(Request.Url, relUri));
}
I believe this was a feature change in Internet Explorer 7, whereby they changed the expected 200 response to a 302 telling IE to be redirected. There is no smooth solution to this problem that I know of. A similar question was posed a while back:
Change in behavior with Internet Explorer 7 and later in regard to CONNECT requests