I'm trying to verify the existence of a URL using HttpWebRequest. I found a few examples that do basically this:
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(Url);
request.Method = "HEAD";
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
return response.StatusCode;
}
However, if the URL is broken, the request does not return a response; it throws an exception instead.
I modified my code to this:
try
{
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(Url);
request.Method = "HEAD";
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
return response.StatusCode;
}
}
catch (System.Net.WebException ex)
{
var response = ex.Response as HttpWebResponse;
return response == null ? HttpStatusCode.InternalServerError : response.StatusCode;
}
which seems to finally do what I want.
But I would like to know, why is the request throwing an exception instead of returning the response with a NotFound status code?
Yes, this can be quite annoying when web pages use status codes heavily and not all of them are errors, which can make processing the body quite a pain. Personally, I use this extension method to get the response.
public static class HttpWebResponseExt
{
public static HttpWebResponse GetResponseNoException(this HttpWebRequest req)
{
try
{
return (HttpWebResponse)req.GetResponse();
}
catch (WebException we)
{
var resp = we.Response as HttpWebResponse;
if (resp == null)
throw;
return resp;
}
}
}
Why not? They're both valid design options, and HttpWebRequest was just designed to work this way.
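To add some detail: GetResponse throws a WebException whenever the request does not complete successfully, and the exception's Status property says what kind of failure it was. When the server answered with an error status such as 404, Status is WebExceptionStatus.ProtocolError and Response holds the HttpWebResponse. For failures such as DNS resolution errors or timeouts, Response is null. A minimal sketch of a helper that separates the two cases (the class and method names are hypothetical, not part of the framework):

```csharp
using System;
using System.Net;

public static class WebExceptionExt
{
    // Hypothetical helper: returns the HTTP status code carried by the
    // exception, or null when the server never sent an HTTP response.
    public static HttpStatusCode? TryGetStatusCode(this WebException ex)
    {
        if (ex.Status == WebExceptionStatus.ProtocolError &&
            ex.Response is HttpWebResponse response)
        {
            return response.StatusCode;
        }
        return null;
    }
}
```

A caller can then treat a null result as "the server could not be reached" rather than mapping it to a made-up status code.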
Just like @Will, I wrote a similar extension method to get the response content as a string from a WebException.
/// <summary>
/// Reads the response content as a string from a WebException.
/// </summary>
/// <param name="webException">The exception thrown by GetResponse.</param>
/// <returns>The status code and body, or InternalServerError and null when there is no HTTP response.</returns>
public static (HttpStatusCode statusCode, string? responseString) GetResponseStringNoException(this WebException webException)
{
    if (webException.Response is HttpWebResponse response)
    {
        // The using blocks close the response and reader even if reading throws.
        using (response)
        using (StreamReader streamReader = new(response.GetResponseStream(), Encoding.Default))
        {
            return (response.StatusCode, streamReader.ReadToEnd());
        }
    }
    else
    {
        return (HttpStatusCode.InternalServerError, null);
    }
}
The above code is a non-optimised solution.
I am trying to get HTML content from the Amazon website. Here is my code to create the request, get the response, and read it as a string:
public static HttpWebResponse GetHttpWebResponse(string url)
{
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
webRequest.ContentType = "text/xml";
try
{
return (HttpWebResponse)webRequest.GetResponse();
}
catch (WebException e)
{
if (e.Response == null)
throw new Exception("Cannot get response");
return (HttpWebResponse)e.Response;
}
}
public static string GetString(HttpWebResponse response)
{
Encoding encoding = Encoding.UTF8;
using (var reader = new StreamReader(response.GetResponseStream(), encoding))
{
string responseText = reader.ReadToEnd();
return responseText;
}
}
It works fine with other websites. However, when I try to get content from Amazon, for example:
https://www.amazon.com/gp/product/B00AEISSHA/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1
I am seeing encoded content:
I tried changing the Encoding and using HttpUtility.HtmlDecode(html), but it didn't help. Is there any simple way to get content from Amazon?
You're not catering for compression. If you update your webrequest like this, it should do the trick.
public static HttpWebResponse GetHttpWebResponse(string url)
{
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
webRequest.ContentType = "text/xml";
webRequest.AutomaticDecompression = DecompressionMethods.GZip;
try
{
return (HttpWebResponse)webRequest.GetResponse();
}
catch (WebException e)
{
if (e.Response == null)
throw new Exception("Cannot get response");
return (HttpWebResponse)e.Response;
}
}
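Servers may also compress with deflate, so a slightly more general variant enables both methods. A minimal sketch, assuming a hypothetical helper named CreateDecompressingRequest; creating the request object does not contact the server:

```csharp
using System;
using System.Net;

public static class RequestFactory
{
    // Hypothetical helper: builds a request that accepts gzip and deflate
    // and transparently decompresses the response body when it is read.
    public static HttpWebRequest CreateDecompressingRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }
}
```

The response stream returned by GetResponseStream can then be read with a StreamReader as in the GetString method above.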
I am attempting to replicate the following C# code in Java. This code is a helper class that sends a request containing xml, and reads a response.
internal static String Send(String url, String body)
{
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);
try
{
// create the new httpwebrequest with the uri
request.ContentLength = 0;
// set the method to POST
request.Method = "POST";
if (!String.IsNullOrEmpty(body))
{
request.ContentType = "application/xml; charset=utf-8";
byte[] postData = Encoding.Default.GetBytes(body);
request.ContentLength = postData.Length;
using (Stream s = request.GetRequestStream())
{
s.Write(postData, 0, postData.Length);
}
}
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
String responseString = new StreamReader(response.GetResponseStream()).ReadToEnd();
if (response.StatusCode != HttpStatusCode.OK)
{
throw new ResponseException(((int)response.StatusCode),
response.StatusCode.ToString(), request.RequestUri.ToString(),
responseString);
}
return responseString;
}
}
catch (WebException e)
{
using (WebResponse response = e.Response)
{
HttpWebResponse httpResponse = response as HttpWebResponse;
if (httpResponse != null)
{
using (Stream data = response.GetResponseStream())
{
data.Position = 0;
throw new ResponseException(((int)httpResponse.StatusCode),
httpResponse.StatusCode.ToString(), request.RequestUri.ToString(),
new StreamReader(data).ReadToEnd()
);
}
}
else
{
throw;
}
}
}
}
After reading other threads I determined that the Apache HttpComponents library would be my best bet to get the same functionality. After reading the documentation and following the example here:
http://hc.apache.org/httpcomponents-client-ga/quickstart.html
I am unable to figure out how to send the body string as xml. When I attempt to set the entity for the request it requires that I declare a BasicNameValuePair, and I do not understand what this is, or how I would format the body string to meet this specification.
Below is what I have currently done.
protected static String Send(String url, String body)
{
HttpPost request = new HttpPost(url);
try
{
request.setHeader("Content-Type", "application/xml; charset=utf-8");
// Encode the body if needed
request.setEntity(new UrlEncodedFormEntity());
//get the response
// if the response code is not valid throw a ResponseException
// else return the response string.
} finally {
request.releaseConnection();
}
return null;
}
EDIT: Or should I use a StringEntity and do the following?
protected static String SendToJetstream(String url, String body)
{
HttpPost request = new HttpPost(url);
try
{
StringEntity myEntity = new StringEntity(body,
ContentType.create("application/xml", "UTF-8"));
// Encode the body if needed
request.setEntity(myEntity);
//get the response
// if the response code is not valid throw a ResponseException
// else return the response string.
} finally {
request.releaseConnection();
}
return null;
}
Use a FileEntity
File file = new File("somefile.xml");
FileEntity entity = new FileEntity(file, ContentType.create("application/xml", "UTF-8"));
Lots of good examples here: http://hc.apache.org/httpcomponents-client-ga/tutorial/html/fundamentals.html#d5e165
We have a URL and need to check whether the web page is active or not. We tried the following code:
WebResponse objResponse = null;
WebRequest objRequest = HttpWebRequest.Create(URL);
objRequest.Method = "HEAD";
try
{
objResponse = objRequest.GetResponse();
objResponse.Close();
}
catch (Exception ex)
{
}
The above code throws an exception if it is unable to get a response, but it also appears to work fine even when the page returns a "server error". How can we detect the server error?
The HttpWebResponse class has a StatusCode property which you can check. If it's 200, everything is OK.
You can change your code to this:
HttpWebResponse objResponse = null;
var objRequest = HttpWebRequest.Create("http://google.com");
objResponse = (HttpWebResponse) objRequest.GetResponse();
if(objResponse.StatusCode != HttpStatusCode.OK)
{
Console.WriteLine("It failed");
}else{
Console.WriteLine("It worked");
}
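Note that 200 is not the only success code: any status in the 200 to 299 range indicates success, so a comparison against HttpStatusCode.OK alone would report a 204 No Content as a failure. A minimal sketch of a range check, with a hypothetical helper name:

```csharp
using System.Net;

public static class StatusCodeExt
{
    // Hypothetical helper: true for any 2xx status, not only 200 OK.
    public static bool IsSuccess(this HttpStatusCode status)
    {
        int code = (int)status;
        return code >= 200 && code <= 299;
    }
}
```

With this in place, the check above could read objResponse.StatusCode.IsSuccess() instead of comparing against a single value.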
For one thing, use a using statement on the response - that way you'll dispose of it whatever happens.
Now, if a WebException is thrown, you can catch that and look at WebException.Response to find out the status code and any data sent back:
WebRequest request = WebRequest.Create(URL);
request.Method = "HEAD";
try
{
    using (WebResponse response = request.GetResponse())
    {
        // Use data for success case
    }
}
catch (WebException ex)
{
    // ex.Response is null when no HTTP response arrived (e.g. DNS failure, timeout)
    HttpWebResponse errorResponse = ex.Response as HttpWebResponse;
    if (errorResponse != null)
    {
        HttpStatusCode status = errorResponse.StatusCode;
        // etc
    }
}
How can I make HttpWebResponse ignore the 404 error and continue? That would be easier than checking the input for problems, since this happens very rarely.
I'm assuming you have a line somewhere in your code like:
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
Simply replace it with this:
HttpWebResponse response;
try
{
response = request.GetResponse() as HttpWebResponse;
}
catch (WebException ex)
{
response = ex.Response as HttpWebResponse;
}
try
{
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create("http://mysite.com");
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
}
catch(WebException ex)
{
// ex.Response is null when no HTTP response was received
HttpWebResponse webResponse = ex.Response as HttpWebResponse;
if (webResponse != null && webResponse.StatusCode == HttpStatusCode.NotFound)
{
//Handle 404 Error...
}
}
If you look at the properties of the WebException that gets thrown, you'll see the property Response. Is this what you are looking for?
I'm trying to create a method in C# that returns a web page's HTML content as a string, given its URL. I have tried several different ways, but I am getting the error System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a receive.
The following works fine locally, but gets the above error when running on a remote server:
public static string WebPageRead(string url)
{
string result = String.Empty;
WebResponse response = null;
StreamReader reader = null;
try
{
if (!String.IsNullOrEmpty(url))
{
HttpWebRequest request = HttpWebRequest.Create(url) as HttpWebRequest;
request.Method = "GET";
request.KeepAlive = false;
request.ProtocolVersion = HttpVersion.Version10;
response = request.GetResponse();
reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8);
result = reader.ReadToEnd();
}
}
catch (Exception)
{
// Rethrow with "throw;" so the original stack trace is preserved
throw;
}
finally
{
if (reader != null)
{
reader.Close();
}
if (response != null)
{
response.Close();
}
}
return result;
}
This is probably not the problem, but try the following:
public static string WebPageRead(string url)
{
if (String.IsNullOrEmpty(url))
{
return null;
}
HttpWebRequest request = HttpWebRequest.Create(url) as HttpWebRequest;
if (request == null)
{
return null;
}
request.Method = "GET";
request.KeepAlive = false;
request.ProtocolVersion = HttpVersion.Version10;
using (WebResponse response = request.GetResponse())
{
using (Stream stream = response.GetResponseStream())
{
using (StreamReader reader =
new StreamReader(stream, Encoding.UTF8))
{
return reader.ReadToEnd();
}
}
}
}
I echo the earlier answer that suggests you try this with a known good URL. I'll add that you should try this with a known good HTTP 1.1 URL, commenting out the line that sets the version to 1.0. If that works, then it narrows things down considerably.
Thanks for the responses, the problem was due to a DNS issue on the remote server! Just to confirm, I went with the following code in the end:
public static string WebPageRead(string url)
{
string content = String.Empty;
if (!String.IsNullOrEmpty(url))
{
HttpWebRequest request = HttpWebRequest.Create(url) as HttpWebRequest;
if (request != null)
{
request.Method = "GET";
request.KeepAlive = false;
request.ProtocolVersion = HttpVersion.Version10;
try
{
using (WebResponse response = request.GetResponse())
{
using (Stream stream = response.GetResponseStream())
{
using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
{
content = reader.ReadToEnd();
}
}
}
}
catch (Exception)
{
// Rethrow with "throw;" so the original stack trace is preserved
throw;
}
}
}
return content;
}
I had a problem like this before that was solved by opening the URL in IE on the machine with the problem. IE then asks whether you want to add the URL to the list of secure sites. Add it, and it works for that URL.
This is just one of the possible causes; a lot of other problems could cause this. Besides the fix described above, the best way I've found to solve this is to just catch the exception and retry the request.
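The catch-and-retry approach can be sketched as a small generic helper. This is only an illustration under assumptions: the class name, method name, and parameters are hypothetical, and a fixed delay between attempts is one choice among several.

```csharp
using System;
using System.Net;
using System.Threading;

public static class Retry
{
    // Hypothetical helper: runs the action, retrying on WebException until
    // it succeeds or maxAttempts attempts have been made in total.
    public static T OnWebException<T>(Func<T> action, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (WebException) when (attempt < maxAttempts)
            {
                // Possibly transient failure: wait, then try again.
                // On the final attempt the filter is false and the
                // exception propagates to the caller.
                Thread.Sleep(delay);
            }
        }
    }
}
```

With the WebPageRead method from the question, a call might look like Retry.OnWebException(() => WebPageRead(url), 3, TimeSpan.FromSeconds(1)). Retrying only helps with transient failures; a persistent cause such as a DNS problem will still fail after the last attempt.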