WebException when loading RSS feed - C#

I am attempting to load a page I've received from an RSS feed and I receive the following WebException:
Cannot handle redirect from HTTP/HTTPS protocols to other dissimilar ones.
with an inner exception:
Invalid URI: The hostname could not be parsed.
I wrote code that attempts to load the URL via an HttpWebRequest. Following some suggestions I received, when the HttpWebRequest fails I set AllowAutoRedirect to false and manually loop through the redirects until I find out what ultimately fails. Here's the code I'm using; please forgive the gratuitous Console.Write/WriteLine calls:
Uri url = new Uri(val);
bool result = true;
System.Net.HttpWebRequest req = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create(url);
string source = String.Empty;
Uri responseURI;
try
{
    using (System.Net.WebResponse webResponse = req.GetResponse())
    {
        using (HttpWebResponse httpWebResponse = webResponse as HttpWebResponse)
        {
            responseURI = httpWebResponse.ResponseUri;
            StreamReader reader;
            if (httpWebResponse.ContentEncoding.ToLower().Contains("gzip"))
            {
                reader = new StreamReader(new GZipStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
            }
            else if (httpWebResponse.ContentEncoding.ToLower().Contains("deflate"))
            {
                reader = new StreamReader(new DeflateStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
            }
            else
            {
                reader = new StreamReader(httpWebResponse.GetResponseStream());
            }
            source = reader.ReadToEnd();
            reader.Close();
        }
    }
    req.Abort();
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(source);
    result = true;
}
catch (ArgumentException ae)
{
    Console.WriteLine(url + "\n--\n" + ae.Message);
    result = false;
}
catch (WebException we)
{
    Console.WriteLine(url + "\n--\n" + we.Message);
    result = false;
    string urlValue = url.ToString();
    try
    {
        bool cont = true;
        int count = 0;
        do
        {
            req = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create(urlValue);
            req.Headers.Add("Accept-Language", "en-us,en;q=0.5");
            req.AllowAutoRedirect = false;
            using (System.Net.WebResponse webResponse = req.GetResponse())
            {
                using (HttpWebResponse httpWebResponse = webResponse as HttpWebResponse)
                {
                    responseURI = httpWebResponse.ResponseUri;
                    StreamReader reader;
                    if (httpWebResponse.ContentEncoding.ToLower().Contains("gzip"))
                    {
                        reader = new StreamReader(new GZipStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
                    }
                    else if (httpWebResponse.ContentEncoding.ToLower().Contains("deflate"))
                    {
                        reader = new StreamReader(new DeflateStream(httpWebResponse.GetResponseStream(), CompressionMode.Decompress));
                    }
                    else
                    {
                        reader = new StreamReader(httpWebResponse.GetResponseStream());
                    }
                    source = reader.ReadToEnd();
                    if (string.IsNullOrEmpty(source))
                    {
                        urlValue = httpWebResponse.Headers["Location"].ToString();
                        count++;
                        reader.Close();
                    }
                    else
                    {
                        cont = false;
                    }
                }
            }
        } while (cont);
    }
    catch (UriFormatException uriEx)
    {
        Console.WriteLine(urlValue + "\n--\n" + uriEx.Message + "\r\n");
        result = false;
    }
    catch (WebException innerWE)
    {
        Console.WriteLine(urlValue + "\n--\n" + innerWE.Message+"\r\n");
        result = false;
    }
}
if (result)
    Console.WriteLine("testing successful");
else
    Console.WriteLine("testing unsuccessful");
Since this is currently just test code I hardcode val as http://rss.nytimes.com/c/34625/f/642557/s/3d072012/sc/38/l/0Lartsbeat0Bblogs0Bnytimes0N0C20A140C0A70C30A0Csarah0Ekane0Eplay0Eamong0Eofferings0Eat0Est0Eanns0Ewarehouse0C0Dpartner0Frss0Gemc0Frss/story01.htm
the ending url that gives the UriFormatException is: http:////www-nc.nytimes.com/2014/07/30/sarah-kane-play-among-offerings-at-st-anns-warehouse/?=_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&_php=true&_type=blogs&partner=rss&emc=rss&_r=6&
Now I'm not sure if I'm missing something or if I'm doing the looping wrong, but if I take val and put it into a browser the page loads fine, and if I take the URL that causes the exception and put it in a browser I am taken to an account login page for nytimes.
I have a number of these rss feed urls that are resulting in this problem. I also have a large number of these rss feed urls that have no problem loading at all. Let me know if there is any more information needed to help resolve this. Any help with this would be greatly appreciated.
Could it be that I need to have some sort of cookie capability enabled?

You need to keep track of the cookies while doing all your requests. You can use an instance of the CookieContainer class to achieve that.
At the top of your method I made the following changes:
Uri url = new Uri(val);
bool result = true;
// keep all our cookies for the duration of our calls
var cookies = new CookieContainer();
System.Net.HttpWebRequest req = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create(url);
// assign our CookieContainer to the new request
req.CookieContainer = cookies;
string source = String.Empty;
Uri responseURI;
try
{
And in the exception handler, where you create a new HttpWebRequest, assign the same CookieContainer again:
do
{
    req = (System.Net.HttpWebRequest)System.Net.HttpWebRequest.Create(urlValue);
    // reuse our cookies!
    req.CookieContainer = cookies;
    req.Headers.Add("Accept-Language", "en-us,en;q=0.5");
    req.AllowAutoRedirect = false;
    using (System.Net.WebResponse webResponse = req.GetResponse())
    {
This makes sure that the cookies already collected are sent again with each successive request. If you leave this out, no cookies are sent, so the site you are trying to visit assumes you are a new, unseen user and sends you down an authentication path.
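To see the effect in isolation, here is a minimal sketch with no network calls. The class name CookieContainerDemo, the method HeaderFor, and the example.com URLs are made up for illustration; only CookieContainer.Add and CookieContainer.GetCookieHeader are real framework calls. It stores one cookie for an origin and returns the Cookie header the same container would attach to a later request.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of the original answer.
public static class CookieContainerDemo
{
    // Stores one cookie for `origin` and returns the Cookie header that the
    // same container would send with a later request to `target`.
    public static string HeaderFor(Uri origin, Cookie cookie, Uri target)
    {
        var cookies = new CookieContainer();
        cookies.Add(origin, cookie);            // what a Set-Cookie response would do
        return cookies.GetCookieHeader(target); // what the next request would carry
    }
}
```

With a cookie named session stored for http://example.com/, a request to http://example.com/story gets the header session=abc, while a request to a different host gets an empty header. A fresh container per request behaves like that second case every time.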
If you want to store/keep cookies beyond this method you could move the cookie instance variable to a static public property so you can use all those cookies program-wide like so:
public static class Cookies
{
    static readonly CookieContainer _cookies = new CookieContainer();
    public static CookieContainer All
    {
        get
        {
            return _cookies;
        }
    }
}
And to use it in a WebRequest:
var req = (System.Net.HttpWebRequest) WebRequest.Create(url);
req.CookieContainer = Cookies.All;
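A side note on the manual redirect loop: the code in the question treats an empty body as the sign of a redirect and uses the Location header as-is. A somewhat more robust approach is to check the status code and resolve the Location value against the URL that returned it, since that header may be relative. The sketch below is only an illustration under those assumptions; RedirectHelper, IsRedirect and ResolveLocation are hypothetical names, not framework APIs.

```csharp
using System;
using System.Net;

// Hypothetical helper for a manual redirect loop (AllowAutoRedirect = false).
public static class RedirectHelper
{
    // True for the status codes that normally carry a Location header to follow.
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    // Resolves a Location header against the URL that returned it, so a
    // relative value such as "/login" becomes absolute before the next request.
    public static Uri ResolveLocation(Uri current, string location)
    {
        if (string.IsNullOrEmpty(location))
            throw new ArgumentException("Redirect response had no Location header.", "location");
        return new Uri(current, location);
    }
}
```

Inside the loop you would then test RedirectHelper.IsRedirect(httpWebResponse.StatusCode) and, if it is true, continue with RedirectHelper.ResolveLocation(req.RequestUri, httpWebResponse.Headers["Location"]), ideally with an upper bound on the number of hops.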

Related

Why does my API POST Request keep failing?

I am making an API POST request and can't seem to get it to work. I always get a SendFailure WebException, and the exception's Response is always null, so catching the exception is useless. It happens when I try to get the HttpWebResponse. I also noticed that setting request.ContentLength caused errors at the GetRequestStream call for postStream, so I commented it out. test.json is the file I use for the request body. I also tested this with different API testers by including the URL, body, and content type in the header, and they worked. I just can't seem to code it myself. The credentials work; I just don't know if I'm doing the request correctly.
JSON File:
{
"email": "abc#123.com",
"password": "12345",
"facilityNumber": "987654"
}
string filepath = "test.json";
string result = string.Empty;
using (StreamReader r = new StreamReader(filepath))
{
var json = r.ReadToEnd();
var jobj = JObject.Parse(json);
foreach (var item in jobj.Properties())
{
item.Value = item.Value.ToString().Replace("v1", "v2");
}
result = jobj.ToString();
Console.WriteLine(result);
}
try
{
string setupParameters;
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create("https://www.test.com/abcde");
request.AllowAutoRedirect = true;
setupParameters = result;
ServicePointManager.ServerCertificateValidationCallback = (s, cert, chain, ssl) => true;
ASCIIEncoding encoding = new ASCIIEncoding();
var postData = setupParameters;
request.Method = "POST";
request.ContentType = "application/json";
byte[] data = encoding.GetBytes(postData);
//request.ContentLength = data.Length;
using (StreamWriter postStream = new StreamWriter(request.GetRequestStream()))//error if uncomment contentlength
{
postStream.Write(postData);
postStream.Flush();
postStream.Close();
}
HttpWebResponse wr = (HttpWebResponse)request.GetResponse();//error occurs
Stream receiveStream = wr.GetResponseStream();
StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8);
}
catch (WebException ex)
{
if (ex.Response != null)
{
using (var errorResponse = (HttpWebResponse)ex.Response)
{
using (var reader = new StreamReader(errorResponse.GetResponseStream()))
{
string error = reader.ReadToEnd();
result = error;
}
}
}
I suggest modifying your request to follow this format. Pay particular attention to request.Method and request.ContentType, which have caught me out multiple times.
Also, handling the response is easier this way.
try
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(YOURURL);
    request.ContentType = "application/json; charset=utf-8";
    // request.Headers.Add("Header-Name", "value"); // placeholder: add a header here if you need one
    request.Method = WebRequestMethods.Http.Post; // IMPORTANT
    using (var streamWriter = new StreamWriter(request.GetRequestStream()))
    {
        // JSONBODYSTRING must already be valid JSON text. Passing a string through
        // JsonConvert.SerializeObject would wrap it in quotes, so only call
        // JsonConvert.SerializeObject(obj) when you start from an object.
        streamWriter.Write(JSONBODYSTRING);
        streamWriter.Flush();
    }
    using (WebResponse response = request.GetResponse())
    using (var streamReader = new StreamReader(response.GetResponseStream()))
    {
        string result = streamReader.ReadToEnd();
        // do whatever you'd like here
    }
}
catch (Exception ex)
{
    // handle your exceptions
}
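If you do want to set ContentLength yourself, one way to keep it consistent is to encode the body once and write those exact bytes. The question computes the length from ASCII bytes but writes through a StreamWriter, which uses UTF-8 by default, so the two can disagree as soon as the JSON contains a non-ASCII character. The following is a rough sketch under that assumption; JsonPostSketch and its method names are hypothetical, and the URL is whatever endpoint you are calling.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper: encode once, then reuse the same bytes everywhere.
public static class JsonPostSketch
{
    // Encodes the JSON text as UTF-8 so the byte count matches what is sent.
    public static byte[] EncodeBody(string json)
    {
        return Encoding.UTF8.GetBytes(json);
    }

    // Builds (but does not send) a POST request sized for the given body.
    public static HttpWebRequest BuildRequest(string url, byte[] body)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = WebRequestMethods.Http.Post;
        request.ContentType = "application/json; charset=utf-8";
        request.ContentLength = body.Length;
        return request;
    }

    // Writes the body, then reads and returns the response text.
    public static string Send(HttpWebRequest request, byte[] body)
    {
        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Usage would look like: byte[] body = JsonPostSketch.EncodeBody(result); followed by JsonPostSketch.Send(JsonPostSketch.BuildRequest(url, body), body). This does not by itself explain a SendFailure, which can also come from TLS or proxy problems, but it rules out a length mismatch.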

Make http WebRequest work in C#

I want to get a response from an HTTP website. I have used this code:
// Create a new request to the mentioned URL.
WebRequest myWebRequest = WebRequest.Create("http://127.0.0.1:8080/geoserver/NosazMohaseb/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=NosazMohaseb:GParcelLAyer&maxFeatures=50&outputFormat=application%2Fjson&bbox=5727579.437775434,3838435.3419322656,5727581.1322169611,3838437.0363737918");
// var myWebRequest = WebRequest.Create(myUri);
myWebRequest.Method ="GET";
myWebRequest.Timeout = TimeOut;
if (myWebRequest is HttpWebRequest)
{
( myWebRequest as HttpWebRequest).Accept = "application/json";
(myWebRequest as HttpWebRequest).ContentType = "application/json";
//(myWebRequest as HttpWebRequest).Accept =
(myWebRequest as HttpWebRequest).KeepAlive = false;
(myWebRequest as HttpWebRequest).UserAgent = "SharpMap-WMSLayer";
}
if (Credentials != null)
{
myWebRequest.Credentials = Credentials;
myWebRequest.PreAuthenticate = true;
}
else
myWebRequest.Credentials = CredentialCache.DefaultCredentials;
if (Proxy != null)
myWebRequest.Proxy = Proxy;
try
{
using (var myWebResponse = (HttpWebResponse)myWebRequest.GetResponse())
{
using (var dataStream = myWebResponse.GetResponseStream())
{
var cLength = (int)myWebResponse.ContentLength;
}
myWebResponse.Close();
}
}
catch (WebException webEx)
{
if (!this.ContinueOnError)
throw (new RenderException(
"There was a problem connecting to the WMS server when rendering layer '" + LayerName + "'",
webEx));
}
catch (Exception ex)
{
if (!ContinueOnError)
throw (new RenderException("There was a problem rendering layer '" + LayerName + "'", ex));
}
But when I try to get cLength it is -1, so it does not work. However, when I open this URL in a browser:
http://127.0.0.1:8080/geoserver/NosazMohaseb/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=NosazMohaseb:GParcelLAyer&maxFeatures=50&outputFormat=application%2Fjson&bbox=5727579.437775434,3838435.3419322656,5727581.1322169611,3838437.0363737918
I get the following answer:
{"type":"FeatureCollection","totalFeatures":2,"features":[{"type":"Feature","id":"GParcelLAyer.14970","geometry":{"type":"Polygon","coordinates":[[[5727597.96542913,3838442.73401128],[5727595.60003176,3838429.21114233],[5727576.62444883,3838431.10604568],[5727571.16785106,3838432.76483769],[5727569.78420277,3838437.30665986],[5727570.19434939,3838439.63808217],[5727597.96542913,3838442.73401128]]]},"geometry_name":"geom","properties":{"FK_BlockNo":"12055","FK_LandNo":"8","NoApart":"100000","Name":" ","Family":"??","Father":" ","MeliNo":" ","MalekType":"1 ","PostCode":"0 ","Id_Parvande":null,"BuildNo":null,"BuildTypeCode":null,"BuildUserTypeCode":null,"BuildViewTypeCode":null,"BuildGhedmatCode":null,"Farsoode":"0"}}],"crs":{"type":"name","properties":{"name":"urn:ogc:def:crs:EPSG::900913"}}}
So it seems I am missing something when getting the response in C#. Can you please help me find my mistake?
Thanks
In your code you only obtain the response stream and never read anything from it, which is why you're not getting any data.
You have to create a StreamReader and use it to read the data from the response stream (consider buffered Read calls instead of ReadToEnd if the data is large):
using (var dataStream = myWebResponse.GetResponseStream())
using (var reader = new StreamReader(dataStream))
{
    string data = reader.ReadToEnd();
}
As for ContentLength being -1 in your case: that depends on your server, so check whether it actually returns a Content-Length header. The header is not mandatory (a response sent with chunked transfer encoding has none), so you should not rely on it.
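For the buffered variant mentioned above, a minimal sketch could look like this. It reads until the stream reports the end, so it never needs ContentLength. StreamReading and ReadAllBytes are hypothetical names chosen for the example, and the 8192-byte chunk size is an arbitrary choice.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper: drain a stream without knowing its length in advance.
public static class StreamReading
{
    // Reads `input` to its end in fixed-size chunks and returns all the bytes.
    public static byte[] ReadAllBytes(Stream input)
    {
        using (var buffer = new MemoryStream())
        {
            var chunk = new byte[8192];
            int read;
            // Read returns 0 only when the end of the stream is reached.
            while ((read = input.Read(chunk, 0, chunk.Length)) > 0)
            {
                buffer.Write(chunk, 0, read);
            }
            return buffer.ToArray();
        }
    }
}
```

With the response from the question this would be used as byte[] raw = StreamReading.ReadAllBytes(dataStream); followed by Encoding.UTF8.GetString(raw) to get the JSON text, and raw.Length gives the actual size.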

Multiple POST using WebRequest

I am able to send the first request fine; however, I can't get my head round why it stalls on Stream os = smsRequest.GetRequestStream() the second time.
I am aware that you can't write to a request more than once, which is why a new instance is created each time.
public void SendSMS(Dictionary<double, IList<string>> texts)
{
if (CreateWebRequest())
{
foreach (double mpn in texts.Keys)
{
foreach (string sms in texts[mpn])
{
string formParams = string.Format("sendTo=0{0}&selectText=Please+Select...&textMessage={1}&x=28&y=10", mpn, sms);
byte[] encodedParams = Encoding.UTF8.GetBytes(formParams);
HttpWebRequest smsRequest = CreateSMSRequest(encodedParams);
using (Stream os = smsRequest.GetRequestStream())
{
os.Write(encodedParams, 0, encodedParams.Length);
os.Close();
}
}
}
}
}
private HttpWebRequest CreateSMSRequest(byte[] encodedParams)
{
HttpWebRequest smsRequest = (HttpWebRequest)WebRequest.Create(PostUrl);
smsRequest.Method = WebRequestMethods.Http.Post;
smsRequest.ContentType = "application/x-www-form-urlencoded";
smsRequest.ContentLength = encodedParams.Length;
smsRequest.AllowAutoRedirect = false;
smsRequest.Credentials = CredentialCache.DefaultNetworkCredentials;
smsRequest.Headers.Add(HttpRequestHeader.Cookie, _cookieData);
return smsRequest;
}
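I think your answer is the same as this one:
HttpWebRequest getRequestStream hangs on multiple runs
Because the response is never retrieved and closed, the connection is not released, and later requests to the same host can block waiting for a free connection. After your using statement, get the response and dispose of it:
using (var response = (HttpWebResponse)smsRequest.GetResponse())
{
    // read the response here if you need it
}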

How to ensure the response for image is complete?

I'm doing a web-scraping project in ASP.NET for a website that requires a CAPTCHA code, so I need to fetch the CAPTCHA image for users to key in before continuing.
So far the project is working fine, but sometimes the CAPTCHA response is not captured entirely, so converting the response stream to an Image causes the following error:
"Parameter is invalid."
I noticed that web browsers do not have this problem; they always show the CAPTCHA image properly as long as the server is not down.
HttpWebRequest, however, is inconsistent: sometimes it gets the image and sometimes it does not. Is there a way to ensure that the response stream is complete?
My code snippet is as follows:
public Image GetCaptchaCode()
{
Image returnVal = null;
Uri uri = new Uri(URL_CAPTCHA);
HttpWebRequest request = null;
HttpWebResponse response = null;
try
{
// Get Cookies
CookieCollection cookies = this.GetCookies();
foreach (Cookie cookie in cookies)
{
Console.WriteLine(cookie.Name + ": " + cookie.Value);
}
// Get Catpcha
request = (HttpWebRequest)HttpWebRequest.Create(uri);
request.ProtocolVersion = HttpVersion.Version11;
request.Method = WebRequestMethods.Http.Get; // use GET for loading Captcha
request.CookieContainer = this._cookies; // Store Cookies Info
System.Net.ServicePointManager.Expect100Continue = false;
// Add more cookies
if (cookies != null)
{
request.CookieContainer.Add(cookies);
}
// Handle Gzip Compression
request.Headers.Add(HttpRequestHeader.AcceptEncoding, HEADER_TYPE);
request.AutomaticDecompression = DecompressionMethods.GZip;
request.Referer = URL_REFERER;
request.UserAgent = USER_AGENT;
// Get Response
response = (HttpWebResponse)request.GetResponse();
returnVal = Image.FromStream(response.GetResponseStream());
}
catch (Exception ex)
{
string errMsg = ex.Message;
}
finally
{
if (uri != null) uri = null;
if (request != null) request = null;
if (response != null)
{
response.Close();
response = null;
}
}
return returnVal;
}

System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a receive

I'm trying to create a method in C# that returns a web page's HTML content as a string, given its URL. I have tried several different ways, but I am getting the error System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a receive.
The following works fine locally, but gets the above error when running on a remote server:
public static string WebPageRead(string url)
{
string result = String.Empty;
WebResponse response = null;
StreamReader reader = null;
try
{
if (!String.IsNullOrEmpty(url))
{
HttpWebRequest request = HttpWebRequest.Create(url) as HttpWebRequest;
request.Method = "GET";
request.KeepAlive = false;
request.ProtocolVersion = HttpVersion.Version10;
response = request.GetResponse();
reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8);
result = reader.ReadToEnd();
}
}
catch (Exception exc)
{
throw exc;
}
finally
{
if (reader != null)
{
reader.Close();
}
if (response != null)
{
response.Close();
}
}
return result;
}
This is probably not the problem, but try the following:
public static string WebPageRead(string url)
{
    if (String.IsNullOrEmpty(url))
    {
        return null;
    }
    HttpWebRequest request = HttpWebRequest.Create(url) as HttpWebRequest;
    if (request == null)
    {
        return null;
    }
    request.Method = "GET";
    request.KeepAlive = false;
    request.ProtocolVersion = HttpVersion.Version10;
    using (WebResponse response = request.GetResponse())
    {
        using (Stream stream = response.GetResponseStream())
        {
            using (StreamReader reader =
                new StreamReader(stream, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
I echo the earlier answer that suggests you try this with a known good URL. I'll add that you should try this with a known good HTTP 1.1 URL, commenting out the line that sets the version to 1.0. If that works, then it narrows things down considerably.
Thanks for the responses, the problem was due to a DNS issue on the remote server! Just to confirm, I went with the following code in the end:
public static string WebPageRead(string url)
{
    string content = String.Empty;
    if (!String.IsNullOrEmpty(url))
    {
        HttpWebRequest request = HttpWebRequest.Create(url) as HttpWebRequest;
        if (request != null)
        {
            request.Method = "GET";
            request.KeepAlive = false;
            request.ProtocolVersion = HttpVersion.Version10;
            try
            {
                using (WebResponse response = request.GetResponse())
                {
                    using (Stream stream = response.GetResponseStream())
                    {
                        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                        {
                            content = reader.ReadToEnd();
                        }
                    }
                }
            }
            catch (Exception)
            {
                // rethrow with "throw;" rather than "throw exc;" to keep the original stack trace
                throw;
            }
        }
    }
    return content;
}
I had a problem like this before that was solved by opening the URL in IE on the machine with the problem. IE then asks whether you want to add the URL to the list of secure sites. Add it and it works for that URL.
This is just one of the possible causes; a lot of other problems could cause this. Apart from the fix described above, the best approach I've found is to just catch the exception and retry the request.
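A catch-and-retry wrapper along those lines might look like the sketch below. It is only an illustration: RetryHelper and Retry are hypothetical names, the attempt count and delay are up to you, and it retries on any WebException, which may be broader than you want.

```csharp
using System;
using System.Net;
using System.Threading;

// Hypothetical helper: retry an operation that may fail with a WebException.
public static class RetryHelper
{
    // Runs `action`, retrying on WebException until `maxAttempts` attempts
    // have been made in total; the last failure is rethrown to the caller.
    public static T Retry<T>(Func<T> action, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (WebException)
            {
                if (attempt >= maxAttempts) throw; // out of attempts: surface the last error
                Thread.Sleep(delay);               // wait before trying again
            }
        }
    }
}
```

With the method from the question this could be called as string html = RetryHelper.Retry(() => WebPageRead(url), 3, TimeSpan.FromSeconds(1));. Retrying only hides transient failures; a persistent cause such as the DNS issue mentioned above still has to be fixed at the source.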
