WebClient 404 protocol error on valid URL in C#

I have a WebClient that calls a URL which works fine when I view it in a browser, which led me to believe I would need to add headers to my call.
I have done this, but I am still getting the error.
I do have other calls to the same API that work fine, and I have checked that all the parameters I am passing across are exactly as expected (case, spelling).
using (var wb = new WebClient())
{
    wb.Proxy = proxy;
    wb.Headers.Add("Accept-Language", " en-US");
    wb.Headers.Add("Accept", " text/html, application/xhtml+xml, */*");
    wb.Headers.Add("User-Agent", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)");

    byte[] response = wb.UploadValues("http://myserver/api/account/GetUser",
        new NameValueCollection()
        {
            { "email", register.Email },
        });

    userDetails = Encoding.UTF8.GetString(response);
}
Does anyone have an idea why I am still getting the protocol error on a call that works perfectly fine in a browser?

UploadValues uses an HTTP POST. Are you sure that is what you want? If you are viewing it in a browser it is likely a GET, unless you are filling out some sort of web form.
One might surmise that what you are trying to do is GET this response: "http://myserver/api/account/GetUser?email=blah#blah.com"
in which case you would just formulate that URL, with query parameters, and execute a GET using one of the DownloadString overloads.
using (var wb = new WebClient())
{
    wb.Proxy = proxy;
    userDetails = wb.DownloadString("http://myserver/api/account/GetUser?email=" + register.Email);
}
The Wikipedia article on REST has a nice table that outlines the semantics of each HTTP verb, which may help when choosing the appropriate WebClient method for your use case.
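One caveat with building the URL by string concatenation: if the email value can contain characters such as '+', '&' or spaces, it should be URL-encoded first. A minimal sketch of the same GET call with encoding (assuming register.Email is the only parameter):

using (var wb = new WebClient())
{
    wb.Proxy = proxy;

    // Uri.EscapeDataString keeps characters like '+', '&' and spaces from
    // breaking the query string.
    string url = "http://myserver/api/account/GetUser?email=" +
                 Uri.EscapeDataString(register.Email);

    userDetails = wb.DownloadString(url);
}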

Related

C# WebClient downloads source for some pages, but not all

I currently have this code that is supposed to grab the HTML source of the website. Specifically, I am telling it to read the source of 4chan. It WILL get the source code for a board, such as /pol/ or /news/, but it will NOT get the source code for specific threads. It throws the error: [System.Net.WebException: 'The remote server returned an error: (403) Forbidden.']
Here is the code I am working with.
public string GetSource(string url)
{
    WebClient client = new WebClient();
    ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12; // tried with & without this
    client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/6.0;)");
    try
    {
        return client.DownloadString(url);
    }
    catch
    {
        Error(2); // error code 2
    }
    return "";
}
It will download the source of "https://boards.4chan.org/pol" for example.
It will not download the source of "https://boards.4chan.org/pol/thread/#"
I am completely lost as to how to proceed. I have a "user-agent" tag, and it works sometimes, so I don't know what the problem is. Any help would be appreciated. Thanks.
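One thing worth trying, purely as a guess since the 403 only appears on thread pages: force TLS 1.2 before the first request to the host and send a modern User-Agent string instead of the old IE7 one. A sketch of that variant (the method name GetSourceSketch and the Chrome User-Agent string are just illustrative):

public string GetSourceSketch(string url)
{
    // Must be set before the first connection to the host is opened.
    ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

    using (WebClient client = new WebClient())
    {
        // Some servers reject requests that identify as very old IE versions.
        client.Headers.Add("user-agent",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0 Safari/537.36");

        try
        {
            return client.DownloadString(url);
        }
        catch (WebException ex)
        {
            // Log the real status code instead of swallowing the exception.
            Console.WriteLine(ex.Message);
            return "";
        }
    }
}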

C# HttpClient request fails to scrape (both on System.Net and Windows.Web http requests)

I am trying to scrape the news off this site: https://www.livescore.com/soccer/news/
using (Windows.Web.Http.HttpClient client = new Windows.Web.Http.HttpClient())
{
    client.DefaultRequestHeaders.Add("User-Agent",
        "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident / 6.0)");

    using (Windows.Web.Http.HttpResponseMessage response = await client.GetAsync(new Uri(pageURL)))
    using (Windows.Web.Http.IHttpContent content = response.Content)
    {
        try
        {
            string result = await content.ReadAsStringAsync();
            Debug.WriteLine(result);
        }
        catch (Exception ex)
        {
            Debug.WriteLine(ex);
        }
    }
}
I see that I am getting a response containing "Your browser is out of date or some of its features are disabled".
I moved to Windows.Web so I could configure certificates, since I am on UWP, and tried ignoring the following certificate errors:
HttpBaseProtocolFilter filter = new HttpBaseProtocolFilter();
filter.IgnorableServerCertificateErrors.Add(ChainValidationResult.Untrusted);
filter.IgnorableServerCertificateErrors.Add(ChainValidationResult.Expired);
filter.IgnorableServerCertificateErrors.Add(ChainValidationResult.IncompleteChain);
filter.IgnorableServerCertificateErrors.Add(ChainValidationResult.WrongUsage);
filter.IgnorableServerCertificateErrors.Add(ChainValidationResult.InvalidName);
filter.IgnorableServerCertificateErrors.Add(ChainValidationResult.RevocationInformationMissing);
filter.IgnorableServerCertificateErrors.Add(ChainValidationResult.RevocationFailure);
but I am still getting the same response from the server.
Any idea how to bypass this?
Edit: They do have the old, unsecured server, http://www.livescore.com/, where I guess I could scrape everything, but the news isn't there.
I think the problem is the user-agent string: you are telling the site that the browser you are using is Internet Explorer 10.
Look at this page http://www.useragentstring.com/pages/useragentstring.php?name=Internet+Explorer and try to use the user agent for Internet Explorer 11 (before doing this, open the page in your IE11 browser to check that it works properly).
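For example, the header line from the question could be swapped to an IE11-style token (whether livescore.com accepts it is something you would need to verify yourself):

// Same call as in the question, just with an IE11-style User-Agent; if the
// value fails header validation, TryAppendWithoutValidation is an alternative.
client.DefaultRequestHeaders.Add("User-Agent",
    "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko");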

Why does my HttpWebRequest return a 503?

So I am just getting to learn HttpWebRequest and its functions.
I've gotten to the point where I want to learn how to capture cookies in a CookieContainer and parse through them.
The issue is that some websites return a 503 error and I am not sure why.
One of those websites is used in this example.
From what I've read online, a 503 error is this:
The HyperText Transfer Protocol (HTTP) 503 Service Unavailable server
error response code indicates that the server is not ready to handle
the request.
Common causes are a server that is down for maintenance or that is
overloaded. This response should be used for temporary conditions and
the Retry-After HTTP header should, if possible, contain the estimated
time for the recovery of the service.
Which doesn't seem to fit at all, since the website is up and running.
Why is my request returning a 503 status code, and what should I do to resolve this issue in a proper manner?
static void Main(string[] args)
{
    // 1. Create an HTTP request
    // Build the request
    Uri site = new Uri("https://ucp.nordvpn.com/login/");

    // Initialize a new instance of HttpWebRequest by calling the Create function
    // with our site as the parameter and casting the result.
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(site);

    // Initialize a new instance of the CookieContainer.
    CookieContainer cookies = new CookieContainer();

    // The request's CookieContainer is null by default, so we assign the newly
    // initialized instance to it.
    request.CookieContainer = cookies;

    // Print out the cookies before the response (of course it will be blank).
    Console.WriteLine(cookies.GetCookieHeader(site));

    // Get the response and print out the cookies again.
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    {
        Console.WriteLine(cookies.GetCookieHeader(site));
    }

    Console.ReadKey();
}
The URL that you are trying to get to appears to be protected by CloudFlare. You can't use the basic HttpWebRequest for that type of request without some additional work. While I haven't tried this, it may be an option for you to get around that protection:
CloudFlareUtilities
The URL you are trying to access is behind cloud hosting that applies many security measures, including checking which browser is accessing the site.
For that to work, you need to set the UserAgent property of the HttpWebRequest:
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0";
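Placed into the code from the question, the property just needs to be set before GetResponse is called (the Firefox string above is only an example; any current browser token should do):

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(site);
request.CookieContainer = cookies;

// Identify as a regular browser before sending the request.
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0";

using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
    Console.WriteLine(cookies.GetCookieHeader(site));
}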

WebClient DownloadFile with Authorization not working

I have tried just about everything I can think of to get this to work, including several things I've found online. All I'm trying to do is download a file (which has a direct link) from a website that I have to log in to.
I tried doing the following, with the "UploadValues":
WebClient myWebClient = new WebClient();
NameValueCollection myNameValueCollection = new NameValueCollection();
myNameValueCollection.Add("username", this.UserName);
myNameValueCollection.Add("password", this.Password);
byte[] responseArray = myWebClient.UploadValues(felony, myNameValueCollection);
myWebClient.DownloadFile(felony, localfelony);
and I've also tried putting the login info in the headers, as well as just setting the credentials, as you can see from the commented-out code:
WebClient client = new WebClient();
//client.UseDefaultCredentials = false;
//client.Credentials = new NetworkCredential(this.UserName, this.Password);
client.Headers.Add(HttpRequestHeader.Authorization, "Basic " + Convert.ToBase64String(Encoding.ASCII.GetBytes(this.UserName + ":" + this.Password)));
client.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36");
//client.Headers.Add(HttpRequestHeader.Cookie, this.webBrowser.Document.Cookie);
client.DownloadFile(felony, localfelony);
No matter what I try, the only thing I can get it to download is a file that ends up being the login page, as if it didn't accept the login info I passed.
I've looked at the headers and such, and I don't see anything out of the ordinary that would explain why this isn't working. Any ideas?
I could swear I had tried this before, but I guess maybe I had it just a little different or something. So it worked like this:
WebClient client = new WebClient();
client.UseDefaultCredentials = false;
client.Credentials = new NetworkCredential(this.UserName, this.Password);
client.Headers.Add(HttpRequestHeader.Cookie, "_gat=1; b46467afcb0b4bf5a47b2c6b22e3d284=mt84peq7u4r0bst72ejs5lb7p6; https://docs.stlucieclerk.com/=1,1; _ga=GA1.2.12049534.1467911267");
client.DownloadFile(webaddress, localname);
It was the cookie in the header that made it work. I thought I'd done that before, but maybe I did something involving a cookie that was different.
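If hard-coding the cookie value feels fragile (it will expire eventually), an alternative sketch is a small WebClient subclass that carries a CookieContainer, so whatever cookies the login POST sets are replayed on the download. The class name CookieAwareWebClient is just illustrative:

// Requires System.Net and System.Collections.Specialized.
public class CookieAwareWebClient : WebClient
{
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest httpRequest)
        {
            // Every request made through this client shares one cookie jar.
            httpRequest.CookieContainer = Cookies;
        }
        return request;
    }
}

The login and download calls can then share the same session (the login URL and form field names below are hypothetical; adjust them to match the site):

using (var client = new CookieAwareWebClient())
{
    client.UploadValues(loginUrl, new NameValueCollection
    {
        { "username", this.UserName },
        { "password", this.Password }
    });

    client.DownloadFile(felony, localfelony);
}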
This seems to be an authentication/authorization issue.
There could be many reasons causing this, for example:
1) Maybe the authentication/authorization mechanism uses some kind of hash.
2) Maybe you are using the wrong kind of authentication mechanism ("Basic", as I can see).
3) Maybe you are getting authenticated but not authorized.
The best way to find the root cause is:
Use Fiddler.
Log in using the UI page and try to download the file, capturing the Fiddler session while you do. Then try to do the same with whatever code you have and capture that session as well. Compare the two sessions to find the difference.
Hope this helps.
Try temporarily changing the certificate validation:
System.Net.Security.RemoteCertificateValidationCallback r = System.Net.ServicePointManager.ServerCertificateValidationCallback;

System.Net.ServicePointManager.ServerCertificateValidationCallback =
    delegate (object s,
              System.Security.Cryptography.X509Certificates.X509Certificate certificate,
              System.Security.Cryptography.X509Certificates.X509Chain chain,
              System.Net.Security.SslPolicyErrors sslPolicyErrors)
    { return true; };

// Do downloading here...

System.Net.ServicePointManager.ServerCertificateValidationCallback = r;
This would mean, however, that the WebClient would accept any certificate, so see this post for more info.

How to send GET/POST request programmatically to simple ASPX page?

I use the following code to post a query string:
string URI = "http://somewebsite.com/default.aspx";
string myParameters = "param1=value1&param2=value2&param3=value3";

using (WebClient wc = new WebClient())
{
    wc.Headers[HttpRequestHeader.ContentType] = "application/x-www-form-urlencoded";
    string HtmlResult = wc.UploadString(URI, myParameters);
}
But somehow default.aspx does not accept that POST call.
The point is that when I manually go to http://somewebsite.com/default.aspx in a browser, all the code there works fine.
My question is: what am I missing here to achieve the same result with WebClient as when I open the page manually?
Thank you in advance!
P.S. 1
I just tried to use the GET method on that URL and it also has no effect. How is that possible?
What is the difference between manual navigation to the page and sending a GET/POST programmatically?
P.S. 2
I even tried this
wc.Headers["Accept"] = "application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*";
wc.Headers["User-Agent"] = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; MDDC)";
and the Load event of Default.aspx is still not hit. :(
From your description of what you want to achieve, I think you may have chosen the wrong WebClient method. Instead of UploadString, try DownloadString:
using (WebClient wc = new WebClient())
{
    string HtmlResult = wc.DownloadString("http://somewebsite.com/default.aspx?param1=value1&param2=value2&param3=value3");
}
So this comment turned out to be the correct one:
"What is difference between manual navigation to page and sending GET/POST?" - see for yourself, for example using Fiddler. – CodeCaster
I checked all requests with Fiddler and found that there is code in the base page class that redirects to the Index page, so the Load event never fires.
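If anyone hits something similar without Fiddler at hand, a quick way to spot a server-side redirect from code is to turn off automatic redirect following and inspect the status code and Location header (the URL here is the example one from the question):

HttpWebRequest probe = (HttpWebRequest)WebRequest.Create(
    "http://somewebsite.com/default.aspx?param1=value1&param2=value2&param3=value3");

// Do not follow the redirect, so a 301/302 issued by the base page class
// is visible directly in the response.
probe.AllowAutoRedirect = false;

using (HttpWebResponse response = (HttpWebResponse)probe.GetResponse())
{
    Console.WriteLine((int)response.StatusCode);     // e.g. 302
    Console.WriteLine(response.Headers["Location"]); // e.g. the Index page
}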
