I have an ASP.NET website. When I call the URL 'http://example.org/worktodo.ashx' from a browser, it works fine.
I have created an Android app, and calling the same URL from it also works fine.
I have also created a Windows app in C#, and calling the same URL from it fails with error 403 Forbidden.
The C# code follows.
try
{
    bool TEST_LOCAL = false;
    //
    // One way to call the url
    //
    WebClient client = new WebClient();
    string url = TEST_LOCAL ? "http://localhost:1805/webfolder/worktodo.ashx" : "http://example.org/worktodo.ashx";
    string status = client.DownloadString(url);
    MessageBox.Show(status, "WebClient Response");
    //
    // Another way to call the url
    //
    WebRequest request = WebRequest.Create(url);
    request.Method = "GET";
    request.Headers.Add("Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
    request.Headers.Add("Connection:keep-alive");
    request.Headers.Add("User-Agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36");
    request.Headers.Add("Upgrade-Insecure-Requests:1");
    request.Headers.Add("Accept-Encoding:gzip, deflate, sdch");
    request.ContentType = "text/json";
    WebResponse response = request.GetResponse();
    string responseString = new System.IO.StreamReader(response.GetResponseStream()).ReadToEnd();
    MessageBox.Show(responseString, "WebRequest Response");
}
catch (WebException ex)
{
    string error = ex.Status.ToString();
}
The exception thrown is:
The remote server returned an error: (403) Forbidden.
StatusCode value is 'Forbidden'
StatusDescription value is 'ModSecurity Action'
The Android app code follows (it uses the org.apache.http library):
Handler handler = new Handler() {
    Context ctx = context; // save context for use inside handleMessage()
    @SuppressWarnings("deprecation")
    public void handleMessage(Message message) {
        switch (message.what) {
            case HttpConnection.DID_START: {
                break;
            }
            case HttpConnection.DID_SUCCEED: {
                String response = (String) message.obj;
                JSONObject jobjdata = null;
                try {
                    JSONObject jobj = new JSONObject(response);
                    jobjdata = jobj.getJSONObject("data");
                    String status = URLDecoder.decode(jobjdata.getString("status"));
                    Toast.makeText(ctx, status, Toast.LENGTH_LONG).show();
                } catch (Exception e1) {
                    Toast.makeText(ctx, "Unexpected error encountered", Toast.LENGTH_LONG).show();
                    // e1.printStackTrace();
                }
            }
        }
    }
};
final ArrayList<NameValuePair> params1 = new ArrayList<NameValuePair>();
if (RUN_LOCALLY)
    new HttpConnection(handler).post(LOCAL_URL, params1);
else
    new HttpConnection(handler).post(WEB_URL, params1);
}
Efforts / research done so far to solve the issue:
I found the following solutions that fixed the 403 Forbidden error for other people, but none of them fixed my problem:
Someone said the file needs appropriate 'rwx' permissions, so I set 'rwx' permissions on the file.
Someone said specifying the User-Agent worked; I tried that (see "Another way to call the url" above).
Someone said a valid header fixed it and used Fiddler to find the valid header to set; I used Chrome Developer Tools and set a valid header (see "Another way to call the url" above).
Someone configured ModSecurity to fix it, but I don't have ModSecurity installed for my website, so that is not an option for me.
Many people had the problem with MVC and fixed it there, but I don't use MVC, so those solutions don't apply to me.
The ModSecurity reference manual says that to remove it from a website you add <modules><remove name="ModSecurityIIS" /></modules> to web.config. I did, but it didn't fix the issue.
My questions are:
Why does the C# WinApp fail whereas the Android app succeeds?
Why doesn't the Android app encounter the 'ModSecurity Action' exception?
Why does the C# WinApp encounter the 'ModSecurity Action' exception?
How do I fix the C# code?
Please help me solve the issue. Thank you all.
I found the answer. Below is the code that works as expected.
bool TEST_LOCAL = false;
string url = TEST_LOCAL ? "http://localhost:1805/webfolder/worktodo.ashx" : "http://example.org/worktodo.ashx";
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);
request.Method = "GET";
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36";
request.ContentType = "text/json";
WebResponse response = request.GetResponse();
string responseString = new System.IO.StreamReader(response.GetResponseStream()).ReadToEnd();
MessageBox.Show(responseString, "WebRequest Response");
NOTE: requires using System.Net;
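A possible reason the property-based version behaves differently from the Headers.Add version in the question: on .NET Framework, HttpWebRequest treats some headers (Accept, Connection and User-Agent among them) as restricted, and they are meant to be set through the dedicated properties rather than through Headers.Add. The following is a minimal sketch that checks this with WebHeaderCollection.IsRestricted; the class name RestrictedHeaderCheck and the method NeedsProperty are made up for illustration, and the header names are the ones from the question's second attempt.

```csharp
using System;
using System.Net;

public static class RestrictedHeaderCheck
{
    // Returns true when the header is "restricted", i.e. it should be set
    // through a dedicated HttpWebRequest property (Accept, UserAgent,
    // KeepAlive, ...) instead of request.Headers.Add(...).
    public static bool NeedsProperty(string headerName)
    {
        return WebHeaderCollection.IsRestricted(headerName);
    }

    public static void Main()
    {
        // Header names taken from the question's second attempt.
        string[] names =
        {
            "Accept", "Connection", "User-Agent",
            "Upgrade-Insecure-Requests", "Accept-Encoding"
        };
        foreach (string name in names)
        {
            Console.WriteLine("{0}: restricted = {1}", name, NeedsProperty(name));
        }
    }
}
```

Note also that in the question's code the WebClient.DownloadString call runs first, and WebClient sends no User-Agent header unless one is added, so the 403 may well have been raised there, before the second attempt ever ran.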
Related
First, I must admit that I'm not certain whether I should be using Windows.Web.Http.HttpClient or System.Net.Http.HttpClient. It seems Windows.Web.Http.HttpClient is the way to go for UWP, but I've tried both without success.
From the URL, I expect to receive a JSON object. When I copy and paste the URL into a browser, I can see the JSON object just fine. If I leave the cache alone after connecting with a browser, the UWP has no problem reading from the cache. I used my same code to connect to https://jsonplaceholder.typicode.com/todos/1 and was able to retrieve the JSON object from there without any issues. Here's my basic attempt:
async void Connect()
{
    using (Windows.Web.Http.HttpClient httpClient = new Windows.Web.Http.HttpClient())
    {
        Uri uri = new Uri(@"https://www.nottheactualdomainname.com/?api=software-api&email=xxxx@xxxx.com&licence_key=xxxx&request=status&product_id=xxxxinstance=20181218215300");
        Windows.Web.Http.HttpResponseMessage httpResponse = new Windows.Web.Http.HttpResponseMessage();
        string httpResponseBody = "";
        try
        {
            httpResponse = await httpClient.GetAsync(uri);
            httpResponse.EnsureSuccessStatusCode();
            httpResponseBody = await httpResponse.Content.ReadAsStringAsync();
            if (JsonObject.TryParse(httpResponseBody, out JsonObject keyValuePairs))
            {
                //handle JSON object
            }
            else
            {
                //didn't receive JSON object
            }
        }
        catch (Exception ex)
        {
            httpResponseBody = "Error: " + ex.HResult.ToString("X") + " Message: " + ex.Message;
        }
    }
}
I have tried including this at various points (i.e. before the using statement, right before initializing httpResponse, and right before the try statement):
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
As well as this (knowing that I would only be able to do this while debugging):
//for debugging ONLY; not safe for production
ServicePointManager.ServerCertificateValidationCallback = delegate { return true; };
Sometimes I get this error:
Error: 80131500 Message: Response status code does not indicate success: 403 ().
And sometimes I get this error:
Error: 80190193 Message: Forbidden (403).
Or both. I have tried this with both Windows.Web.Http.HttpClient and System.Net.Http.HttpClient.
I will eventually need to make sure that my URL includes appropriate credentials for the server. However, incorrect credentials should just return a JSON with error information, and I'm not getting that. I can connect to https://www.nottheactualdomainname.com. What should I check next?
I found the answer here
I was able to use the browser developer tools to look at the request headers. Then I added this:
httpClient.DefaultRequestHeaders.Add("user-agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36");
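For System.Net.Http.HttpClient (the other client tried in the question), the same header can also be set through the typed UserAgent collection. This is a minimal sketch, not the asker's actual code: the class name BrowserLikeClient is hypothetical, the User-Agent string is the one from the answer above, and no request is sent, so the header is only printed.

```csharp
using System;
using System.Net.Http;

public static class BrowserLikeClient
{
    // User-Agent string copied from the answer above.
    public const string UserAgent =
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
        "(KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36";

    // Creates a System.Net.Http.HttpClient that sends the User-Agent
    // header with every request made through it.
    public static HttpClient Create()
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.UserAgent.ParseAdd(UserAgent);
        return client;
    }

    public static void Main()
    {
        using (HttpClient client = Create())
        {
            // Show the header that would be sent; no request is made here.
            Console.WriteLine(client.DefaultRequestHeaders.UserAgent.ToString());
        }
    }
}
```

ParseAdd validates the value as a list of product tokens and comments, so a malformed string fails when the header is set rather than when the request goes out.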
I want to download a string from a website; I made this PHP file as an example.
(The same problem occurs across my whole website.)
The link http://swageh.co/information.php can't be downloaded using a WebClient from any PC.
I would prefer to use a WebClient.
No matter what I try, DownloadString fails.
It works fine in a browser.
It returns an error 500: An unhandled exception of type 'System.Net.WebException' occurred in System.dll
Additional information: The underlying connection was closed: An unexpected error occurred on a send.
Did you change something on the server-side?
All of the following options are working just fine for me as of right now (all return just "false" with StatusCode of 200):
var client = new WebClient();
var stringResult = client.DownloadString("http://swageh.co/information.php");
Also HttpWebRequest:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://swageh.co/information.php");
request.GetResponse().GetResponseStream();
Newer HttpClient:
var client = new HttpClient();
var req = new HttpRequestMessage(HttpMethod.Get, "http://swageh.co/information.php");
var res = client.SendAsync(req);
var stringResult = res.Result.Content.ReadAsStringAsync().Result;
It's because your website is responding with 301 Moved Permanently.
See "Get where a 301 URL redirects to".
This shows how to automatically follow the redirect: "Using WebClient in C# is there a way to get the URL of a site after being redirected?"
Look at Christophe Debove's answer rather than the accepted answer.
Interestingly this doesn't work - tried making headers the same as Chrome as below, perhaps use Telerik Fiddler to see what is happening.
var strUrl = "http://theurl_inhere";
var headers = new WebHeaderCollection();
headers.Add("Accept-Language", "en-US,en;q=0.9");
headers.Add("Cache-Control", "no-cache");
headers.Add("Pragma", "no-cache");
headers.Add("Upgrade-Insecure-Requests", "1");
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(strUrl);
request.Method = "GET";
request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8";
request.Headers.Add( headers );
request.AllowAutoRedirect = true;
request.KeepAlive = true;
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36";
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream dataStream = response.GetResponseStream();
var strLastRedirect = response.ResponseUri.ToString();
StreamReader reader = new StreamReader(dataStream);
string strResponse = reader.ReadToEnd();
response.Close();
In my program, I check whether a site is available, using this code:
HttpWebRequest request;
HttpWebResponse response;
Message = string.Empty;
string result = "";
request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 300000;
request.AllowAutoRedirect = true;
try
{
    response = (HttpWebResponse)request.GetResponse();
    result = response.StatusCode.ToString();
    response.Close();
}
catch (Exception ex)
{
    result = ex.Message;
}
I set the timeout to 5 minutes. When the program runs, for some sites (URLs) the result is "unable to connect to remote server" even though the site is available. How can I solve this problem?
Some sites throttle or reject web requests from robots, or from clients whose user agent string is missing or not recognized.
Therefore I suggest that you set the user agent to that of a known browser.
For example:
WebClient client = new WebClient();
// Mozilla 2.2
client.Headers.Add("user-agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201");
// Or Safari 7.0.3 (set only one of the two; Add would otherwise append to the existing value)
// client.Headers.Add("user-agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A");
For a complete list of user agent strings, check out: http://www.useragentstring.com/pages/useragentstring.php
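When checking availability, it can also help to tell a server that answered with an error (for example a 403 because of the user agent) apart from a server that could not be reached at all, instead of storing only ex.Message. Here is a minimal sketch of that distinction; the class AvailabilityCheck and the method Describe are hypothetical names, and the failure in Main is simulated so that the sketch runs without network access.

```csharp
using System;
using System.Net;

public static class AvailabilityCheck
{
    // Summarises a WebException. For HTTP-level errors (ProtocolError) the
    // server did answer, so its status code is reported. Otherwise the
    // transport-level status (ConnectFailure, Timeout, ...) is reported.
    public static string Describe(WebException ex)
    {
        var http = ex.Response as HttpWebResponse;
        if (ex.Status == WebExceptionStatus.ProtocolError && http != null)
        {
            return "HTTP " + (int)http.StatusCode + " " + http.StatusDescription;
        }
        return "No HTTP response: " + ex.Status;
    }

    public static void Main()
    {
        // Simulated failure, standing in for an exception caught around
        // request.GetResponse().
        var simulated = new WebException(
            "Unable to connect to the remote server",
            WebExceptionStatus.ConnectFailure);
        Console.WriteLine(Describe(simulated));
    }
}
```

In the question's code this would mean catching WebException before the general Exception and storing Describe(ex) in result.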
Is there a way to spoof a web request from C# code so it doesn't look like a bot or spam hitting the site? I am trying to web scrape my website, but keep getting blocked after a certain amount of calls. I want to act like a real browser. I am using this code, from HTML Agility Pack.
var web = new HtmlWeb();
web.UserAgent =
"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11";
I do way too much web scraping, but here are the options:
I have a default list of headers I add as all of these are expected from a browser:
wc.Headers[HttpRequestHeader.UserAgent] = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11";
wc.Headers[HttpRequestHeader.ContentType] = "application/x-www-form-urlencoded";
wc.Headers[HttpRequestHeader.Accept] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
wc.Headers[HttpRequestHeader.AcceptEncoding] = "gzip,deflate,sdch";
wc.Headers[HttpRequestHeader.AcceptLanguage] = "en-GB,en-US;q=0.8,en;q=0.6";
wc.Headers[HttpRequestHeader.AcceptCharset] = "ISO-8859-1,utf-8;q=0.7,*;q=0.3";
(wc is my WebClient).
As a further help - here is my webclient class that keeps cookies stored - which is also a massive help:
public class CookieWebClient : WebClient
{
    public CookieContainer m_container = new CookieContainer();
    public WebProxy proxy = null;
    protected override WebRequest GetWebRequest(Uri address)
    {
        try
        {
            ServicePointManager.DefaultConnectionLimit = 1000000;
            WebRequest request = base.GetWebRequest(address);
            request.Proxy = proxy;
            HttpWebRequest webRequest = request as HttpWebRequest;
            // Check for null before touching the HTTP-only settings.
            if (webRequest != null)
            {
                webRequest.Pipelined = true;
                webRequest.KeepAlive = true;
                webRequest.CookieContainer = m_container;
            }
            return request;
        }
        catch
        {
            return null;
        }
    }
}
Here is my usual use for it. Add a static copy to your base site class with all your parsing functions you likely have:
protected static CookieWebClient wc = new CookieWebClient();
And call it as such:
public HtmlDocument Download(string url)
{
    HtmlDocument hdoc = new HtmlDocument();
    HtmlNode.ElementsFlags.Remove("option");
    HtmlNode.ElementsFlags.Remove("select");
    Stream read = null;
    try
    {
        read = wc.OpenRead(url);
    }
    catch (ArgumentException)
    {
        read = wc.OpenRead(HttpHelper.HTTPEncode(url));
    }
    hdoc.Load(read, true);
    return hdoc;
}
The other main reason you may be crashing out is the connection is being closed by the server as you have had an open connection for too long. You can prove this by adding a try catch around the download part as above and if it fails, reset the webclient and try to download again:
HtmlDocument d = new HtmlDocument();
try
{
    d = this.Download(prp.PropertyUrl);
}
catch (WebException e)
{
    this.Msg(Site.ErrorSeverity.Severe, "Error connecting to " + this.URL + " : Resubmitting..");
    wc = new CookieWebClient();
    d = this.Download(prp.PropertyUrl);
}
This saves my ass all the time. Even if it was the server rejecting you, this can re-jig the lot: cookies are cleared and you're free to roam again. If worst truly comes to worst, add proxy support and get a new proxy applied every 50-ish requests.
That should be more than enough for you to kick your own and any other site's arse.
RATE ME!
Use a regular browser and Fiddler (if the developer tools are not up to scratch) and take a look at the request and response headers.
Build up your requests and request headers to match what the browser sends (you can use a couple of different browsers to assess whether this makes a difference).
In regards to "getting blocked after a certain amount of calls" - throttle your calls. Only make one call every x seconds. Behave nicely to the site and it will behave nicely to you.
Chances are good that they simply look at the number of calls from your IP address per second and if it passes a threshold, the IP address gets blocked.
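The "one call every x seconds" advice can be sketched as a small helper that tells the caller how long to wait before each request. This is only an illustration under assumptions: RequestThrottle, Reserve and the 2-second interval are made-up names and values, and the actual waiting (for example Thread.Sleep or Task.Delay with the returned delay) is left to the caller.

```csharp
using System;

public sealed class RequestThrottle
{
    private readonly TimeSpan _minInterval;
    private DateTime? _lastRequest;

    public RequestThrottle(TimeSpan minInterval)
    {
        _minInterval = minInterval;
    }

    // Returns how long the caller should wait before sending the next
    // request, and records the time at which that request will go out.
    public TimeSpan Reserve(DateTime now)
    {
        if (_lastRequest == null)
        {
            _lastRequest = now;
            return TimeSpan.Zero;
        }
        DateTime earliest = _lastRequest.Value + _minInterval;
        DateTime sendAt = earliest > now ? earliest : now;
        _lastRequest = sendAt;
        return sendAt - now;
    }
}

public static class ThrottleDemo
{
    public static void Main()
    {
        // Hypothetical policy: at most one request every 2 seconds.
        var throttle = new RequestThrottle(TimeSpan.FromSeconds(2));
        DateTime start = new DateTime(2020, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        Console.WriteLine(throttle.Reserve(start));                 // first request, no wait
        Console.WriteLine(throttle.Reserve(start.AddSeconds(0.5))); // too soon, must wait
    }
}
```

Passing the current time in as a parameter keeps the helper easy to test; in real use the caller would pass DateTime.UtcNow and sleep for the returned delay before each download.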
I have url like:
http://www.matweb.com/search/DataSheet.aspx?MatGUID=849e2916ab1541be9ff6a17b78f95c82
I want to download source code from that page using this code:
private static string urlTemplate = @"http://www.matweb.com/search/DataSheet.aspx?MatGUID=";
static string GetSource(string guid)
{
    try
    {
        Uri url = new Uri(urlTemplate + guid);
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
        webRequest.Method = "GET";
        HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse();
        Stream responseStream = webResponse.GetResponseStream();
        StreamReader responseStreamReader = new StreamReader(responseStream);
        String result = responseStreamReader.ReadToEnd();
        return result;
    }
    catch (Exception ex)
    {
        return null;
    }
}
When I do so I get:
You do not seem to have cookies enabled. MatWeb Requires cookies to be enabled.
Ok, that I understand, so I added lines:
CookieContainer cc = new CookieContainer();
webRequest.CookieContainer = cc;
I got:
Your IP Address has been restricted due to excessive use. The problem may be compounded when an IP address may be shared by many people in a company or through an internet service provider. We apologize for any inconvenience.
I can understand this but I'm not getting this message when I try to visit this page using web browser. What can I do to get the source code? Some cookies or http headers?
It probably doesn't like your UserAgent. Try this:
webRequest.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)"; //maybe substitute your own in here
It looks like you're doing something that the company doesn't like, if you got an "excessive use" response.
You are downloading pages too fast.
When you use a browser you might get up to one page per second. Using an application you can get several pages per second, and that's probably what their web server is detecting. Hence the "excessive use" message.