403 Forbidden when POSTing to a URL - C#

I am posting a request to a website:
request = (HttpWebRequest)WebRequest.Create("https://www.footlocker.dk/api/users/carts/current/entries?timestamp=1611595223668");
request.Method = "POST";
using (var streamWriter = new StreamWriter(request.GetRequestStream()))
{
    string json = "{\"user\":\"test\"," + "\"password\":\"bla\"}";
    streamWriter.Write(json);
}
var httpResponse = (HttpWebResponse)request.GetResponse();
using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
{
    var result = streamReader.ReadToEnd();
}
When I submit this request, I get a 403 Forbidden with the following HTML:
<html>
<head>
<title>footlocker.dk</title>
<style>
#cmsg{animation: A 1.5s;}@keyframes A{0%{opacity:0;}99%{opacity:0;}100%{opacity:1;}}
</style>
</head>
<body style="margin:0">
<p id="cmsg">Please enable JS and disable any ad blocker</p>
<script>
var dd={'cid':'AHrlqAAAAAMA2k9UvgFgVkIAk04eSQ==','hsh':'A55FBF4311ED6F1BF9911EB71931D5','t':'fe','r':'b','s':17434,'host':'geo.captcha-delivery.com'}</script><script src="https://ct.captcha-delivery.com/c.js">
</script>
</body>
</html>
Is there any way I can make the site think that JS is enabled?

Why are you trying to make this request? As per the Footlocker Terms of Service,
You may not without the prior written permission of Foot Locker, use any computer code, data mining software, "robot", "bot", "spider", "scraper" or other automatic device, or program, algorithm or methodology having similar processes or functionality, or any manual process, to monitor or copy any of the web pages, data or content found on this Site or App, or accessed through this Site or App.
I'm assuming you're attempting to perform unauthorized scraping/monitoring of this site, and I'd highly advise you stop as that's against the aforementioned terms and conditions.

Maybe try to specify a user-agent.

Related

Making a RestClient request shortly after HttpWebRequest leads to 404 - Not Found

My application is sending some data to some government's service.
The workflow is to first authenticate on their REST(JSON) service to get an authentication token, and then send the actual data+token to their SOAP service.
The problem is that if I call the authentication service in quick succession after the last SOAP request, their REST service returns a "404 – Not Found" HTML page instead of a JSON response.
This is the code for sending authentication requests:
RestClient client = new RestClient(ret.Url);
AuthRequestToken requestToken = new AuthRequestToken();
requestToken.userLoginDetails.organisationCode = _organizationCode;
requestToken.userLoginDetails.userId = _username;
requestToken.userLoginDetails.password = _password;
ret.RequestJson = requestToken.ToString();
var request = new RestRequest(Method.POST);
request.AddHeader("Content-Type", "application/json");
request.AddHeader("cache-control", "no-cache");
request.AddParameter("application/json", ret.RequestJson, ParameterType.RequestBody);
IRestResponse response = client.Execute(request);
This is the code for sending SOAP requests:
HttpWebRequest webRequest = CreateWebRequest(envelope);
using (WebResponse webResponse = webRequest.GetResponse())
using (Stream responseStream = webResponse.GetResponseStream())
using (StreamReader rd = new StreamReader(responseStream))
{
    // The using statements dispose the reader, stream and response in turn,
    // so no explicit Close() call is needed.
    ret.ResponseXML = rd.ReadToEnd();
}
This is the CreateWebRequest() method:
private HttpWebRequest CreateWebRequest(XElement content)
{
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(_url);
    //webRequest.Headers.Add("SOAPAction", action);
    webRequest.ContentType = "text/xml;charset=\"utf-8\"";
    webRequest.Accept = "text/xml";
    webRequest.Method = "POST";
    using (Stream stream = webRequest.GetRequestStream())
    {
        content.Save(stream);
    }
    return webRequest;
}
RestClient is a class in the RestSharp library downloaded from https://restsharp.dev/
Using TcpView or netstat -abn I can see that after any request (either RestClient or HttpWebRequest), the connection stays in ESTABLISHED state for up to 5-30 seconds.
Everything works fine 99% of the time, except in a specific scenario when I make a RestClient request within 5-30 seconds after the last HttpWebRequest, before the connection switches from ESTABLISHED to CLOSE_WAIT.
I should mention that this code was working perfectly up to a couple of days ago. Before then, their authentication service was on a different IP address from their SOAP service. Now they are on the same IP address, and probably even on the same physical server.
Before they switched the servers I used to send an authentication request before each and every SOAP request, and it worked. Since this error started happening, I have modified my code to authenticate only occasionally and reuse the same token for a batch of SOAP requests. This considerably reduced the chance of this error, but I still occasionally get it when traffic is high.
It seems to me that RestClient and HttpWebRequest are using the same socket under the hood and one of them is not cleaning up properly. It seems that RestClient inherits some junk from the HttpWebRequest because the "404 - Not Found" returned by the service looks the same as when I deliberately navigate to the wrong URL of the authentication service.
It is also possible that I'm not disposing or closing something properly, but I tried closing every stream, client or connection I could find, and injected 'using' everywhere, but nothing seems to help.
I tried contacting the government's tech support, but judging by my prior experience, it will take weeks before they even bother to connect me to someone who can understand the problem.
This is the 404 HTML I get:
<!doctype html>
<html lang="en">
<head>
<title>HTTP Status 404 – Not Found</title>
<style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style>
</head>
<body>
<h1>HTTP Status 404 – Not Found</h1>
<hr class="line" />
<p>
<b>Type</b> Status Report</p>
<p>
<b>Description</b> The origin server did not find a current representation for the target resource or is not willing to disclose that one exists.</p>
<hr class="line" />
<h3>Apache Tomcat/9.0.35</h3>
</body>
</html>
Do you have any suggestion on what I could try to prevent this from happening?
As I said, I currently have a workaround that tries to refresh the token when it gets the chance, and even delays regular requests if necessary, but I'd like to avoid workarounds if possible, especially since I don't know what the socket timeout is. It is 5 seconds on most computers, but on some wireless networks the connection stays ESTABLISHED for almost a minute.
If it matters, both services are on HTTPS.
Thank you!
I solved it by making a small console application that receives the credentials through command-line parameters, connects to the REST service, and writes a token to standard output.
The parent application periodically runs this exe in the background and reads the new token from its standard output.
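If the shared keep-alive connection really is the cause (that is the asker's hypothesis, not something confirmed here), one thing that might be worth trying before moving the call out of process is to stop the SOAP request from leaving a reusable connection behind. The sketch below assumes .NET Framework, where HttpWebRequest pools connections per host. SoapRequestFactory, CreateIsolated and the group name "soap" are made-up names for illustration; KeepAlive and ConnectionGroupName are existing HttpWebRequest properties.

```csharp
using System;
using System.Net;

public static class SoapRequestFactory
{
    // Hypothetical helper: builds a POST request like CreateWebRequest() in the
    // question, but opts out of connection reuse.
    public static HttpWebRequest CreateIsolated(string url)
    {
        var webRequest = (HttpWebRequest)WebRequest.Create(url);
        webRequest.ContentType = "text/xml;charset=\"utf-8\"";
        webRequest.Accept = "text/xml";
        webRequest.Method = "POST";

        // Sends "Connection: close", so the socket is not kept alive for a later request.
        webRequest.KeepAlive = false;

        // Puts SOAP traffic in its own connection group. The name is arbitrary;
        // this is an assumption about what helps, not a verified fix.
        webRequest.ConnectionGroupName = "soap";

        return webRequest;
    }
}
```

The request body would still be written to GetRequestStream() exactly as in the original method. Whether this changes the server's behavior can only be checked against the real service.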

Error 405 Method not Allowed on WebRequest

I am trying to grab the page source from the page below, but it gives me a 405 error. Fetching the home page works fine, but for this specific page I get Method Not Allowed. Any thoughts?
WebRequest request = WebRequest.Create("https://www.realtor.com/realestateandhomes-search/California/counties");
request.UseDefaultCredentials = true;
request.Credentials = CredentialCache.DefaultCredentials;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream dataStream = response.GetResponseStream();
StreamReader reader = new StreamReader(dataStream);
string responseFromServer = reader.ReadToEnd();
Console.WriteLine(responseFromServer);
The site thinks you are a bot.
Details:
I tried it with HttpClient (recommended: it doesn't throw an exception upon receiving a non-success response code) and inspected the response HTML. Here is the important snippet:
<p>
As you were browsing, something about your browser made us think you might be a bot. There are a few reasons this might happen, including:
</p>
<ul>
<li>You're a power user moving through this website with super-human speed</li>
<li>You've disabled JavaScript and/or cookies in your web browser</li>
<li>A third-party browser plugin is preventing JavaScript from running. Additional information is available in this
<a title='Third party browser plugins that block javascript' href='http://ds.tl/help-third-party-plugins' target='_blank'>
support article
</a>.
</li>
</ul>
If you want the full response, try running this:
// Returns a Task rather than void so the caller can await it and observe exceptions.
async System.Threading.Tasks.Task LogResponse()
{
    using System.Net.Http.HttpClient client = new System.Net.Http.HttpClient();
    var response = await client.GetAsync("https://www.realtor.com/realestateandhomes-search/California/counties");
    Console.WriteLine(await response.Content.ReadAsStringAsync());
}
A side complaint against realtor.com: 405 (the method specified in the Request-Line is not allowed) is a rather poor response code for this; a 403 (the server understood the request, but is refusing to fulfill it) seems better suited.

How to scrape data from another website which is built in AngularJS?

I have to get some specific data from another web page which is built in AngularJS.
What I have done until now:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
string result;
using (var response = request.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
{
    result = reader.ReadToEnd();
}
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(result);
It's not returning the rendered HTML. From what I found after searching, the site displays 4 items, but the page source contains only one item, written with {{item.name}}-style placeholder syntax.
How can I solve this issue?
If you use HttpWebRequest, it will just return the HTML template, which does not contain any data. Due to the nature of Angular, data binding happens later, in the browser, using JavaScript.
I suggest using the WebBrowser control instead of HttpWebRequest for data scraping. With WebBrowser you should be able to get the complete HTML after the $scope is initialized and the data has been added to the DOM.
To know more about how to use WebBrowser in ASP.NET you can check this link

Web Server Does Not Allow Using Post Method

I want to log in to a website with my user ID and password, fetch my data, and store it in a text file, but I get error 405 (Method Not Allowed). Can somebody help me figure this out?
Here is the HTML served by the web server:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>blablbablablabl</title>
</head>
<script type="text/javascript">
function login() {
setTimeout('window.close()',1000);
}
</script>
<body>
<div><h3>blablablaasdasd</h3><form onSubmit="javascript:login();" style='margin- top:10px;' id='loginPageForm' action='http://website.com' method='post' target='_blank'
<div>
<input name='t:ac' type='hidden' value='$002f$002website.com$002fclient$002fdefault$002fsearch$002faccount$003f' />
<input name='t:formdata' type='hidden' value='H4sIAAAAAAAAAJWQv0oDQRDGx4NAMJ1gEURstN2zMI02BkEQDgkc1mFvb7xs2Ntdd/ZMbKx8CRufQKz0CVLY+Q4+gI2FlYV7J6Lg/274mJnv932XD9CarMAyIXdiFA+4d0YnppB6czysCJ3mJZKDnnEF45aLETLPLZJ3Jz0mjEMlM5ZxQtbPgsiF35Wo8tUUfWXXDmad+8Xb5wjmEugIo8N3tR8+elhIxvyYx4rrIk69k7rYmloP8++uf8Hq/xdr4IxAorTKSkkkjZ5d5RuHTxd3EcDUfmtpOdHEuJyO4BSgwXyTfr2pT1qTJeh+sUU1hw9Btn8MIkxpjUbtiTXk/nOO8/Sxe3N9thNBlEBbKBm29xrvunpUWAahrr6R6qrbr+bD9Q/jCx9ggTUPAgAA' /></div>
<label for='identity'>Card Number:</label><div><input type='text' name='j_username' /</div>
<div style='clear:both;'></div>
<label for='password'>Password:</label>
<div><input name='j_password' type='password' class='pass' value='' /><input type='submit' value='Login' /></div></form></div>
</body>
</html>
Here is the C# code I am using to reach the server:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://website.com/file.html");
request.AllowAutoRedirect = true;
request.Timeout = 10000; // timeout 10s
request.Method = "POST";
String formContent = "t:ac=$002f$002website.com$002fclient$002fdefault$002fsearch$002faccount$003f&t:formdata=H4sIAAAAAAAAAJWQv0oDQRDGx4NAMJ1gEURstN2zMI02BkEQDgkc1mFvb7xs2Ntdd/ZMbKx8CRufQKz0CVLY+Q4+gI2FlYV7J6Lg/274mJnv932XD9CarMAyIXdiFA+4d0YnppB6czysCJ3mJZKDnnEF45aLETLPLZJ3Jz0mjEMlM5ZxQtbPgsiF35Wo8tUUfWXXDmad+8Xb5wjmEugIo8N3tR8+elhIxvyYx4rrIk69k7rYmloP8++uf8Hq/xdr4IxAorTKSkkkjZ5d5RuHTxd3EcDUfmtpOdHEuJyO4BSgwXyTfr2pT1qTJeh+sUU1hw9Btn8MIkxpjUbtiTXk/nOO8/Sxe3N9thNBlEBbKBm29xrvunpUWAahrr6R6qrbr+bD9Q/jCx9ggTUPAgAA&j_username=johndoe0&j_password=12345";
byte[] byteArray = Encoding.UTF8.GetBytes(formContent);
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = byteArray.Length;
Stream dataStream = request.GetRequestStream();
dataStream.Write(byteArray, 0, byteArray.Length);
dataStream.Close();
// Get the response ...
WebResponse response;
response = (HttpWebResponse)request.GetResponse();//ERROR OCCURS HERE!!!
dataStream = response.GetResponseStream();
StreamReader reader = new StreamReader(dataStream);
richTextBox1.AppendText(HttpUtility.UrlDecode(reader.ReadToEnd()));
reader.Close();
dataStream.Close();
response.Close();
EDIT: Problem solved, found another URL in that website that allows POST method.
@GSiry's solution is probably the way to go if you control the server you fetch data from.
Otherwise, the issue is about adjusting your request to whatever HTTP method the remote server accepts: Method Not Allowed means that the server rejects this particular method for the resource while accepting others, and for good reasons. See more on request safety and idempotence.
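One way to find out which methods the server does accept is to read the Allow header of the 405 response. The HTTP specification (RFC 7231) requires the server to send one with a 405, though a given server may not comply. A minimal sketch follows; the AllowedMethods class and its two methods are hypothetical names, and only the HttpWebRequest and WebException calls are existing API.

```csharp
using System;
using System.Linq;
using System.Net;

public static class AllowedMethods
{
    // Splits an Allow header value such as "GET, HEAD" into method names.
    public static string[] Parse(string allowHeader)
    {
        if (string.IsNullOrWhiteSpace(allowHeader)) return new string[0];
        return allowHeader.Split(',')
                          .Select(m => m.Trim())
                          .Where(m => m.Length > 0)
                          .ToArray();
    }

    // Sends a body-less request with the given method. If the server answers 405,
    // returns the methods listed in its Allow header (possibly none).
    public static string[] Probe(string url, string method)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = method;
        if (method == "POST" || method == "PUT") request.ContentLength = 0;
        try
        {
            using (request.GetResponse()) { }
            return new[] { method }; // the method was accepted
        }
        catch (WebException ex)
        {
            var response = ex.Response as HttpWebResponse;
            if (response == null || response.StatusCode != HttpStatusCode.MethodNotAllowed) throw;
            using (response)
            {
                return Parse(response.Headers["Allow"]);
            }
        }
    }
}
```

For example, AllowedMethods.Probe("http://website.com/file.html", "POST") would return the server's advertised methods if it answers 405; an empty array means the server sent no Allow header.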
What happens if you use GET instead of POST?
EDIT: Assuming you are really POSTing to the same URL from both the HTML form and your C# request (which does not seem to be the case anyway), the reason why it behaves differently is not obvious and is in fact dependent on the server implementation. That means we can only guess (for example, it might not like the user agent it gets, or the lack of one, from your C# code).
Anyway, I stand by the advice of using GET. There seems to be no reason at all to issue a POST request, since you don't intend to modify website.com/file.html, which is the stated purpose of POST method.
EDIT2: It's not necessary to use POST for a login per se. HTTP authentication can be performed through form parameters, through HTTP request headers, or through the authority part of the URL (http://username:password@website.com/your_file.html). But this depends entirely on the concrete server implementation.
If you can't access the server logs, I'm afraid you're in for a trial-and-error session. Start by mimicking the browser's request exactly. Firebug, or the developer console in Chrome or Safari, will show you exactly which headers the browser sends along with the request for which the POST method is allowed.
On a side note, what you should be using for an authentication procedure is SSL/TLS (https://...).
If you are using MVC, it might be as simple as adding the
[HttpPost]
attribute to the controller function that accepts your post request
If you're trying to access a WebService, add the following section to the target site's Web.config under system.web:
<webServices>
<protocols>
<add name="HttpPost"/>
</protocols>
</webServices>

How can I Browse a page Programmatically?

I've seen numerous examples on how to get the contents of a URI. I also used HTMLAgilityPack a lot.
What I want is to create a unit testing environment for ASP websites.
I've seen BrowserSession and this question, but although the process seems fine, they do not log in to a website. I tried numerous well-known websites.
Any ideas on how to browse through code?
It sounds like you want to submit a form on a web page and view the response HTML back of the resulting page.
This method will take a form target URL and submit a post with the given named arguments in the parms Dictionary.
I have used the method below to perform password authentication on a web page and view the response after authentication. You will need to know the target Url and the form fields you wish to pass in the request.
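// Needs: System, System.Collections.Generic, System.IO, System.Linq, System.Net, System.Text
private string SubmitRequest(string url, Dictionary<string, string> parms)
{
    var req = WebRequest.Create(url);
    req.Method = "POST";
    req.ContentType = "application/x-www-form-urlencoded";
    // URL-encode keys and values so characters such as '&' or '=' cannot corrupt the body.
    string parmsString = string.Join("&", parms.Select(p =>
        Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));
    // ContentLength is a byte count, not a character count.
    byte[] body = Encoding.UTF8.GetBytes(parmsString);
    req.ContentLength = body.Length;
    using (Stream stream = req.GetRequestStream())
    {
        stream.Write(body, 0, body.Length);
    }
    using (var res = req.GetResponse())
    using (StreamReader reader = new StreamReader(res.GetResponseStream()))
    {
        return reader.ReadToEnd();
    }
}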
If there is something more specific you are wanting or this is not what you are looking for then please post a comment.
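For comparison, here is a minimal sketch of the same form post using HttpClient and FormUrlEncodedContent, which take care of the encoding and the Content-Type header. FormPoster, EncodeAsync and PostAsync are hypothetical names for illustration, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

public static class FormPoster
{
    // Returns the application/x-www-form-urlencoded body for the given fields,
    // which is useful for checking what would be sent.
    public static async Task<string> EncodeAsync(IEnumerable<KeyValuePair<string, string>> fields)
    {
        using (var content = new FormUrlEncodedContent(fields))
        {
            return await content.ReadAsStringAsync();
        }
    }

    // Posts the fields to the form's target URL and returns the response body.
    // HttpClient does not throw on a non-success status code here, so the
    // caller gets the error page as the return value.
    public static async Task<string> PostAsync(
        HttpClient client, string url, IEnumerable<KeyValuePair<string, string>> fields)
    {
        using (var content = new FormUrlEncodedContent(fields))
        using (HttpResponseMessage response = await client.PostAsync(url, content))
        {
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

A Dictionary<string, string> can be passed directly as the fields argument. Note that a login which sets cookies only carries over to later requests if the same HttpClient instance is reused.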
My suggestion is to try some WebDriverJs tutorials and see if that works for you. It is mainly used for testing but can also be used for other purposes. I am using it to automate responding to users' queries on a shopping platform.
