How can I get HTML from a page with Cloudflare DDoS protection? - C#

I use HtmlAgilityPack to get webpage data, but I have tried everything with a page that uses www.cloudflare.com DDoS protection. The redirect page is not possible to handle in HtmlAgilityPack because they don't redirect with meta or JS, I guess; they check whether you have already been verified via a cookie, which I failed to simulate in C#. When I get the page, the HTML code is from the Cloudflare landing page.

I also encountered this problem some time ago. The real solution would be to solve the challenge the Cloudflare website gives you (you need to compute a correct answer using JavaScript, send it back, and then you receive a cookie / your token with which you can continue to view the website). Otherwise, all you get is the Cloudflare landing page.
In the end, I just called a Python script with a shell-execute. I used the modules provided within this GitHub fork. This could serve as a starting point to implement the circumvention of the Cloudflare anti-DDoS page in C# as well.
FYI, the Python script I wrote for my personal use just writes the cookie to a file. I read that file later using C# and store the cookie in a CookieJar to continue browsing the page within C#.
#!/usr/bin/env python
import cfscrape
import sys
scraper = cfscrape.create_scraper() # returns a requests.Session object
fd = open("cookie.txt", "w")
c = cfscrape.get_cookie_string(sys.argv[1])
fd.write(str(c))
fd.close()
print(c)
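The C# read-back side can be as simple as this sketch (the cookie.txt format follows the script above; the target URL is a placeholder):
using System;
using System.IO;
using System.Net;

class CookieImport
{
    static void Main()
    {
        // cookie.txt holds a "name=value; name2=value2" string written by the Python script.
        string cookieHeader = File.ReadAllText("cookie.txt").Trim();
        var jar = new CookieContainer();
        var target = new Uri("https://example.com"); // hypothetical protected site

        foreach (string part in cookieHeader.Split(';'))
        {
            int eq = part.IndexOf('=');
            if (eq <= 0) continue; // skip malformed fragments
            jar.Add(new Cookie(part.Substring(0, eq).Trim(),
                               part.Substring(eq + 1).Trim(), "/", target.Host));
        }

        // Continue browsing with the clearance cookies attached.
        var req = (HttpWebRequest)WebRequest.Create(target);
        req.CookieContainer = jar;
    }
}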
EDIT: To repeat this, this has only LITTLE to do with cookies! Cloudflare forces you to solve a REAL challenge using JavaScript. It's not as easy as accepting a cookie and using it later on. Look at https://github.com/Anorov/cloudflare-scrape/blob/master/cfscrape/__init__.py and the ~40 lines of JavaScript emulation needed to solve the challenge.
Edit2: Instead of writing something to circumvent the protection, I've also seen people using a fully-fledged browser object (this is not a headless browser) to go to the website and subscribe to certain events when the page is loaded. Use the WebBrowser class to create an infinitely small browser window and subscribe to the appropriate events; a rough sketch follows.
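Something along these lines (WinForms; the URL is a placeholder, and in practice you must wait for the post-challenge navigation before the clearance cookie appears in document.cookie):
using System;
using System.Windows.Forms;

class HiddenBrowser
{
    [STAThread]
    static void Main()
    {
        var browser = new WebBrowser { Width = 1, Height = 1, ScriptErrorsSuppressed = true };
        browser.DocumentCompleted += (s, e) =>
        {
            // Fires once per navigation; after the challenge page reloads itself,
            // the clearance cookie shows up in document.cookie.
            if (browser.Document != null)
                Console.WriteLine(browser.Document.Cookie);
        };
        browser.Navigate("https://example.com"); // hypothetical protected site
        Application.Run(); // pump messages so the browser can run the challenge JS
    }
}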
Edit3:
Alright, I actually implemented the C# way to do this. This uses the JavaScript Engine Jint for .NET, available via https://www.nuget.org/packages/Jint
The cookie-handling code is ugly because sometimes the HttpWebResponse class won't pick up the cookies, even though the header contains a Set-Cookie section.
using System;
using System.Net;
using System.IO;
using System.Text.RegularExpressions;
using System.Web;
using System.Collections;
using System.Threading;
namespace Cloudflare_Evader
{
public class CloudflareEvader
{
/// <summary>
/// Tries to return a WebClient with the necessary cookies installed to do requests for a Cloudflare-protected website.
/// </summary>
/// <param name="url">The page which is behind Cloudflare's anti-DDoS protection</param>
/// <returns>A WebClient object or null on failure</returns>
public static WebClient CreateBypassedWebClient(string url)
{
var JSEngine = new Jint.Engine(); //Use this JavaScript engine to compute the result.
//Download the original page
var uri = new Uri(url);
HttpWebRequest req =(HttpWebRequest) WebRequest.Create(url);
req.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0";
//Try to make the usual request first. If this fails with a 503, the page is behind cloudflare.
try
{
var res = req.GetResponse();
string html = "";
using (var reader = new StreamReader(res.GetResponseStream()))
html = reader.ReadToEnd();
return new WebClient();
}
catch (WebException ex) //We usually get this because of a 503 service not available.
{
string html = "";
using (var reader = new StreamReader(ex.Response.GetResponseStream()))
html = reader.ReadToEnd();
//If we get on the landing page, Cloudflare gives us a User-ID token with the cookie. We need to save that and use it in the next request.
var cookie_container = new CookieContainer();
//using a custom function because ex.Response.Cookies returns an empty set ALTHOUGH cookies were sent back.
var initial_cookies = GetAllCookiesFromHeader(ex.Response.Headers["Set-Cookie"], uri.Host);
foreach (Cookie init_cookie in initial_cookies)
cookie_container.Add(init_cookie);
/* Solve the actual challenge with a bunch of regexes. Copy-pasted from the Python scraper version. */
var challenge = Regex.Match(html, "name=\"jschl_vc\" value=\"(\\w+)\"").Groups[1].Value;
var challenge_pass = Regex.Match(html, "name=\"pass\" value=\"(.+?)\"").Groups[1].Value;
var builder = Regex.Match(html, @"setTimeout\(function\(\){\s+(var t,r,a,f.+?\r?\n[\s\S]+?a\.value =.+?)\r?\n").Groups[1].Value;
builder = Regex.Replace(builder, @"a\.value =(.+?) \+ .+?;", "$1");
builder = Regex.Replace(builder, @"\s{3,}[a-z](?: = |\.).+", "");
//Format the javascript..
builder = Regex.Replace(builder, @"[\n\\']", "");
//Execute it.
long solved = long.Parse(JSEngine.Execute(builder).GetCompletionValue().ToObject().ToString());
solved += uri.Host.Length; //add the length of the domain to it.
Console.WriteLine("***** SOLVED CHALLENGE ******: " + solved);
Thread.Sleep(3000); //This sleep IS required, or Cloudflare will not give you the token!
//Retrieve the cookies. Prepare the URL for cookie exfiltration.
string cookie_url = string.Format("{0}://{1}/cdn-cgi/l/chk_jschl", uri.Scheme, uri.Host);
var uri_builder = new UriBuilder(cookie_url);
var query = HttpUtility.ParseQueryString(uri_builder.Query);
//Add our answers to the GET query
query["jschl_vc"] = challenge;
query["jschl_answer"] = solved.ToString();
query["pass"] = challenge_pass;
uri_builder.Query = query.ToString();
//Create the actual request to get the security clearance cookie
HttpWebRequest cookie_req = (HttpWebRequest) WebRequest.Create(uri_builder.Uri);
cookie_req.AllowAutoRedirect = false;
cookie_req.CookieContainer = cookie_container;
cookie_req.Referer = url;
cookie_req.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0";
//We assume that this request goes through well, so no try-catch
var cookie_resp = (HttpWebResponse)cookie_req.GetResponse();
//The response *should* contain the security clearance cookie!
if (cookie_resp.Cookies.Count != 0) //first check if the HttpWebResponse has picked up the cookie.
foreach (Cookie cookie in cookie_resp.Cookies)
cookie_container.Add(cookie);
else //otherwise, use the custom function again
{
//the cookie we *hopefully* received here is the cloudflare security clearance token.
if (cookie_resp.Headers["Set-Cookie"] != null)
{
var cookies_parsed = GetAllCookiesFromHeader(cookie_resp.Headers["Set-Cookie"], uri.Host);
foreach (Cookie cookie in cookies_parsed)
cookie_container.Add(cookie);
}
else
{
//No security clearance? Something went wrong... return null.
//Console.WriteLine("MASSIVE ERROR: COULDN'T GET CLOUDFLARE CLEARANCE!");
return null;
}
}
//Create a custom webclient with the two cookies we already acquired.
WebClient modedWebClient = new WebClientEx(cookie_container);
modedWebClient.Headers.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0");
modedWebClient.Headers.Add("Referer", url);
return modedWebClient;
}
}
/* Credit goes to https://stackoverflow.com/questions/15103513/httpwebresponse-cookies-empty-despite-set-cookie-header-no-redirect
(user https://stackoverflow.com/users/541404/cameron-tinker) for these functions
*/
public static CookieCollection GetAllCookiesFromHeader(string strHeader, string strHost)
{
ArrayList al = new ArrayList();
CookieCollection cc = new CookieCollection();
if (strHeader != string.Empty)
{
al = ConvertCookieHeaderToArrayList(strHeader);
cc = ConvertCookieArraysToCookieCollection(al, strHost);
}
return cc;
}
private static ArrayList ConvertCookieHeaderToArrayList(string strCookHeader)
{
strCookHeader = strCookHeader.Replace("\r", "");
strCookHeader = strCookHeader.Replace("\n", "");
string[] strCookTemp = strCookHeader.Split(',');
ArrayList al = new ArrayList();
int i = 0;
int n = strCookTemp.Length;
while (i < n)
{
if (strCookTemp[i].IndexOf("expires=", StringComparison.OrdinalIgnoreCase) > 0)
{
al.Add(strCookTemp[i] + "," + strCookTemp[i + 1]);
i = i + 1;
}
else
al.Add(strCookTemp[i]);
i = i + 1;
}
return al;
}
private static CookieCollection ConvertCookieArraysToCookieCollection(ArrayList al, string strHost)
{
CookieCollection cc = new CookieCollection();
int alcount = al.Count;
string strEachCook;
string[] strEachCookParts;
for (int i = 0; i < alcount; i++)
{
strEachCook = al[i].ToString();
strEachCookParts = strEachCook.Split(';');
int intEachCookPartsCount = strEachCookParts.Length;
string strCNameAndCValue = string.Empty;
string strPNameAndPValue = string.Empty;
string strDNameAndDValue = string.Empty;
string[] NameValuePairTemp;
Cookie cookTemp = new Cookie();
for (int j = 0; j < intEachCookPartsCount; j++)
{
if (j == 0)
{
strCNameAndCValue = strEachCookParts[j];
if (strCNameAndCValue != string.Empty)
{
int firstEqual = strCNameAndCValue.IndexOf("=");
string firstName = strCNameAndCValue.Substring(0, firstEqual);
string allValue = strCNameAndCValue.Substring(firstEqual + 1, strCNameAndCValue.Length - (firstEqual + 1));
cookTemp.Name = firstName;
cookTemp.Value = allValue;
}
continue;
}
if (strEachCookParts[j].IndexOf("path", StringComparison.OrdinalIgnoreCase) >= 0)
{
strPNameAndPValue = strEachCookParts[j];
if (strPNameAndPValue != string.Empty)
{
NameValuePairTemp = strPNameAndPValue.Split('=');
if (NameValuePairTemp[1] != string.Empty)
cookTemp.Path = NameValuePairTemp[1];
else
cookTemp.Path = "/";
}
continue;
}
if (strEachCookParts[j].IndexOf("domain", StringComparison.OrdinalIgnoreCase) >= 0)
{
strPNameAndPValue = strEachCookParts[j];
if (strPNameAndPValue != string.Empty)
{
NameValuePairTemp = strPNameAndPValue.Split('=');
if (NameValuePairTemp[1] != string.Empty)
cookTemp.Domain = NameValuePairTemp[1];
else
cookTemp.Domain = strHost;
}
continue;
}
}
if (cookTemp.Path == string.Empty)
cookTemp.Path = "/";
if (cookTemp.Domain == string.Empty)
cookTemp.Domain = strHost;
cc.Add(cookTemp);
}
return cc;
}
}
/*Credit goes to https://stackoverflow.com/questions/1777221/using-cookiecontainer-with-webclient-class
(user https://stackoverflow.com/users/129124/pavel-savara) */
public class WebClientEx : WebClient
{
public WebClientEx(CookieContainer container)
{
this.container = container;
}
public CookieContainer CookieContainer
{
get { return container; }
set { container = value; }
}
private CookieContainer container = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest r = base.GetWebRequest(address);
var request = r as HttpWebRequest;
if (request != null)
{
request.CookieContainer = container;
}
return r;
}
protected override WebResponse GetWebResponse(WebRequest request, IAsyncResult result)
{
WebResponse response = base.GetWebResponse(request, result);
ReadCookies(response);
return response;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse response = base.GetWebResponse(request);
ReadCookies(response);
return response;
}
private void ReadCookies(WebResponse r)
{
var response = r as HttpWebResponse;
if (response != null)
{
CookieCollection cookies = response.Cookies;
container.Add(cookies);
}
}
}
}
The function returns a WebClient with the challenge solved and the cookies installed. You can use it as follows:
static void Main(string[] args)
{
WebClient client = null;
while (client == null)
{
Console.WriteLine("Trying..");
client = CloudflareEvader.CreateBypassedWebClient("http://anilinkz.tv");
}
Console.WriteLine("Solved! We're clear to go");
Console.WriteLine(client.DownloadString("http://anilinkz.tv/anime-list"));
Console.ReadLine();
}

A "simple" working method to bypass Cloudflare if you don't use libraries (that sometimes does not work).
Open a "hidden" WebBrowser (size 1,1 or so).
Open the root of your target Cloudflare site.
Get the cookies from WebBrowser.
Use these cookies in WebClient.
Make sure the UserAgent for both the WebBrowser and the WebClient is identical. Cloudflare will give the WebClient a 503 afterwards if there's a mismatch.
You will need to search here on Stack Overflow for how to get cookies from a WebBrowser and how to modify WebClient so you can set its CookieContainer, plus set the UserAgent on one or both so they are identical; a sketch follows below.
Since the cookies from Cloudflare seem never to expire, you can serialize them somewhere temporary and load them each time your app runs, perhaps with a verification and a refetch on failure.
I've been doing this for a while and it works quite well. I could not get the C# libraries to work for one specific Cloudflare site even though they worked on others; no clue why yet.
This also works behind the scenes on an IIS server, but you will have to use "frowned upon" settings: run the app pool as SYSTEM or ADMIN and set it to Classic mode.
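A condensed sketch of steps 3-5 (webBrowser1, the user agent and the URL are placeholders; WebClientEx is the cookie-aware WebClient from the earlier answer):
const string UA = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0";
var uri = new Uri("https://example.com");

// 3. Pull the cookies out of the WebBrowser ("name=value; name2=value2").
string docCookies = webBrowser1.Document.Cookie;
var jar = new CookieContainer();
foreach (string pair in docCookies.Split(';'))
{
    int eq = pair.IndexOf('=');
    if (eq > 0)
        jar.Add(new Cookie(pair.Substring(0, eq).Trim(),
                           pair.Substring(eq + 1).Trim(), "/", uri.Host));
}

// 4./5. Use them in a cookie-aware WebClient with the *same* user agent.
using (var client = new WebClientEx(jar))
{
    client.Headers["User-Agent"] = UA; // must match what the WebBrowser sent
    string html = client.DownloadString(uri);
}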

Nowadays, an answer should mention the FlareSolverr project.
It is meant to be deployed as a container using Docker, so you only have to give it a port and it's running.
It doesn't weigh on your project, since you don't import a library, and it is currently maintained. The only drawback I see is that you need to install Docker to make it work.
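For illustration, calling a locally running FlareSolverr from C# might look like this (a sketch; port 8191 and the request/response shape follow FlareSolverr's documented v1 API, so verify against the project's README):
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class FlareSolverrExample
{
    static async Task Main()
    {
        using (var http = new HttpClient())
        {
            // FlareSolverr's default endpoint when run via Docker.
            string payload = "{\"cmd\": \"request.get\", \"url\": \"https://example.com\", \"maxTimeout\": 60000}";
            var resp = await http.PostAsync("http://localhost:8191/v1",
                new StringContent(payload, Encoding.UTF8, "application/json"));
            // The JSON response carries the rendered HTML plus the clearance cookies.
            Console.WriteLine(await resp.Content.ReadAsStringAsync());
        }
    }
}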

Use WebClient to get the HTML of the page.
I wrote the following class, which handles cookies too;
just pass a CookieContainer instance to the constructor.
using System;
using System.Collections.Generic;
using System.Configuration;
using System.Linq;
using System.Net;
using System.Text;
namespace NitinJS
{
public class SmsWebClient : WebClient
{
public SmsWebClient(CookieContainer container, Dictionary<string, string> Headers)
: this(container)
{
foreach (var keyVal in Headers)
{
this.Headers[keyVal.Key] = keyVal.Value;
}
}
public SmsWebClient(bool flgAddContentType = true)
: this(new CookieContainer(), flgAddContentType)
{
}
public SmsWebClient(CookieContainer container, bool flgAddContentType = true)
{
this.Encoding = Encoding.UTF8;
System.Net.ServicePointManager.Expect100Continue = false;
ServicePointManager.MaxServicePointIdleTime = 2000;
this.container = container;
if (flgAddContentType)
this.Headers["Content-Type"] = "application/json";//"application/x-www-form-urlencoded";
this.Headers["Accept"] = "application/json, text/javascript, */*; q=0.01";// "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
//this.Headers["Accept-Encoding"] = "gzip, deflate";
this.Headers["Accept-Language"] = "en-US,en;q=0.5";
this.Headers["User-Agent"] = "Mozilla/5.0 (Windows NT 6.1; rv:23.0) Gecko/20100101 Firefox/23.0";
this.Headers["X-Requested-With"] = "XMLHttpRequest";
//this.Headers["Connection"] = "keep-alive";
}
private readonly CookieContainer container = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest r = base.GetWebRequest(address);
var request = r as HttpWebRequest;
if (request != null)
{
request.CookieContainer = container;
request.Timeout = 3600000; //60 * 60 * 1000 = 1 hour
}
return r;
}
protected override WebResponse GetWebResponse(WebRequest request, IAsyncResult result)
{
WebResponse response = base.GetWebResponse(request, result);
ReadCookies(response);
return response;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse response = base.GetWebResponse(request);
ReadCookies(response);
return response;
}
private void ReadCookies(WebResponse r)
{
var response = r as HttpWebResponse;
if (response != null)
{
CookieCollection cookies = response.Cookies;
container.Add(cookies);
}
}
}
}
USAGE:
CookieContainer cookies = new CookieContainer();
SmsWebClient client = new SmsWebClient(cookies);
string html = client.DownloadString("http://www.google.com");
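Since the class defaults to JSON headers, a POST looks like this (the URL and payload are placeholders):
CookieContainer cookies = new CookieContainer();
SmsWebClient client = new SmsWebClient(cookies);
string response = client.UploadString("https://example.com/api", "{\"key\": \"value\"}"); // sent as application/json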

Related

C# getting JSON through Cloudflare protection

FIXED - I got this to work by declaring the WebClient as static and then doing the actual call inside a using statement; now it works, if anyone else comes across this. The original issue was that it was losing the cookies after the initial call.
I'm trying to implement the first answer to this question. I don't have enough reputation to reply directly to the answer because I just signed up. Also, I'm new to C#, so this may be easy, I hope. I've gotten as far as changing the Regex.Match parameters based on this script, but it still doesn't work. Here are the changes I made.
var builder = Regex.Match(html, @"setTimeout\(function\(\){\s+(var s,t,o,p,b,r,e,a,k,i,n,g,f.+?\r?\n[\s\S]+?a\.value =.+?)\r?\n").Groups[1].Value;
builder = Regex.Replace(builder, @"a\.value = (parseInt\(.+?\)).+", "$1");
builder = Regex.Replace(builder, @"\s{3,}[a-z](?: = |\.).+", "");
//Format the javascript..
builder = Regex.Replace(builder, @"[\n\\']", "");
Here's how I'm calling it
private static WebClient ddosclient = null;
public MainWindow()
{
while (ddosclient == null)
{
Console.WriteLine("Trying..");
ddosclient = Cloudflare_Evader.CloudflareEvader.CreateBypassedWebClient("https://yobit.net");
}
using (ddosclient)
{
string tradesuri = "";
string tradesjson = "";
string depthjson = "";
string depthuri = "";
tradesuri = "https://yobit.net/api/3/trades/" + pair;
Console.WriteLine(tradesuri);
tradesjson = ddosclient.DownloadString(tradesuri);
depthuri = "https://yobit.net/api/3/depth/" + pair;
Console.WriteLine(depthuri);
depthjson = ddosclient.DownloadString(depthuri);
}
}
Here's the code from the link, including my edits to the Regex.Match:
using System;
using System.Net;
using System.IO;
using System.Text.RegularExpressions;
using System.Web;
using System.Collections;
using System.Threading;
namespace Cloudflare_Evader
{
public class CloudflareEvader
{
/// <summary>
/// Tries to return a WebClient with the necessary cookies installed to do requests for a Cloudflare-protected website.
/// </summary>
/// <param name="url">The page which is behind Cloudflare's anti-DDoS protection</param>
/// <returns>A WebClient object or null on failure</returns>
public static WebClient CreateBypassedWebClient(string url)
{
var JSEngine = new Jint.Engine(); //Use this JavaScript engine to compute the result.
//Download the original page
var uri = new Uri(url);
HttpWebRequest req =(HttpWebRequest) WebRequest.Create(url);
req.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0";
//Try to make the usual request first. If this fails with a 503, the page is behind cloudflare.
try
{
var res = req.GetResponse();
string html = "";
using (var reader = new StreamReader(res.GetResponseStream()))
html = reader.ReadToEnd();
return new WebClient();
}
catch (WebException ex) //We usually get this because of a 503 service not available.
{
string html = "";
using (var reader = new StreamReader(ex.Response.GetResponseStream()))
html = reader.ReadToEnd();
//If we get on the landing page, Cloudflare gives us a User-ID token with the cookie. We need to save that and use it in the next request.
var cookie_container = new CookieContainer();
//using a custom function because ex.Response.Cookies returns an empty set ALTHOUGH cookies were sent back.
var initial_cookies = GetAllCookiesFromHeader(ex.Response.Headers["Set-Cookie"], uri.Host);
foreach (Cookie init_cookie in initial_cookies)
cookie_container.Add(init_cookie);
/* Solve the actual challenge with a bunch of regexes. Copy-pasted from the Python scraper version. */
var challenge = Regex.Match(html, "name=\"jschl_vc\" value=\"(\\w+)\"").Groups[1].Value;
var challenge_pass = Regex.Match(html, "name=\"pass\" value=\"(.+?)\"").Groups[1].Value;
var builder = Regex.Match(html, @"setTimeout\(function\(\){\s+(var s,t,o,p,b,r,e,a,k,i,n,g,f.+?\r?\n[\s\S]+?a\.value =.+?)\r?\n").Groups[1].Value;
builder = Regex.Replace(builder, @"a\.value = (parseInt\(.+?\)).+", "$1");
builder = Regex.Replace(builder, @"\s{3,}[a-z](?: = |\.).+", "");
//Format the javascript..
builder = Regex.Replace(builder, @"[\n\\']", "");
//Execute it.
long solved = long.Parse(JSEngine.Execute(builder).GetCompletionValue().ToObject().ToString());
solved += uri.Host.Length; //add the length of the domain to it.
Console.WriteLine("***** SOLVED CHALLENGE ******: " + solved);
Thread.Sleep(3000); //This sleep IS required, or Cloudflare will not give you the token!
//Retrieve the cookies. Prepare the URL for cookie exfiltration.
string cookie_url = string.Format("{0}://{1}/cdn-cgi/l/chk_jschl", uri.Scheme, uri.Host);
var uri_builder = new UriBuilder(cookie_url);
var query = HttpUtility.ParseQueryString(uri_builder.Query);
//Add our answers to the GET query
query["jschl_vc"] = challenge;
query["jschl_answer"] = solved.ToString();
query["pass"] = challenge_pass;
uri_builder.Query = query.ToString();
//Create the actual request to get the security clearance cookie
HttpWebRequest cookie_req = (HttpWebRequest) WebRequest.Create(uri_builder.Uri);
cookie_req.AllowAutoRedirect = false;
cookie_req.CookieContainer = cookie_container;
cookie_req.Referer = url;
cookie_req.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0";
//We assume that this request goes through well, so no try-catch
var cookie_resp = (HttpWebResponse)cookie_req.GetResponse();
//The response *should* contain the security clearance cookie!
if (cookie_resp.Cookies.Count != 0) //first check if the HttpWebResponse has picked up the cookie.
foreach (Cookie cookie in cookie_resp.Cookies)
cookie_container.Add(cookie);
else //otherwise, use the custom function again
{
//the cookie we *hopefully* received here is the cloudflare security clearance token.
if (cookie_resp.Headers["Set-Cookie"] != null)
{
var cookies_parsed = GetAllCookiesFromHeader(cookie_resp.Headers["Set-Cookie"], uri.Host);
foreach (Cookie cookie in cookies_parsed)
cookie_container.Add(cookie);
}
else
{
//No security clearance? Something went wrong... return null.
//Console.WriteLine("MASSIVE ERROR: COULDN'T GET CLOUDFLARE CLEARANCE!");
return null;
}
}
//Create a custom webclient with the two cookies we already acquired.
WebClient modedWebClient = new WebClientEx(cookie_container);
modedWebClient.Headers.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0");
modedWebClient.Headers.Add("Referer", url);
return modedWebClient;
}
}
public static CookieCollection GetAllCookiesFromHeader(string strHeader, string strHost)
{
ArrayList al = new ArrayList();
CookieCollection cc = new CookieCollection();
if (strHeader != string.Empty)
{
al = ConvertCookieHeaderToArrayList(strHeader);
cc = ConvertCookieArraysToCookieCollection(al, strHost);
}
return cc;
}
private static ArrayList ConvertCookieHeaderToArrayList(string strCookHeader)
{
strCookHeader = strCookHeader.Replace("\r", "");
strCookHeader = strCookHeader.Replace("\n", "");
string[] strCookTemp = strCookHeader.Split(',');
ArrayList al = new ArrayList();
int i = 0;
int n = strCookTemp.Length;
while (i < n)
{
if (strCookTemp[i].IndexOf("expires=", StringComparison.OrdinalIgnoreCase) > 0)
{
al.Add(strCookTemp[i] + "," + strCookTemp[i + 1]);
i = i + 1;
}
else
al.Add(strCookTemp[i]);
i = i + 1;
}
return al;
}
private static CookieCollection ConvertCookieArraysToCookieCollection(ArrayList al, string strHost)
{
CookieCollection cc = new CookieCollection();
int alcount = al.Count;
string strEachCook;
string[] strEachCookParts;
for (int i = 0; i < alcount; i++)
{
strEachCook = al[i].ToString();
strEachCookParts = strEachCook.Split(';');
int intEachCookPartsCount = strEachCookParts.Length;
string strCNameAndCValue = string.Empty;
string strPNameAndPValue = string.Empty;
string strDNameAndDValue = string.Empty;
string[] NameValuePairTemp;
Cookie cookTemp = new Cookie();
for (int j = 0; j < intEachCookPartsCount; j++)
{
if (j == 0)
{
strCNameAndCValue = strEachCookParts[j];
if (strCNameAndCValue != string.Empty)
{
int firstEqual = strCNameAndCValue.IndexOf("=");
string firstName = strCNameAndCValue.Substring(0, firstEqual);
string allValue = strCNameAndCValue.Substring(firstEqual + 1, strCNameAndCValue.Length - (firstEqual + 1));
cookTemp.Name = firstName;
cookTemp.Value = allValue;
}
continue;
}
if (strEachCookParts[j].IndexOf("path", StringComparison.OrdinalIgnoreCase) >= 0)
{
strPNameAndPValue = strEachCookParts[j];
if (strPNameAndPValue != string.Empty)
{
NameValuePairTemp = strPNameAndPValue.Split('=');
if (NameValuePairTemp[1] != string.Empty)
cookTemp.Path = NameValuePairTemp[1];
else
cookTemp.Path = "/";
}
continue;
}
if (strEachCookParts[j].IndexOf("domain", StringComparison.OrdinalIgnoreCase) >= 0)
{
strPNameAndPValue = strEachCookParts[j];
if (strPNameAndPValue != string.Empty)
{
NameValuePairTemp = strPNameAndPValue.Split('=');
if (NameValuePairTemp[1] != string.Empty)
cookTemp.Domain = NameValuePairTemp[1];
else
cookTemp.Domain = strHost;
}
continue;
}
}
if (cookTemp.Path == string.Empty)
cookTemp.Path = "/";
if (cookTemp.Domain == string.Empty)
cookTemp.Domain = strHost;
cc.Add(cookTemp);
}
return cc;
}
}
public class WebClientEx : WebClient
{
public WebClientEx(CookieContainer container)
{
this.container = container;
}
public CookieContainer CookieContainer
{
get { return container; }
set { container = value; }
}
private CookieContainer container = new CookieContainer();
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest r = base.GetWebRequest(address);
var request = r as HttpWebRequest;
if (request != null)
{
request.CookieContainer = container;
}
return r;
}
protected override WebResponse GetWebResponse(WebRequest request, IAsyncResult result)
{
WebResponse response = base.GetWebResponse(request, result);
ReadCookies(response);
return response;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse response = base.GetWebResponse(request);
ReadCookies(response);
return response;
}
private void ReadCookies(WebResponse r)
{
var response = r as HttpWebResponse;
if (response != null)
{
CookieCollection cookies = response.Cookies;
container.Add(cookies);
}
}
}
}

HTML Agility Pack Problems with W3C tools

I'm trying to access the HTML result of the W3C mobileOK Checker by passing a URL such as
http://validator.w3.org/mobile/check?async=false&docAddr=http%3A%2F%2Fwww.google.com/%2Ftv%2F
The URL works if you put it in a browser, but I can't seem to access it via the HtmlAgilityPack. The reason is probably that the URL needs to send a number of requests to its server, since it's an online testing tool, so it's not just a "static" URL. I have accessed other URLs without any problems. Below is my code:
HtmlAgilityPack.HtmlDocument webGet = new HtmlAgilityPack.HtmlDocument();
HtmlWeb hw = new HtmlWeb();
webGet = hw.Load("http://validator.w3.org/mobile/check?async=false&docAddr=http%3A%2F%2Fwww.google.com/%2Ftv%2F");
HtmlNodeCollection nodes = webGet.DocumentNode.SelectNodes("//head");
if (nodes != null)
{
foreach(HtmlNode n in nodes)
{
string x = n.InnerHtml;
}
}
Edit: I tried to access it via a StreamReader, and the website returns the following error: The remote server returned an error: (403) Forbidden.
I'm guessing that it's related.
I checked your example and was able to verify the described behaviour. It seems to me that w3.org checks whether the requesting program is a browser or something else.
I created an extended WebClient class for another project of my own, and with it I was able to access the given URL successfully.
Program.cs
WebClientExtended client = new WebClientExtended();
string exportPath = @"e:\temp"; // adapt to your own needs
string url = "http://validator.w3.org/mobile/check?async=false&docAddr=http%3A%2F%2Fwww.google.com/%2Ftv%2F";
/// load html by using the custom webClient class,
/// but use HtmlAgilityPack for parsing, manipulation and so on
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(System.Text.Encoding.UTF8.GetString(client.DownloadData(url)));
doc.Save(Path.Combine(exportPath, "check.html"));
WebClientExtended
public class WebClientExtended : WebClient
{
#region Fields
private CookieContainer container = new CookieContainer();
#endregion
#region Properties
public CookieContainer CookieContainer
{
get { return container; }
set { container = value; }
}
#endregion
#region Constructors
public WebClientExtended()
{
this.container = new CookieContainer();
}
#endregion
#region Methods
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest r = base.GetWebRequest(address);
var request = r as HttpWebRequest;
if (request != null)
{
request.AllowAutoRedirect = false;
request.ServicePoint.Expect100Continue = false;
request.CookieContainer = container;
}
((HttpWebRequest)r).Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
((HttpWebRequest)r).UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"; //IE
r.Headers.Set("Accept-Encoding", "gzip, deflate, sdch");
r.Headers.Set("Accept-Language", "de-AT,de;q=0.8,en;q=0.6,en-US;q=0.4,fr;q=0.2");
r.Headers.Add(System.Net.HttpRequestHeader.KeepAlive, "1");
((HttpWebRequest)r).AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
return r;
}
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse response = base.GetWebResponse(request);
if (!string.IsNullOrEmpty(response.Headers["Location"]))
{
request = GetWebRequest(new Uri(response.Headers["Location"]));
request.ContentLength = 0;
response = GetWebResponse(request);
}
return response;
}
#endregion
}
I think the crucial point is the addition/manipulation of the UserAgent, Accept-Encoding and Accept-Language strings. The result of my code is the downloaded page check.html.

Why am I getting the exception "Too many automatic redirections were attempted" on WebClient?

At the top of Form1 I did:
WebClient Client;
Then in the constructor:
Client = new WebClient();
Client.DownloadFileCompleted += Client_DownloadFileCompleted;
Client.DownloadProgressChanged += Client_DownloadProgressChanged;
Then I have this method, which I'm calling every minute:
private void fileDownloadRadar()
{
if (Client.IsBusy == true)
{
Client.CancelAsync();
}
else
{
Client.DownloadProgressChanged += Client_DownloadProgressChanged;
Client.DownloadFileAsync(myUri, combinedTemp);
}
}
Every minute it downloads the same image from the same website.
It was all working for more than 24 hours with no problems, until now, when it started throwing this exception in the download-completed event:
private void Client_DownloadFileCompleted(object sender, AsyncCompletedEventArgs e)
{
if (e.Error != null)
{
timer1.Stop();
span = new TimeSpan(0, (int)numericUpDown1.Value, 0);
label21.Text = span.ToString(@"mm\:ss");
timer3.Start();
}
else if (!e.Cancelled)
{
label19.ForeColor = Color.Green;
label19.Text = "חיבור האינטרנט והאתר תקינים"; // Hebrew: "The internet connection and the site are OK"
label19.Visible = true;
timer3.Stop();
if (timer1.Enabled != true)
{
if (BeginDownload == true)
{
timer1.Start();
}
}
bool fileok = Bad_File_Testing(combinedTemp);
if (fileok == true)
{
File1 = new Bitmap(combinedTemp);
bool compared = ComparingImages(File1);
if (compared == false)
{
DirectoryInfo dir1 = new DirectoryInfo(sf);
FileInfo[] fi = dir1.GetFiles("*.gif");
last_file = fi[fi.Length - 1].FullName;
string lastFileNumber = last_file.Substring(82, 6);
int lastNumber = int.Parse(lastFileNumber);
lastNumber++;
string newFileName = string.Format("radar{0:D6}.gif", lastNumber);
identicalFilesComparison = File_Utility.File_Comparison(combinedTemp, last_file);
if (identicalFilesComparison == false)
{
string newfile = Path.Combine(sf, newFileName);
File.Copy(combinedTemp, newfile);
LastFileIsEmpty();
}
}
if (checkBox2.Checked)
{
simdownloads.SimulateDownloadRadar();
}
}
else
{
File.Delete(combinedTemp);
}
File1.Dispose();
}
}
Now it stopped inside the if (e.Error != null),
on the line timer1.Stop().
The error I see has the following stack trace:
at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
at System.Net.WebClient.GetWebResponse(WebRequest request, IAsyncResult result)
at System.Net.WebClient.DownloadBitsResponseCallback(IAsyncResult result)
How can I solve this problem so it won't happen again? And why did it happen?
EDIT:
I tried to change the fileDownloadRadar method to the following, to release the client every time:
private void fileDownloadRadar()
{
using (WebClient client = new WebClient())
{
if (client.IsBusy == true)
{
client.CancelAsync();
}
else
{
client.DownloadFileAsync(myUri, combinedTemp);
}
}
}
The problem is that in the constructor I'm using Client, while here it's client - two different WebClient variables.
How can I solve this and the exception?
This is the website link for the site with the image I'm downloading every minute.
I'm still not sure why I got this exception after it was working with no problems for more than 24 hours.
I ran the program again and it's working, but I wonder whether I will get this exception again tomorrow or in the next few hours.
I had the same problem with WebClient and found the solution here:
http://blog.developers.ba/fixing-issue-httpclient-many-automatic-redirections-attempted/
Using HttpWebRequest and setting a CookieContainer solved the problem, for example:
HttpWebRequest webReq = (HttpWebRequest)HttpWebRequest.Create(linkUrl);
try
{
webReq.CookieContainer = new CookieContainer();
webReq.Method = "GET";
using (WebResponse response = webReq.GetResponse())
{
using (Stream stream = response.GetResponseStream())
{
StreamReader reader = new StreamReader(stream);
res = reader.ReadToEnd();
...
}
}
}
catch (Exception ex)
{
...
}
If you're getting an exception whose description says there are too many redirections, it's because the web site you're trying to access is redirecting to another site, which is redirecting to another, and another, and so on, beyond the default redirection limit.
So, for example, you try to get an image from site A. Site A redirects you to site B. Site B redirects you to site C, etc.
WebClient is configured to follow redirections up to some default limit. Since WebClient is based on HttpWebRequest, it's likely that it is using the default value for MaximumAutomaticRedirections, which is 50.
Most likely, either there is a bug on the server and it's redirecting in a tight loop, or they got tired of you hitting the server for the same file once per minute and they're purposely redirecting you in a circle.
The only way to determine what's really happening is to change your program so that it doesn't automatically follow redirections. That way, you can examine the redirection URL returned by the Web site and determine what's really going on. If you want to do that, you'll need to use HttpWebRequest rather than WebClient.
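For instance (the URL is a placeholder; with AllowAutoRedirect disabled, a 3xx response is returned normally instead of being followed, so you can read its Location header):
var req = (HttpWebRequest)WebRequest.Create("https://example.com/image.gif");
req.AllowAutoRedirect = false;
using (var resp = (HttpWebResponse)req.GetResponse())
{
    int code = (int)resp.StatusCode;
    if (code >= 300 && code < 400)
        Console.WriteLine("Redirected to: " + resp.Headers["Location"]);
}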
Or, you could use something like wget with verbose logging turned on. That will show you what the server is returning when you make a request.
Although this is an old topic, I couldn't help but notice that the poster was using WebClient, which sends no UserAgent when making the request. Many sites will reject or redirect clients that don't have a proper UserAgent string.
Consider setting WebClient.Headers["User-Agent"].
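For example (any realistic browser string will do; this one is taken from the answers above):
var client = new WebClient();
client.Headers["User-Agent"] = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0";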
The problem can be solved by setting a cookie container and, most importantly, by setting webRequest.AllowAutoRedirect = false; like this:
HttpWebRequest webRequest = (HttpWebRequest)HttpWebRequest.Create(url);
webRequest.CookieContainer = new CookieContainer();
webRequest.AllowAutoRedirect = false;
I had this error but found a simple fix.
You don't need all that code; all you need to do is, at the start of your application, fetch the cookie like this (sorry, but I work with VB :) but it is pretty simple to convert):
[your application namespace].Application.GetCookie(New Uri("https://[site]"))
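The C# equivalent one-liner would be something like this (note this is WPF's System.Windows.Application, so it needs a PresentationFramework reference; the URI is a placeholder):
string cookie = System.Windows.Application.GetCookie(new Uri("https://example.com"));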
The easiest way is to create a CookieAwareWebClient and override the creation of the WebRequest.
It looks like this:
public class CookieAwareWebClient : WebClient {
public CookieContainer CookieContainer { get; set; }
public Uri Uri { get; set; }
public CookieAwareWebClient()
: this(new CookieContainer()) {
}
public CookieAwareWebClient(CookieContainer cookies) {
this.CookieContainer = cookies;
}
protected override WebRequest GetWebRequest(Uri address) {
var request = base.GetWebRequest(address);
if (request is HttpWebRequest) {
(request as HttpWebRequest).CookieContainer = this.CookieContainer;
}
var httpRequest = (HttpWebRequest)request;
httpRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
httpRequest.AllowAutoRedirect = true;
httpRequest.MaximumAutomaticRedirections = 100;
httpRequest.ContinueTimeout = 5 * 60 * 1000;
httpRequest.Timeout = 5 * 60 * 1000;
return httpRequest;
}
protected override WebResponse GetWebResponse(WebRequest request) {
var response = base.GetWebResponse(request);
var setCookieHeader = response.Headers[HttpResponseHeader.SetCookie];
//if (setCookieHeader != null)
//{
// Cookie cookie = new Cookie(); //create cookie
// cookie.Value = setCookieHeader;
// this.CookieContainer.Add(cookie);
//}
return response;
}
}
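Usage is then the same as a plain WebClient (the URL is a placeholder); because the same CookieContainer is attached to every request, cookies set by one response are sent back on the next:
var client = new CookieAwareWebClient();
string first = client.DownloadString("https://example.com/");     // picks up Set-Cookie into the container
string second = client.DownloadString("https://example.com/data"); // replays the cookies automatically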
Here is the VB.Net version of @Ron.Eng's answer:
Public Function DownloadFileWithCookieContainerWebRequest(URL As String, FileName As String)
Dim webReq As HttpWebRequest = HttpWebRequest.Create(URL)
Try
webReq.CookieContainer = New CookieContainer()
webReq.Method = "GET"
Using response As WebResponse = webReq.GetResponse()
Using Stream As Stream = response.GetResponseStream()
Dim reader As StreamReader = New StreamReader(Stream)
Dim res As String = reader.ReadToEnd()
File.WriteAllText(FileName, res)
End Using
End Using
Catch ex As Exception
Throw ex
End Try
End Function

C# WebClient login to accounts.google.com

I'm having a very difficult time trying to authenticate to accounts.google.com using WebClient.
I'm using the C# WebClient object to achieve the following.
I'm submitting form fields to https://accounts.google.com/ServiceLoginAuth?service=oz
Here are the POST fields:
service=oz
dsh=-8355435623354577691
GALX=33xq1Ma_CKI
timeStmp=
secTok=
Email=test@test.xom
Passwd=password
signIn=Sign in
PersistentCookie=yes
rmShown=1
Now, when the login page loads, before I submit data, it has the following headers:
Content-Type text/html; charset=UTF-8
Strict-Transport-Security max-age=2592000; includeSubDomains
Set-Cookie GAPS=1:QClFh_dKle5DhcdGwmU3m6FiPqPoqw:SqdLB2u4P2oGjt_x;Path=/;Expires=Sat, 21-Dec-2013 07:31:40 GMT;Secure;HttpOnly
Cache-Control no-cache, no-store
Pragma no-cache
Expires Mon, 01-Jan-1990 00:00:00 GMT
X-Frame-Options Deny
X-Auto-Login realm=com.google&args=service%3Doz%26continue%3Dhttps%253A%252F%252Faccounts.google.com%252FManageAccount
Content-Encoding gzip
Transfer-Encoding chunked
Date Thu, 22 Dec 2011 07:31:40 GMT
X-Content-Type-Options nosniff
X-XSS-Protection 1; mode=block
Server GSE
OK, now how do I use the WebClient class to include those headers?
I have tried webClient_.Headers.Add(), but it has limited effect and always returns the login page.
Below is the class that I use. I would appreciate any help.
Getting the login page:
public void LoginPageRequest(Account acc)
{
var rparams = new RequestParams();
rparams.URL = @"https://accounts.google.com/ServiceLoginAuth?service=oz";
rparams.RequestName = "LoginPage";
rparams.Account = acc;
webClient_.DownloadDataAsync(new Uri(rparams.URL), rparams);
}
void webClient__DownloadDataCompleted(object sender, DownloadDataCompletedEventArgs e)
{
RequestParams rparams = (RequestParams)e.UserState;
if (rparams.RequestName == "LoginPage")
{
ParseLoginRequest(e.Result, e.UserState);
}
}
Now I get the form fields using HtmlAgilityPack and add them to the Parameters collection:
public void ParseLoginRequest(byte[] data, object UserState)
{
RequestParams rparams = (RequestParams)UserState;
rparams.ClearParams();
ASCIIEncoding encoder = new ASCIIEncoding();
string html = encoder.GetString(data);
HtmlNode.ElementsFlags.Remove("form");
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
HtmlNode form = doc.GetElementbyId("gaia_loginform");
rparams.URL = form.GetAttributeValue("action", string.Empty);
rparams.RequestName = "LoginPost";
var inputs = form.Descendants("input");
foreach (var element in inputs)
{
string name = element.GetAttributeValue("name", "undefined");
string value = element.GetAttributeValue("value", "");
if (!name.Equals("undefined")) {
if (name.ToLower().Equals("email"))
{
value = rparams.Account.Email;
}
else if (name.ToLower().Equals("passwd"))
{
value = rparams.Account.Password;
}
rparams.AddParam(name,value);
Console.WriteLine(name + "-" + value);
}
}
webClient_.UploadValuesAsync(new Uri(rparams.URL), "POST", rparams.GetParams, rparams);
}
After I post the data, I get the login page rather than a redirect or success message.
What am I doing wrong?
After some fiddling around, it looks like the WebClient class is not the best approach to this particular problem.
To achieve this goal I had to drop one level down, to WebRequest.
When making a WebRequest (HttpWebRequest) and using HttpWebResponse, it is possible to set a CookieContainer:
webRequest_ = (HttpWebRequest)HttpWebRequest.Create(rparams.URL);
webRequest_.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)";
CookieContainer cookieJar = new CookieContainer();
webRequest_.CookieContainer = cookieJar;
string html = string.Empty;
try
{
using (WebResponse response = webRequest_.GetResponse())
{
using (var streamReader = new StreamReader(response.GetResponseStream()))
{
html = streamReader.ReadToEnd();
ParseLoginRequest(html, response,cookieJar);
}
}
}
catch (WebException e)
{
using (WebResponse response = e.Response)
{
HttpWebResponse httpResponse = (HttpWebResponse)response;
Console.WriteLine("Error code: {0}", httpResponse.StatusCode);
using (var streamReader = new StreamReader(response.GetResponseStream()))
Console.WriteLine(html = streamReader.ReadToEnd());
}
}
and then, when making the POST, use the same cookie container in the following manner:
webRequest_ = (HttpWebRequest)HttpWebRequest.Create(rparams.URL);
webRequest_.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)";
webRequest_.Method = "POST";
webRequest_.ContentType = "application/x-www-form-urlencoded";
webRequest_.CookieContainer = cookieJar;
var parameters = new StringBuilder();
foreach (var key in rparams.Params)
{
parameters.AppendFormat("{0}={1}&",HttpUtility.UrlEncode(key.ToString()),
HttpUtility.UrlEncode(rparams.Params[key.ToString()]));
}
parameters.Length -= 1;
using (var writer = new StreamWriter(webRequest_.GetRequestStream()))
{
writer.Write(parameters.ToString());
}
string html = string.Empty;
using (response = webRequest_.GetResponse())
{
using (var streamReader = new StreamReader(response.GetResponseStream()))
{
html = streamReader.ReadToEnd();
}
}
So this works. This code is not for production use and can/should be optimized;
treat it just as an example.
This is a quick example written in the answer pane and untested. You will probably need to parse some values out of an initial request for some of the form values that go into formData. A lot of my code is based on this type of process, unless we need to scrape Facebook/Spokeo-type sites, in which case the AJAX makes us use a different approach.
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Linq;
using System.Text;
using System.Net;
namespace GMailTest
{
class Program
{
private static NameValueCollection formData = new NameValueCollection();
private static CookieAwareWebClient webClient = new CookieAwareWebClient();
static void Main(string[] args)
{
formData.Clear();
formData["service"] = "oz";
formData["dsh"] = "-8355435623354577691";
formData["GALX"] = "33xq1Ma_CKI";
formData["timeStmp"] = "";
formData["secTok"] = "";
formData["Email"] = "test#test.xom";
formData["Passwd"] = "password";
formData["signIn"] = "Sign in";
formData["PersistentCookie"] = "yes";
formData["rmShown"] = "1";
byte[] responseBytes = webClient.UploadValues("https://accounts.google.com/ServiceLoginAuth?service=oz", "POST", formData);
string responseHTML = Encoding.UTF8.GetString(responseBytes);
}
}
public class CookieAwareWebClient : WebClient
{
public CookieAwareWebClient() : this(new CookieContainer())
{ }
public CookieAwareWebClient(CookieContainer c)
{
this.CookieContainer = c;
this.Headers.Add("User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.52 Safari/536.5");
}
public CookieContainer CookieContainer { get; set; }
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = this.CookieContainer;
}
return request;
}
}
}

How to log in to HTTPS sites with the help of WebRequest and WebResponse

How do I log in to HTTPS sites with the help of WebRequest and WebResponse in C#?
Here is the code:
public string postFormData(Uri formActionUrl, string postData)
{
gRequest = (HttpWebRequest)WebRequest.Create(formActionUrl);
gRequest.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.4) Gecko/2008102920 Firefox/3.0.4";
gRequest.CookieContainer = new CookieContainer();
gRequest.Method = "POST";
gRequest.Accept = " text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8, */*";
gRequest.KeepAlive = true;
gRequest.ContentType = @"text/html; charset=iso-8859-1";
#region CookieManagement
if (this.gCookies != null && this.gCookies.Count > 0)
{
gRequest.CookieContainer.Add(gCookies);
}
//logic to post data to the form
byte[] postBuffer = System.Text.Encoding.GetEncoding(1252).GetBytes(postData);
gRequest.ContentLength = postBuffer.Length;
Stream postDataStream = gRequest.GetRequestStream();
postDataStream.Write(postBuffer, 0, postBuffer.Length);
postDataStream.Close();
//post data logic ends
//Get Response for this request url
gResponse = (HttpWebResponse)gRequest.GetResponse();
//check if the status code is http 200 or http ok
if (gResponse.StatusCode == HttpStatusCode.OK)
{
//get all the cookies from the current request and add them to the response object cookies
gResponse.Cookies = gRequest.CookieContainer.GetCookies(gRequest.RequestUri);
//check if response object has any cookies or not
if (gResponse.Cookies.Count > 0)
{
//check if this is the first request/response, if this is the response of first request gCookies
//will be null
if (this.gCookies == null)
{
gCookies = gResponse.Cookies;
}
else
{
foreach (Cookie oRespCookie in gResponse.Cookies)
{
bool bMatch = false;
foreach (Cookie oReqCookie in this.gCookies)
{
if (oReqCookie.Name == oRespCookie.Name)
{
oReqCookie.Value = oRespCookie.Value;
bMatch = true;
break; //
}
}
if (!bMatch)
this.gCookies.Add(oRespCookie);
}
}
}
#endregion
StreamReader reader = new StreamReader(gResponse.GetResponseStream());
string responseString = reader.ReadToEnd();
reader.Close();
//Console.Write("Response String:" + responseString);
return responseString;
}
else
{
return "Error in posting data";
}
}
// calling the above function
httphelper.postFormData(new Uri("https://login.yahoo.com/config/login?.done=http://answers.yahoo.com%2f&.src=knowsrch&.intl=us"), ".tries=1&.src=knowsrch&.md5=&.hash=&.js=&.last=&promo=&.intl=us&.bypass=&.partner=&.u=0b440p15q1nmb&.v=0&.challenge=Rt_fM1duQiNDnI5SrzAY_GETpNTL&.yplus=&.emailCode=&pkg=&stepid=&.ev=&hasMsgr=0&.chkP=Y&.done=http%3A%2F%2Fanswers.yahoo.com%2F&.pd=knowsrch_ver%3D0%26c%3D%26ivt%3D%26sg%3D&login=xyz&passwd=xyz&.save=Sign+In");
You need to see how authentication works for the site you are working with.
It may be through cookies, special headers, a hidden field, or something else.
Fire up a tool like Fiddler and see what the network traffic looks like when logging in and how it differs from not being logged in.
Then recreate this logic with WebRequest and WebResponse.
See the answers to this SO question (HttpRequest: pass through AuthLogin).
What for? WatiN is good for testing and such, and it's easy to do basic screen scraping with it. Why reinvent the wheel if you don't have to?
You can set the WebRequest.Credentials property. For an example and documentation, see:
http://msdn.microsoft.com/en-us/library/system.net.networkcredential.aspx
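A minimal sketch of that property (the URL and credentials are placeholders); note this covers HTTP authentication schemes such as Basic or NTLM, not HTML form logins like the Yahoo page above:
var request = WebRequest.Create("https://example.com/protected");
request.Credentials = new NetworkCredential("username", "password");
using (var response = request.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
{
    string body = reader.ReadToEnd();
}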
