How to pass cookies to HtmlAgilityPack or WebClient? - c#

I use this code to login:
CookieCollection cookies = new CookieCollection();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("example.com");
request.CookieContainer = new CookieContainer();
request.CookieContainer.Add(cookies);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
cookies = response.Cookies;
string getUrl = "example.com";
string postData = String.Format("my parameters");
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(getUrl);
getRequest.CookieContainer = new CookieContainer();
getRequest.CookieContainer.Add(cookies);
getRequest.Method = WebRequestMethods.Http.Post;
getRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0";
getRequest.AllowWriteStreamBuffering = true;
getRequest.ProtocolVersion = HttpVersion.Version11;
getRequest.AllowAutoRedirect = true;
getRequest.ContentType = "application/x-www-form-urlencoded";
byte[] byteArray = Encoding.ASCII.GetBytes(postData);
getRequest.ContentLength = byteArray.Length;
Stream newStream = getRequest.GetRequestStream();
newStream.Write(byteArray, 0, byteArray.Length);
newStream.Close();
HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream(), Encoding.GetEncoding("windows-1251")))
{
doc.LoadHtml(sr.ReadToEnd());
webBrowser1.DocumentText = doc.DocumentNode.OuterHtml;
}
then I want to use HtmlWeb (HtmlAgilityPack) or Webclient to parse the HTML to HtmlDocument(HtmlAgilityPack).
My problem is that when I use:
WebClient wc = new WebClient();
webBrowser1.DocumentText = wc.DownloadString(site);
or
doc = web.Load(site);
webBrowser1.DocumentText = doc.DocumentNode.OuterHtml;
The login disappear so i think I must somehow pass the cookies.. Any suggestions?

Check HtmlAgilityPack.HtmlDocument Cookies
Here is an example of what you're looking for (syntax not 100% tested, I just modified some class I usually use):
public class MyWebClient
{
//The cookies will be here.
private CookieContainer _cookies = new CookieContainer();
//In case you need to clear the cookies
public void ClearCookies() {
_cookies = new CookieContainer();
}
public HtmlDocument GetPage(string url) {
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
//Set more parameters here...
//...
//This is the important part.
request.CookieContainer = _cookies;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
var stream = response.GetResponseStream();
//When you get the response from the website, the cookies will be stored
//automatically in "_cookies".
using (var reader = new StreamReader(stream)) {
string html = reader.ReadToEnd();
var doc = new HtmlDocument();
doc.LoadHtml(html);
return doc;
}
}
}
Here is how you use it:
var client = new MyWebClient();
HtmlDocument doc = client.GetPage("http://somepage.com");
//This request will be sent with the cookies obtained from the page
doc = client.GetPage("http://somepage.com/another-page");
Note: If you also want to use POST method, just create a method similar to GetPage with the POST logic, refactor the class, etc.

There are some recommendations here: Using CookieContainer with WebClient class
However, it's probably just easier to keep using the HttpWebRequest and set the cookie in the CookieContainer:
HTTPWebRequest and CookieContainer
http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.cookiecontainer.aspx
The code looks something like this:
// Create a HttpWebRequest
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(getUrl);
// Create the cookie container and add a cookie
request.CookieContainer = new CookieContainer();
// Add all the cookies
foreach (Cookie cookie in response.Cookies)
{
request.CookieContainer.Add(cookie);
}
The second thing is that you don't need to download the site again, since you already have it from your web response and you're saving it here:
HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream(), Encoding.GetEncoding("windows-1251")))
{
webBrowser1.DocumentText = doc.DocumentNode.OuterHtml;
}
You should be able to just take the HTML and parse it with the HTML Agility Pack:
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(webBrowser1.DocumentText);
And that should do it... :)

Try caching cookies from previous response locally and resend them each web request as follows:
private CookieCollection cookieCollection;
...
parserObject = new HtmlWeb
{
AutoDetectEncoding = true,
PreRequest = request =>
{
if (cookieCollection != null)
cookieCollection.Cast<Cookie>()
.ForEach(cookie => request.CookieContainer.Add(cookie));
return true;
},
PostResponse = (request, response) => { cookieCollection = response.Cookies; }
};

Related

C# HttpWebRequest Use cookie

I have a little problem with cookie handling in C#
So on my web site, I have a login page, once logged in, I am redirected to the home page. I get with HttpWebRequest to connect and follow the redirection, I created a class, here it is :
class webReq
{
private string urlConnection;
private string login;
private string password;
private CookieCollection cookieContainer;
private long executionTime = 0;
public webReq(string urlCo, string login, string pass)
{
this.urlConnection = urlCo;
this.login = login;
this.password = pass;
this.cookieContainer = null;
}
public void StartConnection()
{
string WriteHTML = "D:/REM/Connection.html";
List<string> datas = new List<string>();
datas.Add("Username=" + this.login);
datas.Add("Password=" + this.password);
datas.Add("func=ll.login");
datas.Add("NextURL=/admin/livelink.exe");
datas.Add("loginbutton=Sign in");
string postData = "";
postData = string.Join("&", datas);
var buffer = Encoding.ASCII.GetBytes(postData);
try
{
var watch = System.Diagnostics.Stopwatch.StartNew();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(this.urlConnection);
request.AllowAutoRedirect = true;
request.Method = "POST";
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1003.1 Safari/535.19";
request.Accept = "text/html, application/xhtml+xml, */*";
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = buffer.Length;
request.CookieContainer = new CookieContainer();
Stream stream = request.GetRequestStream();
stream.Write(buffer, 0, buffer.Length);
stream.Close();
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
stream = response.GetResponseStream();
watch.Stop();
this.executionTime = watch.ElapsedMilliseconds;
StreamReader reader = new StreamReader(stream);
System.IO.File.WriteAllText(WriteHTML, reader.ReadToEnd());
this.cookieContainer = new CookieCollection();
foreach (Cookie cookie in response.Cookies)
{
this.cookieContainer.Add(cookie);
}
}
catch (WebException ex)
{
Console.WriteLine(ex.GetBaseException().ToString());
}
}
}
I load the home page well, and I manage to get a cookie.
So I developed a function to use my cookie to browse the website :
public void connectUrl(string url, int numeroTest)
{
string WriteHTML = "D:/REM/Page"+numeroTest+".html";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
//Add cookie to request.CookieContainer
request.CookieContainer = new CookieContainer();
request.CookieContainer.Add(this.cookieContainer);
var watch = System.Diagnostics.Stopwatch.StartNew();
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream stream = response.GetResponseStream();
watch.Stop();
this.executionTime = watch.ElapsedMilliseconds;
StreamReader reader = new StreamReader(stream);
System.IO.File.WriteAllText(WriteHTML, reader.ReadToEnd());
}
Normally, I have to retrieve three cookies, like on the website :
Only, I can't navigate on the website, I end up on the login page, the cookies are not good, and that I'm in debug, I only loaded one cookie(BrowseSettings) out of the three(LLCookie & LLTZCookie) :
I don't understand why I can't retrieve all the cookies on the website.... If anyone has a solution!
I found the reason why I can't get all the cookies, even if I can't find exactly why it works by disabling redirection, in my StartConnection() method :
request.AllowAutoRedirect = true;

Why does request.Headers.Add("Cookie", "...") have no effect? [duplicate]

I am trying to unit test some code, and I need to to replace this:
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create( uri );
httpWebRequest.CookieContainer = new CookieContainer();
with
WebRequest webRequest = WebRequest.Create( uri );
webRequest.CookieContainer = new CookieContainer();
Basically, how do I get cookies into the request without using a HttpWebRequest?
Based on your comments, you might consider writing an extension method:
public static bool TryAddCookie(this WebRequest webRequest, Cookie cookie)
{
HttpWebRequest httpRequest = webRequest as HttpWebRequest;
if (httpRequest == null)
{
return false;
}
if (httpRequest.CookieContainer == null)
{
httpRequest.CookieContainer = new CookieContainer();
}
httpRequest.CookieContainer.Add(cookie);
return true;
}
Then you can have code like:
WebRequest webRequest = WebRequest.Create( uri );
webRequest.TryAddCookie(new Cookie("someName","someValue"));
Try with something like this:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.contoso.com/default.html");
request.CookieContainer = new CookieContainer();
request.CookieContainer.Add(new Cookie("ConstoCookie", "Chocolate Flavour"));
WebRequest is an abstract class that does not have a CookieContainer property. In addition you can't use the Headers collection (not implemented exception) so any attempt like webRequest.Headers.Add("Cookie", "...") will fail.
Sorry, but you have no chance to use cookies with WebRequest.
Stick on HttpWebRequest and add/edit as many cookies you like using its Headers collection!
dlev's answer ended up working, but I had problems implementing the solution ("The parameter '{0}' cannot be an empty string."), so I decided to write the full code in case anybody else has similar problems.
My goal was to get the html as a string, but I needed to add the cookies to the web request. This is the function that downloads the string using the cookies:
public static string DownloadString(string url, Encoding encoding, IDictionary<string, string> cookieNameValues)
{
using (var webClient = new WebClient())
{
var uri = new Uri(url);
var webRequest = WebRequest.Create(uri);
foreach(var nameValue in cookieNameValues)
{
webRequest.TryAddCookie(new Cookie(nameValue.Key, nameValue.Value, "/", uri.Host));
}
var response = webRequest.GetResponse();
var receiveStream = response.GetResponseStream();
var readStream = new StreamReader(receiveStream, encoding);
var htmlCode = readStream.ReadToEnd();
return htmlCode;
}
}
We are using the code from dlev's answer:
public static bool TryAddCookie(this WebRequest webRequest, Cookie cookie)
{
HttpWebRequest httpRequest = webRequest as HttpWebRequest;
if (httpRequest == null)
{
return false;
}
if (httpRequest.CookieContainer == null)
{
httpRequest.CookieContainer = new CookieContainer();
}
httpRequest.CookieContainer.Add(cookie);
return true;
}
This is how you use the full code:
var cookieNameValues = new Dictionary<string, string>();
cookieNameValues.Add("varName", "varValue");
var htmlResult = DownloadString(url, Encoding.UTF8, cookieNameValues);

C# How can I call multiple POST, GET requests with same cookie via HttpWebRequest?

my question is, how can I call multiple requests via HttpWebRequest with same authenticate cookie in C#? I tried a lot of times but for now I dunno how to do it :/
My code is below:
var postData = "method=loginFormAccount&args[0][email]=###&args[0][pass]=###&args[0][cache]=37317&args[]=1";
var data = Encoding.ASCII.GetBytes(postData);
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("###");
request.CookieContainer = new CookieContainer();
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
request.AllowAutoRedirect = true;
using (var stream = request.GetRequestStream())
{
stream.Write(data, 0, data.Length);
}
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
var cookies = new CookieContainer();
cookies.Add(response.Cookies);
System.IO.File.WriteAllText(#desktop + "\\post.html", new StreamReader(response.GetResponseStream()).ReadToEnd());
// =================================== END LOGIN ==================================== \\
System.IO.File.WriteAllText(#desktop + "\\cookie.html","");
foreach (Cookie cook in response.Cookies)
{
using (System.IO.StreamWriter file = new System.IO.StreamWriter(#desktop + "\\cookie.html", true))
{
file.WriteLine(cook.ToString());
}
// Show the string representation of the cookie.
}
HttpWebRequest requestNext = (HttpWebRequest)WebRequest.Create("####");
requestNext.CookieContainer = cookies;
requestNext.Method = "GET";
HttpWebResponse responseNext = (HttpWebResponse)requestNext.GetResponse();
//var responseString = new StreamReader(response.GetResponseStream()).ReadToEnd();
System.IO.File.WriteAllText(#desktop + "\\get.html", new StreamReader(responseNext.GetResponseStream()).ReadToEnd());
My main problem is that, cookie which I'm getting is the cookie BEFORE authenticate so I must to do something to get cookie AFTER authenticate.
Try this :
HttpWebRequest requestNext = (HttpWebRequest)WebRequest.Create("####");
requestNext.CookieContainer.Add(cookies);

how to remove/update a cookie in cookie container c#?

I open a website using webbrowser control and then save cookies in cookieContainer , and later use HTTPwebrequest to process forward browsing pages etc.
The issue arises, when i make a search and it returns 100 pages,on the first page ,it saves a cookie named : ABC ,which i add to the cookiecontainer and move to the next page , on the second page again same Cookie named: ABC is given some value, but now i have two same cookies in cookiecontainer and when i move to the next page it does not work , as its taking the first cookie which messes everything.
How to solve this?
HttpWEBREQUEST FUNCTION:
public string getHtmlCookies(string url)
{
string responseData = "";
try
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Accept = "*/*";
request.AllowAutoRedirect = true;
request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";
request.Timeout = 30000;
request.Method = "GET";
request.CookieContainer = yummycookies;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
if (response.StatusCode == HttpStatusCode.OK)
{
foreach (Cookie cookie in response.Cookies)
{
string name = string.Empty;
name = cookie.Name;
string value = cookie.Value;
string path = "/";
string domain = "www.example.com";
yummycookies.Add(new Cookie(name.Trim(), value.Trim(), path, domain));
}
Stream responseStream = response.GetResponseStream();
StreamReader myStreamReader = new StreamReader(responseStream);
responseData = myStreamReader.ReadToEnd();
}
response.Close();
}
catch (Exception e)
{
responseData = "An error occurred: " + e.Message;
}
return responseData;
}
You can use SetCookies method.
var container = new System.Net.CookieContainer();
var uri = new Uri("http://www.example.com");
container.SetCookies(uri,"name=value");
container.SetCookies(uri,"name=value1");
Calling GetCookies(uri) will give a single cookie with Value=value1.
And in your case, the code would be something like
var uri = new Uri("http://www.example.com");
yummycookies.SetCookies(uri, response.Headers[HttpResponseHeader.SetCookie]);
RePierre answer, in my case, duplicates cookies if they are present in container. I have used this instead:
cookieContainer.GetAllCookies().FirstOrDefault(x => x.Name == "myCookie").Value = "MyValue";

Cannot login website with httprequest POST in C#?

I'm trying to write a bit of code to login to a website. But it's not working. Please can you give me some advice. This is my a bit of code:
static void Main(string[] args)
{
CookieContainer container = new CookieContainer();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://pagehype.com/login.php");
request.Method = "POST";
request.Timeout = 10000;
request.ReadWriteTimeout = 30000;
request.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729) (Prevx 3.0.5)";
request.CookieContainer = container;
ASCIIEncoding encoding = new ASCIIEncoding();
string postData = "username=user&password=password&processlogin=1&return=";
byte[] data = encoding.GetBytes(postData);
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = data.Length;
Stream newStream = request.GetRequestStream();
newStream.Write(data, 0, data.Length);
newStream.Close();
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
StreamReader reader = new StreamReader(response.GetResponseStream());
string htmldoc = reader.ReadToEnd();
response.Close();
Console.Write(htmldoc);
}
Many thanks,
Use http://www.fiddler2.com/fiddler2/ to view the http request sent went you login in using a browser and ensure that the request you build in code is the same.
PHP logins use PHPSESSID cookie. You'll need to capture this and pass it back in the CookieContainer. This is how the server will recognise you as an authenticated user.
The cookie is set in the Set-Cookie header in the initial response. You'll need to parse it to recreate the cookie in your container (don't forget the path (and domain?)
var setCookie = response.GetResponseHeader("Set-Cookie");
response.Close();
container = new CookieContainer();
foreach (var cookie in setCookie.Split(','))
{
var split = cookie.Split(';');
string name = split[0].Split('=')[0];
string value = split[0].Split('=')[1];
var c = new Cookie(name, value);
if (cookie.Contains(" Domain="))
c.Domain = split.Where(x => x.StartsWith(" Domain")).First().Split('=')[1];
else
{
c.Domain = ".pagehype.com";
}
if (cookie.Contains(" Path="))
c.Path = split.Where(x => x.StartsWith(" Path")).First().Split('=')[1];
container.Add(c);
}
Then add this container to your request.

Categories

Resources