I'm making a POST request to a URL and reading the response as a string like this:
var responseBytes = client.UploadValues("www.example.com", requestData.Body);
var html = Encoding.UTF8.GetString(responseBytes, 0, responseBytes.Length);
html = HttpUtility.HtmlDecode(html);
I've created a WebClient subclass to accept cookies, so I made this class:
public class CookieWebClient : WebClient
{
    public CookieContainer CookieContainer { get; }

    public CookieWebClient()
    {
        CookieContainer = new CookieContainer();
        Encoding = Encoding.UTF8;
    }

    public CookieWebClient(CookieContainer cookieContainer)
    {
        CookieContainer = cookieContainer;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address) as HttpWebRequest;
        if (request == null) return base.GetWebRequest(address);
        request.CookieContainer = CookieContainer;
        return request;
    }
}
As you can see, I set the encoding in the constructor. But when I read the html string, some characters come out like � necess�rio when it should be É necessário.
Any ideas what's wrong?
You're receiving "Latin 1" characters (Portuguese, maybe?), so you should decode with:
var html = Encoding.GetEncoding("ISO-8859-1")
    .GetString(responseBytes, 0, responseBytes.Length);
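If you'd rather not hardcode ISO-8859-1, you can decode using whatever charset the server declares in its Content-Type response header (WebClient exposes it via ResponseHeaders once the request completes). A minimal sketch; the ParseCharset helper below is illustrative, not part of any framework API:

```csharp
using System;
using System.Text;

static class CharsetHelper
{
    // Extract the encoding from a Content-Type value such as
    // "text/html; charset=ISO-8859-1"; fall back to UTF-8 if absent.
    public static Encoding ParseCharset(string contentType)
    {
        if (!string.IsNullOrEmpty(contentType))
        {
            foreach (var part in contentType.Split(';'))
            {
                var p = part.Trim();
                if (p.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
                {
                    return Encoding.GetEncoding(p.Substring("charset=".Length).Trim('"'));
                }
            }
        }
        return Encoding.UTF8;
    }
}
```

With a WebClient you would pass client.ResponseHeaders["Content-Type"] (available after UploadValues returns) to this helper before calling GetString on the response bytes.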
I am trying to unit test some code, and I need to replace this:
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create( uri );
httpWebRequest.CookieContainer = new CookieContainer();
with
WebRequest webRequest = WebRequest.Create( uri );
webRequest.CookieContainer = new CookieContainer();
Basically, how do I get cookies into the request without using an HttpWebRequest?
Based on your comments, you might consider writing an extension method:
public static bool TryAddCookie(this WebRequest webRequest, Cookie cookie)
{
    HttpWebRequest httpRequest = webRequest as HttpWebRequest;
    if (httpRequest == null)
    {
        return false;
    }
    if (httpRequest.CookieContainer == null)
    {
        httpRequest.CookieContainer = new CookieContainer();
    }
    httpRequest.CookieContainer.Add(cookie);
    return true;
}
Then you can have code like:
WebRequest webRequest = WebRequest.Create( uri );
webRequest.TryAddCookie(new Cookie("someName","someValue"));
Try with something like this:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.contoso.com/default.html");
request.CookieContainer = new CookieContainer();
request.CookieContainer.Add(new Cookie("ConstoCookie", "Chocolate Flavour"));
WebRequest is an abstract class that does not have a CookieContainer property. In addition, you can't use its Headers collection (it throws a NotImplementedException), so any attempt like webRequest.Headers.Add("Cookie", "...") will fail.
Sorry, but you have no way to use cookies with a plain WebRequest.
Stick with HttpWebRequest and add/edit as many cookies as you like using its Headers collection!
dlev's answer ended up working, but I had problems implementing the solution ("The parameter '{0}' cannot be an empty string."), so I decided to write the full code in case anybody else has similar problems.
My goal was to get the html as a string, but I needed to add the cookies to the web request. This is the function that downloads the string using the cookies:
public static string DownloadString(string url, Encoding encoding, IDictionary<string, string> cookieNameValues)
{
    var uri = new Uri(url);
    var webRequest = WebRequest.Create(uri);
    foreach (var nameValue in cookieNameValues)
    {
        webRequest.TryAddCookie(new Cookie(nameValue.Key, nameValue.Value, "/", uri.Host));
    }
    using (var response = webRequest.GetResponse())
    using (var readStream = new StreamReader(response.GetResponseStream(), encoding))
    {
        return readStream.ReadToEnd();
    }
}
We are using the code from dlev's answer:
public static bool TryAddCookie(this WebRequest webRequest, Cookie cookie)
{
    HttpWebRequest httpRequest = webRequest as HttpWebRequest;
    if (httpRequest == null)
    {
        return false;
    }
    if (httpRequest.CookieContainer == null)
    {
        httpRequest.CookieContainer = new CookieContainer();
    }
    httpRequest.CookieContainer.Add(cookie);
    return true;
}
This is how you use the full code:
var cookieNameValues = new Dictionary<string, string>();
cookieNameValues.Add("varName", "varValue");
var htmlResult = DownloadString(url, Encoding.UTF8, cookieNameValues);
I'm trying to access the HTML result of the W3C mobileOK Checker by passing a URL such as
http://validator.w3.org/mobile/check?async=false&docAddr=http%3A%2F%2Fwww.google.com/%2Ftv%2F
The URL works if you put it in a browser, but I can't seem to access it via the Html Agility Pack. The reason is probably that the page needs to send a number of requests to its server, since it's an online testing tool, so it's not just a "static" URL. I have accessed other URLs without any problems. Below is my code:
HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument webGet = hw.Load("http://validator.w3.org/mobile/check?async=false&docAddr=http%3A%2F%2Fwww.google.com/%2Ftv%2F");
HtmlNodeCollection nodes = webGet.DocumentNode.SelectNodes("//head");
if (nodes != null)
{
    foreach (HtmlNode n in nodes)
    {
        string x = n.InnerHtml;
    }
}
Edit: I tried to access it via a StreamReader and the website returned the following error: The remote server returned an error: (403) Forbidden.
I'm guessing that's related.
I checked your example and was able to reproduce the described behaviour. It seems to me that w3.org checks whether the requesting program is a browser or something else.
I created an extended WebClient class for another project of my own, and was able to access the given URL successfully.
Program.cs
WebClientExtended client = new WebClientExtended();
string exportPath = @"e:\temp"; // adapt to your own needs
string url = "http://validator.w3.org/mobile/check?async=false&docAddr=http%3A%2F%2Fwww.google.com/%2Ftv%2F";
// load html by using the custom WebClient class,
// but use HtmlAgilityPack for parsing, manipulation and so on
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(System.Text.Encoding.UTF8.GetString(client.DownloadData(url)));
doc.Save(Path.Combine(exportPath, "check.html"));
WebClientExtended
public class WebClientExtended : WebClient
{
    #region Fields
    private CookieContainer container = new CookieContainer();
    #endregion

    #region Properties
    public CookieContainer CookieContainer
    {
        get { return container; }
        set { container = value; }
    }
    #endregion

    #region Constructors
    public WebClientExtended()
    {
        this.container = new CookieContainer();
    }
    #endregion

    #region Methods
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest r = base.GetWebRequest(address);
        var request = r as HttpWebRequest;
        if (request != null)
        {
            request.AllowAutoRedirect = false;
            request.ServicePoint.Expect100Continue = false;
            request.CookieContainer = container;
            request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
            request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"; // IE
            request.Headers.Set("Accept-Encoding", "gzip, deflate, sdch");
            request.Headers.Set("Accept-Language", "de-AT,de;q=0.8,en;q=0.6,en-US;q=0.4,fr;q=0.2");
            request.Headers.Add(System.Net.HttpRequestHeader.KeepAlive, "1");
            request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        }
        return r;
    }

    protected override WebResponse GetWebResponse(WebRequest request)
    {
        WebResponse response = base.GetWebResponse(request);
        if (!string.IsNullOrEmpty(response.Headers["Location"]))
        {
            request = GetWebRequest(new Uri(response.Headers["Location"]));
            request.ContentLength = 0;
            response = GetWebResponse(request);
        }
        return response;
    }
    #endregion
}
I think the crucial point is the addition/manipulation of the User-Agent, Accept-Encoding and Accept-Language headers. The result of my code is the downloaded page check.html.
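For comparison, the same browser-like headers can be set on a bare HttpWebRequest, without subclassing WebClient. A minimal sketch; the header values are illustrative, and the request is only built here, not actually sent:

```csharp
using System;
using System.Net;

static class RequestBuilder
{
    public static HttpWebRequest BuildBrowserLikeRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Present ourselves as a browser; some servers answer 403 otherwise.
        request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.Headers.Set("Accept-Language", "en-US,en;q=0.8");
        // AutomaticDecompression also adds the matching Accept-Encoding header.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        request.CookieContainer = new CookieContainer();
        return request;
    }
}
```

You would then call GetResponse() on the returned request as usual; building the headers once in a helper keeps them consistent across requests.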
I adapted this code from one of the MSDN blogs and added a WebClient to download a resource:
string formUrl = "My login url";
string formParams = string.Format("userName={0}&password={1}&x={2}&y={3}&login={4}", "user", "password", "0", "0", "login");
string cookieHeader;
WebRequest req = WebRequest.Create(formUrl);
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
    os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];

string pageSource;
string getUrl = "Resource url";
WebRequest getRequest = WebRequest.Create(getUrl);
getRequest.Headers.Add("Cookie", cookieHeader);
WebResponse getResponse = getRequest.GetResponse();
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream()))
{
    pageSource = sr.ReadToEnd();
    System.Console.WriteLine(sr.ToString());
}

WebClient wc = new WebClient();
wc.Headers["Content-Type"] = "application/x-www-form-urlencoded";
wc.DownloadFile("Resource url", "C:\\abc.tgz");
Console.Read();
But abc.tgz is not what it's supposed to be. When I opened it up using Notepad, I noticed that it is the source of the "My login URL" page.
Where am I going wrong?
Is there any property of WebClient that I can use to see the error, i.e. base address etc.?
Let's make things simpler, shall we:
public class CookiesAwareWebClient : WebClient
{
    public CookieContainer CookieContainer { get; private set; }

    public CookiesAwareWebClient()
    {
        CookieContainer = new CookieContainer();
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        ((HttpWebRequest)request).CookieContainer = CookieContainer;
        return request;
    }
}

class Program
{
    static void Main()
    {
        using (var client = new CookiesAwareWebClient())
        {
            var values = new NameValueCollection
            {
                { "userName", "user" },
                { "password", "password" },
                { "x", "0" }, // <- I doubt the server cares about the x position of where the user clicked on the image submit button :-)
                { "y", "0" }, // <- same for the y position :-)
                { "login", "login" },
            };
            // We authenticate first
            client.UploadValues("http://example.com/login", values);
            // Now we can download
            client.DownloadFile("http://example.com/abc.tgz", @"c:\abc.tgz");
        }
    }
}
And by the way, the problem with your code is that you are not passing the authentication cookie issued by the server on the first request along to the second request, which is supposed to access a protected resource. All you pass is a content type, and no cookie at all. Servers like cookies :-)
Is it possible to POST some data from a Windows-based app to a web server? Let's be practical:
ASCIIEncoding encodedText = new ASCIIEncoding();
string postData = string.Format("txtUserID={0}&txtPassword={1}", txtUName.Text, txtPassword.Text);
byte[] data = encodedText.GetBytes(postData);
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.aimlifes.com/office/Login.aspx");
webRequest.Method = "POST";
webRequest.ContentType = "text/html";
webRequest.ContentLength = data.Length;
Stream strm = webRequest.GetRequestStream();
strm.Write(data, 0, data.Length);
txtUrlData.Text = GetPageHtml("http://www.aimlifes.com/office/ReceiveRecharge.aspx");
strm.Close();
GetPageHtml Function:
public string GetPageHtml(String Url)
{
WebClient wbc = new WebClient();
return new System.Text.UTF8Encoding().GetString(wbc.DownloadData(Url));
}
What I'm trying to do is log in to the xyz.com site using the given credentials and fetch the html data into a TextArea. The GetPageHtml() function fetches the html data. But the main problem is that posting the login details is not working, i.e. I'm not able to log in to xyz.com.
If the site uses cookies to track authenticated users, you might try this:
public class CookieAwareWebClient : WebClient
{
    public CookieAwareWebClient()
    {
        CookieContainer = new CookieContainer();
    }

    public CookieContainer CookieContainer { get; set; }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        ((HttpWebRequest)request).CookieContainer = CookieContainer;
        return request;
    }
}

public class Program
{
    static void Main()
    {
        using (var client = new CookieAwareWebClient())
        {
            var values = new NameValueCollection
            {
                { "txtUserID", "foo" },
                { "txtPassword", "secret" }
            };
            // Authenticate. As we are using a cookie container,
            // the user will be authenticated on subsequent requests
            // made with this client
            client.UploadValues("http://www.aimlifes.com/office/Login.aspx", values);
            // The user is now authenticated:
            var result = client.DownloadString("http://www.aimlifes.com/office/ReceiveRecharge.aspx");
            Console.WriteLine(result);
        }
    }
}
Once the result from the login comes back, it'll have a cookie (assuming this is a Forms-authenticated site). You'll then need to take that cookie and use it when requesting the protected page. Otherwise, how will the server know that you've authenticated?
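One way to carry that cookie forward with raw HttpWebRequest objects is to share a single CookieContainer between the login POST and the follow-up request. A minimal sketch (the URLs are placeholders; the requests are only constructed here, and the actual form write and GetResponse() calls are elided):

```csharp
using System.Net;

static class CookieHandoff
{
    // Build a login POST and a follow-up GET that share one cookie jar.
    public static (HttpWebRequest login, HttpWebRequest page) BuildPair(string loginUrl, string pageUrl)
    {
        var cookies = new CookieContainer();

        var login = (HttpWebRequest)WebRequest.Create(loginUrl);
        login.Method = "POST";
        login.CookieContainer = cookies;
        // ... write the form body and call GetResponse() here; any Set-Cookie
        // the server returns is captured in `cookies` automatically.

        var page = (HttpWebRequest)WebRequest.Create(pageUrl);
        page.CookieContainer = cookies; // the captured auth cookie is replayed here

        return (login, page);
    }
}
```

Because both requests reference the same container, nothing needs to parse Set-Cookie headers by hand; the framework handles capture and replay.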
I am trying to log in to the TV Rage website and get the source code of the My Shows page. I am successfully logging in (I have checked the response from my POST request), but when I try to perform a GET request on the My Shows page, I am redirected to the login page.
This is the code I am using to login:
private string LoginToTvRage()
{
    string loginUrl = "http://www.tvrage.com/login.php";
    string formParams = string.Format("login_name={0}&login_pass={1}", "xxx", "xxxx");
    string cookieHeader;
    WebRequest req = WebRequest.Create(loginUrl);
    req.ContentType = "application/x-www-form-urlencoded";
    req.Method = "POST";
    byte[] bytes = Encoding.ASCII.GetBytes(formParams);
    req.ContentLength = bytes.Length;
    using (Stream os = req.GetRequestStream())
    {
        os.Write(bytes, 0, bytes.Length);
    }
    WebResponse resp = req.GetResponse();
    cookieHeader = resp.Headers["Set-cookie"];
    String responseStream;
    using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
    {
        responseStream = sr.ReadToEnd();
    }
    return cookieHeader;
}
I then pass the cookieHeader into this method, which should get the source of the My Shows page:
private string GetSourceForMyShowsPage(string cookieHeader)
{
    string pageSource;
    string getUrl = "http://www.tvrage.com/mytvrage.php?page=myshows";
    WebRequest getRequest = WebRequest.Create(getUrl);
    getRequest.Headers.Add("Cookie", cookieHeader);
    WebResponse getResponse = getRequest.GetResponse();
    using (StreamReader sr = new StreamReader(getResponse.GetResponseStream()))
    {
        pageSource = sr.ReadToEnd();
    }
    return pageSource;
}
I have been using this previous question as a guide, but I'm at a loss as to why my code isn't working.
Here's a simplified and working version of your code using a WebClient:
class Program
{
    static void Main()
    {
        var shows = GetSourceForMyShowsPage();
        Console.WriteLine(shows);
    }

    static string GetSourceForMyShowsPage()
    {
        using (var client = new WebClientEx())
        {
            var values = new NameValueCollection
            {
                { "login_name", "xxx" },
                { "login_pass", "xxxx" },
            };
            // Authenticate
            client.UploadValues("http://www.tvrage.com/login.php", values);
            // Download desired page
            return client.DownloadString("http://www.tvrage.com/mytvrage.php?page=myshows");
        }
    }
}

/// <summary>
/// A custom WebClient featuring a cookie container
/// </summary>
public class WebClientEx : WebClient
{
    public CookieContainer CookieContainer { get; private set; }

    public WebClientEx()
    {
        CookieContainer = new CookieContainer();
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        if (request is HttpWebRequest)
        {
            (request as HttpWebRequest).CookieContainer = CookieContainer;
        }
        return request;
    }
}