GetResponseStream reading error with htmlagilitypack - c#

Alright, this is the whole function, but the values it reads are completely wrong, and I don't know what the problem could be.
Edit: OK, it seems the problem is related to gzip compression. How can I decompress the stream returned by GetResponseStream?
public static List<object> func_DoHTTPWebRequest(PerVotingSite myPerVote, string srUrl, string srCookiePrev = "", string srRefererParameter = null,
string srBrowserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31")
{
try
{
srUrl = "http://stackoverflow.com/"; // for testing purposes
string srReferer = myPerVote.srReferer;
if (srRefererParameter != null)
srReferer = srRefererParameter;
bool blKeepAlive = myPerVote.blKeepAlive;
int irRequestTimeOut = myPerVote.irRequestsTimeOut;
if (irRequestTimeOut == 0)
irRequestTimeOut = OtomatikVoter.irTimeOut;
bool blKeepCookies = myPerVote.blKeepCookies;
HttpWebRequest hWebReq = (HttpWebRequest)WebRequest.Create(srUrl);
hWebReq.KeepAlive = blKeepAlive;
hWebReq.Referer = srReferer;
hWebReq.Timeout = irRequestTimeOut;
hWebReq.ReadWriteTimeout = irRequestTimeOut;
hWebReq.UserAgent = srBrowserAgent;
hWebReq.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
WebHeaderCollection myWebHeaderCollection = hWebReq.Headers;
myWebHeaderCollection.Add("Accept-Language", "en-gb,en;q=0.5");
myWebHeaderCollection.Add("Accept-Encoding", "gzip, deflate");
if (srCookiePrev.Length > 1)
myWebHeaderCollection.Add("Cookie", srCookiePrev);
string srCookie = "";
HtmlAgilityPack.HtmlDocument hDoc = new HtmlAgilityPack.HtmlDocument();
using (HttpWebResponse hWebResp = (HttpWebResponse)hWebReq.GetResponse())
{
using (var resultStream = hWebResp.GetResponseStream())
{
if (hWebResp.Headers["Set-Cookie"] != null && blKeepCookies == true)
srCookie = hWebResp.Headers["Set-Cookie"].ToString();
hDoc.Load(resultStream,Encoding.UTF8);
}
}
return new List<object> { hDoc, srCookie, hWebReq };
}
catch (Exception E)
{
SpecialFunctions.writeError(E, "func_DoHTTPWebRequest");
return null;
}
}
And here is the result that was read:
http://i.stack.imgur.com/QXw1h.png
This code used to work, and I can't figure out why it no longer does.
Visual Studio 2012, C# 5
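As the edit suspects, the function adds `Accept-Encoding: gzip, deflate` by hand, so the server may reply with a compressed body that `hDoc.Load` then reads as if it were plain text. Below is a minimal sketch of two possible ways around this, assuming .NET's `HttpWebRequest.AutomaticDecompression` and `GZipStream`. The names `GzipSketch`, `CreateRequest`, `Unwrap` and `GzipBytes` are made up for illustration and are not part of the original code.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Net;
using System.Text;

public static class GzipSketch
{
    // Option 1: let HttpWebRequest negotiate and decompress gzip/deflate itself.
    // With this set, the Accept-Encoding header should not be added manually.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Option 2: keep the manual header and unwrap the body yourself,
    // based on the Content-Encoding value of the response.
    // This sketch only handles gzip; anything else is returned unchanged.
    public static Stream Unwrap(Stream body, string contentEncoding)
    {
        if (string.Equals(contentEncoding, "gzip", StringComparison.OrdinalIgnoreCase))
            return new GZipStream(body, CompressionMode.Decompress);
        return body;
    }

    // Helper used only to try Unwrap without a network call.
    public static byte[] GzipBytes(string text)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                byte[] raw = Encoding.UTF8.GetBytes(text);
                gzip.Write(raw, 0, raw.Length);
            }
            return buffer.ToArray();
        }
    }
}
```

In the function above, option 1 would mean dropping the manual `Accept-Encoding` line and setting `hWebReq.AutomaticDecompression` before calling `GetResponse()`. Option 2 would wrap `resultStream` with `Unwrap(resultStream, hWebResp.ContentEncoding)` before passing it to `hDoc.Load`.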

Related

I can't use xNet to create a hostname on noip.com

I can't use xNet to create a hostname on noip.com. My POST request returns a redirect to the login page. Why is this?
using (var req = new HttpRequest())
{
req.UserAgent = "Mozilla/5.0 (Windows Phone 10.0; Android 4.2.1; Microsoft; Lumia 950) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Mobile Safari/537.36 Edge/13.10586";
CookieDictionary _cookie = new CookieDictionary(false);
req.Cookies = _cookie;
req.AddHeader("Accept-Language", "vi-VN,vi;q=0.8,en-US;q=0.5,en;q=0.3");
req.AddHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
req.CharacterSet = Encoding.UTF8;
req.Referer = "https://www.noip.com/";
req.KeepAlive = true;
string input = "";
string value = "";
input = req.Get("https://www.noip.com/login", null).ToString();
value = Regex.Match(input, "name=\"csrf-token\" content=\"(.*?)\"").Groups[1].Value;
string param = string.Concat(new object[]
{
"_token=",
value,
"&username=fxnzpkg4hzm#johnpo.gq&password=cuongdzvlne&submit_login_page=1&_token=",
value,
"&Login"
});
// Login noip.com
input = req.Post("https://www.noip.com/login", param, "application/x-www-form-urlencoded").ToString();
req.Referer = "https://my.noip.com/";
req.AddHeader("Origin", "https://my.noip.com");
req.AddHeader("Accept", "application/json");
param = "{\"id\":0,\"target\":\"45.77.254.222\",\"name\":\"" + Path.GetRandomFileName().Replace(".", "") + "\",\"domain\":\"zapto.org\",\"wildcard\":false,\"type\":\"A\",\"ipv6\":\"\",\"url\":{\"scheme\":\"http\",\"is_masq\":false,\"masq_title\":\"\",\"meta_desc\":\"\",\"meta_keywords\":\"\"},\"is_offline\":false,\"offline_settings\":{\"action\":\"noop\",\"ip\":\"\",\"url\":\"\",\"protocol\":\"http\",\"page\":{\"title\":\"\",\"image_url\":\"\",\"text\":\"\",\"email\":\"\"}},\"mx_records\":[]}";
req.AddHeader("Content-Length", Convert.ToString(Encoding.UTF8.GetBytes(param).Length));
// Create hostname
input = req.Post("https://my.noip.com/api/host", param, "application/json").ToString();
File.AppendAllText("kq.html", input);
if (input.Contains("https://www.noip.com/login"))
{
MessageBox.Show("-------------- Error");
}
else
{
MessageBox.Show("-------------- OK");
}
}

c# read json from httpWebResponse

I'm getting JSON from an HttpWebResponse, which is supposed to be in the following format:
{"d":"{\"Success\":\"\",\"PricedItineraries\":[{\"IsRecommendedFlight\":false,\"BookingClass\":null,\"AirItinerary\":{\"OriginDestinationOptions\":{\"OriginDestinationOption\":[{\"FlightSegment\":[{\"DepartureAirport\":{\"LocationCode\":\"RAK\",\"Terminal\":\"1\"},\"ArrivalAirport\":{\"LocationCode\":\"ORY\",\"Terminal\":\"S\"},\"Equipment\":{\"AirEquipType\":\"73H\",\"ChangeofGauge\":\"true\"},\"MarketingAirline\":{\"Code\":\"TO\",\"CodeContext\":\"KBATO\"},\"OperatingAirline\":{\"Code\":\"TO\"},\"BookingClassAvails\":[{\"ResBookDesigCode\":\"A\",\"RPH\":\"5\"}],\"BagAllownceInfo\":{\"Allowance\":\"00\",\"QuantityCode\":\" N\",\"UnitQualifier\":\" K\"},\"FareID\":\"000000000\",\"Token\":\"00000000-0000-0000-0000-000000000000\",\"AdultBaseFare\":\"0000517.77\",\"AdultTaxFare\":\"0000000.00\",\"ChildBaseFare\":\"0000000.00\",\"ChildTaxFare\":\"0000000.00\",\"InfantBaseFare\":\"0000000.00\",\"InfantTaxFare\":\"0000000.00\",\"PriceTotal\":\"0000307.90\",\"LFID\":\"0000000\",\"PFID\":\"00000\",\"PTFID\":\"
I tried to deserialize the JSON with Json.NET's JsonSerializer, but it fails with "Stream was not readable":
public static object DeserializeFromStream(Stream stream)
{
var serializer = new JsonSerializer();
using (var sr = new StreamReader(stream))
using (var jsonTextReader = new JsonTextReader(sr))
{
return serializer.Deserialize(jsonTextReader);
}
}
So I'm reading the HttpWebResponse like this:
using (StreamReader Reader = new StreamReader(ResStream))
{
this.ResponseHTML = Reader.ReadToEnd();
}
But somehow it returns:
?W?S???"??yE,?????♣*Ay◄?lA+??F◄K?\?b;^~§????xN?yU,6U,?☺♣U,T??O?♠V►R?B§??*????↨??
e?-|T?P?b???s§???M§♂U,?'??*?^le?????▬????§%7↕???f??Qd←|?c♣??
bq7;???ffv%)?▬??↔L6???s?V?~?#?$♀]☺◄????D????'X?e?_?????"??E??Q]E,Ad-? ♥?Xb'??K
[???y;?d"0??:?-X??←Xòs←▬?→?$?↑-b☼}E,???"??MF??j↨??;vb)?aq?ai???R?.5,???????→▬jX?
♂5,?↑j??P?2?→?????+5,?tu?#??ev??☺7;???o☺??3w???P?B
w???E?4!?→??MF??j??v?→▬?→???]jX?I?R???⌂?i]?5,a↕☼↓????→▬???kX??mjX?a◄1?♥'P?;??(??
♂??????H?aq??→▬??↑e???[§???%M?.^??}X???z??t??an?→▬c????_~Z?→▬jX?a?gZ?→▬jX?a?????
♂5,?↑j??P?2?→▬?a)Y??a????>,
5,??a?P?Ro▼▬?→???]jX?I????%t-jX8K?→▬jXLI?a?,!jX??%??PO??P?R?f???↔/4?f?c?7;?jX6?L
This is how I send the web request:
public Response Send()
{
if (REQUEST == null)
return new Response(REQUEST) { HTTPStatusCode = "999" };
WebResponse Res = null;
CookieContainer Cookies = new CookieContainer();
if (REQUEST.Cookies != null)
{
Cookies = REQUEST.Cookies;
}
bool isdone = true;
DateTime Time = default(DateTime);
Sender = (HttpWebRequest)WebRequest.Create(this.REQUEST._url);
Sender.Host = REQUEST.Host;
Sender.Accept = REQUEST.Accept;
Sender.Method = REQUEST.Method;
Sender.UserAgent = REQUEST.UserAgent;
Sender.ContentType = REQUEST.ContentType;
Sender.CookieContainer = Cookies;
Sender.Referer = REQUEST.Refer;
if (REQUEST.Data != null && REQUEST.Data.Length > 0)
{
using (var writer = new StreamWriter(Sender.GetRequestStream()))
{
writer.Write(REQUEST.Data);
writer.Flush();
writer.Close();
}
}
try
{
Res = Sender.GetResponse();
Time = DateTime.Now;
}
catch (Exception Ex)
{
isdone = false;
if (OnExceptionHappened != null)
OnExceptionHappened(this, new ExceptionArgs { Msg = Ex.Message, Sender = this, Time = DateTime.Now });
}
return AssignWebResponse((HttpWebResponse)Res, isdone, Time, Cookies, Sender.RequestUri.AbsoluteUri);
}
Request:
var REQUEST = new Request()
{
_url = "https://www.example.com/",
Host = "host",
Method = "POST",
Refer = "refer",
UserAgent = "Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0",
ContentType = "application/json",
Cookies = CurrentContainer,
Data = "{ 'isFromCache': 'undefined', 'pageNumber': '1'}",
isJson = true
};
using (BeRequest.CORE.BeRequest Req = new BeRequest.CORE.BeRequest(REQUEST))
{
Req.OnExceptionHappened += Req_OnExceptionHappened;
var response = Req.Send();
return response.ResponseHTML;
}
Just do:
using (StreamReader Reader = new StreamReader(Res.GetResponseStream()))
{
var myObject = JsonConvert.DeserializeObject<YourClass>(Reader.ReadToEnd());
}
Your code is a bit all over the place. But assuming that Res is of type WebResponse, then it should work.
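One detail of the sample payload is worth noting: the value of `d` is itself a JSON string, so once the body has been read as text it probably needs two parsing steps. Below is a minimal sketch of that idea, assuming System.Text.Json (available from .NET Core 3.0). The question uses Json.NET, where the same two-step approach should apply. `WrappedJsonSketch` and `UnwrapD` are hypothetical names.

```csharp
using System;
using System.Text.Json;

public static class WrappedJsonSketch
{
    // Parses a body of the form {"d":"<json text>"} and returns the inner document.
    // The caller is responsible for disposing the returned JsonDocument.
    public static JsonDocument UnwrapD(string body)
    {
        using (JsonDocument outer = JsonDocument.Parse(body))
        {
            // "d" holds a string whose content is another JSON document.
            string inner = outer.RootElement.GetProperty("d").GetString();
            return JsonDocument.Parse(inner);
        }
    }
}
```

The inner document can then be navigated with `RootElement.GetProperty("PricedItineraries")`, or its text handed to a typed deserializer.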

How to optimize httpwebrequest call performance

I'm currently building a C#-based desktop application, a kind of small web form auto-filler. The application makes GET and POST calls and fetches images from the response stream, but the responses are very slow. Any suggestions on how I can speed up the image download? Below is the code for downloading the image and for the POST call.
Image Download Function:
public void download_image(string go_to_url, string referer)
{
this.httpWebRequest_cls_1 = null;
HttpWebResponse response = null;
try
{
this.httpWebRequest_cls_1 = (HttpWebRequest)WebRequest.Create(go_to_url);
this.httpWebRequest_cls_1.Proxy = null;
WebRequest.DefaultWebProxy = null;
HttpRequestCachePolicy policy = new HttpRequestCachePolicy(HttpRequestCacheLevel.NoCacheNoStore);
this.httpWebRequest_cls_1.ServicePoint.UseNagleAlgorithm = false;
this.httpWebRequest_cls_1.ServicePoint.Expect100Continue = false;
this.httpWebRequest_cls_1.ServicePoint.ConnectionLimit = 100000;
this.httpWebRequest_cls_1.ServicePoint.ConnectionLeaseTimeout = 65000;
this.httpWebRequest_cls_1.ServicePoint.MaxIdleTime = 100000;
this.httpWebRequest_cls_1.CookieContainer = global_store.cookieContainer_0;
this.httpWebRequest_cls_1.Referer = referer; // was "str1", which is not defined in this method
this.httpWebRequest_cls_1.KeepAlive = true;
this.httpWebRequest_cls_1.ReadWriteTimeout = 0xc350;
this.httpWebRequest_cls_1.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
this.httpWebRequest_cls_1.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip,deflate");
this.httpWebRequest_cls_1.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.65 Safari/537.36";
this.httpWebRequest_cls_1.UnsafeAuthenticatedConnectionSharing = true;
this.httpWebRequest_cls_1.AuthenticationLevel = System.Net.Security.AuthenticationLevel.None;
this.httpWebRequest_cls_1.PreAuthenticate = true;
this.httpWebRequest_cls_1.AllowWriteStreamBuffering = false;
this.httpWebRequest_cls_1.Pipelined = false;
this.httpWebRequest_cls_1.ProtocolVersion = HttpVersion.Version11;
this.httpWebRequest_cls_1.ContentType = "application/x-www-form-urlencoded";
this.httpWebRequest_cls_1.Headers.Set("Cache-Control", "max-age=86400");
this.httpWebRequest_cls_1.ContentLength = 0L;
this.httpWebRequest_cls_1.Method = "GET";
response = (HttpWebResponse)this.httpWebRequest_cls_1.GetResponse();
global_store.image_0 = Image.FromStream(response.GetResponseStream());
staticstore.get_cap_time = new TimeSpan(DateTime.Now.Hour,
DateTime.Now.Minute, DateTime.Now.Second);
}
catch (WebException exception)
{
string exp = exception.ToString();
}
finally
{
if (response != null)
{
response.Close();
response = null;
}
if (this.httpWebRequest_cls_1 != null)
{
this.httpWebRequest_cls_1.Abort();
this.httpWebRequest_cls_1 = null;
}
}
}
POST Request Sending Function:
public string simple_web_call(string go_to, string from, string params_to_post, string conditions)
{
string str = "";
string responseHeader = "";
httpWebRequest_3 = null;
Stream requestStream = null;
HttpWebResponse response = null;
Stream responseStream = null;
try
{
this.httpWebRequest_3 = (HttpWebRequest)WebRequest.Create(new Uri(go_to));
HttpRequestCachePolicy policy = new HttpRequestCachePolicy(HttpRequestCacheLevel.NoCacheNoStore);
this.httpWebRequest_3.CachePolicy = policy;
this.httpWebRequest_3.Proxy = null;
WebRequest.DefaultWebProxy = null;
this.httpWebRequest_3.ServicePoint.UseNagleAlgorithm = false;
this.httpWebRequest_3.ServicePoint.Expect100Continue = false;
this.httpWebRequest_3.ServicePoint.ConnectionLimit = 65000;
this.httpWebRequest_3.ServicePoint.ConnectionLeaseTimeout = Class9.int_7;
this.httpWebRequest_3.ServicePoint.MaxIdleTime = 10000;
this.httpWebRequest_3.CookieContainer = Class28.cookieContainer_0;
this.httpWebRequest_3.Referer = from;
this.httpWebRequest_3.KeepAlive = true;
this.httpWebRequest_3.Connection = "keepalive";
this.httpWebRequest_3.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
this.httpWebRequest_3.UserAgent = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36";
this.httpWebRequest_3.Headers.Set("Cache-Control", "no-cache");
this.httpWebRequest_3.UnsafeAuthenticatedConnectionSharing = true;
this.httpWebRequest_3.AuthenticationLevel = System.Net.Security.AuthenticationLevel.None;
this.httpWebRequest_3.ProtocolVersion = HttpVersion.Version11;
this.httpWebRequest_3.Headers.Set("Cache-Control", "no-cache");
if (params_to_post != "")
{
this.httpWebRequest_3.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(params_to_post);
this.httpWebRequest_3.ContentType = "application/x-www-form-urlencoded";
this.httpWebRequest_3.ContentLength = bytes.Length;
requestStream = this.httpWebRequest_3.GetRequestStream();
requestStream.Write(bytes, 0, bytes.Length);
requestStream.Flush();
requestStream.Close();
}
else
{
this.httpWebRequest_3.Method = "GET";
}
response = (HttpWebResponse)this.httpWebRequest_3.GetResponse();
HttpWebResponse response2 = response;
switch (response2.StatusCode)
{
case HttpStatusCode.OK:
{
responseStream = response2.GetResponseStream();
str = new StreamReader(responseStream, Encoding.UTF8).ReadToEnd();
responseStream.Close();
responseStream = null;
goto flush_all;
}
case HttpStatusCode.MovedPermanently:
case HttpStatusCode.Found:
case HttpStatusCode.SeeOther:
case HttpStatusCode.TemporaryRedirect:
responseHeader = response2.GetResponseHeader("Location");
if (!responseHeader.Contains("err")) { break; }
str = "retry";
goto flush_all;
default:
str = "retry";
goto flush_all;
}
str = responseHeader;
flush_all:
response2 = null;
response.Close();
response = null;
this.httpWebRequest_3 = null;
if (str == "") { str = "retry"; }
}
catch (WebException exception)
{
string exp = exception.ToString();
}
finally
{
if (requestStream != null)
{
requestStream.Close();
requestStream.Dispose();
requestStream = null;
}
if (responseStream != null)
{
responseStream.Close();
responseStream.Dispose();
responseStream = null;
}
if (response != null)
{
response.Close();
response = null;
}
if (this.httpWebRequest_3 != null)
{
this.httpWebRequest_3.Abort();
this.httpWebRequest_3 = null;
}
}
return str;
}

How to get the HTML encoding right in C#?

I'm trying to get the pronunciation of a given word from a web dictionary. For example, in the following code I want to get the pronunciation of "good" from http://collinsdictionary.com
(Html Agility Pack is used here)
static void test()
{
String url = "http://www.collinsdictionary.com/dictionary/english/good";
WebClient client = new WebClient();
client.Encoding = System.Text.Encoding.UTF8;
String html = client.DownloadString(url);
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
HtmlAgilityPack.HtmlNode node = doc.DocumentNode.SelectSingleNode("//*[@id=\"good_1\"]/div[1]/h2/span/text()[1]");
if (node == null)
{
Console.WriteLine("XPath not found.");
}
else
{
Console.WriteLine(node.WriteTo());
}
}
I was expecting
(ɡʊd
but what I could get at best is
(ɡ?d
How to get it right?
The problem is not in your parsing of the text; it is a problem with the console output. If you are doing this from a command-line app, you can set the output encoding of the console to Unicode:
Console.OutputEncoding = System.Text.Encoding.Unicode;
You also need to make sure that the console font supports Unicode. See this answer for more info.
If you know the page encoding (e.g. System.Text.Encoding.UTF8), pass it explicitly:
string html = DownloadSmallFiles_String(url, System.Text.Encoding.UTF8, 20000);
or use automatic encoding detection (this depends on the server response):
string html = DownloadSmallFiles_String(url, null, 20000);
and finally load the HTML:
doc.LoadHtml(html);
Try the code below:
static void test()
{
String url = "http://www.collinsdictionary.com/dictionary/english/good";
System.Text.Encoding PageEncoding = null; //System.Text.Encoding.UTF8
//PageEncoding = null; it means try to detect encoding automatically
string html = DownloadSmallFiles_String(url, PageEncoding, 20000);
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
HtmlAgilityPack.HtmlNode node = doc.DocumentNode.SelectSingleNode("//*[@id=\"good_1\"]/div[1]/h2/span/text()[1]");
if (node == null)
{
Console.WriteLine("XPath not found.");
}
else
{
Console.WriteLine(node.WriteTo());
}
}
private static HttpWebRequest CreateWebRequest(string url, int TimeOut = 20000)
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko";
request.Method = "GET";
request.Timeout = TimeOut;
request.CachePolicy = new HttpRequestCachePolicy(HttpRequestCacheLevel.NoCacheNoStore);
request.KeepAlive = false;
request.UseDefaultCredentials = true;
request.Proxy = null;//ProxyHelperClass.GetIEProxy;
return request;
}
public static string DownloadSmallFiles_String(string Url, System.Text.Encoding ForceTextEncoding_SetThistoNothingToUseAutomatic, int TimeOut = 20000)
{
try
{
string ResponsString = "";
HttpWebRequest request = CreateWebRequest(Url, TimeOut);
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
if (response.StatusCode == HttpStatusCode.OK)
{
using (Stream receiveStream = response.GetResponseStream())
{
if (ForceTextEncoding_SetThistoNothingToUseAutomatic != null)
{
ResponsString = new StreamReader(receiveStream, ForceTextEncoding_SetThistoNothingToUseAutomatic).ReadToEnd();
}
else
{
if (string.IsNullOrEmpty(response.CharacterSet) == false)
{
System.Text.Encoding respEncoding = System.Text.Encoding.GetEncoding(response.CharacterSet);
ResponsString = new StreamReader(receiveStream, respEncoding).ReadToEnd();
}
else
{
ResponsString = new StreamReader(receiveStream).ReadToEnd();
}
}
}
}
}
return ResponsString;
}
catch (Exception ex)
{
return "";
}
}

Loading a Japanese web page with HtmlAgilityPack

I have this piece of code to load and parse web pages using HtmlAgilityPack. It works for most web pages, but when I tried to load a Japanese web page, the encoding seemed to be wrong. How can I fix this? More precisely, how can I set the encoding based on the encoding of the web page?
class Program {
private const string HttpMethod = "GET";
private const string UserAgent =
"Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.7 (KHTML, like Gecko) Chrome/7.0.517.41 Safari/534.7";
static void Main(string[] args) {
var request = WebRequest.Create("http://infoseek.co.jp/") as HttpWebRequest;
if (request == null)
return;
request.Method = HttpMethod;
request.UserAgent = UserAgent;
var response = request.GetResponse() as HttpWebResponse;
if (response == null)
return;
var stream = response.GetResponseStream();
var document = new HtmlDocument {
OptionCheckSyntax = true,
OptionFixNestedTags = true,
OptionAutoCloseOnEnd = true,
OptionDefaultStreamEncoding = Encoding.UTF8,
OptionReadEncoding = true
};
document.Load(stream, Encoding.UTF8);
var d = document.DocumentNode;
}
}
infoseek.co.jp responds with the HTTP header
Content-Type: text/html; charset=EUC-JP
which is mirrored in the HTML meta tag
<meta http-equiv="Content-Type" content="text/html; charset=EUC-JP">
In .NET, use code page 51932 to decode EUC-JP.
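Below is a minimal sketch of that suggestion, assuming the response bytes really are EUC-JP. On .NET Framework, `Encoding.GetEncoding(51932)` should work directly. On .NET Core and .NET 5+ the code-page provider has to be registered first, which is what this sketch does. `EucJpSketch`, `GetEucJp` and `Decode` are illustrative names.

```csharp
using System;
using System.IO;
using System.Text;

public static class EucJpSketch
{
    public static Encoding GetEucJp()
    {
        // Needed on .NET Core / .NET 5+. On .NET Framework this line can be
        // removed, since code page 51932 is available without registration.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(51932);
    }

    // Reads a stream (e.g. the one returned by GetResponseStream) as EUC-JP text.
    public static string Decode(Stream stream)
    {
        using (var reader = new StreamReader(stream, GetEucJp()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With HtmlAgilityPack, the same `Encoding` instance can be passed to `document.Load(stream, encoding)` in place of `Encoding.UTF8`.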
I tried to get the encoding from the HttpWebResponse object with the code below. Do you see any problem with it, or do you have another idea?
class Program {
private const string HttpMethod = "GET";
private const string UserAgent =
"Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.7 (KHTML, like Gecko) Chrome/7.0.517.41 Safari/534.7";
static void Main(string[] args) {
var request = WebRequest.Create("http://infoseek.co.jp/") as HttpWebRequest;
if (request == null)
return;
request.Method = HttpMethod;
request.UserAgent = UserAgent;
var response = request.GetResponse() as HttpWebResponse;
if (response == null)
return;
var encoding = TryGetEncoding(response);
var stream = response.GetResponseStream();
var document = new HtmlDocument {
OptionCheckSyntax = true,
OptionFixNestedTags = true,
OptionAutoCloseOnEnd = true,
OptionReadEncoding = true,
OptionDefaultStreamEncoding = encoding
};
document.Load(stream, encoding);
var d = document.DocumentNode;
}
private static Encoding TryGetEncoding(HttpWebResponse response) {
var charset = response.CharacterSet;
if (string.IsNullOrWhiteSpace(charset))
charset = response.ContentEncoding;
if (string.IsNullOrWhiteSpace(charset))
return Encoding.UTF8;
try {
return Encoding.GetEncoding(charset);
} catch {
return Encoding.UTF8;
}
}
}
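One possible issue with `TryGetEncoding` above: `response.ContentEncoding` returns the `Content-Encoding` header (for example `gzip`), not a character set, so that fallback is unlikely to ever yield a text encoding. Below is a hedged sketch of a charset-only lookup. `CharsetSketch.Resolve` is a hypothetical helper that takes the string from `response.CharacterSet`.

```csharp
using System;
using System.Text;

public static class CharsetSketch
{
    // Maps a charset name (as found in Content-Type) to an Encoding,
    // falling back to UTF-8 when the name is missing or unknown.
    public static Encoding Resolve(string charset)
    {
        if (string.IsNullOrWhiteSpace(charset))
            return Encoding.UTF8;
        try
        {
            // Some servers quote the value, e.g. charset="utf-8".
            return Encoding.GetEncoding(charset.Trim().Trim('"', '\''));
        }
        catch (ArgumentException)
        {
            return Encoding.UTF8; // unknown or unsupported charset name
        }
    }
}
```

A call such as `document.Load(stream, CharsetSketch.Resolve(response.CharacterSet))` would then replace the `TryGetEncoding` call.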
