Get the latest HTML after an Ajax call in a WebBrowser control? - C#

There are a lot of questions like this, but I was not able to find a solution to my problem.
I have a webpage; after it loads, an Ajax call loads a table with data, which takes maybe 2 seconds.
I want the data inside that table.
When I try to access the table using DocumentText, it does not contain the table HTML. It still has the initial HTML that was loaded before the Ajax call.
webBrowser1.Update(); //Didn't work
Then I tried this, which didn't work either:
private void Timer_Tick(object sender, EventArgs e) //Interval of 5000
{
if (webBrowser1.ReadyState == WebBrowserReadyState.Complete)
{
HtmlElement element = webBrowser1.Document.GetElementById("tableType3");
if (element != null)
{
String webbrowsercontent = element.InnerHtml;
timer.Stop();
}
}
}
Then I tried this, which didn't work either:
private void WaitTillPageLoadsCompletly(WebBrowser webBrControl)
{
WebBrowserReadyState loadStatus;
int waittime = 20000;
int counter = 0;
while (true)
{
loadStatus = webBrControl.ReadyState;
Application.DoEvents();
if ((counter > waittime) || (loadStatus == WebBrowserReadyState.Uninitialized) || (loadStatus == WebBrowserReadyState.Loading) || (loadStatus == WebBrowserReadyState.Interactive))
{
break;
}
counter++;
}
counter = 0;
while (true)
{
loadStatus = webBrControl.ReadyState;
Application.DoEvents();
if (loadStatus == WebBrowserReadyState.Complete && webBrControl.IsBusy != true)
{
break;
}
counter++;
}
}
While debugging I saw the table contents in WebBrowser1.Document.NativeHtmlDocument2, which can't be accessed because it is an internal class.
Is there any other way to solve my problem?

Have you tried listening for the onpropertychange event on the element that the Ajax call updates?
I recently visited a website that shows how to handle the onpropertychange event of an Ajax-updated element from webBrowser1_DocumentCompleted.
Here's the code; I hope it leads you to a solution.
(The idea is to read the dynamic, Ajax-generated content of webBrowser1.Document.GetElementById("abc"), and to show how you can subscribe to its onpropertychange event in webBrowser1_DocumentCompleted.)
HTML Code
<!DOCTYPE HTML>
<html lang="en-US">
<head>
<meta charset="UTF-8">
<title></title>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js"></script>
<script>
$.ajaxSetup({
cache: false
});
var aa = function() {
$.get("ajax.php", function(data) {
$("#abc").html(data);
});
};
$(function() {
aa();
setInterval(aa, 2000);
});
</script>
</head>
<body>
<div id="abc"></div>
</body>
</html>
ajax.php
<?php
echo date("H:i:s");
C# code
private void button1_Click(object sender, EventArgs e)
{
webBrowser1.Navigate("http://127.0.0.1/test.html");
}
private void handlerAbc(Object sender, EventArgs e)
{
HtmlElement elm = webBrowser1.Document.GetElementById("abc");
if (elm == null) return;
Console.WriteLine("elm.InnerHtml(handlerAbc):" + elm.InnerHtml);
}
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
/* Get the original HTML (method 1)*/
System.IO.StreamReader getReader = new System.IO.StreamReader(webBrowser1.DocumentStream, System.Text.Encoding.Default);
string htmlA = getReader.ReadToEnd(); // htmlA can only extract original HTML
/* Get the original HTML (method 2)*/
string htmlB = webBrowser1.DocumentText; // htmlB can only extract original HTML
/* You can use the following method to extract the 'onChanged' AJAX content*/
HtmlElement elm = webBrowser1.Document.GetElementById("abc"); // Get "abc" element by ID
if (elm != null)
{
// Only read InnerHtml after the null check, otherwise a missing element throws
Console.WriteLine("elm.InnerHtml(DocumentCompleted):" + elm.InnerHtml);
// Listen on 'abc' onpropertychange event
elm.AttachEventHandler("onpropertychange", new EventHandler(handlerAbc));
}
}
Result:
elm.InnerHtml(DocumentCompleted):
elm.InnerHtml(handlerAbc):06:32:36
elm.InnerHtml(handlerAbc):06:32:38
elm.InnerHtml(handlerAbc):06:32:40

I used OpenWebKitSharp to solve the problem of HTML content being rendered by JavaScript. If you can change the library, see this solution: Get final HTML content after javascript finished by Open Webkit Sharp

Related

Getting <img src=""> attribute from AliExpress error

Today I was trying to load images from AliExpress products.
I was using this code: string NowImage = HJ.GetElementsByTagName("img")[0].GetAttribute("src");
It worked for the first 8 images but did not load the rest of the images;
it returned an empty string for them.
I checked the HTML of the AliExpress page and it looks like it should work.
Can someone help me? Thanks for reading.
public bool Search()
{
WB.DocumentCompleted += WB_SearchCompleted;
WB.Navigate(URL);
while (WB.ReadyState != WebBrowserReadyState.Complete)
Application.DoEvents();
return true;
}
private void WB_SearchCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
HtmlElementCollection HEC = WB.Document.GetElementsByTagName("li");
foreach(HtmlElement HJ in HEC)
{
if(HJ.GetAttribute("qrdata") == "")
continue;
NowImage = HJ.GetElementsByTagName("img")[0].GetAttribute("src");
//for the first 8 images it was loading perfect after that it was
//returning empty string
}
}

DocumentCompleted not firing twice

I am scraping a web page. The page loads and calls a DocumentCompleted handler. Inside that handler, I invoke a JavaScript method to set a date and then invoke a click to get the new data (via POST). This all works correctly except that the DocumentCompleted handler is only called once. The POST that goes back and "gets" a new page doesn't cause the handler to fire a second time.
I tried adding multiple handlers, removing the first and adding a second handler in the first handler. Didn't work. Also ran this as Administrator, didn't change anything.
Anyone have thoughts on how to proceed? I guess I can wait 60 seconds for it to load and then grab the text but that seems clunky.
public void FirstHandler(object sender,
WebBrowserDocumentCompletedEventArgs e)
{
WebBrowser wb = ((WebBrowser)sender);
string url = e.Url.ToString();
if (!(url.StartsWith("http://") || url.StartsWith("https://")))
{
// in AJAX
}
if (e.Url.AbsolutePath != webBrowser.Url.AbsolutePath)
{
// IFRAME Painting
}
else
{
// really really complete
wb.DocumentCompleted -= FirstHandler;
wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(SecondHandler);
HtmlElement webDatePicker = wb.Document.GetElementById("ctl00_WebSplitter1_tmpl1_ContentPlaceHolder1_dtePickerBegin");
string szJava = string.Empty;
szJava = "a = $find(\"ctl00_WebSplitter1_tmpl1_ContentPlaceHolder1_dtePickerBegin\"); a.set_text(\"5/20/2017\");";
object a = wb.Document.InvokeScript("eval", new object[] { szJava });
if (webDatePicker != null)
webDatePicker.InvokeMember("submit");
HtmlElement button = wb.Document.GetElementById("ctl00$WebSplitter1$tmpl1$ContentPlaceHolder1$HeaderBTN1$btnRetrieve");
if (button != null)
{
button.InvokeMember("click");
}
}
}
public void SecondHandler(object sender,
WebBrowserDocumentCompletedEventArgs e)
{
WebBrowser wb = ((WebBrowser)sender);
string url = e.Url.ToString();
string d = string.Empty;
if (!(url.StartsWith("http://") || url.StartsWith("https://")))
{
// in AJAX
}
if (e.Url.AbsolutePath != webBrowser.Url.AbsolutePath)
{
// IFRAME Painting
}
else
{
d = wb.DocumentText;
System.IO.File.WriteAllText("Finally.htm", d);
wb.DocumentCompleted -= SecondHandler;
}
_fired = true;
}

How to select a textbox in a webbrowser control?

I made a program that opens a Google Translate window when F1 is pressed.
I want the "source" textbox to get selected (focused).
I tried this:
formMain.Activate();
formMain.panelMain.Enabled = false;
formMain.panelMain.Focus();
formMain.panelMain.Select();
formMain.panelMain.Enabled = true;
formMain.webBrowserMain.BringToFront();
formMain.webBrowserMain.Select();
formMain.webBrowserMain.Focus();
formMain.ActiveControl = formMain.webBrowserMain;
if (formMain.WindowState != FormWindowState.Minimized)
{
Program.DoMouseClick((uint)formMain.PointToScreen(formMain.webBrowserMain.Location).X + 10, (uint)formMain.PointToScreen(formMain.webBrowserMain.Location).Y + 10);
}
HtmlElement textArea = formMain.webBrowserMain.Document.GetElementById("source");
if (textArea != null)
{
textArea.Focus();
}
But it only sometimes gets selected!
Try this code.
It injects a custom JavaScript function into the page and then calls it. This works for me in a similar case. (Always operate on a web page only after it has completely rendered/loaded.)
private void webBrowser1_DocumentCompleted(object sender,
WebBrowserDocumentCompletedEventArgs e)
{
HtmlElement head = webBrowser1.Document.GetElementsByTagName("head")[0];
HtmlElement scriptEl = webBrowser1.Document.CreateElement("script");
// IHTMLScriptElement comes from the mshtml namespace (reference Microsoft.mshtml)
IHTMLScriptElement element = (IHTMLScriptElement)scriptEl.DomElement;
element.text = "function trig_focus() { document.getElementById(\"theIdOfInputBox\").focus(); }";
head.AppendChild(scriptEl);
webBrowser1.Document.InvokeScript("trig_focus");
}

C# - Get variable from webbrowser generated by javascript

I have downloaded a page with the WebBrowser control and need to get a mail address from it, but the address is generated by JavaScript. In the page source I can find this script:
<script type="text/javascript" charset="utf-8">var i='ma'+'il'+'to';var a='impexta#impexta.sk';document.write(''+a+'');</script>
I have read everywhere how to invoke a script, but I don't know its name. What I want is the value of the variable "a".
EDIT: Code before:
...
WebBrowser wb = new WebBrowser();
wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
wb.Navigate(url);
for (; wb.ReadyState != WebBrowserReadyState.Complete; )
{
System.Windows.Forms.Application.DoEvents();
}
...
void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
WebBrowser wb = sender as WebBrowser;
if (wb != null)
{
if (wb.ReadyState == WebBrowserReadyState.Complete)
{
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.Load(wb.DocumentStream);
}
}
}
I found an easy solution: just find the right part of the string in the HTML code (root here is presumably the HtmlAgilityPack document node, e.g. doc.DocumentNode):
foreach (HtmlNode link in root.SelectNodes("//script"))
{
if (link.InnerText.Contains("+a+"))
{
string[] strs = new string[] { "var a='", "';document.write" };
strs = link.InnerText.Split(strs, StringSplitOptions.None);
outMail = System.Net.WebUtility.HtmlDecode(strs[1]);
if (outMail != "")
{
break;
}
}
}
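As an alternative to splitting the string, the same value can be pulled out with a regular expression. This is only a minimal sketch, assuming the variable is always assigned a single-quoted literal as in the script above; the class and method names (ScriptVariableExtractor, ExtractStringVariable) are hypothetical and not part of HtmlAgilityPack or the WebBrowser control.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: extracts a single-quoted string literal assigned to a
// JavaScript variable, e.g. var a='...'; from the text of a <script> element.
public static class ScriptVariableExtractor
{
    // Returns the literal assigned by "var <variableName>='...'",
    // or null when no such assignment is found in scriptText.
    public static string ExtractStringVariable(string scriptText, string variableName)
    {
        string pattern = @"\bvar\s+" + Regex.Escape(variableName) + @"\s*=\s*'([^']*)'";
        Match match = Regex.Match(scriptText, pattern);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

It could be called with link.InnerText and "a" inside the loop above; the result can still be passed through System.Net.WebUtility.HtmlDecode if the page encodes the address.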

How to increment a variable in ASP.NET (C#) between page views or button clicks

I have the feeling that I'm missing something key here.
I've tried following guides on http://msdn.microsoft.com/en-us/magazine/cc300437.aspx
and on Google, but I can't see what I haven't done.
I have some very basic code which I have written just to try to get this to work:
The Default.aspx code:
<%@ Page Language="C#" AutoEventWireup="true" CodeFile="Default.aspx.cs" Inherits="_Default" EnableSessionState="True" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
<title>Demo Page</title>
</head>
<body>
<form id="form1" runat="server">
<div>
<asp:Label ID="myLabel" runat="server" Text="foo"></asp:Label>
<asp:LinkButton ID="lnkClickButton" runat="server" OnClick="lnkClickButton_Click" CommandName="Clicky">Click Me</asp:LinkButton>
</div>
</form>
</body>
</html>
The Default.aspx.cs code:
using System;
using System.Collections.Generic;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
public partial class _Default : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
Session["clickcount"] = 0;
Cache["clickscount"] = 0;
}
protected void lnkClickButton_Click(object sender, EventArgs e)
{
Session["clickcount"] = (int)Session["clickcount"] + 1;
Cache["clickscount"] = (int)Cache["clickscount"] + 1;
Label myLabel = ((Label)this.FindControl("myLabel"));
if (myLabel != null)
{
myLabel.Text = "Session: " + Session["clickcount"] + "; Cache: " + Cache["clickscount"] + ";";
}
}
}
I've tried using both the Session object and the Cache object to increment the values, but to no avail. I just get 1 every time.
N.B. This is my first ASP.NET project, and I'm also fairly new to C#.
Page_Load runs on every postback as well as on the initial load. You need to skip the initialization on postbacks in your Page_Load:
protected void Page_Load(object sender, EventArgs e)
{
if (!Page.IsPostBack){
Session["clickcount"] = 0;
Cache["clickscount"] = 0;
}
}
Better still, specify that it should only be set if it doesn't already have a value:
protected void Page_Load(object sender, EventArgs e)
{
if (Session["clickcount"] == null){
Session["clickcount"] = 0;
}
}
Just to clarify: it is better to set the value only if it isn't already set because Page.IsPostBack is false every time someone visits the page directly. Say, for instance, you have your page http://example.com/Demo/Default.aspx and at the top you have a logo wrapped in a link back to that page; the session value would be reset every time somebody clicked on the logo, even though they didn't actually leave the page. The same happens if they refresh in their browser without re-posting the last post.
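The difference between resetting on every load and initializing only when the value is missing can be seen in a small simulation outside ASP.NET. This is a sketch under stated assumptions, not real page code: a Dictionary stands in for Session, and the class CounterPageSimulation and its members are hypothetical names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the page: "session" plays the role of Session,
// and ClickPostback mimics one postback (Page_Load first, then the click handler).
public class CounterPageSimulation
{
    private readonly Dictionary<string, object> session;
    private readonly bool resetOnEveryLoad;

    public CounterPageSimulation(Dictionary<string, object> session, bool resetOnEveryLoad)
    {
        this.session = session;
        this.resetOnEveryLoad = resetOnEveryLoad;
    }

    // Mirrors Page_Load: either always resets the counter (the original bug)
    // or sets it only when no value is stored yet.
    private void PageLoad()
    {
        if (resetOnEveryLoad || !session.ContainsKey("clickcount"))
        {
            session["clickcount"] = 0;
        }
    }

    // Simulates clicking the button once and returns the count after the click.
    public int ClickPostback()
    {
        PageLoad();
        session["clickcount"] = (int)session["clickcount"] + 1;
        return (int)session["clickcount"];
    }
}
```

With resetOnEveryLoad set to true every click reports 1, which matches the behaviour in the question; with false the count keeps growing across clicks.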
Read on MSDN: Page.IsPostBack Property - Gets a value that indicates whether the page is being rendered for the first time or is being loaded in response to a postback.
The value of the property is true if the page is being loaded in response to a client postback; otherwise, false.
You need to put the initialization code inside an !IsPostBack check, like below:
protected void Page_Load(object sender, EventArgs e)
{
if(!IsPostBack)
{
Session["clickcount"] = 0;
Cache["clickscount"] = 0;
}
}
A server-side control generates a postback to the page itself, so code that you don't want to execute on each postback needs to be placed as above.
This will resolve your issue.
Further to this, you can create a property for the count like this.
See my post: Programming Practice for Server Side State Maintenance Variable
private int ClickCount
{
get
{
if (Session["clickcount"] == null)
{ Session["clickcount"] = 0; return 0; }
else
return (int)Session["clickcount"] ;
}
set
{
Session["clickcount"] = value;
}
}
Then the final code becomes:
protected void Page_Load(object sender, EventArgs e)
{
if(!IsPostBack)
{
ClickCount = 0;
}
}
protected void lnkClickButton_Click(object sender, EventArgs e)
{
int val = ClickCount ;
ClickCount = val + 1;
}
Writing:
Session["clickcount"] = 0;
in Page_Load will cause the counter to be reset every time the page loads, including on every postback.
It seems to me you want something like this:
protected void lnkClickButton_Click(object sender, EventArgs e)
{
if (Session["clickcount"] == null)
{
Session["clickcount"] = 1;
}
else
{
Session["clickcount"] = (int)Session["clickcount"] + 1;
}
Label myLabel = ((Label)this.FindControl("myLabel"));
if (myLabel != null)
{
myLabel.Text = "Session: " + Session["clickcount"] + ";";
}
}
You get 1 because on every postback your session and cache variables are set back to 0.
protected void Page_Load(object sender, EventArgs e)
{
Session["clickcount"] = 0;
Cache["clickscount"] = 0;
}
The button click event occurs after Page_Load, so you should use the IsPostBack property.
protected void Page_Load(object sender, EventArgs e)
{
if(!IsPostBack)
{
Session["clickcount"] = 0;
Cache["clickscount"] = 0;
}
}
Now these variables are initialized only when the page is first loaded.
You should read the following link, which describes the ASP.NET page life cycle.
http://msdn.microsoft.com/en-us/library/ms178472.aspx
