Scenario:
I'd like to use a WebBrowser control to proxy navigation on external websites for a research project. To that end, I tried to use the WebBrowser control to load the site within a page request and forward the received HTML with some modifications (such as changed src/href attributes, JavaScript event handlers, and so on). When a participant/user triggers an onclick event on the proxied website, I capture this event on the server and would like to re-trigger it within my WebBrowser control.
Problem:
I can't figure out how to manage the WebBrowser control across requests. Initially I thought it was just a matter of storing it as a session object, but the fact that it has to run in an STA thread makes this difficult. I need the same, still-active browser object when the user invokes an onclick event, so that I can replay that onclick on the control.
For now I use a wrapper class, IEBrowser : System.Windows.Forms.ApplicationContext.
I copied the code from different sources, mainly from (http://www.codeproject.com/Articles/50544/Using-the-WebBrowser-Control-in-ASP-NET), but it does not cover reusing the same WebBrowser control across multiple requests.
Here is some of the code from the IEBrowser class:
public void Nav(string url)
{
this.url = url;
this.resultEvent = new AutoResetEvent(false);
htmlResult = null;
ths = new ThreadStart(delegate
{
// create a WebBrowser control
ieBrowser = new WebBrowser();
//Reset Session
InternetSetOption(IntPtr.Zero, INTERNET_OPTION_END_BROWSER_SESSION, IntPtr.Zero, 0);
// set WebBrowser event handler (must be the same method that is removed again below)
ieBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(IEBrowser_DocumentIsCompleted);
//make request
ieBrowser.Navigate(url);
System.Windows.Forms.Application.Run(this);
//remove WebBrowser event handler
ieBrowser.DocumentCompleted -= new WebBrowserDocumentCompletedEventHandler(IEBrowser_DocumentIsCompleted);
//for now, we keep the webBrowser open
//ieBrowser.Dispose();
});
thrd = new Thread(ths);
thrd.Name = "Thread 2";
thrd.IsBackground = true;
// set thread to STA state before starting
thrd.SetApartmentState(ApartmentState.STA);
thrd.Start();
EventWaitHandle.WaitAll(new AutoResetEvent[] { resultEvent });
thrd.Join();
}
// DocumentCompleted event handler
void IEBrowser_DocumentIsCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
if (ieBrowser.ReadyState == WebBrowserReadyState.Complete && ieBrowser.IsBusy == false)
{
//Replace or Set IDs on every HTML Element [...]
//...
//
ieBrowser.Stop();
ExitThread();
//Dispose();
resultEvent.Set();
}
}
Limitations:
This is not about performance. I need to do this remotely, but only 1-5 people will use the site simultaneously. I know that using the WebBrowser control is probably not a good solution in general, but in this case it is exactly what I need to capture all user navigation.
Related
I'm trying to programmatically log in to a site like espn.com. The way the site is set up, once I click the Log In button on the homepage, a Log In popup is displayed in the middle of the screen with the background slightly tinted. My goal is to programmatically obtain that popup box, supply the username and password, and submit it, hoping that a cookie is returned to me to use as authentication. However, because JavaScript is used to display the form, I don't necessarily have easy access to the form's input tags via the main page's HTML.
I've researched various solutions such as HttpClient and HttpWebRequest, but it appears that a WebBrowser is the best option, since the login form is displayed using JavaScript and a WebBrowser seems the best way to capture the popup's input elements.
class ESPNLoginViewModel
{
private string Url;
private WebBrowser webBrowser1 = new WebBrowser();
private SHDocVw.WebBrowser_V1 Web_V1;
public ESPNLoginViewModel()
{
Initialize();
}
private void Initialize()
{
Url = "http://www.espn.com/";
Login();
}
private void Login()
{
webBrowser1.Navigate(Url);
webBrowser1.DocumentCompleted +=
new WebBrowserDocumentCompletedEventHandler(webpage_DocumentCompleted);
Web_V1 = (SHDocVw.WebBrowser_V1)this.webBrowser1.ActiveXInstance;
Web_V1.NewWindow += new SHDocVw.DWebBrowserEvents_NewWindowEventHandler(Web_V1_NewWindow);
}
//This never gets executed
private void Web_V1_NewWindow(string URL, int Flags, string TargetFrameName, ref object PostData, string Headers, ref bool Processed)
{
//I'll start determining how to code this once I'm able to get this invoked
}
private void webpage_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
HtmlElement loginButton = webBrowser1.Document.GetElementsByTagName("button")[5];
loginButton.InvokeMember("click");
//I've also tried the InvokeScript call below to execute the JavaScript that runs
//when the Log In button is clicked, but Web_V1_NewWindow still wasn't called.
//webBrowser1.Document.InvokeScript("buildOverlay");
}
}
I'm expecting the Web_V1_NewWindow handler to be invoked when the InvokeMember("click") method is called. However, code execution only runs through the webpage_DocumentCompleted handler without any calls to Web_V1_NewWindow. It might be that I need to use a different method than InvokeMember("click") to invoke the Log In button's click event handler. Or I might need to try something completely different altogether. I'm not 100% sure the Web_V1.NewWindow is the correct approach for my needs, but I've seen NewWindow used often when dealing with popups so I figured I should give it a try.
Any help would be greatly appreciated as I've spent a significant amount of time on this.
I know this is a late answer, but it may help someone else.
You can extract the value from a FRAME element as follows:
// Get frame using frame ID
HtmlWindow frameWindow = (from HtmlWindow win
in WbBrowser.Document.Window.Frames select win)
.Where(x => string.Compare(x.WindowFrameElement.Id, "frm1") == 0)
.FirstOrDefault();
// Get first frame textbox with ID
HtmlElement txtElement = (from HtmlElement element
in frameWindow.Document.GetElementsByTagName("input")
select element)
.Where(x => string.Compare(x.Id, "txt") == 0).FirstOrDefault();
// Check whether txtElement is null
if(txtElement != null)
{
Label1.Text = txtElement.GetAttribute("value");
}
For more details, check this article.
I'm trying to load multiple pages using DotNetBrowser, and I need to know each time a new URL has loaded:
myBro.FinishLoadingFrameEvent += delegate (object send, FinishLoadingEventArgs es)
{
if (es.IsMainFrame && es.ValidatedURL.Contains("login"))
{
DOMDocument document = myBro.GetDocument();
DOMElement user = document.GetElementById("LoginForm_login");
user.SetAttribute("value", "email");
DOMElement pass = document.GetElementById("LoginForm_password");
pass.SetAttribute("value", "pass");
DOMElement loginbtn = document.GetElementByTagName("button");
loginbtn.Click();
// can't add anything more here //
}
};
But this code only notifies me when the first page has loaded.
The FinishLoadingFrameEvent is fired for each frame loaded on the web page, even after the page is reloaded. You can use it multiple times to be notified when a browser has loaded the web page completely after the LoadURL method is called.
Here is some sample code based on the documentation article https://dotnetbrowser.support.teamdev.com/support/solutions/articles/9000110055-loading-url-synchronously :
ManualResetEvent waitEvent = new ManualResetEvent(false);
browser.FinishLoadingFrameEvent += delegate(object sender, FinishLoadingEventArgs e)
{
// Wait until main document of the web page is loaded completely.
if (e.IsMainFrame)
{
waitEvent.Set();
}
};
//Load URL
browser.LoadURL("http://www.google.com");
waitEvent.WaitOne();
//The page http://www.google.com is now loaded completely
//Then, reset the event and load the next URL
waitEvent.Reset();
browser.LoadURL("http://www.microsoft.com");
waitEvent.WaitOne();
//The page http://www.microsoft.com is now loaded completely
I have a C# form with a WebBrowser control on it.
I am trying to visit different websites in a loop.
However, I cannot control when each URL is loaded into the WebBrowser control on my form.
This is the function I am using to navigate through the URLs:
public String WebNavigateBrowser(String urlString, WebBrowser wb)
{
string data = "";
wb.Navigate(urlString);
while (wb.ReadyState != WebBrowserReadyState.Complete)
{
Application.DoEvents();
}
data = wb.DocumentText;
return data;
}
How can I make my loop wait until it fully loads?
My loop is something like this:
foreach (string urlAddresses in urls)
{
WebNavigateBrowser(urlAddresses, webBrowser1);
// I need to add code here to make the WebBrowser on the form wait until the page has loaded
}
Add this to your code:
webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
Then fill in this function:
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e) {
// The event also fires for each frame; this check skips those so the code below runs only once per page
if (e.Url != webBrowser1.Url)
return;
// do your actual work here
}
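As a possible alternative to the DoEvents polling loop in the question, the same DocumentCompleted check can be wrapped in a Task so the loop can simply await each page. This is only a sketch: WebBrowserExtensions and NavigateAsync are hypothetical names (they are not part of Windows Forms), and it assumes the calling code runs on the UI thread that owns the control.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical helper class; not part of the .NET Framework.
public static class WebBrowserExtensions
{
    // Starts a navigation and completes the returned task with the page's
    // DocumentText once the top-level document has finished loading.
    public static Task<string> NavigateAsync(this WebBrowser wb, string url)
    {
        var tcs = new TaskCompletionSource<string>();
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (sender, e) =>
        {
            // DocumentCompleted also fires for frames; wait for the main document.
            if (e.Url != wb.Url)
                return;
            // Unsubscribe so the next navigation does not complete this task again.
            wb.DocumentCompleted -= handler;
            tcs.TrySetResult(wb.DocumentText);
        };
        wb.DocumentCompleted += handler;
        wb.Navigate(url);
        return tcs.Task;
    }
}
```

Inside an async event handler the loop would then read: foreach (string url in urls) { string html = await webBrowser1.NavigateAsync(url); }. Because the await returns control to the message loop, the form stays responsive without calling Application.DoEvents.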
After some time spent frustrated with the crappy IE functionality, I came up with what I have found to be the most accurate way to judge that a page has finished loading.
Never use the WebBrowserDocumentCompletedEventHandler event;
use WebBrowserProgressChangedEventHandler instead, with the modifications shown below.
//"ie" is our web browser object
ie.ProgressChanged += new WebBrowserProgressChangedEventHandler(_ie);
private void _ie(object sender, WebBrowserProgressChangedEventArgs e)
{
int max = (int)Math.Max(e.MaximumProgress, e.CurrentProgress);
int min = (int)Math.Min(e.MaximumProgress, e.CurrentProgress);
if (min.Equals(max))
{
//Run your code here when page is actually 100% complete
}
}
A simple, clever way of going about this. I found this question by googling "How to sleep web browser or put to pause".
According to MSDN (contains sample source) you can use the DocumentCompleted event for that. Additional very helpful information and source that shows how to differentiate between event invocations can be found here.
What you experienced happened to me too. ReadyState == Complete doesn't work in some cases. Here I used a bool set in the DocumentCompleted handler to check the state:
button1_click(){
//go site1
wb.Navigate("site1.com");
//wait for documentCompleted before continue to execute any further
waitWebBrowserToComplete(wb);
// set some values in html page
wb.Document.GetElementById("input1").SetAttribute("Value", "hello");
// then click submit. (submit does navigation)
wb.Document.GetElementById("formid").InvokeMember("submit");
// then wait for doc complete
waitWebBrowserToComplete(wb);
var processedHtml = wb.Document.GetElementsByTagName("HTML")[0].OuterHtml;
var rawHtml = wb.DocumentText;
}
// helpers
// Instead of checking ReadyState, we get the state from the DocumentCompleted event via a bool flag
bool webbrowserDocumentCompleted = false;
// Not static: it reads and resets the instance field above
public void waitWebBrowserToComplete(WebBrowser wb)
{
while (!webbrowserDocumentCompleted)
Application.DoEvents();
webbrowserDocumentCompleted = false;
}
form_load(){
wb.DocumentCompleted += (o, e) => {
webbrowserDocumentCompleted = true;
};
}
I am wondering whether the BeforeNavigate2 or DocumentComplete events should fire on pages that use AJAX, for example Google Maps. When I enter something in the address bar everything is OK, but when I move or resize the map nothing happens (DocumentComplete and BeforeNavigate2 do not fire), even though data is sent to and received from the Internet.
The A in AJAX stands for asynchronous. These events fire in response to synchronous methods completing. Since an asynchronous request can be made at any time, the browser has no way of knowing when they have all completed.
I think you need to handle the AJAX requests yourself, and you can do that with the DownloadBegin and DownloadComplete events.
In code:
public int SetSite(object site)
{
if (site != null)
{
webBrowser = (WebBrowser)site;
webBrowser.DownloadComplete += new DWebBrowserEvents2_DownloadCompleteEventHandler(DownloadComplete);
webBrowser.DownloadBegin += new DWebBrowserEvents2_DownloadBeginEventHandler(DownloadBegin);
}
else
{
// site is null: the browser is releasing us, so unsubscribe instead of subscribing again
webBrowser.DownloadComplete -= new DWebBrowserEvents2_DownloadCompleteEventHandler(DownloadComplete);
webBrowser.DownloadBegin -= new DWebBrowserEvents2_DownloadBeginEventHandler(DownloadBegin);
webBrowser = null;
}
return 0;
}
Events:
private void DownloadBegin()
{
MessageBox.Show("Download Begin");
}
private void DownloadComplete()
{
MessageBox.Show("Download Complete");
}
It works for me.
I monitor the DownloadBegin and DownloadComplete events to process pages that include AJAX code.
You also need program logic to control the flow, e.g. setting and checking flags.
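The "set/check flags" logic mentioned above could be sketched as a small counter that the two event handlers feed. DownloadTracker and its members are hypothetical names introduced only for illustration (they are not part of SHDocVw), and the sketch assumes that every DownloadBegin is eventually followed by a matching DownloadComplete.

```csharp
using System;
using System.Threading;

// Hypothetical helper: counts downloads that have begun but not yet completed.
public sealed class DownloadTracker
{
    private int pending;

    // Raised each time the number of outstanding downloads drops back to zero.
    public event Action Idle;

    // True while at least one download is still outstanding.
    public bool IsBusy
    {
        get { return Volatile.Read(ref pending) > 0; }
    }

    // Call this from the DownloadBegin handler.
    public void OnDownloadBegin()
    {
        Interlocked.Increment(ref pending);
    }

    // Call this from the DownloadComplete handler.
    public void OnDownloadComplete()
    {
        int remaining = Interlocked.Decrement(ref pending);
        if (remaining < 0)
        {
            // Unmatched completion: reset the counter and do not report idle.
            Interlocked.Exchange(ref pending, 0);
            return;
        }
        if (remaining == 0)
        {
            Idle?.Invoke();
        }
    }
}
```

The handlers shown earlier would call tracker.OnDownloadBegin() and tracker.OnDownloadComplete() instead of showing message boxes. Note that an AJAX-heavy page can go idle and then start new requests, so Idle may fire several times for one page; treating the page as "done" probably still needs a short quiet period or a page-specific check.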
I create an instance of IE outside my program, which the program finds and attaches to correctly. I set up my event handler and tell the program to advance to the login screen. The DocumentCompleted handler is supposed to fire when the web page is completely loaded, but mine seems to fire before the new page has appeared. The handler only fires once (meaning there is only one frame?).
This code also executes fine if I modify it to start directly from the login page. Am I doing something wrong? Thanks for any assistance :)
Process.Start(@"IESpecial.exe");
SHDocVw.ShellWindows allBrowsers = new SHDocVw.ShellWindows();
while (true)
{
foreach (SHDocVw.WebBrowser ie in allBrowsers)
{
if (ie.LocationURL == "http://website/home.asp")
{
loggingIn = true;
webBrowser = ie;
webBrowser.DocumentComplete += new SHDocVw.DWebBrowserEvents2_DocumentCompleteEventHandler(webBrowser1_DocumentCompleted);
webBrowser.Navigate("http://website/logon.asp");
return;
}
}
Thread.Sleep(10);
}
}
private void webBrowser1_DocumentCompleted(object pDisp, ref object URL)
{
//we are attempting to log in
if (loggingIn)
{
mshtml.HTMLDocumentClass doc = (mshtml.HTMLDocumentClass)webBrowser.Document;
mshtml.HTMLWindow2 window = (mshtml.HTMLWindow2)doc.IHTMLDocument2_parentWindow;
doc.getElementById("Username").setAttribute("value", "MLAPAGLIA");
doc.getElementById("Password").setAttribute("value", "PASSWORD");
window.execScript("SubmitAction()", "javascript");
loggingIn = false;
return;
}