First, I have this button click event:
private void toolStripButton3_Click(object sender, EventArgs e)
{
GetHtmls();
}
Then the method GetHtmls:
private void GetHtmls()
{
for (int i = 1; i < 2; i++)
{
adrBarTextBox.Text = sourceUrl + i;
getCurrentBrowser().Navigate(adrBarTextBox.Text);
targetHtmls = (combinedHtmlsDir + "\\Html" + i + ".txt");
}
}
For now the loop handles a single HTML page, but later I will change the condition to i < 45.
The getCurrentBrowser method:
private WebBrowser getCurrentBrowser()
{
return (WebBrowser)browserTabControl.SelectedTab.Controls[0];
}
Then in the Form1 Load event I have:
browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(Form1_DocumentCompleted);
And the completed event:
private void Form1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
WebBrowser currentBrowser = getCurrentBrowser();
StreamWriter writer = File.CreateText(targetHtmls);
writer.Write(getCurrentBrowser().DocumentText);
writer.Close();
}
What I'm doing here is loading the HTML into the web browser and creating a file on the hard disk with the HTML source content.
But I'm running into two problems:
The completed event keeps firing, and the HTML file is created over and over again until the document is loaded. How can I make it run only once? I mean, it should wait until the document is loaded and then create the file and write to it only once.
How do I make it all work when the loop condition is i < 45 and not i < 2?
It should wait for the first HTML page to load, write it to the file in the completed event, and only after the writing is finished navigate to the next page in the loop, and so on, so that the pages do not overlap each other.
The DocumentCompleted event of the web browser doesn't behave like other completed events: it keeps firing every second or so until the HTML has finished loading.
I am not sure what the purpose of the WebBrowser is in this case. That control is for human interaction, not loading X number of sites.
I would recommend using HttpWebRequest or the higher-level WebClient class, which is much easier to use in the case you show here.
The WebClient class can be used like this:
WebClient wc = new WebClient();
string html = wc.DownloadString("yourUrl");
This will wait until the request is completed and the result is returned. No need for event handlers or such. You could improve the performance by using async though.
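Applied to the loop from the question, a minimal sketch could look like the following. It assumes the pages do not depend on JavaScript to produce their content, since WebClient only fetches the raw response. The PageDownloader class and the DownloadPages method are hypothetical names; the sourceUrl and targetDir parameters stand in for the question's sourceUrl and combinedHtmlsDir fields.

```csharp
using System.IO;
using System.Net;

static class PageDownloader
{
    // Hypothetical helper: downloads sourceUrl + 1 .. sourceUrl + pageCount
    // and saves each response as Html<i>.txt in targetDir.
    public static void DownloadPages(string sourceUrl, string targetDir, int pageCount)
    {
        Directory.CreateDirectory(targetDir);
        using (WebClient client = new WebClient())
        {
            for (int i = 1; i <= pageCount; i++)
            {
                // DownloadString blocks until the whole response has arrived,
                // so each file is written exactly once, in order.
                string html = client.DownloadString(sourceUrl + i);
                File.WriteAllText(Path.Combine(targetDir, "Html" + i + ".txt"), html);
            }
        }
    }
}
```

With the question's loop condition of i < 45, the call would be DownloadPages(sourceUrl, combinedHtmlsDir, 44). Calling it directly from a button click handler blocks the UI until all pages are saved.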
I'm trying to load multiple pages using DotNetBrowser, and I need to know each time a new URL has loaded:
myBro.FinishLoadingFrameEvent += delegate (object send, FinishLoadingEventArgs es)
{
if (es.IsMainFrame && es.ValidatedURL.Contains("login"))
{
DOMDocument document = myBro.GetDocument();
DOMElement user = document.GetElementById("LoginForm_login");
user.SetAttribute("value", "email");
DOMElement pass = document.GetElementById("LoginForm_password");
pass.SetAttribute("value", "pass");
DOMElement loginbtn = document.GetElementByTagName("button");
loginbtn.Click();
// can't add nothing more here //
}
};
But this code only notifies me when the first page is loaded.
The FinishLoadingFrameEvent is fired for each frame loaded on the web page, even after the page is reloaded. You can use it multiple times to be notified when a browser has loaded the web page completely after the LoadURL method is called.
Here is a sample code based on the documentation article https://dotnetbrowser.support.teamdev.com/support/solutions/articles/9000110055-loading-url-synchronously :
ManualResetEvent waitEvent = new ManualResetEvent(false);
browser.FinishLoadingFrameEvent += delegate(object sender, FinishLoadingEventArgs e)
{
// Wait until main document of the web page is loaded completely.
if (e.IsMainFrame)
{
waitEvent.Set();
}
};
//Load URL
browser.LoadURL("http://www.google.com");
waitEvent.WaitOne();
//The page http://www.google.com is now loaded completely
//Then, reset the event and load the next URL
waitEvent.Reset();
browser.LoadURL("http://www.microsoft.com");
waitEvent.WaitOne();
//The page http://www.microsoft.com is now loaded completely
I use FiddlerCore to capture multiple URLs at the same time inside a loop.
Example:
private void button1_Click(object sender, EventArgs e)
{
// I have 2 url
string[] arr = new string[]{ url1, url2 };
foreach(var url in arr)
{
new WebBrowser().Navigate(url);
}
Fiddler.FiddlerApplication.AfterSessionComplete
+= new Fiddler.SessionStateHandler(FiddlerApplication_AfterSessionComplete);
}
// I will catch 2 oSession contain same string "a/b/c" in 2 URL from 2 Webbrowser in loop
int Count = 0;
void FiddlerApplication_AfterSessionComplete(Fiddler.Session oSession)
{
if(oSession.fullUrl.Contains("a/b/c"))
{
Count+= 1;
richtextbox1.AppendText(oSession.fullUrl + "\n");
}
if(Count == 2)
{
Count = 0;
StopFiddler();
}
}
void StopFiddler()
{
Fiddler.FiddlerApplication.AfterSessionComplete
-= new Fiddler.SessionStateHandler(FiddlerApplication_AfterSessionComplete);
}
This works, but I have a problem: FiddlerCore stops capturing sessions, but the WebBrowser doesn't stop; it is still loading.
How can I stop the WebBrowser from loading after I get what I need?
Use WebBrowser.Stop() to stop all loading.
From the documentation: "Cancels any pending navigation and stops any dynamic page elements, such as background sounds and animations."
Edit: Also, you need to save a reference to those WebBrowser controls you're creating, so that you can actually call the Stop method for them. The way you use them now is quite strange and might lead to problems down the line (actually it led to problems already).
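A minimal sketch of keeping those references might look like this. The CaptureForm class, the browsers list, and the StartNavigation and StopAllBrowsers methods are hypothetical names, not part of the original code.

```csharp
using System.Collections.Generic;
using System.Windows.Forms;

public class CaptureForm : Form
{
    // Every browser created in the loop is remembered here
    // so it can be stopped later.
    private readonly List<WebBrowser> browsers = new List<WebBrowser>();

    private void StartNavigation(IEnumerable<string> urls)
    {
        foreach (string url in urls)
        {
            WebBrowser browser = new WebBrowser();
            browsers.Add(browser);
            browser.Navigate(url);
        }
    }

    private void StopAllBrowsers()
    {
        foreach (WebBrowser browser in browsers)
        {
            browser.Stop();    // cancel any pending navigation
            browser.Dispose(); // release the underlying control
        }
        browsers.Clear();
    }
}
```

StopAllBrowsers would be called from the place where StopFiddler is called now. If the FiddlerCore handler runs on a background thread, which is an assumption worth checking, the call has to be marshaled to the UI thread first, for example with BeginInvoke(new Action(StopAllBrowsers)).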
I am creating an application that uses threads. Everything works until I click the button a second time: nothing happens on the second click. It's as if everything loads the first time and then the values of the text boxes are locked. The parts in red are just private links that cannot be shown. The links are not the problem, because they work fine the first time; they just don't work the second time. I hope that wasn't too confusing.
name1, name2, name3 are all downloaded when the form is created, they're just bound to the textboxes when you press the button the first time.
_name1(), _name2(), _name3() methods are just object instantiations and have no side effects of any kind (put differently, they don't do anything).
And all the threading stuff is just fluff - you're calling methods that don't do anything and then aborting the threads (thereby aborting something that isn't doing anything anyway). This has zero effect on the execution in any way as the code is currently written, even when executed the first time.
The simple, synchronous fix for your code will look like this:
private void Button_Click(object sender, EventArgs e)
{
using (WebClient client = new WebClient())
{
textBox1.Text = client.DownloadString("<your URL here>");
textBox2.Text = client.DownloadString("<your URL here>");
textBox3.Text = client.DownloadString("<your URL here>");
}
}
Seeing as you're using threads, your goal is obviously non-blocking, asynchronous execution. The easiest way to achieve it while preserving the sequencing of operations is with async/await:
private async void Button_Click(object sender, EventArgs e)
{
// Disabling the button ensures that it's not pressed
// again while the first request is still in flight.
materialRaisedButton1.Enabled = false;
try
{
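// A WebClient instance does not support concurrent operations,
// so each parallel download gets its own client.
using (WebClient client1 = new WebClient())
using (WebClient client2 = new WebClient())
using (WebClient client3 = new WebClient())
{
// Execute async downloads in parallel:
Task<string>[] parallelDownloads = new[] {
client1.DownloadStringTaskAsync("<your URL here>"),
client2.DownloadStringTaskAsync("<your URL here>"),
client3.DownloadStringTaskAsync("<your URL here>")
};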
// Collect results.
string[] results = await Task.WhenAll(parallelDownloads);
// Update all textboxes at the same time.
textBox1.Text = results[0];
textBox2.Text = results[1];
textBox3.Text = results[2];
}
}
finally
{
materialRaisedButton1.Enabled = true;
}
}
I'm creating an application that contains a "geckoWebBrowser" in C#. I have to wait for a web page to load completely and then continue executing other instructions. There is something similar to geckowebbrowser1.DocumentComplete, but I don't know how to use it.
Please help me with my code:
geckoWebBrowser1.Navigate(textBox1.Text);
// i want to perform below thing after web page load completes
listBox1.Items.RemoveAt(listBox1.SelectedIndex);
listBox1.SelectedIndex = 0;
int i = listBox1.Items.Count;
string str = Convert.ToString(i);
label2.Text = str;
You could use the event:
geckoWebBrowser1_OnDocumentCompleted(EventArgs e)
But remember that this event can fire many times before the same page finishes loading, so write the logic accordingly.
I have a C# form with a web browser control on it.
I am trying to visit different websites in a loop.
However, I cannot control how the URLs are loaded into the WebBrowser control on my form.
This is the function I am using for navigating through URL addresses:
public String WebNavigateBrowser(String urlString, WebBrowser wb)
{
string data = "";
wb.Navigate(urlString);
while (wb.ReadyState != WebBrowserReadyState.Complete)
{
Application.DoEvents();
}
data = wb.DocumentText;
return data;
}
How can I make my loop wait until it fully loads?
My loop is something like this:
foreach (string urlAddresses in urls)
{
WebNavigateBrowser(urlAddresses, webBrowser1);
// I need to add a code to make webbrowser in Form to wait till it loads
}
Add this to your code:
webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
Fill in this function:
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e) {
// This check makes sure you only handle the event once:
// frames raise it too, with a URL that differs from the page URL.
if (e.Url != webBrowser1.Url)
return;
// do your actual work here
}
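To make a loop wait for each page without polling, one option is to wrap the event in a Task and await it. This is only a sketch under assumptions: WebBrowserExtensions and NavigateAsync are hypothetical names, not part of WinForms, and the same URL check as above is used to skip the events raised by frames.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

public static class WebBrowserExtensions
{
    // Hypothetical helper: starts a navigation and completes the returned task
    // with the document text once the top-level document has finished loading.
    public static Task<string> NavigateAsync(this WebBrowser browser, string url)
    {
        var tcs = new TaskCompletionSource<string>();
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (sender, e) =>
        {
            // Frames raise DocumentCompleted too; wait for the main document.
            if (e.Url != browser.Url)
                return;
            // Unsubscribe so the next navigation starts with a clean slate.
            browser.DocumentCompleted -= handler;
            tcs.TrySetResult(browser.DocumentText);
        };
        browser.DocumentCompleted += handler;
        browser.Navigate(url);
        return tcs.Task;
    }
}
```

Inside an async event handler the loop then becomes foreach (string url in urls) { string html = await webBrowser1.NavigateAsync(url); }, which moves to the next URL only after the previous page has completed, while the UI stays responsive. The sketch has no timeout, so a page that never completes would leave the loop waiting.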
After getting angry at the crappy IE functionality for some time, I came up with what I find to be the most accurate way to judge whether a page has finished loading.
Never use the WebBrowserDocumentCompletedEventHandler event.
Use WebBrowserProgressChangedEventHandler with some modifications, as seen below.
//"ie" is our web browser object
ie.ProgressChanged += new WebBrowserProgressChangedEventHandler(_ie);
private void _ie(object sender, WebBrowserProgressChangedEventArgs e)
{
int max = (int)Math.Max(e.MaximumProgress, e.CurrentProgress);
int min = (int)Math.Min(e.MaximumProgress, e.CurrentProgress);
if (min.Equals(max))
{
//Run your code here when page is actually 100% complete
}
}
A simple, genius way of going about this. I found this question by googling "How to sleep web browser or put to pause".
According to MSDN (contains sample source) you can use the DocumentCompleted event for that. Additional very helpful information and source that shows how to differentiate between event invocations can be found here.
What you experienced happened to me too. ReadyState.Complete doesn't work in some cases. Here I used a bool set in DocumentCompleted to check the state:
private void button1_click(object sender, EventArgs e) {
//go site1
wb.Navigate("site1.com");
//wait for documentCompleted before continue to execute any further
waitWebBrowserToComplete(wb);
// set some values in html page
wb.Document.GetElementById("input1").SetAttribute("Value", "hello");
// then click submit. (submit does navigation)
wb.Document.GetElementById("formid").InvokeMember("submit");
// then wait for doc complete
waitWebBrowserToComplete(wb);
var processedHtml = wb.Document.GetElementsByTagName("HTML")[0].OuterHtml;
var rawHtml = wb.DocumentText;
}
// helpers
// Instead of checking ReadyState, we get the state from the DocumentCompleted event via a bool field.
bool webbrowserDocumentCompleted = false;
// Not static: the method reads the instance field above.
public void waitWebBrowserToComplete(WebBrowser wb)
{
while (!webbrowserDocumentCompleted)
Application.DoEvents();
webbrowserDocumentCompleted = false;
}
private void form_load(object sender, EventArgs e) {
wb.DocumentCompleted += (o, e) => {
webbrowserDocumentCompleted = true;
};
}