C#: How do I get the document title from a WebBrowser element?

I'm having issues trying to get the document title from a WebBrowser in C#. It works fine in VB.NET, but it won't give me any properties in C#.
When I type in MyBrowser.Document., the only options I get are 4 methods: Equals, GetHashCode, GetType, and ToString - no properties.
I think it's because I have to assign the document to a new instance first, but I can't find the HTMLDocument class that exists in VB.NET.
Basically what I'm wanting to do is return the Document.Title each time the WebBrowser loads/reloads a page.
Can someone help please? It will be much appreciated!
Here is the code I have at the moment...
private void Link_Click(object sender, RoutedEventArgs e)
{
WebBrowser tempBrowser = new WebBrowser();
tempBrowser.HorizontalAlignment = HorizontalAlignment.Left;
tempBrowser.Margin = new Thickness(-4, -4, -4, -4);
tempBrowser.Name = "MyBrowser";
tempBrowser.VerticalAlignment = VerticalAlignment.Top;
tempBrowser.LoadCompleted += new System.Windows.Navigation.LoadCompletedEventHandler(tempBrowser_LoadCompleted);
tempTab.Content = tempBrowser; // this is just a TabControl that contains the WebBrowser
Uri tempURI = new Uri("http://www.google.com");
tempBrowser.Navigate(tempURI);
}
private void tempBrowser_LoadCompleted(object sender, EventArgs e)
{
if (sender is WebBrowser)
{
MessageBox.Show("Test");
currentBrowser = (WebBrowser)sender;
System.Windows.Forms.HtmlDocument tempDoc = (System.Windows.Forms.HtmlDocument)currentBrowser.Document;
MessageBox.Show(tempDoc.Title);
}
}
This code doesn't give me any errors, but I never see the second MessageBox. I do see the first one though (the "Test" message), so the program is getting to that code block.

Add reference to Microsoft.mshtml
Add event receiver for LoadCompleted
webbrowser.LoadCompleted += new LoadCompletedEventHandler(webbrowser_LoadCompleted);
Because the handler only runs once the page has finished loading, the document is available when you read values back out of it.
void webbrowser_LoadCompleted(object sender, NavigationEventArgs e)
{
// Get the document title and display it
// The "as" cast yields null if there is no document or it is not an HTML document
mshtml.IHTMLDocument2 doc = webbrowser.Document as mshtml.IHTMLDocument2;
if (doc != null)
{
Informative.Text = doc.title;
}
}
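If you would rather not add the Microsoft.mshtml reference, a late-bound variant may also work. This is only a sketch: it assumes the WPF System.Windows.Controls.WebBrowser and a project that references Microsoft.CSharp (needed for dynamic). The names BrowserTitleHelper and TryGetTitle are made up for illustration.

```csharp
using System;
using System.Windows.Controls;

public static class BrowserTitleHelper
{
    // Hypothetical helper: returns the page title, or null if it cannot be read.
    public static string TryGetTitle(WebBrowser browser)
    {
        if (browser == null || browser.Document == null)
            return null;

        try
        {
            // Document is an untyped COM object; dynamic resolves "title" at run time.
            dynamic doc = browser.Document;
            return (string)doc.title;
        }
        catch (Exception)
        {
            // Non-HTML content may not expose a title member at all.
            return null;
        }
    }
}
```

From a LoadCompleted handler it could be called as `string title = BrowserTitleHelper.TryGetTitle((WebBrowser)sender);`.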

You are not using the Windows Forms WebBrowser control. I think you got the COM wrapper for ieframe.dll, its name is AxWebBrowser. Verify that by opening the References node in the Solution Explorer window. If you see AxSHDocVw then you got the wrong control. It is pretty unfriendly, it just gives you an opaque interface pointer for the Document property. You'll indeed only get the default object class members.
Look in the toolbox. Pick WebBrowser instead of "Microsoft Web Browser".

string title = ((mshtml.HTMLDocument)MyBrowser.Document).title;
Or
mshtml.HTMLDocument doc = (mshtml.HTMLDocument)MyBrowser.Document;
string title = doc.title;

LoadCompleted doesn't fire. Use the Navigated event handler instead.
webBrowser.Navigated += new NavigatedEventHandler(WebBrowser_Navigated);
(...)
private void WebBrowser_Navigated(object sender, NavigationEventArgs e)
{
HTMLDocument doc = ((WebBrowser)sender).Document as HTMLDocument;
foreach (IHTMLElement elem in doc.all)
{
(...)
}
// you may have to dispose WebBrowser object on exit
}

Finally works well with:
using System.Windows.Forms;
...
WebBrowser CtrlWebBrowser = new WebBrowser();
...
CtrlWebBrowser.Document.Title = "Hello World";
MessageBox.Show( CtrlWebBrowser.Document.Title );

Related

C# winforms webbrowser not going to url's asked for

I was asked by a friend to develop a winform app to be able to extract data. I figured it would be easy enough - how wrong I was!
In my winform, I have included a webbrowser control and some buttons. The URL for the webbrowser is http://www.racingpost.com/greyhounds/card.sd and, as you can imagine, it is the place to get data for greyhounds. On that page there are a number of links, each specific to a race time. If you click on any of these, it takes you to that race, and it's this data that I need to extract. So, my initial thought was to get ALL the links off the page above, store them in a list, then have a button that takes whatever link is next and sends the webbrowser to that location. Once there, I can extract the data and store it as needed.
So, in the first instance, I use
//url = link above
wb1.Url = new Uri(url);
grab the data (which are links for each race on that day)
once I have this, use a further button to go to the specific race
wb1.Url = new Uri("http://www.racingpost.com/greyhounds/card.sd#resultday=2015-01-17&raceid=1344640");
then, once there, click another button to capture the data, after which, return to the original link above.
The problem is, it will not go to the location present in the link. BUT, if I click the link manually within the webbrowser, it goes there no problem.
I have looked at the properties of the webbrowser, and these all look fine - although I can't qualify that tbh!
I know if I try to go to the links manually, I can, but if I try to do it through code, it just won't budge. I can only assume I have done something wrong in the code.
Hope some of that makes sense - first posting, so apologies if I made a mess of it. I will provide all code no problem, but can't seem to figure out how to post the code in 'code format'?
//here is the code
public partial class Form1 : Form
{
Uri _url;
public Form1()
{
InitializeComponent();
wb1.Url = new Uri("http://www.racingpost.com/greyhounds/card.sd");
wb1.Navigated +=new WebBrowserNavigatedEventHandler(wb1_Navigated);
}
classmodules.trackUrl tu;
private void btnGrabData_Click(object sender, EventArgs e)
{
classmodules.utility u = new classmodules.utility();
rtb1.Text = u.GetWebData("http://www.racingpost.com/greyhounds/card.sd");
HtmlDocument doc = wb1.Document;
string innerText = (((mshtml.HTMLDocument)(doc.DomDocument)).documentElement).outerHTML;
innerText = Regex.Replace(innerText, @"\r\n?|\n", "");
rtb1.Text = innerText;
tu = new classmodules.trackUrl();
u.splitOLs(ref tu, innerText);
classmodules.StaticUtils su = new classmodules.StaticUtils();
su.SerializeObject(tu, typeof(classmodules.trackUrl)).Save(@"d:\dogsUTL.xml");
classmodules.ExcelProcessor xl = new classmodules.ExcelProcessor();
xl.createExcel(tu);
}
private void wb1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
WebBrowser wb1 = sender as WebBrowser;
this.Text = wb1.Url.ToString();
}
void wb1_Navigated(object sender, WebBrowserNavigatedEventArgs e)
{
_url = e.Url;
}
private void btnGoBack_Click(object sender, EventArgs e)
{
goBack();
}
private void goBack()
{
wb1.Url = new Uri("http://www.racingpost.com/greyhounds/card.sd");
}
private void btnGetRaceData_Click(object sender, EventArgs e)
{
HtmlDocument doc = wb1.Document;
string innerText = (((mshtml.HTMLDocument)(doc.DomDocument)).documentElement).outerHTML;
rtb2.Text = innerText;
}
//###############################
// OK, here is the point where I want to take in the URL and click a button
// to instruct the webbrowser to go to that location. I start a counter
// at 0, get the first url from the list, then increment the counter.
// When I click the button again, urlNo will be 1, so it then tries
// the second url.
int urlNo = 0;
private void btnUseData_Click(object sender, EventArgs e)
{
if (tu.race.Count > urlNo)
{
string url = tu.race[urlNo].url;
wb1.Url = new Uri(url);
lblUrl.Text = url;
urlNo++;
}
else
{
lblUrl.Text = "No More";
}
}
Did you try the Navigate(...) method? In theory, setting Url and calling Navigate should behave the same, but in practice they seem to behave slightly differently.
http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser.navigate(v=vs.110).aspx
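A minimal sketch of the button handler rewritten around Navigate is shown here. It is not the asker's actual code: the class name RaceBrowserForm and the plain list raceUrls are placeholders standing in for the form and the tu.race collection from the question, and whether this changes the behaviour on that particular site is untested.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

public class RaceBrowserForm : Form
{
    private readonly WebBrowser wb1 = new WebBrowser { Dock = DockStyle.Fill };
    private readonly Button btnUseData = new Button { Text = "Next race", Dock = DockStyle.Top };
    // Placeholder for the URLs that the question keeps in tu.race.
    private readonly List<string> raceUrls = new List<string>();
    private int urlNo = 0;

    public RaceBrowserForm(IEnumerable<string> urls)
    {
        raceUrls.AddRange(urls);
        Controls.Add(wb1);
        Controls.Add(btnUseData);
        btnUseData.Click += BtnUseData_Click;
    }

    private void BtnUseData_Click(object sender, EventArgs e)
    {
        if (urlNo >= raceUrls.Count)
        {
            Text = "No More";
            return;
        }

        // Ask the control to navigate explicitly instead of assigning wb1.Url.
        wb1.Navigate(raceUrls[urlNo]);
        urlNo++;
    }
}
```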

How to disable every navigation in WebBrowser?

I have a WebBrowser control whose URL I refresh or change dynamically based on user input. I don't want to let the user navigate, so I set AllowNavigation to false. This seems to be OK; however, the link below is still "active":
Close Page
The issue here is: If the user clicks it, and confirms closure in the pop-up window I can't manage WebBrowser anymore. Looks like it is closed though the last page is still visible. Also I can't remove this link as the site is not managed by me.
Disable the control? Nope, I have to allow the user to highlight and copy text from the webpage.
Do I have any other option to disable literally ALL links?
@TaW: here is my code based on yours. So I have to set the url from my code and call a custom one:
button_click()
{
webBrowser1_load_URL("http://website/somecheck.php?compname=" + textBoxHost.Text);
}
Here it is the function:
private void webBrowser1_load_URL(string url)
{
string s = GetDocumentText(url.ToString());
s = s.Replace(@"javascript:window.close()", "");
webBrowser1.AllowNavigation = true;
webBrowser1.DocumentText = s;
}
The rest is exactly what's in your answer:
private void webBrowser1_DocumentCompleted(object sender,
WebBrowserDocumentCompletedEventArgs e)
{
webBrowser1.AllowNavigation = false;
}
public string GetDocumentText(string s)
{
WebBrowser dummy = new WebBrowser(); //(*)
dummy.Url = new Uri(s);
return dummy.DocumentText;
}
Still it's not working. Please help me to spot the issue with my code.
If you have control over the loading of the pages, you could grab each page's text and change the code to disable rogue scripts. The one you showed can simply be deleted. Of course, you might have to foresee more than that one.
Obviously this would be easier if you could do without JavaScript altogether, but if that is not an option, go after the scripts that do real or pseudo-navigation.
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
webBrowser1.AllowNavigation = false;
}
private void loadURL_Click(object sender, EventArgs e)
{
webBrowser1.AllowNavigation = true;
string s = File.ReadAllText(textBox_URL.Text);
s = s.Replace("javascript:window.close()", "");
webBrowser1.DocumentText = s;
}
If the pages are not in the file system, the same trick should work, for instance by loading the URL into a dummy WebBrowser like this:
private void cb_loadURL_Click(object sender, EventArgs e)
{
string s = GetDocumentText(tb_URL.Text);
s = s.Replace("javascript:window.close()", "");
webBrowser1.AllowNavigation = true;
webBrowser1.DocumentText = s;
}
public string GetDocumentText(string s)
{
WebBrowser dummy = new WebBrowser(); //(*)
dummy.Url = new Uri(s);
return dummy.DocumentText;
}
Note: According to this post you can't set DocumentText quite as freely as one would think; probably a bug. Instead of creating the dummy each time, you can also move the (*) line to class level. Then, no matter how many changes you had to make, you would always have an unchanged version, which the user could, for example, save somewhere.
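One possible reason the dummy-browser version returns nothing useful is that DocumentText is read immediately after Url is set, before the page has had time to load. The sketch below is an alternative under that assumption, not the answerer's method: it downloads the HTML synchronously with System.Net.WebClient and then applies the same Replace. The class and method names (PageLoader, RemoveCloseScript, LoadSanitized) are made up for illustration, and relative links or images in the page may not resolve when the text is assigned this way.

```csharp
using System;
using System.Net;
using System.Windows.Forms;

public static class PageLoader
{
    // Removes the script URL named in the question; extend for other scripts as needed.
    public static string RemoveCloseScript(string html)
    {
        return html.Replace("javascript:window.close()", "");
    }

    // Downloads the page synchronously, so the full text is available on return.
    public static string GetDocumentText(string url)
    {
        using (var client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }

    public static void LoadSanitized(WebBrowser browser, string url)
    {
        string html = RemoveCloseScript(GetDocumentText(url));
        browser.AllowNavigation = true;  // has to be on while the text is assigned
        browser.DocumentText = html;     // a DocumentCompleted handler can switch it off again
    }
}
```

With this in place the button handler reduces to something like `PageLoader.LoadSanitized(webBrowser1, tb_URL.Text);`.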

webbrowser control SetAttribute does not respond programmatically

I have an application in which I need to add text programmatically to some fields. It works on most pages, but on www.google.com, when I try to put a value into the search box, nothing shows up until I click on the text area; only then does the value appear.
Is there any way to get around this?
My code:
HtmlElementCollection el = webBrowser1.Document.All;
foreach (HtmlElement H in el)
{
if (H.GetAttribute("type").Equals("text") )
H.SetAttribute("value", sendtext);
}
I also tried to click on it programmatically:
object obj = H.DomElement;
System.Reflection.MethodInfo mi = obj.GetType().GetMethod("click");
mi.Invoke(obj, new object[0]);
That does not work either.
Project + Add Reference, Browse tab and select c:\windows\system32\mshtml.tlb (.dll on earlier Windows versions). This gives you access to the native COM interface that the DomElement property returns. So you can write your code cleanly like this:
var obj = (mshtml.IHTMLElement)H.DomElement;
obj.click();
Or you can do it a bit less cleanly with the HtmlElement.InvokeMember() method:
H.InvokeMember("click");
A sample form that runs a google query using this technique:
public partial class Form1 : Form {
public Form1() {
InitializeComponent();
webBrowser1.Url = new Uri("http://google.com");
webBrowser1.DocumentCompleted += webBrowser1_DocumentCompleted;
}
void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e) {
if (webBrowser1.Url.Host.EndsWith("google.com")) {
HtmlDocument doc = webBrowser1.Document;
HtmlElement ask = doc.All["q"];
HtmlElement lucky = doc.All["btnI"];
ask.InnerText = "stackoverflow";
lucky.InvokeMember("click");
}
}
}

.NET C#: WebBrowser control Navigate() does not load targeted URL

I'm trying to programmatically load a web page via the WebBrowser control with the intent of testing the page and its JavaScript functions. Basically, I want to compare the HTML and JavaScript run through this control against a known output to ascertain whether there is a problem.
However, I'm having trouble simply creating and navigating the WebBrowser control. The code below is intended to load the HtmlDocument into the WebBrowser.Document property:
WebBrowser wb = new WebBrowser();
wb.AllowNavigation = true;
wb.Navigate("http://www.google.com/");
When examining the web browser's state via Intellisense after Navigate() runs, the WebBrowser.ReadyState is 'Uninitialized', WebBrowser.Document = null, and it overall appears completely unaffected by my call.
On a contextual note, I'm running this control outside of a Windows form object: I do not need to load a window or actually look at the page. Requirements dictate the need to simply execute the page's JavaScript and examine the resultant HTML.
Any suggestions are greatly appreciated, thanks!
You should handle the WebBrowser.DocumentCompleted event; once that event is raised, the Document and related properties are available.
wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
private void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
WebBrowser wb = sender as WebBrowser;
// wb.Document is not null at this point
}
Here is a complete example, that I quickly did in a Windows Forms application and tested.
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
WebBrowser wb = new WebBrowser();
wb.AllowNavigation = true;
wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
wb.Navigate("http://www.google.com");
}
private void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
WebBrowser wb = sender as WebBrowser;
// wb.Document is not null at this point
}
}
Edit: Here is a simple version of code that runs a window from a console application. You can of course go further and expose the events to the console code etc.
using System;
using System.Windows;
using System.Windows.Forms;
namespace ConsoleApplication1
{
class Program
{
[STAThread]
static void Main(string[] args)
{
Application.Run(new BrowserWindow());
Console.ReadKey();
}
}
class BrowserWindow : Form
{
public BrowserWindow()
{
ShowInTaskbar = false;
WindowState = FormWindowState.Minimized;
Load += new EventHandler(Window_Load);
}
void Window_Load(object sender, EventArgs e)
{
WebBrowser wb = new WebBrowser();
wb.AllowNavigation = true;
wb.DocumentCompleted += wb_DocumentCompleted;
wb.Navigate("http://www.bing.com");
}
void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
Console.WriteLine("We have Bing");
}
}
}
You probably need to host the control in a parent window. You can do this without breaking requirements by simply not showing the window that hosts the browser control by moving it off screen. It might also be useful for development to "see" that it does actually load something for testing, verification etc.
So try:
// in a form's Load handler:
WebBrowser wb = new WebBrowser();
this.Controls.Add(wb);
wb.AllowNavigation = true;
wb.Navigate("http://www.google.com/");
Also check to see what other properties are set on the WebBrowser object when you instantiate it via the IDE. E.g. create a Form, drop a browser control onto it and then check the form's designer file to see what code is generated. You might be missing some key property that needs to be set. I've discovered many-an-omission in my code in this way and also learned how to properly instantiate visual objects programmatically.
P.S. If you do use a host window, it should only be visible during development. You would hide in some manner for production.
Another approach:
You could go "raw" by trying something like this:
System.Net.WebClient wc = new System.Net.WebClient();
System.IO.StreamReader webReader = new System.IO.StreamReader(
wc.OpenRead("http://your_website.com"));
string webPageData = webReader.ReadToEnd();
...then RegEx or parse webPageData for what you need. Or do you need the jscript in the page to actually execute? (Which should be possible with .NET 4.0)
I had this problem, and I did not realize that I had uninstalled Internet Explorer. If you have, nothing will ever happen, since the WebBrowser control only instantiates IE.
The Webbrowser control is just a wrapper around Internet Explorer.
You can place it on an invisible Windows Forms window to instantiate it completely.

Open link in new TAB (WebBrowser Control)

Does anybody know how to click on a link in the WebBrowser control in a WinForms application and then have that link open in a new tab inside my TabControl?
I've been searching for months, seen many tutorials/articles/code samples but it seems as though nobody has ever tried this in C# before.
Any advice/samples are greatly appreciated.
Thank you.
Based on your comments, I understand that you want to trap the "Open In New Window" action for the WebBrowser control, and override the default behavior to open in a new tab inside your application instead.
To accomplish this reliably, you need to get at the NewWindow2 event, which exposes ppDisp (a settable pointer to the WebBrowser control that should open the new window).
All of the other potential hacked together solutions (such as obtaining the last link selected by the user before the OpenWindow event) are not optimal and are bound to fail in corner cases.
Luckily, there is a (relatively) simple way of accomplishing this while still using the System.Windows.Forms.WebBrowser control as a base. All you need to do is extend the WebBrowser and intercept the NewWindow2 event while providing public access to the ActiveX Instance (for passing into ppDisp in new tabs). This has been done before, and Mauricio Rojas has an excellent example with a complete working class "ExtendedWebBrowser":
http://blogs.artinsoft.net/mrojas/archive/2008/09/18/newwindow2-events-in-the-c-webbrowsercontrol.aspx
Once you have the ExtendedWebBrowser class, all you need to do is setup handlers for NewWindow2 and point ppDisp to a browser in a new tab. Here's an example that I put together:
private void InitializeBrowserEvents(ExtendedWebBrowser SourceBrowser)
{
SourceBrowser.NewWindow2 += new EventHandler<NewWindow2EventArgs>(SourceBrowser_NewWindow2);
}
void SourceBrowser_NewWindow2(object sender, NewWindow2EventArgs e)
{
TabPage NewTabPage = new TabPage()
{
Text = "Loading..."
};
ExtendedWebBrowser NewTabBrowser = new ExtendedWebBrowser()
{
Parent = NewTabPage,
Dock = DockStyle.Fill,
Tag = NewTabPage
};
e.PPDisp = NewTabBrowser.Application;
InitializeBrowserEvents(NewTabBrowser);
Tabs.TabPages.Add(NewTabPage);
Tabs.SelectedTab = NewTabPage;
}
private void Form1_Load(object sender, EventArgs e)
{
InitializeBrowserEvents(InitialTabBrowser);
}
(Assumes a TabControl named "Tabs" and an initial tab containing a docked ExtendedWebBrowser child control named "InitialTabBrowser")
Don't forget to unregister the events when the tabs are closed!
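As a rough illustration of that cleanup step, the sketch below closes a tab and unhooks a handler first. To stay self-contained it uses the stock System.Windows.Forms.WebBrowser and its NewWindow event; with ExtendedWebBrowser the NewWindow2 handler would presumably be removed with -= in the same way. The TabCleanup class and CloseTab method are hypothetical names.

```csharp
using System;
using System.ComponentModel;
using System.Windows.Forms;

public static class TabCleanup
{
    // Hypothetical helper: detaches the handler, removes the tab, and disposes it.
    public static void CloseTab(TabControl tabs, TabPage page, CancelEventHandler newWindowHandler)
    {
        foreach (Control child in page.Controls)
        {
            var browser = child as WebBrowser;
            if (browser != null)
            {
                // Removing a handler that was never attached is harmless.
                browser.NewWindow -= newWindowHandler;
            }
        }

        tabs.TabPages.Remove(page);
        page.Dispose(); // also disposes the browser hosted on the page
    }
}
```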
private Uri _MyUrl;
System.Windows.Forms.WebBrowser browser = new System.Windows.Forms.WebBrowser();
browser.Navigating += new System.Windows.Forms.WebBrowserNavigatingEventHandler(browser_Navigating);
void browser_Navigating(object sender, System.Windows.Forms.WebBrowserNavigatingEventArgs e)
{
_MyUrl = e.Url;
e.Cancel = true; // cancel the navigation here and handle _MyUrl yourself
}
The following code works, just follow the first reply for creating the ExtendedWebBrowser class.
I'm using this to open a new tab but it also works to open a new window using your browser and not IE.
Hope it helps.
private void Window_Loaded(object sender, RoutedEventArgs e)
{
if (current_tab_count == 10) return;
TabPage tabPage = new TabPage("Loading...");
tabpages.Add(tabPage);
tabControl1.TabPages.Add(tabPage);
current_tab_count++;
ExtendedWebBrowser browser = new ExtendedWebBrowser();
InitializeBrowserEvents(browser);
webpages.Add(browser);
browser.Parent = tabPage;
browser.Dock = DockStyle.Fill;
browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(browser_DocumentCompleted);
browser.DocumentTitleChanged += new EventHandler(Browser_DocumentTitleChanged);
browser.Navigated += Browser_Navigated;
browser.IsWebBrowserContextMenuEnabled = true;
}
public void InitializeBrowserEvents(ExtendedWebBrowser browser)
{
browser.NewWindow2 += new EventHandler<ExtendedWebBrowser.NewWindow2EventArgs>(Browser_NewWindow2);
}
void Browser_NewWindow2(object sender, ExtendedWebBrowser.NewWindow2EventArgs e)
{
if (current_tab_count == 10) return;
TabPage tabPage = new TabPage("Loading...");
tabpages.Add(tabPage);
tabControl1.TabPages.Add(tabPage);
current_tab_count++;
ExtendedWebBrowser browser = new ExtendedWebBrowser();
webpages.Add(browser);
browser.Parent = tabPage;
browser.Dock = DockStyle.Fill;
browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(browser_DocumentCompleted);
browser.DocumentTitleChanged += new EventHandler(Browser_DocumentTitleChanged);
browser.Navigated += Browser_Navigated;
tabControl1.SelectedTab = tabPage;
browser.Navigate(textBox.Text);
// hand the new browser to the caller so the link opens inside it
e.PPDisp = browser.Application;
InitializeBrowserEvents(browser);
}
I did a bit of research on this topic and one does not need to do all the COM plumbing that is present in the ExtendedWebBrowser class, as that code is already present in the generated Interop.SHDocVw. As such, I was able to use the more natural construct below to subscribe to the NewWindow2 event. In Visual Studio I had to add a reference to "Microsoft Internet Controls".
using SHDocVw;
...
internal WebBrowserSsoHost(System.Windows.Forms.WebBrowser webBrowser,...)
{
ParameterHelper.ThrowOnNull(webBrowser, "webBrowser");
...
(webBrowser.ActiveXInstance as WebBrowser).NewWindow2 += OnNewWindow2;
}
private void OnNewWindow2(ref object ppDisp, ref bool Cancel)
{
MyTabPage tabPage = TabPageFactory.CreateNewTabPage();
tabPage.SetBrowserAsContent(out ppDisp);
}
Please read http://bit.ly/IDWm5A for more info. This is page #5 in the series, for a complete understanding I had to go back and read pages 3 -> 5.
You simply cancel the new window event and handle the navigation and tab stuff yourself.
Here is a fully working example. This assumes you have a tabcontrol and at least 1 tab page in place.
using System.ComponentModel;
using System.Windows.Forms;
namespace stackoverflow2
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
this.webBrowser1.NewWindow += WebBrowser1_NewWindow;
this.webBrowser1.Navigated += Wb_Navigated;
this.webBrowser1.DocumentText=
"<html>"+
"<head><title>Title</title></head>"+
"<body>"+
"<a href = 'http://www.google.com' target = 'abc' > test </a>"+
"</body>"+
"</html>";
}
private void WebBrowser1_NewWindow(object sender, CancelEventArgs e)
{
e.Cancel = true; //stop normal new window activity
//get the url you were trying to navigate to
var url= webBrowser1.Document.ActiveElement.GetAttribute("href");
//set up the tabs
TabPage tp = new TabPage();
var wb = new WebBrowser();
wb.Navigated += Wb_Navigated;
wb.Size = this.webBrowser1.Size;
tp.Controls.Add(wb);
wb.Navigate(url);
this.tabControl1.Controls.Add(tp);
tabControl1.SelectedTab = tp;
}
private void Wb_Navigated(object sender, WebBrowserNavigatedEventArgs e)
{
tabControl1.SelectedTab.Text = (sender as WebBrowser).DocumentTitle;
}
}
}
There is no tabbing in the web browser control, therefore you need to handle the tabs yourself. Add a tab control above the web browser control and create new web browser controls when new tabs are being opened. Catch and cancel the event when the user opens a new window, and open a new tab instead.
