Webbrowser, scripts and possible alternatives - c#

I have a situation where a rather clever website updates the latest information on the site via Shockwave Flash through a TCP connection. The data received is then updated onto the page via JavaScript so in order to get the latest data a browser is required. If attempts are made to hit the website with continual requests then a) you get banned and b) you're not actually getting the latest data, only the last updated base framework.
So I need to run a browser with scripts enabled.
My first question: using the standard WPF WebBrowser in .NET, I get script error warnings that I don't get in standard IE, Chrome or Firefox. What is causing this, and how do I suppress them while still allowing the site's scripts to run?
My second question: is there a better way to do this, or are there better alternatives to the WebBrowser control that
allow scripts to run
give access to the DOM, or to the HTML and scripts returned, at least in text format
are compatible with WPF
can be hidden, as I don't actually want the browser displayed.
So far I've looked into WebKit.NET, which doesn't seem to allow access to the DOM and didn't like WPF windows when I tested it, and Awesomium, which again didn't appear to allow direct access to the DOM without JavaScript.
Are there any other options (apart from hacking their scripts)?
Thank you

set WebBrowser.ScriptErrorsSuppressed = true;

Ultimately I ended up keeping the WPF control and used this code to inject a script that suppresses JavaScript errors. A reference to the Microsoft HTML Object Library needs to be added.
// Handler that swallows every script error raised by the page.
private const string DisableScriptError = @"function noError() { return true;} window.onerror = noError;";

private void webBrowser1_Navigated(object sender, System.Windows.Navigation.NavigationEventArgs e)
{
    InjectDisableScript();
}

private void InjectDisableScript()
{
    HTMLDocumentClass doc = webBrowser1.Document as HTMLDocumentClass;
    HTMLDocument doc2 = webBrowser1.Document as HTMLDocument;

    // Build a <script> element that installs the error handler.
    IHTMLScriptElement scriptErrorSuppressed = (IHTMLScriptElement)doc2.createElement("SCRIPT");
    scriptErrorSuppressed.type = "text/javascript";
    scriptErrorSuppressed.text = DisableScriptError;

    // Append it to the page's <head>.
    IHTMLElementCollection nodes = doc.getElementsByTagName("head");
    foreach (IHTMLElement elem in nodes)
    {
        HTMLHeadElementClass head = (HTMLHeadElementClass)elem;
        head.appendChild((IHTMLDOMNode)scriptErrorSuppressed);
    }
}

The WPF WebBrowser does not have this property; only the WinForms control does.
You'd be better off using a WindowsFormsHost in your WPF application and hosting the WinForms WebBrowser in it, so that you can use ScriptErrorsSuppressed. Make sure you run in full trust.

Related

WebBrowser control - see files loaded when navigating to a website

I am trying to extract some information from a website. But when I navigate to it, it uses JavaScript to connect me to a server before dynamically loading a PHP page. I can follow the sequence in Chrome with the developer tools. I figured it would be easiest to reproduce it in C# with the WebBrowser control and simply navigate to the website. Then the WebBrowser control must contain all the JavaScript files, the text from the dynamically loaded PHP page and so on. But is this true, and where in the control are they stored? I can't seem to find them.
Recreating the whole request sequence you see in Chrome would be a lot of work. However, "extract some information from a website" is something that can be done quite easily.
Disclaimer: I assumed this question was about WPF's WebBrowser control (it would be almost the same for WinForms).
You can get the HTMLDocument once the page is loaded, using:
using System.Windows;
using System.Windows.Navigation;
using mshtml; // <- don't forget to add the reference

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();
        // Subscribe before navigating so the event cannot be missed.
        browser.LoadCompleted += browser_LoadCompleted;
        browser.Navigate("http://google.com/");
    }

    void browser_LoadCompleted(object sender, NavigationEventArgs e)
    {
        HTMLDocument doc = (HTMLDocument)browser.Document;
        string html = doc.documentElement.innerHTML;
        // from here, you should be able to parse the HTML
        // or sniff the HTMLDocument (using HTML Agility Pack for instance)
    }
}
From this HTMLDocument, you have access to a lot of properties, including HTML elements, CSS styles and scripts. I invite you to put a break-point and check out what best fits your needs.
Nevertheless, since the page you want to load uses JavaScript to fill in its content, the HTMLDocument will probably not be complete at the time LoadCompleted is raised.
In that case, I suggest using a timer to poll until the content is stable.
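The polling suggestion above could be sketched roughly as follows. `ContentStabilityDetector` is a hypothetical helper, not part of WPF or mshtml. The idea is to feed it one snapshot of the page's HTML per timer tick (for instance from a `DispatcherTimer`) and to stop polling once the snapshot has stopped changing. The number of repeats to wait for is an assumption you would tune per site.

```csharp
using System;

// Hypothetical helper: reports when successive snapshots of the page
// have stopped changing. It holds no reference to the browser itself.
public sealed class ContentStabilityDetector
{
    private readonly int _requiredRepeats;
    private string _last;
    private int _repeats;

    public ContentStabilityDetector(int requiredRepeats)
    {
        if (requiredRepeats < 1)
            throw new ArgumentOutOfRangeException("requiredRepeats");
        _requiredRepeats = requiredRepeats;
    }

    // Feed one snapshot per timer tick. Returns true once the same
    // non-empty snapshot has been repeated requiredRepeats times in a
    // row after it was first seen; any change resets the count.
    public bool Observe(string snapshot)
    {
        if (!string.IsNullOrEmpty(snapshot) && snapshot == _last)
        {
            _repeats++;
        }
        else
        {
            _repeats = 0;
            _last = snapshot;
        }
        return _repeats >= _requiredRepeats;
    }
}
```

In the timer's `Tick` handler you would pass `doc.documentElement.innerHTML` to `Observe` and stop the timer when it returns true. A page that keeps updating (a clock or a ticker, say) never becomes stable, so a cap on the number of ticks is worth adding.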
You could also use HTMLDocument to inject your own JavaScript code and call C# methods through WebBrowser.ObjectForScripting, but this is going to be much more complicated and harder to maintain.
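For comparison, here is a minimal sketch of the `ObjectForScripting` route. `ScriptBridge` and `Report` are hypothetical names. The parts that come from the framework are the `ComVisible` attribute, which the object has to carry, and `window.external`, through which page script reaches it.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical bridge object. It must be public and COM-visible so
// that page script can reach it through window.external.
[ComVisible(true)]
public class ScriptBridge
{
    public string LastMessage { get; private set; }

    public event Action<string> MessageReceived;

    // Called from JavaScript as: window.external.Report("some text");
    public void Report(string message)
    {
        LastMessage = message;
        Action<string> handler = MessageReceived;
        if (handler != null)
        {
            handler(message);
        }
    }
}

// Assumed wiring inside the window that hosts the control:
//   browser.ObjectForScripting = new ScriptBridge();
// and, in script injected into the page:
//   window.external.Report(document.body.innerHTML);
```

The page then pushes its content to C# when it changes, instead of C# polling for it, at the cost of having to inject and maintain script inside someone else's page.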

WebBrowser for Button click and set text of TextBox

I'm programming in WPF/C# with VS2012/2010. I was trying to make an application where you can click a button to log in to an account. The very first web browser I used was the C# System.Windows.Forms.WebBrowser.
It was fine; all the methods were nice and simple to use:
Browser.Document.GetElementById("Email").SetAttribute("value", "xxx");
Browser.Document.GetElementById("signin").InvokeMember("Click");
or
HtmlElementCollection textArea = Browser.Document.GetElementsByTagName("textarea");
foreach (HtmlElement element in textArea)
{
    if (element != null)
    {
        element.Focus();
        element.InnerText = "Very nice :]";
    }
}
This web browser is very simple to use, but it is not good enough: it crashes and doesn't support ActiveX, HTML5, Silverlight, and much more. So the next one I tried was Awesomium.
This is a good web browser: it doesn't crash and can easily handle all the things I listed above. But it's not so simple to use. It doesn't have methods to click buttons or anything like that, and I can't figure out how to do this.
Do you know of a web browser engine for WPF/C# that lets me click buttons and so on, and that supports ActiveX, HTML5 and other technologies?
If you are developing in WPF, you should use System.Windows.Controls.WebBrowser instead of Forms.WebBrowser. It uses your installed instance of Internet Explorer, so its features depend on your IE version. If you upgrade to IE9 you'll be able to display and handle HTML5 and CSS3 content. But if you like Awesomium, then you should try this: http://wpfchromium.codeplex.com/ (there are also some examples).

C# - Using Awesomium to Interact with Gmail

I'm able to navigate to Gmail, but then I want to do something as simple as entering the credentials and clicking the login button.
private void btnSubmit_Click(object sender, EventArgs e)
{
    btnSubmit.Enabled = false;
    webGmail.LoadURL("http://www.gmail.com");
    webGmail.LoadCompleted += ExecuteSomething;
}

private void ExecuteSomething(object sender, EventArgs eventArgs)
{
    webGmail.ExecuteJavascript(@"<script src = 'http://code.jquery.com/jquery-latest.min.js' type = 'text/javascript'></script>");
    webGmail.ExecuteJavascript(@"$('#Email').val('foo');");
    webGmail.ExecuteJavascript(@"$('#Passwd').val('bar');");
    webGmail.ExecuteJavascript(@"$('#signIn').click();");
}
Nothing happens. I know from using the Chrome developer tools that you can't modify anything on the page. But is there a way of filling in forms?
Are there any better headless browsers? I actually need one that provides a web control I can put on my form so that I can see what is going on. This is mandatory.
The problem is that the script tag is not JavaScript; it's HTML, so executing it as JavaScript will just throw an error. To load a script with the ExecuteJavascript method, you'd need to create a script element in JavaScript and inject it into the page head.
See here for an example:
http://www.kobashicomputing.com/injecting-jquery-into-awesomium
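As a rough sketch of the approach described above (not the code from the linked article): the helper below only builds the JavaScript text that creates a script element and appends it to the page head. `ScriptLoaderSnippet` and `Build` are hypothetical names, and `ExecuteJavascript` is the method from the question's own code.

```csharp
using System;

// Hypothetical helper: produces JavaScript (not HTML) that loads an
// external script by adding a <script> element to the page head.
public static class ScriptLoaderSnippet
{
    public static string Build(string scriptUrl)
    {
        if (scriptUrl == null)
            throw new ArgumentNullException("scriptUrl");

        // Escape backslashes and single quotes so the URL is safe
        // inside a single-quoted JavaScript string literal.
        string escaped = scriptUrl.Replace("\\", "\\\\").Replace("'", "\\'");

        return "(function(){" +
               "var s=document.createElement('script');" +
               "s.type='text/javascript';" +
               "s.src='" + escaped + "';" +
               "document.getElementsByTagName('head')[0].appendChild(s);" +
               "})();";
    }
}

// Assumed usage with the control from the question:
//   webGmail.ExecuteJavascript(
//       ScriptLoaderSnippet.Build("http://code.jquery.com/jquery-latest.min.js"));
```

The injected script loads asynchronously, so calls that rely on `$` may need to wait until it has finished loading rather than running on the very next line.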
I recently came across a similar problem. I tried CefSharp, Awesomium, open-webkit-sharp and GeckoFX. The most advanced was, oddly enough, WebBrowser. It allows you to perform almost all actions directly from C#. For example, WebBrowser was the only one in which I could click a submit button from C#. If you still want to use an alternative engine, I recommend open-webkit-sharp; it is the most advanced of them (although it has the same problem with clicking buttons).
WatiN has a JavaScript implementation for WebKit, which Awesomium is based on. The source code is free and can be downloaded from their homepage. Good luck.
Maybe this question could help you too: calling JavaScript from C# using Awesomium.

How to click a link element programmatically with HTMLElement?

I'm writing an automation program. I load a web page into a WebBrowser control on my Windows Form. Then I need to click a link in the WebBrowser programmatically. How can I do this? For example:
Google Me
Facebook Me
The above are 2 different conditions. The first element does not have an id attribute while the second one does. Any idea on how to click each programmatically?
You have to find your element first, by its ID or by some other filter:
HtmlElement fbLink = webBrowser.Document.GetElementById("fbLink");
And to simulate a click:
fbLink.InvokeMember("click");
An example for finding your link by inner text:
HtmlElement FindLink(string innerText)
{
    foreach (HtmlElement link in webBrowser.Document.GetElementsByTagName("a"))
    {
        // InnerText can be null (e.g. for an image-only link), so compare null-safely.
        if (string.Equals(link.InnerText, innerText))
        {
            return link;
        }
    }
    return null; // no link with that text was found
}
You need a way to automate the browser then.
One way to do this is to use WatiN (https://sourceforge.net/projects/watin/). It allows you to write a .NET program that controls the browser via a convenient object model. It is mainly used to write automated tests for web pages, but it can also be used to control the browser.
If you don't want to control the browser this way, you could write some JavaScript that you include on your page and that does the clicking, but I doubt that is what you are after.

Get the link's ID for the current cursor position in the WebBrowser control

We are using the WebBrowser control in C# WinForms and need to be able to get information about the URL the cursor is positioned on.
We have a web page in design mode that contains multiple URLs. When the cursor is over one of them, I would like to call a method that returns the ID of the link.
Thanks
You can use IHTMLCaret to get the cursor position; from there, using IMarkupPointer, you can get the element in the current scope.
The WebBrowser control has a Document property, which has a Links collection. Each link is an HtmlElement with events you can tap into. Again, I'm not sure what you mean by "cursor", because in the web world, unless you're in a text box, there really isn't a "cursor" (which is what I meant to ask in my comment), but you can tap into the MouseOver event and others like it.
Example:
foreach (HtmlElement element in this.webBrowser1.Document.Links)
{
    element.MouseOver += (o, ex) =>
    {
        Console.WriteLine(ex.ToElement.GetAttribute("HREF"));
    };
}
This will print out the actual URL that the mouse is over.
You can have a look at this article, "Hosting a web browser component in a C# winform", which explains several ways to do that, or go directly to this one, "Hosting a webpage inside a Windows Form". Basically, what you need to do is handle the Click of the DOM object inside IE's COM WebBrowser. You achieve this by handling the JavaScript events inside your C# code.
As I remember it, this kind of customization must be done using the AxSHDocVw.AxWebBrowser COM object instead of the System.Windows.Forms.WebBrowser class from the newer versions of the .NET Framework.
I could send you more information about this; I did it in a project, just give me time to find it ;). In the meantime, try those links.
Bye!
