web browser delete element's outer html - c#

I have a C# application which has a web browser, navigating to a specified page by default.
Once the document has completely loaded, I want to select an HTML element by tag name (not by ID or class) and then delete the HTML outside of it. I have tried for some time without success.
This is my event handler and how far I have got:
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    var elementToDelete = webBrowser1.Document.GetElementsByTagName("form");
}
I want to select the form element, which has no class or ID, and delete all HTML outside of it (the outer HTML) so that it is the only thing visible on the page.

You say that you want to delete an element, but then after your code you say that you want to delete everything outside of "form". I'm not sure which you actually want, but you can do the second with the following.
First, note that elementToDelete is actually a collection, not a single element, so we need to get a single element.
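// HtmlElementCollection is non-generic, so Cast<HtmlElement>() (using System.Linq) is needed before FirstOrDefault()
var formElements = webBrowser1.Document.GetElementsByTagName("form");
var elementToSave = formElements.Cast<HtmlElement>().FirstOrDefault();
if (elementToSave == null)
    throw new InvalidOperationException("No element named 'form'");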
You can then set the WebBrowser's DocumentText property to the InnerHtml property of "form" (use OuterHtml instead if the form tag itself must be kept). You should probably wrap up the HTML so that it's a valid page, but this should work:
webBrowser1.DocumentText = elementToSave.InnerHtml;
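The "wrap it up so that it's a valid page" step could look roughly like the sketch below. HtmlPageWrapper and WrapInPage are hypothetical names, not part of the WebBrowser API, and the optional base URL is an assumption added so that relative links inside the fragment can still resolve.

```csharp
using System;
using System.Net;

public static class HtmlPageWrapper
{
    // Hypothetical helper: wraps an HTML fragment in a minimal document skeleton.
    // baseUrl is optional; when supplied, a <base> tag is emitted so that
    // relative URLs in the fragment resolve against the original page.
    public static string WrapInPage(string fragment, string baseUrl = null)
    {
        if (fragment == null)
            throw new ArgumentNullException(nameof(fragment));

        string baseTag = string.IsNullOrEmpty(baseUrl)
            ? string.Empty
            : "<base href=\"" + WebUtility.HtmlEncode(baseUrl) + "\">";

        return "<!DOCTYPE html><html><head>" + baseTag + "</head><body>"
             + fragment + "</body></html>";
    }
}
```

It might then be used as webBrowser1.DocumentText = HtmlPageWrapper.WrapInPage(elementToSave.OuterHtml, webBrowser1.Url.ToString()); which keeps the form tag itself rather than only its contents.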

Related

Selenium can't find iframe after postback inside the iframe

I am trying to create a test where I have to fill out some information inside an iframe. Getting access to the iframe work fine and I can fill out information inside the frame. The issue is that when I fill out a textbox 'A' it has a postback attached to it which reloads the content inside the iframe, to fill out another textbox 'B' depending on the information inside textbox A.
Here are my observations:
When I first locate the iframe it looks like this:
<iframe frameborder="0" src="<removed for clearity>">...</iframe>
After the postback has occurred it looks like this:
<iframe frameborder="0" src="<removed for clearity>" cd_frame_id_="668325d5a0a2a8cb76a92b9eb819d327">...</iframe>
So something changed.
In my C# code I first find the frame like so (and yes, that is the best way, sadly):
var iframe = driver.FindElement(By.XPath("//div[@rawtitle=\"TIME\"]//table//tbody//tr//td//div//div//iframe"));
driver.SwitchTo().Frame(iframe);
I can easily enter text in textbox A:
var completed = driver.FindElement(By.Id("MainContent_txtCompletedHours"));
completed.SendKeys("0,25");
Then I wait for textbox B to be filled but at this point, I can't locate it and I can't locate the iframe either. I tried to relocate the frame again to switch to it again, but I can't find the element. It hasn't moved position. It just got that cd_frame_id attribute. Here is the code where I try to re-locate the iframe:
while (true)
{
    try
    {
        iframe = driver.FindElement(By.XPath("//div[@rawtitle=\"TIME\"]//table//tbody//tr//td//div//div//iframe"));
        driver.SwitchTo().Frame(iframe);
        invoiced = driver.FindElement(By.Id("MainContent_txtInvoiceHours"));
        if (invoiced.Text == "0,25") // and wait for it
            break;
    }
    catch (NoSuchElementException e)
    {
        Debug.WriteLine("Could not find element, retrying...");
    }
    finally
    {
        Thread.Sleep(500);
    }
}
The code fails when I try to get hold of the iframe element.
How can I get hold of the iframe again, after the postback inside the frame?
As you mentioned, filling out textbox 'A' triggers a postback, so we can use a unique XPath that identifies the <iframe>, as follows:
// Ensure that you are back at the top-level document
driver.SwitchTo().DefaultContent();
// Switch to the intended frame
driver.SwitchTo().Frame(driver.FindElement(By.XPath("//iframe[contains(@src,'<removed for clearity>')]")));

How to do multiple actions in WebBrowser_DocumentCompleted in c#

So, I'm creating a bot using WebBrowser in C# that loads a website entered in the text box. When the website is loaded, I need the bot to click on a specific anchor text. After that, when a new page is loaded, I need to click on another anchor text and so on, until a form to fill out details appears. I also need to show the captcha to the user so that he/she can fill it in and submit it, and the page can continue to the next page.
What I need is to invoke different methods each time the browser has navigated to the next page and loading is complete. I have successfully created a WebBrowser_DocumentCompleted handler, but it gets invoked over and over again, because the same hyperlink is present on the page that I want to visit. On that page, however, I need to click on a button.
I did this to get the link and visit it:
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Select the html element by inner text of anchor and click on it
    HtmlElementCollection elc = this.webBrowser1.Document.GetElementsByTagName("a");
    foreach (HtmlElement el in elc)
    {
        if (el.InnerText == null || el.InnerText.Equals("Matching text"))
        {
            el.InvokeMember("click");
        }
    }
}
After this, the link with the matching inner text gets clicked and the page loads. The new page has the same anchor text, so it gets loaded again and again. But I need to click on another button and go to the next page.
So, if you know a way I can do this, that would be awesome. Any help is welcome!
P.S. I'm a beginner in C# and .NET.
The behaviour you see is normal. I suppose the page you are loading has some iframes or embedded content, and DocumentCompleted is fired for each one that loads. (It's not related to having a link to the page; a link does nothing until it's clicked.)
You must take actions based on the Url property of the WebBrowserDocumentCompletedEventArgs passed to this function; that way you can execute the required action for each specific page, something like this:
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    switch (e.Url.ToString())
    {
        case "http://myfakeserver.com/mypageone.htm":
            // Do whatever you want to do
            break;
        case "http://myfakeserver.com/mypagetwo.htm":
            // Do more stuff
            break;
    }
}
Hope it helps.
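If the pages carry query strings, an exact string match on the URL can miss them. One possible variation, sketched here under assumptions, is a small lookup table keyed on the URL without its query or fragment. PageActionRouter, Register and Dispatch are hypothetical names, and the case-insensitive comparison is an assumption that may not suit servers with case-sensitive paths.

```csharp
using System;
using System.Collections.Generic;

public sealed class PageActionRouter
{
    // Assumption: URLs are compared case-insensitively.
    private readonly Dictionary<string, Action> actions =
        new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase);

    // Associates an action with an absolute page URL.
    public void Register(string url, Action action)
    {
        actions[Normalize(new Uri(url))] = action;
    }

    // Runs the action registered for the URL, if any.
    // Returns true when an action was found and executed.
    public bool Dispatch(Uri completedUrl)
    {
        Action action;
        if (completedUrl == null ||
            !actions.TryGetValue(Normalize(completedUrl), out action))
            return false;

        action();
        return true;
    }

    // Keeps scheme, host, port and path; drops query string and fragment.
    private static string Normalize(Uri url)
    {
        return url.GetLeftPart(UriPartial.Path);
    }
}
```

Inside the DocumentCompleted handler this would be called as router.Dispatch(e.Url), with the per-page actions registered once at startup.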
EDIT:
OK, now I get what you need.
It's easy: just check whether you are already on that page.
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Select the html element by inner text of anchor and click on it
    HtmlElementCollection elc = this.webBrowser1.Document.GetElementsByTagName("a");
    foreach (HtmlElement el in elc)
    {
        var hRef = el.GetAttribute("href");
        if (string.IsNullOrWhiteSpace(hRef))
            continue;
        var lnkUri = new Uri(hRef);
        // If the link points to this page, ignore it
        if (lnkUri.Segments[lnkUri.Segments.Length - 1] == e.Url.Segments[e.Url.Segments.Length - 1])
            continue;
        // Only click anchors whose text matches (anchors without text are skipped)
        if (el.InnerText != null && el.InnerText.Equals("Matching text"))
        {
            el.InvokeMember("click");
        }
    }
}
Beware that in this example I'm only checking the last segment of the URL, so it will fail if you have different paths that share the same page name. You must adapt it to your needs; depending on how the URIs are written in the href, you can do a full comparison of the URLs.
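A fuller comparison could be sketched as follows. LinkComparer and PointsToSamePage are hypothetical names; the choice to ignore the query string and fragment, and to compare case-insensitively, is an assumption that may need adjusting for a particular site.

```csharp
using System;

public static class LinkComparer
{
    // Returns true when href resolves to the same scheme, host, port and path
    // as the current page. Query string and fragment are ignored (assumption).
    public static bool PointsToSamePage(string href, Uri currentPage)
    {
        if (string.IsNullOrWhiteSpace(href) || currentPage == null)
            return false;

        // Resolves relative hrefs against the current page;
        // absolute hrefs are taken as they are.
        Uri link;
        if (!Uri.TryCreate(currentPage, href, out link))
            return false;

        return Uri.Compare(
            link,
            currentPage,
            UriComponents.SchemeAndServer | UriComponents.Path,
            UriFormat.Unescaped,
            StringComparison.OrdinalIgnoreCase) == 0;
    }
}
```

In the loop above, the segment check could then be replaced with if (LinkComparer.PointsToSamePage(hRef, e.Url)) continue;.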

Display portion of a Website into a Web Browser

I want to navigate to a specific website and then display only a portion of it in the web browser, the portion that starts with:
<div id="dex1" ...... </div>
I know I need to get the element by ID, but first I tried writing this:
string data = webBrowser.Document.Body.OuterHtml;
So from data I need to grab the content of the element with that id and display it, with the rest deleted.
Any idea on this?
webBrowser1.DocumentCompleted += (sender, e) =>
{
    webBrowser1.DocumentText = webBrowser1.Document.GetElementById("dex1").OuterHtml;
};
On second thoughts, don't do that, setting the DocumentText property causes the DocumentCompleted event to fire again. So maybe do:
webBrowser1.DocumentCompleted += webBrowser1_DocumentCompleted;
void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    webBrowser1.DocumentCompleted -= webBrowser1_DocumentCompleted;
    webBrowser1.DocumentText = webBrowser1.Document.GetElementById("dex1").OuterHtml;
}
Although in most real world cases I'd expect you'd get better results injecting some javascript to do the DOM manipulation, a la Andrei's answer.
Edit: to just replace everything inside the body tag, which, if you're lucky, might preserve all the required styling and scripts (provided they are all in the head and don't reference any discarded content), you may have some joy with:
webBrowser1.Document.Body.InnerHtml = webBrowser1.Document.GetElementById("dex1").OuterHtml;
You probably need a lot of external resources, such as scripts and images, so instead you can add some custom JavaScript to modify the DOM however you like after the document has loaded from your website. Following "How to update DOM content inside WebBrowser Control in C#?", it would look something like this:
// IHTMLScriptElement comes from the mshtml interop assembly (add a reference to Microsoft.mshtml)
HtmlElement headElement = webBrowser1.Document.GetElementsByTagName("head")[0];
HtmlElement scriptElement = webBrowser1.Document.CreateElement("script");
IHTMLScriptElement domScriptElement = (IHTMLScriptElement)scriptElement.DomElement;
// 'body > *' selects every direct child of the body
domScriptElement.text = "function applyChanges(){ $('body > *').hide(); $('#dex1').show().prependTo('body');}";
headElement.AppendChild(scriptElement);
// Call the next line whenever you want to execute your code
webBrowser1.Document.InvokeScript("applyChanges");
This also assumes that jQuery is available on the page, so that you can do simple DOM manipulation.
The JavaScript code just hides all children of the body and then prepends the '#dex1' div to the body so that it's at the top and visible.

Adding asp controls to programmatically added content

What I would like:
In an ideal scenario I would be able to create an anchor with oncommand and commandargument attributes, but if I'm not mistaken that doesn't work and you have to create a control, such as a button, where it will work. My problem then comes from wanting to place that button for each item on the page, as I need something to add the control to, but if I created an anchor with runat="server" and, say, id="try", I can't then do: try.Controls.Add(button) because the anchor 'try' hasn't actually been created yet.
Background:
The majority of content is being added programmatically. A stringbuilder is used to create a string of what will be html displayed on the page. Is it possible to add a control to the page in the middle of this string? OR into an element which is programmatically added this way?
I have tried:
Creating anchors (or otherwise) and targeting the id of those elements and then creating a button as follows, but, because the elements are added programmatically and the number required will vary, the ids will then be try0, try1, etc:
var button = new Button {
    CommandArgument = "test",
    Text = "Try"
};
button.Command += bt_sendMail_tryDevice_Click;
try.Controls.Add(button);
So I tried variations of the following, where in my aspx page I have a 'dummy' element with the id="try" so it doesn't complain, but I understand why it doesn't like it, at the same time though I don't know how to get around it. (tryCount being an int which increase with each iteration to keep the id unique).
this.FindControl(try.ToString() + tryCount.ToString()).Controls.Add(button);
It's kind of hard to tell what you're going for, but I will do my best to get as close as I can. The first thing is that you need some control to work as a "container". This can be just about any control you like. In my test for this scenario I did something like this:
<div runat="server" id="ContainerDiv"></div>
The next thing is you need a way to manage your Id. I did this by creating a simple variable and method like so:
private int IdCount = 0;
private string GetNewID()
{
    return string.Format("try{0}", IdCount++);
}
Now, you say you want an anchor tag that also has a CommandName and CommandArgument. A LinkButton will do this. You can add a LinkButton to your div above like this:
ContainerDiv.Controls.Add(
    new LinkButton()
    {
        ID = GetNewID(),
        CommandName = "DoSomething",
        CommandArgument = "arg",
        Text = "Try Me",
    });
Obviously replacing the CommandName and the rest with the values you really want. Just be sure to call GetNewID() when assigning the ID so they will always be unique.
Controls can be added to a page anywhere as a child of an existing server control, including the page itself, but doing so can be tricky. Be sure to add them to your page as soon as you can (in the page lifecycle) as post back events may not work correctly.
Update
Keeping references to already created elements on your page may simplify your implementation:
public partial class _Default : Page
{
    Control containerDiv;

    protected void Page_Load(object sender, EventArgs e)
    {
        this.containerDiv = SomeMethodThatCreatesADiv();
        this.Page.Controls.Add(containerDiv);
    }

    void SomeOtherMethod()
    {
        this.containerDiv.Controls.Add(
            new LinkButton()
            {
                ID = GetNewID(),
                CommandName = "DoSomething",
                CommandArgument = "arg",
                Text = "Try Me",
            });
    }
}

Change contents without redirecting to about:blank

I'm currently working on an application that modifies a specific web page to hide irrelevant information and then displays it in a WebBrowser control in the application window. Unfortunately, as soon as I set the DocumentText property of the WebBrowser, it navigates to about:blank and then displays the HTML content. Because it redirects to about:blank, all relative references in the web page become invalid, creating a very odd-looking web page with no stylesheet whatsoever.
Is there a way I can modify what the WebBrowser control displays without having it redirect to about:blank and thereby ruining all relative references?
This should work for injecting an HTML element into your page without resetting the rest of the DOM:
private void button1_Click(object sender, EventArgs e)
{
    HtmlElement myElem = webBrowser1.Document.CreateElement("input");
    dynamic element = myElem.DomElement;
    element.SetAttribute("value", "Hello, World!");
    (webBrowser1.Document.GetElementsByTagName("body")[0]).AppendChild(myElem);
}
