C# Selenium WebDriver
So I need to ensure that none of my pages (around 200 of them) contains a particular known string. Is there any way I can scan a page for this string and, if it is present, return both the ID of the containing element and the full string?
For example my source is like:
<a id="cancel_order_lnkCancel">Cancel Order</a>
I want to search for the word 'Cancel' on the whole page (within <div id="sitewrapper">) and get back both:
cancel_order_lnkCancel;Cancel Order
Thanks
You can use XPath to find elements by their text, e.g.:
var element = driver.FindElement(By.XPath(string.Format("//*[contains(text(), '{0}')]", value)));
value being the string you are searching for.
Then to get the element's markup and content:
var html = element.GetAttribute("outerHTML");
var text = element.Text;
or
var text = element.GetAttribute("innerHTML");
I haven't worked with the C# binding, but you can use FindElements to get a list of all elements containing the text. You can no doubt reuse @Jarga's XPath. The advantage of FindElements is that it won't throw an exception when nothing matches (at least that is the behavior in Java), though you still have to guard against GetAttribute returning null when an element has no id. If you iterate over the list you can fetch each element's text with the Text property.
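Putting both answers together, a rough sketch of what the full check could look like (untested; assumes driver is a live IWebDriver, and scopes the search to the sitewrapper div from the question):
using System;
using OpenQA.Selenium;

string value = "Cancel";

// Match any descendant of the sitewrapper div whose text contains the search string.
var matches = driver.FindElements(
    By.XPath(string.Format("//div[@id='sitewrapper']//*[contains(text(), '{0}')]", value)));

foreach (IWebElement match in matches)
{
    // GetAttribute("id") returns null when the element has no id, so guard against that.
    string id = match.GetAttribute("id") ?? "(no id)";
    Console.WriteLine("{0};{1}", id, match.Text);
}
For the example markup above, this would print cancel_order_lnkCancel;Cancel Order.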
Related
I'm trying to get the results of a Google "define word" search. According to Chrome's Inspect Element panel, the text I want is under <div class="lr_dct_ent vmod" data-hveid="28">. I'm using this code to try and do it:
var thecq = CQ.CreateFromUrl("https://www.google.be/search?q=define+word&oq=define+word");
var please = thecq.Select(".lr_dct_ent.vmod").Text();
var work = thecq[".lr_dct_ent.vmod"].Text();
Console.WriteLine(please);
Console.WriteLine(work);
Neither of these returns anything in the console, just empty lines. If I use "div" instead of ".lr_dct_ent.vmod" I get a lot of text, some of which is the text I want, which leads me to believe that ".lr_dct_ent.vmod" is not how I'm supposed to select the div class I wanted. But according to every piece of documentation I found, it IS how I'm supposed to do it. Is Google just a special case, or am I the one who's special here?
I'm using the Html Agility Pack for this task. Basically I've got a URL, and my program should read through the content of the HTML page at that URL; if it finds a line of text (e.g. "John had three apples"), it should change a label's text to "Found it".
I tried to do it with contains, but I guess it only checks for one word.
var nodeBFT = doc.DocumentNode.SelectNodes("//*[contains(text(), 'John had three apples')]");
if (nodeBFT != null && nodeBFT.Count != 0)
myLabel.Text = "Found it";
EDIT: Rest of my code, now with ako's attempt:
if (CheckIfValidUrl(v)) // inside a foreach over var v in a list of URLs; checks if the URL works
{
    HtmlWeb hw = new HtmlWeb();
    HtmlDocument doc = hw.Load(v);
    try
    {
        if (doc.DocumentNode.InnerHtml.Contains("string of words"))
        {
            mylabel.Text = v;
        }
        ...
One possible option is to use . instead of text(). Passing text() to the contains() function the way you did is, as you suspected, only effective when the searched text is in the first direct text-node child of the current element:
doc.DocumentNode.SelectNodes("//*[contains(., 'John had three apples')]");
On the other hand, contains(., '...') evaluates the entire text content of the current element, concatenated. So, just a heads-up, the above XPath will also consider the following element, for example, to be a match:
<span>John had <br/>three <strong>apples</strong></span>
If you need the XPath to consider only cases where the entire keyword is contained in a single text node, and therefore to treat the above case as a no-match, you can try this instead:
doc.DocumentNode.SelectNodes("//*[text()[contains(., 'John had three apples')]]");
If none of the above works for you, please post a minimal HTML snippet that contains the keyword but returns no match, so we can examine further what could be causing that behavior and how to fix it.
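To make the difference concrete, here is a small self-contained sketch using the span example above (the surrounding div is made up for the demo):
using System;
using HtmlAgilityPack;

var doc = new HtmlDocument();
doc.LoadHtml("<div><span>John had <br/>three <strong>apples</strong></span></div>");

// contains(., ...) looks at the span's concatenated text, "John had three apples", so this matches.
var loose = doc.DocumentNode.SelectNodes("//span[contains(., 'John had three apples')]");
Console.WriteLine(loose != null);  // True

// text()[contains(., ...)] needs a single text node to hold the whole phrase, so this does not match.
var strict = doc.DocumentNode.SelectNodes("//span[text()[contains(., 'John had three apples')]]");
Console.WriteLine(strict != null); // False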
Use this:
if (doc.DocumentNode.InnerHtml.Contains("John had three apples"))
    myLabel.Text = "Found it";
There is a page, and I want to get its body so I can read its input areas and change their values with GetAttribute and SetAttribute in C#. Normally this is no problem, but here nothing is returned (I mean an empty string) when I read the body via:
webBrowser1.Document.Body.InnerText
or,
webBrowser1.Document.Body.InnerHtml
That's why I can't access any input field.
I can see the web page in the WebBrowser component, but neither InnerText nor InnerHtml returns anything. It's a saved bank webpage running locally.
So how can I read the body in order to call SetAttribute, GetAttribute, InvokeMember, or something else?
You need to get the input elements first, then you can get or set their values:
HtmlElementCollection elements = currentBrowser.Document.GetElementsByTagName("INPUT");
foreach (HtmlElement element in elements)
{
    // To read an input's value: string value = element.GetAttribute("value");
    // To set it: element.SetAttribute("value", "something");
}
But keep in mind that the code above returns all the input elements. To find a specific one, check its id or name as it appears in the webpage, for example:
if (element.Name.ToLower().Contains("email"))
    //do work
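If you already know the element's id, Document.GetElementById is more direct. A sketch (the id "txtEmail" is made up for illustration):
// Run this only after the page has loaded, e.g. from the DocumentCompleted event,
// otherwise Document (and the body) may still be empty.
HtmlElement email = currentBrowser.Document.GetElementById("txtEmail");
if (email != null)
{
    string current = email.GetAttribute("value");   // read the current value
    email.SetAttribute("value", "something else");  // overwrite it
}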
Is it possible to find links on a webpage by searching their text using a pattern like A-ZNN:NN:NN:NN, where N is a single digit (0-9)?
I've used Regex in PHP to turn text into links, so I was wondering if it's possible to use this sort of filter in Selenium with C# to find links that will all look the same, following a certain format.
I tried:
driver.FindElements(By.LinkText("[A-Z][0-9]{2}):([0-9]{2}):([0-9]{2}):([0-9]{2}")).ToList();
But this didn't work. Any advice?
In a word, no, none of the FindElement() strategies support using regular expressions for finding elements. The simplest way to do this would be to use FindElements() to find all of the links on the page, and match their .Text property to your regular expression.
Note though that if clicking a link navigates to a new page in the same browser window (i.e., it does not open a new browser window), you'll need to capture the exact text of all the links you'd like to click for later use. I mention this because if you try to hold on to the element references found by your initial FindElements() call, they will be stale after you click the first one. If this is your scenario, the code might look something like this:
// WARNING: Untested code written from memory.
// Not guaranteed to be exactly correct.
List<string> matchingLinks = new List<string>();
// Assume "driver" is a valid IWebDriver.
ReadOnlyCollection<IWebElement> links = driver.FindElements(By.TagName("a"));
// You could probably use LINQ to simplify this, but here is
// the foreach solution
foreach (IWebElement link in links)
{
    string text = link.Text;
    // Note: Regex.IsMatch takes the input string first, then the pattern.
    if (Regex.IsMatch(text, "your Regex here"))
    {
        matchingLinks.Add(text);
    }
}
foreach (string linkText in matchingLinks)
{
    IWebElement element = driver.FindElement(By.LinkText(linkText));
    element.Click();

    // do stuff on the page navigated to

    driver.Navigate().Back();
}
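The LINQ simplification hinted at in the comment above might look like this (same caveat: an untested sketch, using the A-ZNN:NN:NN:NN pattern from the question):
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Collect the text of every link whose text matches the pattern.
List<string> matchingLinks = driver.FindElements(By.TagName("a"))
    .Select(link => link.Text)
    .Where(text => Regex.IsMatch(text, @"[A-Z]\d{2}:\d{2}:\d{2}:\d{2}"))
    .ToList();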
Don't use regex to parse HTML.
Use HtmlAgilityPack instead.
You can follow these steps:
Step 1: Use an HTML parser to extract all the links from the webpage and store them in a list.
HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load(/* url */);
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    //collect all links here
}
Step 2: Match each link's text against this regex:
[A-Z]\d{2}:\d{2}:\d{2}:\d{2}
Step 3: You have your desired links.
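Put together, the three steps could look like this (untested sketch; "http://example.com" is a placeholder URL):
using System.Collections.Generic;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

HtmlWeb hw = new HtmlWeb();
HtmlDocument doc = hw.Load("http://example.com"); // placeholder

var desiredLinks = new List<HtmlNode>();
var links = doc.DocumentNode.SelectNodes("//a[@href]");
if (links != null) // SelectNodes returns null when nothing matches
{
    foreach (HtmlNode link in links)
    {
        // Match the link's visible text against the A-ZNN:NN:NN:NN pattern.
        if (Regex.IsMatch(link.InnerText, @"[A-Z]\d{2}:\d{2}:\d{2}:\d{2}"))
            desiredLinks.Add(link);
    }
}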
I am trying to grab elements from HTML source based on a class or id name, using a C# Windows Forms application. I am putting the source into a string using WebClient and plugging it into HtmlAgilityPack via HtmlDocument.
However, all the examples I find for HtmlAgilityPack parse through and find items based on tags. I need to find a specific id, say of a link in the HTML, and retrieve the value inside its tags. Is this possible, and what would be the most efficient way to do it? Everything I've tried for parsing out the ids gives me exceptions. Thanks!
You should be able to do this with XPath:
HtmlDocument doc = new HtmlDocument();
doc.Load(#"file.htm");
HtmlNode node = doc.DocumentNode.SelectSingleNode("//*[@id=\"my_control_id\"]");
string value = (node == null) ? "Error, id not found" : node.InnerHtml;
A quick explanation of the XPath here:
// means search everywhere in the tree; use SelectNodes instead if it may match multiple nodes
* means match any type of node
[] defines a "predicate", which checks properties relative to the current node
[@id=\"my_control_id\"] means find nodes that have an attribute named "id" with the value "my_control_id"
Further reference
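The question also mentions selecting by class. HtmlAgilityPack speaks XPath 1.0, which has no CSS-style class selector, so the usual idiom is to test the class attribute with contains(). A sketch continuing from the snippet above (the class name "my-class" is made up for illustration):
// Matches elements whose class attribute contains "my-class" as a whole token.
HtmlNode byClass = doc.DocumentNode.SelectSingleNode(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' my-class ')]");
string value = (byClass == null) ? "Error, class not found" : byClass.InnerHtml;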