Selenium XPath Query - FindElement After Text - c#

I am trying to get a link in a website which changes name on a daily basis. The structure is similar to this (but with many more levels):
<li>
<div class = "contentPlaceHolder1">
<div class="content">
<p>
<strong>Today's File Here:</strong>
</p>
</div>
</div>
</li>
<li>...</li>
<li>...</li>
<li>...</li>
<li>
<div class = "contentPlaceHolder1">
<div class="content">
<div class="DocLink">
<li>
<a class="txtLnk" href="...">Download</a>
</li>
</div>
</div>
</div>
</li>
<li>...</li>
etc...
If I find the text (which will remain constant) which is immediately above it in the page by using
IWebElement foundTextElement = chrome.FindElement(By.XPath("//p/strong[contains(., \"Today's File Here:\")]"));
How can I find the next link in the page by using XPath (or alternative solution)? I am unsure of how to search for the next element after this.
If I use
IWebElement link = chrome.FindElement(By.XPath("//a[@class='txtLnk']"));
then this finds the first link in the page. I only want the first occurrence of it after 'foundTextElement'.
I have had it working by navigating up the tree to the parent <li> and finding the 4th sibling using By.XPath("following-sibling::*[4]/div/div/div/li/a[@class='txtLnk']"), but that seems a little precarious to me.
I could parse the HTML until it finds the next occurrence in the html, but was wondering whether there is a more clever way of doing this?
Thanks.

You can try this XPath. It is complicated because we can't see the rest of the page to optimize it:
//li[preceding-sibling::li[.//*[contains(text(),'File Here')]]][.//a[contains(@class,'txtLnk')]][1]
It selects the first li that contains an a tag with the txtLnk class and that comes, as a sibling, after the li element whose text contains File Here.
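As a rough check of that expression outside the browser, here is a minimal sketch that runs it with System.Xml.XPath against a small, well-formed stand-in for the page. The markup, the extra links, and the href values are all made up for illustration; the real page is deeper and may not parse as XML at all.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical stand-in for the page (C# 9+ top-level program); hrefs are invented.
var page = XDocument.Parse(@"
<ul>
  <li><div class='content'><p><strong>Today's File Here:</strong></p></div></li>
  <li><a class='other' href='/misc'>Other</a></li>
  <li><div class='DocLink'><a class='txtLnk' href='/files/today.pdf'>Download</a></div></li>
  <li><a class='txtLnk' href='/files/archive.pdf'>Archive</a></li>
</ul>");

// The XPath from the answer: first li after the marker li that contains a txtLnk link.
var li = page.XPathSelectElement(
    "//li[preceding-sibling::li[.//*[contains(text(),'File Here')]]]" +
    "[.//a[contains(@class,'txtLnk')]][1]");

// The leading dot keeps the second lookup inside the li that was just found.
var link = li?.XPathSelectElement(".//a[contains(@class,'txtLnk')]");
string href = link?.Attribute("href")?.Value;

Console.WriteLine(href);
```

With Selenium the same string would be passed to By.XPath, and the nested lookup would be a second FindElement call on the returned li.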

By.XPath("//a[@class='txtLnk']")
is a very generic selector; other elements on the page might use the same class.
You can locate the link with a CssSelector instead. Try this:
IWebElement aElement = chrome.FindElement(By.CssSelector("div.contentPlaceHolder1 div.content div.DocLink li a"));
Then you can get the href using:
string link = aElement.GetAttribute("href");

Related

xPath working for only one of multiple div tags found at the same level

Writing tests with Selenium webdriver in C#. I absolutely can't understand why only the first in a list of (same-level) div elements can be identified with xPath.
I have this html, I have inspected two elements on the page, two different divs. I managed to copy just the text of the first element, by running this SIMPLE code:
IWebElement chapterElement = webDriver.FindElement(By.XPath("/html/body/div[3]/main/div[2]/div[3]/article/div[1]"));
...after which I can just type:
chapterElement.Text to find out the inner text.
And the other one is another div, at the same level as the first, the xPath I just copied from the HTML (copy entire xPath):
IWebElement chapterElement = webDriver.FindElement(By.XPath("/html/body/div[3]/main/div[2]/div[3]/article/div[2]"));
... and it doesn't fail, but it doesn't return the text either; the text is "" (empty string).
The only differences between the two divs are:
the last segment in the path: div[1] versus div[2].
the second div is actually hidden from the page (probably because it lacks the class "chapter_visible"), but does show up completely in the html with Inspect!
In case this helps, I'm gonna say
"/html/body/div[3]/main/div[2]/div[3]/article/div[1]"
corresponds with:
<div class="chapter chapter chapter_visible" data-chapterno="0" data-chapterid="5e8798266cee070006f5a3d1" style="display: block;">
<h1>some text</h1>
<div class="chapter__content"><p>some text</p>
<p>some text</p>
<p>some text</p>
<ul>
<li>some text</li>
<li>some text</li>
<li>some text.</li>
</ul></div>
</div>
and
"/html/body/div[3]/main/div[2]/div[3]/article/div[2]" (the second xPath)
corresponds to the following (as is located at the same level as the first):
<div class="chapter chapter" data-chapterno="1" data-chapterid="5e8798436cee070006f5a3d2">
<h1>some text</h1>
<div class="chapter__content"><p>some text</p>
<p><strong>some text</strong></p>
<p>some text.</p>
<p>some text</p>
<p>some text</p></div>
</div>
This is my first experience playing around with xPath, a bit disappointed because I just copied the xPath, I didn't even write it manually. It was supposed to be fast and straightforward, right? Thank you.
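IWebElement chapterElement = webDriver.FindElement(By.XPath("//div[@class='chapter chapter']"));
Can you try this?
If you want an attribute value instead, call GetAttribute, which returns a string. Text only returns text that is visible on the page, so for the hidden div, passing "textContent" as the attribute name should return its text:
string value = webDriver.FindElement(By.XPath("//div[@class='chapter chapter']")).GetAttribute("attribute_name");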

class name and then the tag name contains the distinguish text

I have an li with the class name active, and nested inside it is a span with the unique text "active text". The class name active is unique among the other class names, and the nested text is unique too. How would I click on that element? I have tried the following:
FindElement(By.XPath("//li[@class='active']//*[contains(.,'active text')]"));
I also tried
FindElement(By.XPath("//li[@class='active']//div//div//div//span[contains(.,'active text')]"))
and this
FindElement(By.XPath("//li[contains(@class,'active')] and //span[contains(.,'active text')]")).Text;
Every time I get a "no such element" error.
Any thoughts?
this is the html code
<li class="active">
<div class="a">
<div class="b">
<div class="c">
<h1></h1>
<h3 class="d"> some text</h3>
<div class="e">
<span class="f">
Active Text</span>
</div></div></div></div>
</li>
You can use either of the following Locator Strategies:
CssSelector:
FindElement(By.CssSelector("li.active span.f"));
XPath 1:
FindElement(By.XPath("//li[@class='active']//span[normalize-space()='Active Text']"));
XPath 2:
FindElement(By.XPath("//li[@class='active']//span[@class='f' and normalize-space()='Active Text']"));
FindElement(By.XPath("//li[@class='active']//span[contains(text(),'Active Text')]"));
OR
FindElement(By.XPath("//li[@class='active']//span[@class='f' and contains(text(),'Active Text')]"));
Please try the code above; both should work. Let me know if anything needs clarifying.
So what worked for me was this,
FindElement(By.CssSelector("li.active")).FindElement(By.XPath("//span[contains(.,'Active Text')]"));
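One difference between the original attempts and the suggested locators is letter case: XPath string functions are case-sensitive, so 'active text' does not match a span whose text is "Active Text". The sketch below checks this with System.Xml.XPath on a copy of the posted markup. The ul wrapper is added only to give the snippet a root element, and it assumes the fragment parses as XML.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

// Copy of the posted fragment (C# 9+ top-level program); the <ul> wrapper is an addition.
var doc = XDocument.Parse(@"
<ul>
  <li class='active'>
    <div class='a'><div class='b'><div class='c'>
      <h1></h1>
      <h3 class='d'> some text</h3>
      <div class='e'><span class='f'>
        Active Text</span></div>
    </div></div></div>
  </li>
</ul>");

// Lower-case needle from the question: nothing matches, comparison is case-sensitive.
bool lowerCaseMatch = doc.XPathSelectElement(
    "//li[@class='active']//*[contains(.,'active text')]") != null;

// normalize-space() trims the newline and indentation before the exact comparison.
bool normalizedMatch = doc.XPathSelectElement(
    "//li[@class='active']//span[normalize-space()='Active Text']") != null;

// contains() tolerates the surrounding whitespace without trimming.
bool containsMatch = doc.XPathSelectElement(
    "//li[@class='active']//span[contains(text(),'Active Text')]") != null;

Console.WriteLine($"{lowerCaseMatch} {normalizedMatch} {containsMatch}");
```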

Retrieving specific URLs with HtmlAgilityPack C#

I'm currently attempting to use HtmlAgilityPack to extract specific links from an html page. I tried using plain C# to force my way in but that turned out to be a real pain. The links are all inside of <div> tags that all have the same class. Here's what I have:
HtmlWeb web = new HtmlWeb();
HtmlDocument html = web.Load(url);
//this should select only the <div> tags with the class acTrigger
foreach (HtmlNode node in html.DocumentNode.SelectNodes("//div[@class='acTrigger']"))
{
//not sure how to dig further in to get the href values from each of the <a> tags
}
and the site's markup looks along these lines:
<li>
<div class="acTrigger">
<a href="/16014988/d/" onclick="return queueRefinementAnalytics('Category','Battery')">
Battery <em> (1)</em>
</a>
</div>
</li>
<li>
<div class="acTrigger">
<a href="/15568540/d/" onclick="return queueRefinementAnalytics('Category','Brakes')">
Brakes <em> (2)</em>
</a>
</div>
</li>
<li>
<div class="acTrigger">
<a href="/11436914/d/1979-honda-ct90-cables-lines" onclick="return queueRefinementAnalytics('Category','Cables/Lines')">
Cables/Lines <em> (1)</em>
</a>
</div>
</li>
There are a lot of links on this page, but the hrefs I need are in the <a> tags nested inside the <div class="acTrigger"> tags. It would be simple if each <a> had its own class, but unfortunately only the <div> tags have classes. I need to grab each of those hrefs and store them so that I can later visit each page and retrieve more information from it. I just need a nudge in the right direction to get over this hump, and then I should be able to handle the other pages as well. I have no previous experience with HtmlAgilityPack, and all the examples I find extract every URL from a page rather than specific ones. A link to an example or documentation would be enough; any help is greatly appreciated.
You should be able to change your select to include the <a> tag: //div[@class='acTrigger']/a. That way your HtmlNode is the <a> tag instead of the div.
To store the links you can use GetAttributeValue.
foreach (HtmlNode node in html.DocumentNode.SelectNodes("//div[@class='acTrigger']/a"))
{
// Get the value of the HREF attribute.
string hrefValue = node.GetAttributeValue( "href", string.Empty );
// Then store hrefValue for later.
}
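To make the "store hrefValue for later" step concrete, here is a minimal sketch that collects the hrefs into a list. It uses System.Xml.Linq as a stand-in for HtmlAgilityPack so that it runs without the package, and it assumes a trimmed, well-formed copy of the markup. The third li with class 'other' is invented to show that unrelated links are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Trimmed copy of the posted markup (C# 9+ top-level program).
// The last <li> is a made-up entry that the XPath should ignore.
var listing = XDocument.Parse(@"
<ul>
  <li><div class='acTrigger'><a href='/16014988/d/'>Battery <em> (1)</em></a></div></li>
  <li><div class='acTrigger'><a href='/15568540/d/'>Brakes <em> (2)</em></a></div></li>
  <li><div class='other'><a href='/ignored/'>Unrelated</a></div></li>
</ul>");

// Same XPath as in the answer: only <a> children of div.acTrigger.
List<string> hrefs = listing
    .XPathSelectElements("//div[@class='acTrigger']/a")
    .Select(a => (string)a.Attribute("href"))   // null if the attribute is missing
    .Where(h => !string.IsNullOrEmpty(h))
    .ToList();

foreach (string h in hrefs)
    Console.WriteLine(h);
```

In HtmlAgilityPack itself, SelectNodes returns null rather than an empty collection when nothing matches, so a null check before the foreach is worth adding.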

Get First Input in first ul and its first li

Here's the markup:
<h3>Customers</h3>
<ul>
<li>
<label for="customer115">Some Customer Name</label>
<input id="customer115" type="checkbox" value="115" name="customer115">
</li>
.. there are more <li> here...and so on
</ul>
<h3>Dealers</h3>
<ul>
<li>
<label for="dealer100">Some DealerName</label>
<input id="dealer100" type="checkbox" value="115" name="dealer115">
</li>
.. there are more <li> here...and so on
</ul>
I'm trying to get reference to the customer checkbox for example so I can do a click() on it via XPath. I'm doing this in Selenium so something like:
string sXPath = string.Format("//h3[text()='{0}']/ul/li/input[1]", "Customers");
IWebElement firstCompanyCheckbox = GetElementByXPath(sXPath);
firstCompanyCheckbox.Click();
So far I can't figure out how to get to this reference, the above xPath does not find it. I want to click that checkbox.
The ul is not a child of the h3; it is a sibling. Adjust your XPath to use the following-sibling:: axis:
//h3[text()='{0}']/following-sibling::ul/li/input[1]
If you want to ensure that you select the first ul and the first li, then add additional predicate filters:
//h3[text()='{0}']/following-sibling::ul[1]/li[1]/input[1]
You could also simplify that XPath, in my opinion, by using an easier-to-read CSS selector:
ul:first-of-type li:first-child input[type='checkbox']
I'm sure there will be a lot of debate as to which is preferable, CSS or XPath. But typically, when I see my QA going the route of a complex XPath query, I try to find ways to add "id" attributes to the elements or to simplify the DOM so that selecting is easier.
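The difference between the child step and the following-sibling:: axis can be checked without a browser. This is a minimal sketch using System.Xml.XPath on a well-formed copy of the markup. The outer div, the second customer entry, and the XPathFor helper are assumptions added for illustration.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

// Well-formed copy of the posted markup (C# 9+ top-level program).
// customer116 is a hypothetical second entry; the outer <div> is only a root.
var form = XDocument.Parse(@"
<div>
  <h3>Customers</h3>
  <ul>
    <li><label for='customer115'>Some Customer Name</label><input id='customer115' type='checkbox' value='115' name='customer115'/></li>
    <li><label for='customer116'>Another Customer</label><input id='customer116' type='checkbox' value='116' name='customer116'/></li>
  </ul>
  <h3>Dealers</h3>
  <ul>
    <li><label for='dealer100'>Some DealerName</label><input id='dealer100' type='checkbox' value='115' name='dealer115'/></li>
  </ul>
</div>");

// Hypothetical helper: builds the corrected XPath for a given heading text.
string XPathFor(string heading) =>
    string.Format("//h3[text()='{0}']/following-sibling::ul[1]/li[1]/input[1]", heading);

// Original attempt: looks for a ul *inside* the h3, so it finds nothing.
var viaChildStep = form.XPathSelectElement("//h3[text()='Customers']/ul/li/input[1]");

// Corrected version: the ul is the first following sibling of the h3.
string customerId = (string)form.XPathSelectElement(XPathFor("Customers"))?.Attribute("id");
string dealerId = (string)form.XPathSelectElement(XPathFor("Dealers"))?.Attribute("id");

Console.WriteLine($"{viaChildStep == null} {customerId} {dealerId}");
```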

Related to predicates in HtmlAgilityPack

I want to fetch data from website. I am using HtmlAgilityPack. In the website content is like this
<div id="list">
<div class="list1">
<a href="example1.com" class="href1" >A1</a>
<a href="example4.com" class="href2" />
</div>
<div class="list2">
<a href="example2.com" class="href1" >A2</a>
<a href="example5.com" class="href2" />
</div>
<div class="list3">
<a href="example3.com" class="href1" >A3</a>
<a href="example6.com" class="href2" />
</div>
</div>
Now, I want to fetch the first two links which has class="href1". I am using code.
HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//a[@class='href1'][position()<3]");
But, it is not working. It gives all three links. I want to fetch only first two links. How to do this?
Hey! Now I want to do 1 thing also.
Above, I have only three links with class="href1". Suppose, I have 10 links with class="href1". And I want to fetch only four links from 6th link to 9th link. How to fetch these particular four links?
Try wrapping the anchor selector in parentheses before applying the position() function:
var nodes = doc.DocumentNode.SelectNodes("(//a[@class='href1'])[position()<3]");
Why not just get them all and use the first two from the returned collection? Whatever XPath you would need for this would ultimately be a lot less readable than using LINQ:
using System.Collections.Generic;
using System.Linq;
...
// Take(2) returns an IEnumerable<HtmlNode>, not an HtmlNodeCollection.
IEnumerable<HtmlNode> firstTwoHrefs = doc.DocumentNode
.SelectNodes("//a[@class='href1']").Take(2);
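The reason the parentheses matter is that a predicate attached directly to a step is evaluated per parent, so every href1 link is at position 1 within its own div. The sketch below shows this with System.Xml.XPath instead of HtmlAgilityPack, on a generated document with ten such links (the ten-link layout follows the follow-up question, but its exact shape is an assumption). It also applies the same parenthesised form to the 6th through 9th links.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Build <div id="list"> with ten child divs, each holding one a.href1 (A1..A10)
// and one a.href2, mirroring the posted structure (C# 9+ top-level program).
var doc = new XDocument(
    new XElement("div", new XAttribute("id", "list"),
        Enumerable.Range(1, 10).Select(i =>
            new XElement("div", new XAttribute("class", "list" + i),
                new XElement("a", new XAttribute("class", "href1"),
                    new XAttribute("href", "example" + i + ".com"), "A" + i),
                new XElement("a", new XAttribute("class", "href2"))))));

// Without parentheses, position() counts within each parent div: all ten links match.
int perParentCount = doc.XPathSelectElements("//a[@class='href1'][position()<3]").Count();

// With parentheses, position() counts over the whole result set.
string[] firstTwo = doc.XPathSelectElements("(//a[@class='href1'])[position()<3]")
    .Select(a => a.Value).ToArray();

// Same idea for the 6th through 9th links.
string[] sixthToNinth = doc
    .XPathSelectElements("(//a[@class='href1'])[position()>=6 and position()<=9]")
    .Select(a => a.Value).ToArray();

Console.WriteLine(perParentCount);
Console.WriteLine(string.Join(",", firstTwo));
Console.WriteLine(string.Join(",", sixthToNinth));
```

The LINQ equivalent of the range would be Skip(5).Take(4) on the full collection.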
