I'm working on a page that loads dynamically, with data added as you scroll. To read the properties of an item, I first locate its parent div; to get the address, I then need an XPath from that parent div down to the span elements.
Below is my DOM structure:
<div class = "parentdiv">
<div class = "search">
<div class="header">
<div class="data"></div>
<div class="address-data">
<div class="address" itemprop="address">
<a itemprop="url" href="/search/Los-Angeles-CA-90025">
<span itemprop="streetAddress">
Avenue
</span>
<br>
<span itemprop="Locality">Los Angeles</span>
<span itemprop="Region">CA</span>
</a>
</div>
</div>
</div>
</div>
</div>
</div>
Here I want to locate the three spans, given that I'm currently at the parent div.
Can someone explain how to locate an element using XPath relative to a particular div?
You can try the following XPaths. The // after the parent div matches the a element at any depth, so the intermediate divs do not need to be listed.
To locate the street address:
//div[@class="parentdiv"]//a/span[@itemprop="streetAddress"]
To locate the locality/city:
//div[@class="parentdiv"]//a/span[@itemprop="Locality"]
To locate the state:
//div[@class="parentdiv"]//a/span[@itemprop="Region"]
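To print the contents of the <span> elements (texts like Avenue) under the div class = "parentdiv" node, you can use the following block of code. Note that div.address is a descendant of div.parentdiv, not a direct child, so the selector uses a descendant combinator (a space) there:
IList<IWebElement> myList = Driver.FindElements(By.CssSelector("div.parentdiv div.address > a[itemprop=url] > span"));
foreach (IWebElement element in myList)
{
    // innerHTML returns the raw markup inside each span, including surrounding whitespace
    string my_add = element.GetAttribute("innerHTML");
    Console.WriteLine(my_add);
}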
Your DOM might become fairly large, since it adds elements while scrolling, so using CSS selectors might be quicker.
To get all the span tags in the div, use:
div[class='address'] span
To get a specific span by using the itemprop attribute use:
div[class='address'] span[itemprop='streetAddress']
div[class='address'] span[itemprop='Locality']
div[class='address'] span[itemprop='Region']
You can store the elements in a variable like so:
var streetAddress = driver.FindElement(By.CssSelector("div[class='address'] span[itemprop='streetAddress']"));
var locality = driver.FindElement(By.CssSelector("div[class='address'] span[itemprop='Locality']"));
var region = driver.FindElement(By.CssSelector("div[class='address'] span[itemprop='Region']"));
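Since the question asks how to search from a particular div, the lookups can also be made relative to an already-located parent element. The following is a minimal sketch, assuming the Selenium WebDriver package is referenced; AddressReader and ReadAddress are hypothetical names used only for illustration, and parentDiv is assumed to be the div.parentdiv element found earlier.

```csharp
using OpenQA.Selenium;

static class AddressReader
{
    // Hypothetical helper: parentDiv is assumed to be an already-located
    // div.parentdiv element for one item on the page.
    public static string[] ReadAddress(IWebElement parentDiv)
    {
        // The leading dot makes each XPath relative to parentDiv.
        // Without it, "//span[...]" would search the whole page.
        string street = parentDiv
            .FindElement(By.XPath(".//span[@itemprop='streetAddress']")).Text.Trim();
        string locality = parentDiv
            .FindElement(By.XPath(".//span[@itemprop='Locality']")).Text.Trim();
        string region = parentDiv
            .FindElement(By.XPath(".//span[@itemprop='Region']")).Text.Trim();

        return new[] { street, locality, region };
    }
}
```

It could be called as AddressReader.ReadAddress(driver.FindElement(By.CssSelector("div.parentdiv"))), or once per element returned by FindElements when the page contains many such items.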
I tried to select the second div with class="editor-col", but the element I get is the second div inside the first div with class="editor-col".
C# code:
/*Create and initialize object*/
IWebElement dropdown_priority = driver.FindElement(By.CssSelector("div.k-edit-form-container div:nth-child(2)")); //select issue tab
HTML element:
<div class="k-edit-form-container">
<div class="editor-label"></div>
<div class="editor-label"></div>
<div class="editor-label"></div>
</div>
I'd appreciate your advice on how to select the second div with class="editor-col". Thanks.
Find the divs with that class and use an index to get the second one. XPath indexes start at 1, so the second match among siblings is [2]:
By.XPath(".//div[@class='editor-col'][2]")
First, sorry about my bad English.
My question is: how can I scrape a div inside a div with HtmlAgilityPack in C#?
This is the test HTML:
<html>
<div class="all_ads">
<div class="ads__item">
<div class="test">
test 1
</div>
</div>
<div class="ads__item">
<div class="test">
test 2
</div>
</div>
<div class="ads__item">
<div class="test">
test 3
</div>
</div>
</div>
</html>
How do I write a loop that gets all the ads, and then an inner loop that handles the test div inside each ad?
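You can select all the ads__item nodes inside all_ads as follows:
var res = div.SelectNodes(".//div[@class='all_ads']/div[@class='ads__item']");
.//div[@class='all_ads']/div[@class='ads__item'] selects every div with class ads__item that is a direct child of the div with class all_ads. (A single predicate such as @class='all_ads ads__item' would only match one element whose class attribute is exactly that string.)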
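For the nested loop the question asks about, a minimal sketch could look like the following. It assumes the HtmlAgilityPack package is referenced and inlines a shortened copy of the sample HTML; the class name NestedAdsDemo is only illustrative. SelectNodes returns null when nothing matches, so both levels are checked before iterating.

```csharp
using System;
using HtmlAgilityPack;

class NestedAdsDemo
{
    static void Main()
    {
        // Shortened copy of the sample markup from the question.
        string html =
            "<html><div class='all_ads'>" +
            "<div class='ads__item'><div class='test'>test 1</div></div>" +
            "<div class='ads__item'><div class='test'>test 2</div></div>" +
            "</div></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Outer loop: every ads__item directly under all_ads.
        var ads = doc.DocumentNode.SelectNodes("//div[@class='all_ads']/div[@class='ads__item']");
        if (ads == null) return; // HtmlAgilityPack returns null when there is no match

        foreach (HtmlNode ad in ads)
        {
            // Inner loop: "./" keeps the search relative to the current ad.
            var tests = ad.SelectNodes("./div[@class='test']");
            if (tests == null) continue;

            foreach (HtmlNode test in tests)
                Console.WriteLine(test.InnerText.Trim());
        }
    }
}
```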
You can use this path: //div[contains(@class, 'test')]
This selects every div whose class attribute contains test, that is, the inner div of each ads__item.
Then read the inner HTML of each selected div, like this:
using System;
using System.IO;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main(string[] args)
    {
        string html = File.ReadAllText(@"Path to your html file");
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);
        // Select every div whose class contains "test" and take its trimmed inner HTML
        var innerContent = doc.DocumentNode.SelectNodes("//div[contains(@class, 'test')]").Select(x => x.InnerHtml.Trim());
        foreach (var item in innerContent)
            Console.WriteLine(item);
        Console.ReadLine();
    }
}
Output (for the test HTML above):
test 1
test 2
test 3
I want to loop through all rows in a table and select all <p> in a row.
foreach (var r in Table.SelectNodes("tr"))
{
var Paragraphs = r.SelectNodes("//p");
}
Why do I have to use SelectNodes("//p") and not just SelectNodes("p")? If I do the latter I always get null.
I'm also wondering why I don't need //tr in the foreach statement.
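Written as //p, the expression is absolute: it finds p nodes at any depth starting from the document root, not only inside the current tr, so each row returns every <p> in the document. To search at any depth inside the current tr only, use .//p.
Written as p (or ./p), it matches only direct children of the tr element, while /p looks for a p directly under the document root. The <p> elements in a row are normally nested inside other elements, so SelectNodes("p") matches nothing and HtmlAgilityPack returns null.
Example:
Here .//p finds the 2 <p> elements under the tr, while p finds none and null is returned.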
<tr>
<div>
<p></p>
</div>
<div>
<div>
<p></p>
</div>
</div>
</tr>
In this case, searching by p (direct children only) finds the element, because it sits directly under tr.
<tr>
<p></p>
</tr>
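The difference between the child, relative-descendant, and absolute forms can be checked with a small self-contained sketch. It uses System.Xml from the .NET base library rather than HtmlAgilityPack (an assumption made so it runs without extra packages), and the sample table is invented for illustration; the XPath semantics are the same, except that XmlNode.SelectNodes returns an empty list where HtmlAgilityPack returns null.

```csharp
using System;
using System.Xml;

class RelativeXPathDemo
{
    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<table>" +
            "<tr><td><p>a</p></td></tr>" +
            "<tr><td><div><p>b</p></div><p>c</p></td></tr>" +
            "</table>");

        foreach (XmlNode row in doc.SelectNodes("/table/tr"))
        {
            int children = row.SelectNodes("p").Count;        // direct children of this tr
            int descendants = row.SelectNodes(".//p").Count;  // any depth inside this tr
            int wholeDocument = row.SelectNodes("//p").Count; // any depth from the document root
            Console.WriteLine($"{children} {descendants} {wholeDocument}");
        }
        // Prints:
        // 0 1 3
        // 0 2 3
    }
}
```

The third column is the same for both rows because //p ignores the current row and always starts from the document root.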
Find the elements below the ul element, as per the following sample HTML:
<ul _ngcontent-nkg-43="" ngmodelgroup="option">
<span _ngcontent-nkg-17="" style="cursor: pointer;">Option 1</span>
<span _ngcontent-nkg-17="" style="cursor: pointer;">Option 2</span>
<span _ngcontent-nkg-17="" style="cursor: pointer;">Option 3</span>
</ul>
var yourParentElement = driver.FindElement(By.XPath(".//ul[@ngmodelgroup='option']"));
var children = yourParentElement.FindElements(By.XPath(".//*"));
The latter call returns all descendant elements of yourParentElement.
If you're trying to fetch only the span elements, you could do:
driver.FindElement(By.XPath(".//ul[@ngmodelgroup='option']")).FindElements(By.TagName("span"));
I have some C# code with which I want to extract the value below (the text "I want this text" in the HTML). I have reformatted the HTML to make it easier to read.
<div class="paste-copy-url" style="margin:0 0 0 0;">
<h4>My Stats:</h4>
<div class="line">
<div class="wrap-input">
<input onclick="this.select();" value="I want this text" readonly="readonly">
</div>
</div>
<h4>Website Link:</h4>
<div class="line">
<div class="wrap-input"><input onclick="this.select();" value="Some value" readonly="readonly">
</div>
</div>
</div>
The code I tried (It is giving me the text : "Website Link:"):
var myvaluetoextract = htmlDocument.DocumentNode.SelectSingleNode("//div[@class='paste-copy-url']");
What am I doing wrong? Can I use this approach to get that element (There is only 1 instance of the div class in the page)?
var input = htmlDocument.DocumentNode
    .SelectSingleNode("//div[@class='paste-copy-url']//div[@class='wrap-input']/input");
var yourText = input.Attributes["value"].Value;
You can do it like this:
var myvaluetoextract = htmlDocument.DocumentNode.SelectSingleNode("//div[@class='paste-copy-url']//input");
var value = myvaluetoextract.GetAttributeValue("value", null);
//input means you search for input elements in the div's subtree, recursively. GetAttributeValue is a helper that does not throw even if the attribute doesn't exist; in that case it returns the second argument, which is null here.