Replacing certain containers inside a cloned "div" in jquery - c#

I am trying to replace the contents of a selected "div" element and append it to the parent control. So far I can clone it and append it to the parent, but I want to know how to replace certain tags inside it.
To be specific, here is the jQuery I use to clone the target control:
var x = $(parent).children('div[class="answer"]:first').children('div[class="ansitem"]:first').clone();
The HTML content inside the cloned div looks like this:
<div id="ansthumb_anstext_anscontrols">
<div id="image" class="ansthumb">
replace 1
</div>
<div id="atext" class="anstext">
<p class="atext_para">
<span id="mainwrapper_QRep_ARep_0_UName_0" style="color: rgb(51, 102, 255); font-weight: bold;">Replace 2 </span>
Replace 3
</p>
<p id="answercontrols">
<input name="ctl00$mainwrapper$QRep$ctl01$ARep$ctl01$AnsID" id="mainwrapper_QRep_ARep_0_AnsID_0" value='replace 4' type="hidden">
<a id="mainwrapper_QRep_ARep_0_Like_0" title="Like this answer" href="#">Like</a>
<a id="mainwrapper_QRep_ARep_0_Report_0" title="Report question" href="#">Report</a>
<span id="mainwrapper_QRep_ARep_0_lblDatetime_0" class="date"> replace 5 </span>
</p>
</div>
Here I have marked all the areas I want to replace. The ids of the elements above are named this way because they are generated inside a repeater control.
I have gone through the jQuery API, and as far as I understand this function seems to be the one I should use:
replaceWith(content)
The drawback is that I have to dump the entire HTML into a string variable and include the replacement text wherever needed. I don't think that is the best way; something like selecting particular tags and changing their data is probably the way to do it. Any help appreciated, guys!
Thanks

You could use .html() and a couple of other jQuery functions, using the surrounding elements as your selectors.
For example:
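<script type='text/javascript'>
// x is the clone from the question. Searching inside it with find() leaves the
// original elements, which share the same ids, untouched.
x.find("#image").html("YourData1"); //replace 1
var secondSpan = x.find("#mainwrapper_QRep_ARep_0_UName_0");
secondSpan.html("YourData2"); //replace 2
// The text after the span is a plain text node, so overwrite it directly;
// .after() would add new text and leave "Replace 3" in place.
secondSpan[0].nextSibling.nodeValue = "YourData3"; //replace 3
x.find("#mainwrapper_QRep_ARep_0_AnsID_0").val("YourData4"); //replace 4
x.find("#mainwrapper_QRep_ARep_0_lblDatetime_0").html("YourData5"); //replace 5
</script>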
Since these ids are defined by .NET, you can get the ClientID of the .NET control.
For example:
var secondSpan = $("#<%= UName.ClientID %>");
Hope this helps!

Related

Selenium not recognizing span element within a div, thinks it's text?

I'm trying to grab the text from a span that's inside a div. The div is currently selected, so it has "curr" within its class.
The DOM:
<a id="ctl00_oAjaxContentPlaceHolder_LinkButtonAlerts" href="javascript:__doPostBack('ctl00$oAjaxContentPlaceHolder$LinkButtonAlerts','')">
<div id="ctl00_oAjaxContentPlaceHolder_divAlertAlertsHolder" class="profile-menu-alerts curr" title="Activities & Alerts">
<span>Activities & Alerts</span>
</div>
</a>
This XPath should find the span (it works when I use the Find tool in DevTools), but it fails to find the element:
//div[contains(@class,'curr')]/span
If I remove the /span from the XPath, it finds the div just fine. The strange part is that if I grab the text of that div with
driver.FindElement(By.XPath("//div[contains(@class,'curr')]")).Text;
it returns "<span>Activities & Alerts</span>". Why is this span element being incorrectly recognized as text?
I ran this on my solution using the code below and had no issues.
var test = Driver.FindElement_byXPath("//div[contains(@class,'curr')]/span").Text;
HTML, with another option added:
<a id="ctl00_oAjaxContentPlaceHolder_LinkButtonAlerts" href="javascript:__doPostBack('ctl00$oAjaxContentPlaceHolder$LinkButtonAlerts','')">
<div id="ctl00_oAjaxContentPlaceHolder_divAlertAlertsHolder" class="profile-menu-alerts" title="Activities & Alerts">
<span>Test 1</span>
</div>
</a>
<a id="ctl00_oAjaxContentPlaceHolder_LinkButtonAlerts" href="javascript:__doPostBack('ctl00$oAjaxContentPlaceHolder$LinkButtonAlerts','')">
<div id="ctl00_oAjaxContentPlaceHolder_divAlertAlertsHolder" class="profile-menu-alerts curr" title="Activities & Alerts">
<span>Activities & Alerts</span>
</div>
</a>

Retrieving deep nested values looping through a HTML page using HTMLAgilityPack C#

I'm trying to use the HTMLAgilityPack to retrieve various specific values from a web page. The web page is always the same, and the data I want to scrape from it is always in the same place (same divs/classes/attributes etc).
I've tried to loop through and get the values, but I always mess up somewhere. I'd provide some code to help but honestly I've tried 5 times and each time I don't get results close to what I want to - I'm well and truly in a pickle.
I have written the main chunk of HTML:
<div id ="markers">
<div class="row">
<div class="span2 filter-pane ">
<div class="teaser teaser-small">
<h1 class="teaser-title">
...
</div>
<p> Value4 </p>
</div>
</div>
<div class="span2 filter-pane ">
</div>
<div class="span2 filter-pane ">
</div>
</div>
<div class="row"></div>
<div class="row"></div>
</div>
Basically the values (1-4) are the values I want to extract from the data.
The <div id="markers"> is ONE div on the page, all the information I need is in this div.
There are multiple <div class="row"> divs, I need to loop through all of these.
Inside each of these divs, there are three or fewer <div class="span2 filter-pane "> divs. I need to loop through these divs as well.
My data is inside here - Value3 is here in the <p>...</p>. And the other values can be found within the <h1 class="teaser-title"> node, where they are attributes in an <a> element.
I hope somebody can provide me with a solution, or at least some good guidance to accessing all pieces of data I want. I've tried various things but I don't get the results I want.
Thanks.
Here are some hints for you. First you need to get div#markers, because you mentioned that it contains all the info you need.
string mainURL = "your url"; // replace with the address of the page
HtmlAgilityPack.HtmlWeb web = new HtmlAgilityPack.HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load(mainURL);
var markerDiv = doc.DocumentNode.Descendants("div").FirstOrDefault(n => n.Id == "markers");
// Check whether markerDiv is null before going further
// Same idea: get the list of row divs (needs using System.Linq).
// The class test is written inline here; you can move it into your own HasClass helper, it's simple.
var rows = markerDiv.Descendants("div")
    .Where(d => d.GetAttributeValue("class", "").Split(' ').Contains("row"));
// Iterate through the rows
foreach (var row in rows)
{
    // You can add more criteria here if a row has more than one <a> element
    var aElement = row.Descendants("a").FirstOrDefault();
    // Returns Value1 here; do the same thing for the other attributes and for the <p>
    var value1 = aElement.GetAttributeValue("data-lat", "");
}
Hope it helps
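To make the hints above concrete, here is a minimal sketch of the nested loop. It assumes the HtmlAgilityPack NuGet package is referenced. The sample markup, the "Title" link text, and the names MarkersDemo, HasCssClass, and Extract are made up for illustration, since the question's HTML is elided; only the data-lat attribute name comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

public static class MarkersDemo
{
    // True when the node's class attribute contains the given name as a whole token.
    private static bool HasCssClass(HtmlNode node, string name)
    {
        return node.GetAttributeValue("class", "")
                   .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                   .Contains(name);
    }

    // Walks div#markers > div.row > div.filter-pane and collects one entry per non-empty pane.
    public static List<string> Extract(string html)
    {
        var results = new List<string>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var markers = doc.DocumentNode.SelectSingleNode("//div[@id='markers']");
        if (markers == null) return results; // page layout changed or wrong page

        foreach (var row in markers.Elements("div").Where(d => HasCssClass(d, "row")))
        {
            foreach (var pane in row.Elements("div").Where(d => HasCssClass(d, "filter-pane")))
            {
                var link = pane.SelectSingleNode(".//h1[@class='teaser-title']/a");
                var para = pane.SelectSingleNode("./p");
                if (link == null || para == null) continue; // skip empty panes

                results.Add(link.GetAttributeValue("data-lat", "") + "|" + para.InnerText.Trim());
            }
        }
        return results;
    }

    public static void Main()
    {
        // Hypothetical markup shaped like the question's outline.
        const string html =
            "<div id=\"markers\"><div class=\"row\">" +
            "<div class=\"span2 filter-pane \"><div class=\"teaser teaser-small\">" +
            "<h1 class=\"teaser-title\"><a href=\"#\" data-lat=\"Value1\">Title</a></h1></div>" +
            "<p> Value4 </p></div>" +
            "<div class=\"span2 filter-pane \"></div>" +
            "</div><div class=\"row\"></div></div>";

        foreach (var line in Extract(html)) Console.WriteLine(line);
    }
}
```

Matching the class as a whole token rather than comparing the full attribute string keeps the loop working even though the real attribute value ends with a trailing space ("span2 filter-pane ").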

HTMLAgilityPack - How to get a child div ignoring subgroups

I have the following two HTML snippets:
-- first HTML
<div id="FIRST">
<span>foo</span>
<div id="SECOND">
<span>bar</span>
</div>
</div>
-- second HTML
<div id="FIRST">
<div id="SECOND">
<span>bar</span>
</div>
</div>
I would like to get the span inside the FIRST div in the first HTML, but there are situations where this span doesn't exist inside the FIRST div, as you can see in the second HTML.
Right now I am using the following code, but it gets the span inside the SECOND div.
SelectSingleNode(".//span")
Note: in my example I have only two levels of divs, but in my real HTML I have a lot of levels.
I need to get the span considering only the tags directly inside the first div.
To get only a <span> that is a direct child of <div id="FIRST">, you can use either ./span or span, assuming that the context node on which you call SelectSingleNode() is the aforementioned <div id="FIRST">:
SelectSingleNode("./span")
SelectSingleNode("span")
Here is an alternative:
SelectSingleNode("span[1]");
This selects the first span element that is a direct child of the context node.
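As a rough illustration of the difference between ./span and .//span, the sketch below runs the direct-child query against both snippets from the question. It assumes the HtmlAgilityPack NuGet package is referenced; DirectChildDemo and DirectSpanText are made-up names.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

public static class DirectChildDemo
{
    // Returns the text of the <span> that is a direct child of div#FIRST,
    // or null when FIRST has no such child.
    public static string DirectSpanText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var first = doc.DocumentNode.SelectSingleNode("//div[@id='FIRST']");
        // "./span" looks only at the children of FIRST; ".//span" would also
        // reach the span nested inside SECOND.
        var span = first?.SelectSingleNode("./span");
        return span?.InnerText;
    }

    public static void Main()
    {
        const string withSpan =
            "<div id=\"FIRST\"><span>foo</span><div id=\"SECOND\"><span>bar</span></div></div>";
        const string withoutSpan =
            "<div id=\"FIRST\"><div id=\"SECOND\"><span>bar</span></div></div>";

        Console.WriteLine(DirectSpanText(withSpan) ?? "(none)");    // prints "foo"
        Console.WriteLine(DirectSpanText(withoutSpan) ?? "(none)"); // prints "(none)"
    }
}
```

SelectSingleNode returns null when nothing matches, so the second snippet needs the null check rather than a try/catch.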

Getting text enclosed by <li> tags

Hi, this is what my HTML file looks like:
<div class="panel-body sozluk">
<ol>
<li>kitap <code>isim</code> </li>
</span> </ol>
</div>
I need to get the values enclosed by the "li" tags.
This is my XPath:
//*[@id="wrap"]/div[2]/div[5]/div/div/div[1]/div[1]/div/div[1]/div[2]
This is what I have tried so far:
HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
document.Load("word.html");
var v = document.DocumentNode
.SelectNodes("//[@id='wrap']/div[2]/div[5]/div/div/div[1]/div[1]/div/div[1]/div[2]/ol ")
.Select(x => x.ChildNodes["li"].InnerText);
The application crashes every time. How can I do this?
First things first: your XPath is invalid because it is missing the star symbol (*) at the beginning:
var v = document.DocumentNode
    .SelectNodes("//[@id='wrap']/div[2]/div[5]/....")
                    ^ here, right after '//'
Such a verbose XPath is fragile. Always prefer selecting elements by id, class, or some other attribute. A possible example:
var v = document.DocumentNode
    .SelectNodes("//*[@id='wrap']//div[@class='panel-body sozluk']/ol/li")
    .Select(o => o.InnerText);
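One more thing worth guarding against: SelectNodes returns null, not an empty collection, when nothing matches, so chaining .Select directly onto it can throw. The sketch below shows the attribute-based query with that check. It assumes the HtmlAgilityPack NuGet package is referenced and uses a cleaned-up copy of the question's markup (the stray </span> is left out and a wrapper with id "wrap" is added); ListItemDemo and ListItemTexts are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

public static class ListItemDemo
{
    // Returns the trimmed text of every <li> under the "panel-body sozluk" div.
    public static List<string> ListItemTexts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var items = doc.DocumentNode
            .SelectNodes("//*[@id='wrap']//div[@class='panel-body sozluk']/ol/li");

        // SelectNodes yields null when there is no match, so check before using LINQ.
        if (items == null) return new List<string>();

        // InnerText includes the text of nested elements such as <code>.
        return items.Select(li => li.InnerText.Trim()).ToList();
    }

    public static void Main()
    {
        // Cleaned-up version of the markup from the question.
        const string html =
            "<div id=\"wrap\"><div class=\"panel-body sozluk\"><ol>" +
            "<li>kitap <code>isim</code> </li>" +
            "</ol></div></div>";

        foreach (var text in ListItemTexts(html)) Console.WriteLine(text);
    }
}
```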
You need to look at your HTML first:
<div class="panel-body sozluk">
<ol>
<li>kitap <code>isim</code> </li>
</span> </ol>
</div>
This is invalid. You have a div, inside which you have an ol, inside which you have a li, inside which you have a code. However, you are closing a span inside your div. The span, if opened at all was opened outside the div which contains the closing of the span. Make sure you are having valid html, before you try to extract things from it. And structure your code, I am sure you would have observed this problem if your code was structured.
Your HTML is kinda messy, but if you don't mind using another package,
use Fizzler for HTMLAgilityPack, that will allow you to use jquery-like selectors to get them instead of xpath.
var liList = document.DocumentNode.QuerySelectorAll("li");

Related to predicates in HtmlAgilityPack

I want to fetch data from website. I am using HtmlAgilityPack. In the website content is like this
<div id="list">
<div class="list1">
<a href="example1.com" class="href1" >A1</a>
<a href="example4.com" class="href2" />
</div>
<div class="list2">
<a href="example2.com" class="href1" >A2</a>
<a href="example5.com" class="href2" />
</div>
<div class="list3">
<a href="example3.com" class="href1" >A3</a>
<a href="example6.com" class="href2" />
</div>
</div>
Now, I want to fetch the first two links that have class="href1". I am using this code:
HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//a[@class='href1'][position()<3]");
But it is not working; it returns all three links. I want to fetch only the first two. How can I do this?
One more thing I want to do:
Above, I have only three links with class="href1". Suppose I have 10 links with class="href1" and I want to fetch only four of them, from the 6th link to the 9th. How do I fetch these particular four links?
Try wrapping the anchor selector in parentheses before applying the position() function:
var nodes = doc.DocumentNode.SelectNodes("(//a[@class='href1'])[position()<3]");
Why not just get them all and use the first two from the returned collection? Whatever XPath you would need for this would ultimately be a hell of a lot less readable than using LINQ:
using System.Linq;
...
var firstTwoHrefs = doc.DocumentNode
    .SelectNodes("//a[@class='href1']").Take(2);
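For the follow-up about the 6th to 9th links, both approaches extend naturally. The sketch below shows each of them, assuming the HtmlAgilityPack NuGet package is referenced. The generated markup (ten links, two per wrapper div) and the names RangeDemo, BuildHtml, SixthToNinthXPath, and SixthToNinthLinq are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

public static class RangeDemo
{
    // Builds hypothetical markup: ten links with class "href1", two per wrapper div.
    public static string BuildHtml()
    {
        var sb = new StringBuilder("<div id=\"list\">");
        for (int i = 1; i <= 10; i += 2)
        {
            sb.Append("<div>");
            sb.Append($"<a href=\"example{i}.com\" class=\"href1\">A{i}</a>");
            sb.Append($"<a href=\"example{i + 1}.com\" class=\"href1\">A{i + 1}</a>");
            sb.Append("</div>");
        }
        return sb.Append("</div>").ToString();
    }

    // XPath version: the parentheses make position() count over the whole
    // result set instead of over each parent's children.
    public static string[] SixthToNinthXPath(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var nodes = doc.DocumentNode.SelectNodes(
            "(//a[@class='href1'])[position() >= 6 and position() <= 9]");
        return nodes == null ? new string[0] : nodes.Select(n => n.InnerText).ToArray();
    }

    // LINQ version: select everything, then skip five and take four.
    public static string[] SixthToNinthLinq(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var nodes = doc.DocumentNode.SelectNodes("//a[@class='href1']");
        return nodes == null
            ? new string[0]
            : nodes.Skip(5).Take(4).Select(n => n.InnerText).ToArray();
    }

    public static void Main()
    {
        string html = BuildHtml();
        Console.WriteLine(string.Join(", ", SixthToNinthXPath(html))); // A6, A7, A8, A9
        Console.WriteLine(string.Join(", ", SixthToNinthLinq(html)));  // A6, A7, A8, A9
    }
}
```

Without the parentheses, position() is evaluated per parent element, which is why the original query returned every link: each one is the first class="href1" anchor inside its own div.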
