I have this code in my web page:
<div class="goog-inline-block goog-flat-menu-button-caption">
TestText
</div>
I was wondering how I can access this class and change TestText into some other string using C#.
I was trying with HtmlCollection, but there's no InnerText option.
EDIT: I can't change the code above.
Assuming you are using ASP.NET and your div is inside at least one container that has the runat="server" attribute, for example the form:
<form id="form1" runat="server">
<div class="goog-inline-block goog-flat-menu-button-caption">
TestText
</div>
</form>
you can simply do this (the markup must be well-formed XML for LoadXml to parse it):
var xml = form1.InnerHtml;
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
// SelectNodes returns every match; SelectSingleNode would return only one node
var nodes = doc.SelectNodes("//div[contains(@class,'goog-inline-block goog')]");
foreach(XmlNode node in nodes)
{
node.InnerText = " changed Text";
}
form1.InnerHtml = xml = doc.InnerXml;
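Or using LINQ to XML, i.e. XDocument:
XDocument doc = XDocument.Parse(xml);
// Descendants finds matching divs at any depth; Elements would only check the root
var nodes = doc.Descendants("div")
.Where(s => s.Attribute("class") != null
&& s.Attribute("class").Value.Contains("goog-inline-block goog"))
.ToList();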
foreach (XElement elem in nodes)
{
elem.Value = "changed text";
}
form1.InnerHtml = doc.ToString();
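The XDocument approach above can also be packaged as a small helper. This is only a sketch: the class name DivTextRewriter and the method ReplaceDivText are made up for illustration, and it assumes the markup is well-formed XML (real-world HTML often is not).

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class DivTextRewriter
{
    // Replaces the text of every <div> whose class attribute contains classFragment.
    public static string ReplaceDivText(string xml, string classFragment, string newText)
    {
        XDocument doc = XDocument.Parse(xml);

        // The cast to string yields null when the attribute is missing,
        // so divs without a class attribute are skipped instead of throwing.
        var matches = doc.Descendants("div")
            .Where(d => ((string)d.Attribute("class") ?? "").Contains(classFragment))
            .ToList();

        foreach (XElement div in matches)
        {
            div.Value = newText;
        }

        // DisableFormatting keeps the original whitespace instead of re-indenting.
        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

In a code-behind this could be called as form1.InnerHtml = DivTextRewriter.ReplaceDivText(form1.InnerHtml, "goog-inline-block", "changed text");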
Add the runat="server" and id attributes to the div so you have:
<div id="mydiv" class="goog-inline-block goog-flat-menu-button-caption" runat="server" >
TestText
</div>
You can then change the class attribute with:
mydiv.Attributes["class"] = "classOfYourChoice";
or change the text with:
mydiv.InnerText = "Your Selected text";
Hope this helps.
In C# you need to give the div an id and set runat="server" on it, like this:
<div id="divTest" runat="server" class="goog-inline-block goog-flat-menu-button-caption">
TestText
</div>
Then, in the C# code-behind:
divTest.InnerText = "Change Text From Here.";
Try this; if it does not work, please explain your question in more detail.
I have html source:
<div class="lit-plot">
<b class="red">خلاصه داستان :</b>
Content
</div>
I want to get the value of the <div> (not the <b>, only the string "Content") with HtmlAgilityPack. What is the best way to do this?
Here is what I am doing. movieDesHTMLSource is the given HTML source. I don't know how to access the InnerHtml!
string movieDes;
// Extract the movie's description HTML source
var movieDesHTMLSource = new HtmlAgilityPack.HtmlDocument();
movieDesHTMLSource.LoadHtml(postPageHTMLDes[95].InnerHtml);
var src = movieDesHTMLSource.DocumentNode.SelectNodes("//div[contains(@class,'lit-plot')]");
Use the XPath text() node test to retrieve just the text directly inside the div tag.
var html = @"<body>
<div class='lit-plot'>
<b class='red'>خلاصه داستان :</b>
Content
</div>
</body>";
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
var htmlNodes = htmlDoc.DocumentNode.SelectNodes("//div[contains(@class,'lit-plot')]/text()");
foreach (HtmlNode node in htmlNodes)
{
Console.WriteLine(node.InnerText.Trim());
}
Fiddle here : https://dotnetfiddle.net/mXFs8k
I recommend wrapping your content in a tag such as <p> or <span>; then you can easily target it using HtmlAgilityPack.
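If the markup cannot be changed, another option is to filter the div's direct child nodes by node type instead of using text(). The sketch below assumes the HtmlAgilityPack NuGet package is referenced; OwnTextExtractor and GetOwnText are illustrative names, not part of the library.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration only.
public static class OwnTextExtractor
{
    // Returns the non-empty text that sits directly inside the first node
    // matched by xpath, ignoring text that belongs to child elements such as <b>.
    public static List<string> GetOwnText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node == null)
        {
            return new List<string>(); // nothing matched the XPath
        }

        return node.ChildNodes
            .Where(n => n.NodeType == HtmlNodeType.Text) // keep only direct text nodes
            .Select(n => n.InnerText.Trim())
            .Where(t => t.Length > 0)                     // drop whitespace-only nodes
            .ToList();
    }
}
```

For the HTML in the question, GetOwnText(html, "//div[contains(@class,'lit-plot')]") would be expected to return a single entry, "Content".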
First, sorry about my bad English.
My question is: how can I scrape a div inside a div with HtmlAgilityPack in C#?
This is the test HTML:
<html>
<div class="all_ads">
<div class="ads__item">
<div class="test">
test 1
</div>
</div>
<div class="ads__item">
<div class="test">
test 2
</div>
</div>
<div class="ads__item">
<div class="test">
test 3
</div>
</div>
</div>
</html>
How can I write a loop that gets all the ads, and then an inner loop that reads the test div inside each ad?
You can select all the ads__item nodes inside all_ads as follows (div here is any node that contains the all_ads element):
var res = div.SelectNodes(".//div[@class='all_ads']/div[@class='ads__item']");
.//div[@class='all_ads']/div[@class='ads__item'] selects every div with class ads__item that is a direct child of the div with class all_ads.
You have to use this path: //div[contains(@class, 'test')]
This selects the div(s) whose class attribute contains test, i.e. the inner div of each ads__item.
Then take the inner HTML of each selected div, like this:
class Program
{
static void Main(string[] args)
{
string html = File.ReadAllText(@"Path to your html file");
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
var innerContent = doc.DocumentNode.SelectNodes("//div[contains(@class, 'test')]").Select(x => x.InnerHtml.Trim());
foreach (var item in innerContent)
Console.WriteLine(item);
Console.ReadLine();
}
}
Output:
test 1
test 2
test 3
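The nested loop the question asks for (an outer loop over the ads, an inner loop over the test divs in each ad) could look roughly like this. It is a sketch that assumes the HtmlAgilityPack NuGet package is referenced; AdScraper and GetAdTexts are made-up names.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper for illustration only.
public static class AdScraper
{
    // Collects the trimmed text of each "test" div, visiting the ads one by one.
    public static List<string> GetAdTexts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<string>();

        // Outer selection: every ads__item directly under all_ads.
        HtmlNodeCollection ads =
            doc.DocumentNode.SelectNodes("//div[@class='all_ads']/div[@class='ads__item']");
        if (ads == null)
        {
            return results; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode ad in ads)
        {
            // Inner selection: "./" keeps the search relative to the current ad.
            HtmlNodeCollection tests = ad.SelectNodes("./div[@class='test']");
            if (tests == null)
            {
                continue;
            }

            foreach (HtmlNode test in tests)
            {
                results.Add(test.InnerText.Trim());
            }
        }

        return results;
    }
}
```

The relative "./" path matters here: starting the inner XPath with "//" would search the whole document again rather than only the current ad.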
I have tried many approaches but could not solve my problem. I don't know how to write the node path for the SelectSingleNode() method. I built an XPath to reach my node in my C# code, but when I run it I get a NullReferenceException because of that path. How should I build the path, or is there another solution?
This is an example of the HTML:
<html>
<body>
<div id="container">
<div id="box">
<div class="box">
<div class="boxContent">
<div class="userBox">
<div class="userBoxContent">
<div class="userBoxElement">
<ul id ="namePart">
<li>
<span class="namePartContent">
</span>
</li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>
And this my C# code:
namespace AgilityTrial
{
class Program
{
static void Main(string[] args)
{
Uri url = new Uri("https://....");
WebClient client = new WebClient();
client.Encoding = Encoding.UTF8;
string html = client.DownloadString(url);
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
string path = @"//html/body/div[@id='container']/div[@id='classifiedDetail']"+
"/div[@class='classifiedDetail']/div[@class='classifiedDetailContent']"+
"/div[@class='classifiedOtherBoxes']/div[@class='classifiedUserBox']"+
"/div[@class='classifiedUserContent']/ul[@id='phoneInfoPart']/li"+
"/span[@class='pretty-phone-part show-part']";
var tds = doc.DocumentNode.SelectSingleNode(path);
var date = tds.InnerHtml;
Console.WriteLine(date);
}
}
}
Take your namePartContent span node as an example. If you want to fetch that data, you would simply do this:
doc.DocumentNode.SelectSingleNode(".//span[@class='namePartContent']")?.InnerText;
It searches for a single span node whose class is namePartContent, beginning at the root node, in your case <html>. The ?. operator yields null instead of throwing when no node matches.
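To avoid the NullReferenceException from the question altogether, the null check can be wrapped in a small helper. This is a sketch that assumes the HtmlAgilityPack NuGet package is referenced; SafeSelect and InnerTextOrDefault are illustrative names, not library members.

```csharp
using HtmlAgilityPack;

// Hypothetical helper for illustration only.
public static class SafeSelect
{
    // Returns the trimmed inner text of the first node matching xpath,
    // or fallback when the XPath matches nothing.
    public static string InnerTextOrDefault(string html, string xpath, string fallback = "")
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null (it does not throw) when there is no match.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node == null ? fallback : node.InnerText.Trim();
    }
}
```

A short XPath anchored on one distinctive attribute, such as "//span[@class='namePartContent']", tends to be less fragile than spelling out every ancestor, since any mismatch along a long absolute path makes the whole query return null.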
I have a HTML file that looks like this:
<div class="user_meals">
<div class="name">Name Surname</div>
<div class="day_meals">
<div class="meal">First Meal</div>
</div>
<div class="day_meals">
<div class="meal">Second Meal</div>
</div>
<div class="day_meals">
<div class="meal">Third Meal</div>
</div>
<div class="day_meals">
<div class="meal">Fourth Meal</div>
</div>
<div class="day_meals">
<div class="meal">Fifth Meal</div>
</div>
This code repeats a few times.
I want to get the name and surname that sit inside the <div> tag with class "name".
This is my code using HtmlAgilityPack:
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(@"C:\workspace\file.html");
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div[@class='name']"))
{
string vaule = node.InnerText;
}
But it doesn't work. Visual Studio throws an exception:
An unhandled exception of type 'System.NullReferenceException'.
You are using the wrong method to load HTML from a path: LoadHtml expects HTML markup, not the location of a file. Use Load instead.
The error you are getting is quite misleading, because none of the properties are null and the standard tips from "What is a NullReferenceException, and how do I fix it?" don't apply.
Essentially, SelectNodes correctly returns null because no elements match the query, and foreach then throws on the null collection.
Fixed code:
HtmlDocument doc = new HtmlDocument();
// either doc.Load(@"C:\workspace\file.html") or pass HTML:
doc.LoadHtml("<div class='user_meals'><div class='name'>Name Surname</div></div> ");
var nodes = doc.DocumentNode.SelectNodes("//div[@class='name']");
// SelectNodes returns null if nothing found - may need to check
if (nodes == null)
{
throw new InvalidOperationException("Where all my nodes???");
}
foreach (HtmlNode node in nodes)
{
string vaule = node.InnerText;
vaule.Dump(); // Dump() is a LINQPad extension method; outside LINQPad use Console.WriteLine(vaule)
}
I want to show a specific section of a html-page in a textbox in a WP7-app (C#). After a bit of searching online I found this:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml("http://www.positief-project.be/?p=532");
string links = doc.DocumentNode
.Descendants("section")
.Where(section => section.Attributes["class"] != null &&
section.Attributes["class"].Value == "article-content").ToString();
txbContent.Text = links;
This doesn't give an error, but doesn't work either. How can I make it show in the text box?
Is jQuery an option?
HTML
<div class="section">
<div class="article-content">some foo 1</div>
<div class="article-content">some foo 2</div>
<div class="article-content">some foo 3</div>
<div class="article-content">some foo 4</div>
</div>
<br>
<input type="text" id="tbContent" />
jQuery
$(document).ready(function () {
var content = ''; // start empty; an uninitialized variable would prepend "undefined"
$('.article-content').each(function(i, obj){
content += obj.innerHTML;
});
$('#tbContent').val(content);
});
See this fiddle http://jsfiddle.net/rodhartzell/Fk2xM/
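For a C# route closer to the code in the question, the sketch below keeps the LINQ-style query but picks one matching element and reads its InnerText, instead of calling ToString() on the whole query. It assumes the HtmlAgilityPack NuGet package is referenced and that the page markup has already been downloaded into a string (LoadHtml parses markup; it does not fetch a URL). SectionReader and GetSectionText are made-up names.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration only.
public static class SectionReader
{
    // Returns the text of the first <section> with the given class, or "" if none exists.
    public static string GetSectionText(string html, string cssClass)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // html must be markup, not a URL

        HtmlNode section = doc.DocumentNode
            .Descendants("section")
            // GetAttributeValue returns the supplied default when the attribute is absent.
            .FirstOrDefault(s => s.GetAttributeValue("class", "") == cssClass);

        return section == null ? "" : section.InnerText.Trim();
    }
}
```

The result could then be assigned to the text box, for example txbContent.Text = SectionReader.GetSectionText(downloadedHtml, "article-content"); where downloadedHtml is a placeholder for however the page was fetched.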