I need to extract the value of one specific td from a table using XPath, but the code always returns null. How can I fix this?
var location = GetLocation(document.Result.DocumentNode.SelectSingleNode("//*[@id='detailTabTable']/tbody/tr[3]/td[2]"));
and the code:
private string GetLocation(HtmlNode h)
{
try
{
string location = null;
if (h == null)
{
location = "N/A";
}
else
{
location = h.InnerText;
location = location.Substring(0, location.IndexOf(",", StringComparison.InvariantCulture));
}
return location;
}
catch (Exception ex)
{
log.ErrorFormat("Error in Link Data Repository {0} in Parse Links {1}", ex.Message, ex.StackTrace);
throw new Exception(ex.Message);
}
}
And a small, simple table:
<table id="detailTabTable" width="99%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td class="detailTabContentLt">Current List Price:</td>
<td class="detailTabContentPriceRt">
<span class="aiDetailCurrentPrice">AED 6,600,000</span>
</td>
</tr>
<tr>
<td class="detailTabContentLt" style="white-space: nowrap;">Plot size (Sq. Ft.):</td>
<td class="detailTabContentRt">N/A</td>
</tr>
<tr>
<td class="detailTabContentLt" valign="top">Locality</td>
<td class="detailTabContentRt">Dubai, Dubai</td>
</tr>
<tr>
<td colspan="2"></td>
</tr>
</table>
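I have just tested your code. As mentioned in the comments, once you remove tbody from your XPath expression everything works fine. Browsers insert tbody into the DOM they display, but HtmlAgilityPack parses the raw markup, which has no tbody element here. This worked fine for me: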
private static void htmlAgilityPackTest()
{
string html = " <table id=\"detailTabTable\" width=\"99%\" border=\"0\" cellspacing=\"0\" cellpadding=\"0\"><tr><td class=\"detailTabContentLt\">Current List Price:</td><td class=\"detailTabContentPriceRt\"><span class=\"aiDetailCurrentPrice\">AED 6,600,000</span></td> </tr><tr> <td class=\"detailTabContentLt\" style=\"white-space: nowrap;\">Plot size (Sq. Ft.):</td><td class=\"detailTabContentRt\">N/A</td></tr> <tr><td class=\"detailTabContentLt\" valign=\"top\">Locality</td> <td class=\"detailTabContentRt\">Dubai, Dubai</td> </tr> <tr><td colspan=\"2\"></td> </tr> </table>";
HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
document.LoadHtml(html);
var node = document.DocumentNode.SelectSingleNode("//*[@id='detailTabTable']/tr[3]/td[2]");
string location = GetLocation(node);
Console.WriteLine("Location: " + location);
}
In case I misunderstood anything please let me know.
You can use Fizzler and select elements the CSS way :)
http://blog.simontimms.com/2014/02/24/parsing-html-in-c-using-css-selectors/
Related
I am using RazorEngine to compile an HTML template and I get the following error:
RazorEngine.Templating.TemplateCompilationException: Errors while compiling a Template.
Please try the following to solve the situation:
* If the problem is about missing/invalid references or multiple defines either try to load
the missing references manually (in the compiling appdomain!) or
Specify your references manually by providing your own IReferenceResolver implementation.
See https://antaris.github.io/RazorEngine/ReferenceResolver.html for details.
Currently all references have to be available as files!
* If you get 'class' does not contain a definition for 'member':
try another modelType (for example 'null' to make the model dynamic).
NOTE: You CANNOT use typeof(dynamic) to make the model dynamic!
Or try to use static instead of anonymous/dynamic types.
private string RunCompile(string rootPath, string templateName, EmailViewModel model, string templateKey = null)
{
string result = string.Empty;
if (string.IsNullOrEmpty(rootPath) || string.IsNullOrEmpty(templateName) || model == null) return result;
string templateFilePath = Path.Combine(rootPath, templateName);
if (File.Exists(templateFilePath))
{
string template = File.ReadAllText(templateFilePath);
if (string.IsNullOrEmpty(templateKey))
{
templateKey = Guid.NewGuid().ToString();
}
result = Engine.Razor.RunCompile(template, templateKey, typeof(EmailViewModel), model);
}
return result;
}
HTML template
@model ViewModel.EmailViewModel
<table bgcolor="white" align="left" border="0" cellspacing="0" cellpadding="0" style="color:#5E5E5E;font-size:14px;font-family:Proxima Nova,Century Gothic,Arial,Verdana,sans-serif;width:100%;border:1px solid #F0F0F0; padding: 32px;">
<tbody>
<tr align="center">
<td style="padding-top:30px;padding-bottom:32px;">
<img src="https://ok6static.oktacdn.com/bc/image/fileStoreRecord?id=fs01fviisxo2dNOCM2p7" alt="monash university logo"/>
</td>
</tr>
<tr>
<td style="padding-top:24px;">
<div style="font-family:Consolas,Courier New,Courier,monospace;text-align:left;margin:10px;">
<h1>
Thank you for the registration
</h1>
</div>
</td>
</tr>
<tr>
<td style="padding-top:24px;">
<div style="font-family:Consolas,Courier New,Courier,monospace;text-align:left;margin:10px;">
<p>Please click on the below link to confirm email address</p>
</div>
</td>
</tr>
<tr>
<td style="padding-top:24px;">
<div style="font-family:Consolas,Courier New,Courier,monospace;text-align:left;margin:10px;">
@Model.Message
</div>
</td>
</tr>
</tbody>
</table>
I had the same issue and resolved it by removing the line @model ViewModel.EmailViewModel from the template.
I have this HTML with a table.
I can get the "col1" and "col2" values, but I don't know how to also get the values of "data-index" and "data-name":
<table class="footable table" id="footable">
<tbody>
<tr class="trclass red" data-index="123" data-name="Apple">
<td class="col1" >Green</td>
<td class="col2" >1.25</td>
</tr>
</tbody>
</table>
What I have tried:
public static void Main()
{
var html =
@"<html>
<table id='footable'>
<tbody>
<tr class='trclass red' data-index='123' data-name='Apple'>
<td class='col1'>Green</td>
<td class='col2'>1.25</td>
</tr>
</tbody>
</table></html>";
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
var tbody = htmlDoc.DocumentNode.SelectNodes("//table[contains(@id, 'foo')]//tr//td");
foreach(var nob in tbody)
{
Console.Write(nob.InnerHtml);
}
}
I know that I can use nob.Attributes["data-index"], but those attributes are on the parent tr, not on the td nodes that hold "Green" and "1.25".
I have the following HTML, and I used Google Chrome to get the XPath.
<DIV id=TasheelPaymentCtrl1_dvPayment>
<TABLE border=1 cellSpacing=0 borderColor=black cellPadding=7 width=625 align=center>
<TBODY>
<TR>
<TD class=ReceiptHeadArbCenterHead1 width=320>المسمى </TD>
<TD class=ReceiptHeadArbCenterHead1 width=75>دفع إلى</TD>
<TD class=ReceiptHeadArbCenterHead1 width=75>القيمة</TD>
<TD class=ReceiptHeadArbCenterHead1 width=75>الكمية</TD>
<TD class=ReceiptHeadArbCenterHead1 width=75>المجموع</TD></TR>
<TR>
<TD class=ReceiptHeadArbCenterHead>رسوم وزارة العمل</TD>
<TD class=ReceiptValueArbCenter>MOFI</TD>
<TD class=ReceiptValueArbCenter>3</TD>
<TD class=ReceiptValueArbCenter>1</TD>
<TD class=ReceiptValueArbCenter>3</TD>
<TR>
<TD class=ReceiptHeadArbCenterHead>رسوم الدرهم الإلكتروني</TD>
<TD class=ReceiptValueArbCenter>MOFI</TD>
<TD class=ReceiptValueArbCenter>3</TD>
<TD class=ReceiptValueArbCenter>1</TD>
<TD class=ReceiptValueArbCenter>3</TD>
<TR>
<TD class=ReceiptHeadArbCenterHead>رسوم مراكز الخدمة </TD>
<TD class=ReceiptValueArbCenter>MOFI</TD>
<TD class=ReceiptValueArbCenter>47</TD>
<TD class=ReceiptValueArbCenter>1</TD>
<TD class=ReceiptValueArbCenter>47</TD>
<TR>
<TD class=ReceiptHeadArbCenterHead1 colSpan=4>المجموع</TD>
<TD class=ReceiptValueArbCenter>53</TD></TR></TBODY></TABLE></DIV>
I want to extract the values 3, 3, 47 and 53.
I tried using these XPath expressions:
var gf = doc.DocumentNode.SelectNodes("//div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[2]/td[5]");
foreach (var node in gf)
{
Console.WriteLine(node.InnerText); //output: "3"
}
var sf = doc.DocumentNode.SelectNodes("//div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[3]/td[5]");
foreach (var node in sf)
{
Console.WriteLine(node.InnerText); //output: "3"
}
var tf = doc.DocumentNode.SelectNodes("//div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[4]/td[5]");
foreach (var node in tf)
{
Console.WriteLine(node.InnerText); //output: "47"
}
var Allf = doc.DocumentNode.SelectNodes("//div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[5]/td[2]");
foreach (var node in Allf )
{
Console.WriteLine(node.InnerText); //output: "53"
}
but I am getting a null reference exception.
I used the Google Chrome developer tools to copy the XPath. How can I extract the values?
My question is: why am I getting a null reference exception? Is there a mistake in the XPath?
Please help me.
As you have discovered, some of your XPath expressions don't work because the <tr> tags are not all closed.
Therefore, you will need to cater for this in your XPath expressions:
//div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[2]/td[5] - no change
//div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[3]/td[5] - should be //div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[2]/tr/td[5]
//div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[4]/td[5] - should be //div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[2]/tr/tr/td[5]
//div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[5]/td[2] - should be //div[@id='TasheelPaymentCtrl1_dvPayment']/table/tbody/tr[2]/tr/tr/tr/td[2]
On the ASP page I have created this function to check whether two strings are equal:
<script type="text/javascript">
function ButtonClick(a, b)
{
if (a == b)
{
alert("Correct!");
}
else
{
alert("Wrong!");
}
}
</script>
Then I created this function, which I call when the page loads, to display everything:
public void FillPageSpelling()
{
ArrayList videoList1 = new ArrayList();
if (!IsPostBack)
{
videoList1 = ConnectionClass.GetSpelling(1);
}
else
{
int i = Convert.ToInt32(DropDownList1.SelectedValue);
videoList1 = ConnectionClass.GetSpelling(i);
}
StringBuilder sb = new StringBuilder();
foreach (Spelling sp in videoList1)
{
sb.Append(
string.Format(
@"<table class='VideoTable'>
<tr>
<td align='center'><font face='Verdana'> <font size='3'>Level:</font> <font size='2'>{3}</font></font></td>
</tr>
<tr>
<td align='center'><font face='Verdana'> <font size='3'>Sentence:</font> <font size='2'>{1}</font></font></td>
</tr>
<tr>
<td align='center'><font size='3'>Sound:<audio controls><source src=sound/{2}></audio>
<font face='Verdana'> <font size='2'> </font> </font></td>
</tr>
<tr>
<tr><td align='center'><font face='Verdana'> <font size='3'>Write the word here: <input type=text name=TextBox1></font></font> </td> </tr>
<td><button name=btnCheck type=button onclick='ButtonClick(TextBox1.Text, lblWord.Text)'>Check</button> </td>
<td><button name=btnCheat type=button onclick='ButtonClick(TextBox1.Text, lblWord.Text)'>Cheat</button> </td>
</tr>
<tr>
<td align='center'><font face='Verdana'> <font size='3'>Word:</font> <font size='2'><asp:Label ID=lblWord runat=server>{4}</asp:Label></font></font></td>
</tr>
</br>
</table>", sp.SID, sp.Sentence, sp.Sound, sp.Level, sp.Word));
}
lblOutput.Text = sb.ToString();
}
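Well, it turns out I made a mistake here: <td><button name=btnCheck type=button onclick='ButtonClick(TextBox1.Text, lblWord.Text)'>Check</button> </td>
The .Text property belongs to ASP.NET server controls and is not available in the browser's JavaScript. I changed the label lblWord to a textbox, TextBox2, and this is how the function should be called (getElementById requires the inputs to have matching id attributes):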
<input type=button value='Check' class='p-userButton' onClick='ButtonClick(document.getElementById(""TextBox1"").value, document.getElementById(""TextBox2"").value);'/>
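The comparison itself can be kept separate from the DOM lookups, which makes it easier to test. A minimal sketch, assuming that surrounding whitespace and letter case should not count as spelling mistakes (that rule and the helper name isSameWord are assumptions, not part of the original page):

```javascript
// Hypothetical helper: compares what the user typed with the expected word.
// Assumption: surrounding whitespace and letter case are ignored.
function isSameWord(typed, expected) {
  return String(typed).trim().toLowerCase() === String(expected).trim().toLowerCase();
}

// In the browser, ButtonClick would receive the two input values, e.g.
// ButtonClick(document.getElementById("TextBox1").value,
//             document.getElementById("TextBox2").value);
function ButtonClick(a, b) {
  const message = isSameWord(a, b) ? "Correct!" : "Wrong!";
  // alert exists only in browsers; fall back to console.log elsewhere.
  if (typeof alert === "function") {
    alert(message);
  } else {
    console.log(message);
  }
  return message;
}
```

If an exact, case-sensitive match is wanted instead, the body of isSameWord can be reduced to a plain === comparison.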
JavaScript functions do not have typed parameters. Try your function like this:
function ButtonClick(a, b)
{
if (a == b)
{
alert("Correct!");
}
else
{
alert("Wrong!");
}
}
function ButtonClick(string a, string b)
JavaScript does not support parameters with a data type. Try:
function ButtonClick(a, b)
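Because the parameter list cannot declare types, any type check has to happen at run time. A minimal sketch, assuming the function should reject non-string arguments (the name compareStrings and the decision to throw are illustrative assumptions):

```javascript
// Hypothetical variant of the comparison with a run-time type check.
function compareStrings(a, b) {
  // typeof is evaluated at run time; JavaScript has no declared parameter types.
  if (typeof a !== "string" || typeof b !== "string") {
    throw new TypeError("compareStrings expects two strings");
  }
  // Strict equality avoids the implicit conversions that == performs.
  return a === b;
}
```

With the loose == operator, the number 1 and the string "1" compare as equal, which is one reason to prefer === when both values are expected to be strings.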
Here is the HTML code:
<table style="border:1px solid #000">
<tr style="background:#ddd;">
<td width="150">TableEle1</td>
<td width="150">TableEle2</td>
<td width="150">TableEle3</td>
<td width="150">TableEle4</td>
<td width="150">TableEle5</td>
<td width="150">TableEle6</td>
<td width="150">TableEle7</td>
<td width="150">TableEle8</td>
</tr>
And here is the code I use to try to extract TableEle1, without success:
htmlHelper.SetNode(@"//td/text()='TableEle1'");
Is there any advice for me?
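You can use a blend of HtmlAgilityPack and LINQ to get the desired td node.
// Requires: using System.Linq; using HtmlAgilityPack;
HtmlDocument document = new HtmlDocument();
document.LoadHtml("[your HTML string]");
// SelectNodes returns null when nothing matches, so guard before filtering.
var textNodes = document.DocumentNode.SelectNodes("//td/text()");
// Each match is a text node; its ParentNode is the enclosing td.
var tdNode = textNodes?.Where(s => s.InnerText == "TableEle1").Select(s => s.ParentNode).FirstOrDefault();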
Hope this helps!