How to use Majestic13 for parsing HTML? - c#

I have an HTML document with the following structure:
<table width="85%" border="1" height="315" align="center">
<tr>
<td colspan="2" align="center"><font color="#400040"><b>Register No</b></font></td>
<th colspan="2"><font color="Brown">42209104069</font></th>
<td colspan="2" align="center"><font color="#400040"><b>Name</b></font></td>
<th colspan="2"><font color="Brown">SATHISH KUMAR R</font></th>
</tr>
<tr>
<td colspan="2"><font color="blue"><center><b>Subject</b></font></td>
<td colspan="2"><font color="blue"><center><b>Credits</b></font></td>
<td colspan="2"><font color="blue"><center><b>Grade</b></font></td>
<td colspan="2"><font color="blue"><center><b>Result</b></font></td>
</tr>
<tr>
<td colspan="2"><center> CS2301</td> //1
<td colspan="2"><center> 3</td> //2
<td colspan="2"><center> E</td> //3
<td colspan="2"><center> PASS</td> //4
</tr>
</table>
I want to extract the contents of the cells on the lines marked //1, //2, //3 and //4 and save them to a string. How can I do this using Majestic13 in my C# project?

PM> Install-Package Majestic13
// The four target cells (marked //1 to //4 in the question) carry class="a" here
// so that the visitor below can select them. Quotes are doubled inside the verbatim string.
var html = @"<table width=""85%"" border=""1"" height=""315"" align=""center"">
<tr>
<td colspan=""2"" align=""center""><font color=""#400040""><b>Register No</b></font></td>
<th colspan=""2""><font color=""Brown"">42209104069</font></th>
<td colspan=""2"" align=""center""><font color=""#400040""><b>Name</b></font></td>
<th colspan=""2""><font color=""Brown"">SATHISH KUMAR R</font></th>
</tr>
<tr>
<td colspan=""2""><font color=""blue""><center><b>Subject</b></font></td>
<td colspan=""2""><font color=""blue""><center><b>Credits</b></font></td>
<td colspan=""2""><font color=""blue""><center><b>Grade</b></font></td>
<td colspan=""2""><font color=""blue""><center><b>Result</b></font></td>
</tr>
<tr>
<td colspan=""2"" class=""a""><center> CS2301</td>
<td colspan=""2"" class=""a""><center> 3</td>
<td colspan=""2"" class=""a""><center> E</td>
<td colspan=""2"" class=""a""><center> PASS</td>
</tr>
</table>";
var parser = new HtmlParser();
var node = parser.Parse(html);
// Collect every <td> that has a class attribute.
var finder = new FindTagsVisitor(tag => tag.Name == "td" && tag.Attributes.ContainsKey("class"));
node.AcceptVisitor(finder);

Related

The width of the row for HTML table is not displayed as expected when sent as an email using SMTP client

I am sending HTML with CSS styling as an email body through an SMTP client. In the received email, the rows are wider than expected and the spacing between the rows is larger than expected.
I have also set
'IsBodyHtml = true'
'BodyEncoding = System.Text.Encoding.UTF7'
but nothing seems to work.
Please suggest a fix.
Below is the HTML code. A screenshot of the email is at https://i.stack.imgur.com/jZbpJ.png
Hello Team,
This needs your input.
<HTML>
<DIV>
<TABLE BORDER ="1" WIDTH = 1000 >
<TR>
<TH align ="left" bgcolor="#D3D3D3"><H3><B>Test Details</H3></TH>
</TR>
<TR>
<TD>
<Div>
<table align="left" width=100%>
<TR bgcolor="#D3D3D3">
<TD class="ColumnBorder" align="right"></TD>
<TD class="ColumnBorder" align="right">Test1</TD>
<TD class="ColumnBorder" align="right">Test2</TD>
<TD class="ColumnBorder" align="right">Price</TD>
</TR>
<TR class="RowBorder">
<TD align="left" BGCOLOR="#99B4D1" width="25%">Price1</TD>
<TD BGCOLOR="D8E6FE" align="right" width="25%">27,082.16</TD>
<TD BGCOLOR="#D8E6FE" align="right" width="25%">64.23</TD>
<TD BGCOLOR="#D8E6FE" align="right" width="25%">49,167.00</TD>
</TR>
<TR class="RowBorder">
<TD align="left" BGCOLOR="#99B4D1" width="25%">Price2</TD>
<TD BGCOLOR="#D8E6FE" align="right" width="25%">10.00</TD>
<TD BGCOLOR="#D8E6FE" align="right" width="25%">20.00</TD>
<TD BGCOLOR="#D8E6FE" align="right" width="25%">30.00</TD>
</TR>
</TABLE>
<TABLE align="left" style = "border-width:1px" width=100%>
<TR>
<TD align="left" BGCOLOR="#99B4D1" width="25%">Price Testing TTTTTT xxxxxxx</TD>
<TD BGCOLOR="#FFC000" align="right" width="25%">Value1</TD>
<TD BGCOLOR="#99B4D1" align="right" width="25%">Some Value2</TD>
<TD BGCOLOR="#D8E6FE" align="right" width="25%"></TD>
</TR>
<TR>
<TD align="left" BGCOLOR="#99B4D1" width="25%">The final Status</TD>
<TD BGCOLOR="#D8E6FE" align="right" width="25%">Success</TD>
<TD BGCOLOR="#99B4D1" align="right" width="25%">The Previous Status</TD>
<TD BGCOLOR="#D8E6FE" align="right" width="25%">Failed</TD>
</TR>
<TR>
<TD BGCOLOR="#99B4D1">This is the Correct Version</TD>
<TD BGCOLOR="#D8E6FE" align="right">1</TD>
<TD BGCOLOR="#99B4D1" align="right">2</TD>
<TD BGCOLOR="#D8E6FE" align="right">3</TD>
</TR>
</TABLE>
</Div>
</TD>
</TR>
<TR>
<TD><TABLE width = "100%">
<TH colspan = "5" bgcolor ="#D3D3D3"><B>The main Price</B></TH>
<TR bgcolor ="#D3D3D3">
<TD align="left">Description</TD>
<TD align="right">Final Price(>=)</TD>
<TD align="right">Min Discount</TD>
<TD align="right">Days</TD>
<TD align="right">Max Discount</TD>
</TR>
<TR bgcolor = #00B050 style="display:#Testing1">
<TD align="left">Testing1</TD>
<TD align="right">34,267.81</TD>
<TD align="right">59.52</TD>
<TD align="right">NA</TD>
<TD align="right">N/A</TD>
</TR>
<TR bgcolor = #00B050 style="display:#Testing2">
<TD align="left">Testing2</TD>
<TD align="right">99,210.81</TD>
<TD align="right">97.52</TD>
<TD align="right">NA</TD>
<TD align="right">N/A</TD>
</TR>
<TR bgcolor = #CCFFCC style="display:#Testing3">
<TD align="left">Testing3</TD>
<TD align="right">59,190.81</TD>
<TD align="right">57.52</TD>
<TD align="right">NA</TD>
<TD align="right">N/A</TD>
</TR>
<TR bgcolor = #CCFFCC style="display:#Testing4">
<TD align="left">Testing4</TD>
<TD align="right">79,270.51</TD>
<TD align="right">59.52</TD>
<TD align="right">NA</TD>
<TD align="right">N/A</TD>
</TR>
<TR bgcolor = #FFFF99 style="display:#Testing5">
<TD align="left">Testing5</TD>
<TD align="right">98,141.30</TD>
<TD align="right">59.98</TD>
<TD align="right">NA</TD>
<TD align="right">N/A</TD>
</TR>
<TR bgcolor = #FFFF99 style="display:#Testing6">
<TD align="left">Testing6</TD>
<TD align="right">10,011.80</TD>
<TD align="right">61.23</TD>
<TD align="right">NA</TD>
<TD align="right">N/A</TD>
</TR>
<TR bgcolor = #FF0000 style="display:#Testing7">
<TD align="left">Testing7</TD>
<TD align="right">30.00</TD>
<TD align="right">190</TD>
<TD align="right">NA</TD>
<TD align="right">N/A</TD>
</TR>
</TR>
</TABLE>
</TD>
</TR>
</TABLE>
</DIV>
<STYLE Type ="text/css">
table{border-collapse:collapse}
td{
font-family: Microsoft Sans Serif;
font-size:8.25pt;
}
th{
font-family: Microsoft Sans Serif;
font-size:8.25pt;
width:13em;}
tr{
height:2em}
.RowBorder{border-bottom:none;border-top:none}
.ColumnBorder{border-Right-width:1px}
</STYLE>
</HTML><br/>Disclaimer: This is for reference only.<br/>
<br/>This is just for testing purpose.<br/>

How to get line number of specific word from HTML file

How can I get the line number of the string Subtotal in an HTML file using Visual C#? The HTML of the file is shown below.
My HTML:
<tr>
<td>
<table width="100%"
class="sales">
<!-- Headers -->
<tr>
<th align="center">Qty</th>
<th align="center">Item</th>
<th align="right">Price</th>
<th align="right">Amount</th>
</tr>
<!-- Rows -->
<tr class="saleline">
<td align="left">144</td>
<td align="left">0002</td>
<td align="right">5.00</td>
<td align="right">720.00</td>
</tr>
<tr class="saleline">
<td align="left">8</td>
<td align="left">0788</td>
<td align="right">1,200.00</td>
<td align="right">9,600.00</td>
</tr>
<tr class="saleline">
<td align="left">12</td>
<td align="left">0013</td>
<td align="right">15.00</td>
<td align="right">180.00</td>
</tr>
<tr class="saleline">
<td align="left">144</td>
<td align="left">120p CR SR 115/=</td>
<td align="right">115.00</td>
<td align="right">16,560.00</td>
</tr>
<!-- Totals -->
<tr>
<td align="right"
colspan="3">Subtotal</td>
<td align="right">27,060.00</td>
</tr>
<tr>
<td align="right"
colspan="3">
<b>TOTAL</b>
</td>
<td align="right">
<b>27,060.00</b>
</td>
</tr>
<tr>
<td align="right"
colspan="3">Less Payment</td>
<td align="right">20,000.00</td>
</tr>
<tr class="total">
<td align="right"
colspan="3">
<strong>Balance Due</strong>
</td>
<td align="right">7,060.00</td>
</tr>
</table>
</td>
</tr>
int lineNumber = 0;
string line;
// Read the file line by line and print every line that contains the keyword.
using (var file = new System.IO.StreamReader("c:\\test.html"))
{
    while ((line = file.ReadLine()) != null)
    {
        lineNumber++; // count from 1 so the output matches editor line numbers
        if (line.Contains("Subtotal"))
        {
            Console.WriteLine(lineNumber + ": " + line);
        }
    }
}
See also: Search text file using C# and display the line number and the complete line that contains the search keyword
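The same search can also be written with File.ReadLines and LINQ. This is only a minimal sketch: the class name LineFinder and the method name Find are made up for illustration, and the match is a plain case-sensitive substring test on each physical line of the file.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class LineFinder
{
    // Hypothetical helper: returns (1-based line number, line text) for every
    // line of the file that contains the keyword.
    public static IEnumerable<(int Number, string Text)> Find(string path, string keyword) =>
        File.ReadLines(path)                                        // streams the file lazily
            .Select((text, index) => (Number: index + 1, Text: text))
            .Where(entry => entry.Text.Contains(keyword));
}
```

Calling LineFinder.Find(@"c:\test.html", "Subtotal") and printing each pair gives the same kind of output as the loop. Because the file is read lazily, the first match is available without loading the whole file into memory.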

Different tables with the same width

I am dynamically generating tables one after another using ASP.NET C#. Each table has a different section.
td.section{width:200px;}
<table>
<tr>
<td colspan="2" class="section">Sec 1</td>
</tr>
<tr>
<td>1</td>
<td>s</td>
</tr>
</table>
<table>
<tr>
<td class="section">Sec 2</td>
</tr>
<tr>
<td>1</td>
</tr>
</table>
<table>
<tr>
<td colspan="3" class='section'>Sec 3</td>
</tr>
<tr>
<td colspan="3">1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
</table>
I want every section to have the same width, but I am unable to achieve this. Can you guide me on how to do it?
I added the width rule at the top inside a style tag, and I think this gives what you're aiming for. Note that the property is spelled "width", with a "d", and that your second tag was missing a "<".
Finished product:
<style>
td.section{width:200px;}
</style>
<table>
<tr>
<td colspan='2' class='section'>Sec 1</td>
</tr>
<tr>
<td>1</td>
<td>s</td>
</tr>
</table>
<table>
<tr>
<td class='section'>Sec 2</td>
</tr>
<tr>
<td>1</td>
</tr>
</table>
<table>
<tr>
<td colspan='3' class='section'>Sec 3</td>
</tr>
<tr>
<td colspan='3'>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
</table>
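Since the tables are generated from C#, the width can also be set while the markup is built. The following is a rough sketch under assumptions that are not in the question: the helper name SectionTableSketch.Render is hypothetical, every table is given the same fixed pixel width together with table-layout:fixed, and the section header spans the widest row of its table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;

static class SectionTableSketch
{
    // Hypothetical helper: renders one section as a table with a fixed overall
    // width so that consecutive tables line up whatever their column count.
    public static string Render(string title, IEnumerable<string[]> rows, int widthPx = 200)
    {
        var rowList = rows.ToList();
        // The section header spans the widest row (at least one column).
        int columns = Math.Max(1, rowList.Count == 0 ? 1 : rowList.Max(r => r.Length));

        var html = new StringBuilder();
        html.AppendLine($"<table style=\"width:{widthPx}px; table-layout:fixed\">");
        html.AppendLine($"<tr><td colspan=\"{columns}\" class=\"section\">{WebUtility.HtmlEncode(title)}</td></tr>");
        foreach (var row in rowList)
        {
            html.Append("<tr>");
            for (int i = 0; i < row.Length; i++)
            {
                // Stretch the last cell of a short row across the remaining columns.
                int span = i == row.Length - 1 ? columns - i : 1;
                html.Append(span > 1 ? $"<td colspan=\"{span}\">" : "<td>");
                html.Append(WebUtility.HtmlEncode(row[i])).Append("</td>");
            }
            html.AppendLine("</tr>");
        }
        html.AppendLine("</table>");
        return html.ToString();
    }
}
```

For the third section, Render("Sec 3", new[] { new[] { "1" }, new[] { "1", "1", "1" } }) produces a header cell with colspan 3, a single-cell row stretched across three columns, and a three-cell row, all inside a table that is 200px wide.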

Export HTML table with CSS and Arabic data to Excel in ASP.NET C#

I have the following table in ASP.NET and I want to generate an Excel file from it that keeps the CSS and shows the Arabic data in the report. I am using the code below, but the Arabic data does not appear in the Excel file and the CSS is not applied. I am using ASP.NET with C#.
Response.ContentType = "application/x-msexcel";
Response.AddHeader("Content-Disposition", "attachment; filename=ExcelFile.xls");
Response.ContentEncoding = System.Text.Encoding.UTF8;
StringWriter tw = new System.IO.StringWriter();
HtmlTextWriter hw = new HtmlTextWriter(tw);
tblid.RenderControl(hw);
Response.Write(tw.ToString());
Response.End();
<style type="text/css" >
.repFont
{
-webkit-transform: rotate(-90deg);
-moz-transform: rotate(-90deg);
width: 53px;
}
table {
border: 1px solid black;
border-spacing:0px;
}
td {
border: 1px solid black;
border-spacing:0px;
font-size:11PX;
text-align:center;
padding: 0px;
}
.auto-style1 {
height: 27px;
}
.bgtd1 {
background-color:#FFFF99;
}
.title {
color:#CC3333;
}
.titleMale {
color:#3366FF;
}
.titleTotal {
font-weight:bold;
font-size:12PX;
}
.maintitle {
font-size:12px;
}
.maintileArabic {
color:#CC3333;
padding-left: 187px;
}
.maintileEng {
color:#3366FF;
padding-left: 113px;
padding-bottom: 11px;
}
</style>
<table id="tblid" runat="server" border="1" >
<tr class="bgtd1">
<td colspan="2" rowspan="2" class="title">Region</td>
<td colspan="3" class="title"><span id="result_box" lang="ar" xml:lang="ar">المجموع العام</span></td>
<td colspan="2" class="title"> سلطانالخارج</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" class="title">مسندم</td>
<td colspan="2" rowspan="2" class="title"><span id="result_box2" lang="ar" xml:lang="ar">منطقة</span></td>
</tr>
<tr class="bgtd1">
<td colspan="3" class="auto-style1">Grand Total</td>
<td colspan="2" class="auto-style1"> </td>
<td colspan="2" class="auto-style1">
Musandam </td>
<td colspan="2" class="auto-style1"> Al-wusta </td>
<td colspan="2" class="auto-style1"> Alburaimi </td>
<td colspan="2" class="auto-style1"> Al-Dhahira </td>
<td colspan="2" class="auto-style1"> Dohfar </td>
<td colspan="2" class="auto-style1"> Al-dhakhila </td>
<td colspan="2" class="auto-style1"> Al-sharqiya(n) </td>
<td colspan="2" class="auto-style1"> Al-sharqiyah </td>
<td colspan="2" class="auto-style1"> Albatiniah(s) </td>
<td colspan="2" class="auto-style1"> Al-Albatinah </td>
<td colspan="2" class="auto-style1"> Muscat </td>
</tr>
<tr>
<td colspan="2" class="bgtd1">Gender</td>
<td class="title"><span id="result_box8" lang="ar" xml:lang="ar"> مجموع </span></td>
<td >١</td>
<td class="titleMale">ذ</td>
<td >١</td>
<td class="titleMale">ذ</td>
<td >١</td>
<td class="titleMale">ذ</td>
<td >١</td>
<td class="titleMale">ذ</td>
<td>١</td>
<td class="titleMale" >ذ</td>
<td >١</td>
<td class="titleMale">ذ</td>
<td >١</td>
<td class="titleMale">ذ</td>
<td >١</td>
<td class="titleMale">ذ</td>
<td >١</td>
<td class="titleMale">ذ</td>
<td>١</td>
<td class="titleMale">ذ</td>
<td>١</td>
<td class="titleMale">ذ</td>
<td>١</td>
<td class="titleMale">ذ</td>
<td >١</td>
<td class="titleMale">ذ</td>
<td colspan="2" class="bgtd1"><span id="result_box3" lang="ar" xml:lang="ar">جنس</span></td>
</tr>
<tr>
<td colspan="2" class="bgtd1">Specialization</td>
<td class="title">Total</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td>F</td>
<td class="titleMale">M</td>
<td colspan="2" class="bgtd1" ><span id="result_box4" lang="ar" xml:lang="ar">تخصص</span></td>
</tr>
<tr>
<td rowspan="4" class="bgtd1"><div class="repFont bgtd1">Foundation</div></td>
<td class="bgtd1">Engnieering</td>
<td >120</td>
<td>48</td>
<td class="titleMale">72</td>
<td>0</td>
<td class="titleMale">0</td>
<td>0</td>
<td class="titleMale">0</td>
<td>1</td>
<td class="titleMale">1</td>
<td>0</td>
<td class="titleMale">0</td>
<td>1</td>
<td class="titleMale">1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td class="titleMale">1</td>
<td>0</td>
<td class="titleMale">0</td>
<td>1</td>
<td class="titleMale">1</td>
<td>0</td>
<td class="titleMale">0</td>
<td>1</td>
<td class="titleMale">1</td>
<td>0</td>
<td class="titleMale">0</td>
<td class="bgtd1">الهندسة </td>
<td rowspan="4" class="bgtd1"><div class="repFont "> مؤسسة </div></td>
</tr>
<tr>
<td class="bgtd1">Information Technology</td>
<td>213</td>
<td>147</td>
<td class="titleMale">66</td>
<td>1</td>
<td class="titleMale">2</td>
<td>0</td>
<td class="titleMale">0</td>
<td>1</td>
<td class="titleMale">1</td>
<td>0</td>
<td class="titleMale">0</td>
<td>1</td>
<td class="titleMale">1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td class="titleMale">1</td>
<td>0</td>
<td class="titleMale">0</td>
<td >1</td>
<td class="titleMale">1</td>
<td>0</td>
<td class="titleMale">0</td>
<td>1</td>
<td class="titleMale">1</td>
<td>0</td>
<td class="titleMale">0</td>
<td class="bgtd1"> تكنولوجيا المعلومات </td>
</tr>
<tr>
<td rowspan="2" class="bgtd1"><span class="titleTotal">Total</span></td>
<td rowspan="2">313</td>
<td >195</td>
<td class="titleMale">138</td>
<td>1</td>
<td class="titleMale">2</td>
<td>0</td>
<td class="titleMale">0</td>
<td>2</td>
<td class="titleMale">2</td>
<td>0</td>
<td class="titleMale">0</td>
<td>2</td>
<td class="titleMale">2</td>
<td>0</td>
<td>0</td>
<td>2</td>
<td class="titleMale">2</td>
<td>0</td>
<td class="titleMale">0</td>
<td>2</td>
<td class="titleMale">2</td>
<td>0</td>
<td class="titleMale">0</td>
<td>2</td>
<td class="titleMale">2</td>
<td>0</td>
<td class="titleMale">0</td>
<td rowspan="2" class="bgtd1" >إجمالي</td>
</tr>
<tr>
<td colspan="2">313</td>
<td colspan="2">3</td>
<td colspan="2">0</td>
<td colspan="2">4</td>
<td colspan="2">0</td>
<td colspan="2">4</td>
<td colspan="2">0</td>
<td colspan="2">4</td>
<td colspan="2">0</td>
<td colspan="2">4</td>
<td colspan="2">0</td>
<td colspan="2">4</td>
<td colspan="2">0</td>
</tr>
</table>
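One thing that may be worth trying, although nothing in the question confirms it is the cause, is to send the UTF-8 byte order mark ahead of the HTML, since Excel can misdetect the encoding of an HTML file that lacks one. The helper below is a minimal sketch; the class and method names are hypothetical.

```csharp
using System.Linq;
using System.Text;

static class ExcelHtmlExportSketch
{
    // Hypothetical helper: returns the HTML as UTF-8 bytes preceded by the
    // UTF-8 byte order mark (EF BB BF).
    public static byte[] ToUtf8WithBom(string html)
    {
        byte[] preamble = Encoding.UTF8.GetPreamble();
        byte[] content = Encoding.UTF8.GetBytes(html);
        return preamble.Concat(content).ToArray();
    }
}
```

In the page code, the Response.Write(tw.ToString()) call could then become Response.BinaryWrite(ExcelHtmlExportSketch.ToUtf8WithBom(tw.ToString())). Note also that tblid.RenderControl renders only the table, so the style block is not part of the output unless it is written into the response as well. Whether Excel then honors every rule is a separate question.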

WebBrowser: manipulate an HTML table

I'm trying to manipulate an HTML table opened in a WebBrowser control. The tool will be used to access a SharePoint page with an auto-login option. So far this is what I have:
HtmlElementCollection htmlcol =
wb.Document.GetElementsByTagName("formTextfield277");
for (int i = 0; i < htmlcol.Count; i++)
{
if (htmlcol[i].Name == "portal_id")
{
htmlcol[i].SetAttribute("VALUE",
Properties.Settings.Default.sharepoint_user);
}
else if (htmlcol[i].Name == "password")
{
htmlcol[i].SetAttribute("VALUE",
Properties.Settings.Default.sharepoint_pw);
}
}
This C# code is meant to manipulate the following HTML page:
<TABLE CELLSPACING="0" CELLPADDING="0" WIDTH="100%" BORDER="0">
<TR>
<TD CLASS="txtRedBold10" WIDTH="4"> </TD>
<TD CLASS="txtRedBold10" COLSPAN="2" HEIGHT="30">Please log in</TD>
</TR>
<TR>
<TD CLASS="txtBlackReg10" WIDTH="4"> </TD>
<TD CLASS="txtBlackReg10">Username:</TD>
<TD><INPUT CLASS="formTextfield277" TYPE="text" NAME="portal_id" VALUE="" VCARD_NAME="vCard.Email" SIZE="28"></TD>
</TR>
<TR>
<TD CLASS="txtBlackReg10" COLSPAN="3"> </TD>
</TR>
<TR>
<TD CLASS="txtBlackReg10" COLSPAN="2"> </TD>
<TD CLASS="txtBlackReg10">Please enter your username or E-Mail Address</TD>
</TR>
<TR>
<TD CLASS="txtBlackReg10" COLSPAN="3"> </TD>
</TR>
<TR>
<TD CLASS="txtBlackReg10" WIDTH="4"> </TD>
<TD CLASS="txtBlackReg10">Password:</TD>
<TD><INPUT CLASS="formTextfield277" TYPE="password" NAME="password" SIZE="28" AUTOCOMPLETE="off"></TD>
</TR>
<TR>
<TD CLASS="txtBlackReg10" COLSPAN="3"> </TD>
</TR>
<TR>
<TD CLASS="txtBlackReg10" COLSPAN="2"> </TD>
<TD CLASS="txtBlackReg10">Please enter your network or Intranet password</TD>
</TR>
<TR>
<TD CLASS="txtBlackReg10" COLSPAN="3"> </TD>
</TR>
<TR>
<TD CLASS="txtBlackReg10" COLSPAN="2"> </TD>
<TD CLASS="txtBlackReg10">
<TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0">
<TR>
<TD><INPUT TYPE="image" HEIGHT="24" WIDTH="20" SRC="images/cp_arrow.gif" VALUE="Log In"
BORDER="0"></TD>
<TD><A CLASS="linkTxtRedBold10" HREF="javascript:signin()"
onClick="saveForm()">Login</A>
</TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
<TD CLASS="txtBlackReg10" COLSPAN="3"> </TD>
</TR>
</TABLE>
Any suggestions?
Thanks in advance!
Use wb.Document.GetElementsByTagName("input"), not wb.Document.GetElementsByTagName("formTextfield277"). GetElementsByTagName expects a tag name, and formTextfield277 is the CLASS of the inputs, not their tag.
HtmlElementCollection inputHtmlCollection = wb.Document.GetElementsByTagName("input");
foreach (HtmlElement anInputElement in inputHtmlCollection)
{
    // Fill in the user name field.
    if (anInputElement.Name.Equals("portal_id"))
    {
        anInputElement.SetAttribute("VALUE", Properties.Settings.Default.sharepoint_user);
    }
    // Fill in the password field.
    if (anInputElement.Name.Equals("password"))
    {
        anInputElement.SetAttribute("VALUE", Properties.Settings.Default.sharepoint_pw);
    }
}
Hope this helps!
