Retrieve data from HTML table in C#

I want to retrieve data from an HTML document.
I am scraping data from a web site and am almost done, but I ran into a problem when I tried to retrieve data from a table.
Here is the HTML:
<div id="middle_column">
<form action="url?" method="post" name="inquirydetail">
<input type="hidden" name="ServiceName" value="SurgeWebService">
<input type="hidden" name="TemplateName" value="Inpat_AvailableResponses.htm">
<input type="hidden" name="CurrentPage" value="inquirydetail">
<form method="post" action="url" name="ResponseSel" onSubmit="return EditPage(document.forms[3])">
<TABLE
<tBody
<table
....
</table
<table
....
</table
<table border="0" width="90%">
<tr>
<td width="10%" valign="bottom" class="content"> Service Number</td>
<td width="30%" valign="bottom" class="content"> Status</td>
<td width="50%" valign="bottom" class="content"> Status Date</td>
</tr>
<tr>
<td width="20%" bgcolor="white" class="subtitle">1</td>
<td width="40%" bgcolor="white" class="subtitle">Approved</td>
<td width="40%" bgcolor="white" class="subtitle">03042014</td>
</tr>
<tr>
<td></td>
</tr>
</table>
</tbody>
</TABle>
</div>
I have to retrieve the value of the Status field (here it is Approved) and write it to a SQL database.
There are many tables inside the form tag, and the tables do not have IDs. How can I get the correct table, row, and cell?
Here is my code:
HtmlElement tBody = WB.Document.GetElementById("middle_column");
if (tBody != null)
{
string sURL = WB.Url.ToString();
int iTableCount = tBody.GetElementsByTagName("table").Count;
}
for (int i = 0; i <= iTableCount; i++)
{
HtmlElement tb=tBody.GetElementsByTagName("table")[i];
}
Something is wrong here
Please help with this.

Don't you have any control over the page being displayed within the WebBrowser control? If you do, it is better to add an id attribute to the Status td. Then your life would be much easier.
Anyway, here's how you could search for a value within a table:
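HtmlElementCollection tables = this.WB.Document.GetElementsByTagName("table");
foreach (HtmlElement TBL in tables)
{
    // GetElementsByTagName returns only descendants with that tag;
    // TBL.All would return every descendant element, not just rows.
    foreach (HtmlElement ROW in TBL.GetElementsByTagName("tr"))
    {
        foreach (HtmlElement CELL in ROW.GetElementsByTagName("td"))
        {
            // Now you are looping through all cells in each table
            // Here you could use CELL.InnerText to search for "Status" or "Approved"
        }
    }
}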
But this is not a good approach, as you loop through every table and every cell within each table to find your text. Keep this as a last resort.
Hope this gives you an idea.

I prefer using the dynamic type and the DomElement property, but you must be using .NET 4 or later.
For tables, the main advantage here is that you don't have to loop through everything. If you know the row and column that you are looking for, you can target the data by row and column index instead of looping through the whole table.
The other big advantage is that you can use the entire DOM, reading more than just the contents of the table. Make sure you spell the DOM properties exactly as in JavaScript (rows, cells, innerText), even though you are in C#.
HtmlElement myTableElement;
//Set myTableElement using any GetElement... method.
//Use a loop or square bracket index if the method returns an HtmlElementCollection.
dynamic myTable = myTableElement.DomElement;
for (int i = 0; i < myTable.rows.length; i++)
{
    for (int j = 0; j < myTable.rows[i].cells.length; j++)
    {
        string CellContents = myTable.rows[i].cells[j].innerText;
        //You are not limited to innerText; you have the whole DOM available.
        //Do something with the CellContents.
    }
}
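When the row and column numbers are not known in advance, one option is to find the column by its header text first and then read the same column in the next row. The sketch below is not from the answers above. It shows only the lookup logic, written in JavaScript on plain arrays of cell text; the function name valueUnderHeader and the sample rows are made up for illustration. The same loop should translate to the rows and cells collections reached through DomElement.

```javascript
// Hypothetical helper: given a table as an array of rows, each row an array
// of cell texts, return the text found directly under the given header.
function valueUnderHeader(rows, headerText) {
  // Stop one row early, because the value lives in the row after the header.
  for (let r = 0; r < rows.length - 1; r++) {
    // Compare trimmed text, since scraped cells often carry stray spaces.
    const col = rows[r].findIndex(cell => cell.trim() === headerText);
    if (col !== -1 && col < rows[r + 1].length) {
      return rows[r + 1][col].trim();
    }
  }
  return null; // header not found in this table
}

// Sample data modelled on the table in the question.
const sampleRows = [
  [" Service Number", " Status", " Status Date"],
  ["1", "Approved", "03042014"],
  [""],
];
console.log(valueUnderHeader(sampleRows, "Status")); // prints "Approved"

// In a browser, the rows could be built from a table element like this:
// Array.from(table.rows, row => Array.from(row.cells, c => c.textContent));
```

Matching on the header text avoids depending on where the table sits among the other unnamed tables.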

Related

Bootstrap collapse/accordion on table not working as intended

I have a bootstrap table with 4 rows. Each of those rows can be clicked, revealing child rows which are unique to each of the four parent rows.
All works fine as long as I collapse a row before un-collapsing another, but if I un-collapse a row and then un-collapse another, only one child row will show inside this newly un-collapsed parent row, even if there are supposed to be more child rows than just the one inside that parent row. Then, if I click on the parent row again, it will show all of the child rows inside that parent row, apart from the one that was just showing.
How do I fix this glitchy behavior? I should be able to click a parent row without collapsing the previous one first. Ideally the previous row would be collapsed automatically upon un-collapsing a new row.
Razor Page table:
<table id="accordionTable" class="table" style="border-collapse:collapse;">
    <thead style="color:white;">
        <th>DATABASE</th>
        <th>STATUS</th>
    </thead>
    @for (var i = 0; i < Model.statusList.Count; i++)
    {
        <tr data-toggle="collapse" data-target="@("#innerTable" + i)">
            <td>@Model.Databases[i]["DATABASE_NAME"].ToString().ToUpper()</td>
            <td @*class="@Model.StatusAll[i]"*@>@Model.StatusAll[i]</td>
        </tr>
        foreach (var number in Model.statusList[i])
        {
            <tbody>
                <tr @if (number.ToString() == "Running...") { @: style="background-color: #ccffc4;"
                    } else if (number.ToString() == "Scheduled") { @: style="background-color: #d1f3fc;"
                    } else if (number.ToString() == "Failed") { @: style="background-color:#ffe4e4;"
                    }
                    style="cursor:default; background-color: white;" data-parent="#accordionTable" id="@("innerTable" + i)">
                    <td> @number.ToString()</td>
                    <td @if (number == "Running...") { @: class="fas fa-spinner"
                        } else if (number == "Scheduled") { @: class="far fa-clock"
                        } else if (number == "Failed") { @: class="fas fa-exclamation-triangle"
                        }>
                    </td>
                </tr>
            </tbody>
        }
    }
</table>
As suggested by @Nan Yu in the comments, this is how I fixed it:
You can try removing the attribute data-parent="#accordionTable" from the child tr tags, then adding class="collapse" to the child tr tags, like this: style="cursor:default; background-color: white;" class="collapse" id="@("innerTable" + i)"> – Nan Yu

Retrieve the table data with xpath and Selenium

I have HTML which looks basically like the following:
....
<div id="a">
<table class="a1">
<tbody>
<tr>
<td><a href="a11.html>a11</a>
</tr>
<tr>
<td><a href="a12.html>a12</a>
</tr>
</tbody>
<table>
</div>
...
I used the following C# code, but I cannot retrieve the URLs at this stage:
IWebElement baseTable = driver.FindElement(By.ClassName(TableID));
// gets all table rows
ICollection<IWebElement> rows = baseTable.FindElements(By.TagName("tr"));
// for every row
IWebElement matchedRow = null;
foreach(var row in rows)
{
Console.Write (row.FindElements(By.XPath("td/a")));
}
First of all, you gave us invalid markup. The correct version:
<div id="a">
    <table class="a1">
        <tbody>
            <tr>
                <td>
                    <a href="a11.html">a11</a>
                </td>
            </tr>
            <tr>
                <td>
                    <a href="a12.html">a12</a>
                </td>
            </tr>
        </tbody>
    </table>
</div>
If you have only one anchor in each table row, you can use this code to retrieve the URL:
IWebElement baseTable = driver.FindElement(By.ClassName(TableID));
// gets all table rows
ICollection<IWebElement> rows = baseTable.FindElements(By.TagName("tr"));
// for every row, print the href of its anchor
foreach (var row in rows)
{
    Console.WriteLine(row.FindElement(By.XPath("td/a")).GetAttribute("href"));
}
You need to get the href attribute of the found element. Otherwise, passing row.FindElement(By.XPath("td/a")) straight to Console.Write prints the type name of the class that implements IWebElement, because it is an object, not a string.
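The XPath td/a is valid, but it is relative to the row, which is easy to miss. Note that /td/a, with a leading slash, would be evaluated from the document root and match nothing. To make the relative form explicit, try:
Console.Write(row.FindElements(By.XPath("./td/a")));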

Multiplying a textbox with a cell in a dynamically created table with JQuery

I have a dynamically created table with id called "editTable" that looks as follows:
<tbody>
    @{var i = 0;}
    @foreach (var item in Model)
    {
        <tr>
            <td width="25%">
                @Html.DisplayFor(modelItem => item.Product.Name)
            </td>
            <td width="25%">
                @Html.DisplayFor(modelItem => item.Quantity)
            </td>
            <td width="25%">
                <div class="editor-field">
                    @Html.EditorFor(modelItem => item.UnitPrice)
                    @Html.ValidationMessageFor(model => item.UnitPrice)
                </div>
            </td>
            <td width="25%" id="total"></td>
            </td>
        </tr>
    }
</tbody>
The 3rd td element contains a C# text box that is rendered as an <input> element in HTML.
Now I want to multiply the quantity by the unit price to display this value in the 4th td element next to it. This value should update every time the value in the textbox is adjusted. I am a newbie at JQuery / JavaScript and came up with the following code:
// Calculating quantity*unitprice
$('#editTable tr td:nth-child(3) input').each( function (event) {
var $quant = $('#editTable tr td:nth-child(2)', this).val();
var $unitPrice = $('#editTable tr td:nth-child(3) input', this).val();
$('#editTable tr td:nth-child(4)').text($quant * $unitPrice);
});
This doesn't work and only displays NaN in the 4th element. Can anyone help me update this code to a working version? Any help would be very much appreciated.
I guessed you accidentally switched units and price, because it makes more sense to change the number of units than the price. I took your HTML and JavaScript and tried to change as little as possible to make it work (I'm not saying the solution is perfect, I just don't want to give you a totally different example of how to do it).
The html (The C# is irrelevant for this problem):
<table id="editTable">
<tbody>
<tr>
<td width="25%">
Product name
</td>
<td width="25%">
5
</td>
<td width="25%">
<div class="editor-field">
<input id="UnitPrice" name="UnitPrice" type="number" value="2" style="width:40px" />
</div>
</td>
<td width="25%" id="total"></td>
</tr>
</tbody>
</table>
The javascript/jquery (which should run on load):
$('#editTable tr td:nth-child(3) input').each(updateTotal);
$('#editTable tr td:nth-child(3) input').change(updateTotal);
// jQuery calls this with `this` set to the input being processed
function updateTotal()
{
    var quantity = $(this).closest('tr').find('td:nth-child(2)').text();
    var price = $(this).closest('tr').find('td:nth-child(3) input').val();
    $(this).closest('tr').find('td:nth-child(4)').text(quantity * price);
}
The problems you had were with jQuery. I've created a function that works on one element (in our case your UnitPrice input): it grabs the closest ancestor of type tr (the row it's in), and from there it does what you tried to do.
You used a jQuery selector that gets the 2nd cell of every table row; closest('tr').find limits it to the current row.
You tried to use .val() on a td element; you should use either .text() or .html(). Alternatively, you can add data-val="<%=value%>" to the td and then use .data('val').
It would be better to take the input's value directly from $(this).val() rather than going up to the tr and then back down into the td and the input.
To see it working: http://jsfiddle.net/Ynsgf/1/
I hope I didn't cause you any confusion with my explanation and the options I gave you.
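One more thing that can produce NaN is multiplying text that is empty or not numeric. Here is a minimal sketch of a guard, separate from the fiddle: the helper name rowTotal is made up, and it takes the two strings you would get from .text() and .val().

```javascript
// Hypothetical helper (not part of the fiddle): multiply a quantity read
// from cell text by a unit price read from an input value.
function rowTotal(quantityText, priceText) {
  // .text() on a td usually includes surrounding whitespace, so trim first.
  const quantity = parseFloat(String(quantityText).trim());
  const price = parseFloat(String(priceText).trim());
  // Return null instead of NaN so the caller can decide what to display.
  if (Number.isNaN(quantity) || Number.isNaN(price)) {
    return null;
  }
  return quantity * price;
}

console.log(rowTotal("\n  5\n", "2")); // prints 10
console.log(rowTotal(undefined, "2")); // prints null
```

Inside updateTotal, the result could then be written with something like .text(total === null ? '' : total).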
Here is another way to write the jQuery part:
$('#editTable tr').each(function (i, row) {
    // quantity is the text of the 2nd cell; unit price is the input in the 3rd
    var $quant = $(row).find('td:nth-child(2)').text();
    var $unitPrice = $(row).find('.editor-field input').val();
    $(row).find('td:nth-child(4)').text($quant * $unitPrice);
});

How to search through html table rows?

Using Windows Forms and C#.
For example...
<table id=tbl>
<tbody>
<tr>
<td>HELLO</td>
<td>YES</td>
<td>TEST</td>
</tr>
<tr>
<td>BLAH BLAH</td>
<td>YES</td>
<td>TEST</td>
</tr>
</tbody>
</table>
I load the page using the WebBrowser Control. The page loads perfectly.
The next thing I want to do is search through all the rows in the table and check if they contain a specific value ; for example in this instance YES.
If they contain it I want the row to be passed on to me so I can store it as string.
But I want the row to be in HTML form. (containing the tags).
How can I accomplish this ?
Please help me.
You can use the HtmlAgilityPack to easily parse the html. For example, to get all of the TD elements, you can do this:
string value = @" <table id=tbl>
<tbody>
<tr>
<td>HELLO</td>
<td>YES</td>
<td>TEST</td>
</tr>
<tr>
<td>BLAH BLAH</td>
<td>YES</td>
<td>TEST</td>
</tr>
</tbody>
</table>";
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(value);
var nodes = doc.GetElementbyId("tbl").SelectNodes("tbody/tr/td");
foreach (var node in nodes)
{
Debug.WriteLine(node.InnerText);
}
You can use this: http://simplehtmldom.sourceforge.net/ . It is a really simple way to search HTML files.
Just include simple_html_dom.php in your file and then follow this manual:
http://simplehtmldom.sourceforge.net/manual.htm
Your PHP code will look like this:
$html = file_get_html('File.html');
foreach ($html->find('td') as $element) {
    // plaintext is the element's text content without tags
    echo $element->plaintext . '<br>';
}

Finding the child of a parent's sibling element with WatiN

The scenario that I am looking at is that we have a table with multiple columns. One of those columns has a name, another has a dropdown list. I need to manipulate the dropdown for a row that contains a particular name. I looked at the source output, and tried getting the element's grandparent (the table row) so that I could search for the list. However, there was no such search functionality when I used the parent object.
It seems like there would be a lot of this kind of scenario in automating/testing a site, but I have not found anything after searching for a couple of hours. Any help would be appreciated.
EDIT: The application in question is an ASP.NET application, and the output HTML is gnarly at best. However, here is a cleaned-up example of what the HTML being searched looks like:
<table class="myGrid" cellspacing="0" cellpadding="3" rules="all" border="1" id="ctl00_content_MyRpt_ctl01_MyGrid" style="border-collapse:collapse;">
<tr align="left" style="color:Black;background-color:#DFDBDB;">
<th scope="col">Name</th><th scope="col">Unit</th><th scope="col">Status</th><th scope="col">Action</th>
</tr>
<tr>
<td>
<span id="ctl00_content_MyRpt_ctl01_MyGrid_ctl02_Name">JOHN DOE</span>
</td>
<td>
<span id="ctl00_content_MyRpt_ctl01_MyGrid_ctl02_UnitType">Region</span>
<span id="ctl00_content_MyRpt_ctl01_MyGrid_ctl02_UnitNum">1</span>
</td>
<td>
<span id="ctl00_content_MyRpt_ctl01_MyGrid_ctl02_Status">Complete</span>
</td>
<td class="dropdown">
<select name="ctl00$content$MyRpt$ctl01$MyGrid$ctl02$ActionDropDown" onchange="javascript:setTimeout('__doPostBack(\'ctl00$content$MyRpt$ctl01$MyGrid$ctl02$ActionDropDown\',\'\')', 0)" id="ctl00_content_MyRpt_ctl01_MyGrid_ctl02_ActionDropDown" class="dropdown">
<option value="123456">I want to...</option>
<option value="Details.aspx">View Details</option>
<option value="Summary.aspx">View Summary</option>
<option value="DirectReports.aspx">View Direct Reports</option>
</select>
</td>
</tr>
<tr>
...
</tr>
</table>
I found a way to do what I wanted. It is probably not the best or most elegant solution, but it works (it is not production code).
private void btnStart_Click(object sender, EventArgs e)
{
    using (var browser = new IE("http://godev/review"))
    {
        browser.Link(Find.ByText("My Direct Reports")).Click();
        // span -> td -> tr; the cast yields null if the grandparent is not a row
        TableRow tr = browser.Span(Find.ByText("JOHN DOE")).Parent.Parent as TableRow;
        SelectList objSL = null;
        if (tr != null && tr.Exists)
        {
            foreach (var td in tr.TableCells)
            {
                objSL = td.ChildOfType<SelectList>(Find.Any) as SelectList;
                if (objSL != null && objSL.Exists) break;
            }
            if (objSL != null && objSL.Exists)
            {
                Option o = objSL.Option(Find.ByText("View Direct Reports"));
                if (o.Exists) o.Select();
            }
        }
    }
}
Hopefully this saves someone a little time and effort. Also, I would love to see if someone has a better solution.
