I have an HTML table that I need to query: get its contents and then act on them.
This is the table:
<table>
<thead>
<tr>
<th>Version</th>
<th>Usage</th>
</tr>
</thead>
<tbody>
<tr>
<td>1.0.1.1</td>
<td>86</td>
</tr>
<tr>
<td>1.0.0.1</td>
<td>65</td>
</tr>
<tr>
<td>1.0.1.0</td>
<td>28</td>
</tr>
<tr>
<td>1.0.0.0</td>
<td>1</td>
</tr>
</tbody>
</table>
I'm getting that by passing the WebResponse through a regular expression.
What is the best way to get this into some data structure in C# so that I can query on the Version and Usage? Basically, I want a List of class Foo:
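class Foo
{
    // Property types are a guess: the version as text, the usage as a count.
    public string Version { get; set; }
    public int Usage { get; set; }
}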
Along those lines.
Thanks for your help
The best tool I've found for parsing HTML is the Html Agility Pack library. It is fairly easy to use and handles improperly formatted markup fairly well. You'll still have to do the footwork of moving the data from the library's nodes into your own structures, but it makes getting at the data easy.
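As a rough illustration of that footwork, here is a minimal sketch that loads the table from the question with Html Agility Pack and fills a list of Foo. The class name VersionTableParser and the choice of string/int property types are my own assumptions, not something from the question, and the XPath assumes the rows sit inside a tbody as they do in the sample.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

public class Foo
{
    public string Version { get; set; }
    public int Usage { get; set; }
}

// Hypothetical helper name, used only for this sketch.
public static class VersionTableParser
{
    // Turns every <tbody> row that has two cells into a Foo.
    public static List<Foo> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<Foo>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var rows = doc.DocumentNode.SelectNodes("//table/tbody/tr");
        if (rows == null) return result;

        foreach (var row in rows)
        {
            // Relative XPath: only the <td> children of this row.
            var cells = row.SelectNodes("td");
            if (cells == null || cells.Count < 2) continue;

            int usage;
            if (!int.TryParse(cells[1].InnerText.Trim(), out usage)) continue;

            result.Add(new Foo
            {
                Version = cells[0].InnerText.Trim(),
                Usage = usage
            });
        }
        return result;
    }
}
```

Once the list exists, ordinary LINQ works on it, for example VersionTableParser.Parse(html).Where(f => f.Usage > 50).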
<table class="table" cellpadding="0" cellspacing="0">
<tr>
<th>Item Name</th>
<th>Price</th>
<th>Seafood</th>
<th>Has Gluten</th>
<th>Picture</th>
</tr>
@foreach (var items in Model)
{
<tr>
<td>@items.ItemName</td>
<td>@items.Price</td>
<td>@items.IsSeafood</td>
<td>@items.HasGluten</td>
</tr>
}
</table>
I am currently learning how to develop using ASP.NET MVC.
I am trying to show a different image for each of my items. My foreach loop builds the table rows from items loaded from a MySQL database. I want to add a picture to each row, if that is possible with the code I have.
You can use an <img src=""/> tag inside a final <td> to embed the image directly in the HTML.
Something like this, depending on how your image is stored (Url stands for whatever property holds the image path):
<td><img src="@items.Url"/></td>
I have a requirement to display many different collections in an HTML table format like the one below.
<table>
<thead>
<tr>
<th>Column 1 Heading</th>
<!--More column headings-->
</tr>
</thead>
<tbody>
@foreach (var item in Model.MyCollection)
{
<tr>
<td>@item.PropertyName</td>
<!--More properties-->
</tr>
}
</tbody>
<!--Add footer for totals-->
</table>
I do not want to specify the view model property names explicitly. I want to be able to pass any type of collection to a view that will generate an HTML table for it.
Please suggest a better way of doing this.
I have a string of HTML returned from a service. I need to update this HTML on the server side (using .NET) and reorder some of the elements before sending it to the client. As a simple example, let's say the string contains a table like the one below. How can I manipulate it to put the last name <th> and <td> into their own <tr>? The real HTML is much larger and more complex, but for one section of it the sample below illustrates the change I need. Plain string replacement hasn't worked well because of the complexity of the actual HTML.
Initial String
"<table>
<tbody>
<tr>
<th>First name</th>
<td>some first name</td>
<th>Last name</th>
<td>some last name</td>
</tr>
<tr>
<th>blah</th>
<td>blah blah</td>
</tr>
</tbody>
</table>
"
After Modification
"<table>
<tbody>
<tr>
<th>First name</th>
<td>some first name</td>
</tr>
<tr>
<th>Last name</th>
<td>some last name</td>
</tr>
<tr>
<th>blah</th>
<td>blah blah</td>
</tr>
</tbody>
</table>
"
I know URL answers are frowned upon, but you should look into the HTML Agility Pack. It's designed for this kind of thing.
http://html-agility-pack.net/?z=codeplex
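To make the suggestion a bit more concrete, here is a minimal sketch of how the move could look with Html Agility Pack. The helper name RowSplitter.MoveToOwnRow is made up for this example, and it assumes the header can be identified by its exact text and that its value sits in the <td> immediately after it.

```csharp
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

// Hypothetical helper name, used only for this sketch.
public static class RowSplitter
{
    // Moves the <th> whose text equals headerText, together with the <td>
    // right after it, into a new <tr> inserted after the original row.
    public static string MoveToOwnRow(string html, string headerText)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when there is no match.
        var headers = doc.DocumentNode.SelectNodes("//th");
        if (headers == null) return html;

        HtmlNode header = null;
        foreach (var th in headers)
        {
            if (th.InnerText.Trim() == headerText) { header = th; break; }
        }
        if (header == null || header.ParentNode == null) return html;

        // Find the next element sibling, skipping whitespace text nodes.
        HtmlNode value = header.NextSibling;
        while (value != null && value.NodeType != HtmlNodeType.Element)
        {
            value = value.NextSibling;
        }
        if (value != null && value.Name != "td") value = null;

        HtmlNode oldRow = header.ParentNode;
        HtmlNode newRow = doc.CreateElement("tr");

        // Detach the cells from the old row, then attach them to the new one.
        header.Remove();
        newRow.AppendChild(header);
        if (value != null)
        {
            value.Remove();
            newRow.AppendChild(value);
        }

        oldRow.ParentNode.InsertAfter(newRow, oldRow);
        return doc.DocumentNode.OuterHtml;
    }
}
```

A call would look like RowSplitter.MoveToOwnRow(htmlString, "Last name"). Working on the parsed tree avoids depending on the exact whitespace and line layout of the markup, which is where plain string replacement tends to break down.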
For the purposes of this answer, I will make the silly assumption that you have read the file into a list of strings, one line of markup per element. Let us name this list HTMLLines. Then the following should do what you want:
int length = HTMLLines.Count;
for (int loop = 0; loop < length; loop++)
{
    if (HTMLLines[loop].Equals("<th>Last name</th>"))
    {
        // Close the current row and open a new one in front of the matching header.
        HTMLLines[loop] = "</tr>\n<tr>\n" + HTMLLines[loop];
        // break; // Uncomment if the header occurs only once; leave it commented to handle every occurrence.
    }
}
If you save the list after this loop, you should have the desired output.
This code assumes that there are no nulls in the list. If there may be nulls, replace HTMLLines[loop].Equals("<th>Last name</th>") with HTMLLines[loop]=="<th>Last name</th>", since calling Equals on a null element throws a NullReferenceException.
If "<th>Last name</th>" is just a sample you used for this question and cannot be matched exactly, then you should put all possible matches in an array and check each line against them. In this case, if we name the array theHeaders, the code will be something like:
int length = HTMLLines.Count;
for (int loop = 0; loop < length; loop++)
{
    for (int loop1 = 0; loop1 < theHeaders.Length; loop1++)
    {
        if (HTMLLines[loop].Equals(theHeaders[loop1]))
        {
            HTMLLines[loop] = "</tr>\n<tr>\n" + HTMLLines[loop];
            break;
        }
    }
}
I hope this helps to point you in the right direction.
A very simple approach could be...
var result = htmlString.Replace("<th>Last name</th>", "</tr><tr><th>Last name</th>");
If you need something more complex than this you'll need to add more detail to your question.
I am on the server side of an ASP.NET application. There I have some HTML source code in a variable called 'HtmlText'. This source code is generated from XML via an XSL transformation, and the result looks something like this:
<h1>ABC Test KH</h1>
<!--place for the control-->
<table class="tablesorter" id="tablesorter183">
<thead>
<tr>
<th align="left">Name</th>
<th align="right">DB</th>
<th align="right">DB Anteil in Prozent</th>
<th align="right">ABC</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" fieldName="Name">Fabrikam, Inc.</td>
<td align="right" fieldName="DB">881.378,00 €</td>
<td align="right" fieldName="DB_Anteil_in_Prozent">29,92</td>
<td align="right" fieldName="ABC">A</td>
</tr>
</tbody>
</table>
This source code is then inserted into an .aspx page via the InnerHtml property.
There is a div with id 'book' in that page:
book.InnerHtml = HtmlText
This works fine so far.
But now I want to create a dropdown control in that HTML that I can access on the server side. This control should be placed between the h1 and table tags, where the comment <!--place for the control--> is located.
I know how to create an ASP.NET control dynamically and bind an event to it, but that only works if the markup is in the .aspx in the first place. I cannot do that with HTML source that exists only in a string at that point.
Is there any way to do what I want, or am I on the wrong track here?
Thanks in advance for any suggestions.
Kind regards,
Kai
I think the only solution is to create a control that inherits DropDownList, and override its RenderControl method.
Something like this:
public override void RenderControl(HtmlTextWriter writer)
{
    // ...
    // Fill HtmlText with the generated markup, then split it into two strings:
    // startString (everything before the control's place) and
    // endString (everything after it).
    writer.Write(startString);
    base.RenderControl(writer);
    writer.Write(endString);
}
And use this control instead of DropDownList.
EDIT: In the case of several controls, I would use the approach suggested in "Render .net controls to string and get events to fire":
Split the string into several strings: the first from the beginning to the first control, the second from the first control to the second control, and so on.
Then put each of the strings into a new LiteralControl and add them to the page like this:
book.Controls.Add(LiteralControl1);
book.Controls.Add(DropDownList1);
book.Controls.Add(LiteralControl2);
book.Controls.Add(Button1);
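For the single-dropdown case from the question, the splitting step could be sketched as below. HtmlSplitter and SplitAtMarker are names I made up for illustration; the Web Forms part is shown only as comments because it needs the page's code-behind, and it assumes the 'book' div is declared with runat="server".

```csharp
using System;

// Hypothetical helper name, used only for this sketch.
public static class HtmlSplitter
{
    // Splits html at the first occurrence of marker and drops the marker.
    // Returns { before, after }; if the marker is missing, everything is "before".
    public static string[] SplitAtMarker(string html, string marker)
    {
        int index = html.IndexOf(marker, StringComparison.Ordinal);
        if (index < 0)
        {
            return new[] { html, string.Empty };
        }
        return new[]
        {
            html.Substring(0, index),
            html.Substring(index + marker.Length)
        };
    }
}

// Possible use in the code-behind (myDropDownList is a placeholder name):
//   string[] parts = HtmlSplitter.SplitAtMarker(HtmlText, "<!--place for the control-->");
//   book.Controls.Add(new LiteralControl(parts[0]));
//   book.Controls.Add(myDropDownList);
//   book.Controls.Add(new LiteralControl(parts[1]));
```

Keep in mind that dynamically added controls have to be re-created on every postback (typically in Page_Init or Page_Load), otherwise their events will not fire.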
I have some code that gets a web response. How do I take that response and search for a table using its CSS class (class="data")? Once I have the table, I need to extract certain field values. For example, in the sample markup below, I need the values of Field #3 and Field #5, so "85" and "1", respectively.
<table width="570" border="0" cellpadding="1" cellspacing="2" class="data">
<tr>
<td width="158"><strong>Field #1:</strong></td>
<td width="99">1</td>
<td width="119"><strong>Field #2:</strong></td>
<td width="176">110</td>
</tr>
<tr>
<td width="158"><strong>Field #3:</strong></td>
<td width="99">85</td>
<td width="119"><strong>Field #4:</strong></td>
<td width="176">-259.34</td>
</tr>
<tr>
<td width="158"><strong>Field #5:</strong></td>
<td width="99">1</td>
<td width="119"><strong>Field #6:</strong></td>
<td width="176">110</td>
</tr>
<tr>
<td width="158"><strong>Field #7:</strong></td>
<td width="99">12</td>
<td width="119"><strong>Field #8:</strong></td>
<td width="176">123.23</td>
</tr>
</table>
Use the HTML Agility Pack and parse the HTML. If you want to do it the simplest way then go grab its beta (it supports LINQ).
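A minimal sketch of what that could look like, using the XPath support rather than the LINQ beta. DataTableReader is a made-up name, and the code assumes the class attribute is exactly "data" and that the cells alternate label, value as in the sample markup.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

// Hypothetical helper name, used only for this sketch.
public static class DataTableReader
{
    // Returns label -> value pairs, keyed by the label text including the colon.
    public static Dictionary<string, string> ReadFields(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var fields = new Dictionary<string, string>();

        // All cells of the table whose class attribute is exactly "data".
        // SelectNodes returns null when nothing matches.
        var cells = doc.DocumentNode.SelectNodes("//table[@class='data']//td");
        if (cells == null) return fields;

        // Cells come in pairs: a label cell followed by a value cell.
        for (int i = 0; i + 1 < cells.Count; i += 2)
        {
            string label = HtmlEntity.DeEntitize(cells[i].InnerText).Trim();
            string value = HtmlEntity.DeEntitize(cells[i + 1].InnerText).Trim();
            fields[label] = value;
        }
        return fields;
    }
}
```

With the markup from the question, looking up "Field #3:" and "Field #5:" in the returned dictionary should give the two values asked for.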
As Randolf suggests, using HTML Agility Pack is a good option.
But, if you have control of the format of the HTML, it is also possible to do string parsing to extract the values you are after.
It is nearly trivial to download the entire HTML as a string and search for the string "<table" followed by the string "class=\"data\"". Then you can easily extract the values you are after by doing similar string manipulations.
I'm not saying you should do this, for the resulting code will be harder to read and maintain than the code using HTML Agility Pack, but it will save you an external dependency and your code will probably perform much better.
In a WP7 app I made, I started using HTML Agility Pack to parse some HTML and extract some values. This worked well, but it was quite slow. Switching to the string parsing regime made my code many times faster while returning the exact same result.
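For what it's worth, the string parsing approach can be sketched roughly like this. DataTableScanner and FindValue are illustrative names, and the code leans heavily on the layout of the sample markup: the attribute is written exactly as class="data", and each value sits in the first <td> after its label.

```csharp
using System;

// Hypothetical helper name, used only for this sketch.
public static class DataTableScanner
{
    // Returns the text of the first <td> after the given label inside the
    // table marked class="data", or null if anything is not found.
    public static string FindValue(string html, string label)
    {
        // 1. Locate the table by its class attribute and find where it ends.
        int table = html.IndexOf("class=\"data\"", StringComparison.OrdinalIgnoreCase);
        if (table < 0) return null;
        int tableEnd = html.IndexOf("</table>", table, StringComparison.OrdinalIgnoreCase);
        if (tableEnd < 0) tableEnd = html.Length;

        // 2. Find the label text inside that table.
        int labelPos = html.IndexOf(label, table, StringComparison.Ordinal);
        if (labelPos < 0 || labelPos > tableEnd) return null;

        // 3. The value is the content of the next <td ...> after the label.
        int tdOpen = html.IndexOf("<td", labelPos, StringComparison.OrdinalIgnoreCase);
        if (tdOpen < 0 || tdOpen > tableEnd) return null;

        int tagClose = html.IndexOf('>', tdOpen);
        if (tagClose < 0) return null;
        int valueStart = tagClose + 1;

        int valueEnd = html.IndexOf("</td>", valueStart, StringComparison.OrdinalIgnoreCase);
        if (valueEnd < 0) return null;

        return html.Substring(valueStart, valueEnd - valueStart).Trim();
    }
}
```

Usage would be along the lines of DataTableScanner.FindValue(html, "Field #3:"). Including the colon in the label keeps "Field #1:" from accidentally matching a hypothetical "Field #10:".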