Add space within foreach elements - c#

I have two nested foreach loops, as shown below.
@foreach (var post in Model.Posts)
{
<article class="post-@post.PostId @post.PostType sticky post-item isotope-item
@foreach (var category in post.Categories)
{
@category.FormattedCategoryName
}">
}
Here's a sample of the output data:
<article class="post-1024 format-standard sticky post-item isotope-item cat1cat2cat3cat4cat5" style="width: 429px; position: absolute; left: 0px; top: 0px; transform: translate3d(2px, 1px, 0px);">
The only problem is that I cannot separate the @category.FormattedCategoryName values with spaces. It is probably a simple string operation, but how? Any ideas?
Thanks a lot.

Try this instead:
@category.FormattedCategoryName<text> </text>
Or alternatively:
@Html.Raw(string.Concat(category.FormattedCategoryName, " "))
Edit:
As per @freedomn-m's comment, a better solution is to replace the foreach loop with the following:
@string.Join(" ", post.Categories.Select(c => c.FormattedCategoryName).ToArray())
So the overall structure would be:
<article class="post-@post.PostId @post.PostType sticky post-item isotope-item
@string.Join(" ", post.Categories.Select(c => c.FormattedCategoryName).ToArray())">
Hope this helps!
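For illustration, here is a minimal console sketch of the same string.Join idea outside a Razor view. The category names and the fixed class prefix are made-up sample values rather than the asker's real model, so treat it as a sketch of the technique only.

```csharp
using System;

class JoinDemo
{
    static void Main()
    {
        // Hypothetical sample data standing in for post.Categories.
        var categories = new[] { "cat1", "cat2", "cat3" };

        // string.Join puts the separator only between items,
        // so there is no trailing space to trim afterwards.
        string cssClass = "post-1024 format-standard sticky post-item isotope-item "
                          + string.Join(" ", categories);

        Console.WriteLine(cssClass);
    }
}
```

Unlike appending a space inside the loop, this leaves no stray separator after the last category.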

Related

How to get a particular text inside HTML using c#?

How do I get the text "Attractions" from the HTML below?
<li class="product">
<strong>
Attractions
</strong>
<span></span>
</li>
I usually do this with the code below when I need the text inside a span, but I need some help with the situation above.
foreach (HtmlNode selectNode in htmlDocument.DocumentNode.SelectNodes("//span[@class='cityName']"))
{
Result = selectNode.InnerHtml;
}
How can I do this?
Result = htmlDocument.DocumentNode.SelectSingleNode("//li[@class='product']/strong").InnerText.Trim();
You can also loop over the results of SelectNodes with foreach, as you did above.

Wrapping an HTML element with another element?

I am writing a program that parses a bit of HTML. Specifically, I am looking for underlined elements within a list, and turning those underlined elements into hyperlinks.
Here's an example of the pre-converted HTML:
<ul>
<li>
<u>Mode selector </u>
</li>
<li>
<u>LAND ALT</u>
</li>
<li>
<u>FLT ALT</u>
</li>
</ul>
Here's what I want the result to look like:
<ul>
<li>
<a id="triv14522" onclick="TxtLinkAction(15627,15673)">
<span style="color: rgb(102, 204, 255); font-size: 11pt;">
<u>Mode selector</u>
</span>
</a>
</li>
<li>
<a id="triv14523" onclick="TxtLinkAction(15627,15674)">
<span style="color: rgb(102, 204, 255); font-size: 11pt;">
<u>LAND ALT</u>
</span>
</a>
</li>
<li>
<a id="triv14887" onclick="TxtLinkAction(15627,15679)">
<span style="color: rgb(102, 204, 255); font-size: 11pt;">
<u>FLT ALT</u>
</span>
</a>
</li>
</ul>
In my program, I've already built the anchor and span elements for each underlined element. Just for reference, here's how I've done this:
TrivId = trivId;
ActionItemId = actionItemId;
TextLayerId = textLayerId;
var trivIdText = "id=\"triv" + TrivId + "\"";
var onClickText = "onclick=\"TxtLinkAction(" + TextLayerId + "," + ActionItemId + ")\"";
var anchor = "<a " + trivIdText + " " + onClickText + ">";
var span = "<span style=\"color: rgb(102, 204, 255); font-size: 11pt;\">";
So, my main problem is I don't exactly know how to "wrap" each underlined element in the list with my anchor and span elements. If this were XML, I could add my XML element by using AddBeforeSelf. Can I do something similar with HTML?
NOTE: I notice that the C# tag has been removed and a JavaScript tag added. I should clarify: this is a C# program that parses a PowerPoint document. One of the values it reads is in HTML format. I am not using JavaScript at all, since this isn't an actual web page. I'm just grabbing this particular value from the PowerPoint slide, and it happens to be HTML.
For further clarification, here's the C# method that I'm using. The resulting, modified HTML will be written out to an XML file. The resulting HTML will be stored in an XML tag, <RTF>, with the valid HTML as that tag's value.
public Hyperlink(int textLayerId, int runGroupId)
{
TrivId = LectoraTitle.GetId();
ActionItemId = LectoraTitle.GetId();
TextLayerId = textLayerId;
var trivIdText = "id=\"triv" + TrivId + "\"";
var onClickText = "onclick=\"TxtLinkAction(" + TextLayerId + "," + ActionItemId + ")\"";
var styleText = "style=\"" + Settings.Default.Style + "\"";
// build anchor/span and determine where to insert into text.text
var anchor = "<a " + trivIdText + " " + onClickText + " " + styleText + ">";
var span = "<span style=\"color: rgb(102, 204, 255); font-size: 11pt;\">";
ActionItem = new ActionItem { ActionType = ActionType.rungroup, TargetId = runGroupId };
}
Further explanation: I'm assuming that I can iterate over my HTML elements with a foreach loop, using something like the below code:
// note: this is pseudocode
var nodes = htmlSnippet;
foreach (var node in nodes)
{
// if node is underline element
// surround node with generated anchor
// and span elements.
}
I'm just not quite sure how to get my HTML snippet into an enumerable state so that I can iterate over it, and then wrap a particular element with my generated elements.
NEW EDIT:
So, after looking at HtmlAgilityPack, I've incorporated it into my program and am iterating over the Html like so (The variable text contains the HTML value (see first example above)):
htmlDocument.LoadHtml(text);
var nodes = htmlDocument.DocumentNode.SelectNodes("//u");
foreach (var node in nodes)
{
// insert code here to wrap the
// underline element with the generated
// anchor/span elements
}
So, now I'm able to parse the HTML and get only the underline elements. I now need to figure out how to surround these underline elements with my generated anchor/span elements. I was hoping I could do something like node.AddParent(anchor).
To iterate over the HTML, you may want to use HTML Agility Pack:
http://htmlagilitypack.codeplex.com/
Examples here:
http://htmlagilitypack.codeplex.com/wikipage?title=Examples
A decent how-to here:
http://www.codeproject.com/Articles/659019/Scraping-HTML-DOM-elements-using-HtmlAgilityPack-H
You can install it using NuGet.
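As a rough sketch of the wrapping step itself, assuming the HtmlAgilityPack NuGet package is installed: build the anchor and span as nodes, swap the anchor into the underline element's place, then move the underline element inside the span. The starting id and the TxtLinkAction arguments below are placeholder values, not the asker's real id generator.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class WrapUnderlines
{
    static void Main()
    {
        var text = "<ul><li><u>Mode selector </u></li><li><u>LAND ALT</u></li></ul>";
        var htmlDocument = new HtmlDocument();
        htmlDocument.LoadHtml(text);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = htmlDocument.DocumentNode.SelectNodes("//u");
        if (nodes == null) return;

        int nextId = 14522; // placeholder for a real id source

        // Copy to a list first because the tree is modified inside the loop.
        foreach (var u in nodes.ToList())
        {
            var anchor = htmlDocument.CreateElement("a");
            anchor.SetAttributeValue("id", "triv" + nextId);
            anchor.SetAttributeValue("onclick", "TxtLinkAction(15627," + nextId + ")");
            nextId++;

            var span = htmlDocument.CreateElement("span");
            span.SetAttributeValue("style", "color: rgb(102, 204, 255); font-size: 11pt;");
            anchor.AppendChild(span);

            // Put the anchor where <u> was, then re-attach <u> inside the span.
            u.ParentNode.ReplaceChild(anchor, u);
            span.AppendChild(u);
        }

        Console.WriteLine(htmlDocument.DocumentNode.OuterHtml);
    }
}
```

Building nodes rather than concatenating strings lets the library handle attribute quoting and closing tags.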

How do I loop this in XDocument using c#

I have a table and read its td values with the code below:
foreach (var descendant in xmlDoc.Descendants("thead"))
{
var title = descendant.Element("td1 style=background:#cccccc").Value;
}
Assume I have the following thead in the table:
<thead>
<tr align="center" bgcolor="white">
<td1 style="background:#cccccc">Start</td1>
<td1 style="background:#cccccc">A</td1>
<td1 style="background:#cccccc">B</td1>
<td1 style="background:#cccccc">C</td1>
<td1 style="background:#cccccc">D</td1>
<td1 style="background:#cccccc">E</td1>
<td1 style="background:#cccccc">F</td1>
<td1 style="background:#cccccc">G</td1>
</tr>
</thead>
I need to get all the td1 values.
Your use of Element is incorrect - you just pass in a name, not the whole content of an element declaration.
If you want all td1 elements, you want something like:
foreach (var descendant in xmlDoc.Descendants("thead"))
{
foreach (var title in descendant.Element("tr")
.Elements("td1")
.Select(td1 => td1.Value))
{
...
}
}
Or if you don't actually need anything from the thead elements:
foreach (var title in xmlDoc.Descendants("thead")
.Elements("tr")
.Elements("td1")
.Select(td1 => td1.Value))
{
...
}
(Do you really mean td1 rather than td by the way?)
If you need td1 elements, then in this case you can select them directly:
var titles = xdoc.Descendants("td1").Select(td => (string)td);
Or you can use XPath:
var titles = from td in xdoc.XPathSelectElements("//thead/tr/td1")
select (string)td;
NOTE: If you are going to parse HTML documents, consider using HtmlAgilityPack (available from NuGet).
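To make the approaches above concrete, here is a small self-contained sketch using only System.Xml.Linq. The shortened table is sample input made up for the example; it assumes the markup is well-formed XML, as the question's thead is.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class Td1Values
{
    static void Main()
    {
        // Shortened sample input modelled on the question's thead.
        var xdoc = XDocument.Parse(
            "<table><thead><tr align=\"center\" bgcolor=\"white\">" +
            "<td1 style=\"background:#cccccc\">Start</td1>" +
            "<td1 style=\"background:#cccccc\">A</td1>" +
            "<td1 style=\"background:#cccccc\">B</td1>" +
            "</tr></thead></table>");

        // Element names are passed on their own; attributes are not part of the name.
        var titles = xdoc.Descendants("thead")
                         .Elements("tr")
                         .Elements("td1")
                         .Select(td => (string)td)
                         .ToList();

        Console.WriteLine(string.Join(", ", titles)); // Start, A, B
    }
}
```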

XML parsing : Reading CDATA

<item><title>this is title</title><guid isPermaLink="true">http://www.i.com/video/nokia-lumia-920-deki-pureview_2879.html</guid><link>http://www.i.com/video/nokia-lumia-920-deki-pureview_2879.html</link>
<description><![CDATA[this is the info.]]></description>
<pubDate>Wed, 5 Sep 2012 22:10:00 UT</pubDate>
<media:content type="image/jpg" expression="sample" fileSize="2956" medium="image" url="http://media.chip.com.tr/images/content/video/88/201209060102428081-0.jpg"/>
<enclosure type="image/jpg" url="http://media.chip.com.tr/images/content/video/88/201209060102428081-0.jpg" length="2956"/></item>
I want to read the CDATA in <description>.
I wrote this:
var x = e.Result; // e.Result is the downloaded XML
var videos = XElement.Parse(e.Result);
var fList = (from haber in videos.Descendants("channel").Elements("item")
select new Video
{
title = haber.Element("title").Value,
link = haber.Element("link").Value,
//description = ???????
}).ToList();
What should I write for description? (EDIT: the answer is to read it the same way.)
But what if the description looks like this?
<![CDATA[<p>Zombiler adına ne umduk ne bulduk!</p> <p> </p><p><img style="margin: 5px 0px 5px 5px; border: 1px solid #333333; float: right;" alt="Black_ops" src="http://or.com/images/stories/haber/haberler6/20120918_Castlevania/Black_ops.jpg" height="0" width="0" /><strong>Black Ops 2</strong>'de Zombi modu olabilir haberi çıktığından beri bir ses, bir görüntü beklerken <strong>Call of Duty</strong>'nin resmi <strong>Youtube</strong> sayfasında aşağıdaki video yayınlandı. Açıkçası ne demek istiyorlar anlamak güç. <p>Devamını oku...</p>]]>
You should be able to use exactly the same code:
description = haber.Element("description").Value
Or
description = (string) haber.Element("description")
LINQ to XML will take care of reading the text for you.
To read the CDATA block you just use the same methods; if what you want is to strip the HTML out of it, then check this answer.
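A minimal sketch of that behaviour, with a made-up one-element feed item: both Value and the explicit string conversion return the CDATA content as plain text, including any markup inside it.

```csharp
using System;
using System.Xml.Linq;

class CdataDemo
{
    static void Main()
    {
        // Made-up sample item; only the description element matters here.
        var item = XElement.Parse(
            "<item><description><![CDATA[<p>this is the info.</p>]]></description></item>");

        // The cast returns null if the element is missing; .Value would throw instead.
        string description = (string)item.Element("description");

        Console.WriteLine(description); // <p>this is the info.</p>
    }
}
```

The cast form is the safer choice when an item may not contain a description element at all.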

Extract Content from <div class=" "> </div> Tag C# RegEx

I have this code:
string tag = "div";
string pattern = string.Format(@"\<{0}.*?\>(?<tegData>.+?)\<\/{0}\>", tag.Trim());
Regex regex = new Regex(pattern, RegexOptions.ExplicitCapture);
MatchCollection matches = regex.Matches(data);
and I need to get the content between the <div class="in"> .... </div> tags:
<div class="in">
ВАЗ 2121 <span class="for">за</span> <span class="price">2 700 $</span></span><br/><span class="year">1990 г.</span><br/><div style="margin: 3px 0 3px 0">1.6 л, бензин, КПП механика, с пробегом, белый, литые диски, тонировка, спойлер, ветровики, противотуманки, Движок после капитального ремонта!</div><div>
<span style="display:block; padding: 4px 0 0 0;"><span class="region">Костанай</span><span class="adv-phones">, +7 (777) 4464451</span></span>
<small class="gray air">24 просмотра</small>
<small class="gray air">13 июня</small>
</div>
<div class="selectItem" title="Выбрать" id="fv_sic_7184569">
</div>
</div>
How can I do it?
My code doesn't work.
Here's a regex that might extract simple div tags:
// <div[^>]*>(.+?)</div>
string tag = "div";
string pattern = string.Format(@"<{0}[^>]*>(?<tegData>.+?)</{0}>", tag.Trim());
However, using regular expressions to parse HTML is almost always inappropriate and tends to break on nested or irregular markup, because markup languages such as HTML are not regular languages.
That being said you would be much better off using an XML parser to parse the document or fragment and then extract what you need. In fact, using a forward-only parser would probably even be faster than trying to use RegEx.
You should look at the XmlReader class in .NET.
If it doesn't have to be server-side, you could use some JavaScript to do this, such as:
<script language="javascript">
function getData(){
// getElementsByTagName (plural) returns every div in the document
var divs = document.getElementsByTagName('div');
var data;
var x;
for(x = 0; x < divs.length; x++)
{
if(divs[x].className == 'in')
{
data = divs[x].innerHTML;
}
}
return data;
}
</script>
To match nested tags, try this function:
public static MatchCollection ParseTag(string str, string tagpat, string argpat, string valpat) {
if (null == tagpat) tagpat = @"\w+";
if (null == argpat) argpat = @"[^>]*";
if (null == valpat) valpat = @"(?><\k'tag'\b[^>]*>(?'nst')|</\k'tag'>(?'-nst')|.?)*?(?(nst)(?!))";
return Regex.Matches(str, @"(?><(?'tag'" + tagpat + @"\b)\s*(?'arg'" + argpat + @")>)(?'val'" + valpat + @")</\k'tag'>",
RegexOptions.IgnoreCase | RegexOptions.Singleline);
}
Parameters are simple regexes to filter the target tag, here are examples:
ParseTag(page, "div", @"id=""content""\s+class=""mw-body""", null);
ParseTag(wikipage, "span", @"class=""bday""", @"\d{4}-\d{2}-\d{2}");
This variant handles opening and closing tags and nested tags of the same type (other nested tags may be malformed and are ignored).
The other variant checks nested tags more strictly and does not match if any of them is opened or closed incorrectly:
if (null == valpat) valpat = @"(?><(?'itag'\w+)\b[^>]*>(?'nst')|</\k'itag'>(?'-nst')|.?)*?(?(nst)(?!))";
It is much easier for me to use XPath. Maybe you will find it useful.
textBox2.Text = "<div style=\"padding: 5px; width: 212px\"><div>more text</div></div>";
string x = "//div[contains(@style,'padding: 5px; width: 212px')]";
XmlDocument doc = new XmlDocument();
doc.LoadXml(textBox2.Text);
XmlNodeList nodes = doc.SelectNodes(x);
foreach (XmlNode node in nodes)
{
textBox3.Text = node.InnerXml;
}
The RegEx code that worked for me finds the first inner div:
string r = @"<div style=""padding: 5px; width: 212px;";
Regex rg = new Regex(r);
var matches = rg.Matches(s);
if (matches.Count > 0)
{
foreach (Match m in matches)
{
textBox3.Text += m.Groups[1];
}
}
