Dynamic String Format C# - c#

In C#, Windows Form, how would I accomplish this:
07:55 Header Text: This is the data<br/>07:55 Header Text: This is the data<br/>07:55 Header Text: This is the data<br/>
So, as you can see, I have a return string that can be rather long, and I want to format the data to look something like this:
<b><font color="Red">07:55 Header Text</font></b>: This is the data<br/><b><font color="Red">07:55 Header Text</font></b>: This is the data<br/><b><font color="Red">07:55 Header Text</font></b>: This is the data<br/>
As you can see, I essentially want to prepend <b><font color="Red"> to the front of the time and header text, and append </font></b> right before the : section.
I'm a bit lost.
I have messed around with .Replace() and Regex patterns, but without much success. I don't really want to REPLACE text, just append/prepend at certain positions.
Is there an easy way to do this?
Note: the [] tags are actually <> tags, but I can't use them here.

Just because you're using RegEx doesn't mean you have to replace text.
The following regular expression:
(\d+:\d+.*?:)(\s.*?\[br/\])
Has two 'capturing groups.' You can then replace each match with the following (.NET refers to groups in a replacement string as $1 and $2):
[b][font color="Red"]$1[/font][/b]$2
Which should result in the following output:
[b][font color="Red"]07:55 Header Text:[/font][/b] This is the data[br/]
[b][font color="Red"]07:55 Header Text:[/font][/b] This is the data[br/]
[b][font color="Red"]07:55 Header Text:[/font][/b] This is the data[br/]
Edit: Here's some C# code which demonstrates the above:
var fixMe = @"07:55 Header Text: This is the data[br/]07:55 Header Text: This is the data[br/]07:55 Header Text: This is the data[br/]";
var regex = new Regex(@"(\d+:\d+.*?:)(\s.*?\[br/\])");
var matches = regex.Matches(fixMe);
var prepend = @"[b][font color=""Red""]";
var append = @"[/font][/b]";
string outputString = "";
foreach (Match match in matches)
{
    outputString += prepend + match.Groups[1] + append + match.Groups[2] + Environment.NewLine;
}
Console.Out.WriteLine(outputString);
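The same idea can also be written as a single Regex.Replace call instead of a loop over the matches. This is only a sketch: it assumes the real <br/> tags from the question rather than the [br/] placeholders, moves the colon into the second group so it stays outside the red text, and the names HeaderFormatter and HighlightHeaders are made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class HeaderFormatter
{
    // Group 1: time plus header text. Group 2: ": ", the data and the trailing <br/>.
    // Assumes every entry has the form "HH:MM header: data<br/>".
    private const string Pattern = @"(\d+:\d+.*?)(:\s.*?<br/>)";

    public static string HighlightHeaders(string input)
    {
        // $1 and $2 insert the captured groups back into the result,
        // so nothing from the original text is lost.
        return Regex.Replace(input, Pattern, "<b><font color=\"Red\">$1</font></b>$2");
    }
}
```

Calling HeaderFormatter.HighlightHeaders on the string from the question should wrap each "07:55 Header Text" and leave ": This is the data<br/>" untouched. Data that itself contains something like 12:30 followed later by a colon would confuse this pattern.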

Have you tried .Insert()?

Have you considered creating a style and setting the CSS class of each line by wrapping each line in a p or div tag?
Easier to maintain and to construct.

The easiest way probably is to use string.Replace() and string.Split(). Say your input string is input (untested):
var output = string.Join("<br/>", input
    .Split(new[] { "<br/>" }, StringSplitOptions.RemoveEmptyEntries)
    .Select(l => "<b><font color=\"Red\">" + l.Replace(": ", "</font></b>: "))
    .ToList()
) + "<br/>";

Related

How to save line breaks after parsing?

AngleSharp 0.9.11
On the page in the browser, the text is displayed as:
String_1.
String_2.
String_3.
String_4.
Parsing result:
String_1.String_2.String_3.String_4.
Page layout:
<div class="adv-point view-adv-point"><span>String_1. <br><br>String_2.<br>String_3.<br>String_4.</span></div>
I use code to parse:
var items = document.QuerySelectorAll("div:nth-child(4) >div:nth-child(3) > div.adv-point.view-adv-point");
var text = items[0].TextContent.Trim();
Question
How can I keep the line breaks in the parsed result?
In other words, the result of the parsing should be:
String_1.
String_2.
String_3.
String_4.
I think innerText will work fine for you here. Here is the code:
var x = document.querySelectorAll("div:nth-child(4) >div:nth-child(3) > div.adv-point.view-adv-point");
console.log(x[0].innerText);
Try this:
var text=document.querySelectorAll(".view-adv-point span")[0].innerText;
If you log/alert text, you will see that the line break is present.
If you want to replace <br> with \n, then you can do this:
var text=document.querySelectorAll(".view-adv-point span")[0].innerHTML;
text = text.replace(/<br>/g, '\n');
But I believe this will return the same value as the first approach.
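Since the question is about C#, the <br>-to-newline replacement from the last snippet can be sketched there too. This version works on a plain HTML string and makes no AngleSharp calls; it assumes you have already obtained the inner HTML of the span as a string, and the names BrToNewline and Convert are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class BrToNewline
{
    // Replaces <br>, <br/> and <br /> (any letter case) with a newline.
    // Other tags in the fragment are left as they are.
    public static string Convert(string innerHtml)
    {
        return Regex.Replace(innerHtml, @"<br\s*/?>", "\n", RegexOptions.IgnoreCase);
    }
}
```

For the markup in the question, the text between the span tags would come back with one newline per <br>, so the double <br><br> gives an empty line.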

Captures in C# regexes

This code is supposed to convert the value of img src to a local path.
var matches = Regex.Replace(html, "(<[ ]*img[^s]+src=[\"'])([^\"']*)([\"'][^/]*/>)",
    (match)=> {
        return string.Format("{0}{1}{2}",
            match.Captures[0],
            HostingEnvironment.MapPath("~/" + match.Captures[1]),
            match.Captures[2]);
    });
It matches the whole image tag correctly but there's only one capture. I thought the parentheses delimited captures but it doesn't seem to be working like that.
How should I have written this to get three captures, the middle one being the path?
Try using the Groups property instead of Captures, like so:
var matches = Regex.Replace("<img src=\"dsa\"/>", "(<[ ]*img[^s]+src=[\"'])([^\"']*)([\"'][^/]*/>)",
    (match)=> {
        return string.Format("{0}{1}{2}",
            match.Groups[1],
            HostingEnvironment.MapPath("~/" + match.Groups[2]),
            match.Groups[3]);
    });
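To see why Captures only had one entry: on a Match, Captures lists the captures of the whole match (group 0), while each pair of parentheses becomes an entry in Groups, starting at index 1. A small sketch that runs without ASP.NET follows. MapToLocal is a hypothetical stand-in for HostingEnvironment.MapPath, the "C:/site/" prefix is invented, and the pattern is slightly adjusted from the question's ([^>] instead of [^s] and [^/]), which is my assumption about the intent.

```csharp
using System.Text.RegularExpressions;

public static class ImgSrcRewriter
{
    // Three groups: the tag up to the opening quote, the path, the rest of the tag.
    private const string Pattern = @"(<\s*img[^>]*?src=[""'])([^""']*)([""'][^>]*>)";

    // Hypothetical replacement for HostingEnvironment.MapPath.
    public static string MapToLocal(string relativePath)
    {
        return "C:/site/" + relativePath;
    }

    public static string Rewrite(string html)
    {
        return Regex.Replace(html, Pattern, match =>
            match.Groups[1].Value +
            MapToLocal(match.Groups[2].Value) +
            match.Groups[3].Value);
    }

    // Reports how many groups and captures a single match exposes.
    public static string Describe(string html)
    {
        Match m = Regex.Match(html, Pattern);
        return m.Groups.Count + " groups, " + m.Captures.Count + " capture(s)";
    }
}
```

Groups.Count includes group 0 (the whole match), which is why three pairs of parentheses give four groups.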

Stripping out malformed HTML from string

Sometimes from a 3rd party API I get malformed HTML elements returned:
olor:red">Text</span>
when I expect:
<span style="color:red">Text</span>
For my context, the text content of the HTML is more important so it does not matter if I lose surrounding tags/formatting.
What would be the best way to strip out the malformed tags such that the first example would read
Text
and the second would not change?
I recommend you to take a look at the HtmlAgilityPack, which is a very handy tool also for HTML sanitization.
Here's an approach example by using the aforementioned library:
static void Main()
{
    var inputs = new[] {
        @"olor:red"">Text</span>",
        @"<span style=""color:red"">Text</span>",
        @"Text</span>",
        @"<span style=""color:red"">Text",
        @"<span style=""color:red"">Text"
    };
    var doc = new HtmlDocument();
    inputs.ToList().ForEach(i => {
        if (!i.StartsWith("<"))
        {
            if (i.IndexOf(">") != i.Length-1)
                i = "<" + i;
            else
                i = i.Substring(0, i.IndexOf("<"));
            doc.LoadHtml(i);
            Console.WriteLine(doc.DocumentNode.InnerText);
        }
        else
        {
            doc.LoadHtml(i);
            Console.WriteLine(doc.DocumentNode.OuterHtml);
        }
    });
}
Outputs:
Text
<span style="color:red">Text</span>
Text
<span style="color:red">Text</span>
<span style="color:red">Text</span>
If you just need the content of the tags, and no information of what type of tag etc, you could use Regular Expressions:
var r = new Regex(">([^>]+)<");
var text = "olor:red\">Text</span>";
var m = r.Match(text);
This will find every inner text of each tag.
Very crudely, you could strip out all 'tags' by stripping everything before a > and keeping everything before a <.
I'm assuming you also need to consider the situation where the text you receive is without tags: e.g. Text.
In pseudo-code:
returnText = ""
loop:
    gtI = text.IndexOf(">")
    ltI = text.IndexOf("<")
    if -1==gtI and -1==ltI:
        returnText += text
        we're done
    if gtI==-1:
        returnText += text up to position ltI
        return returnText
    if ltI==-1:
        returnText += text after gtI
        return returnText
    if ltI < gtI:
        returnText += textBefore ltI
        text = text after ltI
        loop
    // gtI < ltI:
    text = text after gtI
    loop
It's crude and can be done much better (and faster) with a custom coded parser, but essentially the logic would be the same.
You should really be asking why the API returns only part of what you require: I can't see why it should be returning ext</span> either, which really messes you up.
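The pseudo-code above translates fairly directly into C#. This is a sketch of that same crude logic, not a real HTML parser; the names TagStripper and StripTags are made up. Note that it strips well-formed tags too, so the second example from the question would also come back as plain Text.

```csharp
using System.Text;

public static class TagStripper
{
    public static string StripTags(string text)
    {
        var result = new StringBuilder();
        while (true)
        {
            int gt = text.IndexOf('>');
            int lt = text.IndexOf('<');

            if (gt == -1 && lt == -1)
            {
                // No tag characters left: keep the remainder.
                result.Append(text);
                break;
            }
            if (gt == -1)
            {
                // Only an opening '<' remains: keep the text before it.
                result.Append(text.Substring(0, lt));
                break;
            }
            if (lt == -1)
            {
                // Only a closing '>' remains: keep the text after it.
                result.Append(text.Substring(gt + 1));
                break;
            }
            if (lt < gt)
            {
                // Text followed by a tag: keep the text, continue after '<'.
                result.Append(text.Substring(0, lt));
                text = text.Substring(lt + 1);
            }
            else
            {
                // Tail of a broken tag: drop everything up to and including '>'.
                text = text.Substring(gt + 1);
            }
        }
        return result.ToString();
    }
}
```

Each pass through the loop shortens text by at least one character, so the loop always terminates.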

Replace with .Replace/.Regex

I am using Html.Raw(Html.Encode()) to allow some of html to be allowed. For example I want bold, italic, code etc... I am not sure it's the right method, code seems pretty ugly.
Input
Hello, this text will be [b]bold[/b]. [code]alert("Test...")[/code]
Output
Code
@Html.Raw(Html.Encode(Model.Body)
    .Replace(Environment.NewLine, "<br />")
    .Replace("[b]", "<b>")
    .Replace("[/b]", "</b>")
    .Replace("[code]", "<div class='codeContainer'><pre name='code' class='javascript'>")
    .Replace("[/code]", "</pre></div>"))
My Solution
I want to make it all a bit different. Instead of using BB-Tags I want to use simpler tags. For example, * will stand for bold. That means if I input This text is *bold*. it will be converted to This text is <b>bold</b>.. Kind of like what this website does, by the way.
Problem
To implement this I need some Regex and I have little to no experience with it. I've searched many sites, but no luck.
My implementation of it looks something like this, but it fails since I can't really replace a char with string.
static void Main(string[] args)
{
    string myString = "Hello, this text is *bold*, this text is also *bold*. And this is code: ~MYCODE~";
    string findString = "\\*";
    int firstMatch, nextMatch;
    Match match = Regex.Match(myString, findString);
    while (match.Success == true)
    {
        Console.WriteLine(match.Index);
        firstMatch = match.Index;
        match = match.NextMatch();
        if (match.Success == true)
        {
            nextMatch = match.Index;
            myString = myString[firstMatch] = "<b>"; // Ouch!
        }
    }
    Console.ReadLine();
}
"To implement this I need some Regex"
Ah no, you don't need Regex. Manipulating HTML with Regex can lead to undesired effects. You could simply use MarkdownSharp, which by the way is what this site uses to safely render Markdown markup into HTML.
Like this:
var markdown = new Markdown();
string html = markdown.Transform(SomeTextContainingMarkDown);
Of course to polish this you would write an HTML helper so that in your view:
@Html.Markdown(Model.Body)
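If a full Markdown library is more than you need, the two markers from the question could be handled with a pair of Regex.Replace calls. This is only a minimal sketch under a few assumptions: the class and method names are invented, the input is HTML-encoded first with WebUtility.HtmlEncode, and ~code~ is mapped to a bare <pre> tag, which is my guess at the intended markup. It does not handle nesting or escaped markers.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class SimpleMarkup
{
    public static string ToHtml(string input)
    {
        // Encode first so user-supplied < and > cannot inject markup.
        string encoded = WebUtility.HtmlEncode(input);

        // *text* becomes <b>text</b>; the lazy quantifier pairs each * with the next one.
        string withBold = Regex.Replace(encoded, @"\*(.+?)\*", "<b>$1</b>");

        // ~text~ becomes <pre>text</pre> (assumed markup for code).
        return Regex.Replace(withBold, @"~(.+?)~", "<pre>$1</pre>");
    }
}
```

The $1 in the replacement string is what lets a pair of single characters be swapped for tags without touching the text between them, which is the part the loop in the question was struggling with.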

Simple text to HTML conversion

I have a very simple asp:textbox with the multiline attribute enabled. I then accept just text, with no markup, from the textbox. Is there a common method by which line breaks and returns can be converted to <p> and <br/> tags?
I'm not looking for anything earth shattering, but at the same time I don't just want to do something like:
html.Insert(0, "<p>");
html.Replace(Environment.NewLine + Environment.NewLine, "</p><p>");
html.Replace(Environment.NewLine, "<br/>");
html.Append("</p>");
The above code doesn't work right, as in generating correct html, if there are more than 2 line breaks in a row. Having html like <br/></p><p> is not good; the <br/> can be removed.
I know this is old, but I couldn't find anything better after some searching, so here is what I'm using:
public static string TextToHtml(string text)
{
    text = HttpUtility.HtmlEncode(text);
    text = text.Replace("\r\n", "\r");
    text = text.Replace("\n", "\r");
    text = text.Replace("\r", "<br>\r\n");
    text = text.Replace("  ", " &nbsp;");
    return text;
}
If you can't use HttpUtility for some reason, then you'll have to do the HTML encoding some other way, and there are lots of minor details to worry about (not just <>&).
HtmlEncode only handles the special characters for you, so after that I convert any combo of carriage-return and/or line-feed to a BR tag, and any double-spaces to a single-space plus a NBSP.
Optionally you could use a PRE tag for the last part, like so:
public static string TextToHtml(string text)
{
    text = "<pre>" + HttpUtility.HtmlEncode(text) + "</pre>";
    return text;
}
Your other option is to take the text box contents and, instead of trying to handle line and paragraph breaks, just put the text between PRE tags. Like this:
<PRE>
Your text from the text box...
and a line after a break...
</PRE>
Depending on exactly what you are doing with the content, my typical recommendation is to ONLY use the <br /> syntax, and not to try and handle paragraphs.
How about throwing it in a <pre> tag. Isn't that what it's there for anyway?
I know this is an old post, but I've recently been in a similar problem using C# with MVC4, so thought I'd share my solution.
We had a description saved in a database. The text was a direct copy/paste from a website, and we wanted to convert it into semantic HTML, using <p> tags. Here is a simplified version of our solution:
string description = getSomeTextFromDatabase();
foreach(var line in description.Split('\n'))
{
    Console.Write("<p>" + line + "</p>");
}
In our case, to write out a variable, we needed to prefix it with @ because of the Razor syntax in the ASP.NET MVC framework. I've shown this with Console.Write, but you should be able to figure out how to implement this in your specific project based on this :)
Combining all previous plus considering titles and subtitles within the text comes up with this:
public static string ToHtml(this string text)
{
    var sb = new StringBuilder();
    var sr = new StringReader(text);
    var str = sr.ReadLine();
    while (str != null)
    {
        str = str.TrimEnd();
        // Strings are immutable, so the result of Replace must be assigned back.
        str = str.Replace("  ", " &nbsp;");
        if (str.Length > 80)
        {
            sb.AppendLine($"<p>{str}</p>");
        }
        else if (str.Length > 0)
        {
            sb.AppendLine($"{str}<br/>");
        }
        str = sr.ReadLine();
    }
    return sb.ToString();
}
The snippet could be enhanced by defining rules for short strings.
I realize this answer is 13 years late, but maybe someone else needs it.
sample line 1 \r\n
sample line 2 (last at paragraph) \r\n\r\n [\r\n]+
sample line 3 \r\n
Example code
private static Regex _breakRegex = new("(\r?\n)+");
private static Regex _paragraphBreakRegex = new("(?:\r?\n){2,}");
public static string ConvertTextToHtml(string description) {
    // Two or more consecutive line breaks separate paragraphs.
    string[] descriptionParagraphs = _paragraphBreakRegex.Split(description.Trim());
    if (descriptionParagraphs.Length > 0)
    {
        description = string.Empty;
        foreach (string line in descriptionParagraphs)
        {
            description += $"<p>{line}</p>";
        }
    }
    // Any remaining single line breaks become <br/>.
    return _breakRegex.Replace(description, "<br/>");
}
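Putting the pieces from these answers together, here is a sketch that HTML-encodes the text, treats any run of two or more line breaks as a single paragraph break (so no stray <br/></p><p> is produced), and turns the remaining single breaks into <br/>. It uses WebUtility.HtmlEncode from System.Net instead of HttpUtility so that it needs no System.Web reference; the names PlainTextHtml and ToParagraphs are made up for illustration.

```csharp
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

public static class PlainTextHtml
{
    public static string ToParagraphs(string text)
    {
        var paragraphs = Regex
            // Two or more consecutive line breaks end a paragraph.
            .Split(text.Trim(), @"(?:\r?\n){2,}")
            // Encode each paragraph before any tags are added.
            .Select(p => WebUtility.HtmlEncode(p))
            // Single line breaks inside a paragraph become <br/>.
            .Select(p => Regex.Replace(p, @"\r?\n", "<br/>"))
            .Select(p => "<p>" + p + "</p>");

        return string.Concat(paragraphs);
    }
}
```

Encoding happens before the tags are added, so the generated <p> and <br/> are not escaped. An empty input yields a single empty paragraph.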
