Insert at index of search term substring - c#

I am trying to highlight search terms in display results. Generally it works OK based on code found here on SO. My issue with it is that it replaces the substring with the search term, i.e. in this example it will replace "LOVE" with "love" (unacceptable). So I was thinking I probably want to find the index of the start of the substring, do an INSERT of the opening <span> tag, and do similar at the end of the substring. As yafs may be quite long I'm also thinking I need to integrate stringbuilder into this. Is this do-able, or is there a better way? As always, thank you in advance for your suggestions.
string yafs = "Looking for LOVE in all the wrong places...";
string searchTerm = "love";
yafs = yafs.ReplaceInsensitive(searchTerm, "<span style='background-color: #FFFF00'>"
+ searchTerm + "</span>");

how about this:
public static string ReplaceInsensitive(string yafs, string searchTerm) {
return Regex.Replace(yafs, "(" + searchTerm + ")", "<span style='background-color: #FFFF00'>$1</span>", RegexOptions.IgnoreCase);
}
update:
public static string ReplaceInsensitive(string yafs, string searchTerm) {
return Regex.Replace(yafs,
"(" + Regex.Escape(searchTerm) + ")",
"<span style='background-color: #FFFF00'>$1</span>",
RegexOptions.IgnoreCase);
}

Check this code
private static string ReplaceInsensitive(string text, string oldtext,string newtext)
{
int indexof = text.IndexOf(oldtext,0,StringComparison.InvariantCultureIgnoreCase);
while (indexof != -1)
{
text = text.Remove(indexof, oldtext.Length);
text = text.Insert(indexof, newtext);
indexof = text.IndexOf(oldtext, indexof + newtext.Length ,StringComparison.InvariantCultureIgnoreCase);
}
return text;
}

Does what you need:
static void Main(string[] args)
{
string yafs = "Looking for LOVE in all the wrong love places...";
string searchTerm = "LOVE";
Console.Write(ReplaceInsensitive(yafs, searchTerm));
Console.Read();
}
private static string ReplaceInsensitive(string yafs, string searchTerm)
{
StringBuilder sb = new StringBuilder();
foreach (string word in yafs.Split(' '))
{
string tempStr = word;
if (word.ToUpper() == searchTerm.ToUpper())
{
tempStr = word.Insert(0, "<span style='background-color: #FFFF00'>");
int len = tempStr.Length;
tempStr = tempStr.Insert(len, "</span>");
}
sb.AppendFormat("{0} ", tempStr);
}
return sb.ToString();
}
Gives:
Looking for < span style='background-color: #FFFF00'>LOVE< /span> in all the wrong < span style='background-color: #FFFF00'>love< /span> places...

Related

Regex catch string between strings

I created a small function to catch a string between strings.
public static string[] _StringBetween(string sString, string sStart, string sEnd)
{
if (sStart == "" && sEnd == "")
{
return null;
}
string sPattern = sStart + "(.*?)" + sEnd;
MatchCollection rgx = Regex.Matches(sString, sPattern);
if (rgx.Count < 1)
{
return null;
}
string[] matches = new string[rgx.Count];
for (int i = 0; i < matches.Length; i++)
{
matches[i] = rgx[i].ToString();
//MessageBox.Show(matches[i]);
}
return matches;
}
However if i call my function like this: _StringBetween("[18][20][3][5][500][60]", "[", "]");
It will fail. A way would be if i changed this line string sPattern = "\\" + sStart + "(.*?)" + "\\" + sEnd;
However i can not because i dont know if the character is going to be a bracket or a word.
Sorry if this is a stupid question but i couldn't find something similar searching.
A way would be if i changed this line string sPattern = "\\" + sStart + "(.*?)" + "\\" + sEnd; However i can not because i don't know if the character is going to be a bracket or a word.
You can escape all meta-characters by calling Regex.Escape:
string sPattern = Regex.Escape(sStart) + "(.*?)" + Regex.Escape(sEnd);
This would cause the content of sStart and sEnd to be interpreted literally.

regex to exclude if preceded by string?

I haven't used regex much before but found something useful on the net that I'm using:
private string ConvertUrlsToLinks(string msg)
{
string regex = #"((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:#=.+?,##%&~-]*[^.|\'|\\||\# |!|\(|?|\[|,| |>|<|;|\)])";
Regex r = new Regex(regex, RegexOptions.IgnoreCase);
return r.Replace(msg, "$1").Replace("href=\"www", "href=\"http://www").Replace(#"\r\n", "<br />").Replace(#"\n", "<br />").Replace(#"\r", "<br />");
}
It does a good job but now I want it to exclude urls that already have a "a href=" in front. There's the ending "/a" to consider too.
Can that be done with regex or have to use totally different approach, like coding?
Try this:
((?<!href=')(?<!href=")(www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:#=.+?,##%&~-]*[^.|\'|\\||\# |!|\(|?|\[|,| |>|<|;|\)])
I tested on regex101.com
With the following sample set:
www.google.com
http://hi.com
http://www.fishy.com
href='www.ignore.com'
www.ouch.com
Using your existing regex pattern you could make a few simple changes to handle additional text being prepended or appended to your string:
`.+` <- pattern -> `(.+)?`
Which would give you:
.+((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:#=.+?,##%&~-]*[^.|\'|\\||\# |!|\(|?|\[|,| |>|<|;|\)])(.+)?
So passing the string of either:
<a href='http://www.test.com'>http://www.test.com</a>
...or...
http://www.test.com
Would result in:
www.test.com
Examples:
https://regex101.com/r/bO0cW6/1
http://ideone.com/suVw3I
I think it would be a little ToNy tHe pOny to do that in regex after all, so wrote the code, in case anyone is interested here it is:
private string handleatag(string msg, string tagbegin, string tagend)
{
ArrayList tags = new ArrayList();
int tagbeginpos = msg.IndexOf(tagbegin);
int tagendpos;
string hash = tagbegin.GetHashCode().ToString();
while (tagbeginpos != -1)
{
tagendpos = msg.IndexOf(tagend, tagbeginpos);
if (tagendpos != -1)
{
string atag = msg.Substring(tagbeginpos, tagendpos - tagbeginpos + tagend.Length);
msg = msg.Replace(atag, hash + tags.Count.ToString());
tags.Add(atag);
}
else
msg = msg.Remove(tagbeginpos, tagbegin.Length);
tagbeginpos = msg.IndexOf(tagbegin, tagbeginpos);
}
msg = ConvertUrlsToLinks(msg);
for (int i = 0; i < tags.Count; i++)
msg = msg.Replace(hash + i.ToString(), tags[i].ToString());
return msg;
}
private string ConvertUrlsToLinks(string msg)
{
if (msg.IndexOf("<a href=") != -1)
return handleatag(msg, "<a href=", "</a>");
string regex = #"((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:#=.+?,##%&~-]*[^.|\'|\\||\# |!|\(|?|\[|,| |>|<|;|\)])";
Regex r = new Regex(regex, RegexOptions.IgnoreCase);
return r.Replace(msg, "$1").Replace("href=\"www", "href=\"http://www").Replace(#"\r\n", "<br />").Replace(#"\n", "<br />").Replace(#"\r", "<br />");
}

c# Parse URL in text

I have a sentence that may contain URL's. I need to take any URL in uppercase that starts with WWW., and append HTTP://. I have tried the following:
private string ParseUrlInText(string text)
{
string currentText = text;
foreach (string word in currentText.Split(new[] { "\r\n", "\n", " ", "</br>" }, StringSplitOptions.RemoveEmptyEntries))
{
string thing;
if (word.ToLower().StartsWith("www."))
{
if (IsAllUpper(word))
{
thing = "HTTP://" + word;
currentText = ReplaceFirst(currentText, word, thing);
}
}
}
return currentText;
}
public string ReplaceFirst(string text, string search, string replace)
{
int pos = text.IndexOf(search);
if (pos < 0)
{
return text;
}
return text.Substring(0, pos) + replace + text.Substring(pos + search.Length);
}
private static bool IsAllUpper(string input)
{
return input.All(t => !Char.IsLetter(t) || Char.IsUpper(t));
}
However its only appending multiple HTTP:// to the first URL using the following:
WWW.GOOGLE.CO.ZA
WWW.GOOGLE.CO.ZA WWW.GOOGLE.CO.ZA
HTTP:// WWW.GOOGLE.CO.ZA
there are a lot of domains (This shouldn't be parsed)
to
HTTP:// WWW.GOOGLE.CO.ZA
HTTP:// WWW.GOOGLE.CO.ZA HTTP:// WWW.GOOGLE.CO.ZA
HTTP:// WWW.GOOGLE.CO.ZA
there are a lot of domains (This shouldn't be parsed)
Please could someone show me the proper way to do this
Edit: I need to keep the format of the string (Spaces, newlines etc)
Edit2: A url might have an HTTP:// appended. I've updated the demo.
The issue with your code: you're using a ReplaceFirst method, which does exactly what it's meant to: it replaces the first occurence, which is obviously not always the one you want to replace. This is why only your first WWW.GOOGLE.CO.ZA get all the appending of HTTP://.
One method would be to use a StreamReader or something, and each time you get to a new word, you check if it's four first characters are "WWW." and insert at this position of the reader the string "HTTP://". But it's pretty heavy lenghted for something that can be way shorter...
So let's go Regex!
How to insert characters before a word with Regex
Regex.Replace(input, #"[abc]", "adding_text_before_match$1");
How to match words not starting with another word:
(?<!wont_start_with_that)word_to_match
Which leads us to:
private string ParseUrlInText(string text)
{
return Regex.Replace(text, #"(?<!HTTP://)(WWW\.[A-Za-z0-9_\.]+)",
#"HTTP://$1");
}
I'd go for the following:
1) You don't handle same elements twice,
2) You replace all instances once
private string ParseUrlInText(string text)
{
string currentText = text;
var workingText = currentText.Split(new[] { "\r\n", "\n", " ", "</br>" },
StringSplitOptions.RemoveEmptyEntries).Distinct() // .Distinct() gives us just unique entries!
foreach (string word in workingText)
{
string thing;
if (word.ToLower().StartsWith("www."))
{
if (IsAllUpper(word))
{
thing = "HTTP://" + word;
currentText = currentText.Replace("\r\n" + word, "\r\n" + thing)
.Replace("\n" + word, "\n" + thing)
.Replace(" " + word, " " + thing)
.Replace("</br>" + word, "</br>" + thing)
}
}
}
return currentText;
}

Replace closest instance of a word

I have a string like this:
“I’m a member of the Imperial Senate on a diplomatic mission to Alderaan.”
I want to insert <strong> around the "a" in "a diplomatic", but nowhere else.
What I have as input is diplomatic from a previous function, and I wan't to add <strong>to the closest instance of "a".
Right now, of course when I use .Replace("a", "<strong>a</strong>"), every single instance of "a" receives the <strong>-treatment, but is there any way to apply this to just to one I want?
Edit
The string and word/char ("a" in the case above) could be anything, as I'm looping through a lot of these, so the solution has to be dynamic.
var stringyourusing = "";
var letter = "";
var regex = new Regex(Regex.Escape(letter));
var newText = regex.Replace(stringyourusing , "<strong>letter</strong>", 1);
Would this suffice?
string MakeStrongBefore(string strong, string before, string s)
{
return s.Replace(strong + " " + subject, "<strong>" + strong + "</strong> " + before);
}
Used like this:
string s = “I’m a member of the Imperial Senate on a diplomatic mission to Alderaan.”;
string bolded = MakeStrongBefore("a", "diplomatic", s);
Try this:
public string BoldBeforeString(string source, string bolded,
int boldBeforePosition)
{
string beforeSelected = source.Substring(0, boldBeforePosition).TrimEnd();
int testedWordStartIndex = beforeSelected.LastIndexOf(' ') + 1;
string boldedString;
if (beforeSelected.Substring(testedWordStartIndex).Equals(bolded))
{
boldedString = source.Substring(0, testedWordStartIndex) +
"<strong>" + bolded + "</strong>" +
source.Substring(testedWordStartIndex + bolded.Length);
}
else
{
boldedString = source;
}
return boldedString;
}
string phrase = "I’m a member of the Imperial Senate on a diplomatic mission to Alderaan.";
string boldedPhrase = BoldBeforeString(phrase, "a", 41);
Hei!
I've tested this and it works:
String replaced = Regex.Replace(
"I’m a member of the Imperial Senate on a diplomatic mission to Alderaan.",
#"(a) diplomatic",
match => "<strong>" + match.Result("$1") + "</strong>");
So to make it a general function:
public static String StrongReplace(String sentence, String toStrong, String wordAfterStrong)
{
return Regex.Replace(
sentence,
#"("+Regex.Escape(toStrong)+") " + Regex.Escape(wordAfterStrong),
match => "<strong>" + match.Result("$1") + "</strong>");
}
Usage:
String sentence = "I’m a member of the Imperial Senate on a diplomatic mission to Alderaan.";
String replaced = StrongReplace(sentence, "a", "diplomatic");
edit:
considering your other comments, this is a function for placing strong tags around each word surrounding the search word:
public static String StrongReplace(String sentence, String word)
{
return Regex.Replace(
sentence,
#"(\w+) " + Regex.Escape(word) + #" (\w+)",
match => "<strong>" + match.Result("$1") + "</strong> " + word + " <strong>" + match.Result("$2") + "</strong>");
}

How to trim/replace a char '\' or '"' from a string?

I need to use a string for path for a file but sometimes there are forbidden characters in this string and I must replace them. For example, my string _title is rumbaton jonathan \"racko\" contreras.
Well I should replace the chars \ and ".
I tried this but it doesn't work:
_title.Replace(#"/", "");
_title.Replace(#"\", "");
_title.Replace(#"*", "");
_title.Replace(#"?", "");
_title.Replace(#"<", "");
_title.Replace(#">", "");
_title.Replace(#"|", "");
Since strings are immutable, the Replace method returns a new string, it doesn't modify the instance you are calling it on. So try this:
_title = _title
.Replace(#"/", "")
.Replace(#"""", "")
.Replace(#"*", "")
.Replace(#"?", "")
.Replace(#"<", "")
.Replace(#">", "")
.Replace(#"|", "");
Also if you want to replace " make sure you have properly escaped it.
Try regex
string illegal = "\"M\"\\a/ry/ h**ad:>> a\\/:*?\"| li*tt|le|| la\"mb.?";
string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
illegal = r.Replace(illegal, "");
Before: "M"\a/ry/ h**ad:>> a/:?"| litt|le|| la"mb.?
After: Mary had a little lamb.
Also another answer from same post is much cleaner
private static string CleanFileName(string fileName)
{
return Path.GetInvalidFileNameChars().Aggregate(fileName, (current, c) => current.Replace(c.ToString(), string.Empty));
}
from How to remove illegal characters from path and filenames?
Or you could try this (probably terribly inefficient) method:
string inputString = #"File ~!##$%^&*()_+|`1234567890-=\[];',./{}:""<>? name";
var badchars = Path.GetInvalidFileNameChars();
foreach (var c in badchars)
inputString = inputString.Replace(c.ToString(), "");
The result will be:
File ~!##$%^&()_+`1234567890-=[];',.{} name
But feel free to add more chars to the badchars before running the foreach loop on them.
See http://msdn.microsoft.com/cs-cz/library/fk49wtc1.aspx:
Returns a string that is equivalent to the current string except that all instances of oldValue are replaced with newValue.
I have written a method to do the exact operation that you want and with much cleaner code.
The method
public static string Delete(this string target, string samples) {
if (string.IsNullOrEmpty(target) || string.IsNullOrEmpty(samples))
return target;
var tar = target.ToCharArray();
const char deletechar = '♣'; //a char that most likely never to be used in the input
for (var i = 0; i < tar.Length; i++) {
for (var j = 0; j < samples.Length; j++) {
if (tar[i] == samples[j]) {
tar[i] = deletechar;
break;
}
}
}
return tar.ConvertToString().Replace(deletechar.ToString(CultureInfo.InvariantCulture), string.Empty);
}
Sample
var input = "rumbaton jonathan \"racko\" contreras";
var cleaned = input.Delete("\"\\/*?><|");
Will result in:
rumbaton jonathan racko contreras
Ok ! I've solved my issue thanks to all your indications. This is my correction :
string newFileName = _artist + " - " + _title;
char[] invalidFileChars = Path.GetInvalidFileNameChars();
char[] invalidPathChars = Path.GetInvalidPathChars();
foreach (char invalidChar in invalidFileChars)
{
newFileName = newFileName.Replace(invalidChar.ToString(), string.Empty);
}
foreach (char invalidChar in invalidPathChars)
{
newFilePath = newFilePath.Replace(invalidChar.ToString(), string.Empty);
}
Thank you so musch everybody :)

Categories

Resources