Replacement in a String with a regular expression - c#

I'm trying to replace text in a string in C# with the Regex class, but I don't know how to use the class properly.
I want to replace every occurrence of the following sequence in the string "a":
":(one space)(one or more characters)(one space)"
with this sequence:
":(two spaces)(one or more characters)(three spaces)"
Can anyone help me with the code and explain the regular expression used?

You can use string.Replace(string, string). Try this one:
http://msdn.microsoft.com/en-us/library/fk49wtc1.aspx
Try this one:
private static string StrReplace(string str)
{
    // Group 1 captures the word; [a-z]{2,} requires at least two letters.
    const string pattern = ": ([a-z]{2,}) ";
    // $1 re-inserts the captured word between two leading and three trailing spaces.
    return Regex.Replace(str, pattern, ":  $1   ", RegexOptions.IgnoreCase);
}

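Using String.Replace (literal text only; ".*." is not treated as a pattern here)
var str = "Test string with : .*. to replace";
var newstr = str.Replace(": .*. ", ":  .*.   ");
Using Regex.Replace (the captured group is re-inserted with $1)
var newstr = Regex.Replace(str, ": (.+?) ", ":  $1   ");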

Related

Split and Append "AND" between values

How can I split the value below and insert AND between the values?
I cannot split on spaces because there are spaces within the names.
"\"Mark John\" \"Tina Roy\""
as
"\"Mark John\" AND \"Tina Roy\""
In the end it should look like -
"Mark John" AND "Tina Roy"
Any help is appreciated.
string operatorValue = " AND ";
if (!string.IsNullOrEmpty(operatorValue))
{
foreach (string searchVal in SearchRequest.Text.Split(' '))
{
if (!string.IsNullOrEmpty(searchVal))
searchValue += searchVal + operatorValue;
}
}
int index = searchValue.LastIndexOf(operatorValue);
if (index != -1)
{
outputSearchValue = searchValue.Substring(0, index);
}
Try
var result = str.Replace("\" \"", "\" AND \"");
If there is a possibility of more than one whitespace character between two names, you could opt for Regex.
var result = Regex.Replace(str, "\"\\s+\"", "\" AND \"");
Example,
var str = "\"Mark John\" \"Tina Roy\" \"Anu Viswan\"";
var result = Regex.Replace(str, "\"\\s+\"", "\" AND \"");
Output
"Mark John" AND "Tina Roy" AND "Anu Viswan"
Or use Regular Expressions:
var test = "\"John Smith\" \"Bill jones\" \"Bob Norman\"";
Console.WriteLine(Regex.Replace(test, "\" \"", "\" AND \""));
Instead of splitting, replace the " " between the names with " AND ", keeping the surrounding quotes:
var test = "\"Mark John\" \"Tina Roy\"";
var new_string = test.Replace("\" \"", "\" AND \"");
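If you do want the "split, then join" shape the question asked for, here is a minimal sketch. It assumes C# 9 top-level statements and that every name is wrapped in double quotes with no escaped quotes inside; JoinQuoted is a hypothetical helper name, not something from the answers above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: finds every double-quoted phrase and joins them with the separator.
static string JoinQuoted(string text, string separator)
{
    // "[^"]*" matches an opening quote, any run of non-quote characters, and a closing quote.
    var phrases = Regex.Matches(text, "\"[^\"]*\"")
                       .Cast<Match>()
                       .Select(m => m.Value);
    return string.Join(separator, phrases);
}

string input = "\"Mark John\"   \"Tina Roy\" \"Anu Viswan\"";
Console.WriteLine(JoinQuoted(input, " AND "));
// prints: "Mark John" AND "Tina Roy" AND "Anu Viswan"
```

Because the phrases are matched rather than split, the amount of whitespace between them does not matter.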

Get Regex occurrences with escaped symbols

I would appreciate help with a regex that does not work for special symbols such as % or $:
public List<Tuple<string, string>> GetParts(string str, string beginMark, string endMark)
{
    var pattern =
        new Regex(beginMark + @"(?<val>.*?)" + endMark,
            RegexOptions.Compiled |
            RegexOptions.Singleline);
    return (from Match match in pattern.Matches(str)
            where match.Success
            select Tuple.Create(
                match.Value,
                match.Groups["val"].Value))
        .ToList();
}
Calling method:
string input = @"%sometext%\another text";
string replacedValue = "AAA";
var occurrences = GetParts(input, @"(%", ")");
foreach (var occurrence in occurrences)
{
    Console.WriteLine(occurrence.Item1 + Environment.NewLine);
    Console.WriteLine(occurrence.Item2 + Environment.NewLine);
    // replace
    Console.WriteLine(input.Replace(occurrence.Item1, replacedValue) + Environment.NewLine);
}
Expected Output:
%sometext%
sometext
AAA\another text
You need to escape your symbols. Try changing
new Regex(beginMark + @"(?<val>.*?)" + endMark,
to
new Regex(Regex.Escape(beginMark) + @"(?<val>.*?)" + Regex.Escape(endMark),
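To see the effect of Regex.Escape end to end, here is a small sketch written with C# 9 top-level statements. The markers "$(" and ")" and the sample input are assumptions chosen for illustration; unescaped, the $ would be read as an end-of-input anchor and the parentheses as a group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static List<Tuple<string, string>> GetParts(string str, string beginMark, string endMark)
{
    // Regex.Escape turns metacharacters such as $ ( ) into literals.
    var pattern = new Regex(
        Regex.Escape(beginMark) + @"(?<val>.*?)" + Regex.Escape(endMark),
        RegexOptions.Singleline);

    return pattern.Matches(str)
                  .Cast<Match>()
                  .Select(m => Tuple.Create(m.Value, m.Groups["val"].Value))
                  .ToList();
}

string input = @"$(sometext)\another text";
foreach (var part in GetParts(input, "$(", ")"))
{
    Console.WriteLine(part.Item1);                        // $(sometext)
    Console.WriteLine(part.Item2);                        // sometext
    Console.WriteLine(input.Replace(part.Item1, "AAA"));  // AAA\another text
}
```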

Regex Replace on a JSON structure

I am currently trying to do a Regex Replace on a JSON string that looks like:
String input = "{\"`####`Answer_Options11\": \"monkey22\",\"`####`Answer_Options\": \"monkey\",\"Answer_Options2\": \"not a monkey\"}";
The goal is to find and replace all the value fields whose key field starts with `####`
I currently have this:
static Regex _encryptedFieldRegex = new Regex(@"`####`\w+" + ".:.\"(.*)\",");
static public string MatchKey(string input)
{
MatchCollection match = _encryptedFieldRegex.Matches(input.ToLower());
string match2 = "";
foreach (Match k in match )
{
foreach (Capture cap in k.Captures)
{
Console.WriteLine("" + cap.Value);
match2 = Regex.Replace(input.ToLower(), cap.Value.ToString(), "CAKE");
}
}
return match2.ToString();
}
Now this isn't working, presumably because it picks up the entire `####`Answer_Options11\": \"monkey22\",\"`####`Answer_Options\": \"monkey\", as one match and replaces it. I want to replace just match.Groups[1], as you would for a single match on the string.
At the end of the day the JSON string needs to look something like this:
String input = "{\"`####`Answer_Options11\": \"CATS AND CAKE\",\"`####`Answer_Options\": \"CAKE WAS A LIE\",\"Answer_Options2\": \"not a monkey\"}";
Any idea how to do this?
You want a positive lookahead and a positive lookbehind:
(?<=####.+?:).*?(?=,)
The lookahead and lookbehind verify that the surrounding text matches those patterns without including it in the match. This site explains the concept pretty well.
Generated code from RegexHero.com:
string strRegex = @"(?<=####.+?:).*?(?=,)";
Regex myRegex = new Regex(strRegex);
string strTargetString = @" ""{\""`####`Answer_Options11\"": \""monkey22\"",\""`####`Answer_Options\"": \""monkey\"",\""Answer_Options2\"": \""not a monkey\""}""";
foreach (Match myMatch in myRegex.Matches(strTargetString))
{
if (myMatch.Success)
{
// Add your code here
}
}
This will match "monkey22" and "monkey" but not "not a monkey".
Working from @Jonesy's answer, I got to this, which works for what I wanted. It includes the .Replace on the groups that I required. The lookahead and lookbehind were very interesting, but I needed to replace some of those values, hence the groups.
static public string MatchKey(string input)
{
string strRegex = @"(__u__)(.+?:\s*)""(.*)""(,|})*";
Regex myRegex = new Regex(strRegex, RegexOptions.IgnoreCase | RegexOptions.Multiline);
IQS_Encryption.Encryption enc = new Encryption();
int count = 1;
string addedJson = "";
int matchCount = 0;
foreach (Match myMatch in myRegex.Matches(input))
{
if (myMatch.Success)
{
//Console.WriteLine("REGEX MYMATCH: " + myMatch.Value);
input = input.Replace(myMatch.Value, "__e__" + myMatch.Groups[2].Value + "\"c" + count + "\"" + myMatch.Groups[4].Value);
addedJson += "c"+count + "{" +enc.EncryptString(myMatch.Groups[3].Value, Encoding.UTF8.GetBytes("12345678912365478912365478965412"))+"},";
}
count++;
matchCount++;
}
Console.WriteLine("MAC" + matchCount);
return input + addedJson;
}
Thanks again to @Jonesy for the huge help.
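For readers who only need to swap the values and keep the keys, Regex.Replace with a MatchEvaluator can rewrite one capture group per match. The following is a sketch under stated assumptions: C# 9 top-level statements, values that contain no escaped quotes, and a hypothetical helper name ReplaceMarkedValues. A real JSON parser would be more robust than a regex here.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: applies `transform` to every value whose key starts with `####`.
static string ReplaceMarkedValues(string json, Func<string, string> transform)
{
    // Group 1: the marked key, the colon, and the opening quote of the value.
    // Group 2: the value itself (escaped quotes are not handled here).
    // Group 3: the closing quote.
    const string pattern = @"(""`####`\w+""\s*:\s*"")([^""]*)("")";
    return Regex.Replace(json, pattern,
        m => m.Groups[1].Value + transform(m.Groups[2].Value) + m.Groups[3].Value);
}

string input = "{\"`####`Answer_Options11\": \"monkey22\",\"`####`Answer_Options\": \"monkey\",\"Answer_Options2\": \"not a monkey\"}";
Console.WriteLine(ReplaceMarkedValues(input, value => value.ToUpperInvariant()));
// prints: {"`####`Answer_Options11": "MONKEY22","`####`Answer_Options": "MONKEY","Answer_Options2": "not a monkey"}
```

The evaluator runs once per match, so each value can be replaced with something different, which a plain replacement string cannot do.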

Extract some numbers and decimals from a string

I have a string:
" a.1.2.3 #4567 "
and I want to reduce that to just "1.2.3".
Currently using Substring() and Remove(), but that breaks if there ends up being more numbers after the pound sign.
What's the best way to go about doing this? I've read a bunch of questions on regex & string.split, but I can't get anything I try to work in VB.net. Would I have to do a match then replace using the match result?
Any help would be much appreciated.
This should work:
string input = " a.1.2.3 #4567 ";
int poundIndex = input.IndexOf("#");
if(poundIndex >= 0)
{
string relevantPart = input.Substring(0, poundIndex).Trim();
IEnumerable<Char> numPart = relevantPart.SkipWhile(c => !Char.IsDigit(c));
string result = new string(numPart.ToArray());
}
Try this...
string[] parts = input.Split('#');
string output = parts[0].Trim().Substring(2); // 2 skips the leading "a." once the surrounding blanks are trimmed
Here is the regex way of doing it:
string input = " a.1.2.3 #4567 ";
Regex regex = new Regex(@"(\d\.)+\d");
var match = regex.Match(input);
if (match.Success)
{
    string output = match.Groups[0].Value; // "1.2.3"
    // Or, equivalently:
    // string output = match.Value; // "1.2.3"
}
If the pound sign is the most relevant bit, rely on Split. Sample VB.NET code:
Dim inputString As String = " a.1.2.3 #4567 "
If (inputString.Contains("#")) Then
Dim firstBit As String = inputString.Split("#")(0).Trim()
Dim headingToRemove As String = "a."
Dim result As String = firstBit.Substring(headingToRemove.Length, firstBit.Length - headingToRemove.Length)
End If
Since this is a multi-language question, here is the translation to C#:
string inputString = " a.1.2.3 #4567 ";
if (inputString.Contains("#"))
{
string firstBit = inputString.Split('#')[0].Trim();
string headingToRemove = "a.";
string result = firstBit.Substring(headingToRemove.Length, firstBit.Length - headingToRemove.Length);
}
Another way, using an unrolled pattern (written in free-spacing form, so it needs RegexOptions.IgnorePatternWhitespace):
\d+ (?: \. \d+ )+
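A short sketch of how that unrolled pattern could be applied, assuming C# 9 top-level statements; ExtractDottedNumber is a hypothetical helper name, and returning an empty string when nothing matches is a choice made here for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the first dotted number sequence, or "" if none is found.
static string ExtractDottedNumber(string text)
{
    // IgnorePatternWhitespace lets the pattern be written with spaces for readability.
    var match = Regex.Match(text, @"\d+ (?: \. \d+ )+", RegexOptions.IgnorePatternWhitespace);
    return match.Success ? match.Value : string.Empty;
}

Console.WriteLine(ExtractDottedNumber(" a.1.2.3 #4567 "));  // 1.2.3
```

The digits after the pound sign are skipped because the pattern requires at least one ".digits" group.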

What is the Best Way to Clean a URL with a Title in it

What is the best way to clean a URL? I am looking for a URL like this
what_is_the_best_headache_medication
My current code
public string CleanURL(string str)
{
str = str.Replace("!", "");
str = str.Replace("@", "");
str = str.Replace("#", "");
str = str.Replace("$", "");
str = str.Replace("%", "");
str = str.Replace("^", "");
str = str.Replace("&", "");
str = str.Replace("*", "");
str = str.Replace("(", "");
str = str.Replace(")", "");
str = str.Replace("-", "");
str = str.Replace("_", "");
str = str.Replace("+", "");
str = str.Replace("=", "");
str = str.Replace("{", "");
str = str.Replace("[", "");
str = str.Replace("]", "");
str = str.Replace("}", "");
str = str.Replace("|", "");
str = str.Replace(@"\", "");
str = str.Replace(":", "");
str = str.Replace(";", "");
str = str.Replace(@"\", "");
str = str.Replace("'", "");
str = str.Replace("<", "");
str = str.Replace(">", "");
str = str.Replace(",", "");
str = str.Replace(".", "");
str = str.Replace("`", "");
str = str.Replace("~", "");
str = str.Replace("/", "");
str = str.Replace("?", "");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace("  ", " ");
str = str.Replace(" ", "_");
return str;
}
Regular expressions for sure:
public string CleanURL(string str)
{
str = Regex.Replace(str, "[^a-zA-Z0-9 ]", "");
str = Regex.Replace(str, " +", "_");
return str;
}
(Not actually tested, off the top of my head.)
Let me explain:
The first line removes everything that's not an alphanumeric character (upper or lowercase) or a space.
The second line replaces any sequence of spaces (1 or more, sequentially) with a single underscore.
Generally your best bet is to go with a white list regular expression approach instead of removing all the unwanted characters because you definitely are going to miss some.
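To make the whitelist approach concrete, here is a runnable sketch of the two-step cleanup, assuming C# 9 top-level statements. The Trim call is an addition not present in the answer above; it keeps leading or trailing spaces from turning into underscores.

```csharp
using System;
using System.Text.RegularExpressions;

static string CleanUrl(string title)
{
    // Whitelist: drop everything except ASCII letters, digits, and spaces.
    string cleaned = Regex.Replace(title, "[^a-zA-Z0-9 ]", "");
    // Collapse each run of spaces into one underscore; Trim avoids leading/trailing underscores.
    return Regex.Replace(cleaned.Trim(), " +", "_");
}

Console.WriteLine(CleanUrl(" What is the best   headache medication?! "));
// prints: What_is_the_best_headache_medication
```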
The answers here are fine so far but I personally did not want to remove umlauts and characters with accent marks entirely. So the final solution I came up with looks like this:
public static string CleanUrl(string value)
{
    if (string.IsNullOrEmpty(value))
        return value;
    // replace hyphens with spaces, remove all leading and trailing whitespace
    value = value.Replace("-", " ").Trim().ToLower();
    // replace each run of whitespace with a single hyphen
    value = Regex.Replace(value, @"[\s]+", "-");
    // replace umlauts and eszett with their equivalent
    value = value.Replace("ß", "ss");
    value = value.Replace("ä", "ae");
    value = value.Replace("ö", "oe");
    value = value.Replace("ü", "ue");
    // removes diacritic marks (often called accent marks) from characters
    value = RemoveDiacritics(value);
    // remove all remaining unwanted characters (whitelist)
    value = Regex.Replace(value, @"[^a-z0-9\s-]", String.Empty);
    return value;
}
The used RemoveDiacritics method is based on the SO answer by Blair Conrad:
public static string RemoveDiacritics(string value)
{
if (string.IsNullOrEmpty(value))
return value;
string normalized = value.Normalize(NormalizationForm.FormD);
StringBuilder sb = new StringBuilder();
foreach (char c in normalized)
{
if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
sb.Append(c);
}
Encoding nonunicode = Encoding.GetEncoding(850);
Encoding unicode = Encoding.Unicode;
byte[] nonunicodeBytes = Encoding.Convert(unicode, nonunicode, unicode.GetBytes(sb.ToString()));
char[] nonunicodeChars = new char[nonunicode.GetCharCount(nonunicodeBytes, 0, nonunicodeBytes.Length)];
nonunicode.GetChars(nonunicodeBytes, 0, nonunicodeBytes.Length, nonunicodeChars, 0);
return new string(nonunicodeChars);
}
Hope that helps somebody challenged by slugifying URLs and keeping umlauts and friends with their URL friendly equivalent at the same time.
You should consider using a regular expression instead. It's much more efficient than what you're trying to do above.
More on Regular Expressions here.
How do you define "friendly" URL - I'm assuming you mean to remove _'s etc.
I'd look into a regular expression here.
If you want to persist with the method above, I would suggest moving to StringBuilder over a string. This is because each of your replace operations is creating a new string.
I can tighten up one piece of that:
while (str.IndexOf("  ") >= 0)
    str = str.Replace("  ", " ");
...instead of your endless run of "  " replacements. But you almost certainly want a regular expression instead.
Or, a bit more verbose, but this only allows alphanumeric and spaces (which are replaced by '-')
string Cleaned = String.Empty;
foreach (char c in Dirty)
if (((c >= 'a') && (c <= 'z')) ||
(c >= 'A') && (c <= 'Z') ||
(c >= '0') && (c <= '9') ||
(c == ' '))
Cleaned += c;
Cleaned = Cleaned.Replace(" ", "-");
The way Stack Overflow does it can be found here:
https://stackoverflow.com/a/25486/142014
optimized for speed ("This is the second version, unrolled for 5x more performance") and taking care of a lot of special characters.
