String "de-concatenation" - c#

I have two strings like this
string s = "abcdef";
string t = "def";
I would like to remove t from s. Can I do this like this?
s = s - t?
EDIT
I will have two strings s and t, t will be an ending substring of s. I want to remove t from s.

No, but you can do this:
var newStr = "abcdef".Replace("def", "");
Per your comments, if you want to only remove the trailing pattern you can use a Regex:
var newStr = Regex.Replace("defdefdef", "(def)$", "");
The '$' anchors the match to the end of the string, so only the final 'def' is removed.
Turning this into an extension method:
public static String ReplaceEnd(this string input, string subStr, string replace = "")
{
//Per Alexei Levenkov's comments, the string should
// be escaped in order to avoid accidental injection
// of special characters into the Regex pattern
var escaped = Regex.Escape(subStr);
var pattern = String.Format("({0})$", escaped);
return Regex.Replace(input, pattern, replace);
}
Using this method with your code above would become:
string s = "abcdef";
string t = "def";
s = s.ReplaceEnd(t); // Ta Da!

Like this:
if (s.EndsWith(t))
{
s = s.Substring(0, s.LastIndexOf(t));
}

s = s.Substring(0, s.Length - t.Length);
Substring takes two arguments: start and length. You want to take things from the start of abcdef, that's index 0, and you want to take all the characters minus the characters from t, which is the difference of length of the two strings.
This assumes the OP's contract of "t will be an ending substring of s". If in fact this precondition is not guaranteed, it needs if (s.EndsWith(t)) around it.
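The guarded variant described above can be wrapped in a small helper. This is only a minimal sketch: the name RemoveSuffix is hypothetical, and StringComparison.Ordinal is an assumption, chosen so the suffix check is an exact, culture-independent comparison.

```csharp
using System;

static class SuffixDemo
{
    // Hypothetical helper: returns s without the trailing t,
    // or s unchanged when s does not end with t.
    public static string RemoveSuffix(string s, string t)
    {
        if (t.Length > 0 && s.EndsWith(t, StringComparison.Ordinal))
        {
            return s.Substring(0, s.Length - t.Length);
        }
        return s;
    }

    static void Main()
    {
        Console.WriteLine(RemoveSuffix("abcdef", "def"));    // abc
        Console.WriteLine(RemoveSuffix("abcdef", "xyz"));    // abcdef
        Console.WriteLine(RemoveSuffix("defdefdef", "def")); // defdef
    }
}
```

Unlike Replace, this only ever touches the end of the string, and it needs no regex escaping.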

Related

Regex replace special characters defined by client

I need a C# function that replaces all special characters, as defined by the client, in a string. Example:
string value1 = @"‹¥ó׬¶ÝÆ";
string input1 = @"Thi¥s is\123a strÆing";
string output1 = Regex.Replace(input1, value1, "");
I want a result like this: output1 = Thi s is\123a str ing
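Why do you need regex? This is more efficient, more concise, and more readable:
// Where keeps repeated characters; Except is a set operation and would also drop duplicates from input1
string result = string.Concat(input1.Where(c => !value1.Contains(c)));
If you want to replace them with a different string rather than remove them, you can still use a similar (but less efficient) approach: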
string replacement = "[foo]";
var newChars = input1.SelectMany(c => value1.Contains(c) ? replacement : c.ToString());
string result = string.Concat( newChars ); // Thi[foo]s is\123a str[foo]ing
Someone asked for a regex?
string value1 = @"^\-[]‹¥ó׬¶ÝÆ";
string input1 = @"T-^\hi¥s is\123a strÆing";
// Handles ]^-\ by escaping them
string value1b = Regex.Replace(value1, @"([\]\^\-\\])", @"\$1");
// Creates a [...] regex and uses it
string input1b = Regex.Replace(input1, "[" + value1b + "]", " ");
The basic idea is to use a [...] character class. First, though, you have to escape the characters that have special meaning inside [...]: ] ^ - and \. Note that you don't need to escape the [.
Note that this solution isn't compatible with non-BMP Unicode characters (characters that take up two chars).
A solution that is compatible with them is more complex, but for normal use it shouldn't be a problem.
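For comparison, the non-regex idea from the first answer can be sketched with a HashSet<char> and a StringBuilder, which avoids both the character-class escaping and repeated string scans. The helper name ReplaceChars is hypothetical, and, like the regex above, this sketch works on UTF-16 chars, so it does not handle non-BMP characters either.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class CharFilterDemo
{
    // Hypothetical helper: replaces every char of input that occurs in
    // unwanted with the replacement string (use "" to remove them).
    public static string ReplaceChars(string input, string unwanted, string replacement)
    {
        var set = new HashSet<char>(unwanted);
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (set.Contains(c)) sb.Append(replacement);
            else sb.Append(c);
        }
        return sb.ToString();
    }

    static void Main()
    {
        string value1 = @"‹¥ó׬¶ÝÆ";
        string input1 = @"Thi¥s is\123a strÆing";
        Console.WriteLine(ReplaceChars(input1, value1, " ")); // Thi s is\123a str ing
        Console.WriteLine(ReplaceChars(input1, value1, ""));  // This is\123a string
    }
}
```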

How to remove only certain substrings from a string?

Using C#, I have a string that is a SQL script containing multiple queries. I want to remove sections of the string that are enclosed in single quotes. I can do this using Regex.Replace, in this manner:
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
test = Regex.Replace(test, "'[^']*'", string.Empty);
Results in: "Only can we turn him to the of the Force"
What I want to do is remove the substrings between quotes EXCEPT for substrings containing a specific substring. For example, using the string above, I want to remove the quoted substrings except for those that contain "dark," such that the resulting string is:
Results in: "Only can we turn him to the 'dark side' of the Force"
How can this be accomplished using Regex.Replace, or perhaps by some other technique? I'm currently trying a solution that involves using Substring(), IndexOf(), and Contains().
Note: I don't care if the single quotes around "dark side" are removed or not, so the result could also be: "Only can we turn him to the dark side of the Force." I say this because a solution using Split() would remove all the single quotes.
Edit: I don't have a solution yet using Substring(), IndexOf(), etc. By "working on," I mean I'm thinking in my head how this can be done. I have no code, which is why I haven't posted any yet. Thanks.
Edit: VKS's solution below works. I wasn't escaping the \b the first attempt which is why it failed. Also, it didn't work unless I included the single quotes around the whole string as well.
test = Regex.Replace(test, "'(?![^']*\\bdark\\b)[^']*'", string.Empty);
'(?![^']*\bdark\b)[^']*'
Try this and see the demo. Replace each match with an empty string. The negative lookahead checks whether the quoted text contains the word dark, and skips it if so.
https://www.regex101.com/r/rG7gX4/12
While vks's solution works, I'd like to demonstrate a different approach:
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
test = Regex.Replace(test, @"'[^']*'", match => {
if (match.Value.Contains("dark"))
return match.Value;
// You can add more cases here
return string.Empty;
});
Or, if your condition is simple enough:
test = Regex.Replace(test, @"'[^']*'", match => match.Value.Contains("dark")
? match.Value
: string.Empty
);
That is, use a lambda to provide a callback for the replacement. This way, you can run arbitrary logic to replace the string.
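As a sketch of how the callback approach extends to several keywords, the lambda can consult a list of substrings to keep. The helper name RemoveQuotedExcept and its params array are assumptions made for illustration; they are not part of the answers above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class QuoteFilterDemo
{
    // Hypothetical helper: removes every single-quoted section unless it
    // contains at least one of the substrings listed in keep.
    public static string RemoveQuotedExcept(string text, params string[] keep)
    {
        return Regex.Replace(text, "'[^']*'", m =>
            keep.Any(k => m.Value.Contains(k)) ? m.Value : string.Empty);
    }

    static void Main()
    {
        string test = "Only 'together' can we turn him to the 'dark side' of the Force";
        Console.WriteLine(RemoveQuotedExcept(test, "dark"));
    }
}
```

Note that removing a quoted section leaves the spaces on both sides of it in place, so the result contains a doubled space where 'together' used to be.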
Something like this would work. You can add all the strings you want to keep to the excludedString array.
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
var excludedString = new string[] { "dark side" };
int startIndex = 0;
while ((startIndex = test.IndexOf('\'', startIndex)) >= 0)
{
var endIndex = test.IndexOf('\'', startIndex + 1);
var subString = test.Substring(startIndex, (endIndex - startIndex) + 1);
if (!excludedString.Contains(subString.Replace("'", "")))
{
test = test.Remove(startIndex, (endIndex - startIndex) + 1);
}
else
{
startIndex = endIndex + 1;
}
}
Another method through regex alternation operator |.
@"('[^']*\bdark\b[^']*')|'[^']*'"
Then replace the matched character with $1
string str = "Only 'together' can we turn him to the 'dark side' of the Force";
string result = Regex.Replace(str, @"('[^']*\bdark\b[^']*')|'[^']*'", "$1");
Console.WriteLine(result);
Explanation:
(...) is a capturing group.
'[^']*\bdark\b[^']*' matches every single-quoted string that contains the word dark. [^']* matches any character other than ', zero or more times.
('[^']*\bdark\b[^']*'): because this pattern is inside a capturing group, the matched text is stored in group 1.
| is the regex alternation operator.
'[^']*' matches all the remaining single-quoted strings (those that do not contain dark). It never matches a quoted string containing dark, because those are already matched by the pattern before the | operator.
Finally, replacing each match with the contents of group 1 gives the desired output: group 1 is empty for the second alternative, so those matches are removed.
I made this attempt that I think you were thinking about (some solution using split, Contain, ... without regex)
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
string[] separated = test.Split('\'');
string result = "";
for (int i = 0; i < separated.Length; i++)
{
string str = separated[i];
str = str.Trim(); // trim leading and trailing spaces
if (i % 2 == 0 || str.Contains("dark")) // you can expand your condition
{
result += str+" "; // add space after each added string
}
}
result = result.Trim(); // trim the trailing space again

Replace Contiguous Instance of a String in C#

How can I replace contiguous substring of a string in C#?
For example, the string
"<p>The quick fox</p>"
will be converted to
"<p>The quick fox</p>"
Use the below regex
@"(.+)\1+"
(.+) captures a group of characters, and \1+ then matches one or more immediately following repetitions of that same text.
Then replace the match with $1:
string result = Regex.Replace(str, @"(.+)\1+", "$1");
Maybe this simple one is enough:
( ){2,}
and replace with $1 (the text captured by the first parenthesized group).
See test at regex101
To check, if a substring is followed by itself, also can use a lookahead:
(?:( )(?=\1))+
and replace with empty. See test at regex101.com
Let's call the original string s and the substring subString:
var s = "<p>The quick fox</p>";
var subString = " ";
I'd prefer this instead of a regex, much more readable:
var subStringTwice = subString + subString;
while (s.Contains(subStringTwice))
{
s = s.Replace(subStringTwice, subString);
}
Another possible solution with better performance:
var elements = s.Split(new []{subString}, StringSplitOptions.RemoveEmptyEntries);
var result = string.Join(subString, elements);
// This part is only needed when subString can appear at the start or the end of s
if (result != "")
{
if (s.StartsWith(subString)) result = subString + result;
if (s.EndsWith(subString)) result = result + subString;
}
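The regex and non-regex ideas above can be combined into one helper that collapses runs of an arbitrary substring in a single pass. This is a sketch under assumptions: the name CollapseRepeats is hypothetical, Regex.Escape is used so that the substring is treated literally, and a match callback is used so that a `$` in the substring is not interpreted as a substitution.

```csharp
using System;
using System.Text.RegularExpressions;

static class CollapseDemo
{
    // Hypothetical helper: replaces two or more consecutive copies of
    // subString with a single copy.
    public static string CollapseRepeats(string s, string subString)
    {
        if (subString.Length == 0) return s;
        string pattern = "(?:" + Regex.Escape(subString) + "){2,}";
        return Regex.Replace(s, pattern, m => subString);
    }

    static void Main()
    {
        Console.WriteLine(CollapseRepeats("xabababy", "ab"));  // xaby
        Console.WriteLine(CollapseRepeats("a..b....c", "..")); // a..b..c
    }
}
```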

replacing characters in a single field of a comma-separated list

I have string in my c# code
a,b,c,d,"e,f",g,h
I want to replace "e,f" with "e f" i.e. ',' which is inside inverted comma should be replaced by space.
I tried using string.split but it is not working for me.
OK, I can't be bothered to think of a regex approach so I am going to offer an old fashioned loop approach which will work:
string DoReplace(string input)
{
bool isInner = false;//flag to detect if we are in the inner string or not
string result = "";//result to return
foreach(char c in input)//loop each character in the input string
{
if(isInner && c == ',')//if we are in an inner string and it is a comma, append space
result += " ";
else//otherwise append the character
result += c;
if(c == '"')//if we have hit an inner quote, toggle the flag
isInner = !isInner;
}
return result;
}
NOTE: This solution assumes that there can only be one level of inner quotes, for example you cannot have "a,b,c,"d,e,"f,g",h",i,j" - because that's just plain madness!
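The same quote-toggling loop can be written with a StringBuilder, which avoids allocating a new string for every appended character. This is a sketch of the same technique, not a full CSV parser; the name ReplaceCommasInQuotes is hypothetical.

```csharp
using System;
using System.Text;

static class QuotedCommaDemo
{
    // Hypothetical helper: replaces commas that appear between a pair of
    // double quotes with spaces and leaves all other characters untouched.
    public static string ReplaceCommasInQuotes(string input)
    {
        var sb = new StringBuilder(input.Length);
        bool insideQuotes = false;
        foreach (char c in input)
        {
            // A double quote flips the state; the quote itself is always kept.
            if (c == '"') insideQuotes = !insideQuotes;
            sb.Append(insideQuotes && c == ',' ? ' ' : c);
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(ReplaceCommasInQuotes("a,b,c,d,\"e,f\",g,h")); // a,b,c,d,"e f",g,h
    }
}
```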
For the scenario where you only need to match one pair of letters, the following regex will work:
string source = "a,b,c,d,\"e,f\",g,h";
string pattern = "\"([\\w]),([\\w])\"";
string replace = "\"$1 $2\"";
string result = Regex.Replace(source, pattern, replace);
Console.WriteLine(result); // a,b,c,d,"e f",g,h
Breaking apart the pattern, it is matching any instance where there is a "X,X" sequence where X is any letter, and is replacing it with the very same sequence, with a space in between the letters instead of a comma.
You could easily extend this to match more than one letter, and so on, as needed.
For the case where you can have multiple letters separated by commas within quotes that need to be replaced, the following can do it for you. Sample text is a,b,c,d,"e,f,a",g,h:
string source = "a,b,c,d,\"e,f,a\",g,h";
string pattern = "\"([ ,\\w]+),([ ,\\w]+)\"";
string replace = "\"$1 $2\"";
string result = source;
while (Regex.IsMatch(result, pattern)) {
result = Regex.Replace(result, pattern, replace);
}
Console.WriteLine(result); // a,b,c,d,"e f a",g,h
This does something similar compared to the first one, but just removes any comma that is sandwiched by letters surrounded by quotes, and repeats it until all cases are removed.
Here's a somewhat fragile but simple solution:
string.Join("\"", line.Split('"').Select((s, i) => i % 2 == 0 ? s : s.Replace(",", " ")))
It's fragile because it doesn't handle flavors of CSV that escape double-quotes inside double-quotes.
Use the following code:
string str = "a,b,c,d,\"e,f\",g,h";
string[] str2 = str.Split('\"');
var str3 = str2.Select(p => ((p.StartsWith(",") || p.EndsWith(",")) ? p : p.Replace(',', ' '))).ToList();
str = string.Join("", str3);
Use Split() and Join():
string input = "a,b,c,d,\"e,f\",g,h";
string[] pieces = input.Split('"');
for ( int i = 1; i < pieces.Length; i += 2 )
{
pieces[i] = string.Join(" ", pieces[i].Split(','));
}
string output = string.Join("\"", pieces);
Console.WriteLine(output);
// output: a,b,c,d,"e f",g,h

How to replace the text between two characters in c#

I am a bit confused about writing a regex that finds the text between the two delimiters { } and replaces it with other text in C#. How can I do the replacement?
I tried this.
StreamReader sr = new StreamReader(@"C:abc.txt");
string line;
line = sr.ReadLine();
while (line != null)
{
if (line.StartsWith("<"))
{
if (line.IndexOf('{') == 29)
{
string s = line;
int start = s.IndexOf("{");
int end = s.IndexOf("}");
string result = s.Substring(start+1, end - start - 1);
}
}
//write the line to console window
Console.WriteLine(line);
//Read the next line
line = sr.ReadLine();
}
//close the file
sr.Close();
Console.ReadLine();
I want to replace the found text (result) with another text.
Use Regex with pattern: \{([^\}]+)\}
Regex yourRegex = new Regex(@"\{([^\}]+)\}");
string result = yourRegex.Replace(yourString, "anyReplacement");
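Note that the pattern above matches the braces as well, so the replacement removes them. If the braces should stay, one option is a match callback that puts them back. This is a minimal sketch; the helper name ReplaceInsideBraces is hypothetical, and the character class is narrowed to [^{}] as an assumption so that a match never spans a nested opening brace.

```csharp
using System;
using System.Text.RegularExpressions;

static class BraceReplaceDemo
{
    // Hypothetical helper: replaces the text inside every {...} pair with
    // newText while keeping the braces themselves.
    public static string ReplaceInsideBraces(string input, string newText)
    {
        return Regex.Replace(input, @"\{[^{}]+\}", m => "{" + newText + "}");
    }

    static void Main()
    {
        string s = "This {match here} uses 3 other {match here} to {match here}";
        Console.WriteLine(ReplaceInsideBraces(s, "X")); // This {X} uses 3 other {X} to {X}
    }
}
```

Using a callback instead of a replacement string also means that a `$` in newText is inserted literally rather than treated as a substitution.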
string s = "data{value here} data";
int start = s.IndexOf("{");
int end = s.IndexOf("}", start);
string result = s.Substring(start+1, end - start - 1);
s = s.Replace(result, "your replacement value");
To get the string between the braces that is to be replaced, use this Regex pattern:
string errString = "This {match here} uses 3 other {match here} to {match here} the {match here}ation";
string toReplace = Regex.Match(errString, @"\{([^\}]+)\}").Groups[1].Value;
Console.WriteLine(toReplace); // prints 'match here'
To then replace the text found you can simply use the Replace method as follows:
string correctString = errString.Replace(toReplace, "document");
Explanation of the Regex pattern:
\{ # Escaped curly brace, means "starts with a '{' character"
( # Parentheses in a regex mean "put (capture) the stuff
# in between into the Groups array"
[^\}] # Any character that is not a '}' character
+ # One or more occurrences of the aforementioned "non '}' char"
) # Close the capturing group
\} # "Ends with a '}' character"
The following regular expression will match the criteria you specified:
string pattern = @"^(\<.{27})(\{[^}]*\})(.*)";
The following would perform a replace:
string result = Regex.Replace(input, pattern, "$1 REPLACE $3");
For the input: "<012345678901234567890123456{sdfsdfsdf}sadfsdf" this gives the output "<012345678901234567890123456 REPLACE sadfsdf"
You need two calls to Substring(), rather than one: One to get textBefore, the other to get textAfter, and then you concatenate those with your replacement.
int start = s.IndexOf("{");
int end = s.IndexOf("}");
//I skip the check that end is valid to avoid clutter
string textBefore = s.Substring(0, start);
string textAfter = s.Substring(end+1);
string replacedText = textBefore + newText + textAfter;
If you want to keep the braces, you need a small adjustment:
int start = s.IndexOf("{");
int end = s.IndexOf("}");
string textBefore = s.Substring(0, start + 1); // include the '{' itself
string textAfter = s.Substring(end);
string replacedText = textBefore + newText + textAfter;
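The two Substring variants can be folded into one helper that also performs the validity check skipped above. This is a sketch under assumptions: the name ReplaceFirstBraced and the keepBraces flag are hypothetical, and only the first {...} pair is handled.

```csharp
using System;

static class BraceSubstringDemo
{
    // Hypothetical helper: replaces the text of the first {...} pair in s.
    // Returns s unchanged when no complete pair is found.
    public static string ReplaceFirstBraced(string s, string newText, bool keepBraces)
    {
        int start = s.IndexOf('{');
        int end = start < 0 ? -1 : s.IndexOf('}', start + 1);
        if (start < 0 || end < 0) return s;

        // With keepBraces, the '{' stays in textBefore and the '}' in textAfter.
        string textBefore = keepBraces ? s.Substring(0, start + 1) : s.Substring(0, start);
        string textAfter = keepBraces ? s.Substring(end) : s.Substring(end + 1);
        return textBefore + newText + textAfter;
    }

    static void Main()
    {
        Console.WriteLine(ReplaceFirstBraced("data{value here} data", "X", false)); // dataX data
        Console.WriteLine(ReplaceFirstBraced("data{value here} data", "X", true));  // data{X} data
        Console.WriteLine(ReplaceFirstBraced("no braces", "X", true));              // no braces
    }
}
```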
The simplest way, if you want to avoid regex, is to use the Split method. This is one approach:
string s = "sometext {getthis}";
string result= s.Split(new char[] { '{', '}' })[1];
You can use the Regex expression that some others have already posted, or you can use a more advanced Regex that uses balancing groups to make sure the opening { is balanced by a closing }.
That expression is then (?<BRACE>\{)([^\}]*)(?<-BRACE>\})
You can test this expression online at RegexHero.
You simply match your input string with this Regex pattern, then use the replace methods of Regex, for instance:
var result = Regex.Replace(input, @"(?<BRACE>\{)([^\}]*)(?<-BRACE>\})", textToReplaceWith);
For more C# Regex Replace examples, see http://www.dotnetperls.com/regex-replace.
