Replace SubString on Partial match of word - c#

I have two strings:
string S1 = "This is my\r\n string.";
string S2 = "This is my\n self.";
I want a generic method that replaces every occurrence of "\n" with "\r\n", but leaves any existing "\r\n" untouched.

Use a regular expression with a negative lookbehind:
string result = Regex.Replace(input, @"(?<!\r)\n", "\r\n");
It matches every \n that is not preceded by \r.
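As a rough sketch, the pattern could be wrapped into the generic method the question asks for. The class and method names here are illustrative and do not come from the question:

```csharp
using System.Text.RegularExpressions;

public static class NewlineFixer
{
    // Hypothetical helper: converts bare "\n" to "\r\n" and leaves existing "\r\n" alone.
    public static string ToCrLf(string input)
    {
        // (?<!\r) is a negative lookbehind: the \n must not be preceded by \r.
        return Regex.Replace(input, @"(?<!\r)\n", "\r\n");
    }
}
```

With the strings from the question, NewlineFixer.ToCrLf(S1) returns S1 unchanged, while NewlineFixer.ToCrLf(S2) returns "This is my\r\n self.".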

Try something like this, where the placeholder is any character that never occurs in the input:
var unused = "§";
S2 = S2
    .Replace("\r\n", unused)
    .Replace("\n", unused)
    .Replace(unused, "\r\n");

Assuming you have well-behaved standard input text, i.e. no consecutive \r characters, you can simply use:
var result = S1.Replace("\n", "\r\n").Replace("\r\r", "\r");
This won't work in the general case: any "\r\r" already present in the input is collapsed to a single "\r".

Related

Regex to match only numbers, no apostrophes

I want to match only the digits in the following string:
String: "40’000"
Match: "40000"
Basically, I am trying to ignore the apostrophe.
I am using C#, in case it matters.
I can't use any C# methods; I need to use only Regex.
Replace like this; it removes every character except digits:
string input = "40’000";
string result = Regex.Replace(input, @"[^\d]", "");
Since you said you just want to pick out the digits, how about doing it without regex?
var s = "40’000";
var result = new string(s.Where(char.IsDigit).ToArray());
Console.WriteLine(result); // 40000
I suggest using a regex to find the special characters rather than the digits, and then replacing them with an empty string.
A simple (?=\S)\D should be enough; the (?=\S) lookahead skips any whitespace at the end of the number.
Replace like this; it removes every character except digits and dots:
string input = "40’000";
string result = Regex.Replace(input, @"[^\d.]", "");
Don't complicate your life, use Regex.Replace
string s = "40'000";
string replaced = Regex.Replace(s, @"\D", "");

Regex replace special characters defined by client

I need a C# function that replaces all special characters chosen by the client in a string. Example:
string value1 = @"‹¥ó׬¶ÝÆ";
string input1 = @"Thi¥s is\123a strÆing";
string output1 = Regex.Replace(input1, value1, "");
I want a result like this: output1 = Thi s is\123a str ing
Why do you need regex? This is more efficient, more concise and also readable:
// Where keeps repeated characters; Except would also drop duplicates from the input
string result = string.Concat(input1.Where(c => !value1.Contains(c)));
If you don't want to remove them but replace them with a different string, you can still use a similar (but less efficient) approach:
string replacement = "[foo]";
var newChars = input1.SelectMany(c => value1.Contains(c) ? replacement : c.ToString());
string result = string.Concat(newChars); // Thi[foo]s is\123a str[foo]ing
Someone asked for a regex?
string value1 = @"^\-[]‹¥ó׬¶ÝÆ";
string input1 = @"T-^\hi¥s is\123a strÆing";
// Escape ] ^ - \ so they are literal inside a character class
string value1b = Regex.Replace(value1, @"([\]\^\-\\])", @"\$1");
// Build a [...] character class and use it
string input1b = Regex.Replace(input1, "[" + value1b + "]", " ");
The basic idea is to use a [...] character class. But first you have to escape the characters that have a special meaning inside [...], namely ] ^ - and \. Note that you don't need to escape the [.
Note that this solution isn't compatible with non-BMP Unicode characters (characters that occupy two char values).
A solution that is compatible with them is more complex, but for normal use it shouldn't be a problem.
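One possible shape for such a surrogate-aware variant is sketched below. It is an assumption about how this could be done, not the answerer's method: instead of a [...] class it builds an alternation from the text elements of the client's list, so a surrogate pair stays together. The class and method names are hypothetical. Note that text elements also group a base character with its combining marks, which may or may not be what the client expects.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ClientCharReplacer
{
    // Hypothetical helper: replaces every text element listed in `unwanted` with `replacement`.
    public static string ReplaceListed(string input, string unwanted, string replacement)
    {
        var alternatives = new List<string>();
        // Enumerate text elements so a surrogate pair is treated as one unit.
        TextElementEnumerator elements = StringInfo.GetTextElementEnumerator(unwanted);
        while (elements.MoveNext())
        {
            // Regex.Escape makes regex metacharacters such as ^ \ [ literal.
            alternatives.Add(Regex.Escape(elements.GetTextElement()));
        }
        if (alternatives.Count == 0)
        {
            return input; // nothing to replace
        }
        // Match any one of the listed elements.
        string pattern = string.Join("|", alternatives);
        return Regex.Replace(input, pattern, replacement);
    }
}
```

For example, ClientCharReplacer.ReplaceListed(input1, value1, " ") would replace each listed character in input1 with a space.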

Replace Contiguous Instance of a String in C#

How can I replace contiguous repetitions of a substring within a string in C#?
For example, the string
"<p>The   quick   fox</p>"
will be converted to
"<p>The quick fox</p>"
Use the regex below:
@"(.+)\1+"
(.+) captures a group of characters, and \1+ matches one or more immediate repetitions of that same group.
Then replace the match with $1:
string result = Regex.Replace(str, @"(.+)\1+", "$1");
Maybe this simple one is enough:
( ){2,}
Replace with $1 (the text captured by the first parenthesized group).
To check whether a substring is followed by itself, you can also use a lookahead:
(?:( )(?=\1))+
and replace with the empty string.
Let's call the original string s and the substring subString:
var s = "<p>The   quick   fox</p>";
var subString = " ";
I'd prefer this to a regex; it is much more readable:
var subStringTwice = subString + subString;
while (s.Contains(subStringTwice))
{
    s = s.Replace(subStringTwice, subString);
}
Another possible solution with better performance:
var elements = s.Split(new[] { subString }, StringSplitOptions.RemoveEmptyEntries);
var result = string.Join(subString, elements);
// This part is only needed when subString can appear at the start or the end of s
if (result != "")
{
    if (s.StartsWith(subString)) result = subString + result;
    if (s.EndsWith(subString)) result = result + subString;
}

Regex: the "replace" pattern

I have a string that I want to manipulate. The problem is that the replacement also contains regular-expression escapes.
var result = Regex.Replace("Abc", "\r\nAbc\r\n", "\r\n");
// or something like that; it can also be \t and so on...
But the result is not a newline; it is the literal string "\r\n".
PS: By the way, if I want to replace something with nothing, as a very simple example:
Regex.Replace("abc", "abc", "")
the regex seems to fail. Can't strings be replaced by an empty string?
You can just use the following instead of a regex:
string result = "abc".Replace("abc", string.Empty);
Try this:
var result = Regex.Replace("Abc", "\r\nAbc\r\n", System.Environment.NewLine);
Try adding the verbatim string literal prefix @ before your regex.
var result = Regex.Replace("Abc", @"\r\nAbc\r\n", "\r\n");
My guess is that they are being read as escape characters before they get to the regex engine.
Basically, the pattern getting passed in is this:
"
Abc
"
Instead of this:
"\r\nAbc\r\n"

Replace Line Breaks in a String C#

How can I replace Line Breaks within a string in C#?
Use replace with Environment.NewLine
myString = myString.Replace(System.Environment.NewLine, "replacement text"); //add a line terminating ;
As mentioned in other posts, if the string comes from another environment (OS), you need to replace that particular environment's newline control characters instead.
The solutions posted so far either only replace Environment.NewLine or they fail if the replacement string contains line breaks because they call string.Replace multiple times.
Here's a solution that uses a regular expression to make all three replacements in just one pass over the string. This means that the replacement string can safely contain line breaks.
string result = Regex.Replace(input, @"\r\n?|\n", replacementString);
To extend The.Anyi.9's answer, you should also be aware of the different types of line break in general use. Dependent on where your file originated, you may want to look at making sure you catch all the alternatives...
string replaceWith = "";
string removedBreaks = Line.Replace("\r\n", replaceWith).Replace("\n", replaceWith).Replace("\r", replaceWith);
should get you going...
I would use Environment.NewLine when I wanted to insert a newline into a string, but not to remove all newlines from a string.
Depending on your platform you can have different types of newlines, but even inside the same platform often different types of newlines are used. In particular when dealing with file formats and protocols.
string ReplaceNewlines(string blockOfText, string replaceWith)
{
    return blockOfText.Replace("\r\n", replaceWith).Replace("\n", replaceWith).Replace("\r", replaceWith);
}
If your code is supposed to run in different environments, I would consider using the Environment.NewLine constant, since it is specifically the newline used in the specific environment.
line = line.Replace(Environment.NewLine, "newLineReplacement");
However, if you get the text from a file originating on another system, this might not be the correct answer, and you should replace with whatever newline constant is used on the other system. It will typically be \n or \r\n.
If you want to "clean up" the newlines, flamebaud's comment suggesting the regex @"[\r\n]+" is the best choice.
using System;
using System.Text.RegularExpressions;
class MainClass {
    public static void Main (string[] args) {
        string str = "AAA\r\nBBB\r\n\r\n\r\nCCC\r\r\rDDD\n\n\nEEE";
        Console.WriteLine (str.Replace(System.Environment.NewLine, "-"));
        /* Result (where Environment.NewLine is "\n", so the \r characters remain):
        AAA
        -BBB
        -
        -
        -CCC
        DDD---EEE
        */
        Console.WriteLine (Regex.Replace(str, @"\r\n?|\n", "-"));
        // Result:
        // AAA-BBB---CCC---DDD---EEE
        Console.WriteLine (Regex.Replace(str, @"[\r\n]+", "-"));
        // Result:
        // AAA-BBB-CCC-DDD-EEE
    }
}
Use the method introduced in .NET 6:
myString = myString.ReplaceLineEndings();
It replaces all newline sequences in the current string with Environment.NewLine; an overload accepts the replacement text.
Documentation:
ReplaceLineEndings
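A minimal sketch of how the two overloads might be used, assuming .NET 6 or later. The wrapper class and method names are made up for illustration:

```csharp
using System;

public static class LineEndingDemo
{
    // Hypothetical wrapper: normalizes every newline sequence to Environment.NewLine.
    public static string Normalize(string text) => text.ReplaceLineEndings();

    // Hypothetical wrapper: replaces every newline sequence with the given text.
    public static string ReplaceWith(string text, string replacement) =>
        text.ReplaceLineEndings(replacement);
}
```

For example, LineEndingDemo.ReplaceWith("a\r\nb\nc\rd", "|") returns "a|b|c|d", because \r\n, \n and \r are each treated as one newline sequence.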
Don't forget that Replace doesn't modify the string in place; it returns a new string with the characters replaced. The following will remove line breaks (not replace them). I'd use @Brian R. Bondy's method if replacing them with something else, perhaps wrapped as an extension method. Remember to check for null before calling Replace or the extension methods provided.
string line = ...
line = line.Replace( "\r", "").Replace( "\n", "" );
As extension methods:
public static class StringExtensions
{
    public static string RemoveLineBreaks( this string lines )
    {
        return lines.Replace( "\r", "").Replace( "\n", "" );
    }
    public static string ReplaceLineBreaks( this string lines, string replacement )
    {
        return lines.Replace( "\r\n", replacement )
                    .Replace( "\r", replacement )
                    .Replace( "\n", replacement );
    }
}
To make sure all kinds of line breaks (Windows, Mac and Unix) are replaced, you should use:
myString.Replace("\r\n", "\n").Replace("\r", "\n").Replace("\n", "replacement");
in this order, so that no extra line breaks are produced when line-ending characters occur in combination.
Why not both?
string ReplacementString = "";
Regex.Replace(strin.Replace(System.Environment.NewLine, ReplacementString), @"(\r\n?|\n)", ReplacementString);
Note: Replace strin with the name of your input string.
I needed to replace the \r\n with an actual carriage return and line feed and replace \t with an actual tab. So I came up with the following:
public string Transform(string data)
{
    string result = data;
    char cr = (char)13;
    char lf = (char)10;
    char tab = (char)9;
    result = result.Replace("\\r", cr.ToString());
    result = result.Replace("\\n", lf.ToString());
    result = result.Replace("\\t", tab.ToString());
    return result;
}
var answer = Regex.Replace(value, "(\n|\r)+", replacementString);
As a new line can be delimited by \n, \r or \r\n, we first replace \r\n and \r with \n, and only then split the data string.
The following lines should go in the parseCSV method:
function parseCSV(data) {
    //alert(data);
    // replace Windows new lines
    data = data.replace(/\r\n/g, "\n");
    // replace classic Mac new lines
    data = data.replace(/\r/g, "\n");
    // split into rows
    var rows = data.split("\n");
}
Use the .Replace() method
Line.Replace("\n", "whatever you want to replace with");
The best way to replace line breaks safely is
yourString.Replace("\r\n", "\n") // handling Windows line breaks
    .Replace("\r", "\n"); // handling classic Mac line breaks
That should produce a string with only \n (i.e. line feed) as line breaks.
This code is useful for fixing mixed line breaks too.
Another option is to create a StringReader over the string in question. On the reader, do .ReadLine() in a loop. Then you have the lines separated, no matter what (consistent or inconsistent) separators they had. With that, you can proceed as you wish; one possibility is to use a StringBuilder and call .AppendLine on it.
The advantage is, you let the framework decide what constitutes a "line break".
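The StringReader idea could look roughly like the sketch below. The class and method names are hypothetical, and it joins the lines with a caller-supplied replacement rather than AppendLine so that the result is predictable. One consequence of reading line by line is that a single trailing line break at the end of the text is dropped.

```csharp
using System.IO;
using System.Text;

public static class LineBreakNormalizer
{
    // Hypothetical helper: lets StringReader decide what a line break is,
    // then joins the lines with `replacement`.
    public static string ReplaceLineBreaks(string text, string replacement)
    {
        var builder = new StringBuilder();
        using (var reader = new StringReader(text))
        {
            string line = reader.ReadLine();
            while (line != null)
            {
                builder.Append(line);
                line = reader.ReadLine();
                // Only add the replacement between lines, not after the last one.
                if (line != null)
                {
                    builder.Append(replacement);
                }
            }
        }
        return builder.ToString();
    }
}
```

For example, LineBreakNormalizer.ReplaceLineBreaks("a\r\nb\nc\rd", "|") returns "a|b|c|d".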
string s = Regex.Replace(source_string, "\n", "\r\n");
or
string s = Regex.Replace(source_string, "\r\n", "\n");
depending on which way you want to go.
Hope it helps.
If you want to replace only the newlines:
var input = "sdfhlu \r\n sdkuidfs\r\ndfgdgfd";
var match = @" +"; // collapses the runs of spaces left behind
var replaceWith = " ";
Console.WriteLine("input: " + input);
var x = Regex.Replace(input.Replace("\n", replaceWith).Replace("\r", replaceWith), match, replaceWith);
Console.WriteLine("output: " + x);
If you want to remove newlines, tabs and white space:
var input = "sdfhlusdkuidfs\r\ndfgdgfd";
var match = @"\s+";
var replaceWith = "";
Console.WriteLine("input: " + input);
var x = Regex.Replace(input, match, replaceWith);
Console.WriteLine("output: " + x);
This is a very long-winded one-liner, but it is the only solution I found that works when you cannot use special character escapes like "\r", "\n", \x0d and \u000D, or System.Environment.NewLine, as parameters to the Replace() method:
MyStr.Replace( System.String.Concat( System.Char.ConvertFromUtf32(13).ToString(), System.Char.ConvertFromUtf32(10).ToString() ), ReplacementString );
This is somewhat off-topic, but to get it to work inside Visual Studio's XML .props files, which invoke .NET via XML properties, I had to dress it up as shown below.
The Visual Studio XML --> .NET environment just would not accept special character escapes like "\r", "\n", \x0d and \u000D, or System.Environment.NewLine, as parameters to the replace() method.
$([System.IO.File]::ReadAllText('MyFile.txt').replace( $([System.String]::Concat($([System.Char]::ConvertFromUtf32(13).ToString()),$([System.Char]::ConvertFromUtf32(10).ToString()))),$([System.String]::Concat('^',$([System.Char]::ConvertFromUtf32(13).ToString()),$([System.Char]::ConvertFromUtf32(10).ToString())))))
Based on @mark-bayers's answer, and for cleaner output:
string result = Regex.Replace(ex.Message, @"(\r\n?|\r?\n)+", "replacement text");
It replaces \r\n, \n and \r, preferring the longest match, and collapses consecutive line breaks into a single replacement.
