Removing parts of the path from string - c#

Consider the following string
string path = #"\\ParentDirectory\All_Attachments$\BATCH_NUMBERS\TS0001\SubDirectory\FileName.txt";
I am trying to modify the path by removing the \\ParentDirectory\All_Attachments$\. So I want my final string to look like:
BATCH_NUMBERS\TS0001\SubDirectory\FileName.txt
I have come up with the following regex
string pattern = #"(?<=\$)(\\)";
string returnValue = Regex.Replace(path, pattern, "", RegexOptions.IgnoreCase);
With the above if I do Console.WriteLine(returnValue) I get
\\ParentDirectory\All_Attachments$BATCH_NUMBERS\TS0001\SubDirectory\FileName.txt
So it only removes \ can someone tell me how to achieve this please.

The code below should do the trick.
string path = #"\\ParentDirectory\All_Attachments$\BATCH_NUMBERS\TS0001\SubDirectory\FileName.txt";
var result = Regex.Replace(path,
#"^ # Start of string
[^$]+ # Anything that is not '$' at least one time
\$ # The '$ sign
\\ # The \ after the '$'
", String.Empty, RegexOptions.IgnorePatternWhitespace);
When executed in LinqPad it gives the following result:
BATCH_NUMBERS\TS0001\SubDirectory\FileName.txt

As an alternative avoiding an RE or split/join you could just run along the string until you have seen 4 slashes:
string result = null;
for (int i = 0, m = 0; i < path.Length; i++)
if (path[i] == '\\' && ++m == 4) {
result = path.Substring(i + 1);
break;
}

Using a regex that takes the first 2 groups of (backslash(es) followed by 1 or more non-backslashes). And including the $ and backslash after that.
string returnValue = Regex.Replace(path, #"^(?:\\+[^\\]+){2}\$\\", "");
Or by splitting on $, joining the string array without it's first element and then trim \ from the start:
string returnValue = string.Join(null, path.Split('$').Skip(1)).TrimStart('\\');
But you'll be using System.Linq for that method to work.

You can use a combination of Substring() and IndexOf() to accomplish your goal:
string result = path.Substring(path.IndexOf("$") + 1);

Related

Substring IndexOf in c#

I have a string that looks like this: "texthere^D123456_02". But I want my result to be D123456.
this is what i do so far:
if (name.Contains("_"))
{
name = name.Substring(0, name.LastIndexOf('_'));
}
With this I remove at least the _02, however if I try the same way for ^ then I always get back texthere, even when I use name.IndexOf("^")
I also tried only to check for ^, to get at least the result:D123456_02 but still the same result.
I even tried to name.Replace("^" and then use the substring way I used before. But again the result stays the same.
texthere is not always the same length, so .Remove() is out of the question.
What am I doing wrong?
Thanks
When call Substring you should not start from 0, but from the index found:
String name = "texthere^D123456_02";
int indexTo = name.LastIndexOf('_');
if (indexTo < 0)
indexTo = name.Length;
int indexFrom = name.LastIndexOf('^', indexTo - 1);
if (indexFrom >= 0)
name = name.Substring(indexFrom + 1, indexTo - indexFrom - 1);
string s = "texthere^D123456_02";
string result= s.Substring(s.IndexOf("^") + 1);//Remove all before
result = result.Remove(result.IndexOf("_"));//Remove all after
Use the String.Split method :
var split1 = name.Split('^')[1];
var yourText = split1.Split('_')[0];
Or you could use RegExp to achieve basically the same.
Your easiest solution would be to split the string first, and then use your original solution for the second part.
string name = "texthere^D123456_02";
string secondPart = name.Split('^')[1]; // This will be D123456_02
Afterwards you can use the Substring as before.
With Regular Expression
string s = "texthere^D123456_02";
Regex r1 = new Regex(#"\^(.*)_");
MatchCollection mc = r1.Matches(s);
Console.WriteLine("Result is " + mc[0].Groups[1].Value);
An alternative to what's already been suggested is to use regex:
string result = Regex.Match("texthere^D123456_02", #"\^(.*)_").Groups[1].Value; // D123456
use regex.
Regex regex = new Regex(#"\^(.*)_");
Match match = regex.Match(name);
if(match.Success)
{
name= match.Groups[1].Value
}
An easier way would be to use Split
string s = "texthere^D123456_02";
string[] a = s.Split('^', '_');
if (a.Length == 3) // correct
{
}
Well, if you use the same code you posted, it's doing the right thing, you start to retrieve characters from the char 0 and stop when it finds "^", so what you will get is "texthere".
If you want only the value, then use this:
name = name.Substring(0, name.LastIndexOf('_')).Substring(name.IndexOf("^") + 1);
It will first remove whatever is after the "_" and whatever is before "^".
Substring takes a position and a length, so you need to actually figure out where your caret position is and where the underscore is to calculate the length
var name = "texthere^D123456_02";
if(name.Contains('_'))
{
var caretPos = name.IndexOf('^') + 1; // skip ahead
var underscorePos = name.IndexOf('_');
var length = underscorePos - caretPos;
var substring = name.Substring(caretPos, length);
Debug.Print(substring);
};
Try this and let me know how it goes
string inputtext = "texthere^D123456_02";
string pattern = #".+\^([A-Z]+[0-9]+)\_[0-9]+";
string result = Regex.Match(inputtext, pattern).Groups[1].Value;
String name = "texthere^D123456_02"
print name.split('_', '^')[1]
This splits your string at all occurrences of _ and ^ and returns the list of strings after the split. Since the string you need D123456 would be at the 1st index, (i.e. the 2nd position), I have printed out that.
If you are just wanting the "d123456" it's simple with just String.Split() there is no need for anything else. Just define the index you want afterwards. There are overloads on Split() for this very reason.
//...
var source = "texthere^D123456_02";
var result = source.Split(new char[] {'^', '_'}, StringSplitOptions.RemoveEmptyEntries)[1];
Console.WriteLine(result);
//Outputs: "D123456"
Hope this helps.

How to remove only certain substrings from a string?

Using C#, I have a string that is a SQL script containing multiple queries. I want to remove sections of the string that are enclosed in single quotes. I can do this using Regex.Replace, in this manner:
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
test = Regex.Replace(test, "'[^']*'", string.Empty);
Results in: "Only can we turn him to the of the Force"
What I want to do is remove the substrings between quotes EXCEPT for substrings containing a specific substring. For example, using the string above, I want to remove the quoted substrings except for those that contain "dark," such that the resulting string is:
Results in: "Only can we turn him to the 'dark side' of the Force"
How can this be accomplished using Regex.Replace, or perhaps by some other technique? I'm currently trying a solution that involves using Substring(), IndexOf(), and Contains().
Note: I don't care if the single quotes around "dark side" are removed or not, so the result could also be: "Only can we turn him to the dark side of the Force." I say this because a solution using Split() would remove all the single quotes.
Edit: I don't have a solution yet using Substring(), IndexOf(), etc. By "working on," I mean I'm thinking in my head how this can be done. I have no code, which is why I haven't posted any yet. Thanks.
Edit: VKS's solution below works. I wasn't escaping the \b the first attempt which is why it failed. Also, it didn't work unless I included the single quotes around the whole string as well.
test = Regex.Replace(test, "'(?![^']*\\bdark\\b)[^']*'", string.Empty);
'(?![^']*\bdark\b)[^']*'
Try this.See demo.Replace by empty string.You can use lookahead here to check if '' contains a word dark.
https://www.regex101.com/r/rG7gX4/12
While vks's solution works, I'd like to demonstrate a different approach:
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
test = Regex.Replace(test, #"'[^']*'", match => {
if (match.Value.Contains("dark"))
return match.Value;
// You can add more cases here
return string.Empty;
});
Or, if your condition is simple enough:
test = Regex.Replace(test, #"'[^']*'", match => match.Value.Contains("dark")
? match.Value
: string.Empty
);
That is, use a lambda to provide a callback for the replacement. This way, you can run arbitrary logic to replace the string.
some thing like this would work. you can add all strings you want to keep into the excludedStrings array
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
var excludedString = new string[] { "dark side" };
int startIndex = 0;
while ((startIndex = test.IndexOf('\'', startIndex)) >= 0)
{
var endIndex = test.IndexOf('\'', startIndex + 1);
var subString = test.Substring(startIndex, (endIndex - startIndex) + 1);
if (!excludedString.Contains(subString.Replace("'", "")))
{
test = test.Remove(startIndex, (endIndex - startIndex) + 1);
}
else
{
startIndex = endIndex + 1;
}
}
Another method through regex alternation operator |.
#"('[^']*\bdark\b[^']*')|'[^']*'"
Then replace the matched character with $1
DEMO
string str = "Only 'together' can we turn him to the 'dark side' of the Force";
string result = Regex.Replace(str, #"('[^']*\bdark\b[^']*')|'[^']*'", "$1");
Console.WriteLine(result);
IDEONE
Explanation:
(...) called capturing group.
'[^']*\bdark\b[^']*' would match all the single quoted strings which contains the substring dark . [^']* matches any character but not of ', zero or more times.
('[^']*\bdark\b[^']*'), because the regex is within a capturing group, all the matched characters are stored inside the group index 1.
| Next comes the regex alternation operator.
'[^']*' Now this matches all the remaining (except the one contains dark) single quoted strings. Note that this won't match the single quoted string which contains the substring dark because we already matched those strings with the pattern exists before to the | alternation operator.
Finally replacing all the matched characters with the chars inside group index 1 will give you the desired output.
I made this attempt that I think you were thinking about (some solution using split, Contain, ... without regex)
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
string[] separated = test.Split('\'');
string result = "";
for (int i = 0; i < separated.Length; i++)
{
string str = separated[i];
str = str.Trim(); //trim the tailing spaces
if (i % 2 == 0 || str.Contains("dark")) // you can expand your condition
{
result += str+" "; // add space after each added string
}
}
result = result.Trim(); //trim the tailing space again

replacing characters in a single field of a comma-separated list

I have string in my c# code
a,b,c,d,"e,f",g,h
I want to replace "e,f" with "e f" i.e. ',' which is inside inverted comma should be replaced by space.
I tried using string.split but it is not working for me.
OK, I can't be bothered to think of a regex approach so I am going to offer an old fashioned loop approach which will work:
string DoReplace(string input)
{
bool isInner = false;//flag to detect if we are in the inner string or not
string result = "";//result to return
foreach(char c in input)//loop each character in the input string
{
if(isInner && c == ',')//if we are in an inner string and it is a comma, append space
result += " ";
else//otherwise append the character
result += c;
if(c == '"')//if we have hit an inner quote, toggle the flag
isInner = !isInner;
}
return result;
}
NOTE: This solution assumes that there can only be one level of inner quotes, for example you cannot have "a,b,c,"d,e,"f,g",h",i,j" - because that's just plain madness!
For the scenario where you only need to match one pair of letters, the following regex will work:
string source = "a,b,c,d,\"e,f\",g,h";
string pattern = "\"([\\w]),([\\w])\"";
string replace = "\"$1 $2\"";
string result = Regex.Replace(source, pattern, replace);
Console.WriteLine(result); // a,b,c,d,"e f",g,h
Breaking apart the pattern, it is matching any instance where there is a "X,X" sequence where X is any letter, and is replacing it with the very same sequence, with a space in between the letters instead of a comma.
You could easily extend this if you needed to to have it match more than one letter, etc, as needed.
For the case where you can have multiple letters separated by commas within quotes that need to be replaced, the following can do it for you. Sample text is a,b,c,d,"e,f,a",g,h:
string source = "a,b,c,d,\"e,f,a\",g,h";
string pattern = "\"([ ,\\w]+),([ ,\\w]+)\"";
string replace = "\"$1 $2\"";
string result = source;
while (Regex.IsMatch(result, pattern)) {
result = Regex.Replace(result, pattern, replace);
}
Console.WriteLine(result); // a,b,c,d,"e f a",g,h
This does something similar compared to the first one, but just removes any comma that is sandwiched by letters surrounded by quotes, and repeats it until all cases are removed.
Here's a somewhat fragile but simple solution:
string.Join("\"", line.Split('"').Select((s, i) => i % 2 == 0 ? s : s.Replace(",", " ")))
It's fragile because it doesn't handle flavors of CSV that escape double-quotes inside double-quotes.
Use the following code:
string str = "a,b,c,d,\"e,f\",g,h";
string[] str2 = str.Split('\"');
var str3 = str2.Select(p => ((p.StartsWith(",") || p.EndsWith(",")) ? p : p.Replace(',', ' '))).ToList();
str = string.Join("", str3);
Use Split() and Join():
string input = "a,b,c,d,\"e,f\",g,h";
string[] pieces = input.Split('"');
for ( int i = 1; i < pieces.Length; i += 2 )
{
pieces[i] = string.Join(" ", pieces[i].Split(','));
}
string output = string.Join("\"", pieces);
Console.WriteLine(output);
// output: a,b,c,d,"e f",g,h

How to replace the text between two characters in c#

I am bit confused writing the regex for finding the Text between the two delimiters { } and replace the text with another text in c#,how to replace?
I tried this.
StreamReader sr = new StreamReader(#"C:abc.txt");
string line;
line = sr.ReadLine();
while (line != null)
{
if (line.StartsWith("<"))
{
if (line.IndexOf('{') == 29)
{
string s = line;
int start = s.IndexOf("{");
int end = s.IndexOf("}");
string result = s.Substring(start+1, end - start - 1);
}
}
//write the lie to console window
Console.Write Line(line);
//Read the next line
line = sr.ReadLine();
}
//close the file
sr.Close();
Console.ReadLine();
I want replace the found text(result) with another text.
Use Regex with pattern: \{([^\}]+)\}
Regex yourRegex = new Regex(#"\{([^\}]+)\}");
string result = yourRegex.Replace(yourString, "anyReplacement");
string s = "data{value here} data";
int start = s.IndexOf("{");
int end = s.IndexOf("}", start);
string result = s.Substring(start+1, end - start - 1);
s = s.Replace(result, "your replacement value");
To get the string between the parentheses to be replaced, use the Regex pattern
string errString = "This {match here} uses 3 other {match here} to {match here} the {match here}ation";
string toReplace = Regex.Match(errString, #"\{([^\}]+)\}").Groups[1].Value;
Console.WriteLine(toReplace); // prints 'match here'
To then replace the text found you can simply use the Replace method as follows:
string correctString = errString.Replace(toReplace, "document");
Explanation of the Regex pattern:
\{ # Escaped curly parentheses, means "starts with a '{' character"
( # Parentheses in a regex mean "put (capture) the stuff
# in between into the Groups array"
[^}] # Any character that is not a '}' character
* # Zero or more occurrences of the aforementioned "non '}' char"
) # Close the capturing group
\} # "Ends with a '}' character"
The following regular expression will match the criteria you specified:
string pattern = #"^(\<.{27})(\{[^}]*\})(.*)";
The following would perform a replace:
string result = Regex.Replace(input, pattern, "$1 REPLACE $3");
For the input: "<012345678901234567890123456{sdfsdfsdf}sadfsdf" this gives the output "<012345678901234567890123456 REPLACE sadfsdf"
You need two calls to Substring(), rather than one: One to get textBefore, the other to get textAfter, and then you concatenate those with your replacement.
int start = s.IndexOf("{");
int end = s.IndexOf("}");
//I skip the check that end is valid too avoid clutter
string textBefore = s.Substring(0, start);
string textAfter = s.Substring(end+1);
string replacedText = textBefore + newText + textAfter;
If you want to keep the braces, you need a small adjustment:
int start = s.IndexOf("{");
int end = s.IndexOf("}");
string textBefore = s.Substring(0, start-1);
string textAfter = s.Substring(end);
string replacedText = textBefore + newText + textAfter;
the simplest way is to use split method if you want to avoid any regex .. this is an aproach :
string s = "sometext {getthis}";
string result= s.Split(new char[] { '{', '}' })[1];
You can use the Regex expression that some others have already posted, or you can use a more advanced Regex that uses balancing groups to make sure the opening { is balanced by a closing }.
That expression is then (?<BRACE>\{)([^\}]*)(?<-BRACE>\})
You can test this expression online at RegexHero.
You simply match your input string with this Regex pattern, then use the replace methods of Regex, for instance:
var result = Regex.Replace(input, "(?<BRACE>\{)([^\}]*)(?<-BRACE>\})", textToReplaceWith);
For more C# Regex Replace examples, see http://www.dotnetperls.com/regex-replace.

C# Regex to replace invalid character to make it as perfect float number

for example if the string is "-234.24234.-23423.344"
the result should be "-234.2423423423344"
if the string is "898.4.44.4"
the result should be "898.4444"
if the string is "-898.4.-"
the result should be "-898.4"
the result should always make scene as a double type
What I can make is this:
string pattern = String.Format(#"[^\d\{0}\{1}]",
NumberFormatInfo.CurrentInfo.NumberDecimalSeparator,
NumberFormatInfo.CurrentInfo.NegativeSign);
string result = Regex.Replace(value, pattern, string.Empty);
// this will not be able to deal with something like this "-.3-46821721.114.4"
Is there any perfect way to deal with those cases?
It's probably a bad idea, but you can do this with regex like this:
Regex.Replace(input, #"[^-.0-9]|(?<!^)-|(?<=\..*)\.", "")
The regex matches:
[^-.0-9] # anything which isn't ., -, or a digit.
| # or
(?<!^)- # a - which is not at the start of the string
| # or
(?<=\..*)\. # a dot which is not the first dot in the string
This works on your examples, and additionally this case: "9-1.1" becomes "91.1".
You could also change (?<!^)- to (?<!^[^-.0-9]*)- if you'd like "asd-8" to become "-8" rather than "8".
It's not a good idea using regex itself to achieve your goal, since regex lack AND and NOT logic for expression.
Try the code below, it will do the same thing.
var str = #"-.3-46821721.114.4";
var beforeHead = "";
var afterHead = "";
var validHead = new Regex(#"(\d\.)" /* use #"\." if you think "-.5" is also valid*/, RegexOptions.Compiled);
Regex.Replace(str, #"[^0-9\.-]", "");
var match = validHead.Match(str);
beforeHead = str.Substring(0, str.IndexOf(match.Value));
if (beforeHead[0] == '-')
{
beforeHead = '-' + Regex.Replace(beforeHead, #"[^0-9]", "");
}
else
{
beforeHead = Regex.Replace(beforeHead, #"[^0-9]", "");
}
afterHead = Regex.Replace(str.Substring(beforeHead.Length + 2 /* 1, if you use \. as head*/), #"[^0-9]", "");
var validFloatNumber = beforeHead + match.Value + afterHead;
String must be trimmed before operation.

Categories

Resources