I have a string which is
string a = @"\server\MainDirectory\SubDirectoryA\SubDirectoryB\SubdirectoryC\MyFile.pdf";
SubDirectoryB will always start with the prefix RN followed by 6 unique digits. I'm trying to replace the SubDirectoryB part of the string with a new value, let's say RN012345.
So the new string should look like
string b = @"\server\MainDirectory\SubDirectoryA\RN012345\SubdirectoryC\MyFile.pdf";
To achieve this I'm making use of the following helper method
public static string ReplaceAt(this string path, int index, int length, string replace)
{
return path.Remove(index, Math.Min(length, path.Length - index)).Insert(index, replace);
}
Which works great for now.
However, the original path will be changing in the near future, so it will be something like \MainDirectory\RN012345\AnotherDirectory\MyFile.pdf. So I was wondering if there is a regex or another feature I can use to just change the value in the path rather than providing the index, which will change in the future.
Assuming you need to only replace those \RNxxxxxx\ where each x is a unique digit, you need to capture the 6 digits and analyze the substring inside a match evaluator.
var a = @"\server\MainDirectory\SubDirectoryA\RN012345\SubdirectoryC\MyFile.pdf";
var res = Regex.Replace(a, @"\\RN([0-9]{6})\\", m =>
    m.Groups[1].Value.Distinct().Count() == m.Groups[1].Value.Length ?
        "\\RN0123456\\" : m.Value);
// res => \server\MainDirectory\SubDirectoryA\RN0123456\SubdirectoryC\MyFile.pdf
The regex is
\\RN([0-9]{6})\\
It matches a \ with \\, then matches RN, then matches and captures six digits into Group 1 (with ([0-9]{6})), and then matches a \. In the replacement part, the m.Groups[1].Value.Distinct().Count() == m.Groups[1].Value.Length check compares the number of distinct digits with the length of the captured substring. If they are equal, the digits are unique and the replacement occurs; otherwise, the whole match is put back into the result.
Use String.Replace
string oldSubdirectoryB = "RN012345";
string newSubdirectoryB = "RN147258";
string fileNameWithPath = @"\server\MainDirectory\SubDirectoryA\RN012345\SubdirectoryC\MyFile.pdf";
fileNameWithPath = fileNameWithPath.Replace(oldSubdirectoryB, newSubdirectoryB);
You can use Regex.Replace to replace the SubDirectoryB with your required value
string a = @"\server\MainDirectory\SubDirectoryA\RN123456\SubdirectoryC\MyFile.pdf";
a = Regex.Replace(a, "RN[0-9]{6}", "Mairaj");
Here I have replaced the string RN followed by 6 digits with Mairaj.
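The ideas above can be combined into a small helper that does not depend on where the directory sits in the path. This is only a sketch: the names PathRewriter and ReplaceRnSegment are made up for illustration, and it assumes the directory is always RN followed by exactly six digits and is surrounded by backslashes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathRewriter
{
    // Hypothetical helper pattern: "RN" + six digits, but only when it is a
    // whole path segment (a backslash directly before and after it).
    private static readonly Regex RnSegment = new Regex(@"(?<=\\)RN[0-9]{6}(?=\\)");

    public static string ReplaceRnSegment(string path, string newSegment)
    {
        // A MatchEvaluator inserts newSegment literally; a plain replacement
        // string would interpret "$" sequences as group references.
        return RnSegment.Replace(path, m => newSegment);
    }
}
```

For example, PathRewriter.ReplaceRnSegment(@"\MainDirectory\RN999999\AnotherDirectory\MyFile.pdf", "RN012345") returns \MainDirectory\RN012345\AnotherDirectory\MyFile.pdf, and a path without such a segment comes back unchanged.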
Related
Using C#, I am stuck while trying to extract a specific string while limiting the string to be matched. Here is my input string:
NPS_CNTY01_10112018_Adult_Submittal.txt
I would like to extract 01 after CNTY and ignore anything after 01.
So far I have this regex:
(?!NPS_CNTY)\d{2}
But the above regex gets many other digit matches from the input string. One approach I was thinking of was to limit the input to 9 characters to eventually get 01, but I have not been able to achieve that. Any help is appreciated.
I would like to add that the only variable data in this input string is:
NPS_CNTY[two digit county code excluding this bracket]_[date in MMDDYYYY format excluding the brackets]_Adult_Submittal.txt.
Also please limit solutions to regex's.
The (?!NPS_CNTY)\d{2} pattern matches a location that is not immediately followed by NPS_CNTY and then matches 2 digits. The lookahead always succeeds, since two digits cannot start an NPS_CNTY char sequence, so it is redundant.
You may use a positive lookbehind like this to get 01:
var m = Regex.Match(s, @"(?<=NPS_CNTY)\d+");
var result = "";
if (m.Success)
{
result = m.Value;
}
Here, (?<=NPS_CNTY), a positive lookbehind, matches a location that is immediately preceded with NPS_CNTY and then \d+ matches 1 or more digits.
An equivalent solution using capturing mechanism is
var m = Regex.Match(s, @"NPS_CNTY(\d+)");
var result = "";
if (m.Success)
{
result = m.Groups[1].Value;
}
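If the county code is known to be exactly two digits, the quantifier can be tightened and the match anchored to the start of the file name. A minimal sketch, where CountyCode and TryGet are hypothetical names and the file name is assumed to follow the NPS_CNTY[code]_[date]_... layout from the question:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CountyCode
{
    // Exactly two digits that directly follow the literal prefix NPS_CNTY
    // at the start of the name and are followed by an underscore.
    private static readonly Regex Pattern = new Regex(@"^NPS_CNTY(\d{2})_");

    public static bool TryGet(string fileName, out string code)
    {
        var m = Pattern.Match(fileName ?? string.Empty);
        code = m.Success ? m.Groups[1].Value : null;
        return m.Success;
    }
}
```

With the sample input NPS_CNTY01_10112018_Adult_Submittal.txt, TryGet returns true and sets code to 01; the digits of the date are never considered because the pattern only matches at the start.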
If the string always starts with NPS_CNTY and you have to extract 2 digits, then you don't need a regular expression. Just use the Substring() method:
string text = @"NPS_CNTY01_01141980_Adult_Submittal.txt";
string digits = text.Substring(8, 2);
EDIT:
In case you need to match N digits after NPS_CNTY you can use the following code:
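string text = @"NPS_CNTY012_01141980_Adult_Submittal.txt";
// FirstOrDefault requires "using System.Linq;". The char[] overload of Split works on all .NET versions.
string digits = text.Replace("NPS_CNTY", string.Empty)
    .Split(new[] { '_' }, StringSplitOptions.RemoveEmptyEntries)
    .FirstOrDefault();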
Using C#, I have a string that is a SQL script containing multiple queries. I want to remove sections of the string that are enclosed in single quotes. I can do this using Regex.Replace, in this manner:
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
test = Regex.Replace(test, "'[^']*'", string.Empty);
Results in: "Only can we turn him to the of the Force"
What I want to do is remove the substrings between quotes EXCEPT for substrings containing a specific substring. For example, using the string above, I want to remove the quoted substrings except for those that contain "dark," such that the resulting string is:
Results in: "Only can we turn him to the 'dark side' of the Force"
How can this be accomplished using Regex.Replace, or perhaps by some other technique? I'm currently trying a solution that involves using Substring(), IndexOf(), and Contains().
Note: I don't care if the single quotes around "dark side" are removed or not, so the result could also be: "Only can we turn him to the dark side of the Force." I say this because a solution using Split() would remove all the single quotes.
Edit: I don't have a solution yet using Substring(), IndexOf(), etc. By "working on," I mean I'm thinking in my head how this can be done. I have no code, which is why I haven't posted any yet. Thanks.
Edit: VKS's solution below works. I wasn't escaping the \b the first attempt which is why it failed. Also, it didn't work unless I included the single quotes around the whole string as well.
test = Regex.Replace(test, "'(?![^']*\\bdark\\b)[^']*'", string.Empty);
'(?![^']*\bdark\b)[^']*'
Try this. See the demo. Replace with an empty string. You can use a lookahead here to check whether the quoted text contains the word dark.
https://www.regex101.com/r/rG7gX4/12
While vks's solution works, I'd like to demonstrate a different approach:
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
test = Regex.Replace(test, @"'[^']*'", match => {
    if (match.Value.Contains("dark"))
        return match.Value;
    // You can add more cases here
    return string.Empty;
});
Or, if your condition is simple enough:
test = Regex.Replace(test, @"'[^']*'", match => match.Value.Contains("dark")
    ? match.Value
    : string.Empty
);
That is, use a lambda to provide a callback for the replacement. This way, you can run arbitrary logic to replace the string.
Something like this would work. You can add all the strings you want to keep to the excludedString array.
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
var excludedString = new string[] { "dark side" };
int startIndex = 0;
while ((startIndex = test.IndexOf('\'', startIndex)) >= 0)
{
var endIndex = test.IndexOf('\'', startIndex + 1);
var subString = test.Substring(startIndex, (endIndex - startIndex) + 1);
if (!excludedString.Contains(subString.Replace("'", "")))
{
test = test.Remove(startIndex, (endIndex - startIndex) + 1);
}
else
{
startIndex = endIndex + 1;
}
}
Another method through regex alternation operator |.
@"('[^']*\bdark\b[^']*')|'[^']*'"
Then replace the matched character with $1
string str = "Only 'together' can we turn him to the 'dark side' of the Force";
string result = Regex.Replace(str, @"('[^']*\bdark\b[^']*')|'[^']*'", "$1");
Console.WriteLine(result);
Explanation:
(...) called capturing group.
'[^']*\bdark\b[^']*' matches every single-quoted string that contains the word dark. [^']* matches any character except ', zero or more times.
('[^']*\bdark\b[^']*'), because the regex is within a capturing group, all the matched characters are stored inside the group index 1.
| Next comes the regex alternation operator.
'[^']*' Now this matches all the remaining single-quoted strings (those that do not contain dark). Note that this won't match a single-quoted string that contains dark, because those strings were already matched by the pattern before the | alternation operator.
Finally replacing all the matched characters with the chars inside group index 1 will give you the desired output.
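To compare the alternation pattern with the lookahead pattern from the earlier answer, here is a small sketch that wraps both in methods. The class and method names are illustrative only; the patterns are the ones quoted in this thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteFilterDemo
{
    // Lookahead variant: delete quoted runs that do not contain the word "dark".
    public static string WithLookahead(string s) =>
        Regex.Replace(s, @"'(?![^']*\bdark\b)[^']*'", string.Empty);

    // Alternation variant: keep group 1 (quoted runs with "dark"), drop the rest.
    // When the second alternative matches, group 1 is empty, so "$1" inserts nothing.
    public static string WithAlternation(string s) =>
        Regex.Replace(s, @"('[^']*\bdark\b[^']*')|'[^']*'", "$1");
}
```

On the sample sentence both methods give the same result. They are not identical in general, though: the lookahead variant never consumes the kept section, so when more quoted text follows, its closing quote can be paired with the next opening quote and the text in between is removed. The alternation (or the callback approach) consumes the kept section and avoids that.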
I made this attempt that I think you were thinking about (some solution using split, Contain, ... without regex)
string test = "Only 'together' can we turn him to the 'dark side' of the Force";
string[] separated = test.Split('\'');
string result = "";
for (int i = 0; i < separated.Length; i++)
{
string str = separated[i];
str = str.Trim(); // trim the leading and trailing spaces
if (i % 2 == 0 || str.Contains("dark")) // you can expand your condition
{
result += str+" "; // add space after each added string
}
}
result = result.Trim(); // trim the trailing space again
I have text like this
Inc12345_Month
Ted12345_Month
J8T12345_Month
What I need to do is extract the 12345 and also remove everything before it. This will be done in C#
.+?(?=\d_Monthly) was working in a regex tester online but when I put it in my code it only returned 5_Month.
Edit: the 12345 could be a variable length so I cannot [0-9] multiple times.
Edit2: Code this was just to try and remove everything before the 12345
string text = /* the above text pulled in from a file */;
Regex reg = new Regex(@".+?(?=\d+_Monthly)");
text = reg.Replace(text, "");
You can use this function to strip it:
private static Regex getNumberAndBeyondRegex = new Regex(@"^.{2}\D+(\d.*)$", RegexOptions.Compiled);
public static string GetNumberAndBeyond(string input)
{
var match = getNumberAndBeyondRegex.Match(input);
if (!match.Success) throw new ArgumentException("String isn't in the correct format.", "input");
return match.Groups[1].Value;
}
The regex at work is ^.{2}\D+(\d.*)$
It works by grabbing everything from the first digit that follows at least one non-digit character. It'll match not only _Month but also other endings.
The regex consists of a few parts:
^ matches the beginning of the string
.{2} matches any two characters, to prevent a digit from matching if it's the first or 2nd character; you can increase this number to the minimum prefix length - 1
\D+ matches at least one character that isn't a digit
( starts capturing a group
\d.* matches one digit and everything after it
) closes the capturing group
$ matches the end of the string
There are a lot of different regex flavors, many of them have slight differences in terms of escaping, capturing, replacing and quite surely some others.
For testing .NET regexes online I use the free version of the tool RegexHero. It shows a popup every now and then, but it makes up for that by showing you live results, capture groups, and instant replacing, on top of quite a lot of other features.
If you want to match anywhere within the string, you can use the regex \d+_Month; it is very similar to your original regex. In code:
new Regex(@"\d+_Month").Match(input).Value
Edit:
Based on the format you supplied in the comment I've created a regex and function to parse the entire file name:
private static Regex parseFileNameRegex = new Regex(@"^.*\D(\d+)_Month_([a-zA-Z]+)\.(\w+)$", RegexOptions.Compiled);
public static bool TryParseFileName(string fileName, out int id, out string month, out string fileExtension)
{
id = 0; month = null; fileExtension = null;
if (fileName == null) return false;
var match = parseFileNameRegex.Match(fileName);
if (!match.Success) return false;
if (!int.TryParse(match.Groups[1].Value, out id) || id < 1) return false; // Convert the ID into a number
month = match.Groups[2].Value;
fileExtension = match.Groups[3].Value;
return true;
}
The parse function requires the ID to be at least 1; 0 isn't accepted (and negative numbers won't match the regex). If you don't want this restriction, simply remove || id < 1 from the function.
Using the function would look like:
int id; string month, fileExtension;
if (!TryParseFileName("CompanyName_ClientName12345_Month_Nov.pdf", out id, out month, out fileExtension))
throw new FormatException("File name is incorrectly formatted."); // Do whatever you want when you get an invalid filename
// Use id, month and fileExtension here :)
The regex ^.*\D(\d+)_Month_([a-zA-Z]+)\.(\w+)$ works like:
^ matches the beginning of the string
.*\D matches any characters, ending with one non-digit character
(\d+) captures at least 1 number, this is the ID
_Month_ is the literal text in between
([a-zA-Z]+) matches and captures at least 1 letter, this is the month
\. matches a . character
(\w+) matches and captures one or more word characters (letters, digits, and underscores), this is the file extension
$ matches the end of the string
Using :
Regex reg = new Regex(@"\D+(?=(\d+)_Monthly)");
is more explicit, the result is in Groups[1].
Part by part:
.+?
Lazily match one or more characters (as few as possible). Note that this is not equivalent to ".*", which is greedy and can also match nothing.
(?=
start a lookahead group (it checks what follows without consuming it)
\d
Match exactly 1 digit, which explains what you are seeing: the rest of the number is matched by .+?, which is outside the group
_Monthly
match the literal text
)
end group
I think what you want is to anchor the lazy prefix at the start and allow one or more digits in the lookahead:
^.*?(?=\d+_Monthly)
I guess you are missing the + sign after \d
.+?(?=\d+_Monthly)
This should ask for one or more digits.
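One thing to keep in mind with Regex.Replace is that it replaces every match, not just the first. With an unanchored lazy prefix, once the letters are gone each digit that is still followed by another digit matches again, so only the last digit survives. The sketch below shows the effect and one way around it (anchoring with ^); PrefixStripDemo is an illustrative name, and the suffix is assumed to be _Monthly as in the question's pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixStripDemo
{
    // Unanchored: after the letters are removed, each digit that is still
    // followed by another digit matches again and is removed too.
    public static string Unanchored(string s) =>
        Regex.Replace(s, @".+?(?=\d+_Monthly)", "");

    // Anchored at the start: only the leading prefix before the number can match.
    public static string Anchored(string s) =>
        Regex.Replace(s, @"^.*?(?=\d+_Monthly)", "");
}
```

For Inc12345_Monthly, Unanchored returns 5_Monthly while Anchored returns 12345_Monthly. Anchored also handles J8T12345_Monthly, because the lone 8 is not followed by _Monthly and so the lookahead fails there.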
If you don't need anything before the number, this should work:
(\d+_Month)
I use Derek Slager's regex tester when I'm working with C# regex.
Better dotnet regular expression tester
I have the following string example:
\\servername\Client\Range\Product\
The servername, Client, Range, and Product values can be anything, but they simply represent a Samba share on a server.
I want to be able to take one of these paths and replace everything up to the fourth \ with a new path. For example:
\\10.0.1.1\ITClient\001\0012\ will become:
\\10.0.1.1\Archive\001\0012\
All the paths that I get will follow the same start pattern \\servername\Client\. Using C#, how can I replace everything in the string up to the 4th "\"?
I have looked at using regex, but I have never been able to understand its wonders and powers.
This Regex pattern will match everything through the 4th \
^(?:.*?\\){4}
usage:
var result = Regex.Replace(inputString, @"^(?:.*?\\){4}", @"\\10.0.1.1\Archive\");
To elucidate the Regex a bit:
^ // denotes start of line
(?:…) // we need to group some stuff, so we use parens, and ?: denotes that we do not want to use the parens for capturing (this is a performance optimization)
.*? // denotes any character, zero or more times, until what follows (\)
\\ //denotes a backslash (the backslash is also escape char)
{4} // repeat 4 times
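For readers who would rather avoid the regex, the same "everything through the 4th backslash" idea can be sketched with IndexOf. The names UncPath and ReplaceThroughNthBackslash are hypothetical, and the sketch assumes the input really contains at least n backslashes.

```csharp
using System;

public static class UncPath
{
    // Hypothetical helper: replaces everything through the n-th backslash
    // (inclusive) with newPrefix.
    public static string ReplaceThroughNthBackslash(string path, int n, string newPrefix)
    {
        int index = -1;
        for (int i = 0; i < n; i++)
        {
            // Search for the next backslash after the previous one.
            index = path.IndexOf('\\', index + 1);
            if (index < 0)
                throw new ArgumentException("Path has fewer backslashes than expected.", nameof(path));
        }
        return newPrefix + path.Substring(index + 1);
    }
}
```

Calling UncPath.ReplaceThroughNthBackslash(@"\\10.0.1.1\ITClient\001\0012\", 4, @"\\10.0.1.1\Archive\") gives \\10.0.1.1\Archive\001\0012\, the same result as the regex version.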
You can use String.Format or Path.Combine
string template = @"\\{0}\{1}\{2}\{3}\";
string server = "10.0.1.1";
string folder = "Archive";
string range = "001";
string product = "0012";
string s1 = String.Format(template,
server,
folder,
range,
product);
// s1 = \\10.0.1.1\Archive\001\0012\
string s2 = Path.Combine(@"\\", server, folder, range, product);
// s2 = \\10.0.1.1\Archive\001\0012 (on Windows; Path.Combine does not append a trailing backslash)
An elegant regex solution would be:
(new Regex(@"(?<=[^\\]\\)[^\\]+")).Replace(str, "Archive", 1);
which replaces the first path segment that follows a single backslash (here, the client folder) with the "Archive" string.
Unless I'm missing something major, you could just use a mask and format it:
static string pathMask = @"\\{0}\{1}\{2}\{3}\";
string server = "10.0.1.1";
string client = "archive";
string range = "001";
string product = "0012";
...
string path = string.Format(pathMask, server, client, range, product);
The string methods would work but may be more verbose than a regex. If you want to match everything from the start of the string up to the 4th \, then the following regex will do it (assuming your string meets the pattern provided):
^\\\\[^\\]+\\[^\\]+\\
So some code something like
string updated = Regex.Replace(@"\\10.0.1.1\ITClient\001\0012\", @"^\\\\[^\\]+\\[^\\]+\\", @"\\10.0.1.1\Archive\");
should do the trick.
This one is easy and fast, as long as we are talking about only 4 parts:
string example = @"\\10.0.1.1\ITClient\001\0012\";
string[] parts = example.Split(new string[] { @"\" }, StringSplitOptions.RemoveEmptyEntries);
I have a username = LICTowner.
I need to get the prefix from the word LICTowner, i.e. LICT.
How do I split the word and get the 4-letter prefix
in ASP.NET using C#?
if the prefix is ALWAYS 4 letters you can use the Substring method:
var prefix = username.Substring(0, 4);
where the first int is the start index and the second int is the length.
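Note that Substring(0, 4) throws an ArgumentOutOfRangeException when the string is shorter than four characters. If short usernames are possible, a guarded version could look like the sketch below; UserNames and Prefix are hypothetical names chosen for this example.

```csharp
using System;

public static class UserNames
{
    // Hypothetical helper: returns at most `length` leading characters
    // and never throws on null, empty, or short input.
    public static string Prefix(string value, int length)
    {
        if (string.IsNullOrEmpty(value) || length <= 0) return string.Empty;
        return value.Length <= length ? value : value.Substring(0, length);
    }
}
```

UserNames.Prefix("LICTowner", 4) returns LICT, while UserNames.Prefix("LI", 4) returns LI instead of throwing.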
String userName = "LICTowner";
String prefix = userName.Substring(0,4); // LICT
String restOfWord = userName.Substring(4); // owner
Hmmmm... before anything: a) you should really look for similar questions, and b) that's not a hard problem... I mean, have you even tried?
If the prefix is always 4 letters, then just use the Substring method, as in
string username = "LICTowner";
string prefix = username.Substring(0, 4); // "LICT"
string s = "LICTowner";
Label1.Text= Regex.Replace(s, "[^A-Z]", "");
Simple regular expression to remove all characters other than upper case
string name = "aryan";
// Substring(a, b): a is the index to start from, b is the number of characters to take
string prefixwords = name.Substring(0, 2); // "ar"
// Now take any label and show the value
Label lblmsg = new Label();
lblmsg.Text = prefixwords;
string restofwords = name.Substring(2); // "yan"
If the prefix is always 4 characters use simple substring
Else
a) remove all non upper case characters
string s = "LICTowner";
Label1.Text= Regex.Replace(s, "[^A-Z]", "");
b)
split at the first non upper case character
Label1.Text= Regex.Split(s, "[^A-Z]")[0];
You can use the Substring() function to get the first four Character.
string mystring = "LICTowner";
string prefixword = mystring.Substring(0,4);
Label label1 = new Label();
label1.Text = prefixword;