Replace character with any possible string in C#

Let's say I have a search string like Test%Test, and I have stored strings like these:
Test123Test
TestTTTTest
Test153jhdsTest
123Test
TEST123
What I want is this: when I type Test in a textbox, it should filter for everything that contains Test, which returns all of the strings. That part is easy. But I also want to type Test%Test and have it filter for everything that contains Test[anything]Test (so the result would be the first, second, and third strings). How can I do that?

A simple solution using a regex is:
string[] values = new string[] { "Test123Test",
"TestTTTTest",
"Test153jhdsTest",
"123Test",
"TEST123" };
string searchQuery = "Test%Test";
string regex = Regex.Escape(searchQuery).Replace("%", ".*?");
string[] filteredValues = values.Where(str => Regex.IsMatch(str, regex)).ToArray();
Or for a single match:
string value = "Test123Test";
string searchQuery = "Test%Test";
string regex = Regex.Escape(searchQuery).Replace("%", ".*?");
if ( Regex.IsMatch(value, regex) )
{
// do something with the match...
}
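We replace % with a regular expression fragment (. = any character, * = zero or more times, ? = makes the quantifier lazy). Regex.Escape is applied first so that any other regex metacharacters in the query are matched literally; it leaves % untouched, which is why the Replace call still finds it.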
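As a variation on the answer above, here is a minimal sketch that splits the query on % and escapes each literal piece separately, so it does not depend on how Regex.Escape treats %. The helper name BuildWildcardRegex is hypothetical, and the case-insensitive option is an assumption based on the question expecting TEST123 to match Test.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the answer): builds a regex from a query with '%' wildcards.
static Regex BuildWildcardRegex(string query)
{
    // Escape each literal piece separately, then join the pieces with ".*"
    // so that only '%' acts as a wildcard.
    string pattern = string.Join(".*", query.Split('%').Select(Regex.Escape));
    // Assumption: the filter should ignore case.
    return new Regex(pattern, RegexOptions.IgnoreCase);
}

string[] stored = { "Test123Test", "TestTTTTest", "Test153jhdsTest", "123Test", "TEST123" };

string[] wildcardHits = stored.Where(s => BuildWildcardRegex("Test%Test").IsMatch(s)).ToArray();
string[] plainHits = stored.Where(s => BuildWildcardRegex("Test").IsMatch(s)).ToArray();

Console.WriteLine(string.Join(", ", wildcardHits));
Console.WriteLine(plainHits.Length);
```

With the sample data, the wildcard query keeps the first three strings and the plain query keeps all five. In real code the regex would be built once per query rather than once per string.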

Related

Regular expression split string, extract string value before and numeric value between square brackets

I need to parse a string that looks like "Abc[123]". The numerical value between the brackets is needed, as well as the string value before the brackets.
Most of the examples I tested work fine, but some special cases cause problems.
This code seems to work for "normal" cases but fails on "special" ones:
var pattern = @"\[(.*[0-9])\]";
var query = "Abc[123]";
var numVal = Regex.Matches(query, pattern).Cast<Match>().Select(m => m.Groups[1].Value).FirstOrDefault();
var stringVal = Regex.Split(query, pattern)
.Select(x => x.Trim())
.FirstOrDefault();
How should the code be adjusted to handle these special cases as well?
For the string "Abc[]", the parser should return "Abc" as the string value and indicate an empty numeric value (which could eventually default to 0).
For the string "Abc[xy33]", the parser should return "Abc" as the string value and indicate an invalid numeric value.
For the string "Abc", the parser should return "Abc" as the string value and indicate a missing numeric value. Blanks before, after, or inside the brackets should be trimmed, as in "Abc [ 123 ] ".
Try this pattern: ^([^\[]+)\[([^\]]*)\]
Explanation of the pattern:
^ - match the beginning of the string
([^\[]+) - match one or more of any character except [ and store it inside the first capturing group
\[ - match [ literally
([^\]]*) - match zero or more of any character except ] and store it inside the second capturing group
\] - match ] literally
Here's tested code:
var pattern = @"^([^\[]+)\[([^\]]*)\]";
var queries = new string[]{ "Abc[123]", "Abc[xy33]", "Abc[]", "Abc[ 33 ]", "Abc" };
foreach (var query in queries)
{
string beforeBrackets;
string insideBrackets;
var match = Regex.Match(query, pattern);
if (match.Success)
{
beforeBrackets = match.Groups[1].Value;
insideBrackets = match.Groups[2].Value.Trim();
if (insideBrackets == "")
insideBrackets = "0";
else if (!int.TryParse(insideBrackets, out int i))
insideBrackets = "incorrect value!";
}
else
{
beforeBrackets = query;
insideBrackets = "no value";
}
Console.WriteLine($"Input string {query} : before brackets: {beforeBrackets}, inside brackets: {insideBrackets}");
}
Console.ReadKey();
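Output:
Input string Abc[123] : before brackets: Abc, inside brackets: 123
Input string Abc[xy33] : before brackets: Abc, inside brackets: incorrect value!
Input string Abc[] : before brackets: Abc, inside brackets: 0
Input string Abc[ 33 ] : before brackets: Abc, inside brackets: 33
Input string Abc : before brackets: Abc, inside brackets: no value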
We can try doing a regex replacement on the input, for a one-liner solution:
string input = "Abc[123]";
string letters = Regex.Replace(input, "\\[.*\\]", "");
string numbers = Regex.Replace(input, ".*\\[(\\d+)\\]", "$1");
Console.WriteLine(letters);
Console.WriteLine(numbers);
This prints:
Abc
123
There are probably language-level techniques for this that I don't know of, but with a regular expression we could capture everything using capturing groups and check each part one by one, for example:
^([A-Za-z]+)\s*(\[?)\s*([A-Za-z]*)(\d*)\s*(\]?)\s*$
If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com, where you can also see how it matches against some sample inputs.
You can achieve that easily without using regex
string temp = "Abc[123]";
string[] arr = temp.Split('[');
string name = arr[0];
string value = arr[1].TrimEnd(']');
Output: name = Abc and value = 123
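The Split approach above assumes the input always contains a bracket pair; for an input such as "Abc", arr[1] does not exist and the code throws. A minimal non-regex sketch that also covers the special cases from the question could look like this. The helper name SplitBracketed and its tuple shape are hypothetical, chosen only for illustration.

```csharp
using System;

// Hypothetical helper: returns the trimmed text before '[', the trimmed text inside
// the brackets, and a flag telling whether a "[...]" pair was found at all.
static (string Name, string Inner, bool HasBrackets) SplitBracketed(string input)
{
    int open = input.IndexOf('[');
    int close = open < 0 ? -1 : input.IndexOf(']', open + 1);
    if (open < 0 || close < 0)
        return (input.Trim(), "", false);

    string name = input.Substring(0, open).Trim();
    string inner = input.Substring(open + 1, close - open - 1).Trim();
    return (name, inner, true);
}

foreach (var sample in new[] { "Abc[123]", "Abc[]", "Abc [ 123 ] ", "Abc[xy33]", "Abc" })
{
    var (name, inner, hasBrackets) = SplitBracketed(sample);
    // int.TryParse distinguishes a valid number from an empty or invalid value.
    string status = !hasBrackets ? "missing"
                  : inner.Length == 0 ? "empty"
                  : int.TryParse(inner, out _) ? "number"
                  : "invalid";
    Console.WriteLine(name + " | " + inner + " | " + status);
}
```

The caller decides what to do with each status, for example defaulting an empty value to 0 as the question suggests.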

Modifying string value

I have a string which is
string a = @"\server\MainDirectory\SubDirectoryA\SubDirectoryB\SubdirectoryC\MyFile.pdf";
SubDirectoryB will always start with the prefix RN followed by 6 unique numbers. Now I'm trying to replace the SubDirectoryB part of the string with a new value, let's say RN012345.
So the new string should look like
string b = @"\server\MainDirectory\SubDirectoryA\RN012345\SubdirectoryC\MyFile.pdf";
To achieve this I'm making use of the following helper method
public static string ReplaceAt(this string path, int index, int length, string replace)
{
return path.Remove(index, Math.Min(length, path.Length - index)).Insert(index, replace);
}
This works great for now.
However, the original path will be changing in the near future, so it will be something like \MainDirectory\RN012345\AnotherDirectory\MyFile.pdf. So I was wondering whether there is a regex or another feature I can use to change just that value in the path, rather than providing an index that will change in the future.
Assuming you need to only replace those \RNxxxxxx\ where each x is a unique digit, you need to capture the 6 digits and analyze the substring inside a match evaluator.
var a = @"\server\MainDirectory\SubDirectoryA\RN012345\SubdirectoryC\MyFile.pdf";
var res = Regex.Replace(a, @"\\RN([0-9]{6})\\", m =>
    m.Groups[1].Value.Distinct().Count() == m.Groups[1].Value.Length ?
        "\\RN0123456\\" : m.Value);
// res => \server\MainDirectory\SubDirectoryA\RN0123456\SubdirectoryC\MyFile.pdf
See the C# demo
The regex is
\\RN([0-9]{6})\\
The pattern matches a literal \ with \\, then RN, then captures six digits into Group 1 (with ([0-9]{6})), and finally matches another \. In the replacement part, m.Groups[1].Value.Distinct().Count() == m.Groups[1].Value.Length checks whether the number of distinct digits equals the length of the captured substring. If it does, the digits are unique and the replacement occurs; otherwise the whole match is put back into the result.
Use String.Replace
string oldSubdirectoryB = "RN012345";
string newSubdirectoryB = "RN147258";
string fileNameWithPath = @"\server\MainDirectory\SubDirectoryA\RN012345\SubdirectoryC\MyFile.pdf";
fileNameWithPath = fileNameWithPath.Replace(oldSubdirectoryB, newSubdirectoryB);
You can use Regex.Replace to replace the SubDirectoryB with your required value
string a = @"\server\MainDirectory\SubDirectoryA\RN123456\SubdirectoryC\MyFile.pdf";
a = Regex.Replace(a, "RN[0-9]{6}", "Mairaj");
Here I have replaced the substring RN followed by 6 digits with Mairaj.
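The pattern above matches RN plus six digits anywhere in the path, including inside a longer name. If the directory name must be exactly RN plus six digits, one option is to require a backslash on each side with lookarounds. This is a sketch under that assumption; the replacement value RN012345 is taken from the question, and the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string originalPath = @"\server\MainDirectory\SubDirectoryA\RN123456\SubdirectoryC\MyFile.pdf";

// (?<=\\) and (?=\\) require a backslash before and after the directory name
// without making the backslashes part of the match, so they are not replaced.
string directoryPattern = @"(?<=\\)RN[0-9]{6}(?=\\)";

string updatedPath = Regex.Replace(originalPath, directoryPattern, "RN012345");
Console.WriteLine(updatedPath);
```

Because the pattern does not depend on position, it should keep working when the directory moves elsewhere in the path, as long as it stays a complete path segment.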

Omit unnecessary parts in string array

In C#, I have a string that comes from a file in this format:
Type="Data"><Path.Style><Style
or maybe
Type="Program"><Rectangle.Style><Style
, etc. Now I want to extract only the Data or Program part of the Type element. For that, I used the following code:
string output;
var pair = inputKeyValue.Split('=');
if (pair[0] == "Type")
{
output = pair[1].Trim('"');
}
But it gives me this result:
output=Data><Path.Style><Style
What I want is:
output=Data
How to do that?
This code example takes an input string, splits by double quotes, and takes only the first 2 items, then joins them together to create your final string.
string input = "Type=\"Data\"><Path.Style><Style";
var parts = input
.Split('"')
.Take(2);
string output = string.Join("", parts); //note: .net 4 or higher
This will make output have the value:
Type=Data
If you only want output to be "Data", then do
var parts = input
.Split('"')
.Skip(1)
.Take(1);
or
var output = input
.Split('"')[1];
What you can do is use a very simple regular expression to parse out the bits that you want. In your case you want something that looks like this, and then grab the two groups that interest you:
(Type)="(\w+)"
This returns, in groups 1 and 2, the value Type and the word characters contained between the double quotes.
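A minimal sketch of how that pattern could be applied, using the sample input from the question. The variable names are illustrative, and the pattern is written as a regular C# string literal, so the quotes and the backslash are escaped.

```csharp
using System;
using System.Text.RegularExpressions;

string typeInput = "Type=\"Data\"><Path.Style><Style";

// Group 1 captures the attribute name, group 2 the word characters between the quotes.
Match typeMatch = Regex.Match(typeInput, "(Type)=\"(\\w+)\"");

string attributeName = typeMatch.Success ? typeMatch.Groups[1].Value : "";
string attributeValue = typeMatch.Success ? typeMatch.Groups[2].Value : "";

Console.WriteLine(attributeName + " -> " + attributeValue);
```

Checking Success first makes the no-match case explicit instead of silently producing empty strings.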
Instead of doing many splits, why not just use Regex:
output = Regex.Match(pair[1], "\"(\\w*)\"").Groups[1].Value;
Maybe I missed something, but what about this:
var str = "Type=\"Program\"><Rectangle.Style><Style";
var splitted = str.Split('"');
var type = splitted[1]; // i.e. Data or Program
But you will need some error handling as well.
How about a regex?
var regex = new Regex("(?<=^Type=\").*?(?=\")");
var output = regex.Match(input).Value;
Explanation of the regex:
(?<=^Type=\") This is a prefix match (lookbehind). It's not included in the result, but the regex will only match if the string starts with Type="
.*? Non-greedy match. Match as few characters as possible until the suffix matches
(?=\") This is a suffix match (lookahead). It's not included in the result, but the regex will only match if the next character is "
Given your specified format:
Type="Program"><Rectangle.Style><Style
It seems logical to me to include the quote mark (") when splitting the strings. Then you just have to detect the closing quote mark and extract the contents before it. You can use LINQ to do this:
string code = "Type=\"Program\"><Rectangle.Style><Style";
string[] parts = code.Split(new string[] { "=\"" }, StringSplitOptions.None);
string[] wantedParts = parts.Where(p => p.Contains("\"")).
Select(p => p.Substring(0, p.IndexOf("\""))).ToArray();

Regular Expression to divide a string with pipe-delimited into multiple groups

I'm writing C# code that divides a string into separate groups. The string is pipe-delimited, as in the example below.
There can be an empty field between two pipes.
The number of pipes before "5GOdNF7Q5fK5O9QKiZefJEfO1YECcX1w" is fixed; in this case, there are 4 pipes.
string value = "122312121|test value||test value 2|5GOdNF7Q5fK5O9QKiZefJEfO1YECcX1w|123456789|123456789";
const string sPattern = @"What should it be here?????";
var regex = new Regex(sPattern);
var match = regex.Match(value);
if (match.Success)
{
var begin = match.Groups["begin"].Value;
var middle = match.Groups["middle"].Value;
var end = match.Groups["end"].Value;
}
I am trying to get the output of the code to return as following:
begin = "122312121|test value||test value 2|"
middle = "5GOdNF7Q5fK5O9QKiZefJEfO1YECcX1w"
end = "|123456789|123456789"
However, I'm new to regular expressions, and although I have tried to write one for the variable sPattern, I could not produce the right pattern. Could anyone please help? Thanks.
You should use String.Split:
string[] sarray = value.Split('|');
That will give you the array
{"122312121", "test value", "", "test value 2", "5GOdNF7Q5fK5O9QKiZefJEfO1YECcX1w", "123456789", "123456789"}
and 5GOdNF7Q5fK5O9QKiZefJEfO1YECcX1w will be in sarray[4]
If you're looking for a regular expression to match this and want to use a regular expression rather than .Split, you could try this:
"^((.*?[|]){4})(.*?)([|].*)*$"
or more explicitly:
"^(?<begin>(.*?[|]){4})(?<middle>.*?)(?<end>[|].*)*$"
This is based on the fact that you said the number of pipes before your long string is fixed (at four).
Your code would then read as follows:
string value = "122312121|test value||test value 2|5GOdNF7Q5fK5O9QKiZefJEfO1YECcX1w|123456789|123456789";
const string sPattern = @"^((.*?[|]){4})(.*?)([|].*)*$";
var regex = new Regex(sPattern);
var match = regex.Match(value);
if (match.Success)
{
var begin = match.Groups[1].Value;
var middle = match.Groups[3].Value;
var end = match.Groups[4].Value;
}
The trick may be to escape the pipe character (each part that contains \| needs to be a verbatim string):
const string sPattern = @"(?<begin>[^|]*\|[^|]*\|[^|]*\|[^|]*\|)" +
    "(?<middle>[^|]*)" +
    @"(?<end>\|.*)";
You could use String.Split and some Linq to do what you need
Rough example:
string value = "122312121|test value||test value 2|5GOdNF7Q5fK5O9QKiZefJEfO1YECcX1w|123456789|123456789";
string[] split = value.Split('|');
string begin = string.Join("|", split.Take(4)) + "|"; // append the trailing pipe so begin matches the expected output
string middle = split.Skip(4).Take(1).FirstOrDefault();
string end = "|" + string.Join("|", split.Skip(5).Take(2));
Returns
begin = "122312121|test value||test value 2|"
middle = "5GOdNF7Q5fK5O9QKiZefJEfO1YECcX1w"
end = "|123456789|123456789"
Here's another one:
^(?<begin>(.*?\|){4})(?<middle>.*?(?=\|))(?<end>.*)
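To show how the named groups in that last pattern are read back, here is a short sketch using the sample value from the question. The variable names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

string pipeValue = "122312121|test value||test value 2|5GOdNF7Q5fK5O9QKiZefJEfO1YECcX1w|123456789|123456789";

// begin: four lazily matched "field|" pairs; middle: text up to the next pipe; end: the rest.
const string pipePattern = @"^(?<begin>(.*?\|){4})(?<middle>.*?(?=\|))(?<end>.*)";

Match pipeMatch = Regex.Match(pipeValue, pipePattern);
string beginPart = pipeMatch.Groups["begin"].Value;
string middlePart = pipeMatch.Groups["middle"].Value;
string endPart = pipeMatch.Groups["end"].Value;

Console.WriteLine(beginPart);
Console.WriteLine(middlePart);
Console.WriteLine(endPart);
```

Note that the lookahead requires a pipe after the middle field, so this pattern assumes at least one more field follows it.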

A More Efficient Way to Parse a String in C#

I have this code that reads a file and creates Regex groups. Then I walk through the groups and use other matches on keywords to extract what I need. I need the stuff between each keyword and the next space or newline. I am wondering if there is a way using the Regex keyword match itself to discard what I don't want (the keyword).
//create the pattern for the regex
String VSANMatchString = @"vsan\s(?<number>\d+)[:\s](?<info>.+)\n(\s+name:(?<name>.+)\s+state:(?<state>.+)\s+\n\s+interoperability mode:(?<mode>.+)\s\n\s+loadbalancing:(?<loadbal>.+)\s\n\s+operational state:(?<opstate>.+)\s\n)?";
//run the match
MatchCollection VSANInfoList = Regex.Matches(block, VSANMatchString);
// set up the keyword matches
Regex VSANNum = new Regex(@" \d* ");
Regex VSANName = new Regex(@"name:\S*");
Regex VSANState = new Regex(@"operational state\S*");
//now we can extract what we need since we know all the VSAN info will be matched to the correct VSAN
//match each keyword (name, state, etc), then split and extract the value
foreach (Match m in VSANInfoList)
{
string num=String.Empty;
string name=String.Empty;
string state=String.Empty;
string s = m.ToString();
if (VSANNum.IsMatch(s)) { num=VSANNum.Match(s).ToString().Trim(); }
if (VSANName.IsMatch(s))
{
string totrim = VSANName.Match(s).ToString().Trim();
string[] strsplit = Regex.Split (totrim, "name:");
name=strsplit[1].Trim();
}
if (VSANState.IsMatch(s))
{
string totrim = VSANState.Match(s).ToString().Trim();
string[] strsplit=Regex.Split (totrim, "state:");
state=strsplit[1].Trim();
}
}
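Your single regex should already be able to gather everything you need, because it defines named groups. Read them directly from each match instead of running the keyword regexes again:
string name = m.Groups["name"].Value; // named groups are read through Match.Groups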
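A self-contained sketch of that idea follows. The sample text and the pattern are simplified assumptions made for illustration; they are not the asker's real file layout or the full pattern from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sample text; the real file layout may differ.
string sampleBlock =
    "vsan 1 information\n" +
    "    name:VSAN0001  state:active\n" +
    "vsan 20 information\n" +
    "    name:Backup  state:suspended\n";

// Simplified pattern for illustration only: each named group captures just the value,
// so the keywords never need to be stripped afterwards.
string vsanPattern = @"vsan\s(?<number>\d+).*\n\s+name:(?<name>\S+)\s+state:(?<state>\S+)";

var summaries = new List<string>();
foreach (Match vsan in Regex.Matches(sampleBlock, vsanPattern))
{
    string number = vsan.Groups["number"].Value;
    string vsanName = vsan.Groups["name"].Value;
    string vsanState = vsan.Groups["state"].Value;
    summaries.Add(number + " " + vsanName + " " + vsanState);
}

foreach (string line in summaries)
    Console.WriteLine(line);
```

Using \S+ for the values, rather than .+ as in the question, stops each capture at the next whitespace, which matches the stated goal of taking everything up to the next space or newline.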
