Regex extraction of a specific pattern - c#

I have strings in the following format. There are three scenarios:
Scenario 1:
"\\hjsschjsn\Bunong.PU2.PV/-56Noogg.BSC";
The extraction should run up to ".BSC"; ".BSC" will always be present in the original string. The leading "\\" and the "\" separator will also always be present, but the text around them changes.
I have to omit the middle part, so my output should be:
"\\hjsschjsn\-56Noogg.BSC";
Scenario 2:
"\\adajsschjsn\Bcscx.sdjhs\AHHJogg.BSC";
The output should be :
"\\adajsschjsn\AHHJogg.BSC";
Scenario 3:
"aasjkankn\\adajsschjsn\Bcscx.sdjhs\AHHJogg.BSC\djkhakdjhjkj";
output should be:
"\\adajsschjsn\AHHJogg.BSC";
Here's what I have tried:
string text = "\\\\hjsschjsn\\Bunong.PU2.PV/-56Noogg.BSC";
// Note: backslashes are doubled because this is a regular (non-verbatim) string literal
Match pattern = Regex.Match(text, @"\\\\[\w]+\\/[\w*]+.BSC");

Try the following pattern:
.*(\\\\[^\\]*\\)([^\\\/]+)[\\\/](.*?\.BSC).*
Replace it with $1$3:
Regex reg = new Regex(@".*(\\\\[^\\]*\\)([^\\\/]+)[\\\/](.*?\.BSC).*");
string input = @"\\hjsschjsn\Bunong.PU2.PV/-56Noogg.BSC";
string output = reg.Replace(input, "$1$3");

Match pattern1 = Regex.Match(text, @"\\\\\w+\\");
// [^\\/]+ also matches the leading '-' in "-56Noogg"; the dot before BSC is escaped
Match pattern2 = Regex.Match(text, @"[^\\/]+\.BSC");
Console.WriteLine(pattern1.ToString() + pattern2.ToString());
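To check the Replace-based pattern against all three scenarios from the question, a small console program can help. This is only a sketch: the class name UncPathTrimmer and the method name Trim are made up for illustration, and the pattern is the one quoted in the first answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UncPathTrimmer
{
    // Group 1 is the "\\host\" prefix, group 2 the middle segment to drop,
    // group 3 the file name ending in ".BSC".
    private static readonly Regex Pattern =
        new Regex(@".*(\\\\[^\\]*\\)([^\\\/]+)[\\\/](.*?\.BSC).*");

    // Keeps only the prefix and the file name (hypothetical helper).
    public static string Trim(string input)
    {
        return Pattern.Replace(input, "$1$3");
    }

    public static void Main()
    {
        string[] samples =
        {
            @"\\hjsschjsn\Bunong.PU2.PV/-56Noogg.BSC",
            @"\\adajsschjsn\Bcscx.sdjhs\AHHJogg.BSC",
            @"aasjkankn\\adajsschjsn\Bcscx.sdjhs\AHHJogg.BSC\djkhakdjhjkj",
        };
        foreach (string sample in samples)
        {
            Console.WriteLine(Trim(sample));
        }
    }
}
```

Verbatim strings (@"...") are used for the samples so that each backslash is written once, exactly as it appears in the scenarios.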

Related

Regex for text between two characters

I'm trying to get some text between two strings in C# in regex expression.
The text is in variable (tb1.product_name) : Example Text | a:10,Colour:Green
Get all text before |, in this case, Example Text
Get all text between : and ,, in this case, 10
I want to do this with two different regexes.
I tried:
Regex.Match(tb1.product_name, @"\:([^,]*)\)").Groups[1].Value
But this doesn't work.
If regex is not strictly necessary, you can do this simply with string.Substring and string.IndexOf:
string str = "Example Text | a:10,Colour:Green";
string strBeforeVerticalBar = str.Substring(0, str.IndexOf('|'));
string strInBetweenColonAndComma = str.Substring(str.IndexOf(':') + 1, str.IndexOf(',') - str.IndexOf(':') - 1);
Edit 1:
I feel Regex might be overkill for something as simple as this. Also, if you use what I suggested, you can add Trim() at the end to remove surrounding whitespace, if any. Like:
string strBeforeVerticalBar = str.Substring(0, str.IndexOf('|')).Trim();
string strInBetweenColonAndComma = str.Substring(str.IndexOf(':') + 1, str.IndexOf(',') - str.IndexOf(':') - 1).Trim();
string str = @"Example Text |a:10,Colour: Green";
// The | must be escaped, otherwise it is treated as alternation
Match match = Regex.Match(str, @"^([A-Za-z\s]*)\|");
Match match2 = Regex.Match(str, @":([0-9]*),");
//output Example Text
Console.WriteLine(match.Groups[1].Value);
//output 10
Console.WriteLine(match2.Groups[1].Value);
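Both approaches above assume the delimiters are present: Substring throws if IndexOf returns -1, and an unchecked Match yields an empty group. A minimal sketch that checks Match.Success and trims the result might look like this. The class and method names (ProductNameParser, BeforeBar, BetweenColonAndComma) are hypothetical, and returning an empty string on failure is just one possible choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ProductNameParser
{
    // Everything before the first '|', trimmed; empty string if there is no '|'.
    public static string BeforeBar(string s)
    {
        Match m = Regex.Match(s, @"^([^|]*)\|");
        return m.Success ? m.Groups[1].Value.Trim() : string.Empty;
    }

    // Text between the first ':' that is followed by a ',' and that ','.
    public static string BetweenColonAndComma(string s)
    {
        Match m = Regex.Match(s, @":([^,]*),");
        return m.Success ? m.Groups[1].Value.Trim() : string.Empty;
    }

    public static void Main()
    {
        string str = "Example Text | a:10,Colour:Green";
        Console.WriteLine(BeforeBar(str));
        Console.WriteLine(BetweenColonAndComma(str));
    }
}
```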

How can I cut out the below pattern from a string using Regex?

I have a string which will have the word "TAG" followed by an integer, an underscore and another word.
Eg: "TAG123_Sample"
I need to cut the "TAGXXX_" pattern and get only the word Sample. That is, I have to remove the word "TAG", the integer that follows it, and the underscore.
I wrote the following code but it doesn't work. What have I done wrong? How can I do this? Please advise.
static void Main(string[] args)
{
    String sentence = "TAG123_Sample";
    String pattern = @"TAG[^\d]_";
    String replacement = "";
    Regex r = new Regex(pattern);
    String res = r.Replace(sentence, replacement);
    Console.WriteLine(res);
    Console.ReadLine();
}
Your character class [^\d] is negated, so it matches a single character that is NOT a digit. Modify the regex as follows:
String s = "TAG123_Sample";
String r = Regex.Replace(s, @"TAG\d+_", "");
Console.WriteLine(r); //=> "Sample"
Explanation:
TAG match 'TAG'
\d+ digits (0-9) (1 or more times)
_ '_'
You can use String.Split for this:
string[] s = "TAG123_Sample".Split('_');
Console.WriteLine(s[1]);
https://msdn.microsoft.com/en-us/library/b873y76a.aspx
This should work for this case:
string resultString = Regex.Replace(sentence,
@"^ # Match start of string
[^_]* # Match 0 or more characters except underscore
_ # Match the underscore", "", RegexOptions.IgnorePatternWhitespace);
No regex is necessary if your string contains 1 underscore and you need to get a substring after it.
Here is a Substring+IndexOf-based approach:
var res = sentence.Substring(sentence.IndexOf('_') + 1); // => Sample

Regex to replace double nested quotes in C#

I am trying to remove doubly nested quotes from a string in C# using Regex, but have not managed it so far. Below is the sample text and the code I tried:
string html = "<img src=\"imagename=\"g1\"\" alt = \"\">";
string output = string.Empty;
Regex reg = new Regex(@"([^\^,\r\n])""""+(?=[^$,\r\n])", RegexOptions.Multiline);
output = reg.Replace(html, @"$1");
The above gives the following output:
"<img src="imagename="g1 alt = >"
The output I am actually looking for is:
"<img src="imagename=g1" alt = "">"
Please suggest how to correct the above code.
Pattern : \s*"\s*([^ "]+)"\s*(?=[">])|(?<=")("")(?=")
Replacement : $1
Tested at regexstorm.
String literal for use in programs:
@"\s*""\s*([^ ""]+)""\s*(?=["">])|(?<="")("""")(?="")"
To keep it simple and more precise, you can target the src attribute value directly:
Pattern : (\bsrc="[^ =]+=)"([^ "]+")"
Replacement : $1$2
Tested at regexstorm.
String literal for use in programs:
@"(\bsrc=""[^ =]+=)""([^ ""]+"")"""
Note: I assume attribute values don't contain any spaces.
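As a quick check of the src-specific pattern, the following sketch applies it to the sample string from the question. The class name NestedQuoteFixer and the method name Fix are made up for illustration, and the same no-spaces assumption applies.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NestedQuoteFixer
{
    // Drops the quote that opens the nested value and the extra quote that
    // follows it. Assumes the src value contains no spaces.
    public static string Fix(string html)
    {
        return Regex.Replace(html, @"(\bsrc=""[^ =]+=)""([^ ""]+"")""", "$1$2");
    }

    public static void Main()
    {
        string html = "<img src=\"imagename=\"g1\"\" alt = \"\">";
        Console.WriteLine(Fix(html));
    }
}
```

The empty alt attribute is left alone because the pattern only matches after a word-boundary src=.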

Regex to find next word (which contains special character) after given word

I am having trouble writing a regex to get the desired output from a string.
I have a string like string simpleInput = @"Website address www.yahoo[mail].com AND Following is the";
I want to specify the word "address" and get the next word after it as the result, i.e. "www.yahoo[mail].com"
I have written following piece of code.
string pattern = @"address (?<after>\w+)";
MatchCollection matches = Regex.Matches(simpleInput, pattern, RegexOptions.Multiline | RegexOptions.IgnoreCase);
string nextWord = string.Empty;
foreach (Match match in matches)
{
    nextWord = match.Groups["after"].ToString();
}
Console.WriteLine("Word is: " + nextWord);
This gives me output as:
Word is: www
whereas I expect the output to be www.yahoo[mail].com.
Can anyone please help?
I tried \D+, but that gives me the entire rest of the string, so additional text like "AND Following is the" also ends up in the result,
whereas I just want the single word "www.yahoo[mail].com".
\w+ doesn't match . or some of the other characters in the string you want to match. Try using \S+ instead, which matches non-whitespace characters:
string pattern = @"address (\S+)";
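Put together with the named group from the question, the \S+ suggestion could be wrapped as follows. This is a sketch only: the class NextWordFinder and the method NextWordAfter are hypothetical names, and escaping the keyword with Regex.Escape is an addition not present in the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NextWordFinder
{
    // Returns the run of non-whitespace characters that follows the given word,
    // or an empty string when the word is not found.
    public static string NextWordAfter(string input, string word)
    {
        string pattern = @"\b" + Regex.Escape(word) + @"\s+(?<after>\S+)";
        Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["after"].Value : string.Empty;
    }

    public static void Main()
    {
        string simpleInput = @"Website address www.yahoo[mail].com AND Following is the";
        Console.WriteLine("Word is: " + NextWordAfter(simpleInput, "address"));
    }
}
```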

C# How to replace a shorter string than the matched string?

How can I replace only a part of a matched regex string? I need to find some strings that are inside brackets like < >. In this example I need to match 23 characters and replace only 3 of them:
string input = "<tag abc=\"hello world\"> abc=\"whatever\"</tag>";
string output = Regex.Replace(input, ???, "def");
// wanted output: <tag def="hello world"> abc="whatever"</tag>
So I either need to find abc in <tag abc="hello world"> or find <tag abc="hello world"> and replace just abc. Do regular expressions or C# allow that? And even if I solve the problem differently, is it possible to match a big string but replace only a small part of it?
I'd have to look up the .NET regex dialect, but in general you want to capture the parts you don't want to replace and refer to them in your replacement string.
string output = Regex.Replace(input, "(<tag )abc(=\"hello world\">)", "$1def$2");
Another option would be to use lookaround to match "abc" where it follows "<tag " and precedes "="hello world">"
string output = Regex.Replace(input, "(?<=<tag )abc(?==\"hello world\")", "def");
Instead of Regex.Replace, use Regex.Match; then you can use the properties on the Match object to figure out where the match occurred, and the regular string functions (String.Substring) can be used to replace the bit you want replaced.
Working sample with named groups:
string input = @"<tag abc=""hello world""> abc=whatever</tag>";
Regex regex = new Regex(@"<(?<Tag>\w+)\s+(?<Attr>\w+)=.*?>.*?</\k<Tag>>");
string output = regex.Replace(input, match =>
{
    var attr = match.Groups["Attr"];
    var value = match.Value;
    // Group.Index is relative to the input, so make it relative to the match
    var start = attr.Index - match.Index;
    var left = value.Substring(0, start);
    var right = value.Substring(start + attr.Length);
    return left + attr.Value.Replace("abc", "def") + right;
});
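For comparison, the capture-group and lookaround suggestions from the earlier answer can be run side by side on the question's input. This is a sketch: the class AttributeRenamer and its method names are invented for illustration, while the two patterns are the ones quoted in that answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AttributeRenamer
{
    // Capture what surrounds "abc" and put it back via $1 and $2.
    public static string WithGroups(string input)
    {
        return Regex.Replace(input, "(<tag )abc(=\"hello world\">)", "$1def$2");
    }

    // Match only "abc"; the lookbehind and lookahead assert the context
    // without consuming it, so nothing else is replaced.
    public static string WithLookaround(string input)
    {
        return Regex.Replace(input, "(?<=<tag )abc(?==\"hello world\")", "def");
    }

    public static void Main()
    {
        string input = "<tag abc=\"hello world\"> abc=\"whatever\"</tag>";
        Console.WriteLine(WithGroups(input));
        Console.WriteLine(WithLookaround(input));
    }
}
```

Both variants leave the second abc untouched because it is not preceded by "<tag ".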
