Regular Expression - One Item Has Multiple Words [closed] - c#

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I have multiple lines in a text file.
The text file looks similar to this:
Column 1 Column 2 Column 3
12345 stack overflow 12345678
I need a regular expression to check this and then grab column two. My problem is column two can be one or multiple words and I need it to be one item in the string when I grab it or grab other columns.

Read the file line by line and match each line against the following regex:
^\d+\s+(.+?)\s+\d+$
The first capture group now holds column two. In C#, read it with match.Groups[1].Value; in Notepad++, $1 works well.
The ^ ensures that the regex starts matching from the very beginning of the line, and the $ ensures that it matches up to the very end.
Because \s+ is greedy, it consumes all the whitespace after the first number, so the captured text never starts with a space. The lazy .+? stops at the earliest point where the rest of the line is whitespace followed by digits, so the capture has no trailing spaces and does not swallow the third column. (A capture built from \w would not be safe here, because \w also matches digits.)
If your platform also hands you the carriage return and newline characters, you can modify it as:
^\s*\d+\s+(.+?)\s+\d+\s*$
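For C#, a minimal sketch of the line-by-line approach might look like the following. It assumes every data row is a number, then free text, then a number; the class name ColumnTwoExample and the helper ExtractColumnTwo are made up for illustration, and the pattern uses a lazy capture so the trailing number is left for the third column.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ColumnTwoExample
{
    // Leading number, lazily captured middle column, trailing number.
    private static readonly Regex Row = new Regex(@"^\s*\d+\s+(.+?)\s+\d+\s*$");

    // Returns the middle column, or null when the line is not a data row
    // (for example, the header line).
    public static string ExtractColumnTwo(string line)
    {
        Match m = Row.Match(line);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractColumnTwo("12345 stack overflow 12345678")); // stack overflow
        Console.WriteLine(ExtractColumnTwo("Column 1 Column 2 Column 3") == null); // True
    }
}
```

To process a file, each string returned by File.ReadLines(path) can be passed to the helper, skipping the lines for which it returns null.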

Related

RegEx to find if string has first a character and then numbers [closed]

Can anyone tell me which pattern I should use for the following string format?
The first character is always 'C' or 'c'
After that come 1-4 numeric digits
Then '_'
Finally, a series of characters (count not fixed)
e.g. C10_COM, C1122_ABC etc.
This is for Regex.IsMatch() in C#.
Try this:
^[cC][0-9]{1,4}_.*$
Where:
^ = Start of the line
[cC] = either upper or lowercase c
[0-9]{1,4} = Match a digit 1 to 4 times
_ = underscore
.* = Any number of characters
$ = end of line
Addendum: You didn't specify whether zero characters are allowed after the underscore. If not, then replace .* with .+.
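As a rough illustration with Regex.IsMatch, the pattern could be wrapped like this. The sketch assumes that at least one character must follow the underscore (hence .+); the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CodeFormatExample
{
    // 'C' or 'c', one to four digits, an underscore, then at least one character.
    private static readonly Regex CodePattern = new Regex(@"^[cC][0-9]{1,4}_.+$");

    public static bool IsValidCode(string s)
    {
        return CodePattern.IsMatch(s);
    }

    public static void Main()
    {
        Console.WriteLine(IsValidCode("C10_COM"));   // True
        Console.WriteLine(IsValidCode("c1122_ABC")); // True
        Console.WriteLine(IsValidCode("C12345_X"));  // False: five digits
        Console.WriteLine(IsValidCode("C10_"));      // False: nothing after the underscore
    }
}
```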

How to match 4 character code to actual words using regular expressions [closed]

I have codes in the following format: [Alphanumeric][Letter][Alphanumeric][Alphanumeric], e.g. A1AA.
I also have a dictionary of all the 4-character words I am trying to block, e.g. R2D2.
What I am looking for is a regular expression that finds the matches between each code and the items in my dictionary, and that goes one step further by also treating look-alike characters as equal (e.g. i and 1, s and 5) to see whether any matches happen there.
Is there anything like that out there?
Let's say your dictionary contains these codes:
R2D2
C3P0
X5ZZ
I would load the dictionary and build a regex on the fly. The final regex would be:
(?-i)(R2D2|C3P0|X5ZZ)
Then apply this regex to each of your codes:
if (Regex.IsMatch(code, finalRegex))
{
    // Blocked code found
}
else
{
    // Acceptable code found ...
}
I'll give you a head start:
var matches = Regex.Matches(@"A1AA", @"([a-zA-Z0-9][a-zA-Z][a-zA-Z0-9][a-zA-Z0-9])");
foreach (Match match in matches)
{
    Console.WriteLine(match.Groups[1].Value);
}
Each [a-zA-Z0-9] in the pattern matches a single ASCII alphanumeric character, and [a-zA-Z] matches a single letter.
You can then loop over the matches with foreach and check each one against a dictionary of words.
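The look-alike part of the question is not covered by either answer. One possible sketch is to expand every look-alike character in a dictionary word into a character class before joining the words into one regex. The look-alike groups below are assumptions chosen for illustration, and the names LookAlikeExample, ToPattern and BuildBlocker are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class LookAlikeExample
{
    // Assumed look-alike groups; adjust them to your own policy.
    private static readonly string[] LookAlikes = { "iI1l", "sS5", "oO0" };

    // Turns one blocked word into a pattern in which every look-alike
    // character is replaced by a character class of its whole group.
    public static string ToPattern(string word)
    {
        var sb = new StringBuilder();
        foreach (char c in word)
        {
            string group = LookAlikes.FirstOrDefault(g => g.IndexOf(c) >= 0);
            sb.Append(group != null ? "[" + group + "]" : Regex.Escape(c.ToString()));
        }
        return sb.ToString();
    }

    // Joins all word patterns into one anchored alternation.
    public static Regex BuildBlocker(IEnumerable<string> blocked)
    {
        string alternatives = string.Join("|", blocked.Select(ToPattern));
        return new Regex("^(?:" + alternatives + ")$");
    }

    public static void Main()
    {
        Regex blocker = BuildBlocker(new[] { "R2D2", "C3P0", "X5ZZ" });
        Console.WriteLine(blocker.IsMatch("C3PO")); // True: O stands in for 0
        Console.WriteLine(blocker.IsMatch("XSZZ")); // True: S stands in for 5
        Console.WriteLine(blocker.IsMatch("A1AA")); // False
    }
}
```

Characters outside the look-alike groups are still compared case-sensitively in this sketch; passing RegexOptions.IgnoreCase to the Regex constructor would relax that.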

Regex to remove consecutive special characters greater than specified count [closed]

I would like to remove all appearances of the characters / and \ in a string if they appear consecutively more than twice, using Regex. This means, if a string contains abc////////////////////////def, I would like to get all / removed. However, it should not remove the // in http://.
Could someone please suggest?
You can use /{3,}, which will match 3 or more occurrences of the / character.
var result = Regex.Replace("abc///def", "/{3,}", "");
Update: to reply to your comment, the * character is a metacharacter in regex, which holds a special meaning, so you need to escape it. Try this: \*{3,}. If you want to combine both characters, you can use a character class: [/*]{3,}. A character class is denoted by the square brackets. Inside a character class you don't need to escape metacharacters, which is why I simply list * inside without escaping it as I did earlier.
Use negative look-behind assertion:
@"(?<!https?:)/{2,}|\\{2,}"
For example:
Regex pattern = new Regex(@"(?<!https?:)/{2,}|\\{2,}");
Console.WriteLine(pattern.Replace(@"http://example.com", ""));
Console.WriteLine(pattern.Replace(@"abc//////////def", ""));
Console.WriteLine(pattern.Replace(@"abc\\\\\\\\\\def", ""));
prints
http://example.com
abcdef
abcdef
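If the threshold really is "more than twice", a sketch along the following lines may be enough, since the two slashes in http:// never reach three in a row. It treats runs of / and runs of \ separately; the class and method names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlashRunExample
{
    // Three or more consecutive '/' characters, or three or more consecutive '\' characters.
    private static readonly Regex LongRun = new Regex(@"/{3,}|\\{3,}");

    public static string StripLongRuns(string s)
    {
        return LongRun.Replace(s, "");
    }

    public static void Main()
    {
        Console.WriteLine(StripLongRuns("abc////////def"));     // abcdef
        Console.WriteLine(StripLongRuns(@"abc\\\def"));         // abcdef
        Console.WriteLine(StripLongRuns("http://example.com")); // http://example.com
    }
}
```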

.NET: "identity\\..*" does not match "identity.requesttoken" [closed]

I'm trying to match a string with a .NET Regex. I want the expression to match "identity." followed by anything (I can still limit the scope of * later on). Testing the pattern in any regex editor works just fine (I have one less backslash there, due to escaping).
I have set a breakpoint right on my Regex.IsMatch to check the values; they are exactly what I put in the title (note that this is from the VS2010 debugger, so escape sequences are unparsed).
Try using a verbatim string literal (prefix the string with @) when specifying your regex. This removes the need to escape the \:
Regex.IsMatch("identity.requesttoken", @"identity\..*")
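To see how the two literal styles relate, a small sketch like this compares them directly. The class and method names are hypothetical; the point is only that the regular literal with a doubled backslash and the verbatim literal produce the same pattern string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerbatimExample
{
    // Regular literal: the backslash itself must be escaped.
    public const string RegularLiteral = "identity\\..*";

    // Verbatim literal: backslashes are taken as written.
    public const string VerbatimLiteral = @"identity\..*";

    public static bool MatchesIdentity(string input)
    {
        return Regex.IsMatch(input, VerbatimLiteral);
    }

    public static void Main()
    {
        Console.WriteLine(RegularLiteral == VerbatimLiteral);        // True
        Console.WriteLine(MatchesIdentity("identity.requesttoken")); // True
        Console.WriteLine(MatchesIdentity("identityXrequesttoken")); // False: the dot is literal
    }
}
```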

Custom regular expression [closed]

Hi I need to match this format
N - Number
NN,NN
or
NN.NN
also
N,N and N.N
and combinations
N.NN and N,NN or NN,N and NN.N
Here's your regex:
\d{1,2}[.,]\d{1,2}
See it here in action: http://regexr.com?2vman
Here's a slightly different version:
\d\d?[.,]\d\d?
See it here in action: http://regexr.com?2vmaq
If you also want to match numbers without a decimal part, use this:
\d\d?[.,]?\d{0,2}
See it here in action: http://regexr.com?2vml4
How about:
\d{1,2}(?:[.,]\d{1,2})?
explanation:
\d{1,2} : one or two digits
(?: : start non capture group
[.,] : . or ,
\d{1,2} : one or two digits
)? : end group, optional
Why match it?
Just remove the commas and use the actual number:
Regex.Replace("8,675,309.02", "(,)", string.Empty) // Outputs 8675309.02
If this is a validation scenario, decimal.TryParse will tell you whether the result is valid (int.Parse would reject the decimal part).
I would go with something like this:
Regex regex = new Regex(@"\d{1,2}[\.,]\d{1,2}");
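Note that without anchors, a pattern like this also succeeds on longer input such as 123.456, because it only needs to find the format somewhere in the string. If the goal is to validate a whole string, a sketch with ^ and $ added could look like this (names are hypothetical):

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberFormatExample
{
    // One or two digits, a dot or comma, then one or two digits, and nothing else.
    private static readonly Regex Number = new Regex(@"^\d{1,2}[.,]\d{1,2}$");

    public static bool IsValid(string s)
    {
        return Number.IsMatch(s);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("12,34"));   // True
        Console.WriteLine(IsValid("1.2"));     // True
        Console.WriteLine(IsValid("123.456")); // False: too many digits
        Console.WriteLine(IsValid("12"));      // False: no separator
    }
}
```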
