C# stripping out the string needed

OK, so I have these strings:
location = "C:\\Users\\John\\Desktop\\399";
location = "C:\\Users\\John\\Desktop\\399\\DISK1";
location = "C:\\Users\\John\\Desktop\\399\\DISK2";
location = "\\somewhere\\on\\Network\\399\\DISK2";
How do I strip out the 399 in all of these cases? FYI, the number might be two digits, like 42, so I can't just grab the last three characters in the first case. I was thinking of a regex that would drop the DISKn part if it exists and grab the number after the last \ before it, but I don't know how to do that in C#. Any ideas?

Here is how to do this with Regex against your example input:
Regex rgx = new Regex(@"\\\d+");
string result = rgx.Replace(input, string.Empty);
The regular expression will match on a \ followed by at least one digit and replace them. You need to be careful though, as it will not preserve the string if you have this pattern elsewhere in the string.
If your inputs are exactly as you have described, using string.Split can be much more efficient (assuming the portion you need to remove is always the last or second-to-last segment).
Update:
The regex I provided will work only if a single part of the path starts with digits. It breaks if several parts do, or if a part begins with digits but does not end with them.
The information you have provided is not enough to build a regular expression that will do as you wish. How do you distinguish between numeric path segments that need to be stripped out and those that do not, for example?

// Requires: using System.Linq;
var parts = location.Split('\\');
// If the last segment is DISKn, the number is the segment before it.
var number = parts.Last().StartsWith("DISK") ? parts[parts.Length - 2] : parts[parts.Length - 1];
To strip the number out:
var index = parts.Last().StartsWith("DISK") ? parts.Length - 2 : parts.Length - 1;
var newParts = parts.Take(index).Concat(parts.Skip(index + 1)).ToArray();
var newLocation = string.Join("\\", newParts);

Take a look at the Split() method for breaking the string up around separators. Then you can use techniques such as checking for the last part starting with DISK, or checking for a part that is purely integer (possibly risky, in case higher subdirectories are pure numbers - unless you work from the back!).
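As a rough illustration of the "work from the back" idea, here is a minimal sketch. The helper name SplitOffNumber and its tuple return type are made up for this example, and it assumes the number to remove is the last path segment consisting only of digits.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not from the answers above): finds the last path
// segment made up only of digits and returns it together with the path
// that remains once that segment is removed.
static (string Number, string Rest) SplitOffNumber(string location)
{
    var parts = location.Split('\\').ToList();
    // Scan from the back so numeric directories higher up are left alone.
    int index = parts.FindLastIndex(p => p.Length > 0 && p.All(c => char.IsDigit(c)));
    if (index < 0) return ("", location); // assumption: no numeric segment means no change
    string found = parts[index];
    parts.RemoveAt(index);
    return (found, string.Join("\\", parts));
}

var result = SplitOffNumber(@"C:\Users\John\Desktop\399\DISK2");
Console.WriteLine(result.Number); // 399
Console.WriteLine(result.Rest);   // C:\Users\John\Desktop\DISK2
```

Because the search starts at the end, a path such as C:\2019\Desktop\42 would lose only the 42.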

int i = int.Parse(location.Split(new string[] { "\\" }, StringSplitOptions.RemoveEmptyEntries)[4]);

Related

Trying to do a simple regular expression to match 7 digits or 9 digits. No solutions I've found actually work [duplicate]

testing= testing.match(/(\d{5})/g);
I'm reading a full HTML page into a variable. From that variable, I want to grab every number that is exactly 5 digits long. It doesn't matter what comes before or after the digits; I just want every 5-digit number to be picked up.
However, when I apply it, it pulls out not only numbers with exactly 5 digits but also numbers with more than 5 digits.
I tried putting ^ in front and $ behind, but that makes the result come out as null.
I am reading a text file and want to use the regex above to pull out numbers with exactly 5 digits, ignoring letters.
Try this...
var str = 'f 34 545 323 12345 54321 123456',
matches = str.match(/\b\d{5}\b/g);
console.log(matches); // ["12345", "54321"]
The word boundary \b is your friend here.
Update
My regex will get a number like this 12345, but not like a12345. The other answers provide great regexes if you require the latter.
My test string for the following:
testing='12345,abc,123,54321,ab15234,123456,52341';
If I understand your question, you'd want ["12345", "54321", "15234", "52341"].
If JS engines supported regexp lookbehinds, you could do:
testing.match(/(?<!\d)\d{5}(?!\d)/g)
Since they didn't at the time of writing (lookbehind only arrived with ES2018), you could:
testing.match(/(?:^|\D)(\d{5})(?!\d)/g)
and remove the leading non-digit from appropriate results, or:
pentadigit=/(?:^|\D)(\d{5})(?!\d)/g;
result = [];
while (( match = pentadigit.exec(testing) )) {
result.push(match[1]);
}
Note that for IE, it seems you need to use a RegExp stored in a variable rather than a literal regexp in the while loop, otherwise you'll get an infinite loop.
This should work:
<script type="text/javascript">
var testing='this is d23553 test 32533\n31203 not 333';
var r = new RegExp(/(?:^|[^\d])(\d{5})(?:$|[^\d])/mg);
var matches = [];
while ((match = r.exec(testing))) matches.push(match[1]);
alert('Found: '+matches.join(', '));
</script>
What about this? \D(\d{5})\D
It works on:
f 23 23453 234 2344 2534 hallo33333 "50000"
giving 23453, 33333, 50000
No need to care of whether before/after this digit having other type of words
To match a 5-digit number anywhere in the string, whether or not it is separated by spaces, use the regular expression (?<!\d)\d{5}(?!\d).
Sample JavaScript code:
var regexp = new RegExp(/(?<!\d)\d{5}(?!\d)/g);
var matches = yourstring.match(regexp);
if (matches && matches.length > 0) {
for (var i = 0, len = matches.length; i < len; i++) {
// ... do something with matches[i] ...
}
}
Here are some quick results.
abc12345xyz (✓)
12345abcd (✓)
abcd12345 (✓)
0000aaaa2 (✖)
a1234a5 (✖)
12345 (✓)
<space>12345<space>12345 (✓✓)

C# How do i get the Right(String) based on one character?

I have this:
MyString = @"C:\\Somepath\otherpath\etc\string";
and I need the trailing part, string (which can be any length).
How can I do something like:
NewString = MyString.Right(string, when last "\" is found) ?
For a path specifically, you can use Path.GetFileName(String).
var MyString = @"C:\Somepath\otherpath\etc\string";
var NewString = Path.GetFileName(MyString);
Despite the name of the method, it also works on directory names, provided they aren't followed by a trailing backslash. So C:\directory becomes directory, but C:\directory\ becomes the empty string. (This might be what you want, based on how you phrased the question.)
Depending on your environment, you might be able to use the new indices and range features that came with C# 8.0
var result = MyString.Split('\\')[^1];
Indices and Ranges
This will return everything after the last instance of the character '\'.
var result = MyString.Substring(MyString.LastIndexOf('\\') + 1);
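The edge cases of the Substring/LastIndexOf approach are worth spelling out. The sketch below wraps it in a helper named AfterLast, a name invented for this example.

```csharp
using System;

// Hypothetical helper: returns the text after the last occurrence of separator.
static string AfterLast(string text, char separator)
{
    // LastIndexOf returns -1 when the separator is absent, so +1 gives 0
    // and the whole string comes back unchanged.
    return text.Substring(text.LastIndexOf(separator) + 1);
}

Console.WriteLine(AfterLast(@"C:\Somepath\otherpath\etc\string", '\\')); // string
Console.WriteLine(AfterLast("no-separator", '\\'));                      // no-separator
Console.WriteLine(AfterLast(@"C:\directory\", '\\').Length);             // 0
```

A trailing backslash yields the empty string, the same as the behaviour described for Path.GetFileName above.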
If you don't mind using a bit of LINQ:
var result = MyString?.Split('\\').LastOrDefault();

evaluate filename with regex

I have a bit of code that evaluates a filename using a regex. This works fine, but I want to add a second pattern: out_\d\d\d\d\d\d_ followed by up to 150 characters holding an address.
Obviously I don't want to write \d 150 times. Can anyone tell me the best way to do this?
Thanks.
REGEX_PATTERN = @"out_\d\d\d\d\d\d";
if (!Regex.Match(Path.GetFileNameWithoutExtension(e.Name), REGEX_PATTERN).Success) {
return;
}
You want:
REGEX_PATTERN = @"^out_\d{6}(?:_.{1,150})?$";
This breaks down as
`^` - start of string
`out_\d{6}` - `out_` followed by 6 digits
`(?:_.{1,150})?` - an optional string of _ followed by 1-150 characters
`$` - end of string
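To see how the breakdown above plays out, here is a small sketch that runs the pattern against a few file names. The sample names are invented for illustration and do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

const string Pattern = @"^out_\d{6}(?:_.{1,150})?$";

// Made-up sample names (already without extension), chosen to hit each part of the pattern.
string[] samples =
{
    "out_123456",                   // six digits, no address part
    "out_123456_12 Example Street", // six digits plus an address part
    "out_12345",                    // only five digits
    "out_1234567",                  // seven digits
    "out_123456_",                  // underscore with nothing after it
};

foreach (var name in samples)
{
    Console.WriteLine($"{name}: {Regex.IsMatch(name, Pattern)}");
}
```

Only the first two names match. The anchors are what reject the seven-digit name, since without ^ and $ the pattern would happily match its first six digits.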
Try this out:
REGEX_PATTERN = @"out_\d{1,150}";
OR
// For strict boundary match
REGEX_PATTERN = @"^out_\d{1,150}$";

Is there a way to get a string up until a year value?

Basically I have some filenames where there is a year in the middle. I am only interested in getting any letter or number up until the year value, but only letters and numbers, not commas, dots, underscores, etc. Is it possible? Maybe with Regex?
For instance:
"A-Good-Life-2010-For-Archive"
"Any.Chararacter_Can+Come.Before!2011-RedundantInfo"
"WhatyouseeIsWhatUget.2012-Not"
"400-Gestures.In1.2000-Communication"
where I want:
"AGoodLife"
"AnyChararacterCanComeBefore"
"WhatyouseeIsWhatUget"
"400GesturesIn1"
By numbers I mean any number that doesn't look like a year, i.e. 1 digit, 2 digits, 3 digits, 5 digits, and so on. I only want to recognize 4 digit numbers as years.
You'll have to do this in two parts -- first to remove the symbols you don't want, and second to grab everything up to the year (or vice versa).
To grab everything up to the year, you can use:
Match match = Regex.Match(movieTitle, @"(.*)(?<!\d)(?:19|20)[0-9]{2}(?!\d)");
// if match.Success, result is in match.Groups[1].Value
I've made the year regex so it only matches things in the 1900s or 2000s, to make sure you don't match four-digit numbers as year if they're not a year (e.g. "Ali-Baba-And-the-1234-Thieves.2011").
However, if your movie title involves a year, then this won't really work ("2001:-Space-Odyssey(1968)").
To then remove all the non-alphanumeric characters, you can replace "[^a-zA-Z0-9]" with "". (I've allowed digits because a movie might have legitimate numbers in the title.)
UPDATED from comments below:
If you search from the end to find the year, you might do better, i.e. treat the last year candidate as the year. Hence, I've changed a .*? to .* in the regex so that the title is as greedy as possible and only the last year candidate is used as the year.
Added a (?!\d) to the end of the year regex and a (?<!\d) to the start so that it doesn't match "2001" in "My-title-120012-fdsa". (I didn't use the boundary \b because the title might be "A-Good-Life2010", which has no boundary before the year.)
Changed the string to a verbatim string (@"...") so I don't need to worry about escaping backslashes in the regex because of C# interpreting backslashes.
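Putting the two steps of this answer together, a minimal sketch might look like the following. The helper name TitleBeforeYear is invented here, and returning the cleaned-up whole name when no year is found is an assumption the question does not cover.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper combining both steps: cut at the last year candidate,
// then drop everything that is not a letter or a digit.
static string TitleBeforeYear(string fileName)
{
    Match match = Regex.Match(fileName, @"(.*)(?<!\d)(?:19|20)[0-9]{2}(?!\d)");
    // Assumption: with no year present, the whole name is kept.
    string head = match.Success ? match.Groups[1].Value : fileName;
    return Regex.Replace(head, "[^a-zA-Z0-9]", "");
}

Console.WriteLine(TitleBeforeYear("A-Good-Life-2010-For-Archive"));        // AGoodLife
Console.WriteLine(TitleBeforeYear("400-Gestures.In1.2000-Communication")); // 400GesturesIn1
Console.WriteLine(TitleBeforeYear("2001-Space-Odyssey-1968"));             // 2001SpaceOdyssey
```

The last call shows the effect of the greedy (.*): the final year candidate wins, so a year inside the title survives as long as another one follows it.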
you can try like this
/\b\d{4}\b/
\d{4}\b matches four digits followed by a word boundary. Depending on the input data, the leading word boundary (\b) matters too, since it keeps the pattern from matching the last four digits of a longer number.
using System.Text.RegularExpressions;
string GoodParts(string input) {
Regex re = new Regex(@"^(.*\D)\d{4}(\D|$)");
var match = re.Match(input);
string result = Regex.Replace(match.Groups[1].Value, "[^0-9a-zA-Z]+", "");
return result;
}
You can use Regex.Split() to make the code even terser (and possibly faster due to the simpler regex):
var str = "400-Gestures.In1.2000-Communication";
var re = new Regex(@"(^|\D)\d{4}(\D|$)");
var start = re.Split(str)[0];
// remove nonalphanumerics
var result = new string(start.Where(c=>Char.IsLetterOrDigit(c)).ToArray());
I suppose you want a fancy regular expression?
Why not a simple for loop?
digitCount = 0;
for i = 0 to strlen(filename) - 1
{
if isdigit(filename[i])
{
digitCount++;
if digitCount == 4
{
// the year starts at index i - 3, so keep the first i - 3 characters
thePartOfTheFileNameThatYouWant = substr(filename, 0, i - 3)
break
}
}
else digitCount = 0;
}
// Sorry, I don't know C-sharp
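For reference, the loop idea carries over to C# fairly directly. The sketch below is one possible rendering rather than the answerer's code. The name BeforeFirstYear is made up, and unlike the pseudocode it only treats a run of exactly four digits as a year and also strips the non-alphanumeric characters. When no year is found it keeps the whole name, which is an assumption.

```csharp
using System;
using System.Linq;

// Hypothetical helper: returns the letters and digits that precede the first
// run of exactly four digits, or the cleaned-up whole name if there is none.
static string BeforeFirstYear(string fileName)
{
    int cut = fileName.Length; // assumption: no year found means keep everything
    int runStart = -1;         // start index of the current digit run, -1 if none
    for (int i = 0; i <= fileName.Length; i++)
    {
        bool isDigit = i < fileName.Length && char.IsDigit(fileName[i]);
        if (isDigit)
        {
            if (runStart < 0) runStart = i;
            continue;
        }
        // A digit run just ended (or the string did); a run of exactly 4 is a year.
        if (runStart >= 0 && i - runStart == 4) { cut = runStart; break; }
        runStart = -1;
    }
    return new string(fileName.Take(cut).Where(c => char.IsLetterOrDigit(c)).ToArray());
}

Console.WriteLine(BeforeFirstYear("400-Gestures.In1.2000-Communication")); // 400GesturesIn1
```

Note that this picks the first four-digit run, whereas the greedy regex in the earlier answer picks the last year candidate.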

RegEx Problem using .NET

I have a little problem with a regex pattern in C#. Here's the rule:
input: 1234567
expected output: 123/1234567
Rules:
Get the first three digits of the input. //123
Add /
Append the original input. //123/1234567
The expected output should look like this: 123/1234567
Here's my regex pattern:
Regex rx = new Regex(@"((\w{1,3})(\w{1,7}))");
but the output is incorrect: 123/4567
I think this is what you're looking for:
string s = @"1234567";
s = Regex.Replace(s, @"(\w{3})(\w+)", @"$1/$1$2");
Instead of trying to match part of the string, then match the whole string, just match the whole thing in two capture groups and reuse the first one.
It's not clear why you need a RegEx for this. Why not just do:
string x = "1234567";
string result = x.Substring(0, 3) + "/" + x;
Another option is:
string s = Regex.Replace("1234567", @"^\w{3}", "$&/$&");
That would capture 123 and replace it with 123/123, leaving the tail 4567 in place, for a result of 123/1234567.
^\w{3} - Matches the first 3 characters.
$& - replace with the whole match.
You could also do @"^(\w{3})", "$1/$1" if you are more comfortable with it; it is better known.
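As a quick side-by-side, the sketch below runs the replacement variants suggested so far on the sample input. The variable names are only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "1234567";

// Two capture groups, reusing the first one in the replacement.
string viaGroups = Regex.Replace(input, @"(\w{3})(\w+)", "$1/$1$2");
// $& stands for the whole match, here the first three characters.
string viaWholeMatch = Regex.Replace(input, @"^\w{3}", "$&/$&");
// Same idea with an explicit group instead of $&.
string viaFirstGroup = Regex.Replace(input, @"^(\w{3})", "$1/$1");
// No regex at all.
string viaSubstring = input.Substring(0, 3) + "/" + input;

Console.WriteLine(viaGroups);     // 123/1234567
Console.WriteLine(viaWholeMatch); // 123/1234567
Console.WriteLine(viaFirstGroup); // 123/1234567
Console.WriteLine(viaSubstring);  // 123/1234567
```

All four produce the same string for this input. The Substring version throws on inputs shorter than three characters, while the regex versions simply leave such inputs unchanged.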
Use positive look-ahead assertions, as they don't 'consume' characters in the current input stream, while still capturing input into groups:
Regex rx = new Regex(@"(?=(?'group1'\w{1,3}))(?=(?'group2'\w{1,7}))");
group1 should be 123, group2 should be 1234567.
