My file name is 534-W1A-R1 and I want to split it so that it prints like
Code=534 Phase=1 Zone=A
in my AutoCAD file.
The below split code should work:
string str = @"534-W1A-R1";
var split = str.Split('-');
string code = split.First();
string phase = new string(split.ElementAt(1).Skip(1).Take(1).ToArray());
string zone = new string(split.ElementAt(1).Skip(2).Take(1).ToArray());
string result = String.Format("Code={0} Phase={1} Zone={2}", code, phase, zone);
Console.WriteLine(result);
Output:
Code=534 Phase=1 Zone=A
Use the Substring() method.
string input = "534-W1A-R1";
string sub = input.Substring(0, 3);
string sub2 = input.Substring(5, 1);
string sub3 = input.Substring(6, 1);
Console.WriteLine("Code={0} Phase={1} Zone={2}", sub, sub2, sub3);
Output:
Code=534 Phase=1 Zone=A
There are different ways to do it. If you are sure about the format of the text, you can just use this:
var str= "534-W1A-R1";
var parts=str.Split('-');
var code= parts[0];
var secondPart= parts[1];
var phase=secondPart.Substring(1,secondPart.Length-2);
var zone=secondPart[secondPart.Length-1];
You can also use Regex if it is more complicated.
Using Regex
var pattern = @"^(\d+)-[A-Z](\d+)([A-Z])-";
/* pattern description:
^(\d+) group 1: one or more digits at the beginning
- one hyphen (literal)
[A-Z] one alphabetic character
(\d+) group 2: one or more digits
([A-Z]) group 3: one alphabetic character
- one hyphen (literal)
*/
var input = "534-W1A-R1";
var groups = Regex.Match(input, pattern, RegexOptions.IgnoreCase).Groups;
var code = groups[1].Value;
var phase = groups[2].Value;
var zone = groups[3].Value;
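The snippet above reads the groups without checking whether the pattern matched at all. A minimal sketch of the same idea with named groups and a `Match.Success` check follows; the helper name `TryDescribe` and the group names are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: builds the "Code=... Phase=... Zone=..." text,
// or returns false when the name does not follow the expected layout.
static bool TryDescribe(string fileName, out string description)
{
    description = "";
    var match = Regex.Match(
        fileName,
        @"^(?<code>\d+)-[A-Z](?<phase>\d+)(?<zone>[A-Z])-",
        RegexOptions.IgnoreCase);
    if (!match.Success)
    {
        return false;
    }

    var code = match.Groups["code"].Value;
    var phase = match.Groups["phase"].Value;
    var zone = match.Groups["zone"].Value;
    description = string.Format("Code={0} Phase={1} Zone={2}", code, phase, zone);
    return true;
}

if (TryDescribe("534-W1A-R1", out var text))
{
    Console.WriteLine(text);
}
```

Returning a bool instead of throwing keeps the caller in control when a file name does not follow the convention.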
I am trying to get some individual values from a string based on a format. This format can change, so ideally I want to specify it using another string.
For example, let's say my input is 1. Line One - Part Two (Optional Third Part). I would want to specify the format to match as %number%. %first% - %second% (%third%), and then I want these values as variables.
The only way I could think of doing this was using RegEx groups, and I have very nearly got the RegEx working.
var input = "1. Line One - Part Two (Optional Third Part)";
var formatString = "%number%. %first% - %second% (%third%)";
var expression = new Regex("(?<Number>[^.]+). (?<First>[^-]+) - (?<Second>[^\\(]+) ((?<Third>[^)]+))");
var match = expression.Match(input);
Console.WriteLine(match.Groups["Number"].ToString().Trim());
Console.WriteLine(match.Groups["First"].ToString().Trim());
Console.WriteLine(match.Groups["Second"].ToString().Trim());
Console.WriteLine(match.Groups["Third"].ToString().Trim());
This results in the following output, so all good apart from that opening bracket.
1 Line One Part Two (Optional Third Part
I'm now a bit lost as to how I could translate my format string into a regular expression. There are no rules on this format, but it would need to be fairly easy for a user.
Any advice is greatly appreciated, or perhaps there is another way not involving Regex?
Your pattern includes a couple of special characters (such as . and the parentheses) without escaping them, so Regex does not match them literally.
Here's your code, corrected:
using System.Text.RegularExpressions;
var input = "1. Line One - Part Two (Optional Third Part)";
var pattern = string.Format(
"(?<Number>{0})\\. (?<First>{1}) - (?<Second>{2}) \\((?<Third>{3})\\)",
"[^\\.]+",
"[^\\-]+",
"[^\\(]+",
"[^\\)]+");
var match = Regex.Match(input, pattern);
Console.WriteLine(match.Groups["Number"]);
Console.WriteLine(match.Groups["First"]);
Console.WriteLine(match.Groups["Second"]);
Console.WriteLine(match.Groups["Third"]);
Sample output:
1
Line One
Part Two
Optional Third Part
If you want to keep your syntax, you can leverage the Regex.Escape method. I have also written some code that parses all the parameters enclosed in %:
using System.Text.RegularExpressions;
var input = "1. Line One - Part Two (Optional Third Part)";
var formatString = "%number%. %first% - %second% (%third%)";
formatString = Regex.Escape(formatString);
var parameters = new List<string>();
formatString = Regex.Replace(formatString, "%([^%]+)%", match =>
{
var paramName = match.Groups[1].Value;
var groupPattern = "(?<" + paramName + ">{" + parameters.Count + "})";
parameters.Add(paramName);
return groupPattern;
});
var pattern = string.Format(
formatString,
"[^\\.]+",
"[^\\-]+",
"[^\\(]+",
"[^\\)]+");
var match = Regex.Match(input, pattern);
foreach (var paramName in parameters)
{
Console.WriteLine(match.Groups[paramName]);
}
Further notes
You need to adjust the part where you specify the pattern for each group; currently it is not generic and does not account for how many parameters there are.
So finally, taking it all into account and cleaning up the code a little, you can use a solution like this:
public static class FormatBasedCustomRegex
{
public static string GetPattern(this string formatString,
string[] subpatterns,
out string[] parameters)
{
formatString = Regex.Escape(formatString);
formatString = formatString.ReplaceParams(out var @params);
if(@params.Length != subpatterns.Length)
{
throw new InvalidOperationException();
}
parameters = @params;
return string.Format(
formatString,
subpatterns);
}
private static string ReplaceParams(
this string formatString,
out string[] parameters)
{
var @params = new List<string>();
var outputPattern = Regex.Replace(formatString, "%([^%]+)%", match =>
{
var paramName = match.Groups[1].Value;
var groupPattern = "(?<" + paramName + ">{" + @params.Count + "})";
@params.Add(paramName);
return groupPattern;
});
parameters = @params.ToArray();
return outputPattern;
}
}
and main method would look like:
var input = "1. Line One - Part Two (Optional Third Part)";
var pattern = "%number%. %first% - %second% (%third%)".GetPattern(
new[]
{
"[^\\.]+",
"[^\\-]+",
"[^\\(]+",
"[^\\)]+",
},
out var parameters);
var match = Regex.Match(input, pattern);
foreach (var paramName in parameters)
{
Console.WriteLine(match.Groups[paramName]);
}
But it's up to you how you define the particular methods and which signatures suit your code best :)
You may use this regex:
^(?<Number>[^.]+)\. (?<First>[^-]+) - (?<Second>[^(]+)(?: \((?<Third>[^)]+)\))?$
RegEx Details:
^: Start
(?<Number>[^.]+): Match and capture 1+ of any char that is not .
\. : Match ". "
(?<First>[^-]+): Match and capture 1+ of any char that is not -
 - : Match " - "
(?<Second>[^(]+): Match and capture 1+ of any char that is not (
(?:: Start a non-capture group
 \(: Match space followed by (
(?<Third>[^)]+): Match and capture 1+ of any char that is not )
\): Match )
)?: End optional non-capture group
$: End
Your format contains special characters that are becoming part of the regular expression. You can use the Regex.Escape method to handle that. After that, you can just use a Regex.Replace with a delegate to transform the format into a regular expression:
var input = "1. Line One - Part Two (Optional Third Part)";
var fmt = "%number%. %first% - %second% (%third%)";
var templateRE = new Regex(@"%([a-z]+)%", RegexOptions.Compiled);
var pattern = templateRE.Replace(Regex.Escape(fmt), m => $"(?<{m.Groups[1].Value}>.+?)");
var ansRE = new Regex(pattern);
var ans = ansRE.Match(input);
Note: You may want to place ^ and $ at the beginning and end of the pattern respectively, to ensure the format must match the entire input string.
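As a rough sketch of that note, the generated pattern can be wrapped in ^ and $ before it is compiled. The helper name `BuildRegex` is hypothetical, and the placeholder syntax is assumed to be letters only between two % signs.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: turns a "%name%" style format into an anchored regex.
static Regex BuildRegex(string format)
{
    // Escape the literal parts first, then turn each %name% into a lazy named group.
    var body = Regex.Replace(
        Regex.Escape(format),
        "%([A-Za-z]+)%",
        m => "(?<" + m.Groups[1].Value + ">.+?)");

    // ^ and $ force the format to describe the entire input string.
    return new Regex("^" + body + "$");
}

var regex = BuildRegex("%number%. %first% - %second% (%third%)");
var match = regex.Match("1. Line One - Part Two (Optional Third Part)");
if (match.Success)
{
    foreach (var name in new[] { "number", "first", "second", "third" })
    {
        Console.WriteLine(name + " = " + match.Groups[name].Value);
    }
}
```

Without the anchors, the lazy groups would happily match a prefix of a longer input; with them, an input that lacks the parenthesised part is rejected outright.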
I have this string (it's from EDI data):
ISA*ESA?ISA*ESA?
The * indicates it could be any character and can be of any length.
? indicates any single character.
Only the ISA and ESA are guaranteed not to change.
I need this split into two strings which could look like this: "ISA~this is date~ESA|" and
"ISA~this is more data~ESA|"
How do I do this in C#?
I can't use string.Split, because it doesn't really have a delimiter.
You can use Regex.Split to accomplish this:
string splitStr = "|", inputStr = "ISA~this is date~ESA|ISA~this is more data~ESA|";
var regex = new Regex($@"(?<=ESA){Regex.Escape(splitStr)}(?=ISA)", RegexOptions.Compiled);
var items = regex.Split(inputStr);
foreach (var item in items) {
Console.WriteLine(item);
}
Output:
ISA~this is date~ESA
ISA~this is more data~ESA|
Note that if the text between ISA and ESA can itself contain the pattern we are splitting on, you will have to find some smart way around it.
To explain the Regex a bit:
(?<=ESA) Look-behind assertion. This portion is not captured but still matched
(?=ISA) Look-ahead assertion. This portion is not captured but still matched
Using these look-around assertions you can find the correct | character for splitting
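The question says the character after ESA is not known in advance. One possible variation, sketched here under that assumption, moves the unknown character into the look-behind so that the split happens on an empty position and each piece keeps its trailing character. `SplitInterchanges` is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: splits at the empty position that has "ESA" plus one
// arbitrary character behind it and "ISA" directly ahead of it.
static string[] SplitInterchanges(string edi)
{
    // Singleline lets '.' also match a newline, in case that is the unknown character.
    return Regex.Split(edi, "(?<=ESA.)(?=ISA)", RegexOptions.Singleline);
}

foreach (var part in SplitInterchanges("ISA~this is date~ESA|ISA~this is more data~ESA|"))
{
    Console.WriteLine(part);
}
```

The same caveat applies as before: if the payload itself can contain "ESA", one character, then "ISA", this split will cut in the wrong place.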
Simply use
int x = whateverString.indexOf("?ISA"); // replace ? with the actual character here
and then take the substrings from 0 to that index and from that index to the end of the string.
Edit:
If ? is not known, you can use the regex Pattern and Matcher classes:
Matcher matcher = Pattern.compile("ISA.*?ESA").matcher(whateverString);
if (matcher.find() && matcher.find()) {
int x = matcher.start(); // start index of the second ISA...ESA segment
}
Here x gives the start index of the second match, which is where to split.
Edit: I mistakenly read this as a Java question. For C#:
string pattern = @"ISA.*?ESA"; // lazy, so each ISA...ESA segment is a separate match
Regex myRegex = new Regex(pattern, RegexOptions.IgnoreCase);
Match m = myRegex.Match(whateverString); // m is the first match
while (m.Success)
{
Console.WriteLine(m.Value);
m = m.NextMatch(); // more matches
}
RegEx will probably be the best for this.
Mask would be
ISA(?<data1>.*?)ESA.ISA(?<data2>.*?)ESA.
This will give you 2 groups with data you need
Match match = Regex.Match(input, @"ISA(?<data1>.*?)ESA.ISA(?<data2>.*?)ESA.", RegexOptions.IgnoreCase);
if (match.Success)
{
var data1 = match.Groups["data1"].Value;
var data2 = match.Groups["data2"].Value;
}
Use Regex.Matches if you need to find multiple matches, and specify different RegexOptions if needed.
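For an input with an unknown number of ISA...ESA segments, a minimal sketch with Regex.Matches might look like this. The helper name `ExtractPayloads` and the group name `data` are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: collects the text between every ISA...ESA pair.
static List<string> ExtractPayloads(string edi)
{
    var payloads = new List<string>();
    // The lazy .*? stops at the first ESA; the final '.' consumes the unknown trailing character.
    foreach (Match m in Regex.Matches(edi, "ISA(?<data>.*?)ESA.", RegexOptions.Singleline))
    {
        payloads.Add(m.Groups["data"].Value);
    }
    return payloads;
}

foreach (var payload in ExtractPayloads("ISA~this is date~ESA|ISA~this is more data~ESA|"))
{
    Console.WriteLine(payload);
}
```

Unlike the two-group mask, this does not hard-code how many segments the input contains.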
It's kinda hacky but you could do...
string x = "ISA*ESA?ISA*ESA?";
x = x.Replace("*","~"); // OR SOME OTHER DELIMITER
string[] y = x.Split('~');
Not perfect in all situations, but it could solve your problem simply.
You could split by "ISA" and "ESA" and then put the parts back together.
string input = "ISA~this is date~ESA|ISA~this is more data~ESA|";
string start = "ISA",
end = "ESA";
var splitedInput = input.Split(new[] { start, end }, StringSplitOptions.None);
var firstPart = $"{start}{splitedInput[1]}{end}{splitedInput[2]}";
var secondPart = $"{start}{splitedInput[3]}{end}{splitedInput[4]}";
// firstPart  = "ISA~this is date~ESA|"
// secondPart = "ISA~this is more data~ESA|"
Use a Regex like ISA(.+?)ESA and select the first group
string input = "ISA~mycontent+ESA";
Match match = Regex.Match(input, @"ISA(.+?)ESA", RegexOptions.IgnoreCase);
if (match.Success)
{
string key = match.Groups[1].Value;
}
Instead of "splitting" by a string, I would instead describe your question as "grouping" by a string. This can easily be done using a regular expression:
Regular expression: ^(ISA.*?(?=ESA)ESA.)(ISA.*?(?=ESA)ESA.)$
Explanation:
^ - asserts position at start of the string
( - start capturing group
ISA - match string ISA exactly
.*?(?=ESA) - match any character 0 or more times, positive lookahead on the
string ESA (basically match any character until the string ESA is found)
ESA - match string ESA exactly
. - match any character
) - end capturing group
repeat one more time...
$ - asserts position at end of the string
Example:
string input = "ISA~this is date~ESA|ISA~this is more data~ESA|";
Regex regex = new Regex(@"^(ISA.*?(?=ESA)ESA.)(ISA.*?(?=ESA)ESA.)$",
RegexOptions.Compiled);
Match match = regex.Match(input);
if (match.Success)
{
string firstValue = match.Groups[1].Value; // "ISA~this is date~ESA|"
string secondValue = match.Groups[2].Value; // "ISA~this is more data~ESA|"
}
There are two answers to the question "How to split a string by another string".
var matches = input.Split(new [] { "ISA" }, StringSplitOptions.RemoveEmptyEntries);
and
var matches = Regex.Split(input, "ISA").ToList();
However, the first removes empty entries, while the second does not.
I need to replace multiple file names in a folder. Here is one of the files:
Abc.CDE.EFG
I need to replace the first part of the string before the dot ("ABC") and replace it with: "zef".
Any ideas? I found this but it takes out the dot and not sure how to add the "zef".
var input = _FileInfo.ToString();
var output = input.Substring(input.IndexOf(".").Trim())
Since the question is tagged with regex, you can use a regular expression like so:
var input = "abc.def.efg";
var pattern = "^[^\\.]+";
var replacement = "zef";
var rgx = new Regex(pattern);
var output = rgx.Replace(input, replacement);
Source: https://msdn.microsoft.com/en-us/library/xwewhkd1(v=vs.110).aspx
You are almost there, try:
string myString = "Abc.CDE.EFG";
//This splits your string into an array with 3 items
//"Abc", "CDE" and "EFG"
var stringArray = myString.Split('.');
//Now modify the first item by changing it to "zef"
stringArray[0] = "zef";
//Then we rebuild the string by joining the array together
//delimiting each group by a period
string newString = string.Join(".", stringArray);
With this solution you can independently access any of the "blocks" just by referencing the array by index.
Try this:
var input = _FileInfo.ToString();
var output = "zef" + input.Substring(input.IndexOf("."));
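The one-liner above throws if the name contains no dot, because IndexOf then returns -1. A small sketch that guards against that case is shown below; `ReplaceFirstSegment` is a hypothetical helper name.

```csharp
using System;

// Hypothetical helper: swaps everything before the first dot for `replacement`.
static string ReplaceFirstSegment(string fileName, string replacement)
{
    int dot = fileName.IndexOf('.');
    // No dot at all: the whole name counts as the first segment.
    return dot < 0 ? replacement : replacement + fileName.Substring(dot);
}

Console.WriteLine(ReplaceFirstSegment("Abc.CDE.EFG", "zef"));
```

To rename files in a folder, the result could be passed to File.Move together with the original path.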
If you know the length of the first part, you can replace that many characters starting from position 0. Otherwise, split on the dot and replace the first element:
string s = "Abc.CDE.EFG";
string[] n = s.Split('.');
n[0] = "ZEF";
string p = string.Join(".", n);
Console.WriteLine(p);
I have the text:
SMS \r\n\t• Map - locations of
How can I remove all of the white space between • and the first following character?
The above example should result in
SMS \r\n\t•Map - locations of
By using a regular expression it can be done like so:
var input = "SMS \r\n\t• Map - locations of";
var regexPattern = @"(?<=•)\s+(?=\w)";
var cleanedInput = Regex.Replace(input, regexPattern, String.Empty);
This will replace any whitespace between • and the first word character with an empty string.
string s = "SMS \r\n\t• Map - locations of";
string[] temp = s.Split('•');
s = temp[0]+temp[1].TrimStart(' ');
You can use this Regex:
string toInsertBetween = string.Empty;
string toReplace = "SMS \r\n\t• Map - locations of";
string res = Regex.Replace(toReplace, "•[ ]+([^ ])", "•" + toInsertBetween + "$1");
I have some strings like the ones below:
hu212 text = 1
reference = 1
racial construction = 1
2007 = 1
20th century history = 2
and I want to take only the integer AFTER the '='. How can I do that?
I am trying this:
Regex exp = new Regex(@"[a-zA-Z]*[0-9]*[=][0-9]+",RegexOptions.IgnoreCase);
try
{
MatchCollection MatchList = exp.Matches(line);
Match FirstMatch = MatchList[0];
Console.WriteLine(FirstMatch.Value);
}catch(ArgumentOutOfRangeException ex)
{
System.Console.WriteLine("ERROR");
}
but it is not working...
I tried some others, but I get results like "20th" or "hu212"...
What exactly does Matches do? Does it give me the rest of the string that doesn't match the regex?
Instead of Regex you could also do:
int match = int.Parse(line.Substring(line.IndexOf('=') + 1).Trim());
You need to allow whitespace (\s) between the = and the digits:
Regex pattern = new Regex(@"=\s*([0-9]+)$");
Here's a more complete example:
Regex pattern = new Regex(@"=\s*([0-9]+)$");
Match match = pattern.Match(input);
if (match.Success)
{
int value = int.Parse(match.Groups[1].Value);
// Use the value
}
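For comparison, here is a sketch of a non-regex variant that uses LastIndexOf and int.TryParse, so malformed lines are reported rather than throwing. The helper name `TryReadValue` is an assumption for illustration.

```csharp
using System;

// Hypothetical helper: reads the integer that follows the last '=' on a line.
static bool TryReadValue(string line, out int value)
{
    value = 0;
    int eq = line.LastIndexOf('=');
    if (eq < 0)
    {
        return false; // no '=' on this line
    }
    // Trim removes the optional whitespace around the number.
    return int.TryParse(line.Substring(eq + 1).Trim(), out value);
}

foreach (var line in new[] { "hu212 text = 1", "2007 = 1", "20th century history = 2" })
{
    if (TryReadValue(line, out var number))
    {
        Console.WriteLine(number);
    }
}
```

Because only the text after the '=' is parsed, digits on the left-hand side such as "2007" or "20th" never reach the parser.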
What about:
string str = "hu212 text = 1";
string strSplit = str.Split('=')[1].Trim();
String StringToParse = "hu212 text = 1";
String[] splitString = StringToParse.Split(' ');
Int32 outNum;
Int32.TryParse ( splitString[splitString.Length-1], out outNum );
Regex pattern = new Regex(@"=\s?(\d+)");
This allows an optional space after the =. The number is in group 1.
hu212 text =1
reference = 1