Result:
0 0
0 0
-6361 0
-6384 -6672
0 0
0 -6793
...
Code:
string regex = @"X:(.*?)\sY:(.*?)";
if (File.Exists("minelist.log"))
File.Delete("minelist.log");
File.Copy(war3path + "\\minelist.log", "minelist.log");
string[] crdlist = File.ReadAllLines("minelist.log");
for (int i = 0; i < crdlist.Length;i++)
{
Match COORM = Regex.Match(crdlist[i], regex);
if (COORM.Success)
{
float x = 0.0f, y = 0.0f;
float.TryParse(COORM.Groups[1].Value, out x);
float.TryParse(COORM.Groups[2].Value, out y);
MessageBox.Show(x.ToString(), y.ToString());
}
}
if (File.Exists("minelist.log"))
File.Delete("minelist.log");
As a result, only certain values are parsed. Others = 0.
FILE
Result:
0 0
0 0
6361 0
-6384 6672
0 0
0 -6793
...
Your regex is not matching what you think it's matching. You could have inspected the capture groups using MessageBox (or by stepping through in the debugger). The problem is that you used .*? to capture the digits: any number of any character, lazily. A lazy quantifier takes as few characters as it can, and at the very end of a pattern it is allowed to take none. Then, in the for loop, you called TryParse() but did not check its return value. On the lines where you got "0" as a result, the regex probably stopped too soon; TryParse() then failed and left x and y at their default values.
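A minimal sketch of both points, assuming each log line looks like "X:-6625.5 Y:-6585.5". The names CoordinateDemo and TryParseLine are hypothetical and not from the original code. It prints what the trailing lazy group captures, then parses the line with a group that must consume at least one character and with the TryParse() result checked:

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class CoordinateDemo
{
    // Hypothetical helper: parses one "X:<number> Y:<number>" line.
    // Returns false when the line does not match or a number cannot be parsed.
    public static bool TryParseLine(string line, out float x, out float y)
    {
        x = 0f;
        y = 0f;

        // \S+ must consume at least one non-whitespace character,
        // unlike a trailing (.*?) which may consume none.
        Match m = Regex.Match(line, @"X:(\S+)\sY:(\S+)");
        if (!m.Success)
            return false;

        // InvariantCulture uses '.' as the decimal separator,
        // independent of the machine's regional settings.
        return float.TryParse(m.Groups[1].Value, NumberStyles.Float, CultureInfo.InvariantCulture, out x)
            && float.TryParse(m.Groups[2].Value, NumberStyles.Float, CultureInfo.InvariantCulture, out y);
    }

    static void Main()
    {
        const string line = "X:-6625.5 Y:-6585.5";

        // The pattern from the question: the second group sits at the end
        // of the pattern, so the lazy quantifier is satisfied immediately.
        Match lazy = Regex.Match(line, @"X:(.*?)\sY:(.*?)");
        Console.WriteLine("Lazy Y group: '{0}'", lazy.Groups[2].Value);

        if (TryParseLine(line, out float x, out float y))
            Console.WriteLine("X={0}, Y={1}", x, y);
        else
            Console.WriteLine("Could not parse: " + line);
    }
}
```

Passing an explicit culture matters here because a plain float.TryParse(string, out float) uses the current culture, which may expect a comma as the decimal separator.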
Complete Console example properly parsing everything:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Text.RegularExpressions;
using System.Globalization;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string[] crdlist = {
"X:-6625.5 Y:-6585.5",
"X:-6601.25 Y:-6703.75",
"X:-6361 Y:-6516.5",
"X:-6384 Y:-6672",
"X:-6400.25 Y:-6847.75",
"X:-6608.75 Y:-6793",
"X:-6739.75 Y:-6872",
"X:-6429.25 Y:-6940",
"X:-7015.5 Y:-6835.5",
"X:-7117 Y:-6903",
"X:-6885.5 Y:-6662.5",
"X:-6861.5 Y:-6597",
"X:-7006.5 Y:-6728",
"X:-7009 Y:-6608.75",
"X:-6924 Y:-6798",
"X:-6970.25 Y:-6898.25",
"X:-6495.25 Y:-6775",
"X:-7112.5 Y:-6614.5",
"X:-7115.25 Y:-6717.25",
"X:-7113.25 Y:-6835.5",
"X:-6493 Y:-6620.25"
};
Regex re = new Regex(@"^\ *X\:([\-\.0-9]*)\ *Y\:([\-\.0-9]*)\ *$", RegexOptions.Compiled);
var us_EN = new CultureInfo("en-US");
foreach(var line in crdlist)
{
Match m = re.Match(line);
if (m.Success)
{
String X = m.Groups[1].Value;
String Y = m.Groups[2].Value;
float fX = float.Parse(X, us_EN);
float fY = float.Parse(Y, us_EN);
Console.WriteLine("X={0}, Y={1}", fX, fY);
}
}
Console.WriteLine("Press any key to exit.");
Console.ReadKey();
}
}
}
Use this regular expression pattern:
string regex = @"X:(-*\d*\.*\d*)\sY:(-*\d*\.*\d*)";
void RandomRegex(object sender, TextCompositionEventArgs e)
{
var regex = new Regex("^[0-9]*$");
if (regex.IsMatch(e.Text) && !(e.Text == "," && ((TextBox)sender).Text.Contains(e.Text)))
{
e.Handled = false;
}
else
{
e.Handled = true;
}
}
How can I change it so that it also accepts a value such as 0,5 written with a dot, like this: 0.5?
EDIT: I use this Regex to avoid letters in TextBoxes like Height for example.
I think "[0-9,\\.]+" could work here.
var pattern = "[0-9,\\.]+";
foreach( var test in new [] {"0,5", "0.5", "", "abc"})
Console.WriteLine($"{test}: {Regex.IsMatch(test, pattern)}");
0,5: True
0.5: True
: False
abc: False
As others have pointed out you could make your number parsing better.
You could:
(probably preferred) use the user's culture; or
make number parsing more resilient instead.
foreach (var test in new[] { "0.5", "0,5", "2019,11", "2,019.11", "abc" })
{
var frenchR = decimal.TryParse(test, NumberStyles.AllowThousands|NumberStyles.AllowDecimalPoint, new CultureInfo("fr-FR"), out var dec );
bool? invariantR = null;
if( !frenchR )
invariantR = decimal.TryParse(test, NumberStyles.AllowThousands|NumberStyles.AllowDecimalPoint, CultureInfo.InvariantCulture, out dec );
//var dec2 = decimal.Parse(test, CultureInfo.InvariantCulture);
Console.WriteLine($"{test,10} => {dec,10} (french={frenchR}, invariant={(invariantR?.ToString()??"not attempted")})");
}
0.5 => 0.5 (french=False, invariant=True)
0,5 => 0.5 (french=True, invariant=not attempted)
2019,11 => 2019.11 (french=True, invariant=not attempted)
2,019.11 => 2019.11 (french=False, invariant=True)
abc => 0 (french=False, invariant=False)
I'm attempting to parse key-value pairs from strings which look suspiciously like markup using .Net Core 2.1.
Considering the sample Program.cs file below...
My Questions Are:
1.
How can I write the pattern kvp to behave as "Key and Value if exists" instead of "Key or Value" as it currently behaves?
For example, in the test case 2 output, instead of:
=============================
input = <tag KEY1="vAl1">
--------------------
kvp[0] = KEY1
key = KEY1
value =
--------------------
kvp[1] = vAl1
key =
value = vAl1
=============================
I want to see:
=============================
input = <tag KEY1="vAl1">
--------------------
kvp[0] = KEY1="vAl1"
key = KEY1
value = vAl1
=============================
Without breaking test case 9:
=============================
input = <tag noValue1 noValue2>
--------------------
kvp[0] = noValue1
key = noValue1
value =
--------------------
kvp[1] = noValue2
key = noValue2
value =
=============================
2.
How can I write the pattern value to stop matching at the next character matched by the group named "quotes"? In other words, the very next balancing quote. I obviously misunderstand how backreferencing works, my understanding is that \k<quotes> will be replaced by the value matched at runtime (not the pattern defined at design time) by (?<quotes>[""'`]).
For example, in the test case 5 output, instead of:
--------------------
kvp[4] = key3='hello,
key =
value = key3='hello,
--------------------
kvp[5] = experts
key =
value = experts
=============================
I want to see (notwithstanding a solution to question 1):
--------------------
kvp[4] = key3
key = key3
value =
--------------------
kvp[5] = hello, "experts"
key =
value = hello, "experts"
=============================
3.
How can I write the pattern value to stop matching before />? In test case 7, the value for key2 should be thing-1. I can't remember all that I've attempted, but I haven't found a pattern that works without breaking test case 6, in which the / is part of the value.
Program.cs
using System;
using System.Reflection;
using System.Text.RegularExpressions;
namespace ConsoleApp1
{
class Program
{
static void Main(string[] args)
{
RegExTest();
Console.ReadLine();
}
static void RegExTest()
{
// Test Cases
var case1 = @"<tag>";
var case2 = @"<tag KEY1=""vAl1"">";
var case3 = @"<tag kEy2='val2'>";
var case4 = @"<tag key3=`VAL3`>";
var case5 = @"<tag key1='val1'
key2=""http://www.w3.org"" key3='hello, ""experts""'>";
var case6 = @"<tag :key1 =some/thing>";
var case7 = @"<tag key2=thing-1/>";
var case8 = @"<tag key3 = thing-2>";
var case9 = @"<tag noValue1 noValue2>";
var case10 = @"<tag/>";
var case11 = @"<tag />";
// A key may begin with a letter, underscore or colon, followed by
// zero or more of those, or numbers, periods, or dashes.
string key = @"(?<key>(?<=\s+)[a-z_:][a-z0-9_:\.-]*?(?=[\s=>]+))";
// A value may contain any character, and must be wrapped in balanced quotes (double, single,
// or back) if the value contains any quote, whitespace, equal, or greater- or less- than
// character.
string value = @"(?<value>((?<=(?<quotes>[""'`])).*?(?=\k<quotes>)|(?<=[=][\s]*)[^""'`\s=<>]+))";
// A key-value pair must contain a key,
// a value is optional
string kvp = $"(?<kvp>{key}|{value})"; // Without the | (pipe), it doesn't match any test case...
// ...value needs to be optional (case9), tried:
//kvp = $"(?<kvp>{key}{value}?)";
//kvp = $"(?<kvp>{key}({value}?))";
//kvp = $"(?<kvp>{key}({value})?)";
// ...each only matches key, but also matches value in case8 as key
Regex getKvps = new Regex(kvp, RegexOptions.IgnoreCase);
FormatMatches(getKvps.Matches(case1)); // OK
FormatMatches(getKvps.Matches(case2)); // OK
FormatMatches(getKvps.Matches(case3)); // OK
FormatMatches(getKvps.Matches(case4)); // OK
FormatMatches(getKvps.Matches(case5)); // Backreference and/or lazy qualifier doesn't work.
FormatMatches(getKvps.Matches(case6)); // OK
FormatMatches(getKvps.Matches(case7)); // The / is not part of the value.
FormatMatches(getKvps.Matches(case8)); // OK
FormatMatches(getKvps.Matches(case9)); // OK
FormatMatches(getKvps.Matches(case10)); // OK
FormatMatches(getKvps.Matches(case11)); // OK
}
static void FormatMatches(MatchCollection matches)
{
Console.WriteLine(new string('=', 78));
var _input = matches.GetType().GetField("_input",
BindingFlags.NonPublic |
BindingFlags.Instance)
.GetValue(matches);
Console.WriteLine($"input = {_input}");
Console.WriteLine();
if (matches.Count < 1)
{
Console.WriteLine("[kvp not matched]");
return;
}
for (int i = 0; i < matches.Count; i++)
{
Console.WriteLine(new string('-', 20));
Console.WriteLine($"kvp[{i}] = {matches[i].Groups["kvp"]}");
Console.WriteLine($"\t key\t=\t{matches[i].Groups["key"]}");
Console.WriteLine($"\tvalue\t=\t{matches[i].Groups["value"]}");
}
}
}
}
You may use
\s(?<key>[a-z_:][a-z0-9_:.-]*)(?:\s*=\s*(?:(?<q>[`'"])(?<value>.*?)\k<q>|(?<value>(?:(?!/>)[^\s`'"<>])+)))?
See the regex demo with groups highlighted and a .NET regex demo (proof).
C# usage:
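var pattern = @"\s(?<key>[a-z_:][a-z0-9_:.-]*)(?:\s*=\s*(?:(?<q>[`'""])(?<value>.*?)\k<q>|(?<value>(?:(?!/>)[^\s`'""<>])+)))?";
// input is the string to scan, e.g. case5 from the question ("case" itself is a reserved word in C#)
var matches = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);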
foreach (Match m in matches)
{
Console.WriteLine(m.Value); // The whole match
Console.WriteLine(m.Groups["key"].Value); // Group "key" value
Console.WriteLine(m.Groups["value"].Value); // Group "value" value
}
Details
\s - a whitespace
(?<key>[a-z_:][a-z0-9_:.-]*) - Group "key": a letter, _ or : and then 0+ letters, digits, _, :, . or -
(?:\s*=\s*(?:(?<q>[`'"])(?<value>.*?)\k<q>|(?<value>(?:(?!/>)[^\s`'"<>])+)))? - one or zero occurrences of (the value is thus optional):
\s*=\s* - a = enclosed with 0+ whitespaces
(?: - start of a non-capturing group:
(?<q>[`'"]) - Group "q": a delimiter, `, ' or "
(?<value>.*?) - Group "value" matching any 0+ chars other than line break chars as few as possible
\k<q> - backreference to Group "q", same value must match
| - or
(?<value>(?:(?!/>)[^\s`'"<>])+) - Group "value": a char other than whitespace, `, ', ", < and >, 1 or more occurrences, that does not start a /> char sequence
) - end of the non-capturing group.
I want to convert string values to decimal. When there is a smaller or greater symbol I want to add/remove to the value like this
string value | decimal result
--------------|----------------
"< 0.22" | 0.219
"< 32.45" | 32.449
"> 2.0" | 2.01
"> 5" | 5.1
This has to work for decimal numbers with any number of decimal places. Is there an elegant way to do this?
I can only think of counting the number of decimal places, getting the last digit, adding/removing ...
So I would imagine the following solution
Split the string on the space
Identify the sign (GT or LT)
Count the number of decimal places and store that value
Convert the number to a decimal
Based on the symbol either add or subtract 10^(-1 * (numberOfDecimals + 1))
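The steps above can be sketched as follows. This is only an illustration under stated assumptions: the input is always a sign, whitespace, and a number that uses '.' as the decimal separator, and the names BoundaryDemo and Nudge are hypothetical.

```csharp
using System;
using System.Globalization;

static class BoundaryDemo
{
    // Hypothetical helper following the steps above.
    // Expects input such as "< 0.22" or "> 5".
    public static decimal Nudge(string input)
    {
        // Steps 1 and 2: split on the space and identify the sign.
        string[] parts = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length != 2)
            throw new FormatException("Expected '<sign> <number>': " + input);
        string sign = parts[0];
        string digits = parts[1];

        // Step 3: count the decimal places (0 when there is no decimal point).
        int dot = digits.IndexOf('.');
        int decimals = dot < 0 ? 0 : digits.Length - dot - 1;

        // Step 4: convert the number, independent of the machine's culture.
        decimal value = decimal.Parse(digits, NumberStyles.Number, CultureInfo.InvariantCulture);

        // Step 5: step = 10^-(decimals + 1), built by repeated division
        // so that the arithmetic stays exact in decimal.
        decimal step = 1m;
        for (int i = 0; i <= decimals; i++)
            step /= 10m;

        switch (sign)
        {
            case ">": return value + step;
            case "<": return value - step;
            default: throw new FormatException("Unknown sign: " + sign);
        }
    }

    static void Main()
    {
        foreach (string input in new[] { "< 0.22", "< 32.45", "> 2.0", "> 5" })
            Console.WriteLine("{0} => {1}", input, Nudge(input).ToString(CultureInfo.InvariantCulture));
    }
}
```

Using decimal rather than double keeps results such as 0.219 exact. The sketch does not guard against inputs with more decimal places than decimal can represent.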
Using Regex
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string[] inputs = {
"< 0.22",
"< 32.45",
"> 2.0"
};
string pattern = @"(?'sign'[\<\>])\s+(?'integer'\d+)\.(?'fraction'\d+)";
decimal number = 0;
foreach(string input in inputs)
{
Match match = Regex.Match(input, pattern);
if(match.Groups["sign"].Value == ">")
{
number = decimal.Parse(match.Groups["integer"].Value + "." + match.Groups["fraction"].Value);
number += decimal.Parse("." + new string('0', match.Groups["fraction"].Value.Length) + "1");
}
else
{
number = decimal.Parse(match.Groups["integer"].Value + "." + match.Groups["fraction"].Value);
number -= decimal.Parse("." + new string('0', match.Groups["fraction"].Value.Length) + "1");
}
Console.WriteLine(number.ToString());
}
Console.ReadLine();
}
}
}
Solution without counting the decimals:
If the string starts with >, append "1". If the string starts with <, replace the last character with the next lower digit (note: "0" becomes "9", and the step is then repeated on the character to its left) and then append "9".
Then split at the blank and Double.Parse the right part.
No need to count digits or split apart the decimal... (you can run this in linqpad)
string inputString = "> 0.22";
string[] data = inputString.Split(' ');
double input = double.Parse(data[1]);
double gt = double.Parse(data[1] + (data[1].Contains('.') ? "1" : ".1"));
switch(data[0])
{
case ">":
gt.Dump(); // return gt;
break;
case "<":
double lt = input - (gt - input);
lt.Dump(); // return lt;
break;
}
// you could even make it smaller by just doing (instead of the switch):
return data[0]==">" ? gt : input - (gt - input);
If trailing zeros are allowed as input but are considered insignificant then trim them.
inputString = inputString.TrimEnd('0');
I have a 9 character string I am trying to provide multiple checks on. I want to first check if the first 1 - 7 characters are numbers and then say for example the first 3 characters are numbers how would I check the 5th character for a letter range of G through T.
I am using c# and have tried this so far...
string checkString = "123H56789";
Regex charactorSetOne = new Regex("[G-T]");
Match matchSetOne = charactorSetOne.Match(checkString, 3);
if (Char.IsNumber(checkString[0]) && Char.IsNumber(checkString[1]) && Char.IsNumber(checkString[2]))
{
if (matchSetOne.Success)
{
Console.WriteLine("3th char is a letter");
}
}
But am not sure if this is the best way to handle the validations.
UPDATE:
The digits can be 0 - 9, but can concatenate from one number to seven. Like this "12345T789" or "1R3456789" etc.
It's easy with LINQ:
check if the first 1 - 7 characters are numbers :
var condition1 = input.Take(7).All(c => Char.IsDigit(c));
check the 5th character for a letter range of G through T
var condition2 = input.ElementAt(4) >= 'G' && input.ElementAt(4) <= 'T';
As it is, both conditions can't be true at the same time (if the first 7 chars are digits, then the 5th char can't be a letter).
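If the intended rule is the one suggested by the UPDATE in the question, a single anchored regex may be enough. The sketch below assumes that reading of the requirement: the string is exactly 9 characters long and starts with a run of 1 to 7 digits (0-9) that is followed directly by one letter from G to T. The names CodeCheckDemo and IsValid are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class CodeCheckDemo
{
    // Assumed rule: 1 to 7 leading digits (0-9), then one letter G..T.
    // [0-9] is used instead of \d because \d also matches non-ASCII digits in .NET.
    static readonly Regex LeadingDigitsThenLetter = new Regex("^[0-9]{1,7}[G-T]");

    public static bool IsValid(string input)
    {
        return input != null
            && input.Length == 9
            && LeadingDigitsThenLetter.IsMatch(input);
    }

    static void Main()
    {
        foreach (string s in new[] { "123H56789", "12345T789", "1R3456789", "123456789", "123F56789" })
            Console.WriteLine("{0}: {1}", s, IsValid(s));
    }
}
```

The pattern is only anchored at the start, so the characters after the letter are left unchecked; tighten it if the remaining positions also have rules.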
I'm new to C# so expect some mistakes ahead. Any help / guidance would be greatly appreciated.
I want to limit the accepted inputs for a string to just:
a-z
A-Z
hyphen
Period
If the character is a letter, a hyphen, or period, it's to be accepted. Anything else will return an error.
The code I have so far is
string foo = "Hello!";
foreach (char c in foo)
{
/* Is there a similar way
To do this in C# as
I am basing the following
Off of my Python 3 knowledge
*/
if (c.IsLetter == true) // *Q: Can I cut out the == true part ?*
{
// Do what I want with letters
}
else if (c.IsDigit == true)
{
// Do what I want with numbers
}
else if (c.Isletter == "-") // Hyphen | If there's an 'or', include period as well
{
// Do what I want with symbols
}
}
I know that's a pretty poor set of code.
I had a thought whilst writing this:
Is it possible to create a list of the allowed characters and check the variable against that?
Something like:
foreach (char c in foo)
{
if (c != list)
{
// Unaccepted message here
}
else if (c == list)
{
// Accepted
}
}
Thanks in advance!
Easily accomplished with a Regex:
using System.Text.RegularExpressions;
var isOk = Regex.IsMatch(foo, @"^[A-Za-z0-9\-\.]+$");
Rundown of ^[A-Za-z0-9\-\.]+$ :
^ -> match from the start
[A-Za-z0-9\-\.] -> set of possible matches:
A-Z -> match A-Z (uppercase)
a-z -> match a-z (lowercase)
0-9 -> match 0 to 9
\- -> match hyphen
\. -> match dot
+ -> any number of matches is ok
$ -> match until the end of the string
You can do this in a single line with regular expressions:
Regex.IsMatch(myInput, @"^[a-zA-Z0-9\.\-]*$")
^ -> match start of input
[a-zA-Z0-9\.\-] -> match any of a-z , A-Z , 0-9, . or -
* -> 0 or more times (you may prefer + which is 1 or more times)
$ -> match the end of input
You can use Regex.IsMatch function and specify your regular expression.
Or define manually chars what you need. Something like this:
string foo = "Hello!";
char[] availableSymbols = {'-', ',', '!'};
char[] availableLetters = {'A', 'a', 'H'}; //etc.
char[] availableNumbers = {'1', '2', '3'}; //etc
foreach (char c in foo)
{
if (availableLetters.Contains(c))
{
// Do what I want with letters
}
else if (availableNumbers.Contains(c))
{
// Do what I want with numbers
}
else if (availableSymbols.Contains(c))
{
// Do what I want with symbols
}
}
Possible solution
You can use the CharUnicodeInfo.GetUnicodeCategory(char) method. It returns the UnicodeCategory of a character. The following Unicode categories might be what you're looking for:
UnicodeCategory.DecimalDigitNumber
UnicodeCategory.LowercaseLetter and UnicodeCategory.UppercaseLetter
An example:
string foo = "Hello!";
foreach (char c in foo)
{
UnicodeCategory cat = CharUnicodeInfo.GetUnicodeCategory(c);
if (cat == UnicodeCategory.LowercaseLetter || cat == UnicodeCategory.UppercaseLetter)
{
// Do what I want with letters
}
else if (cat == UnicodeCategory.DecimalDigitNumber)
{
// Do what I want with numbers
}
else if (c == '-' || c == '.')
{
// Do what I want with symbols
}
}
Answers to your other questions
Can I cut out the == true part?:
Yes, you can cut the == true part, it is not required in C#
If there's an 'or', include period as well.:
To create "or" expressions, use the 'barbar' (||) operator, as I've done in the example above.
Whenever you have some kind of collection of similar things (an array, a list, a string of characters, whatever), you'll see in the definition of the collection that it implements IEnumerable<T>:
public class String : ..., IEnumerable<char>,
Here T is a char. It means that you can ask the class: "give me your first T", "give me your next T", "give me your next T" and so on until there are no more elements.
This is the basis for all of LINQ. LINQ has about 40 functions that act upon sequences. If you need to do something with a sequence of the same kind of items, consider using LINQ.
The functions in LINQ can be found in class Enumerable. One of the function is Contains. You can use it to find out if a sequence contains a character.
char[] allowedChars = "abcdefgh....XYZ.-".ToCharArray();
Now you have a sequence of allowed characters. Suppose you have a character x and want to know if x is allowed:
char x = ...;
bool xIsAllowed = allowedChars.Contains(x);
Now Suppose you don't have one character x, but a complete string and you want only the characters in this string that are allowed:
string str = ...
var allowedInStr = str
.Where(characterInString => allowedChars.Contains(characterInString));
If you are going to do a lot with sequences of things, consider spending some time to familiarize yourself with LINQ:
Linq explained
You can use Regex.IsMatch with "^[a-zA-Z_.]*$" to check for valid characters.
string foo = "Hello!";
if (!Regex.IsMatch(foo, @"^[a-zA-Z_\.]*$"))
{
throw new ArgumentException("Exception description here");
}
Other than that you can create a list of chars and use string.Contains method to check if it is ok.
string validChars = "abcABC./";
foreach (char c in foo)
{
if (!validChars.Contains(c))
{
// Throw exception
}
}
Also, you don't need to compare against true or false in an if condition. The two statements below are equivalent:
if (boolvariable) { /* do something */ }
if (boolvariable == true) { /* do something */ }