Regular expression in C# - c#

I have text that looks something like this:
##MMIVLoader#ProductVer#4.1.2#BCM_7400S_LE#Product#Aug 21 2009#
##MMIVLib#ObjectVer#4.1.2#BCM_7400S_LE#Product#Aug 21 2009#
##HuaweFGDLDrv#ObjectVer#01.00.09#7324#PRODUCT#Aug 20 2009#
##ProtectVer#ObjectVer#127.8.1 #BCM_SDE5.03#PRODUCT#Aug 4 2009 06:56:19#
##KernelSw#ObjectVer#0.0.1#BCM-7454#PRODUCT# Dec 19 2007#
##ReceiverSw#ObjectVer#E.5.6.001#HWBC01ZS#PRODUCT#May 3 2010#
I want the output in an array like this:
MMIVLoader 4.1.2
MMIVLib 4.1.2
HuaweFGDLDrv 01.00.09
ProtectVer 127.8.1
KernelSw 0.0.1
ReceiverSw E.5.6.001
Can anyone suggest how to do this in C# using a regular expression, or is there a more sophisticated way to do it?
Thanks in advance.

This is easy, you can just split by # (removing the empty items) and pull the first and third items.
var list = myString.Split(new String[] {Environment.NewLine},
StringSplitOptions.RemoveEmptyEntries)
.Select(item => item.Split(new char[] {'#'},
StringSplitOptions.RemoveEmptyEntries))
.Where(a => a.Length > 2)
.Select(a => new { Item = a[0], Version = a[2] }).ToArray();

Or simply remove the extra stuff from each line:
Regex.Replace(text, @"^##([^#]+)#[^#]+#([^#]+).*", "$1,$2", RegexOptions.Multiline);
to get
MMIVLoader,4.1.2
MMIVLib,4.1.2
HuaweFGDLDrv,01.00.09
ProtectVer,127.8.1
KernelSw,0.0.1
ReceiverSw,E.5.6.001
Then just split each line on the comma to get the array.
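The replace-then-split idea can be sketched end to end as follows. This is only a minimal illustration: the names VersionExtractor and Extract are made up for the example, and trimming the trailing space in a field such as "127.8.1 " is an extra step the snippet above does not perform.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class VersionExtractor
{
    // Hypothetical helper: returns one "name version" string per input line.
    public static string[] Extract(string text)
    {
        // Step 1: reduce each line to "name,version".
        string reduced = Regex.Replace(
            text,
            @"^##([^#]+)#[^#]+#([^#]+).*",
            "$1,$2",
            RegexOptions.Multiline);

        // Step 2: split into lines, then split each line on the comma.
        return reduced
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Split(','))
            .Where(parts => parts.Length == 2) // assumption: names and versions contain no commas
            .Select(parts => parts[0].Trim() + " " + parts[1].Trim())
            .ToArray();
    }

    public static void Main()
    {
        string text =
            "##MMIVLoader#ProductVer#4.1.2#BCM_7400S_LE#Product#Aug 21 2009#\n" +
            "##ProtectVer#ObjectVer#127.8.1 #BCM_SDE5.03#PRODUCT#Aug 4 2009 06:56:19#\n";

        foreach (string entry in Extract(text))
            Console.WriteLine(entry);
    }
}
```

If a name or version could itself contain a comma, a different separator in the replacement string would be the safer choice.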

If you do want a crazy regex solution, you can use this:
var matches = Regex.Matches(
input,
"##(?<name>.*?)#(Product|Object)Ver#(?<ver>.*?)#",
RegexOptions.IgnoreCase
).Cast<Match>().Select(m => m.Groups);
foreach (var match in matches)
{
Console.WriteLine("{0} {1}", match["name"], match["ver"]);
}

For completeness, here is a LINQ version with query syntax :)
string[] lines = new string[] {
"##MMIVLoader#ProductVer#4.1.2#BCM_7400S_LE#Product#Aug 21 2009#",
"##MMIVLib#ObjectVer#4.1.2#BCM_7400S_LE#Product#Aug 21 2009#",
"##HuaweFGDLDrv#ObjectVer#01.00.09#7324#PRODUCT#Aug 20 2009#",
"##ProtectVer#ObjectVer#127.8.1 #BCM_SDE5.03#PRODUCT#Aug 4 2009 06:56:19#",
"##KernelSw#ObjectVer#0.0.1#BCM-7454#PRODUCT# Dec 19 2007#",
"##ReceiverSw#ObjectVer#E.5.6.001#HWBC01ZS#PRODUCT#May 3 2010#" };
var q = from components in
(from line in lines
select line.Split(new char[] { '#' },
StringSplitOptions.RemoveEmptyEntries))
select new { Name = components[0], Version = components[2] };
foreach (var item in q)
{
Console.WriteLine("Item: Name={0} Version={1}", item.Name, item.Version);
}

Related

Reading double Numbers from a text file which contains string and number mixed

I have a file that contains both numbers and text. I'm trying to read all the numbers as doubles and put them in a one-dimensional double array.
In the file, some lines begin with spaces, and some lines contain two or three numbers in a row.
The file is created by another app whose output format I don't want to change.
The data in the file looks like the sample below (some lines begin with spaces):
110 ! R1
123.000753 ! Radian per s as R2
600.0451 65 ! j/kg
12000 ! 4 Number of iteration
87.619 ! (min 20 and max 1000)
My code so far is:
char[] splits = { ' ', '!' };
var array = File.ReadAllLines(@"myfile.dat")
.SelectMany(linee => linee.Split(splits))
.Where(n => !string.IsNullOrWhiteSpace(n.ToString()))
.Select(n =>
{
double doub;
bool suc = double.TryParse(n, out doub);
return new { doub, suc };
}).Where( values=>values.suc).ToArray();
The problem is that my code also reads the numbers after the ! in the descriptions, as in lines 4 and 5.
The array has to look like this:
110 , 123.000735 , 6000.0451 , 65 , 120000 , 87.619
But with my code it looks like this:
110 , 123.000735 , 6000.0451 , 65 , 120000 , 4 , 87.619 , 20 , 1000
It's hard to give a general formula when given only a single example, but the following will work for your example:
return File.ReadLines(@"myfile.dat")
.Where(s => !String.IsNullOrWhiteSpace(s))
.Select(s => s.Substring(0, s.IndexOf('!')).Split(new [] {' '}, StringSplitOptions.RemoveEmptyEntries))
.SelectMany(s => s)
.Select(s => Double.Parse(s));
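A slightly more defensive variant of the same idea is sketched below. It assumes that a line without a '!' consists of data only and that the numbers use '.' as the decimal separator regardless of the machine's culture; the names LeadingNumbers and Parse are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class LeadingNumbers
{
    // Hypothetical helper: parses the numbers that appear before the first '!' on each line.
    public static double[] Parse(IEnumerable<string> lines)
    {
        return lines
            .Select(line =>
            {
                int bang = line.IndexOf('!');
                // Assumption: a line without '!' is all data.
                return bang >= 0 ? line.Substring(0, bang) : line;
            })
            .SelectMany(data => data.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries))
            .Select(token => double.Parse(token, NumberStyles.Float, CultureInfo.InvariantCulture))
            .ToArray();
    }

    public static void Main()
    {
        var lines = new[]
        {
            "110 ! R1",
            "  600.0451 65 ! j/kg",
            "12000 ! 4 Number of iteration",
            "87.619 ! (min 20 and max 1000)"
        };

        double[] values = Parse(lines);
        Console.WriteLine(string.Join(" , ",
            values.Select(v => v.ToString(CultureInfo.InvariantCulture))));
    }
}
```

double.Parse still throws if a non-numeric token appears before the '!'; double.TryParse could be used instead if such lines should be skipped silently.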
One approach could be the following:
var lines = str.Split(new []{"!",Environment.NewLine},StringSplitOptions.RemoveEmptyEntries)
.Where(x=> x.Split(new []{" "},StringSplitOptions.RemoveEmptyEntries).All(c=>double.TryParse(c, out _))).
SelectMany(x=> x.Split(new []{" "},StringSplitOptions.RemoveEmptyEntries).Select(c=>double.Parse(c)));
Here's an alternate solution using regular expressions:
var regex = new Regex(@"^(\s*(?<v>\d+(\.\d+)?)\s*)+\!.*$");
var query = from line in lines
let match = regex.Match(line)
where match.Success
// "v" sits inside a repeated group, so read all of its captures, not just the last one
from Capture capture in match.Groups["v"].Captures
select double.Parse(capture.Value, NumberStyles.Float, CultureInfo.InvariantCulture);

Get Group of Numbers on a String

I have a string which consists of numbers and letters like the example below:
string strFood = "123d 4hello12";
What I want to accomplish is to get all the groups of numbers, which are 123, 4, and 12.
I am trying to do this via LINQ, but I am not getting an array as the result. My plan is to get the array and then add its elements together: 123 + 4 + 12, which gives 139.
This is what I have tried so far, but it doesn't produce groups of strings or integers:
string[] strArr =
strFood .GroupBy(y => Char.IsDigit(y)).Select(y => y.ToString()).ToArray();
I also tried this one, but it returns all the numbers in one string:
var foo = from a in strFood .ToCharArray() where Char.IsDigit(a) == true select a;
Any help would be appreciated.
I suggest using regular expressions to find all groups (matches) with aggregation via Linq:
string strFood = "123d 4hello12";
var sum = Regex
.Matches(strFood, "[0-9]+") // groups of integer numbers
.OfType<Match>()
.Select(match => int.Parse(match.Value)) // treat each group as integer
.Sum(); // sum up
If you want to obtain an array (and sum up later):
int[] result = Regex
.Matches(strFood, "[0-9]+") // groups of integer numbers
.OfType<Match>()
.Select(match => int.Parse(match.Value))
.ToArray();
...
var sum = result.Sum();
You could split your string to integers collection:
string strFood = "123d 4hello12";
var integers = new Regex(@"\D").Split(strFood)
.Where(x=>!string.IsNullOrWhiteSpace(x))
.Select(x=>int.Parse(x));
and after that sum it with:
var sum = integers.Sum(); // Result : 139
Edit after a comment from @Dmitry Bychenko: with some characters, such as Persian digits, that won't work.
Solution: either use
new Regex(@"[^0-9]+")
or
new Regex(@"\D", RegexOptions.ECMAScript)
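The difference between the two digit classes can be checked with a small sketch like the one below. The class and method names are invented for the illustration; the point is that by default \d (and therefore \D) follows Unicode digit categories, while RegexOptions.ECMAScript restricts \d to 0-9.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitClassDemo
{
    // Default behaviour: \d matches any Unicode decimal digit.
    public static bool IsDigitsDefault(string s) =>
        Regex.IsMatch(s, @"^\d+$");

    // ECMAScript behaviour: \d is equivalent to [0-9].
    public static bool IsDigitsEcma(string s) =>
        Regex.IsMatch(s, @"^\d+$", RegexOptions.ECMAScript);

    public static void Main()
    {
        string persian = "\u06F1\u06F2\u06F3"; // Extended Arabic-Indic digits 1, 2, 3

        Console.WriteLine(IsDigitsDefault(persian)); // True
        Console.WriteLine(IsDigitsEcma(persian));    // False
        Console.WriteLine(IsDigitsEcma("123"));      // True
    }
}
```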
Just to add decimal numbers in summation, you can use this regex instead:
var str = "123d 4hello12and0.2plus.1and-1and2+8.but 1....1 a.b";
// ^^^ ^ ^^ ^^^ ^^ ^^ ^ ^ ^ ^^
var s = Regex
.Matches(str, @"-?([0-9]+|[0-9]*\.[0-9]+)")
.OfType<Match>()
.Sum(c=> double.Parse(c.Value, CultureInfo.InvariantCulture));
Result will be:
Count = 11
[0]: {123}
[1]: {4}
[2]: {12}
[3]: {0}
[4]: {.2}
[5]: {.1}
[6]: {-1}
[7]: {2}
[8]: {8}
[9]: {1}
[10]: {.1}
Sum = 149.39999999999998 //~= 149.4
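var yourSum = strFood.Where(x => Char.IsDigit(x)).Select(x => (int)Char.GetNumericValue(x)).Sum();
This gives you the sum of the individual digits in your string (1 + 2 + 3 + 4 + 1 + 2 = 13 for the example), not the sum of the number groups. Note that Convert.ToInt32(char) would return the character code rather than the digit value, which is why Char.GetNumericValue is used here.
If you just want an IEnumerable of ints, remove the Sum() from the end.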
Why don't you use a simple regular expression?
string input = "123d 4hello12";
int sum = System.Text.RegularExpressions.Regex.Matches(input, @"\d+").Cast<System.Text.RegularExpressions.Match>().Sum(m => Convert.ToInt32(m.Value));
I tried an approach using Split and Join.
First I use LINQ's Select to replace the non-digits with a ',':
strFood.Select(ch => (Char.IsDigit(ch)) ? ch : ',');
I then use Join to turn this back into a string of the form "123,,4,,,,,12", Split it on ",", and filter out the empty strings (using Where). Finally I convert each remaining string into a number (e.g. "123" becomes 123) and sum the results.
Putting this all together gives:
var Sum = String.Join("",(strFood.Select(c => (Char.IsDigit(c)) ? c : ',')))
.Split(',').Where(c => c != "").Select(c => int.Parse(c)).Sum();
Here's a slightly shorter version using Concat:
var Sum = String.Concat(strFood.Select(ch => (Char.IsDigit(ch)) ? ch : ','))
.Split(',').Where(c => c != "").Select(c => int.Parse(c)).Sum();
This gives a result of 139
Try this:
int[] strArr = strFood.ToCharArray().Where(x=> Char.IsDigit(x)).Select(y => Convert.ToInt32(y.ToString())).ToArray();

Check if chars of a string contains in another string with LINQ

I'm making a Scrabble game in the command line with C#. The player must input words like those in the list below:
Word    Points
some    6
first   8
potsie  8
day     7
could   8
postie  8
from    9
have    10
back    12
this    7
The letters the player has are these:
sopitez
This value is a string. I want to check whether the words can be built from these letters. For this I've tried the following code:
String highst = (from word
in words
where word.Contains(letters)
orderby points descending
select word).First();
But it doesn't work the way I want: this code doesn't select any word. I know the reason: sopitez is not contained in any of the words.
My question is: is there a way to check whether the characters in the string letters are contained in the words, without looping over the characters?
Note: Each letter must be used at most once in the solution.
If I calculate the result, it must be potsie or postie. (I must write the logic for that.)
P.S.: I'm playing this game: www.codingame.com/ide/puzzle/scrabble
This will not be performant at all, but at least it will do the trick. Notice that I've used a dictionary just for the sake of simplicity (also, I don't see why you would have repeated words like "potsie"; I've never played Scrabble). You can use a list of tuples as well if you follow this code.
EDIT: I changed this according to the OP's new comments
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
var letters = new HashSet<char>("sopitez");
var wordsMap = new Dictionary<string, int>()
{
{"some", 6}, {"first", 8}, {"potsie", 8}, {"postie", 8}, {"day", 7},
{"could", 8}, {"from", 9}, {"have", 10}, {"back", 12},
{"this", 7}
};
var highest = wordsMap
.Select(kvp => {
var word = kvp.Key;
var points = kvp.Value;
var matchCount = kvp.Key.Sum(c => letters.Contains(c) ? 1 : 0);
return new {
Word = word,
Points = points,
MatchCount = matchCount,
FullMatch = matchCount == word.Length,
EstimatedScore = points * matchCount /(double) word.Length // This can vary... it's just my guess for an "Estimated score"
};
})
.OrderByDescending(x => x.FullMatch)
.ThenByDescending(x => x.EstimatedScore);
foreach (var anon in highest)
{
Console.WriteLine("{0}", anon);
}
}
}
The problem here is that Contains checks whether one string contains another as a substring; it does not check whether it contains all of those characters. You need to replace each string in your dictionary with a HashSet<char> and perform set comparisons like IsSubsetOf or IsSupersetOf to determine if the letters match.
Here is what you're doing:
string a= "Hello";
string b= "elHlo";
bool doesContain = b.Contains(a); //This returns false
Here is what you need to do:
var setA = new HashSet<char>(a);
var setB = new HashSet<char>(b);
bool isSubset = setA.IsSubsetOf(setB); //This returns true
Update
Actually, this is wrong, because sets remove duplicate elements. But essentially you are misusing Contains. You'll need some more complicated sequence comparison that can allow duplicate letters.
Update2
You need this for word/letters comparison:
//Compares counts of each letter in word and tiles
bool WordCanBeMadeFromLetters(string word, string tileLetters) {
var tileLetterCounts = GetLetterCounts(tileLetters);
var wordLetterCounts = GetLetterCounts(word);
return wordLetterCounts.All(letter =>
tileLetterCounts.ContainsKey(letter.Key)
&& tileLetterCounts[letter.Key] >= letter.Value);
}
//Gets dictionary of letter/# of letter in word
Dictionary<char, int> GetLetterCounts(string word){
return word
.GroupBy(c => c)
.ToDictionary(
grp => grp.Key,
grp => grp.Count());
}
So your original example can look like this:
String highst = (from word
in words
where WordCanBeMadeFromLetters(word, letters)
orderby points descending
select word).First();
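Since the query above still relies on a points value that is not defined anywhere, here is a self-contained sketch of the same letter-count check combined with scoring. The names ScrabbleSketch, CanBeMadeFrom and BestWord are made up for the example, and the word list is only a subset of the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScrabbleSketch
{
    // Counts how often each letter occurs in a string.
    static Dictionary<char, int> GetLetterCounts(string s) =>
        s.GroupBy(c => c).ToDictionary(g => g.Key, g => g.Count());

    // True when every letter of the word is available often enough among the tiles.
    public static bool CanBeMadeFrom(string word, string tiles)
    {
        var available = GetLetterCounts(tiles);
        return GetLetterCounts(word).All(need =>
            available.TryGetValue(need.Key, out int have) && have >= need.Value);
    }

    // Returns the first highest-scoring playable word, or null when nothing fits.
    public static string BestWord(IEnumerable<KeyValuePair<string, int>> scoredWords, string tiles) =>
        scoredWords
            .Where(entry => CanBeMadeFrom(entry.Key, tiles))
            .OrderByDescending(entry => entry.Value) // stable sort: ties keep input order
            .Select(entry => entry.Key)
            .FirstOrDefault();

    public static void Main()
    {
        var scoredWords = new List<KeyValuePair<string, int>>
        {
            new KeyValuePair<string, int>("some", 6),
            new KeyValuePair<string, int>("potsie", 8),
            new KeyValuePair<string, int>("back", 12),
            new KeyValuePair<string, int>("postie", 8),
        };

        Console.WriteLine(BestWord(scoredWords, "sopitez")); // potsie
    }
}
```

A list of pairs is used instead of a dictionary so that the input order is well defined, which matters when two playable words have the same score.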
Since letters can repeat, I think you need something like this (of course that's not very efficient, but pure LINQ):
var letters = "sopitezwss";
var words = new Dictionary<string, int>() {
{"some", 6}, {"first", 8}, {"potsie", 8}, {"day", 7},
{"could", 8}, {"from", 9}, {"have", 10}, {"back", 12},
{"this", 7}, {"postie", 8}, {"swiss", 15}
};
var highest = (from word
in words
where word.Key.GroupBy(c => c).All(c => letters.Count(l => l == c.Key) >= c.Count())
orderby word.Value descending
select word);

Parse and find string in between (English string inside double square bracket) with C#?

Below is a code snippet. I want to find items that start with "[[" and end with "]]" and contain only English letters (a-z and A-Z) in between. What is an efficient way to do this?
string sample_input = "'''அர்காங்கெல்சுக் [[sam]] மாகாணம்''' (''Arkhangelsk Oblast'', {{lang-ru|Арха́нгельская о́бласть}}, ''அர்காங்கெல்சுக்யா ஓபிலாஸ்து'') என்பது [[உருசியா]]வின் [[I am sam]] [[உருசியாவின் கூட்டாட்சி அமைப்புகள்|நடுவண் அலகு]] ஆகும். <ref>{{cite news|author=Goldman, Francisco|date=5 April 2012|title=Camilla Vallejo, the World's Most Glamorous Revolutionary|newspaper=[[The New York Times Magazine]]| url=http://www.nytimes.com/2012/04/08/magazine/camila-vallejo-the-worlds-most-glamorous-revolutionary.html|accessdate=5 April 2013}}</ref>";
List<string> found = new List<string>();
foreach (var item in sample_input.Split(' '))
{
if (item.StartsWith("[[s") || item.StartsWith("[[S") || item.StartsWith("[[a") || item.StartsWith("[[a"))
{
found.Add(item);
}
}
Expected Results: [[Sam]], [[I am Sam]], [[The New York Times Magazine]].
Try this:
string sample_input = "'''அர்காங்கெல்சுக் [[sam]] மாகாணம்''' (''Arkhangelsk Oblast'', {{lang-ru|Арха́нгельская о́бласть}}, ''அர்காங்கெல்சுக்யா ஓபிலாஸ்து'') என்பது [[உருசியா]]வின் [[உருசியாவின் கூட்டாட்சி அமைப்புகள்|நடுவண் அலகு]] ஆகும்.";
var regex = new Regex(@"\[\[[a-zA-Z ]+\]\]"); // the space in the class lets multi-word items such as [[I am sam]] match
var found = regex.Matches(sample_input).OfType<Match>().Select(x=>x.Value).ToList();

Split a string containing various spaces

I have a txt file as follows and would like to split it into double arrays:
node Strain Axis Strain F P/S Sum Cur Moment
0 0.00000 0.00 0.0000 0 0 0 0 0.00
1 0.00041 -83.19 0.0002 2328 352 0 0 -0.80
2 0.00045 -56.91 0.0002 2329 352 0 0 1.45
3 0.00050 -42.09 0.0002 2327 353 0 0 -0.30
My goal is to have an array for each column, i.e.
node[] = {0,1,2,3}, Axis[] = {0.00,-83.19,-56.91,-42.09}, ....
I know how to read the txt file and convert strings to double arrays, but the problem is that the values are separated not by tabs but by varying numbers of spaces. I googled for a way to do this but couldn't find one; some posts discussed how to do it with a constant number of spaces. If you know how to do this, or if there is an existing Q&A for this issue, please let me know. It will be greatly appreciated. Thanks.
A different way would be to use a regular expression, although in this case it is overkill and I would suggest you stick with the other answers here that use RemoveEmptyEntries:
string[] elements = Regex.Split(s, @"\s+");
StringSplitOptions.RemoveEmptyEntries should do the trick:
var items = source.Split(new [] { " " }, StringSplitOptions.RemoveEmptyEntries);
The return value does not include array elements that contain an empty string
var doubles = text.Split("\n\r".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
.Skip(1)
.Select(line => line.Split(new char[]{' '},StringSplitOptions.RemoveEmptyEntries)
.Select(x => double.Parse(x)).ToArray())
.ToArray();
Use the option StringSplitOptions.RemoveEmptyEntries to treat consecutive delimiters as one:
string[] parts = source.Split(' ',StringSplitOptions.RemoveEmptyEntries);
then parse from there:
double[] values = parts.Select(s => double.Parse(s)).ToArray();
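The question asks for one array per column, so the rows parsed this way still need to be transposed. A minimal sketch of that step follows; it assumes the first line is the header, every data row has the same number of values, and '.' is the decimal separator. ColumnReader and ReadColumns are hypothetical names.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class ColumnReader
{
    // Hypothetical helper: returns columns[c][r], the value in column c of data row r.
    public static double[][] ReadColumns(string text)
    {
        double[][] rows = text
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Skip(1) // assumption: the first line is the header
            .Select(line => line
                .Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(token => double.Parse(token, CultureInfo.InvariantCulture))
                .ToArray())
            .Where(row => row.Length > 0) // ignore whitespace-only lines
            .ToArray();

        if (rows.Length == 0)
            return new double[0][];

        // Transpose; assumes every row has as many values as the first one.
        return Enumerable.Range(0, rows[0].Length)
            .Select(c => rows.Select(row => row[c]).ToArray())
            .ToArray();
    }

    public static void Main()
    {
        string text =
            "node Strain Axis Strain F P/S Sum Cur Moment\n" +
            "0   0.00000   0.00 0.0000    0   0 0 0  0.00\n" +
            "1   0.00041 -83.19 0.0002 2328 352 0 0 -0.80\n";

        double[][] columns = ReadColumns(text);
        Console.WriteLine(string.Join(", ", columns[2])); // the "Axis" column
    }
}
```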
