Get a number and string from string - c#

I have a fairly simple problem, but I want to solve it in the best way possible. I have a string in the format <some letters><some numbers>, e.g. q1 or qwe12. I want to get two strings from it (I can then convert the number part to an integer, or not). The first is the "string part" of the given string, e.g. qwe, and the second is the "number part", e.g. 12. Letters and digits are never mixed, so there is no input like qw1e2.
I know that I can use a StringBuilder and a for loop that checks whether each character is a digit or a letter. Easy. But I don't think that is a very clear solution, so I am asking: is there a built-in method or something similar that does this in 1-3 lines, or at least without a loop?

You can use a regular expression with named groups to identify the different parts of the string you are interested in.
For example:
string input = "qew123";
var match = Regex.Match(input, "(?<letters>[a-zA-Z]+)(?<numbers>[0-9]+)");
if (match.Success)
{
Console.WriteLine(match.Groups["letters"]);
Console.WriteLine(match.Groups["numbers"]);
}

You can try Linq as an alternative to regular expressions:
string source = "qwe12";
string letters = string.Concat(source.TakeWhile(c => c < '0' || c > '9'));
string digits = string.Concat(source.SkipWhile(c => c < '0' || c > '9'));

You can use the Where() extension method from the System.Linq library (https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.where) to keep only the chars that are digits, convert the resulting IEnumerable<char> to an array of chars, and create a new string from that array:
string source = "qwe12";
string stringPart = new string(source.Where(c => !Char.IsDigit(c)).ToArray());
string numberPart = new string(source.Where(Char.IsDigit).ToArray());
MessageBox.Show($"String part: '{stringPart}', Number part: '{numberPart}'");
Source:
https://stackoverflow.com/a/15669520/8133067

If possible, add a space between the letters and numbers (q 3, zet 64, etc.) and use string.Split.
Otherwise, use the for loop; it isn't that hard.
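A minimal sketch of the string.Split suggestion, assuming you control the input format and can guarantee exactly one space between the two parts. SplitOnSpace is a hypothetical helper name, and the code is written as top-level statements (C# 9 or later):

```csharp
using System;

// Hypothetical helper: assumes the input was produced with exactly one space
// between the letter part and the number part, e.g. "zet 64".
static (string Letters, string Digits) SplitOnSpace(string input)
{
    string[] parts = input.Split(' ');
    if (parts.Length != 2)
        throw new FormatException($"Expected '<letters> <digits>', got '{input}'.");
    return (parts[0], parts[1]);
}

var (letters, digits) = SplitOnSpace("zet 64");
Console.WriteLine($"{letters} / {digits}"); // zet / 64
```

This only helps if the format can be changed; for the original qwe12 format, one of the other answers is needed.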

You can test as part of an aggregation:
var z = "qwe12345";
var b = z.Aggregate(new []{"", ""}, (acc, s) => {
if (Char.IsDigit(s)) {
acc[1] += s;
} else {
acc[0] += s;
}
return acc;
});
Assert.Equal(new [] {"qwe", "12345"}, b);

Related

Remove anything from string after any "a-zA-Z" char

I have strings of this type:
"10a10", "10b5641", "5a1121", "438z2a5f"
and I need to remove anything after the FIRST a-zA-Z char in the string (the symbol itself should be removed as well). What could be a solution?
Examples of results I expect:
"10a10" returns "10"
"10b5641" returns "10"
"5a1121" returns "5"
"438z2a5f" returns "438"

You could use a regular expression with the Regex class, something like:
string str = "10a10";
str = Regex.Replace(str, @"[a-zA-Z].*", "");
Console.WriteLine(str);
will output:
10
Basically, it takes the first a-zA-Z character and everything after it (.* matches any character zero or more times) and removes it from the string.

An easy to understand approach would be to use the String.IndexOfAny Method to find the Index of the first a-zA-Z char, and then use the String.Substring Method to cut the string accordingly.
To do so you would create an array containing all a-zA-Z characters and use this as an argument to String.IndexOfAny. After that you use 0 and the result of String.IndexOfAny as arguments for String.Substring.
I am pretty sure there are more elegant ways to do this, but this seems the most basic approach to me, so it's worth mentioning.
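The IndexOfAny/Substring approach described above could look roughly like this. It is a sketch written as top-level statements (C# 9 or later); TakeBeforeFirstLetter and asciiLetters are hypothetical names, and the fallback for strings without any letter is an assumption, since the question does not cover that case:

```csharp
using System;
using System.Linq;

// All ASCII letters a-z and A-Z, built once and reused.
char[] asciiLetters = Enumerable.Range('a', 26)
    .SelectMany(c => new[] { (char)c, char.ToUpperInvariant((char)c) })
    .ToArray();

// Hypothetical helper: returns the text before the first ASCII letter.
string TakeBeforeFirstLetter(string input)
{
    int index = input.IndexOfAny(asciiLetters);
    // IndexOfAny returns -1 when no letter is present; keep the whole string then.
    return index < 0 ? input : input.Substring(0, index);
}

foreach (string s in new[] { "10a10", "10b5641", "5a1121", "438z2a5f" })
    Console.WriteLine($"{s} returns {TakeBeforeFirstLetter(s)}");
```

Without the -1 check, Substring(0, -1) would throw an ArgumentOutOfRangeException for inputs that contain no letter.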

You could do so using Linq as follows.
var result = new string(strInput.TakeWhile(x => !char.IsLetter(x)).ToArray());

var sList = new List<string> { "10a10", "10b5641", "5a1121", "438z2a5f" };
foreach (string s in sList.ToArray())
{
string number = new string(s.TakeWhile(c => !Char.IsLetter(c)).ToArray());
Console.WriteLine(number);
}

Either Linq:
var result = string.Concat(strInput
.TakeWhile(c => !((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))));
Or regular expression:
using System.Text.RegularExpressions;
...
var result = Regex.Match(strInput, "^[^A-Za-z]*").Value;
In both cases, we start at the beginning of strInput and take characters until the first a..z or A..Z occurs.
Demo:
string[] tests = new[] {
"10a10", "10b5641", "5a1121", "438z2a5f"
};
string demo = string.Join(Environment.NewLine, tests
.Select(test => $"{test,-10} returns \"{Regex.Match(test, "^[^A-Za-z]*").Value}\""));
Console.Write(demo);
Outcome:
10a10      returns "10"
10b5641    returns "10"
5a1121     returns "5"
438z2a5f   returns "438"

C# How to extract words from a string and put them into class members

I have a problem with c# string manipulation and I'd appreciate your help.
I have a file that contains many lines. It looks like this:
firstWord number(secondWord) thirdWord(Phrase) Date1 Date2
firstWord number(secondWord) thirdWord(Phrase) Date1 Time1
...
I need to separate these words and put them into class properties. As you can see, the first problem is that the spacing between words is not uniform: sometimes there is one space, sometimes eight. The second problem is that the third position holds a phrase of 2 to 5 words (again separated by spaces, or sometimes connected with _ or -), and it needs to be treated as one string, because it has to be one class member. The class should look like this:
class A
string a = firstWord;
int b = number;
string c = phrase;
Date d = Date1;
Time e = Time1;
I'd appreciate if you had any ideas how to solve this. Thank you.

Use the following steps:
Use File.ReadAllLines() to get a string[], where each element represents one line of the file.
For each line, use string.Split() and chop your line into individual words. Use both space and parentheses as your delimiters. This will give you an array of words. Call it arr.
Now create an object of your class and assign like this:
string a = arr[0];
int b = int.Parse(arr[1]);
string c = string.Join(" ", arr.Skip(4).Take(arr.Length - 6));
DateTime d = DateTime.Parse(arr[arr.Length - 2]);
DateTime e = DateTime.Parse(arr[arr.Length - 1]);
The only tricky stuff is string c above. Logic here is that from element no. 4 up to the 3rd last element, all of these elements form your phrase part, so we use linq to extract those elements and join them together to get back your phrase. This would obviously require that the phrase itself doesn't contain any parentheses itself, but that shouldn't normally be the case I assume.
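Here is a rough, self-contained sketch of those steps applied to a single line. The sample line is made up for illustration, and the real lines would come from File.ReadAllLines(). It also assumes StringSplitOptions.RemoveEmptyEntries is passed to Split, because otherwise runs of spaces produce empty array elements and the index arithmetic no longer holds. The code uses top-level statements (C# 9 or later):

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical sample line; real data would come from File.ReadAllLines(path).
string line = "alpha   42(beta)  gamma(one two_three)   2017-01-01 2017-01-02";

// Split on spaces and parentheses; RemoveEmptyEntries absorbs runs of spaces.
string[] arr = line.Split(new[] { ' ', '(', ')' }, StringSplitOptions.RemoveEmptyEntries);

string a = arr[0];
int b = int.Parse(arr[1], CultureInfo.InvariantCulture);
// Everything between the fourth element and the last two elements is the phrase.
string c = string.Join(" ", arr.Skip(4).Take(arr.Length - 6));
DateTime d = DateTime.Parse(arr[arr.Length - 2], CultureInfo.InvariantCulture);
DateTime e = DateTime.Parse(arr[arr.Length - 1], CultureInfo.InvariantCulture);

Console.WriteLine($"{a} | {b} | {c} | {d:d} | {e:d}");
```

Parsing with CultureInfo.InvariantCulture is a design choice for this sketch so that the ISO-style dates parse the same way on every machine; adjust it to whatever date format the file really uses.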

You need a loop and string- and TryParse-methods:
var list = new List<ClassName>();
foreach (string line in File.ReadLines(path).Where(l => !string.IsNullOrEmpty(l)))
{
string[] fields = line.Trim().Split(new char[] { }, StringSplitOptions.RemoveEmptyEntries);
if (fields.Length < 5) continue;
var obj = new ClassName();
list.Add(obj);
obj.FirstWord = fields[0];
int number;
int index = fields[1].IndexOf('(');
if (index > 0 && int.TryParse(fields[1].Remove(index), out number))
obj.Number = number;
int phraseStartIndex = fields[2].IndexOf('(');
int phraseEndIndex = fields[2].LastIndexOf(')');
if (phraseStartIndex != phraseEndIndex)
{
obj.Phrase = fields[2].Substring(++phraseStartIndex, phraseEndIndex - phraseStartIndex);
}
DateTime dt1;
if(DateTime.TryParse(fields[3], out dt1))
obj.Date1 = dt1;
DateTime dt2;
if (DateTime.TryParse(fields[4], out dt2))
obj.Date2 = dt2;
}

The following regular expression seems to cover what I imagine you would need - at least a good start.
^(?<firstWord>[\w\s]*)\s+(?<secondWord>\d+)\s+(?<thirdWord>[\w\s_-]+)\s+(?<date>\d{4}-\d{2}-\d{2})\s+(?<time>\d{2}:\d{2}:\d{2})$
This captures 5 named groups
firstWord is any alphanumeric or whitespace
secondWord is any numeric entry
thirdWord any alphanumeric, space underscore or hyphen
date is any iso formatted date (date not validated)
time any time (time not validated)
Any amount of whitespace is used as the delimiter - but you will have to Trim() any group captures. It makes a hell of a lot of assumptions about your format (dates are ISO formatted, times are hh:mm:ss).
You could use it like this:
Regex regex = new Regex(@"(?<firstWord>[\w\s]*)\s+(?<secondWord>\d+)\s+(?<thirdWord>[\w\s_-]+)\s+(?<date>\d{4}-\d{2}-\d{2})\s+(?<time>\d{2}:\d{2}:\d{2})$", RegexOptions.IgnoreCase);
var match = regex.Match("this is the first word 123 hello_world 2017-01-01 10:00:00");
if(match.Success){
Console.WriteLine("{0}\r\n{1}\r\n{2}\r\n{3}\r\n{4}",match.Groups["firstWord"].Value.Trim(),match.Groups["secondWord"].Value,match.Groups["thirdWord"].Value,match.Groups["date"].Value,match.Groups["time"].Value);
}
http://rextester.com/LGM52187

You can use Regex; take the following as a starting point. For example, to get the first word you could use this:
string data = "Example 2323 Second This is a Phrase 2017-01-01 2019-01-03";
string firstword = new Regex(@"\b[A-Za-z]+\b").Matches(data)[0].Value;

Using string.ToUpper on substring

Have an assignment to allow a user to input a word in C# and then display that word with the first and third characters changed to uppercase. Code follows:
namespace Capitalizer
{
class Program
{
static void Main(string[] args)
{
string text = Console.ReadLine();
char[] delimiterChars = { ' ' };
string[] words = text.Split(delimiterChars);
string Upper = text.ToUpper();
Console.WriteLine(Upper);
Console.ReadKey();
}
}
}
This of course generates the entire word in uppercase, which is not what I want. I can't seem to make text.ToUpper(0,2) work, and even then that'd capitalize the first three letters. Only solution I can think of now that would make the word appear on one line (and I don't know if it works) is to move the capitalized letters and lowercase letters into a character array and try to get that to print all values in a modified order.

The simplest way I can think of to address your exact question as described — to convert to upper case the first and third characters of the input — would be something like the following:
StringBuilder sb = new StringBuilder(text);
sb[0] = char.ToUpper(sb[0]);
sb[2] = char.ToUpper(sb[2]);
text = sb.ToString();
The StringBuilder class is essentially a mutable string object, so it is the most fluid way to approach these kinds of operations: it provides the most straightforward conversions to and from string, as well as the full range of string operations. Changing individual characters is easy in many data structures, but insertions, deletions, appending, formatting, etc. also come with StringBuilder, so it's a good habit to use it rather than other approaches.
But frankly, it's hard to see how that's a useful operation. I can't help but wonder if you have stated the requirements incorrectly and there's something more to this question than is seen here.

You could use LINQ:
var upperCaseIndices = new[] { 0, 2 };
var message = "hello";
var newMessage = new string(message.Select((c, i) =>
upperCaseIndices.Contains(i) ? Char.ToUpper(c) : c).ToArray());
Here is how it works. message.Select (inline LINQ query) selects characters from message one by one and passes into selector function:
upperCaseIndices.Contains(i) ? Char.ToUpper(c) : c
written as C# ?: shorthand syntax for if. It reads as "If index is present in the array, then select upper case character. Otherwise select character as is."
(c, i) => condition
is a lambda expression. See also:
Understand Lambda Expressions in 3 minutes
The rest is very simple - represent result as array of characters (.ToArray()), and create a new string based off that (new string(...)).

Only solution I can think of now that would make the word appear on one line (and I don't know if it works) is to move the capitalized letters and lowercase letters into a character array and try to get that to print all values in a modified order.
That seems a lot more complicated than necessary. Once you have a character array, you can simply change the elements of that character array. In a separate function, it would look something like
string MakeFirstAndThirdCharacterUppercase(string word) {
var chars = word.ToCharArray();
chars[0] = char.ToUpper(chars[0]);
chars[2] = char.ToUpper(chars[2]);
return new string(chars);
}

My simple solution:
string text = Console.ReadLine();
char[] delimiterChars = { ' ' };
string[] words = text.Split(delimiterChars);
foreach (string s in words)
{
char[] chars = s.ToCharArray();
chars[0] = char.ToUpper(chars[0]);
if (chars.Length > 2)
{
chars[2] = char.ToUpper(chars[2]);
}
Console.Write(new string(chars));
Console.Write(' ');
}
Console.ReadKey();

get measurement value only from string

I have a string which gives a measurement followed by the units, in either cm, m or inches.
For example :
The number could be 112cm, 1.12m, 45inches or 45in.
I would like to extract only the number part of the string. Any idea how to use the units as the delimiters to extract the number ?
While I am at it, I would like to ignore the case of the units.
Thanks

You can try:
string numberMatch = Regex.Match(measurement, @"\d+\.?\d*").Value;
EDIT
Furthermore, converting this to a double is trivial:
double result;
if (double.TryParse(numberMatch, out result))
{
// Yeiiii I've got myself a double ...
}

Use String.Split http://msdn.microsoft.com/en-us/library/tabh47cf.aspx
Something like:
var units = new[] {"cm", "inches", "in", "m"};
var splitnumber = mynumberstring.Split(units, StringSplitOptions.RemoveEmptyEntries);
var number = double.Parse(splitnumber[0], System.Globalization.CultureInfo.InvariantCulture); // Convert.ToInt32 would throw on "1.12"

Using Regex this can help you out:
(?i)(\d+(?:\.\d+)?)(?=c?m|in(?:ch(?:es)?)?)
Breakdown:
(?i) = ignores character case (specify it in C#; the live demo does not support it)
\d+(?:\.\d+)? = supports numbers like 2, 2.25, etc.
(?=c?m|in(?:ch(?:es)?)?) = positive lookahead; the number only matches if it is followed by one of the units m, cm, in, inch or inches
?: = specifies that the group will not capture
? = specifies that the preceding character or group is optional
Demo
EDIT
Sample code:
MatchCollection mcol = Regex.Matches(sampleStr, @"(?i)(\d+(?:\.\d+)?)(?=c?m|in(?:ch(?:es)?)?)");
foreach(Match m in mcol)
{
Debug.Print(m.ToString()); // see output window
}

I guess I'd try to replace with "" every character that is not number or ".":
//s is the string you need to convert
string tmp=s;
foreach (char c in s.ToCharArray())
{
if (!(c >= '0' && c <= '9') && !(c =='.'))
tmp = tmp.Replace(c.ToString(), "");
}
s=tmp;

Try using regular expression \d+ to find an integer number.
resultString = Regex.Match(measurementunit, @"\d+").Value;

Is it a requirement that you use the unit as the delimiter? If not, you could extract the number using regex (see Find and extract a number from a string).

RegExp: X number of matches => X number of replacements?

Using regular expressions I'm trying to match a string, which has a substring consisting of unknown number of repeats (one or more) and then replace the repeating substring with the same number of replacement strings.
If the Regexp is "(st)[a]+(ck)", then I want to get these kind of results:
"stack" => "stOck"
"staaack" => "stOOOck" //so three times "a" to be replaced with three times "O"
"staaaaack" => "stOOOOOck"
How do I do that?
Either C# or AS3 would do.

If you use .NET, you can do this:
find: (?<=\bsta*)a(?=a*ck\b)
replace: O
If you also want to change sta+ck when it occurs inside other words, just remove the \b anchors.
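As a sketch, this find/replace pair could be used from C# as follows. It relies on .NET's support for variable-length lookbehind, uses top-level statements (C# 9 or later), and takes the sample words from the question:

```csharp
using System;
using System.Text.RegularExpressions;

// Each 'a' is matched on its own: the lookbehind requires "st" plus any earlier
// a's before it, and the lookahead requires any remaining a's plus "ck" after it.
string pattern = @"(?<=\bsta*)a(?=a*ck\b)";

foreach (string word in new[] { "stack", "staaack", "staaaaack" })
{
    string result = Regex.Replace(word, pattern, "O");
    Console.WriteLine($"{word} => {result}");
}
// stack => stOck
// staaack => stOOOck
// staaaaack => stOOOOOck
```

Because every a is a separate match, the number of replacements automatically equals the number of a's, with no callback needed.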

Since I am not familiar with either C# or AS3, I will write a solution in JavaScript, but the concept in the solution can be used for C# code or AS3 code.
var str = "stack stackoverflow staaaaaack stOackoverflow should not replace";
var replaced = str.replace(/st(a+)ck/g, function ($0, $1) {
var r = "";
for (var i = 0; i < $1.length; i++) {
r += "O";
}
return "st" + r + "ck";
});
Output:
"stOck stOckoverflow stOOOOOOck stOackoverflow should not replace"
In C#, you would use Regex.Replace(String, String, MatchEvaluator) (or other Regex.Replace methods that takes in a MatchEvaluator delegate) to achieve the same effect.
In AS3, you can pass a function as replacement, similar to how I did above in JavaScript. Check out the documentation of String.replace() method.
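For reference, a C# sketch of the same idea using the Regex.Replace(String, String, MatchEvaluator) overload mentioned above. The lambda plays the role of the JavaScript callback; the input string is the one from the JavaScript example, and the code uses top-level statements (C# 9 or later):

```csharp
using System;
using System.Text.RegularExpressions;

string input = "stack stackoverflow staaaaaack stOackoverflow should not replace";

// The lambda is the MatchEvaluator: it receives each match and returns the
// replacement text, with one 'O' per 'a' captured by group 1.
string replaced = Regex.Replace(input, @"st(a+)ck",
    m => "st" + new string('O', m.Groups[1].Length) + "ck");

Console.WriteLine(replaced);
// stOck stOckoverflow stOOOOOOck stOackoverflow should not replace
```

The new string('O', count) constructor replaces the explicit loop from the JavaScript version.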

For AS3 you can pass a function to the replace method on the String object where matching elements are into the arguments array. So you can build and return a new String with all the 'a' replaced by 'O'
for example:
// first way explicit loop
var s:String="staaaack";
trace("before", s);
var newStr:String = s.replace(/(st)(a+)(ck)/g, function():String{
var ret:String=arguments[1]; // here match 'st'
//arguments[2] match 'aaa..'
for (var i:int=0, len:int=arguments[2].length; i < len; i++)
ret += "O";
return ret + arguments[3]; // arguments[3] match 'ck'
});
trace("after", newStr); // output stOOOOck
// second way array and join
var s1:String="staaaack staaaaaaaaaaaaack stack paaaack"
trace("before", s1)
var after:String = s1.replace(/(st)(a+)(ck)/g, function():String{
return arguments[1]+(new Array(arguments[2].length+1)).join("O")+arguments[3]
})
trace("after", after)
here live example on wonderfl : http://wonderfl.net/c/bOwE

Why not use the String Replace() method instead?
var str = "stack";
str = str.Replace("a", "O");

I would do it like this:
String s = "Staaack";
Console.WriteLine(s);
while (Regex.Match(s,"St[O]*([a]{1})[a]*ck").Success){
s = Regex.Replace(s,"(St[O]*)([a]{1})([a]*ck)", "$1O$3");
Console.WriteLine(s);
}
Console.WriteLine(s);
Console.ReadLine();
It replaces one a per iteration, until no more a's can be found.
