I have the following text in an Excel spreadsheet cell:
"Calories (kcal) "
(minus quotes).
I can get the value of the cell into my code:
string nutrientLabel = dataRow[0].ToString().Trim();
I'm new to C# and need help separating "Calories" and "(kcal)" into two different variables that I can upload into my online system. I need the result to be two strings:
nutrientLabel = Calories
nutrientUOM = kcal
I've googled the hell out of this and found out how to make it work to separate them and display into Console.WriteLine but I need the values actually out to 2 variables.
foreach (DataRow dataRow in nutrientsdataTable.Rows)
{
string nutrientLabel = dataRow[0].ToString().Trim();
}
char[] paraSeparator = new char[] { '(', ')' };
string[] result;
Console.WriteLine("=======================================");
Console.WriteLine("Para separated strings :\n");
result = nutrientLabel.Split(paraSeparator,
StringSplitOptions.RemoveEmptyEntries);
foreach (string str in result)
{
Console.WriteLine(str);
}
You can use a simple regex for this:
var reg = new Regex(@"(?<calories>\d+)\s\((?<kcal>\d+)\)");
Which essentially says:
Match at least one number and store it in the group 'calories'
Match a space and an opening parenthesis
Match at least one number and store it in the group 'kcal'
Match a closing parenthesis
Then we can extract the results using the named groups:
var sampleInput = "15 (35)";
var match = reg.Match(sampleInput);
var calories = match.Groups["calories"].Value;
var kcal = match.Groups["kcal"].Value;
Note that calories and kcal are still strings here; you'll need to parse them into an integer (or decimal).
// Trim first so the trailing space in the cell does not matter
string[] s = dataRow[0].ToString().Trim().Split(' ');
string nutrientLabel = s[0];
string nutrientUOM = s[1].Replace(")", "").Replace("(", "");
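For the "Calories (kcal) " cell in the question, a regex with named groups can fill both variables in one step. This is only a sketch: the class name NutrientParser and the method TryParse are made up for illustration, and it assumes the cell always has the shape "label (unit)".

```csharp
using System;
using System.Text.RegularExpressions;

public static class NutrientParser
{
    // Assumed cell shape: "<label> (<unit>)", with optional surrounding whitespace.
    private static readonly Regex LabelWithUnit =
        new Regex(@"^\s*(?<label>[^()]+?)\s*\((?<uom>[^()]*)\)\s*$");

    // Hypothetical helper: returns false when the cell does not look like "label (unit)".
    public static bool TryParse(string cell, out string label, out string uom)
    {
        Match m = LabelWithUnit.Match(cell ?? string.Empty);
        label = m.Success ? m.Groups["label"].Value : null;
        uom = m.Success ? m.Groups["uom"].Value.Trim() : null;
        return m.Success;
    }

    public static void Main()
    {
        if (TryParse("Calories (kcal) ", out string nutrientLabel, out string nutrientUOM))
        {
            Console.WriteLine(nutrientLabel); // Calories
            Console.WriteLine(nutrientUOM);   // kcal
        }
    }
}
```

Because the label group stops at the first parenthesis, a multi-word label such as "Total Fat (g)" should also come out whole, which the Split(' ') approach would not handle.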
Consider a number of strings, which are assumed to contain "keys" of the form "Wxxx", where each x is a digit from 0-9. Each string contains either a single key or several keys separated by ',' followed by two spaces. For example:
W123
W432
W546,  W234,  W167
The ones that contain multiple "keys" need to be split up, into an array. So, the last one in the above examples should be split into an array like this: {"W546", "W234", "W167"}.
As a quick solution, String.Split comes to mind, but as far as I am aware, it can take one character, like ','. The problem is that it would return an array like this: {"W546", "  W234", "  W167"}. The two spaces in all the array entries from the second one onwards can probably be removed using Substring, but is there a better solution?
For context, these values are being held in a spreadsheet, and are assumed to have undergone data validation to ensure the "keys" are separated by a comma followed by two spaces.
while ((ws.Cells[row,1].Value!=null) && !(ws.Cells[row,1].Value.ToString().Equals("")))
{
// there can be one key, or multiple keys separated by ','
if (ws.Cells[row,keysCol].Value.ToString().Contains(','))
{
// there are multiple
// need to split the ones in this cell separated by a comma
}
else
{
// there is one
}
row++;
}
You can just specify ',' and ' ' as separators and RemoveEmptyEntries.
Using your sample of single keys and a string containing multiple keys you can just handle them all the same and get your list of individual keys:
List<string> cells = new List<string>() { "W123", "W432", "W546, W234, W167" };
List<string> keys = new List<string>();
foreach (string cell in cells)
{
keys.AddRange(cell.Split(new char[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries));
}
Split can handle strings where there's nothing to split, and AddRange will accept your single keys as well as the multi-key split results.
You could use an old favorite--Regular Expressions.
Here are two flavors 'Loop' or 'LINQ'.
static void Main(string[] args)
{
var list = new List<string>{"W848","W998, W748","W953, W9484, W7373","W888"};
Console.WriteLine("LINQ");
list.ForEach(l => TestSplitRegexLinq(l));
Console.WriteLine();
Console.WriteLine("Loop");
list.ForEach(l => TestSplitRegexLoop(l));
}
private static void TestSplitRegexLinq(string s)
{
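string pattern = @"[W][0-9]*";
var reg = new Regex(pattern);
// Cast<Match>() keeps this working on .NET Framework, where MatchCollection is non-generic
reg.Matches(s).Cast<Match>().ToList().ForEach(m => Console.WriteLine(m.Value));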
}
private static void TestSplitRegexLoop(string s)
{
string pattern = @"[W][0-9]*";
var reg = new Regex(pattern);
foreach (Match m in reg.Matches(s))
{
Console.WriteLine(m.Value);
}
}
Just replace the Console.WriteLine with anything you want, e.g. myList.Add(m.Value).
You will need to add the namespaces: using System.Text.RegularExpressions; and, for the LINQ flavor, using System.Linq;
Eliminate the extra space first (using Replace()), then use split.
var input = "W546, W234, W167";
var normalized = input.Replace(", ",",");
var array = normalized.Split(',');
This way, you treat a comma followed by a space exactly the same as you'd treat a comma. If there might be two spaces, you can also replace that:
var input = "W546,  W234,  W167";
var normalized = input.Replace("  "," ").Replace(", ",",");
var array = normalized.Split(',');
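If the number of spaces after each comma is not guaranteed, one alternative is to split on the comma alone and trim every piece. The sketch below assumes keys never contain commas; the class name KeySplitter and the method SplitKeys are invented for this example.

```csharp
using System;
using System.Linq;

public static class KeySplitter
{
    // Hypothetical helper: works for a single key as well as a comma-separated list.
    public static string[] SplitKeys(string cell)
    {
        return cell.Split(',')
                   .Select(k => k.Trim())      // drop any spaces around each key
                   .Where(k => k.Length > 0)   // ignore empty pieces such as a trailing comma
                   .ToArray();
    }

    public static void Main()
    {
        foreach (string key in SplitKeys("W546,  W234,  W167"))
        {
            Console.WriteLine(key);
        }
    }
}
```

When the separator is known to be exactly a comma plus two spaces, the String.Split overload that takes a string[] of separators and a StringSplitOptions value would also work, but trimming is more forgiving if the validation ever slips.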
After trying this in .NET fiddle, I think I may have a solution:
// if there are multiple
string keys = ws.Cells[row,keysCol].Value.ToString();
// remove spaces
string keys_normalised = keys.Replace(" ", string.Empty);
Console.WriteLine("Checking that spaces have been removed: " + keys_normalised + "\n");
string[] splits = keys_normalised.Split(',');
for (int i = 0; i < splits.Length; i++)
{
Console.WriteLine(splits[i]);
}
This produces the following output in the console:
Checking that spaces have been removed: W456,W234,W167
W456
W234
W167
I have a string with a primary separator as ; and secondary separator as |,
I need to extract the 3rd word after the separator | and return a single string with a separator as ; and trim the rest of the string.
Example:
Input:
company1|23|**NJ**|0321;company2|24|**PH**|0322;company3|25|**NY**|0323;company4|26|**PA**|0323
Expected Output:
NJ;PH;NY;PA
Try this, please:
string input = "company1|23|NJ|0321;company2|24|PH|0322;company3|25|NY|0323;company4|26|PA|0323";
List<string> resultLevel2 = new List<string>();
string[] resultLevel1 = input.Split(';');
foreach (var item in resultLevel1)
{
resultLevel2.Add(item.Split('|')[2]);
}
string output = string.Join(";", resultLevel2);
var s = "company1|23|NJ|0321;company2|24|PH|0322;company3|25|NY|0323;company4|26|PA|0323";
var result = s.Split(';').Select(x=>x.Split('|')[2]).ToList();
var resultStr = string.Join(";",result);
Remember to include System.Linq
I have the following string array. While looping through the array, I need to separate the numeric value from the alphabetic value.
eg:
35.00MY to 35.00 and MY
2.10D8 to 2.10 and D8
80.00YRI to 80.00 and YRI
4.00G8 to 4.00 and G8
I tried the following code, but that didn't help:
foreach (string taxText in taxSplit) {
Regex re = new Regex(@"([a-zA-Z]+)(\d+)");
Match result = re.Match(taxText);
string alphaPart = result.Groups[1].ToString();
string numberPart = result.Groups[2].ToString(); }
Both returned empty.
You can bastardize a Split and use a lookahead (?= ... ) and a lookbehind (?<= ... ):
string original = "35.00ab3500bc";
Regex reg = new Regex("(?<=[0-9])(?=[A-Za-z])");
string[] parts = reg.Split(original, 2);
Here, we have to instantiate a new Regex instance because this version of Split isn't available as a static method. The pattern we pass says to find a void where the left side of the void is a number (i.e. the lookbehind), and the right side of the void is a letter (i.e. the lookahead). We pass a 2 to say that we want at most two items in parts.
var lst = new List<string>() { "35.00MY", "2.10D8", "80.00YRI", "4.00GB" };
var res = new List<string>();
lst.ForEach(v =>
{
res.Add(new string(v.TakeWhile(c => !Char.IsLetter(c)).ToArray()));
res.Add(v.TrimStart("01234567890.".ToCharArray()));
} );
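Another option is a single anchored regex that captures the leading decimal number and whatever follows it. The pattern in the question looks for letters followed by digits, which is the reverse of the sample data, and \d+ on its own does not cover the decimal point. The sketch below assumes each entry starts with a number like 35.00; the names TaxSplitter and TrySplit are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TaxSplitter
{
    // Assumed shape: digits, optional ".digits", then a non-empty code such as "MY" or "D8".
    private static readonly Regex AmountThenCode =
        new Regex(@"^(?<amount>\d+(?:\.\d+)?)(?<code>.+)$");

    // Hypothetical helper: returns false when the text does not start with a number.
    public static bool TrySplit(string taxText, out string amount, out string code)
    {
        Match m = AmountThenCode.Match(taxText);
        amount = m.Success ? m.Groups["amount"].Value : null;
        code = m.Success ? m.Groups["code"].Value : null;
        return m.Success;
    }

    public static void Main()
    {
        foreach (string taxText in new[] { "35.00MY", "2.10D8", "80.00YRI", "4.00G8" })
        {
            if (TrySplit(taxText, out string amount, out string code))
            {
                Console.WriteLine(amount + " and " + code);
            }
        }
    }
}
```

Since the code part is captured with .+ rather than [A-Za-z]+, codes that mix letters and digits, like D8 and G8, stay in one piece.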
I found it inefficient to iterate through the string parts split by the space character, extract the numeric parts, apply
UInt64.Parse(Regex.Match(numericPart, @"\d+").Value)
and then concatenate them back together to form the string with the numbers grouped.
Is there a better, more efficient way to apply 3-digit grouping to all numbers in a string containing other characters?
I am pretty sure the most efficient way (CPU-wise, with just a single pass over the string) is the basic foreach loop, along these lines
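var sb = new StringBuilder();
int digits = 0;
foreach (char c in inputString)
{
    // if c is a digit count it, else reset the counter
    digits = char.IsDigit(c) ? digits + 1 : 0;
    // a fourth consecutive digit starts a new group: insert a "." before it
    if (digits == 4) { sb.Append('.'); digits = 1; }
    sb.Append(c);
}
return sb.ToString();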
This will produce 123.456.7
If you want 1.234.567 you'll need an additional buffer for digit-sequences
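A rough sketch of the 1.234.567 variant follows. Instead of copying digits into a separate buffer, it remembers where each digit run starts in the input and groups the run from the right once its length is known. The names DigitGrouper and GroupDigits, and the default '.' separator, are assumptions for this example; only ASCII digits are treated as digits.

```csharp
using System;
using System.Text;

public static class DigitGrouper
{
    // Hypothetical helper: groups every run of ASCII digits in threes, counted from the right.
    public static string GroupDigits(string input, char separator = '.')
    {
        var sb = new StringBuilder(input.Length + input.Length / 3);
        int runStart = -1; // index of the first digit of the current run, -1 outside a run

        // Iterate one position past the end so a trailing digit run is flushed too.
        for (int i = 0; i <= input.Length; i++)
        {
            bool isDigit = i < input.Length && input[i] >= '0' && input[i] <= '9';
            if (isDigit)
            {
                if (runStart < 0) runStart = i;
                continue;
            }
            if (runStart >= 0)
            {
                AppendGrouped(sb, input, runStart, i - runStart, separator);
                runStart = -1;
            }
            if (i < input.Length) sb.Append(input[i]);
        }
        return sb.ToString();
    }

    private static void AppendGrouped(StringBuilder sb, string s, int start, int length, char separator)
    {
        for (int k = 0; k < length; k++)
        {
            // A separator goes before a digit when a multiple of three digits remains.
            if (k > 0 && (length - k) % 3 == 0) sb.Append(separator);
            sb.Append(s[start + k]);
        }
    }

    public static void Main()
    {
        Console.WriteLine(GroupDigits("1234567"));                          // 1.234.567
        Console.WriteLine(GroupDigits("Hello 34234456 where 3334 is it?")); // Hello 34.234.456 where 3.334 is it?
    }
}
```

This still reads each input character once; the only extra work is the inner loop over each digit run when it is written out.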
So you want to replace all longs in a string with the same long but with a number-group-separator of the current culture? .... Yes
string[] words = input.Split();
var newWords = words.Select(w =>
{
long l;
bool isLong = System.Int64.TryParse(w.Trim(), out l);
if(isLong)
return l.ToString("N0");
else
return w;
});
string result = string.Join(" ", newWords);
With the input from your comment:
string input = "hello 134443 in the 33 when 88763 then";
You get the expected result: "hello 134,443 in the 33 when 88,763 then", if your current culture uses comma as number-group-separator.
I will post my regex-based example. I believe regex does not have to be too slow, especially once it is compiled and is declared with static and readonly.
// Declare the regex
private static readonly Regex regex = new Regex(@"(\d)(?=(\d{3})+(?!\d))", RegexOptions.Compiled);
// Then, somewhere inside a method
var replacement = string.Format("$1{0}", System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberGroupSeparator); // Get the system digit grouping separator
var strn = "Hello 34234456 where 3334 is it?"; // Just a sample string
// Somewhere (?:inside a loop)?
var res = regex.Replace(strn, replacement);
Output (if , is a system digit grouping separator):
Hello 34,234,456 where 3,334 is it?
Split or Regex.Split is used to extract the words in a sentence and store them in an array. I would instead like to extract the spaces in a sentence and store them in an array (the sentence may contain multiple consecutive spaces). Is there an easy way of doing this? I first tried to split it normally and then use string.Split(theSplittedStrings, StringSplitOptions.RemoveEmptyEntries); however, that did not preserve the number of spaces that exist.
---------- EDIT -------------
for example. If there is a sentence "This is a test".
I would like to make an array of string { " ", " ", " "}.
---------- EDIT END ---------
Any help is appreciated.
Thank you.
EDIT:
Based on your edited question, I believe you can do that with simple iteration like:
string str = "This is a test";
List<string> spaceList = new List<string>();
var temp = str.TakeWhile(char.IsWhiteSpace).ToList();
List<char> charList = new List<char>();
foreach (char c in str)
{
if (c == ' ')
{
charList.Add(c);
}
if (charList.Any() && c != ' ')
{
spaceList.Add(new string(charList.ToArray()));
charList = new List<char>();
}
}
That would give you spaces in different elements of List<string>, if you need an array back then you can call ToArray
(Old Answer)
You don't need string.Split. You can count the spaces in the string and then create array like:
int spaceCount = str.Count(r => r == ' ');
char[] array = Enumerable.Repeat<char>(' ', spaceCount).ToArray();
If you want to consider White-Space (Space, LineBreak, Tabs) as space then you can use:
int whiteSpaceCount = str.Count(char.IsWhiteSpace);
This code matches all spaces in the input string and outputs their indexes:
const string sentence = "This is a test sentence.";
MatchCollection matches = Regex.Matches(sentence, @"\s");
foreach (Match match in matches)
{
Console.WriteLine("Space at character {0}", match.Index);
}
This code retrieves all space groups as an array:
const string sentence = "This is a test sentence.";
string[] spaceGroups = Regex.Matches(sentence, @"\s+").Cast<Match>().Select(arg => arg.Value).ToArray();
In either case, you can look at the Match instances' Index property values to get the location of the space/space group in the string.