Take all elements of .Split (' ') and take last OR exclude last item - c#

I have an inconsistent address field which I am reading from Excel. I need to split this field and store the two resulting values in two list properties.
The content could be something like this:
At the gates 42
I need to split the number from the rest and assign "At the gates" to the property "Street" and "42" to the property "Number".
I have solved the problem of extracting the number with this:
Number = Convert.ToString(ws.Cells[j + 3, i + 2].Value).Split(' ').LastOrDefault()
How can I split this string with LINQ and exclude just the last item to get the street?
Kind regards

Use a regex, which you can adapt flexibly to your needs.
Here's a sample:
static void Main(string[] args)
{
    Regex regex = new Regex(@"(?<words>[A-Za-z ]+)(?<digits>[0-9]+)");
    string input = "At the gates 42";
    var match = regex.Match(input);
    var a = match.Groups["words"].Value.Trim(); // At the gates (Trim drops the trailing space)
    var b = match.Groups["digits"].Value; // 42
}

Splitting the string is a bit too much here. You just need to find the last occurrence of " " and split at this position:
var address = "At the gates 42";
var idx = address.LastIndexOf(' ');
var street = address.Substring(0, idx);
var number = address.Substring(idx + 1);
Also consider using regular expressions if you need more flexibility.
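As a hedged sketch of the last-space approach, the helper below also covers input that contains no space at all, where LastIndexOf returns -1 and Substring(0, idx) would throw. The names AddressParser and SplitAddress are made up for illustration, and treating a value without a space as a street with an empty number is an assumption, not something the question specifies.

```csharp
using System;

public static class AddressParser
{
    // Splits "At the gates 42" into ("At the gates", "42") at the last space.
    // Assumption: a value without any space is returned as the street,
    // with an empty number.
    public static (string Street, string Number) SplitAddress(string address)
    {
        string trimmed = (address ?? string.Empty).Trim();
        int idx = trimmed.LastIndexOf(' ');
        if (idx < 0)
        {
            return (trimmed, string.Empty);
        }
        // TrimEnd handles several spaces between street and number.
        return (trimmed.Substring(0, idx).TrimEnd(), trimmed.Substring(idx + 1));
    }
}
```

A call such as AddressParser.SplitAddress(Convert.ToString(ws.Cells[j + 3, i + 2].Value)) would then fill both properties from a single read of the cell.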

You could just do this:
String[] sa = Convert.ToString(ws.Cells[j + 3, i + 2].Value).Split(' ');
Int32 number = Convert.ToInt32(sa[sa.Length - 1]);
String street = String.Join(" ", sa.Take(sa.Length - 1));

I would use LastIndexOf for this (if you can safely assume that your house numbers contain no spaces):
var original = Convert.ToString(ws.Cells[j + 3, i + 2].Value);
var street = original.Substring(0, original.LastIndexOf(' '));
var number = original.Substring(original.LastIndexOf(' ') + 1);
Note that it might be easier to get Excel to do the work for you, though. You could use a formula in your sheet to perform the same function.
This gives you the street name (assuming your original value is in cell A1, just replace all occurrences of A1 otherwise):
=MID(A1,1,LOOKUP(9.9999999999E+307,FIND(" ",A1,ROW($1:$1024)))-1)
and this gives you the number:
=MID(A1,LOOKUP(9.9999999999E+307,FIND(" ",A1,ROW($1:$1024)))+1,LEN(A1))

Related

Taking parts out of a string, how?

So I have a server that receives a connection, with the message being converted to a string. I then split this string by spaces.
So you have a line:
var line = "hello world my name is bob";
And you don't want "world" or "is", so you want:
"hello my name bob"
If you split to a list, remove the things you don't want and recombine to a line, you won't have extraneous spaces:
var list = line.Split().ToList();
list.Remove("world");
list.Remove("is");
var result = string.Join(" ", list);
Or, if you know the exact index positions of your list items, you can use RemoveAt. Remove them in order from highest index to lowest, because if you want to remove, say, items 1 and 4, removing 1 first means that the item that was at index 4 is now at index 3. Example:
var list = line.Split().ToList();
list.RemoveAt(4); //is
list.RemoveAt(1); //world
var result = string.Join(" ", list);
If you're seeking a behavior that is like string.Replace, which removes all occurrences, you can use RemoveAll:
var line = "hello is world is my is name is bob";
var list = line.Split().ToList();
list.RemoveAll(w => w == "is"); //every occurrence of "is"
var result = string.Join(" ", list);
You could remove the empty space using TrimStart() method.
Something like this:
string text = "Hello World";
string[] textSplited = text.Split(' ');
string result = text.Replace(textSplited[0], "").TrimStart();
Assuming that you only want to remove the first word and not all repeats of it, a much more efficient way is to use the overload of split that lets you control the maximum number of splits (the argument is the maximum number of results, which is one more than the maximum number of splits):
string[] arguments = line.Split(new[] { ' ' }, 2, StringSplitOptions.RemoveEmptyEntries); // split only once
User.data = arguments.Skip(1).FirstOrDefault();
arguments[1] does the right thing when there are "more" arguments, but throws an IndexOutOfRangeException if the number of words is zero or one. That could be fixed without LINQ by using (arguments.Length > 1) ? arguments[1] : string.Empty
If you're just removing the first word of a string, you don't need to use Split at all; doing a Substring after you found the space will be more efficient.
var line = ...
var idx = line.IndexOf(' ')+1;
line = line.Substring(idx);
or in recent C# versions
line = line[idx..];
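A minimal sketch of the Substring variant, wrapped in a helper so the edge cases are explicit. The names FirstWord and RemoveFirstWord are hypothetical, and returning an empty string when the line has only one word is an assumption (the plain IndexOf(' ') + 1 version above would return the whole line unchanged in that case).

```csharp
using System;

public static class FirstWord
{
    // Returns everything after the first word of the line.
    // Assumption: a line with zero or one word yields an empty string.
    public static string RemoveFirstWord(string line)
    {
        string trimmed = (line ?? string.Empty).TrimStart();
        int idx = trimmed.IndexOf(' ');
        return idx < 0
            ? string.Empty
            : trimmed.Substring(idx + 1).TrimStart();
    }
}
```

Unlike text.Replace(firstWord, ""), this only touches the first occurrence, so later repeats of the same word stay in place.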

Split and add specific characters to a string in C#

I am working on a keywording module where each keyword should end with a comma and the surrounding spaces should be removed from the string. An example is given below.
If the string is
one man sitting with girl , baby girl , Smiling girl
then the result should be
,one man sitting with girl,baby girl,Smiling girl,
I am trying this:
string[] strArray = str15.Split(new char[] { ',', ';' });
if (strArray.Length >= 1)
{
    foreach (string str3 in strArray)
    {
        if (str3 != string.Empty)
        {
            strr1 = strr1 + str3 + ",";
        }
    }
}
But I am not able to remove the spaces from the string.
The first thing you want to do is tokenise the string, using Split().
string input = "some words , not split , in a sensible , way";
string[] sections = input.Split(',');
This gives you an array of strings split by the comma delimiter, which would look something like this:
"some words "
" not split "
" in a sensible "
" way"
Now you want to trim those spaces off. The string class has a great little function called Trim() which removes all whitespace characters (spaces, tabs, etc) from the start and end of strings.
for (int i = 0; i < sections.Length; i++)
sections[i] = sections[i].Trim();
Now you have an array with strings like this:
"some words"
"not split"
"in a sensible"
"way"
Next, you want to join them back together with a comma delimiter.
string result = string.Join(",", sections);
This gives you something along the lines of this:
"some words,not split,in a sensible,way"
And finally, you can add the commas at the start and end:
result = "," + result + ",";
Of course, this isn't the cleanest way of doing it. It's just an easy way of describing the individual steps. You can combine all of this together using LINQ extensions:
string result = "," + string.Join(",", input.Split(',').Select(s => s.Trim())) + ",";
This takes the input, splits it on the comma delimiter, then for each item in the list executes a lambda expression s => s.Trim(), which selects the trimmed version of the string for each element. The resulting enumerable is then passed into string.Join(), and then we add the two commas at the start and end. It does the same thing as the steps above, but in one line.
Try this:
var input = "one man sitting with girl , baby girl , Smiling girl";
var output = string.Join(",", input.Split(',').Select(x => x.Trim()));
output = string.Concat(",", output, ",");
This should work:
string test = "one man sitting with girl , baby girl , Smiling girl";
Regex regex = new Regex(@"\s*,\s*");
string result = "," + regex.Replace(test, ",") + ","; // enclose in commas as requested
Split your string and join it again by removing the white-spaces:
var input = "one man sitting with girl , baby girl , Smiling girl";
var output = string.Join(",", input.Split(',').Select(x => x.Trim()));
// If you want to enclose it with commas
output = string.Format(",{0},", output);
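The answers above can be combined into one small helper. This is only a sketch: the names KeywordFormatter and Normalize are invented here, and splitting on ';' as well as ',' and dropping empty keywords are assumptions carried over from the code in the question.

```csharp
using System;
using System.Linq;

public static class KeywordFormatter
{
    // Splits on ',' and ';', trims each keyword, drops empty ones,
    // and wraps the joined result in leading and trailing commas.
    public static string Normalize(string input)
    {
        var keywords = (input ?? string.Empty)
            .Split(new[] { ',', ';' })
            .Select(k => k.Trim())
            .Where(k => k.Length > 0);
        return "," + string.Join(",", keywords) + ",";
    }
}
```

Filtering after trimming matters: a segment such as " " between two commas is not empty until it has been trimmed, so StringSplitOptions.RemoveEmptyEntries alone would let it through.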

getting string and numbers

I got a string
string newString = "[17, Appliance]";
How can I put the 17 and Appliance in two separate variables while ignoring the , and the [ and ]?
I tried looping through it, but the loop doesn't stop when it reaches the comma, not to mention that it separated 1 and 7 instead of reading them as 17.
For example, you could use this:
string[] parts = newString.Split(new[] {'[', ']', ' ', ','}, StringSplitOptions.RemoveEmptyEntries); // { "17", "Appliance" }
This is another option, even though I wouldn't go with it, especially if you might have more than one [something, anothersomething] in the string.
But there you go:
string newString = "assuming you might [17, Appliance] have it like this";
int first = newString.IndexOf('[')+1; // location of first after the `[`
int last = newString.IndexOf(']'); // location of last before the ']'
var parts = newString.Substring(first, last-first).Split(','); // an array of 2
var int_bit = parts.First().Trim(); // you could also go with parts[0]
var string_bit = parts.Last().Trim(); // and parts[1]
This may not be the most performant method, but I'd go with it for ease of understanding.
string newString = "[17, Appliance]";
newString = newString.Replace("[", "").Replace("]",""); // Remove the square brackets
string[] results = newString.Split(new string[] { ", " }, StringSplitOptions.RemoveEmptyEntries); // Split the string
// If your string is always going to contain one number and one string:
int num1 = int.Parse(results[0]);
string string1 = results[1];
You'd want to include some validation to ensure your first element is indeed a number (use int.TryParse), and that there are indeed two elements returned after you split the string.
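The validation mentioned above could look roughly like this. It is a sketch under assumptions: the names BracketPairParser and TryParsePair are hypothetical, and the input is assumed to be a single "[number, text]" pair with nothing around the brackets.

```csharp
using System;

public static class BracketPairParser
{
    // Parses "[17, Appliance]" into number = 17 and name = "Appliance".
    // Returns false instead of throwing when the input does not fit.
    public static bool TryParsePair(string input, out int number, out string name)
    {
        number = 0;
        name = string.Empty;
        if (string.IsNullOrWhiteSpace(input)) return false;

        // Strip the brackets, then split once at the first comma.
        string[] parts = input.Trim().TrimStart('[').TrimEnd(']')
            .Split(new[] { ',' }, 2);
        if (parts.Length != 2) return false;
        if (!int.TryParse(parts[0].Trim(), out number)) return false;

        name = parts[1].Trim();
        return name.Length > 0;
    }
}
```

Limiting the split to two parts keeps any further commas inside the text part rather than discarding them.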

split a string into 2 arrays based on 2 delimiters

I want to split a string into 2 arrays: one with the text that's delimited by vbTab (I think it's \t in C#) and another with the text that's delimited by a newline (I think it's \n in C#).
By searching I found this (Stack Overflow question 1254577):
string input = "abc][rfd][5][,][.";
string[] parts1 = input.Split(new string[] { "][" }, StringSplitOptions.None);
string[] parts2 = Regex.Split(input, @"\]\[");
but my string would be something like this:
aaa\tbbb\tccc\tddd\teee\nAccount\tType\tCurrency\tBalance\t123,456.78\nDate\tDetails\tAmount\n03NOV13\tTransfer\t9,999,999.00-\n02NOV13\t\Cheque\t125.00\nDebit Card Cash\t200.00
so in the above code input becomes:
string input = "aa\tbbb\tccc\tddd\teee\nAccount\tType\tPersonal Current Account\tCurrency\tGBP\tBalance\t123,456.78\nDate\tDetails\tAmount\n03NOV13\tTransfer\t9,999,999.00-\n02NOV13\t\Cheque\t125.00\nDebit Card Cash\t200.00\n30OCT13\tLoan Repayment\t1,234.56-\n\tType\t30-Day Notice Savings Account\tCurrency\tGBP\tBalance\t983,456.78\nDate\tDetails\tAmount\n03NOV13\tRepaid\t\250\n"
But how do I create one string array with everything up to the first newline and another array that holds everything after it?
Then the second one will have to be split again into several string arrays so I can write out a mini-statement with the account details, then showing the transactions for each account.
I want to be able to take the original string and produce something like this on A5 paper:
You can use a LINQ query:
var cells = from row in input.Split('\n')
select row.Split('\t');
You can get just the first row using First() and the remaining rows using Skip(). For example:
foreach (string s in cells.First())
{
Console.WriteLine("First: " + s);
}
Or
foreach (string[] row in cells.Skip(1))
{
Console.WriteLine(String.Join(",", row));
}
The code below should do what you requested. It results in part1 having 5 entries and part2 having 26 entries.
// Verbatim string: \t and \n here are literal two-character sequences, not tab or newline characters
string input = @"aa\tbbb\tccc\tddd\teee\nAccount\tType\tPersonal Current Account\tCurrency\tGBP\tBalance\t123,456.78\nDate\tDetails\tAmount\n03NOV13\tTransfer\t9,999,999.00-\n02NOV13\t\Cheque\t125.00\nDebit Card Cash\t200.00\n30OCT13\tLoan Repayment\t1,234.56-\n\tType\t30-Day Notice Savings Account\tCurrency\tGBP\tBalance\t983,456.78\nDate\tDetails\tAmount\n03NOV13\tRepaid\t\250\n";
// Substring starting at 0 and ending where the first literal \n begins
string input1 = input.Substring(0, input.IndexOf(@"\n"));
/* Substring starting where the first literal \n begins
   plus its length (2 characters) to the end */
string input2 = input.Substring(input.IndexOf(@"\n") + 2);
string[] part1 = Regex.Split(input1, @"\\t");
string[] part2 = Regex.Split(input2, @"\\t");
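If the input contains real tab and newline characters rather than literal backslash sequences, the header row and the remaining rows can be separated along these lines. This is a sketch only: the names StatementParser and Parse are invented for illustration, and skipping empty lines and stripping a trailing '\r' are assumptions about the data.

```csharp
using System;
using System.Linq;

public static class StatementParser
{
    // Splits the input into rows on '\n' and each row into cells on '\t'.
    // The first row is returned as the header, all others as data rows.
    public static (string[] Header, string[][] Rows) Parse(string input)
    {
        string[][] lines = (input ?? string.Empty)
            .Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.TrimEnd('\r').Split('\t'))
            .ToArray();

        if (lines.Length == 0)
        {
            return (new string[0], new string[0][]);
        }
        return (lines[0], lines.Skip(1).ToArray());
    }
}
```

Keeping each row as its own string[] preserves the row boundaries, which the single flat array of 26 entries does not, so the per-account blocks of the mini-statement can be found again later.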

Getting a value from a string using regular expressions?

I have a string "Page 1 of 15".
I need to get the value 15; this value could be any number between 1 and 100. I'm trying to figure out:
Whether regular expressions are best suited here. Considering the string format will never change, maybe just split the string by spaces? Or is there a better solution?
How to get the value using a regular expression.
Regular expression you can use: Page \d+ of (\d+)
Regex re = new Regex(#"Page \d+ of (\d+)");
string input = "Page 1 of 15";
Match match = re.Match(input);
string pages = match.Groups[1].Value;
Analysis of expression: between ( and ) you capture a group. The \d stands for digit, the + for one or more digits. Note that it is important to be exact, so copy the spaces as well.
The code is a tad verbose, but I figured it'd be better understandable this way. With a split you just need: var pages = input.Split(' ')[3];, looks easier, but is error-prone. The regex is easily extended to grab other parts of the string in one go.
var myString = "Page 1 of 15";
var number = myString.Substring(myString.LastIndexOf(' ') + 1);
If there is a risk of whitespace at the end of the string, apply TrimEnd first:
var trimmed = myString.TrimEnd();
var number = trimmed.Substring(trimmed.LastIndexOf(' ') + 1);
I think a simple String.Replace() is the best, most readable solution for something so simple.
string myString = "Page 1 of 15";
string pageNumber = myString.Replace("Page 1 of ", "");
EDIT:
The above solution assumes that the string will never be Page 2 of 15. If the first page number can change, you'll need to use String.Split() instead.
string myString = "Page 1 of 15";
string pageNumber = myString.Split(new string[] {"of"},
StringSplitOptions.None).Last().Trim();
If the string format will never change, then you can try this:
string input = "Page 1 of 15";
string[] splits = input.Split(' ');
int totalPages = int.Parse(splits[splits.Length - 1]);
If this is the only case you will ever have to handle, just split the string by spaces and parse the fourth part to an integer. Something like this will work:
string input = "Page 1 of 15";
string[] splitInput = input.Split(' ');
int numberOfPages = int.Parse(splitInput[3]);
In C# it should look like this (using the Regex class):
Regex r = new Regex(@"Page \d+ of (\d+)");
var str = "Page 1 of 15";
var matches = r.Matches(str);
Your result will be in: matches[0].Groups[1]
You don't need a regex for this. Just use the index of the last space:
string text = "Page 1 of 100";
string secondvar = string.Empty;
int startindex = text.LastIndexOf(" ");
if (startindex > -1)
{
    secondvar = text.Substring(startindex + 1);
}
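For completeness, here is a hedged sketch that combines the regex from the answers above with int.TryParse, so that malformed input yields false rather than an exception. The names PageInfo and TryGetTotalPages are hypothetical, and anchoring the pattern to the whole string while allowing flexible whitespace is an assumption about how strict the match should be.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PageInfo
{
    // Anchored variant of "Page \d+ of (\d+)" that tolerates extra whitespace.
    private static readonly Regex Pattern =
        new Regex(@"^\s*Page\s+(\d+)\s+of\s+(\d+)\s*$");

    // Extracts the total page count from strings such as "Page 1 of 15".
    public static bool TryGetTotalPages(string input, out int totalPages)
    {
        totalPages = 0;
        Match match = Pattern.Match(input ?? string.Empty);
        return match.Success
            && int.TryParse(match.Groups[2].Value, out totalPages);
    }
}
```

The first capture group holds the current page number, so the same match can supply both values if they are needed later.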
