I am trying to find an efficient way to split up a string. I have a string in the format below, which represents a value.
string input = "1A2B3C4D5DC";
I have to fetch the numeric value next to each letter so that I can compute the final value.
Currently I'm doing this. It works fine, but can you suggest a better approach?
public double GetValue(string input)
{
string value;
int beginIndex = 0, endIndex = 0, unit1 = 0, unit2 = 0, unit3 = 0, unit4 = 0, unit5 = 0;
input = input.Replace("cd", "zz");
if (input.ToLower().Contains("a"))
{
endIndex = input.ToLower().IndexOf('a');
value = input.Substring(beginIndex, endIndex - beginIndex);
int.TryParse(value, out unit1);
beginIndex = endIndex + 1;
}
if (input.ToLower().Contains("b"))
{
endIndex = input.ToLower().IndexOf('b');
value = input.Substring(beginIndex, endIndex - beginIndex);
int.TryParse(value, out unit2);
beginIndex = endIndex + 1;
}
if (input.ToLower().Contains("c") )
{
endIndex = input.ToLower().IndexOf('c');
value = input.Substring(beginIndex, endIndex - beginIndex);
int.TryParse(value, out unit3);
beginIndex = endIndex + 1;
}
if (input.ToLower().Contains("d"))
{
endIndex = input.ToLower().IndexOf('d');
value = input.Substring(beginIndex, endIndex - beginIndex);
int.TryParse(value, out unit4);
beginIndex = endIndex + 1;
}
if (input.Length > beginIndex + 2)
{
value = input.Substring(beginIndex, input.Length - beginIndex - 2);
int.TryParse(value, out unit5);
}
return (unit1 * 10 + unit2 * 20 + unit3 * 30 + unit4 * 40 + unit5 * 50); //some calculation
}
Possible inputs are: 21A34DC, 4C, 2BDC, 2B. Basically, all of the units are optional, but if present they have to appear in the same sequence.
You can find all numbers within a string with a regular expression:
string input = "1A2B3C4D5DC";
Regex rx = new Regex(@"\d+");
// Regex rx = new Regex(@"-?\d+"); // this one includes negative integers
var matches = rx.Matches(input);
int[] numbers = matches.OfType<Match>()
.Select(m => Convert.ToInt32(m.Value))
.ToArray();
make necessary computations with resulting array.
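A plain digit pattern drops the letters, so the resulting array no longer says which unit each number belongs to. The following is only a sketch, assuming the weights from the question's calculation (A=10, B=20, C=30, D=40, DC=50); the class name `UnitParser` and its members are hypothetical. It captures each number together with the unit that follows it:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnitParser
{
    // Weights assumed from the calculation in the question; "DC" is the two-letter unit.
    private static readonly Dictionary<string, int> Weights = new Dictionary<string, int>
    {
        { "A", 10 }, { "B", 20 }, { "C", 30 }, { "D", 40 }, { "DC", 50 }
    };

    // "DC" is listed before the single letters so that "5DC" is not read as "5D".
    private static readonly Regex Pair =
        new Regex(@"([0-9]+)(DC|[A-D])", RegexOptions.IgnoreCase);

    public static int GetValue(string input)
    {
        int total = 0;
        foreach (Match m in Pair.Matches(input))
        {
            int number = int.Parse(m.Groups[1].Value);
            total += number * Weights[m.Groups[2].Value.ToUpperInvariant()];
        }
        return total;
    }
}
```

This sketch does not enforce the order of the units, and a unit with no number in front of it (such as the bare DC in 2BDC) contributes nothing to the total.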
If you want to extract just numbers from string, then use Regular Expressions:
string input = "1A2B3C4D5DC";
var resultString = Regex.Replace(input, @"[^0-9]+", "");
Or linq way:
string input = "1A2B3C4D5DC";
var resultString = new String(input.Where(Char.IsDigit).ToArray());
Just looking at your code, there is a lot of repetition, so refactoring it "as is" with a mapping dictionary is likely a good solution.
Something like this
public static double GetValue(string input)
{
var map = new Dictionary<string, int>()
{
{"a", 10 }, {"b", 20}, {"c", 30}, {"d", 40}
};
int result = 0;
foreach(var i in map)
{
int endIndex, outValue;
string value;
endIndex = input.ToLower().IndexOf(i.Key);
value = input.Substring(endIndex -1, 1);
int.TryParse(value, out outValue);
result += (i.Value * outValue);
}
return result;
}
The following code works for me:
public double GetValue(string input)
{
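// The line that builds "match" is garbled in the source; the pattern below is an assumption
// (one optional group per unit, in order, each group including its trailing letter or letters).
Match match = Regex.Match(input.ToUpper(), @"^(\d+A)?(\d+B)?(\d+C)?(\d+D)?(\d+DC)?$");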
string value;
int aValue = 0, bValue = 0, cValue = 0, dvalue = 0, cdValue = 0;
if (match.Groups[5].Success && !string.IsNullOrEmpty(match.Groups[5].Value))
{
string val = match.Groups[5].Value;
if (!int.TryParse(val.Substring(0, val.Length - 2), out cdValue))
{
return -1;
}
}
if (match.Groups[4].Success && !string.IsNullOrEmpty(match.Groups[4].Value))
{
string val = match.Groups[4].Value;
if (!int.TryParse(val.Substring(0, val.Length - 1), out dvalue))
{
return -1;
}
}
if (match.Groups[3].Success && !string.IsNullOrEmpty(match.Groups[3].Value))
{
string val = match.Groups[3].Value;
if (!int.TryParse(val.Substring(0, val.Length - 1), out cValue))
{
return -1;
}
}
if (match.Groups[2].Success && !string.IsNullOrEmpty(match.Groups[2].Value))
{
string val = match.Groups[2].Value;
if (!int.TryParse(val.Substring(0, val.Length - 1), out bValue))
{
return -1;
}
}
if (match.Groups[1].Success && !string.IsNullOrEmpty(match.Groups[1].Value))
{
string val = match.Groups[1].Value;
if (!int.TryParse(val.Substring(0, val.Length - 1), out aValue))
{
return -1;
}
}
return (aValue * 10 + bValue * 20 + cValue * 30 + dvalue * 40 + cdValue * 50); //some calculation
}
Tell me if this produces the expected output:
static void Main(string[] args)
{
int sum = GetValue("1A2B3C4D5DC");
// {1,2,3,4,5} = 10*(1+2*2+3*3+4*4+5*5) = 550
}
public static int GetValue(string input)
{
// make input all lowercase
input = input.ToLower();
// replace terminator dc with next letter to
// avoid failing the search;
input = input.Replace("dc", "e");
// initialize all unit values to zero
const string tokens = "abcde";
int[] units = new int[tokens.Length];
// keep track of position of last parsed number
int start = 0;
for (int index = 0; index < tokens.Length; index++)
{
// fetch next letter
char token = tokens[index];
// find letter in input
int position = input.IndexOf(token, start);
// if found
if (position>start)
{
// extract string before letter
string temp = input.Substring(start, position-start);
// and convert to integer
int.TryParse(temp, out units[index]);
}
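// update last parsed position, but only if the letter was found
// (otherwise position is -1 and start would be reset to 0)
if (position >= 0) start = position + 1;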
}
// add unit values, each one worth +10 more than the
// previous one.
//
// {x,y,z} = 10*x + 20*y + 30*z
int sum = 0;
for (int i = 0; i < units.Length; i++)
{
sum += 10*(i+1)*units[i];
}
return sum;
}
}
Please add some test cases in the question with the expected results just to make sure our answers are correct.
"1A2B3C4D5DC" => 550
???
Out of curiosity, is there a faster/more efficient way to parse a dynamic list of ints from a string?
Currently I have this, and it works absolutely fine; I was just thinking there might be a better way as this seems a little overly complex for something so simple.
public static void Send(string providerIDList)
{
String[] providerIDArray = providerIDList.Split('|');
var providerIDs = new List<int>();
for (int counter = 0; counter < providerIDArray.Count(); counter++)
{
providerIDs.Add(int.Parse(providerIDArray[counter].ToString()));
}
//do some stuff with the parsed list of int
}
Edit: Perhaps I should have said a more simple way to parse out my list from the string. But since the original question did state faster and more efficient the chosen answer will reflect that.
There's definitely a better way. Use LINQ:
var providerIDs = providerIDList.Split('|')
.Select(x => int.Parse(x))
.ToList();
Or using a method group conversion instead of a lambda expression:
var providerIDs = providerIDList.Split('|')
.Select(int.Parse)
.ToList();
This is not the most efficient way it can be done, but it's quite possibly the simplest. It's about as efficient as your approach - though that could be made slightly more efficient fairly easily, e.g. giving the List an initial capacity.
The difference in performance is likely to be irrelevant, so I'd stick with this simple code until you've got evidence that it's a bottleneck.
Note that if you don't need a List<int> - if you just need something you can iterate over once - you can kill the ToList call and use providerIDs as an IEnumerable<int>.
EDIT: If we're in the efficiency business, then here's an adaptation of the ForEachChar method, to avoid using int.Parse:
public static List<int> ForEachCharManualParse(string s, char delim)
{
List<int> result = new List<int>();
int tmp = 0;
foreach(char x in s)
{
if(x == delim)
{
result.Add(tmp);
tmp = 0;
}
else if (x >= '0' && x <= '9')
{
tmp = tmp * 10 + x - '0';
}
else
{
throw new ArgumentException("Invalid input: " + s);
}
}
result.Add(tmp);
return result;
}
Notes:
This will add zeroes for any consecutive delimiters, or a delimiter at the start or end
It doesn't handle negative numbers
It doesn't check for overflow
As noted in comments, using a switch statement instead of the x >= '0' && x <= '9' can improve the performance further (by about 10-15%)
If none of those are a problem for you, it's about 7x faster than ForEachChar on my machine:
ListSize 1000 : StringLen 10434
ForEachChar1000 Time : 00:00:02.1536651
ForEachCharManualParse1000 Time : 00:00:00.2760543
ListSize 100000 : StringLen 1048421
ForEachChar100000 Time : 00:00:02.2169482
ForEachCharManualParse100000 Time : 00:00:00.3087568
ListSize 10000000 : StringLen 104829611
ForEachChar10000000 Time : 00:00:22.0803706
ForEachCharManualParse10000000 Time : 00:00:03.1206769
The limitations can be worked around, but I haven't bothered... let me know if they're significant concerns for you.
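As an illustration of how two of those limitations might be worked around, here is a sketch (not benchmarked, and not part of the measurements above) that accepts a leading minus sign and throws on overflow. The names `ManualParser` and `ParseSigned` are hypothetical:

```csharp
using System;
using System.Collections.Generic;

public static class ManualParser
{
    public static List<int> ParseSigned(string s, char delim)
    {
        var result = new List<int>();
        int tmp = 0;
        bool negative = false;
        bool atStart = true; // true until the first character of the current number is seen

        foreach (char x in s)
        {
            if (x == delim)
            {
                result.Add(tmp);
                tmp = 0;
                negative = false;
                atStart = true;
            }
            else if (x == '-' && atStart)
            {
                negative = true;
                atStart = false;
            }
            else if (x >= '0' && x <= '9')
            {
                int digit = x - '0';
                // checked turns silent wrap-around into an OverflowException.
                // Negative numbers are accumulated downwards so int.MinValue still fits.
                tmp = checked(negative ? tmp * 10 - digit : tmp * 10 + digit);
                atStart = false;
            }
            else
            {
                throw new ArgumentException("Invalid input: " + s);
            }
        }
        result.Add(tmp);
        return result;
    }
}
```

Like the original, this still adds a zero for an empty field, and a lone "-" also yields zero. The extra branch and the checked arithmetic will cost some of the speed advantage.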
I don't like any of the answers so far. So to actually answer the question the OP posed "fastest/most efficient" String.Split with Int.Parse, I wrote and tested some code.
Using Mono on an Intel 3770k.
I found that using String.Split + IEnum.Select is not the fastest (maybe the prettiest) solution. In fact it's the slowest.
Here's some benchmark results
ListSize 1000 : StringLen 10468
SplitForEach1000 Time : 00:00:02.8704048
SplitSelect1000 Time : 00:00:02.9134658
ForEachChar1000 Time : 00:00:01.8254438
SplitParallelSelectr1000 Time : 00:00:07.5421146
ForParallelForEachChar1000 Time : 00:00:05.3534218
ListSize 100000 : StringLen 1048233
SplitForEach100000 Time : 00:00:01.9500846
SplitSelect100000 Time : 00:00:02.2662606
ForEachChar100000 Time : 00:00:01.2554577
SplitParallelSelectr100000 Time : 00:00:02.6509969
ForParallelForEachChar100000 Time : 00:00:01.5842131
ListSize 10000000 : StringLen 104824707
SplitForEach10000000 Time : 00:00:18.2658261
SplitSelect10000000 Time : 00:00:20.6043874
ForEachChar10000000 Time : 00:00:10.0555613
SplitParallelSelectr10000000 Time : 00:00:18.1908017
ForParallelForEachChar10000000 Time : 00:00:08.6756213
Here's the code to get the benchmark results
using System;
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Diagnostics;
namespace FastStringSplit
{
class MainClass
{
public static void Main (string[] args)
{
Random rnd = new Random();
char delim = ':';
int[] sizes = new int[]{1000, 100000, 10000000 };
int[] iters = new int[]{10000, 100, 10};
Stopwatch sw;
List<int> list, result = new List<int>();
string str;
for(int s=0; s<sizes.Length; s++) {
list = new List<int>(sizes[s]);
for(int i=0; i<sizes[s]; i++)
list.Add (rnd.Next());
str = string.Join(":", list);
Console.WriteLine(string.Format("\nListSize {0} : StringLen {1}", sizes[s], str.Length));
////
sw = new Stopwatch();
for(int i=0; i<iters[s]; i++) {
sw.Start();
result = SplitForEach(str, delim);
sw.Stop();
}
Console.WriteLine("SplitForEach" + result.Count + " Time : " + sw.Elapsed.ToString());
////
sw = new Stopwatch();
for(int i=0; i<iters[s]; i++) {
sw.Start();
result = SplitSelect(str, delim);
sw.Stop();
}
Console.WriteLine("SplitSelect" + result.Count + " Time : " + sw.Elapsed.ToString());
////
sw = new Stopwatch();
for(int i=0; i<iters[s]; i++) {
sw.Start();
result = ForEachChar(str, delim);
sw.Stop();
}
Console.WriteLine("ForEachChar" + result.Count + " Time : " + sw.Elapsed.ToString());
////
sw = new Stopwatch();
for(int i=0; i<iters[s]; i++) {
sw.Start();
result = SplitParallelSelect(str, delim);
sw.Stop();
}
Console.WriteLine("SplitParallelSelectr" + result.Count + " Time : " + sw.Elapsed.ToString());
////
sw = new Stopwatch();
for(int i=0; i<iters[s]; i++) {
sw.Start();
result = ForParallelForEachChar(str, delim);
sw.Stop();
}
Console.WriteLine("ForParallelForEachChar" + result.Count + " Time : " + sw.Elapsed.ToString());
}
}
public static List<int> SplitForEach(string s, char delim) {
List<int> result = new List<int>();
foreach(string x in s.Split(delim))
result.Add(int.Parse (x));
return result;
}
public static List<int> SplitSelect(string s, char delim) {
return s.Split(delim)
.Select(int.Parse)
.ToList();
}
public static List<int> ForEachChar(string s, char delim) {
List<int> result = new List<int>();
int start = 0;
int end = 0;
foreach(char x in s) {
if(x == delim || end == s.Length - 1) {
if(end == s.Length - 1)
end++;
result.Add(int.Parse (s.Substring(start, end-start)));
start = end + 1;
}
end++;
}
return result;
}
public static List<int> SplitParallelSelect(string s, char delim) {
return s.Split(delim)
.AsParallel()
.Select(int.Parse)
.ToList();
}
public static int NumOfThreads = Environment.ProcessorCount > 2 ? Environment.ProcessorCount : 2;
public static List<int> ForParallelForEachChar(string s, char delim) {
int chunkSize = (s.Length / NumOfThreads) + 1;
ConcurrentBag<int> result = new ConcurrentBag<int>();
int[] chunks = new int[NumOfThreads+1];
Task[] tasks = new Task[NumOfThreads];
for(int x=0; x<NumOfThreads; x++) {
int next = chunks[x] + chunkSize;
while(next < s.Length) {
if(s[next] == delim)
break;
next++;
}
//Console.WriteLine(next);
chunks[x+1] = Math.Min(next, s.Length);
tasks[x] = Task.Factory.StartNew((o) => {
int chunkId = (int)o;
int start = chunks[chunkId];
int end = chunks[chunkId + 1];
if(start >= s.Length)
return;
if(s[start] == delim)
start++;
//Console.WriteLine(string.Format("{0} {1}", start, end));
for(int i = start; i<end; i++) {
if(s[i] == delim || i == end-1) {
if(i == end-1)
i++;
result.Add(int.Parse (s.Substring(start, i-start)));
start = i + 1;
}
}
}, x);
}
Task.WaitAll(tasks);
return result.ToList();
}
}
}
Here's the function I recommend
public static List<int> ForEachChar(string s, char delim) {
List<int> result = new List<int>();
int start = 0;
int end = 0;
foreach(char x in s) {
if(x == delim || end == s.Length - 1) {
if(end == s.Length - 1)
end++;
result.Add(int.Parse (s.Substring(start, end-start)));
start = end + 1;
}
end++;
}
return result;
}
Why is it faster?
It doesn't split the string into an array first. It does the splitting and parsing at the same time so there is no added overhead of iterating over the string to split it and then iterating over the array to parse it.
I also threw in a parallel-ized version using tasks, but it is only faster in the case with very large strings.
This appears cleaner:
var providerIDs = providerIDList.Split('|').Select(x => int.Parse(x)).ToList();
If you really want to know the most efficient way, then use unsafe code: define a char pointer from the string, iterate over all chars by incrementing the pointer, buffer the chars read until the next '|', and convert the buffered chars to an Int32. If you want to be really fast, do the conversion manually: begin with the last char, subtract the value of the '0' char, multiply it by 1, 10, 100, 1000... according to the digit position, then add it to the sum variable. I don't have time to write the code, but hopefully you get the idea.
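The idea of reading the digits in place can also be tried without unsafe code. The sketch below swaps the pointer approach for `ReadOnlySpan<char>`, which assumes .NET Core 2.1 or later; `SpanParser` and `ParseInts` are hypothetical names, and no performance claim is made for it:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SpanParser
{
    public static List<int> ParseInts(string s, char delim)
    {
        var result = new List<int>();
        ReadOnlySpan<char> rest = s.AsSpan();
        while (true)
        {
            int idx = rest.IndexOf(delim);
            ReadOnlySpan<char> field = idx < 0 ? rest : rest.Slice(0, idx);
            // Parses the slice in place, so no Substring is allocated per field.
            result.Add(int.Parse(field, NumberStyles.Integer, CultureInfo.InvariantCulture));
            if (idx < 0) break;
            rest = rest.Slice(idx + 1);
        }
        return result;
    }
}
```

Empty fields (consecutive delimiters, or a delimiter at the start or end) make `int.Parse` throw a FormatException here.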
How can I take the value 123456789012345 or 1234567890123456 and turn it into:
************2345 and ************3456
The difference between the strings above is that one contains 15 digits and the other contains 16.
I have tried the following, but it does not keep the last 4 digits of the 15-digit number. No matter what the length of the string is, be it 13, 14, 15, or 16, I want to mask all leading digits with a * but keep the last 4. Here is what I have tried:
String.Format("{0}{1}", "************", str.Substring(11, str.Length - 12))
Something like this:
string s = "1234567890123"; // example
string result = s.Substring(s.Length - 4).PadLeft(s.Length, '*');
This will mask all but the last four characters of the string. It assumes that the source string is at least 4 characters long.
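If shorter inputs are possible, the same expression can be wrapped in a guard. This is a minimal sketch; the names `Masking` and `MaskAllButLast` are hypothetical, and returning short strings unmasked is an assumption about the desired behaviour:

```csharp
using System;

public static class Masking
{
    // Masks every character except the last `visible` ones.
    public static string MaskAllButLast(string s, int visible = 4, char mask = '*')
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        if (visible < 0) throw new ArgumentOutOfRangeException(nameof(visible));
        if (s.Length <= visible) return s; // nothing to mask
        return s.Substring(s.Length - visible).PadLeft(s.Length, mask);
    }
}
```

For example, `Masking.MaskAllButLast("123456789012345")` keeps "2345" and pads it back to 15 characters with asterisks.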
using System;
class Program
{
static void Main()
{
var str = "1234567890123456";
if (str.Length > 4)
{
Console.WriteLine(
string.Concat(
"".PadLeft(str.Length - 4, '*'),
str.Substring(str.Length - 4)
)
);
}
else
{
Console.WriteLine(str);
}
}
}
Easiest way: Create an extension method to extract the last four digits. Use that in your String.Format call.
For example:
public static string LastFour(this string value)
{
if (string.IsNullOrEmpty(value) || value.Length < 4)
{
return "0000";
}
return value.Substring(value.Length - 4, 4);
}
In your code:
String.Format("{0}{1}", "************", str.LastFour());
In my opinion, this leads to more readable code, and it's reusable.
EDIT: Perhaps not the easiest way, but an alternative way that may produce more maintainable results. <shrug/>
Try this:
var maskSize = ccDigits.Length - 4;
var mask = new string('*', maskSize) + ccDigits.Substring(maskSize);
LINQ:
char maskBy = '*';
string input = "123456789012345";
int count = input.Length <= 4 ? 0 : input.Length - 4;
string output = new string(input.Select((c, i) => i < count ? maskBy : c).ToArray());
static private String MaskInput(String input, int charactersToShowAtEnd)
{
if (input.Length < charactersToShowAtEnd)
{
charactersToShowAtEnd = input.Length;
}
String endCharacters = input.Substring(input.Length - charactersToShowAtEnd);
return String.Format(
"{0}{1}",
"".PadLeft(input.Length - charactersToShowAtEnd, '*'),
endCharacters
);
}
Adjust the function header as required, call with:
MaskInput("yourInputHere", 4);
private string MaskDigits(string input)
{
//take first 6 characters
string firstPart = input.Substring(0, 6);
//take last 4 characters
int len = input.Length;
string lastPart = input.Substring(len - 4, 4);
//take the middle part (****)
int middlePartLenght = len - (firstPart.Length + lastPart.Length);
string middlePart = new String('*', middlePartLenght);
return firstPart + middlePart + lastPart;
}
MaskDigits("1234567890123456");
// output : "123456******3456"
Try the following:
private string MaskString(string s)
{
int NUM_ASTERISKS = 4;
if (s.Length < NUM_ASTERISKS) return s;
int asterisks = s.Length - NUM_ASTERISKS;
string result = new string('*', asterisks);
result += s.Substring(s.Length - NUM_ASTERISKS);
return result;
}
Regex with a match evaluator will do the job
string filterCC(string source) {
var x=new Regex(@"^\d+(?=\d{4}$)");
return x.Replace(source,match => new String('*',match.Value.Length));
}
This will match any number of digits followed by 4 digits and the end (it won't include the 4 digits in the replace). The replace function will replace the match with a string of * of equal length.
This has the additional benefit that you could use it as a validation algorthim too. Change the first + to {11,12} to make it match a total of 15 or 16 chars and then you can use x.IsMatch to determine validity.
EDIT
Alternatively if you always want a 16 char result just use
return x.Replace(source,new String('*',12));
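The validation variant mentioned in this answer (changing the first + to {11,12}) could look like the sketch below. The class and method names are hypothetical, and `[0-9]` is used instead of `\d` as an assumption that only ASCII digits should be accepted:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CardMask
{
    // 11 or 12 leading digits followed by exactly 4 more digits: 15 or 16 in total.
    private static readonly Regex Leading = new Regex(@"^[0-9]{11,12}(?=[0-9]{4}$)");

    // True only for strings of 15 or 16 digits.
    public static bool IsValid(string source) => Leading.IsMatch(source);

    // Replaces the leading digits with asterisks; strings that do not match are returned unchanged.
    public static string Mask(string source) =>
        Leading.Replace(source, m => new string('*', m.Value.Length));
}
```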
// "123456789".MaskFront results in "****56789"
public static string MaskFront(this string str, int len, char c)
{
var strArray = str.ToCharArray();
for (var i = 0; i < len; i++)
{
if(i < strArray.Length)
{
strArray[i] = c;
}
else
{
break;
}
}
return string.Join("", strArray);
}
// "123456789".MaskBack results in "12345****"
public static string MaskBack(this string str, int len, char c)
{
var strArray = str.ToCharArray();
var tracker = strArray.Length - 1;
for (var i = 0; i < len; i++)
{
if (tracker > -1)
{
strArray[tracker] = c;
tracker--;
}
else
{
break;
}
}
return string.Join("", strArray);
}
Try this out:
static string Mask(string str)
{
if (str.Length <= 4) return str;
Regex rgx = new Regex(@"(.*?)(\d{4})$");
string result = String.Empty;
if (rgx.IsMatch(str))
{
for (int i = 0; i < rgx.Matches(str)[0].Groups[1].Length; i++)
result += "*";
result += rgx.Matches(str)[0].Groups[2];
return result;
}
return str;
}
Mask a number of characters at the start and at the end with a given character:
public static string Maskwith(this string value, int fromStart, int fromEnd, char ch)
{
return (value?.Length >= fromStart + fromEnd) ?
string.Concat(Enumerable.Repeat(ch, fromStart)) + value.Substring(fromStart, value.Length - (fromStart + fromEnd)) + string.Concat(Enumerable.Repeat(ch, fromEnd))
: "";
} //Console.WriteLine("mytestmask".Maskwith(2,3,'*')); **testm***
Show a number of characters at the start and at the end, and mask the middle:
public static string MasktheMiddle(this string value, int visibleCharLength, char ch)
{
if (value?.Length <= (visibleCharLength * 2))
return string.Concat(Enumerable.Repeat(ch,value.Length));
else
return value.Substring(0, visibleCharLength) + string.Concat(Enumerable.Repeat(ch, value.Length - (visibleCharLength * 2))) + value.Substring(value.Length - visibleCharLength);
} //Console.WriteLine("mytestmask".MasktheMiddle(2,'*')); Result: my******sk
How can I take the value 123456789012345 or 1234567890123456 and turn it into:
************2345 and ************3456
one more way to do this:
var result = new string('*', value.Length - 4) + new string(value.Skip(value.Length - 4).ToArray());
// or using string.Join
An extension method using C# 8's index and range:
public static string MaskStart(this string input, int showNumChars, char maskChar = '*') =>
input[^Math.Min(input.Length, showNumChars)..]
.PadLeft(input.Length, maskChar);
A simple way
string s = "1234567890123"; // example
int l = s.Length;
s = s.Substring(l - 4);
string r = new string('*', l - 4);
r = r + s;
I need to split a number into even parts for example:
32427237 needs to become 324 272 37
103092501 needs to become 103 092 501
How does one go about splitting it and handling odd number situations such as a split resulting in these parts e.g. 123 456 789 0?
If you have to do that in many places in your code you can create a fancy extension method:
static class StringExtensions {
public static IEnumerable<String> SplitInParts(this String s, Int32 partLength) {
if (s == null)
throw new ArgumentNullException(nameof(s));
if (partLength <= 0)
throw new ArgumentException("Part length has to be positive.", nameof(partLength));
for (var i = 0; i < s.Length; i += partLength)
yield return s.Substring(i, Math.Min(partLength, s.Length - i));
}
}
You can then use it like this:
var parts = "32427237".SplitInParts(3);
Console.WriteLine(String.Join(" ", parts));
The output is 324 272 37 as desired.
When you split the string into parts new strings are allocated even though these substrings already exist in the original string. Normally, you shouldn't be too concerned about these allocations but using modern C# you can avoid this by altering the extension method slightly to use "spans":
public static IEnumerable<ReadOnlyMemory<char>> SplitInParts(this String s, Int32 partLength)
{
if (s == null)
throw new ArgumentNullException(nameof(s));
if (partLength <= 0)
throw new ArgumentException("Part length has to be positive.", nameof(partLength));
for (var i = 0; i < s.Length; i += partLength)
yield return s.AsMemory().Slice(i, Math.Min(partLength, s.Length - i));
}
The return type is changed to public static IEnumerable<ReadOnlyMemory<char>> and the substrings are created by calling Slice on the source which doesn't allocate.
Notice that if you at some point have to convert ReadOnlyMemory<char> to string for use in an API a new string has to be allocated. Fortunately, there exists many .NET Core APIs that uses ReadOnlyMemory<char> in addition to string so the allocation can be avoided.
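As a sketch of what consuming the parts without allocating strings might look like, the example below parses each slice directly through its `Span`. It assumes .NET Core 2.1 or later for `int.Parse(ReadOnlySpan<char>, ...)`; `MemoryParts` and `SumParts` are hypothetical names, and the splitter is repeated only to keep the example self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class MemoryParts
{
    public static IEnumerable<ReadOnlyMemory<char>> SplitInParts(string s, int partLength)
    {
        if (s == null)
            throw new ArgumentNullException(nameof(s));
        if (partLength <= 0)
            throw new ArgumentException("Part length has to be positive.", nameof(partLength));
        for (var i = 0; i < s.Length; i += partLength)
            yield return s.AsMemory().Slice(i, Math.Min(partLength, s.Length - i));
    }

    // Sums the numeric value of each part without creating intermediate strings.
    public static int SumParts(string digits, int partLength)
    {
        int sum = 0;
        foreach (ReadOnlyMemory<char> part in SplitInParts(digits, partLength))
            sum += int.Parse(part.Span, NumberStyles.None, CultureInfo.InvariantCulture);
        return sum;
    }
}
```

For "32427237" with a part length of 3, the parts are 324, 272 and 37, so `SumParts` adds those three values.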
You could use a simple for loop to insert blanks at every n-th position:
string input = "12345678";
StringBuilder sb = new StringBuilder();
for (int i = 0; i < input.Length; i++)
{
if (i > 0 && i % 3 == 0)
sb.Append(' ');
sb.Append(input[i]);
}
string formatted = sb.ToString();
One very simple way to do this (not the most efficient, but then not orders of magnitude slower than the most efficient).
public static List<string> GetChunks(string value, int chunkSize)
{
List<string> triplets = new List<string>();
while (value.Length > chunkSize)
{
triplets.Add(value.Substring(0, chunkSize));
value = value.Substring(chunkSize);
}
if (value != "")
triplets.Add(value);
return triplets;
}
Here's an alternative:
public static List<string> GetChunkss(string value, int chunkSize)
{
List<string> triplets = new List<string>();
for(int i = 0; i < value.Length; i += chunkSize)
if(i + chunkSize > value.Length)
triplets.Add(value.Substring(i));
else
triplets.Add(value.Substring(i, chunkSize));
return triplets;
}
This is half a decade late but:
int n = 3;
string originalString = "32427237";
string splitString = string.Join(string.Empty,originalString.Select((x, i) => i > 0 && i % n == 0 ? string.Format(" {0}", x) : x.ToString()));
LINQ rules:
var input = "1234567890";
var partSize = 3;
var output = input.ToCharArray()
.BufferWithCount(partSize)
.Select(c => new String(c.ToArray()));
UPDATED:
string input = "1234567890";
double partSize = 3;
int k = 0;
var output = input
.ToLookup(c => Math.Floor(k++ / partSize))
.Select(e => new String(e.ToArray()));
If you know that the whole string's length is exactly divisible by the part size, then use:
var whole = "32427237!";
var partSize = 3;
var parts = Enumerable.Range(0, whole.Length / partSize)
.Select(i => whole.Substring(i * partSize, partSize));
But if there's a possibility the whole string may have a fractional chunk at the end, you need a little more sophistication:
var whole = "32427237";
var partSize = 3;
var parts = Enumerable.Range(0, (whole.Length + partSize - 1) / partSize)
.Select(i => whole.Substring(i * partSize, Math.Min(whole.Length - i * partSize, partSize)));
In these examples, parts will be an IEnumerable, but you can add .ToArray() or .ToList() at the end in case you want a string[] or List<string> value.
The splitting method:
public static IEnumerable<string> SplitInGroups(this string original, int size) {
var p = 0;
var l = original.Length;
while (l - p > size) {
yield return original.Substring(p, size);
p += size;
}
yield return original.Substring(p);
}
To join back as a string, delimited by spaces:
var joined = String.Join(" ", myNumber.SplitInGroups(3).ToArray());
Edit: I like Martin Liversage solution better :)
Edit 2: Fixed a bug.
Edit 3: Added code to join the string back.
I would do something like this, although I'm sure there are other ways. Should perform pretty well.
public static string Format(string number, int batchSize, string separator)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i * batchSize < number.Length; i++)
{
if (i > 0) sb.Append(separator);
int currentIndex = i * batchSize;
sb.Append(number.Substring(currentIndex,
Math.Min(batchSize, number.Length - currentIndex)));
}
return sb.ToString();
}
I like this cause its cool, albeit not super efficient:
var n = 3;
var split = "12345678900"
.Select((c, i) => new { letter = c, group = i / n })
.GroupBy(l => l.group, l => l.letter)
.Select(g => string.Join("", g))
.ToList();
Try this:
Regex.Split(num.ToString(), "(?<=^(.{8})+)");
A nice implementation using answers from other StackOverflow questions:
"32427237"
.AsChunks(3)
.Select(vc => new String(vc))
.ToCsv(" "); // "324 272 37"
"103092501"
.AsChunks(3)
.Select(vc => new String(vc))
.ToCsv(" "); // "103 092 501"
AsChunks(): https://stackoverflow.com/a/22452051/538763
ToCsv(): https://stackoverflow.com/a/45891332/538763
I went through all the comments and decided to build this extension method:
public static string FormatStringToSplitSequence(this string input, int splitIndex, string splitCharacter)
{
if (input == null)
return string.Empty;
if (splitIndex <= 0)
return string.Empty;
return string.Join(string.Empty, input.Select((x, i) => i > 0 && i % splitIndex == 0 ? string.Format(splitCharacter + "{0}", x) : x.ToString()));
}
Example:
var text = "24455";
var result = text.FormatStringToSplitSequence(2, ".");
Output: 24.45.5
This might be off topic as I don't know why you wish to format the numbers this way, so please just ignore this post if it's not relevant...
How an integer is shown differs across cultures. You should do this in a locale-independent manner so it's easier to localize your changes at a later point.
int.ToString takes different parameters you can use to format for different cultures.
The "N" parameter gives you a standard format for culture specific grouping.
steve x string formatting is also a great resource.
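To make the "N" format concrete, here is a small sketch. The class and method names are hypothetical; the second method clones the invariant culture's number format and swaps in a space as the group separator:

```csharp
using System;
using System.Globalization;

public static class GroupFormatting
{
    // Formats with the group separator and group sizes of the given culture.
    public static string WithCulture(int value, CultureInfo culture) =>
        value.ToString("N0", culture);

    // Formats with a space as the group separator, independent of the current culture.
    public static string WithSpaces(int value)
    {
        var format = (NumberFormatInfo)CultureInfo.InvariantCulture.NumberFormat.Clone();
        format.NumberGroupSeparator = " ";
        return value.ToString("N0", format);
    }
}
```

Note that culture-based grouping counts digits from the right, so 32427237 comes out as 32 427 237 rather than the 324 272 37 asked for in the question, and leading zeros are lost once the value is an integer.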
For dividing a string and returning a list of strings with a certain number of characters each, here is my function:
public List<string> SplitStringEveryNth(string input, int chunkSize)
{
var output = new List<string>();
var tempString = string.Empty;
for (var i = 0; i < input.Length; i++)
{
tempString += input[i];
// flush a full chunk, or the shorter final chunk at the end of the input
if (tempString.Length == chunkSize || i == input.Length - 1)
{
output.Add(tempString);
tempString = string.Empty;
}
}
return output;
}
You can try something like this using Linq.
var str = "11223344";
var bucket = 2;
var count = (int)Math.Ceiling((double)str.Length / bucket);
Enumerable.Range(0, count)
.Select(_ => (_ * bucket))
.Select(_ => str.Substring(_, Math.Min(bucket, str.Length - _)))
.ToList()
You can also use the StringReader class, whose Read method reads a block of characters from the input string and advances the character position by count.
StringReader Class Read(Char[], Int32, Int32)
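A minimal sketch of that StringReader approach follows; `ReaderChunks` and `Split` are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ReaderChunks
{
    public static List<string> Split(string input, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        var parts = new List<string>();
        var buffer = new char[chunkSize];
        using (var reader = new StringReader(input))
        {
            int read;
            // Read returns the number of characters copied, or 0 at the end of the string.
            while ((read = reader.Read(buffer, 0, chunkSize)) > 0)
                parts.Add(new string(buffer, 0, read));
        }
        return parts;
    }
}
```

Calling `string.Join(" ", ReaderChunks.Split("32427237", 3))` joins the chunks back with spaces.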
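The simplest way to separate thousands with a space, which actually looks bad but works perfectly, would be the following (yourNumber is a numeric value, and the invariant culture guarantees that the group separator being replaced is a comma):
yourNumber.ToString("#,#", CultureInfo.InvariantCulture).Replace(',', ' ');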
What is the fastest C# function that takes an int and returns a string containing a letter or letters for use in an Excel function? For example, 1 returns "A", 26 returns "Z", 27 returns "AA", etc.
This is called tens of thousands of times and is taking 25% of the time needed to generate a large spreadsheet with many formulas.
public string Letter(int intCol) {
int intFirstLetter = ((intCol) / 676) + 64;
int intSecondLetter = ((intCol % 676) / 26) + 64;
int intThirdLetter = (intCol % 26) + 65;
char FirstLetter = (intFirstLetter > 64) ? (char)intFirstLetter : ' ';
char SecondLetter = (intSecondLetter > 64) ? (char)intSecondLetter : ' ';
char ThirdLetter = (char)intThirdLetter;
return string.Concat(FirstLetter, SecondLetter, ThirdLetter).Trim();
}
I currently use this, with Excel 2007
public static string ExcelColumnFromNumber(int column)
{
string columnString = "";
decimal columnNumber = column;
while (columnNumber > 0)
{
decimal currentLetterNumber = (columnNumber - 1) % 26;
char currentLetter = (char)(currentLetterNumber + 65);
columnString = currentLetter + columnString;
columnNumber = (columnNumber - (currentLetterNumber + 1)) / 26;
}
return columnString;
}
and
public static int NumberFromExcelColumn(string column)
{
int retVal = 0;
string col = column.ToUpper();
for (int iChar = col.Length - 1; iChar >= 0; iChar--)
{
char colPiece = col[iChar];
int colNum = colPiece - 64;
retVal = retVal + colNum * (int)Math.Pow(26, col.Length - (iChar + 1));
}
return retVal;
}
As mentioned in other posts, the results can be cached.
I can tell you that the fastest function will not be the prettiest function. Here it is:
private string[] map = new string[]
{
"A", "B", "C", "D", "E" .............
};
public string getColumn(int number)
{
return map[number];
}
Don't convert it at all. Excel can work in R1C1 notation just as well as in A1 notation.
So (apologies for using VBA rather than C#):
Application.Worksheets("Sheet1").Range("B1").Font.Bold = True
can just as easily be written as:
Application.Worksheets("Sheet1").Cells(1, 2).Font.Bold = True
The Range property takes A1 notation whereas the Cells property takes (row number, column number).
To select multiple cells: Range(Cells(1, 1), Cells(4, 6)) (NB would need some kind of object qualifier if not using the active worksheet) rather than Range("A1:F4")
The Columns property can take either a letter (e.g. F) or a number (e.g. 6)
Here's my version: This does not have any limitation as such 2-letter or 3-letter.
Simply pass-in the required number (starting with 0) Will return the Excel Column Header like Alphabet sequence for passed-in number:
private string GenerateSequence(int num)
{
string str = "";
char achar;
int mod;
while (true)
{
mod = (num % 26) + 65;
num = (int)(num / 26);
achar = (char)mod;
str = achar + str;
if (num > 0) num--;
else if (num == 0) break;
}
return str;
}
I have not tested this for performance; if someone can do that, it would be great for others.
(Sorry for being lazy) :)
Cheers!
You could pre-generate all the values into an array of strings. This would take very little memory and could be calculated on the first call.
Here is a concise implementation using LINQ.
static IEnumerable<string> GetExcelStrings()
{
string[] alphabet = { string.Empty, "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z" };
return from c1 in alphabet
from c2 in alphabet
from c3 in alphabet.Skip(1) // c3 is never empty
where c1 == string.Empty || c2 != string.Empty // only allow c2 to be empty if c1 is also empty
select c1 + c2 + c3;
}
This generates A to Z, then AA to ZZ, then AAA to ZZZ.
On my PC, calling GetExcelStrings().ToArray() takes about 30 ms. Thereafter, you can refer to this array of strings if you need it thousands of times.
Once your function has run, let it cache the result in a dictionary so that it won't have to do the calculation again.
e.g. Convert(27) will check whether 27 is already stored in the dictionary. If not, it does the calculation and stores "AA" against 27 in the dictionary.
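A minimal sketch of that idea, assuming 1-based column numbers as in the Convert(27) example. The class name ColumnNameCache and the conversion loop are my own illustration, not code from the question, and the cache is not thread-safe as written:

```csharp
using System;
using System.Collections.Generic;

static class ColumnNameCache
{
    // Hypothetical cache: 1-based column number -> column name.
    private static readonly Dictionary<int, string> Cache = new Dictionary<int, string>();

    public static string Convert(int column)
    {
        if (column < 1) throw new ArgumentOutOfRangeException(nameof(column));

        // Return the stored name if this column was converted before.
        if (Cache.TryGetValue(column, out string name)) return name;

        // Otherwise compute it once and remember it.
        // Each pass takes the last letter, then moves on to the remaining prefix.
        name = "";
        for (int n = column; n > 0; n = (n - 1) / 26)
            name = (char)('A' + (n - 1) % 26) + name;

        Cache[column] = name;
        return name;
    }
}
```

The first Convert(27) call does the arithmetic and stores "AA"; every later call for 27 is a single dictionary lookup.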
The absolute FASTEST would be capitalizing on the fact that an Excel spreadsheet has only a fixed number of columns, so you can use a lookup table. Declare a constant string array of 256 entries and prepopulate it with the strings from "A" to "IV". Then you simply do a straight index lookup.
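Here is a rough sketch of such a table. Instead of typing out 256 literals it fills the array once at startup, which is my own shortcut; the names ColumnLookup, BuildTable and GetColumn are made up for the example. Note that the 256-column limit applies to older Excel versions; newer ones go up to 16,384 columns (XFD), so the count would need adjusting there.

```csharp
using System;

static class ColumnLookup
{
    // Hypothetical table: Names[0] = "A" ... Names[255] = "IV".
    private static readonly string[] Names = BuildTable(256);

    private static string[] BuildTable(int count)
    {
        var table = new string[count];
        for (int i = 0; i < count; i++)
        {
            // The first 26 entries are single letters; every later entry
            // is an earlier entry (the prefix) plus one more letter.
            string last = ((char)('A' + i % 26)).ToString();
            table[i] = i < 26 ? last : table[i / 26 - 1] + last;
        }
        return table;
    }

    // 0-based index in, column name out: a plain array read.
    public static string GetColumn(int index) => Names[index];
}
```

After the one-time fill, GetColumn does no arithmetic at all, which is the whole point of the lookup-table approach.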
Try this function.
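// Returns name of column for specified 0-based index.
public static string GetColumnName(int index)
{
const string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
var name = new char[3]; // Assumes 3-letter column name max.
// Indexes 0-25 are 1-letter names, 26-701 are 2-letter names, 702 and up are 3-letter names.
int len = index >= 702 ? 3 : index >= 26 ? 2 : 1;
// Position of this name within the group of names of the same length.
int rem = index - (len == 3 ? 702 : len == 2 ? 26 : 0);
int div = len == 3 ? 676 : len == 2 ? 26 : 1; // 26 ^ (len - 1)
for (int i = 0; i < len; i++)
{
name[i] = alphabet[rem / div];
rem %= div;
div /= 26;
}
return new string(name, 0, len);
}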
Now it shouldn't take up that much memory to pre-generate each column name for every index and store them in a single huge array, so you shouldn't need to look up the name for any column twice.
If I can think of any further optimisations, I'll add them later, but I believe this function should be pretty quick, and I doubt you even need this sort of speed if you do the pre-generation.
Your first problem is that you are declaring 6 variables in the method. If a method is going to be called thousands of times, just moving those to class scope instead of function scope will probably cut your processing time by more than half right off the bat.
This is written in Java, but it's basically the same thing.
Here's code to compute the label for the column, in upper-case, with a 0-based index:
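public static String findColChars(long index) {
char[] ret = new char[64];
int start = ret.length; // position of the leftmost letter written so far
for (int i = 0; i < ret.length; ++i) {
int digit = ret.length - i - 1;
long test = index - powerDown(i + 1);
if (test < 0)
break;
ret[digit] = toChar(test / (long)(Math.pow(26, i)));
start = digit;
}
// Return only the filled part; the unused slots still hold '\0'.
return new String(ret, start, ret.length - start);
}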
private static char toChar(long num) {
return (char)((num % 26) + 65);
}
Here's code to compute 0-based index for the column from the upper-case label:
public static long findColIndex(String col) {
long index = 0;
char[] chars = col.toCharArray();
for (int i = 0; i < chars.length; ++i) {
int cur = chars.length - i - 1;
index += (chars[cur] - 65) * Math.pow(26, i);
}
return index + powerDown(chars.length);
}
private static long powerDown(int limit) {
long acc = 0;
while (limit > 1)
acc += Math.pow(26, limit-- - 1);
return acc;
}
@Neil N: nice code. I think the third letter should use +64 rather than +65, am I right?
public string Letter(int intCol) {
int intFirstLetter = ((intCol) / 676) + 64;
int intSecondLetter = ((intCol % 676) / 26) + 64;
int intThirdLetter = (intCol % 26) + 65; // SHOULD BE + 64?
char FirstLetter = (intFirstLetter > 64) ? (char)intFirstLetter : ' ';
char SecondLetter = (intSecondLetter > 64) ? (char)intSecondLetter : ' ';
char ThirdLetter = (char)intThirdLetter;
return string.Concat(FirstLetter, SecondLetter, ThirdLetter).Trim();
}
Why don't we try recursion?
public static string GetColumnName(int index)
{
const string letters = "ZABCDEFGHIJKLMNOPQRSTUVWXY";
int NextPos = (index / 26);
int LastPos = (index % 26);
if (LastPos == 0) NextPos--;
if (index > 26)
return GetColumnName(NextPos) + letters[LastPos];
else
return letters[LastPos] + "";
}
Caching really does cut the runtime of 10,000,000 random calls to 1/3 its value though:
static Dictionary<int, string> LetterDict = new Dictionary<int, string>(676);
public static string LetterWithCaching(int index)
{
int intCol = index - 1;
if (LetterDict.ContainsKey(intCol)) return LetterDict[intCol];
int intFirstLetter = ((intCol) / 676) + 64;
int intSecondLetter = ((intCol % 676) / 26) + 64;
int intThirdLetter = (intCol % 26) + 65;
char FirstLetter = (intFirstLetter > 64) ? (char)intFirstLetter : ' ';
char SecondLetter = (intSecondLetter > 64) ? (char)intSecondLetter : ' ';
char ThirdLetter = (char)intThirdLetter;
String s = string.Concat(FirstLetter, SecondLetter, ThirdLetter).Trim();
LetterDict.Add(intCol, s);
return s;
}
I think caching in the worst case (hitting every value) couldn't take up more than 250 KB: 17576 possible values * (sizeof(int)=4 + sizeof(char)*3 + string overhead=2).
It is recursive. Fast, and right :
class ToolSheet
{
// Not the prettiest, but surely the fastest:
static string[] ColName = new string[676];
public ToolSheet()
{
ColName[0] = "A";
for (int index = 1; index < 676; ++index) Recurse(index, index);
}
private int Recurse(int i, int index)
{
if (i < 1) return 0;
ColName[index] = ((char)(65 + i % 26)).ToString() + ColName[index];
return Recurse(i / 26, index);
}
public string GetColName(int i)
{
return ColName[i - 1];
}
}
Sorry, there was an off-by-one shift. Corrected:
class ToolSheet
{
// Not the prettiest, but surely the fastest:
static string[] ColName = new string[676];
public ToolSheet()
{
for (int index = 0; index < 676; ++index)
{
Recurse(index, index);
}
}
private int Recurse(int i, int index)
{
if (i < 1)
{
if (index % 26 == 0 && index > 0) ColName[index] = ColName[index - 1].Substring(0, ColName[index - 1].Length - 1) + "Z";
return 0;
}
ColName[index] = ((char)(64 + i % 26)).ToString() + ColName[index];
return Recurse(i / 26, index);
}
public string GetColName(int i)
{
return ColName[i]; // the table is filled by 1-based column number (ColName[1] is "A"), valid for 1 to 675
}
}
My solution:
static class ExcelHeaderHelper
{
public static string[] GetHeaderLetters(uint max)
{
var result = new List<string>();
int i = 0;
var columnPrefix = new Queue<string>();
string prefix = null;
int prevRoundNo = 0;
uint maxPrefix = max / 26;
while (i < max)
{
int roundNo = i / 26;
if (prevRoundNo < roundNo)
{
prefix = columnPrefix.Dequeue();
prevRoundNo = roundNo;
}
string item = prefix + ((char)(65 + (i % 26))).ToString(CultureInfo.InvariantCulture);
if (i <= maxPrefix)
{
columnPrefix.Enqueue(item);
}
result.Add(item);
i++;
}
return result.ToArray();
}
}
barrowc's idea is much more convenient and faster than any conversion function! I have converted his idea into actual C# code that I use:
var start = m_xlApp.Cells[nRow1_P, nCol1_P];
var end = m_xlApp.Cells[nRow2_P, nCol2_P];
// cast as Range to prevent binding errors
m_arrRange = m_xlApp.get_Range(start as Range, end as Range);
object[,] values = (object[,])m_arrRange.Value2; // a multi-cell range comes back as a 2-D array
private String columnLetter(int column) {
if (column <= 0)
return "";
if (column <= 26){
return (char) (column + 64) + "";
}
if (column%26 == 0){
return columnLetter((column/26)-1) + columnLetter(26) ;
}
return columnLetter(column/26) + columnLetter(column%26) ;
}
Just use an Excel formula instead of a user-defined function (UDF) or other program, per Allen Wyatt (https://excel.tips.net/T003254_Alphabetic_Column_Designation.html):
=SUBSTITUTE(ADDRESS(ROW(),COLUMN(),4),ROW(),"")
(In my organization, using UDFs would be very painful.)
The code I'm providing is NOT C# (it is Python), but the logic can be used in any language.
Most of the previous answers are correct. Here is one more way of converting a column number to an Excel column name.
The solution is rather simple if we think of it as a base conversion: convert the column number to base 26, since there are only 26 letters.
Here is how you can do this:
Steps:
1. Set the column number as the quotient.
2. Subtract one from the quotient, so that the remainder lands on 0-25 and adding 97 gives a-z in the ASCII table.
3. Divide by 26 and take the remainder.
4. Add 97 to the remainder and convert it to a char (97 is "a" in the ASCII table), then prepend it to the result.
5. The quotient becomes quotient / 26 (integer division), since there may be more than 26 columns.
6. Repeat steps 2-5 while the quotient is greater than 0, then return the result.
Here is the code that does this :)
def convert_num_to_column(column_num):
    result = ""
    quotient = column_num
    remainder = 0
    while (quotient >0):
        quotient = quotient -1
        remainder = quotient%26
        result = chr(int(remainder)+97)+result
        quotient = int(quotient/26)
    return result
print("--",convert_num_to_column(1).upper())
If you need to generate cell references that start from a column other than A:
private static string GenerateCellReference(int n, int startIndex = 65)
{
string name = "";
n += startIndex - 65;
while (n > 0)
{
n--;
name = (char)((n % 26) + 65) + name;
n /= 26;
}
return name + 1;
}