How do you convert a numerical number to an Excel column name in C# without using automation getting the value directly from Excel.
Excel 2007 has a possible range of 1 to 16384, which is the number of columns that it supports. The resulting values should be in the form of excel column names, e.g. A, AA, AAA etc.
Here's how I do it:
private string GetExcelColumnName(int columnNumber)
{
string columnName = "";
while (columnNumber > 0)
{
int modulo = (columnNumber - 1) % 26;
columnName = Convert.ToChar('A' + modulo) + columnName;
columnNumber = (columnNumber - modulo) / 26;
}
return columnName;
}
If anyone needs to do this in Excel without VBA, here is a way:
=SUBSTITUTE(ADDRESS(1;colNum;4);"1";"")
where colNum is the column number
And in VBA:
Function GetColumnName(colNum As Integer) As String
Dim d As Integer
Dim m As Integer
Dim name As String
d = colNum
name = ""
Do While (d > 0)
m = (d - 1) Mod 26
name = Chr(65 + m) + name
d = Int((d - m) / 26)
Loop
GetColumnName = name
End Function
Sorry, this is Python instead of C#, but at least the results are correct:
def ColIdxToXlName(idx):
if idx < 1:
raise ValueError("Index is too small")
result = ""
while True:
if idx > 26:
idx, r = divmod(idx - 1, 26)
result = chr(r + ord('A')) + result
else:
return chr(idx + ord('A') - 1) + result
for i in xrange(1, 1024):
print "%4d : %s" % (i, ColIdxToXlName(i))
You might need conversion both ways, e.g from Excel column adress like AAZ to integer and from any integer to Excel. The two methods below will do just that. Assumes 1 based indexing, first element in your "arrays" are element number 1.
No limits on size here, so you can use adresses like ERROR and that would be column number 2613824 ...
public static string ColumnAdress(int col)
{
if (col <= 26) {
return Convert.ToChar(col + 64).ToString();
}
int div = col / 26;
int mod = col % 26;
if (mod == 0) {mod = 26;div--;}
return ColumnAdress(div) + ColumnAdress(mod);
}
public static int ColumnNumber(string colAdress)
{
int[] digits = new int[colAdress.Length];
for (int i = 0; i < colAdress.Length; ++i)
{
digits[i] = Convert.ToInt32(colAdress[i]) - 64;
}
int mul=1;int res=0;
for (int pos = digits.Length - 1; pos >= 0; --pos)
{
res += digits[pos] * mul;
mul *= 26;
}
return res;
}
I discovered an error in my first post, so I decided to sit down and do the the math. What I found is that the number system used to identify Excel columns is not a base 26 system, as another person posted. Consider the following in base 10. You can also do this with the letters of the alphabet.
Space:.........................S1, S2, S3 : S1, S2, S3
....................................0, 00, 000 :.. A, AA, AAA
....................................1, 01, 001 :.. B, AB, AAB
.................................... …, …, … :.. …, …, …
....................................9, 99, 999 :.. Z, ZZ, ZZZ
Total states in space: 10, 100, 1000 : 26, 676, 17576
Total States:...............1110................18278
Excel numbers columns in the individual alphabetical spaces using base 26. You can see that in general, the state space progression is a, a^2, a^3, … for some base a, and the total number of states is a + a^2 + a^3 + … .
Suppose you want to find the total number of states A in the first N spaces. The formula for doing so is A = (a)(a^N - 1 )/(a-1). This is important because we need to find the space N that corresponds to our index K. If I want to find out where K lies in the number system I need to replace A with K and solve for N. The solution is N = log{base a} (A (a-1)/a +1). If I use the example of a = 10 and K = 192, I know that N = 2.23804… . This tells me that K lies at the beginning of the third space since it is a little greater than two.
The next step is to find exactly how far in the current space we are. To find this, subtract from K the A generated using the floor of N. In this example, the floor of N is two. So, A = (10)(10^2 – 1)/(10-1) = 110, as is expected when you combine the states of the first two spaces. This needs to be subtracted from K because these first 110 states would have already been accounted for in the first two spaces. This leaves us with 82 states. So, in this number system, the representation of 192 in base 10 is 082.
The C# code using a base index of zero is
private string ExcelColumnIndexToName(int Index)
{
string range = string.Empty;
if (Index < 0 ) return range;
int a = 26;
int x = (int)Math.Floor(Math.Log((Index) * (a - 1) / a + 1, a));
Index -= (int)(Math.Pow(a, x) - 1) * a / (a - 1);
for (int i = x+1; Index + i > 0; i--)
{
range = ((char)(65 + Index % a)).ToString() + range;
Index /= a;
}
return range;
}
//Old Post
A zero-based solution in C#.
private string ExcelColumnIndexToName(int Index)
{
string range = "";
if (Index < 0 ) return range;
for(int i=1;Index + i > 0;i=0)
{
range = ((char)(65 + Index % 26)).ToString() + range;
Index /= 26;
}
if (range.Length > 1) range = ((char)((int)range[0] - 1)).ToString() + range.Substring(1);
return range;
}
This answer is in javaScript:
function getCharFromNumber(columnNumber){
var dividend = columnNumber;
var columnName = "";
var modulo;
while (dividend > 0)
{
modulo = (dividend - 1) % 26;
columnName = String.fromCharCode(65 + modulo).toString() + columnName;
dividend = parseInt((dividend - modulo) / 26);
}
return columnName;
}
Easy with recursion.
public static string GetStandardExcelColumnName(int columnNumberOneBased)
{
int baseValue = Convert.ToInt32('A');
int columnNumberZeroBased = columnNumberOneBased - 1;
string ret = "";
if (columnNumberOneBased > 26)
{
ret = GetStandardExcelColumnName(columnNumberZeroBased / 26) ;
}
return ret + Convert.ToChar(baseValue + (columnNumberZeroBased % 26) );
}
I'm surprised all of the solutions so far contain either iteration or recursion.
Here's my solution that runs in constant time (no loops). This solution works for all possible Excel columns and checks that the input can be turned into an Excel column. Possible columns are in the range [A, XFD] or [1, 16384]. (This is dependent on your version of Excel)
private static string Turn(uint col)
{
if (col < 1 || col > 16384) //Excel columns are one-based (one = 'A')
throw new ArgumentException("col must be >= 1 and <= 16384");
if (col <= 26) //one character
return ((char)(col + 'A' - 1)).ToString();
else if (col <= 702) //two characters
{
char firstChar = (char)((int)((col - 1) / 26) + 'A' - 1);
char secondChar = (char)(col % 26 + 'A' - 1);
if (secondChar == '#') //Excel is one-based, but modulo operations are zero-based
secondChar = 'Z'; //convert one-based to zero-based
return string.Format("{0}{1}", firstChar, secondChar);
}
else //three characters
{
char firstChar = (char)((int)((col - 1) / 702) + 'A' - 1);
char secondChar = (char)((col - 1) / 26 % 26 + 'A' - 1);
char thirdChar = (char)(col % 26 + 'A' - 1);
if (thirdChar == '#') //Excel is one-based, but modulo operations are zero-based
thirdChar = 'Z'; //convert one-based to zero-based
return string.Format("{0}{1}{2}", firstChar, secondChar, thirdChar);
}
}
Same implementation in Java
public String getExcelColumnName (int columnNumber)
{
int dividend = columnNumber;
int i;
String columnName = "";
int modulo;
while (dividend > 0)
{
modulo = (dividend - 1) % 26;
i = 65 + modulo;
columnName = new Character((char)i).toString() + columnName;
dividend = (int)((dividend - modulo) / 26);
}
return columnName;
}
int nCol = 127;
string sChars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
string sCol = "";
while (nCol >= 26)
{
int nChar = nCol % 26;
nCol = (nCol - nChar) / 26;
// You could do some trick with using nChar as offset from 'A', but I am lazy to do it right now.
sCol = sChars[nChar] + sCol;
}
sCol = sChars[nCol] + sCol;
Update: Peter's comment is right. That's what I get for writing code in the browser. :-) My solution was not compiling, it was missing the left-most letter and it was building the string in reverse order - all now fixed.
Bugs aside, the algorithm is basically converting a number from base 10 to base 26.
Update 2: Joel Coehoorn is right - the code above will return AB for 27. If it was real base 26 number, AA would be equal to A and the next number after Z would be BA.
int nCol = 127;
string sChars = "0ABCDEFGHIJKLMNOPQRSTUVWXYZ";
string sCol = "";
while (nCol > 26)
{
int nChar = nCol % 26;
if (nChar == 0)
nChar = 26;
nCol = (nCol - nChar) / 26;
sCol = sChars[nChar] + sCol;
}
if (nCol != 0)
sCol = sChars[nCol] + sCol;
..And converted to php:
function GetExcelColumnName($columnNumber) {
$columnName = '';
while ($columnNumber > 0) {
$modulo = ($columnNumber - 1) % 26;
$columnName = chr(65 + $modulo) . $columnName;
$columnNumber = (int)(($columnNumber - $modulo) / 26);
}
return $columnName;
}
Just throwing in a simple two-line C# implementation using recursion, because all the answers here seem far more complicated than necessary.
/// <summary>
/// Gets the column letter(s) corresponding to the given column number.
/// </summary>
/// <param name="column">The one-based column index. Must be greater than zero.</param>
/// <returns>The desired column letter, or an empty string if the column number was invalid.</returns>
public static string GetColumnLetter(int column) {
if (column < 1) return String.Empty;
return GetColumnLetter((column - 1) / 26) + (char)('A' + (column - 1) % 26);
}
Although there are already a bunch of valid answers1, none get into the theory behind it.
Excel column names are bijective base-26 representations of their number. This is quite different than an ordinary base 26 (there is no leading zero), and I really recommend reading the Wikipedia entry to grasp the differences. For example, the decimal value 702 (decomposed in 26*26 + 26) is represented in "ordinary" base 26 by 110 (i.e. 1x26^2 + 1x26^1 + 0x26^0) and in bijective base-26 by ZZ (i.e. 26x26^1 + 26x26^0).
Differences aside, bijective numeration is a positional notation, and as such we can perform conversions using an iterative (or recursive) algorithm which on each iteration finds the digit of the next position (similarly to an ordinary base conversion algorithm).
The general formula to get the digit at the last position (the one indexed 0) of the bijective base-k representation of a decimal number m is (f being the ceiling function minus 1):
m - (f(m / k) * k)
The digit at the next position (i.e. the one indexed 1) is found by applying the same formula to the result of f(m / k). We know that for the last digit (i.e. the one with the highest index) f(m / k) is 0.
This forms the basis for an iteration that finds each successive digit in bijective base-k of a decimal number. In pseudo-code it would look like this (digit() maps a decimal integer to its representation in the bijective base -- e.g. digit(1) would return A in bijective base-26):
fun conv(m)
q = f(m / k)
a = m - (q * k)
if (q == 0)
return digit(a)
else
return conv(q) + digit(a);
So we can translate this to C#2 to get a generic3 "conversion to bijective base-k" ToBijective() routine:
class BijectiveNumeration {
private int baseK;
private Func<int, char> getDigit;
public BijectiveNumeration(int baseK, Func<int, char> getDigit) {
this.baseK = baseK;
this.getDigit = getDigit;
}
public string ToBijective(double decimalValue) {
double q = f(decimalValue / baseK);
double a = decimalValue - (q * baseK);
return ((q > 0) ? ToBijective(q) : "") + getDigit((int)a);
}
private static double f(double i) {
return (Math.Ceiling(i) - 1);
}
}
Now for conversion to bijective base-26 (our "Excel column name" use case):
static void Main(string[] args)
{
BijectiveNumeration bijBase26 = new BijectiveNumeration(
26,
(value) => Convert.ToChar('A' + (value - 1))
);
Console.WriteLine(bijBase26.ToBijective(1)); // prints "A"
Console.WriteLine(bijBase26.ToBijective(26)); // prints "Z"
Console.WriteLine(bijBase26.ToBijective(27)); // prints "AA"
Console.WriteLine(bijBase26.ToBijective(702)); // prints "ZZ"
Console.WriteLine(bijBase26.ToBijective(16384)); // prints "XFD"
}
Excel's maximum column index is 16384 / XFD, but this code will convert any positive number.
As an added bonus, we can now easily convert to any bijective base. For example for bijective base-10:
static void Main(string[] args)
{
BijectiveNumeration bijBase10 = new BijectiveNumeration(
10,
(value) => value < 10 ? Convert.ToChar('0'+value) : 'A'
);
Console.WriteLine(bijBase10.ToBijective(1)); // prints "1"
Console.WriteLine(bijBase10.ToBijective(10)); // prints "A"
Console.WriteLine(bijBase10.ToBijective(123)); // prints "123"
Console.WriteLine(bijBase10.ToBijective(20)); // prints "1A"
Console.WriteLine(bijBase10.ToBijective(100)); // prints "9A"
Console.WriteLine(bijBase10.ToBijective(101)); // prints "A1"
Console.WriteLine(bijBase10.ToBijective(2010)); // prints "19AA"
}
1 This generic answer can eventually be reduced to the other, correct, specific answers, but I find it hard to fully grasp the logic of the solutions without the formal theory behind bijective numeration in general. It also proves its correctness nicely. Additionally, several similar questions link back to this one, some being language-agnostic or more generic. That's why I thought the addition of this answer was warranted, and that this question was a good place to put it.
2 C# disclaimer: I implemented an example in C# because this is what is asked here, but I have never learned nor used the language. I have verified it does compile and run, but please adapt it to fit the language best practices / general conventions, if necessary.
3 This example only aims to be correct and understandable ; it could and should be optimized would performance matter (e.g. with tail-recursion -- but that seems to require trampolining in C#), and made safer (e.g. by validating parameters).
I wanted to throw in my static class I use, for interoping between col index and col Label. I use a modified accepted answer for my ColumnLabel Method
public static class Extensions
{
public static string ColumnLabel(this int col)
{
var dividend = col;
var columnLabel = string.Empty;
int modulo;
while (dividend > 0)
{
modulo = (dividend - 1) % 26;
columnLabel = Convert.ToChar(65 + modulo).ToString() + columnLabel;
dividend = (int)((dividend - modulo) / 26);
}
return columnLabel;
}
public static int ColumnIndex(this string colLabel)
{
// "AD" (1 * 26^1) + (4 * 26^0) ...
var colIndex = 0;
for(int ind = 0, pow = colLabel.Count()-1; ind < colLabel.Count(); ++ind, --pow)
{
var cVal = Convert.ToInt32(colLabel[ind]) - 64; //col A is index 1
colIndex += cVal * ((int)Math.Pow(26, pow));
}
return colIndex;
}
}
Use this like...
30.ColumnLabel(); // "AD"
"AD".ColumnIndex(); // 30
private String getColumn(int c) {
String s = "";
do {
s = (char)('A' + (c % 26)) + s;
c /= 26;
} while (c-- > 0);
return s;
}
Its not exactly base 26, there is no 0 in the system. If there was, 'Z' would be followed by 'BA' not by 'AA'.
if you just want it for a cell formula without code, here's a formula for it:
IF(COLUMN()>=26,CHAR(ROUND(COLUMN()/26,1)+64)&CHAR(MOD(COLUMN(),26)+64),CHAR(COLUMN()+64))
In Delphi (Pascal):
function GetExcelColumnName(columnNumber: integer): string;
var
dividend, modulo: integer;
begin
Result := '';
dividend := columnNumber;
while dividend > 0 do begin
modulo := (dividend - 1) mod 26;
Result := Chr(65 + modulo) + Result;
dividend := (dividend - modulo) div 26;
end;
end;
A little late to the game, but here's the code I use (in C#):
private static readonly string _Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
public static int ColumnNameParse(string value)
{
// assumes value.Length is [1,3]
// assumes value is uppercase
var digits = value.PadLeft(3).Select(x => _Alphabet.IndexOf(x));
return digits.Aggregate(0, (current, index) => (current * 26) + (index + 1));
}
In perl, for an input of 1 (A), 27 (AA), etc.
sub excel_colname {
my ($idx) = #_; # one-based column number
--$idx; # zero-based column index
my $name = "";
while ($idx >= 0) {
$name .= chr(ord("A") + ($idx % 26));
$idx = int($idx / 26) - 1;
}
return scalar reverse $name;
}
Though I am late to the game, Graham's answer is far from being optimal. Particularly, you don't have to use the modulo, call ToString() and apply (int) cast. Considering that in most cases in C# world you would start numbering from 0, here is my revision:
public static string GetColumnName(int index) // zero-based
{
const byte BASE = 'Z' - 'A' + 1;
string name = String.Empty;
do
{
name = Convert.ToChar('A' + index % BASE) + name;
index = index / BASE - 1;
}
while (index >= 0);
return name;
}
More than 30 solutions already, but here's my one-line C# solution...
public string IntToExcelColumn(int i)
{
return ((i<16926? "" : ((char)((((i/26)-1)%26)+65)).ToString()) + (i<2730? "" : ((char)((((i/26)-1)%26)+65)).ToString()) + (i<26? "" : ((char)((((i/26)-1)%26)+65)).ToString()) + ((char)((i%26)+65)));
}
After looking at all the supplied Versions here, I decided to do one myself, using recursion.
Here is my vb.net Version:
Function CL(ByVal x As Integer) As String
If x >= 1 And x <= 26 Then
CL = Chr(x + 64)
Else
CL = CL((x - x Mod 26) / 26) & Chr((x Mod 26) + 1 + 64)
End If
End Function
Refining the original solution (in C#):
public static class ExcelHelper
{
private static Dictionary<UInt16, String> l_DictionaryOfColumns;
public static ExcelHelper() {
l_DictionaryOfColumns = new Dictionary<ushort, string>(256);
}
public static String GetExcelColumnName(UInt16 l_Column)
{
UInt16 l_ColumnCopy = l_Column;
String l_Chars = "0ABCDEFGHIJKLMNOPQRSTUVWXYZ";
String l_rVal = "";
UInt16 l_Char;
if (l_DictionaryOfColumns.ContainsKey(l_Column) == true)
{
l_rVal = l_DictionaryOfColumns[l_Column];
}
else
{
while (l_ColumnCopy > 26)
{
l_Char = l_ColumnCopy % 26;
if (l_Char == 0)
l_Char = 26;
l_ColumnCopy = (l_ColumnCopy - l_Char) / 26;
l_rVal = l_Chars[l_Char] + l_rVal;
}
if (l_ColumnCopy != 0)
l_rVal = l_Chars[l_ColumnCopy] + l_rVal;
l_DictionaryOfColumns.ContainsKey(l_Column) = l_rVal;
}
return l_rVal;
}
}
Here is an Actionscript version:
private var columnNumbers:Array = ['A', 'B', 'C', 'D', 'E', 'F' , 'G', 'H', 'I', 'J', 'K' ,'L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'];
private function getExcelColumnName(columnNumber:int) : String{
var dividend:int = columnNumber;
var columnName:String = "";
var modulo:int;
while (dividend > 0)
{
modulo = (dividend - 1) % 26;
columnName = columnNumbers[modulo] + columnName;
dividend = int((dividend - modulo) / 26);
}
return columnName;
}
JavaScript Solution
/**
* Calculate the column letter abbreviation from a 1 based index
* #param {Number} value
* #returns {string}
*/
getColumnFromIndex = function (value) {
var base = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.split('');
var remainder, result = "";
do {
remainder = value % 26;
result = base[(remainder || 26) - 1] + result;
value = Math.floor(value / 26);
} while (value > 0);
return result;
};
These my codes to convert specific number (index start from 1) to Excel Column.
public static string NumberToExcelColumn(uint number)
{
uint originalNumber = number;
uint numChars = 1;
while (Math.Pow(26, numChars) < number)
{
numChars++;
if (Math.Pow(26, numChars) + 26 >= number)
{
break;
}
}
string toRet = "";
uint lastValue = 0;
do
{
number -= lastValue;
double powerVal = Math.Pow(26, numChars - 1);
byte thisCharIdx = (byte)Math.Truncate((columnNumber - 1) / powerVal);
lastValue = (int)powerVal * thisCharIdx;
if (numChars - 2 >= 0)
{
double powerVal_next = Math.Pow(26, numChars - 2);
byte thisCharIdx_next = (byte)Math.Truncate((columnNumber - lastValue - 1) / powerVal_next);
int lastValue_next = (int)Math.Pow(26, numChars - 2) * thisCharIdx_next;
if (thisCharIdx_next == 0 && lastValue_next == 0 && powerVal_next == 26)
{
thisCharIdx--;
lastValue = (int)powerVal * thisCharIdx;
}
}
toRet += (char)((byte)'A' + thisCharIdx + ((numChars > 1) ? -1 : 0));
numChars--;
} while (numChars > 0);
return toRet;
}
My Unit Test:
[TestMethod]
public void Test()
{
Assert.AreEqual("A", NumberToExcelColumn(1));
Assert.AreEqual("Z", NumberToExcelColumn(26));
Assert.AreEqual("AA", NumberToExcelColumn(27));
Assert.AreEqual("AO", NumberToExcelColumn(41));
Assert.AreEqual("AZ", NumberToExcelColumn(52));
Assert.AreEqual("BA", NumberToExcelColumn(53));
Assert.AreEqual("ZZ", NumberToExcelColumn(702));
Assert.AreEqual("AAA", NumberToExcelColumn(703));
Assert.AreEqual("ABC", NumberToExcelColumn(731));
Assert.AreEqual("ACQ", NumberToExcelColumn(771));
Assert.AreEqual("AYZ", NumberToExcelColumn(1352));
Assert.AreEqual("AZA", NumberToExcelColumn(1353));
Assert.AreEqual("AZB", NumberToExcelColumn(1354));
Assert.AreEqual("BAA", NumberToExcelColumn(1379));
Assert.AreEqual("CNU", NumberToExcelColumn(2413));
Assert.AreEqual("GCM", NumberToExcelColumn(4823));
Assert.AreEqual("MSR", NumberToExcelColumn(9300));
Assert.AreEqual("OMB", NumberToExcelColumn(10480));
Assert.AreEqual("ULV", NumberToExcelColumn(14530));
Assert.AreEqual("XFD", NumberToExcelColumn(16384));
}
Sorry, this is Python instead of C#, but at least the results are correct:
def excel_column_number_to_name(column_number):
output = ""
index = column_number-1
while index >= 0:
character = chr((index%26)+ord('A'))
output = output + character
index = index/26 - 1
return output[::-1]
for i in xrange(1, 1024):
print "%4d : %s" % (i, excel_column_number_to_name(i))
Passed these test cases:
Column Number: 494286 => ABCDZ
Column Number: 27 => AA
Column Number: 52 => AZ
For what it is worth, here is Graham's code in Powershell:
function ConvertTo-ExcelColumnID {
param (
[parameter(Position = 0,
HelpMessage = "A 1-based index to convert to an excel column ID. e.g. 2 => 'B', 29 => 'AC'",
Mandatory = $true)]
[int]$index
);
[string]$result = '';
if ($index -le 0 ) {
return $result;
}
while ($index -gt 0) {
[int]$modulo = ($index - 1) % 26;
$character = [char]($modulo + [int][char]'A');
$result = $character + $result;
[int]$index = ($index - $modulo) / 26;
}
return $result;
}
Another VBA way
Public Function GetColumnName(TargetCell As Range) As String
GetColumnName = Split(CStr(TargetCell.Cells(1, 1).Address), "$")(1)
End Function
Here's my super late implementation in PHP. This one's recursive. I wrote it just before I found this post. I wanted to see if others had solved this problem already...
public function GetColumn($intNumber, $strCol = null) {
if ($intNumber > 0) {
$intRem = ($intNumber - 1) % 26;
$strCol = $this->GetColumn(intval(($intNumber - $intRem) / 26), sprintf('%s%s', chr(65 + $intRem), $strCol));
}
return $strCol;
}
Related
I have been thinking of adding binary numbers where binary numbers are in a form of string and we add those two binary numbers to get a resultant binary number (in string).
So far I have been using this:
long first = Convert.ToInt64(a, 2);
long second = Convert.ToInt64(b, 2);
long addresult = first + second;
string result = Convert.ToString(addresult, 2);
return result;
Courtesy of Stackoverflow: Binary addition of 2 values represented as strings
But, now I want to add numbers which are far greater than the scope of a long data type.
For Example, a Binary value whose decimel result is a BigInteger, i.e., incredibly huge integers as shown below:
111101101000010111101000101010001010010010010110011010100001000010110110110000110001101 which equals to149014059082024308625334669
1111001101011000001011000111100011101011110100101010010001110101011101010100101000001101000010000110001110100010011101011111111110110101100101110001010101011110001010000010111110011011 which equals to23307765732196437339985049250988196614799400063289798555
At least I think it does.
Courtesy of Stackoverflow:
C# Convert large binary string to decimal system
BigInteger to Hex/Decimal/Octal/Binary strings?
I have used logic provided in above links which are more or less perfect.
But, is there a more compact way to add the given two binary strings?
Kindly let me know as this question is rattling in my mind for some time now.
You can exploit the same scheme you used before but with BigInteger:
using System.Linq;
using System.Numerics;
...
BigInteger first = a.Aggregate(BigInteger.Zero, (s, item) => s * 2 + item - '0');
BigInteger second = b.Aggregate(BigInteger.Zero, (s, item) => s * 2 + item - '0');
StringBuilder sb = new StringBuilder();
for (BigInteger addresult = first + second; addresult > 0; addresult /= 2)
sb.Append(addresult % 2);
if (sb.Length <= 0)
sb.Append('0');
string result = string.Concat(sb.ToString().Reverse());
This question was a nostalgic one - thanks. Note that my code is explanatory and inefficient with little to no validation, but it works for your example. You definitely do not want to use anything like this in a real world solution, this is just to illustrate binary addition in principle.
BinaryString#1
111101101000010111101000101010001010010010010110011010100001000010110110110000110001101
decimal:149014059082024308625334669
BinaryString#2
1111001101011000001011000111100011101011110100101010010001110101011101010100101000001101000010000110001110100010011101011111111110110101100101110001010101011110001010000010111110011011
decimal:23307765732196437339985049250988196614799400063289798555
Calculated Sum
1111001101011000001011000111100011101011110100101010010001110101011101010100101000001101000010001101111011100101011010100101010000000111111000100100101001100110100000111001000100101000
decimal:23307765732196437339985049251137210673881424371915133224
Check
23307765732196437339985049251137210673881424371915133224
decimal:23307765732196437339985049251137210673881424371915133224
Here's the code
using System;
using System.Linq;
using System.Numerics;
namespace ConsoleApp3
{
class Program
{
// return 0 for '0' and 1 for '1' (C# chars promotion to ints)
static int CharAsInt(char c) { return c - '0'; }
// and vice-versa
static char IntAsChar(int bit) { return (char)('0' + bit); }
static string BinaryStringAdd(string x, string y)
{
// get rid of spaces
x = x.Trim();
y = y.Trim();
// check if valid binaries
if (x.Any(c => c != '0' && c != '1') || y.Any(c => c != '0' && c != '1'))
throw new ArgumentException("binary representation may contain only '0' and '1'");
// align on right-most bit
if (x.Length < y.Length)
x = x.PadLeft(y.Length, '0');
else
y = y.PadLeft(x.Length, '0');
// NNB: the result may require one more bit than the longer of the two input strings (carry at the end), let's not forget this
var result = new char[x.Length];
// add from least significant to most significant (right to left)
var i = result.Length;
var carry = '0';
while (--i >= 0)
{
// to add x[i], y[i] and carry
// - if 2 or 3 bits are set then we carry '1' again (otherwise '0')
// - if the number of set bits is odd the sum bit is '1' otherwise '0'
var countSetBits = CharAsInt(x[i]) + CharAsInt(y[i]) + CharAsInt(carry);
carry = countSetBits > 1 ? '1' : '0';
result[i] = countSetBits == 1 || countSetBits == 3 ? '1' : '0';
}
// now to make this byte[] a string
var ret = new string(result);
// remember that final carry?
return carry == '1' ? carry + ret : ret;
}
static BigInteger BigIntegerFromBinaryString(string s)
{
var biRet = new BigInteger(0);
foreach (var t in s)
{
biRet = biRet << 1;
if (t == '1')
biRet += 1;
}
return biRet;
}
static void Main(string[] args)
{
var s1 = "111101101000010111101000101010001010010010010110011010100001000010110110110000110001101";
var s2 = "1111001101011000001011000111100011101011110100101010010001110101011101010100101000001101000010000110001110100010011101011111111110110101100101110001010101011110001010000010111110011011";
var sum = BinaryStringAdd(s1, s2);
var bi1 = BigIntegerFromBinaryString(s1);
var bi2 = BigIntegerFromBinaryString(s2);
var bi3 = bi1 + bi2;
Console.WriteLine($"BinaryString#1\n {s1}\n decimal:{bi1}");
Console.WriteLine($"BinaryString#2\n {s2}\n decimal:{bi2}");
Console.WriteLine($"Calculated Sum\n {sum}\n decimal:{BigIntegerFromBinaryString(sum)}");
Console.WriteLine($"Check\n {bi3}\n decimal:{bi3}");
Console.ReadKey();
}
}
}
I'll add an alternative solution alongside AlanK's just as an example of how you might go about this without converting the numbers to some form of integer before adding them.
static string BinaryStringAdd(string b1, string b2)
{
char[] c = new char[Math.Max(b1.Length, b2.Length) + 1];
int carry = 0;
for (int i = 1; i <= c.Length; i++)
{
int d1 = i <= b1.Length ? b1[^i] : 48;
int d2 = i <= b2.Length ? b2[^i] : 48;
int sum = carry + (d1-48) + (d2-48);
if (sum == 3)
{
sum = 1;
carry = 1;
}
else if (sum == 2)
{
sum = 0;
carry = 1;
}
else
{
carry = 0;
}
c[^i] = (char) (sum+48);
}
return c[0] == '0' ? String.Join("", c)[1..] : String.Join("", c);
}
Note that this solution is ~10% slower than Alan's solution (at least for this test case), and assumes the strings arrive to the method formatted correctly.
I need to calculate the similarity between 2 strings. So what exactly do I mean? Let me explain with an example:
The real word: hospital
Mistaken word: haspita
Now my aim is to determine how many characters I need to modify the mistaken word to obtain the real word. In this example, I need to modify 2 letters. So what would be the percent? I take the length of the real word always. So it becomes 2 / 8 = 25% so these 2 given string DSM is 75%.
How can I achieve this with performance being a key consideration?
I just addressed this exact same issue a few weeks ago. Since someone is asking now, I'll share the code. In my exhaustive tests my code is about 10x faster than the C# example on Wikipedia even when no maximum distance is supplied. When a maximum distance is supplied, this performance gain increases to 30x - 100x +. Note a couple key points for performance:
If you need to compare the same words over and over, first convert the words to arrays of integers. The Damerau-Levenshtein algorithm includes many >, <, == comparisons, and ints compare much faster than chars.
It includes a short-circuiting mechanism to quit if the distance exceeds a provided maximum
Use a rotating set of three arrays rather than a massive matrix as in all the implementations I've see elsewhere
Make sure your arrays slice accross the shorter word width.
Code (it works the exact same if you replace int[] with String in the parameter declarations:
/// <summary>
/// Computes the Damerau-Levenshtein Distance between two strings, represented as arrays of
/// integers, where each integer represents the code point of a character in the source string.
/// Includes an optional threshhold which can be used to indicate the maximum allowable distance.
/// </summary>
/// <param name="source">An array of the code points of the first string</param>
/// <param name="target">An array of the code points of the second string</param>
/// <param name="threshold">Maximum allowable distance</param>
/// <returns>Int.MaxValue if threshhold exceeded; otherwise the Damerau-Leveshteim distance between the strings</returns>
public static int DamerauLevenshteinDistance(int[] source, int[] target, int threshold) {
int length1 = source.Length;
int length2 = target.Length;
// Return trivial case - difference in string lengths exceeds threshhold
if (Math.Abs(length1 - length2) > threshold) { return int.MaxValue; }
// Ensure arrays [i] / length1 use shorter length
if (length1 > length2) {
Swap(ref target, ref source);
Swap(ref length1, ref length2);
}
int maxi = length1;
int maxj = length2;
int[] dCurrent = new int[maxi + 1];
int[] dMinus1 = new int[maxi + 1];
int[] dMinus2 = new int[maxi + 1];
int[] dSwap;
for (int i = 0; i <= maxi; i++) { dCurrent[i] = i; }
int jm1 = 0, im1 = 0, im2 = -1;
for (int j = 1; j <= maxj; j++) {
// Rotate
dSwap = dMinus2;
dMinus2 = dMinus1;
dMinus1 = dCurrent;
dCurrent = dSwap;
// Initialize
int minDistance = int.MaxValue;
dCurrent[0] = j;
im1 = 0;
im2 = -1;
for (int i = 1; i <= maxi; i++) {
int cost = source[im1] == target[jm1] ? 0 : 1;
int del = dCurrent[im1] + 1;
int ins = dMinus1[i] + 1;
int sub = dMinus1[im1] + cost;
//Fastest execution for min value of 3 integers
int min = (del > ins) ? (ins > sub ? sub : ins) : (del > sub ? sub : del);
if (i > 1 && j > 1 && source[im2] == target[jm1] && source[im1] == target[j - 2])
min = Math.Min(min, dMinus2[im2] + cost);
dCurrent[i] = min;
if (min < minDistance) { minDistance = min; }
im1++;
im2++;
}
jm1++;
if (minDistance > threshold) { return int.MaxValue; }
}
int result = dCurrent[maxi];
return (result > threshold) ? int.MaxValue : result;
}
Where Swap is:
static void Swap<T>(ref T arg1,ref T arg2) {
T temp = arg1;
arg1 = arg2;
arg2 = temp;
}
What you are looking for is called edit distance or Levenshtein distance. The wikipedia article explains how it is calculated, and has a nice piece of pseudocode at the bottom to help you code this algorithm in C# very easily.
Here's an implementation from the first site linked below:
private static int CalcLevenshteinDistance(string a, string b)
{
if (String.IsNullOrEmpty(a) && String.IsNullOrEmpty(b)) {
return 0;
}
if (String.IsNullOrEmpty(a)) {
return b.Length;
}
if (String.IsNullOrEmpty(b)) {
return a.Length;
}
int lengthA = a.Length;
int lengthB = b.Length;
var distances = new int[lengthA + 1, lengthB + 1];
for (int i = 0; i <= lengthA; distances[i, 0] = i++);
for (int j = 0; j <= lengthB; distances[0, j] = j++);
for (int i = 1; i <= lengthA; i++)
for (int j = 1; j <= lengthB; j++)
{
int cost = b[j - 1] == a[i - 1] ? 0 : 1;
distances[i, j] = Math.Min
(
Math.Min(distances[i - 1, j] + 1, distances[i, j - 1] + 1),
distances[i - 1, j - 1] + cost
);
}
return distances[lengthA, lengthB];
}
There is a big number of string similarity distance algorithms that can be used. Some listed here (but not exhaustively listed are):
Levenstein
Needleman Wunch
Smith Waterman
Smith Waterman Gotoh
Jaro, Jaro Winkler
Jaccard Similarity
Euclidean Distance
Dice Similarity
Cosine Similarity
Monge Elkan
A library that contains implementation to all of these is called SimMetrics
which has both java and c# implementations.
I have found that Levenshtein and Jaro Winkler are great for small differences betwen strings such as:
Spelling mistakes; or
ö instead of o in a persons name.
However when comparing something like article titles where significant chunks of the text would be the same but with "noise" around the edges, Smith-Waterman-Gotoh has been fantastic:
compare these 2 titles (that are the same but worded differently from different sources):
An endonuclease from Escherichia coli that introduces single polynucleotide chain scissions in ultraviolet-irradiated DNA
Endonuclease III: An Endonuclease from Escherichia coli That Introduces Single Polynucleotide Chain Scissions in Ultraviolet-Irradiated DNA
This site that provides algorithm comparison of the strings shows:
Levenshtein: 81
Smith-Waterman Gotoh 94
Jaro Winkler 78
Jaro Winkler and Levenshtein are not as competent as Smith Waterman Gotoh in detecting the similarity. If we compare two titles that are not the same article, but have some matching text:
Fat metabolism in higher plants. The function of acyl thioesterases in the metabolism of acyl-coenzymes A and acyl-acyl carrier proteins
Fat metabolism in higher plants. The determination of acyl-acyl carrier protein and acyl coenzyme A in a complex lipid mixture
Jaro Winkler gives a false positive, but Smith Waterman Gotoh does not:
Levenshtein: 54
Smith-Waterman Gotoh 49
Jaro Winkler 89
As Anastasiosyal pointed out, SimMetrics has the java code for these algorithms. I had success using the SmithWatermanGotoh java code from SimMetrics.
Here is my implementation of Damerau Levenshtein Distance, which returns not only similarity coefficient, but also returns error locations in corrected word (this feature can be used in text editors). Also my implementation supports different weights of errors (substitution, deletion, insertion, transposition).
public static List<Mistake> OptimalStringAlignmentDistance(
string word, string correctedWord,
bool transposition = true,
int substitutionCost = 1,
int insertionCost = 1,
int deletionCost = 1,
int transpositionCost = 1)
{
int w_length = word.Length;
int cw_length = correctedWord.Length;
var d = new KeyValuePair<int, CharMistakeType>[w_length + 1, cw_length + 1];
var result = new List<Mistake>(Math.Max(w_length, cw_length));
if (w_length == 0)
{
for (int i = 0; i < cw_length; i++)
result.Add(new Mistake(i, CharMistakeType.Insertion));
return result;
}
for (int i = 0; i <= w_length; i++)
d[i, 0] = new KeyValuePair<int, CharMistakeType>(i, CharMistakeType.None);
for (int j = 0; j <= cw_length; j++)
d[0, j] = new KeyValuePair<int, CharMistakeType>(j, CharMistakeType.None);
for (int i = 1; i <= w_length; i++)
{
for (int j = 1; j <= cw_length; j++)
{
bool equal = correctedWord[j - 1] == word[i - 1];
int delCost = d[i - 1, j].Key + deletionCost;
int insCost = d[i, j - 1].Key + insertionCost;
int subCost = d[i - 1, j - 1].Key;
if (!equal)
subCost += substitutionCost;
int transCost = int.MaxValue;
if (transposition && i > 1 && j > 1 && word[i - 1] == correctedWord[j - 2] && word[i - 2] == correctedWord[j - 1])
{
transCost = d[i - 2, j - 2].Key;
if (!equal)
transCost += transpositionCost;
}
int min = delCost;
CharMistakeType mistakeType = CharMistakeType.Deletion;
if (insCost < min)
{
min = insCost;
mistakeType = CharMistakeType.Insertion;
}
if (subCost < min)
{
min = subCost;
mistakeType = equal ? CharMistakeType.None : CharMistakeType.Substitution;
}
if (transCost < min)
{
min = transCost;
mistakeType = CharMistakeType.Transposition;
}
d[i, j] = new KeyValuePair<int, CharMistakeType>(min, mistakeType);
}
}
int w_ind = w_length;
int cw_ind = cw_length;
while (w_ind >= 0 && cw_ind >= 0)
{
switch (d[w_ind, cw_ind].Value)
{
case CharMistakeType.None:
w_ind--;
cw_ind--;
break;
case CharMistakeType.Substitution:
result.Add(new Mistake(cw_ind - 1, CharMistakeType.Substitution));
w_ind--;
cw_ind--;
break;
case CharMistakeType.Deletion:
result.Add(new Mistake(cw_ind, CharMistakeType.Deletion));
w_ind--;
break;
case CharMistakeType.Insertion:
result.Add(new Mistake(cw_ind - 1, CharMistakeType.Insertion));
cw_ind--;
break;
case CharMistakeType.Transposition:
result.Add(new Mistake(cw_ind - 2, CharMistakeType.Transposition));
w_ind -= 2;
cw_ind -= 2;
break;
}
}
if (d[w_length, cw_length].Key > result.Count)
{
int delMistakesCount = d[w_length, cw_length].Key - result.Count;
for (int i = 0; i < delMistakesCount; i++)
result.Add(new Mistake(0, CharMistakeType.Deletion));
}
result.Reverse();
return result;
}
public struct Mistake
{
public int Position;
public CharMistakeType Type;
public Mistake(int position, CharMistakeType type)
{
Position = position;
Type = type;
}
public override string ToString()
{
return Position + ", " + Type;
}
}
public enum CharMistakeType
{
None,
Substitution,
Insertion,
Deletion,
Transposition
}
This code is a part of my project: Yandex-Linguistics.NET.
I wrote some tests and it's seems to me that method is working.
But comments and remarks are welcome.
Here is an alternative approach:
A typical method for finding similarity is Levenshtein distance, and there is no doubt a library with code available.
Unfortunately, this requires comparing to every string. You might be able to write a specialized version of the code to short-circuit the calculation if the distance is greater than some threshold, you would still have to do all the comparisons.
Another idea is to use some variant of trigrams or n-grams. These are sequences of n characters (or n words or n genomic sequences or n whatever). Keep a mapping of trigrams to strings and choose the ones that have the biggest overlap. A typical choice of n is "3", hence the name.
For instance, English would have these trigrams:
Eng
ngl
gli
lis
ish
And England would have:
Eng
ngl
gla
lan
and
Well, 2 out of 7 (or 4 out of 10) match. If this works for you, and you can index the trigram/string table and get a faster search.
You can also combine this with Levenshtein to reduce the set of comparison to those that have some minimum number of n-grams in common.
Here's a VB.net implementation:
Public Shared Function LevenshteinDistance(ByVal v1 As String, ByVal v2 As String) As Integer
Dim cost(v1.Length, v2.Length) As Integer
If v1.Length = 0 Then
Return v2.Length 'if string 1 is empty, the number of edits will be the insertion of all characters in string 2
ElseIf v2.Length = 0 Then
Return v1.Length 'if string 2 is empty, the number of edits will be the insertion of all characters in string 1
Else
'setup the base costs for inserting the correct characters
For v1Count As Integer = 0 To v1.Length
cost(v1Count, 0) = v1Count
Next v1Count
For v2Count As Integer = 0 To v2.Length
cost(0, v2Count) = v2Count
Next v2Count
'now work out the cheapest route to having the correct characters
For v1Count As Integer = 1 To v1.Length
For v2Count As Integer = 1 To v2.Length
'the first min term is the cost of editing the character in place (which will be the cost-to-date or the cost-to-date + 1 (depending on whether a change is required)
'the second min term is the cost of inserting the correct character into string 1 (cost-to-date + 1),
'the third min term is the cost of inserting the correct character into string 2 (cost-to-date + 1) and
cost(v1Count, v2Count) = Math.Min(
cost(v1Count - 1, v2Count - 1) + If(v1.Chars(v1Count - 1) = v2.Chars(v2Count - 1), 0, 1),
Math.Min(
cost(v1Count - 1, v2Count) + 1,
cost(v1Count, v2Count - 1) + 1
)
)
Next v2Count
Next v1Count
'the final result is the cheapest cost to get the two strings to match, which is the bottom right cell in the matrix
'in the event of strings being equal, this will be the result of zipping diagonally down the matrix (which will be square as the strings are the same length)
Return cost(v1.Length, v2.Length)
End If
End Function
public static Int64 Decode(string input)
{
var reversed = input.ToLower().Reverse();
long result = 0;
int pos = 0;
foreach (char c in reversed)
{
result += CharList.IndexOf(c) * (long)Math.Pow(36, pos);
pos++;
}
return result;
}
I am using a method to decode a value from base36 to decimal. The method works fine
but when I get to decode the input value of "000A" then things start to go wrong and it
decodes this as -1.
Can anyone see what's going wrong? I am really confused with the code and how it works.
private const string CharList = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
I can only assume that your CharList does not contain A, therefore IndexOf(c) returns -1 to show that the character wasn't found. Keep in mind that IndexOf is case sensitive by default, so if you're using lowercase in your CharList and uppercase for c, it won't match.
// pos = 0
result += CharList.IndexOf(c) * (long)Math.Pow(36, pos);
// pos = 36^0 = 1
// CharList.IndexOf(c) gives -1 when not found
// therefore, it equates to:
result += -1 * 1
You're using ToLower() on your source and your list contains uppercase chars only, so IndexOf('a') returns -1.
I think you'd want to use ToUpper() instead.
Here is a recursive function:
using System;
class Program {
static int decode(string sIn, int nBase) {
int n = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ".IndexOf(sIn[^1]);
string s = sIn[..^1];
return s != "" ? decode(s, nBase) * nBase + n : n;
}
static void Main() {
var s = "q3ezbz";
var n = decode(s, 36);
Console.WriteLine(n == 1577858399);
}
}
You can succinctly perform this task with Linq and avoid having a CharList at all & support converting from any base (2 to 36) to base 10 as follows:
string b36 = "000A", tbase = 36;
int b10 = b36
.Select(d => d >= '0' && d <= '9' ? d - '0' : 10 + char.ToUpper(d) - 'A')
.Aggregate(0, (pos, d) => pos * tbase + d);
For completeness (to go from base 10 to any base):
int value = 10, tbase = 36;
string result = "";
while (value > 0)
{
int x = value % tbase;
result = (char)(x >= 0 && x <= 9 ? x + 48 : x + 'A' - 10) + result;
value /= tbase;
}
Console.WriteLine(result.PadLeft(4,'0'));
I'm trying to write a function that takes a revision number (int) and turns it into a revision name (string). The formula should produce outputs similar to this:
Number Name
1 A
2 B
3 C
... ...
25 Y
26 Z
27 AA
28 AB
29 AC
... ...
51 AY
52 AZ
53 BA
54 BB
55 BC
... ...
This seems simple, but I think it might involve recursion and I'm terrible at that. Any suggestions?
I think this is the same as working out the Excel column name from a column number:
private string GetExcelColumnName(int columnNumber)
{
int dividend = columnNumber;
string columnName = String.Empty;
int modulo;
while (dividend > 0)
{
modulo = (dividend - 1) % 26;
columnName = Convert.ToChar(65 + modulo).ToString() + columnName;
dividend = (int)((dividend - modulo) / 26);
}
return columnName;
}
I think you basically need a transformation of a number in 10x numerical system to a number in 26x numerical system.
For example:
53 = 5*10^1 + 3*10^0 = [5][3]
53 = B*26^1 + A*26^0 = [B][A]
int value10 = 53;
int base10 = 10;
string value26 = "";
int base26 = 26;
int input = value10;
while (true)
{
int mod = input / base26;
if (mod > 0)
value26 += Map26SymbolByValue10 (mod); // Will map 2 to 'B'
else
value26 += Map26SymbolByValue10 (input); // Will map 1 to 'A'
int rest = input - mod * base26;
if (input < base26) break;
input = rest;
}
I really hope this isn't homework... (untested solution):
if(value == 1)
return "A";
StringBuilder result = new StringBuilder();
value--;
while(value > 0)
{
result.Insert(0, 'A' + (value % 26));
value /= 26;
}
Recursive version based on tanascius' original answer (also untested):
string ConvertToChar(int value)
{
char low = 'A' + (value - 1) % 26;
if(value > 26)
return ConvertToChar((value - 1) / 26 + 1) + low.ToString();
else
return low.ToString();
}
Tested solution:
private static string VersionName(int versionNum)
{
StringBuilder sb = new StringBuilder();
while (versionNum > 0)
{
versionNum--;
sb.Insert(0, (char)('A' + (versionNum % 26)));
versionNum /= 26;
}
return sb.ToString();
}
I wouldn't bother using recursion for this. Looping with a StringBuilder is more efficient than concatenating strings with each recursion, although you'd probably need a crazy number of revisions to notice the difference (4 letters is enough for over 400,000 revisions).
You can use the modulo operator and division to get your code.
Like 55 / 26 == 2 (that is B) and 55 % 26 = 3 (that is C). It works for two characters. When you have an unknown count of characters, you have to start looping:
[look at Aaron's solution, mine was wrong]
What is the fastest c# function that takes and int and returns a string containing a letter or letters for use in an Excel function? For example, 1 returns "A", 26 returns "Z", 27 returns "AA", etc.
This is called tens of thousands of times and is taking 25% of the time needed to generate a large spreadsheet with many formulas.
public string Letter(int intCol) {
int intFirstLetter = ((intCol) / 676) + 64;
int intSecondLetter = ((intCol % 676) / 26) + 64;
int intThirdLetter = (intCol % 26) + 65;
char FirstLetter = (intFirstLetter > 64) ? (char)intFirstLetter : ' ';
char SecondLetter = (intSecondLetter > 64) ? (char)intSecondLetter : ' ';
char ThirdLetter = (char)intThirdLetter;
return string.Concat(FirstLetter, SecondLetter, ThirdLetter).Trim();
}
I currently use this, with Excel 2007
public static string ExcelColumnFromNumber(int column)
{
string columnString = "";
decimal columnNumber = column;
while (columnNumber > 0)
{
decimal currentLetterNumber = (columnNumber - 1) % 26;
char currentLetter = (char)(currentLetterNumber + 65);
columnString = currentLetter + columnString;
columnNumber = (columnNumber - (currentLetterNumber + 1)) / 26;
}
return columnString;
}
and
public static int NumberFromExcelColumn(string column)
{
int retVal = 0;
string col = column.ToUpper();
for (int iChar = col.Length - 1; iChar >= 0; iChar--)
{
char colPiece = col[iChar];
int colNum = colPiece - 64;
retVal = retVal + colNum * (int)Math.Pow(26, col.Length - (iChar + 1));
}
return retVal;
}
As mentioned in other posts, the results can be cached.
I can tell you that the fastest function will not be the prettiest function. Here it is:
private string[] map = new string[]
{
"A", "B", "C", "D", "E" .............
};
public string getColumn(int number)
{
return map[number];
}
Don't convert it at all. Excel can work in R1C1 notation just as well as in A1 notation.
So (apologies for using VBA rather than C#):
Application.Worksheets("Sheet1").Range("B1").Font.Bold = True
can just as easily be written as:
Application.Worksheets("Sheet1").Cells(1, 2).Font.Bold = True
The Range property takes A1 notation whereas the Cells property takes (row number, column number).
To select multiple cells: Range(Cells(1, 1), Cells(4, 6)) (NB would need some kind of object qualifier if not using the active worksheet) rather than Range("A1:F4")
The Columns property can take either a letter (e.g. F) or a number (e.g. 6)
Here's my version: This does not have any limitation as such 2-letter or 3-letter.
Simply pass-in the required number (starting with 0) Will return the Excel Column Header like Alphabet sequence for passed-in number:
private string GenerateSequence(int num)
{
string str = "";
char achar;
int mod;
while (true)
{
mod = (num % 26) + 65;
num = (int)(num / 26);
achar = (char)mod;
str = achar + str;
if (num > 0) num--;
else if (num == 0) break;
}
return str;
}
I did not tested this for performance, if someone can do that will great for others.
(Sorry for being lazy) :)
Cheers!
You could pre-generate all the values into an array of strings. This would take very little memory and could be calculated on the first call.
Here is a concise implementation using LINQ.
static IEnumerable<string> GetExcelStrings()
{
string[] alphabet = { string.Empty, "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z" };
return from c1 in alphabet
from c2 in alphabet
from c3 in alphabet.Skip(1) // c3 is never empty
where c1 == string.Empty || c2 != string.Empty // only allow c2 to be empty if c1 is also empty
select c1 + c2 + c3;
}
This generates A to Z, then AA to ZZ, then AAA to ZZZ.
On my PC, calling GetExcelStrings().ToArray() takes about 30 ms. Thereafter, you can refer to this array of strings if you need it thousands of times.
Once your function has run, let it cache the results into a dictionary. So that, it won't have to do the calculation again.
e.g. Convert(27) will check if 27 is mapped/stored in dictionary. If not, do the calculation and store "AA" against 27 in the dictionary.
The absolute FASTEST, would be capitalizing that the Excel spreadsheet only a fixed number of columns, so you would do a lookup table. Declare a constant string array of 256 entries, and prepopulate it with the strings from "A" to "IV". Then you simply do a straight index lookup.
Try this function.
// Returns name of column for specified 0-based index.
public static string GetColumnName(int index)
{
var name = new char[3]; // Assumes 3-letter column name max.
int rem = index;
int div = 17576; // 26 ^ 3
for (int i = 2; i >= 0; i++)
{
name[i] = alphabet[rem / div];
rem %= div;
div /= 26;
}
if (index >= 676)
return new string(name, 3);
else if (index >= 26)
return new string(name, 2);
else
return new string(name, 1);
}
Now it shouldn't take up that much memory to pre-generate each column name for every index and store them in a single huge array, so you shouldn't need to look up the name for any column twice.
If I can think of any further optimisations, I'll add them later, but I believe this function should be pretty quick, and I doubt you even need this sort of speed if you do the pre-generation.
Your first problem is that you are declaring 6 variables in the method. If a methd is going to be called thousands of times, just moving those to class scope instead of function scope will probably cut your processing time by more than half right off the bat.
This is written in Java, but it's basically the same thing.
Here's code to compute the label for the column, in upper-case, with a 0-based index:
public static String findColChars(long index) {
char[] ret = new char[64];
for (int i = 0; i < ret.length; ++i) {
int digit = ret.length - i - 1;
long test = index - powerDown(i + 1);
if (test < 0)
break;
ret[digit] = toChar(test / (long)(Math.pow(26, i)));
}
return new String(ret);
}
private static char toChar(long num) {
return (char)((num % 26) + 65);
}
Here's code to compute 0-based index for the column from the upper-case label:
public static long findColIndex(String col) {
long index = 0;
char[] chars = col.toCharArray();
for (int i = 0; i < chars.length; ++i) {
int cur = chars.length - i - 1;
index += (chars[cur] - 65) * Math.pow(26, i);
}
return index + powerDown(chars.length);
}
private static long powerDown(int limit) {
long acc = 0;
while (limit > 1)
acc += Math.pow(26, limit-- - 1);
return acc;
}
#Neil N -- nice code I think the thirdLetter should have a +64 rather than +65 ? am I right?
public string Letter(int intCol) {
int intFirstLetter = ((intCol) / 676) + 64;
int intSecondLetter = ((intCol % 676) / 26) + 64;
int intThirdLetter = (intCol % 26) + 65; ' SHOULD BE + 64?
char FirstLetter = (intFirstLetter > 64) ? (char)intFirstLetter : ' ';
char SecondLetter = (intSecondLetter > 64) ? (char)intSecondLetter : ' ';
char ThirdLetter = (char)intThirdLetter;
return string.Concat(FirstLetter, SecondLetter, ThirdLetter).Trim();
}
Why don't we try factorial?
public static string GetColumnName(int index)
{
const string letters = "ZABCDEFGHIJKLMNOPQRSTUVWXY";
int NextPos = (index / 26);
int LastPos = (index % 26);
if (LastPos == 0) NextPos--;
if (index > 26)
return GetColumnName(NextPos) + letters[LastPos];
else
return letters[LastPos] + "";
}
Caching really does cut the runtime of 10,000,000 random calls to 1/3 its value though:
static Dictionary<int, string> LetterDict = new Dictionary<int, string>(676);
public static string LetterWithCaching(int index)
{
int intCol = index - 1;
if (LetterDict.ContainsKey(intCol)) return LetterDict[intCol];
int intFirstLetter = ((intCol) / 676) + 64;
int intSecondLetter = ((intCol % 676) / 26) + 64;
int intThirdLetter = (intCol % 26) + 65;
char FirstLetter = (intFirstLetter > 64) ? (char)intFirstLetter : ' ';
char SecondLetter = (intSecondLetter > 64) ? (char)intSecondLetter : ' ';
char ThirdLetter = (char)intThirdLetter;
String s = string.Concat(FirstLetter, SecondLetter, ThirdLetter).Trim();
LetterDict.Add(intCol, s);
return s;
}
I think caching in the worst-case (hit every value) couldn't take up more than 250kb (17576 possible values * (sizeof(int)=4 + sizeof(char)*3 + string overhead=2)
It is recursive. Fast, and right :
class ToolSheet
{
//Not the prettyest but surely the fastest :
static string[] ColName = new string[676];
public ToolSheet()
{
ColName[0] = "A";
for (int index = 1; index < 676; ++index) Recurse(index, index);
}
private int Recurse(int i, int index)
{
if (i < 1) return 0;
ColName[index] = ((char)(65 + i % 26)).ToString() + ColName[index];
return Recurse(i / 26, index);
}
public string GetColName(int i)
{
return ColName[i - 1];
}
}
sorry there was a shift. corrected.
class ToolSheet
{
//Not the prettyest but surely the fastest :
static string[] ColName = new string[676];
public ToolSheet()
{
for (int index = 0; index < 676; ++index)
{
Recurse(index, index);
}
}
private int Recurse(int i, int index)
{
if (i < 1)
{
if (index % 26 == 0 && index > 0) ColName[index] = ColName[index - 1].Substring(0, ColName[index - 1].Length - 1) + "Z";
return 0;
}
ColName[index] = ((char)(64 + i % 26)).ToString() + ColName[index];
return Recurse(i / 26, index);
}
public string GetColName(int i)
{
return ColName[i - 1];
}
}
My solution:
static class ExcelHeaderHelper
{
public static string[] GetHeaderLetters(uint max)
{
var result = new List<string>();
int i = 0;
var columnPrefix = new Queue<string>();
string prefix = null;
int prevRoundNo = 0;
uint maxPrefix = max / 26;
while (i < max)
{
int roundNo = i / 26;
if (prevRoundNo < roundNo)
{
prefix = columnPrefix.Dequeue();
prevRoundNo = roundNo;
}
string item = prefix + ((char)(65 + (i % 26))).ToString(CultureInfo.InvariantCulture);
if (i <= maxPrefix)
{
columnPrefix.Enqueue(item);
}
result.Add(item);
i++;
}
return result.ToArray();
}
}
barrowc's idea is much more convenient and fastest than any conversion function! i have converted his ideas to actual c# code that i use:
var start = m_xlApp.Cells[nRow1_P, nCol1_P];
var end = m_xlApp.Cells[nRow2_P, nCol2_P];
// cast as Range to prevent binding errors
m_arrRange = m_xlApp.get_Range(start as Range, end as Range);
object[] values = (object[])m_arrRange.Value2;
private String columnLetter(int column) {
if (column <= 0)
return "";
if (column <= 26){
return (char) (column + 64) + "";
}
if (column%26 == 0){
return columnLetter((column/26)-1) + columnLetter(26) ;
}
return columnLetter(column/26) + columnLetter(column%26) ;
}
Just use an Excel formula instead of a user-defined function (UDF) or other program, per Allen Wyatt (https://excel.tips.net/T003254_Alphabetic_Column_Designation.html):
=SUBSTITUTE(ADDRESS(ROW(),COLUMN(),4),ROW(),"")
(In my organization, using UDFs would be very painful.)
The code I'm providing is NOT C# (instead is python) but the logic can be used for any language.
Most of previous answers are correct. Here is one more way of converting column number to excel columns.
solution is rather simple if we think about this as a base conversion. Simply, convert the column number to base 26 since there is 26 letters only.
Here is how you can do this:
steps:
set the column as a quotient
subtract one from quotient variable (from previous step) because we need to end up on ascii table with 97 being a.
divide by 26 and get the remainder.
add +97 to remainder and convert to char (since 97 is "a" in ASCII table)
quotient becomes the new quotient/ 26 (since we might go over 26 column)
continue to do this until quotient is greater than 0 and then return the result
here is the code that does this :)
def convert_num_to_column(column_num):
result = ""
quotient = column_num
remainder = 0
while (quotient >0):
quotient = quotient -1
remainder = quotient%26
result = chr(int(remainder)+97)+result
quotient = int(quotient/26)
return result
print("--",convert_num_to_column(1).upper())
If you need to generate letters not starting only from A1
private static string GenerateCellReference(int n, int startIndex = 65)
{
string name = "";
n += startIndex - 65;
while (n > 0)
{
n--;
name = (char)((n % 26) + 65) + name;
n /= 26;
}
return name + 1;
}