We have a legacy requirement to store newly migrated int ID values in a guid type for use on ID-agnostic data types (basically old code that took advantage of the "globally unique" part of guid in order to contain all possible IDs in one column/field).
Due to this requirement, there was a follow-on requirement to embed the integer ID of entities into the guid in a human-readable manner. This is important and is currently what stops me from working against the byte values directly.
Currently, I have the following:
public static byte[] IntAsHexBytes(int value)
{
return BitConverter.GetBytes(Convert.ToInt64(value.ToString(), 16));
}
public static Guid EmbedInteger(int id)
{
var ib = IntAsHexBytes(id);
return new Guid(new byte[]
{
0,0,0,0,0,0,1,64,ib[7],ib[6],ib[5],ib[4],ib[3],ib[2],ib[1],ib[0]
});
}
It treats the visual representation of the int as a hex value (value.ToString()), converts that to a long (Convert.ToInt64(value.ToString(), 16)) and then grabs the bytes from the long into a flattened byte[] for creating a guid in a particular structure.
So given an int of 42, treating "42" as hex and converting it to a long gives 66, and placing the bytes of 66 into a guid gives:
"00000000-0000-4001-0000-000000000042"
And an int of 379932126 gives:
"00000000-0000-4001-0000-000379932126"
So the end result is to place the integer into the guid in the last 12 digits so it visually looks like the integer 42 (even though the underlying integer value was 66).
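To make the "decimal digits read as hex" step concrete, here is a minimal sketch. The class and method names (HexTrickDemo, DigitsAsHex) are made up for illustration and are not part of the original code.

```csharp
using System;

static class HexTrickDemo
{
    // Reads the decimal digits of value as if they were hex digits.
    // Hypothetical helper; it mirrors the Convert.ToInt64 call in IntAsHexBytes.
    public static long DigitsAsHex(int value)
    {
        return Convert.ToInt64(value.ToString(), 16);
    }

    static void Main()
    {
        long packed = DigitsAsHex(42);
        Console.WriteLine(packed);               // 66
        Console.WriteLine(packed.ToString("X")); // 42
    }
}
```

Printing the packed value in hex gives back the original decimal digits, which is why the guid "looks like" the integer.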
This is roughly 30%-40% faster than constructing a string using concatenation in order to feed into the new Guid(string) constructor, but I feel I'm missing the solution that avoids having to do anything with strings in the first place.
The actual timings involved are quite small so as a performance improvement it probably won't justify the effort.
This is purely for the sake of my own curiosity to see if there are faster ways of tackling this problem. I posted here as I'm a long-standing SO user, but I'm torn as to whether this is a code-review-ish question, though I'm not asking for anything against my code directly, it just demonstrates what I want as output.
The integer range being supplied is 0 to int.MaxValue.
Update: For completeness, this is what we currently have and what I'm testing against:
string s = string.Format("00000000-0000-4001-0000-{0:D12}", id);
return new Guid(s);
My other code above is faster than this by around 30%.
Ok, here's another version which completely avoids strings. Hopefully this might be better. :)
public static Guid EmbedInteger(int id)
{
byte[] bytes = new byte[8];
int i = 0;
while (id > 0)
{
int remainder = id%100;
bytes[i++] = (byte)(16*(remainder/10) + remainder%10);
id /= 100;
}
return new Guid(0, 0, 0x4001, bytes[7], bytes[6], bytes[5], bytes[4], bytes[3], bytes[2], bytes[1], bytes[0]);
}
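A quick way to gain confidence in the string-free version is to compare it against the string.Format baseline from the question for a few IDs. This is only a sketch: the class name EmbedCheck and the method name EmbedViaString are hypothetical, and EmbedInteger is repeated so the snippet compiles on its own.

```csharp
using System;

static class EmbedCheck
{
    // String-free version from the answer above.
    public static Guid EmbedInteger(int id)
    {
        byte[] bytes = new byte[8];
        int i = 0;
        while (id > 0)
        {
            int remainder = id % 100;
            // Pack two decimal digits into one byte, one digit per nibble.
            bytes[i++] = (byte)(16 * (remainder / 10) + remainder % 10);
            id /= 100;
        }
        return new Guid(0, 0, 0x4001, bytes[7], bytes[6], bytes[5], bytes[4],
                        bytes[3], bytes[2], bytes[1], bytes[0]);
    }

    // Baseline from the question's update (hypothetical name).
    public static Guid EmbedViaString(int id)
    {
        return new Guid(string.Format("00000000-0000-4001-0000-{0:D12}", id));
    }

    static void Main()
    {
        foreach (int id in new[] { 0, 7, 42, 379932126, int.MaxValue })
        {
            Console.WriteLine("{0}: {1}", id, EmbedInteger(id) == EmbedViaString(id));
        }
    }
}
```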
Adam Houldsworth: Update: This code can also be unrolled:
int remainder = id % 100;
bytes[0] = (byte)(16 * (remainder / 10) + remainder % 10);
id /= 100;
if (id == 0) return;
remainder = id % 100;
bytes[1] = (byte)(16 * (remainder / 10) + remainder % 10);
id /= 100;
if (id == 0) return;
remainder = id % 100;
bytes[2] = (byte)(16 * (remainder / 10) + remainder % 10);
id /= 100;
if (id == 0) return;
remainder = id % 100;
bytes[3] = (byte)(16 * (remainder / 10) + remainder % 10);
id /= 100;
if (id == 0) return;
remainder = id % 100;
bytes[4] = (byte)(16 * (remainder / 10) + remainder % 10);
I think this will do what you want. Not sure if it is any more efficient than your code, but it is a little shorter at least. :)
public static Guid EmbedInteger(int id)
{
string guid = string.Format("00000000-0000-4001-0000-{0,12:D12}", id);
return new Guid(guid);
}
It works by using the numeric format 12:D12 which causes the input number to be formatted as a decimal in a field width of 12 with leading zeroes.
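As a small illustration of that format string (FormatDemo and Suffix are hypothetical names): for non-negative values, D12 alone already pads to 12 digits, so the ",12" field width should not change the result.

```csharp
using System;

static class FormatDemo
{
    // Formats the id as 12 decimal digits with leading zeroes.
    public static string Suffix(int id)
    {
        return string.Format("{0:D12}", id);
    }

    static void Main()
    {
        Console.WriteLine(Suffix(42));                        // 000000000042
        Console.WriteLine(string.Format("{0,12:D12}", 42));   // 000000000042
    }
}
```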
I am trying to convert decimal to hexadecimal. I have found many codes online. I used
int decValue = int.Parse(hexValue, System.Globalization.NumberStyles.HexNumber);
but my instructor told me I can't use any of those and must use a recursive method instead. I am new to programming and a little confused about recursive methods.
I did find other methods to convert it. I am using the method below, with a switch statement to change numbers to letters. The program works fine, but I'm not sure whether it is a recursive method. Can someone tell me if it is, and if not, help me understand how a recursive method works?
static void HexadecimalConversion(int decimals)
{
if (decimals == 0)
return;
else
{
int hexadecimals = decimals % 16;
decimals = decimals / 16;
HexadecimalConversion(decimals);
With most recursive problems, you have 1 or 2 special cases and a general case. For this problem there are 3 cases:
Special Case #1. The value to be converted is 0.
The General Case. The value to be converted is greater than 0.
The Terminating Case. When the value to be converted has finally been reduced to 0 by repeated division.
You need to distinguish between the two 'zero' conditions, lest you always append a trailing zero to the result, so...you need a 2-layered approach, something like this:
static string Int2Hex( int value )
{
if ( value < 0 ) throw new ArgumentOutOfRangeException("value") ;
if ( value == 0 ) return "0" ;
string result = ToHex( (uint) value ).ToString() ;
return result ;
}
static StringBuilder ToHex ( uint value )
{
StringBuilder buffer ;
if ( value <= 0 )
{
buffer = new StringBuilder() ;
}
else
{
buffer = ToHex( value / 16 ).Append( "0123456789ABCDEF"[ (int)(value % 16 ) ] ) ;
}
return buffer ;
}
Yet another implementation:
public string ConvertToHexa(int number)
{
if (number == 0)
return String.Empty;
var head = ConvertToHexa(number / 16);
var remainder = number % 16;
var tail = (char)(remainder + (remainder >= 10 ? 'A' - 10 : '0'));
return head + tail;
}
Console.WriteLine(ConvertToHexa(202)) gives "CA" (which is correct).
Another implementation
public void ConvertToHexa(int number)
{
if (number == 0)
return;
ConvertToHexa(number / 16);
var remainder = number % 16;
Console.Write(remainder >= 10 ? ((char)(remainder - 10 + 'A')).ToString() : remainder.ToString());
}
numbers[i] = numbers[i] * 2;
if (numbers[i] >= 10)
{
string t = numbers[i].ToString();
Console.WriteLine(t[0] + " plus " + t[1]+" = "+quersumme(t).ToString());
numbers[i] = Convert.ToInt32(t[0]) + Convert.ToInt32(t[1]);
}
public int quersumme(string n)
{
return n[0] + n[1];
}
The function returns 101 when I enter 7. But 7 * 2 = 14 and quersumme should do 1+4 = 5
t[0] is the character '1', and t[1] is the character '4', which is translated to 49 + 52, hence 101. Check out an ASCII chart to see what I'm talking about.
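The difference is easy to see side by side. This is a minimal sketch with made-up names (CharSumDemo, SumOfCodes, SumOfDigits), and it assumes the string holds two ASCII digits.

```csharp
using System;

static class CharSumDemo
{
    // Adds the character codes, which is what the original quersumme does.
    public static int SumOfCodes(string n)
    {
        return n[0] + n[1];
    }

    // Adds the digit values by subtracting the code of '0' first.
    public static int SumOfDigits(string n)
    {
        return (n[0] - '0') + (n[1] - '0');
    }

    static void Main()
    {
        Console.WriteLine(SumOfCodes("14"));  // 101
        Console.WriteLine(SumOfDigits("14")); // 5
    }
}
```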
You could try using the Char.GetNumericValue() function:
return (int)Char.GetNumericValue(n[0]) + (int)Char.GetNumericValue(n[1]);
You're currently summing the Unicode code points - '1' is 49, and '4' is 52, hence the 101. You want to take the digit value of each character.
If you know that the digits will be in the range '0'-'9', the simplest way of doing that is just to subtract '0' and to use the LINQ Sum method to sum each value:
public int SumDigits(string n)
{
return n.Sum(c => c - '0');
}
Or you could use Char.GetNumericValue(), but that returns double because it also copes with characters such as U+00BD: ½.
Try converting n[0] and n[1] to separate int32's in your quersumme function
You are adding the character codes of the two chars in the quersumme method rather than their digit values.
Should be:
public int quersumme(string n)
{
return (int)Char.GetNumericValue(n[0]) + (int)Char.GetNumericValue(n[1]);
}
It looks to me like you are trying to enumerate the digits in an int.
Try this to avoid slow and cumbersome parsing and conversion. (It's all relative, I haven't tested performance.)
static IEnumerable<int> EnumerateDigits(int value, int baseValue = 10)
{
while (value > 0)
{
yield return value % baseValue;
value = value / baseValue;
}
}
Then, if you want to switch the order into an array
var t = EnumerateDigits(numbers[i]).Reverse().ToArray();
But, if you just want to sum the digits.
var checksum = EnumerateDigits(numbers[i]).Sum();
I would like to generate a code like goo.gl and jsfiddle websites (http://jsfiddle.net/XzKvP/).
I tried different things that give me too large of a guid, a repeating alphanumeric code, etc.
I'm thinking I should be able to generate an alphanumeric code based on the Primary Key in my database table. This way it will be non-repeating? The PK is an auto-incremented integer by 1. But not sure that's how it should be done.
I want the code to look random, but it does NOT have to be.
For example, I do NOT want item 1234 in my database to be BCDE and the 1235 item to be BCDF.
Examples:
Notice how the url http://jsfiddle.net/XzKvP/ has a unique 5 character code XzKvP associated to the page. I want to be able to generate the same type of code.
goo.gl does it too: http://goo.gl/UEhtg has UEhtg
How is this done?
The solutions based on a random substring are no good because the outputs will collide. It may happen prematurely (with bad luck), and it will eventually happen when the list of generated values grows large. It doesn't even have to be that large for the probability of collisions to become high (see birthday attack).
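To put rough numbers on that, here is a sketch using the standard birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)), assuming codes are drawn uniformly at random from N = 62^5 possible five-character values. The names BirthdayDemo and CollisionProbability are hypothetical.

```csharp
using System;

static class BirthdayDemo
{
    // Approximate probability that n uniformly random codes, drawn from a
    // space of `space` possible values, contain at least one duplicate.
    public static double CollisionProbability(double n, double space)
    {
        return 1.0 - Math.Exp(-n * (n - 1) / (2.0 * space));
    }

    static void Main()
    {
        double space = Math.Pow(62, 5); // five base-62 characters
        foreach (int n in new[] { 1000, 10000, 50000, 100000 })
        {
            Console.WriteLine("{0,7} codes -> {1:P1}", n, CollisionProbability(n, space));
        }
    }
}
```

Under these assumptions a collision becomes more likely than not after only a few tens of thousands of codes, even though the space holds over 900 million values.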
What's good for this problem is a pseudo random permutation between the incrementing ID and its counterpart that will be shown in the URL. This technique guarantees that a collision is impossible, while still generating into an output space that is as small as the input space.
Implementation
I suggest this C# version of a Feistel cipher with 32-bit blocks, 3 rounds and a round function that is inspired by pseudo-random generators.
private static double RoundFunction(uint input)
{
// Must be a function in the mathematical sense (x=y implies f(x)=f(y))
// but it doesn't have to be reversible.
// Must return a value between 0 and 1
return ((1369 * input + 150889) % 714025) / 714025.0;
}
private static uint PermuteId(uint id)
{
uint l1=(id>>16)&65535;
uint r1=id&65535;
uint l2, r2;
for (int i = 0; i < 3; i++)
{
l2 = r1;
r2 = l1 ^ (uint)(RoundFunction(r1) * 65535);
l1 = l2;
r1 = r2;
}
return ((r1 << 16) + l1);
}
To express the permuted ID in a base62 string:
private static string GenerateCode(uint id)
{
return ToBase62(PermuteId(id));
}
The Base62 function is the same as in the previous answer except that it takes uint instead of int (otherwise these functions would have to be rewritten to deal with negative values).
Customizing the algorithm
RoundFunction is the secret sauce of the algorithm. You may change it to a non-public version, possibly including a secret key. The Feistel network has two very nice properties:
even if the supplied RoundFunction is not reversible, the algorithm guarantees that PermuteId() will be a permutation in the mathematical sense (which implies zero collisions).
changing the expression inside the round function even slightly will drastically change the list of final output values.
Beware that putting something too trivial in the round expression would ruin the pseudo-random effect, although it would still work in terms of uniqueness of each PermuteId output. Also, an expression that wouldn't be a function in the mathematical sense would be incompatible with the algorithm, so for instance anything involving random() is not allowed.
Reversibility
In its current form, the PermuteId function is its own inverse, which means that:
PermuteId(PermuteId(id))==id
So given a short string produced by the program, if you convert it back to uint with a FromBase62 function, and give that as input to PermuteId(), that will return the corresponding initial ID. That's pretty cool if you don't have a database to store the [internal-ID / shortstring] relationships: they don't actually need to be stored!
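The self-inverse property can be checked directly. The sketch below repeats RoundFunction and PermuteId from above inside a hypothetical FeistelRoundTrip class and applies the permutation twice. The base62 step is left out to keep it short.

```csharp
using System;

static class FeistelRoundTrip
{
    static double RoundFunction(uint input)
    {
        return ((1369 * input + 150889) % 714025) / 714025.0;
    }

    public static uint PermuteId(uint id)
    {
        uint l1 = (id >> 16) & 65535;
        uint r1 = id & 65535;
        for (int i = 0; i < 3; i++)
        {
            uint l2 = r1;
            uint r2 = l1 ^ (uint)(RoundFunction(r1) * 65535);
            l1 = l2;
            r1 = r2;
        }
        // The halves are swapped on output, which is what makes the
        // function undo itself when applied a second time.
        return (r1 << 16) + l1;
    }

    static void Main()
    {
        foreach (uint id in new uint[] { 0, 1, 2, 12345, uint.MaxValue })
        {
            uint code = PermuteId(id);
            Console.WriteLine("{0} -> {1} -> {2}", id, code, PermuteId(code));
        }
    }
}
```

The third column should always repeat the first one.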
Producing even shorter strings
The range of the above function is 32 bits, that is about 4 billion values from 0 to 2^32-1. To express that range in base62, 6 characters are needed.
With only 5 characters, we could hope to represent at most 62^5 values, which is a bit under 1 billion. Should the output string be limited to 5 characters, the code should be tweaked as follows:
find N such that N is even and 2^N is as high as possible but lower than 62^5. That's 28, so our real output range that fits in 62^5 is going to be 2^28 or about 268 million values.
in PermuteId, use 28/2=14 bits values for l1 and r1 instead of 16 bits, while being careful to not ignore a single bit of the input (which must be less than 2^28).
multiply the result of RoundFunction by 16383 instead of 65535, to stay within the 14 bits range.
at the end of PermuteId, recombine r1 and l1 to form a 14+14=28 bits value instead of 32.
The same method could be applied for 4 characters, with an output range of 2^22, or about 4 million values.
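The steps above can be sketched as follows for the 5-character case. This is one possible reading of the tweak, not code from the original answer: Permute28, PermuteId28 and HalfMask are hypothetical names, and the range check on the input is an added assumption.

```csharp
using System;

static class Permute28
{
    const uint HalfMask = (1u << 14) - 1; // 16383, i.e. 14 bits set

    static double RoundFunction(uint input)
    {
        return ((1369 * input + 150889) % 714025) / 714025.0;
    }

    // Permutes the range [0, 2^28) onto itself using 14-bit halves.
    public static uint PermuteId28(uint id)
    {
        if (id >= (1u << 28)) throw new ArgumentOutOfRangeException("id");
        uint l1 = (id >> 14) & HalfMask;
        uint r1 = id & HalfMask;
        for (int i = 0; i < 3; i++)
        {
            uint l2 = r1;
            // Scale by 16383 so the round output stays within 14 bits.
            uint r2 = l1 ^ (uint)(RoundFunction(r1) * HalfMask);
            l1 = l2;
            r1 = r2;
        }
        return (r1 << 14) + l1; // 14 + 14 = 28 bits
    }

    static void Main()
    {
        for (uint id = 1; id <= 5; id++)
        {
            Console.WriteLine("{0} -> {1}", id, PermuteId28(id));
        }
    }
}
```

Every output is below 2^28, which is itself below 62^5, so the base62 form fits in five characters.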
What does it look like
In the version above, the first 10 produced strings starting with id=1 are:
cZ6ahF
3t5mM
xGNPN
dxwUdS
ej9SyV
cmbVG3
cOlRkc
bfCPOX
JDr8Q
eg7iuA
If I make a trivial change in the round function, that becomes:
ey0LlY
ddy0ak
dDw3wm
bVuNbg
bKGX22
c0s5GZ
dfNMSp
ZySqE
cxKH4b
dNqMDA
You can think of the five-letter code as a number in base-62 notation: your "digits" are 26 lowercase and 26 uppercase letters, and digits from 0 to 9. (26+26+10) digits in total. Given a number from 0 to 62^5 (which equals 916132832) (say, your primary key) you can do the conversion to a five-digit base-62 as follows:
private static char Base62Digit(int d) {
if (d < 26) {
return (char)('a'+d);
} else if (d < 52) {
return (char)('A'+d-26);
} else if (d < 62) {
return (char)('0'+d-52);
} else {
throw new ArgumentException("d");
}
}
static string ToBase62(int n) {
var res = "";
while (n != 0) {
res = Base62Digit(n%62) + res;
n /= 62;
}
return res;
}
private static int Base62Decode(char c) {
if (c >= '0' && c <= '9') {
return 52 + c - '0';
} else if (c >= 'A' && c <= 'Z') {
return 26 + c - 'A';
} else if (c >= 'a' && c <= 'z') {
return c - 'a';
} else {
throw new ArgumentException("c");
}
}
static int FromBase62(string s) {
return s.Aggregate(0, (current, c) => current*62 + Base62Decode(c));
}
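A round trip is a convenient sanity check for the encoder and decoder. The sketch below is a compact re-implementation that uses a lookup string instead of the if/else chains, with the same digit order (a-z, A-Z, 0-9). The name Base62RoundTrip and the choice to encode zero as "a" instead of the empty string are assumptions of this sketch.

```csharp
using System;

static class Base62RoundTrip
{
    // Same digit order as above: a-z, then A-Z, then 0-9.
    const string Alphabet =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    public static string ToBase62(int n)
    {
        if (n < 0) throw new ArgumentOutOfRangeException("n");
        // Assumption: zero becomes "a" here; the loop alone would return "".
        if (n == 0) return Alphabet[0].ToString();
        var res = "";
        while (n != 0)
        {
            res = Alphabet[n % 62] + res;
            n /= 62;
        }
        return res;
    }

    public static int FromBase62(string s)
    {
        int value = 0;
        foreach (char c in s)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0) throw new ArgumentException("s");
            value = value * 62 + digit;
        }
        return value;
    }

    static void Main()
    {
        foreach (int n in new[] { 0, 61, 62, 1234, 916132831 })
        {
            string code = ToBase62(n);
            Console.WriteLine("{0} -> {1} -> {2}", n, code, FromBase62(code));
        }
    }
}
```

Note that small numbers produce codes shorter than five characters, because leading "a" digits (the zero digit) are not emitted.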
Here is how to generate cryptographically strong random numbers (RNGCryptoServiceProvider lives in the System.Security.Cryptography namespace, so you need a using directive for it):
private static readonly RNGCryptoServiceProvider crypto =
new RNGCryptoServiceProvider();
private static int NextRandom() {
var buf = new byte[4];
crypto.GetBytes(buf);
return buf.Aggregate(0, (p, v) => (p << 8) + v) & 0x3FFFFFFF;
}
This is what I ended up doing
(Updated since Daniel Vérité's answer):
class Program
{
private static double RoundFunction(uint input)
{
// Must be a function in the mathematical sense (x=y implies f(x)=f(y))
// but it doesn't have to be reversible.
// Must return a value between 0 and 1
return ((1369 * input + 150889) % 714025) / 714025.0;
}
private static char Base62Digit(uint d)
{
if (d < 26)
{
return (char)('a' + d);
}
else if (d < 52)
{
return (char)('A' + d - 26);
}
else if (d < 62)
{
return (char)('0' + d - 52);
}
else
{
throw new ArgumentException("d");
}
}
private static string ToBase62(uint n)
{
var res = "";
while (n != 0)
{
res = Base62Digit(n % 62) + res;
n /= 62;
}
return res;
}
private static uint PermuteId(uint id)
{
uint l1 = (id >> 16) & 65535;
uint r1 = id & 65535;
uint l2, r2;
for (int i = 0; i < 3; i++)
{
l2 = r1;
r2 = l1 ^ (uint)(RoundFunction(r1) * 65535);
l1 = l2;
r1 = r2;
}
return ((r1 << 16) + l1);
}
private static string GenerateCode(uint id)
{
return ToBase62(PermuteId(id));
}
static void Main(string[] args)
{
Console.WriteLine("testing...");
try
{
for (uint x = 1; x < 1000000; x += 1)
{
Console.Write(GenerateCode(x) + ",");
}
}
catch (Exception err)
{
Console.WriteLine("error: " + err.Message);
}
Console.WriteLine("");
Console.WriteLine("Press 'Enter' to continue...");
Console.Read();
}
}
I'm having a problem taking the modulo of a number that has 31 digits. It fails on
Int64 convertedNumber = Int64.Parse(mergedNumber); with "Value was either too large or too small for an Int64." (OverflowException). How can I fix it so that the modulo works?
class GeneratorRachunkow {
private static string numerRozliczeniowyBanku = "11111155"; // 8 chars
private static string identyfikatorNumeruRachunku = "7244"; // 4 chars
private static string stalaBanku = "562100"; // 6 chars
public static string generator(string pesel, string varKlientID) {
string peselSubstring = pesel.Substring(pesel.Length - 5); // 5 chars (from the end of the string);
string toAttach = varKlientID + peselSubstring;
string indywidualnyNumerRachunku = string.Format("{0}", toAttach.ToString().PadLeft(13, '0')); // merging pesel with klient id and adding 0 to the begining to match 13 chars
string mergedNumber = numerRozliczeniowyBanku + identyfikatorNumeruRachunku + indywidualnyNumerRachunku + stalaBanku; // merging everything -> 31 chars
Int64 convertedNumber = Int64.Parse(mergedNumber);
Int64 modulo = MathMod(convertedNumber, 97);
Int64 wynik = 98 - modulo;
string wynikString = string.Format("{0}", wynik.ToString().PadLeft(2, '0')); // must be 2 chars
indywidualnyNumerRachunku = wynikString + numerRozliczeniowyBanku + identyfikatorNumeruRachunku + indywidualnyNumerRachunku;
return indywidualnyNumerRachunku;
}
private static Int64 MathMod(Int64 a, Int64 b) {
return (Math.Abs(a * b) + a) % b;
}
}
The max value for Int64 is 9223372036854775807 (19 characters when printed). You will probably want to use BigInteger instead (which was introduced in .NET 4):
public static string generator(string pesel, string varKlientID) {
// I have cut some code here to keep it short
BigInteger convertedNumber;
if (BigInteger.TryParse(mergedNumber , out convertedNumber))
{
BigInteger modulo = convertedNumber % 97;
// The rest of the method goes here...
}
else
{
// string could not be parsed to BigInteger; handle gracefully
}
}
private static BigInteger MathMod(BigInteger a, BigInteger b)
{
return (BigInteger.Abs(a * b) + a) % b;
}
Int64.MaxValue is 9,223,372,036,854,775,807 that's 19 characters. So you just can't fit that in. I suggest looking at this question for working with big numbers.
Try this function instead of "MathMod":
static int ModString(string x, int y)
{
if (x.Length == 0)
return 0;
string x2 = x.Substring(0,x.Length - 1); // first digits
int x3 = int.Parse(x.Substring(x.Length - 1)); // last digit
return (ModString(x2, y) * 10 + x3) % y;
}
(since all of your numbers are positive, there is no point in using Math.Abs, as in your original MathMod function).
Use it this way:
modulo = ModString(mergedNumber,97);
This should work with all versions of .NET since 1.1, without the need for BigInteger.
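The same idea can also be written as a loop that walks the digits from left to right and keeps only the running remainder, which avoids the substring allocations of the recursive version. This is a sketch with hypothetical names (BigMod, ModDigits). It assumes the input contains only decimal digits, and the BigInteger cross-check needs .NET 4 or later.

```csharp
using System;
using System.Numerics;

static class BigMod
{
    // Computes (number written in digits) % modulus without ever holding
    // the whole number. The modulus must be small enough that
    // remainder * 10 + 9 fits in an int.
    public static int ModDigits(string digits, int modulus)
    {
        int remainder = 0;
        foreach (char c in digits)
        {
            if (c < '0' || c > '9') throw new FormatException("digits");
            remainder = (remainder * 10 + (c - '0')) % modulus;
        }
        return remainder;
    }

    static void Main()
    {
        // Sample 31-digit value built the same way as mergedNumber in the question;
        // the 13-digit middle part is made up.
        string merged = "11111155" + "7244" + "0000012345678" + "562100";
        Console.WriteLine(ModDigits(merged, 97));
        Console.WriteLine(BigInteger.Parse(merged) % 97); // should print the same value
    }
}
```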
The answer you are looking for is demonstrated here. It includes various manners to calculate the modulus for huge numbers. I used similar methods as described here for international bank account numbers.
A direct link to someone who has a copy pastable method is here.
I'm sure there must be a much better way of doing this. I'm trying to do a count operation on a Flags enum. Before, I was iterating over all the possible values and counting the successful AND operations.
e.g.
[Flags]
public enum Skills
{
None = 0,
Skill1 = 1,
Skill2 = 2,
Skill3 = 4,
Skill4 = 8,
Skill5 = 16,
Skill6 = 32,
Skill7 = 64,
Skill8 = 128
}
public static int Count(Skills skillsToCount)
{
    int count = 0; // was missing in the original snippet
    Skills skill;
    for (int i = 0; i < SkillSet.AllSkills.Count; i++)
    {
        skill = SkillSet.AllSkills[i];
        if ((skillsToCount & skill) == skill && skill != Skills.None)
            count++;
    }
    return count;
}
I'm sure there must be a better way of doing this though, but must be suffering from a mental block. Can anyone advise a nicer solution?
The following code will give you the number of bits that are set for a given number of any type varying in size from byte up to long.
public static int GetSetBitCount(long lValue)
{
int iCount = 0;
//Loop the value while there are still bits
while (lValue != 0)
{
//Remove the end bit
lValue = lValue & (lValue - 1);
//Increment the count
iCount++;
}
//Return the count
return iCount;
}
This code is very efficient as it only iterates once for each set bit rather than once for every possible bit as in the other examples.
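Applied to the enum from the question, the technique looks like this. The sketch repeats the enum and the loop so it compiles on its own. BitCountDemo is a hypothetical name.

```csharp
using System;

[Flags]
enum Skills
{
    None = 0, Skill1 = 1, Skill2 = 2, Skill3 = 4, Skill4 = 8,
    Skill5 = 16, Skill6 = 32, Skill7 = 64, Skill8 = 128
}

static class BitCountDemo
{
    public static int GetSetBitCount(long value)
    {
        int count = 0;
        while (value != 0)
        {
            // value & (value - 1) clears the lowest set bit,
            // e.g. 0b10101000 & 0b10100111 = 0b10100000.
            value &= value - 1;
            count++;
        }
        return count;
    }

    static void Main()
    {
        Skills mine = Skills.Skill4 | Skills.Skill6 | Skills.Skill8;
        Console.WriteLine(GetSetBitCount((long)mine)); // 3
    }
}
```

For this value the loop body runs three times, once per flag that is set.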
After looking on the site Assaf suggested I managed to find a slightly different solution that I got working for Int32's.
Here's the code for anyone else:
internal static UInt32 Count(this Skills skills)
{
UInt32 v = (UInt32)skills;
v = v - ((v >> 1) & 0x55555555); // reuse input as temporary
v = (v & 0x33333333) + ((v >> 2) & 0x33333333); // temp
UInt32 c = ((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24; // count
return c;
}
A very concise way to do it using BitArray and LINQ:
public static int Count(Skills skillsToCount)
{
return new BitArray(new[] {(int)skillsToCount}).OfType<bool>().Count(x => x);
}
If you're targeting .NET Core 3.0 or above, you can use BitOperations.PopCount(). It operates on uint or ulong and returns the number of 1 bits.
If your CPU supports SSE4, it'll use the POPCNT CPU instruction, otherwise it'll use a software fallback.
public static int Count(Skills skillsToCount)
{
return BitOperations.PopCount((ulong)skillsToCount);
}
The count is equivalent to counting how many bits are set to 1 in the integer value of the enum.
There are very fast ways of doing this in C/C++, which you can adapt to C#:
e.g.
int bitcount(unsigned int n) {
/* works for 32-bit numbers only */
/* fix last line for 64-bit numbers */
register unsigned int tmp;
tmp = n - ((n >> 1) & 033333333333)
- ((n >> 2) & 011111111111);
return ((tmp + (tmp >> 3)) & 030707070707) % 63;
}
Taken from here.
EDIT
Provided link is dead. Found another one that probably contains the same content.
There's a straight-forward way using functional programming (LINQ):
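var skillCount = Enum
.GetValues(typeof(Skills))
.Cast<Skills>()
// Skip None: HasFlag(0) is true for every value and would inflate the count by one.
.Count(s => s != Skills.None && skills.HasFlag(s));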
It might be a bit slower than the bit-juggling solutions but it has a constant run-time and is more intuitive.
While GetValues still allocates, there is a good chance that the compiler optimizes this away.
<FlagsAttribute()> _
Public Enum Skills As Byte
None = 0
Skill1 = 1
Skill2 = 2
Skill3 = 4
Skill4 = 8
Skill5 = 16
Skill6 = 32
Skill7 = 64
Skill8 = 128
End Enum
Dim x As Byte = Skills.Skill4 Or Skills.Skill8 Or Skills.Skill6
Dim count As Integer
If x = Skills.None Then count = 0 Else _
count = CType(x, Skills).ToString().Split(New Char() {","c}, StringSplitOptions.RemoveEmptyEntries).Count
depends on the definition of "better".
the check for Skills.None is required because if no bits are on, ToString() returns "None", which results in a count of 1. this would work the same for integer, long, and their unsigned relatives.
the only reason to use this method is if the flags are not contiguous and if flags will be added periodically.
<FlagsAttribute()> _
Public Enum Skills As Integer
Skill1 = CInt(2 ^ 0) 'bit 0
Skill2 = CInt(2 ^ 1)
Skill3 = CInt(2 ^ 2)
Skill4 = CInt(2 ^ 3)
Skill5 = CInt(2 ^ 4)
Skill6 = CInt(2 ^ 5)
Skill7 = CInt(2 ^ 6)
Skill8 = CInt(2 ^ 7)
Skillx = CInt(2 ^ 10) 'bit 10, some bits were skipped
End Enum
Dim mySkills As Integer = Skills.Skillx Or Skills.Skill4 Or Skills.Skill8 Or Skills.Skill6
Dim count As Integer 'count of bits on
count = CType(mySkills, Skills).ToString().Split(New Char() {","c}, _
StringSplitOptions.RemoveEmptyEntries).Count
if "better" means faster this ain't ;) it.
int count = Enum.GetValues(typeof(Skills)).Length;