How do I create a unique identifier that does not contain numbers? - c#

I'm looking to use some sort of unique identifier within a .resx file but it does not allow the key to begin with a number. Rather than cycling through GUIDs until I get one that starts with a letter, I'm wondering if there's an alternative UID type that either does not contain numbers or would otherwise meet this requirement.
Any thoughts?

If you just want to create a Guid that starts with a letter, you could do something like this:
var b = Guid.NewGuid().ToByteArray();
b[3] |= 0xF0;
return new Guid(b);
This will always generate a GUID that starts with the hex digit F.
To create a Guid that doesn't contain any numbers you could use something like this:
return new Guid(Guid.NewGuid().ToByteArray()
    .Select(b => (byte)(((b % 16) < 10 ? 0xA : b) |
                        (((b >> 4) < 10 ? 0xA : (b >> 4)) << 4)))
    .ToArray());
This tests each hex digit (two per byte) and forces any digit below A up into the A-F range, so every digit in the result is a letter.
Both the above solutions generate real Guid objects, although the added restrictions do decrease the uniqueness of the resulting GUIDs to some degree (far more so in the second example). If you don't care about the output being actual GUIDs, you can simply remap the hex digits to something else and return the result as a string, as others have suggested. FWIW, here's the shortest solution I can think of:
return String.Concat(Guid.NewGuid().ToString("N").Select(c => (char)(c + 17)));
This maps the hex digits 0 through 9 to the characters A through J, and the hex digits a through f (ToString("N") produces lowercase hex) to the characters r through w. It also generates a string without any hyphens. For example:
Before: e58d0f329a2f4615b922ecf53dcd090a
After: vFIuAwDCJrCwEGBFsJCCvtwFDutuAJAr
Of course, you could convert this to all upper or lower case if you don't like the mixed case here.

How about generating a unique number and then prefixing it with a letter? So instead of
1234
You would use
a1234
As long as the algorithm you choose for the identifier guarantees a unique number, this should work just fine. It will also give you the ability to strip out the prefix and work with the identifier as a number again if need be.
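A minimal sketch of that idea (the helper names and the letter 'a' prefix are just for illustration):
// Hypothetical helpers: prefix a unique number with a letter so it is a
// valid .resx key, and strip the prefix to work with the number again.
public static class PrefixedId
{
    public static string FromNumber(long uniqueNumber) => "a" + uniqueNumber;

    public static long ToNumber(string id) => long.Parse(id.Substring(1));
}
For example, PrefixedId.FromNumber(1234) gives "a1234", and PrefixedId.ToNumber("a1234") gives back 1234.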

You can write and use a pseudorandom sequence generator. Here's one that gives the basic idea:
class RandomLetterSequence {
    private static Random r;
    private static readonly char MinChar = (char)0x0061; // 'a'
    private static readonly char MaxChar = (char)0x007A; // 'z'

    public static string RandomSequence() {
        return RandomSequence(32);
    }

    public static string RandomSequence(int length) {
        if (r == null)
            r = new Random();
        var sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            // Next's upper bound is exclusive, so add 1 to include 'z'.
            sb.Append((char)r.Next(MinChar, MaxChar + 1));
        }
        return sb.ToString();
    }
}
With this implementation, there are 26^32 possible sequences of the default length (32 characters), which conforms to your requirements:
Similar to GUIDs in terms of collision rate (infinitesimally small)
Contains only letters
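Usage might look like this (assuming the class above):
string key = RandomLetterSequence.RandomSequence(16); // 16 random lowercase letters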

Assuming that you don't need it to be a valid Guid (you refer to 'some sort of unique identifier'), just create a string-based guid (using Guid.NewGuid().ToString()) then map the first digit to a range of suitable letters, e.g. 0=G, 1=H, 2=I etc.

Just write your own GUID-like generator; a valid character would be a-z (you can also use A-Z to increase the number of possibilities).

Generate the new GUID and just replace the characters 0-9 with characters g-p.
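A minimal sketch of that replacement (mapping 0 to g through 9 to p, as suggested; needs System and System.Linq):
// Map each decimal digit of the GUID's hex string to a letter in g..p,
// leaving the hex letters a-f as they are.
public static string LettersOnlyGuid()
{
    return string.Concat(Guid.NewGuid().ToString("N")
        .Select(c => char.IsDigit(c) ? (char)('g' + (c - '0')) : c));
}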

@p.s.w.g provided a good solution.
You can write his/her recommendations as extension methods:
using System;
using System.Linq;

namespace YourApp.Extensions.GuidExtensions
{
    public static class Extension
    {
        public static Guid FirstLetter(this Guid obj)
        {
            var b = obj.ToByteArray();
            b[3] |= 0xF0;
            return new Guid(b);
        }

        public static Guid OnlyLetters(this Guid obj)
        {
            var ba = obj.ToByteArray();
            return new Guid(
                ba.Select(b => (byte)(((b % 16) < 10 ? 0xA : b) |
                                      (((b >> 4) < 10 ? 0xA : (b >> 4)) << 4)))
                  .ToArray()
            );
        }
    }
}
And then use it somewhere in your app:
// ...
using YourApp.Extensions.GuidExtensions;
// ...
class SomeClass {
    Guid SomeMethodWithFirstLetter() {
        return Guid.NewGuid().FirstLetter();
    }

    Guid SomeMethodWithOnlyLetters() {
        return Guid.NewGuid().OnlyLetters();
    }
}

Related

Generate a bruteforce string given an offset?

Some years back when I was still a beginner at programming, I found some code online that could generate a bruteforce-code given an offset.
So for instance, if I did GetPassword(1) it would return "a", and if I did GetPassword(2) it would return "b" etc.
Every increment of the offset would provide the next possible combination of strings. A minimum and maximum length of the "password to guess" could also be provided.
Now I have no idea where this code is, or what the algorithm is called. I want to implement one myself, since I need it for URL-shortening purposes. A user generates a URL that I want to look somewhat along these lines: http://fablelane.com/i/abc where "abc" is the code.
You can think of the output from GetPassword as a number in a different base. For example, if GetPassword can output upper and lower case alphanumerics then it is in base 62: 26 letters + 26 letters + 10 digits.
GetPassword must convert from base 10 to base 62 in this case. You can use a lookup array to find the output characters.
You can convert from one base to another by using an algorithm such as this:
Another stackoverflow post
This is base 26 encoding and decoding:
public static string Encode(int number)
{
    number = Math.Abs(number);
    StringBuilder converted = new StringBuilder();
    // Repeatedly divide the number by 26 and convert the
    // remainder into the appropriate letter.
    do
    {
        int remainder = number % 26;
        converted.Insert(0, (char)(remainder + 'a'));
        number = (number - remainder) / 26;
    } while (number > 0);
    return converted.ToString();
}

public static int Decode(string number)
{
    if (number == null) throw new ArgumentNullException("number");
    int s = 0;
    for (int i = 0; i < number.Length; i++)
    {
        s += (number[i] - 'a');
        s = i == number.Length - 1 ? s : s * 26;
    }
    return s;
}
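If you want the full base-62 alphabet mentioned above rather than base 26, here's a sketch of the same approach with a lookup string (the alphabet order and method name are arbitrary; needs System.Text):
const string Alphabet =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

// Encode a non-negative offset in base 62; larger offsets give longer codes.
public static string ToBase62(long number)
{
    if (number == 0) return Alphabet[0].ToString();
    var sb = new StringBuilder();
    while (number > 0)
    {
        sb.Insert(0, Alphabet[(int)(number % 62)]);
        number /= 62;
    }
    return sb.ToString();
}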

Calculating the number of bits in a Subnet Mask in C#

I have a task to complete in C#. I have a Subnet Mask: 255.255.128.0.
I need to find the number of bits in the Subnet Mask, which would be, in this case, 17.
However, I need to be able to do this in C# WITHOUT the use of the System.Net library (the system I am programming in does not have access to this library).
It seems like the process should be something like:
1) Split the Subnet Mask into Octets.
2) Convert the Octets to be binary.
3) Count the number of Ones in each Octet.
4) Output the total number of found Ones.
However, my C# is pretty poor. Does anyone have the C# knowledge to help?
Bit counting algorithm taken from:
http://www.necessaryandsufficient.net/2009/04/optimising-bit-counting-using-iterative-data-driven-development/
string mask = "255.255.128.0";
int totalBits = 0;
foreach (string octet in mask.Split('.'))
{
    byte octetByte = byte.Parse(octet);
    while (octetByte != 0)
    {
        totalBits += octetByte & 1; // logical AND on the LSB
        octetByte >>= 1;            // bitwise shift to the right to create a new LSB
    }
}
Console.WriteLine(totalBits);
The most simple algorithm from the article was used. If performance is critical, you might want to read the article and use a more optimized solution from it.
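For reference, one common faster variant (not necessarily the one the article recommends) is Brian Kernighan's trick of clearing the lowest set bit on each pass, so the loop runs once per set bit rather than once per bit:
// Count set bits by repeatedly clearing the lowest set bit.
static int CountSetBits(byte value)
{
    int v = value;
    int count = 0;
    while (v != 0)
    {
        v &= v - 1; // clears the lowest set bit
        count++;
    }
    return count;
}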
string ip = "255.255.128.0";
string a = "";
// Convert each octet to its binary-string form and concatenate.
ip.Split('.').ToList().ForEach(x => a += Convert.ToString(Convert.ToInt32(x), 2));
int ones_found = a.Replace("0", "").Length;
A complete sample:
public int CountBit(string mask)
{
    int ones = 0;
    Array.ForEach(mask.Split('.'),
        s => Array.ForEach(Convert.ToString(int.Parse(s), 2).Where(c => c == '1').ToArray(),
            k => ones++));
    return ones;
}
You can convert a number to binary like this:
string ip = "255.255.128.0";
string[] tokens = ip.Split('.');
string result = "";
foreach (string token in tokens)
{
    int tokenNum = int.Parse(token);
    // Pad on the left so every octet contributes exactly 8 bits.
    string octet = Convert.ToString(tokenNum, 2).PadLeft(8, '0');
    result += octet;
}
int mask = result.LastIndexOf('1') + 1;
The solution is to use a binary operation like
foreach (string octet in ipAddress.Split('.'))
{
    int oct = int.Parse(octet);
    while (oct != 0)
    {
        total += oct & 1; // {1}
        oct >>= 1;        // {2}
    }
}
The trick is that on line {1} the binary AND is in essence a multiplication, so multiplying 1x0=0 and 1x1=1. So if we have some hypothetical number
0000101001 and multiply it by 1 (so in the binary world we execute &), which is nothing other than 0000000001, we get
0000101001
0000000001
The rightmost digit is 1 in both numbers, so the binary AND returns 1; otherwise, if the lowest digit of either number is 0, the result is 0.
So here, on the line total += oct & 1, we add either 1 or 0 to total, based on that lowest digit.
On line {2}, we then just shift the lowest bit out to the right, effectively dividing the number by 2, until the number becomes 0.
Easy.
EDIT
This is valid for integer and byte types, but do not use this technique on floating point numbers. By the way, it's a pretty valuable solution for this question.

Reversing a hash function

I have the following hash function, and I'm trying to get my way to reverse it, so that I can find the key from a hashed value.
uint Hash(string s)
{
    uint result = 0;
    for (int i = 0; i < s.Length; i++)
    {
        result = ((result << 5) + result) + s[i];
    }
    return result;
}
The code is in C# but I assume it is clear.
I am aware that for one hashed value, there can be more than one key, but my intent is not to find them all, just one that satisfies the hash function suffices.
EDIT :
The string that the function accepts is formed only from digits 0 to 9 and the chars '*' and '#' hence the Unhash function must respect this criteria too.
Any ideas? Thank you.
This should reverse the operations:
string Unhash(uint hash)
{
    List<char> s = new List<char>();
    while (hash != 0)
    {
        s.Add((char)(hash % 33));
        hash /= 33;
    }
    s.Reverse();
    return new string(s.ToArray());
}
This should return a string that gives the same hash as the original string, but it is very unlikely to be the exact same string.
Characters 0-9,*,# have ASCII values 48-57,42,35, or binary: 00110000 ... 00111001, 00101010, 00100011
First 5 bits of those values are different, and 6th bit is always 1. This means that you can deduce your last character in a loop by taking current hash:
uint lastChar = hash & 0x1F - ((hash >> 5) - 1) & 0x1F + 0x20;
(if this doesn't work, I don't know who wrote it)
Now roll back hash,
hash = (hash - lastChar) / 33;
and repeat the loop until hash becomes zero. I don't have C# on me, but I'm 70% confident that this should work with only minor changes.
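Here's a sketch of that roll-back loop in C#, swapping the bit trick above for a simpler divisibility test: since hash = previous*33 + c, the last character is the one of the 12 valid characters congruent to hash mod 33 (this assumes the value really came from Hash and never overflowed 32 bits; needs System.Collections.Generic and System.Linq):
// Rebuild a candidate string by peeling off one character per iteration.
static string Unhash(uint hash)
{
    const string valid = "0123456789*#";
    var chars = new List<char>();
    while (hash != 0)
    {
        // Exactly one valid character is congruent to hash mod 33,
        // because '0'..'9', '*' and '#' all differ mod 33.
        char last = valid.First(c => (uint)c % 33 == hash % 33);
        chars.Add(last);
        hash = (hash - last) / 33u; // roll the hash back one step
    }
    chars.Reverse();
    return new string(chars.ToArray());
}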
Brute force should work if uint is 32 bits. Try at least 2^32 strings and one of them is likely to hash to the same value. Should only take a few minutes on a modern pc.
You have 12 possible characters, and 12^9 is about 2^32, so if you try 9 character strings you're likely to find your target hash. I'll do 10 character strings just to be safe.
(simple recursive implementation in C++, don't know C# that well)
#define NUM_VALID_CHARS 12
#define STRING_LENGTH 10

const char valid_chars[NUM_VALID_CHARS] = {'0', ..., '#', '*'};

void unhash(uint hash_value, char *string, int nchars) {
    if (nchars == STRING_LENGTH) {
        string[STRING_LENGTH] = 0;
        if (Hash(string) == hash_value) { printf("%s\n", string); }
    } else {
        for (int i = 0; i < NUM_VALID_CHARS; i++) {
            string[nchars] = valid_chars[i];
            unhash(hash_value, string, nchars + 1);
        }
    }
}
Then call it with:
char string[STRING_LENGTH + 1];
unhash(hash_value, string, 0);
Hash functions are designed to be difficult or impossible to reverse, hence the name (visualize meat + potatoes being ground up)
I would start out by writing each step that result = ((result << 5) + result) + s[i]; does on a separate line. This will make solving a lot easier. Then all you have to do is the opposite of each line (in the opposite order too).

Are there any working implementations of the rolling hash function used in the Rabin-Karp string search algorithm?

I'm looking to use a rolling hash function so I can take hashes of n-grams of a very large string.
For example:
"stackoverflow", broken up into 5 grams would be:
"stack", "tacko", "ackov", "ckove",
"kover", "overf", "verfl", "erflo", "rflow"
This is ideal for a rolling hash function because after I calculate the first n-gram hash, the following ones are relatively cheap to calculate because I simply have to drop the first letter of the first hash and add the new last letter of the second hash.
I know that in general this hash function is generated as:
H = c1*a^(k-1) + c2*a^(k-2) + c3*a^(k-3) + ... + ck*a^0, where a is a constant and c1,...,ck are the input characters.
If you follow this link on the Rabin-Karp string search algorithm, it states that "a" is usually some large prime.
I want my hashes to be stored in 32 bit integers, so how large of a prime should "a" be, such that I don't overflow my integer?
Does there exist an existing implementation of this hash function somewhere that I could already use?
Here is an implementation I created:
public class hash2
{
    public int prime = 101;

    public int hash(String text)
    {
        int hash = 0;
        for (int i = 0; i < text.length(); i++)
        {
            char c = text.charAt(i);
            hash += c * (int) (Math.pow(prime, text.length() - 1 - i));
        }
        return hash;
    }

    public int rollHash(int previousHash, String previousText, String currentText)
    {
        char firstChar = previousText.charAt(0);
        char lastChar = currentText.charAt(currentText.length() - 1);
        int firstCharHash = firstChar * (int) (Math.pow(prime, previousText.length() - 1));
        int hash = (previousHash - firstCharHash) * prime + lastChar;
        return hash;
    }

    public static void main(String[] args)
    {
        hash2 hashify = new hash2();
        int firstHash = hashify.hash("mydog");
        System.out.println(firstHash);
        System.out.println(hashify.hash("ydogr"));
        System.out.println(hashify.rollHash(firstHash, "mydog", "ydogr"));
    }
}
I'm using 101 as my prime. Does it matter if my hashes will overflow? I think this is desirable but I'm not sure.
Does this seem like the right way to go about this?
I remember a slightly different implementation which seems to be from one of Sedgewick's algorithms books (it also contains example code; try to look it up). Here's a summary adjusted to 32-bit integers:
You use modulo arithmetic to prevent your integer from overflowing after each operation.
Initially set:
c = text ("stackoverflow")
M = length of the "n-grams"
d = size of your alphabet (256)
q = a large prime so that (d+1)*q doesn't overflow (8355967 might be a good choice)
dM = d^(M-1) mod q
First calculate the hash value of the first n-gram:
h = 0
for i from 1 to M:
    h = (h*d + c[i]) mod q
and for every following n-gram:
for i from 1 to length(c)-M:
    // first subtract the oldest character
    h = (h + d*q - c[i]*dM) mod q
    // then add the next character
    h = (h*d + c[i+M]) mod q
The reason why you have to add d*q before subtracting the oldest character is that you might otherwise run into negative values due to the small values produced by the previous modulo operation.
Errors included, but I think you should get the idea. Try to find one of Sedgewick's algorithms books for the details, fewer errors and a better description. :)
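A rough C# translation of that pseudocode, offered only as a sketch (the variable names follow the summary above, the printing is mine, and it assumes byte-range characters):
// Rolling hash over all M-grams of text, printing each hash (modulo q).
static void RollingHashes(string text, int M)
{
    const long q = 8355967; // large prime so intermediate values fit comfortably in a long
    const long d = 256;     // alphabet size

    // dM = d^(M-1) mod q
    long dM = 1;
    for (int i = 1; i < M; i++)
        dM = (dM * d) % q;

    // hash of the first M-gram
    long h = 0;
    for (int i = 0; i < M; i++)
        h = (h * d + text[i]) % q;
    Console.WriteLine(h);

    // roll: drop text[i], add text[i + M]
    for (int i = 0; i + M < text.Length; i++)
    {
        h = (h + d * q - text[i] * dM) % q;
        h = (h * d + text[i + M]) % q;
        Console.WriteLine(h);
    }
}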
As I understand it, the task is to find the largest A that still satisfies:
maxchar * (A^(k-1) + A^(k-2) + ... + A + 1) < 2^31
where maxchar = 62 (for A-Za-z0-9) and k is the n-gram length (5 in your example). I've just calculated it in Excel (OO Calc, exactly) :) and the max A it found is 76, or 73 for a prime number.
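A quick check of that bound for 5-grams (this is my reading of the formula above; maxchar = 62 is assumed):
// Find the largest base A for which a 5-gram of worst-case characters
// still hashes below 2^31.
const int maxChar = 62, k = 5;
long limit = 1L << 31;
int bestA = 0;
for (int A = 2; ; A++)
{
    long sum = 0, power = 1;
    for (int i = 0; i < k; i++) { sum += power; power *= A; }
    if (maxChar * sum >= limit) break;
    bestA = A;
}
Console.WriteLine(bestA); // 76 (largest prime below it: 73)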
Not sure what your aim is here, but if you are trying to improve performance, using Math.pow will cost you far more than you save by calculating a rolling hash value.
I suggest you start by keeping it simple and efficient; you are very likely to find it is fast enough.

Convert char to int in C#

I have a char in c#:
char foo = '2';
Now I want to get the 2 into an int. I find that Convert.ToInt32 returns the actual decimal value of the char and not the number 2. The following will work:
int bar = Convert.ToInt32(new string(foo, 1));
int.Parse only works on strings as well.
Is there no native function in C# to go from a char to int without making it a string? I know this is trivial but it just seems odd that there's nothing native to directly make the conversion.
This will convert it to an int:
char foo = '2';
int bar = foo - '0';
This works because each character is internally represented by a number. The characters '0' to '9' are represented by consecutive numbers, so finding the difference between the characters '0' and '2' results in the number 2.
Interesting answers, but the docs say differently:
Use the GetNumericValue methods to convert a Char object that represents a number to a numeric value type. Use Parse and TryParse to convert a character in a string into a Char object. Use ToString to convert a Char object to a String object.
http://msdn.microsoft.com/en-us/library/system.char.aspx
Has anyone considered using int.Parse() and int.TryParse() like this
int bar = int.Parse(foo.ToString());
Even better, like this:
int bar;
if (!int.TryParse(foo.ToString(), out bar))
{
    // Do something to correct the problem
}
It's a lot safer and less error prone
char c = '1';
int i = (int)(c - '0');
and you can create a static method out of it:
static int ToInt(this char c)
{
    return (int)(c - '0');
}
Try This
char x = '9'; // '9' = ASCII 57
int b = x - '0'; //That is '9' - '0' = 57 - 48 = 9
By default you use Unicode, so I suggest using faulty's method:
int bar = int.Parse(foo.ToString());
Even though the numeric values underneath are the same for digits and basic Latin chars.
This converts to an integer and handles unicode
CharUnicodeInfo.GetDecimalDigitValue('2')
You can read more here.
The real way is:
int theNameOfYourInt = (int)char.GetNumericValue(theNameOfYourChar);
"theNameOfYourInt" - the int you want your char to be transformed to.
"theNameOfYourChar" - the char you want to be transformed into an int.
Leave everything else be.
Principle:
char foo = '2';
int bar = foo & 15;
The binary of the ASCII characters 0-9 is:
0   -   0011 0000
1   -   0011 0001
2   -   0011 0010
3   -   0011 0011
4   -   0011 0100
5   -   0011 0101
6   -   0011 0110
7   -   0011 0111
8   -   0011 1000
9   -   0011 1001
and if you take from each of them only the 4 LSBs (using a bitwise AND with 8'b00001111, which equals 15), you get the actual number (0000 = 0, 0001 = 1, 0010 = 2, ...)
Usage:
public static int CharToInt(char c)
{
    return 0b0000_1111 & (byte)c;
}
I agree with @Chad Grant.
It's also right that if you convert to a string, you can then use the value as a number, as said in the question:
int bar = Convert.ToInt32(new string(foo, 1)); // => gives bar = 2
I tried to create a simpler and more understandable example:
char v = '1';
int vv = (int)char.GetNumericValue(v);
char.GetNumericValue(v) returns a double, which is then converted to int with the cast.
More advanced usage, as an array:
int[] values = "41234".ToArray().Select(c => (int)char.GetNumericValue(c)).ToArray();
First convert the character to a string and then convert to integer.
var character = '1';
var integerValue = int.Parse(character.ToString());
I'm using Compact Framework 3.5, which doesn't have a "char.Parse" method.
I think it's not bad to use the Convert class. (See CLR via C#, Jeffrey Richter)
char letterA = Convert.ToChar(65);
Console.WriteLine(letterA);
letterA = 'あ';
ushort valueA = Convert.ToUInt16(letterA);
Console.WriteLine(valueA);
char japaneseA = Convert.ToChar(valueA);
Console.WriteLine(japaneseA);
Works with ASCII char or Unicode char
Comparison of some of the methods based on the result when the character is not an ASCII digit:
char c1 = (char)('0' - 1), c2 = (char)('9' + 1);
Debug.Print($"{c1 & 15}, {c2 & 15}"); // 15, 10
Debug.Print($"{c1 ^ '0'}, {c2 ^ '0'}"); // 31, 10
Debug.Print($"{c1 - '0'}, {c2 - '0'}"); // -1, 10
Debug.Print($"{(uint)c1 - '0'}, {(uint)c2 - '0'}"); // 4294967295, 10
Debug.Print($"{char.GetNumericValue(c1)}, {char.GetNumericValue(c2)}"); // -1, -1
I searched for the most optimized method and was very surprised that the best is the easiest (and the most popular answer):
public static int ToIntT(this char c) =>
    c is >= '0' and <= '9' ?
        c - '0' : -1;
Here's a list of the methods I tried:
c - '0' // current
switch // about 25% slower; performance is the same with or without the isnum check
0b0000_1111 & (byte) c // same speed
Uri.FromHex(c) // 2 times slower; about 20% slower if I use my isnum check (c is >= '0' and <= '9') instead of Uri.IsHexDigit(c)
(int)char.GetNumericValue(c) // about 20% slower. I expected it to be much slower.
Convert.ToInt32(new string(c, 1)) // 3-4 times slower
Note that the isnum check (the 2nd line in the first code block) takes ~30% of the performance, so you should take it out if you're sure that c is a digit. The testing error was ~5%.
Use this:
public static string NormalizeNumbers(this string text)
{
    if (string.IsNullOrWhiteSpace(text)) return text;

    string normalized = text;
    char[] allNumbers = text.Where(char.IsNumber).Distinct().ToArray();
    foreach (char ch in allNumbers)
    {
        char equalNumber = char.Parse(char.GetNumericValue(ch).ToString("N0"));
        normalized = normalized.Replace(ch, equalNumber);
    }
    return normalized;
}
One very quick and simple way to convert the chars 0-9 to integers:
C# treats a char value much like an integer.
char c = '7'; // ASCII code 55
int x = c - 48; // result = the integer 7
Use Uri.FromHex.
And to avoid exceptions, Uri.IsHexDigit.
char testChar = 'e';
int result = Uri.IsHexDigit(testChar)
    ? Uri.FromHex(testChar)
    : -1;
I prefer the switch method.
The performance is the same as c - '0' but I find the switch easier to read.
Benchmark:
Method     | Mean     | Error     | StdDev    | Allocated Memory/Op
CharMinus0 | 90.24 us | 7.1120 us | 0.3898 us | 39.18 KB
CharSwitch | 90.54 us | 0.9319 us | 0.0511 us | 39.18 KB
Code:
public static int CharSwitch(this char c, int defaultvalue = 0) {
    switch (c) {
        case '0': return 0;
        case '1': return 1;
        case '2': return 2;
        case '3': return 3;
        case '4': return 4;
        case '5': return 5;
        case '6': return 6;
        case '7': return 7;
        case '8': return 8;
        case '9': return 9;
        default: return defaultvalue;
    }
}

public static int CharMinus0(this char c, int defaultvalue = 0) {
    return c >= '0' && c <= '9' ? c - '0' : defaultvalue;
}
This worked for me:
int bar = int.Parse("" + foo);
I've seen many answers but they seem confusing to me. Can't we just simply use type casting?
For example:
int s;
char i = '2';
s = (int)i;
