I'm re-writing alibrary with a mandate to make it totally allocation free. The goal is to have 0 collections after the app's startup phase is done.
Previously, there were a lot of calls like this:
Int32 foo = Int32.Parse(ASCIIEncoding.ASCII.GetString(bytes, start, length));
Which I believe is allocating a string. I couldn't find a C# library function that would do the same thing automatically. I looked at the BitConverter class, but it looks like that is only if your Int32 is encoded with the actual bytes that represent it. Here, I have an array of bytes representing Ascii characters that represent an Int32.
Here's what I did
public static Int32 AsciiBytesToInt32(byte[] bytes, int start, int length)
{
Int32 Temp = 0;
Int32 Result = 0;
Int32 j = 1;
for (int i = start + length - 1; i >= start; i--)
{
Temp = ((Int32)bytes[i]) - 48;
if (Temp < 0 || Temp > 9)
{
throw new Exception("Bytes In AsciiBytesToInt32 Are Not An Int32");
}
Result += Temp * j;
j *= 10;
}
return Result;
}
Does anyone know of a C# library function that already does this in a more optimal way? Or an improvement to make the above run faster (its going to be called millions of times during the day probably). Thanks!
Millions of times per day shouldn't be a problem - I'd expect that to be able to run hundreds of thousands of times per second. Personally I'd rewrite the above to only declare "temp" within the loop (and get rid of the Pascal-cases local variable names - urgh) but it should be okay.
The code would be more immediately understandable as:
int digit = bytes[i] - '0';
which does the same as your
Temp = ((Int32)bytes[i]) - 48;
line, but in a simpler way (IMO). They should behave exactly the same way.
On a general note, trying to write C# without any allocations is pretty harsh, and fights against the way the language and framework are designed. Do you believe this is actually a reasonable requirement? Admittedly I've heard about it being the way some games are written in managed code... but it does seem a bit odd.
Of course, you're going to allocate an exception if the bytes are inappropriate...
EDIT: Note that your code doesn't allow for negative numbers. Is that okay?
Related
I got a big byte array (around 50kb) and i need to extract numeric values from it. Every three bytes are representing one value.
What i tried is to work with LINQs skip & take but it's really slow regarding the large size of the array.
This is my very slow routine:
List<int> ints = new List<int>();
for (int i = 0; i <= fullFile.Count(); i+=3)
{
ints.Add(BitConverter.ToInt16(fullFile.Skip(i).Take(i + 3).ToArray(), 0));
}
I think i got a wrong approach to this.
Your code
First of all, ToInt16 only uses two bytes. So your third byte will be discarded.
You can't use ToInt32 as it would include one extra byte.
Let's review this:
fullFile.Skip(i).Take(i + 3).ToArray()
..and take a careful look at Take(i + 3). It says that you want to copy a larger and larger buffer. For instance, when i is on index 32000 you copy 32003 bytes into your new buffer.
That's why the code is quite slow.
The code is also slow since you allocate a lot of byte buffers which will need to be garbage collected. 65535 extra buffers of growing size which would have to be garbage collected.
You could also have done like this:
List<int> ints = new List<int>();
var workBuffer = new byte[4];
for (int i = 0; i <= fullFile.Length; i += 3)
{
// Copy the three bytes into the beginning of the temp buffer
Buffer.BlockCopy(fullFile, i, workBuffer, 0, 3);
// Now we can use ToInt32 as the last byte always is zero
var value = BitConverter.ToInt32(workBuffer, 0);
ints.Add(value);
}
Quite easy to understand, but not the fastest code.
A better solution
So the most efficient way is to do the conversion by yourself (bit shifting).
Something like:
List<int> ints = new List<int>();
for (int i = 0; i <= fullFile.Length; i += 3)
{
// This code assume little endianess
var value = (fullFile[i + 2] << 16)
+ (fullFile[i + 1] << 8)
+ fullFile[i];
ints.Add(value);
}
This code do not allocate anything extra (except the ints), and should be quite fast.
You can read more about Shift operators in MSDN. And about endianess
So I've been trying to write a program in C# that returns all factors for a given number (I will implement user input later). The program looks as follows:
//Number to divide
long num = 600851475143;
//initializes list
List<long> list = new List<long>();
//Defines combined variable for later output
var combined = string.Join(", ", list);
for (int i = 1; i < num; i++)
{
if (num % i == 0)
{
list.Add(i);
Console.WriteLine(i);
}
}
However, after some time the program starts to also try to divide negative numbers, which after some time ends in a System.DivideByZeroException. It's not clear to me why it does this. It only starts to do this after the "num" variable contains a number with 11 digits or more. But since I need such a high number, a fix or similiar would be highly appreciated. I am still a beginner.
Thank you!
I strongly suspect the problem is integer overflow. num is a 64-bit integer, whereas i is a 32-bit integer. If num is more than int.MaxValue, then as you increment i it will end up overflowing back to negative values and then eventually 0... at which point num % i will throw.
The simplest option is just to change i to be a long instead:
for (long i = 1; i < num; i++)
It's unfortunate that there'd be no warning in your original code - i is promoted to long where it needs to be, because there's an implicit conversion from int to long. It's not obvious to me what would need to change for this to be spotted in the language itself. It would be simpler for a Roslyn analyzer to notice this sort of problem.
I need to convert a 8-bit number such as 00001110 to char. The problem is easy so I wrote the code and everything is working fine, but now I need to optimize for speed as much as possible.
In test class :
class Program
{
static void Main(string[] args)
{
Random r = new Random();
int[] testTab = new int[8];
Normal n = new Normal();
long time;
Stopwatch watch = new Stopwatch();
watch.Start();
for (int i = 0; i < 9000; i++)
{
for (int j = 0; j < 8; j++)
{
testTab[j] = r.Next(2);
}
n.SetTable(testTab);
n.Decode();
}
watch.Stop();
time = watch.ElapsedTicks;
Console.WriteLine(time);
time = watch.ElapsedMilliseconds;
Console.WriteLine(time);
Console.ReadKey();
}
}
and class with algorithm :
class Normal
{
private int[] _tab = new int[8];
public void SetTable(int[] tab)
{
_tab = tab;
}
public void Decode()
{
char a = ((char)( _tab[0]*1 + _tab[1]*2 + _tab[2]*4 + _tab[3]*8 + _tab[4]*16 + _tab[5]*32 +
_tab[6]*64 + _tab[7]*124));
}
}
In the output for 9000 times I get time 2ms it is not a long time ( for 9000 ) time, but I have good proc in my PC.
The final code will be running in smartphone so there is no powerful CPU. In my algorithm I use random data, in final version I will load data by Camera (so it will be longer ) and try to repeat this operation 10 times in one second so that is why I need best time in even smallest operations.
Is there a faster way to convert byte to char than this?
char a = ((char)( _tab[0]*1 + _tab[1]*2 + _tab[2]*4 + _tab[3]*8 + _tab[4]*16 + _tab[5]*32 + _tab[6]*64 + _tab[7]*128));
tl;dr Your conversion code is already efficient, and is not your bottleneck.
Your benchmarking is flawed. You are not just timing the conversion of binary stored in int[] to integer value. You are also timing the generation of your random data. I expect that the majority of the time is spent generating the random data.
Re-write your benchmarking program to operate on data prepared before you start timing. Make sure that the duration of the test is at least 5 or 10 seconds so that you can generate meaningful answers. If you only run for two milliseconds then the granularity of your timer affects the quality of your results.
Bear in mind that in your real application you will be taking a picture on a camera of a QR code and decoding that. The cost of that is many orders of magnitude greater than the cost of converting the 8 bit int arrays.
Your code to do that conversion is already efficient. Do not seek to optimize it further. Not only is there no need to optimize it, there is little hope for significant gains. For the sake of clarity and conciseness you may well opt to use one of the .net library methods that perform such a conversion, but performance of this part of your program is not an issue.
As an aside, it looks like you need to be converting the 8 bit value to byte, adding these values to a byte array, and then feeding to Encoding.GetString to obtain your text. A cast to UTF-16 char as per your code is not correct.
It worth a try this:
var yourString = "00100000";
char yourChar = (char) Convert.ToByte(yourString, 2); // you got ' ' (space)
It may or may not faster, but definitely simpler, more stable and more maintainable.
I ran some tests with different implementations.
First was #Melnikovl answer.
Second was mine, where I replaced + with | and * with << operator.
Third was author's original solution.
I tested with modified code and measured only conversion code.
First and second solution showed a little better performance. But BitConverter a little more often was better, so I think you should choose it (also because if simplicity of code)
var byte[] bytes = { 1, 1, 1, 1 };
int i = BitConverter.ToInt32(bytes, 0);
char a = (char)i;
Don't forgot to check if byte array litte or big endian
(C#, prime generator)
Heres some code a friend and I were poking around on:
public List<int> GetListToTop(int top)
{
top++;
List<int> result = new List<int>();
BitArray primes = new BitArray(top / 2);
int root = (int)Math.Sqrt(top);
for (int i = 3, count = 3; i <= root; i += 2, count++)
{
int n = i - count;
if (!primes[n])
for (int j = n + i; j < top / 2; j += i)
{
primes[j] = true;
}
}
if (top >= 2)
result.Add(2);
for (int i = 0, count = 3; i < primes.Length; i++, count++)
{
if (!primes[i])
{
int n = i + count;
result.Add(n);
}
}
return result;
}
On my dorky AMD x64 1800+ (dual core), for all primes below 1 billion in 34546.875ms. Problem seems to be storing more in the bit array. Trying to crank more than ~2billion is more than the bitarray wants to store. Any ideas on how to get around that?
I would "swap" parts of the array out to disk. By that, I mean, divide your bit array into half-billion bit chunks and store them on disk.
The have only a few chunks in memory at any one time. With C# (or any other OO language), it should be easy to encapsulate the huge array inside this chunking class.
You'll pay for it with a slower generation time but I don't see any way around that until we get larger address spaces and 128-bit compilers.
Or as an alternative approach to the one suggested by Pax, make use of the new Memory-Mapped File classes in .NET 4.0 and let the OS decide which chunks need to be in memory at any given time.
Note however that you'll want to try and optimise the algorithm to increase locality so that you do not needlessly end up swapping pages in and out of memory (trickier than this one sentence makes it sound).
Use multiple BitArrays to increase the maximum size. If a number is to great bit-shift it and store the result in a bit-array for storing bits 33-64.
BitArray second = new BitArray(int.MaxValue);
long num = 23958923589;
if (num > int.MaxValue)
{
int shifted = (int)num >> 32;
second[shifted] = true;
}
long request = 0902305023;
if (request > int.MaxValue)
{
int shifted = (int)request >> 32;
return second[shifted];
}
else return first[request];
Of course it would be nice if BitArray would support size up to System.Numerics.BigInteger.
Swapping to disk will make your code really slow.
I have a 64-bit OS, and my BitArray is also limited to 32-bits.
PS: your prime number calculations looks wierd, mine looks like this:
for (int i = 2; i <= number; i++)
if (primes[i])
for (int scalar = i + i; scalar <= number; scalar += i)
{
primes[scalar] = false;
yield return scalar;
}
The Sieve algorithm would be better performing. I could determine all the 32-bit primes (total about 105 million) for the int range in less than 4 minutes with that. Of course returning the list of primes is a different thing as the memory requirement there would be a little over 400 MB (1 int = 4 bytes). Using a for loop the numbers were printed to a file and then imported to a DB for more fun :) However for the 64 bit primes the program would need several modifications and perhaps require distributed execution over multiple nodes. Also refer to the following links
http://www.troubleshooters.com/codecorn/primenumbers/primenumbers.htm
http://en.wikipedia.org/wiki/Prime-counting_function
How many parameters can you pass to a string.Format() method?
There must be some sort of theoretical or enforced limit on it. Is it based on the limits of the params[] type or the memory usage of the app that is using it or something else entirely?
OK, I emerge from hiding... I used the following program to verify what was going on and while Marc pointed out that a string like this "{0}{1}{2}...{2147483647}" would succeed the memory limit of 2 GiB before the argument list, my findings did't match yours. Thus the hard limit, of the number of parameters you can put in a string.Format method call has to be 107713904.
int i = 0;
long sum = 0;
while (sum < int.MaxValue)
{
var s = sizeof(char) * ("{" + i + "}").Length;
sum += s; // pseudo append
++i;
}
Console.WriteLine(i);
Console.ReadLine();
Love the discussion people!
Not as far as I know...
well, the theoretical limit would be the int32 limit for the array, but you'd hit the string length limit long before that, I guess...
Just don't go mad with it ;-p It may be better to write lots of small fragments to (for example) a file or response, than one huge hit.
edit - it looked like there was a limit in the IL (0xf4240), but apparently this isn't quite as it appears; I can make it get quite large (2^24) before I simply run out of system memory...
Update; it seems to me that the bounding point is the format string... those {1000001}{1000002} add up... a quick bit of math (below) shows that the maximum useful number of arguments we can use is 206,449,129:
long remaining = 2147483647;// max theoretical format arg length
long count = 10; // i.e. {0}-{9}
long len = 1;
int total = 0;
while (remaining >= 0) {
for(int i = 0 ; i < count && remaining >= 0; i++) {
total++;
remaining -= len + 2; // allow for {}
}
count *= 10;
len++;
}
Console.WriteLine(total - 1);
Expanding on Marc's detailed answer.
The only other limitation that is important is for the debugger. Once you pass a certain number of parameters directly to a function, the debugger becomes less functional in that method. I believe the limit is 64 parameters.
Note: This does not mean an array with 64 members, but 64 parameters passed directly to the function.
You might laugh and say "who would do this?" which is certainly a valid question. Yet LINQ makes this a lot easier than you think. Under the hood in LINQ the compiler generates a lot of code. It's possible that for a large generate SQL query where more than 64 fields are selected that you would hit this issue. Because the compiler under the hood would need to pass all of the fields to the constructor of an anonymous type.
Still a corner case.
Considering that both the limit of the Array class and the String class are the upper limit of Int32 (documented at 2,147,483,647 here: Int32 Structure), it is reasonable to believe that this value is the limit of the number string of format parameters.
Update Upon checking reflector, John is right. String.Format, using the Red Gate Reflector, shows the ff:
public static string Format(IFormatProvider provider, string format, params object[] args)
{
if ((format == null) || (args == null))
{
throw new ArgumentNullException((format == null) ? "format" : "args");
}
StringBuilder builder = new StringBuilder(format.Length + (args.Length * 8));
builder.AppendFormat(provider, format, args);
return builder.ToString();
}
The format.Length + (args.Length * 8) part of the code is enough to kill most of that number. Ergo, '2,147,483,647 = x + 8x' leaves us with x = 238,609,294 (theoretical).
It's far less than that of course; as the guys in the comments mentioned the string hitting the string length limit earlier is quite likely.
Maybe someone should just code this into a machine problem! :P