Uninitialized floating-point variables, reproducing indeterminate behavior - c#

I had to debug some code that was exhibiting transient and sporadic behavior, which ultimately could be attributed to an uninitialized float in a line of initializations, i.e.:
float a = number, b, c = other_number;
This section of code was rapidly sampling a device over a serial connection and averaging the output over some interval. Every once in a while, the number 2.7916085e+035 would get reported, but otherwise the code worked as intended and the bug was not reproducible.
Since the number was always 2.7916085e+035, I thought there might have been some issues with the communications handling, or the device itself, but these were ruled out. I was almost ready to blame it on external interference until I finally caught a faulty sample in the debugger.
So, to the question. Can someone postulate the significance of 2.7916085e+035? I'm not sure it has any meaning outside of my context, but what bothers me is that this number was essentially unreproducibly reproducible. That is to say, I couldn't replicate the problem reliably, but when it arose, it was always the same value. From my understanding, uninitialized variables are supposed to be indeterminate. It's worth noting that the issue happened at all different points of program execution, phase, time of day, etc., but always on the same system.
Is there something in the .NET framework, runtime, or operating system that was causing the behavior? This was particularly troublesome to track down because the uninitialized variable always had the same value, when it didn't luckily get set to 0.
Edit: Some context. The code is within a timer with a variable tick rate, so the variables are local non-static members of a class:
if (/* some box checked */)
{
    switch (/* some output index */)
    {
        case problem_variable:
        {
            if (ready_to_sample)
            {
                float average;
                for each (float num in readings)
                {
                    average += num;
                }
                average /= readings.Count;
            }
        }
    }
}
The variable in question here would be average. readings is a list of outputs that I want to average. average would be redeclared one time per.... average, which can happen in seconds, minutes, hours, or whenever the condition is met to take an average. More often than not the variable would get 0, but occasionally it would get the number above.
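For what it's worth, a minimal self-contained sketch of the usual fix is simply initializing the accumulator before the loop; the readings list here is invented stand-in data, not the actual sampled values:

```csharp
using System;
using System.Collections.Generic;

class AverageFix
{
    static void Main()
    {
        // Stand-in for the sampled values from the question.
        var readings = new List<float> { 1.0f, 2.0f, 3.0f };

        float average = 0f;             // explicit initialization: the fix
        foreach (float num in readings)
        {
            average += num;
        }
        average /= readings.Count;

        Console.WriteLine(average);     // 2
    }
}
```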

In the common floating-point encodings, 2.7916085e+035 is 0x7a570ec5 as a float and 0x474ae1d8a58be975 as a double, modulo endianness. These do not look like a typical text character string, a simple integer, or a common address. (The low bits of the double encoding are uncertain, as you did not capture enough decimal digits to determine them, but the high bits do not look meaningful.)
I expect there is little information to be deduced from this value by itself.
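The float encoding quoted above can be reproduced with BitConverter; this sketch assumes the reported value really was a 32-bit float (nine significant decimal digits round-trip a float exactly):

```csharp
using System;

class BitPattern
{
    static void Main()
    {
        float f = 2.7916085e+35f;
        // Reinterpret the float's bytes as a 32-bit integer
        // to see the raw IEEE 754 encoding.
        int bits = BitConverter.ToInt32(BitConverter.GetBytes(f), 0);
        Console.WriteLine(bits.ToString("X8"));   // 7A570EC5
    }
}
```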

That double in 64-bit binary translates to
0100011101001010111000011101100010100101100010111110100000000000
or
01111010010101110000111011000101
as a 32-bit float. Nearly all modern processors keep instructions and data separate--especially R/W data. The exception, of course, is the old x86, a CISC design whose lineage goes back to the 4004, from the days when every byte was at a premium and even minicomputers did not have caches to work with. With modern OSes, however, it is much more likely that, while 4 or 8 KB pages were being moved around, a page that had held instructions was reused as data without being zeroed out.
The double version might be the equivalent to
Increment by 1, where r7 (EDI - extended destination index) is selected
The second, viewed as a float, looks like it would translate to either x86 or x86-64:
How do I interpret this x86_64 assembly opcode?

The value that you see for an uninitialized variable is whatever happens to be in that memory location. It's not random; it's a value that was stored in memory by a previous function call. For example:
#include <iostream>

void f() {
    int i = 3;
}

void g() {
    int i;
    std::cout << i << std::endl;
}

int main() {
    f();
    g();
    return 0;
}
Chances are, this program (assuming the compiler doesn't optimize out the initialization in f()) will write 3 to the console.

Floats use a base-2 number system. Because of this, there are specific values that cannot be stored exactly and evaluate to an approximation.
Your output is probably hitting a value that consistently produces the same approximation. Try running through some common values that you get from the serial connection and see if you can find the value that is causing you grief. I personally would use a double for something like this instead of a float, especially if you are going to be doing any kind of calculations against those numbers.
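To see the base-2 approximation directly, print a float with more digits than the default formatting shows; 0.1 is the classic example:

```csharp
using System;

class FloatApprox
{
    static void Main()
    {
        float f = 0.1f;
        // "G9" shows enough digits to expose the stored binary approximation.
        Console.WriteLine(f.ToString("G9"));            // 0.100000001
        // Widening to double reveals even more of the stored value.
        Console.WriteLine(((double)f).ToString("G17")); // 0.10000000149011612
    }
}
```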


Why is this loop endless only when I am not debugging it?

The program below computes how many terms of some infinite converging sum are needed to exceed a certain threshold. I understand/suspect that loops like this may not terminate if rate is too large (e.g. 750, see below) because of computational inaccuracies caused by floating-point arithmetic.
However, the loop below outputs i=514 in debug mode (Microsoft Visual Studio, .NET 4.6.1), but does not terminate ("hangs") in release mode. Perhaps even more strange: if I comment out the "if" part inside the loop (meant to figure out what is happening), then the release-mode code suddenly also outputs i=514.
What are the reasons behind this? How to avoid problems like this from popping up in release mode? (Edit: I would rather not add an if statement or break statement in production code; this code should be as performant as possible.)
static void Main(string[] args)
{
    double rate = 750;
    double d = 0.2;
    double rnd = d * Math.Exp(rate);
    int i = 0;
    int j = 0;
    double term = 1.0;
    do
    {
        rnd -= term;
        term *= rate;
        term /= ++i;
        //if (j++ > 1000000)
        //{
        //    Console.WriteLine(d + " " + rate + " " + term);
        //    j = 0;
        //    Console.ReadLine();
        //}
    } while (rnd > 0);
    Console.WriteLine("i= " + i); // do something with i
    Console.ReadLine();
    return;
}
Executive summary: your code is broken even in Debug - it produces the wrong result even when the loop exits. You need to be aware of the limits of floating-point arithmetic.
If you step through your code with a debugger, you quickly see what's wrong.
Math.Exp(rate) is large. Very large. Larger than a double-precision number can hold. Therefore rnd starts off with the value Infinity.
When you come to rnd -= term, that's Infinity minus some number, which is still Infinity. Therefore rnd > 0 is always true, as Infinity is greater than zero.
This carries on until term also reaches Infinity. Then rnd -= term becomes Infinity - Infinity, which is NaN. Every ordered comparison with NaN is false, so rnd > 0 is suddenly false, and your loop exits.
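The Infinity and NaN behavior described above can be confirmed in isolation:

```csharp
using System;

class InfNaN
{
    static void Main()
    {
        double rnd = Math.Exp(750);   // e^750 overflows double's range
        Console.WriteLine(double.IsPositiveInfinity(rnd));          // True
        Console.WriteLine(double.IsPositiveInfinity(rnd - 1e300));  // True: Infinity minus finite stays Infinity

        double nan = rnd - rnd;       // Infinity - Infinity
        Console.WriteLine(double.IsNaN(nan));  // True
        Console.WriteLine(nan > 0);            // False: ordered comparisons with NaN are false
    }
}
```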
I don't know why this changes in release mode (I can't reproduce it), but it's entirely possible that the order of your floating-point operations was changed. This can have drastic effects on the output if you're dealing with both large and small numbers at the same time. For example, term *= rate; term /= ++i might be ordered such that term * rate always happens first in Debug, and that multiplication reaches Infinity before the division happens. In Release, that might be re-ordered such that rate / ++i happens first, and that stops you from ever hitting Infinity. Because you started off with the error that rnd is always Infinity, your loop can only break when term is also Infinity.
I suspect it may depend on factors such as your processor, as well.
EDIT: See this answer by @HansPassant for a much better explanation.
So again, while the difference between hanging and not hanging may depend on Debug vs Release, your code never worked in the first place. Even in Debug, it produces the wrong result!
If you're dealing with large or small numbers, you need to be careful about the limits of double precision. Floating-point numbers are complex beasts, and have lots of subtle behaviour. You need to be aware of that, see for example this famous article: What Every Computer Scientist Should Know About Floating-Point Arithmetic. Be aware of the limits, the issues with combining large and small numbers, etc.
If you're working near the limits of floating-point, you need to test your assumptions: make sure that you're not dealing with numbers which are too large or too small, for example. If you expect an exponentiation to be less than Infinity, test that. If you expect a loop to exit, add a guard condition to make sure it exits with an error after a certain number of iterations. Test your code! Make sure that it behaves correctly in the edge cases.
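An iteration guard of the kind suggested above can be a simple cap; this is only a sketch, with the broken Infinity starting value simulated and the cap chosen arbitrarily:

```csharp
using System;

class GuardedLoop
{
    static void Main()
    {
        const int maxIterations = 1000;       // arbitrary cap for illustration
        double rnd = double.PositiveInfinity; // simulates the broken starting value
        double term = 1.0;
        int i = 0;
        bool converged = true;

        do
        {
            rnd -= term;                      // Infinity - finite stays Infinity
            if (++i >= maxIterations)
            {
                converged = false;            // guard: loop failed to converge
                break;
            }
        } while (rnd > 0);

        Console.WriteLine(converged
            ? "i=" + i
            : "guard tripped after " + i + " iterations");
    }
}
```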
Also, use a big number library where appropriate. If possible, rework your algorithm to be more computer-friendly. Many algorithms are written such that they're elegant for mathematicians to write in a textbook, but they're impractical for a processor to execute. There are often versions which do the same things, but are more computer-friendly.
I would rather not add an if statement or break statement in production code; this code should be as performant as possible.
Don't be afraid of a single if statement in a loop. If it always produces the same result -- e.g. your break is never hit -- the branch predictor very quickly catches on and the branch has almost no cost. It's a loop with a branch which is unpredictable that you need to be careful of.

Denormalized numbers C#

I recently came across the definition of denormalized numbers, and I understand that there are some numbers that cannot be represented in normalized form because they are too small to fit into the corresponding type, according to IEEE 754.
So what I was trying to do is catch when a denormalized number is being passed as a parameter, to avoid calculations with these numbers. If I understand correctly, I just need to look for numbers within the denormalized range:
private bool IsDenormalizedNumber(float number)
{
    return Math.Pow(2, -149) <= number && number <= ((2 - Math.Pow(2, -23)) * Math.Pow(2, -127)) ||
           Math.Pow(-2, -149) <= number && number <= -((2 - Math.Pow(2, -23)) * Math.Pow(2, -127));
}
Is my interpretation correct?
I think a better approach would be to inspect the bits. Normalized or denormalized is a characteristic of the binary representation, not of the value itself. Therefore, you will be able to detect it more reliably this way, and you can do so without any potentially dangerous floating-point comparisons.
I put together some runnable code for you, so that you can see it work. I adapted this code from a similar question regarding doubles. Detecting the denormal is much simpler than fully excising the exponent and significand, so I was able to simplify the code greatly.
As for why it works... The exponent is stored in offset notation. The 8 bits of the exponent can take the values 1 to 254 (0 and 255 are reserved for special cases), they are then offset adjusted by -127 yielding the normalized range of -126 (1-127) to 127 (254-127). The exponent is set to 0 in the denormal case. I think this is only required because .NET does not store the leading bit on the significand. According to IEEE 754, it can be stored either way. It appears that C# has opted for dropping it in favor of a sign bit, though I don't have any concrete details to back that observation.
In any case, the actual code is quite simple. All that is required is to excise the 8 bits storing the exponent and test for 0. There is a special case around 0, which is handled below.
NOTE: Per the comment discussion, this code relies on platform-specific implementation details (x86_64 in this test case). As @ChiuneSugihara pointed out, the CLI does not ensure this behavior and it may differ on other platforms, such as ARM.
using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("-120, denormal? " + IsDenormal((float)Math.Pow(2, -120)));
            Console.WriteLine("-126, denormal? " + IsDenormal((float)Math.Pow(2, -126)));
            Console.WriteLine("-127, denormal? " + IsDenormal((float)Math.Pow(2, -127)));
            Console.WriteLine("-149, denormal? " + IsDenormal((float)Math.Pow(2, -149)));
            Console.ReadKey();
        }

        public static bool IsDenormal(float f)
        {
            // When f is 0, the exponent will also be 0 and would break
            // the rest of this algorithm, so check for that first.
            if (f == 0f)
            {
                return false;
            }

            // Get the bits.
            byte[] buffer = BitConverter.GetBytes(f);
            int bits = BitConverter.ToInt32(buffer, 0);

            // Extract the exponent: 8 bits in the upper registers,
            // above the 23-bit significand.
            int exponent = (bits >> 23) & 0xff;

            // A zero exponent marks a denormal.
            return exponent == 0;
        }
    }
}
The output is:
-120, denormal? False
-126, denormal? False
-127, denormal? True
-149, denormal? True
Sources:
extracting mantissa and exponent from double in c#
https://en.wikipedia.org/wiki/IEEE_floating_point
https://en.wikipedia.org/wiki/Denormal_number
http://csharpindepth.com/Articles/General/FloatingPoint.aspx
Code adapted from:
extracting mantissa and exponent from double in c#
From my understanding, denormalized numbers are there to help with underflows in some cases (see the answer to Denormalized Numbers - IEEE 754 Floating Point).
So to get a denormalized number you would need to explicitly create one or else cause an underflow. In the first case it seems unlikely that a literal denormalized number would be specified in code, and even if someone tried it I am not sure that .NET would allow it. In the second case, as long as you are in a checked context you should get an OverflowException thrown for any overflow or underflow in an arithmetic computation, which would guard against the possibility of getting a denormalized number. In an unchecked context I am not sure whether an underflow will take you to a denormalized number, but you can try it and see, if you want to run calculations unchecked.
Long story short: you need not worry about it if you run in a checked context, and you can try an underflow and see what happens if you want to run unchecked.
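For what it's worth, underflow does appear to produce subnormals silently on common platforms (the checked context only affects integral arithmetic, as the edit below notes). This sketch uses float.IsSubnormal, which is available on .NET Core 3.0 and later, and assumes an x86_64 runtime that does not flush subnormals to zero:

```csharp
using System;

class UnderflowDemo
{
    static void Main()
    {
        float smallestNormal = 1.17549435e-38f;  // 2^-126, the smallest normalized float
        float sub = smallestNormal / 2f;         // underflows into the subnormal range

        Console.WriteLine(sub > 0f);             // True: not flushed to zero here
        Console.WriteLine(float.IsSubnormal(sub)); // True on .NET Core 3.0+ / x86_64
    }
}
```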
EDIT
I wanted to update my answer since a comment didn't feel substantial enough. First off I struck out the comment I made about the checked context since that only applies to non-floating point calculations (like int) and not to float or double. That was my mistake on that one.
The issue with denormalized numbers is that they are not consistent in the CLI. Notice how I am using "CLI" and not "C#" because we need to go lower level than just C# to understand the issue. From The Common Language Infrastructure Annotated Standard Partition I Section 12.1.3 the second note (page 125 of the book) it states:
This standard does not specify the behavior of arithmetic operations on denormalized floating point numbers, nor does it specify when or whether such representations should be created. This is in keeping with IEC 60559:1989. In addition, this standard does not specify how to access the exact bit pattern of NaNs that are created, nor the behavior when converting a NaN between 32-bit and 64-bit representation. All of this behavior is deliberately left implementation specific.
So at the CLI level the handling of denormalized numbers is deliberately left implementation-specific. Furthermore, if you look at the documentation for float.Epsilon (found here), which is the smallest positive number representable by a float, you will see that on most machines it is a denormalized number matching what is listed in the documentation (approximately 1.4e-45). This is what @Kevin Burdett was most likely seeing in his answer. That being said, if you scroll down farther on the page you will see the following quote under "Platform Notes":
On ARM systems, the value of the Epsilon constant is too small to be detected, so it equates to zero. You can define an alternative epsilon value that equals 1.175494351E-38 instead.
So there are portability issues that can come into play when you are manually handling denormalized numbers, even just for the .NET CLR (which is an implementation of the CLI). In fact, this ARM-specific value is kind of interesting, since it appears to be a normalized number (I used the function from @Kevin Burdett with IsDenormal(1.175494351E-38f) and it returned false). In the CLI proper the concerns are more severe, since there is no standardization of their handling, by design, according to the annotation on the CLI standard. So this leaves questions about what would happen with the same code on Mono or Xamarin, for instance, which are different implementations of the CLI than the .NET CLR.
In the end I am right back to my previous advice. Just don't worry about denormalized numbers; they are there to silently help you, and it is hard to imagine why you would need to specifically single them out. Also, as @HansPassant mentioned, you most likely won't even encounter one anyway. It is just hard to imagine how you would get below the smallest positive normalized double, which is absurdly small.

Difference between Pointers in C# and C

I have this code in C:
long data1 = 1091230456;
*(double*)&((data1)) = 219999.02343875566
When I use the same code in C#, the result is:
*(double*)&((data1)) = 5.39139480005278E-315
but if I define another variable in C#:
unsafe
{
    long* data2 = &(data1);
}
now:
*(double)&((data2)) = 219999.02343875566
Why the difference?
Casting pointers is always tricky, especially when you don't have guarantees about the layout and size of the underlying types.
In C#, long is always a 64-bit integer and double is always 64-bit floating point number.
In C, long can easily end up being smaller than the 64-bits needed. If you're using a compiler that translates long as a 32-bit number, the rest of the value will be junk read from the next piece of memory - basically a "buffer" overflow.
On Windows, you usually want to use long long for 64-bit integers. Or better, use something like int64_t, where you're guaranteed to have exactly 64-bits of data. Or the best, don't cast pointers.
C integer types can be confusing if you have a Java / C# background. They give you guarantees about the minimal range they must allow, but that's it. For example, int must be able to hold values in the [−32767, +32767] range (note that it's not −32768 - C had to support one's complement machines, which had two zeroes), close to C#'s short. long must be able to hold values in the [−2147483647, +2147483647] range, close to C#'s int. Finally, long long is close to C#'s long, having at least the [−(2^63−1), +(2^63−1)] range. float and double are specified even more loosely.
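By contrast, C# pins these sizes down in the language itself, which is easy to confirm:

```csharp
using System;

class FixedSizes
{
    static void Main()
    {
        // In C#, these sizes are guaranteed by the language specification,
        // on every platform and compiler.
        Console.WriteLine(sizeof(long));   // 8
        Console.WriteLine(sizeof(int));    // 4
        Console.WriteLine(sizeof(short));  // 2
        Console.WriteLine(sizeof(double)); // 8
    }
}
```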
Whenever you cast pointers, you throw away even the tiny bits of abstraction C provides you with - you work with the underlying hardware layouts, whatever those are. This is one road to hell and something to avoid.
Sure, these days you probably will not find one's complement numbers, or other floating points than IEEE 754, but it's still inherently unsafe and unpredictable.
EDIT:
Okay, reproducing your example fully in a way that actually compiles:
unsafe
{
    long data1 = 1091230456;
    long* data2 = &data1;
    var result = *(double*)&((data2));
}
result ends up being 219999.002675845 for me, close enough to make it obvious. Let's see what you're actually doing here, in more detail:
Store 1091230456 in a local data1
Take the address of data1, and store it in data2
Take the address of data2, cast it to a double pointer
Take the double value of the resulting pointer
It should be obvious that whatever value ends up in result has little relation to the value you stored in data1 in the first place!
Printing out the various parts of what you're doing will make this clearer:
unsafe
{
    long data1 = 1091230456;
    long* pData1 = &data1;
    var pData2 = &pData1;
    var pData2Double = (double*)pData2;
    var result = *pData2Double;

    new
    {
        data1 = data1,
        pData1 = (long)pData1,
        pData2 = (long)pData2,
        pData2Double = (long)pData2Double,
        result = result
    }.Dump();
}
This prints out:
data1: 1091230456
pData1: 91941328
pData2: 91941324
pData2Double: 91941324
result: 219999.002675845
This will vary according to many environmental settings, but the critical part is that pData2 is pointing to memory four bytes in front of the actual data! This is because of the way the locals are allocated on stack - pData2 is pointing to pData1, not to data1. Since we're using 32-bit pointers here, we're reading the last four bytes of the original long, combined with the stack pointer to data1. You're reading at the wrong address, skipping over one indirection. To get back to the correct result, you can do something like this:
var pData2Double = (double**)pData2;
var result = *(*pData2Double);
This results in 5.39139480005278E-315 - the original value produced by your C# code. This is the more "correct" value, as far as there can even be a correct value.
The obvious answer here is that your C code is wrong as well - either due to different operand semantics, or due to some bug in the code you're not showing (or again, using a 32-bit integer instead of 64-bit), you end up with a pointer to a pointer to the value you want, and you mistakenly build the resulting double on a scrambled value that includes part of the original long, as well as the stack pointer - in other words, exactly one of the reasons you should be extra cautious whenever using unsafe code. Interestingly, this also implies that when compiled as a 64-bit executable, the result will be entirely decoupled from the value of data1 - you'll have a double built on the stack pointer exclusively.
Don't mess with pointers until you understand indirection very, very well. They have a tendency to "mostly work" when used entirely wrong. Then you change a tiny part of the code (for example, in this code you could add a third local, which could change where pData1 is allocated) or move to a different architecture (32-bit vs. 64-bit is quite enough in this example), or a different compiler, or a different OS... and it breaks completely. You don't guess around your way with pointers. Either you know exactly what every single expression in the code means, or you shouldn't deal with pointers at all.

How to combine float representation with a discontinuous function?

I have read tons of things about floating-point error and floating-point approximation, and all that.
The thing is: I never read an answer to a real-world problem. And today, I came across a real-world problem. And this is really bad, and I really don't know how to escape it.
Take a look at this example :
[TestMethod]
public void TestMethod1()
{
    float t1 = 8460.32F;
    float t2 = 5990;
    var x = t1 - t2;
    var y = F(x);
    Assert.AreEqual(x, y);
}

float F(float x)
{
    if (x <= 2470.32F) { return x; }
    else { return -x; }
}
x is supposed to be 2470.32. But in fact, due to rounding error, its value is 2470.32031.
Most of the time this is not a problem. Functions are continuous, and all is good; the result is off by a tiny value.
But here we have a discontinuous function, and the error is really, really big. The test failed exactly on the point of discontinuity.
How could I handle the rounding error with discontinuous functions?
The key problem here is:
The function has a large (and significant) change in output value in certain cases when there is a small change in input value.
You are passing an incorrect input value to the function.
As you write, “due to rounding error, [x’s value] is 2470.32031”. Suppose you could write any code you desire—simply describe the function to be performed, and a team of expert programmers will provide complete, bug-free source code within seconds. What would you tell them?
The problem you are posing is, “I am going to pass a wrong value, 2470.32031, to this function. I want it to know that the correct value is something else and to provide the result for the correct value, which I did not pass, instead of the incorrect value, which I did pass.”
In general, that problem is impossible to solve, because it is impossible to distinguish when 2470.32031 is passed to the function but 2470.32 is intended from when 2470.32031 is passed to the function and 2470.32031 is intended. You cannot expect a computer to read your mind. When you pass incorrect input, you cannot expect correct output.
What this tells us is that no solution inside of the function F is possible. Therefore, we must zoom out and look at the larger problem. You must examine whether the value passed to F can be improved (calculated in a better way or with higher precision or with supplementary information) or whether the nature of the problem is such that, when 2470.32031 is passed, 2470.32 is always intended, so that this knowledge can be incorporated into F.
NOTE: this answer is essentially the same as Eric's.
It just highlights the testing point of view, since a test is a form of specification.
The problem here is that testMethod1 does not test F.
It rather tests that conversion of decimal quantity 8460.32 to float and float subtraction are inexact.
But is it the intention of the test?
All you can say is that in certain bad conditions (near discontinuity), a small error on input will result in a large error on output, so the test could express that it is an expected result.
Note that function F is almost perfect, except maybe for the float value 2470.32F itself.
Indeed, the floating-point approximation will round the decimal upward (by 7/102400 exactly).
So the answer should be:
Assert.AreEqual(F(2470.32F), -2470.32F); /* because 2470.32F exceeds the decimal 2470.32 */
If you want to test such low-level requirements, you'll need a library with high (arbitrary/infinite) precision to perform the tests.
If you can't afford such imprecision in function F, then float is a mismatch, and you'll have to find another implementation with increased, arbitrary, or infinite precision.
It's up to you to specify your needs, and testMethod1 should express that specification better than it does right now.
If you need the 8460.32 number to be exactly that without rounding error, you could look at the .NET Decimal type which was created explicitly to represent base 10 fractional numbers without rounding error. How they perform that magic is beyond me.
Now, I realize this may be impractical for you to do because the float presumably comes from somewhere and refactoring it to Decimal type could be way too much to do, but if you need it to have that much precision for the discontinuous function that relies on that value you'll either need a more precise type or some mathematical trickery. Perhaps there is some way to always ensure that a float is created that has rounding error such that it's always less than the actual number? I'm not sure if such a thing exists but it should also solve your issue.
You have three numbers represented in your application, you have accepted imprecision in each of them by representing them as floats.
So I think you can reasonably claim that your program is working correctly
(oneNumber +/- some imprecision ) - (another number +/- some imprecision)
is not quite bigger than another number +/- some imprecision
When viewed in decimal representation on paper it looks wrong, but that's not what you've implemented. What's the origin of the data? How precisely was 8460.32 known? Had it been 8460.31999, what should have happened? 8460.32001? Was the original value known to such precision?
In the end if you want to model more accuracy use a different data type, as suggested elsewhere.
I always just assume that when comparing floating point values a small margin of error is needed because of rounding issues. In your case, this would most likely mean choosing values in your test method that aren't quite so stringent--e.g., define a very small error constant and subtract that value from x. Here's a SO question that relates to this.
Edit to better address the concluding question: Presumably it doesn't matter what the function outputs on the discontinuity exactly, so test just slightly on either side of it. If it does matter, then really about the best you can do is allow either of two outputs from the function at that point.
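A tolerance-based comparison of the kind described might look like the sketch below; the epsilon value is an arbitrary choice for illustration, not a recommendation:

```csharp
using System;

class ToleranceCompare
{
    const float Epsilon = 1e-2f;   // arbitrary tolerance for illustration

    static bool NearlyEqual(float a, float b)
    {
        return Math.Abs(a - b) <= Epsilon;
    }

    static void Main()
    {
        float t1 = 8460.32f;
        float t2 = 5990f;
        float x = t1 - t2;                    // 2470.3203..., not exactly 2470.32

        Console.WriteLine(x == 2470.32f);     // False: exact comparison fails
        Console.WriteLine(NearlyEqual(x, 2470.32f)); // True: within tolerance
    }
}
```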

Should I use byte or int?

I recall having read somewhere that it is better (in terms of performance) to use Int32, even if you only require Byte. It applies (supposedly) only to cases where you do not care about the storage. Is this valid?
For example, I need a variable that will hold a day of week. Do I
int dayOfWeek;
or
byte dayOfWeek;
EDIT:
Guys, I am aware of DayOfWeek enum. The question is about something else.
Usually yes, a 32 bit integer will perform slightly better because it is already properly aligned for native CPU instructions. You should only use a smaller sized numeric type when you actually need to store something of that size.
You should use the DayOfWeek enum, unless there's a strong reason not to.
DayOfWeek day = DayOfWeek.Friday;
To explain, since I was downvoted:
The correctness of your code is almost always more critical than the performance, especially in cases where we're talking this small of a difference. If using an enum or a class representing the semantics of the data (whether it's the DayOfWeek enum, or another enum, or a Gallons or Feet class) makes your code clearer or more maintainable, it will help you get to the point where you can safely optimize.
int z;
int x = 3;
int y = 4;
z = x + y;
That may compile. But there's no way to know if it's doing anything sane or not.
Gallons z;
Gallons x = new Gallons(3);
Feet y = new Feet(4);
z = x + y;
This won't compile, and even looking at it it's obvious why not - adding Gallons to Feet makes no sense.
My default position is to try to use strong types to add constraints to values - where you know those in advance. Thus in your example, it may be preferable to use byte dayOfWeek because it is closer to your desired value range.
Here is my reasoning, using the example of storing and passing the year part of a date. The year part, when considering other parts of the system that include SQL Server DateTimes, is constrained to 1753 through 9999 (note that C#'s possible range for DateTime is different!). Thus a short covers my possible values, and if I try to pass anything larger the compiler will warn me before the code will compile. Unfortunately, in this particular example, the C# DateTime.Year property returns an int - thus forcing me to cast the result if I need to pass e.g. DateTime.Now.Year into my function.
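That cast requirement looks like this in practice; ValidateYear is a hypothetical helper used only for illustration:

```csharp
using System;

class YearDemo
{
    // Hypothetical helper: a short comfortably covers the
    // SQL Server DateTime year range of 1753..9999.
    static bool ValidateYear(short year)
    {
        return year >= 1753 && year <= 9999;
    }

    static void Main()
    {
        // DateTime.Year returns int, so an explicit cast is required:
        // ValidateYear(DateTime.Now.Year);          // does not compile
        Console.WriteLine(ValidateYear((short)DateTime.Now.Year)); // True
    }
}
```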
This starting-position is driven by considerations of long-term storage of data, assuming 'millions of rows' and disk space - even though it is cheap (it is far less cheap when you are hosted and running a SAN or similar).
In another DB example, I will use smaller types such as byte (SQL Server tinyint) for lookup ID's where I am confident that there will not be many lookup types, through to long (SQL Server bigint) for id's where there are likely to be more records. i.e. to cover transactional records.
So my rules of thumb:
Go for correctness first if possible. Use DayOfWeek in your example, of course :)
Go for a type of appropriate size thus making use of compiler safety checks giving you errors at the earliest possible time;
...but offset against extreme performance needs and simplicity, especially where long-term storage is not involved, or where we are considering a lookup (low row count) table rather than a transactional (high row count) one.
In the interests of clarity, DB storage tends not to shrink as quickly as you expect by shrinking column types from bigint to smaller types. This is both because of padding to word boundaries and page-size issues internal to the DB. However, you probably store every data item several times in your DB, perhaps through storing historic records as they change, and also keeping the last few days of backups and log backups. So saving a few percent of your storage needs will have long term savings in storage cost.
I have never personally experienced issues where the in-memory performance of bytes vs. ints has been an issue, but I have wasted hours and hours having to reallocate disk space and have live servers entirely stall because there was no one person available to monitor and manage such things.
Use an int. Computer memory is addressed by "words," which are usually 4 bytes long. What this means is that if you want to get one byte of data from memory, the CPU has to retrieve the entire 4-byte word from RAM and then perform some extra steps to isolate the single byte that you're interested in. When thinking about performance, it will be a lot easier for the CPU to retrieve a whole word and be done with it.
Actually in all reality, you won't notice any difference between the two as far as performance is concerned (except in rare, extreme circumstances). That's why I like to use int instead of byte, because you can store bigger numbers with pretty much no penalty.
In terms of storage, use byte; in terms of CPU performance, use int.
System.DayOfWeek
MSDN
Most of the time use int. Not for performance but simplicity.
