Types for large numbers - C#

I am working on an app that will need to handle very large numbers.
I checked out a few available large-number classes and found some I am happy with: one class for large integers and one for large floating point numbers.
Since some of the numbers will be small and some large, the question is whether it is worth checking the magnitude of each number: if it is small, use a regular C# int or double, and if it is large, use the classes I found. Or, since I am already using the large integer and large float classes, should I just stick with them even for smaller numbers?
My consideration is purely performance. Will I save enough time on the math for the smaller numbers to make it worthwhile to check each number as it comes in?

Really hard to tell - it depends on your third-party libraries :)
Your best bet would be to use the System.Diagnostics.Stopwatch class, run a gazillion different calculations, time them, and compare the results, I guess.
[EDIT] About the benchmarks: I'd run one series of benchmarks using your large-integer type on regular 32/64-bit values, and another series that first checks whether the number fits in the regular Int32/Int64 types (which it should), "downcasts" it to those types, and then runs the same calculations on them. From your question, that sounds like what you'll be doing if the built-in types turn out to be faster.
If your application is targeted at more people than yourself, try to run the benchmarks on different machines (single-core, multi-core, 32-bit, and 64-bit platforms), and if the platform seems to have a large impact on how long the calculations take, use some sort of strategy pattern to do the calculations differently on different machines.
Good luck :)
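A minimal Stopwatch harness along these lines might look like the following. System.Numerics.BigInteger stands in here for whatever large-number class you picked; swap in your own type:

```csharp
using System;
using System.Diagnostics;
using System.Numerics;

class AdditionBenchmark
{
    static void Main()
    {
        const int N = 10_000_000;

        long a = 12345, b = 67890, smallSum = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++)
            smallSum += a + b;                 // native 64-bit addition
        sw.Stop();
        Console.WriteLine($"long:       {sw.ElapsedMilliseconds} ms");

        BigInteger ba = 12345, bb = 67890, bigSum = 0;
        sw.Restart();
        for (int i = 0; i < N; i++)
            bigSum += ba + bb;                 // software big-number addition
        sw.Stop();
        Console.WriteLine($"BigInteger: {sw.ElapsedMilliseconds} ms");

        // Print the sums so the JIT can't dead-code-eliminate the loops.
        Console.WriteLine($"{smallSum} {bigSum}");
    }
}
```

Run it in a Release build, outside the debugger, or the timings will be misleading.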

I would expect that a decent large-number library would be able to do this optimization on its own...

I would say yes, the check will more than pay for itself, as long as you have enough values within the regular range.
The logic is simple: an integer addition is one machine instruction; combined with the comparison, that's three or four instructions. Any software implementation of the same operation will almost certainly be much slower.
Optimally, this check would be done inside the LargeNumber libraries themselves. If they don't do it, you may need a wrapper so the checks aren't scattered all over the place; but then you need to account for the additional cost of the wrapper as well.
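A sketch of what such a wrapper could look like (the SmallOrBig name and the use of BigInteger as the large type are illustrative, not from any particular library):

```csharp
using System;
using System.Numerics;

// Illustrative only: keep a plain long while the value fits,
// fall back to BigInteger when it doesn't.
public readonly struct SmallOrBig
{
    private readonly long _small;
    private readonly BigInteger _big;
    private readonly bool _isSmall;

    public SmallOrBig(BigInteger value)
    {
        _isSmall = value >= long.MinValue && value <= long.MaxValue;
        _small = _isSmall ? (long)value : 0;
        _big = _isSmall ? default : value;
    }

    public BigInteger Value => _isSmall ? _small : _big;

    public static SmallOrBig operator +(SmallOrBig x, SmallOrBig y)
    {
        if (x._isSmall && y._isSmall)
        {
            // Fast path: one machine add, plus an overflow check.
            try { return new SmallOrBig(checked(x._small + y._small)); }
            catch (OverflowException) { /* fall through to the slow path */ }
        }
        return new SmallOrBig(x.Value + y.Value);
    }
}
```

Note that catching OverflowException is itself expensive when it actually fires; if overflow is common in your data, a range pre-check is cheaper than try/catch.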

I worked on a project where the same fields needed to handle very large numbers and, at the same time, preserve precision for very small numbers.
We ended up storing two fields (mantissa and exponent) for every number of that kind.
We made a class for the mantissa/exponent calculations and it performed well.
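The mantissa/exponent approach might look roughly like this (an illustrative sketch, not the actual class from that project):

```csharp
using System;

// Sketch: value = Mantissa * 10^Exponent. Multiplication multiplies the
// mantissas, adds the exponents, then renormalises so the mantissa stays
// in range. A real implementation would use a wider intermediate type.
public struct ScaledNumber
{
    public long Mantissa;
    public int Exponent;

    public static ScaledNumber Multiply(ScaledNumber a, ScaledNumber b)
    {
        var r = new ScaledNumber
        {
            Mantissa = a.Mantissa * b.Mantissa,  // may overflow; see note above
            Exponent = a.Exponent + b.Exponent
        };
        while (Math.Abs(r.Mantissa) >= 1_000_000_000_000_000L)
        {
            r.Mantissa /= 10;   // drop the least significant digit
            r.Exponent++;
        }
        return r;
    }
}
```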


Best List Capacity For a Known Distribution

Is there a best algorithm for defining the capacity of a C# list in the constructor, if the general distribution of eventual sizes is known?
As a concrete example, if the numbers of values to be placed in each list has a mean of 500, and a standard deviation of 50, with approximately a normal distribution what is the best initial capacity for the list in terms of memory consumption?
Leave it to the list to decide. I wouldn't bother setting the capacity (just use the empty constructor) unless you experience concrete performance problems, at which point there are probably other things you can fix first.
Premature optimisation is the root of all evil.
This is personal opinion, rather than research-based, but remember that a List itself only holds the reference to each object, and therefore it's probably better to err a little on the side of allocating space for a few too many references than to accidentally double the number of references you need. With that in mind, a full two or even three standard deviations extra (600 or 650) is probably not out of line. But, again, that's my opinion rather than a researched result.
If you go with the three sigma rule, http://en.wikipedia.org/wiki/68-95-99.7_rule states if you account for 3 standard deviations, a single sample will be within that range 99.7% of the time.
I've done a little research and it seems that there is a "right" answer to this question.
First of all I agree that this can be premature optimisation, so profiling before deciding to switch is essential.
The graph above was generated in Excel using a normal distribution, testing the space wasted by various initial list capacities, with 10,000 samples and a mean of 10,000. As you can see, it has several interesting features:
For low standard deviations, picking a bad initial capacity can waste up to eight times the space of the best choice.
For high standard deviations relative to the mean, smaller savings are possible.
Troughs, corresponding to the lowest memory wastage, occur at points dependent on the standard deviation.
It is better to choose a value from the right half of the graph to avoid list reallocations.
I couldn't find an exact formula for the minimum wastage, but mean + 1.75 x standard deviation seems to be the best choice based on this analysis.
Caveat: YMMV with other distributions, means etc.
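As a sketch, the heuristic boils down to a one-liner (assuming you trust the mean + 1.75 × standard deviation rule above; the factory name is just illustrative):

```csharp
using System;
using System.Collections.Generic;

static class ListFactory
{
    // mean + 1.75 * stdDev lands near the memory-wastage trough in the
    // analysis above; for mean = 500 and stdDev = 50 this gives 588.
    public static List<T> WithExpectedSize<T>(double mean, double stdDev)
    {
        int capacity = (int)Math.Ceiling(mean + 1.75 * stdDev);
        return new List<T>(capacity);
    }
}
```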
There's no right answer. It's going to be a tradeoff between memory usage and CPU. The larger you make the initial capacity, the more memory you're probably wasting, but you're saving CPU since the list doesn't have to be resized again later.

Is floating-point math consistent in C#? Can it be?

No, this is not another "Why is (1/3.0)*3 != 1" question.
I've been reading about floating-points a lot lately; specifically, how the same calculation might give different results on different architectures or optimization settings.
This is a problem for video games which store replays, or are peer-to-peer networked (as opposed to server-client), which rely on all clients generating exactly the same results every time they run the program - a small discrepancy in one floating-point calculation can lead to a drastically different game-state on different machines (or even on the same machine!)
This happens even amongst processors that "follow" IEEE-754, primarily because some processors (namely x86) use double extended precision. That is, they use 80-bit registers to do all the calculations, then truncate to 64- or 32-bits, leading to different rounding results than machines which use 64- or 32- bits for the calculations.
I've seen several solutions to this problem online, but all for C++, not C#:
Disable double extended-precision mode (so that all double calculations use IEEE-754 64-bits) using _controlfp_s (Windows), _FPU_SETCW (Linux?), or fpsetprec (BSD).
Always run the same compiler with the same optimization settings, and require all users to have the same CPU architecture (no cross-platform play). Because my "compiler" is actually the JIT, which may optimize differently every time the program is run, I don't think this is possible.
Use fixed-point arithmetic, and avoid float and double altogether. decimal would work for this purpose, but would be much slower, and none of the System.Math library functions support it.
So, is this even a problem in C#? What if I only intend to support Windows (not Mono)?
If it is, is there any way to force my program to run at normal double-precision?
If not, are there any libraries that would help keep floating-point calculations consistent?
I know of no way to make normal floating point operations deterministic in .NET. The JITter is allowed to generate code that behaves differently on different platforms (or between different versions of .NET). So using normal floats in deterministic .NET code is not possible.
The workarounds I considered:
Implement FixedPoint32 in C#. While this is not too hard (I have a half-finished implementation), the very small range of values makes it annoying to use. You have to be careful at all times so you neither overflow nor lose too much precision. In the end I found this no easier than using integers directly.
Implement FixedPoint64 in C#. I found this rather hard to do. For some operations, intermediate 128-bit integers would be useful, but .NET doesn't offer such a type.
Implement a custom 32-bit floating point type. The lack of a BitScanReverse intrinsic causes a few annoyances here, but currently I think this is the most promising path.
Use native code for the math operations. Incurs the overhead of a delegate call on every math operation.
I've just started a software implementation of 32-bit floating point math. It can do about 70 million additions/multiplications per second on my 2.66 GHz i3.
https://github.com/CodesInChaos/SoftFloat . Obviously it's still very incomplete and buggy.
The C# specification (§4.1.6 Floating point types) specifically allows floating point computations to be done using precision higher than that of the result. So, no, I don't think you can make those calculations deterministic directly in .NET. Others have suggested various workarounds, so you could try them.
The following page may be useful in the case where you need absolute portability of such operations. It discusses software for testing implementations of the IEEE 754 standard, including software for emulating floating point operations. Most information is probably specific to C or C++, however.
http://www.math.utah.edu/~beebe/software/ieee/
A note on fixed point
Binary fixed point numbers can also work well as a substitute for floating point, as is evident from the four basic arithmetic operations:
Addition and subtraction are trivial. They work the same way as integers. Just add or subtract!
To multiply two fixed point numbers, multiply the two numbers then shift right the defined number of fractional bits.
To divide two fixed point numbers, shift the dividend left the defined number of fractional bits, then divide by the divisor.
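Assuming a Q16.16 layout (16 integer bits, 16 fractional bits; the layout and names are illustrative), those rules translate into a few lines of C#:

```csharp
// Q16.16 fixed point stored in a plain int.
static class Fixed
{
    public const int FracBits = 16;

    // Addition and subtraction are just + and - on the raw ints.

    public static int Mul(int a, int b)
    {
        // Widen so the 32x32 product doesn't overflow, then shift right
        // by the number of fractional bits.
        return (int)(((long)a * b) >> FracBits);
    }

    public static int Div(int a, int b)
    {
        // Shift the dividend left by the fractional bits, then divide.
        return (int)(((long)a << FracBits) / b);
    }
}
```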
Chapter four of Hattangady (2007) has additional guidance on implementing binary fixed point numbers (S.K. Hattangady, "Development of a Block Floating Point Interval ALU for DSP and Control Applications", Master's thesis, North Carolina State University, 2007).
Binary fixed point numbers can be implemented on any integer data type such as int, long, and BigInteger, and the non-CLS-compliant types uint and ulong.
As suggested in another answer, you can use lookup tables, where each element in the table is a binary fixed point number, to help implement complex functions such as sine, cosine, square root, and so on. If the lookup table is less granular than the fixed point number, it is suggested to round the input by adding one half of the granularity of the lookup table to the input:
// Assume each input has a 12-bit fractional part (1/4096), while each
// entry in the lookup table has an 8-bit fractional part (1/256).
input += 1 << 3;   // add half of 2^4 so the shift rounds instead of truncating
input >>= 4;       // shift right by 4 to go from 12 to 8 fractional bits
// --- clamp or restrict input to the table bounds here ---
return lookupTable[input];
Is this a problem for C#?
Yes. Different architectures are the least of your worries; different framerates and the like can lead to deviations due to inaccuracies in float representations, even when the inaccuracies are the same (e.g. the same architecture, except a slower GPU on one machine).
Can I use System.Decimal?
There is no reason you can't, however it's dog slow.
Is there a way to force my program to run in double precision?
Yes. Host the CLR runtime yourself, and compile all the necessary calls/flags (those that change the behaviour of floating point arithmetic) into the C++ host application before calling CorBindToRuntimeEx.
Are there any libraries that would help keep floating point calculations consistent?
Not that I know of.
Is there another way to solve this?
I have tackled this problem before. The idea is to use QNumbers: fixed-point reals, but fixed point in base 2 (binary) rather than base 10 (decimal). Because of this, the mathematical primitives on them (add, sub, mul, div) are much faster than naive base-10 fixed point, especially if n (the number of fractional bits) is the same for both values, which in your case it would be. Furthermore, because they are integral, they have well-defined results on every platform.
Keep in mind that framerate can still affect these, but it is not as bad and is easily rectified using synchronisation points.
Can I use more mathematical functions with QNumbers?
Yes, round-trip through a decimal to do this. Furthermore, you should really be using lookup tables for the trig functions (sin, cos), as those can give genuinely different results on different platforms; and if you code them correctly they can consume QNumbers directly.
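One possible shape for such a table (an illustrative sketch; for strict cross-platform determinism the table should be generated once and shipped as constant data rather than rebuilt with Math.Sin on each machine):

```csharp
using System;

static class FixedTrig
{
    const int TableSize = 1024;                  // entries covering one full turn
    static readonly int[] SinTable = BuildTable();

    static int[] BuildTable()
    {
        // Determinism caveat: calling Math.Sin here means each machine builds
        // its own table; ship the values as constants to be safe.
        var t = new int[TableSize];
        for (int i = 0; i < TableSize; i++)
            t[i] = (int)Math.Round(Math.Sin(2 * Math.PI * i / TableSize) * (1 << 16));
        return t;
    }

    // angle is a Q16.16 fraction of a full turn: 0 .. (1 << 16) - 1.
    public static int Sin(int angle)
    {
        int index = (angle * TableSize) >> 16;   // map turn fraction to index
        return SinTable[index & (TableSize - 1)];
    }
}
```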
According to this slightly old MSDN blog entry the JIT will not use SSE/SSE2 for floating point, it's all x87. Because of that, as you mentioned you have to worry about modes and flags, and in C# that's not possible to control. So using normal floating point operations will not guarantee the exact same result on every machine for your program.
To get precise reproducibility of double precision you are going to have to do software floating point (or fixed point) emulation. I don't know of C# libraries to do this.
Depending on the operations you need, you might be able to get away with single precision. Here's the idea:
store all values you care about in single precision
to perform an operation:
expand inputs to double precision
do operation in double precision
convert result back to single precision
The big issue with x87 is that calculations might be done in 53-bit or 64-bit accuracy depending on the precision flag and whether the register spilled to memory. But for many operations, performing the operation in high precision and rounding back to lower precision will guarantee the correct answer, which implies that the answer will be guaranteed to be the same on all systems. Whether you get the extra precision won't matter, since you have enough precision to guarantee the right answer in either case.
Operations that should work in this scheme: addition, subtraction, multiplication, division, sqrt. Things like sin, exp, etc. won't work (results will usually match but there is no guarantee). "When is double rounding innocuous?" ACM Reference (paid reg. req.)
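The scheme above might look like this in C# (a sketch; the explicit casts force the intermediate into double and the result back to float):

```csharp
using System;

static class DeterministicFloat
{
    // Each operation is performed in double and explicitly rounded back to
    // float; for these operations, rounding through double still yields the
    // correctly rounded single-precision result.
    public static float Add(float a, float b) => (float)((double)a + (double)b);
    public static float Sub(float a, float b) => (float)((double)a - (double)b);
    public static float Mul(float a, float b) => (float)((double)a * (double)b);
    public static float Div(float a, float b) => (float)((double)a / (double)b);
    public static float Sqrt(float a)         => (float)Math.Sqrt(a);
}
```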
Hope this helps!
As already stated by other answers:
Yes, this is a problem in C# - even when staying pure Windows.
As for a solution:
You can reduce the problem (and, with some effort and a performance hit, avoid it completely) if you use the built-in BigInteger class and scale all calculations to a defined precision, using a common denominator for any calculation/storage of such numbers.
As requested by OP - regarding performance:
System.Decimal represents a number with 1 sign bit, a 96-bit integer, and a "scale" (representing where the decimal point is). For every calculation it must operate on this data structure and can't use the floating point instructions built into the CPU.
The BigInteger "solution" does something similar, except that you can define how many digits you need/want... perhaps you want only 80 bits, or 240 bits, of precision.
The slowness always comes from having to simulate all operations on these numbers via integer-only instructions, without using the CPU/FPU built-in instructions, which in turn leads to many more instructions per mathematical operation.
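You can see the System.Decimal layout directly with decimal.GetBits:

```csharp
using System;

class DecimalLayout
{
    static void Main()
    {
        int[] bits = decimal.GetBits(-123.45m);        // { lo, mid, hi, flags }
        bool isNegative = bits[3] < 0;                 // sign is the top bit of the flags word
        int scale = (bits[3] >> 16) & 0xFF;            // power of ten dividing the 96-bit integer
        Console.WriteLine($"int96 = {bits[2]:X8}{bits[1]:X8}{bits[0]:X8}, " +
                          $"negative = {isNegative}, scale = {scale}");
        // -123.45m is stored as the integer 12345 with scale 2 and the sign bit set.
    }
}
```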
To lessen the performance hit there are several strategies, like QNumbers (see the answer from Jonathan Dickinson - Is floating-point math consistent in C#? Can it be?) and/or caching (e.g. of trig calculations), etc.
Well, here would be my first attempt on how to do this:
Create an ATL DLL project with a simple object in it to be used for your critical floating point operations. Make sure to compile it with flags that disable the use of any non-x87 hardware for floating point.
Create functions that call floating point operations and return the results; start simple and then if it's working for you, you can always increase the complexity to meet your performance needs later if necessary.
Put the _controlfp calls around the actual math to ensure that it's done the same way on all machines.
Reference your new library and test to make sure it works as expected.
(I believe you can just compile to a 32-bit .dll and then use it with either x86 or AnyCpu [or likely only targeting x86 on a 64-bit system; see comment below].)
Then, assuming it works, should you want to use Mono, I imagine you should be able to replicate the library on other x86 platforms in a similar manner (not with COM, of course; although perhaps with Wine? A little out of my area once we go there, though...).
Assuming you can make it work, you should be able to set up custom functions that can do multiple operations at once to fix any performance issues, and you'll have floating point math that allows you to have consistent results across platforms with a minimal amount of code written in C++, and leaving the rest of your code in C#.
I'm not a game developer, though I do have a lot of experience with computationally difficult problems ... so, I'll do my best.
The strategy I would adopt is essentially this:
Use a slower (if necessary; if there's a faster way, great!), but predictable method to get reproducible results
Use double for everything else (eg, rendering)
The short and long of it is: you need to find a balance. If you're spending 30 ms rendering (~33 fps) and only 1 ms doing collision detection (or insert some other highly sensitive operation), then even if you triple the time the critical arithmetic takes, the impact on your framerate is a drop from 33.3 fps to 30.3 fps.
I suggest you profile everything, account for how much time is spent doing each of the noticeably expensive calculations, then repeat the measurements with 1 or more methods of resolving this problem and see what the impact is.
Checking the links in the other answers makes it clear you'll never have a guarantee of whether floating point is "correctly" implemented or whether you'll always receive a certain precision for a given calculation, but perhaps you could make a best effort by (1) truncating all calculations to a common minimum (e.g., if different implementations will give you 32 to 80 bits of precision, always truncating every operation to 30 or 31 bits), and (2) keeping a table of a few test cases at startup (borderline cases of add, subtract, multiply, divide, sqrt, cosine, etc.) and, if the implementation calculates values matching the table, not bothering to make any adjustments.
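The truncation idea in (1) can be sketched by masking off low-order mantissa bits after each operation. The helper name and the default number of bits dropped below are illustrative, not from any standard API:

```csharp
using System;

static class ReducedPrecision
{
    // Drop the lowest 'bitsToDrop' mantissa bits of a double so that
    // implementations differing only in low-order rounding agree.
    // (Sketch only: NaN/Infinity are not handled specially.)
    public static double Truncate(double value, int bitsToDrop = 22)
    {
        long bits = BitConverter.DoubleToInt64Bits(value);
        long mask = ~((1L << bitsToDrop) - 1);
        return BitConverter.Int64BitsToDouble(bits & mask);
    }
}
```

Every arithmetic result would be passed through Truncate before feeding the next operation, trading a little precision for cross-platform agreement.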
Your question is quite difficult and technical stuff O_o. However, I may have an idea.
You surely know that the CPU makes some rounding adjustment after any floating-point operation.
And the CPU offers several different instructions which perform different rounding operations.
So for an expression, your compiler will choose a set of instructions which leads you to a result. But any other instruction sequence, even if it is intended to compute the same expression, can produce another result.
The 'mistakes' made by a rounding adjustment grow with each further instruction.
As an example we can say that at the assembly level: a * b * c is not equivalent to a * c * b.
I'm not entirely sure of that; you will need to ask someone who knows CPU architecture a lot better than me :p
However, to answer your question: in C or C++ you can solve your problem because you have some control over the machine code generated by your compiler, but in .NET you don't have any. So as long as your machine code can be different, you'll never be sure about the exact result.
I'm curious in what way this can be a problem, because the variation seems very minimal, but if you need really accurate operations the only solution I can think of is to increase the size of your floating-point registers. Use double precision, or even long double if you can (not sure that's possible using the CLI).
I hope I've been clear enough, I'm not perfect in English (...at all :s)

Does it really matter to distinguish between short, int, long?

In my C# app, I would like to know whether it is really important to use short for smaller numbers, int for bigger etc. Does the memory consumption really matter?
Unless you are packing large numbers of these together in some kind of structure, it will probably not affect the memory consumption at all. The best reason to use a particular integer type is compatibility with an API. Other than that, just make sure the type you pick has enough range to cover the values you need. Beyond that for simple local variables, it doesn't matter much.
The simple answer is that it's not really important.
The more complex answer is that it depends.
Obviously you need to choose a type that will hold your data structure without overflowing, and even if you're only storing smaller numbers, choosing int is probably the most sensible thing to do.
However, if your application loads a lot of data or runs on a device with limited memory then you might need to choose short for some values.
For C# apps that aren't trying to mirror some sort of structure from a file, you're better off using ints or whatever your native format is. The only other time it might matter is if using arrays on the order of millions of entries. Even then, I'd still consider ints.
Only you can be the judge of whether the memory consumption really matters to you. In most situations it won't make any discernible difference.
In general, I would recommend using int/Int32 where you can get away with it. If you really need to use short, long, byte, uint etc in a particular situation then do so.
This is entirely relative to the amount of memory you can afford to waste. If you aren't sure, it probably doesn't matter.
The answer is: it depends. The question of whether memory matters is entirely up to you. If you are writing a small application that has minimal storage and memory requirements, then no. If you are google, storing billions and billions of records on thousands of servers, then every byte can cost some real money.
There are a few cases where I really bother choosing.
When I have memory limitations
When I do bitshift operations
When I care about x86/x64 portability
Every other case is int all the way
Edit : About x86/x64
In C#, an int is always 32 bits (System.Int32), on x86 and x64 alike; what changes between architectures is the size of native types, such as pointers and the integer types of the APIs you interop with.
If you write interop code assuming native sizes and move from one architecture to another, it might lead to problems. For example, a 32-bit API exports a pointer-sized value; you cast it to an int and everything is fine. But when you move to x64, all hell breaks loose.
Those native sizes are defined by your architecture, so when you change architecture you need to be aware that it might lead to potential problems.
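For the record, in C# sizeof(int) is 4 on every platform; it is the pointer-sized types like IntPtr that change between a 32-bit and a 64-bit process. A quick check:

```csharp
using System;

class IntSizeCheck
{
    static void Main()
    {
        Console.WriteLine(sizeof(int)); // always 4, on x86 and x64 alike
        Console.WriteLine(IntPtr.Size); // 4 in a 32-bit process, 8 in a 64-bit one
    }
}
```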
That all depends on how you are using them and how many you have. Even if you only have a few in memory at a time - this might drive the data type in your backing store.
Memory consumption based on the type of integers you are storing is probably not an issue in a desktop or web app. In a game or a mobile device app, it may be more of an issue.
However, the real reason to differentiate between the types is the kind of numbers you need to store. If you have really big numbers, or high precision, you may need to use long to store it.
The context of the situation is very important here. You don't need to take a guess at whether it is important or not though, we are dealing with quantifiable things here. We know that we are saving 2 bytes by using a short instead of an int.
What do you estimate the largest number of instances are going to be in memory at a given point in time? If there are a million then you are saving ~2 MB of RAM. Is that a large amount of RAM? Again, it depends on the context; if the app is running on a desktop with 4 GB of RAM you probably don't care too much about the 2 MB.
If there will be hundreds of millions of instances in memory the savings are going to get pretty big, but if that is the case you may just not have enough ram to deal with it and you may have to store this structure on disk and work with parts of it at a time.
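To make the arithmetic above concrete, here is a quick way to see the per-instance saving. The struct names are made up for illustration, and Pack = 1 suppresses padding so the raw field sizes show through:

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct RecordWithInts { public int Id; public int Count; }       // 8 bytes

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct RecordWithShorts { public short Id; public short Count; } // 4 bytes

class SizeDemo
{
    static void Main()
    {
        Console.WriteLine(Marshal.SizeOf<RecordWithInts>());   // 8
        Console.WriteLine(Marshal.SizeOf<RecordWithShorts>()); // 4
        // A million instances: ~8 MB vs ~4 MB, before array headers.
    }
}
```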
Int32 will be fine for almost anything. Exceptions include:
if you have specific needs where a different type is clearly better. Example: if you're writing a 16 bit emulator, Int16 (aka: short) would probably be better to represent some of the internals
when an API requires a certain type
one time, I had an invalid int cast and Visual Studio's first suggestion was to verify my value was less than infinity. I couldn't find a good type for that without using the pre-defined constants, so i used ulong since that was the closest I could come in .NET 2.0 :)

Why use flags+bitmasks rather than a series of booleans?

Given a case where I have an object that may be in one or more true/false states, I've always been a little fuzzy on why programmers frequently use flags+bitmasks instead of just using several boolean values.
It's all over the .NET framework. Not sure if this is the best example, but the .NET framework has the following:
[Flags]
public enum AnchorStyles
{
    None = 0,
    Top = 1,
    Bottom = 2,
    Left = 4,
    Right = 8
}
So given an anchor style, we can use bitmasks to figure out which of the states are selected. However, it seems like you could accomplish the same thing with an AnchorStyle class/struct with bool properties defined for each possible value, or an array of individual enum values.
Of course the main reason for my question is that I'm wondering if I should follow a similar practice with my own code.
So, why use this approach?
Less memory consumption? (it doesn't seem like it would consume less than an array/struct of bools)
Better stack/heap performance than a struct or array?
Faster compare operations? Faster value addition/removal?
More convenient for the developer who wrote it?
It was traditionally a way of reducing memory usage. So, yes, it's quite obsolete in C# :-)
As a programming technique, it may be obsolete in today's systems, and you'd be quite alright to use an array of bools, but...
It is fast to compare values stored as a bitmask. Use the AND and OR logic operators and compare the resulting 2 ints.
It uses considerably less memory. Putting all 4 of your example values in a bitmask would use half a byte. Using an array of bools would most likely use a few bytes of overhead for the array object plus a full byte for each bool. If you have to store a million values, you'll see exactly why the bitmask version is superior.
It is easier to manage: you only have to deal with a single integer value, whereas an array of bools would store quite differently in, say, a database.
And, because of the memory layout, much faster in every aspect than an array. It's nearly as fast as using a single 32-bit integer. We all know that is as fast as you can get for operations on data.
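The packed layout described above is exactly what System.Collections.BitArray gives you in .NET, if you'd rather not do the masking by hand:

```csharp
using System;
using System.Collections;

class BitArrayDemo
{
    static void Main()
    {
        // One million flags stored 8 per byte: roughly 125 KB of storage
        // instead of ~1 MB for a bool[] of the same length.
        var flags = new BitArray(1_000_000);
        flags[42] = true;
        Console.WriteLine(flags[42]); // True
        Console.WriteLine(flags[43]); // False
    }
}
```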
Easy to set multiple flags in any order.
Easy to save and load a series of bits like 0101011 to the database.
Among other things, it's easier to add new bit meanings to a bitfield than to add new boolean values to a class. It's also easier to copy a bitfield from one instance to another than a series of booleans.
It can also make method signatures clearer. Imagine a method with 10 bool parameters vs. one bitmask.
Actually, it can have better performance, mainly if your enum is backed by a byte.
In that extreme case, each enum value is a single byte that can represent any of the up-to-256 combinations of 8 flags, whereas 8 separate booleans would take 8 bytes.
But, even then, I don't think that is the real reason. The reason I prefer those is the power C# gives me to handle those enums. I can add several values with a single expression. I can remove them also. I can even compare several values at once with a single expression using the enum. With booleans, code can become, let's say, more verbose.
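A sketch of those single-expression operations, using the AnchorStyles values from the question (redeclared locally so the snippet stands alone):

```csharp
using System;

[Flags]
enum AnchorStyles { None = 0, Top = 1, Bottom = 2, Left = 4, Right = 8 }

class FlagsDemo
{
    static void Main()
    {
        // Add several values with a single expression.
        var style = AnchorStyles.Top | AnchorStyles.Left | AnchorStyles.Right;

        // Remove one.
        style &= ~AnchorStyles.Right;

        // Compare several values at once.
        bool topLeft = (style & (AnchorStyles.Top | AnchorStyles.Left))
                       == (AnchorStyles.Top | AnchorStyles.Left);

        Console.WriteLine(style);   // Top, Left
        Console.WriteLine(topLeft); // True
    }
}
```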
From a domain model perspective, it just models reality better in some situations. If you have three booleans like AccountIsInDefault and IsPreferredCustomer and RequiresSalesTaxState, then it doesn't make sense to add them to a single Flags-decorated enumeration, because they are not three distinct values for the same domain model element.
But if you have a set of booleans like:
[Flags] enum AccountStatus { AccountIsInDefault = 1,
    AccountOverdue = 2, AccountFrozen = 4 }
or
[Flags] enum CargoState { ExceedsWeightLimit = 1,
    ContainsDangerousCargo = 2, IsFlammableCargo = 4,
    ContainsRadioactive = 8 }
Then it is useful to be able to store the total state of the Account, (or the cargo) in ONE variable... that represents ONE Domain Element whose value can represent any possible combination of states.
Raymond Chen has a blog post on this subject.
Sure, bitfields save data memory, but you have to balance it against the cost in code size, debuggability, and reduced multithreading.
As others have said, its time is largely past. It's tempting to still do it, because bit fiddling is fun and cool-looking, but it's no longer more efficient, it has serious drawbacks in terms of maintenance, it doesn't play nicely with databases, and unless you're working in an embedded world, you have enough memory.
I would suggest never using enum flags unless you are dealing with some pretty serious memory limitations (not likely). You should always write code optimized for maintenance.
Having several boolean properties makes it easier to read and understand the code, change the values, and provide Intellisense comments not to mention reduce the likelihood of bugs. If necessary, you can always use an enum flag field internally, just make sure you expose the setting/getting of the values with boolean properties.
Space efficiency - 1 bit
Time efficiency - bit comparisons are handled quickly by hardware.
Language independence - where the data may be handled by a number of different programs you don't need to worry about the implementation of booleans across different languages/platforms.
Most of the time, these are not worth the tradeoff in terms of maintenance. However, there are times when it is useful:
Network protocols - there will be a big saving in reduced size of messages
Legacy software - once I had to add some information for tracing into some legacy software.
Cost to modify the header: millions of dollars and years of effort.
Cost to shoehorn the information into 2 bytes in the header that weren't being used: 0.
Of course, there was the additional cost in the code that accessed and manipulated this information, but these were done by functions anyways so once you had the accessors defined it was no less maintainable than using Booleans.
I have seen answers citing time efficiency and compatibility. Those are the reasons, but I do not think it has been explained why these are sometimes necessary in times like ours. From all the answers, and from chatting with other engineers, I have seen this pictured as some sort of quirky old-time way of doing things that should just die because the new ways of doing things are better.
Yes, in very rare cases you may want to do it the "old way" for performance's sake, like the classic million-iteration loop, but I say that is the wrong perspective on things.
While it is true that you should NOT care at all and use whatever the C# language throws at you as the new right-way™ to do things (enforced by some fancy AI code analysis slapping you whenever you do not meet its code style), you should understand deeply that low-level strategies aren't there randomly. Even more, in many cases they are the only way to solve things when you have no help from a fancy framework. Your OS, your drivers, and even .NET itself (especially the garbage collector) are built using bitfields and transactional instructions. Your CPU instruction set is itself a very complex bitfield, so JIT compilers encode their output using complex bit processing and a few hardcoded bitfields so that the CPU can execute it correctly.
When we talk about performance, these things have a much larger impact than people imagine, today more than ever, especially when you start considering multicore.
When multicore systems started to become common, all CPU manufacturers began mitigating the issues of SMP with dedicated transactional memory-access instructions. While these were made specifically to mitigate the near-impossible task of making multiple CPUs cooperate at the kernel level without a huge drop in performance, they also provide additional benefits, like an OS-independent way to speed up the low-level parts of most programs. Basically, your program can use CPU-assisted instructions to perform changes to integer-sized memory locations: a read-modify-write where the "modify" part can be anything you want, though the most common patterns are combinations of set/clear/increment.
Usually the CPU simply monitors whether any other CPU is accessing the same memory location; if contention happens, it stops the operation from being committed to memory and signals the event to the application within the same instruction. This may seem a trivial task, but superscalar CPUs (each core has multiple ALUs, allowing instruction parallelism), multi-level caches (some private to each core, some shared by a cluster of cores), and Non-Uniform Memory Access systems (check the Threadripper CPUs) make it difficult to keep everything coherent. Luckily, the smartest people in the world work on boosting performance while keeping all of this correct: today's CPUs dedicate a large number of transistors to this task so that caches and our read-modify-write transactions work correctly.
C# exposes the most common transactional memory-access patterns through the Interlocked class (it is only a limited set; for example, a very useful clear-mask-and-increment is missing, but you can always use CompareExchange instead, which gets very close to the same performance).
To achieve the same result using an array of booleans, you must use some sort of lock, and under contention a lock is several orders of magnitude less performant than the atomic instructions.
Here are some examples of highly valuable hardware-assisted transactional access using bitfields, which would require a completely different strategy without them (of course these are outside the scope of C#):
Assume a DMA peripheral with a set of DMA channels, say 20 (though any number up to the bit width of the interlocked integer will do). Any peripheral's interrupt might execute at any time, including your beloved OS's, and from any core of your 32-core latest-gen CPU; when a peripheral wants a DMA channel, you allocate one (assign it to the peripheral) and use it. A bitfield covers all those requirements and needs just a dozen instructions to perform the allocation, all inlineable within the requesting code. Basically you cannot go faster than this, and your code is just a few functions; we delegate the hard part of the problem to the hardware. Constraints: bitfield only.
Assume a peripheral that requires some working space in normal RAM to perform its duty, for example a high-speed I/O peripheral that uses scatter-gather DMA. In short, it uses fixed-size blocks of RAM populated with descriptions of the next transfer (by the way, the descriptor is itself made of bitfields), chained one to another to create a FIFO queue of transfers in RAM. The application prepares the descriptors first and then chains them to the tail of the current transfers without ever pausing the controller (not even disabling interrupts). The allocation/deallocation of such descriptors can be done using a bitfield and transactional instructions, so even when the queue is shared between different CPUs, and between the driver interrupt and the kernel, everything still works without conflicts. One usage case: the kernel atomically allocates descriptors without stopping or disabling interrupts and without additional locks (the bitfield itself is the lock), and the interrupt deallocates them when the transfer completes.
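In C# terms, the channel-allocation pattern from the first example looks roughly like this CompareExchange loop. The class and method names are invented for the sketch; BitOperations.TrailingZeroCount needs .NET Core 3.0+, otherwise a small scan loop does the same job:

```csharp
using System.Numerics;
using System.Threading;

static class ChannelPool
{
    const int ChannelCount = 20;
    static int _busy; // bit n set == channel n taken; shared by all threads

    // Atomically claim the first free channel, or return -1 if none is free.
    public static int Allocate()
    {
        while (true)
        {
            int old = Volatile.Read(ref _busy);
            int free = ~old & ((1 << ChannelCount) - 1);
            if (free == 0) return -1;                  // all channels busy
            int bit = BitOperations.TrailingZeroCount(free);
            // Commit only if nobody raced us; otherwise retry.
            if (Interlocked.CompareExchange(ref _busy, old | (1 << bit), old) == old)
                return bit;
        }
    }

    // Atomically release a channel (no lock: the bitfield itself is the lock).
    public static void Release(int bit)
    {
        while (true)
        {
            int old = Volatile.Read(ref _busy);
            if (Interlocked.CompareExchange(ref _busy, old & ~(1 << bit), old) == old)
                return;
        }
    }
}
```

With a bool[20] you would need a lock around the scan-and-set; here, contention simply retries the CAS.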
Most old strategies were to preallocate the resources and force the application to free them after usage.
If you ever need multitasking on steroids, C# lets you use Threads + Interlocked, but lately C# introduced lightweight Tasks; guess how they are built? Transactional memory access using the Interlocked class. So you likely do not need to reinvent the wheel; all of the low-level part is already covered and well engineered.
So the idea is: let smart people (not me, I am a common developer like you) solve the hard part for you, and just enjoy a general-purpose computing platform like C#. If you still see some remnants of these parts, it is because someone may still need to interface with worlds outside .NET and access some driver or system call, for example one requiring you to know how to build a descriptor and put each bit in the right place. Don't be mad at those people; they made our jobs possible.
In short: Interlocked + bitfields. Incredibly powerful; you'll rarely need to use it directly.
It is for speed and efficiency. Essentially all you are working with is a single int.
if ((flags & AnchorStyles.Top) == AnchorStyles.Top)
{
    // Do stuff
}
