ReadInt32 vs. ReadUInt32 - c#

I was tinkering with IP packet 'parsers' when I noticed something odd.
When it came to parsing the IP addresses, in C#
private uint srcAddress;
// stuff
srcAddress = (uint)(binaryReader.ReadInt32());
does the trick, so you'd think this VB.Net equivallent
Private srcAddress As UInteger
'' stuff
srcAddress = CUInt(binaryReader.ReadInt32())
would do the trick too. It doesn't. This :
srcAddress = reader.ReadUInt32()
however will.
Took some time to discover, but what have I dicovered -- if anything ? Why is this ?

VB.NET, by default, does something that C# doesn't do by default. It always check for numerical overflow. And that will trigger in your code, IP addresses whose last bit is 1 instead of 0 will produce a negative number and that cannot be converted to UInteger. A data type that can only store positive 32-bit numbers.
C# has this option too, you'd have to explicitly use the checked keyword in your code. Or use the same option that VB.NET projects have turned on by default: Project + Properties, Build tab, Advanced, tick the "Check for arithmetic overflow/underflow" checkbox. The same option in VB.NET project is named "Remove integer overflow checks", off by default.
Do note how these defaults affected the syntax of the languages as well. In C# you have to write a cast to convert a value to an incompatible value type. Not necessary in VB.NET, the runtime check keeps you out of trouble. It is very bad kind of trouble to have, overflow can produce drastically bad results. Not in your case, that happens, an IP address really is an unsigned number.
Do keep the other quirk about IP-addresses in mind, sockets were first invented on Unix machines that were powered by LSD and big-endian processors. You must generally use IPAddress.NetworkToHostOrder() to get the address in the proper order. Which only has overloads that take a signed integer type as the argument. So using ReadInt32() is actually correct, assuming it is an IPv4 address, you pass that directly to NetworkToHostOrder(). No fear of overflow.

Related

C# literal int comparison versus creating a const for comparison

I'm reviewing code review suggestions written from/to various developers and came across an interesting one.
Someone originally wrote a basic comparison in LINQ (EF to be specific):
myTable.Where(i => i.MyValue == 1);
Where 1 is an unchanging TypeId stored in the database.
The suggestion was to remove the hard coded value of 1 in favor of a const. So for example it would be rewritten as:
const int valueId = 1;
myTable.Where(i => i.MyValue == valueId);
From the suggestion point of view I get where they were coming from with the const since the code only ever needs a single copy of this value. But perhaps the compiler is smart enough to recognize that this is an unchanging value and treats it similarly.
So the question remains, does this kind of code refactor actually hold any weight other than eliminating "magic numbers"?
At this level, it is highly unlikely it matters what is produced at the compiler. It is likely the same anyway. The point is, what is safer for usage and easier to understand? Does '1' represent anything in particular except the literal value '1'? Based on the code snippet, I would guess that it does, and that is very good grounds for introducing a constant field because you now know exactly what is being checked against here.
Is this literal value '1' used in any other places that would need to be changed if the value change, for example, to '2'? If so, that also is very good grounds for introducing a constant field because now you only have to change the value in a single place, rather than search your entire code base, and most likely missing at least one instance.
Also, credit to Ixrec from Programmers, valueId is a terrible name for a constant as it does not say what the value is. A better name would be answersId, if, for example, the '1' represented answers while '0' represented questions and '2' represented comments.
First of all, both versions of the code will compile to exactly the same IL.
A constant is not a variable. Any usage of a constant is replaced at compile-time by it's value.
There are 2 advantages of using a const instead of a literal
The constant can be defined once and used in many places. So if you ever need to change the value of the constant, you only need to change it in one place*
You can give a meaningful name to the constant.
(*) Never change value of a public const field - all other assemblies using this constant will have to be recompiled to use the updated value.

FromBase64 string length must be multiple or 4 or not?

according to my understanding, a base64 encoded string (ie the output of encode) must always be a multiple of 4.
the c# Convert.FromBase64String says that its input must be a multiple of 4
However if I give it a 25 character string it doesnt complain
[convert]::FromBase64String("ei5gsIELIki+GpnPGyPVBA==")
[convert]::FromBase64String("1ei5gsIELIki+GpnPGyPVBA==")
both work. (The first one is 24 , second is 25)
[convert]::FromBase64String("11ei5gsIELIki+GpnPGyPVBA==")
fails with Invalid length exception
I assume this is a bug in the c# library but I just want to make sure - I am writing code that is sniffing strings to see if they are valid base64 strings and I want to be sure that I understand what a valid one looks like (one possible implementation was to give the string to system.convert and see if it threw - why reinvent perfectly good code)
Yes, this is a flaw (aka bug). It got started due to a perf optimization in an internal helper function named FromBase64_ComputeResultLength() which calculates the length of the byte[] result. It has this comment (edited to fit):
// For legal input, we can assume that 0 <= padding < 3. But it may be
// more for illegal input.
// We will notice it at decode when we see a '=' at the wrong place.
The "we will notice" remark is not entirely accurate, the decoder does flag an '=' if one isn't expected but it fails to check if there's one too many. Which is the case for the 25-char string.
You can report the problem at connect.microsoft.com, I don't see an existing report that resembles it. Do note that it is fairly unlikely that Microsoft can actually fix it any time soon since the change is going to break existing programs that now successfully parse bad base64 strings. It normally requires a major .NET release update to get rid of such problems, like it was done for .NET 4.0, there isn't one on the horizon afaik.
But yes, the simple workaround for you is to check if the string length is divisible by 4, use the % operator.

Decreasing value of minimum int results in maximum int

I'm currently reading through C# 4.0 In A Nutshell, and in chapter 2 it states that if an int object has a value of int.MinValue (-2147483648) and the value is decreased, as the value cannot go any lower, it goes to the highest value, int.MaxValue (2147483647):
int a = int.MinValue;
Console.WriteLine(a); //-2147483648
a = a-1;
Console.WriteLine(a); //-2147483648
Console.WriteLine (a == int.MaxValue); // True
The book then goes on to mention how you can protect from this happening using checked.
However, I'm wondering why this is possible? Surely this could cause a lot of issues if its not protected with checked?
Is it not a bit of a flaw in the design? Why would anyone want a minimum value to become a maximum value?
Furthermore, I've noticed VB.NET will throw an 'Arithmetic overflow' instead of letting this happen:
Dim a As Integer
a = Integer.MaxValue
a = a + 1
Response.Write(a) 'Arithmetic overflow exception here
Sometimes it is desired behaviour. Other times, it simply isn't necessary to deal with and check for overflows, as they slow down program execution. Take a game drawing unbound (FPS). At 1000FPS, a small 1ms extra check could have a significant affect on framerate. Other times, it may be expected execution (I've only seen one use of this to date) - it flows on from what happens in C#'s ancestor - C++ and C.
Also, please note that the default behaviour (checked or unchecked) varies between configurations and environments:
If neither checked nor unchecked is used, a constant expression uses
the default overflow checking at compile time, which is checked.
Otherwise, if the expression is non-constant, the run-time overflow
checking depends on other factors such as compiler options and
environment configuration.
Since the code you posted is non-constant (and also due to various environment variables), no checking for overflows is performed.
It is a matter of language design - why this happens has to do with the binary representation of numbers. In .NET this is done with two's complement, hence the overflow issue.
The designers of VB.NET have decided to not allow such overflow/underflow issues.
The designers of C# have decided to leave control to the programmer.
It is possible that your programm automatically check overflow: right click on project file,
select Build, than select Advanced button in right bottom corner. In the opened window check Check Arithmetical Overflow checkbutton.

Is there any plugin for VS or program to show type and value etc... of a C# code selection?

What I want to do is be told the type, value (if there is one at compile-time) and other information (I do not know what I need now) of a selection of an expression.
For example, if I have an expression like
int i = unchecked((short)0xFF);
selecting 0xFF will give me (Int32, 255), while selecting ((short)0xFF) will give me (Int16, 255), and selecting i will give me (Int32, 255).
Reason why I want such a feature is to be able to verify my assumptions. It's pretty easy to assume that 0xFF is a byte but it is actually an int. I could of course refer to the C# Language Specifications all the time, but I think it's inefficient to have to refer to it everytime I want to check something out. I could also use something like ANTLR but the learning curve is high.
I do intend to read the entire specs and learn ANTLR and about compilers, but that's for later. Right now I wish to have tools to help me get the job done quickly and accurately.
Another case in point:
int? i = 0x10;
int? j = null;
int x;
x = (i >> 4) ?? -1;//x=1
x = (j >> 4) ?? -1;//x=-1
It may seem easy to you or even natural for the bottom two lines in the code above. (Maybe one should avoid code like these, but that's another story) However, what msdn says about the null-coalescing operator is lacking information to tell me that the above code ((i>>4)??) is legal (yet it is, and it is). I had to dig into grammar in the specs to know what's happening:
null-coalescing-expression
conditional-or-expression
conditional-and-expression
exclusive-or-expression
and-expression
equality-expression
relational-expression
shift-expression
shift-expression right-shift additive-expression
... (and more)
Only after reading so much can I get a satisfactory confirmation that it is valid code and does what I think it does. There should be a much simpler way for the average programmer to verify (not about validity, but whether it behaves as thought or not, and also to satisfy my curiosity) such code without having to dive into that canonical manual. It doesn't necessary have to be a VS plugin. Any alternative that is intuitive to use will do just as well.
Well, I'm not aware of any add-ins that do what you describe - however, there is a trick you can use figure out the type of an expression (but not the compile-time value):
Assign the expression to a var variable, and hover your mouse over the keyword var.
So for example, when you write:
var i = unchecked((short)0xFF);
and then hover your mouse over the keyword var, you get a tooltip that says something like:
Struct System.Int16
Represents a 16-bit signed integer.
This is definitely a bit awkward - since you have to potentially change code to make it work. But in a pinch, it let's you get the compiler to figure out the type of an expression for you.
Keep in mind, this approach doesn't really help you once you start throwing casts into the picture. For instance:
object a = 0xFF;
var z = (string)a; // compiles but fails at runtime!
In the example above, the IDE will dutifully report that the type of var z is System.String - but this is, of course, entirely wrong.
Your question is a little vague on what you are looking for, so I don't know if "improved" intellisense solves it, but I would try the Productivity Power Tools.

c#: tryparse vs convert

Today I read an article where it's written that we should always use TryParse(string, out MMM) for conversion rather than Convert.ToMMM().
I agree with article but after that I got stuck in one scenario.
When there will always be some valid value for the string and hence we can also use Convert.ToMMM() because we don't get any exception from Convert.ToMMM().
What I would like to know here is: Is there any performance impact when we use TryParse because when I know that the out parameter is always going to be valid then we can use Convert.ToMMM() rather TryParse(string, out MMM).
What do you think?
If you know the value can be converted, just use Parse(). If you 'know' that it can be converted, and it can't, then an exception being thrown is a good thing.
EDIT: Note, this is in comparison to using TryParse or Convert without error checking. If you use either of the other methods with proper error checking then the point is moot. I'm just worried about your assumption that you know the value can be converted. If you want to skip the error checking, use Parse and die immediately on failure rather than possibly continuing and corrupting data.
When the input to TryParse/Convert.ToXXX comes from user input, I'd always use TryParse. In case of database values, I'd check why you get strings from the database (maybe bad design?). If string values can be entered in the database columns, I'd also use TryParse as you can never be sure that nobody modifies data manually.
EDIT
Reading Matthew's reply: If you are unsure and would wrap the conversion in a try-catch-block anyway, you might consider using TryParse as it is said to be way faster than doing a try-catch in that case.
There is significant difference regarding the developing approach you use.
Convert: Converting one "primitive" data in to another type and corresponding format using multiple options
Case and point - converting an integer number in to its bit by bit representation. Or hexadecimal number (as string) in to integer, etc...
Error Messages : Conversion Specific Error Message - for problems in multiple cases and at multiple stages of the conversion process.
TryParse: Error-less transfer from one data format to another. Enabling T/F control of possible or not.
Error Messages: NONE
NB: Even after passing the data in to a variable - the data passed is the default of the type we try to parse in to.
Parse: in essence taking some data in one format and transfer it in to another. No representations and nothing fancy.
Error Messages: Format-oriented
P.S. Correct me if I missed something or did not explain it well enough.

Categories

Resources