C++ vs. C# calculation (wrong result) - c#

I have a little problem. I have this C++ calculation:
#include <iostream>
#include <cstdio>   // gets_s (MSVC-specific)
#include <cstring>  // strlen
#include <cmath>    // pow
#include <cstdlib>  // system
using namespace std;

int main()
{
    // pass is a local here; the original had it as a parameter of main,
    // which is not a valid signature.
    unsigned long pass;
    char name[50];
    cout << "Enter username please: " << endl << endl;
    gets_s(name);
    cout << "\n";
    pass = strlen(name);
    pass = pass * 3;
    pass = pass << 2;
    pass = pow(pass, 3.0);
    pass = pass + 23;
    pass = pass + (pass * 708224);
    cout << "Your generated serial: " << +pass << endl << endl;
    system("pause");
}
This gives me the working code for a 3-char username.
This is my C# calculation:
private void btn_generate_Click(object sender, EventArgs e)
{
    pass = txt_user.TextLength;
    pass = pass * 3;
    pass = pass << 2;
    pass = pass * pass * pass;
    pass = pass + 23;
    pass = pass + (pass * 708224);
    txt_serial.Text = pass.ToString();
}
This gives me the wrong code for the exact same username.
What is strange is that the calculation in both gives me the same result until this line:
pass = pass + (pass * 708224);
After this calculation, C# gives me the wrong result.
C++ result: 2994463703 (correct)
C# result: 33059234775 (wrong)
I hope someone can explain this.

So, there are (at least) three underlying issues here.
This algorithm grows exponentially and offers no protection against anything. It is easily reversible and does not attempt to secure the input.
The pow(pass, 3.0) call is going to return a double.
The long datatype (in C++) is not always 64 bits. It can be 32-bit.
If we ignore the first point and skip to two and three, there are two potential issues:
When the pow(pass, 3.0) line gets hit, it may not always return the same value, due to floating-point error. (I don't suspect this is a major issue in your code, but you fail to take it into account.)
When the pass + (pass * 708224) line (which can be rewritten as pass * 708225, FYI) gets hit on a 32-bit C++ environment, it will silently overflow to the value 2,994,463,703, which just so happens to be your C++ result.
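You can see the wrap directly in C# (a minimal sketch of my own; 33,059,234,775 is the full 64-bit result for a 3-character name, per the table below):
ulong full = 33059234775UL;           // the untruncated 64-bit result
uint wrapped = unchecked((uint)full); // keep only the low 32 bits, as 32-bit C++ does
Console.WriteLine(wrapped);           // prints 2994463703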
So, how do you fix this?
Fix that algorithm. As it stands, you can easily build a lookup table of potential values.
Input (pass)    Output
1               1,240,101,975
2               9,806,791,575
3               33,059,234,775
4               78,340,308,375
5               152,992,889,175
etc.            etc.
Now, the issue here is not that these numbers are always going to be the same; that's expected. The issue is that the only value which actually fits within a 32-bit integer is the first one. As soon as a second character is added, the result is outside the representable range. And if you are going to be doing this in C++, you should really avoid the long data type: it is not guaranteed to be 64 bits.
If you need a serial or hash (as we call them in the real world), I recommend you look at MD5, SHA-1/SHA-2, or any other hash algorithm. (All of those are built into .NET, and it should be easy enough to get them for C++.)
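For instance, a minimal sketch of the .NET route, using the built-in SHA-256 (the class and variable names here are just illustrative):
using System;
using System.Security.Cryptography;
using System.Text;

class HashDemo
{
    static void Main()
    {
        string name = "abc"; // the username to derive a serial from
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(name));
            Console.WriteLine(BitConverter.ToString(hash).Replace("-", ""));
        }
    }
}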
How can I tell if my C++ environment supports 64-bit unsigned long variables?
Easy: seed an unsigned long value with the maximum value for an unsigned int (4,294,967,295), add one to it, and check whether it wrapped around to zero.
unsigned long test = 4294967295; // the largest value that fits in 32 bits
test = test + 1;                 // wraps to 0 if unsigned long is 32-bit
bool longIs64Bits = test > 0;
The result will be either true or false: if true, you have a 64-bit unsigned long type; if false, you don't.
What if I really need 64-bit numbers?
Fortunately, C++ also provides a long long type (as well as unsigned long long). Note: these sizes can also vary, but they will be no less than 64 bits.
unsigned long long test = 4294967295; // 2^32 - 1
test = test + 1;                      // 4294967296 fits: no wrap
bool longLongIs64Bits = test > 0;
The preceding snippet should always yield true.
Lastly, there is also uint64_t, defined in <stdint.h> (or <cstdint> in C++). It is guaranteed to be exactly 64 bits. It's part of the C99 spec and the C++11 spec, though I cannot vouch for every compiler's support of it.

EBrown is dead right that the C++ code needs serious rethinking, but if the C++ code is legacy and you can't change it, the best you can do is duplicate the bugs in the C# version.
Use uint instead of long in the C# version to trigger the unsigned 32-bit overflow.
You can't reliably duplicate the double rounding error. Use Math.Pow and pray it never comes up.
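Something like this minimal sketch (reusing the question's control names; the unchecked blocks make the wrap-around explicit even if the project enables overflow checking):
uint pass = (uint)txt_user.TextLength;
pass = pass * 3;
pass = pass << 2;
pass = unchecked((uint)Math.Pow(pass, 3.0)); // mirrors the C++ pow() call
pass = pass + 23;
pass = unchecked(pass + (pass * 708224));    // wraps modulo 2^32, like a 32-bit unsigned long
txt_serial.Text = pass.ToString();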
Edit:
Addendum: Friends don't let friends roll their own crypto.

Related

Precision of Math.Cos() for a large integer

I'm trying to compute the cosine of 4203708359 radians in C#:
var x = (double)4203708359;
var c = Math.Cos(x);
(4203708359 can be exactly represented in double precision.)
I'm getting
c = -0.57977754519440394
Windows' calculator gives
c = -0.579777545198813380788467070278
PHP's cos(double) function (which internally just uses cos(double) from the C standard library) on Linux gives:
c = -0.57977754519881
C's cos(double) function in a simple C program compiled with Visual Studio 2017 gives
c = -0.57977754519881342
Here is the definition of Math.cos() in C#: https://github.com/dotnet/coreclr/blob/master/src/mscorlib/src/System/Math.cs#L57-L58
It appears to be a built-in function. I haven't yet dug into the C# compiler to check what this effectively compiles to, but that is probably the next step.
In the meantime:
Why is the precision so poor in my C# example, and what can I do about it?
Is it simply that the cosine implementation in the C# compiler deals poorly with large integer inputs?
Edit 1: Wolfram Mathematica 11.0:
In[1] := N[Cos[4203708359], 50]
Out[1] := -0.57977754519881338078846707027800171954257546099993
Edit 2: I do need that level of precision, and I'm ready to go pretty far in order to obtain it. I'd be happy to use an arbitrary-precision library if a good one exists that supports cosine (my efforts haven't turned one up so far).
Edit 3: I posted the question on coreclr's issue tracker: https://github.com/dotnet/coreclr/issues/12737
I think I might know the answer. I'm pretty sure the sin/cos libraries don't take arbitrarily large numbers and calculate the sin/cos of them directly - they instead reduce them down to small numbers (between 0 and 2π?) and calculate them there. After all, cos(x) = cos(x + 2π) = cos(x + 4π) = ...
Problem is, how is the program supposed to reduce your 10-digit number down? Realistically, it should figure out how many multiples of 2π fit just below your number, then subtract them out. In your case, that's about 670 million multiples.
So it's multiplying 2π by a 9-digit value - which means it's effectively losing 9 digits' worth of significance from the math library's version of pi.
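To put a number on that loss, here is a sketch of my own doing the reduction naively in doubles; the rounding of num * tau at magnitude ~4e9 alone can shift the reduced angle by roughly half a microradian:
double x = 4203708359.0;
double tau = 2 * Math.PI;             // only ~16 digits of pi survive here
double num = Math.Floor(x / tau);     // about 669 million full turns
double reduced = x - num * tau;       // absolute error around 1e-6 by now
Console.WriteLine(Math.Cos(reduced)); // correspondingly inaccurate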
I ended up writing a little function to test what was going on:
private double reduceDown(double start)
{
    // Use decimal's extra precision to subtract whole multiples of 2*pi
    // without losing digits from pi itself.
    decimal startDec = (decimal)start;
    decimal pi = decimal.Parse("3.1415926535897932384626433832795");
    decimal tau = pi * 2;
    int num = (int)(startDec / tau);    // how many full turns fit
    decimal x = startDec - (num * tau); // the reduced angle, still precise
    double retVal;
    double.TryParse(x.ToString(), out retVal);
    return retVal;
    //return start - (num * tau);
}
All this is doing is using the decimal data type as a way of reducing the value without losing digits of precision from pi - it still returns a double. When I call it with a modification of your code:
var x = (double)4203708359;
var c = Math.Cos(x);
double y = reduceDown(x);
double c2 = Math.Cos(y);
MessageBox.Show(c.ToString() + Environment.NewLine + c2);
return;
... sure enough, the second one is accurate.
So my advice is: if you really need radians that large, and you really need the accuracy, do something like the function above and reduce the number on your end in a way that doesn't lose digits of precision.
Presumably, the salts are stored along with each password. You could use the PHP code to calculate that cosine and store it with the password as well. I would then also add a password version number and default all the older passwords to version 1. Then, in your C# code, you implement a new hashing algorithm for any new passwords and store those hashes as version 2. For any version 1 password, you do not have to calculate the cosine to authenticate; you simply use the one stored along with the password hash and the salt.
The programmer of that PHP code was probably trying to do a clever version of a pepper. By storing that cosine (or pepper) along with the salt and the password hashes, you basically turn that pepper into a second salt. So another, versionless way of doing this would be to use two salts in your C# hashing code. For new passwords you could leave the second salt blank or assign it some other way. For old passwords it would be that cosine, which is already calculated.
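A rough sketch of the record layout that versioning implies (all of the type and field names here are purely illustrative):
class PasswordRecord
{
    public int Version;  // 1 = legacy scheme (cosine pepper), 2 = new algorithm
    public string Salt;
    public string Salt2; // for version 1 records: the precomputed cosine
    public string Hash;
}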
Regarding this part of my question: "Why is the precision so poor in my C# example", coreclr developers answered here: https://github.com/dotnet/coreclr/issues/12737
In a nutshell, .NET Framework 4.6.2 (x86 and x64) and .NET Core (x86) appear to use Intel's x87 FP unit (i.e. fcos or fsincos) that gives inaccurate results while .NET Core on x64 (and PHP, Visual Studio 2017 and gcc) use more accurate, presumably SSE2-based implementations that give correctly rounded results.

Is there existing documentation about why the VBA Val function behaves differently than the .NET implementation of the same code (hex conversions)?

I'm converting code from VBA and I need confirmed proof of the behavior of the Val function in order to faithfully reproduce it in .NET.
The issue is this line of VBA code
lHexNum = Val("&h" & HexNum) ' HexNum = 3B05000004F137
Is producing this output
323895
Which should be this:
16612521184391480
but I don't know why it isn't.
I have used 2 methods in .Net which both confirm the expected output of 16612521184391480 (as well as using a simple hex calculator).
Convert.ToInt64(HexNum, 16);
and
Microsoft.VisualBasic.Conversion.Val("&h" + HexNum);
However, I still need to perfectly replicate the actual output from the VBA program which right now gives the 323895 output.
The only reasoning I can find is that if I remove the 3B05 from the HexNum, I then get matching output. Since I cannot test this against enough live data to be 100% sure it works in all cases, I cannot use this hack.
Does anyone have references or more information on how and why an Access 2003 application gets the 323895 output from the Val function, and why even the matching Microsoft.VisualBasic.Conversion.Val method cannot produce the same output?
Well, 323895 is (in hex) 0004F137, so as a complete guess, the problem here could be that the Val you are using (or the place where you are storing the value) is 32-bit, and is thus only going to give you the value from the last 8 characters (the last 4 bytes of data).
Val() returns a Double. Assuming lHexNum is declared as a 32-bit Long, VBA will do an implicit conversion, and it doesn't throw an error even if it overflows. Since VBA doesn't have a 64-bit integer data type, it just throws away the upper bytes.
The same is true for VB6, which I verified below returns the same 323895 you are seeing.
Dim HexNum As String
HexNum = "3B05000004F137"
Dim lHexNum As Long
lHexNum = Val("&h" & HexNum)
Debug.Print lHexNum
In .NET, however, a Long is a 64-bit value. It can hold the entire hex value, so nothing gets thrown away. Technically this is more correct than what VBA is doing, since with VBA you are losing some of your original data during the conversion. You can't just change your variable to an Int32 either, because C# will throw an overflow exception if the value is too large at runtime.
If you want the same behavior as VBA/VB6, you need to first cast the Double to an Int64, then cast it back to an Int32 so it gets truncated. Like this:
lHexNum = (Int32)(Int64)(Microsoft.VisualBasic.Conversion.Val("&h" + HexNum));
The result is that the upper 32 bits of the Int64 get thrown away, and you end up with the 323895 you desire.
I am using the Int64 and Int32 data types to be more explicit; however, you could also use int in place of Int32 and long in place of Int64.
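The truncation is easy to demonstrate directly on the bits (a small sketch; values from the question):
long full = 0x3B05000004F137;     // the full hex value
int low32 = unchecked((int)full); // keeps only the low 4 bytes
Console.WriteLine(low32);         // prints 323895 (0x0004F137)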
You state that lHexNum is a Long in VBA. This is 32 bits, so the max value that can be stored is 2,147,483,647 or 0x7FFFFFFF - this means your 0x3B05000004F137 is being truncated in the VBA code.
In .NET a Long is 64 bits so the hex value can fit and no truncation happens.
In order to get the same behaviour in .Net you will need to mask off the top 32 bits: see I want to get the low 32 bit of a int64 as int32
e.g.
Dim HexNumString = "3B05000004F137"
Dim lHexNum As Long = CLng(Val("&h" & HexNumString))
Dim tempLong As Long = ((lHexNum >> 32) << 32) ' shift right then left 32 bits, which zeroes the lower half of the Long
Dim hexInt As Integer = CInt(lHexNum - tempLong)
Debug.WriteLine(hexInt)

Why does C# implement pre/post-increment/decrement operators for floating point types?

What's so special about adding/subtracting 1 to/from a floating point value that it deserves a dedicated operator?
double a = -0.001234129;
a++; // ?
I've never felt the need to use such a construction; it looks really weird to me. But if I ever had to, I'd feel much more comfortable with just:
a += 1;
Maybe it's because of my strong C++ background, but to me it makes a variable look like an array indexer.
Is there any reason for this?
The ++ and -- operators operate on all other number types, why make an exception for floating point numbers? To me, that would be the more surprising choice.
Note that C++ also implements these for floating point:
#include <iostream>
using namespace std;

int main(int argc, char* argv[])
{
    double a = 0.5;
    cout << a << '\n';
    ++a;
    cout << a << '\n';
    return 0;
}
Output:
0.5
1.5
My guess is that the reason is consistency with C/C++.
I agree with you that it's kind of weird - the '++' operator has some special meaning for integer values:
It typically translates to an INC assembly instruction,
It represents changing the value by a special amount (i.e. by the smallest possible amount), and because of this it's used in iterations.
For floating-point numbers, however, the value 1.0 is not special in any way (from the machine's point of view). You also shouldn't use it for iterations (in other words, if you're using it, you should usually consider using an int instead), and it doesn't have a dedicated INC assembly instruction.
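One concrete consequence (my own illustration): above 2^53 a double can no longer represent every integer, so ++ can silently do nothing at all:
double a = Math.Pow(2, 53);              // 9007199254740992; below this, every integer is exact
a++;                                     // 2^53 + 1 is not representable; the result rounds back to 2^53
Console.WriteLine(a == Math.Pow(2, 53)); // True: the increment was lost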

long/large numbers and modulus in .NET

I'm currently writing a quick custom encoding method where I stamp a key with a number to verify that it is a valid key.
Basically I was taking whatever number comes out of the encoding and multiplying it by a key.
I would then deploy the resulting numbers to the user/customer who purchases the key. I wanted to simply use (Code % Key == 0) to verify that the key is valid, but for large values the mod operation does not seem to behave as expected.
Number = 468721387;
Key = 12345678;
Code = Number * Key;
Using the numbers above:
Code % Key == 11418772
And for smaller numbers it would correctly return 0. Is there a reliable way to check divisibility for a long in .NET?
Thanks!
EDIT:
Ok, tell me if I'm special and missing something...
long a = DateTime.Now.Ticks;
long b = 12345;
long c = a * b;
long d = c % b;
d == 10001 (Bad)
and
long a = DateTime.Now.Ticks;
long b = 12;
long c = a * b;
long d = c % b;
d == 0 (Good)
What am I doing wrong?
As others have said, your problem is integer overflow. You can make this more obvious by checking "Check for arithmetic overflow/underflow" in the "Advanced Build Settings" dialog. When you do so, you'll get an OverflowException when you perform DateTime.Now.Ticks * 12345.
One simple solution is just to change "long" to "decimal" (or "double") in your code.
In .NET 4.0, there is a new BigInteger class.
Finally, you say you're "... writing a quick custom encoding method ...", so a simple homebrew solution may be satisfactory for your needs. However, if this is production code, you might consider more robust solutions involving cryptography or something from a third-party who specializes in software licensing.
The answers that say that integer overflow is the likely culprit are almost certainly correct; you can verify that by putting a "checked" block around the multiplication and seeing if it throws an exception.
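For example (a sketch of that check):
long a = DateTime.Now.Ticks;
long b = 12345;
checked
{
    long c = a * b; // throws OverflowException here instead of silently wrapping
    Console.WriteLine(c);
}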
But there is a much larger problem here that everyone seems to be ignoring.
The best thing to do is to take a large step back and reconsider the wisdom of this entire scheme. It appears that you are attempting to design a crypto-based security system but you are clearly not an expert on cryptographic arithmetic. That is a huge red warning flag. If you need a crypto-based security system DO NOT ATTEMPT TO ROLL YOUR OWN. There are plenty of off-the-shelf crypto systems that are built by experts, heavily tested, and readily available. Use one of them.
If you are in fact hell-bent on rolling your own crypto, getting the math right in 64 bits is the least of your worries. 64 bit integers are way too small for this crypto application. You need to be using a much larger integer size; otherwise, finding a key that matches the code is trivial.
Again, I cannot emphasize strongly enough how difficult it is to construct correct crypto-based security code that actually protects real users from real threats.
Integer Overflow...see my comment.
The value of the multiplication you're doing overflows the int data type and causes it to wrap (int values fall between -2,147,483,648 and 2,147,483,647).
Pick a more appropriate data type to hold a value as large as 5786683315615386 (the result of your multiplication).
UPDATE
Your new example changes things a little.
You're using long, but now you're using System.DateTime.Ticks which on Mono (not sure about the MS platform) is returning 633909674610619350.
When you multiply that by a large number, you are now overflowing a long just like you were overflowing an int previously. At that point, you'll probably need to use a double to work with the values you want (decimal may work as well, depending on how large your multiplier gets).
Apparently, your Code fails to fit in the int data type. Try using long instead:
long code = (long)number * key;
The (long) cast is necessary. Without it, the multiplication is done in 32-bit integer arithmetic (assuming the number and key variables are typed int) and the result is then cast to long, which is not what you want. By casting one of the operands to long, you tell the compiler to perform the multiplication on two long numbers.
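With the cast in place, the original divisibility check behaves as intended (values from the question):
int number = 468721387;
int key = 12345678;
long code = (long)number * key;     // 64-bit multiply: 5786683315615386
Console.WriteLine(code % key == 0); // True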

High precision integer math in C#?

I have a very large number I need to calculate, and none of the inbuilt datatypes in C# can handle such a large number.
Basically I want to solve this:
Project Euler 16:
2^15 = 32768 and the sum of its digits is 3 + 2 + 7 + 6 + 8 = 26.
What is the sum of the digits of the number 2^1000?
I have already written the code but, as said before, the number is too large for C# datatypes. The code has been tested and verified with small numbers (such as 2^15), and it works perfectly.
using System;

namespace _16_2E1000
{
    class Program
    {
        static void Main(string[] args)
        {
            ulong sum = 0;
            ulong i = 1 << 1000;
            string s = i.ToString();
            foreach (char c in s)
            {
                sum += (ulong)Convert.ToInt64(c.ToString());
            }
            Console.WriteLine(sum);
            Console.ReadLine();
        }
    }
}
You can use BigInteger from the J# classes. The first question in this article tells you how. It's a bit of a pain, though, because you then have to ship the J# redistributable when you roll out.
First, to answer your exact question, look for a BigInt or BigNum type.
Second, from what I know of Project Euler, there will be a cool, tricky way to do it that is much easier.
As a first guess, I'd compute the answer for 2^1 -> 2^n (for whatever n you can get to work) and look for patterns. Also look for patterns in the sequences:
V(0) = 2^p
V(n) = floor(V(n - 1) / 10)
D(n) = V(n) % 10
I hope this is not a homework problem, but to get to the answer of 2^1000, you'll have to divide it into smaller chunks.
Try something like:
2^1000 = 2 * 2^999 = 2^999 + 2^999 = 2^998 + 2^998 + 2^998 + 2^998
breaking it into smaller bits until you get to a solvable problem.
Complete solutions to Project Euler are at the following links:
http://blog.functionalfun.net/2008/07/project-euler-problem-16-calculating.html
http://code.msdn.microsoft.com/projecteuler
It is not necessary to have Big Integer capabilities in order to solve this problem.
One could just use the property that:
2^n = 2^(n-1) + 2^(n-1)
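A minimal sketch of that idea (my own illustration): keep the decimal digits in an array and double the number 1000 times, carrying by hand.
using System;
using System.Linq;

class DoubleUp
{
    static void Main()
    {
        int[] digits = new int[400]; // 2^1000 has 302 digits, so 400 is plenty
        digits[0] = 1;               // little-endian: digits[0] is the ones place
        for (int p = 0; p < 1000; p++)
        {
            int carry = 0;
            for (int i = 0; i < digits.Length; i++)
            {
                int d = digits[i] * 2 + carry;
                digits[i] = d % 10;
                carry = d / 10;
            }
        }
        Console.WriteLine(digits.Sum()); // the digit sum of 2^1000
    }
}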
If Big Integer is really necessary for other tasks, I have been using the BigInt class from F# in my C# programs and am happy with it.
The necessary steps:
Install the F# CTP
In your C# (or other .NET language) application add a reference to the FSharp.Core dll.
Add: using Microsoft.FSharp.Math;
In the "Class View" window familiarize yourself with the members of the two classes: BigInt and BigNum
After executing these steps one is basically ready to use the BigInt class.
One last hint:
To avoid declaring variables with improper names to hold constants (which makes the code unreadable), I use a name that starts with _ (underscore), followed by the integer constant. In this way one will have expressions like:
N = _2 * N;
clearly much more readable than:
N = Two * N;
Here's a BigInteger (source code is available) that you can use; though, as already mentioned, there are more efficient ways to do this than brute force.
BigInteger on codeplex
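For reference, the same brute-force approach becomes a few lines with the System.Numerics.BigInteger type that shipped in .NET 4 (a sketch):
using System;
using System.Linq;
using System.Numerics;

class Euler16
{
    static void Main()
    {
        BigInteger n = BigInteger.Pow(2, 1000);
        int digitSum = n.ToString().Sum(c => c - '0');
        Console.WriteLine(digitSum);
    }
}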
Actually, while a big-integer utility might be of interest here, you don't need it, even for this. Yes, it looks like you do, but you don't. In fact, using a big-integer form may even slow things down.
Since I don't want to solve the problem for you, I'll just suggest you think about this in a modular way.
