Is it safe to use float.NaN as Dictionary key? - c#

Code (from an interactive shell):
> var a = new Dictionary<float, string>();
> a.Add(float.NaN, "it is NaN");
> a[float.NaN]
"it is NaN"
So it is possible, but is it safe?

Paraphrasing from https://github.com/dotnet/corefx/blob/master/src/Common/src/CoreLib/System/Single.cs:
public const float NaN = (float)0.0 / (float)0.0;
public static unsafe bool IsNaN(float f) => f != f;
public int CompareTo(object? value){
    ...
    if (m_value < f) return -1;
    if (m_value > f) return 1;
    if (m_value == f) return 0;
    if (IsNaN(m_value))
        return IsNaN(f) ? 0 : -1;
    else // f is NaN.
        return 1;
}
public bool Equals(float obj)
{
    if (obj == m_value)
    {
        return true;
    }
    return IsNaN(obj) && IsNaN(m_value);
}
public override int GetHashCode()
{
    int bits = Unsafe.As<float, int>(ref Unsafe.AsRef(in m_value));
    // Optimized check for IsNan() || IsZero()
    if (((bits - 1) & 0x7FFFFFFF) >= 0x7F800000)
    {
        // Ensure that all NaNs and both zeros have the same hash code
        bits &= 0x7F800000;
    }
    return bits;
}
You can see that NaN requires special handling in each of these cases. IEEE 754 leaves the payload bits of a NaN unspecified, so many bit patterns represent NaN, and it defines comparisons involving NaN as unordered even when the bit patterns are identical.
However, you can also see that both GetHashCode() and Equals() treat any two NaNs as equivalent. So I believe that using NaN as a dictionary key should be fine.
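A minimal sketch of the difference between the == operator and the Equals() path that Dictionary uses, assuming the default equality comparer for float. The class and method names (NaNKeyDemo, OperatorSaysEqual, EqualsSaysEqual, Lookup) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class NaNKeyDemo
{
    // The == operator follows IEEE 754: NaN never equals anything, itself included.
    public static bool OperatorSaysEqual(float a, float b) => a == b;

    // float.Equals(float) special-cases NaN, as in the source quoted above.
    public static bool EqualsSaysEqual(float a, float b) => a.Equals(b);

    // Looks up a key in a dictionary that holds a single entry stored under float.NaN.
    // Returns null when the key is not found.
    public static string Lookup(float key)
    {
        var map = new Dictionary<float, string> { { float.NaN, "it is NaN" } };
        return map.TryGetValue(key, out string value) ? value : null;
    }
}
```

Here NaNKeyDemo.OperatorSaysEqual(float.NaN, float.NaN) is false while NaNKeyDemo.EqualsSaysEqual(float.NaN, float.NaN) is true, which is why NaNKeyDemo.Lookup(float.NaN) finds the entry.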

That depends on what you mean by safe.
If you expect people to be able to use the dictionary and compare its keys to other floats, they will have to handle a NaN key correctly themselves. Since float.NaN == float.NaN evaluates to false, that may cause issues down the line.
However, the Dictionary lookup succeeds, and the other operations work correctly as well.
The question here is really: why do you need it in the first place?

It's a bad idea to use a float as a Dictionary key.
In theory you can do it. But when you work with float, double, or decimal you should use some epsilon to compare two values. Use a formula like this:
abs(a1 - a2) < Epsilon
This is needed because floating-point operations round their results and irrational numbers cannot be represented exactly. For example, how would you compare with PI or sqrt(2)?
So in this case, using a float as a dictionary key is a bad idea.
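The formula above can be sketched as a small helper. FloatCompare and NearlyEqual are hypothetical names, and the tolerance is something the caller has to choose for the value range at hand.

```csharp
using System;

public static class FloatCompare
{
    // Absolute-tolerance comparison: abs(a1 - a2) < epsilon.
    // A fixed epsilon only makes sense when the magnitude of the inputs is known.
    public static bool NearlyEqual(double a1, double a2, double epsilon)
    {
        return Math.Abs(a1 - a2) < epsilon;
    }
}
```

With double a = 0.1 and double b = 0.2, the sum a + b is not exactly 0.3, so a + b == 0.3 is false, while FloatCompare.NearlyEqual(a + b, 0.3, 1e-9) is true. Note that a tolerance comparison like this is not transitive, so it cannot serve as the equality of a hash-based dictionary, which is part of why computed floats make awkward keys.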

Related

Is it safe to set a double to a literal and then compare it

For example:
double x = -1;
TrySetX(ref x);
if(x == -1)
    //since x == -1, it obviously wasn't set
    TryADifferentWayToSetX(ref x);
Use(x);
In the case where x is not changed, will x == -1 always return true, or will I have to use an epsilon for the comparison?
The logic here is that both literals will presumably be converted to the same value, so there is no need to worry about the fact that they may lose precision.
Trying it out, it looks like both literals get converted to the same value. As such, the comparison is safe.
Here is a little test I wrote:
public class Program
{
    public static void Main(string[] args)
    {
        double x = 3.455678756567656765677812345678912345678901234567890123456789009876543223456798765423456709876512345;
        System.Console.WriteLine(x);
        System.Console.WriteLine(x == 3.455678756567656765677812345678912345678901234567890123456789009876543223456798765423456709876512345);
    }
}
That number is long enough that there will be a large loss in precision, but it prints
3.45567875656766
True
So indeed there is loss of precision, but the comparison nonetheless works.

Sort stops working after enum values are added

I have implemented a sort method for a collection in my code, and today I noticed something strange. When I tried to add new enum values to the enum, the sort method crashed with this error:
Unable to sort because the IComparer.Compare() method returns inconsistent results. Either a value does not compare equal to itself, or one value repeatedly compared to another value yields different results. x: '', x's type: 'Texture2D', IComparer: 'System.Array+FunctorComparer`1[Microsoft.Xna.Framework.Graphics.Texture2D]'.
This seems really strange, since the sort is in no way dependent on earlier results; all it should do is sort by the index of the enum instead of in alphabetical order.
Here is the code:
availableTiles.Sort(CompareTilesToEnum);
private static int CompareTilesToEnum(Texture2D x, Texture2D y)
{
    int xValue = (int) (Enum.Parse(typeof(TileTyp), x.Name, true));
    int yValue = (int) (Enum.Parse(typeof(TileTyp), y.Name, true));
    if (xValue > yValue)
    {
        return 1;
    }
    else
    {
        return -1;
    }
}
public enum TileTyp
{
Nothing = -1,
Forest,
Grass,
GrassSandBottom,
GrassSandLeft,
GrassSandRight,
GrassSandTop,
Mounten,
Sand,
Snow,
Water,
GrassSandTopLeft,
GrassSandAll,
GrassSandBottomLeft,
GrassSandBottomRightLeft,
GrassSandBottomRightTop,
GrassSandBottomTopLeft,
GrassSandRightLeft,
GrassSandRightTop,
GrassSandRightTopLeft,
GrassSandBottomRight,
GrassSandBottomTop
}
The values I added were
GrassSandBottomRight,
GrassSandBottomTop
Your comparison never returns 0 - even if the values are equal. Any reason you don't just ask int.CompareTo to compare the values?
private static int CompareTilesToEnum(Texture2D x, Texture2D y)
{
    int xValue = (int) (Enum.Parse(typeof(TileTyp), x.Name, true));
    int yValue = (int) (Enum.Parse(typeof(TileTyp), y.Name, true));
    return xValue.CompareTo(yValue);
}
Simpler and more importantly, it should actually work :)
As the error clearly states, your comparer is broken.
You need to return 0 if the values are equal.
There are some rules you must follow with any comparison method:
1: If A == B, then B == A (return zero both times).
2: If A < B and B < C, then A < C.
3: If A < B, then B > A.
4: A == A (return zero if compared with itself).
(Note, the == above means that neither < nor > is true. It is permissible for two objects to be equivalent in a sort order without being equal according to the corresponding Equals. We could for instance have a rule that sorted all strings containing numbers in numerical order and put all other strings at the end, but didn't care about what order those other strings were in.)
These rules hold in any language (they're logic rules, not programming rules). There is a .NET-specific one too:
5: If A != null, then A > null.
You're breaking all of the first four rules. Since Texture2D is a reference type you risk breaking rule 5 too (will throw a different exception though).
You're also lucky that .NET catches it. A different sort algorithm could well have crashed with a more confusing error or fallen into an infinite loop as it e.g found that item 6 was reported as greater than item 7 and swapped them, then soon after found that item 6 was reported as greater than item 7 and swapped them, then soon after found...
private static int CompareTilesToEnum(Texture2D x, Texture2D y)
{
    //Let's deal with nulls first
    if(ReferenceEquals(x, y))//both null or both same item
        return 0;
    if(x == null)
        return -1;
    if(y == null)
        return 1;
    //Enum has a CompareTo that works on integral value, so why not just use that?
    //Enum.Parse returns object, so cast back to TileTyp before calling CompareTo
    return ((TileTyp)Enum.Parse(typeof(TileTyp), x.Name, true)).CompareTo((TileTyp)Enum.Parse(typeof(TileTyp), y.Name, true));
}
(This assumes a failure in the parsing is impossible and doesn't have to be considered).
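Rules 1, 3 and 4 above can be checked mechanically on a sample of items. The following is only a sketch: ComparisonChecks and IsConsistent are hypothetical names, and transitivity (rule 2) and the null rule are deliberately left out to keep it short.

```csharp
using System;
using System.Collections.Generic;

public static class ComparisonChecks
{
    // Returns true when, for every item and pair in the sample:
    //  - an item compares equal to itself (rule 4), and
    //  - swapping the arguments flips the sign of the result (rules 1 and 3).
    public static bool IsConsistent<T>(IReadOnlyList<T> sample, Comparison<T> compare)
    {
        for (int i = 0; i < sample.Count; i++)
        {
            if (compare(sample[i], sample[i]) != 0)
                return false;

            for (int j = i + 1; j < sample.Count; j++)
            {
                int ab = Math.Sign(compare(sample[i], sample[j]));
                int ba = Math.Sign(compare(sample[j], sample[i]));
                if (ab != -ba)
                    return false;
            }
        }
        return true;
    }
}
```

A comparison written like the one in the question, (x, y) => x > y ? 1 : -1, fails this check on any sample because an item compared with itself yields -1, while (x, y) => x.CompareTo(y) passes.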

Generate a unique string based on a pair of strings

I've two strings StringA, StringB. I want to generate a unique string to denote this pair.
i.e.
f(x, y) should be unique for every x, y and f(x, y) = f(y, x) where x, y are strings.
Any ideas?
Compute a message digest of both strings and XOR the values
MD5(x) ^ MD5(y)
The message digest gives you a practically unique value for each string, and the XOR makes it possible for f(x, y) to be equal to f(y, x).
EDIT: As @Phil H observed, you have to treat the case in which you receive two equal strings as input, which would generate 0 after the XOR. You could return something like MD5(x+y) if x and y are the same, and MD5(x) ^ MD5(y) for all other values.
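A rough sketch of this idea using the MD5 class from System.Security.Cryptography. SymmetricDigest and Combine are hypothetical names, UTF-8 encoding and Base64 output are arbitrary choices, and the result is only as unique as the digest: collisions are unlikely but not impossible.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SymmetricDigest
{
    // Returns a Base64 string that is the same for (x, y) and (y, x).
    public static string Combine(string x, string y)
    {
        using (MD5 md5 = MD5.Create())
        {
            // Equal inputs would XOR to all zeros, so hash the concatenation instead.
            if (x == y)
            {
                return Convert.ToBase64String(md5.ComputeHash(Encoding.UTF8.GetBytes(x + y)));
            }

            byte[] hx = md5.ComputeHash(Encoding.UTF8.GetBytes(x));
            byte[] hy = md5.ComputeHash(Encoding.UTF8.GetBytes(y));

            // XOR is commutative, which makes the result independent of argument order.
            for (int i = 0; i < hx.Length; i++)
            {
                hx[i] ^= hy[i];
            }
            return Convert.ToBase64String(hx);
        }
    }
}
```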
Just create a new class and override Equals & GetHashCode:
class StringTuple
{
    public string StringA { get; set; }
    public string StringB { get; set; }
    public override bool Equals(object obj)
    {
        var stringTuple = obj as StringTuple;
        if (stringTuple == null)
            return false;
        return (StringA.Equals(stringTuple.StringA) && StringB.Equals(stringTuple.StringB)) ||
            (StringA.Equals(stringTuple.StringB) && StringB.Equals(stringTuple.StringA));
    }
    public override int GetHashCode()
    {
        // Order of operands is irrelevant when using *
        return StringA.GetHashCode() * StringB.GetHashCode();
    }
}
Just find a unique way of ordering them and concatenate with a separator.
def uniqueStr(strA, strB, sep):
    if strA <= strB:
        return strA + sep + strB
    else:
        return strB + sep + strA
For arbitrarily long lists of strings, either sort the list or generate a set, then concatenate with a separator:
def uniqueStr(sep, strList):
    # Sort the de-duplicated strings so the result does not depend on input order
    return sep.join(sorted(set(strList)))
Preferably, if the strings are long or the separator choice is a problem, use the hashes and hash the result:
def uniqueStr(sep, strList):
    # hash() returns an int, so convert each hash to text before joining
    return hash(''.join(str(hash(s)) for s in sorted(set(strList))))
I think the following should yield unique strings:
String f = Replace(StringA<StringB?StringA:StringB,"#","##") + "}#{" + Replace(StringA<StringB?StringB:StringA,"#","##")
(That is, there's only one place in the string where a single "#" sign can appear, and we don't have to worry about a run of "#"s at the end of StringA being confused with a run of "#"s at the start of StringB.)
You can use x.GetHashCode(). That does not guarantee that the result will be unique, but collisions are rare. See more information in this question.
For example:
public int GetUniqueValue(string x, string y)
{
    unchecked {
        // Multiply the two hash codes; multiplication is commutative, so f(x, y) == f(y, x)
        var result = x.GetHashCode() * y.GetHashCode();
        return result;
    }
}
Why not take the alphabetical order of the strings into account before combining them? If they are always combined in the same order, f(x, y) = f(y, x) will be true.
if(x > y)
c = x + y;
else
c = y + x;
What about StringC = StringA + StringB;.
That is guaranteed to be unique for any combination of StringA or StringB. Or did you have some other considerations for the string also?
You can, for example, combine the strings and take the MD5 hash of the result. Then you will get a string that is probably "unique enough" for your needs. You cannot reverse the hash back into the strings, but you can take the same strings and be sure that the generated hash will be the same the next time.
EDIT
I saw your edit now, but I feel it's only a matter of sorting the strings first in that case. So something like
StringC = StringA.CompareTo(StringB) < 0 ? StringA + StringB : StringB + StringA;
You could just sort them and concatenate them, along with, let's say, the length of the first word.
That way f("one","two") = "onetwo3", f("two","one") = "onetwo3", and no other combination would produce that string; e.g. "onet", "wo" would yield "onetwo4".
However, this will be an abysmal solution for reasonably long strings.
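A sketch of the sort-and-concatenate idea follows. It deviates from the description above in one respect: the length of the first string is written as a prefix followed by ":" rather than as a suffix, so that strings ending in digits cannot be confused with the length. PairKey and Make are hypothetical names, and ordinal ordering is an assumption.

```csharp
using System;

public static class PairKey
{
    // Builds "<length of first>:<first><second>", where first/second are the
    // two inputs in ordinal order. The result is the same for (x, y) and (y, x).
    public static string Make(string x, string y)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));
        if (y == null) throw new ArgumentNullException(nameof(y));

        bool xFirst = string.CompareOrdinal(x, y) <= 0;
        string first = xFirst ? x : y;
        string second = xFirst ? y : x;

        // The digits before the first ':' say where 'first' ends, so the pair
        // can be recovered unambiguously from the key.
        return first.Length + ":" + first + second;
    }
}
```

For example, PairKey.Make("one", "two") and PairKey.Make("two", "one") both give "3:onetwo", while PairKey.Make("onet", "wo") gives "4:onetwo".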
You could also do some sort of hash code calculation, like this:
first.GetHashCode() ^ second.GetHashCode()
That would be reasonably unique; however, you can't guarantee uniqueness.
It would be nice if the OP provided a little more context, because this does not sound like a sound solution to any problem.
public static String getUniqString(String x,String y){
return (x.compareTo(y)<0)?(x+y):(y+x);
}

How should I go about implementing Object.GetHashCode() for complex equality?

Basically, I have the following so far:
class Foo {
    public override bool Equals(object obj)
    {
        Foo d = obj as Foo;
        if (d == null)
            return false;
        return this.Equals(d);
    }
    #region IEquatable<Foo> Members
    public bool Equals(Foo other)
    {
        if (this.Guid != String.Empty && this.Guid == other.Guid)
            return true;
        else if (this.Guid != String.Empty || other.Guid != String.Empty)
            return false;
        if (this.Title == other.Title &&
            this.PublishDate == other.PublishDate &&
            this.Description == other.Description)
            return true;
        return false;
    }
    #endregion
}
So, the problem is this: I have a non-required field Guid, which is a unique identifier. If this isn't set, then I need to fall back on less accurate metrics to determine whether two objects are equal. This works fine, but it makes GetHashCode() messy... How should I go about it? A naive implementation would be something like:
public override int GetHashCode() {
    if (this.Guid != String.Empty)
        return this.Guid.GetHashCode();
    int hash = 37;
    hash = hash * 23 + this.Title.GetHashCode();
    hash = hash * 23 + this.PublishDate.GetHashCode();
    hash = hash * 23 + this.Description.GetHashCode();
    return hash;
}
But what are the chances of the two types of hash colliding? Certainly, I wouldn't expect it to be 1 in 2 ** 32. Is this a bad idea, and if so, how should I be doing it?
A very easy hash code method for custom classes is to bitwise XOR each of the fields' hash codes together. It can be as simple as this:
int hash = 0;
hash ^= this.Title.GetHashCode();
hash ^= this.PublishDate.GetHashCode();
hash ^= this.Description.GetHashCode();
return hash;
From the link above:
XOR has the following nice properties:
It does not depend on order of computation.
It does not “waste” bits. If you change even one bit in one of the components, the final value will change.
It is quick, a single cycle on even the most primitive computer.
It preserves uniform distribution. If the two pieces you combine are uniformly distributed so will the combination be. In other words, it does not tend to collapse the range of the digest into a narrower band.
XOR doesn't work well if you expect to have duplicate values in your fields as duplicate values will cancel each other out when XORed. Since you're hashing together three unrelated fields that should not be a problem in this case.
I don't think there is a problem with the approach you have chosen to use. Worrying 'too much' about hash collisions is almost always an indication of over-thinking the problem; as long as the hash is highly likely to be different you should be fine.
Ultimately you may even want to consider leaving out the Description from your hash anyway if it is reasonable to expect that most of the time objects can be distinguished based on their title and publication date (books?).
You could even consider disregarding the GUID in your hash function altogether, and only use it in the Equals implementation to disambiguate the unlikely(?) case of hash clashes.
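For comparison, here is a sketch of the questioner's approach using System.HashCode.Combine, which assumes a runtime that ships that type (.NET Core 2.1 or later, or the Microsoft.Bcl.HashCode package). The Article class is a hypothetical stand-in for Foo with the same four members; the only requirement being illustrated is that objects considered equal by Equals return the same hash code.

```csharp
using System;

public sealed class Article : IEquatable<Article>
{
    public string Guid { get; set; } = string.Empty;
    public string Title { get; set; } = string.Empty;
    public DateTime PublishDate { get; set; }
    public string Description { get; set; } = string.Empty;

    public bool Equals(Article other)
    {
        if (other is null) return false;

        // If either side has a Guid, the Guid alone decides equality.
        if (this.Guid != string.Empty || other.Guid != string.Empty)
            return this.Guid == other.Guid;

        // Otherwise fall back to the less precise field comparison.
        return Title == other.Title
            && PublishDate == other.PublishDate
            && Description == other.Description;
    }

    public override bool Equals(object obj) => Equals(obj as Article);

    public override int GetHashCode()
    {
        // Mirror the two branches of Equals so equal objects hash alike.
        return this.Guid != string.Empty
            ? this.Guid.GetHashCode()
            : HashCode.Combine(Title, PublishDate, Description);
    }
}
```

Two objects can only be equal if both take the same branch, so a collision between a Guid-based hash and a field-based hash only costs an extra Equals call inside the hash table; it does not affect correctness.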

How can I compare a float to NaN if comparisons to NaN always return false?

I have a float value set to NaN (seen in the Watch Window), but I can't figure out how to detect that in code:
if (fValue == float.NaN) // returns false even though fValue is NaN
{
}
You want float.IsNaN(...). Equality and ordering comparisons with NaN always return false, no matter what the value of the float is (only != returns true). It's one of the quirks of floating point.
That means you can do this:
if (f1 != f1) { // This conditional will be true if f1 is NaN.
In fact, that's exactly how IsNaN() works.
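The three checks discussed so far can be put side by side in a short sketch. NaNChecks and its method names are hypothetical; the compiler may warn about comparing a variable with itself, which is intentional here.

```csharp
using System;

public static class NaNChecks
{
    // Does not detect NaN: == with NaN on either side is false.
    public static bool ViaEqualityOperator(float f) => f == float.NaN;

    // Detects NaN: NaN is the only float value that is not equal to itself.
    public static bool ViaSelfInequality(float f) => f != f;

    // Detects NaN via the framework helper.
    public static bool ViaIsNaN(float f) => float.IsNaN(f);
}
```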
Try this:
if (float.IsNaN(fValue))
{
}
In performance-critical code float.IsNaN could be too slow because it involves the FPU. In that case you can use a binary mask check (according to the IEEE 754 specification) as follows:
public static unsafe bool IsNaN (float f)
{
    // NaN: all exponent bits set and a non-zero mantissa
    int binary = *(int*)(&f);
    return ((binary & 0x7F800000) == 0x7F800000) && ((binary & 0x007FFFFF) != 0);
}
It is 5 times faster than float.IsNaN. I just wonder why Microsoft did not implement IsNaN this way. If you'd prefer not to use unsafe code, you can still use a union-like structure:
[StructLayout (LayoutKind.Explicit)]
struct FloatUnion
{
    [FieldOffset (0)]
    public float value;
    [FieldOffset (0)]
    public int binary;
}
public static bool IsNaN (float f)
{
    FloatUnion union = new FloatUnion ();
    union.value = f;
    return ((union.binary & 0x7F800000) == 0x7F800000) && ((union.binary & 0x007FFFFF) != 0);
}
It's still 3 times faster than IsNaN.
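On runtimes that provide BitConverter.SingleToInt32Bits (.NET Core 2.0 and later, as far as I know), the same mask check can be written without unsafe code or an explicit-layout struct. This is only a sketch with a hypothetical class name, and it makes no claim about relative speed.

```csharp
using System;

public static class NaNBits
{
    // A float is NaN when all eight exponent bits are set and the mantissa is non-zero.
    public static bool IsNaN(float f)
    {
        int bits = BitConverter.SingleToInt32Bits(f);
        return (bits & 0x7F800000) == 0x7F800000 && (bits & 0x007FFFFF) != 0;
    }
}
```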
if(float.IsNaN(fValue))
{
}
if (fValue.CompareTo(float.NaN) == 0)
Note: I know, the thread is dead.
