GetHashCode() on byte[] array

What does GetHashCode() calculate when invoked on the byte[] array?
The 2 data arrays with equal content do not provide the same hash.

Arrays in .NET don't override Equals or GetHashCode, so the value you'll get is basically based on reference equality (i.e. the default implementation in Object) - for value equality you'll need to roll your own code (or find some from a third party). You may want to implement IEqualityComparer<byte[]> if you're trying to use byte arrays as keys in a dictionary etc.
EDIT: Here's a reusable array equality comparer which should be fine so long as the array element handles equality appropriately. Note that you mustn't mutate the array after using it as a key in a dictionary, otherwise you won't be able to find it again - even with the same reference.
using System;
using System.Collections.Generic;
public sealed class ArrayEqualityComparer<T> : IEqualityComparer<T[]>
{
// You could make this a per-instance field with a constructor parameter
private static readonly EqualityComparer<T> elementComparer
= EqualityComparer<T>.Default;
public bool Equals(T[] first, T[] second)
{
if (first == second)
{
return true;
}
if (first == null || second == null)
{
return false;
}
if (first.Length != second.Length)
{
return false;
}
for (int i = 0; i < first.Length; i++)
{
if (!elementComparer.Equals(first[i], second[i]))
{
return false;
}
}
return true;
}
public int GetHashCode(T[] array)
{
unchecked
{
if (array == null)
{
return 0;
}
int hash = 17;
foreach (T element in array)
{
hash = hash * 31 + elementComparer.GetHashCode(element);
}
return hash;
}
}
}
class Test
{
static void Main()
{
byte[] x = { 1, 2, 3 };
byte[] y = { 1, 2, 3 };
byte[] z = { 4, 5, 6 };
var comparer = new ArrayEqualityComparer<byte>();
Console.WriteLine(comparer.GetHashCode(x));
Console.WriteLine(comparer.GetHashCode(y));
Console.WriteLine(comparer.GetHashCode(z));
Console.WriteLine(comparer.Equals(x, y));
Console.WriteLine(comparer.Equals(x, z));
}
}

Like other non-primitive built-in types, it just returns something arbitrary. It definitely doesn't try to hash the contents of the array. See this answer.

Simple solution
public static int GetHashFromBytes(byte[] bytes)
{
return new BigInteger(bytes).GetHashCode();
}

byte[] inherits GetHashCode() from object, it doesn't override it. So what you get is basically object's implementation.
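To make that concrete, here is a minimal sketch (the class and method names are made up for illustration). It assumes nothing beyond the base class library: two arrays with equal contents are still distinct keys for a HashSet using the default comparer, and the inherited GetHashCode() agrees with RuntimeHelpers.GetHashCode(), which is the identity-based hash.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;

public static class DefaultArrayHashDemo
{
    // True when the array's GetHashCode() is the identity-based one inherited from object.
    public static bool UsesIdentityHash(byte[] array) =>
        array.GetHashCode() == RuntimeHelpers.GetHashCode(array);

    // True when a set using the default comparer finds 'probe' after 'stored' was added.
    public static bool FoundInDefaultSet(byte[] stored, byte[] probe) =>
        new HashSet<byte[]> { stored }.Contains(probe);

    public static void Run()
    {
        byte[] a = { 1, 2, 3 };
        byte[] b = { 1, 2, 3 };
        Console.WriteLine(a.SequenceEqual(b));       // True: same contents
        Console.WriteLine(FoundInDefaultSet(a, a));  // True: same instance
        Console.WriteLine(FoundInDefaultSet(a, b));  // False: different instance
        Console.WriteLine(UsesIdentityHash(a));      // True
    }
}
```

The lookup with b fails regardless of the hash values, because the default Equals for arrays is reference equality.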

If you are using .NET 6, or at least .NET Core 2.1, you can write less code and get better performance with the System.HashCode struct.
Using the HashCode.AddBytes() method, which is available from .NET 6:
public int GetHashCode(byte[] value)
{
var hash = new HashCode();
hash.AddBytes(value);
return hash.ToHashCode();
}
Using the HashCode.Add method, which is available from .NET Core 2.1 (this version needs using System.Linq for Aggregate):
public int GetHashCode(byte[] value) =>
value.Aggregate(new HashCode(), (hash, i) => {
hash.Add(i);
return hash;
}).ToHashCode();
Note that the documentation for HashCode.AddBytes() says:
This method does not guarantee that the result of adding a span of bytes will match the result of adding the same bytes individually.
In this sharplab demo both approaches produce the same result, but that may vary with the .NET version or runtime environment.
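If the goal is to use byte arrays as dictionary keys, the hash needs a matching Equals. A rough sketch of a complete comparer built on HashCode.AddBytes follows; the class name ByteArrayComparer is my own, and it assumes .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer name; requires .NET 6+ for HashCode.AddBytes.
public sealed class ByteArrayComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] x, byte[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        // Compare the contents element by element via spans.
        return x.AsSpan().SequenceEqual(y.AsSpan());
    }

    public int GetHashCode(byte[] obj)
    {
        if (obj is null) return 0;
        var hash = new HashCode();
        hash.AddBytes(obj); // byte[] converts implicitly to ReadOnlySpan<byte>
        return hash.ToHashCode();
    }
}
```

Usage would be something like new Dictionary<byte[], string>(new ByteArrayComparer()). HashCode is seeded randomly per process, so these values are only meaningful within a single run and should not be persisted.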

If it's not the same instance, it will return different hashes. I'm guessing it is based on the memory address where it is stored somehow.

Related

Is this possible? Specify any generic type as long as the + operation is defined on it

I'm not sure if this is possible, but if it is then it would be useful.
I am attempting to program in a class called Matrix<T>. The intent is to be able to have matrices of various data types, such as integers, floats, doubles, etc.
I now want to define addition:
public static Matrix<T> operator +(Matrix<T> first, Matrix<T> second)
{
if (first.dimension != second.dimension)
{
throw new Exception("The matrices' dimensions do not match");
}
Matrix<T> add = new Matrix<T>(first.dimension);
for (int i = 1; i <= first.rows; i++)
{
for (int j = 1; j <= first.columns; j++)
{
add[i,j] = first[i,j] + second[i,j];
}
}
return add;
}
There is an issue with the line add[i,j] = first[i,j] + second[i,j]; since the operation + is not defined on a general object of type T.
I only want to specify matrices where T is a type such that addition is defined, however. So, I can make a matrix of ints, floats, doubles, etc. but if I were to try and define a matrix of, say, int[]s, I would want this to throw an exception since + is not defined for int[]s.
So, instead of writing T, is there some way of telling the computer "this can take in any generic type, as long as an operator + is defined on the type"? Or is this not possible, and would I have to separately define a matrix of ints, a matrix of floats, and so on?
Edit: I don't see how the linked question from closure is related to this - I see nothing about operators there. If they are related, can somebody explain how?

When this was written it was not possible (at least without losing compile-time safety or changing the API), but with preview features enabled and the System.Runtime.Experimental NuGet package you could use IAdditionOperators to restrict T to types that define the + operator. Generic math has since shipped with .NET 7 and C# 11, where the interface lives in System.Numerics. I would say that adding this interface to Matrix itself can also be a good idea:
class Matrix<T> : IAdditionOperators<Matrix<T>, Matrix<T>, Matrix<T>> where T : IAdditionOperators<T, T, T>
{
public static Matrix<T> operator +(Matrix<T> left, Matrix<T> right)
{
// swap to real implementation here
T x = default;
T y = default;
Console.WriteLine(x + y);
return default;
}
}
See also:
Generic math (especially section about trying it out, note - VS 2022 recommended)

It's possible using reflection
class Int
{
readonly int v;
public int Get => v;
public Int(int v)
{
this.v = v;
}
public static Int operator +(Int me, Int other) => new Int(me.v + other.v);
}
class Arr<T>
{
T[] _arr;
public Arr(T[] arr)
{
_arr = arr;
}
public T this[int index] => _arr[index];
public static Arr<T> operator+(Arr<T> me, Arr<T> other)
{
var addMethod = typeof(T).GetMethod("op_Addition");
if (addMethod == null)
throw new InvalidOperationException($"Type {typeof(T)} doesn't implement '+' operator");
var result = me._arr.Zip(other._arr)
.Select(elements => addMethod.Invoke(null, new object[] { elements.First, elements.Second }))
.Cast<T>()
.ToArray();
return new Arr<T>(result);
}
}
[Test]
public void TestAdd()
{
var firstArray = new Arr<Int>(new[] { new Int(1), new Int(2) });
var secondArray = new Arr<Int>(new[] { new Int(2), new Int(3) });
var sum = firstArray + secondArray;
Assert.AreEqual(3, sum[0].Get);
Assert.AreEqual(5, sum[1].Get);
}
I reduced the example to an array.
Unfortunately it compiles even if T doesn't implement the + operator, so you will get an exception at runtime. You could also check that the add method has the proper signature (returns T and takes two T's). If you need help understanding the code, let me know!
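One variation on the reflection idea, sketched under my own naming (Adder<T> is not from the answer above): build the addition once per type with an expression tree and cache the compiled delegate. As far as I know, GetMethod("op_Addition") only finds user-defined operators, whereas Expression.Add also handles built-in numeric types such as int and double.

```csharp
using System;
using System.Linq.Expressions;

public static class Adder<T>
{
    // Built once per T. Expression.Add throws InvalidOperationException when no
    // + operator is defined for T; because this runs in a static initializer,
    // callers see it wrapped in a TypeInitializationException.
    public static readonly Func<T, T, T> Add = Build();

    private static Func<T, T, T> Build()
    {
        ParameterExpression a = Expression.Parameter(typeof(T), "a");
        ParameterExpression b = Expression.Parameter(typeof(T), "b");
        return Expression.Lambda<Func<T, T, T>>(Expression.Add(a, b), a, b).Compile();
    }
}
```

Adder<int>.Add(2, 3) then costs a delegate call rather than a reflective Invoke per element. The check still happens at runtime, so this does not give compile-time safety either.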

Array as Dictionary key gives a lot of collisions

I need to use a list of numbers (longs) as a Dictionary key in order to do some group calculations on them.
When using the long array as a key directly, I get a lot of collisions. If I use string.Join(",", myLongs) as a key, it works as I would expect it to, but that's much, much slower (because the hash is more complicated, I assume).
Here's an example demonstrating my problem:
Console.WriteLine("Int32");
Console.WriteLine(new[] { 1, 2, 3, 0}.GetHashCode());
Console.WriteLine(new[] { 1, 2, 3, 0 }.GetHashCode());
Console.WriteLine("String");
Console.WriteLine(string.Join(",", new[] { 1, 2, 3, 0}).GetHashCode());
Console.WriteLine(string.Join(",", new[] { 1, 2, 3, 0 }).GetHashCode());
Output:
Int32
43124074
51601393
String
406954194
406954194
As you can see, the arrays return a different hash.
Is there any way of getting the performance of the long array hash, but the uniqueness of the string hash?
See my own answer below for a performance comparison of all the suggestions.
About the potential duplicate -- that question has a lot of useful information, but as this question was primarily about finding high performance alternatives, I think it still provides some useful solutions that are not mentioned there.

That the first one is different is actually good. Arrays are a reference type, and luckily they use the reference (somehow) during hash generation. I would guess that it is something like the pointer used at the machine-code level, or some garbage-collector-level value. It is one of the things you have no influence on, but it is shared if you assign the same instance to a new reference variable.
In the second case you get the hash of the string that string.Join builds. string.Join has an overload for sequences of any element type that joins the elements' ToString() results, so both calls produce "1,2,3,0". And string has all those special rules like "compares like a value type" and "string interning", so equal strings give the same hash.

Another alternative is to leverage the lesser known IEqualityComparer to implement your own hash and equality comparisons. There are some notes you'll need to observe about building good hashes, and it's generally not good practice to have editable data in your keys, as it'll introduce instability should the keys ever change, but it would certainly be more performant than using string joins.
public class ArrayKeyComparer : IEqualityComparer<int[]>
{
public bool Equals(int[] x, int[] y)
{
return x == null || y == null
? x == null && y == null
: x.SequenceEqual(y);
}
public int GetHashCode(int[] obj)
{
var seed = 0;
if(obj != null)
foreach (int i in obj)
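// Caution: seed starts at 0, so "seed %= ..." leaves it at 0 for every array
// (and throws DivideByZeroException when an element is 0).
seed %= i.GetHashCode();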
return seed;
}
}
Note that this still may not be as performant as a tuple, since it's still iterating the array rather than being able to take a more constant expression.

Your strings are returning the same hash codes for the same strings correctly because string.GetHashCode() is implemented that way.
The implementation of int[].GetHashCode() does something with its memory address to return the hash code, so arrays with identical contents will nevertheless return different hash codes.
So that's why your arrays with identical contents are returning different hash codes.
Rather than using an array directly as a key, you should consider writing a wrapper class for an array that will provide a proper hash code.
The main disadvantage with this is that it will be an O(N) operation to compute the hash code (it has to be - otherwise it wouldn't represent all the data in the array).
Fortunately you can cache the hash code so it's only computed once.
Another major problem with using a mutable array for a hash code is that if you change the contents of the array after using it for the key of a hashing container such as Dictionary, you will break the container.
Ideally you would only use this kind of hashing for arrays that are never changed.
Bearing all that in mind, a simple wrapper would look like this:
public sealed class IntArrayKey
{
public IntArrayKey(int[] array)
{
Array = array;
_hashCode = hashCode();
}
public int[] Array { get; }
public override int GetHashCode()
{
return _hashCode;
}
int hashCode()
{
int result = 17;
unchecked
{
foreach (var i in Array)
{
result = result * 23 + i;
}
}
return result;
}
readonly int _hashCode;
}
You can use that in place of the actual arrays for more sensible hash code generation.
As per the comments below, here's a version of the class that:
Makes a defensive copy of the array so that it cannot be modified.
Implements equality operators.
Exposes the underlying array as a read-only list, so callers can access its contents but cannot break its hash code.
Code:
public sealed class IntArrayKey: IEquatable<IntArrayKey>
{
public IntArrayKey(IEnumerable<int> sequence)
{
_array = sequence.ToArray();
_hashCode = hashCode();
Array = new ReadOnlyCollection<int>(_array);
}
public bool Equals(IntArrayKey other)
{
if (other is null)
return false;
if (ReferenceEquals(this, other))
return true;
return _hashCode == other._hashCode && equals(other.Array);
}
public override bool Equals(object obj)
{
return ReferenceEquals(this, obj) || obj is IntArrayKey other && Equals(other);
}
public static bool operator == (IntArrayKey left, IntArrayKey right)
{
return Equals(left, right);
}
public static bool operator != (IntArrayKey left, IntArrayKey right)
{
return !Equals(left, right);
}
public IReadOnlyList<int> Array { get; }
public override int GetHashCode()
{
return _hashCode;
}
bool equals(IReadOnlyList<int> other) // other cannot be null.
{
if (_array.Length != other.Count)
return false;
for (int i = 0; i < _array.Length; ++i)
if (_array[i] != other[i])
return false;
return true;
}
int hashCode()
{
int result = 17;
unchecked
{
foreach (var i in _array)
{
result = result * 23 + i;
}
}
return result;
}
readonly int _hashCode;
readonly int[] _array;
}
If you wanted to use the above class without the overhead of making a defensive copy of the array, you can change the constructor to:
public IntArrayKey(int[] array)
{
_array = array;
_hashCode = hashCode();
Array = new ReadOnlyCollection<int>(_array);
}

If you know the length of the arrays you're using, you could use a Tuple.
Console.WriteLine("Tuple");
Console.WriteLine(Tuple.Create(1, 2, 3, 0).GetHashCode());
Console.WriteLine(Tuple.Create(1, 2, 3, 0).GetHashCode());
Outputs
Tuple
1248
1248
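The same idea works with value tuples, which avoid allocating a Tuple object per key. A small sketch, assuming every row has exactly four elements as in the question's example (the names ValueTupleKeyDemo and CountKeys are made up):

```csharp
using System;
using System.Collections.Generic;

public static class ValueTupleKeyDemo
{
    // Counts how often each 4-number key occurs, using value tuples as keys.
    public static Dictionary<(long, long, long, long), int> CountKeys(
        IEnumerable<long[]> rows)
    {
        var counts = new Dictionary<(long, long, long, long), int>();
        foreach (long[] row in rows)
        {
            var key = (row[0], row[1], row[2], row[3]);
            counts.TryGetValue(key, out int current); // current is 0 if the key is absent
            counts[key] = current + 1;
        }
        return counts;
    }
}
```

Value tuples compare and hash by their elements, so two rows with the same numbers land on the same key.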

I took all the suggestions from this question and the similar byte[].GetHashCode() question, and made a simple performance test.
The suggestions are as follows:
int[] as key (original attempt -- does not work at all, included as a benchmark)
string as key (original solution -- works, but slow)
Tuple as key (suggested by David)
ValueTuple as key (inspired by the Tuple)
Direct int[] hash as key
IntArrayKey (suggested by Matthew Watson)
int[] as key with Skeet's IEqualityComparer
int[] as key with David's IEqualityComparer
I generated a List containing one million int[]-arrays of length 7 containing random numbers between 100 000 and 999 999 (which is an approximation of my current use case). Then I duplicated the first 100 000 of these arrays, so that there are 900 000 unique arrays, and 100 000 that are listed twice (to force collisions).
For each solution, I enumerated the list, and added the keys to a Dictionary, OR incremented the Value if the key already existed. Then I printed how many keys had a Value more than 1**, and how much time it took.
The results are as follows (ordered from best to worst):
Algorithm                Works?  Time usage
NonGenericSkeetEquality  YES     392 ms
SkeetEquality            YES     422 ms
ValueTuple               YES     521 ms
QuickIntArrayKey         YES     747 ms
IntArrayKey              YES     972 ms
Tuple                    YES     1 609 ms
string                   YES     2 291 ms
DavidEquality            YES     1 139 200 ms ***
int[]                    NO      336 ms
IntHash                  NO      386 ms
The Skeet IEqualityComparer is only slightly slower than using the int[] as key directly, with the huge advantage that it actually works, so I'll use that.
** I'm aware that this is not a completely fool proof solution, as I could theoretically get the expected number of collisions without it actually being the collisions I expected, but having run the test a lot of times, I'm fairly certain I don't.
*** Did not finish, probably due to poor hashing algorithm and a lot of equality checks. Had to reduce the number of arrays to 10 000, then multiply the time usage by 100 to compare with the others.
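For anyone who wants to repeat the measurement, the procedure described above can be sketched roughly like this. It is not the original test code: the names KeyBenchmark and IntArrayContentComparer are invented, the comparer is a simplified stand-in for the ones discussed, and the sizes are parameters so it can be run small.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Simplified content-based comparer, used here only as a stand-in.
public sealed class IntArrayContentComparer : IEqualityComparer<int[]>
{
    public bool Equals(int[] x, int[] y) =>
        ReferenceEquals(x, y) ||
        (x != null && y != null && x.AsSpan().SequenceEqual(y.AsSpan()));

    public int GetHashCode(int[] array)
    {
        unchecked
        {
            int hash = 17;
            foreach (int value in array) hash = hash * 31 + value;
            return hash;
        }
    }
}

public static class KeyBenchmark
{
    // Builds 'unique' random arrays of length 7, then appends copies of the first 'repeated'.
    public static List<int[]> MakeData(int unique, int repeated, int seed)
    {
        var random = new Random(seed);
        var data = new List<int[]>(unique + repeated);
        for (int n = 0; n < unique; n++)
        {
            var row = new int[7];
            for (int i = 0; i < row.Length; i++)
                row[i] = random.Next(100_000, 1_000_000);
            data.Add(row);
        }
        for (int n = 0; n < repeated; n++)
            data.Add((int[])data[n].Clone()); // equal contents, different instance
        return data;
    }

    // Counts occurrences per key and reports how many keys were seen more than once.
    public static (int KeysSeenTwice, TimeSpan Elapsed) Run(
        List<int[]> data, IEqualityComparer<int[]> comparer)
    {
        var stopwatch = Stopwatch.StartNew();
        var counts = new Dictionary<int[], int>(comparer);
        foreach (int[] key in data)
        {
            counts.TryGetValue(key, out int current);
            counts[key] = current + 1;
        }
        int seenTwice = 0;
        foreach (int value in counts.Values)
            if (value > 1) seenTwice++;
        stopwatch.Stop();
        return (seenTwice, stopwatch.Elapsed);
    }
}
```

Running it once with EqualityComparer<int[]>.Default and once with the content comparer reproduces the NO and YES rows in spirit. Timings will of course differ by machine and runtime.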

Get Enum name based on the Enum value

I have declared the following Enum:
public enum AfpRecordId
{
BRG = 0xD3A8C6,
ERG = 0xD3A9C6
}
and I want to retrieve the enum value from its underlying value:
private AfpRecordId GetAfpRecordId(byte[] data)
{
...
}
Call Examples:
byte[] tempData = new byte[] { 0xD3, 0xA8, 0xC6 };
AfpRecordId tempId = GetAfpRecordId(tempData);
//tempId should be equals to AfpRecordId.BRG
I would also like to use LINQ or a lambda, but only if they give better or equal performance.

Simple:
Convert the byte array into an int (either manually, or by creating a four byte array and using BitConverter.ToInt32)
Cast the int to AfpRecordId
Call ToString on the result if necessary (your subject line suggests getting the name, but your method signature only talks about the value)
For example:
private static AfpRecordId GetAfpRecordId(byte[] data)
{
// Alternatively, switch on data.Length and hard-code the conversion
// for lengths 1, 2, 3, 4 and throw an exception otherwise...
int value = 0;
foreach (byte b in data)
{
value = (value << 8) | b;
}
return (AfpRecordId) value;
}
You can use Enum.IsDefined to check whether the given data is actually a valid ID.
As for performance - check whether something simple gives you good enough performance before you look for something faster.
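Putting the conversion and the Enum.IsDefined check together might look like the sketch below. The helper name AfpRecordIdParser.TryParse is hypothetical, and the enum is repeated only to keep the snippet self-contained.

```csharp
using System;

public enum AfpRecordId
{
    BRG = 0xD3A8C6,
    ERG = 0xD3A9C6
}

public static class AfpRecordIdParser
{
    // Hypothetical helper: returns false for null, empty, over-long, or unknown data.
    public static bool TryParse(byte[] data, out AfpRecordId id)
    {
        id = default;
        if (data == null || data.Length == 0 || data.Length > 4)
            return false;

        int value = 0;
        foreach (byte b in data)
        {
            value = (value << 8) | b; // big-endian: the first byte is the most significant
        }

        if (!Enum.IsDefined(typeof(AfpRecordId), value))
            return false;

        id = (AfpRecordId)value;
        return true;
    }
}
```

With the data from the question, TryParse succeeds with AfpRecordId.BRG, and calling ToString() on the result gives the name "BRG".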

Assuming that tempData has 3 elements use Enum.GetName (typeof (AfpRecordId), tempData[0] * 256*256 + tempData[1] * 256 +tempData[2]).

If the array is of a known size (I'll assume the size is 3 as per your example) you can
add the elements together, each scaled by its position, and then cast the result to the enum:
private AfpRecordId GetAfpRecordId(byte[] tempData){
var temp = tempData[0] * 256*256 + tempData[1] * 256 +tempData[2];
return (AfpRecordId)temp;
}
A different approach would be to use the shift operator instead (the parentheses matter, because + binds more tightly than <<):
private AfpRecordId GetAfpRecordId(byte[] tempData){
var temp = (tempData[0] << 16) | (tempData[1] << 8) | tempData[2];
return (AfpRecordId)temp;
}

Why do 2 delegate instances return the same hashcode?

Take the following:
var x = new Action(() => { Console.Write("") ; });
var y = new Action(() => { });
var a = x.GetHashCode();
var b = y.GetHashCode();
Console.WriteLine(a == b);
Console.WriteLine(x == y);
This will print:
True
False
Why is the hashcode the same?
It is kinda surprising, and will make using delegates in a Dictionary as slow as a List (aka O(n) for lookups).
Update:
The question is why. IOW who made such a (silly) decision?
A better hashcode implementation would have been:
return Method ^ (Target == null ? 0 : Target.GetHashCode());
// where Method is IntPtr

Easy! Here is the implementation of GetHashCode (on the base class Delegate):
public override int GetHashCode()
{
return base.GetType().GetHashCode();
}
And here is the one on MulticastDelegate, which falls back to the implementation above:
public sealed override int GetHashCode()
{
if (this.IsUnmanagedFunctionPtr())
{
return ValueType.GetHashCodeOfPtr(base._methodPtr);
}
object[] objArray = this._invocationList as object[];
if (objArray == null)
{
return base.GetHashCode();
}
int num = 0;
for (int i = 0; i < ((int) this._invocationCount); i++)
{
num = (num * 0x21) + objArray[i].GetHashCode();
}
return num;
}
Using tools such as Reflector, we can see the code and it seems like the default implementation is as strange as we see above.
The type value here will be Action. Hence the result above is correct.

My first attempt of a better implementation:
public class DelegateEqualityComparer:IEqualityComparer<Delegate>
{
public bool Equals(Delegate del1,Delegate del2)
{
return (del1 != null) && del1.Equals(del2);
}
public int GetHashCode(Delegate obj)
{
if(obj==null)
return 0;
int result = obj.Method.GetHashCode() ^ obj.GetType().GetHashCode();
if(obj.Target != null)
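// Hash the target's identity, not the delegate's, so equal delegates hash equally
// (needs using System.Runtime.CompilerServices).
result ^= RuntimeHelpers.GetHashCode(obj.Target);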
return result;
}
}
The quality of this should be good for single cast delegates, but not so much for multicast delegates (If I recall correctly Target/Method return the values of the last element delegate).
But I'm not really sure if it fulfills the contract in all corner cases.
Hmm, it looks like equality requires referential equality of the targets.

This smells like some of the cases mentioned in this thread; maybe it will give you some pointers on this behaviour. Otherwise, you could log it there :-)
What's the strangest corner case you've seen in C# or .NET?
Rgds GJ

From MSDN:
The default implementation of GetHashCode does not guarantee uniqueness or consistency; therefore, it must not be used as a unique object identifier for hashing purposes. Derived classes must override GetHashCode with an implementation that returns a unique hash code. For best results, the hash code must be based on the value of an instance field or property, instead of a static field or property.
So if you have not overridden the GetHashCode method, it may return the same value. I suspect this is because it generates it from the definition, not the instance.

reinterpret_cast in C#

I'm looking for a way to reinterpret an array of type byte[] as a different type, say short[]. In C++ this would be achieved by a simple cast but in C# I haven't found a way to achieve this without resorting to duplicating the entire buffer.
Any ideas?

You can achieve this but this is a relatively bad idea. Raw memory access like this is not type-safe and can only be done under a full trust security environment. You should never do this in a properly designed managed application. If your data is masquerading under two different forms, perhaps you actually have two separate data sets?
In any case, here is a quick and simple code snippet to accomplish what you asked:
byte[] bytes = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int byteCount = bytes.Length;
unsafe
{
// By using the fixed keyword, we fix the array in a static memory location.
// Otherwise, the garbage collector might move it while we are still using it!
fixed (byte* bytePointer = bytes)
{
short* shortPointer = (short*)bytePointer;
for (int index = 0; index < byteCount / 2; index++)
{
Console.WriteLine("Short {0}: {1}", index, shortPointer[index]);
}
}
}

There are four good answers to this question. Each has different downsides. Of course, beware of endianness and realize that all of these answers are holes in the type system, just not particularly treacherous holes. In short, don't do this a lot, and only when you really need to.
Sander's answer. Use unsafe code to reinterpret pointers. This is the fastest solution, but it uses unsafe code. Not always an option.
Leonidas' answer. Use StructLayout and FieldOffset(0) to turn a struct into a union. The downsides to this are that some (rare) environments don't support StructLayout (eg Flash builds in Unity3D) and that StructLayout cannot be used with generics.
ljs' answer. Use BitConverter methods. This has the disadvantage that most of the methods allocate memory, which isn't great in low-level code. Also, there isn't a full suite of these methods, so you can't really use it generically.
Buffer.BlockCopy two arrays of different types. The only downside is that you need two buffers, which is perfect when converting arrays, but a pain when casting a single value. Just beware that length is specified in bytes, not elements. Buffer.ByteLength helps. Also, it only works on primitives, like ints, floats and bools, not structs or enums.
But you can do some neat stuff with it.
public static class Cast {
private static class ThreadLocalType<T> {
[ThreadStatic]
private static T[] buffer;
public static T[] Buffer
{
get
{
if (buffer == null) {
buffer = new T[1];
}
return buffer;
}
}
}
public static TTarget Reinterpret<TTarget, TSource>(TSource source)
{
TSource[] sourceBuffer = ThreadLocalType<TSource>.Buffer;
TTarget[] targetBuffer = ThreadLocalType<TTarget>.Buffer;
int sourceSize = Buffer.ByteLength(sourceBuffer);
int destSize = Buffer.ByteLength(targetBuffer);
if (sourceSize != destSize) {
throw new ArgumentException("Cannot convert " + typeof(TSource).FullName + " to " + typeof(TTarget).FullName + ". Data types are of different sizes.");
}
sourceBuffer[0] = source;
Buffer.BlockCopy(sourceBuffer, 0, targetBuffer, 0, sourceSize);
return targetBuffer[0];
}
}
class Program {
static void Main(string[] args)
{
Console.WriteLine("Float: " + Cast.Reinterpret<int, float>(100));
Console.ReadKey();
}
}

C# supports this so long as you are willing to use unsafe code, but only on structs.
For example (the framework provides this one for you, but you could extend the idea to int <-> uint conversion):
public unsafe long DoubleToLongBits(double d)
{
return *((long*) (void*) &d);
}
Since the arrays are reference types and hold their own metadata about their type you cannot reinterpret them without overwriting the metadata header on the instance as well (an operation likely to fail).
You can however take a foo* from a foo[] and cast that to a bar* (via the technique above) and use that to iterate over the array. Doing this requires you to pin the original array for the lifetime of the reinterpreted pointer's use.
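For single values, the framework methods mentioned above avoid unsafe code altogether. A short sketch (the wrapper class BitsDemo is just for illustration; the BitConverter calls are the real framework methods):

```csharp
using System;

public static class BitsDemo
{
    // Same bits, reinterpreted: double -> long and back.
    public static long ToBits(double d) => BitConverter.DoubleToInt64Bits(d);
    public static double FromBits(long bits) => BitConverter.Int64BitsToDouble(bits);

    // int <-> uint needs no pointers at all: an unchecked cast keeps the bit pattern.
    public static uint ToUInt(int value) => unchecked((uint)value);
}
```

For instance, ToBits(1.0) gives 0x3FF0000000000000, and ToUInt(-1) gives uint.MaxValue.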

You could wrap your shorts/bytes into a structure which allows you to access both values:
See also here: C++ union in C#
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
namespace TestShortUnion {
[StructLayout(LayoutKind.Explicit)]
public struct shortbyte {
public static implicit operator shortbyte(int input) {
if (input > short.MaxValue)
throw new ArgumentOutOfRangeException("input", "shortbyte only accepts values in the short-range");
return new shortbyte((short)input);
}
public shortbyte(byte input) {
shortval = 0;
byteval = input;
}
public shortbyte(short input) {
byteval = 0;
shortval = input;
}
[FieldOffset(0)]
public byte byteval;
[FieldOffset(0)]
public short shortval;
}
class Program {
static void Main(string[] args) {
shortbyte[] testarray = new shortbyte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1111 };
foreach (shortbyte singleval in testarray) {
Console.WriteLine("Byte {0}: Short {1}", singleval.byteval, singleval.shortval);
}
System.Console.ReadLine();
}
}
}

You can use System.Memory to do this in a safe way.
public static TTo[] Cast<TFrom, TTo>(this TFrom[] source) where TTo : struct where TFrom : struct =>
MemoryMarshal.Cast<TFrom, TTo>(source).ToArray();
private byte[] CastToBytes(int[] foo) => foo.Cast<int, byte>();
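Note that the ToArray() call above copies the data. If a view over the same memory is enough, MemoryMarshal.Cast on its own gives you that without copying. A minimal sketch, with SpanReinterpret and AsShorts being names I made up:

```csharp
using System;
using System.Runtime.InteropServices;

public static class SpanReinterpret
{
    // Returns a view over the same memory as 'bytes'; nothing is copied.
    // A trailing odd byte is simply not covered by the view.
    public static Span<short> AsShorts(byte[] bytes) =>
        MemoryMarshal.Cast<byte, short>(bytes.AsSpan());
}
```

Writes through the returned span show up in the original byte array. The values you read depend on the machine's endianness, so check BitConverter.IsLittleEndian if the byte order of the data matters.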

This kind of behaviour would result in C# being rather type-unsafe. You can easily achieve this in a type-safe manner, however (though of course you are copying the array in doing so).
If you want one byte to map to one short then it's simple using ConvertAll, e.g.:-
short[] shorts = Array.ConvertAll(bytes, b => (short)b);
If you want to simply map every 2 bytes to a short then the following should work:-
if (bytes.Length % 2 != 0)
{
throw new ArgumentException("Byte array must have even length.");
}
short[] shorts = new short[bytes.Length / 2];
for (int i = 0; i < bytes.Length / 2; ++i)
{
shorts[i] = BitConverter.ToInt16(bytes, 2*i);
}
It may be possible to use the marshaller to do some weird bit-twiddling to achieve this, probably using an unsafe { ... } code block, though this would be prone to errors and make your code unverifiable.
I suspect what you're trying to do can be achieved better using a type-safe idiom rather than a type-unsafe C/C++ one!
Update: Updated to take into account comment.
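BitConverter.ToInt16 uses the byte order of the machine it runs on. If the data has a fixed byte order, a variant using BinaryPrimitives states it explicitly. This is a sketch assuming little-endian input and a runtime with System.Buffers.Binary (.NET Core 2.1 or later, or the System.Memory package); ShortDecoder is a made-up name.

```csharp
using System;
using System.Buffers.Binary;

public static class ShortDecoder
{
    // Decodes consecutive byte pairs as little-endian 16-bit integers (copies the data).
    public static short[] FromLittleEndian(byte[] bytes)
    {
        if (bytes.Length % 2 != 0)
        {
            throw new ArgumentException("Byte array must have even length.");
        }

        var shorts = new short[bytes.Length / 2];
        for (int i = 0; i < shorts.Length; i++)
        {
            shorts[i] = BinaryPrimitives.ReadInt16LittleEndian(bytes.AsSpan(2 * i, 2));
        }
        return shorts;
    }
}
```

For example, the bytes { 1, 0, 0, 1 } decode to 1 and 256 on any platform.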

Casting like this is fundamentally unsafe and not permitted in a managed language. That's also why C# doesn't support unions. Yes, the workaround is to use the Marshal class.

Wouldn't it be possible to create a collection class that implements an interface for both bytes and shorts? Maybe implement both IList< byte > and IList< short >? Then you can have your underlying collection contain bytes, but implement IList< short > functions that work on byte pairs.

I used the code from FastArraySerializer to create a type converter to get from SByte[] to Double[]
public unsafe class ConvertArrayType
{
[StructLayout(LayoutKind.Explicit)]
private struct Union
{
[FieldOffset(0)] public sbyte[] sbytes;
[FieldOffset(0)] public double[] doubles;
}
private Union _union;
public double[] doubles {
get { return _union.doubles; }
}
public sbyte[] sbytes
{
get { return _union.sbytes; }
}
[StructLayout(LayoutKind.Sequential, Pack = 1)]
private struct ArrayHeader
{
public UIntPtr type;
public UIntPtr length;
}
private readonly UIntPtr SBYTE_ARRAY_TYPE;
private readonly UIntPtr DOUBLE_ARRAY_TYPE;
public ConvertArrayType(Array ary, Type newType)
{
fixed (void* pBytes = new sbyte[1])
fixed (void* pDoubles = new double[1])
{
SBYTE_ARRAY_TYPE = getHeader(pBytes)->type;
DOUBLE_ARRAY_TYPE = getHeader(pDoubles)->type;
}
Type typAry = ary.GetType();
if (typAry == newType)
throw new Exception("No Type change specified");
if (!(typAry == typeof(SByte[]) || typAry == typeof(double[])))
throw new Exception("Type Not supported");
if (newType == typeof(Double[]))
{
ConvertToArrayDbl((SByte[])ary);
}
else if (newType == typeof(SByte[]))
{
ConvertToArraySByte((Double[])ary);
}
else
{
throw new Exception("Type Not supported");
}
}
private void ConvertToArraySByte(double[] ary)
{
_union = new Union { doubles = ary };
toArySByte(_union.doubles);
}
private void ConvertToArrayDbl(sbyte[] ary)
{
_union = new Union { sbytes = ary };
toAryDouble(_union.sbytes);
}
private ArrayHeader* getHeader(void* pBytes)
{
return (ArrayHeader*)pBytes - 1;
}
private void toAryDouble(sbyte[] ary)
{
fixed (void* pArray = ary)
{
var pHeader = getHeader(pArray);
pHeader->type = DOUBLE_ARRAY_TYPE;
pHeader->length = (UIntPtr)(ary.Length / sizeof(double));
}
}
private void toArySByte(double[] ary)
{
fixed (void* pArray = ary)
{
var pHeader = getHeader(pArray);
pHeader->type = SBYTE_ARRAY_TYPE;
pHeader->length = (UIntPtr)(ary.Length * sizeof(double));
}
}
} // ConvertArrayType{}
Here's the VB usage:
Dim adDataYgch As Double() = Nothing
Try
Dim nGch As GCHandle = GetGch(myTag)
If GCHandle.ToIntPtr(nGch) <> IntPtr.Zero AndAlso nGch.IsAllocated Then
Dim asb As SByte()
asb = CType(nGch.Target, SByte())
Dim cvt As New ConvertArrayType(asb, GetType(Double()))
adDataYgch = cvt.doubles
End If
Catch ex As Exception
Debug.WriteLine(ex.ToString)
End Try
