In C# and Vb.net,is any way without iterating by loop a bitarray to check contins any true or false value (Dotnet 2.0) ?
I doubt there's any way you could do it without a loop under the hood (as a BitArray can be arbitrarily long, unlike BitVector32), but if you just don't want to write it yourself:
var hasAnyTrue = input.Cast<bool>().Contains(true);
var hasAnyFalse = input.Cast<bool>().Contains(false);
If you are using the BitArray class from System.Collections you can use the following code to determine if anything is true.
C# version
var anyTrue = myArray.Cast<bool>().Any(x => x);
VB.Net Version
Dim anyTrue = myArray.Cast(Of Boolean)().Any(Function(x) x)
Indexing into the BitArray and checking the individual boolean values is an obvious solution. If you are concerned about performance, you should first and foremost consider creating your own abstraction, but if you prefer using BitArray for most of your operations, then you could do the check using CopyTo to an int[] of the right size (Count >> 5), and then perform zero or non-zero checks on these ints as appropriate.
I don't know if you can do it using a BitArray, but if you use an int, long etc. and then check to see if it is greater than 0 (for true) or less than the max value of the data type (for false) that would do it.
so something like this:
bool IsTrue (int bitArray)
{
return bitArray != 0;
}
bool isFalse (int bitArray)
{
return bitArray != int.MinValue;
}
Related
I need to use a list of numbers (longs) as a Dictionary key in order to do some group calculations on them.
When using the long array as a key directly, I get a lot of collisions. If I use string.Join(",", myLongs) as a key, it works as I would expect it to, but that's much, much slower (because the hash is more complicated, I assume).
Here's an example demonstrating my problem:
Console.WriteLine("Int32");
Console.WriteLine(new[] { 1, 2, 3, 0}.GetHashCode());
Console.WriteLine(new[] { 1, 2, 3, 0 }.GetHashCode());
Console.WriteLine("String");
Console.WriteLine(string.Join(",", new[] { 1, 2, 3, 0}).GetHashCode());
Console.WriteLine(string.Join(",", new[] { 1, 2, 3, 0 }).GetHashCode());
Output:
Int32
43124074
51601393
String
406954194
406954194
As you can see, the arrays return a different hash.
Is there any way of getting the performance of the long array hash, but the uniqeness of the string hash?
See my own answer below for a performance comparison of all the suggestions.
About the potential duplicate -- that question has a lot of useful information, but as this question was primarily about finding high performance alternatives, I think it still provides some useful solutions that are not mentioned there.
That the first one is different is actually good. Arrays are a reference type and luckily they are using the reference (somehow) during hash generation. I would guess that is something like the Pointer that is used on machine code level, or some Garbage Colletor level value. One of the things you have no influence on but is copied if you assign the same instance to a new reference variable.
In the 2nd case you get the hash value on a string consisting of "," and whatever (new[] { 1, 2, 3, 0 }).ToString(); should return. The default is something like teh class name, so of course in both cases they will be the same. And of course string has all those funny special rules like "compares like a value type" and "string interning", so the hash should be the same.
Another alternative is to leverage the lesser known IEqualityComparer to implement your own hash and equality comparisons. There are some notes you'll need to observe about building good hashes, and it's generally not good practice to have editable data in your keys, as it'll introduce instability should the keys ever change, but it would certainly be more performant than using string joins.
public class ArrayKeyComparer : IEqualityComparer<int[]>
{
public bool Equals(int[] x, int[] y)
{
return x == null || y == null
? x == null && y == null
: x.SequenceEqual(y);
}
public int GetHashCode(int[] obj)
{
var seed = 0;
if(obj != null)
foreach (int i in obj)
seed %= i.GetHashCode();
return seed;
}
}
Note that this still may not be as performant as a tuple, since it's still iterating the array rather than being able to take a more constant expression.
Your strings are returning the same hash codes for the same strings correctly because string.GetHashCode() is implemented that way.
The implementation of int[].GetHashCode() does something with its memory address to return the hash code, so arrays with identical contents will nevertheless return different hash codes.
So that's why your arrays with identical contents are returning different hash codes.
Rather than using an array directly as a key, you should consider writing a wrapper class for an array that will provide a proper hash code.
The main disadvantage with this is that it will be an O(N) operation to compute the hash code (it has to be - otherwise it wouldn't represent all the data in the array).
Fortunately you can cache the hash code so it's only computed once.
Another major problem with using a mutable array for a hash code is that if you change the contents of the array after using it for the key of a hashing container such as Dictionary, you will break the container.
Ideally you would only use this kind of hashing for arrays that are never changed.
Bearing all that in mind, a simple wrapper would look like this:
public sealed class IntArrayKey
{
public IntArrayKey(int[] array)
{
Array = array;
_hashCode = hashCode();
}
public int[] Array { get; }
public override int GetHashCode()
{
return _hashCode;
}
int hashCode()
{
int result = 17;
unchecked
{
foreach (var i in Array)
{
result = result * 23 + i;
}
}
return result;
}
readonly int _hashCode;
}
You can use that in place of the actual arrays for more sensible hash code generation.
As per the comments below, here's a version of the class that:
Makes a defensive copy of the array so that it cannot be modified.
Implements equality operators.
Exposes the underlying array as a read-only list, so callers can access its contents but cannot break its hash code.
Code:
public sealed class IntArrayKey: IEquatable<IntArrayKey>
{
public IntArrayKey(IEnumerable<int> sequence)
{
_array = sequence.ToArray();
_hashCode = hashCode();
Array = new ReadOnlyCollection<int>(_array);
}
public bool Equals(IntArrayKey other)
{
if (other is null)
return false;
if (ReferenceEquals(this, other))
return true;
return _hashCode == other._hashCode && equals(other.Array);
}
public override bool Equals(object obj)
{
return ReferenceEquals(this, obj) || obj is IntArrayKey other && Equals(other);
}
public static bool operator == (IntArrayKey left, IntArrayKey right)
{
return Equals(left, right);
}
public static bool operator != (IntArrayKey left, IntArrayKey right)
{
return !Equals(left, right);
}
public IReadOnlyList<int> Array { get; }
public override int GetHashCode()
{
return _hashCode;
}
bool equals(IReadOnlyList<int> other) // other cannot be null.
{
if (_array.Length != other.Count)
return false;
for (int i = 0; i < _array.Length; ++i)
if (_array[i] != other[i])
return false;
return true;
}
int hashCode()
{
int result = 17;
unchecked
{
foreach (var i in _array)
{
result = result * 23 + i;
}
}
return result;
}
readonly int _hashCode;
readonly int[] _array;
}
If you wanted to use the above class without the overhead of making a defensive copy of the array, you can change the constructor to:
public IntArrayKey(int[] array)
{
_array = array;
_hashCode = hashCode();
Array = new ReadOnlyCollection<int>(_array);
}
If you know the length of the arrays you're using, you could use a Tuple.
Console.WriteLine("Tuple");
Console.WriteLine(Tuple.Create(1, 2, 3, 0).GetHashCode());
Console.WriteLine(Tuple.Create(1, 2, 3, 0).GetHashCode());
Outputs
Tuple
1248
1248
I took all the suggestions from this question and the similar byte[].GetHashCode() question, and made a simple performance test.
The suggestions are as follows:
int[] as key (original attempt -- does not work at all, included as a benchmark)
string as key (original solution -- works, but slow)
Tuple as key (suggested by David)
ValueTuple as key (inspired by the Tuple)
Direct int[] hash as key
IntArrayKey (suggested by Matthew Watson)
int[] as key with Skeet's IEqualityComparer
int[] as key with David's IEqualityComparer
I generated a List containing one million int[]-arrays of length 7 containing random numbers between 100 000 and 999 999 (which is an approximation of my current use case). Then I duplicated the first 100 000 of these arrays, so that there are 900 000 unique arrays, and 100 000 that are listed twice (to force collisions).
For each solution, I enumerated the list, and added the keys to a Dictionary, OR incremented the Value if the key already existed. Then I printed how many keys had a Value more than 1**, and how much time it took.
The results are as follows (ordered from best to worst):
Algorithm Works? Time usage
NonGenericSkeetEquality YES 392 ms
SkeetEquality YES 422 ms
ValueTuple YES 521 ms
QuickIntArrayKey YES 747 ms
IntArrayKey YES 972 ms
Tuple YES 1 609 ms
string YES 2 291 ms
DavidEquality YES 1 139 200 ms ***
int[] NO 336 ms
IntHash NO 386 ms
The Skeet IEqualityComparer is only slightly slower than using the int[] as key directly, with the huge advantage that it actually works, so I'll use that.
** I'm aware that this is not a completely fool proof solution, as I could theoretically get the expected number of collisions without it actually being the collisions I expected, but having run the test a lot of times, I'm fairly certain I don't.
*** Did not finish, probably due to poor hashing algorithm and a lot of equality checks. Had to reduce the number of arrays to 10 000, then multiply the time usage by 100 to compare with the others.
I create the following array like this:
array<UInt16>^ temp = gcnew array<UInt16>(1000);
How do I determine if this entire array has been filled with zero or not.
I think I may be able to use TrueForAll(T) but I'm not sure.
var allElementsAreZero = temp.All(o => o == 0);
Simple as that.
It'll return when it finds one that doesn't satisfy the condition, so may not necessarily iterate through your whole collection:
"The enumeration of source is stopped as soon as the result can be determined."
https://msdn.microsoft.com/en-us/library/bb548541(v=vs.110).aspx
This should work properly (here I used LINQ):
IEnumerable<int> values = new List<int>(); // Or use any array type instead of List.
... Add your values here ...
var allAreZero = !values.Any(v => v != 0);
P.S. the array class inherits IEnumerable.
And here is a solution with foreach:
var isAllZero = true;
foreach (var value in values)
{
if (value != 0)
{
isAllZero = false;
break;
}
}
UPDATE
The really difference between TrueForAll, and my LINQ code is: LINQ code uses the fluent (or maybe also query) syntax, where TrueForAll is just a normal function where you send the array as a parameter.
initialize a counter from 0 then use for loop to interate through the array and increment the counter whenever it finds 0, and at the end compare the counter with size of array if its equal, it has all zeros
Reading the C++/CLI specification, it has been filled with
0s because you created it with a "new-expression" and the default value of the element type is 0.
24.2 CLI array creation
CLI array instances are created by new-expressions containing gcnew (§15.4.6) or …
Elements of CLI arrays created by new-expressions are always initialized to their default value.
I have an array of booleans which gets filled by a loop. The method that owns the array needs to return a single boolean. So can I do this:
bool[] Booleans = new bool[4];
// do work - fill array
return (Booleans[0] && Booleans[1] && Booleans[2] && Booleans[3]);
So if I have: T,T,F,T will I get F back since there is one in the array or will it send back something else or just crash all together?
A single false will result in false being returned with boolean AND logic.
You can also rewrite as:
return Booleans.All(b => b);
For the sake of completeness, or if LINQ is not an option, you can achieve the same via a loop:
var list = new List<bool> { true, false, true };
bool result = true;
foreach (var item in list)
{
result &= item;
if (!item)
break;
}
Console.WriteLine(result);
For small samples what you have is fine, but as the number of items grow either of the above approaches will make the code a lot friendlier.
If you must process the array and return only a single boolean, then the best approach would be to just set a value to true, then loop through until the value is false, or you reach the end of the loop.
This way you won't have to process past the first false value, since that makes the entire result false.
How you loop through depends on which version of .NET you are using.
We have this field in our database indicating a true/false flag for each day of the week that looks like this : '1111110'
I need to convert this value to an array of booleans.
For that I have written the following code :
char[] freqs = weekdayFrequency.ToCharArray();
bool[] weekdaysEnabled = new bool[]{
Convert.ToBoolean(int.Parse(freqs[0].ToString())),
Convert.ToBoolean(int.Parse(freqs[1].ToString())),
Convert.ToBoolean(int.Parse(freqs[2].ToString())),
Convert.ToBoolean(int.Parse(freqs[3].ToString())),
Convert.ToBoolean(int.Parse(freqs[4].ToString())),
Convert.ToBoolean(int.Parse(freqs[5].ToString())),
Convert.ToBoolean(int.Parse(freqs[6].ToString()))
};
And I find this way too clunky because of the many conversions.
What would be the ideal/cleanest way to convert this fixed length string into an Array of bool ??
I know you could write this in a for-loop but the amount of days in a week will never change and therefore I think this is the more performant way to go.
A bit of LINQ can make this a pretty trivial task:
var weekdaysEnabled = weekdayFrequency.Select(chr => chr == '1').ToArray();
Note that string already implements IEnumerable<char>, so you can use the LINQ methods on it directly.
In .NET 2
bool[] weekdaysEnabled1 =
Array.ConvertAll<char, bool>(
freqs,
new Converter<char, bool>(delegate(char c) { return Convert.ToBoolean(int.Parse(freqs[0].ToString())); }));
If you have a string of "1,2,3,1,5,7" you can put this in an array or hash table or whatever is deemed best.
How do you determine that all value are the same? In the above example it would fail but if you had "1,1,1" that would be true.
This can be done nicely using lambda expressions.
For an array, named arr:
var allSame = Array.TrueForAll(arr, x => x == arr[0]);
For an list (List<T>), named lst:
var allSame = lst.TrueForAll(x => x == lst[0]);
And for an iterable (IEnumerable<T>), named col:
var first = col.First();
var allSame = col.All(x => x == first);
Note that these methods don't handle empty arrays/lists/iterables however. Such support would be trivial to add however.
Iterate through each value, store the first value in a variable and compare the rest of the array to that variable. The instant one fails, you know all the values are not the same.
How about something like...
string numArray = "1,1,1,1,1";
return numArrray.Split( ',' ).Distinct().Count() <= 1;
I think using List<T>.TrueForAll would be a slick approach.
http://msdn.microsoft.com/en-us/library/kdxe4x4w.aspx
Not as efficient as a simple loop (as it always processes all items even if the result could be determined sooner), but:
if (new HashSet<string>(numbers.Split(',')).Count == 1) ...