implementation of insert function of Bucket hashing algorithm - c#

In the code below, doesn't it say, "if the array list does not contain the item then add it"?
public void Insert(string item)
{
int hash_value;
hash_value = Hash(value);
if (data[hash_value].Contains(item))
data[hash_value].Add(item);
}
If the item is already there, then why is it adding it again?

As written, the code says: "if the collection at the index given by the hash of value contains item, then add it again". The if condition is missing a '!' (the logical "not" operator) in front of it, like this:
public void Insert(string item)
{
int hashValue = Hash(value);
if (!data[hashValue].Contains(item)) data[hashValue].Add(item);
}
Note that value is undefined in your code snippet; you presumably meant to hash item. You will also get an IndexOutOfRangeException if hashValue is negative, or greater than or equal to the length of data (assuming data is an array).
Also, the declaration of data isn't shown, so the entry at data[hashValue] may be null, in which case calling .Contains() on it throws a NullReferenceException.
There are different ways to handle these issues, depending on the type of data. If it's an array of lists, for example, you could skip the insert when the hash is out of range, and initialize the list when it's null:
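public void Insert(string item)
{
    // Assumes Hash(item) was intended; value is not defined anywhere in the snippet.
    int hashValue = Hash(item);
    // Skip hashes that fall outside the array, including negative ones.
    if (hashValue >= 0 && hashValue < data.Length)
    {
        // Create the bucket on first use.
        if (data[hashValue] == null) data[hashValue] = new List<string>();
        if (!data[hashValue].Contains(item)) data[hashValue].Add(item);
    }
}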
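As a rough, self-contained sketch of the same idea, the class below maps every item to a valid bucket instead of silently dropping out-of-range hashes. The original Hash function and the declaration of data are not shown, so the names BucketHashSet and BucketIndex, and the use of string.GetHashCode(), are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container; BucketHashSet and BucketIndex are illustrative names.
public class BucketHashSet
{
    private readonly List<string>[] data;

    public BucketHashSet(int bucketCount)
    {
        if (bucketCount <= 0) throw new ArgumentOutOfRangeException(nameof(bucketCount));
        data = new List<string>[bucketCount];
    }

    // Maps any string to an index in [0, data.Length).
    private int BucketIndex(string item)
    {
        // Mask off the sign bit so the remainder is never negative.
        return (item.GetHashCode() & 0x7FFFFFFF) % data.Length;
    }

    // Returns true if the item was added, false if it was already present.
    public bool Insert(string item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        int index = BucketIndex(item);
        if (data[index] == null) data[index] = new List<string>(); // create bucket on first use
        if (data[index].Contains(item)) return false;
        data[index].Add(item);
        return true;
    }

    public bool Contains(string item)
    {
        if (item == null) return false;
        List<string> bucket = data[BucketIndex(item)];
        return bucket != null && bucket.Contains(item);
    }
}
```

Because the index is reduced modulo the number of buckets, the range check from the answer above is no longer needed in this variant.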

Related

If I'm using non-nullable reference types, how do I show that I didn't find anything?

I've enabled the C# 8.0 non-nullable reference types feature in one of my projects, but now I'm unclear about how to represent missing data.
For example, I'm reading a file whose lines are colon-separated key/value pairs. Sometimes there's more than one colon on a line. In that case, the text before the first colon is the key, and the rest is the value. My code to parse each line looks like this:
public (string key, string value) GetKeyValue(string line)
{
var split = line.Split(':');
if (split.Length == 2)
return (split[0].Trim(), split[1].Trim());
else if (split.Length > 2)
{
var joined = string.Join(":", split.ToList().Skip(1));
return (split[0].Trim(), joined.Trim());
}
else
{
Debug.Print($"Couldn't parse this into key/value: {line}");
return (null, null);
}
}
What this does: If we have just one colon, return the key and value. If we have more than one, join the rest of the text after the first colon, then return the key and value. Otherwise we have no colons and can't parse it, so return a null tuple. (Let's assume this last case can reasonably happen; I can't just throw and call it a bad file.)
Obviously that last line gets a nullability warning unless I change the declaration to
public (string? key, string? value) GetKeyValue(string line)
Now in F# I would just use an Option type and in the no-colon case, I'd return None.
But C# doesn't have an Option type. I could return ("", ""), but to me that doesn't seem better than nulls.
In a case like this, what's a good way to say "I didn't find anything" without using nulls?
You could indicate whether parsing succeeded by returning a result object that carries a flag:
public class Result
{
    private Result() { }
    public bool Successful { get; private set; } = false;
    public string Key { get; private set; } = string.Empty;
    public string Value { get; private set; } = string.Empty;
    // Named Success because a method cannot share its name with the Successful property.
    public static Result Success(string key, string value)
    {
        return new Result
        {
            Successful = true,
            Key = key,
            Value = value
        };
    }
    public static Result Failed()
    {
        return new Result();
    }
}
public Result GetKeyValue(string line)
{
    // Parsing omitted; return Result.Success(key, value) when the line has a colon.
    return Result.Failed();
}
Then you could use it like
var result = GetKeyValue("yoda");
if(result.Successful)
{
// do something...
}
Alternatively, you could return two different types and use pattern matching 👍
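A minimal sketch of what the two-types idea might look like, assuming C# 7 or later for type patterns in switch. None of the names here (ParseResult, Parsed, NotParsed, LineParser, Describe) come from the original posts; they are hypothetical.

```csharp
using System;

// Hypothetical result types: one for a parsed line, one for a line without a key.
public abstract class ParseResult { }

public sealed class Parsed : ParseResult
{
    public Parsed(string key, string value) { Key = key; Value = value; }
    public string Key { get; }
    public string Value { get; }
}

public sealed class NotParsed : ParseResult
{
    public NotParsed(string line) { Line = line; }
    public string Line { get; }
}

public static class LineParser
{
    public static ParseResult GetKeyValue(string line)
    {
        // Everything before the first colon is the key; the rest is the value.
        int colon = line.IndexOf(':');
        if (colon < 0) return new NotParsed(line);
        return new Parsed(line.Substring(0, colon).Trim(), line.Substring(colon + 1).Trim());
    }

    public static string Describe(string line)
    {
        // The caller matches on the runtime type instead of checking for null.
        switch (GetKeyValue(line))
        {
            case Parsed p:
                return p.Key + " => " + p.Value;
            case NotParsed n:
                return "no key in: " + n.Line;
            default:
                throw new InvalidOperationException("Unknown result type.");
        }
    }
}
```

Neither result type ever holds a null, so this fits with non-nullable reference types, at the cost of two extra small classes.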
Actually, I realize now that part of the problem is that my method is doing two separate things:
Determine whether the line has a key.
Return the key and value.
Thus the return value has to indicate both whether there's a key and value, and what the key and value are.
I can simplify by doing the first item separately:
bool HasKey(string line)
{
var split = line.Split(':');
return split.Length >= 2;
}
Then in the method I posted, if there's no key, I can throw and say that the lines need to be filtered by HasKey first.
Putting on my functional thinking cap, an idiomatic return type would be IEnumerable<(string?, string?)>: a sequence that holds one element when a key is found and is empty otherwise. The only changes to your code are to replace each return with yield return, and to drop the return statement in the branch where nothing is found.
public IEnumerable<(string? key, string? value)> GetKeyValue(string line)
{
    var split = line.Split(':');
    if (split.Length == 2)
        yield return (split[0].Trim(), split[1].Trim());
    else if (split.Length > 2)
    {
        var joined = string.Join(":", split.ToList().Skip(1));
        yield return (split[0].Trim(), joined.Trim());
    }
    else
    {
        // Nothing is yielded here, so the caller sees an empty sequence.
        Debug.Print($"Couldn't parse this into key/value: {line}");
    }
}
The caller then has several options on how to handle the response.
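If they want to check whether the key was found the old-fashioned way, they can do this:
var result = GetKeyValue(line).SingleOrDefault();
// A value tuple has no HasValue; an empty sequence yields the default tuple, whose key is null.
if (result.key == null) HandleKeyNotFound();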
If they prefer to throw an exception if the key is not found, they'd do this:
var result = GetKeyValue(line).Single();
If they just want to be quiet about it, they can use foreach, which uses the key and value if they are found and simply does nothing if they are not:
foreach (var result in GetKeyValue(line)) DoSomething(result.Item1, result.Item2);
Also, for what it's worth, I'd suggest using KeyValuePair instead of a tuple, since it clearly communicates the purpose of the fields.
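For illustration, here is one way the iterator could look with KeyValuePair. This is only a sketch: the class name KeyValueParser is made up, and it splits on the first colon only (using the Split overload that takes a maximum count), which is a simplification of the code in the question rather than the asker's method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyValueParser
{
    // Yields exactly one pair when the line contains a colon, and nothing otherwise.
    public static IEnumerable<KeyValuePair<string, string>> GetKeyValue(string line)
    {
        // Split on the first colon only, so later colons stay in the value.
        string[] split = line.Split(new[] { ':' }, 2);
        if (split.Length == 2)
        {
            yield return new KeyValuePair<string, string>(split[0].Trim(), split[1].Trim());
        }
        // No colon: the sequence is simply empty, and no null is ever returned.
    }
}
```

A caller could then write foreach (var pair in KeyValueParser.GetKeyValue(line)) and read pair.Key and pair.Value, or call Any() first to see whether anything was found.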

How to check whether two lists items have value equality using EqualityComparer? [duplicate]

Before marking this as duplicate because of its title please consider the following short program:
static void Main()
{
var expected = new List<long[]> { new[] { Convert.ToInt64(1), Convert.ToInt64(999999) } };
var actual = DoSomething();
if (!actual.SequenceEqual(expected)) throw new Exception();
}
static IEnumerable<long[]> DoSomething()
{
yield return new[] { Convert.ToInt64(1), Convert.ToInt64(999999) };
}
I have a method which returns a sequence of arrays of type long. To test it, I wrote some test code similar to the code in Main.
However, I get the exception and I don't know why. Shouldn't the expected sequence be comparable to the one actually returned, or did I miss something?
To me it looks as if both the method's result and expected contain exactly one element, an array of type long. Don't they?
EDIT: So how do I compare the elements within the enumeration by value, so that the check succeeds and no exception is thrown?
The actual problem is that you're comparing two long[] instances. Enumerable.SequenceEqual internally uses EqualityComparer<long[]>.Default, which for arrays is an ObjectEqualityComparer<Int64[]> (you can see that by examining EqualityComparer<long[]>.Default). That comparer compares the references of the two arrays, which obviously aren't the same, rather than the values stored inside them.
To get around this, you could write a custom EqualityComparer<long[]>:
static void Main()
{
    var expected = new List<long[]>
        { new[] { Convert.ToInt64(1), Convert.ToInt64(999999) } };
    var actual = DoSomething();
    if (!actual.SequenceEqual(expected, new LongArrayComparer()))
        throw new Exception();
}
public class LongArrayComparer : EqualityComparer<long[]>
{
    public override bool Equals(long[] first, long[] second)
    {
        // Same instance (or both null) is equal; exactly one null is not.
        if (ReferenceEquals(first, second)) return true;
        if (first == null || second == null) return false;
        return first.SequenceEqual(second);
    }
    // GetHashCode implementation courtesy of @JonSkeet
    // from http://stackoverflow.com/questions/7244699/gethashcode-on-byte-array
    public override int GetHashCode(long[] arr)
    {
        unchecked
        {
            if (arr == null)
            {
                return 0;
            }
            int hash = 17;
            foreach (long element in arr)
            {
                hash = hash * 31 + element.GetHashCode();
            }
            return hash;
        }
    }
}
No, your sequences are not equal!
Let's remove the sequence part and just take the first element of each sequence:
var firstExpected = new[] { Convert.ToInt64(1), Convert.ToInt64(999999) };
var firstActual = new[] { Convert.ToInt64(1), Convert.ToInt64(999999) };
Console.WriteLine(firstExpected == firstActual); // writes "False"
The code above compares two separate arrays for equality. For arrays, == does not check the contents; it checks whether the two references point to the same object.
Your code using SequenceEqual is, essentially, doing the same thing: for each pair of elements in the two enumerables, it compares the references.
SequenceEqual tests whether the elements of the two sequences are equal. Here the elements are of type long[], so we actually compare two different arrays (which happen to contain the same values) against each other, and this is done by comparing their references rather than their contents.
So what we actually check here is expected[0] == actual[0] instead of expected[0].SequenceEqual(actual[0]).
This obviously returns false, as the two arrays are different objects.
If we flatten the hierarchy using SelectMany we get what we want:
if (!actual.SelectMany(x => x).SequenceEqual(expected.SelectMany(x => x))) throw new Exception();
EDIT:
Based on this approach I found another elegant way to check if all the elements from expected are contained in actual also:
if (!expected.All(x => actual.Any(y => y.SequenceEqual(x)))) throw new Exception();
This checks whether, for every sub-list within expected, there is a list within actual that is sequentially identical to it. This seems much smarter to me, as we need neither a custom EqualityComparer nor a weird hash code implementation.
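Two caveats may be worth keeping in mind. Flattening with SelectMany ignores where one inner array ends and the next begins, so { {1, 2}, {3} } and { {1}, {2, 3} } compare as equal. The All/Any check only verifies that every expected array occurs somewhere in actual, regardless of order or of extra elements in actual. If position matters, a pairwise comparison is one alternative; the sketch below uses a hypothetical helper name, NestedSequenceEqual.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NestedSequences
{
    // Hypothetical helper: true when both outer sequences have the same length and
    // the inner arrays at each position hold the same values in the same order.
    public static bool NestedSequenceEqual(IEnumerable<long[]> first, IEnumerable<long[]> second)
    {
        List<long[]> a = first.ToList();
        List<long[]> b = second.ToList();
        // Zip stops at the shorter sequence, so the counts are compared explicitly.
        return a.Count == b.Count
            && a.Zip(b, (x, y) => x.SequenceEqual(y)).All(equal => equal);
    }
}
```

With this helper, the test from the question would read if (!NestedSequences.NestedSequenceEqual(actual, expected)) throw new Exception();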

Search for an existing object in a list

This is my first question here so I hope I'm doing right.
I have to create a List of array of integer:
List<int[]> finalList = new List<int[]>();
in order to store all the combinations of K elements with N numbers.
For example:
N=5, K=2 => {1,2},{1,3},{1,4},...
Everything works, but I want to avoid storing the same combination twice in the list ({1,2} and {2,1}, for example). So before adding tmpArray (where I temporarily store the new combination) to the list, I want to check whether it's already stored.
Here's what I'm doing:
create the tmpArray with the next combination (OK)
sort tmpArray (OK)
check if the List already contains tmpArray with the following code:
if (!finalList.Contains(tmpArray))
finalList.Add(tmpArray);
but it doesn't work. Can anyone help me with this issue?
An array is a reference type, so your Contains query will not do what you want (compare all members in order); it only checks whether the very same array instance is already in the list.
You may use something like this:
if (!finalList.Any(x => x.SequenceEqual(tmpArray)))
{
    finalList.Add(tmpArray);
}
(Make sure you add using System.Linq; to the top of your file.)
I suggest you learn more about value vs. reference types, LINQ and C# data structure fundamentals. While the above query should work, it will be slow: O(n*m), where n is the number of arrays in finalList and m is the length of each array.
For larger arrays, some precomputing (e.g. a hash code for each of the arrays) that allows a faster comparison might be beneficial.
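One way to get the hash-based lookup hinted at above is to keep a HashSet<int[]> with a custom comparer next to the list. This is a sketch under assumptions: IntArrayComparer, CombinationStore and AddIfNew are hypothetical names, and the hash function is a simple multiply-and-add chosen only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Compares int arrays by content instead of by reference.
public sealed class IntArrayComparer : IEqualityComparer<int[]>
{
    public bool Equals(int[] x, int[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(int[] array)
    {
        if (array == null) return 0;
        unchecked
        {
            // Arrays with equal contents always produce the same hash.
            int hash = 17;
            foreach (int element in array) hash = hash * 31 + element;
            return hash;
        }
    }
}

public static class CombinationStore
{
    // Returns true if the (already sorted) combination was new and has been recorded.
    public static bool AddIfNew(HashSet<int[]> seen, List<int[]> finalList, int[] tmpArray)
    {
        if (!seen.Add(tmpArray)) return false; // an equal array was stored earlier
        finalList.Add(tmpArray);
        return true;
    }
}
```

The set would be created once with new HashSet<int[]>(new IntArrayComparer()). Each check then costs roughly the length of one array on average, instead of a scan over the whole list.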
If I remember correctly, Contains compares by value for value types and by reference for most reference types. An array is a reference type, so Contains only checks whether that exact array object is stored in your list. You'll have to go through each item in the list and compare its values against the values of the new array.
LINQ with a lambda, or brute-force checking, comes to mind.
BrokenGlass gives a good suggestion with Linq and Lambda.
Brute Force:
bool itemExists = false;
foreach (int[] ints in finalList)
{
    // Arrays of different length can never match; try the next one.
    if (ints.Length != tmpArray.Length)
    {
        continue;
    }
    // Compare each element
    bool sameValues = true;
    for (int i = 0; i < tmpArray.Length; i++)
    {
        if (ints[i] != tmpArray[i])
        {
            sameValues = false;
            break;
        }
    }
    // Stop at the first stored array that matches completely
    if (sameValues)
    {
        itemExists = true;
        break;
    }
}
if (!itemExists)
{
    finalList.Add(tmpArray);
}

How do you insert a value into an array, preserving order?

I have a string array, and I want to add a new value somewhere in the center, but don't know how to do this. Can anyone please make this method for me?
void AddValueToArray(String ValueToAdd, String AddAfter, ref String[] theArray) {
// Make this Value the first value
if(String.IsNullOrEmpty(AddAfter)) {
theArray[0]=ValueToAdd; // WRONG: This replaces the first Val, want to Add a new String
return;
}
for(int i=0; i<theArray.Length; i++) {
if(theArray[i]==AddAfter) {
theArray[i++]=ValueToAdd; // WRONG: Again replaces, want to Add a new String
return;
}
}
}
You can't add items to an array; it always keeps the same size.
To get an array with the item added, you would need to allocate a new array with room for one more item, and copy all items from the original array into it.
This is certainly doable, but not efficient. You should use a List<string> instead, which already has an Insert method.
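A short sketch of the List<string> approach, mirroring the shape of the method in the question. The helper name AddValueToList and the choice to append when addAfter is not found are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ListInsertExample
{
    // Inserts valueToAdd right after the first occurrence of addAfter.
    // A null or empty addAfter puts the value first; an addAfter that is not
    // found appends the value (one possible policy, chosen for illustration).
    public static void AddValueToList(List<string> list, string valueToAdd, string addAfter)
    {
        if (string.IsNullOrEmpty(addAfter))
        {
            list.Insert(0, valueToAdd);
            return;
        }
        int index = list.IndexOf(addAfter);
        if (index < 0) list.Add(valueToAdd);
        else list.Insert(index + 1, valueToAdd); // existing items shift right
    }
}
```

If an array is still needed afterwards, list.ToArray() produces one.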
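This works only in the particular case where addAfter is actually present in the array; otherwise Array.IndexOf returns -1 and the first Array.Copy throws.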
public static void AddValueToArray(ref String[] theArray, String valueToAdd, String addAfter) {
var count=theArray.Length;
Array.Resize(ref theArray, 1+count);
var index=Array.IndexOf(theArray, addAfter);
var array=Array.CreateInstance(typeof(String), count-index);
Array.Copy(theArray, index, array, 0, array.Length);
++index;
Array.Copy(array, 0, theArray, index, array.Length);
theArray[index]=valueToAdd;
}
Here's a sample, but it works with Type, so you may need to change it to the type you need. It is an example of copying an array recursively.
How to find the minimum covariant type for best fit between two types?
See how IList Insert method is implemented

Way to pad an array to avoid index outside of bounds of array error

I expect to have at least 183 items in my list when I query it, but sometimes my extract returns fewer than 183 items. My current fix is supposed to pad the array when the count is less than 183.
if (extractArray.Count() < 183) {
int arraysize= extractArray.Count();
var tempArr = new String[183 - arraysize];
List<string> itemsList = extractArray.ToList<string>();
itemsList.AddRange(tempArr);
var values = itemsList.ToArray();
//-- Process the new array that is now at least 183 in length
}
But it seems my solution is not the best. I would appreciate any other solutions that could help ensure I get at least 183 items whenever the extract happens please.
I'd probably follow others' suggestions, and use a list. Use the "capacity" constructor for added performance:
var list = new List<string>(183);
Then, whenever you get a new array, do this (replace " " with whatever value you use to pad the array):
list.Clear();
list.AddRange(array);
// logically, you can do this without the if, but it saves an object allocation when the array is full
if (array.Length < 183)
list.AddRange(Enumerable.Repeat(" ", 183 - array.Length));
This way, the list is always reusing the same internal array, reducing allocations and GC pressure.
Or, you could use an extension method:
public static class ArrayExtensions
{
public static T ElementOrDefault<T>(this T[] array, int index)
{
return ElementOrDefault(array, index, default(T));
}
public static T ElementOrDefault<T>(this T[] array, int index, T defaultValue)
{
return index < array.Length ? array[index] : defaultValue;
}
}
Then code like this:
items.Zero = array[0];
items.One = array[1];
//...
Becomes this:
items.Zero = array.ElementOrDefault(0);
items.One = array.ElementOrDefault(1);
//...
Finally, this is the rather cumbersome idea with which I started writing this answer: You could wrap the array in an IList implementation that's guaranteed to have 183 indexes (I've omitted most of the interface member implementations for brevity):
class ConstantSizeReadOnlyArrayWrapper<T> : IList<T>
{
private readonly T[] _array;
private readonly int _constantSize;
private readonly T _padValue;
public ConstantSizeReadOnlyArrayWrapper(T[] array, int constantSize, T padValue)
{
//parameter validation omitted for brevity
_array = array;
_constantSize = constantSize;
_padValue = padValue;
}
private int MissingItemCount
{
get { return _constantSize - _array.Length; }
}
public IEnumerator<T> GetEnumerator()
{
//maybe you don't need to implement this, or maybe just returning _array.GetEnumerator() would suffice.
return _array.Concat(Enumerable.Repeat(_padValue, MissingItemCount)).GetEnumerator();
}
public int Count
{
get { return _constantSize; }
}
public bool IsReadOnly
{
get { return true; }
}
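public int IndexOf(T item)
{
var arrayIndex = Array.IndexOf(_array, item);
// The pad value is only present when padding exists; the comparer also copes with a null item.
if (arrayIndex < 0 && MissingItemCount > 0 && EqualityComparer<T>.Default.Equals(item, _padValue))
return _array.Length;
return arrayIndex;
}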
public T this[int index]
{
get
{
if (index < 0 || index >= _constantSize)
throw new IndexOutOfRangeException();
return index < _array.Length ? _array[index] : _padValue;
}
set { throw new NotSupportedException(); }
}
}
Ack.
The Array base class implements the Resize method
if(extractArray.Length < 183)
Array.Resize<string>(ref extractArray, 183);
However, keep in mind that resizing is problematic for performance, thus this method is useful only if you require the array for some reason. If you can switch to a List
I'm also assuming you have a one-dimensional array of strings here, so I use the Length property to check the number of items in the array.
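Note that Array.Resize leaves the new slots at their default value, which is null for strings. If the padding should be an actual string, the new slots can be filled afterwards. A minimal sketch, in which the helper name PadTo and its parameters are hypothetical:

```csharp
using System;

public static class ArrayPadding
{
    // Grows the array to minLength if it is shorter and fills the new slots with padValue.
    // Arrays that are already long enough are left untouched.
    public static void PadTo(ref string[] array, int minLength, string padValue)
    {
        int oldLength = array.Length;
        if (oldLength >= minLength) return;
        Array.Resize(ref array, minLength); // new slots are null at this point
        for (int i = oldLength; i < minLength; i++) array[i] = padValue;
    }
}
```

For the case in the question this would be called as ArrayPadding.PadTo(ref extractArray, 183, " ");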
Since you've stated that you need to ensure there's 183 indexes, and that you need to pad it if there is not, I would suggest using a List instead of an array. You can do something like:
while (extractList.Count < 183)
{
extractList.Add(" "); // just add a space
}
If you ABSOLUTELY have to go back to an array, you can use something similar.
I can't say that I would recommend this solution, but I won't let that stop me from posting it! Whether they like to admit it or not, everyone likes linq solutions!
Using linq, given an array with X elements in it, you can generate an array with exactly Y (183 in your case) elements in it like this:
var items183exactly = extractArray.Length == 183 ? extractArray :
extractArray.Take(183)
.Concat(Enumerable.Repeat(string.Empty, Math.Max(0, 183 - extractArray.Length)))
.ToArray();
If there are fewer than 183 elements, the array will be padded with empty strings. If there are more than 183 elements, the array will be truncated. If there are exactly 183 elements, the array is used as is.
I don't claim that this is efficient or that it is necessarily a good idea. However, it does use linq (yippee!) and it is fun.
