Why is Array.BinarySearch() giving negative numbers? - c#

I have some code that isn't making much sense to me. I have an array of strings and I'm using a binary search to number them in a foreach() loop. The code is exactly the same both times I attempt output, other than the sorting. I'm not sure why I'm getting the results I'm getting. I assumed it would number the array values the same way both times. Any help?
Code:
using System;
public class Driver {
public static void Main(string [] args) {
String [] s = {"Bob", "Jane", "Will", "Bill", "Liz"};
Console.WriteLine("Before Sorting:\n----------");
foreach(string item in s) {
Console.WriteLine("{0}. {1}", Array.BinarySearch(s, item) + 1, item);
}
Console.WriteLine("Will is at position {0}", Array.BinarySearch(s, "Will") + 1);
Console.WriteLine("\n\nAfter Sorting:\n----------");
Array.Sort(s);
foreach(string item in s) {
Console.WriteLine("{0}. {1}", Array.BinarySearch(s, item) + 1, item);
}
Console.WriteLine("Will is at position {0}", Array.BinarySearch(s, "Will") + 1);
}
}
Output:
Before Sorting:
----------
1. Bob
2. Jane
3. Will
0. Bill
-2. Liz
Will is at position 3
After Sorting:
----------
1. Bill
2. Bob
3. Jane
4. Liz
5. Will
Will is at position 5
I'm sure it's something entirely stupid, but I can't figure it out.

Binary search only works on sorted arrays. It is not finding the value:
If value is not found and value is less than one or more elements in array, a negative number which is the bitwise complement of the index of the first element that is larger than value. If value is not found and value is greater than any of the elements in array, a negative number which is the bitwise complement of (the index of the last element plus 1).

Array.BinarySearch requires that the array be sorted. From the documentation:
This method does not support searching arrays that contain negative indexes. array must be sorted before calling this method.
It will return negative values, by design:
If value is not found and value is less than one or more elements in array, a negative number which is the bitwise complement of the index of the first element that is larger than value.

Binary search works only on a sorted array. Your array is not sorted, so the results are meaningless, as expected.
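To make the answers above concrete, here is a minimal sketch. The class and method names are made up for this example and are not from the question. It numbers items with Array.IndexOf, a linear search that does not need sorted input, and decodes a negative BinarySearch result with the ~ operator once the array has been sorted. It assumes an ordinal comparison is acceptable for the names.

```csharp
using System;

public static class SearchDemo
{
    // Hypothetical helper: 1-based position found by linear search.
    // Works on unsorted arrays; returns 0 when the value is absent.
    public static int PositionOf(string[] items, string value)
    {
        return Array.IndexOf(items, value) + 1;
    }

    // Hypothetical helper: index of value in an array that is already
    // sorted with StringComparer.Ordinal, or the index where it would
    // have to be inserted to keep the array sorted.
    public static int InsertionIndex(string[] sorted, string value)
    {
        int i = Array.BinarySearch(sorted, value, StringComparer.Ordinal);
        return i >= 0 ? i : ~i; // ~i undoes the bitwise complement
    }
}
```

With the unsorted array from the question, PositionOf(s, "Bill") gives 4. After Array.Sort(s, StringComparer.Ordinal), InsertionIndex(s, "Carl") gives 2, because "Carl" would sit between "Bob" and "Jane".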

Related

Please suggest different approach for this CountNumbers algorithm

Implement function CountNumbers that accepts a sorted array of unique integers and counts the number of array elements that are less than the parameter lessThan
For example, SortedSearch.CountNumbers(new int[] { 1, 3, 5, 7 }, 4) should return 2 because there are two array elements less than 4.
Below is my approach, but the score the online tool gives it is 50%. What am I missing?
using System;
public class SortedSearch
{
public static int CountNumbers(int[] sortedArray, int lessThan)
{
int iRes=0;
for (int i=0; i<sortedArray.Length; i++)
{
if(sortedArray[i]< lessThan)
{
iRes=iRes+1;
}
}
return iRes;
}
public static void Main(string[] args)
{
Console.WriteLine(SortedSearch.CountNumbers(new int[] { 1, 3, 5, 7 }, 4));
}
}
Your current solution takes O(N) time, where N is the size of the array. You could leverage the fact that your input array is sorted to decrease the time complexity of the solution to O(log N) by using BinarySearch:
public static int CountNumbers(int[] sortedArray, int lessThan)
{
var result = Array.BinarySearch(sortedArray, lessThan);
return result >= 0 ? result : -1 * result - 1;
}
Why the strange -1 * result - 1 code? Because, as per the docs:
Returns
The index of the specified value in the specified array, if value is
found; otherwise, a negative number. If value is not found and value
is less than one or more elements in array, the negative number
returned is the bitwise complement of the index of the first element
that is larger than value. If value is not found and value is greater
than all elements in array, the negative number returned is the
bitwise complement of (the index of the last element plus 1). If this
method is called with a non-sorted array, the return value can be
incorrect and a negative number could be returned, even if value is
present in array.
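-1 * result - 1 reverses the bitwise complement (it is equivalent to ~result), which gives the index of the first element larger than value, or the array length if there is no such element. Since the array is sorted, that index is also the number of elements less than value.
BinarySearch will generally perform faster than Where or TakeWhile, particularly over large sets of data, since it runs in O(log N) rather than O(N) time.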
From wikipedia:
In computer science, binary search, also known as half-interval
search, logarithmic search, or binary chop, is a search algorithm that
finds the position of a target value within a sorted array.
The clue to use a binary search is the "accepts a sorted array of unique integers" part of the requirement. My above solution only works, as is, with a sorted array of unique values. It thus seems to me that whoever wrote the online test likely had binary search in mind.
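If the uniqueness guarantee were dropped, a hand-written lower-bound search would still work. The following is only a sketch under that assumption, not what the online test requires. The class and method names are invented for illustration.

```csharp
using System;

public static class SortedSearchSketch
{
    // Counts the elements strictly less than lessThan in an ascending array.
    // Unlike Array.BinarySearch, this lower-bound search also copes with
    // duplicate values, because it always converges on the first element
    // that is not less than lessThan.
    public static int CountLessThan(int[] sortedArray, int lessThan)
    {
        int lo = 0, hi = sortedArray.Length; // the answer lies in [lo, hi]
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;    // avoids overflow of lo + hi
            if (sortedArray[mid] < lessThan)
                lo = mid + 1;                // mid and everything before it are smaller
            else
                hi = mid;                    // mid is >= lessThan, so the answer is at most mid
        }
        return lo;
    }
}
```

For the example from the question, CountLessThan(new int[] { 1, 3, 5, 7 }, 4) gives 2.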
You could make use of Linq for the purpose.
int CountNumbers(IEnumerable<int> source,int limit)
{
return source.TakeWhile(x=>x<limit).Count();
}
Since the OP already states that the input array is sorted, you can stop searching at the first element that is not less than the limit. The TakeWhile method does exactly that.
The method above takes elements while the condition holds and then counts them.
Example.
var result = CountNumbers(new int[] {1, 3, 5, 7},4);
Output : 2

Searching List<string> based on string match

I have the following c# List<string>
var lists = new List<string>
{
"a", "b", "c", "ee", "ja"
};
I now want to find the index of the last item whose alphanumeric value is less than or equal to d, which in this case would be 2 - which represents "c"
Can anyone suggest how I can do this? It needs to be fast as it will be searching large lists.
Is there also a way to do the same comparison for the closest match to "ef" or any set of multiple characters
EDIT - I know I could write a for loop to do this, but is there any other way to do this? Maybe a built in function.
I know if it was a numeric function I could use Linq.
You want FindLastIndex
var index = lists.FindLastIndex(value => value.CompareTo("d") <= 0); // <= 0 matches "less than or equal to"
NOTE: You have to use CompareTo as < doesn't exist for strings.
You'll get great performance by using the BinarySearch method, under the condition that your List is sorted. If it isn't, then don't use this method because you'll get incorrect results.
// List.BinarySearch returns:
// The zero-based index of item in the sorted System.Collections.Generic.List`1,
// if item is found; otherwise, a negative number that is the bitwise complement
// of the index of the next element that is larger than item or, if there is no
// larger element, the bitwise complement of System.Collections.Generic.List`1.Count.
int pos = lists.BinarySearch("d");
int resultPos = pos >= 0 ? pos : ~pos - 1;
Console.WriteLine("Result: " + resultPos);
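The same idea also covers multi-character keys such as "ef". The sketch below wraps it in a helper whose name is invented for this example. It assumes the list is sorted with StringComparer.Ordinal, which is true for the sample list, and passes the same comparer to BinarySearch so that the sort order and the search order agree.

```csharp
using System;
using System.Collections.Generic;

public static class StringListSearch
{
    // Index of the last element that is <= key in a list sorted with
    // StringComparer.Ordinal, or -1 if every element is greater than key.
    public static int LastIndexAtMost(List<string> sorted, string key)
    {
        int pos = sorted.BinarySearch(key, StringComparer.Ordinal);
        // Found: pos is the match. Not found: ~pos is the index of the next
        // larger element, so the element just before it is the answer.
        return pos >= 0 ? pos : ~pos - 1;
    }
}
```

With the list from the question, LastIndexAtMost(lists, "d") gives 2 ("c") and LastIndexAtMost(lists, "ef") gives 3 ("ee").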

Collection/Dictionary.Count returning a different value to the actual number of elements?

I'm probably missing something obvious - but when I do .Count on a Dictionary or a Collection I'm getting the wrong count compared to the actual number of elements in the Dictionary or Collection.
I have code that counts the number of words in a string, and returns a Collection<KeyValuePair<string, uint>>.
Here's my code:
string myString = String.Format("Hello world this is test test test test hi");
var result = WordCounter.GetWordCollection(myString);
result.Dump(); //Using LINQPad .Dump() method
Console.WriteLine ("Counted {0} words in {1}ms", result.Count, stopwatch.Elapsed.Milliseconds);
foreach (var word in result)
{
Console.WriteLine ("{0} - {1}", word.Key, word.Value);
}
Result:
What's going wrong?
Edit: I was counting the number of unique words without realising it, like the comment/answer said. I needed to use result.Sum(n => n.Value).
It is working properly - there are seven keys in the dictionary, each with a value representing how many times they were in the source text. The count of seven means there are seven unique words. Summing the values will give you the total word count including repetitions.
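The difference between the two numbers can be seen in a small sketch. The WordCounter class from the question is not shown, so the CountWords method below is a hypothetical stand-in that simply splits on spaces; only the Count versus Sum contrast is the point.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCountDemo
{
    // Hypothetical stand-in for the poster's WordCounter: word -> occurrences.
    public static Dictionary<string, uint> CountWords(string text)
    {
        var counts = new Dictionary<string, uint>();
        var words = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string word in words)
        {
            counts.TryGetValue(word, out uint n); // n stays 0 for a new word
            counts[word] = n + 1u;
        }
        return counts;
    }

    // Total number of words including repetitions: the sum of all values.
    // The cast to long selects a Sum overload explicitly.
    public static long TotalWords(Dictionary<string, uint> counts)
    {
        return counts.Values.Sum(v => (long)v);
    }
}
```

For the input "a b b c c c", CountWords(...).Count is 3 (unique words) while TotalWords(...) is 6 (all words).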

Pick up two numbers from an array so that the sum is a constant

I came across an algorithm problem. Suppose I receive a credit and would like to buy two items from a local store. I would like to buy two items that add up to the entire value of the credit. The input data has three lines.
The first line is the credit, the second line is the total number of items and the third line lists all the item prices.
Sample data 1:
200
7
150 24 79 50 88 345 3
Which means I have $200 to buy two items, there are 7 items. I should buy item 1 and item 4 as 200=150+50
Sample data 2:
8
8
2 1 9 4 4 56 90 3
Which indicates that I have $8 to pick two items from total 8 articles. The answer is item 4 and item 5 because 8=4+4
My thought is first to create the array of course, then pick up any item say item x. Creating another array say "remain" which removes x from the original array.
Subtract the price of x from the credit to get the remnant and check whether the "remain" contains remnant.
Here is my code in C#.
// Read lines from input file and create array price
foreach (string s in price)
{
int x = Int32.Parse(s);
string y = (credit - x).ToString();
index1 = Array.IndexOf(price, s) ;
index2 = Array.IndexOf(price, y) ;
remain = price.ToList();
remain.RemoveAt(index1);//remove an element
if (remain.Contains(y))
{
break;
}
}
// return something....
My two questions:
How is the complexity? I think it is O(n^2).
Any improvement to the algorithm? When I use sample 2, I have trouble getting the correct indices. Because there are two "4"s in the array, it always returns the first index, since IndexOf(String) reports the zero-based index of the first occurrence of the specified string in this instance.
You can simply sort the array in O(n log n) time. Then, for each element A[i], conduct a binary search for S-A[i], which is again O(n log n) time in total.
EDIT: As pointed out by Heuster, you can solve the 2-SUM problem on the sorted array in linear time by using two pointers (one from the beginning and other from the end).
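A minimal sketch of the two-pointer idea follows. Because the question asks for item numbers rather than prices, this version sorts the positions by price instead of sorting the prices themselves. The class name, the method name, and the choice to return 1-based positions as a Tuple are assumptions made for this example.

```csharp
using System;
using System.Linq;

public static class TwoPointerPairs
{
    // Returns the 1-based positions of two items whose prices add up to
    // credit (smaller position first), or null if no such pair exists.
    public static Tuple<int, int> FindPair(int[] prices, int credit)
    {
        // Sort the positions by price so the original item numbers survive.
        int[] order = Enumerable.Range(0, prices.Length)
                                .OrderBy(i => prices[i])
                                .ToArray();
        int lo = 0, hi = order.Length - 1;
        while (lo < hi)
        {
            int sum = prices[order[lo]] + prices[order[hi]];
            if (sum == credit)
            {
                int a = order[lo] + 1, b = order[hi] + 1;
                return Tuple.Create(Math.Min(a, b), Math.Max(a, b));
            }
            if (sum < credit) lo++; // total too small: move to a larger price
            else hi--;              // total too large: move to a smaller price
        }
        return null;
    }
}
```

For sample 1 this returns (1, 4), and for sample 2 it returns (4, 5), since lo < hi guarantees that the two "4" entries are distinct items. The sort dominates the cost, so the whole method is O(n log n).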
Create a HashSet<int> of the prices. Then go through it sequentially. Something like:
HashSet<int> items = new HashSet<int>(itemsList);
int price1 = -1;
int price2 = -1;
foreach (int price in items)
{
int otherPrice = 200 - price; // 200 is the credit from sample 1
if (items.Contains(otherPrice))
{
// found a match.
price1 = price;
price2 = otherPrice;
break;
}
}
if (price2 != -1)
{
// found a match.
// price1 and price2 contain the values that add up to your target.
// now remove the items from the HashSet
items.Remove(price1);
items.Remove(price2);
}
This is O(n) to create the HashSet. Because lookups in the HashSet are O(1), the foreach loop is O(n).
This problem is called 2-sum. See, for example, http://coderevisited.com/2-sum-problem/
Here is an algorithm with O(N) time complexity and O(N) space:
1. Put all numbers in a hash table.
2. For each number Arr[i], look up Sum - Arr[i] in the hash table in O(1).
3. If found, then (Arr[i], Sum - Arr[i]) is a pair that adds up to Sum.
Note: the only failing case is Arr[i] = Sum/2, which can give a false positive, but you can always check in O(N) whether there are two occurrences of Sum/2 in the array.
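One way to sidestep the Sum/2 false positive is to fill the hash table while scanning, so that each item is only compared with items seen before it. The sketch below is a variation on the steps above, not the answerer's exact algorithm. It uses a Dictionary from price to position so that it can report item numbers, and all names in it are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class HashPairs
{
    // Returns the 1-based positions of two items whose prices add up to
    // credit, or null if no such pair exists. Runs in O(n) expected time.
    public static Tuple<int, int> FindPair(int[] prices, int credit)
    {
        // price -> 1-based position of an earlier item with that price
        var seen = new Dictionary<int, int>();
        for (int i = 0; i < prices.Length; i++)
        {
            int needed = credit - prices[i];
            if (seen.TryGetValue(needed, out int j))
                return Tuple.Create(j, i + 1);
            // Store the item only after the lookup, so an item can never
            // be paired with itself.
            seen[prices[i]] = i + 1;
        }
        return null;
    }
}
```

For sample 1 this returns (1, 4) and for sample 2 it returns (4, 5). A single item priced 4 with a credit of 8 correctly yields null.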
I know I am posting this a year and a half later, but I just came across this problem and wanted to add some input.
If a solution exists and all prices are positive, then both values in the solution must be less than the target sum.
Perform a binary search in the sorted array of values, searching for the target sum (which may or may not be there).
The binary search will end with either finding the sum, or the closest value less than the sum. That is your starting high value while searching through the array using the previously mentioned solutions. Any value above your new starting high value cannot be in the solution, as it is more than the target value.
At this point, you have eliminated a chunk of data in O(log n) time that would otherwise be eliminated in O(n) time.
Again, this is an optimization that may only be worth implementing if the data set calls for it.

I have a sorted list of key/value pairs, and want to find the values adjacent to a new key

I have a list of key/value pairs (probably will be using a SortedList) and I won't be adding new values.
Instead I will be using new keys to get bounding values. For example if I have the following key/value pairs:
(0,100) (6, 200), (9, 150), (15, 100), (20, 300)
and I have the new key of 7, I want it to return 200 and 150, because 7 is between 6 and 9.
If I give 15 I want it to return 100 and 100 (because 15 is exactly 15). I want something like a binary search.
Thanks
You can do this with List<T>.BinarySearch:
var keys = new List<int>(sortedList.Keys);
int index = keys.BinarySearch(target);
int lower;
int upper;
if (index >= 0) {
lower = upper = index;
}
else {
index = ~index;
upper = index < keys.Count ? index : index - 1;
lower = index == 0 ? index : index - 1;
}
Console.WriteLine("{0} => {1}, {2}",
target, sortedList[keys[lower]], sortedList[keys[upper]]);
You have to use the return value of List<T>.BinarySearch to get to the boundary values. From msdn, its return value is:
"The zero-based index of item in the sorted List<T>, if item is found; otherwise, a negative number that is the bitwise complement of the index of the next element that is larger than item or, if there is no larger element, the bitwise complement of Count."
Also, for elements that fall below the first or beyond the last, this code "returns" the first and the last twice, respectively. This might not be what you want, but it's up to you to define your boundary conditions. Another one is if the collection is empty, which I didn't address.
Yep, you want exactly binary search: use the List<T>.BinarySearch method, specifically the overload taking an IComparer<T> as its second argument (and implement that interface with a simple auxiliary class that just compares keys).
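A minimal sketch of that comparer-based variant is shown below. It assumes the pairs are kept in a List<KeyValuePair<int, int>> that is non-empty and sorted by key. The names KeyOnlyComparer and BoundingValues are invented for this example, and keys outside the stored range return the first or last value twice, as in the answer above.

```csharp
using System;
using System.Collections.Generic;

// Compares pairs by key only, so BinarySearch can probe with a dummy value.
public sealed class KeyOnlyComparer : IComparer<KeyValuePair<int, int>>
{
    public int Compare(KeyValuePair<int, int> x, KeyValuePair<int, int> y)
    {
        return x.Key.CompareTo(y.Key);
    }
}

public static class BoundingValues
{
    // Returns the values stored at the keys just below and just above key.
    // pairs must be non-empty and sorted by key.
    public static Tuple<int, int> Find(List<KeyValuePair<int, int>> pairs, int key)
    {
        var probe = new KeyValuePair<int, int>(key, 0); // the value 0 is ignored
        int index = pairs.BinarySearch(probe, new KeyOnlyComparer());
        int lower, upper;
        if (index >= 0)
        {
            lower = upper = index;                    // exact key match
        }
        else
        {
            index = ~index;                           // first key larger than key
            upper = Math.Min(index, pairs.Count - 1); // clamp past the last key
            lower = Math.Max(index - 1, 0);           // clamp before the first key
        }
        return Tuple.Create(pairs[lower].Value, pairs[upper].Value);
    }
}
```

With the pairs from the question, a key of 7 yields (200, 150) and a key of 15 yields (100, 100).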
