I have to search a for a name from the array through binary search but, show the corresponding age of the person's name instead. Please no other suggestions, I have to do it with Binary search on 2D arrays.
string[,] persons = new string[4, 2];
persons[0, 0] = "Ana";
persons[0, 1] = "19";
persons[1, 0] = "Ammara";
persons[1, 1] = "20";
persons[2, 0] = "Marilyn";
persons[2, 1] = "40";
persons[3, 0] = "Zacharaia";
persons[3, 1] = "70";
string x = "Zacharia";
int upBound = persons.GetLength(0) - 1;
int lowBound = 0;
while (lowBound <= upBound)
{
int midpoint = lowBound + (upBound - lowBound) / 2;
int result = x.CompareTo(persons[midpoint, 1]);
if (result == midpoint)
{
Console.WriteLine("The value is present at:" + persons[midpoint, 1]);
break;
}
else if (result > midpoint)
{
lowBound = midpoint + 1;
Console.WriteLine("The value is present at:" + persons[lowBound, 1]);
break;
}
else if (result < midpoint)
{
upBound = midpoint - 1;
Console.WriteLine("The value is present at:" + persons[upBound, 1]);
break;
}
}
This code is showing the age of everyone 20. The CompareTo() method is not working.
You have 4 major problems in you code:
1- Use midpoint2, lowBound2 and upBound2 for unclear reason, you should use 0 and 1 instead.
2- As you depend on binary search the key elements must be sorted, so "Ana" must be after "Ammara" although she is younger, but you search with name not age, or change it to another name, like "Aca".
3- result should be compared to 0, < 0 and > 0, it has nothing to do with midpoint.
4- You should use Console.WriteLine and break in the first condition only to let the while continue if the name wasn't found yet.
string[,] persons = new string[4, 2];
persons[0, 0] = "Aca";
persons[0, 1] = "19";
persons[1, 0] = "Ammara";
persons[1, 1] = "20";
persons[2, 0] = "Marilyn";
persons[2, 1] = "40";
persons[3, 0] = "Zach";
persons[3, 1] = "70";
string x = "Aca";
int upBound = persons.GetLength(0) - 1;
int lowBound = 0;
while (lowBound <= upBound)
{
int midpoint = lowBound + (upBound - lowBound) / 2;
int result = x.CompareTo(persons[midpoint, 0]);
if (result == 0)
{
Console.WriteLine("The value is present at:" + persons[midpoint, 1]);
break;
}
else if (result > 0)
{
lowBound = midpoint + 1;
}
else
{
upBound = midpoint - 1;
}
}
There are several problems with your code.
A binary search depends on the data being ordered. Therefore to do a binary search on the names they must be in alphabetical order, however Ammara should come before Ana, not after. However, in this case, you would still be able to successfully search for Zacharaia because it is still after the name at the midpoint.
A problem in the processing code is at the line:
x.CompareTo(persons[midpoint, 1]);
You want to compare the name in x ("Zacharia") with the name at the midpoint position, however persons[##, 1] is the location of a number. You need to use
x.CompareTo(persons[midpoint, 0]);
The CompareTo method returns 0 if the values match, < 0 if the x precedes the value and > 0 if x is after the value. However, you are comparing the result to midpoint which has nothing to do with the relative location of the value. Instead use:
if (result == 0)
{...}
else
{
if(result > 0)
{...}
else//The only possibility is for result to be < 0 so you don't need another if here.
{...}
}
The only time that you have found the value is whe result == 0, however you are writing the answer and breaking out of the while loop for all 3 cases. Only break when you find a match (result ==0).
The name in your array ("Zacharaia") is NOT the same as the name in string x ("Zacharia") therefore you will never find a match.
Ideally, your code should handle never finding a match.
If I am not wrong, you want to search names from the Array/List by binary search. For string, you can use Trie data structure to find a name.
If you are asking the names of the person performing search on their age, then proceed with the ages. First, you need to sort the names according to the ages (increasing or decreasing order). Then perform the binary search.
Related
This thing that I'm writing should do the following: get as an input a number, then, that many kids' names, and that many grades. Then, assign each kid a number of coins, so that if their grade is bigger, than their neighbor's, they get more coins and vice versa. What I wrote is this:
string input = Console.ReadLine();
int n = Convert.ToInt32(input);
int i = 0;
string[] names = new string[n];
for (i = 0; i < n; i++)
{
names[i] = Console.ReadLine();
}
string[] gradeText = new string[n];
int[] grades = new int[n];
for (i = 0; i < n; i++)
{
gradeText[i] = Console.ReadLine();
grades[i] = Convert.ToInt32(gradeText[i]);
}
int[] minCoins = { 1, 2, 3 };
int[] coinArray = new int[n];
for (i = 1; i < n - 2; i++)
{
if (grades[0] > grades[1])
{
coinArray[0] = 3;
}
else
{
coinArray[0] = 1;
}
if (grades[i] > grades[i + 1] && grades[i] > grades[i - 1])
{
coinArray[i] = 3;
}
if (grades[i] > grades[i + 1] || grades[i] > grades[i - 1])
{
coinArray[i] = 2;
}
if (grades[i] < grades[i + 1] && grades[i] < grades[i - 1])
{
coinArray[i] = 1;
}
if (grades[n - 1] > grades[n - 2])
{
coinArray[n - 1] = 3;
}
else
{ coinArray[n - 1] = 1; }
}
for (i = 0; i < n; i++)
{
Console.WriteLine(names[i] + " " + coinArray[i]);
}
I know my loop is hella messed up, but any tips on how to fix it would be kindly appreciated!
Others here have already suggested how to deal with index out of bounds issues. So this is slight different approach to solving your problem.
It is very easy to see this as one problem and then try to resolve it all in one place but that isn't always the best solution.
Your for loop is trying to do quite a lot. Each iteration could have many checks to make. In addition you are making checks you have previously made.
Do I have a neighbour to the left.
Do I have a neighbour to the right.
Did I get a better grade than both neighbours.
Did I get a better grade than one neighbour.
Did I lose to both neighbours.
My advice would be to break this down into two separate tasks.
1, To calculate how many neighbours each person got a higher grade than.
string[] names = new string[]{"John", "Paul", "Ringo", "George"};
int[] grades = new[] {3, 4, 3,2};
int[] winnersandloser = new int[4];
for (int i = 1; i < grades.Length; i++) //note starting at position 1 so I dont need to handle index out of bounds inside the for loop
{
if (grades[i] > grades[i - 1])
{
winnersandloser[i]++;
}
else
{
winnersandloser[i - 1]++;
}
}
In the above code you should have an array with these values: {0,2,1,0}
0 = you lost to both neighbours
1 = you beat one neighbour
2 = well done you beat both neighbours
Then using this winnersandlosers array you can calculate how many coins to give each person. I'll leave that for you to do.
Update
If you require different behaviour for the first and last in the list of people you need to add the logic to your code for allocating coins.
An array gives each value and index value, starting from 0. So 0 points to the first value in you array. George is the 4th entry in the array but as the array index starts with 0 the index value is 3, you also get this from arrayname.Length - 1
So now when looping through the array to allocate coins we can add a check for the first and last positions in the array.
//allocating coins
for (int i = 0; i < winnersandloser.Length; i++)
{
if (i == 0 || i == winnersandloser.Length - 1)
{
//allocating rules for first and last
}
else
{
//allocating rules for everyone else
}
}
One common way to approach this kind of issue is to oversize your array by as many elements as you need to look ahead/look behind. Place you real elements in the "middle"1 of this array and suitable dummy values into the elements at the start/end that don't correspond to real entries. You pick the dummy values such that the comparisons work out how you need them to (e.g. often you'll put int.MinValue in the dummy elements at the start and int.MaxValue in the dummy elements at the end).
You then just enumerate the real elements in the array, but all of your computed look aheads/look behinds still correspond to valid indexes in the array.
1Sometimes you'll have uneven look ahead/look behind requirements so it may not be the true middle. E.g. say you need to be able to look behind one element and ahead 3 elements, and you want to process 20 elements. You then create an array containing 24 entries, put dummy values at index 0, 21, 22 and 23, and populate your real elements into indexes 1 - 20.
I have an assingment and I'm a bit lost. In an array of 10 (or less) numbers which the user enters (I have this part done), I need to find the second smallest number. My friend sent me this code, but I'm having a hard time understanding it and writing it in c#:
Solved it!!! :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
int vnesena;
int? min1 = null;
int? min2 = null;
for(int i=1; i<11; i=i+1)
{
Console.WriteLine("Vpiši " + i +"." + " število: ");
vnesena = Convert.ToInt32(Console.ReadLine());
if (vnesena == 0)
{
break;
}
if (min1 == null || vnesena < min1)
{
min2 = min1;
min1 = vnesena;
}
else if (vnesena != min1 && (min2==null || vnesena<min2))
{
min2 = vnesena;
}
}
if (min1 == null || min2 == null)
{
Console.WriteLine("Opozorilo o napaki");
}
else
{
Console.WriteLine("Izhod: " + min2);
}
Console.ReadKey();
}
}
}
That code is too complicated, so try something like this.
int[] numbers = new int[10];
for (int i = 0; i < 10; i++)
{
numbers[i] = int.Parse(Console.ReadLine());
}
Array.Sort(numbers);
Console.WriteLine("Second smallest number: " + numbers[1]);
If the code isn't too obvious, let me explain:
Declare an array of 10 integers
Loop 10 ten times and each time, ask for user input & place input as an integer to the array
Sort the array so each number is in the number order (smallest first, biggest last).
The first integer is smallest (input at index 0, so numbers[0]) and the second smallest is obviously numbers[1].
Of course, for this piece of code to work, you have to use this code in console program.
As you didn't mention if you are allowed to use built in sorting functions etc, I assume that Array.Sort() is valid.
EDIT: You updated your topic so I'll change my code to match criterias.
int[] numbers = new int[10];
bool tooShortInput = false;
for (int i = 0; i < 10; i++)
{
int input = int.Parse(Console.ReadLine());
if (input != 0)
{
numbers[i] = input;
}
else
{
if (i == 2)
{
Console.WriteLine("You only entered two numbers!");
tooShortInput = true;
break;
}
else
{
for (int j = 0; j < 10; j++)
{
if (numbers[j] == 0)
{
numbers[j] = 2147483647;
}
}
break;
}
}
}
// Sort the array
int temp = 0;
for (int write = 0; write < numbers.Length; write++) {
for (int sort = 0; sort < numbers.Length - 1; sort++) {
if (numbers[sort] > numbers[sort + 1]) {
temp = numbers[sort + 1];
numbers[sort + 1] = numbers[sort];
numbers[sort] = temp;
}
}
}
if (!tooShortInput)
{
Console.WriteLine("Second smallest number: " + numbers[1]);
}
If you don't understand the updated code, let me know, I will explain.
NOTE: This is fastly coded and tested with android phone so obviously this code isn't 5 star quality, not even close, but it qualifies :-).
Regards, TuukkaX.
To paraphrase the code given:
Set 2 variables to nothing. (This is so that there can be checks done later. int? could be used if you want to use null for one idea here.
Start loop through values.
Get next value.
If the minimum isn't set or the new value is lower than the minimum, replace the second lowest with the former lowest and lowest with the new value that was entered.
Otherwise, check if the new value isn't the same as the minimum and if the minimum isn't set or the entered value is lower than the second lowest then replace the second lowest with this new value.
Once the loop is done, if either minimum value isn't filled in then output there isn't such a value otherwise output the second lowest value.
Imagine if you had to do this manually. You'd likely keep track of the lowest value and second lowest value as you went through the array and the program is merely automating this process. What is the problem?
This is a rough translation of what your friend gave you that isn't that hard to translate to my mind.
int enteredValue;
int? smallest = null, secondSmallest = null;
for (int i = 0; i < 10; i = i + 1)
{
Console.WriteLine("Vpiši " + i+1 + " število: ");
enteredValue = Convert.ToInt32(Console.ReadLine());
if (smallest==null || enteredValue<smallest) {
secondSmallest=smallest;
smallest = enteredValue;
} else if (enteredValue!=smallest && enteredValue<secondSmallest) {
secondSmallest= enteredValue;
}
}
Why use a loop and not take advantage of the Array.Sort method?
int[] numbers = new int[4] { 4, 2, 6, 8 };
Array.Sort(numbers);
int secondSmallestNumber = numbers[1];
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
I'm following this blog asking
Five programming problems every Software Engineer should be able to solve in less than 1 hour
I am absolutely stumped by question 4 (5 is a different story too)
Write a function that given a list of non negative integers, arranges them such that they form the largest possible number. For example, given [50, 2, 1, 9], the largest formed number is 95021.
Now the author posted a answer, and I saw an python attempt too:
import math
numbers = [50,2,1,9,10,100,52]
def arrange(lst):
for i in xrange(0, len(lst)):
for j in xrange(0, len(lst)):
if i != j:
comparison = compare(lst[i], lst[j])
if lst[i] == comparison[0]:
temp = lst[j]
lst[j] = lst[i]
lst[i] = temp
return lst
def compare(num1, num2):
pow10_1 = math.floor(math.log10(num1))
pow10_2 = math.floor(math.log10(num2))
temp1 = num1
temp2 = num2
if pow10_1 > pow10_2:
temp2 = (temp2 / math.pow(10, pow10_2)) * math.pow(10, pow10_1)
elif pow10_2 > pow10_1:
temp1 = (temp1 / math.pow(10, pow10_1)) * math.pow(10, pow10_2)
print "Starting", num1, num2
print "Comparing", temp1, temp2
if temp1 > temp2:
return [num1, num2]
elif temp2 > temp1:
return [num2, num1]
else:
if num1 < num2:
return [num1, num2]
else:
return [num2, num1]
print arrange(numbers)
but I'm not going to learn these languages soon. Is anyone willing to share how they would sort the numbers in C# to form the largest number please?
I've also tried straight conversion to C# in VaryCode but when the IComparer gets involved, then it causes erroneous conversions.
The python attempt uses a bubble sort it seems.
Is bubble sort a starting point? What else would be used?
The solution presented uses the approach to sort the numbers in the array in a special way. For example the value 5 comes before 50 because "505" ("50"+"5") comes before "550" ("5"+"50"). The thinking behind this isn't explained, and I'm not convinced that it actually works...
When looking that the problem, I came to this solution:
You can do it recursively. Make a method that loops through the numbers in the array and concatenates each number with the largest value that can be formed by the remaining numbers, to see which one of those is largest:
public static int GetLargest(int[] numbers) {
if (numbers.Length == 1) {
return numbers[0];
} else {
int largest = 0;
for (int i = 0; i < numbers.Length; i++) {
int[] other = numbers.Take(i).Concat(numbers.Skip(i + 1)).ToArray();
int n = Int32.Parse(numbers[i].ToString() + GetLargest(other).ToString());
if (i == 0 || n > largest) {
largest = n;
}
}
return largest;
}
}
If you want to do it by bubble sort, try this:
public static void Main()
{
bool swapped = true;
while (swapped)
{
swapped = false;
for (int i = 0; i < VALUES.Length - 1; i++)
{
if (Compare(VALUES[i], VALUES[i + 1]) > 0)
{
int temp = VALUES[i];
VALUES[i] = VALUES[i + 1];
VALUES[i + 1] = temp;
swapped = true;
}
}
}
String result = "";
foreach (int integer in VALUES)
{
result += integer.ToString();
}
Console.WriteLine(result);
}
public static int Compare(int lhs, int rhs)
{
String v1 = lhs.ToString();
String v2 = rhs.ToString();
return (v1 + v2).CompareTo(v2 + v1) * -1;
}
The Compare method compares the order of two numbers that would create the largest number. When it returns value larger than 0, that means you need to swap.
This is the sorting part:
while (swapped)
{
swapped = false;
for (int i = 0; i < VALUES.Length - 1; i++)
{
if (Compare(VALUES[i], VALUES[i + 1]) > 0)
{
int temp = VALUES[i];
VALUES[i] = VALUES[i + 1];
VALUES[i + 1] = temp;
swapped = true;
}
}
}
It checks to consecutive values in the array and swaps if necessary. After one iteration without swapping, the sort is finished.
Finally, you concatenate the values in the array and print it out.
I need to calculate the similarity between 2 strings. So what exactly do I mean? Let me explain with an example:
The real word: hospital
Mistaken word: haspita
Now my aim is to determine how many characters I need to modify the mistaken word to obtain the real word. In this example, I need to modify 2 letters. So what would be the percent? I take the length of the real word always. So it becomes 2 / 8 = 25% so these 2 given string DSM is 75%.
How can I achieve this with performance being a key consideration?
I just addressed this exact same issue a few weeks ago. Since someone is asking now, I'll share the code. In my exhaustive tests my code is about 10x faster than the C# example on Wikipedia even when no maximum distance is supplied. When a maximum distance is supplied, this performance gain increases to 30x - 100x +. Note a couple key points for performance:
If you need to compare the same words over and over, first convert the words to arrays of integers. The Damerau-Levenshtein algorithm includes many >, <, == comparisons, and ints compare much faster than chars.
It includes a short-circuiting mechanism to quit if the distance exceeds a provided maximum
Use a rotating set of three arrays rather than a massive matrix as in all the implementations I've see elsewhere
Make sure your arrays slice accross the shorter word width.
Code (it works the exact same if you replace int[] with String in the parameter declarations:
/// <summary>
/// Computes the Damerau-Levenshtein Distance between two strings, represented as arrays of
/// integers, where each integer represents the code point of a character in the source string.
/// Includes an optional threshhold which can be used to indicate the maximum allowable distance.
/// </summary>
/// <param name="source">An array of the code points of the first string</param>
/// <param name="target">An array of the code points of the second string</param>
/// <param name="threshold">Maximum allowable distance</param>
/// <returns>Int.MaxValue if threshhold exceeded; otherwise the Damerau-Leveshteim distance between the strings</returns>
public static int DamerauLevenshteinDistance(int[] source, int[] target, int threshold) {
int length1 = source.Length;
int length2 = target.Length;
// Return trivial case - difference in string lengths exceeds threshhold
if (Math.Abs(length1 - length2) > threshold) { return int.MaxValue; }
// Ensure arrays [i] / length1 use shorter length
if (length1 > length2) {
Swap(ref target, ref source);
Swap(ref length1, ref length2);
}
int maxi = length1;
int maxj = length2;
int[] dCurrent = new int[maxi + 1];
int[] dMinus1 = new int[maxi + 1];
int[] dMinus2 = new int[maxi + 1];
int[] dSwap;
for (int i = 0; i <= maxi; i++) { dCurrent[i] = i; }
int jm1 = 0, im1 = 0, im2 = -1;
for (int j = 1; j <= maxj; j++) {
// Rotate
dSwap = dMinus2;
dMinus2 = dMinus1;
dMinus1 = dCurrent;
dCurrent = dSwap;
// Initialize
int minDistance = int.MaxValue;
dCurrent[0] = j;
im1 = 0;
im2 = -1;
for (int i = 1; i <= maxi; i++) {
int cost = source[im1] == target[jm1] ? 0 : 1;
int del = dCurrent[im1] + 1;
int ins = dMinus1[i] + 1;
int sub = dMinus1[im1] + cost;
//Fastest execution for min value of 3 integers
int min = (del > ins) ? (ins > sub ? sub : ins) : (del > sub ? sub : del);
if (i > 1 && j > 1 && source[im2] == target[jm1] && source[im1] == target[j - 2])
min = Math.Min(min, dMinus2[im2] + cost);
dCurrent[i] = min;
if (min < minDistance) { minDistance = min; }
im1++;
im2++;
}
jm1++;
if (minDistance > threshold) { return int.MaxValue; }
}
int result = dCurrent[maxi];
return (result > threshold) ? int.MaxValue : result;
}
Where Swap is:
static void Swap<T>(ref T arg1,ref T arg2) {
T temp = arg1;
arg1 = arg2;
arg2 = temp;
}
What you are looking for is called edit distance or Levenshtein distance. The wikipedia article explains how it is calculated, and has a nice piece of pseudocode at the bottom to help you code this algorithm in C# very easily.
Here's an implementation from the first site linked below:
private static int CalcLevenshteinDistance(string a, string b)
{
if (String.IsNullOrEmpty(a) && String.IsNullOrEmpty(b)) {
return 0;
}
if (String.IsNullOrEmpty(a)) {
return b.Length;
}
if (String.IsNullOrEmpty(b)) {
return a.Length;
}
int lengthA = a.Length;
int lengthB = b.Length;
var distances = new int[lengthA + 1, lengthB + 1];
for (int i = 0; i <= lengthA; distances[i, 0] = i++);
for (int j = 0; j <= lengthB; distances[0, j] = j++);
for (int i = 1; i <= lengthA; i++)
for (int j = 1; j <= lengthB; j++)
{
int cost = b[j - 1] == a[i - 1] ? 0 : 1;
distances[i, j] = Math.Min
(
Math.Min(distances[i - 1, j] + 1, distances[i, j - 1] + 1),
distances[i - 1, j - 1] + cost
);
}
return distances[lengthA, lengthB];
}
There is a big number of string similarity distance algorithms that can be used. Some listed here (but not exhaustively listed are):
Levenstein
Needleman Wunch
Smith Waterman
Smith Waterman Gotoh
Jaro, Jaro Winkler
Jaccard Similarity
Euclidean Distance
Dice Similarity
Cosine Similarity
Monge Elkan
A library that contains implementation to all of these is called SimMetrics
which has both java and c# implementations.
I have found that Levenshtein and Jaro Winkler are great for small differences betwen strings such as:
Spelling mistakes; or
ö instead of o in a persons name.
However when comparing something like article titles where significant chunks of the text would be the same but with "noise" around the edges, Smith-Waterman-Gotoh has been fantastic:
compare these 2 titles (that are the same but worded differently from different sources):
An endonuclease from Escherichia coli that introduces single polynucleotide chain scissions in ultraviolet-irradiated DNA
Endonuclease III: An Endonuclease from Escherichia coli That Introduces Single Polynucleotide Chain Scissions in Ultraviolet-Irradiated DNA
This site that provides algorithm comparison of the strings shows:
Levenshtein: 81
Smith-Waterman Gotoh 94
Jaro Winkler 78
Jaro Winkler and Levenshtein are not as competent as Smith Waterman Gotoh in detecting the similarity. If we compare two titles that are not the same article, but have some matching text:
Fat metabolism in higher plants. The function of acyl thioesterases in the metabolism of acyl-coenzymes A and acyl-acyl carrier proteins
Fat metabolism in higher plants. The determination of acyl-acyl carrier protein and acyl coenzyme A in a complex lipid mixture
Jaro Winkler gives a false positive, but Smith Waterman Gotoh does not:
Levenshtein: 54
Smith-Waterman Gotoh 49
Jaro Winkler 89
As Anastasiosyal pointed out, SimMetrics has the java code for these algorithms. I had success using the SmithWatermanGotoh java code from SimMetrics.
Here is my implementation of Damerau Levenshtein Distance, which returns not only similarity coefficient, but also returns error locations in corrected word (this feature can be used in text editors). Also my implementation supports different weights of errors (substitution, deletion, insertion, transposition).
public static List<Mistake> OptimalStringAlignmentDistance(
string word, string correctedWord,
bool transposition = true,
int substitutionCost = 1,
int insertionCost = 1,
int deletionCost = 1,
int transpositionCost = 1)
{
int w_length = word.Length;
int cw_length = correctedWord.Length;
var d = new KeyValuePair<int, CharMistakeType>[w_length + 1, cw_length + 1];
var result = new List<Mistake>(Math.Max(w_length, cw_length));
if (w_length == 0)
{
for (int i = 0; i < cw_length; i++)
result.Add(new Mistake(i, CharMistakeType.Insertion));
return result;
}
for (int i = 0; i <= w_length; i++)
d[i, 0] = new KeyValuePair<int, CharMistakeType>(i, CharMistakeType.None);
for (int j = 0; j <= cw_length; j++)
d[0, j] = new KeyValuePair<int, CharMistakeType>(j, CharMistakeType.None);
for (int i = 1; i <= w_length; i++)
{
for (int j = 1; j <= cw_length; j++)
{
bool equal = correctedWord[j - 1] == word[i - 1];
int delCost = d[i - 1, j].Key + deletionCost;
int insCost = d[i, j - 1].Key + insertionCost;
int subCost = d[i - 1, j - 1].Key;
if (!equal)
subCost += substitutionCost;
int transCost = int.MaxValue;
if (transposition && i > 1 && j > 1 && word[i - 1] == correctedWord[j - 2] && word[i - 2] == correctedWord[j - 1])
{
transCost = d[i - 2, j - 2].Key;
if (!equal)
transCost += transpositionCost;
}
int min = delCost;
CharMistakeType mistakeType = CharMistakeType.Deletion;
if (insCost < min)
{
min = insCost;
mistakeType = CharMistakeType.Insertion;
}
if (subCost < min)
{
min = subCost;
mistakeType = equal ? CharMistakeType.None : CharMistakeType.Substitution;
}
if (transCost < min)
{
min = transCost;
mistakeType = CharMistakeType.Transposition;
}
d[i, j] = new KeyValuePair<int, CharMistakeType>(min, mistakeType);
}
}
int w_ind = w_length;
int cw_ind = cw_length;
while (w_ind >= 0 && cw_ind >= 0)
{
switch (d[w_ind, cw_ind].Value)
{
case CharMistakeType.None:
w_ind--;
cw_ind--;
break;
case CharMistakeType.Substitution:
result.Add(new Mistake(cw_ind - 1, CharMistakeType.Substitution));
w_ind--;
cw_ind--;
break;
case CharMistakeType.Deletion:
result.Add(new Mistake(cw_ind, CharMistakeType.Deletion));
w_ind--;
break;
case CharMistakeType.Insertion:
result.Add(new Mistake(cw_ind - 1, CharMistakeType.Insertion));
cw_ind--;
break;
case CharMistakeType.Transposition:
result.Add(new Mistake(cw_ind - 2, CharMistakeType.Transposition));
w_ind -= 2;
cw_ind -= 2;
break;
}
}
if (d[w_length, cw_length].Key > result.Count)
{
int delMistakesCount = d[w_length, cw_length].Key - result.Count;
for (int i = 0; i < delMistakesCount; i++)
result.Add(new Mistake(0, CharMistakeType.Deletion));
}
result.Reverse();
return result;
}
public struct Mistake
{
public int Position;
public CharMistakeType Type;
public Mistake(int position, CharMistakeType type)
{
Position = position;
Type = type;
}
public override string ToString()
{
return Position + ", " + Type;
}
}
public enum CharMistakeType
{
None,
Substitution,
Insertion,
Deletion,
Transposition
}
This code is a part of my project: Yandex-Linguistics.NET.
I wrote some tests and it's seems to me that method is working.
But comments and remarks are welcome.
Here is an alternative approach:
A typical method for finding similarity is Levenshtein distance, and there is no doubt a library with code available.
Unfortunately, this requires comparing to every string. You might be able to write a specialized version of the code to short-circuit the calculation if the distance is greater than some threshold, you would still have to do all the comparisons.
Another idea is to use some variant of trigrams or n-grams. These are sequences of n characters (or n words or n genomic sequences or n whatever). Keep a mapping of trigrams to strings and choose the ones that have the biggest overlap. A typical choice of n is "3", hence the name.
For instance, English would have these trigrams:
Eng
ngl
gli
lis
ish
And England would have:
Eng
ngl
gla
lan
and
Well, 2 out of 7 (or 4 out of 10) match. If this works for you, and you can index the trigram/string table and get a faster search.
You can also combine this with Levenshtein to reduce the set of comparison to those that have some minimum number of n-grams in common.
Here's a VB.net implementation:
Public Shared Function LevenshteinDistance(ByVal v1 As String, ByVal v2 As String) As Integer
Dim cost(v1.Length, v2.Length) As Integer
If v1.Length = 0 Then
Return v2.Length 'if string 1 is empty, the number of edits will be the insertion of all characters in string 2
ElseIf v2.Length = 0 Then
Return v1.Length 'if string 2 is empty, the number of edits will be the insertion of all characters in string 1
Else
'setup the base costs for inserting the correct characters
For v1Count As Integer = 0 To v1.Length
cost(v1Count, 0) = v1Count
Next v1Count
For v2Count As Integer = 0 To v2.Length
cost(0, v2Count) = v2Count
Next v2Count
'now work out the cheapest route to having the correct characters
For v1Count As Integer = 1 To v1.Length
For v2Count As Integer = 1 To v2.Length
'the first min term is the cost of editing the character in place (which will be the cost-to-date or the cost-to-date + 1 (depending on whether a change is required)
'the second min term is the cost of inserting the correct character into string 1 (cost-to-date + 1),
'the third min term is the cost of inserting the correct character into string 2 (cost-to-date + 1) and
cost(v1Count, v2Count) = Math.Min(
cost(v1Count - 1, v2Count - 1) + If(v1.Chars(v1Count - 1) = v2.Chars(v2Count - 1), 0, 1),
Math.Min(
cost(v1Count - 1, v2Count) + 1,
cost(v1Count, v2Count - 1) + 1
)
)
Next v2Count
Next v1Count
'the final result is the cheapest cost to get the two strings to match, which is the bottom right cell in the matrix
'in the event of strings being equal, this will be the result of zipping diagonally down the matrix (which will be square as the strings are the same length)
Return cost(v1.Length, v2.Length)
End If
End Function
Given a matrix[n,n] I want to find out how many ways we can reach from [0,0] to [n,n] non recursively.
My approach is to
Create a stuct Node to store row, col and path travelled so far
Add node to a Queue
Iterate thru queue till not empty . Increment row, increment col. Add to Queue
Print the path if row=n, col=n
Question
Is there a different way of storing row,col and path
If n is very large, storing nodes in Queue can be a problem. How can we avoid this?
Please not I am not looking for recursive solution.
I see such questions in many interview forums and so want to know if this would be the right approach.
Below is the structure of Node and the function
struct Node
{
public int row;
public int col;
public string path;
public Node(int r, int c, string p)
{
this.row = r;
this.col = c;
this.path = p;
}
}
public static void NextMoveNonRecursive(int max)
{
int rowPos;
int colPos;
string prevPath = "";
Node next;
while (qu.Count > 0)
{
Node current = qu.Dequeue();
rowPos = current.row;
colPos = current.col;
prevPath = current.path;
if (rowPos + 1 == max && colPos + 1 == max)
{
Console.WriteLine("Path = ..." + prevPath);
TotalPathCounter++;
}
if (rowPos + 1 < max)
{
if (prevPath == "")
prevPath = current.path;
prevPath = prevPath + ">" + (rowPos + 1) + "" + (colPos);
next = new Node(rowPos + 1, colPos, prevPath);
qu.Enqueue(next);
prevPath = "";
}
if (colPos + 1 < max)
{
if (prevPath == "")
prevPath = current.path;
prevPath = prevPath + ">" + (rowPos) + "" + (colPos+1);
next = new Node(rowPos, colPos+1, prevPath);
qu.Enqueue(next);
prevPath = "";
}
}
}
Let dp[i, j] be the number of paths from [0, 0] to [i, j].
We have:
dp[0, i] = dp[i, 0] = 1 for all i = 0 to n
dp[i, j] = dp[i - 1, j] + come down from all paths to [i - 1, j]
dp[i, j - 1] + come down from all paths to [i, j - 1]
dp[i - 1, j - 1] come down from all paths to [i - 1, j - 1]
for i, j > 0
Remove dp[i - 1, j - 1] from the above sum if you cannot increase both the row and the column.
dp[n, n] will have your answer.
Given a matrix [n,n], how many ways we can reach from [0,0] to [n,n] by increasing either a col or a row?
(n*2-2) choose (n*2-2)/2
If you can only go down or right (i.e., increase row or col), it seems like a binary proposition -- we can think of 'down' or 'right' as '0' or '1'.
In an nxn matrix, every path following the down/right condition will be n*2-2 in length (for example, in a 3x3 square, paths are always length 4; in a 4x4 square, length 6).
The number of total combinations for 0's and 1's in binary numbers of x digits is 2^x. In this case, our 'x' is n*2-2, but we cannot use all the combinations since the number of 'down's or 'right's cannot exceed n-1. It seems we need all binary combinations that have an equal number of 0's and 1's. And the solution is ... tada:
(n*2-2) choose (n*2-2)/2
In Haskell, you could write the following non-recursive function to list the paths:
import Data.List
mazeWays n = nub $ permutations $ concat $ replicate ((n*2-2) `div` 2) "DR"
if you want the number of paths, then:
length $ mazeWays n
Javascript solutions with sample
var arr = [
[1, 1, 1, 0, 0, 1, 0],
[1, 0, 1, 1, 1, 1, 0],
[1, 0, 1, 0, 1, 0, 0],
[1, 1, 0, 0, 1, 0, 0],
[1, 0, 0, 0, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1]
];
function sols2(arr){
var h = arr.length,
w = arr[0].length,
i, j, current, left, top;
for(i = 0; i < h; i++){
for(j = 0; j < w; j++){
current = arr[i][j];
top = i === 0 ? 0.5 : arr[i - 1][j];
left = j === 0 ? 0.5 : arr[i][j-1];
if(left === 0 && top === 0){
arr[i][j] = 0;
} else if(current > 0 && (left > 0 || top > 0)){
arr[i][j] = (left + top) | 0;
} else {
console.log('a6');
arr[i][j] = 0;
}
}
}
return arr[h-1][w-1];
}
sols2(arr);