Algorithm for shortest list of words - c#

The issue is as follows: the user provides a StartWord and an EndWord, each a string of X letters, together with a list of strings that are also of length X (let's say 4, but it could be more).
static void Main(string[] args)
{
string StartWord = "Spot";
string EndWord = "Spin";
List<string> providedList = new List<string>
{
"Spin", "Spit", "Spat", "Spot", "Span"
};
List<string> result = MyFunc(StartWord, EndWord, providedList);
}
public static List<string> MyFunc(string startWord, string endWord, List<string> input)
{
???
}
From the provided parameters I need to display to the user a result consisting of the SHORTEST list of 4-letter words, starting with StartWord and ending with EndWord, with any intermediate words taken from the list, where each word differs from the previous word by PRECISELY one letter.
For example, the above code should return a list of strings containing these elements:
Spot (as StartWord),
Spit (only one letter is different from the previous word),
Spin (as EndWord)
A bad example would be: Spot, Spat, Span, Spin (as it takes 3 changes compared to the 2 above).
I have been looking at some matching algorithms and recursion, but I am not able to figure out how to go about this.
Thank you for any kind of help in advance.

Create a graph where the vertices are words, and an edge connects any two words that differ by one letter.
Do a breadth-first search, starting at the StartWord, looking for the shortest path to the EndWord.
Here is sample code for this solution in a different language (Python). That may give you an even better pointer. :-)
def shortestWordPath(startWord, endWord, words):
    # Build the graph: each word maps to the words that differ from it by one letter.
    graph = {}
    for word in words:
        graph[word] = {"connected": []}
    for word in words:
        for otherWord in words:
            if 1 == wordDistance(word, otherWord):
                graph[word]["connected"].append(otherWord)

    # Breadth-first search. Each queue entry is (word, word it was reached from);
    # None marks the start word, which has no predecessor.
    todo = [(startWord, None)]
    while len(todo):
        (thisWord, fromWord) = todo.pop(0)
        if thisWord == endWord:
            # Walk the "from" links back to the start word, then reverse.
            answer = [thisWord]
            while fromWord is not None:
                answer.append(fromWord)
                fromWord = graph[fromWord]["from"]
            answer.reverse()
            return answer
        elif "from" in graph[thisWord]:
            pass  # We have already processed this.
        else:
            graph[thisWord]["from"] = fromWord
            for nextWord in graph[thisWord]["connected"]:
                todo.append((nextWord, thisWord))
    return None

def wordDistance(word1, word2):
    return len(differentPositions(word1, word2))

def differentPositions(word1, word2):
    answer = []
    shorter = min(len(word1), len(word2))
    longer = max(len(word1), len(word2))
    for i in range(0, shorter):
        if word1[i] != word2[i]:
            answer.append(i)
    # Positions that exist in only one of the words also count as differences.
    for i in range(shorter, longer):
        answer.append(i)
    return answer

print(shortestWordPath("Spot", "Spin",
                       ["Spin", "Spit", "Spat", "Spot", "Span"]))
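Since the question is about C#, the same graph-plus-BFS idea could look roughly like the sketch below. It is only an illustration: the names WordLadderSketch, DiffersByOne and ShortestPath are made up here, and instead of building the graph up front it checks neighbours on the fly, which is fine for small word lists.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordLadderSketch
{
    // True when both words have the same length and differ in exactly one position.
    private static bool DiffersByOne(string a, string b)
    {
        if (a.Length != b.Length) return false;
        int differences = 0;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i] && ++differences > 1) return false;
        }
        return differences == 1;
    }

    // Breadth-first search; returns the shortest ladder, or an empty list if none exists.
    public static List<string> ShortestPath(string start, string end, IEnumerable<string> words)
    {
        // previous[w] is the word from which w was first reached (null for the start word).
        var previous = new Dictionary<string, string> { [start] = null };
        var queue = new Queue<string>();
        queue.Enqueue(start);
        var candidates = words.Distinct().ToList();

        while (queue.Count > 0)
        {
            string current = queue.Dequeue();
            if (current == end)
            {
                // Follow the predecessor links back to the start, then reverse.
                var path = new List<string>();
                for (string w = current; w != null; w = previous[w]) path.Add(w);
                path.Reverse();
                return path;
            }
            foreach (string next in candidates)
            {
                if (!previous.ContainsKey(next) && DiffersByOne(current, next))
                {
                    previous[next] = current;
                    queue.Enqueue(next);
                }
            }
        }
        return new List<string>();
    }
}
```

Calling WordLadderSketch.ShortestPath("Spot", "Spin", providedList) with the list from the question gives Spot, Spit, Spin. Because BFS visits words in order of distance from the start, the first time the end word is dequeued the path is a shortest one.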

This is what I ended up using (please feel free to comment on its upsides and downsides):
private List<List<string>> allWordSteps;
private string[] allWords;
public List<string> WordLadder(string wordStart, string wordEnd, string[] allWordsInput)
{
var wordLadder = new List<string>() { wordStart };
this.allWordSteps = new List<List<string>>() { wordLadder };
allWords = allWordsInput;
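do
{
wordLadder = this.IterateWordSteps(wordEnd);
}
// Stop when a ladder is found, or when no partial ladders remain (no path exists).
while (wordLadder.Count == 0 && this.allWordSteps.Count > 0);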
return wordLadder;
}
private List<string> IterateWordSteps(string wordEnd)
{
List<List<string>> allWordStepsCopy = this.allWordSteps.ToList();
this.allWordSteps.Clear();
foreach (var wordSteps in allWordStepsCopy)
{
var adjacent = this.allWords.Where(
x => this.IsOneLetterDifferent(x, wordSteps.Last()) &&
!wordSteps.Contains(x));
if (adjacent.Contains(wordEnd))
{
wordSteps.Add(wordEnd);
return wordSteps;
}
foreach (var word in adjacent)
{
List<string> newWordStep = wordSteps.ToList();
newWordStep.Add(word);
this.allWordSteps.Add(newWordStep);
}
}
return new List<string>();
}
private bool IsOneLetterDifferent(string first, string second)
{
int differences = 0;
if (first.Length == second.Length)
{
for (int i = 0; i < first.Length; i++)
{
if (first[i] != second[i])
{
differences++;
}
}
}
return differences == 1;
}

Related

How to split a string into an array of two letter substrings with C#

Problem
Given a sample string abcdef, I am trying to split it into an array of two-character string elements, which should result in ['ab','cd','ef'].
What I tried
I tried to iterate through the string, storing the substring at the current index in an array I declared inside the method, but I am getting this output:
['ab','bc','cd','de','ef']
Code I used
static string[] mymethod(string str)
{
string[] r= new string[str.Length];
for(int i=0; i<str.Length-1; i++)
{
r[i]=str.Substring(i,2);
}
return r;
}
Any solution that corrects the code so it returns the expected output is really welcome. Thanks.
Your problem was that you incremented your index by 1 instead of 2 every time.
var res = new List<string>();
for (int i = 0; i < str.Length - 1; i += 2)
{
res.Add(str.Substring(i, 2));
}
This should work.
EDIT:
Because you asked for a default _ suffix when the input has an odd number of characters,
this is the change:
var testString = "odd";
string workOn = testString.Length % 2 != 0
? testString + "_"
: testString;
var res = new List<string>();
for (int i = 0; i < workOn.Length - 1; i += 2)
{
res.Add(workOn.Substring(i, 2));
}
Two things to note:
- In .NET 6, Chunk() is available, so you can use that instead, as suggested in other answers.
- This solution might not be the best choice for very long input, so it really depends on your inputs and expectations.
.NET 6 has an Enumerable.Chunk() method that you can use to do this, as follows:
public static void Main()
{
string[] result =
"abcdef"
.Chunk(2)
.Select(chunk => new string(chunk)).ToArray();
Console.WriteLine(string.Join(", ", result)); // Prints "ab, cd, ef"
}
Before .NET 6, you can use MoreLinq.Batch() to do the same thing.
[EDIT] In response to the request below:
MoreLinq is a set of Linq utilities originally written by Jon Skeet. You can find an implementation by going to Project | Manage NuGet Packages and then browsing for MoreLinq and installing it.
After installing it, add using MoreLinq.Extensions; and then you'll be able to use the MoreLinq.Batch extension like so:
public static void Main()
{
string[] result = "abcdef"
.Batch(2)
.Select(chunk => new string(chunk.ToArray())).ToArray();
Console.WriteLine(string.Join(", ", result)); // Prints "ab, cd, ef"
}
Note that there is no string constructor that accepts an IEnumerable<char>, hence the need for the chunk.ToArray() above.
I would say, though, that including the whole of MoreLinq just for one extension method is perhaps overkill. You could just write your own extension method for Enumerable.Chunk():
public static class MyBatch
{
public static IEnumerable<T[]> Chunk<T>(this IEnumerable<T> self, int size)
{
T[] bucket = null;
int count = 0;
foreach (var item in self)
{
if (bucket == null)
bucket = new T[size];
bucket[count++] = item;
if (count != size)
continue;
yield return bucket;
bucket = null;
count = 0;
}
if (bucket != null && count > 0)
yield return bucket.Take(count).ToArray();
}
}
If you are using the latest .NET version (i.e. .NET 6.0 RC 1), then you can try the Chunk() method:
var strChunks = "abcdef".Chunk(2); //[['a', 'b'], ['c', 'd'], ['e', 'f']]
var result = strChunks.Select(x => new string(x)).ToArray(); //["ab", "cd", "ef"]
Note: I am unable to test this on a fiddle or on my local machine, because it requires the latest version of .NET.
With LINQ you can achieve it in the following way:
char[] word = "abcdefg".ToCharArray();
var evenCharacters = word.Where((_, idx) => idx % 2 == 0);
var oddCharacters = word.Where((_, idx) => idx % 2 == 1);
var twoCharacterLongSplits = evenCharacters
.Zip(oddCharacters)
.Select((pair) => new char[] { pair.First, pair.Second });
The trick is the following, we create two collections:
one where we have only those characters where the original index was even (% 2 == 0)
one where we have only those characters where the original index was odd (% 2 == 1)
Then we zip them. So, we create a tuple by taking one item from the even and one item from the odd collection. Then we create a new tuple by taking one item from the even and ...
And last we convert the tuples to arrays to have the desired output format.
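To get strings rather than char arrays out of the same even/odd trick, a sketch along these lines could be used. The class and method names are invented for this example, and it uses the Zip overload that takes a result selector. Note that Zip stops at the shorter sequence, so for an odd-length input the final unpaired character is dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairSplitSketch
{
    // Pairs each even-indexed character with the odd-indexed character that follows it.
    public static string[] SplitIntoPairs(string text)
    {
        var even = text.Where((_, idx) => idx % 2 == 0);
        var odd = text.Where((_, idx) => idx % 2 == 1);
        return even
            .Zip(odd, (first, second) => new string(new[] { first, second }))
            .ToArray();
    }
}
```

For example, PairSplitSketch.SplitIntoPairs("abcdefg") yields "ab", "cd", "ef"; the trailing "g" has no partner and is not returned.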
You are on the right track but you need to increment by 2 not by one. You also need to check if the array has not ended before taking the second character else you risk running into an index out of bounds exception. Try this code I've written below. I've tried it and it works. Best!
public static List<string> splitstring(string str)
{
List<string> result = new List<string>();
int strlen = str.Length;
for(int i = 0; i<strlen; i+=2)
{
string currentstr = str[i].ToString();
if (i + 1 <= strlen-1)
{ currentstr += str[i + 1].ToString(); }
result.Add(currentstr);
}
return result;
}

What will be the linq query for below code?

class Program
{
public static bool Like(string toSearch,string toFind)
{
if (toSearch.Contains(toFind))
return true;
else
return false;
}
static void Main(string[] args)
{
List<string> str = new List<string>();
List<string> strNew = new List<string>();
str.Add("abdecacd");
str.Add("facdgh");
str.Add("iabcacdjk");
str.Add("lmn");
str.Add("opqe");
str.Add("acbd");
str.Add("efgh");
string strToSearch= "acd,abc,abcacd,al";
string[] desc = strToSearch.Split(',');
for(int i = 0; i < str.Count; i++)
{
for(int j = 0; j < desc.Length; j++)
{
if(Like(str[i].ToString(),desc[j].ToString()))
{
strNew.Add(str[i].ToString());
break;
}
}
}
if(strNew != null)
{
foreach(string strPrint in strNew)
{
Console.WriteLine(strPrint);
}
}
}
}
How can I write a LINQ query for the above code? The value of the strToSearch variable is dynamic: the user enters comma-separated values, as many as they want. I want a LINQ query that finds all the values in the list that contain any of the values entered by the user.
I need a LINQ query because LINQ is used throughout my application. Kindly help me out with this.
The LINQ expression is:
List<string> strNew = str.Where(x => desc.Any(y => x.Contains(y))).ToList();
that can even be simplified (simplified for the .NET runtime, not for the programmer) to:
List<string> strNew = str.Where(x => desc.Any(x.Contains)).ToList();
by removing an intermediate lambda function.
In general there is no "speed" difference between what you wrote and what I wrote. Both expressions are O(m*n), with m = str.Count and n = desc.Length, so roughly quadratic. You aren't doing an exact search, so you can't use the usual trick of creating a HashSet<string> (or doing str.Intersect(desc).ToList(), which internally does the same thing).
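Since the search terms come from user input, it may be worth guarding against empty entries: an empty string is contained in every string, so input such as "acd,,abc" would otherwise match everything. A small sketch under that assumption (SubstringFilterSketch and FilterByAnyTerm are names made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubstringFilterSketch
{
    // Returns the values that contain at least one of the comma-separated terms.
    public static List<string> FilterByAnyTerm(IEnumerable<string> values, string commaSeparatedTerms)
    {
        // Drop empty and whitespace-only terms before searching.
        string[] terms = commaSeparatedTerms
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(t => t.Trim())
            .Where(t => t.Length > 0)
            .ToArray();

        return values.Where(v => terms.Any(t => v.Contains(t))).ToList();
    }
}
```

With the list from the question and "acd,abc,abcacd,al" as input, this returns abdecacd, facdgh and iabcacdjk, the same entries the nested loops print.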

Unable to find longest match of string within List<String>

I am trying to find the longest matches among the strings in a list. Suppose my list is:
- "1->2"
- "1->2->3"
- "1->2->3->4"
- "5->6"
- "5->6->7"
- "5->6->7->8"
The output should be the following, because these strings contain all the other matches within the same list; I want to discard the remaining matches, which fall short:
"1->2->3->4"
"5->6->7->8"
Update:
As 1->2 and 1->2->3 are contained in 1->2->3->4, I want to discard the less specific ones (1->2 and 1->2->3) and keep the longest match, 1->2->3->4.
The paths will always be in order like 1->2->3->4 and not 1->4 or 1->3.
I am trying like this, but I am getting enumeration yielded no results:
public class Program
{
public static void Main(string[] args)
{
List<(string, int)> flattenedPaths = new List<(string, int)>
{
("1->2", 0),
("1->2->3", 1),
("1->2->3->4", 2),
("5->6", 3),
("5->6->7", 4),
("5->6->7->8", 5)
};
IEnumerable<(string, int)> uniquePaths = GetUniquePaths(flattenedPaths);
}
public static IEnumerable<(string, int)> GetUniquePaths(List<(string, int)> Paths)
{
for (int i = 0; i < Paths.Count; i++)
{
bool doesMatchContain = Paths.Skip(i)
.Any(x => x.Item1.Contains(Paths[i].Item1));
if (!doesMatchContain)
yield return Paths[i];
}
}
}
Any help is appreciated.
Below is one approach you may wish to try.
betterData is your existing data, but projected into a more usable form - where the upper and lower bounds of the range are integers rather than strings.
The Substring and IndexOf and LastIndexOf code is brittle - but will work for your current sample data - feel free to harden it up with checks for -1 etc.
Once we have a list of data with those upper and lower bounds set, we use RemoveAll to delete any entries from the list which are within a 'wider' range (e.g. 2-3 is within 1-4).
Note also that betterData.ToList() is used to allow us to iterate over a copy of the list while modifying the original. There are fancier ways of achieving the same effect, but they are slightly more error-prone, so I've gone for the dumb but simple approach here.
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
Console.WriteLine(string.Join("\r\n", uniquePaths.Select(z => z.Item1 + " " + z.Item2)));
Console.WriteLine("Done");
Console.ReadLine();
}
private static List<(string, int)> flattenedPaths = new List<(string, int)>
{
("1->2", 0),
("1->2->3", 1),
("1->2->3->4", 2),
("5->6", 3),
("5->6->7", 4),
("5->6->7->8", 5),
};
private static IEnumerable<(string, int)> uniquePaths = GetUniquePaths(flattenedPaths);
private static IEnumerable<(string, int)> GetUniquePaths(List<(string, int)> Paths)
{
var betterData = Paths
.Select(z => new
{
Number = z.Item2,
Value = z.Item1,
Lower = int.Parse(z.Item1.Substring(0, z.Item1.IndexOf("-"))),
Upper = int.Parse(z.Item1.Substring(z.Item1.LastIndexOf("-") + 2))
})
.OrderByDescending(z => z.Value.Length).ThenByDescending(z => z.Upper).ThenBy(z => z.Lower).ToList();
foreach (var entry in betterData.ToList())
{
betterData.RemoveAll(z => z != entry && z.Lower >= entry.Lower && z.Upper <= entry.Upper);
}
return betterData.Select(x => (x.Value, x.Number));
}
}
For your given list
List<string> list= new List<string>
{
"1->2",
"1->2->3",
"1->2->3->4",
"5->6",
"5->6->7",
"5->6->7->8"
};
The LINQ query would be
var result = list
.Where(s => !list.Any(s2 => s2 != s && s2.IndexOf(s) == 0))
.ToList();
result contains "1->2->3->4" and "5->6->7->8"
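A slightly stricter variant of the prefix idea is sketched below, with hypothetical names (LongestPathSketch, KeepLongest). It treats a path as redundant only when another path starts with it followed by "->", so "1->2" is not considered part of "1->20". It also never compares an element against itself, which matters here: Paths.Skip(i) in the question still includes element i, and every string contains itself, so that version yields nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LongestPathSketch
{
    // Keeps only the paths that are not a leading part of some longer path in the list.
    public static List<string> KeepLongest(IReadOnlyList<string> paths)
    {
        return paths
            .Where(p => !paths.Any(other =>
                other.StartsWith(p + "->", StringComparison.Ordinal)))
            .ToList();
    }
}
```

For the six sample paths this returns "1->2->3->4" and "5->6->7->8". The separator "->" is taken from the sample data; a different path format would need a different suffix.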
If the input is in order and you would like to find the longest sequence containing a given string, then use the code below:
class Program
{
string output;
public void initalize(string findlongestcontainof)
{
int length = 0;
List<string> inputs = new List<string>();
inputs.Add("1->2->3");
inputs.Add("1->2->3->4");
inputs.Add("1->2->3->4->5->6");
inputs.Add("1->2->3->4->5");
foreach(string s in inputs)
{
if(s.Contains(findlongestcontainof))
{
if(s.Length > length)
{
length = s.Length;
output = s;
}
}
}
}
static void Main(string[] args)
{
Program p = new Program();
p.initalize("1->2");
Console.WriteLine(p.output);
Console.ReadLine();
}
}

All letter-combinations (n-items) according pattern

I'm trying to decrypt a word whose letters have been replaced with random other letters (but never two different letters with the same one).
The goal:
I'm searching for a word whose length and letter pattern are known.
What I know:
The pattern itself: if I am searching for "guests", I know "123454", which shows the positions of the unique letters in the word. I also know for sure that it is a correctly spelled English word.
Software side:
I've created a DataGridView whose headers are titled by the pattern. I want to populate each column (pattern) with its possible combinations of all letters a-z.
What I've tried:
I'll start at the end: I've successfully implemented a spell-checker. So my plan for the final step was to go through the columns and check each result to find the actual word.
Starting from the beginning, this is what I have written so far:
private string[] alpha = new string[] { "a", "b", "c", ..."z"};
private int[] digits = new int[] { 0, 1, 2, 3, 4,....9 };
private void bruteforce()
{
// Each Column
foreach(DataGridViewColumn col in dgvResults.Columns)
{
// HeaderText to CharArray to IntArray (-48 to switch from ASCII to INT).
int[] pattern = Array.ConvertAll(col.HeaderText.ToCharArray(), c => c - 48);
// Prepare an result-array with the same length as the pattern.
string[] resultSet = Enumerable.Repeat("-", pattern.Length).ToArray();
// For each digit 0-9.
foreach(int digit in digits)
{
// In pattern search for each digit and save the index.
int[] indexes = pattern.FindAllIndexof(digit);
// If index is found.
if(indexes.Length > 0)
{
// Custom function ReplaceAtIndex.
// Replace resultSet-Values at given Indexes with unique letter
resultSet.ReplaceAtIndex(indexes, alpha[digit]);
}
}
}
}
Current result:
A pattern of 0112344 will be saved (resultSet) as abbcdee.
Now I would need to loop the letters while staying on the same pattern.
This step feels even more complicated than what came before. Before racking my brain any further, I wanted to see whether someone on Stack Overflow can suggest a shorter, easier way (maybe some shortcuts with LINQ).
So please, is there anyone who finds this easy and could help me?
I appreciate every help in here. Thanks
Here is, IMO, a quite effective algorithm for generating what you are asking for.
It's a variation of the algorithm I've used in "Looking at each combination in jagged array" and "System.OutOfMemoryException when generating permutations", and is optimized to perform minimum allocations.
public static class Algorithms
{
private static readonly char[] alpha = Enumerable.Range('a', 'z' - 'a' + 1).Select(c => (char)c).ToArray();
public static IEnumerable<string> GenerateWords(this string pattern)
{
return pattern.GenerateWordsCore().Select(word => new string(word));
}
public static IEnumerable<char[]> GenerateWordsCore(this string pattern)
{
var distinctSet = pattern.Select(c => c - '0').Distinct().ToArray();
var indexMap = pattern.Select(c => Array.IndexOf(distinctSet, c - '0')).ToArray();
var result = new char[pattern.Length];
var indices = new int[distinctSet.Length];
var indexUsed = new bool[alpha.Length];
for (int pos = 0, index = 0; ;)
{
// Generate the next permutation
if (index < alpha.Length)
{
if (indexUsed[index]) { index++; continue; }
indices[pos] = index;
indexUsed[index] = true;
if (++pos < distinctSet.Length) { index = 0; continue; }
// Populate and yield the result
for (int i = 0; i < indexMap.Length; i++)
result[i] = alpha[indices[indexMap[i]]];
yield return result;
}
// Advance to next permutation if any
if (pos == 0) yield break;
index = indices[--pos];
indexUsed[index] = false;
index++;
}
}
}
Sample usage:
bool test = "12334".GenerateWords().Contains("hello");
foreach (var word in "123454".GenerateWords())
{
// Do something with word
}
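Generating every candidate grows quickly: a pattern with five distinct letters already has 26 * 25 * 24 * 23 * 22 = 7,893,600 assignments. If a word list is available (an assumption here; the question only mentions a spell-checker), an alternative is to go the other way and compute the pattern of each dictionary word, keeping those that match. The sketch below uses invented names (PatternSketch, PatternOf, Candidates) and encodes patterns with commas so that words with more than nine distinct letters still work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PatternSketch
{
    // Encodes a word by the order in which its letters first appear,
    // e.g. "guests" -> "1,2,3,4,5,4".
    public static string PatternOf(string word)
    {
        var firstSeen = new Dictionary<char, int>();
        var parts = new List<int>();
        foreach (char c in word)
        {
            if (!firstSeen.TryGetValue(c, out int id))
            {
                id = firstSeen.Count + 1;
                firstSeen[c] = id;
            }
            parts.Add(id);
        }
        return string.Join(",", parts);
    }

    // Returns the dictionary words whose letter pattern matches that of the encrypted word.
    public static IEnumerable<string> Candidates(IEnumerable<string> dictionary, string cipherWord)
    {
        string target = PatternOf(cipherWord);
        return dictionary.Where(w => w.Length == cipherWord.Length && PatternOf(w) == target);
    }
}
```

For instance, with the dictionary { "hello", "jelly", "world", "apple" } and the encrypted word "xyzzw", the candidates are "hello" and "jelly".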

Split and join multiple logical "branches" of string data

I know there's a couple similarly worded questions on SO about permutation listing, but they don't seem to be quite addressing really what I'm looking for. I know there's a way to do this but I'm drawing a blank. I have a flat file that resembles this format:
Col1|Col2|Col3|Col4|Col5|Col6
a|b,c,d|e|f|g,h|i
. . .
Now here's the trick: I want to create a list of all possible permutations of these rows, where a comma-separated list in the row represents possible values. For example, I should be able to take an IEnumerable<string> representing the above row as such:
IEnumerable<string> row = new string[] { "a", "b,c,d", "e", "f", "g,h", "i" };
IEnumerable<string> permutations = GetPermutations(row, delimiter: "/");
This should generate the following collection of string data:
a/b/e/f/g/i
a/b/e/f/h/i
a/c/e/f/g/i
a/c/e/f/h/i
a/d/e/f/g/i
a/d/e/f/h/i
This to me seems like it would elegantly fit into a recursive method, but apparently I have a bad case of the Mondays and I can't quite wrap my brain around how to approach it. Some help would be greatly appreciated. What should GetPermutations(IEnumerable<string>, string) look like?
You had me at "recursive". Here's another suggestion:
private IEnumerable<string> GetPermutations(string[] row, string delimiter,
int colIndex = 0, string[] currentPerm = null)
{
//First-time initialization:
if (currentPerm == null) { currentPerm = new string[row.Length]; }
var values = row[colIndex].Split(',');
foreach (var val in values)
{
//Update the current permutation with this column's next possible value..
currentPerm[colIndex] = val;
//..and find values for the remaining columns..
if (colIndex < (row.Length - 1))
{
foreach (var perm in GetPermutations(row, delimiter, colIndex + 1, currentPerm))
{
yield return perm;
}
}
//..unless we've reached the last column, in which case we create a complete string:
else
{
yield return string.Join(delimiter, currentPerm);
}
}
}
I'm not sure whether this is the most elegant approach, but it might get you started.
private static IEnumerable<string> GetPermutations(IEnumerable<string> row,
string delimiter = "|")
{
var separator = new[] { ',' };
var permutations = new List<string>();
foreach (var cell in row)
{
var parts = cell.Split(separator);
var perms = permutations.ToArray();
permutations.Clear();
foreach (var part in parts)
{
if (perms.Length == 0)
{
permutations.Add(part);
continue;
}
foreach (var perm in perms)
{
permutations.Add(string.Concat(perm, delimiter, part));
}
}
}
return permutations;
}
Of course, if the order of the permutations is important, you can add an .OrderBy() at the end.
Edit: added an alternative
You could also build a list of string arrays, by calculating some numbers before determining the permutations.
private static IEnumerable<string> GetPermutations(IEnumerable<string> row,
string delimiter = "|")
{
var permutationGroups = row.Select(o => o.Split(new[] { ',' })).ToArray();
var numberOfGroups = permutationGroups.Length;
var numberOfPermutations =
permutationGroups.Aggregate(1, (current, pg) => current * pg.Length);
var permutations = new List<string[]>(numberOfPermutations);
for (var n = 0; n < numberOfPermutations; n++)
{
permutations.Add(new string[numberOfGroups]);
}
// blockSize is the number of consecutive rows that share the same value at the
// current position; it shrinks as we move right, like the place values of a
// mixed-radix number, so every combination appears exactly once.
var blockSize = numberOfPermutations;
for (var position = 0; position < numberOfGroups; position++)
{
var permutationGroup = permutationGroups[position];
blockSize /= permutationGroup.Length;
for (var n = 0; n < numberOfPermutations; n++)
{
var index = (n / blockSize) % permutationGroup.Length;
permutations[n][position] = permutationGroup[index];
}
}
return permutations.Select(p => string.Join(delimiter, p));
}
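What is being computed here is a Cartesian product of the per-column options, which can also be written compactly with LINQ's Aggregate and SelectMany. The following is a sketch only, assuming comma-separated cells as in the question; CartesianSketch and Expand are names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CartesianSketch
{
    // Expands a row whose cells hold comma-separated options into every combination.
    public static IEnumerable<string> Expand(IEnumerable<string> row, string delimiter)
    {
        // Start with one empty combination and extend it column by column.
        IEnumerable<IEnumerable<string>> seed = new[] { Enumerable.Empty<string>() };

        return row
            .Select(cell => cell.Split(','))
            .Aggregate(seed, (combinations, options) =>
                combinations.SelectMany(prefix =>
                    options.Select(option => prefix.Concat(new[] { option }))))
            .Select(parts => string.Join(delimiter, parts));
    }
}
```

For the sample row this produces the six lines in the order listed in the question, starting with a/b/e/f/g/i and ending with a/d/e/f/h/i. It is short, but it builds many small intermediate sequences, so the loop-based versions may be preferable for large inputs.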
One algorithm you can use is basically like counting:
Start with the 0th item in each list (00000)
Increment the last value (00001, 00002 etc.)
When you can't increase one value, reset it and increment the next (00009, 00010, 00011 etc.)
When you can't increase any value, you're done.
Function:
static IEnumerable<string> Permutations(
string input,
char separator1, char separator2,
string delimiter)
{
var enumerators = input.Split(separator1)
.Select(s => s.Split(separator2).GetEnumerator()).ToArray();
if (!enumerators.All(e => e.MoveNext())) yield break;
while (true)
{
yield return String.Join(delimiter, enumerators.Select(e => e.Current));
if (enumerators.Reverse().All(e => {
bool finished = !e.MoveNext();
if (finished)
{
e.Reset();
e.MoveNext();
}
return finished;
}))
yield break;
}
}
Usage:
foreach (var perm in Permutations("a|b,c,d|e|f|g,h|i", '|', ',', "/"))
{
Console.WriteLine(perm);
}
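The same counting idea can also be sketched with a plain array of counters instead of enumerators, which some may find easier to follow. The names OdometerSketch and Expand are hypothetical, and the input is assumed to be already split into cells.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OdometerSketch
{
    // Yields every combination by counting through the options like an odometer.
    public static IEnumerable<string> Expand(IReadOnlyList<string> row, string delimiter)
    {
        string[][] options = row.Select(cell => cell.Split(',')).ToArray();
        var counters = new int[options.Length]; // all counters start at 0

        while (true)
        {
            yield return string.Join(delimiter, options.Select((o, i) => o[counters[i]]));

            // Increment the last counter; on overflow reset it and carry to the left.
            int pos = options.Length - 1;
            while (pos >= 0 && ++counters[pos] == options[pos].Length)
            {
                counters[pos] = 0;
                pos--;
            }
            if (pos < 0) yield break; // every counter overflowed: done
        }
    }
}
```

Because results are yielded one at a time, only the counters and the current string are held in memory, regardless of how many combinations there are.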
I really thought this would be a great recursive function, but I ended up not writing it that way. Ultimately, this is the code I created:
public IEnumerable<string> GetPermutations(IEnumerable<string> possibleCombos, string delimiter)
{
var permutations = new Dictionary<int, List<string>>();
var comboArray = possibleCombos.ToArray();
var splitCharArr = new char[] { ',' };
permutations[0] = new List<string>();
permutations[0].AddRange(
possibleCombos
.First()
.Split(splitCharArr)
.Where(x => !string.IsNullOrEmpty(x.Trim()))
.Select(x => x.Trim()));
for (int i = 1; i < comboArray.Length; i++)
{
permutations[i] = new List<string>();
foreach (var permutation in permutations[i - 1])
{
permutations[i].AddRange(
comboArray[i].Split(splitCharArr)
.Where(x => !string.IsNullOrEmpty(x.Trim()))
.Select(x => string.Format("{0}{1}{2}", permutation, delimiter, x.Trim()))
);
}
}
return permutations[permutations.Keys.Max()];
}
... my test conditions provided me with exactly the output I expected:
IEnumerable<string> row = new string[] { "a", "b,c,d", "e", "f", "g,h", "i" };
IEnumerable<string> permutations = GetPermutations(row, delimiter: "/");
foreach(var permutation in permutations)
{
Debug.Print(permutation);
}
This produced the following output:
a/b/e/f/g/i
a/b/e/f/h/i
a/c/e/f/g/i
a/c/e/f/h/i
a/d/e/f/g/i
a/d/e/f/h/i
Thanks to everyone's suggestions, they really were helpful in sorting out what needed to be done in my mind. I've upvoted all your answers.
