Finding differences within 2 Lists of string arrays - c#

I am looking to find the differences between two Lists of string arrays using the index 0 of the array as the primary key.
List<string[]> original = new List<string[]>();
List<string[]> web = new List<string[]>();
//define arrays for List 'original'
string[] original_a1 = new string[3]{"a","2","3"};
string[] original_a2 = new string[3]{"x","2","3"};
string[] original_a3 = new string[3]{"c","2","3"};
//define arrays for List 'web'
string[] web_a1 = new string[3]{"a","2","3"};
string[] web_a2 = new string[3]{"b","2","3"};
string[] web_a3 = new string[3]{"c","2","3"};
//populate Lists
original.Add(original_a1);
original.Add(original_a2);
original.Add(original_a3);
web.Add(web_a1);
web.Add(web_a2);
web.Add(web_a3);
My goal is to find what is in List 'original' but NOT in 'web' by using index 0 as the primary key
This is what I tried.
List<string> differences = new List<string>(); //differences go in here
string tempDiff = ""; // I use this to try and avoid duplicate entries but its not working
for(int i = 0; i < original.Count; i++){
for(int j = 0; j< web.Count; j++){
if(!(original[i][0].Equals(web[j][0]))){
tempDiff = original[i][0];
}
}
differences.Add(tempDiff);
}
OUTPUT:
foreach(string x in differences){
Console.WriteLine("SIZE " + differences.Count);
Console.WriteLine(x);
Console.ReadLine();
}
SIZE 3
SIZE 3
x
SIZE 3
x
Why is it reporting the mismatch 3 times instead of once?

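Using LINQ you can just write:
var differences = original.Select(a => a[0]).Except(web.Select(a => a[0])).ToList();
This gives you the keys that are in original but don't exist in web. Note that the keys are projected first: calling Except directly on the two List<string[]> compares the arrays by reference, so it would return every row of original.
Sorry, I didn't read your question properly. To answer it:
You have a nested for loop. For each of the 3 values in original it loops through all 3 values in web, which is 9 comparisons in total.
The call to differences.Add(tempDiff) sits outside the inner loop, so it runs once per element of original whether or not a match was found, and tempDiff still holds whatever the last mismatch assigned to it. That is why the list always ends up with 3 entries.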
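A minimal sketch of the reference-versus-value point, using the data from the question. The class and method names (KeyDiffDemo, KeysOnlyInFirst) are made up for this illustration and are not part of any answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyDiffDemo
{
    // Hypothetical helper: keys (index 0) present in 'first' but not in 'second'.
    public static List<string> KeysOnlyInFirst(List<string[]> first, List<string[]> second)
    {
        return first.Select(row => row[0])
                    .Except(second.Select(row => row[0]))
                    .ToList();
    }

    public static void Main()
    {
        var original = new List<string[]>
        {
            new[] { "a", "2", "3" },
            new[] { "x", "2", "3" },
            new[] { "c", "2", "3" }
        };
        var web = new List<string[]>
        {
            new[] { "a", "2", "3" },
            new[] { "b", "2", "3" },
            new[] { "c", "2", "3" }
        };

        // Arrays are compared by reference, so nothing is filtered out here.
        Console.WriteLine(original.Except(web).Count()); // 3

        // Comparing the string keys filters by value.
        Console.WriteLine(string.Join(",", KeysOnlyInFirst(original, web))); // x
    }
}
```

If whole rows are needed rather than keys, one option is to build a HashSet<string> of the web keys and filter original with Where.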

I think this is what you want. I use LINQ to grab the primary keys, and then I use Except to compute original minus web. By the way, you can use == instead of Equals with strings in C#, because string overloads == to compare values rather than references.
List<string[]> original = new List<string[]>
{
new string[3] { "a", "2", "3" },
new string[3] { "x", "2", "3" },
new string[3] { "c", "2", "3" }
};
List<string[]> web = new List<string[]>
{
new string[3] { "a", "2", "3" },
new string[3] { "b", "2", "3" },
new string[3] { "c", "2", "3" }
};
var originalPrimaryKeys = original.Select(o => o[0]);
var webPrimaryKeys = web.Select(o => o[0]);
List<string> differences = originalPrimaryKeys.Except(webPrimaryKeys).ToList();
Console.WriteLine("The number of differences is {0}", differences.Count);
foreach (string diff in differences)
{
Console.WriteLine(diff);
}
And here it is without Linq:
var differences = new List<string>();
for (int i = 0; i < original.Count; i++)
{
bool found = false;
for (int j = 0; j < web.Count; j++)
{
if (original[i][0] == web[j][0])
{
found = true;
}
}
if (!found)
{
differences.Add(original[i][0]);
}
}

To answer your question: as stated in JanR's answer, the cause is the nested for loop. The inner comparison runs 9 times (3 × 3), and differences.Add is called once per outer iteration, so three entries end up in the list.
If both lists are in the same order, a simpler way is to compare them position by position:
//Check for originals not introduced in web.
if(original.Count > web.Count)
{
for(int y = web.Count; y < original.Count; y++)
{
differences.Add(original[y][0]);
}
}
//Compare only the positions that exist in both lists, so web[i] can never go out of range
int common = Math.Min(original.Count, web.Count);
for(int i = 0; i < common; i++)
{
if(!original[i][0].Equals(web[i][0]))
differences.Add(original[i][0]);
}
Also, you print the size inside the foreach loop, so it is repeated for every entry. You only need it once, and you can print it without a loop:
Console.WriteLine("SIZE " + differences.Count);
This is, of course, meant to keep things simple if you're not used to LINQ statements. If you are comfortable with LINQ, then by all means use it, as the result is shorter and easier to read.

You can get the difference by key with the ExceptBy extension method (available in .NET 6 and later; the built-in Except has no overload that takes a key selector):
var originalDic = original.ToDictionary(arr => arr.First());
var webDic = web.ToDictionary(arr => arr.First());
var differences =
originalDic
.ExceptBy(webDic.Keys, kvp => kvp.Key)
.Select(kvp => kvp.Value)
.ToList();
The trick here is to first convert your original and web lists into dictionaries, using the first element of each array as the key, and then perform the set difference on the keys. Note that ToDictionary throws if two arrays share the same first element.

Related

replace an element in a specific array with another value in C#

I have the following array:
var Array = [["Dog","0","A","eat"],["cat","1","B","eat"]]
I want to replace the values at some indexes in this array with other values.
For example, the result should look like this:
var newArray = [["Dog","big","house","eat"],["cat","small","forest","eat"]]
As the example shows, "0" becomes "big", "1" becomes "small", "A" becomes "house" and "B" becomes "forest".
How can I solve this both with a for loop and with LINQ in C#?
I'm not sure it qualifies as elegant, but what you're describing is a translation, and a Dictionary is well suited to that.
Loop through each string in each array and replace if the translation dictionary contains a key equal to the string.
var Array = new string[][] {
new string[] {"Dog", "0", "A", "eat" },
new string[] {"Cat", "1", "B", "eat" }
};
//Array: [["Dog","0","A","eat"],["Cat","1","B","eat"]]
var TranslationDict = new Dictionary<string, string>() {
{ "0", "big" },
{ "1", "small" },
{ "A", "house" },
{ "B", "forest" },
};
for (int y = 0; y < Array.Length; y++) {
for (int x = 0; x < Array[y].Length; x++) {
if(TranslationDict.ContainsKey(Array[y][x])) {
Array[y][x] = TranslationDict[Array[y][x]];
}
}
}
//Array: [["Dog","big","house","eat"],["Cat","small","forest","eat"]]
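You can do it with LINQ like this. Note that string.Replace substitutes every occurrence of the substring, so this only works as long as the keys never appear inside other values:
var testArray = Array
.Select(x => x
.Select(y => y.Replace("0", "big").Replace("1", "small").Replace("A", "house").Replace("B", "forest")).ToArray())
.ToArray();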
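The dictionary lookup and the LINQ projection can also be combined, which avoids the substring pitfall of string.Replace because only whole elements are matched. This is a sketch, not either answerer's code; the names TranslateDemo and Translate are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TranslateDemo
{
    // Hypothetical helper: returns a new jagged array in which every element
    // that exactly matches a dictionary key is replaced by its translation.
    public static string[][] Translate(string[][] rows, IReadOnlyDictionary<string, string> map)
    {
        return rows
            .Select(row => row
                .Select(cell => map.TryGetValue(cell, out var replacement) ? replacement : cell)
                .ToArray())
            .ToArray();
    }

    public static void Main()
    {
        var rows = new[]
        {
            new[] { "Dog", "0", "A", "eat" },
            new[] { "cat", "1", "B", "eat" }
        };
        var map = new Dictionary<string, string>
        {
            { "0", "big" }, { "1", "small" }, { "A", "house" }, { "B", "forest" }
        };

        foreach (var row in Translate(rows, map))
        {
            Console.WriteLine(string.Join(",", row));
        }
        // Dog,big,house,eat
        // cat,small,forest,eat
    }
}
```

Unlike the for loop version, this leaves the input array untouched and returns a new one.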

Comparing a string inside an object inside a list [duplicate]

This question already has an answer here:
C# List<object>.RemoveAll() - How to remove a subset of the list?
(1 answer)
Closed 1 year ago.
I have two lists of objects, and each object contains a name/user property. I want to compare these strings to make sure that no user from my calendar permissions list is also in my users list. I'm currently trying this with two nested for loops, but it seems to remove everything except one duplicate.
calendarPermissions = new ObservableCollection<CalendarPermissions>(await parse.GetCalendarPermissionsAsync(user.Email));
users = new ObservableCollection<UserList>(await parse.GetUserListAsync());
for (int x = 0; x < users.Count; x++)
{
for (int y = 0; y < calendarPermissions.Count; y++)
{
if (calendarPermissions[y].User == users[x].Navn)
{
Debug.WriteLine($"{calendarPermissions[y].User} {users[x].Navn}");
users.Remove(users[x]);
}
}
}
I am not entirely sure what you are asking?
I would propose a solution like this to simplify your code:
var calenderPerms = new List<CalenderPerms>
{
new CalenderPerms { User = "A" },
new CalenderPerms { User = "B" },
new CalenderPerms { User = "C" },
new CalenderPerms { User = "D" },
};
var users = new List<User>
{
new User { Navn = "A" },
new User { Navn = "B" },
new User { Navn = "C" },
new User { Navn = "F" },
};
// HashSet for faster .Contains query
var calenderPermsUsers = calenderPerms.Select(c => c.User).ToHashSet();
users.RemoveAll(u => calenderPermsUsers.Contains(u.Navn));
This removes every user that also appears in the calendar permissions (A, B and C, i.e. the intersection of the two lists) and leaves only user F in the users list, which is what your own code appears to be aiming for.
Are you trying to obtain a different result?
var calendarPermissionUsers = calendarPermissions.Select(x => x.User).ToArray();
users.RemoveAll(user => calendarPermissionUsers.Contains(user.Navn));
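One detail worth noting: RemoveAll is a method of List<T>, while the collections in the question are ObservableCollection<T>, which does not have it. The original loop also misbehaves because removing users[x] shifts the next element into position x, and that element is then only compared against the remaining permissions before x moves on. A sketch of removing in place by walking backwards follows; it uses plain strings instead of the question's classes, and the names RemoveDemo and RemoveMatching are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class RemoveDemo
{
    // Hypothetical helper: removes every name that also appears in 'blocked'.
    public static void RemoveMatching(ObservableCollection<string> names, IEnumerable<string> blocked)
    {
        var blockedSet = new HashSet<string>(blocked);
        // Walk backwards so a removal never shifts an element we have not visited yet.
        for (int i = names.Count - 1; i >= 0; i--)
        {
            if (blockedSet.Contains(names[i]))
            {
                names.RemoveAt(i);
            }
        }
    }

    public static void Main()
    {
        var users = new ObservableCollection<string> { "A", "B", "C", "F" };
        RemoveMatching(users, new[] { "A", "B", "C", "D" });
        Console.WriteLine(string.Join(",", users)); // F
    }
}
```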

find the two longest word made of other words

I want to find the two longest words in an array that are made up of smaller words from the same array. My code is given below.
The current output is:
catxdogcatsrat, ratcatdogcat, catsdogcats, dogcatsdog
The required output is:
ratcatdogcat, catsdogcats
class program
{
public static void Main(String[] args)
{
List<string> list2 = new List<string>();
string[] stringrray = { "cat", "cats", "catsdogcats", "catxdogcatsrat", "dog", "dogcatsdog",
"hippopotamuses", "rat", "ratcatdogcat" };
list2.Add(stringrray[0]);
list2.Add(stringrray[1]);
list2.Add(stringrray[2]);
list2.Add(stringrray[3]);
list2.Add(stringrray[4]);
list2.Add(stringrray[5]);
list2.Add(stringrray[6]);
list2.Add(stringrray[7]);
list2.Add(stringrray[8]);
List<string> list = new List<string>();
var mod = list2.OrderByDescending(x => x.Length).ToList();
int j = 1;
for (int k = 0; k < mod.Count; k++)
{
for (int i = 0; i < mod.Count-j; i++)
{
if (mod[i].Contains(mod[mod.Count - j]))
{
j++;
list.Add(mod[i]);
}
}
}
var mod1 = list.OrderByDescending(x => x.Length);
foreach (var i in mod1)
{
Console.WriteLine(i);
}
Console.ReadLine();
}
}
I think you are looking for something like this
string[] stringrray = { "cat", "cats", "catsdogcats", "catxdogcatsrat", "dog", "dogcatsdog", "hippopotamuses", "rat", "ratcatdogcat" };
List<string> list2 = new List<string>(stringrray);
List<string> Finallist = new List<string>();
char[] smallstrchar = String.Join("", list2.Where(x => x.Length <= 4)).ToCharArray();
char[] bigstrchar = String.Join("", list2.Where(x => x.Length > 4)).ToCharArray();
char[] modchar = bigstrchar.Except(smallstrchar).ToArray();
foreach(string bigstr in list2)
{
if(bigstr.IndexOfAny(modchar) == -1)
{
Finallist.Add(bigstr);
}
}
Finallist = Finallist.OrderByDescending(x => x.Length).Take(2).ToList();
foreach(string finalstr in Finallist)
{
Console.WriteLine(finalstr);
}
First, stringrray contains all the candidate strings. Your code also accepts strings that contain an x as long as some smaller word matches, which is why catxdogcatsrat shows up. So I copy all the values into list2 and split them into two groups: smallstrchar holds all the characters of the short strings (length 4 or less) and bigstrchar holds all the characters of the longer strings (length greater than 4). Except then yields the characters that occur in bigstrchar but not in smallstrchar. These are the characters that cannot come from any of the small words.
Finally, IndexOfAny checks whether a string contains any of those characters. If it does not, the string is added to Finallist, and at the end we take the two longest entries. Note that this only checks characters, not whole words, so it is a heuristic that works for this input rather than a general solution.
Hope this helps.
You could simplify adding the array to list2 with
list2.AddRange(stringrray);
You may use this code...
static void Main(string[] args)
{
List<string> words = new List<string>() { "cat", "cats", "catsdogcats", "catxdogcatsrat", "dog", "dogcatsdog", "hippopotamuses", "rat", "ratcatdogcat" };
List<string> result = new List<string>();
// solution 1
foreach (string word in words)
{
if (IsCombinationOf(word, words))
{
result.Add(word);
}
}
// solution 2
result = words.Where(x => IsCombinationOf(x, words)).ToList();
}
public static bool IsCombinationOf(string word, List<string> parts)
{
// removing the actual word just to be secure.
parts = parts.Where(x => x != word).OrderByDescending(x => x.Length).ToList();
// erase each part in word. Only those which are not in the list will remain.
foreach (string part in parts)
{
// plain string replacement is enough here, because the parts are literal text and not patterns
word = word.Replace(part, "");
}
// if there are any characters left, it hasn't been a combination
return word.Length == 0;
}
but...
This code has a flaw. The OrderByDescending clause ensures that cats is removed before cat; otherwise the s would remain and the code wouldn't work as expected. With some made-up values, however, this greedy approach does not work properly. For example:
List<string> words = new List<string>() { "abz", "e", "zefg", "f", "g", "abzefg" };
Let's have a look at abzefg. The algorithm will remove zefg first, leaving ab, and then it's not possible to go any further, even though the word is a combination of abz, e, f and g.
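One way around the greedy problem is the classic "word break" dynamic programming check, which tries every split position instead of erasing parts in a fixed order. The following is a sketch under the assumption that a word counts only if it splits into two or more other entries of the same list; CompoundWords, IsCompound and LongestCompounds are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CompoundWords
{
    // True if 'word' can be split entirely into two or more words from 'dictionary'.
    public static bool IsCompound(string word, HashSet<string> dictionary)
    {
        // reachable[i] is true when the first i characters can be written
        // as a sequence of dictionary words.
        var reachable = new bool[word.Length + 1];
        reachable[0] = true;
        for (int end = 1; end <= word.Length; end++)
        {
            for (int start = 0; start < end; start++)
            {
                // Skip the split covering the whole word: that is the word itself.
                if (start == 0 && end == word.Length) continue;
                if (reachable[start] && dictionary.Contains(word.Substring(start, end - start)))
                {
                    reachable[end] = true;
                    break;
                }
            }
        }
        return word.Length > 0 && reachable[word.Length];
    }

    // Returns up to 'count' compound words, longest first.
    public static List<string> LongestCompounds(IEnumerable<string> words, int count)
    {
        var dictionary = new HashSet<string>(words);
        return dictionary.Where(w => IsCompound(w, dictionary))
                         .OrderByDescending(w => w.Length)
                         .Take(count)
                         .ToList();
    }

    public static void Main()
    {
        string[] words = { "cat", "cats", "catsdogcats", "catxdogcatsrat", "dog",
                           "dogcatsdog", "hippopotamuses", "rat", "ratcatdogcat" };
        Console.WriteLine(string.Join(", ", LongestCompounds(words, 2)));
        // ratcatdogcat, catsdogcats

        string[] tricky = { "abz", "e", "zefg", "f", "g", "abzefg" };
        Console.WriteLine(string.Join(", ", LongestCompounds(tricky, 2)));
        // abzefg
    }
}
```

The check is quadratic in the word length, which is fine for short word lists like the one in the question.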

For-Loop and LINQ's deferred execution don't play well together

The title suggests that I already have an idea of what's going on, but I cannot explain it. I tried to order a List<string[]> dynamically by each "column", beginning with the first and ending at the minimum Length of all arrays.
So in this sample it is 2, because the last string[] has only two elements:
List<string[]> someValues = new List<string[]>();
someValues.Add(new[] { "c", "3", "b" });
someValues.Add(new[] { "a", "1", "d" });
someValues.Add(new[] { "d", "4", "a" });
someValues.Add(new[] { "b", "2" });
Now I want to order everything by the first and second column. I could do it statically in this way:
someValues = someValues
.OrderBy(t => t[0])
.ThenBy(t => t[1])
.ToList();
But if I don't know the number of "columns", I thought I could use this loop:
int minDim = someValues.Min(t => t.GetLength(0)); // 2
IOrderedEnumerable<string[]> orderedValues = someValues.OrderBy(t => t[0]);
for (int i = 1; i < minDim; i++)
{
orderedValues = orderedValues.ThenBy(t => t[i]);
}
someValues = orderedValues.ToList(); // IndexOutOfRangeException
But that doesn't work; it fails with an IndexOutOfRangeException at the last line. The debugger tells me that i is 2 at that time, so the for loop condition seems to be ignored: i is already equal to minDim.
Why is that so? What is the correct way for this?
It's the same problem as lots of people had with foreach loops pre C# 5.
orderedValues = orderedValues.ThenBy(t => t[i]);
The value of i will not be evaluated until you call .ToList(), at which point it is 2, because the for loop only exits once i has reached minDim.
You can introduce a new local variable inside the for-loop to fix it:
for (int i = 1; i < minDim; i++)
{
var tmp = i;
orderedValues = orderedValues.ThenBy(t => t[tmp]);
}
For more information you could take a look at Eric Lippert's blog post about Closing over the loop variable considered harmful.
This happens because the lambda captures the variable i itself rather than the value it had in that iteration. Because of deferred execution, t[i] is only evaluated after the loop has exited, when i is 2.
One solution is to copy i into a local variable declared inside the loop and capture that instead:
int minDim = someValues.Min(t => t.GetLength(0)); // 2
IOrderedEnumerable<string[]> orderedValues = someValues.OrderBy(t => t[0]);
for (int i = 1; i < minDim; i++)
{
var x = i;
orderedValues = orderedValues.ThenBy(t => t[x]);
}
someValues = orderedValues.ToList();
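The capture behaviour can be seen in isolation, without any ordering involved. This is a small illustrative sketch; ClosureDemo and its two methods are names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // Captures the loop variable itself: every delegate sees its final value.
    public static List<int> CaptureLoopVariable(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            actions.Add(() => i);
        }
        // The delegates run only now, after the loop has finished.
        return actions.Select(f => f()).ToList();
    }

    // Captures a fresh copy per iteration: every delegate keeps its own value.
    public static List<int> CaptureCopy(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i;
            actions.Add(() => copy);
        }
        return actions.Select(f => f()).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", CaptureLoopVariable(3))); // 3,3,3
        Console.WriteLine(string.Join(",", CaptureCopy(3)));         // 0,1,2
    }
}
```

The C# 5 change mentioned above applies to foreach only; a for loop still has a single variable shared by all iterations.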

Tricky algorithm... finding multiple combinations of subsets within nested HashSets?

I have a problem where I have to find multiple combinations of subsets within nested hashsets. Basically I have a "master" nested HashSet, and from a collection of "possible" nested HashSets I have to programmatically find the "possibles" that could be simultaneous subsets of the "master".
Let's say I have the following:
var master = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "A", "B", "C"}),
new HashSet<string>( new string[] { "D", "E"}),
new HashSet<string>( new string[] { "F"})
}
);
var possible1 = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "A", "B", "C"}),
new HashSet<string>( new string[] { "F"})
}
);
var possible2 = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "D", "E"})
}
);
var possible3 = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "F"})
}
);
var possible4 = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "X", "Y", "Z"})
}
);
var possible5 = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "A", "B" }),
new HashSet<string>( new string[] { "D", "E"})
}
);
The output I should get from my algorithm should be as follows:
All possible combination subsets:
possible1 and possible2
possible3 and possible5
possible2 and possible3
possible1
possible2
possible3
possible5
I'm trying to figure out the best way to approach this. There is, of course, the brute force option, but I'm trying to avoid that if I can.
I just hope my question was clear enough.
EDIT
To further elaborate on what constitutes a subset, here are some examples, given the master {{"A","B","C"},{"C","D","E","F"},{"X","Y","Z"}} :
{{"A","B"},{"C","D"}} would be a subset
{{"A","B","C"},{"X","Y"}} would be a subset
{{"A","B"},{"A","B"}} would NOT be a subset
{{"A","B","C","D"}} would NOT be a subset
{{"A","B","C"},{"C","D","X"}} would NOT be a subset
Basically each child set needs to be a subset of a corresponding child in the master.
Use bruteforce:
public static int IsCsInMaster(HashSet<string> childSubset, List<HashSet<string>> master, int startIndex)
{
for (int i = startIndex; i < master.Count; i++)
if (childSubset.IsSubsetOf(master[i])) return i;
return -1;
}
public static bool IsChildInMaster(List<HashSet<string>> child, List<HashSet<string>> master)
{
foreach (var childSubset in child) if (IsCsInMaster(childSubset, master, 0) == -1) return false;
return true;
}
public static bool IsChildInMasterMulti(List<HashSet<string>> child, List<HashSet<string>> master)
{
Dictionary<int, int> subsetChecker = new Dictionary<int, int>();
List<IEnumerable<int>> multiMatches = new List<IEnumerable<int>>();
int subsetIndex;
// Check for matching subsets.
for (int i = 0; i < child.Count; i++)
{
subsetIndex = 0;
List<int> indexes = new List<int>();
while ((subsetIndex = IsCsInMaster(child[i], master, subsetIndex)) != -1)
{
indexes.Add(subsetIndex++);
}
if (indexes.Count == 0)
{
// This child set fits into no master set at all.
return false;
}
if (indexes.Count == 1)
{
subsetIndex = indexes[0];
if (subsetChecker.ContainsKey(subsetIndex)) return false;
else subsetChecker[subsetIndex] = subsetIndex;
}
else
{
multiMatches.Add(indexes);
}
}
// Nothing left to resolve; Aggregate would throw on an empty sequence.
if (multiMatches.Count == 0) return true;
/*** Check for multi-matching subsets. ***/ //got lazy ;)
var union = multiMatches.Aggregate((aggr, indexes) => aggr.Union(indexes));
// Filter the union so only unmatched subset indexes remain.
List<int> filteredUnion = new List<int>();
foreach (int index in union)
{
if (!subsetChecker.ContainsKey(index)) filteredUnion.Add(index);
}
return (filteredUnion.Count >= multiMatches.Count);
}
And to call it (the methods take lists, so convert the hash sets first):
IsChildInMasterMulti(possible2.ToList(), master.ToList())
The code does not handle the {{"A","B"},{"A","B"}} case, though. That is a LOT more difficult (flagging used subsets in master, maybe even individual elements - recursively).
Edit2: The third method handles the {{"A","B"},{"A","B"}} case as well (and more).
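For comparison, here is a sketch of an exact check based on backtracking: each child set must be paired with a distinct master set that contains it, which rules out the {{"A","B"},{"A","B"}} case by construction. It is not the code from the answer above; the names NestedSubset, IsNestedSubset, Assign and Set are assumptions for this example, and it takes lists rather than the nested HashSet used in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NestedSubset
{
    // True if every set in 'child' can be paired with a distinct set in 'master' that contains it.
    public static bool IsNestedSubset(IReadOnlyList<HashSet<string>> child,
                                      IReadOnlyList<HashSet<string>> master)
    {
        return Assign(child, master, 0, new bool[master.Count]);
    }

    private static bool Assign(IReadOnlyList<HashSet<string>> child,
                               IReadOnlyList<HashSet<string>> master, int next, bool[] used)
    {
        if (next == child.Count) return true; // every child set has a partner
        for (int m = 0; m < master.Count; m++)
        {
            if (used[m] || !child[next].IsSubsetOf(master[m])) continue;
            used[m] = true;
            if (Assign(child, master, next + 1, used)) return true;
            used[m] = false; // backtrack and try another master set
        }
        return false;
    }

    // Small convenience for building sets in the examples.
    public static HashSet<string> Set(params string[] items) => new HashSet<string>(items);

    public static void Main()
    {
        var master = new List<HashSet<string>> { Set("A", "B", "C"), Set("D", "E"), Set("F") };
        var possible1 = new List<HashSet<string>> { Set("A", "B", "C"), Set("F") };
        var possible3 = new List<HashSet<string>> { Set("F") };
        var possible5 = new List<HashSet<string>> { Set("A", "B"), Set("D", "E") };

        // Two candidates fit simultaneously when their combined child sets can all be assigned.
        Console.WriteLine(IsNestedSubset(possible3.Concat(possible5).ToList(), master)); // True
        Console.WriteLine(IsNestedSubset(possible1.Concat(possible3).ToList(), master)); // False
    }
}
```

Checking a combination of candidates then amounts to concatenating their child sets and running the same test. The search is exponential in the worst case, which should be acceptable for master sets of this size.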
Use the simplest solution possible.
Keep in mind that if someone else has to look at your code they should be able to understand what it's doing with as little effort as possible. I already found it hard to understand from your description what you want to do and I haven't had to read code yet.
If it turns out to be too slow once it works, optimize it then.
If possible, write unit tests. Unit tests will ensure that your optimized solution still works correctly and will help others make sure their changes don't break anything.
