Tricky algorithm... finding multiple combinations of subsets within nested HashSets?

I have a problem where I have to find multiple combinations of subsets within nested hashsets. Basically I have a "master" nested HashSet, and from a collection of "possible" nested HashSets I have to programmatically find the "possibles" that could be simultaneous subsets of the "master".
Let's say I have the following:
var master = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "A", "B", "C"}),
new HashSet<string>( new string[] { "D", "E"}),
new HashSet<string>( new string[] { "F"})
}
);
var possible1 = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "A", "B", "C"}),
new HashSet<string>( new string[] { "F"})
}
);
var possible2 = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "D", "E"})
}
);
var possible3 = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "F"})
}
);
var possible4 = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "X", "Y", "Z"})
}
);
var possible5 = new HashSet<HashSet<string>>(new HashSet<string>[] {
new HashSet<string>( new string[] { "A", "B" }),
new HashSet<string>( new string[] { "D", "E"})
}
);
The output I should get from my algorithm should be as follows:
All possible combination subsets:
possible1 and possible2
possible3 and possible5
possible2 and possible3
possible1
possible2
possible3
possible5
I'm trying to figure out the best way to approach this. There is, of course, the brute force option, but I'm trying to avoid that if I can.
I just hope my question was clear enough.
EDIT
To further elaborate on what constitutes a subset, here are some examples, given the master {{"A","B","C"},{"C","D","E","F"},{"X","Y","Z"}} :
{{"A","B"},{"C","D"}} would be a subset
{{"A","B","C"},{"X","Y"}} would be a subset
{{"A","B"},{"A","B"}} would NOT be a subset
{{"A","B","C","D"}} would NOT be a subset
{{"A","B","C"},{"C","D","X"}} would NOT be a subset
Basically each child set needs to be a subset of a corresponding child in the master.
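The rule above can be sketched as a small backtracking match. This is only an illustration under one assumption: "corresponding" means each child set must be paired with its own, distinct set in the master. The names NestedSubset, IsNestedSubset and Assign are made up for this sketch, and lists are used instead of HashSet<HashSet<string>> so that two equal child sets such as {{"A","B"},{"A","B"}} can be represented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NestedSubset
{
    // True when every set in 'child' can be paired with a distinct superset in 'master'.
    public static bool IsNestedSubset(IReadOnlyList<HashSet<string>> child,
                                      IReadOnlyList<HashSet<string>> master)
    {
        return Assign(child, master, 0, new bool[master.Count]);
    }

    // Tries to place child[next], child[next + 1], ... into master sets not yet used.
    private static bool Assign(IReadOnlyList<HashSet<string>> child,
                               IReadOnlyList<HashSet<string>> master,
                               int next, bool[] used)
    {
        if (next == child.Count) return true;          // every child set has been placed
        for (int i = 0; i < master.Count; i++)
        {
            if (used[i] || !child[next].IsSubsetOf(master[i])) continue;
            used[i] = true;                            // tentatively claim master[i]
            if (Assign(child, master, next + 1, used)) return true;
            used[i] = false;                           // undo and try another master set
        }
        return false;
    }
}
```

Backtracking matters when a child set fits more than one master set: a greedy first-fit choice could claim a master set that a later child set needs.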

Use bruteforce:
public static int IsCsInMaster(HashSet<string> childSubset, List<HashSet<string>> master, int startIndex)
{
for (int i = startIndex; i < master.Count; i++)
if (childSubset.IsSubsetOf(master[i])) return i;
return -1;
}
public static bool IsChildInMaster(List<HashSet<string>> child, List<HashSet<string>> master)
{
foreach (var childSubset in child) if (IsCsInMaster(childSubset, master, 0) == -1) return false;
return true;
}
public static bool IsChildInMasterMulti(List<HashSet<string>> child, List<HashSet<string>> master)
{
Dictionary<int, int> subsetChecker = new Dictionary<int, int>();
List<IEnumerable<int>> multiMatches = new List<IEnumerable<int>>();
int subsetIndex;
// Check for matching subsets.
for (int i = 0; i < child.Count; i++)
{
subsetIndex = 0;
List<int> indexes = new List<int>();
while ((subsetIndex = IsCsInMaster(child[i], master, subsetIndex)) != -1)
{
indexes.Add(subsetIndex++);
}
if (indexes.Count == 1)
{
subsetIndex = indexes[0];
if (subsetChecker.ContainsKey(subsetIndex)) return false;
else subsetChecker[subsetIndex] = subsetIndex;
}
else
{
multiMatches.Add(indexes);
}
}
/*** Check for multi-matching subsets. ***/ //got lazy ;)
// Seed the aggregate so that an empty multiMatches list does not throw.
var union = multiMatches.Aggregate(Enumerable.Empty<int>(), (aggr, indexes) => aggr.Union(indexes));
// Filter the union so only unmatched subset indexes remain.
List<int> filteredUnion = new List<int>();
foreach (int index in union)
{
if (!subsetChecker.ContainsKey(index)) filteredUnion.Add(index);
}
return (filteredUnion.Count >= multiMatches.Count);
}
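And in code (the methods take lists, so convert the nested HashSets first; ToList needs using System.Linq):
IsChildInMasterMulti(possible2.ToList(), master.ToList())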
The code does not handle the {{"A","B"},{"A","B"}} case, though. That is a LOT more difficult (flagging used subsets in master, maybe even individual elements - recursively).
Edit2: The third method handles the {{"A","B"},{"A","B"}} case as well (and more).
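For the combinations the question asks for, here is a rough sketch rather than a definitive solution. It assumes that "simultaneous" means the child sets of all chosen possibles, pooled together, must each be matched to a distinct set in the master. PossibleCombos, Fits, FindAll and Extend are hypothetical names. Because adding more sets can never repair a failed fit, a group that fails is never extended, which avoids testing every one of the 2^n groups.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PossibleCombos
{
    // True when children[next..] can each be paired with a distinct, unused superset in master.
    private static bool Fits(List<HashSet<string>> children,
                             IReadOnlyList<HashSet<string>> master, int next, bool[] used)
    {
        if (next == children.Count) return true;
        for (int i = 0; i < master.Count; i++)
        {
            if (used[i] || !children[next].IsSubsetOf(master[i])) continue;
            used[i] = true;
            bool ok = Fits(children, master, next + 1, used);
            used[i] = false;
            if (ok) return true;
        }
        return false;
    }

    // Returns the names of every non-empty group of possibles that fits the master at once.
    public static List<List<string>> FindAll(
        IReadOnlyList<(string Name, List<HashSet<string>> Sets)> possibles,
        IReadOnlyList<HashSet<string>> master)
    {
        var results = new List<List<string>>();

        void Extend(int start, List<string> names, List<HashSet<string>> pooled)
        {
            for (int i = start; i < possibles.Count; i++)
            {
                var candidate = pooled.Concat(possibles[i].Sets).ToList();
                // Prune: a group that does not fit cannot fit after adding more sets.
                if (!Fits(candidate, master, 0, new bool[master.Count])) continue;
                var chosen = new List<string>(names) { possibles[i].Name };
                results.Add(chosen);
                Extend(i + 1, chosen, candidate);
            }
        }

        Extend(0, new List<string>(), new List<HashSet<string>>());
        return results;
    }
}
```

With the master and the five possibles from the question, this yields the seven groups listed there, though in a different order.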

Use the simplest solution possible.
Keep in mind that if someone else has to look at your code they should be able to understand what it's doing with as little effort as possible. I already found it hard to understand from your description what you want to do and I haven't had to read code yet.
If you find that it's too slow once it works, optimize it then.
If possible write unit tests. Unit tests will ensure that your optimized solution is also working correctly and will help others ensure their changes don't break anything.

Related

Generate all possible coverage options

Suppose I have two lists of different lengths: one containing strings, one containing integers. The application I am building will use these lists to generate combinations of vehicles and coverage areas. Strings represent area names and ints represent vehicle IDs.
My goal is to generate a list of all possible unique combinations used for further investigation. One vehicle can service many areas, but one area can't be served by multiple vehicles. Every area must receive service, and every vehicle must be used.
So to conclude the constraints:
Every area is used only once
Every vehicle is used at least once
No area can be left out.
No vehicle can be left out
Here is an example:
public class record {
public string areaId { get; set; }
public int vehicleId { get; set; }
}
List<string> areas = new List<string>{ "A","B","C","D"};
List<int> vehicles = new List<int>{ 1,2};
List<List<record>> uniqueCombinationLists = retrieveUniqueCombinations(areas,vehicles);
I just have no clue how to make the retrieveUniqueCombinations function. Maybe I am just looking wrong or thinking too hard. I am stuck thinking about massive loops and other brute force approaches. An explanation of a better approach would be much appreciated.
The results should resemble something like this, I think this contains all possibilities for this example.
A1;B1;C1;D2
A1;B1;C2;D1
A1;B2;C1;D1
A2;B1;C1;D1
A2;B2;C2;D1
A2;B2;C1;D2
A2;B1;C2;D2
A1;B2;C2;D2
A2;B1;C1;D2
A1;B2;C2;D1
A2;B2;C1;D1
A1;B1;C2;D2
A2;B1;C2;D1
A1;B2;C1;D2

Here's something I threw together that may or may not work. Borrowing heavily from dtb's work on this answer.
Basically, I generate them all, then remove the ones that don't meet the requirements.
List<string> areas = new List<string> { "A", "B", "C", "D" };
List<int> vehicles = new List<int> { 1, 2 };
var result = retrieveUniqueCombinations(areas, vehicles);
result.ToList().ForEach((recordList) => {
recordList.ToList().ForEach((record) =>
Console.Write("{0}{1};", record.areaId, record.vehicleId));
Console.WriteLine();
});
public IEnumerable<IEnumerable<record>> retrieveUniqueCombinations(IEnumerable<string> areas, IEnumerable<int> vehicles)
{
var items = from a in areas
from v in vehicles
select new record { areaId = a, vehicleId = v };
var result = items.GroupBy(i => i.areaId).CartesianProduct().ToList();
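// Drop combinations that do not use every vehicle (also correct for more than two vehicles).
var vehicleCount = vehicles.Distinct().Count();
result.RemoveAll((records) =>
records.Select(record => record.vehicleId).Distinct().Count() != vehicleCount);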
return result;
}
public class record
{
public string areaId { get; set; }
public int vehicleId { get; set; }
}
static class Extensions
{
public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
this IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] { item }));
}
}
This produces the following:
A1;B1;C1;D2;
A1;B1;C2;D1;
A1;B1;C2;D2;
A1;B2;C1;D1;
A1;B2;C1;D2;
A1;B2;C2;D1;
A1;B2;C2;D2;
A2;B1;C1;D1;
A2;B1;C1;D2;
A2;B1;C2;D1;
A2;B1;C2;D2;
A2;B2;C1;D1;
A2;B2;C1;D2;
A2;B2;C2;D1;
Note that these are not in the same order as yours, but I'll leave the verification to you. Also, there's likely a better way of doing this (for instance, by putting the logic in the RemoveAll step in the CartesianProduct function), but hey, you get what you pay for ;).

So let's use some helper extension methods (they must live in a static class) to convert numbers to IEnumerable<int> digit sequences in different bases. It may be more efficient to use List<> but since we are trying to use LINQ:
public static IEnumerable<int> LeadingZeros(this IEnumerable<int> digits, int minLength) {
var dc = digits.Count();
if (dc < minLength) {
for (int j1 = 0; j1 < minLength - dc; ++j1)
yield return 0;
}
foreach (var j2 in digits)
yield return j2;
}
public static IEnumerable<int> ToBase(this int num, int numBase) {
IEnumerable<int> ToBaseRev(int n, int nb) {
do {
yield return n % nb;
n /= nb;
} while (n > 0);
}
foreach (var n in ToBaseRev(num, numBase).Reverse())
yield return n;
}
Now we can create an enumeration that lists all the possible answers (and a few extras). I converted the Lists to Arrays for indexing efficiency.
var areas = new List<string> { "A", "B", "C", "D" };
var vehicles = new List<int> { 1, 2 };
var areasArray = areas.ToArray();
var vehiclesArray = vehicles.ToArray();
var numVehicles = vehiclesArray.Length;
var numAreas = areasArray.Length;
var NumberOfCombos = Convert.ToInt32(Math.Pow(numVehicles, numAreas));
var ansMap = Enumerable.Range(0, NumberOfCombos).Select(n => new { n, nd = n.ToBase(numVehicles).LeadingZeros(numAreas)});
Given the enumeration of the possible combinations, we can convert into areas and vehicles and exclude the ones that don't use all vehicles.
var ans = ansMap.Select(nnd => nnd.nd).Select(m => m.Select((d, i) => new { a = areasArray[i], v = vehiclesArray[d] })).Where(avc => avc.Select(av => av.v).Distinct().Count() == numVehicles);
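Both answers generate every combination and filter afterwards. As an alternative, here is a sketch of a recursive generator that prunes while it builds: it abandons a branch as soon as there are fewer areas left than vehicles still unused. Coverage, Assignments and Fill are hypothetical names, the vehicle IDs are assumed to be distinct, and the tuple syntax needs C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Coverage
{
    // Every way to give each area exactly one vehicle such that all vehicles are used.
    public static List<List<(string Area, int Vehicle)>> Assignments(
        IReadOnlyList<string> areas, IReadOnlyList<int> vehicles)
    {
        var results = new List<List<(string Area, int Vehicle)>>();
        var current = new int[areas.Count];        // vehicle index chosen for each area
        var useCount = new int[vehicles.Count];    // how often each vehicle is used so far

        void Fill(int areaIndex, int unused)
        {
            int remaining = areas.Count - areaIndex;
            if (unused > remaining) return;        // too few areas left to use every vehicle
            if (remaining == 0)
            {
                results.Add(areas.Select((a, i) => (Area: a, Vehicle: vehicles[current[i]])).ToList());
                return;
            }
            for (int v = 0; v < vehicles.Count; v++)
            {
                current[areaIndex] = v;
                bool firstUse = useCount[v]++ == 0;
                Fill(areaIndex + 1, firstUse ? unused - 1 : unused);
                useCount[v]--;                     // undo before trying the next vehicle
            }
        }

        Fill(0, vehicles.Count);
        return results;
    }
}
```

For the example in the question (areas A to D, vehicles 1 and 2) this returns 14 assignments, which is 2^4 minus the two assignments that use only one vehicle.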

Dependency Graph using Dictionaries and Lists

I'm in the middle of working on a dependency graph, and I'm having trouble with properly adding my dependents and dependees.
I have it set up like:
private List<Tuple<string, string>> DG;
private Dictionary<string, List<string>> dependants;
private Dictionary<string, List<string>> dependees;
And I'm trying to add to my dictionaries like:
for (int i = 0; i < DG.Count; i++)
{
dependants.Add(DG[i].Item1, new List<string>().Add(DG[i].Item2);
}
It gives me the error "Argument2: Cannot convert from void to System.Collections.Generic.List" where I try to add to a new list in the second parameter. I think I know why I'm getting errors, but I am having trouble thinking of an alternative way to correctly add into the dictionaries.
My goal is something like this:
//DG = {("a", "b"), ("a", "c"), ("b", "d"), ("d", "d")}
// dependents("a") = {"b", "c"}
// dependents("b") = {"d"}
// dependents("c") = {}
// dependents("d") = {"d"}
// dependees("a") = {}
// dependees("b") = {"a"}
// dependees("c") = {"a"}
// dependees("d") = {"b", "d"}
So ("a", "b") means that "b" is a dependent of "a" and "a" is a dependee of "b"

It's a little longer than your code, but this might be what you need:
for (int i = 0; i < DG.Count; i++)
{
if (!dependants.ContainsKey(DG[i].Item1))
{
List<string> temp = new List<string>();
temp.Add(DG[i].Item2);
dependants.Add(DG[i].Item1, temp);
}
else
dependants[DG[i].Item1].Add(DG[i].Item2);
}
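Hopefully the longer code helps you understand the flow. This only builds the dependants. Also, your original line was missing a closing parenthesis:
dependants.Add(DG[i].Item1, new List<string>().Add(DG[i].Item2);
Adding the parenthesis alone does not fix it, because List<string>.Add returns void (hence the "cannot convert from void" error). A collection initializer creates and fills the list in one expression:
dependants.Add(DG[i].Item1, new List<string> { DG[i].Item2 });
That line still throws an ArgumentException when the key already exists (for example "a" appears twice in your DG), which is why the loop above checks ContainsKey first.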
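To cover both dictionaries and the empty entries from the goal (such as dependents("c") = {}), one possible sketch is shown below. DependencyGraphBuilder and Build are hypothetical names, and returning a tuple is just a convenience that needs C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DependencyGraphBuilder
{
    // Builds both lookup tables from (dependee, dependent) pairs.
    public static (Dictionary<string, List<string>> Dependents,
                   Dictionary<string, List<string>> Dependees)
        Build(IEnumerable<Tuple<string, string>> pairs)
    {
        var dependents = new Dictionary<string, List<string>>();
        var dependees = new Dictionary<string, List<string>>();

        foreach (var pair in pairs)
        {
            // Give every node an entry in both tables, so nodes without
            // dependents or dependees map to an empty list instead of being absent.
            foreach (var node in new[] { pair.Item1, pair.Item2 })
            {
                if (!dependents.ContainsKey(node)) dependents[node] = new List<string>();
                if (!dependees.ContainsKey(node)) dependees[node] = new List<string>();
            }
            dependents[pair.Item1].Add(pair.Item2);   // Item2 depends on Item1
            dependees[pair.Item2].Add(pair.Item1);    // Item1 is a dependee of Item2
        }
        return (dependents, dependees);
    }
}
```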

Finding differences within 2 Lists of string arrays

I am looking to find the differences between two Lists of string arrays using the index 0 of the array as the primary key.
List<string[]> original = new List<string[]>();
List<string[]> web = new List<string[]>();
//define arrays for List 'original'
string[] original_a1 = new string[3]{"a","2","3"};
string[] original_a2 = new string[3]{"x","2","3"};
string[] original_a3 = new string[3]{"c","2","3"};
//define arrays for List 'web'
string[] web_a1 = new string[3]{"a","2","3"};
string[] web_a2 = new string[3]{"b","2","3"};
string[] web_a3 = new string[3]{"c","2","3"};
//populate Lists
original.Add(original_a1);
original.Add(original_a2);
original.Add(original_a3);
web.Add(web_a1);
web.Add(web_a2);
web.Add(web_a3);
My goal is to find what is in List 'original' but NOT in 'web' by using index 0 as the primary key
This is what I tried.
List<string> differences = new List<string>(); //differences go in here
string tempDiff = ""; // I use this to try and avoid duplicate entries but its not working
for(int i = 0; i < original.Count; i++){
for(int j = 0; j< web.Count; j++){
if(!(original[i][0].Equals(web[j][0]))){
tempDiff = original[i][0];
}
}
differences.Add(tempDiff);
}
OUTPUT:
foreach(string x in differences){
Console.WriteLine("SIZE " + differences.Count);
Console.WriteLine(x);
Console.ReadLine();
}
SIZE 3
SIZE 3
x
SIZE 3
x
Why is it reporting the mismatch 3 times instead of once?

Using linq you can just go:
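var differences = original.Except(web).ToList();
Reference here
This will give you the values that are in original, that don't exist in web. Note that Except compares the string[] elements by reference here, so it only works as intended on the key strings themselves (for example on original.Select(o => o[0])).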
Sorry didn't read your question properly, to answer your question:
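You have a nested for-loop. So for each value of original (3) it will loop through all values of web (3), which is 9 comparisons in total.
differences.Add(tempDiff) sits in the outer loop, so it runs once per element of original whether or not a match was found, and you always end up with 3 entries.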

I think this is what you want. I use Linq to grab the primary keys, and then I use Except to do original - web. By the way, you can use == instead of Equals with strings in C# because C# does a value comparison as opposed to a reference comparison.
List<string[]> original = new List<string[]>
{
new string[3] { "a", "2", "3" },
new string[3] { "x", "2", "3" },
new string[3] { "c", "2", "3" }
};
List<string[]> web = new List<string[]>
{
new string[3] { "a", "2", "3" },
new string[3] { "b", "2", "3" },
new string[3] { "c", "2", "3" }
};
var originalPrimaryKeys = original.Select(o => o[0]);
var webPrimaryKeys = web.Select(o => o[0]);
List<string> differences = originalPrimaryKeys.Except(webPrimaryKeys).ToList();
Console.WriteLine("The number of differences is {0}", differences.Count);
foreach (string diff in differences)
{
Console.WriteLine(diff);
}
And here it is without Linq:
var differences = new List<string>();
for (int i = 0; i < original.Count; i++)
{
bool found = false;
for (int j = 0; j < web.Count; j++)
{
if (original[i][0] == web[j][0])
{
found = true;
}
}
if (!found)
{
differences.Add(original[i][0]);
}
}

To answer your question: as stated in JanR's answer, it is a nested for loop. The inner loop over web runs once per element of original (9 comparisons in total), and differences.Add sits in the outer loop, so an entry is added three times.
If the two lists are meant to line up by position, a simpler way is this:
//Check for originals not introduced in web.
if(original.Count > web.Count)
{
for(int y = web.Count; y < original.Count; y++)
{
differences.Add(original[y][0]);
}
}
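//Compare position by position, but only while both lists have an element at that index
//(looping to original.Count would index past the end of web when original is longer)
int shared = Math.Min(original.Count, web.Count);
for(int i = 0; i < shared; i++)
{
if(!original[i][0].Equals(web[i][0]))
differences.Add(original[i][0]);
}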
Also, the statement that prints the size sits inside the foreach loop, so it is printed once per entry. You only need it once, outside the loop:
Console.WriteLine("SIZE " + differences.Count);
This is meant to keep things simple if you're not used to LINQ statements. If you are, prefer the LINQ version: Except builds a hash set internally, so it avoids the nested loop.

You can get the difference by key with the ExceptBy extension method (available from .NET 6; the built-in Except has no overload that takes a key selector, so on older frameworks you would need your own extension method or a custom IEqualityComparer):
var originalDic = original.ToDictionary(arr => arr.First());
var webDic = web.ToDictionary(arr => arr.First());
var differences =
originalDic
.ExceptBy(webDic.Keys, kvp => kvp.Key)
.Select(kvp => kvp.Value)
.ToList();
The trick here is to first convert your original and web lists into a Dictionary using the first element of each array as key and then compare on that key.
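A version that works on any framework with LINQ can be sketched with a plain HashSet of the web keys. KeyDiff and MissingFrom are hypothetical names, and index 0 is assumed to hold the key in every row.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyDiff
{
    // Rows of 'original' whose key (index 0) does not occur as a key in 'web'.
    public static List<string[]> MissingFrom(IEnumerable<string[]> original,
                                             IEnumerable<string[]> web)
    {
        // Collect the web keys once so each lookup is a hash probe, not a scan.
        var webKeys = new HashSet<string>(web.Select(row => row[0]));
        return original.Where(row => !webKeys.Contains(row[0])).ToList();
    }
}
```

Unlike the dictionary approach, this keeps whole rows and does not throw if the same key appears twice in a list.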

Compare two list of objects C#

I want to compare two lists of objects of the same type. I create a new list in my program and want to compare it with the old list, which is stored in the database. I load the old list with a stored procedure and map it into objects.
The old list:    The new list:
*Category 1*     Category 5
*Category 2*     Category 6
*Category 3*     *Category 4*
Category 4
(Items marked with asterisks are the ones to delete.)
The aim is to delete the first three categories from the old list, because they don't exist in the new list, and to delete Category 4 from the new list, because it already exists in the old list.
Is it possible to use a method like Equals(), or should I use two foreach loops to walk the lists?
Thanks for your answers and advice.

You can use LINQ's Intersect and Except:
var a = new List<string> { "a", "b", "c" };
var b = new List<string> { "c", "d", "e" };
var temp = a.Intersect(b).ToList();
b = b.Except(a).ToList();
a = temp;
Output:
a: "c"
b: "d", "e"
Note: It is probably more efficient to do this without linq
var a = new List<string> { "a", "b", "c" };
var b = new List<string> { "c", "d", "e" };
for(int i = 0; i < a.Count; i++)
if(b.Contains(a[i]))
b.Remove(a[i]);
else
a.Remove(a[i--]);
If you need to compare based on a particular value
for(int i = 0; i < a.Count; i++)
{
var obj = b.Where(item => item.Category == a[i].Category);
if(obj.Any())
b.Remove(obj.First());
else
a.Remove(a[i--]);
}

It's not the most pretty of implementations but the fastest way you can do this is:
var tempA = new HashSet<int>(inputA.Select(item => item.Id));
var tempB = new HashSet<int>(inputB.Select(item => item.Id));
var resultA = new List<Category>(inputA.Count);
var resultB = new List<Category>(inputB.Count);
foreach (var value in inputA)
if (tempB.Contains(value.Id))
resultA.Add(value);
foreach (var value in inputB)
if (!tempA.Contains(value.Id))
resultB.Add(value);
resultA.TrimExcess();
resultB.TrimExcess();
// and if needed:
inputA = resultA;
inputB = resultB;
If you need more than item.Id to identify an item, use a Tuple as the key (and make the sets HashSet<Tuple<int, string>> accordingly), such as:
inputA.Select(item => new Tuple<int, string>(item.Id, item.Title));
Another option is to override GetHashCode and Equals in your Category class, such as:
public override int GetHashCode()
{
return Id.GetHashCode();
}
public override bool Equals(object obj)
{
var typedObj = obj as Category;
if (typedObj == null)
return false;
return Title == typedObj.Title && Id == typedObj.Id && Rank == typedObj.Rank;
}

I would solve this by sorting the two list and iterating over the first and second list. I would compare the current item of the first list to the current item from the second. If a match is found I remove the match from the second list and I move to the next item in both lists, otherwise the current item of the first list is removed from it and the iteration continues in the first list.
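A sketch of that sorted walk is below. It works on copies rather than removing items in place, uses plain strings as stand-ins for the category objects, and assumes neither list contains duplicates. It also adds one case the description leaves implicit: when the current item of the second list sorts first, that item exists only in the new list, so it is kept and the second position advances. SortedReconcile and Reconcile are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedReconcile
{
    // Returns the old items that also exist in the new list, and the new items
    // that do not exist in the old list.
    public static (List<string> Old, List<string> New) Reconcile(
        IEnumerable<string> oldItems, IEnumerable<string> newItems)
    {
        var a = oldItems.OrderBy(x => x, StringComparer.Ordinal).ToList();
        var b = newItems.OrderBy(x => x, StringComparer.Ordinal).ToList();
        var keptOld = new List<string>();
        var keptNew = new List<string>();
        int i = 0, j = 0;

        while (i < a.Count && j < b.Count)
        {
            int cmp = string.CompareOrdinal(a[i], b[j]);
            if (cmp == 0) { keptOld.Add(a[i]); i++; j++; }   // in both: stays in old, dropped from new
            else if (cmp < 0) { i++; }                       // only in old: dropped
            else { keptNew.Add(b[j]); j++; }                 // only in new: kept
        }
        for (; j < b.Count; j++) keptNew.Add(b[j]);          // leftovers exist only in new
        return (keptOld, keptNew);
    }
}
```

After the two sorts, the walk itself is a single pass over both lists.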

unequal size lists to merge

I have searched without success for a situation similar to the following.
I have two lists, list A and list B.
List A is composed of 10 objects created from ClassA which contains only strings.
List B is composed of 100 objects created from ClassB which only contains decimals.
List A is the header information.
List B is the data information.
The relationship between the two lists is:
Row 1 of list A corresponds to rows 1-10 of list B.
Row 2 of list A corresponds to rows 11-20 of list B.
Row 3 of list A corresponds to rows 21-30 of list B.
etc.........
How can I combine these two lists so that when I display them on the console the user will see a header row followed immediately by the corresponding 10 data rows?
I apologize if this has been answered before.

Ok, that should work. Let me know in case I got anything wrong.
List<ClassA> listA = GetListA(); // ...
List<ClassB> listB = GetListB(); // ...
if(listB.Count % listA.Count != 0)
throw new Exception("Unable to match listA to listB");
var datasPerHeader = listB.Count / listA.Count;
for(int i = 0; i < listA.Count;i++)
{
ClassA header = listA[i];
IEnumerable<ClassB> datas = listB.Skip(datasPerHeader*i).Take(datasPerHeader);
Console.WriteLine(header.ToString());
foreach(var data in datas)
{
Console.WriteLine("\t{0}", data.ToString());
}
}

Here is some code that should fulfill your request - I am going to find a link for the partition extension as I can't find it in my code anymore:
void Main()
{
List<string> strings = Enumerable.Range(1,10).Select(x=>x.ToString()).ToList();
List<decimal> decimals = Enumerable.Range(1,100).Select(x=>(Decimal)x).ToList();
var detailsRows = decimals.Partition(10)
.Select((details, row) => new {HeaderRow = row, DetailsRows = details});
var headerRows = strings.Select((header, row) => new {HeaderRow = row, Header = header});
var final = headerRows.Join(detailsRows, x=>x.HeaderRow, x=>x.HeaderRow, (header, details) => new {Header = header.Header, Details = details.DetailsRows});
}
public static class Extensions
{
public static IEnumerable<List<T>> Partition<T>(this IEnumerable<T> source, Int32 size)
{
for (int i = 0; i < Math.Ceiling(source.Count() / (Double)size); i++)
yield return new List<T>(source.Skip(size * i).Take(size));
}
}
That Partition method is the one that does the grunt work...
And here is the link to the article - LINK
EDIT 2
Here is better code for the Main() method... Rushed to answer and forgot brain:
void Main()
{
List<string> strings = Enumerable.Range(1,10).Select(x=>x.ToString()).ToList();
List<decimal> decimals = Enumerable.Range(1,100).Select(x=>(Decimal)x).ToList();
var detailsRows = decimals.Partition(10);
var headerRows = strings; //just renamed for clarity from other code
var final = headerRows.Zip(detailsRows, (header, details) => new {Header = header, Details = details});
}
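The Partition method above calls source.Count() and Skip on every iteration, so the source is walked repeatedly. A single-pass variant could look like the sketch below. InChunksOf and ChunkExtensions are hypothetical names chosen so they do not clash with Enumerable.Chunk, which ships with .NET 6 and later and does the same job.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkExtensions
{
    // Splits 'source' into consecutive lists of at most 'size' items, reading it once.
    // Being an iterator, it runs (including the argument check) only when enumerated.
    public static IEnumerable<List<T>> InChunksOf<T>(this IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        var chunk = new List<T>(size);
        foreach (var item in source)
        {
            chunk.Add(item);
            if (chunk.Count == size)
            {
                yield return chunk;
                chunk = new List<T>(size);   // start a fresh list for the next chunk
            }
        }
        if (chunk.Count > 0) yield return chunk;   // last, shorter chunk (if any)
    }
}
```

It can stand in for Partition in the Zip call, for example headerRows.Zip(decimals.InChunksOf(10), (header, details) => new { Header = header, Details = details }).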

This should be pretty straightforward unless I'm missing something.
var grouped = ListA.Select((value, index) =>
new {
ListAItem = value,
ListBItems = ListB.Skip(index * 10).Take(10)
})
.ToList();
This returns a list of anonymous objects you can loop through.
foreach (var group in grouped)
{
Console.WriteLine("List A: {0}", group.ListAItem);
foreach (var listBItem in group.ListBItems)
{
Console.WriteLine("List B: {0}", listBItem);
}
}

The easiest way may be something like this:
var listA = new List<string>() { "A", "B", "C", ... };
var listB = new List<decimal>() { 1m, 2m, 3m, ... };
double ratio = ((double)listA.Count) / listB.Count;
var results =
from i in Enumerable.Range(0, listB.Count)
select new { A = listA[(int)Math.Truncate(i * ratio)], B = listB[i] };
Or in fluent syntax:
double ratio = ((double)listA.Count) / listB.Count;
var results = Enumerable.Range(0, listB.Count)
.Select(i => new { A = listA[(int)Math.Truncate(i * ratio)], B = listB[i] });
Of course if you know you will always have 10 items in listB for each item in listA, you can simplify this to:
var results =
from i in Enumerable.Range(0, listB.Count)
select new { A = listA[i / 10], B = listB[i] };
Or in fluent syntax:
var results = Enumerable.Range(0, listB.Count)
.Select(i => new { A = listA[i / 10], B = listB[i] });
This will return a result set like
{ { "A", 1 },
{ "A", 2 },
{ "A", 3 },
...
{ "A", 10 },
{ "B", 11 },
{ "B", 12 },
{ "B", 13 },
...
{ "B", 20 },
{ "C", 21 },
...
{ "J", 100 }
}
