Reducing complexity from permutation to combination

Let's say I have 6 image parts that form a single image when correctly arranged. Also suppose that 2 pairs of parts may be swapped in position and still form the same image. Now I want all the possible combinations, without permutations. I want to start from the first place, check which parts fit that place, and move incrementally towards the last place.
List<List<int>> possible_combinations = new List<List<int>>();
for (int i = 0; i <= 6; i++)
{
    foreach (var comb in possible_combinations)
    {
        List<int[]> best_combination = new List<int[]>();
        for (int p = 0; p < comb.Count; p++)
            best_combination.AddRange(allStrokes[comb[p]]);
        if (!comb.Contains(i))
        {
            best_combination.AddRange(allStrokes[i]);
            var fits = cs.checkFeasibility(best_combination);
            if (fits)
                comb.Add(i);
        }
    }
}
possible_combinations may already contain the ids of all partial images that fit the first place. checkFeasibility checks whether the newly combined partial improves the result so far. How can I achieve that? The code above is only a reference to aid understanding.

Related

Looping through huge c# list with inner loops

I have a huge list of strings, _wordList (a List<string>), containing about 100,000 values. The problem I'm having is that I also need multiple nested loops over it. The nested loop runs over another list, whose elements are a structure containing only variables, with about 0-100 values depending on what happens:
for (int y = 0; y < _wordList.Count; y++)
{
    string word = _wordList[y];
    for (int x = 0; x < _secondWordList.Count; x++)
    {
        if (!word.Contains(_secondWordList[x].Word) || word == _secondWordList[x].Word)
            continue;
        // do other stuff
    }
}
This is only part of the code; I won't post all of it since most of it is irrelevant, but within the second loop I have about 2 other short loops. The whole function completes in 350-600 ms. What would be the best way to optimize the loops? The word.Contains call also accounts for about 100-150 ms.

It seems that you're looking for a text search, in which case you could benefit from a project like Lucene.NET.
If the call to string.Contains() weren't there and you were looking for exact matches, I would suggest swapping the list for a HashSet, which would give you a large performance boost in your case, as below.
static void Main(string[] args)
{
    var wordList = new HashSet<string>(); // Assuming initialized
    var secondWordList = new List<X>(); // Assuming initialized
    for (var c = 0; c < secondWordList.Count; c++)
    {
        if (wordList.Contains(secondWordList[c].Word))
            continue;
        // do other stuff
    }
}
With this, you iterate over the smaller list and look up each value in the HashSet, which is O(1) on average, so it will be extremely fast.

C# Combine list of arrays with count difference

I'm working on a music application that reads LilyPond and MIDI files.
Both file types need to be converted into our own storage format.
For LilyPond you need to read repeats and their alternatives.
There can, however, be a difference in the counts.
If there are three repeats but two alternatives, the first two repeats get the first alternative and the third repeat gets the last alternative.
Because the reuse happens at the front, I don't know how to do it.
My current code looks like this, so the only part missing is combining the repeatList and the altList.
I was hoping there is a math solution for this, because flipping the arrays would be terrible for performance.
private List<Note> readRepeat(List<string> repeat, List<string> alt, int repeatCount)
{
    List<Note> noteList = new List<Note>();
    List<Note> repeatList = new List<Note>();
    List<List<Note>> altList = new List<List<Note>>();
    foreach (string line in repeat)
    {
        repeatList.AddRange(readNoteLine(line));
    }
    foreach (string line in alt)
    {
        altList.Add(readNoteLine(line));
    }
    while (repeatCount > 0)
    {
        List<Note> toAdd = repeatList.ToList(); // Clone the list, to destroy the reference
        if (altList.Count() != 0)
        {
            // logic to add the right alt
        }
        noteList.AddRange(toAdd);
        repeatCount--;
    }
    return noteList;
}
In the above code the two lists get populated with the notes.
Their functions are as follows:
RepeatList: The basic list of notes that gets played x times.
AltList: The list of alternatives to be appended to the repeat list.
Example I/O:
RepeatCount = 4
AltList.Count() = 3
Repeat 1 gets: Alt 1
Repeat 2 gets: Alt 1
Repeat 3 gets: Alt 2
Repeat 4 gets: Alt 3

From what I understand, here is one possible way to implement the logic part (basically you just have to maintain a cursor on the appropriate alternative). This goes in place of your while loop.
int currentAlt = 0;
for (int currentRepeat = 0; currentRepeat < repeatCount; currentRepeat++)
{
    noteList.AddRange(repeatList);
    if (altList.Count > 0) // guard: a repeat may have no alternatives at all
    {
        noteList.AddRange(altList[currentAlt]);
    }
    // Advance the cursor once the remaining repeats map one-to-one
    // onto the remaining alternatives.
    if (currentRepeat >= repeatCount - altList.Count)
    {
        currentAlt++;
    }
}
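The cursor above can also be written as a closed-form expression, which may be the kind of "math solution" the question was hoping for. The following is only a sketch under stated assumptions: RepeatAlternatives and AltIndexFor are hypothetical names, indices are zero-based, and the surplus leading repeats are assumed to share the first alternative, as in the example I/O.

```csharp
using System;

public static class RepeatAlternatives
{
    // Returns the zero-based alternative index for a zero-based repeat index.
    // Assumption: when there are more repeats than alternatives, the surplus
    // leading repeats all reuse alternative 0; the rest map one-to-one.
    public static int AltIndexFor(int repeatIndex, int repeatCount, int altCount)
    {
        if (altCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(altCount));
        if (repeatIndex < 0 || repeatIndex >= repeatCount)
            throw new ArgumentOutOfRangeException(nameof(repeatIndex));

        // Number of leading repeats that have to share alternative 0.
        int shared = Math.Max(0, repeatCount - altCount);
        return Math.Max(0, repeatIndex - shared);
    }
}
```

With repeatCount = 4 and altCount = 3, repeat indices 0, 1, 2, 3 map to alternative indices 0, 0, 1, 2, which corresponds to Alt 1, Alt 1, Alt 2, Alt 3 in the example I/O.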

Binary search slower, what am I doing wrong?

EDIT: It looks like this is normal behavior, so can anyone recommend a faster way to do these numerous intersections?
My problem is this. I have 8000 lists (of strings). For each list (ranging in size from 50 to 400), I compare it to every other list and perform a calculation based on the size of the intersection. So I'll do
list1(intersect)list1= number
list1(intersect)list2= number
list1(intersect)list888= number
And I do this for every list. Previously, I had HashList and my code was essentially this (I was actually searching through properties of an object, so I had to modify the code a bit, but it's basically this).
I have my two versions below, but if anyone knows anything faster, please let me know!
Loop through AllLists, getting each list, starting with list1, and then do this:
foreach (List<string> list in AllLists)
{
    if (list1_length < list_length) // just a check so I'm looping through the smaller list
    {
        foreach (string word in list1)
        {
            if (block.generator_list.Contains(word))
            {
                // simple integer count
            }
        }
    }
    // a little more code, but the same, looping through the other list if it's smaller/bigger
}
Then I made the lists into regular lists and applied Sort(), which changed my code to:
foreach (List<string> list in AllLists)
{
    if (list1_length < list_length) // just a check so I'm looping through the smaller list
    {
        for (int i = 0; i < list1_length; i++)
        {
            var test = list.BinarySearch(list1[i]);
            if (test > -1)
            {
                // simple integer count
            }
        }
    }
}
The first version takes about 6 seconds; the other takes more than 20 (I stopped it there because otherwise it would have taken more than a minute!), and this is for a smallish subset of the data.
I'm sure there's a drastic mistake somewhere, but I can't find it.

Well, I have tried three distinct methods for achieving this (assuming I understood the problem correctly). Please note I have used HashSet<int> to make it easier to generate random input.
Setting up:
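List<HashSet<int>> allSets = new List<HashSet<int>>();
Random rand = new Random();
for (int i = 0; i < 8000; ++i) {
    HashSet<int> ints = new HashSet<int>();
    // Draw the target size once per set; calling rand.Next in the loop
    // condition would re-roll the bound on every iteration.
    int size = rand.Next(50, 400);
    for (int j = 0; j < size; ++j) {
        ints.Add(rand.Next(0, 1000));
    }
    allSets.Add(ints);
}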
The three methods I checked (the code shown is what runs in the inner loop):
The loop:
Note that your code produces duplicated results (intersecting set A with set B and later intersecting set B with set A).
That won't affect your performance, thanks to the list length check you are doing, but iterating this way is clearer.
for (int i = 0; i < allSets.Count; ++i) {
    for (int j = i + 1; j < allSets.Count; ++j) {
        // inner-loop code from one of the methods below
    }
}
First method:
Used Enumerable.Intersect() to get the intersection with the other set, then Enumerable.Count() to get the size of the intersection.
var intersect = allSets[i].Intersect(allSets[j]);
count = intersect.Count();
This was the slowest one, averaging 177 s.
Second method:
Cloned the smaller of the two sets I was intersecting, then used ISet.IntersectWith() and checked the resulting set's Count.
HashSet<int> intersect;
HashSet<int> intersectWith;
if (allSets[i].Count < allSets[j].Count) {
    intersect = new HashSet<int>(allSets[i]);
    intersectWith = allSets[j];
} else {
    intersect = new HashSet<int>(allSets[j]);
    intersectWith = allSets[i];
}
intersect.IntersectWith(intersectWith);
count = intersect.Count;
This one was slightly faster, averaging 154 s.
Third method:
Did something very similar to what you did: iterated over the shorter set and checked ISet.Contains on the longer set.
HashSet<int> loopingSet, containsSet;
int count;
for (int i = 0; i < allSets.Count; ++i) {
    for (int j = i + 1; j < allSets.Count; ++j) {
        count = 0;
        if (allSets[i].Count < allSets[j].Count) {
            loopingSet = allSets[i];
            containsSet = allSets[j];
        } else {
            loopingSet = allSets[j];
            containsSet = allSets[i];
        }
        foreach (int k in loopingSet) {
            if (containsSet.Contains(k)) {
                ++count;
            }
        }
    }
}
This method was by far the fastest (as expected), averaging 66 s.
Conclusion:
The method you're using is the fastest of these three. I can't think of a faster single-threaded way to do this. Perhaps there is a better concurrent solution.
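As a rough illustration of what a concurrent variant of the third method might look like, here is a sketch that spreads the outer loop over threads with Parallel.For. It was not part of the measurements above, so no speed-up is claimed. PairwiseIntersections, CountShared and SumAllPairs are hypothetical names, and the sketch assumes the sets are not modified while the loop runs.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class PairwiseIntersections
{
    // Counts the elements shared by two sets by probing the larger set
    // with each element of the smaller one.
    public static int CountShared(HashSet<int> a, HashSet<int> b)
    {
        HashSet<int> loopingSet, containsSet;
        if (a.Count < b.Count) { loopingSet = a; containsSet = b; }
        else { loopingSet = b; containsSet = a; }

        int count = 0;
        foreach (int k in loopingSet)
        {
            if (containsSet.Contains(k)) ++count;
        }
        return count;
    }

    // Sums the intersection sizes over every unordered pair (i < j).
    // Each thread keeps a private subtotal that is merged once at the end.
    // Assumption: the sets are only read, never modified, during the loop.
    public static long SumAllPairs(IReadOnlyList<HashSet<int>> allSets)
    {
        long total = 0;
        Parallel.For<long>(0, allSets.Count,
            () => 0L,
            (i, state, local) =>
            {
                for (int j = i + 1; j < allSets.Count; ++j)
                    local += CountShared(allSets[i], allSets[j]);
                return local;
            },
            local => Interlocked.Add(ref total, local));
        return total;
    }
}
```

A real calculation would probably store one result per pair rather than a single sum; the sum just keeps the sketch short. Work per outer index shrinks as i grows, so the load is uneven, which is one reason any gain would need to be measured.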

I've found that one of the most important considerations when iterating or searching any kind of collection is to choose the collection type very carefully. Iterating through an ordinary list will not be optimal for your purposes. Try using something like:
System.Collections.Generic.HashSet<T>
Using the Contains() method while iterating over the shorter of the two lists (as you mentioned you're already doing) should give close to O(1) time per lookup, the same as key lookups in the generic Dictionary type.

Modulus usage when dealing with odd numbers

I have a list of roughly 50~60 items that I want to be able to divide into multiple columns dynamically. I'm using a nested for loop and the lists divide properly when there are an even number of items. However, when there are an odd number of items the remainder (modulus) items get left out. I've been playing around with it for a while and have not struck gold yet. I'm hoping someone smarter than me can & will assist.
Thanks.
for (int fillRow = 0; fillRow < numOfCols; fillRow++)
{
    for (int fillCell = 0; fillCell < (siteTitles.Count / numOfCols); fillCell++)
    {
        linkAddress = new HyperLink();
        linkAddress.Text = tempSites[fillCell].ToString();
        linkAddress.NavigateUrl = tempUrls[fillCell].ToString();
        mainTbl.Rows[fillCell].Cells[fillRow].Controls.Add(linkAddress);
    }
}

Well yes, the problem is here:
fillCell < (siteTitles.Count / numOfCols)
That division will round down, so for example if there are 13 titles and numOfCols is 5, it will give 2 - which means that items 10-12 won't be used.
I suggest that actually you loop over all the items instead, and work out the row and column for each item:
for (int i = 0; i < siteTitles.Count; i++)
{
    int row = i / numOfCols;
    int col = i % numOfCols;
    // Fill in things using row, col and i
}
(It's not exactly clear what you're doing as you're using siteTitles in the loop condition and tempSites in the loop body, and you're not using fillRow when extracting the data... basically I think you've still got some bugs...)
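To make the row/column arithmetic concrete, here is a small sketch. GridLayout, RowsNeeded and CellFor are hypothetical helper names, not part of the original code, and the sketch assumes items fill the table left to right, then top to bottom. The rounded-up row count is what the table needs so that the leftover items have somewhere to go.

```csharp
using System;

public static class GridLayout
{
    // Number of rows needed to hold itemCount items in numOfCols columns.
    // Integer division rounds down, so add numOfCols - 1 first to round up.
    public static int RowsNeeded(int itemCount, int numOfCols)
    {
        if (numOfCols <= 0)
            throw new ArgumentOutOfRangeException(nameof(numOfCols));
        return (itemCount + numOfCols - 1) / numOfCols;
    }

    // Row and column of item i when filling left to right, top to bottom.
    public static (int Row, int Col) CellFor(int i, int numOfCols)
    {
        if (numOfCols <= 0)
            throw new ArgumentOutOfRangeException(nameof(numOfCols));
        return (i / numOfCols, i % numOfCols);
    }
}
```

For 13 titles and 5 columns, RowsNeeded returns 3, and the last item (index 12) lands in row 2, column 2, so items 10-12 are no longer dropped.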

C# Best way to parse flat file with dynamic number of fields per row

I have a pipe-delimited flat file that looks something like this, for example:
ColA|ColB|3*|Note1|Note2|Note3|2**|A1|A2|A3|B1|B2|B3
The first two columns are set and will always be there.
* denotes a count for how many repeating fields there will be following that count so Notes 1 2 3
** denotes a count for how many times a block of fields are repeated and there are always 3 fields in a block.
This is per row, so each row may have a different number of fields.
Hope that makes sense so far.
I'm trying to find the best way to parse this file, any suggestions would be great.
The goal at the end is to map all these fields into a few different files - data transformation. I'm actually doing all this within SSIS but figured the default components won't be good enough so need to write own code.
UPDATE I'm essentially trying to read this like a source file and do some lookups and string manipulation to some of the fields in between and spit out several different files like in any normal file to file transformation SSIS package.
Using the above example, I may want to create a new file that ends up looking like this
"ColA","HardcodedString","Note1CRLFNote2CRLF","ColB"
And then another file
Row1: "ColA","A1","A2","A3"
Row2: "ColA","B1","B2","B3"
So I guess I'm after some ideas on how to parse this as well as storing the data in either Stacks or Lists or?? to play with and spit out later.

One possibility would be to use a stack. First you split the line on the pipes. A Stack built from a sequence pops the last element first, so reverse the fields to put the first field on top (Enumerable.Reverse needs using System.Linq).
var stack = new Stack<string>(Enumerable.Reverse(line.Split('|')));
Then you pop the first two from the stack to get them out of the way.
stack.Pop();
stack.Pop();
Then you parse the next element: 3*. For that you pop the next 3 items from the stack. With 2** you pop the next 2 x 3 = 6 items from the stack, and so on. You can stop as soon as the stack is empty.
while (stack.Count > 0)
{
    // Parse elements like 3*
}
Hope this is clear enough. I find this article very useful when it comes to String.Split().
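The marker-driven loop described above could be fleshed out roughly as follows. This is a sketch, not the answerer's code: it uses a Queue<string> instead of a Stack so the fields come out in file order without reversing, ParsedRow and FlatFileParser are hypothetical names, and it assumes every count marker ends in * or ** and that each block has exactly 3 fields, as stated in the question. Malformed input simply throws.

```csharp
using System;
using System.Collections.Generic;

public sealed class ParsedRow
{
    public string ColA { get; set; }
    public string ColB { get; set; }
    public List<string> Notes { get; } = new List<string>();
    public List<string[]> Blocks { get; } = new List<string[]>();
}

public static class FlatFileParser
{
    private const int FieldsPerBlock = 3; // "always 3 fields in a block"

    public static ParsedRow ParseLine(string line)
    {
        // A queue yields the fields in their original left-to-right order.
        var fields = new Queue<string>(line.Split('|'));
        var row = new ParsedRow { ColA = fields.Dequeue(), ColB = fields.Dequeue() };

        while (fields.Count > 0)
        {
            string marker = fields.Dequeue();            // e.g. "3*" or "2**"
            int count = int.Parse(marker.TrimEnd('*'));

            if (marker.EndsWith("**", StringComparison.Ordinal))
            {
                // "N**": N blocks of FieldsPerBlock fields each.
                for (int b = 0; b < count; b++)
                {
                    var block = new string[FieldsPerBlock];
                    for (int f = 0; f < FieldsPerBlock; f++)
                        block[f] = fields.Dequeue();
                    row.Blocks.Add(block);
                }
            }
            else
            {
                // "N*": N single repeating fields.
                for (int n = 0; n < count; n++)
                    row.Notes.Add(fields.Dequeue());
            }
        }
        return row;
    }
}
```

Calling FlatFileParser.ParseLine on the example line gives three notes (Note1 to Note3) and two blocks (A1, A2, A3 and B1, B2, B3), which can then be written out to the separate target files.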

Something similar to the following should work (this is untested). It assumes line holds one row such as:
ColA|ColB|3*|Note1|Note2|Note3|2**|A1|A2|A3|B1|B2|B3
string[] columns = line.Split('|');
List<string> repeatingColumnNames = new List<string>();
List<List<string>> repeatingFieldValues = new List<List<string>>();
const int fieldsPerSet = 3; // a block always has 3 fields
if (columns.Length > 2)
{
    // "3*" at index 2: that many single repeating fields follow
    int repeatingFieldCount = int.Parse(columns[2].TrimEnd('*'));
    int repeatingFieldStartIndex = 3;
    for (int i = 0; i < repeatingFieldCount; i++)
    {
        repeatingColumnNames.Add(columns[repeatingFieldStartIndex + i]);
    }
    // "2**" comes right after the repeating fields: that many blocks follow
    int repeatingFieldSetCountIndex = repeatingFieldStartIndex + repeatingFieldCount;
    int repeatingFieldSetCount = int.Parse(columns[repeatingFieldSetCountIndex].TrimEnd('*'));
    int repeatingFieldSetStartIndex = repeatingFieldSetCountIndex + 1;
    for (int i = 0; i < repeatingFieldSetCount; i++)
    {
        string[] fieldSet = new string[fieldsPerSet];
        for (int j = 0; j < fieldsPerSet; j++)
        {
            fieldSet[j] = columns[repeatingFieldSetStartIndex + (i * fieldsPerSet) + j];
        }
        repeatingFieldValues.Add(new List<string>(fieldSet));
    }
}

System.IO.File.ReadAllLines("File.txt").Select(line => line.Split(new[] {'|'}))
