C# Combine list of arrays with count difference - c#

I'm working on a music application that reads lilypond and midi-files.
Both filetypes needs to be turned into our own storage.
For lilypond you need to read repeat's and their alternatives.
There can however be difference in counting.
If there are three repeatings but two alternatives the first two repeatings get the first alternative and the third alternative gets the last alternative.
Due to re-use being starting at front i don't know how to do it.
My current code looks like this, so the only part missing is combining the repeatList and the altList.
I was hoping there is a math solution for this because flipping the arrays would be tragical for the preformance.
private List<Note> readRepeat(List<string> repeat, List<string> alt, int repeatCount)
{
List<Note> noteList = new List<Note>();
List<Note> repeatList = new List<Note>();
List<List<Note>> altList = new List<List<Note>>();
foreach (string line in repeat)
{
repeatList.AddRange(readNoteLine(line));
}
foreach (string line in alt)
{
altList.Add(readNoteLine(line));
}
while (repeatCount > 0)
{
List<Note> toAdd = repeatList.ToList(); // Clone the list, to destroy the reference
if (altList.Count() != 0)
{
// logic to add the right alt
}
noteList.AddRange(toAdd);
repeatCount--;
}
return noteList;
}
In the above code the two lists get populated with the notes.
Their function is as follow:
RepeatList: The basic list of notes that gets played x times
AltList: The list of possibilities to be added to the repeat list.
Some example I/O
RepeatCount = 4
AltList.Count() = 3
Repeat 1 gets: Alt 1
Repeat 2 gets: Alt 1
Repeat 3 gets: Alt 2
Repeat 4 gets: Alt 3
Example in visual style

From what I understand, here is some possible way to implement the logic part (basically you just have to maintain a cursor on the appropriate alternative). This is in place of your while loop.
int currentAlt = 0;
for (int currentRepeat = 0; currentRepeat < repeatCount; currentRepeat++)
{
noteList.AddRange(repeatList);
noteList.AddRange(altList[currentAlt]);
if (currentRepeat >= repeatCount - altList.Count)
{
currentAlt++;
}
}

Related

Compare 2 big string lists

I have two lists of strings - this is not mandatory, i can convert them to any collection (list, dictionary, etc).
First is "text":
Birds sings
Dogs barks
Frogs jumps
Second is "words":
sing
dog
cat
I need to iterate through "text" and if line contains any of "words" - do one thing and if not another thing.
Important: yes, in my case i need to find partial match ignoring case, like text "Dogs" is a match for word "dog". This is why i use .Contains and .ToLower().
My naive try looks like this:
List<string> text = new List<string>();
List<string> words = new List<string>();
foreach (string line in text)
{
bool found = false;
foreach (string word in words)
{
if (line.ToLower().Contains(word.ToLower()))
{
;// one thing
found = true;
break;
}
}
if (!found)
;// another
}
Problem in size - 8000 in first list and ~50000 in second. This takes too many time.
How to make it faster?
I'm assuming that you only want to match on the specific words in your text list: that is, if text contains "dogs", and words contains "dog", then that shouldn't be a match.
Note that this is different to what your code currently does.
Given this, we can construct a HashSet<string> of all of the words in your text list. We can then query this very cheaply.
We'll also use StringComparer.OrdinalIgnoreCase to do our comparisons. This is a better way of doing a case-insensitive match than ToLower(), and ordinal comparisons are relatively cheap. If you're dealing with languages other than English, you'll need to consider whether you actually need StringComparer.CurrentCultureIgnoreCase or StringComparer.InvariantCultureIgnoreCase.
var textWords = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
foreach (var line in text)
{
var lineWords = line.Split(' ');
textWords.UnionWith(lineWords);
}
if (textWords.Overlaps(words))
{
// One thing
}
else
{
// Another
}
If this is not the case, and you do want to do a .Contains on each, then you can speed it up a bit by avoiding the calls to .ToLower(). Each call to .ToLower() creates a new string in memory, so you're creating two new, useless objects per comparison.
Instead, use:
if (line.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
As above, you might have to use StringComparison.CurrentCultureIgnoreCase or StringComparison.InvariantCultureIgnoreCase depending on the language of your strings. However, you should see a significant speedup if your strings are entirely ASCII and you use OrdinalIgnoreCase as this makes the string search a lot quicker.
If you're using .NET Framework, another thing to try is moving to .NET Core. .NET Core introduced a lot of optimizations in this area, and you might find that it's quicker.
Another thing you can do is see if you have duplicates in either text or words. If you have a lot, you might be able to save a lot of time. Consider using a HashSet<string> for this, or linq's .Distinct() (you'll need to see which is quicker).
You can try using LINQ for the second looping construct.
List<string> text = new List<string>();
List<string> words = new List<string>();
foreach (string line in text)
{
bool found = words.FirstOrDefault(w=>line.ToLower().Contains(w.ToLower()))!=null;
if (found)
{
//Do something
}
else
{
//Another
}
}
Might not be as fast as you want but it will be faster than before.
You can improve the search algorithm.
public static int Search(string word, List<string> stringList)
{
string wordCopy = word.ToLower();
List<string> stringListCopy = new List<string>();
stringList.ForEach(s => stringListCopy.Add(s.ToLower()));
stringListCopy.Sort();
int position = -1;
int count = stringListCopy.Count;
if (count > 0)
{
int min = 0;
int max = count - 1;
int middle = (max - min) / 2;
int comparisonStatus = 0;
do
{
comparisonStatus = string.Compare(wordCopy, stringListCopy[middle]);
if (comparisonStatus == 0)
{
position = middle;
break;
}
else if (comparisonStatus < 0)
{
max = middle - 1;
}
else
{
min = middle + 1;
}
middle = min + (max - min) / 2;
} while (min < max);
}
return position;
}
Inside this method we create copy of string list. All elements are lower case.
After that we sort copied list by ascending. This is crucial because the entire algorithm is based upon ascending sort.
If word exists in the list then the Search method will return its position inside list, otherwise it will return -1.
How the algorithm works?
Instead of checking every element in the list, we split the list in half in every iteration.
In every iteration we take the element in the middle and compare two strings (the element and our word). If out string is the same as the one in the middle then our search is finished. If our string is lexical before the string in the middle, then our string must be in the first half of the list, because the list is sorted by ascending. If our string is lexical after the string in the middle, then our string must be in the second half of the list, again because the list is sorted by ascending. Then we take the appropriate half and repeat the process.
In first iteration we take the entire list.
I've tested the Search method using these data:
List<string> stringList = new List<string>();
stringList.Add("Serbia");
stringList.Add("Greece");
stringList.Add("Egypt");
stringList.Add("Peru");
stringList.Add("Palau");
stringList.Add("Slovakia");
stringList.Add("Kyrgyzstan");
stringList.Add("Mongolia");
stringList.Add("Chad");
Search("Serbia", stringList);
This way you will search the entire list of ~50,000 elements in 16 iterations at most.

How to make this code more functional or 'prettier'

I've been working on a project where I need on a button press that this line gets executed.
if (listView1.SelectedItems[0].SubItems[3].Text == "0") //Checks to see Value
{
listView1.SelectedItems[0].SubItems[3].Text = "1";// If Value is Greater, Increase and Change ListView
questionNumberLabel.Text = listView1.SelectedItems[0].SubItems[3].Text;// Increase and Change Label
}
Now I have this repeated about 10 times with each value increasing by one. But I know that this is ugly, and dysfunctional. As well as conflates the file size. I've tried a few things. Primarily this method.
if (listView1.SelectedItems[0].SubItems[3].Text == "0")
{
for (var i = 1; i < 100;)
{
if (!Int32.TryParse(listView1.SelectedItems[0].SubItems[3].Text, out i))
{
i = 0;
}
i++;
listView1.SelectedItems[0].SubItems[3].Text = i.ToString();
Console.WriteLine(i);
}
}
But instead of just adding one, it does the 100 instances and ends. The reason this is becoming a pain in the *** is because the
listView1.SelectedItems[0].SubItems[3].Text
is just that - it's a string, not an int. That's why I parsed it and tried to run it like that. But it still isn't having the out come I want.
I've also tried this
string listViewItemToChange = listView1.SelectedItems[0].SubItems[3].Text;
Then parsing the string, to make it prettier. It worked like it did before, but still hasn't given me the outcome I want. Which to reiterate is, I'm wanting the String taken from the list view to be changed into an int, used in the for loop, add 1, then restring it and output it on my listView.
Please help :(
You say you want the text from a listview subitem converted to an int which is then used in a loop
so - first your creating your loop variable, i, then in your loop you're assigning to it potentially 3 different values 2 of which are negated by the, i++. None of it makes sense and you shouldn't be manipulating your loop variable like that (unless understand what you're doing).
if you move statements around a little..
int itemsToCheck = 10; // "Now I have this repeated about 10 times "
for (var item = 0; item < itemsToCheck; item++)
{
int i;
if (!Int32.TryParse(listView1.SelectedItems[item].SubItems[3].Text, out i))
{
i = 0;
}
i++;
listView1.SelectedItems[item].SubItems[3].Text = i.ToString();
Console.WriteLine(i);
}
Something along those lines is what you're looking for. I haven't changed what your code does with i, just added a loop count itemsToCheck and used a different loop variable so your loop variable and parsed value are not one in the same which will likely be buggy.
Maybe this give you an idea. You can start using this syntax from C# 7.0
var s = listView1.SelectedItems[0].SubItems[3].Text;
var isNumeric = int.TryParse(s, out int n);
if(isNumeric is true && n > 0){
questionNumberLabel.Text = s;
}
to shortcut more
var s = listView1.SelectedItems[0].SubItems[3].Text;
if(int.TryParse(s, out int n) && n > 0){
questionNumberLabel.Text = s;
}

Reducing complexity from permutation to combination

Lets say I have 6 image parts that forms a single image when correctly arranged. Also suppose that I have 2 pair of parts that may be interchanged in position and may still form the same image. Now, I want all the possible combinations without permutation. I want to start from 1st place and check how many partials fits that place and incrementally move towards last place.
List<List<int>> possible_combinations = new List<List<int>>();
for (int i = 0; i <= 6; i++)
{
foreach (var comb in possible_combinations)
{
List<int[]> best_combination = new List<int[]>();
for (int p = 0; p < comb.Count; p++)
best_combination.AddRange(allStrokes[comb[p]]);
if (!comb.Contains(i))
{
best_combination.AddRange(allStrokes[i]);
var fits = cs.checkFeasibility(best_combination);
if (fits)
comb.Add(i);
}
}
}
possible combination already may consists of all first possible partial image id. Check feasibility will check if newly combined partial will improve the result so far or not. How can I achieve that. Code herewith is for reference for understanding.

C# Best way to parse flat file with dynamic number of fields per row

I have a flat file that is pipe delimited and looks something like this as example
ColA|ColB|3*|Note1|Note2|Note3|2**|A1|A2|A3|B1|B2|B3
The first two columns are set and will always be there.
* denotes a count for how many repeating fields there will be following that count so Notes 1 2 3
** denotes a count for how many times a block of fields are repeated and there are always 3 fields in a block.
This is per row, so each row may have a different number of fields.
Hope that makes sense so far.
I'm trying to find the best way to parse this file, any suggestions would be great.
The goal at the end is to map all these fields into a few different files - data transformation. I'm actually doing all this within SSIS but figured the default components won't be good enough so need to write own code.
UPDATE I'm essentially trying to read this like a source file and do some lookups and string manipulation to some of the fields in between and spit out several different files like in any normal file to file transformation SSIS package.
Using the above example, I may want to create a new file that ends up looking like this
"ColA","HardcodedString","Note1CRLFNote2CRLF","ColB"
And then another file
Row1: "ColA","A1","A2","A3"
Row2: "ColA","B1","B2","B3"
So I guess I'm after some ideas on how to parse this as well as storing the data in either Stacks or Lists or?? to play with and spit out later.
One possibility would be to use a stack. First you split the line by the pipes.
var stack = new Stack<string>(line.Split('|'));
Then you pop the first two from the stack to get them out of the way.
stack.Pop();
stack.Pop();
Then you parse the next element: 3* . For that you pop the next 3 items on the stack. With 2** you pop the next 2 x 3 = 6 items from the stack, and so on. You can stop as soon as the stack is empty.
while (stack.Count > 0)
{
// Parse elements like 3*
}
Hope this is clear enough. I find this article very useful when it comes to String.Split().
Something similar to below should work (this is untested)
ColA|ColB|3*|Note1|Note2|Note3|2**|A1|A2|A3|B1|B2|B3
string[] columns = line.Split('|');
List<string> repeatingColumnNames = new List<string();
List<List<string>> repeatingFieldValues = new List<List<string>>();
if(columns.Length > 2)
{
int repeatingFieldCountIndex = columns[2];
int repeatingFieldStartIndex = repeatingFieldCountIndex + 1;
for(int i = 0; i < repeatingFieldCountIndex; i++)
{
repeatingColumnNames.Add(columns[repeatingFieldStartIndex + i]);
}
int repeatingFieldSetCountIndex = columns[2 + repeatingFieldCount + 1];
int repeatingFieldSetStartIndex = repeatingFieldSetCountIndex + 1;
for(int i = 0; i < repeatingFieldSetCount; i++)
{
string[] fieldSet = new string[repeatingFieldCount]();
for(int j = 0; j < repeatingFieldCountIndex; j++)
{
fieldSet[j] = columns[repeatingFieldSetStartIndex + j + (i * repeatingFieldSetCount))];
}
repeatingFieldValues.Add(new List<string>(fieldSet));
}
}
System.IO.File.ReadAllLines("File.txt").Select(line => line.Split(new[] {'|'}))

C# List Manipulation - Output Different When Stepping Through vs. Running

I'm a C# newbie and I'm really confused about something I'm trying to do for a project in a C# class.
The assignment is some list manipulation in C#.
The program accepts a list of items in the text box, then iterates through those items, creating multiple copies of the list. It randomly resizes each copy of the list to between 3 and all items. It then outputs all the copies.
The problem I'm having is that when I step through this program with the debugger, I get the expected output. The same happens if I display a message box after each iteration (as I have in the code below).
However, if I just run the program straight through, I get a different output. Instead of variations in the lists, all the copies of the list are exactly the same.
If you see in the code I've commented "// FIRST DEBUG MESSAGEBOX" and "// SECOND DEBUG MESSAGEBOX". If the first debug messagebox code is left in there, the output is as expected...multiple versions of the list are output with random lengths between 3 and all items.
However, and this is where I'm confused...if you comment out the first debug messagebox code, you get a different result. All versions of the list output are the same length with no variation.
Any help would be appreciated! Here's the code I have so far...sorry if it's terrible - I'm new at C#:
public partial class MainForm : Form
{
/**
* Vars to hold raw text list items
* and list items split by line
*/
String rawListItems = "";
List<string> listItems = new List<string>();
List<List<string>> textListItems = new List<List<string>>();
public MainForm()
{
InitializeComponent();
}
private void cmdGo_Click(object sender, EventArgs e)
{
// store the contents of the list item text box
this.rawListItems = txtListItems.Text;
this.listItems.AddRange(Regex.Split(this.rawListItems, "\r\n"));
// setup min and max items - max items all items
int minItems = 3;
int maxItems = this.listItems.Count;
// We'll copy this list X times, X = same number of items in list
for (int i = 0; i < this.listItems.Count; i++)
{
// make a copy of the list items
List<string> listItemsCopy = new List<string>(this.listItems);
// get a random number between min items and max items
Random random = new Random();
int maxIndex = random.Next(minItems, maxItems + 1); // max is exclusive, hence the +1
// remove all elements after the maxIndex
for (int j = 0; j < listItemsCopy.Count; j++)
{
if (j > maxIndex)
{
listItemsCopy.RemoveAt(j);
}
}
// add the list copy to the master list
this.textListItems.Add(listItemsCopy);
// FIRST DEBUG MESSAGEBOX
String tst = "";
foreach (string item in listItemsCopy)
{
tst += item + " ## ";
}
MessageBox.Show(tst);
}
// SECOND DEBUG MESSAGEBOX
String output = "";
foreach (List<string> listitem in this.textListItems)
{
foreach (string item in listitem)
{
output += item + " ## ";
}
}
MessageBox.Show(output);
}
}
Move the creation of Random out of the loop:
Random random = new Random();
By default, the constructor uses a default time based seed. In a tight loop, you may be getting 'the same' random generator instead of a different one with each loop.
When using MessageBoxes or single stepping, you are allowing the timer to run and getting 'a new' random generator in each loop.
I don't understand your assignment exactly, but this loop seems to be incorrect:
for (int j = 0; j < listItemsCopy.Count; j++)
{
if (j > maxIndex)
{
listItemsCopy.RemoveAt(j);
}
}
when you remove an element in the middle of a list, elements after that get shifted, so not all the elements after maxIndex get removed, as you might expect.
In circumstances where stepping through the code in a debugger affects the behaviour of the program, a useful alternative debugging technique is to use the System.Diagnostics namespace in particular the Trace class.
The Trace functions work much like Console.WriteLine(), you can trace a string or a format string plus an array of objects to populate the format string, e.g.:
Trace.TraceInformation("some message that tells me something");
Trace.TraceInformation("some useful format string {1}, {0}",
new object[] {someObject, someOtherObject});

Categories

Resources