How do I make the foreach instruction iterate in 2 places? - c#

How do I make the foreach statement iterate over both the "files" variable and the "names" array?
var files = Directory.GetFiles(@".\GalleryImages");
string[] names = new string[8] { "Matt", "Joanne", "Robert","Andrei","Mihai","Radu","Ionica","Vasile"};
I've tried two options. The first one gives me lots of errors, and the second one displays 8 images of each kind:
foreach(var file in files,var i in names)
{
//Do stuff
}
and
foreach(var file in files)
{
foreach (var i in names)
{
//Do stuff
}
}

You can try using the Zip Extension method of LINQ:
int[] numbers = { 1, 2, 3, 4 };
string[] words = { "one", "two", "three" };
var numbersAndWords = numbers.Zip(words, (first, second) => first + " " + second);
foreach (var item in numbersAndWords)
Console.WriteLine(item);
In your case it would look something like this:
var files = Directory.GetFiles(@".\GalleryImages");
string[] names = new string[] { "Matt", "Joanne", "Robert", "Andrei", "Mihai","Radu","Ionica","Vasile"};
var zipped = files.Zip(names, (f, n) => new { File = f, Name = n });
foreach(var fn in zipped)
Console.WriteLine(fn.File + " " + fn.Name);
But I haven't tested this one.
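One thing worth knowing before relying on Zip: it stops as soon as the shorter sequence runs out, so any unmatched files or names are silently dropped. A minimal sketch of that behaviour, assuming a hypothetical helper named PairUp (it is not part of the answer above):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipSketch
{
    // Hypothetical helper: pairs each file with the name at the same position.
    // Enumerable.Zip stops at the end of the shorter sequence.
    public static List<string> PairUp(IEnumerable<string> files, IEnumerable<string> names)
    {
        return files.Zip(names, (f, n) => f + " -> " + n).ToList();
    }
}
```

With three files and two names, PairUp returns only two pairs, so check the lengths first if every file must get a name.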

It's not clear what you're asking, but you can't iterate two iterators with a single foreach. You can, however, increment another variable in the foreach body:
int i = 0;
foreach(var file in files)
{
var name = names[i++];
// TODO: do something with name and file
}
This, of course, assumes that files and names are of the same length.

You can't. Use a for loop instead.
for(int i = 0; i < files.Length; i++)
{
var file = files[i];
var name = names[i];
}
If both arrays have the same length, this should work.

You have two options here; the first works if you are iterating over something that has an indexer, like an array or List, in which case use a simple for loop and access things by index:
for (int i = 0; i < files.Length && i < names.Length; i++)
{
var file = files[i];
var name = names[i];
// Do stuff with names.
}
If you have a collection that doesn't have an indexer, e.g. you just have an IEnumerable and you don't know what it is, you can use the IEnumerable interface directly. Behind the scenes, that's all foreach is doing; it just hides the slightly messier syntax. That would look like:
var filesEnum = files.GetEnumerator();
var namesEnum = names.GetEnumerator();
while (filesEnum.MoveNext() && namesEnum.MoveNext())
{
var file = filesEnum.Current;
var name = namesEnum.Current;
// Do stuff with files and names.
}
Both of these work best when both collections have the same number of items. The for loop will only iterate as many times as the smaller one, and the smaller enumerator will return false from MoveNext when it runs out of items. If one collection is bigger than the other, the 'extra' items won't get processed, and you'll need to figure out what to do with them.
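The enumerator approach can be wrapped in a reusable method. This is only a sketch: Pairwise is a hypothetical name, and it uses the generic IEnumerable<T> so that Current is strongly typed. It also disposes the enumerators, which foreach normally does for you.

```csharp
using System;
using System.Collections.Generic;

public static class PairwiseSketch
{
    // Hypothetical helper: walks two sequences in lockstep using their
    // enumerators directly, stopping when either one runs out.
    public static IEnumerable<KeyValuePair<TFirst, TSecond>> Pairwise<TFirst, TSecond>(
        IEnumerable<TFirst> first, IEnumerable<TSecond> second)
    {
        // 'using' disposes each enumerator when iteration ends, as foreach would.
        using (IEnumerator<TFirst> e1 = first.GetEnumerator())
        using (IEnumerator<TSecond> e2 = second.GetEnumerator())
        {
            while (e1.MoveNext() && e2.MoveNext())
            {
                yield return new KeyValuePair<TFirst, TSecond>(e1.Current, e2.Current);
            }
        }
    }
}
```

A caller can then write foreach (var pair in PairwiseSketch.Pairwise(files, names)) and read pair.Key and pair.Value.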

I assume the files array and the names array have matching indices.
If that is the case and you always want the same index from both at the same time, you can do this:
for (int key = 0; key < files.Length; ++key)
{
// access names[key] and files[key] here
}

You can try something like this:
var pairs = files.Zip(names, (f,n) => new {File=f, Name=n});
foreach (var item in pairs)
{
Console.Write(item.File);
Console.Write(item.Name);
}

Related

Quickest algorithm for identifying pairs in collection of string

I am looking for the quickest algorithm:
GOAL: output the total number of pair occurrences found on a line. The individual elements may be in any order on any given line.
INPUT:
a;b;c;d
a;e;f;g
a;b;f;h
OUTPUT
a;b = 2
a;c = 1
a;d = 1
a;e = 1
a;f = 2
a;g = 1
b;c = 1
b;d = 1
I am programming in C#. I've got a nested for loop adding to a common dictionary of type Dictionary<string, int>, where the string key is like a;b; when an occurrence is found, it either adds to the existing int tally or adds a new entry with tally = 0.
Note this:
a;b = 1
b;a = 1
Should be reduced to this:
a;b = 1
I am open to using other languages, the output is in a plain text file which I feed into Gephi visualization tool.
Bonus: Very interested to know the name of this particular algorithm if it's out there. Pretty sure it is.
String[] data = File.ReadAllLines(@"C:\input.txt");
Dictionary<string, int> ress = new Dictionary<string, int>();
foreach (var line in data)
{
string[] outStrings = line.Split(';');
for (int i = 0; i < outStrings.Count(); i++)
{
for (int y = 0; y < outStrings.Count(); y++)
{
if (outStrings[i] != outStrings[y])
{
try
{
if (ress.Any(x => x.Key == outStrings[i] + ";" + outStrings[y]))
{
ress[outStrings[i] + ";" + outStrings[y]] += 1;
}
else
{
ress.Add(outStrings[i] + ";" + outStrings[y], 0);
}
}
catch (Exception)
{
}
}
}
}
}
foreach (var val in ress)
{
Console.WriteLine(val.Key + "----" + val.Value);
}
I think your inner loop should start with i + 1 instead of starting back at 0 again, and the outer loop should only run until Length - 1, since the last item will be compared on the inner loop. Also, when you add a new item, you should add the value 1, not 0 (since the whole reason we're adding it is because we found one).
You can also just store the key into a string once instead of doing multiple concatenations during your comparison and assignment, and you can use the ContainsKey method to determine if a key exists already.
Also, you might want to consider avoiding empty catch blocks unless you're really certain that you don't care if or what went wrong. If I'm expecting an exception and know how to handle it, then I catch that exception, otherwise I'll just let it bubble up the stack.
Here's one way you could modify your code to find all pairs and their counts:
Update
I added a check to ensure that the "pair" key is always sorted, so that "b;a" becomes "a;b". This wasn't an issue in your sample data, but I extended the data to include lines like b;a;a;b;a;b;a;. I also added StringSplitOptions.RemoveEmptyEntries to the Split method to handle cases where a line begins or ends with a ; (otherwise the resulting empty string produced a pair like ";a").
private static void Main()
{
var data = File.ReadAllLines(@"f:\public\temp\temp.txt");
var pairCount = new Dictionary<string, int>();
foreach (var line in data)
{
var lineItems = line.Split(new[] {';'}, StringSplitOptions.RemoveEmptyEntries);
for (var outer = 0; outer < lineItems.Length - 1; outer++)
{
for (var inner = outer + 1; inner < lineItems.Length; inner++)
{
var outerComparedToInner = string.Compare(lineItems[outer],
lineItems[inner], StringComparison.Ordinal);
// If both items are the same character, ignore them and keep looping
if (outerComparedToInner == 0) continue;
// Create the pair such that the lower of the two
// values is first, so that "b;a" becomes "a;b"
var thisPair = outerComparedToInner < 0
? $"{lineItems[outer]};{lineItems[inner]}"
: $"{lineItems[inner]};{lineItems[outer]}";
if (pairCount.ContainsKey(thisPair))
{
pairCount[thisPair]++;
}
else
{
pairCount.Add(thisPair, 1);
}
}
}
}
Console.WriteLine("Pair\tCount\n----\t-----");
foreach (var val in pairCount.OrderBy(i => i.Key))
{
Console.WriteLine($"{val.Key}\t{val.Value}");
}
Console.Write("\nDone!\nPress any key to exit...");
Console.ReadKey();
}
Output
Given a file containing your sample data, the output is:
@mrmcgreg, finally, after changing the implementation to the ECLAT algorithm, everything runs in seconds instead of hours.
Basically, for each unique tag, keep track of the line numbers where that tag is found, and simply intersect the two lists of line numbers for each pair of tags to get the count.
Dictionary<string, List<int>> uniqueTagList = new Dictionary<string, List<int>>();
foreach (var uniqueTag in uniquetags)
{
List<int> lineNumbers = new List<int>();
foreach (var item in data.Select((value, i) => new { i, value }))
{
var value = item.value;
var index = item.i;
//split data into tags
var tags = value.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
foreach (var tag in tags)
{
if (uniqueTag == tag)
{
lineNumbers.Add(index);
}
}
}
//keep only tags that exceed the support threshold
if (lineNumbers.Count > 5)
{
uniqueTagList.Add(uniqueTag, lineNumbers);
}
}
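The snippet above only builds the per-tag line lists; the intersection step is not shown. Here is a minimal sketch of the whole idea under some assumptions: CountPairs is a hypothetical name, the support threshold is left out, and each pair is counted once per line (so a tag repeated within one line does not inflate the count).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TidListSketch
{
    // Hypothetical helper: counts how many lines contain each pair of tags
    // by intersecting per-tag sets of line numbers.
    public static Dictionary<string, int> CountPairs(IEnumerable<string> lines)
    {
        // Step 1: one pass over the data, recording the line numbers per tag.
        var lineNumbersByTag = new SortedDictionary<string, HashSet<int>>(StringComparer.Ordinal);
        int lineNumber = 0;
        foreach (string line in lines)
        {
            foreach (string tag in line.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
            {
                HashSet<int> set;
                if (!lineNumbersByTag.TryGetValue(tag, out set))
                {
                    set = new HashSet<int>();
                    lineNumbersByTag.Add(tag, set);
                }
                set.Add(lineNumber);
            }
            lineNumber++;
        }

        // Step 2: for each pair of tags (in sorted order, so "a;b" never
        // appears as "b;a"), the count is the size of the intersection.
        var tags = lineNumbersByTag.Keys.ToList();
        var pairCounts = new Dictionary<string, int>();
        for (int i = 0; i < tags.Count - 1; i++)
        {
            for (int j = i + 1; j < tags.Count; j++)
            {
                int shared = lineNumbersByTag[tags[i]].Count(n => lineNumbersByTag[tags[j]].Contains(n));
                if (shared > 0)
                {
                    pairCounts[tags[i] + ";" + tags[j]] = shared;
                }
            }
        }
        return pairCounts;
    }
}
```

Because the tag-to-lines map is built in a single pass, this avoids rescanning the whole data set once per unique tag.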

StreamWriter C# formatting output

Problem Statement
In order to run gene annotation software, I need to prepare two types of files, vcard files and coverage tables, and there has to be a one-to-one match of vcard to coverage table. Since I'm running 2k samples, it's hard to identify which file has no one-to-one match. I know that both files have unique identifier numbers; hence, if both folders have files with the same unique number, I treat them as the "same" file.
I made a program that compares two folders and reports unique entries in each folder. To do so, I made two list that contains unique file names to each directory.
I want to format the report file (tab delimited .txt file) such that it looks something like below:
Unique in fdr1    Unique in fdr2
file x            file a
file y            file b
file z            file c
I find this difficult to do because I have to iterate twice (since I have two lists), but there is no way of going back to the previous line in StreamWriter as far as I know. Basically, once I iterate through the first list and fill the first column, how can I fill the second column with the second list?
Can someone help me out with this?
Thanks
If design of the code has to change (i.e. one list instead of two), please let me know
As requested by a user, this is how I was going to do it (not a working version):
// Write report
using (StreamWriter sw = new StreamWriter(dest_txt.Text + @"\" + "Report.txt"))
{
// Write headers
sw.WriteLine("Unique Entries in Folder1" + "\t" + "Unique Entries in Folder2");
// Write unique entries in fdr1
foreach(string file in fdr1FileList)
{
sw.WriteLine(file + "\t");
}
// Write unique entries in fdr2
foreach (string file in fdr2FileList)
{
sw.WriteLine(file + "\t");
}
sw.Dispose();
}
As requested for my approach for finding unique entries, here's my code snippet
Dictionary<int, bool> fdr1Dict = new Dictionary<int, bool>();
Dictionary<int, bool> fdr2Dict = new Dictionary<int, bool>();
List<string> fdr1FileList = new List<string>();
List<string> fdr2FileList = new List<string>();
string fdr1Path = folder1_txt.Text;
string fdr2Path = folder2_txt.Text;
// File names in the specified directory; path not included
string[] fdr1FileNames = Directory.GetFiles(fdr1Path).Select(Path.GetFileName).ToArray();
string[] fdr2FileNames = Directory.GetFiles(fdr2Path).Select(Path.GetFileName).ToArray();
// Iterate through the first directory, and add GL number to dictionary
for(int i = 0; i < fdr1FileNames.Length; i++)
{
// Grabs only the number from the file name
string number = Regex.Match(fdr1FileNames[i], @"\d+").ToString();
int glNumber;
// Make sure it is a number
if(Int32.TryParse(number, out glNumber))
{
fdr1Dict[glNumber] = true;
}
// If number not present, raise exception
else
{
throw new Exception(String.Format("GL Number not found in: {0}", fdr1FileNames[i]));
}
}
// Iterate through the second directory, and add GL number to dictionary
for (int i = 0; i < fdr2FileNames.Length; i++)
{
// Grabs only the number from the file name
string number = Regex.Match(fdr2FileNames[i], @"\d+").ToString();
int glNumber;
// Make sure it is a number
if (Int32.TryParse(number, out glNumber))
{
fdr2Dict[glNumber] = true;
}
// If number not present, raise exception
else
{
throw new Exception(String.Format("GL Number not found in: {0}", fdr2FileNames[i]));
}
}
// Iterate through the first directory, and find files that are unique to it
for (int i = 0; i < fdr1FileNames.Length; i++)
{
int glNumber = Int32.Parse(Regex.Match(fdr1FileNames[i], @"\d+").Value);
// If same file is not present in the second folder add to the list
// (ContainsKey avoids the KeyNotFoundException the indexer throws for a missing key)
if (!fdr2Dict.ContainsKey(glNumber))
{
fdr1FileList.Add(fdr1FileNames[i]);
}
}
// Iterate through the second directory, and find files that are unique to it
for (int i = 0; i < fdr2FileNames.Length; i++)
{
int glNumber = Int32.Parse(Regex.Match(fdr2FileNames[i], @"\d+").Value);
// If same file is not present in the first folder add to the list
// (ContainsKey avoids the KeyNotFoundException the indexer throws for a missing key)
if (!fdr1Dict.ContainsKey(glNumber))
{
fdr2FileList.Add(fdr2FileNames[i]);
}
}
I am quite confident that this will work, as I've tested it:
static void Main(string[] args)
{
var firstDir = @"Path1";
var secondDir = @"Path2";
var firstDirFiles = System.IO.Directory.GetFiles(firstDir);
var secondDirFiles = System.IO.Directory.GetFiles(secondDir);
print2Dirs(firstDirFiles, secondDirFiles);
}
private static void print2Dirs(string[] firstDirFile, string[] secondDirFiles)
{
var maxIndex = Math.Max(firstDirFile.Length, secondDirFiles.Length);
using (StreamWriter streamWriter = new StreamWriter("result.txt"))
{
streamWriter.WriteLine(string.Format("{0,-150}{1,-150}", "Unique in fdr1", "Unique in fdr2"));
for (int i = 0; i < maxIndex; i++)
{
streamWriter.WriteLine(string.Format("{0,-150}{1,-150}",
firstDirFile.Length > i ? firstDirFile[i] : string.Empty,
secondDirFiles.Length > i ? secondDirFiles[i] : string.Empty));
}
}
}
It's quite simple code, but if you need help understanding it, just let me know :)
I would construct each line at a time. Something like this:
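int row = 0;
// Placeholders: use the two lists of unique file names from the question here
string[] fdr1FileList = new string[0];
string[] fdr2FileList = new string[0];
while (row < fdr1FileList.Length || row < fdr2FileList.Length)
{
string rowText = "";
rowText += (row >= fdr1FileList.Length ? "\t" : fdr1FileList[row] + "\t");
rowText += (row >= fdr2FileList.Length ? "\t" : fdr2FileList[row]);
// Write the finished row; sw is the StreamWriter from the question
sw.WriteLine(rowText);
row++;
}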
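The row-at-a-time idea can be packaged as a small method that takes any TextWriter, which also makes it easy to try out without touching the disk. This is a sketch only; WriteColumns is a hypothetical name, and the header text is taken from the layout in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TwoColumnSketch
{
    // Hypothetical helper: writes two lists side by side, one row per index,
    // leaving a cell empty once its list has run out.
    public static void WriteColumns(TextWriter writer, IList<string> left, IList<string> right)
    {
        writer.WriteLine("Unique in fdr1\tUnique in fdr2");
        int rows = Math.Max(left.Count, right.Count);
        for (int row = 0; row < rows; row++)
        {
            string leftCell = row < left.Count ? left[row] : string.Empty;
            string rightCell = row < right.Count ? right[row] : string.Empty;
            writer.WriteLine(leftCell + "\t" + rightCell);
        }
    }
}
```

To produce the report file, pass a StreamWriter opened on the destination path together with fdr1FileList and fdr2FileList.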
Try something like this:
static void Main(string[] args)
{
Dictionary<int, string> fdr1Dict = FilesToDictionary(Directory.GetFiles("path1"));
Dictionary<int, string> fdr2Dict = FilesToDictionary(Directory.GetFiles("path2"));
var unique_f1 = fdr1Dict.Where(f1 => !fdr2Dict.ContainsKey(f1.Key)).ToArray();
var unique_f2 = fdr2Dict.Where(f2 => !fdr1Dict.ContainsKey(f2.Key)).ToArray();
int f1_size = unique_f1.Length;
int f2_size = unique_f2.Length;
int list_length = 0;
if (f1_size > f2_size)
{
list_length = f1_size;
Array.Resize(ref unique_f2, list_length);
}
else
{
list_length = f2_size;
Array.Resize(ref unique_f1, list_length);
}
using (StreamWriter writer = new StreamWriter("output.txt"))
{
writer.WriteLine(string.Format("{0,-30}{1,-30}", "Unique in fdr1", "Unique in fdr2"));
for (int i = 0; i < list_length; i++)
{
writer.WriteLine(string.Format("{0,-30}{1,-30}", unique_f1[i].Value, unique_f2[i].Value));
}
}
}
static Dictionary<int, string> FilesToDictionary(string[] filenames)
{
Dictionary<int, string> dict = new Dictionary<int, string>();
for (int i = 0; i < filenames.Length; i++)
{
int glNumber;
string filename = Path.GetFileName(filenames[i]);
string number = Regex.Match(filename, @"\d+").ToString();
if (int.TryParse(number, out glNumber))
dict.Add(glNumber, filename);
}
return dict;
}

Randomly place images in multiple Image controls

I am creating a simple "Pairs" game in WPF. I have 12 Image controls on MainWindow. What I need to do is use OpenFileDialog to select multiple images (possibly fewer than 6) and then randomly place them into the Image controls. Each picture should appear twice. How can I achieve this? I have been stuck here for a while and only have the following code at the moment. I am not asking for a solution; I only need a few pointers on how to deal with this. Thank you.
public ObservableCollection<Image> GetImages()
{
OpenFileDialog dlg = new OpenFileDialog();
dlg.Multiselect = true;
ObservableCollection<Image> imagesList = new ObservableCollection<Image>();
if (dlg.ShowDialog() == true)
{
foreach (String img in dlg.FileNames)
{
Image image = new Image();
image.Name = "";
image.Location = img;
imagesList.Add(image);
}
}
return imagesList;
}
There are many ways to achieve your required results. A good way would be to use the Directory.GetFiles method, which will return a collection of string file paths:
string [] filePaths = Directory.GetFiles(targetDirectory);
You can then use a method to randomise the order of the collection. From the C# Shuffle Array page on DotNETPerls:
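// Shared random number generator used by RandomizeStrings
private static readonly Random _random = new Random();
public string[] RandomizeStrings(string[] arr)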
{
List<KeyValuePair<int, string>> list = new List<KeyValuePair<int, string>>();
// Add all strings from array
// Add new random int each time
foreach (string s in arr)
{
list.Add(new KeyValuePair<int, string>(_random.Next(), s));
}
// Sort the list by the random number
var sorted = from item in list
orderby item.Key
select item;
// Allocate new string array
string[] result = new string[arr.Length];
// Copy values to array
int index = 0;
foreach (KeyValuePair<int, string> pair in sorted)
{
result[index] = pair.Value;
index++;
}
// Return copied array
return result;
}
Then add each file path twice, randomise the order again and populate your UI property with the items:
// A List is used here because arrays have no Add method
List<string> filePathsToUse = new List<string>();
filePaths = RandomizeStrings(filePaths);
for (int count = 0; count < yourRequiredNumber; count++)
{
filePathsToUse.Add(filePaths[count]);
filePathsToUse.Add(filePaths[count]);
}
// Finally, randomize the collection again:
ObservableCollection<string> filePathsToBindTo = new
ObservableCollection<string>(RandomizeStrings(filePathsToUse.ToArray()));
Of course, you could also do it in many other ways, some easier to understand, some more efficient. Just pick a method that you feel comfortable with.
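As one of those other ways, here is a minimal sketch that duplicates each path and shuffles the result in place with a Fisher-Yates shuffle, instead of sorting by random keys. BuildDeck is a hypothetical name, and the caller supplies the Random instance.

```csharp
using System;
using System.Collections.Generic;

public static class PairsDeckSketch
{
    // Hypothetical helper: returns every path twice, in random order,
    // using an in-place Fisher-Yates shuffle.
    public static List<string> BuildDeck(IEnumerable<string> imagePaths, Random random)
    {
        var deck = new List<string>();
        foreach (string path in imagePaths)
        {
            deck.Add(path);
            deck.Add(path);
        }
        // Walk backwards, swapping each slot with a random slot at or before it.
        for (int i = deck.Count - 1; i > 0; i--)
        {
            int j = random.Next(i + 1); // 0 <= j <= i
            string tmp = deck[i];
            deck[i] = deck[j];
            deck[j] = tmp;
        }
        return deck;
    }
}
```

The resulting list can be assigned to the Image controls in order, one path per control.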

Get first value in CSV column without duplicates

I am getting a list of items from a csv file via a Web Api using this code:
private List<Item> items = new List<Item>();
public ItemRepository()
{
string filename = HttpRuntime.AppDomainAppPath + "App_Data\\items.csv";
var lines = File.ReadAllLines(filename).Skip(1).ToList();
for (int i = 0; i < lines.Count; i++)
{
var line = lines[i];
var columns = line.Split('$');
//get rid of newline characters in the middle of data lines
while (columns.Length < 9)
{
i += 1;
line = line.Replace("\n", " ") + lines[i];
columns = line.Split('$');
}
//Remove Starting and Trailing open quotes from fields
columns = columns.Select(c => { if (string.IsNullOrEmpty(c) == false) { return c.Substring(1, c.Length - 2); } return string.Empty; }).ToArray();
var temp = columns[5].Split('|', '>');
items.Add(new Item()
{
Id = int.Parse(columns[0]),
Name = temp[0],
Description = columns[2],
Photo = columns[7]
});
}
}
The Name attribute of the item list must come from column whose structure is as follows:
Groups>Subgroup>item
Therefore I use var temp = columns[5].Split('|', '>'); in my code to get the first element of the column before the ">", which in the above case is Groups. And this works fine.
However, I am getting many duplicates in the result. This is because other items in the column may be:
(These are some of the entries in my csv column 9)
Groups>Subgroup2>item2, Groups>Subgroup3>item4, Groups>Subgroup4>item9
All start with Groups, but I only want to get Groups once.
As it is I get a long list of Groups. How do I stop the duplicates?
If an item named "Groups" has already been added to the list, I don't want any other item with that name to be returned. How do I implement this check?
If you are successfully getting the list of groups, take that list of groups and use LINQ:
var undupedList = dupedList
.Distinct();
Update: The reason Distinct did not work is that your code is requesting not just Name but also Description, etc. If you only ask for Name, Distinct() will work.
Update 2: Try this:
//Check whether already exists
if (!items.Any(q => q.Name == temp[0]))
{
items.Add(...);
}
How about using a List to store Item.Name?
Then check List.Contains() before calling items.Add()
Simple, only 3 lines of code, and it works.
IList<string> listNames = new List<string>();
//
for (int i = 0; i < lines.Count; i++)
{
//
var temp = columns[5].Split('|', '>');
if (!listNames.Contains(temp[0]))
{
listNames.Add(temp[0]);
items.Add(new Item()
{
//
});
}
}
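A variation on the same idea, sketched under assumptions: a HashSet<string> replaces the List, because HashSet<T>.Add reports whether the value was new, so the check and the insert happen in one call. FirstPerGroup is a hypothetical name, and plain strings stand in for the Item objects.

```csharp
using System;
using System.Collections.Generic;

public static class FirstPerNameSketch
{
    // Hypothetical helper: keeps only the first entry seen for each group name,
    // where the name is the text before the first '>' or '|'.
    public static List<string> FirstPerGroup(IEnumerable<string> categoryPaths)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        foreach (string path in categoryPaths)
        {
            string name = path.Split('|', '>')[0];
            // HashSet<T>.Add returns false when the value is already present.
            if (seen.Add(name))
            {
                result.Add(path);
            }
        }
        return result;
    }
}
```

For a long CSV this also avoids the linear scan that List.Contains performs on every row.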

Improve the code for performance?

I wrote the following C# code to generate a set of numbers and then compare it with another set of numbers to remove the unwanted ones.
But it takes too long at run time to complete the process. The code-behind file follows.
The numbers it has to generate are about 7 figures large, and the list of numbers I use for removal has around 700 entries.
Is there a way to improve the run-time performance?
string[] strAry = txtNumbersToBeExc.Text.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
int[] intAry = new int[strAry.Length];
List<int> intList = new List<int>();
for (int i = 0; i < strAry.Length; i++)
{
intList.Add(int.Parse(strAry[i]));
}
List<int> genList = new List<int>();
for (int i = int.Parse(txtStartSeed.Text); i <= int.Parse(txtEndSeed.Text); i++)
{
genList.Add(i);
}
lblStatus.Text += "Generated: " + genList.Capacity;
var finalvar = from s in genList where !intList.Contains(s) select s;
List<int> finalList = finalvar.ToList();
foreach (var item in finalList)
{
txtGeneratedNum.Text += "959" + item + "\n";
}
First thing to do is grab a profiler and see which area of your code is taking too long to run, try http://www.jetbrains.com/profiler/ or http://www.red-gate.com/products/dotnet-development/ants-performance-profiler/.
You should never start performance tuning until you know for sure where the problem is.
If the problem is in the LINQ query, then you could try sorting intList and doing a binary search for each item to remove, though you can probably get similar behaviour with the right LINQ query.
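The sort-then-binary-search suggestion could look roughly like this. It is a sketch, not a measured optimisation: Filter is a hypothetical name, and it works on a sorted copy so the caller's list is left untouched.

```csharp
using System;
using System.Collections.Generic;

public static class BinarySearchSketch
{
    // Hypothetical helper: returns the numbers in [start, end] that do not
    // appear in 'excluded', using a sorted copy and binary search.
    public static List<int> Filter(int start, int end, IEnumerable<int> excluded)
    {
        var sorted = new List<int>(excluded);
        sorted.Sort();
        var kept = new List<int>();
        for (int n = start; n <= end; n++)
        {
            // List<T>.BinarySearch returns a negative value when n is absent.
            if (sorted.BinarySearch(n) < 0)
            {
                kept.Add(n);
            }
        }
        return kept;
    }
}
```

Each lookup is then logarithmic in the size of the exclusion list rather than linear, as it is with List.Contains.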
string numbersStr = txtNumbersToBeExc.Text;
string startSeedStr = txtStartSeed.Text;
string endSeedStr = txtEndSeed.Text;
// Next: the inputs are meant to be ints, so we should check that the strings actually represent ints
var intAry = numbersStr.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries).Select(s=>Int32.Parse(s));
int startSeed = Int32.Parse(startSeedStr);
int endSeed = Int32.Parse(endSeedStr);
/*FROM HERE*/
// using Enumerable.Range
var genList = Enumerable.Range(startSeed, endSeed - startSeed + 1);
// we can use linq except
var finalList = genList.Except(intAry);
// do you need a string, for 700 concatenations I would suggest StringBuilder
var sb = new StringBuilder();
foreach ( var item in finalList)
{
sb.AppendLine(string.Concat("959",item.ToString()));
}
var finalString = sb.ToString();
/*TO HERE, refactor it into a method or class*/
txtGeneratedNum.Text = finalString;
The key point here is that String is an immutable class, so the "+" operation between two strings creates another string. StringBuilder doesn't do this. In your situation it really doesn't matter whether you use for loops, foreach loops, or fancy LINQ functions to accomplish the exclusion. The performance hit was because of the string concatenations. I trust the System.Linq functions more because they are already tested for performance.
1. Change intList from a List to a HashSet; this gives much better performance when determining whether an entry is present.
2. Consider using LINQ's Enumerable.Except (the set difference, which is what removing unwanted numbers needs), especially combined with #1.
Replace the block of code that creates genList with this:
List<int> genList = new List<int>();
for (int i = int.Parse(txtStartSeed.Text); i <= int.Parse(txtEndSeed.Text); i++)
{
if (!intList.Contains(i)) genList.Add(i);
}
and afterwards build txtGeneratedNum by looping over genList. This will reduce the number of loops in your implementation.
Why not do the inclusion check when you are parsing the ints and just build the result list directly?
There is not much point in iterating over the list twice. In fact, why build the intermediate list at all? Just write straight to a StringBuilder, since a newline-delimited string seems to be your goal.
// Parse the exclusion list once into a HashSet for fast lookups
var exclusions = new HashSet<int>();
foreach (string s in txtNumbersToBeExc.Text.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries))
{
int value;
if (int.TryParse(s, out value))
{
exclusions.Add(value);
}
}
var output = new StringBuilder();
for (int i = int.Parse(txtStartSeed.Text); i <= int.Parse(txtEndSeed.Text); i++)
{
if (!exclusions.Contains(i))
{
output.AppendFormat("959{0}\n", i);
}
}
txtGeneratedNum.Text = output.ToString();
