I have been tackling this all day and I am really fed up with it. I am new to sorting, and for a school exercise I have worked through bubble sort, quick sort and finally bucket sort.
Here is what I have for a bucket sort of a list of objects (T-shirts) sorted by Cost. T-shirts also have Size, Fabric and Color, and I need to sort by those as well. I know this falls into the "sort strings with bubble sort" category, but I cannot find anything about it and everything I have tried has failed.
public List<Tshirt> BucketSort(Tshirt[] array)
{
List<Tshirt> result = new List<Tshirt>();
// Determine how many buckets you want to create
//Create buckets
int numOfBuckets = 5;
List<Tshirt>[] buckets = new List<Tshirt>[numOfBuckets];
for (int i = 0; i < 5; i++)
buckets[i] = new List<Tshirt>();
//Iterate through the passed array and add each tshirt to the appropriate bucket
for (int i = 0; i < array.Length; i++)
{
int buckitChoice = ((int)array[i].Cost / numOfBuckets);
buckets[buckitChoice].Add(array[i]);
}
//Sort each bucket and add it to the result List
//Each sublist is sorted using Bubblesort, but you could substitute any sorting algo you would like
for (int i = 0; i < numOfBuckets; i++)
{
Tshirt[] temp = BubbleSortList(buckets[i]);
result.AddRange(temp);
}
return result;
}
public static Tshirt[] BubbleSortList(List<Tshirt> input)
{
for (int i = 0; i < input.Count; i++)
{
for (int j = 0; j < input.Count; j++)
{
if (input[i].Cost < input[j].Cost)
{
decimal temp = input[i].Cost;
input[i].Cost = input[j].Cost;
input[j].Cost = temp;
}
}
}
return input.ToArray();
}
public Tshirt[] ReturnContexttoArray()
{
var tshirts = _context.Tshirts;
Tshirt[] TshirtArr = new Tshirt[tshirts.Count()];
var count = 0;
foreach (var tshirt in tshirts)
{
TshirtArr[count++] = tshirt;
}
return TshirtArr;
}
public List<Tshirt> ImplementBucketSortAsc()
{
return BucketSort(ReturnContexttoArray());
}
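One way to sort by the other properties is to pass the comparison into the bubble sort instead of hard-coding Cost. The following is only a minimal sketch, assuming a simplified Tshirt class with plain Cost, Size, Fabric and Color properties (the real class is not shown above); TshirtSorter and its BubbleSort method are hypothetical names. Note that it swaps the whole objects rather than only their Cost values, so each T-shirt keeps its own Size, Fabric and Color.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the Tshirt class from the question (assumed shape).
public class Tshirt
{
    public decimal Cost { get; set; }
    public string Size { get; set; }
    public string Fabric { get; set; }
    public string Color { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class TshirtSorter
{
    // Bubble sort that orders the list in place using any comparison.
    public static void BubbleSort(List<Tshirt> input, Comparison<Tshirt> compare)
    {
        for (int pass = 0; pass < input.Count - 1; pass++)
        {
            bool swapped = false;
            for (int j = 0; j < input.Count - 1 - pass; j++)
            {
                if (compare(input[j], input[j + 1]) > 0)
                {
                    // Swap the objects themselves, not just one property.
                    Tshirt temp = input[j];
                    input[j] = input[j + 1];
                    input[j + 1] = temp;
                    swapped = true;
                }
            }
            if (!swapped) break; // no swaps in this pass: already sorted
        }
    }
}
```

Sorting by color would then be a call such as TshirtSorter.BubbleSort(shirts, (a, b) => string.CompareOrdinal(a.Color, b.Color)), and sorting by cost would use (a, b) => a.Cost.CompareTo(b.Cost).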
I have two lists of strings, and I need to check to see if there are any matches, and I have to do this at a minimum of sixty times a second, but this can scale up to thousands of times a second.
Right now, the lists are both small; one is three, and another might have a few dozen elements at most, but the currently small list is probably gonna grow.
Would it be faster to do this:
for (int i = 0; i < listA.Length; i++)
{
for (int j = 0; j < listB.Length; j++) {
if (listA[i] == listB[j])
{
// do stuff
}
}
}
Or to do this:
var hashSetB = new HashSet<string>(listB.Length);
for (int i = 0; i < listB.Length; i++)
{
hashSetB.Add(listB[i]);
}
for (int i = 0; i < listA.Length; i++)
{
if (hashSetB.Contains(listA[i])) {
// do stuff
}
}
ListA and ListB when they come to me, will always be lists; I have no control over them.
I think the core of my question is that I don't know how long var hashSetB = new HashSet<string>(listB.Length); takes, so I'm not sure the change would be good or bad for smaller lists.
I was curious, so here is some code I wrote to test it. In my run, the HashSet version was near instantaneous whereas the nested loops were slow. That makes sense: you have taken something that needs lengthA * lengthB comparisons and reduced it to roughly lengthA + lengthB operations.
const int size = 20000;
var listA = new List<int>();
for (int i = 0; i < size; i++)
{
listA.Add(i);
}
var listB = new List<int>();
for (int i = size - 5; i < 2 * size; i++)
{
listB.Add(i);
}
var sw = new Stopwatch();
sw.Start();
for (int i = 0; i < listA.Count; i++)
{
for (int j = 0; j < listB.Count; j++)
{
if (listA[i] == listB[j])
{
Console.WriteLine("Nested loop match");
}
}
}
long timeTaken1 = sw.ElapsedMilliseconds;
sw.Restart();
var hashSetB = new HashSet<int>(listB.Count);
for (int i = 0; i < listB.Count; i++)
{
hashSetB.Add(listB[i]);
}
for (int i = 0; i < listA.Count; i++)
{
if (hashSetB.Contains(listA[i]))
{
Console.WriteLine("HashSet match");
}
}
long timeTaken2 = sw.ElapsedMilliseconds;
Console.WriteLine("Time Taken Nested Loop: " + timeTaken1);
Console.WriteLine("Time Taken HashSet: " + timeTaken2);
Console.ReadLine();
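If all you need is a yes/no answer, the set can also be built and checked in two calls. This is a rough sketch rather than a recommendation: ListMatcher and AnyMatch are hypothetical names, and whether it beats the nested loops for lists of three and a few dozen elements is something to measure rather than assume.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original code.
public static class ListMatcher
{
    // Returns true if at least one string appears in both lists.
    public static bool AnyMatch(List<string> listA, List<string> listB)
    {
        // Copy listB into a hash set once: roughly one hash per element.
        var hashSetB = new HashSet<string>(listB);

        // Overlaps stops at the first element of listA found in the set.
        return hashSetB.Overlaps(listA);
    }
}
```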
I created an extension method for type List<decimal[]> that does element-wise addition. The method is as follows:
public static void ElementAddition(this List<decimal[]> thisList, List<decimal[]> listToAdd)
{
if (thisList.Count == 0) return;
for (var i = 0; i < thisList.Count; i++)
{
for (var j = 0; j < thisList[0].Length; j++)
{
thisList[i][j] += listToAdd[i][j];
}
}
}
This has some interesting results. Every time the thisList[i][j] += listToAdd[i][j]; line executes, it appears to add ALL the elements in the one list to the other list, so the values in thisList end up substantially larger than they should be. It does not do the element-wise addition I was expecting.
I struggled to find similar questions on Stack Overflow, so any pointers would be useful. Is there something about the implementation of lists that I clearly do not understand?
Paul
edit:
The following unit test:
[Test]
public void ElementAddition_WhenCalled_CorrectlyAddIndividualElements()
{
var decListOne = new decimal[] {1, 2, 3, 4};
var decListTwo = new decimal[] {10, 20, 30, 40};
var listListOne = new List<decimal[]>();
var listListTwo = new List<decimal[]>();
for (var i = 0; i < 3; i++)
{
listListOne.Add(decListOne);
listListTwo.Add(decListTwo);
}
listListOne.ElementAddition(listListTwo);
Assert.AreEqual(11m,listListOne[0][0]);
}
provides this output:
Expected: 11m
But was: 31m
You're reusing the same array 3 times in your test:
for (var i = 0; i < 3; i++)
{
listListOne.Add(decListOne); //here's the problem
listListTwo.Add(decListTwo);
}
This means that in your ElementAddition you are adding the elements into the same array on each pass of the outer loop. The assumption that each entry of thisList references a different array is incorrect.
If you change your test to ensure each array is unique/different:
for (var i = 0; i < 3; i++)
{
listListOne.Add(decListOne.ToArray()); //force new/different instances of the array
listListTwo.Add(decListTwo);
}
Then you will get the result you expect.
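The effect can be reproduced without the extension method. A small sketch (AliasingDemo and its methods are made-up names for illustration) that adds 10 to the first element through each list entry, once with a shared array and once with a copy per entry:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of the original code.
public static class AliasingDemo
{
    // The same array instance is added three times, so every entry aliases it.
    public static decimal SharedArrayResult()
    {
        var shared = new decimal[] { 1, 2, 3, 4 };
        var list = new List<decimal[]>();
        for (var i = 0; i < 3; i++)
        {
            list.Add(shared); // three references to one array
        }
        foreach (var row in list)
        {
            row[0] += 10; // hits the same element every time
        }
        return list[0][0]; // 1 + 10 + 10 + 10 = 31
    }

    // Each entry gets its own copy, so the additions stay separate.
    public static decimal CopiedArrayResult()
    {
        var template = new decimal[] { 1, 2, 3, 4 };
        var list = new List<decimal[]>();
        for (var i = 0; i < 3; i++)
        {
            list.Add((decimal[])template.Clone()); // a fresh array per entry
        }
        foreach (var row in list)
        {
            row[0] += 10;
        }
        return list[0][0]; // 1 + 10 = 11
    }
}
```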
Change the following, so that the inner loop uses the length of the current array rather than the first one:
for (var j = 0; j < thisList[0].Length; j++)
{
}
to:
for (var j = 0; j < thisList[i].Length; j++)
{
}
I have written the following code but it looks to be far from efficient.
//Find largest in tempRankingData
int largestIntempRankingData = tempRankingData[0, 0];
for (int i = 0; i < count; i++)
{
for (int j = 0; j < count; j++)
{
if (tempRankingData[i, j] > largestIntempRankingData)
{
largestIntempRankingData = tempRankingData[i, j];
}
}
}
//Find position of largest in tempRankingData
List<string> positionLargestIntempRankingData = new List<string>();
for (int i = 0; i < count; i++)
{
for (int j = 0; j < count; j++)
{
if (tempRankingData[i, j] == largestIntempRankingData)
{
positionLargestIntempRankingData.Add(i + "," + j);
}
}
}
//Find largest in each column
int largestInColumn = 0;
List<string> positionOfLargestInColumn = new List<string>();
Dictionary<int, List<string>> position = new Dictionary<int, List<string>>();
for (int i = 0; i < count; i++)
{
largestInColumn = tempRankingData[0, i];
positionOfLargestInColumn = new List<string>();
for (int j = 0; j < count; j++)
{
if (tempRankingData[j, i] > largestInColumn)
{
largestInColumn = tempRankingData[j, i];
}
}
for (int j = 0; j < count; j++)
{
if (tempRankingData[j, i] == largestInColumn)
{
positionOfLargestInColumn.Add(j + "," + i);
}
}
position.Add(i, positionOfLargestInColumn);
}
What is the most efficient way to do this?
Whilst you're finding the largest in each column, you could also be finding the largest overall. You can also capture the positions as you go:
//Find largest in each column
int largestInColumn = 0;
int largestOverall = int.MinValue;
List<string> positionOfLargestInColumn;
Dictionary<int, List<string>> position = new Dictionary<int, List<string>>();
List<string> positionLargestIntempRankingData = new List<string>();
for (int i = 0; i < count; i++)
{
largestInColumn = tempRankingData[0, i];
positionOfLargestInColumn = new List<string>();
positionOfLargestInColumn.Add("0," + i);
for (int j = 1; j < count; j++)
{
if (tempRankingData[j, i] > largestInColumn)
{
largestInColumn = tempRankingData[j, i];
positionOfLargestInColumn.Clear();
positionOfLargestInColumn.Add(j + "," + i);
}
else if(tempRankingData[j,i] == largestInColumn)
{
positionOfLargestInColumn.Add(j + "," + i);
}
}
position.Add(i, positionOfLargestInColumn);
if(largestInColumn > largestOverall)
{
positionLargestIntempRankingData.Clear();
positionLargestIntempRankingData.AddRange(positionOfLargestInColumn);
largestOverall = largestInColumn;
}
else if(largestInColumn == largestOverall)
{
positionLargestIntempRankingData.AddRange(positionOfLargestInColumn);
}
}
1) You can find the largest element and its position in a single method and return both.
Whether the caller of your method cares about the position or the actual value depends on the concrete case.
2) You can use the yield return technique in your column-based matrix search, so you do not have to compute every column's maximum up front and push them all into a dictionary. Dictionaries are not as fast as arrays; if you can avoid them, do so.
3) You can keep the matrix in a single one-dimensional array and overload the [] access operator to emulate matrix access. Why? If finding the maximum is something you need to do frequently during the program run, having one foreach loop is faster than having two nested ones. For big matrices, a search over a single array can also be easily parallelized across cores.
If big matrices and/or frequent calls are not your concern, just simplify your code as in points (1) and (2).
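Point (3) could look roughly like the sketch below. FlatMatrix and FindMax are hypothetical names, the row-major layout is an assumption, and only the first position of the maximum is returned.

```csharp
using System;

// Hypothetical wrapper: a matrix stored row by row in one flat array.
public class FlatMatrix
{
    private readonly int[] _data;
    public int Rows { get; }
    public int Cols { get; }

    public FlatMatrix(int rows, int cols)
    {
        Rows = rows;
        Cols = cols;
        _data = new int[rows * cols];
    }

    // Indexer that emulates matrix[row, col] access.
    public int this[int row, int col]
    {
        get { return _data[row * Cols + col]; }
        set { _data[row * Cols + col] = value; }
    }

    // Finds the largest value and its first position with a single loop.
    public (int Value, int Row, int Col) FindMax()
    {
        if (_data.Length == 0)
            throw new InvalidOperationException("Matrix is empty.");

        int best = 0; // flat index of the largest value seen so far
        for (int k = 1; k < _data.Length; k++)
        {
            if (_data[k] > _data[best]) best = k;
        }

        // Convert the flat index back to row and column.
        return (_data[best], best / Cols, best % Cols);
    }
}
```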
You could replace your first two loops with this:
//Find largest in tempRankingData
int largestIntempRankingData = tempRankingData[0, 0];
List<KeyValuePair<int,string>> list = new List<KeyValuePair<int,string>>();
for (int i = 0; i < count; i++)
{
for (int j = 0; j < count; j++)
{
//Use >= so that [0, 0] and any ties with the current largest are recorded too
if (tempRankingData[i, j] >= largestIntempRankingData)
{
largestIntempRankingData = tempRankingData[i, j];
list.Add(new KeyValuePair<int, string>(largestIntempRankingData, i + "," + j)); //Add the value and the position
}
}
}
//This gives a list of strings that hold the positions of largestIntempRankingData, for example "3,3"
//Only positions where the key is equal to the final largestIntempRankingData are kept
List<string> positionLargestIntempRankingData = list.Where(w => w.Key == largestIntempRankingData).Select(s => s.Value).ToList();
You can get all these pieces of information in a single scan with a little fiddling around. Something like this (converting the rows and columns to a string is trivial and better done at the end anyway):
int? largestSoFar = null; // you could populate this with myMatrix[0,0]
// but it would fail if the matrix is empty
int largestCol = 0;
int largestRow = 0;
int?[] largestPerColumn = new int?[numOfCols]; // You could also populate this with
// the values from the first row but
// it would fail if there are no rows
int[] largestColumnRow = new int[numOfCols];
for (int i = 0; i < numOfRows; i++)
{
for (int j = 0; j < numOfCols; j++)
{
// A comparison against null is always false, so test for null explicitly
if (largestSoFar == null || largestSoFar < myMatrix[i,j])
{
largestSoFar = myMatrix[i,j];
largestCol = j;
largestRow = i;
}
if (largestPerColumn[j] == null || largestPerColumn[j] < myMatrix[i,j])
{
largestPerColumn[j] = myMatrix[i,j];
largestColumnRow[j] = i;
}
}
}
// largestSoFar is the biggest value in the whole matrix
// largestCol and largestRow is the column and row of the largest value in the matrix
// largestPerColumn[j] is the largest value in the jth column
// largestColumnRow[j] is the row of the largest value of the jth column
If you do need to capture all the "maxima" (for want of a better word, because that's not really what you are doing) in a column, you could just change the above code to something like this:
int? largestSoFar = null; // you could populate this with myMatrix[0,0]
// but it would fail if the matrix is empty
int largestCol = 0;
int largestRow = 0;
int?[] largestPerColumn = new int?[numOfCols]; // You could also populate this with
// the values from the first row but
// it would fail if there are no rows
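List<int>[] largestColumnRow = new List<int>[numOfCols];
for (int j = 0; j < numOfCols; j++)
{
largestColumnRow[j] = new List<int>(); // each column needs its own list
}
for (int i = 0; i < numOfRows; i++)
{
for (int j = 0; j < numOfCols; j++)
{
// A comparison against null is always false, so test for null explicitly
if (largestSoFar == null || largestSoFar < myMatrix[i,j])
{
largestSoFar = myMatrix[i,j];
largestCol = j;
largestRow = i;
}
if (largestPerColumn[j] == null || largestPerColumn[j] < myMatrix[i,j])
{
largestPerColumn[j] = myMatrix[i,j];
largestColumnRow[j].Add(i);
}
}
}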
// Now largestColumnRow[j] gives you a list of all the places where you found a larger
// value for the jth column
I am reading a CSV file which has the column names in the first line and the values in the lines after it. I need to get the position of each column name. The only way I can think of is to use either a switch or a chain of ifs. I read somewhere that in my case it is faster (better) to use the ifs. However, the file has many columns (~120). I am just wondering if there are any alternatives.
private static void Get_Position(string line, performance p)
{
string[] line_split = line.Split(',');
for (int i = 0; i < line_split.Length; i++)
{
if (line_split[i].Contains(@"(0)\% Processor Time"))
{
p.percore[0] = i;
}
else if (line_split[i].Contains(@"(1)\% Processor Time"))
{
p.percore[1] = i;
}
else if (line_split[i].Contains("Private Bytes"))
{}
else if (line_split[i].Contains("DPC"))
{
}
//on and on and on with else ifs
What is preventing you from using a loop?
for (int i = 0; i < line_split.Length; i++)
{
for(var j = 0; j < 120; j++)
{
if(line_split[i].Contains(@"(" + j + @")\% Processor Time"))
{
p.percore[j] = i;
}
}
...
To maintain the same behaviour as the if/else if chain, you could use a break inside the conditional.
Edit: The edit now made it clear that there is no clear pattern to the string in contains. Still, if you are writing out 120 if/else if statements you should store what you will be looking for in some type of collection. For example, a List would work. Then access the index j of the collection in your loop:
...
var listOfSearchItems = new List<string>() { "Private Bytes", "DPC" };
for (int i = 0; i < line_split.Length; i++)
{
for(var j = 0; j < listOfSearchItems.Count; j++)
{
if(line_split[i].Contains(listOfSearchItems[j]))
{
p.percore[j] = i;
}
}
...
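Another option is to return the positions in a dictionary keyed by the search term, so the caller does not depend on the order of the collection. This is only a sketch under assumptions: HeaderLookup and FindColumnPositions are hypothetical names, and it assumes the first column containing a term is the one you want.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original code.
public static class HeaderLookup
{
    // Maps each search term to the index of the first column whose name contains it.
    // Terms that match no column are simply left out of the result.
    public static Dictionary<string, int> FindColumnPositions(
        string headerLine, IEnumerable<string> searchTerms)
    {
        string[] columns = headerLine.Split(',');
        var positions = new Dictionary<string, int>();
        foreach (string term in searchTerms)
        {
            for (int i = 0; i < columns.Length; i++)
            {
                if (columns[i].Contains(term))
                {
                    positions[term] = i;
                    break; // stop at the first matching column for this term
                }
            }
        }
        return positions;
    }
}
```

A lookup such as positions["Private Bytes"] then gives the column index, and positions.ContainsKey(...) tells you whether a term was found at all.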