Computing the most frequent element - C#

I recently came across this piece of code. It goes through an array and is supposed to return the value that occurs most often. For example, for 1,1,2,1,3 it returns 1, because 1 appears more often than 2 and 3. I am trying to understand how it works, so I stepped through it in Visual Studio, but it is not ringing any bells.
Can anyone help me understand what is going on here? It would be a big plus if someone could explain what c does and what the logic is behind the conditions in the if statements.
int[] arr = a;
int c = 1, maxcount = 1, maxvalue = 0;
int result = 0;
for (int i = 0; i < arr.Length; i++)
{
    maxvalue = arr[i];
    for (int j = 0; j < arr.Length; j++)
    {
        if (maxvalue == arr[j] && j != i)
        {
            c++;
            if (c > maxcount)
            {
                maxcount = c;
                result = arr[i];
            }
        }
        else
        {
            c = 1;
        }
    }
}
return result;

EDIT: On closer examination, the code snippet has a nested loop and counts in the conventional way: it keeps track of the highest count seen so far and the element that produced it, and keeps the two in sync.
At first glance, that looks like an implementation of the Boyer-Moore majority vote algorithm. They have a nice illustration here.
The idea of that algorithm is simple: it computes the majority element in a single pass, taking O(n) time. Note that majority here means that the element must fill more than 50% of the array. If there is no majority element, you get an "incorrect" result.
Verifying that the element actually forms a majority is usually done in a separate pass.
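For comparison, here is a minimal sketch of the Boyer-Moore majority vote described above, including the separate verification pass. The class and method names are illustrative and are not taken from the question's code.

```csharp
public static class MajorityVote
{
    // Returns the majority element (one occurring more than n/2 times),
    // or null when no such element exists.
    public static int? FindMajority(int[] values)
    {
        int candidate = 0;
        int count = 0;

        // Pass 1: pair off differing elements; a true majority survives.
        foreach (int v in values)
        {
            if (count == 0)
            {
                candidate = v;
                count = 1;
            }
            else if (v == candidate)
            {
                count++;
            }
            else
            {
                count--;
            }
        }

        // Pass 2: verify, because pass 1 yields a candidate even without a majority.
        int occurrences = 0;
        foreach (int v in values)
        {
            if (v == candidate) occurrences++;
        }
        return occurrences > values.Length / 2 ? candidate : (int?)null;
    }
}
```

For the question's sample 1,1,2,1,3 this returns 1 (three of five elements). For 1,2,3 it returns null, because no value fills more than half of the array.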

It is not computing the most frequent element; what it is computing is the longest run of equal elements.
It is also not doing it very efficiently: the inner loop only needs to go up to i-1, not up to arr.Length.
c keeps track of the current run length. The first "if" checks whether this is a "continuous run". The second "if" (evaluated each time the run grows) checks whether this run is longer than any run you have seen so far.
In the above input sample, you get 1 as the answer because it forms the longest run. Try an input where the element with the longest run is not the same as the most frequent element (e.g., 2,1,1,1,3,2,3,2,3,2,3,2: here 2 is the most frequent element, but 1,1,1 is the longest run).
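To make the difference between the two notions concrete, here is a small sketch with one method for each. The names Frequency, MostFrequent and LongestRunValue are made up for this illustration, and both methods return 0 for an empty array, which is an arbitrary choice.

```csharp
using System.Collections.Generic;

public static class Frequency
{
    // Most frequent value: count every value and keep the largest count.
    // Ties resolve to the value that reaches the winning count first.
    public static int MostFrequent(int[] values)
    {
        var counts = new Dictionary<int, int>();
        int best = 0, bestCount = 0;
        foreach (int v in values)
        {
            counts.TryGetValue(v, out int c); // c is 0 when v has not been seen yet
            counts[v] = ++c;
            if (c > bestCount)
            {
                bestCount = c;
                best = v;
            }
        }
        return best;
    }

    // Value that forms the longest run of equal adjacent elements.
    public static int LongestRunValue(int[] values)
    {
        int best = 0, bestLength = 0, runLength = 0;
        for (int i = 0; i < values.Length; i++)
        {
            // Extend the run if this element repeats the previous one.
            runLength = (i > 0 && values[i] == values[i - 1]) ? runLength + 1 : 1;
            if (runLength > bestLength)
            {
                bestLength = runLength;
                best = values[i];
            }
        }
        return best;
    }
}
```

With the input 2,1,1,1,3,2,3,2,3,2,3,2, MostFrequent returns 2 (it occurs five times) while LongestRunValue returns 1 (the run 1,1,1). Both make a single pass over the array.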

Related

time complexity of Method

I want to learn about big-O notation. I hope someone can help me count the operations in this method, tell me what its time complexity is, and teach me how to count them. I tried to study this on YouTube and got a bit confused.
static void SelectionSort(int[] data)
{
    int temp, min;
    for (int i = 0; i < data.Length - 1 ; i++)
    {
        min = i;
        for (int j = 0; j < data.Length; j++)
        {
            if (data[j] < data[min])
            {
                min = j;
            }
            temp = data[min];
            data[min] = data[i];
            data[i] = temp;
        }
    }
}

First of all, a note on terminology: a method is simply a function declared inside a class, and in C# every function lives inside a type, so SelectionSort is a (static) method.
The time complexity of this algorithm is O(n^2) because of the nested for loops, which means that it takes on the order of n^2 operations to finish.
For example, for an array of length 10 it takes about 100 steps. That is not an exact number, but it tells you a lot: for an array of length 100 it takes about 10000 steps, so 10 times more input takes about 100 times longer.
So the lower the time complexity, the faster the algorithm.
To learn about time complexity, check out this video; it helps a lot:
https://www.youtube.com/watch?v=6aDHWSNKlVw&t=6s

The time complexity is O(n^2) for this one, because you have one loop nested inside another.
The outer loop iterates about n times, and for each of those iterations the inner loop iterates n times again.
https://adrianmejia.com/most-popular-algorithms-time-complexity-every-programmer-should-know-free-online-tutorial-course/#Bubble-sort
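One way to practise counting is to let the code count for you. The sketch below reuses the loop bounds from the posted SelectionSort and only counts how often the inner-loop body runs; the class and method names are invented for this example.

```csharp
public static class LoopCount
{
    // Counts how often the inner-loop body of the posted SelectionSort runs
    // for an array of length n, using the same loop bounds as the question.
    public static long InnerIterations(int n)
    {
        long steps = 0;
        for (int i = 0; i < n - 1; i++)      // outer loop: n - 1 passes
        {
            for (int j = 0; j < n; j++)      // inner loop: n passes per outer pass
            {
                steps++;
            }
        }
        return steps;
    }
}
```

InnerIterations(10) gives 90 and InnerIterations(100) gives 9900, that is (n - 1) * n = n^2 - n. For large n the n^2 term dominates, which is why the constant factors and lower-order terms are dropped and the method is called O(n^2).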

Calculating the approximate run time of a for loop

I have a piece of code in my C# Windows Form Application that looks like below:
List<string> RESULT_LIST = new List<string>();
int[] arr = My_LIST.ToArray();
string s = "";
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < arr.Length; i++)
{
    int counter = i;
    for (int j = 1; j <= arr.Length; j++)
    {
        counter++;
        if (counter == arr.Length)
        {
            counter = 0;
        }
        s += arr[counter].ToString();
        RESULT_LIST.Add(s);
    }
    s = "";
}
sw.Stop();
TimeSpan ts = sw.Elapsed;
string elapsedTime = String.Format("{0:00}", ts.TotalMilliseconds * 1000);
MessageBox.Show(elapsedTime);
I use this code to get every combination of the numbers in my list, treating My_LIST as circular (the index wraps around to the start). The image below demonstrates my purpose very clearly:
All I need to do is:
Derive a formula for the approximate run time of these two nested for loops, so that I can estimate the run time for any length and tell the user roughly how long he/she must wait.
I have used a C# Stopwatch like this: Stopwatch sw = new Stopwatch(); to show the run time, and below are the results. (To reduce the chance of error I repeated the measurement three times for each length. The numbers are the values printed by the code above for the first, second and third attempt respectively; since it prints TotalMilliseconds * 1000, they are microseconds rather than nanoseconds.)
arr.Length = 400; 127838 - 107251 - 100898
arr.Length = 800; 751282 - 750574 - 739869
arr.Length = 1200; 2320517 - 2136107 - 2146099
arr.Length = 2000; 8502631 - 7554743 - 7635173
Note that My_LIST contains only one-digit numbers, so that the time to add each number to the list is approximately equal.
How can I find out the relation between arr.Length and run time?

First, let's suppose you have examined the algorithm and noticed that it appears to be quadratic in the array length. This suggests to us that the time taken to run should be a function of the form
t = A + B n + C n^2
You've gathered some observations by running the code multiple times with different values for n and measuring t. That's a good approach.
The question now is: what are the best values for A, B and C such that they match your observations closely?
This problem can be solved in a variety of ways; I would suggest to you that the least-squares method of regression would be the place to start, and see if you get good results. There's a page on it here:
www.efunda.com/math/leastsquares/lstsqr2dcurve.cfm
UPDATE: I just looked at your algorithm again and realized it is cubic because you have a quadratic string concat in the inner loop. So this technique might not work so well. I suggest you use StringBuilder to make your algorithm quadratic.
Now, suppose you did not know ahead of time that the problem was quadratic. How would you determine the formula then? A good start would be to graph your points on log scale paper; if they roughly form a straight line then the slope of the line gives you a clue as to the power of the polynomial. If they don't form a straight line -- well, cross that bridge when you come to it.

Well, you are going to have to do some math here.
The total number of inner iterations is exactly n^2: not O(n^2), but exactly n^2.
So you could keep a counter of the number of items processed so far and use it to extrapolate an estimate:
int numItemProcessed; // incremented in the inner loop
int timeElapsed; // read from the stopwatch
int totalItems = n * n;
int remainingEstimate = (int)(((float)(totalItems - numItemProcessed) / numItemProcessed) * timeElapsed);
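The same extrapolation can be sketched as a small helper that works with TimeSpan values. The names Progress and EstimateRemaining are hypothetical. The estimate assumes every item costs the same as the average so far, which is only roughly true for the question's loop because the string s grows during each pass.

```csharp
using System;

public static class Progress
{
    // Estimates the remaining time by assuming every remaining item costs
    // the same as the average of the items processed so far.
    // Returns null while nothing has been processed yet.
    public static TimeSpan? EstimateRemaining(TimeSpan elapsed, long processed, long total)
    {
        if (processed <= 0) return null;
        double perItemMs = elapsed.TotalMilliseconds / processed;
        return TimeSpan.FromMilliseconds(perItemMs * (total - processed));
    }
}
```

For example, after 2 seconds with 25 of 100 items done, the estimate is 6 more seconds. In the question's code, processed would be incremented in the inner loop and total would be arr.Length * arr.Length.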

Don't assume the algorithm is necessarily N^2 in time complexity.
Take the averages of your numbers, plot them on a log-log plot with a line of best fit, and measure the gradient. This will give you an idea of the largest term in the polynomial. (See the Wikipedia article on log-log plots.)
Once you have that, you can do a least-squares regression to work out the coefficients of the polynomial of the correct order. This will allow an estimate from the data, of the time taken for an unseen problem.
Note: As Eric Lippert said, it depends on what you want to measure - averaging may not be appropriate depending on your use case - the first run time might be more correct.
This method will work for any polynomial algorithm. It will also tell you if the algorithm is polynomial (non-polynomial running times will not give straight lines on the log-log plot).
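As a rough illustration of the log-log approach, the sketch below fits a straight line to log(t) against log(n) by least squares and returns its slope. The class and method names are invented for this example; the data are the per-length averages of the three timings posted in the question.

```csharp
using System;
using System.Linq;

public static class GrowthEstimate
{
    // Least-squares slope of log(t) against log(n).
    // If t is roughly C * n^k, the slope approximates the exponent k.
    public static double LogLogSlope(double[] n, double[] t)
    {
        double[] x = n.Select(v => Math.Log(v)).ToArray();
        double[] y = t.Select(v => Math.Log(v)).ToArray();
        double meanX = x.Average();
        double meanY = y.Average();

        double covariance = 0, variance = 0;
        for (int i = 0; i < x.Length; i++)
        {
            covariance += (x[i] - meanX) * (y[i] - meanY);
            variance += (x[i] - meanX) * (x[i] - meanX);
        }
        return covariance / variance;
    }

    // Slope for the averaged timings reported in the question.
    public static double SlopeForQuestionData()
    {
        double[] lengths = { 400, 800, 1200, 2000 };
        double[] times =
        {
            (127838 + 107251 + 100898) / 3.0,
            (751282 + 750574 + 739869) / 3.0,
            (2320517 + 2136107 + 2146099) / 3.0,
            (8502631 + 7554743 + 7635173) / 3.0,
        };
        return LogLogSlope(lengths, times);
    }
}
```

For the posted measurements the slope comes out between 2 and 3, which fits the observation that the loop does more than purely quadratic work. With only four data points this is an indication, not a precise exponent.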

What would be the shortest way to sum up the digits in odd and even places separately

I've always loved reducing number of code lines by using simple but smart math approaches. This situation seems to be one of those that need this approach. So what I basically need is to sum up digits in the odd and even places separately with minimum code. So far this is the best way I have been able to think of:
string number = "123456789";
int sumOfDigitsInOddPlaces = 0;
int sumOfDigitsInEvenPlaces = 0;
for (int i = 0; i < number.length; i++) {
    if (i % 2 == 0) // Means odd ones
        sumOfDigitsInOddPlaces += number[i];
    else
        sumOfDigitsInEvenPlaces += number[i];
}
// The rest is not important
Do you have a better idea? Something that does not need if/else?

int* sum[2] = {&sumOfDigitsInOddPlaces,&sumOfDigitsInEvenPlaces};
for (int i=0;i<number.length;i++)
{
*(sum[i&1])+=number[i];
}

You could use two separate loops, one for the odd-indexed digits and one for the even-indexed digits.
Also, your modulus condition may be wrong: you're placing the even-indexed digits (0,2,4...) in the odd accumulator. It could just be that you're treating the number as 1-based while the array is 0-based (maybe that is what you intended), but for the algorithm's sake I will treat the number as 0-based.
Here's my proposal:
number = 123456789;
sumOfDigitsInOddPlaces = 0;
sumOfDigitsInEvenPlaces = 0;
// even digits
for (int i = 0; i < number.length; i = i + 2) {
    sumOfDigitsInEvenPlaces += number[i];
}
// odd digits, note the start at j = 1
for (int j = 1; j < number.length; j = j + 2) {
    sumOfDigitsInOddPlaces += number[j];
}
On the large scale this doesn't improve efficiency, since it is still an O(N) algorithm, but it eliminates the branching.
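A branch-free variant in C# can also index a two-element array by the parity of i. This is only a sketch: the names are made up, it uses 0-based indexing like the answer above, and it assumes the string contains nothing but the characters '0' to '9'.

```csharp
public static class DigitSums
{
    // Returns { sum of digits at even indices, sum of digits at odd indices },
    // using 0-based indexing. Assumes the string contains only '0'..'9'.
    public static int[] SumByIndexParity(string number)
    {
        var sums = new int[2];
        for (int i = 0; i < number.Length; i++)
        {
            // i & 1 is 0 for even indices and 1 for odd ones, so no if/else is needed.
            // Subtracting '0' turns the character into its numeric digit value.
            sums[i & 1] += number[i] - '0';
        }
        return sums;
    }
}
```

For "123456789" this yields 25 for the even indices (1+3+5+7+9) and 20 for the odd indices (2+4+6+8). Note the conversion with - '0': adding number[i] directly would add character codes rather than digit values.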

Since you added C# to the question:
// requires: using System.Linq;
var numString = "123456789";
var odds = numString.Where((c, i) => i % 2 == 1);
var evens = numString.Where((c, i) => i % 2 == 0);
var sumOfOdds = odds.Sum(c => c - '0');   // c - '0' converts a digit character to its value
var sumOfEvens = evens.Sum(c => c - '0');

Do you like Python?
num_string = "123456789"
odds = sum(map(int, num_string[::2]))
evens = sum(map(int, num_string[1::2]))

This Java solution requires no if/else, has no code duplication and is O(N):
number = "123456789";
int[] sums = new int[2]; //sums[0] == sum of even digits, sums[1] == sum of odd
for(int arrayIndex=0; arrayIndex < 2; ++arrayIndex)
{
for (int i=0; i < number.length()-arrayIndex; i += 2)
{
sums[arrayIndex] += Character.getNumericValue(number.charAt(i+arrayIndex));
}
}

Assuming number.length is even, it is quite simple. The only corner case is the last element when number.length is odd.
int i=0;
while(i<number.length-1)
{
sumOfDigitsInEvenPlaces += number[ i++ ];
sumOfDigitsInOddPlaces += number[ i++ ];
}
if( i < number.length )
sumOfDigitsInEvenPlaces += number[ i ];
Because the loop advances i two at a time, subtracting 1 changes nothing if number.length is even.
If number.length is odd, it excludes the last item from the loop.
If number.length is odd, the value of i when the loop exits is the index of the last, not yet visited element.
If number.length is odd, by modulo 2 reasoning, you have to add that last item to sumOfDigitsInEvenPlaces.
This seems slightly more verbose, but also more readable, to me than Anonymous' (accepted) answer. However, benchmarks to come.
Well, the compiler seems to find my code more understandable as well, since it removes all of it if I don't print the results (which explains why I kept getting a time of 0 all along...). The other code, though, is obfuscated enough to defeat even the compiler.
In the end, even with huge arrays, it's pretty hard for clock_t to tell the difference between the two. You get about a third fewer instructions in the second case, but since everything is in cache (and your running sums are even in registers) it doesn't matter much.
For the curious, I've put the disassembly of both versions (compiled from C) here : http://pastebin.com/2fciLEMw

While Looping an Array

I'm working through a book by Don Gosselin, ASP.NET Programming with Visual C#. To solve one exercise I simply used two while loops: one assigns a number to each array element, the other displays the array. In total the array should display 1 through 100. This should have worked but didn't. The Visual Studio 2013 debugger for some reason shows count = 100, and that's where it's failing.
<%
int count = 0;
int[] numbers = new int[100];
while (count <= 100)
{
numbers[count] = count;
++count;
}
while (count <= 100)
{
Response.Write(numbers[count] + "<br />");
++count;
}
%>

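You should set count to 0 after the first while loop. Also note that the array has 100 elements (indices 0 to 99), so the condition has to be count < 100; with count <= 100 the access to numbers[100] throws an IndexOutOfRangeException:
int count = 0;
int[] numbers = new int[100];
while (count < 100)
{
    numbers[count] = count;
    ++count;
}
count = 0;
while (count < 100)
{
    Response.Write(numbers[count] + "<br />");
    ++count;
}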

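You need to reset count to 0 before you attempt the next while statement. As written, count <= 100 lets count reach 100, and numbers[100] is out of bounds for a 100-element array, so the first loop throws before it finishes. Once that condition is changed to count < 100, the first loop ends with count equal to 100, so the second loop's condition is already false and its body never runs. Just set count = 0; before the second while loop.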
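Putting both fixes together, a version that can be run outside ASP.NET might look like the sketch below. It collects the output in a StringBuilder instead of calling Response.Write (an assumption made so the example is self-contained), and it uses numbers.Length rather than a hard-coded 100.

```csharp
using System.Text;

public static class WhileLoopExercise
{
    // Fills an array with 0..99 in one while loop and renders it in a second one.
    // Returns the HTML text instead of calling Response.Write so it can run anywhere.
    public static string Render()
    {
        int count = 0;
        int[] numbers = new int[100];
        while (count < numbers.Length)   // valid indices are 0..99
        {
            numbers[count] = count;
            ++count;
        }

        count = 0;                       // reset before reusing the counter
        var output = new StringBuilder();
        while (count < numbers.Length)
        {
            output.Append(numbers[count]).Append("<br />");
            ++count;
        }
        return output.ToString();
    }
}
```

As in the question's code, the values are 0 through 99. To display 1 through 100 instead, assign count + 1 to numbers[count].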

This seems like a very convoluted and unrealistic way of using while loops and arrays. In order to understand it better, it may be worth thinking about it per step.
var i = 0;
while (i < 100)
{
Response.Write(++i + "<br />");
}
The first important distinction is between i++ and ++i. The former yields the current value and then increments it by one; the latter increments the value first and then yields the result.
In C#, a List is often more convenient to work with than a raw array. Both are zero-indexed and bounds-checked: accessing an index outside the bounds throws an exception (IndexOutOfRangeException for an array, ArgumentOutOfRangeException for a List) rather than silently reading other memory. The practical difference is that an array has a fixed length, whereas a List is dynamically sized and grows as you Add elements, so you do not need to know the size up front. The most commonly used collection is List.
var i = 1;
var list = new List<int>();
while (i <= 100)
{
list.Add(i++);
}
For the second loop, a while loop is not really suitable in any practical example. The exercise is forcing while loops where they are not needed. In this instance, the aim is to iterate through each element in the array (List) and write its contents to the page. Because we want to perform an action for each element, a while loop with a hard-coded bound may cause issues. If the array has fewer than 100 elements, the loop throws an out-of-range exception; if it has more than 100 elements, we'll miss some of them. By using a foreach loop instead of a while, we can eliminate these potential errors.
foreach (var num in list)
{
Response.Write(num + "<br />");
}
Now, I realise that the exercise is about while loops; however, it is teaching you to use them in the wrong way. A much better way, and how you'll most often use them, is to repeat an action until a particular condition is met, rather than for simple iteration. By this I mean: inside the while loop we manipulate a variable, then the loop condition is tested again, and as long as it still holds, we go round again. A common example of this is working out the factorial of a number.
var input = 5;
var num = input;
var factorial = 1;
while (num > 1)
{
    factorial *= num--;
}
Response.Write(String.Format("{0}! = {1}", input, factorial));
The other main way in which while loops are used is to force an infinite loop that only ends when a break condition is met. I'll show a very arbitrary use of this here, but a real-world example would be the loop() method in Arduino C coding, or an HTTP listener that constantly repeats the same procedure until stopped.
var stop = 13;
var random = new Random(); // create once; a new Random per iteration can repeat values
Response.Write("Pick a number between 1 and 100...<br /><br />");
while (true)
{
    var num = random.Next(1, 101);
    Response.Write(num + " ..... ");
    if (num == stop) break;
    Response.Write("You got lucky!<br />");
}
Response.Write("Unlucky for you!");
The best way to learn these things is to practise them. Pick a task and find out just how many ways there are to complete it. There is one last important distinction to mention, though. A while loop tests the condition at the beginning of the loop. A do while loop tests the condition at the end.
while(false)
{
// This code will never be run.
}
Compared to:
do
{
    // This code will be run once only.
}
while (false);
As a final thought, here's how I'd write the original code (using the ForEach method of List):
var numbers = new List<int>();
for (var count = 1; count <= 100; count++)
{
    numbers.Add(count);
}
numbers.ForEach(num => Response.Write(num + "<br />"));

Deleting from array, mirrored (strange) behavior

The title may seem a little odd, because I have no idea how to describe this in one sentence.
For the Algorithms course we have to micro-optimize some things; one of them is finding out how deleting from an array works. The assignment is to delete something from an array and re-align the contents so that there are no gaps. I think it is quite similar to how std::vector::erase works in C++.
Because I like the idea of understanding everything low-level, I went a little further and tried to bench my solutions. This presented some weird results.
At first, here is a little code that I used:
class Test {
    Stopwatch sw;
    Obj[] objs;

    public Test() {
        this.sw = new Stopwatch();
        this.objs = new Obj[1000000];
        // Fill objs
        for (int i = 0; i < objs.Length; i++) {
            objs[i] = new Obj(i);
        }
    }

    public void test() {
        // Time deletion
        sw.Restart();
        deleteValue(400000, objs);
        sw.Stop();
        // Show timings
        Console.WriteLine(sw.Elapsed);
    }

    // Delete function
    // value is the to-search-for item in the list of objects
    private static void deleteValue(int value, Obj[] list) {
        for (int i = 0; i < list.Length; i++) {
            if (list[i].Value == value) {
                for (int j = i; j < list.Length - 1; j++) {
                    list[j] = list[j + 1];
                    //if (list[j + 1] == null) {
                    //    break;
                    //}
                }
                list[list.Length - 1] = null;
                break;
            }
        }
    }
}
I would just create this class and call the test() method. I did this in a loop for 25 times.
My findings:
The first round takes a lot longer than the other 24. I think this is because of caching, but I am not sure.
When I use a value that is at the start of the list, it has to move more items in memory than when I use a value at the end, yet it still seems to take less time.
Benchmark times differ quite a bit between runs.
When I enable the commented-out if, performance goes up (10-20%), even if the value I search for is almost at the end of the list (which means the if is evaluated many times without actually being useful).
I have no idea why these things happen. Is there someone who can explain (some of) them? And if someone who is a pro at this sees it: where can I find more information on how to do this in the most efficient way?
Edit after testing:
I did some testing and found some interesting results. I run the test on an array with a size of a million items, filled with a million objects. I run that 25 times and report the cumulative time in milliseconds. I do that 10 times and take the average of that as a final value.
When I run the test with my function described just above here I get a score of:
362,1
When I run it with the answer of dbc I get a score of:
846,4
So mine was faster, but then I started to experiment with a half-empty array and things started to get weird. To get rid of the inevitable NullReferenceExceptions I added an extra check to the if (thinking it would cost a bit more performance), like so:
if (fromItem != null && fromItem.Value != value)
    list[to++] = fromItem;
This seemed to not only work, but improve performance dramatically! Now I get a score of:
247,9
The weird thing is, the scores seem too low to be true, but they sometimes spike. This is the set I took the average from:
94, 26, 966, 36, 632, 95, 47, 35, 109, 439
So the extra evaluation seems to improve my performance, despite doing an extra check. How is this possible?

You are using Stopwatch to time your method. This calculates the total clock time taken during your method call, which could include the time required for .Net to initially JIT your method, interruptions for garbage collection, or slowdowns caused by system loads from other processes. Noise from these sources will likely dominate noise due to cache misses.
This answer gives some suggestions as to how you can minimize some of the noise from garbage collection or other processes. To eliminate JIT noise, you should call your method once without timing it -- or show the time taken by the first call in a separate column in your results table since it will be so different. You might also consider using a proper profiler which will report exactly how much time your code used exclusive of "noise" from other threads or processes.
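A minimal sketch of that advice: run the code once without timing it, collect several timed samples, and summarise them with the median, which is less affected by occasional spikes than the mean. The class Bench and its method names are invented for this illustration, and Median assumes at least one sample.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class Bench
{
    // Runs the action once untimed (to absorb JIT cost), then times it `runs` times.
    public static double[] TimeMilliseconds(Action action, int runs)
    {
        action(); // warm-up call, not measured
        var samples = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            samples[i] = sw.Elapsed.TotalMilliseconds;
        }
        return samples;
    }

    // Median of the samples; assumes the array is not empty.
    public static double Median(double[] samples)
    {
        double[] sorted = samples.OrderBy(s => s).ToArray();
        int mid = sorted.Length / 2;
        return sorted.Length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

Applied to the ten scores posted in the question (94, 26, 966, 36, 632, 95, 47, 35, 109, 439), the mean is 247.9 but the median is 94.5, which shows how strongly three spikes pull the average up. Keep in mind that deleteValue modifies the array, so each timed run needs a freshly filled array to measure the same work.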
Finally, I'll note that your algorithm to remove matching items from an array and shift everything else down uses a nested loop, which is not necessary and will access items in the array after the matching index twice. The standard algorithm looks like this:
public static void RemoveFromArray(this Obj[] array, int value)
{
    int to = 0;
    for (int from = 0; from < array.Length; from++)
    {
        var fromItem = array[from];
        if (fromItem.Value != value)
            array[to++] = fromItem;
    }
    for (; to < array.Length; to++)
    {
        array[to] = default(Obj);
    }
}
However, instead of using the standard algorithm, you might experiment with Array.Copy() in your version to shift the tail of the array down in a single call, since (I believe) internally it does the copy in unmanaged code.
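As a sketch of that experiment, the inner shifting loop can be replaced by one Array.Copy call. The Obj class below is a hypothetical stand-in for the question's type, and the null check is an addition so the method also works on a half-empty array. Whether this is actually faster than the managed loop would have to be measured.

```csharp
using System;

// Hypothetical stand-in for the question's Obj type.
public sealed class Obj
{
    public int Value { get; }
    public Obj(int value) { Value = value; }
}

public static class ArrayDelete
{
    // Removes the first element whose Value matches and shifts the tail left by one.
    // Returns true when an element was removed. Null slots are skipped.
    public static bool DeleteValue(int value, Obj[] list)
    {
        for (int i = 0; i < list.Length; i++)
        {
            if (list[i] != null && list[i].Value == value)
            {
                // Array.Copy handles overlapping source and destination ranges correctly.
                Array.Copy(list, i + 1, list, i, list.Length - i - 1);
                list[list.Length - 1] = null;
                return true;
            }
        }
        return false;
    }
}
```

Deleting the value 2 from an array holding the values 0 to 4 leaves 0, 1, 3, 4 followed by a null slot at the end.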
