Difference in declaring variable as an array index and "0"? - c#

I'm new to C#, so be ready for some dumb questions.
My current task is to compute a score from an array after the highest and lowest scores have been taken away. If the highest or lowest value occurs more than once (ONLY if it occurs more than once), one copy of it is added back:
e.g. int[] scores = { 4, 8, 6, 4, 8, 5 }, so the final sum is 4 + 8 + 6 + 5 = 23.
Another condition of the task is that LINQ cannot be used, nor any of the System.Array methods. (You can see from my previously asked questions that this has been a bit of a pain for me, since I solved this with LINQ in less than 5 minutes.)
So here is the problem: I have working code that solves the problem, but the task requires multiple methods/functions, so I cannot receive full marks if I have only 3 methods (including Main). I have been trying to restructure the program but keep running into all sorts of issues. Here is my code (just so I can explain it better):
using System;
using System.Collections.Generic;
//using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Scoring {
class Program {
static int highOccurrence = 0;
static int lowOccurrence = 0;
//static int high; <------
//static int low; <------
static void Main(string[] args) {
int[] scores = { 4, 8, 6, 4, 8, 5 };
findScore(scores);
ExitProgram();
}
static int findOccurrence(int[] scores, int low, int high) { //find the number of times a high/low occurs
for (int i = 0; i < scores.Length; i++) {
if (low == scores[i]) {
lowOccurrence++;
//record number of times low occurs
}
if (high == scores[i]) {
highOccurrence++;
//record number of times high occurs
}
}
return highOccurrence;
}
static int findScore(int[] scores) { //calculates score, needs to be restructured
int[] arrofNormal = new int[scores.Length];
//int low = scores[0]; <----This is where the issue is
//int high = scores[0]; <----- ^^^^^
int total = 0;
for (int i = 0; i < scores.Length; i++) {
if (low > scores[i]) {
low = scores[i];
} //record lowest value
if (high < scores[i]) {
high = scores[i];
//record highest value
}
}
for (int x = 0; x < scores.Length; x++) {
if (scores[x] != low && scores[x] != high) {
arrofNormal[x] = scores[x];
//provides the total of the scores (not including the high and the low)
}
total += arrofNormal[x];
}
findOccurrence(scores, low, high);
if (highOccurrence > 1) { //if there is more than 1 high (or 1 low) it is added once into the total
total += high;
if (lowOccurrence > 1) {
total += low;
}
}
Console.WriteLine("Sum = " + total);
return total; //remove not all code paths return.. error
}
static void ExitProgram() {
Console.Write("\n\nPress any key to exit program: ");
Console.ReadKey();
}//end ExitProgram
}
}
I have placed arrows in the code above to show where my issue is. If I declare "high" and "low" as global (static) variables, my final answer is always a few numbers off, but if I leave them declared locally as "high = scores[0]" etc., I get the right answer.
What I ideally want is a separate method for each step of the calculation. Right now I have a method for finding the number of times a specific value shows up in the array. Next I would like one method for finding the highest/lowest value in the array, one method for the final calculation, and a final one that writes the result to the console window. The last two parts (finding the high/low and the final calculation) are currently in the findScore method.
Any help would be greatly appreciated. Thanks.
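A likely explanation for the difference, offered as a sketch rather than a definitive answer: a static int field that is never assigned starts at 0, so with the static declarations "low" begins at 0, no score is ever smaller than it, and the real lowest score is never found. Seeding from scores[0] avoids that. The split into methods below is one possible layout; the names ScoringSketch, FindLow, FindHigh, CountOccurrences and FindScore are hypothetical, and the code assumes a non-empty array whose highest and lowest values differ.

```csharp
using System;

static class ScoringSketch
{
    // Lowest value; seeded from the first element rather than from 0.
    public static int FindLow(int[] scores)
    {
        int low = scores[0];
        for (int i = 1; i < scores.Length; i++)
        {
            if (scores[i] < low) low = scores[i];
        }
        return low;
    }

    // Highest value; seeded the same way.
    public static int FindHigh(int[] scores)
    {
        int high = scores[0];
        for (int i = 1; i < scores.Length; i++)
        {
            if (scores[i] > high) high = scores[i];
        }
        return high;
    }

    // Number of times a given value appears in the array.
    public static int CountOccurrences(int[] scores, int value)
    {
        int count = 0;
        for (int i = 0; i < scores.Length; i++)
        {
            if (scores[i] == value) count++;
        }
        return count;
    }

    // Sum of everything except the high and low values, plus one copy
    // of the high and/or the low when it occurs more than once.
    public static int FindScore(int[] scores)
    {
        int low = FindLow(scores);
        int high = FindHigh(scores);
        int total = 0;
        for (int i = 0; i < scores.Length; i++)
        {
            if (scores[i] != low && scores[i] != high) total += scores[i];
        }
        if (CountOccurrences(scores, high) > 1) total += high;
        if (CountOccurrences(scores, low) > 1) total += low;
        return total;
    }
}
```

With the array from the question, Console.WriteLine("Sum = " + ScoringSketch.FindScore(new[] { 4, 8, 6, 4, 8, 5 })); prints Sum = 23. Passing values through parameters and return values also removes the need for the static counters.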

Related

How to optimize loops going through arrays?

I've been going through www.testdome.com to test my skills and opened a list of public questions. One of the practice questions was:
Implement function CountNumbers that accepts a sorted array of
integers and counts the number of array elements that are less than
the parameter lessThan.
For example, SortedSearch.CountNumbers(new int[] { 1, 3, 5, 7 }, 4)
should return 2 because there are two array elements less than 4.
And my answer was:
using System;
public class SortedSearch
{
public static int CountNumbers(int[] sortedArray, int lessThan)
{
int count = 0;
int l = sortedArray.Length;
for (int i = 0; i < l; i++) {
if (sortedArray [i] < lessThan)
count++;
}
return count;
}
public static void Main(string[] args)
{
Console.WriteLine(SortedSearch.CountNumbers(new int[] { 1, 3, 5, 7 }, 4));
}
}
It seems that I've failed on two counts:
Performance test when sortedArray contains lessThan: Time limit exceeded
and
Performance test when sortedArray doesn't contain lessThan: Time limit exceeded
To be honest, I'm not sure what to optimize there. Maybe I'm using the wrong approach and there is a way to speed up the calculation?
If someone could point out my mistake or explain what I'm doing wrong, I'd really appreciate it!
Because the array is sorted, you can stop counting as soon as you reach or exceed the lessThan parameter.
else break would probably do it.
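The early exit suggested above could look roughly like this. It is only a sketch (the class name SortedSearchSketch is made up here), and it is still linear in the worst case, for example when every element is smaller than lessThan, so it may not be enough for a strict time limit:

```csharp
using System;

public static class SortedSearchSketch
{
    public static int CountNumbers(int[] sortedArray, int lessThan)
    {
        int count = 0;
        foreach (int value in sortedArray)
        {
            // The array is sorted, so once a value reaches lessThan
            // no later element can be smaller than it.
            if (value >= lessThan) break;
            count++;
        }
        return count;
    }
}
```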
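Does it really have to be a loop? You could use a lambda expression with LINQ for that (it needs using System.Linq; and it still visits every element, so it is shorter rather than faster):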
public static int CountNumbers(int[] sortedArray, int lessThan)
{
return sortedArray.ToList().Where(x=>x < lessThan).Count();
}
Harold's answer and approach is spot on.
Find below another code sample in case you're practicing for technical interviews. It handles the cases where the array is null or empty, where lessThan is present in the array (including duplicates), etc.
private static int CountNumbers(int[] sortedArray, int lessThan)
{
if (sortedArray == null)
{
throw new ArgumentNullException("Sorted array cannot be null.");
}
if (sortedArray.Length == 0)
{
throw new ArgumentException("Sorted array cannot be empty.");
}
int start = 0;
int end = sortedArray.Length; // exclusive upper bound
while (start < end)
{
// start + (end - start) / 2 avoids overflow on very large arrays
int middle = start + (end - start) / 2;
if (sortedArray[middle] < lessThan)
{
// Everything up to and including middle is smaller than lessThan
start = middle + 1;
}
else
{
// middle may be the first element >= lessThan, so keep it in range
end = middle;
}
}
// start is now the index of the first element >= lessThan,
// which equals the number of elements smaller than lessThan
return start;
}

Detecting spikes and drops in a long list of integers C#

Hi there I'm trying to write a method that reads every number in a list and detects where it spikes and drops. This is what I have so far:
I basically figure if I loop through the list, loop through it again to get the next number in the list, then detecting if it's more or less. If it's more it'll save to one list, vice versa.
What I want this method to do is determine where there's a spike of 100 or more, save the point that it does this (which is 'counter') and also save the points where the numbers drop.
This so far notices only a drop and it will save every number in the list until it spikes again and once it has spiked it shows no numbers, until it drops again and so on.
I've put 'check' and 'check2' to try and counteract it saving every number after it notices a drop and only save it once but no luck.
Any ideas?
public void intervalDetection()
{
//Counter is the point in the list
int counter = 0;
int spike = 0;
int drop = 0;
//Loop through power list
for (int i = 0; i < powerList.Count(); i++)
{
counter++;
int firstNumber = powerList[i];
//Loop again to get the number after??
for (int j = 1; j < 2; j++)
{
//Detect Spike
spike = firstNumber + 100;
drop = firstNumber - 100;
if (powerList[j] > spike)
{
if (check2 == false)
{
intervalStartList.Add(counter);
check2 = true;
check = false;
}
}
//Detect Drop
else if (powerList[j] < drop)
{
if (check == false)
{
intervalEndList.Add(counter);
check = true;
check2 = false;
}
}
}
}
}
Create integer "average"
Loop through List/Array and add each value to average
Divide average by the count of the List/Array
Loop through List/Array and check deviation to the average integer
derp
Code example:
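using System.Collections.Generic;
public class DSDetector {
// Returns two lists: index 0 holds the drops, index 1 holds the spikes.
public static List<int>[] getDropsnSpikes(List<int> values, int deviation) {
List<int> drops = new List<int>();
List<int> spikes = new List<int>();
if (values.Count == 0) {
return new List<int>[] { drops, spikes }; // nothing to average
}
long sum = 0; // long avoids overflow on long lists
foreach (int val in values) {
sum += val;
}
long average = sum / values.Count;
foreach (int val in values) {
if (val < average - deviation) {
drops.Add(val);
}
if (val > average + deviation) {
spikes.Add(val);
}
}
return new List<int>[] { drops, spikes };
}
}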
not tested but I think it works. Just try it.
What exactly do you mean by "spikes" and "drops"?
Let's say you have the following list of integers:
112, 111, 113, 250, 112, 111, 1, 113
In this case the value 250 is a peak and 1 is a drop relative to the average value, and you can find them using Kai_Jan_57's answer.
But 250 is also a peak relative to the previous value 113, and 112 is a drop relative to 250.
If you want to find local peaks and drops, you can check each value relative to the previous and next ones: compute the average as avg=(val[i-1]+val[i+1])/2 and check if val[i]>avg + 100 (peak) or val[i]<avg - 100 (drop).
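The neighbour-average check can be sketched as follows. This is an untested-in-production illustration; LocalExtremaSketch, Find and the threshold parameter are names invented for the example, and the first and last elements are skipped because they have only one neighbour.

```csharp
using System;
using System.Collections.Generic;

public static class LocalExtremaSketch
{
    // Adds the index of every local peak to "peaks" and of every local
    // drop to "drops", comparing each value with the average of its
    // two neighbours.
    public static void Find(IList<int> values, int threshold,
                            List<int> peaks, List<int> drops)
    {
        for (int i = 1; i < values.Count - 1; i++)
        {
            double avg = (values[i - 1] + values[i + 1]) / 2.0;
            if (values[i] > avg + threshold)
            {
                peaks.Add(i);
            }
            else if (values[i] < avg - threshold)
            {
                drops.Add(i);
            }
        }
    }
}
```

For the list 112, 111, 113, 250, 112, 111, 1, 113 with a threshold of 100 this records index 3 (the value 250) as a peak and index 6 (the value 1) as a drop.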

Code sample that shows casting to uint is more efficient than range check

So I am looking at this question and the general consensus is that uint cast version is more efficient than range check with 0. Since the code is also in MS's implementation of List I assume it is a real optimization. However I have failed to produce a code sample that results in better performance for the uint version. I have tried different tests and there is something missing or some other part of my code is dwarfing the time for the checks. My last attempt looks like this:
class TestType
{
public TestType(int size)
{
MaxSize = size;
Random rand = new Random(100);
for (int i = 0; i < MaxIterations; i++)
{
indexes[i] = rand.Next(0, MaxSize);
}
}
public const int MaxIterations = 10000000;
private int MaxSize;
private int[] indexes = new int[MaxIterations];
public void Test()
{
var timer = new Stopwatch();
int inRange = 0;
int outOfRange = 0;
timer.Start();
for (int i = 0; i < MaxIterations; i++)
{
int x = indexes[i];
if (x < 0 || x > MaxSize)
{
throw new Exception();
}
inRange += indexes[x];
}
timer.Stop();
Console.WriteLine("Comparision 1: " + inRange + "/" + outOfRange + ", elapsed: " + timer.ElapsedMilliseconds + "ms");
inRange = 0;
outOfRange = 0;
timer.Reset();
timer.Start();
for (int i = 0; i < MaxIterations; i++)
{
int x = indexes[i];
if ((uint)x > (uint)MaxSize)
{
throw new Exception();
}
inRange += indexes[x];
}
timer.Stop();
Console.WriteLine("Comparision 2: " + inRange + "/" + outOfRange + ", elapsed: " + timer.ElapsedMilliseconds + "ms");
}
}
class Program
{
static void Main()
{
TestType t = new TestType(TestType.MaxIterations);
t.Test();
TestType t2 = new TestType(TestType.MaxIterations);
t2.Test();
TestType t3 = new TestType(TestType.MaxIterations);
t3.Test();
}
}
The code is a bit of a mess because I tried many things to make uint check perform faster like moving the compared variable into a field of a class, generating random index access and so on but in every case the result seems to be the same for both versions. So is this change applicable on modern x86 processors and can someone demonstrate it somehow?
Note that I am not asking for someone to fix my sample or explain what is wrong with it. I just want to see the case where the optimization does work.
if (x < 0 || x > MaxSize)
The comparison is performed by the CMP (Compare) processor instruction. You'll want to take a look at Agner Fog's instruction tables document (PDF), which lists the cost of instructions. Find your processor in the list, then locate the CMP instruction.
For mine, Haswell, CMP takes 1 cycle of latency and 0.25 cycles of throughput.
A fractional cost like that could use an explanation. Haswell has 4 integer execution units that can execute instructions at the same time. When a program contains enough integer operations, like CMP, without an interdependency, they can all execute at the same time, in effect making the program 4 times faster. You don't always manage to keep all 4 of them busy at the same time with your code; that is actually pretty rare. But you do keep 2 of them busy in this case. In other words, two comparisons take just as long as a single one: 1 cycle.
There are other factors at play that make the execution time identical. One thing that helps is that the processor can predict the branch very well, so it can speculatively execute x > MaxSize in spite of the short-circuit evaluation. And it will in fact end up using the result, since the branch is never taken.
And the true bottleneck in this code is the array indexing; accessing memory is one of the slowest things the processor can do. So the "fast" version of the code isn't faster, even though it provides more opportunity for the processor to execute instructions concurrently. It isn't much of an opportunity today anyway, since a processor has too many execution units to keep busy, which is also what makes HyperThreading work. In both cases the processor bogs down at the same rate.
On my machine, I have to write code that occupies more than 4 engines to make it slower. Silly code like this:
if (x < 0 || x > MaxSize || x > 10000000 || x > 20000000 || x > 3000000) {
outOfRange++;
}
else {
inRange++;
}
Using 5 compares, now I can see a difference: 61 vs 47 msec. Or in other words, this is a way to count the number of integer engines in the processor. Hehe :)
So this is a micro-optimization that probably used to pay off a decade ago. It doesn't anymore. Scratch it off your list of things to worry about :)
I would suggest attempting code which does not throw an exception when the index is out of range. Exceptions are incredibly expensive and can completely throw off your bench results.
The code below does a timed-average bench for 1,000 iterations of 1,000,000 results.
using System;
using System.Diagnostics;
namespace BenchTest
{
class Program
{
const int LoopCount = 1000000;
const int AverageCount = 1000;
static void Main(string[] args)
{
Console.WriteLine("Starting Benchmark");
RunTest();
Console.WriteLine("Finished Benchmark");
Console.Write("Press any key to exit...");
Console.ReadKey();
}
static void RunTest()
{
int cursorRow = Console.CursorTop; int cursorCol = Console.CursorLeft;
long totalTime1 = 0; long totalTime2 = 0;
long invalidOperationCount1 = 0; long invalidOperationCount2 = 0;
for (int i = 0; i < AverageCount; i++)
{
Console.SetCursorPosition(cursorCol, cursorRow);
Console.WriteLine("Running iteration: {0}/{1}", i + 1, AverageCount);
int[] indexArgs = RandomFill(LoopCount, int.MinValue, int.MaxValue);
int[] sizeArgs = RandomFill(LoopCount, 0, int.MaxValue);
totalTime1 += RunLoop(TestMethod1, indexArgs, sizeArgs, ref invalidOperationCount1);
totalTime2 += RunLoop(TestMethod2, indexArgs, sizeArgs, ref invalidOperationCount2);
}
PrintResult("Test 1", TimeSpan.FromTicks(totalTime1 / AverageCount), invalidOperationCount1);
PrintResult("Test 2", TimeSpan.FromTicks(totalTime2 / AverageCount), invalidOperationCount2);
}
static void PrintResult(string testName, TimeSpan averageTime, long invalidOperationCount)
{
Console.WriteLine(testName);
Console.WriteLine(" Average Time: {0}", averageTime);
Console.WriteLine(" Invalid Operations: {0} ({1})", invalidOperationCount, (invalidOperationCount / (double)(AverageCount * LoopCount)).ToString("P3"));
}
static long RunLoop(Func<int, int, int> testMethod, int[] indexArgs, int[] sizeArgs, ref long invalidOperationCount)
{
Stopwatch sw = new Stopwatch();
Console.Write("Running {0} sub-iterations", LoopCount);
sw.Start();
for (int i = 0; i < LoopCount; i++)
{
invalidOperationCount += testMethod(indexArgs[i], sizeArgs[i]);
}
sw.Stop();
// Stopwatch.ElapsedTicks counts in Stopwatch.Frequency units, not in the
// 100 ns ticks that TimeSpan expects, so take the ticks from Elapsed instead.
long elapsedTickCount = sw.Elapsed.Ticks;
Console.WriteLine(" - Time Taken: {0}", new TimeSpan(elapsedTickCount));
return elapsedTickCount;
}
static int[] RandomFill(int size, int minValue, int maxValue)
{
int[] randomArray = new int[size];
Random rng = new Random();
for (int i = 0; i < size; i++)
{
randomArray[i] = rng.Next(minValue, maxValue);
}
return randomArray;
}
static int TestMethod1(int index, int size)
{
return (index < 0 || index >= size) ? 1 : 0;
}
static int TestMethod2(int index, int size)
{
return ((uint)(index) >= (uint)(size)) ? 1 : 0;
}
}
}
You aren't comparing like with like.
The code you were talking about not only saved one branch by using the optimisation, but also 4 bytes of CIL in a small method.
In a small method 4 bytes can be the difference in being inlined and not being inlined.
And if the method calling that method is also written to be small, then that can mean two (or more) method calls are jitted as one piece of inline code.
And maybe some of it is then, because it is inline and available for analysis by the jitter, optimised further again.
The real difference is not between index < 0 || index >= _size and (uint)index >= (uint)_size, but between code that has repeated efforts to minimise the method body size and code that does not. Look for example at how another method is used to throw the exception if necessary, further shaving off a couple of bytes of CIL.
(And no, that's not to say that I think all methods should be written like that, but there certainly can be performance differences when one does).
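To make the pattern under discussion concrete, here is a minimal sketch of an indexer written in that style. It is not the actual List implementation; SmallListSketch and ThrowIndexOutOfRange are hypothetical names, and whether the JIT inlines anything here depends on the runtime.

```csharp
using System;

public sealed class SmallListSketch
{
    private readonly int[] _items;
    private readonly int _size;

    public SmallListSketch(int[] items)
    {
        _items = items;
        _size = items.Length;
    }

    public int this[int index]
    {
        get
        {
            // A negative int becomes a value above int.MaxValue when cast
            // to uint, so this single comparison rejects both index < 0
            // and index >= _size.
            if ((uint)index >= (uint)_size)
            {
                ThrowIndexOutOfRange();
            }
            return _items[index];
        }
    }

    // Keeping the throw in a separate method keeps the getter body small.
    private static void ThrowIndexOutOfRange()
    {
        throw new ArgumentOutOfRangeException("index");
    }
}
```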

Calculating adjacency matrix from randomly generated graphs

I have developed a small program that randomly generates several connections between graphs (the count could be random too, but for testing I have defined a constant value; it could be replaced with a random value at any time).
Code is C#: http://ideone.com/FDCtT0
( result: Success time: 0.04s memory: 36968 kB returned value: 0 )
If you don't know what an adjacency matrix is, go here: http://en.wikipedia.org/wiki/Adjacency_matrix
I think that my version of the code is rather unoptimized, and I will need to work with large matrices of size 10k x 10k.
1) What are your suggestions for parallelizing the calculations in this task? Should I use a locking model such as semaphores for multi-threaded calculations on large matrices?
2) What are your suggestions for redesigning the architecture of the program? How should I prepare it for large matrices?
3) As you can see above, for the ideone run I have shown the execution time and the memory allocated. What is the asymptotic complexity of my program? Is it O(n^2)?
So I would like to hear your advice on how to improve the asymptotic complexity and how to parallelize the calculations using semaphores (or perhaps a better locking model for threads).
Thank you!
PS:
SO doesn't allow posting a topic without formatted code, so I'm posting it at the end (full program):
/*
Oleg Orlov, 2012(c), generating randomly adjacency matrix and graph connections
*/
using System;
using System.Collections.Generic;
class Graph
{
internal int id;
private int value;
internal Graph[] links;
public Graph(int inc_id, int inc_value)
{
this.id = inc_id;
this.value = inc_value;
links = new Graph[Program.random_generator.Next(0, 4)];
}
}
class Program
{
private const int graphs_count = 10;
private static List<Graph> list;
public static Random random_generator;
private static void Init()
{
random_generator = new Random();
list = new List<Graph>(graphs_count);
for (int i = 0; i < list.Capacity; i++)
{
list.Add(new Graph(i, random_generator.Next(100, 255) * i + random_generator.Next(0, 32)));
}
}
private static void InitGraphs()
{
for (int i = 0; i < list.Count; i++)
{
Graph graph = list[i] as Graph;
graph.links = new Graph[random_generator.Next(1, 4)];
for (int j = 0; j < graph.links.Length; j++)
{
graph.links[j] = list[random_generator.Next(0, 10)];
}
list[i] = graph;
}
}
private static bool[,] ParseAdjectiveMatrix()
{
bool[,] matrix = new bool[list.Count, list.Count];
foreach (Graph graph in list)
{
int[] links = new int[graph.links.Length];
for (int i = 0; i < links.Length; i++)
{
links[i] = graph.links[i].id;
matrix[graph.id, links[i]] = matrix[links[i], graph.id] = true;
}
}
return matrix;
}
private static void PrintMatrix(ref bool[,] matrix)
{
for (int i = 0; i < list.Count; i++)
{
Console.Write("{0} | [ ", i);
for (int j = 0; j < list.Count; j++)
{
Console.Write(" {0},", Convert.ToInt32(matrix[i, j]));
}
Console.Write(" ]\r\n");
}
Console.Write("{0}", new string(' ', 7));
for (int i = 0; i < list.Count; i++)
{
Console.Write("---");
}
Console.Write("\r\n{0}", new string(' ', 7));
for (int i = 0; i < list.Count; i++)
{
Console.Write("{0} ", i);
}
Console.Write("\r\n");
}
private static void PrintGraphs()
{
foreach (Graph graph in list)
{
Console.Write("\r\nGraph id: {0}. It references to the graphs: ", graph.id);
for (int i = 0; i < graph.links.Length; i++)
{
Console.Write(" {0}", graph.links[i].id);
}
}
}
[STAThread]
static void Main()
{
try
{
Init();
InitGraphs();
bool[,] matrix = ParseAdjectiveMatrix();
PrintMatrix(ref matrix);
PrintGraphs();
}
catch (Exception exc)
{
Console.WriteLine(exc.Message);
}
Console.Write("\r\n\r\nPress enter to exit this program...");
Console.ReadLine();
}
}
I will start from the end, if you don't mind. :)
3) Of course, it is O(n^2). As well as the memory usage.
2) Since sizeof(bool) is 1 byte, not 1 bit, you can reduce memory usage by packing the values into bit masks instead of storing raw bool values. Each entry then takes 1 bit instead of 8, so the matrix needs 8 times less memory.
1) I don't know C# that well, but as I just googled, reads and writes of C# primitive types such as bool are atomic, so each thread can safely set its own entries of a bool matrix. That makes the multi-threading task very easy: split your graphs among the threads and let every thread process its own part. The parts are independent, so that should not be a problem, and you don't need semaphores, locks and so on. Note that this applies to a plain bool[,]; setting a bit inside a packed mask is a read-modify-write and is not atomic.
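As an illustration of the bit-packing idea, here is a small sketch built on System.Collections.BitArray. BitMatrixSketch is a made-up name, and the class is not safe for concurrent writes, so it would need external synchronization or per-thread partitions in a multi-threaded build.

```csharp
using System;
using System.Collections;

public sealed class BitMatrixSketch
{
    private readonly BitArray _bits;
    private readonly int _n;

    // n * n bits in total; for n = 10,000 that is 10^8 bits, about 12.5 MB,
    // compared with about 100 MB for a bool[10000, 10000].
    public BitMatrixSketch(int n)
    {
        _n = n;
        _bits = new BitArray(n * n);
    }

    public bool this[int row, int col]
    {
        get { return _bits[row * _n + col]; }
        set { _bits[row * _n + col] = value; }
    }
}
```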
The thing is that you won't be able to have an adjacency matrix with size 10^9 x 10^9. You just can't store it in the memory. But, there is an other way.
Create an adjacency list for each vertex, holding all the vertices it is connected to. After building those lists from your graph, sort each one. Then you can answer 'is a connected to b' in O(log(size of the adjacency list for vertex a)) time using binary search, which is really fast for common usage.
Now, if you want to implement Dijkstra algorithm really fast, you won't need an adj. matrix at all, just those lists.
Again, it all depends on the future tasks and constraints. You cannot store the matrix of that size, that's all. You don't need it for Dijkstra or BFS, that's a fact. :) There is no conceptual difference from the graph's side: graph will be the same no matter what data structure it's stored in.
If you really want the matrix, then that's the solution:
We know, that number of connections (1 in matrix) is greatly smaller than its maximum which is n^2. By doing those lists, we simply store the positions of 1 (it's also called sparse matrix), which consumes no unneeded memory.
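The sorted adjacency lists described in this answer could be sketched like this. The type and member names (AdjacencyListsSketch, AddEdge, SortAll, AreConnected) are hypothetical, and the sketch assumes an undirected graph whose vertices are numbered from 0.

```csharp
using System;
using System.Collections.Generic;

public sealed class AdjacencyListsSketch
{
    private readonly List<int>[] _neighbours;

    public AdjacencyListsSketch(int vertexCount)
    {
        _neighbours = new List<int>[vertexCount];
        for (int i = 0; i < vertexCount; i++)
        {
            _neighbours[i] = new List<int>();
        }
    }

    // Stores the edge in both directions, like the symmetric matrix does.
    public void AddEdge(int a, int b)
    {
        _neighbours[a].Add(b);
        _neighbours[b].Add(a);
    }

    // Call once after all edges are added, before any lookups.
    public void SortAll()
    {
        foreach (List<int> list in _neighbours)
        {
            list.Sort();
        }
    }

    // Binary search over a's sorted neighbour list.
    public bool AreConnected(int a, int b)
    {
        return _neighbours[a].BinarySearch(b) >= 0;
    }
}
```

Memory use grows with the number of edges rather than with the square of the number of vertices.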

for loop inside array terminating on second call

I have done coding in C#, but not much in console apps (our teacher is making us do an assignment in one).
I have a problem where my static method works fine the first time it is called (each question is asked), but the second time through the console closes. I need this function to execute 10 times and I'm not sure why it won't. Here is what I have, and thanks in advance!
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace Lab2
{
class Program
{
//Create the arrays
static string[] questions = new string[5]; //For questions
static int[] tableHeader = new int[10]; //Table Header
static int[,] responses = new int[5, 10]; //For answers
//Int for the number of times the questions have been asked
static int quizCount = 0;
static int answer;
static bool isGoing = true;
static void Main(string[] args)
{
//Set the questions in an array
questions[0] = "On a scale of 1-10, how do you feel about the drinking age in Wisconsin?";
questions[1] = "On a scale of 1-10, how often do you drink a week?";
questions[2] = "On a scale of 1-10, how important is this class?";
questions[3] = "On a scale of 1-10, how would you rate this campus?";
questions[4] = "On a scale of 1-10, how would you rate this command prompt?";
while(isGoing)
Questions();
}
static void Questions()
{
for (int i = 0; i < 5; i++)
{
Console.WriteLine(questions[i]);
answer = Convert.ToInt16(Console.ReadLine());
responses[i, quizCount] = answer;
}
if (quizCount < 10)
{
Console.WriteLine("Enter more data? (1=yes, 0=no)");
int again = Console.Read();
if (again != 1)
Environment.Exit(0);
}
else
isGoing = false;
DisplayResults();
}
static void DisplayResults()
{
Console.WriteLine(tableHeader);
for (int i = 0; i < 5; i++)
{
for (int x = 0; x < 10; x++)
{
Console.Write(responses[i, x]);
}
Console.Write("\n");
}
}
}
}
First off, Console.Read() returns an int representing the ASCII value of what was entered. If the user enters 1, Console.Read() returns 49. (See this ASCII table)
You could use Console.ReadKey()
Second, you need some fixes in the way you loop and ask to continue....
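One way to sidestep the character-code issue, shown here only as a sketch, is to read a whole line and compare it as text. PromptSketch and WantsMore are hypothetical names; reading with Console.ReadLine() also consumes the Enter key press, which Console.Read() would leave in the input buffer.

```csharp
using System;

public static class PromptSketch
{
    // Treats a whole line of input as "yes" only when it is exactly "1"
    // (ignoring surrounding whitespace).
    public static bool WantsMore(string line)
    {
        return line != null && line.Trim() == "1";
    }
}
```

A call such as bool again = PromptSketch.WantsMore(Console.ReadLine()); could then replace the comparison of Console.Read() against 1, which fails because the character '1' has the code 49.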
int again = Console.Read();
Your problem is here - Console.Read() returns the first character entered (as represented by its ASCII code), not the number you type in. I leave the solution for your homework.
