Is the complexity O(N) or O(N^2)? - C#

I was solving a question in which I have to build an array of unique values from a sorted array that may contain duplicate elements.
I solved it with the following code:
for (int i = 0; i < sorted.Length - 1; i++)
{
    if (sorted[i] == sorted[i + 1])
    {
        uniqueList.Add(sorted[i]);
        int j = i + 1;
        while (j < sorted.Length)
        {
            if (sorted[i] != sorted[j])
            {
                break;
            }
            j++;
            i++;
        }
    }
    else
    {
        uniqueList.Add(sorted[i]);
    }
}
Now I want to know the complexity of this solution.
Some people say it is O(N), but others say it is O(N^2), so I thought I would ask the question here to get a better understanding.

Worst case is O(N).
It looks like a nested loop, but since both i and j are incremented on every iteration of the while loop, there is effectively no loop inside a loop: across the whole run, the total number of increments of i cannot exceed sorted.Length.
The algorithm therefore doesn't allow more iterations than sorted.Length.
Interestingly, this indicates that the while loop could be replaced with an if statement (perhaps not a simple one); working that out would be a nice exercise.
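For illustration, here is a minimal single-pass sketch (my own, not code from the question) that makes the linear bound obvious, since every element of sorted is visited exactly once:
// Sketch only: assumes `sorted` is a sorted int[].
// An element is added only when it differs from the last value added.
var uniqueList = new List<int>();
foreach (int value in sorted)
{
    if (uniqueList.Count == 0 || uniqueList[uniqueList.Count - 1] != value)
    {
        uniqueList.Add(value);
    }
}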

Related

What should I use to optimize my C# code?

I was doing a Codewars kata and my solution works, but I'm timing out.
I searched online for solutions, for some kind of reference, but they were all for JavaScript.
Here is the kata: https://i.stack.imgur.com/yGLmw.png
Here is my code:
public static int DblLinear(int n)
{
    if (n > 0)
    {
        var list = new List<int>();
        int[] next_two = new int[2];
        list.Add(1);
        for (int i = 0; i < n; i++)
        {
            for (int m = 0; m < next_two.Length; m++)
            {
                next_two[m] = ((m + 2) * list[i]) + 1;
            }
            if (list.Contains(next_two[0]))
            {
                list.Add(next_two[1]);
            }
            else if (list.Contains(next_two[1]))
            {
                list.Add(next_two[0]);
            }
            else
            {
                list.AddRange(next_two);
            }
            list.Sort();
        }
        return list[n];
    }
    return 1;
}
It's a really slow solution, but that's what seems to be working for me.
The first rule of performance optimization is to measure. Ideally use a profiler that can tell you where most of the time is spent, but for simple cases a few stopwatches can be sufficient.
I would guess that most of the time is spent in list.Contains, since that is a linear lookup and it sits in the innermost loop. So one approach would be to change the list to a HashSet<int> for better lookup performance, skip the .Sort call, and return the maximum value in the hash set. As far as I can tell that should give the same result.
You might also consider using a specialized data structure that fits the problem better than the general-purpose containers provided in .NET.
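As one concrete way to act on that suggestion, here is a hedged sketch (my own variation, not the exact HashSet approach described above) that uses a SortedSet<int> as a simple priority queue. It assumes the recurrence that the question's code computes (start from 1; for every x in the sequence, 2x + 1 and 3x + 1 are also in it), and it avoids both the Contains scans and the repeated Sort calls:
// Sketch only: requires using System.Collections.Generic;
// SortedSet ignores duplicate inserts and keeps its elements ordered,
// so Min always yields the next term of the sequence.
public static int DblLinear(int n)
{
    var candidates = new SortedSet<int> { 1 };
    int current = 0;
    for (int i = 0; i <= n; i++)
    {
        current = candidates.Min;        // u(i), the i-th term
        candidates.Remove(current);
        candidates.Add(2 * current + 1); // y = 2x + 1
        candidates.Add(3 * current + 1); // z = 3x + 1
    }
    return current;
}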

Why does a simple increment not work in this InsertionSort Algorithm?

static void InsertionSort(int[] array)
{
    int swaps = 0;
    for (int i = 0; i < array.Count(); i++)
    {
        int item = array[i];
        int index = i;
        while (index > 0 && array[index - 1] > item)
        {
            array[index] = array[index - 1];
            index--;
            swaps++;
        }
        array[index] = item;
    }
    Console.WriteLine($"InsertionSorted Array: {String.Join(",", array)}, took {swaps} swaps.");
}
Can someone please explain to me why "swaps" always comes out as 0?
I've checked the placement.
I tried moving the swaps++ to before the array assignment.
I even tried using ++swaps (pre-increment) instead, to no avail...
I just want to keep track of how many times a value gets moved, to demonstrate the difference between different sorting methods.
Either I'm an idiot and need to quit now, or this is beyond me.
So, copying the whole method to a new file seems to have fixed the issue... I still do not understand why swaps++; failed on me, so I would appreciate a valid answer, but either way the problem is fixed for now.
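For what it's worth, here is a minimal test harness (my own, not part of the question) showing that the posted method does report a non-zero swap count on an unsorted input, which suggests the original problem was outside the method itself (for example a stale build or a different method being called):
static void Main()
{
    // Note: array.Count() in InsertionSort needs `using System.Linq;`
    // (array.Length would work just as well).
    InsertionSort(new[] { 5, 2, 4, 6, 1, 3 });
    // Prints: InsertionSorted Array: 1,2,3,4,5,6, took 9 swaps.
}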

How to reduce cyclomatic complexity of this C# method?

I'm currently doing a project for my C# classes. Our teacher gave us some code-metric limits that we have to abide by, and one of them is cyclomatic complexity. Right now the complexity of the method below is 5, but it needs to be 4. Is there any way to improve that?
The method I was talking about:
private bool MethodName()
{
    int counter = 0;
    for (int k = 0; k < 8; k++)
    {
        for (int j = 0; j < 3; j++)
        {
            if (class1.GetBoard()[array1[k, j, 0], array1[k, j, 1]] == player.WhichPlayer()) counter++;
        }
        if (counter == 3) return true;
        else counter = 0;
    }
    return false;
}
You can wrap the conditions to reduce it. For example:
private bool MethodName()
{
    for (int k = 0; k < 8; k++)
    {
        bool b = true;
        for (int j = 0; j < 3; j++)
        {
            b &= class1.GetBoard()[array1[k, j, 0], array1[k, j, 1]] == player.WhichPlayer();
        }
        if (b) return true;
    }
    return false;
}
For the OP (who seems to be just starting out in programming):
It is nice that you got an assignment to reduce the cyclomatic complexity of a method, and it is good both to know what it is and how to keep it low as a general practice.
However, try not to get too zealous about this kind of metric. There is more value in having the code as straightforward as possible, easy to reason about and quick to understand, and in only worrying about metrics after profiling the app and knowing the places where it matters most.
For more experienced coders:
This simple problem reminded me of the very famous 1968 letter from Dijkstra in the Communications of the ACM and the discussion it sparked. Although I tend to align with him on this matter, there was one response, from Frank Rubin, that is very reasonable.
Frank basically advocates that "elegance" can often arise from code clarity rather than from any particular metric or practice. Back then, the discussion was about the over-use of the goto statement in the popular languages of that time. Today, the discussion revolves around cyclomatic complexity, terseness, OOP, or whatever.
The bottom line is, in my opinion:
know your tools
code with clarity in mind
try to write efficient code on the first pass, but don't overthink it
profile your code and decide where it's worth spending more time
Back to the question
The implementation presented in the question got the following scores in my Visual Studio Analyzer:
Cycl. Compl: 5; Maintainability: 67
The snippet presented by #Boris got this:
Cycl. Compl: 4; Maintainability: 68
Even though the cyclomatic complexity improved, the maintainability index stays basically the same. Personally, I consider the latter metric more valuable most of the time.
Just for fun, let's see how a solution akin to the one presented by Frank Rubin, using the dreaded goto statement, would look:
private bool MethodName() {
    for (int k = 0; k < 8; k++) {
        for (int j = 0; j < 3; j++) {
            if (whateverTestCondition(k, j) is false) goto reject;
        }
        // condition is true for all items in this row
        return true;
        // if the condition is false for any item, we jump straight to this label
        reject: ;
    }
    return false;
}
Honestly, I think this is the clearest, simplest and most performant implementation for this. Do I recommend goto as a general code feature? NO. Does it fit perfectly and smoothly in this specific case? YES. And what about the metrics?
Cycl. Compl: 4; Maintainability: 70
Bonus
Just because I wouldn't be able to sleep if I didn't mention it, this is how you would implement this in real life:
obj.Any(row => row.All(whateverTestCondition));
Cycl. Compl: 1; Maintainability: 80
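For completeness, here is a hedged sketch of how that one-liner could look against the data structures from the question; obj and whateverTestCondition above are placeholders from the answer, and the Enumerable.Range projection below is my own assumption about how to form the "rows":
// Sketch only: requires using System.Linq;
// Treats each k as a "row" of three (k, j) cells and asks whether any
// row has all three cells owned by the current player.
private bool MethodName()
{
    return Enumerable.Range(0, 8).Any(k =>
        Enumerable.Range(0, 3).All(j =>
            class1.GetBoard()[array1[k, j, 0], array1[k, j, 1]] == player.WhichPlayer()));
}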

Which sorting method is being applied here, and what is the algorithmic complexity of the method?

I came across the code below for sorting an array.
I applied it to a very long array and it was able to do so in under a second, maybe 20 milliseconds or less.
I have been reading about algorithmic complexity and Big O notation and would like to know:
Which of the existing sorting methods is implemented in this code?
What is the complexity of the algorithm used here?
If you were to improve the algorithm/code below, what would you alter?
using System;
using System.Text;
// This program sorts an array
public class SortArray
{
    static void Main(String[] args)
    {
        // declaring and initializing the array
        //int[] arr = new int[] {3,1,4,5,7,2,6,1, 9,11, 7, 2,5,8,4};
int[] arr = new int[] {489,491,493,495,497,529,531,533,535,369,507,509,511,513,515,203,205,207,209,211,213,107,109,111,113,115,117,11913,415,417,419,421,423,425,427,15,17,19,21,4,517,519,521,523,525,527,4,39,441,443,445,447,449,451,453,455,457,459,461,537,539,541,543,545,547,1,3,5,7,9,11,13,463,465,467,23,399,401,403,405,407,409,411,499,501,503,505,333,335,337,339,341,343,345,347,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,9,171,173,175,177,179,181,183,185,187,269,271,273,275,277,279,281,283,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,133,135,137,139,141,143,145,285,287,289,291,121,123,125,127,129,131,297,299,373,375,377,379,381,383,385,387,389,97,99,101,103,105,147,149,151,153,155,157,159,161,163,165,167,16,391,393,395,397,399,401,403,189,191,193,195,197,199,201,247,249,251,253,255,257,259,261,263,265,267,343,345,347,349,501,503,505,333,335,337,339,341,417,419,421,423,425,561,563,565,567,569,571,573,587,589,591,593,595,597,599,427,429,431,433,301,303,305,307,309,311,313,315,317,319,321,323,325,327,329,331,371,359,361,363,365,367,369,507,509,511,513,515,351,353,355,57,517,519,521,523,525,527,413,415,405,407,409,411,499,435,437,469,471,473,475,477,479,481,483,485,487,545,547,549,551,553,555,575,577,579,581,583,585,557,559,489,491,493,495,497,529,531,533,535,537,539,541,543,215,217,219,221,223,225,227,229,231,233,235,237,239,241,243,245,293,295};
        int temp;
        // traverse 0 to array length
        for (int i = 0; i < arr.Length; i++)
        {
            // traverse i+1 to array length
            for (int j = i + 1; j < arr.Length; j++)
            {
                // compare array element with all following elements
                if (arr[i] > arr[j])
                {
                    //Console.WriteLine(i + "i before" + arr[i]);
                    temp = arr[i];
                    arr[i] = arr[j];
                    arr[j] = temp;
                    //Console.WriteLine("i After" + arr[i]);
                }
            }
        }
        // print all elements of the array
        foreach (int value in arr)
        {
            Console.Write(value + " ");
        }
    }
}
This is selection sort. Its time complexity is O(𝑛²). It has nested loops over i and j, and you can see that these produce every possible pair of indices in the range {0,...,𝑛-1}, where 𝑛 is arr.Length. The number of such pairs is a triangular number, equal to:
𝑛(𝑛-1)/2
...which is O(𝑛²). For example, with 𝑛 = 4 the pairs are (0,1), (0,2), (0,3), (1,2), (1,3) and (2,3): 4·3/2 = 6 comparisons.
If we stick to selection sort, we can still find some improvements.
We can see that the role of the outer loop is to store in arr[i] the value that belongs there in the final sorted array, and never touch that entry again. It does so by searching for the minimum value in the part of the array that starts at index 𝑖.
Now during that search, which takes place in the inner loop, it keeps swapping smaller values into arr[i]. This may happen several times, as it may find ever smaller values as j walks to the right. That is a waste of operations, as we would prefer to perform only one swap. And this is possible: instead of swapping immediately, delay the operation and just keep track of where the minimum value is located (initially at i, though this may become some index j). Only when the inner loop completes, perform the swap.
There is a less important improvement: i does not have to reach arr.Length - 1, since the inner loop performs no iterations at that point. So the ending condition for the outer loop can exclude that iteration from happening.
Here is how that looks:
for (int i = 0, last = arr.Length - 1; i < last; i++)
{
    int k = i; // index of the least value found so far in arr[i..n-1]
    for (int j = i + 1; j < arr.Length; j++)
    {
        if (arr[k] > arr[j])
        {
            k = j; // don't swap yet -- just track where the minimum is located
        }
    }
    if (k > i) // now perform the (single) swap
    {
        int temp = arr[i];
        arr[i] = arr[k];
        arr[k] = temp;
    }
}
A further improvement is to use the inner loop to locate not only the minimum value but also the maximum value, and to move the found maximum to the right end of the array. That way both ends of the array get sorted gradually, and the inner loop shortens twice as fast. Still, the number of comparisons remains the same, and so does the average number of swaps, so the gain is only in the iteration overhead.
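As an illustration of that last idea, here is a hedged sketch (my own, not code from the answer) of a double-ended selection sort that finds both the minimum and the maximum of the unsorted middle part in each pass:
static void DoubleEndedSelectionSort(int[] arr)
{
    int left = 0, right = arr.Length - 1;
    while (left < right)
    {
        // One pass over the unsorted middle finds both extremes.
        int minIndex = left, maxIndex = left;
        for (int j = left + 1; j <= right; j++)
        {
            if (arr[j] < arr[minIndex]) minIndex = j;
            if (arr[j] > arr[maxIndex]) maxIndex = j;
        }
        // Move the minimum to the left end.
        (arr[left], arr[minIndex]) = (arr[minIndex], arr[left]);
        // If the maximum was sitting at the left end, it has just been
        // moved to minIndex, so follow it before the second swap.
        if (maxIndex == left) maxIndex = minIndex;
        (arr[right], arr[maxIndex]) = (arr[maxIndex], arr[right]);
        left++;
        right--;
    }
}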
This is "Bubble sort", which is O(n^2). You can use "Merge sort" or "Quicksort" to improve your algorithm to O(n*log(n)). If you always know the minimum and maximum of your numbers, you can use "Digit sort" or "Radix sort", which are O(n).
See: Sorting algorithms in C#
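For reference, if you simply need an O(n*log(n)) sort in C# without writing one by hand, the built-in Array.Sort already provides one (it uses an introspective sort internally):
// Sorts arr in place in O(n log n); no hand-written loops needed.
Array.Sort(arr);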

Search a file for a sequence of bytes (C#)

I'm writing a C# application in which I need to search a file (could be very big) for a sequence of bytes, and I can't use any libraries to do so. So, I need a function that takes a byte array as an argument and returns the position of the byte following the given sequence. The function doesn't have to be fast, it simply has to work. Any help would be greatly appreciated :)
If it doesn't have to be fast you could use this:
int GetPositionAfterMatch(byte[] data, byte[] pattern)
{
    // <= so that the last possible starting position is also checked
    for (int i = 0; i <= data.Length - pattern.Length; i++)
    {
        bool match = true;
        for (int k = 0; k < pattern.Length; k++)
        {
            if (data[i + k] != pattern[k])
            {
                match = false;
                break;
            }
        }
        if (match)
        {
            return i + pattern.Length;
        }
    }
    return -1; // no match found
}
But I really would recommend the Knuth-Morris-Pratt algorithm; it's the kind of algorithm typically used as the basis of IndexOf-style string search methods. The code above will perform really slowly, except for small arrays and small patterns.
The straightforward approach pointed out by Turrau works, and for your purposes it is probably good enough, since you say it doesn't have to be fast; for most practical inputs the algorithm is much faster than its worst case of O(n*m) (depending on your pattern, I guess).
For an optimal solution you can also check out the Knuth-Morris-Pratt algorithm, which makes use of partial matches and ends up at O(n+m).
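To make the partial-match idea concrete, here is a hedged sketch of a KMP-style search (my own illustration, not the answerer's code). As the question asks, it returns the position of the byte following the first occurrence of the pattern, or -1 if there is none:
static int GetPositionAfterMatchKmp(byte[] data, byte[] pattern)
{
    if (pattern.Length == 0) return 0;

    // Failure table: fail[k] is the length of the longest proper prefix
    // of pattern[0..k] that is also a suffix of it.
    int[] fail = new int[pattern.Length];
    for (int k = 1, len = 0; k < pattern.Length; )
    {
        if (pattern[k] == pattern[len])
        {
            fail[k++] = ++len;
        }
        else if (len > 0)
        {
            len = fail[len - 1]; // fall back to a shorter prefix
        }
        else
        {
            fail[k++] = 0;
        }
    }

    // Scan the data, reusing partial matches instead of restarting.
    for (int i = 0, matched = 0; i < data.Length; )
    {
        if (data[i] == pattern[matched])
        {
            i++;
            matched++;
            if (matched == pattern.Length) return i; // index just past the match
        }
        else if (matched > 0)
        {
            matched = fail[matched - 1];
        }
        else
        {
            i++;
        }
    }
    return -1; // no match found
}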
Here's an extract of some code I used to do a Boyer-Moore-type search. It's meant to work on pcap files, so it operates record by record, but it should be easy enough to modify to search a long binary file. It's extracted from some test code, so I hope I included everything you need to follow along. Also look up Boyer-Moore searching on Wikipedia, since that is what this is based on.
int[] badMatch = new int[256];
byte[] pattern; // the pattern we are searching for

// badMatch is an array covering every possible byte value (defined as static later).
// We use it as a jump table to know how many characters we can skip comparison on,
// so first we prefill every possibility with the length of our search string.
for (int i = 0; i < badMatch.Length; i++)
{
    badMatch[i] = pattern.Length;
}

// Now we calculate the individual maximum jump length for each byte that appears in the search string.
for (int i = 0; i < pattern.Length - 1; i++)
{
    badMatch[pattern[i] & 0xff] = pattern.Length - i - 1;
}

// Place the bytes you want to run the search against in the payload variable.
byte[] payload = <bytes>

// Note: offset, end, cont, i, j and k come from the surrounding
// record-processing code that is omitted from this extract.

// Search the packet starting at offset, and try to match the last character first;
// each time around the loop we advance by whatever our jump value is.
for (i = offset + pattern.Length - 1; i < end && cont; i += badMatch[payload[i] & 0xff])
{
    // If the payload byte equals the search-string byte, continue matching, counting backwards.
    for (j = pattern.Length - 1, k = i; (j >= 0) && (payload[k] == pattern[j]) && cont; j--)
    {
        k--;
    }
    // If we matched every character, we have a match: add it to the packet list
    // and exit the search (cont = false).
    if (j == -1)
    {
        // we MATCHED!!!
        //i = end;
        cont = false;
    }
}
