Sorting numbers array issue - c#

Yesterday at work I set out to figure out how to sort numbers without using the library method Array.Sort. I worked on it on and off when time permitted and finally came up with a basic working algorithm at the end of today. It might be rather stupid and the slowest way, but I am content that I have working code.
But something is wrong or missing in the logic, and it causes the output to hang before printing the line: Numbers sorted. (12/17/2011 2:11:42 AM)
This delay is directly proportional to the number of elements in the array. To be specific, the output just hangs at the position where I put the tilde in the results section below. The content after the tilde is printed after that noticeable delay.
Here is the code that does the sort:
while(pass != unsortedNumLen)
{
    for(int i=0,j=1; i < unsortedNumLen-1 && j < unsortedNumLen; i++,j++)
    {
        if (unsorted[i] > unsorted[j])
        {
            pass = 0;
            swaps++;
            Console.Write("Swapping {0} and {1}:\t", unsorted[i], unsorted[j]);
            tmp = unsorted[i];
            unsorted[i] = unsorted[j];
            unsorted[j] = tmp;
            printArray(unsorted);
        }
        else pass++;
    }
}
The results:
Numbers unsorted. (12/17/2011 2:11:19 AM)
4 3 2 1
Swapping 4 and 3: 3 4 2 1
Swapping 4 and 2: 3 2 4 1
Swapping 4 and 1: 3 2 1 4
Swapping 3 and 2: 2 3 1 4
Swapping 3 and 1: 2 1 3 4
Swapping 2 and 1: 1 2 3 4
~
Numbers sorted. (12/17/2011 2:11:42 AM)
1 2 3 4
Number of swaps: 6
Can you help identify the issue with my attempt?
Link to full code
This is not homework, just me working out.

Change the condition in your while to this:
while (pass < unsortedNumLen)
Logically pass never equals unsortedNumLen so your while won't terminate.
pass does eventually equal unsortedNumLen when it goes over the max value of an int and loops around to it.
In order to see what's happening yourself while it's in the hung state, just hit the pause button in Visual Studio and hover your mouse over pass to see that it contains a huge value.
You could also set a breakpoint on the while line and add a watch for pass. That would show you that the first time the list is sorted, pass equals 5.

It sounds like you want a hint to help you work through it and learn, so I am not posting a complete solution.
Change your else block to the below and see if it puts you on the right track.
else {
Console.WriteLine("Nothing to do for {0} and {1}", unsorted[i], unsorted[j]);
pass++;
}

Here is the fix:
while(pass < unsortedNumLen)
And here is why the delay occurred.
After the end of the for loop in which the array was eventually sorted, pass contains at most unsortedNumLen - 2 (if the last swap was between the first and second members). That does not equal the array length, so another iteration of the while and the inner for starts. Since the array is now sorted, unsorted[i] > unsorted[j] is always false, so pass is incremented on every step of the for loop, exactly as many times as j is incremented, which is unsortedNumLen - 1 times. The new value equals unsortedNumLen only if pass happened to be exactly 1 beforehand; in your run it was 2, so it jumps from 2 to 5 and skips 4, and another iteration of the while begins. Nothing essentially changes: every further iteration adds another unsortedNumLen - 1, so pass only moves further away from unsortedNumLen. And so on.
When pass reaches int.MaxValue, overflow happens, and the next value pass takes is int.MinValue. The process goes on until pass finally holds the value unsortedNumLen at the moment the while condition is checked. If you are particularly unlucky, this might never happen at all.
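As an illustration of the fix discussed in this thread, here is a minimal sketch of the asker's loop with the while condition changed to pass < length. The class name PassCounterSort and the method name Sort are made up for this example, and the guard for arrays shorter than two elements is an addition that is not in the original code (without it the inner for never runs, pass never changes, and the while never ends for a one-element array).

```csharp
using System;

public static class PassCounterSort
{
    // Sorts the array in place and returns the number of swaps performed.
    // "pass" counts consecutive comparisons that needed no swap, as in the question.
    public static int Sort(int[] unsorted)
    {
        int length = unsorted.Length;
        int swaps = 0;

        // Added guard (assumption, not in the original): nothing to compare.
        if (length < 2)
        {
            return swaps;
        }

        int pass = 0;
        // "<" instead of "!=": stop as soon as pass reaches or passes the length.
        while (pass < length)
        {
            for (int i = 0, j = 1; j < length; i++, j++)
            {
                if (unsorted[i] > unsorted[j])
                {
                    pass = 0;
                    swaps++;
                    int tmp = unsorted[i];
                    unsorted[i] = unsorted[j];
                    unsorted[j] = tmp;
                }
                else
                {
                    pass++;
                }
            }
        }
        return swaps;
    }
}
```

Calling PassCounterSort.Sort(new[] { 4, 3, 2, 1 }) performs the same six swaps shown in the question's output and then returns after at most two swap-free sweeps instead of waiting for pass to wrap around.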
P.S. You might want to check out this link.

This is just a characteristic of the algorithm you're using to sort. Once it's completed sorting the elements it has no way of knowing the sort is complete, so it does one final pass checking every element again. You can fix this by adding --unsortedNumLen; at the end of your for loop as follows:
for(int i=0,j=1; i < unsortedNumLen-1 && j < unsortedNumLen; i++,j++)
{
    // existing sorting code
}
--unsortedNumLen;
Reason? Because your algorithm is bubbling the biggest value to the end of the array, there is no need to check this element again, since it has already been determined to be larger than all the other elements.

Related

Checking for duplicates in an array

I have checked for answers on the website, but I am curious about the way I am writing my code via C# to check for duplicates in the array. My function sort of works. But in the end when I print out my 5 sets of arrays, duplicates are detected but, the output still contains duplicates. I have also commented out the part where if a duplicate is detected generate a random number to replace the duplicate found in that element of the array. My logic seems to be sound, a nested for loop that starts with the first element then loop through the same array 5 times to see if the initial element matches. So start at element 0, then loop 5 times to see if 0 through 4 matches element 0, then 1 and so on. Plus generating a random number when a duplicate is found and replacing that element is not working out so good. I did see a solution with using a dictionary object and its key, but I don't want to do that, I want to just use raw code to solve this algorithm, no special objects.
My function:
void checkForDuplicates()
{
    int[] test = { 3,6,8,10,2,3 };
    int count = 0;
    Random ranDuplicateChange;
    for(int i = 0; i < test.Length; i++)
    {
        count = 0;
        Console.WriteLine(" {0} :: The current number is: {1} ",i, test[i]);
        for(int j = 0; j < test.Length; j++)
        {
            if (test[i] == test[j])
            {
                count++;
                if (count >= 2)
                {
                    Console.WriteLine("Duplicate found: {0}", test[j]);
                    //ranDuplicateChange = new Random();
                    //test[j] = ranDuplicateChange.Next(1, 72);
                }
            }
        }
    }
}
You can get them using LINQ with lambda expressions:
var duplicates = test.GroupBy(a => a)
                     .Where(g => g.Count() > 1)
                     .Select(i => new { Number = i.Key, Count = i.Count()});
This returns an IEnumerable of an anonymous type with two properties, Number and Count.
"I want to just use raw code to solve this algorithm, no special objects."
I believe what you mean by this is not using LINQ or any other library methods that are out there that can achieve your end-goal easily, but simply to manipulate the array you have and find a way to find duplicates.
Let's leave code aside for a moment and see what we need to do to find the duplicates. Your approach is to start from the beginning, and compare each element to other elements in the array, and see if they are duplicates. Not a bad idea, so let's see what we need to do to implement that.
Your array:
test = 3, 6, 8, 10, 2, 3
What we must do is, take 3, see if it's equal to the next element, and then the next, and then the next, until the end of the array. If duplicates are found replace them.
Second round, take 6, and since we already compared first element 3, we start with 8 and go on till the end of the array.
Third round, start from 8, and go on.
You get the drift.
Now let's take a look at your code.
Now we start at zero-element (I'm using zero based index for convenience), which is 3, and then in the inner-loop of j we see if the next element, 6, is a duplicate. It's not, so we move on. And so on. We do find a duplicate at the last position, then count it. So far so good.
Next loop, now here is your first mistake. Your second loop, j, starts at 0, so when i=1, the first iteration of your j starts at 0, so you're comparing test[1] vs test[0], which you already compared in the first round (your outer loop). What you should instead do is, compare test[1] vs test[2].
So think what you need to change in your code to achieve this, in terms of i and j in your loops. What you want to do is, start your j loop one more than your current i value.
Next, you increment count whenever you find a duplicate, which is fine. But printing the number only when count >= 2 doesn't make sense. Because, you started it at 0, and increment only if you found a duplicate, so even if your counter is 1, that means you've found a duplicate. You should instead simply generate a random number and replace test[j] with it.
I'm intentionally not giving you code samples as you say that you're eager to learn yourself how to solve this problem, which is always a good thing. Hope the above information is useful.
Disclaimer:
All the above is simply to show you how to fix your current code, but in itself it still has flaws. To begin with, your 'replacing with a random number' idea is not watertight. For instance, if it generated the same number you're trying to replace (the odds are low, but it can happen, and a program shouldn't rely on chance to stay correct), you'd still end up with duplicates. The same applies if it generated a number that appears earlier in the list. For example, say your list is 2, 3, 5, 3. The first iteration of i would correctly determine that 2 is not a duplicate. Then in the next iteration, you find that 3 is a duplicate, and replace it. However, if the new randomly generated number turned out to be 2, then since we've already established that 2 is not a duplicate, the newly generated 2 will not be checked again and you'll end up with a list with duplicates. To combat that you can revert to your original idea of starting the j loop at 0 every time, and replace whenever a duplicate is encountered. To do that, you'll need an extra condition that checks whether i == j and, if so, skips that comparison. But even then, the newly generated random number could be equal to one of the numbers in the list, again ruining your logic.
So really, it's fine to attempt this problem this way, but you should also compare your random number to your list every time you generate a number, and if it's equal to an existing element, generate another random number, and so on until you're good.
But at the end of the day, to remove duplicates from a list and replace them with unique numbers there are far easier and less error-prone methods using LINQ etc.
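For readers who want to see the loop structure described above written out, the following is a minimal sketch, not the asker's or the answerer's code. It starts the inner loop at i + 1 and keeps drawing random numbers until the candidate is not already in the array. The names DuplicateReplacer, ReplaceDuplicates and Contains, and the range parameters, are assumptions made for this example.

```csharp
using System;

public static class DuplicateReplacer
{
    // Linear search: true if value occurs anywhere in values.
    private static bool Contains(int[] values, int value)
    {
        for (int k = 0; k < values.Length; k++)
        {
            if (values[k] == value) return true;
        }
        return false;
    }

    // Replaces every later duplicate with a random number from
    // [minInclusive, maxExclusive) that is not currently in the array.
    public static void ReplaceDuplicates(int[] values, Random random,
                                         int minInclusive, int maxExclusive)
    {
        // If the range holds fewer distinct numbers than the array has slots,
        // the retry loop below might never find a free number.
        if ((long)maxExclusive - minInclusive < values.Length)
        {
            throw new ArgumentException("Range is too small to make all elements unique.");
        }

        for (int i = 0; i < values.Length; i++)
        {
            // Start at i + 1: earlier pairs were compared in previous rounds.
            for (int j = i + 1; j < values.Length; j++)
            {
                if (values[i] == values[j])
                {
                    int candidate;
                    do
                    {
                        candidate = random.Next(minInclusive, maxExclusive);
                    } while (Contains(values, candidate));
                    values[j] = candidate;
                }
            }
        }
    }
}
```

With the array from the question, DuplicateReplacer.ReplaceDuplicates(test, new Random(), 1, 72) leaves the first five elements untouched and replaces only the trailing 3. Because the replacement is checked against the whole array, it cannot reintroduce a value that was already ruled out earlier.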

Recursive function to calculate time of a hierarchy

I'm having a problem with a recursive function for calculating the time of a hierarchy. Every item in the hierarchy has a list of its subordinates, and what I'd like to do is get the time sum of all the steps below it when calling the function. The problem is that I have a time driver that the time should be multiplied by. Say, for instance, that the last step in the hierarchy has the time driver 2 and the step above it the time driver 4; the third step from the bottom should then get the time for 4 of those steps, each of which contains 2 of the first step. Hope you get the idea.
public double getIdealTime(double sum, bool first)
{
    if (first)
    {
        sum += this.idealTime;
        first = false;
    }
    else
    {
        sum += this.idealTime * this.timeDriverNumber;
    }
    foreach (var prodel in this.subordinates)
    {
        sum = prodel.getIdealTime(sum, first);
    }
    return sum;
}
The "first" true and false is just a solution to not get the time of all the steps (the time x the time drivers) for the actual step but for only one of that instance.
The problem is that the function doesn't give me the right output value. I'm not sure why, but it feels like the function goes "top to bottom" instead of "bottom to top". That is, since one usually only sets the time for the last items and then calculates the times using the subordinates and their subordinates, it doesn't multiply by any time driver other than the last one.
An example:
Step 1
    Step 2 time driver 8
        Step 2.1 = 3s time driver 4
        Step 2.2 = 4s time driver 3
This should generate (4*3 + 3*4) * 8 = 192 seconds for step 1, but it gives me 24 seconds since it doesn't multiply by step 2's time driver of 8. Step 2 gives me 24 seconds, as it should, since it should give me the time for only one instance of step 2.
Edit: I've tried to work out a working example but I can't simplify the code to build around it enough to get something I could publish here. I'd still really appreciate some help though!
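One possible reading of the requirement, sketched under assumptions since the full class is not shown: each step's total is its own ideal time plus, for every subordinate, that subordinate's total multiplied by the subordinate's own time driver. Computing each subordinate's total first and only then multiplying gives the "bottom to top" behaviour and makes the first flag unnecessary. The class Step and its members below are hypothetical stand-ins for the asker's type.

```csharp
using System.Collections.Generic;

public class Step
{
    public double IdealTime;                 // own time for one instance, 0 if unset
    public double TimeDriverNumber = 1;      // how many instances the parent needs
    public List<Step> Subordinates = new List<Step>();

    // Time for ONE instance of this step, including everything below it.
    public double GetIdealTime()
    {
        double sum = IdealTime;
        foreach (Step child in Subordinates)
        {
            // The child's whole subtree is totalled first, then scaled by
            // the child's time driver before being added to this step.
            sum += child.TimeDriverNumber * child.GetIdealTime();
        }
        return sum;
    }
}
```

For the example in the question, step 2 evaluates to 3*4 + 4*3 = 24 seconds and step 1 to 8 * 24 = 192 seconds, matching the expected values stated above.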

Interview - Write a program to remove even elements

I was asked this today. I was sure the answer was simple, but the interviewer saved the twist for last.
Question
Write a program to remove even numbers stored in ArrayList containing 1 - 100.
I just said wow.
Here is how I implemented it.
ArrayList source = new ArrayList(100);
for (int i = 1; i < 100; i++)
{
    source.Add(i);
}
for (int i = 0; i < source.Count; i++)
{
    if (Convert.ToInt32(source[i]) % 2 == 0)
    {
        source.RemoveAt(i);
    }
}
//source contains only Odd elements
The twist
He asked me what the computational complexity of this is, and to give him an equation. I did, and said it is linear, directly proportional to N (the input).
He said: hmmm.. so that means I need to wait longer to get results when the input size increases, am I right? Yes sir, you are.
Tune it for me, make it Log(N), try as much as you can, he said. I failed miserably in this part.
Hence I came here for the right logic, answer or algorithm to do this.
Note: He wanted no LINQ, no extra bells and whistles. Just plain loops or other logic to do it.
I dare say that the complexity is in fact O(N^2), since removal in arrays is O(N) and it can potentially be called for each item.
So you have O(N) for the traversal of the array(list) and O(N) for each removal => O(N) * O(N).
Since it does not seem clear, I'll explain the reasoning. At each step a removal of an item may take place (assuming the worst case in which every item must be removed). In an array the removal is done by shifting. Hence, to remove the first item, I need to shift all the following N-1 items by one position to the left:
1 2 3 4 5 6...
<---
2 3 4 5 6...
Now, at each iteration I need to shift, so I'm doing N-1 + N-2 + ... + 1 + 0 shifts, which gives a result of (N) * (N-1) / 2 (arithmetic series) giving a final complexity of O(N^2).
Let's think of it this way:
The number of delete actions you are doing is necessarily half of the array length (if the elements are stored in an array). So the complexity is at least O(N).
The question you received makes me suppose that your interviewer wanted you to reason about different ways of storing the numbers.
Usually when you have log complexity you are working with different structures, like graphs or trees.
The only way I can think of to get logarithmic complexity is to have the numbers stored in a tree (ordered tree, B-tree... we could elaborate on this), but that is actually outside the constraints of your question (storing the numbers in an array).
Does it make sense to you?
You can get noticeably better performance if you keep two indexes, one to the current read position and one to the current write position.
int read = 0;
int write = 0;
The idea is that read looks at each member of the array in turn; write keeps track of the current end of the list. When we find a member we want to delete, we move read forwards, but not write.
for (read = 0; read < source.Count; read++) {
    if (Convert.ToInt32(source[read]) % 2 != 0) {
        source[write] = source[read];
        write += 1;
    }
}
Then at the end, tell the ArrayList that its new length is the current value of write, for example with source.RemoveRange(write, source.Count - write).
This takes you from your original O(n^2) down to O(n).
(note: I haven't tested this)
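The read/write idea above can be sketched as a complete method. This is an illustration under assumptions rather than the answerer's tested code: the names EvenRemover and RemoveEvens are made up, each element is unboxed with Convert.ToInt32 as in the question, and the list is shortened with ArrayList.RemoveRange.

```csharp
using System;
using System.Collections;

public static class EvenRemover
{
    // Removes all even numbers from source in a single pass.
    public static void RemoveEvens(ArrayList source)
    {
        int write = 0; // next slot for an element we are keeping

        for (int read = 0; read < source.Count; read++)
        {
            if (Convert.ToInt32(source[read]) % 2 != 0)
            {
                // Keep odd values by copying them down to the write position.
                source[write] = source[read];
                write++;
            }
        }

        // Everything from index write onward is leftover; drop it in one call.
        source.RemoveRange(write, source.Count - write);
    }
}
```

Each element is read once and written at most once, and the single RemoveRange call at the tail has nothing after it to shift, so the work grows linearly with the list size.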
Without changing the data structure or making some assumption about the way items are stored inside the ArrayList, I can't see how you'll avoid checking the parity of each and every member (hence at least O(n) complexity). Perhaps the interviewer simply wanted you to tell him it's impossible.
If you really have to use an ArrayList and actively have to remove the entries (instead of not adding them in the first place), stepping the index by 2 rather than by 1 removes the need to check whether each value is odd.
for (int i = source.Count - 1; i > 0; i = i - 2)
{
    source.RemoveAt(i);
}
Edit: I know this will only work if source contains the entries from 1-100 in sequential order.
The problem with the given solution is that it starts from the beginning, so the entire list must be shifted each time an item is removed:
Initial List: 1, 2, 3, 4, 5, ..., 98, 99
/ / / /// /
After 1st removal: 1, 3, 4, 5, ..., 98, 99, <empty>
/ /// / /
After 2nd removal: 1, 3, 5, ..., 98, 99, <empty>, <empty>
I've used the slashes to try to show how the list shifts after each removal.
You can reduce the complexity (and eliminate the bug I mentioned in the comments) simply by reversing the order of removal:
for (int i = source.Count-1; i >= 0; --i) {
    if (Convert.ToInt32(source[i]) % 2 == 0) {
        // Only elements that were already checked shift left, so nothing is skipped or re-checked.
        source.RemoveAt(i);
    }
}
It is possible IF you have unlimited parallel threads available to you.
Suppose that we have an array with n elements. Assign one thread per element. Assume all threads act in perfect sync.
1. Each thread decides whether its element is even or odd. (Time O(1).)
2. Determine how many elements below it in the array are odd. (Time O(log(n)).)
   - Mark a 0 or 1 in a second array depending on whether you are even or odd at the same index. So each one is a count of odds at that spot.
   - If your index is odd, add the previous number. Now each entry is a count of odds in the current block of 2 up to yourself.
   - If your index mod 4 is 2, add the value at the index below; if it is 3, add the answer 2 indexes below. Now each entry is a count of odds in the current block of 4 up to yourself.
   - Continue this pattern with blocks of 2**i (if you're in the top half, add the count for the bottom half) log2(n) times. Now each entry in this array is the count of odds below.
3. Each CPU inserts its value into the correct slot.
4. Truncate the array to the right size.
I am willing to bet that something like this is the answer your friend has in mind.
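The block-doubling count described above can be simulated on a single thread to check the logic. In the sketch below, every inner for loop stands in for "all threads do this step at once", so only the outer loop over half contributes to the parallel running time. The names ParallelScanSketch and KeepOdds are made up for this illustration, and the result goes into a separate output array, which is an assumption made to keep the scatter step simple.

```csharp
using System;

public static class ParallelScanSketch
{
    public static int[] KeepOdds(int[] input)
    {
        int n = input.Length;
        if (n == 0) return new int[0];

        // Step 1: each "thread" marks 1 for odd, 0 for even.
        int[] counts = new int[n];
        for (int idx = 0; idx < n; idx++)
        {
            counts[idx] = (input[idx] % 2 != 0) ? 1 : 0;
        }

        // Step 2: about log2(n) rounds. In the round for blocks of size 2 * half,
        // entries in the top half of a block add the total of the bottom half,
        // which is stored at the last index of the bottom half.
        for (int half = 1; half < n; half *= 2)
        {
            for (int idx = 0; idx < n; idx++)
            {
                if ((idx & half) != 0) // idx lies in the top half of its block
                {
                    int blockStart = idx - (idx % (2 * half));
                    counts[idx] += counts[blockStart + half - 1];
                }
            }
        }
        // Now counts[idx] is the number of odd values at indexes 0..idx.

        // Steps 3 and 4: each odd element goes to slot counts[idx] - 1,
        // and the result has exactly counts[n - 1] elements.
        int[] result = new int[counts[n - 1]];
        for (int idx = 0; idx < n; idx++)
        {
            if (input[idx] % 2 != 0)
            {
                result[counts[idx] - 1] = input[idx];
            }
        }
        return result;
    }
}
```

Within a round, the top-half entries only read bottom-half entries, which are not modified in that round, so updating the array in place gives the same result as threads running in perfect sync.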

Changing variables outside of Scope C#

I'm a beginner C# programmer, and to improve my skills I decided to give Project Euler a try. The first problem on the site asks you to find the sum of all the multiples of 3 and 5 under 1000. Since I'm essentially doing the same thing twice, I made a method to multiply a base number incrementally, and add the sum of all the answers together.
public static int SumOfMultiplication(int Base, int limit)
{
    bool Escape = false;
    for (int mult = 1; Escape == true; mult++)
    {
        int Number = 0;
        int iSum = 0;
        Number = Base * mult;
        if (Number > limit)
            return iSum;
        else
            iSum = iSum + Number;
    }
Regardless of what I put in for both parameters, it ALWAYS returns zero. I'm 99% sure it has something to do with the scope of the variables, but I have no clue how to fix it. All help is appreciated.
Thanks in advance,
Sam
Your loop never actually executes:
bool Escape = false;
for (int mult = 1; Escape == true; mult++)
Escape is set to false initially, so the first test fails (Escape == true returns false) and the body of the loop is skipped.
The compiler would have told you if you were trying to access variables outside of their defined scope, so that's not the problem. You are also missing a return statement, but that is probably a typo.
I would also note that your code never checks if the number to be added to the sum is actually a multiple of 3 or 5. There are other issues as well (for example, iSum is declared inside of the loop and initialized to 0 after each iteration), but I'll let you work that one out since this is practice. The debugger is your friend in cases like these :)
EDIT: If you need help with the actual logic I'll be happy to help, but I figure you want to work it out on your own if possible.
As others have pointed out, the problem is that the control flow does not do what you think it does. This is a common beginner problem.
My suggestion to you is learn how to use your debugger. Beginners often have this strange idea that they're not allowed to use tools to solve their coding problems; that rather, they have to reason out the defect in the program by simply reading it. Once the programs become more than a page long, that becomes impossible for humans. The debugger is your best friend, so get to know its features really well.
In this case if you'd stepped through the code in the debugger you'd see that the loop condition was being evaluated and then the loop was being skipped. At that point you wouldn't be asking "why does this return zero?", you'd be asking "why is the loop body always skipped?" Clearly that is a much more productive question to ask since that is actually the problem here.
Don't write any code without stepping through it in the debugger. Watch every variable, watch how it changes value (the debugger highlights variables in the watch windows right after they change value, by the way) and make sure that the control flow and the variable changes are exactly as you'd expect. Pay attention to quiet doubts; if anything seems out of the ordinary, track it down, and either learn why it is correct, or fix it until it is.
Regarding the actual problem: remember that 15, 30, 45, 60... are all multiples of both three and five, but you only want to add them to the sum once. My advice when solving Project Euler problems is to write code that is as like what you are trying to solve as is possible. Try writing the problem out in "pseudocode" first. I'd pseudocode this as:
sum = 0
for each positive number under 1000:
    if number is multiple of three or five then:
        add number to sum
Once you have that pseudocode you can notice its subtleties. Like, is 1000 included? Does the problem say "under 1000" or "up to 1000"? Make sure your loop condition considers that. And so on.
The closer the program reads like the problem actually being solved, the more likely it is to be correct.
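As an illustration of how closely code can follow that pseudocode, here is a minimal sketch. The names MultiplesSum and SumBelow are made up for this example, and the loop uses a strict "less than" because the problem as quoted in the question says "under 1000".

```csharp
public static class MultiplesSum
{
    // Sums every positive integer below limit that is a multiple of 3 or 5.
    public static int SumBelow(int limit)
    {
        int sum = 0; // declared outside the loop so it accumulates
        for (int number = 1; number < limit; number++)
        {
            // The "or" means a number such as 15 is added only once.
            if (number % 3 == 0 || number % 5 == 0)
            {
                sum += number;
            }
        }
        return sum;
    }
}
```

A quick sanity check is MultiplesSum.SumBelow(10), which adds 3, 5, 6 and 9 and returns 23.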
It does not enter the for loop because the for condition is false.
Escape == true
returns false.
Advice:
Using a for loop is much simpler if you use the limit as the condition for ending the loop:
for (int mult = 1; something < limit; mult++)
This way, in most cases you do not need to check the condition inside the loop.
Most programming languages have a modulo operator.
http://en.wikipedia.org/wiki/Modulo_operation
It might come in handy with this problem.
There are several problems with this code. The first, and most important, is that you are using the Escape variable only once. It is never changed within your for loop, so it serves no purpose whatsoever. It should be removed. Second, iSum is declared within your for loop, which means it will keep being re-initialized to 0 every time the loop executes. This means the running total is thrown away on each iteration instead of accumulating all the multiples. Here is a corrected code sample:
int iSum = 0;
for(int mult = 1; true; mult++)
{
    int Number = Base * mult;
    if(Number > limit)
        return iSum;
    else
        iSum += Number;
}

Finding a pattern within a string variable

I need to create a heads or tails project where the computer will guess randomly up to 5 times, but on the sixth time it will look into the playersGuessHistory variable, set up as a string, to see if it can find a match for a pattern of 4 entries. If a pattern is found, the computer will guess the next character after the pattern.
For example, given the sequence HHTTH the pattern is HHTT so the computer would guess H for the sixth turn. My only problem is that I'm having difficulty setting up the project so that it will look through playersGuessHistory, find the patterns, and guess the next character in the history. Any suggestions?
Create a List<string> and put the history into it, so that each item in the list is a string of 4 characters (like you show in your text). Then, when the computer should guess, select the items from the list (there should be several) that start with your string (the string StartsWith method can do the comparison), sum up the number of times that H is the next character and the number of times that T is the next character, calculate the probability of each, and let the computer choose the one with the highest probability...
Does it make sense?
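One way to sketch the "find the pattern, then count what followed it" idea is shown below. It is an interpretation under assumptions, not the answerer's List<string> design: it takes the last patternLength characters of the history as the pattern, looks for earlier occurrences in the same string, and tallies the character that followed each one. The names PatternGuesser and GuessFromPattern are made up, and null is returned when no earlier occurrence exists so the caller can fall back to a random guess.

```csharp
using System;

public static class PatternGuesser
{
    // Returns 'H' or 'T' based on what followed earlier occurrences of the
    // most recent pattern, or null if the pattern never occurred before.
    public static char? GuessFromPattern(string history, int patternLength)
    {
        if (history.Length <= patternLength) return null;

        string pattern = history.Substring(history.Length - patternLength);
        int heads = 0;
        int tails = 0;

        // Only consider occurrences that are followed by another character,
        // which excludes the pattern at the very end of the history.
        for (int start = 0; start + patternLength < history.Length; start++)
        {
            if (string.CompareOrdinal(history, start, pattern, 0, patternLength) == 0)
            {
                if (history[start + patternLength] == 'H') heads++;
                else tails++;
            }
        }

        if (heads == 0 && tails == 0) return null;
        return heads >= tails ? 'H' : 'T'; // ties go to heads (arbitrary choice)
    }
}
```

For the history "HHTTHHHTT" and a pattern length of 4, the pattern is "HHTT", it occurred once before at the start, and it was followed by H, so the guess is H.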
This is a little snippet based on what I understand of your requirement. The below method will return a string of guesses of 'H' heads or 'T' tails. The first 5 guesses are random, and then if any sequence of 4 guesses is HHTT the final guess will be 'H'.
static string HeadsOrTails()
{
    string guessHistory = String.Empty;
    // Guess heads or tails 5 times
    Random random = new Random();
    for (int currentGuess = 0; currentGuess < 5; currentGuess++)
    {
        if (random.Next(2) == 0)
            guessHistory += 'H';
        else
            guessHistory += 'T';
    }
    // Analyse pattern of guesses
    if (guessHistory.Substring(0, 4) == "HHTT" || guessHistory.Substring(1, 4) == "HHTT")
    {
        // If guess history contains HHTT then make the 6th guess = H
        guessHistory += 'H';
    }
    return guessHistory;
}
This is a very simple implementation and will only work for 5 random initial guesses, but it should be quite easy to enhance as needed.
First of all, if the heads and tails are really random, like results from flipping an actual coin, this task is pointless. The computer will always get the next throw right with probability 1/2, regardless of any perceived patterns in the history. (See "Independence".)
Now, if the heads and tails are not really random (e.g. they are created by a person calling heads or tails in a way he thinks is random), then we can maybe get the computer to have a higher success rate than 1/2.
I'd try the following: To start, check how often in the history
heads are followed by heads
heads are followed by tails
and use these numbers to estimate the transition probabilities H->H and H->T, do the same with the tails, and guess the next outcome based on the last one, choosing whatever seems more probable.
Say in the sequence "HHHTH", you find
- H->H: 2 of 3
- H->T: 1 of 3
- T->H: 1 of 1
Since the last throw came up heads, the computer should choose heads as the guess for the next throw.
Now, you can experiment with taking longer parts of the history into account, by counting the transitions "HH->T" and so on and try to improve your success rate.
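The transition-counting approach above can be sketched as follows. This is a minimal illustration under assumptions: the names CoinGuesser and GuessNext are made up, the history is assumed to contain only the characters 'H' and 'T', and ties (including an empty history) default to heads, which is an arbitrary choice.

```csharp
public static class CoinGuesser
{
    // Guesses the next outcome from how often the most recent outcome
    // was followed by heads versus tails earlier in the history.
    public static char GuessNext(string history)
    {
        if (string.IsNullOrEmpty(history)) return 'H'; // nothing to learn from

        char last = history[history.Length - 1];
        int toHeads = 0;
        int toTails = 0;

        // Look at every adjacent pair whose first element equals the last outcome.
        for (int i = 0; i + 1 < history.Length; i++)
        {
            if (history[i] != last) continue;
            if (history[i + 1] == 'H') toHeads++;
            else toTails++;
        }

        return toHeads >= toTails ? 'H' : 'T';
    }
}
```

For "HHHTH" the last outcome is H, and H was followed by H twice and by T once, so GuessNext returns 'H', in line with the worked example above.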
