need help understanding a loop in "head first c#" - c#

I'm running through a book called "head first c#" and there's a part where it goes through loops that I need to figure out if they eventually end or run forever.
One of these loops the book says runs 8 times, however for the life of me it should run essentially forever.
It starts off with p and q equal (both 2) and runs while q is less than 32. Because p is equal to q, not less than it, it skips the while loop and goes straight to q = p - q; which is q = 2 - 2;
At this point the for loop finishes its first run and does q = q * 2; which is now q = 0 * 2;
At this point it is now impossible for q to ever go up and the for loop should run forever, and p will NEVER be less than q, meaning the while loop is never run....
So HOW THE HELL does it run 8 times?
I was expecting there to be a trick I was missing and read through it carefully a number of times but I can't find it.

You're not quite correct here, because you're missing the fact that the for loop's iterator expression (q = q * 2) runs at the end of every iteration, after the body.
To help you understand what's going on here, let me break this down. When the loop starts, p and q are both 2.
In the first iteration, you are correct that the while loop does not run, and therefore q is set to 0. The iterator then does q = q * 2, which leaves it at 0.
In the second iteration, again the while loop does not run. In this case, q is then set to p - q, which is 2 - 0 = 2. The iterator then does q = q * 2, which sets q to 4.
In the third iteration, the while loop runs and updates p to 4. We then repeat the same operation as in the first iteration, and q is set to 0.
In the fourth iteration, the loop operates similarly to the second iteration, and p - q results in 4. The iterator then does q = q * 2, which sets q to 8.
From this it's easy to see that each odd iteration leaves q at zero (after p has caught up with the previous q), and each even iteration results in q being set to 2 * p. Therefore, q will be equal to 32 after 8 iterations.
As an aside, if you're curious to see how a piece of code is working, you could always write it in C# and step into the code in Debugging Mode. Doing so would show you the value of all variables within scope after each line is executed.
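If you'd rather not use the debugger, a small trace program shows the same thing. The question doesn't quote the book's code, so the loop below is a reconstruction that matches the description above (p and q start at 2, the condition is q < 32, the iterator is q = q * 2); the doubling inside the while loop and the names LoopTrace and CountIterations are my own assumptions.

```csharp
using System;

static class LoopTrace
{
    // Reconstructed loop (an assumption: the book's exact code is not quoted above).
    public static int CountIterations()
    {
        int iterations = 0;
        int p = 2;
        for (int q = 2; q < 32; q = q * 2) // the iterator runs after the body
        {
            while (p < q)
            {
                p = p * 2; // assumed update; the answer only says p becomes 4
            }
            q = p - q;
            iterations++;
            Console.WriteLine("iteration {0}: p = {1}, q = {2} (before q = q * 2)",
                              iterations, p, q);
        }
        return iterations;
    }

    static void Main()
    {
        Console.WriteLine("Total iterations: {0}", CountIterations());
    }
}
```

Printing q before the iterator runs makes the alternating pattern visible: q is 0 on odd iterations and equal to p on even ones.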

Checking for duplicates in an array

I have checked for answers on the website, but I am curious about the way I am writing my code in C# to check for duplicates in the array. My function sort of works. But in the end, when I print out my 5 sets of arrays, duplicates are detected, but the output still contains duplicates. I have also commented out the part that, when a duplicate is detected, generates a random number to replace the duplicate found in that element of the array. My logic seems to be sound: a nested for loop that starts with the first element, then loops through the same array 5 times to see if the initial element matches. So start at element 0, then loop 5 times to see if 0 through 4 matches element 0, then 1 and so on. Plus, generating a random number when a duplicate is found and replacing that element is not working out so well. I did see a solution using a dictionary object and its key, but I don't want to do that. I want to just use raw code to solve this algorithm, no special objects.
My function:
void checkForDuplicates()
{
    int[] test = { 3, 6, 8, 10, 2, 3 };
    int count = 0;
    Random ranDuplicateChange;
    for (int i = 0; i < test.Length; i++)
    {
        count = 0;
        Console.WriteLine(" {0} :: The current number is: {1} ", i, test[i]);
        for (int j = 0; j < test.Length; j++)
        {
            if (test[i] == test[j])
            {
                count++;
                if (count >= 2)
                {
                    Console.WriteLine("Duplicate found: {0}", test[j]);
                    //ranDuplicateChange = new Random();
                    //test[j] = ranDuplicateChange.Next(1, 72);
                }
            }
        }
    }
}
You can get them using lambda expressions:
var duplicates = test.GroupBy(a => a)
                     .Where(g => g.Count() > 1)
                     .Select(i => new { Number = i.Key, Count = i.Count() });
This returns an IEnumerable of an anonymous type with 2 properties, Number and Count
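For completeness, here is a minimal runnable sketch of that query applied to the array from the question. The wrapper names (DuplicateReport, FindDuplicates) are made up for illustration, and I've used a value tuple instead of the anonymous type so the result can be returned from a method.

```csharp
using System;
using System.Linq;

static class DuplicateReport
{
    // Returns one (Number, Count) pair for every value that occurs more than once.
    public static (int Number, int Count)[] FindDuplicates(int[] values)
    {
        return values.GroupBy(a => a)
                     .Where(g => g.Count() > 1)
                     .Select(g => (Number: g.Key, Count: g.Count()))
                     .ToArray();
    }

    static void Main()
    {
        int[] test = { 3, 6, 8, 10, 2, 3 };
        foreach (var d in FindDuplicates(test))
        {
            Console.WriteLine("{0} appears {1} times", d.Number, d.Count);
        }
    }
}
```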
"I want to just use raw code to solve this algorithm, no special objects."
I believe what you mean by this is not using LINQ or any other library methods that are out there that can achieve your end-goal easily, but simply to manipulate the array you have and find a way to find duplicates.
Let's leave code aside for a moment and see what we need to do to find the duplicates. Your approach is to start from the beginning, and compare each element to other elements in the array, and see if they are duplicates. Not a bad idea, so let's see what we need to do to implement that.
Your array:
test = 3, 6, 8, 10, 2, 3
What we must do is, take 3, see if it's equal to the next element, and then the next, and then the next, until the end of the array. If duplicates are found replace them.
Second round, take 6, and since we already compared first element 3, we start with 8 and go on till the end of the array.
Third round, start from 8, and go on.
You get the drift.
Now let's take a look at your code.
Now we start at zero-element (I'm using zero based index for convenience), which is 3, and then in the inner-loop of j we see if the next element, 6, is a duplicate. It's not, so we move on. And so on. We do find a duplicate at the last position, then count it. So far so good.
Next loop, now here is your first mistake. Your second loop, j, starts at 0, so when i=1, the first iteration of your j starts at 0, so you're comparing test[1] vs test[0], which you already compared in the first round (your outer loop). What you should instead do is, compare test[1] vs test[2].
So think what you need to change in your code to achieve this, in terms of i and j in your loops. What you want to do is, start your j loop one more than your current i value.
Next, you increment count whenever you find a duplicate, which is fine. But once j starts at i + 1, printing the number only when count >= 2 doesn't make sense: count starts at 0 and is incremented only when a duplicate is found, so a count of 1 already means you've found one. You should instead simply generate a random number and replace test[j] with it.
I'm intentionally not giving you code samples as you say that you're eager to learn yourself how to solve this problem, which is always a good thing. Hope the above information is useful.
Disclaimer:
All the above is simply to show you how to fix your current code, but in itself, it still has flaws. To begin with, your 'replacing with random number' idea is not watertight. For instance, if it generated the same number you're trying to replace (although the odds are low, it can happen, and when you write a program you shouldn't rely on chance for your program to not go wrong), you'd still end up with duplicates. The same goes if it generated a number that appears earlier in the list. For example, say your list is 2, 3, 5, 3. The first iteration of i would correctly determine that 2 is not a duplicate. Then in the next iteration, you find that 3 is a duplicate, and replace it. However, if the new randomly generated number turned out to be 2, then since we've already established that 2 is not a duplicate, the newly generated 2 will not be overwritten again and you'll end up with a list with duplicates. To combat that you can revert to your original idea of starting the j loop at 0 every time, and replace if a duplicate is encountered. To do that, you'll need an extra condition to see if i == j and, if so, skip that comparison. But even then, the newly generated random number could be equal to one of the numbers in the list, again ruining your logic.
So really, it's fine to attempt this problem this way, but you should also compare your random number to your list every time you generate a number and, if it's equal to an existing element, generate another random number, and so on until you're good.
But at the end of the day, to remove duplicates from a list and replace them with unique numbers, there are much easier and less error-prone methods using LINQ etc.
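If you get stuck, here is one way the points above could fit together using only loops and an array. Treat it as a sketch rather than the definitive solution: the names DuplicateReplacer, ReplaceDuplicates and ContainsElsewhere are made up, and the range 1 to 72 is simply borrowed from the commented-out line in the question.

```csharp
using System;

static class DuplicateReplacer
{
    // True if value occurs anywhere in values other than at index skip.
    static bool ContainsElsewhere(int[] values, int value, int skip)
    {
        for (int k = 0; k < values.Length; k++)
        {
            if (k != skip && values[k] == value) return true;
        }
        return false;
    }

    // Replaces every later duplicate, in place, with a random number in [min, max)
    // that does not already occur elsewhere in the array.
    public static void ReplaceDuplicates(int[] values, Random rng, int min, int max)
    {
        // Assumption: the range must hold enough distinct numbers,
        // otherwise the retry loop below could never finish.
        if (max - min < values.Length)
            throw new ArgumentException("Range is too small for unique values.");

        for (int i = 0; i < values.Length; i++)
        {
            for (int j = i + 1; j < values.Length; j++) // start one past i
            {
                if (values[i] != values[j]) continue;

                int candidate;
                do
                {
                    candidate = rng.Next(min, max);
                } while (ContainsElsewhere(values, candidate, j)); // retry on a clash
                values[j] = candidate;
            }
        }
    }

    static void Main()
    {
        int[] test = { 3, 6, 8, 10, 2, 3 };
        ReplaceDuplicates(test, new Random(), 1, 72);
        Console.WriteLine(string.Join(" ", test));
    }
}
```

Because each replacement is checked against the whole array before it is stored, a value that has been replaced stays unique for the rest of the run, which is what closes the gap described in the disclaimer.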

Is it possible to determine if a list operation is thread safe?

I am attempting to check, for each element X in a list, ListA, whether two properties of X, X.Code and X.Rate, match the Code and Rate of any element Y in ListB. The current solution uses LINQ and AsParallel to execute these comparisons (time is a factor, and each list can contain anywhere from 0 elements to a couple hundred elements).
So far the AsParallel method seems much faster, however I am not sure that these operations are thread-safe. My understanding is that because this comparison will only be reading values and not modifying them that this should be safe but I am not 100% confident. How can I determine if this operation is thread-safe before unleashing it on my production environment?
Here is the code I am working with:
var s1 = System.Diagnostics.Stopwatch.StartNew();
ListA.AsParallel().ForAll(x => x.IsMatching = ListB.AsParallel().Any(y => x.Code== y.Code && x.Rate== y.Rate));
s1.Stop();
var s2 = System.Diagnostics.Stopwatch.StartNew();
ListA.ForEach(x => x.IsMatching = ListB.Any(y => x.Code == y.Code && x.Rate== y.Rate));
s2.Stop();
Currently each method returns the same result, however the AsParallel() executes in ~1/3 the time as the plain ForEach, so I hope to benefit from that if there is a way to perform this operation safely.
The code you have is thread-safe. The lists are being accessed as read-only, and the implicit synchronization required to implement the parallelized version is sufficient to ensure any writes have been committed. You do modify the elements within the list, but again, the synchronization implicit in the parallel operation, which the current thread necessarily has to wait on, will ensure any writes to the element objects are visible in the current thread.
That said, the thread safety is irrelevant, because you are doing the whole thing wrong. You are applying a brute force, O(N^2) algorithm to a need that can be addressed using a more elegant and efficient solution, the LINQ join:
var join = from x in list1
           join y in list2 on new { x.Code, x.Rate } equals new { y.Code, y.Rate }
           select x;
foreach (A a in join)
{
    a.IsMatching = true;
}
Your code example didn't include any initialization of sample data. So I can't reproduce your results with any reliability. Indeed, in my test set, where I initialized list1 and list2 identically, with each having the same 1000 elements (I simply set Code and Rate to the element's index in the list, i.e. 0 through 999), I found the AsParallel() version slower than the serial version, by a little more than 25% (i.e. 250 iterations of the parallel version took around 2.7 seconds, while 250 iterations of the serial version took about 1.9 seconds).
But neither came close to the join version, which completed 250 iterations of that particular test data in about 60 milliseconds, almost 20 times faster than the faster of the other two implementations.
I'm reasonably confident that in spite of my lack of a comparable data set relative to your scenario, that the basic result will still stand, and that you will find the use of the join approach far superior to either of the options you've tried so far.
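To make the join version easy to try, here is a small self-contained sketch with made-up sample data. The Item class, its property types (int Code, decimal Rate) and the names JoinDemo and MarkMatches are assumptions, since the question doesn't show the element type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type; the question does not show the real one.
class Item
{
    public int Code { get; set; }
    public decimal Rate { get; set; }
    public bool IsMatching { get; set; }
}

static class JoinDemo
{
    // Sets IsMatching on every element of listA that has a (Code, Rate) match in listB.
    // Elements without a match are left untouched, so IsMatching should start out false.
    public static void MarkMatches(List<Item> listA, List<Item> listB)
    {
        var matches = from x in listA
                      join y in listB on new { x.Code, x.Rate } equals new { y.Code, y.Rate }
                      select x;
        foreach (Item a in matches)
        {
            a.IsMatching = true;
        }
    }

    static void Main()
    {
        var listA = new List<Item>
        {
            new Item { Code = 1, Rate = 1.5m },
            new Item { Code = 2, Rate = 2.5m },
            new Item { Code = 3, Rate = 3.5m },
        };
        var listB = new List<Item>
        {
            new Item { Code = 2, Rate = 2.5m },
            new Item { Code = 3, Rate = 9.9m }, // same Code, different Rate
        };
        MarkMatches(listA, listB);
        foreach (Item a in listA)
        {
            Console.WriteLine("Code {0}: IsMatching = {1}", a.Code, a.IsMatching);
        }
    }
}
```

The anonymous types in the join key compare by value, so both Code and Rate have to be equal for a pair to match.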

Sorting numbers array issue

Yesterday at work I set out to figure out how to sort numbers without using the library method Array.Sort. I worked on and off when time permitted and finally was able to come up with a basic working algorithm at the end of today. It might be rather stupid and the slowest way, but I am content that I have a working code.
But there is something wrong or missing in the logic, that is causing the output to hang before printing the line: Numbers Sorted. (12/17/2011 2:11:42 AM)
This delay is directly proportionate to the number of elements in the array. To be specific, the output just hangs at the position where I put the tilde in the results section below. The content after tilde is getting printed after that noticeable delay.
Here is the code that does the sort:
while (pass != unsortedNumLen)
{
    for (int i = 0, j = 1; i < unsortedNumLen - 1 && j < unsortedNumLen; i++, j++)
    {
        if (unsorted[i] > unsorted[j])
        {
            pass = 0;
            swaps++;
            Console.Write("Swapping {0} and {1}:\t", unsorted[i], unsorted[j]);
            tmp = unsorted[i];
            unsorted[i] = unsorted[j];
            unsorted[j] = tmp;
            printArray(unsorted);
        }
        else pass++;
    }
}
The results:
Numbers unsorted. (12/17/2011 2:11:19 AM)
4 3 2 1
Swapping 4 and 3: 3 4 2 1
Swapping 4 and 2: 3 2 4 1
Swapping 4 and 1: 3 2 1 4
Swapping 3 and 2: 2 3 1 4
Swapping 3 and 1: 2 1 3 4
Swapping 2 and 1: 1 2 3 4
~
Numbers sorted. (12/17/2011 2:11:42 AM)
1 2 3 4
Number of swaps: 6
Can you help identify the issue with my attempt?
Link to full code
This is not homework, just me working out.
Change the condition in your while to this:
while (pass < unsortedNumLen)
Logically pass never equals unsortedNumLen so your while won't terminate.
pass does eventually equal unsortedNumLen when it goes over the max value of an int and loops around to it.
In order to see what's happening yourself while it's in the hung state, just hit the pause button in Visual Studio and hover your mouse over pass to see that it contains a huge value.
You could also set a breakpoint on the while line and add a watch for pass. That would show you that the first time the list is sorted, pass equals 5.
It sounds like you want a hint to help you work through it and learn, so I am not posting a complete solution.
Change your else block to the below and see if it puts you on the right track.
else {
Console.WriteLine("Nothing to do for {0} and {1}", unsorted[i], unsorted[j]);
pass++;
}
Here is the fix:
while(pass < unsortedNumLen)
And here is why the delay occurred.
After the end of the for loop in which the array was eventually sorted, pass contains at most unsortedNumLen - 2 (if the last change was between first and second members). But it does not equal the unsorted array length, so another iteration of while and inner for starts. Since the array is sorted unsorted[i] > unsorted[j] is always false, so pass always gets incremented - exactly the number of times j got incremented, and that is the unsortedNumLen - 1. Which is not equal to unsortedNumLen, and so another iteration of while begins. Nothing essentially changed, and after this iteration pass contains 2 * (unsortedNumLen - 1), which is still not equal to unsortedNumLen. And so on.
When pass reaches int.MaxValue, overflow happens, and the next value pass gets is int.MinValue. The process goes on until pass finally holds the value unsortedNumLen at the moment the while condition is checked. If you are particularly unlucky, this might never happen at all.
P.S. You might want to check out this link.
This is just a characteristic of the algorithm you're using to sort. Once it has finished sorting the elements, it has no way of knowing the sort is complete, so it does one final pass checking every element again. You can fix this by adding --unsortedNumLen; after your for loop as follows:
for (int i = 0, j = 1; i < unsortedNumLen - 1 && j < unsortedNumLen; i++, j++)
{
    // existing sorting code
}
--unsortedNumLen;
Reason? Because your algorithm bubbles the biggest value to the end of the array, there is no need to check this element again, since it has already been determined to be larger than all the other elements.
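Another common way to avoid the counting problem altogether is to track whether a pass made any swap, instead of comparing a running counter against the array length. The sketch below is not the original poster's code: the names BubbleSortDemo and Sort are made up, and it also shrinks the scanned range after each pass, as suggested above.

```csharp
using System;

static class BubbleSortDemo
{
    // Sorts values in place and returns the number of swaps performed.
    public static int Sort(int[] values)
    {
        int swaps = 0;
        int end = values.Length; // elements at index >= end are already in place
        bool swapped = true;
        while (swapped)
        {
            swapped = false;
            for (int i = 0; i + 1 < end; i++)
            {
                if (values[i] > values[i + 1])
                {
                    int tmp = values[i];
                    values[i] = values[i + 1];
                    values[i + 1] = tmp;
                    swaps++;
                    swapped = true;
                }
            }
            end--; // the largest remaining value has bubbled to the end
        }
        return swaps;
    }

    static void Main()
    {
        int[] unsorted = { 4, 3, 2, 1 };
        int swaps = Sort(unsorted);
        Console.WriteLine(string.Join(" ", unsorted));
        Console.WriteLine("Number of swaps: {0}", swaps);
    }
}
```

The loop stops after the first pass that performs no swap, so there is no counter that can overshoot the target value.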

How to sort depended objects by dependency for maximum concurrency

I have a list of dependencies (or better, a DAG):
Input (for example):
Item A depends on nothing
Item B depends on C, E
Item C depends on A
Item D depends on E
Item E depends on A
What I'm looking for is: "What is the best* parallel order of the items?"
*best means: maximum concurrency level
Result (for example):
[A], [E, C], [D, B]
The best approach seems to be Pseudocode for getting order based on Dependency but I think I miss a basic algorithm on this.
This looks a lot like https://en.wikipedia.org/wiki/Program_Evaluation_and_Review_Technique and https://en.wikipedia.org/wiki/Critical_path_method
Assuming that you want the maximum concurrency level to get the thing done as soon as possible, once you have organised things so that it takes no longer than the critical path you have something that does as well as the best possible solution - and if there is no limit on the amount of parallel tasks you can run, you can get the critical path just by scheduling every action as soon as all of its dependencies have completed.
I'm not sure you actually want the kind of answer you think you want. For example, in your scenario, you might get item D done before item C if D and E were faster than C, so the list-of-lists you got won't necessarily tell the whole story.
If you have to actually implement this kind of workflow, as opposed to just predicting the workflow in advance, it's easy to do it optimally; whenever a task completes, just scan all the remaining tasks and fire off in parallel any of them whose dependencies are satisfied. If you want to compute it in advance, then maybe you want to reconsider the structure of your result?
GetLists(tasks[0..m-1], depends[0..m-1])
    topological_sort(tasks)    // depends[] is reordered to match
    cumulative = set()
    lists = queue()
    i = 0
    while |cumulative| != m do
        temp = set()
        while i < m and depends[i] is a subset of cumulative do
            temp = temp union {tasks[i]}
            i = i + 1
        cumulative = cumulative union temp
        lists.enqueue(temp)
    return lists
Something like that might work. Note that the lynchpin is doing a "topological sort" to ensure that you get termination. Also note that, as is, this algorithm is only correct for the set of inputs with a valid solution. If there is no solution, this loops forever. Easy to fix, but you can handle that.
An example: A depends on nothing, B and C depend on A, E depends on A and C and D depends on C and B.
Topological sort: A, B, C, D, E.
cumulative = {}
lists = []
i = 0
|cumulative| = 0 < 5 so...
temp = {}
depends[A] = {} is a subset of {} so
temp = {A}
i = 1
depends[B] = {A} is not a subset of {}, so break
cumulative = {A}
lists = [{A}]
|cumulative| = 1 < 5 so...
temp = {}
depends[B] = {A} is a subset of {A}, so
temp = {B}
i = 2
depends[C] = {A} is a subset of {A}, so
...
You get the idea.
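A related way to get the same kind of list-of-lists is to repeatedly collect every item whose dependencies are already scheduled. The C# sketch below is a variation on the pseudocode rather than a literal translation: it rescans all remaining items for each level instead of walking a topologically sorted array, and it throws when no progress can be made. The names DependencyLevels and GetLevels are made up, and the input is the example from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DependencyLevels
{
    // dependsOn maps each item to the items it depends on.
    // Assumption: every item, including ones without dependencies, appears as a key.
    public static List<List<string>> GetLevels(Dictionary<string, string[]> dependsOn)
    {
        var done = new HashSet<string>();
        var levels = new List<List<string>>();
        while (done.Count < dependsOn.Count)
        {
            // Every unscheduled item whose dependencies are all done can run now.
            List<string> ready = dependsOn.Keys
                .Where(item => !done.Contains(item) && dependsOn[item].All(done.Contains))
                .OrderBy(item => item, StringComparer.Ordinal)
                .ToList();
            if (ready.Count == 0)
                throw new InvalidOperationException("Cycle or unknown dependency detected.");
            levels.Add(ready);
            done.UnionWith(ready);
        }
        return levels;
    }

    static void Main()
    {
        var dependsOn = new Dictionary<string, string[]>
        {
            { "A", new string[0] },
            { "B", new[] { "C", "E" } },
            { "C", new[] { "A" } },
            { "D", new[] { "E" } },
            { "E", new[] { "A" } },
        };
        foreach (List<string> level in GetLevels(dependsOn))
        {
            Console.WriteLine("[" + string.Join(", ", level) + "]");
        }
    }
}
```

Each item lands in the earliest level at which all of its dependencies are available, which is the grouping the question asks for when every task is assumed to take the same time.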

Increase value over time using mathematical algorithm

I am writing a test tool which places a large amount of load on a network service. I would like this tool to start with little load and gradually increase over time. I am sure that there is some trigonometry which can do this sort of calculation in one line of code, but I am not a math guru (yet). Is there some sort of library (or simple algorithm) which can help with this calculation?
The code would ideally take a few arguments:
algorithm to use (determines how quickly the value increases)
starting value
ending value (maximum)
time (amount of time between starting and ending value)
step (granularity in milliseconds)
So every [step] an event would be raised indicating what the value is at that point in time.
This is an ideal implementation though, so I am open to suggestion.
Any input would be greatly appreciated, thank you :)
EDIT:
Let me be more clear ... the amount which the value increases is not linear, it is a curve.
If you desire some form of saturation (see Sigmoid function), have a look at my answer here. Another common function shape would be linear or exponential growth. Just let me know if you need one of the latter.
I think what you need is some easing function.
There is a set of famous easing functions created by Robert Penner. You may try to look at:
Tweener transition cheat sheets which visualize Robert Penner's equations.
Robert Penner's original code should be at his webpage.
value = startValue + (endValue - startValue) / (time / stepSize) * currentStep;
heck just add one each time the timer goes off
If I understood correctly, why not do this (using the "variables" you defined):
Overall, you need to progress through (ending value - starting value) values.
Using the time variable, you can figure out how much of an increase you want every millisecond (let's call this increase-amount).
Step just tells you how much time to "sleep" between each value you raise. Every time a new value is raised, you just do last-value + (milliseconds-since-last_step * increase-amount).
Note: I'm not sure why you need the first variable (algorithm to use), since it seems to me that its role is defined by the other variables.
Are you looking for something like this? (In Python, sorry, my C# is rusty at best.)
Given you have a curve f that takes values from 0 to 1:
import math

def my_stepper(f, start, end, time, step_size, current_step):
    # Fraction of the total time elapsed so far, from 0 to 1
    x = current_step * step_size / time
    f_1 = f(1)
    f_0 = f(0)
    # Rescale f so that x in [0, 1] maps onto [start, end]
    y = start + (end - start) * (f(x) - f_0) / (f_1 - f_0)
    return y

for i in range(11):
    # increment increases over time
    print('exp', my_stepper(math.exp, 100., 200., 10., 1., i))
    # increment decreases over time
    print('log', my_stepper(lambda x: math.log(1 + x), 100., 200., 10., 1., i))
Pseudo logic for your problem:
Let the function be F(a + b*x) for a given step x,
let the starting value be start,
let the ending value be end,
let the starting time be 0 and the final time be time,
and let InverseF be the inverse function of F.
When x = 0, F(a) = start, hence a = InverseF(start).
When x = time, F(a + b*time) = end, hence b = (InverseF(end) - a) / time, which reduces to b = (InverseF(end) - InverseF(start)) / time.
Finally, for any x = step, the value is F(a + b*step), which is
F( InverseF(start) + (InverseF(end) - InverseF(start)) / time * step )
For example, if F(x) is linear, i.e. F(x) = x, then
value = start + (end - start) / time * step
if F(x) is x*x, then
value = ( sqrt(start) + (sqrt(end) - sqrt(start)) / time * step ) * ( sqrt(start) + (sqrt(end) - sqrt(start)) / time * step )
if F(x) is exp(x), then
value = Exp( log(start) + (log(end) - log(start)) / time * step )
if F(x) is log(x), then
value = Log( exp(start) + (exp(end) - exp(start)) / time * step )
and so on.
Another approach, without using the inverse function, is explained below.
Let the function be a + b*F(x) for a given step x,
let the starting value be start,
let the ending value be end,
let the starting time be 0 and the final time be time.
Then a + b*F(0) = start and a + b*F(time) = end. Solving for a and b, you will get
value = start + (end - start) / (F(time) - F(0)) * (F(x) - F(0))
and for a step x,
value = start + (end - start) / (F(time) - F(0)) * (F(step) - F(0))
I hope one of the above will solve your problem.
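Since the question is about C#, here is a small sketch of both approaches in that language. The names Ramp, Rescaled and Exponential are made up for illustration; Rescaled follows the second formula for an arbitrary F, and Exponential is the first approach with F(x) = exp(x), which assumes start and end are both positive.

```csharp
using System;

static class Ramp
{
    // Second approach: value = start + (end - start) * (F(x) - F(0)) / (F(time) - F(0)).
    // Requires F(time) != F(0).
    public static double Rescaled(Func<double, double> f, double start, double end,
                                  double time, double x)
    {
        double f0 = f(0);
        return start + (end - start) * (f(x) - f0) / (f(time) - f0);
    }

    // First approach with F = exp: value = exp(log(start) + (log(end) - log(start)) / time * x).
    // Requires start > 0 and end > 0.
    public static double Exponential(double start, double end, double time, double x)
    {
        double a = Math.Log(start);
        double b = (Math.Log(end) - a) / time;
        return Math.Exp(a + b * x);
    }

    static void Main()
    {
        // Ramp from 100 to 200 over 10 time units, sampled once per unit.
        for (int step = 0; step <= 10; step++)
        {
            Console.WriteLine("step {0}: quadratic {1:F2}, exponential {2:F2}",
                              step,
                              Rescaled(x => x * x, 100, 200, 10, step),
                              Exponential(100, 200, 10, step));
        }
    }
}
```

Both functions return start at x = 0 and end at x = time; only the shape of the curve in between differs.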
I'll reuse my answer to another question. The functions given for that answer will not spend much time doing low load, but quickly go to medium-heavy load and then increase load more slowly to reach the maximum. If you need more values in the middle of the possible loads, or more values in the low load, just pass an appropriate distribution function.
Given your input parameters I would call it like this:
Spread(startingValue, endingValue, time/step, x => 1-(1-x)*(1-x))
Sample algorithm functions:
FocusOnHighLoad = x => 1-(1-x)*(1-x)
FocusOnLowLoad = x => x * x
FocusOnMediumLoad = x => (1 + Math.Pow(x * 2 - 1, 3)) / 2
Sample output:
foreach (double load in Spread(50, 1000, 9, FocusOnHighLoad))
Console.WriteLine("Working with {0} load", load);
Working with 50 load
Working with 272.65625 load
Working with 465.625 load
Working with 628.90625 load
Working with 762.5 load
Working with 866.40625 load
Working with 940.625 load
Working with 985.15625 load
Working with 1000 load
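The Spread function itself lives in the linked answer and isn't reproduced here. A minimal sketch that is consistent with the call and the sample output above might look like the following; the signature and the class name LoadSpread are assumptions, not the original code.

```csharp
using System;
using System.Collections.Generic;

static class LoadSpread
{
    // Yields count values from min to max, spaced according to distribution.
    // distribution is expected to map [0, 1] onto [0, 1], with f(0) = 0 and f(1) = 1.
    public static IEnumerable<double> Spread(double min, double max, int count,
                                             Func<double, double> distribution)
    {
        if (count < 2)
            throw new ArgumentOutOfRangeException(nameof(count), "Need at least two values.");
        for (int i = 0; i < count; i++)
        {
            double x = (double)i / (count - 1); // position in [0, 1]
            yield return min + (max - min) * distribution(x);
        }
    }

    static void Main()
    {
        Func<double, double> focusOnHighLoad = x => 1 - (1 - x) * (1 - x);
        foreach (double load in Spread(50, 1000, 9, focusOnHighLoad))
        {
            Console.WriteLine("Working with {0} load", load);
        }
    }
}
```

Swapping in x => x * x or the cubic function from the list above changes where the values cluster without touching the loop.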
