I want to check if my OCR result (a string) is either "No Edge" or "No Signal".
The problem is that sometimes I get "N0 Edge", "No Signa1", "N0 signa1", "No 5ignal", etc. The letters o, S, i and l can come back as digits or other characters. Unfortunately there is nothing else I can do about the OCR itself.
Currently I am doing this:
ocrResult = ocrResult.ToLower();
if (ocrResult.Contains("edg") || ocrResult.Contains("gna"))
{
//no edge or no signal
}
else
{
//Not no edge or no signal
}
Can any of you please suggest a smarter approach?
There's a library called Simila which is designed for such scenarios:
In Simila you can have this:
// A similarity engine which accepts similar if similarity is more than 70%
var simila = new Simila() { Treshold = 0.7 };
if (simila.AreSimilar(ocrResult, "No Edge") || simila.AreSimilar(ocrResult, "No Signal"))
{
// ...
}
Basic documentation for Simila is available here:
https://github.com/mehrandvd/Simila/wiki
FYI, I'm working on it and it is still in beta. Let me know if an early release would help you, so I can create an early beta release for you.
If what you are doing works, just keep doing it: it's simple, easy to understand, and scanning a nine-character string twice isn't likely to cause performance issues unless you have really big data sets.
Just add a comment so that someone who looks at this code years from now knows why you are looking for seemingly random substrings.
If this isn't working, then what you are looking for is a "classification algorithm" (Wikipedia lists 79 of them), but those can get complex and choosing the right one can be tricky, so they are truly overkill if a simple string comparison does the job.
Calling ToLower is slower than a comparison that ignores case, certainly if you use it in a loop, so first of all I recommend a case-insensitive comparison. For readability and maintainability I advise you to refactor the comparison into its own method. Finally, you should check whether the string is null or empty; if it is, you don't have to compare it at all.
Example:
if (IsThereNoEdgeOrNoSignal(ocrResult))
{
//no edge or no signal
}
else
{
//Not no edge or no signal
}
private static bool IsThereNoEdgeOrNoSignal(string ocrResult)
{
    if (string.IsNullOrEmpty(ocrResult))
        return false;

    return ocrResult.IndexOf("edg", StringComparison.CurrentCultureIgnoreCase) >= 0 ||
           ocrResult.IndexOf("gna", StringComparison.CurrentCultureIgnoreCase) >= 0;
}
If it only ever has to match these two strings, then you should keep it this way; if it grows to more possibilities, you should check it with a regular expression.
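For example, a pattern along these lines (just a sketch covering only the o/0, S/5, i/1 and l/1 confusions mentioned in the question; extend the character classes as new misreads appear) would accept the common variants:
using System.Text.RegularExpressions;

// Matches "No Edge" / "No Signal" plus the usual OCR confusions (o->0, S->5, i->1, l->1).
Regex noEdgeOrNoSignal = new Regex(@"^\s*n[o0]\s+(edge|[s5][i1]gna[l1])\s*$",
                                   RegexOptions.IgnoreCase);

if (noEdgeOrNoSignal.IsMatch(ocrResult))
{
    //no edge or no signal
}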
I hope this helps you.
I'm experimenting with creating a semi-natural scripting language, mostly for my own learning purposes, and for fun. The catch is that it needs to be in native C#, no parsing or lexical analysis on my part, so whatever I do needs to be able to be done through normal syntactical sugar.
I want it to read somewhat like a sentence would, so that it is easy to read and learn, especially for those who aren't particularly fluent with programming, but I also want the full functionality of native code available to the user.
For example, in the perfect world it would look like a natural language (English in this case):
When an enemy is within 10 units of player, the enemy attacks the player
In C#, allowing a sentence like this to actually do what the scripter intends would almost certainly require that this be a string that is run through a parser and lexical analyzer. My goal isn't to have something this natural, and I don't want the scripter to be using strings to script. I want the scripter to have full access to C#, and to have things like syntax highlighting, IntelliSense, and debugging in the IDE. So what I'm trying to get is something that reads easily but is in native C#. A couple of the major hurdles that I don't see a way to overcome are getting rid of periods ., commas ,, and parentheses for empty methods (). For example, something like this is feasible but doesn't read very cleanly:
// C#
When(Enemy.Condition(Conditions.isWithinDistance(Enemy, Player, 10))), Event(Attack(Enemy, Player))
Using a language like Scala you can actually get much closer, because periods and parentheses can be replaced by a single whitespace in many cases. For example, you could take the above statement and make it look something like this in Scala:
// Scala
When(Enemy is WithinDistance(Player, 10)) => Then(Attack From(Enemy, Player))
The above code would actually compile, assuming you set up your engine to handle it; in fact you might be able to coax further parentheses and commas out of it. Without the syntactical sugar in the above example it would be more like this, in Scala:
// Scala (without syntactical sugar)
When(Enemy.is(WithinDistance(Player, 10)) => Then(Attack().From(Enemy, Player))
The bottom line is I want to get as close as possible to something like the first scala example using native C#. It may be that there is really nothing I can do, but I'm willing to try any tricks that may be possible to make it read more natural, and get the periods, parentheses, and commas out of there (except when they make sense even in natural language).
I'm not as experienced with C# as with other languages, so I might not know about some syntax tricks that are available, like macros in C++. Not that macros would actually be a good solution; they would probably cause more problems than they would solve, and would be a debugging nightmare, but you get where I'm going with this: at least in C++ it would be feasible. Is what I want even possible in C#?
Here's an example: using LINQ and lambda expressions you can sometimes get the same amount of work done with fewer lines, fewer symbols, and code that reads closer to English. For example, here are three collisions that happen between pairs of objects with IDs; we want to gather all collisions involving the object with ID 5, sort those collisions by the "first" ID in the pair, and then output the pairs. Here is how you would do this without LINQ and/or lambda expressions:
struct CollisionPair : IComparable, IComparer
{
    public int first;
    public int second;

    // Since we're sorting we'll need to write our own Comparer
    int IComparer.Compare(object one, object two)
    {
        CollisionPair pairOne = (CollisionPair)one;
        CollisionPair pairTwo = (CollisionPair)two;
        if (pairOne.first < pairTwo.first)
            return -1;
        else if (pairTwo.first < pairOne.first)
            return 1;
        else
            return 0;
    }

    // ...and our own comparable
    int IComparable.CompareTo(object two)
    {
        CollisionPair pairTwo = (CollisionPair)two;
        if (this.first < pairTwo.first)
            return -1;
        else if (pairTwo.first < this.first)
            return 1;
        else
            return 0;
    }
}
static void Main(string[] args)
{
    List<CollisionPair> collisions = new List<CollisionPair>
    {
        new CollisionPair { first = 1, second = 5 },
        new CollisionPair { first = 2, second = 3 },
        new CollisionPair { first = 5, second = 4 }
    };

    // In a script this would be all the code you needed; everything above
    // would be part of the game engine
    List<CollisionPair> sortedCollisionsWithFive = new List<CollisionPair>();
    foreach (CollisionPair c in collisions)
    {
        if (c.first == 5 || c.second == 5)
        {
            sortedCollisionsWithFive.Add(c);
        }
    }
    sortedCollisionsWithFive.Sort();
    foreach (CollisionPair c in sortedCollisionsWithFive)
    {
        Console.WriteLine("Collision between " + c.first +
            " and " + c.second);
    }
}
And now the same example with LINQ and a lambda. Notice that in this example we don't have to bother with making CollisionPair implement IComparable and IComparer, and don't have to implement the Compare and CompareTo methods:
struct CollisionPair
{
    public int first;
    public int second;
}

static void Main(string[] args)
{
    List<CollisionPair> collisions = new List<CollisionPair>
    {
        new CollisionPair { first = 1, second = 5 },
        new CollisionPair { first = 2, second = 3 },
        new CollisionPair { first = 5, second = 4 }
    };

    // In a script this would be all the code you needed; everything above
    // would be part of the game engine
    (from c in collisions
     where (c.first == 5 || c.second == 5)
     orderby c.first
     select c)
        .ToList()
        .ForEach(c => Console.WriteLine("Collision between " + c.first +
            " and " + c.second));
}
In the end we're left with a LINQ query and a lambda expression that read closer to natural language and are much less code, both for the game engine and for the script. These kinds of changes are really what I'm looking for, but obviously LINQ and lambdas are limited to specific syntax, not something as generic as I would like in the end.
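For what it's worth, the same query can also be written in LINQ's method syntax, which chains the calls left to right and arguably reads even more like a sentence (a sketch equivalent to the query above; ForEach lives on List<T>, hence the ToList call):
collisions.Where(c => c.first == 5 || c.second == 5)
          .OrderBy(c => c.first)
          .ToList()
          .ForEach(c => Console.WriteLine(
              "Collision between " + c.first + " and " + c.second));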
Another approach would be to use the FluentInterface "pattern" and implement something like:
When(enemy).IsWithin(10.units()).Of(player).Then(enemy).Attacks(player);
If you make functions like When, IsWithin, Of and Then return small interfaces, you will easily be able to add new extension methods to expand your rules language.
For example let's take a look at function Then:
public static IActiveActor Then(this ICondition condition, Actor actor)
{
    /* keep the actor, etc */
}

public static void Attacks(this IActiveActor who, Actor whom)
{
    /* your business logic */
}
In the future it would be easy to implement another function, say RunAway() without changing anything in your code:
public static void RunAway(this IActiveActor who)
{
    /* perform runaway logic */
}
With this little addition you will be able to write:
When(player).IsWithin(10.units()).Of(enemy).Then(player).RunAway();
The same goes for conditions: assuming When returns something like ICheckActor, you can introduce new conditions simply by defining new functions:
public static ICondition IsStrongerThan(this ICheckActor me, Actor anotherGuy)
{
    if (CompareStrength(me, anotherGuy) > 0)
        return TrueActorCondition(me);
    else
        return FalseActorCondition(me);
}
so now you can do:
When(player)
.IsWithin(10.units()).Of(enemy)
.And(player).IsStrongerThan(enemy)
.Then(player)
.Attacks(enemy);
or
When(player)
.IsWithin(10.units()).Of(enemy)
.And(enemy).IsStrongerThan(player)
.Then(player)
.RunAway();
The point is that you can improve your language without experiencing heavy impact on the code you already have.
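To make the chain above concrete, here is a minimal sketch of the scaffolding it assumes; the interface members, the Rules class, the units() extension and the single State class are my own illustrative choices, not an existing library:
using System;

public class Actor
{
    public string Name;
    public double Position;
}

public interface ICheckActor       { Actor Subject { get; } }
public interface IPartialCondition { Actor Subject { get; } double Distance { get; } }
public interface ICondition        { bool Holds { get; } }
public interface IActiveActor      { Actor Subject { get; } bool Enabled { get; } }

public static class Rules
{
    // Entry point: When(enemy)...
    public static ICheckActor When(Actor actor)
    {
        return new State { Subject = actor };
    }

    // 10.units() -- only there so the chain reads naturally
    public static double units(this int value)
    {
        return value;
    }

    public static IPartialCondition IsWithin(this ICheckActor me, double distance)
    {
        return new State { Subject = me.Subject, Distance = distance };
    }

    public static ICondition Of(this IPartialCondition partial, Actor other)
    {
        return new State
        {
            Holds = Math.Abs(partial.Subject.Position - other.Position) <= partial.Distance
        };
    }

    public static IActiveActor Then(this ICondition condition, Actor actor)
    {
        return new State { Subject = actor, Enabled = condition.Holds };
    }

    public static void Attacks(this IActiveActor who, Actor whom)
    {
        if (who.Enabled)
            Console.WriteLine(who.Subject.Name + " attacks " + whom.Name);
    }

    // One mutable state object implements every step's interface; that keeps the sketch short.
    private class State : ICheckActor, IPartialCondition, ICondition, IActiveActor
    {
        public Actor Subject { get; set; }
        public double Distance { get; set; }
        public bool Holds { get; set; }
        public bool Enabled { get; set; }
    }
}
With the Rules class in scope (for example via using static Rules; in newer C#, or by writing Rules.When(...)), the chain from the beginning of this answer compiles essentially as written.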
Honestly I don't think this is a good direction for a language. Take a look at AppleScript sometime. They went to great pains to mimic natural language, and in trivial examples you can write AppleScript that reads like English. In real usage, it's a nightmare. It's awkward and cumbersome to use. And it's hard to learn, because people have a very hard time with "just write this incredibly limited subset of English with no deviations from the set pattern." It's easier to learn real C# syntax, which is regular and predictable.
I don't quite understand your requirement of "written in native C#". Why? Perhaps you mean written in native .NET? I can understand that, since you can compile rules written in "plain English" into .NET with no parsing involved. Your engine (probably written in C#) will then be able to use these rules, evaluate them, and so on. Because it is all .NET, it doesn't really matter which language the developer used.
Now, if C# is not really a requirement, then we can stop figuring out how to make "ugly-ugly" syntax look "just ugly" :)
We can look at, for example, F#. It compiles into .NET in the same way C# or VB.NET do, but it is more suitable for solving problems like yours.
You gave us three (ugly-looking) examples in C# and Scala; here is one in F# that I managed to write off the top of my head in 5 minutes:
When enemy (within 10<unit> player) (Then enemy attacks player)
I only spent 5 minutes, so probably it can be even prettier.
No parsing is involved, When, within, Then, attacks are just normal .NET functions (written in F#).
Here is all the code I had to write to make it possible:
[<Measure>] type unit
type Position = int<unit>

type Actor =
    | Enemy of Position
    | Player of Position

let getPosition actor =
    match actor with
    | Enemy x -> x
    | Player x -> x

let When actor condition positiveAction =
    if condition actor
    then positiveAction
    else ()

let Then actor action = action actor

let within distance actor1 actor2 =
    let pos1 = getPosition actor1
    let pos2 = getPosition actor2
    abs (pos1 - pos2) <= distance

let attacks victim agressor =
    printfn "%s attacks %s" (agressor.GetType().Name) (victim.GetType().Name)
This is really it, not the hundreds and hundreds of lines of code you would probably write in C# :)
This is the beauty of .NET: you can use the appropriate language for the appropriate task. And F# is a good language for DSLs (just what you need here).
P.S. You can even define functions like "an", "the", "in", etc to make it look more like English (these functions will do nothing but return their first argument):
let an something = something
let the = an
let is = an
Good luck!
I'm a beginner C# programmer, and to improve my skills I decided to give Project Euler a try. The first problem on the site asks you to find the sum of all the multiples of 3 and 5 under 1000. Since I'm essentially doing the same thing twice, I made a method to multiply a base number incrementally and add the sum of all the answers together.
public static int SumOfMultiplication(int Base, int limit)
{
bool Escape = false;
for (int mult = 1; Escape == true; mult++)
{
int Number = 0;
int iSum = 0;
Number = Base * mult;
if (Number > limit)
return iSum;
else
iSum = iSum + Number;
}
Regardless of what I put in for both parameters, it ALWAYS returns zero. I'm 99% sure it has something to do with the scope of the variables, but I have no clue how to fix it. All help is appreciated.
Thanks in advance,
Sam
Your loop never actually executes:
bool Escape = false;
for (int mult = 1; Escape == true; mult++)
Escape is set to false initially, so the first test fails (Escape == true returns false) and the body of the loop is skipped.
The compiler would have told you if you were trying to access variables outside of their defined scope, so that's not the problem. You are also missing a return statement, but that is probably a typo.
I would also note that your code never checks whether the number to be added to the sum is actually a multiple of 3 or 5. There are other issues as well (for example, iSum is declared inside the loop and re-initialized to 0 on each iteration), but I'll let you work those out since this is practice. The debugger is your friend in cases like these :)
EDIT: If you need help with the actual logic I'll be happy to help, but I figure you want to work it out on your own if possible.
As others have pointed out, the problem is that the control flow does not do what you think it does. This is a common beginner problem.
My suggestion to you is learn how to use your debugger. Beginners often have this strange idea that they're not allowed to use tools to solve their coding problems; that rather, they have to reason out the defect in the program by simply reading it. Once the programs become more than a page long, that becomes impossible for humans. The debugger is your best friend, so get to know its features really well.
In this case if you'd stepped through the code in the debugger you'd see that the loop condition was being evaluated and then the loop was being skipped. At that point you wouldn't be asking "why does this return zero?", you'd be asking "why is the loop body always skipped?" Clearly that is a much more productive question to ask since that is actually the problem here.
Don't write any code without stepping through it in the debugger. Watch every variable, watch how it changes value (the debugger highlights variables in the watch windows right after they change value, by the way) and make sure that the control flow and the variable changes are exactly as you'd expect. Pay attention to quiet doubts; if anything seems out of the ordinary, track it down, and either learn why it is correct, or fix it until it is.
Regarding the actual problem: remember that 15, 30, 45, 60... are all multiples of both three and five, but you only want to add them to the sum once. My advice when solving Project Euler problems is to write code that is as like what you are trying to solve as is possible. Try writing the problem out in "pseudocode" first. I'd pseudocode this as:
sum = 0
for each positive number under 1000:
    if number is multiple of three or five then:
        add number to sum
Once you have that pseudocode you can notice its subtleties. Like, is 1000 included? Does the problem say "under 1000" or "up to 1000"? Make sure your loop condition considers that. And so on.
The closer the program reads like the problem actually being solved, the more likely it is to be correct.
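For reference, a direct C# translation of that pseudocode might look like this (just a sketch; adjust the upper bound once you've answered the "under or up to 1000" question):
int sum = 0;
for (int number = 1; number < 1000; number++)
{
    // "multiple of three or five" is exactly the modulo test
    if (number % 3 == 0 || number % 5 == 0)
        sum += number;
}
Console.WriteLine(sum);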
It does not enter the for loop because the for condition is false:
Escape == true
returns false.
Advice:
Using a for loop is much simpler if you use a condition that acts as the limit for breaking out of the loop:
for (int mult = 1; something < limit; mult++)
This way, in most cases you do not need to check a condition inside the loop.
Most programming languages have a modulo operator:
http://en.wikipedia.org/wiki/Modulo_operation
It might come in handy with this problem.
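For instance, once you combine a bounded loop with the modulo test, the whole exercise shrinks to a single expression in C# (a sketch; requires using System.Linq):
int sum = Enumerable.Range(1, 999)                        // 1 through 999, i.e. under 1000
                    .Where(i => i % 3 == 0 || i % 5 == 0)
                    .Sum();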
There are several problems with this code. The first, and most important, is that you are using the Escape variable only once: it is never changed inside your for loop, so it serves no purpose whatsoever and should be removed. Second, iSum is declared within your for loop, which means it keeps being re-initialized to 0 every time the loop body executes, so you would only get the last multiple, not the sum of all multiples. Here is a corrected code sample:
int iSum = 0;
for (int mult = 1; true; mult++)
{
    int Number = Base * mult;
    if (Number > limit)
        return iSum;
    else
        iSum += Number;
}
Possible Duplicates:
While vs. Do While
When should I use do-while instead of while loops?
I've been programming for a while now (2 years work + 4.5 years degree + 1 year pre-college), and I've never used a do-while loop short of being forced to in the Introduction to Programming course. I have a growing feeling that I'm doing programming wrong if I never run into something so fundamental.
Could it be that I just haven't run into the correct circumstances?
What are some examples where it would be necessary to use a do-while instead of a while?
(My schooling was almost all in C/C++ and my work is in C#, so if there is another language where it absolutely makes sense because do-whiles work differently, then these questions don't really apply.)
To clarify... I know the difference between a while and a do-while: a while checks the exit condition and then performs its task, whereas a do-while performs the task and then checks the exit condition.
Use it if you always want the loop to execute at least once. It's not common, but I do use it from time to time. One case where you might want it is trying to access a resource that could require a retry, e.g.:
do
{
    try to access resource...
    put up message box with retry option
} while (user says retry);
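In C#, that pattern might look something like this (a sketch assuming a WinForms message box; TryAccessResource is a made-up method that throws when the resource is unavailable):
// using System.IO; using System.Windows.Forms;
bool succeeded = false;
DialogResult choice = DialogResult.Cancel;
do
{
    try
    {
        TryAccessResource();   // hypothetical call that throws if the resource is busy
        succeeded = true;
    }
    catch (IOException)
    {
        choice = MessageBox.Show("Could not access the resource. Try again?",
                                 "Error", MessageBoxButtons.RetryCancel);
    }
} while (!succeeded && choice == DialogResult.Retry);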
do-while is better if the compiler isn't competent at optimization. do-while has only a single conditional jump, as opposed to for and while which have a conditional jump and an unconditional jump. For CPUs which are pipelined and don't do branch prediction, this can make a big difference in the performance of a tight loop.
Also, since most compilers are smart enough to perform this optimization, all loops found in decompiled code will usually be do-while (if the decompiler even bothers to reconstruct loops from backward local gotos at all).
I have used this in a TryDeleteDirectory function. It was something like this
do
{
    try
    {
        DisableReadOnly(directory);
        directory.Delete(true);
    }
    catch (Exception)
    {
        retryDeleteDirectoryCount++;
    }
} while (Directory.Exists(fullPath) && retryDeleteDirectoryCount < 4);
Do-while is useful when you want to execute something at least once. As a good example of do-while vs. while, let's say you want to make a calculator.
You could approach this by using a loop and checking after each calculation whether the person wants to exit the program. Now, you can probably assume that once the program is opened the person wants to do this at least once, so you could do the following:
do
{
    //do calculator logic here
    //prompt user for continue here
} while (cont == true); //cont is short for continue
This is sort of an indirect answer, but this question got me thinking about the logic behind it, and I thought this might be worth sharing.
As everyone else has said, you use a do ... while loop when you want to execute the body at least once. But under what circumstances would you want to do that?
Well, the most obvious class of situations I can think of would be when the initial ("unprimed") value of the check condition is the same as when you want to exit. This means that you need to execute the loop body once to prime the condition to a non-exiting value, and then perform the actual repetition based on that condition. What with programmers being so lazy, someone decided to wrap this up in a control structure.
So for example, reading characters from a serial port with a timeout might take the form (in Python):
response_buffer = []
char_read = port.read(1)
while char_read:
    response_buffer.append(char_read)
    char_read = port.read(1)
# When there's nothing to read after 1s, there is no more data
response = ''.join(response_buffer)
Note the duplication of code: char_read = port.read(1). If Python had a do ... while loop, I might have used:
do:
    char_read = port.read(1)
    response_buffer.append(char_read)
while char_read
The added benefit for languages that create a new scope for loops: char_read does not pollute the function namespace. But note also that there is a better way to do this, and that is by using Python's None value:
response_buffer = []
char_read = None
while char_read != '':
    char_read = port.read(1)
    response_buffer.append(char_read)
response = ''.join(response_buffer)
So here's the crux of my point: in languages with nullable types, the situation initial_value == exit_value arises far less frequently, and that may be why you do not encounter it. I'm not saying it never happens, because there are still times when a function will return None to signify a valid condition. But in my hurried and briefly-considered opinion, this would happen a lot more if the languages you used did not allow for a value that signifies: this variable has not been initialised yet.
This is not perfect reasoning: in reality, now that null values are common, they simply form one more element of the set of valid values a variable can take. But practically speaking, programmers have a way to distinguish between a variable being in a sensible state, which may include the loop exit state, and it being in an uninitialised state.
I used them a fair bit when I was in school, but not so much since.
In theory they are useful when you want the loop body to execute once before the exit condition check. The problem is that for the few instances where I don't want the check first, typically I want the exit check in the middle of the loop body rather than at the very end. In that case, I prefer to use the well-known for (;;) with an if (condition) exit; somewhere in the body.
In fact, if I'm a bit shaky on the loop exit condition, sometimes I find it useful to start writing the loop as a for (;;) {} with an exit statement where needed, and then when I'm done I can see whether it can be "cleaned up" by moving initializations, exit conditions, and/or increment code inside the for's parentheses.
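A small sketch of that "exit test in the middle" shape in C# (ReadRecord and Process are placeholders):
for (;;)
{
    var record = ReadRecord();   // hypothetical: fetch the next item
    if (record == null)          // the exit test sits in the middle of the body
        break;
    Process(record);             // hypothetical: the real work
}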
A situation where you always need to run a piece of code once, and depending on its result, possibly more times. The same can be produced with a regular while loop as well.
rc = get_something();
while (rc == wrong_stuff)
{
    rc = get_something();
}

do
{
    rc = get_something();
}
while (rc == wrong_stuff);
It's as simple as that:
precondition vs postcondition
while (cond) {...} - precondition, it executes the code only after checking.
do {...} while (cond) - postcondition, code is executed at least once.
Now that you know the secret .. use them wisely :)
do-while is for when you want to run the code block at least once; while, on the other hand, won't always run, depending on the criteria specified.
I see that this question has been adequately answered, but would like to add this very specific use case scenario. You might start using do...while more frequently.
do
{
...
} while (0)
is often used for multi-line #defines. For example:
#define compute_values \
area = pi * r * r; \
volume = area * h
This works alright for:
r = 4;
h = 3;
compute_values;
-but- there is a gotcha for:
if (shape == circle) compute_values;
as this expands to:
if (shape == circle) area = pi *r * r;
volume = area * h;
If you wrap it in a do ... while(0) loop it properly expands to a single block:
if (shape == circle)
    do
    {
        area = pi * r * r;
        volume = area * h;
    } while (0);
The answers so far summarize the general use for do-while. But the OP asked for an example, so here is one: Get user input. But the user's input may be invalid - so you ask for input, validate it, proceed if it's valid, otherwise repeat.
With do-while, you get the input while the input is not valid. With a regular while-loop, you get the input once, but if it's invalid, you get it again and again until it is valid. It's not hard to see that the former is shorter, more elegant, and simpler to maintain if the body of the loop grows more complex.
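A small C# sketch of that input-validation case (console I/O assumed):
int age;
bool valid;
do
{
    Console.Write("Enter your age: ");
    valid = int.TryParse(Console.ReadLine(), out age) && age > 0;
    if (!valid)
        Console.WriteLine("That doesn't look like a valid age, please try again.");
} while (!valid);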
I've used it for a reader that reads the same structure multiple times.
using (IDataReader reader = connection.ExecuteReader())
{
    do
    {
        while (reader.Read())
        {
            //Read record
        }
    } while (reader.NextResult());
}
I can't imagine how you've gone this long without using a do...while loop.
There's one on another monitor right now and there are multiple such loops in that program. They're all of the form:
do
{
    GetProspectiveResult();
}
while (!ProspectIsGood());
I like to understand these two as:
while -> 'repeat until',
do ... while -> 'repeat if'.
I've used a do-while when reading a sentinel value at the beginning of a file, but other than that, I don't think it's abnormal that this structure isn't commonly used; do-whiles are really situational.
-- file --
5
Joe
Bob
Jake
Sarah
Sue
-- code --
int MAX;
int count = 0;
do {
    MAX = a.readLine();
    k[count] = a.readLine();
    count++;
} while (count <= MAX);
Here's my theory why most people (including me) prefer while(){} loops to do{}while(): A while(){} loop can easily be adapted to perform like a do..while() loop while the opposite is not true. A while loop is in a certain way "more general". Also programmers like easy to grasp patterns. A while loop says right at start what its invariant is and this is a nice thing.
Here's what I mean about the "more general" thing. Take this do..while loop:
do {
    A;
    if (condition) INV=false;
    B;
} while(INV);
Transforming this in to a while loop is straightforward:
INV=true;
while(INV) {
    A;
    if (condition) INV=false;
    B;
}
Now, we take a model while loop:
while(INV) {
    A;
    if (condition) INV=false;
    B;
}
Transforming this into a do..while loop yields this monstrosity:
if (INV) {
    do
    {
        A;
        if (condition) INV=false;
        B;
    } while(INV);
}
Now we have two checks on opposite ends and if the invariant changes you have to update it on two places. In a certain way do..while is like the specialized screwdrivers in the tool box which you never use, because the standard screwdriver does everything you need.
I have been programming for about 12 years, and only 3 months ago did I run into a situation where it was really convenient to use do-while, as one iteration was always necessary before checking the condition. So I guess your big time is ahead of you :).
It is a quite common structure in a server/consumer:
DOWHILE (no shutdown requested)
    determine timeout
    wait for work(timeout)
    IF (there is work)
        REPEAT
            process
        UNTIL (wait for work(0 timeout) indicates no work)
        do what is supposed to be done at end of busy period.
    ENDIF
ENDDO
the REPEAT UNTIL(cond) being a do {...} while(!cond)
Sometimes the wait for work(0) can be cheaper CPU wise (even eliminating the timeout calculation might be an improvement with very high arrival rates). Moreover, there are many queuing theory results that make the number served in a busy period an important statistic. (See for example Kleinrock - Vol 1.)
Similarly:
DOWHILE (no shutdown requested)
    determine timeout
    wait for work(timeout)
    IF (there is work)
        set throttle
        REPEAT
            process
        UNTIL (--throttle < 0 OR wait for work(0 timeout) indicates no work)
    ENDIF
    check for and do other (perhaps polled) work.
ENDDO
where "check for and do other work" may be exorbitantly expensive to put in the main loop, or where the kernel does not support an efficient waitany(waitcontrol*, n) type operation, or where a prioritized queue might starve the other work, so throttle is used as starvation control.
This type of balancing can seem like a hack, but it can be necessary. Blind use of thread pools would entirely defeat the performance benefit of a caretaker thread with a private queue for a complicated, frequently updated data structure, because a thread pool, unlike a caretaker thread, would require a thread-safe implementation.
I really don't want to get into a debate about the pseudo code (for example, whether shutdown requested should be tested in the UNTIL) or caretaker threads versus thread pools - this is just meant to give a flavor of a particular use case of the control flow structure.
This is my personal opinion, but this question begs for an answer rooted in experience:
I have been programming in C for 38 years, and I never use do / while loops in regular code.
The only compelling use for this construct is in macros where it can wrap multiple statements into a single statement via a do { multiple statements } while (0)
I have seen countless examples of do / while loops with bogus error detection or redundant function calls.
My explanation for this observation is that programmers tend to model problems incorrectly when they think in terms of do / while loops. They either miss an important ending condition, or they miss the possible failure of the initial condition, which they have moved to the end.
For these reasons, I have come to believe that where there is a do / while loop, there is a bug, and I regularly challenge newbie programmers to show me a do / while loop where I cannot spot a bug nearby.
This type of loop can be easily avoided: use a for (;;) { ... } and add the necessary termination tests where they are appropriate. It is quite common that there need be more than one such test.
Here is a classic example:
/* skip the line */
do {
    c = getc(fp);
} while (c != '\n');
This will fail if the file does not end with a newline. A trivial example of such a file is the empty file.
A better version is this:
int c; // another classic bug is to define c as char.
while ((c = getc(fp)) != EOF && c != '\n')
    continue;
Alternately, this version also hides the c variable:
for (;;) {
    int c = getc(fp);
    if (c == EOF || c == '\n')
        break;
}
Try searching for while (c != '\n'); in any search engine, and you will find bugs such as this one (retrieved June 24, 2017):
In ftp://ftp.dante.de/tex-archive/biblio/tib/src/streams.c, the function getword(stream, p, ignore) has a do / while, and sure enough, there are at least 2 bugs:
c is defined as a char and
there is a potential infinite loop while (c!='\n') c=getc(stream);
Conclusion: avoid do / while loops and look for bugs when you see one.
while loops check the condition before the loop, do...while loops check the condition after the loop. This is useful if you want to base the condition on side effects from the loop running or, as other posters said, if you want the loop to run at least once.
I understand where you're coming from, but the do-while is something that most use rarely, and I've never used myself. You're not doing it wrong.
You're not doing it wrong. That's like saying someone is doing it wrong because they've never used the byte primitive. It's just not that commonly used.
The most common scenario I run into where I use a do/while loop is in a little console program that runs based on some input and will repeat as many times as the user likes. Obviously it makes no sense for a console program to run no times; but beyond the first time it's up to the user -- hence do/while instead of just while.
This allows the user to try out a bunch of different inputs if desired.
do
{
    int input = GetInt("Enter any integer");
    // Do something with input.
}
while (GetBool("Go again?"));
I suspect that software developers use do/while less and less these days, now that practically every program under the sun has a GUI of some sort. It makes more sense with console apps, as there is a need to continually refresh the output to provide instructions or prompt the user with new information. With a GUI, in contrast, the text providing that information to the user can just sit on a form and never need to be repeated programmatically.
I use do-while loops all the time when reading in files. I work with a lot of text files that include comments in the header:
# some comments
# some more comments
column1 column2
1.234 5.678
9.012 3.456
... ...
I'll use a do-while loop to read up to the "column1 column2" line so that I can look for the column of interest. Here's the pseudocode:
do {
    line = read_line();
} while (line[0] == '#');
/* parse line */
Then I'll do a while loop to read through the rest of the file.
Being a geezer programmer, I wrote many school programming projects that used text-menu-driven interaction. Virtually all of them used something like the following logic for the main procedure:
do
    display options
    get choice
    perform action appropriate to choice
while choice is something other than exit
Since school days, I have found that I use the while loop more frequently.
One of the applications I have seen for it is in Oracle, when we look at result sets.
Once you have a result set, you first fetch from it (do), and from that point on check whether the fetch returned an element or not (while element found). The same might be applicable for any other "fetch-like" implementation.
I've used it in a function that returns the next character position in a UTF-8 string:
char *next_utf8_character(const char *txt)
{
    if (!txt || *txt == '\0')
        return (char *)txt;

    do {
        txt++;
    } while ((((unsigned char) *txt) & 0xc0) == 0x80);  /* skip UTF-8 continuation bytes (10xxxxxx) */

    return (char *)txt;
}
Note that this function was written from memory and is not tested. The point is that you have to take the first step anyway, and you have to do it before you can evaluate the condition.
Any sort of console input works well with do-while because you prompt the first time, and re-prompt whenever the input validation fails.
Even though there are plenty of answers, here is my take. It all comes down to optimization. I'll show two examples where one is faster than the other.
Case 1: while
string fileName = string.Empty, fullPath = string.Empty;
while (string.IsNullOrEmpty(fileName) || File.Exists(fullPath))
{
    fileName = Guid.NewGuid().ToString() + fileExtension;
    fullPath = Path.Combine(uploadDirectory, fileName);
}
Case 2: do while
string fileName = string.Empty, fullPath = string.Empty;
do
{
    fileName = Guid.NewGuid().ToString() + fileExtension;
    fullPath = Path.Combine(uploadDirectory, fileName);
}
while (File.Exists(fullPath));
These two will do exactly the same thing. But there is one fundamental difference: the while version needs an extra check just to enter the loop. That is ugly, because suppose every possible value the Guid class can produce has already been taken except for one variant; then I would have to loop around 5,316,911,983,139,663,491,615,228,241,121,400,000 times.
Every time I get to the end of my while statement I need to do the string.IsNullOrEmpty(fileName) check. That is only a tiny fraction of CPU work, but multiply that very small task by the number of possible Guid combinations and we are talking about hours, days, maybe months of extra time.
Of course this is an extreme example, and you probably wouldn't see it in production. But if we think about, say, YouTube's ID generation, it is quite possible that they hit IDs that have already been taken. So it comes down to big projects and optimization.
Even in educational references you will rarely find a do...while example. Only recently, after reading Ethan Brown's beautiful book Learning JavaScript, did I encounter one well-defined do...while example. That being said, I believe it is OK if you don't find an application for this structure in your routine job.
It's true that do/while loops are pretty rare. I think this is because a great many loops are of the form
while(something needs doing)
do it;
In general, this is an excellent pattern, and it has the usually-desirable property that if nothing needs doing, the loop runs zero times.
But once in a while, there's some fine reason why you definitely want to make at least one trip through the loop, no matter what. My favorite example is: converting an integer to its decimal representation as a string, that is, implementing printf("%d"), or the semistandard itoa() function.
To illustrate, here is a reasonably straightforward implementation of itoa(). It's not quite the "traditional" formulation; I'll explain it in more detail below if anyone's curious. But the key point is that it embodies the canonical algorithm, repeatedly dividing by 10 to pick off digits from the right, and it's written using an ordinary while loop... and this means it has a bug.
#include <stddef.h>

char *itoa(unsigned int n, char buf[], int bufsize)
{
    if(bufsize < 2) return NULL;
    char *p = &buf[bufsize];
    *--p = '\0';
    while(n > 0) {
        if(p == buf) return NULL;
        *--p = n % 10 + '0';
        n /= 10;
    }
    return p;
}
If you didn't spot it, the bug is that this code returns nothing — an empty string — if you ask it to convert the integer 0. So this is an example of a case where, when there's "nothing" to do, we don't want the code to do nothing — we always want it to produce at least one digit. So we always want it to make at least one trip through the loop. So a do/while loop is just the ticket:
do {
    if(p == buf) return NULL;
    *--p = n % 10 + '0';
    n /= 10;
} while(n > 0);
So now we have a loop that usually stops when n reaches 0, but if n is initially 0 — if you pass in a 0 — it returns the string "0", as desired.
As promised, here's a bit more information about the itoa function in this example. You pass it arguments which are: an int to convert (actually, an unsigned int, so that we don't have to worry about negative numbers); a buffer to render into; and the size of that buffer. It returns a char * pointing into your buffer, pointing at the beginning of the rendered string. (Or it returns NULL if it discovers that the buffer you gave it wasn't big enough.) The "nontraditional" aspect of this implementation is that it fills in the array from right to left, meaning that it doesn't have to reverse the string at the end — and also meaning that the pointer it returns to you is usually not to the beginning of the buffer. So you have to use the pointer it returns to you as the string to use; you can't call it and then assume that the buffer you handed it is the string you can use.
Finally, for completeness, here is a little test program to test this version of itoa with.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int n;
    if(argc > 1)
        n = atoi(argv[1]);
    else {
        printf("enter a number: "); fflush(stdout);
        if(scanf("%d", &n) != 1) return EXIT_FAILURE;
    }
    if(n < 0) {
        fprintf(stderr, "sorry, can't do negative numbers yet\n");
        return EXIT_FAILURE;
    }
    char buf[20];
    printf("converted: %s\n", itoa(n, buf, sizeof(buf)));
    return EXIT_SUCCESS;
}
I ran across this while researching the proper loop to use for a situation I have. I believe this fully illustrates a common situation where a do...while loop is a better fit than a while loop (C#, since you stated that is your primary language at work).
I am generating a list of strings based on the results of an SQL query. The object returned by my query is a SqlDataReader. It has a function called Read() which advances it to the next row of data and returns true if there was another row; it returns false if there is not.
Using this information, I want to add each row to a list and then stop when there is no more data to return. A do...while loop works best in this situation, as it ensures that adding an item to the list happens BEFORE checking whether there is another row. The reason this must be done BEFORE checking the while(condition) is that when it checks, it also advances; using a while loop in this situation would cause the first row to be bypassed due to the nature of that particular function.
In short:
This won't work in my situation.
//This will skip the first row because Read() returns true after advancing.
while (_read.NextResult())
{
    list.Add(_read.GetValue(0).ToString());
}
return list;
This will.
//This will make sure the currently read row is added before advancing.
do
{
    list.Add(_read.GetValue(0).ToString());
}
while (_read.NextResult());
return list;
I'm trying to work through the problems on projecteuler.net, but I keep running into a couple of problems.
The first is a question of storing large quantities of elements in a List<T>. I keep getting OutOfMemoryExceptions when storing large quantities in the list.
Now, I admit I might not be doing these things in the best way, but is there some way of defining how much memory the app can consume?
It usually crashes when I get to about 100,000,000 elements :S
Secondly, some of the questions require the addition of massive numbers. I use the ulong data type where I think the number is going to get super big, but I still manage to wrap past the largest supported integer and get into negative numbers.
Do you have any tips for working with incredibly large numbers?
Consider System.Numerics.BigInteger.
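For example, here is a quick sketch of arithmetic that overflows ulong but is trivial with BigInteger (add a reference to System.Numerics.dll; .NET 4 or later):
using System;
using System.Numerics;

class Program
{
    static void Main()
    {
        // 2^1000 -- far beyond ulong.MaxValue, with no overflow or wrap-around into negatives
        BigInteger big = BigInteger.Pow(2, 1000);
        Console.WriteLine(big);

        // Summing its decimal digits, a typical Project Euler-style step
        int digitSum = 0;
        foreach (char c in big.ToString())
            digitSum += c - '0';
        Console.WriteLine(digitSum);
    }
}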
You need to use a large-number class that uses some basic math principles to split these operations up. This implementation of a C# BigInteger library on CodeProject seems to be the most promising. The article has some good explanations of how operations with massive numbers work, as well.
Also see:
Big integers in C#
As far as Project Euler goes, you might be barking up the wrong tree if you are hitting OutOfMemory exceptions. From their website:
Each problem has been designed according to a "one-minute rule", which means that although it may take several hours to design a successful algorithm with more difficult problems, an efficient implementation will allow a solution to be obtained on a modestly powered computer in less than one minute.
As user Jakers said, if you're using big numbers, you're probably doing it wrong.
Of the Project Euler problems I've done, none have required big-number math so far.
It's more about finding the proper algorithm to avoid big numbers.
Want hints? Post here, and we might get an interesting Euler thread started.
I assume this is C#? F# has built-in ways of handling both of these problems (the BigInt type and lazy sequences).
You can use both F# techniques from C# if you like. The BigInt type is reasonably usable from other languages if you add a reference to the core F# assembly.
Lazy sequences are basically just syntax-friendly enumerators. Putting 100,000,000 elements in a list isn't a great plan, so rethink your solutions to get around that. If you don't need to keep information around, throw it away! If it's cheaper to recompute than to store, throw it away!
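The C# counterpart of a lazy sequence is an iterator: generate the values on demand with yield return and aggregate as you go, so nothing ever has to sit in a 100,000,000-element list. A sketch (needs using System.Collections.Generic):
// Produces multiples of 'step' below 'limit' one at a time; nothing is stored in memory.
static IEnumerable<long> Multiples(long step, long limit)
{
    for (long m = step; m < limit; m += step)
        yield return m;
}

static long SumOfMultiplesOfThree()
{
    // Consumes the sequence without ever materializing a list
    long total = 0;
    foreach (long m in Multiples(3, 100000000))
        total += m;
    return total;
}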
See the answers in this thread. You probably need to use one of the available third-party big-integer libraries/classes, or wait for C# 4.0, which will include a native BigInteger data type.
As far as limiting how much memory an app will use, you can check the available memory before performing an operation by using the MemoryFailPoint class.
This lets you verify that enough memory is likely to be available before doing the operation, so you can find out that an operation would fail before running it.
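A sketch of how that check looks: the constructor takes the estimated size in megabytes and throws InsufficientMemoryException up front if that much memory is unlikely to be available (the 800 MB figure and RunBigComputation are made up):
// using System; using System.Runtime;
try
{
    using (new MemoryFailPoint(800))   // "I am about to need roughly 800 MB"
    {
        RunBigComputation();           // hypothetical memory-hungry operation
    }
}
catch (InsufficientMemoryException)
{
    Console.WriteLine("Not enough memory available - pick a smaller problem size.");
}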
// Adds two arbitrarily long non-negative integers given as decimal strings.
string Add(string s1, string s2)
{
    bool carry = false;
    string result = string.Empty;

    // Pad the shorter operand with leading zeros so both have the same length
    if (s1.Length < s2.Length)
        s1 = s1.PadLeft(s2.Length, '0');
    if (s2.Length < s1.Length)
        s2 = s2.PadLeft(s1.Length, '0');

    // Add digit by digit from the rightmost column, carrying as needed
    for (int i = s1.Length - 1; i >= 0; i--)
    {
        var augend = Convert.ToInt64(s1.Substring(i, 1));
        var addend = Convert.ToInt64(s2.Substring(i, 1));
        var sum = augend + addend;
        sum += (carry ? 1 : 0);
        carry = false;
        if (sum > 9)
        {
            carry = true;
            sum -= 10;
        }
        result = sum.ToString() + result;
    }
    if (carry)
    {
        result = "1" + result;
    }
    return result;
}
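Used like this, the digit-by-digit carry keeps working past the built-in integer limits (a quick illustrative call):
// 18446744073709551615 is ulong.MaxValue; adding 1 as strings does not overflow
string result = Add("18446744073709551615", "1");
Console.WriteLine(result);   // 18446744073709551616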
I am not sure if it is a good way of handling it, but I use the following in my project.
I have a "double theRelevantNumber" variable and an "int powerOfTen" for each item, and in the relevant class I have an "int relevantDecimals" variable.
So... when large numbers are encountered they are handled like this:
First they are changed to x.yyy form. So if the number 123456.789 was input and powerOfTen was 10, it would start like this:
theRelevantNumber = 123456.789
powerOfTen = 10
The number is then: 123456.789 * 10^10
It is then changed to:
1.23456789 * 10^15
It is then rounded to the number of relevant decimals (for example 5), giving 1.23456, and saved along with powerOfTen = 15.
When adding or subtracting numbers, any digits outside the relevant decimals are ignored. Meaning if you take:
1*10^15 + 1*10^10, it will become 1.00001*10^15 if relevantDecimals is 5, but will not change at all if relevantDecimals is 4.
This method lets you deal with numbers up to doubleLimit*10^intLimit without any problem, and at least for OOP it is not that hard to keep track of.
You don't need to use BigInteger. You can do this even with a string array of numbers.
class Solution
{
    static void Main(String[] args)
    {
        int n = 5;
        string[] unsorted = new string[6] { "3141592653589793238", "1", "3", "5737362592653589793238", "3", "5" };
        string[] result = SortStrings(n, unsorted);
        foreach (string s in result)
            Console.WriteLine(s);
        Console.ReadLine();
    }

    static string[] SortStrings(int size, string[] arr)
    {
        Array.Sort(arr, (left, right) =>
        {
            if (left.Length != right.Length)
                return left.Length - right.Length;
            return left.CompareTo(right);
        });
        return arr;
    }
}
If you want to work with incredibly large numbers, look here...
MIKI Calculator
I am not a professional programmer; I write for myself, sometimes, so sorry for the unprofessional use of C#, but the program works. I will be grateful for any advice and corrections.
I use this calculator to generate 32-character passwords from numbers that are around 58 digits long.
Since the program adds numbers in string format, you can perform calculations on numbers up to the maximum length of a string variable. The program uses long lists for the calculation, so it is possible to calculate with even larger numbers, possibly 18x the maximum capacity of the list.