How to make large if-statement more readable

How to make large if-statement more readable - c#

Is there a better way for writing a condition with a large number of AND checks than a large IF statement in terms of code clarity?
Eg I currently need to make a field on screen mandatory if other fields do not meet certain requirements. At the moment I have an IF statement which runs over 30 LOC, and this just doesn't seem right.
if(!(field1 == field2 &&
field3 == field4 &&
field5 == field6 &&
.
.
.
field100 == field101))
{
// Perform operations
}
Is the solution simply to break these down into smaller chunks and assign the results to a smaller number of boolean variables? What is the best way for making the code more readable?
Thanks

I would consider building up rules, in predicate form:
bool FieldIsValid() { // condition }
bool SomethingElseHappened() { // condition }
// etc
Then, I would create myself a list of these predicates:
IList<Func<bool>> validConditions = new List<Func<bool>> {
FieldIsValid,
SomethingElseHappend,
// etc
};
Finally, I would write the condition to bring forward the conditions:
if(validConditions.All(c => c())
{
// Perform operations
}

The right approach will vary depending upon details you haven't provided. If the items to be compared can be selected by some sort of index (a numeric counter, list of field identifiers, etc.) you are probably best off doing that. For example, something like:
Ok = True
For Each fld as KeyValuePair(Of Control, String) in CheckFields
If fld.FormField.Text fld.RequiredValue Then
OK = False
Exit For
End If
Next
Constructing the list of controls and strings may be a slight nuisance, but there are reasonable ways of doing it.

Personally, I feel that breaking this into chunks will just make the overall statement less clear. It's going to make the code longer, not more concise.
I would probably refactor this check into a method on the class, so you can reuse it as needed, and test it in a single place. However, I'd most likely leave the check written as you have it - one if statement with lots of conditions, one per line.

You could refactor your conditional into a separate function, and also use De Morgan's Laws to simplify your logic slightly.
Also - are your variables really all called fieldN?

Part of the problem is you are mixing meta data and logic.
WHICH Questions are required(/must be equal/min length/etc) is meta data.
Verifying that each field meets it's requirements is program logic.
The list of requirements (and the fields that apply too) should all be stored somewhere else, not inside of a large if statement.
Then your verification logic reads the list, loops through it, and keeps a running total. If ANY field fails, you need to alert the user.

It may be useful to begin using the Workflow Engine for C#. It was specifically designed to help graphically lay out these sorts of complex decision algorithms.
Windows WorkFlow Foundation

The first thing I'd change for legibility is to remove the almost hidden negation by inverting the statement (by using De Morgan's Laws):
if ( field1 != field2 || field3 != field4 .... etc )
{
// Perform operations
}
Although using a series of && rather than || does have some slight performance improvement, I feel the lack of readability with the original code is worth the change.
If performance were an issue, you could break the statements into a series of if-statements, but that's getting to be a mess by then!

Is there some other relationship between all the variables you're comparing which you can exploit?
For example, are they all the members of two classes?
If so, and provided your performance requirements don't preclude this, you can scrape references to them all into a List or array, and then compare them in a loop. Sometimes you can do this at object construction, rather than for every comparison.
It seems to me that the real problem is somewhere else in the architecture rather than in the if() statement - of course, that doesn't mean it can easily be fixed, I appreciate that.

Isn't this what arrays are basically for?
Instead of having 100 variables named fieldn, create an array of 100 values.
Then you can have a function to loop combinations in the array and return true or false if the condition matches.

Related

Good practice in condition statments

Currently I'm refactoring my old code, so I have some time to think about how code should look like for somebody else. I managed much of problems, but I always wonder how to prepare good syntax for some nested logical condition. Let's assume that we have following part of code:
bool param1;
int param2;
var result = ( param1 == toCheck.param1) && // to achive condition it always has to be true
((param2 == toCheck.param2)) ||
( (!param1) &&
(param2 == defaultValueForParam2));
// to pass condition param! has to be always true AND ( (params 2 has to be equal) OR (param1 has to be false AND param2 has to be equal with default value)
How should it be formated to be more readable for anybody? Are there some rules of formating conditions? Maybe the general solution is just wrong if I need so complicated condition?
My intention was to ask about: how I should use brackets, how I should use indents, grouping, etc?

If you have a complicated condition, that you cannot make less complicated, it helps to
Have good variable names
Write a small helper function with a clear name so that the calling code is clear
If the conition is used in more that one place Don't Repeat Yourself (see 2.)
Reconsider if you can simplify this. Do you really need boolean flags?

Fast and Elegant way to control string is empty or not in C#

I am try to control multiple strings variables is empty or not at once:
My first approach is very simple:
if(string.isNullOrEmpty(val1) && string.isNullOrEmpty(val2) && string.isNullOrEmpty(val3))
My second way looks like this
if(string.isNullOrEmpty(val1 + val2 + val3))
Which one is fastest and elegant?
Is there any options to do this operation?

The first was faster in my test (just had to): 6ms vs. 70ms and that was on 10,000,000 iterations each (so the speed difference probably doesn't matter very much unless you're doing this on a massive scale).
Anyway, i find the first to be more clear.
Also it doesn't rely on behavior of IsNullOrEmpty that is not immediately obvious (you might just as well think that passing null parameters causes an ArgumentNullException if you don't know better), which i think is important.
Note: The test was with all variables set to null, but setting them to other values confirms it, the longer the strings get, the longer option 2 takes, while option 1 stays at about 30ms max.
Also, the first returns true if any of the strings is null or empty, while the second does it only if all of them are null or empty. So it's not the same check.

They are not equivalent. The first one checks if any of them is null. The second one checks if all of them are null. Make up your mind.

How about this?
new string[] {val1, val2, val2}.All(s => string.IsNullOrEmpty(s))
Or something similar.

I would expect 'fastest' to depend on how often you expect one or more of the strings to actually be null or empty.
For example, if val1 is often going to be null or empty then the first option is likely to be best; if they are all rarely going to be null or empty then I'm not sure of the answer, but it can't take more than five minutes to knock together a few benchmarks for your particular expectations.
(Also, note that the two options don't do the same thing, the first is true if ANY of them are null or empty the second is not doing that)

if(string.isNullOrEmpty(val1 + val2 + val3)) seems to me the fastest
I would advice you to also use concat
but behind the scenes it uses the '+' operator.
I think this is the fastest.
If it werent nullable I suggest summing their length and check ==0

The second one:
if(string.isNullOrEmpty(val1 + val2 + val3))
equals
if(string.isNullOrEmpty(val1) && string.isNullOrEmpty(val2) && string.isNullOrEmpty(val3))
(note the && instead of ||)
but will create an intermediate string, whereas my second version will not create extra strings and stop checking as soon as one string is not empty.

If you have a number of string variables, then I think it is more readable to use the first construction:
if(string.isNullOrEmpty(val1) &&
string.isNullOrEmpty(val2) &&
string.isNullOrEmpty(val3))
{
}
This way, it seems like each variable is treated separately, and this code is a little easier to change if you need to treat one of the variables in another way.
But in case all of the string variables are to be treated in the same way, they are very likely to be represented as an array or another kind of enumeration. Then it's definitely better to use John M Gant's suggestion:
if(myStrings.All(s => string.IsNullOrEmpty(s)))
{
}

Is this multi line if statement too complex?

I am validating input on a form and attempting to prompt the user of improper input(s) based on the combination of controls used.
For example, I have 2 combo boxes and 3 text boxes. The 2 combo boxes must always have a value other than the first (default) value, but one of three, or two of three, or all text boxes can be filled to make the form valid.
In one such scenario I have a 6 line if statement to try to make the test easily readable:
if ((!String.Equals(ComboBoxA.SelectedValue.ToString(), DEFAULT_COMBO_A_CHOICE.ToString())
&& !String.IsNullOrEmpty(TextBoxA.Text)
&& !String.Equals(ComboBoxB.SelectedValue.ToString(), DEFAULT_COMBO_B_CHOICE.ToString()))
||
(!String.IsNullOrEmpty(TextBoxB.Text)
|| !String.IsNullOrEmpty(TextBoxC.Text)))
{
//Do Some Validation
}
I have 2 questions:
Should this type of if statement be avoided at all cost?
Would it be better to enclose this test in another method? (This would be a good choice as this validation will happen in more than one scenario)
Thanks for your input(s)!

In such a case I find it helps to move some of the logic out of the if statement and into some more meaningfully named booleans. Eg.
bool comboBoxASelected = !String.Equals(ComboBoxA.SelectedValue.ToString(), DEFAULT_COMBO_A_CHOICE.ToString());
bool comboBSelected = !String.Equals(ComboBoxB.SelectedValue.ToString(), DEFAULT_COMBO_B_CHOICE.ToString());
bool textBoxAHasContent = !String.IsNullOrEmpty(TextBoxA.Text);
bool textBoxBHasContent = !String.IsNullOrEmpty(TextBoxB.Text);
bool textBoxCHasContent = !String.IsNullOrEmpty(TextBoxC.Text);
bool primaryInformationEntered = comboBoxASelected && textBoxAHasContent && comboBSelected;
bool alternativeInformationEntered = textBoxBHasContent || textBoxCHasContent;
if (primaryInformationEntered || alternativeInformationEntered)
{
//Do Some Validation
}
Obviously, name the combo and text boxes to reflect their actual content. When someone has to work their way through the logic several months down the line they'll thank you.

a) Doesn't have to necesarily be avoided at all costs. The code works. But it is certainly messy, confusing and I would say could be difficult to maintain.
b) Yes. Give it a relevant name so that the code reader knows what is going on there.

I personally wouldn't have a big issue with code like this. (Your last set of parentheses seem unnecessary.)
Generally, I'd like to keep my if statements simpler. But all your conditions are simple ones. If you really need to test that many tests, then I'd keep it like it is.

It is not very readable yes. But you can shorten it:
!String.Equals(ComboBoxA.SelectedValue.ToString(), DEFAULT_COMBO_A_CHOICE.ToString()
could also be written as:
ComboBoxA.SelectedValue.ToString()!=DEFAULT_COMBO_A_CHOICE
I presume DEFAULT_COMBO_A_CHOICE is already of string to ToString si superflous.
also the parenthese around
(!String.IsNullOrEmpty(TextBoxB.Text)
|| !String.IsNullOrEmpty(TextBoxC.Text))
are not necessary.

IMO such conditions should be avoided (though not at all costs). They are very difficult to read an maintain.
There are several ways of doing that
Try and group the conditions according to the behavior they represent. For example
if (OrderDetailsSelected() && ShippingAddressProvided() )
{
This way you can also avoid the duplication of the conditions within your form.
Secondly, you can use the Boolean Algebra to simplify the expression and
Use Extract Method refactoring to move conditions, which are difficult to read in functions to avoid duplication and make them more readable.
For ex. The condition
String.Equals(ComboBoxB.SelectedValue.ToString(), DEFAULT_COMBO_B_CHOICE.ToString())
can be extracted into a function
private bool IsDefaultA() { return ... }

'do...while' vs. 'while'

Possible Duplicates:
While vs. Do While
When should I use do-while instead of while loops?
I've been programming for a while now (2 years work + 4.5 years degree + 1 year pre-college), and I've never used a do-while loop short of being forced to in the Introduction to Programming course. I have a growing feeling that I'm doing programming wrong if I never run into something so fundamental.
Could it be that I just haven't run into the correct circumstances?
What are some examples where it would be necessary to use a do-while instead of a while?
(My schooling was almost all in C/C++ and my work is in C#, so if there is another language where it absolutely makes sense because do-whiles work differently, then these questions don't really apply.)
To clarify...I know the difference between a while and a do-while. While checks the exit condition and then performs tasks. do-while performs tasks and then checks exit condition.

If you always want the loop to execute at least once. It's not common, but I do use it from time to time. One case where you might want to use it is trying to access a resource that could require a retry, e.g.
do
{
try to access resource...
put up message box with retry option
} while (user says retry);

do-while is better if the compiler isn't competent at optimization. do-while has only a single conditional jump, as opposed to for and while which have a conditional jump and an unconditional jump. For CPUs which are pipelined and don't do branch prediction, this can make a big difference in the performance of a tight loop.
Also, since most compilers are smart enough to perform this optimization, all loops found in decompiled code will usually be do-while (if the decompiler even bothers to reconstruct loops from backward local gotos at all).

I have used this in a TryDeleteDirectory function. It was something like this
do
{
try
{
DisableReadOnly(directory);
directory.Delete(true);
}
catch (Exception)
{
retryDeleteDirectoryCount++;
}
} while (Directory.Exists(fullPath) && retryDeleteDirectoryCount < 4);

Do while is useful for when you want to execute something at least once. As for a good example for using do while vs. while, lets say you want to make the following: A calculator.
You could approach this by using a loop and checking after each calculation if the person wants to exit the program. Now you can probably assume that once the program is opened the person wants to do this at least once so you could do the following:
do
{
//do calculator logic here
//prompt user for continue here
} while(cont==true);//cont is short for continue

This is sort of an indirect answer, but this question got me thinking about the logic behind it, and I thought this might be worth sharing.
As everyone else has said, you use a do ... while loop when you want to execute the body at least once. But under what circumstances would you want to do that?
Well, the most obvious class of situations I can think of would be when the initial ("unprimed") value of the check condition is the same as when you want to exit. This means that you need to execute the loop body once to prime the condition to a non-exiting value, and then perform the actual repetition based on that condition. What with programmers being so lazy, someone decided to wrap this up in a control structure.
So for example, reading characters from a serial port with a timeout might take the form (in Python):
response_buffer = []
char_read = port.read(1)
while char_read:
response_buffer.append(char_read)
char_read = port.read(1)
# When there's nothing to read after 1s, there is no more data
response = ''.join(response_buffer)
Note the duplication of code: char_read = port.read(1). If Python had a do ... while loop, I might have used:
do:
char_read = port.read(1)
response_buffer.append(char_read)
while char_read
The added benefit for languages that create a new scope for loops: char_read does not pollute the function namespace. But note also that there is a better way to do this, and that is by using Python's None value:
response_buffer = []
char_read = None
while char_read != '':
char_read = port.read(1)
response_buffer.append(char_read)
response = ''.join(response_buffer)
So here's the crux of my point: in languages with nullable types, the situation initial_value == exit_value arises far less frequently, and that may be why you do not encounter it. I'm not saying it never happens, because there are still times when a function will return None to signify a valid condition. But in my hurried and briefly-considered opinion, this would happen a lot more if the languages you used did not allow for a value that signifies: this variable has not been initialised yet.
This is not perfect reasoning: in reality, now that null-values are common, they simply form one more element of the set of valid values a variable can take. But practically, programmers have a way to distinguish between a variable being in sensible state, which may include the loop exit state, and it being in an uninitialised state.

I used them a fair bit when I was in school, but not so much since.
In theory they are useful when you want the loop body to execute once before the exit condition check. The problem is that for the few instances where I don't want the check first, typically I want the exit check in the middle of the loop body rather than at the very end. In that case, I prefer to use the well-known for (;;) with an if (condition) exit; somewhere in the body.
In fact, if I'm a bit shaky on the loop exit condition, sometimes I find it useful to start writing the loop as a for (;;) {} with an exit statement where needed, and then when I'm done I can see if it can be "cleaned up" by moving initilizations, exit conditions, and/or increment code inside the for's parentheses.

A situation where you always need to run a piece of code once, and depending on its result, possibly more times. The same can be produced with a regular while loop as well.
rc = get_something();
while (rc == wrong_stuff)
{
rc = get_something();
}
do
{
rc = get_something();
}
while (rc == wrong_stuff);

It's as simple as that:
precondition vs postcondition
while (cond) {...} - precondition, it executes the code only after checking.
do {...} while (cond) - postcondition, code is executed at least once.
Now that you know the secret .. use them wisely :)

do while is if you want to run the code block at least once. while on the other hand won't always run depending on the criteria specified.

I see that this question has been adequately answered, but would like to add this very specific use case scenario. You might start using do...while more frequently.
do
{
...
} while (0)
is often used for multi-line #defines. For example:
#define compute_values \
area = pi * r * r; \
volume = area * h
This works alright for:
r = 4;
h = 3;
compute_values;
-but- there is a gotcha for:
if (shape == circle) compute_values;
as this expands to:
if (shape == circle) area = pi *r * r;
volume = area * h;
If you wrap it in a do ... while(0) loop it properly expands to a single block:
if (shape == circle)
do
{
area = pi * r * r;
volume = area * h;
} while (0);

The answers so far summarize the general use for do-while. But the OP asked for an example, so here is one: Get user input. But the user's input may be invalid - so you ask for input, validate it, proceed if it's valid, otherwise repeat.
With do-while, you get the input while the input is not valid. With a regular while-loop, you get the input once, but if it's invalid, you get it again and again until it is valid. It's not hard to see that the former is shorter, more elegant, and simpler to maintain if the body of the loop grows more complex.

I've used it for a reader that reads the same structure multiple times.
using(IDataReader reader = connection.ExecuteReader())
{
do
{
while(reader.Read())
{
//Read record
}
} while(reader.NextResult());
}

I can't imagine how you've gone this long without using a do...while loop.
There's one on another monitor right now and there are multiple such loops in that program. They're all of the form:
do
{
GetProspectiveResult();
}
while (!ProspectIsGood());

I like to understand these two as:
while -> 'repeat until',
do ... while -> 'repeat if'.

I've used a do while when I'm reading a sentinel value at the beginning of a file, but other than that, I don't think it's abnormal that this structure isn't too commonly used--do-whiles are really situational.
-- file --
5
Joe
Bob
Jake
Sarah
Sue
-- code --
int MAX;
int count = 0;
do {
MAX = a.readLine();
k[count] = a.readLine();
count++;
} while(count <= MAX)

Here's my theory why most people (including me) prefer while(){} loops to do{}while(): A while(){} loop can easily be adapted to perform like a do..while() loop while the opposite is not true. A while loop is in a certain way "more general". Also programmers like easy to grasp patterns. A while loop says right at start what its invariant is and this is a nice thing.
Here's what I mean about the "more general" thing. Take this do..while loop:
do {
A;
if (condition) INV=false;
B;
} while(INV);
Transforming this in to a while loop is straightforward:
INV=true;
while(INV) {
A;
if (condition) INV=false;
B;
}
Now, we take a model while loop:
while(INV) {
A;
if (condition) INV=false;
B;
}
And transform this into a do..while loop, yields this monstrosity:
if (INV) {
do
{
A;
if (condition) INV=false;
B;
} while(INV)
}
Now we have two checks on opposite ends and if the invariant changes you have to update it on two places. In a certain way do..while is like the specialized screwdrivers in the tool box which you never use, because the standard screwdriver does everything you need.

I am programming about 12 years and only 3 months ago I have met a situation where it was really convenient to use do-while as one iteration was always necessary before checking a condition. So guess your big-time is ahead :).

It is a quite common structure in a server/consumer:
DOWHILE (no shutdown requested)
determine timeout
wait for work(timeout)
IF (there is work)
REPEAT
process
UNTIL(wait for work(0 timeout) indicates no work)
do what is supposed to be done at end of busy period.
ENDIF
ENDDO
the REPEAT UNTIL(cond) being a do {...} while(!cond)
Sometimes the wait for work(0) can be cheaper CPU wise (even eliminating the timeout calculation might be an improvement with very high arrival rates). Moreover, there are many queuing theory results that make the number served in a busy period an important statistic. (See for example Kleinrock - Vol 1.)
Similarly:
DOWHILE (no shutdown requested)
determine timeout
wait for work(timeout)
IF (there is work)
set throttle
REPEAT
process
UNTIL(--throttle<0 **OR** wait for work(0 timeout) indicates no work)
ENDIF
check for and do other (perhaps polled) work.
ENDDO
where check for and do other work may be exorbitantly expensive to put in the main loop or perhaps a kernel that does not support an efficient waitany(waitcontrol*,n) type operation or perhaps a situation where a prioritized queue might starve the other work and throttle is used as starvation control.
This type of balancing can seem like a hack, but it can be necessary. Blind use of thread pools would entirely defeat the performance benefits of the use of a caretaker thread with a private queue for a high updating rate complicated data structure as the use of a thread pool rather than a caretaker thread would require thread-safe implementation.
I really don't want to get into a debate about the pseudo code (for example, whether shutdown requested should be tested in the UNTIL) or caretaker threads versus thread pools - this is just meant to give a flavor of a particular use case of the control flow structure.

This is my personal opinion, but this question begs for an answer rooted in experience:
I have been programming in C for 38 years, and I never use do / while loops in regular code.
The only compelling use for this construct is in macros where it can wrap multiple statements into a single statement via a do { multiple statements } while (0)
I have seen countless examples of do / while loops with bogus error detection or redundant function calls.
My explanation for this observation is programmers tend to model problems incorrectly when they think in terms of do / while loops. They either miss an important ending condition or they miss the possible failure of the initial condition which they move to the end.
For these reasons, I have come to believe that where there is a do / while loop, there is a bug, and I regularly challenge newbie programmers to show me a do / while loop where I cannot spot a bug nearby.
This type of loop can be easily avoided: use a for (;;) { ... } and add the necessary termination tests where they are appropriate. It is quite common that there need be more than one such test.
Here is a classic example:
/* skip the line */
do {
c = getc(fp);
} while (c != '\n');
This will fail if the file does not end with a newline. A trivial example of such a file is the empty file.
A better version is this:
int c; // another classic bug is to define c as char.
while ((c = getc(fp)) != EOF && c != '\n')
continue;
Alternately, this version also hides the c variable:
for (;;) {
int c = getc(fp);
if (c == EOF || c == '\n')
break;
}
Try searching for while (c != '\n'); in any search engine, and you will find bugs such as this one (retrieved June 24, 2017):
In ftp://ftp.dante.de/tex-archive/biblio/tib/src/streams.c , function getword(stream,p,ignore), has a do / while and sure enough at least 2 bugs:
c is defined as a char and
there is a potential infinite loop while (c!='\n') c=getc(stream);
Conclusion: avoid do / while loops and look for bugs when you see one.

while loops check the condition before the loop, do...while loops check the condition after the loop. This is useful is you want to base the condition on side effects from the loop running or, like other posters said, if you want the loop to run at least once.
I understand where you're coming from, but the do-while is something that most use rarely, and I've never used myself. You're not doing it wrong.
You're not doing it wrong. That's like saying someone is doing it wrong because they've never used the byte primitive. It's just not that commonly used.

The most common scenario I run into where I use a do/while loop is in a little console program that runs based on some input and will repeat as many times as the user likes. Obviously it makes no sense for a console program to run no times; but beyond the first time it's up to the user -- hence do/while instead of just while.
This allows the user to try out a bunch of different inputs if desired.
do
{
int input = GetInt("Enter any integer");
// Do something with input.
}
while (GetBool("Go again?"));
I suspect that software developers use do/while less and less these days, now that practically every program under the sun has a GUI of some sort. It makes more sense with console apps, as there is a need to continually refresh the output to provide instructions or prompt the user with new information. With a GUI, in contrast, the text providing that information to the user can just sit on a form and never need to be repeated programmatically.

I use do-while loops all the time when reading in files. I work with a lot of text files that include comments in the header:
# some comments
# some more comments
column1 column2
1.234 5.678
9.012 3.456
... ...
i'll use a do-while loop to read up to the "column1 column2" line so that I can look for the column of interest. Here's the pseudocode:
do {
line = read_line();
} while ( line[0] == '#');
/* parse line */
Then I'll do a while loop to read through the rest of the file.

Being a geezer programmer, many of my school programming projects used text menu driven interactions. Virtually all used something like the following logic for the main procedure:
do
display options
get choice
perform action appropriate to choice
while choice is something other than exit
Since school days, I have found that I use the while loop more frequently.

One of the applications I have seen it is in Oracle when we look at result sets.
Once you a have a result set, you first fetch from it (do) and from that point on.. check if the fetch returns an element or not (while element found..) .. The same might be applicable for any other "fetch-like" implementations.

I 've used it in a function that returned the next character position in an utf-8 string:
char *next_utf8_character(const char *txt)
{
if (!txt || *txt == '\0')
return txt;
do {
txt++;
} while (((signed char) *txt) < 0 && (((unsigned char) *txt) & 0xc0) == 0xc0)
return (char *)txt;
}
Note that, this function is written from mind and not tested. The point is that you have to do the first step anyway and you have to do it before you can evaluate the condition.

Any sort of console input works well with do-while because you prompt the first time, and re-prompt whenever the input validation fails.

Even though there are plenty of answers here is my take. It all comes down to optimalization. I'll show two examples where one is faster then the other.
Case 1: while
string fileName = string.Empty, fullPath = string.Empty;
while (string.IsNullOrEmpty(fileName) || File.Exists(fullPath))
{
fileName = Guid.NewGuid().ToString() + fileExtension;
fullPath = Path.Combine(uploadDirectory, fileName);
}
Case 2: do while
string fileName = string.Empty, fullPath = string.Empty;
do
{
fileName = Guid.NewGuid().ToString() + fileExtension;
fullPath = Path.Combine(uploadDirectory, fileName);
}
while (File.Exists(fullPath));
So there two will do the exact same things. But there is one fundamental difference and that is that the while requires an extra statement to enter the while. Which is ugly because let's say every possible scenario of the Guid class has already been taken except for one variant. This means I'll have to loop around 5,316,911,983,139,663,491,615,228,241,121,400,000 times.
Every time I get to the end of my while statement I will need to do the string.IsNullOrEmpty(fileName) check. So this would take up a little bit, a tiny fraction of CPU work. But do this very small task times the possible combinations the Guid class has and we are talking about hours, days, months or extra time?
Of course this is an extreme example because you probably wouldn't see this in production. But if we would think about the YouTube algorithm, it is very well possible that they would encounter the generation of an ID where some ID's have already been taken. So it comes down to big projects and optimalization.

Even in educational references you barely would find a do...while example. Only recently, after reading Ethan Brown beautiful book, Learning JavaScript I encountered one do...while well defined example. That's been said, I believe it is OK if you don't find application for this structure in you routine job.

It's true that do/while loops are pretty rare. I think this is because a great many loops are of the form
while(something needs doing)
do it;
In general, this is an excellent pattern, and it has the usually-desirable property that if nothing needs doing, the loop runs zero times.
But once in a while, there's some fine reason why you definitely want to make at least one trip through the loop, no matter what. My favorite example is: converting an integer to its decimal representation as a string, that is, implementing printf("%d"), or the semistandard itoa() function.
To illustrate, here is a reasonably straightforward implementation of itoa(). It's not quite the "traditional" formulation; I'll explain it in more detail below if anyone's curious. But the key point is that it embodies the canonical algorithm, repeatedly dividing by 10 to pick off digits from the right, and it's written using an ordinary while loop... and this means it has a bug.
#include <stddef.h>
char *itoa(unsigned int n, char buf[], int bufsize)
{
if(bufsize < 2) return NULL;
char *p = &buf[bufsize];
*--p = '\0';
while(n > 0) {
if(p == buf) return NULL;
*--p = n % 10 + '0';
n /= 10;
}
return p;
}
If you didn't spot it, the bug is that this code returns nothing — an empty string — if you ask it to convert the integer 0. So this is an example of a case where, when there's "nothing" to do, we don't want the code to do nothing — we always want it to produce at least one digit. So we always want it to make at least one trip through the loop. So a do/while loop is just the ticket:
do {
if(p == buf) return NULL;
*--p = n % 10 + '0';
n /= 10;
} while(n > 0);
So now we have a loop that usually stops when n reaches 0, but if n is initially 0 — if you pass in a 0 — it returns the string "0", as desired.
As promised, here's a bit more information about the itoa function in this example. You pass it arguments which are: an int to convert (actually, an unsigned int, so that we don't have to worry about negative numbers); a buffer to render into; and the size of that buffer. It returns a char * pointing into your buffer, pointing at the beginning of the rendered string. (Or it returns NULL if it discovers that the buffer you gave it wasn't big enough.) The "nontraditional" aspect of this implementation is that it fills in the array from right to left, meaning that it doesn't have to reverse the string at the end — and also meaning that the pointer it returns to you is usually not to the beginning of the buffer. So you have to use the pointer it returns to you as the string to use; you can't call it and then assume that the buffer you handed it is the string you can use.
Finally, for completeness, here is a little test program to test this version of itoa with.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int n;
if(argc > 1)
n = atoi(argv[1]);
else {
printf("enter a number: "); fflush(stdout);
if(scanf("%d", &n) != 1) return EXIT_FAILURE;
}
if(n < 0) {
fprintf(stderr, "sorry, can't do negative numbers yet\n");
return EXIT_FAILURE;
}
char buf[20];
printf("converted: %s\n", itoa(n, buf, sizeof(buf)));
return EXIT_SUCCESS;
}

I ran across this while researching the proper loop to use for a situation I have. I believe this will fully satisfy a common situation where a do.. while loop is a better implementation than a while loop (C# language, since you stated that is your primary for work).
I am generating a list of strings based on the results of an SQL query. The returned object by my query is an SQLDataReader. This object has a function called Read() which advances the object to the next row of data, and returns true if there was another row. It will return false if there is not another row.
Using this information, I want to return each row to a list, then stop when there is no more data to return. A Do... While loop works best in this situation as it ensures that adding an item to the list will happen BEFORE checking if there is another row. The reason this must be done BEFORE checking the while(condition) is that when it checks, it also advances. Using a while loop in this situation would cause it to bypass the first row due to the nature of that particular function.
In short:
This won't work in my situation.
//This will skip the first row because Read() returns true after advancing.
while (_read.NextResult())
{
list.Add(_read.GetValue(0).ToString());
}
return list;
This will.
//This will make sure the currently read row is added before advancing.
do
{
list.Add(_read.GetValue(0).ToString());
}
while (_read.NextResult());
return list;

Is there a standard way to count statements in C#

I was looking at some code length metrics other than Lines of Code. Something that Source Monitor reports is statements. This seemed like a valuable thing to know, but the way Source Monitor counted some things seemed unintuitive. For example, a for statement is one statement, even though it contains a variable definition, a condition, and an increment statement. And if a method call is nested in an argument list to another method, the whole thing is considered one statement.
Is there a standard way that statements are counted and are their rules governing such a thing?

The first rule of metrics is "be careful what you measure". You ask for a count of statements, that's what you're going to get. As you note, that figure is perhaps not actually relevant.
If you're interested in other measures, like how "complex" code is, consider looking into other code metrics, like cyclometric complexity.
http://en.wikipedia.org/wiki/Cyclomatic_complexity
UPDATE: Re: your comment
I agree that "doing too much" is an interesting metric. My rule of thumb is that one statement should have one side effect (usually a "local" side effect like mutating a local variable, but sometimes a visible side effect, like writing to a file) and therefore "number of statements" should be roughly correlated with how much the method is "doing" in terms of its number of side effects.
In practice, of course no one's code, my own included, actually meets that bar all the time. You might consider a metric for "how much the method is doing" to count not just statements but also, say, method calls.
To actually answer your question: I'm not aware of any industry standard that regulates what "number of statements" is. The C# specification certainly defines what a "statement" is lexically, but then of course you have to do some interpretation to do a count. For example:
void M()
{
try
{
if (blah)
{
Frob();
Blob();
}
}
catch(Exception ex)
{ /* eat it */ }
finally
{
Grob();
}
}
How many statements are there in M? Well, the body of M consists of one statement, a try-catch-finally. So is the answer one? The body of the try contains one statement, an "if" statement. The consequence of the "if" contains one statement -- remember, a block is a statement. The block contains two statements. The finally contains one statement. The catch block contains no statements -- a catch block is not a statement, lexically -- but it certainly is highly relevant to the operation of the method!
So how many statements is that altogether? One could make a reasonable case for any number from one to six, depending on whether you count blocks as "real" statements, whether you consider child statements as in addition to their parent statement or not, and so on. There is no standards body which regulates the answer to this question that I'm aware of.

The closest you might get to a formal definition of "what is a statement" would be the C# specification itself. Good luck working out whether a particular tool's measurement agrees with your reading of the specification.
Given that metrics are best used as a guide to better/worse code, and not a strict formula, does the exact definition used by the tool make much difference?
If I have three methods, with "statement lengths" of 2500, 1500 and 150, I know which method I'll be examining first; that another tool might report 2480, 1620 and 174 isn't too important.
One of the best tools I've seen for measuring metrics is NDepend, though again I'm not 100% sure what definitions it is using. According to the website, NDepend has 82 separate metrics, including Number of instructions and Cyclomatic Complexity.

The C# Metrics Tool defines the things being counted ("statements", "operands"), etc. by using a precise C# BNF language definition. (In fact, it precisely parses the code according a full C# grammar and then computes structural metrics by walking over the parse tree; SLOC count it gets by countline lines as you'd expect).
You might still argue that such a definition it unintuitive (grammars rarely are), but they are precise. I agree with other posters here, however, that the precise measure isn't as important as the relative value that one block of code has with respect to another. A value of "173.92" complexity just isn't very helpful by itself; compard to another complexity value of "81.02", we can say there's a good indication that the first one is more complex than the second, and that's enough to provide a focus of attention.
I think that metrics are also useful in trending; if last week, this code was "81.02" complex, ad this week it is "173.92", I should wonder why is all that happening inthis part of the code?
You might also consider a ratio of a structural metric (e.g., Cyclomatic) to SLOC as an indication of "doing too much", or at least an indication of writing code that is way too dense to understand

One simple metric is to just count the punctuation marks (;, ,, .) between tokens (so as to avoid those in strings, comments, or numbers). Thus, for (x = 0, y = 1; x < foo.Count; x++, y++) bar[y] = foo[x]; would count as 6.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.