I have this piece of code:
var kv = new Dictionary<int, string>() { ... };
var kv30Valid = kv.ContainsKey(30) && int.TryParse(kv[30], out var kv30Value);
myObject.nullableInt = kv30Valid ? (int?)kv30Value : null;
Note: myObject is a POCO class representing a table row, that's why nullable int.
I cannot compile my code because I get compiler error on the last line:
Local variable 'kv30Value' might not be initialized before accessing
In which case can it be unintialized and how to properly handle the case to allow valid code?
I need to populate the myObject properties with values from the kv (if they are present) parsed to their respective values.
Solution:
Moving the condition into TryParse() method solved the problem.
var kv30Valid = int.TryParse(kv.ContainsKey(30) ? kv[30] : null, out var kv30Value);
The definite assignment analyzer has limitations (as it must, yada yada yada halting problem, etc). Although we can look at this code and conclude that it'll only access kv30value if ContainsKey returned true and thus TryParse was called, it's too "separate" for the analyzer to be able to see this.
If this was inside an if block using kv30valid it might be able to see it but even then I'm not sure.
In which case can it be unintialized
When kv.ContainsKey(30) returns false, int.TryParse() isn't called, and kv30Value won't be assigned.
To simplify the issue, you have an unassigned variable:
bool test; // declared, but not assigned
if (test)
{
Console.WriteLine("Test is true");
}
This won't compile, because test is not assigned.
Now a method with an out parameter will definitely assign the variable:
bool test;
Assign(out test);
if (test)
{
Console.WriteLine("Test is true");
}
private static void Assign(out bool foo)
{
foo = true;
}
This will print "Test is true".
Now if you make the assignment conditional:
bool test;
bool condition = DateTime.Now > DateTime.Now;
if (condition)
{
Assign(out test);
}
if (test) { ... }
You'll be back at the compiler error:
CS0165: Use of unassigned local variable test
Because the assignment of test can't be guaranteed by the compiler, so it forbids further use of that variable.
Even if the usage of the variable uses the same condition:
bool test;
bool condition = DateTime.Now > DateTime.Now;
if (condition)
{
Assign(out test);
}
if (condition && test)
{
Console.WriteLine("Test is true");
}
Then still the compiler refuses you to use test.
That's exactly the same as with your && int.TryParse(..., out) code. The right side of the && is conditionally executed, thus the compiler will refuse to let you use the variable that's potentially unassigned.
Regarding the discussion below and the downvote on my answer, if you want to know the why behind all this, see the C# language specification chapter 5.3 Definite assignment. Basically put, you get this error because the compiler does a best effort attempt at statically analyzing whether the variable is assigned.
how to properly handle the case to allow valid code?
Declare it above, and assign it a sensible default value:
int kv30Value = 0;
var kv30Valid = kv.ContainsKey(30) && int.TryParse(kv[30], out kv30Value);
Or simplify the code by moving it into an if, where it'll be definitely assigned:
if (kv.ContainsKey(30) && int.TryParse(kv[30], out var kv30Value))
{
myObject.nullableInt = kv30Value;
}
Related
I can't seem to wrap my head around the compiler's warning in this case:
using System;
using System.Collections.Generic;
#nullable enable
public class Program
{
public static void Main()
{
Guid guid = Guid.NewGuid();
Dictionary<Guid, string> d = new();
bool found = d.TryGetValue(guid, out string? str);
if (found is false)
{
return;
}
string s = str; // WARNING: possible null value
}
}
After all, I'm doing the found check and return if there is no value (e.g. when the out str value is null). Plus, the out parameter of the .TryGetValue method is annotated with [MaybeNullWhen(false)].
Would appreciate your help figuring our where I'm wrong in my expectations and fixing the code, thanks. Code is here.
Basically the compiler (or language specification) isn't "smart" enough to carry the conditional processing of the return value from TryGetValue over when you use a local variable.
If you inline the TryGetValue call into the if condition, it's fine:
if (!d.TryGetValue(guid, out string? str))
{
return;
}
string s = str; // No warning
It's possible that over time this will evolve to be more sophisticated, but it's relatively difficult to specify this sort of thing in a bullet-proof way.
This isn't limited to nullable reference types - there are other cases where logically code is fine from a human perspective, but the compiler will reject it. For example:
string text;
bool condition = DateTime.UtcNow.Hour == 5;
if (condition)
{
text = "hello";
}
if (condition)
{
Console.WriteLine(text); // Error: use of unassigned local variable
}
We know that if we get into the second if statement body, we'll also have gone into the first one so text will have been assigned a value, but the rules of the compiler don't attempt to be smart enough to spot that.
For some reason this if statement is invalid and it says: "Use of unassigned local variable 'I1'.
I can't figure out what is wrong with this if statement and it's bothering me.
I am using VisualStudio 2019, and I can't figure out how to fix it.
bool I1;
if (I1 == true)
{
}
The line 'Bool I1;' creates a variable, but doesn't assign it any value. Your if statement is asking "does this thing with no value equal something" and the compiler is politely letting you know that it's not bothered to do that. Presumably you'd want something like
bool I1 = GetValueOfI1Method();
if(I1) //if(I1) is functionally equivalent to if(I1 == true) but generally more readible
{
//do stuff
}
I'm guessing/hoping you have some code that could potentially assign a value to I1, if you want to use your if statement anyway what you can do is use a nullable boolean
bool? I1 = null; //use a nullable bool
//code which might assign a value to foo
...
//if statement to check if a value has been assigned and if it's true
if(I1.HasValue && I1.Value) //has anyone assigned a value to I1 and if so is that value true
{
}
Realistically you'd never bother to do the above with a bool, you'd just set it to false and run a method that might possibly set it to true (or the inverse)
Consider the following code:
using System;
#nullable enable
namespace Demo
{
public sealed class TestClass
{
public string Test()
{
bool isNull = _test == null;
if (isNull)
return "";
else
return _test; // !!!
}
readonly string _test = "";
}
}
When I build this, the line marked with !!! gives a compiler warning: warning CS8603: Possible null reference return..
I find this a little confusing, given that _test is readonly and initialised to non-null.
If I change the code to the following, the warning goes away:
public string Test()
{
// bool isNull = _test == null;
if (_test == null)
return "";
else
return _test;
}
Can anyone explain this behaviour?
I can make a reasonable guess as to what's going on here, but it's all a bit complicated :) It involves the null state and null tracking described in the draft spec. Fundamentally, at the point where we want to return, the compiler will warn if the state of the expression is "maybe null" instead of "not null".
This answer is in somewhat narrative form rather than just "here's the conclusions"... I hope it's more useful that way.
I'm going to simplify the example slightly by getting rid of the fields, and consider a method with one of these two signatures:
public static string M(string? text)
public static string M(string text)
In the implementations below I've given each method a different number so I can refer to specific examples unambiguously. It also allows all of the implementations to be present in the same program.
In each of the cases described below, we'll do various things but end up trying to return text - so it's the null state of text that's important.
Unconditional return
First, let's just try to return it directly:
public static string M1(string? text) => text; // Warning
public static string M2(string text) => text; // No warning
So far, so simple. The nullable state of the parameter at the start of the method is "maybe null" if it's of type string? and "not null" if it's of type string.
Simple conditional return
Now let's check for null within the if statement condition itself. (I would use the conditional operator, which I believe will have the same effect, but I wanted to stay truer to the question.)
public static string M3(string? text)
{
if (text is null)
{
return "";
}
else
{
return text; // No warning
}
}
public static string M4(string text)
{
if (text is null)
{
return "";
}
else
{
return text; // No warning
}
}
Great, so it looks like within an if statement where the condition itself checks for nullity, the state of the variable within each branch of the if statement can be different: within the else block, the state is "not null" in both pieces of code. So in particular, in M3 the state changes from "maybe null" to "not null".
Conditional return with a local variable
Now let's try to hoist that condition to a local variable:
public static string M5(string? text)
{
bool isNull = text is null;
if (isNull)
{
return "";
}
else
{
return text; // Warning
}
}
public static string M6(string text)
{
bool isNull = text is null;
if (isNull)
{
return "";
}
else
{
return text; // Warning
}
}
Both M5 and M6 issue warnings. So not only do we not get the positive effect of the state change from "maybe null" to "not null" in M5 (as we did in M3)... we get the opposite effect in M6, where the state goes from "not null" to "maybe null". That really surprised me.
So it looks like we've learned that:
Logic around "how a local variable was computed" isn't used to propagate state information. More on that later.
Introducing a null comparison can warn the compiler that something it previously thought wasn't null might be null after all.
Unconditional return after an ignored comparison
Let's look at the second of those bullet points, by introducing a comparison before an unconditional return. (So we're completely ignoring the result of the comparison.):
public static string M7(string? text)
{
bool ignored = text is null;
return text; // Warning
}
public static string M8(string text)
{
bool ignored = text is null;
return text; // Warning
}
Note how M8 feels like it should be equivalent to M2 - both have a not-null parameter which they return unconditionally - but the introduction of a comparison with null changes the state from "not null" to "maybe null". We can get further evidence of this by trying to dereference text before the condition:
public static string M9(string text)
{
int length1 = text.Length; // No warning
bool ignored = text is null;
int length2 = text.Length; // Warning
return text; // No warning
}
Note how the return statement doesn't have a warning now: the state after executing text.Length is "not null" (because if we execute that expression successfully, it couldn't be null). So the text parameter starts as "not null" due to its type, becomes "maybe null" due to the null comparison, then becomes "not null" again after text2.Length.
What comparisons affect state?
So that's a comparison of text is null... what effect similar comparisons have? Here are four more methods, all starting with a non-nullable string parameter:
public static string M10(string text)
{
bool ignored = text == null;
return text; // Warning
}
public static string M11(string text)
{
bool ignored = text is object;
return text; // No warning
}
public static string M12(string text)
{
bool ignored = text is { };
return text; // No warning
}
public static string M13(string text)
{
bool ignored = text != null;
return text; // Warning
}
So even though x is object is now a recommended alternative to x != null, they don't have the same effect: only a comparison with null (with any of is, == or !=) changes the state from "not null" to "maybe null".
Why does hoisting the condition have an effect?
Going back to our first bullet point earlier, why don't M5 and M6 take account of the condition which led to the local variable? This doesn't surprise me as much as it appears to surprise others. Building that sort of logic into the compiler and specification is a lot of work, and for relatively little benefit. Here's another example with nothing to do with nullability where inlining something has an effect:
public static int X1()
{
if (true)
{
return 1;
}
}
public static int X2()
{
bool alwaysTrue = true;
if (alwaysTrue)
{
return 1;
}
// Error: not all code paths return a value
}
Even though we know that alwaysTrue will always be true, it doesn't satisfy the requirements in the specification that make the code after the if statement unreachable, which is what we need.
Here's another example, around definite assignment:
public static void X3()
{
string x;
bool condition = DateTime.UtcNow.Year == 2020;
if (condition)
{
x = "It's 2020.";
}
if (!condition)
{
x = "It's not 2020.";
}
// Error: x is not definitely assigned
Console.WriteLine(x);
}
Even though we know that the code will enter exactly one of those if statement bodies, there's nothing in the spec to work that out. Static analysis tools may well be able to do so, but trying to put that into the language specification would be a bad idea, IMO - it's fine for static analysis tools to have all kinds of heuristics which can evolve over time, but not so much for a language specification.
The nullable flow analysis tracks the null state of variables, but it does not track other state, such as the value of a bool variable (as isNull above), and it does not track the relationship between the state of separate variables (e.g. isNull and _test).
An actual static analysis engine would probably do those things, but would also be "heuristic" or "arbitrary" to some degree: you couldn't necessarily tell the rules it was following, and those rules might even change over time.
That's not something we can do directly in the C# compiler. The rules for nullable warnings are quite sophisticated (as Jon's analysis shows!), but they are rules, and can be reasoned about.
As we roll out the feature it feels like we mostly struck the right balance, but there are a few places that do come up as awkward, and we'll be revisiting those for C# 9.0.
You have discovered evidence that the program-flow algorithm that produces this warning is relatively unsophisticated when it comes to tracking the meanings encoded in local variables.
I have no specific knowledge of the flow checker's implementation, but having worked on implementations of similar code in the past, I can make some educated guesses. The flow checker is likely deducing two things in the false positive case: (1) _test could be null, because if it could not, you would not have the comparison in the first place, and (2) isNull could be true or false -- because if it could not, you would not have it in an if. But the connection that the return _test; only runs if _test is not null, that connection is not being made.
This is a surprisingly tricky problem, and you should expect that it will take a while for the compiler to attain the sophistication of tools that have had multiple years of work by experts. The Coverity flow checker, for example, would have no problem at all in deducing that neither of your two variations had a null return, but the Coverity flow checker costs serious money for corporate customers.
Also, the Coverity checkers are designed to run on large codebases overnight; the C# compiler's analysis must run between keystrokes in the editor, which significantly changes the sorts of in-depth analyses you can reasonably perform.
All the other answers are pretty much exactly correct.
In case anyone's curious, I tried to spell out the compiler's logic as explicitly as possible in https://github.com/dotnet/roslyn/issues/36927#issuecomment-508595947
The one piece that's not mentioned is how we decide whether a null check should be considered "pure", in the sense that if you do it, we should seriously consider whether null is a possibility. There are a lot of "incidental" null checks in C#, where you test for null as a part of doing something else, so we decided that we wanted to narrow down the set of checks to ones that we were sure people were doing deliberately. The heuristic we came up with was "contains the word null", so that's why x != null and x is object produce different results.
I want to check if a variable is initialized at run time, programmatically. To make the reasons for this less mysterious, please see the following incomplete code:
string s;
if (someCondition) s = someValue;
if (someOtherCondition) s = someOtherValue;
bool sIsUninitialized = /* assign value correctly */;
if (!sIsUninitialized) Console.WriteLine(s) else throw new Exception("Please initialize s.");
And complete the relevant bit.
One hacky solution is to initialize s with a default value:
string s = "zanzibar";
And then check if it changed:
bool sIsUninitialized = s == "zanzibar";
However, what if someValue or someOtherValue happen to be "zanzibar" as well? Then I have a bug. Any better way?
Code won't even compile if the compiler knows a variable hasn't been initialized.
string s;
if (condition) s = "test";
// compiler error here: use of unassigned local variable 's'
if (s == null) Console.Writeline("uninitialized");
In other cases you could use the default keyword if a variable may not have been initialized. For example, in the following case:
class X
{
private string s;
public void Y()
{
Console.WriteLine(s == default(string)); // this evaluates to true
}
}
The documentation states that default(T) will give null for reference types, and 0 for value types. So as pointed out in the comments, this is really just the same as checking for null.
This all obscures the fact that you should really initialize variables, to null or whatever, when they are first declared.
With C# 2.0, you have the Nullable operator that allows you to set an initial value of null for heretofore value types, allowing for such things as:
int? x = null;
if (x.HasValue)
{
Console.WriteLine("Value for x: " + num.Value);
}
Which yields:
"Value for x: Null".
Just assign it null by default, not a string value
Here's one way:
string s;
if (someCondition) { s = someValue; }
else if (someOtherCondition) { s = someOtherValue; }
else { throw new Exception("Please initialize s."); }
Console.WriteLine(s)
This might be preferable for checking if the string is null, because maybe someValue is a method that can sometimes return null. In other words, maybe null is a legitimate value to initialize the string to.
Personally I like this better than an isInitialized flag. Why introduce an extra flag variable unless you have to? I don't think it is more readable.
You can keep a separate flag that indicates that the string has been initialized:
string s = null;
bool init = false;
if (conditionOne) {
s = someValueOne;
init = true;
}
if (conditionTwo) {
s = someValueTwo;
init = true;
}
if (!init) {
...
}
This will take care of situations when s is assigned, including the cases when it is assigned null, empty string, or "zanzibar".
Another solution is to make a static string to denote "uninitialized" value, and use Object.ReferenceEquals instead of == to check if it has changed. However, the bool variable approach expresses your intent a lot more explicitly.
I would agree with Vytalyi that a default value of null should be used when possible, however, not all types (like int) are nullable. You could allocate the variable as a nullable type as explained by David W, but this could break a lot of code in a large codebase due to having to refine the nullable type to its primitive type before access.
This generic method extension should help for those who deal with large codebases where major design decisions were already made by a predecessor:
public static bool IsDefault<T>(this T value)
=> ((object) value == (object) default(T));
If you are staring from scratch, just take advantage of nullable types and initialize it as null; that C# feature was implemented for a reason.
I pick initialization values that can never be used, typical values include String.Empty, null, -1, and a 256 character random string generator .
In general, assign the default to be null or String.Empty. For situations where you cannot use those "empty" values, define a constant to represent your application-specific uninitialized value:
const string UninitializedString = "zanzibar";
Then reference that value whenever you want to initialize or test for initialization:
string foo = UnininitializedString;
if (foo == UninitiaizedString) {
// Do something
}
Remember that strings are immutable constants in C# so there is really only one instance of UninitializedString (which is why the comparison works).
Now I've long known and been use to this behavior in C#, and in general, I like it. But sometimes the compiler just isn't smart enough.
I have a small piece of code where right now my workaround isn't a big problem, but it could be in similar cases.
bool gap=false;
DateTime start; // = new DateTime();
for (int i = 0; i < totaldays; i++)
{
if (gap)
{
if (list[i])
{
var whgap = new WorkHistoryGap();
whgap.From = start; //unassigned variable error
whgap.To = dtFrom.AddDays(i);
return whgap;
}
}
else
{
gap = true;
start = dtFrom.AddDays(i);
}
}
The problem I'm seeing is what if you had to do this with a non-nullable struct that didn't have a default constructor? Would there be anyway to workaround this if start wasn't a simple DateTime object?
sometimes the compiler just isn't smart enough
The problem you want the compiler to solve is equivalent to the Halting Problem. Since that problem is provably not solvable by computer programs, we make only a minimal attempt to solve it. We don't do anything particularly sophisticated. You're just going to have to live with it.
For more information on why program analysis is equivalent to the Halting Problem, see my article on the subject of deducing whether the end point of a method is reachable. This is essentially the same problem as determining if a variable is definitely assigned; the analysis is very similar.
http://blogs.msdn.com/b/ericlippert/archive/2011/02/24/never-say-never-part-two.aspx
what if you had to do this with a non-nullable struct that didn't have a default constructor?
There is no such animal. All structs, nullable or otherwise, have a default constructor.
Would there be anyway to workaround this if start wasn't a simple DateTime object?
The expression default(T) gives you the default value for any type T. You can always say
Foo f = default(Foo);
and have a legal assignment. If Foo is a value type then it calls the default constructor, which always exists. If it is a reference type then you get null.
The compiler has no way of knowing that you are guaranteed to set DateTime because of your gap variable.
Just use
DateTime start = DateTime.Now;
and be done with it.
Edit Better yet, on second glance through your code, use
DateTime start = dtFrom;
There is no such thing as a default constructor in a struct. Try it:
struct MyStruct {
public MyStruct() {
// doesn't work
}
}
You can have a static constructor, but you cannot define a default constructor for a struct. That's why there's the static method Create on so many structures, and why you can say new Point() instead of Point.Empty.
The "default constructor" of any struct always initializes all of its fields to their default values. The Empty static field of certian types is for convenience. It actually makes zero difference in performance because they're value types.
Looks to me like your bool gap and the DateTime start are really the same thing. Try refactoring like this:
DateTime? gapStart = null ;
for (int i = 0; i < totaldays; i++)
{
if ( gapStart.HasValue )
{
if (list[i])
{
var whgap = new WorkHistoryGap();
whgap.From = gapStart.Value ; //unassigned variable error
whgap.To = dtFrom.AddDays(i);
return whgap;
}
}
else
{
gapStart = dtFrom.AddDays(i);
}
}
[edited to note: please post code samples that will...oh...actually compile. It makes it easier.]
[further edited to note: you set gap to true and set your start value the first time through the loop. Further refactor to something like this:]
DateTime gapStart = dtFrom.AddDays( 0 );
for ( int i = 1 ; i < totaldays ; i++ )
{
if ( list[i] )
{
var whgap = new WorkHistoryGap();
whgap.From = gapStart.Value; //unassigned variable error
whgap.To = dtFrom.AddDays( i );
return whgap;
}
}
Why are you trying to work around the design of the language? Even if the compiler could work out your entire loop in advance, which seems needlessly complex on the part of the compiler, how does it know that exceptions cannot be thrown in portions of your code? You MUST assign a value to start because you use it later in the code, possibly before its (according to you) inevitable assignment.