Definitely, I know the basic differences between unsigned integers (uint) and signed integers (int).
I noticed that in .NET public classes, a property called Length is always using signed integers.
Maybe this is because unsigned integers are not CLS compliant.
However, for example, in my static function:
public static double GetDistributionDispersion(int tokens, int[] positions)
The parameter tokens and all elements in positions cannot be negative. If any of them is negative, the final result is useless. So if I use int for both tokens and positions, I have to check the values every time this function is called (and either return nonsense values or throw exceptions if negative values are found?), which is tedious.
OK, then we should use uint for both parameters. This really makes sense to me.
I found, however, that a lot of public APIs almost always use int. Does that mean that, inside their implementation, they always check each value for negativity (if it is supposed to be non-negative)?
So, in a word, what should I do?
I could provide two cases:
This function will only be called by myself in my own solution;
This function will be used as a library by others on another team.
Should we use different schemes for these two cases?
Peter
P.S.: I did do a lot of research, and there is still no reason to convince me not to use uint :-)
I see three options.
Use uint. The framework doesn't because it's not CLS compliant. But do you have to be CLS compliant? (There are also some not-fun issues with arithmetic that crop up; it's not fun to cast all over the place. I tend to avoid uint for this reason.)
Use int but use contracts:
Contract.Requires(tokens >= 0);
Contract.Requires(Contract.ForAll(positions, position => position >= 0));
Make explicit exactly what you require.
Create a custom type that encapsulates the requirement:
struct Foo {
public readonly int foo;
public Foo(int foo) {
Contract.Requires(foo >= 0);
this.foo = foo;
}
public static implicit operator int(Foo foo) {
return foo.foo; // a static operator has no "this"; read the field from the argument
}
public static explicit operator Foo(int foo) {
return new Foo(foo);
}
}
Then:
public static double GetDistributionDispersion(Foo tokens, Foo[] positions) { }
Ah, nice. We don't have to worry about it in our method. If we're getting a Foo, it's valid.
You have a reason for requiring non-negativity in your domain. It's modeling some concept. Might as well promote that concept to a bonafide object in your domain model and encapsulate all the concepts that come with it.
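The arithmetic friction mentioned under the first option can be illustrated with a minimal sketch. The class and method names here are hypothetical and only exist to show how uint interacts with int in ordinary expressions:

```csharp
using System;

static class UintPitfalls
{
    // uint - uint stays uint, so a "negative" result wraps around.
    // unchecked is written out so the behavior does not depend on compiler settings.
    public static uint Difference(uint a, uint b)
    {
        return unchecked(a - b);
    }

    // Mixing int and uint promotes both operands to long,
    // so the result no longer fits either original type without a cast.
    public static long Mixed(int i, uint u)
    {
        return i + u;
    }

    // Array.Length is an int, so storing it in a uint needs an explicit cast.
    public static uint Count(int[] items)
    {
        return (uint)items.Length;
    }
}
```

For instance, Difference(3, 5) yields 4294967294 rather than -2, which is the kind of surprise that tends to push people back to int.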
I use uint.
Yes, other answers are all correct... But I prefer uint for one reason:
It makes the interface clearer. If a parameter (or a return value) is unsigned, it is because it cannot be negative (have you ever seen a negative collection count?). Otherwise, I need to check parameters and document which parameters (and return values) cannot be negative; then I need to write additional unit tests for checking parameters and return values. (And someone would complain about doing casts? Are int casts so frequent? Not in my experience.)
Additionally, users are required to test return values for negativity, which may be worse.
I don't care about CLS compliance, so why should I be compliant? From my point of view, the question should be reversed: why should I use ints when the value cannot be negative?
As for returning negative values to carry additional information (errors, for example): I don't like such a C-ish design. I think there are more modern ways to accomplish this (e.g. exceptions, or alternatively out parameters combined with a boolean return value).
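The "out value plus boolean return" alternative to a -1 sentinel might look like the following sketch. TryFindIndex is a hypothetical name, not a framework method; it only illustrates the shape of the design:

```csharp
using System;

static class Search
{
    // Hypothetical Try-style alternative to an IndexOf that returns -1 on failure.
    // Success is reported by the bool; the index is unsigned because it cannot be negative.
    public static bool TryFindIndex(int[] values, int target, out uint index)
    {
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] == target)
            {
                index = (uint)i; // the loop counter is an int, so a cast is needed
                return true;
            }
        }
        index = 0; // out parameters must be assigned on every path
        return false;
    }
}
```

The caller then writes `if (Search.TryFindIndex(data, 6, out position)) { ... }` and never has to compare against a magic negative number.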
Yes go for int. I once tried to go uint myself only to refactor everything again as soon as I had my share of annoying casts in my lib.
I guess the decision to go for int is for historical reasons where often a result of -1 indicates some sort of error (for example in IndexOf).
int. It gives more flexibility if you need to make an API change in the future.
For example, negative indexes are often used in Python to count backwards from the end of a string.
Values also turn negative when they overflow, and an assertion will catch that.
It's a trade-off of speed versus robustness.
As you have read, uint violates the Common Language Specification rules. But consider how often the function will be used: if it is going to be combined with other methods, it is normal for them to expect int parameters, which would get you into the trouble of casting the values.
If you are going to make it available as a library, then it's better to stick to the conventional int; otherwise you need to take care yourself of the conditions in which you might not get a positive value, which would mean littering checks across the code.
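If int is kept, the checks do not have to be scattered; they can sit in one guard helper. The sketch below is an assumption about how that could look: Guard, Dispersion, and ValidateArguments are hypothetical names, and the actual dispersion computation is not shown because the thread never gives it.

```csharp
using System;

static class Guard
{
    // Hypothetical helper: throws if the value is negative.
    public static void NonNegative(int value, string paramName)
    {
        if (value < 0)
            throw new ArgumentOutOfRangeException(paramName, value, "Value must be non-negative.");
    }
}

static class Dispersion
{
    // Only the validation step for (int tokens, int[] positions) is sketched here.
    public static void ValidateArguments(int tokens, int[] positions)
    {
        Guard.NonNegative(tokens, nameof(tokens));
        if (positions == null) throw new ArgumentNullException(nameof(positions));
        foreach (int position in positions)
            Guard.NonNegative(position, nameof(positions));
    }
}
```

A public method would call ValidateArguments once at its top and fail fast, instead of returning a meaningless result.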
Interesting Read - SO link
If we want to get a value from a method, we can use either return value, like this:
public int GetValue();
or:
public void GetValue(out int x);
I don't really understand the differences between them, and so don't know which is better. Can you explain this to me?
Thank you.
Return values are almost always the right choice when the method doesn't have anything else to return. (In fact, I can't think of any cases where I'd ever want a void method with an out parameter, if I had the choice. C# 7's Deconstruct methods for language-supported deconstruction act as a very, very rare exception to this rule.)
Aside from anything else, it stops the caller from having to declare the variable separately:
int foo;
GetValue(out foo);
vs
int foo = GetValue();
Out values also prevent method chaining like this:
Console.WriteLine(GetValue().ToString("g"));
(Indeed, that's one of the problems with property setters as well, and it's why the builder pattern uses methods which return the builder, e.g. myStringBuilder.Append(xxx).Append(yyy).)
Additionally, out parameters are slightly harder to use with reflection and usually make testing harder too. (More effort is usually put into making it easy to mock return values than out parameters). Basically there's nothing I can think of that they make easier...
Return values FTW.
EDIT: In terms of what's going on...
Basically when you pass in an argument for an "out" parameter, you have to pass in a variable. (Array elements are classified as variables too.) The method you call doesn't have a "new" variable on its stack for the parameter - it uses your variable for storage. Any changes in the variable are immediately visible. Here's an example showing the difference:
using System;
class Test
{
static int value;
static void ShowValue(string description)
{
Console.WriteLine(description + value);
}
static void Main()
{
Console.WriteLine("Return value test...");
value = 5;
value = ReturnValue();
ShowValue("Value after ReturnValue(): ");
value = 5;
Console.WriteLine("Out parameter test...");
OutParameter(out value);
ShowValue("Value after OutParameter(): ");
}
static int ReturnValue()
{
ShowValue("ReturnValue (pre): ");
int tmp = 10;
ShowValue("ReturnValue (post): ");
return tmp;
}
static void OutParameter(out int tmp)
{
ShowValue("OutParameter (pre): ");
tmp = 10;
ShowValue("OutParameter (post): ");
}
}
Results:
Return value test...
ReturnValue (pre): 5
ReturnValue (post): 5
Value after ReturnValue(): 10
Out parameter test...
OutParameter (pre): 5
OutParameter (post): 10
Value after OutParameter(): 10
The difference is at the "post" step - i.e. after the local variable or parameter has been changed. In the ReturnValue test, this makes no difference to the static value variable. In the OutParameter test, the value variable is changed by the line tmp = 10;
What's better, depends on your particular situation. One of the reasons out exists is to facilitate returning multiple values from one method call:
public int ReturnMultiple(int input, out int output1, out int output2)
{
output1 = input + 1;
output2 = input + 2;
return input;
}
So one is not by definition better than the other. But usually you'd want to use a simple return, unless you have the above situation for example.
EDIT:
This is a sample demonstrating one of the reasons that the keyword exists. The above is in no way to be considered a best practise.
You should generally prefer a return value over an out param. Out params are a necessary evil if you find yourself writing code that needs to do 2 things. A good example of this is the Try pattern (such as Int32.TryParse).
Let's consider what the caller of your two methods would have to do. For the first example I can write this...
int foo = GetValue();
Notice that I can declare a variable and assign it via your method in one line. For the second example it looks like this...
int foo;
GetValue(out foo);
I'm now forced to declare my variable up front and write my code over two lines.
update
A good place to look when asking these types of question is the .NET Framework Design Guidelines. If you have the book version then you can see the annotations by Anders Hejlsberg and others on this subject (page 184-185) but the online version is here...
http://msdn.microsoft.com/en-us/library/ms182131(VS.80).aspx
If you find yourself needing to return two things from an API then wrapping them up in a struct/class would be better than an out param.
There's one reason to use an out param which has not already been mentioned: the calling method is obliged to receive it. If your method produces a value which the caller should not discard, making it an out forces the caller to specifically accept it:
Method1(); // Return values can be discarded quite easily, even accidentally
int resultCode;
Method2(out resultCode); // Out params are a little harder to ignore
Of course the caller can still ignore the value in an out param, but you've called their attention to it.
This is a rare need; more often, you should use an exception for a genuine problem or return an object with state information for an "FYI", but there could be circumstances where this is important.
It's mainly a matter of preference.
I prefer returns and if you have multiple returns you can wrap them in a Result DTO
public class Result{
public Person Person {get;set;}
public int Sum {get;set;}
}
You should almost always use a return value. 'out' parameters create a bit of friction to a lot of APIs, compositionality, etc.
The most noteworthy exception that springs to mind is when you want to return multiple values (.Net Framework doesn't have tuples until 4.0), such as with the TryParse pattern.
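Once tuples are available (System.Tuple from .NET 4.0 on), two values can come back through the return value instead of out parameters. A minimal sketch, with a hypothetical DivideWithRemainder method chosen purely for illustration:

```csharp
using System;

static class Division
{
    // Hypothetical example: returns quotient (Item1) and remainder (Item2) together.
    public static Tuple<int, int> DivideWithRemainder(int dividend, int divisor)
    {
        return Tuple.Create(dividend / divisor, dividend % divisor);
    }
}
```

The caller reads `result.Item1` and `result.Item2`; the trade-off is that those names say less than well-named out parameters or a dedicated result type would.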
You can only have one return value whereas you can have multiple out parameters.
You only need to consider out parameters in those cases.
However, if you need to return more than one value from your method, you probably want to look at what you're returning from an OO perspective and consider whether you're better off returning an object or a struct that holds these values. Then you're back to a return value again.
I would prefer the following instead of either of those in this simple example.
public int Value
{
get;
private set;
}
But they are all very much the same. Usually, one would only use 'out' to pass multiple values back from the method. To send a value into the method and get it back out, one would choose 'ref'. A property is best if you are only returning a value, but if you want to pass a parameter and get a value back, you would likely choose your first option.
I think one of the few scenarios where it would be useful would be when working with unmanaged memory, and you want to make it obvious that the "returned" value should be disposed of manually, rather than expecting it to be disposed of on its own.
Additionally, return values are compatible with asynchronous design paradigms.
You cannot designate a function "async" if it uses ref or out parameters.
In summary, Return Values allow method chaining, cleaner syntax (by eliminating the necessity for the caller to declare additional variables), and allow for asynchronous designs without the need for substantial modification in the future.
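To make the async point concrete: the C# compiler rejects ref and out parameters on async methods, so the value has to travel through the returned task. A small sketch, with a hypothetical GetValueAsync and a delay standing in for real asynchronous work:

```csharp
using System.Threading.Tasks;

static class AsyncExample
{
    // "static async Task GetValueAsync(out int x)" would not compile;
    // returning Task<int> is the way to hand the value back.
    public static async Task<int> GetValueAsync()
    {
        await Task.Delay(10); // stand-in for real asynchronous work
        return 42;
    }
}
```

A caller in an async context simply writes `int value = await AsyncExample.GetValueAsync();`.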
As others have said: return value, not out param.
May I recommend to you the book "Framework Design Guidelines" (2nd ed)? Pages 184-185 cover the reasons for avoiding out params. The whole book will steer you in the right direction on all sorts of .NET coding issues.
Allied with Framework Design Guidelines is the use of the static analysis tool, FxCop. You'll find this on Microsoft's sites as a free download. Run this on your compiled code and see what it says. If it complains about hundreds and hundreds of things... don't panic! Look calmly and carefully at what it says about each and every case. Don't rush to fix things ASAP. Learn from what it is telling you. You will be put on the road to mastery.
Using the out keyword with a return type of bool, can sometimes reduce code bloat and increase readability. (Primarily when the extra info in the out param is often ignored.) For instance:
var result = DoThing();
if (result.Success)
{
result = DoOtherThing();
if (result.Success)
{
result = DoFinalThing();
if (result.Success)
{
success = true;
}
}
}
vs:
Result result; // "var" needs an initializer; Result stands for whatever type the Do* methods produce
if (DoThing(out result))
{
if (DoOtherThing(out result))
{
if (DoFinalThing(out result))
{
success = true;
}
}
}
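With bool-returning methods, the nesting can arguably be flattened further using the short-circuiting && operator. The sketch below uses hypothetical stand-in implementations of the three methods (here the out value is just a string message), since the originals are not shown:

```csharp
static class Steps
{
    // Hypothetical stand-ins: each reports success and writes a detail message.
    public static bool DoThing(out string result) { result = "thing done"; return true; }
    public static bool DoOtherThing(out string result) { result = "other thing done"; return true; }
    public static bool DoFinalThing(out string result) { result = "final thing done"; return true; }

    public static bool RunAll(out string result)
    {
        // && short-circuits: a later step runs only if every earlier one succeeded,
        // and result holds the output of the last step that actually ran.
        return DoThing(out result)
            && DoOtherThing(out result)
            && DoFinalThing(out result);
    }
}
```

This keeps the same control flow as the nested ifs, at the cost of being a little denser to read.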
There is no real difference. Out parameters are in C# to allow a method to return more than one value, that's all.
However, there are some slight differences, but none of them are really important:
Using out parameter will enforce you to use two lines like:
int n;
GetValue(out n);
while using return value will let you do it in one line:
int n = GetValue();
Another difference (correct only for value types and only if C# doesn't inline the function) is that using return value will necessarily make a copy of the value when the function return, while using OUT parameter will not necessarily do so.
Please avoid using out parameters.
Although, they can make sense in certain situations (for example when implementing the Try-Parse Pattern), they are very hard to grasp.
The chances of introducing bugs or side effects, whether by yourself (unless you are very experienced with the concept) or by other developers (who either use your API or may inherit your code), are very high.
According to Microsoft's quality rule CA1021:
Although return values are commonplace and heavily used, the correct application of out and ref parameters requires intermediate design and coding skills. Library architects who design for a general audience should not expect users to master working with out or ref parameters.
Therefore, if there is not a very good reason, please just don't use out or ref.
See also:
Is using "out" bad practice
https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1021
The two have different purposes and are not treated the same way by the compiler. If your method needs to return a value, then you must use return. out is used where your method needs to return multiple values.
If you use return, the data is first written to the method's stack and then to the calling method's stack, whereas with out it is written directly to the calling method's stack. I'm not sure whether there are any more differences.
out is more useful when you are trying to return an object that you declare in the method.
Example
public BookList Find(string key)
{
BookList book; //BookList is a model class
_books.TryGetValue(key, out book); //_books is a concurrent dictionary
//TryGetValue gets an item with matching key and returns it into book.
return book;
}
A return value is the normal value returned by your method.
An out parameter is different: out and ref are two C# keywords that allow variables to be passed by reference.
The big difference between ref and out is that a ref argument must be initialised before the call, while an out argument need not be.
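The initialisation difference between ref and out can be seen in a few lines. The method names below are hypothetical and exist only for this sketch:

```csharp
static class RefVsOut
{
    // ref: the caller must initialise the variable; the method may read and modify it.
    public static void AddOne(ref int value) { value = value + 1; }

    // out: the caller need not initialise it; the method must assign it before returning.
    public static void SetToTen(out int value) { value = 10; }

    public static int Demo()
    {
        int a = 5;        // must be assigned before being passed by ref
        AddOne(ref a);    // a is now 6
        int b;            // no initial value needed for out
        SetToTen(out b);  // b is now 10
        return a + b;     // 16
    }
}
```

Passing an unassigned variable to AddOne would be a compile-time error, whereas SetToTen would fail to compile if it returned without assigning its parameter.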
I suspect I'm not going to get a look-in on this question, but I am a very experienced programmer, and I hope some of the more open-minded readers will pay attention.
I believe that it suits object-oriented programming languages better for their value-returning procedures (VRPs) to be deterministic and pure.
'VRP' is the modern academic name for a function that is called as part of an expression, and has a return value that notionally replaces the call during evaluation of the expression. E.g. in a statement such as x = 1 + f(y) the function f is serving as a VRP.
'Deterministic' means that the result of the function depends only on the values of its parameters. If you call it again with the same parameter values, you are certain to get the same result.
'Pure' means no side-effects: calling the function does nothing except computing the result. This can be interpreted to mean no important side-effects, in practice, so if the VRP outputs a debugging message every time it is called, for example, that can probably be ignored.
Thus, if, in C#, your function is not deterministic and pure, I say you should make it a void function (in other words, not a VRP), and any value it needs to return should be returned in either an out or a ref parameter.
For example, if you have a function to delete some rows from a database table, and you want it to return the number of rows it deleted, you should declare it something like this:
public void DeleteBasketItems(BasketItemCategory category, out int count);
If you sometimes want to call this function but not get the count, you could always declare an overload.
You might want to know why this style suits object-oriented programming better. Broadly, it fits into a style of programming that could be (a little imprecisely) termed 'procedural programming', and it is a procedural programming style that fits object-oriented programming better.
Why? The classical model of objects is that they have properties (aka attributes), and you interrogate and manipulate the object (mainly) through reading and updating those properties. A procedural programming style tends to make it easier to do this, because you can execute arbitrary code in between operations that get and set properties.
The downside of procedural programming is that, because you can execute arbitrary code all over the place, you can get some very obtuse and bug-vulnerable interactions via global variables and side-effects.
So, quite simply, it is good practice to signal to someone reading your code that a function could have side-effects by making it non-value returning.
I'm toying with the idea of making primitive .NET value types more type-safe and more "self-documenting" by wrapping them in custom structs. However, I'm wondering if it's actually ever worth the effort in real-world software.
(That "effort" can be seen below: Having to apply the same code pattern again and again. We're declaring structs and so cannot use inheritance to remove code repetition; and since the overloaded operators must be declared static, they have to be defined for each type separately.)
Take this (admittedly trivial) example:
struct Area
{
public static implicit operator Area(double x) { return new Area(x); }
public static implicit operator double(Area area) { return area.x; }
private Area(double x) { this.x = x; }
private readonly double x;
}
struct Length
{
public static implicit operator Length(double x) { return new Length(x); }
public static implicit operator double(Length length) { return length.x; }
private Length(double x) { this.x = x; }
private readonly double x;
}
Both Area and Length are basically a double, but augment it with a specific meaning. If you defined a method such as…
Area CalculateAreaOfRectangleWith(Length width, Length height)
…it would not be possible to directly pass in an Area by accident. So far so good.
BUT: You can easily sidestep this apparently improved type safety simply by casting an Area to double, or by temporarily storing an Area in a double variable, and then passing that into the method where a Length is expected:
Area a = 10.0;
double aWithEvilPowers = a;
… = CalculateAreaOfRectangleWith( (double)a, aWithEvilPowers );
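One possible mitigation, offered only as a sketch and not as something the thread settles on, is to declare the conversions explicit instead of implicit. A double then no longer converts silently into a Length, so the sidestep has to be spelled out with a visible cast at the call site:

```csharp
struct Length
{
    private readonly double x;
    private Length(double x) { this.x = x; }
    // explicit: callers must write (Length)someDouble
    public static explicit operator Length(double x) { return new Length(x); }
    public static explicit operator double(Length length) { return length.x; }
}

struct Area
{
    private readonly double x;
    private Area(double x) { this.x = x; }
    public static explicit operator Area(double x) { return new Area(x); }
    public static explicit operator double(Area area) { return area.x; }
}

static class Geometry
{
    public static Area CalculateAreaOfRectangleWith(Length width, Length height)
    {
        return (Area)((double)width * (double)height);
    }
}
```

With these definitions, passing `(double)a` where a Length is expected no longer compiles; one would have to write `(Length)(double)a`, which at least looks suspicious in a code review. It does not make misuse impossible.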
Question: Does anyone here have experience with extensive use of such custom struct types in real-world / production software? If so:
Has the wrapping of primitive value types in custom structs ever directly resulted in fewer bugs, or in more maintainable code, or given any other major advantage(s)?
Or are the benefits of custom structs too small for them to be used in practice?
P.S.: About 5 years have passed since I asked this question. I'm posting some of my experiences that I've made since then as a separate answer.
I did this in a project a couple of years ago, with some mixed results. In my case, it helped a lot to keep track of different kinds of IDs, and to have compile-time errors when the wrong type of IDs were being used. And I can recall a number of occasions where it prevented actual bugs-in-the-making. So, that was the plus side. On the negative side, it was not very intuitive for other developers -- this kind of approach is not very common, and I think other developers got confused with all these new types springing up. Also, I seem to recall we had some problems with serialization, but I can't remember the details (sorry).
So if you are going to go this route, I would recommend a couple of things:
1) Make sure you talk with the other folks on your team first, explain what you're trying to accomplish and see if you can get "buy-in" from everyone. If people don't understand the value, you're going to be constantly fighting against the mentality of "what's the point of all this extra code"?
2) Consider generating your boilerplate code using a tool like T4. It will make the maintenance of the code much easier. In my case, we had about a dozen of these types, and going the code-generation route made changes much easier and much less error prone.
That's my experience. Good luck!
John
This is a logical next step to Hungarian Notation, see an excellent article from Joel here http://www.joelonsoftware.com/articles/Wrong.html
We did something similar in one project/API where, especially, other developers needed to use some of our interfaces, and it led to significantly fewer "false alarm support cases", because it made really obvious what was allowed/needed. I would suspect this means measurably fewer bugs, though we never gathered statistics because the resulting apps were not ours.
In the approx. 5 years since I asked this question, I have often toyed with defining struct value types, but rarely done it. Here are some reasons why:
It's not so much that their benefit is too small, but that the cost of defining a struct value type is too high. There's lots of boilerplate code involved:
Override Equals and implement IEquatable<T>, and override GetHashCode too;
Implement operators == and !=;
Possibly implement IComparable<T> and operators <, <=, > and >= if a type supports ordering;
Override ToString and implement conversion methods and type-cast operators from/to other related types.
This is simply a lot of repetitive work that is often not necessary when using the underlying primitive types directly.
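To give a feel for that boilerplate, here is a minimal sketch of what the equality-related items in the list above amount to for a single wrapper. ZipCode is a hypothetical type; ordering and conversion operators are left out, so a real type would be longer still:

```csharp
using System;

struct ZipCode : IEquatable<ZipCode>
{
    private readonly string value;
    public ZipCode(string value) { this.value = value; }

    public bool Equals(ZipCode other)
    {
        return string.Equals(value, other.value, StringComparison.Ordinal);
    }
    public override bool Equals(object obj) { return obj is ZipCode && Equals((ZipCode)obj); }
    // value is null for default(ZipCode), because a struct's default constructor cannot be prohibited
    public override int GetHashCode() { return value == null ? 0 : value.GetHashCode(); }
    public static bool operator ==(ZipCode left, ZipCode right) { return left.Equals(right); }
    public static bool operator !=(ZipCode left, ZipCode right) { return !left.Equals(right); }
    public override string ToString() { return value ?? string.Empty; }
}
```

Every additional wrapper type repeats essentially the same members, which is where code generation starts to look attractive.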
Often, there is no reasonable default value (e.g. for value types such as zip codes, dates, times, etc.). Since one can't prohibit the default constructor, defining such types as struct is a bad idea and defining them as class means some runtime overhead (more work for the GC, more memory dereferencing, etc.)
I haven't actually found that wrapping a primitive type in a semantic struct really offers significant benefits with regard to type safety; IntelliSense / code completion and good choices of variable / parameter names can achieve much of the same benefits as specialized value types. On the other hand, as another answer suggests, custom value types can be unintuitive for some developers, so there's an additional cognitive overhead in using them.
There have been some instances where I ended up wrapping a primitive type in a struct, normally when there are certain operations defined on them (e.g. a DecimalDegrees and Radians type with methods to convert between these).
Defining equality and comparison methods, on the other hand, does not necessarily mean that I'd define a custom value type. I might instead use primitive .NET types and provide well-named implementations of IEqualityComparer<T> and IComparer<T> instead.
I don't often use structs.
One thing you may consider is to make your type require a dimension. Consider this:
public enum length_dim { meters, inches }
public enum area_dim { square_meters, square_inches }
public class Area {
public Area(double a,area_dim dim) { this.area=a; this.dim=dim; }
public Area(Area a) { this.area = a.area; this.dim = a.dim; }
public Area(Length l1, Length l2)
{
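Debug.Assert(l1.Dim == l2.Dim); // requires: using System.Diagnostics;
this.area = l1.Distance * l2.Distance; // product of the two lengths
switch(l1.Dim) {
case length_dim.meters: this.dim = area_dim.square_meters; break;
case length_dim.inches: this.dim = area_dim.square_inches; break;
}
}
private double area;
public double Value { get { return this.area; } } // a member cannot share its enclosing type's name, so not "Area"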
private area_dim dim;
public area_dim Dim { get { return this.dim; } }
}
public class Length {
public Length(double dist,length_dim dim)
{ this.distance = dist; this.dim = dim; }
private length_dim dim;
public length_dim Dim { get { return this.dim; } }
private double distance;
public double Distance { get { return this.distance; } }
}
Notice that nothing can be created from a double alone. The dimension must be specified. My objects are immutable. Requiring a dimension and verifying it would have prevented the loss of the Mars Climate Orbiter.
I can't say, from my experience, if this is a good idea or not. It certainly has its pros and cons. A pro is that you get an extra dose of type safety. Why accept a double for an angle when you can accept a type that has true angle semantics (degrees to/from radians, constrained to degree values of 0-360, etc.)? A con is that it's not common, so it could be confusing to some developers.
See this link for a real example of a real commercial product that has several types like you describe:
http://geoframework.codeplex.com/
Types include Angle, Latitude, Longitude, Speed, Distance.
I've been writing C# for seven years now, and I keep wondering, why do enums have to be of an integral type? Wouldn't it be nice to do something like:
enum ErrorMessage
{
NotFound: "Could not find",
BadRequest: "Malformed request"
}
Is this a language design choice, or are there fundamental incompatibilities on a compiler, CLR, or IL level?
Do other languages have enums with string or complex (i.e. object) types? What languages?
(I'm aware of workarounds; my question is, why are they needed?)
EDIT: "workarounds" = attributes or static classes with consts :)
The purpose of an Enum is to give more meaningful values to integers. You're looking for something else besides an Enum. Enums are compatible with older windows APIs and COM stuff, and a long history on other platforms besides.
Maybe you'd be satisfied with public const members of a struct or a class.
Or maybe you're trying to restrict some specialized types values to only certain string values? But how it's stored and how it's displayed can be two different things - why use more space than necessary to store a value?
And if you want to have something like that readable in some persisted format, just make a utility or Extension method to spit it out.
This response is a little messy because there are just so many reasons. Comparing two strings for validity is much more expensive than comparing two integers. Comparing literal strings to known enums for static type-checking would be kinda unreasonable. Localization would be ... weird. Compatibility with those older APIs would be broken. Enums as flags would be meaningless/broken.
It's an Enum. That's what Enums do! They're integral!
Perhaps use the description attribute from System.ComponentModel and write a helper function to retrieve the associated string from an enum value? (I've seen this in a codebase I work with and seemed like a perfectly reasonable alternative)
enum ErrorMessage
{
[Description("Could not find")]
NotFound,
[Description("Malformed request")]
BadRequest
}
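The helper function mentioned above could be sketched roughly as follows. GetDescription is a hypothetical name; it reads the attribute via reflection and falls back to the member name when no description is present:

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

enum ErrorMessage
{
    [Description("Could not find")]
    NotFound,
    [Description("Malformed request")]
    BadRequest
}

static class EnumDescriptions
{
    // Hypothetical helper: returns the [Description] text, or the member name if none exists.
    public static string GetDescription(Enum value)
    {
        string name = value.ToString();
        FieldInfo field = value.GetType().GetField(name);
        if (field == null) return name; // e.g. a numeric value with no named member
        var attribute = (DescriptionAttribute)Attribute.GetCustomAttribute(
            field, typeof(DescriptionAttribute));
        return attribute != null ? attribute.Description : name;
    }
}
```

Reflection is not free, so code that calls this on a hot path would probably want to cache the results.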
What are the advantages, because I can only see drawbacks:
ToString will return a different string to the name of the enumeration. That is, ErrorMessage.NotFound.ToString() will be "Could not find" instead of "NotFound".
Conversely, with Enum.Parse, what would it do? Would it still accept the string name of the enumeration as it does for integer enumerations, or does it work with the string value?
You would not be able to implement [Flags] because what would ErrorMessage.NotFound | ErrorMessage.BadRequest equal in your example (I know that it doesn't really make sense in this particular case, and I suppose you could just say that [Flags] is not allowed on string-based enumerations but that still seems like a drawback to me)
While the comparison errMsg == ErrorMessage.NotFound could be implemented as a simple reference comparison, errMsg == "Could not find" would need to be implemented as a string comparison.
I can't think of any benefits, especially since it's so easy to build up your own dictionary mapping enumeration values to "custom" strings.
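The dictionary approach might look like this minimal sketch (ErrorMessages and ToText are hypothetical names):

```csharp
using System.Collections.Generic;

enum ErrorMessage { NotFound, BadRequest }

static class ErrorMessages
{
    // Maps each enumeration value to its custom display string.
    private static readonly Dictionary<ErrorMessage, string> Texts =
        new Dictionary<ErrorMessage, string>
        {
            { ErrorMessage.NotFound, "Could not find" },
            { ErrorMessage.BadRequest, "Malformed request" }
        };

    public static string ToText(ErrorMessage message)
    {
        string text;
        // Fall back to the member name if no custom string was registered.
        return Texts.TryGetValue(message, out text) ? text : message.ToString();
    }
}
```

The enum keeps its cheap integer comparisons and its usual ToString and Enum.Parse behavior, while the display strings live in one place.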
The real answer why: There's never been a compelling reason to make enums any more complicated than they are. If you need a simple closed list of values - they're it.
In .Net, enums were given the added benefit of internal representation <-> the string used to define them. This one little change adds some versioning downsides, but improves upon enums in C++.
The enum keyword is used to declare an enumeration, a distinct type that consists of a set of named constants called the enumerator list.
Ref: msdn
Your question is with the chosen storage mechanism, an integer. This is just an implementation detail. We only get to peek beneath the covers of this simple type in order to maintain binary compatibility. Enums would otherwise have very limited usefulness.
Q: So why do enums use integer storage? As others have pointed out:
Integers are quick and easy to compare.
Integers are quick and easy to combine (bitwise for [Flags] style enums)
With integers, it's trivially easy to implement enums.
* None of these are specific to .NET, and it appears the CLR designers didn't feel compelled to change anything or add any gold plating to them.
That's not to say your syntax is unappealing. But is the effort to implement this feature in the CLR, and all the compilers, justified? For all the work that goes into this, has it really bought you anything you couldn't already achieve (with classes)? My gut feeling is no, there's no real benefit. (There's a post by Eric Lippert I wanted to link to, but I couldn't find it.)
You can write 10 lines of code to implement in user-space what you're trying to achieve without all the headache of changing a compiler. Your user-space code is easily maintained over time - although perhaps not quite as pretty as if it's built-in, but at the end of the day it's the same thing. You can even get fancy with a T4 code generation template if you need to maintain many of your custom enum-esque values in your project.
So, enums are as complicated as they need to be.
Not really answering your question but presenting alternatives to string enums.
public struct ErrorMessage
{
public const string NotFound="Could not find";
public const string BadRequest="Malformed request";
}
Perhaps because then this wouldn't make sense:
enum ErrorMessage: string
{
NotFound,
BadRequest
}
It's a language decision - eg., Java's enum doesn't directly correspond to an int, but is instead an actual class. There's a lot of nice tricks that an int enum gives you - you can bitwise them for flags, iterate them (by adding or subtracting 1), etc. But, there's some downsides to it as well - the lack of additional metadata, casting any int to an invalid value, etc.
I think the decision was probably made, as with most design decisions, because int enums are "good enough". If you need something more complex, a class is cheap and easy enough to build.
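The "casting any int to an invalid value" downside is easy to demonstrate, and Enum.IsDefined is one way to detect it at run time. Status and EnumChecks below are hypothetical names used only for this sketch:

```csharp
using System;

enum Status { Active, Suspended, Closed }

static class EnumChecks
{
    // The cast (Status)42 compiles and runs without error,
    // so validity has to be checked explicitly when it matters.
    public static bool IsValid(Status status)
    {
        return Enum.IsDefined(typeof(Status), status);
    }
}
```

Here `EnumChecks.IsValid((Status)42)` is false even though the cast itself succeeds, which is why public methods taking enum parameters often validate them.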
Static readonly members give you the effect of complex enums, but don't incur the overhead unless you need it.
sealed class ErrorMessage {
public string Description { get; private set; }
public int Ordinal { get; private set; }
private ErrorMessage() { } // private constructor: only the static members below can create instances
public static readonly ErrorMessage NotFound = new ErrorMessage() {
Ordinal = 0, Description = "Could not find"
};
public static readonly ErrorMessage BadRequest = new ErrorMessage() {
Ordinal = 1, Description = "Malformed Request"
};
}
Strictly speaking, the intrinsic representation of an enum shouldn't matter, because by definition, they are enumerated types. What this means is that
public enum PrimaryColor { Red, Blue, Yellow }
represents a set of values.
Firstly, some sets are smaller, whereas other sets are larger. Therefore, the .NET CLR allows one to base an enum on an integral type, so that the domain size for enumerated values can be increased or decreased, i.e., if an enum was based on a byte, then that enum cannot contain more than 256 distinct values, whereas one based on a long can contain 2^64 distinct values. This is enabled by the fact that a long is 8 times larger than a byte.
Secondly, an added benefit of restricting the base type of enums to integral values is that one can perform bitwise operations on enum values, as well as create bitmaps of them to represent more than one value.
Finally, integral types are the most efficient data types available inside a computer, so there is a performance advantage when it comes to comparing different enum values.
For the most part, I would say representing enums by integral types seems to be a CLR and/or CLS design choice, though one that is probably not very difficult to arrive at.
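As a small illustration of choosing the underlying type, the sketch below declares the same three members over byte and over long and reports the resulting sizes. SmallColor, WideColor and EnumSizes are hypothetical names introduced for this example.

```csharp
using System;

public enum SmallColor : byte { Red, Blue, Yellow }   // at most 256 distinct values
public enum WideColor : long { Red, Blue, Yellow }    // up to 2^64 distinct values

public static class EnumSizes
{
    // sizeof on an enum type yields the size of its underlying type.
    public static int SmallSize() { return sizeof(SmallColor); }
    public static int WideSize() { return sizeof(WideColor); }

    // The underlying type can also be inspected at run time.
    public static Type SmallUnderlying()
    {
        return Enum.GetUnderlyingType(typeof(SmallColor));
    }
}
```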
The main advantage of integral enums is that they don't take up much space in memory. An instance of a default System.Int32-backed enum takes up just 4 bytes of memory and can be compared quickly to other instances of that enum.
In contrast, string-backed enums would be reference types that require each instance to be allocated on the heap and comparisons to involve checking each character in a string. You could probably minimize some of the issues with some creativity in the runtime and with compilers, but you'd still run into similar problems when trying to store the enum efficiently in a database or other external store.
While it also counts as an "alternative", you can still do better than just a bunch of consts:
struct ErrorMessage
{
public static readonly ErrorMessage NotFound =
new ErrorMessage("Could not find");
public static readonly ErrorMessage BadRequest =
new ErrorMessage("Bad request");
private string s;
private ErrorMessage(string s)
{
this.s = s;
}
public static explicit operator ErrorMessage(string s)
{
return new ErrorMessage(s);
}
public static explicit operator string(ErrorMessage em)
{
return em.s;
}
}
The only catch here is that, as any value type, this one has a default value, which will have s==null. But this isn't really different from Java enums, which themselves can be null (being reference types).
In general, Java-like advanced enums cross the line between actual enums, and syntactic sugar for a sealed class hierarchy. Whether such sugar is a good idea or not is arguable.
There are a number of questions already on the definition of "ref" and "out" parameters, but they seem like bad design. Are there any cases where you think ref is the right solution?
It seems like you could always do something else that is cleaner. Can someone give me an example of where this would be the "best" solution for a problem?
In my opinion, ref largely compensated for the difficulty of declaring new utility types and the difficulty of "tacking information on" to existing information, which are things that C# has taken huge steps toward addressing since its genesis through LINQ, generics, and anonymous types.
So no, I don't think there are a lot of clear use cases for it anymore. I think it's largely a relic of how the language was originally designed.
I do think that it still makes sense (like mentioned above) in the case where you need to return some kind of error code from a function as well as a return value, but nothing else (so a bigger type isn't really justified.) If I were doing this all over the place in a project, I would probably define some generic wrapper type for thing-plus-error-code, but in any given instance ref and out are OK.
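The two styles mentioned here can be put side by side. This is only a sketch: ParseResult<T>, Parsing, TryParsePort and ParsePort are hypothetical names, and the port range check is just a stand-in for whatever validation applies.

```csharp
using System;

// Hypothetical "thing-plus-error-code" wrapper.
public struct ParseResult<T>
{
    public readonly bool Success;
    public readonly T Value;

    public ParseResult(bool success, T value)
    {
        Success = success;
        Value = value;
    }
}

public static class Parsing
{
    // Style 1: success flag as the return value, the value itself via out.
    public static bool TryParsePort(string text, out int port)
    {
        return int.TryParse(text, out port) && port >= 0 && port <= 65535;
    }

    // Style 2: the same information carried in a single returned value.
    public static ParseResult<int> ParsePort(string text)
    {
        int port;
        bool ok = TryParsePort(text, out port);
        return new ParseResult<int>(ok, ok ? port : 0);
    }
}
```

For a one-off call the out version is shorter; the wrapper starts to pay off when results need to be stored or passed along.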
Well, ref is generally used for specialized cases, but I wouldn't call it redundant or a legacy feature of C#. You'll see it (and out) used a lot in XNA, for example. In XNA, a Matrix is a struct and a rather massive one at that (I believe 64 bytes), and it's generally best to pass it to functions using ref so that only a 4- or 8-byte reference is copied instead of all 64 bytes. A specialist C# feature? Certainly. Of not much use any more or indicative of bad design? I don't agree.
One area is in the use of small utility functions, like:
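A rough sketch of the difference, using a hypothetical Matrix4 struct of sixteen floats as a stand-in (this is not the XNA Matrix type, and MatrixOps is likewise made up for the example):

```csharp
// Hypothetical 64-byte struct: 16 floats of 4 bytes each.
public struct Matrix4
{
    public float M11, M12, M13, M14;
    public float M21, M22, M23, M24;
    public float M31, M32, M33, M34;
    public float M41, M42, M43, M44;
}

public static class MatrixOps
{
    // By value: the whole struct is copied in, and the result is copied out again.
    public static Matrix4 ScaledCopy(Matrix4 m, float factor)
    {
        m.M11 *= factor; m.M22 *= factor; m.M33 *= factor;
        return m;
    }

    // By reference: only an address is passed, and the caller's struct is modified in place.
    public static void ScaleInPlace(ref Matrix4 m, float factor)
    {
        m.M11 *= factor; m.M22 *= factor; m.M33 *= factor;
    }
}
```

Whether the saved copies matter in practice depends on how hot the call site is, so this is worth measuring rather than assuming.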
void Swap<T>(ref T a, ref T b) { T tmp = a; a = b; b = tmp; }
I don't see any 'cleaner' alternatives here. Granted, this isn't exactly Architecture level.
P/Invoke is the only place I can really think of where you must use ref or out. In other cases they can be convenient, but like you said, there is generally another, cleaner way.
What if you wanted to return multiple objects that, for some unknown reason, are not tied together into a single object?
void GetXYZ( ref object x, ref object y, ref object z);
EDIT: divo suggested that using OUT parameters would be more appropriate for this. I have to admit, he's got a point. I'll leave this answer here for the record as an inadequate solution. OUT trumps REF in this case.
I think the best uses are those that you usually see; you need to have both a value and a "success indicator" that is not an exception from a function.
One design pattern where ref is useful is a bidirectional visitor.
Suppose you had a Storage class that can be used to load or save values of various primitive types. It is either in Load mode or Save mode. It has a group of overloaded methods called Transfer, and here's an example for dealing with int values.
public void Transfer(ref int value)
{
if (Loading)
value = ReadInt();
else
WriteInt(value);
}
There would be similar methods for other primitive types - bool, string, etc.
Then on a class that needs to be "transferable", you would write a method like this:
public void TransferViaStorage(Storage s)
{
s.Transfer(ref _firstName);
s.Transfer(ref _lastName);
s.Transfer(ref _salary);
}
This same single method can either load the fields from the Storage, or save the fields to the Storage, depending what mode the Storage object is in.
Really you're just listing all the fields that need to be transferred, so it closely approaches declarative programming instead of imperative. This means that you don't need to write two functions (one for reading, one for writing) and given that the design I'm using here is order-dependent then it's very handy to know for sure that the fields will always be read/written in identical order.
The general point is that when a parameter is marked as ref, you don't know whether the method is going to read it or write to it, and this allows you to design visitor classes that work in one of two directions, intended to be called in a symmetrical way (i.e. with the visited method not needing to know which direction-mode the visitor class is operating in).
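For completeness, here is one way the whole pattern could be wired up end to end. It is only a sketch under assumptions: the in-memory queue, the SwitchToLoading method and the Employee class are hypothetical details added to make the example self-contained; a real Storage would read and write a stream or file.

```csharp
using System.Collections.Generic;

public class Storage
{
    // Hypothetical backing store: values are queued in the order they are saved.
    private readonly Queue<object> _data = new Queue<object>();

    public bool Loading { get; private set; }

    public void SwitchToLoading() { Loading = true; }

    public void Transfer(ref int value)
    {
        if (Loading)
            value = (int)_data.Dequeue();
        else
            _data.Enqueue(value);
    }

    public void Transfer(ref string value)
    {
        if (Loading)
            value = (string)_data.Dequeue();
        else
            _data.Enqueue(value);
    }
}

public class Employee
{
    private string _firstName;
    private int _salary;

    public Employee(string firstName, int salary)
    {
        _firstName = firstName;
        _salary = salary;
    }

    public string FirstName { get { return _firstName; } }
    public int Salary { get { return _salary; } }

    // One method serves both directions; the field order is fixed in one place.
    public void TransferViaStorage(Storage s)
    {
        s.Transfer(ref _firstName);
        s.Transfer(ref _salary);
    }
}
```

Saving one Employee, calling SwitchToLoading, and then running TransferViaStorage on a blank Employee restores the same field values, because both passes walk the fields in the same order.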
Comparison: Attributes + Reflection
Why do this instead of attributing the fields and using reflection to automatically implement the equivalent of TransferViaStorage? Because sometimes reflection is slow enough to be a bottleneck (but always profile to be sure of this - it's hardly ever true, and attributes are much closer to the ideal of declarative programming).
The real use for this is when you create a struct. Structs in C# are value types and therefore always are copied completely when passed by value. If you need to pass it by reference, for example for performance reasons or because the function needs to make changes to the variable, you would use the ref keyword.
I could see if someone has a struct with 100 values (obviously a problem already), you'd likely want to pass it by reference to prevent 100 values copying. That and returning that large struct and writing over the old value would likely have performance issues.
The obvious reason for using the "ref" keyword is when you want to pass a variable by reference, for example passing a value type like System.Int32 to a method and altering its actual value. A more specific use might be when you want to swap two variables.
public void Swap(ref int a, ref int b)
{
...
}
The main reason for using the "out" keyword is to return multiple values from a method. Personally I prefer to wrap the values in a specialized struct or class, since using out parameters produces rather ugly code. Parameters passed with "out" are, just like "ref", passed by reference.
public void DoMagic(out int a, out int b, out int c, out int d)
{
...
}
There is one clear case when you must use the 'ref' (or 'out') keyword: the method you want to call is supposed to do the 'new', and the caller needs to see the newly created object. In the snippets below, Thing stands for any class with a name field.
{Thing a = null; Funct(a);} {void Funct(Thing o) {o = new Thing(); o.name = "dummy";}}
will NOT do a thing with 'a', nor will it complain about it at either compile or run time. 'a' simply stays null, because only the method's local copy of the reference is reassigned.
{Thing a = null; Funct(ref a);} {void Funct(ref Thing o) {o = new Thing(); o.name = "dummy";}}
will result in 'a' being a new object with the name "dummy". (If 'a' is declared but never assigned, the compiler rejects both calls; 'out' is the modifier that accepts an unassigned variable.) But if the 'new' was already done, then ref is not needed (but works if supplied):
{Thing a = new Thing(); Funct(a);} {void Funct(Thing o) {o.name = "dummy";}}
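The same three situations as a compilable sketch. Thing, RefDemo and the method names are hypothetical, chosen only to keep the cases apart.

```csharp
public class Thing
{
    public string Name;
}

public static class RefDemo
{
    // Reassigns only the local copy of the reference; the caller's variable is unchanged.
    public static void CreateByValue(Thing o)
    {
        o = new Thing();
        o.Name = "dummy";
    }

    // Reassigns the caller's variable itself.
    public static void CreateByRef(ref Thing o)
    {
        o = new Thing();
        o.Name = "dummy";
    }

    // No reassignment, so ref is unnecessary: both sides refer to the same object.
    public static void Rename(Thing o)
    {
        o.Name = "dummy";
    }
}
```

Starting from Thing a = null, CreateByValue(a) leaves a as null, while CreateByRef(ref a) leaves a pointing at a new Thing named "dummy".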
Unlike Java, why does C# not have a Number supertype for floats, integers, etc.? Is there any reasoning behind avoiding Number in C#?
Because value types can't be inherited.
I don't know if true, but one explanation I heard was weight - in particular for the small frameworks (Compact Framework, Silverlight, Micro Framework); I'm not convinced by this...
Far more convincing is that by itself knowing it is a number doesn't provide much; for example, integer division works very differently to floating point, and operators aren't always as simple as you'd like (think DateTime + TimeSpan => DateTime, DateTime - DateTime => TimeSpan).
If it helps, MiscUtil offers generic operator support, allowing things like:
T x = ..., y = ...; // any T that has suitable operators
T sum = Operator.Add(x,y);
All very cleanly and efficiently. Note, however, that there is no compile-time validation (since there are no suitable generic constraints). But it works.
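To give a feel for how such generic operators can work, here is a sketch of one possible approach using expression trees. It is not MiscUtil's actual code; GenericAdd<T> is a hypothetical name, and only addition is shown.

```csharp
using System;
using System.Linq.Expressions;

public static class GenericAdd<T>
{
    // Built once per T and cached in a static field, so later calls are plain delegate invocations.
    public static readonly Func<T, T, T> Add = Build();

    private static Func<T, T, T> Build()
    {
        ParameterExpression x = Expression.Parameter(typeof(T), "x");
        ParameterExpression y = Expression.Parameter(typeof(T), "y");
        // Expression.Add resolves the built-in or user-defined + operator at run time
        // and throws if T has none, which is the missing compile-time validation.
        return Expression.Lambda<Func<T, T, T>>(Expression.Add(x, y), x, y).Compile();
    }
}
```

With this in place, GenericAdd<int>.Add(2, 3) and GenericAdd<decimal>.Add(1.1m, 2.2m) both go through the same generic entry point.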
It might be for performance reasons: numbers, being struct types, are stack-allocated and fast but do not allow inheritance hierarchies. Using them in an OO way would require a very large amount of boxing/unboxing, and would additionally cause large performance reductions due to much higher memory consumption and vtable lookups to resolve the polymorphism.
Java has introduced object-oriented wrappers and ends up with different and incompatible implementations for one and the same thing, which is even stranger.
A better possibility for providing fast abstractions for numbers would be the introduction of typeclasses/concepts like in Haskell or C++ where you could write:
sum :: (Num t) => [t] -> t
read as Sum takes a list of elements of type t - where t is a number type - and returns such a number. This mechanism could be optimized away at compile-time without any performance overhead. But neither .NET nor Java have such techniques.
But very useful, if they had included it, would have been for all integral numeric types (int, short, long, uint, etc.) to have been defined to implement an empty interface named IIntegral, and all numeric types (Integral plus decimal, float, etc.), to have been defined to implement an empty interface named INumeric.
This would have allowed generics to have specified constraints based on these interfaces to restrict allowable types to the integral types, or to numeric types, which is currently a much more difficult problem.
ValueType is pretty close. There aren't many Value types that can't be represented as a single number:
static void Main(string[] args)
{
int myInteger = 42;
decimal myDecimal = 3.141592653589793238M;
long myLong = 900000000000;
byte myByte = 128;
float myFloat = 2.71828F;
TestFunction(myInteger);
TestFunction(myDecimal);
TestFunction(myLong);
TestFunction(myByte);
TestFunction(myFloat);
}
static void TestFunction(System.ValueType number)
{
Console.WriteLine(number.ToString());
}
Output:
42
3.141592653589793238
900000000000
128
2.71828
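One caveat with this approach, shown as a small sketch: a ValueType parameter also accepts value types that are not numbers, so it is looser than a real Number supertype would be. ValueTypeDemo and Describe are hypothetical names for this illustration.

```csharp
using System;

public static class ValueTypeDemo
{
    // Accepts any struct or enum, numeric or not, and reports its run-time type name.
    public static string Describe(ValueType value)
    {
        return value.GetType().Name;
    }
}
```

Describe(42) returns "Int32", but Describe(true) and Describe(DateTime.Now) compile just as happily and return "Boolean" and "DateTime".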
You could create an extension to Object that returns a bool for you, like so (untested, may provide false positives, etc.) (catches decimals, floats, ints, using US style delimiter for decimal; modify regex to fit hex, etc etc.)
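using System.Text.RegularExpressions;
public static class ObjectExtensions
{
// Optional integer part, optional decimal point, at least one trailing digit.
static readonly Regex r = new Regex(@"^\d*\.?\d+$", RegexOptions.Compiled);
public static bool IsNumber(this object obj)
{
// Strings are rejected even if their text looks numeric.
return !(obj is string) && r.IsMatch(obj.ToString());
}
}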
Since ToString is part of Object, and everything is ultimately a child of object...
Granted, this won't provide type safety or generics or anything like that, but it will still let you do similar things with a little more work. You didn't specify what you wanted the base class for, whether you want accept any numeric type as an argument someplace, or you wanted generics. This may get you partway to wherever you were going though.
Usage:
public class thingThatHasNumericValue
{
private object arbNumber;
public object SomeArbitraryNumber
{
get { return arbNumber; }
set
{
if (!value.IsNumber())
{
throw new InvalidOperationException("Must be a number");
}
arbNumber = value;
}
}
}