Related
As discovered in C 3.5, the following would not be possible due to type erasure: -
int foo<T>(T bar)
{
return bar.Length; // will not compile unless I do something like where T : string
}
foo("baz");
I believe the reason this doesn't work is in C# and java, is due to a concept called type erasure, see http://en.wikipedia.org/wiki/Type_erasure.
Having read about the dynamic keyword, I wrote the following: -
int foo<T>(T bar)
{
dynamic test = bar;
return test.Length;
}
foo("baz"); // will compile and return 3
So, as far as I understand, dynamic will bypass compile time checking but if the type has been erased, surely it would still be unable to resolve the symbol unless it goes deeper and uses some kind of reflection?
Is using the dynamic keyword in this way bad practice and does this make generics a little more powerful?
dynamics and generics are 2 completely different notions. If you want compile-time safety and speed use strong typing (generics or just standard OOP techniques such as inheritance or composition). If you do not know the type at compile time you could use dynamics but they will be slower because they are using runtime invocation and less safe because if the type doesn't implement the method you are attempting to invoke you will get a runtime error.
The 2 notions are not interchangeable and depending on your specific requirements you could use one or the other.
Of course having the following generic constraint is completely useless because string is a sealed type and cannot be used as a generic constraint:
int foo<T>(T bar) where T : string
{
return bar.Length;
}
you'd rather have this:
int foo(string bar)
{
return bar.Length;
}
I believe the reason this doesn't work is in C# and java, is due to a concept called type erasure, see http://en.wikipedia.org/wiki/Type_erasure.
No, this isn't because of type erasure. Anyway there is no type erasure in C# (unlike Java): a distinct type is constructed by the runtime for each different set of type arguments, there is no loss of information.
The reason why it doesn't work is that the compiler knows nothing about T, so it can only assume that T inherits from object, so only the members of object are available. You can, however, provide more information to the compiler by adding a constraint on T. For instance, if you have an interface IBar with a Length property, you can add a constraint like this:
int foo<T>(T bar) where T : IBar
{
return bar.Length;
}
But if you want to be able to pass either an array or a string, it won't work, because the Length property isn't declared in any interface implemented by both String and Array...
No, C# does not have type erasure - only Java has.
But if you specify only T, without any constraint, you can not use obj.Lenght because T can virtually be anything.
foo(new Bar());
The above would resolve to an Bar-Class and thus the Lenght Property might not be avaiable.
You can only use Methods on T when you ensure that T this methods also really has. (This is done with the where Constraints.)
With the dynamics, you loose compile time checking and I suggest that you do not use them for hacking around generics.
In this case you would not benefit from dynamics in any way. You just delay the error, as an exception is thrown in case the dynamic object does not contain a Length property. In case of accessing the Length property in a generic method I can't see any reason for not constraining it to types who definately have this property.
"Dynamics are a powerful new tool that make interop with dynamic languages as well as COM easier, and can be used to replace much turgid reflective code. They can be used to tell the compiler to execute operations on an object, the checking of which is deferred to runtime.
The great danger lies in the use of dynamic objects in inappropriate contexts, such as in statically typed systems, or worse, in place of an interface/base class in a properly typed system."
Qouted From Article
Thought I'd weigh-in on this one, because no one clarified how generics work "under the hood". That notion of T being an object is mentioned above, and is quite clear. What is not talked about, is that when we compile C# or VB or any other supported language, - at the Intermediate Language (IL) level (what we compile to) which is more akin to an assembly language or equivalent of Java Byte codes, - at this level, there is no generics! So the new question is how do you support generics in IL? For each type that accesses the generic, a non-generic version of the code is generated which substitutes the generic(s) such as the ubiquitous T to the actual type it was called with. So if you only have one type of generic, such as List<>, then that's what the IL will contain. But if you use many implementation of a generic, then many specific implementations are created, and calls to the original code substituted with the calls to the specific non-generic version. To be clear, a MyList used as: new MyList(), will be substituted in IL with something like MyList_string().
That's my (limited) understanding of what's going on. The point being, the benefit of this approach is that the heavy lifting is done at compile-time, and at runtime there's no degradation to performance - which is again, why generic are probably so loved used anywhere, and everywhere by .NET developers.
On the down-side? If a method or type is used many times, then the output assembly (EXE or DLL) will get larger and larger, dependent of the number of different implementation of the same code. Given the average size of DLLs output - I doubt you'll ever consider generics to be a problem.
I'm trying to formalise the usage of the "out" keyword in c# for a project I'm on, particularly with respect to any public methods. I can't seem to find any best practices out there and would like to know what is good or bad.
Sometimes I'm seeing some methods signatures that look like this:
public decimal CalcSomething(Date start, Date end, out int someOtherNumber){}
At this point, it's just a feeling, this doesn't sit well with me. For some reason, I'd prefer to see:
public Result CalcSomething(Date start, Date end){}
where the result is a type that contains a decimal and the someOtherNumber. I think this makes it easier to read. It allows Result to be extended or have properties added without breaking code. It also means that the caller of this method doesn't have to declare a locally scoped "someOtherNumber" before calling. From usage expectations, not all callers are going to be interested in "someOtherNumber".
As a contrast, the only instances that I can think of right now within the .Net framework where "out" parameters make sense are in methods like TryParse(). These actually make the caller write simpler code, whereby the caller is primarily going to be interested in the out parameter.
int i;
if(int.TryParse("1", i)){
DoSomething(i);
}
I'm thinking that "out" should only be used if the return type is bool and the expected usages are where the "out" parameters will always be of interest to the caller, by design.
Thoughts?
There is a reason that one of the static code analysis (=FxCop) rules points at you when you use out parameters. I'd say: only use out when really needed in interop type scenarios. In all other cases, simply do not use out. But perhaps that's just me?
This is what the .NET Framework Developer's Guide has to say about out parameters:
Avoid using out or reference parameters.
Working with members
that define out or reference
parameters requires that the developer
understand pointers, subtle
differences between value types and
reference types, and initialization
differences between out and reference
parameters.
But if you do use them:
Do place all out parameters after all of the pass-by-value and ref
parameters (excluding parameter
arrays), even if this results in an
inconsistency in parameter ordering
between overloads.
This convention makes the method
signature easier to understand.
Your approach is better than out, because you can "chain" calls that way:
DoSomethingElse(DoThing(a,b).Result);
as opposed to
DoThing(a, out b);
DoSomethingElse(b);
The TryParse methods implemented with "out" was a mistake, IMO. Those would have been very convenient in chains.
There are only very few cases where I would use out. One of them is if your method returns two variables that from an OO point of view do not belong into an object together.
If for example, you want to get the most common word in a text string, and the 42nd word in the text, you could compute both in the same method (having to parse the text only once). But for your application, these informations have no relation to each other: You need the most common word for statistical purposes, but you only need the 42nd word because your customer is a geeky Douglas Adams fan.
Yes, that example is very contrived, but I haven't got a better one...
I just had to add that starting from C# 7, the use of the out keyword makes for very readable code in certain instances, when combined with inline variable declaration. While in general you should rather return a (named) tuple, control flow becomes very concise when a method has a boolean outcome, like:
if (int.TryParse(mightBeCount, out var count)
{
// Successfully parsed count
}
I should also mention, that defining a specific class for those cases where a tuple makes sense, more often than not, is more appropriate. It depends on how many return values there are and what you use them for. I'd say, when more than 3, stick them in a class anyway.
One advantage of out is that the compiler will verify that CalcSomething does in fact assign a value to someOtherNumber. It will not verify that the someOtherNumber field of Result has a value.
Stay away from out. It's there as a low-level convenience. But at a high level, it's an anti-technique.
int? i = Util.TryParseInt32("1");
if(i == null)
return;
DoSomething(i);
If you have even seen and worked with MS
namespace System.Web.Security
MembershipProvider
public abstract MembershipUser CreateUser(string username, string password, string email, string passwordQuestion, string passwordAnswer, bool isApproved, object providerUserKey, out MembershipCreateStatus status);
You will need a bucket. This is an example of a class breaking many design paradigms. Awful!
Just because the language has out parameters doesn't mean they should be used. eg goto
The use of out Looks more like the Dev was either Lazy to create a type or wanted to try a language feature.
Even the completely contrived MostCommonAnd42ndWord example above I would use
List or a new type contrivedresult with 2 properties.
The only good reasons i've seen in the explanations above was in interop scenarios when forced to. Assuming that is valid statement.
You could create a generic tuple class for the purpose of returning multiple values. This seems to be a decent solution but I can't help but feel that you lose a bit of readability by returning such a generic type (Result is no better in that regard).
One important point, though, that james curran also pointed out, is that the compiler enforces an assignment of the value. This is a general pattern I see in C#, that you must state certain things explicitly, for more readable code. Another example of this is the override keyword which you don't have in Java.
If your result is more complex than a single value, you should, if possible, create a result object. The reasons I have to say this?
The entire result is encapsulated. That is, you have a single package that informs the code of the complete result of CalcSomething. Instead of having external code interpret what the decimal return value means, you can name the properties for your previous return value, Your someOtherNumber value, etc.
You can include more complex success indicators. The function call you wrote might throw an exception if end comes before start, but exception throwing is the only way to report errors. Using a result object, you can include a boolean or enumerated "Success" value, with appropriate error reporting.
You can delay the execution of the result until you actually examine the "result" field. That is, the execution of any computing needn't be done until you use the values.
I'm just wondering how other developers tackle this issue of getting 2 or 3 answers from a method.
1) return a object[]
2) return a custom class
3) use an out or ref keyword on multiple variables
4) write or borrow (F#) a simple Tuple<> generic class
http://slideguitarist.blogspot.com/2008/02/whats-f-tuple.html
I'm working on some code now that does data refreshes. From the method that does the refresh I would like to pass back (1) Refresh Start Time and (2) Refresh End Time.
At a later date I may want to pass back a third value.
Thoughts? Any good practices from open source .NET projects on this topic?
It entirely depends on what the results are. If they are related to one another, I'd usually create a custom class.
If they're not really related, I'd either use an out parameter or split the method up. If a method wants to return three unrelated items, it's probably doing too much. The exception to this is when you're talking across a web-service boundary or something else where a "purer" API may be too chatty.
For two, usually 4)
More than that, 2)
Your question points to the possibility that you'll be returning more data in the future, so I would recommend implementing your own class to contain the data.
What this means is that your method signature will remain the same even if the inner representation of the object you're passing around changes to accommodate more data. It's also good practice for readability and encapsulation reasons.
Code Architeture wise i'd always go with a Custom Class when needing somewhat a specific amount of variables changed. Why? Simply because a Class is actually a "blueprint" of an often used data type, creating your own data type, which it in this case is, will help you getting a good structure and helping others programme for your interface.
Personally, I hate out/ref params, so I'd rather not use that approach. Also, most of the time, if you need to return more than one result, you are probably doing something wrong.
If it really is unavoidable, you will probably be happiest in the long run writing a custom class. Returning an array is tempting as it is easy and effective in the short teerm, but using a class gives you the option of changing the return type in the future without having to worry to much about causing problems down stream. Imagine the potential for a debugging nightmare if someone swaps the order of two elements in the array that is returned....
I use out if it's only 1 or 2 additional variables (for example, a function returns a bool that is the actual important result, but also a long as an out parameter to return how long the function ran, for logging purposes).
For anything more complicated, i usually create a custom struct/class.
I think the most common way a C# programmer would do this would be to wrap the items you want to return in a separate class. This would provide you with the most flexibility going forward, IMHO.
It depends. For an internal only API, I'll usually choose the easiest option. Generally that's out.
For a public API, a custom class usually makes more sense - but if it's something fairly primitive, or the natural result of the function is a boolean (like *.TryParse) I'll stick with an out param. You can do a custom class with an implicit cast to bool as well, but that's usually just weird.
For your particular situation, a simple immutable DateRange class seems most appropriate to me. You can easily add that new value without disturbing existing users.
If you're wanting to send back the refresh start and end times, that suggests a possible class or struct, perhaps called DataRefreshResults. If your possible third value is also related to the refresh, then it could be added. Remember, a struct is always passed by value, so it's allocated on the heap does not need to be garbage-collected.
Some people use KeyValuePair for two values. It's not great though because it just labels the two things as Key and Value. Not very descriptive. Also it would seriously benefit from having this added:
public static class KeyValuePair
{
public static KeyValuePair<K, V> Make(K k, V v)
{
return new KeyValuePair<K, V>(k, v);
}
}
Saves you from having to specify the types when you create one. Generic methods can infer types, generic class constructors can't.
For your scenario you may want to define generic Range{T} class (with checks for the range validity).
If method is private, then I usually use tuples from my helper library. Public or protected methods generally always deserve separate.
Return a custom type, but don't use a class, use a struct - no memory allocation/garbage collection overhead means no downsides.
If 2, a Pair.
If more than 2 a class.
Another solution is to return a dictionary of named object references. To me, this is pretty equivalent to using a custom return class, but without the clutter. (And using RTTI and reflection it is just as typesafe as any other solution, albeit dynamically so.)
It depends on the type and meaning of the results, as well as whether the method is private or not.
For private methods, I usually just use a Tuple, from my class library.
For public/protected/internal methods (ie. not private), I use either out parameter or a custom class.
For instance, if I'm implementing the TryXYZ pattern, where you have an XYZ method that throws an exception on failure and a TryXYZ method that returns Boolean, TryXYZ will use an out parameter.
If the results are sequence-oriented (ie. return 3 customers that should be processed) then I will typically return some kind of collection.
Other than that I usually just use a custom class.
If a method outputs two to three related value, I would group them in a type. If the values are unrelated, the method is most likely doing way too much and I would refactor it into a number of simpler methods.
Can someone explain why there is the need to add an out or in parameter to indicate that a generic type is Co or Contra variant in C# 4.0?
I've been trying to understand why this is important and why the compiler can't just figure it out..
Thanks,
Josh
Eric Lippert, who works on the langauge, has a series of posts on msdn that should help clarify the issues involved:
http://blogs.msdn.com/ericlippert/archive/tags/Covariance+and+Contravariance/default.aspx
When reading the articles shown at that link, start at the bottom and work up.
Eventually you'll get to #7 (Why do we need a syntax at all?).
We don't actually need them, any more then we need abstract on classes or both out and ref. They exist just so that we, as programmers, can make our intention crystal clear, so that maintenance programmer know what we are doing, and the compiler can verify that we are doing it right.
Well, the main problem is that if you have a class hierarchy like:
class Foo { .. }
class Bar : Foo { .. }
And you have an IEnumerator<Bar>, you can't use that as an IEnumerator<Foo> even though that would be perfectly safe. In 3.5 this forces a large number of painful gyrations. This operation would always be safe but is denied by the type system because it doesn't know about the covariant use of the generic type parameter. IEnumerator<Bar> can only return a Bar and every Bar is a Foo.
Similarly, if you had an IEqualityComparer<Foo> it can be used to compare any pair of objects of type Foo even if one or both is a Bar, but it cannot be cast into an IEqualityComparer<Bar> because it doesn't know about the contravariant use of the generic type parameter. IEqualityComparer<Foo> only consumes objects of type Foo and every Bar is a Foo.
Without these keywords we're forced to assume the generic argument can occur as both an argument to a method and as the result type of a method, and so we can't safely allow either of the above conversions.
With them, the type system is free to allow us to safely upcast and downcast between those interfaces in the direction indicated by the keyword and we get errors indicating when we would violate the discipline required to ensure that safety.
The in and out keywords have been keywords since C# 1.0, and have been used in the context of in- and out- parameters to methods.
Covariance and contravariance are constraints on how one can implement an interface. There is no good way to infer them - the only way, I think, is from usage, and that would be messy and in the end it wouldn't work.
Jon and Joel both provided a pretty complete answer to this, but the bottom line is that they aren't so much needed by the compiler but rather help guarantee the safety of the implementation by explicitly indicating the variance of the parameter. This follows a very similar pattern to requiring the out or ref keyword at both the calling site and the declaration site.
What is their use if when you call the method, it might not exist?
Does that mean that you would be able to dynamically create a method on a dynamic object?
What are the practical use of this?
You won't really be able to dynamically create the method - but you can get an implementation of IDynamicMetaObject (often by extending DynamicObject) to respond as if the method existed.
Uses:
Programming against COM objects with a weak API (e.g. office)
Calling into dynamic languages such as Ruby/Python
Potentially making "explorable" objects - imagine an XPath-like query but via a method/property calls e.g. document.RootElement.Person[5].Name["Attribute"]
No doubt many more we have yet to think of :)
First of all, you can't use it now. It's part of C#4, which will be released sometime in the future.
Basically, it's for an object, whose properties won't be known until runtime. Perhaps it comes from a COM object. Perhaps it's a "define on the fly object" as you describe (although I don't think there's a facility to create those yet or planned).
It's rather like a System.Object, except that you are allowed to call methods that the compiler doesn't know about, and that the runtime figures out how to call.
The two biggies I can think of are duck typing and the ability to use C# as a scripting language in applications, similar to javascript and Python. That last one makes me tear up a little.
Think of it as a simplified form of Reflection. Instead of this:
object value = GetSomeObject();
Method method = value.GetType().GetMethod("DoSomething");
method.Invoke(value, new object[] { 1, 2, 3 });
You get this:
IDynamicObject value = GetSomeObject();
value.DoSomething(1, 2, 3);
I see several dynamic ORM frameworks being written. Or heck write one yourself.
I agree with Jon Skeet, you might see some interesting ways of exploring objects.
Maybe with selectors like jQuery.
Calling COM and calling Dynamic Languages.
I'm looking forward to seeing if there is a way to do a Ruby-like missing_method.