More particularly, I really want an immutable/shared linked list, and I think having immutable maps and sets would be nice too. As long as I don't have to worry about the core implementation, I can easily add extension methods/subclass/wrap it to provide a reasonably slick external interface for myself to use.
Is there any reason I shouldn't do this? Performance, incompatibility, etc.?
FSharpx includes a couple of "adapters" so that F# collections can be used more comfortably in C#. Here's a short example:
var a = FSharpList.Create(1, 2, 3);
var b = a.Cons(0);
b.TryFind(x => x > 4)
 .Match(v => Console.WriteLine("I found a value {0}", v),
        () => Console.WriteLine("I didn't find anything"));
There's not much documentation right now, but you can use the tests for reference. It doesn't include absolutely every operation (I don't mind directly using things like MapModule in C# too much), but if you find anything you need missing, please fork the repository and add it!
I also blogged about this a few weeks ago.
Or you can try and use one of these implementations of persistent collections in C#.
The types in the F# library (such as Set, Map and list) were not designed to be used from C#, so I wouldn't generally recommend using them directly. It can be done and some basic operations will work well (e.g. adding elements to an immutable map and checking if an element exists). However, there are some issues:
F# puts a lot of functionality in modules (e.g. MapModule for the immutable map), whereas a C# user would expect to see these operations as instance members.
F# functions are not represented as Func<_, _> delegates, but as a special F#-specific function type (FSharpFunc<_, _>). This means that using higher-order functions from C# will be difficult.
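Still, the basic operations mentioned above do work fine directly from C#. A minimal sketch, assuming FSharp.Core is referenced (the class and variable names are mine):

using System;
using Microsoft.FSharp.Collections;

class FSharpCollectionsFromCSharp
{
    static void Main()
    {
        // Immutable F# list: Cons prepends and returns a new list, Empty is shared.
        var tail = FSharpList<int>.Cons(2, FSharpList<int>.Empty);
        var xs = FSharpList<int>.Cons(1, tail);
        Console.WriteLine(xs.Length);             // 2

        // Immutable F# map: Add returns a new map; the original is left untouched.
        var m0 = new FSharpMap<string, int>(new[] { Tuple.Create("one", 1) });
        var m1 = m0.Add("two", 2);
        Console.WriteLine(m1.ContainsKey("two")); // True
        Console.WriteLine(m0.ContainsKey("two")); // False
    }
}

Every Cons/Add hands back a new collection and leaves the original untouched, which is exactly the sharing behaviour the question asks for.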
So, in summary, I think that a better approach is to wrap the F# data type into a class (implemented in F#) that exposes the methods you need to a C# developer in a friendly way. You can e.g. easily declare an F# method that takes Func<_, _> delegate and calls F# higher-order function in a module.
I'm trying to understand the design decision behind this part of the language. I admit i'm very new to it all but this is something which caught me out initially and I was wondering if I'm missing an obvious reason. Consider the following code:
List<int> MyList = new List<int>() { 5, 4, 3, 2, 1 };
int[] MyArray = {5,4,3,2,1};
//Sort the list
MyList.Sort();
//This was an instance method
//Sort the Array
Array.Sort(MyArray);
//This was a static method
Why are they not both implemented in the same way - intuitively to me it would make more sense if they were both instance methods?
The question is interesting because it reveals details of the .NET type system. Like value types, strings and delegate types, array types get special treatment in .NET. The most notable oddity is that you never explicitly declare an array type; the compiler takes care of it for you, with ample help from the jitter. System.Array is an abstract type; you get dedicated array types in the process of writing code, either by explicitly creating a type[] or by using generic classes that have an array in their base implementation.
In a largish program, having hundreds of array types is not unusual. That's okay, but there's overhead involved for each type: storage required for the type itself, not for its objects. The biggest chunk of it is the so-called 'method table', in a nutshell a list of pointers to each instance method of the type. The class loader and the jitter work together to fill this table. It is commonly thought of as the 'v-table', but that isn't quite a match: the table contains pointers to both non-virtual and virtual methods.
You can perhaps see where this leads: the designers were worried about having lots of types with big method tables, so they looked for ways to cut down on the overhead.
Array.Sort() was an obvious target.
The same issue is not relevant for generic types. One of the many niceties of generics is that a single method table can handle the method pointers for any reference-type type argument.
You are comparing two different types of 'object containers':
MyList is a generic collection: a List<T> with T being int, where List<T> represents a strongly typed list of objects. The List<T> class itself provides methods to search, sort, and manipulate its contained objects.
MyArray is a basic data structure of type Array. The Array does not provide the same rich set of methods as the List. Arrays can be single-dimensional, multidimensional or jagged, whilst Lists out of the box are only single-dimensional.
Take a look at this question, it provides a richer discussion about these data types: Array versus List<T>: When to use which?
Without asking someone who was involved in the design of the original platform it's hard to know. But, here's my guess.
In older languages, like C, arrays are dumb data structures - they have no code of their own. Instead, they're manipulated by outside functions. As you move into an object-oriented framework, the closest equivalent is a dumb object (with minimal methods) manipulated by static methods.
So, my guess is that the implementation of .NET Arrays is more a symptom of C style thinking in the early days of development than anything else.
This likely has to do with inheritance. The Array class cannot be manually derived from. But oddly, you can declare an array of anything at all and get an instance of System.Array that is strongly typed, even before generics allowed you to have strongly typed collections. Array seems to be one of those magic parts of the framework.
Also notice that the instance methods provided on an array barely modify it; SetValue() seems to be the only one that changes anything. The Array class itself provides many static methods that can change the content of the array, like Reverse() and Sort(). Not sure if that's significant - maybe someone here can give some background as to why that's the case.
In contrast, List<T> (which wasn't around in the 1.0 framework days) and classes like ArrayList (which was around back then) are just run-of-the-mill classes with no special meaning within the framework. They provide a common .Sort() instance method so that when you inherit from these classes, you get that functionality or can override it.
However, these kinds of sort methods have gone out of vogue anyway, as extension methods like LINQ's .OrderBy() have become the next evolution. You can query and sort arrays, Lists and any other enumerable object with the same mechanism now, which is really, really nice.
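As a rough sketch of that uniformity (the variable names are mine):

using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    static void Main()
    {
        int[] array = { 5, 4, 3, 2, 1 };
        var list = new List<int> { 5, 4, 3, 2, 1 };

        // The same LINQ operator works over arrays, lists, or any IEnumerable<T>,
        // and returns a new sorted sequence instead of mutating the source.
        var sortedFromArray = array.OrderBy(x => x).ToArray();
        var sortedFromList = list.OrderBy(x => x).ToList();

        Console.WriteLine(string.Join(",", sortedFromArray)); // 1,2,3,4,5
        Console.WriteLine(string.Join(",", sortedFromList));  // 1,2,3,4,5
    }
}

Unlike Array.Sort() or List<T>.Sort(), OrderBy doesn't touch the source at all; it hands back a new sorted sequence.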
-- EDIT --
The other, more cynical answer may just be - that's how Java did it so Microsoft did it the same way in the 1.0 version of the framework since at that time they were busy playing catch-up.
One reason might be because Array.Sort was designed in .NET 1.0, which had no generics.
I'm not sure, but I'm thinking maybe just so that arrays are as close to Primitives as they can be.
I need to find the minimum between 3 values, and I ended up doing something like this:
Math.Min(Math.Min(val1, val2), val3)
It just seems a little silly to me, because other languages use variadic functions for this. I highly doubt this was an oversight though.
Is there any reason why a simple Min/Max function shouldn't be variadic? Are there performance implications? Is there a variadic version that I didn't notice?
If it is a collection (an implementation of IEnumerable<T>), one can easily use the extension methods in the System.Linq namespace:
int min = new int[] {2,3,4,8}.Min();
Furthermore, it's easy to implement these methods on your own:
using System.Linq;

public static class Maths {
    // Generic variadic Min/Max built on LINQ's Enumerable.Min/Max,
    // which compare via Comparer<T>.Default.
    public static T Min<T>(params T[] vals) {
        return vals.Min();
    }

    public static T Max<T>(params T[] vals) {
        return vals.Max();
    }
}
You can call these methods with just simple scalars so Maths.Min(14,25,13,2) would give 2.
These methods are generic, so there is no need to implement them separately for each numeric type (int, float, ...).
I think the basic reason these methods are not provided in general is that every call would have to create an array (or at least an IList object). By keeping the methods low-level, that overhead is avoided. However, I agree they could be added to the Math class to make life easier for some programmers.
CommuSoft has addressed how to accomplish the equivalent in C#, so I won't retread that part.
To specifically address your question "Why aren't C#'s Math.Min/Max variadic?", two thoughts come to mind.
First, Math.Min (and Math.Max) is not, in fact, a C# language feature; it is a .NET framework library feature. That may seem pedantic, but it is an important distinction. C# does not, in fact, provide any special-purpose language feature for determining the minimum or maximum value between two (or more) potential values.
Secondly, as Eric Lippert has pointed out a number of times, language features (and presumably framework features) are not "removed" or actively excluded - all features are unimplemented until someone designs, implements, tests, documents and ships the feature. See here for an example.
Not being a .NET framework developer, I cannot speak to the actual decision process that occurred, but it seems like this is a classic case of a feature that simply never rose to the level of inclusion, similar to the sequence foreach "feature" Eric discusses in the provided link.
I think CommuSoft is providing a robust answer that is at least suited for people searching for something along these lines, and that should be accepted.
With that said, the reason is definitely to avoid the overhead necessary for the less likely use case that people want to compare a group rather than two values.
As pointed out by @arx, using a params array would add unnecessary overhead for the most common case, but it would also add the overhead of the loop that would have to run internally, making n - 1 comparisons over the array.
I can easily see an argument for having created the method in addition to the basic form, but with LINQ that's just no longer necessary.
I'm looking at the new C# feature of tuples. I'm curious, what problem was the tuple designed to solve?
What have you used tuples for in your apps?
Update
Thanks for the answers thus far, let me see if I have things straight in my mind.
A good example of a tuple has been pointed out as coordinates. Does this look right?
var coords = Tuple.Create(geoLat,geoLong);
Then use the tuple like so:
var myLatlng = new google.maps.LatLng("+ coords.Item1 + ", "+ coords.Item2 + ");
Is that correct?
When writing programs it is extremely common to want to logically group together a set of values which do not have sufficient commonality to justify making a class.
Many programming languages allow you to logically group together a set of otherwise unrelated values without creating a type in only one way:
void M(int foo, string bar, double blah)
Logically this is exactly the same as a method M that takes one argument which is a 3-tuple of int, string, double. But I hope you would not actually make:
class MArguments
{
    public int Foo { get; private set; }
    // ... etc
unless MArguments had some other meaning in the business logic.
The concept of "group together a bunch of otherwise unrelated data in some structure that is more lightweight than a class" is useful in many, many places, not just for formal parameter lists of methods. It's useful when a method has two things to return, or when you want to key a dictionary off of two data rather than one, and so on.
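As a hedged illustration of those last two uses (the MinMax method and the city names are mine, not part of any framework API):

using System;
using System.Collections.Generic;

class TupleUses
{
    // Return two related values from one method without declaring a new type.
    // Assumes the input sequence is non-empty.
    static Tuple<int, int> MinMax(IEnumerable<int> xs)
    {
        int min = int.MaxValue, max = int.MinValue;
        foreach (var x in xs)
        {
            if (x < min) min = x;
            if (x > max) max = x;
        }
        return Tuple.Create(min, max);
    }

    static void Main()
    {
        var result = MinMax(new[] { 3, 1, 4, 1, 5 });
        Console.WriteLine("min={0} max={1}", result.Item1, result.Item2);

        // Key a dictionary off of two pieces of data rather than one;
        // Tuple<T1, T2> compares and hashes structurally, so this just works.
        var distances = new Dictionary<Tuple<string, string>, double>
        {
            { Tuple.Create("London", "Paris"), 344.0 }
        };
        Console.WriteLine(distances[Tuple.Create("London", "Paris")]);
    }
}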
Languages like F# which support tuple types natively provide a great deal of flexibility to their users; they are an extremely useful set of data types. The BCL team decided to work with the F# team to standardize on one tuple type for the framework so that every language could benefit from them.
However, there is at this point no language support for tuples in C#. Tuples are just another data type like any other framework class; there's nothing special about them. We are considering adding better support for tuples in hypothetical future versions of C#. If anyone has any thoughts on what sort of features involving tuples you'd like to see, I'd be happy to pass them along to the design team. Realistic scenarios are more convincing than theoretical musings.
Tuples provide an immutable grouping of values
Aside from the common uses of tuples:
to group common values together without having to create a class
to return multiple values from a function/method
etc...
Immutable objects are inherently thread safe:
Immutable objects can be useful in multi-threaded applications. Multiple threads can act on data represented by immutable objects without concern of the data being changed by other threads. Immutable objects are therefore considered to be more thread-safe than mutable objects.
From "Immutable Object" on wikipedia
It provides an alternative to ref or out if you have a method that needs to return multiple new objects as part of its response.
It also allows you to use a built-in type as a return type if all you need to do is mash-up two or three existing types, and you don't want to have to add a class/struct just for this combination. (Ever wish a function could return an anonymous type? This is a partial answer to that situation.)
It's often helpful to have a "pair" type, just used in quick situations (like returning two values from a method). Tuples are a central part of functional languages like F#, and C# picked them up along the way.
very useful for returning two values from a function
Personally, I find Tuples to be an iterative part of development when you're in an investigative cycle, or just "playing". Because a Tuple is generic, I tend to think of it when working with generic parameters - especially when wanting to develop a generic piece of code, and I'm starting at the code end, instead of asking myself "how would I like this call to look?".
Quite often I realise that the Tuples I form become part of a list, and staring at List<Tuple<...>> doesn't really express the intention of the list, or how it works. I often "live" with it, but find myself wanting to manipulate the list and change a value - at which point, I don't necessarily want to create a new Tuple for that; thus I need to create my own class or struct to hold it, so I can add manipulation code.
Of course, there's always extension methods - but quite often you don't want to extend that extra code to generic implementations.
There have been times I've wanted to express data as a Tuple and not had Tuples available (VS2008), in which case I've just created my own Tuple class - and I didn't make it thread-safe (immutable).
So I guess I'm of the opinion that Tuples are lazy programming at the expense of losing a type name that describes its purpose. The other expense is that you have to declare the signature of the Tuple wherever it's used as a parameter. After a number of methods begin to look bloated, you may feel, as I do, that it is worth making a class, as it cleans up the method signatures.
I tend to start by having the class as a public member of the class you're already working in. But the moment it extends beyond simply a collection of values, it gets its own file, and I move it out of the containing class.
So in retrospect, I believe I use Tuples when I don't want to go off and write a class and just want to think about what I'm writing right now - which means the signature of the Tuple may change quite a lot in the next half hour whilst I figure out what data I'm going to need for this method and whatever values it will return.
If I get a chance to refactor code, then often I'll question a Tuple's place in it.
This is an old question from 2010; as of 2017, .NET has changed and become smarter.
C# 7 introduces language support for tuples, which enables semantic names for the fields of a tuple using new, more efficient tuple types.
In VS 2017 and .NET 4.7 (or by installing the NuGet package System.ValueTuple), you can create and use a tuple in a very efficient and simple way:
var person = (Id: "123", Name: "john");               // create a tuple with two named fields
Console.WriteLine($"{person.Id} name:{person.Name}"); // access its fields
Returning more than one value from a method:
public (double sum, double average) ComputeSumAndAverage(List<double> list)
{
    var sum = list.Sum();
    var average = sum / list.Count;
    return (sum, average);
}
How to use:
var list = new List<double> { 1, 2, 3 };
var result = ComputeSumAndAverage(list);
Console.WriteLine($"Sum={result.sum} Average={result.average}");
For more details read: https://learn.microsoft.com/en-us/dotnet/csharp/tuples
A Tuple is often used to return multiple values from functions when you don’t want to create a specific type. If you're familiar with Python, Python has had this for a long time.
Returning more than one value from a function. getCoordinates() isn't very useful if it just returns x or y or z, but making a full class and object to hold three ints also seems pretty heavyweight.
A common use might be to avoid creating classes/structs that only contain 2 fields; instead you create a Tuple (or a KeyValuePair for now).
Useful as a return value; it avoids passing N out params...
I find KeyValuePair refreshing in C# for iterating over the key/value pairs in a Dictionary.
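For example (a small sketch; the names are mine):

using System;
using System.Collections.Generic;

class Demo
{
    static void Main()
    {
        var ages = new Dictionary<string, int> { { "Alice", 30 }, { "Bob", 25 } };

        // Each element of a Dictionary enumeration is a KeyValuePair<TKey, TValue>.
        foreach (KeyValuePair<string, int> pair in ages)
        {
            Console.WriteLine("{0} is {1}", pair.Key, pair.Value);
        }
    }
}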
It's really helpful when returning values from functions. We can get multiple values back, and this is quite a saver in some scenarios.
I stumbled upon this performance benchmark between Tuples and KeyValuePairs, and you will probably find it interesting. In summary, it says that Tuple has an advantage because it is a class, therefore it is stored on the heap rather than the stack, and when passed around as an argument only its pointer is copied. KeyValuePair, however, is a struct, so it is faster to allocate but slower when used.
http://www.dotnetperls.com/tuple-keyvaluepair
So in C++, I'm used to being able to do:
typedef int PeerId;
This allows me to make a type more self-documenting, but additionally also allows me to make PeerId represent a different type at any time without changing all of the code. I could even turn PeerId into a class if I wanted. This kind of extensibility is what I want to have in C#, however I am having trouble figuring out how to create an alias for 'int' in C#.
I think I can use the using statement, but it only has scope in the current file I believe, so that won't work (The alias needs to be accessible between multiple files without being redefined). I also can't derive a class from built-in types (but normally this is what I would do to alias ref-types, such as List or Dictionary). I'm not sure what I can do. Any ideas?
You need to use the full type name like this:
using DWORD = System.Int32;
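For example, within a single file (and, as the question notes, the alias has to be repeated in every file that uses it; the Demo class is mine):

using PeerId = System.Int32;

class Demo
{
    static void Main()
    {
        PeerId id = 42;                          // PeerId is just Int32 at compile time
        System.Console.WriteLine(id.GetType());  // prints System.Int32
    }
}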
You could (ab)use implicit conversions:
struct PeerId
{
    private int peer;

    public static implicit operator PeerId(int i)
    {
        return new PeerId { peer = i };
    }

    public static implicit operator int(PeerId p)
    {
        return p.peer;
    }
}
This takes the same space as an int, and you can do:
PeerId p = 3;
int i = p;
But I agree you probably don't need this.
Summary
Here's the short answer:
Typedefs are actually variables used by compile-time code generators.
C# is being designed to avoid adding code generation language constructs.
Therefore, the concept of typedefs doesn't fit in well with the C# language.
Long Answer
In C++, it makes more sense: C++ started off as a precompiler that spit out C code, which was then compiled. This "code generator" beginning still has effects in modern C++ features (i.e., templates are essentially a Turing-complete language for generating classes and functions at compile time). In this context, a typedef makes sense because it's a way to get the "result" of a compile-time type factory or "algorithm" that "returns" a type.
In this strange meta-language (which few outside of Boost have mastered), a typedef is actually a variable.
What you're describing is less complex, but you're still trying to use the typedef as a variable. In this case, it's used as an input variable. So when other code uses the typedef, it's really not using that type directly. Rather, it's acting as a compile-time code generator, building classes and methods based on typedef'ed input variables. Even if you ignore C++ templates and just look at C typedefs, the effect is the same.
C++ and Generative Programming
C++ was designed to be a multi-paradigm language (OO and procedural, but not functional until Boost came out). Interestingly enough, templates have evolved an unexpected paradigm: generative programming. (Generative programming was around before C++, but C++ made it popular.) Generative programs are actually meta-programs that - when compiled - generate the needed classes and methods, which are in turn compiled into executables.
C# and Generative Programming
Our tools are slowly evolving in the same direction. Of course, Reflection.Emit can be used for "manual" generative programming, but it is quite painful. The way LINQ providers use expression trees is very generative in nature. T4 templates get really close but still fall short. The "compiler as a service" which will hopefully be part of C# vNext appears most promising of all, if it could be combined with some kind of type variable (such as a typedef).
This one piece of the puzzle is still missing: generative programs need some sort of automatic trigger mechanism (in C++, this is handled by implicit template instantiation).
However, it is explicitly not a goal of C# to have any kind of "code generator" in the C# language like C++ templates (probably for the sake of understandability; very few C++ programmers understand C++ templates). This will probably be a niche satisfied by T4 rather than C#.
Conclusion (repeating the Summary)
All of the above is to say this:
Typedefs are variables used by code generators.
C# is being designed to avoid adding code generation language constructs.
Therefore, the concept of typedefs doesn't fit in well with the C# language.
I also sometimes feel I need (integer) typedefs for similar purposes to the OP.
If you do not mind the casts being explicit (I actually want them to be) you can do this:
enum PeerId : int {};
This will also work for byte, sbyte, short, ushort, uint, long, or ulong (obviously).
Not exactly the intended usage of enum, but it does work.
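As a rough sketch of what usage then looks like (the Demo class is mine); every conversion is an explicit cast:

enum PeerId : int { }

class Demo
{
    static void Main()
    {
        PeerId id = (PeerId)42;   // explicit cast in
        int raw = (int)id;        // explicit cast out
        System.Console.WriteLine("peer {0}", raw);
    }
}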
Since C# 10 you can use global using:
global using PeerId = System.Int32;
It works for all files.
It should appear before all using directives without the global modifier.
See using directive.
Redefining fundamental types just for the sake of changing the name is C++ thinking and does not sit well with the more purely object-oriented C#. Whenever you get the urge to shoehorn a concept from one language into another, stop and think whether or not it makes sense, and try to stay native to the platform.
The requirement of being able to change the underlying type easily can be satisfied by defining your own value type. Coupled with implicit conversion operators and arithmetic operators, you have the power to define very powerful types. If you are worried about the performance cost of adding layers on top of simple types, don't be. There is a 99% chance it won't matter, and in the 1% chance that it does, it will not be the "low-hanging fruit" of performance optimization.
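A minimal sketch of that value-type approach, using a hypothetical Meters type (the name and members are mine):

using System;

struct Meters
{
    private readonly double value;
    private Meters(double value) { this.value = value; }

    public static implicit operator Meters(double d) { return new Meters(d); }
    public static implicit operator double(Meters m) { return m.value; }
    public static Meters operator +(Meters a, Meters b) { return new Meters(a.value + b.value); }

    public override string ToString() { return value + " m"; }
}

class Demo
{
    static void Main()
    {
        Meters a = 1.5;           // implicit conversion in
        Meters b = 2.5;
        double total = a + b;     // operator +, then implicit conversion out
        Console.WriteLine(total); // 4
        Console.WriteLine(a + b); // 4 m
    }
}

Changing the underlying representation later only means touching the Meters struct, not every call site.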
What is their use if when you call the method, it might not exist?
Does that mean that you would be able to dynamically create a method on a dynamic object?
What are the practical use of this?
You won't really be able to dynamically create the method - but you can get an implementation of IDynamicMetaObjectProvider (often by extending DynamicObject) to respond as if the method existed.
Uses:
Programming against COM objects with a weak API (e.g. office)
Calling into dynamic languages such as Ruby/Python
Potentially making "explorable" objects - imagine an XPath-like query but via method/property calls, e.g. document.RootElement.Person[5].Name["Attribute"]
No doubt many more we have yet to think of :)
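As a small sketch of the DynamicObject route mentioned above (the Echo class is mine), responding to any method name at runtime:

using System;
using System.Dynamic;

class Echo : DynamicObject
{
    public override bool TryInvokeMember(InvokeMemberBinder binder, object[] args, out object result)
    {
        // Any method name is accepted at runtime; we just report what was called.
        result = string.Format("{0}({1})", binder.Name, string.Join(", ", args));
        return true;
    }
}

class Demo
{
    static void Main()
    {
        dynamic e = new Echo();
        Console.WriteLine(e.AnyMethodYouLike(1, 2, 3)); // AnyMethodYouLike(1, 2, 3)
    }
}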
First of all, you can't use it now. It's part of C#4, which will be released sometime in the future.
Basically, it's for an object, whose properties won't be known until runtime. Perhaps it comes from a COM object. Perhaps it's a "define on the fly object" as you describe (although I don't think there's a facility to create those yet or planned).
It's rather like a System.Object, except that you are allowed to call methods that the compiler doesn't know about, and that the runtime figures out how to call.
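To illustrate the difference from System.Object (a minimal sketch; the variable names are mine):

using System;
using Microsoft.CSharp.RuntimeBinder;

class Demo
{
    static void Main()
    {
        object o = "hello";
        dynamic d = "hello";

        // o.ToUpper();                 // compile-time error: object has no ToUpper()
        Console.WriteLine(d.ToUpper()); // compiles; resolved at runtime -> HELLO

        try
        {
            d.NoSuchMethod();           // also compiles, but fails at runtime
        }
        catch (RuntimeBinderException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}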
The two biggies I can think of are duck typing and the ability to use C# as a scripting language in applications, similar to javascript and Python. That last one makes me tear up a little.
Think of it as a simplified form of Reflection. Instead of this:
object value = GetSomeObject();
MethodInfo method = value.GetType().GetMethod("DoSomething");
method.Invoke(value, new object[] { 1, 2, 3 });
You get this:
dynamic value = GetSomeObject();
value.DoSomething(1, 2, 3);
I see several dynamic ORM frameworks being written. Or heck, write one yourself.
I agree with Jon Skeet, you might see some interesting ways of exploring objects.
Maybe with selectors like jQuery.
Calling COM and calling Dynamic Languages.
I'm looking forward to seeing if there is a way to do a Ruby-like method_missing.