Implementing a content-hashable HashSet in C# (like python's `frozenset`) - c#

Brief summary
I want to build a set of sets of items in C#. The inner sets of items have a GetHashCode and Equals method defined by their contents. In mathematical notation:
x = { }
x.Add( { A, B, C } )
x.Add( { A, D } )
x.Add( { B, C, A } )
Now x should be { { A, B, C }, { A, D } }.
In python, this could be accomplished with frozenset:
x = set()
x.add( frozenset(['A','B','C']) )
x.add( frozenset(['A','D']) )
x.add( frozenset(['B','C','A']) )
/BriefSummary
I would like to have a hashable HashSet in C#. This would allow me to do:
HashSet<ContentHashableHashSet<int>> setOfSets;
Although there are more sophisticated ways to accomplish this, it can be trivially achieved in practice (although not in the most efficient manner) by overriding ContentHashableHashSet.ToString() (outputting the strings of the contained elements in sorted order) and then using ContentHashableHashSet.ToString().GetHashCode() as the hash code.
However, if an object is modified after placement in setOfSets, it could result in multiple copies:
var setA = new ContentHashableHashSet<int>();
setA.Add(1);
setA.Add(2);
var setB = new ContentHashableHashSet<int>();
setB.Add(1);
setOfSets.Add(setA);
setOfSets.Add(setB);
setB.Add(2); // now there are duplicate members!
As far as I can see, I have two options: I can derive ContentHashableHashSet from HashSet, but then I will need to make it so that all modifiers throw an exception. Missing one modifier could cause an insidious bug.
Alternatively, I can use encapsulation and class ContentHashableHashSet can contain a readonly HashSet. But then I would need to reimplement all set methods (except modifiers) so that the ContentHashableHashSet can behave like a HashSet. As far as I know, extensions would not apply.
Lastly, I could encapsulate as above and then all set-like functionality will occur by returning the const (or readonly?) HashSet member.
In hindsight, this is reminiscent of python's frozenset. Does anyone know of a well-designed way to implement this in C#?
If I were able to lose ISet functionality, then I would simply create a sorted ImmutableList, but then I would lose functionality like fast union, fast intersection, and fast set membership with Contains (expected O(1) for a HashSet, versus O(log(n)) at best for a sorted list).
EDIT: The base class HashSet does not have virtual Add and Remove methods, so overriding them will work within the derived class, but will not work if you perform HashSet<int> set = new ContentHashableHashSet<int>();. Casting to the base class will allow editing.
EDIT 2: Thanks to @xanatos for recommending a simple GetHashCode implementation:
The easiest way to calculate the GetHashCode is to simply xor (^) all the GetHashCode values of the elements. The xor operator is commutative, so the ordering is irrelevant. For the comparison you can use SetEquals.
EDIT 3: Someone recently shared information about ImmutableHashSet, but because this class is sealed, it is not possible to derive from it and override GetHashCode.
I was also told that HashSet takes an IEqualityComparer as an argument, and so this can be used to provide an immutable, content-hashable set without deriving from ImmutableHashSet; however, this is not a very object oriented solution: every time I want to use a ContentHashableHashSet, I will need to pass the same (non-trivial) argument. As I'm sure you know, this can really wreak havoc with your coding zen, and where I would be flying along in python with myDictionary[ frozenset(mySet) ] = myValue, I will be stuck doing the same thing again and again and again.
Thanks for any help you can provide. I have a temporary workaround (whose problems are mentioned in EDIT 1 above), but I'd mostly like to learn about the best way to design something like this.

Hide the elements of your set of sets so that they can't be changed. That means copying when you add/retrieve sets, but maybe that's acceptable?
// Better make sure T is immutable too, else set hashes could change
// Requires: using System.Collections.Generic; using System.Linq;
public class SetofSets<T>
{
    private class HashSetComparer : IEqualityComparer<HashSet<T>>
    {
        public int GetHashCode(HashSet<T> x)
        {
            // XOR is commutative, so element order does not affect the result
            return x.Aggregate(1, (code, elt) => code ^ (elt == null ? 0 : elt.GetHashCode()));
        }
        public bool Equals(HashSet<T> x, HashSet<T> y)
        {
            // SetEquals throws on a null argument, so handle nulls first
            if (x == null || y == null)
                return x == y;
            return x.SetEquals(y);
        }
    }
    private HashSet<HashSet<T>> setOfSets;
    public SetofSets()
    {
        setOfSets = new HashSet<HashSet<T>>(new HashSetComparer());
    }
    public void Add(HashSet<T> set)
    {
        // Store a private copy so later changes to the caller's set cannot affect us
        setOfSets.Add(new HashSet<T>(set));
    }
    public bool Contains(HashSet<T> set)
    {
        return setOfSets.Contains(set);
    }
}
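The encapsulation option from the question can also be combined with the XOR hash and SetEquals comparison mentioned in EDIT 2, so that each inner set carries its own content-based equality. The following is only a minimal sketch: the class name ContentHashableSet<T> and the selection of exposed members are assumptions made for illustration, and it assumes the elements themselves are immutable.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical name: a read-only wrapper whose hash code and equality
// depend only on its contents.
public sealed class ContentHashableSet<T> : IReadOnlyCollection<T>, IEquatable<ContentHashableSet<T>>
{
    private readonly HashSet<T> _items; // private copy, never mutated after construction
    private readonly int _hash;         // cached; safe because _items never changes

    public ContentHashableSet(IEnumerable<T> items)
    {
        _items = new HashSet<T>(items);
        int h = 0;
        foreach (T item in _items)
            h ^= item == null ? 0 : item.GetHashCode(); // XOR is order-independent
        _hash = h;
    }

    public int Count => _items.Count;
    public bool Contains(T item) => _items.Contains(item);
    public bool IsSubsetOf(IEnumerable<T> other) => _items.IsSubsetOf(other);

    // Non-mutating union: returns a new set and leaves this one untouched
    public ContentHashableSet<T> Union(IEnumerable<T> other)
    {
        var copy = new HashSet<T>(_items);
        copy.UnionWith(other);
        return new ContentHashableSet<T>(copy);
    }

    public bool Equals(ContentHashableSet<T> other) =>
        other != null && _items.SetEquals(other._items);
    public override bool Equals(object obj) => Equals(obj as ContentHashableSet<T>);
    public override int GetHashCode() => _hash;

    public IEnumerator<T> GetEnumerator() => _items.GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Because the wrapper copies its input and exposes no mutators, a set cannot change after it has been placed in an outer HashSet<ContentHashableSet<T>>, so the duplicate-member problem described in the question does not arise. The price is that every set-like operation you need has to be forwarded by hand.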

Related

Elegant way to query a dictionary in C#

I am trying to create an elegant and extensible way of querying a dictionary which maps an enum to a set of strings.
So I have this class SearchFragments that has the dictionary in it. I then want a method wherein consumers of this class can simply ask "HasAny" and, this is the bit where I am struggling, simply pass in some query like expression and get the boolean answer back.
public class SearchFragments
{
private readonly IDictionary<SearchFragmentEnum, IEnumerable<string>> _fragments;
public SearchFragments()
{
_fragments = new Dictionary<SearchFragmentEnum, IEnumerable<string>>();
}
public bool HasAny(IEnumerable<SearchFragmentEnum> of)
{
int has = 0;
_fragments.ForEach(x => of.ForEach(y => has += x.Key == y ? 1 : 0));
return has >= 1;
}
}
The problem with the way this currently is, is that consumers of this class now have to construct an IEnumerable<SearchFragmentEnum> which can be quite messy.
What I am looking for is that the consuming code will be able to write something along the lines of:
searchFragments.HasAny(SearchFragmentEnum.Name, SearchFragmentEnum.PhoneNumber)
But the argument list should be able to vary in size, without me having to write method overloads in the SearchFragments class for every possible combination, so that if new values are added to SearchFragmentEnum at a future date I won't have to update the class.
You can use params[]
public bool HasAny(params SearchFragmentEnum[] of)
{ ...
Sidenote: you know that LIN(Q) queries should just query a source and never cause any side-effects? But your query does unnecessarily increment the integer:
_fragments.ForEach(x => of.ForEach(y => has += x.Key == y ? 1 : 0));
Instead use this (which is also more efficient and more readable):
return _fragments.Keys.Intersect(of).Any();
An even more efficient alternative to this is Sergey's idea:
return of?.Any(_fragments.ContainsKey) == true;
For variable-sized argument lists in C# you use the params keyword:
public bool HasAny(params SearchFragmentEnum[] of)
The .NET API usually offers a couple of overloads alongside this for performance reasons: the arguments passed to a params parameter are copied into a new array. Explicitly providing overloads for the most common cases avoids this.
public bool HasAny(SearchFragmentEnum of1)
public bool HasAny(SearchFragmentEnum of1, SearchFragmentEnum of2)
etc.
Instead of using params you could also consider marking your enum with the [Flags] attribute. Arguments could then be passed like HasAny(SearchFragmentEnum.Name | SearchFragmentEnum.PhoneNumber). Examples are abundant on StackOverflow (e.g. Using a bitmask in C#).
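A minimal sketch of the [Flags] variant follows. Note that it swaps the dictionary for a simple bitmask of the fragments that are present, so it only fits if the associated strings are stored elsewhere. The member values, the extra Email member, and the class name FlagSearchFragments are assumptions made for illustration.

```csharp
using System;

[Flags]
public enum SearchFragmentEnum
{
    None = 0,
    Name = 1 << 0,        // each member needs its own bit
    PhoneNumber = 1 << 1,
    Email = 1 << 2        // hypothetical extra member
}

// Hypothetical class: tracks which fragments are present as a bitmask
public class FlagSearchFragments
{
    private SearchFragmentEnum _present = SearchFragmentEnum.None;

    public void Add(SearchFragmentEnum fragment)
    {
        _present |= fragment;
    }

    // True if at least one of the requested flags is present
    public bool HasAny(SearchFragmentEnum of)
    {
        return (_present & of) != 0;
    }
}
```

With this shape, adding a new enum member does not require touching HasAny, but the enum is limited to as many members as the underlying integer type has bits.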
Use the params keyword to allow a varying number of arguments. Further, you can simplify your code by looping over only the smaller collection, the argument array. Also, you are using a dictionary, which has an O(1) key check, so an inner loop is unnecessary:
public bool HasAny(params SearchFragmentEnum[] of)
{
foreach(var o in of) {
if (this._fragments.ContainsKey(o))
return true;
}
return false;
}
or shorter with LINQ
public bool HasAny(params SearchFragmentEnum[] of) {
return of?.Any(_fragments.ContainsKey) ?? false;
}

Is there an accepted pattern to preserve variables' values during a function call which modifies those variables?

Within a class I have a property used by a method which I want to remain in the same state after a call to a second method (which might alter that state).
Example: for a property Value I could do something like this:
void MethodOne()
{
...
var tempValue = this.Value;
MethodTwo(); // might modify this.Value
this.Value = tempValue;
...
}
For a single property this isn't a big deal. If I have multiple properties it gets uglier.
I'm looking for a C# solution but would be interested to know if this kind of construct appears in any common language. The sort of syntax I'm after might look something like this:
void MethodOne()
{
...
preserving(this.Value)
{
MethodTwo(); // might modify this.Value
}
...
}
where the preserving keyword could potentially accept multiple properties/fields.
In my specific case it's a recursive method, so the code looks more like:
void MethodOne(object[] args)
{
...
// Do something which might modify this.Value
preserving(this.Value)
{
MethodOne(args);
}
...
}
Is there an accepted pattern / best practice to achieve this?
EDIT
The specific case for which I'm asking is something like this:
For the purposes of sorting lists I have a custom comparison class which implements IComparer. Its Compare method acts on objects which appear in collections (which may therefore be sorted). These collections might be nested, so sorting such a collection might result in the sort function, and therefore Compare(), being called recursively.
The actual comparison function is partially dynamic, which means that it could be set at runtime to something invalid (e.g. non-transitive or non-deterministic). I can't prevent this, so I want to set a limit on the number of comparisons (let's say n-squared, where n is the length of the list being sorted) to protect against cases where an invalid comparison function might result in the sorting algorithm going into an infinite loop.
The Compare method might be called from (e.g.) various LINQ methods such as OrderBy, possibly resulting in lazily evaluated sorts and possibly from code over which I have no control. However, I need to count the number of comparisons in each sort without any 'subsorts' of nested objects corrupting the count (but also counting comparisons in those subsorts).
My code looks something like this:
public int Compare(T x, T y)
{
// this.MaxComparisons is set from outside this code, since this method does not know the length of the list it is sorting.
if (++this.ComparisonCount > this.MaxComparisons)
{
// Error: too many comparisons
}
if (predicate)
{
// Preserve...
tempComparisonCount = this.ComparisonCount;
tempMaxComparisons = this.MaxComparisons;
// ...reset...
this.ComparisonCount = 0;
this.MaxComparisons = ... ; // set as required
var result = this.customComparer.Compare(x.Child, y.Child); // might involve further calls to the above method, which should be counted separately
// ...and restore
this.ComparisonCount = tempComparisonCount;
this.MaxComparisons = tempMaxComparisons;
return result;
}
else
{
return otherComparer.Compare(x, y);
}
}
I hope this makes it clearer why I have asked the question.
private static void Preserving<T>(ref T value, Action act)
{
T old = value;
act();
value = old;
}
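then you can do the following (Value must be a field or local variable here, because a property cannot be passed by ref):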
Preserving(ref this.Value, MethodTwo);
If you have multiple variables you want to save and restore, you should probably create a Context class containing the state you want to save and then push/pop them from a stack.
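One way to sketch this idea is a small disposable scope object, so that nested using blocks play the role of the stack and the old value is restored even if an exception is thrown. This is a variation on the suggestion above, not the answerer's exact design: the Preserved<T> type and the Counter demo class are hypothetical names. It takes getter and setter delegates, so it also works with properties.

```csharp
using System;

// Hypothetical helper: captures a value on construction and writes it back on Dispose
public sealed class Preserved<T> : IDisposable
{
    private readonly T _saved;
    private readonly Action<T> _restore;

    public Preserved(Func<T> get, Action<T> set)
    {
        _saved = get();   // remember the current value
        _restore = set;
    }

    public void Dispose()
    {
        _restore(_saved); // put the remembered value back
    }
}

// Hypothetical demo class standing in for the class from the question
public class Counter
{
    public int Value { get; set; }

    public void MethodTwo()
    {
        Value = 99; // modifies the state we want to protect
    }

    public void MethodOne()
    {
        Value = 1;
        using (new Preserved<int>(() => Value, v => Value = v))
        {
            MethodTwo();
        }
        // Value is 1 again here
    }
}
```

To preserve several members at once, the blocks can be nested, or the delegates can capture and restore a small state object holding all of them.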

C++ Rvalue references and move semantics

C++03 had the problem of unnecessary copies that could happen implicitly. To address this, C++11 introduced rvalue references and move semantics. Now my question is, does this unnecessary copying problem also exist in languages such as C# and Java, or was it only a C++ problem? In other words, do rvalue references make C++11 even more efficient compared to C# or Java?
As far as C# is concerned (operator overloading is allowed in it), let's say we have a mathematical vector class, and we use it like this.
vector_a = vector_b + vector_c;
The compiler will surely transform vector_b + vector_c to some temporary object (let's call it vector_tmp).
Now I don't think C# can differentiate between a temporary rvalue such as vector_tmp and an lvalue such as vector_b, so we'll have to copy data to vector_a anyway, which can easily be avoided by using rvalue references and move semantics in C++11.
Class references in C# and Java have some properties of shared_ptrs in C++. However, rvalue references and move semantics relate more to temporary value types, but the value types in C# are quite non-flexible compared to C++ value types, and from my own C# experience, you'll end up with classes, not structs, most of the time.
So my assumption is that neither Java nor C# would profit much from those new C++ features, which lets code make safe assumptions whether something is a temporary, and instead of copying lets it just steal the content.
Yes, unnecessary copy operations exist in C# and Java.
does rvalue references make C++11 even more efficient as compared to C# or Java?
The answer is yes. :)
Because classes in Java and C# use reference semantics, there are never any implicit copies of objects in those languages. The problem move semantics solve does not and has never existed in Java and C#.
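A minimal sketch of this point in C#, using a hypothetical Vector class: the addition allocates exactly one new object for the result, and assigning that result only copies a reference, so there is no second copy for move semantics to eliminate.

```csharp
using System;

// Hypothetical vector type; a class, so variables hold references
public sealed class Vector
{
    public readonly double[] Data;

    public Vector(double[] data)
    {
        Data = data;
    }

    public static Vector operator +(Vector a, Vector b)
    {
        var sum = new double[a.Data.Length];
        for (int i = 0; i < sum.Length; i++)
            sum[i] = a.Data[i] + b.Data[i];
        return new Vector(sum); // the only allocation: the result itself
    }
}

public static class ReferenceDemo
{
    // Returns true if assignment shares storage instead of copying it
    public static bool AssignmentSharesStorage()
    {
        var b = new Vector(new[] { 1.0, 2.0 });
        var c = new Vector(new[] { 3.0, 4.0 });
        Vector tmp = b + c;
        Vector a = tmp; // copies a reference, not the array
        return ReferenceEquals(a, tmp) && ReferenceEquals(a.Data, tmp.Data);
    }
}
```

The picture is different for large structs, which are copied on assignment, but as noted in the question most C# types end up being classes.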
I think it could occur in Java. See the add and add_to operation below. add creates a result object to hold the result of the matrix add operation, while add_to merely adds the rhs to this.
class Matrix {
public static final int w = 2;
public static final int h = 2;
public float [] data;
Matrix(float v)
{
data = new float[w*h];
for(int i=0; i<w*h; ++i)
{ data[i] = v; }
}
// Creates a new Matrix by adding this and rhs
public Matrix add(Matrix rhs)
{
Matrix result = new Matrix(0.0f);
for(int i=0; i<w*h; ++i)
{ result.data[i] = this.data[i] + rhs.data[i]; }
return result;
}
// Just adds the values in rhs to this
public Matrix add_to(Matrix rhs)
{
for(int i=0; i<w*h; ++i)
{ this.data[i] += rhs.data[i]; }
return this;
}
public static void main(String [] args)
{
Matrix m = new Matrix(0.0f);
Matrix n = new Matrix(1.0f);
Matrix o = new Matrix(1.0f);
// Chaining these ops would modify m
// Matrix result = m.add_to(n).subtract_from(o);
// (subtract and subtract_from are not shown; assume they mirror add and add_to)
m.add_to(n); // Adds n to m
m.subtract_from(o); // Subtracts o from m
// Can chain ops without modifying m,
// but temps created to hold results from each step
Matrix result = m.add(n).subtract(o);
}
}
Thus, I think it depends on what sort of functionality you're providing to the user with your classes.
The problem comes up a lot. Sometimes I want to hold onto a unique copy of an object that no one else can modify. How do I do that?
Make a deep copy of whatever object someone gives me? That would work, but it's not efficient.
Ask people to give me a new object and not to keep a copy? That's faster if you're brave. Bugs can come from a completely unrelated piece of code modifying the object hours later.
C++ style: Move all the items from the input to my own new object. If the caller accidentally tries to use the object again, he will immediately see the problem.
Sometimes a C# read-only collection can help. But in my experience that's usually a pain at best.
Here's what I'm talking about:
class LongLivedObject
{
private Dictionary <string, string> _settings;
public LongLivedObject(Dictionary <string, string> settings)
{ // In C# this always duplicates the data structure and takes O(n) time.
// C++ will automatically try to decide if it could do a swap instead.
// C++ always lets you explicitly say you want to do the swap.
_settings = new Dictionary <string, string>(settings);
}
}
This question is at the heart of Clojure and other functional languages!
In summary, yes, I often wish I had C++11 style data structures and operations in C#.
You can try to emulate move semantics. For instance in Trade-Ideas Philip's example you can pass custom MovableDictionary instead of Dictionary:
public class MovableDictionary<K, V> // : IDictionary<K, V>, IReadOnlyDictionary<K, V>...
{
private Dictionary<K, V> _map;
// Implement all of Dictionary<K, V>'s methods by calling Map's ones.
public Dictionary<K, V> Move()
{
var result = Map;
_map = null;
return result;
}
private Dictionary<K, V> Map
{
get
{
if (_map == null)
_map = new Dictionary<K, V>();
return _map;
}
}
}

Reducing Duplicated Code

I have some code that works on the color structure like this
public void ChangeColor()
{
thisColor.R = thisColor.R + 5;
}
Now I need to make a method that changes a different variable depending on what it is passed. Here is what the code looks like now.
public void ChangeColor(int RGBValue)
{
switch(RGBValue)
{
case 1:
thisColor.R = thisColor.R + 5;
break;
case 2:
thisColor.B = thisColor.B + 5;
break;
}
}
Now, this is something I would normally never question, I'd just throw a #region statement around it and call it a day, but this is just an example of what I have, the actual function is quite long.
I want it to look like this:
public void ChangeColor(int RGBValue)
{
thisColor.RGBValue = thisColor.RGBValue + 5;
}
So essentially the value would refer to the variable being used. Is there a name for this? Is this what Reflection is for? Or something like that... Is there a way to do this?
I'm not 100% sure if this is what you want. But with the given example, it sounds like this might be what you're after.
you might be able to use the ref keyword:
public void ChangeColor(ref int color)
{
color += 5;
}
void SomeMethod()
{
ChangeColor(ref thisColor.R); //Change the red value
ChangeColor(ref thisColor.B); //Change the blue value
}
This is definitely not what reflection is for. In fact, there seem to be a number of issues here. Let's review here - you want to change the following method:
public void ChangeColor(int RGBValue)
{
switch(...)
{
case ...
case ...
case ...
}
}
Into something like this:
public void ChangeColor(int RGBValue)
{
thisColor.{something-from-RGBValue} += 5;
}
The problems with this are:
The name of the method, ChangeColor, does not precisely describe what the method actually does. Perhaps this is an artifact of anonymization, but nevertheless it's a terrible name for the method.
The parameter, RGBValue, does not accurately describe what the argument is or does. The name RGBValue and the type int makes it sound like an actual RGB color value, i.e. 0x33ccff for a light blue. Instead it chooses which of R, G, or B will be set.
There are only 3 valid values for the parameter, and yet the range of possible values is completely unrestricted. This is a recipe for bugs. Worse, individual values are used as magic numbers inside the method.
But perhaps most important of all, the "clean/quick method" you are asking for is precisely the abstraction that this method purports to provide! You're writing a method that intensifies the hue, and in order to keep the method short, you're asking for... a method to intensify the hue. It doesn't make sense!
I can only assume that you want to do this because you have many different things you might want to do to a Color, for example:
public void Brighten(...) { ... }
public void Darken(...) { ... }
public void Desaturate(...) { ... }
public void Maximize(...) { ... }
And so on and so forth. And you're trying to avoid writing switch statements for all.
Fine, but don't eliminate the switch entirely; it is by far the most efficient and readable way to write this code! What's more important is to distill it down to one switch instead of many, and fix the other problems mentioned above. First, let's start with a reasonable parameter type instead of an int - create an enumeration:
public enum PrimaryColor { Red, Green, Blue };
Now, start from the idea that there may be many actions we want to perform on one of the primary colors of a composite color, so write the generic method:
protected void AdjustPrimaryColor(PrimaryColor pc, Func<byte, byte> adjustFunc)
{
    switch (pc)
    {
        case PrimaryColor.Red:
            internalColor.R = adjustFunc(internalColor.R);
            break; // C# does not allow falling through to the next case
        case PrimaryColor.Green:
            internalColor.G = adjustFunc(internalColor.G);
            break;
        default:
            Debug.Assert(pc == PrimaryColor.Blue,
                "Unexpected PrimaryColor value in AdjustPrimaryColor.");
            internalColor.B = adjustFunc(internalColor.B);
            break;
    }
}
This method is short, easy to read, and will likely never have to change. It is a good, clean method. Now we can write the individual action methods quite easily:
public void Brighten(PrimaryColor pc)
{
    // byte + int yields int, so cast back; clamp so the value cannot wrap around
    AdjustPrimaryColor(pc, v => (byte)Math.Min(v + 5, 255));
}
public void Darken(PrimaryColor pc)
{
    AdjustPrimaryColor(pc, v => (byte)Math.Max(v - 5, 0));
}
public void Desaturate(PrimaryColor pc)
{
    AdjustPrimaryColor(pc, v => 0);
}
public void Maximize(PrimaryColor pc)
{
    AdjustPrimaryColor(pc, v => 255);
}
The (significant) advantages to this are:
The enumeration type prevents callers from screwing up and passing in an invalid parameter value.
The general Adjust method is easy to read and therefore easy to debug and easy to maintain. It's also going to perform better than any reflection-based or dictionary-based approach - not that performance is likely a concern here, but I'm mainly saying this to note that it certainly isn't going to be worse.
You don't have to write repeated switch statements. Each individual modifier method is exactly one line.
Eventually, somewhere, you're actually going to have to write some code, and I would much rather that code be an extremely simple switch statement than a mess of reflection, delegates, dictionaries, etc. The key is to generalize this work as much as possible; once you've done that and created that abstraction, then you can start writing one-liner methods to do the "real" work.
It's a bit awkward, but you can pass a property 'by ref' like this:
int ThisColor { get; set; }
public void ChangeColor(Func<int> getter, Action<int> setter)
{
setter(getter() + 5);
}
public void SomeMethod()
{
ChangeColor(() => ThisColor, (color) => ThisColor = color);
}
This is less expensive than reflection and it's compile-time checked (with reflection, you'd have to pass a string to a GetProperty call and the string name could potentially diverge from the property name in later refactoring.)
I would tend to use a dictionary rather than what I suspect could end up being a large switch statement, so you could create a
Dictionary<int,Func<int,int>> map = new Dictionary<int, Func<int, int>>();
Each item in your dictionary then takes the input and returns the new value,
so in your method you would be able to call
public int ChangeColor(int rgbValue)
{
return map[rgbValue](rgbValue);
}
which will execute the delegate registered for the RGB value you pass in. To assign a delegate you simply add a new entry to the map:
map.Add(5,x => x+5);
If I understand you correctly, you'd like to write a method that takes some symbol (or property name) and modifies the property of the structure identified by this symbol. This isn't easily possible in C# (you could of course use reflection, but...).
You could do a similar thing using a Dictionary containing delegates for reading and writing the value of the property. However, that will still be a bit lengthy, because you'll need to initialize the dictionary. Anyway, the code might look like this:
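// Assumes Color is a mutable reference type with settable int properties R, G and B
var props = new Dictionary<string, Tuple<Func<Color, int>, Action<Color, int>>>
{
    // Explicit type arguments are needed because lambdas have no type of their own
    { "R", Tuple.Create<Func<Color, int>, Action<Color, int>>(c => c.R, (c, r) => c.R = r) },
    { "G", Tuple.Create<Func<Color, int>, Action<Color, int>>(c => c.G, (c, g) => c.G = g) },
    { "B", Tuple.Create<Func<Color, int>, Action<Color, int>>(c => c.B, (c, b) => c.B = b) }
};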
This creates a dictionary that contains string (name of the property) as the key and a tuple with getter delegate and setter delegate for each of the property. Now your ChangeColor method could look like this:
public void ChangeColor(string propName) {
var getSet = props[propName];
getSet.Item2(thisColor, getSet.Item1(thisColor) + 5);
}
The code would be more readable if you used your own type with Get property and Set property instead of Tuple with properties named Item1 and Item2. This solution may be useful in some scenarios, but you still need to explicitly list all the properties when initializing the dictionary.
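As a rough sketch of that more readable variant, the tuple can be replaced by a small type with named Get and Set members. The names Accessor, MutableColor and ColorHolder are hypothetical, and MutableColor is a class (not a struct) so that the setter modifies the shared instance rather than a copy.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable class standing in for the color type
public class MutableColor
{
    public int R, G, B;
}

// Hypothetical replacement for the Tuple: named getter and setter delegates
public sealed class Accessor<TTarget, TValue>
{
    public Func<TTarget, TValue> Get { get; }
    public Action<TTarget, TValue> Set { get; }

    public Accessor(Func<TTarget, TValue> get, Action<TTarget, TValue> set)
    {
        Get = get;
        Set = set;
    }
}

public class ColorHolder
{
    // Every property still has to be listed explicitly, once
    private static readonly Dictionary<string, Accessor<MutableColor, int>> Props =
        new Dictionary<string, Accessor<MutableColor, int>>
        {
            { "R", new Accessor<MutableColor, int>(c => c.R, (c, v) => c.R = v) },
            { "G", new Accessor<MutableColor, int>(c => c.G, (c, v) => c.G = v) },
            { "B", new Accessor<MutableColor, int>(c => c.B, (c, v) => c.B = v) }
        };

    public MutableColor ThisColor { get; } = new MutableColor();

    public void ChangeColor(string propName)
    {
        var prop = Props[propName]; // throws KeyNotFoundException for unknown names
        prop.Set(ThisColor, prop.Get(ThisColor) + 5);
    }
}
```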
This might be what you're looking for; you may want to add some error handling though.
It will work with any kind of property with public get and set accessors.
And if you want to, there are ways to reduce the use of "magic strings".
public static void ChangeProperty<T>(this object obj, string propertyName, Func<T,T> func)
{
var pi = obj.GetType().GetProperty(propertyName);
pi.SetValue(obj, func((T)pi.GetValue(obj, null)), null);
}
public void Change()
{
thisColor.ChangeProperty<int>("R", (x) => x + 5);
}
Well, it's kind of hard to tell what's really going on since you've given a very simplified example.
But, what I'm really reading is that you want to have a method that will perform one of a number of possible modifications to local state based upon one of the parameters of the method.
Now, is the operation the same, except for what it's being done to?
Ultimately, you have to have some code that understands how to map an input to a desired operation. How much that can be generalized depends upon how similar the actions are (if it's always 'add 5 to a property' you have more generalization options...).
Some options you have are:
Write a class which encapsulates the Color struct.
Use a lookup table of Actions, as suggested by Kev Hunter.
Write a switch statement.
Pass in a parameter which contains a virtual method which can be executed on the internal data (or just pass in an Action<> directly) - avoiding the lookup
And... that's about it, really. Which one of these makes the most sense probably depends more on your actual use case (which we don't really have a lot of info on) than anything else.

C#: Returning 'this' for method nesting?

I have a class on which I have to call one or two methods many times in a row. The methods currently return void. I was thinking, would it be better to have them return this, so that the calls could be chained? Or is that considered very, very bad? Or, if bad, would it be better if they returned a new object of the same type? Or what do you think? As an example I have created three versions of an adder class:
// Regular
class Adder
{
public Adder() { Number = 0; }
public int Number { get; private set; }
public void Add(int i) { Number += i; }
public void Remove(int i) { Number -= i; }
}
// Returning this
class Adder
{
public Adder() { Number = 0; }
public int Number { get; private set; }
public Adder Add(int i) { Number += i; return this; }
public Adder Remove(int i) { Number -= i; return this; }
}
// Returning new
class Adder
{
public Adder() : this(0) { }
private Adder(int i) { Number = i; }
public int Number { get; private set; }
public Adder Add(int i) { return new Adder(Number + i); }
public Adder Remove(int i) { return new Adder(Number - i); }
}
The first one can be used this way:
var a = new Adder();
a.Add(4);
a.Remove(1);
a.Add(7);
a.Remove(3);
The other two can be used this way:
var a = new Adder()
.Add(4)
.Remove(1)
.Add(7)
.Remove(3);
Where the only difference is that a in the first case is the new Adder() while in the latter it is the result of the last method.
The first one quickly becomes... annoying to write over and over again. So I would like to use one of the other versions.
The third works kind of like many other methods, like many String methods and IEnumerable extension methods. I guess that has its positive side in that you can do things like var a = new Adder(); var b = a.Add(5); and then have one that was 0 and one that was 5. But at the same time, isn't it a bit expensive to create new objects all the time? And when will the first object die? When the first method returns kind of? Or?
Anyways, I like the one that returns this and think I will use that, but I am very curious to know what others think about this case. And what is considered best practice.
The 'return this' style is sometimes called a fluent interface and is a common practice.
I like "fluent syntax" and would take the second one. After all, you could still use it as the first, for people who feel uncomfortable with fluent syntax.
another idea to make an interface like the adders one easier to use:
public Adder Add(params int[] i) { /* ... */ }
public Adder Remove(params int[] i) { /* ... */ }
Adder adder = new Adder()
.Add(1, 2, 3)
.Remove(3, 4);
I always try to make short and easy-to-read interfaces, but many people like to write the code as complicated as possible.
Chaining is a nice thing to have and is core in some frameworks (for instance Linq extensions and jQuery both use it heavily).
Whether you create a new object or return this depends on how you expect your initial object to behave:
var a = new Adder();
var b = a.Add(4)
.Remove(1)
.Add(7)
.Remove(3);
//now - should a==b ?
Chaining in jQuery will have changed your original object - it has returned this.
That's expected behaviour - to do otherwise would basically clone UI elements.
Chaining in Linq will have left your original collection unchanged. That too is expected behaviour - each chained function is a filter or transformation, and the original collection is often immutable.
Which pattern better suits what you're doing?
I think that for simple interfaces, the "fluent" interface is very useful, particularly because it is very simple to implement. The value of the fluent interface is that it eliminates a lot of the extraneous fluff that gets in the way of understanding. Developing such an interface can take a lot of time, especially when the interface starts to be involved. You should worry about how the usage of the interface "reads"; In my mind, the most compelling use for such an interface is how it communicates the intent of the programmer, not the amount of characters that it saves.
To answer your specific question, I like the "return this" style. My typical use of the fluent interface is to define a set of options. That is, I create an instance of the class and then use the fluent methods on the instance to define the desired behavior of the object. If I have a yes/no option (say for logging), I try not to have a "setLogging(bool state)" method but rather two methods "WithLogging" and "WithoutLogging". This is somewhat more work but the clarity of the final result is very useful.
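A minimal sketch of that options style, with a hypothetical JobOptions class: each fluent method sets one option and returns this, and the yes/no option is expressed as a WithLogging/WithoutLogging pair instead of a method taking a bool.

```csharp
using System;

// Hypothetical options class configured through fluent methods
public sealed class JobOptions
{
    public bool Logging { get; private set; }
    public int Retries { get; private set; }

    // Paired methods read more clearly at the call site than SetLogging(true/false)
    public JobOptions WithLogging()
    {
        Logging = true;
        return this;
    }

    public JobOptions WithoutLogging()
    {
        Logging = false;
        return this;
    }

    public JobOptions WithRetries(int count)
    {
        Retries = count;
        return this;
    }
}
```

A call site then reads as new JobOptions().WithLogging().WithRetries(3), which states the intent without the reader having to look up what a bare true argument means.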
Consider this: if you come back to this code in 5 years, is this going to make sense to you? If so, then I suppose you can go ahead.
For this specific example, though, it would seem that overloading the + and - operators would make things clearer and accomplish the same thing.
For your specific case, overloading the arithmetic operators would be probably the best solution.
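A rough sketch of the operator-based alternative, using a hypothetical immutable Number struct in place of Adder: each operator returns a new value, so the chain reads like ordinary arithmetic.

```csharp
using System;

// Hypothetical immutable value type replacing the Adder class
public struct Number
{
    public int Value { get; }

    public Number(int value)
    {
        Value = value;
    }

    // Each operator returns a new Number and leaves its operands unchanged
    public static Number operator +(Number n, int i)
    {
        return new Number(n.Value + i);
    }

    public static Number operator -(Number n, int i)
    {
        return new Number(n.Value - i);
    }
}
```

The sequence from the question then becomes var a = new Number(0) + 4 - 1 + 7 - 3; which leaves a.Value equal to 7.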
Returning this (Fluent interface) is common practice to create expressions - unit testing and mocking frameworks use this a lot. Fluent NHibernate is another example.
Returning a new instance might be a good choice, too. It allows you to make your class immutable - in general a good thing and very handy in the case of multithreading. But think about the object creation overhead if immutability is of no use for you.
If you call it Adder, I'd go with returning this. However, it's kind of strange for an Adder class to contain an answer.
You might consider making it something like MyNumber and create an Add()-method.
Ideally (IMHO), that would not change the number that is stored inside your instance, but create a new instance with the new value, which you return:
class MyNumber
{
...
MyNumber Add( int i )
{
return new MyNumber( this.Value + i );
}
}
The main difference between the second and third solution is that by returning a new instance instead of this you are able to "catch" the object in a certain state and continue from that.
var a = new Adder()
.Add(4);
var b = a.Remove(1);
var c = a.Add(7)
.Remove(3);
In this case both b and c have the state captured in a as a starting point.
I came across this idiom while reading about a pattern for building test domain objects in Growing Object-Oriented Software, Guided by Tests by Steve Freeman; Nat Pryce.
On your question regarding the lifetime of your instances: I would expect them to be eligible for garbage collection as soon as the invocation of Remove or Add returns.
