How to print object ID? - c#

I need to know if two references from completely different parts of the program refer to the same object.
I cannot compare the references programmatically because they come from different contexts (one reference is not visible from the other and vice versa).
So I want to print a unique identifier for each object using Console.WriteLine(). But the ToString() method doesn't return a unique identifier; it just returns the class name.
Is it possible to print unique identifier in C# (like in Java)?

The closest you can easily get (which won't be affected by the GC moving objects around etc) is probably RuntimeHelpers.GetHashCode(Object). This gives the hash code which would be returned by calling Object.GetHashCode() non-virtually on the object. This is still not a unique identifier though. It's probably good enough for diagnostic purposes, but you shouldn't rely on it for production comparisons.
EDIT: If this is just for diagnostics, you could add a sort of "canonicalizing ID generator" which was just a List<object>... when you ask for an object's "ID" you'd check whether it already existed in the list (by comparing references) and then add it to the end if it didn't. The ID would be the index into the list. Of course, doing this without introducing a memory leak would involve weak references etc, but as a simple hack this might work for you.
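A minimal sketch of such a diagnostic ID generator. Instead of the List<object> described above it uses ConditionalWeakTable, which compares keys by reference and does not keep them alive, so it sidesteps the memory leak. The names DebugIds and GetId are made up for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical helper: hands out a small, stable number per object instance.
public static class DebugIds
{
    // Holder class, because ConditionalWeakTable values must be reference types.
    private sealed class IdBox
    {
        public long Id;
    }

    // Keys are compared by reference and held weakly, so tracked objects can still be collected.
    private static readonly ConditionalWeakTable<object, IdBox> Ids =
        new ConditionalWeakTable<object, IdBox>();

    private static long _lastId;

    public static long GetId(object obj)
    {
        if (obj == null) throw new ArgumentNullException(nameof(obj));
        // Creates an entry the first time an object is seen, then reuses it.
        return Ids.GetValue(obj, _ => new IdBox { Id = Interlocked.Increment(ref _lastId) }).Id;
    }
}
```

Console.WriteLine(DebugIds.GetId(someObject)) then prints the same number for the same instance wherever it is called, and a different number for a different instance, even if the two compare equal via Equals.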

"one reference is not visible from another and vice versa"
I don't buy that. If you couldn't even get the handles, how would you get their IDs?
In C# you can always get handles to objects, and you can always compare them. Even if you have to use reflection to do it.

If you need to know whether two references point to the same object, I'll just cite this:
"By default, the operator == tests for reference equality. This is done by determining if two references indicate the same object. Therefore reference types do not need to implement operator == in order to gain this functionality."
So the == operator will do the trick without the ID workaround.
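One caveat worth illustrating: the quoted default only applies to types that do not overload ==. A small sketch using string (which does overload it) shows the difference, and shows object.ReferenceEquals as the check that always compares identity. The class name ReferenceCheckDemo is made up.

```csharp
using System;

public static class ReferenceCheckDemo
{
    public static void Main()
    {
        // Two distinct string objects with identical contents.
        string a = new string('x', 3);
        string b = new string('x', 3);

        Console.WriteLine(a == b);                       // True: string overloads == to compare contents
        Console.WriteLine(object.ReferenceEquals(a, b)); // False: two different objects
        Console.WriteLine(object.ReferenceEquals(a, a)); // True: the same object
    }
}
```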

I presume you're calling ToString on your object reference, but I'm not entirely clear on this or on the situation you describe, TBH, so just bear with me.
Does the type expose an ID property? If so, try this:
var idAsString = yourObjectInstance.ID.ToString();
Or, print directly:
Console.WriteLine(yourObjectInstance.ID);
EDIT:
I see Jon has seen right through this problem, which makes my answer look rather naive. Regardless, I'm leaving it in, if for nothing else than to emphasise the lack of clarity of the question. It may also provide an avenue to go down, based on Jon's statement that 'This [GetHashCode] is still not a unique identifier', should you decide to expose your own uniqueness by way of an identifier.

Related

Am I always dealing with the same object?

I'm working on a TCP socket related application, where an object I've created refers to a System.Net.Sockets.Socket object. That latter object seems to become null and, in order to understand why, I would like to check if my own object gets re-created. For that, I thought of the simplest possible approach: checking the memory address of this. However, when adding this to the watch window I get the following error message:
Name     Value
&this    error CS0211: Cannot take the address of the given expression
As it seems to be impossible to check the memory address of an object in C#, how can I verify that I'm dealing with the same or another object when debugging my code?
In C#, objects are moved during garbage collection. You can't simply take the address of an object, because the address changes when the GC heap is compacted.
Dealing with pointers in C# requires unsafe code and you leave the terrain of safe code, basically making it as unsafe as C++.
You can use a debugger like windbg, which displays the memory addresses of objects - but they will still change when GC moves them around.
If you want to see if a new instance of your class gets created, why not set a breakpoint in the constructor?
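I agree with @thomas's answer above.
You can add a unique identifier (such as a GUID) property to your object and use that to determine whether you have the same object.
You could also override the Equals method (together with GetHashCode) to compare two objects by that identifier, as below.
using System;

public class MyClass
{
    // Assigned once at construction; unique per instance.
    public Guid Id { get; } = Guid.NewGuid();
    public override bool Equals(object obj)
    {
        return obj is MyClass second && this.Id == second.Id;
    }
    // Keep GetHashCode consistent with Equals.
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}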
As already explained, addresses of objects are not a viable means of reasoning about objects in garbage-collected virtual machines like .NET. In .NET you may get the chance to observe the address of an object if you use the fixed keyword, unsafe blocks, or GCHandle.Alloc(), but these are all very hacky, and they pin objects in memory so the garbage collector cannot move them, which is something that you absolutely do not want. The moment you unpin an object, its address is free to change, so you cannot keep track of it.
Luckily, you do not need any of that!
You don't need addresses, because all you want is a mnemonic for each object, for the purpose of identifying it during troubleshooting. For this, you have the following options:
Create a singleton which issues unique ids, and in the constructor of each object invoke this singleton to obtain a unique id, store the id with the object, and include the id in the ToString() method of the object, or in whatever other method you might be using for debug display.
Use the System.Runtime.Serialization.ObjectIDGenerator class, which does more or less what the singleton id generator would do, but in a more advanced, and possibly easier to use way. (I have no personal experience using it, so I cannot give any more advice about it.)
Use the System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode(object) method, which returns what is known in other circles as the identity hash code of an object. It is guaranteed to remain unchanged throughout the lifetime of the object, but it is not guaranteed to be unique among all objects. However, with at most 32 bits to draw from, a collision among the handful of objects you are watching while troubleshooting is unlikely (collisions only become plausible once many thousands of objects are involved), so it will serve your troubleshooting purposes just fine.
Do yourself a favor and display the Identity Hash Code of your objects in hexadecimal; the number will be shorter, and will have a wider variety of digits than decimal, so it will be easier to retain in short-term memory while troubleshooting.
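A rough sketch combining the first and third options: each instance takes a sequential id when it is constructed and reports it, together with its identity hash code in hexadecimal, from ToString(). TrackedConnection is a hypothetical class standing in for whatever you are troubleshooting.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical class standing in for the object being troubleshot.
public class TrackedConnection
{
    private static int _lastId;

    // Assigned once per instance; Interlocked keeps it unique across threads.
    private readonly int _id = Interlocked.Increment(ref _lastId);

    public override string ToString()
    {
        // X8 formats the identity hash code as eight hexadecimal digits.
        return string.Format("TrackedConnection#{0} (identity hash 0x{1:X8})",
            _id, RuntimeHelpers.GetHashCode(this));
    }
}
```

Logging the object with Console.WriteLine(connection) then shows at a glance whether two log lines refer to the same instance.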

Get original value from HashSet

UPDATE: Starting with .NET Framework 4.7.2, HashSet<T>.TryGetValue is available (see the docs and the related SO post).
I have a problem with HashSet because it does not provide any method similar to TryGetValue known from Dictionary. And I need such method -- passing element to find in the set, and set returning element from its collection (when found).
Sidenote -- "why do you need element from the set, you already have that element?". No, I don't, equality and identity are two different things.
HashSet is not sealed but all its fields are private, so deriving from it is pointless. I cannot use Dictionary instead because I need SetEquals method. I was thinking about grabbing a source for HashSet and adding desired method, but the license is not truly open source (I can look, but I cannot distribute/modify). I could use reflection but the arrays in HashSet are not readonly meaning I cannot bind to those fields once per instance lifetime.
And I don't want to use full blown library for just single class.
So far I am stuck with LINQ SingleOrDefault. So the question is how to fix this: how can I have a HashSet with TryGetValue?
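You should probably switch from a HashSet to a SortedSet.
There is a simple TryGetValue() for a SortedSet:
// Assumes a SortedSet<T> field named sortedSet in the enclosing class; First() needs System.Linq.
public bool TryGetValue(ref T element)
{
    // The view between element and element contains only items that compare equal to it.
    var foundSet = sortedSet.GetViewBetween(element, element);
    if (foundSet.Count == 1)
    {
        element = foundSet.First();
        return true;
    }
    return false;
}
When called, the element only needs to have the properties set that are used by the comparer. On success, the element found in the set is returned through the ref parameter.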
I agree this is something which is basically missing. While it's only useful in rare cases, I think they're significant rare cases, most notably key canonicalization.
I can only think of one suggestion at the moment, and it's truly foul.
You can specify your own IEqualityComparer<T> when creating a HashSet<T> - so create one which remembers the arguments to the last positive (i.e. true-returning) Equals comparison it has performed. You can then call Contains, and see what the equality comparer was asked to compare.
Caveats:
This holds on to references unnecessarily, so could end up preventing objects being garbage collected
You'd potentially want to do this on a per-thread basis (if you've got a set that isn't modified after initialization, but is then read by multiple threads, for example)
It assumes that HashSet<T> doesn't use any optimization such as "if the references are equal, don't bother consulting the equality comparer"
It's fundamentally a horrible abuse
I've been trying to think of other alternatives in terms of finding intersections, but I haven't got anywhere yet...
As noted in comments, it would be worth encapsulating this as far as possible - I suspect you only need a very limited set of operations, so I'd wrap a HashSet<T> in your own class and only expose the operations you really need - that way you get to clear the "cache" after each operation, removing my first objection above.
It still feels like a horrible abuse to me, but...
As others have suggested, an alternative would be to use a Dictionary<TKey, TValue> and implement SetEquals yourself. That would be simple enough to do - and again, you'd want to encapsulate this in your own type. Either way, you should probably design the type itself first, and then implement it using either a HashSet<> or a Dictionary<,> as an implementation detail.
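A minimal sketch of that last suggestion, assuming a wrapper type of your own (CanonicalSet is a made-up name) that stores each element as both key and value of a Dictionary and implements SetEquals by hand. Only the few operations discussed here are exposed.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper: a set that can hand back the instance it actually stores.
public sealed class CanonicalSet<T>
{
    private readonly Dictionary<T, T> _items;

    public CanonicalSet(IEqualityComparer<T> comparer = null)
    {
        _items = new Dictionary<T, T>(comparer ?? EqualityComparer<T>.Default);
    }

    public int Count { get { return _items.Count; } }

    // Returns false (and keeps the existing instance) if an equal item is already present.
    public bool Add(T item)
    {
        if (_items.ContainsKey(item)) return false;
        _items.Add(item, item);
        return true;
    }

    // Looks up by equality and returns the stored instance.
    public bool TryGetValue(T equalValue, out T actualValue)
    {
        return _items.TryGetValue(equalValue, out actualValue);
    }

    // True when both contain the same elements, ignoring order and duplicates in 'other'.
    public bool SetEquals(IEnumerable<T> other)
    {
        var otherSet = new HashSet<T>(other, _items.Comparer);
        if (otherSet.Count != _items.Count) return false;
        foreach (T item in otherSet)
        {
            if (!_items.ContainsKey(item)) return false;
        }
        return true;
    }
}
```

Note that, unlike HashSet<T>, a Dictionary does not accept null keys, so this sketch throws if a null element is added.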
Sounds like you're trying to use the wrong tool. True, you can save some memory using a HashSet, but it seems to me that you are trying to achieve a different goal: get the actual element that is merely equal to a representation.
So in reality they are two different elements. Just the memento (a unique representation) is equal.
Therefore you'd be better off using a Dictionary where you add your elements as both Key and Value. That way you can get the identical element back, but you lose SetEquals...
I suppose SetEquals does little more than check that both sets have the same number of elements and that every element of one is contained in the other.
A simple SequenceEqual() (LINQ) over the two Keys collections is tempting, but it only works when both dictionaries enumerate their keys in the same order, which Dictionary does not guarantee.
So this extension method could do:
public static bool SetEqual<T, G>(this IDictionary<T, G> d, IDictionary<T, G> e)
{
    // Same number of keys, and every key of d is also a key of e (All needs System.Linq).
    return d.Count == e.Count && d.Keys.All(e.ContainsKey);
}
This should work, because a Dictionary is basically a HashSet with an associated value, and it is more appropriate to your problem. (To be correct, both dictionaries should be built with the same comparer, because ContainsKey uses the comparer of the dictionary it is called on.)
If you need an IEnumerable<> as the second parameter, build a HashSet<> from it first (not so efficient).
Finally added in .NET 4.7.2:
HashSet.TryGetValue(T, T) Method
An SO post with more details
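A short sketch of the built-in method, using a case-insensitive string set so that the probe and the stored element are equal but not identical. The class and variable names are made up.

```csharp
using System;
using System.Collections.Generic;

public static class TryGetValueDemo
{
    public static void Main()
    {
        var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Alpha" };

        string stored;
        if (names.TryGetValue("ALPHA", out stored))
        {
            // Prints the element held by the set, not the probe value.
            Console.WriteLine(stored); // Alpha
        }
    }
}
```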
Hopefully I'm not blind, but I haven't seen this answer anywhere. If you want Dictionary's TryGetValue, you can just steal it (at the cost of building a dictionary on each call).
theHashset.ToDictionary(item => item.ID).TryGetValue(key, out value)
All you need is a quick lambda for determining unique keys.

Should I return a collection when the reference to the collection is not changed?

I have a method which accepts a collection, as below:
public IList<CountryDto> ApplyDefaults(IList<CountryDto> dtos)
{
    // Iterates the collection
    // Validates the items in the collection
    // If items are invalid,
    // removes them, e.g. dtos.Remove(currentCountryDto)
    return dtos; // Do I need to do this?
}
My question is: since the reference to the collection is not changed, should I return the collection again from the method?
For: By returning the collection, I make it explicit in the signature, and the user is aware that the items in the collection could be different from the original source. It sort of avoids ambiguity.
Against: Since the validation doesn't change the reference to the collection, it doesn't make sense technically to return it.
What is the best approach in this case?
Note: I am not sure if this question is opinion based. I think I am probably missing something here on the design side.
In every programming language, consistency of your own code or library with the approach of the core libraries is of high value. Hence, looking at how Collections.sort(), Collections.swap() and Collections.shuffle() are defined (in Java), I would suggest not returning the input parameter if you intend to modify it. In addition, your method should be named in such a way that it is obvious the input parameter gets modified. Otherwise the modification will come across as an unexpected side effect.
Returning a value most often suggests that it is a new instance which reflects the work, performed by the method or is used for method-chaining in case of builders.
Given your comments/requirements:
Does not need to report if defaults are applied.
ApplyDefaults is complicated and invoking other services and not intended to produce a fluent API
ApplyDefaults is a "black box"; validation logic is injected so the calling code doesn't know/care about the validation
I think based on these, this method definitely should not return the reference to the incoming list, even if no validation is applied. Firstly, unless the API is clearly built around method chaining (which you indicated you do not want), returning a List<T> type usually indicates a new List is being created. Secondly, if a new list is not created, users may find themselves modifying the list in ways they didn't expect.
Consider:
IList<CountryDto> originalCountries = Service.GetCountries();
IList<CountryDto> validatedCountries = ApplyDefaults(originalCountries);
validatedCountries.Add(mySpecialCountry);
OutputOriginalCountries(originalCountries);
OutputValidatedCountries(validatedCountries);
This code isn't very special, and is a fairly common pattern. If ApplyDefaults returned a reference to the same originalCountries collection, then mySpecialCountry would also be added to originalCountries. This would violate the Principle of Least Astonishment.
This would be exacerbated if this behaviour changed depending on whether or not items were validated/filtered. Since the validation logic is a black-box of behaviour that the caller doesn't know or care about, the API consumer could not depend on whether or not it returned the same reference. They would either have to do their own reference check (e.g., if (myValidatedCountries == myInputCountries)), or simply make a copy every time. Regardless, this becomes another weird behaviour that the programmer has to juggle when working with the API.
I think that the method should either:
A) always return a copied list with the items filtered out (public IList<CountryDto> ApplyDefaults(IEnumerable<CountryDto> dtos))
B) modify the incoming list in-place (public void ApplyDefaults(IList<CountryDto> dtos))
For option A, depending on the size of your list, this incurs the possibly unnecessary work of creating a copied list every time, even if no filtering is performed. However, the validation/filtering logic might be simpler. You might be able to use LINQ queries to apply the filtering nicely. Additionally, removing items from a list is generally costly, as the remaining elements of the internal array have to be shifted. So it might actually be faster to build a new list. You may even simplify the return type here to IEnumerable<CountryDto>; this allows for wider usage and makes it extremely obvious that you're creating a new collection.
For option B, if no validation is required, then no work is done and the method is essentially "free" (no array rebuilding, no copying, no reference changes). But if there is significant validation, the removal aspect may be costly. Since you're not method chaining, this version should have a void return type as it's much more obvious to the developer that this is modifying the list in-place. This follows other commonly known methods like List<T>.Sort. Furthermore, if a user wants to have a separate originalCountries and validatedCountries they can always make a copy:
var validatedCountries = originalCountries.ToList();
ApplyDefaults(validatedCountries);
Ultimately, which one you choose might depend on performance. If validation/removal is cheap and rare, then modifying the list in-place might be best. If you're expecting a lot of changes to the list, it might simply be faster to produce a new copy every time.
Regardless, I would suggest you name the method with a little more clarity as well. For example:
public IList<CountryDto> GetValidCountries(IEnumerable<CountryDto> dtos)
public void RemoveInvalidCountries(IList<CountryDto> dtos)
Of course, the naming might be different depending on your actual code context (I suspect ApplyDefaults is a common/inherited method name and not specific to CountryDto)
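The two options could be sketched as follows. CountryDto's Code property and the IsValid rule are placeholders invented for the example; the real validation logic is injected elsewhere, as described in the question.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical DTO and validity rule, only for illustration.
public class CountryDto
{
    public string Code { get; set; }
}

public static class CountryValidation
{
    private static bool IsValid(CountryDto dto)
    {
        return dto != null && !string.IsNullOrEmpty(dto.Code);
    }

    // Option A: leaves the input untouched and always returns a new list.
    public static IList<CountryDto> GetValidCountries(IEnumerable<CountryDto> dtos)
    {
        return dtos.Where(IsValid).ToList();
    }

    // Option B: filters the caller's list in place and returns nothing.
    public static void RemoveInvalidCountries(IList<CountryDto> dtos)
    {
        // Walk backwards so removals do not shift the items still to be visited.
        for (int i = dtos.Count - 1; i >= 0; i--)
        {
            if (!IsValid(dtos[i])) dtos.RemoveAt(i);
        }
    }
}
```

With option A the caller ends up with two independent lists; with option B the void return type signals that the list passed in is the one that changes.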
I'd rather return boolean (or enum in an elaborated case: collection preserved intact, changed, can't be validated etc.)
// true if the collection is changed, false otherwise
public Boolean ApplyDefaults(IList<CountryDto> dtos) {
    Boolean result = false;
    // Iterates the collection
    // Validates the items in collection
    // If items are invalid:
    //   Removes items e.g dtos.Remove(currentCountryDto)
    //   result = true;
    ...
    return result;
}
...
if (ApplyDefaults(myData)) {
    // Collection is changed, do some extra stuff
}
First of all: you cannot change which collection the caller's variable refers to, because by default the method receives a copy of the reference. You'd need to use the ref keyword in order to be able to change it.
Secondly: if your method has a return type, then it has to return a value. Your method is not called GetNewCollectionWithAppliedDefaults, but ApplyDefaults, which implies that the collection will be modified. You should either return a boolean true/false to inform the user that changes were made, or always return the parameter's collection (to allow nested method calls).
Also, why would you think it doesn't make sense to return a collection? I'd say there's no argument against it. Turn the question around: "why wouldn't I return the collection and could it harm my code"?
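The first point can be shown with a small sketch (all names are made up): mutating the list through the parameter is visible to the caller, reassigning the parameter is not, unless it is passed with ref.

```csharp
using System;
using System.Collections.Generic;

public static class ParameterPassingDemo
{
    // The parameter is a copy of the caller's reference: mutation is visible to the caller.
    public static void Mutate(IList<int> list) { list.Add(42); }

    // Reassigning the copy does not affect the caller's variable.
    public static void Reassign(IList<int> list) { list = new List<int>(); }

    // With ref, the caller's variable itself is replaced.
    public static void ReassignByRef(ref IList<int> list) { list = new List<int>(); }

    public static void Main()
    {
        IList<int> numbers = new List<int> { 1 };

        Mutate(numbers);
        Console.WriteLine(numbers.Count); // 2

        Reassign(numbers);
        Console.WriteLine(numbers.Count); // 2: the caller still has the original list

        ReassignByRef(ref numbers);
        Console.WriteLine(numbers.Count); // 0: the caller's variable now points at the new list
    }
}
```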
Technically, I would say there is not much difference between the two.
However, as you pointed out, a commonly used convention is that a function should only return an object it creates. Basically, that would mean that a function that returns an object is generating one, while a function which doesn't return anything is modifying the object passed as a parameter.
Again, this is only a convention and it is not widely used within the C# community, but in the python community for example, it is.
Some people return a Boolean (or an error code) instead, as an indicator of an error (like the old DOS command line). I don't like this approach and by far prefer raising exceptions that I can handle later on.
Finally, the best approach in my regard, is to return a value that indicates if a change was done by the function and eventually a value indicating how much of a change was done. It can be a Boolean or it can be the number of inserted/removed elements...
In any case, try to be consistent with the approach you choose, if not in all your code, at least within a single project. Sometimes you will have no other choice but to abide by the convention used by your teammates.
(My answer is based on the Java viewpoint; C++ and C# programmers might have a different take.) I think it's best to return the collection. The fact that the collection you're returning is the same collection that was given is just an implementation detail, and in future versions of the code, you might want to change that. Document that the collection returned might not be the same one passed in.
If, on the other hand, you want to lock in the design that this method modifies a collection in place, document it that way and don't return the collection. I prefer not to do it this way, but I can see advantages in some contexts.
In your case I would leave it void, since ApplyDefaults clearly states what it's doing.
Also, it might be a good idea to put ApplyDefaults on the collection itself. Subclass List (or implement IList, or whatever fits) and then you'd call it like this:
myCollection.ApplyDefaults();
Which is just obvious.

c# - Returning default values for null properties, when the parent of these properties can or can not be null

So I didn't find any elegant solution for this, either googling or throughout stackoverflow. I guess that I have a very specific situation in my hands, anyway here it goes:
I have an object structure, which I don't have control of, because I receive it from an external WS. This is quite a huge object, with various levels of fields and properties, and these fields and properties may or may not be null, at any level. You can think of this object as an anemic model: it doesn't have behaviour, just state.
For the purpose of this question, I'll give you a simplified sample that simulates my situation:
Class A
    PropB1
        PropC11
            PropLeaf111
        PropC12
            PropLeaf112
    PropB2
        PropC21
            PropLeaf211
        PropC22
            PropLeaf221
So, throughout my code I have to access a number of these properties, at different levels, to do some math in order to calculate what I need. Basically, for each type of calculation that I have to do, I have to test each level of the properties that I need to check whether it is null, in which case I would return (decimal) 0, or any other default value depending on the business logic.
Sample of a math that I have to do with it:
decimal value = 0m;
if (objClassA.PropB1 != null && objClassA.PropB1.PropC11 != null) {
    var leaf = objClassA.PropB1.PropC11.PropLeaf111;
    // Keep the default when the nullable leaf has no value.
    value = leaf.HasValue ? leaf.Value : value;
}
Just to be very clear, the leaf properties of this structure would always be primitives, or nullable primitives, in which case I give them the proper treatment. This is "the logic" that I have to apply for each property that I need, and sometimes I have to use quite a few of them. Also, the real structure is quite a bit bigger, so the number of checks that I would need to do for each necessary property would also be bigger.
Now, I came up with some ideas, none of them I think is ideal:
Create methods to gather the properties, where it would abstract any necessary verification, or the logic to get default values. The drawback is that it would have, in my opinion, quite some duplicated code, since the verifications and the default values would be similar for some groups of fields.
Create a single generic method, which receives an object and a lambda function that accesses the required field. This method would try to execute the function and return its result, and in case of a NullReferenceException, it would return a default value. The bright side of this one is that it is really generic: I just have to pass lambdas to access the properties, and the method would handle any problem. The drawback is that I am using try -> catch to control logic, which is not its purpose, and the code might look confusing to other programmers who eventually have to maintain it.
Null Object Pattern: this would be the most elegant solution, I guess. It would have all the good points if this was a normal case. But the problem is the impact of providing Null Objects for this structure. Just to give a bit more context, the software that I am working on integrates with government services, and the structure that I am working with, which is in the government's specifications, has some fields where null has a meaning that is different from a default value like "0". Also, this specification changes from time to time, and the classes are generated again, so the post-processing that I would have to do to create Null Objects would also need maintenance, which seems a bit dangerous to me.
I hope that I made myself clear enough.
Thanks in advance.
Solution
This is a response as to how I solved my problem, based on the accepted answer.
I'm quite new to C#, and the kind of discussion that was linked really helped me to come up with an elegant solution in many aspects. I still have the problem that, depending on where the code is executed, it uses .NET 2.0, but I also found a solution for this problem, where I can somewhat define extension methods: https://stackoverflow.com/a/707160/649790
And for the solution itself, I found this one the best:
http://www.codeproject.com/Articles/109026/Chained-null-checks-and-the-Maybe-monad
I can basically access the properties this way, and just do the math:
objClassA.With(o => o.PropB1).With(o => o.PropC11).Return(o => o.PropLeaf111, 0);
For each property that I need. It still isn't just:
objClassA.PropB1.PropC11.PropLeaf111
of course, but it is far better than any solution that I had found so far. Since I was unfamiliar with extension methods, I really learned a lot.
Thanks again.
There is a strategy for dealing with this, involving the "Maybe" Monad.
Basically it works by providing a "fluent" interface where the chain of properties is interrupted by a null somewhere along the chain.
See here for an example: http://smellegantcode.wordpress.com/2008/12/11/the-maybe-monad-in-c/
And also here:
http://www.codeproject.com/Articles/109026/Chained-null-checks-and-the-Maybe-monad
http://mikehadlow.blogspot.co.uk/2011/01/monads-in-c-5-maybe.html
It's related to but not quite the same as what you seem to need; however, perhaps it can be adapted to your needs. The concepts are fairly fundamental.
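A minimal sketch of what such chaining helpers might look like. The With and Return names follow the usage shown in the question's solution; the bodies and the A, B and C classes are assumptions written for this example, not the code from the linked articles.

```csharp
using System;

// Hypothetical helpers; names follow the question's usage, implementation is assumed.
public static class MaybeExtensions
{
    // Applies 'selector' unless the input is null, in which case null is propagated.
    public static TResult With<TInput, TResult>(this TInput input, Func<TInput, TResult> selector)
        where TInput : class
        where TResult : class
    {
        return input == null ? null : selector(input);
    }

    // Ends the chain: evaluates 'selector', or yields 'fallback' if the chain hit null.
    public static TResult Return<TInput, TResult>(this TInput input, Func<TInput, TResult> selector, TResult fallback)
        where TInput : class
    {
        return input == null ? fallback : selector(input);
    }
}

// Simplified stand-ins for the generated classes.
public class C { public decimal? PropLeaf111 { get; set; } }
public class B { public C PropC11 { get; set; } }
public class A { public B PropB1 { get; set; } }

public static class MaybeDemo
{
    public static void Main()
    {
        var objClassA = new A(); // PropB1 is null, so the chain stops early

        decimal value = objClassA
            .With(o => o.PropB1)
            .With(o => o.PropC11)
            .Return(o => o.PropLeaf111 ?? 0m, 0m);

        Console.WriteLine(value); // 0
    }
}
```

If your compiler supports C# 6 or later, the null-conditional operator gives the same result without helpers: objClassA?.PropB1?.PropC11?.PropLeaf111 ?? 0m.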

Update properties of objects in an IEnumerable<>

I am working on some software that should be used for a special type of experiment.
The experiments are performed using:
1) A "Chip" (basically an XY grid of known dimensions).
2) Each Chip contains "Electrodes", identified by their X and Y coordinate on the chip and by a unique ID. Each electrode can also hold or not hold a sample, indicated by a simple bool (it's a biosensing chip).
I have objects that represent this hardware in C#.
I now need to use the hardware in an experiment;
1) I have an "Experiment" which exposes an IEnumerable holding "ExperimentStep" objects.
2) An "ExperimentStep" holds a name, and a limited list of "Electrodes" involved among other things.
Some experiment steps could run concurrently and could change the "HasSample" property of an electrode. Therefore, when I perform an "ExperimentStep" it might be good to know what the initial "HasSample" property is at any one time.
This is where my problem lies;
If I just pass "Electrode" objects to my "ExperimentStep" they will probably be passed by value... Is it possible to create an IEnumerable that holds references to the unique electrodes, so that each time I want to run an "ExperimentStep" the list of "Electrodes" used in that step holds the most recent value for "HasSample"? Should I use pointers for this? From my limited knowledge of C++ I would expect that this would be trivial in that language (since you work with pointers most of the time). But in C# I really have no clue (and I am not experienced enough).
In C#, a class is a reference type. This means that if you create a list of instances of a class and then also add those instances to another list, both lists contain the same items: each list holds references to them. So to that effect, yes, you can update the items using an IEnumerable.
I suspect you don't understand the difference between reference types and value types, and what pass by value really means in C#.
Assuming Electrode is a class, you can modify the properties of an instance of it and those changes will be visible via any reference to the same object.
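A small sketch of that behaviour, assuming Electrode is a class with the Id and HasSample members described in the question (the exact shape is a guess):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's Electrode class.
public class Electrode
{
    public int Id { get; set; }
    public bool HasSample { get; set; }
}

public static class ReferenceTypeDemo
{
    public static void Main()
    {
        var electrode = new Electrode { Id = 1, HasSample = false };

        // Both collections store references to the same Electrode object.
        var chipElectrodes = new List<Electrode> { electrode };
        IEnumerable<Electrode> stepElectrodes = new List<Electrode> { electrode };

        chipElectrodes[0].HasSample = true;

        foreach (Electrode e in stepElectrodes)
        {
            Console.WriteLine(e.HasSample); // True: the step sees the updated state
        }
    }
}
```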
I strongly recommend you make sure you have a firm understanding of the .NET type system before trying to develop a lot of production code. The consequences of not understanding what's going on can be disastrous.
A couple of my articles on these topics:
Reference types and value types in .NET
Parameter passing in C#
... but I suggest you get a good introductory C# book as well.
