Reflection, contravariance and polymorphism - C#

I have a base class (abstract) with multiple implementations, and some of them contain collection properties of other implementations - like so:
class BigThing : BaseThing
{
    /* other properties omitted for brevity */
    List<SquareThing> Squares { get; set; }
    List<LittleThing> SmallThings { get; set; }
    /* etc. */
}
Now sometimes I get a BigThing and I need to map it to another BigThing, along with all of its collections of BaseThings. However, when this happens, I need to be able to tell whether a BaseThing in a collection from the source BigThing is a new BaseThing, and thus should be Add()-ed to the destination BigThing's collection, or an existing BaseThing that should be mapped to one of the BaseThings already in the destination collection. Each implementation of BaseThing has a different set of matching criteria on which its newness should be evaluated. I have the following generic extension method to evaluate this:
static void UpdateOrCreateThing<T>(this T candidate, ICollection<T> destinationEntities) where T : BaseThing
{
    var thingToUpdate = destinationEntities.FirstOrDefault(candidate.ThingMatchingCriteria);
    if (thingToUpdate == null)
    {
        /* Create new thing and add to destinationEntities */
    }
    else
    {
        /* Map thing */
    }
}
Which works fine. However, I think I am getting lost with the method that deals in BigThings. I want to make this method generic because there are a few different kinds of BigThing, and I don't want to write methods for each; and if I add collection properties, I don't want to have to change my methods. I have written the following generic method that makes use of reflection, but it is not working as expected:
void MapThing<T>(T sourceThing, T destinationThing) where T : BaseThing
{
    // Take care of first-level properties
    Mapper.Map(sourceThing, destinationThing);

    // Now find all properties which are collections
    var collectionPropertyInfo = typeof(T).GetProperties()
        .Where(p => typeof(ICollection).IsAssignableFrom(p.PropertyType))
        .ToArray();

    // Get property values for source and destination
    var sourceProperties = collectionPropertyInfo.Select(p => p.GetValue(sourceThing)).ToArray();
    var destinationProperties = collectionPropertyInfo.Select(p => p.GetValue(destinationThing)).ToArray();

    // Now loop through collection properties and call the extension method on each item
    for (int i = 0; i < collectionPropertyInfo.Length; i++)
    {
        // These casts make me suspicious, although they do work and the values are retained
        var thisSourcePropertyCollection = sourceProperties[i] as ICollection;
        var sourcePropertyCollectionAsThings = thisSourcePropertyCollection.Cast<BaseThing>();

        // Repeat for destination properties
        var thisDestinationPropertyCollection = destinationProperties[i] as ICollection;
        var destinationPropertyCollectionAsThings = thisDestinationPropertyCollection.Cast<BaseThing>();

        foreach (BaseThing thing in sourcePropertyCollectionAsThings)
        {
            thing.UpdateOrCreateThing(destinationPropertyCollectionAsThings);
        }
    }
}
This compiles and runs, and the extension method runs successfully (matching and mapping as expected), but the collection property values in destinationThing remain unchanged. I suspect I have lost the reference to the original destinationThing properties with all the casting and assigning to other variables and so on. Is my approach here fundamentally flawed? Am I missing a more obvious solution? Or is there some simple bug in my code that's leading to the incorrect behavior?

Without thinking too much, I'd say you have fallen into an inheritance-abuse trap, and now, trying to save yourself, you might want to consider how you can solve your problem while ditching the existing design that led you to do such things in the first place. I know, this is painful, but it's an investment in the future :-)
That said,
var destinationPropertyCollectionAsThings =
    thisDestinationPropertyCollection.Cast<BaseThing>();
foreach (BaseThing thing in sourcePropertyCollectionAsThings)
{
    thing.UpdateOrCreateThing(destinationPropertyCollectionAsThings);
}
You are losing your ICollection when you use the LINQ Cast operator, which creates a new IEnumerable<BaseThing>. You can't lean on variance either, because ICollection<T> is invariant. If it weren't, you could get away with as ICollection<BaseThing>, which would be nice.
Instead, you have to build the generic method call dynamically and invoke it. The simplest way is probably to use the dynamic keyword and let the runtime figure it out, as such:
thing.UpdateOrCreateThing((dynamic)thisDestinationPropertyCollection);
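If you'd rather not use dynamic, the same late binding can be done explicitly with MakeGenericMethod. A rough sketch, assuming the extension method lives in a static class here called ThingExtensions (a hypothetical name):

// Sketch: close UpdateOrCreateThing<T> over the collection's element type at
// runtime, then invoke it against the original (un-cast) collection instance.
var elementType = thisDestinationPropertyCollection.GetType()
    .GetInterfaces()
    .First(itf => itf.IsGenericType && itf.GetGenericTypeDefinition() == typeof(ICollection<>))
    .GetGenericArguments()[0];

var updateOrCreate = typeof(ThingExtensions) // hypothetical class holding the extension method
    .GetMethod("UpdateOrCreateThing", BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic)
    .MakeGenericMethod(elementType);

foreach (var thing in sourcePropertyCollectionAsThings)
{
    updateOrCreate.Invoke(null, new object[] { thing, thisDestinationPropertyCollection });
}

Either way, the key point is the same: the call has to be bound against the collection's real element type at runtime, against the original collection reference, so the mutations stick.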

Related

Merging C# objects with rules

I have some rather large files around that I would like to load in automatically with deserialization and then simply merge together, if possible maintaining memory references in a central object to make the merge as benign as possible.
Merges seem to be anything but simple, though. In concept it seems easy:
If there are nulls, use the non-null values.
If there are conflicting objects, dig deeper and merge their internals, or just prefer one; maybe write custom code just for those classes.
If there are collections, combine them; if there are keys, like in a dictionary, try to add them, and when there is a key conflict, as before, merge them or prefer one.
I've seen a lot of people around Stack recommending I use AutoMapper to try and achieve this, though that seems flawed. AutoMapper isn't made for this, and the overall task doesn't seem complex enough to warrant it. It's also not great encapsulation to put all your class-specific merge code anywhere but in that class. Data pertaining to a given aspect of your code should sit in a central location, like a class object, to help programmers understand the usage of the data structure around them. So I don't feel AutoMapper is a good solution for merging objects, as opposed to simply keeping one.
How would you recommend automating the merge of two structurally identical C# objects with nested hierarchies of custom classes?
I will post my own solution as well, but I encourage other developers, certainly many more intelligent than I am, to recommend solutions.
While @JodySowald's answer describes a nice generic approach, merging sounds to me like something that could involve an awful lot of class-specific business logic.
I would simply add a MergeWith method to each and every class in my hierarchy, down to a level where "merging" means a simple, repeatable, generic operation.
class A
{
    string Description;
    B MyB { get; set; }

    public void MergeWith(A other)
    {
        // Simple local logic
        Description = string.IsNullOrWhiteSpace(Description) ? other.Description : Description;
        // Let class B do its own logic (B.MergeWith returns the merged B)
        MyB = MyB.MergeWith(other.MyB);
    }
}
I think that in ~70% of use cases, someone will have a large hierarchical structure of many classes in a class library and will wish to merge the whole hierarchy at once. For that purpose I think the code should iterate across the properties of the object and the nested properties of subclasses, but only the ones defined in the assembly you've created. There's no merging the internals of System.String; who knows what could happen. So only types internal to this assembly should be dug into for further merging:
var internalTypes = Assembly.GetExecutingAssembly().DefinedTypes;
We also need a way to define custom code on a given class; there are always edge cases. I believe this is what interfaces were created for: to generically define functionality for several classes while letting each class supply its own implementation. But I found that if merging requires knowledge of the data hierarchically above this class, such as the key it is stored with in a dictionary, or perhaps an enum indicating the types or modes of data present, a reference to the containing data structure should be available. So I defined a quick interface, ICombinable:
internal interface ICombinable
{
    /// <summary>
    /// Use values on the incoming object to set correct values on the current object.
    /// Note that merging comes after the individual DO has been loaded and populated
    /// as necessary, and is the last step in adding the objects to the central DO
    /// that already exists.
    /// </summary>
    /// <param name="containingDO">the data object containing this one</param>
    /// <param name="incomingObj">an object from the class being merged in</param>
    /// <returns>the merged object</returns>
    ICombinable Merge(object containingDO, ICombinable incomingObj);
}
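For illustration, a minimal implementation on a hypothetical leaf class (Measurement is not part of the original code) might look like this:

internal class Measurement : ICombinable
{
    public double? Value { get; set; }

    public ICombinable Merge(object containingDO, ICombinable incomingObj)
    {
        var incoming = (Measurement)incomingObj;
        // Class-specific rule: prefer our own value, fall back to the incoming one.
        Value = Value ?? incoming.Value;
        return this;
    }
}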
Bringing this together into a functional piece of code basically requires a little property reflection, a little recursion, and a little logic, all of which is nuanced, so I have commented my code instead of explaining it beforehand. Since the goal is to affect a central object and not to create a new, merged copy, this is an instance method in the base class of the data structure, but you could probably convert it to a helper method pretty easily.
internal void MergeIn(Object singleDO)
{
    var internalTypes = Assembly.GetExecutingAssembly().DefinedTypes;
    var mergableTypes = internalTypes.Where(c => c.GetInterfaces().Contains(typeof(ICombinable)));
    MergeIn(this, this, singleDO, internalTypes, mergableTypes);
}
private void MergeIn(Object centralDORef, object centralObj, object singleObj, IEnumerable<TypeInfo> internalTypes, IEnumerable<TypeInfo> mergableTypes)
{
    var itemsToMerge = new List<MergeMe>();
    // All at once, to open up parallelization later.
    IterateOver(centralObj, singleObj, (central, incoming, info) => itemsToMerge.Add(new MergeMe(incoming, central, info)));
    // Check each property on these structures.
    foreach (var merge in itemsToMerge)
    {
        // If either is null, take the non-null value
        if (merge.From == null || merge.To == null)
            merge.Info.SetValue(centralObj, merge.To ?? merge.From);
        // If it's a list, merge by concatenating
        else if (merge.To is IList)
            foreach (var val in (IList)merge.From)
                ((IList)merge.To).Add(val);
        // If it's a dictionary, merge by key
        else if (merge.To is IDictionary)
        {
            var f = (IDictionary)merge.From;
            var t = (IDictionary)merge.To;
            foreach (var key in f.Keys)
                if (t.Contains(key))
                {
                    // On a key conflict, check for ICombinable
                    if (typeof(ICombinable).IsAssignableFrom(merge.Info.PropertyType.GenericTypeArguments[1]))
                        t[key] = ((ICombinable)t[key]).Merge(centralDORef, (ICombinable)f[key]);
                }
                else
                    t.Add(key, f[key]);
        }
        // Both non-null and not collections: merge
        else
        {
            // Check for ICombinable
            if (typeof(ICombinable).IsAssignableFrom(merge.Info.PropertyType))
                merge.Info.SetValue(centralObj, ((ICombinable)merge.To).Merge(centralDORef, (ICombinable)merge.From));
            // If we made the object, dig deeper
            else if (internalTypes.Contains(merge.Info.PropertyType))
            {
                // Recurse.
                MergeIn(centralDORef, merge.To, merge.From, internalTypes, mergableTypes);
            }
            // Else do nothing, keeping the original
        }
    }
}
private class MergeMe
{
    public MergeMe(object from, object to, PropertyInfo info)
    {
        From = from;
        To = to;
        Info = info;
    }

    public object From;
    public object To;
    public PropertyInfo Info;
}
private static void IterateOver<T>(T destination, T other, Action<object, object, PropertyInfo> onEachProperty)
{
    foreach (var prop in destination.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
        onEachProperty(prop.GetValue(destination), prop.GetValue(other), prop);
}
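Usage then amounts to deserializing each file and folding it into the central object. A sketch, where DataRoot and Deserialize are hypothetical stand-ins for your root type and loading code:

// Sketch: fold each deserialized file into one central data object.
var central = Deserialize<DataRoot>(files[0]);
foreach (var file in files.Skip(1))
{
    var incoming = Deserialize<DataRoot>(file);
    central.MergeIn(incoming);
}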

How to pass variable entities to a generic function?

If I generate my entities through Entity Framework Database First, and I want to use a function like this:
AuditManager.DefaultConfiguration.Exclude<T>();
considering that the number of times I want to call it should be equal to the number of entities, e.g.:
AuditManager.DefaultConfiguration.Exclude<Employee>();
AuditManager.DefaultConfiguration.Exclude<Department>();
AuditManager.DefaultConfiguration.Exclude<Room>();
how do I loop through a selected number of entities and pass each one to the Exclude function?
The obvious solution would be to call the method for every entity type you want to hide, like this:
AuditManager.DefaultConfiguration.Exclude<Employee>();
AuditManager.DefaultConfiguration.Exclude<Department>();
AuditManager.DefaultConfiguration.Exclude<Room>();
You can add conditional statements (ifs) around them to do it dynamically.
However, if you want a fully flexible solution, where you call the Exclude method based on metadata, you need something else. Something like this:
var types = new[] { typeof(Employee), typeof(Department), typeof(Room) };
var instance = AuditManager.DefaultConfiguration;
var openGenericMethod = instance.GetType().GetMethod("Exclude");
foreach (var type in types)
{
    var closedGenericMethod = openGenericMethod.MakeGenericMethod(type);
    closedGenericMethod.Invoke(instance, null);
}
This assumes that the Exclude<T> method is an instance method on whatever instance DefaultConfiguration points to.
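If you don't want to hardcode the type list, one option is to harvest it from the context itself. A sketch, assuming an EF Database First context class named MyDbContext (a hypothetical name):

// Sketch: collect every entity type exposed as a DbSet<T> on the context,
// then close and invoke Exclude<T> for each one.
var entityTypes = typeof(MyDbContext).GetProperties()
    .Where(p => p.PropertyType.IsGenericType &&
                p.PropertyType.GetGenericTypeDefinition() == typeof(DbSet<>))
    .Select(p => p.PropertyType.GetGenericArguments()[0]);

var instance = AuditManager.DefaultConfiguration;
var openGenericMethod = instance.GetType().GetMethod("Exclude");
foreach (var entityType in entityTypes)
{
    openGenericMethod.MakeGenericMethod(entityType).Invoke(instance, null);
}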
An alternative to looping through your entity types is to make the entities you don't want audited implement the same interface and exclude that. For example:
public interface IExcludeFromAudit
{ }
And your entities:
public class Order : IExcludeFromAudit
{
    // snip
}
And now just exclude the interface:
AuditManager.DefaultConfiguration.Exclude<IExcludeFromAudit>();
The benefit of this is that it's now easy to control which ones are excluded.

Very slow Reflection (trying to write a generic wrapper)

I'm trying to write a generic method to wrap an SDK we're using. The SDK provides "AFElement" objects that represent our data objects, and each AFElement has a collection of "AFAttributes" that map to our data objects' properties.
I've created a generic method which uses reflection to check the properties of the object it's called for and to get them (if they exist) from the AFElement's Attributes:
private T ConvertAFElementTo<T>(AFElement element, T item) where T : class, new()
{
    PropertyInfo[] properties = item.GetType().GetProperties();
    foreach (PropertyInfo property in properties)
    {
        // Get the Attribute object that represents this property
        AFAttribute attribute = element.Attributes[property.Name];
        if (attribute != null)
        {
            // Check if we have the same type
            if (property.PropertyType.Equals(attribute.Type))
            {
                // Set our property value to that of the attribute
                var v = attribute.GetValue().Value;
                property.SetValue(item, v);
            }
            // Check if we have an AFElement as an attribute that will need converting to a data object
            else if (attribute.Type.Equals(typeof(AFElement)))
            {
                AFElement attributeElement = attribute.GetValue().Value as AFElement;
                Type attributeType = null;
                // Look up its data type from the template
                TypeConversionDictionary.TryGetValue(attributeElement.Template, out attributeType);
                if (attributeType != null)
                {
                    // Set it as a .NET object
                    property.SetValue(item, ConvertAFElementTo(attributeElement, Activator.CreateInstance(attributeType)));
                }
            }
        }
    }
    return item;
}
The idea is that I can throw any of my data objects T at this method and it will populate them, and it works, except that it's exceptionally slow.
It takes around 10 seconds to get 63 objects (11 properties each, all simple types like Guid, String and Single), and 93% of the time is spent in this conversion method. I've heard reflection isn't very efficient, but is it really this inefficient?
Is there any other way I could do this, or a way to speed things up? Am I being stupid even trying to do something this generic?
The general rule when you do reflection is not to do any lookup operations etc. at execution time, but only once, during an initialization step.
In your example, you could have a class for that method that does the reflection lookup in its static constructor - ONCE, when the class is first accessed. All method calls then use the already-evaluated reflection elements.
Reflection has to do a lot - and you really make it a lot harder by being fully dynamic.
I suggest you do more profiling and find out exactly which methods are slow ;) Then try to do the reflection part a little less often.
You can have an AFAMapper class that gets initialized for every pair of Source and Target ;)
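As a rough illustration of the look-it-up-once advice: the property setters can be compiled into delegates a single time per type, so the hot path becomes a dictionary lookup plus a delegate call instead of PropertyInfo.SetValue. A sketch, not drop-in code:

using System.Collections.Concurrent;
using System.Linq.Expressions;

static class SetterCache
{
    private static readonly ConcurrentDictionary<Type, Dictionary<string, Action<object, object>>>
        Cache = new ConcurrentDictionary<Type, Dictionary<string, Action<object, object>>>();

    public static Dictionary<string, Action<object, object>> For(Type type)
    {
        return Cache.GetOrAdd(type, t =>
        {
            var setters = new Dictionary<string, Action<object, object>>();
            foreach (var p in t.GetProperties().Where(p => p.CanWrite))
            {
                // Build (target, value) => ((T)target).Prop = (PropType)value once,
                // then reuse the compiled delegate for every converted element.
                var target = Expression.Parameter(typeof(object));
                var value = Expression.Parameter(typeof(object));
                var assign = Expression.Assign(
                    Expression.Property(Expression.Convert(target, t), p),
                    Expression.Convert(value, p.PropertyType));
                setters[p.Name] = Expression.Lambda<Action<object, object>>(assign, target, value).Compile();
            }
            return setters;
        });
    }
}

// In the conversion method, instead of property.SetValue(item, v):
//     var setters = SetterCache.For(typeof(T));
//     setters[property.Name](item, v);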

Creating an object via lambda factory vs direct "new Type()" syntax

For example, consider a utility class SerializableList:
public class SerializableList : List<ISerializable>
{
    public T Add<T>(T item) where T : ISerializable
    {
        base.Add(item);
        return item;
    }

    public T Add<T>(Func<T> factory) where T : ISerializable
    {
        var item = factory();
        base.Add(item);
        return item;
    }
}
Usually I'd use it like this:
var serializableList = new SerializableList();
var item1 = serializableList.Add(new Class1());
var item2 = serializableList.Add(new Class2());
I could also have used it with a factory, like this:
var serializableList = new SerializableList();
var item1 = serializableList.Add(() => new Class1());
var item2 = serializableList.Add(() => new Class2());
The second approach appears to be the preferred usage pattern, as I've lately been noticing on SO. Is it really so (and why, if yes), or is it just a matter of taste?
Given your example, the factory method is silly. Unless the callee requires the ability to control the point of instantiation, instantiate multiple instances, or lazy evaluation, it's just useless overhead.
The compiler will not be able to optimize out delegate creation.
To reference the examples of using the factory syntax that you gave in comments on the question: both are trying (albeit poorly) to provide guaranteed cleanup of the instances.
If you consider a using statement:
using (var x = new Something()) { }
The naive implementation would be:
var x = new Something();
try
{
}
finally
{
    if ((x != null) && (x is IDisposable))
        ((IDisposable)x).Dispose();
}
The problem with this code is that it is possible for an exception to occur after the assignment of x, but before the try block is entered. If this happens, x will not be properly disposed, because the finally block will not execute. To deal with this, the code for a using statement will actually be something more like:
Something x = null;
try
{
    x = new Something();
}
finally
{
    if ((x != null) && (x is IDisposable))
        ((IDisposable)x).Dispose();
}
Both of the examples that you reference using factory parameters attempt to deal with this same issue. Passing a factory allows the instance to be instantiated within the guarded block. Passing the instance directly leaves open the possibility of something going wrong along the way without Dispose() ever being called.
In those cases, passing the factory parameter makes sense.
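Applied to the SerializableList example above, a factory overload lets the list create the instance inside its own guarded block. A sketch of what that might look like (an illustration, not code from the question):

public T Add<T>(Func<T> factory) where T : ISerializable
{
    // The instance is created here, inside the callee, so no exception in the
    // caller can occur between creation and our taking responsibility for it.
    T item = factory();
    try
    {
        base.Add(item);
        return item;
    }
    catch
    {
        // Adding failed: we hold the only reference, so clean up before rethrowing.
        (item as IDisposable)?.Dispose();
        throw;
    }
}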
Caching
In the example you have provided it does not make sense, as others have pointed out. Instead, I will give you another example:
public class MyClass
{
    public MyClass(string file)
    {
        // load a huge file
        // do lots of computing...
        // then store results...
    }
}

private ConcurrentDictionary<string, MyClass> Cache = new ConcurrentDictionary<string, MyClass>();

public MyClass GetCachedItem(string key)
{
    return Cache.GetOrAdd(key, k => new MyClass(key));
}
In the above example, let's say we are loading a big file, calculating something, and we are interested in the end result of that calculation. To speed up access, when I try to load files through the Cache, the Cache will return the cached entry if it has it; only when the cache does not find the item will it call the factory method and create a new instance of MyClass.
So you may read a file many times, but you only create the instance of the class that holds its data once. This pattern is only useful for caching purposes.
But if you are not caching, and every iteration requires calling the new operator, then it makes no sense to use the factory pattern at all.
Alternate Error Object or Error Logging
If creation fails for some reason, the List can substitute an error object, for example:
T defaultObject = ...;

public T Add<T>(Func<T> factory) where T : ISerializable
{
    T item;
    try
    {
        item = factory();
    }
    catch (Exception ex)
    {
        Log(ex);
        item = defaultObject;
    }
    base.Add(item);
    return item;
}
In this example, you can monitor the factory for exceptions thrown while creating the new object; when that happens, you log the error and keep some default value in the list instead. I don't know what the practical use of this would be, but error logging sounds like the better candidate here.
No, there's no general preference for passing the factory instead of the value. However, in very particular situations, you will prefer to pass the factory method instead of the value.
Think about it:
What's the difference between passing the parameter as a value, or
passing it as a factory method (e.g. using Func<T>)?
The answer is simple: order of execution.
In the first case, you need to pass the value, so you must obtain it before calling the target method.
In the second case, you can postpone the value creation/calculation/obtaining till it's needed by the target method.
Why would you want to postpone the value creation/calculation/obtaining? Obvious things come to mind (see the sketch after this list):
Processor-intensive or memory-intensive creation of the value, which you want to happen only if the value is really needed (on demand). This is lazy loading, then.
If the value creation depends on parameters that are accessible by the target method but not from outside of it. So you would pass Func<T, T> instead of Func<T>.
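A tiny sketch of that difference in execution order (Widget, CreateWidget and SomeCondition are all illustrative):

static Widget CreateWidget()
{
    Console.WriteLine("widget created");   // side effect marks the creation point
    return new Widget();
}

static void TakesValue(Widget value)
{
    // The widget already exists on entry, whether or not it's used.
}

static void TakesFactory(Func<Widget> factory)
{
    if (SomeCondition)
    {
        var value = factory();             // "widget created" prints here, on demand
    }
}

// TakesValue(CreateWidget());   // prints before the call is even entered
// TakesFactory(CreateWidget);   // may never print at all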
The question compares methods with different purposes. The second one should be named CreateAndAdd<T>(Func<T> factory).
So depending on what functionality is required, one method or the other should be used.

Deep Reflection in .NET

I need the ability to drill through an object's properties two or three levels deep. For instance, class A has a property referencing class B, through which I need to access class C. What is the best way to do this: straight reflection, or maybe using the TypeDescriptor, or something else?
Thanks.
It's not too hard to write. I put a few classes together to deal with this so I could serialize properties of a WinForm. Take a look at this class and the related classes.
http://csharptest.net/browse/src/Library/Reflection/PropertySerializer.cs
If you know the path in a static context (i.e. the path is always the same) and the properties are accessible (internal or public), you can use dynamic:
[Test]
public void Foo()
{
    var a = new A
    {
        B = new B
        {
            C = new C
            {
                Name = "hello"
            }
        }
    };

    DoReflection(a);
}

private void DoReflection(dynamic value)
{
    string message = value.B.C.Name;
    Debug.WriteLine(message);
}
If you want to write your own serialization code for whatever reason, you'll be using reflection.
What you do is write a recursive method that serializes a type. You then apply it as you see fit to get the result.
var type = myObjectOfSomeType.GetType();
// now depending on what you want to store
// I'll save all public properties
var properties = type.GetProperties(); // get all public properties
foreach (var p in properties)
{
    var value = p.GetValue(myObjectOfSomeType, null);
    WriteValue(p.Name, value);
}
The implementation of WriteValue has to recognize the built-in types and treat them accordingly; that's typically things like string, char, integer, double, DateTime, etc.
If it encounters a sequence or collection, you need to write out multiple values.
If it encounters a non-trivial type, you apply this recursive pattern again.
The end result is a recursive algorithm that traverses your object model and writes out values as it encounters types it knows how to serialize.
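A compact sketch of that recursive pattern (writer, the output target, is illustrative):

void WriteValue(string name, object value)
{
    if (value == null)
    {
        writer.WriteLine(name + " = null");
    }
    else if (value is string || value is DateTime || value.GetType().IsPrimitive)
    {
        // Built-in types: write directly.
        writer.WriteLine(name + " = " + value);
    }
    else if (value is IEnumerable)
    {
        // Sequences and collections: write out each element.
        int i = 0;
        foreach (var element in (IEnumerable)value)
            WriteValue(name + "[" + (i++) + "]", element);
    }
    else
    {
        // Non-trivial types: apply the pattern recursively.
        foreach (var p in value.GetType().GetProperties())
            WriteValue(name + "." + p.Name, p.GetValue(value, null));
    }
}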
However, I do recommend looking into WCF, not for building services, but for serialization. It shipped as part of the .NET 3.0 Framework with a new assembly, System.Runtime.Serialization, and in general it is very capable when dealing with serialization and data annotations.
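For completeness, a round trip with DataContractSerializer takes only a few lines. A sketch, where MyRoot stands in for your own root type:

using System.IO;
using System.Runtime.Serialization;

var serializer = new DataContractSerializer(typeof(MyRoot));

// Serialize the object graph to XML.
using (var stream = File.Create("graph.xml"))
{
    serializer.WriteObject(stream, myObjectOfSomeType);
}

// Deserialize it back.
using (var stream = File.OpenRead("graph.xml"))
{
    var copy = (MyRoot)serializer.ReadObject(stream);
}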
