Design Issue: Good or bad to make the cached entities mutable? - c#

Here I need to cache some entities, for example, a page tree in a content management system (CMS). The system allows developers to write plugins in which they can access the cached page tree. Is it good or bad to make the cached page tree mutable (i.e., the tree node objects have setters, and/or we expose the Add and Remove methods on the ChildPages collection, so client code can set properties of the page tree nodes and add/remove tree nodes freely)?
Here are my thoughts:
(1) If the page tree is immutable, plugin developers have no way to modify the tree unexpectedly. That way we avoid some subtle bugs.
(2) But sometimes we need to change the name of a page. If the page tree is immutable, we have to invoke some method like "Refresh()" to refresh the cache, which causes an extra database hit (two hits in total, when one of them could have been avoided). If the page tree is mutable, we can change the name directly in the tree to keep it up to date, so only one database hit is needed.
What do you think about it? And what will you do if you encounter such a situation?
Thanks in advance! :)
UPDATE: The page tree is something like:
public class PageCacheItem {
    public string Name { get; set; }
    public string PageTitle { get; set; }
    public PageCacheItemCollection Children { get; private set; }
}
My problem here is not about the hash code, because PageCacheItem won't be put in a HashSet or used as a Dictionary key.
My problem is:
If PageCacheItem (the tree node) is mutable, that is, its properties have setters (e.g., for Name and PageTitle), then a plugin author might change a property by mistake and leave the system in an incorrect state (the cached data no longer consistent with the data in the database). Such a bug is hard to debug, because it's caused by some plugin, not the system itself.
But if PageCacheItem is read-only, it might be hard to implement an efficient "cache refresh" feature: since there are no setters on the properties, we can't simply update them by setting the latest values.
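(An editor's illustration, not from the answers below: C# also allows a middle ground where the setters are internal, so the tree is read-only from the plugin authors' point of view while the CMS assembly itself can still update nodes in place during a refresh.)

public class PageCacheItem
{
    public string Name { get; internal set; }      // plugins can read; only the CMS assembly can write
    public string PageTitle { get; internal set; }
    public PageCacheItemCollection Children { get; private set; }
}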
UPDATE2
Thanks, guys. But I have one thing to note: I'm not developing a generic caching framework, but rather some APIs on top of an existing caching framework. My APIs are a middle layer between the underlying caching framework and the plugin authors. A plugin author doesn't need to know anything about the underlying caching framework; he only needs to know that the page tree is retrieved from a cache, and he gets a strongly-typed PageCacheItem API to work with rather than the weakly-typed "object" returned by the underlying caching framework.
So my question is about designing APIs for plugin authors: is it good or bad to make the API class PageCacheItem mutable (here, mutable means its properties can be set from outside the PageCacheItem class)?

First, I assume you mean the cached values may or may not be mutable, rather than the identifier it is identified by. If you mean the identifier too, then I would be quite emphatic about being immutable in this regard (emphatic enough to have my post flagged for obscene language).
As for mutable values, there is no one right answer here. You've hit on the primary pro and con either way, and there are multiple variants within each of the two options you describe. Cache invalidation is in general a notoriously difficult problem (as in the well known quote from Phil Karlton, "There are only two hard problems in Computer Science: cache invalidation and naming things."*)
Some things to consider:
How often will changes be made? If changes are rare, refreshes become easy - dump the existing cache and let it rebuild.
Will the CMS be on multiple servers, or could it be in the future? If so, any invalidation information has to be shared.
How bad is stale data, and how soon is it bad? (Could you happily serve out-of-date values for the next hour or so, or would this conflict disastrously with fresh values?)
Does a revalidation approach make sense for you, where after a certain time a cached value is checked to be sure it is still valid, and the time-to-next-check is updated? (Alternatively, periodically dump old values and let them be retrieved from the fresh source again.)
How easy is detecting staleness in the first place? If it's hard, this can rule out some approaches.
How much does the cache actually save? Could you just get rid of it?
I haven't mentioned threading issues, because threading is difficult with any sort of cache unless you're single-threaded (and if it's a CMS I'm guessing it's web, and hence inherently multi-threaded). One thing I will say on the matter is that a cache failure generally isn't critical (by definition, a cache failure has a fallback - get the fresh value). For this reason it can be fruitful to take an approach where, rather than blocking indefinitely on the monitor (which is what lock does internally), you use Monitor.TryEnter with a timeout and have the cache operation fail if the timeout is hit. Using a ReaderWriterLockSlim and allowing a slightly longer timeout for writing can be a good approach. This way, if you hit a point of heavy lock contention, the cache will stop working for some threads, but those threads still get usable data. That will hurt performance for those threads, but not as much as lock contention would hurt all affected threads - and caches are a place where it is very easy to introduce lock contention into a web project that only shows up once you've gone live.
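A rough sketch of that idea (my own illustration, using a plain dictionary guarded by a ReaderWriterLockSlim; the timeout values are arbitrary):

using System;
using System.Collections.Generic;
using System.Threading;

public class TimeoutCache<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> items = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim cacheLock = new ReaderWriterLockSlim();

    public TValue GetOrFetch(TKey key, Func<TValue> fetchFresh)
    {
        // Try the cache, but give up quickly rather than queuing behind contention.
        if (cacheLock.TryEnterReadLock(TimeSpan.FromMilliseconds(10)))
        {
            try
            {
                TValue cached;
                if (items.TryGetValue(key, out cached))
                    return cached;
            }
            finally { cacheLock.ExitReadLock(); }
        }

        // Cache miss or lock timeout: fall back to the fresh value.
        TValue fresh = fetchFresh();

        // Allow slightly longer for writes, but still fail rather than block.
        if (cacheLock.TryEnterWriteLock(TimeSpan.FromMilliseconds(50)))
        {
            try { items[key] = fresh; }
            finally { cacheLock.ExitWriteLock(); }
        }

        return fresh;
    }
}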
*(and of course the well known variant, "there are only two hard problems in Computer Science: cache invalidation, naming things, and off-by-one errors").

Look at it this way, if the entry is mutable, then it is likely that the hashcode will change when the object is mutated.
Depending on the dictionary implementation of the cache, the mutated entry could either:
be 'lost'
in the worst case, require the entire cache to be rehashed
There may be valid reasons why you want 'mutable hashcodes' but I cannot see a justification here. (I have only ever needed to do this once in the last 9 years).
It would be a lot easier just to remove and replace the entry you wish to be 'mutated'.
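For example (an illustrative sketch, not the poster's code, using System.Runtime.Caching, where Set() adds or replaces an entry):

using System.Runtime.Caching;

static class CacheRefreshExample
{
    // Swap in a freshly built object instead of mutating the cached one.
    static void Refresh(string key, object freshValue)
    {
        ObjectCache cache = MemoryCache.Default;
        cache.Set(key, freshValue, new CacheItemPolicy());
    }
}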

Related

System.Runtime.Caching.MemoryCache vs HttpRuntime.Cache - are there any differences?

I'm wondering if there are any differences between MemoryCache and HttpRuntime.Cache - which one is preferred in ASP.NET MVC projects?
As far as I understand, both are thread-safe and the APIs are at first sight more or less the same, so is there any difference in when to use which?
HttpRuntime.Cache gets the Cache for the current application.
The MemoryCache class is similar to the ASP.NET Cache class.
The MemoryCache class has many properties and methods for accessing the cache that will be familiar to you if you have used the ASP.NET Cache class.
The main difference between HttpRuntime.Cache and MemoryCache is that the latter has been changed to make it usable by .NET Framework applications that are not ASP.NET applications.
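In code, the difference shows up in how you obtain the cache (an illustrative snippet showing both APIs side by side):

// ASP.NET only: a single shared cache per application, acquired rather than constructed.
System.Web.Caching.Cache webCache = System.Web.HttpRuntime.Cache;

// Any .NET 4+ application: use the default instance or construct named instances.
var defaultCache = System.Runtime.Caching.MemoryCache.Default;
var namedCache = new System.Runtime.Caching.MemoryCache("myCache");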
For additional reading:
Justin Mathew Blog - Caching in .Net 4.0
Jon Davis Blog - Four Methods Of Simple Caching In .NET
Here is Jon Davis' article. To preserve readability, I'm cutting out the now obsolete EntLib section, the intro as well as the conclusion.
ASP.NET Cache
ASP.NET, or the System.Web.dll assembly, does have a caching mechanism. It was never intended to be used outside of a web context, but it can be used outside of the web, and it does perform all of the above expiration behaviors in a hashtable of sorts.
After scouring Google, it appears that quite a few people who have discussed the built-in caching functionality in .NET have resorted to using the ASP.NET cache in their non-web projects. This is no longer the most-available, most-supported built-in caching system in .NET; .NET 4 has an ObjectCache which I’ll get into later. Microsoft has always been adamant that the ASP.NET cache is not intended for use outside of the web. But many people are still stuck in .NET 2.0 and .NET 3.5, and need something to work with, and this happens to work for many people, even though MSDN says clearly:
Note: The Cache class is not intended for use outside of ASP.NET applications. It was designed and tested for use in ASP.NET to provide caching for Web applications. In other types of applications, such as console applications or Windows Forms applications, ASP.NET caching might not work correctly.
The class for the ASP.NET cache is System.Web.Caching.Cache in System.Web.dll. However, you cannot simply new-up a Cache object. You must acquire it from System.Web.HttpRuntime.Cache.
Cache cache = System.Web.HttpRuntime.Cache;
Working with the ASP.NET cache is documented on MSDN here.
Pros:
It’s built-in.
Despite the .NET 1.0 syntax, it’s fairly simple to use.
When used in a web context, it’s well-tested. Outside of web contexts, according to Google searches it is not commonly known to cause problems, despite Microsoft recommending against it, so long as you’re using .NET 2.0 or later.
You can be notified via a delegate when an item is removed, which is necessary if you need to keep it alive and you could not set the item’s priority in advance.
Individual items have the flexibility of any of (a), (b), or (c) methods of expiration and removal in the list of removal methods at the top of this article. You can also associate expiration behavior with the presence of a physical file.
Cons:
Not only is it static, there is only one. You cannot create your own type with its own static instance of a Cache. You can only have one bucket for your entire app, period. You can wrap the bucket with your own wrappers that do things like pre-inject prefixes in the keys and remove these prefixes when you pull the key/value pairs back out. But there is still only one bucket. Everything is lumped together. This can be a real nuisance if, for example, you have a service that needs to cache three or four different kinds of data separately. This shouldn’t be a big problem for pathetically simple projects. But if a project has any significant degree of complexity due to its requirements, the ASP.NET cache will typically not suffice.
Items can disappear, willy-nilly. A lot of people aren’t aware of this—I wasn’t, until I refreshed my knowledge on this cache implementation. By default, the ASP.NET cache is designed to destroy items when it “feels” like it. More specifically, see (c) in my definition of a cache table at the top of this article. If another thread in the same process is working on something completely different, and it dumps high-priority items into the cache, then as soon as .NET decides it needs to require some memory it will start to destroy some items in the cache according to their priorities, lower priorities first. All of the examples documented here for adding cache items use the default priority, rather than the NotRemovable priority value which keeps it from being removed for memory-clearing purposes but will still remove it according to the expiration policy. Peppering CacheItemPriority.NotRemovable in cache invocations can be cumbersome, otherwise a wrapper is necessary.
The key must be a string. If, for example, you are caching data records where the records are keyed on a long or an integer, you must convert the key to a string first.
The syntax is stale. It’s .NET 1.0 syntax, even uglier than ArrayList or Hashtable. There are no generics here, no IDictionary<> interface. It has no Contains() method, no Keys collection, no standard events; it only has a Get() method plus an indexer that does the same thing as Get(), returning null if there is no match, plus Add(), Insert() (redundant?), Remove(), and GetEnumerator().
Ignores the DRY principle of setting up your default expiration/removal behaviors so you can forget about them. You have to explicitly tell the cache how you want the item you're adding to expire or be removed every time you add an item.
No way to access the caching details of a cached item such as the timestamp of when it was added. Encapsulation went a bit overboard here, making it difficult to use the cache when in code you’re attempting to determine whether a cached item should be invalidated against another caching mechanism (i.e. session collection) or not.
Removal events are not exposed as events and must be tracked at the time of add.
And if I haven't said it enough, Microsoft explicitly recommends against it outside of the web. And if you're cursed with .NET 1.1, you're not supposed to use it with any confidence of stability at all outside of the web, so don't bother.
.NET 4.0’s ObjectCache / MemoryCache
Microsoft finally implemented an abstract ObjectCache class in the latest version of the .NET Framework, and a MemoryCache implementation that inherits and implements ObjectCache for in-memory purposes in a non-web setting.
System.Runtime.Caching.ObjectCache is in the System.Runtime.Caching.dll assembly. It is an abstract class that declares basically the same .NET 1.0-style interfaces that are found in the ASP.NET cache. System.Runtime.Caching.MemoryCache is the in-memory implementation of ObjectCache and is very similar to the ASP.NET cache, with a few changes.
To add an item with a sliding expiration, your code would look something like this:
var config = new NameValueCollection();
var cache = new MemoryCache("myMemCache", config);
cache.Add(new CacheItem("a", "b"),
    new CacheItemPolicy
    {
        Priority = CacheItemPriority.NotRemovable,
        SlidingExpiration = TimeSpan.FromMinutes(30)
    });
Pros:
It’s built-in, and now supported and recommended by Microsoft outside of the web.
Unlike the ASP.NET cache, you can instantiate a MemoryCache object instance.
Note: It doesn’t have to be static, but it should be—that is Microsoft’s recommendation (see yellow Caution).
A few slight improvements have been made vs. the ASP.NET cache’s interface, such as the ability to subscribe to removal events without necessarily being there when the items were added, the redundant Insert() was removed, items can be added with a CacheItem object with an initializer that defines the caching strategy, and Contains() was added.
Cons:
Still does not fully reinforce DRY. From my small amount of experience, you still can’t set the sliding expiration TimeSpan once and forget about it. And frankly, although the policy in the item-add sample above is more readable, it necessitates horrific verbosity.
It is still not generically keyed; it requires a string as the key. So you can't key on a long or an int if you're caching data records, unless you convert the key to a string first.
DIY: Build One Yourself
It’s actually pretty simple to create a caching dictionary that performs explicit or sliding expiration. (It gets a lot harder if you want items to be auto-removed for memory-clearing purposes.) Here’s all you have to do:
Create a value container class called something like Expiring or Expirable that would contain a value of type T, a TimeStamp property of type DateTime to store when the value was added to the cache, and a TimeSpan that would indicate how far out from the timestamp the item should expire. For explicit expiration you can just expose a property setter that sets the TimeSpan given an expiration date minus the timestamp.
Create a class, let's call it ExpirableItemsDictionary<K,T>, that implements IDictionary<K,T>. I prefer to make it a generic class, with <K,T> defined by the consumer.
In the class created in #2, add a Dictionary<K, Expiring<T>> as a property and call it InnerDictionary.
The implementation of IDictionary<K,T> in the class created in #2 should use the InnerDictionary to store cached items. Encapsulation would hide the caching method details via instances of the type created in #1 above.
Make sure the indexer (this[]), ContainsKey(), etc., are careful to clear out expired items and remove the expired items before returning a value. Return null in getters if the item was removed.
Use thread locks on all getters, setters, ContainsKey(), and particularly when clearing the expired items.
Raise an event whenever an item gets removed due to expiration.
Add a System.Threading.Timer instance and rig it during initialization to auto-remove expired items every 15 seconds. This is the same behavior as the ASP.NET cache.
You may want to add an AddOrUpdate() routine that pushes out the sliding expiration by replacing the timestamp on the item’s container (Expiring instance) if it already exists.
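A bare-bones sketch of steps 1, 2 and 5 above (illustrative only; the full IDictionary<K,T> implementation, the removal event and the cleanup timer from the other steps are omitted for brevity):

using System;
using System.Collections.Generic;

public class Expiring<T>
{
    public T Value { get; set; }
    public DateTime TimeStamp { get; set; }    // when the value was cached
    public TimeSpan ExpiresAfter { get; set; } // how far past the timestamp it stays valid

    public bool IsExpired
    {
        get { return DateTime.UtcNow - TimeStamp > ExpiresAfter; }
    }
}

public class ExpirableItemsDictionary<K, T> where T : class
{
    private readonly Dictionary<K, Expiring<T>> innerDictionary = new Dictionary<K, Expiring<T>>();
    private readonly TimeSpan defaultExpiration;
    private readonly object sync = new object();

    public ExpirableItemsDictionary(TimeSpan defaultExpiration)
    {
        this.defaultExpiration = defaultExpiration;
    }

    public void Add(K key, T value)
    {
        lock (sync)
        {
            innerDictionary[key] = new Expiring<T>
            {
                Value = value,
                TimeStamp = DateTime.UtcNow,
                ExpiresAfter = defaultExpiration
            };
        }
    }

    // Step 5: clear out expired items before returning; null when gone.
    public T this[K key]
    {
        get
        {
            lock (sync)
            {
                Expiring<T> item;
                if (!innerDictionary.TryGetValue(key, out item))
                    return null;
                if (item.IsExpired)
                {
                    innerDictionary.Remove(key);
                    return null;
                }
                return item.Value;
            }
        }
    }
}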
Microsoft has to support its original designs because its user base has built up a dependency upon them, but that does not mean that they are good designs.
Pros:
You have complete control over the implementation.
Can reinforce DRY by setting up default caching behaviors and then just dropping key/value pairs in without declaring the caching details each time you add an item.
Can implement modern interfaces, namely IDictionary<K,T>. This makes it much easier to consume as its interface is more predictable as a dictionary interface, plus it makes it more accessible to helpers and extension methods that work with IDictionary<>.
Caching details can be unencapsulated, such as by exposing your InnerDictionary via a public read-only property, allowing you to write explicit unit tests against your caching strategy as well as extend your basic caching implementation with additional caching strategies that build upon it.
Although it is not necessarily a familiar interface for those who already made themselves comfortable with the .NET 1.0 style syntax of the ASP.NET cache or the Caching Application Block, you can define the interface to look like however you want it to look.
Can use any type for keys. This is one reason why generics were created. Not everything should be keyed with a string.
Cons:
Is not invented by, nor endorsed by, Microsoft, so it is not going to have the same quality assurance.
Assuming only the instructions I described above are implemented, does not “willy-nilly” clear items for clearing memory on a priority basis (which is a corner-case utility function of a cache anyway .. BUY RAM where you would be using the cache, RAM is cheap).
Among all four of these options, this is my preference. I have implemented this basic caching solution. So far, it seems to work perfectly, there are no known bugs (please contact me with comments below or at jon-at-jondavis if there are!!), and I intend to use it in all of my smaller side projects that need basic caching. Here it is:
Github link: https://github.com/kroimon/ExpirableItemDictionary
Old Link: ExpirableItemDictionary.zip
Worthy Of Mention: AppFabric, NoSQL, Et Al
Notice that the title of this blog article indicates “Simple Caching”, not “Heavy-Duty Caching”. If you want to get into the heavy-duty stuff, you should look at dedicated, scale out solutions.
MemoryCache.Default can also serve as a "bridge" if you're migrating a classic ASP.NET MVC app to ASP.NET Core, because there's no "System.Web.Caching" and "HttpRuntime" in Core.
I also wrote a small benchmark to store a bool item 20000 times (and another benchmark to retrieve it) and MemoryCache seems to be two times slower (27ms vs 13ms - that's total for all 20k iterations) but they're both super-fast and this can probably be ignored.
MemoryCache is what it says it is: a cache stored in memory.
HttpRuntime.Cache (see http://msdn.microsoft.com/en-us/library/system.web.httpruntime.cache(v=vs.100).aspx and http://msdn.microsoft.com/en-us/library/system.web.caching.cache.aspx) persists to whatever you configure it to in your application.
see for example "ASP.NET 4.0: Writing custom output cache providers"
http://weblogs.asp.net/gunnarpeipman/archive/2009/11/19/asp-net-4-0-writing-custom-output-cache-providers.aspx

Are there any general patterns for designing classes for schemaless databases in .NET?

I have done a little bit of work with MongoDB in C#, but all of my code is still in development. I am wondering what useful patterns people have found for evolving their domain classes over time as new properties are created, altered and removed. I am clear that I will need to either run updates on all my stored data or make sure my domain classes know how to deal with older-format records, but over time I can imagine this becoming chaotic if a class has to know how to deal with all possible record formats.
Am I overthinking this? Is this mostly just a case of using good defensive programming?
Adding new properties to your data objects couldn't be easier: you just add them. Unless you worry about these properties being null for objects which already exist in the database, you don't have to do anything else. If some users/machines use an older version of your application and your classes were marked with BsonIgnoreExtraElementsAttribute, they may not even need to update their software.
Removing obsolete properties couldn't be easier either: you just remove them from your classes. If your classes are marked with BsonIgnoreExtraElementsAttribute, then you don't even have to remove them from your database (in case, for example, your users have several versions of your app).
Renaming class properties is also easy. The BsonElementAttribute constructor takes the element name as a parameter, so you can map the property to the correct field name in the database.
Changing a property's type may require you to run an update on your data. But seriously, how often do you change a property's type from string to int in production?
So in many cases you won't even need to run updates on your data (unless you change a data type or your property affects an index). Another point is that adding BsonIgnoreExtraElementsAttribute is often a good practice, particularly if you are worried about properties being added and/or removed frequently. Following this practice, you can provide older and newer versions of your application that work with all versions of your records, enjoying the benefits of "schemalessness".
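To make those attributes concrete, here is a small sketch (the Customer class and its properties are hypothetical; the attributes come from MongoDB.Bson.Serialization.Attributes):

using MongoDB.Bson.Serialization.Attributes;

// Older documents may carry fields this class no longer declares;
// [BsonIgnoreExtraElements] tells the driver to silently skip them.
[BsonIgnoreExtraElements]
public class Customer
{
    public string Id { get; set; }

    // The C# property was renamed at some point; [BsonElement] keeps it
    // mapped to the original field name stored in the database.
    [BsonElement("name")]
    public string FullName { get; set; }
}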

Are protected members/fields really that bad?

Now if you read the naming conventions in MSDN for C#, you will notice that it states that properties are always preferred over public and protected fields. I have even been told by some people that you should never use public or protected fields. Now, I will agree that I have yet to find a reason to need a public field, but are protected fields really that bad?
I can see it if you need to make sure that certain validation checks are performed when getting/setting the value; however, a lot of the time it seems like just extra overhead, in my opinion. I mean, let's say I have a class GameItem with fields for baseName, prefixName, and suffixName. Why should I take on the overhead of creating properties (C#) or accessor methods, plus the performance hit I would incur (if I do this for every single field in an application, I am sure it would add up at least a little, especially in certain languages like PHP, or in applications where performance is critical, like games)?
Are protected members/fields really that bad?
No. They are way, way worse.
As soon as a member is more accessible than private, you are making guarantees to other classes about how that member will behave. Since a field is totally uncontrolled, putting it "out in the wild" opens your class and classes that inherit from or interact with your class to higher bug risk. There is no way to know when a field changes, no way to control who or what changes it.
If now, or at some point in the future, any of your code ever depends on a field holding some certain value, you now have to add validity checks and fallback logic in case it's not the expected value, in every place you use it. That's a huge amount of wasted effort when you could've just made it a damn property instead ;)
The best way to share information with deriving classes is the read-only property:
protected object MyProperty { get; }
If you absolutely have to make it read/write, don't. If you really, really have to make it read-write, rethink your design. If you still need it to be read-write, apologize to your colleagues and don't do it again :)
A lot of developers believe - and will tell you - that this is overly strict. And it's true that you can get by just fine without being this strict. But taking this approach will help you go from just getting by to remarkably robust software. You'll spend far less time fixing bugs.
And regarding any concerns about performance - don't. I guarantee you will never, in your entire career, write code so fast that the bottleneck is the call stack itself.
OK, downvote time.
First of all, properties will never hurt performance (provided they don't do much). That's what everyone else says, and I agree.
Another point is that properties are good in that you can place breakpoints in them to capture getting/setting events and find out where they come from.
The rest of the arguments bother me in this way:
They sound like "argument by prestige". If MSDN says it, or some famous developer or author whom everybody likes says it, it must be so.
They are based on the idea that data structures have lots of inconsistent states, and must be protected against wandering or being placed into those states. Since (it seems to me) data structures are way over-emphasized in current teaching, then typically they do need those protections. Far more preferable is to minimize data structure so that it tends to be normalized and not to have inconsistent states. Then, if a member of a class is changed, it is simply changed, rather than damaged. After all, somehow lots of good software was/is written in C, and that didn't suffer massively from lack of protections.
They are based on defensive coding carried to extremes. It is based on the idea that your classes will be used in a world where nobody else's code can be trusted not to goose your stuff. I'm sure there are situations where this is true, but I've never seen them. What I have seen is situations where things were made horribly complicated to get around protections for which there was no need, and to try to guard the consistency of data structures that were horribly over-complicated and un-normalized.
Regarding fields vs. properties, I can think of two reasons for preferring properties in the public interface (protected is also public in the sense that someone other than just your class can see it).
Exposing properties gives you a way to hide the implementation. It also allows you to change the implementation without changing the code that uses it (e.g. if you decide to change the way data are stored in the class)
Many tools that work with classes using reflection only focus on properties (for example, I think that some libraries for serialization work this way). Using properties consistently makes it easier to use these standard .NET tools.
Regarding overheads:
If the getter/setter is the usual one-line piece of code that simply reads/sets the value of a field, then the JIT should be able to inline the call, so there is no performance overhead.
Syntactical overhead is largely reduced when you're using automatically implemented properties (C# 3.0 and newer), so I don't think this is an issue:
protected int SomeProperty { get; set; }
In fact, this allows you to make, for example, the setter protected and the getter public very easily, so this can be even more elegant than using fields.
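For example (a hypothetical property):

public int SomeProperty { get; protected set; } // readable by everyone, settable only by the class and its subclasses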
Public and/or protected fields are bad because they can be manipulated from outside the declaring class without validation; thus they can be said to break the encapsulation principle of object oriented programming.
When you lose encapsulation, you lose the contract of the declaring class; you cannot guarantee that the class behaves as intended or expected.
Using a property or a method to access the field enables you to maintain encapsulation, and fulfill the contract of the declaring class.
I agree with the read-only property answer. But to play devil's advocate here, it really depends on what you're doing. I'll be happy to admit I write code with public members all the time (I also don't comment, follow guidelines, or observe any of the formalities).
But when I'm at work, that's a different story.
It actually depends on if your class is a data class or a behaviour class.
If you keep your behaviour and data separate, it is fine to expose the data of your data classes, as long as they have no behaviour.
If the class is a behaviour class, then it should not expose any data.

Are There Reasons To Not Use CustomAttributes?

This is mostly a request for comments if there is a reason I should not go down this road.
I have a multi-tiered, CodeSmith-generated application. At the UI level there need to be some fields that are required, and the required fields will vary depending on field values in the bound entity. What I am thinking of doing is adding a "PropertyRequired" CustomAttribute to each property in the entities, which I can set to true or false when I load the entity in its manager. Then I will use reflection to query the property and give visual feedback to the user at the UI level, and I can validate that all the required properties have a valid value in the manager before I save. I've worked this out as a proof of concept with one property in one entity, but before I try to extend it to the rest of the application I'd like to ask someone with more experience to either tell me to go for it, or explain why I won't like it when I scale up. If this is a bad idea, or if you can suggest a better approach, please offer your opinion.
It is a pretty reasonable way to do it (I've done something very similar before) - but there are always downsides:
any code needing the entity will need the extra reference (assuming that the attribute and entity are in different assemblies)
the values (unless you are clever about it) must be determined at compile-time
you can't use it on entities outside of your control
In most cases the above aren't a problem. If they are an issue, you might want to support an external metadata model - but unless you need it, this would be overkill. Don't do it unless you must (meaning: go ahead and use attributes; they are usually fine).
There is no inherent reason to avoid custom attributes. It is a supported CLR feature which is the backbone for many available products (Code Contracts, FxCop, etc ...).
This is not an unreasonable approach and healthier than baking this stuff into a UI tier. There are a couple of points worth considering before taking the full dive:
You are tightly coupling business logic with the business entity itself. Are there circumstances where a field being required, or its valid values, could change? You may be limiting yourself, or end up with an inconsistent validation mechanism.
Dynamic assignment is possible but more tricky - i.e. when you set a field to be required, that's what it will be unless you override it.
Custom attributes can be quite inflexible if further down the line you want to do something more complicated - namely, if you need to pass state into an attribute-driven validation scheme. Attributes lend themselves to declarative assignment. Only having a true/false required property shouldn't be an issue here, though.
Just being a devil's advocate really; in general, for a fairly simple application where you only care about required fields, this is quite a tidy way of doing it.
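For reference, a minimal sketch of the attribute-plus-reflection approach under discussion (all names here are hypothetical, and note that attribute arguments are fixed at compile time, as pointed out above):

using System;
using System.Linq;
using System.Reflection;

// Marker for properties that must have a value before saving.
[AttributeUsage(AttributeTargets.Property)]
public sealed class PropertyRequiredAttribute : Attribute { }

public class CustomerEntity
{
    [PropertyRequired]
    public string Name { get; set; }

    public string Nickname { get; set; } // optional
}

public static class RequiredFieldValidator
{
    // Returns the names of required string properties that are null or empty.
    public static string[] FindMissing(object entity)
    {
        return entity.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.IsDefined(typeof(PropertyRequiredAttribute), true))
            .Where(p => string.IsNullOrEmpty(p.GetValue(entity, null) as string))
            .Select(p => p.Name)
            .ToArray();
    }
}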

How to deal with unstable 3rd party object tree (sorry, I can’t come up with a better title)?

Let's say I have to use an unstable assembly that I cannot refactor. My code is supposed to be the client of a CustomerCollection that is (surprise) a collection of Customer instances. Each customer instance has further properties like a collection of Order instances. I hope you get the idea.
Since the assembly does not behave that well, my approach is to wrap each class in a façade where I can deal with exceptions, workarounds and all that stuff. (To make things more complicated, I would like to design the wrapper to be usable with WPF with regard to data binding.)
So my question is about the design of the wrapper, e.g. CustomerCollectionFacade. How to expose the object tree (customers, orders, properties of orders)? Is the CustomerWrapper collection stored in a field or do I create CustomerWrapper instances on the fly (in the get accessor of a property maybe)?
Any ideas welcome. Thanks!
Edit:
Unfortunately the way proposed by krosenvold is not an option in my case. Since the object tree’s behavior is very interactive (editing from multiple views, events fired if properties change) I will not opt to abandon the ‘source object’. These changes are supposed to propagate to the source. Thanks anyway.
I generally try to isolate such transformations into one or more adapter classes and let them do the whole story at once. This is a good idea because it is easily testable, all the conversion logic ends up in one place, and you avoid littering the conversion logic all over the place.
Sometimes there is state in the underlying (source) object that is going to be needed when/if you're updating the object. You might not be exposing this data in your cleaned-up api, so it's going to have to be hidden somewhere.
If you choose to encapsulate the original object, there's always the chance that someone will break that encapsulation sometime in the future and start leaking the gory details of the underlying object. That reason alone is usually enough for me not to keep a reference to the original instance on the public surface, so that I still understand what I'm doing six months later when I'm in a hurry. But if you keep it somewhere else you'll need lifecycle management for the originals, so I usually end up stashing it away in some secret interface on the "clean" object.
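A compressed sketch of that last idea (the third-party type and the interface name are hypothetical):

namespace ThirdParty
{
    // Stand-in for a type from the unstable assembly.
    public class Customer
    {
        public string Name { get; set; }
    }
}

// Internal, so only the adapter layer can reach the raw object.
internal interface IHasSource<TSource>
{
    TSource Source { get; }
}

public class CustomerFacade : IHasSource<ThirdParty.Customer>
{
    private readonly ThirdParty.Customer source;

    public CustomerFacade(ThirdParty.Customer source)
    {
        this.source = source;
    }

    // Clean surface for consumers (and a natural place for exception
    // handling, workarounds, WPF change notification, etc.).
    public string Name
    {
        get { return source.Name; }
        set { source.Name = value; } // edits propagate to the source object
    }

    // Explicit implementation keeps the original off the public API.
    ThirdParty.Customer IHasSource<ThirdParty.Customer>.Source
    {
        get { return source; }
    }
}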
