I have an abstract class in a library. I'm trying to make it as easy as possible to properly implement a derivation of this class. The trouble is that I need to initialize the object in a three-step process: grab a file, do a few intermediate steps, and then work with the file. The first and last step are particular to the derived class. Here's a stripped-down example.
abstract class Base
{
// grabs a resource file specified by the implementing class
protected abstract void InitilaizationStep1();
// performs some simple-but-subtle boilerplate stuff
private void InitilaizationStep2() { return; }
// works with the resource file
protected abstract void InitilaizationStep3();
protected Base()
{
InitilaizationStep1();
InitilaizationStep2();
InitilaizationStep3();
}
}
The trouble, of course, is the virtual method call in the constructor. I'm afraid that the consumer of the library will find themselves constrained when using the class if they can't count on the derived class being fully initialized.
I could pull the logic out of the constructor into a protected Initialize() method, but then the implementer might call Step1() and Step3() directly instead of calling Initialize(). The crux of the issue is that there would be no obvious error if Step2() is skipped; just terrible performance in certain situations.
I feel like either way there is a serious and non-obvious "gotcha" that future users of the library will have to work around. Is there some other design I should be using to achieve this kind of initialization?
I can provide more details if necessary; I was just trying to provide the simplest example that expressed the problem.
I would consider creating an abstract factory that is responsible for instantiating and initializing instances of your derived classes using a template method for initialization.
As an example:
public abstract class Widget
{
protected abstract void InitializeStep1();
protected abstract void InitializeStep2();
protected abstract void InitializeStep3();
protected internal void Initialize()
{
InitializeStep1();
InitializeStep2();
InitializeStep3();
}
protected Widget() { }
}
public static class WidgetFactory
{
public static T CreateWidget<T>() where T : Widget, new()
{
T newWidget = new T();
newWidget.Initialize();
return newWidget;
}
}
// consumer code...
var someWidget = WidgetFactory.CreateWidget<DerivedWidget>();
This factory code could be improved dramatically - especially if you are willing to use an IoC container to handle this responsibility...
If you don't have control over the derived classes, you may not be able to prevent them from offering a public constructor that can be called - but at least you can establish a usage pattern that consumers could adhere to.
It's not always possible to prevent users of your classes from shooting themselves in the foot, but you can provide infrastructure that helps consumers use your code correctly once they familiarize themselves with the design.
That's way too much to place in the constructor of any class, much less of a base class. I suggest you factor that out into a separate Initialize method.
In lots of cases, initialization stuff involves assigning some properties. It's possible to make those properties themselves abstract and have derived class override them and return some value instead of passing the value to the base constructor to set. Of course, whether this idea is applicable depends on the nature of your specific class. Anyway, having that much code in the constructor is smelly.
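A minimal sketch of the abstract-property idea, assuming the derived class only needs to supply a value such as a resource path. The names Report, SalesReport, ResourcePath and Describe are hypothetical and not from the question:

```csharp
using System;

public abstract class Report
{
    // Derived classes supply the value by overriding the property;
    // the base constructor never has to receive or store it.
    protected abstract string ResourcePath { get; }

    // Base-class logic reads the property only when it is actually needed.
    public string Describe() => "Using resource: " + ResourcePath;
}

public sealed class SalesReport : Report
{
    // Returns a constant, so it does not depend on derived-class state.
    protected override string ResourcePath => "sales.dat";
}
```

This only stays safe as long as the override does not depend on fields that the derived constructor sets later.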
At first sight, I would suggest to move this kind of logic to the methods relying on this initialization. Something like
public class Base
{
private bool _initialized;
private void Initialize()
{
// do whatever necessary to initialize
_initialized = true;
}
public void UseMe()
{
if (!_initialized) Initialize();
// do work
}
}
Since step 1 "grabs a file", it might be good to have Initialize(IBaseFile) and skip step 1. This way the consumer can get the file however they please - since it is abstract anyways. You can still offer a 'StepOneGetFile()' as abstract that returns the file, so they could implement it that way if they choose.
DerivedClass foo = new DerivedClass();
foo.Initialize(StepOneGetFile("filepath"));
foo.DoWork();
Edit: I answered this for C++ for some reason. Sorry. For C# I recommend against a Create() method: use the constructor and make sure the object stays in a valid state from the start. C# allows virtual calls from the constructor, and it's OK to use them if you carefully document their expected function and pre- and post-conditions. I assumed C++ the first time through because there a virtual call made from a constructor does not dispatch to the derived class's override.
Make the individual initialization functions private. They can be both private and virtual. Then offer a public, non-virtual Initialize() function that calls them in the correct order.
If you want to make sure everything happens as the object is created, make the constructor protected and use a static Create() function in your classes that calls Initialize() before returning the newly created object.
You could employ the following trick to make sure that initialization is performed in the correct order. Presumably, you have some other methods (DoActualWork) implemented in the base class, that rely on the initialization.
abstract class Base
{
private bool _initialized;
protected abstract void InitilaizationStep1();
private void InitilaizationStep2() { return; }
protected abstract void InitilaizationStep3();
protected void Initialize()
{
// it is safe to call virtual methods here
InitilaizationStep1();
InitilaizationStep2();
InitilaizationStep3();
// mark the object as initialized correctly
_initialized = true;
}
public void DoActualWork()
{
if (!_initialized) Initialize();
Console.WriteLine("We are certainly initialized now");
}
}
I wouldn't do this. I generally find that doing any "real" work in a constructor ends up being a bad idea down the road.
At the minimum, have a separate method to load the data from a file. You could make an argument to take it a step further and have a separate object responsible for building one of your objects from file, separating the concerns of "loading from disk" and the in-memory operations on the object.
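As a rough sketch of that separation, assuming the file is plain text read line by line. Settings and SettingsLoader are hypothetical names; File.ReadAllLines is the only framework call involved:

```csharp
using System;
using System.IO;

// In-memory object: knows nothing about files or paths.
public sealed class Settings
{
    public Settings(string[] lines)
    {
        Lines = lines ?? throw new ArgumentNullException(nameof(lines));
    }

    public string[] Lines { get; }
    public int Count => Lines.Length;
}

// Separate object whose only job is building Settings from disk.
public static class SettingsLoader
{
    public static Settings Load(string path)
    {
        // File.ReadAllLines throws if the file cannot be read,
        // so a Settings instance is never handed out half-built.
        return new Settings(File.ReadAllLines(path));
    }
}
```

The constructor does no I/O, so the object can also be built directly from an array in tests.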
I did not find anything about this exact scenario while googling.
In C# is it possible to allow an abstract method to be implemented in the derived class, but only called in the base class?
The reason I would want to do this is that I want to be able to define multiple "do-er" methods, if you will, but I want to wrap all the calls to those methods in a lock. I don't want to have to leave it up to the method implementor to remember to put locks in their methods, and I don't want them to be able to call the method without a lock.
It's not absolutely necessary to protect it to this level but I thought it would be nice if I could.
You can do this by playing with the access modifiers:
public abstract class BaseClass
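{
// lock object; the original snippet used someObject without declaring it
private readonly object someObject = new object();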
public void DoSomethingDangerous()
{
lock (someObject)
{
DoDangerous();
}
}
protected virtual void DoDangerous() { }
}
public class ChildClass : BaseClass
{
protected override void DoDangerous()
{
//Do something here
}
}
Since the public method exists only on the base class and the child's override is protected, outside code cannot call the "unprotected" child method directly. This way the base class controls the context in which the "doer" method is called.
Really though if you want to protect access to some resource you should lock on the calls to that resource, don't try to force user code to be implemented in a certain pattern. If you need to lock on a dictionary access for example, either use the appropriate type (ConcurrentDictionary) or set the access to private and provide getter/setter/deleter methods in the base class and use the locks in there.
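A sketch of the second option, where the dictionary is private to the base class and every accessor takes the lock. CacheBase, PageHits and the member names are made up for illustration:

```csharp
using System.Collections.Generic;

public abstract class CacheBase
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, int> _items = new Dictionary<string, int>();

    // Derived classes can reach the dictionary only through these methods,
    // so every single access happens while the lock is held.
    protected void SetItem(string key, int value)
    {
        lock (_gate) { _items[key] = value; }
    }

    protected bool TryGetItem(string key, out int value)
    {
        lock (_gate) { return _items.TryGetValue(key, out value); }
    }

    protected bool RemoveItem(string key)
    {
        lock (_gate) { return _items.Remove(key); }
    }
}

public sealed class PageHits : CacheBase
{
    public void Record(string page, int hits) => SetItem(page, hits);

    // Returns 0 for pages that were never recorded.
    public int Lookup(string page) => TryGetItem(page, out int hits) ? hits : 0;
}
```

Each call is protected on its own. A read followed by a write is still two separate locked operations, so compound updates would need their own method in the base class.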
I'm trying to find a proper way to restrict/force the usage of methods in order to ensure the correct internal handling.
Given the following abstract base class
public abstract class BaseClass
{
// needs to be overridden by concrete implementation
protected abstract void CreateInternal(object dataToCreate);
// only visible method
public void Create(object dataToCreate)
{
// check the data provided <-- this is important to be done each time
CheckData(dataToCreate);
// call implementation of concrete class
CreateInternal(dataToCreate);
}
private void CheckData(object dataToCheck)
{
if(dataToCheck == null) throw new Exception("Data is not valid");
}
}
and a simple implementation
public class ChildClass : BaseClass
{
protected override void CreateInternal(object dataToCreate)
{
// do create-stuff related to ChildClass
}
}
My question: Is there a way to restrict the access to CreateInternal? In ChildClass I could create a public method
public void DoStuff(object dataToDoStuff)
{
// access protected method is not forbidden
CreateInternal(dataToDoStuff);
}
This will call CreateInternal without the checks that would have run had it been called via Create of the base class.
Is there any way to force the usage of Create prior to CreateInternal? There is no need to have this at compile-time (but it would be nice), but at least at runtime.
I have something like checking who is calling in mind.
public class ChildClass : BaseClass
{
protected override void CreateInternal(object dataToCreate)
{
// if not called via base 'Create' -> throw exception
}
}
Is there some pattern I'm not aware of, or is what I'm trying to achieve too weird and simply not possible?
There is no way of really enforcing this at compile time or at runtime. As you rightly say, a protected virtual method is reachable and overridable from any derived type, so you'd always have to rely on the implementation of the overridden method making the necessary checks, which kind of defeats the purpose.
IMHO your best bet is to enforce this through code reviews if you can control who's extending your class. If that's not the case then, seeing that your Create method is not virtual and is simply changing the state of BaseClass, why don't you call it in the constructor? Is this possible, or is your example a simplified scenario where this isn't an option? Doing this would guarantee that Create is always called first.
UPDATE: Contrary to what I said before, there are "ways" you could enforce this at runtime.
Although not shown in your example, I'm guessing there will be some kind of internal state in BaseClass that any derived class must leverage via methods, properties, fields, etc. to be of any use (inheritance would be kind of pointless otherwise). You could always set a private flag createCalled in BaseClass and make all BaseClass methods, getters (yuck) and setters check the flag and bail out with an InvalidOperationException if it's not set. This would essentially render useless any derived instance that was not correctly initialized. Ugly but doable.
Or even simpler, if you control all potential consumers of BaseClass and any derived type out there in the wild, then just make the flag public (public bool Initialized { get; }), check it when consuming the object, and bail out if necessary.
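The private-flag variant could look roughly like this. It is a sketch, not a drop-in replacement: GuardedBase, GuardedChild and Describe are hypothetical, with Describe standing in for whatever base-class members derived types actually depend on:

```csharp
using System;

public abstract class GuardedBase
{
    private bool _createCalled; // flipped only by Create

    protected abstract void CreateInternal(object dataToCreate);

    public void Create(object dataToCreate)
    {
        if (dataToCreate == null) throw new ArgumentNullException(nameof(dataToCreate));
        CreateInternal(dataToCreate);
        _createCalled = true;
    }

    // Stand-in for any base-class member that derived types rely on.
    public string Describe()
    {
        if (!_createCalled)
            throw new InvalidOperationException("Call Create before using this object.");
        return "created";
    }
}

public sealed class GuardedChild : GuardedBase
{
    protected override void CreateInternal(object dataToCreate) { }

    // Bypasses Create, so the flag stays false and Describe keeps throwing.
    public void DoStuff(object data) => CreateInternal(data);
}
```

Note that this does not stop DoStuff from running CreateInternal. It only makes the instance unusable through the base class until Create has been called.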
"Weird" would not be the word of choice of mine, but still..
As far as I can see, what you are trying to achieve is to prevent the ChildClass owner, who implemented CreateInternal()'s method body, from executing those statements without invoking CheckData() first.
Even if this works, he can still copy and paste the statements into DoStuff() method body and can execute them there.
We do not force, we guide.
People will follow your guideline, and they will be happy to see that their class is working according to it.
I have a class, BaseEmailTemplate, that formats an email, and I want to create a derived type that can overrule the defaults. Originally my base constructor -
public BaseEmailTemplate(Topic topic)
{
CreateAddresses(topic);
CreateSubject(topic);
CreateBody(topic);
}
... (Body/Addresses)
protected virtual void CreateSubject(Topic topic)
{
Subject = string.Format("Base boring format: {0}", topic.Name);
}
And in my derived
public NewEmailTemplate(Topic topic) : base(topic)
{
//Do other things
}
protected override void CreateSubject(Topic topic)
{
Subject = string.Format("New Topic: {0} - {1})", topic.Id, topic.Name);
}
Of course this leads to the error discussed here: Virtual member call in a constructor
So to be absolutely blunt about this - I don't want to have to call the same methods in every derived type. On the flip side, I need to be able to change any/all. I know another base has a different subset of addresses, but the body and subject will be the default.
All three methods must be called, and the ability to alter any one of them needs to be available on a per derived basis.
I mean the thing everyone seems to be saying is an unintended consequence of using virtual seems to be my exact intention.. Or maybe I'm in too deep and singly focused?
UPDATE- Clarification
I understand why calling virtual members in the constructor is bad, and I appreciate the answers on that topic, but my question isn't "Why is this bad?", it's "OK, this is bad, but I can't see what better serves my need, so what do I do?"
This is how it is currently implemented
private void SendNewTopic(TopicDTO topicDto)
{
Topic topic = Mapper.Map<TopicDTO , Topic>(topicDto);
var newEmail = new NewEmailTemplate(topic);
SendEmail(newEmail); //Preexisting Template Reader infrastructure
//Logging.....
}
I'm dealing with a child and grandchild. Where I came in there was only NewEmailTemplate, but I have 4 other templates I now have to build, and 90% of the code is reusable. That's why I opted to create BaseEmailTemplate(Topic topic). BaseTemplate creates things like Subject and List and other things that SendEmail is expecting to read.
NewEmailTemplate(Topic topic): BaseEmailTemplate(Topic topic): BaseTemplate, IEmailTempate
I would prefer not to require anyone who follows my work to know that
var newEmail = new NewEmailTemplate();
newEmail.Init(topic);
is required every single time it is used. The object would be unusable without it. I thought there were many warnings about that?
[10.11] of the C# Specification tells us that object constructors run in order from the base class first, to the most inherited class last. Whereas [10.6.3] of the specification tells us that it is the most derived implementation of a virtual member which is executed at run-time.
What this means is that you may receive a NullReferenceException when a derived method is run from the base class constructor if that method accesses items that are initialized by the derived class, as the derived object has not had its constructor run yet.
Effectively, the base class's constructor runs first [10.11] and calls the derived override of CreateSubject() before the base constructor has finished and before the derived constructor can run, which makes the call questionable.
As has been mentioned, in this case, the derived method seems to only rely upon items passed as parameters, and may well run without issue.
Note that this is a warning, and is not an error per se, but an indication that an error could occur at runtime.
This would not be a problem if the methods were called from any other context except for the base class constructor.
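To make the ordering concrete, here is a small self-contained sketch. BaseTemplate and DerivedTemplate are simplified stand-ins, not the classes from the question, and the field _prefix is invented for the demonstration:

```csharp
using System;

public class BaseTemplate
{
    public string Subject { get; protected set; }

    public BaseTemplate()
    {
        // Virtual call: dispatches to the most derived override,
        // even though the derived constructor body has not run yet.
        CreateSubject();
    }

    protected virtual void CreateSubject() { Subject = "base"; }
}

public class DerivedTemplate : BaseTemplate
{
    private readonly string _prefix;

    public DerivedTemplate()
    {
        _prefix = "derived"; // runs only after the base constructor returns
    }

    protected override void CreateSubject()
    {
        // _prefix is still null when this is called from the base constructor.
        Subject = _prefix == null ? "(prefix not set yet)" : _prefix;
    }
}
```

After new DerivedTemplate(), Subject holds "(prefix not set yet)", because the override ran before the derived constructor assigned _prefix. An override that dereferenced the field instead of checking it would throw.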
A factory method and an initialize function is an effective workaround for this situation.
In the base class:
protected EmailTemplate()
{
// non-public constructor to force the factory method to create the object
// (protected rather than private, so derived classes can still chain to it)
}
public static EmailTemplate CreateBaseTemplate(Topic topic)
{
return (new BaseEmailTemplate()).Initialize(topic);
}
protected EmailTemplate Initialize(Topic topic)
{
// ...call virtual functions here
return this;
}
And in the derived class:
public static EmailTemplate CreateDerivedTemplate(Topic topic)
{
// You do have to copy/paste this initialize logic here, I'm afraid.
return (new DerivedEmailTemplate()).Initialize(topic);
}
protected override CreateSubject...
The only exposed method to create an object will be through the factory method, so you don't have to worry about the end user forgetting to call an initialize. It's not quite as straight-forward to extend, when you want to create further derived classes, but the objects themselves should be quite usable.
A workaround could be to use your constructor to initialize a private readonly Topic _topic field, and then to move the three method calls to a protected void Initialize() method which your derived types can safely call in their constructor, since when that call occurs the base constructor has already executed.
The fishy part is that a derived type needs to remember to make that Initialize() call.
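A sketch of that workaround, with the topic simplified to a string to keep the example self-contained. TemplateBase, NewTopicTemplate and _label are hypothetical names:

```csharp
using System;

public abstract class TemplateBase
{
    private readonly string _topic;

    public string Subject { get; protected set; }

    protected TemplateBase(string topic)
    {
        // The base constructor only stores state; no virtual calls here.
        _topic = topic ?? throw new ArgumentNullException(nameof(topic));
    }

    // Derived constructors call this once their own fields are set.
    protected void Initialize()
    {
        CreateSubject(_topic);
    }

    protected virtual void CreateSubject(string topic)
    {
        Subject = "Base: " + topic;
    }
}

public sealed class NewTopicTemplate : TemplateBase
{
    private readonly string _label;

    public NewTopicTemplate(string topic) : base(topic)
    {
        _label = "New Topic";
        Initialize(); // base constructor has finished and _label is set
    }

    protected override void CreateSubject(string topic)
    {
        Subject = _label + ": " + topic;
    }
}
```

The derived class is sealed here on purpose: if a further subclass overrode CreateSubject, the Initialize() call in this constructor would hit the same ordering problem one level down.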
@Tanzelax: That looks OK, except that Initialize always returns EmailTemplate. So the static factory method won't be quite as sleek:
public static DerivedEmailTemplate CreateDerivedTemplate(Topic topic)
{
// You do have to copy/paste this initialize logic here, I'm afraid.
var result = new DerivedEmailTemplate();
result.Initialize(topic);
return result;
}
This answer is mostly for completeness, in case somebody stumbles upon this question these days (like me).
To avoid a separate Init method while still keeping things simple, one thing that could feel more natural (IMO) to users of the code would be to have Topic as a property of the base class:
// This:
var newEmail = new NewEmailTemplate { Topic = topic };
// Instead of this:
var newEmail = new NewEmailTemplate();
newEmail.Init(topic);
Then, the property setter could take care of calling the abstract methods, such as:
public abstract class BaseEmailTemplate
{
// No need for even a constructor
private Topic topic;
public Topic Topic
{
get => topic;
set
{
if (topic == value)
{
return;
}
topic = value;
// Derived methods could also access the topic
// as this.Topic instead of as an argument
CreateAddresses(topic);
CreateSubject(topic);
CreateBody(topic);
}
}
protected abstract void CreateAddresses(Topic topic);
protected abstract void CreateSubject(Topic topic);
protected abstract void CreateBody(Topic topic);
}
Pros:
An email template can be defined in a single line with an intuitive syntax
No factory methods or third classes involved
Derived classes only need to worry about overriding the abstract methods, and not about calling the base constructor (but you may still want to pass other variables as constructor arguments)
Cons:
You still need to consider the possibility of users forgetting to define Topic, and handle the case of it being null. But I would argue you should do that anyway; somebody could explicitly pass a null topic to the original constructor
You are publicly exposing a Topic property without really needing to. Perhaps you intended to do this anyway but, if not, it might not be ideal. You could remove the getter, but that might look a bit odd
If you have more than one inter-dependent property, the boilerplate code would increase. You could try to group all these into a single class, so that only one setter still triggers the abstract methods
I've got a couple of classes that form a too-complicated object graph. Here's a peek at a smaller scenario. Assume INotifyPropertyChanged is in place.
class A
{
public InternalType InterestingProperty { get; set; }
}
class B
{
public A Component { get; set; }
}
My helper class watches for these events and updates its properties when the properties of the objects change. It does this so that about a dozen properties, spread across as many objects, are easily accessible to some other class that's interested in them. This is all packed in a framework that has several variants, so inheritance is in play.
I've finished the first scenario, and ended up with a concrete class like this:
class ScenarioOnePropertySpy
{
protected ScenarioOnePropertySpy(Foo thingToMonitor)
{
_thingToMonitor = thingToMonitor;
RegisterForEvents();
}
public B InterestingB { get; private set; }
protected void RegisterForEvents()
{
// * Register for _thingToMonitor propertyChanged if first time.
// * If B is different, unregister the old and register the new.
// * If B hasn't been set yet register for PropertyChanged on it.
// * If B.Component isn't the same as last time unregister the
// old and register the new.
}
protected void Update()
{
// Some monitored object changed; refresh property values and
// update events in case some monitored object was replaced.
InterestingB = _thingToMonitor.B;
RegisterForEvents();
}
private void Handle_PropertyChanged(...) { Update(); }
}
It's icky event registration, but moving that ugliness out of the class that wants to know about the properties is the purpose. Now I've moved on to scenario 2 that monitors different objects/properties and used my concrete class as a guide for an abstract one:
abstract class PropertySpy
{
protected PropertySpy(FooBase thingToMonitor)
{
_thingToMonitor = thingToMonitor;
RegisterForEvents();
}
protected abstract void RegisterForEvents();
// ...
}
Whoops. I've got a virtual method call in the constructor. In theory it's safe for all of my scenarios, but the R# warning keeps digging at me. I'm sure if I move forward one day it's going to cause a problem that'll take a while to figure out. That method's definitely going to need to work with properties on the derived types.
I could drop the method and force derived types to do the event management themselves. That'd defeat the purpose of the base class. And someone would forget to follow the contract and it'd turn into a support incident; I spend enough time writing documentation as it is. Another one I thought of was making RegisterForEvents() public and requiring users to call it after construction. That "create then initialize" pattern stinks in .NET and people always forget. Currently I'm toying with the notion of another class that does the event registration and is injected via the constructor. Then derived classes can provide that class to achieve the same effect as a virtual method without the dangers. But the thing doing the registration would need practically the same property interface as PropertySpy; it seems tedious but I guess "ugly and works" is better than what I've got.
Anything I'm missing? I'll even take "it's a warning, not a rule" as an answer if the argument is convincing.
Your scenario seems complicated enough to consider a completely different approach to class instantiation. What about using a factory to construct property spies?
public class PropertySpyFactory<T> where T : PropertySpy, new()
{
public static T Create()
{
T result = new T();
// … whatever initialization needs to be done goes here …
result.RegisterForEvents();
return result;
}
}
ScenarioOnePropertySpy spy = PropertySpyFactory<ScenarioOnePropertySpy>.Create();
It's salvageable in the code, instance initialization can be extended easily, and once you turn to an IoC container it will feel quite natural and not much refactoring will be needed.
UPDATE: Another option, in case a) your spy hierarchy is flat enough and b) you don't need to use a common ancestor or you can substitute it with an interface:
public abstract class PropertySpy<T> where T : PropertySpy<T>, new()
{
public static T Create()
{
T result = new T();
// … whatever initialization needs to be done goes here …
result.RegisterForEvents();
return result;
}
…
}
public class ScenarioOnePropertySpy : PropertySpy<ScenarioOnePropertySpy>
{
…
}
ScenarioOnePropertySpy spy = ScenarioOnePropertySpy.Create();
In other words, the factory method is located right within the common ancestor. The drawback of this approach is that it isn't that orthogonal (the factory isn't separated from the classes being constructed) and hence is less extensible and flexible. However, in certain cases it may be a valid option.
Last but not least, you can create a factory method in each class again. The advantage is you can keep constructors protected and hence force users to use factory methods instead of direct instantiation.
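For that last variant, a sketch under simplifying assumptions: the monitored object is reduced to a string, and event registration is reduced to recording a message in a list. SpyBase, ScenarioOneSpy, Finish and Registered are hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public abstract class SpyBase
{
    // Stand-in for real event subscriptions, so the effect is observable.
    public List<string> Registered { get; } = new List<string>();

    protected SpyBase() { }

    protected abstract void RegisterForEvents();

    // Shared helper: runs the virtual step only after construction has completed.
    protected static T Finish<T>(T spy) where T : SpyBase
    {
        spy.RegisterForEvents();
        return spy;
    }
}

public sealed class ScenarioOneSpy : SpyBase
{
    private readonly string _target;

    // Private constructor: callers must go through Create.
    private ScenarioOneSpy(string target) { _target = target; }

    public static ScenarioOneSpy Create(string target)
    {
        return Finish(new ScenarioOneSpy(target));
    }

    protected override void RegisterForEvents()
    {
        Registered.Add("PropertyChanged on " + _target);
    }
}
```

Usage would be ScenarioOneSpy.Create("B"). Because the constructor is private, there is no way to obtain an instance that skipped RegisterForEvents, and the override can safely read _target.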
The key issue, I believe, is that by the time the virtual method is called, your subclass constructor has not executed yet. So, in your overridden method, your subclass may not yet have initialized everything you expect it to.
I want to develop a process() method. The method takes some data in the form of a data class, and processes it. The data classes are similar, but slightly different.
For example we have the following classes of data processDataObject_A, processDataObject_B and processDataObject_C.
Is it better to overload the method:
void process(processDataObject_A data)
{
//Process processDataObject_A here
}
void process(processDataObject_B data)
{
//Process processDataObject_B here
}
void process(processDataObject_C data)
{
//Process processDataObject_C here
}
OR have the concrete data classes extend some Abstract Data Class, and pass that to the process method and then have the method check the type and act accordingly:
void process(AbstractProcessDataObject data)
{
//Check for type here and do something
}
OR is there some better way to address it? Would the approach change if this were to be a Web Method?
Thanks in advance
I would go with:
process (data) {
data.doProcessing();
}
The fact that your methods return void leads me to believe that you may have your responsibilities turned around. I think it may be better to think about this as having each of your classes implement an interface, IProcessable, that defines a Process method. Then each class would know how to manipulate its own data. This, I think, is less coupled than having a class which manipulates data inside each object. Assuming all of these classes derive from the same base class, you could put the pieces of the processing algorithm that are shared in the base class.
This is slightly different than the case where you may have multiple algorithms that operate on identical data. If you need this sort of functionality then you may still want to implement the interface, but have the Process method take a strategy type parameter and use a factory to create an appropriate strategy based on its type. You'd end up having a strategy class for each supported algorithm and data class pair this way, but you'd be able keep the code decoupled. I'd probably only do this if the algorithms were reasonably complex so that separating the code makes it more readable. If it's just a few lines that are different, using the switch statement on the strategy type would probably suffice.
With regard to web methods, I think I'd have a different signature per class. Getting the data across the wire correctly will be much easier if the methods take concrete classes of the individual types so it knows how to serialize/deserialize it properly. Of course, on the back end the web methods could use the approach described above.
public interface IProcessable
{
void Process();
}
public abstract class ProcessableBase : IProcessable
{
public virtual void Process()
{
// ... standard processing code...
}
}
public class FooProcessable : ProcessableBase
{
public override void Process()
{
base.Process();
// ... specific processing code
}
}
...
IProcessable foo = new FooProcessable();
foo.Process();
Implementing the strategy-based mechanism is a little more complex.
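A stripped-down sketch of what the strategy variant might look like. For brevity the caller passes the strategy object directly instead of going through a factory keyed on a strategy type, and all names here (IProcessingStrategy, FooData, DoubleFoo, NegateFoo) are invented for the example:

```csharp
using System;

// One strategy per supported (algorithm, data class) pair.
public interface IProcessingStrategy<T>
{
    string Apply(T data);
}

public sealed class FooData
{
    public int Value { get; set; }

    // The data object delegates to whichever strategy it is given.
    public string Process(IProcessingStrategy<FooData> strategy)
    {
        if (strategy == null) throw new ArgumentNullException(nameof(strategy));
        return strategy.Apply(this);
    }
}

public sealed class DoubleFoo : IProcessingStrategy<FooData>
{
    public string Apply(FooData data) => "doubled: " + (data.Value * 2);
}

public sealed class NegateFoo : IProcessingStrategy<FooData>
{
    public string Apply(FooData data) => "negated: " + (-data.Value);
}
```

Adding a new algorithm for FooData then means adding one more strategy class, without touching FooData or the existing strategies.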
Web interface, using data access objects
[WebService]
public class ProcessingWebService
{
public void ProcessFoo( FooDataObject foo )
{
// you'd need a constructor to convert the DAO
// to a Processable object.
IProcessable fooProc = new FooProcessable( foo );
fooProc.Process();
}
}
I second Marko's design.
Imagine you need to add another type of data structure and process logic, say processDataObject_D. With your first proposed solution (method overloading), you will have to modify the class by adding another method. With your second proposed solution, you will have to add another condition to the type checking and execution statement. Both require you to modify the existing code.
Marko's solution is to avoid modifying the existing code by leveraging polymorphism. You don't have to code if-else type checking. It allows you to add new data structure and process logic without modifying the existing code as long as the new class inherits the same super class.
Studying the Strategy pattern will give you a full theoretical understanding of the problem you are facing. The book "Head First Design Patterns" from O'Reilly is the best introduction I know of.
How about polymorphism on AbstractProcessDataObject - i.e. a virtual method? If this isn't appropriate (separation of concerns etc), then the overload would seem preferable.
Re web-methods; very different: neither polymorphism nor overloading is very well supported (at least, not on basic-profile). The detection option "Check for type here and do something" might be the best route. Or have differently named methods for each type.
per request:
abstract class SomeBase { // think: AbstractProcessDataObject
public abstract void Process();
}
class Foo : SomeBase {
public override void Process() { /* do A */ }
}
class Bar : SomeBase {
public override void Process() { /* do B */ }
}
SomeBase obj = new Foo();
obj.Process();
I believe the Strategy pattern would help you.