I have a legacy method which processes various quantities in real time. There is lots of data, and this method is basically a large if/switch mess which decides how to calculate the target value based on certain rules, and does this for each sample received from each device (and there are many of them). Its signature is something like:
double Process(ITimestampedData data, IProcessingRule rule);
where ITimestampedData contains multiple quantities' values for a single timestamp, while IProcessingRule defines which value to use and how to process it to get the result (which can then be compared to a threshold).
I would like to get rid of all ifs and switches and refactor this into a factory which would create a single processing method for each rule, and then run these methods for input data. Since these rules have various parameters, I would also like to see if there is a way to fully resolve all these branches at compile-time (well, run-time, but I am referring to the point where I invoke the factory method once to "compile" my processing delegate).
So, I have something like this, but much more complex (more mutually-dependent conditions and various rules):
// this runs on each call
double result;
switch (rule.Quantity)
{
case QuantityType.Voltage:
{
Vector v;
if (rule.Type == VectorType.SinglePhase)
{
v = data.Vectors[Quantity.Voltage].Phases[rule.Phase];
if (rule.Phase == PhaseType.Neutral)
{
v = v * 2; // making this up just to make a point
}
}
else if (rule.Type == VectorType.Symmetry)
{
v = CalculateSymmetry(data.Vectors);
}
if (rule.TargetProperty == PropertyType.Magnitude)
{
result = v.Magnitude();
if (rule.Normalize)
{
result /= rule.NominalValue;
}
}
    break;
}
// ... this doesn't end so soon
Into something like this:
// this is a factory method which will return a single delegate
// for each rule - and do it only once, at startup
Func<ITimestampedData, double> GetProcessor(IProcessingRule rule)
{
Func<ITimestampedData, Vectors> quantityGetter;
Func<Vectors, Vector> vectorGetter;
Func<Vector, double> valueGetter;
quantityGetter = data => data.Vectors[rule.Quantity];
if (rule.Type == VectorType.SinglePhase)
{
if (rule.Phase == PhaseType.Neutral)
vectorGetter = vectors => 2 * vectors.Phases[rule.Phase];
else
vectorGetter = vectors => vectors.Phases[rule.Phase];
}
else if (rule.Type == VectorType.Symmetry)
{
vectorGetter = vectors => CalculateSymmetry(vectors);
}
if (rule.TargetProperty == PropertyType.Magnitude)
{
if (rule.Normalize)
valueGetter = v => v.Magnitude() / rule.NominalValue;
else
valueGetter = v => v.Magnitude();
}
...
// now we just chain all delegates into a single "if-less" call
return data => valueGetter(vectorGetter(quantityGetter(data)));
}
But the problem is:
I still have lots of repetition inside my method,
I have swapped ifs for multiple chained delegate invocations, so performance doesn't get any better,
although this "chain" is fixed and known at the end of the factory method, I still don't have a single compiled method which would process my input.
So, finally, my question is:
Is there a way to somehow "build" the final compiled method from these various chunks of code inside my factory?
I know I can use something like CSharpCodeProvider, create a huge string and then compile it, but I was hoping for something with better compile-time support and type checking.
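Something like expression trees comes to mind - a rough sketch of what I imagine, using the types above (I'm not at all sure this is the right approach):

// sketch: build the chunks as expression trees instead of delegates,
// then compile the composed chain into a single method at factory time
Expression<Func<ITimestampedData, Vectors>> quantity = data => data.Vectors[rule.Quantity];
Expression<Func<Vectors, Vector>> vector = vs => vs.Phases[rule.Phase];
Expression<Func<Vector, double>> value = v => v.Magnitude() / rule.NominalValue;

var param = Expression.Parameter(typeof(ITimestampedData), "data");
var body = Expression.Invoke(value,
               Expression.Invoke(vector,
                   Expression.Invoke(quantity, param)));
return Expression.Lambda<Func<ITimestampedData, double>>(body, param).Compile();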
Factories
The switch statement is usually a bad smell in code, and your feelings about it are completely right. But factories are a perfectly valid place for switch statements. Just don't forget that a factory's responsibility is to construct objects, so make sure any extra logic stays outside of the factory.

Also, don't confuse Factories with Factory Methods. The first are used when you have a group of polymorphically exchangeable classes and your factory decides which one to use; this also helps to break dependencies. Factory methods, on the other hand, are more like static constructors that know about all the dependencies of the constructed object. I recommend being careful with factory methods and preferring proper Factory classes instead. Consider this in terms of the SRP: a Factory's responsibility is to construct the object, while your class has some business responsibility. When you use a Factory Method, your class gets two responsibilities.
Indentation
There is a good rule I try to follow, called "One indentation level per method". That means you can have only one more level of indentation beyond the method's root level. This is a valid and readable piece of code:
function something() {
doSomething();
if (isSomethingValid()) {
doSomethingElse();
}
return someResult();
}
Try to follow this rule by extracting private methods, and you will see that the code becomes much clearer.
If/Else statements
It is proven that the else statement is always optional - you can always refactor your code to not use it. The solution is simple: use early returns. Your methods will become much shorter and way more readable.
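For example (a small sketch reusing the rule types from the question):

double GetMagnitude(Vector v, IProcessingRule rule)
{
    if (!rule.Normalize)
        return v.Magnitude(); // early return: no else branch needed

    return v.Magnitude() / rule.NominalValue;
}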
Probably my answer is not good enough to solve all your problems, but at least it gives you some ideas to think about.
If you are working with legacy code, I strongly recommend reading "Working Effectively with Legacy Code" book by Michael Feathers and of course "Refactoring" by Martin Fowler.
Think about giving your rules more functionality of their own.
You know what rule you want because you passed it in. But then, in your current code, you ask the rule about itself to determine which calculation to do. I suggest you make the rules more intelligent and ask the rule for the result.
For example the rule that does the most calculations is the SinglePhaseNeutralVoltageMagnitudeNormalizedRule.
class SinglePhaseNeutralVoltageMagnitudeNormalizedRule : IProcessingRule
{
public double calculate(ITimestampedData data)
{
double result;
Vector v;
v = data.Vectors[Quantity.Voltage].Phases[Phase];
v = v * 2; // making this up just to make a point
result = v.Magnitude();
result /= NominalValue;
return result;
}
}
So the Process method becomes much simpler
result = rule.calculate(data);
A factory class, as suggested by @SergeKuharev, could be used to build the rules if there is much complexity there. Also, if there is much common code between the rules themselves, it could be refactored to a common place.
For example, Normalization could be a rule that simply wraps another rule.
class NormalizeRule : IProcessingRule
{
private IProcessingRule priorRule;
private double nominalValue;
public NormalizeRule(IProcessingRule priorRule, double nominalValue)
{
this.priorRule = priorRule;
this.nominalValue = nominalValue;
}
public double calculate(ITimestampedData data)
{
return priorRule.calculate(data)/nominalValue;
}
}
So given that, and a class SinglePhaseNeutralVoltageMagnitudeRule (as above, less the /= nominalValue), a factory could combine the two to make a SinglePhaseNeutralVoltageMagnitudeNormalizedRule by composition.
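For example, such a factory could look roughly like this (a sketch; the parameter list is an assumption):

class RuleFactory
{
    public static IProcessingRule Create(bool normalize, double nominalValue)
    {
        IProcessingRule rule = new SinglePhaseNeutralVoltageMagnitudeRule();
        if (normalize)
            rule = new NormalizeRule(rule, nominalValue); // wrap by composition
        return rule;
    }
}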
Related
I am a C#-Newbie.
I have a function that is supposed to return all values from a List that have a matching time-stamp:
static public PointCloud getPointsByTime (float time)
{
PointCloud returnList = new List<PointData> ();
for (int i = 0; i < _pointCloud.Count; i++) {
if (_pointCloud [i].time == time) {
returnList.Add (_pointCloud [i]);
}
}
return returnList;
}
Where
public class PointData
{
public float time;
// and some other members
}
and
// let's call a list of PointData-objects a PointCloud
using PointCloud = System.Collections.Generic.List<PointData>;
Does my function do what I want it to do? Or do I have to create a new PointData-object? Am I able to use my returned PointCloud or will it be out of scope and deleted?
This may not be the best example to explain my problem, so feel free to link me to something better. I think you get what my basic questions are.
As @Patrick suggested, inheriting from List seems more reasonable, but I would go further and just use a List<PointData> directly, so you don't create an unnecessary class if it is not going to add anything extra.
I also suggest you have a look at LINQ, which makes the code more readable and is a very powerful feature you will want to master as soon as possible. :)
Your method could then look like this:
return _pointCloud.Where(p => p.time == time).ToList();
Also try to get familiar with properties:
public class PointData
{
public float Time { get; set; }
}
And you may want to follow the more standard C# coding style (although this is completely personal) of using PascalCase for public members instead of camelCase.
Your code is correct. You can use your function like so:
var someTime = 0.0f;
var pointsAtTime = getPointsByTime(someTime);
DoSomethingWith(pointsAtTime);
The return value from the function remains in scope if you assign it to some local variable (e.g. pointsAtTime here).
EDIT: As Peter Schneider correctly notes in the comments, you need to be aware that this function creates a new list with references to the matching points, and does not create new points. This might or might not be what you want.
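For example (a hypothetical snippet):

var pointsAtTime = getPointsByTime(1.0f);
pointsAtTime[0].time = 2.0f; // this also changes the object inside _pointCloud,
                             // because both lists reference the same PointData instance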
However, if you're new to C#, here are some things you might want to keep in mind:
Methods in C# are conventionally named in PascalCase, e.g. GetPointsByTime and not getPointsByTime.
Assigning names to generic types like using PointCloud = List<PointData>, while technically allowed, is not very idiomatic and might confuse other readers of your code. If you think a list of PointData is special enough to have its own type, create a type for it (either by inheriting from List<PointData> or, preferably, using an IList<PointData> as a member in a new PointCloud class). Or just add using System.Collections.Generic and use List<PointData> throughout your code, which is what most people would do.
Comparing floating-point numbers for equality is sometimes discouraged, as it might fail in some cases due to representation errors; if the time is truly a continuous value, you might want to look for points in a specific time period (e.g. points that fall within some range around your desired time). You don't really have to worry about this for now, though.
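If you do need it, a tolerance-based comparison might look like this (the epsilon value is an arbitrary assumption; pick one that suits your data):

const float epsilon = 0.0005f;
return _pointCloud.Where(p => Math.Abs(p.time - time) < epsilon).ToList();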
In a project I'm working on, I have a set of blocks that make up a 3D voxel based environment (like Minecraft). These worlds are stored in an external data file.
This file contains the data for:
Each block,
its location,
and its type.
When the LoadLevel method is called, I want it to iterate over the data for every block in the file, creating a new instance of the Block object for every one. It's no problem to pass things like location. It's as simple as
CreateBlock(Vector3 position)
The issue is with the type. All types are child classes (think Abstract Block, and then subtypes like GrassBlock or WaterBlock that inherit the abstract Block's properties.) Assuming there's a child class like "GrassBlock" that I want to be created, rather than a generic block, how do I make it do this through the method? The only way I know of is through reflection, which I've been advised to stay away from. Is it possible that I can do this through generic typing or something?
This seems like such an important question in game design, but no one I've asked seems to have any idea. Any help?
Generic typing will still require reflection.
First of all: what you're looking for is the factory pattern. It creates objects for you without having to do it explicitly yourself everywhere.
Basically there are two options:
Reflection
This indeed has a performance impact connected to it but don't dismiss it if you haven't determined it to be a problem yet. It will be readable and maintainable.
Switches
Hardcode every option and create a new instance based on some sort of metadata you pass in (something that will identify the type of each block). This has the benefit of not using reflection and as such not incurring that performance penalty but it will also be less extensible and if you have 500 different blocks you can guess what your code will look like.
Of course, you can create objects without any reflection.
Simply assign each class an integer index:
Func<Vector3, Block>[] factories =
{
(v) => new GrassBlock(v), // index 0
(v) => new WaterBlock(v), // index 1
. . .
}
Save this index in the external data. At deserialization time read Vector3 v and index i, then call var block = factories[i](v);
Without reflection, you can use a factory method, with a switch. Assume BlockType is an enum.
public static Block CreateBlock(BlockType type, Vector3 position)
{
switch (type)
{
case BlockType.Grass:
return new GrassBlock(position);
case BlockType.Water:
return new WaterBlock(position);
default:
throw new InvalidOperationException();
}
}
But to have something more maintainable, you could still use reflection until it proves to be a bottleneck. In that case, you could switch to runtime code generation.
private static readonly Dictionary<Type, Func<Vector3, Block>> _activators = new Dictionary<Type, Func<Vector3, Block>>();
public static Block CreateBlock(Type blockType, Vector3 position)
{
Func<Vector3, Block> factory;
if (!_activators.TryGetValue(blockType, out factory))
{
if (!typeof(Block).IsAssignableFrom(blockType))
throw new ArgumentException();
var posParam = Expression.Parameter(typeof(Vector3));
factory = Expression.Lambda<Func<Vector3, Block>>(
Expression.New(
blockType.GetConstructor(new[] { typeof(Vector3) }),
new[] { posParam }
),
posParam
).Compile();
_activators.Add(blockType, factory);
}
return factory(position);
}
This code will generate a factory function at runtime, the first time a block of a given type is requested. And you could make this function thread-safe if needed by using a ConcurrentDictionary.
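For illustration, the thread-safe variant could look roughly like this (a sketch using ConcurrentDictionary.GetOrAdd):

private static readonly ConcurrentDictionary<Type, Func<Vector3, Block>> _activators =
    new ConcurrentDictionary<Type, Func<Vector3, Block>>();

public static Block CreateBlock(Type blockType, Vector3 position)
{
    // GetOrAdd stores a single compiled factory per type, even under concurrency
    var factory = _activators.GetOrAdd(blockType, t =>
    {
        if (!typeof(Block).IsAssignableFrom(t))
            throw new ArgumentException("Not a Block type");
        var posParam = Expression.Parameter(typeof(Vector3));
        return Expression.Lambda<Func<Vector3, Block>>(
            Expression.New(t.GetConstructor(new[] { typeof(Vector3) }), posParam),
            posParam
        ).Compile();
    });
    return factory(position);
}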
But that may be a bit overkill for your purpose ;)
Why are you avoiding reflection? If you're able to execute this code only on startup (which it sounds like you can do if you're reading a file) then I don't personally have too big a problem with using reflection.
An alternative is to store the fully qualified type name (e.g. My.System.Blocks.GrassBlock) and load that type with
var typeName = readStringTypeFromFile(file);
var type = Type.GetType(typeName);
var block = (Block)Activator.CreateInstance(type, location);
As I said, running something like this on startup is fine by me, and you can test the performance of this if need be.
Quick and dirty fiddle: https://dotnetfiddle.net/BDmlyi
Now and again I end up with code along these lines, where I create some objects then loop through them to initialise some properties using another class...
ThingRepository thingRepos = new ThingRepository();
GizmoProcessor gizmoProcessor = new GizmoProcessor();
WidgetProcessor widgetProcessor = new WidgetProcessor();
public List<Thing> GetThings(DateTime date)
{
List<Thing> allThings = thingRepos.FetchThings();
// Loops through setting thing.Gizmo to a new Gizmo
gizmoProcessor.AddGizmosToThings(allThings);
// Loops through setting thing.Widget to a new Widget
widgetProcessor.AddWidgetsToThings(allThings);
return allThings;
}
...which just, well, feels wrong.
Is this a bad idea?
Is there a name of an anti-pattern that I'm using here?
What are the alternatives?
Edit: assume that both GizmoProcessor and WidgetProcessor have to go off and do some calculation, and get some extra data from other tables. They're not just data stored in a repository. They're creating new Gizmos and Widgets based on each Thing and assigning them to Thing's properties.
The reason this feels odd to me is that Thing isn't an autonomous object; it can't create itself and child objects. It's requiring higher-up code to create a fully finished object. I'm not sure if that's a bad thing or not!
ThingRepository is supposed to be the single access point to get collections of Things, or at least that's where developers will intuitively look. For that reason, it feels strange that GetThings(DateTime date) should be provided by another object. I'd rather place that method in ThingRepository itself.
The fact that the Things returned by GetThings(DateTime date) are different, "fatter" animals than those returned by ThingRepository.FetchThings() also feels awkward and counter-intuitive. If Gizmo and Widget are really part of the Thing entity, you should be able to access them every time you have an instance of Thing, not just for instances returned by GetThings(DateTime date).
If the Date parameter in GetThings() isn't important or could be gathered at another time, I would use calculated properties on Thing to implement on-demand access to Gizmo and Widget:
public class Thing
{
//...
public Gizmo Gizmo
{
get
{
// calculations here
}
}
public Widget Widget
{
get
{
// calculations here
}
}
}
Note that this approach is valid as long as the calculations performed are not too costly. Calculated properties with expensive processing are not recommended - see http://msdn.microsoft.com/en-us/library/bzwdh01d%28VS.71%29.aspx#cpconpropertyusageguidelinesanchor1
However, these calculations don't have to be implemented inline in the getters - they can be delegated to third-party Gizmo/Widget processors, potentially with a caching strategy, etc.
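For example, one simple per-instance caching approach (a sketch; the IGizmoProcessor dependency is hypothetical):

public class Thing
{
    private readonly Lazy<Gizmo> _gizmo;

    public Thing(IGizmoProcessor gizmoProcessor)
    {
        // the calculation runs once, on first access, and is cached afterwards
        _gizmo = new Lazy<Gizmo>(() => gizmoProcessor.BuildGizmo(this));
    }

    public Gizmo Gizmo
    {
        get { return _gizmo.Value; }
    }
}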
If you have complex initialization then you could use a Strategy pattern. Here is a quick overview adapted from this strategy pattern overview.
Create a strategy interface to abstract the initialization:
public interface IThingInitializationStrategy
{
void Initialize(Thing thing);
}
The initialization implementation that can be used by the strategy
public class GizmosInitialization : IThingInitializationStrategy
{
public void Initialize(Thing thing)
{
// Add gizmos here and other initialization
}
}
public class WidgetsInitialization : IThingInitializationStrategy
{
public void Initialize(Thing thing)
{
// Add widgets here and other initialization
}
}
And finally a service class that accepts the strategy implementation in an abstract way
internal class ThingInitializationService
{
    private readonly IThingInitializationStrategy _initStrategy;

    public ThingInitializationService(IThingInitializationStrategy initStrategy)
    {
        _initStrategy = initStrategy;
    }

    public void Initialize(Thing thing)
    {
        _initStrategy.Initialize(thing);
    }
}
You can then use the initialization strategies like so
var initializationStrategy = new GizmosInitialization();
var initializationService = new ThingInitializationService(initializationStrategy);
List<Thing> allThings = thingRepos.FetchThings();
allThings.ForEach(thing => initializationService.Initialize(thing));
The only real potential problem is that you're iterating over the same loop multiple times, but if you need to hit a database to get all the gizmos and widgets, then it might be more efficient to request them in batches, so passing the full list to your Add... methods makes sense.
The other option would be to look into returning the gizmos and widgets with the thing in the first repository call (assuming they reside in the same repo). It might make the query more complex, but it would probably be more efficient. Unless of course you don't ALWAYS need to get gizmos and widgets when you fetch things.
To answer your questions:
Is this a bad idea?
From my experience, you rarely know if it's a good/bad idea until you need to change it.
IMO, code is either: Over-engineered, under-engineered, or unreadable
In the meantime, you do your best and stick to the best practices (KISS, single responsibility, etc)
Personally, I don't think the processor classes should be modifying the state of any Thing.
I also don't think the processor classes should be given a collection of Things to modify.
Is there a name of an anti-pattern that I'm using here?
Sorry, unable to help.
What are the alternatives?
Personally, I would write the code as such:
public List<Thing> GetThings(DateTime date)
{
List<Thing> allThings = thingRepos.FetchThings();
// Build the gizmo and widget for each thing
foreach (var thing in allThings)
{
thing.Gizmo = gizmoProcessor.BuildGizmo(thing);
thing.Widget = widgetProcessor.BuildWidget(thing);
}
return allThings;
}
My reasons being:
The code is in a class that "Gets things". So logically, I think it's acceptable for it to traverse each Thing object and initialise them.
The intention is clear: I'm initialising the properties for each Thing before returning them.
I prefer initialising any properties of Thing in a central location.
I don't think that gizmoProcessor and widgetProcessor classes should have any business with a Collection of Things
I prefer the Processors to have a method to build and return a single widget/gizmo
However, if your processor classes are building several properties at once, then only would I refactor the property initialisation to each processor.
public List<Thing> GetThings(DateTime date)
{
List<Thing> allThings = thingRepos.FetchThings();
// Build the gizmo and widget for each thing
foreach (var thing in allThings)
{
// [Edited]
// Notice a trend here: The common Initialize(Thing) interface
// Could probably be refactored into some
// super-mega-complex Composite Builder-esque class should you ever want to
gizmoProcessor.Initialize(thing);
widgetProcessor.Initialize(thing);
}
return allThings;
}
P.s.:
I personally do not care that much for (Anti)Pattern names.
While it helps to discuss a problem at a higher level of abstraction, I wouldn't commit every (anti)pattern names to memory.
When I come across a Pattern that I believe is helpful, then only do I remember it.
I'm quite lazy, and my rationale is that: Why bother remembering every pattern and anti pattern if I'm only going to use a handful?
[Edit]
Noticed an answer was already given regarding using a Strategy Service.
I'm working on software which allows the user to extend a system by implementing a set of interfaces.
In order to test the viability of what we're doing, my company "eats its own dog food" by implementing all of our business logic in these classes in the exact same way a user would.
We have some utility classes / methods that tie everything together and use the logic defined in the extendable classes.
I want to cache the results of the user-defined functions. Where should I do this?
Should it be the classes themselves? This seems like it could lead to a lot of code duplication.
Should it be the utilities/engine which uses these classes? If so, an uninformed user may call the class function directly and not receive any caching benefit.
Example code
public interface ILetter { string[] GetAnimalsThatStartWithMe(); }
public class A : ILetter
{
    public string[] GetAnimalsThatStartWithMe()
    {
        return new [] { "Aardvark", "Ant" };
    }
}

public class B : ILetter
{
    public string[] GetAnimalsThatStartWithMe()
    {
        return new [] { "Baboon", "Banshee" };
    }
}

/* ...Left to user to define... */
public class Z : ILetter
{
    public string[] GetAnimalsThatStartWithMe()
    {
        return new [] { "Zebra" };
    }
}
public static class LetterUtility
{
public static string[] GetAnimalsThatStartWithLetter(char letter)
{
if(letter == 'A') return (new A()).GetAnimalsThatStartWithMe();
if(letter == 'B') return (new B()).GetAnimalsThatStartWithMe();
/* ... */
if(letter == 'Z') return (new Z()).GetAnimalsThatStartWithMe();
throw new ApplicationException("Letter " + letter + " not found");
}
}
Should LetterUtility be responsible for caching? Should each individual instance of ILetter? Is there something else entirely that can be done?
I'm trying to keep this example short, so these example functions don't need caching. But consider I add this class that makes (new C()).GetAnimalsThatStartWithMe() take 10 seconds every time it's run:
public class C : ILetter
{
public string[] GetAnimalsThatStartWithMe()
{
Thread.Sleep(10000);
return new [] { "Cat", "Capybara", "Clam" };
}
}
I find myself battling between making our software as fast as possible (in this example: caching the result in LetterUtility) and maintaining less code but doing the exact same work over and over (in this example: waiting 10 seconds every time C is used).
Which layer is best responsible for caching of the results of these user-definable functions?
The answer is pretty obvious: the layer that can correctly implement the desired cache policy is the right layer.
A correct cache policy needs to have two characteristics:
It must never serve up stale data; it must know whether the method being cached is going to produce a different result, and invalidate the cache at some point before the caller would get stale data
It must manage cached resources efficiently on the user's behalf. A cache without an expiration policy that grows without bounds has another name: we usually call them "memory leaks".
What's the layer in your system that knows the answers to the questions "is the cache stale?" and "is the cache too big?" That's the layer that should implement the cache.
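For illustration only, here is what an explicit expiration policy might look like on the client side, using System.Runtime.Caching.MemoryCache against the question's LetterUtility (the 10-minute policy is an arbitrary assumption):

using System.Runtime.Caching;

static string[] GetAnimalsCached(char letter)
{
    var cache = MemoryCache.Default;
    var key = "animals-" + letter;
    var animals = (string[])cache.Get(key);
    if (animals == null)
    {
        animals = LetterUtility.GetAnimalsThatStartWithLetter(letter);
        // an absolute expiration bounds both staleness and cache growth over time
        cache.Set(key, animals, new CacheItemPolicy
        {
            AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(10)
        });
    }
    return animals;
}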
Something like caching can be considered a "cross-cutting" concern (http://en.wikipedia.org/wiki/Cross-cutting_concern):
In computer science, cross-cutting concerns are aspects of a program which affect other concerns. These concerns often cannot be cleanly decomposed from the rest of the system in both the design and implementation, and can result in either scattering (code duplication), tangling (significant dependencies between systems), or both.
For instance, if writing an application for handling medical records, the bookkeeping and indexing of such records is a core concern, while logging a history of changes to the record database or user database, or an authentication system, would be cross-cutting concerns since they touch more parts of the program.
Cross cutting concerns can often be implemented via Aspect Oriented Programming (http://en.wikipedia.org/wiki/Aspect-oriented_programming).
In computing, aspect-oriented programming (AOP) is a programming paradigm which aims to increase modularity by allowing the separation of cross-cutting concerns. AOP forms a basis for aspect-oriented software development.
There are many tools in .NET to facilitate Aspect Oriented Programming. I'm most fond of those that provide completely transparent implementation. In the example of caching:
public class Foo
{
[Cache(10)] // cache for 10 minutes
public virtual void Bar() { ... }
}
That's all you need to do...everything else happens automatically by defining a behavior like so:
public class CachingBehavior
{
public void Intercept(IInvocation invocation) { ... }
// this method intercepts any method invocations on methods attributed with the [Cache] attribute.
// In the case of caching, this method would check if some cache store contains the data, and if it does return it...else perform the normal method operation and store the result
}
There are two general schools for how this happens:
Post build IL weaving. Tools like PostSharp, Microsoft CCI, and Mono Cecil can be configured to automatically rewrite these attributed methods to automatically delegate to your behaviors.
Runtime proxies. Tools like Castle DynamicProxy and Microsoft Unity can automatically generate proxy types (a type derived from Foo that overrides Bar in the example above) that delegates to your behavior.
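For illustration, a minimal caching interceptor with Castle DynamicProxy might look like this (a sketch: it ignores the [Cache] attribute check, and the cache key ignores arguments for brevity):

using System.Collections.Generic;
using Castle.DynamicProxy;

public class CachingInterceptor : IInterceptor
{
    private readonly Dictionary<string, object> _cache = new Dictionary<string, object>();

    public void Intercept(IInvocation invocation)
    {
        var key = invocation.Method.Name; // a real key would include the arguments
        object cached;
        if (_cache.TryGetValue(key, out cached))
        {
            invocation.ReturnValue = cached; // short-circuit: skip the real method
            return;
        }
        invocation.Proceed();                // run the intercepted method
        _cache[key] = invocation.ReturnValue;
    }
}

// usage:
// var foo = new ProxyGenerator().CreateClassProxy<Foo>(new CachingInterceptor());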
Although I do not know C#, this seems like a case for using AOP (Aspect-Oriented Programming). The idea is that you can 'inject' code to be executed at certain points in the execution stack.
You can add the caching code as follows:
IF( InCache( object, method, method_arguments ) )
RETURN Cache(object, method, method_arguments);
ELSE
ExecuteMethod(); StoreResultsInCache();
You then define that this code should be executed before every call of your interface functions (and all subclasses implementing these functions as well).
Can some .NET expert enlighten us how you would do this in .NET ?
In general, caching and memoisation makes sense when:
Obtaining the result is (or at least can be) high-latency or otherwise more expensive than the expense caused by the caching itself.
The results have a look-up pattern where there will be frequent calls with the same inputs to the function (that is, not just the arguments but any instance, static and other data that affects the result).
There isn't an already-existing caching mechanism within the code that the code in question calls into, which would make this unnecessary.
There won't be another caching mechanism within the code that calls the code in question, which would make this unnecessary (this is why it almost never makes sense to memoise GetHashCode() within that method, despite people often being tempted to do so when the implementation is relatively expensive).
The result is impossible to become stale, unlikely to become stale while the cache is loaded, unimportant if it becomes stale, or such that staleness is easy to detect.
There are cases where every use-case for a component will match all of these. There are many more where they will not. For example, if a component caches results but is never called twice with the same inputs by a particular client component, then that caching is just a waste that has had a negative impact upon performance (maybe negligible, maybe severe).
More often it makes much more sense for the client code to decide upon the caching policy that would suit it. It will also often be easier to tweak for a particular use at this point in the face of real-world data than in the component (since the real-world data it'll face could vary considerably from use to use).
It's even harder to know what degree of staleness could be acceptable. Generally, a component has to assume that 100% freshness is required from it, while the client component can know that a certain amount of staleness will be fine.
On the other hand, it can be easier for a component to obtain information that is of use to the cache. Components can work hand-in-hand in these cases, though it is much more involved (an example would be the If-Modified-Since mechanism used by RESTful webservices, where a server can indicate that a client can safely use information it has cached).
Also, a component can have a configurable caching policy. Connection pooling is a caching policy of sorts, consider how that's configurable.
So in summary:
The right layer is the component that can work out what caching is both possible and useful:
Which is most often the client code. Though having details of likely latency and staleness documented by the component's authors will help here.
Can less often be the client code with help from the component, though you have to expose details of the caching to allow that.
And can sometimes be the component with the caching policy configurable by the calling code.
Can only rarely be the component, because it's rarer for all possible use-cases to be served well by the same caching policy. One important exception is where the same instance of that component will serve multiple clients, because then the factors that affect the above are spread over those multiple clients.
All of the previous posts brought up some good points, here is a very rough outline of a way you might do it. I wrote this up on the fly so it might need some tweaking:
interface IMemoizer<T, R>
{
    bool IsValid(T args); // is the cached entry still valid, or stale, etc.
    bool TryLookup(T args, out R result);
    void StoreResult(T args, R result);
}

static class MemoizerExtensions
{
    public static Func<T, R> Memoizing<T, R>(this IMemoizer<T, R> src, Func<T, R> method)
    {
        return new Func<T, R>(args =>
        {
            R result;
            if (src.TryLookup(args, out result) && src.IsValid(args))
            {
                return result;
            }
            else
            {
                result = method.Invoke(args);
                src.StoreResult(args, result);
                return result;
            }
        });
    }
}
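For example, usage with a simple dictionary-backed implementation (hypothetical) could look like this:

class DictionaryMemoizer<T, R> : IMemoizer<T, R>
{
    private readonly Dictionary<T, R> _cache = new Dictionary<T, R>();
    public bool IsValid(T args) { return true; } // entries never go stale here
    public bool TryLookup(T args, out R result) { return _cache.TryGetValue(args, out result); }
    public void StoreResult(T args, R result) { _cache[args] = result; }
}

// wrap the slow lookup once, then reuse the memoized delegate
var memoizer = new DictionaryMemoizer<char, string[]>();
Func<char, string[]> cachedLookup =
    memoizer.Memoizing(letter => LetterUtility.GetAnimalsThatStartWithLetter(letter));
var animals = cachedLookup('C'); // slow the first time, cached afterwards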
I have a method in a class for which there are a few different outcomes (based upon event responses etc.). But this is a single atomic function which is to be used by other applications.
I have broken down the main blocks of the functionality that comprise this function into different functions, and successfully taken a Test Driven Development approach to the functionality of each of these elements. These elements, however, aren't exposed for other applications to use.
And so my question is: how can/should I easily approach a TDD-style solution to verifying that the single method that should be called does function correctly, without a lot of duplication in testing or lots of setup required for each test?
I have considered/looked at moving the blocks of functionality into a different class and using mocking to simulate the responses of the functions used, but it doesn't feel right, and the individual methods need to write to variables within the main class (it felt really Heath Robinson).
The code roughly looks like this (I have removed a lot of parameters to make things clearer, along with a fair bit of irrelevant code).
public void MethodToTest(string parameter)
{
IResponse x = null;
if (function1(parameter))
{
if (!function2(parameter,out x))
{
function3(parameter, out x);
}
}
// ...
// more bits of code here
// ...
if (x != null)
{
x.Success();
}
}
I think you would make your life easier by avoiding the out keyword, and re-writing the code so that the functions either check some condition on the response, OR modify the response, but not both. Something like:
public void MethodToTest(string parameter)
{
IResponse x = null;
if (function1(parameter))
{
if (!function2Check(parameter, x))
{
x = function2Transform(parameter, x);
x = function3(parameter, x);
}
}
// ...
// more bits of code here
// ...
if (x != null)
{
x.Success();
}
}
That way you can start pulling apart and recombining the pieces of your large method more easily, and in the end you should have something like:
public void MethodToTest(string parameter)
{
IResponse x = ResponseBuilder.BuildResponse(parameter);
if (x != null)
{
x.Success();
}
}
... where BuildResponse is where all your current tests will live, and the test for MethodToTest can now simply mock the ResponseBuilder.
Your best option would indeed be mocking function1,2,3 etc. If you cannot move your functions to a separate class you could look into using nested classes to move the functions to, they are able to access the data in the outer class. After that you should be able to use mocks instead of the nested classes for testing purposes.
Update: From looking at your example code I think you could get some inspiration by looking into the visitor pattern and ways of testing that, it might be appropriate.
In this case I think you would just mock the method calls as you mentioned.
Typically you would write your test first, and then write the method in a way so that all of the tests pass. I've noticed that when you do it this way, the code that's written is very clean and to the point. Also, each class is very good about only having a single responsibility that can easily be tested.
I don't know what's wrong, but something doesn't smell right, and I think there may be a more elegant way to do what you're doing.
IMHO, you have a couple options here:
Break the inner functions out into a different class so you can mock them and verify that they are called. (which you already mentioned)
It sounds like the other methods you created are private methods, and that this is the only public interface into those methods. If so, you should be running those test cases through this function, and verifying the results (you said that those private methods modify variables of the class) instead of testing private methods. If that is too painful, then I would consider reworking your design.
It looks to me like this class is trying to do more than one thing. For example, the first function doesn't return a response but the other two do. In your description you said the function is complex and takes a lot of parameters. Those are both signs that you need to refactor your design.