Providing robust consistency checks for collection of classes


To give some background on what I am doing:
I have a program that allows a user to create and modify a general calibration. A calibration contains groups, which differ by the type of analysis they perform. Each group contains spectral sets, and each set contains data for only one molecule. Each set in turn contains spectral data at varying concentrations. Each spectrum is a discrete set of data specified by its resolution (x-axis spacing between points) and its spectral range (x-axis range).
One of the main aspects of building these calibration files is keeping the resolution and spectral range consistent across all spectral data in each set. This means that spectral data cannot be added unless it matches the rest. Also, if the user deletes all spectral data, the resolution and range are reset, allowing spectral data of any range or resolution to be added to the calibration.
The question is: how can I provide an effective way to prevent adding spectral data to the calibration that doesn't match the current resolution and spectral range?
Below is a general description of a calibration class. It is very general and contains only the info needed to explain what I am trying to do.
class calibration
{
List<Group> groups;
}
class Group
{
List<SpectralSet> sets;
}
class SpectralSet
{
List<SpectraData> spectras;
}
class SpectraData
{
double firstXPoint;
double lastXPoint;
double resolution;
double[] Ypoints;
}

I'm sure you could apply all sorts of fancy design patterns to enforce the rules you mention.
However, for the sake of simplicity, I would just encapsulate the logic in the calibration class with an AddGroup method which validates the added group conforms to the calibration's requirements. Similarly, I would create an AddSpectralSet method to the group class as a gate keeper into the sets collection.
At that point, depending on how often these things change, I would think about exposing the groups collection and sets collection as ReadOnlyCollection to ensure code doesn't try to add items without using the prescribed methods.
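As a rough sketch of that gatekeeper idea: the Resolution property on Group and the "must match the first group" rule below are assumptions made up for illustration, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

class Group
{
    // Hypothetical property, used only to illustrate the check.
    public double Resolution { get; }
    public Group(double resolution) { Resolution = resolution; }
}

class Calibration
{
    private readonly List<Group> _groups = new List<Group>();

    // Read-only wrapper over the private list: callers can enumerate and index, but not add.
    public ReadOnlyCollection<Group> Groups { get { return _groups.AsReadOnly(); } }

    public void AddGroup(Group group)
    {
        if (group == null) throw new ArgumentNullException(nameof(group));
        // Placeholder rule: every group must share the first group's resolution.
        if (_groups.Count > 0 && group.Resolution != _groups[0].Resolution)
            throw new ArgumentException("Group does not match the calibration.", nameof(group));
        _groups.Add(group);
    }
}
```

Because Groups hands out a read-only wrapper, AddGroup is the only way in, so the check cannot be bypassed by accident.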
Hope this helps.

Your approach will probably vary a bit, but here's an outline of how you could achieve this. You need to do the following:
Only expose immutable public collection properties, along with an Add method, so that you can do your own validation.
For example, you don't want to do:
class Group
{
public List<SpectralSet> sets;
}
Because then anyone can just do myGroup.sets.Add(mySet), without you getting a chance to do any validation on the set. One common pattern to achieve this is as follows:
class Group
{
private readonly List<SpectralSet> _sets = new List<SpectralSet>();
public IEnumerable<SpectralSet> Sets { get { return _sets; } }
public void Add(SpectralSet set)
{
//Do validation here, throw an exception or whatever you want to do if the set isn't valid
_sets.Add(set);
}
//Have a similar Remove method
}
Store the criteria that the data must match
I'm not quite sure what a spectral range is, so I'll just use the resolution as an example. You can easily extend this to whatever the range criteria are. There are at least three possible ways you could do this:
1. When you construct a class, pass the resolution it allows to the constructor, and make this immutable.
2. When adding and removing, update the allowed resolution as necessary.
3. Don't store the resolution explicitly; calculate it every time you add or remove.
Out of those, option 1 is by far the simplest. Life is always much easier when you can keep things as immutable as possible. So you'd do something like:
class Group
{
private readonly List<SpectralSet> _sets = new List<SpectralSet>();
public IEnumerable<SpectralSet> Sets { get { return _sets; } }
public readonly double Resolution;
public Group(double resolution)
{
Resolution = resolution;
}
public void Add(SpectralSet set)
{
if(set.Resolution != Resolution)
{
//Throw an Exception here, or however you want to handle invalid data
throw new ArgumentException("The set's resolution does not match the group's.");
}
_sets.Add(set);
}
//Have a similar Remove method
}
Following this example, each of the classes you included would need a Resolution parameter with the same kind of logic, just checking the Resolution of any direct child you tried to add. Likewise with whatever you use for spectral range.
You also want to be able to change the resolution allowed if all data is cleared. The simplest way to do this is just create new objects whenever the data is cleared, rather than just clearing out the collections in existing objects.
Make SpectraData immutable
All this is useless if you can get a SpectraData out of one of the carefully gated collections and change it arbitrarily. Make anything that needs to be validated immutable, only allowing it to be set on construction. If you have a requirement not to do that, you need to think very carefully about how you will allow it to be changed.
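For the "reset when everything is deleted" requirement, here is one possible sketch that leans towards option 3: the first stored item acts as the criteria, so an empty set accepts anything. The Matches helper, the tolerance value and the member names are my own assumptions, not part of the question's classes.

```csharp
using System;
using System.Collections.Generic;

// Immutable: every value is fixed at construction.
sealed class SpectraData
{
    public double FirstXPoint { get; }
    public double LastXPoint { get; }
    public double Resolution { get; }
    private readonly double[] _yPoints;

    public SpectraData(double firstXPoint, double lastXPoint, double resolution, double[] yPoints)
    {
        FirstXPoint = firstXPoint;
        LastXPoint = lastXPoint;
        Resolution = resolution;
        _yPoints = (double[])yPoints.Clone(); // defensive copy so the caller cannot mutate it later
    }

    // Read-only view of the copied points.
    public IReadOnlyList<double> YPoints { get { return Array.AsReadOnly(_yPoints); } }

    // Tolerance-based comparison; the default tolerance is an assumption, pick one that suits the data.
    public bool Matches(SpectraData other, double tolerance = 1e-9)
    {
        return Math.Abs(FirstXPoint - other.FirstXPoint) <= tolerance
            && Math.Abs(LastXPoint - other.LastXPoint) <= tolerance
            && Math.Abs(Resolution - other.Resolution) <= tolerance;
    }
}

class SpectralSet
{
    private readonly List<SpectraData> _spectra = new List<SpectraData>();
    public IEnumerable<SpectraData> Spectra { get { return _spectra; } }

    // An empty set accepts anything; afterwards the first item defines the criteria.
    public bool CanAdd(SpectraData data)
    {
        return _spectra.Count == 0 || _spectra[0].Matches(data);
    }

    public void Add(SpectraData data)
    {
        if (!CanAdd(data))
            throw new ArgumentException("Resolution or spectral range does not match this set.");
        _spectra.Add(data);
    }

    // Removing the last item implicitly resets the criteria.
    public bool Remove(SpectraData data) { return _spectra.Remove(data); }
}
```

Exposing CanAdd separately lets the UI disable or warn before the user tries to add a mismatched spectrum, instead of relying on the exception alone.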

Related

Data structures and interfaces, for an extendable "Tournament Bracketing" system

Background
I am in the early stages of writing a "tournament bracketing" application (C#, although any object-oriented language would be appropriate). This application would theoretically generate bracket sheets for multiple types of tournaments:
Single-elimination
Double-elimination
Double-elimination with "true second"
Round robin
Swiss-system
...and probably more that I've never even heard of, before.
For each type of tournament, I'd like to implement the 'bracket algorithm' as an instance of a common interface. In this way, I can make the system extensible, and easily add support for additional brackets in the future.
Given a variable list of 'competitors' - the user could simply choose their desired bracket plugin and poof, there's the generated bracket!
Design Challenges
I am currently struggling to visualize a design for bracketing; the needed APIs of the interface, and - more importantly - I'm not sure how to generically and flexibly represent the bracket 'model,' as a data structure. Some sort of node map, I guess?
Ideally, I'd need the interface to:
Accept a variable-sized list of competitors as input
Generate a graph (or other) structure as output, representing the 'flow' of the bracket.
The graph structure would need to support some sort of 'Win / Lose' API, to 'advance' the various competitors through the bracket, and fill it in as it goes.
The Question
I wish I had a better way to phrase this question;
How I do 'dis?
What are your thoughts?
What interface structure makes the most sense?
How do I model this generically, in code?
My Initial Flailings
It wouldn't be a StackOverflow question unless I put some code on paper. Here are my initial thoughts;
// A plugin interface; generates a tournament bracket, given a list of competitors
public interface IBracketSheetGenerator
{
IBracketSheet CreateBracket(IEnumerable<Competitor> competitors);
}
// Parent data structure, used to model a tournament bracket
public interface IBracketSheet
{
IEnumerable<Competitor> Competitors { get; }
IEnumerable<IBracketNode> Matches { get; }
}
// Node representing a single competitor match
public interface IBracketNode
{
Competitor Left { get; }
Competitor Right { get; }
IBracketNode Winner { get; }
IBracketNode Loser { get; }
// Advance the winner to the next winner's match,
// and the loser to the loser's match.
void Advance(Competitor winner, Competitor loser);
}
Right off the bat, I can see some shortcomings with my first attempt;
How do I represent the 'winner' of the entire bracket?
How do I represent losers who have been 'eliminated' from the bracket?
How do I signal that the bracket has been completed/resolved? What does a resolved bracket look like?
Does this framework support 'strange' brackets which don't fall into the simple 'elimination' mold (like round robin, for instance)?

Just brainstorming here, but I guess I would model the concept of a Round too. For the elimination systems rounds already make sense, but you should be able to simulate rounds for the other systems as well. I think the total number of rounds can be predetermined for all of them.
Each round has matches and each match has winners and losers, the implementation of the bracket system would be able to generate the next matches after a round is completed and you supply the outcome of each match.
If a competitor is not placed on a match in the subsequent round, they're "out". Perhaps the bracket system could return an ordered list of competitors representing the current standing, or even a custom CompetitorAndScore class that contains statistics?
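To make the round idea a bit more concrete, here is a minimal sketch for single elimination only. IBracketSystem, Match and the use of plain strings for competitors are all invented for the example; a real design would likely carry scores and standings as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One pairing within a round; Right is null when Left gets a bye.
class Match
{
    public string Left { get; }
    public string Right { get; }
    public Match(string left, string right) { Left = left; Right = right; }
}

// Hypothetical plugin interface: the caller reports winners, the system produces the next round.
interface IBracketSystem
{
    IReadOnlyList<Match> FirstRound(IEnumerable<string> competitors);
    // Returns an empty list once the tournament is decided.
    IReadOnlyList<Match> NextRound(IEnumerable<string> winnersOfPreviousRound);
}

class SingleElimination : IBracketSystem
{
    public IReadOnlyList<Match> FirstRound(IEnumerable<string> competitors)
    {
        return Pair(competitors.ToList());
    }

    public IReadOnlyList<Match> NextRound(IEnumerable<string> winnersOfPreviousRound)
    {
        var winners = winnersOfPreviousRound.ToList();
        // A single remaining winner is the champion, so there is nothing left to play.
        return winners.Count <= 1 ? new List<Match>() : Pair(winners);
    }

    // Pairs neighbours in list order; an odd competitor out gets a bye.
    private static List<Match> Pair(List<string> players)
    {
        var matches = new List<Match>();
        for (int i = 0; i < players.Count; i += 2)
            matches.Add(new Match(players[i], i + 1 < players.Count ? players[i + 1] : null));
        return matches;
    }
}
```

Anyone who does not appear in the list passed to NextRound is "out", which matches the idea above. A round-robin or Swiss implementation would need the full results rather than just the winners, so the interface would probably have to grow for those.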

One UI for two business objects

I have an order edit and quote edit screen that are very similar. I want to try to avoid code like this:
if (order is Order)
SetupScreenForOrder();
if (order is Quote)
SetupScreenForQuote();
But maintaining two screens is not good either. If I create some common interface between a Quote and Order then how do you deal with fields like OrderNumber or QuoteDate?
What's the best way to handle this?

Foo foo = GetFoo();
if (foo is Order)
...
if (foo is Quote)
...
If you don't want to write these conditionals, avoid referring to Order instances and Quote instances by their common parent type (if any).
Order order = GetOrder();
SetupScreen(order); // resolves to void SetupScreen(Order order)
Quote quote = GetQuote();
SetupScreen(quote); // resolves to void SetupScreen(Quote quote)
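A small self-contained sketch of the overload approach; the Screen class, the field values and the returned strings are placeholders made up for the example.

```csharp
using System;

class Order { public int OrderNumber = 42; }
class Quote { public DateTime QuoteDate = new DateTime(2020, 1, 1); }

static class Screen
{
    // Overload resolution happens at compile time, based on the static type of the argument.
    public static string SetupScreen(Order order) { return "Order #" + order.OrderNumber; }
    public static string SetupScreen(Quote quote) { return "Quote from " + quote.QuoteDate.Year; }
}
```

Calling Screen.SetupScreen(new Order()) picks the first overload and Screen.SetupScreen(new Quote()) picks the second, with no type check written by hand. This only works while the variable is declared as Order or Quote; once it is held as a common base type, the compiler can no longer choose for you.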

Often if the display screens for two classes are similar, it's because they have fields that are similar in function, even if they don't have the same name.
For example, it might be that each instance (of an order or a quote) will have a date displayed with it. So you might have an interface:
interface displayableInOrderAndQuoteList {
Date getDisplayDate();
}
public class Order implements displayableInOrderAndQuoteList {
private Date orderDate;
public Date getOrderDate() { //used only when treating object as an order
return orderDate;
}
public Date getDisplayDate() {//used when displaying object via interface
return orderDate;
}
}
public class Quote implements displayableInOrderAndQuoteList {
private Date quoteDate;
public Date getQuoteDate() { //used only when treating object as a quote
return quoteDate;
}
public Date getDisplayDate() {//used when displaying object via interface
return quoteDate;
}
}
In other words, the interface represents questions you want to ask the object in order to build the screen. Each object decides how to answer those questions.
If the display is different enough that you need to ask the objects totally different questions, then you probably should have two screens.

I think a common interface between the two objects would be a good idea.
Perhaps define the uncommon fields as nullable and have the screen check for null and determine how to display those fields.
int? OrderNumber {get;set;}
DateTime? QuoteDate {get;set;}
EDIT: in response to JC's comment
Perhaps consider trying to take it a step further and, at least logically, consider an Order and a Quote to be the same kind of object but at different stages in the "Order lifecycle". I.e. a "Quote" is an order at the beginning of the lifecycle, whereas an "Order" is an order at the middle or end of the lifecycle.
You could have an OrderState property defined on your interface, and then your UI could use the OrderState property to decide how the quote/order should be displayed, rather than checking each piece of data individually.
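A minimal sketch of what that could look like. IOrderDocument, OrderDocument and the Title method are names I made up for illustration.

```csharp
using System;

enum OrderState { Quote, Order }

// Hypothetical shared interface: the screen binds to this instead of to the concrete classes.
interface IOrderDocument
{
    OrderState State { get; }
    int? OrderNumber { get; }     // null while the document is still a quote
    DateTime? QuoteDate { get; }  // null if the order never started as a quote
}

class OrderDocument : IOrderDocument
{
    public OrderState State { get; set; }
    public int? OrderNumber { get; set; }
    public DateTime? QuoteDate { get; set; }
}

static class OrderScreen
{
    // The UI decides by state, not by checking each field for null.
    public static string Title(IOrderDocument doc)
    {
        switch (doc.State)
        {
            case OrderState.Quote: return "Quote";
            case OrderState.Order: return "Order " + doc.OrderNumber;
            default: throw new ArgumentOutOfRangeException(nameof(doc));
        }
    }
}
```

The state switch lives in one place, so adding a third stage later means touching this method rather than hunting for null checks across the screen.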
If you feel that the problem is more that you have too many if statements in your UI, then perhaps consider creating small user controls to handle displaying chunks of the UI for either a quote or for an order. You could then either dynamically add the appropriate control (a quote control or an order control) to your UI, or have both controls already added and just show/hide them as appropriate. I would caution, though, that this sounds like it could be a messy approach, so just be careful that the solution doesn't end up being more complicated than the problem that you're trying to solve.

Good code architecture for this problem?

I am developing a space shooter game with customizable ships. You can increase the strength of any number of properties of the ship via a pair of radar charts*. Internally, I represent each ship as a subclassed SpaceObject class, which holds a ShipInfo that describes various properties of that ship.
I want to develop a relatively simple API that lets me feed in a block of relative strengths (from minimum to maximum of what the radar chart allows) for all of the ship properties (some of which are simplifications of the underlying actual set of properties) and get back a ShipInfo class I can give to a PlayerShip class (that is the object that is instantiated to be a player ship).
I can develop the code to do the transformations between simplified and actual properties myself, but I would like some recommendations as to what sort of architecture to provide to minimize the pain of interacting with this translator code (i.e. no methods with 5+ arguments or some such nonsense). Does anyone have any ideas?
*=not actually implemented yet, but that's the plan.

What about the Builder pattern? You could have a static FillDefaults method on your ShipInfo class and then assign each property of the ShipInfo via an instance method that returns the instance that you're working with, like this:
ShipInfo.FillDefaults().CalculateSomething(50).AssignName("Testing...").RelativeFiringPower(10).ApplyTo(myShip);
Within ShipInfo, this would look something like:
public static ShipInfo FillDefaults()
{
ShipInfo newInstance = ...;
// Do some default setup here
return newInstance;
}
public ShipInfo CalculateSomething(int basis)
{
// Do some calculation
// Assign some values internally
return this;
}
// Keep following this pattern of methods
public void ApplyTo(SpaceObject obj)
{
// Some checks here if you want
obj.ShipInfo = this;
}

I would say the Facade pattern is perfect for that kind of problem. If you have 5+ arguments on your methods, consider encapsulating at least part of them in a new type.
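A sketch of the "new type" part, assuming the radar-chart values are normalised to 0.0 to 1.0. ShipStrengths, the three axes and every formula below are invented for illustration; the real ShipInfo will look different.

```csharp
using System;

// Hypothetical parameter object: one named value per radar-chart axis,
// each from 0.0 (minimum) to 1.0 (maximum).
class ShipStrengths
{
    public double Firepower { get; set; }
    public double Armor { get; set; }
    public double Speed { get; set; }
}

// Stand-in for the real ShipInfo; the properties and formulas are made up.
class ShipInfo
{
    public int Damage { get; private set; }
    public int HullPoints { get; private set; }
    public double MaxVelocity { get; private set; }

    // Single entry point that translates simplified strengths into actual properties.
    public static ShipInfo FromStrengths(ShipStrengths s)
    {
        return new ShipInfo
        {
            Damage = 10 + (int)Math.Round(s.Firepower * 90),
            HullPoints = 100 + (int)Math.Round(s.Armor * 400),
            MaxVelocity = 5.0 + s.Speed * 15.0
        };
    }
}
```

The caller fills in only the axes it cares about and passes one object, so adding a new axis later does not change any method signature.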

Seems like you want to set some properties but not the others, but not in a particular order of importance so that you could define overloads with incrementally more arguments.
You could implement a constructor with minimum required values that sets default values for the other, and then use object initializer to set the remaining relevant values:
// Didn't set properties 2 3 and 6, only set the ones needed in this case.
SpaceObject ship = new SpaceObject(someRequiredValue) {
Property1 = 50,
Property4 = Game.Settings.Ships.Armor.Strong,
Property5 = new PropertySet1 {
Prop51 = "Enterprise",
Prop53 = true,
Prop57 = false
}
};

To me this looks like a case for the decorator pattern.

When to use Properties and Methods?

I'm new to the .NET world having come from C++ and I'm trying to better understand properties. I noticed in the .NET framework Microsoft uses properties all over the place. Is there an advantage for using properties rather than creating get/set methods? Is there a general guideline (as well as naming convention) for when one should use properties?

It is pure syntactic sugar. On the back end, it is compiled into plain get and set methods.
Use it because of convention, and that it looks nicer.
Some guidelines are that when it has a high risk of throwing Exceptions or going wrong, don't use properties but explicit getters/setters. But generally even then they are used.

Properties are get/set methods; simply, it formalises them into a single concept (for read and write), allowing (for example) metadata against the property, rather than individual members. For example:
[XmlAttribute("foo")]
public string Name {get;set;}
This is a get/set pair of methods, but the additional metadata applies to both. It also, IMO, simply makes it easier to use:
someObj.Name = "Fred"; // clearly a "set"
DateTime dob = someObj.DateOfBirth; // clearly a "get"
We haven't duplicated the fact that we're doing a get/set.
Another nice thing is that it allows simple two-way data-binding against the property ("Name" above), without relying on any magic patterns (except those guaranteed by the compiler).

There is an entire book dedicated to answering these sorts of questions: Framework Design Guidelines from Addison-Wesley. See section 5.1.3 for advice on when to choose a property vs a method.
Much of the content of this book is available on MSDN as well, but I find it handy to have it on my desk.

Consider reading Choosing Between Properties and Methods. It has a lot of information on .NET design guidelines.

properties are get/set methods

Properties are set and get methods as people around here have explained, but the idea of having them is making those methods the only ones playing with the private values (for instance, to handle validations).
The whole other logic should be done against the properties, but it's always easier mentally to work with something you can handle as a value on the left and right side of operations (properties) and not having to even think it is a method.
I personally think that's the main idea behind properties.

I always think that properties are the nouns of a class, where as methods are the verbs...

First of all, the naming convention is: use PascalCase for the property name, just like with methods. Also, properties should not contain very complex operations. These should be kept in methods.
In OOP, you would describe an object as having attributes and functionality. You do that when designing a class. Consider designing a car. Examples for functionality could be the ability to move somewhere or activate the wipers. Within your class, these would be methods. An attribute would be the number of passengers within the car at a given moment. Without properties, you would have two ways to implement the attribute:
Make a variable public:
// class Car
public int passengerCount = 4;
// calling code
int count = myCar.passengerCount;
This has several problems. First of all, it is not really an attribute of the vehicle. You have to update the value from inside the Car class to have it represent the vehicle's true state. Second, the variable is public and could also be written to.
The second variant is one widely used, e.g. in Java, where you do not have properties like in C#:
Use a method to encapsulate the value and maybe perform a few operations first.
// class Car
public int GetPassengerCount()
{
// perform some operation
int result = CountAllPassengers();
// return the result
return result;
}
// calling code
int count = myCar.GetPassengerCount();
This way you manage to get around the problems with a public variable. By asking for the number of passengers, you can be sure to get the most recent result since you recount before answering. Also, you cannot change the value since the method does not allow it. The problem is, though, that you actually wanted the amount of passengers to be an attribute, not a function of your car.
The second approach is not necessarily wrong, it just does not read quite right. That's why some languages include ways of making attributes look like variables, even though they work like methods behind the scenes. Actionscript for example also includes syntax to define methods that will be accessed in a variable-style from within the calling code.
Keep in mind that this also brings responsibility. The calling user will expect it to behave like an attribute, not a function. So if just asking a car how many passengers it has takes 20 seconds to load, then you probably should pack that in a real method, since the caller will expect functions to take longer than accessing an attribute.
EDIT:
I almost forgot to mention this: The ability to actually perform certain checks before letting a variable be set. By just using a public variable, you could basically write anything into it. The setter method or property give you a chance to check it before actually saving it.
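A minimal sketch of such a check, continuing the car example. The upper limit of 5 is an arbitrary value picked for illustration.

```csharp
using System;

class Car
{
    private int _passengerCount;

    public int PassengerCount
    {
        get { return _passengerCount; }
        set
        {
            // The seat limit of 5 is an arbitrary value for the example.
            if (value < 0 || value > 5)
                throw new ArgumentOutOfRangeException(nameof(value), "Passenger count must be between 0 and 5.");
            _passengerCount = value;
        }
    }
}
```

To the caller this still reads like a plain assignment (myCar.PassengerCount = 3), but an invalid value never reaches the backing field.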

Properties simply save you some time from writing the boilerplate that goes along with get/set methods.
That being said, a lot of .NET stuff handles properties differently- for example, a Grid will automatically display properties but won't display a function that does the equivalent.
This is handy, because you can make get/set methods for things that you don't want displayed, and properties for those you do want displayed.

The compiler actually emits get_MyProperty and set_MyProperty methods for each property you define.
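You can see those accessor methods through reflection. In this small sketch, Person and AccessorDemo are just example names.

```csharp
using System;
using System.Reflection;

class Person
{
    public string Name { get; set; }
}

static class AccessorDemo
{
    public static string[] AccessorNames()
    {
        PropertyInfo property = typeof(Person).GetProperty("Name");
        // GetGetMethod/GetSetMethod return the compiler-generated accessor methods,
        // which are named get_Name and set_Name here.
        return new[] { property.GetGetMethod().Name, property.GetSetMethod().Name };
    }
}
```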

Although it is not a hard and fast rule and, as others have pointed out, Properties are implemented as Get/Set pairs 'behind the scenes' - typically Properties surface encapsulated/protected state data whereas Methods (aka Procedures or Functions) do work and yield the result of that work.
As such, Methods will often take arguments that they might merely consume but also may return in an altered state, or may produce a new object or value as a result of the work done.
Generally speaking, if you need a way of controlling access to data or state, then Properties allow the implementation of that access in a defined, validatable and optimised way (allowing access restriction, range and error checking, creation of backing store on demand, and a way of avoiding redundant setting calls).
In contrast, methods transform state and give rise to new values internally and externally without necessarily repeatable results.
Certainly if you find yourself writing procedural or transformative code in a property, you are probably really writing a method.

Also note that properties are available via reflection. While methods are, too, properties represent "something interesting" about the object. If you are trying to display a grid of properties of an object-- say, something like the Visual Studio form designer-- then you can use reflection to query the properties of a class, iterate through each property, and interrogate the object for its value.
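A rough sketch of that kind of property grid logic; PropertyDumper is a made-up helper, not a framework class.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

static class PropertyDumper
{
    // Returns "Name = value" for each public instance property that can be read.
    public static List<string> Describe(object target)
    {
        var lines = new List<string>();
        foreach (PropertyInfo p in target.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip write-only properties and indexers, which cannot be read without arguments.
            if (!p.CanRead || p.GetIndexParameters().Length > 0) continue;
            lines.Add(p.Name + " = " + p.GetValue(target, null));
        }
        return lines;
    }
}
```

Note that only properties show up here. An equivalent GetName() method would have to be discovered and invoked by some naming convention instead, which is part of why grids and designers work with properties.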

Think of it this way: properties encapsulate your fields (commonly marked private) while at the same time allowing your fellow developers to either set or get the field value. You can even perform routine validation in the property's set accessor should you desire.

Properties are not just syntactic sugar - they are important if you need to create object-relational mappings (Linq2Sql or Linq2Entities), because they behave just like variables while it is possible to hide the implementation details of the object-relational mapping (persistence). It is also possible to validate a value being assigned in the setter of the property and protect it against assigning unwanted values.
You can't do this with the same elegance with methods. I think it is best to demonstrate this with a practical example.
In one of his articles, Scott Gu creates classes which are mapped to the Northwind database using the "code first" approach. One short example taken from Scott's blog (with a little modification, the full article can be read at Scott Gu's blog here):
public class Product
{
[Key]
public int ProductID { get; set; }
public string ProductName { get; set; }
public Decimal? UnitPrice { get; set; }
public bool Discontinued { get; set; }
public virtual Category category { get; set; }
}
// class Category omitted in this example
public class Northwind : DbContext
{
public DbSet<Product> Products { get; set; }
public DbSet<Category> Categories { get; set; }
}
You can use entity sets Products, Categories and the related classes Product and Category just as if they were normal objects containing variables: You can read and write them and they behave just like normal variables. But you can also use them in Linq queries, persist them (store them in the database and retrieve them).
Note also how easy it is to use annotations (C# attributes) to define the primary key (in this example ProductID is the primary key for Product).
While the properties are used to define a representation of the data stored in the database, there are some methods defined in the entity set class which control the persistence: For example, the method Remove() marks a given entity as deleted, while Add() adds a given entity, SaveChanges() makes the changes permanent. You can consider the methods as actions (i.e. you control what you want to do with the data).
Finally I give you an example how naturally you can use those classes:
// instantiate the database as object
var nw = new Northwind();
// select product
var product = nw.Products.Single(p => p.ProductName == "Chai");
// 1. modify the price
product.UnitPrice = 2.33M;
// 2. store a new category
var c = new Category();
c.CategoryName = "Example category"; // assumes Category exposes CategoryName; a member cannot share its class's name
c.Description = "Show how to persist data";
nw.Categories.Add(c);
// Save changes (1. and 2.) to the Northwind database
nw.SaveChanges();

Data Inheritance in C#

Is there a known pattern for inheriting data in a hierarchical object structure? I have a hierarchical 'Item' structure in which each item needs to inherit its 'Type' from its 'Parent' (have the same data by default). The type of a sub-item can be modified on its own, and when the type of the parent Item changes, all sub-items whose type has not been changed should get the new type of the parent.
Note that I cannot fake it like
public string Type
{
get
{
if (type == null)
return Parent != null ? Parent.Type : null;
return type;
}
}
because I have to fill the values in the database, and the structure is too deep to use recursion and not worry about the performance.
The only way I can think of it now is
public string Type
{
set
{
type = value;
UpdateUnchangedChildren(value);
}
}
public int AddChild(Item item)
{
item.Type = Type;
return Items.Add(item);
}
Is there a better way?
Thanks.

It's a common problem, usually related to maintenance of various hierarchical settings/configurations. So, I guess a solution to it can be considered "a pattern".
Anyways, from the internal architecture perspective you have 2 major options:
normalized structure
denormalized structure
"Normalized" is the one implemented with recursion. A particular piece of data is always stored in one place, all the other places have references to it (e.g., to parent). The structure is easily updated, but reading from it may be a problem.
"Denormalized" means that every node will store the whole set of settings for its level and whenever you update a node it takes some time to go down the hierarchy and correct all the children nodes. But the reading operation is instant.
And so the "denormalized" version seems to be more widely used, because the common scenario with settings is that you update them rarely, while you read them often, hence you need better read performance. For example, the Windows ACL security model uses the "denormalized" approach to make security checks fast. You can read how they resolve conflicts between the "inherited" and explicit permissions (ACEs) by checking them in a specific order. That might be overkill for your particular system though; you can simply have a flag that a particular value was overridden or, on the opposite, reset to "default"...
Further details depend on your system needs; you might want to have a "hybrid" architecture, where some of the fields would be "normalized" and some others won't. But you seem to be on the right track.
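A sketch of the "denormalized with an override flag" variant, under the assumption that each item keeps references to its parent and children. The member names (SetType, ResetToInherited, Propagate) are invented for the example.

```csharp
using System;
using System.Collections.Generic;

class Item
{
    private readonly List<Item> _children = new List<Item>();
    private string _type;
    private bool _isOverridden; // true once this item was given its own type

    public Item Parent { get; private set; }
    public string Type { get { return _type; } } // reading never walks the tree

    public void SetType(string value)
    {
        _isOverridden = true;
        Propagate(value);
    }

    // Drops the override and takes the parent's current value again.
    public void ResetToInherited()
    {
        _isOverridden = false;
        Propagate(Parent != null ? Parent.Type : null);
    }

    public void AddChild(Item child)
    {
        child.Parent = this;
        _children.Add(child);
        if (!child._isOverridden) child.Propagate(_type);
    }

    // Stores the value here and pushes it down, stopping at overridden items.
    // An explicit stack replaces recursion, so depth is not limited by the call stack.
    private void Propagate(string value)
    {
        var pending = new Stack<Item>();
        _type = value;
        pending.Push(this);
        while (pending.Count > 0)
        {
            foreach (Item child in pending.Pop()._children)
            {
                if (child._isOverridden) continue;
                child._type = value;
                pending.Push(child);
            }
        }
    }
}
```

Every item always holds its effective type, so writing the values to the database is a plain read per item. The cost moves to SetType, which touches every descendant that still inherits.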

I'm not 100% sure what it is you are trying to do... but you could use generics to pass the type of a parent object into a child object... But having a setter there doesn't really make sense... The Parent object's type will be set when it's instantiated, so why would you have a setter there to change it.
Assuming you have something like this...
public class Child<T>
{
public string Type
{
get { return typeof(T).ToString(); }
}
}
So then, when you have a Parent Object of any type, you can pass that to your Child Property...
public class ParentA
{
public Child<ParentA> ChildObj { get; set; }
}
public class ParentB
{
public Child<ParentB> ChildObj { get; set; }
}
public class ParentC
{
public Child<ParentC> ChildObj { get; set; }
}
Calling any of those ChildObj.Type Properties will return ParentA, ParentB & ParentC respectively.
But I've a funny feeling you haven't fully explained what it is you're trying to do.
Can you post some more code examples showing a Parent class and a Child/Item class?

An obvious optimization would be to cache the value obtained from the parent when reading the type. That means you will only traverse each path at most once (whereas the naive solution means you'll be traversing each subpath again and again for each path containing it, which means up to O(h^2) instead of O(h)). That would work great if you have more reads than writes.
Consider this:
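class Node
{
public Node Parent;
public List<Node> Children = new List<Node>();
string _cachedParentType = null;
string _type;
public string Type
{
get { return _type ?? _cachedParentType ?? (_cachedParentType = Parent != null ? Parent.Type : null); }
set
{
_type = value;
foreach (var child in Children) { child.InvalidateCache(); }
}
}
// Clears the cached inherited value here and in every descendant that inherits through this node
void InvalidateCache()
{
_cachedParentType = null;
if (_type != null) return; // has its own type, so its subtree is unaffected
foreach (var child in Children) { child.InvalidateCache(); }
}
}
This means with enough reads and few writes, reading becomes O(1) in the best case or, at worst, a "cache miss" will cost you O(h) with h being the height of the tree. Updating has to clear the cache of every descendant that inherits through the updated node; clearing only the direct children is not enough, because a grandchild that already cached the old inherited value would keep returning it. So a write touches no more nodes than the UpdateUnchangedChildren solution would, but it only resets caches, and the new value is fetched lazily on the next read. Which one is better depends on your ratio of reads to writes.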

"...the structure is too deep to use recursion and not worry about the performance."
Have you actually measured this? How many items are you dealing with, how deep is the structure, and how common is it for items to not have their own "Type" value? What are your performance goals for the application, and how does the recursive solution compare with those goals?
It is very common for people to think that recursion is slow and therefore eliminate it from consideration without ever trying it. It is NEVER a good idea to reject the simplest design for performance reasons without measuring it first. Otherwise you go off and invent a more complicated solution when the simpler one would have worked just fine.
Of course, your second solution is also using recursion, just going down the hierarchy instead of up. If the child inserts are happening at a different time and can absorb the possible performance hit, then perhaps that will be more acceptable.
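If you want a quick number before deciding, something like the following sketch could serve as a starting point. The Item class here is a stripped-down stand-in for yours, and the depth and read counts are yours to choose; treat the result as a rough indication only, not a proper benchmark.

```csharp
using System;
using System.Diagnostics;

class Item
{
    private string _type;
    public Item Parent { get; set; }
    public string Type
    {
        // The simple recursive lookup from the question.
        get { return _type ?? (Parent != null ? Parent.Type : null); }
        set { _type = value; }
    }
}

static class LookupBenchmark
{
    // Builds a chain of the given depth where only the root has a type,
    // then times repeated reads from the deepest item (the worst case).
    public static TimeSpan Measure(int depth, int reads)
    {
        var root = new Item { Type = "root" };
        Item leaf = root;
        for (int i = 0; i < depth; i++) leaf = new Item { Parent = leaf };

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < reads; i++)
        {
            if (leaf.Type != "root") throw new InvalidOperationException("unexpected type");
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Run it with a depth and read count that resemble your real data, for example LookupBenchmark.Measure(50, 100000), and compare the elapsed time against your performance goals.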
