First off, I have read through a list of postings on this topic, and I don't feel I have grasped properties because of what I had come to understand about encapsulation and field modifiers (private, public, etc.).
One of the main aspects of C# that I have come to learn is the importance of data protection within your code by the use of encapsulation. I 'thought' I understood that to be because of the ability of the use of the modifiers (private, public, internal, protected). However, after learning about properties I am sort of torn in understanding not only properties uses, but the overall importance/ability of data protection (what I understood as encapsulation) within C#.
To be more specific, everything I have read when I got to properties in C# is that you should try to use them in place of fields when you can because of:
1) they allow you to change the data type, which you can't do when accessing the field directly.
2) they add a level of protection to data access
However, from what I 'thought' I had come to know about the use of field modifiers did #2, it seemed to me that properties just generated additional code unless you had some reason to change the type (#1) - because you are (more or less) creating hidden methods to access fields as opposed to directly.
Then there is the whole modifiers being able to be added to Properties which further complicates my understanding for the need of properties to access data.
I have read a number of chapters from different writers on "properties" and none have really explained a good understanding of properties vs. fields vs. encapsulation (and good programming methods).
Can someone explain:
1) why I would want to use properties instead of fields (especially when it appears I am just adding additional code)
2) any tips on recognizing the use of properties and not seeing them as simply methods (with the exception of the get;set being apparent) when tracing other people's code?
3) Any general rules of thumb when it comes to good programming methods in relation to when to use what?
Thanks and sorry for the long post - I didn't want to just ask a question that has been asked 100x without explaining why I am asking it again.
1) why I would want to use properties instead of fields (especially when it appears I am just adding additional code)
You should always use properties where possible. They abstract direct access to the field (which is created for you if you don't create one). Even if the property does nothing other than setting a value, it can protect you later on. Changing a field to a property later is a breaking change, so if you have a public field and want to change it to a public property, you have to recompile all code which originally accessed that field.
2) any tips on recognizing the use of properties and not seeing them as simply methods (with the exception of the get;set being apparent) when tracing other people's code?
I'm not totally certain what you are asking, but when tracing over someone else's code, you should always assume that the property is doing something other than just getting and setting a value. Although it's accepted practice not to put large amounts of code in getters and setters, you can't just assume that because it's a property it will run quickly.
3) Any general rules of thumb when it comes to good programming methods in relation to when to use what?
I always use properties to get and set values where possible. That way I can add code later if I need to check that the value is within certain bounds, not null, etc. Without properties, I would have to go back and put those checks in every place I directly accessed the field.
One of the nice things about Properties is that the getter and the setter can have different levels of access. Consider this:
    public class MyClass {
        public string MyString { get; private set; }
        //...other code
    }
This property can only be changed from within the class, say in a constructor. Have a read up on Dependency Injection. Constructor injection and Property injection both deal with setting properties from some form of external configuration. There are many frameworks out there. If you delve into some of these you will get a good feel for properties and their use. Dependency injection will also help you with your 3rd question about good practice.
When looking at other people's code, you can tell whether something is a method or a property because their icons are different. Also, in IntelliSense, the first part of a property's summary is the word Property.
You should not worry about the extra code needed for accessing fields via properties; the JIT compiler will usually "optimize" it away by inlining the accessor. The exception is when the accessor is too large to be inlined, but then you needed the extra code anyway.
And the extra code for defining simple properties is also minimal:
public int MyProp { get; set; } // use auto generated field.
When you need to customize, you can always define your own field later.
So you are left with the extra layer of encapsulation / data protection, and that is a good thing.
My rule: expose fields always through properties
While I absolutely dislike directly exposing fields to the public, there's another thing: Fields can't be exposed through Interfaces; Properties can.
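As a minimal sketch of that point (INamed and Customer are hypothetical names made up for illustration), an interface can declare a property as part of its contract, which it cannot do with a field:

```csharp
using System;

// Hypothetical interface: a property can be part of the contract, a field cannot.
public interface INamed
{
    string Name { get; }
}

public class Customer : INamed
{
    public Customer(string name)
    {
        Name = name;
    }

    // Satisfies the interface; writable only from inside the class.
    public string Name { get; private set; }
}

public static class InterfaceDemo
{
    // Works with anything that exposes a Name property.
    public static string Describe(INamed item)
    {
        return "Name: " + item.Name;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new Customer("Ada")));
    }
}
```

Code written against INamed never needs to know how Customer stores the value.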
There are several reasons why you might want to use Properties over Fields, here are just a couple:
a. By having the following
public string MyProperty { get; private set; }
you are making the property "read only" for code outside the class. No one using your code can modify its value. There are cases where this isn't strictly true (if your property is a list, callers can still change the list's contents), but these are known and have solutions.
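One possible solution for the list case, sketched under the assumption of a hypothetical Order class: keep the mutable list private and expose only a read-only view of it.

```csharp
using System;
using System.Collections.Generic;

public class Order
{
    // The mutable list never leaves the class.
    private readonly List<string> _items = new List<string>();

    // Callers get a read-only wrapper around the list.
    public IReadOnlyList<string> Items
    {
        get { return _items.AsReadOnly(); }
    }

    // All changes go through the class, where rules can be enforced.
    public void AddItem(string item)
    {
        if (string.IsNullOrEmpty(item))
            throw new ArgumentException("Item must not be empty.", "item");
        _items.Add(item);
    }
}

public static class ReadOnlyListDemo
{
    public static void Main()
    {
        var order = new Order();
        order.AddItem("Tea");
        Console.WriteLine(order.Items.Count);
    }
}
```

Even if a caller casts Items back to IList<string>, the wrapper rejects modification by throwing NotSupportedException.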
b. If you decide you need to increase the safety of your code use properties:
    // backing field for the property
    private string _myField;

    public string MyProperty
    {
        get { return _myField; }
        set
        {
            // ignore null or empty values
            if (!string.IsNullOrEmpty(value))
            {
                _myField = value;
            }
        }
    }
You can tell they're properties because they don't have (). The compiler will tell you if you try to add brackets.
It's considered good practise to always use properties.
There are many scenarios where using a simple field would not cause damage, but a property can be changed more easily later, e.g. if you want to raise an event whenever the value changes or want to perform some value/range checking.
Also, if you have several projects that depend on each other, you have to recompile every project that depends on the one where a field was changed to a property.
Fields are usually used for data that is private to a class and not intended to be shared with other classes. When we want the data to be accessible to other classes, we use properties, which expose it through get and set accessors. With an auto property the compiler creates the backing field for you. With a full property you declare the private field yourself and link it to a public property, so the class can use the data privately as a field while the property makes it accessible to other classes as well. See this simple example:
    private string _name;

    public string Name
    {
        get
        {
            return _name;
        }
        set
        {
            _name = value;
        }
    }
The private string _name is used by the class only, while the public Name property is accessible to any other class that can see this one, not just classes in the same namespace.
why I would want to use properties instead of fields (especially when it appears I am just adding additional code)
You want to use properties over fields because properties let you work with events: when you want to take some action whenever a value changes, the setter can raise PropertyChanging or PropertyChanged events that handlers are bound to. With fields this is not possible. Fields are simply public, private or protected, whereas a property can be read-only publicly but writable privately.
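A minimal sketch of raising PropertyChanged from a setter, using the standard INotifyPropertyChanged interface from System.ComponentModel. The Person class is a hypothetical example, not code from the question.

```csharp
using System;
using System.ComponentModel;

public class Person : INotifyPropertyChanged
{
    private string _name;

    public event PropertyChangedEventHandler PropertyChanged;

    public string Name
    {
        get { return _name; }
        set
        {
            if (_name == value) return;          // nothing changed, no event
            _name = value;

            // Copy the delegate first so a concurrent unsubscribe cannot null it.
            PropertyChangedEventHandler handler = PropertyChanged;
            if (handler != null)
                handler(this, new PropertyChangedEventArgs("Name"));
        }
    }
}

public static class NotifyDemo
{
    public static void Main()
    {
        var person = new Person();
        person.PropertyChanged += (sender, e) =>
            Console.WriteLine(e.PropertyName + " changed");
        person.Name = "Fred";
    }
}
```

A public field offers no place to hook this logic in; every writer would have to raise the event itself.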
any tips on recognizing the use of properties and not seeing them as simply methods (with the exception of the get;set being apparent) when tracing other people's code?
A method should be used when the return value is expected to be different every time you call it; a property should be used when the return value does not change that much between calls.
Any general rules of thumb when it comes to good programming methods in relation to when to use what?
Yes, I strongly recommend reading the Framework Design Guidelines for best practices of good programming.
Properties are the preferred way to cover fields to enforce encapsulation. They are also functional: you can expose a property that is of a different type and marshal the casting; you can change access modifiers; they are used in WinForms data binding; they allow you to embed lightweight per-property logic such as change notifications; etc.
When looking at other people's code, properties have different IntelliSense icons from methods.
If you think properties are just extra code, I would argue sticking with them anyway but make your life easier by auto-generating the property from the field (right-click -> Refactor -> Encapsulate Field...)
Properties allow you to do things other than set or get a value when you use them. Most notably, they allow you to do validation logic.
A Best Practice is to make anything exposed to the public a Property. That way, if you change the set/get logic at a later time, you only have to recompile your class, not every class linked against it.
One caveat is that things like "Threading.Interlocked.Increment" can work with fields, but cannot work with properties. If two threads simultaneously call Threading.Interlocked.Increment on SomeObject.LongIntegerField, the value will get increased by two even if there is no other locking. By contrast, if two threads simultaneously call Threading.Interlocked.Increment on SomeObject.LongIntegerProperty (VB.NET allows this by passing a temporary copy and writing it back afterwards; C# refuses to compile a property passed by ref), the value of that property might get incremented by two, or by one, or by -4,294,967,295, or who knows what other values (the property could be written to use locking to prevent values other than one or two in that scenario, but it could not be written to ensure the correct increment by two).
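One way to reconcile this caveat with the advice to expose properties, sketched with a hypothetical HitCounter class: keep the field private so it can be passed by ref to Interlocked, and expose only a read-only property plus a method for the increment.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class HitCounter
{
    private long _hits;   // a field, so it can be passed by ref

    // Read-only view of the counter.
    public long Hits
    {
        get { return Interlocked.Read(ref _hits); }
    }

    // The atomic increment operates directly on the field.
    public void Increment()
    {
        Interlocked.Increment(ref _hits);
    }
}

public static class InterlockedDemo
{
    public static long CountInParallel(int iterations)
    {
        var counter = new HitCounter();
        Parallel.For(0, iterations, i => counter.Increment());
        return counter.Hits;
    }

    public static void Main()
    {
        Console.WriteLine(CountInParallel(100000));
    }
}
```

Because every increment is atomic, the final count equals the number of iterations regardless of how the work is split across threads.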
I was going to say Properties (setters) are a great place to raise events like NotifyPropertyChanged, but someone else beat me to it.
Another good reason to consider Properties: let's say you use a factory to construct some object that has a default constructor, and you prepare the object via its Properties.
new foo(){Prop1 = "bar", Prop2 = 33, ...};
But if outside users new up your object, maybe there are some properties that you want them to see as read-only and not be able to set (only the factory should be able to set them)? You can make the setters internal - this only works, of course, if the object's class is in the same assembly as the factory.
There are other ways to achieve this goal but using Properties and varying accessor visibility is a good one to consider if you're doing interface-based development, or if you expose libraries to others, etc.
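A rough sketch of the internal-setter idea, assuming a hypothetical Widget class and WidgetFactory that live in the same assembly:

```csharp
using System;

public class Widget
{
    // Anyone may set the name.
    public string Name { get; set; }

    // Readable everywhere, settable only inside this assembly.
    public int SerialNumber { get; internal set; }
}

public static class WidgetFactory
{
    private static int _nextSerial = 1;

    public static Widget Create(string name)
    {
        // The object initializer can use the internal setter here
        // because the factory is in the same assembly as Widget.
        return new Widget { Name = name, SerialNumber = _nextSerial++ };
    }
}

public static class FactoryDemo
{
    public static void Main()
    {
        Widget w = WidgetFactory.Create("bar");
        Console.WriteLine(w.Name + " #" + w.SerialNumber);
    }
}
```

Code in another assembly can still read SerialNumber, but an assignment to it there would not compile.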
In my domain each Domain Entity may have many Value Objects. I have created value objects to represent money, weight, count, length, volume, percentage, etc.
Each of these value objects contains both a numeric value and a unit of measure. E.g. money contains the monetary value and the currency ($, euro,...) , weight contains the numeric value and the unit of weight (kilo, pound, ...)
In the user interface these are displayed side-by-side as well: field name, its value followed by its accompanying unit, typically in a properties panel. The domain entities have equivalent DTOs that are exposed to the UI.
I have been searching for the best way to transfer the value objects inside the DTOs to the UI.
Do I simply expose the specific value object as a part of the DTO?
Do I expose a generic "value object"-equivalent that provides name/value/unit in a DTO?
Do I split it into separate name/value/unit members inside the DTO, just to reassemble them in the UI?
Do I transfer them as a KeyValuePair or Tuple inside the DTO?
Something else?
I have searched intensively but no other question seems to quite address this issue. Greatly appreciate any suggestions!
EDIT:
In the UI both values and units could get changed and sent back to the domain to update.
I would be inclined to agree with debuggr's comment above if these are one-way transfers; Value Objects aren't really Domain objects - they have no behaviour that can change their state and therefore in many ways they are only specialised "bit-buckets" in that you can serialise them without losing context.
However, if you have followed DDD practices (or if your back-end is using multi-threading, etc.) then your Value Objects are immutable, i.e. they perhaps look something like this:
    public class Money
    {
        readonly decimal _amount;
        readonly string _currency;

        public decimal Amount { get { return _amount; } }
        // Currency is a string, matching the backing field
        public string Currency { get { return _currency; } }

        public Money(decimal amount, string currency)
        {
            //validity checks here and then
            _amount = amount;
            _currency = currency;
        }
    }
Now if you need to send these back from the client, you can't easily re-use them directly in DTO objects unless whatever DTO mapping system you have (custom WebAPI Model binder, Automapper, etc) can easily let you bind the DTO to a Value Object using constructors...which may or may not be a problem for you, it could get messy :)
I would tend to stay away from "generic" DTO objects for things like this though; bear in mind that on the UI you still want some semblance of the "Domain" for the client-side code to work with (regardless of whether that's JavaScript on a Web Page or C# on a Form/Console, or whatever). Plus, it tends to be only a matter of time before you find an exceptional Value Object that has Name/Value/Unit/Plus One Weird Property specific to that Value concept.
The only "fool-proof"*** way of handling this is one DTO per Value Object; although this is extra work you can't really go wrong - if you have lots and lots of these Value Objects, you can always write a simple DTO generation tool or use a T4 template to generate them for you, based on the public properties of your Value Objects.
***not a guarantee
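A minimal sketch of the one-DTO-per-Value-Object approach. MoneyDto and MoneyMapper are hypothetical names, and the Money class is restated here only so the example stands alone. The mapping back to the domain goes through the constructor, so the validity checks always run.

```csharp
using System;

// Immutable value object (restated so this sketch is self-contained).
public sealed class Money
{
    public Money(decimal amount, string currency)
    {
        if (string.IsNullOrEmpty(currency))
            throw new ArgumentException("Currency is required.", "currency");
        Amount = amount;
        Currency = currency;
    }

    public decimal Amount { get; private set; }
    public string Currency { get; private set; }
}

// Hypothetical DTO: plain settable properties, easy to serialise and bind.
public class MoneyDto
{
    public decimal Amount { get; set; }
    public string Currency { get; set; }
}

public static class MoneyMapper
{
    public static MoneyDto ToDto(Money money)
    {
        return new MoneyDto { Amount = money.Amount, Currency = money.Currency };
    }

    // Going back to the domain always passes through the constructor.
    public static Money ToDomain(MoneyDto dto)
    {
        return new Money(dto.Amount, dto.Currency);
    }
}

public static class MoneyDemo
{
    public static void Main()
    {
        MoneyDto dto = MoneyMapper.ToDto(new Money(9.99m, "EUR"));
        dto.Currency = "USD";                       // edited in the UI
        Money updated = MoneyMapper.ToDomain(dto);  // validated again here
        Console.WriteLine(updated.Amount + " " + updated.Currency);
    }
}
```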
DDD is all about behavior and explicitly expressing intent, next to clearly identifying the bounded contexts (transactional and organizational boundaries) for the problem you are trying to solve. This is far more important than the type of "structural" questions for which you are requesting answers.
I.e. starting from the "Domain Entities" that may have "Value Objects", where "Domain Entities" are mapped as a "DTO" to show/be edited in a UI is a statement about how you have structured things, that says nothing about what a user is trying to achieve in this UI, nor what the organization is required to do in response to this (i.e. the real business rules, such as awarding discounts, changing a shipping address, recommending other products a user might be interested in, changing a billing currency, etc).
It appears from your description that you have a domain model that is mirroring what needs to be viewed/edited on a UI. That is kind of "putting the cart before the horse". Now you have a lot of "tiers" that provide no added value and add a lot of complexity.
Let me try to explain what I mean, using the (simplified) example that was mentioned on having an "Order" with "Money". Using the approach that was mentioned, trying to show this on screen would likely involve the following steps:
1) Read the "Order Entity" for a given OrderId and its related "Money" values (likely in Order Lines for specific Product Types with a given Quantity and Unit Price). This would require a SQL statement with several joins (if using a SQL DB).
2) Map each of these somehow to a mirroring "domain objects" structure.
3) Map these again to a mirroring "DTO" object hierarchy.
4) Map these "DTO" objects to "View" or "ViewModel" objects in the UI.
That is a lot of work that in this example has not yielded any benefit of having a model which is supposed to capture and execute business logic.
Now as the next step, the user is editing fields in a UI. And you somehow have to marshal this back to your domain entity using the reverse route and try to infer the user's intent from the fields that were changed and subsequently apply business rules to that.
So say for instance that the user changes the currency on the "MoneyDTO" of a line item. What could be the user's intent? Make this the new Billing Currency and change it for all other line items as well? And how does this relate to the business rules? Do you need to look up the exchange rate and change the "Moneys" for all line items? Is there different business logic for more volatile currencies? Do you need to switch to new rules regarding VAT?
Those are the types of questions that seem to be more relevant for your domain, and would likely lead to a structure of domain entities and services that is different from the model which is viewed/modified on a UI.
Why not simply store the viewmodel in your database (e.g. as JSON so it can be retrieved with a single query and rendered directly), so that you do not need additional translation layers to show it to a user? Also, why not structure your UI to reveal intent, and map this to commands to be sent to your domain service? E.g. a "change shipping address" command is likely relevant in the "shipping" bounded context of your organisation, while "change billing currency" is relevant in the "billing" bounded context.
Also, if you complement this with domain events that are generated from your domain, denoting something that "has happened" you get additional benefits. For example the "order line added" event could be picked up by the "Additional Products A User Might Be Interested In" service, that in response updates the "Suggested Products" viewmodel in the UI for the user.
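To make the idea of intent-revealing commands a little more concrete, here is a rough sketch. Every name in it (ChangeBillingCurrencyCommand, BillingOrder, BillingService) is hypothetical, and the in-memory dictionary merely stands in for whatever persistence you use.

```csharp
using System;
using System.Collections.Generic;

// The command names the user's intent instead of shipping back edited fields.
public class ChangeBillingCurrencyCommand
{
    public ChangeBillingCurrencyCommand(Guid orderId, string newCurrency)
    {
        OrderId = orderId;
        NewCurrency = newCurrency;
    }

    public Guid OrderId { get; private set; }
    public string NewCurrency { get; private set; }
}

public class BillingOrder
{
    public BillingOrder(Guid id, string currency)
    {
        Id = id;
        BillingCurrency = currency;
    }

    public Guid Id { get; private set; }
    public string BillingCurrency { get; private set; }

    // Business rules for a currency change live in one place.
    public void ChangeBillingCurrency(string newCurrency)
    {
        if (string.IsNullOrEmpty(newCurrency))
            throw new ArgumentException("Currency is required.", "newCurrency");
        BillingCurrency = newCurrency;
        // exchange-rate and VAT rules would be applied here
    }
}

public class BillingService
{
    private readonly Dictionary<Guid, BillingOrder> _orders =
        new Dictionary<Guid, BillingOrder>();

    public void Add(BillingOrder order)
    {
        _orders[order.Id] = order;
    }

    public void Handle(ChangeBillingCurrencyCommand command)
    {
        _orders[command.OrderId].ChangeBillingCurrency(command.NewCurrency);
    }
}

public static class CommandDemo
{
    public static void Main()
    {
        var order = new BillingOrder(Guid.NewGuid(), "EUR");
        var service = new BillingService();
        service.Add(order);
        service.Handle(new ChangeBillingCurrencyCommand(order.Id, "USD"));
        Console.WriteLine(order.BillingCurrency);
    }
}
```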
I would recommend you to have a look at concepts from CQRS as one possible means for dealing with these types of problems. As a very basic introduction with some more detailed references you could check out Martin Fowler's take on this: http://martinfowler.com/bliki/CQRS.html
If my domain object should contain string properties in 2 languages, should I create 2 separate properties or create a new type BiLingualString?
For example in plant classification application, the plant domain object can contain Plant.LatName and Plant.EngName.
The number of bi-lingual properties for the whole domain is not big, about 6-8. I only need to support two languages, and information should be presented to the UI in both languages at the same time (so this is not localization). The requirements will not change during development.
It may look like an easy question, but this decision will have an impact on validation, persistence, object cloning and many other things.
Negative sides I can think of using new dualString type:
Validation: If I'm going to use DataAnnotations, the Enterprise Library validation block or FluentValidation, this will require more work; object graph validation is harder than simple property validation.
Persistence: either NH or EF will require more work with complex properties.
OOP: more complex object initialization, I will have to initialize this new Type in constructor before I can use it.
Architecture: converting objects for passing them between layers is harder, auto mapping tools will require more hand work.
While reading your question I kept wondering why you would not use localization, but then I read that information should be presented to the UI in both languages at the same time. In that case I think it makes sense to use properties.
I would go for a class with one string for each language, like the BiLingualString you mentioned:
    public class Names
    {
        public string EngName { get; set; }
        public string LatName { get; set; }
    }
Then I would use this class in my main Plant Class like this
    public class Plant : Names
    {
    }
If you are 100% sure that it will always be only Latin and English, I would just stick with the simplest solution: 2 string properties. It is also more flexible in the UI than having a BiLingualString. And you won't have to deal with complex types when persisting.
To help decide, I suggest considering how consistent this behavior will be at all layers. If you expose these as two separate properties on the business object, I would also expect to see it stored as two separate columns in a database record, for example, rather than two translations for the same property stored in a separate table. It does seem odd to store translations this way, but your justifications sound reasonable, and 6 properties is not unmanageable. But be sure that you don't intend to add more languages in the future.
If you expect this system to be somewhat dynamic in that you may need to add another language at some point, it would seem to make more sense to me to implement this differently so that you don't have to alter the schema when a new language needs to be supported.
I guess the thing to balance is this: consider the likelihood of having to adjust the languages or properties to accommodate a new language against the advantage (simplicity) you gain by exposing these directly as separate properties rather than having to load translations as a separate level.
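For comparison, the more dynamic alternative could look roughly like the sketch below. LocalizedText is a hypothetical type keyed by language code; the codes "en" and "la" and the sample plant names are only illustrative values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: one value per language code, so adding a language
// does not require a new property (or a new column per property).
public class LocalizedText
{
    private readonly Dictionary<string, string> _values =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // An indexer is itself a kind of property.
    public string this[string language]
    {
        get
        {
            string text;
            return _values.TryGetValue(language, out text) ? text : null;
        }
        set { _values[language] = value; }
    }
}

public class Plant
{
    public Plant()
    {
        Name = new LocalizedText();
    }

    public LocalizedText Name { get; private set; }
}

public static class LanguageDemo
{
    public static void Main()
    {
        var plant = new Plant();
        plant.Name["la"] = "Rosa canina";
        plant.Name["en"] = "Dog rose";
        Console.WriteLine(plant.Name["la"] + " / " + plant.Name["en"]);
    }
}
```

The price is exactly what the question lists: validation, persistence and mapping all have to understand the extra type.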
I'm new to the .NET world having come from C++ and I'm trying to better understand properties. I noticed in the .NET framework Microsoft uses properties all over the place. Is there an advantage for using properties rather than creating get/set methods? Is there a general guideline (as well as naming convention) for when one should use properties?
It is pure syntactic sugar. On the back end, it is compiled into plain get and set methods.
Use it because of convention, and that it looks nicer.
Some guidelines are that when it has a high risk of throwing Exceptions or going wrong, don't use properties but explicit getters/setters. But generally even then they are used.
Properties are get/set methods; simply, it formalises them into a single concept (for read and write), allowing (for example) metadata against the property, rather than individual members. For example:
[XmlAttribute("foo")]
public string Name {get;set;}
This is a get/set pair of methods, but the additional metadata applies to both. It also, IMO, simply makes it easier to use:
someObj.Name = "Fred"; // clearly a "set"
DateTime dob = someObj.DateOfBirth; // clearly a "get"
We haven't duplicated the fact that we're doing a get/set.
Another nice thing is that it allows simple two-way data-binding against the property ("Name" above), without relying on any magic patterns (except those guaranteed by the compiler).
There is an entire book dedicated to answering these sorts of questions: Framework Design Guidelines from Addison-Wesley. See section 5.1.3 for advice on when to choose a property vs a method.
Much of the content of this book is available on MSDN as well, but I find it handy to have it on my desk.
Consider reading Choosing Between Properties and Methods. It has a lot of information on .NET design guidelines.
properties are get/set methods
Properties are set and get methods as people around here have explained, but the idea of having them is making those methods the only ones playing with the private values (for instance, to handle validations).
The whole other logic should be done against the properties, but it's always easier mentally to work with something you can handle as a value on the left and right side of operations (properties) and not having to even think it is a method.
I personally think that's the main idea behind properties.
I always think that properties are the nouns of a class, whereas methods are the verbs...
First of all, the naming convention is: use PascalCase for the property name, just like with methods. Also, properties should not contain very complex operations. These should be kept in methods.
In OOP, you would describe an object as having attributes and functionality. You do that when designing a class. Consider designing a car. Examples for functionality could be the ability to move somewhere or activate the wipers. Within your class, these would be methods. An attribute would be the number of passengers within the car at a given moment. Without properties, you would have two ways to implement the attribute:
Make a variable public:
    // class Car
    public int passengerCount = 4;

    // calling code
    int count = myCar.passengerCount;
This has several problems. First of all, it is not really an attribute of the vehicle. You have to update the value from inside the Car class to have it represent the vehicle's true state. Second, the variable is public and could also be written to.
The second variant is one widely used, e.g. in Java, where you do not have properties like in C#:
Use a method to encapsulate the value and maybe perform a few operations first.
    // class Car
    public int GetPassengerCount()
    {
        // perform some operation
        int result = CountAllPassengers();
        // return the result
        return result;
    }

    // calling code
    int count = myCar.GetPassengerCount();
This way you manage to get around the problems with a public variable. By asking for the number of passengers, you can be sure to get the most recent result since you recount before answering. Also, you cannot change the value since the method does not allow it. The problem is, though, that you actually wanted the amount of passengers to be an attribute, not a function of your car.
The second approach is not necessarily wrong, it just does not read quite right. That's why some languages include ways of making attributes look like variables, even though they work like methods behind the scenes. ActionScript, for example, also includes syntax to define methods that will be accessed in a variable style from within the calling code.
Keep in mind that this also brings responsibility. The caller will expect it to behave like an attribute, not a function. So if just asking a car how many passengers it has takes 20 seconds to load, then you should probably pack that into a real method, since the caller will expect functions to take longer than accessing an attribute.
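For completeness, the property variant of the car example might look like the sketch below. The passenger list and the Board method are assumptions added only to make it self-contained.

```csharp
using System;
using System.Collections.Generic;

public class Car
{
    private readonly List<string> _passengers = new List<string>();

    // Reads like a variable at the call site, runs like a method.
    // There is no setter, so callers cannot overwrite the count.
    public int PassengerCount
    {
        get { return _passengers.Count; }
    }

    public void Board(string passenger)
    {
        _passengers.Add(passenger);
    }
}

public static class CarDemo
{
    public static void Main()
    {
        var myCar = new Car();
        myCar.Board("Alice");
        myCar.Board("Bob");

        int count = myCar.PassengerCount;   // no parentheses
        Console.WriteLine(count);
    }
}
```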
EDIT:
I almost forgot to mention this: The ability to actually perform certain checks before letting a variable be set. By just using a public variable, you could basically write anything into it. The setter method or property give you a chance to check it before actually saving it.
Properties simply save you some time from writing the boilerplate that goes along with get/set methods.
That being said, a lot of .NET stuff handles properties differently- for example, a Grid will automatically display properties but won't display a function that does the equivalent.
This is handy, because you can make get/set methods for things that you don't want displayed, and properties for those you do want displayed.
The compiler actually emits get_MyProperty and set_MyProperty methods for each property you define.
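You can see those generated accessors through reflection. A small sketch, with Sample as a hypothetical class:

```csharp
using System;
using System.Reflection;

public class Sample
{
    public string MyProperty { get; set; }
}

public static class AccessorDemo
{
    // Returns the names of the methods the compiler generated for the property.
    public static string[] AccessorNames()
    {
        PropertyInfo property = typeof(Sample).GetProperty("MyProperty");
        return new[]
        {
            property.GetGetMethod().Name,
            property.GetSetMethod().Name
        };
    }

    public static void Main()
    {
        foreach (string name in AccessorNames())
            Console.WriteLine(name);   // get_MyProperty, then set_MyProperty
    }
}
```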
Although it is not a hard and fast rule and, as others have pointed out, Properties are implemented as Get/Set pairs 'behind the scenes' - typically Properties surface encapsulated/protected state data whereas Methods (aka Procedures or Functions) do work and yield the result of that work.
As such, Methods will often take arguments that they might merely consume but also may return in an altered state, or may produce a new object or value as a result of the work done.
Generally speaking, if you need a way of controlling access to data or state, then Properties allow you to implement that access in a defined, validatable and optimised way (allowing access restriction, range and error checking, creation of backing store on demand and a way of avoiding redundant setting calls).
In contrast, methods transform state and give rise to new values internally and externally without necessarily repeatable results.
Certainly if you find yourself writing procedural or transformative code in a property, you are probably really writing a method.
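Two of the property uses mentioned above, range checking and creating the backing store on demand, could be sketched like this (Speaker is a hypothetical class):

```csharp
using System;
using System.Collections.Generic;

public class Speaker
{
    private int _volume;
    private List<string> _playlist;   // created only when first needed

    // Range checking in the setter.
    public int Volume
    {
        get { return _volume; }
        set
        {
            if (value < 0 || value > 100)
                throw new ArgumentOutOfRangeException(
                    "value", "Volume must be between 0 and 100.");
            _volume = value;
        }
    }

    // Backing store created on first access.
    public List<string> Playlist
    {
        get
        {
            if (_playlist == null)
                _playlist = new List<string>();
            return _playlist;
        }
    }
}

public static class SpeakerDemo
{
    public static void Main()
    {
        var speaker = new Speaker();
        speaker.Volume = 40;
        speaker.Playlist.Add("Track 1");
        Console.WriteLine(speaker.Volume + " / " + speaker.Playlist.Count);
    }
}
```

Both accessors stay short and cheap; anything heavier would belong in a method.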
Also note that properties are available via reflection. While methods are, too, properties represent "something interesting" about the object. If you are trying to display a grid of properties of an object-- say, something like the Visual Studio form designer-- then you can use reflection to query the properties of a class, iterate through each property, and interrogate the object for its value.
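A minimal sketch of that kind of reflection. CarInfo and PropertyReader are hypothetical names; the reflection calls themselves are the standard System.Reflection API.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class CarInfo
{
    public string Model { get; set; }
    public int PassengerCount { get; set; }
}

public static class PropertyReader
{
    // Collects the name and current value of every public instance property.
    public static Dictionary<string, object> Read(object target)
    {
        var result = new Dictionary<string, object>();
        PropertyInfo[] properties = target.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance);

        foreach (PropertyInfo property in properties)
        {
            // Skip write-only properties and indexers.
            if (property.CanRead && property.GetIndexParameters().Length == 0)
                result[property.Name] = property.GetValue(target, null);
        }
        return result;
    }

    public static void Main()
    {
        var car = new CarInfo { Model = "Estate", PassengerCount = 4 };
        foreach (KeyValuePair<string, object> entry in Read(car))
            Console.WriteLine(entry.Key + " = " + entry.Value);
    }
}
```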
Think of it this way: Properties encapsulate your fields (commonly marked private) while at the same time allowing your fellow developers to either set or get the field value. You can even perform routine validation in the property's set method should you desire.
Properties are not just syntactic sugar - they are important if you need to create object-relational mappings (Linq2Sql or Linq2Entities), because they behave just like variables while it is possible to hide the implementation details of the object-relational mapping (persistence). It is also possible to validate a value being assigned to it in the setter of the property and protect it against assigning unwanted values.
You can't do this with the same elegance with methods. I think it is best to demonstrate this with a practical example.
In one of his articles, Scott Gu creates classes which are mapped to the Northwind database using the "code first" approach. Here is one short example taken from Scott's blog (with a little modification; the full article can be read on Scott Gu's blog):
    using System.ComponentModel.DataAnnotations;  // [Key]
    using System.Data.Entity;                     // DbContext, DbSet (Entity Framework)

    public class Product
    {
        [Key]
        public int ProductID { get; set; }
        public string ProductName { get; set; }
        public Decimal? UnitPrice { get; set; }
        public bool Discontinued { get; set; }
        public virtual Category Category { get; set; }
    }

    // class Category omitted in this example

    public class Northwind : DbContext
    {
        public DbSet<Product> Products { get; set; }
        public DbSet<Category> Categories { get; set; }
    }
You can use entity sets Products, Categories and the related classes Product and Category just as if they were normal objects containing variables: You can read and write them and they behave just like normal variables. But you can also use them in Linq queries, persist them (store them in the database and retrieve them).
Note also how easy it is to use annotations (C# attributes) to define the primary key (in this example ProductID is the primary key for Product).
While the properties are used to define a representation of the data stored in the database, there are some methods defined in the entity set class which control the persistence: For example, the method Remove() marks a given entity as deleted, while Add() adds a given entity, SaveChanges() makes the changes permanent. You can consider the methods as actions (i.e. you control what you want to do with the data).
Finally I give you an example how naturally you can use those classes:
    // requires: using System.Linq; (for Single)

    // instantiate the database as object
    var nw = new Northwind();

    // select product
    var product = nw.Products.Single(p => p.ProductName == "Chai");

    // 1. modify the price
    product.UnitPrice = 2.33M;

    // 2. store a new category
    // (assumes the omitted Category class has CategoryName and Description properties)
    var c = new Category();
    c.CategoryName = "Example category";
    c.Description = "Show how to persist data";
    nw.Categories.Add(c);

    // Save changes (1. and 2.) to the Northwind database
    nw.SaveChanges();
What advice/suggestions/guidance would you provide for designing a class that has upwards of 100 properties?
Background
The class describes an invoice. An invoice can have upwards of 100 attributes describing it, i.e. date, amount, code, etc...
The system we are submitting the invoice to uses each of the 100 attributes and is submitted as a single entity (as opposed to various parts being submitted at different times).
The attributes describing the invoice are required as part of the business process. The business process can not be changed.
Suggestions?
What have others done when faced with designing a class that has 100 attributes? i.e., create the class with each of the 100 properties?
Somehow break it up (if so, how)?
Or is this a fairly normal occurrence in your experience?
EDIT
After reading through some great responses and thinking about this further, I don't think there really is any single answer for this question. However, since we ended up modeling our design along the lines of LBrushkin's answer, I have given him credit. Although not the most popular answer, LBrushkin's answer helped push us into defining several interfaces which we aggregate and reuse throughout the application, and it nudged us into investigating some patterns that may be helpful down the road.
You could try to 'normalize' it like you would a database table. Maybe put all the address-related properties in an Address class, for example, and then have a BillingAddress and a MailingAddress property of type Address in your Invoice class. These classes could also be reused later on.
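A minimal sketch of that grouping, assuming hypothetical field names (the question does not list the real ones): the six flat address properties collapse into two properties of one reusable type.

```csharp
using System;

// Hypothetical names throughout; the real invoice fields are not given in the question.
public sealed class Address
{
    public string Street { get; set; } = "";
    public string City { get; set; } = "";
    public string PostalCode { get; set; } = "";
}

public sealed class Invoice
{
    public DateTime Date { get; set; }
    public decimal Amount { get; set; }

    // Two properties of the same reusable type replace six flat ones.
    public Address BillingAddress { get; set; } = new Address();
    public Address MailingAddress { get; set; } = new Address();
}
```

Usage would look like `invoice.BillingAddress.City = "Springfield";`, and Address can be shared with any other class that needs one.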
The bad design is obviously in the system you are submitting to: no invoice has 100+ properties that cannot be grouped into a substructure. For example, an invoice will have a customer, and a customer will have an id and an address. The address in turn will have a street, a postal code, and so on. But all these properties should not belong directly to the invoice; an invoice has no customer id or postal code.
If you have to build an invoice class with all these properties directly attached to the invoice, I suggest making a clean design with multiple classes for a customer, an address, and all the other required stuff, and then just wrapping this well-designed object graph with a fat invoice class that has no storage or logic of its own and simply passes all operations to the object graph behind it.
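As a rough sketch of that wrapper idea (all class and property names here are assumptions made up for illustration), the fat class exposes the flat properties the external system expects but stores nothing itself:

```csharp
using System;

// Hypothetical, well-factored object graph.
public sealed class Address
{
    public string Street { get; set; } = "";
    public string PostalCode { get; set; } = "";
}

public sealed class Customer
{
    public int Id { get; set; }
    public Address Address { get; set; } = new Address();
}

// Flat view for the external system: no fields of its own besides the graph,
// every property just forwards to the object graph behind it.
public sealed class FlatInvoice
{
    private readonly Customer _customer;

    public FlatInvoice(Customer customer)
    {
        _customer = customer ?? throw new ArgumentNullException(nameof(customer));
    }

    public int CustomerId
    {
        get => _customer.Id;
        set => _customer.Id = value;
    }

    public string CustomerPostalCode
    {
        get => _customer.Address.PostalCode;
        set => _customer.Address.PostalCode = value;
    }
}
```

The rest of the application can keep working with Customer and Address, and only the submission code needs to see FlatInvoice.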
I would imagine that some of these properties are probably related to each other. I would imagine that there are probably groups of properties that define independent facets of an Invoice that make sense as a group.
You may want to consider creating individual interfaces that model the different facets of an invoice. This may help define the methods and properties that operate on these facets in a more coherent, and easy to understand manner.
You can also choose to combine properties that have a particular meaning (addresses, locations, ranges, etc.) into objects that you aggregate, rather than keeping them as individual properties of a single large class.
Keep in mind that the abstraction you choose to model a problem and the abstraction you need in order to communicate with some other system (or business process) don't have to be the same. In fact, it's often productive to apply the bridge pattern to allow the separate abstractions to evolve independently.
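To illustrate just the facet-interface part (this is not a full bridge pattern, and IBillable, IShippable, and InvoiceReports are invented names), something along these lines lets consumers depend on one facet instead of the whole class:

```csharp
using System;
using System.Globalization;

// Each interface models one hypothetical facet of an invoice.
public interface IBillable
{
    decimal Amount { get; }
    string CurrencyCode { get; }
}

public interface IShippable
{
    string ShipToCity { get; }
    DateTime? ShipDate { get; }
}

// The big class still exists, but it is described by small, coherent contracts.
public sealed class Invoice : IBillable, IShippable
{
    public decimal Amount { get; set; }
    public string CurrencyCode { get; set; } = "USD";
    public string ShipToCity { get; set; } = "";
    public DateTime? ShipDate { get; set; }
}

public static class InvoiceReports
{
    // Depends only on the billing facet, not on every property of Invoice.
    public static string DescribeCharge(IBillable billable)
    {
        if (billable == null) throw new ArgumentNullException(nameof(billable));
        return billable.Amount.ToString("0.00", CultureInfo.InvariantCulture)
            + " " + billable.CurrencyCode;
    }
}
```

Code that only cares about money takes an IBillable, so it keeps compiling no matter how the shipping side of the class changes.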
Hmmm... Are all of those really relevant specifically, and only to the invoice? Typically what I've seen is something like:
class Customer
.ID
.Name
class Address
.ID
.Street1
.Street2
.City
.State
.Zip
class CustomerAddress
.CustomerID
.AddressID
.AddressDescription ("ship","bill",etc)
class Order
.ID
.CustomerID
.DatePlaced
.DateShipped
.SubTotal
class OrderDetails
.OrderID
.ItemID
.ItemName
.ItemDescription
.Quantity
.UnitPrice
And tying it all together:
class Invoice
.OrderID
.CustomerID
.DateInvoiced
When printing the invoice, join all of these records together.
If you really must have a single class with 100+ properties, it may be better to use a dictionary:
Dictionary<string,object> d = new Dictionary<string,object>();
d.Add("CustomerName","Bob");
d.Add("ShipAddress","1600 Pennsylvania Ave, Suite 0, Washington, DC 00001");
d.Add("ShipDate",DateTime.Now);
....
The idea here is to divide your data into logical units. In the above example, each class corresponds to a table in a database. You could load each of these into a dedicated class in your data access layer, or select with a join from the tables where they are stored when generating your report (invoice).
Unless your code actually uses many of the attributes at many places, I'd go for a dictionary instead.
Having real properties has its advantages (type safety, discoverability/IntelliSense, refactorability), but these don't matter if all the code does is get the values from elsewhere, display them in a UI, send them to a web service, save them to a file, etc.
It would be too many columns when your class / table that you store it in starts to violate the rules of normalization.
In my experience, it has been very hard to get that many columns when you are normalizing properly. Apply the rules of normalization to the wide table / class and I think you will end up with fewer columns per entity.
It's considered bad O-O style, but if all you're doing is populating an object with properties to pass them onward for processing, and the processing only reads the properties (presumably to create some other object or database updates), then perhaps a simple POD object is what you need, having all public members, a default constructor, and no other member methods. You can thus treat it as a container of properties instead of a full-blown object.
I used a Dictionary<string,string> for something like this.
It comes with a whole bunch of functions that can process it, it's easy to convert strings to other structures, easy to store, etc.
You should not be motivated purely by aesthetic considerations.
Per your comments, the object is basically a data transfer object consumed by a legacy system that expects the presence of all the fields.
Unless there is real value in composing this object from parts, what precisely is gained by obscuring its function and purpose?
These would be valid reasons:
1 - You are gathering the information for this object from various systems and the parts are relatively independent. It would make sense to compose the final object in that case based on process considerations.
2 - You have other systems that can consume various sub-sets of the fields of this object. Here reuse is the motivating factor.
3 - There is a very real possibility of a next generation invoicing system based on a more rational design. Here extensibility and evolution of the system are the motivating factor.
If none of these considerations are applicable in your case, then what's the point?
It sounds like for the end result you need to produce an invoice object with around 100 properties. Do you have 100 such properties in every case? Maybe you would want a factory, a class that would produce an invoice given a smaller set of parameters. A different factory method could be added for each scenario in which a particular subset of the invoice's fields is relevant.
If what you're trying to create is a table gateway for a pre-existing 100-column table to this other service, a list or dictionary might be a pretty quick way to get started. However, if you're taking input from a large form or UI wizard, you're probably going to have to validate the contents before submission to your remote service.
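A small sketch of such a factory, assuming made-up field names and made-up defaults ("USD", "NET30") purely for illustration: callers pass the handful of values that vary, and the factory method fills in the rest for that scenario.

```csharp
using System;

// Hypothetical invoice with only a few of the ~100 properties shown.
public sealed class Invoice
{
    public string Code { get; set; } = "";
    public decimal Amount { get; set; }
    public DateTime Date { get; set; }
    public string CurrencyCode { get; set; } = "";
    public string PaymentTerms { get; set; } = "";
}

public static class InvoiceFactory
{
    // One factory method per scenario; the defaults below are assumptions.
    public static Invoice CreateDomestic(string code, decimal amount, DateTime date)
    {
        return new Invoice
        {
            Code = code,
            Amount = amount,
            Date = date,
            CurrencyCode = "USD",
            PaymentTerms = "NET30"
        };
    }
}
```

A call such as `InvoiceFactory.CreateDomestic("INV-1", 99.95m, DateTime.Today)` then needs three arguments instead of dozens of property assignments.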
A simple DTO might look like this:
class Form
{
public $stuff = array();
function add( $key, $value ) {}
}
A table gateway might be more like:
class Form
{
function findBySubmitId( $id ) {} // look up my form
function saveRecord() {} // save it for my session
function toBillingInvoice() {} // export it when done
}
And you could extend that pretty easily depending on if you have variations of the invoice. (Adding a validate() method for each subclass might be appropriate.)
class TPSReport extends Form {
function validate() {}
}
If you want to separate your DTO from the delivery mechanism, because the delivery mechanism is generic to all your invoices, that could be easy. However, you might be in a situation where there is business logic around the success or failure of the invoice. And this is where I'm prolly going off into the weeds. But it's where an OO model can be useful... I'll wager a penny that there will be different invoices and different procedures for different invoices, and if invoice submission barfs, you'll need extra routines :-)
class Form {
function submitToBilling() {}
function reportFailedSubmit() {}
function reportSuccessfulSubmit() {}
}
class TPSReport extends Form {
function validate() {}
function reportFailedSubmit() { /* oh this goes to AR */ }
}
Note David Lively's answer: it is a good insight. Often, fields on a form are each their own data structures and have their own validation rules. So you can model composite objects pretty quickly. This would associate each field type with its own validation rules and enforce stricter typing.
If you do have to get further into validation, business rules are often modelled quite differently from the forms or the DTOs that supply them. You could also be faced with logic that is oriented by department and has little to do with the form. It's important to keep that out of the validation of the form itself and to model the submission process(es) separately.
If you are organizing a schema behind these forms, instead of a table with 100 columns, you would probably break down the entries by field identifiers and values, into just a few columns.
table FormSubmissions (
id int
formVer int -- fk of FormVersions
formNum int -- group by form submission
fieldName int -- fk of FormFields
fieldValue text
)
table FormFields (
id int
fieldName char
)
table FormVersions (
id
name
)
select s.*, f.fieldName from FormSubmissions s
left join FormFields f on s.fieldName = f.id
where formNum = 12345 ;
I would say this is definitely a case where you're going to want to refactor your way around until you find something comfortable. Hopefully you have some control over things like schema and your object model. (BTW... is that table known as 'normalized'? I've seen variations on that schema, typically organized by data type... good?)
Do you always need all the properties that are returned? Can you use projection with whatever class is consuming the data and only generate the properties you need at the time?
You could try LINQ; it will auto-generate properties for you. If the fields are spread across multiple tables, you could build a view and drag the view over to your designer.
Dictionary? Why not, but not necessarily. I see a C# tag; your language has reflection, good for you. I had a few overly large classes like this in my Python code, and reflection helps a lot:
for attName in 'attr1', 'attr2', ..... (10 other attributes):
    setattr( self, attName, process_attribute( getattr( self, attName )))
When you want to convert 10 string members from some encoding to UNICODE, some other string members shouldn't be touched, you want to apply some numerical processing to other members... convert types... a for loop beats copy-pasting lots of code anytime for cleanliness.
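Since the question is tagged C#, here is a rough equivalent of that loop using System.Reflection. PropertyTools and TransformStrings are invented helper names and the Invoice class is a stand-in; only the reflection calls themselves (GetProperties, GetValue, SetValue) are the standard API.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class PropertyTools
{
    // Applies `transform` to every public instance string property
    // that has both a getter and a setter; other properties are left alone.
    public static void TransformStrings(object target, Func<string, string> transform)
    {
        if (target == null) throw new ArgumentNullException(nameof(target));
        if (transform == null) throw new ArgumentNullException(nameof(transform));

        var stringProperties = target.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.PropertyType == typeof(string)
                        && p.CanRead && p.CanWrite
                        && p.GetIndexParameters().Length == 0);

        foreach (var property in stringProperties)
        {
            var value = property.GetValue(target) as string;
            if (value != null)
            {
                property.SetValue(target, transform(value));
            }
        }
    }
}

// Stand-in class for illustration only.
public sealed class Invoice
{
    public string Code { get; set; } = "";
    public string CustomerName { get; set; } = "";
    public decimal Amount { get; set; }
}
```

For example, `PropertyTools.TransformStrings(invoice, s => s.Trim());` trims every string property in one call and never touches Amount. Reflection is slower than direct property access, so this suits bulk clean-up rather than hot paths.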
If an entity has a hundred unique attributes, then a single class with a hundred properties is the correct thing to do.
It may be possible to split things like addresses into a sub class, but this is because an address is really an entity in itself and easily recognised as such.
A textbook (i.e. oversimplified, not usable in the real world) invoice would look like:
class invoice:
int id;
address shipto_address;
address billing_address;
date order_date;
date ship_date;
.
.
.
line_item invoice_line[999];
class line_item:
int item_no;
int product_id;
amt unit_price;
int qty;
amt item_cost;
.
.
.
So I am surprised you don't have at least an array of line_items in there.
Get used to it! In the business world an entity can easily have hundreds and sometimes thousands of unique attributes.
If all else fails, at least split the class into several partial classes for better readability. It'll also make it easier for the team to work in parallel on different parts of this class.
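For anyone who hasn't used them, a quick sketch of what that split could look like. The file names in the comments and the properties are hypothetical; in a real project each part would live in its own file, and the compiler merges them into one Invoice class.

```csharp
using System;

// File: Invoice.Billing.cs (hypothetical file name)
public partial class Invoice
{
    public decimal Amount { get; set; }
    public string CurrencyCode { get; set; } = "USD";
}

// File: Invoice.Shipping.cs (hypothetical file name)
public partial class Invoice
{
    public string ShipToCity { get; set; } = "";
    public DateTime? ShipDate { get; set; }
}
```

Callers still see a single type, e.g. `new Invoice { Amount = 10m, ShipToCity = "Oslo" }`, so this only changes how the source is organised, not the design.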
good luck :)