C# Where to load initial data in an object? - c#

Similar to this question:
C# Constructor Design, but this question is slightly different.
I have a class Customer and a class CustomerManager. When an instance is created of the CustomerManager class I want to load all the customers. And this is where I got stuck. I can do this several ways:
Load all the customers in the constructor (I don't like this one because it can take a while if I have many customers)
In every method of the CustomerManager class that performs database-related actions, check whether the local list of customers is loaded and, if not, load the list:
public Customer FindCustomer(int id)
{
if (_customers == null)
{
// some code which will load the customers list
}
// ... look up and return the customer
}
Create a method which loads all the customers. This method must be called before calling any method which performs database-related actions:
In the class:
public void LoadData()
{
// some code which will load the customers list
}
In the form:
CustomerManager manager = new CustomerManager();
manager.LoadData();
Customer customer = manager.FindCustomer(id);
What is the best way to do this?
EDIT:
I have the feeling that I am misunderstood here. Maybe it is because I wasn't clear enough. In the CustomerManager class I have several methods which depend on the local list (_customers). So, my question is, where should I fill that list?

What you are describing is "lazy loading".
A simple approach is to have a private property like this:
private List<Customer> _customers;
private List<Customer> Customers
{
get
{
if(_customers == null)
_customers = LoadData();
return _customers;
}
}
Then, you refer to Customers internally. The customers will be loaded the first time they are needed but no earlier.
This is such a common pattern that .Net 4.0 added a Lazy<T> class that does this for you.
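In that case, you just define it as a private field like this (LoadData must be static to be used in a field initializer; for an instance method, create the Lazy<T> in the constructor instead):
private Lazy<List<Customer>> _customers = new Lazy<List<Customer>>(LoadData);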
Then, you simply refer to your customers in code:
_customers.Value
The class will initialize the value with your LoadData() method.
If you are not on .Net 4.0 yet, the Lazy<T> class is very easy to implement.
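As a minimal sketch of the Lazy<T> approach, assume a hypothetical Customer class with Id and Name properties, and a LoadData method that stands in for the real database call. The LoadCount property is not part of the pattern; it is only there so the deferred load can be observed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical customer type, for illustration only.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class CustomerManager
{
    private readonly Lazy<List<Customer>> _customers;

    // Counts how often the data was loaded (illustration only).
    public int LoadCount { get; private set; }

    public CustomerManager()
    {
        // Created in the constructor because a field initializer
        // cannot refer to the instance method LoadData.
        _customers = new Lazy<List<Customer>>(LoadData);
    }

    public Customer FindCustomer(int id)
    {
        // The first access to Value triggers LoadData; later accesses reuse the list.
        return _customers.Value.FirstOrDefault(c => c.Id == id);
    }

    private List<Customer> LoadData()
    {
        // Placeholder for the real database query.
        LoadCount++;
        return new List<Customer>
        {
            new Customer { Id = 1, Name = "Ada" },
            new Customer { Id = 2, Name = "Grace" }
        };
    }
}
```

Constructing the manager is cheap here: nothing is loaded until the first call to FindCustomer, and every later call reuses the same list.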

Use a property for accessing the customers. Have that check if the customers are loaded.

Well, it depends. All your options have advantages and disadvantages.
The good thing about options 1 and 3 is that the user has full control over when the (lengthy) data loading operation is performed. Whether option 1 or 3 is better depends on whether it makes sense to create the Manager and load the data later or not. Personally, I prefer a separate LoadData method if it's a lengthy operation, but that might be a matter of taste.
The good thing about option 2 is that the data will not be loaded if it is not needed. The drawback is that the (lengthy) load occurs as a side effect of the first access, which makes your program "less deterministic".
In principle, all the options you have presented are fine and valid choices. It really depends on your requirements.

Related

How can I avoid adding getters to facilitate unit testing?

After reading a blog post mentioning how it seems wrong to expose a public getter just to facilitate testing, I couldn't find any concrete examples of better practices.
Suppose we have this simple example:
public class ProductNameList
{
private IList<string> _products = new List<string>();
public void AddProductName(string productName)
{
_products.Add(productName);
}
}
Let's say for object-oriented design reasons, I have no need to publicly expose the list of products. How then can I test whether AddProductName() did its job? (Maybe it added the product twice, or not at all.)
One solution is to expose a read-only version of _products where I can test whether it has only one product name -- the one I passed to AddProductName().
The blog post I mentioned earlier says it's more about interaction (i.e., did the product name get added) rather than state. However, state is exactly what I'm checking. I already know AddProductName() has been called -- I want to test whether the object's state is correct once that method has done its work.
Disclaimer: Although this question is similar to
Balancing Design Principles: Unit Testing, (1) the language is different (C# instead of Java), (2) this question has sample code, and (3) I don't feel the question was adequately answered (i.e., code would have helped demonstrate the concept).
Unit tests should test the public API.
If you have "no need to publicly expose the list of products", then why would you care whether AddProductName did its job? What possible difference would it make if the list is entirely private and never, ever affected anything else?
Find out what effect AddProductName has on the state that can be detected using the API, and test that.
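To illustrate testing through the public API, here is a sketch that assumes the class offers a Contains method and a Count property. Both are hypothetical and do not appear in the question; the point is only that the test asserts on behaviour the class already exposes rather than on the private list.

```csharp
using System.Collections.Generic;

public class ProductNameList
{
    private readonly IList<string> _products = new List<string>();

    public void AddProductName(string productName)
    {
        _products.Add(productName);
    }

    // Hypothetical public members: whatever the real API exposes would be used instead.
    public bool Contains(string productName)
    {
        return _products.Contains(productName);
    }

    public int Count
    {
        get { return _products.Count; }
    }
}
```

A test would then call AddProductName("Foo") and check that Contains("Foo") is true and Count is 1, which catches both "added twice" and "not added at all" without exposing the list itself.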
Very similar question here: Domain Model (Exposing Public Properties)
You could make your accessors protected so you can mock or you could use internal so that you can access the property in a test but that IMO would be wrong as you have suggested.
I think sometimes we get so caught up in wanting to make sure that every little thing in our code is tested. Sometimes we need to take a step back and ask why we are storing this value, and what its purpose is. Then, instead of testing that the value gets set, we can start testing that the behaviour of the component is correct.
EDIT
One thing you can do in your scenario is to have a bastard constructor where you inject an IList and then you test that you have added a product:
public class ProductNameList
{
private IList<string> _products;
internal ProductNameList(IList<string> products)
{
_products = products;
}
...
}
You would then test it like this:
[Test]
public void FooTest()
{
var productList = new List<string>();
var productNameList = new ProductNameList(productList);
productNameList.AddProductName("Foo");
Assert.IsTrue(productList[0] == "Foo");
}
You will need to remember to make internals visible to your test assembly (for example, with the InternalsVisibleTo attribute).
Make _products protected instead of private. In your mock, you can add an accessor.
To test if AddProductName() did its job, instead of using a public getter for _ProductNames, make a call to GetProductNames() - or the equivalent that's defined in your API. Such a function may not necessarily be in the same class.
Now, if your API does not expose any way to get information about product names, then AddProductName() has no observable side effects (In which case, it is a meaningless function).
If AddProductName() does have side effects, but they are indirect - say, a method in ProductList that writes the list of product names to a file - then ProductList should be split into two classes: one that manages the list, and another that calls its Add and Get API and performs the side effects.

should I make this class static?

In the projects I worked on I have classes that query/update database, like this one,
public class CompanyInfoManager
{
public List<string> GetCompanyNames()
{
//Query database and return list of company names
}
}
As I keep creating more and more classes of this sort, I realize that maybe I should make this type of class static. The obvious benefit is avoiding the need to create class instances every time I need to query the database. But since there is only one copy of a static class, will this result in hundreds of requests contending for that one copy?
Thanks,
I would not make that class static but instead would use dependency injection and pass in needed resources to that class. This way you can create a mock repository (that implements the IRepository interface) to test with. If you make the class static and don't pass in your repository then it is very difficult to test since you can't control what the static class is connecting to.
Note: The code below is a rough example and is only intended to convey the point, not necessarily compile and execute.
public interface IRepository
{
DataSet ExecuteQuery(string aQuery);
//Other methods to interact with the DB (such as update or insert) are defined here.
}
public class CompanyInfoManager
{
private IRepository theRepository;
public CompanyInfoManager(IRepository aRepository)
{
//A repository is required so that we always know what
//we are talking to.
theRepository = aRepository;
}
public List<string> GetCompanyNames()
{
//Query database and return list of company names
string query = "SELECT * FROM COMPANIES";
DataSet results = theRepository.ExecuteQuery(query);
//Process the results...
return listOfNames;
}
}
To test CompanyInfoManager:
//Class to test CompanyInfoManager
public class MockRepository : IRepository
{
//This method will always return a known value.
public DataSet ExecuteQuery(string aQuery)
{
DataSet returnResults = new DataSet();
//Fill the data set with known values...
return returnResults;
}
}
//This will always contain known values that you can test.
IList<string> names = new CompanyInfoManager(new MockRepository()).GetCompanyNames();
I didn't want to ramble on about dependency injection. Misko Hevery's blog goes into great detail with a great post to get started.
It depends. Will you ever need to make your program multithreaded? Will you ever need to connect to more than one database? Will you ever need to store state in this class? Do you need to control the lifetime of your connections? Will you need data caching in the future? If you answer yes to any of these, a static class will make things awkward.
My personal advice would be to make it an instance as this is more OO and would give you flexibility you might need in the future.
You have to be careful making this class static. In a web app, each request is handled on its own thread. Static utilities can be thread-unsafe if you are not careful. And if that happens you are not going to be happy.
I would highly recommend you follow the DAO pattern. Use a tool like Spring to make this easy for you. All you have to do is configure a datasource and your DB access and transactions will be a breeze.
If you go for a static class you will have to design it such that its largely stateless. The usual tactic is to create a base class with common data access functions and then derive them in specific classes for, say, loading Customers.
If object creation is actually the overhead in the entire operation, then you could also look at pooling pre-created objects. However, I highly doubt this is the case.
You might find that a lot of your common data access code could be made into static methods, but a static class for all data access seems like the design is lost somewhere.
Static classes don't have any issues with multi-threaded access per se, but obviously locks and static or shared state are problematic.
By making the class static, you would have a hard time unit testing it, as you would probably have to manage the reading of the connection string internally in an unclear manner, either by reading it inside the class from a configuration file or by requesting it from some class that manages these constants. I'd rather instantiate such a class in the traditional way:
var manager = new CompanyInfoManager(connectionString /* ...and possibly other dependencies too */);
and then assign it to a global/public static variable, if that makes sense for the class, i.e.
//this can be accessed globally
public static CompanyInfoManager Manager = manager;
That way you would not sacrifice any flexibility in your unit tests, since all of the class's dependencies are passed to it through its constructor.

getter , setter c#

I have various custom data types in my web application to map some data from the database.
Something like:
Person
Id
Name
Surname
and I need a list of persons in most of my application's pages.
I was thinking of creating a getter property that gets the list of persons from the database and stores it in the cache, so that I do not have to call the database each time.
Something like (pseudo code):
public List<Person> Persons
{
get { /* if the cache is populated, return the list of Persons from the cache; otherwise get it from the database */ }
}
Where shall I put this getter? In my Person class definition or in my base page (the page from which all the other pages inherit)?
Thanks
I think putting it in your base page would be better option.
Depending on your application architecture, putting process related code in your domain classes might be an issue. Some use it in DDD (domain-driven design) type applications though.
Better even, I usually try to hide those implementation details in a service class. You could have a PersonService class that would contain your above method and all person related operations. This way, any page requiring person information would simply call the PersonService; and you can concentrate your page code on GUI related code.
I don't think that you should put it in your Person class since it accesses the database and HttpContext.Current.Cache. Furthermore I think you should make it a method and not a property, to imply that this may be a "lengthy" operation. So, of the two options, I would put it on the base Page class.
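The two suggestions above (a service class, and a method rather than a property) can be sketched together as follows. This is only an illustration under stated assumptions: the PersonService name comes from the answer, but the GetPersons method, the injected loadFromDatabase delegate, and the plain in-memory field used in place of the ASP.NET cache are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Surname { get; set; }
}

public class PersonService
{
    private readonly Func<List<Person>> _loadFromDatabase;
    private readonly object _sync = new object();
    private List<Person> _cache; // stand-in for the real cache (assumption)

    public PersonService(Func<List<Person>> loadFromDatabase)
    {
        _loadFromDatabase = loadFromDatabase;
    }

    // A method rather than a property, to signal that the first call may be slow.
    public List<Person> GetPersons()
    {
        // The lock keeps concurrent requests from loading the list twice.
        lock (_sync)
        {
            if (_cache == null)
            {
                _cache = _loadFromDatabase();
            }
            return _cache;
        }
    }
}
```

Pages would call GetPersons() on a shared PersonService instance, so neither the Person class nor the base page needs to know about the database or the cache.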

Constructive criticism on this class

I've just reviewed some code that previously looked like this:
public class ProductChecker
{
// some std stuff
public ProductChecker(int AccountNumber)
{
var account = new AccountPersonalDetails(AccountNumber);
//Get some info from account and populate class fields
}
public bool ProductACriteriaPassed()
{
//return some criteria based on stuff in account class
//but now accessible in private fields
}
}
Some extra criteria have now been added which need data not in the AccountPersonalDetails class.
The new code looks like this:
public class ProductChecker
{
// some std stuff
public ProductChecker(int AccountNumber)
{
var account = new AccountPersonalDetails(AccountNumber);
var otherinfo = getOtherInfo(AccountNumber);
//Get some info from account and populate class fields
}
public bool ProductACriteriaPassed()
{
//return some criteria based on stuff in account class
// but now accessible in private fields and other info
}
public otherinfo getOtherInfo(int AccountNumber)
{
//DIRECT CALL TO DB TO GET OTHERINFO
}
}
I'm bothered by the db part but can people spell out to me why this is wrong? Or is it?
In a layered view of your system, it looks like ProductChecker belongs to the business rules / business logic layer(s), so it shouldn't be "contaminated" with either user interaction functionality (that belongs in the layer(s) above) or -- and that's germane to your case -- storage functionality (that belongs in the layer(s) below).
The "other info" should be encapsulated in its own class for the storage layers, and that class should be the one handling persist/retrieve functionality (just like I imagine AccountPersonalDetails is doing for its own stuff). Whether the "personal details" and "other info" are best kept as separate classes or joined into one I can't tell from the info presented, but the option should be critically considered and carefully weighed.
The rule of thumb of keeping layers separate may feel rigid at times, and it's often tempting to shortcut it to add a feature by mixing the layers -- but to keep your system maintainable and clean as it grows, I almost invariably argue for layer separation whenever such a design issue arises. In OOP terms, it speaks to "strong cohesion but weak coupling"; but in a sense it's more fundamental than OOP since it also applies to other programming paradigms, and mixes thereof!-)
It seems like the extra data grabbed in getOtherInfo should be encapsulated as part of the AccountPersonalDetails class, and thus already part of your account variable in the constructor when you create a new AccountPersonalDetails object. You pass in AccountNumber to both, so why not make AccountPersonalDetails gather all the info you need? Then you won't have to tack on extra stuff externally, as you're doing now.
It definitely looks like there might be something going haywire with the design of the class...but it's hard to tell without knowing the complete architecture of the application.
First of all, if the OtherInfo object pertains to the Account rather than the Product you're checking on...it's introducing responsibilities to your class that shouldn't be there.
Second of all, if you have a Data Access layer...then the ProductChecker class should be using the Data Access layer to retrieve data from the database rather than making direct calls in to retrieve the data it needs.
Third of all, I'm not sure that the GetOtherInfo method needs to be public. It looks like something that should only be used internally by your class (if, in fact, it actually belongs there to begin with). In that case, you also shouldn't need to pass around the accountId (your class should hold that somewhere already).
But...if OtherInfo pertains to the Product you're checking on AND you have no real Data Access layer then I can see how this might be a valid design.
Still, I'm on your side. I don't like it.
Considering that an accountNumber was passed into the constructor, you shouldn't have to pass it to another method like that.
A few points
The parameter names are Pascal case instead of camel case (this may be a mistake)
getOtherInfo() looks like it's a responsibility of AccountPersonalDetails and so should be in that class
You may want to use a Façade class or Repository pattern to retrieve your AccountPersonalDetails instead of using a constructor
getOtherInfo() may also be relevant for this refactor, so the database logic isn't embedded inside the domain object, but in a service class (the Façade/Repository)
ProductACriteriaPassed() is in the right place
I would recommend this:
public class AccountPersonalDetails
{
public OtherInfo OtherInfo { get; private set; }
}
public class ProductChecker
{
public ProductChecker(AccountPersonalDetails account) {}
}
// and here's the important piece
public class EitherServiceOrRepository
{
public static AccountPersonalDetails GetAccountDetailsByNumber(int accountNumber)
{
// access db here
}
// you may also want a bit more convenience via helpers
// this may be inside ProductCheckerService, though
public static ProductChecker GetProductChecker(int accountNumber)
{
return new ProductChecker(GetAccountDetailsByNumber(accountNumber));
}
}
I'm not an expert in Domain-Driven Design, but I believe this is what DDD is about. You keep your logic clean of DB concerns by moving them to external services/repositories. I'd be glad if somebody corrects me if I'm wrong.
What's good: it looks like you have a ProductChecker with a nice clear purpose, namely to check products. You'd refactor or alter this because you have a need to. If you don't need to, you wouldn't. Here's what I would probably do.
It "feels" clunky to create a new instance of the class for each account number. A constructor argument should be something required for the class to behave correctly. The account number is a parameter of the operation, not a dependency of the class. Passing it to the constructor leads to the temptation to do a lot of work in the constructor. Usage of the class should look like this:
result = new ProductChecker().ProductACriteriaPassed(accountNumber)
Which I'd quickly rename to indicate it does work.
result = new ProductChecker().PassesProductACriteria(accountNumber)
A few others have mentioned that you may want to split out the database logic. You'd want to do this if you want unit tests that are fast. Most programs want unit tests (unless you are just playing around), and they are nicer if they are fast. They are fast when you can get the database out of the way.
Let's make a dummy object representing the results of the database, and pass it to a method that determines whether the product passes. If not for testability, this method would be private. Testability wins. Suppose I want to verify a rule such as "the product must be green if the account number is prime." This approach to unit testing works great without fancy infrastructure.
// Maybe this is just a number of items.
DataRequiredToEvaluateProduct data = // Fill in data
// Yes, the next method call could be static.
result = new ProductChecker().CheckCriteria(accountNumber, data)
// Assert result
Now we need to connect the database. The database is a dependency: it's required for the class to behave correctly. It should be provided in the constructor.
public class ProductRepository {} // Define data access here.
// Use the ProductChecker as follows.
result = new ProductChecker(new ProductRepository()).CheckCriteria(accountNumber)
If the constructor gets annoyingly lengthy (it probably has to read a config file to find the database), create a factory to sort it out for you.
result = new ProductCheckerFactory().GimmeProductChecker().CheckCriteria(accountNumber)
So far, I haven't used any infrastructure code. Typically, we'd make the above easier and prettier with mocks and dependency injection (I use Rhino Mocks and Autofac). I won't go into that. It is only easier if you already have it in place.
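The steps above can be sketched without any mocking library by hiding the repository behind an interface and supplying a hand-written fake in tests. Everything here beyond the ProductChecker, CheckCriteria and DataRequiredToEvaluateProduct names is an assumption: the IProductRepository interface, the ItemCount field, and the placeholder rule are hypothetical and only show the shape of the design.

```csharp
// Hypothetical data holder; the real fields depend on the criteria being checked.
public class DataRequiredToEvaluateProduct
{
    public int ItemCount { get; set; }
}

// Hypothetical abstraction over the data access code.
public interface IProductRepository
{
    DataRequiredToEvaluateProduct GetData(int accountNumber);
}

public class ProductChecker
{
    private readonly IProductRepository _repository;

    // The repository is a dependency, so it is supplied through the constructor.
    public ProductChecker(IProductRepository repository)
    {
        _repository = repository;
    }

    // Production entry point: fetch the data, then apply the rule.
    public bool CheckCriteria(int accountNumber)
    {
        return CheckCriteria(accountNumber, _repository.GetData(accountNumber));
    }

    // The rule itself needs no database, so it can be tested directly.
    public bool CheckCriteria(int accountNumber, DataRequiredToEvaluateProduct data)
    {
        return data.ItemCount > 0; // placeholder rule (assumption)
    }
}

// Hand-written fake that returns known data instead of querying a database.
public class FakeProductRepository : IProductRepository
{
    private readonly DataRequiredToEvaluateProduct _data;

    public FakeProductRepository(DataRequiredToEvaluateProduct data)
    {
        _data = data;
    }

    public DataRequiredToEvaluateProduct GetData(int accountNumber)
    {
        return _data;
    }
}
```

A unit test constructs ProductChecker with a FakeProductRepository holding known data, so it runs quickly and never touches the database; production code passes the real repository instead.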

Lazy Loading in my C# app

I have a method in my DAL returning a list of Customers:
Collection<Customer> FindAllCustomers();
The Customer has these columns: ID, Name, Address, Bio
I need to show them in a paged grid in my ASPX form (show-customers.aspx) where I'll be showing only these columns: ID, Name
Now, in my DAL FindAllCustomers(), do I return the Bio field too from the SP (I am filling in the collection using a reader)? The Bio field can be large (nvarchar(max)). I was thinking of lazy loading or loading only the required fields. But then in that case I would need to create another method which returns a "full" list of customers including the bio so that 3rd party apps can use it through a service layer. So is it ok to create a method like this:
Collection<Customer> FindAllCustomers(bool loadPartial);
If loadPartial = true, then do not load Bio, else load it. In this case since I do not want to return the Bio from the SP, I would need to create 2 select statements in my SP based on the bool value.
I think using lazy loading here will not work, because then the DAL method can be accessed by a 3rd party app, which might want to load the bio too.
Any suggestions on the best pattern to implement in such cases?
thanks,
Vikas
The 3rd party thing is a bind.
At first blush I would normally suggest that you load only the minimal data normally and then load complete or further detail on an as-requested (i.e. touching a property could trigger a DB call - mild abuse of property there perhaps) or background process basis, depending on the nature of what you're doing.
Lazy property code by way of clarification:
class Customer
{
private string _lazydata = null;
public string LazyData
{
get
{
if (this._lazydata==null)
{
LazyPopulate();
}
return this._lazydata;
}
}
private void LazyPopulate()
{
/* fetch data and set lazy fields */
}
}
Be careful with this, you don't want to make lots of DB calls but nor do you want to be creating a bottleneck whenever you look at something lazy. Only the nature of your app can decide if this is suitable.
I think you have a valid case for creating the boolean flag method (though I would default to lightweight version) on the grounds that it's very likely a 3rd party would want the lightweight version for the same reasons you do.
I'd go with:
Collection<Customer> FindAllCustomers()
{
return this.FindAllCustomers(false);
}
Collection<Customer> FindAllCustomers(bool alldata)
{
/* do work */
}
In that case I would use an enum to be even more clear about the meaning of the parameter.
public enum RetrieveCustomerInfo
{
WithBio,
WithoutBio
}
And when you call the method:
dao.FindAllCustomers(RetrieveCustomerInfo.WithBio);
I don't know if it's better but I think it's more clear.
I think it's fine to have two methods optimised for specific cases so long as you make it clear in your method names. Personally I don't think:
Collection<Customer> FindAllCustomers(bool loadPartial);
is very clear. How would a developer know what that boolean param actually meant in practice? See the question "Boolean parameters - do they smell?".
Make it clear and all is well.
If your list of customers never shows the bio, then having a streamlined version would be fine.
A few questions...
Does that parameter only determine whether the Bio is loaded? In the future, will you have other fields that don't load when it's set to true?
What happens if I try to access Bio if loadPartial was set to true?
The key will be to ensure that whatever mechanism you choose is resilient to change. Put yourself in the perspective of a 3rd party and try to make your methods do what you'd expect. You don't want the developers who use your classes to have to understand the mechanisms deeply. So maybe instead of a "loadPartial" parameter, you simply have a different method which resolves the fast, minimal data needed to bind to a list and lazy loads the other fields as needed.
Why not use lazy loading in the properties of the Customer Class itself? Give each property (Id, Name,Bio), a private variable. In the getter of each property, if the private variable is not null then return it, otherwise read it in from your DAL.
When it comes to Bio, if you have to lazy load it, then in your getter you call another method in the Customer Class called LazyLoadAdditionalDetails() and call the appropriate sprocs there, then return your private variable.
This way you can keep your code as normal, and your paging view will only call the getters of the ID and Name, and your Bio will be populated from a sproc only when needed without you having to remember to call a lazy loading method.
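A sketch of that idea for the Bio property only, assuming the Customer receives a hypothetical loadBio delegate that stands in for the stored procedure call. It uses Lazy<string> instead of a manual null check, so a customer whose bio is genuinely null is still loaded only once.

```csharp
using System;

public class Customer
{
    private readonly Lazy<string> _bio;

    // loadBio is a hypothetical stand-in for the DAL call that fetches the bio by ID.
    public Customer(int id, string name, Func<int, string> loadBio)
    {
        Id = id;
        Name = name;
        _bio = new Lazy<string>(() => loadBio(id));
    }

    // Cheap fields, filled in by the list query.
    public int Id { get; private set; }
    public string Name { get; private set; }

    // Expensive field: fetched on first access, then remembered.
    public string Bio
    {
        get { return _bio.Value; }
    }
}
```

The paged grid reads only Id and Name, so no bio is ever fetched for it. Note the trade-off mentioned earlier: code that reads Bio for every customer in a list would trigger one database call per customer.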
I think that is acceptable... It is your program. You'd just want to ensure the API is documented and makes sense to others.
As a side note, you don't necessarily need two stored procedures or a classic if statement in your stored procedure.
You can use a case statement to NULL out the field when loadPartial == true.
CASE WHEN @loadPartial = 1 THEN NULL ELSE [bio] END
class CustomerDAO
{
private bool _LoadPartial = true;
public bool LoadPartial
{
get
{
return _LoadPartial;
}
set
{
_LoadPartial = value;
}
}
public Collection<Customer> FindAllCustomers()
{
...
}
}
Would be another option, although I also like annakata's one.
Instead of
Collection<Customer> FindAllCustomers(bool loadPartial);
I'd make it
Collection<Customer> FindAllCustomers(bool includeBio);
"loadPartial" doesn't tell the consumer what constitutes a "partial" customer. I also agree with annakata's points.
