We've got a business logic/data access layer that we're exposing on a couple of different endpoints via a WCF service. We've created DTOs for use as the data contract of the service. We'll be using the service via the different endpoints for multiple different applications. In some of the applications, we only need a few fields from the DTO while in others we may need almost all of them. For those in which we only need a few, we really don't want to be sending the entire object "over the wire" every time - we'd like to pare it down to what we actually need for a given application.
I've gone back and forth between creating specific sets of DTOs for use with each application (overkill?) and using something like EmitDefaultValue=false on the members that are only needed in certain apps. I've also considered using the XmlSerializer rather than DataContractSerializer in order to have greater control over the serialization within the service.
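For illustration, this is roughly what I mean by the EmitDefaultValue approach (the DTO below is a made-up example, not our real contract):

using System.Runtime.Serialization;

[DataContract]
public class CustomerDto
{
    [DataMember]
    public int Id { get; set; }

    [DataMember]
    public string Name { get; set; }

    // Only needed by some of the applications; when left at its default (null)
    // it is omitted from the serialized message entirely.
    [DataMember(EmitDefaultValue = false)]
    public string BillingNotes { get; set; }
}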
My question is - first off, should we worry that much about the size of data we're passing? Second, assuming the answer is 'yes' or that we decide to care about it even if it is 'no', what approach is recommended here and why?
EDIT
Thanks for the responses so far. I was concerned we might be getting into premature optimizations. I'd like to leave the question open for now, however, in hopes that I can get some answers to the rest of it, both for my own edification and in case anybody else has this question and has a valid reason to need to optimize.
first off, should we worry that much about the size of data we're passing?
You didn't give the number or sizes of the fields, but in general: no. You've already got the envelope(s) and the overhead of setting up the channel; a few more bytes won't matter much.
So unless we're talking about hundreds of doubles or something similar, I would first wait and if there's a real problem: experiment and measure.
Should you worry? Maybe. Performance/stress test your services and find out.
If you decide you do care...a couple options:
Create a different service (or maybe different operations in the same service) that returns partially hydrated DataContracts. These new services and/or operations return the same DataContract types, but with only the needed fields populated.
Create "lite" versions of your DataContracts and return those instead. Basically the same as option 1, but with this approach you don't have to worry about consumers misusing the full DataContract (potentially getting null reference exceptions and such).
I prefer option 2, but if you have control over your consumers, option 1 might work for you.
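For example, option 2 might look something like this (the type and member names are invented for illustration):

using System.Runtime.Serialization;
using System.ServiceModel;

// Full contract for consumers that need everything.
[DataContract]
public class OrderDto
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string CustomerName { get; set; }
    [DataMember] public string[] LineDescriptions { get; set; }
    [DataMember] public string ShippingNotes { get; set; }
}

// "Lite" contract for consumers that only need a summary.
[DataContract]
public class OrderLiteDto
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string CustomerName { get; set; }
}

[ServiceContract]
public interface IOrderService
{
    [OperationContract]
    OrderDto GetOrder(int id);

    [OperationContract]
    OrderLiteDto GetOrderSummary(int id);
}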
It seems you may be entering the "premature optimization" zone. I'd avoid using application-specific DataContracts for an entity because of the maintenance problems it will cause in the long run. However, if your application has a valid need to hide information from some client applications and not others, then it is reasonable to have multiple DataContracts for a given entity. @Henk is right: unless you're dealing with massive, deeply nested entities (in which case you have a different problem), do not "optimize" your design simply to reduce network transmission packets.
Please explain in detail. I have been telling my mentor that everything I've researched on WCF, and many programmers all over the net, say it is NOT good to pass DataSets to the service. Why is that? I created a BUNCH of classes in the service and they work great with the application, but he says that I just wasted time doing all that work, and that he has a better way of doing it.
He keeps telling me to create a SINGLE OperationContract. There will be many functions in the service, but the OperationContract will take the string name of the function and the dataset providing the details for that function.
Is his way bad practice? Not safe? I'm just trying to understand why many people say don't use datasets.
The first reason is interoperability. If you expect consumers of your service to be implemented in any other technologies other than .NET, they may have lots of trouble extracting or generating the data in the DataSet, as they will have no equivalent data structure on their end.
Performance can be affected quite a bit, as well. In particular, the serialization format for untyped datasets can be huge because it will contain not just the data, but also the XSD schema for the data set, which can be quite large depending on the complexity of the DataSet. This can make your messages a lot larger, which will use more network bandwidth, take longer to transfer (particularly over high latency links), and will take more resources at the endpoint to parse.
So the web service you have does something specific; let's say it sends a bunch of emails. Let's say this service has one method that sends an email. The method should accept an email address, a subject, and a body.
Now, if we send a DataSet with the required information, the service would have to know the shape of the data and parse it.
Alternatively, if the web service accepted an object with properties for email address, subject, and body, it could be used in more than one place and is less prone to going wrong due to a malformed DataSet.
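In code, the difference looks roughly like this (the types are just for illustration):

using System.Data;
using System.Runtime.Serialization;
using System.ServiceModel;

// With a DataSet, the shape of the data is implicit; the service has to know
// which tables and columns to expect and parse them at runtime.
[ServiceContract]
public interface IEmailServiceWithDataSet
{
    [OperationContract]
    void SendEmail(DataSet data);
}

// With an explicit contract, the shape is part of the service definition.
[DataContract]
public class EmailRequest
{
    [DataMember] public string To { get; set; }
    [DataMember] public string Subject { get; set; }
    [DataMember] public string Body { get; set; }
}

[ServiceContract]
public interface IEmailService
{
    [OperationContract]
    void SendEmail(EmailRequest request);
}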
One more thing: you can get incorrect data using DataSet.
For example a value in the DataSet might look like the following before serialization:
<date_time>12:19:38</date_time>
In the client it would come with a offset specified:
<date_time>12:19:38.0000000-04:00</date_time>
The client code would adjust this to its local time (much like Outlook when you schedule an appointment with someone in a different timezone).
More details can be found here.
Using WCF is not just an implementation decision - it is a design choice. When you choose to use WCF you have to leave many of your treasured OO principles behind and embrace a new set of patterns and principles associated with service orientation.
One such principle is that of explicit contracts: A service should have well defined public contracts (see this Wikipedia article). This is crucial for interoperability, but is also important so clients have an accurate picture of what functionality your service provides.
A DataSet is basically just a big bag of "stuff" - there is no limitation to what it could contain - or any well defined contract that explains how I can get data out. By using a DataSet you introduce inherent coupling between the client and the server - the client has to have "inside information" about how the DataSet was created in order to get the data out. By introducing this level of coupling between the client and service you have just negated one of the main motivations for using WCF (precisely that of decoupling the two areas of functionality to allow for independent deployment and/or development lifecycle).
I am looking to build a small application to talk with a ruby msgpack server in C#. My only holdup so far is that the API behind the server is expecting to pull out a ruby hash. Can I use a simple dictionary/key-value pair type in C#? If not, what would you suggest?
I will be using the library mentioned on the msgpack website (http://wiki.msgpack.org/display/MSGPACK/QuickStart+for+C+Sharp). However, it only seems to support primitive types? I have tried to go the IronRuby way, however there is a very crippling bug in mono that prevents you from using it. https://bugzilla.xamarin.com/show_bug.cgi?id=2770
It is normal that different parts of a system are built using different technology stacks. Because these parts should be able to talk to each other (one way or another), it is important to specify contracts between subsystems.
It is really important to think about these contracts first, as these parts of your system (subsystems) can be (and will be, no doubt) subject to change (due to evolving business logic, bug fixes, etc.).
By having these contracts you allow subsystems to be changed independently without impacting all their "clients" (other subsystems). Otherwise you will end up with the "I need to fix this, but it may affect tons of places I don't even know about" syndrome.
Well, as long as you honor the contract you can do whatever you want within the given subsystem, which is just heaven! :)
This means that instead of "pulling out the ruby hash" you normally want to define a platform-agnostic contract that is expressed in terms of the business logic of your application. This contract can then be consumed by any other subsystem written in any technology.
It also means that instead of just passing some data between subsystems you want to pass objects. These objects not only contain the data you want to pass, but also describe this data, give it some meaning. By this "description" I mean the object type, property names, etc. Objects are self-descriptive, you know.
You may declare the contract for your ruby subsystem saying "I accept these queries and I return these results". Both query (method) and result (object) should be formulated in terms of business logic of the specified subsystem. For example, GetProducts contract should probably return a list of Product objects, not some meaningless "ruby hashes". So all the consumers will know what the contract is and what to expect.
You can make it a standard then, saying "between subsystems all the objects passed are serialized to JSON (or XML)", which is more than trivial in Ruby, C# or any other language, as well as truly platform-agnostic.
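For example, on the C# side a contract object can be serialized with any JSON library (the Product type below is invented, and Json.NET is used only as an example serializer); the Ruby side can then read it with JSON.parse into a plain hash:

using Newtonsoft.Json; // Json.NET, used here only as an example

// A contract object expressed in business terms rather than a raw hash.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class ProductSerializationExample
{
    public static string ToJson()
    {
        var product = new Product { Id = 1, Name = "Widget", Price = 9.99m };
        // Produces {"Id":1,"Name":"Widget","Price":9.99}
        return JsonConvert.SerializeObject(product);
    }
}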
Therefore, back to your question: you normally just don't have such problems in your life as translating ruby types into .NET types using some buggy libraries, or doing similar crazy things :)
Simply defining contracts and standardizing transport (JSON?) helps you in many ways starting from getting rid of this problem and all the way through to having the clean and easily maintainable system.
WCF beginner's question: I've been told that changing the WCF contract is costly and requires constant maintenance (recreating the proxy on the client side), and that the preferred method is therefore having one very generic point-of-contact method (which decides how to act, say, according to a given enum parameter).
This sounds quite smelly to me, but I haven't been able to find any information about this issue (bad choice of search keywords? probably).
Any advice, or maybe a useful link?
Thanks!
You don't need to generate the proxy again, you can simply ensure the client is built with the correct interface version. If you're very careful and only add methods, not remove or modify, that works just fine too. That's a lot of responsibility to manage, of course.
To use an interface rather than generate a client proxy, check my question from a while ago:
WCF Service Reference generates its own contract interface, won't reuse mine
You are confusing some terms here and I think you might be referring to a known flaw which has been fixed in .Net 3.5 SP1.
Recreating the WCF proxy used to be an expensive operation at runtime. This was improved in .Net 3.5, which caches the proxy objects transparently (MSDN Blog).
If you are referring to the "code maintenance" of the proxy, then all you are really talking about is implementing an interface at the client. If you need to maintain the interface, this comes back to basic SOA: if your services expose as much information as possible, on the assumption that they will be used for purposes you haven't yet considered, then you will likely not need to modify the interface after it is created. You should also consider your upgrade paths.
Juval Lowy has a good discussion about this problem in his book which is a little dense but has some pretty good information in it.
A piece of advice: WCF has a whole lot of features designed to make your code really simple and elegant. If you are worried about maintenance, what you may be driven to do is write an interface like:
string ServiceMethod(string xml) //returns XML
Don't do this. Take the time to design a good maintainable interface and a good data/message contract. This will let WCF provide all the extras you get for free when hosting your service for interaction.
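For example, rather than the string-in/string-out method above, an explicit contract tells clients exactly what the service does (the order service below is purely illustrative):

using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class PlaceOrderRequest
{
    [DataMember] public int CustomerId { get; set; }
    [DataMember] public int ProductId { get; set; }
    [DataMember] public int Quantity { get; set; }
}

[DataContract]
public class PlaceOrderResult
{
    [DataMember] public int OrderId { get; set; }
    [DataMember] public bool Accepted { get; set; }
}

[ServiceContract]
public interface IOrderService
{
    // Each operation is explicit, discoverable from the WSDL, and versionable.
    [OperationContract]
    PlaceOrderResult PlaceOrder(PlaceOrderRequest request);

    [OperationContract]
    PlaceOrderResult CancelOrder(int orderId);
}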
Generic (as in non-specific, monolithic) interfaces are hard to understand and program to. The reason not to define a single method as the API is that it's impossible for clients to understand what's going on, and when you change the (implicit) API of this interface, your clients will break in horrible ways that you won't detect at compile time.
It's been a while since I touched WCF, but if your clients are internal (same codebase, versioning and deployment schemes), then regenerating the WCF proxies is very easy, and having a "strong" detailed API will make your life so much easier than a generic one.
It depends on what kind of change you mean. Change to the service contract is indeed costly and should not happen. Service contracts are (or should be) at a sufficiently high level of granularity that change is very rare.
More common are changes to the types which are exposed on the service. These changes are more common and therefore you do need to approach your change in such a way as to avoid breaking existing clients if possible.
There are several ways you can do this, such as exposing your types polymorphically using an interface, but the simplest way is to ensure that changes to your types only add new data member fields, and that the new fields are non-mandatory. If you can limit your changes to these, then this has the lowest impact on existing clients and enables new clients to use the new fields.
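For example, a change like this (illustrative type) only adds an optional field, so clients built against the old contract keep working:

using System.Runtime.Serialization;

[DataContract]
public class CustomerData
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }

    // Added in a later version. Because it is not mandatory, old clients that
    // never send or expect this field still serialize and deserialize fine.
    [DataMember(IsRequired = false)]
    public string LoyaltyTier { get; set; }
}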
Hope this helps.
It is true that modifying the service contract (interface) would also require the client to recreate the proxy class on their end using the newly published WSDL, and may even require the client to change their code as per the new proxy. I don't think you can create such a generic interface that it can handle all changes further down the road in the contract. A contract has to be written very carefully so that it doesn't change often, and if there is a need to change the contract then it is better to deploy the service with a different version so that your old clients can still work with the old version.
Where should I place the Validation logic of the Domain objects in my solution? Should I put them in the Domain classes, Business layer or else?
I would also like to make use of Validation Application Block and Policy Injection Application Block from Microsoft Enterprise Library for this.
What validation strategy should I be using to fit all these together nicely?
Thanks all in advance!
It depends. First - You need to understand what You are validating.
You might validate:
that a value You retrieve from an HTTP post can be parsed as a DateTime,
that Customer.Name is not longer than 100 symbols,
that a Customer has enough money to purchase stuff.
As You can see - these validations are different in nature, so they should be separated. Importance of them varies too (see "All rules aren’t created equal" paragraph).
One thing You might want to consider is not allowing a domain object to be in an invalid state.
That will greatly reduce complexity, because at any given moment You know the object is valid and You only need to validate the things related to the current task in order to advance.
Also, You should consider avoiding usage of tools in Your domain model because it should be infrastructure free as much as possible.
Another thing - embrace usage of value objects. Those are great for validation encapsulation.
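For example, a small value object can refuse to ever be in an invalid state (a made-up example, tying back to the Customer.Name rule above):

using System;

// A value object that encapsulates its own validation, so the rest of the
// domain never sees an invalid customer name.
public sealed class CustomerName
{
    public string Value { get; }

    public CustomerName(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
            throw new ArgumentException("Customer name is required.", nameof(value));
        if (value.Length > 100)
            throw new ArgumentException("Customer name cannot exceed 100 symbols.", nameof(value));
        Value = value;
    }

    public override string ToString() => Value;
}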
You can do either, depending on your needs.
Putting it in domain classes makes sure the validation is always done, but can make the classes bloated. It also can go against the single responsibility principle depending on how you interpret that (it adds the responsibility to validate). Putting it in domain classes also restricts you to one kind of validation. Also, unless you use inheritance, the same rule might have to be implemented multiple times in related classes (DRY). Validation is spread out through your domain if you do it this way.
External validation (you can get a validation object through DI, factories, the business layer, or context) makes sure you can swap out the validation rules depending on context (e.g. for a long-running process you want to save in a partially finished state, you could have one validation object just to be able to save, and another to check whether the domain class is really valid and ready to be used). Your domain classes will be simpler (fewer responsibilities, though you'd still have to do minimal checks, like null checks, to prevent runtime errors), and you could reuse rule sets for related classes as well. Validation is centred in a small area of your domain model this way. By the way, you can inject the external validation into the domain class itself, making sure the classes do validate themselves; they just don't know which rules they are validating against.
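A rough sketch of what such context-specific, swappable validators could look like (all types here are invented for illustration):

using System.Collections.Generic;

public interface IValidator<T>
{
    IEnumerable<string> Validate(T instance);
}

// Minimal rules: just enough to persist a draft.
public class DraftOrderValidator : IValidator<Order>
{
    public IEnumerable<string> Validate(Order order)
    {
        if (order.CustomerId <= 0)
            yield return "An order must belong to a customer.";
    }
}

// Full rules: the order must be complete before it can be submitted.
public class SubmitOrderValidator : IValidator<Order>
{
    public IEnumerable<string> Validate(Order order)
    {
        if (order.CustomerId <= 0)
            yield return "An order must belong to a customer.";
        if (order.Lines == null || order.Lines.Count == 0)
            yield return "An order needs at least one line before it can be submitted.";
    }
}

public class Order
{
    public int CustomerId { get; set; }
    public List<OrderLine> Lines { get; set; } = new List<OrderLine>();
}

public class OrderLine { }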
Can't comment on the Validation Application Block though. As always, you have to weigh the pros versus the cons; there is never one single valid solution.
First off, I agree with @i8abug.
But I did want to go a bit further to talk about architecture. Every one of those design architectures, like domain-driven design, should be taken as nothing more than a suggestion and viewed with scrutiny.
At every step you should ask yourself what the benefit and drawbacks of the point in question is with regards to your application.
A lot of these involve adding a tremendous amount of code and seriously complicating projects with very little benefit.
The validation point is a prime example. As Stefan said, the principle of single responsibility basically says you need to create a whole set of other classes whose purpose is to only validate the state of the original objects. Obviously this adds a LOT of code to the app. Maybe it's generated for you, maybe you have to hand write it. Regardless, more code generally equates to being less robust and certainly equates to being harder to understand.
The benefit of separating all of that is that you can swap out validation rules. Ok, fine. The drawback is that you now have 2 files to look at and create for each class definition. ie: more work. Does your app need to swap out validation rules? Probably not. I'd even wager to say very very few do.
Quite frankly, if you go down this path then you may as well define everything as a struct and let all of those "helper" classes creep back to take care of validation, persistence, setting properties, etc as being a full blown class buys you almost nothing.
All of that said, I tend towards self contained classes. In other words they know how their properties relate to each other and know what are acceptable values. They can also perform operations on themselves and their children. In other words, they know what they are. This tends to lead to simplified coding and implementation. It also leads to knowing exactly where to go for a modification or change. The only separation I really do here is to implement Inversion of Control for persistence; which allows me to swap out data providers at runtime; which has been a requirement on several applications I've done.
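As a rough illustration of that persistence split (names invented), the entity stays self-contained but the data provider behind it can be swapped:

// Illustrative only: the entity persists itself through an injected store,
// so the concrete data provider can be swapped at runtime.
public interface ICustomerStore
{
    void Save(Customer customer);
    Customer Load(int id);
}

public class Customer
{
    private readonly ICustomerStore _store;

    public Customer(ICustomerStore store) { _store = store; }

    public int Id { get; set; }
    public string Name { get; set; }

    public void Save()
    {
        // The class knows what it is and what "valid" looks like...
        if (string.IsNullOrWhiteSpace(Name))
            throw new System.InvalidOperationException("Name is required.");
        // ...but delegates how it is persisted.
        _store.Save(this);
    }
}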
Point is, think through what you are doing and decide if it's really the best way to go in your particular situation. All of these programming "rules" are just suggestions after all.
I generally put it in the domain objects. This is because the domain objects are the things that I am concerned about validating so if a rule for a specific object changes, I know where to update it rather than having to search through a bunch of unrelated entity rules in some specific validation class/file.
I realize this may not be considered POCO but every project has specific exceptions and this one often makes sense to me. Likewise, in some projects it makes sense to have your domain entities referenced from the views and, therefore, implement IPropertyChanged rather than constantly copying values from entities to a whole other set of view specific objects.
The old way I did validation was I had an IValidator interface like below which each entity implemented.
public interface IValidator
{
IList<RuleViolation> GetViolations();
}
Now I do this using NHibernate Validation (you don't need to use the NHibernate ORM to take advantage of the validation library). It is done simply through attributes.
//I can't remember the exact syntax, but it is very similar to this
public class MyEntity
{
    [Length(Min = 1, Max = 10)]
    public string Name { get; set; }
}

//... and then later ...
var engine = new ValidatorEngine();
engine.Validate(myEntity);
Edit: I removed my comment about not being a huge fan of enterprise library in general in the past since Chris informed me that it is now very similar to NHibernate Validation
PREAMBLE:
This is by far the longest post I've left here...but I think it's required in this case.
I've had questions about these kinds of things for a long time: how to name assemblies, and how to divide up classes within them.
I'd like to give an example of an application here, with only a bare minimum of classes to demonstrate what I'm trying to understand.
Imagine an application that
Accepts client messages, stores them in a db, and then later dequeues them to an MTA server.
It's a Web application that has both an ASP.NET interface to write a message + attach attachments.
There's also a Silverlight client, so the webapp exposes a ClientServices WCF ServiceContract, with one OperationContract (SaveMessage).
There's also a Windows client...it does the same thing as the Silverlight client, via the same contract.
OK. that should be enough of a fake scenario to demonstrate my cluelessness.
The above will need the following classes:
Message
MessageAddress
MessageAddressType (an enum with From, To)
MessageAddressCollection
MessageAttachment
MessageAttachmentType
MessageAttachmentCollection
MessageException
MessageAddressFormatException
MessageExtensions (static extension for Message)
MessageAddressExtensions (static extension for MessageAddress)
MessageAttachmentExtensions (static extension for MessageAttachment)
Project.Contract.dll
My first stab at organizing the above into the right assemblies would be observing that Message, MessageAddress, MessageAttachment, the enums needed for its properties (MessageAddressType, MessageAttachmentType) and the collections needed for them(MessageAddressCollection, MessageAttachmentCollection), are all to be marked as [DataContract] so that they can be serialized between the WCF client and the server.
Being common to both, I think I would move them into a neutral shared assembly called Contract.
Project.Client.dll
I'll need a Client proxy of the server [ServiceContract], that refs the classes in the Contract.dll.
So now the server, which also refs Project.Contract.dll could now save serialized Messages received from a WCF Client, and save them into a db.
Plugins
Next I would realize that I would like to have these objects be processed server-side by 3rd-party plugins (e.g. a virus checker)...
But plugins should have readonly access (only) to the variables in order to check the variables, and throw errors if they see something they don't like.
So I would think about going back and having Message implement an IMessageReadOnly interface...but where to put that interface?
Project.Interfaces.dll
If I put it in an assembly called Project.Interfaces.dll, this would work for the plugins who could reference that without having a reference to Contracts.dll...but now the client has to reference both Contracts assembly AND Interfaces...doesn't sound like a good direction...
Duplicate Objects
Alternatively, I could have two Message structures (and duplicate the other MessageAttachment, etc. classes as well)...one for communicating from client to server (in the Contracts.dll), and then a second ServerMessage/ServerMessageAddress/ServerMessageAddressCollection on the server side, which implements IMessageReadOnly, and then it would appear that I am closer to what I want.
With duplicate objects, plugins are limited in access, while Server BL, etc. has full access for types relevant to its work, all while the client has different but identical objects...
In fact...I should probably start considering them as non-identical, making it clearer in my head that the objects are just there to talk to clients (i.e. Contract/Comm objects)...
The Website UI
Which brings up...hmm...if there are two different Messages, and they now have different properties...which one is most appropriate for backing the ASP.NET forms? The ServerMessage object seems fastest (no mapping going on between types)...but all the logic has already been worked out against the client message objects (with different properties and internal logic). So would I use a ClientMessage, and map it to a ServerMessage, to keep the various UI logics the same across different mediums? Or should I prefer mapping and just rewrite the UI validation?
What about the third case, Silverlight... The Contracts assembly was a full-framework assembly...which Silverlight can't reference (different framework/build mechanism)...so the assembly that I have on the Silverlight side might contain exactly the same code, but it has to be a different assembly. How does that work out?
What exactly to Consider as DataContract?
Finally...and this is, I swear, near the end of my huge question...what about the pesky extra classes that are not clearly DataContract?
For example, the MessageAddress was a DataContract. OK. And the enums it exposes are part of it...makes sense... But if the MessageAddress constructor raises a MessageAddressFormatException...is that exception considered part of the DataContract?
Can there be Classes common to both Server, Client, AND Plugins?
Or is it an exception that is common to BOTH ServerMessageAddress and ClientMessageAddress, so it should not be duplicated, and should instead live in a Common assembly...so that in the end, the client has to bind to Contracts AND Common? (Didn't we just go down this alley with the Interfaces assembly?)
What about common Base classes/Interfaces?
And should these exceptions have common base classes? For example...ClientMessageAddressException, ServerMessageAddressException, ServerMessageVirusException (from a plugin)...should I struggle to get them -- as best as possible -- to all derive from an abstract MessageException...or is there a point at which inheritance/reuse is just no longer an appropriate goal to strive for?
HUGE THANKS FOR READING THIS FAR.
I'm a developer, and on the tech side I can bumble along OK...but these kinds of questions, where I've had to lay out the assemblies and the architecture myself, leave me hugely perplexed...and lose me SOOOO much time, as I drive myself batty moving things around from one assembly to another to see which one is the best fit, all while not really being certain of what I am doing, and trying not to create circular references...
So -- really -- thanks for listening, and I hope this gets read by people who can describe how to lay out the above cleanly, hopefully expressing how to think my way through it for future projects as well.
After spending 10 minutes editing the question for formatting, I'm still going to downvote it. There's no way I'm going to read all that.
Go pick up a copy of
Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries (2nd Edition)
As an architect, I've learned that it doesn't pay to get too wrapped up in getting things absolutely perfect the first time, and perfect is subjective. Refactoring, especially moving classes between assemblies, doesn't have too huge a cost. It sounds to me like you're already thinking things through logically and correctly. Here's my opinions on a few of your questions:
Q: Should I have read-only contracts for my data contract classes?
The plugins most likely shouldn't be aware of your data contracts at all. A virus checker may take a byte array, a spell checker a string and a locale, etc. If you're making a general interface layer for the plugins, you should isolate what's shared down to the data specific to each plugin. This will let you maximize their reuse. Thus, I think you'll get little payoff from creating interfaces to your data contract structures, which should mostly be dumb bags of data with little logic that are practically interfaces themselves.
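For instance, the plugin-facing interfaces can be narrowed to just the data each plugin needs (these interfaces are invented for illustration):

// Plugins see only the data relevant to them, never the full Message contract.
public interface IAttachmentScanner
{
    // Returns a human-readable problem description, or null if the content is clean.
    string Scan(byte[] attachmentContent, string fileName);
}

public interface ISpellChecker
{
    string[] FindMisspelledWords(string text, string locale);
}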
Q: Should I use the same data contract classes as my Silverlight app does in my ASP.NET application or use server-side classes directly?
I would go with the client message objects so you can benefit from code reuse. Object creation is fairly cheap, and I'm sure that most of the mapping would be one-to-one. It's not as fast, true, but that won't be the bottleneck in your application.
Q: Where do I put my exception classes?
I would put your example exception classes in the assembly with the data contract, since they are all raised due to contract violations or as a means to communicate errors while fulfilling the contract.
Q: Should the exceptions have common base classes?
I have yet to need to do this, but I don't know your code base as well as you do. My guess is that it will gain you little if anything.
Edit:
You may be overplanning for the future. In my experience, taking a YAGNI approach has allowed us to get the important things done more quickly. Making incremental design changes is preferred to spending valuable time building an elaborate architecture that you might never even benefit from.