I've got an ASP.NET web application that is essentially our intranet site. I made a lot of progress on the administration office's employee management pages. It ties into a SQL Server database, and I'm using a three-layered design (Objects, Logic, DataAccess). It was all reviewed, and all of it was accepted, except for the part that manages vacations and vacation histories.
My question, before I go into details is, how does one efficiently "untangle" code that is no longer necessary?
For example: previously I was treating each VacationDay as its own entity with its own history, such that I could track the history of an individual day. To help in tracking, I have an enum called VacationDayAction, which includes options such as .Submitted, .RequestDenied, .CancellationRequested, and so on. This was in an attempt to provide meticulous detail for each day. It was then determined that we no longer need that. We do, however, still need VacationDays and all the basic functions of that (saving days, getting days, etc.), but now we no longer need any of the "history" related classes.
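To give a rough idea, here's a sketch of the kind of per-day tracking I mean. Only the enum name and the three values above come from the real code; everything else is just illustrative:

using System;

public enum VacationDayAction
{
    Submitted,
    RequestDenied,
    CancellationRequested
    // ...plus several other actions in the real code
}

// Hypothetical shape of the per-day history record that is now unnecessary.
public class VacationDayHistoryEntry
{
    public DateTime Day { get; set; }
    public VacationDayAction Action { get; set; }
    public DateTime RecordedOn { get; set; }
}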
My problem is, when I right click a class that I no longer need in VS and go to "Show All References," I get a ton of results scattered across several pages. I need to get rid of all of them, without breaking the rest of the application. Is there not some kind of "smart" technique or method for easily untangling parts that are no longer necessary? This is particularly difficult because 90% of what I did was just fine, and needs to stay like it is. Yet scattered in that 90% is 10% of stuff that is no longer needed. I can't just go storming through with the delete key either, because with the removal of each reference, I need to be sure that any dependencies on that reference are also fixed in a way that they don't call stuff that isn't there anymore. And I still need the application in a compilable state, so that I can test along the way that the rest of the application didn't fall apart as a result of some deletion.
To give you an idea of my low level of experience, I started two years ago with having never used C#, ASP.Net, or Visual Studio. It blew my mind when, way after starting and as I was learning, someone taught me that I could use breakpoints. And then it really really blew my mind when I learned about multi-layered design. I'm wondering if there is not some technique or trick or feature that can help in scenarios like this, where you have to "untangle" and throw away unnecessary stuff.
This is not a simple question. In fact, I would say this is one of the major challenges for any systems developer: how to handle and get rid of old code which is not in use. There is lots of literature on this, and few really excellent answers. A good book is Working Effectively with Legacy Code by Michael Feathers, which deals with many related problems. It is no light read, though, and will probably take some time to get through, but it will likely help you become a better coder, and better at these kinds of tasks in particular.
Maybe you can have a look at the ReSharper tool? ( http://www.jetbrains.com/resharper/ ) It is a productivity tool which, among other things, shows "dead" (unused) code in gray and lets you remove it. It will also help you remove unused references from each class (again, they will be grayed out and you can remove them automatically).
Drawing diagrams where each major piece of code/component is a box with a line linking it to any related component might help you get a better overview; try to draw a hierarchy showing how different parts of the code are related and dependent.
The bottom line as far as I know, is that you just have to muddle through it, commenting out code a little at a time, then recompiling and testing it. If it still works, fine, now you can remove the commented out code completely. This would be easier if you had unit-tests covering your code, but I take it as a given that you don't, as is unfortunately often the case.
I develop and maintain a large (500k+ LOC) WinForms app written in C# 2.0. It's multi-user and is currently deployed on about 15 machines. The development of the system is ongoing (can be thought of as a perpetual beta), and there's very little done to shield users from potential new bugs that might be introduced in a weekly build.
For this reason, among others, I've found myself becoming very reliant on edit-and-continue in the debugger. It helps not only with bug-hunting and bug-fixing, but in some cases with ongoing development as well. I find it extremely valuable to be able to execute newly-written code from within the context of a running application - there's no need to recompile and add a specific entry point to the new code (having to add dummy menu options, buttons, etc to the app and remembering to remove them before the next production build) - everything can be tried and tested in real-time without stopping the process.
I hold edit-and-continue in such high regard that I actively write code to be fully-compatible with it. For example, I avoid:
Anonymous methods and inline delegates (unless completely impossible to rewrite)
Generic methods (except in stable, unchanging utility code)
Targeting projects at 'Any CPU' (i.e. never executing in 64-bit)
Initializing fields at the point of declaration (initialization is moved to the constructor; see the sketch after this list)
Writing enumerator blocks that use yield (except in utility code)
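To illustrate the field-initialization point, here's a minimal sketch of the shape I aim for; the class and member names are made up:

using System.Collections.Generic;

public class OrderCache
{
    // No initializers at the point of declaration...
    private Dictionary<int, string> _cache;
    private int _maxEntries;

    public OrderCache()
    {
        // ...everything is initialized in the constructor instead,
        // which keeps the class friendlier to edit-and-continue.
        _cache = new Dictionary<int, string>();
        _maxEntries = 100;
    }
}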
Now, I'm fully aware that the new language features in C# 3 and 4 are largely incompatible with edit-and-continue (lambda expressions, LINQ, etc). This is one of the reasons why I've resisted moving the project up to a newer version of the Framework.
My question is whether it is good practice to avoid using these more advanced constructs in favor of code that is very, very easy to debug? Is there legitimacy in this sort of development, or is it wasteful? Also, importantly, do any of these constructs (lambda expressions, anonymous methods, etc) incur performance/memory overheads that well-written, edit-and-continue-compatible code could avoid? ...or do the inner workings of the C# compiler make such advanced constructs run faster than manually-written, 'expanded' code?
Without wanting to sound trite - it is good practice to write unit/integration tests rather than rely on Edit-Continue.
That way, you expend the effort once, and every other time is 'free'...
Now I'm not suggesting you retrospectively write unit tests for all your code; rather, each time you have to fix a bug, start by writing a test (or, more commonly, multiple tests) that proves the fix.
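For example, a minimal sketch of what I mean, using NUnit; the class under test and the discount rule are made up for illustration:

using NUnit.Framework;

[TestFixture]
public class InvoiceCalculatorTests
{
    [Test]
    public void Total_AppliesDiscount_ForPreferredCustomers()
    {
        // Written first so it reproduces the reported bug; the fix is only made once this test fails.
        var calculator = new InvoiceCalculator();

        decimal total = calculator.Total(amount: 100m, isPreferredCustomer: true);

        Assert.AreEqual(90m, total); // preferred customers are supposed to get a 10% discount
    }
}

Once the test passes, it stays in the suite and guards that fix for free on every future build.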
As @Dave Swersky mentions in the comments, Michael Feathers' book Working Effectively with Legacy Code is a good resource. (It's legacy five minutes after you wrote it, right?)
So yes, I think it's a mistake to avoid new C# constructs in favor of allowing for Edit and Continue; BUT I also think it's a mistake to embrace new constructs just for the sake of it, especially if they lead to harder-to-understand code.
I love 'Edit and Continue'. I find it is a huge enabler for interactive development/debugging and I too find it quite annoying when it doesn't work.
If 'Edit and Continue' aids your development methodology then by all means make choices to facilitate it, keeping in mind the value of what you are giving up.
One of my pet peeves is that editing anything in a function with lambda expressions breaks 'Edit and Continue'. If I trip over it enough I may write out the lambda expression. I'm on the fence with lambda expressions. I can do some things quicker with them but they don't save me time if I end up writing them out later.
In my case, I avoid using lambda expressions when I don't really need to. If they get in the way I may wrap them in a function so that I can 'Edit and Continue' the code that uses them. If they are gratuitous I may write them out.
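A rough sketch of the wrapping I mean; the form, grid, and Order type are made up for illustration:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;

public partial class OrdersForm : Form
{
    private List<Order> orders = new List<Order>();          // hypothetical Order type
    private DataGridView ordersGrid = new DataGridView();

    // Before: editing anything in this method while running breaks Edit and Continue,
    // because its body contains a lambda expression.
    private void LoadOverdueOrdersInline()
    {
        ordersGrid.DataSource = orders.Where(o => o.DueDate < DateTime.Today).ToList();
    }

    // After: the lambda lives in its own small method, so this calling method
    // stays editable while the process is running.
    private void LoadOverdueOrders()
    {
        ordersGrid.DataSource = GetOverdueOrders();
    }

    private List<Order> GetOverdueOrders()
    {
        return orders.Where(o => o.DueDate < DateTime.Today).ToList();
    }
}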
Your approach doesn't need to be black and white.
I wanted to clarify a couple of things here.
it is good practice to avoid using these more advanced constructs in favor of code that is very, very easy to debug?
Edit and Continue is not really debugging; it is developing. I make this distinction because the new C# features are very debuggable. Each version of the language adds debugging support for new language features to make them as easy as possible to debug.
everything can be tried and tested in real-time without stopping the process.
This statement is misleading. It's possible with Edit and Continue to verify that a change fixes a very specific issue. It's much harder to verify that the change is correct and doesn't introduce a host of other issues, mainly because Edit and Continue doesn't modify the binaries on disk and hence doesn't allow for things such as unit testing.
Overall, though, yes, I think it's a mistake to avoid new C# constructs in favor of allowing for Edit and Continue. Edit and Continue is a great feature (I really loved it when I first encountered it in my C++ days). But its value as a production server helper doesn't make up for the productivity gains from the new C# features, IMHO.
My question is whether it is good practice to avoid using these more advanced constructs in favor of code that is very, very easy to debug
I would argue that any time you are forcing yourself to write code that is:
Less expressive
Longer
Repeated (from avoiding generic methods)
Non-portable (never debug and test 64bit??!?!?)
You are adding to your overall maintenance cost far more than the loss of the "Edit and Continue" functionality in the debugger.
I would write the best code possible, not code that makes a feature of your IDE work.
While there is nothing inherently wrong with your approach, it does limit you to the amount of expressiveness understood by the IDE. Your code becomes a reflection of its capabilities, not the language's, and thus your overall value in the development world decreases because you are holding yourself back from learning other productivity-enhancing techniques. Avoiding LINQ in favor of Edit-and-Continue feels like an enormous opportunity cost to me personally, but the paradox is that you have to gain some experience with it before you can feel that way.
Also, as has been mentioned in other answers, unit-testing your code removes the need to run the entire application all the time, and thus solves your dilemma in a different way. If you can't right-click in your IDE and test just the 3 lines of code you care about, you're doing too much work during development already.
You should really introduce continuous integration, which can help you find and eliminate bugs before deploying the software. Especially big projects (and I consider 500k LOC quite big) need some sort of validation.
http://www.codinghorror.com/blog/2006/02/revisiting-edit-and-continue.html
Regarding the specific question: Don't avoid these constructs and don't rely on your mad debugging skills - try to avoid bugs at all (in deployed software). Write unit tests instead.
I've also worked on very large permanent-beta projects.
I've used anonymous methods and inline delegates to keep some relatively simple bits of use-one logic close to their sole place of use.
I've used generic methods and classes for reuse and reliability.
I've initialised classes in constructors to as full an extent as possible, to maintain class invariants and eliminate the possibility of bugs caused by objects in invalid states.
I've used enumerator blocks to reduce the amount of code needed to create an enumerator class to a few lines.
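As an illustration of that last point, a whole hand-written enumerator class collapses to something like this; the helper and its name are just for the sketch:

using System.Collections.Generic;

public static class Paging
{
    // What used to take a dedicated IEnumerator class is now a few lines with yield.
    public static IEnumerable<List<T>> InBatchesOf<T>(IEnumerable<T> source, int batchSize)
    {
        var batch = new List<T>(batchSize);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        if (batch.Count > 0)
            yield return batch;
    }
}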
All of these are useful in maintaining a large rapidly changing project in a reliable state.
If I can't edit-and-continue, I edit and start again. This costs me a few seconds most of the time, a couple of minutes in nasty cases. It's worth it for the hours that the greater ability to reason about code and the greater reliability through reuse save me.
There's a lot I'll do to make it easier to find bugs, but not if it'll make it easier to have bugs too.
You could try Test Driven Development. I found it very useful to avoid using the debugger at all. You start from a new test (e.g. unit test), and then you only run this unit test to check your development - you don't need the whole application running all the time. And this means you don't need edit-and-continue!
I know that TDD is the current buzz-word, but it really works for me. If I need to use the debugger I take it as a personal failure :)
Relying on Edit and Continue sounds as if very little time is spent on designing new features, let alone on unit tests. I find this bad because you probably end up doing a lot of debugging and bug fixing, and sometimes your bug fixes cause more bugs, right?
However, it's very hard to judge whether or not you should use particular language features, because this also depends on many, many other factors: project requirements, release deadlines, team skills, and the cost of code manageability after refactoring, to name a few.
Hope this helps!
The issue you seem to be having is:
It takes too long to rebuild your app, start it up again, and get to the bit of UI you are working on.
As everyone has said, unit tests will help reduce the number of times you have to run your app to find/fix bugs in non-UI code; however, they don't help with issues like the layout of the UI.
In the past I have written a test app that will quickly load the UI I am working on and fill it with dummy data, so as to reduce the cycle time.
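As a sketch of what such a harness can look like (the form, its Bind method, and the Employee type are all made up for illustration):

using System;
using System.Windows.Forms;

// A tiny separate harness project with its own Main; it jumps straight to the
// form under construction and fills it with dummy data.
internal static class UiHarness
{
    [STAThread]
    private static void Main()
    {
        Application.EnableVisualStyles();

        var form = new EmployeeDetailsForm();            // hypothetical form under construction
        form.Load += (s, e) => form.Bind(DummyData());   // hypothetical Bind method

        Application.Run(form);
    }

    private static Employee DummyData()                  // hypothetical Employee type
    {
        return new Employee { Name = "Test User", Department = "QA", VacationDaysLeft = 12 };
    }
}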
Separating out non-UI code into other classes that can be covered by unit tests will allow you to use all C# constructs in those classes. Then you can limit the constructs in use in the UI code itself.
When I started writing lots of unit tests, my usage of "edit-and-continue" went down; I now hardly use it apart from UI code.
I am working on a project where we have several attributes in AssemblyInfo.cs, that are being multicast to methods of a particular class.
[assembly: Repeatable(
AspectPriority = 2,
AttributeTargetAssemblies = "MyNamespace",
AttributeTargetTypes = "MyNamespace.MyClass",
AttributeTargetMemberAttributes = MulticastAttributes.Public,
AttributeTargetMembers = "*Impl", Prefix = "Cls")]
What I don't like about this is that it puts a piece of logic into AssemblyInfo (Info, mind you!), which for starters should not contain any logic at all. The worst part is that the actual MyClass.cs does not have the attribute anywhere in the file, and it is completely unclear that methods of this class might have them. From my perspective it greatly hurts readability of the code (not to mention that overuse of PostSharp can make debugging a nightmare), especially when you have multiple multicast attributes.
What is the best practice here? Is anyone out there is using PostSharp attributes like this?
Let me first answer Max: indeed, aspects are not an alternative to good OOP patterns; they are a complement. Any good AOP design starts with a good OOP design. But OOP patterns sometimes force you to write a lot of plumbing code manually. For these cases, aspects can be used to automate the implementation of OOP patterns, not to replace them.
When you use AOP intelligently, your solution becomes easier to understand (business code is not mixed with maintenance code), to test (you can test the aspect independently from business code, i.e. you don't have to test that every business method traces properly), and to change (you just have to change the aspect when you want to change the pattern, instead of changing every implementation of the pattern). Now, if you abuse AOP, if you use it as a hacking tool, if you do not think in terms of OOP patterns first, then you're going to get more costs than benefits from AOP. Like any sharp tool, AOP should be used intelligently.
Back to the original question.
Who says you should put aspects in AssemblyInfo.cs? You could create a new file called GlobalAspects.cs and put all assembly-level aspects there. You're right that AssemblyInfo.cs should just be for assembly-level metadata.
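For instance, the assembly-level declaration from the question could simply live in its own file; this sketch assumes the Repeatable aspect is the one from the question, plus a using directive for whatever namespace defines it:

// GlobalAspects.cs - assembly-level aspect declarations, kept out of AssemblyInfo.cs.
using PostSharp.Extensibility;
// using MyNamespace.Aspects;   // hypothetical namespace where RepeatableAttribute is defined

[assembly: Repeatable(
    AspectPriority = 2,
    AttributeTargetAssemblies = "MyNamespace",
    AttributeTargetTypes = "MyNamespace.MyClass",
    AttributeTargetMemberAttributes = MulticastAttributes.Public,
    AttributeTargetMembers = "*Impl", Prefix = "Cls")]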
But like you, I don't like assembly-level aspects; I think they should be avoided. The principal problem with assembly-level aspects is that they rely on naming conventions, and this is evil. (This evil is called pointcut fragility in the academic AOSD community.) Indeed, when you rename a class or namespace, you change the set of methods to which the aspect applies, and this can quickly become a nightmare. That's why I never use aspects based on naming conventions myself.
What about code readability? To a great extent, I think readable code is short code. If I have a business method called CreateProduct, I probably want to see just the code creating the product. Most of the time, I am not interested in the code that handles transactions, exceptions, or tracing. It's enough to know that some aspects handle that for me.
And how do I know? With PostSharp, you have the Visual Studio Extension. With AspectJ, you have the AspectJ plug-in for Eclipse (AJDT). They show you, inside the IDE, which aspects are applied to the code you currently see. And if you really want to see details (but you seldom really want), you can use the debugger to step into aspects, or use Reflector to see produced code.
Summary:
Good AOP design always starts with a good OOP design.
Avoid relying on naming conventions to apply aspects.
Use PostSharp extension for Visual Studio or AJDT to visualize aspects in your code.
I'm sure this will be an unpopular answer but maybe I can get my peer pressure badge...
Your instincts are correct. Putting logic in metadata of any kind is a horrible, horrible sin for which one burns eternally in the hellfire of unmaintainability.
I mean no disrespect by this although I'm certain it will be interpreted otherwise.
The best practice would be to not use "aspect-oriented programming" tools, which are crutches that enable the lameness of poor design and testing practices. Instead, look at your design and ask yourself "why."
Why did I feel the need to use this tool? What design problem was I trying to solve?
Once you have a firm grasp of the problem, go pick up Design Patterns Explained (Shalloway & Trott) or Head First Design Patterns (Freeman, Robson, Bates, & Sierra).
In the end, a pattern-oriented solution will be easier to understand, easier to test, and easier to change. The only additional cost will be the one-time fee of mastering design patterns in place of the recurring charge of trying to figure out where all these aspects are, how they fit together, and how they influence one another every time you make a change.
If it is possible to auto-format code before and after a source control commit, checkout, diff, etc. does a company really need a standard code style?
It feels like the standard coding style debates that have been raging since programming began, like "put the bracket on the following line" or "properly indent your (", are no longer essential.
I realize in languages where white space matters the diff will have to consider it but for languages where the style is a personal preference is there really a need to worry about it anymore?
Auto-format can really only address whitespace.
It won't address developers giving variables bizarre, nonsensical names.
It won't address some developers having functions return null on an error vs. throwing an exception (see the sketch below).
I'm sure others can think of more examples.
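To make the second point concrete, here is a sketch of the two styles a formatter can't reconcile; the repository and the Customer type are invented for illustration:

using System.Collections.Generic;

public class CustomerRepository                 // hypothetical repository
{
    private readonly Dictionary<int, Customer> customers = new Dictionary<int, Customer>();

    // Style A: signal "not found" with a null return.
    public Customer FindCustomer(int id)
    {
        Customer result;
        return customers.TryGetValue(id, out result) ? result : null;
    }

    // Style B: signal "not found" with an exception.
    public Customer GetCustomer(int id)
    {
        Customer result;
        if (!customers.TryGetValue(id, out result))
            throw new KeyNotFoundException("No customer with id " + id);
        return result;
    }
}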
This is what we do at my work:
We all use Eclipse. We don't have a policy for using Eclipse, but somehow none of us is an IDEA/IntelliJ guy. We also think our code should be written with legacy in mind. This means our code has to stay readable in a certain way even years later (#1), no matter who wrote it and whether that person is even still in the company.
Eclipse has a couple of handy features: automatic format on save and a dedicated Formatter tool, which can be configured with XML. Thus there's a set of premade formatter XML files available to every worker in our company, so that when a new guy comes in, we walk him through the whole process and configure his Eclipse for him (yes, it's a slightly evil thing to do) so that it actually uses the formatting XML files we have provided. We do not enforce automatic format on save; we don't want to be completely intrusive, we just want to push all our developers in the right direction. For even better compatibility, we mostly use the rules defined in JCC.
Next comes the important part, the actual builds. We are among those who embrace automatic builds, and for that we use the Hudson Continuous Integration server. There are two important parts to our configuration beyond this:
We use CVS loginfo to trigger builds whenever something is committed.
We utilize several plugins available for Hudson, including the Continuous Integration Game in conjunction with the most important one, Checkstyle.
The Checkstyle plugin is the magician in our code style enforcement guideline:
After committing code to CVS, a Hudson build is triggered
After the build has completed successfully (all unit tests pass, etc.), Checkstyle inspects the actual source files
Checkstyle ranks the code based on rules we have defined for it
The Continuous Integration Game sees the result of Checkstyle and awards or takes away points for the person who has ownership of the relevant part of the code
The leaderboard shows total points for every committer in the system
Basically this means that when anyone commits ugly code into our CVS, our build server automatically reduces that person's points.
This means that eventually any one of us can be ranked on the leaderboard based on general code quality, covering both appearance and OO principles such as the Law of Demeter, cyclomatic complexity, etc. Naturally this isn't a completely serious statistic, but it's a good indication that you're doing something wrong when a commit that triggers a build in our CI reduces your points - most of our commits are worth between 1 and 5 points.
And is it working? Sort of. I don't think any of us at my work writes ugly or unmaintainable code, and personally I love chasing scores of all kinds, so it's definitely motivating me to write code that looks nice and follows all the OO paradigms I know of.
And do we as a company really need it? I think we do; as you should see from reading this entire answer, it can be considered a good practice for the improvements it brings.
#1: On a related note, today I refactored legacy code from 2002 which used those standards; it didn't look "bad" at all even in its original form, and certainly not worse in its new form.
No, not really.
That is, if you can actually get it to work consistently and not have it flag code as changed due to a different style of laying the code out.
However, this is just a small part of coding standards. It won't cover multiple return statements, the use or not of ternary operators, etc.
It is always nice if the coding style that the shop uses is the same one that is also followed by the development tools.
Otherwise, if there is a large body of code that already follows a shop standard which is NOT the same as that of the tools, you have two choices:
Modify all of the code to follow the tool standard, or
Maintain the existing shop standard.
Many shops do the latter. Regardless, there does need to be some kind of standard, and it does need to be followed.
Some development tools allow you to tweak their standard. In some cases you may be able to bring the tools in alignment with the shop standard.
It probably doesn't matter that much anymore if you can ensure that everybody in the team sees the source code "correctly" formatted, whatever they think it is. However I've not seen a system that can do that - you can do parts of it (say, reformat before and after checkin/checkout) but these days you also have to consider web interfaces into the version control, external code review systems that interact directly with the version control system etc.
The main purpose of a standard code style is (IMHO) to ensure that you can read other team members' code easily, without having to start reverse-engineering it, because all the code is written using the same sort of guiding principles. Indenting and parenthesis placement seem to be a major hang-up here, but they are only a very small and, in my opinion, somewhat overblown and not very important part of the need to make code consistent.
Unfortunately I'm not aware of any tools that can automatically apply consistent coding principles to source code...
Yes, coding styles are needed if there is a desire to have a homogeneous code base. Such a code base can be useful in preventing individual ownership of parts of the code base, which can cause problems when people leave the team. If you can't imagine having wildly different styles and problems understanding all of it, just look at all the different ways English text can be organized in various communications, all written but quite different such as tweets, e-mail, text messages, IM, message board posts, etc. and changes in fonts, capitalization, decorations, etc.
I'm writing a console tool to generate some C# code for objects in a class library. The best/easiest way I can actually generate the code is to use reflection after the library has been built. It works great, but this seems like a haphazard approach at best. Since the generated code will be compiled with the library, after making a change I'll need to build the solution twice to get the final result, etc. Some of these issues could be mitigated with a build script, but it still feels like a bit too much of a hack to me.
My question is, are there any high-level best practices for this sort of thing?
It's pretty unclear what you are doing, but what does seem clear is that you have some baseline code, and based on some of its properties, you want to generate more code.
So the key issues here are: given the baseline code, how do you extract interesting properties, and how do you generate code from those properties?
Reflection is a way to extract properties of code running (well, at least loaded) in the same execution environment as the code using reflection. The problem with reflection is that it only provides a very limited set of properties: typically lists of classes, methods, or perhaps names of arguments. If all the code generation you want to do can be done with just that, well, then reflection seems just fine. But if you want more detailed properties about the code, reflection won't cut it.
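To make that concrete, here is roughly the level of detail reflection gives you - types, members, and parameter names, but nothing about the statements inside a method body (the assembly name is illustrative):

using System;
using System.Reflection;

class ReflectionSurvey
{
    static void Main()
    {
        // Hypothetical library built beforehand.
        Assembly assembly = Assembly.LoadFrom("MyLibrary.dll");

        foreach (Type type in assembly.GetTypes())
        {
            Console.WriteLine(type.FullName);
            foreach (MethodInfo method in type.GetMethods(
                BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly))
            {
                // Parameter types and names are about as deep as this goes.
                string parameters = string.Join(", ",
                    Array.ConvertAll(method.GetParameters(),
                        p => p.ParameterType.Name + " " + p.Name));
                Console.WriteLine("  {0}({1})", method.Name, parameters);
            }
        }
    }
}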
In fact, the only artifact from which truly arbitrary code properties can be extracted is the source code as a character string (how else could you answer whether the number of characters between the add operator and the T in the middle of the variable name is a prime number?). As a practical matter, properties you can get from character strings are generally not very helpful (see the example I just gave :).
The compiler guys have spent the last 60 years figuring out how to extract interesting program properties, and you'd be a complete idiot to ignore what they've learned in that time.
They have settled on a number of relatively standard "compiler data structures": abstract syntax trees (ASTs), symbol tables (STs), control flow graphs (CFGs), data flow facts (DFFs), program triples, pointer analyses, etc.
If you want to analyze or generate code, your best bet is to process it first into such standard compiler data structures and then do the job. If you have ASTs, you can answer all kinds of question about what operators and operands are used. If you have STs, you can answer questions about where-defined, where-visible and what-type. If you have CFGs, you can answer questions about "this-before-that", "what conditions does statement X depend upon". If you have DFFs, you can determine which assignments affect the actions at a point in the code. Reflection will never provide this IMHO, because it will always be limited to what the runtime system developers are willing to keep around when running a program. (Maybe someday they'll keep all the compiler data structures around, but then it won't be reflection; it will just finally be compiler support).
Now, after you have determined the properties of interest, what do you do for code generation? Here the compiler guys have been so focused on generation of machine code that they don't offer standard answers. The guys that do are the program transformation community (http://en.wikipedia.org/wiki/Program_transformation). Here the idea is to keep at least one representation of your program as ASTs, and to provide special support for matching source code syntax (by constructing pattern-match ASTs from the code fragments of interest), and provide "rewrite" rules that say in effect, "when you see this pattern, then replace it by that pattern under this condition".
By connecting the condition to the various property-extracting mechanisms from the compiler guys, you get a relatively easy way to say what you want, backed up by those decades of experience. Such program transformation systems have the ability to read in source code, carry out analysis and transformations, and generally to regenerate code after transformation.
For your code generation task, you'd read the baseline code into ASTs, apply analyses to determine properties of interest, use transformations to generate new ASTs, and then spit out the answer.
For such a system to be useful, it also has to be able to parse and prettyprint a wide variety of source code languages, so that folks other than C# lovers can also have the benefits of code analysis and generation.
These ideas are all reified in the DMS Software Reengineering Toolkit. DMS handles C, C++, C#, Java, COBOL, JavaScript, PHP, Verilog, ... and a lot of other languages.
(I'm the architect of DMS, so I have a rather biased view. YMMV).
Have you considered using T4 templates for performing the code generation? It looks like it's getting much more publicity and attention now and more support in VS2010.
This tutorial seems database-centric, but it may give you some pointers: http://www.olegsych.com/2008/09/t4-tutorial-creatating-your-first-code-generator/ In addition, there was a recent Hanselminutes episode on T4 here: http://www.hanselminutes.com/default.aspx?showID=170.
Edit: Another great place is the T4 tag here on StackOverflow: https://stackoverflow.com/questions/tagged/t4
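As a very rough sketch of the idea (the paths, namespace, and naming scheme here are illustrative, not from a real project), a T4 template that reflects over an already-built library might look like this:

<#@ template debug="false" hostspecific="true" language="C#" #>
<#@ assembly name="System.Core" #>
<#@ import namespace="System" #>
<#@ import namespace="System.Linq" #>
<#@ import namespace="System.Reflection" #>
<#@ output extension=".cs" #>
// <auto-generated /> - regenerated each time the template is transformed.
namespace MyNamespace.Generated
{
<#
    // Illustrative path: point this at the built class library you want to reflect over.
    Assembly assembly = Assembly.LoadFrom(Host.ResolvePath(@"..\MyLibrary\bin\Debug\MyLibrary.dll"));
    foreach (Type type in assembly.GetTypes().Where(t => t.IsPublic))
    {
#>
    public partial class <#= type.Name #>Generated { }
<#
    }
#>
}

The template runs at design time inside Visual Studio, so the generated .cs file is already part of the project when the build starts.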
EDIT: (By asker, new developments)
As of VS2012, T4 now supports reflection over an active project in a single step. This means you can make a change to your code, and the compiled output of the T4 template will reflect the newest version, without requiring you to perform a second reflect/build step. With this capability, I'm marking this as the accepted answer.
You may wish to use CodeDom, so that you only have to build once.
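A minimal sketch of the idea; the generated namespace and type names are invented for illustration:

using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

// Builds a tiny compile unit in memory and writes it out as C# source,
// which can then be included in the same build as the rest of the library.
class CodeDomSketch
{
    static void Main()
    {
        var unit = new CodeCompileUnit();
        var ns = new CodeNamespace("MyNamespace.Generated");
        var type = new CodeTypeDeclaration("GeneratedWidget") { IsClass = true };
        type.Members.Add(new CodeMemberField(typeof(string), "name"));

        ns.Types.Add(type);
        unit.Namespaces.Add(ns);

        using (var provider = new CSharpCodeProvider())
        using (var writer = new StreamWriter("GeneratedWidget.cs"))
        {
            provider.GenerateCodeFromCompileUnit(unit, writer, new CodeGeneratorOptions());
        }
    }
}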
First, I would read this CodeProject article to make sure there are not language-specific features you'd be unable to support without using Reflection.
From what I understand, you could use something like the Common Compiler Infrastructure (http://ccimetadata.codeplex.com/) to programmatically analyze your existing C# source.
This looks pretty involved to me though, and CCI apparently only has full support for C# language spec 2. A better strategy may be to streamline your existing method instead.
I'm not sure of the best way to do this, but you could do the following:
As a post-build step on your base dll, run the code generator
As another post-build step, run csc or msbuild to build the generated dll
Other things which depend on the generated dll will also need to depend on the base dll, so the build order remains correct