I have mail-merge-like functionality that takes a template and a business object and produces HTML, which is then converted to PDF.
I'm using RazorEngine to do the template+model to html bit.
If I let the users specify the templates, what risks am I taking? Is it possible to mitigate any risks?
For example, could the users execute arbitrary code? (delete files, alter database, etc.?) Is there some way I can detect this sort of thing? (I know that would be impossible generally, but the bits of code in the razor template should be model property gets, or possibly if statements based on model property values).
I do basically trust the users here (it's a small private project), but as templating engines go, this one seems excessively powerful for this application.
In version 3 I've introduced an IsolatedTemplateService which supports the parsing/compiling of templates in another AppDomain. You'll be able to control the creation of the application domain that templates will be compiled in, which means you can introduce whatever security requirements you want by applying security policies to the child application domain itself.
In future pushes, I am hoping to introduce a generic way for adding extensions to the pipeline, so you can do things like code generation inspection. I would imagine this will enable scenarios for type checking of the generated code before it is compiled.
I pushed an early version of RazorEngine (v3) onto GitHub a few days ago. Feel free to check it out. https://github.com/Antaris/RazorEngine
A cshtml Razor file is able to execute any .NET code in the context of the site, so yes, it is a security risk to permit them to be supplied by users.
You would be better served by accepting a more general HTML template, with custom tokens to input Model data.
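As a rough illustration of the token idea, here is a minimal sketch that only allows dotted property paths on the model, so the template cannot contain code at all. The `{{Property.Path}}` syntax and the `ModelTemplate` class are my own assumptions for the example, not anything RazorEngine provides.

```csharp
using System;
using System.Net;
using System.Reflection;
using System.Text.RegularExpressions;

// Hypothetical helper: substitutes {{Property.Path}} tokens with model values.
public static class ModelTemplate
{
    // Matches {{Name}} or {{Customer.Name}}: identifiers separated by dots, nothing else.
    private static readonly Regex Token = new Regex(
        @"\{\{\s*([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)\s*\}\}");

    public static string Render(string template, object model)
    {
        return Token.Replace(template, m =>
        {
            object current = model;
            foreach (string part in m.Groups[1].Value.Split('.'))
            {
                if (current == null) break;
                // Only public instance properties are readable; methods are never invoked.
                PropertyInfo prop = current.GetType().GetProperty(
                    part, BindingFlags.Public | BindingFlags.Instance);
                if (prop == null)
                    throw new FormatException("Unknown property: " + m.Groups[1].Value);
                current = prop.GetValue(current, null);
            }
            // HTML-encode so model data cannot inject markup; null becomes an empty string.
            return WebUtility.HtmlEncode(Convert.ToString(current));
        });
    }
}
```

Calling `ModelTemplate.Render("Hello {{Customer.Name}}", model)` walks `model.Customer.Name` by reflection. Conditionals would need extra token syntax, which is where a ready-made logic-less template library may be a better fit than growing this by hand.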
I believe that removing using statements and using a regex to strip any @System.[...] call, such as System.IO.File.Delete(filepath), can close a fair number of possible security holes.
Keep in mind that the template runs inside a context and can access only what is available in it, but that also includes the .NET Framework assemblies.
Related
Is it possible to use C# as a DSL in which the C# source code is edited by the end user in a TextBox, compiled while the application is running, then called by the already-running application?
I ask because in the next few months I will need to implement a simple math-crunching DSL, similar to something Rachel Lim blogged about at http://rachel53461.wordpress.com/2011/08/20/the-math-converter/ (I am focused on the math-processing aspect of her code, not the XAML/Converter aspect). I would lean against just reusing her code because I want to add if-statements and possibly other features. If I can use C# itself, then I get all of the features without having to re-implement them.
If it is possible to do this, what framework or namespace or class would I want to use to accomplish such?
Please note that one thing I would do with the C#-derived DSL is hard-code all necessary using header statements, then remove all using statements entered by the savvy user. The purpose of this is to reduce the prospect of an end user trying to leverage my C#-like DSL into a full-fledged compiler against the wishes of their enterprise policy or without the knowledge of the site administrator. Is my proposed managing of using statements an adequate defense against user mischief?
Finally, if all of the answers up to this point are "yes", then what are the drawbacks of this approach, especially drawbacks of introducing a security vulnerability?
Paul
Is my proposed managing of using statements an adequate defense against user mischief?
No. You'd have to remove references to fully-qualified classes as well. And then, the user can still use reflection to gain access to classes they have not referred to in either way.
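To make the reflection point concrete, here is a small sketch of what a user snippet could look like. It has no using directives and never spells out a fully-qualified type name, yet it still reaches System.IO.File. It calls the harmless Exists method; a hostile snippet could call Delete the same way. The class and method names are made up, and it assumes File lives in the same assembly as string, which holds for mscorlib on the .NET Framework.

```csharp
// Deliberately no using directives: this is what survives the "strip usings" filter.
public static class UserSnippet
{
    public static object Run(string path)
    {
        // The type name is assembled at run time, so a text filter looking for
        // "System.IO" in the source never sees it.
        var name = "Sys" + "tem.IO.File";

        // Any object gives access to its Type, and from there to its assembly.
        var fileType = "".GetType().Assembly.GetType(name);

        // Invoke the static method by name; Exists stands in for something destructive.
        return fileType.GetMethod("Exists").Invoke(null, new object[] { path });
    }
}
```

Since the only things the snippet needs are a string literal and member access, filtering source text is not a realistic defense; restricting what the code is permitted to do at run time is.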
You'll want to create a separate appdomain to contain the user's code, which you can then sandbox appropriately. Here is a relevant article on MSDN, which explains this process in depth.
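A minimal sketch of creating such a sandboxed AppDomain, assuming the .NET Framework 4 security model (AppDomain sandboxing and Code Access Security are not supported on .NET Core and later). The domain name and the `Sandbox` class are placeholders of mine; a real setup will likely need a few more permissions depending on what the user code has to do.

```csharp
using System;
using System.Security;
using System.Security.Permissions;

public static class Sandbox
{
    // untrustedDir: a folder that contains only the user's compiled assembly.
    public static AppDomain Create(string untrustedDir)
    {
        // Start from an empty grant set and add only the right to execute code.
        // No file, registry, network, or reflection permissions are granted.
        var permissions = new PermissionSet(PermissionState.None);
        permissions.AddPermission(new SecurityPermission(SecurityPermissionFlag.Execution));

        // Keep the sandbox's probing path away from the host application's folder.
        var setup = new AppDomainSetup { ApplicationBase = untrustedDir };

        // No full-trust assemblies are listed, so everything loaded here is partially trusted.
        return AppDomain.CreateDomain("UserCodeSandbox", null, setup, permissions);
    }
}
```

Code running in the returned domain gets a SecurityException when it tries, say, file I/O. The host normally talks to it through a MarshalByRefObject proxy and calls AppDomain.Unload when finished.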
Stackoverflow automatically converts link answers to comments now. How lovely.
Compile and run dynamic code, without generating EXE?
Anyway, the answer lies with Microsoft.CSharp.CSharpCodeProvider
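For what it's worth, a minimal sketch of compiling and running a user-supplied expression in memory with CSharpCodeProvider. This assumes the .NET Framework (on .NET Core and later CompileAssemblyFromSource is not supported), and the `DynamicMath`, `Snippet`, and `Run` names are invented for the example.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Reflection;
using Microsoft.CSharp;

public static class DynamicMath
{
    // Wraps the user's expression in a method, compiles it in memory, and invokes it.
    public static double Evaluate(string expression, double x)
    {
        string source =
            "public static class Snippet {" +
            "  public static double Run(double x) { return " + expression + "; }" +
            "}";

        var options = new CompilerParameters
        {
            GenerateInMemory = true,    // no EXE or DLL is left on disk
            GenerateExecutable = false  // build a library, not a program
        };

        using (var provider = new CSharpCodeProvider())
        {
            CompilerResults results = provider.CompileAssemblyFromSource(options, source);
            if (results.Errors.HasErrors)
                throw new InvalidOperationException(
                    "Compilation failed: " + results.Errors[0].ErrorText);

            MethodInfo run = results.CompiledAssembly.GetType("Snippet").GetMethod("Run");
            return (double)run.Invoke(null, new object[] { x });
        }
    }
}
```

Note that the compiled assembly is loaded into the current AppDomain with the host's full permissions, and each call adds another assembly that cannot be unloaded, so this by itself does nothing for the security question.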
Removing using directives will not help, unless you also find some way to prevent the user from writing e.g. System.Diagnostics.Process.Start("evilprogram.exe"). Doing this (without also preventing property accesses) will require you to use a C# parser.
You might, however, be able to use Code Access Security for this.
In my scenario, let's say there is an ASP.NET 4.0 C# page containing a form with several inputs on it. Based upon which state the user is in, the form needs to act in entirely different ways: some fields might be required, some not visible at all, some might have different requirements (state A might only allow numbers 1-5, state B numbers 5-10), etc.
So, to simplify things, let's just say for any given input on the form, I need to determine whether or not it's required for the user, again based on their state. For those of you who run into this scenario quite a bit, what's the best way of implementing a system to handle this? I can see the following options:
Hardcoded - Difficult to maintain, obviously
Custom Database Rule Framework - This seems like it would work; however, it would be somewhat of a pain to maintain depending on how complicated the logic is
Windows Workflow Foundation - This would be able to handle just about any kind of logic, and be decent to maintain, but I'm not sure how this would do performance-wise. (It could be stored externally in a database.)
Dynamic Code - Store the logic in a database and run it directly based upon the user. I've never done this. Is it possible?
That's all I've come up with at this point, but I'm hoping someone out there has found an elegant solution to handle scenarios with complicated forms like this.
Thanks!
I have never worked with WWF, but I have encountered a scenario like this and implemented an entry form for it that works well and is easy to maintain once you understand the system.
I will discourage you from using hardcoded logic because any degree of complexity will quickly become impossible to maintain. I tried a hybrid approach that included some hardcoding initially and it did not turn out well.
I ended up creating, as you call it, a custom database rule framework. It is a little extra work to set up config forms to associate user groups with certain codes and pieces of functionality, but in the end it is well worth it for everything to automatically configure itself. Also in my case I was able to farm out user & code setup work to a supervisor in the department that uses the application, so that is a big plus.
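To give a rough idea of the shape, here is a minimal sketch of such a rule table, with an in-memory list standing in for the database rows. The `FieldRule` columns and the `RuleSet` class are assumptions for illustration, not the schema I actually used.

```csharp
using System;
using System.Collections.Generic;

// One row per (state, field) pair, as it might come back from a rules table.
public sealed class FieldRule
{
    public string State;
    public string Field;
    public bool Visible = true;
    public bool Required;
    public int? Min;   // inclusive lower bound, if any
    public int? Max;   // inclusive upper bound, if any
}

public sealed class RuleSet
{
    private readonly Dictionary<string, FieldRule> rules =
        new Dictionary<string, FieldRule>(StringComparer.OrdinalIgnoreCase);

    public RuleSet(IEnumerable<FieldRule> rows)
    {
        foreach (FieldRule row in rows) rules[Key(row.State, row.Field)] = row;
    }

    private static string Key(string state, string field) { return state + "|" + field; }

    // Returns null when the input is acceptable, otherwise an error message.
    public string Validate(string state, string field, string input)
    {
        FieldRule rule;
        // Fields with no rule, or hidden for this state, are not validated.
        if (!rules.TryGetValue(Key(state, field), out rule) || !rule.Visible) return null;

        if (string.IsNullOrWhiteSpace(input))
            return rule.Required ? field + " is required." : null;

        if (rule.Min.HasValue || rule.Max.HasValue)
        {
            int number;
            if (!int.TryParse(input, out number)) return field + " must be a number.";
            if (rule.Min.HasValue && number < rule.Min.Value)
                return field + " must be at least " + rule.Min.Value + ".";
            if (rule.Max.HasValue && number > rule.Max.Value)
                return field + " must be at most " + rule.Max.Value + ".";
        }
        return null;
    }
}
```

With rows for state A (1 to 5) and state B (5 to 10) on the same field, the page code stays identical for every state and only the data changes, which is what makes it practical to hand rule maintenance to someone outside the dev team.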
Hardcoding -- not so hard to maintain, just depending on how fluid the rules are. I.e., if your "states" are relatively fixed, you're not adding new ones or changing the way those states interact with the page, then hardcoding might be fine. My only recommendation in this case would be to keep it in a separate class so you can re-use it, modify & re-publish easier, etc.
If you want the flexibility to change the rules a lot, create new states (I'm thinking of these as "roles"), then storing the info in a database would make more sense.
Personally, I use the database approach. It saves me some republishing of the app, and it has allowed me to build additional interfaces for my end-users to have limited capability to manage their own app in terms of role assignments ("states" as you put it), etc. For example, my end-users can grant one of their clients (based on the client's login) access to a certain report. Or in your situation, they could change the min number for some range-validator your .aspx is using.
Since this approach lets me delegate some admin functions to my end-users, it allows them to do on-the-fly changes (to a limited extent), and also saves me a lot of rush-work / do it yesterday work as far as my own to-do list is concerned.
AGAIN: If you're voting -1, please leave a comment explaining why. This post isn't about whether or not you approve of this approach, but how to go about it.
Like many architects, I've developed coding standards through years of experience to which I expect my developers to adhere.
This is especially a problem with the crowd that believes that three or four years of experience makes you a senior-level developer. Approaching this as a training and code review issue has generated limited success.
So, I was thinking that it would be great to be able to add custom compile-time errors to the build process to more strictly enforce our in-house best practices and coding standards.
For instance, we use stored procedures for ALL database access, which provides procedure-level security, db encapsulation (table structure is hidden from the app), and other benefits. (Note: I am not interested in starting a debate about this.) Some developers prefer inline SQL or parametrized queries, and that's fine - on their own time and own projects.
I'd like a way to add a compilation check that finds, say, anything that looks like
string sql = "insert into some_table (col1,col2) values (@col1, @col2);"
and generates an error or, in certain circumstances, a warning, with a message like
Inline SQL and parametrized queries are not permitted.
Or, if they use the var keyword
var x = new MyClass();
Variable definitions must be explicitly typed.
Do Visual Studio and MSBuild provide a way to add this functionality? I'm thinking that I could use a regular expression to find unacceptable code and generate the correct error, but I'm not sure what, from a performance standpoint, is the best way to integrate this into the build process.
We could add a pre- or post-build step to run a custom EXE, but how can I return line- and file-specific errors? Also, I'd like this to run after compilation of each file, rather than post-link.
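On the line- and file-specific part: as far as I know, MSBuild and Visual Studio recognize tool output written in the canonical diagnostic format `file(line,column): error CODE: message`, so a custom EXE run from a build step can surface clickable errors just by printing lines in that shape. A minimal sketch follows; the `STD0001` code and the deliberately crude regex are placeholders of mine.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class StandardsCheck
{
    // Crude check: a string literal that starts with a SQL data-modification keyword.
    private static readonly Regex InlineSql = new Regex(
        @"""\s*(insert\s+into|update\s+\w+\s+set|delete\s+from)\b",
        RegexOptions.IgnoreCase);

    // Builds a line in the "file(line,col): error CODE: message" format.
    public static string Format(string file, int line, int column, string code, string message)
    {
        return string.Format("{0}({1},{2}): error {3}: {4}", file, line, column, code, message);
    }

    // Usage: StandardsCheck.exe File1.cs File2.cs ...
    public static int Main(string[] args)
    {
        int failures = 0;
        foreach (string file in args)
        {
            string[] lines = File.ReadAllLines(file);
            for (int i = 0; i < lines.Length; i++)
            {
                Match m = InlineSql.Match(lines[i]);
                if (!m.Success) continue;

                // Line and column are 1-based in the diagnostic format.
                Console.WriteLine(Format(file, i + 1, m.Index + 1, "STD0001",
                    "Inline SQL and parametrized queries are not permitted."));
                failures++;
            }
        }
        // A non-zero exit code makes the build step fail.
        return failures == 0 ? 0 : 1;
    }
}
```

Writing `warning` instead of `error` in the same format should give a warning rather than a build break. A regex like this will miss concatenated or multi-line SQL, so treat it as a starting point for the reporting side only.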
Is a regex the best way to perform this type of pattern matching, or should I go crazy and run the code through a C# parser, which would allow node-level validation via the parse tree?
I'd appreciate suggestions and tales of prior experience.
Comments
Several respondents have pointed out that it's possible to restrict the ability of a user to run anything but a stored proc through db permissions. However, we're in the process of porting a 350k+ line application from ASP 3.0 to ASP.NET MVC, and the existing code base relies pretty heavily on concatenated SQL, whereas the new stuff all uses Enterprise Library. I guess I could add a separate web user account for the .NET code with more restrictive permissions.
For coding standards I would look at writing custom rules for FxCop or StyleCop. I don't think Regex would be a suitable tool for the job.
For the specific case of requiring Stored Procedures - if you ensure the application doesn't have permission to do anything else on the production database, everyone will soon fall in line.
What about writing a plugin for ReSharper? Here is a tutorial to start with: Writing plug-ins for ReSharper: Part 1 of Undefined
Implicit typing (var x = ....) is a feature that can be turned off at the project level in Visual Studio.
The other one is trickier. Have you had a look at FxCop, which is the tool for enforcing code standards?
The requirement that only stored procedures can be used should be managed through database permissions. The rule against using var seems fairly arbitrary to me and I can't think of a way to enforce it. Do you have any more examples of your best practices?
Is there any way to let users write their own aspx templates with my defined dynamic variables? Note that I don't want to use Web Forms (so there are no tags like <asp:button> etc).
In addition, I'd need a security solution so users can't change the system or do dangerous things like this.
Thanks.
Personally I would avoid using the ASPX engine for this. I would probably use either a really simple custom formatting solution (such as just a text file with %%VAR_NAME%% allowed for dynamic values), or I would look at a templating language such as Markdown (used by StackOverflow and others). BBCode is another option in a similar vein.
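A minimal sketch of the %%VAR_NAME%% approach, assuming the allowed values are handed over as a simple name-to-string dictionary. The `TokenTemplate` class is just an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class TokenTemplate
{
    // Matches %%NAME%% where NAME consists of letters, digits, or underscores.
    private static readonly Regex Token =
        new Regex(@"%%([A-Za-z0-9_]+)%%", RegexOptions.Compiled);

    public static string Render(string template, IDictionary<string, string> values)
    {
        return Token.Replace(template, match =>
        {
            string value;
            // Unknown tokens are left in place so typos stay visible in the output.
            if (!values.TryGetValue(match.Groups[1].Value, out value))
                return match.Value;

            // HTML-encode so a value cannot inject markup into the page.
            return WebUtility.HtmlEncode(value);
        });
    }
}
```

Because the template is only ever treated as text to search and replace, nothing the user writes is compiled or executed. The user's own HTML in the template still needs sanitizing if it is shown to other users.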
Allowing people to create ASPX templated pages on the fly seems like too much of a security issue to me. It would be hard to make sure you have closed all the possible attack vectors once they have direct access to the ASP.NET engine.
Since you didn't specify, I'm just guessing at your needs, so depending on the exact problem this may or may not be your best bet. If you include more details about the problem you are solving it would be easier to make suggestions.
I've put off using generated code as part of the build process for fear of the complexity it introduces into the build process.
Is there a simple way to integrate build-time generated code into an app?
The kind of code I'm thinking of is similar to the resource and settings file code generation that Visual studio performs:
Having intellisense here is valuable
There are a lot of properties and links between properties that are trivial to describe, but impossible to implement tersely in C#.
The underlying resource should be modifiable, and the code is automatically regenerated without needing any user interaction and without any need to understand the internals of the generator.
For (a non-real-world) example consider a precompiler that generated accessors to the named capture groups of a Regex via similarly named C# properties (or methods). This is typical of the kinds of things I'd like to generate: long snippets of boilerplate wrappers whose primary function is to enable compile-time checking for errors (in the above: accessing nonexistent capturing groups or writing an invalid regex) and, no less importantly, intellisense for these properties. Finally, this setup should be trivially usable by others on the team with only the bare minimum of learning curve. I.e., it's absolutely not acceptable to require manual intervention to regenerate the code, nor acceptable to commit the generated code into source control. At worst, everyone should just need to install some extension; ideally the extension should be installable into the source-tree so that anyone that checks out the tree can build the project without any introduction.
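To make the regex example concrete, here is a rough sketch of the generator core only: it takes a pattern and emits the source of a wrapper class with one property per named group. The `RegexWrapperGenerator` name and the shape of the emitted class are purely illustrative; the build and IDE integration is the part I'm actually asking about.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexWrapperGenerator
{
    // Returns C# source for a wrapper exposing each named capture group as a property.
    public static string Generate(string className, string pattern)
    {
        // Constructing the Regex validates the pattern at generation (build) time;
        // an invalid pattern throws ArgumentException here.
        var regex = new Regex(pattern);

        var sb = new StringBuilder();
        sb.AppendLine("public sealed class " + className);
        sb.AppendLine("{");
        sb.AppendLine("    private readonly System.Text.RegularExpressions.Match match;");
        sb.AppendLine("    public " + className +
                      "(System.Text.RegularExpressions.Match match) { this.match = match; }");

        foreach (string name in regex.GetGroupNames())
        {
            int ignored;
            // GetGroupNames also returns numbered groups such as "0"; skip those.
            if (int.TryParse(name, out ignored)) continue;

            sb.AppendLine("    public string " + name +
                          " { get { return match.Groups[\"" + name + "\"].Value; } }");
        }

        sb.AppendLine("}");
        return sb.ToString();
    }
}
```

A group named after a C# keyword would produce code that does not compile, so a real generator would need to escape or reject such names.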
For that to work well, it's critical that the IDE integration be excellent: Updating the underlying "resource" definition file should trigger a regeneration of the code without any user interaction, and ideally the generator itself would be easy to maintain for other developers later on (i.e. some amount of generator debug-ability is a plus).
Finally, an XSLT-like approach where the same template can be applied to various input resources is ideal, both because this means that you don't even need to look at the actual generator code if all you want to do is update the resource, and because it makes template reuse trivial.
I've looked at T4, but from what I've seen this has a less handy ASP-like approach where template and resource aren't cleanly split (i.e., the generator is responsible for finding the resource, which makes template reuse less easy).
Is there a better (cleaner) solution or some way of running T4 such that the same template can be trivially reused and (much like .NET settings files) that any update of the resource automatically triggers a regeneration of the implemented code?
Summary:
I'm looking for a code-gen approach that can
Regenerate code automatically without dev intervention when the underlying resource (not the template!) changes.
Be somewhat simple to maintain
Be able to share the same generator template between several resources (which, with point #1 probably implies the resource should refer to the generator and not vice-versa).
You can use T4ScriptFileGenerator from T4 Toolbox. Change "Custom Tool" property for your "resource" file to T4ScriptFileGenerator and save changes. The custom tool will generate a new, empty T4 script (.tt file). Place your code generation logic in this .tt file. Any time you modify (and save) the resource file, the T4ScriptFileGenerator will use the .tt file to generate the output code. For an example of how this works, see "LINQ to SQL Model" generator in the T4 Toolbox, which uses a .dbml file as the "resource". In the .tt file created by this generator, you will see that all of the code generation logic resides in separate .tt files and is reused with the help of include directives.
You may want to keep an eye on ABSE (http://www.abse.info). ABSE is a code-generation and model-driven software development methodology that is completely agnostic in terms of platform and language, so you wouldn't have any trouble creating your own generators for C# and anything else you wish. The big plus is that you can generate code exactly the way you want. The downside is that you may have more work to do at first to build your templates.
ABSE allows you to capture your domain knowledge into "Atoms", which are basically fragments of larger models you can build. ABSE is both declarative and executable. The model is able to generate code by your specification and incorporate custom code at the model level.
Unfortunately, ABSE is still a work in progress and an Integrated Development Environment (named AtomWeaver) is still in the making. Anyway, a CTP release of the generator is scheduled for January 2010, so we're already close to it.