From a performance perspective, is it better to wrap each statement that uses LINQ in a using() statement, or to declare a class-level instance and use it in each method?
For instance:
public void UpdateSomeRecord(Record recordToUpdate)
{
    using(var entities = new RecordDBEntities())
    {
        // logic here...
    }
}
private RecordDBEntities entities = new RecordDBEntities();
public void UpdateSomeRecord(Record recordToUpdate)
{
    // logic here...
}
Or does it not matter either way?
Thanks!
The using statement may cost a little run time, but that shouldn't be your concern in cases like this. If a type implements IDisposable, it really ought to be wrapped in a using statement so that it can clean up after itself.
Cleanup code naturally takes longer to run than no cleanup code, which is why the using version may be slower. That is not a reason to drop it: I think you should use the using statement even if it takes longer to run.
What I am trying to say is that you are comparing apples to oranges: performance comparisons only make sense when the code being compared produces identical output and identical side effects. Your examples do not, so that is why I don't think this is a performance issue.
The best practice here is to use the using statement on types that implement IDisposable, even though it may make the method run longer. If you need to know how much longer, use a profiler to find out whether the code in question is actually a bottleneck.
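To make the cleanup guarantee concrete, here is a minimal sketch. FakeContext is a hypothetical stand-in for a disposable context such as RecordDBEntities (it is not part of the question's code); it only counts how many instances are still undisposed. The point is that the using block calls Dispose even when the body throws.

```csharp
using System;

// Hypothetical stand-in for a disposable context such as RecordDBEntities.
public sealed class FakeContext : IDisposable
{
    // Number of contexts created but not yet disposed.
    public static int OpenCount;

    public FakeContext() { OpenCount++; }

    public void Dispose() { OpenCount--; }
}

public static class UsingDemo
{
    // Per-call context: disposed when the block exits, even if the body throws.
    public static void UpdateWithUsing(bool fail)
    {
        using (var context = new FakeContext())
        {
            if (fail) throw new InvalidOperationException("simulated failure");
        }
    }

    public static void Main()
    {
        UpdateWithUsing(false);
        try { UpdateWithUsing(true); } catch (InvalidOperationException) { }
        Console.WriteLine(FakeContext.OpenCount); // 0: both contexts were disposed
    }
}
```

A class-level instance, by contrast, stays alive until its owner disposes it, so the owning class would need to implement IDisposable itself.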
In fact your question is about the lifetime management of the LINQ DataContext.
You may wish to look at the following article: Linq to SQL DataContext Lifetime Management
Related
Is there any benefit of doing this;
private void Method()
{
    var data = ConfigurationManager.AppSettings["Data"].Split('-');
}
than doing this;
private void Method()
{
    var _data = ConfigurationManager.AppSettings["Data"];
    var data = _data.Split('-');
}
Case: I need to read bunch of configuration values like this in the same method, multiple times (let's say every time I instantiate this class).
How will the two cases affect performance and memory? Or are they pretty much the same thing? As I see it, assigning to a variable allocates memory for no reason.
There will be the same IL code generated in both cases.
And don't forget about The Rules of Code Optimization
The compiler will reduce those to the exact same thing. No, there's no difference in this scenario. If you're ever curious, compile it in release mode, and use ildasm to look at what it did.
However! Performance questions should never be answered by hunch - or even asked on hunch. First, determine if you are actually trying to solve a real problem - otherwise you're probably just yak shaving.
In your first case, since ConfigurationManager.AppSettings["Data"] returns a string, there is no harm in chaining the Split() call onto it rather than creating an extra variable.
The second case would be more efficient if the value of ConfigurationManager.AppSettings["Data"] were used in multiple places. In that case, instead of fetching it again and again, you fetch it once, store it in a variable and reuse it.
Both statements are equal. You have a false understanding of when memory is allocated. This actually happens inside the AppSettings call, not on assignment. So whenever you call a member, the result already exists in memory. Storing that value in a variable does not cost anything extra, neither in memory allocation nor in performance.
However, if you stored the result in a member of your class, it would be garbage-collected far later than your local data variable, because it doesn't go out of scope. In that case the result stays allocated for as long as the instance exists.
Having said this, in almost all cases it is more important to focus on your code being maintainable, that is, on whether other developers can understand it without asking what it is all about.
This means you shouldn't ask which horse runs faster, but rather which code is easier to understand.
I have two scenarios (examples below), both are perfectly legitimate methods of making a database request, however I'm not really sure which is best.
Example One - This is the method we generally use when building new applications.
private readonly IInterfaceName _repositoryInterface;

public ControllerName()
{
    _repositoryInterface = new Repository(Context);
}

public JsonResult MethodName(string someParameter)
{
    var data = _repositoryInterface.ReturnData(someParameter);
    return data;
}

protected override void Dispose(bool disposing)
{
    Context.Dispose();
    base.Dispose(disposing);
}

public IEnumerable<ModelName> ReturnData(string filter)
{
    Expression<Func<ModelName, bool>> query = q => q.ParameterName.ToUpper().Contains(filter);
    // Assumption: Get accepts the predicate built above.
    return Get(query);
}
Example Two - I've recently started seeing this more frequently
using (SqlConnection connection = new SqlConnection(
    ConfigurationManager.ConnectionStrings["ConnectionName"].ToString()))
{
    var storedProcedureName = GetStoredProcedureName();
    using (SqlCommand command = new SqlCommand(storedProcedureName, connection))
    {
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.Add("@Start", SqlDbType.Int).Value = start;
        using (SqlDataReader reader = command.ExecuteReader())
        {
            // DATA IS READ AND PARSED
        }
    }
}
Both examples use Entity Framework in some form (the first more so than the other), there are Model and Mapping files for every table which could be interrogated. The main thing the second example does over the first (regarding EF) is utilising Migrations as part of the Stored Procedure code generation. In addition, both implement the Repository pattern similar to that which is in the second link below.
Code First - MSDN
Contoso University - Tutorial
My understanding of Example One is that the repository and context are instantiated once the Controller is called. When making the call to the repository it returns the data but leaves the context intact until it is disposed of at the end of the method. Example Two on the other hand will call Dispose as soon as the database call is finished with (unless forced into memory, e.g. using .ToList() on an IEnumerable). If my understanding is not correct, please correct me where appropriate.
So my main question is: what are the advantages and disadvantages of using one over the other? For example, is there a larger performance overhead with Example Two compared to Example One?
FYI: I've tried to search for an answer to this but have been unsuccessful, so if you know of a similar question please feel free to point me in that direction.
You seem to be making a comparison like this:
Is it better to build a house or to install plumbing in the bathroom?
You can have both. You could have a repository (house) that uses data connections (plumbing) so it's not an "OR" situation.
There is no reason why the call to ReturnData couldn't use a SqlCommand under the hood.
Now, the really important difference worth considering is whether the repository holds a resource (memory, connection, pipe, file, etc.) open for its whole lifetime, or just per data call.
The advantage of using a using is that resources are only opened for the duration of the call. This helps immensely with scaling of the app.
On the other hand there's an overhead to opening connections, so it's better - particularly for single threaded apps - to open a connection, do several tasks, and then close it.
So it really boils down to what type of app you're writing as to which approach you use.
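The trade-off between the two lifetimes can be sketched as follows. FakeConnection is a hypothetical class that only counts how often it is opened; it stands in for a real connection and is not taken from either example in the question. Real ADO.NET providers such as SqlClient pool connections by default, which lowers the cost of reopening, so treat the counts as an illustration rather than a measurement.

```csharp
using System;

// Hypothetical connection that only counts how often it is opened.
public sealed class FakeConnection : IDisposable
{
    public static int Opens;

    public FakeConnection() { Opens++; }

    // Placeholder for real database work.
    public int Query(int x) => x * 2;

    public void Dispose() { }
}

public static class ConnectionScopeDemo
{
    // One connection per call: held briefly, opened often.
    public static int PerCall(int[] inputs)
    {
        int total = 0;
        foreach (int x in inputs)
        {
            using (var connection = new FakeConnection())
            {
                total += connection.Query(x);
            }
        }
        return total;
    }

    // One connection per batch: opened once, held for the whole batch.
    public static int PerBatch(int[] inputs)
    {
        int total = 0;
        using (var connection = new FakeConnection())
        {
            foreach (int x in inputs) total += connection.Query(x);
        }
        return total;
    }

    public static void Main()
    {
        var inputs = new[] { 1, 2, 3 };

        FakeConnection.Opens = 0;
        int perCallTotal = PerCall(inputs);
        int perCallOpens = FakeConnection.Opens;

        FakeConnection.Opens = 0;
        int perBatchTotal = PerBatch(inputs);
        int perBatchOpens = FakeConnection.Opens;

        // Same result either way; only the number of opens differs (3 versus 1).
        Console.WriteLine($"{perCallTotal} in {perCallOpens} opens, {perBatchTotal} in {perBatchOpens} opens");
    }
}
```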
Your second example isn't using Entity Framework. It seems you may have two different approaches to data access here, although it is hard to tell from the repository snippet, as it quite rightly hides the data access implementation. The second example is correctly using a "using" statement, as you should on any object that implements IDisposable. It means you don't have to worry about calling Dispose. This is pure ADO.NET, which is what Entity Framework uses under the hood.
If the first example is using Entity Framework, you most likely have lazy loading in play, in which case you need the DbContext to remain alive until the query has been executed. Entity Framework is an ORM tool. It too uses ADO.NET under the hood to connect to the database, but it also offers you a lot more on top. A good book on both subjects should help you.
I found that learning ADO.NET first helps a lot in understanding how Entity Framework retrieves information from the database.
The using statement is good practice wherever you find an object that implements IDisposable. You can read more about that here: IDisposable the right way
In response to the change to the question - the answer still on the whole remains the same. In terms of performance - how fast are the queries returned? Does the performance of one work better than the other? Only your current system and set up can tell you that. Both approaches seem to be doing things the correct way.
I haven't worked with Migrations, so I'm not sure why you are getting ADO.NET-style queries integrating with your EF models, but I wouldn't be surprised by this functionality. Entity Framework, as I have experienced it, creates the queries for you and then executes them using the ADO.NET objects from your second example. The key point is that you want a "using" block for each of the SqlConnection, SqlCommand and SqlDataReader objects. The nesting is needed: a "using" block disposes only the object it declares, so the outer block on its own would not dispose the command or the reader.
There is nothing stopping you from putting a "using" block around the context in your repository, but when it comes to lazily loading the related entities you will get an error, because the context will already have been disposed. If you need to make this change, you can include the relevant elements in your query and do away with the lazy loading approach. There are performance gains in certain situations from doing this, but again you need to balance that against how your system is performing.
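The disposed-context problem can be reproduced without a database. The sketch below is an analogy, not Entity Framework itself: LazyContext is a hypothetical class whose query is deferred, much like a lazily loaded relation. Returning the deferred sequence out of the using block fails on enumeration, while materializing inside the block (here with ToList()) works.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical context: its query is lazy and fails once the context is disposed.
public sealed class LazyContext : IDisposable
{
    private bool _disposed;

    public IEnumerable<int> Records()
    {
        foreach (int id in new[] { 1, 2, 3 })
        {
            if (_disposed) throw new ObjectDisposedException(nameof(LazyContext));
            yield return id;
        }
    }

    public void Dispose() { _disposed = true; }
}

public static class LazyLoadingDemo
{
    // Returns a lazy sequence; nothing runs until the caller enumerates it,
    // and by then the context has been disposed.
    public static IEnumerable<int> Deferred()
    {
        using (var context = new LazyContext())
        {
            return context.Records().Where(id => id > 1);
        }
    }

    // Materializes inside the using block, so the results outlive the context.
    public static List<int> Materialized()
    {
        using (var context = new LazyContext())
        {
            return context.Records().Where(id => id > 1).ToList();
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Materialized())); // 2,3
        try
        {
            foreach (int id in Deferred()) Console.WriteLine(id);
        }
        catch (ObjectDisposedException)
        {
            Console.WriteLine("context was already disposed");
        }
    }
}
```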
I tried to search for this but couldn't quite find an answer.
I have a method, and inside it there's a code block that is called very often, so I refactored it into a local Func.
Now because I don't use that code block anywhere else, it makes sense to have this instead of another method.
But is it better, performance-wise, to use another method? Does the Func get allocated or in some other way use extra processing time or memory because it's declared inside the function, or does it get cached or even actually made into a method behind the scenes by the compiler?
I know it sounds like a micro-optimization thing, but in my case, the method gets called very often. So maybe that changes the consideration.
So, basically:
public T CalledVeryOften(...)
{
    Func<...> block = () => ...;
    //code that calls 'block' several times
}
or
public T CalledVeryOften(...)
{
    //code that calls 'block()' several times
}
private ... block()
{
    ...
}
Nah, there shouldn't be a huge difference in performance. A Func either compiles to a static or instance method depending on whether you use closures.
However, if you can inline the Func code, it might increase performance. I'm not sure how to do that, though.
By inline, I'm referring to the inline keyword we can have in C++. It tells the compiler to embed the function instructions in that code block. I'm not sure if C# offers that benefit.
Btw, if the private method really belongs to a method block that can be reusable and you are using Func for the sake of performance increase, I'd refactor it back to the way it was.
It is a micro-optimisation :) Only once your program is slowing down to an unacceptable level, and profiling shows that the root cause is this function call, should you consider alternatives.
The overhead really is negligible in the grand scheme of things. I would definitely file this under "Things I need not be concerned about".
Besides, you've probably made your code more readable in the process.
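For reference, here are the options side by side in a minimal sketch (all names are made up for illustration). As far as I know, a lambda that captures a local needs a closure object and a delegate each time the enclosing method runs, whereas a local function (C# 7 and later) that is only called directly does not need a delegate at all. All three variants return the same result, so any difference is purely one of overhead, which a profiler would have to confirm.

```csharp
using System;

public static class LocalBlockDemo
{
    // Lambda that captures 'factor': a closure and a delegate are created
    // each time this method runs.
    public static int WithFunc(int factor)
    {
        Func<int, int> block = x => x * factor;
        return block(1) + block(2) + block(3);
    }

    // Local function (C# 7+): called directly, so no delegate is needed here.
    public static int WithLocalFunction(int factor)
    {
        int Scale(int x) => x * factor;
        return Scale(1) + Scale(2) + Scale(3);
    }

    // Separate private method: the state is passed explicitly.
    public static int WithMethod(int factor)
    {
        return ScaleBy(1, factor) + ScaleBy(2, factor) + ScaleBy(3, factor);
    }

    private static int ScaleBy(int x, int factor) => x * factor;

    public static void Main()
    {
        Console.WriteLine(WithFunc(2));          // 12
        Console.WriteLine(WithLocalFunction(2)); // 12
        Console.WriteLine(WithMethod(2));        // 12
    }
}
```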
I came across some code recently that replaces the use of switches by hard-coding a
Dictionary<string (or whatever we would've been switching on), Func<...>>
and where ever the switch would've been, it instead does dict["value"].Invoke(...).
The code feels wrong in some way, but at the same time, the methods do look a bit cleaner, especially when there's many possible cases. I can't give any rationale as to why this is good or bad design so I was hoping someone could give some reasons to support/condemn this kind of code. Is there a gain in performance? Loss of clarity?
Example:
public class A {
    ...
    public int SomeMethod(string arg){
        ...
        switch(arg) {
            case "a": do stuff; break;
            case "b": do other stuff; break;
            etc.
        }
        ...
    }
    ...
}
becomes
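public class A {
    // Note: a field initializer cannot call instance methods, so DoOtherStuff
    // must be static here (or the dictionary must be built in a constructor).
    Dictionary<string, Func<int>> funcs = new Dictionary<string, Func<int>> {
        { "a", () => 0 },
        { "b", () => DoOtherStuff() }
        ... etc.
    };
    public int SomeMethod(string arg){
        ...
        funcs[arg].Invoke();
        ...
    }
    ...
}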
Advantages:
You can change the behaviour of the "switch" at runtime
it doesn't clutter the methods using it
you can have non-literal cases (ie. case a + b == 3) with much less hassle
Disadvantages:
All of your methods must have the same signature.
You have a change of scope: you can't use variables defined in the scope of the method unless you capture them in the lambda, and you'll have to redefine all the lambdas should you add a variable at some point
you'll have to deal with non-existent keys specifically (similar to default in a switch)
the stacktrace will be more complicated if an unhandled exception should bubble up, resulting in a harder to debug application
Should you use it? It really depends. You'll have to define the dictionary somewhere, so the code will be cluttered by it in some place. You'll have to decide for yourself. If you need to switch behaviour at runtime, the dictionary solution really stands out, especially if the methods you use don't have side effects (i.e. don't need access to scoped variables).
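A minimal sketch of how some of these points play out, assuming handlers of type Func<int> as in the question (the Dispatcher class, the Register method and the fallback value -1 are made up for illustration): TryGetValue plays the role of the default branch, and handlers can be replaced at runtime.

```csharp
using System;
using System.Collections.Generic;

public class Dispatcher
{
    private readonly Dictionary<string, Func<int>> _handlers;

    public Dispatcher()
    {
        // Built in the constructor so the lambdas may call instance methods.
        _handlers = new Dictionary<string, Func<int>>
        {
            { "a", () => 0 },
            { "b", () => DoOtherStuff() },
        };
    }

    private int DoOtherStuff() => 42;

    public int SomeMethod(string arg)
    {
        // TryGetValue plays the role of the switch's default branch.
        return _handlers.TryGetValue(arg, out var handler) ? handler() : -1;
    }

    // Behaviour can be replaced or extended at runtime.
    public void Register(string key, Func<int> handler) => _handlers[key] = handler;

    public static void Main()
    {
        var dispatcher = new Dispatcher();
        Console.WriteLine(dispatcher.SomeMethod("b"));   // 42
        Console.WriteLine(dispatcher.SomeMethod("zzz")); // -1 (no handler registered)
        dispatcher.Register("zzz", () => 7);
        Console.WriteLine(dispatcher.SomeMethod("zzz")); // 7
    }
}
```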
For several reasons:
Because doing it this way allows you to select what each case branch will do at runtime. Otherwise, you have to compile it in.
What's more, you can also change the number of branches at runtime.
The code looks much cleaner especially with a large number of branches, as you mention.
Why does this solution feel wrong to you? If the dictionary is populated at compile time, then you certainly don't lose any safety (the delegates that go in certainly have to compile without error). You do lose a little performance, but:
In most cases the performance loss is a non-issue
The flexibility you gain is enormous
Jon has a couple good answers. Here are some more:
Whenever you need a new case in a switch, you have to code it in to that switch statement. That requires opening up that class (which previously worked just fine), adding the new code, and re-compiling and re-testing that class and any class that used it. This violates a SOLID development rule, the Open-Closed Principle (classes should be closed to modification, but open to extension). By contrast, a Dictionary of delegates allows delegates to be added, removed, and swapped out at will, without changing the code doing the selecting.
Using a Dictionary of delegates allows the code executed for each case to live anywhere, and thus to be handed to the Dictionary from anywhere. Given this freedom, it's easy to turn the design into a Strategy pattern where each delegate is provided by a unique class that performs the logic for that case. This supports encapsulation of code and the Single Responsibility Principle (a class should do one thing, and should be the only class responsible for that thing).
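One way this could look, as a sketch with made-up names (ICaseHandler, CaseA, CaseB, CaseC and Selector are all hypothetical): each case lives in its own class, and the selecting code receives the lookup from outside, so a new case is added without editing the selector.

```csharp
using System;
using System.Collections.Generic;

// Each case becomes its own strategy class.
public interface ICaseHandler { int Handle(); }

public sealed class CaseA : ICaseHandler { public int Handle() => 0; }
public sealed class CaseB : ICaseHandler { public int Handle() => 1; }

// The selecting code: it never changes when cases are added or removed.
public sealed class Selector
{
    private readonly IDictionary<string, ICaseHandler> _handlers;

    public Selector(IDictionary<string, ICaseHandler> handlers) { _handlers = handlers; }

    public int Run(string arg) => _handlers[arg].Handle();
}

// A case added later, without touching Selector.
public sealed class CaseC : ICaseHandler { public int Handle() => 2; }

public static class StrategyDemo
{
    public static Selector Build()
    {
        var handlers = new Dictionary<string, ICaseHandler>
        {
            { "a", new CaseA() },
            { "b", new CaseB() },
        };
        handlers["c"] = new CaseC();
        return new Selector(handlers);
    }

    public static void Main()
    {
        var selector = Build();
        Console.WriteLine(selector.Run("a") + selector.Run("b") + selector.Run("c")); // 3
    }
}
```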
If there are many possible cases, it is a good idea to replace the switch statement with the strategy pattern. See:
Applying Strategy Pattern Instead of Using Switch Statements
No one has said anything yet about what I believe to be the single biggest drawback of this approach.
It's less maintainable.
I say this for two reasons.
It's syntactically more complex.
It requires more reasoning to understand.
Most programmers know how a switch statement works. Many programmers have never seen a Dictionary of functions.
While this might seem like an interesting and novel alternative to the switch statement and may very well be the only way to solve some problems, it is considerably more complex. If you don't need the added flexibility you shouldn't use it.
Convert your A class to a partial class, and create a second partial class in another file with just the delegate dictionary in it.
Now you can change the number of branches, and add logic to your switch statement without touching the source for the rest of your class.
(Regardless of language) Performance-wise, where such code exists in a critical section, you are almost certainly better off with a function look-up table.
The reason is that you eliminate multiple runtime conditionals (the longer your switch, the more comparisons there will be) in favour of simple array indexing and function call.
The only performance downside is you've introduced the cost of a function call. This will typically be preferable to said conditionals. Profile the difference; YMMV.
Some time ago I had to address a certain C# design problem when I was implementing a JavaScript code-generation framework. One of the solutions I came up with was using the “using” keyword in a totally different (hackish, if you please) way. I used it as syntactic sugar (well, originally it is that anyway) for building a hierarchical code structure. Something that looked like this:
CodeBuilder cb = new CodeBuilder();
using(cb.Function("foo"))
{
    // Generate some function code
    cb.Add(someStatement);
    cb.Add(someOtherStatement);
    using(cb.While(someCondition))
    {
        cb.Add(someLoopStatement);
        // Generate some more code
    }
}
It works because the Function and While methods return an IDisposable object that, upon disposal, tells the builder to close the current scope. Such a thing can be helpful for any tree-like structure that needs to be hard-coded.
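A minimal sketch of how such a builder might be put together. The question does not show the real implementation, so everything here is an assumption: statements and conditions are plain strings, the output is indented text, and the private Scope class is made up for illustration.

```csharp
using System;
using System.Text;

public sealed class CodeBuilder
{
    private readonly StringBuilder _output = new StringBuilder();
    private int _depth;

    public IDisposable Function(string name) => Open($"function {name}() {{");

    public IDisposable While(string condition) => Open($"while ({condition}) {{");

    public void Add(string statement) => AppendLine(statement + ";");

    public override string ToString() => _output.ToString();

    // Emits the block header and returns a scope that closes it on Dispose.
    private IDisposable Open(string header)
    {
        AppendLine(header);
        _depth++;
        return new Scope(this);
    }

    private void Close()
    {
        _depth--;
        AppendLine("}");
    }

    private void AppendLine(string text) =>
        _output.Append(new string(' ', _depth * 2)).Append(text).Append('\n');

    // Disposing the scope emits the closing brace for the block it opened.
    private sealed class Scope : IDisposable
    {
        private readonly CodeBuilder _owner;
        private bool _closed;

        public Scope(CodeBuilder owner) { _owner = owner; }

        public void Dispose()
        {
            if (_closed) return; // Dispose must be safe to call twice
            _closed = true;
            _owner.Close();
        }
    }
}

public static class CodeBuilderDemo
{
    public static string Build()
    {
        var cb = new CodeBuilder();
        using (cb.Function("foo"))
        {
            cb.Add("var i = 0");
            using (cb.While("i < 3"))
            {
                cb.Add("i++");
            }
        }
        return cb.ToString();
    }

    public static void Main()
    {
        Console.Write(Build());
    }
}
```

The nesting of the using blocks mirrors the nesting of the generated code, which is the whole appeal of the trick.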
Do you think such “hacks” are justified? Because you can say that in C++, for example, many of the features such as templates and operator overloading get over-abused and this behavior is encouraged by many (look at boost for example). On the other side, you can say that many modern languages discourage such abuse and give you specific, much more restricted features.
My example is, of course, somewhat esoteric, but real. So what do you think about the specific hack and of the whole issue? Have you encountered similar dilemmas? How much abuse can you tolerate?
I think this is something that has blown over from languages like Ruby that have much more extensive mechanisms to let you create languages within your language (google for "dsl" or "domain specific languages" if you want to know more). C# is less flexible in this respect.
I think creating DSLs in this way is a good thing. It makes for more readable code. Using blocks can be a useful part of a DSL in C#. In this case, though, I think there are better alternatives. The use of using in this case strays a bit too far from its original purpose, which can confuse the reader. I like Anton Gogolev's solution better, for example.
Offtopic, but just take a look at how pretty this becomes with lambdas:
var codeBuilder = new CodeBuilder();
codeBuilder.DefineFunction("Foo", x =>
{
    codeBuilder.While(condition, y =>
    {
    });
});
It would be better if the disposable object returned from cb.Function(name) were the object to which the statements are added. It is fine if this function builder internally passes the calls through to private/internal functions on the CodeBuilder, as long as the sequence is clear to public consumers.
So long as the Dispose implementation makes the following code cause a runtime error:
CodeBuilder cb = new CodeBuilder();
var function = cb.Function("foo");
using(function)
{
    // Generate some function code
    function.Add(someStatement);
}
function.Add(something); // this should throw
then the behaviour is intuitive and relatively reasonable, and the correct usage (below) prevents this from happening:
CodeBuilder cb = new CodeBuilder();
using(var function = cb.Function("foo"))
{
    // Generate some function code
    function.Add(someStatement);
}
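A sketch of a function builder that enforces this rule. FunctionBuilder is a hypothetical type constructed directly here, whereas in the design above it would be returned by cb.Function("foo"); statements are assumed to be strings. After Dispose, Add throws ObjectDisposedException.

```csharp
using System;
using System.Collections.Generic;

public sealed class FunctionBuilder : IDisposable
{
    private readonly List<string> _statements = new List<string>();
    private bool _closed;

    public FunctionBuilder(string name) { Name = name; }

    public string Name { get; }

    public int Count => _statements.Count;

    public void Add(string statement)
    {
        // Reject additions once the scope has been closed.
        if (_closed) throw new ObjectDisposedException(nameof(FunctionBuilder));
        _statements.Add(statement);
    }

    public void Dispose() { _closed = true; }
}

public static class FunctionBuilderDemo
{
    public static void Main()
    {
        var function = new FunctionBuilder("foo");
        using (function)
        {
            function.Add("someStatement");
        }

        try
        {
            function.Add("something");
        }
        catch (ObjectDisposedException)
        {
            Console.WriteLine("Add after Dispose was rejected");
        }
    }
}
```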
I have to ask why you are using your own classes rather than the provided CodeDomProvider implementations, though. (There are good reasons for this, notably that the current implementation lacks many of the C# 3.0 features, but since you don't mention it yourself...)
Edit: I would second Anton's suggestion to use lambdas. The readability is much improved (and you have the option of allowing Expression Trees).
If you go by the strictest definitions of IDisposable then this is an abuse. It's meant to be used as a method for releasing native resources in a deterministic fashion by a managed object.
The use of IDisposable has evolved to essentially be used by "any object which should have a deterministic lifetime". I'm not saying this is right or wrong, but that's how many APIs and users are choosing to use IDisposable. Given that definition, it's not an abuse.
I wouldn't consider it terribly bad abuse, but I also wouldn't consider it good form because of the cognitive wall you're building for your maintenance developers. The using statement implies a certain class of lifetime management. This is fine in its usual uses and in slightly customized ones (like @heeen's reference to an RAII analogue), but those situations still keep the spirit of the using statement intact.
In your particular case, I might argue that a more functional approach like @Anton Gogolev's would be more in the spirit of the language as well as maintainable.
As to your primary question, I think each such hack must ultimately stand on its own merits as the "best" solution for a particular language in a particular situation. The definition of best is subjective, of course, but there are definitely times (especially when the external constraints of budgets and schedules are thrown into the mix) where a slightly more hackish approach is the only reasonable answer.
I often "abuse" using blocks. I think they provide a great way of defining scope. I have a whole series of objects that I use for capturing and restoring state (e.g. of combo boxes or the mouse pointer) during operations that may change the state. I also use them for creating and dropping database connections.
E.g.:
using(_cursorStack.ChangeCursor(System.Windows.Forms.Cursors.WaitCursor))
{
    ...
}
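A generic version of this capture-and-restore idea might look like the sketch below. RestoreScope<T> and the string-valued Cursor field are made up for illustration and avoid any Windows Forms dependency; the real _cursorStack is not shown in the answer.

```csharp
using System;

// Captures a value, applies a temporary one, and restores the original on Dispose.
public sealed class RestoreScope<T> : IDisposable
{
    private readonly Action<T> _setter;
    private readonly T _previous;
    private bool _restored;

    public RestoreScope(Func<T> getter, Action<T> setter, T temporary)
    {
        _setter = setter;
        _previous = getter(); // capture the current state
        setter(temporary);    // apply the temporary state
    }

    public void Dispose()
    {
        if (_restored) return;
        _restored = true;
        _setter(_previous);
    }
}

public static class RestoreDemo
{
    // Stand-in for a piece of UI state such as the mouse cursor.
    public static string Cursor = "Default";

    public static string RunBusyWork()
    {
        using (new RestoreScope<string>(() => Cursor, value => Cursor = value, "Wait"))
        {
            return Cursor; // "Wait" while inside the block
        }
    }

    public static void Main()
    {
        Console.WriteLine(RunBusyWork()); // Wait
        Console.WriteLine(Cursor);        // Default
    }
}
```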
I wouldn't call it abuse. Looks more like a fancied up RAII technique to me. People have been using these for things like monitors.