I have an assembly written in C++/CLI that implements certain classes. Let us assume one of them is called SomeType.
Now, in an application developed in C#, I want to do the following:
while (!Console.KeyAvailable)
{
    using (SomeType type = new SomeType())
    {
        type.doSomething(); // do something
    }
}
Would this have any consequences, such as memory leaks, in any situation, say if there were an unhandled exception?
I read that the using keyword should generally be used for a class which implements IDisposable, but does that also apply to a C++/CLI class?
C++/CLI does not have an equivalent of the using keyword. It took a different approach, one that native C++ programmers expect, since they are familiar with a very common C++ idiom for deterministic destruction: the RAII pattern. Invoking it requires using "stack semantics". It works well; the syntax requirements, however, are pretty obscure.
I'll first show the clumsy way, which helps to demonstrate the syntax differences. Let's use StreamReader, a disposable class in .NET:
String^ ReadTopLineFromFile(String^ path) {
    StreamReader^ reader = gcnew StreamReader(path);
    try {
        return reader->ReadLine();
    }
    finally {
        delete reader;
    }
}
The try/finally is what makes the code exception-safe: if ReadLine() throws an exception, the StreamReader object is still disposed and the lock on the file is guaranteed to be released. This is the code that the C# compiler emits automatically when you use the using statement. Also note the use of the delete operator; in effect it calls the StreamReader::Dispose() method. The compiler won't let you write reader->Dispose(); using the operator is mandatory.
Now the equivalent of the using statement that the C++/CLI compiler supports. You invoke stack semantics by emulating the way the native C++ compiler treats a C++ object that's allocated on the stack. Like this:
String^ ReadTopLineFromFile(String^ path) {
    StreamReader reader(path);
    return reader.ReadLine();
} // <== disposed here
Note the missing ^ hat on the variable declaration, normally required when the variable stores a reference to a reference type. Intentionally omitting it is what invokes the pattern. No explicit gcnew call is required; the compiler emits it automatically. Also note that you no longer use -> to dereference the object; you now use the . operator.
The C++/CLI compiler automatically generates the try/finally blocks as well as the delete operator call, which is emitted at the closing brace of the scope block, just like the native C++ compiler does it. While it looks like the managed object is allocated on the stack, that's just a syntactic illusion; it is still on the GC heap.
The syntax is very different, of course, which is the only real hang-up with the feature. Knowing when to use the ^ hat and when to rely on stack semantics is something that has to be learned, and that takes a while.
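For comparison, here is a minimal sketch in plain (native) C++ of the idiom that stack semantics imitates. TracedResource, use_resource and run are hypothetical names made up for this illustration; they are not part of .NET or of any library. The only point is that the destructor runs at the closing brace on both the normal path and the exception path, with no hand-written try/finally.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical resource wrapper: appends to a log when acquired and released.
class TracedResource {
public:
    explicit TracedResource(std::string& log) : log_(log) { log_ += "open;"; }
    ~TracedResource() { log_ += "close;"; }

    // A resource owner like this should not be copied.
    TracedResource(const TracedResource&) = delete;
    TracedResource& operator=(const TracedResource&) = delete;

private:
    std::string& log_;
};

// Uses the resource and optionally throws part-way through.
void use_resource(std::string& log, bool fail) {
    TracedResource res(log);              // acquired here, as a local object
    if (fail) {
        throw std::runtime_error("read failed");
    }
    log += "read;";
}   // <== res is destroyed here, on both exit paths

// Runs use_resource and returns the recorded sequence of events.
std::string run(bool fail) {
    std::string log;
    try {
        use_resource(log, fail);
    } catch (const std::runtime_error&) {
        log += "caught;";
    }
    return log;
}
```

In this sketch run(false) yields "open;read;close;" and run(true) yields "open;close;caught;", so the clean-up has already happened by the time the exception handler runs.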
I come from a C++ background and have trouble understanding the point of IDisposable objects (and the point of many other things in .NET). Why is a Dispose function necessary in the first place? Whatever it does, why not do that in the destructor of the class? I understand it cleans up managed resources, but isn't that what a destructor is supposed to do? I understand that
using (var obj = new SomeIDisposableObject())
{
    // ...
}
is the equivalent of
var obj = new SomeIDisposableObject();
// ...
obj.Dispose();
but how does that save any typing? And if C# has a garbage collector, which it does, then why are we ever worried about disposing resources?
Is IDisposable/Using/etc. a Skeet-approved concept? What does he think of it?
IDisposable is nothing special. It's just an interface that makes you have a Dispose() function. IDisposable won't clear anything or destroy objects. A call to Dispose() does nothing if that function does nothing.
The use of IDisposable is a pattern. It's so important that it gets its own language construct (the using block), but it's just a pattern.
The difference from a C++ destructor is that in .NET the destructor is non-deterministic. You never know when the garbage collector will collect your object. You don't even know whether it will (unlike using delete in C++, which is deterministic).
So IDisposable is there for deterministically releasing unneeded references (and releasing unmanaged resources). The "disposed" object will remain in memory after you call Dispose until the garbage collector decides to collect it, if it ever does (at which point the "destructor" will be called, unless you explicitly tell it not to).
That's the main difference from using "destructors" (or finalizers, as they are called in C#).
As for why we need it:
Managed references to other objects prevent those objects from being collected by the garbage collector. So you need to "release" those references (for example, by setting the variables that reference those objects to null).
Objects in .NET can use unmanaged resources if they use native libraries. For example, every control or bitmap in a Windows.Forms application uses a Win32 handle, which is unmanaged. If you don't release those resources, a big application can easily reach the handle limit in Windows. Also, if you need interoperability with any non-.NET API, you will most likely have to allocate unmanaged memory (using, for example, Marshal.AllocHGlobal), which needs to be released at some point. That point is generally the Dispose() function of the IDisposable, which must be called explicitly (or through a using block).
As for the using block, it is closer to:
var myObject = new MyDisposableClass();
try
{
    ...
}
finally
{
    myObject.Dispose();
}
So it does indeed save typing.
A using block doesn't only call the Dispose method; it calls Dispose when you leave the block, however you leave it. So if your code throws an exception, Dispose is still called. Note that there is no catch clause, so the exception still propagates to the caller. The actual code is closer to:
var obj = new SomeIDisposableObject();
try
{
    // ...
}
finally
{
    if (obj != null)
    {
        obj.Dispose();
    }
}
Additionally, destructors don't always fire when you expect them to. I've had a few bugs where the destructor is called after the program has apparently exited, and tried to access resources that no longer exist. Since you don't know when it will be called, it's difficult to fix this.
If you use a decompiler such as JetBrains dotPeek, Redgate Reflector or Telerik JustDecompile, you sometimes cannot copy or even understand the code you need, because the decompiler shows something like this:
[CompilerGenerated]
private sealed class Class15
{
    // Fields
    public Class11.Class12 CS$<>8__locals25;
    public string endName;
    // Methods
    public Class15();
    public bool <Show>b__11(object intelliListItem_0);
}
I'm not talking about obfuscation; this happens all the time. I did some tests with my own code, and it occurs when using lambdas and iterators. I'm not sure why. Could anyone give more information about when and why this happens?
By default, Visual Studio will not compile identifiers containing $ and <> in C# (as in the code above).
Is there a way to translate or convert this decompiled code automatically?
Lambdas are a form of closure, which is a posh way of saying a unit of code you can pass around as if it were an object (but with access to its original context). When the compiler finds a lambda, it generates a new type (a class or struct) that encapsulates the code and any fields accessed by the lambda in its original context.
The problem here is: how do you generate code that will never conflict with user-written code?
The compiler's answer is to generate code that is illegal in the language you are using, but legal in IL. IL, or "Intermediate Language", is the native language of the Common Language Runtime. Any language that runs on the CLR (C#, VB.NET, F#) compiles into IL. This is how you get to use VB.NET assemblies in C# code and so on.
So this is why the decompilers generate the hideous code you see. Iterators follow exactly the same model, as do a bunch of other language features that require generated types.
There is an interesting side effect. The lambda may capture a variable in its original context:
public void TestCapture()
{
    StringBuilder b = new StringBuilder();
    Action l = () => b.Append("Kitties!");
}
So by capture I mean the variable b here is included in the package that defines the closure.
The compiler tries to be efficient and create as few types as possible, so you can end up with one generated class that supports all the lambdas found in a specific class, including fields for all the captured variables. In this way, if you're not careful, you can accidentally capture something you expect to be released, causing memory leaks that are really tricky to trace.
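The same transformation can be sketched in C++, whose lambdas are also compiled into unnamed classes. Treat this as an analogy rather than as what the C# compiler actually emits: GreetClosure and both function names are made up for illustration, and the capture here copies the variable, whereas a C# closure shares the captured variable with the enclosing method.

```cpp
#include <string>
#include <utility>

// What the programmer writes: a lambda that captures `prefix`.
std::string greet_with_lambda(const std::string& name) {
    std::string prefix = "Hello, ";
    auto greet = [prefix](const std::string& n) { return prefix + n; };
    return greet(name);
}

// Roughly what a compiler generates for that lambda: a class with one field
// per captured variable and a call operator holding the lambda body.
// The name GreetClosure is hypothetical; real generated names cannot be
// written in source code, which is why decompilers show odd identifiers.
class GreetClosure {
public:
    explicit GreetClosure(std::string prefix) : prefix_(std::move(prefix)) {}
    std::string operator()(const std::string& n) const { return prefix_ + n; }

private:
    std::string prefix_;   // the captured variable lives here
};

// The same call written against the hand-made closure class.
std::string greet_with_class(const std::string& name) {
    std::string prefix = "Hello, ";
    GreetClosure greet(prefix);
    return greet(name);
}
```

Both functions return the same string for the same input. The field in the generated class is also what keeps a captured object alive for as long as the closure itself is reachable.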
Is there an option to change the target framework?... I know with some decompilers they default to the lowest level framework (C# 1.0)
Writing memleak-free code in C++ isn't a problem for me, I just keep to the RAII idiom.
Writing memleak-free code in C# isn't very hard either, the garbage collector will handle it.
Unfortunately, writing C++/CLI code is a problem for me. I thought I had understood how it works, but I still have big problems and I hope you can give me some hints.
This is what I have:
A Windows service written in C# that uses C++ libraries (for example OpenCV) internally. The C++ classes are accessed with C++/CLI wrapper classes.
For example I have a MatW C++/CLI wrapper class for a cv::Mat image object, which takes as constructor argument a System::Drawing::Bitmap:
public ref class MatW
{
public:
    MatW(System::Drawing::Bitmap ^bmpimg)
    {
        cv::Size imgsize(bmpimg->Width, bmpimg->Height);
        nativeMat = new cv::Mat(imgsize, CV_8UC3);
        // code to copy data from Bitmap to Mat
        // ...
    }
    ~MatW()
    {
        delete nativeMat;
    }
    cv::Mat* ptr() { return nativeMat; }
private:
    cv::Mat *nativeMat;
};
Another C++ class might be, for example:
class PeopleDetector
{
public:
    void detect(const cv::Mat &img, std::vector<std::string> &people);
};
And its wrapper class:
public ref class PeopleDetectorW
{
public:
    PeopleDetectorW() { nativePeopleDetector = new PeopleDetector(); }
    ~PeopleDetectorW() { delete nativePeopleDetector; }
    System::Collections::Generic::List<System::String^>^ detect(MatW^ img)
    {
        std::vector<std::string> people;
        nativePeopleDetector->detect(*img->ptr(), people);
        System::Collections::Generic::List<System::String^>^ peopleList = gcnew System::Collections::Generic::List<System::String^>();
        for (std::vector<std::string>::iterator it = people.begin(); it != people.end(); ++it)
        {
            System::String^ p = gcnew System::String(it->c_str());
            peopleList->Add(p);
        }
        return peopleList;
    }
private:
    PeopleDetector *nativePeopleDetector; // member implied by the code above
};
And here is the call to the method in my Windows Service C# class:
Bitmap bmpimg = ...
using (MatW img = new MatW(bmpimg))
{
    using (PeopleDetectorW peopleDetector = new PeopleDetectorW())
    {
        List<string> people = peopleDetector.detect(img);
    }
}
Now, here are my questions:
is there anything wrong with my code?
do I have to use using in my C# code? It makes the code ugly when there are multiple wrapper objects in use, because the using statements have to be nested
could I use Dispose() instead after having used the objects?
Could I just not bother and leave it to the garbage collector? (no using, no Dispose())
is the above code the right way to return objects like List<string^>^ from C++/CLI to C#?
does using gcnew not mean that the garbage collector will take care of the objects and I don't have to care how and when to free them?
I know that's a lot of questions, but all I want is to get rid of my memory leak, so I listed everything that I think could possibly go wrong...
is there anything wrong with my code?
Not in what you've posted - you are applying using statements correctly. So your code sample is not the cause of your memory leaks.
do I have to use using in my C# code? It makes the code ugly, when there are multiple wrapper objects in use, because the using statements have to be nested
You don't have to, and if you do, you don't have to nest them syntactically. This is equivalent:
Bitmap bmpimg = ...
using (MatW img = new MatW(bmpimg))
using (PeopleDetectorW peopleDetector = new PeopleDetectorW())
{
    List<string> people = peopleDetector.detect(img);
}
could I use Dispose() instead after having used the objects?
You could, but then you'll need a try/finally to ensure Dispose is always called, even when an exception is thrown. The using statement encapsulates that whole pattern.
Could I just not bother and leave it to the garbage collector? (no using, no Dispose())
C++ RAII is commonly applied to all kinds of temporary state clean-up, including things like decrementing a counter that was incremented in the constructor. The GC, by contrast, runs on a background thread and is not suitable for all the deterministic clean-up scenarios that RAII can take care of. The CLR equivalent of RAII is IDisposable, and the C# language interface to it is using. In C++/CLI, the language interface to it is (naturally) destructors (which become Dispose methods) and the delete operator (which becomes a call to Dispose). Ref class objects declared "on the stack" are really just equivalent to the C# using pattern.
is the above code the right way to return objects like List<string^>^ from C++/CLI to C#?
Pretty much!
does using gcnew not mean that the garbage collector will take care of the objects and I don't have to care how and when to free them?
You don't have to free the memory. But if the class implements IDisposable, then that is a totally separate issue from memory allocation. You have to call Dispose manually before you abandon the object.
Be wary of finalizers - these are a way of hooking into the GC to get your own clean-up code to run when the GC collects your objects. But they are not really fit for general use in application code. They run from a thread that you don't control, at a time you don't control, and in an order that you don't control. So if one object's finalizer tries to access another object with a finalizer, the second object may already have been finalized. There is no way to control the ordering of these events. Most of the original uses of finalizers are nowadays covered by SafeHandle.
You don't have a finalizer in the ref class.
In C++/CLI the destructors are called either when you create an instance of the class on the stack (C++ style) and then it goes out of scope, or you use the delete operator.
The finalizers are called by the GC when it's time to finalize the object.
In C# the GC handles the destruction of all the objects (there's no delete operator) so there's no distinction between the destructor and finalizer.
So the "destructors" with the ~ behave like C++ destructors, not at all like C# destructors.
The "destructors" in a C++/CLI ref class are compiled into a .NET Dispose() method.
The equivalent of a C# destructor/finalizer is the finalizer method, which is defined with a ! (exclamation mark).
So, to avoid memory leaks, you need to define a finalizer:
!MatW()
{
    delete nativeMat;
    nativeMat = nullptr; // guards against a second delete
}
~MatW()
{
    this->!MatW();
}
See the MSDN article Destructors and Finalizers in Visual C++.
When should one use the dynamic keyword in C# 4.0? Is there a good example that explains its usage?
dynamic should be used only when not using it is painful, as with the MS Office libraries. In all other cases it should be avoided, because compile-time type checking is beneficial. The following are good situations for using dynamic:
Calling a JavaScript method from Silverlight.
COM interop.
Maybe reading XML or JSON without creating custom classes.
How about this? Something I've been looking for and was wondering why it was so hard to do without 'dynamic'.
interface ISomeData {}
class SomeActualData : ISomeData {}
class SomeOtherData : ISomeData {}
interface ISomeInterface
{
    void DoSomething(ISomeData data);
}
class SomeImplementation : ISomeInterface
{
    public void DoSomething(ISomeData data)
    {
        dynamic specificData = data;
        HandleThis(specificData);
    }
    private void HandleThis(SomeActualData data)
    { /* ... */ }
    private void HandleThis(SomeOtherData data)
    { /* ... */ }
}
You may just have to catch the runtime exception (a RuntimeBinderException) and handle it however you want if there is no overload that takes the concrete type.
The equivalent without dynamic would be:
public void DoSomething(ISomeData data)
{
    if (data is SomeActualData)
        HandleThis((SomeActualData)data);
    else if (data is SomeOtherData)
        HandleThis((SomeOtherData)data);
    ...
    else
        throw new SomeRuntimeException();
}
As described here, dynamic can make poorly designed external libraries easier to use: Microsoft provides the example of the Microsoft.Office.Interop.Excel assembly.
With dynamic, you can avoid a lot of annoying explicit casting when using this assembly.
Also, contrary to what user2415376 suggests, it is definitely not a way to handle interfaces, since polymorphism has been part of the language from its earliest days!
You can use
ISomeData specificData = data;
instead of
dynamic specificData = data;
Plus it will make sure that you do not pass a wrong type of data object instead.
Check this blog post which talks about dynamic keywords in c#. Here is the gist:
The dynamic keyword is powerful indeed, it is irreplaceable when used with dynamic languages but can also be used for tricky situations while designing code where a statically typed object simply will not do.
Consider the drawbacks:
There is no compile-time type checking. This means that unless you have 100% confidence in your unit tests (cough), you are running a risk.
The dynamic keyword uses more CPU cycles than your old-fashioned statically typed code because of the additional runtime overhead. If performance is important to your project (it normally is), don't use dynamic.
Common mistakes include returning anonymous types wrapped in the dynamic keyword from public methods. Anonymous types are specific to an assembly, so returning them across assemblies (via public methods) will throw an error. Even though simple testing will catch this, you now have a public method that can be used only from specific places, and that's just bad design.
It's a slippery slope. Inexperienced developers itching to write something new and trying their best to avoid more classes (this is not necessarily limited to the inexperienced) will start using dynamic more and more if they see it in code. I would usually add a code analysis check for dynamic or look for it in code review.
Here is a recent case in which using dynamic was a straightforward solution. This is essentially 'duck typing' in a COM interop scenario.
I had ported some code from VB6 into C#. This ported code still needed to call other methods on VB6 objects via COM interop.
The classes needing to be called looked like this:
class A
{
    void Foo() {...}
}
class B
{
    void Foo() {...}
}
(That is, this is how the VB6 classes looked in C# via COM interop.)
Since A and B are independent of each other, you can't cast one to the other, and they have no common base class (COM doesn't support that AFAIK, and VB6 certainly didn't). They also did not implement a common interface (see below).
The original VB6 code which was ported did this:
' Obj must be either an A or a B
Sub Bar(Obj As Object)
    Call Obj.Foo()
End Sub
Now in VB6 you can pass things around as Object and the runtime will figure out if those objects have method Foo() or not. But in C# a literal translation would be:
// Obj must be either an A or a B
void Bar(object Obj)
{
    Obj.Foo();
}
Which will NOT work. It won't compile because object does not have a method called "Foo", and C# being typesafe won't allow this.
So the simple "fix" was to use dynamic, like this:
// Obj must be either an A or a B
void Bar(dynamic Obj)
{
    Obj.Foo();
}
This defers type safety until runtime, but assuming you've done it right works just fine.
I wouldn't endorse this for new code, but in this situation (which I think is not uncommon judging from other answers here) it was valuable.
Alternatives considered:
Using reflection to call Foo(). Probably would work, but more effort and less readable.
Modifying the VB6 library wasn't on the table here, but maybe there could be an approach to define A and B in terms of a common interface, which VB6 and COM would support. But using dynamic was much easier.
Note: This probably will turn out to be a temporary solution. Eventually if the remaining VB6 code is ported over then a proper class structure can be used.
I would like to quote an excerpt from a CodeProject post:
Why use dynamic?
In the statically typed world, dynamic gives developers a lot of rope to hang themselves with. When dealing with objects whose types can be known at compile time, you should avoid the dynamic keyword at all costs. Earlier, I said that my initial reaction was negative, so what changed my mind? To quote Margaret Atwood, context is all. When statically typing, dynamic doesn't make a stitch of sense. If you are dealing with an unknown or dynamic type, it is often necessary to communicate with it through Reflection. Reflective code is not easy to read, and has all the pitfalls of the dynamic type above. In this context, dynamic makes a lot of sense.
Some of the characteristics of the dynamic keyword are:
Dynamically typed: the type of the variable is resolved at run time rather than by the compiler.
No need to initialize at the time of declaration, e.g.:
dynamic str;
str = "I am a string"; // works fine and compiles
str = 2; // works fine and compiles
Errors are caught at run time.
IntelliSense is not available, since the type and its related methods and properties can be known at run time only. [https://www.codeproject.com/Tips/460614/Difference-between-var-and-dynamic-in-Csharp]
It is definitely a bad idea to use dynamic in all cases where it can be used. This is because your programs will lose the benefits of compile-time checking and they will also be much slower.
I come from a managed world, and C++ automatic memory management is quite unclear to me.
If I understand correctly, I encapsulate a pointer within a stack object, and when the auto_ptr goes out of scope, it automatically calls delete on the pointed-to object?
What kind of use should I make of it, and how can I naturally avoid the inherent C++ problems?
auto_ptr is the simplest implementation of RAII in C++. Your understanding is correct, whenever its destructor is called, the underlying pointer gets deleted.
This is one step up from C, where you don't have destructors and any meaningful RAII is impossible.
A next step up towards automagic memory management is shared_ptr. It uses reference counting to keep track of whether or not the object is alive. This allows the programmer to create the objects a bit more freely, but still not as powerful as the garbage collection in Java and C#. One example where this method fails is circular references. If A has a ref counted pointer to B and B has a ref counted pointer to A, they will never get destructed, even though no other object is using either.
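The circular-reference problem can be reproduced in a few lines. This is a sketch using std::shared_ptr and std::weak_ptr from the standard library (the Boost versions behave the same way); Node and the two function names are hypothetical. Each function returns whether object A is still alive after every local smart pointer has gone out of scope.

```cpp
#include <memory>

struct Node {
    std::shared_ptr<Node> next;  // owning link
    std::weak_ptr<Node> prev;    // non-owning back link
};

// A and B own each other through `next`.
bool strong_cycle_keeps_objects_alive() {
    std::weak_ptr<Node> watch;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;      // cycle of owning pointers
        watch = a;        // observe A without owning it
    }
    bool alive = !watch.expired();
    if (auto a = watch.lock()) {
        a->next.reset();  // break the cycle by hand so the sketch itself does not leak
    }
    return alive;
}

// Same shape, but B points back at A through a weak_ptr.
bool weak_back_link_keeps_objects_alive() {
    std::weak_ptr<Node> watch;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->prev = a;      // does not add to A's reference count
        watch = a;
    }
    return !watch.expired();
}
```

The first function returns true: neither count ever reaches zero, so both nodes would outlive the scope forever if the cycle were not broken by hand. The second returns false, because the weak back link lets A's count drop to zero, and destroying A then releases B.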
Modern object-oriented languages use some variation of mark and sweep. This technique can handle circular references and is reliable enough for most programming tasks.
Yes, std::auto_ptr calls delete on its content when it goes out of scope. You use auto_ptr only if no shared ownership takes place.
auto_ptr isn't particularly flexible: you can't use it with objects created with new[] or anything else.
Shared ownership is usually handled with shared pointers, for which Boost, for example, provides implementations. The most common approach, used for example in Boost's shared_ptr, employs a reference counting scheme and cleans up the pointee when the last smart pointer goes out of scope.
shared_ptr has one big advantage - it lets you specify custom deleters. With that you can basically put every kind of resource in it and just have to specify what deleter it should use.
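As a rough illustration of a custom deleter, the sketch below wraps a made-up C-style handle API. Handle, acquire_handle, release_handle and open_handles are all hypothetical stand-ins for whatever library you are wrapping. The deleter is passed as the second constructor argument of std::shared_ptr (boost::shared_ptr accepts one in the same position).

```cpp
#include <memory>

// Hypothetical C-style API standing in for a real handle-based library.
struct Handle { int id; };
static int open_handles = 0;

Handle* acquire_handle(int id) {
    ++open_handles;
    return new Handle{id};
}

void release_handle(Handle* h) {
    --open_handles;
    delete h;
}

// Returns the number of open handles while two shared_ptrs share one handle.
int handles_open_while_shared() {
    // The second argument is the deleter to call instead of plain delete.
    std::shared_ptr<Handle> first(acquire_handle(7), release_handle);
    std::shared_ptr<Handle> second = first;   // shares ownership, no new handle
    return open_handles;
}   // release_handle runs exactly once here, when the last owner goes away
```

While the two pointers are alive exactly one handle is open, and after the function returns open_handles is back to zero.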
Here's how you use a smart pointer. For the sake of example, I'll be using a shared_ptr.
{
    shared_ptr<Foo> foo(new Foo);
    // do things with foo
}
// foo's value is released here
Pretty much all smart pointers aim to achieve something similar to the above, in that the object being held in the smart pointer gets released at the end of the smart pointer's scope. However, there are three types of smart pointers that are widely used, and they have very different semantics on how ownership is handled:
shared_ptr uses "shared ownership": the shared_ptr can be held by more than one scope/object, and they all own a reference to the object. When the last reference falls off, the object is deleted. This is done using reference counting.
auto_ptr uses "transferable ownership": the auto_ptr's value can be held only in one place, and each time the auto_ptr is assigned, the assignee receives ownership of the object, and the assigner loses its reference to the object. If an auto_ptr's scope is exited without the object being transferred to another auto_ptr, the object is deleted. Since there is only one owner of the object at a time, no reference counting is needed. (auto_ptr was deprecated in C++11 and removed in C++17 in favor of unique_ptr.)
unique_ptr/scoped_ptr use "exclusive ownership": the object has exactly one owner at a time. boost::scoped_ptr cannot be transferred at all; std::unique_ptr cannot be copied, but its ownership can be handed over explicitly with std::move. When the program leaves the scope of the owning pointer, the object is deleted, no questions asked.
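The difference between these ownership models shows up directly in code. The sketch below uses the standard-library std::unique_ptr and std::shared_ptr and assumes a C++14 compiler for std::make_unique; Foo, consume and the other function names are hypothetical. Note that std::unique_ptr cannot be copied, but its ownership can be handed over explicitly with std::move.

```cpp
#include <memory>
#include <utility>

struct Foo {
    explicit Foo(int v) : value(v) {}
    int value;
};

// Takes the unique_ptr by value, so the caller must hand over ownership.
int consume(std::unique_ptr<Foo> foo) {
    return foo->value;
}   // the Foo is deleted here, at the end of its new owner's scope

// Single owner: copying is rejected, moving transfers ownership.
bool unique_ownership_moves() {
    auto foo = std::make_unique<Foo>(42);
    // consume(foo);                    // would not compile: unique_ptr cannot be copied
    int seen = consume(std::move(foo)); // explicit transfer of ownership
    return seen == 42 && foo == nullptr; // the source is left empty
}

// Shared ownership: copying adds an owner and bumps the reference count.
long owners_of_shared_object() {
    auto a = std::make_shared<Foo>(1);
    auto b = a;
    return b.use_count();
}
```

After the move the original unique_ptr compares equal to nullptr, and the shared object reports two owners while both copies are alive.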
It's a lot to take in, I'll grant, but I hope it'll all sink in soon. Hope it helps!
You should use boost::shared_ptr instead of std::auto_ptr.
auto_ptr and shared_ptr simply keep an instance of the pointer, and because they are local stack objects they are destroyed when they go out of scope. When they are destroyed, their destructor calls delete on the internal pointer.
Here is a simple example; the actual shared_ptr and auto_ptr are more sophisticated (they have methods for assignment and conversion/access to the internal pointer):
#include <iostream>
using namespace std;

// Minimal owning pointer: deletes the pointee in its destructor.
template <typename T>
struct myshrdptr
{
    T * t;
    myshrdptr(T * p) : t(p) {}
    ~myshrdptr()
    {
        cout << "myshrdptr deallocated" << endl;
        delete t;
    }
    T * operator->() { return t; }
};
struct AB
{
    void dump() { cout << "AB" << endl; }
};
void testShrdptr()
{
    myshrdptr<AB> ab(new AB());
    ab->dump();
    // ab out of scope destructor called
    // which calls delete on the internal pointer
    // which deletes the AB object
}
From somewhere else:
int main()
{
    testShrdptr();
    cout << "done ..." << endl;
}
The output is something like this (you can see that the destructor is called):
AB
myshrdptr deallocated
done ...
Rather than trying to understand auto_ptr and its relation to garbage-collected references, you should really try to see the underlying pattern:
In C++, all local objects have their destructors called when they go out of scope. This can be harnessed to clean up memory. For example, we could write a class which, in its constructor, is given a pointer to heap-allocated memory, and in its destructor, frees this pointer.
That is pretty much what auto_ptr does. (Unfortunately, auto_ptr also has some notoriously quirky semantics for assignment and copying.)
It's also what boost::shared_ptr or other smart pointers do. There's no magic to any of those. They are simply classes that are given a pointer in their constructor, and, as they're typically allocated on the stack themselves, they'll automatically go out of scope at some point, and so their destructor is called, which can delete the pointer you originally passed to the constructor. You can write such classes yourself. Again, no magic, just a straightforward application of C++'s lifetime rules: When a local object goes out of scope, its destructor is called.
Many other classes cut out the middleman and simply let the same class do both allocation and deallocation. For example, std::vector calls new as necessary to create its internal array -- and in its destructor, it calls delete to release it.
When the vector is copied, it takes care to allocate a new array, and copy the contents from the original one, so that each object ends up with its own private array.
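A short sketch makes the copy behaviour concrete. The function name is hypothetical; everything else is plain std::vector.

```cpp
#include <vector>

// Copies a vector, modifies the copy, and checks the original is untouched.
bool vector_copies_are_independent() {
    std::vector<int> original{1, 2, 3};
    std::vector<int> copy = original;   // allocates a new array and copies the elements
    copy[0] = 99;                       // only the copy's array changes
    return original[0] == 1
        && copy[0] == 99
        && original.data() != copy.data();   // two distinct internal arrays
}
```

Both vectors release their own array when they go out of scope, and no new or delete appears anywhere in the calling code.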
auto_ptr, or smart pointers in general, aren't the holy grail. They don't "solve" the problem of memory management. They are one useful part of the recipe, but to avoid memory management bugs and headaches, you need to understand the underlying pattern (commonly known as RAII) -- that is, whenever you have a resource allocation, it should be tied to a local variable which is given responsibility for also cleaning it up.
Sometimes, this means calling new yourself to allocate memory, and then passing the result to an auto_ptr, but more often, it means not calling new in the first place -- simply create the object you need on the stack, and let it call new as required internally. Or perhaps, it doesn't even need to call new internally. The trick to memory management is really to just rely on local stack-allocated objects instead of heap allocations. Don't use new by default.
Choose an imperative language (such as C, C++, or Ada) that provides pointer types.
Redesign that language to abolish pointer types, instead allowing programmers to define recursive types directly.
Consider carefully the issue of copy semantics vs reference semantics. Implement an interpreter for the language using DrRacket.