I was reading through the 2010 CWE/SANS Top 25 Most Dangerous Programming Errors, and one of the entries is Buffer Copy without Checking Size of Input. It suggests using a language with features that prevent or mitigate this problem, and says:
For example, many languages that
perform their own memory management,
such as Java and Perl, are not subject
to buffer overflows. Other languages,
such as Ada and C#, typically provide
overflow protection, but the
protection can be disabled by the
programmer.
I was not aware that Java and C# differed in any meaningful way with regard to memory management. How is it that Java is not subject to buffer overflows, while C# only protects against overflows? And how is it possible to disable this protection in C#?
Java does not support raw pointers (strictly speaking, it does not support pointer arithmetic), and every array access is bounds-checked at run time.
In C#, you can use unsafe code and pointers, and unmanaged memory, which makes buffer overruns possible. See unsafe keyword.
To maintain type safety and security,
C# does not support pointer
arithmetic, by default. However, by
using the unsafe keyword, you can
define an unsafe context in which
pointers can be used. For more
information about pointers, see the
topic Pointer types.
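The Java half of the quoted claim can be illustrated with a minimal sketch. The class and method names here are hypothetical; the only point is that a copy loop with no length check ends in an exception rather than a write past the end of the buffer.

```java
class BoundsCheckDemo {
    // Copies src into a fixed 4-element buffer without checking src.length first,
    // mimicking the "buffer copy without checking size of input" mistake.
    static String copyIntoSmallBuffer(int[] src) {
        int[] buffer = new int[4];
        try {
            for (int i = 0; i < src.length; i++) {
                buffer[i] = src[i]; // the JVM checks i against buffer.length here
            }
            return "copied";
        } catch (ArrayIndexOutOfBoundsException e) {
            // The out-of-range write never happens, so nothing outside the array is touched.
            return "rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println(copyIntoSmallBuffer(new int[] {1, 2, 3}));       // fits
        System.out.println(copyIntoSmallBuffer(new int[] {1, 2, 3, 4, 5})); // too long
    }
}
```

In C#, the same loop over a managed array behaves the same way; it is only inside an unsafe context, with pointers, that the check goes away.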
Good answers. I would add that Java depends on stack and heap memory locations, and C# does as well. Raw pointers are an addition to C# that comes from its C background. Although C# and C/C++ are not the same language, they do share some semantics. "Unsafe" code lets you avoid keeping large objects on the heap, where memory is limited to around 2GB per runtime instance (per CLR for C#, per JVM instance for Java), without incurring dramatic performance degradation due to garbage collection. In some cases you can use C#'s unsafe or manually managed memory pointers to get around the fact that there are not nearly as many third-party tools for problems like caching outside of the heap.
I would caution that if you do use unsafe code, be sure to get familiar with "Disposable Types" and "Finalizers". This can be a rather advanced practice, and the ramifications of not disposing of your objects properly are the same as with C code: the dreaded MEMORY LEAK. The repercussion is that your app runs out of memory and falls over (not good). That is why C# does not allow it by default and why you need to mark any usage of manually controlled pointers with the "unsafe" keyword. This ensures that any manually handled memory is intentional. Put on your C-code hat when dealing with the "unsafe" keyword.
A great reference on this is the chapter "Understanding Object Lifetime" in "Pro C# 2010 and the .NET Platform" by Andrew Troelsen. If you prefer online references, see the MSDN article Implementing Finalize and Dispose to Clean Up Unmanaged Resources.
One final note: unmanaged memory is released in the finalizer portion of your object (~ObjectName(){...}). These patterns do add performance overhead, so if you are dealing with low-latency scenarios you may be best served by keeping objects light. If you are dealing with human response times, you should be fine to consider this where absolutely necessary.
Related
I am doing a project in C#, which could benefit from a linear algebra package. I've looked at the ones out there, but I don't really want to pay, or I found them not very good. So I decided to write my own.
I read that C++ arrays are much faster than C# arrays, but that it was possible to get similar performance using pointer arrays in C#, although they are considered "unsafe." I'm curious to know how C++ pointers differ, and if the "unsafe-ness" applies to C++ as well, or if they are two fundamentally different things.
Both C# (unsafe) pointers and C++ (raw) pointers have the following characteristics:
They allow you to reference an address in a given address space.
They allow you to perform simple arithmetic operations (addition and subtraction) on them, involving integers as offsets.
They allow you to dereference whatever they point to as data of a particular type.
Wrong usage of them can invoke undefined behavior, making it exclusively your responsibility to ensure that you're using them correctly.
In that sense, and regardless of any minor differences (like syntax, pinning, etc), C# pointers and C++ pointers are pretty much the same programming concept. Therefore, they lend themselves to static analysis pretty much equally and thus they are equally safe or unsafe. So the fact that C# explicitly calls this construct out as unsafe doesn't make the equivalent C++ construct "safe". Rather, the ability to use "unsafe" code is "always on" in C++.
As an example, consider the case where you attempt to access an array using an index that's out of bounds:
With a C# array you will get an exception when using the indexer syntax and you will invoke undefined behavior when using a pointer and an offset.
With a C-style array in C++ you will invoke undefined behavior when using either the indexer syntax or a pointer and an offset (because those two syntaxes are equivalent for C-style arrays).
With a C++11 std::array you will get an exception when using array::at and you will invoke undefined behavior when using the indexer syntax.
Roughly speaking (and it is a very crude approximation), a C# unsafe pointer is the same sort of thing as a C++ pointer.
With both, there is a lot more responsibility on the programmer to get it right whereas with normal C# if you get things wrong, the worst that will happen is that an exception will be thrown. The run-time checks that give those guarantees cost performance, but if you switch them off - you are on your own.
In particular, unsafe in C# means you can break out of the managed sandbox and execute native code. This in turn means your managed code can generate unmanaged crashes. This is also why unsafe code is not allowed without full trust (but you probably don't have to deal with partial trust code anymore).
You're probably wondering how you can run unmanaged code with an int * in C#. Easy: deliberate stack smashing. Take the address of an integer on the stack, then overwrite the next few integers with the address of a byte array containing native code.
unsafe means that .NET grants you access to memory that you did not necessarily allocate. Bounds checking is turned off, which allows some optimizations in the JIT compiler.
In general, pointers are the same in C++; that is, they grant you access to any region of memory. You can implement bounds checking with operator overloading, but it isn't the default for pointers.
I know C# allows you to use pointers in the unsafe context. But does Java have some similar memory access method?
Java does not have pointers (for good reasons), so if there is a similar memory access method, what would it be exactly?
Well, there is a sun.misc.Unsafe class. It allows direct memory access, so you can implement some magic like reinterpret casts and so on. The catch is that you need a hacky reflection approach to get the instance, and the class is not really well documented. In general you need a very good reason to use this kind of tool in production code.
Here's an example of how to get it:
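import java.lang.reflect.Field;
import sun.misc.Unsafe;

Field f = Unsafe.class.getDeclaredField("theUnsafe"); // private static singleton field
f.setAccessible(true);                                 // bypass the private modifier
Unsafe unsafe = (Unsafe) f.get(null);                  // static field, so no instance is needed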
There are 105 methods, allowing different low-level stuff. These methods are devoted to direct memory access:
allocateMemory
copyMemory
freeMemory
getAddress
getInt
Edit: this method may be incompatible with future versions of OpenJDK or any other JVM implementation, as it is not part of the public API. Although a lot of OpenJDK code uses Unsafe, its implementation is still subject to change without notice. Thanks to all who pointed this out in the comments.
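As a rough sketch of how the memory methods listed above fit together, the following writes an int to off-heap memory and reads it back. The class name and helper methods are hypothetical, and the code assumes a JDK where sun.misc.Unsafe is still available; recent JDKs deprecate these memory-access methods for removal and may print warnings.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class UnsafeMemoryDemo {
    // Obtains the singleton through reflection, as in the snippet above.
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Writes a value to manually allocated memory and reads it back.
    static int roundTrip(int value) {
        Unsafe unsafe = loadUnsafe();
        long address = unsafe.allocateMemory(Integer.BYTES); // roughly malloc(4)
        try {
            unsafe.putInt(address, value);
            return unsafe.getInt(address);
        } finally {
            unsafe.freeMemory(address); // this memory is never garbage collected
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));
    }
}
```

Nothing checks the address passed to putInt or getInt, so writing outside the allocated four bytes is exactly the kind of overrun that ordinary Java code cannot produce.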
No, you can't.
The closest you can get is to use JNI and call a C function that returns the data at a given memory location.
Standard Java gives you direct access to unmanaged memory through direct memory buffers. You can allocate a direct memory buffer in Java with the ByteBuffer.allocateDirect method, or in C or C++ with the NewDirectByteBuffer JNI function. You cannot access arbitrary memory locations, but access to direct memory buffers is enough for most purposes, especially since the NewDirectByteBuffer function permits wrapping a completely arbitrary memory region in a ByteBuffer.
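A minimal sketch of the Java side of this, using only ByteBuffer.allocateDirect (the class and method names are made up for illustration). It also shows that, unlike a raw pointer, a direct buffer still checks every index against its limit.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class DirectBufferDemo {
    // Stores an int in off-heap memory and reads it back.
    static int writeAndRead(int value) {
        ByteBuffer buf = ByteBuffer.allocateDirect(16).order(ByteOrder.nativeOrder());
        buf.putInt(0, value); // absolute write at byte offset 0
        return buf.getInt(0);
    }

    // Returns true if a read past the end of the buffer is refused.
    static boolean outOfRangeIsRejected() {
        ByteBuffer buf = ByteBuffer.allocateDirect(16);
        try {
            buf.getInt(16); // would need bytes 16..19, but the limit is 16
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAndRead(1234));
        System.out.println(outOfRangeIsRejected());
    }
}
```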
No, Java does not have any in-language mechanism for accessing arbitrary memory locations. JNI can perform whatever (unsafe and unmanaged) operations the OS allows, but that's as close as you get.
I have to compile my assembly with /unsafe in order to use a pointer. I wonder what changes when I compile with /unsafe. Please assume that there are no programming faults such as invalid use of pointers. Do I lose any performance if I use an assembly compiled as unsafe? Are there any memory drawbacks?
Well, by using "unsafe" code you basically improve performance, through direct access to memory and pointer arithmetic. The usual use case is .NET code focused on high performance, for example a 3D rendering engine kernel. Writing something like that in 100% managed .NET code would make the application too slow, so pointers come to the rescue, especially when we need to deal with "bridges" to C/C++ libraries such as OpenGL.
Long story short: you will definitely benefit from it, if you write good unmanaged code.
Unsafe code may increase an application's performance by removing array bounds checks.
Using unsafe code introduces security and stability risks.
Link : http://msdn.microsoft.com/en-us/library/chfa2zb8.aspx
I'm trying to learn digital image processing, and I found that my friend uses C#. He has one very important reason for using C#: it has the unsafe keyword, and the performance of his code (the algorithm part) can reach 75% of the same code in C++, which is good enough for him.
He encourages me to switch to C#, but I have been a Java programmer for many years. I know there is an Unsafe class in Java too, but I have never used it, and I'm not sure whether its performance is as good as C#'s.
So I want to know the performance of Unsafe in java, and is it a good idea to use Java for image processing?
UPDATE
I would use unsafe code only for some performance-sensitive tasks, not everywhere.
Unsafe means you can avoid all the overheads in a managed environment. All the range and type checking, Garbage collection, reflection etc. Whether your code will be faster using unsafe all depends on what you wrote. I dare say the main optimisation point would be processing large blocks of raw memory as opposed to say a list of pixel classes or structs which OO would lead you towards.
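The "large blocks of raw memory" layout does not strictly require unsafe code. As a hedged sketch in plain Java, an image can be held in one flat primitive array instead of a list of pixel objects; the class name, method name, and the 8-bit grayscale format are assumptions made for illustration, and no performance figure is claimed here.

```java
import java.util.Arrays;

class FlatPixelDemo {
    // Inverts 8-bit grayscale pixels stored one per int in a single flat array.
    // There is no per-pixel object, so the loop walks contiguous memory.
    static void invertInPlace(int[] pixels) {
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] = 255 - pixels[i];
        }
    }

    public static void main(String[] args) {
        int[] image = {0, 64, 255};
        invertInPlace(image);
        System.out.println(Arrays.toString(image)); // prints [255, 191, 0]
    }
}
```

Each access is still bounds-checked; whether that overhead matters for a given algorithm is something to measure rather than assume.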
I love C#, but choosing one language over another because of a single feature with a very limited scope seems a very weak argument to me. Don't pick a language based on your friend's opinions, but based on your needs and preferences! A programming language is just a tool. Would you seriously dump years of experience just like that? Use the language you're most comfortable with.
Check this discussion here; it covers both the positive and negative points.
This one is taken from a C# article, but I think it applies well to Java too (check here):
In unsafe code, or in other words unmanaged code, it is possible to declare and use pointers. But why do we write unmanaged code? If we want to write code that interfaces with the operating system, access a memory-mapped device, or implement a time-critical algorithm, then the use of pointers can give lots of advantages.
But there are some disadvantages to using pointers too. If pointers were chosen to be 32-bit quantities at compile time, the code would be restricted to 4 GB of address space, even if it were run on a 64-bit machine. If pointers were chosen at compile time to be 64 bits, the code could not be run on a 32-bit machine.
What functionality does the stackalloc keyword provide? When and why would I want to use it?
From MSDN:
Used in an unsafe code context to allocate a block of memory on the
stack.
One of the main features of C# is that you do not normally need to access memory directly, as you would do in C/C++ using malloc or new. However, if you really want to explicitly allocate some memory you can, but C# considers this "unsafe", so you can only do it if you compile with the unsafe setting. stackalloc allows you to allocate such memory.
You almost certainly don't need to use it for writing managed code. It is feasible that in some cases you could write faster code if you access memory directly - it basically allows you to use pointer manipulation which suits some problems. Unless you have a specific problem and unsafe code is the only solution then you will probably never need this.
stackalloc will allocate data on the stack, which can be used to avoid the garbage that would be generated by repeatedly creating and destroying arrays of value types within a method.
public unsafe void DoSomeStuff()
{
    byte* unmanaged = stackalloc byte[100];
    byte[] managed = new byte[100];

    // Do stuff with the arrays

    // When this method exits, the unmanaged array gets immediately destroyed.
    // The managed array no longer has any handles to it, so it will get
    // cleaned up the next time the garbage collector runs.
    // In the mean-time, it is still consuming memory and adding to the list of crap
    // the garbage collector needs to keep track of. If you're doing XNA dev on the
    // Xbox 360, this can be especially bad.
}
Paul,
As everyone here has said, that keyword directs the runtime to allocate on the stack rather than the heap. If you're interested in exactly what this means, check out this article.
http://msdn.microsoft.com/en-us/library/cx9s2sy4.aspx
This keyword is used for unsafe memory manipulation. By using it, you gain the ability to use pointers (a powerful and painful feature in C/C++).
stackalloc directs the .NET runtime to allocate memory on the stack.
Most other answers are focused on the "what functionality" part of OP's question.
I believe this answers the when and why:
When do you need this?
For the best worst-case performance with cache locality of multiple small arrays.
Now in an average app you won't need this, but for realtime-sensitive scenarios it gives more deterministic performance: no GC is involved and you are all but guaranteed a cache hit.
(Because worst-case performance is more important than average performance.)
Keep in mind that the default stack size in .NET is small, though!
(I think it's 1 MB for normal apps and 256 KB for ASP.NET?)
Practical use could for example include realtime sound processing.
As Steve pointed out, it is only used in an unsafe code context (e.g., when you want to use pointers).
If you don't use unsafe code in your C# application, then you will never need this.