I know C# allows you to use pointers in the unsafe context. But does Java have some similar memory access method?
Java does not have pointers (for good reasons), so if there is a similar memory access method, what would it be exactly?
Well, there is a sun.misc.Unsafe class. It allows direct memory access, so you can implement some magic like reinterpret casts and so on. The catch is that you need a hacky reflection approach to get the instance, and the class is not really well documented. In general you need a very good reason to use this kind of tool in production code.
Here's an example of how to get it:
import java.lang.reflect.Field;
import sun.misc.Unsafe;
Field f = Unsafe.class.getDeclaredField("theUnsafe"); // the private singleton instance
f.setAccessible(true);
Unsafe unsafe = (Unsafe) f.get(null);
There are 105 methods in total, covering all kinds of low-level operations. These are the ones devoted to direct memory access:
allocateMemory
copyMemory
freeMemory
getAddress
getInt
Edit: this approach may be incompatible with future versions of OpenJDK or any other JVM implementation, as it is not part of the public API. Although a lot of OpenJDK code uses Unsafe, its implementation is still subject to change without any notice. Thanks to all who pointed this out in the comments.
No, you can't.
The closest you can get is to use JNI and call a C function that returns the data at a given memory location.
Standard Java gives you access to unmanaged memory through direct byte buffers. You can allocate a direct buffer in Java with the ByteBuffer.allocateDirect method, or in C or C++ with the NewDirectByteBuffer JNI function. You cannot access arbitrary memory locations, but access to direct buffers is enough for most purposes, especially since NewDirectByteBuffer permits wrapping a completely arbitrary memory region in a ByteBuffer.
No, Java does not have any in-language mechanism for accessing arbitrary memory locations. JNI can perform whatever (unsafe and unmanaged) operations the OS allows, but that's as close as you get.
Related
I am doing a project in C# which could benefit from a linear algebra package. I've looked at the ones out there, but they either cost money or aren't very good, so I decided to write my own.
I read that C++ arrays are much faster than C# arrays, but that it was possible to get similar performance using pointer arrays in C#, although they are considered "unsafe." I'm curious to know how C++ pointers differ, and if the "unsafe-ness" applies to C++ as well, or if they are two fundamentally different things.
Both C# (unsafe) pointers and C++ (raw) pointers have the following characteristics:
They allow you to reference an address in a given address space.
They allow you to perform simple arithmetic operations (addition and subtraction) on them, involving integers as offsets.
They allow you to dereference whatever they point to as data of a particular type.
Wrong usage of them can invoke undefined behavior, making it exclusively your responsibility to ensure that you're using them correctly.
In that sense, and regardless of any minor differences (like syntax, pinning, etc), C# pointers and C++ pointers are pretty much the same programming concept. Therefore, they lend themselves to static analysis pretty much equally and thus they are equally safe or unsafe. So the fact that C# explicitly calls this construct out as unsafe doesn't make the equivalent C++ construct "safe". Rather, the ability to use "unsafe" code is "always on" in C++.
As an example, consider the case where you attempt to access an array using an index that's out of bounds:
With a C# array you will get an exception when using the indexer syntax and you will invoke undefined behavior when using a pointer and an offset (see the sketch after this list).
With a C-style array in C++ you will invoke undefined behavior when using either the indexer syntax or a pointer and an offset (because those two syntaxes are equivalent for C-style arrays).
With a C++11 std::array you will get an exception when using array::at and you will invoke undefined behavior when using the indexer syntax.
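To make the C# case from the first bullet concrete, here is a hedged sketch (not from the original answer; assumes compilation with /unsafe):
int[] numbers = new int[4];
try
{
    int x = numbers[10];           // indexer syntax: throws IndexOutOfRangeException
}
catch (IndexOutOfRangeException) { }

unsafe
{
    fixed (int* p = numbers)
    {
        int y = p[10];             // pointer + offset: undefined behavior, no exception
    }
}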
Roughly speaking (and it is a very crude approximation), a C# unsafe pointer is the same sort of thing as a C++ pointer.
With both, there is a lot more responsibility on the programmer to get it right, whereas with normal C#, if you get things wrong, the worst that will happen is that an exception will be thrown. The run-time checks that give those guarantees cost performance, but if you switch them off, you are on your own.
In particular, unsafe in C# means can break out of the managed sandbox and execute native code. This then means your managed code can generate unmanaged crashes. This is also why unsafe code is not allowed without full trust (but you probably don't have to deal with partial trust code anymore).
You're probably wondering how you could run unmanaged code with an int* in C#. Easy: deliberate stack smashing. Point it at an integer on the stack and overwrite the next few slots (including the return address) with the address of a byte array containing native code.
unsafe means that .NET grants you access to memory that you did not necessarily allocate. Bounds checking is turned off, which allows some optimizations in the JIT compiler.
Pointers in C++ work the same way in general: they grant you access to any region of memory. You can implement bounds checking with operator overloading on a wrapper type, but it isn't the default for raw pointers.
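As a hedged illustration of what turning bounds checking off looks like in practice (a sketch, not part of the original answer; assumes compilation with /unsafe), the pointer loop below skips the per-element range checks that the managed indexer performs:
// Sums an int[] through a pinned pointer; no range check on each access.
public static unsafe long SumUnsafe(int[] data)
{
    long sum = 0;
    fixed (int* p = data)              // pin the array so the GC cannot move it
    {
        for (int i = 0; i < data.Length; i++)
            sum += p[i];               // raw pointer offset, no bounds check
    }
    return sum;
}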
I'm basically a C++ guy trying to venture into C#. From the basic tutorials of C#, I found that all objects are created and stored dynamically (this is also true for Java) and are accessed by references, hence there's no need for copy constructors. There is also no need for a bitwise copy when passing objects to a function or returning objects from a function. This makes C# much simpler than C++.
However, I read somewhere that operating on objects exclusively through references imposes limitations on the type of operations one can perform, denying the programmer complete control. One limitation is that the programmer cannot precisely specify when an object will be destroyed.
Can someone please elaborate on other limitations? (with a sample code if required)
Most of the "limitations" are by design rather than considered a deficiency (you may not agree of course)
You cannot determine/you don't have to worry about
when an object is destroyed
where the object is in memory
how big it is (unless you are tuning the application)
using pointer arithmetic
accessing memory outside an object
accessing an object with the wrong type
sharing objects between threads is simpler
whether the object is on the stack or the heap. (The stack is being used more and more in Java)
fragmentation of memory (This is not true of all collectors)
Because of the garbage collection done in Java, we cannot predict when an object will get destroyed, but it performs the work of a destructor.
If you want to free up some resources, you can use a finally block:
try {
} finally {
    // dispose resources.
}
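In C# specifically, the idiomatic equivalent is a using block over an IDisposable resource, which calls Dispose deterministically; a minimal sketch (the file name is illustrative):
using (var stream = new System.IO.FileStream("data.bin", System.IO.FileMode.Open))
{
    // work with the stream
}   // Dispose() runs here, even if an exception was thrown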
Having made a similar transition, I found that the more you look into it, the more you do have to think about C#'s GC behaviour in all but the most straightforward cases. This is especially true when trying to handle unmanaged resources from managed code.
This article highlights a lot of the issues you may be interested in.
Personally I miss a reference counted alternative to IDisposable (more like shared_ptr), but that's probably a hangover from a C++ background.
The more I have to write my own plumbing to support C++-style programming, the more likely it is that there's another C# mechanism I've overlooked, or I just end up getting frustrated with C#. For example, swap and move are not common idioms in C# as far as I've seen, and I miss them; other programmers with a C# background may well disagree about how useful those idioms are.
I was reading through 2010 CWE/SANS Top 25 Most Dangerous Programming Errors and one of the entries is for Buffer Copy without Checking Size of Input. It suggests using a language with features to prevent or mitigate this problem, and says:
For example, many languages that perform their own memory management, such as Java and Perl, are not subject to buffer overflows. Other languages, such as Ada and C#, typically provide overflow protection, but the protection can be disabled by the programmer.
I was not aware that Java and C# differed in any meaningful way with regard to memory management. How is it that Java is not subject to buffer overflows, while C# only protects against overflows? And how is it possible to disable this protection in C#?
Java does not support raw pointers (strictly speaking, it does not support pointer arithmetic).
In C#, you can use unsafe code and pointers, and unmanaged memory, which makes buffer overruns possible. See unsafe keyword.
To maintain type safety and security, C# does not support pointer arithmetic, by default. However, by using the unsafe keyword, you can define an unsafe context in which pointers can be used. For more information about pointers, see the topic Pointer types.
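As a hedged illustration (not part of the quoted documentation), the sketch below writes past the end of a stack-allocated buffer inside an unsafe block; nothing stops it at runtime, which is exactly the class of bug that bounds-checked array access prevents:
unsafe
{
    byte* buffer = stackalloc byte[8];
    for (int i = 0; i < 16; i++)    // deliberately runs past the 8-byte buffer
    {
        buffer[i] = 0xFF;           // indices 8..15 overwrite adjacent stack memory
    }
}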
Good answers. I would add that Java depends on usage of stack or heap memory locations, and C# does as well. The idea of using raw pointers is an addition to C# that comes from its C heritage. Although C# and C/C++ are not the same language, they do share some common semantics. Using "unsafe" code lets you keep large objects off the managed heap, where a 32-bit runtime instance is limited to roughly 2 GB (per CLR for C#, per JVM instance for Java), without incurring dramatic performance degradation due to garbage collection. In some cases you can use C#'s ability to leverage unsafe or manually managed memory to work around the fact that there are nowhere near as many third-party tools for problems like off-heap caching.
I would caution that if you do use unsafe code, be sure to get familiar with "Disposable Types" and "Finalizers". This can be a rather advanced practice, and the ramifications of not disposing of your objects properly are the same as with C code: the dreaded MEMORY LEAK. The repercussion is that your app runs out of memory and falls over (not good). That is why C# does not allow this by default and requires you to mark any use of manually controlled pointers with the "unsafe" keyword, which ensures that any manually handled memory is intentional. Put on your C-code hat when dealing with the "unsafe" keyword.
A great reference for this is the chapter "Understanding Object Lifetime" in "Pro C# 2010 and the .Net Platform" by Andrew Troelsen. If you prefer online references, see the MSDN article Implementing Finalize and Dispose to Clean Up Unmanaged Resources.
One final note: unmanaged memory is released in the finalizer portion of your object (~ObjectName() {...}). These patterns do add performance overhead, so if you are dealing with low-latency scenarios you may be best served by keeping objects light. If you are dealing with human response times, then you should be fine using this where absolutely necessary.
We have a certain application written in C# and we would like it to stay that way. The application manipulates many small and short-lived chunks of dynamic memory. It also appears to be sensitive to GC interruptions.
We think that one way to reduce GC pressure is to allocate 100K chunks and then allocate memory from them using a custom memory manager. Has anyone encountered custom memory manager implementations in C#?
Perhaps you should consider using some sort of pooling architecture, where you preallocate a number of items up front then lease them from the pool. This keeps the memory requirements nicely pinned. There are a few implementations on MSDN that might serve as reference:
http://msdn2.microsoft.com/en-us/library/bb517542.aspx
http://msdn.microsoft.com/en-us/library/system.net.sockets.socketasynceventargs.socketasynceventargs.aspx
...or I can offer my generic implementation if required.
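For a rough idea of the shape such a pool might take, here is a hedged sketch (illustrative only, not the answerer's implementation or the MSDN ones linked above):
using System.Collections.Generic;

// Minimal object pool: preallocate items up front, then lease and return them.
// Not thread-safe; wrap the stack accesses in a lock for concurrent use.
public class SimplePool<T> where T : new()
{
    private readonly Stack<T> _items = new Stack<T>();

    public SimplePool(int size)
    {
        for (int i = 0; i < size; i++)
            _items.Push(new T());       // allocate everything once, up front
    }

    public T Lease()
    {
        return _items.Count > 0 ? _items.Pop() : new T();
    }

    public void Return(T item)
    {
        _items.Push(item);              // hand the instance back for reuse
    }
}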
Memory management of all types that descend from System.Object is performed by the garbage collector (with the exception of structures/primitives stored on the stack). In Microsoft's implementation of the CLR, the garbage collector cannot be replaced.
You can allocate some primitives on the heap manually inside of an unsafe block and then access them via pointers, but it's not recommended.
More likely, you should profile and then migrate classes to structures where appropriate.
The obvious option is using Marshal.AllocHGlobal and Marshal.FreeHGlobal. I also have a copy of the DougLeaAllocator (dlmalloc) written and battle-tested in C#. If you want, I can get that out to you. Either way will require careful, consistent usage of IDisposable.
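As a hedged sketch of what that careful, consistent IDisposable usage might look like (an illustrative wrapper, not the dlmalloc port mentioned above):
using System;
using System.Runtime.InteropServices;

// Wraps a block of unmanaged memory; Dispose (or the finalizer) frees it exactly once.
public sealed class UnmanagedBuffer : IDisposable
{
    private IntPtr _ptr;

    public UnmanagedBuffer(int bytes)
    {
        _ptr = Marshal.AllocHGlobal(bytes);   // raw allocation, invisible to the GC
    }

    public IntPtr Pointer
    {
        get { return _ptr; }
    }

    public void Dispose()
    {
        if (_ptr != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_ptr);
            _ptr = IntPtr.Zero;
        }
        GC.SuppressFinalize(this);
    }

    ~UnmanagedBuffer()
    {
        Dispose();   // safety net if Dispose was never called
    }
}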
The only time items are collected in garbage collection is when there are no more references to the object.
You should make a static class or something to keep a lifetime reference to the object for the life of the application.
If you want to manage your own memory, it is possible using unsafe code in C#, but you would be better off choosing an unmanaged language such as C++.
Although I don't have experience with it, you can try to write C# unmanaged code.
Or you can keep an object reachable (and therefore uncollectable) up to a given point in your code by calling
GC.KeepAlive(obj);
What functionality does the stackalloc keyword provide? When and Why would I want to use it?
From MSDN:
Used in an unsafe code context to allocate a block of memory on the stack.
One of the main features of C# is that you do not normally need to access memory directly, as you would do in C/C++ using malloc or new. However, if you really want to explicitly allocate some memory you can, but C# considers this "unsafe", so you can only do it if you compile with the unsafe setting. stackalloc allows you to allocate such memory.
You almost certainly don't need to use it for writing managed code. It is feasible that in some cases you could write faster code if you access memory directly - it basically allows you to use pointer manipulation which suits some problems. Unless you have a specific problem and unsafe code is the only solution then you will probably never need this.
Stackalloc will allocate data on the stack, which can be used to avoid the garbage that would be generated by repeatedly creating and destroying arrays of value types within a method.
public unsafe void DoSomeStuff()
{
    byte* unmanaged = stackalloc byte[100];
    byte[] managed = new byte[100];
    //Do stuff with the arrays
    //When this method exits, the unmanaged array gets immediately destroyed.
    //The managed array no longer has any handles to it, so it will get
    //cleaned up the next time the garbage collector runs.
    //In the mean-time, it is still consuming memory and adding to the list of crap
    //the garbage collector needs to keep track of. If you're doing XNA dev on the
    //Xbox 360, this can be especially bad.
}
Paul,
As everyone here has said, that keyword directs the runtime to allocate on the stack rather than the heap. If you're interested in exactly what this means, check out this article.
http://msdn.microsoft.com/en-us/library/cx9s2sy4.aspx
This keyword is used for unsafe memory manipulation. By using it, you have the ability to use pointers (a powerful and painful feature from C/C++).
stackalloc directs the .NET runtime to allocate memory on the stack.
Most other answers are focused on the "what functionality" part of OP's question.
I believe this answers the when and why:
When do you need this?
For the best worst-case performance with cache locality of multiple small arrays.
Now in an average app you won't need this, but for realtime sensitive scenarios it gives more deterministic performance: No GC is involved and you are all but guaranteed a cache hit.
(Because worst-case performance is more important than average performance.)
Keep in mind that the default stack size in .net is small though!
(I think it's 1MB for normal apps and 256kb for ASP.net?)
Practical use could for example include realtime sound processing.
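A hedged sketch of that kind of use (illustrative only; assumes an unsafe context and a small, bounded frame size so the stack isn't exhausted):
// Mixes two audio frames into a stack-allocated scratch buffer, so this
// hot per-frame call produces no garbage for the collector to track.
public static unsafe void MixFrames(short[] a, short[] b, short[] output)
{
    int length = output.Length;               // assume all three arrays are the same length
    int* scratch = stackalloc int[length];    // lives only for this call, reclaimed on return

    for (int i = 0; i < length; i++)
        scratch[i] = a[i] + b[i];             // accumulate in a wider type to avoid overflow

    for (int i = 0; i < length; i++)
        output[i] = (short)Math.Max(short.MinValue, Math.Min(short.MaxValue, scratch[i]));
}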
As Steve pointed out, it is only used in an unsafe code context (e.g. when you want to use pointers).
If you don't use unsafe code in your C# application, then you will never need this.