I'm writing a C# application that takes a friendly process name (say 'notepad') and reads that process's memory. It works fine for reading bytes, but I have no idea whether those bytes are Int32s, chars, bools, or some other type of data. One of the first steps toward figuring that out is knowing how the data is padded. How can I determine the data alignment of the memory?
I've learned it isn't as simple as knowing the OS or processor. Different packings are supposedly possible even then: http://www.developerfusion.com/article/84519/mastering-structs-in-c/
So, is there some P/Invoke call I could use on the process handle to read such a value, or maybe an algorithm that reads some bytes and tests what it finds?
Motivation (in case someone has a better solution for my end goal): I don't want to look for potential Int32 values (or any other type) by checking relative addresses 0,1,2,3 and then 1,2,3,4 and so on if I can help it. If memory is, say, 4-byte aligned, I'd be wasting a lot of effort for nothing when I could just check 0,1,2,3 and skip straight to 4,5,6,7.
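If it helps, the aligned-scan idea can be sketched like this. The buffer, the planted values, and the 4-byte stride are all assumptions for illustration; real process memory gives no such guarantee, which is exactly the question being asked:

```csharp
using System;

class AlignedScan
{
    // Scan a captured buffer for candidate Int32 values at 4-byte-aligned
    // offsets only, instead of at every byte offset. This assumes the
    // target process lays out most Int32 fields on 4-byte boundaries,
    // which is the default packing but is not guaranteed.
    static void Main()
    {
        byte[] buffer = new byte[16];
        BitConverter.GetBytes(1234).CopyTo(buffer, 4);   // plant a value at offset 4
        BitConverter.GetBytes(5678).CopyTo(buffer, 12);  // and another at offset 12

        // Stride of 4 skips the unaligned offsets entirely.
        for (int offset = 0; offset + 4 <= buffer.Length; offset += 4)
        {
            int candidate = BitConverter.ToInt32(buffer, offset);
            if (candidate != 0)
                Console.WriteLine($"offset {offset}: {candidate}");
        }
    }
}
```

With a stride of 1 the same buffer would cost 13 reads instead of 4, which is the wasted effort described above.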
I'm not quite sure what you're trying to do, but my best guess is that you're hoping to dig around the process to find a bug or get an idea of what it's up to?
The best way to figure out the memory layout will be from the symbols (.pdb). Is this an app that you've written?
Assuming not, you might consider injecting a thread and then calling MiniDumpWriteDump(). This API can dump the memory to disk where you can browse it with windbg.
The idea here is to use the Microsoft public symbols (!symfix) and then go rooting around the memory looking for whatever you need. Having the symbols for the Microsoft bits will help you: with those you'll be able to figure out where threads/heaps/handles/etc. are located.
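A hedged sketch of what calling MiniDumpWriteDump from C# might look like. The P/Invoke declaration follows the dbghelp.h signature, but the MiniDump class, the Dump helper, and the choice of MiniDumpWithFullMemory (value 2) are illustrative assumptions; this is Windows-only and needs sufficient privileges on the target process:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;

static class MiniDump
{
    // P/Invoke for dbghelp!MiniDumpWriteDump (Windows only).
    [DllImport("dbghelp.dll", SetLastError = true)]
    static extern bool MiniDumpWriteDump(
        IntPtr hProcess, uint processId, SafeHandle hFile, uint dumpType,
        IntPtr exceptionParam, IntPtr userStreamParam, IntPtr callbackParam);

    // Hypothetical helper: dump another process's full memory to disk,
    // ready to be opened in windbg. 2 == MiniDumpWithFullMemory.
    public static void Dump(Process target, string path)
    {
        using var fs = new FileStream(path, FileMode.Create, FileAccess.ReadWrite);
        if (!MiniDumpWriteDump(target.Handle, (uint)target.Id, fs.SafeFileHandle,
                               2 /* MiniDumpWithFullMemory */,
                               IntPtr.Zero, IntPtr.Zero, IntPtr.Zero))
            throw new InvalidOperationException("MiniDumpWriteDump failed");
    }
}
```

Once you have the .dmp file, open it in windbg, run !symfix, and browse the heaps and threads from there.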
Related
I have a C# application that will continuously allocate memory for data stored in byte arrays. I have another process, written in Python, that will read from these arrays once they are instantiated. Both processes will be running on an Ubuntu machine.
The obvious solution seems to be to share memory between the processes by passing a pointer from the C# process to the python process. However, this has turned out to be difficult.
I've mainly looked at solutions proposed online. Two notable ones are named pipes and memory-mapped files. I read the following posts:
Sharing memory between C and Python. Suggested to be done via named pipes:
Share memory between C/C++ and Python
The C# application will neither read from nor write to the array, and the Python script will only read from it. Therefore, this solution doesn't satisfy my efficiency requirements and seems superfluous when the data is literally already stored in memory.
When I looked at memory-mapped files, it seemed as though we would have to allocate memory for these mapped files and write the data into them. However, the data will already be allocated before the mapped file is used, so this seems inefficient as well.
The second post:
https://learn.microsoft.com/en-us/dotnet/standard/io/memory-mapped-files?redirectedfrom=MSDN
The article says: "Starting with the .NET Framework 4, you can use managed code to access memory-mapped files in the same way that native Windows functions access memory-mapped files". Would an Ubuntu machine run into potential problems reading these files in the same way that Windows would? And if not, could someone give a simple example of using these mapped files between the languages mentioned above, including passing a reference to the mapped file between the processes, or point to where someone has already done this?
Or if someone knows how to directly pass a pointer to a byte array from C# to python, that would be even better if possible.
Any help is greatly appreciated!
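For what it's worth, the file-backed flavor of MemoryMappedFile does work on Linux under .NET Core and later, and a Python process can open the same file with the standard mmap module. A minimal sketch of the C# writer side, with /tmp/shared_buffer.bin as an assumed path (named, non-file-backed maps are Windows-only, so the map name is left null here):

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

class SharedBuffer
{
    // Write a byte array into a file-backed memory-mapped file; any other
    // process on the same machine can then map the same path and read the
    // bytes without them being copied through a pipe or socket.
    static void Main()
    {
        string path = "/tmp/shared_buffer.bin";   // assumed path
        byte[] data = Encoding.ASCII.GetBytes("hello from C#");

        using var mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Create, null, data.Length);
        using var accessor = mmf.CreateViewAccessor(0, data.Length);
        accessor.WriteArray(0, data, 0, data.Length);
        accessor.Flush();   // make the bytes visible to other processes
        Console.WriteLine($"wrote {data.Length} bytes to {path}");
    }
}
```

On the Python side, something like `mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)` over the same file would give the reader a zero-copy view; you would still need your own convention (a header byte, a named pipe message, etc.) to signal when the data is ready.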
So, after coming back to this post four months later, I did not find a solution that satisfied my efficiency needs.
I had tried to find a way around writing a large amount of data, already allocated in memory, to another process; doing so would mean reallocating that same data, taking up double the memory and adding overhead, even though the data would be read-safe. However, it seems the proper way to solve this, for two processes running on the same machine, is to use named pipes, as they are faster than e.g. sockets. As Holger stated, this ended up being a question of inter-process communication.
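A minimal sketch of the named-pipe approach that was settled on. The pipe name "bulkdata" and the RoundTrip helper are illustrative assumptions, and in the real setup the reading end would be the Python process rather than a client in the same program; .NET Core implements named pipes on Linux as well:

```csharp
using System;
using System.IO.Pipes;
using System.Text;
using System.Threading.Tasks;

class PipeDemo
{
    // Server writes a payload into a named pipe; a client on the same
    // machine connects and reads it back.
    public static async Task<string> RoundTrip(byte[] payload)
    {
        using var server = new NamedPipeServerStream("bulkdata", PipeDirection.Out);
        var writer = Task.Run(async () =>
        {
            await server.WaitForConnectionAsync();
            await server.WriteAsync(payload, 0, payload.Length);
        });

        using var client = new NamedPipeClientStream(".", "bulkdata", PipeDirection.In);
        await client.ConnectAsync();
        byte[] buffer = new byte[payload.Length];
        int read = await client.ReadAsync(buffer, 0, buffer.Length);
        await writer;
        return Encoding.ASCII.GetString(buffer, 0, read);
    }

    static async Task Main()
    {
        Console.WriteLine(await RoundTrip(Encoding.ASCII.GetBytes("chunk-1")));
    }
}
```

The data is copied through the pipe, which is the cost discussed above; pipes trade that copy for a much simpler synchronization story than shared memory.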
I ended up writing the whole application in Python, which happened to be the better alternative in the end anyway, probably saving me a lot of headache.
I'm having a REALLY tough time with something. I'm using C#, and I want to be able to check whether the virtual memory paging file is currently enabled (or has a maximum size greater than 0, whichever is easier to test).
I haven't found any way to do it (if there is one, please tell me!), but I thought I might be able to indirectly accomplish my goal by just checking either how much memory in the paging file my program/process is using or even how much it's allowed to use.
So I was using the Process.PagedMemorySize64 and Process.PagedSystemMemorySize64 properties, but they return a positive number even when I have the paging file disabled! I was told elsewhere that those properties report information about physical memory when a paging file is unavailable. Well, great, because that doesn't help me.
So if anyone knows how to actually get any information about how much memory the current process or whole computer is either using or allowed to use within a paging file, I'd appreciate it!
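One indirect option, sketched under the assumption that comparing system totals is acceptable: the Win32 GlobalMemoryStatusEx function reports ullTotalPageFile, which is the system commit limit including physical RAM, so on a machine with no paging file it comes out close to ullTotalPhys. The PagingInfo class and the comparison heuristic are illustrative, not an official "is the paging file enabled" API, and this is Windows-only:

```csharp
using System;
using System.Runtime.InteropServices;

static class PagingInfo
{
    // Mirrors the MEMORYSTATUSEX structure from winbase.h.
    [StructLayout(LayoutKind.Sequential)]
    struct MEMORYSTATUSEX
    {
        public uint dwLength;
        public uint dwMemoryLoad;
        public ulong ullTotalPhys, ullAvailPhys;
        public ulong ullTotalPageFile, ullAvailPageFile;
        public ulong ullTotalVirtual, ullAvailVirtual, ullAvailExtendedVirtual;
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool GlobalMemoryStatusEx(ref MEMORYSTATUSEX buffer);

    // Heuristic: if the commit limit exceeds physical memory, a paging
    // file is almost certainly contributing the difference.
    public static bool PagingLikelyEnabled()
    {
        var status = new MEMORYSTATUSEX { dwLength = (uint)Marshal.SizeOf<MEMORYSTATUSEX>() };
        if (!GlobalMemoryStatusEx(ref status))
            throw new InvalidOperationException("GlobalMemoryStatusEx failed");
        return status.ullTotalPageFile > status.ullTotalPhys;
    }
}
```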
P.S.: As a related question, does anyone else ever get an OutOfMemoryException while trying to write an array to a FileStream in a loop? I'm not actually allocating any memory at that point, and it only happens when I try to write huge files; the small ones write just fine.
My task is to provide random read access to a very large (50GB+) ASCII text file (processing requests for the nth line, or the nth word of the nth line) in the form of a C# console app.
After googling and reading for a few days, I've come to such vision of implementation:
Since StreamReader is good at sequential access, use it to build an index of lines/words in the file (a List<List<long>> map, where map[i][j] is the position where the jth word of the ith line starts). Then use the index to access the file through a MemoryMappedFile, since that is good at providing random access.
Are there some obvious flaws in the solution? Would it be optimal for a given task?
UPD: It will be executed on a 64-bit system.
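A rough sketch of the proposed design under stated assumptions (ASCII input, whitespace-separated words, a tiny sample file standing in for the 50GB one). One caveat worth baking in: track the offsets yourself while reading raw bytes, because StreamReader buffers internally and its BaseStream.Position does not point at the current character:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

class WordIndex
{
    // One sequential pass records the byte offset of every word start
    // (map[line][word]); since the file is ASCII, byte offsets and
    // character offsets coincide.
    public static List<List<long>> BuildIndex(string path)
    {
        var map = new List<List<long>>();
        var line = new List<long>();
        bool inWord = false;
        long pos = 0;
        using var fs = new FileStream(path, FileMode.Open, FileAccess.Read);
        int b;
        while ((b = fs.ReadByte()) != -1)
        {
            char c = (char)b;
            if (c == '\n') { map.Add(line); line = new List<long>(); inWord = false; }
            else if (char.IsWhiteSpace(c)) inWord = false;
            else if (!inWord) { line.Add(pos); inWord = true; }
            pos++;
        }
        if (line.Count > 0) map.Add(line);
        return map;
    }

    // Read one word from the mapped view, starting at a known offset.
    static string ReadWord(MemoryMappedViewAccessor view, long start, long length)
    {
        var sb = new StringBuilder();
        for (long i = start; i < length; i++)
        {
            byte b = view.ReadByte(i);
            if (b == ' ' || b == '\t' || b == '\r' || b == '\n') break;
            sb.Append((char)b);
        }
        return sb.ToString();
    }

    static void Main()
    {
        string path = "sample.txt";   // assumed demo file
        File.WriteAllText(path, "alpha beta\ngamma delta epsilon\n");
        var map = BuildIndex(path);
        long length = new FileInfo(path).Length;
        using var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open);
        using var view = mmf.CreateViewAccessor(0, length, MemoryMappedFileAccess.Read);
        Console.WriteLine(ReadWord(view, map[1][2], length)); // 3rd word of 2nd line
    }
}
```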
It seems fine, but if you're using memory mapping then your program will only work on a 64-bit system, because you're exceeding the effective 2GB address space available to a 32-bit process.
You'd be fine just using a FileStream and calling .Seek() to jump to the selected offset as appropriate, so I don't see a need for memory-mapped files.
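A minimal sketch of that alternative, with a tiny assumed file and hand-computed line offsets standing in for the index built during the sequential pass:

```csharp
using System;
using System.IO;

class SeekDemo
{
    // Store the byte offset of each line start, then Seek() to it on
    // demand; no memory mapping involved.
    static void Main()
    {
        File.WriteAllText("lines.txt", "first\nsecond\nthird\n");
        long[] lineStarts = { 0, 6, 13 };   // offsets found during indexing

        using var fs = new FileStream("lines.txt", FileMode.Open, FileAccess.Read);
        fs.Seek(lineStarts[2], SeekOrigin.Begin);   // jump to the 3rd line
        using var reader = new StreamReader(fs);
        Console.WriteLine(reader.ReadLine());       // third
    }
}
```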
I believe your solution is a good start, even though List is not the leanest container for the map: a List<List<long>> over millions of lines carries a lot of per-list allocation overhead.
I would test whether the List<List<long>> map is the best memory/speed tradeoff. Since the OS caches memory maps at page boundaries (4096 bytes on x86/x64), it might actually be faster to only record the address of the start of each line, and then scan the line itself looking for words.
Obviously, this approach only works on a 64-bit OS, but the performance benefit of an mmap is significant - this is one of the few places where going 64-bit matters a lot: database-style applications :)
When I run my application, I get this exception:
(screenshot of the exception: http://img21.imageshack.us/img21/5619/bugxt.jpg)
I understand that the program is out of memory, but are there any other possible meanings for that exception?
For context, I am calling DLL files (deployed from MATLAB).
Thank you all.
It's absolutely possible; just use Process Explorer to look at your process's working set.
For 32-bit Windows systems, the maximum memory available to a .NET process is around 2GB, but it can be less depending on your configuration. Here is the SO link on the subject.
Considering that you use MATLAB, and so probably run massive or complex calculations, you probably create a lot of objects/values to pass to the DLL functions, which could be one source of the bottleneck. But this is only a guess; you need to measure your program to find the real problem.
Regards.
Note: check your old questions and accept the answer you prefer among the responses you got for each question; your accept rate is too low!
I have to create a C# program that deals well with reading in huge files.
For example, I have a 60+ MB file. I read all of it into a Scintilla box; let's call it sci_log. The program uses roughly 200 MB of memory with this and other features. This is still acceptable (and less than the amount of memory Notepad++ uses to open the same file).
I have another Scintilla box, sci_splice. The user inputs a search term and the program searches through the file (or sci_log if the file is small enough; it doesn't matter, because it happens both ways) looking for a Regex match. When it finds a match, it concatenates that line onto a string holding the previous matches and increments a temporary count variable. When count reaches 100 (or 150, or 200, any number really), I put the output in sci_splice, call GC.Collect(), and repeat for the next 100 lines (setting count = 0 and nulling the string).
I don't have the code on me right now, as I'm writing this from my home laptop, but the issue is that it uses a LOT of memory. The 200 MB usage jumps to well over 1 GB with no end in sight. This only happens on a search with a lot of Regex matches, so it's something to do with the string handling. But wouldn't the GC free up that memory? And why does it go so high? It doesn't make sense for it to more than triple (in the worst case). Even if all of that 200 MB were just the log in memory, all it's doing is reading each line and storing it (at worst).
After some more testing, it looks like Scintilla itself uses a lot of memory when adding lines; the initial read of the lines causes a memory spike up to 850 MB for a fraction of a second. I guess I need to just page the output.
Don't call GC.Collect. In this case I don't think it matters, because this memory is probably going to end up on the Large Object Heap (LOH). But the point is that .NET knows a lot more about memory management than you do; leave it alone.
I suspect you are looking at this in Task Manager, just from the way you describe it. You should instead use at least Perfmon. If you haven't used it before, go here and do pretty much what Tess does up to where it says "Get a Memory Dump". Not sure you are ready for WinDbg, but that may be your next step.
Without seeing code, there is almost no way to know what is going on. The problem could be inside Scintilla too, but I would check what you are doing first. By running Perfmon you may at least get more information to figure out what to do next.
If you are using System.String to store your matching lines, I suggest you try replacing it with System.Text.StringBuilder and see if this makes any difference.
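A small sketch of the difference; the match lines here are made up, but the pattern (append into one StringBuilder, call ToString() once at flush time) is the suggestion above. Repeated string concatenation allocates a new, ever-larger string for every match, which is exactly the kind of garbage that balloons memory during a search with many hits:

```csharp
using System;
using System.Text;

class Accumulate
{
    static void Main()
    {
        // Append each "matched line" into one growable buffer instead of
        // doing result = result + line, which copies the whole string
        // every time.
        var sb = new StringBuilder();
        for (int i = 0; i < 100; i++)
            sb.AppendLine($"match {i}");   // no intermediate strings

        string result = sb.ToString();     // one final allocation at flush time
        Console.WriteLine(result.Split('\n').Length);
    }
}
```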
Try http://msdn.microsoft.com/en-us/library/system.io.memorymappedfiles.memorymappedfile(VS.100).aspx