I have written an application in C# that generates all the words that can exist as combinations of letters, numbers and a few special characters.
The problem is that it isn't memory efficient, as it relies on recursion and collections like List.
Is there any way I can make it run in a limited-memory environment?
Umair
Convert it to an iterative function.
Unfortunately, the C# compiler does not perform tail call optimization, which is something you'd want to happen in this case. The CLR supports it, kind of, but you shouldn't rely on it.
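To illustrate, here is a minimal sketch of what the iterative version might look like, using an explicit Stack<string> in place of the call stack; the charset and maximum length are assumptions, since the original code isn't shown:

using System;
using System.Collections.Generic;

class WordGenerator
{
    // Depth-first generation driven by an explicit stack instead of recursion.
    static IEnumerable<string> Generate(string charset, int maxLength)
    {
        var pending = new Stack<string>();
        pending.Push(string.Empty);
        while (pending.Count > 0)
        {
            string prefix = pending.Pop();
            foreach (char c in charset)
            {
                string word = prefix + c;
                yield return word;          // hand the word to the caller; keep nothing
                if (word.Length < maxLength)
                    pending.Push(word);     // defer instead of recursing
            }
        }
    }

    static void Main()
    {
        foreach (string w in Generate("abc123", 4))
            Console.WriteLine(w);
    }
}

Because only the pending prefixes are held (never the full result set), memory stays bounded at roughly charset.Length * maxLength strings, no matter how many words are produced.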
Perhaps out of left field, but maybe you can write the recursive part of your program in F#? That way you can leverage guaranteed tail call optimization and reuse bits of your C# code. While it has a steep learning curve, F# is a more suitable language for these combinatorial tasks.
Well... I am not sure whose answer to accept, but I got a solution. I am using more than one process: one interacts with the user, and the other finds the word combinations. The second process finds 5000 words, saves them, and quits. Communication is handled through WCF. This works out fine, because when the process quits, its memory is freed.
Well, you obviously cannot store the intermediate results in memory (unless you've got some sort of absurd computer at your disposal); you will have to write the results to disk.
The recursion depth isn't a result of the number of considered characters; it's determined by the maximum string length you're willing to consider.
For instance, my install of Python 2.6.2 has its default recursion limit set to 1000. Arguably, I should be able to generate all possible strings of length 1-1000 over a given character set within this limitation (though I think the recursion limit applies to total stack depth, so the actual limit may be less than 1000).
Edit (added python sample):
The following Python snippet will produce what you're asking for (limiting itself to the runtime's stack limits):
from string import ascii_lowercase

def generate(base="", charset=ascii_lowercase):
    for c in charset:
        word = base + c
        yield word
        try:
            # Recurse one level deeper; Python 2 raises RuntimeError when the
            # recursion limit is hit, so we simply stop descending this branch.
            for s in generate(word, charset):
                yield s
        except RuntimeError:
            continue

for s in generate():
    print s
One could produce essentially the same thing in C# with an explicit depth limit (note that a StackOverflowException cannot be caught in .NET 2.0 and later). As I'm typing this update, the script is running, chewing up one of my cores. However, memory usage is constant at less than 7MB. Now, I'm only printing to stdout since I'm not interested in capturing the result, but I think it proves the point above. ;)
Addendum to the example:
Interesting note: looking closer at running processes, Python is actually I/O bound with the above example. It's only using 7% of my CPU, while the rest of the core is busy rendering the results in my command window. Minimizing the window allows Python to climb to 40% of total CPU usage (this is on a 2-core machine).
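For what it's worth, a rough C# analogue of that generator might look like the sketch below, with an explicit depth guard standing in for Python's recursion limit; the charset and depth values are assumptions:

using System;
using System.Collections.Generic;

class RecursiveWords
{
    // Recursive iterator; maxDepth plays the role of Python's recursion limit.
    static IEnumerable<string> Generate(string prefix, string charset, int maxDepth)
    {
        if (maxDepth == 0)
            yield break;
        foreach (char c in charset)
        {
            string word = prefix + c;
            yield return word;
            foreach (string s in Generate(word, charset, maxDepth - 1))
                yield return s;
        }
    }

    static void Main()
    {
        foreach (string s in Generate("", "abcdefghijklmnopqrstuvwxyz", 3))
            Console.WriteLine(s);
    }
}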
One more consideration: When you concatenate or use some other method to generate a string in C#, it occupies its own memory and may stick around for a while. If you are generating millions of strings, you are likely to notice some performance drag.
If you don't need to keep your many strings around, I would see if there's a way to avoid generating the strings at all. For example, you could keep a character array that you update as you move through the character combinations, and if you're outputting them to a file, write them one character at a time so you never have to build the string.
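As a sketch of that idea: treat a char buffer like an odometer and write each combination straight to a file, so no string object is ever built (the charset and the fixed length here are made up):

using System.IO;

class OdometerWriter
{
    static void Main()
    {
        char[] charset = "abc123".ToCharArray();
        const int length = 4;                    // fixed word length for this sketch
        int[] wheels = new int[length];          // indices into charset, like odometer wheels

        using (var writer = new StreamWriter("words.txt"))
        {
            while (true)
            {
                for (int i = 0; i < length; i++)
                    writer.Write(charset[wheels[i]]);  // emit chars; never build a string
                writer.WriteLine();

                int pos = length - 1;            // advance the rightmost wheel
                while (pos >= 0 && ++wheels[pos] == charset.Length)
                    wheels[pos--] = 0;           // carry into the wheel to the left
                if (pos < 0)
                    break;                       // every wheel rolled over: done
            }
        }
    }
}

Memory use is constant: one small int array plus the writer's buffer, no matter how many words are produced.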
Related
My task is to provide random read access to a very large (50GB+) ASCII text file (processing requests for nth line/nth word in nth line) in a form of C# console app.
After googling and reading for a few days, I've arrived at the following implementation plan:
Since StreamReader is good at sequential access, use it to build an index of lines/words in the file (a List<List<long>> map, where map[i][j] is the position where the jth word of the ith line starts), and then use the index to access the file through a MemoryMappedFile, since it's good at providing random access.
Are there some obvious flaws in the solution? Would it be optimal for a given task?
UPD: It will be executed on a 64-bit system.
It seems fine, but if you're using memory mapping then your program will only work on a 64-bit system, because you're exceeding the effective 2GB address space available to a 32-bit process.
You'll be fine with just using a FileStream and calling .Seek() to jump to the selected offset as appropriate, so I don't see a need for using MemoryMapped files.
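A minimal sketch of that Seek-based reader, assuming the offset index (the map[i][j] from the question) has already been built:

using System.IO;
using System.Text;

class WordReader
{
    // Reads the word starting at the given byte offset, stopping at whitespace;
    // single-byte ASCII is assumed, as stated in the question.
    public static string ReadWordAt(FileStream fs, long offset)
    {
        fs.Seek(offset, SeekOrigin.Begin);
        var word = new StringBuilder();
        int b;
        while ((b = fs.ReadByte()) != -1 && b != ' ' && b != '\r' && b != '\n')
            word.Append((char)b);
        return word.ToString();
    }
}

With the index in place, the jth word of the ith line is then just ReadWordAt(fs, map[i][j]).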
I believe your solution is a good start, even though List is not the best container for the map; for an index of this size, nested Lists carry noticeable memory overhead.
I would test whether a List<List<long>> map is the best memory/speed tradeoff. Since the OS caches memory maps at page boundaries (4096 bytes on x86/x64), it might actually be faster to only index the start of each line, and then scan the line itself looking for words.
Obviously, this approach only works on a 64-bit OS, but the performance benefit of a memory map is significant; this is one of the few places where going 64-bit matters a lot, namely database-style applications :)
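If you do take the memory-mapped route on 64-bit, the basic shape could look like the sketch below (the file name and offsets are illustrative; for many reads you would keep one large view open rather than creating one per call):

using System.IO.MemoryMappedFiles;

class MappedReader
{
    // Reads count bytes at an arbitrary offset through a view accessor;
    // the OS pages data in on demand instead of reading sequentially.
    static byte[] ReadBytesAt(MemoryMappedFile mmf, long offset, int count)
    {
        using (var accessor = mmf.CreateViewAccessor(offset, count))
        {
            var buffer = new byte[count];
            accessor.ReadArray(0, buffer, 0, count);
            return buffer;
        }
    }

    static void Main()
    {
        using (var mmf = MemoryMappedFile.CreateFromFile("huge.txt"))
        {
            byte[] chunk = ReadBytesAt(mmf, 4096, 200);  // offsets come from the line index
        }
    }
}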
I'm building an application that is seriously slower than it should be (a process takes 4 seconds when it should take only 0.1 seconds; that's my goal, at least).
I have a bunch of methods that pass an array from one to the other. This has kept my code nice and organized, but I'm worried that it's killing the efficiency of my code.
Can anyone confirm if this is the case?
Also, I have all of my code contained in a class separate from my UI. Is this going to make things run significantly slower than if I had the code in the Form1.cs file?
Edit: there are about 95,000 points that need to be calculated, and each point goes through 7 methods that do additional calculations.
Have you tried any profiling or performance tools to narrow down why the slowdown occurs?
It might show you ways that you could use to refactor your code and improve performance.
This question asked by another user has several options that you can choose from:
Good .Net Profilers
No, this is not what is killing your code's speed, unless "a bunch of methods" means a million or something. More likely, you have more things iterating through your array than you need or realize, and the array itself may have a larger memory footprint than you realize.
Perhaps you should look into a design where, instead of passing the whole array through 7 methods, you iterate the array once and pass each member to the 7 methods, as sketched below; this minimizes the number of times you iterate over the 95,000 members.
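In sketch form, the restructuring might look like this; Step1 and Step2 are stand-ins for the question's seven calculation methods:

using System;

class Pipeline
{
    // Placeholder calculations; the real methods would do the actual work.
    static double Step1(double p) { return p + 1.0; }
    static double Step2(double p) { return p * 2.0; }

    static void Main()
    {
        double[] points = new double[95000];
        double[] results = new double[points.Length];

        // One pass over the array: every step runs per element, instead of
        // each method re-iterating all 95,000 members on its own.
        for (int i = 0; i < points.Length; i++)
        {
            double p = points[i];
            p = Step1(p);
            p = Step2(p);   // ...Step3 through Step7 would follow the same pattern
            results[i] = p;
        }
        Console.WriteLine(results[0]);
    }
}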
In general, function calls are basic enough to be highly optimized by any compiler (or interpreter), so they do not produce too much blow-up in run time. In fact, if you rework your problem into, say, some fancy iterative solution, you save the stack handling but instead have to manage iteration variables, which will not be much cheaper.
That said, there have been programmers who wondered why their recursive algorithms were so slow, until someone told them not to pass array entries by value.
You should provide some sample code. In general, you should look for other bottlenecks, or find another algorithm.
You just need to run it against a good profiling tool. I've got some stuff I wish only took 4 seconds, and it works with upwards of a hundred million records in a pass.
An array is a reference type, not a value type, so you never pass the array itself; you are actually passing a reference to the array in memory. So passing the array isn't your issue. Most likely the issue is in what you do with the array. You need to do what Jamie Keeling said and run it through a profiler, or even just debug it and see if you get stuck in some big loops.
Why are you loading them all into an array and doing each method in turn rather than iterating through them as loaded?
If you can obtain them (from whatever input source), deal with them, and output them (whether to screen, file, or wherever) one at a time, this will inevitably use less memory and reduce start-up time, at the very least.
If this answer is applicable to your situation, start by changing your methods to deal with enumerations rather than arrays (a non-breaking change, since arrays are enumerations), then change your input method to yield return items as they are loaded rather than loading an entire array, as sketched below.
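A small sketch of that shape, with invented file and method names; nothing is ever held in memory all at once:

using System.Collections.Generic;
using System.IO;

class StreamingInput
{
    // Yields one value at a time as it is read, instead of filling an array first.
    static IEnumerable<double> LoadPoints(string path)
    {
        foreach (string line in File.ReadLines(path))
            yield return double.Parse(line);
    }

    // Accepts any enumeration, so it works for arrays and streams alike.
    static IEnumerable<double> Process(IEnumerable<double> points)
    {
        foreach (double p in points)
            yield return p * 2.0;   // stand-in for the real calculation
    }

    static void Main()
    {
        using (var writer = new StreamWriter("results.txt"))
            foreach (double result in Process(LoadPoints("points.txt")))
                writer.WriteLine(result);
    }
}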
Sorry for posting an old link (.NET 1.1), but it was referenced from a VS2010 article, so:
Here you can read about method costs. (Initial link)
Also, if you start your code from VS (even in Release mode), the VS debugger attaches to your code and slows it down.
I know I will be downvoted for this advice, but... maximum performance is achieved with unsafe operations on arrays (yes, it's UNSAFE, but when performance is at stake...).
And lastly, refactor your code to use a minimum of methods working with your arrays. It will improve the performance.
I have to create a C# program that deals well with reading in huge files.
For example, I have a 60+ MB file. I read all of it into a Scintilla box; let's call it sci_log. The program is using roughly 200 MB of memory with this and other features. This is still acceptable (and less than the amount of memory used by Notepad++ to open this file).
I have another Scintilla box, sci_splice. The user inputs a search term and the program searches through the file (or sci_log if the file is small enough; it doesn't matter, because it happens both ways) looking for a regex match. When it finds a match, it concatenates that line onto a string holding the previous matches and increments a temporary count variable. When count reaches 100 (or 150, or 200, any number really), I put the output in sci_splice, call GC.Collect(), and repeat for the next 100 lines (setting count = 0 and nulling the string).
I don't have the code on me right now, as I'm writing this from my home laptop, but the issue is that it's using a LOT of memory. The 200 MB usage jumps to well over 1 GB with no end in sight. This only happens on a search with a lot of regex matches, so it's something to do with the string. But wouldn't the GC free up that memory? And why does it climb so high? It doesn't make sense for it to more than triple (worst possible case). Even if all of that 200 MB were just the log in memory, all it's doing is reading each line and storing it (at worst).
After some more testing, it looks like there's something wrong with Scintilla using a lot of memory when adding lines. The initial read of the lines has a memory spike up to 850 MB for a fraction of a second. Guess I need to just page the output.
Don't call GC.Collect. In this case I don't think it matters because I think this memory is going to end up on the Large Object Heap (LOH). But the point is .Net knows a lot more about memory management than you do; leave it alone.
I suspect you are looking at this in Task Manager, just from the way you describe it. You should instead use at least Perfmon. If you haven't used it before, go here and do pretty much what Tess does, up to where it says "Get a Memory Dump". Not sure you are ready for WinDbg, but that may be your next step.
Without seeing code there is almost no way to know what it is going on. The problem could be inside of Scintilla too, but I would check through what you are doing first. By running perfmon you may at least get more information to figure out what to do next.
If you are using System.String to store your matching lines, I suggest you try replacing it with System.Text.StringBuilder and see if this makes any difference.
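Since the code isn't posted, here is only a sketch of what the batching loop could look like with StringBuilder; the names and batch size are assumptions:

using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

class MatchCollector
{
    // Accumulates matches in one reusable StringBuilder buffer; concatenating
    // with string += would allocate a new, ever-larger string on every match.
    public static IEnumerable<string> CollectBatches(
        IEnumerable<string> lines, Regex pattern, int batchSize)
    {
        var batch = new StringBuilder();
        int count = 0;
        foreach (string line in lines)
        {
            if (!pattern.IsMatch(line))
                continue;
            batch.AppendLine(line);
            if (++count == batchSize)
            {
                yield return batch.ToString();  // hand one batch to the UI
                batch.Clear();
                count = 0;
            }
        }
        if (count > 0)
            yield return batch.ToString();      // leftover partial batch
    }
}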
Try http://msdn.microsoft.com/en-us/library/system.io.memorymappedfiles.memorymappedfile(VS.100).aspx
Given a case where I have an object that may be in one or more true/false states, I've always been a little fuzzy on why programmers frequently use flags+bitmasks instead of just using several boolean values.
It's all over the .NET framework. Not sure if this is the best example, but the .NET framework has the following:
[Flags]
public enum AnchorStyles
{
    None = 0,
    Top = 1,
    Bottom = 2,
    Left = 4,
    Right = 8
}
So given an anchor style, we can use bitmasks to figure out which of the states are selected. However, it seems like you could accomplish the same thing with an AnchorStyle class/struct with bool properties defined for each possible value, or an array of individual enum values.
Of course the main reason for my question is that I'm wondering if I should follow a similar practice with my own code.
So, why use this approach?
Less memory consumption? (it doesn't seem like it would consume less than an array/struct of bools)
Better stack/heap performance than a struct or array?
Faster compare operations? Faster value addition/removal?
More convenient for the developer who wrote it?
It was traditionally a way of reducing memory usage. So yes, it's quite obsolete in C# :-)
As a programming technique, it may be obsolete in today's systems, and you'd be quite alright to use an array of bools, but...
It is fast to compare values stored as a bitmask. Use the AND and OR logic operators and compare the resulting 2 ints.
It uses considerably less memory. Putting all 4 of your example values in a bitmask would use half a byte. An array of bools would most likely use a couple dozen bytes of object overhead plus a byte for each bool. If you have to store a million values, you'll see exactly why a bitmask version is superior.
It is easier to manage; you only have to deal with a single integer value, whereas an array of bools would store quite differently in, say, a database.
And, because of the memory layout, it's much faster in every aspect than an array. It's nearly as fast as operating on a single 32-bit integer, which is about as fast as you can get.
Easy to set multiple flags in any order.
Easy to save and fetch a series of bits like 0101011 to a database.
Among other things, it's easier to add new bit meanings to a bitfield than to add new boolean values to a class. It's also easier to copy a bitfield from one instance to another than a series of booleans.
It can also make methods clearer. Imagine a method with 10 bool parameters vs. one bitmask.
Actually, it can give better performance, mainly if your enum is backed by byte.
In that extreme case, a single byte holds up to 8 flags, covering all 256 combinations, whereas the same 8 states stored as separate booleans would take at least 8 bytes.
But even then, I don't think that is the real reason. The reason I prefer flags is the power C# gives me to handle those enums. I can set several values with a single expression. I can remove them as well. I can even compare several values at once with a single expression. With booleans, the code becomes, let's say, more verbose.
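For example, with an enum like the AnchorStyles shown in the question (it lives in System.Windows.Forms), all of these are single expressions:

using System.Windows.Forms;

class FlagsDemo
{
    static void Main()
    {
        AnchorStyles anchors = AnchorStyles.Top | AnchorStyles.Left;  // set two flags at once
        anchors |= AnchorStyles.Right;                                // add another
        anchors &= ~AnchorStyles.Left;                                // remove one
        AnchorStyles wanted = AnchorStyles.Top | AnchorStyles.Right;
        bool hasBoth = (anchors & wanted) == wanted;                  // test several at once
        System.Console.WriteLine(hasBoth);                            // True
    }
}

The boolean-property equivalent would be several assignments plus a multi-clause condition.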
From a domain model perspective, it just models reality better in some situations. If you have three booleans like AccountIsInDefault, IsPreferredCustomer, and RequiresSalesTaxState, then it doesn't make sense to fold them into a single Flags-decorated enumeration, because they are not three states of the same domain model element.
But if you have a set of booleans like:
[Flags]
enum AccountStatus
{
    AccountIsInDefault = 1,
    AccountOverdue = 2,
    AccountFrozen = 4
}
or
[Flags]
enum CargoState
{
    ExceedsWeightLimit = 1,
    ContainsDangerousCargo = 2,
    IsFlammableCargo = 4,
    ContainsRadioactive = 8
}
Then it is useful to be able to store the total state of the account (or the cargo) in ONE variable that represents ONE domain element, whose value can represent any possible combination of states.
Raymond Chen has a blog post on this subject.
Sure, bitfields save data memory, but you have to balance it against the cost in code size, debuggability, and reduced multithreading.
As others have said, its time is largely past. It's tempting to still do it, because bit fiddling is fun and cool-looking, but it's no longer more efficient, it has serious drawbacks in terms of maintenance, it doesn't play nicely with databases, and unless you're working in an embedded world, you have enough memory.
I would suggest never using enum flags unless you are dealing with some pretty serious memory limitations (not likely). You should always write code optimized for maintenance.
Having several boolean properties makes it easier to read and understand the code, change the values, and provide IntelliSense comments, not to mention it reduces the likelihood of bugs. If necessary, you can always use an enum flags field internally; just make sure you expose the setting/getting of the values through boolean properties.
Space efficiency - 1 bit
Time efficiency - bit comparisons are handled quickly by hardware.
Language independence - where the data may be handled by a number of different programs you don't need to worry about the implementation of booleans across different languages/platforms.
Most of the time, these are not worth the tradeoff in terms of maintenance. However, there are times when it is useful:
Network protocols - there will be a big saving in reduced size of messages
Legacy software - once I had to add some information for tracing into some legacy software.
Cost to modify the header: millions of dollars and years of effort.
Cost to shoehorn the information into 2 bytes in the header that weren't being used: 0.
Of course, there was the additional cost in the code that accessed and manipulated this information, but that was done through functions anyway, so once you had the accessors defined it was no less maintainable than using booleans.
I have seen answers citing time efficiency and compatibility. Those are the reasons, but I do not think anyone has explained why these are still sometimes necessary today. From the answers here, and from chatting with other engineers, bit flags get pictured as some quirky old-time way of doing things that should just die because newer ways are better.
Yes, in rare cases you may want to do it the "old way" for performance's sake, like when you have the classic million-iteration loop. But I'd say that's the wrong way to frame it.
While it is true that you should mostly not care and use whatever the C# language offers as the current right way to do things (enforced by some fancy code analysis slapping you whenever you stray from the house style), you should understand that low-level strategies aren't there by accident. In many cases they are the only way to solve a problem when no fancy framework is there to help. Your OS, your drivers, and .NET itself (especially the garbage collector) are built on bitfields and transactional instructions. Your CPU's instruction set is itself a very complex bitfield, so JIT compilers encode their output using complex bit processing and a few hardcoded bitfields so that the CPU can execute it correctly.
When we talk about performance, these things have a much larger impact than people imagine, today more than ever, especially once you start considering multicore.
When multicore systems started to become common, CPU manufacturers began to mitigate the problems of SMP by adding dedicated transactional memory access instructions. While these were made specifically to tackle the near-impossible task of making multiple CPUs cooperate at kernel level without a huge drop in performance, they provide additional benefits, like an OS-independent way to speed up the low-level parts of most programs. Basically, your program can use CPU-assisted instructions to perform changes to integer-sized memory locations: a read-modify-write where the "modify" part can be anything you want, though the most common patterns are combinations of set/clear/increment.
Usually the CPU simply monitors whether any other CPU is accessing the same address; if contention happens, it aborts the operation before it is committed to memory and signals the event to the application within the same instruction. This sounds trivial, but superscalar CPUs (each core has multiple ALUs allowing instruction-level parallelism), multi-level caches (some private to each core, some shared by a cluster of cores), and Non-Uniform Memory Access systems (see Threadripper CPUs) make it hard to keep everything coherent. Luckily, the smartest people in the world work on this: today's CPUs dedicate a large number of transistors to making caches and these read-modify-write transactions behave correctly.
C# exposes the most common transactional memory access patterns through the Interlocked class. It is only a limited set (for example, a very useful "clear mask and increment" is missing), but you can always fall back on CompareExchange, which gets very close to the same performance.
To achieve the same result using an array of booleans you would need some sort of lock, and under contention a lock is orders of magnitude less performant than the atomic instructions.
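As a sketch of what that buys you, here is lock-free setting and clearing of bits in a shared int via Interlocked.CompareExchange (the field and method names are mine):

using System.Threading;

class AtomicBits
{
    private int _flags;   // 32 boolean states packed into one int

    // Atomically set the bits in mask; retry if another thread raced us.
    public void SetBits(int mask)
    {
        int seen;
        do
        {
            seen = Volatile.Read(ref _flags);
        } while (Interlocked.CompareExchange(ref _flags, seen | mask, seen) != seen);
    }

    // Atomically clear the bits in mask.
    public void ClearBits(int mask)
    {
        int seen;
        do
        {
            seen = Volatile.Read(ref _flags);
        } while (Interlocked.CompareExchange(ref _flags, seen & ~mask, seen) != seen);
    }
}

An array of 32 bools has no equivalent; you would need a lock around every read-modify-write.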
Here are some examples of much-appreciated hardware-assisted transactional access to bitfields, which would require a completely different strategy without them (these are, of course, outside C#'s scope):
Assume a DMA controller with a set of DMA channels, say 20 (any number up to the bit width of the interlocked integer will do). Any peripheral's interrupt handler, which might execute at any time, from any core of your 32-core latest-gen machine, and including your beloved OS, may want to allocate a DMA channel (assign it to a peripheral) and use it. A bitfield covers all those requirements and takes just a dozen instructions to perform the allocation, all inlineable into the requesting code. You basically cannot go faster than this, and your code stays a few small functions; we delegate the hard part to the hardware. Constraint: bitfield only.
Assume a peripheral that needs some working space in normal RAM to do its job, for example a high-speed I/O peripheral that uses scatter-gather DMA. In short, it uses fixed-size blocks of RAM, each holding the description of one transfer (the descriptor is itself made of bitfields), chained together to form a FIFO queue of transfers. The application prepares descriptors first and then chains them onto the tail of the current transfers without ever pausing the controller (not even disabling interrupts). The allocation/deallocation of such descriptors can be managed with a bitfield and transactional instructions, so that sharing between different CPUs, the driver's interrupt handler, and the kernel all works without conflicts. One usage pattern: the kernel allocates descriptors atomically without stopping or disabling interrupts and without additional locks (the bitfield itself is the lock), and the interrupt handler deallocates them when transfers complete.
Most older strategies were to preallocate the resources and force the application to free them after use.
If you ever need multitasking on steroids, C# gives you Threads + Interlocked, and lately C# introduced lightweight Tasks; guess how those are built? Transactional memory access via the Interlocked class. So you likely do not need to reinvent the wheel; the low-level part is already covered and well engineered.
So the idea is: let smart people (not me, I am a common developer like you) solve the hard part for you, and just enjoy a general-purpose computing platform like C#. If you still see remnants of these techniques, it is because someone may still need to interface with the world outside .NET, accessing some driver or system call that requires you to know how to build a descriptor and put each bit in the right place. Don't be mad at those people; they made our jobs possible.
In short: Interlocked + bitfields. Incredibly powerful; don't use them unless you really have to.
It is for speed and efficiency. Essentially all you are working with is a single int.
if ((flags & AnchorStyles.Top) == AnchorStyles.Top)
{
    // Do stuff
}
I've been trying to calculate all the unique permutations for a very long word (antidisestablishmentarianism), and although I can calculate the permutations for the words, I am having problems with stopping the production of duplications.
Normally I would just run the List<T>.Contains() method on my string, but the list of permutations becomes so large I can't keep it in memory. I made that mistake earlier and managed to use up all 8GB of memory in my computer. In order to prevent that from happening again, I changed the code to append the calculated permutation to a file and release it from memory.
My main question is this: How can I prevent duplicate permutations from being added to my file without loading the whole thing in memory? Is it possible to selectively load, for example, the first few megabytes, scan that, and move on until the file is completed, or should I be looking in a different direction?
This is not homework. My math homework posed a hypothetical situation where a computer could calculate 30 permutations per second and asked how long it would take to calculate all of them. That wasn't a problem, and I don't need help with it; I just wanted to know how long a modern computer would take to perform the same task.
How about using an algorithm that generates all permutations without duplicates? That way you wouldn't have to check for them in the first place.
A Google search for "algorithm generate permutations" turns up dozens of references to get you started. e.g. Permutation Generation Methods
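One such method, sketched in C#: count each distinct character, and at every position of the output pick each distinct character at most once, so duplicates are never produced and nothing ever needs to be stored or checked:

using System;
using System.Collections.Generic;
using System.Linq;

class UniquePermutations
{
    // Recurses over counts of distinct characters, so every permutation
    // comes out exactly once; memory use is O(word length).
    static IEnumerable<string> Permute(Dictionary<char, int> counts, char[] buffer, int depth)
    {
        if (depth == buffer.Length)
        {
            yield return new string(buffer);
            yield break;
        }
        foreach (char c in counts.Keys.ToList())   // snapshot: the values change below
        {
            if (counts[c] == 0)
                continue;
            counts[c]--;
            buffer[depth] = c;
            foreach (string p in Permute(counts, buffer, depth + 1))
                yield return p;
            counts[c]++;                           // backtrack
        }
    }

    static void Main()
    {
        string word = "antidisestablishmentarianism";
        var counts = word.GroupBy(c => c).ToDictionary(g => g.Key, g => g.Count());
        foreach (string p in Permute(counts, new char[word.Length], 0))
            Console.WriteLine(p);   // or append to a file; no duplicate check needed
    }
}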