concat two byte[] returns System.OutOfMemoryException - c#

I have a problem concatenating two byte[] arrays. One of them has more than 300,000,000 bytes. The concatenation throws an exception of type System.OutOfMemoryException.
I use this code:
byte[] b3 = by2.Concat(by1).ToArray();
Can anybody help me?

Because of the Concat call, ToArray knows nothing about how big the result array has to be, so it can't create a properly sized array up front and just fill it with data. Instead it creates a small one, and when it's full it creates a new one with twice the size, over and over again as long as there is more data to fill. This way you need much more memory than just the theoretical (b1.Length + b2.Length) * 2. And things get even trickier, because after a certain point these big arrays are allocated on the LOH, and are not collected as easily by the GC as normal objects.
That's why you should not use ToArray() in this case and should do it the old-fashioned way: allocate a new array whose size equals the combined sizes of the source arrays and copy the data.
Something like:
var b3 = new byte[b1.Length + b2.Length];
Array.Copy(b1, 0, b3, 0, b1.Length);            // copy b1 into the front of b3
Array.Copy(b2, 0, b3, b1.Length, b2.Length);    // copy b2 right after it
It does not guarantee success, but it makes it more likely. And it executes much, much faster than ToArray().
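If you prefer, Buffer.BlockCopy does the same job for byte arrays; a minimal sketch, equivalent to the Array.Copy version above:
byte[] b3 = new byte[b1.Length + b2.Length];
Buffer.BlockCopy(b1, 0, b3, 0, b1.Length);        // offsets and counts are in bytes
Buffer.BlockCopy(b2, 0, b3, b1.Length, b2.Length);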

When working with that amount of data, I think you should be working with streams (this of course depends on the application).
Then you can have code that works on the data without requiring it all to be loaded in memory at the same time, and you could create a specialized stream class that acts as a concatenation between two streams.
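As a rough illustration of the idea (not a full Stream subclass; the target file name here is just an example), the two arrays from the question can be processed through streams without ever allocating the combined array:
using System.IO;

// Process both arrays as one logical sequence without allocating b3 at all.
using (var first = new MemoryStream(by2, writable: false))
using (var second = new MemoryStream(by1, writable: false))
using (var destination = File.Create("combined.bin"))   // example target
{
    first.CopyTo(destination);
    second.CopyTo(destination);
}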

Well, the error message speaks for itself: you don't have ~550 MB of free contiguous RAM. Maybe the address space is just too fragmented.

Well... you know, requesting a contiguous block of ~600 MB from the system - I'm not surprised. It is quite a large block by itself, and given that you must also have the source arrays in memory, that's over 1 GB of raw data chunks.
You should probably start thinking about other data structures, or try to keep the data as files and map them to memory. Edit: memory-mapping a whole file needs the same contiguous area in address space, so it solves nothing. This answer will be deleted.

Related

Deleting byte[] from program and memory in C#

I'm making a program in which one of its functions, in order to correctly create the message to be sent, keeps calling a function I have written to add each of the parts to the array. The thing is, in C# you can't do this because byte arrays (and, if I'm not wrong, any kind of array) have a fixed Length which cannot be changed.
Due to this, I thought of creating 2 byte[] variables. The first one would get the first two values. The second one would be created once you know the quantity of new bytes you have to add; after this, you would delete the first variable and create it again with the Length of the previous variable plus the Length of the new values, doing the same you did with the second variable. The code I've generated is:
byte[] message_mod_0 = adr_and_func;
byte[] byte_memory_adr = AddAndTypes.ToByteArray(memory_adr);
byte[] message_mod_1 = new byte[2 + byte_memory_adr.Length];
message_mod_1 = AddAndTypes.AddByteArrayToByteArray(message_mod_0, byte_memory_adr);
AddAndTypes.AddByteArrayToByteArray(message_mod_0, AddAndTypes.IntToByte(value));
byte[] CRC = Aux.CRC(message_mod_0);
AddAndTypes.AddByteArrayToByteArray(message_mod_0, CRC);
In this code, the two variables I mean are message_mod_0 and message_mod_1. I also thought of deleting and redeclaring the byte_memory_adr variable, which is needed in order to know the Length of the byte array you want to add to the output message.
The parameters adr_and_func, memory_adr and value are given as input parameters of the function I'm making.
The question can be summed up as: is there any way to delete variables in the same scope they were created? And, in case it can be done, would there be any problem if I created a new variable with the same name after deleting the first one? I can't think of any reason why that would be a problem, but I'm pretty new to this programming language.
Also, I don't know if there is any less messy way of doing this.
This sounds like you are writing your own custom serializer.
I would recommend just using an existing library, like protobuf.net, to define your messages if at all possible.
If this is not possible, you can use a BinaryWriter to write your values to a Stream. If you want to keep it in memory, use a MemoryStream and call .ToArray() when you're done to get an array of all the bytes.
As for memory, do not worry about it. Unless you have gigabyte sized messages the memory requirements should not be an issue, and the garbage collector will automatically recycle memory when it is no longer needed, and it can do this after the last usage, regardless of scope. If you have huge memory streams you might want to look at something like recyclable memory stream since this can avoid some allocation performance issues and fragmentation issues.
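A rough sketch of the BinaryWriter/MemoryStream approach (the parameter types are guesses and the Aux.CRC helper is taken from the question's own code, not a known API):
using System.IO;

static byte[] BuildMessage(byte[] adr_and_func, ushort memory_adr, int value)
{
    using (var ms = new MemoryStream())
    using (var writer = new BinaryWriter(ms))
    {
        writer.Write(adr_and_func);               // address + function code bytes
        writer.Write(memory_adr);                 // 2 bytes
        writer.Write(value);                      // 4 bytes

        byte[] withoutCrc = ms.ToArray();         // snapshot of everything written so far
        writer.Write(Aux.CRC(withoutCrc));        // append the CRC using the question's helper
        return ms.ToArray();                      // the complete message as one byte[]
    }
}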

C# Quick bit array

As stated in the title, I am evaluating the cost of implementing a BitArray over byte[] (I have understood that the native BitArray is pretty slow) instead of using a string representation of bits (e.g. "001001001"), but I am open to any suggestions that are more effective.
The length of the array is not known at design time, but I suppose it may be between 200 and 500 bits per array.
Memory is not a concern, so using a lot of memory to represent the array is not an issue; what matters is speed when the array is created and manipulated (they will be manipulated a lot).
Thanks in advance for your consideration and suggestions on the topic.
A few suggestions:
1) Computers don't process individual bits, so even an int or long will work at the same speed (see the sketch after this list).
2) To gain speed you can consider writing it with unsafe code.
3) new is expensive. If objects are created a lot, you can do the following: create a bulk of 10K objects at a time and serve them from a method when required. Once the cache runs out you can recreate them. Have another method so that once an object's processing completes, you clean it up and return it to the cache.
4) Make sure your manipulation is optimal.
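Regarding point 1, here is a rough sketch of a bit array backed by ulong words (names are my own and this is not a tuned implementation):
public sealed class FastBitArray
{
    private readonly ulong[] _words;   // 64 bits per word

    public FastBitArray(int lengthInBits)
    {
        Length = lengthInBits;
        _words = new ulong[(lengthInBits + 63) / 64];
    }

    public int Length { get; }

    // index >> 6 selects the word, index & 63 selects the bit inside it
    public bool Get(int index) => (_words[index >> 6] & (1UL << (index & 63))) != 0;

    public void Set(int index, bool value)
    {
        if (value) _words[index >> 6] |= 1UL << (index & 63);
        else _words[index >> 6] &= ~(1UL << (index & 63));
    }
}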

I need very big array length(size) in C#

public double[] result = new double[ ??? ];
I am storing results, and the total number of results is bigger than 2,147,483,647, which is the max Int32 value.
I tried BigInteger, ulong, etc. but all of them gave me errors.
How can I extend the size of the array so that it can store > 50,147,483,647 results (double) inside it?
Thanks...
An array of 2,147,483,648 doubles will occupy 16GB of memory. For some people, that's not a big deal. I've got servers that won't even bother to hit the page file if I allocate a few of those arrays. Doesn't mean it's a good idea.
When you are dealing with huge amounts of data like that you should be looking to minimize the memory impact of the process. There are several ways to go with this, depending on how you're working with the data.
Sparse Arrays
If your array is sparsely populated - lots of default/empty values with a small percentage of actually valid/useful data - then a sparse array can drastically reduce the memory requirements. You can write various implementations to optimize for different distribution profiles: random distribution, grouped values, arbitrary contiguous groups, etc.
Works fine for any type of contained data, including complex classes. Has some overhead, so it can actually be worse than naked arrays when the fill percentage is high. And of course you're still going to be using memory to store your actual data.
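As a rough illustration of the sparse-array idea (a minimal sketch, assuming missing entries should read back as the default value):
using System.Collections.Generic;

// Stores only the non-default entries; everything else reads back as default(T).
public sealed class SparseArray<T>
{
    private readonly Dictionary<long, T> _items = new Dictionary<long, T>();

    public T this[long index]
    {
        get { T value; return _items.TryGetValue(index, out value) ? value : default(T); }
        set { _items[index] = value; }
    }

    public long PopulatedCount => _items.Count;
}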
Simple Flat File
Store the data on disk, create a read/write FileStream for the file, and enclose that in a wrapper that lets you access the file's contents as if it were an in-memory array. The simplest implementation of this will give you reasonable usefulness for sequential reads from the file. Random reads and writes can slow you down, but you can do some buffering in the background to help mitigate the speed issues.
This approach works for any type that has a static size, including structures that can be copied to/from a range of bytes in the file. Doesn't work for dynamically-sized data like strings.
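A minimal sketch of such a flat-file wrapper for doubles (error handling, buffering, and partial-read loops are omitted; the class name is made up):
using System;
using System.IO;

// Exposes a file of raw 8-byte doubles with array-like indexed access.
public sealed class FileBackedDoubleArray : IDisposable
{
    private readonly FileStream _stream;
    private readonly byte[] _buffer = new byte[sizeof(double)];

    public FileBackedDoubleArray(string path, long length)
    {
        _stream = new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite);
        _stream.SetLength(length * sizeof(double));
    }

    public long Length => _stream.Length / sizeof(double);

    public double this[long index]
    {
        get
        {
            _stream.Position = index * sizeof(double);
            _stream.Read(_buffer, 0, _buffer.Length);      // assumes the read completes in one call
            return BitConverter.ToDouble(_buffer, 0);
        }
        set
        {
            _stream.Position = index * sizeof(double);
            _stream.Write(BitConverter.GetBytes(value), 0, sizeof(double));
        }
    }

    public void Dispose() => _stream.Dispose();
}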
Complex Flat File
If you need to handle dynamic-size records, sparse data, etc. then you might be able to design a file format that can handle it elegantly. Then again, a database is probably a better option at this point.
Memory Mapped File
Same as the other file options, but using a different mechanism to access the data. See System.IO.MemoryMappedFile for more information on how to use Memory Mapped Files from .NET.
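For example (a rough sketch; the file name and sizes are placeholders, and the backing file really does take that much disk space):
using System.IO;
using System.IO.MemoryMappedFiles;

long count = 3000000000L;                          // more doubles than int.MaxValue
long bytes = count * sizeof(double);               // ~24 GB backing file on disk

using (var mmf = MemoryMappedFile.CreateFromFile("results.bin", FileMode.Create, null, bytes))
using (var accessor = mmf.CreateViewAccessor())
{
    long i = 2500000000L;                          // an index far beyond any array limit
    accessor.Write(i * sizeof(double), 3.14);      // write a double at a byte offset
    double readBack = accessor.ReadDouble(i * sizeof(double));
}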
Database Storage
Depending on the nature of the data, storing it in a database might work for you. For a large array of doubles this is unlikely to be a great option however. The overheads of reading/writing data in the database, plus the storage overheads - each row will at least need to have a row identity, probably a BIG_INT (8-byte integer) for a large recordset, doubling the size of the data right off the bat. Add in the overheads for indexing, row storage, etc. and you can very easily multiply the size of your data.
Databases are great for storing and manipulating complicated data. That's what they're for. If you have variable-width data - strings and the like - then a database is probably one of your best options. The flip-side is that they're generally not an optimal solution for working with large amounts of very simple data.
Whichever option you go with, you can create an IList<T>-compatible class that encapsulates your data. This lets you write code that doesn't have any need to know how the data is stored, only what it is.
BCL arrays cannot do that.
Someone wrote a chunked BigArray<T> class that can.
However, that will not magically create enough memory to store it.
You can't. Even with gcAllowVeryLargeObjects, the maximum size of any dimension in an array (of non-bytes) is 2,146,435,071
So you'll need to rethink your design, or use an alternative implementation such as a jagged array.
Another possible approach is to implement your own BigList. First note that List<T> is implemented over an array. Also, you can set the initial size of the List in the constructor, so if you know it will be big, you can grab a big chunk of memory up front.
Then
public class myBigList<T> : List<List<T>>
{
}
or, maybe more preferable, use a has-a approach:
public class myBigList<T>
{
    List<List<T>> theList;
}
In doing this you will need to re-implement the indexer so you can use division and modulo to find the correct indices into your backing store. Then you can use a BigInt as the index. In your custom indexer you will decompose the BigInt into two legal-sized ints.
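A rough sketch of that indexer decomposition (using long rather than BigInt for the index, since long already covers the sizes in question; names and chunk size are my own):
using System.Collections.Generic;

public sealed class BigList<T>
{
    private const int ChunkSize = 1 << 20;           // ~1M elements per inner list
    private readonly List<List<T>> _chunks = new List<List<T>>();

    public T this[long index]
    {
        get
        {
            int chunk  = (int)(index / ChunkSize);    // which inner list
            int offset = (int)(index % ChunkSize);    // position within it
            return _chunks[chunk][offset];
        }
        set
        {
            int chunk  = (int)(index / ChunkSize);
            int offset = (int)(index % ChunkSize);
            _chunks[chunk][offset] = value;
        }
    }

    public void Add(T item)
    {
        if (_chunks.Count == 0 || _chunks[_chunks.Count - 1].Count == ChunkSize)
            _chunks.Add(new List<T>(ChunkSize));
        _chunks[_chunks.Count - 1].Add(item);
    }
}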
I ran into the same problem. I solved it using a list of lists, which mimics an array very well but can go well beyond the 2 GB limit, e.g. List<List<sbyte>>. It worked for a 250k x 250k matrix of sbyte running on a 32 GB computer, even though this elephant represents 60 GB+ of space :-)
C# arrays are limited in size to System.Int32.MaxValue.
For bigger than that, use List<T> (where T is whatever you want to hold).
More here: What is the Maximum Size that an Array can hold?

How can I reduce the garbage generation in this situation

My game has gotten to the point where it's generating too much garbage, which is resulting in long GC times. I've been going around and reducing a lot of the garbage generated, but there's one spot that's allocating a large amount of memory too frequently, and I'm stuck on how to resolve it.
My game is a minecraft-type world that generates new regions as you walk. I have a large, variable size array that is allocated on the creation of a new region that is used to store the vertex data for the terrain. After the array is filled with data, it's passed to a slimdx DataStream so it can be used for rendering.
The problem is the fact that this is a variable-size array and that it needs to be passed to SlimDX, which calls GCHandle.Alloc on it. Since it's a variable size, it may have to be resized in order to reuse it. I also can't just allocate a max-sized array for each region because that would require impossibly large amounts of memory. I can't use a List because of the GCHandle business with SlimDX.
So far, resizing the array only when it needs to be made bigger seems to be the only plausible option to me, but it may not work out that well and will likely be a pain to implement. I'd need to keep track of the actual size of the array separately and use unsafe code to get a pointer to the array and pass that to slimdx. It may also end up eventually using such a large amount of memory that I have to occasionally go and reduce the size of all the arrays down to the minimum needed.
I'm hesitant to jump at this solution and would like to know if anyone sees any better solutions to this.
I'd suggest a tighter integration with the slimdx library. It's open source so you could dig in and find the critical path that you need for the rendering. Then you could integrate tighter by using a DMA-style memory sharing approach.
Since SlimDX is open source and it is too slow, the time has come to change the open source to suit your performance needs. What I see here is that you want to keep a much larger array but hand SlimDX only the actually used region, to prevent additional memory allocations for this potentially huge array.
There is a type in the .NET Framework named ArraySegment<T> which was made exactly for this purpose.
// Taken from MSDN (lightly reformatted)
// Create and initialize a new string array.
String[] myArr = { "The", "quick", "brown", "fox", "jumps", "over", "the",
                   "lazy", "dog" };

// Define an array segment that contains the middle five values of the array.
ArraySegment<String> myArrSegMid = new ArraySegment<String>( myArr, 2, 5 );
PrintIndexAndValues( myArrSegMid );

public static void PrintIndexAndValues( ArraySegment<String> arrSeg )
{
    for ( int i = arrSeg.Offset; i < (arrSeg.Offset + arrSeg.Count); i++ )
    {
        Console.WriteLine( " [{0}] : {1}", i, arrSeg.Array[i] );
    }
    Console.WriteLine();
}
That said, I have found the usage of ArraySegment somewhat awkward, because I always have to use the offset plus the index, which just does not behave like a regular array. Instead you can distill your own struct which allows zero-based indexing; that is much easier to use, but comes at the cost that every index-based access also costs you an add of the base offset. But if the usage pattern is mainly foreach loops, then it does not really matter.
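A rough sketch of such a zero-based wrapper struct (the name is made up; as noted above, every access pays one extra addition for the base offset):
// Zero-based view over the used region of a larger backing array.
public struct ArraySlice<T>
{
    private readonly T[] _array;
    private readonly int _offset;

    public ArraySlice(T[] array, int offset, int count)
    {
        _array = array;
        _offset = offset;
        Count = count;
    }

    public int Count { get; }

    public T this[int index]
    {
        get { return _array[_offset + index]; }   // one add per access
        set { _array[_offset + index] = value; }
    }
}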
I have had situations where ArraySegment was also too costly, because you allocate a struct every time and pass it to all methods by value on the stack. You need to watch closely where its usage is OK and whether it is allocated at too high a rate.
I sympathize with your problem with an older library, SlimDX, which may not be .NET compliant. I have dealt with such a situation.
Suggestions:
Use a more performance-efficient generic list or array-like class such as ArrayList. It keeps track of its size so you don't have to. Allocate the list in chunks, e.g. 100 elements at a time.
Use C++/.NET and take advantage of unsafe arrays or .NET classes like ArrayList.
Updated: use the idea of virtual memory. Save some data to an XML file or SQL database, relieving a huge amount of memory.
I realize it's a gamble either way.

non contiguous String object C#.net

From what I understand, String and StringBuilder objects both allocate contiguous memory underneath.
My program runs for days, buffering output in a String object. This sometimes causes an OutOfMemoryException, which I think is because contiguous memory is not available. My string size can go up to 100 MB and I'm concatenating new strings frequently, which causes new string objects to be allocated. I can reduce new string object creation by using StringBuilder, but that would not solve my problem entirely.
Is there an alternative to a contiguous string object?
A rope data structure may be an option but I don't know of any ready-to-use implementations for .NET.
Have you tried using a LinkedList of strings instead? Or perhaps you can modify your architecture to read and write a file on disk instead of keeping everything in memory.
DO NOT USE STRINGS.
Strings will copy and allocate a new string for every operation. That is, if you have a 50 MB string and add one character, then until garbage collection happens you will have two (approx.) 50 MB strings around.
Then you add another char, and you'll have 3... and so on.
On the other hand, proper use of StringBuilder, that is, using Append, should not have any problem with 100 MB.
Another optimization is creating the StringBuilder with your estimated size:
StringBuilder SB;
SB = new StringBuilder(capacity); // capacity being the suggested starting size
Use the StringBuilder to hold your big string, and then use Append.
HTH
By going so large your strings are moved to the Large Object Heap (LOH) and you run a greater risk of fragmentation.
A few options:
Use a StringBuilder. You will be re-allocating less frequently. And try to pre-allocate, like new StringBuilder(100*1000*1000);
Re-design your solution. There must be alternatives to keeping such large strings around. A List<string>, for instance, that is only converted to a single string when (really) necessary (see the sketch below).
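A minimal sketch of that second option (collecting the pieces and materializing a single string only on demand; the chunk contents are placeholders):
using System.Collections.Generic;

var pieces = new List<string>();

// Buffer output as many small strings instead of one growing string.
pieces.Add("first chunk of output");
pieces.Add("second chunk of output");

// Build one contiguous string only at the point it is really needed.
string whole = string.Concat(pieces);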
I don't believe there's any solution for this using either String or StringBuilder. Both will require contiguous memory. Is it possible to change your architecture such that you can save the ongoing data to a List, a file, a database, or some other structure designed for such purposes?
First you should examine why you are doing that and see if there are other things you can do that give you the same value.
Then you have lots of options (depending on what you need) ranging from using logging to writing a simple class that collects strings into a List.
You can try saving the string to a text file or a database such as SQL Server Express, MySQL, MS Access, etc. This way, if your server gets shut down for any reason (power outage, someone bumped the UPS, thunderstorm, etc.), you would not lose your data. It is a little slower than RAM, but I think the trade-off is worth it.
If this is not an option -- most definitely use the StringBuilder for adding strings.
