MSDN:

public IntPtr MaxWorkingSet { get; set; }

Gets or sets the maximum allowable working set size for the associated process.

Property Value: The maximum working set size that is allowed in memory for the process, in bytes.
So, as far as I understand, I can limit the amount of memory that can be used by a process. I've tried this, but with no luck.
Some code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public class A
{
public void Do()
{
List<string> guids = new List<string>();
do
{
guids.Add(Guid.NewGuid().ToString());
Thread.Sleep(5);
} while (true);
}
}
public static class App
{
public static void Main()
{
Process.GetCurrentProcess().MaxWorkingSet = new IntPtr(2097152);
try
{
new A().Do();
}
catch (Exception e)
{
}
}
}
I'm expecting an OutOfMemoryException after the limit of 2 MB is reached, but nothing happens. If I open Task Manager I can see that the amount of memory my application uses is growing continuously without any limit.
What am I doing wrong?
Thanks in advance
No, this doesn't limit the amount of memory used by the process. It's simply a wrapper around SetProcessWorkingSetSize which a) is a recommendation, and b) limits the working set of the process, which is the amount of physical memory (RAM) this process can consume.
It will absolutely not cause an out of memory exception in that process, even if it allocates significantly more than what you set the MaxWorkingSet property to.
There is an alternative to what you're trying to do -- the Win32 Job Object API. There's a managed wrapper for that on Codeplex (http://jobobjectwrapper.codeplex.com/) to which I contributed. It allows you to create a process and limit the amount of memory this process can use.
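If you want to see what the Job Object route looks like without the wrapper, the underlying Win32 calls are roughly as in the sketch below. This is my own illustration, not the wrapper's API: the structs and constants follow the Win32 documentation for CreateJobObject, SetInformationJobObject and AssignProcessToJobObject, and error handling is omitted.

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

static class JobLimit
{
    const int JobObjectExtendedLimitInformation = 9;
    const uint JOB_OBJECT_LIMIT_PROCESS_MEMORY = 0x00000100;

    [StructLayout(LayoutKind.Sequential)]
    struct JOBOBJECT_BASIC_LIMIT_INFORMATION
    {
        public long PerProcessUserTimeLimit;
        public long PerJobUserTimeLimit;
        public uint LimitFlags;
        public UIntPtr MinimumWorkingSetSize;
        public UIntPtr MaximumWorkingSetSize;
        public uint ActiveProcessLimit;
        public UIntPtr Affinity;
        public uint PriorityClass;
        public uint SchedulingClass;
    }

    [StructLayout(LayoutKind.Sequential)]
    struct IO_COUNTERS
    {
        public ulong ReadOperationCount, WriteOperationCount, OtherOperationCount;
        public ulong ReadTransferCount, WriteTransferCount, OtherTransferCount;
    }

    [StructLayout(LayoutKind.Sequential)]
    struct JOBOBJECT_EXTENDED_LIMIT_INFORMATION
    {
        public JOBOBJECT_BASIC_LIMIT_INFORMATION BasicLimitInformation;
        public IO_COUNTERS IoInfo;
        public UIntPtr ProcessMemoryLimit;
        public UIntPtr JobMemoryLimit;
        public UIntPtr PeakProcessMemoryUsed;
        public UIntPtr PeakJobMemoryUsed;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode)]
    static extern IntPtr CreateJobObject(IntPtr lpJobAttributes, string lpName);

    [DllImport("kernel32.dll")]
    static extern bool SetInformationJobObject(IntPtr hJob, int infoClass,
        ref JOBOBJECT_EXTENDED_LIMIT_INFORMATION info, int cbInfo);

    [DllImport("kernel32.dll")]
    static extern bool AssignProcessToJobObject(IntPtr hJob, IntPtr hProcess);

    public static void LimitCurrentProcess(ulong maxBytes)
    {
        // Create an anonymous job object and set a hard per-process memory limit on it.
        IntPtr job = CreateJobObject(IntPtr.Zero, null);
        var info = new JOBOBJECT_EXTENDED_LIMIT_INFORMATION();
        info.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_PROCESS_MEMORY;
        info.ProcessMemoryLimit = new UIntPtr(maxBytes);
        SetInformationJobObject(job, JobObjectExtendedLimitInformation, ref info, Marshal.SizeOf(info));

        // Put the current process into the job. Note: on versions of Windows before
        // Windows 8 a process can belong to only one job, so this can fail if the
        // process is already in a job.
        AssignProcessToJobObject(job, Process.GetCurrentProcess().Handle);
    }
}

With a process-memory limit in place, commits beyond the limit fail, which in a .NET process typically surfaces as an OutOfMemoryException, which is what the question is after.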
using System;
using System.Collections.Generic;

class Program
{
static void Main(string[] args)
{
Console.WriteLine("Before 1st call TestFunc");
TestFunc();
Console.WriteLine("After 1st call TestFunc");
Console.WriteLine("Before 2nd call TestFunc");
TestFunc();
Console.WriteLine("After 2nd call TestFunc");
Console.ReadLine();
}
public static void TestFunc()
{
List<Employee> empList = new List<Employee>();
for (int i = 1; i <= 50000000; i++)
{
Employee obj = new Employee();
obj.Name = "fjasdkljflasdkjflsdjflsjfkldsjfljsflsdjlkfajsd";
obj.ID = "11111111111111112222222222222222222222222222222";
empList.Add(obj);
}
}
}
public class Employee
{
public string Name;
public string ID;
}
I am creating a lot of employees only inside a function (local scope); when control comes back to the Main function, why is .NET not releasing the memory?
Let's say each function call uses 1 GB of memory; at the end of the function, the application still uses more than 1 GB of memory. Why is the GC not collecting after the scope ends?
This might be a simple question; any help would be great.
The GC doesn't run automatically at the end of a scope or function call. According to the MS documentation:
Garbage collection occurs when one of the following conditions is true:
• The system has low physical memory. This is detected by either the low memory notification from the OS or low memory indicated by the host.
• The memory that is used by allocated objects on the managed heap surpasses an acceptable threshold. This threshold is continuously adjusted as the process runs.
• The GC.Collect method is called. In almost all cases, you do not have to call this method, because the garbage collector runs continuously. This method is primarily used for unique situations and testing.
Fundamentals of Garbage Collection
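To see this for yourself, here is a minimal sketch (my own illustration, not from the documentation) that compares the managed heap size after the allocating method returns with the size after an explicit GC.Collect():

using System;
using System.Collections.Generic;

class GcDemo
{
    static void Main()
    {
        Allocate();
        // The list is unreachable after the call returns, but the heap may still hold it.
        Console.WriteLine("Heap after return:  {0:N0} bytes", GC.GetTotalMemory(false));

        GC.Collect();                   // force a full collection (for demonstration only)
        GC.WaitForPendingFinalizers();
        Console.WriteLine("Heap after Collect: {0:N0} bytes", GC.GetTotalMemory(true));
    }

    static void Allocate()
    {
        var list = new List<string>();
        for (int i = 0; i < 1000000; i++)
            list.Add(Guid.NewGuid().ToString());
    }
}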
There's already a question about this issue, but it's not telling me what I need to know:
Let's assume I have a web application, and there's a lot of logging on every roundtrip. I don't want to open a debate about why there's so much logging, or how I can do fewer logging operations. I want to know what possibilities I have in order to make this logging performant and clean.
So far, I've implemented declarative (attribute-based) and imperative logging, which seems to be a cool and clean way of doing it... now, what can I do about performance, assuming the logging may take longer than expected? Is it OK to open a thread and leave that work to it?
Things I would consider:
Use an efficient file format to minimise the quantity of data to be written (e.g. XML and text formats are easy to read but usually terribly inefficient - the same information can be stored in a binary format in a much smaller space). But don't spend lots of CPU time trying to pack data "optimally". Just go for a simple format that is compact but fast to write (see the sketch after this list).
Test use of compression on the logs. This may not be the case with a fast SSD but in most I/O situations the overhead of compressing data is less than the I/O overhead, so compression gives a net gain (although it is a compromise - raising CPU usage to lower I/O usage).
Only log useful information. No matter how important you think everything is, it's likely you can find something to cut out.
Eliminate repeated data. e.g. Are you logging a client's IP address or domain name repeatedly? Can these be reported once for a session and then not repeated? Or can you store them in a map file and use a compact index value whenever you need to reference them? etc
Test whether buffering the logged data in RAM helps improve performance (e.g. writing a thousand 20 byte log records will mean 1,000 function calls and could cause a lot of disk seeking and other write overheads, while writing a single 20,000 byte block in one burst means only one function call and could give a significant performance increase and maximise the burst rate you get out to disk). Often writing blocks in sizes like 4k, 16k, 32k or 64k of data works well as it tends to fit the disk and I/O architecture (but check your specific architecture for clues about what sizes might improve efficiency). The downside of a RAM buffer is that if there is a power outage you will lose more data. So you may have to balance performance against robustness.
(Especially if you are buffering...) Dump the information to an in-memory data structure and pass it to another thread to stream it out to disk. This will help stop your primary thread being held up by log I/O. Take care with threads though - for example, you may have to consider how you will deal with times when you are creating data faster than it can be logged for short bursts - do you need to implement a queue, etc?
Are you logging multiple streams? Can these be multiplexed into a single log to possibly reduce disk seeking and the number of open files?
Is there a hardware solution that will give a large bang for your buck? e.g. Have you used SSD or RAID disks? Will dumping the data to a different server help or hinder? It may not always make much sense to spend $10,000 of developer time making something perform better if you can spend $500 to simply upgrade the disk.
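As a minimal sketch of the first point above (my own illustration, with an example field layout), a log record can be written in a compact binary form with BinaryWriter instead of formatted text:

using System;
using System.IO;

static class BinaryLogWriter
{
    public static void Append(string path, DateTime timestamp, byte severity, string message)
    {
        using (var fs = new FileStream(path, FileMode.Append, FileAccess.Write))
        using (var writer = new BinaryWriter(fs))
        {
            writer.Write(timestamp.Ticks);  // 8 bytes instead of a ~20-character text timestamp
            writer.Write(severity);         // 1 byte
            writer.Write(message);          // length-prefixed string
        }
    }
}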
I use the code below to log. It is a singleton that accepts log messages and puts each one into a ConcurrentQueue. Every two seconds it writes everything that has come in to disk. Your app is now only delayed by the time it takes to enqueue each message. It's my own code; feel free to use it.
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Windows.Forms;
namespace FastLibrary
{
public enum Severity : byte
{
Info = 0,
Error = 1,
Debug = 2
}
public class Log
{
private struct LogMsg
{
public DateTime ReportedOn;
public string Message;
public Severity Seriousness;
}
// Nice and Threadsafe Singleton Instance
private static Log _instance;
public static Log File
{
get { return _instance; }
}
static Log()
{
_instance = new Log();
_instance.Message("Started");
_instance.Start("");
}
~Log()
{
Exit();
}
public static void Exit()
{
if (_instance != null)
{
_instance.Message("Stopped");
_instance.Stop();
_instance = null;
}
}
private ConcurrentQueue<LogMsg> _queue = new ConcurrentQueue<LogMsg>();
private Thread _thread;
private string _logFileName;
private volatile bool _isRunning;
public void Message(string msg)
{
_queue.Enqueue(new LogMsg { ReportedOn = DateTime.Now, Message = msg, Seriousness = Severity.Info });
}
public void Message(DateTime time, string msg)
{
_queue.Enqueue(new LogMsg { ReportedOn = time, Message = msg, Seriousness = Severity.Info });
}
public void Message(Severity seriousness, string msg)
{
_queue.Enqueue(new LogMsg { ReportedOn = DateTime.Now, Message = msg, Seriousness = seriousness });
}
public void Message(DateTime time, Severity seriousness, string msg)
{
_queue.Enqueue(new LogMsg { ReportedOn = time, Message = msg, Seriousness = seriousness });
}
private void Start(string fileName = "", bool oneLogPerProcess = false)
{
_isRunning = true;
// Unique FileName with date in it. And ProcessId so the same process running twice will log to different files
string lp = oneLogPerProcess ? "_" + System.Diagnostics.Process.GetCurrentProcess().Id : "";
_logFileName = fileName == ""
? DateTime.Now.Year.ToString("0000") + DateTime.Now.Month.ToString("00") +
DateTime.Now.Day.ToString("00") + lp + "_" +
System.IO.Path.GetFileNameWithoutExtension(Application.ExecutablePath) + ".log"
: fileName;
_thread = new Thread(LogProcessor);
_thread.IsBackground = true;
_thread.Start();
}
public void Flush()
{
EmptyQueue();
}
private void EmptyQueue()
{
while (_queue.Any())
{
var strList = new List<string>();
//
try
{
// Block concurrent writing to file due to flush commands from other context
lock (_queue)
{
LogMsg l;
while (_queue.TryDequeue(out l)) strList.Add(l.ReportedOn.ToLongTimeString() + "|" + l.Seriousness + "|" + l.Message);
if (strList.Count > 0)
{
System.IO.File.AppendAllLines(_logFileName, strList);
strList.Clear();
}
}
}
catch
{
//ignore errors on errorlogging ;-)
}
}
}
public void LogProcessor()
{
while (_isRunning)
{
EmptyQueue();
// Sleep while running so we write in efficient blocks
if (_isRunning) Thread.Sleep(2000);
else break;
}
}
private void Stop()
{
// This is never called in the singleton.
// But we made it a background thread so all will be killed anyway
_isRunning = false;
if (_thread != null)
{
_thread.Join(5000);
_thread.Abort();
_thread = null;
}
}
}
}
Check if the logger is debug-enabled before calling logger.Debug; this means your code does not have to evaluate the message string when debug logging is turned off.
if (_logger.IsDebugEnabled) _logger.Debug($"slow old string {this.foo} {this.bar}");
I have a problem with freeing memory in C#. I have a static class containing a static dictionary, which is filled with references to objects. A single object occupies a large amount of memory. From time to time I release memory by setting obsolete references to null and removing the items from the dictionary. Unfortunately, in this case the memory usage does not decrease; only some time later, after the process reaches its maximum memory size, are unused resources suddenly released and the amount of memory used correctly decreases.
Below is an outline of the classes:
using System;
using System.Collections.Concurrent;

public class cObj
{
public DateTime CreatedOn;
public object Status;
public object ObjectData;
}
public static class cData
{
public static ConcurrentDictionary<Guid, cObj> ObjectDict = new ConcurrentDictionary<Guid, cObj>();
public static void FreeData()
{
foreach(var o in ObjectDict)
{
if (o.Value.CreatedOn <= DateTime.Now.AddSeconds(-30))
{
cObj Data;
if (ObjectDict.TryGetValue(o.Key, out Data))
{
Data.Status = null;
Data.ObjectData = null;
ObjectDict.TryRemove(o.Key, out Data);
}
}
}
}
}
In this case, the memory is not released. If, however, after this operation I call GC.Collect(), the expected release of unused objects follows.
How can I solve the problem so that I do not have to call GC.Collect()?
You shouldn't have to call GC.Collect() in most cases. To GC.Collect or not?
I've had similar scenarios where I've just created a dictionary that's limited to n entries. I did this myself on top of ConcurrentDictionary, but you could use BlockingCollection.
One possible advantage is that if 1 million entries get added at the same time, all except n will be available for garbage collection rather than 30 seconds later.
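A minimal sketch of that idea on top of ConcurrentDictionary (the class name, the timestamp-based eviction, and the linear eviction scan are my own illustration, not the answerer's actual code):

using System;
using System.Collections.Concurrent;
using System.Linq;

public class BoundedCache<TKey, TValue>
{
    private class Entry
    {
        public DateTime AddedOn;
        public TValue Value;
    }

    private readonly int _maxEntries;
    private readonly ConcurrentDictionary<TKey, Entry> _items = new ConcurrentDictionary<TKey, Entry>();

    public BoundedCache(int maxEntries)
    {
        _maxEntries = maxEntries;
    }

    public void Add(TKey key, TValue value)
    {
        _items[key] = new Entry { AddedOn = DateTime.UtcNow, Value = value };

        // Evict the oldest entries once the cap is exceeded, so they become
        // eligible for garbage collection immediately rather than after a timeout.
        while (_items.Count > _maxEntries)
        {
            var oldest = _items.OrderBy(kv => kv.Value.AddedOn).First();
            Entry removed;
            _items.TryRemove(oldest.Key, out removed);
        }
    }

    public bool TryGet(TKey key, out TValue value)
    {
        Entry entry;
        if (_items.TryGetValue(key, out entry))
        {
            value = entry.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }
}

The eviction scan here is linear per insertion over the cap, which is fine for a sketch; a production version would track insertion order more cheaply.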
I am experiencing a strange memory leak in a computationally expensive content-based image retrieval (CBIR) .NET application.
The concept is that there is a service class with a thread loop which captures images from some source and then passes them to an image-tagging thread for annotation.
Image tags are queried from the repository by the service class at specified time intervals and stored in its in-memory cache (a Dictionary) to avoid frequent db hits.
The classes in the project are:
using System;
using System.Collections.Generic;

class Tag
{
public Guid Id { get; set; } // tag id
public string Name { get; set; } // tag name: e.g. 'sky','forest','road',...
public byte[] Jpeg { get; set; } // tag jpeg image patch sample
}
interface IRepository
{
IEnumerable<Tag> FindAll();
}
class Service
{
private IDictionary<Guid, Tag> Cache { get; set; } // to avoid frequent db reads
// image capture background worker (ICBW)
// image annotation background worker (IABW)
}
class Image
{
public byte[] Jpeg { get; set; }
public IEnumerable<Tag> Tags { get; set; }
}
The ICBW worker captures a JPEG image from some image source and passes it to the IABW worker for annotation. The IABW worker first tries to update the Cache if it is time to do so, and then annotates the image with some algorithm, creating an Image object, attaching Tags to it, and storing it in the annotation repository.
The Service cache-update snippet in the IABW worker is:
IEnumerable<Tag> tags = repository.FindAll();
Cache.Clear();
tags.ForEach(t => Cache.Add(t.Id, t));
The IABW worker is called many times a second and is pretty processor-intensive.
While running it for days I found the memory increasing in Task Manager. Using Perfmon to watch Process\Private Bytes and .NET CLR Memory\# Bytes in all Heaps, I found them both increasing over time.
Experimenting with the application, I found that the Cache update is the problem. If it is not updated there is no problem with the memory increase. But if the Cache update is as frequent as once every 1-5 minutes, the application runs out of memory pretty fast.
What might be the reason for that memory leak? Image objects are created quite often, containing references to Tag objects in the Cache. I presume that when the Cache dictionary is recreated, those references somehow are not garbage collected later.
Do I need to explicitly null managed byte[] objects to avoid the memory leak, e.g. by implementing Tag and Image as IDisposable?
Edit: 4 Aug 2001, added the buggy code snippet causing the quick memory leak.
static void Main(string[] args)
{
while (!Console.KeyAvailable)
{
IEnumerable<byte[]> data = CreateEnumeration(100);
PinEntries(data);
Thread.Sleep(900);
Console.Write(String.Format("gc mem: {0}\r", GC.GetTotalMemory(true)));
}
}
static IEnumerable<byte[]> CreateEnumeration(int size)
{
Random random = new Random();
IList<byte[]> data = new List<byte[]>();
for (int i = 0; i < size; i++)
{
byte[] vector = new byte[12345];
random.NextBytes(vector);
data.Add(vector);
}
return data;
}
static void PinEntries(IEnumerable<byte[]> data)
{
var handles = data.Select(d => GCHandle.Alloc(d, GCHandleType.Pinned));
var ptrs = handles.Select(h => h.AddrOfPinnedObject());
IntPtr[] dataPtrs = ptrs.ToArray();
Thread.Sleep(100); // unmanaged function call taking byte** data
handles.ToList().ForEach(h => h.Free());
}
No, you don't need to set anything to null or dispose of anything if it's just memory as you've shown.
I suggest you get hold of a good profiler to work out where the leak is. Do you have anything non-memory-related that you might be failing to dispose of, e.g. loading a GDI+ image to get the bytes?
We are dealing with a lot of files which need to be opened and closed, mostly for data reads.
Is it a good idea or not to cache the MemoryStream of each file in a temporary hashtable or some other object?
We have noticed that when opening files over 100 MB we are running into out-of-memory exceptions.
We are using a WPF app.
We can successfully open the files 1 or 2 times, sometimes 3 to 4 times, but after that we run into out-of-memory exceptions.
If you are currently caching these files, then you would expect to run out of memory quite quickly.
If you aren't caching them yet, don't, because you'll just make it worse. Perhaps you have a memory leak? Are you disposing of the MemoryStream once you've used it?
The best way to deal with large files is to stream data in and out (using FileStreams), so that you don't have to have the whole file in memory at once...
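As a minimal sketch of that streaming approach (the chunk size and the processChunk callback are my own illustration), only one fixed-size buffer is ever in memory at a time:

using System;
using System.IO;

class StreamingReader
{
    public static void ReadInChunks(string path, Action<byte[], int> processChunk)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            var buffer = new byte[64 * 1024]; // 64 KB per read
            int read;
            while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                processChunk(buffer, read); // handle only this chunk, never the whole file
            }
        }
    }
}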
One issue with MemoryStream is that the internal buffer doubles in size each time the capacity is forced to increase. Even if your MemoryStream is 100 MB and your file is 101 MB, as soon as you try to write that last 1 MB to the MemoryStream, its internal buffer is doubled to 200 MB. You can reduce this if you give the MemoryStream a starting capacity equal to that of your files. But this will still allow the files to use all of the memory and stop any new allocations after some of the files are loaded.
If you create a cache object that is held inside a WeakReference object, you allow the garbage collector to toss a few of your cached files as needed. But don't forget you will need to add code to recreate the lost cache on demand.
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public class CacheStore<TKey, TCache>
{
private static object _lockStore = new object();
private static CacheStore<TKey, TCache> _store;
private static object _lockCache = new object();
private static Dictionary<TKey, TCache> _cache =
new Dictionary<TKey, TCache>();
public TCache this[TKey index]
{
get
{
lock (_lockCache)
{
if (_cache.ContainsKey(index))
return _cache[index];
return default(TCache);
}
}
set
{
lock (_lockCache)
{
if (_cache.ContainsKey(index))
_cache.Remove(index);
_cache.Add(index, value);
}
}
}
public static CacheStore<TKey, TCache> Instance
{
get
{
lock (_lockStore)
{
if (_store == null)
_store = new CacheStore<TKey, TCache>();
return _store;
}
}
}
}
public class FileCache
{
private WeakReference _cache;
public FileCache(string fileLocation)
{
if (!File.Exists(fileLocation))
throw new FileNotFoundException("fileLocation", fileLocation);
this.FileLocation = fileLocation;
}
private MemoryStream GetStream()
{
if (!File.Exists(this.FileLocation))
throw new FileNotFoundException("fileLocation", FileLocation);
return new MemoryStream(File.ReadAllBytes(this.FileLocation));
}
public string FileLocation { get; private set; }
public MemoryStream Data
{
get
{
if (_cache == null)
_cache = new WeakReference(GetStream(), false);
var ret = _cache.Target as MemoryStream;
if (ret == null)
{
Recreated++;
ret = GetStream();
_cache.Target = ret;
}
return ret;
}
}
public int Recreated { get; private set; }
}
class Program
{
static void Main(string[] args)
{
var cache = CacheStore<string, FileCache>.Instance;
var fileName = @"c:\boot.ini";
cache[fileName] = new FileCache(fileName);
var ret = cache[fileName].Data.ToArray();
Console.WriteLine("Recreated {0}", cache[fileName].Recreated);
Console.WriteLine(Encoding.ASCII.GetString(ret));
GC.Collect();
var ret2 = cache[fileName].Data.ToArray();
Console.WriteLine("Recreated {0}", cache[fileName].Recreated);
Console.WriteLine(Encoding.ASCII.GetString(ret2));
GC.Collect();
var ret3 = cache[fileName].Data.ToArray();
Console.WriteLine("Recreated {0}", cache[fileName].Recreated);
Console.WriteLine(Encoding.ASCII.GetString(ret3));
Console.Read();
}
}
It's very difficult to say "yes" or "no" to whether caching file content is right in the general case, especially with so little information. However, finite resources are the real state of the world, and you (as a developer) must take them into account. If you want to cache something, you should use some mechanism for automatically unloading data. In the .NET Framework you can use the WeakReference class, which allows the target object to be collected (a byte array or a MemoryStream is an object too).
If you have the hardware under your control, can use 64-bit, and have the funds for a very large amount of RAM, you can cache big files.
However, you should be frugal with resources (CPU, RAM) and use the "cheap" way of implementing it.
I think that the problem is that after you are done, the file is not disposed immediately; it is waiting for the next GC cycle.
Streams are IDisposable, which means you can and should use a using block; then the stream will be disposed immediately when you are done with it.
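For example (a small illustrative sketch, with a hypothetical path parameter):

using System.IO;

class UsingExample
{
    static void ReadOnce(string path)
    {
        // The stream is disposed as soon as the block exits,
        // instead of waiting for the garbage collector / finalizer.
        using (var ms = new MemoryStream(File.ReadAllBytes(path)))
        {
            // ... work with ms ...
        } // Dispose() runs here, deterministically
    }
}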
I don't think that caching such an amount of data is a good solution, even if you never get a memory overflow. Check out the memory-mapped files solution, which means the file stays on the file system but the speed of reading is almost equal to in-memory reads (there is some overhead, for sure). Check out this link: MemoryMappedFiles.
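A minimal sketch of reading a large file through a memory-mapped view instead of loading it into a MemoryStream (the file path and buffer size are just examples):

using System;
using System.IO;
using System.IO.MemoryMappedFiles;

class MmfExample
{
    static void Main()
    {
        string path = @"c:\temp\bigfile.dat"; // example path

        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        using (var view = mmf.CreateViewStream()) // stream over the mapped file
        {
            var buffer = new byte[4096];
            int read = view.Read(buffer, 0, buffer.Length); // read only what you need
            Console.WriteLine("Read {0} bytes from the mapped view", read);
        }
    }
}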
P.S. There are pretty good articles and examples on this topic around the internet.
Good Luck.