How to handle System.OutOfMemoryException when using protobuf-net? - c#

I'm using protobuf-net in a C# application to load and save my program's 'project files'. At save time, the program creates a ProjectData object and adds many different objects to it; the general principle is shown below.
static ProjectData packProjectData()
{
    ProjectData projectData = new ProjectData();
    projectData.projectName = ProjectHandler.projectName;
    foreach (KeyValuePair<int, Module> kvp in DataHandler.modulesDict)
    {
        projectData.modules.Add(serializeModule(kvp.Value));
    }
    return projectData;
}
[ProtoContract]
public class ProjectData
{
    [ProtoMember(1)]
    public List<SEModule> modules = new List<SEModule>();
    [ProtoMember(2)]
    public string projectName = "";
}
Once this is created, it's zipped and saved to disk. The problem I am having is that when the number of modules gets very big (40,000+), a System.OutOfMemoryException is thrown during the packProjectData function.
I've seen questions like this asked already, but these do not contain a clear answer to address the problem. If anyone can give me either a specific solution, or a general principle to follow that would be greatly appreciated.

What sort of size are we talking about here? Most likely this is due to buffering required for the length prefix - something that v3 will address, but for now - if the file is huge, a pragmatic workaround might be:
[ProtoContract]
public class ProjectData
{
    [ProtoMember(1, DataFormat = DataFormat.Grouped)]
    public List<SEModule> modules = new List<SEModule>();
    [ProtoMember(2)]
    public string projectName = "";
}
This changes the internal encoding format of the SEModule items so that no length-prefix is required. This same approach may also be useful for some elements inside SEModule, but I can't see that to comment.
Note that this changes the data layout, so should be considered a breaking change.

Related

C# - Best alternative to Shell32 object for retrieving extended properties

I'm stuck on how to design my program. In simple terms, it needs to list the files in a given path and then sort them, for now by date, creating the respective subdirectories. The problem arises because the files are uploaded from a phone to a NAS, and their creation date gets modified when they are uploaded to this drive. Since these are photo, video, or audio files, I tried using metadata. The best way I found to retrieve the common dates stored in the metadata, based on this answer, is this:
internal static class FileInSorting
{
    private static List<string> arrHeaders;
    private static List<int> date = new List<int>() { 197, 3, 4, 5, 12, 198, 287, 208 };
    private static List<FileToSort> fileToSort = new List<FileToSort>();
    public static List<FileToSort> GetFilesToSort(string path)
    {
        Folder objFolder;
        LoadHeader(path, out arrHeaders, out objFolder);
        //I search for each file inside his extended property for the true creation date
        //If none is found I default to what FileInfo thinks is right
        foreach (Shell32.FolderItem2 item in objFolder.Items())
        {
            List<DateTime> tmp = new List<DateTime>();
            DateTime SelectedDate;
            foreach (int h in date)
            {
                string l = objFolder.GetDetailsOf(item, h);
                if (!string.IsNullOrEmpty(l))
                {
                    string asAscii = Encoding.ASCII.GetString(
                        Encoding.Convert(
                            Encoding.UTF8,
                            Encoding.GetEncoding(
                                Encoding.ASCII.EncodingName,
                                new EncoderReplacementFallback(string.Empty),
                                new DecoderExceptionFallback()),
                            Encoding.UTF8.GetBytes(l)
                        )
                    );
                    tmp.Add(DateTime.Parse(asAscii.Substring(0, 11)));
                }
            }
            if (tmp.Count == 0)
                SelectedDate = File.GetCreationTime(item.Path);
            else
                SelectedDate = tmp.Min();
            fileToSort.Add(new FileToSort(item.Name, item.Path, SelectedDate));
        }
        return fileToSort;
    }
    public static void LoadHeader(string path, out List<string> arrHeaders, out Folder objFolder)
    {
        arrHeaders = new List<string>();
        Shell32.Shell shell = new Shell32.Shell();
        objFolder = shell.NameSpace(path);
        for (int i = 0; i < short.MaxValue; i++)
        {
            string header = objFolder.GetDetailsOf(null, i);
            if (!String.IsNullOrEmpty(header))
                arrHeaders.Add(header);
        }
    }
}
I made this class just for easy use during sort but it could be completely redundant
public class FileToSort
{
    public string nome { get; set; }
    public string path { get; set; }
    public DateTime sortDate { get; set; }
    public FileToSort(string nome, string path, DateTime data)
    {
        this.nome = nome;
        this.path = path;
        this.sortDate = data;
    }
}
The problem with this COM object is that it is slow and not easy to handle (maybe I'm just not able to), and, as it turned out in another question of mine, it's not thread-safe, blocking out the option for parallel operation on multiple folders after the first sort.
For example, I'm first sorting all files into a tree structure "[YEAR]/[Month]/[Full date]", but then I would have to recreate the COM object for each "Full date" folder and sort those by type. I'm aware that after the first date sort I could start using Directory.EnumerateFiles() for each of the newly created folders, but I would like to see if there is a better way to design the code so that it can be reused without writing two separate methods for the date sort and the type sort. So, is there a way to avoid using the COM object entirely?
Quick edit: I forgot another reason why I'm looking for a different solution:
this is a WPF application and I would really like to use a ListView bound to a single collection, perhaps a FileInfo collection.
"The problem arises since the files are in a network and their creation date gets modified when uploaded"
That's your choice, and thus your problem. If you don't want file dates to change on upload, don't change them. Windows Explorer, for example, doesn't change them by default; you get the same creation date as the source. Your own code has full control over which dates to use.
"I made this class just for easy use during sort but it could be completely redundant"
You should look up records, and the proper .NET naming conventions (public properties should be capitalized).
"it's not thread-safe, blocking out the option for parallel operation on multiple folders after the first sort"
You're jumping to conclusions here. It may not be thread-safe, but nothing stops you from creating multiple objects to query through, one for each thread. Look up thread-local variables and/or statics.
"but then I would have to recreate the COM object for each 'Full date' folder and sort those by type"
That line is a little harder to understand, but if you're saying you "need" to requery the entire file system just to sort items, then you're dead wrong. Sorting is a view operation; the model doesn't care about it, and what you're writing here is the model. Sorting for the view can be handled any way you want: you already have the data in memory, so sort it as you wish.
And I don't wish to go through your code too deeply, but holy wow, what is this:
string asAscii = Encoding.ASCII.GetString(
    Encoding.Convert(
        Encoding.UTF8,
        Encoding.GetEncoding(
            Encoding.ASCII.EncodingName,
            new EncoderReplacementFallback(string.Empty),
            new DecoderExceptionFallback()),
        Encoding.UTF8.GetBytes(l)
    )
);
If I had to rate it you'd be fired by the time I counted to 0... Just use the original string, what are you doing, man?
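To illustrate the point about sorting being an in-memory operation, here is a minimal sketch that groups already-loaded entries by date and then by file type without querying the file system again. FileEntry and FileGrouping are hypothetical names used only for this example (FileEntry stands in for the FileToSort class from the question), and record types assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical stand-in for FileToSort, written as a record with capitalized properties.
public record FileEntry(string Name, string FullPath, DateTime SortDate);

public static class FileGrouping
{
    // Groups entries by calendar date, then by lower-cased file extension.
    // Works purely on the data already in memory; no COM object is involved.
    public static Dictionary<DateTime, Dictionary<string, List<FileEntry>>> ByDateThenType(
        IEnumerable<FileEntry> files)
    {
        return files
            .GroupBy(f => f.SortDate.Date)
            .ToDictionary(
                byDate => byDate.Key,
                byDate => byDate
                    .GroupBy(f => Path.GetExtension(f.FullPath).ToLowerInvariant())
                    .ToDictionary(byType => byType.Key, byType => byType.ToList()));
    }
}
```

The expensive metadata query then only has to run once per file; every later sort or grouping for the view reuses the same list.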

How to cache reading .csv files in C#

This may be a noob question, but I need some help. I have written two simple methods in C#: ReadCsv_IT and GetTranslation. The ReadCsv_IT method reads from a CSV file. The GetTranslation method calls ReadCsv_IT and returns the translation of the input (string key).
My problem is that in the future I will have to call GetTranslation many times, but I obviously don't want to read the .csv file every time. So I was thinking about caching the data in memory so that I don't have to read the .csv file on every request, but I am not sure how to do it or what else I could do to optimize my program. Can anyone please help?
public string ReadCsv_IT(string key)
{
    string newKey = "";
    using (var streamReader = new StreamReader(@"MyResource.csv"))
    {
        CsvReader csv = new CsvReader(streamReader);
        csv.Configuration.Delimiter = ";";
        List<DataRecord> rec = csv.GetRecords<DataRecord>().ToList();
        DataRecord record = rec.FirstOrDefault(a => a.ORIGTITLE == key);
        if (record != null)
        {
            //DOES THE LOCALIZATION with the help of the .csv file.
        }
    }
    return newKey;
}
Here is the GetTranslation Method:
public string GetTranslation(string key, string culture = null)
{
    string result = "";
    if (culture == null)
    {
        culture = Thread.CurrentThread.CurrentCulture.Name;
    }
    if (culture == "it-IT")
    {
        result = ReadCsv_IT(key);
    }
    return result;
}
Here is also the class DataRecord.
class DataRecord
{
    public string ORIGTITLE { get; set; }
    public string REPLACETITLE { get; set; }
    public string ORIGTOOLTIP { get; set; }
}
Two options, IMO.
First: turn your stream into an object. In other words, make a class that wraps the stream, so that you can keep referring to the same instance instead of reopening the file.
Second: initialize your stream in the scope that calls GetTranslation, and pass it as a parameter to GetTranslation and ReadCsv_IT.
Brecht C and Thom Hubers have already given you good advice. I would like to add one more point, though: using csv files for localization in .NET is not really a good idea. Microsoft recommends using a resource-based approach (this article is a good starting point). It seems to me that you are trying to write code for something that is already built into .NET.
From a translation point of view, csv files are not the best possible format either. First of all, they are not really standardized: many tools have slightly different ways of handling commas, quotes, and line breaks that are part of the translated text. Besides, translators will be tempted to open them in Excel, and, unless handled with caution, Excel will write out translations in whatever encoding it deems best.
If the project you are working on is for learning, please feel free to go ahead with it, but if you are developing software that will be used by customers, updated, translated into several target languages, and redeployed, I would recommend reconsidering your internationalization approach.
@Brecht C is right; use that answer to start. When a value has to be cached for use by multiple threads or instances, take a look at an in-memory cache (such as MemoryCache), or at Redis when performance and distribution over several clients become an issue.
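As a rough illustration of the caching idea discussed in these answers, the sketch below loads the file once into a dictionary and serves every later lookup from memory. It is only a sketch under assumptions: TranslationCache, CsvPath, and Load are hypothetical names, the file is assumed to have a header row and ';' as delimiter with ORIGTITLE and REPLACETITLE as the first two columns, and a plain Split is used instead of the CsvReader from the question, so quoted fields are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class TranslationCache
{
    // Assumed file location; the question reads "MyResource.csv".
    public static string CsvPath = "MyResource.csv";

    // Lazy<T> runs the loader once, on first access, and is thread-safe by default.
    private static readonly Lazy<Dictionary<string, string>> Italian =
        new Lazy<Dictionary<string, string>>(() => Load(CsvPath));

    // Reads the whole file into a key -> translation map.
    public static Dictionary<string, string> Load(string path)
    {
        var map = new Dictionary<string, string>();
        foreach (var line in File.ReadLines(path).Skip(1)) // skip the header row
        {
            var fields = line.Split(';');
            if (fields.Length >= 2)
                map[fields[0]] = fields[1];
        }
        return map;
    }

    // Returns the cached translation, or an empty string when the key is unknown
    // (mirroring the empty default in the question's code).
    public static string GetTranslation(string key)
    {
        return Italian.Value.TryGetValue(key, out var value) ? value : "";
    }
}
```

A dictionary lookup also replaces the linear FirstOrDefault search, so repeated calls stay cheap even for large files.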

Keeping track of user customizations in C#

Good evening. I have an application with a drop-down list. The list is meant to hold commonly visited websites and can be altered by the user.
My question is: how can I store these values in a manner that allows users to change them?
Example: as the user, I decide I want Google to be my first website and YouTube to be my second.
I have considered making a "settings" file; however, is it practical to put 20+ websites into a settings file and then load them at startup? A local database is another option, but this may be overkill for such a simple need.
Please point me in the right direction.
Given that you have already excluded a database (probably for the right reasons, as it may be overkill for a small app), I'd recommend writing the data to a local file, but not as plain text.
Preferably, serialize it as either XML or JSON.
This approach has at least two benefits:
- More complex data can be stored in the future. For example, while the order can be implicit, it can be made explicit, and additional data such as the last time the URL was used can be added.
- Structured data is easier to validate against random corruption. With a plain text file it is much harder to ensure integrity.
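As one possible shape for the JSON option, here is a minimal sketch using System.Text.Json (available in .NET Core 3.0 and later). SiteEntry and SiteStore are hypothetical names chosen for this example, and the file path is supplied by the caller.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical item type: one website with an explicit position in the list.
public class SiteEntry
{
    public string Address { get; set; } = "";
    public int Order { get; set; }
}

public static class SiteStore
{
    // Writes the whole list as indented JSON, replacing any existing file.
    public static void Save(string path, List<SiteEntry> sites)
    {
        var options = new JsonSerializerOptions { WriteIndented = true };
        File.WriteAllText(path, JsonSerializer.Serialize(sites, options));
    }

    // Returns an empty list when the file does not exist yet (first start).
    public static List<SiteEntry> Load(string path)
    {
        if (!File.Exists(path))
            return new List<SiteEntry>();
        return JsonSerializer.Deserialize<List<SiteEntry>>(File.ReadAllText(path))
               ?? new List<SiteEntry>();
    }
}
```

Loading twenty or so entries this way at startup is a single small file read, so the size of the list should not be a practical concern.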
The best approach would be to use the serializers built into .NET, which let you work with the file in an object-oriented way, without having to worry about the details of the file format.
Here is some sample code I quickly wrote for you.
using System;
using System.IO;
using System.Collections;
using System.Xml.Serialization;
namespace ConsoleApplication3
{
    public class UrlSerializer
    {
        public static void Write(string filename)
        {
            URLCollection urls = new URLCollection();
            urls.Add(new Url { Address = "http://www.google.com", Order = 1 });
            urls.Add(new Url { Address = "http://www.yahoo.com", Order = 2 });
            XmlSerializer x = new XmlSerializer(typeof(URLCollection));
            // Dispose the writer so the file is flushed and closed.
            using (TextWriter writer = new StreamWriter(filename))
            {
                x.Serialize(writer, urls);
            }
        }
        public static URLCollection Read(string filename)
        {
            var x = new XmlSerializer(typeof(URLCollection));
            using (TextReader reader = new StreamReader(filename))
            {
                return (URLCollection)x.Deserialize(reader);
            }
        }
    }
    // Url was not defined in the original sample; this minimal definition is an assumption.
    public class Url
    {
        public string Address { get; set; }
        public int Order { get; set; }
    }
    public class URLCollection : ICollection
    {
        public string CollectionName;
        private ArrayList _urls = new ArrayList();
        public Url this[int index]
        {
            get { return (Url)_urls[index]; }
        }
        public void CopyTo(Array a, int index)
        {
            _urls.CopyTo(a, index);
        }
        public int Count
        {
            get { return _urls.Count; }
        }
        public object SyncRoot
        {
            get { return this; }
        }
        public bool IsSynchronized
        {
            get { return false; }
        }
        public IEnumerator GetEnumerator()
        {
            return _urls.GetEnumerator();
        }
        public void Add(Url url)
        {
            if (url == null) throw new ArgumentNullException("url");
            _urls.Add(url);
        }
    }
}
You clearly need some sort of persistence, for which there are a few options:
1. Local database - As you have noted, total overkill. You are just storing a list, not relational data.
2. Simple text file - Pretty easy, but maybe not the most "professional" way. Using XML serialization to this file would allow for complex data types.
3. Settings file - Are these preferences really settings? If they are, then this makes sense.
4. The Registry - This is great for settings you don't want your users to ever manually mess with. Probably not the best option for a significant amount of data, though.
I would go with number 2. It doesn't sound like you need any fancy encoding or security, so just store everything in a text file. *.ini files tend to meet this description, but you can use any extension you want. A settings file doesn't seem like the right place for this scenario.

How to read data from multiple (multi language) resource files?

I am trying out the multi-language features in an application. I have created the resource files GlobalTexts.en-EN.resx and GlobalTexts.fr-FR.resx, and a class that sets the culture and returns the texts (I will not go into all the details, just show the structure):
public class Multilanguage
{
    ...
    _res_man_global = new ResourceManager("GlobalResources.Resources.GlobalTexts", System.Reflection.Assembly.GetExecutingAssembly());
    ...
    public virtual string GetText(string _key)
    {
        return _res_man_global.GetString(_key, _culture);
    }
}
...
Multilanguage _translations = new Multilanguage();
...
someText = _translations.GetText(_someKey);
...
This works just fine.
Now, I would like to use this application in another solution that basically extends it (more windows etc.). It also has resource files, ExtendedTexts.en-En.resx and ExtendedTexts.fr-FR.resx, and a new class like:
public class ExtendedMultilanguage : Multilanguage
{
    ...
    _res_man_local = new ResourceManager("ExtendedResources.Resources.ExtendedTexts", System.Reflection.Assembly.GetExecutingAssembly());
    ...
    public override string GetText(string _key)
    {
        string _result;
        try
        {
            _result = _res_man_local.GetString(_key, _culture);
        }
        catch (Exception ex)
        {
            // Fall back to the base class, which looks in GlobalTexts.
            _result = base.GetText(_key);
        }
        return _result;
    }
}
...
ExtendedMultilanguage _translations = new ExtendedMultilanguage();
...
someText = _translations.GetText(_someKey);
...
The idea is that if the key is not found in ExtendedTexts, the method calls the base class, which looks in GlobalTexts. I did this so that I can call GetText(wantedKey) everywhere in the code without having to care about the location of the resource (I do not want to include the translations from the extensions in the GlobalTexts files); it is just the class used that differs from project to project.
The problem I am facing is that the try/catch is very slow when exceptions are raised: I wait seconds for one window to populate. I tested with a direct call and it works much faster, but then I need to keep track of where each resource is located...
The question is: is there an alternative way of doing this (having resources spread across various files and only one method that returns the desired resource without throwing an error)?
In the end I took a workaround solution and loaded all the content of the resource files into dictionaries. This way I can use ContainsKey to see whether a key exists.
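A minimal sketch of that dictionary workaround might look like the following. TextLookup is a hypothetical name, and the two dictionaries are assumed to have been filled once at startup from the ExtendedTexts and GlobalTexts resource files for the current culture (for example by enumerating the entries of each resource set); that loading step is not shown here.

```csharp
using System.Collections.Generic;

// Hypothetical lookup class: tries the extension texts first, then the global ones.
public class TextLookup
{
    private readonly IReadOnlyDictionary<string, string> _local;
    private readonly IReadOnlyDictionary<string, string> _global;

    public TextLookup(IReadOnlyDictionary<string, string> local,
                      IReadOnlyDictionary<string, string> global)
    {
        _local = local;
        _global = global;
    }

    // Returns the text for the key, or null when neither dictionary contains it.
    // No exception is thrown for a missing key, so there is no try/catch cost.
    public string GetText(string key)
    {
        if (_local.TryGetValue(key, out var text)) return text;
        if (_global.TryGetValue(key, out text)) return text;
        return null;
    }
}
```

TryGetValue is used instead of ContainsKey followed by an indexer access, which avoids looking up each key twice.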

How to use the read/writeable local XML settings?

I found something similar to what I need here:
http://www.codeproject.com/KB/cs/PropertiesSettings.aspx
But it does not quite do it for me. The user settings are stored in some far-away location such as C:\documents and settings\[username]\local settings\application data\[your application], but I do not have access to these folders, and I cannot copy the settings file from one computer to another or delete the file altogether. Also, it would be super convenient to have the settings XML file right next to the app, and to copy/ship both. This is used for demo-ware (which is a legitimate type of coding task) and will be used by non-technical people in the field. I need to make this quickly, so I need to reuse some existing library and not write my own. I need to make it easy to use and portable. The last thing I want is to get a call at midnight saying that settings do not persist when edited through the settings dialog that I will have built.
So, user settings are stored god knows where, and application settings are read-only (no go). Is there anything else that I can do? I think the app.config file has multiple purposes, and I think I once saw it being used the way I want; I just cannot find the link.
Let me know if something is not clear.
You could create a class that holds your settings and then XML-serialize it:
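using System;
using System.IO;
using System.Xml.Serialization;
public class Settings
{
    public string Setting1 { get; set; }
    public int Setting2 { get; set; }
}
// SettingsStore and "settings.xml" are illustrative names, not part of any library.
public static class SettingsStore
{
    // Assumed location: next to the executable, as the question asks for.
    static readonly string SettingsFilePath =
        Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "settings.xml");
    public static void SaveSettings(Settings settings)
    {
        var serializer = new XmlSerializer(typeof(Settings));
        // File.Create truncates an existing file; File.OpenWrite would leave
        // stale bytes behind if the new XML is shorter than the old one.
        using (var stream = File.Create(SettingsFilePath))
        {
            serializer.Serialize(stream, settings);
        }
    }
    public static Settings LoadSettings()
    {
        if (!File.Exists(SettingsFilePath))
            return new Settings();
        var serializer = new XmlSerializer(typeof(Settings));
        using (var stream = File.OpenRead(SettingsFilePath))
        {
            return (Settings)serializer.Deserialize(stream);
        }
    }
}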
