I'm trying to use protobuf-net to save and load data to disk, but I got stuck.
I have a portfolio of assets that I need to process, and I want to do that as fast as possible. I can already read from a CSV, but a binary file should be faster, so I'm looking into protobuf-net.
I can't fit all the assets into memory, so I want to stream them rather than load them all at once.
So what I need to do is expose a large set of records as an IEnumerable. Is this possible with protobuf-net? I've tried a couple of things but haven't been able to get it running.
Serializing seems to work, but I haven't been able to read the data back in: I get 0 assets back. Could someone point me in the right direction, please? I've looked at the methods in the Serializer class but can't find any that cover this case. Is this use case supported by protobuf-net? I'm using v2, by the way.
Thanks in advance,
Gert-Jan
Here's some sample code I tried:
public partial class MainWindow : Window {
// Generate x Assets
IEnumerable<Asset> GenerateAssets(int Count) {
var rnd = new Random();
for (int i = 1; i <= Count; i++) { // <= so that exactly Count assets are generated
yield return new Asset {
ID = i,
EAD = i * 12345,
LGD = (float)rnd.NextDouble(),
PD = (float)rnd.NextDouble()
};
}
}
// write assets to file
private void Write(string path, IEnumerable<Asset> assets){
using (var file = File.Create(path)) {
Serializer.Serialize<IEnumerable<Asset>>(file, assets);
}
}
// read assets from file
IEnumerable<Asset> Read(string path) {
using (var file = File.OpenRead(path)) {
return Serializer.DeserializeItems<Asset>(file, PrefixStyle.None, -1);
}
}
// try it
private void Test() {
Write("Data.bin", GenerateAssets(100)); // this creates a file with binary gibberish that I assume are the assets
var x = Read("Data.bin");
MessageBox.Show(x.Count().ToString()); // returns 0 instead of 100
}
public MainWindow() {
InitializeComponent();
}
private void button2_Click(object sender, RoutedEventArgs e) {
Test();
}
}
[ProtoContract]
class Asset {
[ProtoMember(1)]
public int ID { get; set; }
[ProtoMember(2)]
public double EAD { get; set; }
[ProtoMember(3)]
public float LGD { get; set; }
[ProtoMember(4)]
public float PD { get; set; }
}
Figured it out. To deserialize, use PrefixStyle.Base128, which apparently is the default.
Now it works like a charm!
GJ
using (var file = File.Create("Data.bin")) {
Serializer.Serialize<IEnumerable<Asset>>(file, GenerateAssets(10));
}
using (var file = File.OpenRead("Data.bin")) {
var ps = Serializer.DeserializeItems<Asset>(file, PrefixStyle.Base128, 1);
int i = ps.Count(); // got them all back :-)
}
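One more thing worth checking in the Read method from the question: as far as I know, DeserializeItems returns a lazy sequence, so the file has to stay open while the caller enumerates it, and returning the sequence out of a using block closes the file too early. Below is a minimal sketch of the pattern that avoids this. It deliberately uses BinaryWriter/BinaryReader instead of protobuf-net so that it runs without any packages; AssetRecord and AssetFile are hypothetical names made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical record type, mirroring the Asset class from the question.
public sealed class AssetRecord
{
    public int ID;
    public double EAD;
    public float LGD;
    public float PD;
}

public static class AssetFile
{
    // Write records one at a time; only the current item is held in memory.
    public static void Write(string path, IEnumerable<AssetRecord> assets)
    {
        using (var writer = new BinaryWriter(File.Create(path)))
        {
            foreach (var a in assets)
            {
                writer.Write(a.ID);
                writer.Write(a.EAD);
                writer.Write(a.LGD);
                writer.Write(a.PD);
            }
        }
    }

    // Iterator method: the using blocks stay active until the caller finishes
    // (or abandons) the enumeration, so the file is not closed too early.
    public static IEnumerable<AssetRecord> Read(string path)
    {
        using (var stream = File.OpenRead(path))
        using (var reader = new BinaryReader(stream))
        {
            while (stream.Position < stream.Length)
            {
                // Initializers run in textual order, matching the write order.
                yield return new AssetRecord
                {
                    ID = reader.ReadInt32(),
                    EAD = reader.ReadDouble(),
                    LGD = reader.ReadSingle(),
                    PD = reader.ReadSingle()
                };
            }
        }
    }
}
```

The same shape should carry over to protobuf-net: open the file inside an iterator method and yield each item from DeserializeItems with a foreach, instead of returning the sequence directly.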
Hi, I have a simple class in a .NET Core project (SDK 3.1.409) for communicating with devices.
public class WriteCommand
{
//Commands is an enumeration.
public Commands LaserCommand { get; }
public List<byte> Parameter { get; }
public List<byte> Data { get; }
public WriteCommand(Commands laserCommand, byte[] parameter = null)
{
Data = BuildSendData(laserCommand, parameter);
// parameter defaults to null, and new List<byte>(null) would throw
Parameter = new List<byte>(parameter ?? Array.Empty<byte>());
LaserCommand = laserCommand;
}
private List<byte> BuildSendData(Commands command, byte[] paramBytes)
{
var parameter = paramBytes ?? Array.Empty<byte>();
int numberOfBytes = parameter.Length + Constants.ADD_TO_PARAMETER; // Defined by protocol
List<byte> sendData = new List<byte>();
sendData.Add(Constants.PACKET_START_BYTE);
sendData.Add((byte)numberOfBytes);
sendData.Add(Constants.COMMAND_START_BYTE);
sendData.Add((byte)command);
foreach (var param in parameter)
{
sendData.Add(param);
}
sendData.Add(Constants.PACKET_END_BYTE);
byte checksum = new CheckSumCalculator().CalculateCheckSum(sendData);
sendData.Add(checksum);
return sendData;
}
}
I add instances of this class to a ConcurrentQueue in one task like this:
public void AddCommand()
{
commandsQueue.Enqueue(new WriteCommand(Commands.SetRs232BaudRate));
}
And in another task I get the command out of the ConcurrentQueue
public void SendAndReceiveMessages()
{
while (!commandsQueue.IsEmpty)
{
if (commandsQueue.TryDequeue(out WriteCommand writeCommand))
{
//Do something
}
}
}
In my program I have 6 devices to communicate with at an interval of one second. Each device has its own communication class.
When the program has been running for a while (more than 2 days), I see memory usage increase.
I checked this with a memory profiler and see a memory leak in:
WriteCommand
ConcurrentQueueSegment+Slot
This is the only article I found.
You can find the example code here
Does anyone know this problem?
Greetings Mike
So I'm making a game, and it saves users' progress on the computer in a binary file. The User class stores a few things:
Integers for stat values (Serializable)
Strings for the Username and the skin assets
Lists of both the Achievement class and the InventoryItem class, which I have created myself.
Here are the User fields:
public string Username = "";
// ID is used for local identification, as usernames can be changed.
public int ID;
public int Coins = 0;
public List<Achievement> AchievementsCompleted = new List<Achievement>();
public List<InventoryItem> Inventory = new List<InventoryItem>();
public List<string> Skins = new List<string>();
public string CurrentSkinAsset { get; set; }
The Achievement class stores ints, bools, and strings, which are all serializable. The InventoryItem class stores its name (a string) and an InventoryAction, which is a delegate that is called when the item is used.
These are the Achievement class's fields:
public int ID = 0;
public string Name = "";
public bool Earned = false;
public string Description = "";
public string Image;
public AchievmentDifficulty Difficulty;
public int CoinsOnCompletion = 0;
public AchievementMethod OnCompletion;
public AchievementCriteria CompletionCriteria;
public bool Completed = false;
And here are the fields for the InventoryItem class:
InventoryAction actionWhenUsed;
public string Name;
public string AssetName;
The source of the InventoryAction variables are in my XNAGame class. What I mean by this is that the XNAGame class has a method called "UseSword()" or whatever, which it passes into the InventoryItem class. Previously, the methods were stored in the Game1 class, but the Game class, which Game1 inherits from, is not serializable, and there's no way for me to control that. This is why I have an XNAGame class.
I get an error when trying to serialize: "The 'SpriteFont' class is not marked as serializable", or something like that. Well, there is a SpriteFont object in my XNAGame class, and some quick tests showed that this is the source of the issue. Well, I have no control over whether or not the SpriteFont class is Serializable.
Why is the game doing this? Why must all the fields in the XNAGame class be serializable, when all I need is a few methods?
Keep in mind when answering that I'm 13, and may not understand all the terms you're using. If you need any code samples, I'll be glad to provide them for you. Thanks in advance!
EDIT: One solution I have thought of is to store the InventoryAction delegates in a Dictionary, except that this will be a pain and isn't very good programming practice. If this is the only way, I'll accept it, though (Honestly at this point I think this is the best solution).
EDIT 2: Here's the code for the User.Serialize method (I know what I'm doing is inefficient, and I should use a database, blah, blah, blah. I'm fine with what I'm doing now, so bear with me.):
FileStream fileStream = null;
List<User> users;
BinaryFormatter binaryFormatter = new BinaryFormatter();
try
{
if (File.Exists(FILE_PATH) && !IsFileLocked(FILE_PATH))
{
fileStream = File.Open(FILE_PATH, FileMode.Open);
users = (List<User>)binaryFormatter.Deserialize(fileStream);
}
else
{
fileStream = File.Create(FILE_PATH);
users = new List<User>();
}
// Remove any previously saved copy of this user. RemoveAll avoids the skipped
// elements that removing inside a forward for loop can cause.
users.RemoveAll(u => u.ID == this.ID);
foreach (Achievement a in AchievementsCompleted)
{
if (a.CompletionCriteria != null)
{
a.CompletionCriteria = null;
}
if (a.OnCompletion != null)
{
a.OnCompletion = null;
}
}
users.Add(this);
fileStream.Position = 0;
binaryFormatter.Serialize(fileStream, users);
You cannot serialize a SpriteFont; that is by design. It is technically possible (fonts are stored as .XNB files), but the mechanism hasn't been made public.
Solution:
Remove it from the class you serialize.
Alternatives:
1. If for some reason you must serialize a font, the first thing that comes to mind is to roll your own font system, such as BMFont. That's a daunting task, though, since you'll have to use it everywhere you already use SpriteFont.
2. Generate a pre-defined set of fonts (i.e. Arial/Times/Courier at size 10/11/12 etc.) using the XNA Content app (can't recall its exact name), then store the user's preference as two strings. With a string.Format(...) you should be able to load the right font back quite easily.
Alternative 2 is certainly the easiest and won't take more than a few minutes to implement.
EDIT
Basically, instead of saving a delegate I do the following:
inventory items have their own type
each type name is de/serialized accordingly
their logic does not happen in the main game class anymore
you don't have to manually match item type / action method
So while you'll end up with more classes, you have concerns separated and you can keep your main loop clean and relatively generic.
Code:
public static class Demo
{
public static void DemoCode()
{
// create new profile
var profile = new UserProfile
{
Name = "Bill",
Gold = 1000000,
Achievements = new List<Achievement>(new[]
{
Achievement.Warrior
}),
Inventory = new Inventory(new[]
{
new FireSpell()
})
};
// save it
using (var stream = File.Create("profile.bin"))
{
var formatter = new BinaryFormatter();
formatter.Serialize(stream, profile);
}
// load it
using (var stream = File.OpenRead("profile.bin"))
{
var formatter = new BinaryFormatter();
var deserialize = formatter.Deserialize(stream);
var userProfile = (UserProfile) deserialize;
// set everything on fire :)
var fireSpell = userProfile.Inventory.Items.OfType<FireSpell>().FirstOrDefault();
if (fireSpell != null) fireSpell.Execute("whatever");
}
}
}
[Serializable]
public sealed class UserProfile
{
public string Name { get; set; }
public int Gold { get; set; }
public List<Achievement> Achievements { get; set; }
public Inventory Inventory { get; set; }
}
public enum Achievement
{
Warrior
}
[Serializable]
public sealed class Inventory : ISerializable
{
public Inventory() // for serialization
{
}
public Inventory(SerializationInfo info, StreamingContext context) // for serialization
{
var value = (string) info.GetValue("Items", typeof(string));
var strings = value.Split(';');
var items = strings.Select(s =>
{
var type = Type.GetType(s);
if (type == null) throw new ArgumentNullException(nameof(type));
var instance = Activator.CreateInstance(type);
var item = instance as InventoryItem;
return item;
}).ToArray();
Items = new List<InventoryItem>(items);
}
public Inventory(IEnumerable<InventoryItem> items)
{
if (items == null) throw new ArgumentNullException(nameof(items));
Items = new List<InventoryItem>(items);
}
public List<InventoryItem> Items { get; }
#region ISerializable Members
public void GetObjectData(SerializationInfo info, StreamingContext context)
{
var strings = Items.Select(s => s.GetType().AssemblyQualifiedName).ToArray();
var value = string.Join(";", strings);
info.AddValue("Items", value);
}
#endregion
}
public abstract class InventoryItem
{
public abstract void Execute(params object[] objects);
}
public abstract class Spell : InventoryItem
{
}
public sealed class FireSpell : Spell
{
public override void Execute(params object[] objects)
{
// using 'params object[]' a simple and generic way to pass things if any, i.e.
// var world = objects[0];
// var strength = objects[1];
// now do something with these !
}
}
Okay, so I figured it out.
The best solution was to use a Dictionary in the XNAGame class, which stores two things: an ItemType (an enumeration) and an InventoryAction. Basically, when I use an item, I check its type and then look up its method. Thanks to everyone who tried, and I'm sorry if the question was confusing.
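For anyone who wants to see the shape of that approach, here is a minimal sketch. It is not the actual game code: ItemType's members, SavedItem, and ItemActions are hypothetical names, and a plain Action stands in for the InventoryAction delegate. The point is that the saved item carries only an enum value, while the delegates live in a dictionary that is never serialized.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item kinds; the enum value is what gets saved with the user.
public enum ItemType { Sword, Potion }

[Serializable]
public sealed class SavedItem
{
    public string Name;
    public ItemType Type;   // plain data, no delegate, so it serializes cleanly
}

// Lives in the game class (or next to it) and is never serialized.
public sealed class ItemActions
{
    private readonly Dictionary<ItemType, Action> actions =
        new Dictionary<ItemType, Action>();

    public void Register(ItemType type, Action action)
    {
        actions[type] = action;
    }

    // Returns false when no action has been registered for the item's type.
    public bool Use(SavedItem item)
    {
        Action action;
        if (!actions.TryGetValue(item.Type, out action)) return false;
        action();
        return true;
    }
}
```

The game would register something like UseSword once at startup and call Use(item) whenever the player activates an inventory entry.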
I am having trouble figuring out how to write this collection out to file. I have the following classes
public static class GeoPolyLines
{
public static ObservableCollection<Connections> connections = new ObservableCollection<Connections>();
}
public class Connections
{
public IEnumerable<IEnumerable<Point>> Points { get; set; }
public Connections(Point p1, Point p2)
{
Points = new List<List<Point>>
{
new List<Point>
{
p1, p2
}
};
}
}
And then a bunch of things like this:
GeoPolyLines.connections.Add(new Connections(new Point(GeoLocations.locations[0].Longitude, GeoLocations.locations[0].Latitude), new Point(GeoLocations.locations[1].Longitude, GeoLocations.locations[1].Latitude)));
So GeoPolyLines.connections will eventually have a bunch of different locations that I want to then write out to a .txt file to save and reload if I need to. But I don't know how to do this. I have something like this:
using (StreamWriter sw = new StreamWriter(filename))
{
var enumerator = GeoPolyLines.connections.GetEnumerator();
while (enumerator.MoveNext())
{
}
sw.Close();
}
Use serialization.
To write to a file (points being the list of points you want to save):
var serializer = new JavaScriptSerializer();
File.WriteAllText(filename, serializer.Serialize(points));
And to read it back from the file:
var points = serializer.Deserialize<List<Point>>(File.ReadAllText(filename));
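If a plain .txt file is the goal, another option is to write one connection per line by hand. The following is only a sketch under assumptions: GeoPoint is a hypothetical stand-in for the Point type in the question, ConnectionFile is a made-up helper, and the line format (points separated by ';', coordinates by ',') is an arbitrary choice.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical stand-in for the Point type used in the question.
public struct GeoPoint
{
    public double X;
    public double Y;
    public GeoPoint(double x, double y) { X = x; Y = y; }
}

public static class ConnectionFile
{
    // One connection per line, points separated by ';', coordinates by ','.
    // InvariantCulture keeps the decimal separator stable across machines.
    public static void Save(string path, IEnumerable<IEnumerable<GeoPoint>> connections)
    {
        var lines = connections.Select(points => string.Join(";", points.Select(p =>
            p.X.ToString("R", CultureInfo.InvariantCulture) + "," +
            p.Y.ToString("R", CultureInfo.InvariantCulture))));
        File.WriteAllLines(path, lines);
    }

    public static List<List<GeoPoint>> Load(string path)
    {
        return File.ReadLines(path)
            .Where(line => line.Length > 0)
            .Select(line => line.Split(';').Select(pair =>
            {
                var xy = pair.Split(',');
                return new GeoPoint(
                    double.Parse(xy[0], CultureInfo.InvariantCulture),
                    double.Parse(xy[1], CultureInfo.InvariantCulture));
            }).ToList())
            .ToList();
    }
}
```

With the classes from the question, something like GeoPolyLines.connections.SelectMany(c => c.Points) would produce the nested sequence to pass to Save, after converting each Point to whatever point type is used here.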
I have this c# class:
public class Test
{
public Test() { }
public IList<int> list = new List<int>();
}
Then I have this code:
Test t = new Test();
t.list.Add(1);
t.list.Add(2);
IsolatedStorageFile storage = IsolatedStorageFile.GetUserStoreForApplication();
StringWriter sw = new StringWriter();
XmlSerializer xml = new XmlSerializer(t.GetType());
xml.Serialize(sw, t);
When I look at the output from sw, it's this:
<?xml version="1.0" encoding="utf-16"?>
<Test xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
The values 1 and 2 that I added to the list member variable don't show up.
So how can I fix this? I made the list a property, but it still doesn't seem to work.
I am using XML serialization here; are there any other serializers?
I want performance! Is this the best approach?
--------------- UPDATE BELOW -------------------------
So the actual class I want to serialize is this:
public class RoutingResult
{
public float lengthInMeters { get; set; }
public float durationInSeconds { get; set; }
public string Name { get; set; }
public double travelTime
{
get
{
TimeSpan timeSpan = TimeSpan.FromSeconds(durationInSeconds);
return timeSpan.TotalMinutes;
}
}
public float totalWalkingDistance
{
get
{
float totalWalkingLengthInMeters = 0;
foreach (RoutingLeg leg in Legs)
{
if (leg.type == RoutingLeg.TransportType.Walk)
{
totalWalkingLengthInMeters += leg.lengthInMeters;
}
}
return (float)(totalWalkingLengthInMeters / 1000);
}
}
public IList<RoutingLeg> Legs { get; set; } // this is a property, isn't it?
public IList<int> test{get;set;} // test ...
public RoutingResult()
{
Legs = new List<RoutingLeg>();
test = new List<int>(); //test
test.Add(1);
test.Add(2);
Name = new Random().Next().ToString(); // for test
}
}
But the XML produced by the serializer is this:
<RoutingResult>
<lengthInMeters>9800.118</lengthInMeters>
<durationInSeconds>1440</durationInSeconds>
<Name>630104750</Name>
</RoutingResult>
Why is it ignoring both of those lists?
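1) Your list is a public field typed as the interface IList<int>. XmlSerializer handles public fields as well as public read/write properties, but it cannot serialize members declared as an interface such as IList<T>. Declare the member as the concrete List<int> instead, for example as a property:
public class Test
{
public Test() { IntList = new List<int>(); }
public List<int> IntList { get; set; }
}
2) There are other serialization options. Binary is the main alternative, and there are JSON serializers as well.
3) A binary format is usually faster and more compact than XML, since values are not converted to and from text; how much you gain depends on the serializer and on your data.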
list is declared as the interface IList<int>. Change it to a publicly visible member of the concrete type List<int>, and it should be picked up.
I figured out that XmlSerializer doesn't work if I use IList, so I changed it to List, and that made it work, as Nate also mentioned.
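A small round trip showing the List version working, as a sketch: TestWithList and XmlDemo are hypothetical names, and the only change that matters compared with the original Test class is that the member is declared as List<int> rather than IList<int>.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// The member is declared as the concrete List<int>, not IList<int>.
public class TestWithList
{
    public List<int> IntList { get; set; }

    public TestWithList() { IntList = new List<int>(); }
}

public static class XmlDemo
{
    public static string ToXml(TestWithList value)
    {
        var serializer = new XmlSerializer(typeof(TestWithList));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    public static TestWithList FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(TestWithList));
        using (var reader = new StringReader(xml))
        {
            return (TestWithList)serializer.Deserialize(reader);
        }
    }
}
```

Each list entry is written as its own int element inside an IntList element, and FromXml reads them back into the list.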
I have been using BinaryFormatter to serialise data to disk, but it doesn't seem very scalable. I've created a 200 MB data file but am unable to read it back in (End of Stream encountered before parsing was completed). It tries for about 30 minutes to deserialise and then gives up. This is on a fairly decent quad-CPU box with 8 GB RAM.
I'm serialising a fairly large, complicated structure.
htCacheItems is a Hashtable of CacheItems. Each CacheItem has several simple members (strings, ints, etc.) and also contains a Hashtable and a custom implementation of a linked list. The sub-hashtable points to CacheItemValue structures, each of which is currently a simple DTO containing a key and a value. The linked list items are equally simple.
The data file that fails contains about 400,000 CacheItemValues.
Smaller datasets work well (though they take longer than I'd expect to deserialise and use a hell of a lot of memory).
public virtual bool Save(String sBinaryFile)
{
bool bSuccess = false;
FileStream fs = new FileStream(sBinaryFile, FileMode.Create);
try
{
BinaryFormatter formatter = new BinaryFormatter();
formatter.Serialize(fs, htCacheItems);
bSuccess = true;
}
catch (Exception e)
{
bSuccess = false;
}
finally
{
fs.Close();
}
return bSuccess;
}
public virtual bool Load(String sBinaryFile)
{
bool bSuccess = false;
FileStream fs = null;
GZipStream gzfs = null;
try
{
fs = new FileStream(sBinaryFile, FileMode.OpenOrCreate);
if (sBinaryFile.EndsWith("gz"))
{
gzfs = new GZipStream(fs, CompressionMode.Decompress);
}
//add the event handler
ResolveEventHandler resolveEventHandler = new ResolveEventHandler(AssemblyResolveEventHandler);
AppDomain.CurrentDomain.AssemblyResolve += resolveEventHandler;
BinaryFormatter formatter = new BinaryFormatter();
htCacheItems = (Hashtable)formatter.Deserialize(gzfs != null ? (Stream)gzfs : (Stream)fs);
//remove the event handler
AppDomain.CurrentDomain.AssemblyResolve -= resolveEventHandler;
bSuccess = true;
}
catch (Exception e)
{
Logger.Write(new ExceptionLogEntry("Failed to populate cache from file " + sBinaryFile + ". Message is " + e.Message));
bSuccess = false;
}
finally
{
if (fs != null)
{
fs.Close();
}
if (gzfs != null)
{
gzfs.Close();
}
}
return bSuccess;
}
The resolveEventHandler is just a workaround because I'm serialising the data in one application and loading it in another (http://social.msdn.microsoft.com/Forums/en-US/netfxbcl/thread/e5f0c371-b900-41d8-9a5b-1052739f2521)
The question is, how can I improve this? Is data serialisation always going to be inefficient? Am I better off writing my own routines?
I would personally try to avoid the need for the assembly-resolve; that has a certain smell about it. If you must use BinaryFormatter, then I'd simply put the DTOs into a separate library (dll) that can be used in both applications.
If you don't want to share the dll, then IMO you shouldn't be using BinaryFormatter - you should be using a contract-based serializer, such as XmlSerializer or DataContractSerializer, or one of the "protocol buffers" implementations (and to repeat Jon's disclaimer: I wrote one of the others).
200MB does seem pretty big, but I wouldn't have expected it to fail. One possible cause here is the object tracking it does for the references; but even then, this surprises me.
I'd love to see a simplified object model to see if it is a "fit" for any of the above.
Here's an example that attempts to mirror your setup from the description using protobuf-net. Oddly enough there seems to be a glitch working with the linked-list, which I'll investigate; but the rest seems to work:
using System;
using System.Collections.Generic;
using System.IO;
using ProtoBuf;
[ProtoContract]
class CacheItem
{
[ProtoMember(1)]
public int Id { get; set; }
[ProtoMember(2)]
public int AnotherNumber { get; set; }
private readonly Dictionary<string, CacheItemValue> data
= new Dictionary<string,CacheItemValue>();
[ProtoMember(3)]
public Dictionary<string, CacheItemValue> Data { get { return data; } }
//[ProtoMember(4)] // commented out while I investigate...
public ListNode Nodes { get; set; }
}
[ProtoContract]
class ListNode // I'd probably expose this as a simple list, though
{
[ProtoMember(1)]
public double Head { get; set; }
[ProtoMember(2)]
public ListNode Tail { get; set; }
}
[ProtoContract]
class CacheItemValue
{
[ProtoMember(1)]
public string Key { get; set; }
[ProtoMember(2)]
public float Value { get; set; }
}
static class Program
{
static void Main()
{
// invent 400k CacheItemValue records
Dictionary<string, CacheItem> htCacheItems = new Dictionary<string, CacheItem>();
Random rand = new Random(123456);
for (int i = 0; i < 400; i++)
{
string key;
CacheItem ci = new CacheItem {
Id = rand.Next(10000),
AnotherNumber = rand.Next(10000)
};
while (htCacheItems.ContainsKey(key = rand.NextString())) {}
htCacheItems.Add(key, ci);
for (int j = 0; j < 1000; j++)
{
while (ci.Data.ContainsKey(key = rand.NextString())) { }
ci.Data.Add(key,
new CacheItemValue {
Key = key,
Value = (float)rand.NextDouble()
});
int tail = rand.Next(1, 50);
ListNode node = null;
while (tail-- > 0)
{
node = new ListNode
{
Tail = node,
Head = rand.NextDouble()
};
}
ci.Nodes = node;
}
}
Console.WriteLine(GetChecksum(htCacheItems));
using (Stream outfile = File.Create("raw.bin"))
{
Serializer.Serialize(outfile, htCacheItems);
}
htCacheItems = null;
using (Stream inFile = File.OpenRead("raw.bin"))
{
htCacheItems = Serializer.Deserialize<Dictionary<string, CacheItem>>(inFile);
}
Console.WriteLine(GetChecksum(htCacheItems));
}
static int GetChecksum(Dictionary<string, CacheItem> data)
{
int chk = data.Count;
foreach (var item in data)
{
chk += item.Key.GetHashCode()
+ item.Value.AnotherNumber + item.Value.Id;
foreach (var subItem in item.Value.Data.Values)
{
chk += subItem.Key.GetHashCode()
+ subItem.Value.GetHashCode();
}
}
return chk;
}
static string NextString(this Random random)
{
const string alphabet = "abcdefghijklmnopqrstuvwxyz0123456789 ";
int len = random.Next(4, 10);
char[] buffer = new char[len];
for (int i = 0; i < len; i++)
{
buffer[i] = alphabet[random.Next(0, alphabet.Length)];
}
return new string(buffer);
}
}
Serialization is tricky, particularly when you want to have some degree of flexibility when it comes to versioning.
Usually there's a trade-off between portability and flexibility of what you can serialize. For example, you might want to use Protocol Buffers (disclaimer: I wrote one of the C# ports) as a pretty efficient solution with good portability and versioning - but then you'll need to translate whatever your natural data structure is into something supported by Protocol Buffers.
Having said that, I'm surprised that binary serialization is failing here - at least in that particular way. Can you get it to fail with a large file with a very, very simple piece of serialization code? (No resolution handlers, no compression etc.)
Something that could help is cascade serializing.
You call mainHashtable.serialize(), which returns an XML string, for example. This method calls everyItemInYourHashtable.serialize(), and so on.
You do the same with a static method in every class, called 'unserialize(String xml)', which unserializes your objects and returns an object or a list of objects.
You get the point?
Of course, you need to implement this method in every class you want to be serializable.
Take a look at the ISerializable interface, which represents exactly what I'm describing. IMO, this interface looks too "Microsoft" (no use of DOM, etc.), so I created my own, but the principle is the same: cascade.
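To make the cascade idea concrete, here is a minimal sketch using XElement from System.Xml.Linq. Entry and Cache are hypothetical types invented for the example, not the classes from the question: the container only knows that its children can write and read themselves.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical leaf type: knows how to write and read only itself.
public sealed class Entry
{
    public string Key;
    public int Value;

    public XElement ToXml()
    {
        return new XElement("Entry",
            new XAttribute("key", Key),
            new XAttribute("value", Value));
    }

    public static Entry FromXml(XElement e)
    {
        return new Entry
        {
            Key = (string)e.Attribute("key"),
            Value = (int)e.Attribute("value")
        };
    }
}

// Hypothetical container: delegates to each child instead of knowing its layout.
public sealed class Cache
{
    public List<Entry> Entries = new List<Entry>();

    public string Serialize()
    {
        return new XElement("Cache", Entries.Select(x => x.ToXml())).ToString();
    }

    public static Cache Unserialize(string xml)
    {
        var root = XElement.Parse(xml);
        return new Cache
        {
            Entries = root.Elements("Entry").Select(Entry.FromXml).ToList()
        };
    }
}
```

Adding a new nested type then means giving it its own ToXml/FromXml pair; the container code stays the same apart from calling them.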