Can I re-use object instances to avoid allocations with protobuf-net? - c#

Context: this is based on a question that was asked and then deleted before I could answer it - but I think it is a good question, so I've tidied it, rephrased it, and re-posted it.
In a high-throughput scenario using protobuf-net, where lots of allocations are a problem (in particular for GC), is it possible to re-use objects? For example by adding a Clear() method?
[ProtoContract]
public class MyDTO
{
    [ProtoMember(1)]
    public int Foo { get; set; }

    [ProtoMember(2)]
    public string Bar { get; set; }

    [ProtoMember(3, DataFormat = DataFormat.Group)]
    public List<int> Values { get { return values; } }
    private readonly List<int> values = new List<int>();

    public void Clear()
    {
        values.Clear();
        Foo = 0;
        Bar = null;
    }
}

protobuf-net will never call your Clear() method itself, but for simple cases you can simply do this yourself, and use the Merge method (on the v1 API, or just pass the object into Deserialize in the v2 API). For example:
MyDTO obj = new MyDTO();
for(...) {
    obj.Clear();
    Serializer.Merge(source, obj);
}
This loads the data into the existing obj rather than creating a new object each time.
In more complex scenarios where you want to reduce the number of object allocations, and are happy to handle the object pooling / re-use yourself, then you can use a custom factory. For example, you can add a method to MyDTO such as:
// this can also accept serialization-context parameters if
// you want to pass your pool in, etc
public static MyDTO Create()
{
    // try to get from the pool; only allocate new obj if necessary
    return SomePool.GetMyDTO() ?? new MyDTO();
}
and, at app-startup, configure protobuf-net to know about it:
RuntimeTypeModel.Default[typeof(MyDTO)].SetFactory("Create");
(SetFactory can also accept a MethodInfo - useful if the factory method is not declared inside the type in question)
With this, what should happen is the factory method is used instead of the usual construction mechanisms. It remains, however, entirely your job to cleanse (Clear()) the objects when you are finished with them, and to return them to your pool. What is particularly nice about the factory approach is that it will work for new sub-items in lists, etc, which you can't do just from Merge.
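SomePool above is left undefined; here is a minimal sketch of what it could look like, using a ConcurrentBag (the pool type and its members are illustrative assumptions, not part of protobuf-net):
using System.Collections.Concurrent;

public static class SomePool
{
    private static readonly ConcurrentBag<MyDTO> pool = new ConcurrentBag<MyDTO>();

    // returns null when the pool is empty, so the factory falls through to "new MyDTO()"
    public static MyDTO GetMyDTO()
    {
        MyDTO obj;
        return pool.TryTake(out obj) ? obj : null;
    }

    // cleansing the object before re-use remains the caller's responsibility
    public static void Return(MyDTO obj)
    {
        obj.Clear();
        pool.Add(obj);
    }
}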

Moq setup returns reference to object

Let's say I have a simple class called MyRequestHandler, and it has a method called ProcessRequest that simply takes a request object, maps it to a return object and returns that object. (This is obviously a very simple example of a much more complex method/test that I'm working on).
public class MyRequestHandler
{
    private IMapper _mapper;

    public MyRequestHandler(IMapper mapper)
    {
        _mapper = mapper;
    }

    public MyReturnObject ProcessRequest(MyRequestObject requestObject)
    {
        MyReturnObject returnObject = _mapper.Map<MyReturnObject>(requestObject);
        return returnObject;
    }
}
Now for unit testing (using Xunit), I want to test the ProcessRequest method, but obviously want to Moq the Map method, as such:
MyRequestObject requestObject = new MyRequestObject()
{
    RequestInt = 1,
    RequestString = "Hello"
};
MyReturnObject returnObject = new MyReturnObject()
{
    MyInt = 1,
    MyString = "Hello"
};
Mock<IMapper> mockMapper = new Mock<IMapper>();
mockMapper.Setup(m => m.Map<MyReturnObject>(requestObject)).Returns(returnObject);
MyRequestHandler requestHandler = new MyRequestHandler(mockMapper.Object);
MyReturnObject response = requestHandler.ProcessRequest(requestObject);
Assert.Equal(returnObject.MyInt, response.MyInt);
Assert.Equal(returnObject.MyString, response.MyString);
The problem here is that Moq returns (and I guess it should be obvious that it is) a reference to returnObject, so my Asserts will always pass, even if my method were to change a value prior to returning the object. Now I could instantiate a new MyReturnObject in the Moq Setup/Return and compare the MyInt and MyString by the values I give to the new one, but what if it's a really complex object with 20 properties and lists of objects? Maybe I want to use AutoFixture to create the object being returned and use DeepEqual to compare them? Is this even possible? Am I looking at this wrong, or do I have to do some type of cloning in the Setup/Return to make this work?
I don't believe there is built-in functionality to detect that the method under test did not change the object passed to it.
Options:
make sure the returned objects are immutable - either by having them immutable to start with, or by returning an interface without "set" methods, with an instance created via mocks
create separate instances for the "expected" and "mocked" values and then compare property-by-property. There are plenty of helper libraries to do so (I like FluentAssertions); see the sketch after this list.
just assert on individual properties instead of comparing whole objects - works fine for a small number of fields.
If possible I'd prefer immutable objects - that prevents the possibility of writing wrong code and thus decreases the amount of testing needed.
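As a sketch of the second option, assuming the FluentAssertions package (5.x or later): build the "expected" instance separately from the one the mock returns, so reference identity can't make the comparison pass by accident:
using FluentAssertions;

MyReturnObject expected = new MyReturnObject
{
    MyInt = 1,
    MyString = "Hello"
};

MyReturnObject response = requestHandler.ProcessRequest(requestObject);

// compares public members by value, recursively - not by reference
response.Should().BeEquivalentTo(expected);
Because expected is a distinct instance, any mutation ProcessRequest makes to the mapper's return value now fails the test.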
In this case you don't receive new data back, so you can verify behavior instead; the internal state is not valuable here:
var requestObject = new MyRequestObject();
var returnObject = new MyReturnObject();
...
var actual = requestHandler.ProcessRequest(requestObject);
Assert.Same(returnObject, actual); // xUnit's reference-equality assert
mockMapper.Verify(
    instance => instance.Map<MyReturnObject>(requestObject),
    Times.Once);
Some details
We can't safely share write access with others, so I assume you have:
public class MyRequestObject
{
    int RequestInt { get; private set; }
    string RequestString { get; private set; }
}
Otherwise you should always test for parameter mutation. Imagine 10 participants called in depth: each of them would need such tests. Those tests are brittle against changes and do nothing for new properties.
It is better to have a good coding convention and do code review from time to time. For example, someone could remove private from a property setter, and that can't be caught by any test.
There are many other good practices, for example "write tests before code", and so on.

Can I create and object that "self-loads" properties from a data source?

I have a design pattern question that I'm hoping you can help me with.
I have a C# application that stores objects in a database as JSON. Currently, a dedicated class handles loading the JSON, deserializing, and returning the object. This works great, but I'm wondering if there is an approach that would allow the object to "self-load" (for lack of a better term).
What I am doing now is (short-hand code):
public class MyObject {
    public string Name;
    public string Rank;
    public int SerialNo;
}

public class DataTransport {
    public MyObject LoadMyObject(string ObjectId) {
        string ObjectJSON = FetchJsonFromDatabase(ObjectId);  // pseudo-call: load the JSON
        return DeserializeMyObject(ObjectJSON);               // pseudo-call: deserialize it
    }
}
DataTransport dt = new DataTransport();
MyObject mo = dt.LoadMyObject("object123");
What I would like to do is something like:
public class MyObject {
    public string Name;
    public string Rank;
    public int SerialNo;

    public MyObject(string ObjectId) {
        DataTransport dt = new DataTransport();
        this = dt.LoadMyObject(ObjectId);
    }
}
MyObject mo = new MyObject("object123");
Obviously this fails, but is there a similar mechanism to have the constructor replace itself with the object loaded from the database?
I've thought about loading and manually assigning properties when instantiated, but that risk and hassle doesn't pass the cost/benefit smell test.
Thanks in advance for your help!
You could essentially do what you want if you made your LoadMyObject() method populate an existing object instance from the JSON instead of creating a new one. Maybe PopulateMyObject() would be a better name for it:
public void PopulateMyObject(string ObjectId, object target)
{
    string ObjectJSON = FetchJsonFromDatabase(ObjectId);
    PopulateObjectFromJson(ObjectJSON, target);
}
Incidentally, Json.Net supports populating existing objects from JSON out of the box, so you could use its JsonConvert.PopulateObject method as a drop-in replacement for PopulateObjectFromJson in the above code.
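For example (a sketch; FetchJsonFromDatabase is still your own data-access placeholder):
using Newtonsoft.Json;

public void PopulateMyObject(string ObjectId, object target)
{
    string ObjectJSON = FetchJsonFromDatabase(ObjectId);
    JsonConvert.PopulateObject(ObjectJSON, target); // fills the existing instance
}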
Then in your object constructors you just do:
public MyObject(string ObjectId)
{
    DataTransport dt = new DataTransport();
    dt.PopulateMyObject(ObjectId, this);
}
Concept fiddle: https://dotnetfiddle.net/I9bHeN
Of course, @MickyD makes a very good point in the comments that you would be tightly coupling all your model objects to the DataTransport object, which may not be desirable.

C#, generic way to access different lists within a class

I have a class holding 3 different linked lists (for saving the entities in a game I'm working on). The lists all hold objects with the same base type, but I keep them separate for processing reasons. Note that IObject, IUndead and ILiving all inherit from IEntity.
public class EntityBucket
{
    public LinkedList<IEntity> undeadEntities;
    public LinkedList<IEntity> objects;
    public LinkedList<IEntity> livingEntities;

    public EntityBucket()
    {
        undeadEntities = new LinkedList<IEntity>();
        objects = new LinkedList<IEntity>();
        livingEntities = new LinkedList<IEntity>();
    }

    public LinkedList<IEntity> GetList(IObject e)
    {
        return objects;
    }

    public LinkedList<IEntity> GetList(IUndead e)
    {
        return undeadEntities;
    }

    public LinkedList<IEntity> GetList(ILiving e)
    {
        return livingEntities;
    }
}
I have 3 methods for retrieving each of the lists, currently based on their parameters. The fact that there are 3 is fine, since I know each list will in some way or another require its own accessor. Passing an instantiated object is not ideal though, as I may want to retrieve a list somewhere without having an object of similar type at hand. Note that the object here is not even used in the GetList methods, they are only there to determine which version to use. Here is an example where I have an instantiated object at hand:
public void Delete(IUndead e, World world)
{
    .....
    LinkedList<IEntity> list = buckets[k].GetList(e);
    .....
}
I don't like this current implementation as I may not always have an instantiated object at hand (when rendering the entities for example). I was thinking of doing it generically but I'm not sure if this is possible with what I want to do. With this I also need 3 Delete methods (and 3 of any other, such as add and so forth) - one for each type, IUndead, IObject and ILiving. I just feel that this is not the right way of doing it.
I'll post what I have tried to do so far on request, but my generics is rather bad and I feel that it would be a waste for anyone to read this as well.
Finally, performance is very important. I'm not prematurely optimizing; I am post-optimizing, as I have working code already but need it to go faster. The GetList methods will be called very often and I want to avoid any explicit type checking.
So you want a better interface, because, as you said, passing an unnecessary object to GetList just to figure out its type makes little sense.
You could do something like:
public LinkedList<IEntity> GetList<T>() where T : IEntity
{
    if (typeof(T) == typeof(IUndead)) return undeadEntities;
    // and so on
}
And you'll have to call it like this: GetList<IUndead>();
I think an enum is a better idea here:
enum EntityTypes { Undead, Alive, Object };
public List<IEntity> GetList(EntityTypes entityType) { ... }
It's cleaner and makes more sense to me.
EDIT: Using generics is actually not that simple. Someone could call GetList with a Zombie type, which implements IUndead, and then you'd have to check for interface implementations. Someone could even pass a LiveZombie which implements both IUndead and IAlive. Definitely go with an enum.
How about a better implementation to go with that better interface?
public class EntityBucket
{
    public LinkedList<IEntity> Entities;

    public IEnumerable<T> GetEntities<T>() where T : IEntity
    {
        return Entities.OfType<T>();
    }
}
List<IUndead> myBrainFinders = bucket.GetEntities<IUndead>().ToList();
With this implementation, the caller better add each item to the right list(s). That was a requirement for your original implementation, so I figure it's no problem.
public class EntityBucket
{
    Dictionary<Type, List<IEntity>> entities = new Dictionary<Type, List<IEntity>>();

    public void Add<T>(T item) where T : IEntity
    {
        Type tType = typeof(T);
        if (!entities.ContainsKey(tType))
        {
            entities.Add(tType, new List<IEntity>());
        }
        entities[tType].Add(item);
    }

    public List<T> GetList<T>() where T : IEntity
    {
        Type tType = typeof(T);
        if (!entities.ContainsKey(tType))
        {
            return new List<T>();
        }
        return entities[tType].Cast<T>().ToList();
    }

    public List<IEntity> GetAll()
    {
        return entities.SelectMany(kvp => kvp.Value)
                       .Distinct() // to remove items added multiple times, or to multiple lists
                       .ToList();
    }
}
How about something like the following?
public LinkedList<IEntity> GetList(Type type) {
    if (typeof(IUndead).IsAssignableFrom(type)) return undeadEntities;
    if (typeof(ILiving).IsAssignableFrom(type)) return livingEntities;
    if (typeof(IObject).IsAssignableFrom(type)) return objects;
    throw new ArgumentException("Unknown entity type: " + type);
}
Then you would call it like this:
var myUndeads = GetList(typeof(IUndead));
var myLivings = GetList(typeof(ILiving));
// etc
The same type of logic could be implemented in your deletes, add, and other methods, and you never need a concrete instance of an object to access them.
The IsAssignableFrom logic handles subclassing just fine (i.e. you could have a CatZombie, which derives from Zombie, which implements IUndead, and this would still work). This means you still only have to create one Delete method, something like the following:
public void Delete(IEntity e, World world) {
    Type type = e.GetType();
    if (typeof(IUndead).IsAssignableFrom(type)) undeadEntities.Remove(e);
    if (typeof(ILiving).IsAssignableFrom(type)) livingEntities.Remove(e);
    if (typeof(IObject).IsAssignableFrom(type)) objects.Remove(e);
}
EDIT: I noticed your comment on zmbq's answer regarding performance; this is definitely NOT fast. If you need high performance, use an enum-style approach. Your code will be more verbose and require more maintenance, but you'll get much better performance.
Seems to me you could just implement a Dictionary of named LinkedLists and refer to them by name or enum. That way adding or removing lists is just an implementation issue and there is no separate class to deal with.
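A minimal sketch of that suggestion (the EntityKind enum and member names are illustrative assumptions, not from the original code):
using System.Collections.Generic;

public enum EntityKind { Undead, Living, Object }

public class EntityBucket
{
    private readonly Dictionary<EntityKind, LinkedList<IEntity>> lists =
        new Dictionary<EntityKind, LinkedList<IEntity>>
        {
            { EntityKind.Undead, new LinkedList<IEntity>() },
            { EntityKind.Living, new LinkedList<IEntity>() },
            { EntityKind.Object, new LinkedList<IEntity>() }
        };

    // O(1) dictionary lookup; no reflection or type checks on the hot path
    public LinkedList<IEntity> GetList(EntityKind kind)
    {
        return lists[kind];
    }

    public void Delete(EntityKind kind, IEntity e)
    {
        lists[kind].Remove(e);
    }
}
Since the caller names the list explicitly, no instantiated object is needed to retrieve one, which addresses the rendering scenario in the question.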

What is a "mostly complete" (im)mutability approach for C#? [closed]

Since immutability is not fully baked into C# to the degree it is for F#, or fully into the framework (BCL) despite some support in the CLR, what's a fairly complete solution for (im)mutability for C#?
My order of preference is a solution consisting of general patterns/principles compatible with
a single open-source library with few dependencies
a small number of complementary/compatible open-source libraries
something commercial
that
covers Lippert's kinds of immutability
offers decent performance (that's vague I know)
supports serialization
supports cloning/copying (deep/shallow/partial?)
feels natural in scenarios such as DDD, builder patterns, configuration, and threading
provides immutable collections
I'd also like to include patterns you as the community might come up with that don't exactly fit in a framework, such as expressing mutability intent through interfaces: clients that shouldn't change something see only a read-only interface, while clients that may want to change it go through a mutable interface rather than the backing class (yes, I know this isn't true immutability, but it is sufficient):
public interface IX
{
    int Y { get; }
    ReadOnlyCollection<string> Z { get; }
    IMutableX Clone();
}

public interface IMutableX : IX
{
    new int Y { get; set; }
    new ICollection<string> Z { get; } // or IList<string>
}
// generally no one should get ahold of an X directly
internal class X : IMutableX
{
    public int Y { get; set; }

    ICollection<string> IMutableX.Z { get { return z; } }

    public ReadOnlyCollection<string> Z
    {
        get { return new ReadOnlyCollection<string>(z); }
    }

    public IMutableX Clone()
    {
        var c = (X)MemberwiseClone();
        c.z = new List<string>(z);
        return c;
    }

    private IList<string> z = new List<string>();
}
// ...
public void ContriveExample(IX x)
{
    if (x.Y != 3 || x.Z.Count < 10) return;
    var c = x.Clone();
    c.Y++;
    c.Z.Clear();
    c.Z.Add("Bye, off to another thread");
    // ...
}
Would the better solution be to just use F# where you want true immutability?
Use this T4 template I put together to solve this problem. It should generally suit your needs for whatever kinds of immutable objects you need to create.
There's no need to go with generics or use any interfaces. For my purposes, I do not want my immutable classes to be convertible to one another. Why would you? What common traits would they share that mean they should be convertible to one another? Enforcing a code pattern should be the job of a code generator (or, better yet, a type system nice enough to let you define general code patterns, which C# unfortunately does not have).
Here's some example output from the template to illustrate the basic concept at play (never mind the types used for the properties):
public sealed partial class CommitPartial
{
    public CommitID ID { get; private set; }
    public TreeID TreeID { get; private set; }
    public string Committer { get; private set; }
    public DateTimeOffset DateCommitted { get; private set; }
    public string Message { get; private set; }

    public CommitPartial(Builder b)
    {
        this.ID = b.ID;
        this.TreeID = b.TreeID;
        this.Committer = b.Committer;
        this.DateCommitted = b.DateCommitted;
        this.Message = b.Message;
    }

    public sealed class Builder
    {
        public CommitID ID { get; set; }
        public TreeID TreeID { get; set; }
        public string Committer { get; set; }
        public DateTimeOffset DateCommitted { get; set; }
        public string Message { get; set; }

        public Builder() { }

        public Builder(CommitPartial imm)
        {
            this.ID = imm.ID;
            this.TreeID = imm.TreeID;
            this.Committer = imm.Committer;
            this.DateCommitted = imm.DateCommitted;
            this.Message = imm.Message;
        }

        public Builder(
            CommitID pID
           ,TreeID pTreeID
           ,string pCommitter
           ,DateTimeOffset pDateCommitted
           ,string pMessage
        )
        {
            this.ID = pID;
            this.TreeID = pTreeID;
            this.Committer = pCommitter;
            this.DateCommitted = pDateCommitted;
            this.Message = pMessage;
        }
    }

    public static implicit operator CommitPartial(Builder b)
    {
        return new CommitPartial(b);
    }
}
The basic pattern is to have an immutable class with a nested mutable Builder class that is used to construct instances of the immutable class in a mutable way. The only way to set the immutable class's properties is to construct an ImmutableType.Builder instance, set its properties in the normal mutable way, and convert it to its containing ImmutableType class with an implicit conversion operator.
You can extend the T4 template to add a default public ctor to the ImmutableType class itself so you can avoid a double allocation if you can set all the properties up-front.
Here's an example usage:
CommitPartial cp = new CommitPartial.Builder() { Message = "Hello", OtherFields = value, ... };
or...
CommitPartial.Builder cpb = new CommitPartial.Builder();
cpb.Message = "Hello";
...
// using the implicit conversion operator:
CommitPartial cp = cpb;
// alternatively, using an explicit cast to invoke the conversion operator:
CommitPartial cp = (CommitPartial)cpb;
Note that the implicit conversion operator from CommitPartial.Builder to CommitPartial is used in the assignment. That's the part that "freezes" the mutable CommitPartial.Builder by constructing a new immutable CommitPartial instance out of it with normal copy semantics.
Personally, I'm not really aware of any third party or previous solutions to this problem, so my apologies if I'm covering old ground. But, if I were going to implement some kind of immutability standard for a project I was working on, I would start with something like this:
public interface ISnapshot<T>
{
    T TakeSnapshot();
}

public class Immutable<T> where T : ISnapshot<T>
{
    private readonly T _item;

    public T Copy { get { return _item.TakeSnapshot(); } }

    public Immutable(T item)
    {
        _item = item.TakeSnapshot();
    }
}
This interface would be implemented something like:
public class Customer : ISnapshot<Customer>
{
    public string Name { get; set; }

    private List<string> _creditCardNumbers = new List<string>();
    public List<string> CreditCardNumbers { get { return _creditCardNumbers; } set { _creditCardNumbers = value; } }

    public Customer TakeSnapshot()
    {
        return new Customer() { Name = this.Name, CreditCardNumbers = new List<string>(this.CreditCardNumbers) };
    }
}
And client code would be something like:
public void Example()
{
    var myCustomer = new Customer() { Name = "Erik" };
    var myImmutableCustomer = new Immutable<Customer>(myCustomer);
    myCustomer.Name = null;
    myCustomer.CreditCardNumbers = null;
    // These guys do not throw exceptions
    Console.WriteLine(myImmutableCustomer.Copy.Name.Length);
    Console.WriteLine("Credit card count: " + myImmutableCustomer.Copy.CreditCardNumbers.Count);
}
The glaring deficiency is that the implementation is only as good as the client of ISnapshot's implementation of TakeSnapshot, but at least it would standardize things and you'd know where to go searching if you had issues related to questionable mutability. The burden would also be on potential implementors to recognize whether or not they could provide snapshot immutability and not implement the interface, if not (i.e. the class returns a reference to a field that does not support any kind of clone/copy and thus cannot be snapshot-ed).
As I said, this is a start—how I'd probably start—certainly not an optimal solution or a finished, polished idea. From here, I'd see how my usage evolved and modify this approach accordingly. But, at least here I'd know that I could define how to make something immutable and write unit tests to assure myself that it was.
I realize that this isn't far removed from just implementing an object copy, but it standardizes copying vis-à-vis immutability. In a code base, you might see some implementors of ICloneable, some copy constructors, and some explicit copy methods, perhaps even in the same class. Defining something like this tells you that the intention is specifically related to immutability: I want a snapshot, as opposed to a duplicate object because I happen to want n more of that object. The Immutable<T> class also centralizes the relationship between immutability and copies; if you later want to optimize somehow, like caching the snapshot until dirty, you needn't do it in all implementors of the copying logic.
If the goal is to have objects which behave as unshared mutable objects, but which can be shared when doing so would improve efficiency, I would suggest having a private, mutable "fundamental data" type. Although anyone holding a reference to objects of this type would be able to mutate it, no such references would ever escape the assembly. All outside manipulations to the data must be done through wrapper objects, each of which holds two references:
UnsharedVersion--Holds the only reference in existence to its internal data object, and is free to modify it
SharedImmutableVersion--Holds a reference to the data object, to which no references exist except in other SharedImmutableVersion fields; such objects may be of a mutable type, but will in practice be immutable because no references will ever be made available to code that would mutate them.
One or both fields may be populated; when both are populated, they should refer to instances with identical data.
If an attempt is made to mutate an object via the wrapper and the UnsharedVersion field is null, a clone of the object in SharedImmutableVersion should be stored in UnsharedVersion. Next, SharedImmutableVersion should be cleared and the object in UnsharedVersion mutated as desired.
If an attempt is made to clone an object, and SharedImmutableVersion is empty, a clone of the object in UnsharedVersion should be stored into SharedImmutableVersion. Next, a new wrapper should be constructed with its UnsharedVersion field empty and its SharedImmutableVersion field populated with the SharedImmutableVersion from the original.
If multiple clones are made of an object, whether directly or indirectly, and the object hasn't been mutated between the construction of those clones, all clones will refer to the same object instance. Any of those clones may be mutated, however, without affecting the others. Any such mutation would generate a new instance and store it in UnsharedVersion.
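Here is a compact sketch of that copy-on-write scheme; all type and member names (FundamentalData, Wrapper, Writable) are illustrative assumptions:
// the mutable "fundamental data" type; no reference to it ever escapes the assembly
internal sealed class FundamentalData
{
    public int Value;

    public FundamentalData Clone()
    {
        return new FundamentalData { Value = Value };
    }
}

public sealed class Wrapper
{
    private FundamentalData unsharedVersion;        // exclusively owned; safe to mutate
    private FundamentalData sharedImmutableVersion; // possibly shared; never mutated

    public Wrapper() { unsharedVersion = new FundamentalData(); }

    private Wrapper(FundamentalData shared) { sharedImmutableVersion = shared; }

    // called before any mutation
    private FundamentalData Writable()
    {
        if (unsharedVersion == null)
            unsharedVersion = sharedImmutableVersion.Clone();
        sharedImmutableVersion = null; // the versions are about to diverge
        return unsharedVersion;
    }

    public int Value
    {
        get { return (unsharedVersion ?? sharedImmutableVersion).Value; }
        set { Writable().Value = value; }
    }

    public Wrapper Clone()
    {
        if (sharedImmutableVersion == null)
            sharedImmutableVersion = unsharedVersion.Clone(); // freeze a shareable snapshot
        return new Wrapper(sharedImmutableVersion);
    }
}
Clones stay O(1) as long as nobody mutates; the first mutation after a clone pays for exactly one copy.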

Holding out on object creation

Is there ever a case where holding the necessary data to create an object and only creating it when is absolutely necessary, is better/more efficient than holding the object itself?
A trivial example:
class Bar
{
    public string Data { get; set; }
}

class Foo
{
    Bar bar;
    readonly string barData;

    public Foo(string barData)
    {
        this.barData = barData;
    }

    public void MaybeCreate(bool create)
    {
        if (create)
        {
            bar = new Bar { Data = barData };
        }
    }

    public Bar Bar { get { return bar; } }
}
It makes sense if the object performs some expensive operation on construction, such as allocating system resources.
You have Lazy<T> to help you delay an object's instantiation. Among other things, it has thread safety built in, if you need it.
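For instance, the Foo example from the question could defer construction with Lazy<Bar> (a sketch):
class Foo
{
    private readonly Lazy<Bar> bar;

    public Foo(string barData)
    {
        // the factory delegate runs once, on first access to bar.Value;
        // an overload takes a LazyThreadSafetyMode if you need to tune thread safety
        bar = new Lazy<Bar>(() => new Bar { Data = barData });
    }

    public Bar Bar { get { return bar.Value; } }
}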
In general, no. (If I understand your question correct).
Allocations/constructions are cheap in terms of performance. Unless you are doing something crazy, construct your objects when it feels natural for the design - don't optimize prematurely.
Yes, if creating the object means populating it, and populating it requires a slow operation.
For example,
List<int> ll = returnDataFromDBVeryVerySlowly();
or
Lazy<List<int>> ll = new Lazy<List<int>>(() =>
{
    return returnDataFromDBVeryVerySlowly();
});
In the first example, returnDataFromDBVeryVerySlowly will always be called, even if you don't need the data. In the second, it will be called only if necessary. This is quite common in, for example, ASP.NET, where you want many "standard" datasets ready as members of your Page so that multiple methods can access them, but you don't want them populated unless they are needed (otherwise a method could call returnDataFromDBVeryVerySlowly directly).
