Starting in C++11, one can write something like
#include <vector>
#include <string>
struct S
{
S(int x, const std::string& s)
: x(x)
, s(s)
{
}
int x;
std::string s;
};
// ...
std::vector<S> v;
// add new object to the vector v
// only parameters of added object's constructor are passed to the function
v.emplace_back(1, "t");
Is there any C# analogue of C++ functions like emplace or emplace_back for container classes (System.Collections.Generic.List)?
Update:
In C# similar code might be written as list.EmplaceBack(1, "t"); instead of list.Add(new S(1, "t"));. It would be nice not to remember a class name and write new ClassName in such situations every time.
In general there is nothing similar in C#, and the need for it is much smaller than in C++.
In C# when you have a List<SomeReferenceType> what you really have is a List<ReferenceToSomeType>, so a list of references, with the size of each element of 4 or 8 bytes (see How big is an object reference in .NET?). Copying a reference doesn't cause the underlying object to be duplicated, so it is very fast (you are copying around 4 or 8 bytes, and the processor is optimized for this operation, because that is the size of the native pointer of the processor). So when you someList.Add(someReference) what you are doing is adding a reference to your List<>.
In C++ when you have a std::vector<SomeType> what you have is a vector of SomeType, with the size of each element equal to sizeof(SomeType). Inserting a new element in std::vector<> will cause the element you are inserting to be duplicated (cloned, copied... choose a verb you like). This is an expensive operation.
Quite often the pattern you use is that you create an object just to insert it into a std::vector<>. To optimize this operation, C++11 added two mechanisms: the std::vector<>::emplace family of methods and support for move semantics in std::vector<>. The difference is that move semantics must be supported by SomeType itself (it needs a move constructor, ideally marked noexcept so that the vector can also move elements when it reallocates), while emplace works with every type, because in the end it simply constructs the element in place with placement new.
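To make the difference concrete, here is a minimal C++ sketch that counts how an element ends up in the vector. Counted, Counters and the measure_* helpers are hypothetical names made up for this illustration, and reserve() is called first so that reallocation does not add extra moves to the counts.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bookkeeping: how many times each kind of constructor ran.
struct Counters {
    int direct = 0;  // Counted(int, const std::string&)
    int copies = 0;  // copy constructor
    int moves = 0;   // move constructor
};

Counters counters;

// Hypothetical element type, shaped like S from the question.
struct Counted {
    int x;
    std::string s;
    Counted(int x, const std::string& s) : x(x), s(s) { ++counters.direct; }
    Counted(const Counted& o) : x(o.x), s(o.s) { ++counters.copies; }
    Counted(Counted&& o) noexcept : x(o.x), s(std::move(o.s)) { ++counters.moves; }
};

// push_back of an existing object: the element is copied into the vector.
Counters measure_push_back_copy() {
    counters = Counters();
    std::vector<Counted> v;
    v.reserve(4);  // avoid reallocation, which would add moves of its own
    Counted c(1, "t");
    v.push_back(c);
    return counters;
}

// push_back of a temporary: the temporary is built, then moved into the vector.
Counters measure_push_back_temporary() {
    counters = Counters();
    std::vector<Counted> v;
    v.reserve(4);
    v.push_back(Counted(1, "t"));
    return counters;
}

// emplace_back: the constructor arguments are forwarded and the element
// is constructed directly inside the vector's storage.
Counters measure_emplace_back() {
    counters = Counters();
    std::vector<Counted> v;
    v.reserve(4);
    v.emplace_back(1, "t");
    return counters;
}
```

Under these assumptions the copy variant runs one copy constructor, the temporary variant runs one move constructor, and emplace_back runs neither. In C# the equivalent of all three is copying a single reference, which is why the question barely arises there.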
You can improve @Boo's variant of the extension method a bit.
You can create the object instance with Activator.CreateInstance, which makes the solution more generic.
public static class ListExtension
{
public static void Emplace<S>(this IList<S> list, params object[] parameters)
{
list.Add((S)Activator.CreateInstance(typeof(S), parameters));
}
}
Note: the types and number of the parameters are not checked at compile time, so if you pass something wrong you only get an error at run time.
In C# you can use an extension method to achieve what you want:
public static class ListExtension
{
public static void Emplace(this IList<S> list, int x, string s)
{
list.Add(new S(x, s));
}
}
then use it like this
myList.Emplace(1,"t");
It seems you have the following problems:
"new S" is longer to type. But "Add" is shorter than "Emplace", and IntelliSense fills in the type for you (simply press Enter after typing "new ").
You are afraid of writing the wrong type. Well, you can't with List<T>. IntelliSense will help you type it, and the compiler will not allow a wrong type to be added anyway.
Performance: see xanatos's answer.
list.Add(new S(1, "t")); is perfectly fine to use.
Conclusion: we don't need emplace in C#.
UPDATE: the next version of C# has a feature under consideration that would directly answer this issue. c.f. answers below.
Requirements:
App data is stored in arrays-of-structs. There is one AoS for each type of data in the app (e.g. one for MyStruct1, another for MyStruct2, etc)
The structs are created at runtime; the more code we write in the app, the more there will be.
I need one class to hold references to ALL the AoS's, and allow me to set and get individual structs within those AoS's
The AoS's tend to be large (1,000's of structs per array); copying those AoS's around would be a total fail - they should never be copied! (they never need to!)
I have code that compiles and runs, and it works ... but is C# silently copying the AoS's under the hood every time I access them? (see below for full source)
public Dictionary<System.Type, System.Array> structArraysByType;
public void registerStruct<T>()
{
System.Type newType = typeof(T);
if( ! structArraysByType.ContainsKey(newType ) )
{
structArraysByType.Add(newType, new T[1000] ); // allowing up to 1k
}
}
public T get<T>( int index )
{
return ((T[])structArraysByType[typeof(T)])[index];
}
public void set<T>( int index, T newValue )
{
((T[])structArraysByType[typeof(T)])[index] = newValue;
}
Notes:
I need to ensure C# sees this as an array of value-types, instead of an array of objects ("don't you DARE go making an array of boxed objects around my structs!"). As I understand it: Generic T[] ensures that (as expected)
I couldn't figure out how to express the type "this will be an array of structs, but I can't tell you which structs at compile time" other than System.Array. System.Array works -- but maybe there are alternatives?
In order to index the resulting array, I have to typecast back to T[]. I am scared that this typecast MIGHT be boxing the Array-of-Structs; I know that if it were (T) instead of (T[]), it would definitely box; hopefully it doesn't do that with T[] ?
Alternatively, I can use the System.Array methods, which definitely boxes the incoming and outgoing struct. This is a fairly major problem (although I could workaround it if were the only way to make C# work with Array-of-struct)
As far as I can see, what you are doing should work fine, but yes it will return a copy of a struct T instance when you call Get, and perform a replacement using a stack based instance when you call Set. Unless your structs are huge, this should not be a problem.
If they are huge and you want to
Read (some) properties of one of a struct instance in your array without creating a copy of it.
Update some of its fields (this requires mutable structs, which are generally a bad idea, although there are good reasons for using them)
then you can add the following to your class:
public delegate void Accessor<T>(ref T item) where T : struct;
public delegate TResult Projector<T, TResult>(ref T item) where T : struct;
public void Access<T>(int index, Accessor<T> accessor)
{
var array = (T[])structArraysByType[typeof(T)];
accessor(ref array[index]);
}
public TResult Project<T, TResult>(int index, Projector<T, TResult> projector)
{
var array = (T[])structArraysByType[typeof(T)];
return projector(ref array[index]);
}
Or simply return a reference to the underlying array itself, if you don't need to abstract it / hide the fact that your class encapsulates them:
public T[] GetArray<T>()
{
return (T[])structArraysByType[typeof(T)];
}
From which you can then simply access the elements:
var myThingsArray = MyStructArraysType.GetArray<MyThing>();
var someFieldValue = myThingsArray[10].SomeField;
myThingsArray[3].AnotherField = "Hello";
Alternatively, if there is no specific reason for them to be structs (i.e. to ensure sequential cache friendly fast access), you might want to simply use classes.
There is a much better solution planned for the next version of C#, but it does not exist in C# yet: the "return ref" feature. The .NET runtime already supports it, but the C# compiler does not.
Here's the Issue for tracking that feature: https://github.com/dotnet/roslyn/issues/118
With that, the entire problem becomes trivial "return ref the result".
(answer added for future, when the existing answer will become outdated (I hope), and because there's still time to comment on that proposal / add to it / improve it!)
I have arrived at a point in my self-taught studies where I am not fully grasping what a delegate in C# is useful for. Additionally, on a whim, I decided to take a look at an intro to C++ site and I came across function templates.
Maybe I'm comparing apples and oranges here, but I understood a delegate to be a sort of template for a function that would be defined at run-time. Is this true? If so, how does that differ from a function template in C++?
Is it possible to see (realistic) examples of each in use?
A delegate is a way of taking a member function of some object, and creating a...thing that can be called independently.
In other words, if you have some object A, with some member function F, that you'd normally call as something like: A.F(1);, a delegate is a single entity that you can (for example) pass as a parameter, that acts as a proxy for that object/member function, so when the delegate is invoked, it's equivalent to invoking that member function of that object.
It's a way of taking existing code, and...packaging it to make it easier to use in a fairly specific way (though I feel obliged to add, that 'way' is quite versatile so delegates can be extremely useful).
A C++ function template is a way of generating functions. It specifies some set of actions to take, but does not specify the specific type of object on which those actions will happen. The specification is at a syntactic level, so (for example) I can specify adding two things together to get a third item that's their sum. If I apply that to numbers, it sums like you'd expect. If I do the same with strings, it'll typically concatenate the strings. This is because (syntactically) the template just specifies something like a+b, but + is defined to do addition of numbers, and concatenation of strings.
Looked at slightly differently, a function template just specifies the skeleton for some code. The rest of that code's body is "filled in" based on the type, when you instantiate the template over some specific type.
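The a+b case described above can be sketched as a one-line function template. The name add is chosen only for this illustration; the compiler generates a separate function for each type the template is used with.

```cpp
#include <cassert>
#include <string>

// The template only says "a + b"; what "+" means is decided by T
// when the template is instantiated.
template <typename T>
T add(const T& a, const T& b) {
    return a + b;
}
```

Calling add(2, 3) instantiates add<int> and sums the numbers, while add(std::string("foo"), std::string("bar")) instantiates add<std::string> and concatenates. Instantiating it with a type that has no operator+ fails at compile time.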
In C++ terms a C# delegate combines an object pointer and a member function pointer into one callable entity, which calls the member function on the pointed to object.
You can do about the same with std::bind and std::function.
Before C++11 there was a short flurry of articles on how to do very efficient delegates in C++. std::function is a very reasonable compromise and may even attain those levels of efficiency.
Example:
#include <iostream>
#include <functional>
using namespace std;
// Here `function<void()>` serves as a "delegate" type.
void callback_on( function<void()> const f )
{
f();
}
struct S
{
int x;
void foo() const { cout << x << endl; }
};
int main()
{
S o = {42};
callback_on( bind( &S::foo, &o ) );
}
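Since C++11 a lambda that captures the object is a common alternative to std::bind. The following sketch, with the hypothetical names Accumulator, for_each_up_to, sum_with_bind and sum_with_lambda, passes the same member function as a "delegate" both ways.

```cpp
#include <cassert>
#include <functional>

// Hypothetical object whose member function we want to hand out as a callback.
struct Accumulator {
    int total = 0;
    void add(int n) { total += n; }
};

// Here std::function<void(int)> plays the role of the delegate type.
void for_each_up_to(int n, const std::function<void(int)>& f) {
    for (int i = 1; i <= n; ++i) {
        f(i);
    }
}

// Object pointer + member function pointer packaged with std::bind.
int sum_with_bind(int n) {
    Accumulator acc;
    for_each_up_to(n, std::bind(&Accumulator::add, &acc, std::placeholders::_1));
    return acc.total;
}

// The same packaging expressed as a lambda that captures the object by reference.
int sum_with_lambda(int n) {
    Accumulator acc;
    for_each_up_to(n, [&acc](int i) { acc.add(i); });
    return acc.total;
}
```

Both versions behave identically. In either case the callable must not outlive the object it refers to, which is something a C# delegate handles for you by keeping the target object alive.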
In C# we have the data type object, which can hold any type of data. I want to achieve the same thing in VC++. Can anyone let me know the VC++ equivalent of C#'s object?
In C#, in the calling application (say call.cs):
object ob=null;
ob=(object)str;
funct(ref ob);
Here str is an empty string.
I want to achieve this in VC++, so I need to create a VC++ equivalent of object.
I assume we need to use pointers as the equivalent of ref?
There isn't one. C++ doesn't have a unified type hierarchy like .NET languages have.
The closest you can get is a void* (pointer-to-void), which can point to any type of object. You should avoid void*s like the plague, though; once you start using them you lose any and all type safety.
As other commentators have said, C++ does not have a common base-class for every object. Theoretically, you could create your own and derive everything from it:
#include <sstream>
#include <string>
class Object
{
protected:
Object(){};
virtual ~Object(){};
public:
virtual std::string toString() const {return("");};
}; // eo class Object
This, however, won't help you with integral types such as int, short. You'd have to make your own:
class Int : public Object
{
private:
int m_nVal;
public:
Int(int _val = 0) : m_nVal(_val){};
Int(const Int& _rhs) : m_nVal(_rhs.m_nVal){};
virtual ~Int(){};
// operators
operator int() const {return(m_nVal);}
bool operator == (const Int& _rhs) const {return(m_nVal == _rhs.m_nVal);};
bool operator == (int _val) const {return(m_nVal == _val);};
Int& operator = (const Int& _rhs) {m_nVal = _rhs.m_nVal; return(*this);};
Int& operator = (int _val) {m_nVal = _val; return(*this);};
// .... and all the other operators
// overrides
virtual std::string toString() const
{
std::ostringstream oss;
oss << m_nVal;
return(oss.str());
};
}; // eo class Int
You'd then have to do this for all the other types you want to use. Once done you can pass them around as if they were ints, bools, longs etc (thanks to operator overloading). A better method would be to use a template class for the integral types:
template<class T> class IntegralType : public Object
{
private:
T m_Val;
public:
// rest of class looks the same
}; // eo class IntegralType<>
Then typedef them away:
typedef IntegralType<int> Int;
typedef IntegralType<short> Short;
typedef IntegralType<long> Long;
Even using a template-class like this to take the leg-work out of it, you'd still need a specialisation for strings/bools. implementing operator ++ on IntegralType<> will work fine for numbers, but is going to throw up on std::string.
If you went the template route, you've now got "Object", integral types and some specialisations for strings and bools. But to mimic .NET even more, you probably want to introduce interfaces for comparisons:
template<class T> class IEquatable
{
public:
virtual bool Equals(const T& _other) const = 0;
}; // eo class IEquatable<>
That can easily be plumbed in to your IntegralType<> classes and the specialisations.
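Putting the pieces together, a compact sketch of the template route might look like the following. It is only one possible way to fill in the "rest of class" that was left out above: the destructor is public here for brevity, only a conversion operator and Equals are implemented, and describe is a hypothetical helper added for the illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Common base class, as in the answer (simplified: public virtual destructor).
class Object {
public:
    virtual ~Object() {}
    virtual std::string toString() const { return ""; }
};

// Comparison interface modelled loosely on .NET's IEquatable<T>.
template <class T>
class IEquatable {
public:
    virtual ~IEquatable() {}
    virtual bool Equals(const T& other) const = 0;
};

// Wrapper that makes a built-in type usable wherever an Object is expected.
template <class T>
class IntegralType : public Object, public IEquatable<IntegralType<T> > {
public:
    IntegralType(T val = T()) : m_Val(val) {}

    // Lets the wrapper be used where a plain T is expected.
    operator T() const { return m_Val; }

    bool Equals(const IntegralType& other) const override {
        return m_Val == other.m_Val;
    }

    std::string toString() const override {
        std::ostringstream oss;
        oss << m_Val;
        return oss.str();
    }

private:
    T m_Val;
};

typedef IntegralType<int> Int;
typedef IntegralType<long> Long;

// Hypothetical helper: works on anything derived from Object.
std::string describe(const Object& o) {
    return o.toString();
}
```

With this in place, describe(Int(42)) goes through the virtual toString, and an Int can still be assigned to a plain int through the conversion operator.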
But as another commentator pointed out, why would you? boost::any is useful if you're trying to do something like a Tuple which has a name and a value of an arbitrary type. If you need to build a collection of these then there is something fundamentally wrong with your design. For example, in all my coding in C# I have never had to write:
List<Object> list = new List<Object>();
There may have been:
List<Vehicle> list;
List<Employee> List;
Dictionary<string, Alien> aliens;
But never anything at the Object level. Why? Well apart from calling ToString() on it, or perhaps doing some risky casting, why would you want to? Generics exist in programming so that we do not have to have lists of objects (or in the case of C++, void*).
So there you have it. The above shows how you might have objects and integral types working kind of like C#, and I've missed a chunk of stuff out. Now it's time to look at your design and decide if that's what you really need to do.
There's nothing built into the language. Usually, wanting it at all indicates that your design isn't very well thought out, but if you can't figure out any alternative, you might consider (for one example) Boost any.
The <comutil.h> header contains a handy wrapper for VARIANT. Takes care of proper initialization and cleanup.
#include <comutil.h>
#ifdef _DEBUG
# pragma comment(lib, "comsuppwd.lib")
#else
# pragma comment(lib, "comsuppw.lib")
#endif
...
_variant_t arg = L"some string";
someComPtr->func(&arg);
There isn't anything in your code snippet that would help me help you figuring out how to obtain the COM interface pointer. Start a new question about that if you have trouble.
The alternative for you is to look at System.Runtime.InteropServices.GCHandle, which allows you to find the managed object from unmanaged code, but in any way you will end up with nasty and risky type casts and you need to be really careful to keep somewhere a reference to the managed object as it might get garbage-collected if there is only a reference in unmanaged code.
There are a number of questions already on the definition of "ref" and "out" parameter but they seem like bad design. Are there any cases where you think ref is the right solution?
It seems like you could always do something else that is cleaner. Can someone give me an example of where this would be the "best" solution for a problem?
In my opinion, ref largely compensated for the difficulty of declaring new utility types and the difficulty of "tacking information on" to existing information, which are things that C# has taken huge steps toward addressing since its genesis through LINQ, generics, and anonymous types.
So no, I don't think there are a lot of clear use cases for it anymore. I think it's largely a relic of how the language was originally designed.
I do think that it still makes sense (like mentioned above) in the case where you need to return some kind of error code from a function as well as a return value, but nothing else (so a bigger type isn't really justified.) If I were doing this all over the place in a project, I would probably define some generic wrapper type for thing-plus-error-code, but in any given instance ref and out are OK.
Well, ref is generally used for specialized cases, but I wouldn't call it redundant or a legacy feature of C#. You'll see it (and out) used a lot in XNA, for example. In XNA, a Matrix is a struct, and a rather massive one at that (I believe 64 bytes), so it's generally best to pass it to functions using ref: that way only a 4- or 8-byte reference is copied instead of all 64 bytes. A specialist C# feature? Certainly. Of not much use any more, or indicative of bad design? I don't agree.
One area is in the use of small utility functions, like :
void Swap<T>(ref T a, ref T b) { T tmp = a; a = b; b = tmp; }
I don't see any 'cleaner' alternatives here. Granted, this isn't exactly Architecture level.
P/Invoke is the only place I can really think of a spot where you must use ref or out. Other cases, they can be convenient, but like you said, there is generally another, cleaner way.
What if you wanted to return multiple objects, that for some unknown reason are not tied together into a single object.
void GetXYZ( ref object x, ref object y, ref object z);
EDIT: divo suggested that out parameters would be more appropriate here. I have to admit, he's got a point. I'll leave this answer here for the record as an inadequate solution: out trumps ref in this case.
I think the best uses are those that you usually see; you need to have both a value and a "success indicator" that is not an exception from a function.
One design pattern where ref is useful is a bidirectional visitor.
Suppose you had a Storage class that can be used to load or save values of various primitive types. It is either in Load mode or Save mode. It has a group of overloaded methods called Transfer, and here's an example for dealing with int values.
public void Transfer(ref int value)
{
if (Loading)
value = ReadInt();
else
WriteInt(value);
}
There would be similar methods for other primitive types - bool, string, etc.
Then on a class that needs to be "transferable", you would write a method like this:
public void TransferViaStorage(Storage s)
{
s.Transfer(ref _firstName);
s.Transfer(ref _lastName);
s.Transfer(ref _salary);
}
This same single method can either load the fields from the Storage, or save the fields to the Storage, depending what mode the Storage object is in.
Really you're just listing all the fields that need to be transferred, so it closely approaches declarative programming instead of imperative. This means that you don't need to write two functions (one for reading, one for writing) and given that the design I'm using here is order-dependent then it's very handy to know for sure that the fields will always be read/written in identical order.
The general point is that when a parameter is marked as ref, you don't know whether the method is going to read it or write to it, and this allows you to design visitor classes that work in one of two directions, intended to be called in a symmetrical way (i.e. with the visited method not needing to know which direction-mode the visitor class is operating in).
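For readers coming from C++, where a non-const reference parameter plays the role of ref, the same bidirectional idea can be sketched end to end. Everything here is an assumption made for the illustration: the in-memory string buffer, the Rewind method, and the Employee type are not part of the design described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical storage that keeps every transferred value as a string, in order.
class Storage {
public:
    enum Mode { Saving, Loading };

    explicit Storage(Mode mode) : mode_(mode), pos_(0) {}

    // One overload per supported type; the reference is read when saving
    // and written when loading.
    void Transfer(int& value) {
        if (mode_ == Loading) {
            value = std::stoi(Next());
        } else {
            fields_.push_back(std::to_string(value));
        }
    }

    void Transfer(std::string& value) {
        if (mode_ == Loading) {
            value = Next();
        } else {
            fields_.push_back(value);
        }
    }

    // Switch direction and start again from the first stored field.
    void Rewind(Mode mode) {
        mode_ = mode;
        pos_ = 0;
    }

private:
    std::string Next() { return fields_.at(pos_++); }

    Mode mode_;
    std::size_t pos_;
    std::vector<std::string> fields_;
};

// Hypothetical "transferable" type: one method lists the fields once,
// and that single list serves for both saving and loading.
struct Employee {
    std::string firstName;
    std::string lastName;
    int salary;

    void TransferViaStorage(Storage& s) {
        s.Transfer(firstName);
        s.Transfer(lastName);
        s.Transfer(salary);
    }
};
```

Saving one Employee, calling Rewind(Storage::Loading), and then running TransferViaStorage on a second Employee copies the fields across in the same order they were written.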
Comparison: Attributes + Reflection
Why do this instead of attributing the fields and using reflection to automatically implement the equivalent of TransferViaStorage? Because sometimes reflection is slow enough to be a bottleneck (but always profile to be sure of this - it's hardly ever true, and attributes are much closer to the ideal of declarative programming).
The real use for this is when you create a struct. Structs in C# are value types and therefore always are copied completely when passed by value. If you need to pass it by reference, for example for performance reasons or because the function needs to make changes to the variable, you would use the ref keyword.
I could see if someone has a struct with 100 values (obviously a problem already), you'd likely want to pass it by reference to prevent 100 values copying. That and returning that large struct and writing over the old value would likely have performance issues.
The obvious reason for using the "ref" keyword is when you want to pass a variable by reference, for example passing a value type like System.Int32 to a method and altering its actual value. A more specific use might be when you want to swap two variables.
public void Swap(ref int a, ref int b)
{
...
}
The main reason for using the "out" keyword is to return multiple values from a method. Personally I prefer to wrap the values in a specialized struct or class, since using out parameters produces rather ugly code. Parameters passed with "out" are, just like "ref" parameters, passed by reference.
public void DoMagic(out int a, out int b, out int c, out int d)
{
...
}
There is one clear case when you must use the 'ref' keyword: the variable is declared outside the method you intend to call, but the object is not created there, AND the method you call is supposed to do the 'new' to create it. (In the snippets below, Item stands for any class with a Name property.)
Item a = null; Funct(a);
void Funct(Item o) { o = new Item(); o.Name = "dummy"; }
This will NOT do a thing with 'a', nor will it complain about it at either compile time or run time. It just won't do anything.
Item a = null; Funct(ref a);
void Funct(ref Item o) { o = new Item(); o.Name = "dummy"; }
This results in 'a' referring to a new object with the name "dummy". But if the 'new' was already done by the caller, then ref is not needed (although it works if supplied):
Item a = new Item(); Funct(a);
void Funct(Item o) { o.Name = "dummy"; }
I have a class, and I want to inspect its fields and report eventually how many bytes each field takes. I assume all fields are of type Int32, byte, etc.
How can I find out easily how many bytes does the field take?
I need something like:
Int32 a;
// int a_size = a.GetSizeInBytes;
// a_size should be 4
You can't, basically. It will depend on padding, which may well be based on the CLR version you're using and the processor etc. It's easier to work out the total size of an object, assuming it has no references to other objects: create a big array, use GC.GetTotalMemory for a base point, fill the array with references to new instances of your type, and then call GetTotalMemory again. Take one value away from the other, and divide by the number of instances. You should probably create a single instance beforehand to make sure that no new JITted code contributes to the number. Yes, it's as hacky as it sounds - but I've used it to good effect before now.
Just yesterday I was thinking it would be a good idea to write a little helper class for this. Let me know if you'd be interested.
EDIT: There are two other suggestions, and I'd like to address them both.
Firstly, the sizeof operator: this only shows how much space the type takes up in the abstract, with no padding applied round it. (It includes padding within a structure, but not padding applied to a variable of that type within another type.)
Next, Marshal.SizeOf: this only shows the unmanaged size after marshalling, not the actual size in memory. As the documentation explicitly states:
The size returned is the actually the size of the unmanaged type. The unmanaged and managed sizes of an object can differ. For character types, the size is affected by the CharSet value applied to that class.
And again, padding can make a difference.
Just to clarify what I mean about padding being relevant, consider these two classes:
class FourBytes { byte a, b, c, d; }
class FiveBytes { byte a, b, c, d, e; }
On my x86 box, an instance of FourBytes takes 12 bytes (including overhead). An instance of FiveBytes takes 16 bytes. The only difference is the "e" variable - so does that take 4 bytes? Well, sort of... and sort of not. Fairly obviously, you could remove any single variable from FiveBytes to get the size back down to 12 bytes, but that doesn't mean that each of the variables takes up 4 bytes (think about removing all of them!). The cost of a single variable just isn't a concept which makes a lot of sense here.
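The same "a field's cost is not simply its size" effect can be observed in C++, where sizeof is available directly. This is only an analogy: the CLR lays out objects by its own rules, and the struct names below are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Four one-byte fields pack together without gaps on common platforms.
struct FourChars {
    char a, b, c, d;
};

// An int followed by a single char: the struct is padded at the end so
// that every element of an array of IntAndChar keeps the int aligned.
struct IntAndChar {
    int i;
    char c;
};

// Bytes that belong to no field at all.
const std::size_t kPaddingInIntAndChar =
    sizeof(IntAndChar) - (sizeof(int) + sizeof(char));
```

On a typical platform with 4-byte int, sizeof(IntAndChar) is 8 rather than 5, so adding the one-byte field "costs" 4 bytes, while adding three more chars after it would cost nothing. The exact numbers are implementation-defined, which is the point made above about padding.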
Depending on the asker's needs, Marshal.SizeOf might or might not give you what you want. (Edited after Jon Skeet posted his answer.)
using System;
using System.Runtime.InteropServices;
public class MyClass
{
public static void Main()
{
Int32 a = 10;
Console.WriteLine(Marshal.SizeOf(a));
Console.ReadLine();
}
}
Note that, as jkersch says, sizeof can be used, but unfortunately only with value types. If you need the size of a class, Marshal.SizeOf is the way to go.
Jon Skeet has laid out why neither sizeof nor Marshal.SizeOf is perfect. I guess the asker needs to decide whether either is acceptable for his problem.
Following Jon Skeet's recipe in his answer, I tried to make the helper class he was referring to. Suggestions for improvements are welcome.
public class MeasureSize<T>
{
private readonly Func<T> _generator;
private const int NumberOfInstances = 10000;
private readonly T[] _memArray;
public MeasureSize(Func<T> generator)
{
_generator = generator;
_memArray = new T[NumberOfInstances];
}
public long GetByteSize()
{
//Make one to make sure it is jitted
_generator();
long oldSize = GC.GetTotalMemory(false);
for(int i=0; i < NumberOfInstances; i++)
{
_memArray[i] = _generator();
}
long newSize = GC.GetTotalMemory(false);
return (newSize - oldSize) / NumberOfInstances;
}
}
Usage:
It should be created with a Func that generates new instances of T. Make sure the same instance is not returned every time. For example, this would be fine:
public long SizeOfSomeObject()
{
var measure = new MeasureSize<SomeObject>(() => new SomeObject());
return measure.GetByteSize();
}
It can be done indirectly, without considering alignment.
The number of bytes a reference type instance takes equals the size of the service fields plus the size of the type's own fields.
Service fields (each takes 4 bytes on 32-bit, 8 bytes on 64-bit):
Sync block index
Pointer to the method table
Optionally (only for arrays), the array size
So an instance of a class without any fields takes 8 bytes on a 32-bit machine. If the class has one field, a reference to an instance of the same class, then an instance takes (on 64-bit):
Sync block index + pMthdTable + reference to the class = 8 + 8 + 8 = 24 bytes
A value type does not have the service fields, so it takes only the size of its own fields. For example, a struct with one int field takes only 4 bytes of memory on a 32-bit machine.
I had to boil this down all the way to IL level, but I finally got this functionality into C# with a very tiny library.
You can get it (BSD licensed) at bitbucket
Example code:
using Earlz.BareMetal;
...
Console.WriteLine(BareMetal.SizeOf<int>()); //returns 4 everywhere I've tested
Console.WriteLine(BareMetal.SizeOf<string>()); //returns 8 on 64-bit platforms and 4 on 32-bit
Console.WriteLine(BareMetal.SizeOf<Foo>()); //returns 16 in some places, 24 in others. Varies by platform and framework version
...
struct Foo
{
int a, b;
byte c;
object foo;
}
Basically, what I did was write a quick class-method wrapper around the sizeof IL instruction. This instruction will get the raw amount of memory a reference to an object will use. For instance, if you had an array of T, then the sizeof instruction would tell you how many bytes apart each array element is.
This is extremely different from C#'s sizeof operator. For one, C# only allows pure value types, because it's not really possible to get the size of anything else in a static manner. In contrast, the sizeof instruction works at the runtime level, so it returns however much memory a reference to the type uses in this particular run.
You can see some more info and a bit more in-depth sample code at my blog
If you have the type, use the sizeof operator. It will return the type's size in bytes.
e.g.
Console.WriteLine(sizeof(int));
will output:
4
You can use method overloading as a trick to determine the field size:
public static int FieldSize(int Field) { return sizeof(int); }
public static int FieldSize(bool Field) { return sizeof(bool); }
public static int FieldSize(SomeStructType Field) { return sizeof(SomeStructType); }
Simplest way is: int size = *((int*)type.TypeHandle.Value + 1)
I know this is an implementation detail, but the GC relies on it, and it needs to be as close to the start of the method table as possible for efficiency. Considering how complex the GC code is, nobody will dare to change it in the future. In fact it works for every minor and major version of .NET Framework and .NET Core. (I am currently unable to test 1.0.)
If you want a more reliable way, emit a struct in a dynamic assembly with [StructLayout(LayoutKind.Auto)] and exactly the same fields in the same order, then take its size with the sizeof IL instruction. You may want to emit a static method within the struct that simply returns this value. Then add 2*IntPtr.Size for the object header. This should give you the exact value.
But if your class derives from another class, you need to find the size of each base class separately and add them up, plus 2*IntPtr.Size again for the header. You can do this by getting the fields with the BindingFlags.DeclaredOnly flag.
System.Runtime.CompilerServices.Unsafe
Use System.Runtime.CompilerServices.Unsafe.SizeOf<T>() where T: unmanaged
(when not running in .NET Core you need to install that NuGet package)
Documentation states:
Returns the size of an object of the given type parameter.
It seems to use the sizeof IL instruction, just as Earlz's solution does. (source)
The unmanaged constraint is new in C# 7.3