I'm just returning to C# after an extended period of C++ and Qt. I'm currently stumped by what I would have thought to be a very simple problem.
I have a struct:
struct FontGlyph {
public uint codepoint;
public byte[] bmp;
...
}
And an array of such structs:
FontGlyph[] glyphs = new FontGlyph[100];
Now, I have a couple of functions which set up and modify the fields in the structs:
static void ConvertFont(Font fnt) {
...
if (fnt.IsSpaceCharacter) {
glyphs[idx].codepoint = 0x00FF;
glyphs[idx].bmp = null;
}
else {
RenderFontGlyph(glyphs[idx]);
// glyphs[idx].bmp is NOT ok here - it is null!
}
...
}
static void RenderFontGlyph(FontGlyph glyph) {
...
glyph.bmp = new byte[256];
// bmp is fine here, and can be populated as a normal array
...
}
This isn't a particularly great snippet of code; however, in the RenderFontGlyph function I can see that the bmp array is allocated correctly, yet when RenderFontGlyph returns and I inspect glyphs[idx], bmp is back to null.
I appreciate I'm probably doing something n00bish, but it's been a while. Am I a victim of garbage collection or am I being stupid? It had occurred to me that the struct was being passed into the RenderFontGlyph function by-value rather than by-ref but this also makes no difference!
It had occurred to me that the struct was being passed into the RenderFontGlyph function by-value rather than by-ref but this also makes no difference!
Well yes, it does. You're creating a copy of the struct, and passing that into RenderFontGlyph. Any changes made to that copy don't affect anything else.
If you pass it by reference instead, it will make a difference, because you'll be modifying the original storage location in the array:
RenderFontGlyph(ref glyphs[idx]);
...
static void RenderFontGlyph(ref FontGlyph glyph)
Or you could keep using a value parameter, and make RenderFontGlyph return the modified value which you'd need to store back in the array, as per Leonardo's answer.
I certainly wouldn't go so far as to say that you're being stupid, but it's really, really important that you understand the semantics of reference types and value types, particularly if you're creating mutable value types. (And worse, a mutable value type containing a reference to a mutable reference type - the array in this case. You can mutate the array without mutating the struct... this could all become very confusing if you're not careful.)
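To make the parenthetical above concrete, here is a minimal sketch that reuses the FontGlyph struct from the question. The class name SharedArrayDemo and the values are purely illustrative. Copying the struct copies the bmp reference rather than the array, so both copies see element writes, while reassigning a field affects only the copy it is made on.

```csharp
using System;

struct FontGlyph
{
    public uint codepoint;
    public byte[] bmp;
}

static class SharedArrayDemo
{
    static void Main()
    {
        FontGlyph original = new FontGlyph { codepoint = 0x41, bmp = new byte[4] };
        FontGlyph copy = original;   // copies both fields; bmp still refers to the same array

        copy.bmp[0] = 42;            // mutates the shared array, not either struct
        Console.WriteLine(original.bmp[0]);     // 42

        copy.bmp = new byte[8];      // reassigns only the copy's field
        Console.WriteLine(original.bmp.Length); // 4

        copy.codepoint = 0x42;       // likewise only changes the copy
        Console.WriteLine(original.codepoint);  // 65
    }
}
```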
Unless you have a really good reason to create mutable value types, I'd strongly advise against it - just like I'd also advise against exposing public fields. You should almost certainly be modelling FontGlyph as a class - it doesn't feel like a natural value type to me. If you do want to model it as a value type, then rather than passing in a FontGlyph at all, why not just pass in the code point you want to render, and make the method return the glyph?
glyphs[0] = RenderGlyph(codePoint);
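As a rough sketch of that last suggestion, assuming you do keep a value type: the struct can be made immutable and built in one go by the rendering method. The property names CodePoint and Bitmap, the constructor, and the body of RenderGlyph are assumptions for illustration, not the question's real rendering code.

```csharp
using System;

// Immutable value type: fields are set once, in the constructor.
readonly struct FontGlyph
{
    public uint CodePoint { get; }
    public byte[] Bitmap { get; }

    public FontGlyph(uint codePoint, byte[] bitmap)
    {
        CodePoint = codePoint;
        Bitmap = bitmap;
    }
}

static class GlyphFactoryDemo
{
    // Hypothetical stand-in for the real rendering code.
    public static FontGlyph RenderGlyph(uint codePoint)
    {
        byte[] bitmap = new byte[256];
        // ... populate bitmap here ...
        return new FontGlyph(codePoint, bitmap);
    }

    static void Main()
    {
        FontGlyph[] glyphs = new FontGlyph[100];
        glyphs[0] = RenderGlyph(0x41);               // store the returned value in the array
        Console.WriteLine(glyphs[0].Bitmap.Length);  // 256
    }
}
```

With this shape there is no copy to mutate by accident: the only way to change an array slot is to assign a whole new glyph to it.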
As you're claiming that pass-by-reference isn't working for you, here's a complete example demonstrating that it does work. You should compare this with your code to see what you're doing wrong:
using System;
struct FontGlyph
{
public uint codepoint;
public byte[] bmp;
}
class Program
{
static void Main()
{
FontGlyph[] glyphs = new FontGlyph[100];
RenderFontGlyph(ref glyphs[0]);
Console.WriteLine(glyphs[0].bmp.Length); // 10
}
static void RenderFontGlyph(ref FontGlyph glyph)
{
glyph.bmp = new byte[10];
}
}
How about:
static void ConvertFont(Font fnt) {
...
if (fnt.IsSpaceCharacter) {
glyphs[idx].codepoint = 0x00FF;
glyphs[idx].bmp = null;
}
else {
glyphs[idx] = RenderFontGlyph(glyphs[idx]);
// glyphs[idx].bmp is set here, because the modified copy was stored back
}
...
}
static FontGlyph RenderFontGlyph(FontGlyph glyph) {
...
glyph.bmp = new byte[256];
// bmp is fine here, and can be populated as a normal array
...
return glyph;
}
or use ref like this: static void RenderFontGlyph(ref FontGlyph glyph) and then call it like this: RenderFontGlyph(ref glyphs[idx])
I wish to obtain the original value a Span represents. Take the following code for example, how would I, in DoWork, gain access to the original byte array without creating a copy of it?
static void Main()
{
var data = new byte[0x100];
DoWork(new Span<byte>(data));
}
private void DoWork(Span<byte> Data)
{
//var data = Data.ToArray(); Unsuitable; creates a copy
//var data = (byte[])Data; Unsuitable; doesn't work
//MemoryMarshal. Something in here may work, but unsure
//MemoryExtensions. Something in here may work, but unsure
}
I found 2 static classes with helper methods (shown above) that may help, but I am unsure as to what is the best way to do this without making things slower than just making a copy.
According to the Span<T> documentation:
Because it is a stack-only type, Span<T> is unsuitable for many scenarios that require storing references to buffers on the heap. This is true, for example, of routines that make asynchronous method calls. For such scenarios, you can use the complementary System.Memory<T> and System.ReadOnlyMemory<T> types.
So, depending on your needs, you may not have to use a Span at all:
static void Main()
{
var data = new byte[0x100];
DoWork(data);
}
private static void DoWork(byte[] data)
{
// data refers to the same array as in Main; no copy is made.
}
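If the method really must accept a slice type rather than a byte[], one option that fits the quoted documentation is to pass a Memory<byte> and ask MemoryMarshal.TryGetArray for the backing array. The sketch below assumes the memory was created over a managed array. The names MemoryDemo and GetBackingArray are made up for this example.

```csharp
using System;
using System.Runtime.InteropServices;

static class MemoryDemo
{
    // Returns the array backing 'memory' without copying, or null when the
    // memory is not backed by a managed array (for example, native memory).
    public static byte[] GetBackingArray(Memory<byte> memory)
    {
        if (MemoryMarshal.TryGetArray<byte>(memory, out ArraySegment<byte> segment))
        {
            // segment.Offset and segment.Count describe which part of the
            // array the memory actually covers.
            return segment.Array;
        }
        return null;
    }

    static void Main()
    {
        var data = new byte[0x100];
        byte[] backing = GetBackingArray(new Memory<byte>(data));
        Console.WriteLine(ReferenceEquals(backing, data)); // True
    }
}
```

Note that TryGetArray works on Memory<T>/ReadOnlyMemory<T>, not on Span<T>, which is one more reason to prefer passing the array or a Memory<byte> here.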
In C++ I have the following struct from 3rd-party code:
typedef struct NodeInfoTag
{
long lResult;
int bComplete;
char *pszNodeAddr;
char *pszParentAddr;
RTS_WCHAR *pwszNodeName;
RTS_WCHAR *pwszDeviceName;
RTS_WCHAR *pwszVendorName;
unsigned long ulTargetType;
unsigned long ulTargetId;
unsigned long ulTargetVersion;
unsigned short wMaxChannels;
}NodeInfotyp;
And the definition to RTS_WCHAR:
# ifndef RTS_WCHAR_DEFINED
# define RTS_WCHAR_DEFINED
typedef wchar_t RTS_WCHAR; /* wide character value */
# endif
(So it's basically a wchar_t)
Then I have my own class called CScanNetworkCallback, which extends the CPLCHandlerCallback class, a class from the same vendor:
.h file:
class CScanNetworkCallback : public CPLCHandlerCallback
{
public:
bool bScanComplete;
NodeInfotyp* pNodeInfo;
NodeInfotyp* pNodeInfoList;
std::vector<NodeInfotyp> vList;
CScanNetworkCallback();
virtual ~CScanNetworkCallback(void);
virtual long Notify(CPLCHandler *pPlcHandler, CallbackAddInfoTag CallbackAdditionalInfo);
};
The implementation follows their own guidelines with some of my own stuff thrown in:
CScanNetworkCallback::CScanNetworkCallback(void) : CPLCHandlerCallback()
{
bScanComplete = false;
}
CScanNetworkCallback::~CScanNetworkCallback()
{
delete pNodeInfo;
delete pNodeInfoList;
}
long CScanNetworkCallback::Notify(CPLCHandler *pPlcHandler, CallbackAddInfoTag CallbackAdditionalInfo)
{
if (pPlcHandler != NULL)
{
if (CallbackAdditionalInfo.ulType == PLCH_SCAN_NETWORK_CALLBACK)
{
pNodeInfo = CallbackAdditionalInfo.AddInf.pNodeInfo;
if (pNodeInfo->lResult == RESULT_OK)
{
vList.push_back(*pNodeInfo);
bScanComplete = false;
}
else
{
pNodeInfoList = &vList[0]; //New pointer points to the vector elements, which will be used as an array later on
// I have also tried copying it, to the same result:
//std::copy(vList.begin(), vList.end(), pNodeInfoList);
bScanComplete = true;
}
}
}
return RESULT_OK;
}
So basically, the Notify method in the class above is called every time a "node" is found in the network, assigning the node's information to pNodeInfo (please disregard what a node is; it isn't relevant at the moment). Since it is called for every node in the network during the scanning process and I must send this information to C#, I couldn't find any other way to do so than using a std::vector to store every callback's info for later use, as I don't know at compile time how many nodes there will be. The else part is called after all nodes have been found. To make sense of the C# code, I must describe the implementation of some other C++ methods that are P/Invoked:
PROASADLL __declspec(dllexport) void scanNetwork(){
pScanHandler->ScanNetwork(NULL, &scanNetworkCallback);
}
The object scanNetworkCallback is static. pScanHandler is a pointer to another class from the 3rd-party vendor, and its ScanNetwork method runs on a separate thread. Internally (and I only know this from the API guidelines; I don't have its source code), it calls the Notify method whenever a node is found in the network, or something to that effect.
And finally:
PROASADLL __declspec(dllexport) NodeInfotyp* getScanResult(int* piSize) {
*piSize = scanNetworkCallback.vList.size();
return scanNetworkCallback.pNodeInfoList;
}
That returns the pointer to all nodes' information, and the count as an out parameter. Now let's take a look at the C# code:
public static List<NodeInfoTag> AsaScanNetworkAsync()
{
Console.WriteLine("SCANNING NETWORK");
scanNetwork(); // C++ Method
while (!isScanComplete()) // Holds the C# thread until the scan is complete
Thread.Sleep(50);
int size = 0;
IntPtr pointer = getScanResult(out size); // works fine, I get some IntPtr and the correct size
List<NodeInfoTag> list = Marshaller.MarshalPointerToList<NodeInfoTag>(pointer, size); // PROBLEM!!!
// Continue doing stuff
}
This is the class NodeInfoTag, to match the C++ NodeInfotyp struct:
[StructLayout(LayoutKind.Sequential)]
public class NodeInfoTag
{
public int Result;
public int Complete;
[MarshalAs(UnmanagedType.LPStr)] //char*
public string NodeAddress;
[MarshalAs(UnmanagedType.LPStr)] //char*
public string ParentAddress;
[MarshalAs(UnmanagedType.LPWStr)] //wchar_t
public string VendorName;
public uint TargetType;
public uint TargetId;
public uint TargetVersion;
public short MaxChannels;
}
And this is where I get my Memory Access Violation:
internal class Marshaller
{
public static List<T> MarshalPointerToList<T>(IntPtr pointer, int size)
{
if (size == 0)
return null;
List<T> list = new List<T>();
var symbolSize = Marshal.SizeOf(typeof(T));
for (int i = 0; i < size; i++)
{
var current = (T)Marshal.PtrToStructure(pointer, typeof(T));
list.Add(current);
pointer = new IntPtr(pointer.ToInt32() + symbolSize);
}
return list;
}
}
The error occurs specifically where marshaling should take place, at the line var current = (T)Marshal.PtrToStructure(pointer, typeof(T));. This C# code used to work just fine, but the C++ part was terrible, convoluted and error-prone, so I decided to simplify things. I can't figure out for the life of me why I'm getting this exception, as I'm making sure that all C++ resources remain available to C#: for testing purposes I don't delete anything in C++, and I'm only using variables with global scope within the class, which is allocated in static memory. So, what did I miss?
Edit: I removed pNodeInfoList = &vList[0]; and rewrote getScanResult as follows:
static NodeInfotyp pNodeInfoList;
//(...)
PROASADLL __declspec(dllexport) NodeInfotyp* getScanResult(int* piSize) {
*piSize = scanNetworkCallback.vList.size();
std::move(scanNetworkCallback.vList.begin(),
scanNetworkCallback.vList.end(), &pNodeInfoList);
return &pNodeInfoList;
}
No dice. I don't use new or malloc for any of the variables involved, and I even changed pNodeInfoList (the array) from a class member to a global variable. I'm also using move, which I've been told can be used to solve ownership problems. Any other tips?
Ownership is not part of the naive C++ type system, so you will not get an error when you delete a pointer you do not own or transfer ownership away without giving it up.
However, semantically certain values and pointers and data blocks are owned by certain types or values.
In this case the vector owns its block of memory. There is no way to ask it or make it give up ownership.
Calling .data() only provides you with a pointer; it does not give that pointer semantic ownership.
You store the return value of .data() in a member variable. You later call delete on that member variable. This indicates to me that the member variable is supposed to own its data. So you double delete (as both the vector and the pointer think they own the data pointed to), and the program crashes as a result.
You need to rewrite your code taking into account the lifetime and ownership of every block of memory you are working with. One approach is to never call new, malloc, delete or free directly, and always use memory-managing types like vector and unique_ptr. Avoid persisting raw pointers, as their ownership semantics are not clear from the type.
I'm trying to use this great project, but since I need to scan many images the process takes a lot of time, so I was thinking about multi-threading it.
However, since the class that does the actual image processing uses static methods and manipulates objects by ref, I'm not really sure how to do it right. The method that I call from my main thread is:
public static void ScanPage(ref System.Collections.ArrayList CodesRead, Bitmap bmp, int numscans, ScanDirection direction, BarcodeType types)
{
//added only the signature, actual class has over 1000 rows
//inside this function there are calls to other
//static functions that makes some image processing
}
My question is whether it's safe to use this function like this:
List<string> filePaths = new List<string>();
Parallel.For(0, filePaths.Count, a =>
{
ArrayList al = new ArrayList();
BarcodeImaging.ScanPage(ref al, ...);
});
I've spent hours debugging it, and most of the time the results I got were correct, but I did encounter several errors which I now can't seem to reproduce.
EDIT
I pasted the code of the class to here: http://pastebin.com/UeE6qBHx
I'm pretty sure it is thread safe.
There are two fields, which are configuration fields and are not modified inside the class.
So basically this class has no state and all calculation has no side effects
(Unless I don't see something very obscure).
Ref modifier is not needed here, because the reference is not modified.
There's no way of telling unless you know if it stores values in local variables or in a field (in the static class, not the method).
All local variables will be fine and instanced per call, but the fields will not.
A very bad example:
public static class TestClass
{
public static double Data;
public static string StringData = "";
// Can, and will quite often, return wrong values.
// for example returning the result of f(8) instead of f(5)
// if Data is changed before StringData is calculated.
public static string ChangeStaticVariables(int x)
{
Data = Math.Sqrt(x) + Math.Sqrt(x);
StringData = Data.ToString("0.000");
return StringData;
}
// Won't return the wrong values, as the variables
// can't be changed by other threads.
public static string NonStaticVariables(int x)
{
var tData = Math.Sqrt(x) + Math.Sqrt(x);
return tData.ToString("0.000");
}
}
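For the Parallel.For pattern in the question, a minimal sketch of the safe shape looks like this: each iteration owns its own list, the scanning method touches only its parameters and locals, and results are merged through a thread-safe collection. ScanPage here is a hypothetical stand-in, not the real BarcodeImaging.ScanPage, and ParallelScanDemo and ScanAll are made-up names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static class ParallelScanDemo
{
    // Hypothetical stand-in for the real scanner: it uses only its
    // parameters and locals, never static fields.
    static void ScanPage(List<string> codesRead, string path)
    {
        codesRead.Add("code-for-" + path);
    }

    public static int ScanAll(IReadOnlyList<string> filePaths)
    {
        var allCodes = new ConcurrentBag<string>();   // safe to add to from many threads
        Parallel.For(0, filePaths.Count, i =>
        {
            var local = new List<string>();           // one list per iteration, never shared
            ScanPage(local, filePaths[i]);
            foreach (string code in local)
                allCodes.Add(code);
        });
        return allCodes.Count;
    }

    static void Main()
    {
        var paths = new List<string> { "a.png", "b.png", "c.png" };
        Console.WriteLine(ScanAll(paths)); // 3
    }
}
```

If the real method wrote to a static field the way ChangeStaticVariables does, this structure alone would not make it safe.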
I'm working with ref and don't clearly understand whether it is like a pointer as in C/C++ or like a reference in C++.
Why am I asking what may look like a weak question?
Because when I read C#/.NET books and MSDN, or talk to C# developers, I get confused for the following reasons:
C# developers suggest NOT using ref in function arguments: ...(ref Type someObject) doesn't smell right to them, and they suggest ...(Type someObject) instead. I don't really understand this suggestion. The reasons I've heard: it's better to work with a copy of the object and then use it as a return value, so as not to corrupt memory through a reference, etc. I often hear this explanation in connection with DB connection objects. Given my plain C/C++ experience, I really don't understand why using a reference is a bad thing in C#. I control the lifetime of the object and its memory allocations/re-allocations, etc. In books and forums I only read advice that it's bad because you can corrupt your connection and cause a memory leak by losing a reference; but if I control the lifetime of the object and can manually control what I really want, why is it bad?
After reading different books and talking to different people, I still don't clearly understand whether ref is a pointer (*) or a reference like & in C++. As I remember, a pointer in C/C++ always occupies space the size of a void* (4 bytes, though the actual size depends on the architecture), which holds the address of a structure or variable. In C++, passing a reference with & involves no new allocation on the heap or stack: you work with objects already defined in memory, and no extra memory is allocated for a pointer as in plain C. So what is ref in C#? Does the .NET VM handle it like a pointer in plain C/C++, with its GC allocating temporary space for a pointer, or does it work like a reference in C++? Does ref only work correctly with managed types, or is it better for value types like bool and int to switch to unsafe code and pass a pointer in the unmanaged style?
In C#, when you see something referring to a reference type (that is, a type declared with class instead of struct), then you're essentially always dealing with the object through a pointer. In C++, everything is a value type by default, whereas in C# everything is a reference type by default.
When you say "ref" in a C# parameter list, what you're really saying is more like "pointer to a pointer": in the method, you want to replace not the contents of the object but the reference to the object itself, in the code calling your method.
Unless that is your intent, then you should just pass the reference type directly; in C#, passing reference types around is cheap (akin to passing a reference in C++).
Learn/understand the difference between value types and reference types in C#. They're a major concept in that language and things are going to be really confusing if you try to think using the C++ object model in C# land.
The following are essentially semantically equivalent programs:
#include <iostream>
class AClass
{
int anInteger;
public:
AClass(int integer)
: anInteger(integer)
{ }
int GetInteger() const
{
return anInteger;
}
void SetInteger(int toSet)
{
anInteger = toSet;
}
};
struct StaticFunctions
{
// C# doesn't have free functions, so I'll do similar in C++
// Note that in real code you'd use a free function for this.
static void FunctionTakingAReference(AClass *item)
{
item->SetInteger(4);
}
static void FunctionTakingAReferenceToAReference(AClass **item)
{
*item = new AClass(1729);
}
};
int main()
{
AClass* instanceOne = new AClass(6);
StaticFunctions::FunctionTakingAReference(instanceOne);
std::cout << instanceOne->GetInteger() << "\n";
AClass* instanceTwo;
StaticFunctions::FunctionTakingAReferenceToAReference(&instanceTwo);
// Note that operator& behaves similar to the C# keyword "ref" at the call site.
std::cout << instanceTwo->GetInteger() << "\n";
// (Of course in real C++ you're using std::shared_ptr and std::unique_ptr instead,
// right? :) )
delete instanceOne;
delete instanceTwo;
}
And for C#:
using System;
internal class AClass
{
public AClass(int integer)
{
Integer = integer; // C# has no C++-style member initializer lists
}
public int Integer { get; set; } // public so the code below can access it
}
internal static class StaticFunctions
{
public static void FunctionTakingAReference(AClass item)
{
item.Integer = 4;
}
public static void FunctionTakingAReferenceToAReference(ref AClass item)
{
item = new AClass(1729);
}
}
public static class Program
{
public static void Main()
{
AClass instanceOne = new AClass(6);
StaticFunctions.FunctionTakingAReference(instanceOne);
Console.WriteLine(instanceOne.Integer);
AClass instanceTwo = new AClass(1234); // C# forces me to assign this before
// it can be passed. Use "out" instead of
// "ref" and that requirement goes away.
StaticFunctions.FunctionTakingAReferenceToAReference(ref instanceTwo);
Console.WriteLine(instanceTwo.Integer);
}
}
A ref in C# is equivalent to a C++ reference:
Their intent is pass-by-reference
There are no null references
There are no uninitialized references
You cannot rebind references
When you spell the reference, you are actually denoting the referred variable
Some C++ code:
void foo(int& x)
{
x = 42;
}
// ...
int answer = 0;
foo(answer);
Equivalent C# code:
void foo(ref int x)
{
x = 42;
}
// ...
int answer = 0;
foo(ref answer);
Every reference in C# is a pointer to an object on the heap, like a pointer in C++, and ref in C# is the same as & in C++.
The reason ref should be avoided is that C# works on the principle that a method should not change the object passed as a parameter, because someone who does not have the method's source may not know whether calling it will result in loss of data.
String a = " A ";
String b = a.Trim();
In this case I am confident that a remains intact. As in mathematics, a change should be visible as an assignment, which tells us that b is changed here with the programmer's consent.
a = a.Trim();
This code will modify a itself and the coder is aware of it.
To preserve this convention of change by assignment, ref should be avoided except in exceptional cases.
C# has no equivalent of C++ pointers and works on references. ref adds a level of indirection. It makes a value-type argument a reference, and when used with a reference type it makes it a reference to a reference.
In short, it allows changes to a value type to be carried outside a method call. For a reference type, it allows the original reference to be replaced with a totally different object (and not just the object's content to be changed). It can be used if you want to re-initialize an object inside a method and the only way to do it is to recreate it, although I would try to avoid such an approach.
So to answer your question, ref would be like a C++ reference to a reference.
EDIT
The above is true for safe code. Pointers do exist in unsafe C# and are used in some very specific cases.
This seems like a disposing/eventing nightmare. If I have an object whose events have subscribers and I pass it into a function by reference, and that reference is then reassigned, Dispose should be called or the memory will stay allocated until the program is closed. If Dispose is called, everything registered to the object's events will no longer be registered, and everything the object is registered for will no longer be registered either. How would someone keep this straight? I guess you could compare memory addresses and try to bring things back to sanity, if you don't go insane first.
In C#, you can enable unsafe code in your project properties
and then run this code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Exercise_01
{
public struct Coords
{
public int X;
public int Y;
public override string ToString() => $"({X}, {Y})";
}
class Program
{
static unsafe void Main(string[] args)
{
int n = 0;
SumCallByRefPointer(1, 2, &n);
Console.Clear();
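Console.WriteLine("call by reference {0}",n); // 3: written through the pointer
n = 0;
SumCallByValue(3, 4, n);
Console.WriteLine("call by Value {0}", n); // 0: only the callee's copy changed
n = 0;
SumCallByRef(5, 6, ref n);
Console.WriteLine("call by reference {0}", n); // 11: written through the ref parameter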
Pointer();
Console.ReadLine();
}
private static unsafe void SumCallByRefPointer(int a, int b, int* c)
{
*c = a + b;
}
private static unsafe void SumCallByValue(int a, int b, int c)
{
c = a + b;
}
private static unsafe void SumCallByRef(int a, int b, ref int c)
{
c = a + b;
}
public static void Pointer()
{
unsafe
{
Coords coords;
Coords* p = &coords;
p->X = 3;
p->Y = 4;
Console.WriteLine(p->ToString()); // output: (3, 4)
}
}
}
}
What is the memory overhead on the stack and heap of A versus B?
A:
private string TestA()
{
string a = _builder.Build();
return a;
}
B:
private string TestB()
{
return _builder.Build();
}
re the efficiency question; the two are identical, and in release mode will be reduced to the same thing. Either way, string is a reference-type, so the string itself is always on the heap. The only thing on the stack would be the reference to the string - a few bytes (no matter the string length).
"do all local variables go on the stack": no; there are two exceptions:
captured variables (anonymous methods / lambdas)
iterator blocks (yield return etc)
In both cases, there is a compiler generated class behind the scenes:
int i = 1;
Action action = delegate {i++;};
action();
Console.WriteLine(i);
is similar to:
class Foo {
public int i; // yes, a public field
public void SomeMethod() {i++;}
}
...
Foo foo = new Foo();
foo.i = 1;
Action action = foo.SomeMethod;
action();
Console.WriteLine(foo.i);
Hence i is on an object, hence on the heap.
Iterator blocks work in a similar way, but with the state machine.
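As a small sketch of the iterator case (the names IteratorDemo and CountTo are made up for illustration): the local i below has to survive between MoveNext calls, so the compiler hoists it into a field of the generated state-machine class rather than keeping it on the stack.

```csharp
using System;
using System.Collections.Generic;

static class IteratorDemo
{
    public static IEnumerable<int> CountTo(int limit)
    {
        int i = 0;              // looks like a local, but lives on the iterator object
        while (i < limit)
        {
            yield return i;     // execution pauses here; i keeps its value
            i++;
        }
    }

    static void Main()
    {
        using (IEnumerator<int> e = CountTo(3).GetEnumerator())
        {
            e.MoveNext();
            Console.WriteLine(e.Current); // 0
            e.MoveNext();
            Console.WriteLine(e.Current); // 1
        }
    }
}
```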
They both get optimised to the same thing.
In answer to the question in your title, "do all local variables go on the stack", the simple answer is: not exactly. All objects get stored on the heap (the managed heap, in .NET terms) regardless. C# has a generational garbage collector that's aware that some objects only live a very short time and so is designed to manage this efficiently.