Pass an object to a new thread inside a loop [duplicate] - c#

I have an object that contains a very large 3D array of doubles, and I need to start a new thread that needs the data in this array. So I will either need to pass the whole object (which contains a lot of other data too) to the new thread, or pass just the 3D array.
For the first solution I would simply do the following:
class MyClass
{
    ...
    public double[,,] _data = new double[x, y, z];
    ...
}
void MyMethod(object MyObject)
{
    // do stuff with (MyObject as MyClass)
}
MyClass _newObject = new MyClass();
Thread thread = new Thread(new ParameterizedThreadStart(MyMethod));
thread.Start(_newObject);
My question now: when I pass the object _newObject to the new thread, is it passed by reference, or is the object copied and the copy used by the new thread? The object contains around 300 MB of data, so working with copies would be almost impossible, since I need to start around 10 threads that use the data of that object.

By reference.
If you change the data in your thread, it will change the original data you passed in. The same applies in the other direction: if you change the data outside the thread, your thread will see the modified data.
You need proper locking mechanisms so that accesses from multiple threads do not collide.
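A minimal sketch of what "by reference" means here, assuming a hypothetical Holder class standing in for MyClass (with a tiny 1D array instead of the 300 MB 3D array): the worker thread writes through the reference it received, and the caller observes the change in the very same array.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for MyClass; a small array keeps the sketch short.
class Holder
{
    public double[] Data = new double[4];
}

static class ReferenceDemo
{
    // Receives the same reference that was handed to Thread.Start; Data is not copied.
    static void Worker(object state)
    {
        var holder = (Holder)state;
        holder.Data[0] = 42.0;
    }

    // Returns the value the caller sees after the worker thread has finished.
    public static double Run()
    {
        var holder = new Holder();
        var thread = new Thread(new ParameterizedThreadStart(Worker));
        thread.Start(holder);
        thread.Join(); // wait for the worker before reading the shared array
        return holder.Data[0];
    }
}
```

Calling ReferenceDemo.Run() returns 42, the value written by the worker, because both threads work on one and the same Holder instance.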

10 threads? How do you plan to maintain the data integrity of _newObject?
That said, no copies will be passed; only the reference will be used.
If you are going to call MyMethod(object MyObject) in 10 different threads, will MyObject be a different object each time? If not, you are better off refactoring the method.
Also remember that a thread is just a linear sequence of instructions to be executed.
So using multiple threads will not, by itself, increase the memory your object takes up.
The advantage of multithreading is that different threads process your instructions; it does not make copies of objects.
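As a rough sketch of starting the threads inside a loop without copying the data, the hypothetical SumWithThreads helper below lets every thread read its own slice of one shared array and write only to its own result slot. It assumes the array is not modified while the threads run; if threads wrote to shared elements, synchronization would be needed.

```csharp
using System;
using System.Threading;

static class SharedArrayDemo
{
    // Sums one shared array with several threads; the array itself is never copied.
    public static double SumWithThreads(double[] data, int threadCount)
    {
        var partial = new double[threadCount]; // one slot per thread, so no two threads write the same element
        var threads = new Thread[threadCount];
        int chunk = (data.Length + threadCount - 1) / threadCount;

        for (int i = 0; i < threadCount; i++)
        {
            int index = i; // copy the loop variable; the lambdas would otherwise all share i
            threads[i] = new Thread(() =>
            {
                int start = index * chunk;
                int end = Math.Min(start + chunk, data.Length);
                double sum = 0;
                for (int j = start; j < end; j++) sum += data[j]; // read-only access to the shared array
                partial[index] = sum;
            });
            threads[i].Start();
        }

        foreach (var t in threads) t.Join();

        double total = 0;
        foreach (var p in partial) total += p;
        return total;
    }
}
```

The local copy of the loop variable matters: a lambda that captured i directly could observe a later value of i by the time its thread actually runs.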

Assuming that MyClass is a class, only the reference to the object is passed to the new thread, since it is a reference type (read more on reference types on MSDN). I would also suggest that you synchronize access to the shared data in order to avoid race conditions; the simplest way to do that is the lock keyword.

To be more precise, it is passed as a copy of the reference.
Since this is a reference type, only the reference is copied in this case, not the data.
That's why you have to care about locking mechanisms whenever more than one thread accesses the data this object refers to.

Related

What is a practical application of using an immutable type in a thread-safe way that differs from using a mutable type in the same way?

Consider the following code:
class Program
{
    static object locker = new object();
    static string data;
    static void Main(string[] args)
    {
        Task.Factory.StartNew(async () =>
        {
            while (true)
            {
                await Task.Delay(5000);
                string localCopy;
                lock (locker)
                {
                    localCopy = data;
                }
                // do some read operation with localCopy;
                // write to log file, call a web API, etc
                Log(localCopy);
            }
        });
        while (true)
        {
            // data is written to from time to time on the main thread;
            // can be user input, etc.
            string input = Console.ReadLine();
            lock (locker)
            {
                data = input;
            }
        }
    }
}
Since .NET strings are immutable, and one of the benefits of immutability is thread safety, are the lock statements necessary?
EDIT: I chose an immutable type (string in the above example) just for context. I am generally trying to understand the "thread-safe" property of immutable types and whether, based on comments (and my own understanding of things), some sort of lock semantics is still necessary in multi-threaded code when using such types cross-thread.
What is a practical application of using an immutable type in a thread-safe way that differs from using a mutable type in the same way?
As noted in the comments, it's all about the variables.
If you have multiple threads accessing the same variable, then yes, you have to protect the variable in some way (lock, Interlocked, etc).
The benefit of immutable types comes in when you pass that data to another thread - creating another variable. All you need to do is copy the reference from one variable to another, and now the first variable can change however much it wants; the second variable remains immutable.
I think it's a bit easier to understand with an example like ImmutableStack<string>. Let's say there's a "main" thread that pushes and pops that ImmutableStack<string>; since this is immutable, each push/pop updates its own variable. If our "main" thread wants to give another thread a snapshot, it just copies its current variable to another variable for that thread. Then the "main" thread can continue pushing/popping/updating its own variable with impunity. The "secondary" thread has its own immutable snapshot.
In a more general situation, this can be useful with one or more readers/responders, where each "read" loop starts with capturing the current state of the shared variable and using that local copy for the duration of the loop.
If you wanted to snapshot a mutable value, that would require doing a deep clone. Imagine if string was mutable, like it is in other languages. In that case, copying the value (reference) of the string would be insufficient; one thread could change a single character while another thread was trying to do something else with the value. In order to capture a true snapshot of a mutable string value, you'd have to copy the entire string to a new string.
There are other benefits to immutable types in general (design, etc), but this "reference snapshot" benefit is one that specifically benefits multithreading.
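The "reference snapshot" idea can be sketched roughly as follows. SnapshotSource is a hypothetical class, and the sketch assumes System.Collections.Immutable is available (it ships with modern .NET and is a NuGet package on .NET Framework). The variable holding the stack is still protected by a lock, but a snapshot is just a copy of the reference and never changes afterwards.

```csharp
using System;
using System.Collections.Immutable;

// Hypothetical owner of a shared variable that holds an immutable stack.
class SnapshotSource
{
    private readonly object _gate = new object();
    private ImmutableStack<string> _current = ImmutableStack<string>.Empty;

    // Push creates a new stack and replaces the variable; existing snapshots are untouched.
    public void Push(string item)
    {
        lock (_gate) { _current = _current.Push(item); }
    }

    // Copying the reference is all it takes to hand out a stable snapshot.
    public ImmutableStack<string> Snapshot()
    {
        lock (_gate) { return _current; }
    }
}
```

A reader thread that calls Snapshot() once at the top of its loop can then work with the result without holding any lock, no matter how often the owner pushes afterwards. With a mutable Stack<string>, the same guarantee would require copying all elements under the lock.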

Why do you need a lock object for C#?

You can sometimes reuse the object itself for the lock, but quite often it is advised to use a different object anyway.
Don't we have a lot more typesafety and a lot better intention if there would just be a keyword for lock?
private object _MyLock = new object();
// someone would now be able to reassign breaking everything
_MyLock = new object();
lock ( _MyLock )
...
VS
private lock _MyLock;
// compiler error
_MyLock = new object();
lock ( _MyLock )
...
Before I get downvoted because you can't guess someone's intention: I'm pretty sure the language designers had a good reason, and there are more knowledgeable coders here who know why. I ask this to better understand programming principles.
Note that by using an object as a monitor, you can already do all the things you can do with any other object: Pass references so that multiple classes can share the same monitor, keep them in arrays, keep them inside other data structures.
If you had a special type of declaration for a lockable, you would need special syntax for passing it to a function, storing a reference to it inside another instance, associating it with a type instead of an instance, creating arrays (e.g. for LockMany operation), and so on.
Using the language rules that already exist for objects to handle all these common and not-so-common usages makes the language a whole lot simpler.
Don't we have a lot more typesafety and a lot better intention if there would just be a keyword for lock?
It's not about type safety at all. It's about thread safety.
Sometimes that means running the same code in a single lock over and over. Perhaps you have a single large array, where some of your operations might need to swap two elements, and you want to make sure things are synchronized during the swap. In this kind of context, even a simple lock keyword by itself, where the object is created for you behind the scenes, might be good enough.
Sometimes you're sharing an object among very different sets of code. Now you need multiple lock sections that coordinate using a common object. In this case, the code you're talking about seems to make sense. Letting the compiler create a lock object for you isn't good enough because the different lock sections would not coordinate, but you also want to make sure the common lock object is fixed, and doesn't change somehow. For example, maybe you're working through an array with multiple threads, and you have different operations that might modify a shared index value that indicates which element is considered current or active. Each of these operations should lock on the same object.
But sometimes you share multiple object instances (often of the same type) among several sets of code. Think producer/consumer pattern, where multiple consumers from different threads need to coordinate access to a shared queue, and the consumers themselves are multi-threaded. In this kind of case, a single common lock object would be okay to retrieve an element from the queue, but a single shared object in different sections of the consumer could become a bottleneck for the application. Instead, you would only want to lock once per active object/consumer. You need the lock section to accept a variable that indicates which object needs protection, without locking your entire data set.
One solution may be to declare the lock object as readonly:
static readonly object lockObject = new object();
In this case the compiler prevents assigning a new object to lockObject anywhere outside the field initializer or a static constructor.
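A small sketch of the readonly approach, using a hypothetical Counter class: because the lock field is readonly, every lock section is guaranteed to coordinate on the same object for the lifetime of the instance.

```csharp
using System.Threading.Tasks;

// Hypothetical example class; the readonly lock object can never be swapped out.
class Counter
{
    private readonly object _lock = new object();
    private int _value;

    public void Increment()
    {
        lock (_lock) { _value++; }
    }

    public int Value
    {
        get { lock (_lock) { return _value; } }
    }

    // Writing "_lock = new object();" inside any method here would not compile,
    // because a readonly field can only be assigned in its initializer or a constructor.
}

static class CounterDemo
{
    // Increments the counter from many threads and returns the final value.
    public static int Run()
    {
        var counter = new Counter();
        Parallel.For(0, 1000, _ => counter.Increment());
        return counter.Value;
    }
}
```

CounterDemo.Run() returns 1000, since every increment runs under the same lock.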

Multithreading: difference between types of locking objects

Please explain the difference between these two types of locking.
I have a List which I want to access thread-safe:
var tasks = new List<string>();
1.
var locker = new object();
lock (locker)
{
tasks.Add("work 1");
}
2.
lock (tasks)
{
tasks.Add("work 2");
}
My thoughts:
1. Prevents two different threads from running the locked block of code at the same time. But if another thread runs a different method that tries to access tasks, this type of lock won't help.
2. Blocks the List<> instance, so other threads in other methods will be blocked until I unlock tasks.
Am I right or mistaken?
(2) only blocks other code that explicitly calls lock (tasks). Generally, you should only do this if you know that tasks is a private field and thus can enforce throughout your class that lock (tasks) means locking operations on the list. This can be a nice shortcut when the lock is conceptually linked with access to the collection and you don't need to worry about public exposure of the lock. You don't get this 'for free', though; it needs to be explicitly used just like locking on any other object.
They do the same thing. Any other code that tries to modify the list without locking the same object will cause potential race conditions.
A better way might be to encapsulate the list in another object that obtains a lock before doing any operation on the underlying list; any other code can then simply call methods on the wrapper object without worrying about obtaining the lock.
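A minimal sketch of such a wrapper, with a hypothetical SynchronizedTaskList class. The list and the lock object are both private, so no outside code can touch the list without going through the lock.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper; callers never see the underlying list or the lock object.
class SynchronizedTaskList
{
    private readonly object _lock = new object();
    private readonly List<string> _tasks = new List<string>();

    public void Add(string task)
    {
        lock (_lock) { _tasks.Add(task); }
    }

    public int Count
    {
        get { lock (_lock) { return _tasks.Count; } }
    }
}
```

For real code, the built-in concurrent collections in System.Collections.Concurrent may be a better fit than a hand-written wrapper, depending on the access pattern.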

Is list copy thread safe?

Is it safe to use the following pattern in a multithreaded scenario?:
var collection = new List<T>(sharedCollection);
Where sharedCollection can be modified at the same time by another thread (i.e. have elements added or removed from it)?
The scenario I'm currently dealing with is copying the items from a BindingList, but the question should be relative to any standard collection type.
If it isn't thread safe, should I put a lock on the sharedCollection, or are there better solutions?
You seem to have answered your own question(s). No, copying a changing list to another list is not thread-safe, and yes, you could lock on sharedCollection. Note that it's not enough to lock sharedCollection while copying it; you need to lock it anytime you read or change its contents as well.
Edit: just a note about when it's bad to lock on the object you're modifying: if the object reference itself can be changed (like sharedCollection = new List<T>()) or if it can be null, then make a separate object to lock on, as a member of the class where the reading/writing is happening.
You can lock the SyncRoot object of sharedCollection.
Explained here:
Lock vs. ToArray for thread safe foreach access of List collection
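A rough sketch of copying under the SyncRoot lock, with hypothetical helper names. List<T> exposes SyncRoot only through the non-generic ICollection interface, hence the cast. The copy is only safe if every writer takes the same lock, which is why the sketch includes a matching add helper.

```csharp
using System.Collections;
using System.Collections.Generic;

static class CopyDemo
{
    // Takes a snapshot of the list while holding the list's SyncRoot.
    public static List<T> CopyUnderLock<T>(List<T> shared)
    {
        object gate = ((ICollection)shared).SyncRoot;
        lock (gate)
        {
            return new List<T>(shared);
        }
    }

    // Writers must use the same SyncRoot, otherwise the copy above is still unsafe.
    public static void AddUnderLock<T>(List<T> shared, T item)
    {
        lock (((ICollection)shared).SyncRoot)
        {
            shared.Add(item);
        }
    }
}
```

SyncRoot does nothing by itself; it is merely a conventional object that all parties can agree to lock on.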

C# Lock statements

When a thread tries to enter a critical section and obtain a lock, what is it actually doing?
I'm asking this because I usually create an object (of type object) which will serve for locking purposes only.
Consider the following: I want to write a method which accepts a collection, and an object which will serve as the locking object so the whole collection manipulation inside that method will be declared inside the critical section which will be locked by that given object.
Should I pass that locking object using "ref", or is passing a copy of the reference enough? In other words, since the lock statement is used with reference types only, does the mechanism check the referenced object, or does it check the pointer's value? Obviously, when passing an object without "ref", I get a copy of the reference, not the reference itself.
Here's a typical pattern that you can follow for locking. Basically, you can create a locking object that is used to lock access to your critical section (which, as @Hans said, is not protecting the object that you're working on -- it just handles the lock).
class ThreadSafe
{
    static readonly object _locker = new object();
    static int _val1, _val2;
    static void Go()
    {
        lock (_locker)
        {
            if (_val2 != 0) Console.WriteLine (_val1 / _val2);
            _val2 = 0;
        }
    }
}
This example was from Joseph Albahari's online book on threading. It provides an excellent overview of what's going on when you create a lock statement and some tips/tricks on how to best optimize for it. Definitely highly recommended reading.
Per Albahari, again, the lock statement translates in .NET 4 as:
bool lockTaken = false;
try
{
    Monitor.Enter (_locker, ref lockTaken);
    // Do your stuff...
}
finally { if (lockTaken) Monitor.Exit (_locker); }
It's actually safer than a straight Monitor.Enter followed by a Monitor.Exit in your finally block, because the lock can no longer be leaked if an exception is thrown between acquiring it and entering the try block; that is why this overload was added in .NET 4.
It's enough to lock on the object without passing it by ref. What lock actually does is call Monitor.Enter at the beginning of the block and Monitor.Exit on exit.
Hope this helps.
MSDN says here about lock
Use Enter to acquire the Monitor on the object passed as the parameter. If another
thread has executed an Enter on the object but has not yet executed the corresponding Exit,
the current thread will block until the other thread releases the object. It is legal for
the same thread to invoke Enter more than once without it blocking; however, an equal
number of Exit calls must be invoked before other threads waiting on the object will
unblock.
This means it is not about the reference or pointer; it is about the actual object that the reference points to. So you don't need ref: simply passing the reference as a normal argument will work.
Regarding what actually happens inside the lock, see the answer to this question, which says:
"The lock statement is translated by C# to the following:"
var temp = obj;
Monitor.Enter(temp);
try
{
    // body
}
finally
{
    Monitor.Exit(temp);
}
Should I pass that locking object using "ref", or is passing a copy of the reference enough?
Probably neither. If you have some resource that is not thread-safe, the best option usually is to access that resource directly only from one class, which has a lock object as a field (or you can lock directly on the resource). If you pass the lock object to others, it's hard to make sure that the code will still work properly, e.g. locking is done when it should and there are no deadlocks.
But if you really want to pass the lock object, you don't need to use ref, as others have pointed out. Locking is done on the object instance, not on the variable that holds a reference to it.
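To illustrate that the lock belongs to the instance rather than to the variable, here is a small sketch with hypothetical names. The method receives an ordinary copy of the reference (no ref), locks on it, and Monitor.IsEntered confirms that the monitor of the original object is the one being held.

```csharp
using System.Threading;

static class LockIdentityDemo
{
    static readonly object Original = new object();

    // lockObject is a copy of the caller's reference, yet it refers to the same instance.
    static bool HeldOnOriginal(object lockObject)
    {
        lock (lockObject)
        {
            // Monitor.IsEntered reports whether the current thread holds the monitor of the given object.
            return Monitor.IsEntered(Original);
        }
    }

    // True: locking through the copied reference locked the original instance.
    public static bool Run() => HeldOnOriginal(Original);

    // False: outside the lock block the monitor has been released again.
    public static bool HeldOutside() => Monitor.IsEntered(Original);
}
```

Passing the lock object with ref would only matter if the method needed to make the caller's variable point to a different object, which is exactly what you do not want for a lock.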
