I am trying to produce a string of 100000 lines using multiple threads, but when I check the final result string it has only 10000 lines.
Here =>
string result = "";
private void Testing()
{
var threadA = new Thread(() => { result += A()+Environment.NewLine; });
var threadB = new Thread(() => { result += A() + Environment.NewLine; });
var threadC = new Thread(() => { result += A() + Environment.NewLine; });
var threadD = new Thread(() => { result += A() + Environment.NewLine; });
var threadE = new Thread(() => { result += A() + Environment.NewLine; });
var threadF = new Thread(() => { result += A() + Environment.NewLine; });
var threadG = new Thread(() => { result += A() + Environment.NewLine; });
var threadH = new Thread(() => { result += A() + Environment.NewLine; });
var threadI = new Thread(() => { result += A() + Environment.NewLine; });
var threadJ = new Thread(() => { result += A()+Environment.NewLine; });
threadA.Start();
threadB.Start();
threadC.Start();
threadD.Start();
threadE.Start();
threadF.Start();
threadG.Start();
threadH.Start();
threadI.Start();
threadJ.Start();
threadA.Join();
threadB.Join();
threadC.Join();
threadD.Join();
threadE.Join();
threadF.Join();
threadG.Join();
threadH.Join();
threadI.Join();
threadJ.Join();
}
private string A()
{
for (int i = 0; i <= 10000; i++)
{
result += "select * from testing" + Environment.NewLine;
}
return result;
}
But I don't get 100000 lines, I get only 10000. Please let me know why.
Another option is to skip creating threads by hand: a Thread has a lot of overhead, and there are better solutions. Why not just use Parallel.For? It runs on the thread pool, and you can set the degree of parallelism you want.
Also, if you are dealing with threads you need to know how to write thread-safe code. There are many kinds of locking mechanisms, and there are structures built with thread safety in mind: the Thread-Safe Collections. If ordering doesn't matter, you could easily use ConcurrentBag<T>.
Which can shorten your code to
static ConcurrentBag<string> results = new ConcurrentBag<string>();
...
private static void myTest(int count )
{
for (var i = 0; i < 10000; i++) // 10 parallel calls x 10000 lines = 100000 lines
{
results.Add("select * from testing " + i * count);
}
}
Usage
Parallel.For(0, 10, myTest);
var result = string.Join(Environment.NewLine, results);
Anyway, this wasn't intended to be a panacea for your problems or the world's best line-writing threaded masterpiece; it's just to show you that there are lots of resources for threading and many ways to do what you want.
As I've explained in the comments, A() is not thread-safe.
If you visualise result += value; as result = result+ value;, you can see that between a single thread getting the result, and writing it back, it's possible for another thread to get the (now) old value.
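To make the lost update concrete, here is a minimal sketch that replays one possible interleaving step by step on a single thread. The names seenByA and seenByB are made up for illustration; in the real program the same reads and writes happen on two threads with unpredictable timing.

```csharp
using System;

static class LostUpdateDemo
{
    // Replays, on one thread, an interleaving that two threads can produce
    // when each executes "result = result + value" without synchronisation.
    public static string Replay()
    {
        string result = "";

        string seenByA = result;        // thread A reads result ("")
        string seenByB = result;        // thread B reads the same old value ("")
        result = seenByA + "line A;";   // thread A writes back its update
        result = seenByB + "line B;";   // thread B overwrites it; A's line is lost

        return result;
    }

    static void Main()
    {
        Console.WriteLine(Replay());    // prints "line B;"
    }
}
```

Both updates ran, yet only one of them survived, which is how lines go missing from the final string.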
You should build each thread's contribution in a local variable (I've changed this to a StringBuilder since it's more efficient than string concatenation), then take a lock and update the shared result object:
private readonly object _resultLock = new object();
private void A()
{
var lines = new StringBuilder();
for (int i = 0; i < 10000; i++) // exactly 10000 lines per thread (<= would give 10001)
{
lines.AppendLine("select * from testing");
}
lock (_resultLock)
{
result += lines.ToString();
}
}
Since you already have a variable called "result" in the class scope, I've changed A() to a void.
It's best to hold a lock for as short a time as possible, since other threads have to wait to acquire it. The lock object is named _resultLock so that it is clear what the lock is for. You can read more about lock in the docs and on this question.
You might also want to look into tasks: docs, Task vs Thread question.
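As a rough sketch of the task-based route, each task can return its own string so that no shared state (and therefore no lock) is needed. The names TaskLinesDemo, BuildLines and BuildAllAsync are made up for this example, and 10 workers of 10000 lines each are assumed from the question.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

static class TaskLinesDemo
{
    // Builds one worker's contribution in a local StringBuilder.
    static string BuildLines(int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.AppendLine("select * from testing");
        }
        return sb.ToString();
    }

    // Starts one task per worker and concatenates the returned strings.
    public static async Task<string> BuildAllAsync(int workers, int linesPerWorker)
    {
        Task<string>[] tasks = Enumerable.Range(0, workers)
            .Select(_ => Task.Run(() => BuildLines(linesPerWorker)))
            .ToArray();

        string[] parts = await Task.WhenAll(tasks);
        return string.Concat(parts);
    }

    static async Task Main()
    {
        string result = await BuildAllAsync(10, 10_000);
        int lineCount = result
            .Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
            .Length;
        Console.WriteLine(lineCount); // prints 100000
    }
}
```

Because every task only touches its own StringBuilder, the only synchronisation point is Task.WhenAll.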
Related
I have a code base in which multiple threads write to a ConcurrentDictionary, and every 60 seconds another thread runs, clones the main CD, clears it, and continues its work on the cloned CD. Am I going to miss some data if I don't use a lock while cloning and clearing the main CD? The following code demonstrates the problem:
class Program
{
static object lock_obj = new object();
static async Task Main(string[] args)
{
ConcurrentDictionary<string, ThreadSafeLong> cd = new ConcurrentDictionary<string, ThreadSafeLong>();
Func<Task> addData = () =>
{
return Task.Run(async () =>
{
var counter = 1;
while (true)
{
lock (lock_obj)
{
for (int i = 0; i < 100_000; i++)
{
cd.TryAdd($"{counter}:{i}", new ThreadSafeLong(i));
//WriteLine(i);
}
WriteLine($"Round {counter}");
}
counter++;
await Task.Delay(1_000);
}
});
};
Func<Task> writeData = () =>
{
return Task.Run(async () =>
{
while (true)
{
var sw = Stopwatch.StartNew();
lock (lock_obj) // to clone the data, and prevent any other data to be added while clone
{
var cloned = new ConcurrentDictionary<string, ThreadSafeLong>(cd);
cd.Clear();
WriteLine($"Cloned Count: {cloned.Count}");
}
sw.Stop();
WriteLine($"Elapsed Time: {sw.ElapsedMilliseconds}");
await Task.Delay(6_000);
}
});
};
await Task.WhenAll(addData(), writeData());
}
}
PS: This might be related to the question here.
In these cases I would replace the dictionary with a new one instead of calling clear:
lock (lock_obj)
{
var cloned = cd;
cd = new ConcurrentDictionary<string, ThreadSafeLong>();
}
That way the other threads either finish their writes into the old dictionary or are already working with the new one.
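One way to package the swap is sketched below, using Interlocked.Exchange instead of a lock. SnapshotBuffer, Add and TakeSnapshot are hypothetical names, and a plain long stands in for the question's ThreadSafeLong type.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

class SnapshotBuffer
{
    private ConcurrentDictionary<string, long> _current =
        new ConcurrentDictionary<string, long>();

    // Writers always add to whichever dictionary is current at the moment of the call.
    public bool Add(string key, long value) =>
        Volatile.Read(ref _current).TryAdd(key, value);

    // Atomically installs an empty dictionary and hands back the old one.
    public ConcurrentDictionary<string, long> TakeSnapshot() =>
        Interlocked.Exchange(ref _current, new ConcurrentDictionary<string, long>());
}

static class SnapshotDemo
{
    static void Main()
    {
        var buffer = new SnapshotBuffer();
        buffer.Add("1:0", 0);
        buffer.Add("1:1", 1);

        var snapshot = buffer.TakeSnapshot();
        buffer.Add("2:0", 0);

        Console.WriteLine(snapshot.Count);              // prints 2
        Console.WriteLine(buffer.TakeSnapshot().Count); // prints 1
    }
}
```

One caveat with the lock-free variant: a writer that fetched the old reference just before the swap can still add an entry to the snapshot after it has been handed out. No entry is lost, but a consumer that enumerates the snapshot immediately may not see it. If that matters, keep the lock around both the writes and the swap, as in the question.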
Here is a demo app I prepared.
using System;
using System.Collections.Concurrent;
using System.Reactive.Linq;
using System.Threading.Tasks;
class Program
{
static void Main(string[] args)
{
var stored = new ConcurrentQueue<long>();
Observable.Interval(TimeSpan.FromMilliseconds(20))
.Subscribe(it => stored.Enqueue(it));
var random = new Random();
Task.Run(async () =>
{
while (true)
{
await Task.Delay((int)(random.NextDouble() * 1000));
var currBatch = stored.ToArray();
for (int i = 0; i < currBatch.Length; i++)
{
long res;
stored.TryDequeue(out res);
}
Console.WriteLine("[" + string.Join(",", currBatch) + "]");
}
});
Console.ReadLine();
}
}
It simulates an independent consumer that fires at random time intervals. In the real app the events would come from the file system and might be bursty.
It stores an unbounded number of events in a concurrent queue until the consumer decides to consume the gathered events.
I have a strong feeling that this code is unsafe. Is it possible to reproduce this behaviour in a purely Rx manner?
If not, can you suggest a better / safer approach?
Here you go:
var producer = Observable.Interval(TimeSpan.FromMilliseconds(20));
var random = new Random();
Task.Run(async () =>
{
var notify = new Subject<int>();
producer.Window(() => notify)
.SelectMany(ev => ev.ToList())
.Subscribe(currBatch => Console.WriteLine("[" + string.Join(",", currBatch) + "]"));
while (true)
{
await Task.Delay((int)(random.NextDouble() * 1000));
notify.OnNext(1);
}
});
Console.ReadLine();
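If taking a dependency on Rx is not wanted, the batching step from the question can also be written without the ToArray-then-dequeue pair. This is only a sketch: Drain is a hypothetical helper, and a single consumer is assumed, as in the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

static class QueueBatching
{
    // Removes and returns everything that is in the queue right now.
    // Items enqueued while draining are either picked up here or left for the next batch.
    public static List<T> Drain<T>(ConcurrentQueue<T> queue)
    {
        var batch = new List<T>();
        while (queue.TryDequeue(out T item))
        {
            batch.Add(item);
        }
        return batch;
    }

    static void Main()
    {
        var stored = new ConcurrentQueue<long>();
        stored.Enqueue(1);
        stored.Enqueue(2);
        stored.Enqueue(3);

        List<long> batch = Drain(stored);
        Console.WriteLine("[" + string.Join(",", batch) + "]"); // prints [1,2,3]
        Console.WriteLine(stored.IsEmpty);                      // prints True
    }
}
```

Each item is removed exactly once by TryDequeue, so the batch never depends on a separate snapshot staying in step with the queue.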
Thread[] threads = new Thread[12];
int temp;
for (int i = 0; i < threads.Length - 1; i++)
{
temp = i;
threads[temp] = new Thread(new ThreadStart(()=> test(test1[temp],"start", temp)));
threads[temp].Start();
//threads[temp].Join();
}
for(int i=0; i<threads.Length-1; i++)
{
threads[i].Join();
}
//Need to capture the response returned from the method ("test") executed in each thread.
You could use a Task<T> (if you're on .NET 4+), which has a return value. You could also use events to get notified when the thread is done with doing whatever it does and get the returned value that way.
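A minimal sketch of the Task<T> approach follows. The Test method here is a hypothetical stand-in for the question's test method (its real signature and return type are not shown in the question), and RunAll is a made-up helper name.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class TaskResultsDemo
{
    // Hypothetical stand-in for the question's test(...) method.
    static string Test(string input, string mode, int index) =>
        $"{mode}:{input}:{index}";

    // Starts one task per input and returns the results in input order.
    public static string[] RunAll(string[] inputs)
    {
        Task<string>[] tasks = inputs
            .Select((input, index) => Task.Run(() => Test(input, "start", index)))
            .ToArray();

        Task.WaitAll(tasks);                          // plays the role of the Join loop
        return tasks.Select(t => t.Result).ToArray(); // each task carries its own return value
    }

    static void Main()
    {
        foreach (string r in RunAll(new[] { "a", "b", "c" }))
        {
            Console.WriteLine(r); // start:a:0, start:b:1, start:c:2
        }
    }
}
```

Because index is a lambda parameter of Select, every task gets its own copy, which avoids the shared temp variable that the loop in the question captures.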
I would use Microsoft's Reactive Framework for this. NuGet "Rx-Main" (since renamed to "System.Reactive").
var query =
Observable
.Range(0, 12)
.SelectMany(n => Observable
.Start(() => new
{
n,
r = test(test1[n], "start", n)
}))
.ToArray()
.Select(xs => xs
.OrderBy(x => x.n)
.Select(x => x.r)
.ToArray());
query.Subscribe(rs =>
{
/* do something with the results */
});
You could start the thread using another constructor overload, one that lets you pass an object to the thread when you start it. The thread would then save its result in a field of that object. After the call to Join, the main thread could retrieve the results from all those objects. You could have an array of 12 objects, each passed to one thread, or an array of 12 classes, each encapsulating one thread and the corresponding object that wraps the result:
public class ThreadResult
{
public int Result {get; set;}
}
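Here is a sketch of how such a result object might be wired up with the Thread constructor that takes a ParameterizedThreadStart. The Input property and the squaring in Work are assumptions added so the example has something to compute; they are not part of the class above.

```csharp
using System;
using System.Threading;

public class ThreadResult
{
    public int Input { get; set; }   // assumption: added for this example
    public int Result { get; set; }
}

static class ThreadResultDemo
{
    // Hypothetical work: squares the input and stores it in the object passed to Start.
    static void Work(object state)
    {
        var slot = (ThreadResult)state;
        slot.Result = slot.Input * slot.Input;
    }

    public static int[] RunAll(int count)
    {
        var slots = new ThreadResult[count];
        var threads = new Thread[count];

        for (int i = 0; i < count; i++)
        {
            slots[i] = new ThreadResult { Input = i };
            threads[i] = new Thread(Work);  // binds to ParameterizedThreadStart
            threads[i].Start(slots[i]);     // each thread gets its own result object
        }

        foreach (Thread t in threads)
        {
            t.Join();                       // after Join the results are safe to read
        }

        return Array.ConvertAll(slots, s => s.Result);
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", RunAll(4))); // prints 0,1,4,9
    }
}
```

Since every thread writes only to its own object, no lock is needed.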
However, today you have better choices than raw threads. Take a look at TPL (Task Parallel Library) and async / await in C#.
You can also use shared state. In that case you have to lock every access to the shared objects inside the thread method:
Thread[] threads = new Thread[12];
List<string> results = new List<string>();
for (int i = 0; i < threads.Length; i++)
{
threads[i] = new Thread(() =>
{
// msg is local to each thread, so it needs no lock of its own.
string msg = "Hello from Thread " + Thread.CurrentThread.ManagedThreadId;
// Only the shared list needs protecting.
lock (results)
{
results.Add(msg);
}
});
threads[i].Start();
}
for (int i = 0; i < threads.Length; i++)
{
threads[i].Join();
}
Question: Why does using a WriteOnceBlock (or BufferBlock) to get the answer back (as a sort of callback) from another BufferBlock<Action> cause a deadlock in this code? (Getting the answer back happens in the posted Action.)
I thought that methods in a class can be considered messages that we send to the object (the original point of view on OOP that was proposed by, I think, Alan Kay). So I wrote this generic Actor class that helps to convert an ordinary object into an Actor. (Of course there are lots of unseen loopholes here because of mutability and so on, but that's not the main concern here.)
So we have these definitions:
public class Actor<T>
{
private readonly T _processor;
private readonly BufferBlock<Action<T>> _messageBox = new BufferBlock<Action<T>>();
public Actor(T processor)
{
_processor = processor;
Run();
}
public event Action<T> Send
{
add { _messageBox.Post(value); }
remove { }
}
private async void Run()
{
while (true)
{
var action = await _messageBox.ReceiveAsync();
action(_processor);
}
}
}
public interface IIdGenerator
{
long Next();
}
Now, why does this code work:
static void Main(string[] args)
{
var idGenerator1 = new IdInt64();
var idServer1 = new Actor<IIdGenerator>(idGenerator1);
const int n = 1000;
for (var i = 0; i < n; i++)
{
var t = new Task(() =>
{
var answer = new WriteOnceBlock<long>(null);
Action<IIdGenerator> action = x =>
{
var buffer = x.Next();
answer.Post(buffer);
};
idServer1.Send += action;
Trace.WriteLine(answer.Receive());
}, TaskCreationOptions.LongRunning); // Runs on a separate new thread
t.Start();
}
Console.WriteLine("press any key you like! :)");
Console.ReadKey();
Trace.Flush();
}
And this code does not work:
static void Main(string[] args)
{
var idGenerator1 = new IdInt64();
var idServer1 = new Actor<IIdGenerator>(idGenerator1);
const int n = 1000;
for (var i = 0; i < n; i++)
{
var t = new Task(() =>
{
var answer = new WriteOnceBlock<long>(null);
Action<IIdGenerator> action = x =>
{
var buffer = x.Next();
answer.Post(buffer);
};
idServer1.Send += action;
Trace.WriteLine(answer.Receive());
}, TaskCreationOptions.PreferFairness); // Runs and is managed by Task Scheduler
t.Start();
}
Console.WriteLine("press any key you like! :)");
Console.ReadKey();
Trace.Flush();
}
The only difference is the TaskCreationOptions used to create the tasks. Maybe I have misunderstood some TPL Dataflow concepts here, as I have only just started to use it (a [ThreadStatic] hidden somewhere?).
The problem in your code is this part: answer.Receive().
When you move it inside the action, the deadlock doesn't happen:
var t = new Task(() =>
{
var answer = new WriteOnceBlock<long>(null);
Action<IIdGenerator> action = x =>
{
var buffer = x.Next();
answer.Post(buffer);
Trace.WriteLine(answer.Receive());
};
idServer1.Send += action;
});
t.Start();
So why is that? answer.Receive(), as opposed to await answer.ReceiveAsync(), blocks the thread until an answer is returned. When you use TaskCreationOptions.LongRunning each task gets its own thread, so there's no problem. Without it (TaskCreationOptions.PreferFairness is irrelevant here) all the thread pool threads are busy waiting, and the actor's Run loop, which also needs a pool thread to process each message, has to wait for the pool to grow, so everything is much slower. It doesn't actually deadlock, as you can see when you use 15 instead of 1000.
There are other solutions that help understand the problem:
Increasing the thread pool with ThreadPool.SetMinThreads(1000, 0); before the original code.
Using ReceiveAsync:
Task.Run(async () =>
{
var answer = new WriteOnceBlock<long>(null);
Action<IIdGenerator> action = x =>
{
var buffer = x.Next();
answer.Post(buffer);
};
idServer1.Send += action;
Trace.WriteLine(await answer.ReceiveAsync());
});
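The same request/reply idea can be sketched without the Dataflow package, which may make it easier to see why awaiting helps. This is an illustration rather than a drop-in replacement for the Actor class: it assumes .NET Core 3.0 or later (for System.Threading.Channels and await foreach), it uses a TaskCompletionSource as the reply slot instead of a WriteOnceBlock, and AwaitReplyDemo and RequestIdsAsync are made-up names.

```csharp
using System;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

static class AwaitReplyDemo
{
    public static async Task<long[]> RequestIdsAsync(int n)
    {
        var mailbox = Channel.CreateUnbounded<TaskCompletionSource<long>>();

        // Single consumer standing in for the actor loop: hands out ids one at a time.
        Task server = Task.Run(async () =>
        {
            long next = 0;
            await foreach (var request in mailbox.Reader.ReadAllAsync())
            {
                request.SetResult(++next);
            }
        });

        // Each client posts a request and awaits the reply instead of blocking on it.
        Task<long>[] clients = Enumerable.Range(0, n).Select(_ => Task.Run(async () =>
        {
            var reply = new TaskCompletionSource<long>(
                TaskCreationOptions.RunContinuationsAsynchronously);
            mailbox.Writer.TryWrite(reply);
            return await reply.Task; // the pool thread is released while waiting
        })).ToArray();

        long[] ids = await Task.WhenAll(clients);
        mailbox.Writer.Complete();   // lets the server loop finish
        await server;
        return ids;
    }

    static async Task Main()
    {
        long[] ids = await RequestIdsAsync(1000);
        Console.WriteLine(ids.Distinct().Count()); // prints 1000
    }
}
```

With 1000 clients, no pool thread sits blocked waiting for a reply, so the server loop always has a thread available to run on.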
This is the first time I'm attempting multiple threads in a project, so bear with me. The idea is this: I have a bunch of documents I need converted to PDF. I am using iTextSharp to do the conversion for me. When run sequentially, the program works fine but is slow.
I have a list of items that need to be converted. I take that list and split it into 2 lists.
for (int i = 0; i < essaylist.Count / 2; i++)
{
frontessay.Add(essaylist[i]);
try
{
backessay.Add(essaylist[essaylist.Count - i]);
}
catch(Exception e)
{
}
}
if (essaylist.Count > 1)
{
var essay1 = new Essay();
Thread t1 = new Thread(() => essay1.StartThread(frontessay));
Thread t2 = new Thread(() => essay1.StartThread(backessay));
t1.Start();
t2.Start();
t1.Join();
t2.Join();
}
else
{
var essay1 = new Essay();
essay1.GenerateEssays(essaylist[1]);
}
I then create 2 threads that run this code
public void StartThread(List<Essay> essaylist)
{
var essay = new Essay();
List<System.Threading.Tasks.Task> tasklist = new List<System.Threading.Tasks.Task>();
int threadcount = 7;
Boolean threadcomplete = false;
int counter = 0;
for (int i = 0; i < essaylist.Count; i++)
{
essay = essaylist[i];
var task1 = System.Threading.Tasks.Task.Factory.StartNew(() => essay.GenerateEssays(essay));
tasklist.Add(task1);
counter++;
if (tasklist.Count % threadcount == 0)
{
tasklist.ForEach(t => t.Wait());
//counter = 0;
tasklist = new List<System.Threading.Tasks.Task>();
threadcomplete = true;
}
Thread.Sleep(100);
}
tasklist.ForEach(t => t.Wait());
Thread.Sleep(100);
}
For the majority of the files, the code runs as it should. However, suppose I have 155 items that need to be converted. When the program finishes and I look at the results, I have 149 items instead of 155. It seems like the result is something like total = list - threadcount, where threadcount is 7 in this case. Any ideas on how to correct this? Am I even doing threads/tasks correctly?
Also the essay.GenerateEssays code is the actual itextsharp that converts the info from the db to the actual pdf.
How about using the TPL? It seems that all of your code can be replaced with this:
Parallel.ForEach(essaylist, essay =>
{
YourAction(essay);
});
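As for where the missing items probably go: two things in the question's code can each lose work. First, essaylist[essaylist.Count - i] is out of range when i is 0, and the empty catch swallows the exception, so the back half is shifted by one. With 155 items the elements at indexes 77 and 78 never land in either list. Second, the lambda passed to StartNew captures the shared essay variable, so a task that starts late can see a later essay than the one it was created for. A sketch that avoids both, with the question's cap of 7 concurrent conversions, might look like this. ConvertAll is a made-up name, the convert delegate stands in for GenerateEssays, and the conversion itself is assumed to be safe to call concurrently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ParallelConvertDemo
{
    // Runs convert once per document, at most 7 at a time, and returns how many ran.
    public static int ConvertAll(IReadOnlyList<string> documents, Action<string> convert)
    {
        int converted = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = 7 };

        Parallel.ForEach(documents, options, document =>
        {
            convert(document);                    // 'document' is private to this iteration
            Interlocked.Increment(ref converted); // thread-safe counter
        });

        return converted;
    }

    static void Main()
    {
        List<string> docs = Enumerable.Range(1, 155).Select(i => "essay" + i).ToList();
        Console.WriteLine(ConvertAll(docs, _ => { })); // prints 155
    }
}
```

There is no manual splitting of the list and no shared loop variable, so every item is visited exactly once.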