I have a loop creating three tasks:
List<Task> tasks = new List<Task>();
foreach (DSDevice device in validdevices)
{
var task = Task.Run(() =>
{
var conf = PrepareModasConfig(device, alternativconfig);
//CHECK-Point1
string config = ModasDicToConfig(conf);
//CHECK-Point2
if (config != null)
{
//Do Stuff
}
else
{
//Do other Stuff
}
});
tasks.Add(task);
}
Task.WaitAll(tasks.ToArray());
It calls this method, which overwrites some values in a dictionary holding a default config:
private Dictionary<string, Dictionary<string, string>> PrepareModasConfig(DSDevice device, string alternativeconfig)
{
try
{
Dictionary<string, Dictionary<string, string>> config = new Dictionary<string, Dictionary<string, string>>(Project.project.ModasConfig.Config);
if (config.ContainsKey("[Main]"))
{
if (config["[Main]"].ContainsKey("DevName"))
{
config["[Main]"]["DevName"] = device.ID;
}
}
return config;
}
catch
{
return null;
}
}
After that, the dictionary is converted into a string with this method:
private string ModasDicToConfig(Dictionary<string, Dictionary<string, string>> dic)
{
string s = string.Empty;
try
{
foreach (string key in dic.Keys)
{
s = s + key + "\n";
foreach (string k in dic[key].Keys)
{
s = s + k + "=" + dic[key][k] + "\n";
}
s = s + "\n";
}
return s;
}
catch
{
return null;
}
}
But every task gets the exact same string back.
At //CHECK-Point1 I check the dictionary for the changed value: it is correct for each task.
At //CHECK-Point2 I check the string: it is the same for all 3 tasks (it should of course be different).
Default-Dictionary looks like this: (shortened)
{
{"[Main]",
{"DevName", "Default"},
...
},
...
}
The resulting string looks like this:
[Main]
DevName=003 <--This should be different (from Device.ID)
...
[...]
EDIT:
I moved the method calls outside the task. Now I get the correct results. So I guess it has something to do with the task?
List<Task> tasks = new List<Task>();
foreach (DSDevice device in validdevices)
{
var conf = PrepareModasConfig(device, alternativconfig);
//CHECK-Point1
string config = ModasDicToConfig(conf);
//CHECK-Point2
var task = Task.Run(() =>
{
if (config != null)
{
//Do Stuff
}
else
{
//Do other Stuff
}
});
tasks.Add(task);
}
Task.WaitAll(tasks.ToArray());
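The problem isn't caused by tasks. The lambda passed to Task.Run captures the loop variable device itself, not the value it had when the lambda was created. If that variable is shared between iterations, every task sees whatever it holds when the task actually runs. (For foreach this applies to compilers older than C# 5; from C# 5 on, foreach gives each iteration a fresh variable, while for loops still share one.) The same problem occurs even without tasks, as this SO question shows. The following code prints the value 10 ten times: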
List<Action> actions = new List<Action>();
for (int i = 0; i < 10; ++i )
actions.Add(()=>Console.WriteLine(i));
foreach (Action a in actions)
a();
------
10
10
10
10
10
10
10
10
10
10
If the question's code used an Action without Task.Run, it would still produce wrong results.
One way to fix this is to copy the loop variable into a local variable and use only that copy in the lambda:
for (int i = 0; i < 10; ++i )
{
var ii=i;
actions.Add(()=>Console.WriteLine(ii));
}
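To make the difference concrete, here is a small self-contained sketch that returns the captured values instead of printing them. The class and method names (CaptureDemo, SharedVariable, LocalCopy) are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;

public static class CaptureDemo
{
    // Every lambda captures the single shared 'for' variable, so each one
    // returns the value it holds after the loop has finished.
    public static List<int> SharedVariable(int count)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < count; ++i)
            funcs.Add(() => i);
        return funcs.ConvertAll(f => f());
    }

    // Each iteration declares its own local, so each lambda captures a
    // different variable.
    public static List<int> LocalCopy(int count)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < count; ++i)
        {
            int copy = i;
            funcs.Add(() => copy);
        }
        return funcs.ConvertAll(f => f());
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SharedVariable(3))); // prints 3,3,3
        Console.WriteLine(string.Join(",", LocalCopy(3)));      // prints 0,1,2
    }
}
```

The lambdas run only after the loop has ended, which is the same situation as a task that starts executing after the loop has moved on.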
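The question's code can be fixed by copying the loop variable into a local variable inside the loop:
foreach (DSDevice dev in validdevices)
{
var device=dev; // per-iteration copy captured by the lambda
var task = Task.Run(() =>
{
var conf = PrepareModasConfig(device, alternativconfig);
// ... rest of the lambda body unchanged
});
tasks.Add(task);
}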
Another way is to use Parallel.ForEach to process all items in parallel, using all available cores, without creating tasks explicitly:
Parallel.ForEach(validdevices,device=>{
var conf = PrepareModasConfig(device, alternativconfig);
string config = ModasDicToConfig(conf);
...
});
Parallel.ForEach allows limiting the number of worker tasks through the MaxDegreeOfParallelism option. It's a blocking call because it uses the current thread to process data along with any worker tasks.
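As a rough sketch of the MaxDegreeOfParallelism option, the following uses a hypothetical BuildConfig method as a stand-in for PrepareModasConfig plus ModasDicToConfig, and plain strings instead of DSDevice. Results are collected in a ConcurrentBag because the loop body runs on several threads.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelDemo
{
    // Hypothetical stand-in for PrepareModasConfig + ModasDicToConfig.
    public static string BuildConfig(string deviceId)
    {
        return "[Main]\nDevName=" + deviceId + "\n";
    }

    public static List<string> BuildAll(IEnumerable<string> deviceIds, int maxWorkers)
    {
        var results = new ConcurrentBag<string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxWorkers };

        // Blocks until every item has been processed.
        Parallel.ForEach(deviceIds, options, id => results.Add(BuildConfig(id)));

        // ConcurrentBag has no defined order, so sort for a stable result.
        return results.OrderBy(s => s, StringComparer.Ordinal).ToList();
    }

    public static void Main()
    {
        foreach (var config in BuildAll(new[] { "001", "002", "003" }, 2))
            Console.Write(config);
    }
}
```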
Related
I have an application that recursively walks a very large (6 TB) folder. To speed things up, I create a new thread for each recursion. At one point my thread count was in excess of 12,000. As the job gets closer to completion, my own thread count drops, but the thread count in Task Manager keeps climbing. I think that indicates that the threads are not being garbage collected when they finish.
At one point, my internal thread count showed 5575 threads while the Windows resource monitor showed the task using 33,023 threads.
static void Main(string[] args)
{
string folderName = Properties.Settings.Default.rootFolder;
ParameterizedThreadStart needleThreader = new ParameterizedThreadStart(needle);
Thread eye = new Thread(needleThreader);
threadcount = 1;
eye.Start(folderName);
}
static void needle(object objFolderName)
{
string folderName = (string)objFolderName;
FolderData folderData = getFolderData(folderName);
addToDB(folderData);
//since the above statement gets executed (my database table
//gets populated), I think the thread should get garbage collected
//here, but the windows thread count keeps climbing.
}
// recursive routine to walk directory structure and create annotated treeview
private static FolderData getFolderData(string folderName)
{
//Console.WriteLine(folderName);
long folderSize = 0;
string[] directories = new string[] { };
string[] files = new string[] { };
try
{
directories = Directory.GetDirectories(folderName);
}
catch { };
try
{
files = Directory.GetFiles(folderName);
}
catch { }
for (int f = 0; f < files.Length; f++)
{
try
{
folderSize += new FileInfo(files[f]).Length;
}
catch { } //cannot access file so skip;
}
FolderData folderData = new FolderData(folderName, directories.Length, files.Length, folderSize);
List<String> directoryList = directories.ToList<String>();
directoryList.Sort();
for (int d = 0; d < directoryList.Count; d++)
{
Console.Write(" " + threadcount + " ");
//threadcount is my internal counter. it increments here
//where i start a new thread and decrements when the thread ends
//see below
threadcount++;
ParameterizedThreadStart needleThreader = new ParameterizedThreadStart(needle);
Thread eye = new Thread(needleThreader);
eye.Start(directoryList[d]);
}
//thread is finished, so decrement
threadcount--;
return folderData;
}
Thanks to matt-dot-net's suggestion I spent a few hours researching the TPL (Task Parallel Library), and it was well worth it.
Here is my new code. It works blazingly fast, does not peg the CPU (it uses 41%, which is a lot but still plays nice in the sandbox), uses only about 160 MB of memory (instead of nearly all of the 4 GB available) and uses a maximum of about 70 threads.
You'd almost think I knew what I was doing. But the .NET TPL handles all the hard stuff, like determining the correct number of threads and making sure they clean up after themselves.
class Program
{
static object padlock = new object();
static void Main(string[] args)
{
OracleConnection ora = new OracleConnection(Properties.Settings.Default.ora);
ora.Open();
new OracleCommand("DELETE FROM SCRPT_APP.S_DRIVE_FOLDERS", ora).ExecuteNonQuery();
ora.Close();
string folderName = Properties.Settings.Default.rootFolder;
Task processRoot = new Task((value) =>
{
getFolderData(value);
}, folderName);
//wait is like join; it waits for this asynchronous task to finish.
processRoot.Start();
processRoot.Wait();
}
// recursive routine to walk directory structure and create annotated treeview
private static void getFolderData(object objFolderName)
{
string folderName = (string)objFolderName;
Console.WriteLine(folderName);
long folderSize = 0;
string[] directories = new string[] { };
string[] files = new string[] { };
try
{
directories = Directory.GetDirectories(folderName);
}
catch { };
try
{
files = Directory.GetFiles(folderName);
}
catch { }
for (int f = 0; f < files.Length; f++)
{
try
{
folderSize += new FileInfo(files[f]).Length;
}
catch { } //cannot access file so skip;
}
FolderData folderData = new FolderData(folderName, directories.Length, files.Length, folderSize);
List<String> directoryList = directories.ToList<String>();
directoryList.Sort();
//create a task for each subdirectory
List<Task> dirTasks = new List<Task>();
for (int d = 0; d < directoryList.Count; d++)
{
dirTasks.Add(new Task((value) =>
{
getFolderData(value);
}, directoryList[d]));
}
//start all tasks
foreach (Task task in dirTasks)
{
task.Start();
}
//wait for them to finish
Task.WaitAll(dirTasks.ToArray());
addToDB(folderData);
}
private static void addToDB(FolderData folderData)
{
lock (padlock)
{
OracleConnection ora = new OracleConnection(Properties.Settings.Default.ora);
ora.Open();
OracleCommand addFolderData = new OracleCommand(
"INSERT INTO FOLDERS " +
"(PATH, FOLDERS, FILES, SPACE_USED) " +
"VALUES " +
"(:PATH, :FOLDERS, :FILES, :SPACE_USED) ",
ora);
addFolderData.BindByName = true;
addFolderData.Parameters.Add(":PATH", OracleDbType.Varchar2);
addFolderData.Parameters.Add(":FOLDERS", OracleDbType.Int32);
addFolderData.Parameters.Add(":FILES", OracleDbType.Int32);
addFolderData.Parameters.Add(":SPACE_USED", OracleDbType.Int64);
addFolderData.Prepare();
addFolderData.Parameters[":PATH"].Value = folderData.FolderName;
addFolderData.Parameters[":FOLDERS"].Value = folderData.FolderCount;
addFolderData.Parameters[":FILES"].Value = folderData.FileCount;
addFolderData.Parameters[":SPACE_USED"].Value = folderData.Size;
addFolderData.ExecuteNonQuery();
ora.Close();
}
}
}
So I'm pulling in a list of items, and for each item I'm creating an instance of an object to run a task on that item. All the objects are the same; they update based on a message received every three seconds. The updates do not all occur at once, though; sometimes one takes 3.1 seconds, etc. This is data I need to serialize to XML once it all exists, so I'm looking for a way to see when it's all done.
I've explored tasks in .NET 4.6, but a task is started, reports completion, and then has to be started again to run another time. That won't work in my case, because each instance stays alive and re-runs itself whenever a new message comes in.
What is the best way to have each instance report that it has reached the last line of code, and then check the list of instances so that, once all of them show as complete, a task runs to serialize the data?
I've included code below of the instance that is running.
private void OnMessageReceived(object sender, MessageReceivedEventArgs e)
{
var eventArgs = new CallDataReceivedEventArgs();
this.OnCallDataReceived(eventArgs);
try
{
List<Tuple<String, TimeSpan>> availInItems = new List<Tuple<string, TimeSpan>>();
List<Tuple<string, int, TimeSpan, string, string, string>> agentlist = new List<Tuple<string, int, TimeSpan, string, string, string>>();
if (e == null)
{
return;
}
List<TimeSpan> listOfTimeSpans = new List<TimeSpan>();
if (e.CmsData != null)
{
#region Gathering Agent Information
// Create a list of all timespans for all _agents in a queue using the property AgentTimeInState
foreach (var item in e.CmsData.Agents)
{
//AgentData = new ScoreBoardAgentDataModel(AgentName, AgentExtension, AgentTimeInState, AgentAuxReason, AgentId, AgentAdcState);
_agentData.AgentName = item.AgName;
_agentData.AgentExtension = item.Extension;
_agentData.AgentAuxReason = item.AuxReasonDescription;
_agentData.AgentId = item.LoginId;
_agentData.AgentAcdState = item.WorkModeDirectionDescription;
_agentData.AgentTimeInState = DateTime.Now - item.DateTimeUpdated;
_agentData.TimeSubmitted = DateTime.Now;
agentlist.Add(Tuple.Create(_agentData.AgentName, _agentData.AgentExtension, _agentData.AgentTimeInState, _agentData.AgentId, _agentData.AgentAcdState, _agentData.AgentAuxReason));
if (_agentData.AgentAcdState == "AVAIL")
{
listOfTimeSpans.Add(_agentData.AgentTimeInState);
availInItems.Add(Tuple.Create(_agentData.AgentName, _agentData.AgentTimeInState));
}
availInItems.Sort((t1, t2) => t1.Item2.CompareTo(t2.Item2));
}
var availInAgents =
agentlist
.Where(ag => ag.Item5 == "AVAIL")
.ToList();
availInAgents.Sort((t1, t2) =>
t1.Item3.CompareTo(t2.Item3));
var max3 = availInAgents.Skip(availInAgents.Count - 3);
max3.Reverse();
_agents.AgentsOnBreak = 0;
foreach (var agent in agentlist)
{
if (!string.IsNullOrEmpty(agent.Item6) && agent.Item6.StartsWith("Break"))
{
_agents.AgentsOnBreak++;
}
}
_agents.AgentsOnLunch = 0;
foreach (var agent in agentlist)
{
//If the current agent's aux reason is Lunch
if (!string.IsNullOrEmpty(agent.Item6) && agent.Item6.StartsWith("Lunch"))
{
//add one to agentsonlunch
_agents.AgentsOnLunch++;
}
}
_agents.NextInLine = string.Empty;
foreach (var agent in max3.Reverse())
{
//assign agent to NextInLine and start a new line
_agents.NextInLine += agent.Item1 + Environment.NewLine;
//reverse NextInLine
_agents.NextInLine.Reverse();
}
_agents.TimeSubmitted = DateTime.Now;
#endregion
#region Gathering Skill Information
_skillData.OldestCall = e.CmsData.Skill.OldestCall;
_skillData.AgentsStaffed = e.CmsData.Skill.AgentsStaffed;
_skillData.AgentsAuxed = e.CmsData.Skill.AgentsInAux;
_skillData.AgentsAvailable = e.CmsData.Skill.AgentsAvailable;
_skillData.AgentsOnCalls = e.CmsData.Skill.AgentsOnAcdCall;
_skillData.CallsWaitingInQueue = e.CmsData.Skill.InQueueInRing;
_skillData.Asa = e.CmsData.Skill.AnswerTimePerAcdCall;
_skillData.TimeSubmitted = DateTime.Now;
_skillData.EstimatedHoldTimeLow = e.CmsData.Skill.ExpectedWaitTimeLow;
_skillData.EstimatedHoldTimeMedium = e.CmsData.Skill.ExpectedWaitTimeMedium;
_skillData.EstimatedHoldTimeHigh = e.CmsData.Skill.ExpectedWaitTimeHigh;
#endregion
}
}
catch (Exception ex)
{
_logger.Info(ex.Message, ex);
}
}
With tasks you can start many at the same time and wait for them all to finish like this:
var taskList = new List<Task>();
foreach (var thingToDo in work)
{
taskList.Add(thingToDo.StartTask());
}
Task.WaitAll(taskList.ToArray());
This way you can run everything in parallel, and execution won't get past the last line until everything is done.
Edit following your comment
You can embed your work in a task with this:
public async Task DoWork()
{
var taskList = new List<Task>();
foreach (var thingToDo in work)
{
taskList.Add(thingToDo.StartTask());
}
await Task.WhenAll(taskList.ToArray());
}
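A minimal runnable sketch of the same idea, assuming each unit of work can be expressed as a method returning a Task. ProcessAsync and RunCycleAsync are hypothetical names, and Task.Delay stands in for the real per-message work.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical unit of work: pretends to process one message.
    public static async Task<int> ProcessAsync(int id)
    {
        await Task.Delay(10 * id);
        return id * id;
    }

    // Starts all workers, then completes once every one of them is done.
    // Task.WhenAll returns the results in the order the tasks were passed in.
    public static async Task<int[]> RunCycleAsync(int workerCount)
    {
        var tasks = Enumerable.Range(1, workerCount)
                              .Select(id => ProcessAsync(id))
                              .ToArray();
        return await Task.WhenAll(tasks);
    }

    public static void Main()
    {
        int[] results = RunCycleAsync(3).GetAwaiter().GetResult();
        Console.WriteLine(string.Join(",", results)); // prints 1,4,9
        // All results exist at this point, so serialization could start here.
    }
}
```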
I would like to ask for your help in implementing multi-threading in my C# program.
The program uploads more than 10,000 files to an FTP server. I am planning to use at least 10 threads to speed up the process.
This is the code I have so far.
I have initialized 10 threads:
public ThreadStart[] threadstart = new ThreadStart[10];
public Thread[] thread = new Thread[10];
My plan is to assign one file to a thread, as follows:
file 1 > thread 1
file 2 > thread 2
file 3 > thread 3
.
.
.
file 10 > thread 10
file 11 > thread 1
.
.
.
And so I have the following:
foreach (string file in files)
{
loop++;
threadstart[loop] = new ThreadStart(() => ftp.uploadToFTP(uploadPath + @"/" + Path.GetFileName(file), file));
thread[loop] = new Thread(threadstart[loop]);
thread[loop].Start();
if (loop == 9)
{
loop = 0;
}
}
Passing the files to their respective threads works. My problem is that the thread starts overlap.
For example, an exception occurs when Thread 1 is still running and a new file is passed to it: it returns an error because Thread 1 has not yet finished when the new parameter arrives. The same is true for the other threads.
What is the best way to implement this?
Any feedback will be greatly appreciated. Thank you! :)
Use async-await and just pass an array of file names into it:
// Returns Task rather than void so that callers can wait for completion if they need to.
private static async Task TestFtpAsync(string userName, string password, string ftpBaseUri,
IEnumerable<string> fileNames)
{
var tasks = new List<Task<byte[]>>();
foreach (var fileInfo in fileNames.Select(fileName => new FileInfo(fileName)))
{
using (var webClient = new WebClient())
{
webClient.Credentials = new NetworkCredential(userName, password);
tasks.Add(webClient.UploadFileTaskAsync(ftpBaseUri + fileInfo.Name, fileInfo.FullName));
}
}
Console.WriteLine("Uploading...");
foreach (var task in tasks)
{
try
{
await task;
Console.WriteLine("Success");
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
}
}
Then call it like this:
const string userName = "username";
const string password = "password";
const string ftpBaseUri = "ftp://192.168.1.1/";
var fileNames = new[] { @"d:\file0.txt", @"d:\file1.txt", @"d:\file2.txt" };
TestFtpAsync(userName, password, ftpBaseUri, fileNames);
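The code above starts every upload at once. If the goal is to keep roughly 10 uploads in flight, as the question describes, one possible approach is to gate the uploads with a SemaphoreSlim. This is only a sketch: UploadAllAsync is a hypothetical helper, the upload delegate is whatever actually performs the transfer, and the demo substitutes Task.Delay for a real FTP call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledUploads
{
    // Runs 'upload' for every file with at most 'maxConcurrent' running at once.
    // Returns the highest number of uploads observed in flight simultaneously.
    public static async Task<int> UploadAllAsync(
        IEnumerable<string> files, int maxConcurrent, Func<string, Task> upload)
    {
        int inFlight = 0, peak = 0;
        var sync = new object();
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            var tasks = files.Select(async file =>
            {
                await gate.WaitAsync();          // wait for a free slot
                try
                {
                    lock (sync) { inFlight++; peak = Math.Max(peak, inFlight); }
                    await upload(file);
                }
                finally
                {
                    lock (sync) { inFlight--; }
                    gate.Release();              // free the slot for the next file
                }
            }).ToList();
            await Task.WhenAll(tasks);
        }
        return peak;
    }

    public static void Main()
    {
        var files = Enumerable.Range(1, 50).Select(n => "file" + n + ".txt");
        // Task.Delay stands in for the real upload call (an assumption for the demo).
        int peak = UploadAllAsync(files, 10, f => Task.Delay(20)).GetAwaiter().GetResult();
        Console.WriteLine("Peak concurrent uploads: " + peak);
    }
}
```

The semaphore replaces the fixed array of 10 threads: a file starts uploading as soon as any slot is free, so no slot is ever handed a second file while it is still busy.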
Why do it the hard way?
.NET already has a class called ThreadPool.
You can just use that, and it manages the threads itself.
Your code would look like this:
static void DoSomething(object n)
{
Console.WriteLine(n);
Thread.Sleep(10);
}
static void Main(string[] args)
{
ThreadPool.SetMaxThreads(20, 10);
for (int x = 0; x < 30; x++)
{
ThreadPool.QueueUserWorkItem(new WaitCallback(DoSomething), x);
}
Console.Read();
}
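The example above keeps the process alive with Console.Read(). If the program should instead continue as soon as all queued items are done, one option is a CountdownEvent. The sketch below is an assumption about how that could look; PoolDemo and RunAll are hypothetical names, and Thread.Sleep stands in for the real work.

```csharp
using System;
using System.Threading;

public static class PoolDemo
{
    // Queues 'itemCount' work items on the thread pool and blocks until
    // all of them have finished. Returns how many items actually ran.
    public static int RunAll(int itemCount)
    {
        int processed = 0;
        using (var done = new CountdownEvent(itemCount))
        {
            for (int x = 0; x < itemCount; x++)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    Thread.Sleep(10);                     // simulated work
                    Interlocked.Increment(ref processed); // thread-safe counter
                    done.Signal();                        // one item finished
                }, x);
            }
            done.Wait(); // returns once Signal() has been called itemCount times
        }
        return processed;
    }

    public static void Main()
    {
        Console.WriteLine("Processed: " + RunAll(30)); // prints Processed: 30
    }
}
```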
I have a telephony application in which I want to place simultaneous calls. Each call will occupy a channel or port, so I added all channels to a BlockingCollection. The application is a Windows service.
Let's see the code.
public static BlockingCollection<Tuple<ChannelResource, string>> bc = new BlockingCollection<Tuple<ChannelResource, string>>();
public static List<string> list = new List<string>();// then add 100 test items to it.
The main application has the code:
while (true)
{
ThreadEvent.WaitOne(waitingTime, false);
lock (SyncVar)
{
Console.WriteLine("Block begin");
for (int i = 0; i < ports; i++)
{
var firstItem = list.FirstOrDefault();
if (bc.Count >= ports)
bc.CompleteAdding();
else
{
ChannelResource cr = OvrTelephonyServer.GetChannel();
bc.TryAdd(Tuple.Create(cr, firstItem));
list.Remove(firstItem);
}
}
pc.SimultaneousCall();
Console.WriteLine("Blocking end");
if (ThreadState != State.Running) break;
}
}
Now for the simultaneous call code:
public void SimultaneousCall()
{
Console.WriteLine("There are {0} channels to be processed.", bc.Count);
var workItemBlock = new ActionBlock<Tuple<ChannelResource, string>>(
workItem =>
{
ProcessEachChannel(workItem);
});
foreach (var workItem in bc.GetConsumingEnumerable())
{
bool result = workItemBlock.SendAsync(workItem).Result;
}
workItemBlock.Complete();
}
private void ProcessEachChannel(Tuple<ChannelResource, string> workItem)
{
ChannelResource cr = workItem.Item1;
string sipuri = workItem.Item2;
VoiceResource vr = workItem.Item1.VoiceResource;
workItem.Item1.Disconnected += new Disconnected(workItemItem1_Disconnected);
bool success = false;
try
{
Console.WriteLine("Working on {0}", sipuri);
DialResult dr = new DialResult();
// blah blah for calling....
}
catch (Exception ex)
{
Console.WriteLine("Exception: {0}", ex.Message);
}
finally
{
if (cr != null && cr.VoiceResource != null)
{
cr.Disconnect();
cr.Dispose();
cr = null;
Console.WriteLine("Release channel for item {0}.", sipuri);
}
}
}
My question: when I tested the application with 4 ports, I expected the code to reach
Console.WriteLine("Blocking end");
However, it did not. Please see the snapshot.
The application just hangs after releasing the last channel. I guess that I may be using the BlockingCollection incorrectly. Thanks for your help.
UPDATE:
Even after I changed the code to use Post, as below, the situation is unchanged.
private bool ProcessEachChannel(Tuple<ChannelResource, string> workItem)
{
// blah blah to return true or false respectively.
}
public void SimultaneousCall()
{
Console.WriteLine("There are {0} channels to be processed.", bc.Count);
var workItemBlock = new ActionBlock<Tuple<ChannelResource, string>>(
workItem =>
{
bool success = ProcessEachChannel(workItem);
});
foreach (var workItem in bc.GetConsumingEnumerable())
{
workItemBlock.Post(workItem);
}
workItemBlock.Complete();
}
I believe the problem is that you never call bc.CompleteAdding(): the if means it would only be called on iteration ports + 1 of the loop, but the loop runs only ports times. Because of this, GetConsumingEnumerable() returns a sequence that never ends, which means the foreach inside SimultaneousCall() blocks forever.
I think the right solution is to call bc.CompleteAdding() after the for loop, not under an impossible condition inside it.
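A small self-contained sketch of that pattern, using plain integers instead of the question's channel tuples (CompleteAddingDemo and ProduceAndConsume are hypothetical names):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class CompleteAddingDemo
{
    public static List<int> ProduceAndConsume(int ports)
    {
        var consumed = new List<int>();
        using (var queue = new BlockingCollection<int>())
        {
            // Consumer: this foreach ends only after CompleteAdding() has been
            // called and the collection has been drained.
            Task consumer = Task.Run(() =>
            {
                foreach (int item in queue.GetConsumingEnumerable())
                    consumed.Add(item);
            });

            for (int i = 0; i < ports; i++)
                queue.Add(i);

            queue.CompleteAdding(); // after the loop, unconditionally

            consumer.Wait();        // returns because the enumeration can now end
        }
        return consumed;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ProduceAndConsume(4))); // prints 0,1,2,3
    }
}
```

Without the CompleteAdding() call, consumer.Wait() would never return, which matches the hang described in the question.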
I have an async method:
public static Task<JObject> GetUser(NameValueCollection parameters)
{
return CallMethodApi("users.get", parameters, CallType.HTTPS);
}
And I wrote the method below:
public static IEnumerable<JObject> GetUsers(IEnumerable<string> usersUids, Field fields)
{
foreach(string uid in usersUids)
{
var parameters = new NameValueCollection
{
{"uids", uid},
{"fields", FieldsUtils.ConvertFieldsToString(fields)}
};
yield return GetUser(parameters).Result;
}
}
Is this method asynchronous? How can I write it using Parallel.ForEach?
Something kind of like this.
public static IEnumerable<JObject> GetUsers(IEnumerable<string> usersUids, Field fields)
{
var results = new List<JObject>();
Parallel.ForEach(usersUids, uid => {
var parameters = new NameValueCollection
{
{"uids", uid},
{"fields", FieldsUtils.ConvertFieldsToString(fields)}
};
var user = GetUser(parameters).Result;
lock(results)
results.Add(user);
});
return results;
}
NOTE: The results won't necessarily be in the same order as the input uids.
Your method is not asynchronous. Assuming your GetUser method already starts an asynchronous task, Parallel.ForEach would use additional threads just to start off your tasks, which is probably not what you want.
Instead, what you probably want to do is to start all of the tasks and wait for them to finish:
public static IEnumerable<JObject> GetUsers(IEnumerable<string> usersUids, Field fields)
{
var tasks = usersUids.Select(
uid =>
{
var parameters = new NameValueCollection
{
{"uids", uid},
{"fields", FieldsUtils.ConvertFieldsToString(fields)}
};
return GetUser(parameters);
}
).ToArray();
Task.WaitAll(tasks);
var result = new JObject[tasks.Length];
for (var i = 0; i < tasks.Length; ++i)
result[i] = tasks[i].Result;
return result;
}
If you also want to start them in parallel you can use PLINQ:
var tasks = usersUids.AsParallel().AsOrdered().Select(
uid =>
{
var parameters = new NameValueCollection
{
{"uids", uid},
{"fields", FieldsUtils.ConvertFieldsToString(fields)}
};
return GetUser(parameters);
}
).ToArray();
Both code snippets preserve relative ordering of uids and returned objects - result[0] corresponds to usersUids[0], etc.
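If the caller can itself be asynchronous, the blocking Task.WaitAll can be replaced with await Task.WhenAll, which keeps the same ordering guarantee. The sketch below uses a hypothetical GetUserAsync returning a string in place of the question's GetUser and JObject, and makes later uids finish first to show that completion order does not affect result order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncUsersDemo
{
    // Hypothetical stand-in for GetUser: completes after 'delayMs' milliseconds.
    public static async Task<string> GetUserAsync(string uid, int delayMs)
    {
        await Task.Delay(delayMs);
        return "user:" + uid;
    }

    public static async Task<string[]> GetUsersAsync(IReadOnlyList<string> uids)
    {
        // Earlier uids get longer delays, so they finish last.
        var tasks = uids.Select(
            (uid, index) => GetUserAsync(uid, 10 * (uids.Count - index)));

        // WhenAll returns results in the order of the input tasks,
        // regardless of the order in which they complete.
        return await Task.WhenAll(tasks);
    }

    public static void Main()
    {
        string[] users = GetUsersAsync(new[] { "a", "b", "c" }).GetAwaiter().GetResult();
        Console.WriteLine(string.Join(",", users)); // prints user:a,user:b,user:c
    }
}
```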