I need you to debug my idea about this project.
I've written a backup manager project which I give a folder and it copies every file and folder of it to another location and so on.
It works (does the copy job well) but during copying which takes about 1 minute the application UI does not respond. I've heard about threads and I've seen the word parallel programming (just the word and no more), now I want some explanation, comparison and examples to become able to switch my code.
I have done very simple actions with threads before but it was a long time ago and I am not confident enough on threading. Here is my code :
private void CopyFiles(string path, string dest)
{
System.IO.Directory.CreateDirectory(dest + "\\" + path.Split('\\')[path.Split('\\').Count()-1]);
dest = dest + "\\" + path.Split('\\')[path.Split('\\').Count() - 1];
foreach (string file in System.IO.Directory.GetFiles(path))
{
System.IO.File.Copy(file, dest + "\\" + file.Split('\\')[file.Split('\\').Count() - 1]);
}
foreach (string folder in System.IO.Directory.GetDirectories(path))
{
CopyFiles(folder, dest);
}
}
I run this from a timer at a set interval. If I switch to threading, should I drop the timer? Please point me in the right direction; I'm confused.
Since you are not yet confident with threading, I highly recommend you read Joe Albahari's Threading in C# tutorial. Parallel programming is when you do multiple operations in 'parallel', that is, at the same time (mostly for spreading large amounts of calculation over several CPU or GPU cores). In this case you want threading to keep your UI responsive while all the files are copied. Once you have read the tutorial, you would essentially end up with something like this:
Thread copyFilesThread = new Thread(() =>
{
CopyFiles(path, dest);
});
copyFilesThread.Start();
The UI runs on its own thread. All of the code that is put into your application will run on the UI thread (unless you are explicitly using threading). Since your CopyFiles method takes a long time, it will stop the UI until the copying job is completed. Using threading will run the CopyFiles on a separate thread to the UI thread, therefore making the UI thread responsive.
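As a rough, self-contained sketch of the idea above, the following console program runs a recursive copy on a worker thread. The names BackupSketch and StartCopy and the temporary folder layout are assumptions made for illustration, not part of the question's project. It also uses Path.Combine and Path.GetFileName in place of the manual Split calls, and it overwrites existing files, which the original code does not do.

```csharp
using System;
using System.IO;
using System.Threading;

public static class BackupSketch
{
    // Recursively copies the folder at 'path' into 'dest', mirroring the question's CopyFiles.
    public static void CopyFiles(string path, string dest)
    {
        // Path.GetFileName returns the last segment of the path (here, the folder name).
        string folderName = Path.GetFileName(path.TrimEnd(Path.DirectorySeparatorChar));
        string target = Path.Combine(dest, folderName);
        Directory.CreateDirectory(target);

        foreach (string file in Directory.GetFiles(path))
        {
            // Assumption: overwrite existing files (the original code would throw instead).
            File.Copy(file, Path.Combine(target, Path.GetFileName(file)), true);
        }
        foreach (string folder in Directory.GetDirectories(path))
        {
            CopyFiles(folder, target);
        }
    }

    // Starts the copy on a worker thread and returns immediately.
    public static Thread StartCopy(string path, string dest)
    {
        var worker = new Thread(() => CopyFiles(path, dest));
        worker.IsBackground = true; // do not keep the process alive if the app exits
        worker.Start();
        return worker;
    }

    public static void Main()
    {
        // Build a small throwaway folder tree to copy.
        string root = Path.Combine(Path.GetTempPath(), "backup-sketch-" + Guid.NewGuid().ToString("N"));
        string source = Path.Combine(root, "source");
        string dest = Path.Combine(root, "dest");
        Directory.CreateDirectory(Path.Combine(source, "sub"));
        Directory.CreateDirectory(dest);
        File.WriteAllText(Path.Combine(source, "a.txt"), "hello");
        File.WriteAllText(Path.Combine(source, "sub", "b.txt"), "world");

        Thread worker = StartCopy(source, dest);
        Console.WriteLine("Copy started; the calling thread is free to do other work.");

        worker.Join(); // a UI thread would not block like this; it is only for the demo
        Console.WriteLine("Copied: " + File.Exists(Path.Combine(dest, "source", "sub", "b.txt")));
        Directory.Delete(root, true);
    }
}
```

In a real UI you would not call Join on the UI thread; you would let the worker signal completion instead, for example through a BackgroundWorker.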
Edit: As for your timer, how often does it run?
A simple way to perform an operation in a separate dedicated thread which allows you to know when the thread has completed is by using BackgroundWorker.
An example of usage is on the page I linked above.
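Since the linked page is not reproduced here, the following is only a minimal sketch of how BackgroundWorker usage might look. WorkerSketch and RunInBackground are hypothetical names, and the example runs as a console program, where RunWorkerCompleted is raised on a thread-pool thread because there is no UI synchronization context. In a WinForms or WPF application the same event is raised on the UI thread, so you would update controls in that handler rather than block on the result.

```csharp
using System;
using System.ComponentModel;
using System.Threading;
using System.Threading.Tasks;

public static class WorkerSketch
{
    // Runs 'work' on a BackgroundWorker and exposes its completion as a Task.
    public static Task<int> RunInBackground(Func<int> work)
    {
        var completion = new TaskCompletionSource<int>();
        var worker = new BackgroundWorker();

        // DoWork runs on a background thread.
        worker.DoWork += (sender, e) => e.Result = work();

        // RunWorkerCompleted fires once DoWork has returned or thrown.
        worker.RunWorkerCompleted += (sender, e) =>
        {
            if (e.Error != null)
                completion.SetException(e.Error);
            else
                completion.SetResult((int)e.Result);
        };

        worker.RunWorkerAsync();
        return completion.Task;
    }

    public static void Main()
    {
        Task<int> pending = RunInBackground(() =>
        {
            Thread.Sleep(200); // stand-in for a long-running job such as copying files
            return 42;
        });

        Console.WriteLine("Work started; the caller is not blocked.");
        // Blocking on Result is acceptable in this console demo only;
        // on a UI thread it would freeze the UI again.
        Console.WriteLine("Result: " + pending.Result);
    }
}
```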
If you want to copy a large or unknown number of files, you should use the ThreadPool:
ThreadPool.QueueUserWorkItem(delegate
{
CopyFiles(folder, dest);
});
Background worker can be used to implement asynchronous execution.
This link may help
http://www.codeproject.com/Articles/20627/BackgroundWorker-Threads-and-Supporting-Cancel
Related
I want to write a program that has two threads: one downloads files and the other parses each downloaded file. The tricky part is that I cannot have two parsing threads running at the same time, because parsing goes through a library that does not allow it. Please help with a suggestion. Thank you.
foreach (string filename in filenames)
{
    // start downloading thread here
    readytoparse.Add(filename);
}
foreach (string filename in readytoparse)
{
    // start parsing here
}
I ended up with the following logic
bool parserrunning = false;
List<string> readytoparse = new List<string>();
List<string> filenames = new List<string>();
// downloading method
foreach (string filename in filenames)
{
    // start downloading thread here
    readytoparse.Add(filename);
    if (!parserrunning)
    {
        // start parser method
    }
}
// parsing method
parserrunning = true;
// work on a snapshot so the downloader can keep adding to readytoparse
List<string> _readytoparse = new List<string>(readytoparse);
foreach (string filename in _readytoparse)
{
    // start parsing here
}
parserrunning = false;
Yousuf, your "question" is pretty vague. You could take an approach where your main thread downloads the files, then each time a file finishes downloading, spawns a worker thread to parse that file. There is the Task API or QueueUserWorkItem for this sort of thing. I suppose it's possible that you could end up with an awful lot of worker threads running concurrently this way, which isn't necessarily the key to getting the work done faster and could negatively impact other concurrent work on the computer.
If you want to limit this to two threads, you might consider having the download thread write the file name into a queue each time a download finishes. Then your parser thread monitors that queue (wake up every x seconds, check the queue to see if there's anything to do, do the work, check the queue again, if there's nothing to do, go back to sleep for x seconds, repeat).
If you want the parser to be resilient, make that queue persistent (a database, MSMQ, a running text file on disk--something persistent). That way, if there is an interruption (computer crashes, program crashes, power loss), the parser can start right back up where it left off.
Code synchronization comes into play in the sense that you obviously cannot have the parser trying to parse a file that the downloader is still downloading, and if you have two threads using a queue, then you obviously have to protect that queue from concurrent access.
Whether you use Monitors or Mutexes, or QueueUserWorkItem or the Task API is sort of academic. There is plenty of support in the .NET framework for synchronizing and parallelizing units of work.
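To make the two-thread queue idea more concrete, here is a small sketch that uses BlockingCollection from System.Collections.Concurrent. This is a substitution for the "wake up every x seconds" polling described above: the parser thread blocks until an item arrives instead of sleeping. PipelineSketch, Run, and the download and parse delegates are hypothetical placeholders, and the queue is in-memory only, so it is not the persistent variant.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class PipelineSketch
{
    // Downloads on the calling thread and parses on exactly one worker thread.
    public static List<string> Run(IEnumerable<string> fileNames,
                                   Action<string> download,
                                   Func<string, string> parse)
    {
        var parsed = new List<string>();
        using (var readyToParse = new BlockingCollection<string>())
        {
            var parser = new Thread(() =>
            {
                // Blocks while the queue is empty; the loop ends once
                // CompleteAdding has been called and the queue is drained.
                foreach (string name in readyToParse.GetConsumingEnumerable())
                {
                    parsed.Add(parse(name));
                }
            });
            parser.Start();

            foreach (string name in fileNames)
            {
                download(name);         // a file is only queued after its download finished
                readyToParse.Add(name); // thread-safe hand-off to the parser
            }

            readyToParse.CompleteAdding(); // tell the parser no more files are coming
            parser.Join();                 // 'parsed' is safe to read after this
        }
        return parsed;
    }

    public static void Main()
    {
        var names = new[] { "a.txt", "b.txt", "c.txt" };
        List<string> results = Run(
            names,
            name => Thread.Sleep(50),         // stand-in for a download
            name => name.ToUpperInvariant()); // stand-in for the parsing library
        Console.WriteLine(string.Join(", ", results));
    }
}
```

Because only one thread ever consumes the queue, the parsing library is never entered by two threads at once, and the collection itself handles the locking around the queue.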
I suggest avoiding all of the heartache of doing this yourself with low-level primitives and using a library designed for this kind of thing.
I recommend Microsoft's Reactive Framework (Rx).
Here's the code:
var query =
from filename in filenames.ToObservable(Scheduler.Default)
from file in Observable.Start(() => /* read file */, Scheduler.Default)
from parsed in Observable.Start(() => /* parse file */, Scheduler.Default)
select new
{
filename,
parsed,
};
query.Subscribe(fp =>
{
/* Do something with finished file */
});
Very simple.
If your parsing library is single threaded only, then add this line:
var els = new EventLoopScheduler();
And then replace Scheduler.Default with els on the parsing line.
Without using extra threads I would simply like to display a "Loading" label or something similar to the user when a large amount of data is being read or written. If I however attempt to modify any UI elements before calling the IO method, the application freezes for a while and then displays the "Loading" message after all the work is already done. This obviously doesn't help. How can I ensure that any UI changes are applied and visible before calling the IO method?
DataSet ds = STT_Import.ImportExcelToDataSet(filePath);
bool result = false;
if (ds != null)
{
int cellCount = ds.GetTotalCellCount();
if (Popup.ShowMessage(string.Format("Your file contains {0} cells. Inserting data will take approximately {1} seconds. Do you want to continue?",
cellCount, CalculateTime(cellCount)), "Confirm", MessageType.Confirm) == MessageBoxResult.Yes)
{
// Tell user the application is working:
StatusLabel.Content = "Writing to database...";
// Do actual work after user has been notified:
result = DB.StoreItems(_currentType, ds);
}
}
I tried looking for answers but couldn't find anything that answered my specific question, so I'm sorry if the question has been asked before.
When working with WPF, you can use the Dispatcher to queue commands on the UI thread at different DispatcherPriorities.
This allows you to queue your long-running process on the UI thread so that it runs only after everything in the DispatcherPriority.Render and DispatcherPriority.Loaded queues has been processed.
For example, your code may look like this:
// Tell user the application is working:
StatusLabel.Content = "Writing to database...";
// Do actual work after user has been notified:
Dispatcher.BeginInvoke(DispatcherPriority.Input,
new Action(delegate() {
var result = DB.StoreItems(_currentType, ds); // Do Work
if (result)
StatusLabel.Content = "Finished";
else
StatusLabel.Content = "An error has occurred";
}));
It should be noted, though, that it's usually considered bad design to lock up an application while something is running.
A better solution would be to run the long-running process on a background thread and simply disable your application form while it runs. There are many ways of doing this, but my personal preference is the Task Parallel Library, for its simplicity.
As an example, your code to use a background thread would look something like this:
using System.Threading.Tasks;
...
// Tell user the application is working:
StatusLabel.Content = "Writing to database...";
MyWindow.IsEnabled = false;
// Do actual work after user has been notified:
Task.Factory.StartNew(() => DB.StoreItems(_currentType, ds))
    // This runs after the background thread has finished executing.
    // The scheduler argument makes it run on the UI thread, which is
    // required because it touches UI elements.
    .ContinueWith((e) =>
    {
        var isSuccessful = e.Result;
        if (isSuccessful)
            StatusLabel.Content = "Finished";
        else
            StatusLabel.Content = "An error has occurred";
        MyWindow.IsEnabled = true;
    }, TaskScheduler.FromCurrentSynchronizationContext());
You are trying to solve the problem in the wrong manner. What you should be doing here is run the time-consuming task in a worker thread; this way, your UI will remain responsive and the current question will become moot.
There are several ways you can offload the task to a worker thread; among the most convenient are using the thread pool and asynchronous programming.
It is provably impossible to keep your UI responsive without utilizing additional threads unless your database provides an asynchronous version of the method you're using. If it does provide an asynchronous version of the method, then you simply need to use that. (Keep in mind that async does not mean that it's using any other threads. It's entirely possible to create an asynchronous method that never uses additional threads, and that's exactly what's done with most network IO methods.) The specifics of how to go about doing that will depend on the type of DB framework you're using and how you're using it.
If your DB framework does not provide async methods then the only way to keep the UI responsive is to perform the long running operation(s) in a non-UI thread.
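As an illustration of that last case, the sketch below wraps a blocking call in Task.Run so that the caller can await it. StoreItemsBlocking is a made-up stand-in for a synchronous method such as DB.StoreItems, and the console Main stands in for a UI event handler; none of these names come from a real DB framework.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncSketch
{
    // Hypothetical stand-in for a blocking call such as DB.StoreItems.
    public static bool StoreItemsBlocking(int itemCount)
    {
        Thread.Sleep(200); // simulate slow database work
        return itemCount > 0;
    }

    // Moves the blocking call to a thread-pool thread so the caller can await it.
    public static Task<bool> StoreItemsAsync(int itemCount)
    {
        return Task.Run(() => StoreItemsBlocking(itemCount));
    }

    public static async Task Main()
    {
        Console.WriteLine("Writing to database...");
        // In a UI event handler, control returns to the message loop here,
        // so the status text can be painted while the work runs.
        bool ok = await StoreItemsAsync(42);
        Console.WriteLine(ok ? "Finished" : "An error has occurred");
    }
}
```

If the DB framework does offer a genuinely asynchronous method, awaiting that directly is preferable to wrapping a blocking call, because no thread is tied up while waiting.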
The approach you are using is not efficient, so I would suggest going with async programming or threading.
Async programming:
Visual Studio 2012 introduces a simplified approach, async programming, that leverages asynchronous support in the .NET Framework 4.5 and the Windows Runtime. The compiler does the difficult work that the developer used to do, and your application retains a logical structure that resembles synchronous code. As a result, you get all the advantages of asynchronous programming with a fraction of the effort. Requires .NET Framework 4.5.
It saves you from working with System.Threading directly and is well suited to tasks like yours, where the code has to wait for some operation to finish.
http://msdn.microsoft.com/en-ca/library/vstudio/hh191443.aspx
http://go.microsoft.com/fwlink/?LinkID=261549
or
Threading:
The advantage of threading is the ability to create applications that use more than one thread of execution. For example, a process can have a user interface thread that manages interactions with the user and worker threads that perform other tasks while the user interface thread waits for user input. Supported on .NET Framework 4.0 and older.
http://msdn.microsoft.com/en-us/library/aa645740%28v=vs.71%29.aspx
If you don't need the UI to be responsive, you can do what I do and show a busy indicator.
There are prettier cursors, but this is an in-house application.
using (new WaitCursor())
{
// very long task
Search.ExecuteSearch(enumSrchType.NextPage);
}
public class WaitCursor : IDisposable
{
private Cursor _previousCursor;
public WaitCursor()
{
_previousCursor = Mouse.OverrideCursor;
Mouse.OverrideCursor = Cursors.Wait;
}
#region IDisposable Members
public void Dispose()
{
Mouse.OverrideCursor = _previousCursor;
}
#endregion
}
I am using the BackgroundWorker to do some heavy stuff in the background so that the UI does not become unresponsive.
But today I noticed that when I run my program, only one of the two CPUs is being used.
Is there any way to use all CPUs with the BackgroundWorker?
Here is my simplified code, just if you are curious!
private System.ComponentModel.BackgroundWorker bwPatchApplier;
this.bwPatchApplier.WorkerReportsProgress = true;
this.bwPatchApplier.DoWork += new System.ComponentModel.DoWorkEventHandler(this.bwPatchApplier_DoWork);
this.bwPatchApplier.ProgressChanged += new System.ComponentModel.ProgressChangedEventHandler(this.bwPatchApplier_ProgressChanged);
this.bwPatchApplier.RunWorkerCompleted += new System.ComponentModel.RunWorkerCompletedEventHandler(this.bwPatchApplier_RunWorkerCompleted);
private void bwPatchApplier_DoWork(object sender, DoWorkEventArgs e)
{
string pc1WorkflowName;
string pc2WorkflowName;
if (!GetWorkflowSettings(out pc1WorkflowName, out pc2WorkflowName)) return;
int progressPercentage = 0;
var weWorkspaces = (List<WEWorkspace>) e.Argument;
foreach (WEWorkspace weWorkspace in weWorkspaces)
{
using (var spSite = new SPSite(weWorkspace.SiteId))
{
foreach (SPWeb web in spSite.AllWebs)
{
using (SPWeb spWeb = spSite.OpenWeb(web.ID))
{
PrintHeader(spWeb.ID, spWeb.Title, spWeb.Url, bwPatchApplier);
try
{
for (int index = 0; index < spWeb.Lists.Count; index++)
{
SPList spList = spWeb.Lists[index];
if (spList.Hidden) continue;
string listName = spList.Title;
if (listName.Equals("PC1") || listName.Equals("PC2"))
{
#region STEP 1
// STEP 1: Remove Workflow
#endregion
#region STEP 2
// STEP 2: Add Events: Adding & Updating
#endregion
}
if ((uint) spList.BaseTemplate == 10135 || (uint) spList.BaseTemplate == 10134)
{
#region STEP 3
// STEP 3: Configure Custom AssignedToEmail Property
#endregion
#region STEP 4
if (enableAssignToEmail)
{
// STEP 4: Install AssignedTo events to Work lists
}
#endregion
}
#region STEP 5
// STEP 5 Install Notification Events
#endregion
#region STEP 6
// STEP 6 Install Report List Events
#endregion
progressPercentage += TotalSteps;
UpdatePercentage(progressPercentage, bwPatchApplier);
}
}
catch (Exception exception)
{
progressPercentage += TotalSteps;
UpdatePercentage(progressPercentage, bwPatchApplier);
}
}
}
}
}
PrintMessage(string.Empty, bwPatchApplier);
PrintMessage("*** Process Completed", bwPatchApplier);
UpdateStatus("Process Completed", bwPatchApplier);
}
Thanks a lot for looking into this :)
The BackgroundWorker does its work within a single background (ThreadPool) thread. As such, if it's computationally heavy, it'll use one CPU heavily. The UI thread is still running on the second, but is probably (like most user interface work) spending almost all of its time idle waiting for input (which is a good thing).
If you want to split your work up to use more than one CPU, you'll need to use some other techniques. This could be multiple BackgroundWorker components, each doing some work, or using the ThreadPool directly. Parallel programming has been simplified in .NET 4 via the TPL, which is likely a very good option. For details, you can see my series on the TPL or MSDN's page on the Task Parallel Library.
Each BackgroundWorker uses only a single thread to do the stuff you tell it to do. To take advantage of multiple cores, you would need multiple threads. That would mean either multiple BackgroundWorkers or spawning multiple threads from within your DoWork method.
The BackgroundWorker, by itself, only provides one additional thread of execution. Its purpose is to get things off the UI thread, and it's very good at that job. If you want more threads, you need to provide them yourself.
It would be tempting here to build a method that accepts an SPWeb argument, and just call Thread.Start() over and over for each object; then finish with Thread.Join() or WaitAll() to wait for them to finish at the end of the BackgroundWorker. However, this would be a bad idea because you'll lose efficiency as the operating system spends time performing context switches among all the threads.
Instead, you want to force your system to run in only a few threads, but at least two (in this case). A good rule of thumb is (2n - 1), where "n" is the number of processor cores you have... but there are all kinds of cases where you want to break this rule. You can implement this by using a ThreadPool, by iterating over your SPWeb objects and adding them to a queue that you keep pulling from, or other means such as the TPL.
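One way to cap the number of threads, sketched here with the TPL, is ParallelOptions.MaxDegreeOfParallelism. ThrottleSketch and ProcessAll are hypothetical names, the work items are plain integers rather than SPWeb objects, and the (2n - 1) limit is simply the rule of thumb quoted above rather than a measured optimum. The method returns the highest concurrency it observed so that the cap can be checked.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Runs 'work' for every item on at most 'maxThreads' threads at a time
    // and returns the highest number of concurrent workers that was observed.
    public static int ProcessAll(IEnumerable<int> items, int maxThreads, Action<int> work)
    {
        object gate = new object();
        int running = 0;
        int peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.ForEach(items, options, item =>
        {
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }
            try
            {
                work(item);
            }
            finally
            {
                lock (gate) { running--; }
            }
        });
        return peak;
    }

    public static void Main()
    {
        // Rule of thumb from the text: 2n - 1 threads, but at least two.
        int limit = Math.Max(2, 2 * Environment.ProcessorCount - 1);
        int peak = ProcessAll(Enumerable.Range(0, 50), limit, item => Thread.Sleep(20));
        Console.WriteLine("Limit: " + limit + ", peak concurrency observed: " + peak);
    }
}
```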
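The BackgroundWorker runs your work on a single additional thread, which the operating system will typically schedule on the otherwise idle core. That leaves the UI responsive, but the work itself still only uses one core at a time.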
If you're using .NET 4, look into using the Task Parallel Library, which could give you better results and utilize both cores.
The BackgroundWorker itself is only creating a single thread apart from your main UI to do work in - it's not trying to parallelize the operations within that work thread. If you want to spread your work across multiple work threads you should look into using the TPL. Bear in mind that not all tasks translate well to parallel execution, so if freeing the UI is your only goal this may already be the best you can do.
There are potential pitfalls to this, but you might get some mileage out of utilizing Parallel.ForEach:
Instead of
foreach (SPWeb web in spSite.AllWebs)
{
//Your loop code here
}
You could:
Parallel.ForEach(spSite.AllWebs, web =>
{
//Your loop code here
});
This basically turns the loop iterations into tasks (from the Task API in .NET 4.0) and schedules that work on the ThreadPool, which will give you some of the parallelism you need to take advantage of those cores.
You will have to fix the concurrency problems that are likely to arise from this, but it's a good starting point. At the very least, you will have to deal with the state you are sharing across threads (the progress counter). Here's some guidance on that: http://msdn.microsoft.com/en-us/library/dd997392.aspx
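For the shared progress counter specifically, one possible fix is Interlocked.Increment, sketched below. ProgressSketch, ProcessWithProgress, and the delegates are illustrative assumptions; the items are plain strings, and nothing here addresses whether the SharePoint objects themselves are safe to use from several threads.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ProgressSketch
{
    // Processes every item in parallel and reports a percentage after each one.
    // Returns the number of items that were processed.
    public static int ProcessWithProgress(IList<string> items,
                                          Action<string> processItem,
                                          Action<int> reportPercent)
    {
        int completed = 0;
        Parallel.ForEach(items, item =>
        {
            processItem(item);
            // An atomic increment avoids the lost updates that a plain
            // 'completed++' can suffer when several threads run it at once.
            int done = Interlocked.Increment(ref completed);
            // Note: reportPercent may be called from several threads concurrently.
            reportPercent(done * 100 / items.Count);
        });
        return completed;
    }

    public static void Main()
    {
        var sites = new List<string> { "site1", "site2", "site3", "site4" };
        int total = ProcessWithProgress(
            sites,
            site => Thread.Sleep(50), // stand-in for the real per-site work
            percent => Console.WriteLine(percent + "%"));
        Console.WriteLine("Processed " + total + " items");
    }
}
```

The percentages may be printed out of order, because each thread reports as soon as it finishes its own item.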
I have to be able to save a file. Unfortunately it can potentially be very large, so saving it can take minutes. As I need to do this from a GUI thread, I don't want to block the GUI from executing. I was thinking about attempting the save operation on a separate thread to allow the primary GUI thread to continue executing.
Is there a nice (easy) way to spawn a new thread, save the file, and destroy the thread without any nasty side effects?!
It must be said that I have NEVER had to use threads before so I am a complete novice! Any and all help would be greatly appreciated!
BackgroundWorker (as suggested by Frederik) is a good choice, particularly if you want to report progress to the UI while you're saving. A search for BackgroundWorker tutorial gets a lot of hits, so you should be able to follow one of those to get you started.
One thing to be careful of: would there be any way of changing the data structure that you'll be trying to save from the UI thread? If so, you should disable those aspects of the UI while you're saving - it would (probably!) be bad to be half way through saving the data, then allow the user to change some of it. If you can get away with effectively handing off the data to the background thread and then not touching it from the UI thread, that will make your life a lot easier.
You could use the BackgroundWorker component, as it abstracts away some of the threading for you.
Your problem might be that there are several nice and easy ways of doing it. If you just want to set off the file save and not worry about knowing when it has completed, then having a method
void SaveMyFile(object state)
{
// SaveTheFile
}
and calling it with
ThreadPool.QueueUserWorkItem( SaveMyFile );
will do what you want.
I would recommend doing Asynchronous I/O. It's a little bit easier to set up and doesn't require you to create new threads yourself.
Asynchronous programming is where you have, for example, a file stream you want to write to, but you do not want to wait for the write to finish. You might want to be notified when it has finished, but you don't want to wait.
You do this using the BeginWrite/BeginRead and EndWrite/EndRead functions that are available on the Stream class.
In your method you start by calling BeginWrite with all the data you want to write, and you also pass in a callback function. This function will be called when the write has completed.
Inside the callback function you call EndWrite and clean up the stream and check for errors.
BeginWrite will not block which means that if it's called from within an event handler that thread can finish that handler and continue processing more event (such as other GUI events).
using System;
using System.IO;
using System.Text;
class Program
{
private static FileStream stream;
static void Main(string[] args)
{
stream = new FileStream("foo.txt",
FileMode.Create,
FileAccess.Write);
const string mystring = "Foobarlalala";
ASCIIEncoding encoding = new ASCIIEncoding();
byte[] data = encoding.GetBytes(mystring);
Console.WriteLine("Started writing");
stream.BeginWrite(data, 0, data.Length, callback, null);
Console.WriteLine("Writing dispatched, sleeping 5 secs");
System.Threading.Thread.Sleep(5000);
}
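    public static void callback(IAsyncResult ia)
    {
        // Completes the write; any I/O error is thrown here.
        stream.EndWrite(ia);
        // Clean up the stream now that the write is done.
        stream.Close();
        Console.WriteLine("Finished writing");
    }
}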
The sleeping is pretty important because the thread that's writing stuff will be killed if the main thread is killed off. This is not an issue in a GUI application, only here in this small example.
MSDN has a pretty good overview on how to write this stuff, and also some good articles on Asynch programming in general in case you go for the backgroundworker or ThreadPool.
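On .NET Framework 4.5 or later, the same non-blocking write can also be expressed with async/await instead of BeginWrite/EndWrite. The sketch below is an alternative formulation rather than what the answer above used; AsyncWriteSketch and SaveAsync are hypothetical names, and the temporary file path is chosen only for the demo.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class AsyncWriteSketch
{
    // Writes 'text' to 'path' without blocking the calling thread.
    public static async Task SaveAsync(string path, string text)
    {
        byte[] data = Encoding.UTF8.GetBytes(text);
        // The last argument (useAsync: true) asks for asynchronous I/O where the OS supports it.
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write,
                                           FileShare.None, 4096, true))
        {
            await stream.WriteAsync(data, 0, data.Length);
        }
    }

    public static async Task Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "async-write-sketch-" + Guid.NewGuid().ToString("N") + ".txt");
        Console.WriteLine("Started writing");
        await SaveAsync(path, "Foobarlalala");
        Console.WriteLine("Finished writing " + new FileInfo(path).Length + " bytes");
        File.Delete(path);
    }
}
```

Because the method is awaited, no Sleep is needed to keep the process alive, and the using block takes care of closing the stream.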
Or you could use our old friends, delegates.
This kind of follows on from another question of mine.
Basically, once I have the code to access the file (will review the answers there in a minute) what would be the best way to test it?
I am thinking of creating a method which just spawns lots of BackgroundWorker's or something and tells them all load/save the file, and test with varying file/object sizes. Then, get a response back from the threads to see if it failed/succeeded/made the world implode etc.
Can you guys offer any suggestions on the best way to approach this? As I said before, this is all kinda new to me :)
Edit
Following ajmastrean's post:
I am using a console app to test with Debug.Asserts :)
Update
I originally went with BackgroundWorker to deal with the threading (since I am used to that from Windows development). I soon realised that for tests where multiple operations (threads) need to complete before continuing, getting BackgroundWorker to do this was going to be a bit of a hack.
I then followed up on ajmastrean's post and realised I should really be using the Thread class for working with concurrent operations. I will now refactor using this method (albeit a different approach).
In .NET, there is no built-in way to wait for ThreadPool work items to finish unless you set up ManualResetEvents or AutoResetEvents. I find these overkill for a quick test method (not to mention kind of complicated to create, set, and manage). BackgroundWorker is also a bit complex, with its callbacks and such.
Something I have found that works is
Create an array of threads.
Setup the ThreadStart method of each thread.
Start each thread.
Join on all threads (blocks the current thread until all other threads complete or abort)
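public static void MultiThreadedTest()
{
    // 'count' is the number of threads to run; it is assumed to be defined elsewhere.
    Thread[] threads = new Thread[count];
    for (int i = 0; i < threads.Length; i++)
    {
        // Pass the method itself (a ThreadStart), do not call it here.
        threads[i] = new Thread(DoSomeWork);
    }
    foreach (Thread thread in threads)
    {
        thread.Start();
    }
    // Block until every thread has completed or aborted.
    foreach (Thread thread in threads)
    {
        thread.Join();
    }
}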
@ajmastrean, since a unit test result must be predictable, we need to synchronize the threads somehow. I can't see a simple way to do it without using events.
I found that ThreadPool.QueueUserWorkItem gives me an easy way to test such use cases
ThreadPool.QueueUserWorkItem(x => {
File.Open(fileName, FileMode.Open);
event1.Set(); // Start 2nd thread;
event2.WaitOne(); // Blocking the file;
});
ThreadPool.QueueUserWorkItem(x => {
try
{
event1.WaitOne(); // Waiting until 1st thread open file
File.Delete(fileName); // Simulating conflict
}
catch (IOException e)
{
Debug.Write("File access denied");
}
});
Your idea should work fine. Basically you just want to spawn a bunch of threads, and make sure the ones writing the file take long enough to do it to actually make the readers wait. If all of your threads return without error, and without blocking forever, then the test succeeds.
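A minimal sketch of such a test harness might look like the following. ConcurrencyTestSketch and RunConcurrently are hypothetical names, and the lock around the file access is only a placeholder for whatever synchronization the real file-access code uses. The helper starts the threads, joins them all, and collects any exceptions so that the test can assert that none occurred.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

public static class ConcurrencyTestSketch
{
    // Runs 'action' on 'threadCount' threads and returns every exception that was thrown.
    public static List<Exception> RunConcurrently(int threadCount, Action<int> action)
    {
        var errors = new List<Exception>();
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            int index = i; // copy the loop variable so each lambda sees its own value
            threads[i] = new Thread(() =>
            {
                try
                {
                    action(index);
                }
                catch (Exception ex)
                {
                    lock (errors) { errors.Add(ex); }
                }
            });
        }
        foreach (Thread thread in threads) thread.Start();
        foreach (Thread thread in threads) thread.Join();
        return errors;
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "concurrency-sketch-" + Guid.NewGuid().ToString("N") + ".txt");
        object gate = new object();

        List<Exception> errors = RunConcurrently(8, i =>
        {
            // Placeholder for the file-access code under test.
            lock (gate)
            {
                File.AppendAllText(path, "line " + i + Environment.NewLine);
            }
        });

        Console.WriteLine("Errors: " + errors.Count + ", lines written: " + File.ReadAllLines(path).Length);
        File.Delete(path);
    }
}
```

Removing the lock in Main is a quick way to see what a failing run looks like, since concurrent appends to the same file can then collide and throw IOException.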