I have a Windows Forms program. On the main form I have a label that I update with progress. I use the following code to perform my operations:
Thread workThread = new Thread(delegate ()
{
for (int i = 1; i < bytes.Length; i++)
{
SliderOverBytesAndDoProcessing();
}
});
workThread.Start();
The function SliderOverBytesAndDoProcessing() performs operations where I read 4 bytes and perform some operations on more bytes. I do this for each 4 bytes of the 200MB file (read 4 bytes process something and continue moving). Everything works fine until I decide to report my progress. Here is how I do the update:
Thread workThread = new Thread(delegate ()
{
for (int i = 1; i < bytes.Length; i++)
{
Lbl.Invoke((Action)delegate { Lbl.Text = "Processing " + (i / 1048576) + "/" + (bytes.Length / 1048576); });
SliderOverBytesAndDoProcessing();
}
});
workThread.Start();
If I add that one line, the program becomes far too slow: it runs in a few minutes without that line, but with it the run takes around 1.5 hours. Also, on some other operations, doing this ends in infinite loops. How do I report progress from the worker thread to the form thread without running into this problem?
Thanks.
It is slow because you’re dispatching to the UI thread on every byte, even though the progress message only seems to care about megabytes. So you change the label over a million times to the same value.
Consider
var totalMb = bytes.Length / 1024 / 1024;
var lastMb = -1;
for (int i = 1; i < bytes.Length; i++) {
var currentMb = i / 1024 / 1024;
if (currentMb > lastMb) {
Lbl.Invoke((Action)(() => { Lbl.Text = $"Processing {currentMb}/{totalMb} MiB"; }));
lastMb = currentMb;
}
SliderOverBytesAndDoProcessing();
}
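To get a feel for how much dispatching the check above saves, here is a minimal console sketch of the same throttling logic. The `CountThrottledReports` helper and the `report` callback are hypothetical stand-ins for the loop and the `Lbl.Invoke` call; no WinForms is involved.

```csharp
using System;

static class ProgressThrottleDemo
{
    // Walks the same loop as above and reports only when a new MiB boundary is crossed.
    // Returns how many times the (hypothetical) report callback was called.
    public static int CountThrottledReports(long length, Action<long> report)
    {
        const long MiB = 1024 * 1024;
        long lastMb = -1;
        int reports = 0;
        for (long i = 1; i < length; i++)
        {
            long currentMb = i / MiB;
            if (currentMb > lastMb)
            {
                report(currentMb);   // in the form, this is where Lbl.Invoke would go
                lastMb = currentMb;
                reports++;
            }
            // per-byte processing would go here
        }
        return reports;
    }

    static void Main()
    {
        long length = 5L * 1024 * 1024;
        int reports = CountThrottledReports(length, mb => Console.WriteLine($"Processing {mb}/5 MiB"));
        Console.WriteLine($"{reports} UI updates instead of {length - 1}");
    }
}
```

For a 5 MiB input this issues 5 updates rather than one per byte.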
This code takes too long if I run it serially, so I want to do it with threading. But I'm running into thread-safety problems. Here is what starts the threads:
protected void Page_LoadComplete(object sender, EventArgs e)
{
if (!IsPostBack)
{
using (var finished = new CountdownEvent(1))
{
for (int i = 1; i <= Convert.ToInt32(ViewState["Count"]); i++)
{
string k = i.ToString();
ThreadInfo threadInfo = new ThreadInfo();
threadInfo.f = k;
finished.AddCount(); // Indicate that there is another work item.
ThreadPool.QueueUserWorkItem(
(state) =>
{
try
{
Debug.WriteLine("Thread-Start Part " + k.ToString());
CalcThread(threadInfo);
}
finally
{
finished.Signal(); // Signal that the work item is complete.
}
}, null);
Thread.Sleep(300);
}
Debug.WriteLine("Waiting till threads are done! ");
finished.Signal(); // Signal that queueing is complete.
finished.Wait(); // Wait for all work items to complete.
Debug.WriteLine("Threads have completed! ");
}
}
}
Here is one place where I believe I'm getting unsafe conditions, and I really don't know if there is a way to solve them. I cannot lock ( ... ) all the code, because that would defeat the purpose. A lot of the calculations happen in a sub-class. The problem, I believe, is that when this sub-class is called from multiple threads, the answers I get back are often the same.
m_calculation.StartQuotePart(runnerweight, waste, machinesize, injectioncycle, volumefactor, MOQ);
m_calculation.partCostNoShiping = m_calculation.partCostNoShiping / partquantity * (1.0 + partcommission);
m_calculation.FedExShippingCost = m_calculation.FedExShippingCost / partquantity * (1.0 + partcommission);
m_calculation.DHLShippingCost = m_calculation.DHLShippingCost / partquantity * (1.0 + partcommission);
m_calculation.UPSShippingCost = m_calculation.UPSShippingCost / partquantity * (1.0 + partcommission);
m_calculation.OceanShippingCost = m_calculation.OceanShippingCost / partquantity * (1.0 + partcommission);
m_calculation.materialcost_out = m_calculation.materialcost_out / partquantity;
m_calculation.ProcessCost_out = m_calculation.ProcessCost_out / partquantity;
m_calculation.DHLshippingcost_out = m_calculation.DHLshippingcost_out / partquantity;
I have a single instance of m_calculation in the same class that kicks off the threads. Is there a better way to access it that wouldn't cause these issues? Is there a better way to create the threads that wouldn't cause the variable mishmash? This is supposed to run during Page_LoadComplete and then wait for the threads to complete with finished.Wait().
Edit: I'm updating the Load-Complete with this based on jjxtra's post...
int count = Convert.ToInt32(ViewState["Count"]);
var tasks = new Task[count];
for (int i = 1; i <= count; i++)
{
string k = i.ToString();
ThreadInfo threadInfo = new ThreadInfo();
threadInfo.f = k;
tasks[i - 1] = Task.Factory.StartNew(() =>
{
CalcThread(threadInfo);
});
}
Task.WaitAll(tasks);
That's what the state object is for: you can pass an object to the thread, let it modify that object in isolation, and then either execute some sort of callback to another method at the end of the thread or use the Join method to wait for the work to complete.
Instead of passing null to QueueUserWorkItem, pass an instance of a class with everything the thread needs to do its work and report results.
Having said all this, if you switch to using Task everything might be much simpler, but I'm not sure what version of .NET / .NET Core you are using...
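A minimal sketch of that idea with Task, assuming the calculation class can be instantiated once per work item. `QuoteCalculation` and its `Start` method are hypothetical stand-ins, not the asker's actual class.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the calculation class; each work item gets its own instance.
sealed class QuoteCalculation
{
    public double PartCost { get; private set; }
    public void Start(int part) => PartCost = part * 1.5;
}

static class IsolatedStateDemo
{
    public static double[] CalculateAll(int count)
    {
        // One task per part. Each lambda touches only its own QuoteCalculation,
        // so no lock is needed and results cannot overwrite each other.
        Task<double>[] tasks = Enumerable.Range(1, count)
            .Select(part => Task.Run(() =>
            {
                var calc = new QuoteCalculation();
                calc.Start(part);
                return calc.PartCost;
            }))
            .ToArray();

        Task.WaitAll(tasks);
        return tasks.Select(t => t.Result).ToArray();
    }

    static void Main()
    {
        foreach (double cost in CalculateAll(4))
            Console.WriteLine(cost);
    }
}
```

The key design choice is that nothing mutable is shared: a single shared instance such as m_calculation is what lets threads overwrite each other's fields.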
I am reading the contents of a zip file and trying to extract them.
var allZipEntries = ZipFile.Open(zipFileFullPath, ZipArchiveMode.Read).Entries;
If I extract them using a foreach loop, this works fine. The drawback is that it is equivalent to the zip.extract method, so I get no advantage when I intend to extract all the files.
foreach (var currentEntry in allZipEntries)
{
if (currentEntry.FullName.Equals(currentEntry.Name))
{
currentEntry.ExtractToFile($"{tempPath}\\{currentEntry.Name}");
}
else
{
var subDirectoryPath = Path.Combine(tempPath, Path.GetDirectoryName(currentEntry.FullName));
Directory.CreateDirectory(subDirectoryPath);
currentEntry.ExtractToFile($"{subDirectoryPath}\\{currentEntry.Name}");
}
}
To take advantage of the TPL, I tried using Parallel.ForEach, but that throws the following exception:
An exception of type 'System.IO.InvalidDataException' occurred in System.IO.Compression.dll but was not handled in user code
Additional information: A local file header is corrupt.
Parallel.ForEach(allZipEntries, currentEntry =>
{
if (currentEntry.FullName.Equals(currentEntry.Name))
{
currentEntry.ExtractToFile($"{tempPath}\\{currentEntry.Name}");
}
else
{
var subDirectoryPath = Path.Combine(tempPath, Path.GetDirectoryName(currentEntry.FullName));
Directory.CreateDirectory(subDirectoryPath);
currentEntry.ExtractToFile($"{subDirectoryPath}\\{currentEntry.Name}");
}
});
To avoid this I could use a lock, but that defeats the whole purpose.
Parallel.ForEach(allZipEntries, currentEntry =>
{
lock (thisLock)
{
if (currentEntry.FullName.Equals(currentEntry.Name))
{
currentEntry.ExtractToFile($"{tempPath}\\{currentEntry.Name}");
}
else
{
var subDirectoryPath = Path.Combine(tempPath, Path.GetDirectoryName(currentEntry.FullName));
Directory.CreateDirectory(subDirectoryPath);
currentEntry.ExtractToFile($"{subDirectoryPath}\\{currentEntry.Name}");
}
}
});
Is there any other or better way to extract the files?
ZipFile is explicitly documented as not guaranteed to be threadsafe for instance members. This is no longer mentioned on the page. Snapshot from Nov 2016.
What you're trying to do cannot be done with this library. There may be some other libraries out there which do support multiple threads per zip file, but I wouldn't expect it.
You can use multi-threading to unzip multiple files at the same time, but not for multiple entries in the same zip file.
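As a rough sketch of the "one thread per zip file" approach: each worker below calls `ZipFile.ExtractToDirectory` on a different archive, so no `ZipArchive` instance is shared between threads. The `ExtractAll` and `RoundTrip` helpers and the temp-folder layout are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

static class MultiArchiveDemo
{
    // Extracts every archive in parallel; each call opens its own archive.
    public static void ExtractAll(string[] zipPaths, string outputRoot)
    {
        Parallel.ForEach(zipPaths, zipPath =>
        {
            string target = Path.Combine(outputRoot, Path.GetFileNameWithoutExtension(zipPath));
            ZipFile.ExtractToDirectory(zipPath, target);
        });
    }

    // Builds two small archives in a temp folder, extracts them, and reads one file back.
    public static string RoundTrip()
    {
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(root);
        try
        {
            var zips = new string[2];
            for (int i = 0; i < zips.Length; i++)
            {
                zips[i] = Path.Combine(root, $"archive{i}.zip");
                using (ZipArchive zip = ZipFile.Open(zips[i], ZipArchiveMode.Create))
                using (var writer = new StreamWriter(zip.CreateEntry("hello.txt").Open()))
                    writer.Write($"from archive {i}");
            }
            ExtractAll(zips, Path.Combine(root, "out"));
            return File.ReadAllText(Path.Combine(root, "out", "archive1", "hello.txt"));
        }
        finally
        {
            Directory.Delete(root, true);
        }
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip());
    }
}
```

Whether this is actually faster than a sequential loop depends on the disk and the archives, so it is worth benchmarking before adopting it.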
Writing/reading in parallel is not a good idea as the hard drive controller will only run the requests one by one. By having multiple threads you just add overhead and queue them all up for no gain.
Try reading the file into memory first. This will avoid your exception; however, if you benchmark it you may find it is actually slower due to the overhead of the extra threads.
If the file is very large and the decompression takes a long time, running the decompressing in parallel may improve speed, however the IO read/write will not. Most decompression libraries are already multi threaded anyway, so only if this one is not will you have any performance gain from doing this.
Edit: Below is a dodgy way to make the library thread safe, by opening the archive once per thread. Depending on the zip archive it runs slower than or on par with the sequential version, which supports the point that this is not something that benefits from parallelism.
Array.ForEach(Directory.GetFiles(@"c:\temp\output\"), File.Delete);
Stopwatch timer = new Stopwatch();
timer.Start();
int numberOfThreads = 8;
var clonedZipEntries = new List<ReadOnlyCollection<ZipArchiveEntry>>();
for (int i = 0; i < numberOfThreads; i++)
{
clonedZipEntries.Add(ZipFile.Open(@"c:\temp\temp.zip", ZipArchiveMode.Read).Entries);
}
int totalZipEntries = clonedZipEntries[0].Count;
int numberOfEntriesPerThread = totalZipEntries / numberOfThreads;
Func<object,int> action = (object thread) =>
{
int threadNumber = (int)thread;
int startIndex = numberOfEntriesPerThread * threadNumber;
int endIndex = startIndex + numberOfEntriesPerThread;
if (endIndex > totalZipEntries) endIndex = totalZipEntries;
for (int i = startIndex; i < endIndex; i++)
{
Console.WriteLine($"Extracting {clonedZipEntries[threadNumber][i].Name} via thread {threadNumber}");
clonedZipEntries[threadNumber][i].ExtractToFile($@"C:\temp\output\{clonedZipEntries[threadNumber][i].Name}");
}
//Check for any remainders due to non evenly divisible size
if (threadNumber == numberOfThreads - 1 && endIndex < totalZipEntries)
{
for (int i = endIndex; i < totalZipEntries; i++)
{
Console.WriteLine($"Extracting {clonedZipEntries[threadNumber][i].Name} via thread {threadNumber}");
clonedZipEntries[threadNumber][i].ExtractToFile($@"C:\temp\output\{clonedZipEntries[threadNumber][i].Name}");
}
}
return 0;
};
//Construct the tasks
var tasks = new List<Task<int>>();
for (int threadNumber = 0; threadNumber < numberOfThreads; threadNumber++) tasks.Add(Task<int>.Factory.StartNew(action, threadNumber));
Task.WaitAll(tasks.ToArray());
timer.Stop();
var threaderTimer = timer.ElapsedMilliseconds;
Array.ForEach(Directory.GetFiles(@"c:\temp\output\"), File.Delete);
timer.Reset();
timer.Start();
var entries = ZipFile.Open(@"c:\temp\temp.zip", ZipArchiveMode.Read).Entries;
foreach (var entry in entries)
{
Console.WriteLine($"Extracting {entry.Name} via thread 1");
entry.ExtractToFile($@"C:\temp\output\{entry.Name}");
}
timer.Stop();
Console.WriteLine($"Threaded version took: {threaderTimer} ms");
Console.WriteLine($"Non-Threaded version took: {timer.ElapsedMilliseconds} ms");
Console.ReadLine();
I've written a simple for loop iterating through an array and a Parallel.ForEach loop doing the same thing. However, the results I get are different, so I want to ask what the heck is going on? :D
class Program
{
static void Main(string[] args)
{
long creating = 0;
long reading = 0;
long readingParallel = 0;
for (int j = 0; j < 10; j++)
{
Stopwatch timer1 = new Stopwatch();
Random rnd = new Random();
int[] array = new int[100000000];
timer1.Start();
for (int i = 0; i < 100000000; i++)
{
array[i] = rnd.Next(5);
}
timer1.Stop();
long result = 0;
Stopwatch timer2 = new Stopwatch();
timer2.Start();
for (int i = 0; i < 100000000; i++)
{
result += array[i];
}
timer2.Stop();
Stopwatch timer3 = new Stopwatch();
long result2 = 0;
timer3.Start();
Parallel.ForEach(array, (item) =>
{
result2 += item;
});
if (result != result2)
{
Console.WriteLine(result + " - " + result2);
}
timer3.Stop();
creating += timer1.ElapsedMilliseconds;
reading += timer2.ElapsedMilliseconds;
readingParallel += timer3.ElapsedMilliseconds;
}
Console.WriteLine("Create : \t" + creating / 100);
Console.WriteLine("Read: \t\t" + reading / 100);
Console.WriteLine("ReadP: \t\t" + readingParallel / 100);
Console.ReadKey();
}
}
So in the condition I get results:
result = 200009295;
result2 = 35163054;
Is there anything wrong?
The += operator is non-atomic and actually performs multiple operations:
1. load the current value of result2 from memory into a register
2. add item to that loaded value (I'm simplifying here)
3. write the sum back to result2
Since a lot of these add operations will be running in parallel it is not just possible, but likely that there will be races between some of these operations where one thread reads a result value and performs the addition, but before it has the chance to write it back, another thread grabs the old result value (which hasn't yet been updated) and also performs the addition. Then both threads write their respective values to result. Regardless of which one wins the race, you end up with a smaller number than expected.
This is why the Interlocked class exists.
Your code could very easily be fixed:
Parallel.ForEach(array, (item) =>
{
Interlocked.Add(ref result2, item);
});
Don't be surprised if Parallel.ForEach ends up slower than the fully synchronous version in this case though. This is due to the fact that
the amount of work inside the delegate you pass to Parallel.ForEach is very small
Interlocked methods incur a slight but non-negligible overhead, which will be quite noticeable in this particular case
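One way to reduce that overhead, sketched here as an alternative rather than as part of the fix above, is the `Parallel.ForEach` overload that takes `localInit` and `localFinally` delegates. Each worker keeps a private running sum and touches the shared total only once at the end. The `ParallelSum` helper name is made up for this example.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class PartialSumDemo
{
    public static long ParallelSum(int[] array)
    {
        long total = 0;
        Parallel.ForEach(
            array,
            () => 0L,                                          // initial value of each worker's private sum
            (item, loopState, localSum) => localSum + item,    // no shared state touched per element
            localSum => Interlocked.Add(ref total, localSum)); // one atomic add per worker
        return total;
    }

    static void Main()
    {
        int[] data = Enumerable.Range(1, 1000).ToArray();
        Console.WriteLine(ParallelSum(data)); // prints 500500
    }
}
```

The result matches the sequential sum because every element is added to exactly one private sum, and every private sum is added to the total exactly once.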
I have a Web Service that loads data from the database server into a local database, requesting 100 records per call.
Since the process is slow, I want to create ten threads (no more, so as not to use too much memory) that make the Web Service calls, and when one of the threads finishes, have it request the next 100 records. How do I do the threading part?
Example:
Create thread 1
Create thread 2
Create thread 3
Create thread 4
Thread 1 completes and calls the Web Service again
Edit
My code is not working. The variable send always gets the value 10 instead of 0, 1, 2, 3, 4, etc.
Int32 page = 0;
do
{
for (int iterator=0; iterator < 10; iterator++)
{
listTask[iterator] = Task.Factory.StartNew(() =>
{
Int32 send = iterator + page * 10;
DoStatus("Page: " + send.ToString());
Processamento(parametros, filial, send);
});
}
Task.WaitAll(listTask);
page++;
}
while (true); // Test only
You're closing over the loop variable. You need to remember that lambdas close over variables not over values. Your tasks will each read the value of iterator at the time that the lambda executes iterator + page * 10. By the time that that happens the main thread has already incremented it to 10.
This is simple enough to resolve. Make a copy of the loop variable inside of your for loop so that the closure closes over that variable, which never changes.
for (int iterator=0; iterator < 10; iterator++)
{
int i = iterator;
listTask[iterator] = Task.Factory.StartNew(() =>
{
Int32 send = i + page * 10;
DoStatus("Page: " + send.ToString());
Processamento(parametros, filial, send);
});
}
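The difference is easy to reproduce without any threads. In this small sketch (the method names are made up for illustration), the lambdas are only invoked after the loop has finished, which is effectively what happens when tasks start late.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ClosureDemo
{
    public static int[] CaptureLoopVariable()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            funcs.Add(() => i);                  // every lambda shares the single variable i
        return funcs.Select(f => f()).ToArray(); // evaluated after the loop: 3, 3, 3
    }

    public static int[] CaptureCopy()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;                        // a fresh variable for each iteration
            funcs.Add(() => copy);
        }
        return funcs.Select(f => f()).ToArray(); // 0, 1, 2
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", CaptureLoopVariable())); // prints 3,3,3
        Console.WriteLine(string.Join(",", CaptureCopy()));         // prints 0,1,2
    }
}
```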
If I understand your question, you want to create 10 threads, wait for all of them, then create 10 more threads, and so on. Each thread loads 100 results.
In this answer, results are String but that can be changed.
private void Load()
{
Boolean loading = true;
List<String> listResult = new List<String>();
Int32 boucle = 0;
Task[] listTask = new Task[10];
do
{
// create 10 threads (=1000 results)
for (int iterator=0; iterator < 10; iterator++)
{
// [0-99] [100-199] [200-299] ...
Int32 start = 100 * iterator + 1000 * boucle;
Int32 end = start + 99;
listTask[iterator] = Task<List<String>>.Factory.StartNew(() =>
{
List<String> data = LoadData(start, end);
return data;
});
}
// wait for 10 threads to finish
Task.WaitAll(listTask);
// collapse results
for (int i=0; i < 10; i++)
{
listResult.AddRange((listTask[i] as Task<List<String>>).Result);
}
// check if there is 100 results in last thread
loading = (listTask[9] as Task<List<String>>).Result.Count == 100;
// ready for another iteration (next 1000 results)
boucle++;
}
while (loading);
}
private List<string> LoadData(int p1, int p2)
{
// TODO : load data from p1 to p2
throw new NotImplementedException();
}
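If the goal is instead for a worker to pick up the next page as soon as it finishes, rather than waiting for the whole batch of ten, one possible sketch uses `Parallel.For` with `MaxDegreeOfParallelism`. This assumes the number of pages is known up front, which the code above does not require. `LoadPage` is a hypothetical stand-in for the web service call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class BoundedLoadDemo
{
    // Hypothetical stand-in for the web service call: returns the record numbers of one page.
    static int[] LoadPage(int page, int pageSize)
    {
        Thread.Sleep(10); // simulate network latency
        var records = new int[pageSize];
        for (int i = 0; i < pageSize; i++) records[i] = page * pageSize + i;
        return records;
    }

    // Loads all pages with at most maxWorkers calls in flight; returns the record count.
    public static int LoadAll(int pageCount, int pageSize, int maxWorkers)
    {
        var results = new ConcurrentBag<int>(); // thread-safe collection for the merged results
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxWorkers };
        Parallel.For(0, pageCount, options, page =>
        {
            foreach (int record in LoadPage(page, pageSize))
                results.Add(record);
        });
        return results.Count;
    }

    static void Main()
    {
        Console.WriteLine(LoadAll(25, 100, 10)); // prints 2500
    }
}
```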
I have an array of 2863 objects. I want to read the array data in two "runs" of 1000 objects each, using 4 threads per run (the number of CPUs in the PC it runs on).
Currently my source code partitions the data into the correct number of threads and runs:
Single run size (default) = 1000 elements
Number of runs = 2
Extra thread run size = 866 elements
Starting run [1 / 2]
Thread as readDCMTags(i=0,firstIndex=0, lastIndex=249
Thread as readDCMTags(i=1,firstIndex=250, lastIndex=499
Thread as readDCMTags(i=2,firstIndex=500, lastIndex=749
Thread as readDCMTags(i=3,firstIndex=750, lastIndex=999
Starting run [2 / 2]
Thread as readDCMTags(i=0,firstIndex=1000, lastIndex=1249
Thread as readDCMTags(i=1,firstIndex=1250, lastIndex=1499
Thread as readDCMTags(i=2,firstIndex=1500, lastIndex=1749
Thread as readDCMTags(i=3,firstIndex=1750, lastIndex=1999
Extra Thread as readDCMTags(i=1,firstIndex=2000, lastIndex=2865
However, the current source code starts all threads at once; it does not wait for a run to end. When I join the threads from the current run, the GUI hangs. How can I solve this?
Source Code is:
nrOfChunks = 2866 / 1000;
int leftOverChunk = 2866 % 1000;
for(int z = 0; z < nrOfChunks; z++)
{
addToStatusPanel("\nStarting run [" + (z+1).ToString() + " / " + nrOfChunks.ToString() + "]");
int indexesPerThread = 1000 / 5; // nrOfThreads
int leftOverIndexes = 1000 % 5; // nrOfThreads
threads = new Thread[nrOfThreads];
threadProgress = new int[nrOfThreads];
threadDCMRead = new int[nrOfThreads];
for(int i = 0; i < nrOfThreads; i++)
{
int firstIndex = (i * indexesPerThread+z*Convert.ToInt32(chunkSizeTextBox.Text));
int lastIndex = firstIndex + indexesPerThread - 1;
if(i == (nrOfThreads - 1))
{
lastIndex += leftOverIndexes;
}
addToStatusPanel("readDCMTags(i=" + i.ToString() + ",firstIndex=" + firstIndex.ToString() + ", lastIndex=" + lastIndex.ToString());
int threadIndex = i; // copy, so the lambda does not capture the changing loop variable
threads[i] = new Thread(() => readDCMTags(threadIndex.ToString(), firstIndex, lastIndex));
threads[i].Name = i.ToString();
threads[i].Start();
}
if(z == (nrOfChunks - 1))
{
int firstIndex = (nrOfChunks * Convert.ToInt32(chunkSizeTextBox.Text));
int lastIndex = firstIndex + leftOverChunk - 1;
addToStatusPanel("readDCMTags(i=" + z.ToString() + ",firstIndex=" + firstIndex.ToString() + ", lastIndex=" + lastIndex.ToString());
}
}
Adding a join over the threads array after the inner loop for(int i = 0; i < nrOfThreads; i++), before the outer loop for(int z = 0; z < nrOfChunks; z++) moves on to the next run, hangs the GUI.
By definition, if you wait, you block (the current executing threads blocks waiting for something else).
What you want, instead, is for something to happen when all threads are finished. That, "all threads have finished" is an event. So your best option will be to wait in a background thread and fire the event when all threads complete.
If the GUI is interested on that, then the GUI thread will need to subscribe to that particular event.
Edit: Pseudocode (not tested, just the idea).
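var waitBg = new Thread(() =>
{
// Join blocks only this background thread, not the GUI thread
foreach (Thread thread in threads)
thread.Join();
// All threads have finished
if (allThreadFinishedEvent != null)
allThreadFinishedEvent();
}
);
waitBg.IsBackground = true;
waitBg.Start();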
Then, on the handler for the allThreadFinishedEvent you do whatever you want to do (remember to dispatch it to the main thread if you want to change something in the UI, as that'll be executed in the bg thread context).
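A console-only sketch of this pattern, under the assumption that the workers are plain `Thread` objects. The `AllThreadsFinished` event and `RunAndNotify` helper are hypothetical names. The `ManualResetEventSlim` exists only so the demo can observe the notification; a GUI would simply return to its message loop and let the handler marshal back to the UI thread.

```csharp
using System;
using System.Threading;

static class RunCompletionDemo
{
    public static event Action AllThreadsFinished;

    // Starts the workers, waits for them on a background thread, and raises the event.
    // Returns how many workers had completed when the event fired.
    public static int RunAndNotify(int workerCount)
    {
        int completed = 0;
        var threads = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                Thread.Sleep(20);                       // simulated work
                Interlocked.Increment(ref completed);
            });
            threads[i].Start();
        }

        using (var notified = new ManualResetEventSlim(false))
        {
            int seenAtNotification = -1;
            Action handler = () =>
            {
                seenAtNotification = Volatile.Read(ref completed);
                notified.Set();
            };
            AllThreadsFinished += handler;

            // The blocking Join calls happen here, off the caller's thread.
            var waiter = new Thread(() =>
            {
                foreach (Thread t in threads) t.Join();
                AllThreadsFinished?.Invoke();
            }) { IsBackground = true };
            waiter.Start();

            notified.Wait();                            // demo only: a GUI would not block here
            AllThreadsFinished -= handler;
            return seenAtNotification;
        }
    }

    static void Main()
    {
        Console.WriteLine(RunAndNotify(4)); // prints 4
    }
}
```

To process runs one after another, the handler for one run could start the threads of the next run.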
How about placing this threading logic in a BackgroundWorker and sending the result of the BackgroundWorker back to your interface? This way, your interface will not be locked while the program is processing the threads.
You can find the MSDN example of initializing and using the BackgroundWorker here.
I think that should be the correct way forward.