I have a C# program that currently downloads data from several sites synchronously and then does some work on the downloaded data. I am trying to change this so that the downloads run asynchronously and the data is processed afterwards, but I am having trouble with the sequencing. Below is a snapshot of the code I am using:
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Started URL downloader");
UrlDownloader d = new UrlDownloader();
d.Process();
Console.WriteLine("Finished URL downloader");
Console.ReadLine();
}
}
class UrlDownloader
{
public void Process()
{
List<string> urls = new List<string>() {
"http://www.stackoverflow.com",
"http://www.microsoft.com",
"http://www.apple.com",
"http://www.google.com"
};
foreach (var url in urls)
{
WebClient Wc = new WebClient();
Wc.OpenReadCompleted += new OpenReadCompletedEventHandler(DownloadDataAsync);
Uri varUri = new Uri(url);
Wc.OpenReadAsync(varUri, url);
}
}
void DownloadDataAsync(object sender, OpenReadCompletedEventArgs e)
{
StreamReader k = new StreamReader(e.Result);
string temp = k.ReadToEnd();
PrintWebsiteTitle(temp, e.UserState as string);
}
void PrintWebsiteTitle(string temp, string source)
{
Regex reg = new Regex(@"<title[^>]*>(.*)</title[^>]*>");
string title = reg.Match(temp).Groups[1].Value;
Console.WriteLine(new string('*', 10));
Console.WriteLine("Source: {0}, Title: {1}", source, title);
Console.WriteLine(new string('*', 10));
}
}
Essentially, my problem is this. My output from above is:
Started URL downloader
Finished URL downloader
"Results of d.Process()"
What I want to do is complete the d.Process() method and then return to the "Main" method in my Program class. So, the output I am looking for is:
Started URL downloader
"Results of d.Process()"
Finished URL downloader
My d.Process() method runs asynchronously, but I can't figure out how to wait for all of my processing to complete before returning to my Main method. Any ideas on how to do this in C# 4.0? I am not sure how I'd go about 'telling' my Process() method to wait until all its asynchronous activity is complete before returning to the Main method.
If you are on .NET 4.0 or later, you can use the TPL. Parallel.ForEach blocks until every iteration has finished:
Parallel.ForEach(urls, url =>
{
WebClient Wc = new WebClient();
string page = Wc.DownloadString(url);
PrintWebsiteTitle(page);
});
I would also use HtmlAgilityPack to parse the page instead of regex.
void PrintWebsiteTitle(string page)
{
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(page);
Console.WriteLine(doc.DocumentNode.Descendants("title").First().InnerText);
}
I would recommend using WebClient.DownloadDataAsync instead of writing your own. You could then use the Task Parallel Library to wrap the call to DownloadDataAsync in a TaskCompletionSource to get multiple Task objects you can wait on or continue with:
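var tcs = new TaskCompletionSource<byte[]>();
// Subscribe before starting the download so the completion event cannot be missed.
webClient.DownloadDataCompleted += (s, e) =>
{
tcs.TrySetResult(e.Result);
};
webClient.DownloadDataAsync(myUri);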
if (wait)
{
tcs.Task.Wait();
Console.WriteLine("got {0} bytes", tcs.Task.Result.Length);
}
else
{
tcs.Task.ContinueWith(t => Console.WriteLine("got {0} bytes", t.Result.Length));
}
To handle error conditions, you can expand the use of the TaskCompletionSource:
webClient.DownloadDataCompleted += (s, e) =>
{
if(e.Error != null) tcs.SetException(e.Error);
else if(e.Cancelled) tcs.SetCanceled();
else tcs.TrySetResult(e.Result);
};
To do something similar with multiple tasks:
Task.WaitAll(tcs.Task, tcs2.Task);
or
Task.Factory.ContinueWhenAll(new Task[] {tcs.Task, tcs2.Task}, ts =>
{
/* do something with all the results */
});
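Putting the pieces above together, a minimal sketch might look like the following. The class name DownloadTasks and the methods DownloadAsTask and DownloadAll are hypothetical names chosen for illustration; only WebClient, TaskCompletionSource and Task.WaitAll come from the framework. It assumes a console application, where blocking the calling thread is acceptable.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

public static class DownloadTasks
{
    // Hypothetical helper: exposes one event-based WebClient download as a Task.
    public static Task<byte[]> DownloadAsTask(Uri uri)
    {
        var tcs = new TaskCompletionSource<byte[]>();
        var client = new WebClient();
        // Subscribe before starting the download so completion cannot be missed.
        client.DownloadDataCompleted += (s, e) =>
        {
            if (e.Error != null) tcs.TrySetException(e.Error);
            else if (e.Cancelled) tcs.TrySetCanceled();
            else tcs.TrySetResult(e.Result);
            client.Dispose();
        };
        client.DownloadDataAsync(uri);
        return tcs.Task;
    }

    // Starts every download, then blocks until all of them have finished.
    public static byte[][] DownloadAll(params Uri[] uris)
    {
        Task<byte[]>[] tasks = uris.Select(u => DownloadAsTask(u)).ToArray();
        Task.WaitAll(tasks); // throws AggregateException if any download failed
        return tasks.Select(t => t.Result).ToArray();
    }
}
```

Calling DownloadTasks.DownloadAll from Process() would keep Main from printing "Finished URL downloader" until every download has completed. In a UI application, blocking like this on the UI thread is best avoided; a continuation is the safer choice there.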
I have a C# form with a web browser control on it. I want to open a URL (e.g. www.google.com) in a loop. On each iteration I want to navigate to the URL, fill in a search string, click the search button, and wait until the search results have loaded fully.
How can I do this?
I wrote this code to save the URL we get after the search results load, but only the search result for the last string seems to load and get saved in my list.
private void button1_Click(object sender, EventArgs e)
{
var task = DoNavigationAsync();
task.ContinueWith((t) =>
{
MessageBox.Show("Done!");
}, TaskScheduler.FromCurrentSynchronizationContext());
}
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
HtmlElement url = webBrowser1.Document.GetElementById("sb_form_q");
if (url != null)
{
url.SetAttribute("value", search[searchindx-1]);
webBrowser1.Document.GetElementById("sb_form_go").InvokeMember("click");
}
if (webBrowser1.Url.ToString() != "http://www.bing.com/")
{
SavedUrl.Add(webBrowser1.Url.ToString());
}
}
async Task DoNavigationAsync()
{
TaskCompletionSource<bool> tcsNavigation = null;
TaskCompletionSource<bool> tcsDocument = null;
this.webBrowser1.Navigated += (s, e) =>
{
if (tcsNavigation.Task.IsCompleted)
return;
tcsNavigation.SetResult(true);
};
this.webBrowser1.DocumentCompleted += (s, e) =>
{
if (this.webBrowser1.ReadyState != WebBrowserReadyState.Complete)
return;
if (tcsDocument.Task.IsCompleted)
return;
tcsDocument.SetResult(true);
};
search = new string[3];
search[0] = "C";
search[1] = "C++";
search[2] = "C#";
searchindx = 0;
foreach (string sval in search)
{
searchindx++;
tcsNavigation = new TaskCompletionSource<bool>();
tcsDocument = new TaskCompletionSource<bool>();
webBrowser1.Navigate("www.bing.com");
await tcsNavigation.Task;
await tcsDocument.Task;
}
}
Using the async HttpClient from .NET Framework 4.5, you can load a web page without using a GUI element such as WebBrowser.
A download would look like this:
using (HttpClient client = new HttpClient()) {
await client.GetStringAsync("https://google.com");
}
This would get you the HTML content of the Google search page.
But if you just want the resulting URL, you don't even need to perform a download, because Google (like most other search engines) takes the search term as a URL parameter. Note the following Google URL: https://www.google.com/search?q=google. You can see that the search string "google" appears as the parameter named "q". Terms such as "C++" and "C#" contain characters that are reserved in URLs, so escape them first. So if you build your code like this...
string[] search = new string[] { "C", "C++", "C#" };
foreach (string sval in search)
{
// Escape reserved characters such as '+' and '#'.
string term = Uri.EscapeDataString(sval);
// C# <= 5
SavedUrl.Add(string.Format("https://google.com/search?q={0}", term));
// C# 6 (alternative to the line above; use one or the other)
// SavedUrl.Add($"https://google.com/search?q={term}");
}
... you won't need any web access.
I'm making a WPF application where I use WebClient to download files from a web server. I have a list of URLs for all the files I have to download. I use a foreach to loop through the URLs and download the files one at a time: the first download must complete before the next one starts. I know the size of each file. Is there a way to make e.ProgressPercentage reflect the total size of all files, instead of going from 0 to 100% for each file? I know that I'm calling DownloadProtocol for each URL right now, which creates a new WebClient instance each time, but it is the only way I can think of to download one file at a time.
public async Task DownloadStart()
{
foreach(var url in ListOfDownloadURL)
{
DownloadGameFile dlg = new DownloadGameFile();
await dlg.DownloadProtocol(url, myLocation);
}
}
Download function in DownloadGameFile class:
public async Task DownloadProtocol(string address, string location)
{
Uri Uri = new Uri(address);
using (WebClient client = new WebClient())
{
//client.DownloadFileCompleted += new AsyncCompletedEventHandler(Completed);
//client.DownloadProgressChanged += new DownloadProgressChangedEventHandler(DownloadProgress);
client.DownloadProgressChanged += (o, e) =>
{
Console.WriteLine(e.BytesReceived + " " + e.ProgressPercentage);
//ProgressBar = e.ProgressPercentage (total)
};
client.DownloadFileCompleted += (o, e) =>
{
if (e.Cancelled == true)
{
Console.WriteLine("Download has been canceled.");
}
else
{
Console.WriteLine("Download completed!");
}
};
await client.DownloadFileTaskAsync(Uri, location);
}
}
Why not take the easy way out and just update the progress when each file completes? Something like...
ProgressBar p = new ProgressBar();
p.Maximum = ListOfDownloadURL.Count();
foreach(var url in ListOfDownloadURL)
{
DownloadGameFile dlg = new DownloadGameFile();
await dlg.DownloadProtocol(url, myLocation);
p.Value += 1;
}
Or, if you insist, you could query the file sizes before you begin downloading, sum the total bytes of all the files, and then calculate the percentage whenever DownloadProgressChanged is fired.
var bytes = Convert.ToInt64(client.ResponseHeaders["Content-Length"]);
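A minimal sketch of that calculation, assuming the size of every file is known up front (as the question states). The class OverallProgress and its members are hypothetical names used only for illustration; nothing here is part of WebClient.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that tracks progress across several sequential downloads.
public sealed class OverallProgress
{
    private readonly long totalBytes;
    private long completedBytes;

    public OverallProgress(IEnumerable<long> fileSizes)
    {
        totalBytes = fileSizes.Sum();
    }

    // Overall percentage, given the bytes received so far for the current file.
    public int Percentage(long bytesReceivedForCurrentFile)
    {
        if (totalBytes <= 0) return 0;
        long done = completedBytes + bytesReceivedForCurrentFile;
        return (int)Math.Min(100, done * 100 / totalBytes);
    }

    // Call once a file has finished so its bytes count towards later files.
    public void FileCompleted(long fileSize)
    {
        completedBytes += fileSize;
    }
}
```

Inside the DownloadProgressChanged handler you could then set the progress bar to Percentage(e.BytesReceived), and call FileCompleted with the file's size after each awaited download returns.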
I'm trying to get the current user's network download speed. After hitting a dead end with NetworkInterfaces and the like, I tried a solution I found online. I edited it a bit and it works great, but it's not asynchronous.
public static void GetDownloadSpeed(this Label lbl)
{
double[] speeds = new double[5];
for (int i = 0; i < 5; i++)
{
int fileSize = 407; //Size of File in KB.
WebClient client = new WebClient();
DateTime startTime = DateTime.Now;
if (!Directory.Exists($"{CurrentDir}/tmp/speedtest"))
Directory.CreateDirectory($"{CurrentDir}/tmp/speedtest");
client.DownloadFile(new Uri("https://ajax.googleapis.com/ajax/libs/threejs/r69/three.min.js"), "/tmp/speedtest/three.min.js");
DateTime endTime = DateTime.Now;
speeds[i] = Math.Round((fileSize / (endTime - startTime).TotalSeconds));
}
lbl.Text = string.Format("{0}KB/s", speeds.Average());
}
That function is called within a timer at an interval of 2 minutes.
MyLbl.GetDownloadSpeed()
I've tried using WebClient.DownloadFileAsync, but that just shows the infinity symbol. My next try would be to use HttpClient, but before I go on, does anyone have a recommended way of getting the current user's download speed asynchronously (without lagging the main GUI thread)?
As was suggested, you could make an async version of GetDownloadSpeed():
// Extension methods must be static and declared in a static class.
public static async void GetDownloadSpeedAsync(this Label lbl, Uri address, int numberOfTests)
{
string directoryName = @"C:\Work\Test\speedTest";
string fileName = "tmp.dat";
if (!Directory.Exists(directoryName))
Directory.CreateDirectory(directoryName);
Stopwatch timer = new Stopwatch();
timer.Start();
for (int i = 0; i < numberOfTests; ++i)
{
using (WebClient client = new WebClient())
{
await client.DownloadFileTaskAsync(address, Path.Combine(directoryName, fileName));
}
}
lbl.Text = Convert.ToString(timer.Elapsed.TotalSeconds / numberOfTests);
}
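Note that the method above displays the average number of seconds per download rather than a speed. A minimal sketch of turning a timed download into KB/s with HttpClient, which the question mentions as a possible next step, could look like this. The class SpeedTest and its method names are hypothetical; the resource is downloaded into memory instead of to disk.

```csharp
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

public static class SpeedTest
{
    // Converts a byte count and an elapsed time into kilobytes per second.
    public static double KilobytesPerSecond(long bytes, TimeSpan elapsed)
    {
        if (elapsed.TotalSeconds <= 0) return 0;
        return (bytes / 1024.0) / elapsed.TotalSeconds;
    }

    // Downloads the resource into memory once and returns the measured speed.
    public static async Task<double> MeasureAsync(HttpClient client, Uri address)
    {
        Stopwatch timer = Stopwatch.StartNew();
        byte[] data = await client.GetByteArrayAsync(address);
        timer.Stop();
        return KilobytesPerSecond(data.Length, timer.Elapsed);
    }
}
```

Because the actual byte count is measured, the hard-coded file size from the question is no longer needed. Averaging several MeasureAsync results would smooth out short-term fluctuations.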
The WebClient class, being relatively old, does not have an awaitable DownloadFileAsync().
EDITED
As was correctly pointed out, WebClient does in fact have a task-based async method, DownloadFileTaskAsync(), which I recommend using. The code below can still help in cases where no Task-returning async method is provided.
We can fix it with the help of TaskCompletionSource<T>:
public static class WebClientExtensions
{
public static Task DownloadFileAwaitableAsync(this WebClient instance, Uri address,
string fileName, CancellationToken cancellationToken)
{
TaskCompletionSource<object> tcs = new TaskCompletionSource<object>();
// Subscribe for completion event
instance.DownloadFileCompleted += instance_DownloadFileCompleted;
// Setup cancellation
var cancellationRegistration = cancellationToken.CanBeCanceled ? (IDisposable)cancellationToken.Register(() => { instance.CancelAsync(); }) : null;
// Initiate asynchronous download
instance.DownloadFileAsync(address, fileName, Tuple.Create(tcs, cancellationRegistration));
return tcs.Task;
}
static void instance_DownloadFileCompleted(object sender, System.ComponentModel.AsyncCompletedEventArgs e)
{
// Unsubscribe from the same event that was subscribed to above.
((WebClient)sender).DownloadFileCompleted -= instance_DownloadFileCompleted;
var data = (Tuple<TaskCompletionSource<object>, IDisposable>)e.UserState;
if (data.Item2 != null) data.Item2.Dispose();
var tcs = data.Item1;
if (e.Cancelled)
{
tcs.TrySetCanceled();
}
else if (e.Error != null)
{
tcs.TrySetException(e.Error);
}
else
{
tcs.TrySetResult(null);
}
}
}
Try `await Task.Run(() => { /* your code */ });`
Edit: @JustDevInc I still think you should use DownloadAsync. Task.Run(delegate) runs the delegate on a thread-pool thread, and you might want to avoid that. If you want, post some of your old code so we can try to fix it.
Edit: The first solution turned out to be the only one of the two that works. DownloadFileAsync doesn't return a Task, so it can't be awaited.
This program reads a list of web sites and then saves them.
I found that it runs well for the first two URL requests, then becomes very slow (about 5 minutes per request).
The time spent on row 1 and row 2 is only about 2 seconds each.
All the others then take about 5 minutes each.
When I debug, I see that the time is actually spent in wb.Navigate(url.ToString());
public static async Task<bool> test()
{
long totalCnt = rows.Count();
long procCnt = 0;
foreach (string url in rows)
{
procCnt++;
string webStr = load_WebStr(url).Result;
Console.WriteLine(DateTime.Now+ "["+procCnt + "/" + totalCnt+"] "+url);
}
return true;
}
public static async Task<string> load_WebStr(string url)
{
var tcs = new TaskCompletionSource<string>();
var thread = new Thread(() =>
{
EventHandler idleHandler = null;
idleHandler = async (s, e) =>
{
// handle Application.Idle just once
Application.Idle -= idleHandler;
// return to the message loop
await Task.Yield();
// and continue asynchronously
// propogate the result or exception
try
{
var result = await webBrowser_Async(url);
tcs.SetResult(result);
}
catch (Exception ex)
{
tcs.SetException(ex);
}
// signal to exit the message loop
// Application.Run will exit at this point
Application.ExitThread();
};
// handle Application.Idle just once
// to make sure we're inside the message loop
// and SynchronizationContext has been correctly installed
Application.Idle += idleHandler;
Application.Run();
});
// set STA model for the new thread
thread.SetApartmentState(ApartmentState.STA);
// start the thread and await for the task
thread.Start();
try
{
return await tcs.Task;
}
finally
{
thread.Join();
}
}
public static async Task<string> webBrowser_Async(string url)
{
string result = "";
using (var wb = new WebBrowser())
{
wb.ScriptErrorsSuppressed = true;
TaskCompletionSource<bool> tcs = null;
WebBrowserDocumentCompletedEventHandler documentCompletedHandler = (s, e) =>
tcs.TrySetResult(true);
tcs = new TaskCompletionSource<bool>();
wb.DocumentCompleted += documentCompletedHandler;
try
{
wb.Navigate(url.ToString());
// await for DocumentCompleted
await tcs.Task;
}
catch
{
Console.WriteLine("BUG!");
}
finally
{
wb.DocumentCompleted -= documentCompletedHandler;
}
// the DOM is ready
result = wb.DocumentText;
}
return result;
}
I recognize a slightly modified version of the code I used to answer quite a few WebBrowser-related questions. Was it this one? It's always a good idea to include a link to the original source.
Anyhow, the major problem in how you're using it here is perhaps the fact that you create and destroy an instance of WebBrowser control for every URL from your list.
Instead, you should be re-using a single instance of WebBrowser (or a pool of WebBrowser objects). You can find both versions here.
How can I pause the loop in someRandomMethod() until the code in DownloadCompleted() has been executed? The code below only unpacks the latest version in the versions array. It's as if the loop is faster than the first download, so m_CurrentlyDownloading already holds the latest value the first time DownloadCompleted() is executed.
private void someRandomMethod() {
for (int i = 0; i < versions.Count; i++)
{
//ClearInstallFolder();
m_CurrentlyDownloading = versions.ElementAt(i);
Download(versions.ElementAt(i));
LocalUpdate(versions.ElementAt(i));
System.Threading.Thread.Sleep(500);
}
}
private void Download(string p_Version)
{
string file = p_Version + ".zip";
string url = @"http://192.168.56.5/" + file;
//client is global in the class
client = new WebClient();
client.DownloadFileCompleted += new AsyncCompletedEventHandler(DownloadCompleted);
client.DownloadProgressChanged += new DownloadProgressChangedEventHandler(DownloadProgressChanged);
client.DownloadFileAsync(new Uri(url), @"C:\tmp\" + file);
}
private void DownloadCompleted(object sender, AsyncCompletedEventArgs e)
{
if (e.Error == null)
{
Unpack(m_CurrentlyDownloading);
if (GetInstalledVersion() == GetLatestVersion())
ClearZipFiles();
}
else
MessageBox.Show(e.Error.ToString());
}
The easiest way would be to not use the *Async methods. The normal DownloadFile blocks until the download completes.
But if you've got access to the await keyword, try this.
private async Task Download(string p_Version)
{
string file = p_Version + ".zip";
string url = @"http://192.168.56.5/" + file;
//client is global in the class
client = new WebClient();
client.DownloadFileCompleted += new AsyncCompletedEventHandler(DownloadCompleted);
client.DownloadProgressChanged += new DownloadProgressChangedEventHandler(DownloadProgressChanged);
// DownloadFileAsync returns void and cannot be awaited; DownloadFileTaskAsync returns a Task.
await client.DownloadFileTaskAsync(new Uri(url), @"C:\tmp\" + file);
}
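With an awaitable Download, the calling loop also has to await each download before moving on. A minimal sketch of that sequencing follows; the class SequentialUpdater and the delegate parameters download and unpack are hypothetical stand-ins for the Download and Unpack methods in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialUpdater
{
    // Downloads and unpacks each version in order, one at a time.
    public static async Task ProcessSequentiallyAsync(
        IEnumerable<string> versions,
        Func<string, Task> download,
        Action<string> unpack)
    {
        foreach (string version in versions)
        {
            // The loop does not advance until this download has finished.
            await download(version);
            // Unpack the version that was just downloaded, not a shared field.
            unpack(version);
        }
    }
}
```

Passing the version into unpack directly avoids relying on a shared field such as m_CurrentlyDownloading, which the loop may already have overwritten by the time a completion handler runs.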
Something like this can be used to wait.
Make it a class field:
bool IsDownloadCompleted = false;
Add this in the DownloadCompleted event handler:
IsDownloadCompleted = true;
And this where you want to pause the loop:
while (!IsDownloadCompleted)
{
Application.DoEvents();
}
Create a boolean variable, a delegate, and get/set methods for this variable.
Then, in the loop, do something like:
while (!isDownLoadCompleted) Thread.Sleep(1024);
You can use Parallel.ForEach. This loop waits until all iterations are done.
See here for how to use it:
http://msdn.microsoft.com/tr-tr/library/dd460720(v=vs.110).aspx
or
http://blogs.msdn.com/b/pfxteam/archive/2012/03/05/10278165.aspx