I am testing the validity of a large list of proxy servers concurrently. During this testing, many exceptions are raised and caught. Although the testing runs on a background thread, my UI becomes unresponsive unless I use a SemaphoreSlim object to limit the concurrency.
I know this is a self-imposed bottleneck, and since I need to scale to an even larger list of proxies, I was hoping there might be a better way to solve the problem.
private void ValidateProxiesButton_Click(object sender, EventArgs e)
{
new Thread(async () =>
{
Thread.CurrentThread.IsBackground = true;
await ValidateProxiesAsync(proxies, judges, tests, 10);
}).Start();
}
public async Task ValidateProxiesAsync(IEnumerable<Proxy> proxies, IEnumerable<ProxyJudge> judges, IEnumerable<ProxyTest> tests = null, int maxConcurrency = 20)
{
if (proxies.Count() == 0)
{
throw new ArgumentException("Proxy list empty.");
}
foreach (var proxy in proxies)
{
proxy.Status = ProxyStatus.Queued;
}
//Get external IP to check if proxy is anonymous.
var publicIp = await WebUtility.GetPublicIP();
foreach (var judge in judges)
{
judge.Invalidation = publicIp;
}
await ValidateTestsAsync(judges.ToList<IProxyTest>());
var validJudges = judges.ToList<IProxyTest>().GetValidTests();
if (validJudges.Count == 0)
{
throw new ArgumentException("No valid judges found.");
}
if (tests != null)
{
await ValidateTestsAsync(tests.ToList<IProxyTest>());
}
var semaphore = new SemaphoreSlim(maxConcurrency);
var tasks = new List<Task>();
foreach (var proxy in proxies)
{
tasks.Add(Task.Run(async () =>
{
await semaphore.WaitAsync();
try
{
proxy.Status = ProxyStatus.Testing;
var isValid = await proxy.TestValidityAsync((IProxyTest)validJudges.GetRandomItem());
proxy.Status = isValid ? ProxyStatus.Valid : ProxyStatus.Invalid;
}
finally
{
//Always release the slot, even if the test throws.
semaphore.Release();
}
}));
}
await Task.WhenAll(tasks);
}
Inside the proxy.TestValidityAsync method:
public async Task<bool> TestValidityAsync(IProxyTest test, int timeoutSeconds = 30)
{
try
{
var req = WebRequest.Create(test.URL);
req.Proxy = new WebProxy(this.ToString());
var respBody = await WebUtility.GetResponseStringAsync(req).TimeoutAfter(new TimeSpan(0, 0, timeoutSeconds));
return respBody.Contains(test.Validation);
}
catch (Exception)
{
return false;
}
}
I found a working solution: add the TPL Dataflow NuGet package to the project and use the TransformBlock class. With this, my UI stays very responsive even while processing a large number of concurrent requests that often throw exceptions. The code below is a proof of concept; I will update it once I have adapted it to my project.
Source: Throttling asynchronous tasks
private async void button1_Click(object sender, EventArgs e)
{
var downloader = new TransformBlock<string, WebResponse>(
url => Download(url),
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 200 }
);
var buffer = new BufferBlock<WebResponse>();
downloader.LinkTo(buffer);
var urls = new List<string>();
for (int i = 0; i < 100000; i++)
{
urls.Add($"http://example{i}.com");
}
foreach (var url in urls)
downloader.Post(url);
//or await downloader.SendAsync(url);
downloader.Complete();
await downloader.Completion;
IList<WebResponse> responses;
if (buffer.TryReceiveAll(out responses))
{
//process responses
}
}
private WebResponse Download(string url)
{
WebResponse resp = null;
try
{
var req = WebRequest.Create(url);
resp = req.GetResponse();
}
catch (Exception)
{
}
return resp;
}
}
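The proof of concept above passes a synchronous delegate to the TransformBlock, so each in-flight download occupies a thread while it waits. A variant with an asynchronous delegate might look like the sketch below. ThrottledRunner, RunAsync and their parameters are names made up for this illustration, and swallowing every exception simply mirrors the empty catch block in Download above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper, not part of TPL Dataflow itself.
public static class ThrottledRunner
{
    // Runs `work` over every input with at most `maxParallel` calls in flight.
    // A call that throws contributes default(TOut) instead of faulting the block.
    public static async Task<IList<TOut>> RunAsync<TIn, TOut>(
        IEnumerable<TIn> inputs, Func<TIn, Task<TOut>> work, int maxParallel)
    {
        var block = new TransformBlock<TIn, TOut>(
            async input =>
            {
                try { return await work(input); }
                catch (Exception) { return default(TOut); }
            },
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel });

        // Collect the outputs in an unbounded buffer, as in the proof of concept.
        var results = new BufferBlock<TOut>();
        block.LinkTo(results);

        foreach (var input in inputs)
        {
            await block.SendAsync(input);
        }

        // No more input; wait until every output has been handed to the buffer.
        block.Complete();
        await block.Completion;

        return results.TryReceiveAll(out var items) ? items : new List<TOut>();
    }
}
```

A call could then look like `var bodies = await ThrottledRunner.RunAsync(urls, url => httpClient.GetStringAsync(url), 200);`, where httpClient is assumed to be a shared HttpClient instance.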
Related
I am attempting to use the TPL in a Windows service for efficient asynchronous processing. The service runs in an infinite loop until it is cancelled.
Here is the code that I'm using for the main service methods:
private CancellationTokenSource cancellationTokenSource;
private readonly List<Task> tasks = new List<Task>();
protected override void OnStart(string[] args)
{
cancellationTokenSource = new CancellationTokenSource();
tasks.Add(Task.Factory.StartNew(() =>
{
Worker(cancellationTokenSource.Token);
}, cancellationTokenSource.Token));
}
private async void Worker(CancellationToken token)
{
bool keepGoing = true;
while (keepGoing)
{
try
{
if (token.IsCancellationRequested)
{
token.ThrowIfCancellationRequested();
}
//Parallel.ForEach(processors, processor =>
//{
await processor.Process();
//});
}
catch (Exception ex)
{
if (ex is OperationCanceledException)
{
keepGoing = false;
}
else
{
//write log here
}
}
finally
{
await Task.Delay(configurationSettings.OperationSettings.ServiceOperationDelay, token).ContinueWith(tsk => { });
}
}
}
protected override void OnStop()
{
cancellationTokenSource.Cancel();
using var mres = new ManualResetEventSlim();
using (cancellationTokenSource.Token.Register(() => mres.Set()))
{
Task.Factory.ContinueWhenAll(tasks.ToArray(), (t) => mres.Set());
mres.Wait();
}
}
The call to the processor basically does the following:
var records = await interfaceService.Get()
foreach record retrieved
await interfaceService.Patch()
The service utilizes an HttpClient instance to make requests.
**Get:**
using HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Get,
"");
using HttpResponseMessage response = await httpClient.SendAsync(request);
if (response.IsSuccessStatusCode)
{
//return results
}
else
{
throw new Exception("Foo bar");
}
**Patch**
using HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Patch, "")
{
Content = new StringContent(JsonConvert.SerializeObject(body), Encoding.UTF8, "application/json")
};
using HttpResponseMessage response = await httpClient.SendAsync(request);
The issue I am encountering is that if the endpoint becomes unavailable, the service doesn't effectively catch any exceptions or responses returned and, for lack of a better term, falls into a hole. I believe my issue is with the way the tasks are being managed.
What I ultimately want is for the service, on each iteration of the loop, to:
1. Fire off specific tasks asynchronously, all at once, each performing the get/patch operation.
2. Wait until all are completed.
3. Log the results of each to a file.
4. Go to sleep.
5. Start again at step #1.
In addition, when the service stops, I want to gracefully stop processing of each task.
Any help with this is greatly appreciated!
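The loop described in the steps above could be sketched roughly as follows. This is only an illustration under assumptions: ServiceLoop, RunAsync, RunOneAsync, jobs and log are hypothetical names, with jobs standing in for the get/patch operations and log for the file logger. The main difference from the Worker method shown earlier is that the loop returns a Task instead of being async void, so the caller can hold on to it and exceptions are not lost.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch; none of these names come from the original service.
public static class ServiceLoop
{
    public static async Task RunAsync(
        IReadOnlyList<Func<CancellationToken, Task>> jobs,
        TimeSpan delay,
        Action<string> log,
        CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            // Step 1: start every job at once.
            var running = jobs.Select(job => RunOneAsync(job, token)).ToList();

            // Step 2: wait for all of them; each task reports its own outcome.
            var outcomes = await Task.WhenAll(running);

            // Step 3: log one line per job.
            foreach (var outcome in outcomes)
            {
                log(outcome);
            }

            // Step 4: sleep, waking early if the service is stopping.
            try { await Task.Delay(delay, token); }
            catch (OperationCanceledException) { break; }
        }
    }

    // Converts one job's result or failure into a log line, so a single
    // unavailable endpoint cannot take down the whole iteration.
    private static async Task<string> RunOneAsync(
        Func<CancellationToken, Task> job, CancellationToken token)
    {
        try
        {
            await job(token);
            return "ok";
        }
        catch (OperationCanceledException) when (token.IsCancellationRequested)
        {
            return "cancelled";
        }
        catch (Exception ex)
        {
            return "failed: " + ex.Message;
        }
    }
}
```

With something like this, OnStart could store the Task returned by `ServiceLoop.RunAsync(...)`, and OnStop could cancel the token and wait on that stored Task. Wrapping an async void method in Task.Factory.StartNew does not allow that, because the started task completes as soon as the method reaches its first await.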
I am processing a list of proxy servers asynchronously/concurrently, testing each proxy server for validity. Those proxy servers are displayed in a custom user control which inherits from DataGridView and sets the DoubleBuffered property to true in its constructor. Furthermore, that DGV is not data-bound but uses Virtual Mode and CellValueNeeded.
In my method which tests the validity of the displayed proxies (ProxyTester.Start()), I can control the degree of concurrency using a SemaphoreSlim object. When that semaphore is initialized with a small value like 10, I can scroll through the DGV and see the data being updated and it is smooth. If I increase the degree of concurrency to a larger number like 100, which increases throughput (Yay!), my DGV starts lagging during scrolling.
How can I reduce that lag during scrolling/processing the list while still having a high degree of concurrency, besides things I've already done like setting DoubleBuffered to True and using Virtual Mode?
public partial class ExtendedDataGridView : DataGridView
{
public ExtendedDataGridView()
{
//InitializeComponent();
DoubleBuffered = true;
}
}
public partial class DataGridViewForm : Form
{
private List<Proxy> proxies = new List<Proxy>();
public DataGridViewForm()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
for (int i = 0; i < 10000; i++)
{
proxies.Add(new Proxy("127.0.0." + RandomUtility.GetRandomInt(1, 5), 8888));
}
extendedDataGridView1.RowCount = proxies.Count;
}
private void button2_Click(object sender, EventArgs e)
{
var judges = new List<ProxyJudge>();
judges.Add(new ProxyJudge("http://azenv.net"));
Task.Run(async () => { await ProxyTester.Start(proxies, judges, maxConcurrency: 1); });
}
private void extendedDataGridView1_CellValueNeeded(object sender, DataGridViewCellValueEventArgs e)
{
if (proxies.Count > 0)
{
var proxy = proxies[e.RowIndex];
switch (e.ColumnIndex)
{
case 0:
e.Value = proxy.IP;
break;
case 1:
e.Value = proxy.IsValid;
break;
default:
break;
}
}
}
}
public class ProxyTester
{
public async static Task Start(List<Proxy> proxies, List<ProxyJudge> judges, List<ProxyTest> tests = null, PauseOrCancelToken pct = null, int maxConcurrency = 100)
{
if (tests == null)
{
tests = new List<ProxyTest>();
}
//Get external IP to check if proxy is anonymous.
var publicIp = await WebUtility.GetPublicIP();
//Validate proxy judges.
var tasks = new List<Task>();
var semaphore = new SemaphoreSlim(maxConcurrency);
foreach (var judge in judges)
{
tasks.Add(Task.Run(async () => {
await semaphore.WaitAsync();
judge.IsValid = await judge.TestValidityAsync();
if (pct != null) { await pct.PauseOrCancelIfRequested(); }
semaphore.Release();
}));
}
await Task.WhenAll(tasks);
var validJudges = from judge in judges
where judge.IsValid
select judge;
if (validJudges.Count() == 0)
{
throw new Exception("No valid judges loaded.");
}
//Validate proxy tests.
tasks.Clear();
foreach (var test in tests)
{
tasks.Add(Task.Run(async () => {
await semaphore.WaitAsync();
test.IsValid = await test.TestValidityAsync();
if (pct != null) { await pct.PauseOrCancelIfRequested(); }
semaphore.Release();
}));
}
await Task.WhenAll(tasks);
var validTests = from test in tests
where test.IsValid
select test;
//Test proxies with a random, valid proxy judge. If valid, test with all valid proxy tests.
tasks.Clear();
var count = 0;
foreach (var proxy in proxies)
{
tasks.Add(Task.Run(async () =>
{
await semaphore.WaitAsync();
proxy.IsValid = await proxy.TestValidityAsync(validJudges.ElementAt(RandomUtility.GetRandomInt(0, validJudges.Count())));
semaphore.Release();
Interlocked.Increment(ref count);
Console.WriteLine(count);
if (proxy.IsValid)
{
proxy.TestedSites.AddRange(validTests);
var childTasks = new List<Task>();
foreach (var test in validTests)
{
childTasks.Add(Task.Run(async () =>
{
await semaphore.WaitAsync();
proxy.TestedSites.ElementAt(proxy.TestedSites.IndexOf(test)).IsValid = await proxy.TestValidityAsync(test);
if (pct != null) { await pct.PauseOrCancelIfRequested(); }
semaphore.Release();
}));
}
await Task.WhenAll(childTasks);
}
}));
}
await Task.WhenAll(tasks);
}
}
I have a Windows Form with the following code:
BindingList<TicketResult> tickResults = new BindingList<TicketResult>();
BindingSource bindingSource1 = new BindingSource();
Action<String> call;
private void method(String x)
{
if (this.dataGridView1.InvokeRequired)
{
lock (this)
{
dataGridView1.Invoke(
new MethodInvoker(() =>
{
Debug.WriteLine(x);
tickResults[int.Parse(x)].Row = "first page";
dataGridView1.Refresh();
}));
}
}
}
public Form1()
{
call = method;
ServicePointManager.DefaultConnectionLimit = 48;
InitializeComponent();
tickResults.ListChanged += tickResults_ListChanged;
for (int i = 0; i < 10; i++)
{
TicketResult result = new TicketResult();
tickResults.Add(result);
}
bindingSource1.DataSource = tickResults;
dataGridView1.DataSource = bindingSource1;
for (int i = 0; i < 10; i++)
{
Search s = new Search();
int x = i;
Task.Run(() => s.start(x, this.call));
}
}
I don't understand why the change in tickResults is not reflected without calling dataGridView1's Refresh() method.
The code for the other classes, which invoke the form's "call" delegate, is as follows:
class Search : ISearch
{
public async Task<bool> start(int i, Action<String> x)
{
bool result = false;
TicketLogic tixLogic = new TicketLogic();
try
{
await Task.Run(() => tixLogic.processFirstPage(i, x))
.ContinueWith((t) => tixLogic.processSecondPage(i, x))
.ContinueWith((t) => tixLogic.processThirdPage(i, x));
result = true;
}
catch (Exception e)
{
Debug.WriteLine(e.Message);
result = false;
}
return result;
}
public async Task<bool> stop()
{
return false;
}
public async Task<bool> restart()
{
return false;
}
}
class TicketLogic
{
public async Task<bool> processFirstPage(int i, Action<String> x)
{
bool result = false;
try
{
HttpWebRequest request = WebRequest.CreateHttp("http://www.google.com");
WebResponse response = await request.GetResponseAsync();
StreamReader reader = new StreamReader(response.GetResponseStream());
String textResponse = await reader.ReadToEndAsync();
reader.Close();
response.Close();
result = true;
}
catch (Exception e)
{
Debug.WriteLine(e.Message);
result = false;
}
return result;
}
public async Task<bool> processSecondPage(int i, Action<String> x)
{
bool result = false;
try
{
HttpWebRequest request = WebRequest.CreateHttp("http://www.example.com");
WebResponse response = await request.GetResponseAsync();
StreamReader reader = new StreamReader(response.GetResponseStream());
String textResponse = await reader.ReadToEndAsync();
//tixResult.Information = "Second Page";
reader.Close();
response.Close();
x(i.ToString());
result = true;
}
catch (Exception e)
{
Debug.WriteLine(e.Message);
result = false;
}
return result;
}
public async Task<bool> processThirdPage(int i, Action<String> x)
{
bool result = false;
try
{
HttpWebRequest request = WebRequest.CreateHttp("http://www.hotmail.com");
WebResponse response = await request.GetResponseAsync();
StreamReader reader = new StreamReader(response.GetResponseStream());
String textResponse = await reader.ReadToEndAsync();
//tixResult.Information = "Third Page";
reader.Close();
response.Close();
x(i.ToString());
result = true;
}
catch (Exception e)
{
Debug.WriteLine(e.Message);
result = false;
}
return result;
}
}
Before this I tried another approach, in which I passed the data-bound object to a computation Task that manipulated it, but the result was the same: the changes to the object were not reflected in the grid until I clicked a cell in the grid or minimized/maximized the form.
My question is: why are the changes not reflected in the grid without calling the DataGridView's Refresh()?
Try using ObservableCollection<T> instead of BindingList<T>.
ObservableCollection<T> implements INotifyCollectionChanged and INotifyPropertyChanged, which raise change notifications that a bound control can listen for.
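Another option that keeps the BindingList is to have the item class itself implement INotifyPropertyChanged. BindingList<T> listens for PropertyChanged on such items and raises ListChanged with ListChangedType.ItemChanged, which a bound grid can react to. The class below is only a guess at what TicketResult might look like. The question does not show it, so everything except the Row property name is an assumption.

```csharp
using System.ComponentModel;
using System.Runtime.CompilerServices;

// Hypothetical version of the question's TicketResult class; only the Row
// property is shown because it is the only member the question uses.
public class TicketResult : INotifyPropertyChanged
{
    private string row;

    public event PropertyChangedEventHandler PropertyChanged;

    public string Row
    {
        get { return row; }
        set
        {
            if (row == value) return;
            row = value;
            OnPropertyChanged();
        }
    }

    // CallerMemberName supplies "Row" automatically when called from the setter.
    private void OnPropertyChanged([CallerMemberName] string name = null)
    {
        PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(name));
    }
}
```

The property should still be set on the UI thread, as the existing Invoke call already does, because the resulting ListChanged event is raised on whichever thread sets the property.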
It is often difficult to answer questions like this, as we don't usually have access to the reasoning of the .NET designers.
So I'll try to make sense of it by guessing; I hope this helps you not just to understand it but also to accept it and make the best of it.
Maybe the powers that be decided that constant automatic refreshing is bad for performance, or even for the user experience, so they leave it to you to decide when all updates are through and to trigger a Refresh.
There is a big difference between a Click and calls from code, let alone from other tasks. A Click happens on the UI, so the UI should be current. Changes to the data source from code could happen many times in a row, at any frequency, and a flickering UI would not be nice.
Let's change the perspective: instead of seeing the issue as a tedious extra task, one could see it as a chance to control when the refresh happens.
Or, to change the perspective even further: you can keep users from being flooded with updates when they might really prefer a Refresh button and perhaps an unobtrusive count of outstanding changes or new records.
I'm trying to show a waiting symbol while an async task is running.
I'm really new to this, so if there are better ways to implement this code, please enlighten me :)
Everything works except hiding pictureBox1 once the code has finished and no result was found. In other words, pictureBox1 is hidden only when there is a result.
Here is the method that runs every time an Outlook item is opened:
private void FormRegion1_FormRegionShowing(object sender, System.EventArgs e)
{
if (this.OutlookItem is Microsoft.Office.Interop.Outlook.MailItem)
{
Microsoft.Office.Interop.Outlook.MailItem item = (Microsoft.Office.Interop.Outlook.MailItem)this.OutlookItem;
getContactByEmail(item);
}
}
This is the method in which I implement the waiting indicator:
public async Task getContactByEmail(Microsoft.Office.Interop.Outlook.MailItem item)
{
pictureBox1.Visible = true;
using (var client = new System.Net.Http.HttpClient())
{
client.BaseAddress = new Uri("http://api.....");
client.DefaultRequestHeaders.Accept.Clear();
HttpResponseMessage response = await client.GetAsync("tools/getContactByEmail?email=" + item.SenderEmailAddress + "&key=1232");
if (response.IsSuccessStatusCode)
{
SimpleContact contact = await response.Content.ReadAsAsync<SimpleContact>();
lblName.Text = contact.Name;
lblMobile.Text = contact.Phone;
}
pictureBox1.Visible = false;
}
}
Posting the code that fixes this, so that the exception is no longer raised:
if (response.IsSuccessStatusCode)
{
SimpleContact contact = await response.Content.ReadAsAsync<SimpleContact>();
if (contact != null)
{
lblName.Text = contact.Name;
lblMobile.Text = contact.Phone;
}
}
//Hide the indicator whether or not the request succeeded.
pictureBox1.Visible = false;
By convention, C# method names are PascalCase and asynchronous methods are suffixed with Async.
You might want to extract the non-UI code into another asynchronous method to avoid going back and forth to the UI thread:
private async void FormRegion1_FormRegionShowing(object sender, System.EventArgs e)
{
if (this.OutlookItem is Microsoft.Office.Interop.Outlook.MailItem)
{
Microsoft.Office.Interop.Outlook.MailItem item = (Microsoft.Office.Interop.Outlook.MailItem)this.OutlookItem;
pictureBox1.Visible = true;
var contact = await GetContactByEmailAsync(item);
if (contact != null)
{
lblName.Text = contact.Name;
lblMobile.Text = contact.Phone;
}
pictureBox1.Visible = false;
}
}
public async Task<SimpleContact> GetContactByEmailAsync(Microsoft.Office.Interop.Outlook.MailItem item)
{
using (var client = new System.Net.Http.HttpClient())
{
client.BaseAddress = new Uri("http://api.....");
client.DefaultRequestHeaders.Accept.Clear();
HttpResponseMessage response = await client.GetAsync(
"tools/getContactByEmail?email=" + item.SenderEmailAddress + "&key=1232")
.ConfigureAwait(false);
return response.IsSuccessStatusCode
? await response.Content.ReadAsAsync<SimpleContact>().ConfigureAwait(false)
: null;
}
}
Note: Don't forget proper exception handling!!!
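As a rough illustration of that note, the indicator can be hidden in a finally block so that it disappears whether the lookup succeeds, returns nothing, or throws. BusyIndicator, RunAsync, setVisible, work and onError are hypothetical names used only for this sketch. In the handler above, setVisible would toggle pictureBox1.Visible and work would wrap the call to GetContactByEmailAsync.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper; the delegates stand in for the form's own members.
public static class BusyIndicator
{
    public static async Task RunAsync(
        Action<bool> setVisible, Func<Task> work, Action<Exception> onError)
    {
        setVisible(true);
        try
        {
            await work();
        }
        catch (Exception ex)
        {
            // Report the failure instead of letting it escape an async void handler.
            onError(ex);
        }
        finally
        {
            // Runs on success and on failure, so the indicator never stays visible.
            setVisible(false);
        }
    }
}
```

Because nothing here uses ConfigureAwait(false), the finally block resumes on the captured UI context when RunAsync is awaited from an event handler, so touching the control there is safe.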
I have to load two large files in parallel.
So far I have this code.
The code below is the button click handler:
private async void MILoadLogFile_Click(object sender, RoutedEventArgs e)
{
...
if (oFD.ShowDialog() == true)
{
await myLogSession.LoadCompassLogAsync(oFD.FileName);
await myLogSession.LoadCoreServiceLogAsync(oFD.FileName);
}
}
The loading method:
public async Task LoadCompassLogAsync(String fileName)
{
StreamReader streamReader = new StreamReader(fileName);
if (fileName.Contains("Compass"))
{
...
try
{
using (streamReader)
{
//Console.Out.WriteLine("lineCount: " + lineCount);
while (((line = await streamReader.ReadLineAsync()) != null)
&& !CompassLogLoadCompleted)
{
...
loggingLvl = new LoggingLvl(eLoggingLvl);
CompassLogData cLD = new CompassLogData(id, dateTime, loggingLvl, threadId, loggingMessage);
await addRoCompassLogCollectionAsync(cLD);
}
}
}
catch (Exception e)
{
Console.WriteLine("The file could not be read:");
Console.WriteLine(e.Message);
}
}
}
LoadCoreServiceLogAsync is almost identical to LoadCompassLogAsync.
The two loading methods run sequentially. I want them to run in parallel.
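Your code runs one task after the other, because each call is awaited before the next one starts. To run the two tasks concurrently, start both and then wait for them. Task.WaitAll does this but blocks the calling thread until both finish, which in a UI event handler like this one freezes the window and can deadlock: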
var loadCompassLogTask = myLogSession.LoadCompassLogAsync(oFD.FileName);
var loadCoreServiceLogTask = myLogSession.LoadCoreServiceLogAsync(oFD.FileName);
Task.WaitAll(loadCompassLogTask, loadCoreServiceLogTask);
Or, to wait without blocking the UI thread, await Task.WhenAll instead:
var loadCompassLogTask = myLogSession.LoadCompassLogAsync(oFD.FileName);
var loadCoreServiceLogTask = myLogSession.LoadCoreServiceLogAsync(oFD.FileName);
await Task.WhenAll(loadCompassLogTask, loadCoreServiceLogTask);