C# stopping an infinite foreach loop - c#

This foreach loop checks a webpage for images and downloads any it finds. How do I stop it? When I press the button, the loop continues forever.
private void button1_Click(object sender, EventArgs e)
{
    WebBrowser browser = new WebBrowser();
    browser.DocumentCompleted += browser_DocumentCompleted;
    browser.Navigate(textBox1.Text);
}
void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    WebBrowser browser = sender as WebBrowser;
    HtmlElementCollection imgCollection = browser.Document.GetElementsByTagName("img");
    WebClient webClient = new WebClient();
    int count = 0; //if available
    int maximumCount = imgCollection.Count;
    try
    {
        foreach (HtmlElement img in imgCollection)
        {
            string url = img.GetAttribute("src");
            webClient.DownloadFile(url, url.Substring(url.LastIndexOf('/')));
            count++;
            if (count >= maximumCount)
                break;
        }
    }
    catch { MessageBox.Show("errr"); }
}

Use the break keyword to break out of a loop.

You do not have an infinite loop. You have an exception that is thrown because of how you are writing the file to disk.
private void button1_Click(object sender, EventArgs e)
{
    WebBrowser browser = new WebBrowser();
    browser.DocumentCompleted += browser_DocumentCompleted;
    browser.Navigate("www.google.ca");
}
void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    WebBrowser browser = sender as WebBrowser;
    HtmlElementCollection imgCollection = browser.Document.GetElementsByTagName("img");
    WebClient webClient = new WebClient();
    foreach (HtmlElement img in imgCollection)
    {
        string url = img.GetAttribute("src");
        string name = System.IO.Path.GetFileName(url);
        string path = System.IO.Path.Combine(Environment.CurrentDirectory, name);
        webClient.DownloadFile(url, path);
    }
}
That code works fine in my environment. The issue you seemed to be having was with the file path passed to DownloadFile: you were setting it to a value like "/myimage.png" (the Substring call keeps the leading slash), and the WebClient could not find that path, so it threw an exception.
The code above saves each image into the current directory under its original file name, extension included.
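To make the difference between the two ways of building the file name concrete, here is a minimal sketch. The ImagePathHelper class and its ToLocalPath method are hypothetical names introduced only for illustration, and the sketch assumes the src attribute holds an absolute URL (new Uri throws on a relative one). It also strips a query string via Uri.LocalPath, which the answer above does not do.

```csharp
using System;
using System.IO;

// Hypothetical helper, not from the original posts.
public static class ImagePathHelper
{
    // What the question's code passes to DownloadFile:
    // the substring keeps the leading slash, e.g. "/logo.png".
    public static string SubstringName(string url)
    {
        return url.Substring(url.LastIndexOf('/'));
    }

    // Maps an absolute image URL to a file path inside targetDirectory.
    public static string ToLocalPath(string url, string targetDirectory)
    {
        // Uri.LocalPath leaves out the query string and fragment,
        // so ".../logo.png?v=2" yields the name "logo.png".
        string name = Path.GetFileName(new Uri(url).LocalPath);
        return Path.Combine(targetDirectory, name);
    }
}
```

For example, ImagePathHelper.ToLocalPath(url, Environment.CurrentDirectory) could replace the two lines that compute name and path inside the foreach loop.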

Maybe the browser.DocumentCompleted event causes the error: if the page refreshes, the event is fired again. You could try to deregister the event handler.
void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    WebBrowser browser = sender as WebBrowser;
    browser.DocumentCompleted -= browser_DocumentCompleted;
    HtmlElementCollection imgCollection = browser.Document.GetElementsByTagName("img");
    WebClient webClient = new WebClient();
    foreach (HtmlElement img in imgCollection)
    {
        string url = img.GetAttribute("src");
        string name = System.IO.Path.GetFileName(url);
        string path = System.IO.Path.Combine(Environment.CurrentDirectory, name);
        webClient.DownloadFile(url, path);
    }
}

Related

How do I scrape web content async?

Here is what I have tried so far. It works, but the form freezes every time it updates.
private void timer1_Tick(object sender, EventArgs e)
{
    HtmlAgilityPack.HtmlWeb web = new HtmlAgilityPack.HtmlWeb();
    HtmlAgilityPack.HtmlDocument doc = web.Load("https://www.roblox.com/catalog/527365852/Dominus-Praefectus");
    foreach (var item in doc.DocumentNode.SelectNodes("//*[@id='item-details']/div[1]/div[1]/div[2]/div/span[2]"))
    {
        textBox1.Text = item.InnerText;
    }
}
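The freeze is consistent with web.Load running on the UI thread, since the timer's Tick handler blocks until the page has been downloaded and parsed. One possible approach, sketched here under that assumption, is to move the blocking call to a thread-pool thread with Task.Run and await the result. The Scraper class and ReadInnerTextAsync are hypothetical names; only HtmlWeb.Load and SelectSingleNode come from HtmlAgilityPack.

```csharp
using System.Threading.Tasks;
using HtmlAgilityPack;

// Hypothetical helper, not from the original post.
public static class Scraper
{
    // Downloads and parses the page on a thread-pool thread and returns the
    // inner text of the first node matching xpath, or null if nothing matches.
    public static Task<string> ReadInnerTextAsync(string url, string xpath)
    {
        return Task.Run(() =>
        {
            HtmlDocument doc = new HtmlWeb().Load(url);   // blocking call, now off the UI thread
            HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
            return node == null ? null : node.InnerText;
        });
    }
}

// Possible use inside the form (the handler must be marked async):
//
// private async void timer1_Tick(object sender, EventArgs e)
// {
//     string text = await Scraper.ReadInnerTextAsync(pageUrl, priceXPath);
//     if (text != null)
//         textBox1.Text = text;   // runs back on the UI thread after the await
// }
```

Here pageUrl and priceXPath stand for the URL and XPath strings from the question. If a download can take longer than the timer interval, the ticks may overlap, so stopping the timer while a request is in flight is worth considering.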

C# Wait for Web Page to Load Before Scraping

I am trying to make a Windows Forms app that logs in to another web application, navigates through a few steps (clicks) until it reaches a specific page, and then scrapes some info (names and addresses).
The problem is that I am using the DocumentCompletedEventHandler to make sure a page has loaded before I run the code that navigates to the next page (in order to reach the final web page).
When it fires, DocumentCompletedEventHandler fires multiple times.
When I reach the login page, it enters the credentials and then the message "Page loaded!" appears multiple times.
I press Enter, and it appears again.
Then it navigates to the next page, and with that new page I have the same problem.
How can I make DocumentCompletedEventHandler fire only once instead of multiple times?
private void loadEvent(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    MessageBox.Show("Page loaded!");
}
private void loadLogin(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    var inputElements = webBrowser1.Document.GetElementsByTagName("input");
    foreach (HtmlElement i in inputElements)
    {
        if (i.GetAttribute("name").Equals("utilizator"))
        {
            i.InnerText = textBox1.Text;
        }
        if (i.GetAttribute("name").Equals("parola"))
        {
            i.Focus();
            i.InnerText = textBox2.Text;
        }
    }
    var buttonElements = webBrowser1.Document.GetElementsByTagName("input");
    foreach (HtmlElement b in buttonElements)
    {
        if (b.GetAttribute("name").Equals("Intra"))
        {
            b.InvokeMember("Click");
        }
    }
    webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(loadEvent);
    var inputElements1 = webBrowser1.Document.GetElementsByTagName("input");
    foreach (HtmlElement i1 in inputElements1)
    {
        if (i1.GetAttribute("id").Equals("headerqstext"))
        {
            i1.Focus();
            i1.InnerText = textBox3.Text;
        }
    }
    var buttonElements1 = webBrowser1.Document.GetElementsByTagName("button");
    foreach (HtmlElement b1 in buttonElements1)
    {
        if (b1.GetAttribute("title").Equals("Caută"))
        {
            b1.InvokeMember("Click");
        }
    }
    webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(loadEvent);
}
private void Button1_Click(object sender, EventArgs e)
{
    webBrowser1.Navigate("http://10.1.104.23/ecris_cdms/");
    webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(loadLogin);
}
Try this:
Uri last = null;
private void CompleteResponse(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Ignore repeated notifications for a URL that was already handled.
    if (last != null && last == e.Url)
        return;
    last = e.Url;
    // your code here
}
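Another way to run the next step exactly once per page load is to turn the event into an awaitable task. The following is only a sketch: WhenLoadedAsync is a hypothetical extension method, not part of Windows Forms, and the ReadyState check is a commonly used guard against the extra DocumentCompleted notifications raised for frames rather than a guarantee for every site.

```csharp
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical helper, not part of Windows Forms.
public static class WebBrowserLoadExtensions
{
    // Returns a task that completes the next time the browser reports that
    // the whole page has finished loading. Call it before triggering the
    // navigation, then await the result.
    public static Task WhenLoadedAsync(this WebBrowser browser)
    {
        var tcs = new TaskCompletionSource<bool>();
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (sender, e) =>
        {
            // Skip notifications raised while the page is still loading.
            if (browser.ReadyState != WebBrowserReadyState.Complete)
                return;
            browser.DocumentCompleted -= handler;   // detach so it runs only once
            tcs.TrySetResult(true);
        };
        browser.DocumentCompleted += handler;
        return tcs.Task;
    }
}

// Possible use in an async click handler:
//
// Task loaded = webBrowser1.WhenLoadedAsync();   // subscribe first
// webBrowser1.Navigate("http://10.1.104.23/ecris_cdms/");
// await loaded;                                  // execution resumes here once
// // fill in the login form, click the button, then await the next load
```

Because each step subscribes a fresh handler and removes it when done, handlers no longer accumulate across pages the way they do when DocumentCompleted += is called repeatedly.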

How to update the text in a listView with the DownloadFileAsync progress percentage?

I am currently making a podcast client to download episodes. I have a ListView filled with the episodes for a feed, and when you double-click one it is placed into a separate 'downloads' ListView, which has a 'name' and a 'progress' column.
The problem I am having is updating each progress value individually while downloading asynchronously. I am not sure how to keep track of the progress for each ListViewItem or how to reference it in the DownloadProgressChanged handler.
private void lvPodDownloads_SelectionChanged(object sender, SelectionChangedEventArgs e)
{
    if (lvPodEpisodes.SelectedItems.Count == 1) // Check if an item is selected just to be safe
    {
        ListViewItem item = (ListViewItem)lvPodEpisodes.SelectedItem;
        string[] epInfo = (string[])item.Tag;
        txtTitle.Text = epInfo[0];
        txtDesc.Text = epInfo[1];
        try
        {
            imgFeedImage.Source = new BitmapImage(new Uri((Environment.CurrentDirectory + "\\..\\..\\feedImages\\" + epInfo[3])));
        }
        catch (Exception) // If it fails to set the image (Eg. It's non-existent) It will leave it blank
        {
            imgFeedImage.Source = null;
        }
    }
}
private void lvPodEpisodes_MouseDoubleClick(object sender, MouseButtonEventArgs e) // Downloading the episode in here
{
    if (e.ChangedButton == MouseButton.Left) // Left button was double clicked
    {
        ListViewItem selected = (ListViewItem)lvPodEpisodes.SelectedItem;
        string[] epInfo = (string[])selected.Tag;
        Uri downloadUrl = new Uri(epInfo[2]);
        List<Episode> downloading = new List<Episode>();
        downloading.Add(new Episode() { Title = epInfo[0], Progress = "0%" });
        lvPodDownloads.Items.Add((new Episode() { Title = epInfo[0], Progress = "0%" }));
        using (WebClient client = new WebClient())
        {
            client.DownloadProgressChanged += new DownloadProgressChangedEventHandler(ProgressChanged);
        }
    }
}
static int intDownloadProgress = new int();
private void ProgressChanged(object sender, DownloadProgressChangedEventArgs e)
{
    intDownloadProgress = e.ProgressPercentage;
}
private void Completed(object sender, AsyncCompletedEventArgs e)
{
    MessageBox.Show("Download completed!");
}
This is a code sample of the downloading section of the program.
Here is an image of what I have so far:
https://s33.postimg.cc/gthzioxlr/image.png
You should add an extra argument to your ProgressChanged method.
private void ProgressChanged(object sender, DownloadProgressChangedEventArgs e, Episode curEpisode)
{
    curEpisode.Progress = $"{e.ProgressPercentage} %";
}
Then change the handler registration like this:
List<Episode> downloading = new List<Episode>();
var newEpisode = new Episode() { Title = epInfo[0], Progress = "0%" };
downloading.Add(newEpisode);
lvPodDownloads.Items.Add(newEpisode);
using (WebClient client = new WebClient())
{
    // The lambda parameters are named s and args because sender and e are already
    // taken by the enclosing double-click handler.
    client.DownloadProgressChanged += new DownloadProgressChangedEventHandler((s, args) => ProgressChanged(s, args, newEpisode));
}
The static field intDownloadProgress is then no longer needed.
You should also consider using an observable collection for the episode list and binding it to the ListView in XAML.
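For the binding suggestion to show live progress, the Episode object has to notify the UI when Progress changes. The post does not show the Episode class, so the version below is an assumption: a minimal sketch that implements INotifyPropertyChanged, plus a hypothetical DownloadsViewModel exposing an ObservableCollection for the ListView's ItemsSource.

```csharp
using System.Collections.ObjectModel;
using System.ComponentModel;

// Assumed shape of Episode; the original class is not shown in the post.
public class Episode : INotifyPropertyChanged
{
    private string _progress = "0%";

    public string Title { get; set; }

    public string Progress
    {
        get { return _progress; }
        set
        {
            if (_progress == value)
                return;
            _progress = value;
            // Tell any bound control that the Progress column must refresh.
            PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(nameof(Progress)));
        }
    }

    public event PropertyChangedEventHandler PropertyChanged;
}

// Hypothetical view model; bind the downloads ListView's ItemsSource to Downloads.
public class DownloadsViewModel
{
    public ObservableCollection<Episode> Downloads { get; } = new ObservableCollection<Episode>();
}
```

With this in place, setting curEpisode.Progress inside ProgressChanged is enough to update the row for that episode, and adding an Episode to Downloads makes a new row appear without touching lvPodDownloads.Items directly.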

C# WebBrowser Body is null, GetElementById returns null

I am loading a local HTML page using the WebBrowser control.
namespace ConfigEditorWinForms
{
    [PermissionSet(SecurityAction.Demand, Name = "FullTrust")]
    [System.Runtime.InteropServices.ComVisibleAttribute(true)]
    public partial class ConfigEditorForm : Form
    {
        String _currentConfigFilePath = null;
        public ConfigEditorForm()
        {
            Load += new EventHandler(ConfigEditorForm_Load);
            InitializeComponent();
        }
        private void ConfigEditorForm_Load(object sender, EventArgs e)
        {
            webBrowser1.AllowWebBrowserDrop = true;
            webBrowser1.IsWebBrowserContextMenuEnabled = false;
            webBrowser1.WebBrowserShortcutsEnabled = false;
            webBrowser1.ObjectForScripting = this;
            //webBrowser1.ScriptErrorsSuppressed = true;
            webBrowser1.DocumentCompleted +=
                new WebBrowserDocumentCompletedEventHandler(OnDocumentCompleted);
            string curDir = Directory.GetCurrentDirectory();
            webBrowser1.Url = new Uri(String.Format("file:///{0}/ConfigEditor/ConfigEditor.html", curDir));
        }
        private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            MessageBox.Show("Document completed");
        }
        public string GetFileContents(String path)
        {
            return System.IO.File.ReadAllText(path);
        }
        public void SaveFileContents(String path, String contents)
        {
            System.IO.File.WriteAllText(path, contents);
        }
        private void openToolStripMenuItem_Click(object sender, EventArgs e)
        {
            openFileDialog.InitialDirectory = @"XXX";
            openFileDialog.FilterIndex = 1;
            openFileDialog.RestoreDirectory = false;
            if (openFileDialog.ShowDialog() == DialogResult.OK)
            {
                HtmlDocument doc = webBrowser1.Document;
                HtmlElement fileToOpenInput = doc.GetElementById("fileToOpenInput");
                fileToOpenInput.InvokeMember("onchange", new object[1] { openFileDialog.FileName });
                _currentConfigFilePath = openFileDialog.FileName;
            }
        }
    }
}
On my computer:
The document completed event is fired twice.
Opening a file from the menu works fine too.
On another computer:
The document completed event is fired only once, and only when the executable is run as administrator.
Document.Body is null and Document.GetElementById returns null too, even though the document completed event fired several seconds earlier.
What is going on?
Thank you. :)

C# Web Browser control only loads one page, will not work on the second attempt

I have urls in a listbox. I am trying to navigate to a url when it is selected.
private void lstURL_SelectedIndexChanged(object sender, EventArgs e)
{
    wbrBrowser.Navigate(lstURL.Text);
    lblUrl.Text = lstURL.Text;
    lblTitle.Text = "Loading...";
    System.Windows.Forms.HtmlDocument document = wbrBrowser.Document;
    document.MouseUp += new HtmlElementEventHandler(this.htmlDocument_Click);
}
private void wbrBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    lblTitle.Text = wbrBrowser.Document.Title;
}
private void htmlDocument_Click(object sender, HtmlElementEventArgs e)
{
    HtmlElement element = this.wbrBrowser.Document.GetElementFromPoint(e.ClientMousePosition);
    var savedId = element.Id;
    var uniqueId = Guid.NewGuid().ToString();
    element.Id = uniqueId;
    var doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(element.Document.GetElementsByTagName("html")[0].OuterHtml);
    element.Id = savedId;
    var node = doc.GetElementbyId(uniqueId);
    var xpath = node.XPath;
    lblXpath.Text = xpath;
}
It works the first time I load a page. After that it just freezes, and lblTitle.Text stays at "Loading...".
I have been searching for a while, but I can't figure out why this is happening.
