I'm working on an application that downloads images from the internet using Selenium. However, I keep getting the same error and cannot continue the process. The code is below; you can see the error in the screenshot further down.
IWebDriver driver;
int PictureID = 0;

private void button1_Click(object sender, EventArgs e)
{
    var ChromeService = ChromeDriverService.CreateDefaultService();
    driver = new ChromeDriver(ChromeService);
    driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromSeconds(30);
    driver.Navigate().GoToUrl("https://oblivious212.artstation.com/");
    var Projects = driver.FindElements(By.ClassName("album-grid-item"));
    for (int i = 0; i < Projects.Count(); i++)
    {
        if (Projects.ElementAt(i) == null)
        {
            continue;
        }
        Projects[i].Click();
        var Images = driver.FindElements(By.TagName("img"));
        for (int x = 0; x < Images.Count(); x++)
        {
            PictureID++;
            WebClient Downloader = new WebClient();
            var ImageUrl = Images[x].GetAttribute("src");
            var ImageName = Images[x].GetAttribute("alt");
            Downloader.DownloadFile(ImageUrl, "C:\\Users\\DeLL\\Pictures\\Images\\" + ImageName + PictureID + ".jpg");
        }
        driver.Navigate().Back();
    }
}
Screenshot of exception when running in debug mode:
How do I solve this?
As soon as you navigate to a new page, which I guess is what your Projects[i].Click(); call does, any IWebElement objects you saved from an earlier page (oblivious212.artstation.com/) become "stale" and you can no longer use them. You must design your code around this fact; there are several ways you might do this.
Basically, while you're still on page oblivious212.artstation.com/, you need to save off any data you need from the IWebElement objects returned by your driver.FindElements(By.ClassName("album-grid-item")) call, into a local object, rather than saving the IWebElement objects themselves. Then, replace your Projects[i].Click(); call with code which uses your saved local data, rather than using the IWebElement objects themselves.
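For example, here is a minimal sketch of that idea against your code. It assumes each album-grid-item wraps an <a> element whose href leads to the project page (worth verifying in the page source), and it needs using System.Linq; and using System.Net; at the top of the file:
// Save the project URLs while still on the overview page, instead of keeping IWebElement references.
var projectUrls = driver.FindElements(By.ClassName("album-grid-item"))
                        .Select(p => p.FindElement(By.TagName("a")).GetAttribute("href"))
                        .ToList();

foreach (var projectUrl in projectUrls)
{
    // Navigate directly to the saved URL; no stale reference is possible here.
    driver.Navigate().GoToUrl(projectUrl);

    foreach (var image in driver.FindElements(By.TagName("img")))
    {
        PictureID++;
        var imageUrl = image.GetAttribute("src");
        var imageName = image.GetAttribute("alt");
        using (var downloader = new WebClient())
        {
            downloader.DownloadFile(imageUrl, "C:\\Users\\DeLL\\Pictures\\Images\\" + imageName + PictureID + ".jpg");
        }
    }
}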
Related
I'm using Selenium to retrieve data from this site, and I encountered a little problem when I try to click an element within a foreach.
What I'm trying to do
I'm trying to get the table associated with a specific category of odds; at the link above there are several categories:
As you can see from the image, I clicked on Asian handicap -1.75 and the site generated a table through JavaScript, so in my code I'm trying to get that table by finding the corresponding element and clicking it.
Code
Currently I have two methods. The first, called GetAsianHandicap, iterates over all categories of odds:
public List<T> GetAsianHandicap(Uri fixtureLink)
{
    // Contains all the categories displayed on the page
    string[] categories = new string[] { "-1.75", "-1.5", "-1.25", "-1", "-0.75", "-0.5", "-0.25", "0", "+0.25", "+0.5", "+0.75", "+1", "+1.25", "+1.5", "+1.75" };
    foreach (string cat in categories)
    {
        // Get the html of the table for the current category
        string html = GetSelector("Asian handicap " + cat);
        if (html == string.Empty)
            continue;
        // other code
    }
}
and then the method GetSelector, which clicks on the searched element; this is the design:
public string GetSelector(string selector)
{
    // Get the available table containers (the categories).
    var containers = driver.FindElements(By.XPath("//div[@class='table-container']"));
    // Store the html to return.
    string html = string.Empty;
    foreach (IWebElement container in containers)
    {
        // Container not available for click.
        if (container.GetAttribute("style") == "display: none;")
            continue;
        // Get container header (contains the description).
        IWebElement header = container.FindElement(By.XPath(".//div[starts-with(@class, 'table-header')]"));
        // Store the table description.
        string description = header.FindElement(By.TagName("a")).Text;
        // The container contains the searched category.
        if (description.Trim() == selector)
        {
            // Get the available links.
            var listItems = driver.FindElement(By.Id("odds-data-table")).FindElements(By.TagName("a"));
            // Get the element to click.
            IWebElement element = listItems.Where(li => li.Text == selector).FirstOrDefault();
            // The element exists.
            if (element != null)
            {
                // Click on the container to load the table.
                element.Click();
                // Wait a few seconds on ChromeDriver for the table to load.
                driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromSeconds(20);
                // Get the new html of the page.
                html = driver.PageSource;
            }
            return html;
        }
    }
    return string.Empty;
}
Problem and exception details
When the foreach reaches this line:
var listItems = driver.FindElement(By.Id("odds-data-table")).FindElements(By.TagName("a"));
I get this exception:
'OpenQA.Selenium.StaleElementReferenceException' in WebDriver.dll
stale element reference: element is not attached to the page document
Searching for the error suggests that the HTML page source has changed, but in this case I store the element to click in one variable and the HTML itself in another, so I can't work out how to fix this issue.
Could someone help me?
Thanks in advance.
I looked at your code and I think you're making this more complicated than it needs to be. I'm assuming you want to scrape the table that is exposed when you click one of the handicap links. Here's some simple code to do this. It dumps the text of the elements, which ends up unformatted, but you can use it as a starting point and add functionality as you like. I didn't run into any StaleElementReferenceExceptions when running this code, and I never saw the page refresh, so I'm not sure what other people were seeing.
string url = "http://www.oddsportal.com/soccer/europe/champions-league/paok-spartak-moscow-pIXFEt8o/#ah;2";
driver.Url = url;

// get all the (visible) handicap links and click them to open the page and display the table with odds
IReadOnlyCollection<IWebElement> links = driver.FindElements(By.XPath("//a[contains(.,'Asian handicap')]")).Where(e => e.Displayed).ToList();
foreach (var link in links)
{
    link.Click();
}

// print all the odds tables
foreach (var item in driver.FindElements(By.XPath("//div[@class='table-container']")))
{
    Console.WriteLine(item.Text);
    Console.WriteLine("====================================");
}
I would suggest that you spend some more time learning locators. Locators are very powerful and can save you from stacking nested loops looking for one thing... and then children of that thing... and so on. The right locator can find all of that in one scrape of the page, which saves a lot of code and time.
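To illustrate, a single locator can reach a category link directly instead of walking container, then header, then anchor. This is only a sketch reusing the class names from the code above (it needs using System.Linq;), so adjust it to the actual markup:
// One XPath that goes straight to the anchor inside a table header, no nested loops needed.
var categoryLink = driver.FindElements(By.XPath("//div[@class='table-container']//div[starts-with(@class,'table-header')]//a"))
                         .FirstOrDefault(a => a.Text.Trim() == "Asian handicap -1.75");
if (categoryLink != null)
{
    categoryLink.Click();
}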
As you mentioned in the related post, this issue occurs because the site performs an auto refresh.
Solution 1:
If there is an explicit way to refresh the page, I would suggest performing that refresh on a periodic basis, or only at the points where you know it is needed.
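A minimal sketch of that idea; the key point is to re-locate elements after every refresh rather than reusing references obtained before it:
// Refresh at a point you control, then find the elements again;
// any IWebElement obtained before the refresh must not be reused.
driver.Navigate().Refresh();
var containers = driver.FindElements(By.XPath("//div[@class='table-container']"));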
Solution 2:
Create extension methods for FindElement and FindElements, so that they try to get the element within a given timeout:
public static IWebElement FindElement(this IWebDriver driver, By by, int timeout)
{
    if (timeout > 0)
    {
        return new WebDriverWait(driver, TimeSpan.FromSeconds(timeout)).Until(ExpectedConditions.ElementToBeClickable(by));
    }
    return driver.FindElement(by);
}

public static IReadOnlyCollection<IWebElement> FindElements(this IWebDriver driver, By by, int timeout)
{
    if (timeout > 0)
    {
        return new WebDriverWait(driver, TimeSpan.FromSeconds(timeout)).Until(ExpectedConditions.PresenceOfAllElementsLocatedBy(by));
    }
    return driver.FindElements(by);
}
so your code will use these like this:
var listItems = driver.FindElement(By.Id("odds-data-table"), 30).FindElements(By.TagName("a"),30);
Solution 3:
Handle StaleElementReferenceException using an extension method:
public static IWebElement FindElement(this IWebDriver driver, By by, int maxAttempt)
{
    IWebElement element = null;
    for (int attempt = 0; attempt < maxAttempt; attempt++)
    {
        try
        {
            element = driver.FindElement(by);
            break;
        }
        catch (StaleElementReferenceException)
        {
        }
    }
    return element;
}

public static IReadOnlyCollection<IWebElement> FindElements(this IWebDriver driver, By by, int maxAttempt)
{
    IReadOnlyCollection<IWebElement> elements = null;
    for (int attempt = 0; attempt < maxAttempt; attempt++)
    {
        try
        {
            elements = driver.FindElements(by);
            break;
        }
        catch (StaleElementReferenceException)
        {
        }
    }
    return elements;
}
Your code will use these like this:
var listItems = driver.FindElement(By.Id("odds-data-table"), 2).FindElements(By.TagName("a"),2);
Use this:
string description = header.FindElement(By.XPath("strong/a")).Text;
instead of your:
string description = header.FindElement(By.TagName("a")).Text;
I'm trying to get the Flash object with the ShockwaveFlashObjects component. I get the browser object successfully, but I'm wondering how to get the Flash object through the browser object, which is of type IWebBrowser2. The code below shows the interface I defined. Does anyone have any ideas for me? Thanks.
interface IGetObjects
{
    SHDocVw.IWebBrowser2 GetBrowserObject();
    ShockwaveFlashObjects.ShockwaveFlashClass GetFlashObject(IWebBrowser2 browserObject);
}
And this is how I get the browser object. (Caution: the Flash file is only used for testing, so it is located locally.)
public IWebBrowser2 GetBrowserObject()
{
    InternetExplorerClass browser = null;
    var shellWindows = new ShellWindowsClass();
    const string explorFullName = "C:\\Program Files (x86)\\Internet Explorer\\IEXPLORE.EXE";
    IWebBrowser2 iwb2 = null;
    for (int i = 0; i < shellWindows.Count; i++)
    {
        iwb2 = shellWindows.Item(i) as IWebBrowser2;
        if (iwb2 != null && Equals(iwb2.FullName, explorFullName))
        {
            break;
        }
    }
    return iwb2;
}
And now I have no idea how to implement the second method, ShockwaveFlashClass GetFlashObject(IWebBrowser2 browserObject).
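In case it helps, here is one possible sketch of GetFlashObject that walks the IE DOM via mshtml. This is an untested assumption of how the pieces fit together: it presumes the page hosts Flash in an <object> element, that the mshtml interop assembly is referenced, and it returns the IShockwaveFlash interface rather than ShockwaveFlashClass:
public ShockwaveFlashObjects.IShockwaveFlash GetFlashObject(IWebBrowser2 browserObject)
{
    // The browser's Document property exposes the IE DOM.
    var document = browserObject.Document as mshtml.IHTMLDocument3;
    if (document == null)
        return null;

    // Assumption: the Flash movie is hosted in an <object> element.
    foreach (mshtml.IHTMLElement element in document.getElementsByTagName("object"))
    {
        var objectElement = element as mshtml.IHTMLObjectElement;
        if (objectElement == null)
            continue;

        // IHTMLObjectElement.object returns the hosted ActiveX control,
        // which for Flash should be queryable for IShockwaveFlash.
        var flash = objectElement.@object as ShockwaveFlashObjects.IShockwaveFlash;
        if (flash != null)
            return flash;
    }
    return null;
}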
I am making a WPF application and I need it to select a link at random from the generated search results. I have no idea how to go about doing that. It is just an intellectual exercise I was assigned. Please help; I am almost done. Here is the code so far... I am a super beginner at WPF.
namespace Search
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window
    {
        public MainWindow()
        {
            InitializeComponent();
        }

        private void Btn_Click(object sender, RoutedEventArgs e)
        {
            using (var browser = new IE("http://www.google.com"))
            {
                browser.TextField(Find.ByName("q")).TypeText(_textBox.Text);
                browser.Button(Find.ByName("btnG")).Click();
                browser.WaitForComplete(5000);
                System.Windows.Forms.SendKeys.SendWait("{Enter}"); // presses search on the second screen
                browser.Button(Find.ById("gbqfb")/*.ByName("btnG")*/).Click(); // doesn't work
            }
        }
    }
}
Here's some indicative code...
private void DownloadRandomLink(string searchTerm)
{
    string fullUrl = "http://www.google.com/#q=" + searchTerm;
    WebClient wc = new WebClient();
    wc.DownloadFile(fullUrl, "file.htm");

    Random rand = new Random();
    HtmlDocument doc = new HtmlDocument();
    doc.Load("file.htm");

    var linksOnPage = from lnks in doc.DocumentNode.Descendants()
                      where lnks.Name == "a" &&
                            lnks.Attributes["href"] != null &&
                            lnks.InnerText.Trim().Length > 0
                      select new
                      {
                          Url = lnks.Attributes["href"].Value,
                          Text = lnks.InnerText
                      };

    if (linksOnPage.Count() > 0)
    {
        // Random.Next's upper bound is exclusive, so pass Count() to make the last link selectable.
        int randomChoice = rand.Next(0, linksOnPage.Count());
        var link = linksOnPage.Skip(randomChoice).First();
        // do something with link...
    }
}
This code takes a search term and builds a full Google url. It then downloads the query into a local file, and opens the file with HTML Agility Pack.
Then the code creates a list of all the links on the page and uses a cobbled-together randomized selection.
As others have mentioned, you will need to get Google's permission to run the code against their servers. Not doing so places you in breach and may have awkward consequences.
Also, this code is indicative; it's not meant to be exemplary, or even buildable. It's a rough idea of the steps needed to get what you are after.
Your earlier design was attempting to interact with controls on Google's index page, and that kind of approach is far too brittle. You can hardly test it, for starters.
The HTML Agility Pack is here http://htmlagilitypack.codeplex.com/wikipage?title=Examples
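For the "// do something with link..." part, one simple option is to open the randomly chosen link in the default browser. A sketch (Google result hrefs are often relative or redirect-wrapped, so filtering for absolute URLs first is a reasonable precaution):
// Only launch absolute http(s) URLs; relative or wrapped hrefs would need cleaning up first.
if (link.Url.StartsWith("http://") || link.Url.StartsWith("https://"))
{
    System.Diagnostics.Process.Start(link.Url);
}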
I've been following this great tutorial:
http://buildmobile.com/twitter-in-a-windows-phone-7-app/#fbid=o0eLp-OipGa
But it seems that the PIN extraction method it uses doesn't work for me, or is out of date. I'm not an expert on HTML scraping and was wondering if someone could help me find a solution for extracting the PIN. The method used by the tutorial is:
private void BrowserNavigated(object sender, NavigationEventArgs e)
{
    if (AuthenticationBrowser.Visibility == Visibility.Collapsed)
    {
        AuthenticationBrowser.Visibility = Visibility.Visible;
    }
    if (e.Uri.AbsoluteUri.ToLower().Replace("https://", "http://") == AuthorizeUrl)
    {
        var htmlString = AuthenticationBrowser.SaveToString();
        var pinFinder = new Regex(@"<DIV id=oauth_pin>(?<pin>[A-Za-z0-9_]+)</DIV>", RegexOptions.IgnoreCase);
        var match = pinFinder.Match(htmlString);
        if (match.Length > 0)
        {
            var group = match.Groups["pin"];
            if (group.Length > 0)
            {
                pin = group.Captures[0].Value;
                if (!string.IsNullOrEmpty(pin))
                {
                    RetrieveAccessToken();
                }
            }
        }
        if (string.IsNullOrEmpty(pin))
        {
            Dispatcher.BeginInvoke(() => MessageBox.Show("Authorization denied by user"));
        }
        // Make sure pin is reset to null
        pin = null;
        AuthenticationBrowser.Visibility = Visibility.Collapsed;
    }
}
When running through that code, "match" always ends up null and the pin is never found. Everything else in the tutorial works, but I have no idea how to manipulate this code to extract the pin due to the new structure of the page.
I really appreciate the time,
Mike
I have found that Twitter has 2 different PIN pages, and I think they determine which page to redirect you to depending on your browser.
Something as simple as string parsing will work for you. The first PIN page I came across has the PIN code wrapped in a <code> tag, so simply look for <code> and parse it out:
if (innerHtml.Contains("<code>"))
{
    pin = innerHtml.Substring(innerHtml.IndexOf("<code>") + 6, 7);
}
The other page I came across (which looks like the one in the tutorial you are using) is wrapped using an id="oauth_pin" if I recall correctly. So, just parse that as well:
else if (innerHtml.Contains("oauth_pin"))
{
    pin = innerHtml.Substring(innerHtml.IndexOf("oauth_pin") + 10, 7);
}
innerHtml is a string that contains the body of the page, which seems to correspond to var htmlString = AuthenticationBrowser.SaveToString(); in your code.
I use both of these in my C# program and they work great, full snippet:
private void WebBrowser1DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    var innerHtml = webBrowser1.Document.Body.InnerHtml.ToLower();
    var code = string.Empty;
    if (innerHtml.Contains("<code>"))
    {
        code = innerHtml.Substring(innerHtml.IndexOf("<code>") + 6, 7);
    }
    else if (innerHtml.Contains("oauth_pin"))
    {
        code = innerHtml.Substring(innerHtml.IndexOf("oauth_pin") + 10, 7);
    }
    textBox1.Text = code;
}
Let me know if you have any question and I hope this helps!!
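If the fixed Substring offsets ever break (they assume the PIN is exactly 7 characters), a regex in the spirit of the tutorial's original approach may be more forgiving. A sketch, assuming the PIN is alphanumeric; the exact markup around oauth_pin differs between Twitter's pages, so the second alternative may need adjusting:
// Match either "<code>PIN</code>" or text following an oauth_pin element, without a fixed PIN length.
var pinFinder = new Regex(
    @"<code>\s*(?<pin>[A-Za-z0-9_]+)\s*</code>|oauth_pin[^>]*>\s*(?<pin>[A-Za-z0-9_]+)",
    RegexOptions.IgnoreCase);
var match = pinFinder.Match(innerHtml);
if (match.Success)
{
    code = match.Groups["pin"].Value;
}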
I needed to change the code suggested by Toma A to this one:
var innerHtml = webBrowser1.SaveToString();
var code = string.Empty;
if (innerHtml.Contains("<code>"))
{
    code = innerHtml.Substring(innerHtml.IndexOf("<code>") + 6, 7);
}
else if (innerHtml.Contains("oauth_pin"))
{
    code = innerHtml.Substring(innerHtml.IndexOf("oauth_pin") + 10, 7);
}
because this one doesn't work on Windows Phone:
var innerHtml = webBrowser1.Document.Body.InnerHtml.ToLower();
I created a program a while ago using C# that does some automation for a completely different program, but found that I need to access data from a Lotus Notes database. The only problem is, I can only seem to figure out how to open the database by the server's name (using session.GetDatabase())... I can't figure out how to open it by Replica ID. Does anyone know how I would go about that? (I don't want my program going down every time the server changes.)
public static string[] GetLotusNotesHelpTickets()
{
    NotesSession session = new NotesSession();
    session.Initialize(Password);

    // 85256B45:000EE057 = NTNOTES1A Server Replica ID
    NotesDatabase database = session.GetDatabase("NTNOTES1A", "is/gs/gshd.nsf", false);

    string SearchFormula = string.Concat("Form = \"Call Ticket\""
        , " & GroupAssignedTo = \"Business Systems\""
        , " & CallStatus = \"Open\"");

    NotesDocumentCollection collection = database.Search(SearchFormula, null, 0);
    NotesDocument document = collection.GetFirstDocument();

    string[] ticketList = new string[collection.Count];
    for (int i = 0; i < collection.Count; ++i)
    {
        ticketList[i] = ((object[])(document.GetItemValue("TicketNumber")))[0].ToString();
        document = collection.GetNextDocument(document);
    }

    document = null;
    collection = null;
    database = null;
    session = null;

    return ticketList;
}
This code is working fine, but if the server changed from NTNOTES1A, then nothing is going to work anymore.
You'll need to use the NotesDbDirectory.OpenDatabaseByReplicaID(rid$) method. To get the NotesDbDirectory, you can use the GetDbDirectory method of the session:
Set notesDbDirectory = notesSession.GetDbDirectory( serverName$ )
So you can use the code below to get a database by replicaID.
public static string[] GetLotusNotesHelpTickets()
{
    NotesSession session = new NotesSession();
    session.Initialize(Password);

    NotesDbDirectory notesDbDirectory = session.GetDbDirectory("NTNOTES1A");

    // 85256B45:000EE057 = NTNOTES1A Server Replica ID
    NotesDatabase database = notesDbDirectory.OpenDatabaseByReplicaID("85256B45:000EE057");

    string SearchFormula = string.Concat("Form = \"Call Ticket\""
        , " & GroupAssignedTo = \"Business Systems\""
        , " & CallStatus = \"Open\"");

    NotesDocumentCollection collection = database.Search(SearchFormula, null, 0);
    NotesDocument document = collection.GetFirstDocument();

    string[] ticketList = new string[collection.Count];
    for (int i = 0; i < collection.Count; ++i)
    {
        ticketList[i] = ((object[])(document.GetItemValue("TicketNumber")))[0].ToString();
        document = collection.GetNextDocument(document);
    }

    document = null;
    collection = null;
    database = null;
    session = null;

    return ticketList;
}
Unfortunately, this only solves half of your problem. I know you'd rather just tell Notes to fetch the database with a particular replicaID from the server closest to the client, just like the Notes Client does when you click on a DBLink or Bookmark. However, there is (or appears to be) no way to do that using the Notes APIs.
My suggestion is to either loop through a hard-coded list of potential servers by name and check whether the database is found (the OpenDatabaseByReplicaID method returns ERR_SYS_FILE_NOT_FOUND (error 0FA3) if the database is not found), or, if that's not a good option, expose the server name in an admin menu of your app so it can be changed easily if the server name changes at some point.
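A rough sketch of the first option; the server names are placeholders, and the assumption that a missing replica surfaces as a COMException through the interop layer should be verified:
// Try a hard-coded list of candidate servers until one of them has the replica.
string[] candidateServers = { "NTNOTES1A", "NTNOTES1B" }; // placeholder names
NotesDatabase database = null;
foreach (string server in candidateServers)
{
    try
    {
        NotesDbDirectory directory = session.GetDbDirectory(server);
        database = directory.OpenDatabaseByReplicaID("85256B45:000EE057");
        if (database != null && database.IsOpen)
        {
            break;
        }
    }
    catch (System.Runtime.InteropServices.COMException)
    {
        // ERR_SYS_FILE_NOT_FOUND (0FA3): the database is not on this server, try the next one.
    }
}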
set database = new NotesDatabase("")
call database.OpenByReplicaID("repid")