I'm using Selenium ChromeDriver to navigate to pages. The first request works fine, but on the second request I get intercepted by Incapsula.
If I dispose of the driver every time, it works, though.
Here's the current code:
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments(new List<string>() { "headless" });
var chromeDriverService = ChromeDriverService.CreateDefaultService();
ChromeDriver driver = new ChromeDriver(chromeDriverService, chromeOptions);
The code below is in a loop which iterates over many records
//extract json variable from page output
ResultModel resultModel = new ResultModel();
driver = new ChromeDriver(chromeDriverService, chromeOptions);
driver.Navigate().GoToUrl($"https://www.website.ca{resultUrl}");
var modelString = driver.ExecuteScript("return JSON.stringify(window.the_variable);", new object[] { });
if (modelString != null)
resultModel = JsonConvert.DeserializeObject<ResultModel>(modelString.ToString());
driver.Dispose();
So this works, but disposing and re-creating the driver every time slows the process down quite a bit.
When I try to simply navigate to the next page after the first request, I get intercepted.
What exactly happens when I dispose of and recreate the driver? Could I reproduce that effect without actually doing it?
Clearing the cookies seemed to have helped:
driver.ExecuteChromeCommand("Network.clearBrowserCookies", new Dictionary<string, object>() );
I am scraping a web page and can navigate to the correct location, but being new to the whole C# world, I am stuck on downloading a PDF file.
The link is hidden behind this element:
var reportDownloadButton = driver.FindElementById("company_report_link");
It is something like: www.link.com/key/489498-654gjgh6-6g5h4jh/link.pdf
How can I download the file to C:\temp\?
Here is my code:
using System.Linq;
using OpenQA.Selenium.Chrome;
namespace WebDriverTest
{
class Program
{
static void Main(string[] args)
{
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments("headless");
// Initialize the Chrome Driver // chromeOptions
using (var driver = new ChromeDriver(chromeOptions))
{
// Go to the home page
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
// Get the page elements
var userNameField = driver.FindElementById("loginForm:username");
var userPasswordField = driver.FindElementById("loginForm:password");
var loginButton = driver.FindElementById("loginForm:loginButton");
// Type user name and password
userNameField.SendKeys("username");
userPasswordField.SendKeys("password");
// and click the login button
loginButton.Click();
driver.Navigate().GoToUrl("www.link2.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
var reportSearchField = driver.FindElementByClassName("form-control");
reportSearchField.SendKeys("Company");
var reportSearchButton = driver.FindElementById("search_filter_button");
reportSearchButton.Click();
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
}
}
}
}
EDIT 2:
I am not the sharpest pencil in the Stack Overflow community yet, and I don't understand how to do it with Selenium. I have done it with WebClient:
var reportDownloadButton = driver.FindElementById("company_report_link");
var text = reportDownloadButton.GetAttribute("href");
// driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
WebClient client = new WebClient();
// Save the file to desktop for debugging
var desktop = System.Environment.GetFolderPath(System.Environment.SpecialFolder.Desktop);
string fileName = desktop + "\\myfile.pdf";
client.DownloadFile(text, fileName);
However, the web page seems to be a little tricky. I am getting:
System.Net.WebException: 'The remote server returned an error: (401) Unauthorized.'
The debugger points at:
client.DownloadFile(text, fileName);
I think it really needs to simulate right-clicking and choosing "Save link as"; otherwise the download will not work. Also, if I just click the button, it opens the PDF in a new Chrome tab.
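One plausible reason for the 401 is that WebClient starts its own HTTP session and does not carry the login cookies held by the Chrome session. A minimal sketch, assuming the site authenticates by cookie: copy the browser's cookies into a Cookie header before downloading. SessionCookies, BuildCookieHeader and the sample cookie values are hypothetical names introduced here for illustration. The Selenium and WebClient calls appear only in comments so that the helper compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SessionCookies
{
    // Joins name/value pairs into the "name=value; name2=value2" form
    // used by the HTTP Cookie request header.
    public static string BuildCookieHeader(IEnumerable<KeyValuePair<string, string>> cookies)
    {
        return string.Join("; ", cookies.Select(c => c.Key + "=" + c.Value));
    }
}

public static class Demo
{
    public static void Main()
    {
        // Placeholder cookies (made-up values). In the real program they
        // would come from the logged-in driver, for example:
        //   driver.Manage().Cookies.AllCookies
        //         .Select(c => new KeyValuePair<string, string>(c.Name, c.Value))
        var cookies = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("JSESSIONID", "abc123"),
            new KeyValuePair<string, string>("lang", "en")
        };

        string header = SessionCookies.BuildCookieHeader(cookies);
        Console.WriteLine(header); // JSESSIONID=abc123; lang=en

        // Then, before the download (assumes cookie-based authentication):
        //   client.Headers[HttpRequestHeader.Cookie] = header;
        //   client.DownloadFile(text, fileName);
    }
}
```

If the site uses a different scheme (for example an Authorization header), this will not help, and letting Chrome itself perform the download is likely the simpler route.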
EDIT 3:
Should it be like this?
using System.Collections.Generic;
using System.Linq;
using OpenQA.Selenium.Chrome;
namespace WebDriverTest
{
class Program
{
static void Main(string[] args)
{
// declare chrome options with prefs
var options = new ChromeOptionsWithPrefs();
options.AddArguments("headless"); // we add headless here
// declare prefs
options.prefs = new Dictionary<string, object>
{
{ "download.default_directory", downloadFilePath }
};
// declare driver with these options
//driver = new ChromeDriver(options); we don't need this because we already declare driver below.
// Initialize the Chrome Driver // chromeOptions
using (var driver = new ChromeDriver(options))
{
// Go to the home page
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
// Get the page elements
var userNameField = driver.FindElementById("loginForm:username");
var userPasswordField = driver.FindElementById("loginForm:password");
var loginButton = driver.FindElementById("loginForm:loginButton");
// Type user name and password
userNameField.SendKeys("username");
userPasswordField.SendKeys("password");
// and click the login button
loginButton.Click();
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
var reportSearchField = driver.FindElementByClassName("form-control");
reportSearchField.SendKeys("company");
var reportSearchButton = driver.FindElementById("search_filter_button");
reportSearchButton.Click();
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
driver.Navigate().GoToUrl("www.link.com");
// click the link to download
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
// if clicking does not work, get href attribute and call GoToUrl() -- this may trigger download
var href = reportDownloadButton.GetAttribute("href");
driver.Navigate().GoToUrl(href);
}
}
}
}
You could try setting the download.default_directory Chrome driver preference:
// declare chrome options with prefs
var options = new ChromeOptionsWithPrefs();
// declare prefs
options.prefs = new Dictionary<string, object>
{
{ "download.default_directory", downloadFilePath }
};
// declare driver with these options
driver = new ChromeDriver(options);
// ... run your code here ...
// click the link to download
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
// if clicking does not work, get href attribute and call GoToUrl() -- this may trigger download
var href = reportDownloadButton.GetAttribute("href");
driver.Navigate().GoToUrl(href);
If reportDownloadButton is a link that triggers a download, then the file should download to the path you have set in download.default_directory.
Neither of these threads is in C#, but they discuss a similar issue:
How to control the download of files with Selenium + Python bindings in Chrome
How to use chrome webdriver in selenium to download files in python?
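ChromeOptionsWithPrefs does not appear to be part of the Selenium .NET bindings; it seems to be a helper class from older answers. With the stock ChromeOptions class, AddUserProfilePreference can set the same preferences. The sketch below only builds the preference map. DownloadPrefs and Build are hypothetical names, and plugins.always_open_pdf_externally is included on the assumption that it makes Chrome save PDFs instead of opening them in a tab.

```csharp
using System;
using System.Collections.Generic;

public static class DownloadPrefs
{
    // Builds the Chrome profile preferences for unattended downloads.
    public static Dictionary<string, object> Build(string downloadDirectory)
    {
        return new Dictionary<string, object>
        {
            // Folder the browser saves files to.
            { "download.default_directory", downloadDirectory },
            // Do not show the "Save as" dialog.
            { "download.prompt_for_download", false },
            // Assumption: save PDFs instead of opening the built-in viewer.
            { "plugins.always_open_pdf_externally", true }
        };
    }
}

public static class Demo
{
    public static void Main()
    {
        var prefs = DownloadPrefs.Build(@"C:\temp");
        foreach (var pref in prefs)
        {
            // In the real program, with options being a ChromeOptions instance:
            //   options.AddUserProfilePreference(pref.Key, pref.Value);
            Console.WriteLine(pref.Key + " = " + pref.Value);
        }
    }
}
```

Keeping the map separate from the driver setup makes it easy to log or compare against the Preferences file that Chrome actually wrote.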
You can use WebClient.DownloadFile for that.
The problem appeared with the new Selenium.WebDriver.ChromeDriver.74.0.3729.6 update.
I'm trying to use this extension:
https://chrome.google.com/webstore/detail/block-image/pehaalcefcjfccdpbckoablngfkfgfgj?hl=tr
ChromeDriverService chromeDriverService = ChromeDriverService.CreateDefaultService();
chromeDriverService.HideCommandPromptWindow = true;
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.AddExtensions(Application.StartupPath + @"\block-image.crx");
driver = new ChromeDriver(chromeDriverService, chromeOptions, TimeSpan.FromMinutes(10));
When I run the program, the browser stays on "data:," and never goes to the page I want. I have tried other extensions with the same result, unfortunately.
I am trying to change Chrome's default homepage (Google tabs), but I didn't find a working solution.
What I have tried:
var _options = new ChromeOptions();
_options.AddUserProfilePreference("homepage", "http://www.example.com");
_options.AddUserProfilePreference("homepage_is_newtabpage", true);
_options.AddUserProfilePreference("session.restore_on_startup", 4);
_options.AddUserProfilePreference("session.startup_urls", new List<string>() { "http://in.gr"});
_options.AddArgument("--homepage=http://in.gr");
var _driver = new ChromeDriver(_options);
You can navigate to the page you want right after initializing the WebDriver, before you begin the rest of your web automation.
...
var _driver = new ChromeDriver(_options);
_driver.Navigate().GoToUrl("http://in.gr");
Hope that is of some help.
I want to try out headless Chrome, but I can't start the driver in headless mode. I was following the Google documentation. Am I missing something? Execution gets stuck on the var browser = new ChromeDriver(); line.
Here is my code:
var chromeOptions = new ChromeOptions
{
BinaryLocation = @"C:\Users\2-as Aukstas\Documents\Visual Studio 2017\Projects\ChromeTest\ChromeTest\bin\Debug\chromedriver.exe",
DebuggerAddress = "localhost:9222"
};
chromeOptions.AddArguments(new List<string>() {"headless", "disable-gpu" });
var browser = new ChromeDriver(chromeOptions);
browser.Navigate().GoToUrl("https://stackoverflow.com/");
Console.WriteLine(browser.FindElement(By.CssSelector("#h-top-questions")).Text);
UPDATE
Chrome version 60 is out, so all you need to do is download ChromeDriver and Selenium via NuGet and use this simple code; everything works like a charm. Amazing.
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
...
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments("headless");
using (var browser = new ChromeDriver(chromeOptions))
{
// add your code here
}
DATED
Until Chrome 60 is officially released, there is a workaround: download Chrome Canary and use headless mode with it. After installation, set BinaryLocation to point to Chrome Canary, and comment out the DebuggerAddress line (it causes Chrome to time out):
var chromeOptions = new ChromeOptions
{
BinaryLocation = @"C:\Users\2-as Aukstas\AppData\Local\Google\Chrome SxS\Application\chrome.exe",
//DebuggerAddress = "127.0.0.1:9222"
};
chromeOptions.AddArguments(new List<string>() { "no-sandbox", "headless", "disable-gpu" });
var _driver = new ChromeDriver(chromeOptions);
If you cannot get a reference to ChromeDriver, follow these steps:
Download the DLLs from: http://seleniumtestings.com/selenium-download/
Extract the archive; you should see Selenium.WebDriverBackedSelenium.dll, ThoughtWorks.Selenium.Core.dll, WebDriver.dll and WebDriver.Support.dll.
Add those files via "Add Reference".
Now you can use it:
String url = "http://www.google.com";
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments(new List<string>() {
"--silent-launch",
"--no-startup-window",
"no-sandbox",
"headless",});
var chromeDriverService = ChromeDriverService.CreateDefaultService();
chromeDriverService.HideCommandPromptWindow = true; // Hide the ChromeDriver console window.
ChromeDriver driver = new ChromeDriver(chromeDriverService, chromeOptions);
driver.Navigate().GoToUrl(url);
====
If you still get an error about a missing ChromeDriver.exe file when you run it, try adding the Selenium.WebDriver.ChromeDriver, WebDriver.ChromeDriver, WebDriver.ChromeDriver.win32, Selenium.Chrome.WebDriver packages via NuGet.
As an alternative, add the 2 libraries via NuGet (picture not shown), then try the code below:
String url = "http://www.google.com";
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments(new List<string>() { "headless" });
var chromeDriverService = ChromeDriverService.CreateDefaultService();
ChromeDriver driver = new ChromeDriver(chromeDriverService, chromeOptions);
driver.Navigate().GoToUrl(url);
What OS are you running? I see on developers.google.com/web/updates/2017/04/headless-chrome that headless won't be available on Windows until Chrome 60.
Below is how to set headless to true for the Firefox and Chrome browsers (Java bindings):
FirefoxOptions ffopt = new FirefoxOptions();
FirefoxOptions option = ffopt.setHeadless(true);
WebDriver driver = new FirefoxDriver(option);
ChromeOptions coptions = new ChromeOptions();
ChromeOptions options = coptions.setHeadless(true);
WebDriver driver = new ChromeDriver(options);
I have problems with setting the default download folder for chrome driver.
I found some information related to this but none of it is working.
This is what I've tried:
var options = new ChromeOptionsWithPrefs();
options.AddArguments("start-maximized");
options.prefs = new Dictionary<string, object> {
{ "download.default_directory", folderName },
{ "download.prompt_for_download", false },
{ "intl.accept_languages", "nl" }};
webdriver = new ChromeDriver(chromedriver_path, options);
and
var options = new ChromeOptions();
options.AddUserProfilePreference("download.default_directory", folderName);
options.AddUserProfilePreference("intl.accept_languages", "nl");
options.AddUserProfilePreference("download.prompt_for_download", "false");
I am using ChromeDriver 2.9 (the latest one) and Chrome version 33.
I also tried setting a default directory in Chrome itself, expecting the default directory to have changed when I start the web driver, but that did not work either.
Do you have any new ideas on how I can set this default folder?
Edit: adding declaration:
string folderName = @"C:\Browser";
I was running into trouble doing this with ChromeDriver 2.24 and Selenium 3.0.
For me the following code worked:
var service = ChromeDriverService.CreateDefaultService(driverPath);
var downloadPrefs = new Dictionary<string, object>
{
{"default_directory", @"C:\Users\underscore\MyCustomLocation"},
{"directory_upgrade", true}
};
var options = new ChromeOptions();
options.AddUserProfilePreference("download", downloadPrefs);
return new ChromeDriver(service, options);
Hopefully this helps anyone trying to do it now.
In case it changes in the future: I verified the required format by opening my default Chrome preferences file. The location of this file can be found by browsing to chrome://version and opening the Preferences file at the location specified by Profile Path. This showed that the default "download" key has an object with these values.
I could then check the changes were applied by opening the preferences file used by the Selenium Chrome browser (again by checking the location from chrome://version).
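The manual check above can be partly automated, since the Preferences file is JSON. A minimal sketch, assuming .NET Core 3.0 or later (for System.Text.Json) and that the file has a top-level "download" object as described: it reads download.default_directory from the file contents. PreferencesFile and ReadDownloadDirectory are hypothetical names, and the sample JSON is made up for the demonstration.

```csharp
using System;
using System.Text.Json;

public static class PreferencesFile
{
    // Returns download.default_directory from the JSON text of a Chrome
    // Preferences file, or null when the key is missing.
    public static string ReadDownloadDirectory(string preferencesJson)
    {
        using (JsonDocument doc = JsonDocument.Parse(preferencesJson))
        {
            JsonElement root = doc.RootElement;
            if (root.ValueKind == JsonValueKind.Object &&
                root.TryGetProperty("download", out JsonElement download) &&
                download.ValueKind == JsonValueKind.Object &&
                download.TryGetProperty("default_directory", out JsonElement dir) &&
                dir.ValueKind == JsonValueKind.String)
            {
                return dir.GetString();
            }
            return null;
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        // Made-up sample. In practice, read the file found via chrome://version:
        //   File.ReadAllText(Path.Combine(profilePath, "Preferences"))
        string sample =
            "{\"download\":{\"default_directory\":\"C:\\\\Browser\",\"directory_upgrade\":true}}";
        Console.WriteLine(PreferencesFile.ReadDownloadDirectory(sample)); // C:\Browser
    }
}
```

On older .NET Framework projects, the same lookup could be done with a JSON library such as the one already used for JsonConvert.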
Edit 2
Similarly in order to disable the inbuilt Chrome PDF Viewer which was blocking file downloads, I added the following lines to the configuration:
var pdfViewerPlugin = new Dictionary<string, object>
{
["enabled"] = false,
["name"] = "Chrome PDF Viewer"
};
var pluginsList = new Dictionary<string, object>
{
{ "plugins_list", new [] { pdfViewerPlugin } }
};
var downloadPreferences = new Dictionary<string, object>
{
{"default_directory", launchOptions.DownloadFolder},
{"directory_upgrade", true}
};
var options = new ChromeOptions();
options.AddUserProfilePreference("download", downloadPreferences);
options.AddUserProfilePreference("plugins", pluginsList);
Firefox
Since I wasted another hour on this today, here is the configuration for Firefox (49+) running the same version of Selenium (note: this won't work with GeckoDriver 0.10.0 and Selenium 3.0.0+; GeckoDriver must be version 0.11.1):
var path = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "GeckoBinary");
var service = FirefoxDriverService.CreateDefaultService(path);
service.HideCommandPromptWindow = true;
var profile = new FirefoxProfile();
profile.SetPreference("browser.download.dir", myDownloadLocation);
profile.SetPreference("browser.download.downloadDir", myDownloadLocation);
profile.SetPreference("browser.download.defaultFolder", myDownloadLocation);
profile.SetPreference("browser.helperApps.neverAsk.saveToDisk", ContentTypes.AllTypesSingleLine);
profile.SetPreference("pdfjs.disabled", true);
profile.SetPreference("browser.download.useDownloadDir", true);
profile.SetPreference("browser.download.folderList", 2);
return new FirefoxDriver(service, new FirefoxOptions
{
Profile = profile
}, TimeSpan.FromMinutes(5));
Where ContentTypes.AllTypesSingleLine is just a string containing mime types, e.g.:
application/pdf;application/excel;...
As of GeckoDriver 0.11.1 and Selenium 3.0.1 this can be simplified to:
var options = new FirefoxOptions();
options.SetPreference("browser.download.dir", launchOptions.DownloadFolder);
options.SetPreference("browser.download.downloadDir", launchOptions.DownloadFolder);
options.SetPreference("browser.download.defaultFolder", launchOptions.DownloadFolder);
options.SetPreference("browser.helperApps.neverAsk.saveToDisk", ContentTypes.AllTypesSingleLine);
options.SetPreference("pdfjs.disabled", true);
options.SetPreference("browser.download.useDownloadDir", true);
options.SetPreference("browser.download.folderList", 2);
return new FirefoxDriver(service, options, TimeSpan.FromMinutes(5));