Download file from Button link to specific folder on C drive - c#

I am scraping a web page and navigating to the correct location, but as I am new to the whole C# world, I am stuck on downloading a PDF file.
The link is hidden behind this element:
var reportDownloadButton = driver.FindElementById("company_report_link");
The link looks something like: www.link.com/key/489498-654gjgh6-6g5h4jh/link.pdf
How do I download the file to C:\temp\?
Here is my code:
using System.Linq;
using OpenQA.Selenium.Chrome;
namespace WebDriverTest
{
class Program
{
static void Main(string[] args)
{
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments("headless");
// Initialize the Chrome Driver // chromeOptions
using (var driver = new ChromeDriver(chromeOptions))
{
// Go to the home page
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
// Get the page elements
var userNameField = driver.FindElementById("loginForm:username");
var userPasswordField = driver.FindElementById("loginForm:password");
var loginButton = driver.FindElementById("loginForm:loginButton");
// Type user name and password
userNameField.SendKeys("username");
userPasswordField.SendKeys("password");
// and click the login button
loginButton.Click();
driver.Navigate().GoToUrl("www.link2.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
var reportSearchField = driver.FindElementByClassName("form-control");
reportSearchField.SendKeys("Company");
var reportSearchButton = driver.FindElementById("search_filter_button");
reportSearchButton.Click();
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
}
}
}
}
EDIT 2:
I am not the sharpest pencil in the Stack Overflow community yet, and I don't understand how to do this with Selenium. I have tried it with WebClient:
var reportDownloadButton = driver.FindElementById("company_report_link");
var text = reportDownloadButton.GetAttribute("href");
// driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
WebClient client = new WebClient();
// Save the file to desktop for debugging
var desktop = System.Environment.GetFolderPath(System.Environment.SpecialFolder.Desktop);
string fileName = desktop + "\\myfile.pdf";
client.DownloadFile(text, fileName);
However, the web page seems to be a little tricky. I am getting:
System.Net.WebException: 'The remote server returned an error: (401) Unauthorized.'
The debugger points at:
client.DownloadFile(text, fileName);
I think it should really simulate a right click followed by "Save link as", otherwise this download will not work. Also, if I just click the button, it opens the PDF in a new Chrome tab.
EDIT3:
Should it be like this?
using System.Linq;
using OpenQA.Selenium.Chrome;
namespace WebDriverTest
{
class Program
{
static void Main(string[] args)
{
// declare chrome options with prefs
var options = new ChromeOptionsWithPrefs();
options.AddArguments("headless"); // we add headless here
// declare prefs
options.prefs = new Dictionary<string, object>
{
{ "download.default_directory", downloadFilePath }
};
// declare driver with these options
//driver = new ChromeDriver(options); we don't need this because we already declare driver below.
// Initialize the Chrome Driver // chromeOptions
using (var driver = new ChromeDriver(options))
{
// Go to the home page
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
// Get the page elements
var userNameField = driver.FindElementById("loginForm:username");
var userPasswordField = driver.FindElementById("loginForm:password");
var loginButton = driver.FindElementById("loginForm:loginButton");
// Type user name and password
userNameField.SendKeys("username");
userPasswordField.SendKeys("password");
// and click the login button
loginButton.Click();
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
var reportSearchField = driver.FindElementByClassName("form-control");
reportSearchField.SendKeys("company");
var reportSearchButton = driver.FindElementById("search_filter_button");
reportSearchButton.Click();
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
driver.Navigate().GoToUrl("www.link.com");
// click the link to download
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
// if clicking does not work, get href attribute and call GoToUrl() -- this may trigger download
var href = reportDownloadButton.GetAttribute("href");
driver.Navigate().GoToUrl(href);
}
}
}
}
}

You could try setting the download.default_directory Chrome driver preference:
// declare chrome options with prefs
var options = new ChromeOptionsWithPrefs();
// declare prefs
options.prefs = new Dictionary<string, object>
{
{ "download.default_directory", downloadFilePath }
};
// declare driver with these options
driver = new ChromeDriver(options);
// ... run your code here ...
// click the link to download
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
// if clicking does not work, get href attribute and call GoToUrl() -- this may trigger download
var href = reportDownloadButton.GetAttribute("href");
driver.Navigate().GoToUrl(href);
If reportDownloadButton is a link that triggers a download, then the file should download to the downloadFilePath you have set in download.default_directory.
Neither of these threads is in C#, but they discuss a similar issue:
How to control the download of files with Selenium + Python bindings in Chrome
How to use chrome webdriver in selenium to download files in python?
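ChromeOptionsWithPrefs does not appear to be part of the stock Selenium .NET bindings, so here is a minimal sketch of the same idea using ChromeOptions.AddUserProfilePreference instead. The class name ReportDownloader is hypothetical, and the plugins.always_open_pdf_externally preference is an assumption about what stops Chrome from opening the PDF in a tab; treat it as a starting point rather than a confirmed fix for this site.

```csharp
using System.IO;
using OpenQA.Selenium.Chrome;

// Hypothetical helper; not part of Selenium.
static class ReportDownloader
{
    // Builds ChromeOptions that send downloads to the given directory.
    public static ChromeOptions CreateOptions(string downloadDirectory)
    {
        // Make sure the target folder exists before Chrome starts.
        Directory.CreateDirectory(downloadDirectory);

        var options = new ChromeOptions();
        // Folder that Chrome saves downloads to.
        options.AddUserProfilePreference("download.default_directory", downloadDirectory);
        // Do not show the "Save as" dialog.
        options.AddUserProfilePreference("download.prompt_for_download", false);
        // Assumption: download PDFs instead of opening them in the built-in viewer.
        options.AddUserProfilePreference("plugins.always_open_pdf_externally", true);
        return options;
    }
}
```

It would be used as `using (var driver = new ChromeDriver(ReportDownloader.CreateOptions(@"C:\temp"))) { ... }`. Note that Selenium 4 removed FindElementById, so on current versions the lookup becomes `driver.FindElement(By.Id("company_report_link"))`.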

You can use WebClient.DownloadFile for that.
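A plain WebClient does not share the browser's login session, which is one plausible cause of the 401 reported in the question. The sketch below copies the cookies from the Selenium session into the request before calling DownloadFile. The class and method names are hypothetical, and it assumes the site authenticates through cookies only; if it relies on other headers or tokens, this will not be enough.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net;
using OpenQA.Selenium;

// Hypothetical helper; not part of Selenium or the .NET framework.
static class AuthenticatedDownload
{
    // Builds a "name=value; name2=value2" Cookie header from name/value pairs.
    public static string BuildCookieHeader(IEnumerable<KeyValuePair<string, string>> cookies)
    {
        return string.Join("; ", cookies.Select(c => c.Key + "=" + c.Value));
    }

    // Downloads url to targetPath, reusing the cookies of the logged-in browser session.
    public static void Download(IWebDriver driver, string url, string targetPath)
    {
        var cookies = driver.Manage().Cookies.AllCookies
            .Select(c => new KeyValuePair<string, string>(c.Name, c.Value));

        using (var client = new WebClient())
        {
            client.Headers.Add(HttpRequestHeader.Cookie, BuildCookieHeader(cookies));
            client.DownloadFile(url, targetPath);
        }
    }
}
```

With the code from the question this would be called as `AuthenticatedDownload.Download(driver, reportDownloadButton.GetAttribute("href"), @"C:\temp\report.pdf");` after the login step has completed.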

Related

Launch Tor Browser with Puppeteer-sharp

I'm trying to launch Tor Browser via puppeteer-sharp. I am using a .NET Core 3.1 console application and the latest version of puppeteer-sharp. So far, given the executable path, the console application launches Tor Browser but throws an exception.
using PuppeteerSharp;
using System.Threading;
using System.Threading.Tasks;
namespace puppeteer_tor
{
internal class Program
{
static async Task Main(string[] args)
{
string enableAutomation = "--enable-automation";
string noSandBox = "--no-sandbox";
string disableSetUidSandBox = "--disable-setuid-sandbox";
string[] argumentsWithoutExtension = new string[] { "C:\\Users\\selaka.nanayakkara\\Desktop\\Tor Browser\\Browser\\TorBrowser\\Data\\profile.default", "--proxy-server=socks5://127.0.0.1:9050", "--disable-gpu", "--disable-dev-shm-usage", enableAutomation, disableSetUidSandBox, noSandBox };
var options = new LaunchOptions
{
Headless = false,
ExecutablePath = @"C:\Users\selaka.nanayakkara\Desktop\Tor Browser\Browser\firefox.exe",
Args = argumentsWithoutExtension
};
using (var browser = await Puppeteer.LaunchAsync(options))
{
Thread.Sleep(5000);
var page = await browser.NewPageAsync();
await page.GoToAsync("https://check.torproject.org/");
var element = await page.WaitForSelectorAsync("h1");
var text = element.ToString();
}
}
}
}
The browser launches, but with a problem, and I get this exception:
Failed to launch browser!
With the below screen of the Tor browser :
Your help is much appreciated in the above issue. Thanks in advance.
Please find the attach code base here.
Set Headless to true and try again:
var options = new LaunchOptions
{
Headless = true,
ExecutablePath = @"C:\Program Files\Mozilla Firefox\firefox.exe",
Args = argumentsWithoutExtension
};
After many pitfalls, I was able to get puppeteer-sharp working with Tor Browser. For anyone who is interested, here is the code:
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
using PuppeteerSharp;
using System;
using System.Threading;
using System.Threading.Tasks;
namespace puppeteer_tor
{
internal class Program
{
static async Task Main(string[] args)
{
// Initiating Browser configuration
Console.WriteLine("Initiating Tor Browser");
Browser browser = (Browser)await Puppeteer.LaunchAsync(new LaunchOptions
{
Headless = false,
ExecutablePath = @"C:\Users\selaka.nanayakkara\Desktop\Tor Browser\Browser\firefox.exe",
Product = Product.Firefox,
UserDataDir = @"C:\Users\selaka.nanayakkara\Desktop\Tor Browser\Browser\TorBrowser\Data\profile.default",
DefaultViewport = null,
IgnoreHTTPSErrors = true,
Args = new[] { "-wait-for-browser" }
});
// Enabling proxy connectivity
Console.WriteLine("Initiating Tor proxy");
var page = await browser.PagesAsync();
Page page1 =(Page)page[0];
await page1.ClickAsync("#connectButton");
// Loading geoblocked url.
Console.WriteLine("Navigating to the URL");
Page page3 =(Page)await browser.NewPageAsync();
page3.DefaultNavigationTimeout = 0;
await page3.GoToAsync("http://nebraskalegislature.gov/laws/browse-chapters.php?chapter=20");
// Fetching content from the page.
Console.WriteLine("Fetching content in the URL.");
var content = await page3.GetContentAsync();
Console.WriteLine("Content fetching completed! ");
// Closing Browser
Console.WriteLine("Closing browser.");
await browser.CloseAsync();
}
}
}
Sample git repository : https://github.com/SelakaKithmal/puppeteer-tor

ChromeDriver getting detected after first request

I'm using Selenium ChromeDriver to navigate to pages and it works fine, but on the second request I get intercepted by Incapsula.
If I dispose of the driver every time, it works, though.
Here's the current code:
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments(new List<string>() { "headless" });
var chromeDriverService = ChromeDriverService.CreateDefaultService();
ChromeDriver driver = new ChromeDriver(chromeDriverService, chromeOptions);
The code below is in a loop which iterates over many records
//extract json variable from page output
ResultModel resultModel = new ResultModel();
driver = new ChromeDriver(chromeDriverService, chromeOptions);
driver.Navigate().GoToUrl($"https://www.website.ca{resultUrl}");
var modelString = driver.ExecuteScript("return JSON.stringify(window.the_variable);", new object[] { });
if (modelString != null)
resultModel = JsonConvert.DeserializeObject<ResultModel>(modelString.ToString());
driver.Dispose();
So this works, but disposing of and re-creating the driver every time slows the process down quite a bit.
When I simply navigate to the next page after the first request, I get intercepted.
What exactly happens when I dispose of and re-create the driver? Could I reproduce that effect without actually doing it?
Clearing the cookies seemed to have helped:
driver.ExecuteChromeCommand("Network.clearBrowserCookies", new Dictionary<string, object>() );
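For comparison, here is a sketch of the same loop using the standard WebDriver cookie API instead of the Chrome-specific command. It is not a confirmed workaround for Incapsula: DeleteAllCookies only clears cookies visible to the current page, whereas Network.clearBrowserCookies clears the whole browser, so results may differ. The class and method names are hypothetical; window.the_variable is taken from the question.

```csharp
using System.Collections.Generic;
using OpenQA.Selenium;

// Hypothetical helper; not part of Selenium.
static class SessionReset
{
    // Visits each URL with one driver, clearing cookies before every navigation,
    // and yields the JSON text of window.the_variable (null when it is missing).
    public static IEnumerable<string> ReadVariable(IWebDriver driver, IEnumerable<string> urls)
    {
        var js = (IJavaScriptExecutor)driver;
        foreach (var url in urls)
        {
            // Standard WebDriver call; affects cookies of the current page only.
            driver.Manage().Cookies.DeleteAllCookies();
            driver.Navigate().GoToUrl(url);

            var value = js.ExecuteScript("return JSON.stringify(window.the_variable);");
            yield return value?.ToString();
        }
    }
}
```

Each returned string can then be passed to JsonConvert.DeserializeObject<ResultModel> as in the original loop, while the driver is created once and disposed only after the last record.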

selenium chromedriver browser not open after publish

In my MVC web application, I'm using the Selenium C# WebDriver to read some data from an HTML file. The application works properly when I run it from Visual Studio (the HTML file opens in Chrome and is read correctly). But after I publish and host the application on an IIS server, the HTML file does not open in Chrome (the browser does not open at all). Here is my code:
public class CribController : Controller
{
public ActionResult Index()
{
try
{
IWebDriver driver = new ChromeDriver(@"C:\Selenium\");
driver.Navigate().GoToUrl("D:/Crib/toEdit_Foramted V2.html");
string text = driver.Title;
var table = driver.FindElement(By.Id("reportcontainerstyle-Ver2"));
var rowsss = table.FindElements(By.TagName("tr"));
//To get days arrears details
var mainTable = driver.FindElement(By.Name("ConsumerCreditDetails_Version3"));
var subTables = mainTable.FindElements(By.Id("bandstyle-Ver2"));
var rows = driver.FindElements(By.XPath("//table[.//td[normalize-space(.)='Credit Facility (CF) Details']][1]/following-sibling::table[1]//tr[not(@type='table-header')]"));
foreach (IWebElement row in rows)
{
//Some logic here
}
Thread.Sleep(3000);
driver.Close();
}
catch (Exception ex)
{
Logger.LogWriter("WebApplication2.Controllers", ex, "CribController", "Index");
Console.WriteLine(ex);
}
return View();
}
}
Why is this not working after publishing? How can I solve this?
I think we need more context about the error it throws.
There's a similar question in the Selenium GitHub repository, and this was the response: https://github.com/seleniumhq/selenium/issues/1125#issuecomment-257258747
You can declare the driver like this:
var driverService = ChromeDriverService.CreateDefaultService();
driverService.HideCommandPromptWindow = true;
var options = new ChromeOptions();
options.AddArguments(new List<string> { { "start-maximized" } });
IWebDriver driver;
driver = new ChromeDriver(driverService, options);
Another approach that might help: in the same solution, create a console application for the Selenium code and its execution, and call its constructor from the controller of the MVC project.

C# Selenium Chrome change homepage

I am trying to change Chrome's default homepage (the Google new tab page), but I haven't found a working solution.
What I have tried:
var _options = new ChromeOptions();
_options.AddUserProfilePreference("homepage", "http://www.example.com");
_options.AddUserProfilePreference("homepage_is_newtabpage", true);
_options.AddUserProfilePreference("session.restore_on_startup", 4);
_options.AddUserProfilePreference("session.startup_urls", new List<string>() { "http://in.gr"});
_options.AddArgument("--homepage=http://in.gr");
var _driver = new ChromeDriver(_options);
You can navigate to the page you want to after initializing the WebDriver before you begin the rest of your web automation.
...
var _driver = new ChromeDriver(_options);
_driver.Navigate().GoToUrl("http://in.gr");
Hope that is of some help.

Unable to set default download directory from chrome

I have problems with setting the default download folder for chrome driver.
I found some information related to this but none of it is working.
This is what I've tried:
var options = new ChromeOptionsWithPrefs();
options.AddArguments("start-maximized");
options.prefs = new Dictionary<string, object> {
{ "download.default_directory", folderName },
{ "download.prompt_for_download", false },
{ "intl.accept_languages", "nl" }};
webdriver = new ChromeDriver(chromedriver_path, options);
and
var options = new ChromeOptions();
options.AddUserProfilePreference("download.default_directory", folderName);
options.AddUserProfilePreference("intl.accept_languages", "nl");
options.AddUserProfilePreference("download.prompt_for_download", "false");
I am using ChromeDriver 2.9 (the latest one) and Chrome version 33.
I also tried setting a default directory in Chrome itself, expecting the web driver to pick it up on start, but that did not work either.
Do you have any idea how I can set this default folder?
Edit: adding declaration:
string folderName = @"C:\Browser";
I was running into trouble doing this with ChromeDriver 2.24 and Selenium 3.0.
For me the following code worked:
var service = ChromeDriverService.CreateDefaultService(driverPath);
var downloadPrefs = new Dictionary<string, object>
{
{"default_directory", @"C:\Users\underscore\MyCustomLocation"},
{"directory_upgrade", true}
};
var options = new ChromeOptions();
options.AddUserProfilePreference("download", downloadPrefs);
return new ChromeDriver(service, options);
Hopefully this helps anyone trying to do it now.
In case it changes in the future: I verified the required format by opening my default Chrome preferences file. The location of this file can be found by browsing to chrome://version and opening the Preferences file at the location specified by Profile Path. This showed that the default "download" key has an object with these values.
I could then check the changes were applied by opening the preferences file used by the Selenium Chrome browser (again by checking the location from chrome://version).
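Besides inspecting the Preferences file, it can help to confirm at runtime that a file actually lands in the configured folder before the driver is closed. The sketch below polls the folder until a finished file appears. DownloadWatcher is a hypothetical helper, and it assumes the folder already exists and starts out empty. It relies on Chrome keeping in-progress downloads under a .crdownload extension.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading;

// Hypothetical helper; not part of Selenium.
static class DownloadWatcher
{
    // Returns the path of the first finished file in the directory,
    // or null if nothing finished before the timeout elapsed.
    public static string WaitForDownload(string directory, TimeSpan timeout)
    {
        var deadline = DateTime.UtcNow + timeout;
        while (DateTime.UtcNow < deadline)
        {
            var files = Directory.GetFiles(directory);

            // Chrome writes partial downloads as *.crdownload and renames them when done.
            bool inProgress = files.Any(IsPartial);
            var finished = files.FirstOrDefault(f => !IsPartial(f));

            if (!inProgress && finished != null)
            {
                return finished;
            }
            Thread.Sleep(250);
        }
        return null;
    }

    private static bool IsPartial(string path)
    {
        return path.EndsWith(".crdownload", StringComparison.OrdinalIgnoreCase);
    }
}
```

A call such as `DownloadWatcher.WaitForDownload(@"C:\Users\underscore\MyCustomLocation", TimeSpan.FromSeconds(30))` after clicking the download link returns the saved file's path, or null when the preference was not applied or the download did not complete in time.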
Edit 2
Similarly in order to disable the inbuilt Chrome PDF Viewer which was blocking file downloads, I added the following lines to the configuration:
var pdfViewerPlugin = new Dictionary<string, object>
{
["enabled"] = false,
["name"] = "Chrome PDF Viewer"
};
var pluginsList = new Dictionary<string, object>
{
{ "plugins_list", new [] { pdfViewerPlugin } }
};
var downloadPreferences = new Dictionary<string, object>
{
{"default_directory", launchOptions.DownloadFolder},
{"directory_upgrade", true}
};
var options = new ChromeOptions();
options.AddUserProfilePreference("download", downloadPreferences);
options.AddUserProfilePreference("plugins", pluginsList);
Firefox
Since I wasted another hour on this today, here is the configuration for Firefox (49+) running the same version of Selenium (Note: this won't work with GeckoDriver 0.10.0 and Selenium 3.0.0+, GeckoDriver must be version 0.11.1):
var path = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "GeckoBinary");
var service = FirefoxDriverService.CreateDefaultService(path);
service.HideCommandPromptWindow = true;
var profile = new FirefoxProfile();
profile.SetPreference("browser.download.dir", myDownloadLocation);
profile.SetPreference("browser.download.downloadDir", myDownloadLocation);
profile.SetPreference("browser.download.defaultFolder", myDownloadLocation);
profile.SetPreference("browser.helperApps.neverAsk.saveToDisk", ContentTypes.AllTypesSingleLine);
profile.SetPreference("pdfjs.disabled", true);
profile.SetPreference("browser.download.useDownloadDir", true);
profile.SetPreference("browser.download.folderList", 2);
return new FirefoxDriver(service, new FirefoxOptions
{
Profile = profile
}, TimeSpan.FromMinutes(5));
Where ContentTypes.AllTypesSingleLine is just a string containing mime types, e.g.:
application/pdf;application/excel;...
As of GeckoDriver 0.11.1 and Selenium 3.0.1 this can be simplified to:
var options = new FirefoxOptions();
options.SetPreference("browser.download.dir", launchOptions.DownloadFolder);
options.SetPreference("browser.download.downloadDir", launchOptions.DownloadFolder);
options.SetPreference("browser.download.defaultFolder", launchOptions.DownloadFolder);
options.SetPreference("browser.helperApps.neverAsk.saveToDisk", ContentTypes.AllTypesSingleLine);
options.SetPreference("pdfjs.disabled", true);
options.SetPreference("browser.download.useDownloadDir", true);
options.SetPreference("browser.download.folderList", 2);
return new FirefoxDriver(service, options, TimeSpan.FromMinutes(5));
