I'm having a problem executing Page.addScriptToEvaluateOnNewDocument in the context of a new window opened via window.open from JavaScript.
I call driver.SwitchTo().Window(window) when the Page.WindowOpen event fires,
and then I use the code from the Selenium documentation:
var domains = GetDomains();
domains.Page.Enable(new OpenQA.Selenium.DevTools.V106.Page.EnableCommandSettings());
domains.Page.AddScriptToEvaluateOnNewDocument(new AddScriptToEvaluateOnNewDocumentCommandSettings()
{
Source = script
});
private static DevToolsSessionDomains GetDomains()
{
IDevTools devTools = driver as IDevTools;
DevToolsSession session = devTools.GetDevToolsSession();
var domains = session.GetVersionSpecificDomains<DevToolsSessionDomains>();
return domains;
}
but the TargetId in the session object is still the same.
The expected behavior is: JavaScript opens a new window, Selenium sends the AddScriptToEvaluateOnNewDocument command to that window's target, and the script is then executed every time a page loads there.
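One thing worth trying, as a sketch only (it assumes the .NET bindings attach a freshly created session to the currently focused window, which I haven't verified): after switching to the new window, close the existing DevTools session and open a new one before re-registering the script.
driver.SwitchTo().Window(window);
var devTools = (IDevTools)driver;
if (devTools.HasActiveDevToolsSession)
{
    // Drop the session that is still bound to the original target.
    devTools.CloseDevToolsSession();
}
// Re-attach and re-register the init script for the (hopefully new) target.
var session = devTools.GetDevToolsSession();
var domains = session.GetVersionSpecificDomains<DevToolsSessionDomains>();
await domains.Page.Enable(new OpenQA.Selenium.DevTools.V106.Page.EnableCommandSettings());
await domains.Page.AddScriptToEvaluateOnNewDocument(new AddScriptToEvaluateOnNewDocumentCommandSettings()
{
    Source = script
});
If the fresh session still reports the old TargetId, the bindings are probably not creating per-window sessions at all, and attaching to the new target would have to go through the raw CDP Target.attachToTarget command instead.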
I'm using .NET 6 with C# 10.
What I'm trying to achieve is to run Selenium in headless mode whilst remaining "undetectable". I followed the instructions from here: https://intoli.com/blog/not-possible-to-block-chrome-headless/ which provides a page to test your bot: https://intoli.com/blog/not-possible-to-block-chrome-headless/chrome-headless-test.html
Headless mode causes some JS variables (like window.chrome) to be unset or invalid, which causes the bot to be detected.
IJavaScriptExecutor doesn't work since it runs after the page has loaded. The same author mentions that you have to capture the response and inject JS in this article: https://intoli.com/blog/making-chrome-headless-undetectable/ (the "Putting It All Together" section).
Since the article uses Python, I followed this: https://www.automatetheplanet.com/webdriver-capture-modify-http-traffic/ and this: Titanium Web Proxy - Can't modify request body, which use the Titanium Web Proxy library (found here: https://github.com/justcoding121/titanium-web-proxy).
For testing, I used the site http://www.example.com and tried to modify the response (change something in the HTML, set JS vars, etc.).
Here is the proxy class:
public static class Proxy
{
static ProxyServer proxyServer = new ProxyServer(userTrustRootCertificate: true);
public static void StartProxy()
{
//Run on port 8080, decrypt ssl
ExplicitProxyEndPoint explicitEndPoint = new ExplicitProxyEndPoint(IPAddress.Any, 8080, true);
proxyServer.Start();
proxyServer.AddEndPoint(explicitEndPoint);
proxyServer.BeforeResponse += OnBeforeResponse;
}
static async Task OnBeforeResponse(object sender, SessionEventArgs ev)
{
var request = ev.HttpClient.Request;
var response = ev.HttpClient.Response;
//Modify title tag in example.com
if (String.Equals(ev.HttpClient.Request.RequestUri.Host, "www.example.com", StringComparison.OrdinalIgnoreCase))
{
var body = await ev.GetResponseBodyAsString();
body = body.Replace("<title>Example Domain</title>", "<title>Completely New Title</title>");
ev.SetResponseBodyString(body);
}
}
public static void StopProxy()
{
proxyServer.Stop();
}
}
And here is the selenium code:
Proxy.StartProxy();
string url = "localhost:8080";
var seleniumProxy = new OpenQA.Selenium.Proxy
{
HttpProxy = url,
SslProxy = url,
FtpProxy = url
};
ChromeOptions options = new ChromeOptions();
options.AddArgument("ignore-certificate-errors");
options.Proxy = seleniumProxy;
IWebDriver driver = new ChromeDriver(@"C:\ChromeDrivers\103\", options);
driver.Manage().Window.Maximize();
driver.Navigate().GoToUrl("http://www.example.com");
Console.ReadLine();
Proxy.StopProxy();
When Selenium loads http://www.example.com, the <title>Example Domain</title> should be changed to <title>Completely New Title</title>, but there was no change. I tried setting the proxy URL as http://localhost:8080, 127.0.0.1:8080, localhost:8080, etc., but there was no change.
As a test, I ran the code and left the proxy on. I then ran curl --proxy http://localhost:8080 http://www.example.com in git bash and the output was:
<!doctype html>
<html>
<head>
<title>Completely New Title</title>
. . .
The proxy was working, it was modifying the response for the curl command. But for some reason, it wasn't working with selenium.
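One more thing that might be worth a try (a sketch, not a verified fix): hand the proxy to Chrome directly with the --proxy-server command-line switch instead of (or in addition to) the OpenQA.Selenium.Proxy object, which takes the WebDriver proxy capability out of the equation.
// Hedged alternative: pass the proxy as a Chrome command-line switch.
var chromeOptions = new ChromeOptions();
chromeOptions.AddArgument("ignore-certificate-errors");
chromeOptions.AddArgument("--proxy-server=http://127.0.0.1:8080");
using var chromeDriver = new ChromeDriver(@"C:\ChromeDrivers\103\", chromeOptions);
chromeDriver.Navigate().GoToUrl("http://www.example.com");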
If you guys have a solution that can also work on HTTPS or a better method to execute JavaScript on page load, that would be great. If it's not possible, then I might need to forget about headless.
Thanks in advance for any help.
Selenium.WebDriver 4.3.0 and ChromeDriver 103
Try using the ExecuteCdpCommand method:
var options = new ChromeOptions();
options.AddArgument("--headless");
options.AddArgument("--user-agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'");
using var driver = new ChromeDriver(options);
Dictionary<string, object> cmdParams = new();
cmdParams.Add("source", "Object.defineProperty(navigator, 'webdriver', { get: () => false });");
driver.ExecuteCdpCommand("Page.addScriptToEvaluateOnNewDocument", cmdParams);
With this piece of code we bypass the first two checks, and if you follow the guide you've already mentioned, I think it's easy to bypass the rest.
UPDATE
var initialScript = @"Object.defineProperty(Notification, 'permission', {
    get: function () { return ''; }
});
window.chrome = true;
Object.defineProperty(navigator, 'webdriver', {
    get: () => false });
Object.defineProperty(window, 'chrome', {
    get: () => true });
Object.defineProperty(navigator, 'plugins', {
    writable: true,
    configurable: true,
    enumerable: true,
    value: 'works' });
navigator.plugins.length = 1;
Object.defineProperty(navigator, 'language', {
    get: () => 'el-GR' });
Object.defineProperty(navigator, 'deviceMemory', {
    get: () => 8 });
Object.defineProperty(navigator, 'hardwareConcurrency', {
    get: () => 8 });";
cmdParams["source"] = initialScript;
driver.ExecuteCdpCommand("Page.addScriptToEvaluateOnNewDocument", cmdParams);
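One usage note, pulled together as a sketch from the snippets above: the command only affects documents created after it is sent, so it has to be registered before the first navigation (the test URL is the one from the question).
// Register the init script first, then navigate; Page.addScriptToEvaluateOnNewDocument
// only applies to documents created after the command is sent.
using var driver = new ChromeDriver(options);
var cmdParams = new Dictionary<string, object> { { "source", initialScript } };
driver.ExecuteCdpCommand("Page.addScriptToEvaluateOnNewDocument", cmdParams);
driver.Navigate().GoToUrl("https://intoli.com/blog/not-possible-to-block-chrome-headless/chrome-headless-test.html");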
I'm trying to have the default web browser open a URL provided by the user, in this case let's say https://stackoverflow.com/. When the program runs, LaunchUriAsync returns true, but it isn't opening the browser and navigating to the site. How would I get a browser to open the URL?
private void Button_Click(object sender, RoutedEventArgs e)
{
string url = StoreUrl.Text;
Uri uri = new Uri(@"" + url);
DefaultLaunch(uri);
}
async void DefaultLaunch(Uri uri)
{
// Launch the URI
var success = await Windows.System.Launcher.LaunchUriAsync(uri);
if (success)
{
// URI launched
}
else
{
// URI launch failed
}
}
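For comparison, here is a minimal variant that awaits the launch directly in the click handler and validates the user input first; the Uri.TryCreate check is my addition, on the assumption that StoreUrl.Text may be typed without a scheme.
private async void Button_Click(object sender, RoutedEventArgs e)
{
    // LaunchUriAsync needs an absolute URI with a scheme; bare "stackoverflow.com" fails this check.
    if (!Uri.TryCreate(StoreUrl.Text, UriKind.Absolute, out Uri uri))
    {
        // Tell the user the URL is invalid.
        return;
    }
    bool success = await Windows.System.Launcher.LaunchUriAsync(uri);
    if (!success)
    {
        // URI launch failed.
    }
}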
I just created a new project and pasted the following code into the Loaded event
var success = await Windows.System.Launcher.LaunchUriAsync(new Uri("https://stackoverflow.com/"));
It worked fine for me.
If the call returns true, there is seemingly nothing to stop this from working. At this stage I would suggest this is an environmental issue, or something to do with the shell mechanism of that browser.
From here I would
Test this in a new clean project.
Change the default browser; if this works, then there is likely some sort of setting or corruption issue within the original browser (see the sketch after this list for a quick way to check the registered handler).
Disable your antivirus and see if it works.
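As a quick check for the second point without changing any system settings (a sketch; DisplayApplicationPicker is part of the UWP LauncherOptions API, and the call assumes an async handler), force the handler picker and see which app Windows associates with the URI:
var options = new Windows.System.LauncherOptions
{
    // Shows the "open with" picker instead of silently using the default handler.
    DisplayApplicationPicker = true
};
bool success = await Windows.System.Launcher.LaunchUriAsync(new Uri("https://stackoverflow.com/"), options);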
I'm using Selenium ChromeDriver to navigate to pages and it works fine, but on the second request I get intercepted by Incapsula.
If I dispose of the driver every time, it works though.
Here's the current code:
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments(new List<string>() { "headless" });
var chromeDriverService = ChromeDriverService.CreateDefaultService();
ChromeDriver driver = new ChromeDriver(chromeDriverService, chromeOptions);
The code below is in a loop that iterates over many records:
//extract json variable from page output
ResultModel resultModel = new ResultModel();
driver = new ChromeDriver(chromeDriverService, chromeOptions);
driver.Navigate().GoToUrl($"https://www.website.ca{resultUrl}");
var modelString = driver.ExecuteScript("return JSON.stringify(window.the_variable);", new object[] { });
if (modelString != null)
resultModel = JsonConvert.DeserializeObject<ResultModel>(modelString.ToString());
driver.Dispose();
So this works, but disposing and re-creating the driver every time slows the process down quite a bit.
When I try to simply navigate to the next page after the first request, I get intercepted.
What exactly is happening when I dispose and recreate? Could I spoof that without actually doing it?
Clearing the cookies seemed to have helped:
driver.ExecuteChromeCommand("Network.clearBrowserCookies", new Dictionary<string, object>() );
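If you would rather avoid raw CDP calls, the plain WebDriver cookie API may be enough (a sketch; note it only clears cookies visible to the current domain, unlike Network.clearBrowserCookies):
// Deletes all cookies for the current browsing context before the next navigation.
driver.Manage().Cookies.DeleteAllCookies();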
I am scraping a web page and navigating to the correct location, but being new to the whole C# world I am stuck on downloading a PDF file.
The link is hidden behind this:
var reportDownloadButton = driver.FindElementById("company_report_link");
It is something like: www.link.com/key/489498-654gjgh6-6g5h4jh/link.pdf
How do I download the file to C:\temp\?
Here is my code:
using System.Linq;
using OpenQA.Selenium.Chrome;
namespace WebDriverTest
{
class Program
{
static void Main(string[] args)
{
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments("headless");
// Initialize the Chrome Driver // chromeOptions
using (var driver = new ChromeDriver(chromeOptions))
{
// Go to the home page
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
// Get the page elements
var userNameField = driver.FindElementById("loginForm:username");
var userPasswordField = driver.FindElementById("loginForm:password");
var loginButton = driver.FindElementById("loginForm:loginButton");
// Type user name and password
userNameField.SendKeys("username");
userPasswordField.SendKeys("password");
// and click the login button
loginButton.Click();
driver.Navigate().GoToUrl("www.link2.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
var reportSearchField = driver.FindElementByClassName("form-control");
reportSearchField.SendKeys("Company");
var reportSearchButton = driver.FindElementById("search_filter_button");
reportSearchButton.Click();
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
}
}
}
}
EDIT 2:
I am not the sharpest pen in the Stack Overflow community yet. I don't understand how to do it with Selenium. I have done it with:
var reportDownloadButton = driver.FindElementById("company_report_link");
var text = reportDownloadButton.GetAttribute("href");
// driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
WebClient client = new WebClient();
// Save the file to desktop for debugging
var desktop = System.Environment.GetFolderPath(System.Environment.SpecialFolder.Desktop);
string fileName = desktop + "\\myfile.pdf";
client.DownloadFile(text, fileName);
However, the web page seems to be a little bit tricky. I am getting:
System.Net.WebException: 'The remote server returned an error: (401) Unauthorized.'
The debugger points at:
client.DownloadFile(text, fileName);
I think it really needs to simulate a right click and "Save link as...", otherwise this download will not work. Also, if I just click the button, it opens the PDF in a new Chrome tab.
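A common way around the 401 (a sketch only, assuming the site uses cookie-based authentication; it needs using System.Net; and using System.Linq;) is to reuse the cookies from the logged-in Selenium session for the download:
// Grab the link target from the page.
var reportDownloadButton = driver.FindElementById("company_report_link");
var href = reportDownloadButton.GetAttribute("href");
using (var client = new WebClient())
{
    // Forward the browser's authenticated cookies so the server does not answer 401.
    var cookieHeader = string.Join("; ",
        driver.Manage().Cookies.AllCookies.Select(c => $"{c.Name}={c.Value}"));
    client.Headers.Add(HttpRequestHeader.Cookie, cookieHeader);
    // Target path taken from the question.
    client.DownloadFile(href, @"C:\temp\link.pdf");
}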
EDIT 3:
Should it be like this?
using System.Linq;
using OpenQA.Selenium.Chrome;
namespace WebDriverTest
{
class Program
{
static void Main(string[] args)
{
// declare chrome options with prefs
var options = new ChromeOptionsWithPrefs();
options.AddArguments("headless"); // we add headless here
// declare prefs
options.prefs = new Dictionary<string, object>
{
{ "download.default_directory", downloadFilePath }
};
// declare driver with these options
//driver = new ChromeDriver(options); we don't need this because we already declare driver below.
// Initialize the Chrome Driver // chromeOptions
using (var driver = new ChromeDriver(options))
{
// Go to the home page
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
// Get the page elements
var userNameField = driver.FindElementById("loginForm:username");
var userPasswordField = driver.FindElementById("loginForm:password");
var loginButton = driver.FindElementById("loginForm:loginButton");
// Type user name and password
userNameField.SendKeys("username");
userPasswordField.SendKeys("password");
// and click the login button
loginButton.Click();
driver.Navigate().GoToUrl("www.link.com");
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
var reportSearchField = driver.FindElementByClassName("form-control");
reportSearchField.SendKeys("company");
var reportSearchButton = driver.FindElementById("search_filter_button");
reportSearchButton.Click();
driver.Manage().Timeouts().ImplicitWait = System.TimeSpan.FromSeconds(15);
driver.Navigate().GoToUrl("www.link.com");
// click the link to download
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
// if clicking does not work, get href attribute and call GoToUrl() -- this may trigger download
var href = reportDownloadButton.GetAttribute("href");
driver.Navigate().GoToUrl(href);
}
}
}
}
You could try setting the download.default_directory Chrome driver preference:
// declare chrome options with prefs
var options = new ChromeOptionsWithPrefs();
// declare prefs
options.prefs = new Dictionary<string, object>
{
{ "download.default_directory", downloadFilePath }
};
// declare driver with these options
driver = new ChromeDriver(options);
// ... run your code here ...
// click the link to download
var reportDownloadButton = driver.FindElementById("company_report_link");
reportDownloadButton.Click();
// if clicking does not work, get href attribute and call GoToUrl() -- this may trigger download
var href = reportDownloadButton.GetAttribute("href");
driver.Navigate().GoToUrl(href);
If reportDownloadButton is a link that triggers a download, the file should be saved to the downloadFilePath you have set in download.default_directory.
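If your version of the bindings has no ChromeOptionsWithPrefs (as far as I know it is a custom ChromeOptions subclass from older examples rather than a built-in type), a hedged alternative is to set the preferences directly, which newer Selenium .NET versions support via AddUserProfilePreference:
// Same idea without the custom subclass: write the prefs straight onto ChromeOptions.
var options = new ChromeOptions();
options.AddUserProfilePreference("download.default_directory", downloadFilePath);
options.AddUserProfilePreference("download.prompt_for_download", false);
var driver = new ChromeDriver(options);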
Neither of these threads is in C#, but they deal with a similar issue:
How to control the download of files with Selenium + Python bindings in Chrome
How to use chrome webdriver in selenium to download files in python?
You can use WebClient.DownloadFile for that.
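A minimal illustration of that suggestion, assuming href already holds the link's href attribute and the target path is just an example:
using (var client = new System.Net.WebClient())
{
    // Plain download; this will not help if the URL requires the browser's
    // authenticated session (see the 401 discussion above).
    client.DownloadFile(href, @"C:\temp\link.pdf");
}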
I run tests on Firefox and after that on Chrome. The test run on FF works correctly. The test with the same code but different driver settings (for Chrome) does not work correctly on Sauce Labs' Chrome.
I create the driver this way:
[SetUp]
public void Init()
{
DesiredCapabilities capabilities = DesiredCapabilities.Chrome();
capabilities.SetCapability(CapabilityType.Platform, "Windows 8.1");
capabilities.SetCapability(CapabilityType.Version, "36");
capabilities.SetCapability("name", "R(...)");
capabilities.SetCapability("username", "My username");
capabilities.SetCapability("accessKey", "my access key value");
driver = new RemoteWebDriver(
new Uri("http://ondemand.saucelabs.com:80/wd/hub"), capabilities);
baseURL = "http://starting address without www";
}
The test fails after one command. The page loads, and after that I get an error that it cannot find the element. I have tried many possibilities to find the element (by id, css, xpath).
Any idea what I am doing wrong?
The "www" is necessary before web address to correctly runs test in Chrome on Saucelabs.
driver = new RemoteWebDriver(
new Uri("http://ondemand.saucelabs.com:80/wd/hub"), capabilities);
baseURL = "http://www.(...)";
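The base URL is then consumed by the navigation step as usual, assuming the test starts with something like:
driver.Navigate().GoToUrl(baseURL);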