How to HttpGet a url using ChromeDriver in c#


I am using the ChromeDriver from OpenQA.Selenium.Chrome to automate the browser.
In this application, the URL only sends a response when the user is logged in to the browser; otherwise it returns a 400 error.
After login I need to check whether the URL exists, but I cannot find any function on the IWebDriver driver object for issuing an HTTP GET request.
IWebDriver driver = new ChromeDriver();
Thanks in advance.

I found a solution using the WebDriverWait class together with IJavaScriptExecutor, which can run JavaScript in the current browser instance.
The script issues a synchronous XMLHttpRequest (async = false), like this:
return (function () {
    var result = false;
    try {
        var xhttp = new XMLHttpRequest();
        xhttp.open('GET', '<YOUR GET URL HERE>', false); // last param is async = false
        xhttp.send();
        console.log(xhttp.responseText);
        result = !xhttp.responseText.includes('HTTP ERROR 404');
    } catch (err) {
    }
    return result;
})()
I then run this script in a loop until the timeout expires (the TimeSpan below is 5000 seconds), using the Until method of WebDriverWait and casting the driver to IJavaScriptExecutor:
IWebDriver driver = new ChromeDriver();
TimeSpan timeToWait = TimeSpan.FromSeconds(5000);
WebDriverWait wait1 = new WebDriverWait(driver, timeToWait);
wait1.Until(d =>
{
string url = "<Your GET request URL>";
bool isURLReachable = (bool)((IJavaScriptExecutor)d).ExecuteScript(String.Format(@"return (function() {{ var result = false; try {{ var xhttp = new XMLHttpRequest(); xhttp.open('GET', '{0}', false); xhttp.send(); console.log(xhttp.responseText); result = !xhttp.responseText.includes('HTTP ERROR 404'); }} catch (err) {{ }} return result;}})()", url));
return isURLReachable;
});
This waits until isURLReachable is true, or throws once the timeout expires.
Hope this will help others as well.
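
A possible variation on the approach above, as a minimal sketch rather than a definitive implementation: instead of searching the response text for 'HTTP ERROR 404', the script can return xhttp.status and leave the decision to the C# side. The UrlProbe class and its method names are hypothetical, and treating 200-299 as "exists" is an assumption (the question mentions a 400 for logged-out users, so adjust the range to your application). Passing the URL as arguments[0] avoids String.Format and the doubled braces. The request still runs inside the page, so the browser's same-origin rules apply to it.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

// Hypothetical helper, not part of Selenium.
public static class UrlProbe
{
    // ExecuteScript wraps this text in a function body, so a top-level return is allowed.
    private const string ProbeScript = @"
        var xhr = new XMLHttpRequest();
        try {
            xhr.open('GET', arguments[0], false); // false = synchronous
            xhr.send();
            return xhr.status;
        } catch (err) {
            return 0; // network or cross-origin failure
        }";

    // Assumption: any 2xx status means the URL exists for the logged-in user.
    public static bool IsReachable(long status)
    {
        return status >= 200 && status < 300;
    }

    // Polls until the URL answers with a 2xx status; WebDriverWait throws
    // WebDriverTimeoutException if that does not happen within the timeout.
    public static bool WaitUntilReachable(IWebDriver driver, string url, TimeSpan timeout)
    {
        var wait = new WebDriverWait(driver, timeout);
        return wait.Until(d =>
        {
            object raw = ((IJavaScriptExecutor)d).ExecuteScript(ProbeScript, url);
            return IsReachable(Convert.ToInt64(raw));
        });
    }
}
```

Usage would look like UrlProbe.WaitUntilReachable(driver, "<Your GET request URL>", TimeSpan.FromSeconds(30)), called after the login step has finished and while the driver is on a page of the same origin as the URL.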

Related

C# Selenium Inject/execute JS on page load

I'm using .NET Core 6 with C# 10.
What I'm trying to achieve is to run Selenium in headless mode whilst remaining "undetectable". I followed the instructions from here: https://intoli.com/blog/not-possible-to-block-chrome-headless/ which provided a page to test your bot: https://intoli.com/blog/not-possible-to-block-chrome-headless/chrome-headless-test.html
Headless mode causes some JS vars (like window.chrome) to be unset or invalid which causes the bot to be detected.
IJavaScriptExecutor doesn't work since it runs after the page has loaded. The same author mentions that you have to capture the response and inject JS in this article: https://intoli.com/blog/making-chrome-headless-undetectable/ (Putting It All Together section)
Since the article uses Python, I followed this: https://www.automatetheplanet.com/webdriver-capture-modify-http-traffic/ and this question: "Titanium Web Proxy - Can't modify request body", which uses the Titanium Web Proxy library (found here: https://github.com/justcoding121/titanium-web-proxy)
For testing, I used this site http://www.example.com and tried to modify the response (change something in the HTML, set JS vars, etc)
Here is the proxy class:
public static class Proxy
{
static ProxyServer proxyServer = new ProxyServer(userTrustRootCertificate: true);
public static void StartProxy()
{
//Run on port 8080, decrypt ssl
ExplicitProxyEndPoint explicitEndPoint = new ExplicitProxyEndPoint(IPAddress.Any, 8080, true);
proxyServer.Start();
proxyServer.AddEndPoint(explicitEndPoint);
proxyServer.BeforeResponse += OnBeforeResponse;
}
static async Task OnBeforeResponse(object sender, SessionEventArgs ev)
{
var request = ev.HttpClient.Request;
var response = ev.HttpClient.Response;
//Modify title tag in example.com
if (String.Equals(ev.HttpClient.Request.RequestUri.Host, "www.example.com", StringComparison.OrdinalIgnoreCase))
{
var body = await ev.GetResponseBodyAsString();
body = body.Replace("<title>Example Domain</title>", "<title>Completely New Title</title>");
ev.SetResponseBodyString(body);
}
}
public static void StopProxy()
{
proxyServer.Stop();
}
}
And here is the selenium code:
Proxy.StartProxy();
string url = "localhost:8080";
var seleniumProxy = new OpenQA.Selenium.Proxy
{
HttpProxy = url,
SslProxy = url,
FtpProxy = url
};
ChromeOptions options = new ChromeOptions();
options.AddArgument("ignore-certificate-errors");
options.Proxy = seleniumProxy;
IWebDriver driver = new ChromeDriver(@"C:\ChromeDrivers\103\", options);
driver.Manage().Window.Maximize();
driver.Navigate().GoToUrl("http://www.example.com");
Console.ReadLine();
TornCityBot.Proxy.StopProxy();
When selenium loads http://www.example.com, the <title>Example Domain</title> should be changed to <title>Completely New Title</title>, but there was no change. I tried setting the proxy URL as http://localhost:8080, 127.0.0.1:8080, localhost:8080, etc but there was no change.
As a test, I ran the code and left the proxy on. I then ran curl --proxy http://localhost:8080 http://www.example.com in git bash and the output was:
<!doctype html>
<html>
<head>
<title>Completely New Title</title>
. . .
The proxy was working, it was modifying the response for the curl command. But for some reason, it wasn't working with selenium.
If you guys have a solution that can also work on HTTPS or a better method to execute JavaScript on page load, that would be great. If it's not possible, then I might need to forget about headless.
Thanks in advance for any help.

Selenium.WebDriver 4.3.0 and ChromeDriver 103
Try using the ExecuteCdpCommand method:
var options = new ChromeOptions();
options.AddArgument("--headless");
options.AddArgument("--user-agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'");
using var driver = new ChromeDriver(options);
Dictionary<string, object> cmdParams= new();
cmdParams.Add("source", "Object.defineProperty(navigator, 'webdriver', { get: () => false });");
driver.ExecuteCdpCommand("Page.addScriptToEvaluateOnNewDocument", cmdParams);
This code gets past the first two checks on the test page. If you follow the guide you already mentioned, I think the rest are easy to handle the same way.
UPDATE
var initialScript = @"Object.defineProperty(Notification, 'permission', {
    get: function () { return ''; }
})
window.chrome = true
Object.defineProperty(navigator, 'webdriver', {
    get: () => false})
Object.defineProperty(window, 'chrome', {
    get: () => true})
Object.defineProperty(navigator, 'plugins', {
    writable: true,
    configurable: true,
    enumerable: true,
    value: 'works'})
navigator.plugins.length = 1
Object.defineProperty(navigator, 'language', {
    get: () => 'el - GR'});
Object.defineProperty(navigator, 'deviceMemory', {
    get: () => 8});
Object.defineProperty(navigator, 'hardwareConcurrency', {
    get: () => 8});";
// Assign through the indexer: calling Add again would throw because the "source" key already exists.
cmdParams["source"] = initialScript;
driver.ExecuteCdpCommand("Page.addScriptToEvaluateOnNewDocument", cmdParams);
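
To check that a script registered this way actually runs before the page's own scripts, one option is to read the overridden value back after navigating. The following is a minimal sketch, assuming Selenium.WebDriver 4.x, a chromedriver that Selenium can locate, and a project that allows top-level statements (.NET 6); the test page URL is the one from the question.

```csharp
using System;
using System.Collections.Generic;
using OpenQA.Selenium.Chrome;

var options = new ChromeOptions();
options.AddArgument("--headless");

using var driver = new ChromeDriver(options);

// Register the script first: it only applies to documents created afterwards.
var cmdParams = new Dictionary<string, object>
{
    ["source"] = "Object.defineProperty(navigator, 'webdriver', { get: () => false });"
};
driver.ExecuteCdpCommand("Page.addScriptToEvaluateOnNewDocument", cmdParams);

driver.Navigate().GoToUrl("https://intoli.com/blog/not-possible-to-block-chrome-headless/chrome-headless-test.html");

// Read the value back from the loaded page to see whether the override is in place.
object value = driver.ExecuteScript("return navigator.webdriver;");
Console.WriteLine($"navigator.webdriver = {value}");
```

The ordering is the point of the sketch: the CDP command is sent before GoToUrl, because a script run through ExecuteScript after navigation executes only once the page has already loaded.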

ChromeDriver getting detected after first request

I'm using Selenium ChromeDriver to navigate to pages and it works fine, but on the second request I get intercepted by Incapsula.
If I dispose of the driver every time, it works though.
Here's the current code:
var chromeOptions = new ChromeOptions();
chromeOptions.AddArguments(new List<string>() { "headless" });
var chromeDriverService = ChromeDriverService.CreateDefaultService();
ChromeDriver driver = new ChromeDriver(chromeDriverService, chromeOptions);
The code below is in a loop which iterates over many records
//extract json variable from page output
ResultModel resultModel = new ResultModel();
driver = new ChromeDriver(chromeDriverService, chromeOptions);
driver.Navigate().GoToUrl($"https://www.website.ca{resultUrl}");
var modelString = driver.ExecuteScript("return JSON.stringify(window.the_variable);", new object[] { });
if (modelString != null)
resultModel = JsonConvert.DeserializeObject<ResultModel>(modelString.ToString());
driver.Dispose();
So this works, but disposing and re-creating the driver every time slows the process quite a bit.
When I try to simply navigate to the next page after the first request, I get intercepted.
What exactly happens when I dispose and re-create the driver? Could I reproduce that effect without actually doing it?

Clearing the cookies seemed to have helped:
driver.ExecuteChromeCommand("Network.clearBrowserCookies", new Dictionary<string, object>() );
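
For comparison, here is a minimal sketch of the same idea using the standard WebDriver cookie API instead of a Chrome-specific command, with one driver reused across the whole loop. The Scraper class and ReadVariable method are hypothetical names. One difference to keep in mind: DeleteAllCookies removes only the cookies visible to the current page, whereas Network.clearBrowserCookies clears cookies browser-wide, so whether this is enough for a given site is an assumption you would need to test.

```csharp
using System.Collections.Generic;
using OpenQA.Selenium.Chrome;

// Hypothetical helper illustrating driver reuse with cookie clearing.
public static class Scraper
{
    // Visits each URL with a single driver, dropping cookies after every page.
    public static List<string> ReadVariable(IEnumerable<string> urls)
    {
        var results = new List<string>();
        var options = new ChromeOptions();
        options.AddArgument("headless");

        using (var driver = new ChromeDriver(options))
        {
            foreach (string url in urls)
            {
                driver.Navigate().GoToUrl(url);

                // Null when window.the_variable is undefined on the page.
                object json = driver.ExecuteScript("return JSON.stringify(window.the_variable);");
                if (json != null)
                {
                    results.Add(json.ToString());
                }

                // Standard WebDriver call; clears the cookies of the page just visited.
                driver.Manage().Cookies.DeleteAllCookies();
            }
        }
        return results;
    }
}
```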

HttpClient.GetByteArrayAsync(…) “deadlock” when there is no internet connection in .NET standard library calling from UWP

I am developing a UWP application for a document management system, and I am trying to open documents from it. When I click to open a document, the app downloads it and then opens it in the default application. The problem is that the download never completes if the internet connection drops in the middle of the process, that is, after the HttpClient call has already started. My code is as follows:
public async Task<DownloadFileDetail> DownloadFileAsync(int dmsFileId)
{
if (dmsFileId <= 0)
{
throw new ArgumentException("Invalid DMS File Id");
}
try
{
return await Task.Run(async () =>
{
DownloadFileDetail fileDetail = new DownloadFileDetail()
{
DocId = dmsFileId
};
string apiUrl = $"files/download/latest/{dmsFileId}";
HttpClient httpClient = new HttpClient();
httpClient.BaseAddress = new Uri(BaseApiUrl);
httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {SessionStore.Instance.AuthToken}");
var response = await httpClient.GetByteArrayAsync(apiUrl); // <-- hangs here ("deadlock")
fileDetail.Content = response;
return fileDetail;
});
}
catch (Exception ex)
{
}
return new DownloadFileDetail()
{
DocId = dmsFileId
};
}
The download is invoked from the UWP app, which calls into the .NET Standard library that holds the code above. It would be great if someone could help me solve the problem.
Thanks
Update:
The above code works on my laptop but not on any other laptop in the dev environment.

when there is no internet connection in .NET Standard library calling from UWP
If the deadlock only occurs when there is no internet connection, you could check whether the internet is available before sending the HTTP request. See NetworkHelper:
if (NetworkHelper.Instance.ConnectionInformation.IsInternetAvailable)
{
// sending the request.
}

First, remove the Task.Run(async () => ...) call:
try
{
DownloadFileDetail fileDetail = new DownloadFileDetail()
{
DocId = dmsFileId
};
string apiUrl = $"files/download/latest/{dmsFileId}";
HttpClient httpClient = new HttpClient();
httpClient.BaseAddress = new Uri(BaseApiUrl);
httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {SessionStore.Instance.AuthToken}");
var response = await httpClient.GetByteArrayAsync(apiUrl); // <-- the call that hung in the question
fileDetail.Content = response;
return fileDetail;
}
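
Whatever the root cause turns out to be, it can help to make the download fail fast instead of waiting indefinitely. The sketch below is one way to do that with a per-call timeout. Downloader and TryDownloadAsync are hypothetical names, the Authorization header from the question is left out for brevity, and nothing here is confirmed to be the fix for the original hang. ConfigureAwait(false) keeps the continuation off the caller's UI context, which matters if any caller ever blocks on the returned task.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for a .NET Standard library.
public static class Downloader
{
    // One shared HttpClient for the whole app, rather than one per call.
    private static readonly HttpClient Client = new HttpClient();

    // Returns the file bytes, or null if the request failed or timed out.
    public static async Task<byte[]> TryDownloadAsync(Uri uri, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                using (HttpResponseMessage response =
                    await Client.GetAsync(uri, cts.Token).ConfigureAwait(false))
                {
                    response.EnsureSuccessStatusCode();
                    return await response.Content.ReadAsByteArrayAsync().ConfigureAwait(false);
                }
            }
            catch (HttpRequestException)
            {
                return null; // connection failure or non-success status code
            }
            catch (OperationCanceledException)
            {
                return null; // the timeout elapsed before the response completed
            }
        }
    }
}
```

A caller in the UWP layer would await TryDownloadAsync(new Uri(...), TimeSpan.FromSeconds(30)) and show an error when the result is null, rather than swallowing the exception in an empty catch block.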

Lambda Function using c# cannot invoke external HTTPS APIs

I am trying to invoke external APIs from an AWS Lambda function written in C#. The Lambda function is deployed in No VPC mode, and I am calling it from an Alexa skill. The code works fine for an http request, but it does not work for https.
The code below works when I use http://www.google.com.
But if I replace http with https, I get this error in CloudWatch:
"Process exited before completing request."
Even the log written in the catch block does not show up in CloudWatch.
public class Function
{
public const string INVOCATION_NAME = "bingo";
public async Task<SkillResponse> FunctionHandler(SkillRequest input, ILambdaContext context)
{
var requestType = input.GetRequestType();
if (requestType == typeof(IntentRequest))
{
string response = "";
IntentRequest request = input.Request as IntentRequest;
response += $"About {request.Intent.Slots["carmodel"].Value}";
try
{
using (var httpClient = new HttpClient())
{
Console.WriteLine("Trying to access internet");
//var resp=httpClient.GetAsync("http://www.google.com").Result // this works perfect!
var resp = httpClient.GetAsync("https://www.google.com").Result; // this throws error
Console.WriteLine("Call was successful");
}
}
catch (Exception ex)
{
Console.WriteLine("Exception from main function " + ex.Message);
Console.WriteLine(ex.InnerException.Message);
Console.WriteLine(ex.StackTrace);
}
return MakeSkillResponse(response, true);
}
else
{
return MakeSkillResponse(
$"I don't know how to handle this intent. Please say something like Alexa, ask {INVOCATION_NAME} about Tesla.",
true);
}
}
private SkillResponse MakeSkillResponse(string outputSpeech, bool shouldEndSession,
string repromptText = "Just say, tell me about car models to learn more. To exit, say, exit.")
{
var response = new ResponseBody
{
ShouldEndSession = shouldEndSession,
OutputSpeech = new PlainTextOutputSpeech { Text = outputSpeech }
};
if (repromptText != null)
{
response.Reprompt = new Reprompt() { OutputSpeech = new PlainTextOutputSpeech() { Text = repromptText } };
}
var skillResponse = new SkillResponse
{
Response = response,
Version = "1.0"
};
return skillResponse;
}
}

The issue was resolved by changing the library version.
System.Net.Http v4.3.4 was not completely compatible with .NET Core v1, so outbound http calls were working but https calls were not. Changing the version of System.Net.Http resolved the issue.

Server Events Client - Getting rid of the automatically appended string at the end of the URI

I am new to the ServiceStack library and am trying to use the Server Events Client. The server I'm working with has two URIs: one for receiving a connection token, and one for listening for search requests using the token acquired in the previous call.
I use a regular JsonServiceClient with digest authentication to get the token like so:
public const string Baseurl = "http://serverIp:port";
var client = new JsonServiceClient(Baseurl)
{
UserName = "user",
Password = "password",
AlwaysSendBasicAuthHeader = false
};
//ConnectionData has a string token property
var connectionData = client.Get<ConnectionData>("someServices/connectToSomeService");
And then use this token to listen for server events. Like so:
var eventClient =
new ServerEventsClient($"{Baseurl}/differentUri/retrieveSearchRequests?token={connectionData.Token}")
{
OnConnect = Console.WriteLine,
OnMessage = message => Console.WriteLine(message.Json),
OnCommand = message => Console.WriteLine(message.Json),
OnException = WriteLine,
ServiceClient = client, //same JsonServiceClient from the previous snippet
EventStreamRequestFilter = request =>
{
request.PreAuthenticate = true;
request.Credentials = new CredentialCache
{
{
new Uri(Baseurl), "Digest", new NetworkCredential("user", "password")
}
};
}
};
Console.WriteLine(eventClient.EventStreamUri); // "/event-stream&channels=" is appended at the end
eventClient.Start();
The problem with the above code is that it automatically appends "/event-stream&channels=" at the end of my URI. How do I disable this behavior?
I have tried adding the following class
public class AppHost : AppSelfHostBase
{
public static void Start()
{
new AppHost().Init().Start(Baseurl);
}
public AppHost() : base(typeof(AppHost).Name, typeof(AppHost).Assembly)
{
}
public override void Configure(Container container)
{
Plugins.Add(new ServerEventsFeature
{
StreamPath = string.Empty
});
Plugins.Add(new AuthFeature(() => new AuthUserSession(),
new IAuthProvider[]
{
new DigestAuthProvider()
}));
}
}
and called Start on it, before calling the above code, but still no luck.

The ServerEventsClient is only for listening to a ServiceStack SSE stream and should only be populated with the BaseUrl of the remote ServiceStack instance, i.e. not the path to the /event-stream or a query string.
See this previous answer for additional customization available, e.g. you can use ResolveStreamUrl to add a QueryString to the EventStream URL it connects to:
var client = new ServerEventsClient(BaseUrl) {
    ResolveStreamUrl = url => url.AddQueryParam("token", token)
};
If you've modified ServerEventsFeature.StreamPath to point to a different path, e.g:
Plugins.Add(new ServerEventsFeature
{
StreamPath = "/custom-event-stream"
});
You can change the ServerEventsClient to subscribe to the custom path with:
client.EventStreamPath = client.BaseUri.CombineWith("custom-event-stream");
ResolveStreamUrl and EventStreamPath are available from v5.0.3, which is now available on MyGet.
