How to prevent pasted URLs from becoming absolute? - c#

Using a WebBrowser control, I roughly perform the following steps:
1. Navigate to "about:blank".
2. Turn on design mode in the DocumentCompleted event handler.
3. Paste an HTML string containing an anchor with # as its URL.
4. Read the document back from the WebBrowser control.
Step 2 is done this way:
private void webBrowser1_DocumentCompleted(
object sender,
WebBrowserDocumentCompletedEventArgs e)
{
dynamic axObj = webBrowser1.ActiveXInstance;
axObj.document.designmode = "On";
}
Step 3 is done this way:
private void button1_Click(object sender, EventArgs e)
{
var doc = (HTMLDocument)webBrowser1.Document.DomDocument;
var selection = doc.selection;
var range = (IHTMLTxtRange)selection.createRange();
range.pasteHTML("<p><a href=\"#\">Read more</a></p>");
}
Step 4 is done this way:
private void button2_Click(object sender, EventArgs e)
{
MessageBox.Show(this, webBrowser1.DocumentText);
}
What I expect:
I would expect to get a HTML string like this:
<html><body>
<p><a href="#">Read more</a></p>
</body></html>
What I actually get:
I get an HTML string where the # URL is prefixed with the current document's URL:
<html><body>
<p><a href="about:blank#">Read more</a></p>
</body></html>
This happens, no matter whether I navigate to about:blank or e.g. https://www.google.com or any other URL.
My question:
Is there any way to prevent IE/mshtml/WebBrowser control from prefixing the currently loaded URL when pasting anchors?
Update 1:
A possible workaround I can think of is to paste an absolute placeholder URL such as http://pseudo-hash.com instead of the #, and later, when reading the HTML back from the WebBrowser control, do a string replace that turns the placeholder URL back into #.
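The placeholder workaround from Update 1 could be sketched roughly as follows. This is only an illustration under assumptions: the class name AnchorPlaceholder and its two methods are hypothetical, the pasted HTML is assumed to use double-quoted href="#" attributes, and the trailing slash on the placeholder is a guess at how the control may normalize the URL, so the exact string should be checked against what DocumentText really returns.

```csharp
using System;

static class AnchorPlaceholder
{
    // Hypothetical placeholder: any absolute URL that cannot occur in real content.
    // The trailing slash is an assumption about how the control normalizes the URL.
    public const string Placeholder = "http://pseudo-hash.com/";

    // Call before pasteHTML: swap href="#" for the absolute placeholder,
    // so there is no relative URL for the control to resolve.
    public static string Protect(string html)
    {
        return html.Replace("href=\"#\"", "href=\"" + Placeholder + "\"");
    }

    // Call on the HTML read back from the control: restore the bare #.
    // Only the URL itself is replaced, so the result does not depend on
    // how the control quotes the attribute when serializing.
    public static string Restore(string html)
    {
        return html.Replace(Placeholder, "#");
    }
}
```

In the handlers above, this would mean calling range.pasteHTML(AnchorPlaceholder.Protect(html)) in step 3 and AnchorPlaceholder.Restore(webBrowser1.DocumentText) in step 4.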

Related

How to access and manipulate the DOM and JavaScript of WebView?

I have a WebView that I navigate to a URL, and I want to read its contents, interact with the JavaScript, and so on.
I tried
XAML:
<WebView Name="wv1" LoadCompleted="wv1_LoadCompleted" />
C# code:
private void wv1_LoadCompleted(object sender, NavigationEventArgs e)
{
var uri = e.Uri;
var c = e.Content;
}
The problem is that the e.Uri is returned, but the e.Content is null.
How to access the DOM?
How can I do that?
The problem is that the e.Uri is returned, but the e.Content is null.
The Content property of NavigationEventArgs returns the root node of the target page's content, where the page is a XAML page, not a web page. You can verify this with the Frame.Navigated event: when a XAML page has been navigated to, Content has a value.
private void RootFrame_Navigated(object sender, NavigationEventArgs e)
{
System.Diagnostics.Debug.WriteLine(e.Content.ToString());
}
If you want to get the web page content, you could refer to this reply, with some updates:
To enable an external web page to fire the ScriptNotify event when calling window.external.notify, you must include the page's Uniform Resource Identifier (URI) in the ApplicationContentUriRules section of the app manifest.
The following eval call retrieves the body HTML:
string functionString = @"window.external.notify(document.getElementsByTagName('body')[0].innerHTML)";
await Test.InvokeScriptAsync("eval", new string[] { functionString });
You can then read the returned value in the ScriptNotify event handler:
private void Test_ScriptNotify(object sender, NotifyEventArgs e)
{
var body = e.Value;
}

Is it possible to get a link?

Here is my code:
private void button1_Click(object sender, EventArgs e)
{
string x = textBox1.Text;
System.Diagnostics.Process.Start("http://www.google.com/search?q="+x+"&btnI");
}
private void textBox1_TextChanged(object sender, EventArgs e)
{
}
Simple code, but I don't want the program to go to the link of the Textbox1.text on youtube. I want the program to just give me the link of the search back and not to go there.
I want to put a word on the text box and when I press the button it should give me the link of youtube and not to go there (like my program does at the moment it goes to the youtube link).
Couldn't explain better. Hope you guys can understand what I wanna do.
I assume you want to retrieve the URL that Google redirects to when the query is fired. One possibility is to use the HttpClient class to send the request and then read the final URL from the response's RequestMessage property:
var url = "https://www.google.com/search?q=stackoverflow&btnI";
var http = new HttpClient();
var response = http.GetAsync(url);
Console.WriteLine(response.Result.RequestMessage.RequestUri.AbsoluteUri);
The output is:
http://stackoverflow.com/
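If only the search link itself is wanted, without opening a browser or sending a request, a minimal sketch is to build the URL string and display it. The class and method names here are hypothetical; the URL format is the one used in the question, and Uri.EscapeDataString is used so that spaces and characters such as # or & in the search text do not break the query string.

```csharp
using System;

static class SearchLink
{
    // Builds the same search URL the question passes to Process.Start,
    // but only returns it instead of opening it.
    public static string BuildSearchUrl(string query)
    {
        return "https://www.google.com/search?q=" + Uri.EscapeDataString(query) + "&btnI";
    }
}
```

In the button handler, the result could then be shown instead of launched, for example with MessageBox.Show(SearchLink.BuildSearchUrl(textBox1.Text)).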

Open multiple pages in WebBrowser and send a command to all of them

I have a winform app with the following functionality:
It has a multiline textbox that contains one URL per line, about 30 URLs in total (each URL is different, but the webpage is the same; only the domain differs).
It has another textbox in which I can write a command, and a button that sends that command to an input field on the webpage.
It has a WebBrowser control (I would like to do everything in one control).
The webpage consists of a textbox and a button, which I want clicked after I insert a command into that textbox.
My code so far:
//get path for the text file to import the URLs to my textbox to see them
private void button1_Click(object sender, EventArgs e)
{
OpenFileDialog fbd1 = new OpenFileDialog();
fbd1.Title = "Open Dictionary(only .txt)";
fbd1.Filter = "TXT files|*.txt";
fbd1.InitialDirectory = @"M:\";
if (fbd1.ShowDialog(this) == DialogResult.OK)
path = fbd1.FileName;
}
//import the content of .txt to my textbox
private void button2_Click(object sender, EventArgs e)
{
textBox1.Lines = File.ReadAllLines(path);
}
//click the button from webpage
private void button3_Click(object sender, EventArgs e)
{
this.webBrowser1.Document.GetElementById("_act").InvokeMember("click");
}
//parse the value of the textbox and press the button from the webpage
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
newValue = textBox2.Text;
HtmlDocument doc = this.webBrowser1.Document;
doc.GetElementById("_cmd").SetAttribute("Value", newValue);
}
Now, how can I load all 30 URLs from my textbox into the same WebBrowser control, so that I can send the same command to the textbox on every webpage and then press the button on each of them?
//EDIT 1
So, I have adapted @Setsu's method and created the following:
public IEnumerable<string> GetUrlList()
{
string f = File.ReadAllText(path); ;
List<string> lines = new List<string>();
using (StreamReader r = new StreamReader(f))
{
string line;
while ((line = r.ReadLine()) != null)
lines.Add(line);
}
return lines;
}
Now, is this returning what it should return, in order to parse each URL ?
If you want to keep using just 1 WebBrowser control, you'd have to sequentially navigate to each URL. Note, however, that the Navigate method of the WebBrowser class is asynchronous, so you can't just naively call it in a loop. Your best bet is to implement an async/await pattern detailed in this answer here.
Alternatively, you CAN have 30 WebBrowser controls and have each one navigate on its own; this is roughly equivalent to having 30 tabs open in modern browsers. Since each WebBrowser is doing identical work, you can just have 1 DocumentCompleted event written to handle a single WebBrowser, and then hook up the others to the same event. Do note that the WebBrowser control has a bug that will cause it to gradually leak memory, and the only way to solve this is to restart the application. Thus, I would recommend going with the async/await solution.
UPDATE:
Here's a brief code sample of how to do the 30 WebBrowsers way (untested as I don't have access to VS right now):
List<WebBrowser> myBrowsers = new List<WebBrowser>();
public void btnDoWork(object sender, EventArgs e)
{
//This method starts navigation.
//It will call a helper function that gives us a list
//of URLs to work with, and naively create as many
//WebBrowsers as necessary to navigate all of them
IEnumerable<string> urlList = GetUrlList();
//note: be sure to sanitize the URLs in this method call
foreach (string url in urlList)
{
WebBrowser browser = new WebBrowser();
browser.DocumentCompleted += webBrowserDocumentCompleted;
browser.Navigate(url);
myBrowsers.Add(browser);
}
}
private void webBrowserDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
//check that the full document is finished
if (e.Url.AbsolutePath != (sender as WebBrowser).Url.AbsolutePath)
return;
//get our browser reference
WebBrowser browser = sender as WebBrowser;
//get the string command from form TextBox
string command = textBox2.Text;
//enter the command string
browser.Document.GetElementById("_cmd").SetAttribute("Value", command);
//invoke click
browser.Document.GetElementById("_act").InvokeMember("click");
//detach the event handler from the browser
//note: necessary to stop endlessly setting strings and clicking buttons
browser.DocumentCompleted -= webBrowserDocumentCompleted;
//attach second DocumentCompleted event handler to destroy browser
browser.DocumentCompleted += webBrowserDestroyOnCompletion;
}
private void webBrowserDestroyOnCompletion(object sender, WebBrowserDocumentCompletedEventArgs e)
{
//check that the full document is finished
if (e.Url.AbsolutePath != (sender as WebBrowser).Url.AbsolutePath)
return;
//I just destroy the WebBrowser, but you might want to do something
//with the newly navigated page
WebBrowser browser = sender as WebBrowser;
browser.Dispose();
myBrowsers.Remove(browser);
}
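On the GetUrlList helper that the sample above calls: the version in the question's EDIT 1 reads the whole file with File.ReadAllText and then passes that text to the StreamReader constructor, which expects a file path, so it would most likely throw instead of returning the URLs. A minimal sketch of one possible alternative is shown below; the class name UrlFile and the method name ReadUrls are hypothetical, and trimming and skipping blank lines is an added assumption about what the file may contain.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class UrlFile
{
    // Reads one URL per line from the given text file,
    // trimming whitespace and skipping empty lines.
    public static List<string> ReadUrls(string path)
    {
        return File.ReadLines(path)
                   .Select(line => line.Trim())
                   .Where(line => line.Length > 0)
                   .ToList();
    }
}
```

GetUrlList could then simply return UrlFile.ReadUrls(path). The URLs still need to be validated before being passed to Navigate, as the comment in the sample notes.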

How to disable every navigation in WebBrowser?

I have a WebBrowser control whose URL I dynamically refresh or change based on user input. I don't want to let the user navigate, so I set AllowNavigation to false. This seems to work; however, the link below is still "active":
<a href="javascript:window.close()">Close Page</a>
The issue here is: If the user clicks it, and confirms closure in the pop-up window I can't manage WebBrowser anymore. Looks like it is closed though the last page is still visible. Also I can't remove this link as the site is not managed by me.
Disable the control? Nope, I have to allow the user to highlight and copy text from the webpage.
Do I have any other option to disable literally ALL links?
@TaW: here is my code based on yours. I have to set the URL from my code, so I call a custom function:
button_click()
{
webBrowser1_load_URL("http://website/somecheck.php?compname=" + textBoxHost.Text);
}
Here it is the function:
private void webBrowser1_load_URL(string url)
{
string s = GetDocumentText(url.ToString());
s = s.Replace(@"javascript:window.close()", "");
webBrowser1.AllowNavigation = true;
webBrowser1.DocumentText = s;
}
The rest is exactly what's in your answer:
private void webBrowser1_DocumentCompleted(object sender,
WebBrowserDocumentCompletedEventArgs e)
{
webBrowser1.AllowNavigation = false;
}
public string GetDocumentText(string s)
{
WebBrowser dummy = new WebBrowser(); //(*)
dummy.Url = new Uri(s);
return dummy.DocumentText;
}
Still it's not working. Please help me to spot the issue with my code.
If you have control over the loading of the pages, you could grab the pages' text and change the code to disable rogue scripts. The one you showed can simply be deleted. Of course, you might have to foresee more than that one.
Obviously this would be easier if you could do without JavaScript altogether, but if that is not an option, go for the scripts that do real or pseudo-navigation.
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
webBrowser1.AllowNavigation = false;
}
private void loadURL_Click(object sender, EventArgs e)
{
webBrowser1.AllowNavigation = true;
string s = File.ReadAllText(textBox_URL.Text);
s = s.Replace("javascript:window.close()", "");
webBrowser1.DocumentText = s;
}
If the pages are not in the file system, the same trick should work, for instance by loading the URL into a dummy WebBrowser like this:
private void cb_loadURL_Click(object sender, EventArgs e)
{
string s = GetDocumentText(tb_URL.Text);
s = s.Replace("javascript:window.close()", "");
webBrowser1.AllowNavigation = true;
webBrowser1.DocumentText = s;
}
public string GetDocumentText(string s)
{
WebBrowser dummy = new WebBrowser(); //(*)
dummy.Url = new Uri(s);
return dummy.DocumentText;
}
Note: According to this post, you can't set DocumentText quite as freely as one would think; probably a bug. Instead of creating the dummy each time, you can also move the (*) line to class level. Then, no matter how many changes you had to make, you would always have an unchanged version, which the user could, for example, save somewhere.
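The string replacement used above only matches the exact text javascript:window.close(). A slightly more tolerant variant, sketched here as an assumption rather than a tested fix for the page in question, uses a case-insensitive regular expression that also accepts optional whitespace and a trailing semicolon. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class HtmlSanitizer
{
    // Matches "javascript:window.close()" in any letter case, with optional
    // whitespace around the call and an optional trailing semicolon.
    private static readonly Regex CloseScript = new Regex(
        @"javascript:\s*window\.close\(\s*\)\s*;?",
        RegexOptions.IgnoreCase);

    // Removes the close script from the page text, leaving an empty href behind.
    public static string StripCloseLinks(string html)
    {
        return CloseScript.Replace(html, "");
    }
}
```

Separately, setting Url on the dummy WebBrowser starts an asynchronous navigation, so reading DocumentText immediately afterwards may return an empty document; that could be one reason the version in the question does not work. Downloading the page text by other means, for instance with WebClient.DownloadString, might be worth trying.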

How can I invoke JavaScript functions from C# with the GeckoFX browser?

I am using the GeckoFX browser because my HTML files don't work with the default WebBrowser control.
So far I have used ObjectForScripting to call JavaScript code from my C# project, but I have not been able to call anything with the GeckoFX browser.
I just want to send some data to my HTML file and display it with the GeckoFX browser. Is that possible at all?
For reference, here is my code:
GeckoWebBrowser myBrowser;
public Form1()
{
InitializeComponent();
String path = @"C:\tools\xulrunner\xulrunner-sdk\bin";
Console.WriteLine("Path: " + path);
Skybound.Gecko.Xpcom.Initialize(path);
myBrowser = new GeckoWebBrowser();
myBrowser.Parent = this;
myBrowser.Dock = DockStyle.Fill;
}
private void btn_go_Click(object sender, EventArgs e)
{
// like the normal browsers
myBrowser.Navigate(tbx_link.Text);
}
private void btn_test_Click(object sender, EventArgs e)
{
// getting the link to my own html file
String path = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Webpage");
path += "\\Webpage.html";
myBrowser.Navigate(path);
}
I hope you understand what I mean. Thank you!
You can always invoke JavaScript like this:
myBrowser.Navigate("javascript:YourJavascriptFunction('yourArgument1', 'yourArgument2')");
Building on @jordy's answer above, call:
myBrowser.Navigate("javascript:YourJavascriptFunction('yourArgument1', 'yourArgument2')");
preferably in the DocumentCompleted event handler, so that the page has finished loading first.
void myBrowser_DocumentCompleted(object sender, Gecko.Events.GeckoDocumentCompletedEventArgs e)
{
myBrowser.Navigate("javascript:YourJavascriptFunction('yourArgument1', 'yourArgument2')");
}
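When the arguments come from user data rather than fixed literals, quotes or backslashes in them would break the javascript: string. The helper below is a minimal sketch of how such a string could be assembled; the class JavaScriptUrl and its methods are hypothetical and not part of GeckoFX. It only escapes for a single-quoted JavaScript string literal, and whether the browser needs additional URL encoding for characters such as % is not covered here.

```csharp
using System;
using System.Linq;

static class JavaScriptUrl
{
    // Wraps a value in single quotes, escaping backslashes, single quotes
    // and line breaks so it stays a valid JavaScript string literal.
    private static string Quote(string value)
    {
        return "'" + value.Replace("\\", "\\\\")
                          .Replace("'", "\\'")
                          .Replace("\r", "\\r")
                          .Replace("\n", "\\n") + "'";
    }

    // Builds "javascript:functionName('arg1', 'arg2', ...)".
    public static string Build(string functionName, params string[] args)
    {
        return "javascript:" + functionName + "("
             + string.Join(", ", args.Select(a => Quote(a))) + ")";
    }
}
```

The call in the handler above would then become myBrowser.Navigate(JavaScriptUrl.Build("YourJavascriptFunction", value1, value2)), where value1 and value2 stand for whatever data should be sent to the page.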
