Run a script with API call from a webpage using C#

I want to run a script using API calls in C#. I don't want the webpage to open; just the script should run. I am trying this:
HttpWebRequest request = WebRequest.Create(URL) as HttpWebRequest;
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
HtmlDocument doc; //I have tried HtmlDocument = new HtmlDocument();, didn't work.
var resultStream = response.GetResponseStream();
doc.LoadHtml(resultStream); // I have tried using Load instead of LoadHtml,didn't work out.
doc.InvokeScript("Submit");
I get the errors "use of unassigned variable doc" and that doc doesn't contain a method named LoadHtml. I have tried adding Microsoft.VisualStudio.TestTools.UITesting.HtmlControls; it didn't help.
I have checked the questions "HtmlDocument.LoadHtml from WebResponse?" and "Get HTML code from website in C#", but they didn't get an error on doc.
Any solutions?

You will have to change the way you load the HtmlDocument:
string html = new WebClient().DownloadString(URL);
WebBrowser browser = new WebBrowser()
{
    ScriptErrorsSuppressed = true,
    DocumentText = string.Empty
};
HtmlDocument doc = browser.Document.OpenNew(true);
doc.Write(html);
doc.InvokeScript("Submit");
Hope it works.
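For completeness, here is the same approach as a runnable sketch. Assumptions: a reference to System.Windows.Forms, the URL below is a placeholder, and the page defines a script function named Submit (taken from the question).
using System;
using System.Net;
using System.Windows.Forms;

class Program
{
    [STAThread] // WebBrowser is a Windows Forms control and needs a single-threaded apartment
    static void Main()
    {
        string html = new WebClient().DownloadString("http://example.com/page"); // placeholder URL
        using (WebBrowser browser = new WebBrowser
        {
            ScriptErrorsSuppressed = true,
            DocumentText = string.Empty
        })
        {
            HtmlDocument doc = browser.Document.OpenNew(true);
            doc.Write(html);
            doc.InvokeScript("Submit"); // script function name taken from the question
        }
    }
}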

HtmlDocument doc; //I have tried HtmlDocument = new HtmlDocument();, didn't work.
This won't work unless you do:
HtmlDocument doc = new HtmlDocument(/*someUri*/, /*documentLocation*/);
You need to initialize doc. This is why you are seeing
use of unassigned variable doc
Check out the documentation here for details:
https://msdn.microsoft.com/en-us/library/microsoft.visualstudio.testtools.webtesting.htmldocument.aspx
According to it, the constructor signature is:
HtmlDocument(Uri, String)
with the description:
Initializes a new instance of the HtmlDocument class. This constructor takes a string and uses it as the document.
Also, judging by the documentation, it doesn't have a LoadHtml() method.
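So, if you stay with the test-tools HtmlDocument, a minimal sketch would construct it directly, assuming (per the quoted description) that the string parameter is the downloaded HTML and the Uri is the page address:
// Assumes the Microsoft.VisualStudio.TestTools.WebTesting HtmlDocument from the linked docs.
string html = new WebClient().DownloadString(URL);
HtmlDocument doc = new HtmlDocument(new Uri(URL), html);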

Related

Get webpage source code with alt key code symbols using asp.net c#

I'm trying to get a webpage's source code using HtmlAgilityPack. This is my code to get the source code and fill it into a multiline textbox:
var url = "http://www.example.com";
var web = new HtmlWeb();
var doc = web.Load(url);
sourcecodetxt.Text = doc.ToString();
The code is working fine, but if my webpage has some "alt code symbols" then the symbol gets replaced with garbled characters, e.g. ★ comes back as a garbled character sequence.
My question is how to get the original symbol. Sorry for my bad English. Thanks in advance.
Try using WebClient and HtmlDocument's Load() method so you can specify the encoding:
WebClient client = new WebClient();
HtmlDocument doc = new HtmlDocument();
doc.Load(client.OpenRead("http://www.example.com"), Encoding.UTF8);
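Alternatively (a small untested sketch), you can set the encoding on the WebClient itself and keep using DownloadString with HtmlAgilityPack's LoadHtml:
using (WebClient client = new WebClient())
{
    client.Encoding = Encoding.UTF8; // DownloadString decodes the response with this encoding
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(client.DownloadString("http://www.example.com"));
    sourcecodetxt.Text = doc.DocumentNode.OuterHtml;
}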

XDocument Load - cannot open

I'm trying to load an RSS feed with XDocument.
The URL is:
http://www.ft.com/rss/home/uk
XDocument doc = XDocument.Load(url);
But I'm getting an error:
Cannot open 'http://www.ft.com/rss/home/uk'. The Uri parameter must be a file system relative or absolute path.
XDocument.Load does not take URLs, only files, as stated in the documentation.
Try something like the following code which I totally did not test:
using (var httpclient = new HttpClient())
{
    var response = await httpclient.GetAsync("http://www.ft.com/rss/home/uk");
    var xDoc = XDocument.Load(await response.Content.ReadAsStreamAsync());
}
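Note that the snippet above uses await, so it has to sit inside an async method. A synchronous sketch along the same lines (untested, using WebClient instead of HttpClient):
using (var client = new WebClient())
{
    // Download the feed as text and parse it, sidestepping XDocument.Load's URI handling.
    string xml = client.DownloadString("http://www.ft.com/rss/home/uk");
    XDocument xDoc = XDocument.Parse(xml);
}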

Can I read an iframe through WebClient (I want the outer HTML)?

My program is reading a web target whose body contains an iframe that I want to read.
My HTML source:
<html>
...
<iframe src="http://www.mysite.com" ></iframe>
...
</html>
In my program I have a method that returns the source as a string:
public static string get_url_source(string url)
{
    using (WebClient client = new WebClient())
    {
        return client.DownloadString(url);
    }
}
My problem is that I want to get the source of the iframe when it's reading the source, as it would in normal browsing.
Can I do this only by using the WebBrowser class, or is there a way to do it with WebClient or even another class?
The real question:
How can I get the outer HTML given a URL? Any approach is welcome.
After getting the source of the site, you can use HtmlAgilityPack to get the URL of the iframe:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var src = doc.DocumentNode.SelectSingleNode("//iframe")
.Attributes["src"].Value;
Then make a second call to get_url_source with the extracted URL.
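For example, reusing the src value from the snippet above:
string iframeHtml = get_url_source(src); // source of the page inside the iframe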
Parse your source using HTML Agility Pack and then:
List<String> iframeSource = new List<String>();
// HtmlAgilityPack's HtmlDocument.Load() takes a file or stream, so use HtmlWeb to load a URL.
HtmlDocument doc = new HtmlWeb().Load(url);
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//iframe"))
    iframeSource.Add(get_url_source(node.Attributes["src"].Value));
If you are targeting a single iframe, try to identify it using its ID attribute or something else so you only retrieve one source:
String iframeSource;
HtmlDocument doc = new HtmlWeb().Load(url);
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//iframe"))
{
    // Just an example check, but you could use different approaches...
    if (node.Attributes["id"].Value == "targetframe")
        iframeSource = get_url_source(node.Attributes["src"].Value);
}
Well, I found the answer after some searching and this is what I wanted:
webBrowser1.Url = new Uri("http://www.mysite.com/");
while (webBrowser1.ReadyState != WebBrowserReadyState.Complete) Application.DoEvents();
string InnerSource = webBrowser1.Document.Body.InnerHtml;
// You can use OuterHtml here too.
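If you would rather avoid the Application.DoEvents loop, the DocumentCompleted event is an alternative (a sketch, assuming a WinForms form that hosts webBrowser1):
webBrowser1.DocumentCompleted += (s, e) =>
{
    // Fires when a document finishes loading (it can fire more than once for framed pages).
    string outerSource = webBrowser1.Document.Body.OuterHtml;
};
webBrowser1.Url = new Uri("http://www.mysite.com/");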

Get Page Main Content using the URL

I need to be able to get the page's main content from a certain URL.
A very good example of what I need to do is the following: http://embed.ly/docs/explore/preview?url=http%3A%2F%2Fedition.cnn.com%2F2012%2F08%2F20%2Fworld%2Fmeast%2Fflight-phobia-boy-long-way-home%2Findex.html%3Fiid%3Darticle_sidebar
I am using ASP.NET with C#.
Parsing HTML pages and guessing the main content is not an easy process. I would recommend using NReadability and HtmlAgilityPack.
Here is an example of how it could be done. The main text is always in a div with id readInner after NReadability has transcoded the page.
string url = "http://.......";
var t = new NReadability.NReadabilityWebTranscoder();
bool b;
string page = t.Transcode(url, out b);
if (b)
{
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(page);
    var title = doc.DocumentNode.SelectSingleNode("//title").InnerText;
    var text = doc.DocumentNode.SelectSingleNode("//div[@id='readInner']")
                  .InnerText;
}
I guess it's done using the WebClient class or the WebRequest class. With either you can download the whole content of the page and then, using a data mining algorithm, extract the information you want.
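A rough sketch of that idea, assuming HtmlAgilityPack is used for the "data mining" part and that the article's body text lives in <p> tags (both are my own assumptions for illustration):
// Requires System.Linq and the HtmlAgilityPack package.
using (var client = new WebClient())
{
    string html = client.DownloadString(url); // url = the article's address
    var doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);

    // Very naive "main content" guess: join the text of all paragraphs.
    var paragraphs = doc.DocumentNode.SelectNodes("//p");
    string mainText = paragraphs == null
        ? string.Empty
        : string.Join(Environment.NewLine, paragraphs.Select(p => p.InnerText));
}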

Timeout error when loading Xml from URL

I am loading a live XML file (from a live URL) into an XmlDataDocument, but every time I get the error:
The operation has timed out
The code is as follows; the URL contains the XML feed, and I want to load it into xmlDoc.
XmlDataDocument xmlDoc = new XmlDataDocument();
xmlDoc.Load("http://www.globalgear.com.au/productfeed.xml");
Please suggest any solution.
Don't use the Load method of the XmlDataDocument class directly; you have little to no way of influencing the behaviour when it comes to long running HTTP requests.
Instead, use the HttpWebRequest and HttpWebResponse classes to do the work for you, and then load the subsequent response into your document.
For example:
HttpWebRequest rq = WebRequest.Create("http://www.globalgear.com.au/productfeed.xml") as HttpWebRequest;
//60 Second Timeout
rq.Timeout = 60000;
//Also note you can set the Proxy property here if required; sometimes it is, especially if you are behind a firewall - rq.Proxy = new WebProxy("proxy_address");
HttpWebResponse response = rq.GetResponse() as HttpWebResponse;
XmlTextReader reader = new XmlTextReader(response.GetResponseStream());
XmlDocument doc = new XmlDocument();
doc.Load(reader);
I've tested this code in a local app instance and the XmlDocument is populated with the data from your URL.
You can also substitute in XmlDataDocument for XmlDocument in the example above - I prefer to use XmlDocument as it's not (yet) marked as obsolete.
I've wrapped this in a function for you:
public XmlDocument GetDataFromUrl(string url)
{
    XmlDocument urlData = new XmlDocument();
    HttpWebRequest rq = (HttpWebRequest)WebRequest.Create(url);
    rq.Timeout = 60000;
    HttpWebResponse response = rq.GetResponse() as HttpWebResponse;
    using (Stream responseStream = response.GetResponseStream())
    {
        XmlTextReader reader = new XmlTextReader(responseStream);
        urlData.Load(reader);
    }
    return urlData;
}
Simply call using:
XmlDocument document = GetDataFromUrl("http://www.globalgear.com.au/productfeed.xml");
To my knowledge there is no easy way to adjust the timeout with the method you are using.
The easiest change would be to use the WebClient class and set its timeout, as described here: http://w3ka.blogspot.co.uk/2009/12/how-to-fix-webclient-timeout-issue.html. Then use DownloadFile on the WebClient and load the saved file into the XmlDocument.
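WebClient doesn't expose a timeout setting directly, so the usual workaround is to subclass it and override GetWebRequest (a sketch of that idea; the class name and default value are my own):
// A WebClient with an adjustable timeout (in milliseconds).
public class TimeoutWebClient : WebClient
{
    public int TimeoutMilliseconds { get; set; } = 60000;

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        request.Timeout = TimeoutMilliseconds;
        return request;
    }
}

// Usage: download the feed to a file, then load that file.
var client = new TimeoutWebClient { TimeoutMilliseconds = 10 * 60 * 1000 };
client.DownloadFile("http://www.globalgear.com.au/productfeed.xml", "productfeed.xml");
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("productfeed.xml");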
Set a timeout for your web request:
using System;
using System.Net;
using System.Xml;
namespace Shelver
{
    class Program
    {
        static void Main(string[] args)
        {
            WebRequest requ = WebRequest.Create("http://www.globalgear.com.au/productfeed.xml");
            requ.Timeout = 10 * 60 * 1000; // 10 minute timeout instead of the default 100 seconds
            var resp = requ.GetResponse();
            Console.WriteLine("Will download {0:N0} bytes", resp.ContentLength);
            var stream = resp.GetResponseStream();
            XmlDocument doc = new XmlDocument();
            doc.Load(stream);
        }
    }
}
This example will set it to 10 minutes.
In addition to the previous answers, which should be the first step towards fixing this, I continued to get this exception despite having already loaded the response and closing the connections.
The solution for me: the Load() and LoadXml() methods would throw their own timeout exception if the value provided wasn't actually XML. Checking that the response content was XML worked in our case (this requires that the host you are getting the response from actually sets content types).
Building upon dash's answer:
public XmlDocument GetDataFromUrl(string url)
{
    XmlDocument urlData = new XmlDocument();
    HttpWebRequest rq = (HttpWebRequest)WebRequest.Create(url);
    rq.Timeout = 60000;
    HttpWebResponse response = rq.GetResponse() as HttpWebResponse;
    // New check added to dash's answer.
    if (response.ContentType.Contains("text/xml"))
    {
        using (Stream responseStream = response.GetResponseStream())
        {
            XmlTextReader reader = new XmlTextReader(responseStream);
            urlData.Load(reader);
        }
    }
    return urlData;
}
