How To Read A JSON String From A Webpage? - c#

I am trying to download the contents of this webpage into my program. I have tried using WebClient.DownloadString and WebClient.DownloadFile, and I have tried saving the result to a file and reading it back from the local file, but none of this is working. When I use breakpoints in Visual Studio, I see the string is correctly saved, but when I try to print it to a file, or print it to the console, nothing is displayed.
What I am aiming to do is download this webpage's content into a String then parse it with JSON.NET.
Here is my attempt to save it to a file:
WebClient webpage = new WebClient();
var html = webpage.DownloadString("https://api.fixer.io/latest");
String k = html;
File.WriteAllText(@"C:\Users\JCena\Desktop\Hell1o.txt", k);

Your code is almost fine.
First, get rid of the "new" keyword.
Second, make sure you are not getting a permissions exception for the folder specified.
Try this code:
WebClient webpage = new WebClient();
var html = webpage.DownloadString("https://api.fixer.io/");
String k = html;
File.WriteAllText(@"Hello.txt", k);

Related

File download does not start when invoke or execute an URL

If anyone loads this URL https://de.visiblealpha.com/links/80488d55-ae41-4def-9452-bae3ac2e2b06 in a browser, an Excel file starts downloading. But when I invoke the same URL with HttpWebRequest or WebClient, the Excel file does not start downloading. This is the code I tried:
string address = "https://de.visiblealpha.com/links/80488d55-ae41-4def-9452-bae3ac2e2b06";
using (WebClient client = new WebClient())
{
client.DownloadString(address);
}
I also tried this:
string url = "https://de.visiblealpha.com/links/80488d55-ae41-4def-9452-bae3ac2e2b06";
WebRequest request = HttpWebRequest.Create(url);
WebResponse response = request.GetResponse();
StreamReader reader = new StreamReader(response.GetResponseStream());
string responseText = reader.ReadToEnd();
Neither reached my goal. The code executes successfully, but no Excel file starts downloading, which is what I am trying to achieve.
When I load the URL https://de.visiblealpha.com/links/80488d55-ae41-4def-9452-bae3ac2e2b06 in a WebBrowser control, I see the same problem: no Excel file starts downloading. Here is the code I tried:
webBrowser1.Navigate("https://de.visiblealpha.com/links/80488d55-ae41-4def-9452-bae3ac2e2b06");
webBrowser1.ScriptErrorsSuppressed = true;
I do not understand why the Excel file is not downloaded when I invoke the very same URL from code.
Please tell me what I need to do so that the Excel file starts downloading on the client PC as soon as I execute the URL.
Please share a working code example.
DownloadString returns the contents into a variable, that is, in memory. A file will not get saved on the system. If that is what you intended, there's a small change you need to make in your code:
string address = "https://de.visiblealpha.com/links/80488d55-ae41-4def-9452-bae3ac2e2b06";
using (WebClient client = new WebClient())
{
string contents = client.DownloadString(address);
}
The variable "contents" will contain the HTML of the URL in your question. If you want it as a file, then you need to use the DownloadFile method instead. The spreadsheet itself is at a different URL.
There's an example at the end of this documentation.

How to detect the origin of a webpage's GET requests programmatically? (C#)

In short, I need to detect a webpage's GET requests programmatically.
The long story is that my company is currently trying to write a small installer for a piece of proprietary software that installs another piece of software.
To get this other piece of software, I realize it's as simple as calling the download link through C#'s lovely WebClient class (Dir is just the Temp directory in AppData/Local):
using (WebClient client = new WebClient())
{
client.DownloadFile("[download link]", Dir.FullName + "\\setup.exe");
}
However, the page which the installer comes from is not a direct download page. The actual download link is subject to change (our company's specific installer might be hosted on a different download server another time around).
To get around this, I realized that I can just monitor the GET requests the page makes and dynamically grab the URL from there.
So, I know what I'm going to do, but I was just wondering: is there a built-in part of the language that allows you to see what requests a page has made? Or do I have to write this functionality myself, and what would be a good starting point?
I think I'd do it like this. First download the HTML contents of the download page (the page that contains the link to download the file). Then scrape the HTML to find the download link URL. And finally, download the file from the scraped address.
using (WebClient client = new WebClient())
{
// Get the website HTML.
string html = client.DownloadString("http://[website that contains the download link]");
// Scrape the HTML to find the download URL (see below).
// Download the desired file.
client.DownloadFile(downloadLink, Dir.FullName + "\\setup.exe");
}
For scraping the download URL from the website I'd recommend using the HTML Agility Pack. See here for getting started with it.
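As a rough illustration of the scraping step, here is a minimal sketch that uses a regular expression instead of the HTML Agility Pack. The class name `DownloadLinkScraper`, the method name `FindInstallerLink`, and the assumption that the link is an `href` ending in `.exe` are all hypothetical; a real page may need a proper HTML parser.

```csharp
using System;
using System.Text.RegularExpressions;

static class DownloadLinkScraper
{
    // Hypothetical helper: returns the first href value that ends in ".exe",
    // or null when the HTML contains no such link.
    public static string FindInstallerLink(string html)
    {
        Match match = Regex.Match(
            html,
            "href\\s*=\\s*[\"'](?<url>[^\"']+\\.exe)[\"']",
            RegexOptions.IgnoreCase);

        return match.Success ? match.Groups["url"].Value : null;
    }
}
```

The returned value could then be passed to client.DownloadFile as the download address. Note that this sketch does not resolve relative links and will not match links that carry a query string after ".exe".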
I think you have to write your own "mediahandler", which returns a HttpResponseMessage.
e.g. with webapi2
[HttpGet]
[AllowAnonymous]
[Route("route")]
public HttpResponseMessage GetFile([FromUri] string path)
{
HttpResponseMessage result = new HttpResponseMessage(HttpStatusCode.OK);
result.Content = new StreamContent(new FileStream(path, FileMode.Open, FileAccess.Read));
string fileName = Path.GetFileNameWithoutExtension(path);
string disposition = "attachment";
result.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue(disposition) { FileName = fileName + Path.GetExtension(path) };
result.Content.Headers.ContentType = new MediaTypeHeaderValue(MimeMapping.GetMimeMapping(Path.GetExtension(path)));
return result;
}

PhantomJS pass HTML string and return page source

For a web crawler project in C#, I am trying to execute JavaScript and Ajax to retrieve the full page source of a crawled page.
I am using an existing web crawler (Abot) that needs a valid HttpWebResponse object. Therefore I cannot simply use driver.Navigate().GoToUrl() method to retrieve the page source.
The crawler downloads the page source and I want to execute the existing Javascript/Ajax inside the source.
In a sample project I tried the following without success:
WebClient wc = new WebClient();
string content = wc.DownloadString("http://www.newegg.com/Product/Product.aspx?Item=N82E16834257697");
string tmpPath = Path.Combine(Path.GetTempPath(), "temp.htm");
File.WriteAllText(tmpPath, content);
var driverService = PhantomJSDriverService.CreateDefaultService();
var driver = new PhantomJSDriver(driverService);
driver.Navigate().GoToUrl(new Uri(tmpPath));
string renderedContent = driver.PageSource;
driver.Quit();
You need the following nuget packages to run the sample:
https://www.nuget.org/packages/phantomjs.exe/
http://www.nuget.org/packages/selenium.webdriver
Problem here is that the code stops at GoToUrl() and it takes several minutes until program terminates without even giving me the driver.PageSource.
Doing this returns the correct HTML:
driver.Navigate().GoToUrl("http://www.newegg.com/Product/Product.aspx?Item=N82E16834257697");
string renderedContent = driver.PageSource;
But I don't want to download the data twice. The crawler (Abot) downloads the HTML and I just want to parse/render the javascript and ajax.
Thank you!
Without running it, I would bet you need file:/// prior to tmpPath. That is:
WebClient wc = new WebClient();
string content = wc.DownloadString("http://www.newegg.com/Product/Product.aspx?Item=N82E16834257697");
string tmpPath = Path.Combine(Path.GetTempPath(), "temp.htm");
File.WriteAllText(tmpPath, content);
var driverService = PhantomJSDriverService.CreateDefaultService();
var driver = new PhantomJSDriver(driverService);
driver.Navigate().GoToUrl(new Uri("file:///" + tmpPath));
string renderedContent = driver.PageSource;
driver.Quit();
You probably need to allow PhantomJS to make arbitrary requests. Requests are blocked when the domain/protocol doesn't match as is the case when a local file is opened.
var driverService = PhantomJSDriverService.CreateDefaultService();
driverService.LocalToRemoteUrlAccess = true;
driverService.WebSecurity = false; // may not be necessary
var driver = new PhantomJSDriver(driverService);
You might need to combine this with the solution of Dave Bush:
driver.Navigate().GoToUrl(new Uri("file:///" + tmpPath));
Some of the resources have URLs that begin with // which means that the protocol of the page is used when the browser retrieves those resources. When a local file is read, this protocol is file:// in which case none of those resources will be found. The protocol must be added to the local file in order to download all those resources.
File.WriteAllText(tmpPath, content.Replace("\"//", "\"http://"));
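The protocol rewrite can be sketched as a small standalone helper. This is only an illustration: the class name `LocalPageHelper` and the method name `FixProtocolRelativeUrls` are hypothetical, and the plain string replacement assumes protocol-relative URLs appear only at the start of quoted attribute values.

```csharp
using System;

static class LocalPageHelper
{
    // Hypothetical helper: prefixes protocol-relative URLs ("//host/path")
    // inside double- or single-quoted attribute values with "http:".
    public static string FixProtocolRelativeUrls(string html)
    {
        return html
            .Replace("\"//", "\"http://")
            .Replace("'//", "'http://");
    }
}
```

The result would be written with File.WriteAllText(tmpPath, ...) before navigating to the local file. Because this is a blind text replacement, it also rewrites any quoted string in inline scripts that happens to start with "//".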
It is apparent from your output that you use PhantomJS 1.9.8. It may be the case that a newly introduced bug is responsible for this sort of thing. You should use PhantomJS 1.9.7 with driverService.SslProtocol = "tlsv1".
You should also enable the disk cache if you do this multiple times for the same domain. Otherwise, the resources are downloaded each time you try to scrape it. This can be done with driverService.DiskCache = true;

Get Document OuterHTML of MVC Application in C#

We need to export an entire page of an MVC application to PDF. For that, we need all of the HTML content, including dynamic content.
To get the contents of the page, we used the following code:
string contents = File.ReadAllText(path);
But it returns only the static content of the page (that is, the page source), not the new nodes added to the DOM.
We then tried the following code, but this also returns only static content:
// WebClient object
WebClient client = new WebClient();
// Retrieve resource as a stream
Stream data = client.OpenRead(new Uri("xxxx.html"));
// Retrieve the text
StreamReader reader = new StreamReader(data);
string htmlContent = reader.ReadToEnd();
So I want to get the entire outerHTML of the document in C# without using any third-party DLL. I googled many links, and everyone suggests using the WebBrowser control to get the content.
I don't know how this would be useful for our application. Our application is MVC4. We need to export the entire page to PDF, so we need the entire HTML content (dynamic content too).
How can I use the code below in our MVC application to get the document outerHTML?
mshtml.HTMLDocument doc = webBrowser1.Document.DomDocument as mshtml.HTMLDocument;
string html = doc.documentElement.outerHTML;
or
var documentAsIHtmlDocument3 = (mshtml.IHTMLDocument3)webBrowser.Document.DomDocument;
StringReader sr = new StringReader(documentAsIHtmlDocument3.documentElement.outerHTML);
htmlDoc.Load(sr);
Any help with this would be appreciated.
You haven't mentioned what the PDF is intended for. Most likely it is for the visitor of the page to download. If that is true, maybe you could use jsPDF. That way you get around the problem with not having access to the entire page serverside.

Automatically download an xml file from URI with download popup

ASP.NET MVC 4 Razor:
I've been working at this for a bit, so I apologize if I'm missing something obvious, but I will truly appreciate any assistance that could be offered.
In a nutshell, what I'm looking to do is download an XML file from a URI using C#. It ought to be pretty straightforward, but the URI leads to a blank page with a download prompt popup populated with a dynamically created filename.
I can't provide the URI due to its confidential nature, but here is the code I've been toying with. (Forgive my ignorance on this matter, it's the first time I've tried anything like this)
byte[] data;
using (WebClient Client = new WebClient())
{
data = Client.DownloadData(uriString + fileString);
}
File.WriteAllBytes(dirString + fileString, data);
I've also tried:
using (WebClient Client = new WebClient())
{
Client.DownloadFile(uriString + fileString, dirString + fileString);
}
To be honest, this code doesn't really work for me. The downloaded files aren't correct. The XML files appear to contain the code from the webpage they've been downloaded from, and if I try something like an image, the image is broken. So, again, any assistance would be appreciated.
Thanks in advance!
The URI that you are using is probably wrong. You are using the URI that opens the popup page. The popup page should be doing another GET to the dynamically generated file.
To automate this process, you should use a WebRequest to get the contents of the popup page. Scrape the contents of the page to get the actual URL to download the file. Then use the code you have written to download the file.
var request = WebRequest.Create("PopupUrl");
var response = request.GetResponse();
string url = GetUrlFromResponseByRegExOrXMLParsing();
var client = new WebClient();
client.DownloadFile(url, filePath);
