I am attempting to retrieve some information from a website, parse out a specific item, and then move on with my life.
I noticed that when I check "view source" on the website, the results match what I get from the WebClient class's DownloadFile method. On the other hand, when I use the DownloadString method, the contents of that string differ from both view source and DownloadFile.
I need DownloadString to return similar contents to view source and DownloadFile. Any suggestions? My relevant code is below:
string criticalPathUrl = "http://blahblahblah&sessionId=" + sessionId;
WebClient wc = new WebClient();
wc.Encoding = System.Text.Encoding.UTF8;
//this is different
string urlContentsString = wc.DownloadString(criticalPathUrl);
//than this
wc.DownloadFile(criticalPathUrl, "rawDlTxt2.txt");
Edit: Please ignore this question as I just didn't scroll up far enough. Ugh. One of those days.
Use DownloadData instead of DownloadString, decode the returned bytes with a suitable encoding to get the string, and then save the file.
See details: https://www.pavey.me/2016/04/aspnet-c-downloadstring-vs-downloaddata.html
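A minimal sketch of that approach, assuming the page is served as UTF-8 (check the response's Content-Type header if you are not sure). The RawDownload class and its Decode and FetchText helpers are illustrative names, not something from the linked article:

```csharp
using System.Net;
using System.Text;

static class RawDownload
{
    // Turns raw response bytes into text using an explicitly chosen encoding.
    public static string Decode(byte[] data, Encoding encoding)
    {
        return encoding.GetString(data);
    }

    // Hypothetical helper: downloads the raw bytes, then decodes them.
    public static string FetchText(string url, Encoding encoding)
    {
        using (var wc = new WebClient())
        {
            byte[] data = wc.DownloadData(url);
            return Decode(data, encoding);
        }
    }
}
```

With this in place, something like File.WriteAllText("rawDlTxt2.txt", RawDownload.FetchText(criticalPathUrl, Encoding.UTF8), Encoding.UTF8) should write the same text that DownloadFile saves, as long as the UTF-8 assumption holds.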
The following code:
var text = (new WebClient()).DownloadString("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
results in a variable text that contains, among many other things, the string
"$Îº$-Minkowski space, scalar field, and the issue of Lorentz invariance"
However, when I visit that URL in Firefox, I get
$κ$-Minkowski space, scalar field, and the issue of Lorentz invariance
which is actually correct. I also tried
var data = (new WebClient()).DownloadData("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
var text = System.Text.UTF8Encoding.Default.GetString(data);
but this gave the same problem.
I'm not sure where the fault lies here. Is the feed lying about being UTF8-encoded, and the browser is smart enough to figure that out, but not WebClient? Is the feed properly UTF8-encoded, but WebClient is failing in some other way? What can I do to mitigate this?
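It's not lying. WebClient decodes the response using its Encoding property, which defaults to the system default encoding, so you should set the WebClient's encoding before calling DownloadString.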
using(WebClient webClient = new WebClient())
{
webClient.Encoding = Encoding.UTF8;
string s = webClient.DownloadString("http://export.arxiv.org/api/query?search_query=au:Freidel_L*&start=0&max_results=20");
}
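As for why your alternative isn't working: the usage is incorrect. UTF8Encoding.Default is just the static Encoding.Default property inherited from the Encoding base class, so it gives you the system default encoding rather than UTF-8. It should be:
System.Text.Encoding.UTF8.GetString(data)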
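To see where the mangled characters come from, here is a small sketch that encodes text as UTF-8 and then decodes the bytes with a chosen encoding. It assumes the system default on the asker's machine was a Western single-byte code page; ISO-8859-1 stands in for it here, and the MojibakeDemo class and RoundTrip method are made-up names for illustration:

```csharp
using System.Text;

static class MojibakeDemo
{
    // Encodes text as UTF-8, then decodes the resulting bytes with the given encoding.
    public static string RoundTrip(string text, Encoding decodeWith)
    {
        byte[] data = Encoding.UTF8.GetBytes(text);
        return decodeWith.GetString(data);
    }
}
```

RoundTrip("κ", Encoding.GetEncoding("iso-8859-1")) yields "Îº", because the two UTF-8 bytes of κ (0xCE 0xBA) are each read as a separate character, while RoundTrip("κ", Encoding.UTF8) returns "κ" unchanged.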
I am trying to download the contents of this webpage into my program. I have tried WebClient.DownloadString, and I have tried WebClient.DownloadFile to save it to a file and then read it back from the local file, but neither works. When I use breakpoints in Visual Studio, I can see the string is stored correctly, but when I try to write it to a file or print it to the console, nothing is displayed.
What I am aiming to do is download this webpage's content into a String then parse it with JSON.NET.
Here is my attempt to save it to a file:
WebClient webpage = new WebClient();
var html = webpage.DownloadString("https://api.fixer.io/latest");
String k = html;
File.WriteAllText(@"C:\Users\JCena\Desktop\Hell1o.txt", k);
Your code is almost fine.
First, get rid of the "new" keyword.
Second, make sure you are not hitting a permissions exception for the specified folder.
Try this code:
WebClient webpage = new WebClient();
var html = webpage.DownloadString("https://api.fixer.io/");
String k = html;
File.WriteAllText(@"Hello.txt", k);
I have a website named:
string url = "http://180.92.171.80/ffs/data-flow-list-based/flood-forecasted-site/";
When I give it the station name, river name, and basin name, it returns the present water level. I want to do this in C#. I can read the HTML from C#, but the value is not in the HTML. Where should I start, and how can I do this easily? Does anyone have any idea?
Check out the WebClient class. You can use the DownloadString() method to get the html from your url as a string.
WebClient client = new WebClient();
// url is the address string from the question
string reply = client.DownloadString(url);
http://msdn.microsoft.com/en-us/library/fhd1f0sw.aspx
Okay so I want to download a file from a website, but the file is lacking an extension.
(it's an image file, I know this much, but the link does not provide the actual extension)
When I use WebRequest or WebClient to download the file, I get a "404 file not found" exception.
WebClient wc = new WebClient();
wc.DownloadFile("http://some_site.some_domain/some_image.", "C:/some_directory/save_name.some_extension");
Notice the lack of an extension at the end of the URL.
The site in question displays the image fine in a web browser, but when viewing just the image there is no extension, so it is treated as an unknown file (and no image is shown).
So simply put: how do I download a file if there is no extension specified?
Thanks in advance!
So you're trying to determine what extension to give the file after downloading? If the URL doesn't have one you would have to inspect the actual data of the file.
You might be able to inspect the beginning of the file and see if it matches a known file type. For instance, PNG files start with the signature bytes 89 50 4E 47 (hex), so bytes 2-4 are the ASCII characters 'PNG'. By looking at that data you should be able to determine the format with fairly high accuracy.
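A rough sketch of that kind of check, covering only four common formats. The ImageSniffer class and GuessExtension method are hypothetical names, and the signature list is nowhere near exhaustive:

```csharp
static class ImageSniffer
{
    // Returns a likely file extension based on well-known leading signature bytes,
    // or null when the data matches none of the signatures listed here.
    public static string GuessExtension(byte[] header)
    {
        if (StartsWith(header, 0x89, 0x50, 0x4E, 0x47)) return "png"; // 0x89 'P' 'N' 'G'
        if (StartsWith(header, 0xFF, 0xD8, 0xFF)) return "jpg";       // JPEG start-of-image marker
        if (StartsWith(header, 0x47, 0x49, 0x46, 0x38)) return "gif"; // "GIF8"
        if (StartsWith(header, 0x42, 0x4D)) return "bmp";             // "BM"
        return null;
    }

    // True when data begins with exactly the given signature bytes.
    private static bool StartsWith(byte[] data, params byte[] signature)
    {
        if (data == null || data.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (data[i] != signature[i]) return false;
        }
        return true;
    }
}
```

You would fetch the bytes with wc.DownloadData(url), pass them to GuessExtension, and then save them with File.WriteAllBytes using the returned extension (falling back to something like "bin" when it returns null).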
This would be my best suggestion. If this doesn't work, I don't know how to solve your problem...
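// Other known image file extensions can be added here.
List<string> fileExtensions = new List<string>() { "png", "gif", "bmp", "jpg" };
WebClient wc = new WebClient();
foreach (var extension in fileExtensions)
{
    try
    {
        // Try the URL with this extension appended; stop at the first one that downloads.
        wc.DownloadFile("http://some_site.some_domain/some_image." + extension, "C:/some_directory/save_name." + extension);
        break;
    }
    catch (WebException)
    {
        // This extension was not found (for example, a 404); try the next one.
    }
}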
This would just be a workaround, I guess... Not a real solution...