C# WebClient.DownloadFileTaskAsync downloads a corrupted 1KB PDF

I have a WebClient created in a WebBrowser_Navigating event handler. The handler cancels navigation (to prevent the manual file-download dialog) and passes the requested URL to the WebClient's DownloadFileTaskAsync method:
await client.DownloadFileTaskAsync(e.Url, AppDomain.CurrentDomain.BaseDirectory + "\\SUCCESS.pdf");
I have already set the SecurityProtocolType to Tls12 and passed all cookies and other headers to the WebClient.
The expected file is about 11 MB, but the file that arrives is a corrupted 1 KB PDF.

I'll assume that you are downloading the HTML page instead of the actual file. If that is the case, you will need to scrape the download link using an HTML parser and XPath to navigate the HTML (for example, the HTML Agility Pack).
If that's not the case, could you print out exactly what e.Url contains, so we can see which URL you are passing to await client.DownloadFileTaskAsync(...)?
Another possible problem is that you don't correctly dispose of your WebClient, which might interfere with the file being written. It would help if you added more information about your code to your question.
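For reference, a minimal sketch of the pattern the question describes, with the cookie hand-off and the TLS setting included; the form class, the PDF check, and the user-agent value are assumptions rather than the asker's actual code:

using System;
using System.IO;
using System.Net;
using System.Windows.Forms;

public partial class MainForm : Form   // hypothetical form hosting webBrowser1
{
    private async void webBrowser1_Navigating(object sender, WebBrowserNavigatingEventArgs e)
    {
        // Only intercept PDF links; let everything else navigate normally.
        if (!e.Url.AbsolutePath.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase))
            return;

        e.Cancel = true; // suppress the built-in file-download dialog

        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

        using (var client = new WebClient())
        {
            // Hand the browser's session cookies to the WebClient so the server
            // serves the real PDF instead of a login or error page.
            client.Headers[HttpRequestHeader.Cookie] = webBrowser1.Document.Cookie;
            client.Headers[HttpRequestHeader.UserAgent] = "Mozilla/5.0";

            string target = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "SUCCESS.pdf");
            await client.DownloadFileTaskAsync(e.Url, target);
        }
    }
}

If the saved 1 KB file opens in a text editor as HTML, that would confirm the server returned a login or error page rather than the PDF.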

Related

How to make WebClient perform all the other GET requests a browser would make

I am trying to simulate a real web browser request, and it turns out that when I use this code:
WebClient client = new WebClient();
client.DownloadFile(address, localFilename);
I get only the GET to the address (of course), whereas in a browser the behavior is many further GET requests for images, blogger content, etc.
Is there a shortcut to get/simulate the same behavior, or is the only alternative to parse the file/string and make all these requests myself manually?
Yes, a browser processes the specific type of file (typically HTML) and parses it. Depending on what the file contains (links to other files such as images, etc.), the browser then opens many other connections to fetch all those other files and display them.
That doesn't come for free; you have to do it yourself. DownloadFile just downloads a file, which may or may not be an HTML file, so it doesn't handle all possible file types or process all linked files.
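If you do want to imitate that behavior, a rough sketch with WebClient plus HtmlAgilityPack is shown below; the page address, the XPath selector, and the downloads folder are illustrative assumptions, not part of the original question:

using System;
using System.IO;
using System.Net;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        var pageUri = new Uri("http://example.com/");   // placeholder address
        using (var client = new WebClient())
        {
            string html = client.DownloadString(pageUri);

            var doc = new HtmlDocument();
            doc.LoadHtml(html);

            // Fetch every <img src="..."> the way a browser would; scripts and
            // stylesheets could be handled similarly via //script/@src and //link/@href.
            var imgs = doc.DocumentNode.SelectNodes("//img[@src]");
            if (imgs == null) return;

            Directory.CreateDirectory("downloads");
            foreach (var img in imgs)
            {
                var src = new Uri(pageUri, img.GetAttributeValue("src", ""));
                string file = Path.Combine("downloads", Path.GetFileName(src.LocalPath));
                client.DownloadFile(src, file);
            }
        }
    }
}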

How to download images from response of Readability Parser API in C#

I'm using the Readability Parser API to get the content of a page.
After the result is received, the content goes to kindlegen.exe (to generate a .mobi file) and is then sent to my Kindle via email. The problem is that the content I get from the Readability Parser API contains <img> tags pointing to remote images, so I need to download them first and only then launch kindlegen.exe.
The question is: how do I download the remote images from the article to my disk in an efficient way? The only solution I can see is to use a regexp to parse the response and extract the <img> tags, then extract the src attribute and finally download the images, but that is definitely the worst way.
I'm using ASP.NET MVC.
Looks like I need HtmlAgilityPack. I'll detach this task from the web application into a console app.
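For what it's worth, a sketch of that HtmlAgilityPack approach; the helper name, the image folder, and the rewriting of src attributes to local paths are assumptions about what kindlegen needs:

using System;
using System.IO;
using System.Net;
using HtmlAgilityPack;

static class ImageLocalizer
{
    // Hypothetical helper: downloads every remote <img> in the article HTML to a
    // local folder and rewrites the src attributes so kindlegen can find the files.
    public static string LocalizeImages(string articleHtml, string imageFolder)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(articleHtml);
        Directory.CreateDirectory(imageFolder);

        var imgs = doc.DocumentNode.SelectNodes("//img[@src]");
        if (imgs != null)
        {
            using (var client = new WebClient())
            {
                foreach (var img in imgs)
                {
                    // Assumes the Readability content uses absolute image URLs.
                    var remote = new Uri(img.GetAttributeValue("src", ""));
                    string localPath = Path.Combine(imageFolder, Path.GetFileName(remote.LocalPath));
                    client.DownloadFile(remote, localPath);
                    img.SetAttributeValue("src", localPath);
                }
            }
        }
        return doc.DocumentNode.OuterHtml;
    }
}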

How can I use HTMLAgilityPack to download a CSV file?

I have code to go to a website, log in with the hidden fields and cookies, and include a browser header so that I appear as a normal user.
Now that I can reach the protected content, I need to download a CSV file that I have found within the document using HtmlAgilityPack.
I would like to grab the CSV with HtmlAgilityPack so that I can continue to use the cookies and browser user-agent string already set up.
From what I have read, HtmlAgilityPack parses the DOM, so I would expect a CSV file to cause an error and return null. But I have seen vague references to being able to grab the raw data of the requested page/file before it is parsed. If so, that would be the solution, but I cannot find out how to do that.
You don't need to use HtmlAgilityPack at all, assuming the HTML form you're submitting is constant. Just craft the HTTP request manually and submit it, then download the corresponding CSV file using an HttpWebRequest.
HtmlAgilityPack is only for working with HTML you already have in your possession. It does include the ability to make basic HTTP requests, but that's a convenience feature; generally you should use HttpWebRequest where possible.
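A rough sketch of that approach; the URL is a placeholder, and the cookie container and user-agent string stand in for the ones already set up during the login step:

using System.IO;
using System.Net;

// cookies is assumed to be the CookieContainer already populated by the login request.
CookieContainer cookies = new CookieContainer();

var request = (HttpWebRequest)WebRequest.Create("https://example.com/report.csv"); // placeholder URL
request.Method = "GET";
request.CookieContainer = cookies;   // reuse the authenticated session
request.UserAgent = "Mozilla/5.0";   // same browser header used for the login

using (var response = (HttpWebResponse)request.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
{
    string csv = reader.ReadToEnd(); // raw CSV text, no HTML parsing involved
    File.WriteAllText("report.csv", csv);
}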

How to use WebClient.UploadFileAsync to upload files and POST Params as well?

I am using WebClient.UploadFileAsync to upload local files to a web server, and I would also like to pass some parameters with the POST. I would like to send a few fields that tell the PHP code on the server where to save the uploaded files.
I tried to put the fields directly into the url, for example:
WebClient client = new WebClient();
Uri uri = new Uri("http://example.com/upload.php?field1=test");
client.UploadFileAsync(uri, "POST", @"c:\test.jpg");
The PHP code returns false for isset($_REQUEST['field1']).
Thank you for any suggestions.
NOTE: this question was also asked in a very similar format for VB.NET a while back, but it did not get any answers.
WebClient's UploadFile is designed to send only a file (as byte[]) as part of the request. From my understanding, the UploadFile method closes the request stream after writing the binary data.
In your scenario, the request actually has two parts: 1. the file as byte[], and 2. the additional form field as a string.
To do this, you have to use HttpWebRequest or any other high-level class capable of creating the request.
Refer to the post http://www.codeproject.com/KB/cs/uploadfileex.aspx?display=Print, which does a similar job.
This article goes into detail about what is needed to accomplish posting of fields while uploading files using WebClient.
Unfortunately, most file upload scenarios are HTML form based and may contain form fields in addition to the file data. This is where WebClient falls flat. After review of the source code for WebClient, it is obvious that there is no possibility of reusing it to perform a file upload including additional form fields.
So, the only option is to create a custom implementation that conforms to RFC 1867, RFC 2388 and the W3C multipart/form-data specification that will enable file upload with additional form fields and exposes control of cookies and headers.
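For illustration, a bare-bones hand-built multipart/form-data POST; the URL, field name and file path mirror the question, and the boundary handling is the minimum the specification requires:

using System;
using System.IO;
using System.Net;
using System.Text;

class UploadSketch
{
    static void Main()
    {
        string boundary = "----" + Guid.NewGuid().ToString("N");
        string filePath = @"c:\test.jpg";

        var request = (HttpWebRequest)WebRequest.Create("http://example.com/upload.php");
        request.Method = "POST";
        request.ContentType = "multipart/form-data; boundary=" + boundary;

        using (var body = request.GetRequestStream())
        {
            // Ordinary form field (this is what $_REQUEST['field1'] will see).
            string fieldPart =
                "--" + boundary + "\r\n" +
                "Content-Disposition: form-data; name=\"field1\"\r\n\r\n" +
                "test\r\n";
            byte[] fieldBytes = Encoding.UTF8.GetBytes(fieldPart);
            body.Write(fieldBytes, 0, fieldBytes.Length);

            // File part.
            string fileHeader =
                "--" + boundary + "\r\n" +
                "Content-Disposition: form-data; name=\"file\"; filename=\"test.jpg\"\r\n" +
                "Content-Type: application/octet-stream\r\n\r\n";
            byte[] headerBytes = Encoding.UTF8.GetBytes(fileHeader);
            body.Write(headerBytes, 0, headerBytes.Length);

            byte[] fileBytes = File.ReadAllBytes(filePath);
            body.Write(fileBytes, 0, fileBytes.Length);

            // Closing boundary ends the multipart body.
            byte[] trailer = Encoding.UTF8.GetBytes("\r\n--" + boundary + "--\r\n");
            body.Write(trailer, 0, trailer.Length);
        }

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            Console.WriteLine(reader.ReadToEnd()); // whatever upload.php echoes back
        }
    }
}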
I would look into using the QueryString property of the WebClient to set the value of field1 (as well as any other query-string parameters for the request).
NameValueCollection query = new NameValueCollection();
query.Add("field1", "test");
client.QueryString = query;
Reference: http://msdn.microsoft.com/en-us/library/system.net.webclient.querystring(v=VS.100).aspx

Forcing image to download

I'm using a WebBrowser control in Silverlight and I'm pointing it at a local HTML page. The HTML page has various links and they all work fine. Can I make it so that if the user clicks on an image file, it downloads to their system (or does the default behavior of the browser) instead of displaying on the webpage? The main question is: is it possible to do this if I don't have access to the server itself? Thanks
Edit: Is it possible to send an HttpWebRequest to get the image and then edit the response headers, all from the client? This may be an alternative.
The standard way of doing this is to send the Content-Disposition HTTP header with attachment as the value. See here for more on this: Uses of content-disposition in an HTTP response header
But if you don't have access to the server, I don't think you can achieve this.
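For completeness, here is what setting that header could look like on the server side; this sketch assumes an ASP.NET IHttpHandler you control (which, per the question, isn't available here), and the handler name and picture.png path are hypothetical:

using System.Web;

public class ImageDownloadHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        // "attachment" tells the browser to save the file instead of rendering it.
        context.Response.ContentType = "image/png";
        context.Response.AddHeader("Content-Disposition", "attachment; filename=\"picture.png\"");
        context.Response.WriteFile(context.Server.MapPath("~/images/picture.png"));
    }
}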
