Downloading data from a hyperlink on a webpage - c#

I wish to download some data on a daily basis. I can get the data manually by loading the web page and clicking a link near the top right-hand corner called 'History Download'. This link opens an Excel file with the data I require.
Using either C# or VBA, is there any way to automate this process, and if so, how?
Edit
Here is the code I currently have. It downloads a text file containing the HTML of a web page, although looking at the HTML, it appears to be the home page. I was hoping this link would download the data as an Excel file. I originally saved it as text.xlsx, but that didn't work, so the code below saves the file as .txt.
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        string path = "http://www.ishares.com/uk/institutional/en/products/251382/ishares-msci-world-minimum-volatility-ucits-etf/1393511975017.ajax?fileType=xls&fileName=iShares-MSCI-World-Minimum-Volatility-UCITS-ETF";
        // Verbatim string literal (@"...") so the backslashes are not treated as escapes.
        string pathSave = @"C:\MyFolder\test.txt";
        WebClient wc = new WebClient();
        wc.DownloadFile(path, pathSave);
    }
}

Related

Trying to assign a hyperlink to a PDF location inside a PDF using code-behind in C# ASP.NET

I am trying to assign a hyperlink to a PDF location inside a PDF using C# ASP.NET Web Forms.
This is my C# code, which assigns the link URL to the PDF field:
protected void FillPDF()
{
    Dictionary<string, string> formFieldMap;
    pdfPath = Path.Combine(Server.MapPath("../img/fullapplication_final.pdf"), ""); // need to take
    formFieldMap = PDFHelper.GetFormFieldNames(pdfPath);
    string livepath = "http://www.example.com/";
    // QueryString[...] already returns a string (or null when the parameter is missing),
    // so test it directly instead of calling ToString() on a possible null.
    if (!string.IsNullOrEmpty(Request.QueryString["RegistrationId"]))
    {
        bo.Para1 = Request.QueryString["RegistrationId"];
        bo.Para2 = "3";
        DataTable dt = bl.Admin_Get_UserInformation(bo);
        formFieldMap["text_attchedfilertpin"] = livepath + "TrainingPlan/" + dt.Rows[0]["TrainingPlan"].ToString();
    }
}
This code shows a URL like www.example.com/my.pdf as its output.
But I need the output to look like this: click here to download pdf
I am trying the new code below to get the output I need:
HyperLink DynLink = new HyperLink();
DynLink.ID = "DynLink1";
DynLink.Text = "click here to download pdf";
DynLink.NavigateUrl = livepath + "TrainingPlan/" + dt.Rows[0]["TrainingPlan"].ToString();
Page.Controls.Add(DynLink);
But I'm not able to assign this link to the PDF field using
formFieldMap["text_attchedfilertpin"]
I would appreciate your help. Thank you in advance.
For the PDF link to be treated as a file download, you need to add a Content-Disposition: attachment; filename=xxx.pdf HTTP header to the response (see this for a code example: code to download a PDF file in C#).
Let's say you want a link http://www.example.com/plans/my123.pdf that, when clicked, initiates a PDF download of a training plan called "my123".
You can create an HTTP handler: a class PlanPDF that implements IHttpHandler. In the handler code you can set the right Content-Type, Content-Disposition, and Content-Length headers and transmit the PDF file as in the link above. See this article for a simple example of IHttpHandler.
Next, you need to configure URL rewriting so that requests to /plans/my123.pdf are mapped to your handler PlanPDF.
You can do this in your Web.config (see the same CodeProject article for an example).
Finally, parse the plan name from the request URL path and use it to determine which training plan file to transmit.
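The last step, parsing the plan name and building the header value, might be sketched as below. This is only an illustration under assumptions: the class name PlanDownload and both method names are hypothetical, and the rule that plan names contain only letters, digits, '-' and '_' is an assumption added so that a request path cannot point outside the plans folder.

```csharp
using System;
using System.IO;

// Hypothetical helper; the class and method names are illustrative only.
public static class PlanDownload
{
    // Extracts "my123" from a request path such as "/plans/my123.pdf".
    // Returns null when the path is not a .pdf or the name looks unsafe.
    public static string GetPlanName(string requestPath)
    {
        if (string.IsNullOrEmpty(requestPath)) return null;
        if (!requestPath.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase)) return null;

        string name = Path.GetFileNameWithoutExtension(requestPath);
        if (string.IsNullOrEmpty(name)) return null;

        // Assumption: plan names use only letters, digits, '-' and '_'.
        foreach (char c in name)
        {
            if (!char.IsLetterOrDigit(c) && c != '-' && c != '_') return null;
        }
        return name;
    }

    // Builds the Content-Disposition value that asks the browser to download the file.
    public static string BuildContentDisposition(string planName)
    {
        return "attachment; filename=\"" + planName + ".pdf\"";
    }
}
```

Inside the handler's ProcessRequest you would pass context.Request.Path to GetPlanName, return a 404 when it yields null, and otherwise add the header before transmitting the file.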

Find then save web page to Drive Using C#

I have a problem: I want to find a specific string in a web page and then save the page in which I found the string.
I am using Firefox as my web browser.
Problem:
1. I open a page (containing a random word).
2. My C# program then searches the page. If the word is found, the program automatically saves the page to the drive. If not, the program clicks the Next button on the page and searches again.
Is that possible?
OK, so it sounds like you might want to do something like the following.
You can use WebClient to load the response from a URL into a string:
using (WebClient client = new WebClient())
{
    string s = client.DownloadString(your_url);
}
You can then search for an occurrence of the string you are looking for in "s" using IndexOf:
if (s.IndexOf("string you are searching for") > -1)
{
    // s contains "string you are searching for"
}
Then you can save "s" to disk using a StreamWriter:
using (StreamWriter sw = new StreamWriter("file name"))
{
    sw.WriteLine(s);
}
As for clicking the "Next" button, you could define the URLs as a list of strings and then iterate over them, running the previous code for each.
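Iterating over a list of URLs could look roughly like this. It is a sketch, not a definitive implementation: the class name PageSearch and the method FindFirstMatch are hypothetical, the search is case-insensitive by assumption, and the page download is passed in as a delegate so the loop can be tried without a network connection.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; names are illustrative only.
public static class PageSearch
{
    // Walks the URLs in order and returns the first one whose content contains
    // the search term (case-insensitive), or null if none does.
    // 'fetch' downloads a page; 'content' receives the matching page's text.
    public static string FindFirstMatch(
        IEnumerable<string> urls,
        string term,
        Func<string, string> fetch,
        out string content)
    {
        foreach (string url in urls)
        {
            string body = fetch(url);
            if (body != null && body.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                content = body;
                return url;
            }
        }
        content = null;
        return null;
    }
}
```

In real use you would pass client.DownloadString from a WebClient as the fetch argument and, when the result is not null, write the returned content to disk, for example with File.WriteAllText.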

How to detect the origin of a webpage's GET requests programmatically? (C#)

In short, I need to detect a webpage's GET requests programmatically.
The long story is that my company is currently trying to write a small installer for a piece of proprietary software that installs another piece of software.
To get this other piece of software, I realize it's as simple as calling the download link through C#'s lovely WebClient class (Dir is just the Temp directory in AppData/Local):
using (WebClient client = new WebClient())
{
    client.DownloadFile("[download link]", Dir.FullName + "\\setup.exe");
}
However, the page the installer comes from is not a direct download page. The actual download link is subject to change (our company's specific installer might be hosted on a different download server another time around).
To get around this, I realized that I can just monitor the GET requests the page makes and dynamically grab the URL from there.
So I know what I'm going to do, but I was wondering: is there a built-in part of the language that lets you see what requests a page has made? Or do I have to write this functionality myself, and if so, what would be a good starting point?
I think I'd do it like this. First download the HTML contents of the download page (the page that contains the link to download the file). Then scrape the HTML to find the download link URL. And finally, download the file from the scraped address.
using (WebClient client = new WebClient())
{
    // Get the website HTML.
    string html = client.DownloadString("http://[website that contains the download link]");
    // Scrape the HTML to find the download URL (see below).
    // ExtractDownloadLink is a placeholder for your own scraping code.
    string downloadLink = ExtractDownloadLink(html);
    // Download the desired file.
    client.DownloadFile(downloadLink, Dir.FullName + "\\setup.exe");
}
For scraping the download URL from the website I'd recommend using the HTML Agility Pack. See here for getting started with it.
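If you want to try the idea before pulling in a library, the scraping step can be roughly sketched with a regular expression. Note that this swaps in Regex for the HTML Agility Pack purely to stay dependency-free; it is far less robust on real-world markup. The class name LinkScraper, the method FindDownloadLink, and the rule "first link whose path ends with a given extension" are all assumptions made for illustration.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper; names are illustrative only.
public static class LinkScraper
{
    // Returns the absolute URL of the first <a href="..."> whose path ends with
    // the given extension (for example ".exe"), or null if no link qualifies.
    public static string FindDownloadLink(string html, Uri pageUri, string extension)
    {
        // Naive pattern: an <a ...> tag with a quoted href attribute.
        const string pattern = "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']+)[\"']";
        foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            // Attribute values may contain entities such as &amp;.
            string href = WebUtility.HtmlDecode(m.Groups[1].Value);
            Uri absolute;
            // Resolve relative links against the page address.
            if (Uri.TryCreate(pageUri, href, out absolute) &&
                absolute.AbsolutePath.EndsWith(extension, StringComparison.OrdinalIgnoreCase))
            {
                return absolute.AbsoluteUri;
            }
        }
        return null;
    }
}
```

The returned string can be passed straight to client.DownloadFile. If the page builds its link with JavaScript, the URL will not appear in the downloaded HTML and this approach will not find it.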
I think you have to write your own "mediahandler", which returns a HttpResponseMessage.
For example, with Web API 2:
[HttpGet]
[AllowAnonymous]
[Route("route")]
public HttpResponseMessage GetFile([FromUri] string path)
{
    HttpResponseMessage result = new HttpResponseMessage(HttpStatusCode.OK);
    // StreamContent takes ownership of the stream and disposes it with the response.
    result.Content = new StreamContent(new FileStream(path, FileMode.Open, FileAccess.Read));
    string fileName = Path.GetFileNameWithoutExtension(path);
    string disposition = "attachment";
    // Use "path" here; the original referenced an undefined variable "absolutePath".
    result.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue(disposition) { FileName = fileName + Path.GetExtension(path) };
    result.Content.Headers.ContentType = new MediaTypeHeaderValue(MimeMapping.GetMimeMapping(Path.GetExtension(path)));
    return result;
}

How to select a text box on a webpage

How can I select a text box that is available on a webpage so that my program can add data to the selected text box?
I am trying to set up a C# program that will automatically log in to a series of websites.
Example website:
http://what.cd/login.php
Current Code:
private void login()
{
    System.Net.HttpWebRequest whatCDReq = (System.Net.HttpWebRequest)System.Net.WebRequest.Create("http://what.cd/login.php");
    HTMLDocument htmlDoc = new HTMLDocumentClass();
    htmlDoc = (HTMLDocument)webBrowser1.Document;
    HTMLInputElement username = (HTMLInputElement)htmlDoc.all.item("p", 0);
    username.value = "Test";
}
Look, what you want to do is send form requests to the server. Parse the webpage for text box form controls and submit the data in a format that the server can use (usually, the data handling is done within PHP on the server end).
Look in the web page source for a reference to the JavaScript function that performs the action itself (it should format the data and send it to the server). I'd recommend implementing that by translating it to your language of choice, or you could run the JavaScript function directly through some third-party library (despite what you may think, I find that the first option is ultimately easier for small tasks like this).
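Submitting the data "in a format that the server can use" usually means a URL-encoded form body. A minimal sketch of building one is below, assuming the form uses the common application/x-www-form-urlencoded encoding. The class name FormEncoder is hypothetical, and the field names you pass in must be taken from the name attributes of the real form's inputs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; names are illustrative only.
public static class FormEncoder
{
    // Joins name/value pairs as name=value&name=value, percent-encoding both
    // sides so characters such as '&' and spaces survive the trip.
    public static string Encode(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }
}
```

The resulting string can be sent with WebClient.UploadString after setting the Content-Type header to application/x-www-form-urlencoded. Many sites also expect cookies or hidden form fields, so a successful login may need more than this.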

Display dynamic images in a flash file

I bought a website template that has a scrolling photo gallery. As it came, the images are static in the fla file itself. I would like to edit the fla and load images dynamically. Ideally from MSSQL. I'm using VS2010, C# webforms, and SQL Server 2008 R2.
Are there any code snippets or tutorials or general guidance on how to do this? I do have a CS3 disc with Flash on it I can use for editing.
You can use a Loader + URLRequest, something like: (untested code)
var imgLoader:Loader = new Loader();
imgLoader.contentLoaderInfo.addEventListener(Event.COMPLETE, imageHasBeenLoaded);
imgLoader.load(new URLRequest("imagePath/from/database.jpg"));
// No "public" modifier here: it is only valid inside a class, not in timeline code.
function imageHasBeenLoaded(e:Event):void {
    // Get the loaded bitmap image, do what you want with it from here.
    var img:Bitmap = Bitmap(e.target.content);
}
Of course, you would also want to feed the file paths to Flash, either through FlashVars or by requesting a web-service-style page (or an XML file) with a Flash URLLoader + URLRequest. I prefer an XML file myself.
