Retrieve data from a website

Retrieve data from a website - c#

I want to grab a set of data from a site into my C# application. I've referred to some sites and articles using the WebClient class.
But the problem is that the data I want is in a news bar built with Flash. Is it possible to grab the data from it? The data also keeps updating.

Have you tried the Yahoo approach? The project below does just that.
It is easy to download stock data from Yahoo!. For example, copy and
paste this URL into your browser address:
http://download.finance.yahoo.com/d/quotes.csv?s=YHOO+GOOG+MSFT&f=sl1d1t1c1hgvbap2.
Depending on your Internet browser setting, you may be asked to save
the results into a filename called "quotes.csv" or the following will
appear in your browser:
http://www.codeproject.com/KB/aspnet/StockQuote.aspx?display=Normal

You can't grab the data from the Flash object itself.
One possible solution: dig into the embed tag of the Flash object, or look for a URL or RSS feed that the Flash movie appears to consume. You can then read that source directly with WebClient or (hopefully) XmlReader.
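To illustrate the idea above, here is a minimal sketch that assumes the news bar turns out to be fed by a plain RSS 2.0 feed. The class name FeedReader, its methods, and the feedUrl parameter are made up for this example, and HttpClient is used in place of WebClient. If the source is some other XML or JSON format, the parsing part would need to change.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using System.Xml.Linq;

// Hypothetical helper: assumes the Flash news bar reads an RSS 2.0 feed.
public static class FeedReader
{
    // Return the <title> text of every <item> element in the feed.
    public static List<string> ExtractItemTitles(string rssXml)
    {
        XDocument doc = XDocument.Parse(rssXml);
        return doc.Descendants("item")
                  .Select(item => (string)item.Element("title"))
                  .Where(title => !string.IsNullOrEmpty(title))
                  .ToList();
    }

    // Download the feed and parse it. feedUrl is whatever URL you found in
    // the embed tag or in the browser's network log (an assumption here).
    public static async Task<List<string>> FetchTitlesAsync(HttpClient client, string feedUrl)
    {
        string xml = await client.GetStringAsync(feedUrl);
        return ExtractItemTitles(xml);
    }
}
```

Since the data keeps updating, you would call FetchTitlesAsync again on a timer rather than once.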

Related

Using .Net how can I programmatically navigate to a webpage, interact with it via code, then get particular values from a newly generated page

I have a scenario where I would like to automate programmatically the following process:
Currently, I have to manually
Navigate to a webpage
Enter some text (an email) in a certain field on the webpage
Press the 'Search' button, which generates a new page containing a Table with the results on it.
Manually scroll through the generated results table and extract 4 pieces of information.
Is there a way for me to do this from a Desktop WPF App using C# ?
I am aware there is a WebClient type that can download a string, presumably of the content of the webpage, but I don't see how that would help me.
My knowledge of web based stuff is pretty non-existent so I am quite lost how to go about this, or even if this is possible.

I think a web driver is what you're looking for. I would suggest Selenium: it lets you navigate to sites and send input or clicks to specific elements on them.

Well, I'll write the algorithm for you, but you also need to do some homework.
Use WebClient to get the HTML page with the form you want to auto-fill and submit
Use a regex to extract the action attribute of the form you want to auto-submit. That gets you the URL you want to submit your next request to.
Since you know the fields in that form, create a class corresponding to those fields, let's call the class AutoClass
Create a new instance of your auto class and assign values you want to auto fill
Use WebClient to send your new request to the URL you extracted from the form previously, and attach the object you want to send to the server, either through serialization or any other method.
Send the request and wait for feedback, then further action
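The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: the form is a plain URL-encoded POST, HttpClient is swapped in for WebClient, a dictionary stands in for the AutoClass object, and the FormSubmitter name is made up. The dictionary keys must match the name attributes of the form's input fields, which you have to read from the page yourself.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical helper illustrating the fetch / extract / post steps.
public static class FormSubmitter
{
    // Find the action attribute of the first <form> tag and resolve it
    // against the URL of the page the form was found on.
    public static Uri ExtractFormAction(string html, Uri pageUrl)
    {
        Match m = Regex.Match(
            html,
            @"<form\b[^>]*\baction\s*=\s*[""']([^""']*)[""']",
            RegexOptions.IgnoreCase);

        // A form without an action attribute submits to the page itself.
        if (!m.Success) return pageUrl;

        // Attribute values are HTML-encoded (e.g. &amp;), so decode first.
        return new Uri(pageUrl, WebUtility.HtmlDecode(m.Groups[1].Value));
    }

    // Fetch the page, find the form target, post the fields and return
    // the HTML of the generated results page.
    public static async Task<string> SubmitAsync(
        HttpClient client, Uri pageUrl, IDictionary<string, string> fields)
    {
        string formHtml = await client.GetStringAsync(pageUrl);
        Uri action = ExtractFormAction(formHtml, pageUrl);

        using (var content = new FormUrlEncodedContent(fields))
        {
            HttpResponseMessage response = await client.PostAsync(action, content);
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

A regex is fragile against real-world HTML, so an HTML parser is the more robust choice once this works. Forms that rely on JavaScript or hidden anti-forgery fields need extra handling that is not shown here.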

Either use a web driver like Puppeteer (Selenium is kinda dead) or make plain HTTP(S) requests yourself (if you don't get stopped by bot checks). I feel like you're looking for the latter, because there is no reason to use a web driver in this case when a lighter method like HTTP requests will do.
You can use RestSharp or the built-in libraries if you want. Here is a popular thread of the ways to send requests with the libraries built in to C#.
To figure out what you need to send, use a tool like Fiddler or Chrome Dev Tools (specifically the Network tab) to see what the browser sends when you do the same thing by hand.

How do I download documents from AtTask?

I'm working on a continuing API project. The current issue at hand is to be able to download my data from the AtTask server in precisely the folder structure they exist in on the AtTask servers. I've got the folder creation working nicely; the data types between Document, Document Folder and Document Version seem to be pretty clear. I am a little disillusioned about the fact that extension isn't in the document object (that I have to refer to the document VERSION for that)... but I can see some of the reason for that from a design perspective.
The issue I'm running into now is that I need to get the file content. I originally thought from the API documentation that I'd be able to get to the file contents the same way as the documentation recommends uploading it -- through the handle. Unfortunately, neither document nor docv seem to support me accessing the handle except to write a new file.
So that leaves me the "download URL" as the remaining option. If I build the UI strings from the API calls using my browser, I get a URL with https://attaskURL/document/download?ID=xxxx (and can also get the versionID and such). If I paste the url into the browser where I'm logged in to the user interface of AtTask, it works fine and I can download the file. If, instead, I use my C# code to do so, I get the login page returned as a stream for me to download instead of my actual file because I'm not authenticated. I've tried creating a network credential and attaching it to the request with the username and password, but to no avail.
I imagine there's a couple ways to solve this problem -- the easy one being finding a way to "log in" to the download site through code (which doesn't seem to be the usual network credential object in C#) OR find a way to access the file contents through the API.
Appreciate your thoughts!

It looks like you can use the download URL if you put a session id in the URL. The details on getting a session id are here (basically just call login and a session id is returned in JSON):
http://developers.attask.com/api-docs/#Authentication
Then cram it on the end of your document download URL:
https://yourcompany.attask-ondemand.com/document/download?ID=xxxx&sessionID=abc1234
I've given this a quick test and I'm able to access a document.
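In C#, that could look something like the sketch below. It assumes you already have a session id from the login call described in the API docs (the login request itself is not shown), and the AtTaskDownloader class and its method names are invented for this example.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: builds the download URL with a sessionID appended.
public static class AtTaskDownloader
{
    // host is e.g. "yourcompany.attask-ondemand.com".
    public static Uri BuildDownloadUrl(string host, string documentId, string sessionId)
    {
        string url = "https://" + host + "/document/download"
                   + "?ID=" + Uri.EscapeDataString(documentId)
                   + "&sessionID=" + Uri.EscapeDataString(sessionId);
        return new Uri(url);
    }

    // Stream the response body straight to disk.
    public static async Task DownloadAsync(HttpClient client, Uri url, string targetPath)
    {
        using (HttpResponseMessage response =
                   await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            using (Stream source = await response.Content.ReadAsStreamAsync())
            using (FileStream target = File.Create(targetPath))
            {
                await source.CopyToAsync(target);
            }
        }
    }
}
```

If the saved file turns out to be the HTML login page again, the session id was probably missing, expired, or not accepted for that account.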

You can use the downloadURL and a sessionID IF you are not using SAML authentication.
I have tried it both ways and using SAML will redirect you to the login page.

Handling Authentication for a File Display using a Web Service

This is my first time developing this kind of system, so many of these concepts are very new to me. Any and all help would be appreciated. I'll try to sum up what I'm doing as efficiently as possible.
Background: I have a web application running AngularJS with Bootstrap. The app communicates with the server and DB through a web service programmed using C#. On the site, users can upload files and reference them later using direct links. There's no restriction to file type (yet), so just about anything is allowed.
My Goal: Having direct links creates a big security problem for me, since the documents/images are supposed to be private data. What I would prefer to do is validate a user's credentials when the link is clicked, then load the file in the browser using a more generic url path.
--Example--
"mysite.com/attachments/1" ---> (Image)
--instead of--
"mysite.com/data/files/importantImg.jpg"
Where I'm At: Not very far. My first thought was to add a page that sends the server request and receives a file byte stream along with mime type that I can reassemble and present to the user. However, I have no idea if this is possible using a web service that sends JSON requests, nor do I have a clue about how the reassembling process would work client-side.
Like I said, I'll take any and all advice. I'd love to learn more about this subject for future projects as well, but for now I just need to be pointed in the right direction.

Your first thought is correct. To do it, use the Response object, and more specifically its AddHeader and Write functions. Of course this will be a separate page that only handles file downloads, so it can sit perfectly well alongside your JSON web service.

I don't think you want to do this with a web service. Just use a regular IHttpHandler to perform the validation and return the data. So you would have the URL "attachments/1" get rewritten to "attachments/download.ashx?id=1". When you've verified access, write the data to the response stream. You can use the Content-Disposition header to set the file name.
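A rough sketch of such a handler is below. It targets classic ASP.NET (System.Web), so it will not compile as-is on ASP.NET Core. The AttachmentHandler name and the two lookup methods are placeholders invented for the example; they stand in for your own database queries and permission rules.

```csharp
using System;
using System.IO;
using System.Web;

// Sketch of a download handler for attachments/download.ashx?id=N.
public class AttachmentHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        int id;
        if (!int.TryParse(context.Request.QueryString["id"], out id))
        {
            context.Response.StatusCode = 400; // missing or malformed id
            return;
        }

        // Validate the caller before touching the file.
        if (context.User == null
            || !context.User.Identity.IsAuthenticated
            || !UserMayRead(context.User.Identity.Name, id))
        {
            context.Response.StatusCode = 403;
            return;
        }

        string path = LookUpFilePath(id);
        if (path == null || !File.Exists(path))
        {
            context.Response.StatusCode = 404;
            return;
        }

        string fileName = Path.GetFileName(path);
        context.Response.ContentType = MimeMapping.GetMimeMapping(fileName);
        // "inline" asks the browser to display the file; use "attachment"
        // to force a download dialog instead.
        context.Response.AddHeader(
            "Content-Disposition", "inline; filename=\"" + fileName + "\"");
        context.Response.TransmitFile(path);
    }

    // Placeholder: replace with your own permission check.
    private static bool UserMayRead(string userName, int attachmentId)
    {
        return false;
    }

    // Placeholder: replace with a lookup from attachment id to a file
    // stored outside the publicly served folders.
    private static string LookUpFilePath(int attachmentId)
    {
        return null;
    }
}
```

For this to close the hole, the files themselves have to live somewhere the web server does not serve directly; otherwise the old direct links keep working.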

Read contents JSON file sitting on webserver from c# code behind

I am trying to read the contents of a JSON file sitting in my github pages repository.
I can navigate and see the file contents in my browser if I specify the url.
If I use the code here:
http://www.codeproject.com/Tips/397574/Use-Csharp-to-get-JSON-Data-from-the-Web-and-Map-i?msg=4615047#xx4615047xx
It claims to "just work", but it doesn't.
All I get back is:
<html><frameset><frame src="URL-TO-JSON-FILE"></frameset></html>
How am I supposed to read the JSON file and get its contents back as a string? I am using C#.
Once I get the JSON string back I can do the processing I need to do in c#.
EDIT:
According to rawgithub.com those types of urls are not to be used for production. I need this for production. How do production websites read remote JSON files that are located on a webserver?
Thank you

Sometimes in github, if you wish to use code from a repository, you must change the url to raw.github.com/ or click on the raw button and use this url.
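As a sketch of that suggestion: the helper below rewrites a normal github.com file URL into its raw form and downloads the body as a string. It assumes the raw host is raw.githubusercontent.com, which is where GitHub's "Raw" button points these days; the RawJsonReader class and its method names are made up for the example.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: fetch a JSON file's raw contents as a string.
public static class RawJsonReader
{
    // https://github.com/{user}/{repo}/blob/{branch}/{path}
    //   becomes
    // https://raw.githubusercontent.com/{user}/{repo}/{branch}/{path}
    // Any other URL is returned unchanged.
    public static string ToRawUrl(string blobUrl)
    {
        const string prefix = "https://github.com/";
        const string marker = "/blob/";

        if (!blobUrl.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            return blobUrl;

        string rest = blobUrl.Substring(prefix.Length);
        int index = rest.IndexOf(marker, StringComparison.Ordinal);
        if (index < 0) return blobUrl;

        return "https://raw.githubusercontent.com/"
             + rest.Substring(0, index) + "/"
             + rest.Substring(index + marker.Length);
    }

    // Download the file body; the result is the JSON text itself,
    // not an HTML page wrapping it.
    public static async Task<string> FetchJsonStringAsync(HttpClient client, string url)
    {
        return await client.GetStringAsync(ToRawUrl(url));
    }
}
```

If the file is part of a GitHub Pages site, requesting it through the published site URL is another option, since that is served as an ordinary static file.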

How to upload files using Yahoo uploader widget in asp.net

Many might have had experience using File Upload widget from Yahoo User Interface library. The docs and community all know how to receive the files on the server using another server technology other than ASP.NET. If anyone has indeed used the widget in their asp.net pages could you share the code on
How to receive the uploaded files Stream/Bytes to a file.
How to check Integrity of the File
How to check if file was received correctly.
Also, I would love to do it in a single page, because that way I would learn how to differentiate between a normal webpage request and one caused by my file upload widget.
Yahoo Upload Widget can be Found here: https://developer.yahoo.com/yui/uploader/.

Have you tried looking at the postedfiles collection though? The API looks like it does a standard post. If it does, then just use that collection.
If it doesn't, then you need to use the inputstream property on the request object to read the incoming bytes.
Using something like Fiddler or Firebug will tell you how it's making the request. Look for the request content type being multipart/form-data.
edit
Checking the file integrity and whether it was uploaded correctly is pretty much impossible. The only way I can think of is to have the user generate a hash of the file, upload both the file and the hash, and have the server check that the hash is valid, i.e. not really practical.
All you're getting is a stream of bytes. You have to assume that when the stream ends, it ended cleanly and you got the whole file.
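For what it's worth, the server side of that hash check is short. The sketch below assumes the client sends a SHA-256 hash as a hex string along with the file; the choice of SHA-256, the UploadIntegrity class and the way the hash reaches the server are all assumptions, since the uploader widget does not do this for you.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: compare an uploaded stream with a client-supplied hash.
public static class UploadIntegrity
{
    // Hash the whole stream and return the digest as lowercase hex.
    public static string ComputeSha256Hex(Stream data)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(data);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // True when the received bytes hash to the value the client sent.
    public static bool MatchesExpectedHash(Stream uploaded, string expectedHex)
    {
        string actual = ComputeSha256Hex(uploaded);
        return string.Equals(actual, expectedHex.Trim(), StringComparison.OrdinalIgnoreCase);
    }
}
```

A mismatch tells you the upload was truncated or altered in transit. It says nothing about whether the file's contents are safe or well formed.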

I answered my own question with code over here.
http://labs.deeptechtons.com/asp-net-tuts/how-to-upload-files-asynchronously-using-yahoo-uploader/
