Let's say that I navigate to Google. The browser will download several files, including images, JS scripts, etc. Can I access them from a member of WebBrowser, and if not, is there a particular approach to follow in .NET? I already know HtmlAgilityPack, but it works on local files only. Website behavior relies on a very strict structure of live, online documents and scripts, so I need something that works with online websites.
The WebBrowser control in .NET is based on an old browser engine (Internet Explorer), so you can't do anything complex with it, such as downloading files. However, .NET has a class called WebClient with which you can download and upload whatever you want...
Related
I am developing an ASP.NET MVC 5 web application using C#. I am trying to display an Excel file in an iframe:
<iframe src="../../Data/ExcelSheets/ProjectExpenditureDetails/20170917184328709.xls" width="100%" height="500"></iframe>
When the page loads, the browser always downloads the Excel file instead of displaying it in the iframe.
The browser's developer tools say: resource interpreted as document, but transferred with MIME type application/vnd.ms-excel
I don't know whether my approach is correct. If it is, how do I solve my problem? If it is wrong, what is the best way to display an Excel file in a web page?
I don't believe there is a way to do this natively in a browser. There are likely plugins that would allow it, but on a web site you can't guarantee that someone will have one installed. I believe third-party services are able to provide some JavaScript that allows you to open a document. Google Docs does something like this.
Think about Flash applications (that used to be a thing). They contained proprietary code that wouldn't run in a browser without a plugin installed. XLS files are similar. There are some exceptions, but mostly a browser can only be expected to understand HTML, CSS, JavaScript, and a number of image formats. Even PDFs require a plugin to view; you just don't see it very often because many browsers make that fairly seamless now.
Unfortunately, if you want to do this through a web site that needs to be secure, I believe the route to go is actually replacing the shared sheet with application-based functionality. Your clients may feel more comfortable about moving to Google Docs based sheets, which can be shared, but mine wouldn't touch it with a 10-foot pole. I'm not sure it is warranted, but that is how they feel.
One of the requirements for the web application I'm creating is that users should be able to create and edit documents. I've been searching around and I came across the Google Drive REST API, however I'm a little unsure about what it can do.
From what I understand, the API allows my application to access a user's Google Drive account and their files, being able to open and edit them, as well as create documents using my application.
However, I was hoping that I could use the Google Docs editor itself to create and open/edit documents, but from what I can gather, the editor is up to me to create, and I can use the Realtime API to enable the collaboration feature that Google Docs offers.
Is this the case? Is Google leaving the job of actually creating the document editor itself up to me (sorry if I sound like a whiny child here, it's an honest question), or does Drive API also provide their editor? The reason I want to use their editors is because it perfectly fits the requirements for the application, and it will be near impossible for me to compete with their document editor.
If I do have to create the editor myself, can anyone recommend any open source/free document editors with features similar to those of the Google Docs editor that work with C# ASP.NET, or a way that I could somehow use the Google Docs editor in my application?
The short answer is no, Google does not allow directly editing Google Docs themselves, nor is there an API for recreating the Docs editor.
Bear in mind also that realtime data is not actually stored in Google Drive. Google uses Drive as its organisation method for realtime data, but the data itself, being collaborative, is not just a simple file. What is stored in Drive is a shortcut which links to your app's realtime data. In the case of an existing file (text etc), a shortcut is attached to the file, but it can also be a pure shortcut file, with no non-realtime data at all. Only your app can read or modify that realtime data, in much the same way that only Docs can (directly) work with its realtime data.
You can definitely re-create the capabilities of Google Docs using the realtime API, by exporting from Docs, using the realtime API to collaborate on the exported data, then re-import into Docs if necessary. At that point, Google Docs themselves may be superfluous.
What's involved will be something like this:
Set up an app in the Google developers console
Write the editor, and incorporate it in your app
Get the user to authorize your app to access their Drive
Using the picker, or another method, get the user to select a file.
Import that file from Docs
Collaboratively edit it within your app
Export it back to Docs.
You can embed the Google Docs editor into your web app and use it to edit, comment on, or read files that are stored on Google Drive. You need to:
click the Share button in the file
choose the email addresses you want to share the document with (or you can choose anyone who has the link, or even make it public)
choose the permissions you want to grant: read, comment, edit
copy that link and paste it in the <iframe src=google_link width=x height=y></iframe> tag in your UI.
I have hundreds of PDF files that I need to present to users. When presenting these files through my MVC web app I don't want the users to have the ability to download the files, e.g., I don't want the Acrobat Reader controls for print/save showing up. Reading this Stack Overflow post, it seems that it's not possible to disable those controls.
I know users can still take screen shots and print out the page, but that's not an issue in my case.
What is the best way to go about this? I've researched using SWFTOOLS, which looks like it may be a good solution, but I don't want to save the SWF files to my filesystem. The optimal solution would be PDF.js, but my users will be accessing the files through IE8, so PDF.js is out of the question unless there is another similar library that will convert the files to HTML 4.
Basically I just need to display the PDF files, on the fly would be best, in a different format than PDF.
Any suggestions?
I had a similar project a while back, where sensitive PDFs needed to be displayed to specific users who weren't allowed to download, print, or save them.
Since it was a web app I ended up using pdf.js. It is Mozilla's PDF renderer for Firefox. It renders the PDF onto a canvas and by default has all the bells and whistles. If you have Firefox, open a PDF file to see it in action.
It was tough to get it running at first, but I ended up using a demo I found online as the base of the project. After removing each piece of functionality that was forbidden, the finished product did exactly what was required. You will need to add a print CSS file to block printing or find a better solution. I ended up using the CSS approach since print preview bypassed my JavaScript check for the print action. Also ensure you block Ctrl+S, which allows the user to save the PDF.
Another aspect to note is that it works better on later versions of IE and struggles on older versions as the file size increases. Firefox and Chrome are not a problem, and I believe it's the same for Opera, although I haven't tested that.
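As a rough illustration of the Ctrl+S blocking mentioned above, here is a minimal TypeScript sketch. It is not the code from that project: the names `ShortcutEvent`, `isSaveOrPrintShortcut`, and `blockSaveAndPrint` are hypothetical, and it assumes the handler is registered for the page's `keydown` event. The event shape is declared locally so the logic can also be exercised outside a browser.

```typescript
// Minimal shape of a keyboard event (hypothetical interface; a real
// browser KeyboardEvent has these same members).
interface ShortcutEvent {
  key: string;
  ctrlKey: boolean;
  metaKey: boolean;
  preventDefault(): void;
}

// True for Ctrl+S / Ctrl+P, or Cmd+S / Cmd+P on macOS.
function isSaveOrPrintShortcut(e: ShortcutEvent): boolean {
  const key = e.key.toLowerCase();
  return (e.ctrlKey || e.metaKey) && (key === "s" || key === "p");
}

// Keydown handler: cancels the browser's default save/print action.
// Returns true when the event was blocked.
function blockSaveAndPrint(e: ShortcutEvent): boolean {
  if (isSaveOrPrintShortcut(e)) {
    e.preventDefault();
    return true;
  }
  return false;
}
```

In a page, this would typically be wired up with `document.addEventListener("keydown", ...)`. As the answer notes, a keyboard check alone does not cover print preview opened from the browser menu, which is why a print stylesheet is still needed alongside it.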
I would convert each PDF to an image file. You can find tools or write a script to do it; personally, I would display the PDFs in a browser first and then use a browser plug-in to take a screenshot of the entire page.
(you can automate this)
Then just display the converted PDFs.
**this is probably not the best solution :(**
I am currently building a little application based on WatiN that logs in to a website and then goes through a series of URLs to download PDF files.
The website uses a lot of JavaScript to load PDFs in embedded HTML.
The program works fine for now but is very slow, since WatiN doesn't handle downloads very efficiently (it uses Firefox's download system and slowly types the filename before saving).
I would like to know if there is a better framework for Web Scraping that could provide the same support for Ajax sites but better / faster way to download files.
I've looked all around the web and found Selenium, but it doesn't appear to be any more efficient than WatiN at downloading files.
Thanks in advance for your help.
You could write a Google Chrome extension using these two APIs as the main engine:
https://developer.chrome.com/extensions/webRequest.html
to know when and how to authenticate and when to start the download, and:
https://developer.chrome.com/extensions/downloads.html
to start the download of the file.
Whatever is missing from these two APIs for you to achieve your goal, you can compensate for with a custom content script (JavaScript that is injected into the page opened by the extension) and, for example, hook into jQuery's .ready event to initialize scraping.
This will definitely be faster than WatiN, since WatiN adds a layer of abstraction on top of talking to the browser directly.
If you have worked with IDM (Internet Download Manager), it has a feature named Grabber that searches a specific website, gets the files and folders of that website, and lets you download them using IDM.
I would like to do something similar in C#. I would like to download html web pages and extract links from those pages. I would also like to detect directories and attempt to search their contents - possibly parsing "Index Of" directory listing pages.
How would I go about doing this?
Use regex or the HtmlAgilityPack (http://htmlagilitypack.codeplex.com/) to parse the website and find links to files. You may need to check the extension of each file, i.e. only follow links that end in .zip|.exe|.msi|.rar|.png|.pdf|.gif|.jpg|.jpeg.
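To make the regex route concrete, here is a small sketch in TypeScript (chosen only for brevity; the same patterns should port to .NET's Regex class). The function names and the `ClassifiedLinks` shape are hypothetical. It assumes anchors use quoted `href` attributes, and it treats any link ending in `/` as a directory, which is a simplification of how "Index Of" pages are laid out. A real HTML parser such as HtmlAgilityPack will cope with malformed markup far better than this.

```typescript
// Extensions from the answer above, anchored to the end of the path.
const FILE_PATTERN = /\.(zip|exe|msi|rar|png|pdf|gif|jpg|jpeg)$/i;

interface ClassifiedLinks {
  files: string[];       // links whose path ends in a wanted extension
  directories: string[]; // links that look like sub-directories
}

// Naive extractor: pulls the value of quoted href attributes on <a> tags.
function extractHrefs(html: string): string[] {
  const hrefPattern = /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gi;
  const hrefs: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    const href = match[2];
    if (href) {
      hrefs.push(href);
    }
  }
  return hrefs;
}

// Drops "?query" and "#fragment" so the extension check sees only the path.
function stripQueryAndFragment(href: string): string {
  const cut = href.search(/[?#]/);
  return cut === -1 ? href : href.slice(0, cut);
}

// Splits a page's links into downloadable files and directories to crawl next.
function classifyLinks(html: string): ClassifiedLinks {
  const result: ClassifiedLinks = { files: [], directories: [] };
  extractHrefs(html).forEach((href) => {
    const path = stripQueryAndFragment(href);
    if (FILE_PATTERN.test(path)) {
      result.files.push(href);
    } else if (/\/$/.test(path) && path !== "../") {
      // Skip the parent-directory link so the crawl only goes downward.
      result.directories.push(href);
    }
  });
  return result;
}
```

The returned links are left exactly as they appear in the page, so relative ones still need to be resolved against the page URL before downloading or crawling further.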
I once wrote a "Web Spider" to do this and published the source code over at Code Project.
If you want to do it as an end user, I found that the free HTTrack Website Copier works pretty well.