I'm developing software in C# that has to get info from a website which the user opens in Chrome. The user has to input some data, and then the website returns a list of different items.
What I want is a way to access the source code of the page in order to get the info. I can't open the page myself, because it doesn't show anything unless the data has been entered, so I need to get it directly from Chrome.
How can I achieve this? A Chrome extension? Or can I access Chrome directly from my software?
Off the top of my head, I don't know any application that gets data directly from an open instance of Chrome. You'd have to write your own Chrome extension.
Alternatively, you can open the web browser from your application initially.
You can look into these libraries for doing so:
Watin (My personal favourite)
Selenium
Awesomium (You'd have to roll out your own UI, it's invisible)
Cef
Essential Objects Web Browser
EDIT: I didn't think about using QA tools as the actual browser hook as #TheAnathema mentions. That would probably work for your needs.
You're going to need to create it as a Chrome extension if you must depend on the user actually going to a specific web site (i.e. you can't make the requests yourself with either Selenium or standard web requests in Python).
A Chrome extension is required for security reasons: think of how bad it would be if any software could easily read the pages you browse. Banking, medical, email, etc. could all be accessed anonymously from any process if Google allowed any outside process to tap into the web page.
Even Chrome extensions have to ask for permission to be able to do what they want, but at least it is software the user knowingly installed and agreed to the permissions.
A quick search yielded this example of modifying a page's HTML with a Chrome extension: https://blog.lateral.io/2016/04/create-chrome-extension-modify-websites-html-css/
It sounds like you want to do web scraping. Here's a good tutorial to get you started: HTML Scraping.
And this answer has a good example of how to scrape data from a website where you need to submit a form to get access to the data.
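As a rough illustration of what the scraping step involves once you have the page's HTML, here is a minimal sketch in Python using only the standard library. The sample markup, the `items` class name, and the `ItemListParser` / `extract_items` names are assumptions made up for this example; a real page needs its own selectors.

```python
from html.parser import HTMLParser


class ItemListParser(HTMLParser):
    """Collects the text of each <li> inside <ul class="items"> (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.in_list = False
        self.in_item = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "ul" and ("class", "items") in attrs:
            self.in_list = True
        elif tag == "li" and self.in_list:
            self.in_item = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "ul":
            self.in_list = False
        elif tag == "li":
            self.in_item = False

    def handle_data(self, data):
        # Only keep text that appears inside a list item we care about
        if self.in_item:
            self.items[-1] += data.strip()


def extract_items(html):
    """Return the item texts found in the given HTML string."""
    parser = ItemListParser()
    parser.feed(html)
    return parser.items


# Made-up stand-in for the page the user sees after submitting the form
SAMPLE_HTML = """
<html><body>
  <ul class="items">
    <li>First item</li>
    <li>Second item</li>
  </ul>
</body></html>
"""

if __name__ == "__main__":
    print(extract_items(SAMPLE_HTML))
```

This only covers parsing; getting the HTML out of the user's own browser session is the part the answers above address.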
Related
I have a C# WPF application developed in VS 2015, and I want the browser to read some data from it. Just a short string. I can save it in a text file or in a variable, but it should be visible to the browser (using JS, I suppose). For instance, using file:/// doesn't work if the original page is hosted online, as in my case (different-origin conflict). This should work in Opera and Firefox, but looking at their extensions, it seems you can only develop them with front-end technologies, which are not enough in my case, since I use WPF to look into the Windows OS and then need to share the result with the browser.
I suspect it's possible, and no, it's not to write a malicious piece of code. For instance, I could read the details of the graphics card for diagnostic purposes.
Please help, many thanks.
Browsers run in a security sandbox which is intended to stop them reading or writing files to the file system.
You could write to the user's appdata. There are various javascript frameworks which persist data to there so they can provide offline or static data.
I don't think that is a good plan though.
I suggest your first candidate would be a cookie.
Quick google on how to do that, I find:
How to create cookie in c#.net windows application?
From a web page you can use the content of a cookie dynamically. So you could change what you see in the web page after it's up and running from some process in your wpf app and do a counter or whatever.
I've not used this with windows apps and a browser but I have with a web app and Silverlight. I'm afraid I don't have that code to hand though.
Is there a way, using either C# or a scripting language such as Python, to load a website in the user's default web browser and continue to interact with it via code (e.g. invoke existing JavaScript methods)? When using WinForms, you can host a WebBrowser control and invoke scripts from there, but only IE is supported. Is there a way of doing the same thing in the user's default browser (not necessarily using WinForms)?
Update: The website is stored on the user's machine, not served from a third party server. It is a help page which works dynamically with my C# program. When the user interacts with my C# program, I want to be able to execute the Javascript methods on the website.
You might want to look into Selenium. It can automate interaction with Firefox, IE, Chrome (with chromedriver) and Opera. It may not be suitable for your purposes, because it uses a fresh, stripped-down profile rather than the user's normal browser profile.
If you look at the HTTP request headers, you can determine the user agent making the request. Based on that information, you can write server-side logic to respond with a different page per detected user-agent string. You can then add any JavaScript you want as a static string to be executed by that user agent.
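A minimal sketch of that idea in Python, assuming naive substring matching on the User-Agent header. The function names and the JavaScript snippets (including `initHelpPage`) are hypothetical, invented for illustration; real user-agent detection has many more edge cases (Edge, for example, also reports "Chrome").

```python
def detect_browser(user_agent):
    """Guess the browser family from a User-Agent string (naive matching)."""
    ua = user_agent.lower()
    # Order matters: Opera's UA contains "chrome", and Chrome's contains "safari".
    if "opr/" in ua or "opera" in ua:
        return "opera"
    if "firefox" in ua:
        return "firefox"
    if "chrome" in ua:
        return "chrome"
    if "msie" in ua or "trident" in ua:
        return "ie"
    return "unknown"


# Hypothetical per-browser snippets the server would embed in the page
SCRIPTS = {
    "firefox": "initHelpPage('firefox');",
    "chrome": "initHelpPage('chrome');",
    "opera": "initHelpPage('opera');",
    "ie": "initHelpPage('ie');",
}


def script_for(user_agent):
    """Return the JavaScript string to send for this user agent."""
    return SCRIPTS.get(detect_browser(user_agent), "initHelpPage('generic');")


if __name__ == "__main__":
    ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
    print(script_for(ua))
```

Note that this only selects what the server sends when the page is requested; it does not let the server push new script calls into a page that is already open.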
I want to make an application in which you specify the names of websites along with your username and password, and the application automatically logs in to all your accounts on those websites. I have done this as a Windows Forms application using a WebBrowser control, but I want my application to open all these websites in Chrome and log in there. Please help.
I'd check out the Chrome API. Failing that, look into getting a handle to the window through Windows API calls.
But why not just a Chrome extension?!? Miles simpler.
Have a look at Selenium WebDriver
Doing a quick Google search for "chrome C# api" turned up a number of results I think you may find relevant.
I thought the following were particularly promising, if you're willing to accept a few concessions:
Automating Chrome Browser from C#
ChromeDevTools; a C# Library to interact with Chrome's Developer Tools
Chrome Debugging API
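For orientation, Chrome's debugging API is reachable over HTTP and WebSocket when Chrome is started with the `--remote-debugging-port` flag. Below is a minimal sketch in Python (standard library only) that lists the open tabs. It assumes Chrome was launched with `--remote-debugging-port=9222` on the same machine; the port number, the helper names, and the sample data are assumptions for illustration.

```python
import json
from urllib.request import urlopen

# Assumes Chrome was started with --remote-debugging-port=9222
DEBUG_URL = "http://localhost:9222/json"


def fetch_targets(url=DEBUG_URL):
    """Ask a locally running Chrome for its list of debuggable targets."""
    with urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def page_tabs(targets):
    """Keep only normal tabs, as (title, url, WebSocket endpoint) tuples."""
    return [
        (t.get("title", ""), t.get("url", ""), t.get("webSocketDebuggerUrl"))
        for t in targets
        if t.get("type") == "page"
    ]


# Made-up sample shaped like the JSON Chrome returns, for offline use
SAMPLE_TARGETS = [
    {"id": "A1", "type": "page", "title": "Example", "url": "http://example.com/",
     "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/A1"},
    {"id": "B2", "type": "background_page", "title": "Some extension",
     "url": "chrome-extension://abc/bg.html"},
]

if __name__ == "__main__":
    # Swap SAMPLE_TARGETS for fetch_targets() when Chrome is actually running
    for title, url, ws in page_tabs(SAMPLE_TARGETS):
        print(title, url, ws)
```

The WebSocket endpoint is what a client library then connects to in order to send commands to that tab. The main concession is that the user's Chrome has to be started with the debugging flag.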
I'm trying to automate the download of a file from a website. Normally, to download the file, I log in with a username and password, navigate to a particular screen, then click a button.
I've been trying to watch the sequence of POSTs using Chrome's developer mode and then replicate all the steps using the .NET WebClient class, but with no success. I've derived from the WebClient class and added cookie handling, which seems to be working. I go to the login page and post using WebClient.UploadValues. About half the time it seems to work. The next step appears to make another POST to a reporting URL. Once again I use WebClient.UploadValues, but the response from the server is a page showing an internal error.
I have a couple of questions.
1) Are there better tools than hand coding C# code to replicate a bunch of web browser interactions? I really only care about being able to download the file at a particular time each day onto a Windows box.
2) The WebClient does not seem to be the best class to use for this. Perhaps it's a bit too simplistic. I tried using HttpWebRequest, but it has no facilities for encoding POST requests. Any other recommendations?
3) Although Chrome's developer plugin appears to show all interaction, I find it a bit cumbersome to use. I'd be interested in seeing all of the raw communication (unencrypted though, the site is only accessed via https), so I can see if I'm really replicating all of the steps.
I can even post the exact code I'm using. Specifically, the site I'm pulling data from is the Standard and Poor's website. They have the ability to create custom reports for downloading historical data, which I need for reporting, not republishing.
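For readers following the steps described above (log in, keep the session cookie, then POST a form to the report URL), here is a minimal sketch of the two building blocks, cookie persistence and form encoding, in Python's standard library. It is shown in Python only for brevity. The URLs and field names are placeholders, not the real site's, and the helper names are made up for this example.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener


def make_session():
    """Build an opener that stores cookies and resends them on later requests."""
    return build_opener(HTTPCookieProcessor(CookieJar()))


def form_post(url, fields):
    """Create a POST request whose body is the URL-encoded form fields."""
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical usage; these URLs and field names are placeholders:
#   session = make_session()
#   session.open(form_post("https://example.com/login",
#                          {"username": "me", "password": "secret"}))
#   report = session.open(form_post("https://example.com/report",
#                                   {"range": "1y"})).read()

if __name__ == "__main__":
    req = form_post("https://example.com/login",
                    {"username": "me", "password": "p&ss word"})
    print(req.get_method(), req.data)
```

When a replayed POST returns an internal error, a common cause is a hidden form field or header that the browser sends and the script does not, so it is worth comparing the encoded body against what the browser actually posts.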
Using IE to download the file would be much easier than writing C# / Perl / Java code to replicate the HTTP requests.
The reason is that even a slight change in the site's JavaScript code can break the flow.
With IE, you can automate it using COM. The following VBA example opens IE and performs a Google search:
Sub Search_Google()
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Navigate "http://www.google.com" 'load web page google.com
    While IE.Busy
        DoEvents 'wait until IE is done loading page.
    Wend
    IE.Document.all("q").Value = "what you want to put in text box"
    IE.Document.all("btnG").Click
    'clicks the button named "btnG" which is google's "google search" button
    While IE.Busy
        DoEvents 'wait until IE is done loading page.
    Wend
End Sub
3) Although Chrome's developer plugin appears to show all interaction, I find it a bit cumbersome to use. I'd be interested in seeing all of the raw communication (unencrypted though, the site is only accessed via https), so I can see if I'm really replicating all of the steps.
For this you can use Fiddler to view all the interaction going on and the raw data going back and forth. To make it work with HTTPS, you will need to install Fiddler's certificate to enable decryption of the traffic.
This website has a custom google search box:
http://ezinearticles.com/
The search results are generated by a piece of JS code. How would I access these results using wget and/or C#'s WebClient?
It looks like the searches on that page are normal Google site searches. Try wget with the following URL, where 'asdf' is your search term (the URL is quoted so the shell doesn't interpret any special characters):
wget "http://www.google.com/search?q=site:ezinearticles.com+asdf"
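If the search term comes from user input, it needs to be URL-encoded before it goes into that query string. A minimal sketch in Python; the function name `site_search_url` is made up for this example, and the URL pattern is the one from the wget line above.

```python
from urllib.parse import urlencode


def site_search_url(site, terms):
    """Build a Google site-search URL with the query properly encoded."""
    query = "site:{} {}".format(site, terms)
    # urlencode escapes reserved characters and turns spaces into '+'
    return "http://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(site_search_url("ezinearticles.com", "asdf"))
```

The colon after `site` comes out as `%3A`, which is equivalent to the literal colon in the hand-written URL.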
You need to do what your web browser does: render the page. Alternatively, maybe you can extract the JS call to the web service providing the results, then just execute that request and parse the output directly.
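One possible first step toward finding that call is to list the external scripts a page loads, so you know which files to read for the request URL. A minimal sketch in Python's standard library; the sample HTML, the base URL, and the names `ScriptSrcParser` / `script_sources` are assumptions for illustration, not taken from the site in question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ScriptSrcParser(HTMLParser):
    """Collects absolute URLs of all <script src="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:  # inline scripts have no src and are skipped
                self.sources.append(urljoin(self.base_url, src))


def script_sources(html, base_url):
    """Return the resolved script URLs referenced by the given HTML."""
    parser = ScriptSrcParser(base_url)
    parser.feed(html)
    return parser.sources


# Made-up page source for demonstration
SAMPLE_HTML = """
<html><head>
  <script src="/js/search.js"></script>
  <script src="//cdn.example.com/lib.js"></script>
  <script>var inline = true;</script>
</head></html>
"""

if __name__ == "__main__":
    for url in script_sources(SAMPLE_HTML, "http://example.com/page/"):
        print(url)
```

Reading the listed scripts, or watching the browser's network tab, should then reveal the request the page makes for its results.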
You need to access it with a programmable browser supporting JavaScript.
The HtmlUnit library for Java does this, and runs fine headless.
You can automate a real web browser, e.g. with WatiN on Windows, and access the page's content. This requires a GUI desktop though, because a real browser window is opened.