This website keeps updating live information about bus timings in Helsinki.
I want to parse the live information from the website and display it on my WP7 phone. The user enters a bus stop number, and the WP7 app should show the buses/trams currently at that stop.
Is there any way I could obtain the real time information from the website?
If you look at the source of the website (http://www.omatlahdot.fi/omatlahdot/web?command=fullscreen&stop=1020455) -- in IE right-click on the page and select View Source -- you'll see that there's really very little in the actual source file, in particular none of the data is there. All of the hard work is coming from the referenced javascript file scripts/fullscreen_header.js (full path is http://www.omatlahdot.fi/omatlahdot/scripts/fullscreen_header.js). You want to download that .js file and study how it retrieves data with AJAX calls. Start with the reloadPage function.
You can make these same calls (e.g., using WebClient) to retrieve the data into your application. If you want to extract the data from the returned HTML, I'd consider parsing it simply as a string since I am assuming that it would have a very regular structure and dragging in a general-purpose HTML parser would probably be overkill.
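If the response really is that regular, plain string searching may be enough. Here is a minimal sketch, assuming the values you want sit between a fixed pair of markers; the `<td>` markers in the usage note are an assumption, not the site's actual markup. `DepartureParser` and `ExtractBetween` are made-up names. On Windows Phone, `WebClient` only offers the asynchronous `DownloadStringAsync`, so you would call this from the `DownloadStringCompleted` handler with `e.Result`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: pulls out every piece of text that sits between
// a start marker and an end marker in the downloaded HTML.
public static class DepartureParser
{
    public static List<string> ExtractBetween(string html, string startTag, string endTag)
    {
        var results = new List<string>();
        int pos = 0;
        while (true)
        {
            // Find the next start marker at or after the current position.
            int start = html.IndexOf(startTag, pos, StringComparison.Ordinal);
            if (start < 0) break;
            start += startTag.Length;

            // Find the end marker that closes it.
            int end = html.IndexOf(endTag, start, StringComparison.Ordinal);
            if (end < 0) break;

            results.Add(html.Substring(start, end - start).Trim());
            pos = end + endTag.Length;
        }
        return results;
    }
}
```

For a fragment such as `<td>55</td><td>14:05</td>`, calling `DepartureParser.ExtractBetween(html, "<td>", "</td>")` returns the two cell texts in order. You would replace the markers with whatever actually surrounds the data in the real response.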
Alternatively, you might find out whether omatlahdot.fi provides the data as JSON or XML feeds, so you don't have to "screen-scrape" the HTML. I don't read Finnish, so I can't help you with that. Look around on their website (maybe a section called "dev" or "api") or send them an email inquiry.
Please let us know how it works out!
I am currently developing a word-completion application in C#, and after getting the UI up and running, keyboard hooks set, and other things of that nature, I came to the realization that I need a word list. The only issue is, I can't seem to find one with the appropriate information. I also don't want to spend an entire week formatting and gathering a word list by hand.
The information I want is something like "TheWord, The definition, verb/etc."
So, it hit me. Why not download a basic word list with nothing but words (already did this; there are about 109,523 words), write a program that iterates through every word, connects to the internet, retrieves the data (definition etc.) from some arbitrary site, and creates XML data from that information? It could be 100% automated, and I would only have to wait for maybe an hour, depending on my internet connection speed.
This however, brought me to a few questions.
How should I connect to a site to look up these words? (This is my actual question.)
How would I read this information from the website?
Would I piss off my ISP or the website for that matter?
Is this a really bad idea? Lol.
How do you guys think I should go about this?
EDIT
Someone noticed that Dictionary.com uses the word as a suffix in the URL. This will make it easy to iterate through the word file. I also see that the webpage is served as XHTML (or maybe just HTML). Here is the source for the word "Cat": http://pastebin.com/hjZj6AC1
For what you marked as your actual question - you just need to download the data from the website and find what you need.
A great tool for this is CsQuery, which allows you to use jQuery selectors.
You could do something like this:
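// Download the page and parse it into a queryable DOM (the URL here is only a placeholder).
var dom = CQ.CreateFromUrl("http://www.jquery.com");
// ".definitionDiv" is a placeholder selector; use whatever element holds the definition on the real page.
string definition = dom.Select(".definitionDiv").Text();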
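The rest of the pipeline from the question (word in, lookup URL out, results written as XML) could be sketched roughly as follows. This assumes the site takes the word as the last URL segment, as the EDIT suggests. The base URL, the `WordEntry` and `WordListBuilder` types, and the XML element names are all made up for illustration. The download and selector step is left out; it would sit between the two methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical record for one looked-up word.
public sealed class WordEntry
{
    public string Word { get; set; }
    public string Definition { get; set; }
    public string PartOfSpeech { get; set; }
}

public static class WordListBuilder
{
    // Assumption: the site accepts the word as the final path segment.
    public static string BuildLookupUrl(string baseUrl, string word)
    {
        string cleaned = word.Trim().ToLowerInvariant();
        return baseUrl.TrimEnd('/') + "/" + Uri.EscapeDataString(cleaned);
    }

    // Turns the collected entries into one XML tree (element names are illustrative).
    public static XElement ToXml(IEnumerable<WordEntry> entries)
    {
        return new XElement("words",
            entries.Select(e => new XElement("word",
                new XAttribute("text", e.Word),
                new XAttribute("type", e.PartOfSpeech),
                new XElement("definition", e.Definition))));
    }
}
```

Calling `ToXml(entries).Save("words.xml")` would write the file. With roughly 109,523 requests to make, it is probably worth saving progress as you go so that a failed run does not start over.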
I have a website, http://www.op.nysed.gov/opsearches.htm, for example where the user selects a Profession and enters a Licensee Name and clicks on the Search button which takes them to a new page to display the result.
For example, the following:
Which displays the following result:
Clicking on the number next to any of the names brings up the details, for example like this:
I looked at scrapy, arachnode and other web crawlers on the web for this purpose but wasn't too convinced that is the right technology for it.
I was told that we have to crawl those search results from the page. Is it something that can be done?
Can a crawler perform the search the way a user does?
Web crawling programs will get you a local copy of the target site's structure; I'm not sure that is what you want.
If you want to extract that data and store it in a way you can query later, then you must create your own app.
As a starting point, the idea is this:
Navigate manually through the site and analyze the POSTs made between pages (for example, what is sent to the server when "Architect" is selected and the button is pressed, or where a link on a license number points) to find the real queries, which variables are sent, and their format. Then analyze the pages' HTML structure to find patterns that can be matched with a regular expression engine.
That part will be hard: you must analyze outgoing and incoming HTTP traffic (the Live HTTP Headers add-on for Firefox can help you a lot) to simulate it in your program, and construct reliable regular expression patterns (The Regex Coach comes in very handy for testing them).
Once you know how to navigate through the page structure and have patterns to extract the data, the rest is relatively easy: create a client using WebClient, navigate through the structure, extract the necessary data, and store it in a DB.
As you can see, this is a very broad answer, but that is because your question is also really broad.
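To make the two halves of that idea a bit more concrete, here is a rough sketch: one method replays a form POST with `WebClient.UploadValues`, the other pulls name/link pairs out of the returned HTML with a regular expression. The form field names (`profession`, `name`) and the row markup the pattern expects are assumptions for illustration only. The real ones have to come from the traffic and page source you captured. `LicenseScraper` is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

public static class LicenseScraper
{
    // Replays the search form's POST. The field names below are placeholders;
    // substitute the ones seen in the captured request.
    public static string PostSearch(string url, string profession, string licenseeName)
    {
        var form = new NameValueCollection
        {
            { "profession", profession },   // hypothetical field name
            { "name", licenseeName }        // hypothetical field name
        };
        using (var client = new WebClient())
        {
            byte[] response = client.UploadValues(url, "POST", form);
            return Encoding.UTF8.GetString(response);
        }
    }

    // Assumed row shape: <a href="detail?id=123">SMITH JOHN</a>
    private static readonly Regex RowPattern = new Regex(
        @"<a\s+href=""(?<link>[^""]+)"">(?<name>[^<]+)</a>",
        RegexOptions.IgnoreCase);

    // Returns (name, link) pairs for every row that matches the pattern.
    public static List<KeyValuePair<string, string>> ExtractResults(string html)
    {
        var results = new List<KeyValuePair<string, string>>();
        foreach (Match m in RowPattern.Matches(html))
        {
            results.Add(new KeyValuePair<string, string>(
                m.Groups["name"].Value.Trim(),
                m.Groups["link"].Value));
        }
        return results;
    }
}
```

Each extracted link can then be fetched the same way to reach the detail page, and the extracted values written to the database.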
I need to develop a chat system in ASP.NET. I have gone through lots of SO questions on similar topics, but didn't find any of them satisfactory. Is it possible to create it from scratch, or do I need to go for some APIs? My requirement is limited to my site's users only; you could say it is intranet based.
Please help me.
Text chat is something you can do with a simple table: everyone writes to it, everyone reads from it from time to time, and you show its contents on the page.
Here is an example http://www.codeproject.com/KB/ajax/ChatRoom.aspx
Video/audio chat is a complicated one. You can start with this example:
http://www.codeproject.com/KB/IP/videochat.aspx
and you can read more here: how to work with videos in ASP.NET?
Text chat is relatively simple. It involves three tier architecture. 1) Javascript timer. 2) WCF Ajax Enabled web service or Generic Http Handler, 3) Data Storage (Preferably SQL).
1) On the page - sending: input text box + button (used to send). The button's click event handler, or the text box's key down (for the Enter key) and blur events, would invoke a POST (via jQuery, plain ol' JavaScript, or whatever JavaScript library you use) to the WCF service/generic handler, sending the contents of the text box along with the chatroom name, the sender, and the recipient.
2) On the server: WCF Service/Generic Http handler receives the post and stores it in DB.
3) On the page - receiving: using JQuery for example, you would create a javascript timer on document ready (when the page loaded). On every timer's tick event you want to create a GET (or post) via your handy JavaScript framework (or Plain Javascript) to your WCF service/ Generic Handler requesting the latest records stored in the DB for that chatroom. Append the result received (assuming xml/html/json) to the Div or whatever element is used to display your "chats".
This is a very simplified chat in jquery/asp.net.
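The server side of steps 2 and 3 could look roughly like the sketch below. An in-memory list stands in for the SQL table, which is an assumption made to keep the example short; `ChatStore`, `Post`, and `GetSince` are made-up names. The handler would call `Post` when a message arrives and `GetSince` on each timer tick, with the page passing back the highest message id it has already displayed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class ChatMessage
{
    public long Id { get; set; }
    public string Room { get; set; }
    public string Sender { get; set; }
    public string Text { get; set; }
}

// In-memory stand-in for the message table; a real site would use the DB.
public sealed class ChatStore
{
    private readonly object _gate = new object();
    private readonly List<ChatMessage> _messages = new List<ChatMessage>();
    private long _nextId = 1;

    // Step 2: store an incoming message and return its id.
    public long Post(string room, string sender, string text)
    {
        lock (_gate)
        {
            var message = new ChatMessage
            {
                Id = _nextId++,
                Room = room,
                Sender = sender,
                Text = text
            };
            _messages.Add(message);
            return message.Id;
        }
    }

    // Step 3: return only the messages in this room newer than the last one the page has seen.
    public List<ChatMessage> GetSince(string room, long lastSeenId)
    {
        lock (_gate)
        {
            return _messages
                .Where(m => m.Room == room && m.Id > lastSeenId)
                .ToList();
        }
    }
}
```

Returning only messages with an id above `lastSeenId` keeps each poll small, so the page can simply append whatever comes back.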
As far as audio-video is concerned, you have a few problems. 1) The browser itself has no means of interacting with the microphone, speakers, and video camera unless it uses a plugin. 2) Browsers typically have no way of knowing how to decode a video stream (though some of the smarter ones have it built in... Chrome, Firefox). 3) JavaScript has no way of interacting with the necessary hardware, as it lives inside the browser.
All that said, you can use a plugin such as Flash or Silverlight, (that has built in access to the necessary hardware), or whatever. You will also have a conceptual dilemma with those as you have to simultaneously deal with 2 streams - one for coming in, another going out and displaying both. However it is possible.
I have a web-based application. The content for the Home page is currently hard-coded in the page's HTML markup. To change the content at any time in the future, the HTML code itself has to be changed. :(
Is there a way to pick up the content from some external place and have it reflected on the website? That way, any change that is required can be made at the external location without touching the application's code.
Please advise if there is any solution for it.
Thanks.
You can
Use a database
Include external files using Server Side Includes
Read external files and write out their contents, as an alternative method
Sounds like you're looking for a Content Management System (CMS), which will allow your content editors access to modify only specific blocks of a page that you specify.
There are a ton out there to do what you want, so you don't have to start from scratch. Just Google 'CMS'.
Although I haven't used it myself, DotNetNuke is a popular one these days and has a free version.
I am studying computer science and we have to do a programming project which must be strongly related to XML and using XSD and XSLT or XQuery/XPath at least. Because I like C# I'd like to do it in this language, but I could use another if anyone has another idea.
My idea now is to code some kind of appointment book. I imagine that all appointments for the week are shown as HTML, and for each day you can enter appointment notes in that day's textarea.
Now my question: How can I take over the data entered in the textboxes? The application is an offline one so I have no web server receiving the GET request containing the entered data. Is it possible to read the current HTML DOM from memory with all its entered values and then transform it to an XML format for persistent storage from which it could be read in later?
Or is this idea totally stupid?
How else can I put all those XML technologies in one app?
If you want to show UI Generation from XSLT, the web page approach is easiest.
More impressive is generation of XAML from XSLT -> windows app (WPF).
Download Visual Web Developer (FREE)
or
Visual C#
Why does it have to be Web based?
You can use those technologies in a Windows Application.
You can use JavaScript. Convert the data into XML or JSON and output it to another element, like div, or textarea.
What you need to do is set a function that does all this and gets executed on submit.
Check this example. Also to speed things up, you can use a library like jQuery.
Being at home and offline does not mean you don't have a Web server. There are zillions of ready-made packages which offer an embedded HTTP server so that the same application can run online and offline without any modification. Very convenient.
(I know you use C# but, just to show an example, I use wsgiref.simple_server for that purpose.)
Why not make a Windows app that allows the user to update the appointments, which are stored in an XML file? Then use a stylesheet to display the appointments in a web browser.
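The "XML file plus stylesheet" part can be done from C# with `XslCompiledTransform`. Below is a minimal sketch; the `appointments`/`appointment` element names, the `day` attribute, and the `AppointmentRenderer` class are assumptions made up for illustration. The sample stylesheet is just one way to turn such a file into an HTML list.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class AppointmentRenderer
{
    // Illustrative stylesheet: renders each appointment as a list item.
    public const string SampleXslt =
        @"<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
            <xsl:output method='html' indent='no'/>
            <xsl:template match='/appointments'>
              <ul>
                <xsl:for-each select='appointment'>
                  <li><xsl:value-of select='@day'/>: <xsl:value-of select='.'/></li>
                </xsl:for-each>
              </ul>
            </xsl:template>
          </xsl:stylesheet>";

    // Applies an XSLT stylesheet to the appointments XML and returns the HTML.
    public static string ToHtml(string appointmentsXml, string stylesheetXslt)
    {
        var transform = new XslCompiledTransform();
        using (var xsltReader = XmlReader.Create(new StringReader(stylesheetXslt)))
        {
            transform.Load(xsltReader);
        }

        using (var xmlReader = XmlReader.Create(new StringReader(appointmentsXml)))
        using (var output = new StringWriter())
        {
            transform.Transform(xmlReader, null, output);
            return output.ToString();
        }
    }
}
```

Passing `<appointments><appointment day='Monday'>Dentist</appointment></appointments>` together with `SampleXslt` produces a list containing the item `Monday: Dentist`. The resulting string can be written to a file and opened in a browser or shown in a WebBrowser control, and an XSD could be used to validate the XML file before transforming it.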