How to show RSS as a pop-up? - C#

On my website users can post comments on a document. Now I want to send an RSS feed to the webmasters when a comment is posted, and I want the webmaster to be notified by a small pop-up in the right corner of the page. So this is the desired flow:
User adds a comment
System checks whether a webmaster is logged in
If a webmaster is logged in, show a pop-up in the right corner with the title of the comment in it
How can I accomplish this?

Set up a JavaScript timer to call a web service periodically (every 5 seconds?) if the user is a webmaster. This web service can determine whether a new comment has been added since the last time it checked; it returns nothing if there is no new comment, or some information about the comment if there is one.
If the web service returns a comment, put that information into a div tag that you have created on your page and make it visible. If you are sure the webmaster is using a modern browser, you can use position:fixed to put this div in the upper right corner; if not, you will have to use some JavaScript to position it.
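A minimal sketch of the server side of this, written as an ASP.NET generic handler (.ashx); CommentRepository and its GetNewestCommentSince method are hypothetical stand-ins for however your comments are stored. The JavaScript timer on the webmaster's page would GET this endpoint every few seconds and show any returned title in the div.

    using System;
    using System.Web;

    // Polling endpoint: returns the newest comment title since the given
    // timestamp, or an empty body if there is nothing new.
    public class LatestCommentHandler : IHttpHandler
    {
        public void ProcessRequest(HttpContext context)
        {
            // Only a logged-in webmaster should receive notifications.
            if (!context.User.IsInRole("webmaster"))
            {
                context.Response.StatusCode = 403;
                return;
            }

            DateTime since;
            DateTime.TryParse(context.Request.QueryString["since"], out since);

            var comment = CommentRepository.GetNewestCommentSince(since); // hypothetical data access
            context.Response.ContentType = "text/plain";
            // Empty body means "nothing new"; otherwise return the comment title.
            if (comment != null)
                context.Response.Write(comment.Title);
        }

        public bool IsReusable { get { return true; } }
    }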

Unless you're using a comet-style service to push notifications to the webmaster's browser, you're going to need a page that polls for new notifications at a predefined interval. You can then make an AJAX call to the service and render the response on a page that only the webmaster has access to.
If you're interested in Comet (services that can push data to the connected client), you can get a start at Wikipedia:
Comet (programming)


Using .Net how can I programmatically navigate to a webpage, interact with it via code, then get particular values from a newly generated page

I have a scenario where I would like to programmatically automate the following process:
Currently, I have to manually
Navigate to a webpage
Enter some text (an email) in a certain field on the webpage
Press the 'Search' button, which generates a new page containing a table with the results on it.
Manually scroll through the generated results table and extract 4 pieces of information.
Is there a way for me to do this from a desktop WPF app using C#?
I am aware there is a WebClient type that can download a string, presumably of the content of the webpage, but I don't see how that would help me.
My knowledge of web-based stuff is pretty non-existent, so I am quite lost as to how to go about this, or even whether it is possible.
I think a web driver is what you're looking for. I would suggest using Selenium: you can navigate to sites and send input or clicks to specific elements on them.
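For example, with the Selenium WebDriver NuGet packages (Selenium.WebDriver plus a driver such as Selenium.WebDriver.ChromeDriver), the flow you describe might look like this; the URL and element names are assumptions about the target page, so inspect the real page to find them:

    using System;
    using OpenQA.Selenium;
    using OpenQA.Selenium.Chrome;

    class Program
    {
        static void Main()
        {
            using (IWebDriver driver = new ChromeDriver())
            {
                driver.Navigate().GoToUrl("https://example.com/search"); // hypothetical URL

                // Fill in the email field and press the Search button.
                driver.FindElement(By.Name("email")).SendKeys("someone@example.com");
                driver.FindElement(By.Id("search")).Click();

                // Pull text out of the results table on the generated page.
                foreach (IWebElement cell in driver.FindElements(By.CssSelector("table.results td")))
                    Console.WriteLine(cell.Text);
            }
        }
    }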
Well, I'll write the algorithm for you, but you also need to do some homework.
Use WebClient to get the HTML page with the form you want to auto-fill and submit.
Use a regex to extract the action attribute of the form you want to auto-submit. That gets you the URL you want to submit your next request to.
Since you know the fields in that form, create a class corresponding to those fields; let's call the class AutoClass.
Create a new instance of your AutoClass and assign the values you want to auto-fill.
Use WebClient to send your new request to the URL you extracted from the form, attaching the field values from your object (for example as URL-encoded form data).
Send the request, wait for the response, then act on it.
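A rough sketch of the download/extract/post steps with WebClient; the URL, the field name, and the assumption that the form tag is simple enough for a regex (a real HTML parser is safer) are all mine:

    using System;
    using System.Collections.Specialized;
    using System.Net;
    using System.Text;
    using System.Text.RegularExpressions;

    class FormSubmitter
    {
        static void Main()
        {
            using (var client = new WebClient())
            {
                // Download the page that contains the form.
                string html = client.DownloadString("https://example.com/search"); // hypothetical URL

                // Extract the form's action attribute (assumes a simple, well-formed form tag).
                Match m = Regex.Match(html, "<form[^>]+action=\"([^\"]+)\"", RegexOptions.IgnoreCase);
                string action = new Uri(new Uri("https://example.com/"), m.Groups[1].Value).ToString();

                // Post the fields you want to auto-fill (field name is hypothetical).
                var fields = new NameValueCollection { { "email", "someone@example.com" } };
                byte[] response = client.UploadValues(action, "POST", fields);
                Console.WriteLine(Encoding.UTF8.GetString(response));
            }
        }
    }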
Either use a web driver like Puppeteer (Selenium is kinda dead) or make raw HTTP(S) requests (if you don't get stopped by bot checks). I suspect you're looking for the latter, because there is no reason to use a web driver in this case when a lighter method like HTTP requests will do.
You can use RestSharp or the built-in libraries if you want. Here is a popular thread on the ways to send requests with the libraries built into C#.
To figure out what you need to send, use a tool like Fiddler or Chrome DevTools (specifically the Network tab) to see what the browser sends when you achieve your goal manually.
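For the built-in route, a minimal sketch with HttpClient; the URL and parameter name are assumptions, and Fiddler/DevTools tells you the real ones:

    using System;
    using System.Collections.Generic;
    using System.Net.Http;
    using System.Threading.Tasks;

    class Program
    {
        static async Task Main()
        {
            using (var http = new HttpClient())
            {
                // Replay the same POST the browser sends (check the Network tab).
                var form = new FormUrlEncodedContent(new Dictionary<string, string>
                {
                    ["q"] = "my search value" // hypothetical parameter name
                });
                HttpResponseMessage resp = await http.PostAsync("https://example.com/search", form);
                string body = await resp.Content.ReadAsStringAsync();
                Console.WriteLine(body);
            }
        }
    }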

ASP.NET site search string in other website

I have a website with product part numbers in it. I want to know if it is possible to search for a part number in another website's search box; that is, to insert the string into the other site's search box and execute the search entirely from code-behind?
Assuming you're not the owner of the other website and you don't know how it works:
When you click the "Search" button on the other website, it triggers a request against the web server: something like a GET or POST, or perhaps the same thing made through an Ajax request.
It depends on the targeted website, but the best thing to do is probably to capture the details of that GET or POST (very easy in the Network tab of Firefox or Chrome) and reproduce it yourself from your own application's code using an HttpWebRequest (for example).
This approach is easier and safer than trying to mimic the "fill the textbox, click the button, and retrieve the response" interaction, but that is possible too (with web testing frameworks, for example: WatiN, Selenium...).
Concretely:
Go to the website you want to dig into
Open the "Network" tab in Chrome, Firefox, or whatever browser you use, and enable it
Fill in the search textbox
Click the search button
Look at the request and response that appear in the browser: you will see how to send a "search" query and the format of the answer that comes back.
This is general advice rather than a definitive answer, because you haven't provided enough detail for more, but I hope it is helpful!
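For example, once the Network tab shows you that the search is a plain POST, you can replay it with an HttpWebRequest; the URL and parameter name below are placeholders for whatever you actually observe:

    using System;
    using System.IO;
    using System.Net;
    using System.Text;

    class PartNumberSearch
    {
        static void Main()
        {
            // Endpoint and field name are hypothetical: copy the real ones
            // from the request you observed in the Network tab.
            var request = (HttpWebRequest)WebRequest.Create("https://other-site.example/search");
            request.Method = "POST";
            request.ContentType = "application/x-www-form-urlencoded";

            byte[] body = Encoding.UTF8.GetBytes("q=" + Uri.EscapeDataString("PART-1234"));
            request.ContentLength = body.Length;
            using (Stream s = request.GetRequestStream())
                s.Write(body, 0, body.Length);

            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
                Console.WriteLine(reader.ReadToEnd()); // the search results page
        }
    }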

How to detect if there is a firewall or other cause preventing me from connecting to a specific server?

My web application has a component (a widget) which makes a connection to another server (out of my control) to read from an XML file.
Sometimes the admin of the server I connect to puts up a firewall or changes some configuration, and then when my application tries to connect to that server, it takes a long time before the widget comes back empty.
The problem is that the time spent trying to connect to that server is part of the time it takes to load the page, and something feels wrong when requesting the page takes that long!
How can I determine whether I can connect to that server to read the data, or whether there is some issue preventing me from doing so?
I don't quite understand what your widget is composed of, and thus why it blocks the loading of the page, but two ways to decouple the widget from the page load are:
Put the widget in an iframe element.
First insert a placeholder for the widget (e.g. a div element with a "Loading..." text). Then, after the page has loaded, use JavaScript to replace the placeholder with the actual HTML.
Are you allowed to use this XML feed on your site? If not, they may be deliberately blocking your access to it.
However, I would cache the XML file locally, and let a cron job regularly pull the newest version from the other server.
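One way to combine both ideas: fetch with a short timeout so a firewalled server fails fast, and fall back to the cached copy when it does. The URL, cache path, and 3-second limit below are assumptions:

    using System;
    using System.IO;
    using System.Net;

    class WidgetFeed
    {
        // Returns the remote XML if the server answers quickly,
        // otherwise the last cached copy (or null if none exists yet).
        public static string GetXml()
        {
            const string remoteUrl = "https://other-server.example/data.xml"; // hypothetical
            const string cachePath = "widget-cache.xml";

            try
            {
                var request = (HttpWebRequest)WebRequest.Create(remoteUrl);
                request.Timeout = 3000; // fail fast instead of stalling the page load
                using (var response = request.GetResponse())
                using (var reader = new StreamReader(response.GetResponseStream()))
                {
                    string xml = reader.ReadToEnd();
                    File.WriteAllText(cachePath, xml); // refresh the local cache
                    return xml;
                }
            }
            catch (WebException)
            {
                // Firewall, timeout, DNS failure, etc.: serve the cached copy.
                return File.Exists(cachePath) ? File.ReadAllText(cachePath) : null;
            }
        }
    }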

Getting data from a webpage

I have an idea for an App that would really help me out in work but I'm not sure if it's possible.
I want to run a C# desktop application that will ask for a value. When a value is supplied, the application will open a browser, go to a webpage, and enter the value into a form on an online website. The form is then submitted and a new page is loaded that contains a table of results. I then want to extract the table of results from the page source and write code to parse the result values.
It is not important that the user sees this happen in an actual browser. In other words, if there's a way to do it by reading HTTP requests, then that's great.
The biggest problem I have is getting the values into the form and then retrieving the page source after the form is submitted and the next page loads.
Any help really appreciated.
Thanks
Provided that you're only using this in a legal context:
Usually, web forms are sent via a POST request to the web server, specifically to some script that handles it. You can look at the HTML code of the form's page and find out the destination of the form (the form's action attribute).
You can then use an HttpWebRequest in C# to "pretend to be the form", sending a POST request with all the required parameters in the request body.
As a result you will get the source code of the destination page, as it would be sent to the browser, which you can parse.
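A sketch of both halves (submitting the form and parsing the results table); the URL, the field name, and the naive regex-based cell extraction are assumptions, and a real HTML parser such as HtmlAgilityPack would be more robust:

    using System;
    using System.Net;
    using System.Text.RegularExpressions;

    class ScrapeResults
    {
        static void Main()
        {
            using (var client = new WebClient())
            {
                client.Headers[HttpRequestHeader.ContentType] = "application/x-www-form-urlencoded";
                // POST the value to the form's action URL (both hypothetical here).
                string html = client.UploadString(
                    "https://example.com/results",
                    "value=" + Uri.EscapeDataString("12345"));

                // Naive extraction of the result table's cells.
                foreach (Match m in Regex.Matches(html, @"<td[^>]*>(.*?)</td>", RegexOptions.Singleline))
                    Console.WriteLine(m.Groups[1].Value.Trim());
            }
        }
    }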
This is definitely possible and you don't need to use an actual web browser for this. You can simply use a System.Net.WebClient to send your HTTP request and get an HTTP response.
I suggest using Wireshark (or Firefox + Firebug); it allows you to see HTTP requests and responses. By looking at the HTTP traffic you can see exactly how you should compose your HTTP request and which parameters you should be setting.
You don't need to involve the browser with this. WebClient should do all that you require. You'll need to see what's actually being posted when you submit the form with the browser, and then you should be able to make a POST request using the WebClient and retrieve the resulting page as a string.
The docs for the WebClient constructor have a nice example.
See e.g. this question for some pointers on at least the data-retrieval side. You're going to know a lot more about the HTTP protocol before you're done with this...
Why would you do this through web pages if you don't even want the user to do anything?
Web pages are purely for interaction with users; if you simply want data transfer, use WCF.
@Brian: using Wireshark will result in a very angry network manager; make sure you are actually allowed to use it.

C# WebClient - View source question

I'm using a C# WebClient to post login details to a page and read the all the results.
The page I am trying to load includes Flash (which, in the browser, translates into HTML). I'm guessing it's Flash to avoid being picked up by search engines?
The Flash content I am interested in is just text (not an image/video), and when I "View Selection Source" in Firefox I do actually see the text, within HTML, that I want to see.
(Interestingly, when I view the source for the whole page, I do not see the text, within HTML, that I want to see. Could this be related?)
Currently, after I have posted my login details and loaded the HTML back, I see the page, which does NOT show the Flash HTML (as if I had viewed the source for the whole page).
Thanks in advance,
Jim
PS: I should point out that the POST is actually working; my login is successful.
Fiddler (or a similar tool) is invaluable for tracking down screen-scraping problems like this. Using a normal browser with Fiddler active, look at all the requests being made as you go through the login and navigation process to get to the data you want. Along the way, you will likely see one or more things that your code is doing differently, which the server responds to by showing you different HTML than it shows a real client.
The list of stuff below (think of it as "scraping 101") is what you want to look for. Most of the stuff below is probably stuff you're already doing, but I included everything for completeness.
In order to scrape effectively, you may need to deal with one or more of the following:
Cookies and/or hidden fields. When you arrive at any page on a site, you'll typically get a session cookie and/or a hidden form field which (in a normal browser) would be propagated back to the server on all subsequent requests. You will likely also get a persistent cookie. On many sites, if a request shows up without the proper cookie (or form field, for sites using "cookieless sessions"), the site will redirect the user to a "no cookies" UI, a login page, or another undesirable location (from the scraper's perspective). Always capture the cookies set on the initial request and faithfully send them back on subsequent requests, except when a subsequent request changes a cookie (in which case propagate the new cookie instead). There is a cookie-handling sketch after this list.
Authentication tokens. A special case of the above is forms-authentication cookies or hidden fields. Make sure you're capturing the login token (usually a cookie) and sending it back.
POST vs. GET. This is obvious, but make sure you're using the same HTTP method that a real browser does.
Form fields (especially hidden ones!). I'm sure you're doing this already, but make sure to send all the form fields that a real browser does, not just the visible ones, and make sure the fields are URL-encoded properly.
HTTP headers. You already checked this, but it may make sense to check again just to be sure the (non-cookie) headers are identical. I always start with the exact same headers, then pull headers out one by one, keeping only the ones whose removal causes the request to fail or return bogus data. This approach simplifies your scraping code.
Redirects. These can come either from the server or from client script (e.g. "if the user doesn't have the Flash plug-in loaded, redirect to a non-Flash page"). See WebRequest: How to find a postal code using a WebRequest against this ContentType="application/xhtml+xml, text/xml, text/html; charset=utf-8"? for a crazy example of how redirection can trip up a screen-scraper. Note that if you're using .NET for scraping, you'll need HttpWebRequest (not WebClient) for redirect-dependent scraping, because by default WebClient doesn't give your code a way to attach cookies and headers to the second (post-redirect) request. See the thread above for more details.
Sub-requests (frames, Ajax, Flash, etc.). Often, page elements (not the main HTTP request) end up fetching the data you want to scrape. You'll be able to figure this out by looking at which HTTP response contains the text you want, and then working backwards until you find what on the page actually makes the request for that content. A few sites do really crazy things in sub-requests, like requesting compressed or encrypted text via Ajax and then using client-side script to decrypt it; if that is the case, you'll need to do a bit more work, like reverse-engineering what the client script does.
Ordering. This one is obvious: make HTTP requests in the same order a browser client does. That doesn't mean you need to make every request (e.g. for images); typically you only need the requests that return a text/html content type, unless the data you want is not in the HTML but in an Ajax/Flash/etc. request.
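To illustrate the cookie point with HttpWebRequest: share a single CookieContainer across the login request and every later request, so session and forms-auth cookies flow back automatically. The URLs and field names below are placeholders:

    using System;
    using System.IO;
    using System.Net;
    using System.Text;

    class ScrapingSession
    {
        static void Main()
        {
            var cookies = new CookieContainer(); // shared across all requests

            // 1. Log in: POST the same fields a real browser sends (hidden ones too).
            var login = (HttpWebRequest)WebRequest.Create("https://site.example/login"); // hypothetical
            login.CookieContainer = cookies;
            login.Method = "POST";
            login.ContentType = "application/x-www-form-urlencoded";
            byte[] body = Encoding.UTF8.GetBytes("user=jim&pass=secret");
            using (Stream s = login.GetRequestStream())
                s.Write(body, 0, body.Length);
            using (login.GetResponse()) { } // consume the response; cookies land in the container

            // 2. Request the page with the data; the session cookie is sent automatically.
            var page = (HttpWebRequest)WebRequest.Create("https://site.example/data"); // hypothetical
            page.CookieContainer = cookies;
            using (var response = page.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
                Console.WriteLine(reader.ReadToEnd());
        }
    }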
(Interestingly when I view the source for the whole page I do not see the text, within HTML, that I want to see. Could this be related?)
This usually means that the discrepancy is caused by DOM manipulation via JavaScript after the page has loaded. Try turning off JavaScript and see what the page looks like.
