Determining Another Site's Traffic Measurements? - c#

I have a conceptual question.
I am wondering how companies such as Alexa Internet determine a given site's (not my own) overall traffic and the traffic for each unique page. I would appreciate a technical response: if you were to design this feature (I am sure it is complicated, but hypothetically...), how would you go about it?
Thanks in advance.

One way is to be hooked into one or more core routers. From there you could perform deep packet inspection to see where traffic is going, what pages are visited, etc.
Another way is to have people install a browser toolbar which records where they go and submits that information back to you. I think this is how Alexa works.
A third way is to have web site owners install a bit of JavaScript which performs analytics and submits that data back to you. This is how Google does it.
A fourth way is to buy that data from companies that do one of the above.

Alexa estimates website traffic by extrapolating the data from the browsing sessions of the subset of the Internet population who use the Alexa toolbar or browser extensions. This isn't a truly random sample, so questions are raised over the accuracy of such data:
http://en.wikipedia.org/wiki/Alexa_Internet#Accuracy_of_ranking_by_the_Alexa_Toolbar
Installing the Alexa toolbar modifies the browser user-agent, so you can estimate the % of visitors to your site who are contributing data to Alexa by scanning your server logs for requests with the appropriate user-agent strings.
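As a rough illustration of the log-scanning idea, here is a minimal Python sketch. It assumes the access log uses the common "combined" format, where the User-Agent is the last quoted field, and that the toolbar adds a recognisable substring to it. The marker string `Alexa Toolbar` and the sample log lines are assumptions for illustration; check your own logs for the real token.

```python
import re

# Hypothetical marker: the substring the toolbar adds to the User-Agent.
# Verify the real token against your own logs before relying on it.
ALEXA_MARKER = "Alexa Toolbar"

# In the "combined" log format the User-Agent is the last quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def toolbar_share(log_lines, marker=ALEXA_MARKER):
    """Return the fraction of requests whose User-Agent contains `marker`."""
    total = 0
    matched = 0
    for line in log_lines:
        found = USER_AGENT_RE.search(line)
        if not found:
            continue  # skip lines that do not end in a quoted field
        total += 1
        if marker.lower() in found.group(1).lower():
            matched += 1
    return matched / total if total else 0.0


# Made-up sample lines, not real traffic.
sample_log = [
    '1.2.3.4 - - [10/Oct/2011:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/4.0 (compatible; MSIE 8.0; Alexa Toolbar)"',
    '5.6.7.8 - - [10/Oct/2011:13:55:40 +0000] "GET /a HTTP/1.1" 200 128 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) Firefox/7.0"',
]
share = toolbar_share(sample_log)
print(f"{share:.0%} of requests carry the marker")
```

Note that this counts requests rather than visitors; to estimate visitors you would first group the lines by IP address or session.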

Issue with HttpWebRequest Google.com

I have a C# application that searches on Google. After a few hits, I see the captcha message.
To solve this, I open Internet Explorer, go to the same page, and I'm presented with the captcha as well. I complete it, and then it's all good; search results are shown.
But when my C# application hits the same URL, I still see the captcha. Why is that, and how could I bypass it? I am confused: I've completed the captcha (using IE), so why do I see it again on the next hit from C# but not from the browser?
I just need to be pointed in the right direction, or some ideas or suggestions.
I don't have any knowledge of how Google does it, but I've seen websites which track how often you use them based on:
IP Address
User-Agent String
Cookies
You can spoof number 2 so it's the same as in Internet Explorer, just in case it's through that.
Number 3 is easy to check I suppose, and you can transmit the cookie if there is one.
Google wants to prevent people from sending requests from their own applications, since such requests show no advertising and may look like an attack. You have two options: 1. Make your application behave the way the browser does, for example by sending the same User-Agent and cookies. 2. Ask Google for an API. I'm sure Google provides an API for this purpose, but I don't have more detailed information.

ASP.NET how to update a page upon a click event of another page

I know the question sounds too vague so let me explain exactly what I want to implement.
I have a web application that many users log into to submit a request. A request in my project is a form that accepts some information from the user, and when he clicks submit, it shows up on the administrator page. The admin can then grant or decline the request, and of course the result needs to be sent to the user's 'Pending Requests' page.
This process is all about time, so I need a clean and efficient way to show the admin the requests instantly and for the user to see the admin's response instantly (kind of like the Facebook notification system).
I hope my problem is now clear. I understand that there are many ways to implement this, and I know very little about them. I just want you guys to recommend an efficient way, because I'm sure the number of good ways to do this is limited.
Thanks in advance everybody :)
I will suggest you take a look at SignalR (https://github.com/SignalR/SignalR). It is a framework developed by a few MS developers for doing long polling/notifications from the server.
Link for webforms walkthrough - http://www.infinitelooping.com/blog/2011/10/17/using-signalr/.
You could also look into using a Timer control. It's a client side control that will cause a postback for ASP.NET AJAX applications. Here's a simple tutorial
http://ajax.net-tutorials.com/controls/timer-control/
What you're talking about is a 'push' notification, where the server would pass a notification to the client (a browser) without the client requesting anything.
This isn't something HTTP is naturally capable of; however, have a read about Comet, which will show you the current state of what is possible.
You may opt for creating a 'heartbeat' on the client side - a polling mechanism which requests from the server every x seconds, and updates the page when new content is found.
I need a clean and efficient way to show the admin the requests instantly and for the user to see the admin's response instantly.
Instantly is a very strong term and isn't usually very scalable.
For some ideas on how you might implement this I'd recommend you take a look at Wikipedia's Comet Programming page
When a user submits a request, I assume it is first stored in the database. So on both the admin and user pages you can use Ajax to periodically fetch data from the database (for example, un-approved requests). Do a Google search for Ajax auto-update, JavaScript's setTimeout, or similar functions.
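The polling ("heartbeat") approach described in these answers can be sketched in a language-neutral way. The Python below is only an illustration of the loop, not ASP.NET code: `fetch_since` is a hypothetical stand-in for the database query or Ajax call, and `requests_table` is made-up data. In a browser the same loop would be driven by a JavaScript timer.

```python
import time


def poll_for_updates(fetch_since, on_new, interval_seconds=5.0,
                     max_polls=None, sleep=time.sleep):
    """Call `fetch_since(last_id)` repeatedly and pass new rows to `on_new`.

    `fetch_since` is a hypothetical callable returning (id, payload) tuples
    whose id is greater than `last_id`.
    """
    last_id = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        for row_id, payload in fetch_since(last_id):
            on_new(row_id, payload)
            last_id = max(last_id, row_id)  # remember what was already shown
        polls += 1
        sleep(interval_seconds)
    return last_id


# Stand-in for the table of submitted requests.
requests_table = [(1, "request from alice"), (2, "request from bob")]


def fake_fetch(last_id):
    return [row for row in requests_table if row[0] > last_id]


seen = []
final_id = poll_for_updates(fake_fetch, lambda row_id, payload: seen.append(payload),
                            interval_seconds=0, max_polls=2)
```

Tracking the last seen id keeps each poll cheap, since the server only returns rows the client has not displayed yet. The interval is the trade-off: shorter means fresher data but more load.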

Programmatically purchase from a website in C# or javascript

Hey guys, I'm trying to create a website that can help a user purchase items from other websites. What would be the best way to go about doing this?
I know most of the sites I'm using send their information via form POSTs, but I'm having trouble finding the exact POST request in Fiddler (I'm assuming it's encrypted?), and I know that a lot of the sites use login credentials, which complicates things a bit.
Is there any way I could use WebKit or something to handle all the HTTP stuff, and just pass JavaScript to fill in the forms? Or is there an even simpler way to create proper POST requests and use a WebRequest?
Thank you!
1) get permission
2) use their published API
If the sites do not have an API but allow you to use their server process, copy their forms to your site and use POST. You can post from your server with credentials using, for example, cURL.
Usually shopping carts and credit-card transactions use SSL, and you have to log in to the site. So I think it's not so simple to bridge with JavaScript or a simple web request.
There's no standard, simple way to do this!
You're heading for a world of hurt.
First, you should check if what you're trying to do is legal. Does the web site allow "proxy orders"? Or are they forbidden by their EULA?
Second, you'll have to handle the user's confidential data (username, password, credit card number), and credit card numbers especially are asking for trouble.
Third, how are you planning to implement payment methods like PayPal? Are you going to collect the user's PayPal credentials in order to make payments on their behalf? (See point number two if the answer is yes.)
Fourth, since you have to fake HTTP requests, your tool will break as soon as the web site changes a single field. How are you planning to handle this?
Or are you trying to automate only the first steps of the order and not the payment?

How Systems like AdSense and Webstats Work?

I am thinking about working with remote data, that is, sending and receiving data on external web sites. There are a large number of working examples on the World Wide Web, for example free online tools like web stats or Google's AdSense. In such services, some code is generated for publishers, the publisher puts the generated code in the BODY of their web page (the HTML file), and after that the system works. We can get the count of visits for home pages, the count of clicks on advertisements, and so on.
Now this is my question: how do such systems work? How can I research them to find out how to program them? Can you suggest some keywords? Which titles should I be looking for, and which technologies are relevant to this kind of programming? I want to find some relevant references so I can learn and start experimenting with these systems. If my question is not clear, I will explain it further. Help me, I am confused.
Consider that I am a programmer who wants to build such systems, not use them.
There are a few different ways to track clicks.
Redirection Tracking
One is to link the advertisement (or any link) to a redirection script. You would normally pass it some sort of ID so it knows which URL it should forward to. But before redirecting the user to that page, it can first record the click in a database, where it can store the user's IP, a timestamp, browser information, etc. It will then forward the user (without them really knowing) to the specified URL.
Advertisement ---> Redirection Script (records click) ---> Landing Page
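A minimal Python sketch of such a redirection script follows. The `DESTINATIONS` mapping, the `click_log` list (standing in for a database table), and the example address are all assumptions for illustration; a real script would sit behind a web server and read these values from the incoming request.

```python
from datetime import datetime, timezone

# Hypothetical mapping from ad ID to landing page.
DESTINATIONS = {"55": "http://example.com/landing"}

click_log = []  # stand-in for a database table


def handle_click(ad_id, ip, user_agent):
    """Record the click, then return an HTTP redirect as (status, headers)."""
    target = DESTINATIONS.get(ad_id)
    if target is None:
        return 404, {}  # unknown ID: nothing to record or redirect to
    click_log.append({
        "ad_id": ad_id,
        "ip": ip,
        "user_agent": user_agent,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    # A 302 response makes the browser follow the Location header.
    return 302, {"Location": target}


status, headers = handle_click("55", "203.0.113.7", "Mozilla/5.0")
```

Because the click is stored before the redirect is returned, the visit is counted even if the landing page never finishes loading.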
Pixel Tracking
Another way to do it is to use pixel tracking. This is where you put a "pixel" or a piece of JavaScript code in the body of a webpage. The pixel is just an image (or a script posing as an image) which will then be requested by the browser of the user visiting the page. The tracker which hosts the pixel can record the relevant information from that image request. Some systems use JavaScript instead of an image (or use both) to track clicks. This may allow them to gain slightly more information through JavaScript's functions.
Advertisement ---> Landing Page ---> User requests pixel (records click)
Here is an example of a pixel: <img src="http://tracker.mydomain.com?id=55&type=png" />
I threw in the png at the end because some systems might require a valid image filetype.
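On the tracker side, handling the pixel request might look roughly like the Python sketch below. It is a simplified illustration: `hits` stands in for a database, the IP and referring page are passed in as plain arguments rather than read from a real request, and the sketch ignores the `type` parameter and always returns a 1x1 transparent GIF.

```python
import base64
from urllib.parse import urlparse, parse_qs

# A widely used 1x1 transparent GIF, base64-encoded.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

hits = []  # stand-in for a database table


def serve_pixel(request_url, ip, referer):
    """Log the page view described by the request, then return the image."""
    params = parse_qs(urlparse(request_url).query)
    hits.append({
        "id": params.get("id", [None])[0],
        "ip": ip,
        "page": referer,  # the Referer header names the page showing the pixel
    })
    headers = {
        "Content-Type": "image/gif",
        "Cache-Control": "no-store",  # ask browsers not to cache the pixel
    }
    return 200, headers, PIXEL_GIF


status, headers, body = serve_pixel(
    "http://tracker.mydomain.com?id=55&type=png",
    "203.0.113.7",
    "http://example.com/landing",
)
```

Disabling caching matters here: if the browser reuses a cached copy of the image, the tracker never sees the repeat visit.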
Hidden Tracking
If you do not want the user to know what the tracker is you can put code on your landing page to pass data to your tracker. This would be done on the backend (server side) so it is invisible to the user. Essentially you can just "request" the tracker URL while passing relevant data via the GET parameters. The tracker would then record that data with very limited server load on the landing page's server.
Advertisement ---> Landing Page requests tracker URL and concurrently renders page
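A server-side call like this could be sketched in Python as follows. The tracker endpoint and the parameter names (`id`, `ip`, `page`) are hypothetical. The request is sent from a background thread so that rendering the landing page does not wait on the tracker.

```python
import threading
import urllib.request
from urllib.parse import urlencode

TRACKER_URL = "http://tracker.mydomain.com/"  # hypothetical endpoint


def build_tracker_url(ad_id, visitor_ip, page):
    """Encode the tracking data as GET parameters on the tracker URL."""
    query = urlencode({"id": ad_id, "ip": visitor_ip, "page": page})
    return TRACKER_URL + "?" + query


def notify_tracker(url, timeout=2.0):
    """Request the tracker URL in a background thread and ignore failures."""
    def _send():
        try:
            urllib.request.urlopen(url, timeout=timeout).close()
        except OSError:
            pass  # a tracker outage must never break the landing page

    thread = threading.Thread(target=_send, daemon=True)
    thread.start()
    return thread


# Building the URL needs no network; notify_tracker(url) would send it.
url = build_tracker_url("55", "203.0.113.7", "/landing")
```

Since the visitor's browser never contacts the tracker directly, the landing page has to forward anything the tracker needs, such as the visitor's IP address.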
Your question really isn't clear I'm afraid.
Are you trying to find out who uses your site, how many clicks you get, and so on? Something like Google Analytics might be what you are after. Take a look here: http://www.google.com/analytics/
EDIT: Adding more info in response to comment.
Ah, OK, so you want to know how Google tracks clicks on sites when those sites use Google ads? Well, a full discussion on how Google AdSense works is well beyond me I'm afraid - you'll probably find some useful info on Google itself and on Wikipedia.
In a nutshell, and at a very basic level, Google ads work by directing the click to Google first. If you look at the URL for a Google ad (on this site, for example), you will see it starts with "http://googleads.g.doubleclick.net..." (Google owns DoubleClick). The URL also contains a lot of other information, which allows Google to detect where the click came from and where to redirect you so you see the actual web site being advertised.
Google Analytics is slightly different in that it is a small chunk of JavaScript you run in your page, but that too basically reports back to Google that the page was clicked on, when you landed there, and how long you spent on it.
Like I said a full discussion of this is beyond me I'm afraid, sorry.

Login to online accounts

First time poster. Back to programming after being away for a few years, trying to clean off the rust. I'm creating a dashboard that will run initially on my laptop (Macbook Pro, 10.4.x O/S). Amongst other things, I want it to retrieve the latest information from my online accounts. I'm starting with HTML, but will probably migrate to something else (TBD, possibly Ruby or C#). What would sample code look like for logging into an account, going through a specific account workflow, retrieving data/docs/other, and pulling it back to be stored locally?
It is a little open-ended, apologies and thanks in advance.
Are you looking for something like Google Gears?
It depends on the kind of accounts you want to log onto.
For instance, Google has a specific API for that, the Google Accounts API. Other services provide similar APIs; some others do not.
So it pretty much depends on what your "online accounts" are and whether they provide a public API or not.
EDIT
As per your comment and for the products you've mentioned, I suggest you start looking at browser plugin development and start understanding the HTTP protocol and all the related technologies around it (HTTPS, encryption, authentication, etc.).
A public API lets you easily log in to an account, but you don't really need one to do it (although it makes life much simpler). If you do not have a public API, you can still log in to any account by "simply" doing what the browser does: sending an HTTP(S) request with the appropriate security mechanism and following the protocol.
If you know how the browser sends the request and the users trust you with their passwords, the only remaining thing you have to do is ... :) code it.
As of now the question is too broad to be answered. Pick one service at a time and ask specific questions about it.
I would suggest you start with the previously mentioned "Google Accounts" API and learn from there.
One open source product that already manages Google account authentication is "Ubiquity". You can take a peek at their source code and start understanding how they fetch the user's contact list.
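To make "doing what the browser does" a little more concrete, here is a minimal Python sketch of a form login with cookie handling, using only the standard library. The login URL and the form field names (`username`, `password`) are assumptions; each real site has its own, and many add hidden fields or tokens you would need to copy from the login form. Only do this for accounts you own and where the site's terms allow it.

```python
import http.cookiejar
import urllib.request
from urllib.parse import urlencode


def make_session():
    """Return an opener that stores and resends cookies, like a browser."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def build_login_request(login_url, username, password):
    """Build the form POST a browser sends when the login form is submitted."""
    # Field names are hypothetical; read the real ones from the site's form.
    form = urlencode({"username": username, "password": password})
    return urllib.request.Request(
        login_url,
        data=form.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


opener, jar = make_session()
request = build_login_request("https://example.com/login", "me", "secret")
# opener.open(request) would send the login; later opener.open(...) calls
# reuse the session cookie that the server set, which is kept in `jar`.
```

The cookie jar is what turns separate requests into a logged-in session: the server sets a session cookie in the login response, and the opener sends it back with every later request.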
