Here's what I want the program to do:
1. Read a text file (the text file contains random search criteria like "sunflower seeds", "chrome water faucets", etc.) to retrieve a search phrase.
2. Submit the search phrase to Google and retrieve the first four URLs.
3. Retrieve the Google PageRank of each of the returned URLs.
Being a neophyte C# programmer, I can handle #1 easily. Unfortunately, I've never dealt with using the Google APIs before. I do have a Google API key and I'm aware that there is a search limit using the API. At most, I'll probably use this on a dozen search phrases (or "keywords") per day. I can do this manually, but I know there has to be a way to do this with a C# program. I've read that this can be done using AJAX, but I don't know AJAX and I'd rather this just be an executable program on my PC rather than a web-based app. A push in the right direction from someone would be a big help. Also, I really don't want this to be a "screen-scraper", either. Isn't there a way that I can get the info (URLs and Page Rank) from Google without having to scrape a returned HTML search page?
I don't want anyone to write the code for me, just need to know if it's possible and a push towards finding the information on how to accomplish it.
Thanks in advance everyone!
I don't want anyone to write the code for me, just need to know if it's possible and a push towards finding the information on how to accomplish it.
Look into the WebClient class
http://msdn.microsoft.com/en-us/library/system.net.webclient(VS.80).aspx
Try this:
string googleSearch = "http://www.google.com/search?hl=en&q=" + query;
where query is the string of your search (URL-encode it first, e.g. with Uri.EscapeDataString).
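Something like this, as a minimal sketch that pairs the URL above with the WebClient class from the previous answer; the query value and the User-Agent string are placeholders, and keep in mind this still downloads the raw results page, so it is effectively scraping.

using System;
using System.Net;

class SearchFetcher
{
    static void Main()
    {
        // Placeholder phrase; in the real program this would come from the text file
        string query = "sunflower seeds";

        // The "search?" form of the URL works for a plain HTTP GET;
        // the "#hl=en&q=" fragment form is only resolved inside a browser
        string googleSearch = "http://www.google.com/search?hl=en&q=" + Uri.EscapeDataString(query);

        using (var client = new WebClient())
        {
            // Some servers refuse requests that carry no User-Agent header
            client.Headers[HttpRequestHeader.UserAgent] = "Mozilla/5.0 (example)";
            string html = client.DownloadString(googleSearch);
            Console.WriteLine(html.Length + " characters downloaded");
        }
    }
}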
I want to create an application that searches for something, with some filters, across various websites (I don't need to log in to those third-party websites, so the data is open to the public) and shows the results in my application. I have a few questions:
1. Is it legal?
2. Is this web scraping or a meta search engine?
3. Can I get more information (any web links/articles) to learn more about it? How do I achieve it technically? One way I know is the XPath technique for scraping, but I am wondering if there are more ways (a small sketch follows this list).
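To make the XPath option in item 3 concrete, here is a tiny sketch using Html Agility Pack; the URL and the XPath expression are invented placeholders, so treat this purely as an illustration of the technique, not as working selectors for any real site.

using System;
using HtmlAgilityPack; // install the HtmlAgilityPack NuGet package

class XPathScrapeSketch
{
    static void Main()
    {
        // Hypothetical page; real selectors depend entirely on the target site's markup
        var web = new HtmlWeb();
        HtmlDocument doc = web.Load("http://example.com/listings?filter=shoes");

        // XPath: every link inside an h2 inside a div with class "result"
        var nodes = doc.DocumentNode.SelectNodes("//div[@class='result']/h2/a");
        if (nodes != null)
        {
            foreach (var node in nodes)
                Console.WriteLine(node.InnerText.Trim());
        }
    }
}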
I am NOT asking for the entire code. Just how to start / Any guidance?
Thank You in Advance !
Firstly, you need to understand how search engines work!
Search engines like Google have special programs, called "spiders" (crawlers), designed to mine information from the web. A spider crawls across web pages and the engine indexes what it finds so it can match pages against a search query later. That is a genuinely complex thing to build: it takes good code and real algorithm expertise to develop a spider of your own. If you can master it you can earn good money from it, but that's rare unless you're exceptionally good.
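Just to illustrate the crawling step a spider repeats over and over (fetch a page, collect the links to visit next), here is a deliberately naive sketch; the start URL is a placeholder, and a real crawler would use an HTML parser and a visited-URL queue rather than a regex.

using System;
using System.Net;
using System.Text.RegularExpressions;

class SpiderStepSketch
{
    static void Main()
    {
        // Placeholder start page for the crawl
        string startUrl = "http://example.com/";

        using (var client = new WebClient())
        {
            string html = client.DownloadString(startUrl);

            // Very naive link extraction, just to show the idea;
            // a real spider parses the HTML properly and keeps a queue of unvisited URLs
            foreach (Match m in Regex.Matches(html, "href=\"(http[^\"]+)\""))
                Console.WriteLine(m.Groups[1].Value);
        }
    }
}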
I am currently developing a Word-Completion application in C# and after getting the UI up and running, keyboard hooks set, and other things of that nature, I came to the realization that I need a WordList. The only issue is, I can't seem to find one with the appropriate information. I also don't want to spend an entire week formatting and gathering a WordList by hand.
The information I want is something like "TheWord, The definition, verb/etc."
So, it hit me. Why not download a basic word list with nothing but words (already did this; there are about 109,523 words), write a program that iterates through every word, connects to the internet, retrieves the data (definition etc.) from some arbitrary site, and creates XML data from said information. It could be 100% automated, and I would only have to wait for maybe an hour depending on my internet connection speed.
This however, brought me to a few questions.
How should I connect to a site to look up these words? << This is my actual question.
How would I read this information from the website?
Would I piss off my ISP or the website for that matter?
Is this a really bad idea? Lol.
How do you guys think I should go about this?
EDIT
Someone noticed that Dictionary.com uses the word as a suffix in the URL. This will make it easy to iterate through the word file. I also see that the webpage is served as XHTML (or maybe just HTML). Here is the source for the word "Cat". http://pastebin.com/hjZj6AC1
For what you marked as your actual question - you just need to download the data from the website and find what you need.
A great tool for this is CsQuery, which allows you to use jQuery selectors.
You could do something like this:
// Download the page and parse it into a queryable DOM
var dom = CQ.CreateFromUrl("http://www.jquery.com");
// Pull out the element you need with a jQuery-style selector
string definition = dom.Select(".definitionDiv").Text();
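Building on that, here is a rough sketch of the whole loop described in the question: read the word list, fetch a page per word, and write the results out as XML. The Dictionary.com URL pattern and the ".definitionDiv" selector are assumptions, so check the real markup before relying on them.

using System;
using System.IO;
using System.Xml.Linq;
using CsQuery;

class WordListBuilder
{
    static void Main()
    {
        var root = new XElement("Words");

        // words.txt is the plain word list from the question
        foreach (string word in File.ReadLines("words.txt"))
        {
            // Assumed URL pattern (word as suffix, per the EDIT above);
            // the ".definitionDiv" selector is a guess and must come from the real page source
            var dom = CQ.CreateFromUrl("http://dictionary.reference.com/browse/" + word);
            string definition = dom.Select(".definitionDiv").Text();

            root.Add(new XElement("Word",
                new XAttribute("Text", word),
                new XElement("Definition", definition)));
        }

        root.Save("words.xml");
    }
}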
Hi, I am pretty new to C#. I've been in PHP and JavaScript since the beginning of this year. I want to scrape posts and comments from a blog. The site is http://www.somewhereinblog.net
What I want to do is
1. Log in from my program
2. Then download the HTML
3. Then use regular expressions, XPath, or whatever comes in handy to separate out the contents of posts and comments
I've been searching all over and understood very little, though I am quite sure I need to use HtmlAgilityPack. I don't know how to add a library to a C# console or Windows Forms application. Can someone give me some help? I badly need this. I've only been into C# for a week, so I would be grateful for some detailed information. Waiting eagerly.
Thanks in advance brothers.
Using WebClient you can log in and download the HTML.
Instead of Html Agility Pack, I like CsQuery because it lets you use jQuery syntax from your C# code: you download the HTML into a string, then search it and work with it just as you would with jQuery on an HTML page.
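Something along these lines, as a sketch only: the login URL, the form field names, and the ".post-body" selector are all guesses, so inspect the site's actual login form and markup before using them. WebClient doesn't keep cookies on its own, hence the small subclass.

using System;
using System.Collections.Specialized;
using System.Net;
using CsQuery;

// WebClient forgets cookies between requests, so this subclass
// attaches one shared CookieContainer to every request it makes
class CookieWebClient : WebClient
{
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        if (request is HttpWebRequest http)
            http.CookieContainer = Cookies;
        return request;
    }
}

class BlogScrapeSketch
{
    static void Main()
    {
        using (var client = new CookieWebClient())
        {
            // Guessed login URL and form field names; check the real login form
            var form = new NameValueCollection
            {
                { "username", "yourName" },
                { "password", "yourPassword" }
            };
            client.UploadValues("http://www.somewhereinblog.net/login", form);

            // Download a post page as a string and query it with jQuery-style selectors
            string html = client.DownloadString("http://www.somewhereinblog.net/some-post");
            var dom = CQ.Create(html);
            Console.WriteLine(dom.Select(".post-body").Text());
        }
    }
}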
Going to the Google Code page, I couldn't find the API I should use to perform a basic web search. All other resources I found point to the Google Base API, but it is no longer available.
What I need is to be able to submit a query string and get back a list of site names. For example, I need to find the first results when searching for "champions league", as if I had typed the query on the Google page.
What is the correct API to use for text searches these days? Are there any libraries for PHP or C#?
EDIT: I found PHP code on the net that sends requests to ajax.googleapis.com/ajax/services/search/web. I checked it out and it actually returns search results :) Do you know where I can find info for this endpoint and which API it is part of? Also, the Custom Search API suggested by @Rickard doesn't seem to provide this basic functionality. I tried to use it but it asks me to enter the sites I want to search in. I don't want to search particular sites, but all of them.
Thank you
Check out the Google Custom Search API
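Regarding the "it asks me for sites" concern in the EDIT: a custom search engine can be configured to search the entire web rather than a fixed list of sites. Here is a rough sketch of calling the Custom Search JSON API from C#; the key and cx values are placeholders from your API console, and the exact response fields should be checked against the official docs.

using System;
using System.Net;

class CustomSearchSketch
{
    static void Main()
    {
        // Placeholders: both values come from the Google API console
        string apiKey = "YOUR_API_KEY";
        string cx = "YOUR_SEARCH_ENGINE_ID";
        string query = "champions league";

        string url = "https://www.googleapis.com/customsearch/v1"
                   + "?key=" + apiKey
                   + "&cx=" + cx
                   + "&q=" + Uri.EscapeDataString(query);

        using (var client = new WebClient())
        {
            // The response is JSON; each result should sit in an "items" array
            // with fields such as "title", "link" and "snippet"
            string json = client.DownloadString(url);
            Console.WriteLine(json);
        }
    }
}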
I found this to work just fine; the Google API is great unless you need to search more than 100 times per day, at which point they charge you. This is a simple solution to that problem, but it only works for text searches, not image searches.
Search for "turtle":
string searchString = "Turtle";
// Include the scheme so Process.Start can resolve the address; this opens the default browser
System.Diagnostics.Process.Start("http://www.google.com/search?q=" + searchString);
Code padawan here teaching myself to code so pardon the ignorance.
I want to be able to enter a search term into console and return the search results from google to be displayed in google.
What do I need to learn/read in order to accomplish this?
UPDATED
1. Enter a search term into the console
2. The program takes this search term and runs it in a Google search
3. Take the results of this search query and output the first page to the console
I hope this is clearer ^_^
You may want to look at the WebClient class to retrieve results from Google.
As for displaying them, I don't quite understand what you mean in your question.
You can always output them to the console.
return the search results from google to be displayed in google
Could you elaborate on what you mean by "displayed in google"?
To send a query to Google you will probably need an API key; otherwise you will get a nastygram, unless you change your user agent to claim to be something you aren't.
This project might be harder than it seems with all the extra hoops.
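If the UPDATE above is the goal, here is a rough sketch of those three steps, including the user-agent point from the previous paragraph; it dumps the raw HTML of the first results page rather than nicely formatted results, and the User-Agent string is just an example.

using System;
using System.Net;

class ConsoleSearchSketch
{
    static void Main()
    {
        // 1. Read the search term from the console
        Console.Write("Search term: ");
        string term = Console.ReadLine();

        using (var client = new WebClient())
        {
            // Present a browser-like User-Agent, as mentioned above; the exact string is an example
            client.Headers[HttpRequestHeader.UserAgent] =
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64)";

            // 2. Run the term through a Google search
            string url = "http://www.google.com/search?q=" + Uri.EscapeDataString(term);

            // 3. Output the first results page (raw HTML) to the console
            Console.WriteLine(client.DownloadString(url));
        }
    }
}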