Is there an API I can use in a WPF application to search Google?
The thing is:
Currently my team searches for a few files in Google, and they do this a lot,
so I would like to build an application that will search for the files in Google and return the links.
E.g.:
I am searching for the product "Pentium 4 chips".
The team will then search Google for documentation (basically PDFs) about Pentium 4 chips.
Following this, they will take the search results that match the manufacturer's website (e.g. intel.com)
and continue their work with the PDFs they found.
I want to use the Google search API to get the details and give them the exact links or links which nearly match.
My problems:
I cannot find the right API.
I am not sure how to use it.
It doesn't matter if it's C# or WPF or WCF.
The frontend can be anything; the focus is on the logic.
Try this
http://googledotnet.codeplex.com/
Also
http://social.msdn.microsoft.com/Forums/en-US/netfxnetcom/thread/82da7a25-9f30-4d76-93dc-4acd1c2a938a
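If you end up going the Custom Search route instead, here is a minimal sketch of calling the Google Custom Search JSON API from C# for the PDF workflow described in the question. The API key, engine ID, and query string are placeholders you would create in the Google API Console:

using System;
using System.Net.Http;
using System.Threading.Tasks;

class PdfSearch
{
    static async Task Main()
    {
        // Placeholder credentials - create these in the Google API Console.
        string apiKey = "YOUR_API_KEY";
        string engineId = "YOUR_ENGINE_ID";

        // Restrict to PDFs on the manufacturer's site (example: intel.com).
        string query = Uri.EscapeDataString("pentium 4 datasheet filetype:pdf site:intel.com");
        string url = "https://www.googleapis.com/customsearch/v1" +
                     "?key=" + apiKey + "&cx=" + engineId + "&q=" + query;

        using (var client = new HttpClient())
        {
            // The response is JSON; each item carries "title" and "link" fields
            // that can be shown to the team as the result links.
            string json = await client.GetStringAsync(url);
            Console.WriteLine(json);
        }
    }
}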
Related
I want to create an application that searches for something, with some filters, across various websites (I don't need to log in to those third-party websites, so the data is open to the public) and shows it in my application. I have a few questions:
1. Is it legal?
2. Is this web scraping or a meta search engine?
3. Where can I get more information (any web links/articles) about it, and how do I achieve it technically? One way I know of is the XPath technique for scraping, but I am wondering if there are more ways.
I am NOT asking for the entire code, just how to start / any guidance.
Thank you in advance!
Firstly, you need to understand how search engines work!
Search engines like Google have special programs designed to mine information from the web; they are called "spiders" (or crawlers). What a spider does is basically crawl over all the web pages relevant to a search query and find matching information. However, that's a really complex thing to build: it takes really good code and algorithm expertise to develop a spider yourself. If you can master that, you'll earn a smooth sum of money, but it's rare unless you're a blatant genius!
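For the XPath technique mentioned in the question, here is a small scraping sketch using the HtmlAgilityPack library. The URL and the XPath expression are placeholder assumptions, not a definitive implementation:

using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

class ScrapeExample
{
    static void Main()
    {
        // Placeholder URL - substitute the public page you want to scrape.
        var web = new HtmlWeb();
        HtmlDocument doc = web.Load("https://example.com/products");

        // Select every anchor tag that has an href attribute via XPath.
        var links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                Console.WriteLine(link.GetAttributeValue("href", ""));
            }
        }
    }
}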
I'm working on a C# application to return the exact top 10 Google search results for a specific keyword, so I decided to try the Google Search API. In particular, I'm using a .NET library called "GoogleSearchAPI", but it doesn't return the same results as typing the query into Google, and I'm curious if there's a way to do so, either using the Google Search API or through some other method; I really don't care which.
For example, here are two screenshots using the same search phrase. The first one is from Google:
And this one is what's being returned from the Google Search API for the same search phrase:
As you can see, the API is returning very different results. The first Google search result is the Google+ page, while the API returns the actual website. Then the API returns three Facebook results, whereas Google returns Yelp. Very different.
Here's the sample code I used with the GoogleSearchAPI:
// Build a web-search query from the textbox input.
WebQuery query = new WebQuery(tbQuery.Text);
// Ask for the large result set size.
query.ResultSetSize.Value = ResultSetSize.large;
// Run the search and bind the results to the grid.
IGoogleResultSet<GoogleWebResult> resultSet = GoogleService.Instance.Search<GoogleWebResult>(query);
dgvResults.DataSource = resultSet.Results;
Does anyone know how I can retrieve the exact search results Google returns? I could always resort to scraping, but that's against Google's terms, so I'd need to create workarounds and it gets rather messy; I'd prefer to avoid that if I can.
Thanks
If you are getting results from the API, everything is OK. You can't get the same results as a Google search because everything is personalized based on your cookies, browser history, bookmarks, location, etc. Try searching from two different browsers and you will get different results.
We have a large collection of web content that we want to make searchable by a Google Search Appliance, but we have a fairly complex list of what we do and don't want included. Because a lot of the content is AJAX-like, just having Google do the crawling isn't a solution. Instead, we have a classic ASP page that loops through all of our directories and files using Scripting.FileSystemObject, excludes/includes files and folders, and generates a large list of hyperlinks in a page that Google can then crawl. This process is painfully slow (20 minutes or more), but now we are able to move it to a .NET server.
I'm doing a bit of exploring, wondering what solutions people have found useful for this kind of thing. We're looking at Microsoft.Web.Administration and anything else that will make this more efficient, including writing the resulting list to an HTML file, as in the sketch below.
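Here is a rough sketch of the .NET directory walk we're considering; the root path, exclusion rules, and output location are placeholders:

using System;
using System.IO;
using System.Linq;
using System.Text;

class LinkListGenerator
{
    static void Main()
    {
        // Placeholder paths and filters - adjust to your content root and rules.
        string root = @"C:\inetpub\wwwroot\content";
        string[] excludedFolders = { "temp", "private" };
        string[] includedExtensions = { ".html", ".pdf" };

        var sb = new StringBuilder("<html><body>\n");

        // EnumerateFiles streams results instead of buffering the whole tree,
        // which should be much faster than the FileSystemObject loop.
        foreach (string file in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            if (excludedFolders.Any(f => file.Contains(Path.DirectorySeparatorChar + f + Path.DirectorySeparatorChar)))
                continue;
            if (!includedExtensions.Contains(Path.GetExtension(file), StringComparer.OrdinalIgnoreCase))
                continue;

            // Convert the physical path to a site-relative URL (assumed mapping).
            string url = file.Substring(root.Length).Replace('\\', '/');
            sb.AppendFormat("<a href=\"{0}\">{0}</a><br/>\n", url);
        }

        sb.Append("</body></html>");
        File.WriteAllText(Path.Combine(root, "crawl-list.html"), sb.ToString());
    }
}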
Does anyone with experience with this have any suggestions as to how to approach this?
Thank you in advance.
According to the documentation at https://developers.google.com/search-appliance/documentation/614/admin_crawl/Preparing#robotscs, the Appliance obeys robots.txt. You should be able to add one to the root of the site and configure it to disallow indexing of particular folders or extensions.
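For example, a robots.txt at the site root might look like the following; the folder and extension pattern are placeholders:

# Placeholder rules - block crawling of a folder and an extension.
User-agent: *
Disallow: /private/
# Pattern matching like the next line is a Google extension to robots.txt.
Disallow: /*.tmp$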
Going to the Google Code page, I couldn't find the API I should use to perform a basic web search. All other resources I found point to the Google Base API, but it is no longer available.
What I need is to be able to submit a query string and get back a list of site names. For example, I need to find the first results when searching for "champions league", as if typing the query on the Google page.
What is the correct API to use for text searches these days? Are there any libraries for PHP or C#?
EDIT: I found PHP code on the net that sends requests to ajax.googleapis.com/ajax/services/search/web. I checked it out and it actually returns search results :) Do you know where I can find documentation for this endpoint and which API it is part of? Also, the Custom Search API as suggested by @Rickard doesn't seem to provide this basic functionality. I tried to use it, but it asks me to enter the sites I want to search in. I don't want to search particular sites, but all of them.
Thank you
Check out the Google Custom Search API
I found this to work just fine, since the Google API is great unless you need to search more than 100 times per day, at which point they charge you. This is a simple way around that, but it only works for string searches, not image searches.
Search for "turtle":
// Open the default browser with a Google search for the string.
// Note: the URL needs a scheme, and the query should be URL-encoded.
string searchString = Uri.EscapeDataString("Turtle");
System.Diagnostics.Process.Start("https://www.google.com/search?q=" + searchString);
I want to design my own search engine application where all the results (from Google/Bing etc.) are displayed to the user on one single page, unlike Google, where they are spread over multiple pages.
Do any APIs exist that can get me all those results?
PS: I am using C#, and I am considering the IEnumerator interface for this.
If you just want to be able to serve search results to users, then the APIs provided by search engines are probably the way to go. As already mentioned there's Bing's Live Search API (which I've not used but looks fine), and also Google's Web Search API.
Additionally, there's Yahoo BOSS which I found very easy to use. However, it looks like BOSS is now a paid API - so depending on your budget/intention, it might not suit.
Google's Web Search API is now deprecated, but should still work for a small number of queries - it's the platform that tools like number-of-results counters are built on. It's been replaced by the Google Custom Search API which, depending on your needs, may or may not work for you. I've not used it, but it looks fine, and is free for small numbers of queries.
The problem with crawling and then parsing search pages is that search engines regularly change the underlying html of the search result pages - so any screen scraping approach will be quite brittle. Additionally, the terms of service of most commercial search engines prohibit automated access - if you go ahead anyway they may well block your crawler. These two problems are probably why awesome third party parsing APIs don't really exist.
What you can do is fetch data from the different APIs (Bing/Google etc.) and then display it to the user in one flow. Crawling the search engines yourself, on the other hand, is against their terms of service.
For Google, you can go with the Google Custom Search API or, if you have products to search, the Google Shopping API.
For Bing, there is a simple and straightforward API.
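As an illustration, here is a hedged sketch against the current Azure-hosted Bing Web Search REST endpoint; the endpoint version may change over time, and the subscription key is a placeholder:

using System;
using System.Net.Http;
using System.Threading.Tasks;

class BingSearch
{
    static async Task Main()
    {
        // Placeholder subscription key from the Azure portal.
        string key = "YOUR_SUBSCRIPTION_KEY";
        string query = Uri.EscapeDataString("champions league");

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", key);
            // The JSON response contains a "webPages.value" array of results.
            string json = await client.GetStringAsync(
                "https://api.bing.microsoft.com/v7.0/search?q=" + query);
            Console.WriteLine(json);
        }
    }
}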
Check out NUTCH (Apache Nutch). Is this what you are looking for?
Bing has an open API: http://www.bing.com/developers
Google gives you an API then immediately takes it away: http://code.google.com/apis/websearch/docs/
The Google API is deprecated, and I think they have another one that is even more limited. Once upon a time they had an API that was comparable to Bing's.
For the exact scenario you mentioned, though, the best thing to do is first parse out the number of results, then keep iterating through the pages. You also need to handle errors well, because Google very often lies about the number of results it claims to have.
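As an example, here is a sketch of that paging loop against the Custom Search JSON API, which pages with a start parameter in steps of 10; the key and engine ID are placeholders:

using System;
using System.Net.Http;
using System.Threading.Tasks;

class PagedSearch
{
    static async Task Main()
    {
        string apiKey = "YOUR_API_KEY";     // placeholder
        string engineId = "YOUR_ENGINE_ID"; // placeholder
        string query = Uri.EscapeDataString("champions league");

        using (var client = new HttpClient())
        {
            // Page through results 10 at a time; the API caps start at 91.
            for (int start = 1; start <= 91; start += 10)
            {
                string url = "https://www.googleapis.com/customsearch/v1?key=" + apiKey +
                             "&cx=" + engineId + "&q=" + query + "&start=" + start;
                try
                {
                    string json = await client.GetStringAsync(url);
                    Console.WriteLine(json); // parse the "items" out of the JSON here
                }
                catch (HttpRequestException ex)
                {
                    // The reported total is often unreliable, so treat an error
                    // past the real end of the results as the stop signal.
                    Console.WriteLine("Stopping: " + ex.Message);
                    break;
                }
            }
        }
    }
}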
I am working on the same project.
Generate a sitemap and ping the search engines:
// Requires a reference to System.Web for HttpUtility.
private void SubmitSitemap(string portalName)
{
    // Ping the search engines to let them know we updated our sitemap.
    // The base URL below is a placeholder - substitute your site's address.
    string sitemapUrl = HttpUtility.UrlEncode("http://www.example.com/" + portalName + "/sitemap.xml");

    // Resubmit to Google.
    System.Net.WebRequest reqGoogle = System.Net.WebRequest.Create(
        "http://www.google.com/webmasters/tools/ping?sitemap=" + sitemapUrl);
    reqGoogle.GetResponse().Close();

    // Resubmit to Ask.
    System.Net.WebRequest reqAsk = System.Net.WebRequest.Create(
        "http://submissions.ask.com/ping?sitemap=" + sitemapUrl);
    reqAsk.GetResponse().Close();

    // Resubmit to Yahoo.
    System.Net.WebRequest reqYahoo = System.Net.WebRequest.Create(
        "http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?appid=YahooDemo&url=" + sitemapUrl);
    reqYahoo.GetResponse().Close();

    // Resubmit to Bing.
    System.Net.WebRequest reqBing = System.Net.WebRequest.Create(
        "http://www.bing.com/webmaster/ping.aspx?siteMap=" + sitemapUrl);
    reqBing.GetResponse().Close();
}
Generate a robots.txt file and place it in your root directory. Friendly URLs and other details are also important for this purpose.
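A minimal sketch of generating that robots.txt at deployment time; the disallowed folder, sitemap URL, and output path are placeholders:

using System.IO;

class RobotsTxtGenerator
{
    static void Main()
    {
        // Placeholder rules - allow everything except an admin folder,
        // and point crawlers at the sitemap submitted above.
        string robots =
            "User-agent: *\n" +
            "Disallow: /admin/\n" +
            "Sitemap: http://www.example.com/sitemap.xml\n";

        // Write to the web root (placeholder path).
        File.WriteAllText(@"C:\inetpub\wwwroot\robots.txt", robots);
    }
}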