Filtering Focused Information

How search engines tap into the social network of the World Wide Web to produce focused results


Saturday, November 11, 2000

This article continues last month’s piece on search engines (page 71). There we discussed two issues: first, that even the largest engines can’t keep pace with the stupendous growth of the Web, and second, that today’s one-size-fits-all approach to search must yield to user-adaptive searches, which learn from the past behavior of users.

In this article, we’ll look at the technologies behind what can be termed second-generation search engines: engines that exploit the so-called social network of the World Wide Web. These technologies use a combination of statistics, pattern recognition, machine learning, and artificial intelligence to analyze sources of information and extract useful patterns from them.

Filtering information

A common approach to filtering information on the Net uses two strategies: filter by relevance, and filter by quality or popularity. Distilling a general topic to a size that will make sense to a user involves identifying the most definitive or authoritative Web pages on that topic. This is done by locating not only a set of relevant pages, but also those pages that are of the highest quality. Relevance is handled, to an extent, by keyword matching.

Another approach, that of tapping into the social network of the World Wide Web, can also derive notions of authority. The social network can be accessed through hyperlinks, which contain enormous amounts of latent human annotation. Specifically, the creation of a hyperlink by the author of a Web page represents an implicit endorsement of the page it points to. Looking at such endorsements collectively can give a better understanding of the relevance and quality of a page’s contents.
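
To make the idea of collective endorsement concrete, here is a minimal sketch in Python that simply counts how many distinct pages link to each page. The link graph and the page names (a.example and so on) are invented for illustration, and a raw link count is only the crudest possible reading of "endorsement"; it is not how any particular engine scores pages.

```python
from collections import Counter

# Hypothetical toy link graph: each page maps to the pages it links to.
links = {
    "a.example": ["c.example", "d.example"],
    "b.example": ["c.example"],
    "c.example": ["d.example"],
    "d.example": ["c.example"],
}


def count_endorsements(links):
    """Count, for every page, how many distinct other pages link to it."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):      # ignore repeated links from one page
            if target != source:         # a page cannot endorse itself
                counts[target] += 1
    return counts


endorsements = count_endorsements(links)

# Most-endorsed pages first; ties are broken alphabetically.
ranked = sorted(endorsements.items(), key=lambda kv: (-kv[1], kv[0]))
for page, votes in ranked:
    print(f"{page}: {votes} incoming link(s)")
```

A plain count treats every link as equally valuable, which is exactly the weakness that the authority-based schemes described next try to address.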

Google and Clever are two search engines that combine both approaches, filtering and social-network analysis, to search the Internet. Google (www.google.com) was developed at Stanford, while researchers at IBM began the development of the Clever system (www.almaden.ibm.com/cs/k53/clever.html). Google analyzes hyperlinks to uncover the best pages on the Internet on a given topic. Clever goes a step further and generates good starting points for Web navigation on a given topic. Clever returns two types of pages: authorities, which provide the best source of information on a given topic, and hubs, which provide collections of links to authorities.
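
The relationship between hubs and authorities can be sketched as a mutually reinforcing iteration: a page earns authority when good hubs link to it, and earns hub weight when it links to good authorities. The Python below is an illustrative sketch of that idea only, not Clever's actual implementation. The function names, the toy graph, the fixed iteration count, and the choice to normalize scores to unit length are all assumptions made for the example.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hubs_and_authorities(links, iterations=50):
    """Return (hub, authority) score dicts for a dict-of-lists link graph."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority of a page: sum of hub scores of the pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                new_auth[target] += hub[source]

        # Hub score of a page: sum of authority scores of the pages it links to.
        new_hub = {
            p: sum(new_auth[t] for t in links.get(p, [])) for p in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Hypothetical graph: three link collections pointing at three content sites.
links = {
    "portal1": ["site_a", "site_b"],
    "portal2": ["site_a", "site_b", "site_c"],
    "portal3": ["site_a"],
}

hub, auth = hubs_and_authorities(links)
best_hub = max(hub, key=hub.get)
best_authority = max(auth, key=auth.get)
print("best hub:", best_hub)
print("best authority:", best_authority)
```

In this toy graph the portals end up with all the hub weight and the content sites with all the authority, which mirrors the two kinds of pages described above.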

For Google, the measure of authority of a page is proportional to the total authority of all the pages that cite it. We will focus on Clever, since Google can be regarded as a special implementation of Clever.
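
The one-sentence description of Google's measure can be read as a fixed-point condition: each page's score is a constant times the sum of the scores of the pages citing it. A minimal Python sketch of that reading follows. It is a literal rendering of the sentence above, not Google's production algorithm; the function name, the three-page graph, and the choice to rescale the scores to sum to 1 after each pass are assumptions. The toy graph is deliberately chosen so that every page can reach every other, which keeps the repeated passes from collapsing to zero or oscillating.

```python
def citation_authority(links, iterations=100):
    """Iterate 'score = sum of scores of citing pages' until it settles."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    score = {p: 1.0 / len(pages) for p in pages}

    for _ in range(iterations):
        # Each page collects the current score of every page that cites it.
        collected = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                collected[target] += score[source]

        total = sum(collected.values())
        if total == 0:
            break  # no links at all: nothing to propagate
        # Rescale so scores sum to 1; only relative authority matters.
        score = {p: v / total for p, v in collected.items()}

    return score


# Hypothetical graph: a cites b and c, b cites c, c cites a.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
score = citation_authority(links)

for page in sorted(score, key=score.get, reverse=True):
    print(f"{page}: {score[page]:.3f}")
```

Page c comes out on top because it is cited by both other pages, and page a outranks page b because its single citation comes from the highest-scoring page rather than from a weaker one.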

