Why Search Engines Are Redefining The Web

October 24, 2005
Copyright Mediaware Infotech Pvt. Ltd.

Search engines began by crawling the web to collect keyword data, and have since progressed to accepting sitemaps from site owners to make that collection easier. Google's recent attempts to accept content directly from users (in addition to web crawling) illustrate a larger point: search engines are redefining the way we use the Internet.

Search Engine Operation
Search engines like Google, Yahoo! and MSN need to maintain an up-to-date database covering almost all information on almost all web-sites. This database is indexed on multiple keywords for speedy searches and is updated frequently. Every text keyword search is processed against this vast database, and the matches are displayed to the user.
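To make the idea of a database "indexed on multiple keywords" concrete, here is a minimal sketch of an inverted index, a common structure for keyword lookup. It is only an illustration, not how any particular search engine works; the pages, URLs and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical pages standing in for crawled web-site content.
PAGES = {
    "http://example.com/cars": "used car for sale in Mumbai",
    "http://example.com/party": "party planning service in Mumbai",
    "http://example.com/news": "articles on current events",
}

def build_index(pages):
    """Map each keyword to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every keyword in the query, sorted."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)

INDEX = build_index(PAGES)
print(search(INDEX, "mumbai car"))
```

Because the index is built ahead of time, answering a query only requires looking up each keyword and intersecting the results, rather than rescanning every page.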


Similar but more complex technologies are available for audio & video searches.

Traditional Methods of Data Aggregation

Search engines collect, collate and index keyword data so that they can offer super-fast "answers" to search queries. The traditional method is web crawling - a process in which software programs (called "spiders") methodically and repeatedly scan the information available on millions of web-sites. The scanned data is then compared with the existing database, and any changes are updated and (re)indexed. By referring to this regularly updated database, a "search request" can be processed in the blink of an eye!
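The compare-and-reindex step described above can be sketched as follows. The article does not say how scanned data is compared with the existing database; comparing content hashes is an assumption made here for illustration, and the function names and URLs are hypothetical.

```python
import hashlib

def fingerprint(text):
    """Return a stable digest of a page's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def pages_to_reindex(stored, scanned):
    """Compare freshly scanned pages with fingerprints from earlier crawls.

    stored  -- dict of URL -> fingerprint (updated in place)
    scanned -- dict of URL -> page text from the current crawl
    Returns the sorted URLs that are new or have changed.
    """
    changed = []
    for url, text in scanned.items():
        digest = fingerprint(text)
        if stored.get(url) != digest:
            stored[url] = digest
            changed.append(url)
    return sorted(changed)

stored = {}
first = pages_to_reindex(stored, {
    "http://example.com/a": "hello",
    "http://example.com/b": "world",
})
second = pages_to_reindex(stored, {
    "http://example.com/a": "hello",
    "http://example.com/b": "world, updated",
})
print(first)   # both pages are new on the first crawl
print(second)  # only the page whose text changed
```

Only pages whose fingerprint differs from the stored one need to be reindexed, which keeps repeated crawls cheap when most pages are unchanged.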

This is how Google, Yahoo! & other search engines have been updating their vast databases comprising millions of web-sites.

Towards More Collaborative Methods
The volume of information collected and updated by any search engine is enormous, to say the least, and it grows by the day. A search engine is as useful to the site owner as it is to the web searcher, so it makes sense to collaborate with site owners to achieve more efficient and wider coverage.

Thus Yahoo!, Google & others launched their "Sitemap" programs.

For sites with dynamic content or pages that require more effort for "spiders" to discover, search engines accept sitemap files which provide more information about web pages. These files give "spiders" the list of URLs on your site and help them keep track of changes - almost like an (additional) overview of your site.

Both Yahoo! and Google have their own versions of sitemap files, which site owners can submit to aid the "spiders" in web crawling.

Google calls this a "collaborative web crawling system", designed to optimize efficiency while improving coverage.
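As an illustration of what a sitemap file might contain, the sketch below generates a small XML sitemap listing URLs together with their last-modified dates and expected change frequency. The element names and namespace follow the public sitemaps.org schema, which is an assumption here: the article does not specify the formats, and each search engine's own version may differ. The URLs and the build_sitemap helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the public sitemaps.org schema; an assumption, since the
# article does not say which format each search engine expects.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Build a sitemap XML string from (url, last_modified, change_frequency) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")

SITEMAP = build_sitemap([
    ("http://www.example.com/", "2005-10-24", "daily"),
    ("http://www.example.com/catalog?item=12", "2005-10-20", "weekly"),
])
print(SITEMAP)
```

A dynamic page such as the catalog URL above is the kind of page a "spider" may not find by following links, which is why listing it explicitly helps coverage.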

The Next Step: Asking Web-site Owners to Submit Data
Search engines like Google need to collect as much data as possible from as many web-sites as possible, so it is natural that they keep trying to do this more efficiently.

Moving further along the collaborative path, the next step is to ask web-site owners to submit data. Being mere aggregators of published information, search engines have no say over the authenticity of content - they have to accept whatever is put out by web-site owners.

So the next (obvious) step is to set up a service through which users can submit data to search engines.
Google has been working on such a service, called "Google Base", on a trial basis - one where site owners can submit content in a structured manner. It is natural to expect other search engines to follow.

Increasing Role of Search Engines in E-Commerce
Search engines are increasingly used to locate information on the web, and product marketers and service providers have started paying for keyword-based listings alongside search results. Google Base seems to be a grand plan to store and index the world’s information. What search engines have been doing so far is storing keywords along with links to the respective web-sites. Beyond the links, the keywords hold no meaning for search engines.

The difference between a search engine and a linked URL is, in any case, perceived as just one more click. Google's imminent plans to create and maintain its own user-submitted database, if successful, will have the same effect as a search engine consistently pointing to a single online classifieds site like (say) eBay. Or (more likely) Google Base!

When Google Base data is presented on Google's search results page, Google Base will have taken the first step towards blurring the distinction between search engines and raw databases. And with feeds that could possibly include podcasts, audio and video, the ultimate effect could be well beyond imagination!

In either case, Google Base serves to demonstrate the growing influence of search engines on the usage of the Internet.

Google Base went live on a trial basis for a short period this week. The new service, unearthed by bloggers who took screenshots before the pages "disappeared", suggests that Google may compete with online classifieds service providers and e-tailing giant eBay.

This is what the Google Base Forum "home page" displayed:

“Post your items on Google. Google Base is Google’s database into which you can add all types of content. We’ll host your content and make it searchable online for free.

Examples of items you can find in Google Base:
Description of your party planning service
Articles on current events from your website
Listing of your used car for sale
Database of protein structures

You can describe any item you post with attributes, which will help people find it when they search Google Base. In fact, based on the relevance of your items, they may also be included in the main Google search index and other Google products like Froogle and Google Local.”

While the site appears to give people the ability to enter content manually, it also lets users - merchants are mentioned specifically - upload information in bulk, presumably through an XML feed formatted to Google's schema. Google already accepts XML feeds from merchants who want their products listed on its Froogle.com site.
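A bulk upload of structured items might look something like the sketch below, in which each item carries a title and a set of descriptive attributes and the whole list is serialized into one XML feed. Google's actual schema is not described in the source, so every element name here (items, item, title, attribute) is hypothetical, as are the sample items and the build_feed helper.

```python
import xml.etree.ElementTree as ET

def build_feed(items):
    """Serialize a list of item dicts into one XML feed string.

    Element names (<items>, <item>, <title>, <attribute>) are hypothetical;
    they are not Google's schema.
    """
    root = ET.Element("items")
    for item in items:
        node = ET.SubElement(root, "item")
        ET.SubElement(node, "title").text = item["title"]
        # Sort attribute names so the output order is predictable.
        for name, value in sorted(item["attributes"].items()):
            attr = ET.SubElement(node, "attribute", name=name)
            attr.text = str(value)
    return ET.tostring(root, encoding="unicode")

FEED = build_feed([
    {"title": "Used car for sale", "attributes": {"price": 250000, "city": "Mumbai"}},
    {"title": "Party planning service", "attributes": {"city": "Mumbai"}},
])
print(FEED)
```

The point of the attributes is that a search can then match on structured fields, such as city or price, rather than on keywords alone.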

Information submitted may end up in the main Google index, on Froogle, or on Google Local.

Google appears to be trying to enter the online classifieds business by creating a layer between web-sites and their databases - a new database of its own (Google SQL?) - possibly with an eye on the ad revenue in it.

According to a recent report by Classified Intelligence, Google has been quietly approaching job boards & other online classifieds service providers, inviting them to submit feeds of their listings.

The potential service has implications for every player that publishes structured data, such as classifieds, product listings or travel information, and especially for specialized search engines like Oodle, Indeed.com, SimplyHired and SideStep, which have indexed such data. All these players have long feared that Google would enter their domain.

Source: http://www.googlebaseforum.com, Classified Intelligence, along with a number of blog sites
