Author: eSources | Date: Wed, 4 Mar 2009 5:04 PM

Search Engines Deconstructed Part I

Search engines are the single most used method for finding information on the Internet. Comscore.com's October 2008 research study reports that 'more than 750 million people age 15 and older or 95 percent of the worldwide Internet audience conducted 61 billion searches worldwide in August, an average of more than 80 searches per searcher.'

Ecommerce entrepreneurs have a huge need to win the search engine sweepstakes. The higher your listing appears on search engine results pages, the greater the number of viewers who will read your listing and visit your site.

Although the specific algorithms used by the major search engines (Google, Yahoo, MSN, etc.) are proprietary, and subject to intense investigation by Web watchers, the underlying principles of search engines can be studied. These principles are 'spidering', assessment and storage, retrieval, and ranking. In this first of several articles on search engine optimization, we will look at the spidering and assessment/storage processes.

Web spiders or crawlers

A web spider is an automated program that crawls the Web, gathering URLs and sending them back to a repository, where they are analysed and sorted. Web spiders make it much simpler and more efficient to search the Web because a lot of the work of gathering and sorting has been done days, weeks or even months before you search for that content.

A search engine uses many Web spiders to crawl the pages on the Web, return their contents and index those contents according to the usefulness of the information.

Spiders operate according to a set of rules, e.g.:

* A selection policy that states which pages to download;

* A revisit policy that states when to check for changes to the pages;

* A politeness policy that states how to avoid overloading websites by accessing URLs too frequently;

* A parallelisation policy that states how to coordinate distributed web crawlers, that is, how to avoid too many crawlers accessing the same site at the same time.
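As a rough illustration of how the first three policies might interact, here is a minimal Python sketch of a crawl scheduler. Everything in it is an assumption made for the example: the `CrawlScheduler` class, the host whitelist used as the selection policy, and the `min_delay` and `revisit_after` parameters do not come from any real search engine. It performs no network access and does not model the parallelisation policy.

```python
from collections import deque
from urllib.parse import urlparse


class CrawlScheduler:
    """Toy scheduler showing selection, revisit and politeness policies (hypothetical)."""

    def __init__(self, allowed_hosts, min_delay=2.0, revisit_after=86400.0):
        self.allowed_hosts = set(allowed_hosts)  # selection: assumed host whitelist
        self.min_delay = min_delay               # politeness: seconds between hits on one host
        self.revisit_after = revisit_after       # revisit: seconds before refetching a page
        self.frontier = deque()                  # URLs waiting to be (re)fetched
        self.last_hit = {}                       # host -> time of the last request
        self.last_fetch = {}                     # url -> time of the last fetch

    def add(self, url):
        """Selection policy: queue only new URLs on allowed hosts."""
        if urlparse(url).netloc in self.allowed_hosts and url not in self.frontier:
            self.frontier.append(url)
            return True
        return False

    def next_url(self, now):
        """Return the first queued URL that is due and polite to fetch, else None."""
        for _ in range(len(self.frontier)):
            url = self.frontier.popleft()
            self.frontier.append(url)  # keep the URL queued for later revisits
            host = urlparse(url).netloc
            fetched = self.last_fetch.get(url)
            if fetched is not None and now - fetched < self.revisit_after:
                continue  # revisit policy: page was fetched too recently
            hit = self.last_hit.get(host)
            if hit is not None and now - hit < self.min_delay:
                continue  # politeness policy: host was contacted too recently
            self.last_hit[host] = now
            self.last_fetch[url] = now
            return url
        return None
```

The caller passes the current time in as `now`, so the policies can be exercised without waiting; a real crawler would use a clock and would also need to coordinate several such schedulers.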

Once the spider has retrieved URLs and sent them back to its repository, the pages must be assessed for value.

Storage and Assessment

During page assessment, a second search engine program scans each page sent by the spider, analysing the content of the page, i.e., studying 'on page' factors. This program indexes which words are used, how often they are used and whether or not they receive special emphasis (bold, italicised, used in a heading, part of a link). The results of this analysis are stored in the search engine's document index.
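The kind of analysis described above can be sketched with Python's standard `html.parser` module. This is a simplified illustration, not how any particular search engine works: the `PageIndexer` class, the `index_page` helper and the choice of tags in `EMPHASIS_TAGS` are assumptions made for the example.

```python
from collections import Counter
from html.parser import HTMLParser
import re

# Assumed set of tags that count as "special emphasis" for this sketch.
EMPHASIS_TAGS = {"b", "strong", "i", "em", "h1", "h2", "h3", "a"}


class PageIndexer(HTMLParser):
    """Counts words on a page and notes which ones appear inside emphasis tags."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()   # word -> number of occurrences
        self.emphasised = set()   # words seen inside at least one emphasis tag
        self._open = []           # emphasis tags currently open

    def handle_starttag(self, tag, attrs):
        if tag in EMPHASIS_TAGS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            # Close the most recently opened tag of this name.
            idx = len(self._open) - 1 - self._open[::-1].index(tag)
            del self._open[idx]

    def handle_data(self, data):
        for word in re.findall(r"[a-z0-9']+", data.lower()):
            self.counts[word] += 1
            if self._open:
                self.emphasised.add(word)


def index_page(html):
    """Return (word counts, emphasised words) for one HTML page."""
    parser = PageIndexer()
    parser.feed(html)
    parser.close()
    return parser.counts, parser.emphasised
```

For a page such as `<h1>Wholesale shoes</h1><p>Cheap <b>shoes</b> and bags.</p>`, the counts record 'shoes' twice, and 'wholesale' and 'shoes' are flagged as emphasised while 'cheap', 'and' and 'bags' are not.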

Some of the typical positive on page factors include:

* Keywords located in headings and meta tags;
* Keywords in URL and domain name;
* Keyword density (5 to 20%);
* Keyword proximity (for 2+ keywords).
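Keyword density and proximity from the list above can both be computed from a simple word list. The sketch below assumes single-word keywords, measures density as a percentage of all words on the page, and measures proximity as the smallest number of word positions between two keywords. The function names are made up for the example, and search engines do not publish their own definitions.

```python
import re


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_density(text, keyword):
    """Percentage of the words in text that are equal to keyword."""
    words = tokenize(text)
    if not words:
        return 0.0
    return 100.0 * words.count(keyword.lower()) / len(words)


def keyword_proximity(text, first, second):
    """Smallest word distance between the two keywords, or None if either is absent."""
    words = tokenize(text)
    pos_a = [i for i, w in enumerate(words) if w == first.lower()]
    pos_b = [i for i, w in enumerate(words) if w == second.lower()]
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a in pos_a for b in pos_b)
```

In the ten-word text 'cheap shoes and cheap boots for cheap prices today now', the word 'cheap' appears three times, giving a density of 30%, and 'shoes' and 'boots' are three positions apart.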

Negative on page factors include:

* Mostly graphics, little text;
* Bad language;
* Stolen material;
* Keyword over-density.

The program later analyses 'off page' factors, i.e., the links from the page to other pages and the links from other pages to it.

Positive link strategies include:

* Incoming links from high ranking sites;
* Number of incoming links;
* Age of link;
* Keyword presence in link.
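Two of the factors above, the number of incoming links and keyword presence in the link, can be tallied from a list of crawled links. The sketch below assumes each link is recorded as a `(source, target, anchor_text)` tuple; that record format and the `inbound_link_stats` function are hypothetical. It does not weigh links by the rank of the linking site or by link age.

```python
from collections import defaultdict


def inbound_link_stats(links):
    """Summarise incoming links per target page.

    links: iterable of (source_url, target_url, anchor_text) tuples (assumed format).
    Returns a dict: target_url -> {"count": int, "anchor_words": set of str}.
    """
    stats = defaultdict(lambda: {"count": 0, "anchor_words": set()})
    for source, target, anchor in links:
        if source == target:
            continue  # a page linking to itself is not an incoming link
        stats[target]["count"] += 1
        stats[target]["anchor_words"].update(anchor.lower().split())
    return dict(stats)
```

A page linked from two other sites with the anchor texts 'wholesale shoes' and 'Cheap Shoes' would get a count of 2 and the anchor words 'wholesale', 'shoes' and 'cheap'.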

Negative link strategies include:

* Link buying;
* Cloaking: showing one version of a page or link to the spider and another to users;
* Links to or from bad sites.

Once these analyses have been completed, the search engine can match a user query with web pages that have been dissected into 'component values' based on the search engine's particular algorithm.
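One common way to make that matching fast is an inverted index, which maps each word to the pages that contain it. The sketch below is an assumption about how such matching could work in its simplest form, not a description of any specific search engine. The `build_index` and `match` functions are made up for the example, and the result is an unranked set of pages.

```python
from collections import defaultdict
import re


def build_index(pages):
    """Build an inverted index from a dict of url -> page text."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(url)
    return index


def match(index, query):
    """Return the set of URLs that contain every word in the query (no ranking)."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Because the index is built ahead of time, a query only has to intersect a few precomputed sets and never rereads the pages themselves.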

About the Author

eSources is the Internet's largest database of verified wholesale suppliers, wholesalers, dropshippers, wholesale distributors, importers and manufacturers from the UK and worldwide. In addition to being a wholesale trade resource, the site helps startups and experienced traders in the development and growth of their online and brick and mortar retail businesses.




Creative Commons License
This article is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.