To keep its results as relevant as possible for its users, Google, the best-known search engine, has a well-defined process for identifying the best web pages. The process is continuously upgraded to make search results even better. It involves the following steps:
1. Crawling, i.e. following the links in order to discover the most significant pages on the web
2. Indexing, i.e. storing information about all the crawled pages for later retrieval
3. Ranking, i.e. determining what every page is about, and how it should rank for relevant queries
Crawling the Web
Search engines rely on crawlers, also known as spiders, that “crawl” the World Wide Web (WWW) to discover the pages that exist. This process is used to identify the best web pages to evaluate for a query. Crawlers travel from page to page by following website links.
Crawling the whole web every day would be impractical, so Google typically spreads its crawl over many weeks. Moreover, search engines like Google do not crawl each and every existing web page. They start with a trusted set of websites that serves as the basis for determining how other websites measure up, and they expand their crawl across the web by following the links they find on the pages they visit.
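The link-following idea can be sketched in a few lines of Python. This is only a toy illustration, not how Google's crawler is built: the `TOY_WEB` dictionary, the `crawl` function, and the `max_pages` budget are all hypothetical names, and the "web" is an in-memory map instead of real HTTP requests.

```python
from collections import deque

# Hypothetical toy "web": each URL maps to the links found on that page.
TOY_WEB = {
    "trusted.example/home": ["trusted.example/about", "blog.example/post"],
    "trusted.example/about": ["trusted.example/home"],
    "blog.example/post": ["shop.example/item", "trusted.example/home"],
    "shop.example/item": [],
    "orphan.example/page": [],  # no page links here, so it is never found
}


def crawl(seeds, get_links, max_pages=100):
    """Breadth-first crawl from trusted seeds; returns pages in visit order."""
    queue = deque(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(queue)
    visited = []
    # The page budget is an assumption standing in for "not every page is crawled".
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited


discovered = crawl(["trusted.example/home"], lambda url: TOY_WEB.get(url, []))
print(discovered)
```

Starting from a single trusted seed, the sketch reaches every page that some visited page links to, while the unlinked `orphan.example/page` stays undiscovered.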
Indexing the Data
Indexing is the act of adding information about a web page to a search engine’s index. The index is a large collection of web pages, i.e. a database, that contains information about the pages crawled by search engine spiders.
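One common way to organize such a database is an inverted index, which maps each word to the pages that contain it. The sketch below assumes that structure purely for illustration; the text above does not say how Google's index is laid out, and `tokenize`, `build_index`, and the `PAGES` sample data are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


# Hypothetical crawled pages: URL -> extracted text.
PAGES = {
    "a.example": "Search engines crawl the web",
    "b.example": "Crawlers follow links on the web",
    "c.example": "Ranking orders search results",
}

index = build_index(PAGES)
print(sorted(index["web"]))
print(sorted(index["search"]))
```

Looking a word up in the index is then a single dictionary access, which is why the information is stored at crawl time instead of being recomputed for every query.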
Ranking the Results
To provide meaningful results to the end user, search engines perform several important steps:
1. Interpreting the intent of the user’s query
2. Identifying web pages in the index that relate to the query
3. Ranking and returning those web pages according to their relevance and importance
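The three steps above can be sketched as a small Python function. Everything here is an assumption made for illustration: the query is "interpreted" simply by splitting it into words, relevance is a raw count of matching words, and the `IMPORTANCE` weights and the multiplication that combines them are invented, not Google's actual formula.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, pages, importance):
    """Return the URLs of matching pages, best first."""
    # Step 1: interpret the query (crudely, as a bag of lowercase words).
    terms = tokenize(query)
    scored = []
    for url, text in pages.items():
        words = tokenize(text)
        # Step 2: keep only pages that mention at least one query term.
        relevance = sum(words.count(term) for term in terms)
        if relevance == 0:
            continue
        # Step 3: weight relevance by importance (assumed scoring formula).
        scored.append((relevance * importance.get(url, 1.0), url))
    # Highest score first; ties are broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


# Hypothetical page texts and importance weights.
PAGES = {
    "a.example": "web search basics and web crawling",
    "b.example": "search engine ranking",
    "c.example": "cooking recipes",
}
IMPORTANCE = {"a.example": 1.0, "b.example": 4.0, "c.example": 5.0}

results = search("web search", PAGES, IMPORTANCE)
print(results)
```

In this toy data, `b.example` matches fewer query words than `a.example` but ranks first because of its higher importance weight, while `c.example` is left out entirely because it matches nothing in the query.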
The details of Google’s strategy are actually more complex, but knowing the basics of crawling, indexing and ranking can help you better understand the methods behind a search engine optimization process.