Google uses a trademarked algorithm called PageRank, which assigns each Web page a relevancy score.
PageRank depends on a few factors:
- Frequency and location of keywords within the Web page: If the keyword only appears once within the body of a page, it will receive a low score for that keyword.
- How long the Web page has existed: People create new Web pages every day, and not all of them stick around for long. Google places more value on pages with an established history.
- Number of other Web pages that link to the page in question: Google looks at how many Web pages link to a particular site, and at the value of those links, to determine its relevance.
- Type of content (how relevant the data on the site is to the search terms)
- Quality of content (spell check is used to separate professional sites from sloppy wannabes)
- Freshness of content (sites from 1996 are less likely to be returned before sites from 2013)
- The user's region (no sense returning webpages in another language)
- Legitimacy of the site (whether the page is deemed likely to be spam-related)
- Name and address of the website
- Search word synonyms
- Social media promotions
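As an illustration of the first factor in the list above (keyword frequency and location), here is a minimal sketch of a toy scoring function. It is not Google's actual formula: the function name `keyword_score`, the `title_weight` parameter, and the choice to count a title match three times as heavily as a body match are all assumptions made for the example.

```python
import re


def keyword_score(title: str, body: str, keyword: str, title_weight: float = 3.0) -> float:
    """Toy relevancy score for one keyword on one page.

    Counts whole-word matches in the body, plus matches in the title
    multiplied by title_weight (a hypothetical weighting).
    """
    pattern = re.compile(r"\b" + re.escape(keyword.lower()) + r"\b")

    def count(text: str) -> int:
        return len(pattern.findall(text.lower()))

    return count(body) + title_weight * count(title)


# A page that mentions the keyword once, only in the body, scores low.
low = keyword_score("Welcome", "We sell python books.", "python")

# A page with the keyword in the title and twice in the body scores higher.
high = keyword_score("Python tips", "Python is fun. I like python.", "python")
```

Under these assumptions, a page that uses the keyword only once in its body receives the lowest non-zero score, which matches the behavior described in the list.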
PageRank rates Web pages with a score. A page earns a higher score when the links pointing to it come from important, high-authority sites: high-traffic, well-established pages. Pages with higher scores are then presented higher in the search results list, helping the searcher hit the right target.
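The link-based part of this scoring can be sketched with a small iterative calculation. The code below is a simplified illustration, not Google's production system: it follows the commonly published PageRank formulation, and the function name `pagerank`, the `links` dictionary, the damping factor of 0.85, and the fixed iteration count are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping each page to the pages it links to."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the score spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: "a" and "b" both link to "c"; "c" links back to "a".
scores = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this tiny example, "c" ends up ranked first because two pages link to it, and "a" outranks "b" because its single incoming link comes from the highest-scoring page. That is the sense in which a link from an important page is worth more than a link from an obscure one.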
Google uses automated programs called spiders or crawlers, just like most search engines.
Spiders scan Web pages and create indexes of keywords. Once a spider has visited, scanned and categorized a page, it follows links from that page to other sites. The spider continues to crawl from one site to the next, so the search engine's index becomes more comprehensive and robust over time.
Crawlers visit Web sites and read their pages and other information in order to create entries for a search engine index.
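The visit, scan, and follow-links cycle described above can be sketched as a short breadth-first crawl. To keep the example self-contained, it walks a small in-memory dictionary instead of fetching real pages; the function name `crawl`, the `web` structure, and the example URLs are hypothetical, and a real crawler would also need to handle fetching, politeness rules, and errors.

```python
import re
from collections import deque


def crawl(web, start):
    """Crawl an in-memory 'web' and build a keyword index.

    web maps a URL to a (page_text, outgoing_links) tuple.
    Returns (index, visited): index maps each keyword to the set of URLs
    containing it, and visited lists the URLs in the order they were crawled.
    """
    index = {}
    visited = []
    seen = {start}
    queue = deque([start])

    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # link points to a page we cannot reach
        text, links = web[url]
        visited.append(url)

        # Scan the page: record every distinct word under this URL.
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            index.setdefault(word, set()).add(url)

        # Follow links to pages that have not been queued yet.
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)

    return index, visited


# Hypothetical three-page web used only for illustration.
web = {
    "a.example": ("Search engines crawl", ["b.example", "c.example"]),
    "b.example": ("Crawl the web", ["a.example"]),
    "c.example": ("web index", []),
}
index, visited = crawl(web, "a.example")
```

Starting from a single page, the crawl reaches every linked page exactly once, and looking up a word such as `index["crawl"]` returns the set of pages that contain it.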