An Overview of Google’s Different Crawler Types

11 Jul, 2023 Technical

Google recently updated its official documentation on its different types of crawlers, adding an overview of user-triggered fetchers. Below is a summary of what crawlers are and the types outlined in the updated documentation.

What is a Google Crawler?

Search engines like Google use crawlers to scan the web and discover new sites, pages, and content to add to their index. Crawlers, also known as “search bots” or “spiders,” work by following links from one page to another to find web pages. Google’s crawlers work in two phases. First, they crawl the web to find new content. When they find something that is not already in the index, the software indexes the page and ranks it according to various criteria. Google runs this process continuously and keeps its index up to date.
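To make the link-following idea concrete, here is a toy crawler sketch in Python. It is not Google’s implementation; the seed URL, page limit, and link-handling choices are arbitrary simplifications for illustration:

    import urllib.request
    from html.parser import HTMLParser
    from urllib.parse import urljoin, urlparse

    class LinkParser(HTMLParser):
        """Collects href targets from anchor tags."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(seed, max_pages=10):
        seen = set()        # stands in for the search engine's index
        frontier = [seed]   # queue of pages still to visit
        while frontier and len(seen) < max_pages:
            url = frontier.pop(0)
            if url in seen:
                continue
            seen.add(url)
            try:
                # a real crawler would check robots.txt before fetching
                with urllib.request.urlopen(url, timeout=5) as resp:
                    html = resp.read().decode("utf-8", errors="replace")
            except Exception:
                continue  # skip pages that fail to load
            parser = LinkParser()
            parser.feed(html)
            for link in parser.links:
                absolute = urljoin(url, link)
                if urlparse(absolute).scheme in ("http", "https"):
                    frontier.append(absolute)
        return seen

    print(crawl("https://www.example.com/"))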

The Different Types

Google has more than 15 types of crawlers, and they fall into three categories: Googlebot, special-case crawlers, and user-triggered fetchers. Robots.txt rules can limit which URLs each crawler is allowed to access.
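In practice, a site owner names a crawler in robots.txt to allow or restrict it. A minimal sketch, assuming hypothetical paths on an example site:

    # Block every crawler from a hypothetical private area
    User-agent: *
    Disallow: /private/

    # Give Googlebot's image crawler a narrower rule
    User-agent: Googlebot-Image
    Disallow: /photos/drafts/

    Sitemap: https://www.example.com/sitemap.xml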

Googlebot

This is Google’s main crawler, and it always respects robots.txt rules. Examples include Googlebot Smartphone, Googlebot Desktop, Googlebot Image, Googlebot News, and Googlebot Video.

Special-Case Crawlers

These crawlers perform more specific functions and may or may not respect robots.txt rules. Examples include APIs-Google, AdsBot Mobile Web, AdsBot, and AdSense.
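One documented quirk worth noting: Google’s documentation states that AdsBot ignores the global user agent (*) in robots.txt, so restricting it requires naming it explicitly. A short sketch with a hypothetical path:

    # AdsBot does not follow rules under "User-agent: *",
    # so it must be named explicitly
    User-agent: AdsBot-Google
    Disallow: /landing-page-tests/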

User-Triggered Fetchers

These fetchers run only when a user requests a fetch, and because they act on behalf of that user, they ignore robots.txt rules. Examples include Feedfetcher, Google Publisher Center, Google Read Aloud, and Google Site Verifier.

Google’s documentation now provides an overview of all three categories, and website publishers should be familiar with these crawlers when publishing new sites and content.