What is crawl budget? (Level up your SEO)
Crawl budget is one of those SEO concepts that is easy to understand but hard to act on. Put simply, your crawl budget is the number of pages on your website that Googlebot crawls and indexes within a given period of time.
So how is your crawl budget determined? Should you be doing crawl budget optimization? What factors influence your site’s crawl budget? And is there an actual crawl limit you should know about? Read on to find out!
Crawl budget is an important but often misunderstood part of SEO, and so is knowing when (and how) to increase the Google crawl rate, or even slow it down.
You’ll hear people say things like ‘be careful of your crawl budget!’ or ‘don’t publish too many articles, you might screw up your crawl budget!’
But truth be told, most of these opinions are outdated and don’t reflect modern-day SEO.
Don’t believe me? Just ask John Mueller, Senior Webmaster Trends Analyst at Google.
BUT while crawl budget is (in the words of @JohnMu) ‘over-rated,’ it’s still an important part of SEO to understand, especially for people running new sites or enormous sites (like eCommerce sites).
Here’s why…
So, what is my crawl budget?
Crawl budget refers to the number of pages crawled and indexed by Googlebot on your website during a set period of time.
Let’s break it down.
The ‘crawl’ part of the crawl budget
In essence – every day, Google sends out an army of robots (commonly referred to as ‘spiders’ or search engine bots) who scour the internet looking for new or updated content.
When they do this, it is called ‘crawling.’ To rank on search engines, your pages need to be crawled and indexed.
Crawling is how websites get found and then ranked on Google search.
The ‘budget’ part of the crawl budget
As with any budget, your crawl budget is about resources. While Google is mighty and powerful, it still has its limits: the spiders (those funkily named search engine bots) can only crawl so much, so fast, and only so many pages each day.
Your crawl budget is directly linked to the resources a crawler spends on your site and to your server’s capacity.
Usually, crawl budget is nothing to worry about. In a few specific cases, though, it’s essential to look at.
When should you be concerned about crawl budget?
Popular sites don’t have to worry about crawl budget. This is because Google is familiar with them and crawls them often.
The two types of sites that DO need to worry about crawl budget?
#1 – New websites that are not yet ranking on Google
Why?
That’s because even if your server has the capacity to support more crawling, search engine bots won’t crawl a brand-new site much. That’s simply because Google hasn’t assessed it as a reputable source just yet.
You’ll know if this is you by simply logging into your Google Analytics or Google Search Console account: you’ll see little organic traffic and few of your pages showing up in Google’s search results.
#2 – Enormous websites with hundreds of thousands or millions of pages
Why?
Because there may be so many pages that Google misses some (or at least takes a while to pick them up).
Ways to verify crawl activity
So if you run a new website or an enormous one, or just want to double-check that your site is being crawled adequately, the process is super simple.
For an overview of Google crawl activity, head to the Crawl Stats report in Google Search Console. It is an official Google report that shows you crawling behavior, crawling issues, and changes in crawling over time, i.e. exactly how the Google search engine crawls your site.
If your site has a more complex setup, you will have to access and store data from raw server log files, potentially using software such as Elasticsearch, Logstash, or Splunk.
Very helpful stuff!
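If you do want to dig into raw logs yourself, a short script can give you a first look at Googlebot’s activity. Below is a minimal sketch in Python; the log path, the combined log format, and the top-20 cutoff are all assumptions to adapt to your server, and user-agent strings can be spoofed, so treat the output as a rough overview rather than verified Googlebot traffic.

```python
import re
from collections import Counter

# Minimal sketch: count Googlebot hits per URL in a combined-format
# access log. The path below is a hypothetical placeholder.
LOG_PATH = "access.log"

# Combined log format:
# IP - - [time] "METHOD /path HTTP/1.1" status size "referer" "user-agent"
line_re = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) HTTP/[^"]*" (?P<status>\d{3}) .*"(?P<ua>[^"]*)"$'
)

hits = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as f:
    for line in f:
        m = line_re.search(line)
        if m and "Googlebot" in m.group("ua"):
            hits[m.group("path")] += 1

# Print the 20 most-crawled URLs
for path, count in hits.most_common(20):
    print(f"{count:6d}  {path}")
```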
How does Google adjust the site crawl?
No two crawl budgets are the same; Google weighs several inputs when deciding how much Googlebot crawls. The two big ones are crawl demand and the crawl rate limit.
Crawl demand
Crawl demand is how much Google wants to crawl your pages. It is based on the popularity of your pages and on how fresh your content is in Google’s index.
Googlebot loves optimized, well-linked pages; these get crawled and indexed more readily than others.
Keep in mind that Google is trying to index pages from every website on the internet. To stand out, you must make sure your pages are search-engine optimized and regularly updated.
Crawl rate limit
The crawl rate limit is the maximum amount of crawling your website can keep up with before server stability suffers.
Google will automatically adjust its crawl rate based on your website’s ability to handle it.
How to increase the Google crawl rate
To improve both crawl demand and your crawl rate limit, you will need to work on several fronts. To do crawl budget optimization, follow the steps below:
Speed up your server and its resources
The faster Google can connect to your server and download pages and resources, the more it can crawl in the same amount of time. Make it easy for Googlebot to connect, and serve every resource quickly.
Improve internal links and external links too
Crawl demand depends on your pages’ popularity and the quality of their links. To increase your crawl budget, increase the number of internal links between your pages and earn more external links pointing to your site.
If you don’t know what links to add, head to Site Audits to find the best link opportunities.
Fix broken and redirected links
Broken and redirected links on your website impact your crawl budget negatively. Do an audit, clear all link-related issues, and point links directly at live, high-priority pages. It will boost the effective crawl budget of your website.
Use the “All issues” report in Site Audits to identify broken and redirected links.
Use GET instead of POST where you can
This one involves HTTP request methods: use GET requests as much as possible. That is because GET responses can be cached (by browsers, CDNs, and crawlers alike), while POST responses generally are not.
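To make that concrete, here’s a minimal sketch (Python and Flask, with a hypothetical /products endpoint) of exposing a filtered view as a cacheable GET URL instead of a POST form:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical product-filter endpoint. Exposing it as GET with query
# parameters (instead of a POST form) gives every filtered view a
# stable URL that browsers, CDNs, and crawlers can fetch and cache.
@app.get("/products")
def products():
    category = request.args.get("category", "all")
    items = ["example-product-1", "example-product-2"]  # stand-in data
    resp = jsonify(category=category, items=items)
    # GET responses may carry caching headers; POST responses are
    # generally not cached by intermediaries.
    resp.headers["Cache-Control"] = "public, max-age=3600"
    return resp
```

Each category now lives at a URL like /products?category=shoes that can be requested, cached, and re-crawled cheaply.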
Use the Indexing API
Check if your pages are eligible for the Google Indexing API. It lets you automatically notify Google when pages are added to or deleted from your website. Note that this technique is only valid for job postings and broadcast/live-video event pages.
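If your pages do qualify, a notification is a single authenticated POST to the API. Here’s a minimal sketch, assuming you have the google-auth library installed, a service account key saved as service_account.json, and a hypothetical job-posting URL:

```python
from google.oauth2 import service_account
from google.auth.transport.requests import AuthorizedSession

# Minimal sketch of notifying Google's Indexing API that a URL changed.
# Assumes a service account (key file path below is a placeholder) with
# ownership of the site verified in Search Console. Remember: Google
# only supports this API for job postings and broadcast/live-video pages.
SCOPES = ["https://www.googleapis.com/auth/indexing"]
ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

credentials = service_account.Credentials.from_service_account_file(
    "service_account.json", scopes=SCOPES
)
session = AuthorizedSession(credentials)

body = {
    "url": "https://example.com/jobs/12345",  # hypothetical URL
    "type": "URL_UPDATED",  # or "URL_DELETED" when a page is removed
}
response = session.post(ENDPOINT, json=body)
print(response.status_code, response.json())
```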
How to slow down Google’s crawl
There are a limited number of practical ways to lower your crawl rate. Slowing crawling down may be needed when Googlebot is straining your server, and it only takes a few adjustments to your website.
Below are our recommended adjustments:
- Use the rate limiter tool in Google Search Console for slow but guaranteed results.
- Reduce Google’s crawl rate by returning ‘503 Service Unavailable’ or ‘429 Too Many Requests’ status codes from your pages for fast but risky results (see the sketch after this list).
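As an illustration of that second option, here’s a minimal sketch (Python/Flask, with a hypothetical load threshold) that answers Googlebot with a 503 and a Retry-After hint only while the server is overloaded. Serve 503s sparingly: Google may eventually drop URLs that return them for a long time.

```python
import os
from flask import Flask, Response, request

app = Flask(__name__)
LOAD_THRESHOLD = 4.0  # hypothetical 1-minute load-average cutoff

# When the server is overloaded, answer crawlers with a 503 and a
# Retry-After hint so Googlebot temporarily backs off.
@app.before_request
def shed_crawler_load():
    ua = request.headers.get("User-Agent", "")
    load_1min = os.getloadavg()[0]  # Unix-only load average
    if "Googlebot" in ua and load_1min > LOAD_THRESHOLD:
        return Response(
            "Temporarily overloaded, please retry later.",
            status=503,
            headers={"Retry-After": "3600"},  # seconds
        )
```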
Final thoughts
If you’re having an issue with page indexing or crawling, be sure to reach out to us to speak with a digital strategist. They can walk you through the entire process and suggest how to improve your crawl budget (and rankings) moving forward. Please also read over our SEO services; they can really help you improve your SEO and get ahead of your competition.
FAQs on crawl budget and search engine crawling
How important is an XML sitemap for site crawling?
All of the pages on your site should be listed in an XML sitemap file so that Google and its search engine bots can easily identify and index them all. A sitemap also helps Google understand the structure of your site. So yes, it is important! If you want Google to index your site well, support your crawl budget with a complete XML sitemap of your web pages.
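For reference, a minimal sitemap file looks like this (the URLs and dates below are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2023-01-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/blog/what-is-crawl-budget</loc>
    <lastmod>2023-01-10</lastmod>
  </url>
</urlset>
```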
Does page speed affect site crawling?
Yes, page speed does affect a site’s crawl budget. Google has said that website speed (including page speed) is one of the signals its algorithm uses to rank sites. The faster your pages load, the more pages Google can crawl at once. And since web pages load incrementally rather than all at once, a site’s usefulness depends greatly on how quickly each page finishes loading.
Does duplicate content affect site crawling?
Yes. Duplicate content wastes crawl time and therefore your crawl budget: it means the search engine bots are spending time crawling the same thing across a number of pages instead of discovering new ones.