Googlebot Crawls and Indexes First 15MB HTML Content Only: Should You Be Worried?

Last updated on Dec 23rd, 2022 | 6 min

Google has updated its Googlebot help documentation to specify that Googlebot will crawl up to the first 15MB of the page and then stop:

“Googlebot can crawl the first 15MB of an HTML file or supported text-based file. Any resources referenced in the HTML such as images, videos, CSS, and JavaScript are fetched separately. After the first 15MB of the file, Googlebot stops crawling and only considers the first 15MB of the file for indexing. The file size limit is applied to the uncompressed data. Other crawlers may have different limits.”


As it turns out, this isn’t a new change that Google tried to slip in under the radar. John Mueller has recently confirmed that Googlebot has been indexing only the first 15MB for quite some time; it just wasn’t officially documented:

John Mueller Tweet


Although we’ve been living with this “limitation” for who knows how long, and the web is working just fine, some people from the SEO community have expressed their concerns. So let’s debunk a couple of myths.
 

What This Means for SEO

In theory, it sounds concerning that you could potentially have content that doesn’t get used for indexing. In practice, however, 15MB is considered a huge amount of HTML.

According to HTTP Archive, as of June 1, 2022, the median HTML bytes requested per page on desktop and mobile are as follows:

HTTP Archive HTML Bytes per page


So if you’re still concerned that your site will be negatively affected, don’t be. As Google says:

“There are very few pages on the internet that are bigger in size. You, dear reader, are unlikely to be the owner of one, since the median size of an HTML file is about 500 times smaller: 30 kilobytes (kB).”


In fact, SEO best practices currently recommend keeping HTML pages to 100 KB or less. If you run an eCommerce website, having a 150-200 KB HTML page is also acceptable. 

If you do have a web page with more than 15MB of HTML, structure your code so that the SEO-relevant information sits within the first 15MB of the HTML or supported text-based file.
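If you want to check where a key piece of content sits, a minimal sketch like the one below can help. It assumes that 15MB means 15 * 1024 * 1024 bytes and that the HTML is UTF-8 encoded; the function name within_limit and the marker string are hypothetical, chosen only for illustration.

```python
# Assumption: "15MB" is interpreted as 15 * 1024 * 1024 bytes.
LIMIT_BYTES = 15 * 1024 * 1024


def within_limit(html: str, marker: str, limit: int = LIMIT_BYTES) -> bool:
    """Return True if the first occurrence of `marker` ends within `limit` bytes.

    Offsets are measured in bytes of the UTF-8 encoded document, because the
    limit is a byte limit, not a character limit.
    """
    data = html.encode("utf-8")
    needle = marker.encode("utf-8")
    offset = data.find(needle)
    if offset == -1:
        raise ValueError("marker not found in document")
    return offset + len(needle) <= limit
```

For example, you could pass the page source and your main heading as the marker, then move the heading (and the content around it) higher in the document if the function returns False.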

But to be honest, 15MB of HTML is a lot, so you might want to follow Google’s recommendations:

“If you are the owner of an HTML page that's over 15MB, perhaps you could at least move some inline scripts and CSS dust to external files, pretty please.”
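To see how much of your HTML is actually inline scripts and CSS, you could count those bytes before deciding what to move. The sketch below uses Python's standard html.parser module; the class and function names are hypothetical, and it only counts the text inside script tags without a src attribute and inside style tags (inline style attributes are not counted).

```python
from html.parser import HTMLParser


class InlineAssetCounter(HTMLParser):
    """Count bytes of inline <script> and <style> content in an HTML document."""

    def __init__(self):
        super().__init__()
        self._current = None  # tag whose inline content is being read, if any
        self.inline_bytes = {"script": 0, "style": 0}

    def handle_starttag(self, tag, attrs):
        # A <script> with a src attribute is external, so it is skipped.
        if tag == "style" or (tag == "script" and "src" not in dict(attrs)):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.inline_bytes[self._current] += len(data.encode("utf-8"))


def inline_asset_bytes(html: str) -> dict:
    """Return the inline script and style byte counts for `html`."""
    counter = InlineAssetCounter()
    counter.feed(html)
    counter.close()
    return counter.inline_bytes
```

If the counts make up a large share of the document, moving that code to external files shrinks the HTML itself, since those files are fetched separately.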

But what if your content is buried under 15MB of images? 

Images question tweet

The crawling and indexing limit concerns only the HTML file itself:

John Mueller response

But the truth is that if your HTML is 15MB or greater, you have more severe problems than your site’s SEO.  


What 15MB HTML Means for Your Site’s Performance

It’s highly possible that your website will not be usable, which means your visitors will have an awful or non-existent experience.

As a good rule of thumb, if a testing tool has a hard time fetching your site’s HTML, you should consider applying some changes.

Paul Calvano tweet


The same happened when Paul Calvano, performance architect at Etsy, tried to test the site with Chrome DevTools:

Paul Calvano devtools results

Undoubtedly, 118MB of HTML is a ridiculous size that will negatively affect any website.

However, aiming for the smallest possible HTML can also adversely affect your site’s performance. 

For instance, removing valuable items from your HTML to reduce its size might lead to lower user engagement.

The gist of the whole HTML size talk is to strike the right balance between keeping your code lean and providing your visitors with an excellent user experience. 

I know that’s easier said than done, but you can achieve it using NitroPack. 

But more on that later. 

For now, let’s see how you can find your site’s HTML size.


How to Analyze Your Site’s HTML

There are many ways to find your site’s HTML size, but the easiest one would be to open the Developer Tools of your browser.

Here’s what that looks like in Chrome. 

Right click, then select Inspect:

Inspect

Open the Network tab and refresh:

Network tab

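Then, the top request should be your site’s HTML document. What you’re looking for is in the Size column (by default, it shows the transferred size, which is smaller than the uncompressed size that Googlebot’s limit applies to when your server compresses responses):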

HTML file size
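If you prefer a script, here is a rough sketch that measures the uncompressed HTML size, since the limit applies to the uncompressed data. It uses only Python's standard library; the function names are hypothetical, it assumes 15MB means 15 * 1024 * 1024 bytes, and it only handles gzip-encoded or unencoded responses.

```python
import gzip
import urllib.request

# Assumption: "15MB" is interpreted as 15 * 1024 * 1024 bytes.
GOOGLEBOT_LIMIT_BYTES = 15 * 1024 * 1024


def fetch_html_bytes(url: str) -> bytes:
    """Download a page and return its body after undoing gzip compression."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request, timeout=30) as response:
        body = response.read()
        # urllib does not decompress automatically, so do it here if needed.
        if response.headers.get("Content-Encoding", "").lower() == "gzip":
            body = gzip.decompress(body)
    return body


def describe_size(body: bytes) -> str:
    """Summarize the uncompressed size relative to the 15MB limit."""
    size = len(body)
    share = size / GOOGLEBOT_LIMIT_BYTES * 100
    verdict = "over" if size > GOOGLEBOT_LIMIT_BYTES else "within"
    return f"{size / 1024:.1f} KB uncompressed, {share:.2f}% of the limit ({verdict})"
```

You would call it as describe_size(fetch_html_bytes("https://example.com/")), replacing the placeholder URL with your own page.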


Optimize Your Site’s HTML (Automatically) with NitroPack

Finding your site’s HTML size is the easier part of the equation. 

Applying code optimization techniques to reduce it is where it gets tricky, especially if you don’t have the technical expertise. 

The great news is that you don’t have to be a developer to optimize your code. 

You can install NitroPack and see your HTML getting optimized automatically. 

After installing NitroPack in less than 3 minutes, our service will start applying techniques like: 

  • Code compression, which means applying algorithms that re-encode files using fewer bits than the original.
  • Code minification, which means removing unnecessary parts like whitespace and comments from the code.

These come along with numerous other optimizations that help your site pass Core Web Vitals, improve user experience, and boost your conversion rates.
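To get a feel for what compression and minification do to file size, here is a deliberately naive sketch. It is not how NitroPack works internally; it is an illustration using Python's standard library, and the minifier is a simplification that would break whitespace-sensitive content such as pre blocks or inline scripts.

```python
import gzip
import re


def naive_minify(html: str) -> str:
    """Strip HTML comments and collapse whitespace runs (illustration only)."""
    without_comments = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    return re.sub(r"\s+", " ", without_comments).strip()


def size_report(html: str) -> dict:
    """Compare byte sizes before and after minification and gzip compression."""
    original = html.encode("utf-8")
    minified = naive_minify(html).encode("utf-8")
    return {
        "original": len(original),
        "minified": len(minified),
        "gzip": len(gzip.compress(original)),
        "minified_gzip": len(gzip.compress(minified)),
    }
```

Keep in mind that Googlebot's limit applies to the uncompressed data, so of the two techniques only minification reduces the size that counts toward the 15MB. Compression reduces what travels over the network.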

But don’t take our word for it. Test your website with NitroPack for free

Niko Kaleev
User Experience Content Expert

Niko has 5+ years of experience turning those “it’s too technical for me” topics into “I can’t believe I get it” content pieces. He specializes in dissecting nuanced topics like Core Web Vitals, web performance metrics, and site speed optimization techniques. When he’s taking a breather from researching his next content piece, you’ll find him deep into the latest performance news.