Google recently published guidelines covering some of the intricacies of how it crawls websites. Although the limit has been in place for years, Google has now stated publicly that it only crawls the first 15 MB of a webpage. Most webpages are well under this size, so it makes sense for Google to continue the practice.
What does the 15 MB rule include?
The rule covers the bytes of the HTML file (the page content) on each indexable page. It does not include images, scripts, and the other resources a page references; those are fetched separately.
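To make the cutoff concrete, here is a minimal Python sketch that checks whether a given piece of content falls inside the first 15 MB of an HTML file. The function name and the marker idea are hypothetical, and the sketch assumes the limit is counted in binary megabytes (15 × 1024 × 1024 bytes); it only illustrates the rule and is not how Google implements it.

```python
# Assumption: "15 MB" is treated as 15 * 1024 * 1024 bytes.
CRAWL_LIMIT_BYTES = 15 * 1024 * 1024


def is_within_crawled_part(html: bytes, marker: bytes,
                           limit: int = CRAWL_LIMIT_BYTES) -> bool:
    """Return True if `marker` appears entirely within the first `limit` bytes."""
    offset = html.find(marker)
    if offset == -1:
        # The marker is not in the document at all.
        return False
    # The marker counts only if its last byte is still before the cutoff.
    return offset + len(marker) <= limit
```

For example, calling is_within_crawled_part(html, b"</main>") on a downloaded page would tell you whether the end of the main content sits before the cutoff.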
What if my page is larger than 15 MB?
That's not good, but the good news is... it probably isn't! The average HTML file is around 30 KB, a tiny fraction of the limit, so if your pages are anywhere near that size, you're in the clear. If you are over the limit, shrink the HTML itself: minify the markup and move large inline scripts, styles, and embedded images into separate files, since those are fetched separately.
How do I check my page size?
You can use Google Chrome's Inspect Element and open the Network tab. You might need to refresh the page for the tab to populate. Look for the main URL with a type of 'document'. In this example it has a size of about 20 KB, so this page is well in the clear!
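If you would rather check from a script than from the browser, the same measurement can be sketched in Python with only the standard library. The function names, the user-agent string, and the 15 × 1024 × 1024 byte reading of "15 MB" are assumptions made for illustration. The sketch downloads only the HTML document, not the images or scripts it references.

```python
import urllib.request

# Assumption: "15 MB" is treated as 15 * 1024 * 1024 bytes.
CRAWL_LIMIT_BYTES = 15 * 1024 * 1024


def fetch_html_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download only the HTML document at `url` and return its raw bytes."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "page-size-check/0.1"}  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def size_report(html: bytes, limit: int = CRAWL_LIMIT_BYTES) -> dict:
    """Summarise the size of an HTML document relative to the crawl limit."""
    size = len(html)
    return {
        "bytes": size,
        "kilobytes": round(size / 1024, 1),
        "percent_of_limit": round(100 * size / limit, 3),
        "within_limit": size <= limit,
    }
```

To use it, call size_report(fetch_html_bytes("https://example.com/")) with your own URL in place of the example one. A page like the 20 KB one above would come back with within_limit set to True.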
How to see if a Webpage is being Indexed
The 100% foolproof method is to set up a Google Search Console tag on your website, but there are a few other ways to check that might save you some time.
The first way, and the quickest, is to do a site search on Google. You can do this by typing "site:[yourdomainhere.com]" into the search bar. This will only show results from your domain. If you see your pages listed, then they are being indexed!
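If you have many domains or pages to check, the search link can be built with a small Python helper. This is only a sketch: the function name is hypothetical, and it simply produces a normal Google search URL for you to open in a browser.

```python
from urllib.parse import urlencode, urlsplit


def site_search_url(page_or_domain: str) -> str:
    """Build a Google search URL for the query site:<domain or page>."""
    # Accept both "example.com" and "https://example.com/blog/".
    if "://" not in page_or_domain:
        page_or_domain = "//" + page_or_domain
    parts = urlsplit(page_or_domain)
    target = parts.netloc + parts.path.rstrip("/")
    return "https://www.google.com/search?" + urlencode({"q": f"site:{target}"})
```

Opening the returned link shows the same results as typing the site search into the search bar by hand.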
The second way is to check the cached version of your page. Google has historically saved a cached copy of each page it crawls in case the live version is down or slow. To do this, do a normal Google search for your page and click on the "Cached" link below the search result. If you see the cached version, then the page is being indexed! Keep in mind that Google removed the "Cached" link from its search results in 2024, so this method only works where the link is still shown.
The third way is to use Google Search Console. This is a free tool provided by Google for webmasters to track their site's progress in the search results. If you see your pages in the "Index Status" report (called "Page indexing" in current versions of Search Console), then they are being indexed!
What this rule means for me
If you have a small website with fairly basic pages, this rule probably won't affect you much. However, if you have a large website with a lot of text on a single page, you should make sure each page's HTML is under 15 MB. If it isn't, that could explain indexing issues you might be facing. Use the methods above to check page sizes and indexing status so you know your pages are being crawled and indexed by Google!