At the beginning of July 2019, Google released a statement saying that it would no longer support the Noindex directive in robots.txt and would retire all code that handles unsupported & unpublished rules.
So what is Noindex & how did it work?
The Noindex directive simply prevented pages from being indexed. Combined with the Disallow directive, it meant that pages were neither crawled nor indexed – creating crawl efficiency.
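As a sketch, a robots.txt file could combine the two directives like this (the `/private/` path is purely illustrative; Noindex was never part of the official robots.txt specification):

```text
# Retired: the Noindex rule is ignored by Google after September 1st
User-agent: Googlebot
Disallow: /private/
Noindex: /private/
```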
This worked within robots.txt for over 10 years – despite never being an officially supported feature!
Why did Google end support for Noindex?
Gary Illyes, from Google, explained that ‘after running analysis around the use of noindex in robots.txt, they found that a high number of sites were hurting themselves.’ He also stated that the update ‘is for the better ecosystem & those who used it correctly will find better ways to achieve the same thing.’
So what are the better ways?
In the official blog post announcing the change, Google outlines several options that achieve the same noindex effect:
- Noindex Robots Meta Tag: This is the most effective way to remove URLs from the index while still allowing crawling. The directive is supported both in HTTP response headers & in HTML, and is applied by adding a meta robots noindex directive to the actual page.
- 404 & 410 HTTP status codes: These status codes inform Search Engines that the pages do not exist, which will lead to them being dropped from the index – once they have been crawled.
- Password Protection: Prevent Google from accessing a page by hiding it behind a login – this will generally lead to it being removed from the index.
- Disallow in Robots.txt: Blocking a page from being crawled will generally prevent it from being indexed, as Search Engines are only able to index pages they know about. Even so, a page may still be indexed due to links pointing to it – in which case Google will aim to make it less visible in the SERPs.
- Search Console Remove URL Tool: The URL removal tool in GSC is a quick and easy way to temporarily remove a URL from the SERPs.
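For example, the robots meta tag option above is applied in the page's own HTML:

```html
<!-- Placed in the <head> of the page to be deindexed -->
<meta name="robots" content="noindex">
```

The equivalent HTTP header form is `X-Robots-Tag: noindex`, which is particularly useful for non-HTML resources such as PDFs, where a meta tag cannot be added.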
Identify & monitor your noindex robots.txt pages ahead of the September 1st update. Simply run a crawl, using your chosen crawler, and check which of your pages are currently being noindexed and how: through the header, meta tags or robots.txt.
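As a rough sketch of that check, the function below (a hypothetical helper, not part of any crawler's API) inspects a page's response headers and HTML body for noindex signals; robots.txt noindex rules would be found separately by searching the robots.txt file itself:

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content values of <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.directives.append((a.get("content") or "").lower())


def noindex_sources(headers, html):
    """Return which mechanisms declare noindex for a page.

    headers: dict of HTTP response headers
    html:    the page body as a string
    """
    sources = []
    # X-Robots-Tag response header, e.g. "noindex, nofollow"
    if "noindex" in headers.get("X-Robots-Tag", "").lower():
        sources.append("header")
    # <meta name="robots" content="noindex"> in the HTML
    parser = RobotsMetaParser()
    parser.feed(html)
    if any("noindex" in d for d in parser.directives):
        sources.append("meta")
    return sources
```

A page served with an `X-Robots-Tag: noindex` header would report `["header"]`, one carrying only the meta tag would report `["meta"]`, and a clean page reports an empty list.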
From there you can determine what changes to make, but you had best hurry, as the deadline is now only a month away and Google is emailing webmasters via Google Search Console telling them to remove all noindex statements from their robots.txt files.
For now, Googlebot is still obeying the noindex directive – but only until September 1st.
Not sure where to begin? No idea if your site is affected by this? Get in touch and we will run a free audit on your site to determine whether it will be affected and how we can help you update.