Google wants the Robots Exclusion Protocol (REP) to become an Internet standard

To convince the developer community to adopt the Robots Exclusion Protocol (REP) as an industry standard, Google has decided to encourage interest by open-sourcing its own robots.txt parser, the code Googlebot uses to interpret robots.txt files.
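
As a rough illustration, here is a minimal C++ sketch of how a crawler could use that parser to check whether a URL may be fetched. It relies on the RobotsMatcher class and OneAgentAllowedByRobots method documented in the google/robotstxt repository; the bot name, URL, and rules below are hypothetical, and the exact signatures should be verified against the repository before use.

```cpp
// Sketch: check a URL against robots.txt rules using Google's
// open-sourced parser (https://github.com/google/robotstxt).
#include <iostream>
#include <string>

#include "robots.h"  // header from the google/robotstxt repository

int main() {
  // Hypothetical robots.txt contents fetched from a site's root.
  const std::string robots_body =
      "User-agent: *\n"
      "Disallow: /private/\n";
  const std::string url = "https://example.com/private/page.html";

  // Ask the matcher whether the user agent "MyBot" may fetch the URL.
  googlebot::RobotsMatcher matcher;
  const bool allowed =
      matcher.OneAgentAllowedByRobots(robots_body, "MyBot", url);

  std::cout << (allowed ? "allowed" : "disallowed") << std::endl;
  return 0;
}
```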

The Robots Exclusion Protocol, proposed as a standard by the Dutch software engineer Martijn Koster in 1994, has become the most widely used mechanism for websites to tell automated crawlers which parts of a site should not be processed.

Googlebot, Google's crawler, for example, reads the robots.txt file when indexing a website to check for instructions on which sections it should ignore; if no such file exists in the site's root directory, it assumes it is free to crawl (and index) the entire site. These files are not always used purely for crawl instructions: they are sometimes stuffed with keywords in an attempt to improve search engine optimization.
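
For illustration, a typical robots.txt file placed at a site's root might look like the sketch below. The paths and crawler names are hypothetical; the Sitemap directive is a widely supported extension rather than part of the original 1994 protocol.

```
# Block all crawlers from the admin area
User-agent: *
Disallow: /admin/

# Grant Googlebot an exception for one public subsection
User-agent: Googlebot
Allow: /admin/public/

Sitemap: https://example.com/sitemap.xml
```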

While the Robots Exclusion Protocol is often referred to as a "standard", it has never become a true Internet standard as defined by the Internet Engineering Task Force (IETF), the open, non-profit organization that develops and standardizes Internet protocols.


Google notes that the REP, as it stands, is open to interpretation and does not cover every case (the Internet Archive, for example, stopped honoring it several years ago). For this reason, Google wants the rules to be precisely specified, which would allow its tools to index web pages even more accurately and make its search engine more complete.
