    Google May Expand Unsupported Robots.txt Rules List

By admin, April 23, 2026

    Google may expand the list of unsupported robots.txt rules in its documentation based on analysis of real-world robots.txt data collected through HTTP Archive.

    Gary Illyes and Martin Splitt described the project on the latest episode of Search Off the Record. The work started after a community member submitted a pull request to Google’s robots.txt repository proposing two new tags be added to the unsupported list.

    Illyes explained why the team broadened the scope beyond the two tags in the PR:

    “We tried to not do things arbitrarily, but rather collect data.”

    Rather than add only the two tags proposed, the team decided to look at the top 10 or 15 most-used unsupported rules. Illyes said the goal was “a decent starting point, a decent baseline” for documenting the most common unsupported tags in the wild.

    How The Research Worked

    The team used HTTP Archive to study what rules websites use in their robots.txt files. HTTP Archive runs monthly crawls across millions of URLs using WebPageTest and stores the results in Google BigQuery.

    The first attempt hit a wall. The team “quickly figured out that no one is actually requesting robots.txt files” during the default crawl, meaning the HTTP Archive datasets don’t typically include robots.txt content.

    After consulting with Barry Pollard and the HTTP Archive community, the team wrote a custom JavaScript parser that extracts robots.txt rules line by line. The custom metric was merged before the February crawl, and the resulting data is now available in the custom_metrics dataset in BigQuery.
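The actual metric is a JavaScript snippet run during HTTP Archive's crawl, and its source isn't quoted in the episode. As a rough illustration of the same idea, here is a minimal Python sketch that pulls field–value pairs out of a robots.txt body line by line (the function name and regex are assumptions, not the real metric):

```python
import re

# Illustrative sketch only: extract every "field: value" line from a
# robots.txt body, ignoring comments and lines that don't fit the pattern.
RULE_RE = re.compile(r"^\s*([A-Za-z-]+)\s*:\s*(.*?)\s*$")

def extract_rules(robots_txt: str):
    """Return (field, value) pairs for lines matching field-colon-value."""
    rules = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        m = RULE_RE.match(line)
        if m:
            rules.append((m.group(1).lower(), m.group(2)))
    return rules
```

A parser this shape would also explain the junk the team found: an HTML error page served at /robots.txt simply yields no matching lines, or a handful of accidental ones.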

    What The Data Shows

    The parser extracted every line that matched a field-colon-value pattern. Illyes described the resulting distribution:

    “After allow and disallow and user agent, the drop is extremely drastic.”

    Beyond those three fields, rule usage falls into a long tail of less common directives, plus junk data from broken files that return HTML instead of plain text.

Google currently supports four fields in robots.txt: user-agent, allow, disallow, and sitemap. The documentation says other fields “aren’t supported” without listing which unsupported fields are most common in the wild.
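For reference, a file using only the four supported fields looks like this (example.com URLs are placeholders); a line such as `Crawl-delay: 10` added below it would be ignored by Google's parser:

```
User-agent: *
Disallow: /private/
Allow: /private/public-page.html

Sitemap: https://example.com/sitemap.xml
```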

    Google has clarified that unsupported fields are ignored. The current project extends that work by identifying specific rules Google plans to document.

    The top 10 to 15 most-used rules beyond the four supported fields are expected to be added to Google’s unsupported rules list. Illyes did not name specific rules that would be included.

    Typo Tolerance May Expand

    Illyes said the analysis also surfaced common misspellings of the disallow rule:

    “I’m probably going to expand the typos that we accept.”

    His phrasing implies the parser already accepts some misspellings. Illyes didn’t commit to a timeline or name specific typos.
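Google hasn't published the full list of misspellings its parser tolerates, but the mechanism would amount to normalizing field names before matching. A hedged sketch, with an entirely illustrative typo list:

```python
# Assumed, illustrative typo list -- not the actual set Google accepts.
TYPO_MAP = {
    "disalow": "disallow",
    "dissallow": "disallow",
    "useragent": "user-agent",
}

def normalize_field(field: str) -> str:
    """Map a possibly misspelled field name to its canonical form."""
    key = field.strip().lower()
    return TYPO_MAP.get(key, key)
```

Expanding typo tolerance would then be a matter of adding entries to such a table based on the misspellings the HTTP Archive data shows are common.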

    Why This Matters

Search Console already flags some unrecognized robots.txt tags. If Google documents more unsupported directives, its public documentation would more closely match what site owners already see flagged there.

    Looking Ahead

    The planned update would affect Google’s public documentation and how disallow typos are handled. Anyone maintaining a robots.txt file with rules beyond user-agent, allow, disallow, and sitemap should audit for directives that have never worked for Google.

    The HTTP Archive data is publicly queryable on BigQuery for anyone who wants to examine the distribution directly.


Featured Image: Screenshot from YouTube.com/GoogleSearchCentral, April 2026.
