# noindex / nofollow Guide (Without Breaking Your SEO)

Most pages that drop out of Google in 2026 were not penalized. They were deindexed by their own owners — by a stray noindex shipped in a template, a robots.txt block that hid the directive from crawlers, or an X-Robots-Tag header set by a CDN nobody remembers configuring. This is the modern robots meta tag playbook: what each directive does, what AI search engines do with it, and the eight mistakes that quietly cost rankings.
The robots meta tag is a single line that lives in the `<head>` of a page and tells search crawlers what they can do with the page after they have already fetched it. The two directives that matter for SEO are index/noindex (whether the URL is allowed in the search index) and follow/nofollow (whether the outbound links on the page should be crawled and credited).
```html
<meta name="robots" content="index, follow">
```
That is also the implicit default, so most pages do not need the tag at all. You add the tag specifically when you want behavior that differs from the default.
- noindex — strongest signal. Removes the URL from search results within one to three crawls.
- nofollow — a hint, not a directive, since 2019. Use defensively, not strategically.
- noarchive — blocks the cached snapshot link. Still respected, mostly irrelevant.
- nosnippet — prevents Google from showing any snippet at all. Useful for paywalls.
- max-snippet, max-image-preview, max-video-preview — fine-grained controls for AI Overviews and rich results.

## robots.txt vs. noindex

This is the single most common mistake on the web: blocking a URL in robots.txt and assuming it will be deindexed. It will not. A Disallow in robots.txt tells the crawler not to fetch the page. A noindex in the meta tag tells the crawler what to do after it fetches the page. If you block crawling, the crawler never sees the noindex, and the URL can stay in the index for months — often appearing in results as a bare URL with no title and no description.
| You want to… | Use this |
|---|---|
| Stop a page from appearing in search | `<meta name="robots" content="noindex">` |
| Save crawl budget on huge low-value sections | robots.txt `Disallow` |
| Both — block crawl and indexing | `noindex` first, wait for deindex, then add `Disallow` |
| Block a non-HTML asset (PDF, image) | `X-Robots-Tag` HTTP header |
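For the crawl-budget row, the robots.txt side is a plain Disallow rule. A sketch (the path is illustrative):

```
# robots.txt — blocks fetching only; it does NOT deindex already-indexed URLs
User-agent: *
Disallow: /internal-search/
```

Note the ordering in the third row: adding a rule like this before the pages have been deindexed hides the noindex from crawlers.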
## noindex, follow is almost always the right combo

The pattern that confuses beginners — and that experienced SEOs apply by reflex — is noindex, follow. The page itself is not allowed into the index, but the links on the page are still crawled and pass equity to wherever they point. This is the right setting for tag and category archives, internal navigation hubs, and paginated series.
The page is invisible in search, but it still acts as a useful intermediate hop in your internal link graph — which is exactly what those pages are for.
## Where nofollow still does anything

Since Google reclassified nofollow as a hint in 2019, the strategic value of nofollow links has dropped to near-zero for most sites. The places it still earns its keep in 2026:
- Affiliate and paid links — Google's stated preference is `rel="sponsored"`, not nofollow, but adding nofollow at the meta level on entire affiliate hubs is still defensible.
- User-generated content — comments and forum posts, where `rel="ugc"` (or a blanket nofollow) removes the incentive for link spam.

For everything else — your normal blog posts, product pages, landing pages — leave links followable. You give up nothing and you get the benefit of clean internal equity flow.
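At the individual link level, the attribute Google asks for on paid placements looks like this (URL illustrative):

```html
<p>Read our <a href="https://example.com/partner-offer" rel="sponsored">partner review</a>.</p>
```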
The 2024-2026 wave of AI search crawlers — OAI-SearchBot, PerplexityBot, ClaudeBot, Google-Extended, and the various Bing AI agents — has changed the equation. In general:
- AI search crawlers respect noindex. A page with noindex will not be cited as a source in ChatGPT Search, Perplexity, or AI Overviews.
- AI training crawlers largely ignore noindex. They are not building a search index — they are building a corpus. To exclude a page from training, you need to block the named user agent in robots.txt or use emerging tokens like noai in the meta tag (still informally adopted).
- Watch your canonicals: make sure a noindex page is not also the canonical target of a page you do want cited.

If you want a page included in AI citations, the right setup is the same as for classic SEO: index, follow, self-referencing canonical, fast render, no JavaScript-only directives.
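Opting out of training therefore happens in robots.txt, agent by agent. A sketch — GPTBot is OpenAI's training crawler, distinct from OAI-SearchBot, and all of these names are worth re-checking against current documentation before you deploy:

```
# Opt out of AI training corpora; normal search crawling is unaffected
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: ClaudeBot
Disallow: /
```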
You cannot put a `<meta>` tag inside a PDF, an image, a video, or a JSON endpoint. For those, the only way to set indexing rules is the HTTP response header X-Robots-Tag. This is also how CDNs and platforms like Vercel, Netlify, or Cloudflare Pages enforce site-wide rules on preview deployments.
```http
X-Robots-Tag: noindex, nofollow
```
Two things to know: the header takes precedence over an in-page meta tag, and it is invisible unless you actively inspect HTTP responses. If a page is mysteriously deindexed and you cannot find a meta tag, check the response headers first.
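If you automate that header check, the parsing is trivial. A minimal sketch, assuming you already have the response headers as a dict (the function name is mine; pair it with any HTTP client):

```python
def has_noindex(headers: dict) -> bool:
    """Return True if any X-Robots-Tag header value contains a noindex directive."""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            directives = {d.strip().lower() for d in value.split(",")}
            if "noindex" in directives:
                return True
    return False
```

Note that X-Robots-Tag values can also be scoped to one bot (e.g. `googlebot: noindex`); this sketch handles only the unscoped form.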
## The mistakes that quietly cost rankings

- A staging template ships with noindex, the deploy pipeline merges it, and a week later traffic collapses.
- noindex on a page you also canonicalize to. You are telling Google that the master copy of a page is itself excluded. Pick one.
- Forgetting to remove noindex after launch. Pre-launch noindex is sensible. Leaving it on after launch is catastrophic.
- A bare noindex on paginated pages. Use noindex, follow if you must, but a bare noindex strands link equity.

Our free SEO Meta Generator builds robots, canonical, Open Graph, Twitter Card, and JSON-LD tags from one form. No signup, no tracking, copy-paste output.
For 95% of pages, here is the meta block you actually want:
```html
<meta name="robots" content="index, follow, max-image-preview:large, max-snippet:-1">
<link rel="canonical" href="https://example.com/this-page/">
```
For pages you want hidden but link-active:
```html
<meta name="robots" content="noindex, follow">
```
For thank-you pages, internal admin pages, and anything paywalled:
```html
<meta name="robots" content="noindex, nofollow, noarchive">
```
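To audit these tags at scale, extracting the directives from fetched HTML needs nothing beyond the standard library. A minimal sketch (class and function names are mine; fetching the pages is left to your HTTP client of choice):

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from every <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives: list[str] = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_directives(html: str) -> list[str]:
    """Return the robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

Run it over a crawl of your sitemap and flag any URL whose directives include noindex that you did not put there on purpose.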
- In Search Console's page indexing report, pull the list of URLs excluded by noindex. Anything in that list that you did not put there on purpose is a problem.
- Spot-check response headers and meta tags with curl -I or a Chrome extension like SEO Meta in 1 Click.

The robots meta tag is not glamorous, but it is the single line of HTML with the highest leverage on whether your page is allowed to compete in search at all. Treat index, follow as the default, use noindex, follow for archives and internal navigation, set X-Robots-Tag for assets, and audit the whole site once a quarter. Get that right and you have neutralized one of the most common reasons SaaS sites silently lose organic traffic.
Build a complete meta block — robots, canonical, Open Graph, Twitter Card, JSON-LD — in under a minute. No account required.