How Internal Links to Search Pages Killed My Google Traffic — A Case Study
TL;DR
I made metadata clickable on 25,000 pages, linking each field to my search engine with query parameters (/search?q=...). Google crawled tens of thousands of these URLs, classified them as thin content, and within a week my organic traffic dropped to near zero.
Context
I'm building a legal database with ~25,000 detail pages, each containing structured metadata (issuing authority, chairman, contracting authority, city, thematic issues). To improve UX and internal linking, I made every metadata field clickable — each one linked to the search page with the appropriate query:
<!-- Chairman -->
<a href="/search?q=John+Smith">John Smith</a>
<!-- Contracting authority -->
<a href="/search?q=City+of+Warsaw">City of Warsaw</a>
<!-- City -->
<a href="/search?q=Warsaw">Warsaw</a>
<!-- Date -->
<a href="/search?date_from=2022-05-31&date_to=2022-05-31">31.05.2022</a>
<!-- Issuing authority -->
<a href="/search?issuing_authority=National+Appeals+Chamber">National Appeals Chamber</a>
In theory — great idea. Internal linking is an SEO pillar, and clickable metadata improves UX. User clicks "John Smith" and sees all rulings by that chairman.
What Went Wrong
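25,000 pages × ~6 search links each = ~150,000 links pointing at URLs like /search?q=.... Many values repeat across pages, so the number of distinct URLs is lower, but it still runs to tens of thousands.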
Google discovered these links, started crawling them, and found the same thing on each: a search form + a handful of results. Minimal unique content. Textbook thin content.
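The product of pages and links per page is an upper bound; how many distinct URLs a crawler can discover depends on how often metadata values repeat. A minimal sketch of that count, using made-up sample data and a hypothetical helper `search_urls_for` (neither comes from the real site):

```python
from urllib.parse import urlencode

# Hypothetical sample: each detail page carries a few metadata values.
pages = [
    {"chairman": "John Smith", "city": "Warsaw"},
    {"chairman": "John Smith", "city": "Krakow"},
    {"chairman": "Anna Nowak", "city": "Warsaw"},
]


def search_urls_for(page):
    """Return the search URLs a single detail page would link to."""
    return [f"/search?{urlencode({'q': value})}" for value in page.values()]


# Every link on every page, duplicates included.
link_instances = [url for page in pages for url in search_urls_for(page)]
# Distinct URLs: this is what ends up in the crawl queue.
unique_urls = set(link_instances)

print(len(link_instances))  # 6
print(len(unique_urls))     # 4
```

Running the same count over real metadata before shipping a template change shows how many new URLs the change would expose.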
Within 2-3 weeks, Google Search Console showed:
- 7,767 pages with status "Crawled — currently not indexed"
- 1,147 pages with status "Alternate page with proper canonical tag"
- Organic traffic: from ~40 clicks/day to zero
Google didn't issue a manual penalty. As far as I can tell, it simply concluded that a domain with thousands of thin-content pages doesn't deserve high rankings, and the rankings of the entire domain dropped with it.
Why Canonical Didn't Save Me
I had canonical set to /search (no params) for date-filtered URLs. But:
- Canonical is a hint, not a directive: Google can ignore it
- For ?q=... URLs, the canonical pointed to itself (I treated each query as a separate "page")
- Even when Google respects the canonical, it still crawls these pages, wasting crawl budget and evaluating their quality
The Fix — Three Layers of Protection
1. robots.txt — Block Crawling
User-agent: *
Disallow: /search?
One rule. It blocks every variant of /search that carries parameters, because robots.txt rules match by URL prefix. Clean /search (no ?) remains accessible and indexable.
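To see why the trailing ? matters, here is a simplified sketch of prefix matching. The function `is_disallowed` is hypothetical and deliberately ignores the * and $ wildcards and Allow rules that real robots.txt parsers support:

```python
def is_disallowed(path_and_query, disallow_rules):
    """Simplified check: blocked if the URL starts with any Disallow value.

    Real robots.txt matching also handles * and $ wildcards and Allow
    rules; this sketch covers plain prefixes only.
    """
    return any(path_and_query.startswith(rule) for rule in disallow_rules if rule)


rules = ["/search?"]

print(is_disallowed("/search?q=John+Smith", rules))       # True
print(is_disallowed("/search?date_from=2022-05-31", rules))  # True
print(is_disallowed("/search", rules))                    # False
print(is_disallowed("/ruling/kio-1234-25", rules))        # False
```

With the rule written as /search (no ?), the clean search page would match the prefix too and be blocked along with everything else.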
2. rel="nofollow" — Cut Off the Source
<!-- BEFORE -->
<a href="/search?q=John+Smith">John Smith</a>
<!-- AFTER -->
<a href="/search?q=John+Smith" rel="nofollow">John Smith</a>
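nofollow asks Google not to follow the link or pass ranking signals through it. Google treats it as a hint rather than a command, so it backs up the robots.txt rule instead of replacing it. Belt and suspenders.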
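If the metadata links come out of a shared template helper, the attribute can be added in one place. A minimal sketch, assuming a hypothetical helper named `metadata_link` (the real templates may look quite different):

```python
from html import escape
from urllib.parse import urlencode


def metadata_link(label, **params):
    """Render a metadata value as a nofollow link to the search page."""
    href = f"/search?{urlencode(params)}"
    # escape() turns "&" into "&amp;" so multi-parameter URLs stay valid HTML.
    return f'<a href="{escape(href)}" rel="nofollow">{escape(label)}</a>'


print(metadata_link("John Smith", q="John Smith"))
# <a href="/search?q=John+Smith" rel="nofollow">John Smith</a>
```

Centralizing the markup this way means a later change (for example, switching the link target) touches one function instead of every template.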
3. Dynamic noindex Meta Tag — Last Line of Defense
# Clean /search page -> index
# /search?q=anything -> noindex
if query or page > 1 or has_active_filters:
    robots = "noindex, follow"
else:
    robots = "index, follow"
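Third layer. A crawler has to fetch the page to see this tag, so while the robots.txt rule is in place Googlebot won't read it. The tag covers the cases where that rule is removed or mistyped, or the URL gets fetched anyway: the page then declares itself noindex and stays out of the index.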
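The same decision can be written as a self-contained function that works from the URL alone. This is a sketch under assumptions: the function name `robots_meta` and the `page` parameter name are illustrative, and any other non-empty parameter is treated as a query or filter:

```python
from urllib.parse import parse_qs, urlsplit


def robots_meta(url):
    """Return the robots meta value for a search URL."""
    # parse_qs drops parameters with empty values by default.
    params = parse_qs(urlsplit(url).query)
    # Assumed pagination parameter; a missing value means page 1.
    page = int(params.pop("page", ["1"])[0])
    has_query_or_filters = bool(params)
    if has_query_or_filters or page > 1:
        return "noindex, follow"
    return "index, follow"


print(robots_meta("/search"))               # index, follow
print(robots_meta("/search?q=John+Smith"))  # noindex, follow
print(robots_meta("/search?page=2"))        # noindex, follow
```

The returned value would go into the page head as the content of a meta tag with name="robots".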
Bonus: 301 Redirects for Duplicate Paths
I also discovered that detail pages were accessible via both /ruling/kio-1234-25 (slug) and /ruling/32263 (database ID). Google treated these as duplicates. Fix:
# If URL uses numeric ID but the record has a slug -> 301 redirect
if record.slug and record.slug != url_param:
    return redirect(f"/ruling/{record.slug}", status=301)
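The redirect decision can be pulled out of the web framework and tested in isolation. A minimal sketch, with a hypothetical function `redirect_target` that returns the path to redirect to, or None when the requested URL is already the canonical one:

```python
def redirect_target(url_param, slug):
    """Return the slug path to 301 to, or None if no redirect is needed."""
    if slug and slug != url_param:
        return f"/ruling/{slug}"
    return None


print(redirect_target("32263", "kio-1234-25"))        # /ruling/kio-1234-25
print(redirect_target("kio-1234-25", "kio-1234-25"))  # None
print(redirect_target("32263", None))                 # None
```

Records without a slug keep their numeric URL, which matches the condition in the snippet above.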
The Rule
Never create crawlable links to search result pages from templates rendered on a large number of pages.
If you have 100 pages — no problem. If you have 25,000 — every parametrized link is a potential separate URL in Googlebot's eyes.
Checklist before adding internal links:
- Does the target page have unique, valuable content?
- How many unique URLs will this generate? (pages × links per page)
- Should the target page be in the index?
- If not → rel="nofollow" + noindex + robots.txt
What I Should Have Done Instead
The UX goal was valid — users should be able to click metadata and find related records. Here are approaches that don't poison your index:
- rel="nofollow" from day one: simplest fix, low SEO risk, full UX preserved
- JavaScript-based navigation: an onclick handler instead of <a href>, so there is no link for crawlers to discover
- Faceted navigation pages with real content: instead of linking to /search?q=John+Smith, create a proper /chairman/john-smith page with aggregated stats, recent rulings, and unique text. These pages should be indexed.
Option 3 is the most work but the best outcome — you get both UX and SEO value.
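Facet pages need stable, readable paths. A minimal sketch of generating them, assuming hypothetical helpers `slugify` and `facet_path`; the extra character mapping is illustrative and incomplete:

```python
import re
import unicodedata

# Letters without a Unicode decomposition need an explicit mapping
# (assumed and incomplete; extend for the alphabets you actually have).
_EXTRA = str.maketrans({"ł": "l", "Ł": "L"})


def slugify(value):
    """Turn a display name into a lowercase ASCII slug."""
    value = value.translate(_EXTRA)
    # NFKD splits accented letters into base letter + combining mark;
    # encoding to ASCII with "ignore" then drops the marks.
    ascii_text = (
        unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def facet_path(facet, value):
    """Build a facet page path such as /chairman/john-smith."""
    return f"/{facet}/{slugify(value)}"


print(facet_path("chairman", "John Smith"))  # /chairman/john-smith
```

Two different people can produce the same slug, so a real implementation would need a uniqueness check, for example by storing the slug on the record.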
Recovery Timeline
I deployed the fix today. According to various sources, recovery from this type of issue takes 2-4 weeks. I'll update this post when I see results.
Update (TBD): Waiting for data...
I'm building Przetargowi.pl — a search engine for public procurement rulings and tenders in Poland.