cmdarek

08-06-2026

How Internal Links to Search Pages Killed My Google Traffic — A Case Study

TL;DR

I made metadata clickable on 25,000 pages, linking to my search engine with query parameters (/search?q=...). Google crawled tens of thousands of these URLs, classified them as thin content, and within a week my organic traffic dropped to near zero.

Context

I'm building a legal database with ~25,000 detail pages, each containing structured metadata (issuing authority, chairman, contracting authority, city, thematic issues). To improve UX and internal linking, I made every metadata field clickable — each one linked to the search page with the appropriate query:

<!-- Chairman -->
<a href="/search?q=John+Smith">John Smith</a>

<!-- Contracting authority -->
<a href="/search?q=City+of+Warsaw">City of Warsaw</a>

<!-- City -->
<a href="/search?q=Warsaw">Warsaw</a>

<!-- Date -->
<a href="/search?date_from=2022-05-31&date_to=2022-05-31">31.05.2022</a>

<!-- Issuing authority -->
<a href="/search?issuing_authority=National+Appeals+Chamber">National Appeals Chamber</a>

In theory — great idea. Internal linking is an SEO pillar, and clickable metadata improves UX. User clicks "John Smith" and sees all rulings by that chairman.

What Went Wrong

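25,000 pages × ~6 search links each = ~150,000 links to URLs like /search?q=.... Many of those links repeat the same target, so the number of unique URLs is lower, but it still ran into the tens of thousands.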
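The multiplication gives the number of links, not the number of distinct URLs, and both are easy to estimate before shipping a template change. A minimal sketch, assuming each record is a dict of metadata fields; the field names and the three sample records are made up for illustration:

```python
from urllib.parse import urlencode

# Hypothetical sample records; in practice these would come from the database.
records = [
    {"chairman": "John Smith", "city": "Warsaw"},
    {"chairman": "John Smith", "city": "Krakow"},
    {"chairman": "Anna Nowak", "city": "Warsaw"},
]


def search_urls(record):
    """Return the /search URLs that one detail page would link to."""
    return [f"/search?{urlencode({'q': value})}" for value in record.values()]


# Every link on every page, duplicates included.
all_links = [url for record in records for url in search_urls(record)]
# What a crawler would actually see as separate URLs.
unique_urls = set(all_links)

print(len(all_links), "links ->", len(unique_urls), "unique URLs")
```

On the sample data this reports 6 links and 4 unique URLs. Run against the real table, the second number is the one that shows how many extra URLs a crawler is being invited to fetch.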

Google discovered these links, started crawling them, and found the same thing on each: a search form + a handful of results. Minimal unique content. Textbook thin content.

Within 2-3 weeks, Google Search Console showed:

  • 7,767 pages with status "Crawled — currently not indexed"
  • 1,147 pages with status "Alternate page with proper canonical tag"
  • Organic traffic: from ~40 clicks/day to zero

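Google didn't issue a manual penalty. My reading is that its algorithms decided a domain with thousands of thin content pages doesn't deserve high rankings, and that this site-wide quality assessment dragged down the pages that had been ranking. Google doesn't publish such a score, so this is an inference from the timing rather than something Search Console reports.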

Why Canonical Didn't Save Me

I had canonical set to /search (no params) for date-filtered URLs. But:

  1. Canonical is a hint, not a directive — Google can ignore it
  2. For ?q=... URLs, the canonical pointed to itself (I treated each query as a separate "page")
  3. Even when Google respects the canonical, it still crawls these pages, wasting crawl budget and evaluating their quality

The Fix — Three Layers of Protection

1. robots.txt — Block Crawling

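User-agent: *
Disallow: /search?

One rule. It blocks every variant of /search that carries a query string. The clean /search (no ?) remains crawlable and indexable. The User-agent line is shown because a Disallow rule only takes effect inside a user-agent group.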
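Why the trailing ? matters can be shown with a small prefix check. This is a deliberately simplified sketch, not a real robots.txt parser: the function name is my own, and it ignores Allow rules, wildcards, and longest-match precedence.

```python
def is_blocked(path_and_query: str, disallow_prefixes: list[str]) -> bool:
    """Simplified check: blocked if the URL starts with any Disallow prefix.

    Real robots.txt matching also handles Allow rules, '*' and '$'
    wildcards, and longest-match precedence; none of that is modelled here.
    """
    return any(path_and_query.startswith(prefix) for prefix in disallow_prefixes)


rules = ["/search?"]

for url in [
    "/search",
    "/search?q=John+Smith",
    "/search?date_from=2022-05-31&date_to=2022-05-31",
    "/ruling/kio-1234-25",
]:
    print(url, "->", "blocked" if is_blocked(url, rules) else "allowed")
```

With the rule written as /search? only the parametrized URLs match. Written as /search (no ?), the same prefix logic would also block the clean search page.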

2. rel="nofollow" — Cut Off the Source

<!-- BEFORE -->
<a href="/search?q=John+Smith">John Smith</a>

<!-- AFTER -->
<a href="/search?q=John+Smith" rel="nofollow">John Smith</a>

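nofollow asks Google not to follow the link from the detail page, so the search URLs stop being discovered at their source. Google treats nofollow as a hint rather than a strict rule, which is why it is one layer and not the whole fix. Belt and suspenders.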
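With 25,000 templates rendering these links, it is safer to build them in one place than to remember the attribute by hand. A minimal sketch using only the standard library; search_link is a hypothetical helper name, not part of any framework:

```python
from html import escape
from urllib.parse import urlencode


def search_link(label: str, **params: str) -> str:
    """Build a nofollow anchor to /search for one metadata value."""
    href = f"/search?{urlencode(params)}"
    # escape() turns '&' into '&amp;' so the href is valid inside an attribute.
    return f'<a href="{escape(href)}" rel="nofollow">{escape(label)}</a>'


print(search_link("John Smith", q="John Smith"))
print(search_link("31.05.2022", date_from="2022-05-31", date_to="2022-05-31"))
```

Because every metadata link goes through the same function, changing the policy later (for example, pointing the links at dedicated pages) is a one-line change.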

3. Dynamic noindex Meta Tag — Last Line of Defense

# Clean /search page -> index
# /search?q=anything -> noindex
if query or page > 1 or has_active_filters:
    robots = "noindex, follow"
else:
    robots = "index, follow"

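Third layer, with one caveat: Googlebot only sees a noindex tag on pages it is allowed to crawl. While the robots.txt rule above is in place, Google never fetches /search?... URLs, so the tag does nothing for them, and a blocked URL can still show up in the index without content if other pages link to it. The tag earns its keep for crawlers that don't apply the same robots.txt rules, and as a safety net if the Disallow line is ever removed.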
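The same decision can be derived straight from the request URL, which makes it easy to test in isolation. A sketch under assumptions: the page parameter name is a guess at how pagination is exposed, and any other non-empty parameter is treated as a query or an active filter.

```python
from urllib.parse import parse_qs, urlsplit


def robots_meta(url: str) -> str:
    """Return the robots meta value for a /search URL."""
    # parse_qs drops parameters with empty values, so "/search?q=" counts as clean.
    params = parse_qs(urlsplit(url).query)
    # "page" is an assumed parameter name; anything other than page 1 is noindex.
    page = params.pop("page", ["1"])[0]
    if params or page != "1":
        return "noindex, follow"
    return "index, follow"


for url in ["/search", "/search?q=John+Smith", "/search?page=2", "/search?q="]:
    print(url, "->", robots_meta(url))
```

The returned string would go into the content attribute of the page's robots meta tag.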

Bonus: 301 Redirects for Duplicate Paths

I also discovered that detail pages were accessible via both /ruling/kio-1234-25 (slug) and /ruling/32263 (database ID). Google treated these as duplicates. Fix:

# If URL uses numeric ID but the record has a slug -> 301 redirect
if record.slug and record.slug != url_param:
    return redirect(f"/ruling/{record.slug}", status=301)

The Rule

Never create crawlable links to search result pages from templates rendered on a large number of pages.

If you have 100 pages — no problem. If you have 25,000 — every parametrized link is a potential separate URL in Googlebot's eyes.

Checklist before adding internal links:

  • Does the target page have unique, valuable content?
  • How many unique URLs will this generate? (pages × links per page)
  • Should the target page be in the index?
  • If not → rel="nofollow" + noindex + robots.txt

What I Should Have Done Instead

The UX goal was valid — users should be able to click metadata and find related records. Here are approaches that don't poison your index:

  1. rel="nofollow" from day one: simplest fix, low SEO risk, full UX preserved
  2. JavaScript-based navigation: an onclick handler instead of <a href>, so there is no link for crawlers to discover
  3. Faceted navigation pages with real content — instead of linking to /search?q=John+Smith, create a proper /chairman/john-smith page with aggregated stats, recent rulings, and unique text. These pages should be indexed.

Option 3 is the most work but the best outcome — you get both UX and SEO value.
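For option 3, each metadata value needs a stable path instead of a query string. A rough sketch of how a name could map to such a path; slugify and chairman_path are hypothetical helpers, and the /chairman/ prefix is taken from the example above:

```python
import re
import unicodedata


def slugify(name: str) -> str:
    """Lowercase ASCII slug, e.g. 'John Smith' -> 'john-smith'.

    Limitation: letters with no Unicode decomposition (such as the Polish
    'ł') are dropped here, so real data would need an explicit
    transliteration table first.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def chairman_path(name: str) -> str:
    """Path of the dedicated page for one chairman."""
    return f"/chairman/{slugify(name)}"


print(chairman_path("John Smith"))
```

Two different names can collapse to the same slug, so a production version would likely store the slug in the database and enforce uniqueness there rather than compute it on every render.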

Recovery Timeline

I deployed the fix today. According to various sources, recovery from this type of issue takes 2-4 weeks. I'll update this post when I see results.

Update (TBD): Waiting for data...


I'm building Przetargowi.pl — a search engine for public procurement rulings and tenders in Poland.