Stop Email Scraping: Why Publishing an Address Invites Spam

Understand how email harvesting works and implement the simplest fix to protect your inbox.

How Email Harvesting Works

The Bot Scraping Process

Step 1: Crawling

  • Automated bots crawl websites 24/7
  • Follow links from page to page
  • Scan millions of websites daily
  • Save HTML source code for analysis

Step 2: Pattern Matching

  • Search for email patterns: text@domain.com
  • Use regex: [a-z0-9]+@[a-z0-9]+\.[a-z]+
  • Extract all matching strings
  • Handle common variations
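The pattern-matching step can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular harvester: the function name `extract_emails` and the sample page are assumptions, and the pattern is broader than the simplified one in the list because real addresses contain dots, hyphens, and plus signs.

```python
import re

# Broader than the simplified regex in the list: allows dots, hyphens,
# plus signs, and mixed case. Illustrative only, not a full RFC 5322 parser.
EMAIL_RE = re.compile(r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}", re.IGNORECASE)


def extract_emails(html: str) -> list[str]:
    """Return unique addresses found in raw HTML, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in EMAIL_RE.findall(html):
        seen.setdefault(match.lower(), None)  # dict keeps insertion order
    return list(seen)


page = (
    '<p>Contact us: support@example.com</p>'
    '<a href="mailto:Sales@example.com">Sales</a>'
)
print(extract_emails(page))  # ['support@example.com', 'sales@example.com']
```

The point of the sketch is how little effort extraction takes: one regular expression covers both the plain-text address and the one inside the link.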

Step 3: Validation

  • Verify email format is correct
  • Check if domain exists
  • Filter obvious fake addresses
  • Keep likely-valid addresses
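A rough sketch of the validation step might look like the following. The blocklists and the name `looks_valid` are hypothetical, chosen only for illustration. The "check if domain exists" step is left out because it needs a live DNS lookup.

```python
import re

# Whole-string format check: local part, "@", dotted domain, alphabetic TLD.
FORMAT_RE = re.compile(r"^[a-z0-9._%+-]+@([a-z0-9-]+\.)+[a-z]{2,}$", re.IGNORECASE)

# Hypothetical blocklists, used here only to illustrate the filtering idea.
PLACEHOLDER_DOMAINS = {"example.com", "example.org", "test.com"}
IMAGE_SUFFIXES = (".png", ".jpg", ".gif")


def looks_valid(address: str) -> bool:
    """Return True if the string looks like a real, deliverable address."""
    if not FORMAT_RE.match(address):
        return False
    domain = address.rsplit("@", 1)[1].lower()
    if domain in PLACEHOLDER_DOMAINS:
        return False
    # Strings like "logo@2x.png" match email regexes but are image file names.
    return not address.lower().endswith(IMAGE_SUFFIXES)


candidates = ["jane@acme.io", "user@example.com", "logo@2x.png", "not-an-email"]
print([c for c in candidates if looks_valid(c)])  # ['jane@acme.io']
```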

Step 4: Database Storage

  • Add to spam databases
  • Tag with source URL
  • Categorize by industry
  • Timestamp the discovery

Step 5: Distribution

  • Sell to spammers
  • Use for own campaigns
  • Share on underground markets
  • Add to automated spam systems

Timeline: Your email can be in spam databases within days of publishing.

Why mailto Makes It Worse

Plain Text Email

When you publish an email address on your website, it appears verbatim in the HTML source:

<p>Contact us: support@example.com</p>

Bot sees: support@example.com in plain text. Trivial to extract.

mailto Links

Even worse, mailto links follow a fixed format that makes extraction easy:

<a href="mailto:support@example.com">Contact Us</a>

Bot pattern matching:

  1. Look for href="mailto:
  2. Extract everything between mailto: and " or ?
  3. Store email address
  4. Move to next link

This is easier to scrape than plain text because it has a predictable structure.
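The four steps above map almost directly onto Python's standard `html.parser` module. The class name `MailtoCollector` is an assumption made for this sketch. It shows why a predictable `href="mailto:..."` structure needs no guesswork at all.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class MailtoCollector(HTMLParser):
    """Collect addresses from href="mailto:..." attributes on <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.addresses: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().startswith("mailto:"):
            # Drop the scheme and any ?subject=... query string.
            target = href[len("mailto:"):].split("?", 1)[0]
            self.addresses.append(unquote(target))


collector = MailtoCollector()
collector.feed('<a href="mailto:support@example.com?subject=Hi">Contact Us</a>')
print(collector.addresses)  # ['support@example.com']
```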

How Fast Spam Arrives

Publishing an email address typically leads to a progression like this:

| Timeline | What Happens |
|---|---|
| Day 1 | Email published on website |
| Day 2-3 | Crawlers discover your page |
| Day 4-7 | Email added to databases |
| Week 2 | First spam emails arrive |
| Week 3-4 | Spam volume increases |
| Month 2 | 10-20 spam emails per day |
| Month 6 | 50-100+ spam emails per day |
| Year 1 | Hundreds of spam emails daily |

The problem only gets worse over time.

The Damage

Immediate Impact

Your inbox becomes:

  • Filled with spam
  • Harder to find real messages
  • Time-consuming to manage
  • Frustrating to use

Typical spam mix:

  • Fake business opportunities
  • Phishing attempts
  • Product promotions
  • Scam offers
  • Malicious links

Long-Term Consequences

Once your email is compromised:

  • ✗ Can't unpublish (already in databases)
  • ✗ Can't stop spam (lists are sold repeatedly)
  • ✗ Address is permanently burned
  • ✗ Filters help but don't solve it
  • ✗ Only solution: New email address

Recovery cost:

  • Time setting up new email
  • Updating all accounts
  • Notifying contacts
  • Migrating email history
  • Updating business cards, signatures, etc.

The Simplest Fix

Remove Email from HTML, Use Form Link

The complete solution:

  1. Remove email addresses from all web pages
  2. Replace with contact form link
  3. Form delivers to your email (kept private)
  4. Your inbox stays clean

Before (Exposed)

<footer>
  <p>Contact us at support@example.com</p>
  <a href="mailto:support@example.com">Email Support</a>
</footer>

Result:

  • Email visible in source
  • Bots can scrape it on their next crawl
  • Spam is very likely to follow

After (Protected)

<footer>
  <p><a href="https://supportretriever.com/form/your-form-id">Contact Support</a></p>
</footer>

Result:

  • No email in HTML
  • No email in source code
  • Nothing for bots to scrape
  • Clean inbox

5-Step Action Plan

Step 1: Create Protected Contact Form (5 minutes)

  1. Sign up at SupportRetriever
  2. Complete onboarding flow
  3. Configure form settings
  4. Set recipient email (stays hidden)
  5. Get form URL

Step 2: Find All Published Emails (10 minutes)

Where to look:

  • Website footer
  • Contact page
  • About page
  • Team pages
  • Blog posts
  • Email signatures
  • Social media bios
  • Business directories

How to find them:

Search your website source:

  • Search for @ in HTML files
  • Search for mailto: in code
  • Check all pages manually
  • Use browser "View Source"
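If the site is available as a folder of HTML files, the search can be automated. The sketch below assumes a static site laid out on disk. The function name `find_exposed_emails` is a hypothetical helper, and the pattern is a permissive approximation rather than a full address grammar.

```python
import re
from pathlib import Path

# Permissive pattern; it also finds addresses inside mailto: links.
EMAIL_RE = re.compile(r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}", re.IGNORECASE)


def find_exposed_emails(site_root: str) -> dict[str, list[str]]:
    """Map each HTML file under site_root to the addresses found in it."""
    findings: dict[str, list[str]] = {}
    for path in sorted(Path(site_root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = sorted({match.lower() for match in EMAIL_RE.findall(text)})
        if found:
            # Report paths relative to the site root, with forward slashes.
            findings[path.relative_to(site_root).as_posix()] = found
    return findings
```

Calling `find_exposed_emails("./public")`, where `./public` stands in for your build or export folder, returns only the pages that still contain an address. Pages rendered by a CMS or JavaScript would still need the manual "View Source" check.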

Step 3: Replace with Form Links (15 minutes)

Simple replacement pattern:

<!-- OLD -->
<a href="mailto:support@example.com">Contact Support</a>

<!-- NEW -->
<a href="https://supportretriever.com/form/your-form-id">Contact Support</a>
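For sites with many pages, the same replacement can be scripted. The following is a minimal sketch under the assumption that links are written as ordinary quoted `href` attributes. The name `replace_mailto_links` is hypothetical.

```python
import re

# Matches href="mailto:..." or href='mailto:...', including any query string.
MAILTO_HREF_RE = re.compile(r"""href=(["'])mailto:[^"']*\1""", re.IGNORECASE)


def replace_mailto_links(html: str, form_url: str) -> str:
    """Point every mailto: href at form_url, leaving the link text untouched."""
    return MAILTO_HREF_RE.sub(
        lambda m: f"href={m.group(1)}{form_url}{m.group(1)}", html
    )


old = '<a href="mailto:support@example.com">Contact Support</a>'
print(replace_mailto_links(old, "https://supportretriever.com/form/your-form-id"))
# <a href="https://supportretriever.com/form/your-form-id">Contact Support</a>
```

This only rewrites the `href`. If the visible link text is itself an address, it still has to be changed by hand or with a separate pass.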

All locations:

  • Navigation menus
  • Footer
  • Contact page
  • Team member pages
  • Blog post signatures
  • Error pages

Step 4: Update External Locations (10 minutes)

Where email might be published:

  • LinkedIn profile
  • Twitter bio
  • Facebook page
  • Google Business Profile
  • Business directories
  • Forum signatures
  • GitHub profile
  • Portfolio sites

Replace with:

  • Form link
  • Contact page URL
  • Social media message option
  • "Contact through website" text

Step 5: Verify and Test (10 minutes)

Verification checklist:

  • No emails visible on website
  • No emails in HTML source
  • All form links work
  • Form submits successfully
  • Email notifications arrive
  • Mobile experience is good
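The first three checklist items can be partly automated. The sketch below takes a page's HTML as a string and returns a list of problems. `verify_page` is a hypothetical helper, and it cannot confirm that the form submits or that notifications arrive, so those items still need a manual test.

```python
import re

EMAIL_RE = re.compile(r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}", re.IGNORECASE)
FORM_URL = "https://supportretriever.com/form/your-form-id"  # placeholder URL


def verify_page(html: str, form_url: str = FORM_URL) -> list[str]:
    """Return a list of problems found in one page; empty means it passed."""
    problems = [
        f"address still present: {address}"
        for address in sorted(set(EMAIL_RE.findall(html)))
    ]
    if "mailto:" in html.lower():
        problems.append("mailto: link still present")
    if form_url not in html:
        problems.append("form link missing")
    return problems


protected = f'<footer><a href="{FORM_URL}">Contact Support</a></footer>'
print(verify_page(protected))  # []
```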

Quick Before/After Example

Before: Email Exposed

<!DOCTYPE html>
<html>
<head>
  <title>My Business</title>
</head>
<body>
  <header>
    <nav>
      <a href="/">Home</a>
      <a href="/about">About</a>
      <a href="mailto:contact@example.com">Contact</a>
    </nav>
  </header>
  
  <main>
    <h1>Welcome to My Business</h1>
    <p>Get in touch: contact@example.com</p>
  </main>
  
  <footer>
    <p>Email us: <a href="mailto:support@example.com">support@example.com</a></p>
    <p>Sales: <a href="mailto:sales@example.com">sales@example.com</a></p>
  </footer>
</body>
</html>

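Exposed emails:

  • contact@example.com (navigation link and page body)
  • support@example.com (footer)
  • sales@example.com (footer)

All visible to bots. All will receive spam.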

After: Email Protected

<!DOCTYPE html>
<html>
<head>
  <title>My Business</title>
</head>
<body>
  <header>
    <nav>
      <a href="/">Home</a>
      <a href="/about">About</a>
      <a href="https://supportretriever.com/form/your-form-id">Contact</a>
    </nav>
  </header>
  
  <main>
    <h1>Welcome to My Business</h1>
    <p>
      <a href="https://supportretriever.com/form/your-form-id?source=homepage">Get in touch</a>
    </p>
  </main>
  
  <footer>
    <p>
      <a href="https://supportretriever.com/form/your-form-id?type=support">Contact Support</a> • 
      <a href="https://supportretriever.com/form/your-form-id?type=sales">Sales Inquiries</a>
    </p>
  </footer>
</body>
</html>

Exposed emails: Zero

All contact happens through the protected form. Your actual email address stays hidden.
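The protected example tags each link with a query parameter (`source=homepage`, `type=support`, `type=sales`) so submissions can be told apart. If links are generated from a template, a small helper keeps the encoding consistent. This is a sketch: `tagged_form_url` is a hypothetical name, and how the form service interprets these parameters is assumed from the URLs in the example, not confirmed.

```python
from urllib.parse import urlencode

FORM_URL = "https://supportretriever.com/form/your-form-id"  # placeholder URL


def tagged_form_url(**params: str) -> str:
    """Append query parameters such as source= or type= to the form URL."""
    return f"{FORM_URL}?{urlencode(params)}" if params else FORM_URL


print(tagged_form_url(type="support"))
# https://supportretriever.com/form/your-form-id?type=support
```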

What If You Already Get Spam?

If Your Email Is Already Compromised

Short-term actions:

  1. Enable aggressive spam filtering
  2. Create rules to auto-delete known spam
  3. Mark spam consistently (trains filters)
  4. Consider changing email providers (better filters)

Long-term solution:

  1. Create new email address
  2. Keep old address for existing contacts
  3. Set up form pointing to new address
  4. Never publish new address
  5. Gradually migrate to new address
  6. Plan to retire old address

Timeline:

  • Keep old address for 6-12 months
  • Forward legitimate mail to new address
  • Aggressively filter spam on old address
  • Eventually close old address

Prevention for New Address

Protect your new email:

  • Never publish in plain text anywhere
  • Reveal it only when replying to contact form submissions
  • Share individually with legitimate contacts
  • Keep completely private

Result:

  • New inbox stays spam-free
  • Old address absorbs existing spam
  • Clean transition over time

Why This Works

Form-Based Contact vs Published Email

| Aspect | Published Email | Form-Based Contact |
|---|---|---|
| Visibility | | |
| In HTML source | ✓ Yes | ✗ No |
| Visible to bots | ✓ Yes | ✗ No |
| Can be scraped | ✓ Yes | ✗ No |
| Protection | | |
| Spam protection | ✗ None | ✓ Multi-layer |
| Rate limiting | ✗ No | ✓ Yes |
| Bot blocking | ✗ No | ✓ Yes |
| Result | | |
| Spam volume | High | Minimal |
| Protection durability | Degrades | Permanent |

Bottom line: Bots can't scrape what isn't there.

Common Questions

"Won't forms reduce conversions?"

Not necessarily. Forms can improve conversions:

  • Better mobile experience (no email client needed)
  • More reliable (always works)
  • Instant confirmation (users know you got it)
  • Professional appearance

"What if someone really needs my email?"

When they submit through your form:

  1. You receive the message
  2. You reply from your real email
  3. They see your email in the reply
  4. They can email you directly after that

Only legitimate contacts get your email.

"Can't I just obfuscate my email?"

Obfuscation provides temporary protection that degrades as bots improve. Forms provide permanent protection. See Email obfuscation vs contact forms: what actually works.

"Isn't this overkill for a small site?"

Spam affects small sites just as much as large ones. Bots don't discriminate by traffic volume. Prevention is easier than cleanup.
