Understand how email harvesting works and implement the simplest fix to protect your inbox.
How Email Harvesting Works
The Bot Scraping Process
Step 1: Crawling
- Automated bots crawl websites 24/7
- Follow links from page to page
- Scan millions of websites daily
- Save HTML source code for analysis
Step 2: Pattern Matching
- Search for email patterns: text@domain.com
- Use regex: [a-z0-9]+@[a-z0-9]+\.[a-z]+
- Extract all matching strings
- Handle common variations
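The pattern-matching step can be sketched in a few lines of Python, which shows how little effort extraction takes. This is a minimal illustration, not the code of any real harvester: SIMPLE_PATTERN is the regex from the list above, while BROADER_PATTERN and extract_emails are assumed names standing in for the "common variations" a bot might handle.

```python
import re

# The simple pattern from the list above.
SIMPLE_PATTERN = re.compile(r"[a-z0-9]+@[a-z0-9]+\.[a-z]+", re.IGNORECASE)
# An assumed broader variant that also accepts dots, plus signs, hyphens,
# and multi-level domains.
BROADER_PATTERN = re.compile(r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}", re.IGNORECASE)


def extract_emails(html: str, pattern: re.Pattern = BROADER_PATTERN) -> list[str]:
    """Return unique, lowercased matches in order of first appearance."""
    seen: dict[str, None] = {}
    for match in pattern.findall(html):
        seen.setdefault(match.lower(), None)
    return list(seen)


page = "<p>Contact us: support@example.com or Jane.Doe@mail.example.org</p>"
print(extract_emails(page, SIMPLE_PATTERN))
print(extract_emails(page))
```

With the simple pattern, the second address is cut down to doe@mail.example because dots are not allowed in the local part. The broader pattern returns both addresses in full.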
Step 3: Validation
- Verify email format is correct
- Check if domain exists
- Filter obvious fake addresses
- Keep likely-valid addresses
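The validation step might look roughly like the sketch below. Everything here is an assumption for illustration: the format regex, the PLACEHOLDER_LOCAL_PARTS set, and the function names are hypothetical. The domain check uses a plain address lookup from Python's standard library; a real system would more likely query MX records, which needs a third-party DNS library.

```python
import re
import socket
from typing import Callable

EMAIL_FORMAT = re.compile(r"^[a-z0-9._%+-]+@([a-z0-9-]+\.)+[a-z]{2,}$", re.IGNORECASE)
# Assumed examples of placeholder local parts; not taken from the article.
PLACEHOLDER_LOCAL_PARTS = {"you", "name", "email", "user", "yourname"}


def domain_resolves(domain: str) -> bool:
    """Return True if a DNS lookup finds at least one address for the domain."""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def looks_valid(address: str, resolver: Callable[[str], bool] = domain_resolves) -> bool:
    """Apply the three checks in order: format, obvious fakes, domain exists."""
    if not EMAIL_FORMAT.match(address):
        return False
    local_part, domain = address.rsplit("@", 1)
    if local_part.lower() in PLACEHOLDER_LOCAL_PARTS:
        return False
    return resolver(domain)
```

The resolver is passed in as a parameter so the format and placeholder checks can be tried without any network access, for example `looks_valid("support@example.com", resolver=lambda d: True)`.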
Step 4: Database Storage
- Add to spam databases
- Tag with source URL
- Categorize by industry
- Timestamp the discovery
Step 5: Distribution
- Sell to spammers
- Use for own campaigns
- Share on underground markets
- Add to automated spam systems
Timeline: In the fastest cases, your email can be in spam databases within 24-48 hours of publishing.
Why mailto Makes It Worse
Plain Text Email
When you publish an email on your website:
<p>Contact us: support@example.com</p>
In HTML source:
<p>Contact us: support@example.com</p>
Bot sees: support@example.com in plain text. Trivial to extract.
mailto Links
Even worse: mailto links are machine-readable by design, which makes them easy to extract:
<a href="mailto:support@example.com">Contact Us</a>
Bot pattern matching:
- Look for href="mailto:
- Extract everything between mailto: and " or ?
- Store email address
- Move to next link
This is easier to scrape than plain text because it has a predictable structure.
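To make the predictable structure concrete, here is a minimal sketch of the mailto steps using Python's standard html.parser module. MailtoCollector is an illustrative name, not the code of any real bot.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class MailtoCollector(HTMLParser):
    """Collect addresses from <a href="mailto:..."> links."""

    def __init__(self) -> None:
        super().__init__()
        self.addresses: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith("mailto:"):
                # Keep what sits between "mailto:" and an optional "?" query.
                address = value[len("mailto:"):].split("?", 1)[0]
                self.addresses.append(unquote(address))


collector = MailtoCollector()
collector.feed('<a href="mailto:support@example.com?subject=Hi">Contact Us</a>')
print(collector.addresses)
```

The parser hands over the href value directly, so no guessing about surrounding text is needed. That is the sense in which a mailto link is easier to scrape than an address buried in a paragraph.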
How Fast Spam Arrives
Publishing an email address typically leads to a timeline like this:
| Timeline | What Happens |
|---|---|
| Day 1 | Email published on website |
| Day 2-3 | Crawlers discover your page |
| Day 4-7 | Email added to databases |
| Week 2 | First spam emails arrive |
| Week 3-4 | Spam volume increases |
| Month 2 | 10-20 spam emails per day |
| Month 6 | 50-100+ spam emails per day |
| Year 1 | Hundreds of spam emails daily |
The problem only gets worse over time.
The Damage
Immediate Impact
Your inbox becomes:
- Filled with spam
- Harder to find real messages
- Time-consuming to manage
- Frustrating to use
Typical spam mix:
- Fake business opportunities
- Phishing attempts
- Product promotions
- Scam offers
- Malicious links
Long-Term Consequences
Once your email is compromised:
- ✗ Can't unpublish (already in databases)
- ✗ Can't stop spam (lists are sold repeatedly)
- ✗ Address is permanently burned
- ✗ Filters help but don't solve it
- ✗ Only solution: New email address
Recovery cost:
- Time setting up new email
- Updating all accounts
- Notifying contacts
- Migrating email history
- Updating business cards, signatures, etc.
The Simplest Fix
Remove Email from HTML, Use Form Link
The complete solution:
- Remove email addresses from all web pages
- Replace with contact form link
- Form delivers to your email (kept private)
- Your inbox stays clean
Before (Exposed)
<footer>
<p>Contact us at support@example.com</p>
<a href="mailto:support@example.com">Email Support</a>
</footer>
Result:
- Email visible in source
- Bots can scrape it on their next crawl
- Spam is close to inevitable
After (Protected)
<footer>
<p><a href="https://supportretriever.com/form/your-form-id">Contact Support</a></p>
</footer>
Result:
- No email in HTML
- No email in source code
- Nothing for bots to scrape
- Clean inbox
5-Step Action Plan
Step 1: Create Protected Contact Form (5 minutes)
- Sign up at SupportRetriever
- Complete onboarding flow
- Configure form settings
- Set recipient email (stays hidden)
- Get form URL
Step 2: Find All Published Emails (10 minutes)
Where to look:
- Website footer
- Contact page
- About page
- Team pages
- Blog posts
- Email signatures
- Social media bios
- Business directories
How to find them:
Search your website source:
- Search for @ in HTML files
- Search for mailto: in code
- Check all pages manually
- Use browser "View Source"
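If your site is a folder of HTML files, the search can be automated. The sketch below assumes Python and a local copy of the site; find_exposed_emails is a hypothetical helper name, and the regex is a rough approximation that may flag strings that only resemble addresses.

```python
import re
from pathlib import Path

# Matches either the literal "mailto:" or something shaped like an email address.
EMAIL_OR_MAILTO = re.compile(
    r"mailto:|[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}", re.IGNORECASE
)


def find_exposed_emails(root: Path) -> list[tuple[str, int, str]]:
    """Return (relative file, line number, matched text) for each hit under root."""
    hits = []
    for path in sorted(root.rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            for match in EMAIL_OR_MAILTO.finditer(line):
                hits.append((str(path.relative_to(root)), line_number, match.group(0)))
    return hits


# Example usage, assuming the site's files live in a folder named "public":
# for file_name, line_number, text in find_exposed_emails(Path("public")):
#     print(f"{file_name}:{line_number}: {text}")
```

A mailto link produces two hits on the same line, one for the mailto: prefix and one for the address. Pages generated by a CMS or by JavaScript are not covered, so the manual checks still apply.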
Step 3: Replace with Form Links (15 minutes)
Simple replacement pattern:
<!-- OLD -->
<a href="mailto:support@example.com">Contact Support</a>
<!-- NEW -->
<a href="https://supportretriever.com/form/your-form-id">Contact Support</a>
All locations:
- Navigation menus
- Footer
- Contact page
- Team member pages
- Blog post signatures
- Error pages
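For sites with many pages, the replacement can be scripted. This is a minimal sketch, assuming Python and the placeholder form URL from the example above; replace_mailto_links is an illustrative name. It rewrites only the href attribute, so link text that displays the address still has to be edited by hand.

```python
import re

# Placeholder form URL from the example above; substitute your own.
FORM_URL = "https://supportretriever.com/form/your-form-id"

# An href attribute whose value starts with "mailto:", in single or double quotes.
MAILTO_HREF = re.compile(r'href=(["\'])mailto:[^"\']*\1', re.IGNORECASE)


def replace_mailto_links(html: str, form_url: str = FORM_URL) -> str:
    """Point every mailto href at the contact form, keeping the original quote style."""
    return MAILTO_HREF.sub(
        lambda m: f"href={m.group(1)}{form_url}{m.group(1)}", html
    )


old = '<a href="mailto:support@example.com">Contact Support</a>'
print(replace_mailto_links(old))
```

Run it on a copy of your files first and review the changes before publishing.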
Step 4: Update External Locations (10 minutes)
Where email might be published:
- LinkedIn profile
- Twitter bio
- Facebook page
- Google Business Profile
- Business directories
- Forum signatures
- GitHub profile
- Portfolio sites
Replace with:
- Form link
- Contact page URL
- Social media message option
- "Contact through website" text
Step 5: Verify and Test (10 minutes)
Verification checklist:
- No emails visible on website
- No emails in HTML source
- All form links work
- Form submits successfully
- Email notifications arrive
- Mobile experience is good
Quick Before/After Example
Before: Email Exposed
<!DOCTYPE html>
<html>
<head>
<title>My Business</title>
</head>
<body>
<header>
<nav>
<a href="/">Home</a>
<a href="/about">About</a>
<a href="mailto:contact@example.com">Contact</a>
</nav>
</header>
<main>
<h1>Welcome to My Business</h1>
<p>Get in touch: contact@example.com</p>
</main>
<footer>
<p>Email us: <a href="mailto:support@example.com">support@example.com</a></p>
<p>Sales: <a href="mailto:sales@example.com">sales@example.com</a></p>
</footer>
</body>
</html>
Exposed emails:
- contact@example.com (2 places)
- support@example.com (1 place)
- sales@example.com (1 place)
All visible to bots. All will receive spam.
After: Email Protected
<!DOCTYPE html>
<html>
<head>
<title>My Business</title>
</head>
<body>
<header>
<nav>
<a href="/">Home</a>
<a href="/about">About</a>
<a href="https://supportretriever.com/form/your-form-id">Contact</a>
</nav>
</header>
<main>
<h1>Welcome to My Business</h1>
<p>
<a href="https://supportretriever.com/form/your-form-id?source=homepage">Get in touch</a>
</p>
</main>
<footer>
<p>
<a href="https://supportretriever.com/form/your-form-id?type=support">Contact Support</a> •
<a href="https://supportretriever.com/form/your-form-id?type=sales">Sales Inquiries</a>
</p>
</footer>
</body>
</html>
Exposed emails: Zero
All contact happens through the protected form. Your actual email stays hidden.
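The protected example tags each link with a query parameter (source=homepage, type=support, type=sales). If you generate pages from templates, a small helper keeps those URLs consistent. The sketch below assumes Python; tagged_form_url is a hypothetical name, and how the form service interprets these parameters is not covered here, so treat them as labels to confirm against the service's own documentation.

```python
from urllib.parse import urlencode

# Placeholder form URL from the example above; substitute your own.
FORM_URL = "https://supportretriever.com/form/your-form-id"


def tagged_form_url(**params: str) -> str:
    """Return the form URL with the given query parameters, safely encoded."""
    if not params:
        return FORM_URL
    return f"{FORM_URL}?{urlencode(params)}"


print(tagged_form_url(source="homepage"))
print(tagged_form_url(type="support"))
```

Using urlencode instead of string concatenation means values with spaces or special characters are escaped correctly.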
What If You Already Get Spam?
If Your Email Is Already Compromised
Short-term actions:
- Enable aggressive spam filtering
- Create rules to auto-delete known spam
- Mark spam consistently (trains filters)
- Consider changing email providers (better filters)
Long-term solution:
- Create new email address
- Keep old address for existing contacts
- Set up form pointing to new address
- Never publish new address
- Gradually migrate to new address
- Plan to retire old address
Timeline:
- Keep old address for 6-12 months
- Forward legitimate mail to new address
- Aggressively filter spam on old address
- Eventually close old address
Prevention for New Address
Protect your new email:
- Never publish in plain text anywhere
- Only reveal through contact forms
- Share individually with legitimate contacts
- Keep completely private
Result:
- New inbox stays spam-free
- Old address absorbs existing spam
- Clean transition over time
Why This Works
Form-Based Contact vs Published Email
| Aspect | Published Email | Form-Based Contact |
|---|---|---|
| Visibility | | |
| In HTML source | ✓ Yes | ✗ No |
| Visible to bots | ✓ Yes | ✗ No |
| Can be scraped | ✓ Yes | ✗ No |
| Protection | | |
| Spam protection | ✗ None | ✓ Multi-layer |
| Rate limiting | ✗ No | ✓ Yes |
| Bot blocking | ✗ No | ✓ Yes |
| Result | | |
| Spam volume | High | Minimal |
| Protection durability | Degrades | Permanent |
Bottom line: Bots can't scrape what isn't there.
Common Questions
"Won't forms reduce conversions?"
No. Forms often improve conversions:
- Better mobile experience (no email client needed)
- More reliable (always works)
- Instant confirmation (users know you got it)
- Professional appearance
"What if someone really needs my email?"
When they submit through your form:
- You receive the message
- You reply from your real email
- They see your email in the reply
- They can email you directly after that
Only legitimate contacts get your email.
"Can't I just obfuscate my email?"
Obfuscation provides temporary protection that degrades as bots improve. Forms provide permanent protection. See Email obfuscation vs contact forms: what actually works.
"Isn't this overkill for a small site?"
Spam affects small sites just as much as large ones. Bots don't discriminate by traffic volume. Prevention is easier than cleanup.
