I Want to Believe, but: Your Email Link Clicks Aren’t Real
Odds are your email reports are lying to you: they show more people clicking on links in your emails than actually are.
Marketers have been noticing this trend throughout much of 2017 across multiple platforms like Marketo, HubSpot, SalesLoft and others. But why did this start happening—and what steps can you take to make sure you’re getting the most accurate data possible?
The first known instance of this trend started appearing in late spring/early summer 2013 with Barracuda’s email security platform analyzing incoming emails by checking where links resolved—in other words, where your go.email.com/encodedlink link actually lands when you click on it in an email. Barracuda targeted three main ways a URL in an email could be analyzed:
- Is the URL something that’s known to blacklists and should be blocked?
- Does the URL contain a domain that resolves to known spammers or other blacklisted content?
- Is the URL being hidden in a redirect to prevent anti-spam systems from realizing the final destination of the link?
The third concept had been a major issue in Internet security around the early 2010s. Spammers would use free shortener services (such as bit.ly) or put a redirecting URL in an email to obscure the actual link someone was being sent to. By following the link to its final destination, Barracuda’s software could figure out whether a link was safe or not.
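To make "following the link to its final destination" concrete, here is a minimal Python sketch. It never touches the network: the `REDIRECTS` table stands in for the HTTP redirect responses a real checker would receive, and every name, URL and domain in it is hypothetical. It is an illustration of the general technique, not a description of how Barracuda's product is implemented.

```python
from urllib.parse import urlsplit


def resolve_final_url(url, redirects, max_hops=10):
    """Follow a chain of redirects and return the URL where it ends.

    `redirects` maps a URL to the URL it forwards to; it is a stand-in
    for the HTTP 3xx responses a real link checker would see.
    """
    seen = set()
    for _ in range(max_hops):
        if url not in redirects:
            return url  # no further redirect: this is the landing page
        if url in seen:
            raise ValueError("redirect loop detected")
        seen.add(url)
        url = redirects[url]
    raise ValueError("too many redirects")


def is_link_safe(url, redirects, blocked_domains):
    """Judge a link by the domain it finally lands on, not the one shown."""
    final_url = resolve_final_url(url, redirects)
    return urlsplit(final_url).hostname not in blocked_domains


# Hypothetical data: a shortener hiding a blocked destination.
REDIRECTS = {
    "https://short.example/abc": "https://tracker.example/r?id=1",
    "https://tracker.example/r?id=1": "https://bad.example/landing",
}
BLOCKED_DOMAINS = {"bad.example"}

final = resolve_final_url("https://short.example/abc", REDIRECTS)
safe = is_link_safe("https://short.example/abc", REDIRECTS, BLOCKED_DOMAINS)
print(final, safe)
```

The visible link points at a harmless-looking shortener, but the check is made against the page the chain finally reaches, which is exactly why the checker has to "click".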
Over the next several years, many other email service providers (ESPs) and anti-spam filtering systems implemented similar technologies, to the point where link checking is commonplace among most major email providers in 2018.
However, this verification process has had one major unintentional side effect: most software that measures email clicks will send links with redirects in order to monitor who clicked on what link. Marketo, for instance, does this by sending uniquely encoded links to each person you mail in the system:
In this example, you’ll see that even though the email sent out of Marketo has an email link to http://www.howmanypeopleareinspacerightnow.com, the link end users receive actually goes to Marketo’s servers first (to record that the person clicked on the email) before they are forwarded on to the original link in the email.
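The mechanics of a tracking redirect can be sketched in a few lines of Python. This is an illustration only: the `go.email.com` host is the placeholder used earlier in this article, and the token format (base64-encoded JSON), the function names and the recipient ID are all assumptions, not Marketo's actual encoding.

```python
import base64
import json

TRACKING_HOST = "http://go.email.com"  # placeholder host used in this article
click_log = []                         # stand-in for the platform's activity log


def make_tracked_link(recipient_id, destination):
    """Build a per-recipient link that points at the tracking server."""
    payload = json.dumps({"r": recipient_id, "u": destination}).encode("utf-8")
    token = base64.urlsafe_b64encode(payload).decode("ascii")
    return f"{TRACKING_HOST}/{token}"


def handle_click(tracked_link, user_agent):
    """What a tracking server does on any request: log it, then redirect.

    The server cannot tell whether a person or a link checker made the
    request, so both end up in the click log.
    """
    token = tracked_link.rsplit("/", 1)[1]
    payload = json.loads(base64.urlsafe_b64decode(token))
    click_log.append({"recipient": payload["r"], "user_agent": user_agent})
    return payload["u"]  # the URL the visitor is forwarded to


link = make_tracked_link("lead-123", "http://www.howmanypeopleareinspacerightnow.com")
# A mail filter probing the link is logged exactly like a human click would be.
destination = handle_click(link, "PerfectMail/2.0 (http://perfectmail.com/kb/web_probe)")
```

Note that `handle_click` records the click before anything is known about who is asking, which is the root of the inflated numbers.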
This puts security and sales/marketing tactics directly in conflict. Network security wants to ensure that no one gets sent to a rogue website, but sales and marketing rely on this redirect method to get an idea of who is clicking on their emails. It’s no surprise, then, that there’s been a steady rise in false positive clicks being recorded by emailing systems.
In practical terms, this means that raw reports of click activity from most email trackers are inaccurate and most likely inflated, because the reports can’t meaningfully separate spam filters “clicking” on a link from actual users. This is by design.
Understanding User Agents
Whenever you click on a link, whether it’s in an email client or in your browser, the browser sends a string of information called a user agent (in the User-Agent request header) to the server hosting the link you clicked on. The user agent describes the client to the server: the type of platform (mobile vs. desktop), operating system, browser and so on that accessed the link.
For example, take a look at this User Agent string:
Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Mobile Safari/537.36
From this, you can deduce that this is a Google Nexus 5 phone that runs Android Marshmallow (6.0) and is using Chrome 63 as its browser, which in turn uses AppleWebKit as its rendering engine. The extra bits you see (Mozilla/5.0 and Mobile Safari/537.36) are legacy tokens used to stay compatible with servers and can be safely ignored. (If you’re interested in how user agent strings became the bizarre mess they are, WebAIM has a good writeup.)
Likewise, if you look at a string like:
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:10.0) Gecko/20100101 Firefox/57.0
You can see that this is a Windows 10 64-bit computer using Firefox 57 as its browser and Gecko as its rendering engine.
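The same reading can be done mechanically. The sketch below is a deliberately naive Python parser, written for this article's two examples rather than for the full variety of user agents in the wild. The function name and regular expressions are my own; production systems typically rely on a maintained user agent library instead.

```python
import re

# name/version pairs such as "Chrome/63.0.3239.132"
PRODUCT_RE = re.compile(r"([A-Za-z][A-Za-z0-9_.-]*)/(\d[\d.]*)")
# the first parenthesised comment, which usually describes the platform
COMMENT_RE = re.compile(r"\(([^)]*)\)")


def summarize_user_agent(user_agent):
    """Split a user agent into platform details and product versions."""
    products = dict(PRODUCT_RE.findall(user_agent))
    comment = COMMENT_RE.search(user_agent)
    platform = [part.strip() for part in comment.group(1).split(";")] if comment else []
    return {"platform": platform, "products": products}


android = summarize_user_agent(
    "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/63.0.3239.132 Mobile Safari/537.36"
)
firefox = summarize_user_agent(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:10.0) Gecko/20100101 Firefox/57.0"
)

print(android["platform"])             # ['Linux', 'Android 6.0', 'Nexus 5 Build/MRA58N']
print(android["products"]["Chrome"])   # 63.0.3239.132
print(firefox["products"]["Firefox"])  # 57.0
```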
Essentially, user agent strings should contain information that looks like real-world combinations of devices and software you know. When you look at user agent strings for email link clicks in Marketo, though, you can see that some links are being accessed by user agents that are clearly not real people or devices!
For example, looking at recent email clicks from a wide-audience B2B/B2C company, DemandLab noticed user agents “clicking” on links such as:
- Mozilla/5.0 (compatible; Yahoo Link Preview; https://help.yahoo.com/kb/mail/yahoo-link-preview-SLN23615.html)
- Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
- CRAZYWEBCRAWLER 0.9.10, http://www.crazywebcrawler.com
- Mozilla/5.0 (compatible; Embedly/0.2; +http://support.embed.ly)
- Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US) AppEngine-Google; (+http://code.google.com/; appid: s~virustotalcloud)
- PerfectMail/2.0 (http://perfectmail.com/kb/web_probe)
- Java/1.8.0_102
It’s pretty easy to see that they aren’t real people clicking on these links: they’re web crawlers, mail filters or other pieces of software. However, with most modern spam filters, you’re far more likely to see fake email clicks coming across with user agents like:
- Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
- Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
- Mozilla/5.0 (Windows NT 5.1; rv:52.0) Gecko/20100101 Firefox/52.0
Why? Although these user agents look like real people with real browsers, anti-spam is often a game of cat and mouse. If user agents explicitly said they were bots, it would be easy for malicious emailers to send anti-spam checkers to clean pages and average users to harmful pages. Anti-spam products often mask their user agent to look like very common types of traffic as a result.
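A short Python sketch shows why filtering on the user agent alone falls short. The marker list below is a hypothetical one assembled from the examples in this article, not a maintained bot list: it catches every self-identifying crawler above and none of the masked checkers.

```python
# Hypothetical markers, taken from the self-identifying examples above.
BOT_MARKERS = (
    "bot", "crawler", "link preview", "embedly",
    "appengine-google", "perfectmail", "java/",
)


def looks_automated(user_agent):
    """True if the user agent openly identifies itself as software."""
    lowered = user_agent.lower()
    return any(marker in lowered for marker in BOT_MARKERS)


self_identifying = [
    "Mozilla/5.0 (compatible; Yahoo Link Preview; https://help.yahoo.com/kb/mail/yahoo-link-preview-SLN23615.html)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "CRAZYWEBCRAWLER 0.9.10, http://www.crazywebcrawler.com",
    "Mozilla/5.0 (compatible; Embedly/0.2; +http://support.embed.ly)",
    "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US) AppEngine-Google; (+http://code.google.com/; appid: s~virustotalcloud)",
    "PerfectMail/2.0 (http://perfectmail.com/kb/web_probe)",
    "Java/1.8.0_102",
]
masked = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko",
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)",
    "Mozilla/5.0 (Windows NT 5.1; rv:52.0) Gecko/20100101 Firefox/52.0",
]

caught = [ua for ua in self_identifying if looks_automated(ua)]
missed = [ua for ua in masked if not looks_automated(ua)]
print(len(caught), len(missed))  # 7 3
```

The masked strings are indistinguishable from an ordinary Internet Explorer or Firefox visit, so no keyword list can separate them from real readers.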
Because of this lack of clarity, several different ideas have been floated on how to effectively measure clicks with an automation platform, but nearly all of them have issues, and the overall conflict between network security and marketing tracking remains largely unsolved.
As such, there has been a growing set of options and opinions on how to distinguish signal from noise and identify who is truly interacting with your email content. So what should the marketer do? Before we get to what you can do about measurement, let’s first touch upon what you shouldn’t do.
How you shouldn’t combat false clicks
Sticking a “fake” link at the top of your emails as a honeypot or filtering out multiple clicks
Why this is ineffective: The idea is that anti-spam checkers will go through the first few links in your email to determine whether you are a bad sender or not. However, this logic is flawed: only a handful of older anti-spam systems do this. More often, you’ll see links checked that are either new to the email or considered potentially risky. As an example, we’ve seen one anti-spam system in particular always click on Google Plus links no matter where they are in the hierarchy of links in the email.
Moreover, sticking a “fake” link in your email can look like an attempt to hide a malicious link to more advanced email analysis systems, especially if it’s hidden via HTML. As a result, this technique can backfire and cause you to look like a bad sender.
Not counting clicks without opens
Why this is ineffective: While this may seem like a straightforward solution, it overlooks the way emails are counted as “opened” by most email providers: a tracking pixel image is loaded when the user looks at the email. Unfortunately, Outlook and many common web clients do not load images by default when you open an email, so that “open” can be missed. This is especially problematic for B2B mailers, who are often sending to addresses that use some sort of Microsoft product (Outlook, Office 365, etc.).
Not counting clicks made right after an email is sent
Why this is ineffective: The thought behind this idea is that only automated services could click on an email link within minutes of having an email sent. However, in today’s world, email notifications are ever-present—and email open analysis shows that the biggest chunk of time where someone opens an email is within the first hour. Why penalize people for responding to your email when you aimed for them to respond to it?
Not counting clicks until your automation platform confirms the email was delivered
Why this is ineffective: The idea of waiting until your automation platform confirms the email was delivered seems very tempting. After all, in most cases the anti-spam system will test the links it needs to by “clicking” on them, then allow the email to be delivered. However, this series of actions (email sent -> email is “clicked” -> email is delivered) usually happens in a span of seconds, and it’s not uncommon to see instances where your platform doesn’t record them in the right order due to race conditions or millisecond delays in the system. While a click recorded before delivery can be a good indicator that the problem is happening to a certain email address, it’s not foolproof enough for mass use.
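As a diagnostic rather than a filter, the ordering check can still be useful. The Python sketch below flags addresses whose first recorded click is not later than their delivery record. The event tuples, field order and addresses are hypothetical and do not reflect any platform's export format. Because of the recording delays described above, treat the result as a hint about which addresses sit behind a link checker, not as proof.

```python
from datetime import datetime


def clicked_before_delivery(events):
    """Return addresses whose first click is not later than delivery.

    `events` is an iterable of (address, kind, timestamp) tuples where
    kind is "click" or "delivered" (a hypothetical format).
    """
    first_click = {}
    delivered_at = {}
    for address, kind, timestamp in events:
        if kind == "click":
            if address not in first_click or timestamp < first_click[address]:
                first_click[address] = timestamp
        elif kind == "delivered":
            delivered_at[address] = timestamp
    return {
        address
        for address, clicked in first_click.items()
        if address in delivered_at and clicked <= delivered_at[address]
    }


events = [
    ("a@example.com", "click", datetime(2018, 1, 15, 10, 0, 1)),
    ("a@example.com", "delivered", datetime(2018, 1, 15, 10, 0, 2)),
    ("b@example.com", "delivered", datetime(2018, 1, 15, 10, 0, 2)),
    ("b@example.com", "click", datetime(2018, 1, 15, 10, 45, 0)),
]
suspects = clicked_before_delivery(events)
print(suspects)  # {'a@example.com'}
```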
How you should combat false clicks
After testing this idea on dozens of different Marketo instances, the single most effective way I’ve found to track email clicks is a combination of two filters:
Email was delivered + Visited web page: [web page linked to in email]
The thing to keep in mind with anti-spam checkers is that often they are checking links just to see where they resolve, that is, where your go.email.com/encodedlink actually goes. If a checker loads a page at all, it will do so with minimal rendering. Because Marketo uses Munchkin.js to track page visits, a checker would also need to load and run JavaScript for a visit to be recorded.
The vast majority of checkers do not.
Checking to see both that the email message was delivered to the person and that they visited the page is usually your best bet for recording real clicks. This also helps prevent potentially skewed link results: if someone clicks on the logo at the top of your email, should that really count as a link click for your campaign?
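In Marketo this is configured with filters, not code, but the logic is simply an intersection of two sets. The Python sketch below uses a made-up data model (lead names, a list of page visits and example URLs, all hypothetical) to show how the combined filter differs from the raw click count.

```python
def confirmed_clicks(delivered, page_visits, linked_page):
    """Leads who were delivered the email AND visited the linked page.

    `page_visits` holds (lead, url) pairs recorded by JavaScript-based
    page tracking, so link checkers that never run JavaScript are absent.
    """
    visitors = {lead for lead, url in page_visits if url == linked_page}
    return set(delivered) & visitors


delivered = {"ann", "bob", "cat"}
raw_email_clicks = {"ann", "bob", "cat"}  # inflated: includes link-checker "clicks"
page_visits = [
    ("ann", "https://www.example.com/webinar"),
    ("cat", "https://www.example.com/pricing"),  # visited a different page
]

confirmed = confirmed_clicks(delivered, page_visits, "https://www.example.com/webinar")
print(len(raw_email_clicks), sorted(confirmed))  # 3 ['ann']
```

Here "bob" was only ever "clicked" by a checker and "cat" followed a different link, so only "ann" counts toward the campaign.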
Combining this strategy with campaign progression steps can also solve issues around recording email clicks for scoring. Rather than listening for the email click itself, simply listen for the campaign status changing from the combination filters.
While this method has its own faults—especially if you do not have your record cookied or if you have to send traffic to a website you don’t own—it’s the best of bad options.
This presumes that your “Email Delivered” values are correct, and they aren’t always—but more on that in our next post.