Data Discrepancies In Marketing

Discrepancies between marketing data sources undermine trust in the data and lead to multiple problems – let’s change that. A guest post by Barbara Galiza.

Barbara Galiza

Created :

February 11, 2024

Updated :

April 30, 2024

Why do different marketing platforms report a different number of conversions for the same campaign? How can Facebook Ads report 300 conversions, Google Analytics 150, Campaign Manager 200, and Snowplow 100? Why do these discrepancies happen? Should you worry about them and try to fix them? Which number should you use for your CAC calculations?

Data is so pivotal for marketing that you’re probably using multiple tools to track digital campaigns. For example, it’s common for the conversion event of a Facebook campaign to be measured on the ad platform (Meta), the ad server (Campaign Manager), the marketing analytics tool (Google Analytics), as well as an event data collection tool (Snowplow).

None of the conversion events attributed to the campaign are likely to match across these tools. Why is that the case? How can we fix it? More importantly, should we even fix it?

In this article, I will talk about common marketing sources along with some of their associated discrepancies, explore why (some of) those discrepancies are problematic, and propose solutions to these challenges.

Challenges Posed by Data Discrepancies

Data discrepancies are to be expected since different sources capture, interpret, and model data in different ways. However, if discrepancies are not addressed, they can cause a slew of challenges for organizations – let’s take a look at the three most common ones.

Distrust in the Data → Media Misinvestment

Inaccurate or conflicting data can make it difficult to determine which source is reliable, resulting in hesitancy to use the data for critical business decisions. This is especially severe when organizations double down on “one source of truth” because each data source has its own blind spots.

For example, a marketing team may only trust Snowplow (post-click, cookie-based) data and ignore the signals from the Ad Platform (post-view, user-based). This can lead to miscalculation of the true impact of the media because conversions may be incorrectly attributed to direct traffic. The knee-jerk reaction is investing in media that is easier to measure but could be less effective. I’m looking at you, Brand Search Ads.

Misalignment → Conflicts

Discrepancies can create confusion between external agencies and internal teams as they may use different data sources.

For example, a marketing agency may only have access to Campaign Manager data, while the in-house team may only look at Segment-powered Tableau dashboards. This can cause miscommunication and hinder the development of a cohesive strategy. Education and clear communication across departments are essential for addressing this challenge.

Unnecessary Investigation → Time Wasted

This is a consequence of both data distrust and team misalignment.

Time-consuming and often frustrating investigations into discrepancies can waste valuable resources, especially when the discrepancies are not significant and do not impact decision-making.

These investigations, especially the severe ones, tend to involve multiple departments (data engineering, data analytics, paid media, marketing analytics) as well as external stakeholders (marketing agencies, analytics agencies), thus draining valuable resources.

To tackle data discrepancies, marketers should accept that a single source of truth is no panacea – it’s unlikely to provide a complete picture of what’s really going on. Instead, marketers must leverage multiple sources of data to gain a comprehensive understanding of the outcome of their campaigns.

Explaining Discrepancies in Common Marketing Sources

There are plenty of reasons for discrepancies. You can find some common marketing sources and their associated discrepancies below:

Packaged Event Data Solution (Google Analytics)

It attributes conversions to all campaigns (UTMs and linked ad accounts) that you’ve defined, but also to standard sources (like “default channel grouping”) such as direct and organic.
GA can identify users across devices and sessions with a Google User ID rather than relying on cookies alone, and that identifier lasts longer than cookies.
It comes with an out-of-the-box attribution model with minimal visibility of the marketing touches.
It only tracks post-click conversions (once the user has entered your analytics property).

Custom Event Data Collection (Snowplow, Segment, etc)

Both attribution and sessions need to be defined manually, requiring custom data modeling. How you define them depends on how you identify anonymous (non-logged-in) users across sessions and devices (such as through fingerprinting, cookies, etc).
Conversions also depend on the marketing campaigns you’re tracking and the attribution model you’ve defined.
Like Google Analytics, these tools only track post-click conversions (once the user has entered your analytics property).
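To make the point about manual definitions concrete, here is a minimal sketch of a last-touch attribution model. The touchpoints, conversions, campaign names, and the `last_touch` helper are all hypothetical, and a real Snowplow or Segment model would usually live in SQL on top of sessionized events. The only thing it aims to show is that the same events yield different results once the lookback definition changes.

```python
from datetime import datetime, timedelta

# Hypothetical touchpoints from an event pipeline: (user_id, timestamp, campaign).
TOUCHES = [
    ("u1", datetime(2024, 2, 1, 10, 0), "meta_prospecting"),
    ("u1", datetime(2024, 2, 3, 9, 0), "google_brand_search"),
    ("u2", datetime(2024, 1, 1, 12, 0), "tiktok_launch"),
]

# Hypothetical conversions: (user_id, timestamp).
CONVERSIONS = [
    ("u1", datetime(2024, 2, 3, 9, 30)),
    ("u2", datetime(2024, 2, 5, 8, 0)),
    ("u3", datetime(2024, 2, 6, 8, 0)),
]


def last_touch(conversions, touches, lookback_days):
    """Credit each conversion to the most recent touch inside the lookback window."""
    window = timedelta(days=lookback_days)
    credited = {}
    for user_id, converted_at in conversions:
        eligible = [
            (touched_at, campaign)
            for uid, touched_at, campaign in touches
            if uid == user_id and timedelta(0) <= converted_at - touched_at <= window
        ]
        # No eligible touch means the conversion falls back to "direct".
        credited[user_id] = max(eligible)[1] if eligible else "direct"
    return credited


print(last_touch(CONVERSIONS, TOUCHES, lookback_days=7))
print(last_touch(CONVERSIONS, TOUCHES, lookback_days=60))
```

With a 7-day window, user `u2` is credited to direct traffic. With a 60-day window, the same conversion goes to the TikTok campaign. Neither answer is wrong, they simply follow different definitions.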

Ad Platforms (Meta, Google Ads, etc)

An ad platform only attributes conversions to itself. On Meta, it doesn’t matter if a user has clicked on 10 Search Ads before being served a Meta ad. If they convert, then Meta claims the conversion.
Conversions depend on the lookback window and attribution model you’ve picked.
Since an ad platform tracks the user before they enter your website, it also tracks post-view conversions.
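The effect of lookback windows and post-view tracking can be sketched in a few lines. Everything below is illustrative: the interaction log, the conversion timestamps, and the `claimed_conversions` helper are assumptions, not any platform's actual logic. It only shows how the count an ad platform claims moves with the click and view windows you select.

```python
from datetime import datetime, timedelta

# Hypothetical log of one platform's ad interactions: (user_id, kind, timestamp).
INTERACTIONS = [
    ("u1", "click", datetime(2024, 2, 1, 10, 0)),
    ("u2", "view", datetime(2024, 2, 2, 18, 0)),
    ("u3", "view", datetime(2024, 1, 28, 9, 0)),
]

# Hypothetical conversions observed on the site: user_id -> timestamp.
CONVERSIONS = {
    "u1": datetime(2024, 2, 5, 10, 0),
    "u2": datetime(2024, 2, 3, 8, 0),
    "u3": datetime(2024, 2, 3, 8, 0),
    "u4": datetime(2024, 2, 3, 8, 0),  # never saw an ad from this platform
}


def claimed_conversions(click_days, view_days):
    """Count conversions the platform would claim under the given lookback windows."""
    windows = {"click": timedelta(days=click_days), "view": timedelta(days=view_days)}
    claimed = set()
    for user_id, kind, seen_at in INTERACTIONS:
        converted_at = CONVERSIONS.get(user_id)
        if converted_at is None:
            continue
        # The interaction must precede the conversion and fall inside its window.
        if timedelta(0) <= converted_at - seen_at <= windows[kind]:
            claimed.add(user_id)
    return len(claimed)


print(claimed_conversions(click_days=7, view_days=0))  # post-click only
print(claimed_conversions(click_days=7, view_days=1))  # adds a 1-day view window
print(claimed_conversions(click_days=7, view_days=7))  # adds a 7-day view window
```

The three calls report 1, 2, and 3 conversions for the same underlying events. User `u4` is never claimed by the platform, although a post-click analytics tool would still count that conversion under direct or organic.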

Ad Server (Campaign Manager, Kevel, etc)

Like an ad platform, an ad server can track the user before they enter your website (measuring impressions) and therefore, it can also track post-view conversions.
Once again, conversions depend on the lookback window and attribution model you’ve picked.

An ad server can attribute conversions to multiple ad platforms, but, unlike Google Analytics or data collection tools, it can’t attribute a conversion to Organic/Direct.

*Comparison chart: How different sources handle attribution*

Four Solutions for Marketers Facing Data Discrepancies

When dealing with discrepancies in data from different marketing sources, marketers should follow these best practices to ensure that they make informed decisions based on accurate information:

Establish acceptable limits for discrepancy

Understand that discrepancies will occur and determine what range of discrepancy is acceptable for your business. This threshold will vary depending on the platforms, campaigns, and products involved.

Additionally, to avoid unnecessary investigation, it’s helpful to explain to stakeholders why discrepancies occur in the first place.
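One way to write such limits down is as a relative gap against a reference source. The sketch below reuses the conversion counts from the opening example of this article. The choice of Snowplow as the reference, the limit values, and the helper names are hypothetical and would need to be agreed on per business.

```python
# Conversion counts from the opening example of this article.
REPORTED = {
    "Facebook Ads": 300,
    "Campaign Manager": 200,
    "Google Analytics": 150,
    "Snowplow": 100,
}

# Hypothetical acceptable limits, expressed as the relative gap to the reference.
# Post-view sources get a wider limit than post-click sources.
ACCEPTABLE_GAP = {
    "Facebook Ads": 2.50,
    "Campaign Manager": 1.50,
    "Google Analytics": 0.25,
}


def relative_gap(value, reference):
    """Relative difference of a source against the reference source."""
    return (value - reference) / reference


def out_of_bounds(reported, limits, reference="Snowplow"):
    """Return the sources whose gap to the reference exceeds their limit."""
    base = reported[reference]
    flagged = {}
    for source, limit in limits.items():
        gap = relative_gap(reported[source], base)
        if abs(gap) > limit:
            flagged[source] = gap
    return flagged


print(out_of_bounds(REPORTED, ACCEPTABLE_GAP))
```

Under these assumed limits only Google Analytics is flagged, with a gap of 50% against the reference. The larger gaps on Facebook Ads and Campaign Manager stay within their wider limits because those sources also count post-view conversions.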

Monitor discrepancies

Implement a system to monitor discrepancies, either through automated tools or regular manual checks. This will help you keep track of any significant changes that may require further investigation.

Understand the nature of the campaign, the platform, and the product

Recognize that different campaigns and platforms may have varying levels of discrepancy. For example, TikTok campaigns might show higher discrepancies when compared to Google Search campaigns. This could be because of cross-device tracking (TikTok ads are served on mobile, while the conversion may happen on another device) or the time-to-conversion (Google Search can reach leads with high intent, who tend to convert sooner).

Expand with zero-party data and econometrics

“Zero-party data” such as attribution surveys (asking new users where they found out about your product) can offer a better picture of hard-to-attribute sources. You can learn all about zero-party data in this series.

I’ve seen this first-hand while working on a mobile app. Influencer marketing – which was resulting in nearly zero attributed installs monthly – was responsible for 40% of new users in the attribution survey.

Attribution analysis using techniques such as Marketing Mix Modeling (MMM) is also gaining traction as a way to measure both short-term and long-term ROI across different marketing channels, including OOH (out-of-home) and brand campaigns.

Real-world Solutions: Discrepancy Dashboard and Slack Bot

I’ll cover two solutions I’ve helped a marketing agency implement to monitor discrepancies for their clients. These don’t fix discrepancies, but they have helped the team identify when to investigate them.

Discrepancy Dashboard

The dashboard joined the data sources they used: their DSP (DV360), ad server (Campaign Manager), and brand safety tool (IAS). It reported on impressions based on each marketing source.

The DSP reported on all impressions that were purchased. The ad server reported on all impressions that were purchased and served. The brand safety tool reported on all impressions that were purchased, served, and displayed (deemed valid).

As a result, the number of impressions decreased with each level, leading to discrepancies among these sources.
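The calculation behind such a dashboard can be sketched as a step-by-step drop between consecutive sources. The impression counts and the `step_discrepancies` helper below are made up for illustration and do not come from the agency's data.

```python
# Hypothetical daily impression counts for one placement.
FUNNEL = [
    ("DSP (purchased)", 100_000),
    ("Ad server (served)", 94_000),
    ("Brand safety (displayed)", 87_420),
]


def step_discrepancies(funnel):
    """Relative drop in impressions between each consecutive pair of sources."""
    drops = []
    for (upstream, before), (downstream, after) in zip(funnel, funnel[1:]):
        drops.append((upstream, downstream, (before - after) / before))
    return drops


for upstream, downstream, drop in step_discrepancies(FUNNEL):
    print(f"{upstream} -> {downstream}: {drop:.1%}")
```

Reporting the drop per step, rather than one overall number, makes it easier to see at which level impressions are being lost.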

Discrepancy Slack Bot

Based on historical data, we determined the baseline discrepancies among these marketing sources. We then built a Slack bot that would notify programmatic traders when a placement’s discrepancy surpassed a specified threshold, such as 10%.
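A stripped-down version of the alerting logic might look like the sketch below. The placement names, the impression counts, and the helper functions are hypothetical, and the actual bot is not reproduced here. The sketch only builds the message payloads. Sending them to Slack (for example through an incoming webhook, which accepts a JSON body with a `text` field) is left out.

```python
import json

THRESHOLD = 0.10  # the 10% example threshold mentioned above

# Hypothetical placements: impressions reported by the DSP and by the ad server.
PLACEMENTS = {
    "homepage_banner": {"dsp": 50_000, "ad_server": 47_500},
    "video_preroll": {"dsp": 20_000, "ad_server": 16_000},
}


def discrepancy(dsp, ad_server):
    """Share of DSP impressions that the ad server did not report."""
    return (dsp - ad_server) / dsp


def build_alerts(placements, threshold=THRESHOLD):
    """Return one Slack-style message payload per placement above the threshold."""
    alerts = []
    for name, counts in placements.items():
        gap = discrepancy(counts["dsp"], counts["ad_server"])
        if gap > threshold:
            text = f"Placement {name}: discrepancy {gap:.0%} exceeds {threshold:.0%}"
            alerts.append({"text": text})
    return alerts


for payload in build_alerts(PLACEMENTS):
    # A real bot would send this JSON to Slack instead of printing it.
    print(json.dumps(payload))
```

Here only `video_preroll` triggers an alert, since its 20% gap is above the threshold while the 5% gap on `homepage_banner` is not. In practice the threshold would be derived per source pair from the historical baseline rather than hard-coded.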

This alert system helped the client identify potential fraud, ads placed in the wrong locations, or tracking issues. It was essential to investigate and resolve these problems promptly.

How I Learned to Stop Worrying About and Start Loving Discrepancies

Discrepancies in data from marketing sources can cause distrust in the data, miscommunication between external agencies and internal teams, and wasted time investigating perfectly reasonable discrepancies.

To address these challenges, marketers must understand the differences between sources, report on multiple conversion sources, and be aware of the discrepancies in the marketing sources they use.

By embracing best practices and being proactive in addressing discrepancies, marketers can make more informed decisions and enhance their marketing strategies. Remember, the key is not necessarily to find a perfect one-to-one match, but to exercise common sense, maintain open communication, and establish guardrails to make the most of your marketing data.
