First-party Data vs. Zero-party Data
Characteristics, confusion, and ownership.
This is a follow-up to the previous post where I proposed that the term customer data be replaced with audience data.
The rationale behind this suggestion is that enterprises no longer deal with just leads and customers; B2B brands, in particular, must keep their “other audiences” engaged.
This is an excerpt from a guide originally published on the astorik learning hub.
In this post, I’d like to list the characteristics of first-party and zero-party data, touch upon what causes confusion between the two, and discuss which party owns what data.
It helps to keep in mind who the two parties are when we talk about the two types of audience data:
The organization or brand that collects the data is the first party
The end user who shares the data is the zero party
As a reminder, data collected implicitly by organizations is first-party data, whereas data shared explicitly by the end user is zero-party data.
It’s also worth mentioning that the term ‘zero-party data’ hasn’t seen wide adoption yet, and it’s common practice to stick to ‘first-party data’ even when referring to zero-party data. There are several points of confusion here, but before we get to those, I’d like to describe the key characteristics of these two audience data types.
Let’s dig in.
Characteristics of first-party data
The data collected by an organization or brand from a user implicitly is first-party data.
It includes any piece of information that a user (the zero party) shares with a brand (the first party) unintentionally or implicitly when they interact with an app or website, and through other touchpoints such as opening an email or clicking on an ad.
First-party data has the following characteristics:
It’s easy to collect since collection is automated and needs no intervention from the people whose actions generate the data.
It’s easy to store in a structured manner since data formats are specified when tracking is implemented.
It’s easy to analyze using purpose-built product analytics tools.
It’s not always accurate since it’s not shared by the user directly. It’s also prone to implementation errors and bugs as well as interference by VPNs and ad blockers.
It’s ideally owned by the brand collecting it.
It’s important for brands to make it easy for users to understand what data is collected implicitly and enable them to opt out anytime so that their actions are not turned into data.
Characteristics of zero-party data
The data collected by an organization or brand from a user explicitly is zero-party data.
It includes any piece of information that a user (the zero party) shares with a brand (the first party) intentionally or explicitly by inputting details into a form, or via communication channels like email and chat.
Zero-party data has the following characteristics:
It’s difficult to collect as it relies on the whim of the user, who may choose not to share any data in the first place.
It’s difficult to store in a structured manner since the data formats are not consistent (people who design surveys and forms often don’t think about data structures).
It’s difficult to analyze since it requires manual interpretation.
It’s deemed to be accurate since it’s shared by individuals directly; however, it might not be factual because an individual can also provide false information.
It’s ideally owned by the end user sharing it.
It’s also important for brands to make it easy for users to decide what purpose their data is used for, and easily take back whatever data they wish to.
The confusion between first-party and zero-party data
There’s a fair bit of confusion regarding what exactly zero-party data is and how it’s different from first-party data, and as a result, the term ‘zero-party data’ hasn’t seen wide adoption yet.
I’m hoping that the differences mentioned above are helpful, but I’d also like to address the issues from which the confusion stems.
A specific data point can be either first-party or zero-party
Depending on how it was collected, a piece of data can fall under either bucket.
For instance, if a user shares their location with a brand explicitly by inputting their city or country in a form or via another communication channel, that piece of data is essentially zero-party data.
On the other hand, if the brand derives the same piece of data implicitly using the user’s IP address, that’s first-party data.
Similarly, in a B2C context, if a buyer mentions explicitly that their favorite variety of cheese is Cheddar, that’s zero-party data. Conversely, if the brand infers the buyer’s preference based on their past orders, that’s first-party data.
One can differentiate between zero-party data and first-party data by asking, “Where did the data come from?”
Doing so also enables one to understand who should ideally own the data and decide how it’s used.
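To make the “where did the data come from?” test concrete, here is a minimal Python sketch. The names in it (Source, DataType, DataPoint, classify) are my own illustrative assumptions, not part of any particular tool; the point is only that the same value can land in either bucket depending on how it was collected.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """How a data point reached the brand (hypothetical labels)."""
    USER_INPUT = "user_input"  # typed into a form, or sent via email or chat
    TRACKING = "tracking"      # captured or inferred from the user's behavior


class DataType(Enum):
    ZERO_PARTY = "zero-party"
    FIRST_PARTY = "first-party"


@dataclass(frozen=True)
class DataPoint:
    name: str
    value: str
    source: Source


def classify(point: DataPoint) -> DataType:
    """Answer "where did the data come from?" for a single data point."""
    if point.source is Source.USER_INPUT:
        return DataType.ZERO_PARTY
    return DataType.FIRST_PARTY


# The same city, collected in two different ways.
typed_city = DataPoint("city", "Lisbon", Source.USER_INPUT)
ip_city = DataPoint("city", "Lisbon", Source.TRACKING)

print(classify(typed_city).value)  # zero-party
print(classify(ip_city).value)     # first-party
```

Storing the source alongside the value, rather than only the value, is what keeps the distinction (and the ownership question that follows from it) answerable later.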
Providing consent is not the same as sharing data explicitly
By accepting a consent notice, a visitor simply agrees to be tracked by cookies, tags, and analytics tools, which results in the collection of first-party data.
Providing consent to be tracked is not the equivalent of handing over zero-party data via a form or another channel.
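The difference can be sketched in a few lines of Python. Everything here (Visitor, accept_consent_notice, track, submit_form) is a hypothetical model I made up for illustration, and it assumes that form submissions are handled separately from tracking consent. Consent only opens the gate for implicit collection; it puts nothing into the zero-party bucket.

```python
from dataclasses import dataclass, field


@dataclass
class Visitor:
    tracking_consent: bool = False
    first_party_events: list = field(default_factory=list)
    zero_party_answers: dict = field(default_factory=dict)


def accept_consent_notice(visitor):
    """Consent is a permission to track, not a piece of shared data."""
    visitor.tracking_consent = True


def track(visitor, event):
    """Record an implicit event only if the visitor has agreed to tracking."""
    if not visitor.tracking_consent:
        return False
    visitor.first_party_events.append(event)
    return True


def submit_form(visitor, answers):
    """Store details the visitor chose to type in (zero-party data)."""
    visitor.zero_party_answers.update(answers)


visitor = Visitor()
track(visitor, "page_view:/pricing")  # dropped: no consent yet
accept_consent_notice(visitor)
track(visitor, "page_view:/pricing")  # recorded as first-party data

print(visitor.first_party_events)  # ['page_view:/pricing']
print(visitor.zero_party_answers)  # {} (consent alone shared nothing)

submit_form(visitor, {"city": "Lisbon"})
print(visitor.zero_party_answers)  # {'city': 'Lisbon'}
```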
Data acquired from external vendors is third-party data
Data obtained from a reseller or via an enrichment vendor is always third-party data since it’s not collected by the first party, either implicitly or explicitly.
Additionally, second-party data is just a subset of third-party data acquired from an external vendor. In my humble opinion, the term “second party” in the context of data serves no purpose other than that of causing more confusion.
The ownership of first-party and zero-party data
Data ownership is a complicated topic, and privacy regulations like GDPR and CCPA, still in their infancy, can be misunderstood or misinterpreted.
Which party owns what data is also very subjective and can vary based on how data is collected and what level of consent has been given by the audience.
Therefore, what follows is a logical explanation of who should own what data, keeping privacy and the end-user experience in mind.
But first, what does it mean to own the data?
The right to collect and store data doesn’t, by itself, give an entity ownership of that data.
Therefore, data ownership boils down to usage: who gets to decide how certain data is used?
Taking that as a working definition, here’s who should own what data:
First-party data should be owned by the brand that collects it
Zero-party data should be owned by the end user who chooses to share it with a brand
In other words, as long as they adhere to privacy regulations, brands should be able to use implicitly collected data to improve the audience experience without necessarily explaining how the data is used.
Conversely, when it comes to explicit data, the end user should be able to decide what purpose their data is used for and to easily retrieve whatever data they wish to.
The feedback on the previous post really helped me refine some of these ideas, and as a result, I decided to expand upon first-party and zero-party data in this issue and cover their subtypes in the next one.
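If ownership means deciding how data is used, the split above could be modeled roughly like this. It is a sketch under my own assumptions: Profile, ZeroPartyRecord, can_use, and take_back are hypothetical names, and regulatory checks are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class ZeroPartyRecord:
    value: str
    # Purposes the end user has approved for this piece of data.
    allowed_purposes: set = field(default_factory=set)


class Profile:
    def __init__(self):
        self.first_party = {}  # implicitly collected; the brand decides usage
        self.zero_party = {}   # explicitly shared; the end user decides usage

    def can_use(self, key, purpose):
        """Return True if the data under `key` may be used for `purpose`."""
        if key in self.zero_party:
            return purpose in self.zero_party[key].allowed_purposes
        # First-party data: usable by the brand (privacy regulations aside).
        return key in self.first_party

    def take_back(self, key):
        """Let the end user withdraw a piece of zero-party data."""
        return self.zero_party.pop(key, None) is not None


profile = Profile()
profile.first_party["inferred_city"] = "Lisbon"
profile.zero_party["favorite_cheese"] = ZeroPartyRecord(
    "Cheddar", {"recommendations"}
)

print(profile.can_use("inferred_city", "personalization"))    # True
print(profile.can_use("favorite_cheese", "recommendations"))  # True
print(profile.can_use("favorite_cheese", "advertising"))      # False
print(profile.take_back("favorite_cheese"))                   # True
print(profile.can_use("favorite_cheese", "recommendations"))  # False
```

Keeping the approved purposes on the record itself means every use of explicit data has to pass through a check the end user controls.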
So please keep that feedback coming!