This is part 2 of a 4-part series titled Understanding Zero-party Data.
In the first part of this series, I described the characteristics of first-party data and zero-party data, and touched on what causes confusion between the two and which party owns what data.
Part two here digs deeper into the following subtypes of first-party data and zero-party data:
- First-party data: event data, entity data, and identity data
- Zero-party data: entity data and identity data
Why bother learning about these subtypes?
Well, understanding the differences between them helps answer the following questions:
- Which types of data do we really need?
- What tools do we use to collect and store each type of data?
- Should we collect certain data points implicitly or explicitly?
- How do we make each type of data actionable?
Let’s dive in.
First-party data: Subtypes
As a reminder, data collected by an organization or brand from a user implicitly is first-party data, which can be further split into three types: event data, entity data, and identity data.
Event data
Also referred to as behavioral data or product-usage data, event data helps you understand how a user navigates through a product, where they get stuck, which features they use or don’t use, and at what point they become activated.
Every user action — a click, a tap, or a swipe — can be recorded as an event, along with additional data points called properties that provide more context about that event.
For example, if a logged-in user visits a page inside an app, the event Page_Viewed, along with properties such as user_id, timestamp, and page_title, helps answer the following:
- Which users viewed a certain page or screen
- How many times a certain page was viewed
- The exact times when a certain page was viewed
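To make this concrete, here is a minimal Python sketch of how Page_Viewed events and their properties could answer those three questions. The Event class and the sample records are assumptions made up for illustration; they are not the schema of any particular analytics tool.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    name: str            # e.g. "Page_Viewed"
    user_id: str         # property: who triggered the event
    timestamp: datetime  # property: when it happened
    page_title: str      # property: which page was viewed


# Hypothetical sample events, invented for this example.
events = [
    Event("Page_Viewed", "u1", datetime(2023, 1, 5, 9, 0, tzinfo=timezone.utc), "Pricing"),
    Event("Page_Viewed", "u2", datetime(2023, 1, 5, 9, 5, tzinfo=timezone.utc), "Pricing"),
    Event("Page_Viewed", "u1", datetime(2023, 1, 6, 14, 30, tzinfo=timezone.utc), "Pricing"),
    Event("Page_Viewed", "u2", datetime(2023, 1, 6, 15, 0, tzinfo=timezone.utc), "Docs"),
]

page_views = [e for e in events if e.name == "Page_Viewed"]

# Which users viewed a certain page
viewers_of_pricing = {e.user_id for e in page_views if e.page_title == "Pricing"}

# How many times each page was viewed
views_per_page = Counter(e.page_title for e in page_views)

# The exact times when a certain page was viewed
pricing_view_times = sorted(e.timestamp for e in page_views if e.page_title == "Pricing")
```

Because every event carries the same properties, each question turns into a simple filter or count over the event list.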
Event data is the most common subtype of first-party data and has become table stakes for both B2B and B2C brands to understand user behavior. If you’d like to get a deeper understanding of the components of event data, check out this guide.
Confusion alert: Sometimes, codeless event tracking is referred to as implicit tracking and instrumenting events via code is called explicit tracking. Either way, users don’t share this data intentionally or explicitly, and therefore, irrespective of the technology used, event data falls under first-party data.
Entity data (implicit)
Any piece of data containing some information or trait about an entity — an individual or group — falls under entity data. And when this data is collected implicitly, it falls under first-party entity data.
User is the most common entity that brands interact with and collect data about. The User table comprises columns to store user properties such as name, email, gender, occupation, is_active, is_customer, and so on.
B2B Context
In B2B, Account is a group entity that represents the customer (an organization) and stores data points associated with an entire account and not just a user who belongs to that account. Common account properties include account_name, account_status, is_customer, is_partner, number_of_users, subscription_type, and so on.
On the other hand, an individual can be a lead, a user, a partner, a billing admin, or a combination of these, so it’s useful to create an entity table such as Person that stores every piece of data pertaining to an individual.
Based on their actions, an individual can belong to one or more audiences as follows:
- A person who subscribes to the blog is a subscriber or lead
- A person who creates a free account is a user who belongs to an account where subscription_type is free
- A person who upgrades a free account can be a user or a billing_contact of a paid account
- A person who signs up for the partner program is an evangelist and a user of an account where is_partner is true
- And so on
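The audience rules above can be sketched in Python as follows. The field names is_blog_subscriber and is_billing_contact are hypothetical flags I introduced for the example, and treating anyone attached to an account as a user is a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    account_name: str
    subscription_type: str = "free"   # "free" or "paid"
    is_partner: bool = False


@dataclass
class Person:
    email: str
    is_blog_subscriber: bool = False    # hypothetical flag
    is_billing_contact: bool = False    # hypothetical flag
    account: Optional[Account] = None   # None until the person creates or joins an account


def audiences(person: Person) -> set:
    """Return every audience a person belongs to, following the rules listed above."""
    result = set()
    if person.is_blog_subscriber:
        result.add("subscriber")
    if person.account is not None:
        result.add("user")  # simplification: anyone attached to an account counts as a user
        if person.is_billing_contact and person.account.subscription_type == "paid":
            result.add("billing_contact")
        if person.account.is_partner:
            result.add("evangelist")
    return result


acme = Account("Acme", subscription_type="paid", is_partner=True)
jane = Person("jane@example.com", is_blog_subscriber=True,
              is_billing_contact=True, account=acme)
reader = Person("reader@example.com", is_blog_subscriber=True)
```

Here audiences(jane) contains all four audiences, while audiences(reader) contains only "subscriber", which is the point of keeping one Person record per individual.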
A combination of a person property such as number_of_docs (collaboration tool) and an account property like number_of_users helps B2B brands determine who their most active users are and which accounts to upsell to. If you’d like to dig deeper into entity data in the context of B2B SaaS, this guide is what you need.
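As a rough sketch of that combination, the Python below ranks people by number_of_docs and flags free accounts that look ready for an upsell. The two thresholds and the sample records are assumptions invented for illustration; a real team would tune them against its own data.

```python
from dataclasses import dataclass


@dataclass
class AccountStats:
    account_name: str
    subscription_type: str
    number_of_users: int


@dataclass
class PersonStats:
    email: str
    account_name: str
    number_of_docs: int


# Hypothetical thresholds, chosen only for this example.
ACTIVE_DOCS_THRESHOLD = 20
UPSELL_USERS_THRESHOLD = 5


def most_active_users(people, limit=3):
    """People ranked by number_of_docs, highest first."""
    return sorted(people, key=lambda p: p.number_of_docs, reverse=True)[:limit]


def upsell_candidates(accounts, people):
    """Free accounts with enough users and at least one highly active person."""
    active_accounts = {
        p.account_name for p in people if p.number_of_docs >= ACTIVE_DOCS_THRESHOLD
    }
    return [
        a.account_name
        for a in accounts
        if a.subscription_type == "free"
        and a.number_of_users >= UPSELL_USERS_THRESHOLD
        and a.account_name in active_accounts
    ]


accounts = [
    AccountStats("Acme", "free", 8),
    AccountStats("Globex", "free", 2),
    AccountStats("Initech", "paid", 12),
]
people = [
    PersonStats("a@acme.example", "Acme", 35),
    PersonStats("b@globex.example", "Globex", 40),
    PersonStats("c@initech.example", "Initech", 10),
]
```

Neither property is enough on its own: Globex has the most active person but too few users, and Initech has plenty of users but is already paid.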
P.S. It’s useful to note that entities such as Account and Person are referred to as objects inside a CRM.
B2C Context
An e-commerce store needs to maintain a record of the products viewed by a user in order to offer recommendations, forecast demand, or run remarketing campaigns. This piece of entity data (the user being the entity) can be stored in the products_viewed column of the user table.
Marketplaces deal with an additional entity — Merchant (or Seller) whose details and attributes such as merchant_category, product_categories, and serviceable_pincodes are stored in the Merchant table.
Additionally, every unique product (SKU) that a store sells is also an entity, and the feedback or reviews for that product are stored in the table dedicated to that SKU. This is what enables marketplaces like Amazon to display consolidated reviews of a product even though that product is sold by multiple merchants (who may or may not serve the same pin codes).
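One way to picture this is to key reviews by SKU rather than by merchant. The Python sketch below is an assumption about how such data might be shaped, not a description of how Amazon or any other marketplace stores it.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Review:
    sku: str          # the product entity the review belongs to
    merchant_id: str  # the merchant that fulfilled the order
    rating: int       # 1 to 5


# Hypothetical reviews, invented for this example.
reviews = [
    Review("SKU-1", "m1", 5),
    Review("SKU-1", "m2", 3),
    Review("SKU-1", "m2", 4),
    Review("SKU-2", "m1", 2),
]

# Group ratings by SKU, ignoring which merchant sold the item.
ratings_by_sku = defaultdict(list)
for review in reviews:
    ratings_by_sku[review.sku].append(review.rating)

# For each SKU: (number of reviews, average rating)
consolidated = {sku: (len(r), mean(r)) for sku, r in ratings_by_sku.items()}
```

Since the SKU is the entity, the merchant becomes just another attribute of each review and the consolidated view falls out of a single grouping step.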
Identity data (implicit)
Derived from the user's browser or device or fetched using data enrichment tools, first-party identity data helps identify who the user is.
When I visit a brand's website, my IP address is collected automatically. I cannot choose not to share it; all I can do is use a VPN or a proxy to present a different IP address. The brand can derive my approximate location from my IP address to offer me a personalized experience or, for that matter, to block me from accessing its product altogether.
Other identity attributes that can be derived implicitly include a user’s system preferences (such as light or dark mode), OS (Windows, macOS, iOS, or Android), as well as a host of demographic and firmographic data points that are made available by enrichment tools.
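Here is a minimal Python sketch of deriving two such attributes, region and OS, from what a browser sends with every request. The IP-to-region table is hypothetical and uses reserved documentation IP ranges; a real system would query a geolocation database or service. The user-agent matching is deliberately naive.

```python
import ipaddress

# Hypothetical IP-to-region table built on reserved documentation ranges (RFC 5737).
REGION_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "IN",
    ipaddress.ip_network("198.51.100.0/24"): "US",
}


def region_from_ip(ip: str) -> str:
    """Look up the region for an IP address, or "unknown" if no network matches."""
    address = ipaddress.ip_address(ip)
    for network, region in REGION_BY_NETWORK.items():
        if address in network:
            return region
    return "unknown"


def os_from_user_agent(user_agent: str) -> str:
    """Naive substring matching. Order matters because iOS user agents also
    mention "Mac OS X" and Android user agents also mention "Linux"."""
    ua = user_agent.lower()
    if "iphone" in ua or "ipad" in ua:
        return "iOS"
    if "android" in ua:
        return "Android"
    if "windows" in ua:
        return "Windows"
    if "mac os x" in ua or "macintosh" in ua:
        return "macOS"
    return "unknown"


def implicit_identity(ip: str, user_agent: str) -> dict:
    """Identity attributes the user never typed in but the request reveals."""
    return {"region": region_from_ip(ip), "os": os_from_user_agent(user_agent)}
```

For instance, implicit_identity("192.0.2.10", "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X)") yields a region of "IN" and an OS of "iOS" without the user sharing anything explicitly.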
Enrichment tools can also fetch one’s name and email address, but this data is never 100% accurate or up-to-date.
First-party data, therefore, has a lot of applications and is a vital component for brands to build a strong audience data strategy.
Zero-party data: Subtypes
As a reminder, data collected by an organization or brand from a user explicitly is zero-party data, which can also be split into entity data and identity data.
I want to focus on the individual user or Person entity but the same concepts apply to group entities like Account or Merchant. Also, the whole idea behind zero-party data is to empower the end user and enable them to share data that they wish to — even when the user represents an organization.
Entity data (explicit)
Zero-party entity data refers to data individuals share about themselves or their organization to help brands offer personalized content and experiences.
This includes preferences, professional info, and demographics — essentially anything other than personally identifiable information (PII).
Adopting the following is table stakes for both B2B and B2C brands that are serious about personalization:
- An onboarding survey asking new users what they intend to use the product for or what their goals are, as well as professional (B2B) and personal (B2C) info
- An email preference center where users can choose the types of emails they’d like to subscribe to (blog, product updates, surveys, offers, etc.), specify a delivery cadence (weekly, bi-weekly, or monthly), and even pause their subscriptions (for, say, 30 or 60 days)
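To illustrate the preference center, here is a small Python sketch of the data it might store and the check a brand could run before sending an email. The class and method names are hypothetical, and cadence is stored but not enforced in this example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional


@dataclass
class EmailPreferences:
    topics: set = field(default_factory=set)  # e.g. {"blog", "offers"}
    cadence: str = "weekly"                   # "weekly", "bi-weekly", or "monthly"
    paused_until: Optional[date] = None       # None means the user has not paused emails

    def pause(self, days: int, today: date) -> None:
        """Pause all emails for the given number of days, starting today."""
        self.paused_until = today + timedelta(days=days)

    def wants(self, topic: str, today: date) -> bool:
        """True if the user subscribed to the topic and is not currently paused."""
        if self.paused_until is not None and today < self.paused_until:
            return False
        return topic in self.topics


prefs = EmailPreferences(topics={"blog", "product updates"}, cadence="monthly")
```

Every value in this record was chosen by the user, which is what makes it zero-party entity data rather than something inferred from behavior.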
These are the easiest tools to collect zero-party entity data since users generally understand that brands need this data to offer a better, tailored experience. Users are more open to sharing this data as they also understand that not doing so results in a less-personalized experience for them.
It’s also useful to keep in mind that part of the anonymized data that ad platforms make available to advertisers for ad targeting is entity data that individuals have shared explicitly.
In fact, in its early days, Facebook was all about gathering zero-party data from its users to better understand their likes and dislikes. Similarly, by searching for answers on Google, we’re essentially letting Google know what we’re looking for and providing fuel for its advertising machine.
I believe the day isn’t far when brands of all sizes make it dead simple for individuals to tell them what they’re looking for. As a result, the buyer gets exactly what they want and brands get more accurate data — a win-win scenario that makes everyone’s life easier.
Identity data (explicit)
Brands collect personally identifiable information like name and email explicitly when they ask users to input this data before giving them access to a product or service.
Additional PII, such as a mailing address or phone number, is collected at later touchpoints to either complete a transaction or conduct a verification.
Here are a couple of things worth keeping in mind pertaining to zero-party identity data:
- Data points like name and email fetched using enrichment tools fall under third-party identity data instead of zero-party or first-party data because the data is acquired from an external vendor.
- Brands often combine zero-party, first-party, and third-party identity data for the purpose of identity resolution.
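As a rough illustration of combining the three sources, the Python sketch below merges identity attributes and records where each value came from. The precedence order (zero-party first, then first-party, then third-party) is my assumption for the example. Real identity resolution also has to match records that belong to the same person, which this sketch skips.

```python
# Assumed precedence: data the user typed in beats data observed implicitly,
# which beats data acquired from a vendor. This ordering is an illustration only.
PRECEDENCE = ("zero_party", "first_party", "third_party")


def resolve_identity(records: dict) -> dict:
    """Merge identity attributes, keeping the value from the highest-precedence source."""
    profile = {}
    for source in PRECEDENCE:
        for attribute, value in records.get(source, {}).items():
            # Skip empty values and attributes already filled by a better source.
            if value and attribute not in profile:
                profile[attribute] = {"value": value, "source": source}
    return profile


# Hypothetical records for a single person, invented for this example.
records = {
    "zero_party": {"name": "Jane Doe", "email": "jane@example.com"},
    "first_party": {"os": "macOS", "region": "IN"},
    "third_party": {"email": "j.doe@old-employer.example", "company": "Acme"},
}
profile = resolve_identity(records)
```

Keeping the source next to each value makes it easy to see that the email the user entered wins over the possibly outdated one from the enrichment vendor.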
And here’s a summary of the subtypes of zero-party data:
- Zero-party identity data enables brands to identify individuals and build identity resolution algorithms
- Zero-party entity data enables brands to deliver tailored content and build personalized experiences
Conclusion and what’s next
While first-party data has cemented its position as a prerequisite for brands that are serious about using data to their advantage or to deliver tailored experiences, zero-party data is still in its infancy.
That said, zero-party data also presents an opportunity for brands to build and nurture better relationships with their audiences. And since there are only so many brands an individual would want to build relationships with, a well-executed zero-party data strategy can become a huge moat for brands.
The next part of this series covers the use cases and the personalization benefits of zero-party data.