This is Part 3 of the series titled Modeling Meaningful Metrics
There seems to be a lot of confusion regarding what qualifies as a metric and what doesn’t. I’ve seen the term ‘metric’ thrown around rather loosely, often to refer to the raw materials used to build metrics. Therefore, in this final part of the series, I’d like to discuss what exactly a metric is and, equally important, what a metric isn’t.
A metric IS a data model
A metric has dependencies on existing data points (events and properties), data models (including entities), or even other metrics – these objects are the raw materials needed to construct metrics.
I had mentioned the above in Part 1 of the series. What I didn’t mention is that in essence, every metric is also a data model – one that may or may not rely on other data models.
However, not every data model needs to be a metric.
I’ve attempted to depict the above in the figure below.
While this sounds confusing, it also underscores the importance and ubiquity of analytical data models that are often invisible to those consuming the data as events, properties, or metrics.
Models and metrics are interdependent yet distinct objects that need to be created and maintained by data teams. As a result, many startups are building products aimed at modeling and maintaining metrics, leading to a new category of tooling currently referred to as the semantic layer (data people are good at many things but naming isn’t one of them).
So, what exactly does a semantic layer do?
In simple terms, a semantic layer helps define a metric using code rather than words.
Doing so makes modeling, maintaining, and reusing metrics a streamlined process, helping avoid confusion and human error. However, the primary challenge today is that every product that aims to be the semantic layer is, in fact, a semantic layer – one of the many opinionated solutions and one that offers a set of unique building blocks to model metrics. While a semantic layer has a lot of merit, using one can result in vendor lock-in. That said, it’s early days for this new category and there’s hope for a universal standard to emerge in the near future. I won’t be going into more detail but wanted to touch upon the semantic layer to highlight the importance of modeling meaningful metrics.
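To make "code rather than words" a little more concrete, here is a minimal sketch in Python. It is not the API of any actual semantic layer product; the MetricDefinition class, the compute function, the user_id field, and the two metric names are all hypothetical and exist only to illustrate the idea of a metric being declared once and then evaluated against event data.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Mapping

Event = Mapping[str, Any]


@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical declarative metric definition (not a real product's API)."""
    name: str
    event: str                     # the event the metric is built from
    aggregation: str               # "count" or "count_distinct"
    distinct_on: str = "user_id"   # assumed field; used only for count_distinct


def compute(metric: MetricDefinition, events: Iterable[Event]) -> int:
    """Evaluate a metric definition against a collection of events."""
    matching = [e for e in events if e["event"] == metric.event]
    if metric.aggregation == "count":
        return len(matching)
    if metric.aggregation == "count_distinct":
        return len({e[metric.distinct_on] for e in matching})
    raise ValueError(f"unsupported aggregation: {metric.aggregation}")


# Two illustrative metrics built on the same event.
scenarios_saved_count = MetricDefinition(
    name="scenarios_saved_count", event="ScenarioSaved", aggregation="count"
)
scenario_savers_count = MetricDefinition(
    name="scenario_savers_count", event="ScenarioSaved", aggregation="count_distinct"
)

sample_events = [
    {"event": "ScenarioSaved", "user_id": "u1"},
    {"event": "ScenarioSaved", "user_id": "u1"},
    {"event": "EmailVerified", "user_id": "u2"},
]

saved_total = compute(scenarios_saved_count, sample_events)
savers_total = compute(scenario_savers_count, sample_events)
```

The point of the sketch is that the definition lives in one place and can be reused, which is roughly what a semantic layer promises, each with its own building blocks.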
An event is NOT a metric
PageViewed is an event; website_views_count is a metric that’s created by counting the number of times the PageViewed event takes place across all website pages. And website_visitor_count is created by counting the number of unique visitors that performed the PageViewed event.
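The difference between the event and the two metrics derived from it can be sketched in a few lines of Python. The sample rows and the user_id field are assumptions made up for illustration; only the event and metric names come from the description above.

```python
# Raw events as they might be tracked; the rows and user_id are illustrative.
events = [
    {"event": "PageViewed", "user_id": "u1", "page_name": "home"},
    {"event": "PageViewed", "user_id": "u1", "page_name": "pricing"},
    {"event": "PageViewed", "user_id": "u2", "page_name": "home"},
    {"event": "ScenarioSaved", "user_id": "u2"},
]

# Keep only the event the metrics are built from.
page_views = [e for e in events if e["event"] == "PageViewed"]

# Metric 1: how many times PageViewed took place across all pages.
website_views_count = len(page_views)

# Metric 2: how many unique visitors performed PageViewed.
website_visitor_count = len({e["user_id"] for e in page_views})
```

With this sample data, website_views_count is 3 while website_visitor_count is 2: the same event, two different metrics.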
I’m pointing this out because I’ve heard people use the terms event and metric interchangeably, which makes me uncomfortable: the two serve different purposes, and both are important to get right.
I must also highlight that the process of tracking an event is completely different from the process of creating a metric. Depending on an organization’s data maturity and team structure, the two activities are carried out by different people on the data team or different teams altogether (Engineering tracks events and Data creates metrics).
A property is NOT a metric
Here’s a quick refresher on the three types of properties:
- An event property provides more context about a particular event: When PageViewed takes place, page_name is an event property that tells us which page was viewed.
- A user property or user attribute provides specific information about a particular user: age, start_date, number_of_orders are some examples.
- An account property (or any other group property) provides specific information about a particular account (team, organization, workspace, and so on): number_of_users, plan_type, number_of_workflows are some examples.
Therefore, all types of properties act as the raw materials needed to model metrics such as total_pageviews_count, total_orders_count, and total_workflows_count.
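As a rough illustration of properties acting as raw materials, the Python sketch below rolls up user and account properties into two of the metrics named above. It assumes that total_orders_count is the sum of number_of_orders across users and that total_workflows_count is the sum of number_of_workflows across accounts; that is one plausible reading, not a definition from this series. The sample records and the user_id and account_id fields are made up.

```python
# User properties (one record per user); values are illustrative.
users = [
    {"user_id": "u1", "age": 34, "number_of_orders": 3},
    {"user_id": "u2", "age": 27, "number_of_orders": 0},
    {"user_id": "u3", "age": 41, "number_of_orders": 5},
]

# Account properties (one record per account); values are illustrative.
accounts = [
    {"account_id": "a1", "plan_type": "free", "number_of_workflows": 2},
    {"account_id": "a2", "plan_type": "paid", "number_of_workflows": 7},
]

# Assumed definitions: each metric sums a property across all records.
total_orders_count = sum(u["number_of_orders"] for u in users)
total_workflows_count = sum(a["number_of_workflows"] for a in accounts)
```

The properties themselves describe a single user or account; the metric only appears once they are aggregated.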
A segment may or may not be a metric
Segments, as you know, are created using visual tools by applying a set of filters on top of a data set. At the same time, a growing number of modern tools can plug into the raw data source (like a warehouse) for one to create data models using SQL – or import them from model repositories – and then build segments on top of those models using a visual querying interface.
Now, technically speaking, a segment is also a data model, and like any data model, a segment might double up as a metric – or not – depending on its purpose.
Let me refer to Integromat’s definition of an activated account:
is_activated_account = TRUE if active_scenario_count > 0
Here, is_activated_account is a metric as well as a segment, one that can be easily created using a visual interface (offered by a variety of martech and datatech tools) that sits on top of customer data.
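The dual nature of is_activated_account can be sketched in Python as follows. The rule itself comes from the definition above; the sample accounts, the account_id field, and the activated_accounts_count name are assumptions added for illustration.

```python
def is_activated_account(account: dict) -> bool:
    """TRUE if the account has at least one active scenario."""
    return account["active_scenario_count"] > 0


# Illustrative account records.
accounts = [
    {"account_id": "a1", "active_scenario_count": 0},
    {"account_id": "a2", "active_scenario_count": 3},
    {"account_id": "a3", "active_scenario_count": 1},
]

# As a segment: the set of accounts that satisfy the condition.
activated_segment = [a["account_id"] for a in accounts if is_activated_account(a)]

# As a metric: a number derived from the same condition for reporting.
activated_accounts_count = len(activated_segment)
```

The same condition serves both purposes: the list is what an activation tool would act on, and the count is what a report would show.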
On the other hand, if the purpose of a segment is to trigger emails based on a set of conditions, then that particular segment is not exactly a metric (even though it’s technically possible to create a metric out of it).
Let’s take a look at an example from Integromat. New users who saved a scenario but didn’t activate it would enter a segment that had the following conditions:
Event EmailVerified is performed within the last 30 days
AND
Event ScenarioSaved is performed
WITH
Event Property is_active_scenario = false
AND
Organization Property active_scenarios_count = 0
AND
User Property user_organization_role = Owner OR Admin
AND
User Property unsubscribe_onboarding = false OR does not exist
Is it possible to create a metric out of this segment? Absolutely.
Does it make sense to create a metric out of this segment? Not at all.
Defining the metric is not only difficult, it is also utterly useless because I have no idea what I’d do with the number of users who met these conditions at any given point in time.
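For reference, the segment conditions listed above can be sketched as a predicate in Python. This is only an approximation of how a visual segment builder might evaluate them: the shape of the user record (email_verified_at, scenario_saved_events, a nested organization) is an assumption of mine, while the property names and values come from the conditions themselves.

```python
from datetime import datetime, timedelta


def in_onboarding_segment(user: dict, now: datetime) -> bool:
    """Return True if the user meets every condition of the segment."""
    # Event EmailVerified is performed within the last 30 days.
    verified_at = user.get("email_verified_at")
    verified_recently = (
        verified_at is not None
        and timedelta(0) <= now - verified_at <= timedelta(days=30)
    )
    # Event ScenarioSaved is performed WITH is_active_scenario = false.
    saved_inactive = any(
        e["is_active_scenario"] is False
        for e in user.get("scenario_saved_events", [])
    )
    # Organization Property active_scenarios_count = 0.
    no_active_scenarios = user["organization"]["active_scenarios_count"] == 0
    # User Property user_organization_role = Owner OR Admin.
    is_owner_or_admin = user.get("user_organization_role") in {"Owner", "Admin"}
    # User Property unsubscribe_onboarding = false OR does not exist.
    not_unsubscribed = not user.get("unsubscribe_onboarding", False)
    return all([verified_recently, saved_inactive, no_active_scenarios,
                is_owner_or_admin, not_unsubscribed])


now = datetime(2024, 1, 31)
sample_users = [
    {   # meets every condition
        "email_verified_at": datetime(2024, 1, 21),
        "scenario_saved_events": [{"is_active_scenario": False}],
        "organization": {"active_scenarios_count": 0},
        "user_organization_role": "Owner",
    },
    {   # same as above, but the role does not qualify
        "email_verified_at": datetime(2024, 1, 21),
        "scenario_saved_events": [{"is_active_scenario": False}],
        "organization": {"active_scenarios_count": 0},
        "user_organization_role": "Member",
        "unsubscribe_onboarding": False,
    },
]

segment_members = [u for u in sample_users if in_onboarding_segment(u, now)]
```

Counting segment_members would technically produce a metric, but the predicate is all the email tool needs.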
To reiterate, while it’s possible to turn a segment (or a model) into a metric, it only makes sense to do so when there’s a need for that metric for reporting purposes. For other purposes like activation (or machine learning), all you need are well-defined analytical data models – which, by the way, as I mentioned above, can also be used to create segments.
Conduct this exercise 🏋️
Create a reference doc by listing down all the segments that you’ve created or intend to create in your activation tools. Make sure to list down the conditions in as much detail as you can along with an explanation of the conditions in simple terms.
If you already maintain such a doc, great; if you don’t, you’ll be amazed to find out how much this practice can help clarify your thinking, create better segments, and ultimately, define and model meaningful metrics.