Burning Questions, Answered: What data is needed?

Burning Questions, Answered: Part 2

Arpit Choudhury

Created :

June 5, 2024

Created :

May 15, 2024

Updated :

June 17, 2024

This is part 2 of a 5-part series titled Burning Questions, Answered. Make sure to read part 1 before proceeding.

Figuring out what data is needed to answer burning questions is an inherently collaborative process – Growth needs to involve Data or Engineering to figure out what data is needed either to arrive at a satisfactory answer or to run a data-powered experiment.

However, a keen understanding of the product, the underlying data model powering the app, and the org’s data infrastructure is empowering for a person hungry for growth. It enables them to have better conversations with their data and engineering counterparts and move fast without compromising on data quality.

Now, let’s use the same burning question we used earlier to illustrate the process of figuring out what data is needed to get to a satisfactory answer:

“We’re acquiring a ton of users every day but very few end up hitting the activation milestone; what’s preventing the rest from performing the actions leading to activation?”

This also happens to be one of the burning questions I had during my time at Integromat. It was frustrating that we were acquiring close to a thousand users every day, but fewer than 5% of those users were becoming activated, and we didn’t have the data to figure out why.

I must add that the path to the activation milestone wasn’t straightforward due to the nature of the product. It had a relatively steep learning curve, particularly for users who were used to a product like Zapier, which was a lot easier to use albeit a lot less powerful. Zapier offered a linear, step-by-step process to build workflows (it still does), whereas Integromat’s users would land on a blank canvas with endless possibilities.

This approach was rather novel at the time, but as a result, only a small number of users we acquired were able to create and activate (or turn on) their first workflow and hit the coveted activation milestone. On Integromat, workflows were called scenarios, a term I’ll be using quite often going forward.

The task at hand was to identify where in the onboarding journey users were getting stuck and dropping off, and then figure out ways to fix that to ultimately increase the activation rate (one of the core metrics for SaaS businesses).

Account-level Data

SaaS companies measure activation at the account or organization level rather than the user level, and depending on how one defines an activated account, it can take more than one user for an account to hit the activation milestone.

Here’s the formula to calculate the activation rate:

Activation rate = number of activated accounts / number of confirmed accounts

A confirmed account is one that has completed the account creation process and whose details have been verified to ensure that the account belongs to a legitimate entity – a person or an organization.
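To make the formula concrete, here’s a minimal Python sketch. The `Account` record and its `confirmed` and `activated` flags are hypothetical names used for illustration, and the sketch assumes that only confirmed accounts can count as activated.

```python
from dataclasses import dataclass


@dataclass
class Account:
    account_id: str
    confirmed: bool   # completed account creation and passed verification
    activated: bool   # hit the activation milestone


def activation_rate(accounts: list[Account]) -> float:
    """Activated accounts divided by confirmed accounts (0.0 if none are confirmed)."""
    confirmed = [a for a in accounts if a.confirmed]
    if not confirmed:
        return 0.0
    # Assumption: an unconfirmed account never counts toward the numerator.
    activated = [a for a in confirmed if a.activated]
    return len(activated) / len(confirmed)


accounts = [
    Account("a1", confirmed=True, activated=True),
    Account("a2", confirmed=True, activated=False),
    Account("a3", confirmed=True, activated=False),
    Account("a4", confirmed=True, activated=False),
    Account("a5", confirmed=False, activated=False),  # excluded from the denominator
]

print(f"{activation_rate(accounts):.0%}")  # 25%
```

The unconfirmed account is left out of the denominator, so one activated account out of four confirmed accounts gives 25%.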

In Integromat’s case, even though activation was measured at the account level and multiple users could work together inside an account, creating and running a scenario (or workflow) didn’t necessitate collaboration – an account owner could build a scenario on their own and their account would be considered activated. At the time, our goal was to get every new user to create their first scenario on Integromat. This included new users who joined an existing account that had already hit the activation milestone.
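One way to picture the difference between account-level activation and the per-user goal is the rough sketch below. The event names (`scenario_created`, `scenario_activated`) and the record fields are assumptions for illustration, not Integromat’s actual schema. Each event carries an `account_id` because the same user can belong to more than one account.

```python
from collections import defaultdict

# Hypothetical event records; names and fields are illustrative only.
events = [
    {"event": "scenario_created", "user_id": "u1", "account_id": "acme"},
    {"event": "scenario_activated", "user_id": "u1", "account_id": "acme"},
    {"event": "scenario_created", "user_id": "u2", "account_id": "acme"},
    {"event": "scenario_created", "user_id": "u1", "account_id": "globex"},
]


def activated_accounts(events: list[dict]) -> set[str]:
    """Accounts where at least one user both created and activated a scenario."""
    seen = defaultdict(set)  # (account_id, user_id) -> event names seen
    for e in events:
        seen[(e["account_id"], e["user_id"])].add(e["event"])
    required = {"scenario_created", "scenario_activated"}
    return {account for (account, _user), names in seen.items() if required <= names}


def users_without_scenario(events: list[dict], user_ids: list[str]) -> set[str]:
    """Users who have not created a scenario in any account yet."""
    creators = {e["user_id"] for e in events if e["event"] == "scenario_created"}
    return set(user_ids) - creators


print(activated_accounts(events))                          # {'acme'}
print(users_without_scenario(events, ["u1", "u2", "u3"]))  # {'u3'}
```

Here a single user is enough to activate the `acme` account, while `u3` still shows up as someone who has yet to create a first scenario, even if the account they joined is already activated.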

*Typical for B2B SaaS products, a user can be part of multiple accounts*

However, and this is something we figured out later, a significant chunk of account owners did not actively use the product themselves but had someone else build scenarios on their behalf. This insight helped us optimize our onboarding and cancellation flows, eventually leading to higher activation and retention rates (we would proactively reach out to account owners who had canceled a paid plan because the primary user who built and maintained the scenarios was either a past employee or a freelancer who was no longer available).

Deciding what to track

To answer my burning question regarding our low activation rate at Integromat, I listed a handful of events that would help me get a clear picture of where users were dropping off on the path to activation.
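As a rough illustration of how a handful of events can reveal drop-off, here’s a minimal funnel sketch in Python. The step names in `FUNNEL`, the `ts` timestamp field, and the sample data are all hypothetical; the actual events tracked at Integromat are not listed here.

```python
# Hypothetical funnel steps on the path to activation, in order.
FUNNEL = ["signed_up", "scenario_created", "scenario_activated"]

events = [
    {"user_id": "u1", "event": "signed_up", "ts": 1},
    {"user_id": "u2", "event": "signed_up", "ts": 1},
    {"user_id": "u1", "event": "scenario_created", "ts": 2},
    {"user_id": "u3", "event": "signed_up", "ts": 2},
    {"user_id": "u1", "event": "scenario_activated", "ts": 3},
    {"user_id": "u4", "event": "signed_up", "ts": 3},
    {"user_id": "u2", "event": "scenario_created", "ts": 4},
]


def funnel_counts(events: list[dict], steps: list[str] = FUNNEL) -> list[int]:
    """Count users reaching each step, requiring the steps to happen in order."""
    progress = {}  # user_id -> number of funnel steps completed so far
    for e in sorted(events, key=lambda e: e["ts"]):
        done = progress.get(e["user_id"], 0)
        if done < len(steps) and e["event"] == steps[done]:
            progress[e["user_id"]] = done + 1
    return [sum(1 for done in progress.values() if done > i) for i in range(len(steps))]


counts = funnel_counts(events)
for step, count in zip(FUNNEL, counts):
    print(f"{step}: {count} users")
# signed_up: 4 users
# scenario_created: 2 users
# scenario_activated: 1 users
```

The largest gap between two adjacent steps points to where users are getting stuck, which is the place to dig deeper.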

Deciding which events to track requires a deep understanding of the product, not only in terms of the features and the user experience but also in terms of how the product handles usage on the backend – one needs to know what is being captured, what cannot be captured due to the limitations of the tech stack used, and what happens when users do something unexpected.

That’s not all, though. Several other considerations usually come up when setting up event tracking; here are some that you might come across as you go through this process:

Should this event be tracked on the client-side (frontend) or the server-side (backend)?
Do we need this data as an event or is it better to collect the required data as a user property instead? Or should it be an account property since the user is part of an account? Or do we need the data both as an event and as an account property?
Do we even need this data in the first place? Will it help answer our questions or run more granular experiments?
We certainly need this data as an event to build funnel reports and trigger event-based emails, but we’re yet to figure out the event properties that will help us gather deeper insights and run more personalized experiments.
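One way to keep these considerations in front of the team is to record the answers alongside each data point in a tracking plan. The sketch below is only one possible shape for such an entry; `TrackingPlanEntry`, its fields, and the example values are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Source(Enum):
    CLIENT = "client-side"
    SERVER = "server-side"


class Kind(Enum):
    EVENT = "event"
    USER_PROPERTY = "user property"
    ACCOUNT_PROPERTY = "account property"


@dataclass
class TrackingPlanEntry:
    name: str
    kinds: set[Kind]   # a data point can be both an event and a property
    source: Source     # where the data is captured
    answers: str       # the question or experiment this data supports
    properties: list[str] = field(default_factory=list)  # can stay empty until decided

    def __post_init__(self):
        # Refuse entries that nobody can tie to a question or experiment.
        if not self.answers.strip():
            raise ValueError(f"{self.name}: no question or experiment needs this data")


entry = TrackingPlanEntry(
    name="scenario_activated",
    kinds={Kind.EVENT, Kind.ACCOUNT_PROPERTY},
    source=Source.SERVER,
    answers="Where do users drop off on the path to activation?",
)
print(entry.name, sorted(k.value for k in entry.kinds), entry.source.value)
```

Making `answers` mandatory forces the "do we even need this data?" conversation to happen before anything is instrumented, while `properties` can be filled in later once the team knows which event properties it needs.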

With so many considerations, it becomes crucial to only collect the data needed to answer burning questions and to run data-powered experiments. It’s good to keep in mind (and remind others) that most use cases don’t necessitate collecting a ton of different data points.

*Deciding what to track and in what format*

Moreover, collecting more data than necessary is never helpful and, more often than not, slows teams down because they need to spend additional time figuring out what to use and what to ignore. Contextless data collection is always a bad idea.

The practice of data minimization is particularly applicable to organizations where teams are intrinsically motivated to drive growth – where people want to do the real work and make small improvements to the product experience every day. Unfortunately, it’s hard to find such growth-obsessed individuals – or growth mavericks, as I like to call them – and it’s even harder to foster an environment where growth mavericks thrive; a topic we’ll discuss in the next chapter.

Next, once you figure out what data is needed to answer a burning question, you need to find out where the data originates – in your own environment or that of an external service. Doing so not only makes the collection process easier but also helps figure out whether the data is already being collected or not.

Move on to part 3, which covers the internal sources where data originates.