The Modern Data Stack Is Not An End State

The finish line is an illusion.

Arpit Choudhury

Created :

June 16, 2023

Created :

March 16, 2022

Updated :

April 30, 2024


A car has four essential components: the chassis, the engine, the transmission system, and the body. Then there are auxiliary parts, such as indicators and air-conditioning, that a car can run without but that it isn’t practical to drive without.

The modern data stack is not like a car and shouldn’t be compared to one: there isn’t, and shouldn’t be, a fixed set of components that must be put together before one can get from point A to point B with a data stack.

A data stack shouldn’t be considered less modern if it consists of CDI (customer data infrastructure) + product analytics + data activation. Similarly, data warehouse + ELT + transformation + BI shouldn’t be seen as a silver bullet that can solve all data woes.

My point is that there is no one-size-fits-all rule for building a modern data stack, and building one is not a one-time activity with a finish line.

An ongoing, iterative process

Organizations of different sizes and industries have distinct requirements, so what counts as a solution modern enough to meet the needs of their various teams will differ. And as those needs are met, new ones will arise that will probably require tools and technologies that seemed redundant earlier.

Moreover, as an organization’s data infrastructure matures and calls for optimization, existing solutions might need to be stripped away to make room for the new.

Data teams should treat building a data stack like building a product: as an ongoing, iterative process.

The fastest way to fulfill a need

There’s so much data being generated by organizations that their data tooling is no good unless it enables teams to get answers from data and act upon them quickly.

Product, growth, and other GTM teams couldn’t care less about the seven different tools that data teams might want in order to build the ideal version of the modern data stack — they only care about getting users to use their products, derive enough value to pay for them, then use the products some more, and keep deriving value to keep paying for them.

To make this happen, GTM teams need to go beyond deriving insights from reports and need to activate or take action on the data. And for GTM teams to activate data, data teams need to make accurate customer data available in the tools GTM teams use to build customer experiences across touchpoints.

Going from zero to one

Technology startups need a strong foundation for customer data infrastructure — as long as they’re able to collect, store, analyze, and activate data, and do it sooner rather than later, it doesn’t matter whether they use an all-in-one solution or purpose-built ones.

Whatever the toolkit, it can be extremely rewarding for GTM teams to understand what data is collected, how it’s collected, how it’s made available in the tools they use to derive insights and drive action, and what the process is to collect more data if they need to.

This is not to say that GTM teams should do what data teams are meant to do — understanding the process, however, only makes GTM teams more intentional about utilizing available data and requesting new data.

On the other hand, data teams have higher motivation to serve the needs of their GTM counterparts when they (the data people) understand how the data they’re making available is being used to improve the customer experience and how the data impacts not just decision-making, but the overall health of the business.

Moving from one to beyond

For companies that are already reaping the benefits of a sound foundation for customer data, it becomes crucial to streamline the process of collecting new data points and making them available everywhere they are consumed.

At the same time, teams should be equipped with the necessary tooling and resources to quickly find out what data is available, in what form it is accessible, and where it can be accessed. They should also be able to bring the data into the tools where they wish to consume it to build data-powered experiences.

Data teams at such companies can focus their efforts on the scalability and interoperability of the data stack, as well as on the evaluation of tools that help maintain the infrastructure.

For instance, they need tools to test and monitor data quality to ensure the accuracy, freshness, and completeness of data, as well as to be notified when data pipelines break and affect downstream systems. The whole process is not trivial and data teams might even need more than one tool to tackle various data quality issues.
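To make the freshness and completeness checks a little more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a recommendation of any particular tool: the function name `check_quality`, the `loaded_at` timestamp field, and the sample rows are all hypothetical, and accuracy checks (which usually need business-specific rules) are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class QualityReport:
    fresh: bool          # True if the newest row is within the allowed age
    completeness: float  # share of rows where every required field is filled


def check_quality(rows, required_fields, max_age, now=None,
                  timestamp_field="loaded_at"):
    """Run a freshness and a completeness check over a list of dict rows.

    `timestamp_field` is an assumed column holding a timezone-aware
    datetime that records when each row was loaded.
    """
    now = now or datetime.now(timezone.utc)
    if not rows:
        # An empty table is treated as stale and incomplete.
        return QualityReport(fresh=False, completeness=0.0)

    # Freshness: how old is the most recently loaded row?
    newest = max(row[timestamp_field] for row in rows)
    fresh = (now - newest) <= max_age

    # Completeness: count rows with no missing or empty required fields.
    complete_rows = sum(
        1 for row in rows
        if all(row.get(field) not in (None, "") for field in required_fields)
    )
    return QualityReport(fresh=fresh, completeness=complete_rows / len(rows))
```

A data team could run something like this on a schedule and send a notification when `fresh` is false or `completeness` drops below a threshold it has agreed on with the teams consuming the data.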

Modern is whatever works best for you

Building a modern data stack is not an end state — it’s an iterative process with no predefined rules in terms of what tools make a data stack modern.

What is modern for one company might be overkill for another — companies should define their ideal version of the modern data stack and then invest heavily in proper implementation and enablement for teams to derive value from tools.

At the end of the day, tools exist to help people solve pressing problems in the most efficient manner.

The original version of this article was published on the Amplitude blog.