Warehouse-native Apps Explained
SaaS tools built on the data warehouse, also known as Connected Apps
This is a topic I’ve been fascinated by, and while I’m no expert, I intend to share what I’ve learned so far from conversations with folks who are either enabling or utilizing this new paradigm.
Please scroll to the end if you’re already familiar with warehouse-native or connected apps.
A warehouse-native app is any SaaS tool that can run on top of a customer’s data warehouse, without the need to ingest and store any data.
In other words, a warehouse-native app vendor enables its customers to bring their own data warehouse to the vendor’s application which then performs tasks using the customer’s data.
Unlike with traditional SaaS solutions, the data doesn’t need to be synced from the customer’s data warehouse to the vendor’s application.
Snowflake refers to warehouse-native apps as Connected apps and to their traditional counterparts as Managed apps, which, besides offering a set of functionality, manage their customers’ data too.
But there’s more.
Besides storing a copy of every customer’s customer data, managed apps also store the data generated as a result of their own product being used by their customers. On the other hand, warehouse-native apps automatically load this data back into the customer’s data warehouse.
Let’s break this down using an email marketing tool as an example.
Managed Email Marketing App
Before creating a segment or activating a campaign using a traditional email tool, you must either upload CSV files with data about your customers or sync data from your first-party app to the email tool using their API.
The diagram above depicts the flow of data to a managed email marketing tool such as Mailchimp or Customer.io.
The fact that such companies have to store a copy of every customer’s data explains why their pricing model is based on subscribers/profiles — basically, customers have to pay for their own data to be stored at the vendor’s end.
Additionally, all usage data (campaign metrics, subscription status, etc.) is also stored by the vendors in their own databases; customers can later download it as CSVs or extract and load it into their data warehouse using an ELT tool.
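To make the managed flow concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `ManagedEmailApp`, its methods, and the `email`/`plan` columns are invented for illustration and don't correspond to any real vendor's API. The point is only that the vendor ends up holding both a copy of the customer data and the usage data.

```python
import csv
import io


class ManagedEmailApp:
    """Hypothetical managed vendor: keeps its own copy of profiles and usage data."""

    def __init__(self):
        self.profiles = {}      # vendor-side copy of the customer's data
        self.usage_events = []  # campaign metrics also live with the vendor

    def upload_csv(self, csv_text):
        # Every uploaded row becomes a profile stored at the vendor's end.
        for row in csv.DictReader(io.StringIO(csv_text)):
            self.profiles[row["email"]] = row

    def send_campaign(self, campaign, plan):
        # Sending is stubbed out; we only record a "sent" event per recipient.
        for email, profile in self.profiles.items():
            if profile["plan"] == plan:
                self.usage_events.append(
                    {"campaign": campaign, "email": email, "event": "sent"}
                )

    def export_usage_csv(self):
        # The customer has to pull usage data back out (CSV download or ELT).
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=["campaign", "email", "event"])
        writer.writeheader()
        writer.writerows(self.usage_events)
        return out.getvalue()


customers_csv = "email,plan\nana@example.com,pro\nbo@example.com,free\n"
app = ManagedEmailApp()
app.upload_csv(customers_csv)
app.send_campaign("spring_promo", plan="pro")
stored_profiles = len(app.profiles)
usage_csv = app.export_usage_csv()
```

Note that both profiles are stored by the vendor even though only one of them receives the campaign, which is the situation subscriber-based pricing reflects.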
Warehouse-native Email Marketing App
With a warehouse-native or connected email app, you don’t need to upload or sync any data to start using it — you just need to connect your data warehouse to start building segments and campaigns.
As you can see above, a warehouse-native app or connected app is like a component that sits on top of a data warehouse and offers certain functionality, which in this case is email marketing.
Connected apps are able to read data from a warehouse, enabling their users to select appropriate tables from the warehouse and then build segments and campaigns using data from those tables.
In terms of sending emails, connected apps either have their own infrastructure or allow customers to connect to yet another email service provider like SendGrid. Either way, the connected app vendor loads usage data (campaign metrics, etc.) back into the customer’s warehouse, doing away with the need for an ELT pipeline.
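The warehouse-native flow described above can be sketched in Python as follows. This is an illustration under stated assumptions, not how any particular vendor implements it: an in-memory SQLite database stands in for the customer’s warehouse, the `customers` and `email_events` tables are made up, and `build_segment` and `send_campaign` are hypothetical helper names.

```python
import sqlite3

# Stand-in for the customer's warehouse; a real connected app would use the
# warehouse's own connector instead of SQLite.
warehouse = sqlite3.connect(":memory:")
warehouse.executescript(
    """
    CREATE TABLE customers (email TEXT PRIMARY KEY, plan TEXT);
    INSERT INTO customers VALUES ('ana@example.com', 'pro'), ('bo@example.com', 'free');
    CREATE TABLE email_events (campaign TEXT, email TEXT, event TEXT);
    """
)


def build_segment(conn, plan):
    # The segment is just a query against the customer's table; no data is
    # copied into vendor-side storage.
    rows = conn.execute(
        "SELECT email FROM customers WHERE plan = ? ORDER BY email", (plan,)
    )
    return [email for (email,) in rows]


def send_campaign(conn, campaign, recipients):
    # Actual email delivery is stubbed out. The usage data is written straight
    # back into the customer's warehouse, so no ELT pipeline is needed.
    conn.executemany(
        "INSERT INTO email_events VALUES (?, ?, 'sent')",
        [(campaign, email) for email in recipients],
    )
    conn.commit()


segment = build_segment(warehouse, "pro")
send_campaign(warehouse, "spring_promo", segment)
events = warehouse.execute(
    "SELECT campaign, email, event FROM email_events"
).fetchall()
```

After the campaign runs, the only place the `sent` events exist is the `email_events` table in the customer’s own warehouse.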
Reverse ETL vs Connected Apps
It’s worth mentioning that there’s some overlap between reverse ETL (rETL) tools and some connected apps. In fact, rETL tools behave a lot like connected apps — both sit on top of a data warehouse but don’t store any data at their end, while allowing customers to perform certain actions.
However, even though rETL tools offer features like sending Slack alerts based on changes to your data, moving data from the warehouse to downstream managed apps remains the primary use case for rETL.
Instead of uploading CSVs or using the managed app’s API to sync data, one can move data from the warehouse to a managed app using rETL as middleware.
Similarly, as depicted below, one can move usage data from the managed app back to the warehouse using an ELT tool (or the managed app’s API).
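As a rough sketch of rETL acting as middleware between the warehouse and a managed app, consider the Python below. The names are assumptions made for illustration: `ManagedAppAPI` is a hypothetical stand-in for a managed app’s profile API, `reverse_etl_sync` is an invented helper, and SQLite again plays the role of the warehouse. Real rETL tools add scheduling, change detection, and error handling that are left out here.

```python
import sqlite3


class ManagedAppAPI:
    """Hypothetical stand-in for a managed app's profile API."""

    def __init__(self):
        self.profiles = {}

    def upsert_profile(self, email, attributes):
        self.profiles[email] = attributes


def reverse_etl_sync(conn, api, query, field_mapping):
    """Read rows from the warehouse and push them to the managed app.

    Assumes the query returns an `email` column to use as the profile key.
    """
    cursor = conn.execute(query)
    columns = [description[0] for description in cursor.description]
    synced = 0
    for row in cursor:
        record = dict(zip(columns, row))
        # Rename warehouse columns to the field names the destination expects.
        attributes = {dest: record[src] for src, dest in field_mapping.items()}
        api.upsert_profile(record["email"], attributes)
        synced += 1
    return synced


warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customers (email TEXT PRIMARY KEY, plan TEXT)")
warehouse.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("ana@example.com", "pro"), ("bo@example.com", "free")],
)

api = ManagedAppAPI()
synced = reverse_etl_sync(
    warehouse,
    api,
    query="SELECT email, plan FROM customers",
    field_mapping={"plan": "subscription_tier"},
)
```

The sync function itself holds nothing between runs; the data lives in the warehouse before the sync and in the managed app after it.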
Summary and closing thoughts
I’d like to conclude this post with a quick summary followed by some thoughts.
A warehouse-native app is any SaaS tool that can run on top of a customer’s data warehouse, without the need to ingest and store any data.
Warehouse-native apps are also referred to as Connected apps (since they’re connected to the customer’s own data platform)
Warehouse-native apps or connected apps automatically load usage data back into the customer’s data warehouse.
There’s some overlap in the functionality of connected apps and reverse ETL tools — both sit on top of a data warehouse but don’t store any data at their end, while allowing customers to perform certain actions.
While it’s unlikely to happen anytime soon, connected apps could become more like rETL tools by building destination integrations, allowing customers to also move data downstream to other third-party tools.
rETL tools, on the other hand, have the potential to enable traditional SaaS companies to embrace the warehouse-native architecture.
Continue learning about warehouse-native apps:
Hear founders and practitioners — folks who are at the forefront of the warehouse-native paradigm — answer questions like:
What are some key benefits of adopting warehouse-native apps over managed apps?
Does the visual segmentation capability of a warehouse-native app replace the need to build data models in SQL?
Do warehouse-native apps make reverse ETL workflows redundant?