Warehouse-native Apps Explained

Understanding SaaS tools built on the cloud data warehouse.

Created :  
June 22, 2023
Created :  
June 24, 2022
|
Updated :  
April 30, 2024

First things first, what is a warehouse-native app?

Let’s start with a definition:

A warehouse-native app is any SaaS tool that can run on top of a customer’s cloud data warehouse, without the need to ingest and store any data.

In other words, a warehouse-native app vendor lets its customers bring their own data warehouse (or cloud data platform) to the vendor’s application, which then performs tasks using the customer’s data.

The data doesn’t need to be synced from the customer’s data warehouse to the vendor’s application, which is typically the case with traditional SaaS solutions.

This new software development paradigm, popularized by Snowflake, is fueled by the proliferation of cloud data warehouses.

What’s leading to the rapid adoption of the warehouse-native architecture?

The core idea is very simple:

Customers should be able to bring their own data stores to use alongside functionality built by a SaaS vendor so that customers don’t have to pay for their own data twice.

Besides cost, security and privacy concerns make enterprises reluctant to let their data leave their own environment.

Just as enterprises once sought on-prem solutions, they now want to bring their own cloud platform to the SaaS tool (which is not the same as deploying software in a private cloud environment).

Snowflake refers to warehouse-native apps as “connected apps” and their traditional counterparts as “managed apps”.

Besides offering a set of functionality, managed apps also manage their customers’ customer data. (I am of the opinion that we need to replace the term “customer data” altogether.)

But that’s not all…

Besides storing a copy of every customer’s customer data, managed apps also store the data generated as a result of their own product being used by their customers. On the other hand, warehouse-native apps automatically load this data back into the customer’s data warehouse.

Let’s break this down using an email marketing tool as an example.

Example: Managed Email Marketing App

Before creating a segment or activating a campaign using a traditional email tool, you must either upload CSV files with data about your customers or sync data from your first-party app to the email tool using their API.

A managed app where data can be uploaded as CSV or sent via an API

The diagram above depicts the flow of data to a managed email marketing tool such as Mailchimp or Customer.io.

The fact that such companies have to store a copy of every customer’s data explains why their pricing model is based on subscribers/profiles — basically, customers have to pay for their own data to be stored at the vendor’s end.
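To make the copy-then-bill pattern concrete, here is a minimal sketch of what a managed app does with an uploaded CSV. Everything in it is an assumption for illustration: `sqlite3` stands in for the vendor's own database, and the `profiles` table, the `ingest_csv` helper, and the sample rows are hypothetical rather than taken from any real vendor.

```python
import csv
import io
import sqlite3

# Hypothetical vendor-side store; sqlite3 is only a stand-in for illustration.
vendor_db = sqlite3.connect(":memory:")
vendor_db.execute("CREATE TABLE profiles (email TEXT PRIMARY KEY, plan TEXT)")


def ingest_csv(conn, csv_text):
    """Copy every uploaded row into the vendor's own profiles table."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # INSERT OR REPLACE keeps one profile per email, so re-uploads don't duplicate.
    conn.executemany(
        "INSERT OR REPLACE INTO profiles VALUES (:email, :plan)",
        rows,
    )
    conn.commit()


# Sample upload (made-up data); the third row repeats the first email.
upload = (
    "email,plan\n"
    "ana@example.com,pro\n"
    "bo@example.com,free\n"
    "ana@example.com,pro\n"
)
ingest_csv(vendor_db, upload)

# The number of stored profiles is the unit a profile-based pricing model bills on.
stored_profiles = vendor_db.execute("SELECT COUNT(*) FROM profiles").fetchone()[0]
print(stored_profiles)
```

The point of the sketch is that the customer's data now lives in two places: its original source and the vendor's `profiles` table.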

Additionally, all usage data, such as campaign metrics and subscription status, is also stored by the vendors in their own databases, which customers can later download as CSVs or extract and load into their data warehouse using an ELT tool.

Example: Warehouse-native Email Marketing App

With a warehouse-native or connected email app, you don’t need to upload or sync any data to start using it — you just need to connect your data warehouse to start building segments and campaigns.

A warehouse-native app sits on top of the customer’s data warehouse

As you can see above, a warehouse-native app or connected app is like a component that sits on top of a data warehouse and offers certain functionality, which in this case is email marketing.

Connected apps can read data from a warehouse, enabling their users to select appropriate tables from the warehouse and then build segments and campaigns using data from those tables.
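As a rough illustration of building a segment in place, the sketch below queries the customer's table directly instead of copying it. It rests on several assumptions: `sqlite3` stands in for the cloud data warehouse (a real connected app would use the warehouse vendor's own connector), and the `users` table, its columns, and the `build_segment` helper are hypothetical.

```python
import sqlite3

# Stand-in for the customer's warehouse; table and rows are made up.
warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
    CREATE TABLE users (email TEXT, plan TEXT, last_seen_days INTEGER);
    INSERT INTO users VALUES
        ('ana@example.com', 'pro', 3),
        ('bo@example.com', 'free', 45),
        ('cy@example.com', 'pro', 60);
""")


def build_segment(conn, plan, min_inactive_days):
    """Return emails on a given plan that have been inactive for too long.

    The query runs inside the warehouse; only the matching rows are read.
    """
    cursor = conn.execute(
        "SELECT email FROM users "
        "WHERE plan = ? AND last_seen_days > ? "
        "ORDER BY email",
        (plan, min_inactive_days),
    )
    return [row[0] for row in cursor]


inactive_pro = build_segment(warehouse, "pro", 30)
print(inactive_pro)
```

Because the segment is just a query, it always reflects the current state of the warehouse table, with no sync step to keep up to date.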

In terms of sending emails, connected apps either have their own infrastructure or allow customers to connect to yet another email service provider like SendGrid. Either way, the connected app vendor loads usage data (campaign metrics, etc.) back into the customer’s warehouse, doing away with the need for an ELT pipeline.
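The write-back step might look something like the following sketch, in which the app appends campaign events to a table in the customer's warehouse. This is an assumption-laden illustration: `sqlite3` again stands in for the warehouse, and the `campaign_events` table, its columns, and the `write_back_metrics` helper are hypothetical names.

```python
import sqlite3

# Stand-in for the customer's warehouse.
warehouse = sqlite3.connect(":memory:")


def write_back_metrics(conn, events):
    """Append campaign events to a table inside the customer's warehouse."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS campaign_events (
            campaign_id TEXT,
            email TEXT,
            event TEXT
        )
    """)
    conn.executemany(
        "INSERT INTO campaign_events VALUES (:campaign_id, :email, :event)",
        events,
    )
    conn.commit()


# Made-up events produced by sending one campaign.
write_back_metrics(warehouse, [
    {"campaign_id": "winback-01", "email": "cy@example.com", "event": "delivered"},
    {"campaign_id": "winback-01", "email": "cy@example.com", "event": "opened"},
])

# The customer can now analyze usage data with plain SQL, next to their other tables.
opens = warehouse.execute(
    "SELECT COUNT(*) FROM campaign_events WHERE event = 'opened'"
).fetchone()[0]
print(opens)
```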

Reverse ETL vs. Warehouse-native Apps

It’s worth mentioning that there’s some overlap between reverse ETL tools and warehouse-native apps — both sit on top of a data warehouse but don’t store any data while allowing customers to perform certain actions on top of the data.

However, even though reverse ETL tools offer features like sending Slack alerts based on changes to your data, moving data from the warehouse to downstream managed apps remains the primary use case for reverse ETL.

Data from a customer’s data warehouse is sent via reverse ETL to a managed app

Instead of uploading CSVs or using the managed app’s API to sync data, one can move data from the warehouse to a managed app using reverse ETL as middleware.
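A reverse ETL sync can be pictured as a loop that reads rows from the warehouse in batches and hands each batch to the managed app. The sketch below is only that, a picture: `sqlite3` stands in for the warehouse, `managed_app_upsert` is a hypothetical stub in place of a real vendor API call, and the table, batch size, and payload shape are all assumptions.

```python
import sqlite3

# Stand-in for the customer's warehouse with made-up rows.
warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
    CREATE TABLE users (email TEXT, plan TEXT);
    INSERT INTO users VALUES
        ('ana@example.com', 'pro'),
        ('bo@example.com', 'free'),
        ('cy@example.com', 'pro');
""")

received = []  # stands in for the managed app's own datastore


def managed_app_upsert(batch):
    """Hypothetical stub for a managed app's bulk-upsert API call."""
    received.extend(batch)


def reverse_etl_sync(conn, upsert, batch_size=2):
    """Read rows from the warehouse in batches and push them downstream."""
    cursor = conn.execute("SELECT email, plan FROM users ORDER BY email")
    synced = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        upsert([{"email": email, "plan": plan} for email, plan in rows])
        synced += len(rows)
    return synced


synced = reverse_etl_sync(warehouse, managed_app_upsert)
print(synced, len(received))
```

Note that the reverse ETL layer itself holds nothing after the loop finishes; the copy of the data ends up in the managed app.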

Similarly, as depicted below, one can move usage data from the managed app back to the warehouse using an ELT tool (or the managed app’s API).

Data is sent back from the managed app to the customer’s data warehouse using ELT/ETL

Summary and closing thoughts

I’d like to conclude this post with a quick summary followed by some thoughts.

  • A warehouse-native app is any SaaS tool that can run on top of a customer’s cloud data warehouse (or cloud data platform), without the need to ingest and store any data.
  • Warehouse-native apps are also referred to as connected apps (since they’re connected to the customer’s own data platform). Their traditional counterparts are referred to as managed apps.
  • Warehouse-native apps (should) automatically load usage data back into the customer’s data warehouse.
  • There’s some overlap in the functionality of connected apps and reverse ETL tools — both sit on top of a data warehouse but don’t store any data at their end, while allowing customers to perform certain actions.

While unlikely to happen anytime soon, connected apps can become more like reverse ETL tools by building destination integrations, allowing customers to also move data downstream to other third-party tools.

On the other hand, reverse ETL tools can potentially enable traditional SaaS companies (managed apps) to embrace the warehouse-native architecture.

To dig deeper into the warehouse-native paradigm, I recommend the following articles from Snowflake:

  1. Integrating with Snowflake — a guide for SaaS Providers
  2. Powered by Snowflake: Building a Connected Application for Growth and Scale
  3. Powered by Snowflake: How Connected Applications Work
  4. Connected Apps or Managed Apps: Which Model to Implement?

ABOUT THE AUTHOR
Arpit Choudhury

As the founder and operator of databeats, Arpit has made it his mission to beat the gap between data people and non-data people for good.
