Warehouse-native Apps | Luke Ambrosetti, Solutions Engineer at Snowflake (ex-MessageGears)

Part 3 of the collection on warehouse-native apps.

Arpit Choudhury

Created :

August 15, 2023

Created :

July 12, 2022

Updated :

April 30, 2024

→ Link to the collection.

What is a warehouse-native app and what are its benefits over a managed app?

Luke Ambrosetti (who was at MessageGears and is now at Snowflake) not only has first-hand experience but is also deeply knowledgeable and passionate about the warehouse-native architecture. I learned a bunch from his answers and am pretty sure you will too.

Prefer reading? We’ve got professionally edited transcripts for you:

Q. What is a Warehouse-native app?

So a warehouse-native app is effectively a SaaS application that connects directly to the data warehouse and runs a data-intensive process for you.

Q. Are connected apps or data apps the same as warehouse-native apps?

Kind of. It depends on who you ask, right? The definition of a connected app or a data app can be very different for different people. In my view, "connected app" is a good definition of a warehouse-native SaaS app. "Data app", on the other hand, could be a SaaS app, but it's probably a broader term that also covers an internal app you build at the company you work for. It doesn't have to be an external tool that you buy or use.

Q. What is leading to such a paradigm shift in the way B2B SaaS tools are being built?

It's very exciting, right? Lots of new companies are taking this approach. It's called warehouse-native, warehouse-centric, or warehouse-first. There are lots of terms for it.

But honestly, it comes from this idea of the separation of compute and storage. And specifically in the industry I work in, which is more marketing, I've seen it explained very well as the separation of the system of engagement from the system of record. The system of engagement is how you're reaching out to, in marketing's case, your customers. The system of record is where the state of that customer lives.

You know what they're doing and where they are: the Customer 360, as it's called. Traditionally, you've had to get your data into whatever your system of engagement is. And some companies, like Salesforce, have traditionally tried to be both the system of record and the system of engagement. Now, with the separation of compute and storage, you can have those two things, the system of engagement and the system of record, separated as well.

Questions? Check out our primer on warehouse-native apps.

Q. So besides cost savings, what are some key benefits of using a warehouse-native engagement tool over a traditional one?

So yeah, the cost savings aspect here is that you don't have to sync that data to whatever other platform you want to work in. Whether it's marketing, engagement, or maybe analytics, you don't have to sync the data there.

So that's where the cost savings come from, but there are so many other benefits as well. You have fewer data silos when you do it this way. Instead of shipping your data out to 10 different SaaS companies, if all 10 could connect to your data warehouse and use that system of record, you don't have to duplicate data. That's what I'm trying to say.

So it's a single source of truth, of course, as well. And that's going to be better, especially for things like data observability or analytics: you can attribute different events back to where you have all of your data sitting, instead of it being in a silo that you then have to extract from with an ETL tool. In many of these warehouse-native cases you also get to bring your own data model. So instead of having to force your data model into how the vendor views the world or how they think your data should be modeled, you get to bring your own, which is great.

Also, there's onboarding time. As with bringing your own data model, you don't have to spend days (or not days, honestly months, sometimes years) mapping out how the data should be modeled in that destination system. Also on the time-saving side, you have speed of change. If you want to add a specific field to whatever you're trying to do, it's easy: once it shows up in the data warehouse or the database, it's right there for you to use. And lastly, from a security and compliance point of view, your data is safe in the data warehouse. You're not having to rely on these SaaS providers to keep your data safe. So if you have PII data, especially on customers (I come from the marketing space), that data is much safer at home.

Q. So are warehouse-native apps making reverse ETL workflows redundant?

Yes. Kind of, yeah. I mean, the answer of course is yes and no. I would say it makes reverse ETL redundant, and ETL too, because you're not having to sync data between the source system and the destination system. Your source system is your system of record, and that's what you get to use. So both the reverse ETL and the EL or ETL tools can get replaced. That said, is that going to replace them forever?

No, I don't think so. I think this approach is really cool and great, but for some companies, thinking again about marketing, like the Facebook Conversions API or Google Ads, I don't think they have any reason to take this approach. You're still going to have to use their APIs because that's what they need to do their business. So I don't think those reverse ETL and EL tools are going to be completely gone.

Q. Can you tell us about the segmentation capabilities of a warehouse-native engagement tool and are there any limitations here?

Yeah. So again, I work at MessageGears. MessageGears is traditionally an email service provider, and connecting directly to the data warehouse has been our core feature. With this type of architecture or design, the limitation is that you do have to have a data model.

In marketing, many companies rely on their ESPs or sometimes their traditional CDPs to do that for them. You need a data model because you can't just take data ingestion tools like Snowplow or RudderStack and throw data into a data warehouse. You need those resources, those people, to go through and intelligently design models for your customers, marketing in this case or maybe sales, to use.

So, when the data team does that for their end customers — the marketing team — it gives those data-adjacent teams the ability to use that data and kind of do that last mile transformation that they need to do to run what they need to run in their programs.

So for low-code or no-code tools, it's going to be very much a challenge to take this approach, because you basically have to say, "Okay, bring your data model and now make it work with the solution we have." One of the big reasons some of those cloud SaaS tools are so popular is that they make it really easy and provide that data model for those low-code tools to use.

Q. Can the visual segmentation capability eventually replace the need to build data models in SQL?

Yeah, that's a good question. Maybe. I think we'll get there, but right now you still need some base SQL, or some sort of base model to work from.

Again, those data-adjacent users can use those tools to do some really cool segmentation. Honestly, at least in our platform, they can build very complex segments, groups, and labels to target their customers or their users.

At the same time, it's more of that last-mile approach where they're adding in the things that they need. For now, they need to start from somewhere: what I would call a base model. From there, they can do some additional modeling on top of it.
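Editor's illustration: the split between a base model and last-mile segmentation can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how MessageGears or any specific product works. An in-memory SQLite database stands in for the data warehouse, and the table names, the `customer_base` view, and the `build_segment` helper are all hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_users (user_id INTEGER, email TEXT, country TEXT);
    CREATE TABLE raw_orders (user_id INTEGER, amount REAL);
    INSERT INTO raw_users VALUES (1, 'a@example.com', 'US'),
                                 (2, 'b@example.com', 'DE'),
                                 (3, 'c@example.com', 'US');
    INSERT INTO raw_orders VALUES (1, 120.0), (1, 30.0), (2, 15.0);

    -- Base model (hypothetical): built by the data team, one row per customer.
    CREATE VIEW customer_base AS
    SELECT u.user_id, u.email, u.country,
           COALESCE(SUM(o.amount), 0) AS lifetime_spend
    FROM raw_users u
    LEFT JOIN raw_orders o ON o.user_id = u.user_id
    GROUP BY u.user_id, u.email, u.country;
""")


def build_segment(conn, country, min_spend):
    """Last-mile segmentation: a simple filter on top of the base model."""
    rows = conn.execute(
        "SELECT email FROM customer_base "
        "WHERE country = ? AND lifetime_spend >= ? ORDER BY email",
        (country, min_spend),
    ).fetchall()
    return [email for (email,) in rows]


# A marketer's segment: US customers who have spent at least 100.
high_value_us = build_segment(conn, "US", 100)
print(high_value_us)
```

The point of the design is that the join and aggregation live in the base model, so the segment itself is only a filter, which is the kind of step a visual segmentation tool could plausibly generate.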

Q. If warehouse-native apps don't store any customer data, won't marketing campaigns sometimes break when there's an issue connecting to the customer's warehouse?

Yeah, I've gotten this question a couple of times now from some of the more technical users or prospects. And the answer is yes, but at worst it's the same as or better than your typical SaaS model. With your typical SaaS model, you have to have pipelines, and reverse ETL now makes it a lot easier to get your data synced.

But let's say you want to send an email campaign to all the users who subscribed yesterday. If that pipeline breaks and your SaaS company isn't looking at the freshness of that data every day, you could be sending that message to the exact same users again. With the warehouse-native method, if it breaks, nothing is going to go out. Whereas if your SaaS company is not looking at data freshness, something could break, and you're giving customers a bad experience because you're sending them the exact same content twice.

That's really interesting. I never thought of that so thanks for sharing.
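Editor's illustration: the failure mode Luke describes can be sketched in Python. This is a simplified model under stated assumptions, not any vendor's actual behavior. The `stale_copy` dictionary, the `WarehouseUnavailable` exception, and both helper functions are hypothetical names introduced only for this example.

```python
from datetime import date


class WarehouseUnavailable(Exception):
    """Hypothetical error raised when the warehouse cannot be reached."""


def audience_from_synced_copy(copy, today, check_freshness):
    """Traditional SaaS: the tool reads its own synced copy of the audience."""
    if check_freshness and copy["synced_on"] != today:
        # Freshness guard: refuse to send from a copy that was not synced today.
        return []
    # Without the guard, a stale copy looks exactly like a fresh one.
    return list(copy["emails"])


def audience_from_warehouse(run_query):
    """Warehouse-native: query the source directly; a failure sends nothing."""
    try:
        return run_query()
    except WarehouseUnavailable:
        return []


def broken_warehouse():
    raise WarehouseUnavailable("connection failed")


today = date(2023, 8, 15)
# The sync pipeline broke, so the copy still holds yesterday's audience.
stale_copy = {
    "synced_on": date(2023, 8, 14),
    "emails": ["a@example.com", "b@example.com"],
}

resent = audience_from_synced_copy(stale_copy, today, check_freshness=False)
guarded = audience_from_synced_copy(stale_copy, today, check_freshness=True)
native = audience_from_warehouse(broken_warehouse)
print(resent, guarded, native)
```

In this sketch, the unguarded synced copy returns yesterday's recipients a second time, while both the freshness-checked copy and the direct warehouse query return an empty audience when something breaks.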

Q. Last question — what is the one piece of advice you have for companies that are looking to just get started with a data warehouse?

Yeah. I would say start slow and don't try to jump in head first. There are so many tools. The modern data stack is so cool: there's a lot of functionality and lots of cool features that people are using today. At the same time, you have to start slow. It's a crawl, walk, run approach.

I was just chatting with one of our enterprise clients who was talking about their move into BigQuery, and it took them three years. Three years, I think from Teradata, which was on-prem. So it can take a long time. Before you start doing the really cool ML stuff, you have to have your ingestion, your modeling, and your transformations figured out. So yeah, start slow and have fun with it!

You can also tune in to the episode on Spotify or Apple Podcasts.

ABOUT THE AUTHOR

Arpit Choudhury

As the founder and operator of databeats, Arpit has made it his mission to beat the gap between data people and non-data people for good.
