Data science at Coalition: Building real-time feature computation with Snowflake + DBT

Nelson Auner, June 22, 2021


Coalition is a technology (and data) company at heart and an insurance company by trade. Our engineering, data, and product teams are constantly innovating. This post briefly describes how the data science team used Snowflake and DBT to tackle a question all insurance companies face: how do we reliably predict the likelihood of winning new clients?

Introduction to the problem

In insurance, multiple providers compete on price and coverage to win a prospective client's business. Market conditions can change quickly, especially within specific industries. Accurately estimating our probability of winning any given deal is useful for a wide variety of analyses, but updating features reliably can be tricky. How should we build such a system?

Engineering implementation

From an implementation perspective, the system that computes “probability to buy” scores would ideally have the following attributes:

Continuously updated features: If last week, restaurants were unlikely to buy from us, but this week they are more likely, we’d like to reflect that immediately in our probability estimates.
Auditable: We want to be able to re-create all probability estimates. If we gave a business a 20% probability to buy last week, and this week gave the same business a 30% probability of buying, we want to be able to deconstruct that +10% delta into its constituent inputs.
Low infrastructure footprint: We want data scientists to be able to run and tweak this system full-stack without a large burden on SRE/platform engineering support. The more that data scientists can own the problem end-to-end, the faster they will be able to iterate.
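To make the auditability requirement concrete, here is a minimal sketch of how a change in a score could be broken into per-feature pieces. It assumes a logistic-regression-style model, which this post does not confirm. The feature names, weights, and values are hypothetical. For such a model the change in log-odds is exactly additive across features, so the decomposition is done in log-odds space rather than directly on the probability.

```python
import math


def sigmoid(z: float) -> float:
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(weights: dict, intercept: float, features: dict) -> float:
    """Linear score of a logistic model: intercept + sum(weight * feature)."""
    return intercept + sum(weights[name] * features[name] for name in weights)


def decompose_log_odds_delta(weights: dict, old: dict, new: dict) -> dict:
    """Per-feature contribution to the change in log-odds between two scorings."""
    return {name: weights[name] * (new[name] - old[name]) for name in weights}


# Hypothetical model and feature values (assumptions, not Coalition's actual model).
WEIGHTS = {"industry_win_rate_7d": 2.0, "quote_price_ratio": -1.5}
INTERCEPT = -1.0
last_week = {"industry_win_rate_7d": 0.10, "quote_price_ratio": 1.0}
this_week = {"industry_win_rate_7d": 0.35, "quote_price_ratio": 0.9}

p_old = sigmoid(log_odds(WEIGHTS, INTERCEPT, last_week))
p_new = sigmoid(log_odds(WEIGHTS, INTERCEPT, this_week))
contributions = decompose_log_odds_delta(WEIGHTS, last_week, this_week)

print(f"probability moved from {p_old:.3f} to {p_new:.3f}")
for name, delta in contributions.items():
    print(f"  {name}: {delta:+.3f} log-odds")
```

The contributions sum to the total change in log-odds, so every movement in a score can be traced back to the features that moved. This only works if the old feature values are still available, which is why storing feature history matters.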

Is this one of those “good, fast, or easy, pick two” scenarios?

Creating a feature warehouse

Fortunately, we’ve had success using Snowflake, a scalable data warehouse, and DBT, an analytics engineering tool, to help build and run this system. DBT calculates the features in a consistent manner for both offline and online usage, and its snapshot functionality allows us to recreate any past prediction, if necessary. Snowflake allows us to run the whole system on a dedicated worker (warehouse), so data scientists can access the data without affecting online model performance.
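As a rough illustration of how snapshots support recreating a past prediction: DBT snapshots record each version of a row with dbt_valid_from and dbt_valid_to columns, where dbt_valid_to is null for the current version. The sketch below does a point-in-time lookup over such rows in plain Python. The industry and win_rate_7d columns and the sample values are hypothetical, and in practice this filter would more likely be a SQL query run in Snowflake.

```python
from datetime import datetime
from typing import Optional


def feature_as_of(snapshot_rows: list, industry: str, as_of: datetime) -> Optional[dict]:
    """Return the snapshot row that was valid for `industry` at time `as_of`."""
    for row in snapshot_rows:
        if row["industry"] != industry:
            continue
        valid_to = row["dbt_valid_to"]  # None means "still current"
        if row["dbt_valid_from"] <= as_of and (valid_to is None or as_of < valid_to):
            return row
    return None


# Hypothetical snapshot contents: the restaurant win rate changed on June 8.
snapshot = [
    {"industry": "restaurants", "win_rate_7d": 0.10,
     "dbt_valid_from": datetime(2021, 6, 1), "dbt_valid_to": datetime(2021, 6, 8)},
    {"industry": "restaurants", "win_rate_7d": 0.35,
     "dbt_valid_from": datetime(2021, 6, 8), "dbt_valid_to": None},
]

old_row = feature_as_of(snapshot, "restaurants", datetime(2021, 6, 5))
current_row = feature_as_of(snapshot, "restaurants", datetime(2021, 6, 12))
```

Feeding the row returned for a past timestamp back into the model reproduces the inputs the model saw at that time.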

The entire setup is best illustrated in the following diagram:

We’ve had this system running for over a month now and have been pleasantly surprised at how easy it was to set up and maintain. A common adage is “Data Science is 80% data cleaning and engineering, and 20% modeling,” but this lightweight approach has allowed us to focus most of our effort on the specific form of the probability estimator. We call that a success.

Always room for improvement

As always, no system is perfect. We see two potential areas for future improvement:

We could add an additional DBT job that takes the prediction outputs, compares them with actuals, and creates a calibration table to help us understand the performance of the model automatically.
We could even use DBT’s monitoring and alerting features to alert us when probability calibration performance degrades beyond acceptable limits.
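For illustration, here is a small sketch of what a calibration table and a degradation check might compute. It is written in plain Python rather than as a DBT job. The equal-width binning, the sample data, and the 0.2 threshold are all assumptions chosen for the example.

```python
def calibration_table(predictions: list, outcomes: list, n_bins: int = 5) -> list:
    """Bin predictions into equal-width buckets and compare the mean predicted
    probability with the observed win rate in each bucket."""
    bins = [{"count": 0, "sum_pred": 0.0, "wins": 0} for _ in range(n_bins)]
    for p, won in zip(predictions, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx]["count"] += 1
        bins[idx]["sum_pred"] += p
        bins[idx]["wins"] += won
    table = []
    for i, b in enumerate(bins):
        if b["count"] == 0:
            continue  # skip empty buckets
        table.append({
            "bin_low": i / n_bins,
            "bin_high": (i + 1) / n_bins,
            "count": b["count"],
            "mean_predicted": b["sum_pred"] / b["count"],
            "observed_rate": b["wins"] / b["count"],
        })
    return table


def max_calibration_gap(table: list) -> float:
    """Largest absolute gap between predicted and observed rates across bins."""
    return max(abs(r["mean_predicted"] - r["observed_rate"]) for r in table)


# Hypothetical predictions and actual outcomes (1 = deal won, 0 = deal lost).
predictions = [0.10, 0.15, 0.30, 0.35, 0.80, 0.90]
outcomes = [0, 0, 0, 1, 1, 1]

table = calibration_table(predictions, outcomes)
ALERT_THRESHOLD = 0.2  # hypothetical acceptable limit
should_alert = max_calibration_gap(table) > ALERT_THRESHOLD
```

A well-calibrated model has mean_predicted close to observed_rate in every bin. A single worst-bin gap is a blunt summary, and with few deals per bin it will be noisy, so a real check would probably also require a minimum count per bin.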

If these types of challenges sound interesting, or you’d like to learn more about data science at Coalition, visit our careers page for more information and open opportunities.

Tags:

Engineering Product Updates

