Developing data models and dashboards should be a smooth process, but teams using dbt for transformation and Looker for visualisation often face challenges. This article outlines a practical strategy for achieving reliable development and efficient environment separation between dbt and Looker. We’ll explore how to use a dedicated development data warehouse, robust Looker testing with Spectacles, and smart branch management to avoid disruptions, ensure data quality, and control spending.
Challenges with using dbt and Looker
As we develop data models using dbt, a SQL-first transformation tool, and create dashboards with Looker, a BI platform, two key challenges often arise:
- Reliable Development: How can we confidently develop in dbt without disrupting the business’s existing Looker dashboards?
- Looker Environment Separation: How can we effectively separate development and production environments without incurring additional costs?
By implementing a dedicated development environment and robust testing, teams can avoid disruptions to production dashboards, safeguard data quality, and control costs.
Overcoming these challenges brings significant benefits:
- Fewer Errors: Ensures changes don’t impact existing production dashboards.
- Increased Confidence: Builds trust in your processes and outcomes with business users.
- Improved Data Quality: Validates changes before deploying to production.
- Faster Deployment: Accelerates development cycles without downtime.
- Cost Efficiency: Avoids extra infrastructure expenses.
In this blog post, we’ll explore practical solutions to these challenges and how they can streamline your data development workflow.
Ensuring Reliable Development
To ensure seamless development and maintain quality across dbt and Looker, there are two key prerequisites:
- A dedicated development data warehouse.
- Robust tests in Looker.
Setting Up a Dedicated Development Data Warehouse
A development data warehouse is a separate environment from production, specifically set up for testing and building new features. Unlike production, it provides a safe space to validate changes in dbt without impacting live data or processes.
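In dbt, this separation is typically just a second target in profiles.yml. Here is a minimal sketch, assuming BigQuery; the profile, project, and dataset names are all hypothetical:

```yaml
# profiles.yml -- a minimal sketch; profile, project, and dataset names are assumptions
my_project:
  target: dev                 # default to the development warehouse
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: analytics-dev  # dedicated development project
      dataset: dbt_dev
      threads: 4
    prod:
      type: bigquery
      method: oauth
      project: analytics-prod # live production project
      dataset: analytics
      threads: 8
```

With this in place, everyday dbt runs default to the development warehouse, and production is only written to deliberately.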
Looker Tests
Looker tests are designed to ensure the accuracy and reliability of the key performance indicators (KPIs) used in dashboards. By comparing KPIs against historical data, they surface unexpected changes. This ensures that both the structure and the results of business metrics remain accurate as the underlying data evolves, safeguarding the integrity of reporting and analytics.
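As an illustration, a Looker data test can pin a KPI to a known historical value. The explore, field, filter, and baseline below are assumptions, not taken from any real project:

```lookml
# Hypothetical data test -- explore, field, filter, and baseline are assumptions
test: total_revenue_matches_baseline {
  explore_source: orders {
    column: total_revenue {}
    filters: [orders.created_month: "2024-01"]  # a closed historical period
  }
  assert: revenue_matches_known_value {
    # Baseline taken from a previously verified report (assumed value)
    expression: ${orders.total_revenue} = 1250000 ;;
  }
}
```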
Setting Up a Dedicated Looker Development Project
One effective approach is to create a “dev-testing” project within Looker. This project connects to the dbt development data warehouse, using its data specifically for testing purposes. It also has a remote dependency on the master branch of the primary Looker project while maintaining its own Git repository.
Here’s how the process works:
- When a pull request (PR) is created in dbt, triggering a change in data transformations, the development data warehouse is built or updated.
- Once the data warehouse is updated, Looker tests can be executed in the “dev-testing” project using Spectacles, a package that automates Looker checks. The “spectacles assert” command runs all Looker data tests defined in the project and reports any failures or errors.
By incorporating this setup, you can catch potential errors early in the development cycle, preventing issues from reaching production dashboards. Setting up a CI/CD pipeline to automatically trigger Spectacles after each dbt PR ensures that transformation logic aligns with the business logic represented in Looker.
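A minimal sketch of such a pipeline as a GitHub Actions workflow is shown below. The flags come from the open-source Spectacles CLI, while the workflow name, adapter, and secrets are assumptions:

```yaml
# .github/workflows/dbt-pr-checks.yml -- a sketch; names and secrets are assumptions
name: dbt PR checks
on:
  pull_request:
    branches: [master]
jobs:
  build-and-assert:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install dbt and the Spectacles CLI
        run: pip install dbt-bigquery spectacles
      - name: Build or update the development data warehouse
        run: dbt build --target dev
      - name: Run Looker data tests in the dev-testing project
        run: |
          spectacles assert \
            --base-url ${{ secrets.LOOKER_BASE_URL }} \
            --client-id ${{ secrets.LOOKER_CLIENT_ID }} \
            --client-secret ${{ secrets.LOOKER_CLIENT_SECRET }} \
            --project dev-testing
```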
Optimising Costs for dbt Pull Request Builds
When dbt pull requests trigger the construction of the development data warehouse, costs can escalate rapidly. To control these expenses, you can implement two key optimisations: first, rebuild only the modified tables within the development data warehouse; second, schedule a full rebuild of the entire development data warehouse just once per week. This approach minimises redundant builds, effectively reducing unnecessary costs while maintaining development efficiency.
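Sketched with dbt’s state selection (the artifact path and schedule are assumptions): on each PR, only modified models and their descendants are rebuilt, while a scheduled workflow performs the weekly full rebuild.

```yaml
# Illustrative fragments -- artifact path and cron schedule are assumptions
# On each PR, rebuild only changed models and anything downstream of them,
# deferring unchanged upstream references to production:
#   dbt build --select state:modified+ --defer --state ./prod-artifacts
# where ./prod-artifacts contains the manifest.json from the last production run.

# Weekly full rebuild so the development warehouse never drifts too far:
name: Weekly dev warehouse rebuild
on:
  schedule:
    - cron: "0 3 * * 1"  # Mondays at 03:00 UTC
jobs:
  full-rebuild:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install dbt-bigquery
      - name: Rebuild the entire development warehouse
        run: dbt build --full-refresh --target dev
```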
Testing with Spectacles
Spectacles enables you to define and streamline tests for Looker models while ensuring data consistency. It supports four key types of automated checks:
- Content Validator
- Identifies content (dashboards and Looks) that contains errors.
- SQL Validator
- Runs queries in the data warehouse to verify that the SQL behind each dimension is valid.
- LookML Validator
- Runs Looker’s LookML validator and returns any syntax issues in your LookML.
- Assert Validator
- Runs any Looker data tests that are defined and returns all failures and errors.
The Assert Validator is particularly useful here: pointed at a development data warehouse, it runs Looker data tests against development data.
Creating Looker Environment Separation
In Looker, creating separate production and development environments with multiple instances can quickly become costly. Instead, we can achieve the separation with Git branches within the main Looker repository, avoiding the need for additional Looker instances.
Branch-Based Environment Management
Rather than maintaining two Looker instances, a more cost-effective solution is to use Git branches to manage development and production environments in a single instance.
- Develop Branch: This branch is dedicated to ongoing development work.
- Master Branch: The master branch, which is protected, serves as the production environment.
To link the production environment with Looker, a separate Looker project called “looker-prod” is created. This project uses the master branch as a remote dependency, ensuring that production dashboards remain isolated from development work.
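In the project’s manifest this is a standard remote_dependency declaration; here is a sketch with an assumed repository URL:

```lookml
# manifest.lkml in the "looker-prod" project -- the URL is an assumption
remote_dependency: primary_looker_project {
  url: "https://github.com/your-org/primary-looker-project"
  ref: "master"  # Looker records the resolved commit in manifest_lock.lkml
}
```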
Automating Deployments with GitHub Workflows
To streamline the deployment process, we can configure automated workflows using GitHub Actions. When changes are merged into the master branch, the workflow triggers a deployment to the Looker production environment by updating the master reference in the Looker production repository. Additionally, if Looker’s advanced deploy mode is enabled, a deploy webhook can be set up in GitHub to automatically push Looker code to production.
The custom workflow triggers when a pull request into master is closed. It has two main jobs:
- get_master_latest_commit:
- Checks out the master branch and retrieves the latest commit hash from master.
- update_looker_production_repo:
- Updates the manifest_lock.lkml file with the new commit reference (if it has changed).
- Commits and pushes the update.
This automates updating manifest_lock.lkml with the latest master commit.
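A hedged sketch of that workflow is below; the repository names, token, and sed pattern are assumptions, and the structure mirrors the two jobs described above:

```yaml
# .github/workflows/update-looker-prod.yml -- a sketch; names and paths are assumptions
name: Update looker-prod reference
on:
  pull_request:
    branches: [master]
    types: [closed]
jobs:
  get_master_latest_commit:
    if: github.event.pull_request.merged == true
    runs-on: ubuntu-latest
    outputs:
      sha: ${{ steps.sha.outputs.sha }}
    steps:
      - uses: actions/checkout@v4
        with:
          ref: master
      - id: sha
        run: echo "sha=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"
  update_looker_production_repo:
    needs: get_master_latest_commit
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          repository: your-org/looker-prod       # assumed repository name
          token: ${{ secrets.LOOKER_PROD_PAT }}  # token with push access (assumed)
      - name: Update manifest_lock.lkml if the reference changed
        run: |
          sed -i "s/ref: \".*\"/ref: \"${{ needs.get_master_latest_commit.outputs.sha }}\"/" manifest_lock.lkml
          if ! git diff --quiet; then
            git config user.name "ci-bot"
            git config user.email "ci-bot@users.noreply.github.com"
            git commit -am "Pin primary project to latest master commit"
            git push
          fi
```

With advanced deploy mode enabled, a final step could additionally call the project’s deploy webhook so Looker pulls the new reference straight to production.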
Protecting the master branch
Rules to protect the master branch:
- Only accepts merges from the Develop branch
- Force pushes are disabled
- Direct commits are not allowed
- Pull requests with failing GitHub Actions checks cannot be merged
By protecting the master branch and enforcing these rules, you safeguard the stability of the production environment. This setup ensures that only thoroughly tested and validated changes make it to production, while the Develop branch remains flexible for rapid iteration and experimentation.
Conclusion
In this blog post, we explored strategies for ensuring reliable development with dbt and Looker, focusing on creating separate development environments without incurring additional costs. We discussed the importance of using a dedicated development data warehouse, implementing robust testing with Spectacles, and using Git branch separation to manage Looker environments.
By following these practices, your team can:
- Catch errors early, preventing them from affecting production.
- Minimise disruptions to business operations.
- Maintain cost efficiency while improving overall data reliability.
If you want to learn more about dbt, read our article “SQL for Data Transformation: From Excel to Expert”. Curious about Looker? Check out its 4 unknown capabilities!
Are you ready to transform your data development process?
Contact Devoteam today and see how you can implement this streamlined approach in your analytics stack.