Best practice for workflows

Highlighting more best practices to keep in mind when building your dbt projects and familiarising ourselves with more advanced functionalities.

Aug 25, 2025

Part of the “Mastering dbt” series. Access to the full Study Guide. Let’s connect on LinkedIn!
Notes from this dbt documentation.

As we continue to dive deeper into the world of the dbt, we are going to explore another set of best practices to consider when building your projects.

Some of the functionality covered here is yet to be covered by this Study Guide, but it is never too soon to start familiarising ourselves with these concepts.

I have taken the liberty of re-categorising these items from the original post for more clarity.

Sources

Use the ref function

Allows dbt to infer dependencies that affect the order in which models are built.
Ensures that the tables and views referenced are in the same environment.

Limit references to raw data

Keep the direct link to the source data in only one place so, in case anything changes in the source, there are fewer amendments to make.
Use the source function instead of direct linking.

Layering

Separate source-centric and business-centric transformations

Staging should be your source-centric layer where you would transform the sources into consistent structures.
From intermediate onwards, you should apply business logic and represent entities and processes relevant to your business.

Rename and recast fields once

The first layer of transformations should be:
- select from only one source.
- rename fields and tables to fit naming conventions.
- recast fields to the correct data type.
All the downstream models should be built upon this clean model.

Break complex models into smaller pieces

It is a good idea to break up complex models when:
- A CTE is duplicated across two models. Place this CTE in a separate model to be referenced by other models and avoid duplicate code.
- A CTE changes the grain of the data. The grain of the data is what one record (each row) represents.
- A model with too many lines. The recommendation is to keep the length at around 100 lines max.

Group your models in directories

Use directories to make it easier to configure groups of models, run subsections of the DAG, clarify modeling steps, and create bespoke conventions.

Consider the information architecture of your data warehouse

Use custom schemas to hide or show models as needed. For instance, create a separate schema where intermediate models are not shown.
Generally, the custom schemas align with the model directories.
Use prefixes to clarify the relations (stg_, fct_, dim_).

Coding & Conventions

Use a style guide

SQL styles, naming conventions and other rules that should be documented, especially if multiple users are writing code.

Manage whitespace generated by Jinja

Sometimes, when using jinja or macros, you may end up with unwanted whitespaces in your compiled code. You should edit your jinja/macro to manage that.

Materialisations

Choose your materialisations wisely

Use views by default - they are faster to build but slower to query.
Use tables for:
- models that are queried by BI tools.
- models that have multiple descendants.
Use ephemeral materialisations for:
- models that only do lightweight transformations that won’t be displayed to end-users.
Use incremental models for:
- situations when tables are taking too long to be built - incremental models improve build time but add complexity to the tables.

Version control

Version control your dbt project

Usage of git branches to develop new features and bug fixes.
Peer review of code changes in a Pull Request.

Use grants to manage privileges on objects

The grants config allows you to apply permissions to dbt objects in a way that you can version control and duplicate them.

Environments

Use separate developments and production environments

Usage of targets defined in a profiles.yml file to separate production from development.
Limit the data processed when in development by:
- Using the target name:
- Using the environment variable DBT_CLOUD_INVOCATION_CONTEXT that produces the possible values of prod, dev, staging, and ci to determine the environment.

Testing

Add tests to your models

Use tests to ensure that the data is as expected.
At a minimum, every model should be tested for uniqueness and non-null of the primary key.

Andrea Leonel - Data Analysis & Analytics Engineering

Discussion about this post

Ready for more?