A deep dive into Seeds

Oct 22

Definition, properties, and configurations for Seeds and the dbt seed command.

2 Comments

Thanks for another interesting post! Since the “dbt seed” command writes directly to the database, my team considers it good practice not to use .csv files directly with the “ref” function, but rather to create a staging model for every csv being used and call them with the “source” function.

This way we follow the dbt best practice of having one staging model for each source (.csv file).

The advantage is that any future change in the csv file goes through the staging models and doesn’t disrupt the DAG flow.



Andrea Leonel

Oct 23

That's a great point, Loukas! dbt makes a similar recommendation for Snapshots, but doesn't mention the same thing for Seeds. You're right, though: having cleaning steps in place for Seeds is a good safeguard too, even if they are not supposed to change very frequently.


Andrea Leonel - Data Analysis & Analytics Engineering
