2 Comments
Loukas Apostolidis

Thanks for another interesting post! Since the “dbt seed” command writes directly to the database, my team considers it good practice not to reference .csv files directly with the “ref” function, but rather to create a staging model for every csv in use and call them with the “source” function.

This way we follow the dbt best practice of using one staging model for each source (here, each .csv file).

The advantage is that any future change to the csv file goes through its staging model and doesn’t disrupt the rest of the DAG.

Andrea Leonel

That's a great point, Loukas! dbt makes a similar recommendation for Snapshots but doesn't mention it for Seeds. You're right, though: it's a good safeguard to have cleaning steps in place for Seeds too, even if they aren't supposed to change very often.
