Why You Should Use Dev Containers with dbt Fusion
dbt is a widely used, SQL-based data transformation framework that we leverage in nearly all our projects at Brooklyn Data. Last January, dbt Labs acquired SDF, and we were excited about the potential for the dbt developer experience to be supercharged by SDF’s features.
Well, that moment arrived last week when dbt Labs announced dbt Fusion, a completely new transformation engine written in Rust. The dbt Fusion engine brings compile-time guarantees, SQL IntelliSense, faster parsing, state-aware orchestration, cost awareness, and many other benefits that improve the developer experience. This marks another major evolution in how data practitioners develop and deliver data products. If you’re interested in the underlying technology, there are plenty of great blog posts and an illuminating talk by Elias DeFaria that cover, in depth, the technical implementation and practical implications of these capabilities!
What Does This Mean for Local dbt Installation?
It’s standard practice to install dbt Core within a Python virtual environment. dbt Core is distributed through PyPI, while dbt Fusion is not, so Python-based environment tools and package managers (like venv, pip, and uv) cannot manage a dbt Fusion installation. Instead, the recommended installation method, either via the VS Code extension or directly from the command line, places the dbt Fusion executable (a standalone binary CLI tool) in a directory on your system’s PATH. This approach simplifies execution, but because both tools expose a dbt command, it can cause conflicts if you’re using both dbt Core and dbt Fusion on the same machine, which many of us will be doing in the coming months as we shift from dbt Core to dbt Fusion.
If you want to avoid conflicts between dbt Core and dbt Fusion, or if you want to experiment with dbt Fusion projects without affecting your existing setups, using a dev container is a solid option.
Dev Containers
In short, dev containers are abstractions around Docker containers that let you easily spin up isolated, reproducible, and ephemeral environments. All you need are a few configuration files in your project, and your integrated development environment (IDE) will handle the rest!
Benefits of using containers include:
- Isolated and conflict-free environments
- Consistent and reproducible builds
- Faster setup and onboarding
- Simplified experimentation without impacting your main system configuration or other projects
Configuring dbt Fusion with a Dev Container in VS Code
Configuring dbt Fusion with a dev container is simple. All you need are a few things:
- Docker Desktop.
- VS Code with the Dev Containers extension.
- A dbt project with a dbt_project.yml in the workspace folder root and valid connection configs (for example, profiles.yml) in your local ~/.dbt directory.
- A configuration file in .devcontainer called devcontainer.json.
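Your file structure should look something like this: a .devcontainer folder containing devcontainer.json, sitting alongside dbt_project.yml in the workspace folder root.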
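As a minimal sketch of that layout and of the configuration file itself, the block below shows one way a devcontainer.json for dbt Fusion might look. It is not the definitive setup: the project and folder names in the header comment are hypothetical, the base image and its vscode user are assumptions, and the install command and VS Code extension ID should be checked against dbt’s current installation docs before you rely on them.

```jsonc
// .devcontainer/devcontainer.json
//
// Assumed layout (the "my_dbt_project" and "models" names are hypothetical):
//
//   my_dbt_project/
//   ├── .devcontainer/
//   │   └── devcontainer.json
//   ├── dbt_project.yml
//   └── models/
{
  "name": "dbt-fusion",

  // Assumption: a generic Ubuntu dev container image that ships with curl.
  "image": "mcr.microsoft.com/devcontainers/base:ubuntu",

  // Bind-mount the host's ~/.dbt directory so the connection configs are
  // available inside the container.
  // Assumption: the image's default user is "vscode".
  "mounts": [
    "source=${localEnv:HOME}/.dbt,target=/home/vscode/.dbt,type=bind"
  ],

  // Install the dbt Fusion binary once the container has been created.
  // Assumption: this install script URL and flag match dbt's current docs.
  "postCreateCommand": "curl -fsSL https://public.cdn.getdbt.com/fs/install/install.sh | sh -s -- --update",

  "customizations": {
    "vscode": {
      // Assumption: the marketplace ID of the dbt VS Code extension.
      "extensions": ["dbtLabsInc.dbt"]
    }
  }
}
```

With a file like this in place, running “Dev Containers: Reopen in Container” from the VS Code Command Palette builds the container and reopens the project inside it. Opening a new terminal there and running dbt --version is a quick way to confirm which engine is installed. On Windows hosts, HOME may not be set, so the mount source would likely need ${localEnv:USERPROFILE} instead.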
The Benefits of Using Dev Containers
Managing your development environment with dev containers gives you several key benefits:
- Consistency: Every developer works in an identical environment.
- Isolation: The development environment is separated from the host system, which prevents conflicts with other projects or system-wide installations.
- Reproducibility: The environment can be easily recreated (and/or updated) on any machine with Docker installed.
- Portability: The entire development setup is ephemeral, and its configuration can be version-controlled and shared across your team.
These benefits not only simplify dbt Fusion installation, environment management, and team onboarding but also make experimentation and migration to the new engine cleaner, safer, and more efficient.
Migrate to the dbt Fusion Engine with Brooklyn Data
With the dbt Fusion engine currently in beta, not all existing dbt projects are compatible. Transitioning to the new engine can be complex, but Brooklyn Data is here to help! Our team of data experts will guide you through every step, from planning and technical implementation to migration execution.
Brooklyn Data offers comprehensive support for implementing and optimizing your modern data stack. We help you develop effective data strategies, manage your data efficiently, and harness the power of AI to unlock your data’s full potential. Contact us today to get started.