Improved Data Quality in dbt Models With dbt-teradata 1.7.1

Teradata introduces dbt-teradata version 1.7.1 for data quality assurance and seamless integration with the modern data stack.

Daniel Herrera
April 30, 2024 | 2 min read

Data engineering is a dynamic and rapidly changing field. New tools and enhancements to existing ones are frequently introduced, sometimes as often as weekly, to aid data engineers in providing value more swiftly and effectively. Teradata is committed to equipping our customers and developer community with tools that seamlessly integrate with the modern data stack and our data platform, Teradata Vantage™.

Since introducing dbt-teradata, our Teradata dbt connector, our team has worked to ensure compatibility with all dbt features, and the connector achieved trusted status from dbt Labs last year. Responding to popular demand from our community, we've added support for two features: dbt data contracts, a key data quality assurance tool, and Teradata session mode, the most-used session mode in Teradata environments, for dbt database connections. These enhancements were included in the release of dbt-teradata version 1.7.1 on January 11, 2024.

Data contracts in dbt

Introduced in version 1.5 of dbt in April 2023, data contracts are a crucial element for maintaining data quality. Prior to the advent of contracts, the primary means of assuring data quality in dbt was through tests. While tests are effective in verifying that a model materializes as anticipated, they are conducted after the model's materialization, making them retrospective measures.

In contrast, contracts enable data engineers to establish a data interface that the model must adhere to. This interface's validation occurs before the model is materialized, preventing disruptions in data pipelines caused by models that don’t meet the required data structure.

From a technical standpoint, contracts, like tests, are set up as configurations in the schema.yml file associated with the model.
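
A minimal sketch of such a configuration is shown below. The model name (dim_customers), the column names, and the data types are hypothetical and only serve as illustration; the contract and constraints keys follow the general dbt model contract syntax, and which constraints are actually enforced depends on the adapter and platform.

```yaml
# schema.yml (illustrative sketch; model and column names are hypothetical)
version: 2

models:
  - name: dim_customers
    config:
      contract:
        enforced: true          # dbt checks the model's columns against this definition at build time
    columns:
      - name: customer_id
        data_type: integer      # the declared type must match what the model's SQL returns
        constraints:
          - type: not_null      # assumption: constraint enforcement varies by platform
      - name: customer_name
        data_type: varchar(100)
```

With a contract enforced, a model whose query returns a missing, extra, or differently typed column fails at build time instead of materializing a table that downstream consumers do not expect.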

Data contracts play a crucial role in large-scale dbt implementations, particularly when adopting strategies like data mesh to foster separation of concerns across various data domains. We will delve deeper into data mesh strategies in an upcoming article.

In the realm of data mesh, where distinct segments of the data pipeline are often managed through separate dbt projects, the interdependencies between these projects necessitate well-defined interfaces. Establishing clear interfaces between dependent models is essential to ensure seamless integration and maintain the integrity of the overall data architecture.

Teradata session mode in dbt database connections

Teradata session mode is distinguished by its own approaches to transaction control, locking behavior, and error handling, all of which are very familiar to existing Teradata customers. Support for this mode in the dbt-teradata connector lets our customers incorporate their tried-and-tested stored procedures into their dbt projects.

Integrating Teradata session mode into our connector was a top priority for our team. We’re thrilled to announce that this feature is now available, marking a significant milestone in our ongoing commitment to improving user experience and compatibility within the Teradata ecosystem.

To set up Teradata session mode, choose TERA mode in the related dbt project profile.
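
As a rough sketch, a profile entry that selects TERA mode might look like the following. The profile name, target name, host, credentials, and schema are placeholders, not values from this article; tmode is the connector's profile field for the transaction mode.

```yaml
# profiles.yml (illustrative sketch; every value except tmode is a placeholder)
my_teradata_project:
  target: dev
  outputs:
    dev:
      type: teradata
      host: <your-vantage-host>
      user: <your-username>
      password: <your-password>
      schema: <your-database>
      tmode: TERA               # Teradata session mode; ANSI is the other commonly used value
```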

 

Feedback and questions

We value your insights and perspective! Feel free to connect with me on LinkedIn and explore the wealth of resources available on the Teradata Developer Portal and Teradata Developer Community.

About Daniel Herrera

Daniel Herrera is a builder and problem-solver fueled by the opportunity to create tools that aid individuals in extracting valuable insights from data. As a technical product manager, Daniel specialized in data ingestion and extract, transform, and load (ETL) for enterprise applications. He’s actively contributed as a developer, developer advocate, and open-source contributor in the data engineering space. Certified as a Cloud Solutions Architect in Microsoft Azure, his proficiency extends to programming languages including SQL, Python, JavaScript, and Solidity.
