Dataform is a service for data analysts to develop, test, version control, and schedule complex SQL workflows for data transformation in BigQuery. In this course you will explore the components of Dataform core, learn how to define tables and dependencies in SQLX, document BigQuery tables and views, understand BigQuery security settings and how to manage these with Dataform, write assertions, execute SQL workflows, and explore additional advanced use cases.

 

After completing the Orchestrate BigQuery Workloads with Dataform course, students will be able to:

  • Understand the components of Dataform core.
  • Create tables and views in BigQuery using Dataform.
  • Document BigQuery tables and views.
  • Understand BigQuery security settings and how to manage them with Dataform.
  • Use assertions to validate data in Dataform workflows.
  • Execute Dataform SQL workflows in an automated fashion.

Key features of Orchestrate BigQuery Workloads with Dataform:

  • Dataform allows you to create repositories for managing your code.

  • Dataform enables you to create workspaces for development.

  • Dataform allows you to develop SQL workflows with Dataform core within a development workspace.

  • Dataform compiles the Dataform core code in your workspace.

  • Dataform runs the dependency tree.
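
To make the idea of running a dependency tree concrete, here is a minimal Python sketch. It is a conceptual illustration only and does not use Dataform's API: the action names and the `execution_order` helper are hypothetical, and the ordering comes from Python's standard-library `graphlib` module. Each action lists the actions it depends on, and the sorter returns an order in which every dependency runs before the actions that need it.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each action maps to the set of actions it depends on.
dependencies = {
    "raw_orders": set(),                     # declared data source
    "stg_orders": {"raw_orders"},            # staging view built on the source
    "orders_daily": {"stg_orders"},          # aggregated table
    "assert_orders_unique": {"stg_orders"},  # data quality check
}


def execution_order(graph):
    """Return the actions in an order where every dependency runs first."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dependencies)
print(order)
```

Actions with no dependency on each other, such as `orders_daily` and `assert_orders_unique` here, may appear in either order, which is why a real scheduler is free to run them in parallel.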

Who can participate in the training Orchestrate BigQuery Workloads with Dataform?

  • Any data analyst, data engineer, or other data professional who wishes to use Dataform to orchestrate data workloads in BigQuery

What are the prerequisites for the training?

  • Knowledge of SQL data analysis and BigQuery as discussed in BigQuery for Data Analysis

Learning objectives of the course:

    • Create curated, up-to-date, reliable, and well-documented tables in BigQuery.
    • Facilitate collaboration between data analysts and data engineers within the same repository.
    • Create scalable data pipelines in BigQuery using SQL.
    • Connect with GitHub and GitLab.
    • Maintain updated tables without the need to manage infrastructure.

    Modules covered in the course Orchestrate BigQuery Workloads with Dataform

    Topics

    • SQL workflow
    • Repositories and workspaces
    • Default files and folders
    • Compiled graphs

    Objectives

    • Understand the components of Dataform core.

    Topics

    • Declare a data source.
    • Create a table.
    • Create an incremental table.
    • Set partitioning and clustering options.
    • Create an empty table.
    • Create an external BigLake table.
    • Create views and materialized views.
    • Define dependencies.

    Objectives

    • Create tables and views in BigQuery using Dataform.

    Topics

    • Use column descriptions
    • Use globally defined JavaScript constants
    • Add labels

    Objectives

    • Document BigQuery tables and views.

    Activities

    • 1 lab

    Topics

    • IAM dataset and table/view access
    • Column-level security
    • Row-level security

    Objectives

    • Understand BigQuery security settings and how to manage them with Dataform.

    Topics

    • Use built-in assertions.
    • Create manual assertions.

    Objectives

    • Use assertions to validate data in Dataform workflows.

    Activities

    • 1 lab

    Topics

    • Dataform code lifecycle
    • What happens during compilation
    • Customize and schedule compilation results
    • Execute workflows (UI, Cloud Scheduler, Cloud Composer)
    • Logging and monitoring

    Objectives

    • Execute Dataform SQL workflows in an automated fashion.

    Activities

    • 1 lab

    Topics

    • Create a BigLake table after file upload using Cloud Run functions.
    • Build a Machine Learning pipeline with BigQuery ML.
    • Work with Slowly Changing Dimensions Type 2.

    Objectives

    • Explore additional use cases for Dataform.

    Activities

    • 1 lab

    Course ID: 22665

    Dataform is a platform that allows data analysts to manage data transformation workflows in SQL, enabling them to develop, test, and schedule complex workflows in BigQuery.

    Dataform directly connects to BigQuery, allowing users to define tables, manage dependencies, and execute SQL workflows seamlessly within the BigQuery environment.

    Dataform integrates with GitHub and GitLab, allowing you to version control your SQL code and collaborate with your team effectively.

    SQLX is a SQL-based language used in Dataform that allows you to define tables, dependencies, and documentation for your data transformations.

    Dataform provides features to document tables and views directly in your SQLX code, making it easier to maintain clear and accessible documentation.

    Dataform allows you to manage BigQuery security settings, enabling you to control access to your data and workflows effectively.

    You can write assertions in Dataform to validate data quality by creating checks that will be executed as part of your workflow.
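
The general pattern behind such a check can be sketched in Python: an assertion is a query that selects the rows violating a rule, and the check passes only when that query returns no rows. This is a rough illustration under stated assumptions, not Dataform code: it uses an in-memory SQLite database as a stand-in for BigQuery, and the `orders` table, its columns, and the `failing_rows` helper are all hypothetical.

```python
import sqlite3


def failing_rows(conn, assertion_sql):
    """Run an assertion query; every row it returns is a violation."""
    return conn.execute(assertion_sql).fetchall()


# In-memory SQLite database standing in for a BigQuery dataset (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 25.5), (2, 25.5), (3, None)],
)

# Each assertion selects the rows that break a rule.
duplicate_ids = (
    "SELECT order_id FROM orders GROUP BY order_id HAVING COUNT(*) > 1"
)
null_amounts = "SELECT order_id FROM orders WHERE amount IS NULL"

for name, sql in [
    ("unique order_id", duplicate_ids),
    ("non-null amount", null_amounts),
]:
    rows = failing_rows(conn, sql)
    status = "passed" if not rows else f"failed ({len(rows)} violating row(s))"
    print(f"{name}: {status}")
```

Writing the check as "select the bad rows" keeps the failing records available for inspection whenever a check does not pass.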

    Advanced use cases include building complex data transformation pipelines, integrating with other data tools, and automating data workflows for large-scale data processing.
