Overview
Databricks workflows are powerful tools for orchestrating data processing, machine learning, and analytics tasks. However, creating truly flexible and reusable workflows often requires dynamically injecting context and runtime information. This is where Databricks Dynamic Value References come in. They provide a simple yet powerful mechanism to access job and task metadata, parameters, and results from previous tasks directly within your job configurations, enabling more automated, context-aware workflows.
Introduction to Dynamic Value References in Databricks
Imagine needing to name output files based on the specific run ID of a job, or wanting a task to behave differently depending on how the job was triggered. Manually configuring these details for every run or variation is tedious and error-prone. Dynamic Value References solve this by acting as placeholders in your job and task settings.
These references are essentially variables, enclosed in double curly braces (e.g., {{job.run_id}}), that Databricks automatically replaces with their actual values when a job or task runs. They provide access to a wealth of information, including:
- Job-specific details (ID, name, start time, trigger type)
- Task-specific details (name, run ID, execution count)
- Runtime metadata (workspace ID, URL)
- User-defined parameters (job and task level)
- Outputs and states of upstream tasks
By leveraging these references, you can build workflows that adapt to their execution context, pass information seamlessly between tasks, and reduce the need for hardcoded values.
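For concreteness, here are a few representative references from each category (a hedged sampling; consult the Databricks documentation for the full list):
- Job details: {{job.id}}, {{job.run_id}}, {{job.start_time.iso_date}}, {{job.trigger.type}}
- Task details: {{task.name}}, {{task.run_id}}, {{task.execution_count}}
- Runtime metadata: {{workspace.id}}, {{workspace.url}}
- Parameters: {{job.parameters.<name>}}
- Upstream task results: {{tasks.<task_name>.result_state}}, {{tasks.<task_name>.values.<key>}}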
How Do Dynamic Value References Work?
The core mechanism behind dynamic value references is string substitution at runtime.
- Syntax: You define a dynamic value reference using double curly braces: {{namespace.value}}. For example, {{job.id}} references the unique ID of the current job. User-provided identifiers (like parameter names) should use alphanumeric characters and underscores. If your identifier contains special characters, enclose it in backticks: {{tasks.`my-task-name`.run_id}}.
- Configuration: You insert these references into configuration fields in the Databricks Jobs UI, or in the JSON definition when creating jobs via the Databricks REST API or CLI. Fields that support these references often have a { } button in the UI, which provides a handy list of available references for easy insertion.
- Runtime Substitution: When a job run starts, and before a task executes, Databricks scans the task's configuration for these {{ }} patterns. It replaces each valid reference with its corresponding string value for that specific run. For instance, if you set a task parameter like {"output_path": "/processed_data/{{job.run_id}}/"} and the job run ID is 12345, the actual value passed to the task for output_path will be /processed_data/12345/.
- Scope and Usage: Crucially, dynamic value references are resolved within the job and task configuration settings. You cannot use them directly inside the code of your notebooks, scripts, or JARs (e.g., you can't write print({{job.run_id}}) in a Python notebook). Instead, you must pass the dynamic value reference as a parameter to the task. The task code can then access the resolved value through standard parameter handling mechanisms, such as dbutils.widgets.get("parameter_name") in notebooks; see the sketch after this list.
- Error Handling: It’s important to note how errors are handled:
- Syntax errors (like missing braces) or references to non-existent values are often silently ignored and treated as literal strings.
- Using an invalid property within a known namespace (e.g., {{job.non_existent_property}}) might result in an error message in the UI or logs.
- The contents within {{ }} are not evaluated as expressions; you cannot perform operations like {{job.repair_count + 1}}.
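To illustrate the parameter-passing pattern, here is a minimal Python sketch of the notebook side, assuming the task's configuration passes a base parameter named run_id_param set to {{job.run_id}} (the parameter name and output path are illustrative, not prescribed):

# Notebook cell: the job configuration passes
#   "run_id_param": "{{job.run_id}}"
# so this widget receives the already-resolved string value.
dbutils.widgets.text("run_id_param", "")        # declare the widget with an empty default
run_id = dbutils.widgets.get("run_id_param")    # e.g. "12345" at runtime

# Use the resolved value like any ordinary Python string,
# for example to build a run-specific output path.
output_path = f"/processed_data/{run_id}/"
print(f"Writing results to {output_path}")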
Examples:
- JSON - Job Parameter:
{
  "parameters": [
    {
      "name": "job_start_date_iso",
      "default": "{{job.start_time.iso_date}}"
    }
  ]
}
- JSON - Notebook Task Parameter:
{
  "notebook_task": {
    "notebook_path": "/Repos/user@example.com/process_data",
    "base_parameters": {
      "run_id_param": "{{job.run_id}}",
      "trigger_type_param": "{{job.trigger.type}}"
    }
  }
}
- Referencing Task Output (SQL Task Example): If an upstream SQL task named query_sales outputs columns region and total_sales, a downstream task could reference the first row's region value using a parameter like {"sales_region": "{{tasks.query_sales.output.first_row.region}}"}, as sketched below.
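To show where that parameter lives, here is a hedged sketch of the downstream task in a job's JSON definition. The task key report_sales, the notebook path, and the depends_on wiring are illustrative assumptions; only the dynamic reference itself comes from the example above:

{
  "task_key": "report_sales",
  "depends_on": [ { "task_key": "query_sales" } ],
  "notebook_task": {
    "notebook_path": "/Repos/user@example.com/report_sales",
    "base_parameters": {
      "sales_region": "{{tasks.query_sales.output.first_row.region}}"
    }
  }
}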
Advantages
Using dynamic value references offers several key advantages:
- Increased Automation: Automate the inclusion of run-specific information (like IDs, timestamps) in configurations, logs, or output paths without manual intervention.
- Enhanced Dynamism: Create workflows that adapt based on runtime context, such as the trigger type ({{job.trigger.type}}) or the success/failure state of previous tasks ({{tasks.<task_name>.result_state}}).
- Improved Reusability: Design more generic job templates that can be reused across different scenarios by parameterizing context-dependent values.
- Reduced Hardcoding: Minimize hardcoded values like dates, IDs, or environment details, making jobs easier to maintain and migrate.
- Seamless Information Flow: Pass crucial metadata or even data outputs (like results from SQL tasks using {{tasks.<task_name>.output.*}}) between tasks in a structured way. Task Values ({{tasks.<task_name>.values.<value_name>}}) offer another powerful way to pass custom outputs between tasks; a short sketch follows this list.
- Simplified Conditional Execution: Use references like {{tasks.<task_name>.result_state}} in task run conditions to control workflow branching.
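As an illustration of Task Values, here is a minimal Python sketch. The upstream task key compute_metrics, the key row_count, and the downstream parameter name rows_param are hypothetical names for this example:

# Upstream notebook task (task key "compute_metrics"):
# publish a custom key-value pair for downstream tasks to consume.
row_count = 42  # stand-in for a real computed result
dbutils.jobs.taskValues.set(key="row_count", value=row_count)

# A downstream task's configuration can then consume it via a parameter:
#   "rows_param": "{{tasks.compute_metrics.values.row_count}}"
# and the downstream notebook reads the resolved value as usual:
rows = dbutils.widgets.get("rows_param")
print(f"Upstream row count: {rows}")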
Conclusion
Dynamic Value References are an indispensable feature for anyone building sophisticated workflows in Databricks. By providing controlled access to runtime context and metadata directly within job configurations, they enable greater automation, flexibility, and maintainability.
Drop a query if you have any questions regarding Dynamic Value References, and we will get back to you quickly.
About CloudThat
CloudThat is a leading provider of Cloud Training and Consulting services with a global presence in India, the USA, Asia, Europe, and Africa. Specializing in AWS, Microsoft Azure, GCP, VMware, Databricks, and more, the company serves mid-market and enterprise clients, offering comprehensive expertise in Cloud Migration, Data Platforms, DevOps, IoT, AI/ML, and more.
CloudThat is the first Indian Company to win the prestigious Microsoft Partner 2024 Award and is recognized as a top-tier partner with AWS and Microsoft, including the prestigious ‘Think Big’ partner award from AWS and the Microsoft Superstars FY 2023 award in Asia & India. Having trained 650k+ professionals in 500+ cloud certifications and completed 300+ consulting projects globally, CloudThat is an official AWS Advanced Consulting Partner, Microsoft Gold Partner, AWS Training Partner, AWS Migration Partner, AWS Data and Analytics Partner, AWS DevOps Competency Partner, AWS GenAI Competency Partner, Amazon QuickSight Service Delivery Partner, Amazon EKS Service Delivery Partner, AWS Microsoft Workload Partners, Amazon EC2 Service Delivery Partner, Amazon ECS Service Delivery Partner, AWS Glue Service Delivery Partner, Amazon Redshift Service Delivery Partner, AWS Control Tower Service Delivery Partner, AWS WAF Service Delivery Partner, Amazon CloudFront Service Delivery Partner, Amazon OpenSearch Service Delivery Partner, AWS DMS Service Delivery Partner, AWS Systems Manager Service Delivery Partner, Amazon RDS Service Delivery Partner, AWS CloudFormation Service Delivery Partner and many more.
FAQs
1. Can I use dynamic value references directly inside my Python notebook or SQL script?
ANS: – No. Dynamic value references are resolved before the task code runs. You must pass the reference as a task parameter. Then, within your notebook or script, you access the resolved value using standard methods for reading parameters (e.g., dbutils.widgets.get() in notebooks).
2. Can I perform calculations or formatting within the {{ }}?
ANS: – No, the content inside the braces is not evaluated as an expression. For instance, {{job.repair_count + 1}} will not work. However, some references, like time values (job.start_time), have built-in formatting options (e.g., {{job.start_time.iso_date}}).
3. How are dynamic value references different from Task Values?
ANS: – Dynamic value references primarily expose metadata and configuration provided by the Databricks Jobs system or passed as parameters. Task Values are specifically designed for tasks to programmatically output custom key-value pairs (using dbutils.jobs.taskValues.set("key", "value") in a notebook, for example) that downstream tasks can then consume using the {{tasks.<task_name>.values.<key_name>}} dynamic reference. They work together to facilitate inter-task communication.

WRITTEN BY Yaswanth Tippa
Yaswanth Tippa is working as a Research Associate - Data and AIoT at CloudThat. He is a highly passionate and self-motivated individual with experience in data engineering and cloud computing with substantial expertise in building solutions for complex business problems involving large-scale data warehousing and reporting.