Overview
Databricks workflows are powerful tools for orchestrating data processing, machine learning, and analytics tasks. However, creating truly flexible and reusable workflows often requires dynamically injecting context and runtime information. This is where Databricks Dynamic Value References come in. They provide a simple yet powerful mechanism to access job and task metadata, parameters, and results from previous tasks, directly within your job configurations. This allows for more automated, context-aware, and maintainable workflows.
Introduction to Dynamic Value References in Databricks
Imagine needing to name output files based on the specific run ID of a job, or wanting a task to behave differently depending on how the job was triggered. Manually configuring these details for every run or variation is tedious and error-prone. Dynamic Value References solve this by acting as placeholders in your job and task settings.
These references are essentially variables, enclosed in double curly braces (e.g., {{job.run_id}}), that Databricks automatically replaces with their actual values when a job or task runs. They provide access to a wealth of information, including:
- Job-specific details (ID, name, start time, trigger type)
- Task-specific details (name, run ID, execution count)
- Runtime metadata (workspace ID, URL)
- User-defined parameters (job and task level)
- Outputs and states of upstream tasks
By leveraging these references, you can build workflows that adapt to their execution context, pass information seamlessly between tasks, and reduce the need for hardcoded values.
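For concreteness, a few commonly used references within these categories (as named in the Databricks documentation) include:
- {{job.id}} and {{job.run_id}} - the job's ID and the unique ID of the current run
- {{job.start_time.iso_date}} - the run's start time, formatted as an ISO date
- {{job.trigger.type}} - how the run was started (e.g., on a schedule or manually)
- {{task.name}} and {{task.run_id}} - the current task's name and run ID
- {{workspace.id}} and {{workspace.url}} - metadata about the hosting workspace
- {{job.parameters.<name>}} - a job-level parameter by name
- {{tasks.<task_name>.values.<key>}} - a custom value set by an upstream task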
How Do Dynamic Value References Work?
The core mechanism behind dynamic value references is string substitution at runtime.
- Syntax: You define a dynamic value reference using double curly braces: {{namespace.value}}. For example, {{job.id}} references the unique ID of the current job. User-provided identifiers (like parameter names) should use alphanumeric characters and underscores. If your identifier contains special characters, enclose it in backticks: {{tasks.`my-task-name`.run_id}}.
- Configuration: You insert these references into various configuration fields within the Databricks Jobs UI or when defining jobs using the Databricks REST API or CLI (in the JSON definition). Fields that support these references often have a { } button in the UI, which provides a handy list of available references for easy insertion.
- Runtime Substitution: When a job run starts, and before a task executes, Databricks scans the task’s configuration for these {{ }} patterns. It replaces each valid reference with its corresponding string value for that specific run. For instance, if you set a task parameter like {"output_path": "/processed_data/{{job.run_id}}/"}, and the job run ID is 12345, the actual value passed to the task for output_path will be /processed_data/12345/.
- Scope and Usage: Crucially, dynamic value references are resolved within the job and task configuration settings. You cannot use them directly inside the code of your notebooks, scripts, or JARs (e.g., you can’t write print({{job.run_id}}) in a Python notebook). Instead, you must pass the dynamic value reference as a parameter to the task. The task code can then access the resolved value through standard parameter handling mechanisms, like dbutils.widgets.get("parameter_name") in notebooks (see the sketch after this list).
- Error Handling: It’s important to note how errors are handled:
- Syntax errors (like missing braces) or references to non-existent values are often silently ignored and treated as literal strings.
- Using an invalid property within a known namespace (e.g., {{job.non_existent_property}}) might result in an error message in the UI or logs.
- The contents within {{ }} are not evaluated as expressions; you cannot perform operations like {{job.repair_count + 1}}.
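To make the parameter-passing pattern concrete, here is a minimal notebook sketch. It assumes a notebook task whose base_parameters map run_id_param to {{job.run_id}}; the parameter name is illustrative:

# Notebook task code: read the resolved dynamic value as a widget parameter.
# "run_id_param" is whatever name you chose in the task's base_parameters.
run_id = dbutils.widgets.get("run_id_param")

# Substitution happened before the notebook started, so this prints the
# plain run ID string, e.g., "12345".
print(f"Running as part of job run {run_id}")

# Use the value, for example, to build a run-specific output path.
output_path = f"/processed_data/{run_id}/"

Because the reference is resolved before execution, the notebook only ever sees an ordinary string parameter.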
Examples:
- JSON - Job Parameter:

{
  "parameters": [
    {
      "name": "job_start_date_iso",
      "default": "{{job.start_time.iso_date}}"
    }
  ]
}

- JSON - Notebook Task Parameter:

{
  "notebook_task": {
    "notebook_path": "/Repos/user@example.com/process_data",
    "base_parameters": {
      "run_id_param": "{{job.run_id}}",
      "trigger_type_param": "{{job.trigger.type}}"
    }
  }
}
- Referencing Task Output (SQL Task Example): If an upstream SQL task named query_sales outputs columns region and total_sales, a downstream task could reference the first row’s region value using a parameter like: {"sales_region": "{{tasks.query_sales.output.first_row.region}}"}
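As a fuller sketch of that pattern, a downstream task definition might look like the following (Jobs API-style JSON; the task key, notebook path, and parameter names are illustrative):

{
  "task_key": "report_sales",
  "depends_on": [ { "task_key": "query_sales" } ],
  "notebook_task": {
    "notebook_path": "/Repos/user@example.com/report_sales",
    "base_parameters": {
      "sales_region": "{{tasks.query_sales.output.first_row.region}}",
      "sales_total": "{{tasks.query_sales.output.first_row.total_sales}}"
    }
  }
}

Note that substituted values arrive as strings, so numeric columns such as total_sales should be cast inside the task code.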
Advantages
Using dynamic value references offers several key advantages:
- Increased Automation: Automate the inclusion of run-specific information (like IDs, timestamps) in configurations, logs, or output paths without manual intervention.
- Enhanced Dynamism: Create workflows that adapt based on runtime context, such as the trigger type ({{job.trigger.type}}) or the success/failure state of previous tasks ({{tasks.<task_name>.result_state}}).
- Improved Reusability: Design more generic job templates that can be reused across different scenarios by parameterizing context-dependent values.
- Reduced Hardcoding: Minimize hardcoded values like dates, IDs, or environment details, making jobs easier to maintain and migrate.
- Seamless Information Flow: Pass crucial metadata or even data outputs (like results from SQL tasks using {{tasks.<task_name>.output.*}}) between tasks in a structured way. Task Values ({{tasks.<task_name>.values.<value_name>}}) offer another powerful way to pass custom outputs between tasks.
- Simplified Conditional Execution: Use references like {{tasks.<task_name>.result_state}} in task run conditions to control workflow branching (see the sketch after this list).
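As a sketch of that last point, an If/else condition task could gate downstream branches on an upstream task's result. This is Jobs API-style JSON with illustrative task names; result_state values are lowercase strings such as success or failed, and "run_if": "ALL_DONE" lets the check run even if the upstream task fails:

{
  "task_key": "check_load_status",
  "depends_on": [ { "task_key": "load_data" } ],
  "run_if": "ALL_DONE",
  "condition_task": {
    "op": "EQUAL_TO",
    "left": "{{tasks.load_data.result_state}}",
    "right": "success"
  }
}

Downstream tasks can then depend on check_load_status and run only on its true or false outcome.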
Conclusion
Dynamic Value References are an indispensable feature for anyone building sophisticated workflows in Databricks. By providing controlled access to runtime context and metadata directly within job configurations, they enable greater automation, flexibility, and maintainability.
Drop a query if you have any questions regarding Dynamic Value References, and we will get back to you quickly.
About CloudThat
CloudThat is an award-winning company and the first in India to offer cloud training and consulting services worldwide. As a Microsoft Solutions Partner, AWS Advanced Tier Training Partner, and Google Cloud Platform Partner, CloudThat has empowered over 850,000 professionals through 600+ cloud certifications, winning global recognition for its training excellence, including 20 MCT Trainers in Microsoft’s Global Top 100 and an impressive 12 awards in the last 8 years. CloudThat specializes in Cloud Migration, Data Platforms, DevOps, IoT, and cutting-edge technologies like Gen AI & AI/ML. It has delivered over 500 consulting projects for 250+ organizations in 30+ countries as it continues to empower professionals and enterprises to thrive in the digital-first world.
FAQs
1. Can I use dynamic value references directly inside my Python notebook or SQL script?
ANS: – No. Dynamic value references are resolved before the task code runs. You must pass the reference as a task parameter. Then, within your notebook or script, you access the resolved value using standard methods for reading parameters (e.g., dbutils.widgets.get() in notebooks).
2. Can I perform calculations or formatting within the {{ }}?
ANS: – No, the content inside the braces is not evaluated as an expression. For instance, {{job.repair_count + 1}} will not work. However, some references, like time values (job.start_time), have built-in formatting options (e.g., {{job.start_time.iso_date}}).
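A few of the documented formatting options, for illustration (check the Databricks documentation for the full list):
- {{job.start_time.iso_date}} - the date portion, e.g., 2025-01-15
- {{job.start_time.iso_datetime}} - the full ISO 8601 timestamp
- {{job.start_time.year}}, {{job.start_time.month}}, {{job.start_time.day}} - individual components
- {{job.start_time.timestamp_ms}} - milliseconds since the Unix epoch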
3. How are dynamic value references different from Task Values?
ANS: – Dynamic value references primarily expose metadata and configuration provided by the Databricks Jobs system or passed as parameters. Task Values are specifically designed for tasks to programmatically output custom key-value pairs (using dbutils.jobs.taskValues.set("key", "value") in a notebook, for example) that downstream tasks can then consume using the {{tasks.<task_name>.values.<key_name>}} dynamic reference. They work together to facilitate inter-task communication.
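A minimal sketch of both halves of this pattern; the task key and value name are illustrative:

# Upstream notebook task (task_key "validate_data"):
# publish a custom value for downstream tasks to consume.
row_count = 42  # illustrative result of some validation logic
dbutils.jobs.taskValues.set(key="row_count", value=row_count)

# Downstream task, option 1: reference the value in the task configuration,
# e.g., a base_parameter of "{{tasks.validate_data.values.row_count}}".

# Downstream task, option 2: read it programmatically in the notebook.
row_count = dbutils.jobs.taskValues.get(
    taskKey="validate_data",
    key="row_count",
    default=0,     # returned if the upstream task never set the key
    debugValue=0,  # used when running the notebook outside a job
)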

WRITTEN BY Yaswanth Tippa
Yaswanth Tippa is working as a Research Associate - Data and AIoT at CloudThat. He is a highly passionate and self-motivated individual with experience in data engineering and cloud computing with substantial expertise in building solutions for complex business problems involving large-scale data warehousing and reporting.