Google Dataflow Job

This page shows how to write Terraform for a Dataflow Job and how to write it securely.

google_dataflow_job (Terraform)

A Job in Dataflow can be configured in Terraform with the resource name google_dataflow_job. The following sections describe three examples of how to use the resource and its parameters.

Example Usage from GitHub

job.tf#L7
resource "google_dataflow_job" "main" {
  for_each = { for v in local._job_conf : v.name => v }

  name                  = each.value.name
  template_gcs_path     = each.value.template_gcs_path
  temp_gcs_location     = each.value.temp_gcs_location
}

dataflow_setup.tf#L6
resource "google_dataflow_job" "big_data_job" {
  name              = var.data_flow_name
  template_gcs_path = var.template_gcs_path
  temp_gcs_location = var.temp_gcs_location
  project           = var.project_name
  network           = var.network_name
}

main.tf#L1
resource "google_dataflow_job" "dataflow_job" {
  region                = var.region
  zone                  = var.zone
  name                  = var.name
  on_delete             = var.on_delete
  max_workers           = var.max_workers
  template_gcs_path     = var.template_gcs_path # required argument
  temp_gcs_location     = var.temp_gcs_location # required argument
}

Review your Terraform file for Google best practices

Shisho Cloud, our free checker that verifies your Terraform configuration follows best practices, is available (beta).

Parameters

  • additional_experiments optional - set of string

List of experiments that should be used by the job. An example value is ["enable_stackdriver_agent_metrics"].

  • enable_streaming_engine optional - bool

Indicates if the job should use the streaming engine feature.

  • ip_configuration optional - string

The configuration for VM IPs. Options are "WORKER_IP_PUBLIC" or "WORKER_IP_PRIVATE".
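
A minimal sketch of the private-IP option, combined with the network and subnetwork parameters described below; the template path, bucket, VPC, and subnetwork names are hypothetical placeholders:

resource "google_dataflow_job" "private_ip_job" {
  name              = "wordcount-private"
  template_gcs_path = "gs://dataflow-templates/latest/Word_Count" # hypothetical template path
  temp_gcs_location = "gs://my-bucket/tmp"                        # hypothetical bucket

  # Keep worker VMs off the public internet; this assumes Private Google
  # Access is enabled on the chosen subnetwork.
  ip_configuration = "WORKER_IP_PRIVATE"
  network          = "my-vpc"                                    # hypothetical VPC
  subnetwork       = "regions/us-central1/subnetworks/my-subnet" # hypothetical subnetwork
}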

  • job_id optional computed - string

The unique ID of this job.

  • kms_key_name optional - string

The name for the Cloud KMS key for the job. Key format is: projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
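
As a sketch, a customer-managed key in that format can be attached as follows; the project, key ring, and key names are hypothetical:

resource "google_dataflow_job" "cmek_job" {
  name              = "job-with-cmek"
  template_gcs_path = "gs://my-bucket/templates/my-template" # hypothetical template
  temp_gcs_location = "gs://my-bucket/tmp"                   # hypothetical bucket

  # Format: projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
  kms_key_name = "projects/my-project/locations/us-central1/keyRings/my-ring/cryptoKeys/my-key"
}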

  • labels optional - map from string to string

User labels to be specified for the job. Keys and values should follow the restrictions specified in the labeling restrictions page. NOTE: Google-provided Dataflow templates often provide default labels that begin with goog-dataflow-provided. Unless explicitly set in config, these labels will be ignored to prevent diffs on re-apply.

  • machine_type optional - string

The machine type to use for the job.

  • max_workers optional - number

The number of workers permitted to work on the job. More workers may improve processing speed at additional cost.

  • name required - string

A unique name for the resource, required by Dataflow.

  • network optional - string

The network to which VMs will be assigned. If it is not provided, "default" will be used.

One of "drain" or "cancel". Specifies behavior of deletion during terraform destroy.

  • parameters optional - map from string to string

Key/Value pairs to be passed to the Dataflow job (as used in the template).
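
A short sketch combining this parameter with on_delete above; the parameter names inputFile and output are placeholders and depend entirely on the template in use:

resource "google_dataflow_job" "templated_job" {
  name              = "templated-job"
  template_gcs_path = "gs://my-bucket/templates/my-template" # hypothetical template
  temp_gcs_location = "gs://my-bucket/tmp"                   # hypothetical bucket

  # Drain in-flight work instead of cancelling on terraform destroy.
  on_delete = "drain"

  # Passed through to the template's runtime parameters (names are template-specific).
  parameters = {
    inputFile = "gs://my-bucket/input/*.txt"
    output    = "gs://my-bucket/output/results"
  }
}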

  • project optional computed - string

The project in which the resource belongs.

  • region optional - string

The region in which the created job should run.

  • service_account_email optional - string

The Service Account email used to create the job.

  • state optional computed - string

The current state of the resource, selected from the JobState enum.

  • subnetwork optional - string

The subnetwork to which VMs will be assigned. Should be of the form "regions/REGION/subnetworks/SUBNETWORK".

  • temp_gcs_location required - string

A writeable location on Google Cloud Storage for the Dataflow job to dump its temporary data.

  • template_gcs_path required - string

The Google Cloud Storage path to the Dataflow job template.

  • transform_name_mapping optional - map from string to string

Only applicable when updating a pipeline. Map of transform name prefixes of the job to be replaced with the corresponding name prefixes of the new job.
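
A hedged sketch of such an update; the transform prefixes and template path are illustrative only:

resource "google_dataflow_job" "updated_job" {
  name              = "streaming-job"
  template_gcs_path = "gs://my-bucket/templates/my-template-v2" # hypothetical updated template
  temp_gcs_location = "gs://my-bucket/tmp"                      # hypothetical bucket

  # Maps transform name prefixes of the running job to prefixes in the
  # replacement job so state can be carried across the update.
  transform_name_mapping = {
    oldTransform = "newTransform"
  }
}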

  • type optional computed - string

The type of this job, selected from the JobType enum.

  • zone optional - string

The zone in which the created job should run. If it is not provided, the provider zone is used.
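
Putting the worker and placement parameters above together, here is a minimal sketch; every name, bucket, label, and email below is a hypothetical placeholder:

resource "google_dataflow_job" "full_example" {
  name              = "example-job"
  template_gcs_path = "gs://my-bucket/templates/my-template" # hypothetical template
  temp_gcs_location = "gs://my-bucket/tmp"                   # hypothetical bucket

  # Placement: region for regional placement; zone pins workers to one zone.
  region = "us-central1"
  zone   = "us-central1-a"

  # Worker sizing and cost controls.
  machine_type = "n1-standard-2"
  max_workers  = 5

  # Optional features described above.
  enable_streaming_engine = true
  additional_experiments  = ["enable_stackdriver_agent_metrics"]

  # Run the job as a dedicated, least-privilege service account (hypothetical).
  service_account_email = "dataflow-runner@my-project.iam.gserviceaccount.com"

  labels = {
    team = "data-platform"
  }
}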

Explanation in Terraform Registry

Creates a job on Dataflow, which is an implementation of Apache Beam running on Google Compute Engine. For more information see the official documentation for Beam and Dataflow.

Frequently asked questions

What is Google Dataflow Job?

Google Dataflow Job is a resource for Dataflow of Google Cloud Platform. Settings can be written in Terraform.

Where can I find the example code for the Google Dataflow Job?

For Terraform, the AtsushiKitano/assets, abhidatametica/ibc-ibx-pilot and marcelopicarelli/google-datalake source code examples are useful. See the Example Usage from GitHub section above for further details.
