
Google Dataproc Autoscaling Policy

This page shows how to write Terraform for a Dataproc Autoscaling Policy and how to configure it securely.

google_dataproc_autoscaling_policy (Terraform)

A Dataproc Autoscaling Policy can be configured in Terraform with the resource google_dataproc_autoscaling_policy. The following sections show five usage examples from GitHub, followed by the resource's parameters.

Example Usage from GitHub

atpk2266/gcp-deployment-master

main.tf#L1

resource "google_dataproc_autoscaling_policy" "dp_asp" {
  policy_id = var.policy_name
  project   = var.project_id
  location  = var.region

  worker_config {

yourth/ai-notebooks

main.tf#L17

resource "google_dataproc_autoscaling_policy" "policy_a" {
  provider  = google-beta
  project   = var.project_id
  policy_id = "policy-a"

  worker_config {

woop/feast-test

dataproc.tf#L8

resource "google_dataproc_autoscaling_policy" "feast_dataproc_cluster_asp" {
  policy_id = var.name_prefix
  location  = var.region
  project   = var.gcp_project_name

  worker_config {

feast-dev/feast

dataproc.tf#L8

resource "google_dataproc_autoscaling_policy" "feast_dataproc_cluster_asp" {
  policy_id = var.name_prefix
  location  = var.region
  project   = var.gcp_project_name

  worker_config {

rwang249/terraform

main.tf#L68

resource "google_dataproc_autoscaling_policy" "asp" {
  policy_id = var.dataproc_autoscale_policy
  location  = var.region

  worker_config {
    max_instances = var.autoscale_max_instances

Parameters

id optional computed - string

location optional - string

The location where the autoscaling policy should reside. The default value is 'global'.

name optional computed - string

The "resource name" of the autoscaling policy.

policy_id required - string

The policy id. The id must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), and hyphens (-). Cannot begin or end with underscore or hyphen. Must consist of between 3 and 50 characters.
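
The naming rules above can be checked at plan time with a Terraform variable validation block. This is a minimal sketch, not part of the provider: the variable name policy_name is borrowed from the first GitHub example, and the regular expression is one possible encoding of the stated rules (3 to 50 characters, letters, digits, underscores, and hyphens, with no leading or trailing underscore or hyphen).

```hcl
# Hypothetical input variable; the name matches var.policy_name in the
# first GitHub example but is otherwise an assumption.
variable "policy_name" {
  type        = string
  description = "ID for the Dataproc autoscaling policy."

  validation {
    # First and last characters must be alphanumeric; the 1 to 48
    # characters in between may also be underscores or hyphens,
    # giving a total length of 3 to 50.
    condition     = can(regex("^[a-zA-Z0-9][a-zA-Z0-9_-]{1,48}[a-zA-Z0-9]$", var.policy_name))
    error_message = "The policy ID must be 3-50 characters of letters, numbers, underscores, or hyphens, and must not begin or end with an underscore or hyphen."
  }
}
```

With this in place, a value such as "-bad-id" should be rejected by terraform plan before any API call is made.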

project optional computed - string

basic_algorithm list block
- cooldown_period optional - string
Duration between scaling events. A scaling period starts after the update operation from the previous event has completed. Bounds: [2m, 1d]. Default: 2m.
- yarn_config list block
  - graceful_decommission_timeout required - string
  Timeout for YARN graceful decommissioning of Node Managers. Specifies the duration to wait for jobs to complete before forcefully removing workers (and potentially interrupting jobs). Only applicable to downscaling operations. Bounds: [0s, 1d].
  - scale_down_factor required - number
  Fraction of average pending memory in the last cooldown period for which to remove workers. A scale-down factor of 1 will result in scaling down so that there is no available memory remaining after the update (more aggressive scaling). A scale-down factor of 0 disables removing workers, which can be beneficial for autoscaling a single job. Bounds: [0.0, 1.0].
  - scale_down_min_worker_fraction optional - number
  Minimum scale-down threshold as a fraction of total cluster size before scaling occurs. For example, in a 20-worker cluster, a threshold of 0.1 means the autoscaler must recommend at least a 2 worker scale-down for the cluster to scale. A threshold of 0 means the autoscaler will scale down on any recommended change. Bounds: [0.0, 1.0]. Default: 0.0.
  - scale_up_factor required - number
  Fraction of average pending memory in the last cooldown period for which to add workers. A scale-up factor of 1.0 will result in scaling up so that there is no pending memory remaining after the update (more aggressive scaling). A scale-up factor closer to 0 will result in a smaller magnitude of scaling up (less aggressive scaling). Bounds: [0.0, 1.0].
  - scale_up_min_worker_fraction optional - number
  Minimum scale-up threshold as a fraction of total cluster size before scaling occurs. For example, in a 20-worker cluster, a threshold of 0.1 means the autoscaler must recommend at least a 2-worker scale-up for the cluster to scale. A threshold of 0 means the autoscaler will scale up on any recommended change. Bounds: [0.0, 1.0]. Default: 0.0.
secondary_worker_config list block
- max_instances optional - number
Maximum number of instances for this group. Note that by default, clusters will not use secondary workers. Required for secondary workers if the minimum number of secondary instances is set. Bounds: [minInstances, ∞), that is, no upper bound. Defaults to 0.
- min_instances optional - number
Minimum number of instances for this group. Bounds: [0, maxInstances]. Defaults to 0.
- weight optional - number
Weight for the instance group, which is used to determine the fraction of total workers in the cluster from this instance group. For example, if primary workers have weight 2, and secondary workers have weight 1, the cluster will have approximately 2 primary workers for each secondary worker. The cluster may not reach the specified balance if constrained by min/max bounds or other autoscaling settings. For example, if maxInstances for secondary workers is 0, then only primary workers will be added. The cluster can also be out of balance when created. If weight is not set on any instance group, the cluster will default to equal weight for all groups: the cluster will attempt to maintain an equal number of workers in each group within the configured size bounds for each group. If weight is set for one group only, the cluster will default to zero weight on the unset group. For example if weight is set only on primary workers, the cluster will use primary workers only and no secondary workers.
timeouts single block
- create optional - string
- delete optional - string
- update optional - string
worker_config list block
- max_instances required - number
Maximum number of instances for this group.
- min_instances optional - number
Minimum number of instances for this group. Bounds: [2, maxInstances]. Defaults to 2.
- weight optional - number
Weight for the instance group, which is used to determine the fraction of total workers in the cluster from this instance group. For example, if primary workers have weight 2, and secondary workers have weight 1, the cluster will have approximately 2 primary workers for each secondary worker. The cluster may not reach the specified balance if constrained by min/max bounds or other autoscaling settings. For example, if maxInstances for secondary workers is 0, then only primary workers will be added. The cluster can also be out of balance when created. If weight is not set on any instance group, the cluster will default to equal weight for all groups: the cluster will attempt to maintain an equal number of workers in each group within the configured size bounds for each group. If weight is set for one group only, the cluster will default to zero weight on the unset group. For example if weight is set only on primary workers, the cluster will use primary workers only and no secondary workers.

>> from Terraform Registry
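
Because the GitHub excerpts above are cut off after the opening of worker_config, a complete policy may be easier to read. The following is an illustrative sketch that uses only the parameters listed above; the resource label, policy ID, region, and every numeric value are assumptions chosen to stay within the documented bounds, not recommended settings.

```hcl
resource "google_dataproc_autoscaling_policy" "example" {
  policy_id = "example-policy" # hypothetical ID, 3 to 50 characters
  location  = "us-central1"    # hypothetical region

  # Primary workers: min_instances must be at least 2.
  worker_config {
    min_instances = 2
    max_instances = 10
    weight        = 2
  }

  # Secondary workers: unused unless max_instances is raised above 0.
  # With weights 2 and 1, the cluster aims for roughly two primary
  # workers per secondary worker, subject to the bounds in each group.
  secondary_worker_config {
    min_instances = 0
    max_instances = 20
    weight        = 1
  }

  basic_algorithm {
    # Durations are written in seconds here; 240s lies within [2m, 1d].
    cooldown_period = "240s"

    yarn_config {
      # Wait up to one hour for running jobs before removing workers.
      graceful_decommission_timeout = "3600s"

      # Add workers for half of the average pending memory.
      scale_up_factor = 0.5
      # Ignore scale-up recommendations smaller than 10% of the cluster,
      # for example fewer than 2 workers in a 20-worker cluster.
      scale_up_min_worker_fraction = 0.1

      scale_down_factor              = 0.5
      scale_down_min_worker_fraction = 0.0
    }
  }
}
```

Setting scale_down_factor to 0 instead would disable worker removal, which the parameter description notes can help when autoscaling a single job.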

Explanation in Terraform Registry

Describes an autoscaling policy for Dataproc cluster autoscaler.

>> from Terraform Registry
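
A policy has no effect until a cluster references it. The sketch below shows one way to attach a policy to a google_dataproc_cluster through the autoscaling_config block; the resource labels, names, and region are hypothetical, and it assumes the policy location and the cluster region need to match.

```hcl
# Minimal policy to attach; all values are illustrative.
resource "google_dataproc_autoscaling_policy" "attached" {
  policy_id = "attached-policy"
  location  = "us-central1"

  worker_config {
    max_instances = 3
  }

  basic_algorithm {
    yarn_config {
      graceful_decommission_timeout = "30s"
      scale_up_factor               = 0.5
      scale_down_factor             = 0.5
    }
  }
}

# Hypothetical cluster that uses the policy above.
resource "google_dataproc_cluster" "autoscaled" {
  name   = "autoscaled-cluster"
  region = "us-central1" # assumed to match the policy location

  cluster_config {
    autoscaling_config {
      # The computed "name" attribute holds the policy's resource name.
      policy_uri = google_dataproc_autoscaling_policy.attached.name
    }
  }
}
```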

Other Related Google Dataproc Resources

Google Dataproc Cluster

Google Dataproc Cluster IAM

Google Dataproc Job

Google Dataproc Job IAM

Google Dataproc Workflow Template

Frequently asked questions

What is Google Dataproc Autoscaling Policy?

Google Dataproc Autoscaling Policy is a Dataproc resource on Google Cloud Platform. Its settings can be written in Terraform.

Where can I find the example code for the Google Dataproc Autoscaling Policy?

For Terraform, the atpk2266/gcp-deployment-master, yourth/ai-notebooks, and woop/feast-test source code examples are useful. See the Example Usage from GitHub section for further details.
