Google BigQuery Job

This page shows how to write Terraform configuration for a BigQuery Job and how to write it securely.

Review your .tf file for Google best practices

Shisho Cloud, our free checker to make sure your Terraform configuration follows best practices, is available (beta).

google_bigquery_job (Terraform)

The Job in BigQuery can be configured in Terraform with the resource name google_bigquery_job. The following sections describe how to use the resource and its parameters.

Example Usage from GitHub

An example could not be found in GitHub.

Parameters

id optional computed - string
job_id required - string

The ID of the job. The ID must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), or dashes (-). The maximum length is 1,024 characters.

job_timeout_ms optional - string

Job timeout in milliseconds. If this time limit is exceeded, BigQuery may attempt to terminate the job.

job_type optional computed - string

The type of the job.

labels optional - map from string to string

The labels associated with this job. You can use these to organize and group your jobs.

location optional - string

The geographic location of the job. The default value is US.

project optional computed - string
status optional computed - list of object

The status of this job. Examine this value when polling an asynchronous job to see if the job is complete.

error_result - list of object
- location - string
- message - string
- reason - string
errors - list of object
- location - string
- message - string
- reason - string
state - string
user_email optional computed - string

Email address of the user who ran the job.
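The top-level arguments above can be combined as in the following minimal sketch. The resource name, job ID, label, project, dataset, and table names are all hypothetical placeholders, and a query block is included only because a job needs one job type to run.

```hcl
# Minimal sketch: every name and ID below is a hypothetical placeholder.
resource "google_bigquery_job" "example" {
  job_id         = "example_query_job" # letters, numbers, underscores, dashes only
  location       = "US"
  job_timeout_ms = "600000"            # a string, not a number: 10 minutes

  labels = {
    team = "analytics"
  }

  query {
    query          = "SELECT 1 AS one"
    use_legacy_sql = false

    # Assumed to point at a dataset that already exists.
    destination_table {
      project_id = "my-project"
      dataset_id = "my_dataset"
      table_id   = "example_result"
    }
  }
}

# status is a computed list of objects, so the state is read from its first element.
output "example_job_state" {
  value = google_bigquery_job.example.status[0].state
}
```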

copy list block
- create_disposition optional - string
Specifies whether the job is allowed to create new tables. The following values are supported: CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table. CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result. Creation, truncation and append actions occur as one atomic update upon job completion. Default value: "CREATE_IF_NEEDED" Possible values: ["CREATE_IF_NEEDED", "CREATE_NEVER"]
- write_disposition optional - string
Specifies the action that occurs if the destination table already exists. The following values are supported: WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data and uses the schema from the query result. WRITE_APPEND: If the table already exists, BigQuery appends the data to the table. WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result. Each action is atomic and only occurs if BigQuery is able to complete the job successfully. Creation, truncation and append actions occur as one atomic update upon job completion. Default value: "WRITE_EMPTY" Possible values: ["WRITE_TRUNCATE", "WRITE_APPEND", "WRITE_EMPTY"]
- destination_encryption_configuration list block
  - kms_key_name required - string
  Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key.
- destination_table list block
  - dataset_id optional computed - string
  The ID of the dataset containing this table.
  - project_id optional computed - string
  The ID of the project containing this table.
  - table_id required - string
  The table. Can be specified as '[[table_id]]' if 'project_id' and 'dataset_id' are also set, or in the form 'projects/[[project]]/datasets/[[dataset_id]]/tables/[[table_id]]' if not.
- source_tables list block
  - dataset_id optional computed - string
  The ID of the dataset containing this table.
  - project_id optional computed - string
  The ID of the project containing this table.
  - table_id required - string
  The table. Can be specified as '[[table_id]]' if 'project_id' and 'dataset_id' are also set, or in the form 'projects/[[project]]/datasets/[[dataset_id]]/tables/[[table_id]]' if not.
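As an illustration of the copy block, the following sketch copies one table into another and protects the destination with a customer-managed key. The project, dataset, table, and KMS key names are hypothetical, and the key is assumed to already be accessible to the BigQuery service account.

```hcl
# Sketch of a copy job: all project, dataset, table, and key names are hypothetical.
resource "google_bigquery_job" "copy_example" {
  job_id = "example_copy_job"

  copy {
    source_tables {
      project_id = "my-project"
      dataset_id = "source_dataset"
      table_id   = "source_table"
    }

    destination_table {
      project_id = "my-project"
      dataset_id = "dest_dataset"
      table_id   = "dest_table"
    }

    create_disposition = "CREATE_IF_NEEDED"
    write_disposition  = "WRITE_TRUNCATE" # overwrite the destination if it exists

    # Optional: encrypt the destination table with a Cloud KMS key.
    destination_encryption_configuration {
      kms_key_name = "projects/my-project/locations/us/keyRings/my-ring/cryptoKeys/my-key"
    }
  }
}
```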
extract list block
- compression optional - string
The compression type to use for exported files. Possible values include GZIP, DEFLATE, SNAPPY, and NONE. The default value is NONE. DEFLATE and SNAPPY are only supported for Avro.
- destination_format optional computed - string
The exported file format. Possible values include CSV, NEWLINE_DELIMITED_JSON and AVRO for tables and SAVED_MODEL for models. The default value for tables is CSV. Tables with nested or repeated fields cannot be exported as CSV. The default value for models is SAVED_MODEL.
- destination_uris required - list of string
A list of fully-qualified Google Cloud Storage URIs where the extracted table should be written.
- field_delimiter optional computed - string
When extracting data in CSV format, this defines the delimiter to use between fields in the exported data. Default is ','
- print_header optional - bool
Whether to print out a header row in the results. Default is true.
- use_avro_logical_types optional - bool
Whether to use logical types when extracting to AVRO format.
- source_model list block
  - dataset_id required - string
  The ID of the dataset containing this model.
  - model_id required - string
  The ID of the model.
  - project_id required - string
  The ID of the project containing this model.
- source_table list block
  - dataset_id optional computed - string
  The ID of the dataset containing this table.
  - project_id optional computed - string
  The ID of the project containing this table.
  - table_id required - string
  The table. Can be specified as '[[table_id]]' if 'project_id' and 'dataset_id' are also set, or in the form 'projects/[[project]]/datasets/[[dataset_id]]/tables/[[table_id]]' if not.
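The extract block might be used as follows to export a table to Cloud Storage as compressed newline-delimited JSON. The bucket, project, dataset, and table names are hypothetical placeholders.

```hcl
# Sketch of an extract job: the bucket and table identifiers are hypothetical.
resource "google_bigquery_job" "extract_example" {
  job_id = "example_extract_job"

  extract {
    destination_uris = ["gs://my-export-bucket/exports/my_table.json.gz"]

    source_table {
      project_id = "my-project"
      dataset_id = "my_dataset"
      table_id   = "my_table"
    }

    # JSON output supports nested and repeated fields, unlike CSV.
    destination_format = "NEWLINE_DELIMITED_JSON"
    compression        = "GZIP"
  }
}
```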
load list block
- allow_jagged_rows optional - bool
Accept rows that are missing trailing optional columns. The missing values are treated as nulls. If false, records with missing trailing columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. Only applicable to CSV, ignored for other formats.
- allow_quoted_newlines optional - bool
Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file. The default value is false.
- autodetect optional - bool
Indicates if we should automatically infer the options and schema for CSV and JSON sources.
- create_disposition optional - string
Specifies whether the job is allowed to create new tables. The following values are supported: CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table. CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result. Creation, truncation and append actions occur as one atomic update upon job completion. Default value: "CREATE_IF_NEEDED" Possible values: ["CREATE_IF_NEEDED", "CREATE_NEVER"]
- encoding optional - string
The character encoding of the data. The supported values are UTF-8 or ISO-8859-1. The default value is UTF-8. BigQuery decodes the data after the raw, binary data has been split using the values of the quote and fieldDelimiter properties.
- field_delimiter optional computed - string
The separator for fields in a CSV file. The separator can be any ISO-8859-1 single-byte character. To use a character in the range 128-255, you must encode the character as UTF8. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. BigQuery also supports the escape sequence "\t" to specify a tab separator. The default value is a comma (',').
- ignore_unknown_values optional - bool
Indicates if BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. The sourceFormat property determines what BigQuery treats as an extra value: for CSV, trailing columns; for JSON, named values that don't match any column names.
- max_bad_records optional - number
The maximum number of bad records that BigQuery can ignore when running the job. If the number of bad records exceeds this value, an invalid error is returned in the job result. The default value is 0, which requires that all records are valid.
- null_marker optional - string
Specifies a string that represents a null value in a CSV file. For example, if you specify "\N", BigQuery interprets "\N" as a null value when loading a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error if an empty string is present for all data types except for STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as an empty value.
- projection_fields optional - list of string
If sourceFormat is set to "DATASTORE_BACKUP", indicates which entity properties to load into BigQuery from a Cloud Datastore backup. Property names are case sensitive and must be top-level properties. If no properties are specified, BigQuery loads all properties. If any named property isn't found in the Cloud Datastore backup, an invalid error is returned in the job result.
- quote optional computed - string
The value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double-quote ('"'). If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the allowQuotedNewlines property to true.
- schema_update_options optional - list of string
Allows the schema of the destination table to be updated as a side effect of the load job if a schema is autodetected or supplied in the job configuration. Schema update options are supported in two cases: when writeDisposition is WRITE_APPEND; when writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators. For normal tables, WRITE_TRUNCATE will always overwrite the schema. One or more of the following values are specified: ALLOW_FIELD_ADDITION: allow adding a nullable field to the schema. ALLOW_FIELD_RELAXATION: allow relaxing a required field in the original schema to nullable.
- skip_leading_rows optional - number
The number of rows at the top of a CSV file that BigQuery will skip when loading the data. The default value is 0. This property is useful if you have header rows in the file that should be skipped. When autodetect is on, the behavior is the following: skipLeadingRows unspecified - Autodetect tries to detect headers in the first row. If they are not detected, the row is read as data. Otherwise data is read starting from the second row. skipLeadingRows is 0 - Instructs autodetect that there are no headers and data should be read starting from the first row. skipLeadingRows = N > 0 - Autodetect skips N-1 rows and tries to detect headers in row N. If headers are not detected, row N is just skipped. Otherwise row N is used to extract column names for the detected schema.
- source_format optional - string
The format of the data files. For CSV files, specify "CSV". For datastore backups, specify "DATASTORE_BACKUP". For newline-delimited JSON, specify "NEWLINE_DELIMITED_JSON". For Avro, specify "AVRO". For parquet, specify "PARQUET". For orc, specify "ORC". The default value is CSV.
- source_uris required - list of string
The fully-qualified URIs that point to your data in Google Cloud. For Google Cloud Storage URIs: Each URI can contain one '*' wildcard character and it must come after the 'bucket' name. Size limits related to load jobs apply to external data sources. For Google Cloud Bigtable URIs: Exactly one URI can be specified and it has to be a fully specified and valid HTTPS URL for a Google Cloud Bigtable table. For Google Cloud Datastore backups: Exactly one URI can be specified. Also, the '*' wildcard character is not allowed.
- write_disposition optional - string
Specifies the action that occurs if the destination table already exists. The following values are supported: WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data and uses the schema from the query result. WRITE_APPEND: If the table already exists, BigQuery appends the data to the table. WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result. Each action is atomic and only occurs if BigQuery is able to complete the job successfully. Creation, truncation and append actions occur as one atomic update upon job completion. Default value: "WRITE_EMPTY" Possible values: ["WRITE_TRUNCATE", "WRITE_APPEND", "WRITE_EMPTY"]
- destination_encryption_configuration list block
  - kms_key_name required - string
  Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key.
- destination_table list block
  - dataset_id optional computed - string
  The ID of the dataset containing this table.
  - project_id optional computed - string
  The ID of the project containing this table.
  - table_id required - string
  The table. Can be specified as '[[table_id]]' if 'project_id' and 'dataset_id' are also set, or in the form 'projects/[[project]]/datasets/[[dataset_id]]/tables/[[table_id]]' if not.
- time_partitioning list block
  - expiration_ms optional - string
  Number of milliseconds for which to keep the storage for a partition. A wrapper is used here because 0 is an invalid value.
  - field optional - string
  If not set, the table is partitioned by pseudo column '_PARTITIONTIME'; if set, the table is partitioned by this field. The field must be a top-level TIMESTAMP or DATE field. Its mode must be NULLABLE or REQUIRED. A wrapper is used here because an empty string is an invalid value.
  - type required - string
  The only type supported is DAY, which will generate one partition per day. Providing an empty string used to cause an error, but in OnePlatform the field will be treated as unset.
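To make the load parameters concrete, here is a sketch of a CSV load from Cloud Storage that appends to an existing table. The bucket path and table identifiers are hypothetical, and the file is assumed to have a single header row.

```hcl
# Sketch of a CSV load job: the bucket path and table identifiers are hypothetical.
resource "google_bigquery_job" "load_example" {
  job_id = "example_load_job"

  load {
    # One '*' wildcard is allowed, and it must come after the bucket name.
    source_uris = ["gs://my-bucket/data/*.csv"]

    destination_table {
      project_id = "my-project"
      dataset_id = "my_dataset"
      table_id   = "my_table"
    }

    source_format     = "CSV"
    skip_leading_rows = 1    # skip the assumed header row
    autodetect        = true # infer the schema from the data
    max_bad_records   = 0    # fail the job on any invalid record

    # Append to the table and allow new nullable columns to be added.
    write_disposition     = "WRITE_APPEND"
    schema_update_options = ["ALLOW_FIELD_ADDITION"]
  }
}
```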
query list block
- allow_large_results optional - bool
If true and query uses legacy SQL dialect, allows the query to produce arbitrarily large result tables at a slight cost in performance. Requires destinationTable to be set. For standard SQL queries, this flag is ignored and large results are always allowed. However, you must still set destinationTable when result size exceeds the allowed maximum response size.
- create_disposition optional - string
Specifies whether the job is allowed to create new tables. The following values are supported: CREATE_IF_NEEDED: If the table does not exist, BigQuery creates the table. CREATE_NEVER: The table must already exist. If it does not, a 'notFound' error is returned in the job result. Creation, truncation and append actions occur as one atomic update upon job completion. Default value: "CREATE_IF_NEEDED" Possible values: ["CREATE_IF_NEEDED", "CREATE_NEVER"]
- flatten_results optional - bool
If true and query uses legacy SQL dialect, flattens all nested and repeated fields in the query results. allowLargeResults must be true if this is set to false. For standard SQL queries, this flag is ignored and results are never flattened.
- maximum_billing_tier optional - number
Limits the billing tier for this job. Queries that have resource usage beyond this tier will fail (without incurring a charge). If unspecified, this will be set to your project default.
- maximum_bytes_billed optional - string
Limits the bytes billed for this job. Queries that will have bytes billed beyond this limit will fail (without incurring a charge). If unspecified, this will be set to your project default.
- parameter_mode optional - string
Standard SQL only. Set to POSITIONAL to use positional (?) query parameters or to NAMED to use named (@myparam) query parameters in this query.
- priority optional - string
Specifies a priority for the query. Default value: "INTERACTIVE" Possible values: ["INTERACTIVE", "BATCH"]
- query required - string
SQL query text to execute. The useLegacySql field can be used to indicate whether the query uses legacy SQL or standard SQL. NOTE: queries containing DML language ('DELETE', 'UPDATE', 'MERGE', 'INSERT') must specify 'create_disposition = ""' and 'write_disposition = ""'.
- schema_update_options optional - list of string
Allows the schema of the destination table to be updated as a side effect of the query job. Schema update options are supported in two cases: when writeDisposition is WRITE_APPEND; when writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators. For normal tables, WRITE_TRUNCATE will always overwrite the schema. One or more of the following values are specified: ALLOW_FIELD_ADDITION: allow adding a nullable field to the schema. ALLOW_FIELD_RELAXATION: allow relaxing a required field in the original schema to nullable.
- use_legacy_sql optional - bool
Specifies whether to use BigQuery's legacy SQL dialect for this query. The default value is true. If set to false, the query will use BigQuery's standard SQL.
- use_query_cache optional - bool
Whether to look for the result in the query cache. The query cache is a best-effort cache that will be flushed whenever tables in the query are modified. Moreover, the query cache is only available when a query does not have a destination table specified. The default value is true.
- write_disposition optional - string
Specifies the action that occurs if the destination table already exists. The following values are supported: WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data and uses the schema from the query result. WRITE_APPEND: If the table already exists, BigQuery appends the data to the table. WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result. Each action is atomic and only occurs if BigQuery is able to complete the job successfully. Creation, truncation and append actions occur as one atomic update upon job completion. Default value: "WRITE_EMPTY" Possible values: ["WRITE_TRUNCATE", "WRITE_APPEND", "WRITE_EMPTY"]
- default_dataset list block
  - dataset_id required - string
  The dataset. Can be specified as '[[dataset_id]]' if 'project_id' is also set, or in the form 'projects/[[project]]/datasets/[[dataset_id]]' if not.
  - project_id optional computed - string
  The ID of the project containing this table.
- destination_encryption_configuration list block
  - kms_key_name required - string
  Describes the Cloud KMS encryption key that will be used to protect the destination BigQuery table. The BigQuery Service Account associated with your project requires access to this encryption key.
- destination_table list block
  - dataset_id optional computed - string
  The ID of the dataset containing this table.
  - project_id optional computed - string
  The ID of the project containing this table.
  - table_id required - string
  The table. Can be specified as '[[table_id]]' if 'project_id' and 'dataset_id' are also set, or in the form 'projects/[[project]]/datasets/[[dataset_id]]/tables/[[table_id]]' if not.
- script_options list block
  - key_result_statement optional - string
  Determines which statement in the script represents the "key result", used to populate the schema and query results of the script job. Possible values: ["LAST", "FIRST_SELECT"]
  - statement_byte_budget optional - string
  Limit on the number of bytes billed per statement. Exceeding this budget results in an error.
  - statement_timeout_ms optional - string
  Timeout period for each statement in a script.
- user_defined_function_resources list block
  - inline_code optional - string
  An inline resource that contains code for a user-defined function (UDF). Providing an inline code resource is equivalent to providing a URI for a file containing the same code.
  - resource_uri optional - string
  A code resource to load from a Google Cloud Storage URI (gs://bucket/path).
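The query parameters above could be combined as in the two sketches below. The first writes a standard SQL result to a destination table with a cap on bytes billed; the second follows the note on the query parameter and sets both dispositions to empty strings for a DML statement. All project, dataset, and table names, and the SQL itself, are hypothetical.

```hcl
# Sketch 1: a batch query with a billing cap (all identifiers are hypothetical).
resource "google_bigquery_job" "query_example" {
  job_id = "example_batch_query_job"

  query {
    query                = "SELECT COUNT(*) AS row_count FROM events"
    use_legacy_sql       = false        # the documented default is legacy SQL
    priority             = "BATCH"
    maximum_bytes_billed = "1000000000" # a string; the query fails beyond this limit

    # Lets the query refer to the unqualified table name "events".
    default_dataset {
      project_id = "my-project"
      dataset_id = "my_dataset"
    }

    destination_table {
      project_id = "my-project"
      dataset_id = "my_dataset"
      table_id   = "event_counts"
    }

    write_disposition = "WRITE_TRUNCATE"
  }
}

# Sketch 2: a DML statement, which must set both dispositions to "".
resource "google_bigquery_job" "dml_example" {
  job_id = "example_dml_job"

  query {
    query              = "DELETE FROM `my-project.my_dataset.events` WHERE event_date < '2020-01-01'"
    use_legacy_sql     = false
    create_disposition = ""
    write_disposition  = ""
  }
}
```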
timeouts single block
- create optional - string
- delete optional - string
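The timeouts block takes duration strings. The sketch below assumes ten-minute limits purely for illustration; the job ID and table identifiers are hypothetical placeholders.

```hcl
# Sketch of custom operation timeouts: the durations and names are illustrative only.
resource "google_bigquery_job" "timeouts_example" {
  job_id = "example_job_with_timeouts"

  query {
    query          = "SELECT 1 AS one"
    use_legacy_sql = false

    destination_table {
      project_id = "my-project"
      dataset_id = "my_dataset"
      table_id   = "timeouts_result"
    }
  }

  timeouts {
    create = "10m"
    delete = "10m"
  }
}
```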

>> from Terraform Registry

Explanation in Terraform Registry

Jobs are actions that BigQuery runs on your behalf to load data, export data, query data, or copy data. Once a BigQuery job is created, it cannot be changed or deleted. To get more information about jobs, see:
- API documentation
- How-to Guides
  - BigQuery Jobs Intro

Tips: Best Practices for The Other Google BigQuery Resources

In addition to google_bigquery_job, Google BigQuery has other resources that should be configured with security in mind. Review the following examples of those resources and the precautions to take.

google_bigquery_dataset

Ensure your BigQuery dataset blocks unwanted access

It is better to block unwanted access from users outside the organization.
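One way to express this in Terraform is to list only the principals that need access in the dataset's access blocks. The following is a minimal sketch under that assumption; the dataset ID and email addresses are hypothetical, and it is not taken from a specific Shisho Cloud policy.

```hcl
# Sketch of a dataset with explicit, organization-internal access only.
# The dataset ID and email addresses are hypothetical placeholders.
resource "google_bigquery_dataset" "restricted" {
  dataset_id = "restricted_dataset"
  location   = "US"

  access {
    role          = "OWNER"
    user_by_email = "data-owner@example.com"
  }

  access {
    role           = "READER"
    group_by_email = "analysts@example.com"
  }

  # Avoid broad grants such as special_group = "allAuthenticatedUsers",
  # which would open the dataset to any signed-in Google account.
}
```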

Review your Google BigQuery settings

In addition to the above, there are other security points you should be aware of. You can check that your .tf files follow them with Shisho Cloud.

The Other Related Google BigQuery Resources

Google BigQuery Dataset

Google BigQuery Dataset Access

Google BigQuery Dataset IAM

Google BigQuery Routine

Google BigQuery Table

Google BigQuery Table IAM

Frequently asked questions

What is Google BigQuery Job?

Google BigQuery Job is a BigQuery resource on Google Cloud Platform. Its settings can be written in Terraform.
