AWS Glue Workflow

This page shows how to define an AWS Glue Workflow in Terraform and CloudFormation, and how to do so securely.

aws_glue_workflow (Terraform)

The Workflow in AWS Glue can be configured in Terraform with the resource name aws_glue_workflow. The following sections describe 3 examples of how to use the resource and its parameters.

Example Usage from GitHub

main.tf#L1
resource "aws_glue_workflow" "glue_workflow" {
  name = var.workflow_name
}

resource "aws_glue_security_configuration" "glue_security" {
  name = var.security_name
main.tf#L1
resource "aws_glue_workflow" "glue_workflow" {
  name = var.workflow_name
}

resource "aws_glue_security_configuration" "glue_security" {
  name = var.security_name
main.tf#L7
resource "aws_glue_workflow" "this" {
  default_run_properties = var.default_run_properties
  description            = var.description
  name                   = var.name
}
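
The last example above reads its arguments from input variables that the excerpt does not show. A minimal sketch of what those declarations might look like follows; the types, defaults, and descriptions are assumptions and are not taken from the original repository.

```hcl
# Hypothetical variables.tf for the example above; types and defaults are assumed.
variable "name" {
  description = "Name of the Glue workflow."
  type        = string
}

variable "description" {
  description = "Optional description of the workflow."
  type        = string
  default     = null
}

variable "default_run_properties" {
  description = "Key-value properties passed to every run of the workflow."
  type        = map(string)
  default     = {}
}
```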

Parameters
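
As a rough orientation, the sketch below sets the arguments commonly used with aws_glue_workflow. The argument names are those documented for the AWS provider; every value (workflow name, run properties, tags) is a placeholder, and max_concurrent_runs may require a reasonably recent provider version.

```hcl
resource "aws_glue_workflow" "nightly" {
  name        = "nightly-etl"          # placeholder name
  description = "Nightly ETL workflow" # optional

  # Optional properties made available to each run of the workflow.
  default_run_properties = {
    environment = "dev"
  }

  # Optional cap on the number of concurrent runs of this workflow.
  max_concurrent_runs = 1

  tags = {
    project = "example"
  }
}
```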

Explanation in Terraform Registry

Provides a Glue Workflow resource. The workflow graph (DAG) can be built using the aws_glue_trigger resource. The Terraform Registry illustrates this with a graph of four nodes (two triggers and two jobs).
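
A sketch of such a four-node graph, closely following the example in the Terraform Registry: an on-demand trigger starts one job, and a conditional trigger starts a second job once the first succeeds. The job names are placeholders and assume that Glue jobs with those names already exist.

```hcl
resource "aws_glue_workflow" "example" {
  name = "example"
}

# Node 1: on-demand trigger that starts the first job.
resource "aws_glue_trigger" "example_start" {
  name          = "trigger-start"
  type          = "ON_DEMAND"
  workflow_name = aws_glue_workflow.example.name

  actions {
    job_name = "example-job" # placeholder: an existing Glue job
  }
}

# Node 3: conditional trigger that fires when the first job succeeds
# and starts the second job.
resource "aws_glue_trigger" "example_inner" {
  name          = "trigger-inner"
  type          = "CONDITIONAL"
  workflow_name = aws_glue_workflow.example.name

  predicate {
    conditions {
      job_name = "example-job"
      state    = "SUCCEEDED"
    }
  }

  actions {
    job_name = "another-example-job" # placeholder: an existing Glue job
  }
}
```

Both triggers attach to the workflow through workflow_name, which is what makes them part of the same graph.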

AWS::Glue::Workflow (CloudFormation)

The Workflow in Glue can be configured in CloudFormation with the resource name AWS::Glue::Workflow. The following sections describe 10 examples of how to use the resource and its parameters.

Example Usage from GitHub

aws-glue.yml#L98
    Type: AWS::Glue::Workflow
    Properties:
      Description: workflow to execute DataQuest etl
      Name: DataQuestworkflow
Workflow3Triggers.yml#L62
    Type: AWS::Glue::Workflow
    Properties:
      Description: !Ref GlueWorkflowDescription
      Name: !Ref GlueWorkflowName

mart-glueetl-root.yml#L49
    Type: AWS::Glue::Workflow
    Properties:
        #DefaultRunProperties: Json
        Description: Mart workflow starting with DQ job
        Name: PlusMartWF
        #Tags: Json
Workflow2Triggers.yml#L42
    Type: AWS::Glue::Workflow
    Properties:
      Description: !Ref GlueWorkflowDescription
      Name: !Ref GlueWorkflowName

clean_template.yml#L124
    Type: AWS::Glue::Workflow
    Properties:
      Description: Testing Glue workflow creation via CloudFormation.

  EnergyDataCrawler:
    Type: AWS::Glue::Crawler
Redshift-reporting-pipeline.template.json#L329
      "Type": "AWS::Glue::Workflow",
      "Properties": {
        "Description": "Quarterly report pipeline for catalog sales",
        "Name": "quarterly-catalog-reporting",
        "Tags": {
          "project": "redshift-demo"
GlueWorkflowSpecification.json#L3
    "AWS::Glue::Workflow": {
      "Documentation": "http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-workflow.html",
      "Properties": {
        "Description": {
          "Required": false,
          "Documentation": "http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-workflow.html#cfn-glue-workflow-description",
CFN_Redshift_GlueJob.json#L289
            "Type": "AWS::Glue::Workflow",
            "Properties": {
                "Description": "Copies data into weather table, unloads in s3 bucket partitioned and runs Glue crawler",
                "Name": "AodRSWorkflow"
            }
        },
Glue-redshift-benchmark-workflow.template.json#L270
      "Type": "AWS::Glue::Workflow",
      "Properties": {
        "Description": "Use TPCDS benchmark Redshift",
        "Name": "redshift-benchmark",
        "Tags": {
          "project": "redshift-benchmark"
workflow-stack.template.json#L218
      "Type": "AWS::Glue::Workflow",
      "Properties": {
        "Description": "ETL workflow to convert CSV to parquet and then load into Redshift",
        "Name": "glue-workflow"
      },
      "Metadata": {

Parameters

Explanation in CloudFormation Registry

AWS::Glue::Workflow is an AWS Glue resource type that manages AWS Glue workflows. A workflow is a container for a set of related jobs, crawlers, and triggers in AWS Glue. Using a workflow, you can design a complex multi-job extract, transform, and load (ETL) activity that AWS Glue can execute and track as a single entity.

Frequently asked questions

What is AWS Glue Workflow?

AWS Glue Workflow is a resource of the AWS Glue service in Amazon Web Services. Its settings can be written in Terraform and CloudFormation.

Where can I find the example code for the AWS Glue Workflow?

For Terraform, the 1oglop1/aws-glue-monorepo-style, SJREDDY6/terra and niveklabs/aws source code examples are useful. See the Terraform Example section for further details.

For CloudFormation, the fergo2910/data-quest-infra, MarcoAP/AWSTraining and pradosh2008/cloudproject source code examples are useful. See the CloudFormation Example section for further details.