AWS Glue Job

This page shows how to define an AWS Glue Job in Terraform and CloudFormation, and how to configure it securely.

aws_glue_job (Terraform)

An AWS Glue job can be configured in Terraform with the aws_glue_job resource. The following sections show two examples of how to use the resource and its parameters.

Example Usage from GitHub

glue.tf#L11
resource "aws_glue_job" "bidb-cdc-data-load-glue-job" {
  name        = "bidb-cdc-data-load-glue-job"
  role_arn    = var.glue_s3_redshift_access_role
  connections = ["bidb-dev-datawarehouse-core-redshift-glue-connection"]
  default_arguments = {
    "--Secret"               = var.redshift_credentials_parameter_store
main.tf#L31
resource "aws_glue_job" "tf-gluejob-scheduled-1" {
  name     = "tf-gluejob-scheduled-1"
  role_arn = "arn:aws:iam::152944667076:role/glue_helper"

  command {
    script_location = "s3://mvil-glue/MySQLBYOD.py"

Parameters

Explanation in Terraform Registry

Provides a Glue Job resource. Note: Glue functionality, such as monitoring and logging of jobs, is typically managed with the default_arguments argument. See the Special Parameters Used by AWS Glue topic in the Glue developer guide for additional information.
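
As a minimal sketch of the note above, the job below enables job metrics and continuous CloudWatch logging through default_arguments. The resource label, job name, S3 path, and var.glue_role_arn are hypothetical placeholders, not values from the examples on this page. The argument keys are taken from the special parameters topic referenced above and should be checked against the Glue version in use.

```hcl
# Hypothetical input: ARN of an IAM role that AWS Glue is allowed to assume.
variable "glue_role_arn" {
  type = string
}

resource "aws_glue_job" "example" {
  name     = "example-etl-job" # placeholder name
  role_arn = var.glue_role_arn

  command {
    name            = "glueetl"
    script_location = "s3://example-bucket/scripts/job.py" # placeholder path
  }

  # Monitoring and logging are switched on through Glue's special parameters.
  default_arguments = {
    "--job-language"                     = "python"
    "--enable-continuous-cloudwatch-log" = "true"
    # Glue only checks that this key is present, so the value is left empty.
    "--enable-metrics" = ""
  }
}
```

Passing the role ARN as a variable, rather than hard-coding an account-specific ARN, keeps account IDs out of the configuration and lets the same file be reused across environments.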

AWS::Glue::Job (CloudFormation)

An AWS Glue job can be configured in CloudFormation with the AWS::Glue::Job resource. The following sections show several examples of how to use the resource and its parameters.

Example Usage from GitHub

HudiGlueJobCFn.yml#L18
    Type: AWS::Glue::Job
    Properties:
      Description: This job creates both Hudi and Glueparquet tables
      Command:
        Name: glueetl
        ScriptLocation: "s3://mqtran/artifacts/Glue/HudiJarGlueJob.py"
stack.yml#L124
        Type: "AWS::Glue::Job"
        Properties:
            Name: "tickit_public_category_refine"
            Description: ""
            Role: !Sub "arn:aws:iam::${AWS::AccountId}:role/service-role/AWSGlueServiceRole-demo"
            ExecutionProperty:
glue.yml#L114
    Type: "AWS::Glue::Job"
    Properties:
      Name: !Sub "business_aggregate_daily-${AWS::Region}"
      Role: !GetAtt IAMRole.Arn
      ExecutionProperty:
        MaxConcurrentRuns: 1
templateSkipAll.yml#L107
    Type: "AWS::Glue::Job"
    Properties:
      Command:
        NotScriptLocation: ./packages/gluejob/dummy.py
      Role: !GetAtt GlueRole.Arn
  DummyGlueJob2:
Glue-redshift-benchmark-workflow.template.json#L58
      "Type": "AWS::Glue::Job",
      "Properties": {
        "Command": {
          "Name": "pythonshell",
          "PythonVersion": "3",
          "ScriptLocation": {
Redshift-reporting-pipeline.template.json#L58
      "Type": "AWS::Glue::Job",
      "Properties": {
        "Command": {
          "Name": "pythonshell",
          "PythonVersion": "3",
          "ScriptLocation": {
glue.json#L343
            "Type": "AWS::Glue::Job",
            "Properties": {
                "Command": {
                    "Name": "glueetl",
                    "ScriptLocation": {
                        "Fn::Sub": [
glue-stack.json#L201
      "Type": "AWS::Glue::Job",
      "Properties": {
        "Role": {
          "Ref": "GlueIAMRole"
        },
        "DefaultArguments": {
glue-stack.json#L196
            "Type": "AWS::Glue::Job",
            "Properties": {
                "Role": {
                    "Fn::ImportValue": {
                        "Fn::Sub": "${InfrastructureID}-GlueRoleARN"
                    }

Parameters

Explanation in CloudFormation Registry

The AWS::Glue::Job resource specifies an AWS Glue job in the data catalog. For more information, see Adding Jobs in AWS Glue and Job Structure in the AWS Glue Developer Guide.

Frequently asked questions

What is AWS Glue Job?

AWS Glue Job is a resource of AWS Glue, a service of Amazon Web Services. Its settings can be written in Terraform and CloudFormation.

Where can I find the example code for the AWS Glue Job?

For Terraform, the devopsbynaresh/datalake-alsac and m-voels/tftest source code examples are useful. See the Terraform Example section for further details.

For CloudFormation, the mq-tran/hudi-glue, garystafford/tickit-data-lake-demo and aws-samples/aws-utility-meter-data-analytics-platform-cn source code examples are useful. See the CloudFormation Example section for further details.