AWS Glue Crawler
This page shows how to configure an AWS Glue crawler with Terraform and CloudFormation, and how to write those configurations securely.
aws_glue_crawler (Terraform)
The Crawler in AWS Glue can be configured in Terraform with the resource name aws_glue_crawler. The following sections describe how to use the resource and its parameters.
Example Usage from GitHub
An example could not be found in GitHub.
Parameters
arn - optional computed - string
classifiers - optional - list of string
configuration - optional - string
database_name - required - string
description - optional - string
id - optional computed - string
name - required - string
role - required - string
schedule - optional - string
security_configuration - optional - string
table_prefix - optional - string
tags - optional - map from string to string
catalog_target - list block
  database_name - required - string
  tables - required - list of string
dynamodb_target - list block
jdbc_target - list block
  connection_name - required - string
  exclusions - optional - list of string
  path - required - string
lineage_configuration - list block
  crawler_lineage_settings - optional - string
mongodb_target - list block
  connection_name - required - string
  path - required - string
  scan_all - optional - bool
recrawl_policy - list block
  recrawl_behavior - optional - string
s3_target - list block
  connection_name - optional - string
  exclusions - optional - list of string
  path - required - string
schema_change_policy - list block
  delete_behavior - optional - string
  update_behavior - optional - string
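Since no GitHub example is available for this resource, the following is a minimal sketch of how the required parameters (name, database_name, role) and a few optional ones fit together. All resource names, the bucket path, the table prefix, and the schedule are hypothetical placeholders, not values taken from a real project.

```hcl
# Hypothetical names throughout; adjust them to your environment.

# Catalog database that receives the tables the crawler discovers.
resource "aws_glue_catalog_database" "example" {
  name = "example_db"
}

# Trust policy that lets the Glue service assume the crawler role.
data "aws_iam_policy_document" "glue_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["glue.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "crawler" {
  name               = "example-glue-crawler-role"
  assume_role_policy = data.aws_iam_policy_document.glue_assume.json
}

# AWS managed policy with the baseline Glue service permissions.
resource "aws_iam_role_policy_attachment" "glue_service" {
  role       = aws_iam_role.crawler.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole"
}

resource "aws_glue_crawler" "example" {
  name          = "example-crawler"                       # required
  database_name = aws_glue_catalog_database.example.name  # required
  role          = aws_iam_role.crawler.arn                # required
  table_prefix  = "raw_"                                  # optional
  schedule      = "cron(0 2 * * ? *)"                     # optional, daily at 02:00 UTC

  # One s3_target block per S3 location to crawl.
  s3_target {
    path       = "s3://example-bucket/raw/"
    exclusions = ["**/_temporary/**"]
  }

  # Log deletions instead of dropping tables; apply schema updates in place.
  schema_change_policy {
    delete_behavior = "LOG"
    update_behavior = "UPDATE_IN_DATABASE"
  }
}
```

In this sketch the role only carries the managed Glue service policy. In practice it would also need read access to the crawled bucket, which is left out here to keep the example short.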
Explanation in Terraform Registry
Manages a Glue Crawler. More information can be found in the AWS Glue Developer Guide.
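Because this page is also about writing the crawler configuration securely, here is a hedged sketch of how the security_configuration parameter could be wired to an aws_glue_security_configuration resource. The variable names, resource names, and bucket path are hypothetical, and the encryption modes shown are only one possible choice.

```hcl
# Hypothetical inputs: an existing IAM role ARN and catalog database name.
variable "crawler_role_arn" {
  type = string
}

variable "database_name" {
  type = string
}

# Security configuration that encrypts crawler output written to S3.
resource "aws_glue_security_configuration" "crawler" {
  name = "example-crawler-security"

  encryption_configuration {
    # The KMS-based modes need a kms_key_arn, so they are left disabled
    # in this sketch to avoid assuming a key exists.
    cloudwatch_encryption {
      cloudwatch_encryption_mode = "DISABLED"
    }

    job_bookmarks_encryption {
      job_bookmarks_encryption_mode = "DISABLED"
    }

    s3_encryption {
      s3_encryption_mode = "SSE-S3"
    }
  }
}

resource "aws_glue_crawler" "secured" {
  name          = "example-secured-crawler"
  database_name = var.database_name
  role          = var.crawler_role_arn

  # The crawler references the security configuration by name.
  security_configuration = aws_glue_security_configuration.crawler.name

  s3_target {
    path = "s3://example-bucket/raw/"
  }
}
```

If a KMS key is available, switching the modes to the KMS variants and supplying kms_key_arn would tighten this further; that is an assumption about the environment rather than something this page specifies.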
AWS::Glue::Crawler (CloudFormation)
The Crawler in Glue can be configured in CloudFormation with the resource name AWS::Glue::Crawler. The following sections describe examples of how to use the resource and its parameters.
Example Usage from GitHub
Type: AWS::Glue::Crawler
DependsOn: GlueRole
Properties:
  Name: c_aggregated
  Description: !Sub Crawl aggregated datasets at s3://${Covid19Bucket}/covid19/world-cases-deaths-aggregates/
  DatabaseName: !Ref GlueDatabaseName

Type: AWS::Glue::Crawler
Properties:
  Name: ${self:custom.stage}-tvdata-raw-crawler
  Role:
    Fn::GetAtt: [RatingRole, Arn]
  DatabaseName:

Type: AWS::Glue::Crawler
Properties:
  Name: smart-hub-locations-csv
  Role: !GetAtt "CrawlerRole.Arn"
  Targets:
    CatalogTargets:

Type: "AWS::Glue::Crawler"
Properties:
  Name: "meter-data-business-aggregated-daily"
  Role: !Sub "service-role/${IAMRole}"
  Targets:
    S3Targets:

"Type": "AWS::Glue::Crawler",
"Properties": {
  "Name": {"Fn::Sub": "${AWS::StackName}-views-crawler"},
  "Role": {"Ref": "glueRole"},
  "DatabaseName": {
    "Ref": "viewDatabase"

"path": "/ResourceTypes/AWS::Glue::Crawler/Properties/Role/Value",
"value": {
  "ValueType": "AWS::IAM::Role.NameOrArn"
}
},
{

"Type": "AWS::Glue::Crawler",
"Properties": {
  "Role": {
    "Ref": "AWSGlueCuratedDatasetsCrawlerRoleName"
  },
  "DatabaseName": {

"Type": "AWS::Glue::Crawler",
"Properties": {
  "Name": "raw_crawler",
  "Role": {
    "Fn::GetAtt": [
      "GlueRole",

Parameters
Classifiers - optional - List
Description - optional - String
SchemaChangePolicy - optional - SchemaChangePolicy
Configuration - optional - String
RecrawlPolicy - optional - RecrawlPolicy
DatabaseName - optional - String
Targets - required - Targets
CrawlerSecurityConfiguration - optional - String
Name - optional - String
Role - required - String
Schedule - optional - Schedule
TablePrefix - optional - String
Tags - optional - Json
Explanation in CloudFormation Registry
The AWS::Glue::Crawler resource specifies an AWS Glue crawler. For more information, see Cataloging Tables with a Crawler and Crawler Structure in the AWS Glue Developer Guide.
Frequently asked questions
What is AWS Glue Crawler?
AWS Glue Crawler is a resource of the Glue service in Amazon Web Services. Its settings can be written in Terraform and CloudFormation.
Where can I find the example code for the AWS Glue Crawler?
For CloudFormation, the GirijaRaniGavara/provision-codepipeline-glue-workflows-, duyhoang15/test and ozzyince/tv source code examples are useful. See the CloudFormation Example section for further details.