AWS Glue Crawler
This page shows how to write Terraform and CloudFormation for AWS Glue Crawler and write them securely.
aws_glue_crawler (Terraform)
The Crawler in AWS Glue can be configured in Terraform with the resource name aws_glue_crawler. The following sections describe how to use the resource and its parameters.
Example Usage from GitHub
An example could not be found in GitHub.
Parameters
-
arnoptional computed - string -
classifiersoptional - list of string -
configurationoptional - string -
database_namerequired - string -
descriptionoptional - string -
idoptional computed - string -
namerequired - string -
rolerequired - string -
scheduleoptional - string -
security_configurationoptional - string -
table_prefixoptional - string -
tagsoptional - map from string to string -
catalog_targetlist block-
database_namerequired - string -
tablesrequired - list of string
-
-
dynamodb_targetlist block -
jdbc_targetlist block-
connection_namerequired - string -
exclusionsoptional - list of string -
pathrequired - string
-
-
lineage_configurationlist block-
crawler_lineage_settingsoptional - string
-
-
mongodb_targetlist block-
connection_namerequired - string -
pathrequired - string -
scan_alloptional - bool
-
-
recrawl_policylist block-
recrawl_behavioroptional - string
-
-
s3_targetlist block-
connection_nameoptional - string -
exclusionsoptional - list of string -
pathrequired - string
-
-
schema_change_policylist block-
delete_behavioroptional - string -
update_behavioroptional - string
-
Explanation in Terraform Registry
Manages a Glue Crawler. More information can be found in the AWS Glue Developer Guide
AWS::Glue::Crawler (CloudFormation)
The Crawler in Glue can be configured in CloudFormation with the resource name AWS::Glue::Crawler. The following sections describe 10 examples of how to use the resource and its parameters.
Example Usage from GitHub
Type: AWS::Glue::Crawler
DependsOn: GlueRole
Properties:
Name: c_aggregated
Description: !Sub Crawl aggregated datasets at s3://${Covid19Bucket}/covid19/world-cases-deaths-aggregates/
DatabaseName: !Ref GlueDatabaseName
Type: AWS::Glue::Crawler
DependsOn: GlueRole
Properties:
Name: c_aggregated
Description: !Sub Crawl aggregated datasets at s3://${Covid19Bucket}/covid19/world-cases-deaths-aggregates/
DatabaseName: !Ref GlueDatabaseName
Type: AWS::Glue::Crawler
Properties:
Name: ${self:custom.stage}-tvdata-raw-crawler
Role:
Fn::GetAtt: [RatingRole, Arn]
DatabaseName:
Type: AWS::Glue::Crawler
Properties:
Name: smart-hub-locations-csv
Role: !GetAtt "CrawlerRole.Arn"
Targets:
CatalogTargets:
Type: "AWS::Glue::Crawler"
Properties:
Name: "meter-data-business-aggregated-daily"
Role: !Sub "service-role/${IAMRole}"
Targets:
S3Targets:
"Type": "AWS::Glue::Crawler",
"Properties": {
"Name": {"Fn::Sub": "${AWS::StackName}-views-crawler"},
"Role": {"Ref": "glueRole"},
"DatabaseName": {
"Ref": "viewDatabase"
"path": "/ResourceTypes/AWS::Glue::Crawler/Properties/Role/Value",
"value": {
"ValueType": "AWS::IAM::Role.NameOrArn"
}
},
{
"path": "/ResourceTypes/AWS::Glue::Crawler/Properties/Role/Value",
"value": {
"ValueType": "AWS::IAM::Role.NameOrArn"
}
},
{
"Type": "AWS::Glue::Crawler",
"Properties": {
"Role": {
"Ref": "AWSGlueCuratedDatasetsCrawlerRoleName"
},
"DatabaseName": {
"Type": "AWS::Glue::Crawler",
"Properties": {
"Name": "raw_crawler",
"Role": {
"Fn::GetAtt": [
"GlueRole",
Parameters
-
Classifiersoptional - List -
Descriptionoptional - String -
SchemaChangePolicyoptional - SchemaChangePolicy -
Configurationoptional - String -
RecrawlPolicyoptional - RecrawlPolicy -
DatabaseNameoptional - String -
Targetsrequired - Targets -
CrawlerSecurityConfigurationoptional - String -
Nameoptional - String -
Rolerequired - String -
Scheduleoptional - Schedule -
TablePrefixoptional - String -
Tagsoptional - Json
Explanation in CloudFormation Registry
The
AWS::Glue::Crawlerresource specifies an AWS Glue crawler. For more information, see Cataloging Tables with a Crawler and Crawler Structure in the AWS Glue Developer Guide.
Frequently asked questions
What is AWS Glue Crawler?
AWS Glue Crawler is a resource for Glue of Amazon Web Service. Settings can be wrote in Terraform and CloudFormation.
Where can I find the example code for the AWS Glue Crawler?
For CloudFormation, the GirijaRaniGavara/provision-codepipeline-glue-workflows-, duyhoang15/test and ozzyince/tv source code examples are useful. See the CloudFormation Example section for further details.