Azure Synapse Spark Pool
This page shows how to write Terraform and Azure Resource Manager templates for a Synapse Spark Pool, and how to configure them securely.
azurerm_synapse_spark_pool (Terraform)
The Spark Pool in Synapse can be configured in Terraform with the resource name azurerm_synapse_spark_pool. The following sections describe examples of how to use the resource and its parameters.
Example Usage from GitHub
resource "azurerm_synapse_spark_pool" "synapseSparkPool001" {
  name                 = "SparkPool001"
  synapse_workspace_id = azurerm_synapse_workspace.synapseProduct001.id
  node_size_family     = "MemoryOptimized"
  node_size            = "Small"
}
resource "azurerm_synapse_spark_pool" "this" {
  for_each             = local.spark
  name                 = each.value.name
  synapse_workspace_id = azurerm_synapse_workspace.ws.id
  node_size_family     = var.node_size_family
}
resource "azurerm_synapse_spark_pool" "default" {
  name                 = "example"
  synapse_workspace_id = azurerm_synapse_workspace.example.id
  node_size_family     = "MemoryOptimized"
  node_size            = "Small"
}
resource "azurerm_synapse_spark_pool" "spark_pool" {
  name                 = azurecaf_name.sparkpool.result
  synapse_workspace_id = var.synapse_workspace_id
  node_size_family     = var.settings.node_size_family
  node_size            = var.settings.node_size
}
resource "azurerm_synapse_spark_pool" "spark_pool" {
  depends_on = [
    azurerm_synapse_workspace.synapse_workspace
  ]
  name                 = var.spark_pool_name
  synapse_workspace_id = azurerm_synapse_workspace.synapse_workspace.id
}
resource "azurerm_synapse_spark_pool" "coresynsqlsparkpools" {
  for_each             = var.coreSynSqlSparkPools
  name                 = each.value["sprkPoolName"]
  synapse_workspace_id = each.value["sprkSynWrkSpcId"]
  node_size_family     = each.value["sprkPoolNodeSizeFamily"]
  node_size            = each.value["sprkPoolNodeSize"]
}
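The for_each example above iterates over a map variable. A minimal sketch of how that variable might be declared (the object type is an assumption inferred from the keys used in the example):

```hcl
# Hypothetical declaration matching the keys referenced in the for_each example.
variable "coreSynSqlSparkPools" {
  type = map(object({
    sprkPoolName           = string
    sprkSynWrkSpcId        = string
    sprkPoolNodeSizeFamily = string
    sprkPoolNodeSize       = string
  }))
  default = {}
}
```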
resource "azurerm_synapse_spark_pool" "spark_pool" {
  name                 = azurecaf_name.sparkpool.result
  synapse_workspace_id = var.synapse_workspace_id
  node_size_family     = var.settings.node_size_family
  node_size            = var.settings.node_size
  node_count           = try(var.settings.node_count, null)
}
Parameters
id - optional, computed - string
name - required - string
node_count - optional - number
node_size - required - string
node_size_family - required - string
spark_events_folder - optional - string
spark_log_folder - optional - string
spark_version - optional - string
synapse_workspace_id - required - string
tags - optional - map from string to string
auto_pause - list block
  delay_in_minutes - required - number
auto_scale - list block
  max_node_count - required - number
  min_node_count - required - number
library_requirement - list block
timeouts - single block
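Putting the block parameters above together, a minimal sketch of a pool that scales and pauses automatically might look like the following (the resource names and the Spark version are placeholders; supported spark_version values depend on the provider and region):

```hcl
resource "azurerm_synapse_spark_pool" "example" {
  name                 = "example"
  synapse_workspace_id = azurerm_synapse_workspace.example.id
  node_size_family     = "MemoryOptimized"
  node_size            = "Small"

  # With auto_scale configured, node_count is managed by the service
  # and should not be set on the resource.
  auto_scale {
    min_node_count = 3
    max_node_count = 10
  }

  # Pause the pool after 15 idle minutes to save cost.
  auto_pause {
    delay_in_minutes = 15
  }
}
```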
Explanation in Terraform Registry
Manages a Synapse Spark Pool.
Tips: Best Practices for Other Azure Synapse Resources
In addition to azurerm_synapse_workspace, Azure Synapse has other resources that should be configured for security reasons. Please check the examples of those resources and the precautions below.
azurerm_synapse_workspace
Ensure the managed virtual network is enabled
It is better to enable the managed virtual network, which is disabled by default.
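For example, the managed virtual network can be enabled on the workspace with the managed_virtual_network_enabled argument. A minimal sketch (all names and the password variable are placeholders):

```hcl
resource "azurerm_synapse_workspace" "example" {
  name                                 = "example"
  resource_group_name                  = azurerm_resource_group.example.name
  location                             = azurerm_resource_group.example.location
  storage_data_lake_gen2_filesystem_id = azurerm_storage_data_lake_gen2_filesystem.example.id
  sql_administrator_login              = "sqladminuser"
  sql_administrator_login_password     = var.sql_admin_password

  # Disabled by default; enable it explicitly so Spark pools run
  # inside the managed virtual network.
  managed_virtual_network_enabled = true

  identity {
    type = "SystemAssigned"
  }
}
```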
Microsoft.Synapse/workspaces/bigDataPools (Azure Resource Manager)
The workspaces/bigDataPools resource in Microsoft.Synapse can be configured in Azure Resource Manager with the resource type Microsoft.Synapse/workspaces/bigDataPools. The following sections describe how to use the resource and its parameters.
Example Usage from GitHub
{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "apiVersion": "2019-06-01-preview",
  "properties": {},
  "dependsOn": []
},
{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "apiVersion": "2019-06-01-preview",
  "properties": {
    "autoPause": {
      "enabled": true,
      "delayInMinutes": 15
    }
  }
}
{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "location": "West US 2",
  "name": "ExamplePool",
  "tags": {},
  "properties": {
    "provisioningState": "Succeeded"
  }
}
{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "location": "West US 2",
  "name": "ExamplePool",
  "tags": {},
  "properties": {
    "provisioningState": "Deleting"
  }
}
Parameters
apiVersion - required - string
location - required - string - The geo-location where the resource lives
name - required - string - Big Data pool name
properties - required
  autoPause - optional
    delayInMinutes - optional - integer - Number of minutes of idle time before the Big Data pool is automatically paused.
    enabled - optional - boolean - Whether auto-pausing is enabled for the Big Data pool.
  autoScale - optional
    enabled - optional - boolean - Whether automatic scaling is enabled for the Big Data pool.
    maxNodeCount - optional - integer - The maximum number of nodes the Big Data pool can support.
    minNodeCount - optional - integer - The minimum number of nodes the Big Data pool can support.
  cacheSize - optional - integer - The cache size
  creationDate - optional - string - The time when the Big Data pool was created.
  customLibraries - optional array
    containerName - optional - string - Storage blob container name.
    name - optional - string - Name of the library.
    path - optional - string - Storage blob path of library.
    type - optional - string - Type of the library.
  defaultSparkLogFolder - optional - string - The default folder where Spark logs will be written.
  dynamicExecutorAllocation - optional
    enabled - optional - boolean - Indicates whether Dynamic Executor Allocation is enabled or not.
  isComputeIsolationEnabled - optional - boolean - Whether compute isolation is required or not.
  libraryRequirements - optional
    content - optional - string - The library requirements.
    filename - optional - string - The filename of the library requirements file.
  nodeCount - optional - integer - The number of nodes in the Big Data pool.
  nodeSize - optional - string - The level of compute power that each node in the Big Data pool has.
  nodeSizeFamily - optional - string - The kind of nodes that the Big Data pool provides.
  provisioningState - optional - string - The state of the Big Data pool.
  sessionLevelPackagesEnabled - optional - boolean - Whether session-level packages are enabled.
  sparkConfigProperties - optional
    content - optional - string - The Spark configuration properties content.
    filename - optional - string - The filename of the Spark configuration properties file.
  sparkEventsFolder - optional - string - The Spark events folder.
  sparkVersion - optional - string - The Apache Spark version.
tags - optional - object - Resource tags.
type - required - string
Frequently asked questions
What is Azure Synapse Spark Pool?
Azure Synapse Spark Pool is a resource for Synapse of Microsoft Azure. Settings can be written in Terraform.
Where can I find the example code for the Azure Synapse Spark Pool?
For Terraform, the tschwarz01/tf-caf-data-landing-zone, PacktPublishing/Azure-Data-Architect-Handbook and infracost/infracost source code examples are useful. See the Terraform Example section for further details.
For Azure Resource Manager, the nisinha/cicd, praveenmathamsetty/bigdata and debhol/azuredocs source code examples are useful. See the Azure Resource Manager Example section for further details.