Azure Synapse Spark Pool
This page shows how to define an Azure Synapse Spark Pool in Terraform and in Azure Resource Manager templates, and how to configure it securely.
azurerm_synapse_spark_pool (Terraform)
The Spark Pool in Synapse can be configured in Terraform with the resource type azurerm_synapse_spark_pool. The following sections show 10 examples of how to use the resource and list its parameters.
Example Usage from GitHub
resource "azurerm_synapse_spark_pool" "synapseSparkPool001" {
  name                 = "SparkPool001"
  synapse_workspace_id = azurerm_synapse_workspace.synapseProduct001.id
  node_size_family     = "MemoryOptimized"
  node_size            = "Small"
  # ... (excerpt truncated in the source)
}

resource "azurerm_synapse_spark_pool" "this" {
  for_each             = local.spark
  name                 = each.value.name
  synapse_workspace_id = azurerm_synapse_workspace.ws.id
  node_size_family     = var.node_size_family
  # ... (excerpt truncated in the source)
}

resource "azurerm_synapse_spark_pool" "default" {
  name                 = "example"
  synapse_workspace_id = azurerm_synapse_workspace.example.id
  node_size_family     = "MemoryOptimized"
  node_size            = "Small"
  # ... (excerpt truncated in the source)
}

resource "azurerm_synapse_spark_pool" "spark_pool" {
  name                 = azurecaf_name.sparkpool.result
  synapse_workspace_id = var.synapse_workspace_id
  node_size_family     = var.settings.node_size_family
  node_size            = var.settings.node_size
  # ... (excerpt truncated in the source)
}

resource "azurerm_synapse_spark_pool" "spark_pool" {
  name                 = azurecaf_name.sparkpool.result
  synapse_workspace_id = var.synapse_workspace_id
  node_size_family     = var.settings.node_size_family
  node_size            = var.settings.node_size
  # ... (excerpt truncated in the source)
}

resource "azurerm_synapse_spark_pool" "spark_pool" {
  depends_on = [
    azurerm_synapse_workspace.synapse_workspace
  ]
  name                 = var.spark_pool_name
  synapse_workspace_id = azurerm_synapse_workspace.synapse_workspace.id
  # ... (excerpt truncated in the source)
}

resource "azurerm_synapse_spark_pool" "coresynsqlsparkpools" {
  for_each             = var.coreSynSqlSparkPools
  name                 = each.value["sprkPoolName"]
  synapse_workspace_id = each.value["sprkSynWrkSpcId"]
  node_size_family     = each.value["sprkPoolNodeSizeFamily"]
  node_size            = each.value["sprkPoolNodeSize"]
  # ... (excerpt truncated in the source)
}

resource "azurerm_synapse_spark_pool" "spark_pool" {
  name                 = azurecaf_name.sparkpool.result
  synapse_workspace_id = var.synapse_workspace_id
  node_size_family     = var.settings.node_size_family
  node_size            = var.settings.node_size
  node_count           = try(var.settings.node_count, null)
  # ... (excerpt truncated in the source)
}

resource "azurerm_synapse_spark_pool" "spark_pool" {
  name                 = azurecaf_name.sparkpool.result
  synapse_workspace_id = var.synapse_workspace_id
  node_size_family     = var.settings.node_size_family
  node_size            = var.settings.node_size
  node_count           = try(var.settings.node_count, null)
  # ... (excerpt truncated in the source)
}

resource "azurerm_synapse_spark_pool" "spark_pool" {
  name                 = azurecaf_name.sparkpool.result
  synapse_workspace_id = var.synapse_workspace_id
  node_size_family     = var.settings.node_size_family
  node_size            = var.settings.node_size
  node_count           = try(var.settings.node_count, null)
  # ... (excerpt truncated in the source)
}
Parameters
id: optional, computed - string
name: required - string
node_count: optional - number
node_size: required - string
node_size_family: required - string
spark_events_folder: optional - string
spark_log_folder: optional - string
spark_version: optional - string
synapse_workspace_id: required - string
tags: optional - map from string to string
auto_pause: list block
  delay_in_minutes: required - number
auto_scale: list block
  max_node_count: required - number
  min_node_count: required - number
library_requirement: list block
timeouts: single block
Explanation in Terraform Registry
Manages a Synapse Spark Pool.
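The parameters listed above can be combined as in the following minimal sketch. It assumes a workspace already declared as azurerm_synapse_workspace.example (a hypothetical name), and the pool name, node counts, idle delay, and Spark version are illustrative values only. Which spark_version values are accepted depends on the azurerm provider version in use. Because auto_scale is set here, node_count is left out.

```hcl
# Minimal sketch: a small memory-optimized pool that scales between
# 3 and 6 nodes and pauses after 15 idle minutes.
# Assumption: azurerm_synapse_workspace.example is defined elsewhere.
resource "azurerm_synapse_spark_pool" "example" {
  name                 = "examplepool" # illustrative name
  synapse_workspace_id = azurerm_synapse_workspace.example.id
  node_size_family     = "MemoryOptimized"
  node_size            = "Small"
  spark_version        = "3.4" # assumption: supported by your provider version

  # Use auto_scale instead of a fixed node_count.
  auto_scale {
    min_node_count = 3
    max_node_count = 6
  }

  # Pause the pool when idle to avoid paying for unused compute.
  auto_pause {
    delay_in_minutes = 15
  }

  tags = {
    environment = "example"
  }
}
```

Setting auto_pause is usually worth doing for development pools, since an idle pool that never pauses keeps consuming compute.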
Tips: Best Practices for The Other Azure Synapse Resources
In addition to azurerm_synapse_spark_pool, Azure Synapse has other resources that should be configured with security in mind. The following are examples of those resources and the precautions to take.
azurerm_synapse_workspace
Enable the managed virtual network
The managed virtual network is disabled by default, so it is better to enable it explicitly.
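As a rough illustration of this tip, the sketch below sets managed_virtual_network_enabled on the workspace. The resource group, Data Lake filesystem, workspace name, and administrator login are hypothetical placeholders, and the password is passed in through a variable rather than hard-coded. The provider documentation describes this argument as one that forces the workspace to be recreated when changed, so it is best decided before the workspace is first deployed.

```hcl
# Assumption: the password is supplied from outside (for example a
# secret store), never committed to source control.
variable "sql_administrator_login_password" {
  type      = string
  sensitive = true
}

# Assumption: azurerm_resource_group.example and
# azurerm_storage_data_lake_gen2_filesystem.example are defined elsewhere.
resource "azurerm_synapse_workspace" "example" {
  name                                 = "example-synapse-ws" # illustrative name
  resource_group_name                  = azurerm_resource_group.example.name
  location                             = azurerm_resource_group.example.location
  storage_data_lake_gen2_filesystem_id = azurerm_storage_data_lake_gen2_filesystem.example.id
  sql_administrator_login              = "sqladminuser"
  sql_administrator_login_password     = var.sql_administrator_login_password

  # Disabled by default; enable it so workspace compute runs inside
  # a managed virtual network.
  managed_virtual_network_enabled = true

  identity {
    type = "SystemAssigned"
  }
}
```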
Microsoft.Synapse/workspaces/bigDataPools (Azure Resource Manager)
Big Data pools (workspaces/bigDataPools) in Microsoft.Synapse can be configured in Azure Resource Manager with the resource type Microsoft.Synapse/workspaces/bigDataPools. The following sections describe how to use the resource and its parameters.
Example Usage from GitHub
{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "apiVersion": "2019-06-01-preview",
  "properties": {},
  "dependsOn": []
}

{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "apiVersion": "2019-06-01-preview",
  "properties": {
    "autoPause": {
      "enabled": true,
      "delayInMinutes": 15
    },
    ...
  }
}

{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "location": "West US 2",
  "name": "ExamplePool",
  "tags": {},
  "properties": {
    "provisioningState": "Succeeded",
    ...
  }
}

{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "location": "West US 2",
  "name": "ExamplePool",
  "tags": {},
  "properties": {
    "provisioningState": "Deleting",
    ...
  }
}

{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "location": "West US 2",
  "name": "ExamplePool",
  "tags": {},
  "properties": {
    "provisioningState": "Succeeded",
    ...
  }
}

{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "location": "West US 2",
  "name": "ExamplePool",
  "tags": {},
  "properties": {
    "provisioningState": "Succeeded",
    ...
  }
}

{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "location": "West US 2",
  "name": "ExamplePool",
  "tags": {},
  "properties": {
    "provisioningState": "Succeeded",
    ...
  }
}

{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "location": "West US 2",
  "name": "ExamplePool",
  "tags": {},
  "properties": {
    "provisioningState": "Succeeded",
    ...
  }
}

{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "location": "West US 2",
  "name": "ExamplePool",
  "tags": {},
  "properties": {
    "provisioningState": "Deleting",
    ...
  }
}

{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "location": "West US 2",
  "name": "ExamplePool",
  "tags": {},
  "properties": {
    "provisioningState": "Deleting",
    ...
  }
}
Parameters
apiVersion: required - string
location: required - string. The geo-location where the resource lives.
name: required - string. Big Data pool name.
properties: required
  autoPause: optional
    delayInMinutes: optional - integer. Number of minutes of idle time before the Big Data pool is automatically paused.
    enabled: optional - boolean. Whether auto-pausing is enabled for the Big Data pool.
  autoScale: optional
    enabled: optional - boolean. Whether automatic scaling is enabled for the Big Data pool.
    maxNodeCount: optional - integer. The maximum number of nodes the Big Data pool can support.
    minNodeCount: optional - integer. The minimum number of nodes the Big Data pool can support.
  cacheSize: optional - integer. The cache size.
  creationDate: optional - string. The time when the Big Data pool was created.
  customLibraries: optional array
    containerName: optional - string. Storage blob container name.
    name: optional - string. Name of the library.
    path: optional - string. Storage blob path of library.
    type: optional - string. Type of the library.
  defaultSparkLogFolder: optional - string. The default folder where Spark logs will be written.
  dynamicExecutorAllocation: optional
    enabled: optional - boolean. Indicates whether Dynamic Executor Allocation is enabled or not.
  isComputeIsolationEnabled: optional - boolean. Whether compute isolation is required or not.
  libraryRequirements: optional
    content: optional - string. The library requirements.
    filename: optional - string. The filename of the library requirements file.
  nodeCount: optional - integer. The number of nodes in the Big Data pool.
  nodeSize: optional - string. The level of compute power that each node in the Big Data pool has.
  nodeSizeFamily: optional - string. The kind of nodes that the Big Data pool provides.
  provisioningState: optional - string. The state of the Big Data pool.
  sessionLevelPackagesEnabled: optional - boolean. Whether session level packages are enabled.
  sparkConfigProperties: optional
    content: optional - string. The library requirements.
    filename: optional - string. The filename of the library requirements file.
  sparkEventsFolder: optional - string. The Spark events folder.
  sparkVersion: optional - string. The Apache Spark version.
tags: optional - string. Resource tags.
type: required - string
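To show how the ARM properties above fit together, the following sketch builds a template with a single bigDataPools resource. To stay in the same language as the Terraform examples on this page, it wraps the template in Terraform's azurerm_resource_group_template_deployment resource and builds the JSON with jsonencode; this is one possible way to deploy such a template, not the only one. The variable names, the deployment name, the apiVersion "2021-06-01", and all property values are assumptions chosen for illustration.

```hcl
# Hypothetical inputs for the sketch.
variable "resource_group_name" {
  type = string
}

variable "workspace_name" {
  type = string
}

variable "location" {
  type = string
}

resource "azurerm_resource_group_template_deployment" "spark_pool" {
  name                = "spark-pool-deployment" # illustrative name
  resource_group_name = var.resource_group_name
  deployment_mode     = "Incremental"

  # jsonencode turns this HCL object into the ARM template JSON.
  template_content = jsonencode({
    "$schema"      = "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#"
    contentVersion = "1.0.0.0"
    resources = [
      {
        type       = "Microsoft.Synapse/workspaces/bigDataPools"
        apiVersion = "2021-06-01" # assumption: adjust to the version you target
        # Child resources are named "<workspace>/<pool>".
        name     = "${var.workspace_name}/ExamplePool"
        location = var.location
        tags     = {}
        properties = {
          nodeSizeFamily = "MemoryOptimized"
          nodeSize       = "Small"
          sparkVersion   = "3.4" # assumption
          autoScale = {
            enabled      = true
            minNodeCount = 3
            maxNodeCount = 6
          }
          autoPause = {
            enabled        = true
            delayInMinutes = 15
          }
        }
      }
    ]
  })
}
```

Read-only properties such as provisioningState and creationDate are reported by Azure and are not set in the template.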
Frequently asked questions
What is Azure Synapse Spark Pool?
Azure Synapse Spark Pool is a resource of Azure Synapse in Microsoft Azure. Its settings can be written in Terraform or Azure Resource Manager.
Where can I find the example code for the Azure Synapse Spark Pool?
For Terraform, the tschwarz01/tf-caf-data-landing-zone, PacktPublishing/Azure-Data-Architect-Handbook and infracost/infracost source code examples are useful. See the Terraform Example section for further details.
For Azure Resource Manager, the nisinha/cicd, praveenmathamsetty/bigdata and debhol/azuredocs source code examples are useful. See the Azure Resource Manager Example section for further details.