Azure Synapse Spark Pool

This page shows how to write Terraform and Azure Resource Manager templates for a Synapse Spark Pool, and how to configure them securely.

azurerm_synapse_spark_pool (Terraform)

The Spark Pool in Synapse can be configured in Terraform with the resource name azurerm_synapse_spark_pool. The following sections describe 7 examples of how to use the resource and its parameters.

Example Usage from GitHub

synapse_spark_pool.tf#L1
resource "azurerm_synapse_spark_pool" "synapseSparkPool001" {
  name                 = "SparkPool001"
  synapse_workspace_id = azurerm_synapse_workspace.synapseProduct001.id
  node_size_family     = "MemoryOptimized"
  node_size            = "Small"

database_pools.tf#L17
resource "azurerm_synapse_spark_pool" "this" {
  for_each             = local.spark
  name                 = each.value.name
  synapse_workspace_id = azurerm_synapse_workspace.ws.id

  node_size_family     = var.node_size_family
synapse_spark_pool_test.tf#L46
resource "azurerm_synapse_spark_pool" "default" {
  name                 = "example"
  synapse_workspace_id = azurerm_synapse_workspace.example.id
  node_size_family     = "MemoryOptimized"
  node_size            = "Small"

spark_pool.tf#L11
resource "azurerm_synapse_spark_pool" "spark_pool" {
  name                 = azurecaf_name.sparkpool.result
  synapse_workspace_id = var.synapse_workspace_id
  node_size_family     = var.settings.node_size_family
  node_size            = var.settings.node_size

ApacheSparkPool.tf#L3
resource "azurerm_synapse_spark_pool" "spark_pool" {
  depends_on = [
    azurerm_synapse_workspace.synapse_workspace
  ]
  name                 = var.spark_pool_name
  synapse_workspace_id = azurerm_synapse_workspace.synapse_workspace.id
main.tf#L42
resource "azurerm_synapse_spark_pool" "coresynsqlsparkpools" {
  for_each             = var.coreSynSqlSparkPools
  name                 = each.value["sprkPoolName"]
  synapse_workspace_id = each.value["sprkSynWrkSpcId"]
  node_size_family     = each.value["sprkPoolNodeSizeFamily"]
  node_size            = each.value["sprkPoolNodeSize"]
spark_pool.tf#L14
resource "azurerm_synapse_spark_pool" "spark_pool" {
  name                 = azurecaf_name.sparkpool.result
  synapse_workspace_id = var.synapse_workspace_id
  node_size_family     = var.settings.node_size_family
  node_size            = var.settings.node_size
  node_count           = try(var.settings.node_count, null)

Review your Terraform file for Azure best practices

Shisho Cloud, our free checker that makes sure your Terraform configuration follows best practices, is available (beta).

Parameters

Explanation in Terraform Registry

Manages a Synapse Spark Pool.
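
For reference, a minimal self-contained configuration combining the arguments seen in the excerpts above might look like the following sketch. The workspace reference, pool name, and sizing values are placeholders rather than values taken from any of the repositories above; verify the argument names against the azurerm provider version you use.

# Sketch only: the "example" names and sizing values are placeholders.
resource "azurerm_synapse_spark_pool" "example" {
  name                 = "examplesparkpool"
  synapse_workspace_id = azurerm_synapse_workspace.example.id
  node_size_family     = "MemoryOptimized"
  node_size            = "Small"
  spark_version        = "3.4" # required by recent provider versions; pick a runtime your region supports

  # Scale between 3 and 10 nodes instead of pinning a fixed node_count.
  auto_scale {
    min_node_count = 3
    max_node_count = 10
  }

  # Pause the pool after 15 minutes of inactivity to save cost.
  auto_pause {
    delay_in_minutes = 15
  }

  tags = {
    environment = "example"
  }
}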

Tips: Best Practices for Other Azure Synapse Resources

In addition to azurerm_synapse_workspace, Azure Synapse has other resources that should be configured for security reasons. Please check the following examples of those resources and precautions.

azurerm_synapse_workspace

Ensure that the managed virtual network is enabled

It is better to enable the managed virtual network, which is disabled by default.
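
A minimal sketch of a workspace with the managed virtual network turned on is shown below. The resource group, Data Lake filesystem reference, and credentials are placeholders, and the remaining required arguments follow the azurerm provider documentation for your provider version.

# Sketch only: names, the filesystem reference, and credentials are placeholders.
resource "azurerm_synapse_workspace" "example" {
  name                                 = "example-synapse-ws"
  resource_group_name                  = azurerm_resource_group.example.name
  location                             = azurerm_resource_group.example.location
  storage_data_lake_gen2_filesystem_id = azurerm_storage_data_lake_gen2_filesystem.example.id
  sql_administrator_login              = "sqladminuser"
  sql_administrator_login_password     = var.sql_admin_password

  # Disabled by default; enable it so that workspace traffic stays inside a managed VNet.
  managed_virtual_network_enabled = true

  identity {
    type = "SystemAssigned"
  }
}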

Review your Azure Synapse settings

In addition to the above, there are other security points you should be aware of. Make sure that your .tf files are protected with Shisho Cloud.

Microsoft.Synapse/workspaces/bigDataPools (Azure Resource Manager)

The workspaces/bigDataPools resource in Microsoft.Synapse can be configured in Azure Resource Manager with the resource type Microsoft.Synapse/workspaces/bigDataPools. The following sections describe how to use the resource and its parameters.

Example Usage from GitHub

ARMTemplateForWorkspace.json#L177
            "type": "Microsoft.Synapse/workspaces/bigDataPools",
            "apiVersion": "2019-06-01-preview",
            "properties": {},
            "dependsOn": []
        },
        {
ARMTemplateForWorkspace.json#L177
            "type": "Microsoft.Synapse/workspaces/bigDataPools",
            "apiVersion": "2019-06-01-preview",
            "properties": {
                "autoPause": {
                    "enabled": true,
                    "delayInMinutes": 15
ListBigDataPoolsInWorkspace.json#L14
            "type": "Microsoft.Synapse/workspaces/bigDataPools",
            "location": "West US 2",
            "name": "ExamplePool",
            "tags": {},
            "properties": {
              "provisioningState": "Succeeded",
DeleteBigDataPool.json#L14
        "type": "Microsoft.Synapse/workspaces/bigDataPools",
        "location": "West US 2",
        "name": "ExamplePool",
        "tags": {},
        "properties": {
          "provisioningState": "Deleting",

Parameters

  • apiVersion required - string
  • location required - string

    The geo-location where the resource lives

  • name required - string

    Big Data pool name

  • properties required
      • autoPause optional
          • delayInMinutes optional - integer

            Number of minutes of idle time before the Big Data pool is automatically paused.

          • enabled optional - boolean

            Whether auto-pausing is enabled for the Big Data pool.

      • autoScale optional
          • enabled optional - boolean

            Whether automatic scaling is enabled for the Big Data pool.

          • maxNodeCount optional - integer

            The maximum number of nodes the Big Data pool can support.

          • minNodeCount optional - integer

            The minimum number of nodes the Big Data pool can support.

      • cacheSize optional - integer

        The cache size

      • creationDate optional - string

        The time when the Big Data pool was created.

      • customLibraries optional array
          • containerName optional - string

            Storage blob container name.

          • name optional - string

            Name of the library.

          • path optional - string

            Storage blob path of library.

          • type optional - string

            Type of the library.

      • defaultSparkLogFolder optional - string

        The default folder where Spark logs will be written.

      • dynamicExecutorAllocation optional
          • enabled optional - boolean

            Indicates whether Dynamic Executor Allocation is enabled or not.

      • isComputeIsolationEnabled optional - boolean

        Whether compute isolation is required or not.

      • libraryRequirements optional
          • content optional - string

            The library requirements.

          • filename optional - string

            The filename of the library requirements file.

      • nodeCount optional - integer

        The number of nodes in the Big Data pool.

      • nodeSize optional - string

        The level of compute power that each node in the Big Data pool has.

      • nodeSizeFamily optional - string

        The kind of nodes that the Big Data pool provides.

      • provisioningState optional - string

        The state of the Big Data pool.

      • sessionLevelPackagesEnabled optional - boolean

        Whether session-level packages are enabled.

      • sparkConfigProperties optional
          • content optional - string

            The Spark configuration properties.

          • filename optional - string

            The filename of the Spark configuration properties file.

      • sparkEventsFolder optional - string

        The Spark events folder

      • sparkVersion optional - string

        The Apache Spark version.

  • tags optional - string

    Resource tags.

  • type required - string
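
To see how these properties fit together, here is a hedged sketch of a bigDataPools resource definition. The workspace and pool names, the apiVersion, and the sizing values are illustrative placeholders; check the API version and property values against the Microsoft.Synapse schema you target.

{
  "type": "Microsoft.Synapse/workspaces/bigDataPools",
  "apiVersion": "2021-06-01",
  "name": "exampleworkspace/ExamplePool",
  "location": "West US 2",
  "comments": "Sketch only: names, apiVersion, and sizing values are placeholders.",
  "tags": {},
  "properties": {
    "nodeSizeFamily": "MemoryOptimized",
    "nodeSize": "Small",
    "sparkVersion": "3.4",
    "autoScale": {
      "enabled": true,
      "minNodeCount": 3,
      "maxNodeCount": 10
    },
    "autoPause": {
      "enabled": true,
      "delayInMinutes": 15
    }
  },
  "dependsOn": [
    "[resourceId('Microsoft.Synapse/workspaces', 'exampleworkspace')]"
  ]
}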

Frequently asked questions

What is Azure Synapse Spark Pool?

Azure Synapse Spark Pool is a resource for Synapse of Microsoft Azure. Settings can be written in Terraform.

Where can I find the example code for the Azure Synapse Spark Pool?

For Terraform, the tschwarz01/tf-caf-data-landing-zone, PacktPublishing/Azure-Data-Architect-Handbook and infracost/infracost source code examples are useful. See the Terraform Example section for further details.

For Azure Resource Manager, the nisinha/cicd, praveenmathamsetty/bigdata and debhol/azuredocs source code examples are useful. See the Azure Resource Manager Example section for further details.