Google Cloud (Stackdriver) Monitoring Slo

This page shows how to write Terraform configurations for Cloud (Stackdriver) Monitoring Slo and how to write them securely.

google_monitoring_slo (Terraform)

An Slo in Cloud (Stackdriver) Monitoring can be configured in Terraform with the resource type google_monitoring_slo. The following sections show examples from GitHub of how to use the resource, followed by a description of its parameters.

Example Usage from GitHub

custom_service.tf#L12
resource "google_monitoring_slo" "window_based_slo" {
  service      = google_monitoring_custom_service.terraform-service.service_id
  slo_id       = "terraform-slo"
  display_name = "99% of 10-min windows in rolling day have mean latency under 8s"

  goal = 0.99

main.tf#L26
resource "google_monitoring_slo" "home_page_available" {
  project      = var.project_id
  service      = data.google_monitoring_cluster_istio_service.frontend_external.service_id
  slo_id       = "availability-slo"
  display_name = "Home Page Available"

alerts.tf#L191
resource "google_monitoring_slo" "slo_lbbackend" {
  service      = google_monitoring_custom_service.service_slo.service_id
  display_name = "Standard SLO LB Backend latency"

  goal                = 0.8
  rolling_period_days = 20

Parameters

  • calendar_period optional - string

A calendar period, semantically "since the start of the current <calendarPeriod>". Possible values: ["DAY", "WEEK", "FORTNIGHT", "MONTH"]

  • display_name optional - string

Name used for UI elements listing this SLO.

  • goal required - number

The fraction of service that must be good in order for this objective to be met. 0 < goal <= 0.999

  • id optional computed - string
  • name optional computed - string

The full resource name for this SLO. The syntax is: projects/[PROJECT_ID_OR_NUMBER]/services/[SERVICE_ID]/serviceLevelObjectives/[SLO_NAME]

  • rolling_period_days optional - number

A rolling time period, semantically "in the past X days". Must be between 1 and 30 days, inclusive.

  • service required - string

ID of the service to which this SLO belongs.

  • slo_id optional computed - string

The id to use for this ServiceLevelObjective. If omitted, an id will be generated instead.

  • basic_sli list block
    • location optional - set of string

    An optional set of locations to which this SLI is relevant. Telemetry from other locations will not be used to calculate performance for this SLI. If omitted, this SLI applies to all locations in which the Service has activity. For service types that don't support breaking down by location, setting this field will result in an error.

    • method optional - set of string

    An optional set of RPCs to which this SLI is relevant. Telemetry from other methods will not be used to calculate performance for this SLI. If omitted, this SLI applies to all the Service's methods. For service types that don't support breaking down by method, setting this field will result in an error.

    • version optional - set of string

    The set of API versions to which this SLI is relevant. Telemetry from other API versions will not be used to calculate performance for this SLI. If omitted, this SLI applies to all API versions. For service types that don't support breaking down by version, setting this field will result in an error.

    • availability list block
      • enabled optional - bool

      Whether an availability SLI is enabled or not. Must be set to 'true'. Defaults to 'true'.

    • latency list block
      • threshold required - string

      A duration string, e.g. 10s. Good service is defined to be the count of requests made to this service that return in no more than threshold.

  • request_based_sli list block
    • distribution_cut list block
      • distribution_filter required - string

      A TimeSeries monitoring filter aggregating values to quantify the good service provided. Must have ValueType = DISTRIBUTION and MetricKind = DELTA or MetricKind = CUMULATIVE.

      • range list block
        • max optional - number

        Max value for the range (inclusive). If not given, will be set to "infinity", defining an open range ">= range.min"

        • min optional - number

        Min value for the range (inclusive). If not given, will be set to "-infinity", defining an open range "< range.max"

    • good_total_ratio list block
      • bad_service_filter optional - string

      A TimeSeries monitoring filter quantifying bad service provided, either demanded service that was not provided or demanded service that was of inadequate quality. Must have ValueType = DOUBLE or ValueType = INT64 and must have MetricKind = DELTA or MetricKind = CUMULATIVE. Exactly two of 'good_service_filter', 'bad_service_filter', 'total_service_filter' must be set (good + bad = total is assumed).

      • good_service_filter optional - string

      A TimeSeries monitoring filter quantifying good service provided. Must have ValueType = DOUBLE or ValueType = INT64 and must have MetricKind = DELTA or MetricKind = CUMULATIVE. Exactly two of 'good_service_filter', 'bad_service_filter', 'total_service_filter' must be set (good + bad = total is assumed).

      • total_service_filter optional - string

      A TimeSeries monitoring filter quantifying total demanded service. Must have ValueType = DOUBLE or ValueType = INT64 and must have MetricKind = DELTA or MetricKind = CUMULATIVE. Exactly two of 'good_service_filter', 'bad_service_filter', 'total_service_filter' must be set (good + bad = total is assumed).

  • timeouts single block
  • windows_based_sli list block
    • good_bad_metric_filter optional - string

    A TimeSeries monitoring filter with ValueType = BOOL. The window is good if any true values appear in the window. One of 'good_bad_metric_filter', 'good_total_ratio_threshold', 'metric_mean_in_range', 'metric_sum_in_range' must be set for 'windows_based_sli'.

    • window_period optional - string

    Duration over which window quality is evaluated, given as a duration string "[X]s" representing X seconds. Must be an integer fraction of a day and at least 60s.

    • good_total_ratio_threshold list block
      • threshold optional - number

      If window performance >= threshold, the window is counted as good.

      • basic_sli_performance list block
        • location optional - set of string

        An optional set of locations to which this SLI is relevant. Telemetry from other locations will not be used to calculate performance for this SLI. If omitted, this SLI applies to all locations in which the Service has activity. For service types that don't support breaking down by location, setting this field will result in an error.

        • method optional - set of string

        An optional set of RPCs to which this SLI is relevant. Telemetry from other methods will not be used to calculate performance for this SLI. If omitted, this SLI applies to all the Service's methods. For service types that don't support breaking down by method, setting this field will result in an error.

        • version optional - set of string

        The set of API versions to which this SLI is relevant. Telemetry from other API versions will not be used to calculate performance for this SLI. If omitted, this SLI applies to all API versions. For service types that don't support breaking down by version, setting this field will result in an error.

        • availability list block
          • enabled optional - bool

          Whether an availability SLI is enabled or not. Must be set to 'true'. Defaults to 'true'.

        • latency list block
          • threshold required - string

          A duration string, e.g. 10s. Good service is defined to be the count of requests made to this service that return in no more than threshold.

      • performance list block
        • distribution_cut list block
          • distribution_filter required - string

          A TimeSeries monitoring filter aggregating values to quantify the good service provided. Must have ValueType = DISTRIBUTION and MetricKind = DELTA or MetricKind = CUMULATIVE.

          • range list block
            • max optional - number

            Max value for the range (inclusive). If not given, will be set to "infinity", defining an open range ">= range.min"

            • min optional - number

            Min value for the range (inclusive). If not given, will be set to "-infinity", defining an open range "< range.max"

        • good_total_ratio list block
          • bad_service_filter optional - string

          A TimeSeries monitoring filter quantifying bad service provided, either demanded service that was not provided or demanded service that was of inadequate quality. Exactly two of good, bad, or total service filter must be defined (where good + bad = total is assumed). Must have ValueType = DOUBLE or ValueType = INT64 and must have MetricKind = DELTA or MetricKind = CUMULATIVE.

          • good_service_filter optional - string

          A TimeSeries monitoring filter quantifying good service provided. Exactly two of good, bad, or total service filter must be defined (where good + bad = total is assumed). Must have ValueType = DOUBLE or ValueType = INT64 and must have MetricKind = DELTA or MetricKind = CUMULATIVE.

          • total_service_filter optional - string

          A TimeSeries monitoring filter quantifying total demanded service. Exactly two of good, bad, or total service filter must be defined (where good + bad = total is assumed). Must have ValueType = DOUBLE or ValueType = INT64 and must have MetricKind = DELTA or MetricKind = CUMULATIVE.

    • metric_mean_in_range list block
      • time_series required - string

      A monitoring filter specifying the TimeSeries to use for evaluating window quality. The provided TimeSeries must have ValueType = INT64 or ValueType = DOUBLE and MetricKind = GAUGE. Mean value 'X' should satisfy 'range.min <= X < range.max' under good service.

      • range list block
        • max optional - number

        Max value for the range (inclusive). If not given, will be set to "infinity", defining an open range ">= range.min"

        • min optional - number

        Min value for the range (inclusive). If not given, will be set to "-infinity", defining an open range "< range.max"

    • metric_sum_in_range list block
      • time_series required - string

      A monitoring filter specifying the TimeSeries to use for evaluating window quality. The provided TimeSeries must have ValueType = INT64 or ValueType = DOUBLE and MetricKind = GAUGE. Summed value 'X' should satisfy 'range.min <= X < range.max' for a good window.

      • range list block
        • max optional - number

        Max value for the range (inclusive). If not given, will be set to "infinity", defining an open range ">= range.min"

        • min optional - number

        Min value for the range (inclusive). If not given, will be set to "-infinity", defining an open range "< range.max"
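
As a rough illustration of how the nested blocks in the parameter list fit together, the sketch below defines one request-based SLO and one window-based SLO. It is not taken from the GitHub examples on this page: the custom service, the resource names, the slo_id values and the custom.googleapis.com/example/latency_seconds metric are hypothetical, and the request_count filters are only one plausible choice of metric. The provider appears to expect a single SLI block (basic_sli, request_based_sli or windows_based_sli) and a single period (calendar_period or rolling_period_days) per SLO, so treat this as a starting point and check it with terraform validate and terraform plan.

```hcl
# Hypothetical custom service that both SLOs attach to.
resource "google_monitoring_custom_service" "example" {
  service_id   = "example-service"
  display_name = "Example Service"
}

# Request-based SLI using good_total_ratio.
# Exactly two of the good/bad/total filters are set; the third is derived
# from good + bad = total.
resource "google_monitoring_slo" "request_ratio" {
  service      = google_monitoring_custom_service.example.service_id
  slo_id       = "example-request-ratio-slo"
  display_name = "90% of requests are good over a rolling 30 days"

  goal                = 0.9
  rolling_period_days = 30

  request_based_sli {
    good_total_ratio {
      # Illustrative filters; replace with metrics that exist in your project.
      good_service_filter = join(" AND ", [
        "metric.type=\"serviceruntime.googleapis.com/api/request_count\"",
        "resource.type=\"consumed_api\"",
        "metric.label.\"response_code_class\"=\"2xx\"",
      ])
      total_service_filter = join(" AND ", [
        "metric.type=\"serviceruntime.googleapis.com/api/request_count\"",
        "resource.type=\"consumed_api\"",
      ])
    }
  }
}

# Window-based SLI using metric_mean_in_range.
# A 600s window divides a day evenly (86400 / 600 = 144) and is above the
# 60s minimum.
resource "google_monitoring_slo" "window_mean_latency" {
  service      = google_monitoring_custom_service.example.service_id
  slo_id       = "example-window-latency-slo"
  display_name = "99% of 10-min windows in rolling day have mean latency under 8s"

  goal                = 0.99
  rolling_period_days = 1

  windows_based_sli {
    window_period = "600s"

    metric_mean_in_range {
      # Hypothetical GAUGE metric reporting latency in seconds.
      time_series = join(" AND ", [
        "metric.type=\"custom.googleapis.com/example/latency_seconds\"",
        "resource.type=\"global\"",
      ])

      # Only max is set, so min defaults to "-infinity".
      range {
        max = 8
      }
    }
  }
}
```

Building the filters with join(" AND ", [...]) keeps each clause on its own line, which tends to be easier to review than one long escaped string.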

Explanation in Terraform Registry

A Service-Level Objective (SLO) describes the level of desired good service. It consists of a service-level indicator (SLI), a performance goal, and a period over which the objective is to be evaluated against that goal. The SLO can use SLIs defined in a number of different manners. Typical SLOs might include "99% of requests in each rolling week have latency below 200 milliseconds" or "99.5% of requests in each calendar month return successfully."
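
The two typical SLOs quoted above could be expressed roughly as follows. This is a minimal sketch under stated assumptions: it uses an App Engine service looked up through the google_monitoring_app_engine_service data source with a hypothetical module_id of "default", and the resource names and slo_id values are invented for illustration. Note that the latency SLI counts requests that return in no more than the threshold, which is a close but not exact match for "below 200 milliseconds".

```hcl
# Hypothetical App Engine service; basic_sli only works for service types
# that Cloud Monitoring can derive an SLI for automatically.
data "google_monitoring_app_engine_service" "default" {
  module_id = "default"
}

# "99% of requests in each rolling week have latency below 200 milliseconds"
resource "google_monitoring_slo" "latency_weekly" {
  service      = data.google_monitoring_app_engine_service.default.service_id
  slo_id       = "latency-weekly-slo"
  display_name = "99% of requests within 200 ms, rolling week"

  goal                = 0.99 # performance goal
  rolling_period_days = 7    # evaluation period: the past 7 days

  basic_sli {
    latency {
      threshold = "0.2s" # duration string, 200 milliseconds
    }
  }
}

# "99.5% of requests in each calendar month return successfully"
resource "google_monitoring_slo" "availability_monthly" {
  service      = data.google_monitoring_app_engine_service.default.service_id
  slo_id       = "availability-monthly-slo"
  display_name = "99.5% of requests succeed, calendar month"

  goal            = 0.995   # performance goal
  calendar_period = "MONTH" # evaluation period: since the start of the month

  basic_sli {
    availability {
      enabled = true
    }
  }
}
```

Each resource carries the three parts named in the explanation: an SLI block, a goal, and a period (rolling_period_days in the first, calendar_period in the second).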

Frequently asked questions

What is Google Cloud (Stackdriver) Monitoring Slo?

Google Cloud (Stackdriver) Monitoring Slo is a resource for Cloud (Stackdriver) Monitoring of Google Cloud Platform. Its settings can be written in Terraform.

Where can I find the example code for the Google Cloud (Stackdriver) Monitoring Slo?

For Terraform, the source code examples in yuriatgoogle/stack-doctor, fawix/boutique-demo-slos-example and tpayne/terraform-examples are useful. See the Example Usage from GitHub section for further details.
