info

The English user guide is currently in beta preview. Most of the documents have been automatically translated from the Japanese version. Should you find any inaccuracies, please reach out to Flatt Security.

Advanced: Workflow Execution Optimization

A single workflow usually contains multiple jobs, and by default each job inspects all audit target data. To make inspections faster, we have added the following capabilities:

  • Running only selected jobs out of the jobs in a workflow
  • Fetching only part of the audit target data within a workflow job and running the inspection code on it

This document mainly covers the second capability, optimizing data acquisition. Note that the term "audit target resources" used throughout this document is synonymous with "audit target data". For the definition of resources, please refer to this page.

info

For details on how to re-run inspections, see this page.

warning

Currently, this feature is only available for a limited number of resources. We plan to expand support in the future.

Optimizing Data Acquisition from Data Sources

The decide block and notify block within each job in a workflow contain GraphQL query definitions. By default, all resources that match the query are retrieved and passed to the policy code. For example, the following query gets all Compute Engine instances within a Google Cloud project.

{
  googleCloud {
    projects {
      computeEngine {
        instances {
          metadata {
            id
            displayName
          }
          shieldedInstanceConfiguration {
            enableSecureBoot
          }
        }
      }
    }
  }
}

When executing a GraphQL query, this feature consults the directives attached to fields in the GraphQL schema. If a field meets both of the following conditions and target resource IDs are specified, only the matching resources are retrieved for that field.

  1. The @canBePartial directive is added.
  2. The @resource directive indicates the resource kind.

type GoogleCloudComputeEngine {
  ...

  """
  All Google Cloud Compute Engine instances
  """
  instances(
    condition: GoogleCloudComputeEngineCondition
  ): [GoogleCloudComputeEngineInstance!]!
    @resource(kind: "googlecloud-ce-instance")
    @canBePartial

  ...
}

info

For details on GraphQL directives defined in Shisho Cloud, see this page.

Let's take a closer look at how target resources are specified by resource ID.

As mentioned in "Re-running Inspections", this feature can be triggered from the Shisho Cloud console. Internally, a list of resource IDs like the following is passed along with the target job ID when the workflow is executed.

  • googlecloud-ce-instance|489621111111|asia-northeast1-b|5121584011111111111
  • googlecloud-ce-instance|489621111111|asia-northeast1-b|5121584011111111112
  • user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDV

The prefix of a Shisho Cloud resource ID indicates the resource kind. Based on the received resource IDs, Shisho Cloud determines which fields in the GraphQL query this feature applies to by checking the following for each field:

  1. Whether the @canBePartial directive is added
  2. Whether the @resource directive indicates the resource kind
  3. Whether the prefix of the target resource ID matches the @resource directive

If all of the above conditions are met, a unique identifier is passed to the data source as a search condition to narrow down the target resource, and only matching resources are retrieved.
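
The three checks above can be sketched in Python as follows. This is only an illustration, not Shisho Cloud's actual implementation: the `FieldDirectives` type and the `kind_of` and `ids_to_narrow` functions are hypothetical names, and the sketch assumes that the resource kind is the text before the first `|` in a resource ID.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FieldDirectives:
    """Hypothetical summary of the schema directives attached to one field."""
    can_be_partial: bool           # True if @canBePartial is present
    resource_kind: Optional[str]   # kind given by @resource, or None if absent


def kind_of(resource_id: str) -> str:
    """Return the resource kind, assumed to be the text before the first '|'."""
    return resource_id.split("|", 1)[0]


def ids_to_narrow(field: FieldDirectives, target_ids: List[str]) -> List[str]:
    """Return the target IDs that narrow this field; an empty list means 'retrieve all'."""
    # Checks 1 and 2: both directives must be present on the field.
    if not field.can_be_partial or field.resource_kind is None:
        return []
    # Check 3: only IDs whose prefix matches the @resource kind apply.
    return [rid for rid in target_ids if kind_of(rid) == field.resource_kind]


instances = FieldDirectives(can_be_partial=True, resource_kind="googlecloud-ce-instance")
targets = [
    "googlecloud-ce-instance|489621111111|asia-northeast1-b|5121584011111111111",
    "user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDV",
]
print(ids_to_narrow(instances, targets))
```

In this sketch, only the Compute Engine instance ID is kept for the `instances` field, because the web application ID has a different prefix.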

Further Resource Narrowing by Locator

In addition to narrowing down the target resource by resource ID, you can further narrow it down by using the @locatable directive.

info

For details on the @locatable directive, see this page.

For example, in Shisho Cloud, the scenario (scenarios) field is defined as a child resource of "Web Application" ("user-web-application").

type Query {
  """
  All data from web application integration
  """
  webApps: [WebApp!]! @resource(kind: "user-web-application")
}

type WebApp {
  """
  Scenarios that the finding relates to
  """
  scenarios: [WebAppScenario!]! @locatable(parentKind: "user-web-application")

  ...
}

By specifying the locator information for each resource as follows when executing the workflow, you can get only the scenarios associated with a specific web application.

  • user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDV
    • dd6390cc9f7b1af728b78109f78b8034f9487624c77056cdab064e658df7b5f0
  • user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDW
    • dd6390cc9f7b1af728b78109f78b8034f9487624c77056cdab064e658df7b5f1

info

The format of locator information depends on the resource. For the scenario (scenarios) field, the locator is the hash value of the scenario.
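
As a rough illustration of locator-based narrowing, the sketch below filters the scenarios of one web application by the locators given for it. The `locators_by_parent` mapping and the `select_scenarios` function are hypothetical, as is the choice to return every scenario when no locator is given for a parent; the actual input format and behavior of Shisho Cloud may differ.

```python
from typing import Dict, List, Set

# Hypothetical execution input: parent resource ID -> locators (scenario hashes).
# The values mirror the example list above.
locators_by_parent: Dict[str, Set[str]] = {
    "user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDV": {
        "dd6390cc9f7b1af728b78109f78b8034f9487624c77056cdab064e658df7b5f0",
    },
    "user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDW": {
        "dd6390cc9f7b1af728b78109f78b8034f9487624c77056cdab064e658df7b5f1",
    },
}


def select_scenarios(
    parent_id: str,
    scenario_hashes: List[str],
    locators: Dict[str, Set[str]],
) -> List[str]:
    """Keep only the scenarios whose hash is listed as a locator for this parent."""
    wanted = locators.get(parent_id)
    if wanted is None:
        # Assumption: without locators for this parent, no further narrowing happens.
        return list(scenario_hashes)
    return [h for h in scenario_hashes if h in wanted]


parent = "user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDV"
all_hashes = [
    "dd6390cc9f7b1af728b78109f78b8034f9487624c77056cdab064e658df7b5f0",
    "0000000000000000000000000000000000000000000000000000000000000000",  # made-up hash
]
print(select_scenarios(parent, all_hashes, locators_by_parent))
```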

Explicitly Retrieving All Resources

So far, we have seen how to narrow down resources. Now let's look at the following case.

{
  aws {
    accounts {
      network {
        vpcs {
          routeTables { # Get some route tables
            id

            ...
          }
        }
      }
      ec2 {
        instances {
          vpc {
            routeTables { # Get some route tables
              metadata {
                id
              }

              ...
            }
          }
        }
      }
    }
  }
}

In this case, if the conditions for narrowing down resources are met for both the aws.accounts.network.vpcs.routeTables and aws.accounts.ec2.instances.vpc.routeTables fields, the audit target resources are narrowed down in both fields.

However, there may be cases where the policy code needs all resources for one of these fields. In that case, you can retrieve all resources by adding the @mustBeTotal directive to the target field in the GraphQL query.

{
  aws {
    accounts {
      network {
        vpcs {
          routeTables @mustBeTotal { # Always get all route tables
            id

            ...
          }
        }
      }
      ec2 {
        instances {
          vpc {
            routeTables { # Get some route tables
              metadata {
                id
              }

              ...
            }
          }
        }
      }
    }
  }
}
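
The effect of @mustBeTotal on the query above can be summarized with a small Python sketch. The `retrieval_mode` function and the `fields` mapping are hypothetical and only model the behavior described in this document: a field marked with @mustBeTotal always retrieves all resources, while other fields are narrowed only when the narrowing conditions are met.

```python
from typing import Dict


def retrieval_mode(narrowing_conditions_met: bool, must_be_total: bool) -> str:
    """Return 'all' or 'partial' for a single field in the query."""
    if must_be_total:
        # @mustBeTotal overrides narrowing for this field.
        return "all"
    return "partial" if narrowing_conditions_met else "all"


# Field path -> whether @mustBeTotal is written on that field in the query above.
fields: Dict[str, bool] = {
    "aws.accounts.network.vpcs.routeTables": True,
    "aws.accounts.ec2.instances.vpc.routeTables": False,
}

# Assume the narrowing conditions are met for both fields.
modes = {path: retrieval_mode(True, total) for path, total in fields.items()}
print(modes)
```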

When you modify an existing workflow or create a custom workflow, and the GraphQL query contains multiple fields with the same resource kind, consider adding this directive where needed.

info

For details on the @mustBeTotal directive, see this page.