Workflow
The English user guide is currently in beta preview. Most of the documents have been automatically translated from the Japanese version. Should you find any inaccuracies, please reach out to Flatt Security.
Advanced: Workflow Execution Optimization
A single workflow usually contains multiple jobs, and each job normally inspects all audit target data. Shisho Cloud speeds up inspection with the following two capabilities:
- Running only some of the jobs in a workflow
- Acquiring only some of the audit target data within a workflow job and running the inspection code on it
This document mainly explains the optimization of data acquisition. Note that the term "audit target resources" used throughout this document is synonymous with "audit target data". For the definition of resources, please refer to this page.
For details on how to re-run inspections, see this page.
Currently, this feature is only available for a limited number of resources. We plan to expand support in the future.
Optimizing Data Acquisition from Data Sources
The decide block and notify block within each job in a workflow contain GraphQL query definitions. By default, all resources matching the definition are retrieved and passed to the policy code. For example, the following query gets all Compute Engine instances within a Google Cloud project.
{
  googleCloud {
    projects {
      computeEngine {
        instances {
          metadata {
            id
            displayName
          }
          shieldedInstanceConfiguration {
            enableSecureBoot
          }
        }
      }
    }
  }
}
When executing a GraphQL query, this feature consults the directives attached to fields in the GraphQL schema. If a target field meets both of the following conditions and target resource IDs are specified, only those resources are retrieved.
- The @canBePartial directive is added.
- The @resource directive indicates the resource kind.
type GoogleCloudComputeEngine {
  ...
  """
  All Google Cloud Compute Engine instances
  """
  instances(
    condition: GoogleCloudComputeEngineCondition
  ): [GoogleCloudComputeEngineInstance!]!
    @resource(kind: "googlecloud-ce-instance")
    @canBePartial
  ...
}
For details on GraphQL directives defined in Shisho Cloud, see this page.
Let's take a closer look at specifying the target resource using the Resource ID.
As mentioned in the "Re-running Inspections" section, this feature can be executed on the Shisho Cloud console, but internally, the following list is passed along with the target job ID when the workflow is executed.
googlecloud-ce-instance|489621111111|asia-northeast1-b|5121584011111111111
googlecloud-ce-instance|489621111111|asia-northeast1-b|5121584011111111112
user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDV
The prefix of a Shisho Cloud resource ID indicates the resource kind. Using the received resource IDs, Shisho Cloud decides which fields in the GraphQL query this feature applies to by checking the following for each field:
- Whether the @canBePartial directive is added
- Whether the @resource directive indicates the resource kind
- Whether the prefix of the target resource ID matches the @resource directive
If all of the above conditions are met, a unique identifier is passed to the data source as a search condition to narrow down the target resource, and only matching resources are retrieved.
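The check described above can be sketched in Python. This is only an illustration under stated assumptions, not Shisho Cloud's implementation: the FieldInfo class, its attribute names, and both function names are hypothetical. The only facts taken from this page are the three conditions and the "kind|..." shape of a resource ID. Falling back to full retrieval when no ID matches a field is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FieldInfo:
    """Hypothetical summary of the directives attached to one schema field."""
    name: str
    resource_kind: Optional[str]  # value of @resource(kind: ...), if present
    can_be_partial: bool          # True when @canBePartial is present


def kind_of(resource_id: str) -> str:
    """Return the resource kind, i.e. the prefix before the first '|'."""
    return resource_id.split("|", 1)[0]


def ids_to_fetch(field: FieldInfo, target_ids: List[str]) -> Optional[List[str]]:
    """Return the IDs to narrow the field to, or None to retrieve everything."""
    # Conditions 1 and 2: the field must carry both directives.
    if not field.can_be_partial or field.resource_kind is None:
        return None
    # Condition 3: keep only IDs whose prefix matches the field's kind.
    matching = [rid for rid in target_ids if kind_of(rid) == field.resource_kind]
    # Assumption: with no matching ID, the field is retrieved in full.
    return matching or None


target_ids = [
    "googlecloud-ce-instance|489621111111|asia-northeast1-b|5121584011111111111",
    "googlecloud-ce-instance|489621111111|asia-northeast1-b|5121584011111111112",
    "user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDV",
]
instances = FieldInfo("instances", "googlecloud-ce-instance", True)
narrowed = ids_to_fetch(instances, target_ids)
```

With the list shown earlier, narrowed holds only the two googlecloud-ce-instance IDs, because the user-web-application ID has a different prefix.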
Further Resource Narrowing by Locator
In addition to narrowing down the target resource by resource ID, you can further narrow it down by using the @locatable directive.
For details on the @locatable directive, see this page.
For example, in Shisho Cloud, the scenario (scenarios) field is defined as a child resource of "Web Application" ("user-web-application").
type Query {
  """
  All data from web application integration
  """
  webApps: [WebApp!]! @resource(kind: "user-web-application")
}
type WebApp {
  """
  Scenarios that the finding relates to
  """
  scenarios: [WebAppScenario!]! @locatable(parentKind: "user-web-application")
  ...
}
By specifying the locator information for each resource as follows when executing the workflow, you can get only the scenarios associated with a specific web application.
user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDV
dd6390cc9f7b1af728b78109f78b8034f9487624c77056cdab064e658df7b5f0
user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDW
dd6390cc9f7b1af728b78109f78b8034f9487624c77056cdab064e658df7b5f1
Locator information varies from resource to resource. The locator information for specifying the scenario (scenarios) field is the hash value of the scenario.
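To make the effect of locators concrete, here is a small Python sketch of narrowing child resources by parent resource ID and locator. It is illustrative only: the function names, the (resource ID, locator) pair representation, and the "hash" key on each scenario are hypothetical, and keeping all scenarios when a parent has no locator is an assumption of this sketch.

```python
from typing import Dict, List, Set, Tuple


def build_locator_index(targets: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Group locators by the parent resource ID they were specified for."""
    index: Dict[str, Set[str]] = {}
    for parent_id, locator in targets:
        index.setdefault(parent_id, set()).add(locator)
    return index


def filter_scenarios(
    parent_id: str,
    scenarios: List[dict],
    index: Dict[str, Set[str]],
) -> List[dict]:
    """Keep only the scenarios whose hash was given as a locator for this parent."""
    wanted = index.get(parent_id)
    if wanted is None:
        # Assumption: no locator for this parent means no further narrowing.
        return list(scenarios)
    return [s for s in scenarios if s["hash"] in wanted]


targets = [
    ("user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDV",
     "dd6390cc9f7b1af728b78109f78b8034f9487624c77056cdab064e658df7b5f0"),
    ("user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDW",
     "dd6390cc9f7b1af728b78109f78b8034f9487624c77056cdab064e658df7b5f1"),
]
index = build_locator_index(targets)

# Hypothetical scenarios belonging to the first web application.
scenarios = [
    {"hash": "dd6390cc9f7b1af728b78109f78b8034f9487624c77056cdab064e658df7b5f0"},
    {"hash": "0000000000000000000000000000000000000000000000000000000000000000"},
]
selected = filter_scenarios(
    "user-web-application|WA01J1C82HCBZWKN9M7R5TJ1GEDV", scenarios, index
)
```

Here selected contains only the first scenario, since its hash is the one listed as a locator for that web application.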
Explicitly Retrieving All Resources
So far, we have seen how to narrow down resources. Now let's look at the following case.
{
  aws {
    accounts {
      network {
        vpcs {
          routeTables { # Get some route tables
            id
            ...
          }
        }
      }
      ec2 {
        instances {
          vpc {
            routeTables { # Get some route tables
              metadata {
                id
              }
              ...
            }
          }
        }
      }
    }
  }
}
In this case, if the condition for narrowing down resources is met for both the aws.accounts.network.vpcs.routeTables and aws.accounts.ec2.instances.vpc.routeTables fields, the audit target resources are narrowed down for each field.
However, there may be cases where you want to pass all resources to the policy code. In that case, you can retrieve all resources by adding the @mustBeTotal directive to the target field in the GraphQL query.
{
  aws {
    accounts {
      network {
        vpcs {
          routeTables @mustBeTotal { # Always get all route tables
            id
            ...
          }
        }
      }
      ec2 {
        instances {
          vpc {
            routeTables { # Get some route tables
              metadata {
                id
              }
              ...
            }
          }
        }
      }
    }
  }
}
If you modify an existing workflow or create a custom workflow, and the GraphQL query contains multiple fields with the same resource kind, consider using this directive as needed.
For details on the @mustBeTotal directive, see this page.
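The interaction between narrowing and @mustBeTotal for the two routeTables fields above can be sketched as follows. This is a simplified Python illustration, not Shisho Cloud's implementation: the plan_fetch function, its parameters, and the "example-route-table|..." resource IDs are hypothetical placeholders, and the narrowable flag stands in for "the field meets the conditions for narrowing".

```python
from typing import List, Optional


def plan_fetch(
    narrowable: bool,
    must_be_total: bool,
    target_ids: List[str],
) -> Optional[List[str]]:
    """Return the IDs to narrow a field to, or None to retrieve all resources."""
    # @mustBeTotal on the query field overrides any narrowing.
    if must_be_total:
        return None
    # Without the narrowing conditions or any target IDs, retrieve everything.
    if not narrowable or not target_ids:
        return None
    return list(target_ids)


# Hypothetical IDs; the real route table resource ID format is not shown here.
target_ids = ["example-route-table|rtb-1", "example-route-table|rtb-2"]

plans = {
    # routeTables @mustBeTotal { ... }: always retrieved in full.
    "aws.accounts.network.vpcs.routeTables": plan_fetch(True, True, target_ids),
    # routeTables { ... }: narrowed to the specified resources.
    "aws.accounts.ec2.instances.vpc.routeTables": plan_fetch(True, False, target_ids),
}
```

In this sketch the first field maps to None (full retrieval) while the second keeps the two target IDs, which mirrors the comments in the query above.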