# Partial Data Fetch

Typically, a single workflow contains **multiple jobs**, and each job inspects **all auditable data**. As a result, an inspection can take a long time when a workflow contains many jobs or when there is a large amount of auditable data.

Shisho Cloud speeds up inspections by providing the following features:

- The ability to execute only some of the jobs in a workflow
- The ability to retrieve only some of the auditable data and execute inspection code within a workflow job

This document describes the data fetching optimization. Currently, the partial data fetching feature is automatically applied when certain conditions are met during partial workflow execution. For information on how to execute only specific jobs in a workflow, see [this page](/docs/g/concepts/partial-workflow-execution.md).

:::warning
This feature is currently only available for a limited number of [resources](/docs/g/concepts/resource.md). We plan to expand support in the future.
:::

:::info
The term "auditable resource" used repeatedly below is synonymous with auditable data. For the definition of a resource, see [this page](/docs/g/concepts/resource.md).
:::

## Specifications of Partial Data Fetch

The **`decide` block** and **`notify` block** of each job in a workflow each contain a GraphQL query definition. By default, **all** [**resources**](/docs/g/concepts/resource.md) that match the definition are retrieved and passed to the policy code. For example, the following query retrieves **all Compute Engine instances** in a Google Cloud project.

```graphql
{
  googleCloud {
    projects {
      computeEngine {
        instances {
          metadata {
            id
            displayName
          }
          shieldedInstanceConfiguration {
            enableSecureBoot
          }
        }
      }
    }
  }
}
```

When executing a GraphQL query, this feature checks the directives attached to the GraphQL query and to the GraphQL schema. If the target field meets all of the following conditions and the target [resource ID](/docs/g/concepts/resource.md#resource-id) is specified, **only some of the [resources](/docs/g/concepts/resource.md)** are retrieved.

1. The `@canBePartial` directive is **applied** to the target field in the GraphQL schema.
2. The [**resource kind**](/docs/g/concepts/resource.md#resource-kind) is specified with the `@resource` directive on the target field in the GraphQL schema.
3. The `@mustBeTotal` directive is **not applied** to the target field in the GraphQL query.

The `@mustBeTotal` directive in condition 3 is described in detail in the [Explicit Fetch of All Resources](#partial-fetch-total) section. For example, the `instances` field in the following schema satisfies conditions 1 and 2.

```graphql
type GoogleCloudComputeEngine {

  ...

  """
  All Google Cloud Compute Engine instances
  """
  instances(
    condition: GoogleCloudComputeEngineCondition
  ): [GoogleCloudComputeEngineInstance!]!
    @resource(kind: "googlecloud-ce-instance")
    @canBePartial

  ...

}
```

:::info
For details on the GraphQL directives defined in Shisho Cloud, see [this page](/docs/g/concepts/graphql.md#gql-directives).
:::
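
To illustrate condition 3 with the Compute Engine example, the following sketch adds `@mustBeTotal` to the `instances` field of the query shown earlier. The directive placement (between the field name and its selection set) follows the usage in the [Explicit Fetch of All Resources](#partial-fetch-total) section. With this directive, condition 3 is no longer met for the field, so all instances are retrieved even during partial workflow execution. Without it, the query is unchanged from the earlier example and the field remains eligible for partial data fetching.

```graphql
{
  googleCloud {
    projects {
      computeEngine {
        # @mustBeTotal opts this field out of partial data fetching,
        # so all instances are passed to the policy code.
        instances @mustBeTotal {
          metadata {
            id
            displayName
          }
          shieldedInstanceConfiguration {
            enableSecureBoot
          }
        }
      }
    }
  }
}
```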

### Further Resource Narrowing by Locator

The previous section described how to narrow down auditable [resources](/docs/g/concepts/resource.md) using the partial data fetching feature. Some [resources](/docs/g/concepts/resource.md) can be narrowed down further using the `@locatable` directive applied to the target field in the GraphQL schema.

:::info
For details on the `@locatable` directive, see [this page](/docs/g/concepts/graphql.md#locatable).
:::

In the following example, the `scenarios` field is defined as a **child resource** of the "Web Application" resource kind (`user-web-application`) in Shisho Cloud.

```graphql
type Query {
  """
  All data from web application integration
  """
  webApps: [WebApp!]! @resource(kind: "user-web-application")
}

type WebApp {
  """
  Scenarios that the finding relates to
  """
  scenarios: [WebAppScenario!]! @locatable(parentKind: "user-web-application")

  ...
}
```

By specifying the locator information for each resource when partially executing a workflow, you can retrieve **only the scenarios associated with a specific web application**.
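
As a minimal sketch, a query that reaches the `scenarios` field through the schema above could look like the following. The sub-fields of `WebAppScenario` are not shown in this document, so the example selects only `__typename`, a standard GraphQL meta-field, as a placeholder; replace it with the fields your policy actually needs. The query itself carries no locator information. This sketch assumes that narrowing is decided at execution time, based on the locator specified for the partial workflow execution.

```graphql
{
  webApps {
    # With a locator for a specific web application, only the
    # scenarios associated with that application are retrieved.
    scenarios {
      # Placeholder selection: the real fields of WebAppScenario
      # are not listed in this document.
      __typename
    }
  }
}
```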

### Explicit Fetch of All Resources {#partial-fetch-total}

So far, we have seen how to narrow down [resources](/docs/g/concepts/resource.md). Now let's look at the following case.

```graphql
{
  aws {
    accounts {
      network {
        vpcs {
          routeTables { # Retrieve some of the route tables
            id

            ...

          }
        }
      }
      ec2 {
        instances {
          vpc {
            routeTables { # Retrieve some of the route tables
              metadata {
                id
              }

              ...

            }
          }
        }
      }
    }
  }
}
```

In this case, if the conditions for narrowing down [resources](/docs/g/concepts/resource.md) are met for both the `aws.accounts.network.vpcs.routeTables` field and the `aws.accounts.ec2.instances.vpc.routeTables` field, auditable [resources](/docs/g/concepts/resource.md) are narrowed down for each field.

However, there may be cases where you want to pass all resources to the policy code. In such cases, you can retrieve **all resources** by adding the `@mustBeTotal` directive to the target field in the GraphQL query.

```graphql
{
  aws {
    accounts {
      network {
        vpcs {
          routeTables @mustBeTotal { # Always retrieve all route tables
            id

            ...

          }
        }
      }
      ec2 {
        instances {
          vpc {
            routeTables { # Retrieve some of the route tables
              metadata {
                id
              }

              ...

            }
          }
        }
      }
    }
  }
}
```

If you modify an existing workflow or create a custom workflow, and the GraphQL query contains **multiple fields with the same [resource kind](/docs/g/concepts/resource.md#resource-kind)**, consider using this directive as needed.

:::info
For details on the `@mustBeTotal` directive, see [this page](/docs/g/concepts/graphql.md#mustbetotal).
:::

## Summary

The partial data fetching feature is applied automatically when certain conditions are met during partial workflow execution. You can therefore write workflows as usual, and they are optimized without any extra effort. When a policy needs every resource, you can explicitly opt out of partial data fetching with the `@mustBeTotal` directive.
