Client-Side Collector

Deploy the open-source Docker agent in your network for air-gapped and defense environments.

Overview

The MergeWhy Collector is an open-source Docker agent designed for defense, government, and highly regulated organizations that cannot send source code to external SaaS platforms. It runs inside your network, evaluates change evidence locally, signs attestations with Ed25519, and pushes only structured results to the MergeWhy API.

Tip

Source code, PR descriptions, review comments, and CI logs never leave your network. Only evidence scores, boolean evidence flags, per-control PASS/FAIL/WARNING results, and gap type identifiers are transmitted.

Installation

Docker

docker-compose.yml
services:
  mergewhy-collector:
    image: ghcr.io/mergewhy/collector:latest
    restart: unless-stopped
    environment:
      MERGEWHY_API_URL: https://app.mergewhy.com
      MERGEWHY_API_KEY: mw_live_...
      SCM_PROVIDER: github
      GITHUB_TOKEN: ghp_...
      GITHUB_BASE_URL: https://github.example.com/api/v3
      COLLECTOR_FRAMEWORKS: soc2,sox_itgc
      POLL_INTERVAL_SECONDS: 300
      COLLECTOR_PRIVATE_KEY: "base64-encoded-ed25519-private-key"
    ports:
      - "8080:8080"  # Health endpoint
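
Once the container is up, the published port can be probed to confirm the collector is running. The sketch below is a minimal check using only the Python standard library. It assumes the health endpoint is served at /health (the path used by the Kubernetes probes in this guide) and that any HTTP 2xx response means healthy; the response body format is not documented here, so it is ignored. The function name is_healthy is illustrative.

```python
import urllib.request
from urllib.error import URLError


def is_healthy(base_url: str = "http://localhost:8080", timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/health answers with an HTTP 2xx status."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("collector healthy:", is_healthy())
```

A check like this can run from a monitoring job on the Docker host. It does not replace the liveness and readiness probes in a Kubernetes deployment.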

Kubernetes

deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mergewhy-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mergewhy-collector
  template:
    metadata:
      labels:
        app: mergewhy-collector  # must match spec.selector.matchLabels
    spec:
      containers:
        - name: collector
          image: ghcr.io/mergewhy/collector:latest
          envFrom:
            - secretRef:
                name: mergewhy-collector-secrets
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            periodSeconds: 10

Key Generation

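The collector signs attestations with an Ed25519 private key, and the matching public key is registered in MergeWhy. Generate a keypair, pass the private key to the collector as COLLECTOR_PRIVATE_KEY, and register the public key in MergeWhy: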

Generate keypair
# Generate a new Ed25519 keypair
docker run --rm ghcr.io/mergewhy/collector:latest keygen

# Output:
# Private Key (keep secret):
#   MC4CAQAwBQYDK2VwBCIE...
#
# Public Key (register in MergeWhy):
#   MCowBQYDK2VwAyEA...
#
# Fingerprint (sha256):
#   sha256:a1b2c3d4e5f6...

Register the public key in Settings → Developer → Collector Keys.
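
Because the two keys look similar, it is easy to paste the private key into the registration form by mistake. The sketch below tells them apart before anything is submitted. It assumes the keygen output is standard base64-encoded DER (PKCS#8 for the private key, SubjectPublicKeyInfo for the public key, as defined in RFC 8410), which is what the MC4C... and MCow... prefixes in the sample output suggest. The function name classify_ed25519_key is illustrative and not part of the collector.

```python
import base64
import binascii

# DER headers for Ed25519 keys (RFC 8410). Each header is followed by
# the 32-byte key itself.
SPKI_PREFIX = bytes.fromhex("302a300506032b6570032100")           # public key
PKCS8_PREFIX = bytes.fromhex("302e020100300506032b657004220420")  # private key


def classify_ed25519_key(b64_text: str) -> str:
    """Return 'public', 'private', or 'unknown' for a base64 DER blob."""
    try:
        der = base64.b64decode(b64_text.strip(), validate=True)
    except (binascii.Error, ValueError):
        return "unknown"
    if len(der) == 44 and der.startswith(SPKI_PREFIX):
        return "public"
    if len(der) == 48 and der.startswith(PKCS8_PREFIX):
        return "private"
    return "unknown"


if __name__ == "__main__":
    # Dummy all-zero key material, used only to demonstrate the format.
    dummy_public = base64.b64encode(SPKI_PREFIX + bytes(32)).decode()
    print(dummy_public[:16], classify_ed25519_key(dummy_public))
    # prints: MCowBQYDK2VwAyEA public
```

Only a value classified as public should be registered. A value classified as private belongs in the collector's secret configuration and nowhere else.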

Operating Modes

  • Daemon (default) — Continuously polls for open PRs at the configured interval
  • Once (--once) — Single poll cycle then exits. Ideal for cron-scheduled runs
  • Scan (--scan) — Scans all repositories for open PRs. Used for initial setup
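
The difference between daemon mode and --once can be pictured as the same poll cycle run inside or outside a loop. The sketch below is a conceptual model, not the collector's actual implementation. The names run and poll_cycle are hypothetical, and max_cycles exists only so the loop can be stopped in a demo. The default interval mirrors the POLL_INTERVAL_SECONDS: 300 setting from the Docker example.

```python
import time
from typing import Callable, Optional


def run(poll_cycle: Callable[[], None],
        once: bool = False,
        interval_seconds: float = 300.0,
        sleep: Callable[[float], None] = time.sleep,
        max_cycles: Optional[int] = None) -> int:
    """Run poll cycles and return how many were executed.

    once=True models --once: a single cycle, then return.
    once=False models daemon mode: repeat, sleeping between cycles.
    """
    cycles = 0
    while True:
        poll_cycle()
        cycles += 1
        if once or (max_cycles is not None and cycles >= max_cycles):
            return cycles
        sleep(interval_seconds)


if __name__ == "__main__":
    # One cycle and exit, as a cron-scheduled run would do.
    run(lambda: print("polling for open PRs"), once=True)
```

With --once the schedule lives in cron rather than in the process, so a failed cycle ends with the process and the next scheduled run starts fresh.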

Data Sovereignty

The collector enforces strict data boundaries. The following data is evaluated locally but never transmitted:

  • Source code and diffs
  • PR descriptions and commit messages
  • Review comments and discussion threads
  • CI/CD logs and test output
  • Jira/Slack linked content

Only structured attestation results are sent to MergeWhy: evidence scores (0-100), boolean flags (has ticket, has review, has approval), per-control PASS/FAIL/WARNING status, and gap type identifiers.
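
To make the boundary concrete, the sketch below models a result record that can hold only the kinds of values listed above. It is an illustration under stated assumptions, not the MergeWhy API schema: the class name, field names, control identifier, and gap type identifier are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List

ALLOWED_STATUSES = {"PASS", "FAIL", "WARNING"}


@dataclass(frozen=True)
class AttestationResult:
    """Hypothetical shape of a transmitted result; no free-text fields."""
    evidence_score: int              # 0-100
    has_ticket: bool
    has_review: bool
    has_approval: bool
    control_status: Dict[str, str]   # control id -> PASS / FAIL / WARNING
    gap_types: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 0 <= self.evidence_score <= 100:
            raise ValueError("evidence_score must be between 0 and 100")
        bad = {s for s in self.control_status.values() if s not in ALLOWED_STATUSES}
        if bad:
            raise ValueError(f"unexpected control status: {sorted(bad)}")

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    result = AttestationResult(
        evidence_score=82,
        has_ticket=True,
        has_review=True,
        has_approval=False,
        control_status={"example-control-1": "WARNING"},  # hypothetical id
        gap_types=["example_gap_type"],                   # hypothetical id
    )
    print(result.to_json())
```

Keeping the record to numbers, booleans, fixed status values, and short identifiers leaves no field where a diff, a PR description, or a log excerpt could be carried along by accident.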