Client-Side Collector
Deploy the open-source Docker agent in your network for air-gapped and defense environments.
Overview
The MergeWhy Collector is an open-source Docker agent designed for defense, government, and highly regulated organizations that cannot send source code to external SaaS platforms. It runs inside your network, evaluates change evidence locally, signs attestations with Ed25519, and pushes only structured results to the MergeWhy API.
Installation
Docker
services:
  mergewhy-collector:
    image: ghcr.io/mergewhy/collector:latest
    restart: unless-stopped
    environment:
      MERGEWHY_API_URL: https://app.mergewhy.com
      MERGEWHY_API_KEY: mw_live_...
      SCM_PROVIDER: github
      GITHUB_TOKEN: ghp_...
      GITHUB_BASE_URL: https://github.example.com/api/v3
      COLLECTOR_FRAMEWORKS: soc2,sox_itgc
      POLL_INTERVAL_SECONDS: 300
      COLLECTOR_PRIVATE_KEY: "base64-encoded-ed25519-private-key"
    ports:
      - "8080:8080" # Health endpoint
Kubernetes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mergewhy-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mergewhy-collector
  template:
    metadata:
      labels:
        app: mergewhy-collector # Must match spec.selector.matchLabels
    spec:
      containers:
        - name: collector
          image: ghcr.io/mergewhy/collector:latest
          envFrom:
            - secretRef:
                name: mergewhy-collector-secrets
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            periodSeconds: 10
Key Generation
The collector signs attestations with an Ed25519 keypair. Generate a keypair, supply the private key to the collector as COLLECTOR_PRIVATE_KEY, and register the public key in MergeWhy:
# Generate a new Ed25519 keypair
docker run --rm ghcr.io/mergewhy/collector:latest keygen
# Output:
# Private Key (keep secret):
# MC4CAQAwBQYDK2VwBCIE...
#
# Public Key (register in MergeWhy):
# MCowBQYDK2VwAyEA...
#
# Fingerprint (sha256):
# sha256:a1b2c3d4e5f6...
Register the public key in Settings → Developer → Collector Keys.
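Before registering a key, it can help to confirm that the fingerprint shown in MergeWhy matches the key you generated. The sketch below is one way to recompute a fingerprint locally. It rests on two assumptions that the keygen output suggests but does not confirm: that the public key is a base64-encoded DER SubjectPublicKeyInfo (the `MCowBQYDK2VwAyEA` prefix is the standard Ed25519 header), and that the fingerprint is the hex SHA-256 digest of those DER bytes. The function name `fingerprint` is hypothetical, and the collector may hash different bytes.

```python
import base64
import hashlib

# Standard 12-byte DER header for an Ed25519 SubjectPublicKeyInfo (RFC 8410).
# Its base64 form is the "MCowBQYDK2VwAyEA" prefix seen in the keygen output.
ED25519_SPKI_PREFIX = bytes.fromhex("302a300506032b6570032100")


def fingerprint(public_key_b64: str) -> str:
    """Return 'sha256:<hex>' over the DER bytes of the public key.

    ASSUMPTION: hashing the DER bytes is a guess at how the collector
    derives its fingerprint; it is not confirmed by the documentation.
    """
    der = base64.b64decode(public_key_b64, validate=True)
    # 12-byte header + 32-byte raw Ed25519 public key = 44 bytes.
    if len(der) != 44 or not der.startswith(ED25519_SPKI_PREFIX):
        raise ValueError("expected a DER-encoded Ed25519 public key")
    return "sha256:" + hashlib.sha256(der).hexdigest()


if __name__ == "__main__":
    # Placeholder key (all-zero key bytes), for illustration only.
    placeholder = base64.b64encode(ED25519_SPKI_PREFIX + bytes(32)).decode()
    print(fingerprint(placeholder))
```

If the value printed for your real public key differs from the one keygen reported, the assumption about which bytes are hashed is probably wrong rather than the key itself.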
Operating Modes
- Daemon (default): continuously polls for open PRs at the configured interval.
- Once (--once): runs a single poll cycle, then exits. Ideal for cron-scheduled runs.
- Scan (--scan): scans all repositories for open PRs. Used for initial setup.
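The difference between daemon and once mode comes down to whether the poll loop repeats. The following is a minimal sketch of that control flow, not the collector's actual implementation. The names `run`, `poll_cycle`, and `max_cycles` are hypothetical; the 300-second default mirrors the POLL_INTERVAL_SECONDS value in the Docker example, and scan mode is not modeled.

```python
import time
from typing import Callable, Optional


def run(
    poll_cycle: Callable[[], None],
    *,
    once: bool = False,
    interval_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    max_cycles: Optional[int] = None,
) -> int:
    """Run poll cycles and return how many were executed.

    once=True  -> one cycle, then return (like --once).
    once=False -> keep polling, sleeping between cycles (daemon mode).
    max_cycles is a hypothetical escape hatch so the loop can be tested.
    """
    cycles = 0
    while True:
        poll_cycle()
        cycles += 1
        if once or (max_cycles is not None and cycles >= max_cycles):
            return cycles
        sleep(interval_seconds)


if __name__ == "__main__":
    # Single cycle, as a cron job would trigger it.
    run(lambda: print("polling open PRs"), once=True)
```

Passing `sleep` as a parameter keeps the loop testable without waiting for real intervals.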
Data Sovereignty
The collector enforces strict data boundaries. The following data is evaluated locally but never transmitted:
- Source code and diffs
- PR descriptions and commit messages
- Review comments and discussion threads
- CI/CD logs and test output
- Jira/Slack linked content
Only structured attestation results are sent to MergeWhy: evidence scores (0-100), boolean flags (has ticket, has review, has approval), per-control PASS/FAIL/WARNING status, and gap type identifiers.
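To make the boundary concrete, here is a rough sketch of what a structured result could look like. It carries only the kinds of fields listed above. The class name, field names, control names, and gap identifier are all hypothetical, since the real wire format is not documented here. Ed25519 signing is left out because the Python standard library does not provide it.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List

ALLOWED_STATUSES = {"PASS", "FAIL", "WARNING"}


@dataclass(frozen=True)
class AttestationResult:
    """Hypothetical shape of the structured result sent to MergeWhy."""

    evidence_score: int              # 0-100
    has_ticket: bool
    has_review: bool
    has_approval: bool
    control_status: Dict[str, str]   # control name -> PASS / FAIL / WARNING
    gap_types: List[str]             # gap type identifiers

    def __post_init__(self) -> None:
        if not 0 <= self.evidence_score <= 100:
            raise ValueError("evidence_score must be in 0-100")
        unknown = set(self.control_status.values()) - ALLOWED_STATUSES
        if unknown:
            raise ValueError(f"unknown control status: {sorted(unknown)}")


def to_wire(result: AttestationResult) -> str:
    """Serialize deterministically so the same result yields the same bytes."""
    return json.dumps(asdict(result), sort_keys=True, separators=(",", ":"))


if __name__ == "__main__":
    example = AttestationResult(
        evidence_score=82,
        has_ticket=True,
        has_review=True,
        has_approval=False,
        control_status={"peer_review": "PASS", "change_approval": "FAIL"},
        gap_types=["missing_approval"],  # hypothetical identifier
    )
    print(to_wire(example))
```

Because the type has no free-text fields, there is nowhere for source code, PR descriptions, or log output to end up in the payload by accident.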