
2026: Remote Mac Self-Hosted GitHub Actions Runner for OpenClaw CI — Read-Only Webhooks, macOS Label Pools & M4 Concurrency (Lobster Workflows + FAQ)

Main line: remote Mac → self-hosted Runner → OpenClaw macos-node / macos-swift. Gateway install and six-region shootouts are out of scope here.

If you maintain an OpenClaw fork or plugin, 2026 usually means two parallel pressures: upstream OpenClaw CI expects real macOS for macos-node, macos-swift, and checks-node-compat-node22, while GitHub-hosted macOS minutes and queue times keep climbing. This guide covers only three things: running a self-hosted GitHub Actions macOS runner on a dedicated remote Mac, mapping macOS label pools to those lanes, and using read-only webhooks plus Lobster multi-step workflows for failure triage. It does not cover Gateway port 18789 setup or a six-region node shootout (see our Runner regions & TCO guide). Plans: pricing; US East / West checkout: US East, US West; Runner ops: help center.

1) Hosted macOS runners vs remote Mac self-hosted: decision table for OpenClaw CI

Per OpenClaw CI docs, macos-node runs TypeScript tests when macOS-related sources change; macos-swift covers Swift lint/build/test; manual workflow_dispatch bypasses smart scope and runs the full graph (including Node 22 compat). Upstream defaults to Blacksmith macos-latest-class runners. That is fine for the main repo, but fork maintainers who cannot rely on that queue need their own macOS runner labels.

Hosted macOS fits low-frequency PRs; self-hosted on a remote M4 trades fixed rent for persistent node_modules and DerivedData—better for daily main, plugin contract shards, and teams that want local-equivalent pnpm check on hardware they control. For multi-region nodes and daily–monthly TCO math, read the regions & TCO article; here we only wire OpenClaw lanes. Long-lived Gateway/agent deployment is covered in the OpenClaw remote Mac field guide (one line: Gateway owns chat/tools; CI runners own compile/test).

| Path | OpenClaw CI fit | Main cost |
| --- | --- | --- |
| GitHub-hosted macos-latest | Zero ops for upstream workflows; forks may not share queue/cache. | Per-minute billing; cold macos-swift; peak queues. |
| Remote Mac self-hosted runner | Labels map to macos-node / macos-swift; pin Node 22 and pnpm store. | Runner upgrades, disk, concurrency mutexes. |
| Linux hosted only, skip macOS lanes | OK for docs/pure Node; cannot replace Swift/macOS tests. | Still need manual or scheduled full macOS dispatch before merge. |

Decision hint: If you fire workflow_dispatch full CI at least weekly, or macos-node queues past ~15 minutes on your fork, plan one machine with openclaw-macos-node before stacking more hosted minutes. Plugin teams often find a single M4 seat pays back within ~90 days when dispatch is routine—your mileage depends on workflow size; this is not an SLA claim.

Another signal you are ready to self-host: your fork’s preflight job keeps scheduling macOS lanes on every plugin touch, but hosted runners evict caches between jobs, so TypeScript and Swift steps spend more time downloading than testing. A remote Mac with a warm pnpm store and DerivedData folder often cuts wall-clock time even when raw CPU is similar to hosted hardware—measure on your repo before you budget seats.
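
The payback estimate in the decision hint can be sanity-checked with a few lines of arithmetic. The sketch below is a minimal model, assuming a one-time setup cost, a flat monthly rent, and a flat hosted per-minute rate; every figure you pass in is a placeholder to be replaced with numbers from your own billing export, and the function name is hypothetical.

```python
from typing import Optional


def payback_days(setup_cost: float, monthly_rent: float,
                 hosted_rate_per_min: float,
                 macos_minutes_per_day: float) -> Optional[float]:
    """Days until avoided hosted spend covers the one-time setup cost.

    Returns None when the seat costs at least as much per day as the
    hosted minutes it replaces, i.e. it never pays back.
    """
    daily_hosted_cost = hosted_rate_per_min * macos_minutes_per_day
    daily_rent = monthly_rent / 30.0  # simple 30-day month assumption
    daily_saving = daily_hosted_cost - daily_rent
    if daily_saving <= 0:
        return None
    return setup_cost / daily_saving
```

The model ignores cache warm-up gains and engineer time, so treat the result as a rough bound rather than a forecast.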

2) Environment checklist: Xcode CLT, Node 22, CI user isolation

OpenClaw runs most Node shards on Linux, but macos-node and checks-node-compat-node22 require Node 22 (see CI docs and local pnpm check parity). On your runner host:

  • Xcode Command Line Tools plus a team-frozen Xcode minor; Swift lane needs full Xcode, not CLT alone.
  • Node 22.x via fnm or mise in the CI user shell; still use actions/setup-node in workflows.
  • pnpm aligned with upstream packageManager; cache under the CI home, not your daily dev account.
  • Dedicated Unix user (e.g. runner) for actions-runner—not the same user as browser/chat secrets.
  • Resource caps: watch memory_pressure; only parallelize a second job on 24GB after measurement.
Pre-flight checklist (printable):
  1. xcode-select -p and xcodebuild -version match your doc.
  2. node -v is v22.x; pnpm -v matches the lockfile policy.
  3. Runner survives SSH logout via launchd.
  4. Free disk > 25% (DerivedData + pnpm store grow fast).
  5. Split PATs: webhook/reasoning secrets vs signing certs in separate GitHub Secrets.
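
Items 2 and 4 of the checklist lend themselves to an automated check. The following is a minimal sketch, assuming `node` is on the CI user's PATH and that the runner's work directory lives on the root volume; the function names and the `/` mount point are illustrative assumptions, not part of OpenClaw or the Actions runner.

```python
import re
import shutil
import subprocess
from typing import List, Optional, Tuple


def node_major(version_output: str) -> Optional[int]:
    """Parse the major version from `node -v` output such as 'v22.11.0'."""
    match = re.match(r"^v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def preflight(version_output: str, total_bytes: int, free_bytes: int,
              required_major: int = 22, min_free: float = 0.25) -> List[str]:
    """Return a list of human-readable problems; an empty list means pass."""
    problems = []
    major = node_major(version_output)
    if major != required_major:
        problems.append(f"node major is {major}, expected {required_major}")
    free = free_bytes / total_bytes if total_bytes > 0 else 0.0
    if free <= min_free:
        problems.append(f"free disk {free:.0%} is not above {min_free:.0%}")
    return problems


def collect_host_facts(mount: str = "/") -> Tuple[str, int, int]:
    """Gather real inputs on the runner host (assumes node is on PATH)."""
    version = subprocess.run(["node", "-v"], capture_output=True,
                             text=True, check=True).stdout
    usage = shutil.disk_usage(mount)
    return version, usage.total, usage.free
```

On the host, `preflight(*collect_host_facts())` returns the failures to print or alert on; keeping the parsing separate from the collection makes the thresholds easy to test off the machine.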

Platform APIs: Apple Developer Documentation; Node policy: nodejs.org.

3) Runner registration & macOS label pool (HowTo)

Follow GitHub’s self-hosted runner guide. For an OpenClaw fork, map upstream jobs to explicit labels—not a vague lone self-hosted.

Registration & labels (repo runner)
# in ~/actions-runner
./config.sh --url https://github.com/YOUR_ORG/YOUR_FORK \
  --token YOUR_REGISTRATION_TOKEN \
  --labels self-hosted,macOS,ARM64,openclaw-macos-node,openclaw-m4-16 \
  --name nuv-openclaw-macos-node-01

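# On macOS, svc.sh installs a per-user launchd agent and refuses to run as root;
# run it as the CI user, without sudo
./svc.sh install
./svc.sh start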

Label pool (split by lane):

  • openclaw-macos-node: upstream macos-node (TypeScript/Vitest macOS paths).
  • openclaw-macos-swift: upstream macos-swift; heavier CPU/RAM, so avoid default parallel with node on 16GB.
  • openclaw-node22: optional for dispatch compat or local pnpm check smoke.
  • openclaw-m4-16 / openclaw-m4-24: hardware tier for matrices and capacity planning.
Fork workflow: point macos-node at your pool
jobs:
  macos-node:
    runs-on: [self-hosted, macOS, openclaw-macos-node]
    steps:
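      - uses: actions/checkout@v4
      # setup-node can only resolve the pnpm cache if pnpm is already on PATH;
      # action-setup reads the version from packageManager in package.json.
      # Skip this step if the CI user already has a matching pnpm installed.
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'pnpm'
      - run: pnpm install --frozen-lockfile
      # placeholder filter: substitute your fork's macOS test selection
      - run: pnpm test --filter macos-relevant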

For codesign/notarization, add concurrency: { group: codesign-macos, cancel-in-progress: false } so two jobs never unlock keychains at once. More SSH/Runner notes: help center.
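
To reason about which machine will pick up the macos-node job, it helps to model the routing rule: a job is only eligible for a runner that carries every label listed in runs-on. The sketch below is a simplified illustration of that rule, not GitHub's scheduler; the function name is hypothetical, and the case-insensitive comparison is an assumption worth confirming against GitHub's runner label documentation.

```python
from typing import Iterable


def runner_matches(job_labels: Iterable[str],
                   runner_labels: Iterable[str]) -> bool:
    """True when the runner carries every label the job asks for."""
    have = {label.lower() for label in runner_labels}
    return all(label.lower() in have for label in job_labels)


# Labels from the registration example above.
NODE_RUNNER = ["self-hosted", "macOS", "ARM64",
               "openclaw-macos-node", "openclaw-m4-16"]
```

With these labels, a job asking for `[self-hosted, macOS, openclaw-macos-node]` is eligible, while one asking for `openclaw-macos-swift` stays queued until a Swift-labelled runner exists. That is why a lone `self-hosted` label lets any job land on any machine.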

Org vs repo runners: A single-repo fork can use one repo-level runner with the labels above. If several plugins share one billing group, consider an organization runner with repo access limited to forks you control, and use runner groups in GitHub Enterprise (where available) so experimental repos cannot steal Swift-lane machines.

Upgrades: Pin the runner application version in your runbook; after each GitHub runner release, drain jobs (./svc.sh stop), upgrade binaries, restart, and confirm the machine shows Idle before re-enabling auto-assignment. Stale runners are a common source of “works locally, fails on CI” drift when the service protocol changes.
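
The "confirm Idle" step can be scripted rather than eyeballed. The sketch below assumes you feed it the JSON returned by GitHub's list-runners REST endpoint for the repository (for example via `gh api repos/YOUR_ORG/YOUR_FORK/actions/runners`, which needs admin access to the repo); the function name is hypothetical.

```python
import json
from typing import List


def idle_runners(api_response: str, required_label: str) -> List[str]:
    """Names of runners that are online, not busy, and carry the label."""
    payload = json.loads(api_response)
    names = []
    for runner in payload.get("runners", []):
        labels = {entry["name"] for entry in runner.get("labels", [])}
        online = runner.get("status") == "online"
        busy = bool(runner.get("busy"))
        if online and not busy and required_label in labels:
            names.append(runner["name"])
    return names
```

After an upgrade, an empty result for `openclaw-macos-node` means the machine is either offline or already running a job, so hold off on re-enabling dispatches until its name appears.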

4) Wire OpenClaw CI / dispatch & read-only webhooks

Keep upstream preflight on push/PR; change only macOS runs-on to your pool. For full validation, use supported manual dispatch (from CI docs):

Trigger full CI graph (incl. macOS)
gh workflow run ci.yml --ref main
gh workflow run ci.yml --ref main -f target_ref=your-branch -f include_android=true

Read-only webhook reasoning (phased): For failure triage, subscribe to read-only events first—workflow_run.completed, check_run.completed, workflow_job.completed—and persist metadata (repo, branch, conclusion, URLs). Do not request Contents write or auto-merge in phase 0. After signature verification, let agents fetch log URLs or Actions API read scopes; expand to comments/labels only with approval policy.

| Phase | Webhook / permissions | Purpose |
| --- | --- | --- |
| Phase 0 | Read-only + workflow_run / check_run | Failure alerts, queue watch, Lobster triage input. |
| Phase 1 | Actions log read PAT | Pull failed job log slices; tie to macos-node shards. |
| Phase 2 | Controlled write (issue comment, label) | Auto ci-failure, pager mention; needs review gates. |

vs Gateway: Gateway serves agent tools and local hooks; CI webhooks consume GitHub events only. Never treat PR titles or comment bodies as shell commands in production—metadata-only triage (ClawSweeper-style minimal forwarding).

When wiring the receiver, log only stable fields: workflow_run.id, head_branch, conclusion, and HTML URLs to the run. Store HMAC verification failures separately from auth failures so on-call can tell misconfigured secrets from replay attacks. If you later add Lobster, pass those fields as JSON stdin—do not paste full webhook bodies into chat prompts where a malicious PR title could influence tool selection.
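
A receiver along these lines needs two small pieces: signature verification over the raw request body, and a projection that keeps only the stable fields. The sketch below assumes a workflow_run event and GitHub's `X-Hub-Signature-256` header format (`sha256=` followed by the hex HMAC-SHA256 of the body); the function names and the returned dictionary keys are illustrative choices, not an OpenClaw or Lobster API.

```python
import hashlib
import hmac
import json
from typing import Dict, Optional


def signature_is_valid(secret: bytes, raw_body: bytes,
                       header_value: Optional[str]) -> bool:
    """Check the X-Hub-Signature-256 header against the raw request body."""
    if not header_value or not header_value.startswith("sha256="):
        return False
    digest = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # Constant-time comparison; bytes avoid errors on non-ASCII headers.
    return hmac.compare_digest(expected.encode(), header_value.encode())


def stable_fields(raw_body: bytes) -> Dict[str, object]:
    """Keep only metadata that is safe to log or hand to a triage step."""
    event = json.loads(raw_body)
    run = event.get("workflow_run") or {}
    repo = event.get("repository") or {}
    return {
        "run_id": run.get("id"),
        "head_branch": run.get("head_branch"),
        "conclusion": run.get("conclusion"),
        "html_url": run.get("html_url"),
        "repo": repo.get("full_name"),
    }
```

Verify first, then project: titles, commit messages, and comment bodies never leave `stable_fields`, so a crafted PR title cannot reach a prompt or a shell. Counting `signature_is_valid` failures in their own log stream keeps them distinguishable from ordinary auth errors.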

5) M4 16GB / 24GB concurrency cheat sheet (single-machine label pool)

Pragmatic guidance for OpenClaw macOS lanes (not an SLA; strongly repo-dependent). For extra seats or regions, see the TCO article—no six-region table here.

| Tier | Suggested parallelism | Labels | Risks |
| --- | --- | --- | --- |
| M4 16GB | macos-node or 1× light Swift, not both at once. | openclaw-macos-node, openclaw-m4-16 | Vitest OOM; DerivedData vs pnpm store on disk. |
| M4 24GB | macos-node + 1× off-peak Swift check; or staggered 2× node shards. | openclaw-m4-24; keep Swift on its own runner name. | Parallel Swift compile can still warn on memory; use concurrency groups. |
| Two machines | One node, one swift; route by label, do not mix on one runner. | openclaw-macos-swift dedicated | Caches not shared; warm deps per host. |

Nuvcloud offers multi-node M4 bare metal with RAM/SSD upgrades—good for fixed CI seats. Availability and price: plans and checkout pages; we do not promise queue SLAs in this article.

When sizing SSD, budget for three growth curves: Xcode DerivedData, pnpm store, and GitHub Actions work directories under _work/. A 256GB base SKU can work for a single-lane node fork if you schedule weekly DerivedData cleanup; Swift-heavy forks should plan 512GB or the 1TB add-on described on the pricing wizard. Memory pressure on 16GB often shows up as killed node children during Vitest. If dmesg or Console shows jetsam events, move Swift to a 24GB host instead of chasing smaller test shards.
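
The weekly DerivedData cleanup can start as a dry run that only lists candidates. The sketch below assumes Xcode's default location under the CI user's home (`~/Library/Developer/Xcode/DerivedData`) and a seven-day threshold; both are assumptions to adjust, the function names are hypothetical, and deletion is deliberately left to the caller.

```python
import os
from pathlib import Path
from typing import Iterable, List, Tuple

# Xcode's default DerivedData location for the current user (assumed unchanged).
DEFAULT_ROOT = Path.home() / "Library" / "Developer" / "Xcode" / "DerivedData"


def scan(root: Path) -> List[Tuple[str, float]]:
    """List (path, mtime) for each immediate subdirectory of root."""
    if not root.is_dir():
        return []
    return [(entry.path, entry.stat().st_mtime)
            for entry in os.scandir(root) if entry.is_dir()]


def stale_entries(entries: Iterable[Tuple[str, float]],
                  max_age_days: float, now: float) -> List[str]:
    """Paths whose modification time is older than the cutoff."""
    cutoff = now - max_age_days * 86400
    return sorted(path for path, mtime in entries if mtime < cutoff)
```

A launchd job could call `stale_entries(scan(DEFAULT_ROOT), 7, time.time())` and log the result for a week before anything is actually removed with `shutil.rmtree`.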

6) Lobster workflow: reproducible CI failure log triage

Lobster collapses multi-step tool chains into one deterministic call with approval points and resumeToken—ideal for “CI red → fetch logs → classify → human OK → post summary” instead of an LLM freestyle-calling tools. Enable alsoAllow: ["lobster"] in agent config, then run a pipeline.

Lobster: failed job summary (example pipeline JSON)
{
  "action": "run",
  "pipeline": "exec --json --shell 'gh run list --workflow ci.yml --limit 5 --json databaseId,conclusion,headBranch' | exec --stdin json --shell 'gh run view $ID --log-failed' | exec --stdin json --shell 'node scripts/ci-triage-summarize.mjs' | approve --preview-from-stdin --limit 3 --prompt 'Post summary to on-call channel?'",
  "timeoutMs": 120000,
  "maxStdoutBytes": 512000
}

Split steps in a .lobster workflow file: collect → classify → approval → notify (see Lobster “workflow files”). When status is needs_approval, resume with:

Resume after approval
{
  "action": "resume",
  "token": "<resumeToken>",
  "approve": true
}

Reproducible triage habits: Log each failure’s run_id, triggering label (e.g. openclaw-macos-node), and a hash of the first stderr block in your team wiki. Scrub Lobster JSON before posting to issues—secrets sometimes leak in log snippets. On timeout, raise timeoutMs or split “fetch logs” from “LLM summarize.” Keep a 7-day read-only archive per failed run to separate flaky from real regressions.
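
Two of these habits, scrubbing and hashing the first error block, are easy to make mechanical. The sketch below assumes the failed-job log is already in memory as text and that the secrets most likely to leak are GitHub tokens with their documented prefixes; the pattern is illustrative and will not catch other credential types. All names here are hypothetical.

```python
import hashlib
import re

# GitHub token prefixes only; extend for other credential formats you use.
_TOKEN_RE = re.compile(
    r"\b(?:ghp|gho|ghu|ghs|ghr)_[A-Za-z0-9]{20,}\b"
    r"|github_pat_[A-Za-z0-9_]{20,}"
)


def scrub(text: str) -> str:
    """Replace anything that looks like a GitHub token with a marker."""
    return _TOKEN_RE.sub("[REDACTED]", text)


def first_block(log_text: str) -> str:
    """First run of consecutive non-empty lines, trailing spaces removed."""
    block = []
    for line in log_text.splitlines():
        if line.strip():
            block.append(line.rstrip())
        elif block:
            break
    return "\n".join(block)


def fingerprint(log_text: str) -> str:
    """Short, stable hash of the scrubbed first block for the team wiki."""
    block = first_block(scrub(log_text))
    return hashlib.sha256(block.encode("utf-8")).hexdigest()[:16]
```

Two runs that fail with the same opening error produce the same fingerprint even when the rest of the log differs, which makes repeat offenders easy to spot in the 7-day archive. Timestamps in the first block would defeat this, so strip them before hashing if your logs carry them.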

Teams often keep two pipelines: a fast Lobster flow that only classifies failure type (Node vs Swift vs infra) and a slow flow that fetches full logs after a human approves. That mirrors OpenClaw’s own CI philosophy—cheap preflight, expensive lanes only when needed—and prevents an LLM from burning tokens on green runs.
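
The fast flow does not need a model at all for a first pass; substring rules are often enough to route a failure. The sketch below is a hypothetical heuristic, not the `ci-triage-summarize.mjs` script referenced earlier, and the keyword lists are illustrative starting points to tune against your fork's real logs.

```python
from typing import Sequence, Tuple

# Checked in order: infrastructure problems mask everything else.
RULES: Sequence[Tuple[str, Sequence[str]]] = (
    ("infra", ("no space left on device",
               "runner has received a shutdown signal",
               "econnreset")),
    ("swift", ("xcodebuild", "swiftlint", "error: no such module")),
    ("node", ("vitest", "err_pnpm", "typeerror")),
)


def classify(log_text: str) -> str:
    """Return 'infra', 'swift', 'node', or 'unknown' for a log excerpt."""
    lowered = log_text.lower()
    for category, needles in RULES:
        if any(needle in lowered for needle in needles):
            return category
    return "unknown"
```

Anything that comes back as `unknown` is a reasonable candidate for the slow, human-approved flow, while `infra` results can page the runner owner directly without spending tokens.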

If you already use gh in CI, the same PAT scope works on the runner host for Lobster: restrict it to actions:read and metadata:read in phase 0. Rotate the PAT when staff leave; never reuse the runner registration token as an API credential.

7) FAQ

Q1: Runner shows Offline?
Check launchd, token expiry, disk full. Remove in GitHub → Settings → Actions → Runners and re-run config.sh; see self-hosted runner docs.

Q2: macos-node complains about Node?
Align Node 22 on the host, setup-node, and checks-node-compat-node22; clear stale global npm caches if needed.

Q3: Swift / SDK not found?
Install full Xcode, run sudo xcodebuild -license accept; never run macos-swift on CLT-only hosts.

Q4: Read-only webhook stuck at 403?
Verify secret, TLS, reverse-proxy path, App event permissions; gateway must not execute PR body text as commands.

Q5: openclaw doctor vs CI red?
doctor checks install and Gateway health; CI failures live in Actions logs. For workflow debugging, use Lobster plus gh run view rather than the onboarding docs.

Q6: Can we skip macOS and run Linux only?
Yes for pure Node/doc PRs, but before merging macOS/Swift changes run at least one workflow_dispatch full graph or you diverge from upstream preflight intent.

Q7: Share one runner between upstream and fork?
Not recommended—split runner groups and Secrets so untrusted fork scripts cannot reach upstream signing material.

Q8: Lobster invalid JSON?
Each pipeline step's stdout must be valid JSON; raise maxStdoutBytes if output is being truncated; do not pipe human-readable logs into the next step without --json.

Q9: Mix hosted and self-hosted?
Yes—e.g. PR on hosted, main macos-swift on self-hosted via if: or separate workflow files.

More posts: tech blog index. For region-level billing and parallel seats, continue with the macOS Runner regions & TCO guide.

Cloud Mac mini: steadier OpenClaw CI pools

Parking macos-node / macos-swift on a dedicated remote M4 buys predictable compute and label-pool elasticity: Nuvcloud bare-metal Mac mini gives native Unix, low idle power for 24/7 runners, optional 24GB RAM or SSD bumps, or a second machine dedicated to Swift—paired with read-only webhooks and Lobster triage, you spend less time watching hosted queues.

Planning an OpenClaw fork CI pool? Start with one M4 tagged openclaw-macos-node: view plans & regions, then register runners and webhooks from the checklist above.
