2026: Remote Mac Self-Hosted GitHub Actions Runner for OpenClaw CI — Read-Only Webhooks, macOS Label Pools & M4 Concurrency (Lobster Workflows + FAQ)

May 19, 2026 · Nuvcloud tech blog · ~18 min read

[Figure: laptop with code and terminal.] Main line: remote Mac → self-hosted Runner → OpenClaw `macos-node` / `macos-swift`. Gateway install and six-region shootouts are out of scope here.

If you maintain an OpenClaw fork or plugin in 2026, you are usually squeezed from two sides: upstream OpenClaw CI expects real macOS for macos-node, macos-swift, and checks-node-compat-node22, while GitHub-hosted macOS minutes and queue times keep climbing. This guide covers only three things: running a self-hosted GitHub Actions macOS runner on a dedicated remote Mac, mapping macOS label pools to those lanes, and using read-only webhooks plus Lobster multi-step workflows for failure triage. It does not cover Gateway port 18789 setup or a six-region node shootout (see our Runner regions & TCO guide). Plans: pricing; US East / West checkout: US East, US West; Runner ops: help center.

1) Hosted macOS runners vs remote Mac self-hosted: decision table for OpenClaw CI

Per OpenClaw CI docs, macos-node runs TypeScript tests when macOS-related sources change; macos-swift covers Swift lint/build/test; manual workflow_dispatch bypasses smart scope and runs the full graph (including Node 22 compat). Upstream defaults to Blacksmith macos-latest-class runners—fine for the main repo, but fork maintainers who cannot rely on that queue need their own macOS tags.

Hosted macOS fits low-frequency PRs; self-hosted on a remote M4 trades fixed rent for persistent node_modules and DerivedData—better for daily main, plugin contract shards, and teams that want local-equivalent pnpm check on hardware they control. For multi-region nodes and daily–monthly TCO math, read the regions & TCO article; here we only wire OpenClaw lanes. Long-lived Gateway/agent deployment is covered in the OpenClaw remote Mac field guide (one line: Gateway owns chat/tools; CI runners own compile/test).

Path	OpenClaw CI fit	Main cost
GitHub-hosted `macos-latest`	Zero ops for upstream workflows; forks may not share queue/cache.	Per-minute billing; cold `macos-swift`; peak queues.
Remote Mac self-hosted runner	Labels map to `macos-node` / `macos-swift`; pin Node 22 and pnpm store.	Runner upgrades, disk, concurrency mutexes.
Linux hosted only, skip macOS lanes	OK for docs/pure Node; cannot replace Swift/macOS tests.	Still need manual or scheduled full macOS dispatch before merge.

Decision hint: If you fire workflow_dispatch full CI at least weekly, or macos-node queues past ~15 minutes on your fork, plan one machine with openclaw-macos-node before stacking more hosted minutes. Plugin teams often find a single M4 seat pays back within ~90 days when dispatch is routine—your mileage depends on workflow size; this is not an SLA claim.

Another signal you are ready to self-host: your fork’s preflight job keeps scheduling macOS lanes on every plugin touch, but hosted runners evict caches between jobs, so TypeScript and Swift steps spend more time downloading than testing. A remote Mac with a warm pnpm store and DerivedData folder often cuts wall-clock time even when raw CPU is similar to hosted hardware—measure on your repo before you budget seats.
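The hosted-versus-rented trade-off above reduces to a break-even in billed minutes. The sketch below is a minimal illustration: `breakEvenMinutes` is a hypothetical helper, and the numbers in the usage line are made-up placeholders, not GitHub or Nuvcloud prices.

```typescript
// Hypothetical helper: hosted minutes per month at which a fixed-rent seat
// costs the same as per-minute billing. Both inputs are assumptions you
// supply from your own invoices.
function breakEvenMinutes(monthlyRent: number, hostedRatePerMinute: number): number {
  if (monthlyRent < 0 || hostedRatePerMinute <= 0) {
    throw new RangeError("rent must be >= 0 and the per-minute rate must be > 0");
  }
  return monthlyRent / hostedRatePerMinute;
}

// Placeholder figures for illustration only.
const breakEven = breakEvenMinutes(150, 0.125);
console.log(`Self-hosting breaks even above ~${breakEven} hosted minutes/month`);
```

This ignores that a warm cache usually shortens self-hosted runs, so treat the result as a conservative threshold and compare it against the macOS minutes your fork actually bills.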

2) Environment checklist: Xcode CLT, Node 22, CI user isolation

OpenClaw runs most Node shards on Linux, but macos-node and checks-node-compat-node22 require Node 22 (see CI docs and local pnpm check parity). On your runner host:

Xcode Command Line Tools plus a team-frozen Xcode minor; Swift lane needs full Xcode, not CLT alone.
Node 22.x via fnm or mise in the CI user shell; still use actions/setup-node in workflows.
pnpm aligned with upstream packageManager; cache under the CI home, not your daily dev account.
Dedicated Unix user (e.g. runner) for actions-runner—not the same user as browser/chat secrets.
Resource caps: watch memory_pressure; only parallelize a second job on 24GB after measurement.

Pre-flight checklist (printable):

xcode-select -p and xcodebuild -version match your doc.
node -v is v22.x; pnpm -v matches the lockfile policy.
Runner survives SSH logout via launchd.
Free disk > 25% (DerivedData + pnpm store grow fast).
Split PATs: webhook/reasoning secrets vs signing certs in separate GitHub Secrets.
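Two of the checklist items (Node 22.x and more than 25% free disk) are easy to automate. The following is a minimal sketch; the function names are hypothetical, and `checkHost` assumes a Node version that ships `fs.statfsSync` (18.15 or later).

```typescript
import { statfsSync } from "node:fs";

// True when `node -v` style output (e.g. "v22.11.0") is on the 22.x line.
function isNode22(version: string): boolean {
  return /^v?22\.\d+\.\d+$/.test(version.trim());
}

// True when strictly more than `minFraction` of the volume is free.
function hasFreeDisk(freeBytes: number, totalBytes: number, minFraction = 0.25): boolean {
  return totalBytes > 0 && freeBytes / totalBytes > minFraction;
}

// Checks the running Node binary and the volume that holds `path`.
function checkHost(path: string): { nodeOk: boolean; diskOk: boolean } {
  const vol = statfsSync(path);
  return {
    nodeOk: isNode22(process.version),
    diskOk: hasFreeDisk(vol.bavail * vol.bsize, vol.blocks * vol.bsize),
  };
}

console.log(isNode22("v22.11.0"), hasFreeDisk(30, 100));
```

Running `checkHost` against the runner's `_work` directory as the CI user, for example from a scheduled job, would flag drift before a lane goes red for environmental reasons.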

Platform APIs: Apple Developer Documentation; Node policy: nodejs.org.

3) Runner registration & macOS label pool (HowTo)

Follow GitHub’s self-hosted runner guide. For an OpenClaw fork, map upstream jobs to explicit labels rather than relying on the bare self-hosted label alone.

Registration & labels (repo runner)

# in ~/actions-runner
./config.sh --url https://github.com/YOUR_ORG/YOUR_FORK \
  --token YOUR_REGISTRATION_TOKEN \
  --labels self-hosted,macOS,ARM64,openclaw-macos-node,openclaw-m4-16 \
  --name nuv-openclaw-macos-node-01

# On macOS, run svc.sh as the CI user without sudo; it installs a per-user launchd agent.
./svc.sh install
./svc.sh start

Label pool (split by lane):

openclaw-macos-node — upstream macos-node (TypeScript/Vitest macOS paths).
openclaw-macos-swift — macos-swift; heavier CPU/RAM—avoid default parallel with node on 16GB.
openclaw-node22 — optional for dispatch compat or local pnpm check smoke.
openclaw-m4-16 / openclaw-m4-24 — hardware tier for matrices and capacity planning.
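The routing rule behind the label pool is that a job is only assigned to a runner carrying every label listed in its `runs-on`. The sketch below models that rule so you can sanity-check a pool on paper; `runnerMatches` is a hypothetical helper, not part of any GitHub tooling.

```typescript
// A runner is eligible only if it has all labels requested by runs-on.
// GitHub documents labels as case-insensitive, so both sides are lowercased.
function runnerMatches(runnerLabels: string[], runsOn: string[]): boolean {
  const have = new Set(runnerLabels.map((label) => label.toLowerCase()));
  return runsOn.every((label) => have.has(label.toLowerCase()));
}

const node01 = ["self-hosted", "macOS", "ARM64", "openclaw-macos-node", "openclaw-m4-16"];

// The node lane lands on this runner; the Swift lane does not.
console.log(runnerMatches(node01, ["self-hosted", "macOS", "openclaw-macos-node"]));
console.log(runnerMatches(node01, ["self-hosted", "macOS", "openclaw-macos-swift"]));
```

This is why a job that requests only `self-hosted` can land on any machine in the pool, including one you reserved for Swift.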

Fork workflow: point macos-node at your pool

jobs:
  macos-node:
    runs-on: [self-hosted, macOS, openclaw-macos-node]
    steps:
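      - uses: actions/checkout@v4
      # cache: 'pnpm' needs pnpm on PATH before this step (preinstalled for the
      # CI user, or add pnpm/action-setup first).
      - uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'pnpm'
      - run: pnpm install --frozen-lockfile
      # Placeholder test command; substitute the script your fork actually runs.
      - run: pnpm test --filter macos-relevant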

For codesign/notarization, add concurrency: { group: codesign-macos, cancel-in-progress: false } so two jobs never unlock keychains at once. More SSH/Runner notes: help center.

Org vs repo runners: A single-repo fork can use one repo-level runner with the labels above. If several plugins share one billing group, consider an organization runner with repo access limited to forks you control, and use runner groups in GitHub Enterprise (where available) so experimental repos cannot steal Swift-lane machines.

Upgrades: Pin the runner application version in your runbook; after each GitHub runner release, drain jobs (./svc.sh stop), upgrade binaries, restart, and confirm the machine shows Idle before re-enabling auto-assignment. Stale runners are a common source of “works locally, fails on CI” drift when the service protocol changes.

4) Wire OpenClaw CI / dispatch & read-only webhooks

Keep upstream preflight on push/PR; change only macOS runs-on to your pool. For full validation, use supported manual dispatch (from CI docs):

Trigger full CI graph (incl. macOS)

gh workflow run ci.yml --ref main
gh workflow run ci.yml --ref main -f target_ref=your-branch -f include_android=true

Read-only webhook reasoning (phased): For failure triage, subscribe to read-only events first (workflow_run.completed, check_run.completed, workflow_job.completed) and persist metadata only (repo, branch, conclusion, URLs). Do not request Contents write or auto-merge in phase 0. After signature verification, let agents fetch logs through read-only Actions API scopes; expand to comments/labels only under an approval policy.

Phase	Webhook / permissions	Purpose
Phase 0	Read-only + `workflow_run` / `check_run`	Failure alerts, queue watch, Lobster triage input.
Phase 1	Actions log read PAT	Pull failed job log slices; tie to `macos-node` shards.
Phase 2	Controlled write (issue comment, label)	Auto `ci-failure`, pager mention—needs review gates.
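Every phase in the table depends on verifying the delivery before trusting it. GitHub signs each webhook with an HMAC-SHA256 of the raw request body and sends it in the `X-Hub-Signature-256` header as `sha256=<hex>`. A minimal verification sketch follows; the function name and the sample secret are illustrative assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verifies the X-Hub-Signature-256 header against the raw request body.
// Feed it the bytes as received, not re-serialized JSON.
function verifySignature(
  secret: string,
  rawBody: string | Buffer,
  header: string | undefined,
): boolean {
  if (!header || !header.startsWith("sha256=")) return false;
  const digest = createHmac("sha256", secret).update(rawBody).digest("hex");
  const expected = Buffer.from("sha256=" + digest, "utf8");
  const received = Buffer.from(header, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Self-check with a made-up secret and body.
const demoSecret = "example-secret";
const demoBody = JSON.stringify({ action: "completed" });
const demoHeader = "sha256=" + createHmac("sha256", demoSecret).update(demoBody).digest("hex");
console.log(verifySignature(demoSecret, demoBody, demoHeader));
console.log(verifySignature(demoSecret, demoBody + " ", demoHeader));
```

A receiver would typically answer a failed check with an error status and log it in its own bucket, separate from upstream auth failures.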

vs Gateway: Gateway serves agent tools and local hooks; CI webhooks consume GitHub events only. Never treat PR titles or comment bodies as shell commands in production; keep triage metadata-only (ClawSweeper-style minimal forwarding).

When wiring the receiver, log only stable fields: workflow_run.id, head_branch, conclusion, and HTML URLs to the run. Store HMAC verification failures separately from auth failures so on-call can tell misconfigured secrets from replay attacks. If you later add Lobster, pass those fields as JSON stdin—do not paste full webhook bodies into chat prompts where a malicious PR title could influence tool selection.
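The "stable fields only" rule can be enforced in code by copying an allowlist out of the payload and discarding the rest. This is a sketch under assumptions: `RunRecord` and `toRunRecord` are hypothetical names, while the payload keys (`workflow_run.id`, `head_branch`, `conclusion`, `html_url`, `repository.full_name`) are standard `workflow_run` event fields.

```typescript
// Fields persisted in phase 0; everything else in the payload is discarded.
interface RunRecord {
  repo: string;
  runId: number;
  headBranch: string | null;
  conclusion: string | null;
  htmlUrl: string;
}

type Json = Record<string, unknown>;

// Narrows an unknown value to a plain object, or null.
function asObject(value: unknown): Json | null {
  return typeof value === "object" && value !== null ? (value as Json) : null;
}

// eventName is the X-GitHub-Event header. Returns null unless this is a
// completed workflow_run delivery with a numeric run id.
function toRunRecord(eventName: string, payload: unknown): RunRecord | null {
  const root = asObject(payload);
  const run = asObject(root?.workflow_run);
  const runId = run?.id;
  if (eventName !== "workflow_run" || root?.action !== "completed" || !run || typeof runId !== "number") {
    return null;
  }
  const branch = run.head_branch;
  const conclusion = run.conclusion;
  const url = run.html_url;
  return {
    repo: String(asObject(root?.repository)?.full_name ?? ""),
    runId,
    headBranch: typeof branch === "string" ? branch : null,
    conclusion: typeof conclusion === "string" ? conclusion : null,
    htmlUrl: typeof url === "string" ? url : "",
  };
}

// Made-up delivery: the hostile display_title never reaches the record.
const sampleDelivery = {
  action: "completed",
  workflow_run: { id: 42, head_branch: "main", conclusion: "failure",
    html_url: "https://example.invalid/runs/42", display_title: "$(rm -rf /)" },
  repository: { full_name: "YOUR_ORG/YOUR_FORK" },
};
console.log(JSON.stringify(toRunRecord("workflow_run", sampleDelivery)));
```

Because the record is built field by field, a new or attacker-controlled key in the payload cannot reach storage or a prompt by accident.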

5) M4 16GB / 24GB concurrency cheat sheet (single-machine label pool)

Pragmatic guidance for OpenClaw macOS lanes (not an SLA; strongly repo-dependent). For extra seats or regions, see the TCO article—no six-region table here.

Tier	Suggested parallelism	Labels	Risks
M4 16GB	1× `macos-node` or 1× light Swift—not both at once.	`openclaw-macos-node`, `openclaw-m4-16`	Vitest OOM; DerivedData vs pnpm store on disk.
M4 24GB	1× `macos-node` + 1× off-peak Swift check; or staggered 2× node shards.	`openclaw-m4-24`; keep Swift on its own runner name.	Parallel Swift compile can still warn on memory—use `concurrency` groups.
Two machines	One node, one swift—route by label, do not mix on one runner.	`openclaw-macos-swift` dedicated	Caches not shared—warm deps per host.
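The table's parallelism guidance can be written down as a small admission rule, which is handy when reviewing `concurrency` groups. The sketch below is one reading of the cheat sheet, not a limit enforced by GitHub; the `Tier`, `Lane`, and `canStart` names are hypothetical.

```typescript
type Lane = "macos-node" | "macos-swift";
type Tier = "m4-16" | "m4-24";

// Encodes the cheat sheet: 16GB runs one job at a time; 24GB allows
// node + swift or two node shards, but never two Swift builds.
function canStart(tier: Tier, running: Lane[], next: Lane): boolean {
  if (running.length === 0) return true;
  if (tier === "m4-16" || running.length >= 2) return false;
  // 24GB with exactly one job running: refuse only swift-on-swift.
  return !(running[0] === "macos-swift" && next === "macos-swift");
}

console.log(canStart("m4-16", ["macos-node"], "macos-swift"));
console.log(canStart("m4-24", ["macos-node"], "macos-swift"));
```

A single runner process takes one job at a time, so any parallelism on one host means registering a second runner instance on that machine and letting labels plus `concurrency` groups do the gating.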

Nuvcloud offers multi-node M4 bare metal with RAM/SSD upgrades—good for fixed CI seats. Availability and price: plans and checkout pages; we do not promise queue SLAs in this article.

When sizing SSD, budget for three growth curves: Xcode DerivedData, pnpm store, and GitHub Actions work directories under _work/. A 256GB base SKU can work for a single-lane node fork if you schedule weekly DerivedData cleanup; Swift-heavy forks should plan 512GB or the 1TB add-on described on the pricing wizard. Memory pressure on 16GB often shows up as killed node children during Vitest. If dmesg or Console shows jetsam events, move Swift to a 24GB host instead of chasing smaller test shards.

6) Lobster workflow: reproducible CI failure log triage

Lobster collapses multi-step tool chains into one deterministic call with approval points and a resumeToken, which suits a “CI red → fetch logs → classify → human OK → post summary” flow better than letting an LLM call tools freestyle. Enable alsoAllow: ["lobster"] in agent config, then run a pipeline.

Lobster: failed job summary (example pipeline JSON)

{
  "action": "run",
  "pipeline": "exec --json --shell 'gh run list --workflow ci.yml --limit 5 --json databaseId,conclusion,headBranch' | exec --stdin json --shell 'gh run view $ID --log-failed' | exec --stdin json --shell 'node scripts/ci-triage-summarize.mjs' | approve --preview-from-stdin --limit 3 --prompt 'Post summary to on-call channel?'",
  "timeoutMs": 120000,
  "maxStdoutBytes": 512000
}

Split steps in a .lobster workflow file: collect → classify → approval → notify (see Lobster “workflow files”). When status is needs_approval, resume with:

Resume after approval

{
  "action": "resume",
  "token": "<resumeToken>",
  "approve": true
}

Reproducible triage habits: Log each failure’s run_id, triggering label (e.g. openclaw-macos-node), and a hash of the first stderr block in your team wiki. Scrub Lobster JSON before posting to issues—secrets sometimes leak in log snippets. On timeout, raise timeoutMs or split “fetch logs” from “LLM summarize.” Keep a 7-day read-only archive per failed run to separate flaky from real regressions.
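The habits above (a hash of the first error block, and scrubbing before posting) might look like the sketch below. Two assumptions are baked in: the "first block" is taken to be the text up to the first blank line, and only GitHub-style token prefixes are masked. Both function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Masks strings shaped like GitHub tokens (ghp_, gho_, ghu_, ghs_, ghr_,
// github_pat_). Other secret formats need their own patterns.
function scrub(text: string): string {
  return text.replace(
    /\b(?:gh[pousr]_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{20,})\b/g,
    "[REDACTED]",
  );
}

// Short, stable ID for a failure: SHA-256 of the first blank-line-delimited
// block, with digit runs collapsed so timestamps do not change the hash.
function fingerprint(log: string): string {
  const firstBlock = log.trim().split(/\n\s*\n/)[0] ?? "";
  const normalized = firstBlock.replace(/\d+/g, "N");
  return createHash("sha256").update(normalized).digest("hex").slice(0, 12);
}

const monday = "12:01:07 FAIL src/mac/paths.test.ts\nexpected 3, got 4\n\nmore output";
const tuesday = "09:44:52 FAIL src/mac/paths.test.ts\nexpected 3, got 4\n\nother output";
console.log(fingerprint(monday) === fingerprint(tuesday));
```

Collapsing digits trades precision for stability: two failures that differ only in a line number share a fingerprint, which suits flaky-versus-real bucketing but not exact deduplication.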

Teams often keep two pipelines: a fast Lobster flow that only classifies failure type (Node vs Swift vs infra) and a slow flow that fetches full logs after a human approves. That mirrors OpenClaw’s own CI philosophy—cheap preflight, expensive lanes only when needed—and prevents an LLM from burning tokens on green runs.
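The fast "classify only" flow does not need a model for a first cut. A minimal sketch follows, using only the job name and a short log excerpt; the patterns are illustrative guesses rather than anything taken from OpenClaw or Lobster, and `classifyFailure` is a hypothetical name.

```typescript
type FailureKind = "node" | "swift" | "infra" | "unknown";

// Cheap first-pass classifier. Infra signals win over lane signals so a
// full disk on the Swift runner is not filed as a Swift regression.
function classifyFailure(jobName: string, excerpt: string): FailureKind {
  if (/no space left on device|ECONNRESET|ETIMEDOUT|lost communication/i.test(excerpt)) {
    return "infra";
  }
  if (/swift/i.test(jobName) || /xcodebuild|swiftlint|\.swift:\d+/i.test(excerpt)) {
    return "swift";
  }
  if (/node/i.test(jobName) || /vitest|pnpm|ERR_/i.test(excerpt)) {
    return "node";
  }
  return "unknown";
}

console.log(classifyFailure("macos-swift", "Sources/App/Main.swift:12: error: cannot find type"));
console.log(classifyFailure("macos-node", "ENOSPC: no space left on device"));
```

Anything that comes back `unknown` is a reasonable candidate for the slower, human-approved flow that fetches full logs.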

If you already use gh in CI, the same PAT scope works on the runner host for Lobster: in phase 0, restrict it to read-only Actions and Metadata permissions (actions:read, metadata:read). Rotate the PAT when staff leave; never reuse the runner registration token as an API credential.

7) FAQ

Q1: Runner shows Offline?
Check the launchd service, registration token expiry, and free disk space. If the runner stays offline, remove it under GitHub → Settings → Actions → Runners and re-run config.sh; see the self-hosted runner docs.

Q2: macos-node complains about Node?
Align Node 22 on the host, setup-node, and checks-node-compat-node22; clear stale global npm caches if needed.

Q3: Swift / SDK not found?
Install full Xcode, run sudo xcodebuild -license accept; never run macos-swift on CLT-only hosts.

Q4: Read-only webhook stuck at 403?
Verify secret, TLS, reverse-proxy path, App event permissions; gateway must not execute PR body text as commands.

Q5: openclaw doctor vs CI red?
doctor is install/gateway health; CI failures live in Actions logs. Use Lobster + gh run view, not onboard docs, for workflow debugging.

Q6: Can we skip macOS and run Linux only?
Yes for pure Node/doc PRs, but before merging macOS/Swift changes run at least one workflow_dispatch full graph or you diverge from upstream preflight intent.

Q7: Share one runner between upstream and fork?
Not recommended—split runner groups and Secrets so untrusted fork scripts cannot reach upstream signing material.

Q8: Lobster invalid JSON?
Each pipeline step’s stdout must be JSON; raise maxStdoutBytes if large output is being truncated, and do not pipe human-readable logs into the next step without --json.

Q9: Mix hosted and self-hosted?
Yes—e.g. PR on hosted, main macos-swift on self-hosted via if: or separate workflow files.

More posts: tech blog index. For region-level billing and parallel seats, continue with the macOS Runner regions & TCO guide.