The tooling surface
Agentic development works when the toolchain makes the work visible: what changed, why it changed, how it was verified, what risk remains, and what needs a human decision.
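As a minimal sketch, those five questions could be captured as a record that travels with every agent change. The `ChangeReport` name, its fields, and the `missing_fields` helper are hypothetical illustrations, not part of any tool mentioned here, and the choice of which fields are required is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeReport:
    """Hypothetical summary an agent attaches to a unit of work."""

    what_changed: str
    why: str
    # How the change was verified, e.g. "unit tests passed".
    verification: list[str] = field(default_factory=list)
    # Known risks that remain after verification (may be empty).
    remaining_risk: list[str] = field(default_factory=list)
    # Open questions that need a human decision (may be empty).
    human_decisions: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Return the names of required fields that were left empty.

        Assumption: a report needs at least a description, a reason,
        and one verification step before a reviewer can act on it.
        """
        missing = []
        if not self.what_changed.strip():
            missing.append("what_changed")
        if not self.why.strip():
            missing.append("why")
        if not self.verification:
            missing.append("verification")
        return missing


report = ChangeReport(
    what_changed="Renamed the settings route",
    why="Old path collided with the admin page",
    verification=["unit tests passed"],
)
print(report.missing_fields())
```

A report with an empty `missing_fields()` result is not necessarily correct, only complete enough for a reviewer to accept or reject.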
The progression I keep testing runs in five stages: a single agent doing a task, role-separated agents, review agents, babysitting flows that watch PRs and CI, and finally automation that keeps low-risk work moving while preserving human review for consequential changes.
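The last stage of that progression can be sketched as a small routing rule. This is an illustration under stated assumptions: the `Risk` and `Route` enums, the `route_change` function, and the two-level risk model are hypothetical, and a real setup would likely derive risk from the files touched, the size of the change, or similar signals.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    CONSEQUENTIAL = "consequential"


class Route(Enum):
    AUTO_PROCEED = "auto_proceed"
    HUMAN_REVIEW = "human_review"


def route_change(risk: Risk, checks_passed: bool) -> Route:
    """Decide whether a change may proceed without a human.

    Assumption: only low-risk changes with passing checks move on
    automatically; everything else waits for human review.
    """
    if risk is Risk.LOW and checks_passed:
        return Route.AUTO_PROCEED
    return Route.HUMAN_REVIEW


print(route_change(Risk.LOW, checks_passed=True).value)
print(route_change(Risk.CONSEQUENTIAL, checks_passed=True).value)
```

Defaulting to `HUMAN_REVIEW` means an unexpected or failing case falls back to a person rather than to automation.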
Proof points
| Claim | Evidence | Artifact |
|---|---|---|
| Developer tools should make agent work reviewable. | PR babysitting, CI watching, browser QA, screenshots, and summaries make agent output easier to accept or reject. | Control plane article |
| Agent platforms need reusable skills and workflow memory. | OpenClaw-style workspaces preserve task state, regressions, operating notes, and reusable role behavior. | OpenClaw and agent tooling |
| Tooling is strongest when it ships product. | Swoleby uses the agent workflow against real UX, content, approvals, SMS behavior, and release pressure. | Swoleby |