AI Bootcamp — Planning Notes

Working notes for the AI bootcamp for the PopSockets dev team. Captures what to cover, what devs specifically underuse, and hot tips worth demoing live.

The Getting Started guide is the reference doc devs take home. This page is the live session agenda.

Goal

Teach PS devs how to integrate AI into their daily workflows. The first deliverable — a Getting Started reference doc — is already live. The bootcamp is its companion: five live sessions over a week, building from foundations to advanced tooling.

Session Plan

| # | Theme | Length | Core skill |
|---|-------|--------|------------|
| 1 | Foundations & Setup | 30-45 min | Get the stack running + PopForge orientation |
| 2 | Daily Dev Workflow | 30-45 min | Coding tools, permissions, PR workflows, AI in the inner loop |
| 3 | Debugging & Ops | 30-45 min | Logs, metrics, cross-system tracing |
| 4 | Collaboration & Delivery | 30-45 min | PRs, specs, tickets, handoffs |
| 5 | Advanced & Ecosystem | 30-45 min | Parallel agents, MCP, mycelium, agent hierarchy |

Session 1 — Foundations & Setup

Get everyone on the stack. By the end of this session, every dev should have a functional Claude Code setup wired up to their actual projects.

Cover

  • Claude Code / Copilot CLI basics — install, auth, the REPL loop
  • CLAUDE.md / AGENTS.md — project context files; show a real one from popdocs or OMS
  • Memory files — so AI doesn't re-learn your stack every conversation
  • AeroSpace — tiling window manager for terminal management; live-demo workspace setup

PopForge Primer

Elevator pitch: A test-order factory + pipeline tracker. Creates fake orders the way a real retailer or shopper would, watches each one traverse OMS → NAV → 3PL → EDI, and shows a dashboard of where everything is stuck. Black-box E2E integration harness — not a unit test framework, not a prod tool.

Why it exists: Before PopForge, validating an order-flow change meant hand-crafting a test order — manual SFCC checkout, hand-edited 850 XML, or asking someone in OMS to push one through. Every dev had their own ad-hoc recipe. "Drop a realistic order, watch it traverse 6 systems, see what broke" is now one click.

Where it fits — sidecar to the dev/QA/stage instances of everything:

flowchart LR
    PF((PopForge<br/>Flask + Postgres<br/>+ test harness))

    PF -->|OCAPI| SFCC
    PF -->|GraphQL| OMS
    PF -->|SOAP| NAV
    PF -->|REST| Cirro
    PF -->|SFTP 940 / 850| SFTP[PS SFTP]
    PF -->|SFTP 850| SPS[SPS Commerce]
    PF -->|HTTP + actuator| Camel
    PF -->|read| EDI[(edi_docs<br/>Postgres)]

    style PF fill:#ffe066,stroke:#333,stroke-width:3px
    style EDI fill:#e9ecef,stroke:#333

Nothing in prod calls PopForge. You click buttons; PopForge acts.

Two Major Flow Types

PopForge simulates two distinct order-flow families. They share infrastructure (edi_docs, NAV, Cirro) but diverge in how orders enter the system and how they're confirmed.

B2C — direct-to-consumer. Originates at a storefront (SFCC, Shopify planned). PopForge creates a realistic shopper order via OCAPI and then watches it traverse OMS, NAV release, optional batching, Cirro fulfillment, shipping confirmation, NAV shipment notice, and payment capture.

sequenceDiagram
    participant PF as PopForge
    participant SFCC
    participant OMS
    participant NAV
    participant Camel
    participant Cirro

    PF->>SFCC: Create synthetic order (OCAPI)
    OMS->>SFCC: Poll new orders
    PF-->>OMS: Watch status (GraphQL)

    OMS->>NAV: Create sales order (SOAP)
    Note over OMS: PENDING_NAV_RELEASE

    NAV-->>Camel: ASB nav-order-release
    Camel->>OMS: PATCH status
    Note over OMS: RELEASED_FROM_NAV

    OMS->>Camel: POST fulfillment request
    Camel->>Cirro: Create order
    Note over OMS: WAITING_FULFILLMENT_CONFIRMATION

    Cirro-->>Camel: Acceptance webhook
    Camel->>OMS: Confirm
    Note over OMS: EXPORTED_TO_TPL

    Cirro-->>Camel: Shipping webhook (tracking)
    Camel->>OMS: Fulfill orders
    Note over OMS: SHIPPED

    OMS->>NAV: Ship notice (cron)
    OMS->>SFCC: Status + payment capture
    PF-->>OMS: Poll until SHIPPED + FULFILLED

Deep dive: B2C Order Flow.
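
The final `PF-->>OMS: Poll until SHIPPED + FULFILLED` step in the diagram is a plain poll-with-timeout loop. A minimal sketch — function name, timings, and terminal statuses here are assumptions for illustration, not PF's actual code:

```python
import time

def poll_until(fetch_status, targets=("SHIPPED", "FULFILLED"),
               timeout_s=300.0, interval_s=5.0):
    """Poll fetch_status() until it returns one of `targets` or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in targets:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"order stuck in {status!r} after {timeout_s}s")
        time.sleep(interval_s)

# Usage with a stubbed status source:
statuses = iter(["PENDING_NAV_RELEASE", "RELEASED_FROM_NAV", "SHIPPED"])
print(poll_until(lambda: next(statuses), interval_s=0))  # → SHIPPED
```

The timeout matters: a synthetic order that never converges is itself the signal that something broke mid-pipeline.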

B2B — retail / EDI. Originates from a retailer via EDI 850. PopForge generates a retailer-specific 850 XML from a template, drops it on SPS SFTP (or PS SFTP, depending on the test), Camel processes it, NAV generates a 940, Cirro fulfills (with or without prepack), and the full 856/945/810 chain flows back.

sequenceDiagram
    participant PF as PopForge
    participant SPS as SPS Commerce
    participant EDI as cm-edi-prc
    participant NAV
    participant SFTP as PS SFTP
    participant Order as cm-order-prc
    participant Cirro
    participant DB as edi_docs (PG)

    PF->>PF: Render 850 from retailer template<br/>(target-009.xml, walmart-529.xml, …)
    PF->>SPS: Drop 850 XML

    SPS->>EDI: 850 via ASB
    EDI->>DB: Archive 850<br/>(waiting_for_940)
    EDI->>SFTP: Upload 850 to /testin/{env}/
    NAV->>SFTP: Poll 850
    PF-->>DB: Watch stage progression

    Note over NAV: Generate 940
    NAV->>SFTP: Drop 940
    EDI->>SFTP: Poll 940
    EDI->>DB: Archive 940<br/>(waiting_for_packing_instructions)

    Order->>NAV: Packing instructions (SOAP)
    Order->>Cirro: Create B2B order<br/>(prepack branch if required)
    PF-->>NAV: Query sales order (SOAP)
    PF-->>Cirro: Query order status (REST)

    Cirro-->>EDI: Shipment webhook
    EDI->>DB: Insert 856 + 945 records
    EDI->>SPS: Upload 856 ASN
    EDI->>SFTP: Upload 945

    NAV->>SFTP: Drop 810 invoice
    EDI->>SFTP: Poll 810
    EDI->>DB: Archive 810

    Note over PF: derive_status() reads DB →<br/>850 → 940 → 856 → 945 → 810

PopForge's B2B flow is largely passive observation of edi_docs — it watches the doc-type cascade to determine pipeline stage, augmented by direct NAV/Cirro queries for cross-verification.

Deep dive: EDI Pipeline.

Other flow families not yet simulated (see roadmap below): Jackyun/China, Amazon Direct Fulfillment.

Five concepts worth internalizing:

  1. Pipeline stages are ordered; status never regresses. derive_status() returns max(current_stage_index, evidence_derived_index). If you see an order regress stages in the UI, that's a bug — find the cause, don't add a workaround.
  2. One order has many identities — name, oms_id, po_number, nav_so, sfcc_order_id, cirro_order_id. When joining across systems, always check which ID you're holding.
  3. EDI templates are per-retailer. templates/edi/target-009.xml, walmart-529.xml, staples-549.xml, etc. Adding a retailer = adding a template.
  4. edi_docs is shared state you don't own. Postgres integration_<env>.edi_docs is written by cm-edi-prc. PopForge only reads. Schema changes happen in Camel, not here.
  5. Environments are first-class. Every action picks dev/qa/stage/sandbox/sandbox2/staging/prod. Different OMS tokens, different PG schemas, different APIM URLs, different SFTP conventions (stage SFTP folder is uat, not stage — Mule legacy). If your code picks up an env value from a global, it's wrong.
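
Concept 1 fits in a few lines. A hypothetical reconstruction of the no-regression rule — not the real derive_status(), and the stage list here is just the B2B doc-type cascade:

```python
PIPELINE_STAGES = ["created", "850", "940", "856", "945", "810"]

def derive_status(current_stage: str, evidence_stages: list[str]) -> str:
    """Status never regresses: take the max of current stage and evidence."""
    current_idx = PIPELINE_STAGES.index(current_stage)
    evidence_idx = max(
        (PIPELINE_STAGES.index(s) for s in evidence_stages if s in PIPELINE_STAGES),
        default=0,
    )
    return PIPELINE_STAGES[max(current_idx, evidence_idx)]

# Stale evidence never moves an order backwards:
print(derive_status("940", ["850"]))         # → 940
print(derive_status("850", ["850", "856"]))  # → 856
```

The second call is the interesting case: evidence ahead of the recorded stage advances it, which is exactly why a regression in the UI always means a bug upstream.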

Stack: Python 3.10+ · Flask (single file, ~10k lines of dashboard.py) · Jinja2 + vanilla JS + Tailwind · Playwright pinned to 1.40.0 · pytest · PostgreSQL (shared with Camel) + SQLite (own state) · Azure Service Bus · AKS (ps-usw-aks-01, namespace popforge) · Datadog logs · Key Vault + CSI for secrets.

Repo: git@popgit:PopSockets/popforge.git. Trunk-based, push to main, manual deploy via gh workflow run deploy.yaml. Short Git SHA baked into the image and shown in dashboard header — always verify SHA after deploy.

"Hello world" task — change a pipeline-stage badge color:

  1. source .venv/bin/activate && FLASK_DEBUG=true python3 dashboard.py
  2. Grep dashboard.py for PIPELINE_STAGES = [
  3. Find a stage, locate its Tailwind class (e.g. bg-purple-500), change it
  4. Save → auto-reload → refresh browser → done

Top gotchas to flag in the live session:

  • Python → HTML → JS escaping stacks three deep in dashboard.py. Inline template strings contain triple-quoted HTML containing <script> with JS inside. \\' is sometimes the right answer. Capture DOM refs before await, never after.
  • NAV truncates <PurchaseOrderNumber> at 20 chars silently. External_Document_No is varchar(20). Generate a longer PO and NAV imports the truncated value — later lookups by the original PO return nothing. Keep POs ≤ 20 chars.
  • 850 XML must include <UnitPrice> in every <OrderLine>. cm-edi-prc.identifyEdiType() uses that to distinguish doc types. Omit it → silently misclassified → vanishes.
  • Tailwind hidden + flex conflict (both set display). Use inline style="display:none" + element.style.display to toggle.
  • Playwright 1.40.0 pin is load-bearing — Docker base image matches exactly. Bump it, bump the image in lockstep or the container breaks.
  • CSI secret sync is async. After adding a new secret to pop-kev-1: kubectl delete secret popforge-secrets -n popforge && kubectl rollout restart deploy/popforge -n popforge.
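
The PO-truncation gotcha is cheap to guard against at generation time. A hedged sketch — the helper name is made up, but the varchar(20) limit is real:

```python
import uuid

MAX_NAV_PO_LEN = 20  # NAV's External_Document_No is varchar(20); longer POs truncate silently

def make_test_po(prefix: str = "PF") -> str:
    """Generate a synthetic PO number short enough to survive NAV import intact."""
    po = f"{prefix}-{uuid.uuid4().hex.upper()}"[:MAX_NAV_PO_LEN]
    assert len(po) <= MAX_NAV_PO_LEN
    return po

print(make_test_po())  # e.g. PF-7C9A1B2D3E4F5A6B7
```

Capping at the source beats chasing a lookup-by-PO that mysteriously returns nothing an hour later.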

What PopForge doesn't do — quick list to prevent scope confusion: not prod, not a unit test framework, not an EDI parser (that's cm-edi-prc), not a load tester, not an admin UI for OMS/NAV/Cirro, doesn't own pipeline status (it derives).

Roadmap — flows to add:

  • Arena ECO creation — simulate an Engineering Change Order in Arena and verify propagation to downstream systems. Should land in cm-product-sys (Arena poller), cm-product-prc (fan-out), OMS (ProductDetail / OmsComponentDetail), PrintStation (batching-side product update), and Jackyun (SKU/BOM upsert via cm-jackyun-sys HTTP — separate path from ASB). Catches silent drift between the four downstream systems.
  • PrintStation flow simulation — today PF can create the upstream order but can't simulate batching/printing completion. Simulator would inject a fake "batch complete" ASB message on int-prod-batching-update-oms so E2E printable-order tests don't wait for a real print run.
  • Cancellation flow — currently PF creates but doesn't retract. Cover NAV DocType 7 (Nav::ExportCancellationJob), retailer-specific timing, and refund-to-NAV coupling. Lots of bugs hide in cancellation paths because they rarely get exercised.
  • Jackyun / China orders — currently invisible to PF. Routes through XB 3PL (not Cirro) with a completely separate fulfillment path. Order creation via Qimen API, polling cadence, SKU/BOM preconditions. Bugs here won't be caught by any existing PF flow.
  • Amazon Direct Fulfillment — SP-API ack + label + packing slip lifecycle, with its own OMS status branch (PENDING_ACKNOWLEDGMENT → ORDER_ACKNOWLEDGED → PENDING_SHIPPING_LABEL → AMZ_SHIPPING_LABEL_SUCCESS, etc.). Different enough from Cirro that it deserves its own simulator.

Lower-priority candidates also worth considering: Shopify B2C (PF primer calls this "planned" — needed for parity with SFCC), retained-order review workflow (RETAINED → RETAIN_APPROVED/_REJECTED).

Infrastructure roadmap — split PopForge itself into prod / non-prod deployments. Today it's a single AKS deployment in the popforge namespace that talks to every other system's env. A non-prod PopForge instance would let us test PF itself (new flows, breaking dashboard changes, Playwright bumps) without risking the prod-facing test harness. The envs PF talks to are already first-class; the envs PF runs in are not — that's the gap to close.

Deep dive — full primer lives in the popforge repo at docs/ai-strategy/bootcamp.md (10 gotchas, full tech stack table, "day 2" task, quick-reference file map).

PopWatch (planned) — the watchdog

Status: roadmap, not built yet. Mention in Session 1 so devs know what's coming.

Elevator pitch: PopWatch is PopForge's sibling. Where PopForge is the test-order factory (devs click buttons to create synthetic orders on demand), PopWatch is the watchdog — active probes running on a cadence across PS envs, watching logs, catching regressions before customers do.

Shape:

  • Active, cadence-driven — runs continuous probes against prod first, then extends to non-prod envs. Not a passive log-scraper; it actively pokes the stack.
  • Reuses PopForge where possible — rather than rebuilding the test-order machinery, PF should expose endpoints that PopWatch invokes to kick off probes. Clean separation of concerns: PF owns "how to create a realistic order," PopWatch owns "when to run probes and what to alert on."
  • Multi-env, prod-first — start with prod monitoring, then add non-prod so regressions get caught before release.
  • Its own deployment — separate from PopForge (different cadence, different failure modes, different blast radius).

Open questions to work out before building:

  • Alert channel — Datadog alerts? Teams webhook? Both?
  • Dashboard — standalone UI or land in existing Datadog / Grafana?
  • Probe catalog — which E2E flows get probed? SFCC order, 850 drop, Arena item sync, etc.?
  • Failure policy — how many consecutive failures before paging someone?

Demo

Walk through setting up CLAUDE.md + a memory file for a sample project. Show the before/after: a cold session vs one with context.


Session 2 — Daily Dev Workflow

AI in the inner loop — how devs should actually be using it while writing code, reviewing PRs, and shipping changes.

Cover

  • Code generation — real examples from PS work
  • Understanding unfamiliar codebases / onboarding to new repos — walk through asking AI to explain a service you've never touched
  • API exploration — hitting endpoints, reading responses, figuring out schemas live
  • Permissions — tuning what Claude can do without asking, so you move fast without prod accidents
  • Documentation — generating + editing docs alongside code
  • Cross-referencing docs against actual implementation — catching doc drift

Coding Tools Landscape

A quick tour of what's out there so devs can pick what fits their workflow. Not a "use this one" — a "here's the landscape, try a few, pick the one that sticks."

| Tool | Shape | Best for |
|------|-------|----------|
| Claude Code CLI | Terminal-native agent. Reads/writes files, runs commands, holds a real session. | Deep work — architecture, refactors, cross-file edits. Lives where you already live. |
| Claude.ai Projects | Web chat with persistent context per project. | Exploratory Q&A, brainstorming, writing without touching the repo. |
| GitHub Copilot | In-editor inline completion + chat. | Autocomplete on the hot path. Low-effort wins while typing. |
| Cursor | AI-first VS Code fork with deep repo context. | Heavy IDE users who want agent-style edits without leaving the editor. |
| Aider | CLI pair-programmer that commits directly to git. | Small, focused code changes with immediate version control. |

Poll the room: what is each dev currently using? Informs where we meet them.

Hot take: most devs underuse the terminal-native option (Claude Code CLI) because inline completion feels "safer." The terminal-native agents are where the productivity jumps live — they can read 10 files, run tests, and come back with a real answer. Worth a live demo.

Context tracker setup — worth flagging. The PS Claude Code setup has a statusline tool that shows how much context is left in the current session. Rule of thumb: when it drops to ~20%, run /compact before the next real task. Compact summarizes the session into a smaller context so you keep the important history without hitting the limit mid-task. Devs who don't compact early enough lose track of decisions, re-read the same files, and waste tokens re-deriving state they already had. Install the statusline, watch the number, compact at 20%.

Claude Permissions — blast radius vs. friction

Every Claude tool call hits the permission system: allow silently, ask the human, or deny. The goal is tuning so low-risk actions flow through and high-risk ones always stop for confirmation. Too permissive = prod accidents. Too restrictive = devs click "approve" on muscle memory and stop reading the prompt — which is the same as too permissive, just slower.

Two knobs:

  1. Rules in ~/.claude/settings.json (global) and .claude/settings.json (per-project). Tool-scoped glob patterns — Bash(git status:*) auto-allows any git status variant, Bash(rm:*) can be denied outright. Per-project overrides layer on top of global.
  2. Permission mode — default asks per tool; acceptEdits auto-approves file edits but still gates bash; plan blocks all side-effects (read-only); bypassPermissions is sandboxes/throwaway dirs only.
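
These rules map directly onto settings.json. A sketch of what a starting .claude/settings.json might look like — the exact rule strings are illustrative, so check them against your Claude Code version before copying:

```json
{
  "permissions": {
    "allow": [
      "Read", "Glob", "Grep",
      "Bash(git status:*)", "Bash(git log:*)", "Bash(git diff:*)",
      "Bash(gh pr view:*)", "Bash(gh pr diff:*)"
    ],
    "ask": [
      "Bash(git commit:*)", "Bash(git push:*)"
    ],
    "deny": [
      "Bash(rm -rf:*)", "Bash(git reset --hard:*)", "Bash(git push --force:*)"
    ]
  }
}
```

Per-project files layer on top of this, so a repo with a deploy script can tighten (never loosen) the global defaults.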

PS starting defaults (tune as you go):

| Class of action | Default | Why |
|-----------------|---------|-----|
| Read, Glob, Grep | allow | Read-only, zero blast radius |
| Bash(git status:*), Bash(git log:*), Bash(git diff:*) | allow | Read-only git |
| Bash(gh pr view:*), Bash(gh pr diff:*) | allow | Read-only GitHub |
| Edit / Write on project files | allow (or acceptEdits mode) | Local + reversible via git |
| Bash(git commit:*), Bash(git push:*) (non-force) | ask | Shared state, worth a beat |
| Bash(rm -rf:*), Bash(git reset --hard:*), Bash(git push --force:*) | deny | Irreversible, rarely actually needed |
| Anything hitting prod DBs, secrets, deploy targets | deny globally | Wrong blast radius for an AI autopilot |

Enforcement via hooks — pair the rules with the global directive in ~/.claude/CLAUDE.md that says never take a destructive action without listing what's affected and getting explicit confirmation. Rules are the technical block; the hook is the behavioral contract. Together = safe without being slow.

Rule of thumb:

  • Reversible + local → allow.
  • Shared state (push, comment on a PR, post a message) → ask.
  • Irreversible + prod → deny unless invoked explicitly with a specific reason.

Teams should expect to revisit these after the first week — over-asking gets disabled; under-asking gets a scare story. The right defaults are the ones devs don't fight against.

PR Workflows & Automation

The biggest daily win for most devs. Run AI against the diff before requesting human review — it'll catch low-effort stuff (missing tests, edge cases, convention drift) and free humans to review judgment calls.

Pattern to demo:

  1. Open a PR locally (or in Copilot CLI / Claude Code).
  2. gh pr diff <N> or just git diff main.
  3. Ask AI: review this diff against the project's conventions, flag obvious bugs, list edge cases that aren't tested, sanity-check the description.
  4. Fix what lands. Push. Only then request human review.

Tools worth showing:

  • gh CLI + AI — the cheapest setup. Pipe gh pr diff into an agent.
  • Claude Code /review style workflows — agent reads the PR, produces structured review output.
  • Custom GitHub Actions — automated AI pass on PR open, posts a review comment.

How the Camel team uses Claude for PR reviews

A real-world setup running today on four cm-* repos. Advisory only — it doesn't gate merges, it just posts a PR comment within a few minutes of open/push. Worth copying into other PS repos.

Setup:

  • Action: anthropics/claude-code-action@v1
  • Secret: ANTHROPIC_API_KEY (org-level)
  • Workflow file: .github/workflows/claude-review.yml in each repo
  • Output: one top-level PR comment (via gh pr comment) + inline comments on specific lines (via the GitHub inline-comment MCP tool)

Two trigger models:

| Model | Repos | When it fires |
|-------|-------|---------------|
| Auto on PR | cm-edi-prc, cm-order-prc, cm-int-service-sys | pull_request opened / synchronize / ready_for_review / reopened, gated on base branch |
| On-demand | cm-product-prc | Comment /review on any PR, OR gh workflow run claude-review.yml -f pr_number=<n> |

The on-demand model is newer — stops Claude from re-reviewing rapid-fire pushes (and burning API credits) on PRs that aren't ready. For noise-sensitive repos, copy cm-product-prc's pattern.
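
The on-demand trigger pair can be sketched as a claude-review.yml fragment. This is hedged — not copied from the actual workflow, so check cm-product-prc for the real thing:

```yaml
on:
  issue_comment:
    types: [created]        # fires on every PR comment, including "/review"
  workflow_dispatch:
    inputs:
      pr_number:
        required: true      # gh workflow run claude-review.yml -f pr_number=<n>
```

Because issue_comment fires on every comment, the job also needs an if: guard that checks the comment body for /review before invoking the review action.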

What Claude looks for (configured in each repo's prompt):

  1. Correctness in the service's actual flows — each prompt names the service's domain (e.g. "EDI 850/940/856/945, SFTP polling, Service Bus listeners, DLQ max delivery = 1"). Domain-specific = better signal.
  2. Spring Boot + Camel conventions — route classes by capability, DTOs under dto/, services under service/impl/, @Slf4j, Lombok.
  3. Checkstyle — Google baseline, 4-space indent, line length, magic numbers.
  4. Security — no committed credentials, no secrets in code, sensitive values from config.
  5. Error handling — Camel doTry/doCatch, onException, graceful Service Bus handling.
  6. Branch-aware scrutiny — PR to qa = first gate; PR to stage = flag config/env drift; PR to master = highest scrutiny.

The prompt explicitly says "be direct, no generic praise; if the PR looks good, say so briefly."

Wins worth bragging about:

  • Credential leak in application-*.yml (cm-edi-prc PR #41) — flagged plaintext DB passwords and SAS connection strings across prod/qa/stage YAMLs as BLOCKER. Re-reviewed on every push and tracked status across four rounds ("None of the four blockers identified in the March 30 review have been addressed"). Impossible to ignore.
  • Null-body NPE (cm-product-prc PR #135) — caught that POST /jackyun/resync with body null would NPE on skus.size(), returning HTTP 500. Would have surfaced the first time someone typo'd a curl call.
  • Wrong HTTP status on validation (PR #135) — size-cap guard was throwing IllegalArgumentException (→ 500) instead of ResponseStatusException(BAD_REQUEST) (→ 400).

Whiffs worth calling out:

  • Checkstyle nits dressed up as findings — constant placement or import grouping sometimes get surfaced as top-level "Finding" instead of one-line nits. Look at the labels; Nit findings are usually skippable unless they cluster.
  • Camel route DSL knowledge is shallow — Spring Boot reads clean; Camel from(...).process(...).to(...) chains and exchange properties are more hit-or-miss. Sanity-check Camel-specific feedback against actual route behavior.
  • No cross-PR memory — each review is fresh. Close a PR and open a near-identical one and Claude rediscovers the same issues from scratch.

Not covered yet — worth adding:

  • CI gating for BLOCKER findings on PRs to master
  • Test-coverage delta — Claude doesn't run ./mvnw test today
  • Liquibase / migration compat checks for rolling deploys
  • APIM parity guard — flag new @RequestMapping paths against both APIM instances
  • Branch-strategy enforcement — detect when an env branch is being merged into a release branch (wrong direction)
  • Cross-repo coordination prompts — "a new EDI status field here may need updates in cm-int-service-sys too"

Quickstart for a new PS repo: copy .github/workflows/claude-review.yml from cm-edi-prc (auto) or cm-product-prc (on-demand). Edit the prompt: block to describe the service and its gotchas. Confirm ANTHROPIC_API_KEY at the org level. Open a PR — Claude responds in 2-4 minutes.

Demo

Two parts:

  1. Pick a cm-* service most devs haven't touched. Use AI to produce a one-paragraph explanation + a call-graph. Verify against the code.
  2. Take a real open PR (ideally small). Run AI against it as a reviewer. Compare its feedback to what a human reviewer would say.


Session 3 — Debugging & Ops

AI for the stuff devs already do daily — logs, metrics, git, tracing.

Cover

  • DataDog searching — AI-driven log queries, correlating traces
  • ADX log analysis — Kusto queries without memorizing KQL
  • Letting AI read error logs and correlate across services (underused) — don't paste one log line, paste a chunk and ask it to build a timeline
  • Debugging across multiple systems — tracing a single request through OMS → Camel → Cirro
  • Git management — rebases, cherry-picks, resolving messy merges

Demo

Take a real recent incident (or a sanitized version). Show how AI correlates logs from two services to find the root cause faster than eyeballing Datadog.


Session 4 — Collaboration & Delivery

AI for the stuff between "code works" and "feature shipped." PR workflow lives in Session 2 — this session is everything around the code.

Cover

  • Technical specs before writing code (underused) — draft the spec with AI first, then implement against it
  • Writing migration plans and cutover checklists — real examples (EDI migration, NAV cutover)
  • Jira ticket creation — turning a Slack thread or meeting note into a well-formed ticket
  • Meeting notes → structured action items — recording + post-processing

Demo

Take a vague feature request ("we need to support X retailer's EDI spec"). Have AI draft the full spec, then a migration checklist, then the Jira tickets. Show the chain: ambiguous ask → structured plan in ~5 minutes.


Session 5 — Advanced & Ecosystem

The part that unlocks the leverage. Don't skip.

Cover

  • Parallel agents for independent tasks — when to split work across agents instead of serializing
  • MCP (Model Context Protocol) — connecting AI to external tools + services
  • Mycelium — inter-AI communications; how agents whisper/spore/sporulate across projects
  • Explaining legacy code you inherited but don't fully understand (underused)
  • Hot tips:
    • Have AI spin up gists for anything gross to copy-paste in the terminal
    • Set up memory files for every project you work in
    • Use --fast mode for simple tasks; reach for the big model when you need depth

Claude Hierarchy (intra-instance)

Three layers to cover in this session. Claude Hierarchy is what happens inside one Claude Code session — which model you're running and which subagents it delegates to. CLAUDE.md Inheritance (next) is the config layering that bootstraps every new session. Agent Mesh (after that) is the inter-instance layer — how your session talks to other Claude sessions running on the machine.

Model tiers — pick by capability + cost:

| Model | Best for |
|-------|----------|
| Opus 4.7 | Deep multi-step work. Architecture, large refactors, multi-file debugging, anything where you'd rather pay for a better answer than iterate on a worse one |
| Sonnet 4.6 | Daily driver. Well-scoped tasks — writing a function, reviewing a diff, generating tests, small refactors |
| Haiku 4.5 | Fast/cheap one-shots. Quick questions, simple edits, any time latency matters more than depth |

Fast mode (toggle with /fast) swaps Opus 4.7 for Opus 4.6 — snappier output without dropping intelligence. Useful when you want Opus-level reasoning but don't want to wait on the longer generation.

Subagent delegation — the main agent spawns subagents via the Task tool. Each subagent runs in an isolated context (separate from the main session), can run a different model, and ships with a fixed toolset scoped to its job. Result: the main session stays lean while subagents do the grunt work in parallel.

Built-in subagent types worth knowing:

| Type | Role | Best when |
|------|------|-----------|
| Explore | Fast codebase recon — find files, search code, answer questions about a repo | You need to find something or understand a service without polluting main context |
| Plan | Architecture / implementation planning | Breaking down a complex task before executing |
| code-reviewer | Post-step review against plan and coding standards | Major chunk of work is done and you want a fresh-eyes pass |
| general-purpose | Fallback — general research + multi-step work | Anything that doesn't fit the specialized types |

When to delegate:

  • Main session is filling up with exploration work → spawn Explore to handle it, get a summary back
  • Independent parallel tasks → send one message with multiple Task tool uses to run them concurrently
  • Post-implementation sanity check → spawn code-reviewer after a big commit
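
Mechanically, parallel subagents behave like a worker pool with isolated contexts. An analogy in plain Python — this is not how Claude Code is implemented, just the shape of "one message, multiple Task uses":

```python
from concurrent.futures import ThreadPoolExecutor

def explore(question: str) -> str:
    """Stand-in for an Explore subagent: does the digging, returns only a summary."""
    return f"summary({question})"

questions = [
    "where does derive_status live?",
    "which Camel routes write to edi_docs?",
]

# One "message", multiple concurrent tasks; the main context only sees summaries.
with ThreadPoolExecutor() as pool:
    summaries = list(pool.map(explore, questions))
print(summaries)
```

The payoff is the same in both worlds: the expensive exploration happens off to the side, and only the condensed answers land back in the main session.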

How this composes with the Agent Mesh:

Your project session (e.g. popdocs running Opus) delegates research/review work to subagents (running Sonnet or Haiku) via Task. When you need something another project agent owns — OMS internals, Camel service behavior — you whisper that agent via Mycelium. Both layers used together: intra-instance subagents for scoped work inside your repo, inter-instance whispers for cross-repo knowledge.

CLAUDE.md Inheritance

Every new Claude Code session is bootstrapped by a stack of CLAUDE.md files that layer from global → org → project. Higher layers apply everywhere; lower layers add or override. This is the one place where there is a real hierarchy.

  ┌────────────────────────────────────────┐
  │ ~/.claude/CLAUDE.md                    │  global — every project, every repo
  │ (destructive-action rules, orientation,│
  │  screenshot helper, mycelium basics)   │
  └────────────────┬───────────────────────┘
                   │ inherits ↓
  ┌────────────────┴───────────────────────┐
  │ ~/popsockets/CLAUDE.md                 │  PopSockets-wide — every PS repo
  │ (popgit SSH alias, tooling family,     │
  │  separation-of-companies note)         │
  └────────────────┬───────────────────────┘
                   │ inherits ↓
  ┌────────────────┴───────────────────────┐
  │ <project>/CLAUDE.md                    │  repo-specific
  │ (service-specific gotchas, endpoints,  │
  │  test commands, deploy rituals)        │
  └────────────────────────────────────────┘

Rule of thumb — put a note at the highest layer it's true at. Destructive-action rules belong at ~/.claude/. popgit SSH belongs at ~/popsockets/. "Bundle the dashboard.py Jinja templates with Tailwind v3" belongs in integrations/popforge/CLAUDE.md.

If you find yourself duplicating a note in three project-level CLAUDE.mds, promote it one layer up.
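
The override semantics are just ordered merging: each more specific layer wins on conflicts. A toy sketch — the keys here are illustrative, not real CLAUDE.md fields:

```python
def effective_context(*layers: dict) -> dict:
    """Merge CLAUDE.md layers global → org → project; later layers override."""
    merged: dict = {}
    for layer in layers:
        merged.update(layer)
    return merged

global_md  = {"destructive_actions": "confirm first", "screenshot_helper": "on"}
ps_md      = {"git_host": "popgit"}
project_md = {"destructive_actions": "confirm first, list blast radius"}

print(effective_context(global_md, ps_md, project_md))
```

Notes a layer doesn't mention pass through untouched, which is why promoting a duplicated note upward is always safe.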

Agent Mesh

There is no hierarchy between project agents. It's a flat mesh — every PopSockets repo runs its own Claude instance, and they all talk peer-to-peer over Mycelium (local MQTT broker). The human operator sits outside the ring and talks to whichever agent owns the work.

The model to convey:

  • Every agent is a peer — popdocs is not above oms, popforge is not above camel. They all sit on the same ring.
  • Whisper the agent you need directly — don't try to route through an intermediary.
  • Transient specialization — an agent can temporarily coordinate a cross-repo effort (e.g. popdocs aggregating a write-up), but that's a role, not a rank.

PopSockets agent roster (all peers):

  • popsockets/popdocs — docs aggregation (the site you're reading)
  • integrations/popforge — test-order factory + pipeline tracker
  • integrations/oms — OMS Rails app
  • integrations/camel — cm-* Camel microservice codebase
  • integrations/mule — legacy Mule integration (being sunset as Camel replaces it)
  • integrations/batchstation — BatchStation service
  • integrations/printstation — PrintStation service
  • integrations/popsockets-batching-backend — batching backend
  • popsockets/notes — PopSockets-side scratchpad agent
  • Ephemeral sub-agents: camel/cm-* — one per Camel microservice (cm-order-prc, cm-edi-prc, cm-fulfill-prc, cm-ext-service-exp, cm-osor-sys, cm-cirro-sys, cm-int-service-sys, cm-product-sys, cm-product-prc, cm-batching-prc, cm-printprod-sys, cm-jackyun-sys, cm-snowflake-sys). Opened on-demand when a specific service is under active work.

How an agent is born: run ccc (a zsh function that launches Claude Code) in a project dir under ~/popsockets/. SessionStart hooks register the agent with Mycelium using a directory-derived name. Per-project .mycelium.json can override the default name.

Three comms primitives:

| Command | Use |
|---------|-----|
| myc whisper <agent> "msg" | Direct 1:1. Default for delegation |
| myc spore <topic> "msg" | Multi-subscriber broadcast (API changes, migrations) |
| myc sporulate "msg" | Network-wide announcement. Rare |

The wheel:

  Human (operator — outside the ring, talks to any agent directly)

                            popdocs
            notes              │             popforge
                  ╲            │            ╱
                   ╲           │           ╱
     printstation ──┐          │          ┌── oms
                    │    ┌─────┴──────┐   │
                    ├────┤  Mycelium  ├───┤
                    │    │   (local   │   │
     batchstation ──┘    │    MQTT)   │   └── camel ── camel/cm-*
                   ╱     └─────┬──────┘     ╲         (on-demand
                  ╱            │             ╲         sub-agents)
         batching-             │              mule
         backend               │
                           (shared bus)

Shared topics cut across the ring for multi-party channels — e.g. edi for EDI-related work spanning popforge + integrations/camel.

Gotchas to call out:

  • Alive vs offline vs stale — myc colony shows all three. Stale = crashed mid-session; whispers go to the void silently. Always myc colony before critical whispers.
  • Naming migration — you'll see both integrations.camel (legacy dot form) and integrations/camel (current slash form). Slashes are canonical.
  • Local-only — Mosquitto on localhost:1883. This is a colony on one machine, not a distributed system.
  • Path-guard is strict — project agents can't read/write outside their sandbox. Need cross-project work? Whisper, don't Edit.
  • Memory is per-agent — each has its own ~/.claude/projects/<slug>/memory/. Knowledge crosses via whispers, spores, or human-curated docs (hi, popdocs).

Claude-to-Claude Communication with Mycelium

Mycelium is the infrastructure that turns a bunch of isolated Claude Code sessions into a collaborating colony. Worth a focused walkthrough — devs will get the biggest productivity jumps once they internalize it.

What it is: a local MQTT pub/sub message bus (Mosquitto on localhost:1883) with three layers:

  • Broker — Mosquitto handling the publish/subscribe traffic
  • Historian — SQLite-backed process that logs every message (myc history, myc topics)
  • Watcher — a process that push-delivers messages into the target agent's Kitty terminal window (not inbox-polling — messages literally type themselves into the other agent's prompt)

The three comms primitives:

| Command | Shape | Use |
| --- | --- | --- |
| `myc whisper <agent> "msg"` | Direct 1:1 message. Auto-creates a DM channel. Default action: prompt (receiver can respond) | Delegating a task, asking a project-owner a scoped question, coordinating on a specific flow |
| `myc spore <topic> "msg"` | Publish to a topic with multiple subscribers. Default action: notify | Broadcasting an API change, announcing a migration milestone, flagging a DB schema update |
| `myc sporulate "msg"` | Broadcast to every alive agent. Rare | Network-wide announcements — "we migrated to X, please re-read CLAUDE.md" |

Practical patterns:

  • Delegation: building a doc in popdocs and need details on cm-edi-prc? myc whisper camel "..." — Camel's agent reads the code and whispers back. You get the actual source-of-truth without grepping a repo you don't own.
  • Parallel research: whisper 3 agents the same question simultaneously. Each agent is a separate process, so they work in parallel; you read all three responses in your next prompt.
  • Large payloads: don't paste 10KB of JSON into a whisper. Drop it at /tmp/<something>.md and whisper the path. Keeps the message log readable.
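The large-payload pattern can be sketched as follows (the order ID and file contents are made up; the `myc whisper` line is shown commented since it needs a live colony):

```shell
# Park a big payload in /tmp, then whisper only the path.
payload=$(mktemp /tmp/edi-850-dump.XXXXXX)
cat > "$payload" <<'EOF'
## 850 dump for order 12345 (hypothetical)
{ "orderId": 12345, "status": "stuck-in-cm-edi-prc" }
EOF
echo "payload parked at $payload"
# myc whisper camel "Full 850 dump at $payload -- what is cm-edi-prc rejecting?"
```

The receiving agent reads the file directly from disk; the message log only ever carries a one-line pointer.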

Core myc commands devs will actually use:

| Command | What it does |
| --- | --- |
| `myc colony` | Who's online (alive / offline / stale) |
| `myc hyphae` | Full agent roster + topic subscriptions |
| `myc tendrils` | Your subscriptions |
| `myc topics` | All known topics |
| `myc history <topic>` | Recent messages on a topic |
| `myc whisper <agent> "msg"` | Direct message |
| `myc spore <topic> "msg"` | Publish to topic |
| `myc graft <topic>` / `myc sever <topic>` | Subscribe / unsubscribe |
| `myc absorb` | Drain queued messages (runs automatically on every prompt) |

When to use which primitive:

  • You want a specific agent to act → whisper
  • Multiple agents need to know something → spore (check myc hyphae first to see who subscribes to the topic)
  • Every agent needs to know → sporulate (very rare; this is the "fire alarm" button)
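Putting the three together, a session might look like this (illustrative only; agent and topic names are from the roster above, message text is made up):

```shell
# One agent should act: whisper it
myc colony                      # confirm camel is alive first
myc whisper camel "Does cm-edi-prc retry on a 997 reject?"

# Several subscribers need to know: spore the topic
myc hyphae                      # check who is grafted to edi
myc spore edi "850 schema gains a new PO1 qualifier next sprint"

# Everyone needs to know: sporulate (rare)
myc sporulate "CLAUDE.md conventions changed, re-read before your next task"
```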

Mycelium gotchas on top of the Agent Hierarchy ones:

  • Whispers to stale agents vanish silently — broker doesn't tell you delivery failed. Check myc colony before anything critical.
  • Stray shell characters in whisper bodies get eaten — backticks, $, ! in the message string can be interpreted by the shell. Single-quote the message or escape carefully.
  • Don't spam spore when a whisper will do — spores go to every subscriber of that topic. If you only need one agent, whisper them directly.
  • myc absorb runs automatically on every new prompt via a hook, so you don't need to call it manually — messages appear as injected context at the start of your next turn.
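The shell-quoting gotcha is worth a ten-second demo. Single quotes pass the message through untouched; double quotes would let the shell run the backticks as a command substitution and expand `$ORDER_ID` (both strings here are made up):

```shell
# Single quotes: backticks, $, and (in scripts) ! travel as literal text.
msg='Order stuck in `cm-edi-prc` -- $ORDER_ID never resolved'
printf '%s\n' "$msg"
# → Order stuck in `cm-edi-prc` -- $ORDER_ID never resolved
```

If the message itself needs a single quote, drop the payload in a /tmp file and whisper the path instead of fighting the escaping.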

Demo

Spin up three agents in parallel — one searching logs, one reading code, one drafting a doc. Show the Mycelium messages flowing in real time. End with: "this is what your workflow could look like once you stop trying to do everything yourself."


Running Notes