Building Custom Claude Code Skills for Performance Snapshots


Every time I profiled a Node.js service, I'd end up with a .cpuprofile file, a manual note in a scratchpad, and no connection between the two. Comparing snapshots across runs meant flipping between browser DevTools and a terminal, which wasn't great. I wanted something tighter: run a profile, get a structured diff, and have Claude explain what changed.

What Claude Code Skills Are

Claude Code supports a /slash-command extension system where you can define custom skills as Markdown files in .claude/commands/. When you invoke a skill, Claude executes it in the context of your current session, with access to your codebase, your shell, and any MCP servers you've connected.

Skills are just Markdown with embedded bash. The simplicity is the point. There's no SDK to learn, no build step, no daemon. Drop a file into .claude/commands/ and it's live.

The Problem with Manual Profiling

The usual workflow looks like this:

  1. Run the server with --cpu-prof
  2. Wait for a meaningful load test to finish
  3. Open Chrome DevTools, load the .cpuprofile, squint at a flame graph
  4. Repeat with different parameters and try to remember what changed

What I wanted instead: a single command that captures a profile, stores it with metadata, diffs it against the previous run, and surfaces the delta as structured text I can feed to Claude.

Building the Skill

The skill lives in .claude/commands/perf-snapshot.md. The core of it is a bash block that Claude executes when you run /perf-snapshot:

#!/usr/bin/env bash
set -euo pipefail
 
SNAPSHOT_DIR=".perf-snapshots"
mkdir -p "$SNAPSHOT_DIR"
 
TIMESTAMP=$(date +%Y%m%dT%H%M%S)
PROFILE_FILE="$SNAPSHOT_DIR/$TIMESTAMP.cpuprofile"
 
echo "Starting CPU profile for 10s..."
node --cpu-prof \
  --cpu-prof-dir="$SNAPSHOT_DIR" \
  --cpu-prof-interval=100 \
  -e "require('./src/server'); process.on('SIGTERM', () => process.exit(0))" &
 
SERVER_PID=$!
sleep 10
# The SIGTERM handler above turns the kill into a clean process.exit,
# which is what makes Node flush the .cpuprofile to disk
kill "$SERVER_PID" 2>/dev/null || true
wait "$SERVER_PID" 2>/dev/null || true
 
# Move the generated profile (Node picks its own file name) to our named file
GENERATED=$(ls -t "$SNAPSHOT_DIR"/*.cpuprofile 2>/dev/null | head -1)
[ -n "$GENERATED" ] && mv "$GENERATED" "$PROFILE_FILE"
 
echo "Profile saved to $PROFILE_FILE"
 
# Emit a JSON summary of top hot paths for Claude to read
node -e "
  const fs = require('fs');
  const profile = JSON.parse(fs.readFileSync('$PROFILE_FILE', 'utf8'));
  const nodes = profile.nodes
    .sort((a, b) => (b.hitCount ?? 0) - (a.hitCount ?? 0))
    .slice(0, 20)
    .map(n => ({ name: n.callFrame.functionName || '(anonymous)', url: n.callFrame.url, line: n.callFrame.lineNumber, hits: n.hitCount ?? 0 }));
  console.log(JSON.stringify(nodes, null, 2));
"
💡

--cpu-prof-interval sets the sampling interval in microseconds (Node's default is 1000). Lower values (e.g. 50) give more granular data at the cost of a larger file. For most Node.js services, 100 gives a good signal-to-noise ratio without overwhelming the profiler output.

Diffing Snapshots

The interesting part is the diff. I wrote a small script that compares the top-20 hot paths from two profiles and emits a structured delta:

interface HotPath {
  name: string
  url: string
  line: number
  hits: number
}
 
function diffProfiles(before: HotPath[], after: HotPath[]): string {
  // Key by full identity (name + location) so distinct anonymous
  // functions don't all collapse into a single entry
  const key = (p: HotPath) => `${p.name}@${p.url}:${p.line}`
  const beforeMap = new Map<string, number>(before.map((p) => [key(p), p.hits]))
  const afterMap = new Map<string, number>(after.map((p) => [key(p), p.hits]))
 
  const allKeys = new Set([...beforeMap.keys(), ...afterMap.keys()])
  const deltas: Array<{ name: string; delta: number; pct: number }> = []
 
  for (const name of allKeys) {
    const b = beforeMap.get(name) ?? 0
    const a = afterMap.get(name) ?? 0
    const delta = a - b
    // A path absent from the "before" run has no baseline to compare against
    const pct = b === 0 ? Infinity : (delta / b) * 100
    deltas.push({ name, delta, pct })
  }
 
  return deltas
    .sort((a, b) => Math.abs(b.delta) - Math.abs(a.delta))
    .slice(0, 10)
    .map((d) => `${d.name}: ${d.delta > 0 ? '+' : ''}${d.delta} hits (${d.pct === Infinity ? 'new' : d.pct.toFixed(1) + '%'})`)
    .join('\n')
}
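
To make the diff output concrete, here's the same per-path delta logic condensed into a standalone sketch over two toy summaries (the function names and hit counts are invented):

```typescript
type HotPath = { name: string; url: string; line: number; hits: number }

// Two toy summaries: parseBody got hotter, renderTemplate cooled down
const before: HotPath[] = [
  { name: 'parseBody', url: 'src/middleware.ts', line: 12, hits: 40 },
  { name: 'renderTemplate', url: 'src/views.ts', line: 88, hits: 120 },
]
const after: HotPath[] = [
  { name: 'parseBody', url: 'src/middleware.ts', line: 12, hits: 95 },
  { name: 'renderTemplate', url: 'src/views.ts', line: 88, hits: 30 },
]

// Same per-path delta computation, inlined for a standalone run
const beforeHits = new Map<string, number>(before.map((p) => [p.name, p.hits]))
const lines: string[] = []
for (const p of after) {
  const b = beforeHits.get(p.name) ?? 0
  const delta = p.hits - b
  const pct = b === 0 ? Infinity : (delta / b) * 100
  lines.push(`${p.name}: ${delta > 0 ? '+' : ''}${delta} hits (${pct.toFixed(1)}%)`)
}
console.log(lines.join('\n'))
// parseBody: +55 hits (137.5%)
// renderTemplate: -90 hits (-75.0%)
```

Those two lines are exactly the kind of compact, structured text that Claude annotates well.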

Claude then reads this output and produces a plain-English explanation of what changed between the two runs: which functions got hotter, which cooled down, and what code paths are worth investigating.

The Result

The full skill pipeline looks like this:

  1. /perf-snapshot runs the profile and emits a hot-path JSON summary
  2. Claude stores it in .perf-snapshots/ with a timestamp
  3. On the second run, Claude automatically diffs the two most recent snapshots
  4. The diff is piped back into the session as context, and Claude annotates it
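
Step 3, picking the two most recent snapshots to diff, can be sketched as a small helper. The `.summary.json` naming is an assumption on my part (the skill could write each JSON summary next to its .cpuprofile); it relies on the timestamp format sorting lexicographically:

```typescript
import { readdirSync } from 'node:fs'
import { join } from 'node:path'

// Return the [previous, current] summary paths, or null if there aren't
// two yet. Timestamps like 20240501T103000 sort lexicographically in
// chronological order, so a plain sort is enough.
function pickLatestPair(dir: string): [string, string] | null {
  const summaries = readdirSync(dir)
    .filter((f) => f.endsWith('.summary.json'))
    .sort()
  if (summaries.length < 2) return null
  const [prev, curr] = summaries.slice(-2)
  return [join(dir, prev), join(dir, curr)]
}
```

Returning null on the first run is what makes the diff step kick in only from the second snapshot onwards.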

The whole thing takes about 15 seconds for a 10-second profile. The annotation step is the part that surprised me most: Claude consistently identifies non-obvious patterns like GC pressure caused by allocation spikes in utility functions that don't show up as hot themselves.

What I Learned

  • Claude Code's skill system is more capable than the docs suggest. The ability to emit structured output back into the session context is underused.
  • Structured diffs are much more useful than raw profile files. Giving Claude a JSON summary instead of a .cpuprofile produces sharper analysis.
  • This pattern generalises to any "capture → diff → explain" workflow. I've since built similar skills for memory heap snapshots and HTTP trace logs.
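
A sketch of that generalisation: the shared shape is a capture function plus a diff that emits structured text for Claude to annotate. The HeapSummary type and its byte-count diff below are hypothetical stand-ins, not the actual heap-snapshot skill:

```typescript
// Capture a snapshot, diff two of them into structured text for Claude
interface SnapshotSkill<T> {
  capture(): T
  diff(before: T, after: T): string
}

// Hypothetical stand-in for a heap-snapshot summary
type HeapSummary = { retainedBytes: number }

const heapSkill: SnapshotSkill<HeapSummary> = {
  // A real version would parse a .heapsnapshot file; this is a stub
  capture: () => ({ retainedBytes: 0 }),
  diff: (before, after) => {
    const delta = after.retainedBytes - before.retainedBytes
    return `retained bytes: ${delta > 0 ? '+' : ''}${delta}`
  },
}
```

Each new skill then only needs to decide what its summary type looks like and how to render the delta as text.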

The skill is about 80 lines of Markdown and bash. If you're doing performance work on a Node.js service, it's worth the 30 minutes to set up.