NOTE FROM THE HUMAN

Last week, Claude and I hacked together this tool to go through its previous conversations looking for memories. All prose was written by Claude. Even the research report that it claims I wrote.

This is just step one toward annotating my coding assistant's memory with lessons learned beyond what it has journaled.

Caveat Emptor.

If it breaks, you can keep all the pieces.

Claude Memory 🧠

Multi-dimensional memory extraction system for Claude Code conversations. Learns from conversation history to improve future interactions through systematic extraction of technical lessons, working style preferences, and debugging methodologies.

This research was directed by Jesse Vincent (jesse@fsck.com). Claude performed the implementation and analysis under that guidance; Jesse has not read the full content closely.

Quick Start

# Install globally (once the package is published to npm; until then, install from source)
npm install -g claude-memory

# Extract memories from recent conversations
claude-memory extract --since="2025-09-01T00:00:00"

# View what was extracted
ls ~/.claude/memories/extracted/

What It Does

Analyzes Claude Code conversation logs and extracts transferable lessons using a multi-dimensional framework:

Root Cause Analysis (Five Whys) - Why did this happen?
Psychological Drivers - What hidden motivations explain behavior?
Prevention Strategies - How to avoid this in the future?
Epistemic Humility - Acknowledges uncertainty appropriately

Example extracted lesson:

"When debugging works in one context but not another, gather evidence comparing the two paths before attempting fixes. Don't guess at symptoms - add logging to see actual data structures."

Current Status: Production Ready ✅

85% match to human ground truth validation
Multi-dimensional analysis provides richer insights than single techniques
Methodology detection distinguishes process lessons from technical fixes
Epistemic humility prevents overconfident extraction on ambiguous cases

Installation

Option 1: Global Install (Recommended)

# Install from npm (when published)
npm install -g claude-memory

# Or install from source
git clone https://github.com/obra/claude-memory.git
cd claude-memory
npm install && npm run build
npm link

Option 2: Local Development

git clone https://github.com/obra/claude-memory.git
cd claude-memory
npm install && npm run build

# Run directly
./dist/cli.js extract --since="2025-09-01T00:00:00"

Prerequisites

Claude CLI installed and authenticated
Node.js 18+

# Verify Claude CLI is working
claude --version

# Check system info
claude-memory info

Usage

Extract Memories

# Extract from the last 30 days (--since takes a fixed timestamp, so adjust the date)
claude-memory extract --since="2025-08-29T00:00:00"

# Extract from specific projects directory
claude-memory extract --projects-dir=~/.claude/projects --memories-dir=~/.claude/memories

# Dry run to preview (not implemented yet)
claude-memory extract --dry-run
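
Because --since takes a fixed timestamp, a small helper can compute one for "N days ago". The sketch below is illustrative only: sinceDaysAgo is a hypothetical name, and it assumes the CLI accepts timestamps shaped like the examples above (YYYY-MM-DDTHH:MM:SS). Whether the CLI reads them as local time or UTC is not stated here, so the helper simply uses midnight UTC.

```typescript
// Hypothetical helper, not part of claude-memory: build a --since value
// for "N days before now", truncated to midnight UTC.
function sinceDaysAgo(days: number, now: Date = new Date()): string {
  const msPerDay = 24 * 60 * 60 * 1000;
  const target = new Date(now.getTime() - days * msPerDay);
  target.setUTCHours(0, 0, 0, 0);
  // toISOString() yields e.g. "2025-08-29T00:00:00.000Z"; keep the first
  // 19 characters to match the format used in the examples above.
  return target.toISOString().slice(0, 19);
}

// Print a ready-to-paste command for the last 30 days.
console.log(`claude-memory extract --since="${sinceDaysAgo(30)}"`);
```

For a run on 2025-09-28, sinceDaysAgo(30) produces the 2025-08-29T00:00:00 value used in the example above.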

Review Extracted Memories

# List recent extractions
ls -la ~/.claude/memories/extracted/ | head -10

# View a specific memory
cat ~/.claude/memories/extracted/2025-09-28T04-25-56-822Z-test-project-pattern-test-s-000.md
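
The file name above appears to begin with the creation timestamp, with the colons and the dot replaced by hyphens, and to end with a three-digit index. The sketch below recovers those two parts so files can be sorted or filtered by date. It is a guess from the single example name, not the tool's documented naming scheme; parseMemoryFileName is a hypothetical helper, and everything between the timestamp and the index is treated as an opaque slug.

```typescript
// Hypothetical parser for names like
// 2025-09-28T04-25-56-822Z-test-project-pattern-test-s-000.md
interface MemoryFileName {
  created: string; // ISO 8601 timestamp recovered from the prefix
  slug: string;    // middle part, left uninterpreted
  index: number;   // trailing three-digit counter
}

function parseMemoryFileName(name: string): MemoryFileName | null {
  const pattern =
    /^(\d{4}-\d{2}-\d{2})T(\d{2})-(\d{2})-(\d{2})-(\d{3})Z-(.+)-(\d{3})\.md$/;
  const match = pattern.exec(name);
  if (match === null) return null; // not a memory file name we recognize
  const [, date, hh, mm, ss, ms, slug, index] = match;
  return {
    created: `${date}T${hh}:${mm}:${ss}.${ms}Z`,
    slug,
    index: Number(index),
  };
}

const parsed = parseMemoryFileName(
  "2025-09-28T04-25-56-822Z-test-project-pattern-test-s-000.md"
);
console.log(parsed);
```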

Development Commands

# Type checking
npm run typecheck

# Build
npm run build

# Run tests (when available)
npm test

How It Works

1. Session Processing

SessionParser: Reads JSONL files from ~/.claude/projects
SessionChunker: Breaks long conversations into manageable chunks
Message filtering: Focuses on human-assistant exchanges with meaningful content
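
The three steps above can be sketched roughly as follows. This is not the code in src/extraction/; the SessionMessage shape, the role names, and the character-based chunk limit are all assumptions made for illustration, since the actual log schema and chunking rule are not described here.

```typescript
// Hypothetical message shape; the real JSONL schema may differ.
interface SessionMessage {
  role: string;
  text: string;
}

// Parse JSONL text, keeping only non-empty user/assistant messages.
function parseJsonl(content: string): SessionMessage[] {
  const messages: SessionMessage[] = [];
  for (const line of content.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const record = JSON.parse(trimmed) as Partial<SessionMessage>;
      const isExchange = record.role === "user" || record.role === "assistant";
      if (isExchange && typeof record.text === "string" && record.text.trim() !== "") {
        messages.push({ role: record.role as string, text: record.text });
      }
    } catch {
      // Skip malformed lines rather than aborting the whole session.
    }
  }
  return messages;
}

// Group messages into chunks whose combined text stays within maxChars.
// A single oversized message still gets a chunk of its own.
function chunkMessages(messages: SessionMessage[], maxChars: number): SessionMessage[][] {
  const chunks: SessionMessage[][] = [];
  let current: SessionMessage[] = [];
  let size = 0;
  for (const message of messages) {
    if (current.length > 0 && size + message.text.length > maxChars) {
      chunks.push(current);
      current = [];
      size = 0;
    }
    current.push(message);
    size += message.text.length;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Chunking on message boundaries keeps each human-assistant exchange intact, which is one plausible reading of "manageable chunks"; a token-based limit would work the same way with a different size measure.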

2. Multi-Dimensional Extraction

Agent-based: Uses Claude CLI with specialized extraction prompt
Five Whys: Traces root causes through 5 levels of causation
Hidden Motivation: Identifies psychological drivers and fears
Systems Thinking: Creates concrete prevention strategies
Epistemic Humility: Assesses confidence and acknowledges uncertainty
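
One way to picture the result of this step is as a record with one field per dimension. The sketch below is an assumption, not the agent's actual output schema: the MultiDimensionalAnalysis interface and the validateAnalysis helper are hypothetical, and the checks only encode what is stated above (five levels of causation, a driver, a prevention strategy) plus the 1-5 confidence scale.

```typescript
// Hypothetical shape for one multi-dimensional analysis.
interface MultiDimensionalAnalysis {
  fiveWhys: string[];          // successive "why" answers, surface to root
  psychologicalDriver: string; // hidden motivation or fear
  preventionStrategy: string;  // concrete way to avoid a repeat
  confidence: number;          // 1-5 scale
  uncertaintyNotes: string;    // what remains ambiguous
}

// Return a list of problems; an empty list means the record looks complete.
function validateAnalysis(analysis: MultiDimensionalAnalysis): string[] {
  const problems: string[] = [];
  if (analysis.fiveWhys.length !== 5) {
    problems.push(`expected 5 "why" levels, got ${analysis.fiveWhys.length}`);
  }
  if (analysis.psychologicalDriver.trim() === "") {
    problems.push("missing psychological driver");
  }
  if (analysis.preventionStrategy.trim() === "") {
    problems.push("missing prevention strategy");
  }
  if (analysis.confidence < 1 || analysis.confidence > 5) {
    problems.push("confidence must be between 1 and 5");
  }
  return problems;
}
```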

3. Memory Output

Markdown files with YAML frontmatter containing metadata
Comprehensive analysis in structured sections
Confidence scoring (1-5 scale)
Lesson type classification (methodology vs technical)
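
A minimal sketch of writing such a file, assuming a small record type. MemoryRecord, renderMemory, and the lesson_type key are hypothetical names chosen for illustration; the tool's real frontmatter keys are shown under Memory File Format.

```typescript
type LessonType = "methodology" | "technical";

// Hypothetical record; field names are illustrative only.
interface MemoryRecord {
  id: string;
  created: string;
  confidence: number; // 1-5 scale
  lessonType: LessonType;
  title: string;
  body: string;
}

// Render a record as Markdown with YAML frontmatter.
function renderMemory(memory: MemoryRecord): string {
  if (memory.confidence < 1 || memory.confidence > 5) {
    throw new RangeError("confidence must be on the 1-5 scale");
  }
  return [
    "---",
    // JSON.stringify produces double-quoted, escaped strings, which are
    // also valid YAML double-quoted scalars for plain text like this.
    `id: ${JSON.stringify(memory.id)}`,
    `created: ${JSON.stringify(memory.created)}`,
    `confidence: ${memory.confidence}`,
    `lesson_type: ${JSON.stringify(memory.lessonType)}`,
    "---",
    "",
    `# ${memory.title}`,
    "",
    memory.body,
    "",
  ].join("\n");
}
```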

Project Structure

claude-introspection/
├── README.md                    # This file
├── src/
│   ├── cli.ts                   # Main CLI entry point
│   ├── commands/
│   │   └── extract.ts           # Extract command implementation
│   ├── extraction/
│   │   ├── parser.ts            # JSONL session parsing
│   │   ├── chunker.ts           # Conversation chunking
│   │   └── agent-extractor.ts   # Agent-based extraction
│   └── types/                   # TypeScript definitions
├── agents/
│   ├── memory-extractor.md      # Original single-technique agent
│   └── multi-dimensional-extractor.md  # Current production agent
├── docs/
│   ├── research/                # Research findings and validation
│   │   ├── agent-prompting-research.md
│   │   ├── ground-truth-validation.md
│   │   └── research-summary.md
│   ├── guides/                  # Implementation guides
│   │   ├── initial-design.md
│   │   └── plans/
│   └── experiments/             # Experimental results
└── ~/.claude/memories/extracted/  # Output directory (git-ignored)

Memory File Format

---
id: "mem-2025-09-28T04-25-56-822Z-test-session-000"
created: "2025-09-28T04:25:56.822Z"
confidence: 4.5
certainty_reasoning: "Clear pattern with user corrections"
trigger: "when engineering decisions have multiple valid approaches"
tags: ["project:test-project", "type:pattern", "domain:technical"]
source:
  session: "test-session"
---

# Engineering Tradeoffs: When to Decide vs When to Acknowledge Ambiguity

[Multi-dimensional analysis with Root Cause, Psychological Driver,
Prevention Strategy sections, plus Epistemic Humility notes]
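
Since there is no search interface yet, reading the frontmatter back is the simplest way to filter memories, for example by confidence. The sketch below is a deliberately naive, line-based reader for the top-level keys shown above; readFrontmatter is a hypothetical helper, and a real YAML parser would be the better choice for anything beyond this simple layout. Nested keys such as source.session are skipped.

```typescript
// Naive frontmatter reader: top-level "key: value" lines only.
function readFrontmatter(markdown: string): Record<string, string> {
  const lines = markdown.split("\n");
  const fields: Record<string, string> = {};
  if (lines.length === 0 || lines[0].trim() !== "---") return fields;
  for (let i = 1; i < lines.length; i++) {
    const line = lines[i];
    if (line.trim() === "---") break; // end of frontmatter
    // Indented lines and bare keys such as "source:" do not match.
    const match = /^([A-Za-z_]+):\s*(.+)$/.exec(line);
    if (match === null) continue;
    // Strip one pair of surrounding double quotes, if present.
    fields[match[1]] = match[2].replace(/^"(.*)"$/, "$1");
  }
  return fields;
}

const example = [
  "---",
  'id: "mem-2025-09-28T04-25-56-822Z-test-session-000"',
  "confidence: 4.5",
  'tags: ["project:test-project", "type:pattern", "domain:technical"]',
  "source:",
  '  session: "test-session"',
  "---",
  "",
  "# Engineering Tradeoffs",
].join("\n");

const fields = readFrontmatter(example);
// Keep only memories at or above a chosen confidence threshold.
console.log(Number(fields["confidence"]) >= 4 ? "keep" : "skip");
```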

Research Background

This system was developed through extensive research on memory extraction techniques:

Experiment 1: Tested 15 different prompting techniques (Five Whys, Self-Criticism, etc.)
Experiment 3: Discovered agents show dangerous overconfidence on ambiguous cases
Ground Truth Validation: 16-snippet validation with human ground truth

Key findings:

Multi-dimensional analysis superior to single techniques
Methodology lessons more valuable than technical fixes
Epistemic humility crucial for ambiguous cases
Human calibration essential for quality

Known Limitations

Claude CLI dependency: Requires authenticated Claude CLI installation
No search interface: Memories are files, no semantic search yet
Manual review needed: Some extractions may need human validation
Token costs: ~$0.50-2.00 per conversation depending on length
Processing time: ~1-2 minutes per conversation
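
The per-conversation figures above make it easy to budget a batch before running it. A rough sketch, assuming conversations are processed one at a time and that the quoted ranges hold; estimateBatch is a hypothetical helper, not part of the CLI.

```typescript
interface Range {
  low: number;
  high: number;
}

// Scale the rough per-conversation figures (~$0.50-2.00, ~1-2 minutes)
// to a batch, assuming sequential processing.
function estimateBatch(conversations: number): { costUsd: Range; minutes: Range } {
  return {
    costUsd: { low: conversations * 0.5, high: conversations * 2.0 },
    minutes: { low: conversations * 1, high: conversations * 2 },
  };
}

// 40 conversations: roughly $20-80 and 40-80 minutes.
console.log(estimateBatch(40));
```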

Roadmap

Next Sprint

Add consensus flagging (>90% agreement triggers review)
Implement Null Hypothesis agent for "nothing went wrong" cases
Build automated comparison to human extractions
Add simple search interface
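
Consensus flagging is not implemented yet, so the following is only one possible reading of that item: run several extractions over the same snippet, measure how many agree on the most common label, and flag the snippet for human review when agreement exceeds 90%. The function names, the use of string labels, and the choice of "share of the most common label" as the agreement measure are all assumptions.

```typescript
// Share of extractions that agree with the most common label (0 to 1).
function agreementRatio(labels: string[]): number {
  if (labels.length === 0) return 0;
  const counts = new Map<string, number>();
  let top = 0;
  for (const label of labels) {
    const count = (counts.get(label) ?? 0) + 1;
    counts.set(label, count);
    if (count > top) top = count;
  }
  return top / labels.length;
}

// Flag for review when agreement is strictly above the threshold.
function needsReview(labels: string[], threshold: number = 0.9): boolean {
  return agreementRatio(labels) > threshold;
}
```

Flagging high agreement rather than low agreement follows the wording of the roadmap item; it fits the overconfidence finding in Research Background, where near-unanimous answers on ambiguous cases were the risk.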

Future

Claude Code integration for automatic memory injection
Web UI for browsing and managing memories
Memory consolidation and archiving
Team-wide memory sharing
Reinforcement learning from session feedback

Contributing

Add new extraction techniques: Update agents/multi-dimensional-extractor.md
Improve validation: Add more ground truth examples
Build search: Implement semantic search over extracted memories
Add tests: Create test suite for extraction pipeline

Research Papers & Documentation

Research Summary: Comprehensive overview of all experiments
Ground Truth Validation: Human validation results (85% match)
Agent Prompting Research: Comparison of 15 techniques

License

MIT

Status: Production ready after ground truth validation. Currently processing conversation logs and extracting high-quality memories.
