# Caudal
Caudal is an attention engine for AI agents. It tells an agent what matters right now.
Modern agents have tools, vector search, and long-term storage — but they still behave strangely:
- They bring up old topics
- They lose track of current user intent
- They repeat irrelevant actions
- They need custom heuristics for recency and prioritization
The missing piece is not more knowledge. It is attention.
Caudal provides a continuously evolving “attention signal” derived from real interactions. Agents send events. Caudal learns which entities and relationships are currently important, and naturally forgets what is no longer relevant.
No prompts. No fine-tuning. No heuristics. Just behavior → relevance.
## What Caudal is
Caudal is not a database, a vector store, or a conversation history. It is a new layer:
| System | Answers |
|---|---|
| SQL / Graph DB | What is true? |
| Vector DB | What is similar? |
| LLM | What can I reason about? |
| Caudal | What should I focus on now? |
Caudal stores interaction traces and continuously computes a ranked relevance field with built-in forgetting.
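The combination of reinforcement and forgetting can be pictured as an exponentially decaying score that every new event tops up. The sketch below is only an illustration of the idea, not Caudal's actual algorithm — the function name and the half-life parameter are invented here:

```python
def relevance(event_times, now, half_life=3600.0):
    """Sum of exponentially decayed contributions, one per event.

    Each event contributes 1.0 at the moment it happens and half as
    much every `half_life` seconds afterwards, so untouched entities
    fade toward zero while recently active ones stay near the top.
    """
    return sum(0.5 ** ((now - t) / half_life) for t in event_times)

# An entity touched at t=0s and t=1800s, scored one hour in:
score = relevance([0.0, 1800.0], now=3600.0)
```

Reinforcement falls out for free: entities with more recent events simply accumulate more un-decayed mass.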
## Quickstart (Docker)
```bash
# Start dependencies (Postgres) + Caudal
cd docker
docker compose up -d

# Check health
curl http://localhost:8080/actuator/health
```
In the default dev mode, requests are accepted without authentication when `AUTH_DISABLED=true`.
## How an agent uses Caudal
### 1. Write: emit events while working
Whenever the agent observes meaningful activity, it sends events:
- User mentions a topic
- A tool is used
- A document is retrieved
- A task succeeds or fails
```bash
curl -X POST http://localhost:8080/api/v1/events \
  -H "Content-Type: application/json" \
  -d '{
    "space": "user:123",
    "events": [
      {
        "src": "user:123",
        "dst": "topic:car-buying",
        "type": "chat",
        "intensity": 2.0
      },
      {
        "src": "agent:planner",
        "dst": "tool:car-comparison",
        "type": "tool_use",
        "intensity": 1.0
      }
    ]
  }'
```
Caudal interprets these as signals of attention, not facts.
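From agent code, the same call is a plain HTTP POST. A minimal Python sketch using only the standard library — the endpoint and payload shape are taken from the curl example above; the helper names are my own:

```python
import json
import urllib.request

def build_batch(space, events):
    """Shape the request body the way /api/v1/events expects it."""
    return {"space": space, "events": events}

def send_events(base_url, space, events):
    """POST a batch of attention events to Caudal (network sketch)."""
    body = json.dumps(build_batch(space, events)).encode()
    req = urllib.request.Request(
        f"{base_url}/api/v1/events",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

events = [
    {"src": "user:123", "dst": "topic:car-buying", "type": "chat", "intensity": 2.0},
    {"src": "agent:planner", "dst": "tool:car-comparison", "type": "tool_use", "intensity": 1.0},
]
# send_events("http://localhost:8080", "user:123", events)
```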
### 2. Read: ask what matters now
Before deciding what to do next, the agent queries Caudal:
```bash
curl "http://localhost:8080/api/v1/focus?space=user:123&k=5"
```

```json
{
  "asOf": "2026-02-28T10:06:00Z",
  "items": [
    { "id": "topic:car-buying", "score": 0.83 },
    { "id": "topic:stroller", "score": 0.61 },
    { "id": "topic:cycling", "score": 0.22 }
  ]
}
```
The agent now knows what to prioritize in conversation, planning, tool selection, and retrieval.
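An agent typically consumes the focus list by thresholding the scores before building its next prompt or plan. A sketch against the response shape above — the 0.5 cutoff is an arbitrary choice, not a Caudal default:

```python
def current_focus(focus_response, min_score=0.5):
    """Return entity ids from a /api/v1/focus response, strongest
    first, dropping anything below min_score."""
    items = sorted(focus_response["items"], key=lambda i: i["score"], reverse=True)
    return [i["id"] for i in items if i["score"] >= min_score]

response = {
    "asOf": "2026-02-28T10:06:00Z",
    "items": [
        {"id": "topic:car-buying", "score": 0.83},
        {"id": "topic:stroller", "score": 0.61},
        {"id": "topic:cycling", "score": 0.22},
    ],
}
# current_focus(response) -> ["topic:car-buying", "topic:stroller"]
```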
### 3. Follow associations
Agents can explore what is likely to come next:
```bash
curl "http://localhost:8080/api/v1/next?space=user:123&src=topic:car-buying&k=5"
```

```json
{
  "asOf": "2026-02-28T10:06:00Z",
  "items": [
    { "id": "user:123", "score": 0.83 },
    { "id": "tool:car-comparison", "score": 0.45 },
    { "id": "topic:stroller", "score": 0.12 }
  ]
}
```
This enables recommendations, coherent planning, and contextual tool use.
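One concrete pattern is tool selection: take the highest-scoring association whose id marks it as a tool. A sketch against the response shape above (the `tool:` prefix convention comes from the examples in this README):

```python
def best_next(next_response, prefix="tool:"):
    """Pick the strongest association whose id starts with prefix,
    or None when nothing matches."""
    candidates = [i for i in next_response["items"] if i["id"].startswith(prefix)]
    return max(candidates, key=lambda i: i["score"])["id"] if candidates else None

response = {
    "items": [
        {"id": "user:123", "score": 0.83},
        {"id": "tool:car-comparison", "score": 0.45},
        {"id": "topic:stroller", "score": 0.12},
    ]
}
# best_next(response) -> "tool:car-comparison"
```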
### 4. Explore multi-hop pathways
For deeper associations across several steps, agents can sample pathways starting from an entity. This helps with planning, recommendations, and explaining why something is relevant.
```bash
curl -X POST http://localhost:8080/api/v1/pathways \
  -H "Content-Type: application/json" \
  -d '{
    "space": "user:123",
    "start": "user:123",
    "k": 5,
    "mode": "deep"
  }'
```
| Mode | Description |
|---|---|
| `"fast"` | Fewer samples, shorter walks — low latency |
| `"balanced"` | Good default (used when `mode` is omitted) |
| `"deep"` | More samples, longer walks — thorough |
```json
{
  "asOf": "2026-02-28T10:06:00Z",
  "topEntities": [
    { "id": "topic:car-buying", "score": 0.72 },
    { "id": "brand:toyota", "score": 0.51 },
    { "id": "model:yaris_cross", "score": 0.44 }
  ],
  "paths": [
    { "nodes": ["user:123","topic:car-buying","brand:toyota","model:yaris_cross"], "score": 0.34 },
    { "nodes": ["user:123","topic:car-buying","doc:comparison_guide"], "score": 0.28 }
  ]
}
```
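Because each path is an explicit node sequence, pathway results double as explanations. A sketch that renders the strongest path as a human-readable "why" chain — the arrow formatting is my own convention, not part of the API:

```python
def explain_strongest_path(pathways_response):
    """Render the highest-scoring path as an arrow-joined chain."""
    best = max(pathways_response["paths"], key=lambda p: p["score"])
    return " -> ".join(best["nodes"])

response = {
    "paths": [
        {"nodes": ["user:123", "topic:car-buying", "brand:toyota", "model:yaris_cross"], "score": 0.34},
        {"nodes": ["user:123", "topic:car-buying", "doc:comparison_guide"], "score": 0.28},
    ]
}
# explain_strongest_path(response)
# -> "user:123 -> topic:car-buying -> brand:toyota -> model:yaris_cross"
```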
### 5. Modulate attention
Sometimes you need to suppress or amplify attention without erasing memory.
```bash
curl -X POST http://localhost:8080/api/v1/modulate \
  -H "Content-Type: application/json" \
  -d '{
    "space": "user:123",
    "modulations": [
      { "entity": "topic:bikes", "attention": 0.1, "decay": 50 }
    ]
  }'
```
| Attention value | Effect |
|---|---|
| `0.0` | Fully suppress — entity disappears from results |
| `0.1` | Strongly suppress — 10% of normal |
| `1.0` | Normal (resets any modulation) |
| `3.0` | Triple the score |
The `decay` field controls how many events it takes for the modulation to fade to half strength. Omit it to make the modulation persistent.
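One plausible reading of the half-strength semantics is that after `n` events, the modulation's deviation from the neutral value 1.0 is scaled by `0.5^(n/decay)`. This is an assumption about the decay curve for illustration, not Caudal's documented formula:

```python
def modulated_factor(attention, decay, events_since):
    """Hypothetical fade curve: the deviation from neutral (1.0)
    halves every `decay` events; decay=None means it never fades."""
    if decay is None:
        return attention
    return 1.0 + (attention - 1.0) * 0.5 ** (events_since / decay)

# With attention=0.1 and decay=50, after 50 events the factor
# has moved halfway back toward neutral: 0.55.
```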
## Why this improves agents
Without Caudal, agent developers hand-roll recency windows, TTL-based memory, custom ranking formulas, and other ad-hoc heuristics.
With Caudal:
- Recent behavior reinforces relevance
- Old context fades naturally
- Focus emerges automatically
Caudal gives agents working memory. Vector databases give agents recall. LLMs give agents reasoning. All three together produce far more stable behavior.
## Architecture
- Core engine (pure Java): decay, reinforcement, ranking, pathways
- Server (Spring Boot): REST, auth, persistence, metrics
- PostgreSQL: event log (WAL) + periodic snapshots
Time is handled internally via discrete buckets for deterministic recovery and testing.
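The bucket idea: every event timestamp maps to a discrete index, so replaying the event log always reproduces the same state regardless of wall-clock jitter. A sketch (the 60-second bucket width is invented here):

```python
def bucket(timestamp_s, width_s=60):
    """Map a wall-clock timestamp (seconds) to a discrete bucket index."""
    return int(timestamp_s // width_s)

# Two events 10s apart land in the same bucket, so decay applied
# per bucket is deterministic under replay.
assert bucket(120.0) == bucket(130.0)
assert bucket(120.0) != bucket(185.0)
```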
## Docker image
Caudal uses Spring Boot Buildpacks to produce an OCI image — no Dockerfile needed.
```bash
# Build the image locally
mvn spring-boot:build-image -pl server -DskipTests

# Run the full stack
cd docker && docker compose --profile full up -d
```
The image is published to GHCR as `ghcr.io/caudal-labs/caudal-server`.
## Claude Code Skill
Caudal ships a Claude Code skill (`caudal-attention`) that integrates temporal attention directly into Claude's workflow.
### Install
```
/plugin install caudal-attention@caudal-skills
```
### Configure
```bash
export CAUDAL_URL=http://localhost:8080
export CAUDAL_API_KEY=your-api-key
```
Add to your project’s CLAUDE.md:
```
At the start of every conversation, invoke the `caudal-attention` skill before doing anything else.
```