Caudal

Caudal is an attention engine for AI agents. It tells an agent what matters right now.

Modern agents have tools, vector search, and long-term storage — but they still behave strangely: they fixate on stale topics, re-ask questions the user already answered, and miss when the conversation has moved on.

The missing piece is not more knowledge. It is attention.

Caudal provides a continuously evolving “attention signal” derived from real interactions. Agents send events. Caudal learns which entities and relationships are currently important, and naturally forgets what is no longer relevant.

No prompts. No fine-tuning. No heuristics. Just behavior → relevance.


What Caudal is

Caudal is not a database, a vector store, or a conversation history. It is a new layer:

System           Answers
SQL / Graph DB   What is true?
Vector DB        What is similar?
LLM              What can I reason about?
Caudal           What should I focus on now?

Caudal stores interaction traces and continuously computes a ranked relevance field with built-in forgetting.
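The actual mechanics are internal to Caudal, but the core idea (scores that rise with activity and fade without it) can be sketched with a toy exponential-forgetting model. This is an illustration of the concept, not Caudal's algorithm:

```python
def relevance(events, now, half_life=3600.0):
    """Toy relevance field: each event contributes its intensity,
    discounted by how long ago it happened (exponential forgetting).
    `events` is a list of (entity, timestamp, intensity) tuples."""
    scores = {}
    for entity, ts, intensity in events:
        age = now - ts
        scores[entity] = scores.get(entity, 0.0) + intensity * 0.5 ** (age / half_life)
    # Rank entities by current score, highest first.
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))

events = [
    ("topic:car-buying", 100.0, 2.0),   # recent, strong signal
    ("topic:cycling", -7100.0, 2.0),    # same intensity, two hours older
]
field = relevance(events, now=200.0)
# "topic:car-buying" now outranks "topic:cycling" despite equal intensity
```

The point of the sketch: recency and intensity together determine rank, and entities that stop receiving events sink naturally, with no explicit deletion step.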


Quickstart (Docker)

# Start dependencies (Postgres) + Caudal
cd docker
docker compose up -d

# Check health
curl http://localhost:8080/actuator/health

In the default dev mode, requests are accepted without authentication when AUTH_DISABLED=true.


How an agent uses Caudal

1. Write: emit events while working

Whenever the agent observes meaningful activity, it sends events:

curl -X POST http://localhost:8080/api/v1/events \
  -H "Content-Type: application/json" \
  -d '{
    "space": "user:123",
    "events": [
      {
        "src": "user:123",
        "dst": "topic:car-buying",
        "type": "chat",
        "intensity": 2.0
      },
      {
        "src": "agent:planner",
        "dst": "tool:car-comparison",
        "type": "tool_use",
        "intensity": 1.0
      }
    ]
  }'

Caudal interprets these as signals of attention, not facts.
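In an agent loop this is usually wrapped in a small helper. A minimal sketch using only the Python standard library; the endpoint and payload shape come from the curl example above, while the helper names (`make_event`, `emit`) are illustrative:

```python
import json
import urllib.request

CAUDAL_URL = "http://localhost:8080"  # adjust for your deployment

def make_event(src, dst, event_type, intensity=1.0):
    """Build one attention event in the shape /api/v1/events expects."""
    return {"src": src, "dst": dst, "type": event_type, "intensity": intensity}

def emit(space, events):
    """POST a batch of events to Caudal."""
    body = json.dumps({"space": space, "events": events}).encode()
    req = urllib.request.Request(
        f"{CAUDAL_URL}/api/v1/events",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)

batch = [make_event("user:123", "topic:car-buying", "chat", 2.0)]
# emit("user:123", batch)  # requires a running Caudal instance
```

Batching events per agent step, rather than one request per event, keeps the write path cheap.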

2. Read: ask what matters now

Before deciding what to do next, the agent queries Caudal:

curl "http://localhost:8080/api/v1/focus?space=user:123&k=5"
{
  "asOf": "2026-02-28T10:06:00Z",
  "items": [
    { "id": "topic:car-buying", "score": 0.83 },
    { "id": "topic:stroller", "score": 0.61 },
    { "id": "topic:cycling", "score": 0.22 }
  ]
}

The agent now knows what to prioritize in conversation, planning, tool selection, and retrieval.
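Turning the response into a prioritized context is a one-liner. A sketch over the sample response above; the 0.3 cutoff is an arbitrary threshold for illustration:

```python
def focus_topics(response, min_score=0.3):
    """Keep only entities scoring above a threshold, in rank order.
    `response` is the parsed JSON body of GET /api/v1/focus."""
    return [item["id"] for item in response["items"] if item["score"] >= min_score]

# Sample response from the section above.
focus = {
    "asOf": "2026-02-28T10:06:00Z",
    "items": [
        {"id": "topic:car-buying", "score": 0.83},
        {"id": "topic:stroller", "score": 0.61},
        {"id": "topic:cycling", "score": 0.22},
    ],
}
priorities = focus_topics(focus)  # ["topic:car-buying", "topic:stroller"]
```

The resulting list can seed retrieval queries or be injected into the system prompt as "current priorities".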

3. Follow associations

Agents can explore what is likely to come next:

curl "http://localhost:8080/api/v1/next?space=user:123&src=topic:car-buying&k=5"
{
  "asOf": "2026-02-28T10:06:00Z",
  "items": [
    { "id": "user:123", "score": 0.83 },
    { "id": "tool:car-comparison", "score": 0.45 },
    { "id": "topic:stroller", "score": 0.12 }
  ]
}

This enables recommendations, coherent planning, and contextual tool use.
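One concrete use: choose the next tool by filtering /next results to tool entities. A sketch over the sample response above, assuming the `tool:` prefix convention from the earlier event examples:

```python
def next_tool(response):
    """Return the highest-scoring tool entity from a /next response, or None."""
    for item in response["items"]:  # items arrive ranked by score
        if item["id"].startswith("tool:"):
            return item["id"]
    return None

# Sample response from the section above.
nxt = {
    "items": [
        {"id": "user:123", "score": 0.83},
        {"id": "tool:car-comparison", "score": 0.45},
        {"id": "topic:stroller", "score": 0.12},
    ],
}
tool = next_tool(nxt)  # "tool:car-comparison"
```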

4. Explore multi-hop pathways

For deeper associations across several steps, agents can sample pathways starting from an entity. This helps with planning, recommendations, and explaining why something is relevant.

curl -X POST http://localhost:8080/api/v1/pathways \
  -H "Content-Type: application/json" \
  -d '{
    "space": "user:123",
    "start": "user:123",
    "k": 5,
    "mode": "deep"
  }'
Mode         Description
"fast"       Fewer samples, shorter walks — low latency
"balanced"   Good default (used when mode is omitted)
"deep"       More samples, longer walks — thorough

{
  "asOf": "2026-02-28T10:06:00Z",
  "topEntities": [
    { "id": "topic:car-buying", "score": 0.72 },
    { "id": "brand:toyota", "score": 0.51 },
    { "id": "model:yaris_cross", "score": 0.44 }
  ],
  "paths": [
    { "nodes": ["user:123","topic:car-buying","brand:toyota","model:yaris_cross"], "score": 0.34 },
    { "nodes": ["user:123","topic:car-buying","doc:comparison_guide"], "score": 0.28 }
  ]
}
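Because paths come back as plain node lists with scores, rendering a "why is this relevant" explanation is straightforward. A sketch over the sample response above:

```python
def explain(response):
    """Render the highest-scoring pathway as an explanation string."""
    best = max(response["paths"], key=lambda p: p["score"])
    return " -> ".join(best["nodes"])

# Sample response from the section above.
pathways = {
    "paths": [
        {"nodes": ["user:123", "topic:car-buying", "brand:toyota", "model:yaris_cross"], "score": 0.34},
        {"nodes": ["user:123", "topic:car-buying", "doc:comparison_guide"], "score": 0.28},
    ],
}
why = explain(pathways)
# "user:123 -> topic:car-buying -> brand:toyota -> model:yaris_cross"
```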

5. Modulate attention

Sometimes you need to suppress or amplify attention without erasing memory.

curl -X POST http://localhost:8080/api/v1/modulate \
  -H "Content-Type: application/json" \
  -d '{
    "space": "user:123",
    "modulations": [
      { "entity": "topic:bikes", "attention": 0.1, "decay": 50 }
    ]
  }'

Attention value   Effect
0.0               Fully suppress — entity disappears from results
0.1               Strongly suppress — 10% of normal
1.0               Normal (resets any modulation)
3.0               Triple the score

The decay field sets the number of events after which the modulation fades to half strength. Omit it to make the modulation persistent.
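How exactly the modulation fades is not specified here; one plausible reading, assumed for illustration only, is exponential relaxation toward 1.0 (normal), halving the offset every `decay` events:

```python
def effective_attention(attention, decay, events_since):
    """Assumed fade semantics: the modulation relaxes toward 1.0 (normal),
    halving its distance from normal every `decay` events."""
    if decay is None:
        return attention  # persistent modulation, never fades
    return 1.0 + (attention - 1.0) * 0.5 ** (events_since / decay)

effective_attention(0.1, 50, 0)    # ~0.1  (full suppression strength)
effective_attention(0.1, 50, 50)   # ~0.55 (halfway back to normal)
```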


Why this improves agents

Without Caudal, agent developers implement recency windows, TTL memory, custom ranking formulas, and manual heuristics.

With Caudal, that plumbing collapses into two calls: emit events as you work, query focus before you act.

Caudal gives agents working memory. Vector databases give agents recall. LLMs give agents reasoning. All three together produce far more stable behavior.


Architecture

Time is handled internally via discrete buckets for deterministic recovery and testing.
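Discrete bucketing means every timestamp maps to an integer bucket index, so replaying the same event log always reproduces the same state. A sketch of the mapping, assuming a fixed bucket width (the actual width is internal to Caudal):

```python
def bucket_index(timestamp, width=60.0):
    """Map a continuous timestamp (seconds) to a discrete time bucket.
    Deterministic: replaying identical events yields identical buckets."""
    return int(timestamp // width)

bucket_index(119.0)   # 1: everything in [60, 120) shares a bucket
bucket_index(120.0)   # 2
```

This is what makes recovery and testing deterministic: state depends only on which bucket each event fell into, not on wall-clock timing during processing.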


Docker image

Caudal uses Spring Boot Buildpacks to produce an OCI image — no Dockerfile needed.

# Build the image locally
mvn spring-boot:build-image -pl server -DskipTests

# Run the full stack
cd docker && docker compose --profile full up -d

The image is published to GHCR as ghcr.io/caudal-labs/caudal-server.


Claude Code Skill

Caudal ships a Claude Code skill (caudal-attention) that integrates temporal attention directly into Claude’s workflow.

Install

/plugin install caudal-attention@caudal-skills

Configure

export CAUDAL_URL=http://localhost:8080
export CAUDAL_API_KEY=your-api-key

Add to your project’s CLAUDE.md:

At the start of every conversation, invoke the `caudal-attention` skill before doing anything else.