The memory tool was one of Iris's earliest features. "Remember that I prefer apples over oranges." One fact, one database row, done. Simple. Too simple.
The problem with one-at-a-time
Users don't think in single facts. They dump a paragraph: "Remember that I'm allergic to shellfish, I prefer window seats, my anniversary is March 14th, and I hate sweet burgers." The old tool would store the first item and ignore the rest, or the model would make four separate tool calls, burning tokens and time on what should be a single operation. Either way, at least one fact usually ended up unstored.
Worse, there was no deduplication. Say "remember I like dark roast" twice and you'd get two identical rows. The memory table was accumulating noise.
Batch storage
The StoreMemory tool now accepts an array. Up to 25 items in one call. The model extracts multiple facts from a single message, bundles them into the memories array, and fires once. One tool call, multiple rows. The schema change was the easy part:
memories: [
  { content: "allergic to shellfish", type: "fact", category: "health" },
  { content: "prefers window seats", type: "preference", category: "travel" },
  { content: "anniversary is March 14th", type: "fact", category: "personal" },
]
If a model sends the old single-item format with content at the top level, the normaliser wraps it into an array of one. Nothing breaks.
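The backwards-compatible wrapping can be sketched like this. This is an illustrative sketch, not the actual Iris code: the names `MemoryItem` and `normaliseMemories` are assumptions, as is the exact shape of the legacy input.

```typescript
// Hypothetical types and names; the real tool schema may differ.
interface MemoryItem {
  content: string;
  type: "fact" | "preference";
  category: string;
}

// Accepts either the new batch shape ({ memories: [...] }) or the legacy
// single-item shape ({ content, type, category }) and always returns an array.
function normaliseMemories(input: unknown): MemoryItem[] {
  const obj = input as Record<string, unknown>;
  if (Array.isArray(obj.memories)) {
    // New format: enforce the documented limit of 25 items per call.
    return (obj.memories as MemoryItem[]).slice(0, 25);
  }
  if (typeof obj.content === "string") {
    // Legacy single-item format: wrap it into an array of one.
    return [obj as unknown as MemoryItem];
  }
  return [];
}
```

Downstream code then only ever deals with arrays, which is what makes "nothing breaks" cheap to guarantee.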
Deduplication
Before persisting each item, the tool checks for an exact case-insensitive match against existing memories. Duplicates get skipped, not rejected. The response tells the model exactly what happened: MEMORY_STORED count=2 skipped=1 failed=0. The presentation layer turns that into "Got it, 2 memories saved. 1 duplicate skipped." Clean feedback, no ambiguity.
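The skip-don't-reject behaviour is roughly this. A hedged sketch only: `storeMemories` and the in-memory list standing in for the database are assumptions; the response string format is taken from the text above.

```typescript
// Illustrative sketch; the real tool persists to a database, not an array.
interface StoreResult {
  stored: number;
  skipped: number;
  failed: number;
}

function storeMemories(existing: string[], incoming: string[]): StoreResult {
  // Exact, case-insensitive comparison against what's already persisted.
  const seen = new Set(existing.map((m) => m.trim().toLowerCase()));
  let stored = 0;
  let skipped = 0;
  for (const item of incoming) {
    const key = item.trim().toLowerCase();
    if (seen.has(key)) {
      skipped++; // duplicate: skipped, not rejected
    } else {
      seen.add(key);
      existing.push(item);
      stored++;
    }
  }
  return { stored, skipped, failed: 0 };
}

// The machine-readable response the model receives.
function formatStoreResponse(r: StoreResult): string {
  return `MEMORY_STORED count=${r.stored} skipped=${r.skipped} failed=${r.failed}`;
}
```

Note that dedup also runs within a single batch: once an item is stored, a later identical item in the same call counts as a duplicate.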
Smarter shortcut detection
The direct handler (the fast path that bypasses the LLM entirely for simple "remember that X" messages) got sharper. It now catches more trigger phrases: "remember my", "remember I", "save this as memory". But it also knows when to step aside. If the message contains "store these", "break them down", "all of", or similar multi-fact signals, it falls through to the LLM so the model can properly extract and batch the individual facts.
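The gate can be pictured as a simple two-list check. The phrase lists below come from the text; the function name and exact matching logic are illustrative assumptions, not the real handler.

```typescript
// Trigger phrases that suggest a simple single-fact "remember X" message.
const TRIGGERS = ["remember that", "remember my", "remember i", "save this as memory"];

// Signals that the message contains multiple facts and needs LLM extraction.
const MULTI_FACT_SIGNALS = ["store these", "break them down", "all of"];

// Returns true only when the direct handler should take the fast path
// and store the fact without invoking the LLM at all.
function useDirectHandler(message: string): boolean {
  const m = message.toLowerCase();
  const triggered = TRIGGERS.some((t) => m.includes(t));
  const multiFact = MULTI_FACT_SIGNALS.some((s) => m.includes(s));
  return triggered && !multiFact; // multi-fact messages fall through to the LLM
}
```

The asymmetry is deliberate: a false negative just means one extra LLM round-trip, while a false positive would store a mangled multi-fact blob as a single memory.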
Intent routing
The ToolIntentDecisionService now has dedicated memory signals. Phrases like "remember that", "save to memory", "what do you remember about me" get classified with a memory_action intent and routed to the right tool: store_memory for writes, list_memories for reads. Previously, memory-related messages relied entirely on the model's tool selection. Now the routing layer gives a strong hint before the model even sees the message.
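The write/read split can be sketched as a keyword pass. This is a deliberately simplified stand-in; the real ToolIntentDecisionService is richer than this, and the signal lists beyond the phrases quoted above are assumptions.

```typescript
// Hypothetical shape of a routing hint; null means "no hint, let the model decide".
type MemoryIntent =
  | { intent: "memory_action"; tool: "store_memory" | "list_memories" }
  | null;

const WRITE_SIGNALS = ["remember that", "save to memory"];
const READ_SIGNALS = ["what do you remember about me", "list my memories"];

function classifyMemoryIntent(message: string): MemoryIntent {
  const m = message.toLowerCase();
  if (WRITE_SIGNALS.some((s) => m.includes(s))) {
    return { intent: "memory_action", tool: "store_memory" };
  }
  if (READ_SIGNALS.some((s) => m.includes(s))) {
    return { intent: "memory_action", tool: "list_memories" };
  }
  return null; // no memory signal: fall back to the model's own tool selection
}
```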
Structured responses
Every memory operation returns a machine-readable string: MEMORY_STORED count=3 skipped=0 failed=0 on success, MEMORY_STORE_FAILED reason="All items failed to persist." on failure. The presentation layer parses these into human-friendly messages. This is the same pattern used for reminders, calendar events, and emails. Consistent contract, consistent UX.
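The parsing side of that contract might look like the following. The response strings are the ones quoted above; the regexes, function name, and exact human-friendly wording are assumptions for illustration.

```typescript
// Sketch of the presentation-layer parser for memory tool responses.
function presentMemoryResponse(raw: string): string {
  const ok = raw.match(/^MEMORY_STORED count=(\d+) skipped=(\d+) failed=(\d+)$/);
  if (ok) {
    const count = Number(ok[1]);
    const skipped = Number(ok[2]);
    let out = `Got it, ${count} ${count === 1 ? "memory" : "memories"} saved.`;
    if (skipped > 0) {
      out += ` ${skipped} duplicate${skipped === 1 ? "" : "s"} skipped.`;
    }
    return out;
  }
  const fail = raw.match(/^MEMORY_STORE_FAILED reason="(.*)"$/);
  if (fail) {
    return `Couldn't save that: ${fail[1]}`;
  }
  return raw; // unrecognised string: pass through rather than guess
}
```

Because the contract is a flat string rather than JSON, the same parser pattern can be reused across reminders, calendar, and email with only the prefix and fields changing.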
What changed in practice
Before: "Remember these five things about me" resulted in one memory stored and four lost, or five slow sequential tool calls. After: one tool call, five rows, duplicates caught, clear confirmation. The memory system went from a notepad to a proper intake pipeline.
Next step is fuzzy deduplication. "I prefer spicy chicken" and "I like hot chicken wings" are essentially the same memory. Exact string matching catches the obvious duplicates, but semantic similarity would catch the rest. That's a retrieval problem, and the skill graph's hybrid scoring might have something to teach here.
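As a speculative sketch of what that check could look like, here is a token-overlap (Jaccard) similarity gate. This is not the planned implementation; a real version would likely use embeddings, and the 0.6 threshold is an arbitrary assumption.

```typescript
// Jaccard similarity over lowercase word tokens: |A ∩ B| / |A ∪ B|.
function jaccard(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\W+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  const inter = Array.from(ta).filter((t) => tb.has(t)).length;
  const union = new Set(Array.from(ta).concat(Array.from(tb))).size;
  return union === 0 ? 0 : inter / union;
}

// A candidate is a fuzzy duplicate if it is similar enough to any
// existing memory. Threshold of 0.6 is illustrative, not tuned.
function isFuzzyDuplicate(candidate: string, existing: string[], threshold = 0.6): boolean {
  return existing.some((e) => jaccard(candidate, e) >= threshold);
}
```

Token overlap would miss the spicy-chicken example entirely (almost no shared words), which is exactly why the paragraph above points at semantic similarity rather than surface matching.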