Private Beta - Closed Access

My life, orchestrated by AI agents.

A personal AI system where specialist agents research, write, generate media, manage your calendar, and coordinate with each other. Built for one user. Not a SaaS.

Iris is a personal project in active development. It is not open to the public and there are no plans to offer signups. This page exists to share the journey of building it.

Multi-Agent Orchestration

Agents delegate to specialists, run tasks in parallel, and return coordinated results with full trace visibility.

Streaming Chat

Real-time conversations with thinking mode, agent switching mid-thread, document reading, and web browsing.

Media Generation

Generate images and videos through AI providers. Queued jobs with progress tracking, delivered in-chat or via Telegram.

Skills and Memory

Persistent memory across sessions. Reusable skills that auto-activate based on context. The assistant learns and remembers.

Telegram Integration

Full two-way Telegram bot. Chat, get reminders, receive generated media, and run commands from your phone.

Automations

Scheduled agent tasks, webhook-triggered actions, calendar sync, and smart home control running in the background.

Iris in action

AI-generated demo clips exploring how Iris works and how interacting with my personal AI system could feel. Iris coordinates specialist agents, remembers context, and helps me manage everyday work.

Thinking Partner

A conversation that actually thinks

A visual exploration of Iris as a thinking partner. The system understands context, reasons about problems, and coordinates specialist agents when working through tasks with me.

Ambient Presence

An AI that is always available

A concept visualisation of Iris as a constant presence in my environment. Ready to respond, coordinate tasks, and surface useful information whenever I need it.

Command Interface

A central interface for my agents

A cinematic visualisation of Iris as a central interface where conversations, agents, research, memory, and generated media come together in one place.

Personal AI

Built for my life

Iris is not a product or a service. It is a personal AI system I am building for myself that remembers context, coordinates specialist agents, and helps organise everyday work.

These videos are AI-generated visualisations created to explore ideas and interactions around Iris while the real system continues to evolve.

Not a chatbot. One central place that manages my life.

Chatbots answer questions. Iris is something different - an AI system that knows me, manages things for me, and actively works to improve my life. Think less "ChatGPT but self-hosted" and more "a second brain that can actually do things."

The end state looks something like this: I wake up and Iris has already checked my calendar, noticed a conflict, drafted an email to reschedule, and is waiting for my approval before sending it. My fitness agent remembers I skipped legs on Wednesday and adjusts the rest of the week. The research agent has a briefing ready on a topic I asked about yesterday because it took longer than a quick answer. All of this happened proactively, overnight, without me asking.

That is the vision. We are not there yet. But every piece I am building points in that direction.

Read the full story

The approach behind Iris

Not a wrapper around ChatGPT. A full AI runtime built from scratch on Laravel, with provider abstraction, recursive agent orchestration, and queue-driven execution. Built almost entirely with AI coding assistants.

01

AI-Assisted Development

Claude Code and Codex handle implementation. I prompt, review, and oversee - nothing goes in without approval. One developer steering AI agents, production-grade output, weeks instead of months.

02

Provider Strategy Pattern

AI providers are swappable contracts. Chutes AI is the primary (genuinely great service, not affiliated). Ollama for local development. Adding a new provider means implementing one interface.
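
A minimal TypeScript sketch of the pattern described here; the contract and class names (`ChatProvider`, `ChutesProvider`, `OllamaProvider`) are illustrative stand-ins, since the real contracts are Laravel interfaces:

```typescript
// Illustrative sketch only - the real contracts live in the Laravel backend.
interface ChatProvider {
  readonly name: string;
  complete(prompt: string): Promise<string>;
}

class ChutesProvider implements ChatProvider {
  readonly name = "chutes";
  async complete(prompt: string): Promise<string> {
    // A real implementation would call the Chutes AI HTTP API here.
    return `[chutes] ${prompt}`;
  }
}

class OllamaProvider implements ChatProvider {
  readonly name = "ollama";
  async complete(prompt: string): Promise<string> {
    // A real implementation would talk to a local Ollama daemon.
    return `[ollama] ${prompt}`;
  }
}

// Callers depend only on the contract, so adding a provider means
// implementing one interface and registering it.
const providers = new Map<string, ChatProvider>();
providers.set("chutes", new ChutesProvider());
providers.set("ollama", new OllamaProvider());

async function ask(providerName: string, prompt: string): Promise<string> {
  const provider = providers.get(providerName);
  if (!provider) throw new Error(`Unknown provider: ${providerName}`);
  return provider.complete(prompt);
}
```

The registry is the only place that knows concrete classes; everything downstream sees the contract.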

03

Recursive Agent Delegation

Agents call other agents through tool use. The orchestration context tracks depth, prevents cycles, enforces time budgets, and merges traces. Sub-agents run in isolated conversations but report back to the parent.
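
The guard rails can be sketched like this (a TypeScript illustration; field names and limits are assumptions, not the shipped Laravel code):

```typescript
// Rough sketch of delegation guard rails: cycle detection via a visited set,
// a configurable depth limit, and a shared time budget.
class OrchestrationContext {
  constructor(
    readonly visited: Set<string> = new Set(),
    readonly depth: number = 0,
    readonly maxDepth: number = 3,
    readonly deadline: number = Date.now() + 60_000, // per-run time budget
  ) {}

  // Checked before every delegation: cycles, depth, then time budget.
  canDelegateTo(agent: string): { ok: boolean; reason?: string } {
    if (this.visited.has(agent)) return { ok: false, reason: "cycle" };
    if (this.depth >= this.maxDepth) return { ok: false, reason: "depth" };
    if (Date.now() >= this.deadline) return { ok: false, reason: "budget" };
    return { ok: true };
  }

  // A child context inherits the visited set, so cycles are caught anywhere
  // in the call tree, while the deadline is shared by the whole run.
  child(agent: string): OrchestrationContext {
    const visited = new Set<string>();
    this.visited.forEach((a) => visited.add(a));
    visited.add(agent);
    return new OrchestrationContext(visited, this.depth + 1, this.maxDepth, this.deadline);
  }
}
```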

04

Laravel AI SDK + Prism

Streaming and multi-step tool execution handled by the SDK. The generator yields text deltas while Prism manages tool call loops internally. We layer orchestration, memory injection, and skill matching on top.

05

Queue-First Architecture

Sub-agent batches, media generation, reminders, and automations all run through Horizon-managed queues. The HTTP stream stays alive while background work dispatches and resolves asynchronously.

06

Inertia v2 + React 19

Full SPA experience without a separate API layer. Server-side routing, shared validation, deferred props. The backend defines the contract and the frontend mirrors it.

07

Single-Server Deployment

One Laravel app, one Redis instance, Horizon for queue topology. Deployed on Forge. No microservices, no Kubernetes. Simple until it needs to not be.

08

Cloud + Local Hybrid

Chutes AI for production, Ollama for local development and privacy-sensitive workflows. Same agents, tools, and orchestration regardless of where the model runs. Develop offline, deploy to cloud.

Roadmap

What's shipped, what's being built, and what's coming next.

Shipped does not mean done. Everything listed here is actively maintained and continuously improved as we learn more about what works in practice.

Shipped (31)

Streaming Chat and Agent System

Multi-provider streaming with tool use, specialist agents with custom prompts and models, thinking mode visibility, agent switching mid-conversation. **Room to grow**: structured output mode for agents returning JSON/tables, conversation branching (fork a thread to explore alternatives), and client-side token usage display per message.

Shipped

Multi-Agent Orchestration

Hierarchical agent orchestration with supervisor pattern. Synchronous sub-agent delegation with cycle detection via visited-set tracking, configurable depth limits, and per-agent time budgets. Fan-out/fan-in parallel execution via Bus batches with automatic sequential fallback when queue infrastructure is unavailable. Orchestration context propagation through the full call tree. **Room to grow**: persistent orchestration plans that survive disconnections, agent capability negotiation (agents declaring what they can and cannot do before accepting a task), and cost-aware routing that factors token spend into delegation decisions.

Shipped

Direct Streaming Delegation

Stream passthrough architecture replacing store-and-forward. Sub-agent output streams directly to the user's browser via SSE, eliminating the parent's second inference pass. TTFT reduced from 15 seconds to 4 seconds. Out-of-band signalling via inline instruction injection tells the parent the content was already delivered. Works for both single delegation and fan-out parallel patterns.

Shipped

Fault-Tolerant Agent Takeover

Circuit breaker with fallback pattern for LLM orchestration. Parent agent takes ownership when specialists fail, implementing graduated degradation: total fallback when all agents error, partial degradation when some succeed. Each failure carries structured metadata (agent name, ID, status: error/timeout/rate_limit/budget_exhausted). Takeover directives use inline instruction injection to override the parent's default behaviour. The leader handles it, no apologies.
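
The graduated degradation logic might look roughly like this (the status values come from the description above; the types and directive strings are an illustrative sketch):

```typescript
// Sketch of graduated degradation when specialists fail.
type AgentResult =
  | { agent: string; status: "ok"; output: string }
  | { agent: string; status: "error" | "timeout" | "rate_limit" | "budget_exhausted" };

type Degradation = "none" | "partial" | "total";

function classify(results: AgentResult[]): Degradation {
  const failed = results.filter((r) => r.status !== "ok").length;
  if (failed === 0) return "none";
  return failed === results.length ? "total" : "partial";
}

// The parent's takeover behaviour follows the classification: on total
// failure it answers the whole request itself; on partial failure it keeps
// what succeeded and covers only the gaps.
function takeoverDirective(results: AgentResult[]): string {
  switch (classify(results)) {
    case "none":
      return "merge sub-agent outputs";
    case "partial":
      return "use successful outputs, answer the failed parts yourself";
    case "total":
      return "answer the whole request yourself";
  }
}
```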

Shipped

Inline Sub-Conversation Panels

Full observability for the delegation chain. Correlation ID linking from parent message to sub-agent conversation. Collapsible inline panels with progressive disclosure. Structured trace data including tool calls, tool results, token usage, TTFT, and total duration per agent. Span collection via orchestration context for distributed-tracing-style visibility across the agent call graph.

Shipped

Memory System

Persistent memories with shared and agent-specific scoping. Pinned facts, expiring context, AI tools for reading and writing memories. CRUD interface at /memories. Batch storage for multi-fact intake with deduplication and structured responses. **Room to grow**: automatic memory extraction from conversations (the AI notices important facts and saves them without being asked), fuzzy/semantic deduplication, memory conflict resolution, and semantic search across memories using the embedding pipeline.

Shipped

Agent Builder

Create specialist agents with custom system prompts, model selection, provider choice, per-agent tool allowlists, voice assignment, routing keywords, sort order, default status, guardrails, and handoff rules. Tool allowlists scope each agent to exactly the tools it needs, reducing model reasoning overhead and preventing irrelevant tool selection. Voice and routing keyword fields feed into the voice pipeline and semantic routing layer respectively. Form-based CRUD on the agents page. **Room to grow**: agent versioning (roll back to a previous prompt), prompt A/B testing, agent templates and one-click cloning, and a testing sandbox where you can trial an agent before publishing.

Shipped

Skill Registry

Reusable knowledge blocks with trigger-based auto-matching. Skills inject into agent context when the user's prompt matches. CRUD at /skills.

Shipped

Skill Graphs

Connected knowledge maps that give Iris structured, relevant context for every conversation. Nodes represent knowledge (skills, personal claims, workflows, references), edges represent relationships created automatically from wikilinks. Hybrid retrieval (keyword + semantic + type scoring) with graph traversal and budget constraints. Interactive visual editor with force-directed layout, type filtering, and minimap. Full graph and node CRUD. Shadow mode for safe testing. Retrieval logging for analytics. Built with Claude and Codex.
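
The hybrid scoring and budget constraint could be sketched like this (the weights, type weights, and greedy selection are assumptions for illustration, not the shipped configuration):

```typescript
// Sketch of hybrid retrieval scoring under a token budget.
interface GraphNode {
  title: string;
  type: "skill" | "claim" | "workflow" | "reference";
  keywordScore: number;  // lexical match against the prompt, 0..1
  semanticScore: number; // embedding similarity, 0..1
  tokens: number;        // cost of injecting this node into context
}

// Type scoring: some node kinds are worth more context budget than others.
const TYPE_WEIGHT: Record<GraphNode["type"], number> = {
  skill: 1.0,
  workflow: 0.9,
  claim: 0.8,
  reference: 0.6,
};

function score(n: GraphNode): number {
  return (0.4 * n.keywordScore + 0.6 * n.semanticScore) * TYPE_WEIGHT[n.type];
}

// Greedy selection under a token budget: best-scoring nodes first.
function selectContext(nodes: GraphNode[], budget: number): GraphNode[] {
  const picked: GraphNode[] = [];
  let spent = 0;
  for (const n of [...nodes].sort((a, b) => score(b) - score(a))) {
    if (spent + n.tokens <= budget) {
      picked.push(n);
      spent += n.tokens;
    }
  }
  return picked;
}
```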

Shipped

Image Generation

AI image generation through Chutes provider. Queued jobs with progress tracking, S3 storage, expiration cleanup, delivered in-chat and via Telegram.

Shipped

Video Generation

Provider-agnostic video pipeline with queued jobs, progress tracking, and S3 delivery. Contract-based architecture for adding new video providers.

Shipped

Video Analysis Pipeline

Multimodal video analysis with hybrid execution: synchronous for small files, queued jobs for large files, and conversation delivery on completion. Telegram-hosted video attachments auto-queue for reliability. Provider agnostic.

Shipped

Reminders

Time-based reminders with recurrence support. Delivered through the app and Telegram. AI tools for creating, listing, and updating reminders.

Shipped

Batch Proposals

Review-before-commit workflow for multi-item operations. When tools detect 3+ items (reminders, memories, tasks), they build a proposal instead of executing immediately. Per-item approve, skip, and edit controls on both web and Telegram. Execution respects individual item choices with item-level error tracking. 24-hour expiry with automatic purge. Inline chat rendering on web, inline keyboards with per-item buttons on Telegram.
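
The decision point can be sketched in a few lines (the 3-item threshold and 24-hour expiry come from the description above; the types are hypothetical):

```typescript
// Sketch of the review-before-commit decision for multi-item tool calls.
interface ProposalItem {
  description: string;
}

type ToolAction =
  | { kind: "execute"; items: ProposalItem[] }
  | { kind: "propose"; items: ProposalItem[]; expiresAt: number };

function planBatch(items: ProposalItem[], nowMs: number): ToolAction {
  // Fewer than 3 items run immediately; 3 or more become a proposal the
  // user reviews per item, expiring after 24 hours.
  if (items.length < 3) return { kind: "execute", items };
  return { kind: "propose", items, expiresAt: nowMs + 24 * 60 * 60 * 1000 };
}
```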

Shipped

Telegram Bot

Two-way Telegram integration. Receive messages, send responses, deliver generated media as photos/videos, slash command registration, reminder notifications. Channel-aware output transformation (adapter pattern) strips orchestration markers for clean prose delivery. Heartbeat-pattern typing indicators during delegation windows. Real-time streaming via edit-in-place loop. Per-item batch proposal actions with inline keyboards: approve, skip, confirm, cancel individual items before execution. **Room to grow**: conversation threading in Telegram groups, file uploads forwarded into the agent pipeline.

Shipped

Live Web Research and Data Tools

Expanded from basic page fetch to full research workflow tools: web search, news search, image search, open page, click extracted links, and find exact text patterns in opened pages. Added live market quotes (equity, fund, index, crypto), sports standings/schedules, UTC-offset time lookup, and weather forecast support. Existing website fetch, document reading, and screenshot tools remain in place.

Shipped

Smart Rate-Limit and Fallback Handling

Resilience layer for external tool calls with retry/backoff + jitter, Retry-After handling, provider cooldown windows on 429, and ordered provider fallback chains. Current default web search order is DuckDuckGo first, then Brave, then Serper. Finance now falls back across providers for both traditional and crypto assets.
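
A sketch of how retry, Retry-After handling, and the ordered fallback chain fit together (the delays, cap, and `SearchProvider` shape are illustrative assumptions):

```typescript
// Sketch of the resilience layer for external tool calls.
interface SearchProvider {
  name: string;
  search(query: string): Promise<string[]>;
}

class RateLimited extends Error {
  constructor(readonly retryAfterMs?: number) {
    super("rate limited");
  }
}

// Exponential backoff with equal jitter, capped.
function backoffDelay(attempt: number, baseMs = 250, capMs = 8000): number {
  const exp = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return exp / 2 + Math.random() * (exp / 2);
}

async function searchWithFallback(
  providers: SearchProvider[],
  query: string,
  maxAttempts = 3,
): Promise<string[]> {
  for (const provider of providers) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return await provider.search(query);
      } catch (e) {
        if (attempt + 1 === maxAttempts) break; // exhausted: try next provider
        // Honour an explicit Retry-After when the provider sends one,
        // otherwise fall back to exponential backoff with jitter.
        const waitMs =
          e instanceof RateLimited && e.retryAfterMs !== undefined
            ? e.retryAfterMs
            : backoffDelay(attempt);
        await new Promise((resolve) => setTimeout(resolve, waitMs));
      }
    }
    // Falls through to the next provider in the ordered chain
    // (e.g. DuckDuckGo, then Brave, then Serper for web search).
  }
  throw new Error("all providers failed");
}
```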

Shipped

Parallel Tool Batching

Added a parallel tool execution entrypoint that can run multiple read-only tool calls in one step and return grouped results. Uses Laravel concurrency with sync fallback when process parallelism is unavailable.
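
In TypeScript terms, the parallel-with-fallback shape looks something like this (a sketch only; the real entrypoint uses Laravel's concurrency support on the backend):

```typescript
// Sketch of grouped parallel tool execution with a sequential fallback
// for when process parallelism is unavailable.
type Tool = (input: string) => Promise<string>;

interface ToolCall {
  tool: Tool;
  input: string;
}

async function runBatch(
  calls: ToolCall[],
  parallel: boolean,
): Promise<{ input: string; output: string }[]> {
  if (parallel) {
    // Fan out every read-only call at once and gather grouped results.
    return Promise.all(
      calls.map(async (c) => ({ input: c.input, output: await c.tool(c.input) })),
    );
  }
  // Sequential fallback: identical results, one call at a time.
  const results: { input: string; output: string }[] = [];
  for (const c of calls) {
    results.push({ input: c.input, output: await c.tool(c.input) });
  }
  return results;
}
```

Because both paths return the same grouped shape, callers never need to know which mode actually ran.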

Shipped

MCP Runtime Tools

Added MCP runtime capabilities to list resources, list resource templates, and read resources from configured MCP servers. Includes transport-level resilience and normalized outputs for agent consumption.

Shipped

Workflow Meta Tools

Added explicit plan update tooling (step/status tracking) and local image-path viewing/analysis. This closes major parity gaps with hosted coding-agent workflows.

Shipped

Automations

Scheduled agent tasks with interval, daily, weekly, and one-time modes. Multi-channel output (app, Telegram). Run history with status tracking.

Shipped

Google Calendar Integration

OAuth-based calendar sync. AI tools for listing upcoming events and creating new ones directly from conversations.

Shipped

Operations Dashboard

System health monitoring. Database, Redis, Horizon queue status. Agent task tracking, orchestration traces, failed job visibility.

Shipped

Voice Input/Output

Speech-to-text via browser Web Speech API and server-side transcription (Chutes, ElevenLabs). Text-to-speech with chunked streaming synthesis, prefetch, and markdown stripping. Self-healing conversation loop with auto-restart on silence, error recovery, and noise filtering. Tap-to-interrupt stops speech and resumes listening. 28 English voices (UK and US) via KokoroVoice enum, fetched from backend API. Configurable providers and voices per provider. **Room to grow**: per-agent voice selection, emotion-aware synthesis (adjust tone based on content), multilingual support, and speaker diarisation for multi-person input.

Shipped

Realtime Voice Mode and Barge-In

Feature-flagged realtime voice mode with always-on listening and hands-free barge-in across web and Apple Watch. Speaking during assistant playback cancels audio and model streaming immediately and returns the session to listening - no tap required. Web barge-in uses WebRTC VAD. Watch barge-in uses RMS level detection on the AVAudioEngine microphone tap with a consecutive-frame threshold to distinguish deliberate speech from speaker bleed. Hardware AEC via `.voiceChat` session mode cleans up most of the echo on watch hardware before the level check runs. Added Groq and Deepgram as fast STT options, Deepgram Aura voices for TTS, and an updated voice selector with realtime/legacy mode toggle. **Room to grow**: adaptive VAD sensitivity per device, session-level voice analytics for turn-taking quality.

Shipped

Voice & LLM Provider Expansion (Groq, Deepgram, Inworld AI)

Expanded the AI stack with additional provider gateways and unified provider registration across voice and language services. Added Groq to support both speech-to-text and text-to-speech, while also being available as a high-performance LLM provider. Deepgram continues to provide transcription and synthesis through its Aura voice family, and Inworld AI has been added as an additional TTS provider. Provider capability validation, API configuration payloads, and frontend selectors were updated so supported features (LLM, STT, TTS) are surfaced dynamically in the UI instead of relying on hardcoded assumptions. **Room to grow**: dynamic provider benchmarking per device/network profile, adaptive fallback routing, and real-time provider health scoring exposed directly in voice and model settings.

Shipped

Chat Markdown Renderer

Rich markdown rendering with rehype-highlight syntax colouring, heuristic block-vs-inline code detection for react-markdown v9 compatibility, and valid DOM nesting (Postel's law approach to streaming partial parse output). Copyable tables with DOM traversal to TSV export. Progressive disclosure via hover-to-reveal controls.
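
The block-vs-inline heuristic can be sketched as a small predicate (hypothetical logic, not the exact shipped component):

```typescript
// Sketch of a heuristic distinguishing block code from inline code when
// react-markdown v9 no longer passes an explicit "inline" flag.
function isBlockCode(codeText: string, className?: string): boolean {
  // Fenced blocks usually arrive with a language class such as "language-ts".
  if (className !== undefined && className.startsWith("language-")) return true;
  // Multi-line content is block code even without a language tag.
  if (codeText.includes("\n")) return true;
  // Everything else renders as an inline code span.
  return false;
}
```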

Shipped

Ollama Local Provider

Full Ollama integration for running models locally. Same provider abstraction as cloud - develop offline, test with local models, keep data on your machine. OllamaRoutingStrategy handles all routing transparently.

Shipped

Intelligent Agent Routing

Semantic pre-routing layer that auto-selects the best specialist agent before the LLM runs. Embeds user queries against pre-computed agent capability profiles (Qwen3-Embedding-8B, 4096 dimensions), then blends semantic similarity, historical performance, keyword matching, and recency into a confidence score. Self-learning feedback loop tracks routing outcomes and adjusts agent scores via exponential moving averages. Falls back to the default agent when confidence is low. **Room to grow**: per-user routing preferences, multi-turn context awareness (route based on conversation history rather than just the first message), A/B testing between routing strategies, dynamic threshold adjustment based on the user's override rate, and full agent performance profiling - p50/p95 latency, TTFT distributions, token cost per delegation, failure mode classification, with underperforming agent detection and model swap recommendations feeding back into routing weights.
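
The confidence blend and the EMA feedback update can be sketched like this (the weights, threshold, and alpha are illustrative, not the tuned values):

```typescript
// Sketch of the routing blend and self-learning feedback loop.
interface AgentProfile {
  name: string;
  semantic: number;    // cosine similarity against the capability embedding, 0..1
  performance: number; // historical success score (EMA-updated), 0..1
  keyword: number;     // keyword match score, 0..1
  recency: number;     // recent-use score, 0..1
}

function confidence(p: AgentProfile): number {
  return 0.5 * p.semantic + 0.25 * p.performance + 0.15 * p.keyword + 0.1 * p.recency;
}

// Pick the best-scoring agent, falling back to the default when confidence is low.
function route(profiles: AgentProfile[], threshold = 0.45): string {
  const best = profiles.reduce((a, b) => (confidence(b) > confidence(a) ? b : a));
  return confidence(best) >= threshold ? best.name : "default";
}

// Self-learning feedback: exponential moving average over routing outcomes.
function updatePerformance(previous: number, success: boolean, alpha = 0.2): number {
  return (1 - alpha) * previous + alpha * (success ? 1 : 0);
}
```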

Shipped

Local Machine Access

When running on a local Mac, Iris can read and write files, search file contents with ripgrep, run shell commands, manage Apple Notes and Apple Reminders (iCloud-synced), list directories, and send native macOS notifications. A lightweight Node.js bridge runs natively on the Mac. Docker containers reach it via `host.docker.internal`. Works in chat and voice mode. Iris reminders auto-sync to Apple Reminders on creation. The bridge has path-safe file access, a command allowlist, and bearer-token auth. **Room to grow**: local coding sessions with Git worktree management, branch creation, and PR opening without GitHub - a fully local coding agent flow.

Shipped

Native Mobile App (Personal)

Shipped as a practical personal mobile build using Capacitor + WebView, running on my phone. This gives me native packaging, push-capable foundations, and mobile access to the full app without waiting for a full App Store product cycle. This is intentionally a personal setup, not a distribution-first mobile strategy. I may publish to the App Store later, but it is not a priority right now because this already covers what I need day to day.

Shipped

Apple Watch Voice App

Native watchOS app for hands-free voice sessions directly from the wrist. Raise wrist, tap start, talk to Iris. No phone required once paired. Hardware acoustic echo cancellation via AVAudioSession `.voiceChat` mode prevents the speaker from bleeding into the microphone. Barge-in works the same way as the web client: speak during assistant playback and the session interrupts immediately. A 400ms grace period after audio ends stops the microphone from reopening into the playback tail.

The visual indicator is a compact animated waveform bar strip - 13 bars with phase-offset animation, pattern changing by state. Listening ripples gently. Speaking goes tall and fast. Thinking sweeps. The full orb would eat 40% of the screen; the waveform uses 28 points and leaves room for transcript, status, and controls without anything getting pushed off screen.

Transcripts show the active speaker only, switching with a crossfade. Sentence-boundary detection resets the visible chunk at each full stop so text never accumulates beyond the fixed two-line zone. After Iris finishes, her last sentence stays visible until you speak again. Session history persists back to the conversation thread. Switch to iPhone or web afterwards and everything is there.

Gesture controls: double tap (watchOS 11+) starts or stops the session from anywhere on screen. Swipe up mutes or unmutes the microphone - requires a clearly vertical stroke so arm movements and accidental swipes do not trigger it. The Digital Crown adjusts playback volume while Iris is speaking, with haptic feedback at both ends of the range. Crown input is ignored in all other states so it does not interfere with navigation.
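
The consecutive-frame RMS check used for barge-in can be sketched like this (a TypeScript illustration of the technique; the watch implementation is Swift on the AVAudioEngine tap, and the real threshold and frame count are tuned on hardware):

```typescript
// Sketch of RMS level detection with a consecutive-frame threshold, used to
// separate deliberate speech from speaker bleed.
function rms(frame: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < frame.length; i++) sum += frame[i] * frame[i];
  return Math.sqrt(sum / frame.length);
}

class BargeInDetector {
  private consecutive = 0;

  constructor(
    private readonly levelThreshold = 0.05, // RMS above this counts as speech
    private readonly framesRequired = 5,    // must persist across N frames
  ) {}

  // Feed one microphone frame; returns true once speech has persisted long
  // enough to be treated as deliberate barge-in rather than speaker bleed.
  push(frame: Float32Array): boolean {
    if (rms(frame) >= this.levelThreshold) {
      this.consecutive++;
    } else {
      this.consecutive = 0; // a quiet frame resets the run
    }
    return this.consecutive >= this.framesRequired;
  }
}
```

Requiring several consecutive loud frames is what keeps a single click or echo spike from cancelling playback.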

Shipped
In Progress (4)

Advanced Memory Architecture

Evolving from basic CRUD to a three-layer system: session context (auto-captured), long-term recall (consolidated), and pinned facts (permanent). Automatic memory extraction from conversations is the next step.

In Progress

Realtime Voice Conversations

Provider-backed voice agent sessions across web and Apple Watch. Strict settings-first WebSocket startup, keep-alive handling, tool-call bridging, and voice-to-chat message persistence for continuity when switching between voice and text. URL-safe voice responses, in-chat media continuity (image and video previews persist into text mode after voice turns), and watch session history that syncs back to the conversation thread. Current hardening focus: lower-latency turn transitions, better long-session stability, and tighter audio playback quality across device sample rates.

In Progress

Visual Agent Builder

Moving beyond form-based agent creation to a visual interface. Goal: drag-and-drop tool selection, prompt templates, testing sandbox, and one-click cloning. No code required.

In Progress

MCP and Tooling Hardening

Next hardening pass: migrate MCP server definitions from env JSON to a database-backed registry with per-server metadata, credential management, and enable/disable controls; add provider health metrics and caching for repeated web/data lookups.

In Progress
Planned (12)

Multi-Channel Messaging

Expand beyond Telegram to WhatsApp, Discord, iMessage, and email. Same conversation threads across all channels. Adapter pattern already in place from Telegram - each new channel is a new adapter, not a rewrite.

Planned

Frigate NVR Integration

Home security powered by AI. Frigate handles real-time object detection from cameras (people, cars, animals, packages). Iris adds the intelligence layer: contextual alerts ("delivery driver at front door, Amazon package expected today"), anomaly detection ("unfamiliar car parked outside for 20 minutes"), and time-aware escalation (silent logging during the day, immediate phone alert at 3am). Event clips stored and searchable.

Planned

Raspberry Pi Voice Hub

Dedicated always-listening device with wake word detection. Sits on the desk or in the kitchen. "Hey Iris, what's the weather?" without reaching for a phone. The Star Trek computer, in my house.

Planned

Phone Call Handling

Iris answers calls when I'm busy, takes messages, screens spam, schedules callbacks. Summarises missed calls and voicemails. "He's not available right now, can I take a message?" Voice models are fast enough. Latency is low enough. It's an engineering problem now, not a research one.

Planned

Smart Home Deep Integration

Beyond basic Home Assistant tool calls. Device pairing UI, event-driven automations triggered by sensor data, scene management, and proactive alerts. "Iris, I'm heading to bed" triggers the full chain: lights, thermostat, alarm, morning briefing.

Planned

Telegram Full Sync

Full conversation sync between web and Telegram. Start a chat on web, continue on Telegram. Inline keyboards, rich formatting, file sharing both ways.

Planned

Conversation Intelligence

Export conversations. Search across all threads. Usage analytics, cost tracking per provider, agent performance profiling, and automatic conversation summarisation.

Planned

MCP Server Registry and Governance

Move from static env-configured MCP endpoints to managed server records (per-user or per-workspace), encrypted auth fields, priority/order controls, and audit logs for resource access.

Planned

Web/Data Caching and Cost Controls

Add response caching and deduplication for search/news/finance/sports/weather calls with TTL policies, stale-while-revalidate, and request budget controls to reduce latency and provider spend.

Planned

Email Management

Move beyond send-only email. Gmail inbox display, email threading, draft management, and AI-assisted email composition with context from memories and conversations.

Planned

Autonomous Coding Agent

An agent that writes code, runs tests, opens PRs, reviews its own diffs, and deploys. Routine changes go through autonomously. Complex or risky changes ping me for approval. Merge on approval, deploy to staging, run smoke tests, promote to production. The full pipeline from "fix this bug" to "it's live."

Planned

Automated Shopping and Ordering

Shopper agent that compares prices, finds deals, and places orders on my behalf. "We're low on coffee" triggers a reorder. "Find noise-cancelling headphones under 200" returns ranked options. One tap approval from watch or phone.

Planned

The story so far

Notes on what we built, how we built it, and the decisions along the way.

Pinned Vision Jan 15, 2026

Why I'm Building Iris AI

We're living through something special. The AI and LLM space is moving faster than anything I've seen in my career. Every week there's a new model, a new capability, a new way of thinking about what's...

Read more
Engineering Mar 22, 2026

Iris on the Wrist

The idea is simple. Raise my wrist, talk to Iris, get an answer. No phone, no laptop, no context switch. The kind of thing that should take a day to build and then just work. It did not take a day....

Read more
Engineering Mar 20, 2026

Giving Iris Access to My Mac

The gap that kept bothering me was this: I am talking to an assistant that lives on my Mac, and it cannot see my Mac. It cannot open a file. It cannot check what git branch I am on. It cannot create...

Read more
Engineering Mar 15, 2026

Workflow Routing, Virtual Workers, and Faster Multi-Agent Runs

Iris already had multi-agent orchestration. The problem was that it behaved more like a very capable manager with no operations playbook. The main agent could delegate. It could even fan out to multip...

Read more
Voice Mar 9, 2026

Replacing the Voice Pipeline With a Single WebSocket

Every voice feature I have shipped until now has been three separate things bolted together: STT to turn speech into text, an LLM to think, and TTS to turn the answer back into audio. That architectu...

Read more
Voice Mar 7, 2026

Realtime Voice Barge-In - From Awkward Turns to Natural Conversation

This refactoring was not just about adding providers. It was about removing ambiguity in voice mode. I used to ask a simple question: "Why does voice still feel awkward even after latency work?" Th...

Read more
Engineering Mar 3, 2026

Shipping Video Analysis - What Broke, What Changed, Why

Iris had image analysis. Iris had video generation. What it did not have was video understanding. I kept sending clips and asking "what is happening here?" and the system had no proper pipeline for t...

Read more
Engineering Feb 26, 2026

Batch Proposals - Review Before You Commit

When Iris creates a single reminder, there's no friction. "Remind me to call the dentist tomorrow at 11" - done, one row, instant confirmation. But real usage doesn't look like that. It looks like: "R...

Read more
Integration Feb 26, 2026

Per-Item Proposal Actions on Telegram

Batch proposals shipped with a proper review UI on the web. Inline cards, per-item approve/skip/edit buttons, the full treatment. Telegram got the short end: a numbered list and two buttons - "Confirm...

Read more
Engineering Feb 25, 2026

Giving Each Agent Exactly What It Needs

Agents used to be a name, a system prompt, and a model choice. Everything else was inherited. Every agent got the same number of tools, no voice identity, and no way to hint to the routing layer what...

Read more
Engineering Feb 25, 2026

Teaching Iris to Remember Properly

The memory tool was one of Iris's earliest features. "Remember that I prefer Apple over Orange." One fact, one database row, done. Simple. Too simple. The problem with one-at-a-time: Users don't thin...

Read more
Integration Feb 25, 2026

Streaming AI Responses to Telegram in Real Time

The web UI streams token by token. You see words forming as the model thinks. Telegram didn't do any of that. You'd send a message, watch "typing..." for 15 seconds, then get the entire response dumpe...

Read more
Process Feb 23, 2026

Skill Graphs - Reflections on Building a Knowledge Engine With AI

The skill graph system is live. Time to step back and look at the bigger picture. How we built this. Let me be upfront: I did not write this. Not a single line. The idea was mine. The architecture d...

Read more
Content Feb 22, 2026

Skill Graphs - Filling the Shelves With Starter Content

An empty graph is not very useful. Today was about creating starter content - pre-built graphs that demonstrate the system's potential and give Iris something meaningful to work with from day one. I...

Read more
Engineering Feb 20, 2026

Skill Graphs - Graph Management and Validation Hardening

The graph visualisation works beautifully, but there was a gap: you could view and edit nodes, but you could not create new graphs, rename them, or delete them from the interface. Wrote up a quick pla...

Read more
Engineering Feb 18, 2026

Skill Graphs - Wiring Up the API and Visual Editor

Backend is solid. Today was about building the API layer and the visual interface - making the skill graph something you can actually see and interact with. Claude handled the API controllers, form r...

Read more
Engineering Feb 17, 2026

Skill Graphs - Building the Foundation

Architecture designed yesterday, building today. Handed the architecture doc to Claude and said "build it." This is where having AI collaborators genuinely shines - the tedious part of translating a d...

Read more
Architecture Feb 16, 2026

Skill Graphs - Designing the Brain Map

Sat down with Claude today and laid out the vision. Described the problem, shared arscontexta's post, and asked it to help me design the technical architecture. The central question: when someone asks...

Read more
Architecture Feb 14, 2026

Skill Graphs - The Spark That Started It All

Came across a fascinating post by @arscontexta today about using knowledge graphs to make AI agents smarter. It hit differently because it described the exact problem I have been wrestling with in Iri...

Read more
Architecture Feb 12, 2026

Intelligent Agent Routing - How Iris Picks the Right Agent in 40ms

When you type "build me a Laravel model with a factory and migration", you mean the Engineer. When you say "research AI trends in healthcare", you want the Researcher. Previously, Iris relied on eithe...

Read more
Philosophy Feb 11, 2026

Why I Built My Own Instead of Using OpenClaw 🦞

Let me start by saying this clearly: OpenClaw is genuinely brilliant. If you've used it and haven't starred the repo, go do that. Peter Steinberger built something in a weekend that went from 9,000 to...

Read more
AI Feb 10, 2026

Prompt Engineering That Actually Works - Part 2, Systems and Testing

This is Part 2. If you haven't read Part 1, go do that first - it covers the five foundational techniques (negative constraints, instruction placement, inline instructions, imperative framing, and few...

Read more
AI Feb 8, 2026

Prompt Engineering That Actually Works - Part 1, The Fundamentals

Iris runs its specialist agents on open-source models - 30B to 235B to 1T parameters, hosted through Chutes AI. These aren't frontier models. They don't have the raw reasoning capacity of Claude Opus...

Read more
UI Feb 6, 2026

Fixing Chat Rendering - Hydration Errors, Broken Tables, and Phantom Code Blocks

Three rendering bugs in the chat markdown component, all interconnected, all caused by the same root issue: react-markdown v9 quietly changed how it passes information to custom components, and our co...

Read more
Integration Feb 5, 2026

Cleaning Up Multi-Agent Output for Telegram

The multi-agent pipeline was built for the web UI first. Telegram was an afterthought, and it showed. When a specialist finished its work and the parent relayed the response, the Telegram message inc...

Read more
UI Feb 4, 2026

Inline Sub-Conversations - Making Delegation Transparent

When the parent agent delegated to a specialist, the work happened in a completely separate conversation record. The user saw the final output but had no visibility into what actually happened. What p...

Read more
Engineering Feb 2, 2026

Parent Takeover - When Specialists Fail, the Leader Steps In

This was the problem that bothered me most. You'd ask Iris to delegate a question to a specialist. The specialist would fail - maybe the AI provider returned a 503 (server overloaded), the model timed...

Read more
Engineering Jan 31, 2026

Stop the Parrot - Teaching the Parent Agent Not to Repeat Everything

After shipping direct streaming, the parent agent developed an annoying habit. The specialist would stream a detailed research analysis directly to the user. Then the parent would write three paragrap...

Read more
Performance Jan 30, 2026

Direct Streaming - Cutting Sub-Agent Response Time from 15 Seconds to 4

When you asked Iris to delegate a question to a specialist agent, you'd wait 15 seconds in silence. The answer was fine when it arrived. The wait was not. What was actually happening: Picture a trans...

Read more
Voice Jan 29, 2026

Voice Mode Reliability Overhaul - Making It Actually Work for More Than Two Minutes

Yesterday's post was about latency. Today's is about reliability. The chunked TTS system made voice mode feel fast, but it would silently die after a few minutes. You'd finish speaking, wait, and noth...

Read more
Performance Jan 27, 2026

Fixing Voice Mode Latency - From 30 Second Waits to Conversational Flow

Iris has a voice mode. You tap the orb, speak, and it responds with audio. The problem was it felt like talking to someone on a really bad phone connection - you'd finish speaking, wait half a minute,...

Read more
Architecture Jan 25, 2026

Provider Routing - How Iris Picks the Right AI Model

Iris does not talk to one AI provider. It routes requests across multiple providers, picks the right model for the job, and fails over automatically when things go wrong. The problem: Different model...

Read more
Engineering Jan 24, 2026

How AI Tools Work - Turning Language Into Action

Language models are good at understanding and generating text. But text alone cannot set a reminder, fetch a webpage, generate an image, or send an email. That is where tools come in. What is a tool...

Read more
Infrastructure Jan 23, 2026

Why Chutes AI Is My Primary Provider (And Ollama for Local)

Iris is provider-agnostic by design - the routing layer can switch between providers mid-conversation. But every system needs a default, and mine is Chutes AI. Full disclosure: I'm not affiliated wit...

Read more
Process Jan 21, 2026

Building Software With AI - Claude, Codex, and Steering the Ship

Here's something I want to be transparent about: Iris is built with AI coding assistants. Claude Code and OpenAI's Codex do the heavy lifting on implementation. My job is to oversee, prompt, review, a...

Read more
Feature Jan 19, 2026

Teaching Iris New Skills Without Code

One of the core ideas behind Iris is that the assistant should get better over time without deploying new code. That's what the skill registry does. A skill is a chunk of knowledge or instructions wi...

Read more
Architecture Jan 18, 2026

The Stack - Laravel AI, React, and Why Not Next.js

A common question: why Laravel instead of a Node/Python stack for an AI-heavy app? Laravel AI SDK + Prism gives us streaming, tool use, multi-step reasoning, and provider abstraction out of the box....

Read more
Engineering Jan 17, 2026

Building Multi-Agent Orchestration from Scratch

The biggest technical challenge so far: getting AI agents to delegate work to each other. Most AI frameworks treat agents as isolated units. You pick a model, give it tools, and chat. But what happen...

Read more