Process

Building Software With AI - Claude, Codex, and Steering the Ship

January 21, 2026

Here's something I want to be transparent about: Iris is built with AI coding assistants. Claude Code and OpenAI's Codex do the heavy lifting on implementation. My job is to oversee, prompt, review, and make sure the right things go into the project.

I don't blindly trust what the AI produces. I'm the overseer. I give direction through prompts, review what comes back, reject what doesn't fit, and steer it until the output is right. Think of it like managing a team of very fast, very capable developers who need clear guidance and constant review.

What I do:

  • Give direction - describe what to build, what constraints matter, what the end state should look like
  • Review everything - nothing goes into the project without me reading it and approving it
  • Push back - reject over-engineering, catch mistakes, say "no, not like that"
  • Prioritise - decide what matters now vs what can wait
  • Prompt effectively - getting good output from AI is a skill: describing what you want, with the right context, at the right level of detail
  • Debug by describing - when something breaks, I describe the symptoms and the AI traces through the code

What the AI does:

  • Write the implementation code across files
  • Generate tests, migrations, and boilerplate
  • Refactor existing code when I describe the target
  • Trace through complex call chains when debugging
  • Produce structured output like release notes from commit history

Claude Code does most of the implementation work. It reads the codebase, understands the patterns, and produces code that fits. When something breaks in production at midnight, I describe the error and Claude traces through the streaming pipeline to find the root cause. That actually happened - it identified that PHP-FPM's request_terminate_timeout was killing long AI requests, something that would have taken me hours to trace manually.
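
For anyone who hits the same failure mode, the fix lives in the PHP-FPM pool config. A minimal sketch - the path and value below are illustrative, not Iris's actual settings:

```ini
; Illustrative PHP-FPM pool config (e.g. /etc/php/8.3/fpm/pool.d/www.conf).
; request_terminate_timeout hard-kills any request that runs longer than
; this at the FPM level, even where PHP's own max_execution_time doesn't
; fire - which is exactly what cuts off long streaming AI responses.
request_terminate_timeout = 300

; Whatever sits in front of PHP-FPM needs a matching timeout, or it will
; drop the connection first (e.g. fastcgi_read_timeout in nginx).
```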

Codex is great for batch operations and comparing states across many commits.

The workflow:

  1. I decide what to build next (the "what" and "why")
  2. I describe the feature, constraints, and how it should fit the existing architecture
  3. The AI writes the implementation, tests, and migrations
  4. I review, push back on over-engineering, and course-correct
  5. We iterate until it's right

The key insight: AI coding assistants don't replace oversight. They replace typing. The hard part was never writing a foreach loop - it was knowing what to ask for, recognising when the output is wrong, and having the judgement to accept or reject what comes back. That's still my job. Always will be.

Some numbers for perspective: 152 files changed, 6,139 insertions in the last development cycle. Multi-agent orchestration, parallel job execution, streaming infrastructure, Telegram integration, media pipelines. All production-grade, all tested. Built in weeks by one person steering AI tools effectively.

The meta-irony isn't lost on me: I'm using AI to build an AI system. But that's exactly the future I'm excited about.