This demo script explains how an agentic development product should present its value.
The point is not to perform "AI writes code quickly." The point is engineering flow: task input, context reading, change execution, verification feedback, and a clear handoff back to the human.
Demo rhythm.
The first half shows how a small change is decomposed. The second half shows how browser checks, tests, and the final report become a trustworthy delivery.
Make the workflow clear with numbers and contrasts.
Talking about "engineering rigor" in the abstract isn't convincing, so for this part I use one real, small change as a yardstick and lay out the scale of all four stages.
6
files changed
incl. 2 tests
14
context reads
files + terminal + browser
9
checks run
half lint, half unit tests
Profiling the scale of one mid-sized change
Beyond scale, it is more useful to look at where the time goes. Line up the four stages side by side and you'll see that the real time sink was never "writing code"; it is reading context and verifying.
Not every task should run the same flow. Sort them by risk and verifiability and you'll know which ones you can let go of and which ones need a human watching.
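The sorting described above can be sketched as a small rule. This is only an illustration: the `Task` fields, the 1-to-3 risk scale, and the threshold are assumptions made for the example, not part of the product.

```python
from dataclasses import dataclass
from enum import Enum


class Oversight(Enum):
    HANDS_OFF = "agent can run unattended"
    SUPERVISED = "a human should watch"


@dataclass(frozen=True)
class Task:
    name: str
    risk: int         # hypothetical scale: 1 (low) to 3 (high)
    verifiable: bool  # True if automated checks can confirm the result


def classify(task: Task, max_hands_off_risk: int = 1) -> Oversight:
    """Let a task go only when it is both low-risk and verifiable."""
    if task.verifiable and task.risk <= max_hands_off_risk:
        return Oversight.HANDS_OFF
    return Oversight.SUPERVISED


if __name__ == "__main__":
    tasks = [
        Task("rename a helper covered by unit tests", risk=1, verifiable=True),
        Task("tweak copy with no automated check", risk=1, verifiable=False),
        Task("change the payment retry logic", risk=3, verifiable=True),
    ]
    for task in tasks:
        print(f"{task.name}: {classify(task).value}")
```

The rule is deliberately conservative: a task that fails either condition goes back to a human, so low risk alone is not enough to run unattended.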
After classification comes model choice: the task's difficulty decides which tier to use, rather than reaching for the most expensive one every time.
| Model | Input | Cached | Output | Context |
|---|---|---|---|---|
| codex-mini (low-risk batch) | 2.0 | 0.5 | 8.0 | 256K |
| codex-std (everyday changes) | 9.0 | 2.2 | 36.0 | 512K |
| codex-pro (cross-file refactors) | 22.0 | 5.5 | 88.0 | 1M |
¥ / million tokens · Match the model to task difficulty
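To make the price gap between tiers concrete, here is a minimal sketch of a per-task cost estimate built on the table's prices. The names `Tier`, `TIERS`, and `estimate_cost`, the difficulty keys, and the token counts are hypothetical, and treating "Cached" as the price of cached input tokens is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    input_price: float   # ¥ per million input tokens
    cached_price: float  # ¥ per million cached tokens (assumed: cached input)
    output_price: float  # ¥ per million output tokens


# Prices copied from the table; the difficulty keys are illustrative labels.
TIERS = {
    "low-risk batch": Tier("codex-mini", 2.0, 0.5, 8.0),
    "everyday changes": Tier("codex-std", 9.0, 2.2, 36.0),
    "cross-file refactors": Tier("codex-pro", 22.0, 5.5, 88.0),
}


def estimate_cost(tier: Tier, input_tokens: int, cached_tokens: int,
                  output_tokens: int) -> float:
    """Return the estimated cost in ¥ for one task on the given tier."""
    total = (input_tokens * tier.input_price
             + cached_tokens * tier.cached_price
             + output_tokens * tier.output_price)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical token usage for one task, priced on every tier.
    for difficulty, tier in TIERS.items():
        cost = estimate_cost(tier, input_tokens=200_000,
                             cached_tokens=800_000, output_tokens=50_000)
        print(f"{tier.name} ({difficulty}): ¥{cost:.2f}")
```

With this made-up usage, the same task costs about ¥1.20 on codex-mini and about ¥13.20 on codex-pro, which is why the tier should follow the task's difficulty.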
Models keep iterating, and so does this flow. Here are a few of the milestones that brought it to where it is today.
- 2026-02: read-only context (understand first, then speak)
- 2026-04: replayable evidence chain (terminal + browser join the flow)
- 2026-06: verification-driven delivery (current)
How the workflow's capabilities evolved
One sentence to wrap up this section.
Stitch the whole flow together and it's about pushing one task all the way to a verifiable change.