How It Works
You provide a simple prompt to our beautiful command line interface:
“Design a landing page for a sustainable fashion brand.”
From there:
PM Agent writes a brief — outlining goals, tone, and design requirements.
Design Agent designs and builds the page — full HTML/CSS/JS, rendered locally.
PM Agent reviews the work: the PM runs the result through Eyequant, a visual-attention API that predicts where users will look, how clear the layout is, and how exciting it feels. Think of it like test-driven development for design — quantitative guardrails that keep iteration focused on improving what matters.
Feedback Loop: the PM interprets those scores, and delivers feedback to the Design Agent.
Iteration: This loop repeats until the Design Agent meets the threshold criteria set by the PM.
Each run produces an interactive report showing visual scores, heatmaps, and side-by-side evolution — a tangible record of two digital minds working through creative tension.
Notes from Building
Creativity is a team Effort.
After receiving a steady stream of boring, templated slop we adjusted our Designer’s prompt to make it more creative. To our our surpise it still produced safe, symmetrical layouts. The problem wasn’t the designer — it was the PM’s prescriptive specs (“hero section, CTA, three-column grid”). Once we rewrote the PM to express emotions instead of structures (“make them feel the chaos before they see the solution”), the results became genuinely original.
That tiny change unlocked everything. Suddenly, the designs felt alive — surprising, sometimes chaotic, occasionally brilliant. The creativity wasn’t in the pixels; it was in the permission.
Feedback Needed Memory.
Early on, each round overwrote the last — new changes undid old improvements. By feeding the Design Agent all prior feedback, it began to build context and compound progress. It learned to preserve wins while fixing what was still off — the same pattern we see in effective human teams.
Metrics Give Meaning.
Eyequant’s attention scores gave both agents a shared language for improvement. Instead of “make it pop,” they had clarity and excitement scores to guide iteration. Like unit tests in code, these metrics turned subjective taste into measurable learning loops.
Permission to Fail Interesting.
The breakthrough wasn’t better tooling — it was psychological safety. Once the PM stopped penalizing risk and started rewarding curiosity, the system got bolder, stranger, and better. The same is true for us: teams do their best work when they’re free to surprise themselves.
MATT began as an AI design experiment. It ended up reminding us how human creativity actually works — iterative, emotional, and rooted in trust.