SoleMate: The Shoe Store with a Personal Shopping Assistant
It’s an online shoe store where an AI agent actually calls you when you’re struggling to find what you want.
Browse normally, hit some dead ends with filters, and our SoleMate agent notices you’re stuck and gives you a ring.
Then you have a natural conversation while watching the AI control the page in real time: setting filters, showing products, and adding items to your cart based on what you tell it you want.
It’s like having a knowledgeable salesperson who can see your screen and help you navigate, except they’re available 24/7 and never try to upsell you on shoe cleaner.
Notes from Building
Voice plus real-time page control creates genuinely collaborative shopping. ElevenLabs’ Agent Tools let our agent actually control web pages during phone calls: clicking buttons, setting filters, and adding items to the cart based on natural conversation. A back-and-forth phone call while you watch the product page respond to your requests feels far closer to actual person-to-person interaction than any chatbot. Plus, the AI only needs minimal instructions like “Add that to cart” or “What’s in size 10?” to be effective, because it already has the context of the page and the conversation.
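To make the page-control idea concrete, here is a minimal sketch of how a page might route an agent's tool calls to actions. It is not ElevenLabs' API and not our production code: the tool names (`setFilter`, `addToCart`), the `ToolCall` shape, and the in-memory `PageState` are all hypothetical stand-ins for whatever the real integration and DOM look like.

```typescript
// Hypothetical page state; a real store would update the DOM or a UI framework instead.
interface PageState {
  filters: Record<string, string>;
  cart: string[];
}

// Hypothetical shape of a tool call arriving from the voice agent.
interface ToolCall {
  name: string;
  args: Record<string, string>;
}

type ToolHandler = (state: PageState, args: Record<string, string>) => string;

// Each handler changes the page state and returns a short result
// that the agent could read back to the caller.
const toolHandlers = new Map<string, ToolHandler>([
  [
    "setFilter",
    (state, args) => {
      state.filters[args.key] = args.value;
      return `filter ${args.key} set to ${args.value}`;
    },
  ],
  [
    "addToCart",
    (state, args) => {
      state.cart.push(args.productId);
      return `added ${args.productId} to cart`;
    },
  ],
]);

// Look up the handler by tool name and run it; unknown tools are reported, not thrown.
function dispatchToolCall(state: PageState, call: ToolCall): string {
  const handler = toolHandlers.get(call.name);
  if (handler === undefined) {
    return `unknown tool: ${call.name}`;
  }
  return handler(state, call.args);
}
```

In this sketch, a request like “What’s in size 10?” would reach the page as something like `{ name: "setFilter", args: { key: "size", value: "10" } }`. Returning a string from every handler gives the agent something to say back, including when it asks for a tool the page doesn't have.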
Trigger-based AI feels less annoying. We don’t want the agent calling everyone who visits the site. Instead, it triggers when someone hits “no products found” multiple times or seems stuck browsing. It’s like having a salesperson who knows when to approach versus when to let you browse in peace.
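A trigger like this can be sketched as a small counter over recent “no products found” events. This is an illustration under assumptions, not our exact logic: the class name, the threshold of three empty results, and the two-minute window are all made up for the example, and it leaves out the fuzzier “seems stuck browsing” signal.

```typescript
// Hypothetical detector: decides when the agent should offer a call.
class StuckShopperDetector {
  private emptyResultTimes: number[] = [];
  private alreadyTriggered = false;
  private readonly threshold: number;
  private readonly windowMs: number;

  // Defaults are assumptions for illustration: 3 empty results within 2 minutes.
  constructor(threshold: number = 3, windowMs: number = 120000) {
    this.threshold = threshold;
    this.windowMs = windowMs;
  }

  // Record one "no products found" event at time nowMs (milliseconds).
  // Returns true exactly once, when the shopper first looks stuck.
  recordEmptyResult(nowMs: number): boolean {
    if (this.alreadyTriggered) {
      return false; // never call the same visitor twice in one session
    }
    this.emptyResultTimes.push(nowMs);
    // Keep only the events that fall inside the sliding window.
    this.emptyResultTimes = this.emptyResultTimes.filter(
      (t) => nowMs - t <= this.windowMs
    );
    if (this.emptyResultTimes.length >= this.threshold) {
      this.alreadyTriggered = true;
      return true;
    }
    return false;
  }
}
```

The page would call `recordEmptyResult(Date.now())` each time a search or filter combination comes back empty, and start the call when it returns `true`. The once-per-session guard is what keeps the agent on the “let you browse in peace” side of the line.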
Multi-modal AI agents are the future of interfaces. This SoleMate experiment demonstrates a shopping use case, but the idea goes beyond that. The combination of voice input, visual feedback, and direct page control creates a new way of interacting with any web application. There’s a not-so-far-away future with agents that control Monday.com, Figma files, or Tableau while you talk to them naturally.
This thing is powerful. It’s real-time audio. It’s an LLM handling the context of the conversation. It’s Claude spinning up an e-commerce UI in a flash. It’s a knowledge base the agent can access anytime. It’s RAG for fetching the inventory. It’s a small, scrappy experiment, but it’s already got our wheels turning. What else can we build by putting these tools together?