How it works
🎸 Guitar Tuner – Just like that pedal you have but in your browser
⏱️ Metronome – Set your BPM, time signature, even swing
📝 Word Saver – Capture words as you think of them
🎤 Recorder – High-quality audio recording and playback
🤖 Voice Assistant – Voice-controlled interface using ElevenLabs
Everything works together through natural language commands. Say “set BPM to 120” or “add word catastrophe” or “start recording” and watch it happen.
Notes from Building
Voice + tools is powerful. Last week we started to scratch the surface, using ElevenLabs and similar tools to add items to a cart and filter products. This time we tried the same approach on a full dashboard: set_metronome_bpm, start_metronome, stop_metronome, set_metronome_time_signature, add_word, start_recording, stop_recording, start_tuner, stop_tuner. Our voice assistant can call them all.
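To make the tool list concrete, here is a rough sketch of how those tool names could be routed to dashboard actions once the voice agent decides to call one. The tool names come from the list above; everything else (the `DashboardState` shape, the parameter names `bpm`, `time_signature`, and `word`, and the `runTool` helper) is our assumption for illustration, not the app's actual code or the ElevenLabs API.

```typescript
// Hypothetical dashboard state; the real app's state shape is not shown here.
interface DashboardState {
  bpm: number;
  timeSignature: string;
  metronomeRunning: boolean;
  recording: boolean;
  tunerRunning: boolean;
  words: string[];
}

type ToolParams = Record<string, unknown>;
// Each handler mutates state and returns a short reply for the agent to speak.
type ToolHandler = (state: DashboardState, params: ToolParams) => string;

function createDashboardState(): DashboardState {
  return {
    bpm: 100,
    timeSignature: "4/4",
    metronomeRunning: false,
    recording: false,
    tunerRunning: false,
    words: [],
  };
}

// Tool names match the list above; parameter names are assumptions.
const toolHandlers = new Map<string, ToolHandler>([
  ["set_metronome_bpm", (state, params) => {
    const bpm = Number(params.bpm);
    if (!Number.isFinite(bpm) || bpm < 20 || bpm > 300) {
      return "BPM must be between 20 and 300";
    }
    state.bpm = Math.round(bpm);
    return `BPM set to ${state.bpm}`;
  }],
  ["set_metronome_time_signature", (state, params) => {
    const signature = String(params.time_signature ?? "");
    if (!/^\d+\/\d+$/.test(signature)) return "Time signature should look like 3/4";
    state.timeSignature = signature;
    return `Time signature set to ${signature}`;
  }],
  ["start_metronome", (state) => { state.metronomeRunning = true; return "Metronome on"; }],
  ["stop_metronome", (state) => { state.metronomeRunning = false; return "Metronome off"; }],
  ["add_word", (state, params) => {
    const word = String(params.word ?? "").trim();
    if (word === "") return "No word given";
    state.words.push(word);
    return `Saved ${word}`;
  }],
  ["start_recording", (state) => { state.recording = true; return "Recording"; }],
  ["stop_recording", (state) => { state.recording = false; return "Stopped recording"; }],
  ["start_tuner", (state) => { state.tunerRunning = true; return "Tuner on"; }],
  ["stop_tuner", (state) => { state.tunerRunning = false; return "Tuner off"; }],
]);

// Look up the handler by name and run it; unknown tools get a short error reply.
function runTool(state: DashboardState, name: string, params: ToolParams = {}): string {
  const handler = toolHandlers.get(name);
  if (handler === undefined) return `Unknown tool: ${name}`;
  return handler(state, params);
}
```

With this in place, "set BPM to 120" would end up as something like `runTool(state, "set_metronome_bpm", { bpm: 120 })`. Keeping the replies to a few words is deliberate: whatever the handler returns is what the agent is likely to read back.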
Always interesting to find what the LLM just can’t do. This time it was the tuner: Claude couldn’t build it. Pitch detection isn’t a language problem, it’s a signal processing problem that requires specialized algorithms. Sometimes you just gotta do it yourself (or google a library like aubio.js that does it).
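For a sense of what "signal processing problem" means here, below is a minimal sketch of one classic approach: autocorrelation pitch detection. This is not necessarily what our tuner or aubio does internally; the function names, the 60 to 1000 Hz search range, the silence threshold, and the 0.9 peak threshold are all assumptions picked for illustration. The idea is to slide the signal against itself, find the first lag where it lines up well again, and read that lag as the period of the note.

```typescript
// Estimate the fundamental frequency (Hz) of a mono buffer via autocorrelation.
// Returns null for silence or when no clear periodic peak is found.
function detectPitch(
  samples: Float32Array,
  sampleRate: number,
  minHz = 60,   // assumed lower bound, below a guitar's low E (about 82 Hz)
  maxHz = 1000, // assumed upper bound for fundamentals
): number | null {
  const n = samples.length;
  let energy = 0;
  for (let i = 0; i < n; i++) energy += samples[i] * samples[i];
  // Skip empty or near-silent buffers (RMS threshold is an arbitrary choice).
  if (n === 0 || Math.sqrt(energy / n) < 0.01) return null;

  const minLag = Math.max(1, Math.floor(sampleRate / maxHz));
  const maxLag = Math.min(Math.ceil(sampleRate / minHz), n - 2);
  if (maxLag - minLag < 2) return null;

  // Correlation of the signal with itself at each candidate lag,
  // normalized by the number of overlapping samples.
  const corr = new Float64Array(maxLag + 1);
  let globalMax = 0;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let sum = 0;
    for (let i = 0; i + lag < n; i++) sum += samples[i] * samples[i + lag];
    corr[lag] = sum / (n - lag);
    if (corr[lag] > globalMax) globalMax = corr[lag];
  }
  if (globalMax <= 0) return null;

  // Take the FIRST strong local peak rather than the global maximum,
  // which helps avoid locking onto a multiple of the period (octave errors).
  for (let lag = minLag + 1; lag < maxLag; lag++) {
    const prev = corr[lag - 1];
    const cur = corr[lag];
    const next = corr[lag + 1];
    if (cur >= 0.9 * globalMax && cur > prev && cur >= next) {
      // Parabolic interpolation refines the peak between integer lags.
      const denom = prev - 2 * cur + next;
      const shift = denom !== 0 ? (0.5 * (prev - next)) / denom : 0;
      return sampleRate / (lag + shift);
    }
  }
  return null;
}

const NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

// Convert a frequency to the nearest equal-tempered note (A4 = 440 Hz)
// plus the offset in cents, which is what a tuner display needs.
function frequencyToNote(hz: number): { name: string; cents: number } {
  const midi = 69 + 12 * Math.log2(hz / 440);
  const nearest = Math.round(midi);
  const cents = Math.round((midi - nearest) * 100);
  const octave = Math.floor(nearest / 12) - 1;
  const name = NOTE_NAMES[((nearest % 12) + 12) % 12] + octave;
  return { name, cents };
}
```

In the browser, the `samples` buffer could come from a Web Audio `AnalyserNode` via `getFloatTimeDomainData`. A real guitar signal is messier than a clean tone (strong harmonics, pick noise, decay), which is a big part of why a dedicated library is the easier route.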
Short and sweet voice controls. At first the agent was WAY too verbose. We want minimal interruption. It took a few rewrites of the system prompt to get the agent to pop in and out quickly instead of starting a long, drawn-out chat while we’re trying to think and record. Hey bot, stop talking!
Word banks help. How to Write One Song is full of exercises to get you unstuck. Many involve writing down favorite words or “stealing” them from books or even overheard conversations. We added a simple word bank so you can capture ideas without putting the guitar down.
Voice-first design = accessibility practice. Designing something to work without hands is a good exercise in accessibility. We’ve seen good hands-free design before (your car is probably pretty good at it) but web tools are still clunky. Ever tried to use a screen reader? Gets tedious fast. And too many sites skip over the WCAG standards that would make them work well. This got our gears turning: what other audio interactions could make the experience smoother and more usable on the web?