Live Practice
The score listens to you play. The playhead follows. When you stop, the system audits your run and writes you a coaching note in plain language.
What it is
Practice (Follow) is a score-following surface. You play; the platform listens through your microphone, advances the score in real time so you don't have to turn pages, and at the end produces a report:
- Per-note assessment: clean, rushed, dragged, wrong, missed.
- Per-bar accuracy heatmap.
- Tempo curve: your tempo over time vs the target.
- Pitch-class coverage: for chord-heavy passages, which pitches you actually played.
- A coaching narrative in a teacher voice with concrete suggestions.
Recordings are kept with each session so you can play them back next to the score and hear what you did.
How to start
From a sight-reading passage
- Go to /library/sight-reading.
- Pick a passage. Each card has two CTAs: UNEB cadence (the 30-second-preview-then-perform flow) and Follow mode →.
- Tap Follow mode. The piece loads into Practice; grant mic permission when prompted.
- Tap Start. You get a 4-beat count-in, then play.
From your own piece
- Open the piece in FORTE Notation.
- Tap Practice in the TopRail (desktop) or in the More menu (mobile).
- Same flow: grant mic, tap Start.
Practice settings
Before you tap Start, you can adjust three knobs (locked once the run begins):
How it actually works
We don't hide the architecture, because it's relevant to interpreting the results.
Layer 1: instant cursor (monophonic)
Your microphone stream is windowed every ~23 ms (about 43 windows per second). An on-device pitch detector finds the dominant pitch in each window. When a stable pitch locks for ~70 ms, an onset event fires. The aligner consumes onsets, matches them against the next expected note, and advances the playhead. That's your cursor: the highlighted note on the staff.
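The locking rule can be sketched as follows. This is a minimal illustration, not the platform's detector: it assumes a 1024-sample hop at 44.1 kHz (which gives the ~23 ms window) and a hypothetical per-window pitch stream of MIDI note numbers, with None for silence. The names detect_onsets and LOCK_WINDOWS are invented for the example.

```python
from typing import Iterable, Iterator, Optional, Tuple

SAMPLE_RATE = 44_100   # assumed sample rate
HOP_SAMPLES = 1024     # assumed hop size; 1024 / 44100 is about 23.2 ms
WINDOW_MS = 1000 * HOP_SAMPLES / SAMPLE_RATE
LOCK_MS = 70           # stability time quoted in the text
LOCK_WINDOWS = round(LOCK_MS / WINDOW_MS)  # 3 consecutive windows


def detect_onsets(pitches: Iterable[Optional[int]]) -> Iterator[Tuple[float, int]]:
    """Yield (time_ms, midi_pitch) once a pitch has held for LOCK_WINDOWS windows."""
    current: Optional[int] = None
    run = 0
    for index, pitch in enumerate(pitches):
        if pitch is not None and pitch == current:
            run += 1
        else:
            # A new pitch (or silence) restarts the stability counter.
            current = pitch
            run = 1 if pitch is not None else 0
        if run == LOCK_WINDOWS:
            # Report the time of the first window in the stable run.
            start = index - LOCK_WINDOWS + 1
            yield (start * WINDOW_MS, pitch)
```

In this sketch a repeated note only fires a second onset if silence or another pitch separates the two; a real detector would also need an energy-based re-articulation check.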
Latency is ~100 ms end-to-end, good enough for the cursor to feel responsive. The aligner has a 3-note lookahead to handle skipped notes and a forgiveness budget (1.5 × beat duration) before it gives up on a note and marks it missed.
Layer 2: polyphonic follower (sliding window)
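A minimal sketch of the lookahead and forgiveness budget, under stated assumptions: the lookahead is read here as "the next expected note plus up to three after it", a pitch outside that window is simply marked wrong and the cursor moves on, and the class and field names are hypothetical. The real aligner may resolve these cases differently.

```python
from dataclasses import dataclass, field
from typing import List

LOOKAHEAD = 3            # notes the aligner may look past the expected one
FORGIVENESS_BEATS = 1.5  # budget before a note is marked missed


@dataclass
class Aligner:
    expected: List[int]   # score pitches as MIDI numbers
    beat_ms: float        # duration of one beat at the target tempo
    cursor: int = 0
    last_advance_ms: float = 0.0
    outcomes: List[str] = field(default_factory=list)

    def on_onset(self, time_ms: float, pitch: int) -> None:
        """Match an onset against the expected note or the lookahead window."""
        if self.cursor >= len(self.expected):
            return
        window = self.expected[self.cursor:self.cursor + 1 + LOOKAHEAD]
        if pitch in window:
            skipped = window.index(pitch)
            # Notes jumped over count as missed (assumption).
            self.outcomes.extend(["missed"] * skipped)
            self.outcomes.append("matched")
            self.cursor += skipped + 1
        else:
            self.outcomes.append("wrong")
            self.cursor += 1
        self.last_advance_ms = time_ms

    def on_tick(self, time_ms: float) -> None:
        """Give up on the current note once the forgiveness budget is spent."""
        budget = FORGIVENESS_BEATS * self.beat_ms
        if self.cursor < len(self.expected) and time_ms - self.last_advance_ms > budget:
            self.outcomes.append("missed")
            self.cursor += 1
            self.last_advance_ms = time_ms
```

At 120 bpm (a 500 ms beat) the budget is 750 ms, so a note that has not sounded 750 ms after the last advance is marked missed.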
In parallel with the instant cursor, the system runs an on-device multi-pitch detector on a 2-second sliding window every 1 second. That's a polyphonic-aware view: it finds multiple simultaneous pitches, not just one. It runs entirely in your browser (~5 MB binary, cached after first use).
The polyphonic pass doesn't override the instant cursor unless the cursor has stalled (no advance for > 1.5 × beat) and the polyphonic detector has high confidence about where you are. That keeps the cursor responsive while still rescuing the playhead when the monophonic path gets confused on a chord.
Layer 3: recording + post-session audit
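The override condition reduces to a two-part check. In the sketch below the 0.8 confidence threshold is an assumption (the text only says "high confidence"), and should_override is a hypothetical name.

```python
STALL_BEATS = 1.5            # stall threshold from the text
CONFIDENCE_THRESHOLD = 0.8   # assumed value for "high confidence"


def should_override(now_ms: float, last_advance_ms: float,
                    beat_ms: float, poly_confidence: float) -> bool:
    """Return True only when the cursor has stalled AND the polyphonic pass is confident."""
    stalled = (now_ms - last_advance_ms) > STALL_BEATS * beat_ms
    return stalled and poly_confidence >= CONFIDENCE_THRESHOLD
```

Requiring both conditions means a confident polyphonic estimate never fights a cursor that is still advancing on its own.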
While you play, your audio is captured locally (Opus in WebM, ~64 kbps). On completion the recording uploads to your account, typically ~240 KB per 30-second run. The session metadata and the per-note assessment log are saved to the database.
Then a second audit pass runs offline with the same polyphonic detector. There's no real-time pressure, so it can analyse the whole clip. The findings (per-bar accuracy, tempo curve, pitch-class coverage) are persisted to the session row so the detail page can show them without re-running the audit.
Layer 4: AI coaching narrative
Once the audit is in, the platform sends a structured summary (outcome aggregates, top trouble bars, specific wrong-note moments, timing slips, tempo curve, pitch-class gaps) to a language model. It returns a 2-sentence summary, a 1–2 paragraph narrative, and 1–4 concrete suggestions.
We auto-fetch the narrative on the stats screen and on the detail page. There's a regenerate button if the first take doesn't land.
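To make "structured summary" concrete, here is one possible shape for that payload. Every field name below is hypothetical; only the six categories come from the text. The point of the sketch is that the request is plain JSON findings with no audio attached.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class CoachingRequest:
    """Hypothetical findings payload; field names are illustrative only."""
    outcomes: Dict[str, int]              # e.g. {"clean": 40, "rushed": 3}
    trouble_bars: List[int]               # bar numbers with the lowest clean rate
    wrong_notes: List[Dict[str, object]]  # e.g. {"bar": 5, "expected": "F#4", "played": "F4"}
    timing_slips: List[Dict[str, object]]
    tempo_curve_bpm: List[float]
    pitch_class_gaps: List[str]


def to_payload(request: CoachingRequest) -> str:
    """Serialise the findings to JSON for the language-model call."""
    return json.dumps(asdict(request), sort_keys=True)
```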
Reading the metrics
The per-bar heatmap
Each bar shows clean / total notes for that bar, coloured by ratio:
- Green: ≥ 90% clean. Solid.
- Amber: 60–89% clean. Workable, with rough spots.
- Red: < 60% clean. Slow it down and revisit.
Hover (or long-press on touch) for the exact count.
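The colour bands above amount to a threshold check on the clean / total ratio. A minimal sketch, with bar_colour as a hypothetical name and integer arithmetic used to avoid floating-point edge cases at exactly 90% and 60%:

```python
def bar_colour(clean: int, total: int) -> str:
    """Map a bar's clean/total note count to a heatmap colour."""
    if total <= 0 or not 0 <= clean <= total:
        raise ValueError("need 0 <= clean <= total and total > 0")
    if clean * 100 >= 90 * total:
        return "green"
    if clean * 100 >= 60 * total:
        return "amber"
    return "red"
```

For example, a bar with 8 of 9 notes clean (about 89%) falls in the amber band, while 9 of 10 is green.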
The tempo curve
The curve is your local tempo at each matched onset, sampled across the run, smoothed with a 3-point moving average. The dashed line is your target. Shapes to watch for:
- Slowing into cadences: natural for expressive playing, but if it happens consistently across phrases it may mean you're not solid on the cadential figuration.
- Rushing through dense bars: usually anxiety; slow the metronome by 8% and try again.
- A step change in the middle: typically a section break (B-section in faster tempo). The score may not actually have a tempo change marked; you may be rushing it.
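The curve itself can be sketched as follows. Assumptions: local tempo is derived from the time between consecutive matched onsets and the number of score beats between them, and the 3-point moving average is centred, shrinking to two points at the ends. The function names are hypothetical.

```python
from typing import List, Tuple


def local_tempi(onsets: List[Tuple[float, float]]) -> List[float]:
    """Tempo in bpm between consecutive matched onsets given as (time_ms, score_beat)."""
    tempi = []
    for (t0, b0), (t1, b1) in zip(onsets, onsets[1:]):
        if t1 > t0 and b1 > b0:
            tempi.append(60_000 * (b1 - b0) / (t1 - t0))
    return tempi


def smooth3(values: List[float]) -> List[float]:
    """Centred 3-point moving average; the ends average the two available points."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - 1):i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

Smoothing trades detail for readability: a single late note shows up as a shallow dip spread over three points rather than a spike.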
Pitch-class coverage (polyphonic mode)
For polyphonic instruments, we compare the pitch classes the polyphonic detector heard during your run against the pitch classes the score expected. A high coverage score for D♭ means you covered the pitch class regularly when the score asked for it; a low score means you missed it (or played the wrong chord). Useful for piano students working on left-hand accuracy.
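One way to compute such a score, as a sketch only: for each score event, compare the expected pitch classes with those heard, then report per pitch class the fraction of expected occurrences that were actually heard. The event format and the name pitch_class_coverage are assumptions, and flats are spelled in ASCII ("Db") for simplicity.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def pitch_class_coverage(events: Iterable[Tuple[Set[int], Set[int]]]) -> Dict[str, float]:
    """events holds (expected MIDI pitches, heard MIDI pitches) per score event."""
    expected_count: Counter = Counter()
    hit_count: Counter = Counter()
    for expected, heard in events:
        heard_pcs = {pitch % 12 for pitch in heard}
        for pc in {pitch % 12 for pitch in expected}:
            expected_count[pc] += 1
            if pc in heard_pcs:
                hit_count[pc] += 1
    return {NAMES[pc]: hit_count[pc] / expected_count[pc] for pc in sorted(expected_count)}
```

Because the comparison is on pitch classes, playing the right note in the wrong octave still counts as covered.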
Coaching narrative
The AI writes a teacher-voice note based on the structured findings. Format:
- Summary: 2 short sentences.
- Narrative: 1–2 paragraphs that reference specific bars, tempos, and moments.
- Suggestions: 1–4 imperative-voice items. Each one tied to a concrete observation, not a platitude.
The model is told explicitly: warm but honest, not a cheerleader, name problems specifically. It will sometimes praise (when warranted), sometimes recommend slowing down, sometimes call out a specific wrong note. Use it as a starting point, not a verdict.
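The format above is easy to check mechanically. The sketch below is a hypothetical validator, not something the platform is documented to run; sentence counting is deliberately rough and will miscount abbreviations.

```python
import re
from typing import List


def count_sentences(text: str) -> int:
    """Rough count: runs of ., ! or ? followed by whitespace or end of text."""
    return len(re.findall(r"[.!?]+(?:\s|$)", text.strip()))


def validate_note(summary: str, narrative: str, suggestions: List[str]) -> List[str]:
    """Return a list of format problems; an empty list means the note fits the format."""
    problems = []
    if count_sentences(summary) != 2:
        problems.append("summary should be 2 sentences")
    paragraphs = [p for p in narrative.split("\n\n") if p.strip()]
    if not 1 <= len(paragraphs) <= 2:
        problems.append("narrative should be 1 to 2 paragraphs")
    if not 1 <= len(suggestions) <= 4:
        problems.append("expected 1 to 4 suggestions")
    return problems
```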
Practice history
Every saved session lives at /practice:
- Recent: last 7 days, chronological.
- By piece: runs grouped by piece with up/down/flat trend arrows. "You've practised this 6 times: clean rate is up from 64% to 88%."
Tap any row for the detail page: full audit, recording playback, audio download, delete.
Data and storage
- Recording size: ~240 KB per 30 s of practice at 64 kbps. Capped at 25 MB per session.
- Where it lives: your account's private storage, accessible only via signed requests from your authenticated session.
- What gets sent to the AI model: only the structured findings JSON. Never the audio. Never your credentials. Never your name unless you put it in your memory facts.
- Deletion: delete a session from its detail page; the row and the audio are removed. Idempotent.
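The recording-size figure follows directly from the bitrate. A quick check, assuming decimal units (1 KB = 1000 bytes) and ignoring WebM container overhead; the function names are illustrative:

```python
BITRATE_KBPS = 64   # Opus bitrate from the text
CAP_MB = 25         # per-session cap from the text


def recording_kb(seconds: float) -> float:
    """Approximate audio size in KB for a run of the given length."""
    return BITRATE_KBPS * 1000 * seconds / 8 / 1000


def max_seconds_under_cap() -> float:
    """Longest run that fits under the per-session cap at this bitrate."""
    return CAP_MB * 1_000_000 * 8 / (BITRATE_KBPS * 1000)
```

Here recording_kb(30) gives 240 KB, and the 25 MB cap works out to 3125 seconds, or roughly 52 minutes of continuous playing.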
Honest limits
- Phone microphones are noisier than tablet or desktop mics. Tone tracking in Practice is OK on phones, good on tablets, and best on a laptop with a USB mic.
- Polyphonic accuracy is decent for piano right hand + comping left hand, less reliable for dense organ textures or arrangements for two stringed instruments.
- Latency floor is the browser's; we can't go below ~80 ms in any browser today.
- Doesn't track timbre. We follow pitches, not playing technique. A coaching note may say "you stumbled at bar 5" without knowing if the stumble was a fingering error or a bow change.