Principles

These are the core design principles of the model. They are the decisions that shaped everything else. When in doubt about how to implement or extend the model, return to these.


1. The Kick Is the Single Source of Truth

The model is built on one reliably detectable element: the kick drum. Everything derives from it — Foundation measures the kick directly, Variance measures what changes around the kick, and Story is determined by the intersection of both.

This is a deliberate constraint, not a limitation. Attempting to extract multiple independent sources from a live stereo signal in real time is unreliable. By committing to one trustworthy signal, every value the model produces is grounded in something real and measurable. Accuracy comes from narrow scope.
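
A minimal sketch of what this constraint implies for the output shape (the field names below are illustrative assumptions, not a specification):

```typescript
// Illustrative output shape; every field traces back to the kick.
interface ModelOutput {
  foundation: {
    weight: number;        // how heavy the kick feels, 0.0–1.0
    presence: number;      // how clearly the kick reads in the mix, 0.0–1.0
    presenceGate: boolean; // confirmed present/absent (Principle 3)
  };
  variance: number;        // change around the kick, bar to bar, 0.0–1.0
  story: {
    name: string | null;   // from Foundation × Variance; null without a Foundation
  };
}
```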


2. Weight Is the Primary Expressive Value

Of all the values the model produces, Weight — how heavy the kick feels — is the most important. It is continuous, smooth, always available when the kick is present, and maps naturally to visual intensity.

If the entire model were reduced to a single number, it would be Weight. Visual artists should start with Weight and add complexity from there.
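
One way to follow that advice, assuming a hypothetical per-frame hook and a visual layer that exposes an intensity property:

```typescript
// Hypothetical per-frame hook: Weight drives visual intensity directly.
function onFrame(weight: number, visual: { intensity: number }): void {
  // Weight (0.0–1.0) is continuous and smooth, so it needs no extra
  // smoothing or gating before it reaches the visuals.
  visual.intensity = weight;
}
```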


3. Presence Is Felt Continuously but Breaks Discretely

Presence has two faces. The continuous value (0.0–1.0) tracks how clearly the kick is perceived in the mix, updating every frame for smooth visual responses. The binary gate fires only when the kick has been confirmed absent (or confirmed returned) for enough beats to pass Temporal Thresholds.

This dual nature is the key to a system that is both responsive (continuous value) and stable (gate). Visuals never snap jarringly because the continuous value provides smooth transitions. Stories never flicker because the gate only fires when it’s confident.
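
A sketch of the gate side, with assumed threshold values and a per-beat update; the continuous value passes through untouched, and only confirmed changes flip the gate:

```typescript
// Beat-counted hysteresis for the Presence gate. The thresholds are
// assumptions for illustration, not values the model prescribes.
class PresenceGate {
  private gateOpen = true; // true while the kick is considered present
  private counter = 0;     // consecutive beats contradicting the gate

  constructor(
    private readonly absentBelow = 0.2, // assumed: presence below this reads as absent
    private readonly confirmBeats = 4,  // assumed: beats needed to confirm a change
  ) {}

  // Called once per beat with the continuous presence value (0.0–1.0).
  onBeat(presence: number): boolean {
    const looksPresent = presence >= this.absentBelow;
    if (looksPresent === this.gateOpen) {
      this.counter = 0; // the signal agrees with the gate: reset
    } else if (++this.counter >= this.confirmBeats) {
      this.gateOpen = looksPresent; // confirmed absent (or returned): flip
      this.counter = 0;
    }
    return this.gateOpen;
  }
}
```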


4. Transitions Are Triggered by Confirmed Breaks and Shifts

Story transitions do not happen because of individual events — a single loud kick, a one-shot FX, a momentary filter sweep. They happen because a state change has persisted long enough to be confirmed as real.

Temporal thresholds filter noise from signal. A kick that drops for one beat during a DJ mix is not a breakdown. A sustain that changes for half a bar is not a character shift. Only confirmed, sustained changes trigger story transitions.
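
A sketch of that confirmation step, assuming a bar-based window; the matrix may suggest a new story every bar, but the transition fires only once the suggestion persists:

```typescript
// Candidate stories must survive a confirmation window before they
// become the current story. The window size is an assumed example.
class StoryConfirmer {
  private current: string | null = null;
  private candidate: string | null = null;
  private candidateBars = 0;

  constructor(private readonly confirmBars = 2) {}

  // Called once per bar with whatever story the matrix suggests.
  onBar(suggested: string | null): string | null {
    if (suggested === this.current) {
      this.candidate = null;      // a momentary blip reverted: discard it
      this.candidateBars = 0;
    } else if (suggested === this.candidate) {
      if (++this.candidateBars >= this.confirmBars) {
        this.current = suggested; // confirmed: the transition fires
        this.candidate = null;
        this.candidateBars = 0;
      }
    } else {
      this.candidate = suggested; // a new candidate: start counting
      this.candidateBars = 1;
    }
    return this.current;
  }
}
```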


5. Variance Explains the Difference

Two moments with identical kicks can feel completely different. One is a locked groove; the other is building tension. The kick didn’t change — everything around it did. Variance captures that difference.

Variance is always measured relative to the kick. The system doesn’t need to identify individual elements in the mix. It tracks how the aggregate spectral environment changes from bar to bar, using the kick as the fixed reference point.
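
One way to measure that, assuming per-bar spectral snapshots (band magnitudes) already aligned to the kick grid; the model itself does not prescribe the measurement:

```typescript
// Normalized spectral change between two consecutive bars.
// 0.0 = identical environment, 1.0 = completely different.
function barVariance(prevBar: number[], currBar: number[]): number {
  let diff = 0;
  let total = 0;
  for (let i = 0; i < currBar.length; i++) {
    const prev = prevBar[i] ?? 0;
    diff += Math.abs(currBar[i] - prev);
    total += currBar[i] + prev;
  }
  return total > 0 ? diff / total : 0;
}
```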


6. Stories Are Named

The model produces human-readable labels — “groove,” “escalation,” “breakdown,” “drop” — not abstract numbers. This is what makes it a language rather than a data format.

Named stories can be communicated between engineers and artists without translation. A visual artist and an audio engineer can discuss “what happens when the story is ‘escalation’” without one needing to understand the other’s technical domain.
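
In code, the vocabulary is simply a closed set of names; the set below is illustrative, since an implementation names only the stories its genre needs (Principle 7):

```typescript
// An illustrative story vocabulary; real sets are genre-specific.
type StoryName = "groove" | "escalation" | "breakdown" | "drop";

interface Story {
  name: StoryName | null; // null when no Foundation exists (Principle 8)
}
```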


7. The Matrix Is a Starting Point

The Story Matrix provides a framework for mapping Foundation × Variance to story names. It is not a rigid specification. Implementations should customize it — merge cells that feel identical, split cells that are too broad, add Sustain modifiers where they matter, and name only the stories that actually occur in the target genre.
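
A sketch of the matrix as plain data, with assumed axis buckets and arbitrary cell assignments; the point is that the mapping is a table to edit, not logic to rewrite:

```typescript
// Hypothetical buckets and placeholder cells; merge, split, and rename freely.
type FoundationLevel = "light" | "mid" | "heavy";
type VarianceLevel = "stable" | "building" | "volatile";

const storyMatrix: Record<FoundationLevel, Record<VarianceLevel, string>> = {
  light: { stable: "groove", building: "escalation", volatile: "breakdown" },
  mid:   { stable: "groove", building: "escalation", volatile: "escalation" },
  heavy: { stable: "groove", building: "escalation", volatile: "drop" },
};

const story = storyMatrix["heavy"]["volatile"]; // "drop"
```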


8. Not Every Moment Has a Story

When the kick is absent and no Foundation can be identified, the model produces no story: story.name = null. This is a feature.

Silence, noise, ambient interludes, and the gaps between tracks are valid performance moments that don’t need labels. The model’s honesty about what it can and cannot identify is part of its reliability. No foundation, no story.
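
A sketch of a consumer honoring the null case, assuming a hypothetical story callback:

```typescript
// Hypothetical consumer hook: null is a valid state, not an error.
function onStory(name: string | null): void {
  if (name === null) {
    // No foundation, no story: silence, noise, or an ambient interlude.
    // Fall back to a neutral scene instead of holding the last story.
    return;
  }
  // ...dispatch to the scene mapped to this story name.
}
```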


9. The Model Is Output, Not Implementation

This model exists above the technical layer. It does not describe how to detect beats, what FFT window size to use, or how to configure audio buffers. It describes what the music means — and leaves the “how” to engineers.

Engineers implement below it. Artists consume above it. The model is the shared language between them.
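
A sketch of that boundary, with illustrative names; everything behind the interface belongs to engineers, everything in front of it to artists:

```typescript
// The model is the return type, not the method body.
interface ModelFrame {
  weight: number;       // Principle 2
  presence: number;     // Principle 3
  variance: number;     // Principle 5
  story: string | null; // Principles 6–8
}

interface ModelEngine {
  // Beat detection, FFT windows, and buffer sizes all live behind this
  // call; none of them appear in the model's vocabulary.
  nextFrame(): ModelFrame;
}
```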


10. Performer Actions Are Audio Changes

The model does not have a separate layer for performer input. When a DJ sweeps a filter, the audio changes, and Variance responds. When a DJ mutes the kick, Presence drops, and a story transition fires.

The model reacts to what it hears, regardless of who caused it. This keeps the system simple and honest — it describes the music as it is, not as someone intended it to be.

