Code Kata: Box Breathing Exercise With SpeechSynthesis And Alpine.js

By Ben Nadel

Published 2024-12-14 in JavaScript / DHTML — Comments (2)

As we near the end of InVision, I've been feeling a lot of anxiety. I'm not one for meditation; but, I do like the idea of breathing exercises to help calm a racing mind. I recently watched a YouTube video about "box breathing" in which a cycle of breathing has four phases (in, hold, out, hold), each of which is performed for 4 seconds. I like to close my eyes when breathing; so, I wanted to see if I could use the SpeechSynthesis API to create a guided meditation with Alpine.js.

Run this demo in my JavaScript Demos project on GitHub.

View this code in my JavaScript Demos project on GitHub.

The simplest way to approach this would have been to create a single queue of "utterances" that repeat over-and-over again. And, if I was only going to close my eyes, that would be fine. But, for funzies I also wanted to create a small visual experience to work alongside the auditory experience.

To that end, I'm defining the exercise using "phases", each of which contains a number of "terms" to be spoken:

var phases = [
	[ "In",   "two", "three", "four" ],
	[ "Hold", "two", "three", "four" ],
	[ "Out",  "two", "three", "four" ],
	[ "Hold", "two", "three", "four" ]
];

Within the bounds of a single phase, the rendered text will be the aggregation of all the text that's already been rendered to the screen. So, the first rendered text will be:

In

... and then:

In ...two

... and then:

In ...two ...three

... and then:

In ...two ...three ...four

Then, the text will clear and the next phase will start from scratch.

To make this easy to render, I flatten the phases down into a materialized set of states. Each state contains the aggregate text and the SpeechSynthesisUtterance instance that will be passed to the SpeechSynthesis API for vocalization:

var states = phases.flatMap(
	( phase ) => {

		return phase.map(
			( term, i ) => {

				// As we proceed across the terms in each phase, the text will
				// be the aggregation of the previous text already rendered in
				// the same phase.
				var text = phase
					.slice( 0, ( i + 1 ) )
					.join( " ..." )
				;

				// Note: the voice for the utterance will be set just prior to
				// each vocalization. This way it will always reflect what's
				// currently in the select menu.
				return {
					term: term,
					text: text,
					utterance: new SpeechSynthesisUtterance( term ),
					duration: 1000
				};

			}
		);

	}
);

Notice that each state has a duration property. This is the number of milliseconds that the state will be rendered to the page before the next state is processed (and the next utterance is vocalized). This will be handled by a setTimeout().
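To make the shape of the flattened states a bit more concrete, here's a minimal sketch that can run outside of the browser. It's not the demo code: the `buildStates()` helper is a hypothetical name, and I'm leaving out the `utterance` property (since `SpeechSynthesisUtterance` only exists in the browser) so that only the text aggregation and timing are shown.

```javascript
// Same phase definitions as in the demo.
var phases = [
	[ "In",   "two", "three", "four" ],
	[ "Hold", "two", "three", "four" ],
	[ "Out",  "two", "three", "four" ],
	[ "Hold", "two", "three", "four" ]
];

// Hypothetical helper (not in the demo): flattens the phases into states,
// omitting the browser-only utterance so the logic can be inspected anywhere.
function buildStates( phases, duration = 1000 ) {

	return phases.flatMap(
		( phase ) => {

			return phase.map(
				( term, i ) => {

					return {
						term: term,
						// Aggregate of every term rendered so far in this phase.
						text: phase.slice( 0, ( i + 1 ) ).join( " ..." ),
						duration: duration
					};

				}
			);

		}
	);

}

var states = buildStates( phases );

console.log( states.length );    // 16
console.log( states[ 3 ].text ); // In ...two ...three ...four
console.log( states[ 4 ].text ); // Hold
```

With four phases of four terms each, and a 1,000ms duration per state, one full pass through the states takes 16 seconds.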

In this code kata, some of the state needs to be reactive (i.e., the Document Object Model needs to be updated in response to state changes); but, a lot of the state can be private and is only needed to drive the internal state.

When it comes to Alpine.js ergonomics, this gets a little tricky. Alpine.js kind of wants everything to be reactive; or, rather, it wants everything to be part of the reactive Proxy that it generates internally. Since all reactive properties are exposed on the this binding, even private methods need to be on the reactive Proxy so that they can access (and mutate) the reactive state.

This leads to an asymmetric code design where half of the variables are naked and half of the variables are bound to this. To be clear, this isn't a technical problem, it's more of an aesthetic problem. Mixing-and-matching different access patterns just feels icky.

If it ever bothers me too much, I can just move all references into the reactive Proxy and not worry about the fact that half of them won't actually be referenced in the DOM. But, for now, I'm going to leave half of the variables as private references (accessible only via closures) and half of the variables as public state.

This is illustrated clearly in the internal processQueue() method in which most references are naked and only a handful of this bindings are referenced. The processQueue() method is where an individual state is plucked from the state queue and is rendered / vocalized:

function processQueue() {

	// Reset queue when a new iteration is being started.
	if ( ! queue.length ) {

		queue = states.slice();
		this.iteration++;

	}

	var state = queue.shift();
	// Update the utterance to always use the voice that's currently selected.
	// This way, the user can change the voice during vocalization to find one
	// that is the most comfortable.
	state.utterance.voice = this.voices[ this.selectedVoiceIndex ];
	state.utterance.pitch = 0;
	state.utterance.rate = 0.7;

	synth.speak( state.utterance );
	this.text = state.text;

	timer = setTimeout(
		() => {

			this._processQueue();

		},
		state.duration
	);

}

This processQueue() method, which has been prefixed with an _ on the reactive Proxy in order to be "marked as private", calls itself recursively using the state duration. It will keep running forever, refilling the queue as necessary, until the timer is cleared.
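The self-rescheduling setTimeout() pattern can also be looked at in isolation, without Alpine.js or the SpeechSynthesis API. The following is only a sketch under some assumptions: `createLoop()`, `onState`, `schedule`, and `cancel` are hypothetical names that don't appear in the demo; the timer functions are passed in as arguments (defaulting to the native ones) so that the loop can be exercised without actually waiting; and the list of states is assumed to be non-empty.

```javascript
// Hypothetical helper (not in the demo): walks the states forever, refilling the
// queue at the start of each iteration, until stop() clears the pending timer.
function createLoop( states, onState, schedule = setTimeout, cancel = clearTimeout ) {

	var queue = [];
	var timer = null;
	var iteration = 0;

	function tick() {

		// Reset queue when a new iteration is being started.
		if ( ! queue.length ) {

			queue = states.slice();
			iteration++;

		}

		var state = queue.shift();
		onState( state, iteration );

		// Schedule the next state using the current state's duration.
		timer = schedule( tick, state.duration );

	}

	return {
		start: function() {

			// Clear any pending timer so that two loops never overlap.
			cancel( timer );
			queue = [];
			iteration = 0;
			tick();

		},
		stop: function() {

			cancel( timer );
			timer = null;

		}
	};

}

var loop = createLoop(
	[
		{ text: "In", duration: 1000 },
		{ text: "Out", duration: 1000 }
	],
	( state, iteration ) => console.log( iteration, state.text )
);

// loop.start(); // Logs "1 In", then "1 Out", then "2 In", at 1-second intervals.
// loop.stop();  // Clears the pending timer, which ends the loop.
```

The start / stop calls are commented-out here so that the snippet doesn't kick off a never-ending timer just by being loaded.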

The visualization of this code kata looks like this (the audio can be heard in the video above):

Screen recording on box breathing app using Alpine.js

And, here's the full code:

<!doctype html>
<html lang="en">
<head>
	<link rel="stylesheet" type="text/css" href="./main.css" />
	<script type="text/javascript" src="../../vendor/alpine/3.13.5/alpine.3.13.5.min.js" defer></script>
</head>
<body>

	<h1>
		Box Breathing Exercise With SpeechSynthesis And Alpine.js
	</h1>

	<section x-data="Demo" :hidden="( ! voices.length )">

		<div class="form">
			<select x-model.number="selectedVoiceIndex">
				<template x-for="( voice, index ) in voices" :key="index">
					<option
						:value="index"
						x-text="voice.name">
					</option>
				</template>
			</select>

			<button @click="start()">
				Start
			</button>
			<button @click="stop()">
				Stop
			</button>
		</div>

		<p class="text" :hidden="( ! text )">
			[<span x-text="iteration"></span>]:
			<span x-text="text"></span>
		</p>

	</section>

	<script type="text/javascript">

		function Demo() {

			// Box breathing consists of four phases: in, hold, out, hold. Each phase
			// lasts 4-seconds; and each term below will be spoken at a 1-second interval.
			var phases = [
				[ "In",   "two", "three", "four" ],
				[ "Hold", "two", "three", "four" ],
				[ "Out",  "two", "three", "four" ],
				[ "Hold", "two", "three", "four" ]
			];

			// Flatten the phases into a single set of states. This will make it easier to
			// process; and, allows us to materialize some state that otherwise would be
			// more challenging to calculate on the fly (ex, the text to output).
			var states = phases.flatMap(
				( phase ) => {

					return phase.map(
						( term, i ) => {

							// As we proceed across the terms in each phase, the text will
							// be the aggregation of the previous text already rendered in
							// the same phase.
							var text = phase
								.slice( 0, ( i + 1 ) )
								.join( " ..." )
							;

							// Note: the voice for the utterance will be set just prior to
							// each vocalization. This way it will always reflect what's
							// currently in the select menu.
							return {
								term: term,
								text: text,
								utterance: new SpeechSynthesisUtterance( term ),
								duration: 1000
							};

						}
					);

				}
			);

			// Once the timer is started, this queue will hold the states to be processed.
			// And the timer will hold the delay between each utterance.
			var queue = [];
			var timer = null;

			// Short-hand reference.
			var synth = window.speechSynthesis;

			// The tricky thing with Alpine.js is that the object returned from the
			// component becomes the hook for reactivity. Alpine.js creates a Proxy that
			// wraps the given data and updates the DOM when the values are mutated. This
			// makes it a bit challenging to create a separation between public and
			// private properties / methods. In this case, I have to include the private
			// methods on the return value so that they can access the appropriate `this`
			// reference for subsequent reactivity. To help enforce the "private" nature
			// of the methods, I'm aliasing them with a "_" prefix.
			return {
				// Public reactive properties.
				voices: synth.getVoices(),
				selectedVoiceIndex: -1,
				text: "",
				iteration: 0,

				// Public methods.
				init: $init,
				start: start,
				stop: stop,

				// Private methods.
				_processQueue: processQueue,
				_setVoices: setVoices
			};

			// ---
			// PUBLIC METHODS.
			// ---

			/**
			* I initialize the Alpine component.
			*/
			function $init() {

				// Voices aren't available on page ready. Instead, we have to bind to the
				// voiceschanged event and then setup the view-model once they become
				// available on the SpeechSynthesis API.
				synth.addEventListener(
					"voiceschanged",
					( event ) => {

						this._setVoices();

					}
				);

			}

			/**
			* I start the vocalization of the guided box breathing.
			*/
			function start() {

				if ( ! this.voices.length ) {

					console.warn( "No voices have been loaded yet." );
					return;

				}

				queue = states.slice();
				this.iteration = 1;
				this.text = "";
				this._processQueue();

			}

			/**
			* I stop the vocalization of the guided box breathing.
			*/
			function stop() {

				clearTimeout( timer );
				this.iteration = 0;
				this.text = "";

			}

			// ---
			// PRIVATE METHODS.
			// ---

			/**
			* I process the queue, vocalizing the next state. This method will call itself
			* recursively (via setTimeout).
			*/
			function processQueue() {

				// Reset queue when a new iteration is being started.
				if ( ! queue.length ) {

					queue = states.slice();
					this.iteration++;

				}

				var state = queue.shift();
				// Update the utterance to always use the voice that's currently selected.
				// This way, the user can change the voice during vocalization to find one
				// that is the most comfortable.
				state.utterance.voice = this.voices[ this.selectedVoiceIndex ];
				state.utterance.pitch = 0;
				state.utterance.rate = 0.7;

				synth.speak( state.utterance );
				this.text = state.text;

				timer = setTimeout(
					() => {

						this._processQueue();

					},
					state.duration
				);

			}

			/**
			* I set the voices based on the current synth state.
			*/
			function setVoices() {

				// There are TONS of voices, but only a handful of them seem to create a
				// reasonable experience. This is probably very specific to each browser
				// or computer; but, I'm going to filter-down to the ones I like.
				this.voices = synth.getVoices().filter(
					( voice ) => {

						switch ( voice.name.toLowerCase() ) {
							case "alex":
							case "alva":
							case "damayanti":
							case "daniel":
							case "fiona":
							case "fred":
							case "karen":
							case "mei-jia":
							case "melina":
							case "moira":
							case "rishi":
							case "samantha":
							case "tessa":
							case "veena":
							case "victoria":
							case "yuri":
								return true;
							break;
						}

						return false;

					}
				);

				// Default to the most pleasing if it exists.
				this.selectedVoiceIndex = this.voices.findIndex(
					( voice ) => {

						return ( voice.name === "Tessa" );

					}
				);

				// If the preferred voice doesn't exist, just use the first one.
				if ( this.selectedVoiceIndex === -1 ) {

					this.selectedVoiceIndex = 0;

				}

			}

		}

	</script>

</body>
</html>

Anyway, I'm not gonna say too much about the SpeechSynthesis API since I don't know that much about it. This was just a fun little Alpine.js code kata.

Short link: https://bennadel.com/4743

Reader Comments

Scott Steinbeck Dec 20, 2024 at 6:03 PM

Funny enough i have just started playing with both the Voice to text and the Speech synthesis features in the last couple weeks. My favorite voice is "Google UK English Male" which i think is only available in Chrome. Here is a nice Demo of all of the languages.

https://mdn.github.io/dom-examples/web-speech-api/speak-easy-synthesis/

Ben Nadel Dec 23, 2024 at 6:00 PM

@Scott,

Ha, that voice is great! I'm not sure why I never saw that one before. I feel like I had gone through and tested a bunch of the voices to get a list of ones that felt reasonable. Maybe I was checking on an older version of Chrome. I also don't think I realized that the voices would be browser-specific. I just assumed they would be OS-specific.
