For the last few years I’ve been helping to run a sort of karaoke night called Maraoke, monthly-ish in London but also at video games events around the world. Maraoke works basically the same as karaoke – you pick a song you know from the list, we call you up to sing, you sing along to a backing track using synchronised lyrics – but with the twist that the lyrics aren’t the ones you remember, because they have been ‘modded’ to make dumb jokes about video games (and tangentially related matters).
The absolute worst part of the night for me is the moment just before each song starts: I’m usually running the laptop that powers all the video and audio, so I’m the person pressing the play button. And invariably when I do there is a pause of indeterminate length – ideally slight, too often not that slight – before the song actually starts, during which I curse whoever built this piece of shit software, i.e. me.
The ‘software’ is a couple of browser windows connecting to a website I built: one window plays the music and shows the words to sing along to, the other lets you queue up the songs. A web-based system makes lots of sense in some ways – it’s easier to maintain the library of songs if they’re all in one place that can be accessed by anyone with a login, easier to get the night back on track if someone e.g. steals the laptop1, and also I (at least vaguely) knew how to build one at the point when we decided we needed a new system.
The key problem this introduces is that you need a reliable internet connection to use it, which is more of an ask in 2024 than you might think/hope, and it’s impossible to know ahead of time what the connection is going to be like. Hence the indeterminate pause. When the ‘play’ button is hit in the control window, the following things happen:
1. The control window sends a message out to the display window saying ‘play a song with this ID’.
2. The display window downloads the song lyrics from the Maraoke server.
3. The display window tells the audio element on the page to start downloading the backing track.
4. When it has enough of the audio that the browser thinks it will be able to download the rest in time to play it all the way through, the display window plays the song and starts the animation sequence that shows the lyrics at the correct time.
So the gap between hitting play and a song starting is necessarily determined by ‘how long it takes to download the audio’ – but if there is a long gap after you do hit play, it’s unclear whether one of steps 1-4 has broken down entirely, or whether you are on the final step but the browser is just taking a long time to fetch enough of the audio because everyone else in the bar is using the WiFi for ‘TikToking’.
Here’s an example of what I mean – on the left you can see the lyric window, on the right the control window – the video starts as I click the play button. This song in particular has a custom background video that’s also being loaded in at the same time as the audio and lyrics, so there’s even more data required.
In Chrome’s DevTools you can simulate crappy internet connections but I forgot to turn that function on and it turns out my home internet connection is rubbish enough to make the point: there are about 10 seconds between hitting play and getting the Maraoke logo – i.e. the song starts2. That’s a lot of time for the audience to be staring at a possibly already nervous singer, while the rest of the Maraoke team stare at me and I stare at, or give the finger to, the computer.
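For reference, the sequence that runs during that pause can be sketched roughly as below. This is not the real Maraoke code: `playSong`, `fetchLyrics`, `startLyrics`, `onStatus` and the `/songs/<id>/backing.mp3` URL are all made up for illustration, and it assumes the ‘enough audio to play all the way through’ signal is the audio element’s standard `canplaythrough` event. The status callback and the timeout are additions that would at least show which step the pause is stuck on.

```javascript
// Hypothetical sketch of the display window's side of "play a song with this ID".
// In a browser, `audio` would be the page's <audio> element.

// Resolve when `target` fires `type`, or reject after `timeoutMs` milliseconds.
function waitForEvent(target, type, timeoutMs) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      target.removeEventListener(type, onEvent);
      reject(new Error(`Timed out waiting for "${type}" after ${timeoutMs} ms`));
    }, timeoutMs);
    function onEvent(event) {
      clearTimeout(timer);
      resolve(event);
    }
    target.addEventListener(type, onEvent, { once: true });
  });
}

async function playSong(songId, { audio, fetchLyrics, startLyrics, onStatus, timeoutMs = 30000 }) {
  onStatus("fetching lyrics");
  const lyrics = await fetchLyrics(songId); // step 2

  onStatus("buffering audio");
  // Listen before setting `src` so the event cannot be missed.
  const ready = waitForEvent(audio, "canplaythrough", timeoutMs);
  audio.preload = "auto"; // hint that the whole file is wanted
  audio.src = `/songs/${songId}/backing.mp3`; // step 3 (assumed URL scheme)
  await ready; // step 4: the browser thinks it can play to the end

  onStatus("playing");
  await audio.play();
  startLyrics(lyrics);
}
```

With something like this, a long ‘buffering audio’ status means the connection is slow, while a rejected promise means a step has actually failed.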
Generally my ‘solution’ to this is just to bring the laptop with the development version of the software on, which has all the files on it already, so the downloads become effectively instantaneous. But this obviously eliminates a lot of the advantages of having it run in a web browser, means I have to remember to download any new songs onto my local version, and if I want to install a local version on anyone else’s machine it often requires e.g. caring about ‘Windows Subsystem for Linux’ and you only get one life don’t you.
The reality is that in the vast majority of cases we will have some kind of internet access, and even if it is slow, we don’t need a huge amount of data – the lyrics data is a few kilobytes, a backing track is a few megabytes big, even for songs with custom video you’re talking 50 MB-ish, not a lot these days. But what we need is to have that data *before* we hit play (and, ideally, to know that we have that data).
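To put rough numbers on that, here is a back-of-the-envelope calculation. The file sizes are the approximate ones mentioned above (with 3 MB standing in for ‘a few megabytes’); the two connection speeds are illustrative assumptions, not measurements from any real venue.

```javascript
// Rough download-time estimate: size in megabytes, speed in megabits per second.
function downloadSeconds(sizeMB, speedMbps) {
  const megabits = sizeMB * 8; // 1 byte = 8 bits
  return megabits / speedMbps;
}

// Approximate sizes from the text, in megabytes.
const files = { lyrics: 0.005, backingTrack: 3, customVideo: 50 };

// Hypothetical speeds: a busy bar's WiFi vs. a decent connection.
for (const speedMbps of [2, 20]) {
  for (const [name, sizeMB] of Object.entries(files)) {
    const seconds = downloadSeconds(sizeMB, speedMbps).toFixed(1);
    console.log(`${name} at ${speedMbps} Mbps: ${seconds} s`);
  }
}
```

At an assumed 2 Mbps the 50 MB video works out at 200 seconds: hopeless if the download only starts when you press play, but roughly the length of the song before it in the queue.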
Web browsers cache files (so if you visit a page a second time the browser doesn’t necessarily need to bother loading *everything* again), so my thought was that there could be a way to tell them what to cache beforehand. It turns out modern browsers have a specific way of doing this: the CacheStorage interface, which lets you request a file and store it in a named cache.
```javascript
caches.open("nameOfCacheGoesHere").then((cache) =>
  cache.add("someExcitingFile.mp3")
);
```
You can also check whether or not a given file is in the cache and retrieve it if it is:
```javascript
caches.match("someExcitingFile.mp3").then((response) => doSomethingCleverWith(response));
```
The Maraoke system already allows for songs to be queued up in advance, so it’s easy enough to tell the browser window to request the queue, check what the next song is and what files it requires, loop through those files and put each into the cache, then move on to the next song in the queue.
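A minimal sketch of that loop might look like the following. The queue shape (`{ id, files }`), the example paths and the function name `precacheQueue` are assumptions for illustration, not the real Maraoke data format. The `cache` argument is a Cache object, which in the browser would come from something like `await caches.open("nameOfCacheGoesHere")`.

```javascript
// Hypothetical sketch: warm the cache for every song in the queue, in order.
// Assumed queue shape: [{ id: 7, files: ["/audio/7.mp3", "/lyrics/7.json"] }, ...]
async function precacheQueue(queue, cache) {
  const failed = [];
  for (const song of queue) { // next song first, then the rest
    for (const url of song.files) {
      // Skip anything already stored from an earlier pass.
      if (await cache.match(url)) continue;
      try {
        // cache.add() fetches the URL and stores the response;
        // it rejects on a network error or a non-2xx status.
        await cache.add(url);
      } catch (error) {
        failed.push({ songId: song.id, url, error });
      }
    }
  }
  return failed; // anything listed here still needs fetching
}
```

Downloading one file at a time keeps the next song at the front of the line on a slow connection, and collecting failures rather than stopping means one missing video doesn’t block the rest of the queue.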
On the control side of things we can again iterate through the queue, check whether the necessary files are in the cache, and indicate against the on-screen queue that a song has everything ‘ready to play’, like so:
(The white dot means it has data, the slightly faded out ones mean the song wants data but doesn’t have it yet. Songs with a second dot have video or other media assets to download as well as the sounds and the words.)
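The check behind those dots could be sketched along these lines. Again, the queue shape (`{ id, files }`) and the name `queueReadiness` are assumptions for illustration; `cache` is a Cache object (or the global `caches`, which has the same `match` method).

```javascript
// Hypothetical sketch: work out which songs in the queue are fully cached.
// Assumed queue shape: [{ id: 7, files: ["/audio/7.mp3", "/lyrics/7.json"] }, ...]
async function queueReadiness(queue, cache) {
  const result = [];
  for (const song of queue) {
    // match() resolves to a Response if the file is cached, undefined if not.
    const hits = await Promise.all(song.files.map((url) => cache.match(url)));
    const cached = hits.filter(Boolean).length;
    result.push({
      id: song.id,
      cached,
      total: song.files.length,
      ready: cached === song.files.length, // every file is in the cache
    });
  }
  return result;
}
```

Each entry’s `ready` flag is what would drive the ‘ready to play’ indicator.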
This is all very good and clever and I was very pleased with myself until I checked the network tab of Chrome’s DevTools – this shows every file the browser is requesting for a given page. And when the browser loaded the new audio into the page’s audio element, instead of using the cached version it was just requesting the file from the server again, making the entire endeavour completely pointless.
The reason for this, I think, is that there are a couple of different ways for a browser to request a file – when I’m caching the audio file it downloads the whole thing and shoves it in the cache, but when it requests the file for an audio element it wants to be able to use the data it’s downloading before the download has completed (so you can play the start of the song even if the end hasn’t been downloaded yet), so it requests it in chunks. Which is probably why it doesn’t bother to check the cache? Either that or it just hates me. (More likely still: CacheStorage is a separate store from the browser’s ordinary HTTP cache, and nothing on the page reads from it unless your own code, or a service worker, explicitly asks it to.)
But you can work around this: when you check if a file exists in the cache, it actually returns a Response as though you’d requested the file, so instead of just giving the audio player a URL and hoping for the best, you can check if the file exists in the cache, and if it does, retrieve the data and tell the audioPlayer to use that instead.
```javascript
const url = "someExcitingFile.mp3";

caches.match(url).then((response) => {
  // If the cache tells you it has a copy of the file, return the response as a blob (a Binary Large Object, basically a big lump of data). If the file doesn't exist, return nothing.
  if (response) {
    return response.blob();
  } else {
    return null;
  }
}).then((blob) => {
  // If at the end of the previous step you got a blob, give the blob a URL and tell the audioPlayer to look at that URL, otherwise tell it to look at the original URL for the file.
  if (blob) {
    audioPlayer.src = window.URL.createObjectURL(blob);
  } else {
    audioPlayer.src = url;
  }
});
```
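One thing worth adding over a long night: every `createObjectURL` call keeps its blob in memory until the URL is revoked or the page is closed, so it is worth releasing the previous song’s URL once the player has moved on. The following is a sketch of the same idea with that housekeeping added; the helper name `setSourceFromCache` and the module-level `currentObjectUrl` variable are hypothetical, not part of the real system.

```javascript
// Hypothetical helper: point a player at the cached copy of `url` if there is
// one, and release the previous blob URL so old songs don't pile up in memory.
let currentObjectUrl = null;

async function setSourceFromCache(player, url, cacheStorage = globalThis.caches) {
  const previous = currentObjectUrl;
  const response = await cacheStorage.match(url);

  if (response) {
    // Cached: wrap the data in a blob URL and play from that.
    currentObjectUrl = URL.createObjectURL(await response.blob());
    player.src = currentObjectUrl;
  } else {
    // Not cached: fall back to fetching from the server as before.
    currentObjectUrl = null;
    player.src = url;
  }

  // Only revoke the old URL after the player has been pointed elsewhere.
  if (previous) URL.revokeObjectURL(previous);
  return currentObjectUrl ? "cache" : "network";
}
```

The return value says where the audio came from, which is handy when checking that a song really is playing from the cache.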
Does this work? Well, if we turn on caching and check the same song as before on the same dire internet connection, let’s see:
From 10 seconds to less than a second. No more waiting to sing about which of the Pokémon you most want to have sex with!