When it’s not trying to fend off lawsuits from major record labels, Suno is still working on refining its AI music creation tool. The latest model, Suno v5, is an obvious technical improvement over its previous version, v4.5+. But it still can’t seem to escape the bland emptiness that pervades most AI art.
There are some across-the-board upgrades in audio quality that are undeniable, like fewer artifacts and clearer separation between instruments. Some tracks produced using v4.5+ can smush all the melodic parts together in a way where the lines between guitar, bass, and synth are muddy at best. But with v5, the mixes are much cleaner.
During a demo, Henry Phipps, a Suno product manager, pointed to a song we had the model generate that included a flute-like synth with what sounded like a ping-pong delay effect on it: “I’ve never heard that before in previous models… what that says to me is that the model understands that this is an isolated sound that’s being affected and needs to be reproduced faithfully in different parts of the stereo field.” Since Suno isn’t actually applying effects in the traditional sense, this means the model is identifying a particular instrument and approximating the sound of a stereo delay because it’s decided that is what it should sound like.
There are no edges to any of the Suno vocals. Everything is bathed in reverb, layered with harmonies, and perfectly on pitch. Even if you explicitly tell it not to do these things, the model just ignores you.
Suno also claims that v5 has a better understanding of genre, though that claim seems questionable from my testing. With some of my prompts like “modern avant R&B with glitchy, but funky drums, atmospheric melodic parts, and breathy vocals,” neither v5 nor v4.5+ seemed to be the clear winner in delivering what I had in mind (mostly Kelela’s Take Me Apart). They both got close, giving me downtempo tracks with some moody synths, but they lacked the weirdness I was hoping for.
Suno couldn’t quite figure out what I was looking for with “early ‘90s lo-fi indie rock recorded on a 4-track cassette recorder with off key vocals and slightly out of tune guitars” either, and here v5 was definitely further off target. Despite everything I tried, I could not get Suno to spit out anything that sounded even remotely like Pavement. The loose slacker noise pop I associate with Slanted and Enchanted was nowhere to be found. Instead, I got bombastic “indie” rock with chunky riffs and clean driving power chords. Suno v5 kept serving up songs that sounded more like Arctic Monkeys than anything released before the turn of the century.
Similarly, in my testing, v5 seemed to struggle with era- or decade-specific prompts at times. When I asked for “late 1970s krautrock,” v4.5+ basically nailed it outside of the vocals (more on that later). But v5 often delivered ‘80s-tinged synthpop and tracks that were distinctly more modern sounding, even if they had some of that classic krautrock DNA.
What I will say is that the arrangements that Suno’s v5 model creates are much more complex. Compared to v4.5+, there are more one-off musical flourishes that keep things from getting too repetitive and more varied song structures. Where v4.5+ is usually content to stick with a basic verse-chorus-verse structure (with a bridge tacked on for good measure), v5 would often have pre- or post-chorus sections, multiple bridges or breakdowns, and generally build over the course of a track, offering more of an arc than just distinct sections.
It also occasionally delivered interesting results when remixing existing tracks. I uploaded a song from an EP I released a few years back (which probably should have tripped its copyright filter) and look, I’m not going to lie, I kind of liked the way it transcribed parts of my guitar solo into a recurring synth motif and turned my big chord pads into driving arpeggios.
But what was missing in all of these covers of my song that I asked Suno to create was the raw, lo-fi nature of the track that I recorded in my living room at 3AM about six years ago. And that’s kind of a running theme here. While Suno can mimic some of the superficial features of an old recording or a human performance like tape hiss or breaths, it always feels inauthentic.