Apotheosis of the Voice: An Exploration of Synthetic Vocal Modification in Modern Music
Throughout human history, music has often centered on the human voice, whether in the form of large choirs performing complex polyphonic works or a pop diva showing off her vocal range over a backing track. In recent years, however, the premium once placed on natural singing ability has more or less subsided. A good vocalist will always have a place in music and will always be appreciated and unequivocally praised (as indicated by the continued support for singing shows like American Idol, The Voice, and all of their knockoffs). But, at the end of the day, nobody really cares anymore: if the song sounds good, the singer’s actual singing ability does not matter.
The clearest mainstream avenue for this shift was the rise of Auto-Tune in the late 90s and early 2000s. The technology was developed by engineer Andy Hildebrand, derived from autocorrelation algorithms he had built to help oil companies interpret seismic readings and find new drilling sites, and it was initially intended to be an inconspicuous studio tool. But the moment it appeared on Cher’s 1998 single “Believe,” the technology rocketed to the forefront, despite her producers’ best attempts to cover it up with stories about vocoders and pedals. Suddenly, the world became a place where anybody, even the worst of singers, could sound passable, and where the best of singers could finally sound absolutely pitch-perfect. Auto-Tune soon found its place in popular music: rappers like T-Pain and Quavo embraced the modulation wholesale, and singers like Katy Perry and Rihanna began to use it to bring their vocals one step closer to what they wanted them to be.
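To make that seismology lineage concrete, here is a minimal sketch, in Python, of the autocorrelation idea at the heart of pitch correction: estimate the fundamental frequency of a short vocal frame, then snap it to the nearest semitone. This illustrates the general technique only; it is not Auto-Tune’s actual algorithm, and the function names and constants are assumptions of mine.

```python
import numpy as np

def detect_pitch(frame: np.ndarray, sr: int, fmin: float = 80.0, fmax: float = 1000.0) -> float:
    """Estimate the fundamental frequency (Hz) of one audio frame via
    autocorrelation: the strongest self-similarity lag gives the period."""
    frame = frame - frame.mean()
    corr = np.correlate(frame, frame, mode="full")[frame.size - 1:]  # non-negative lags
    lo, hi = int(sr / fmax), int(sr / fmin)  # search only plausible vocal periods
    lag = lo + int(np.argmax(corr[lo:hi]))
    return sr / lag

def nearest_semitone(freq: float, a4: float = 440.0) -> float:
    """Snap a detected frequency to the closest note on the equal-tempered scale."""
    midi = round(12 * np.log2(freq / a4)) + 69
    return a4 * 2 ** ((midi - 69) / 12)

# A corrector would then resample each frame by nearest_semitone(f) / f,
# nudging every note onto the grid: gently for polish, instantly for the
# hard "Believe" snap.
```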
At first, there was a great deal of pushback. Critics and the general public seemed to make it their goal to shame anyone who used Auto-Tune, implying that artists who used it needed it because they lacked the talent to sound good without it. Time magazine even named Auto-Tune one of the 50 worst inventions in 2010. Looking back on this outcry from the late 2010s and early 2020s, when auto-tuned vocals are a dime a dozen among, frankly, much wilder and more interesting vocal modulations, one cannot help but read the backlash as mass pretension. A song that relied on nothing but the natural human voice and the old verse-chorus-verse pop structure would now be dismissed as boring, derivative, or too radio-friendly. Artists today pitch-shift, reverberate, tremolo, phase, pan, layer, transpose, slow down, speed up, cut up, and syncopate their vocals to add dimension to their songs; Auto-Tune is more or less an afterthought at this point, for it seems the human voice is no longer limited by biology but by technology. This raises the question, though: what does the human voice mean in modern music?
Analyzing popular vocal modification: Nightcore vs. “Slowed & Reverb”
If you spent some time listening to music and clicking through the suggested videos on YouTube in the early 2010s, you may have come across “Nightcore” remixes of your favorite songs. A Nightcore remix is, at its simplest, an existing song, often the entire mp3 bought off of iTunes, sped up to the point where the vocals sound like something straight out of Alvin and the Chipmunks. These remixes would typically be accompanied by neon pictures of colorful, often sci-fi or fantasy anime girls, frozen in place for the entirety of the pitched-up, sped-up song.
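Mechanically, the trick is almost trivial, which is part of the charm. A minimal sketch in Python, assuming the soundfile library and a decoded WAV as input (the file names are placeholders):

```python
import soundfile as sf

# "song.wav" is a placeholder for any decoded track.
audio, sr = sf.read("song.wav")

# The classic Nightcore move: keep the samples, raise the declared sample
# rate. A player then runs through them ~25% faster, so tempo and pitch
# rise together -- the chipmunk effect.
sf.write("nightcore.wav", audio, int(sr * 1.25))
```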
More recently, in the late 2010s and early 2020s, the foil to these Nightcore videos popped up in the form of “Daycore” remixes, more popularly known as “slowed & reverb” versions.
As you would expect, a Daycore remix is simply a song, once again in the form of the entire mp3, slowed down to the point where the vocals are deep and crooning and the drums seem to drone rather than hit, with both echoing much more than in the original song, as if to place these elements in a greater, vaster, emptier space. These remixes would also be accompanied by anime visuals, much like their Nightcore predecessors, but, rather than neon pictures of anime girls frozen in place, Daycore remixes would be set to GIFs of scenes of tragedy or reflection: a girl bleeding from the head, seemingly giving her last words; a couple sitting together on the beach, watching the waves ebb and flow; a man facing away from the camera, walking down a brightly lit Tokyo street with his bike as the same red car passes by him each time the GIF loops; or a couple on a motorcycle, the woman in the back leaning forward and hugging her boyfriend as he drives, the same few streetlights gliding infinitely over the visor of her helmet.
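The Daycore recipe inverts the Nightcore one and adds a step: declare a slower playback rate, then drench the result in reverb. A hedged sketch of both steps in Python (the decay constant and mix levels are arbitrary choices of mine, not a canonical recipe):

```python
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

audio, sr = sf.read("song.wav", always_2d=True)  # placeholder input file
slowed_sr = int(sr * 0.8)  # declare a lower rate: ~20% slower, pitched down

# Crude reverb: convolve with exponentially decaying noise, a stand-in for
# the impulse response of that greater, vaster, emptier space.
rng = np.random.default_rng(0)
tail = slowed_sr // 2  # roughly half a second of reverb tail
ir = rng.standard_normal(tail) * np.exp(-6.0 * np.linspace(0.0, 1.0, tail))
wet = np.stack([fftconvolve(audio[:, ch], ir) for ch in range(audio.shape[1])], axis=1)
wet /= max(np.max(np.abs(wet)), 1e-9)  # normalize so the blend cannot clip

# Blend the dry signal (padded to the reverb's length) with the wet one.
dry = np.pad(audio, ((0, wet.shape[0] - audio.shape[0]), (0, 0)))
sf.write("daycore.wav", 0.7 * dry + 0.3 * wet, slowed_sr)
```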
Under both these Nightcore and Daycore videos would also, of course, be the view counter, and more often than not it would read above one million views rather than below. What does this mean in terms of how we understand and consume the human voice in music today?
Looking into the origins of both of these online musical movements can provide some insight. Nightcore had a very energetic and hectic conception, spearheaded in the early 2000s by a Norwegian DJ duo who sped up the popular eurodance and trance songs of the time. Nightcore was originally their act’s name, and the two DJs released their sped-up songs under it, believing that the greater speed made the drums hit harder, the vocals sound more euphoric, and the dancefloor more lively. Eventually, non-dance tracks were also remixed in the Nightcore style, particularly those of the band Evanescence, which became some of the most popular entries in the non-dance Nightcore genre growing on both YouTube and SoundCloud by the late 2000s and early 2010s. This sprout reached full bloom within a couple of years, as Nightcore videos ran rampant and one could find remixes of artists of any cloth, ranging from Lady Gaga to Rammstein to Sade.
Daycore, on the other hand, had a much quieter and more reserved conception. Jarylun Moore, under the name Slater, was inspired by the more emotional stylings of hip-hop artist DJ Screw, pioneer of the chopped and screwed style, and released what is credited as the first “slowed & reverb” remix to YouTube in 2017: a slowed version of “20 Min” by Lil Uzi Vert, set to a loop of a pink skeleton moving about. Unlike Nightcore, whose purpose is to energize and hit hard, Daycore is more intentionally emotional, seeking solace in slowed vocal notes and deeper guitar chords. Paired with the rise of TikTok in 2020 and the edits popular on the platform, slowed & reverb mixes not only grew exponentially but were heavily demanded, with seemingly no song able to escape the grasp of a Daycore remixer, from Ariana Grande to 80s Japanese city pop to Paul Anka’s 1959 cut “Put Your Head on My Shoulder.”
With both of these origins in mind, it becomes evident that Nightcore and Daycore remixes are not simply inexperienced people playing around in Audacity; they reflect an attempt to achieve a certain mood by warping the source material. With Nightcore, the goal was energy, movement, and impact, even if the end result made the once-human vocals sound chipmunk-like, cartoonish, and more feminine. With Daycore, the goal was sadness, reflection, and solace, even if the end result made the once-human vocals sound immensely more grainy, echo-y, and masculine. A pop song about young love could be turned into a tragedy of second-guessing yourself simply by slowing it down, or into a blast of euphoria, validation, and confidence simply by speeding it up. This display of the human voice as an instrument controlled by technology, not biology, lays a breadcrumb trail to where we are today, where importance is placed not on the vocals being natural or unedited but on the emotions evoked by how the voice is used.
Gender euphoria and breaking the barrier of irony
Now that the app from the bowels of Satan himself, TikTok, has been mentioned at least once (1) in this article, it may be time to address another musical style that has grown popular on the app: hyperpop. Hyperpop is a genre known for its artifice, involving heavily edited vocals (often pitched up in Nightcore fashion and/or heavily auto-tuned) over incredibly aggressive synthetic production, calling back to the EDM and eurodance trends of the 2000s, all while following the basic song structure and lyrical vapidity commonly associated with pop music. The style has been making waves on TikTok, with the most infamous example being 100 gecs, whose music also ignited a debate over another aspect of the hyperpop genre: its sincerity.
When someone uninitiated into the hyperpop genre (or even the greater trend of vocal modification through avenues like Nightcore or Daycore, as previously discussed) first hears the music of 100 gecs, the first reaction will almost always be incredulity. With vocals that very unironically sound straight out of Alvin and the Chipmunks, incredibly (and I mean INSANELY) vapid lyrics, and a complete smorgasbord of contemporary genres mashed into 3-minute songs (including but not limited to dubstep, metal, and ska), one cannot possibly hear it and think that the people behind the wall of sound were being sincere… were they?
Hyperpop’s authenticity as a musical genre has been posed as a question from the moment of its inception, just as its predecessors, Nightcore and Daycore, are often questioned as legitimate remix styles (or, frankly, as legitimately enjoyable). Hyperpop made its first major splash with the song “Hey QT,” released by the eponymous artist, QT. An archetype for the genre, the song featured pitched-up, auto-tuned, and markedly soulless vocals and lyrics, a repetitive chorus, bubblegum bass production from hyperpop pioneers SOPHIE and A.G. Cook, and a music video that seemed ripped straight from a Black Mirror episode, ending with what appeared to be an advertisement for a fictional, hot-pink energy drink called, you guessed it, QT. All of these elements, from the hyper-feminization to the overt advertising and product placement, introduced hyperpop to the masses as nothing but ironic: an overly-produced satire, a mockery of what the public thinks when they hear the term “pop music.” And, at first, it definitely seemed that way. SOPHIE, who would go on to release several singles compiled to form her 2015 project PRODUCT, all of them heralded as hyperpop masterpieces, seemed to sink shamelessly into the archetype set by “Hey QT,” using hyper-feminine, chipmunk vocals, glossy, latex synth chords, and drums that hit way harder than they should to achieve the same overly-produced, overly-satirical hyperpop image. A.G. Cook, through his UK label PC Music, promoted a similar image, as early PC Music acts like GFOTY and EasyFun leaned into the same aesthetics.
However, one can only be ironic for so long before sincerity bleeds through like a hemorrhage. Such was the case with SOPHIE: after all the glitz and glamor and artifice of her PRODUCT era, she came out publicly as a trans woman and, in 2017, released “It’s Okay to Cry,” the lead single to her debut album Oil of Every Pearl’s Un-Insides and arguably her most intimate and iconic song to date. As its title suggests, the song is SOPHIE’s reassurance to the listener that it is okay to express emotion, to cry, to fall apart at times. The music video for the single is just as touching, as it was SOPHIE’s first to show her face. And even though SOPHIE’s voice was imperfect and auto-tuned and refined with all of the usual hyperpop fixings, none of the human emotion was lost, and all of the irony that was expected of hyperpop washed away.
Suddenly, what was originally thought of as satire became a queer safe space, and particularly a space for trans women and other gender-nonconforming artists to express themselves and use their voices. After SOPHIE, many other trans and GNC artists, like Dorian Electra and Arca, found solace in the genre’s liberal use of vocal modification to achieve the hyper-feminine, the hyper-masculine, and everything in between, for the sake of gender bending and, most of all, gender euphoria. 100 gecs falls under this umbrella as well: Laura Les, one half of the duo, is also a trans woman and has stated that pitching up her recorded vocals helps her feel less dysphoric listening to her own voice.
Hyperpop did not stop with trans and non-binary expression, though. Soon, cis performers like Namasenda, Planet 1999, and That Kid, as well as hyperpop-adjacent performers like Sega Bodega and Shygirl, would use the aesthetics of the genre, and the freedom with which the human voice fluctuates within it, to create their own commentaries on their emotions and their expression.
And even as the human voice was torn and stretched and pulverized and looped and bounced and ripped apart and put back together again in the creation of all this hyperpop music, the emotion was never lost. Much like its more rudimentary siblings, Nightcore and Daycore, hyperpop set out to achieve moods with its use of the human voice, and, with SOPHIE’s passing this past January, we can reflect on the impact she made on a trend that was already reshaping music as we know it.
As we remember her, though, it is important to readjust and remember that the only way left to go is forward. But how much further forward can we go if the biology of the voice is no longer so important to evoking emotion?
No longer human: modifying the human voice to achieve post-human aesthetics
In the past year, a synthetic voice by the name of Q began to appear in the press because it was the first non-binary voice for technology. Q seems to fluctuate between female and male as it speaks but never lands comfortably at either end: it truly is the first of its kind. While originally designed to help alleviate the dysphoria of trans individuals who may feel discomfort hearing a female or male AI voice and noticing the differences between the AI’s voice and their own, Q is one more indication that the human voice is no longer limited by the body, and that the public no longer has any interest in limiting it as such. Q is an example of the voice finally transcending our bodies.
Q is neither where this starts nor where it ends. As AI technology continues to grow, develop, and become more precise, a growing question in the field has been whether AI could be programmed to be creative just as humans are. And, if AI can be programmed to make original, creative music, could AI be programmed to sing?
One of the more commercial and visible answers to this question came in the form of artist Holly Herndon’s third album, PROTO, in 2019. Herndon and her partners developed an AI that they named Spawn, which “learned” how to sing and create original music by listening to the voices of Herndon, her partners, a choir, and the audiences at live shows Herndon and her team performed. Once Spawn was able to successfully mimic voices, create melodies, and, soon, sing on its own, Herndon used Spawn as an instrument to build the tracks that would make up PROTO. Herndon described Spawn less as another computer program that generated “original” music by recognizing patterns in past music and more as another member of the choir she directed, one that would improvise and be as independent and creative as any other member. In fact, the lead single to Herndon’s album, “Godmother,” was made purely by Spawn as an homage to the works of her “godmother,” Herndon’s fellow electronic artist Jlin. On this track, Spawn truly sounds like the AI “baby” that Herndon makes her out to be, as the sputtering drums and gasping vocals evoke images of the birth of something new into the world.
Modifying the human voice to achieve a post-human aesthetic goes beyond the realm of automation. Several artists have begun experimenting to see just how far they can push the sound of their recorded voice in order to reach moods that conventional singing rarely could, much in the spirit that pushed Nightcore and Daycore to the forefront. A contemporary example of this push is artist Lyra Pramuk’s debut album, Fountain, on which Pramuk uses nothing but her own voice, stretched, transposed, and otherwise processed, to build the album’s seven grandiose tracks. The concept is not new: Björk attempted it on her fifth studio album, Medúlla, as did Imogen Heap with the infamous solo harmonizer song “Hide and Seek.” But once a listener gets through the entirety of Fountain, they cannot help but feel as though they have heard seven folk songs from a distant future, a more alien and somehow more complete realization of the fully a cappella album through the use of technology. If anything, this growing spectrum of vocal modification, from the minuscule (Auto-Tune and pitching up or down) to the extreme (automation and intense manipulation), seems to prove further that the human voice no longer cares to remain within our throats: once it is recorded into an mp3, the voice is as vast as the electricity that courses behind the computer screen.
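For a sense of the building blocks such an album requires, here is a minimal sketch using librosa’s phase-vocoder effects: transposing a vocal take without changing its length, and stretching it without changing its pitch. This is not Pramuk’s actual toolchain, just generic versions of the two operations; the file names are placeholders.

```python
import librosa
import soundfile as sf

# "voice.wav" stands in for a single recorded vocal take.
y, sr = librosa.load("voice.wav", sr=None)

# Transpose up a perfect fifth while keeping the original duration...
up = librosa.effects.pitch_shift(y, sr=sr, n_steps=7)
# ...and stretch the same take to half speed while keeping the original pitch.
slow = librosa.effects.time_stretch(y, rate=0.5)

sf.write("voice_up.wav", up, sr)
sf.write("voice_slow.wav", slow, sr)
```

Layering dozens of such renders of a single voice is, in rough outline, how an entirely vocal-sourced record can end up sounding nothing like a person singing.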
Where to?
All in all, the human voice in music is evolving. I would even liken it to an apotheosis of sorts, where the voice is reaching heights that were never predicted before.
Even so, the human voice, as it is, is by no means disappearing. There will always be a place for natural vocal talent. Several of the artists popular now would be considered natural vocal talents by the public (even if that public is unaware of just how much polishing goes on before a song is released). And plenty of trans musicians seek euphoria through the use of their natural voice just as others seek it through digital modulation (ANOHNI is the first example that comes to mind).
As with any evolving art form, though, it will be interesting to see how the voice continues to develop. The only direction left to go is forward, after all.