Last month I fed eight bars of my own singing — a scratch vocal from a film cue I scored in 2016 — into a voice-cloning tool to see what would happen. Thirty seconds later it handed me back a version of my voice singing a melody I never wrote, in a key I never chose. It was close. Not perfect, but close enough that a tired editor at 11pm would not have flagged it. I had consented to that. I uploaded my own file. But the experience answered a question I had been circling for months: what does it feel like when AI music generation learns you without asking? And the answer, for a working artist, is that it feels like finding your handwriting in a letter you never wrote.
That is roughly where SZA landed this year, except she did it in public and with a number attached.
What SZA actually said
In a series of posts, SZA reported that an AI music platform had trained on a large catalog of her recordings — a specific count of songs, not a vague gesture at "some of my music." She objected on two fronts at once. First, the legal-economic one: a model ingesting her catalog to generate sound-alikes is not a stream, not a license, not a sale. There is no royalty line for being used as raw material. Second, the cultural one: she described hearing AI-generated work imitating Black musical idioms in ways she found reductive — voices and styles flattened into a stereotype and then resold.
She did not soften it, and I am not going to soften it for her. Her register was casual, profane, direct — the register of someone who found their work in a place they never agreed to be. What makes the complaint credible is not the volume. It is the specificity, and the fact that she had been raising versions of this concern before it became a headline.
What "unauthorized training" actually means
Here is the distinction that matters for anyone managing a catalog, because the industry keeps collapsing three different things into one word.
- Sampling takes a piece of a master recording and places it in a new work. It is a copy you can hear, and it is governed by decades of clearance practice.
- A cover re-performs a composition. There are statutory mechanisms for this, and the royalties, in theory, flow to the rights holders.
- Training is neither. A model does not store SZA's vocal and replay it. It analyzes thousands of recordings to learn statistical relationships — how her phrasing bends behind the beat, how a certain production style sits in a mix — and uses those learned patterns to generate new output.
That last one is the open wound. When the model outputs something that sounds like SZA but is not a copy of any single recording, the existing clearance machinery has nothing to grab. There is no sample to license, no composition being covered. There is a fingerprint without a file. As SZA pointed out, she cannot even collect on the streams, because nothing of hers is being streamed. She was used to build the thing, then left off the cap table.
What is settled and what is not
I score games and films for a living, so let me be precise about the boundary between fact and forecast, because the discourse blurs it constantly.
What is reasonably settled:
- Outputs that reproduce a recognizable copyrighted recording are infringing the same way they always were. AI in the pipeline does not launder that.
- Using a recording artist's actual voice in a deceptive way runs into right-of-publicity and likeness claims in several jurisdictions, with some states moving to strengthen those protections specifically because of synthesis.
What is genuinely unsettled, as of this writing:
- Whether training a model on copyrighted recordings without a license is itself infringement, or falls under some exception. This is being litigated in multiple suits against major AI music and audio companies, and the courts have not given a clean answer.
- Whether "opt-out" (the arrangement that lets companies train on everything until a rights holder objects) is acceptable, or whether consent must come first. The industry position from labels, publishers, and an increasing number of artists is consent-first. Several large platforms have operated on an opt-out basis. These two positions cannot both win.
If anyone tells you this is decided, they are selling something. The honest statement is that the value of a catalog now depends partly on a question no court has finished answering.
A working audit for rights-conscious pros
If you manage, publish, or own recordings, here is what I would actually do this quarter rather than wait for case law. None of this is legal advice — it is the operational hygiene I have watched careful catalogs adopt.
- Inventory what is already public. Every recording on a public streaming service is a candidate for scraping. Know your full exposed footprint, including features and one-off collaborations you may have forgotten.
- Read the training clause in every platform agreement. Distribution deals, sync libraries, and DSP terms have begun adding language about AI training rights. Find out what you have already granted, possibly years ago, in a clause nobody read.
- Run a sound-alike search on your top assets. Search the generative platforms for your biggest artists by name and by signature style. You are looking for whether the model already does a convincing impression. If it does, document it with timestamps.
- Decide your consent posture in writing. Opt-out by default means you must act to be excluded. Identify which platforms offer an opt-out mechanism and whether you have used it.
- Watch for likeness, not only copyright. A voice that mimics your artist may be a publicity-rights matter even where the copyright question is murky. Different claim, different remedy, different lawyer.
When the sound-alike search turns up a clean impression of an artist you represent, you will feel exactly what I felt with my own scratch vocal. That recognition is the whole point of the exercise.
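None of this needs software; a spreadsheet does the job. But if you would rather keep the audit somewhere scriptable, here is a minimal sketch in Python of how the five checks above might be tracked. Every name in it (`CatalogAudit`, `Finding`, the status strings, the "ExampleGen" platform) is hypothetical and invented for illustration. It calls no real platform's API and makes no legal determination. It only keeps the paperwork in one place.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Finding:
    """One documented result from a sound-alike search (hypothetical schema)."""
    artist: str
    platform: str      # placeholder label, not a real service
    prompt: str        # the search or prompt text that produced the output
    convincing: bool   # your own judgment after listening
    # Timestamp the finding in UTC at the moment it is logged.
    found_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


@dataclass
class CatalogAudit:
    """Tracks the audit items for one catalog (illustrative only)."""
    exposed: set = field(default_factory=set)     # publicly available recordings
    clauses: dict = field(default_factory=dict)   # agreement -> "granted" / "not granted" / "unclear"
    opt_outs: dict = field(default_factory=dict)  # platform -> True if an opt-out was filed
    findings: list = field(default_factory=list)  # Finding records

    def open_items(self):
        """Return a list of follow-ups that still need a human decision."""
        items = []
        # Any clause that is not clearly "not granted" deserves a second read.
        for agreement, status in sorted(self.clauses.items()):
            if status != "not granted":
                items.append(f"review training clause: {agreement} ({status})")
        # Platforms with an opt-out mechanism that has not been used yet.
        for platform, filed in sorted(self.opt_outs.items()):
            if not filed:
                items.append(f"file opt-out: {platform}")
        # Convincing impressions go to counsel, with the timestamp attached.
        for f in self.findings:
            if f.convincing:
                items.append(
                    f"escalate sound-alike: {f.artist} on {f.platform} ({f.found_at})"
                )
        return items


# Example usage with made-up data.
audit = CatalogAudit()
audit.exposed.update({"Track A (feature)", "Track B"})
audit.clauses["Example Distribution Deal"] = "unclear"
audit.opt_outs["ExampleGen"] = False
audit.findings.append(
    Finding("Artist X", "ExampleGen", "in the style of Artist X", convincing=True)
)
for item in audit.open_items():
    print(item)
```

The design choice worth copying, whatever tool you use, is that a sound-alike finding gets its timestamp at the moment it is logged and that copyright items (clauses, opt-outs) stay separate from likeness items (findings), since they may end up with different lawyers.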
The part that is not about money
The economic argument is the one that travels in boardrooms, but SZA's second point is the one that will age. She did not only object to being used without payment. She objected to how she was used — to a system trained on Black music generating a flattened, stereotyped version of it and selling it back to a market that may not know the difference.
This is the quieter risk in AI music generation, and it has nothing to do with royalties. A model learns the average of what it is fed. Feed it the most-streamed corner of a genre and it will reproduce the cliché, not the edge. The artists who pushed a sound forward become the training set for a machine that produces the sound's most predictable form. The innovator funds her own imitation, and the imitation is duller than she was. That is not a copyright harm a court can size. It is an aesthetic and cultural one, and it is real whether or not it is actionable.
I want to be careful here. AI is an instrument, and I use it. The tools at City of Punk and elsewhere exist because plenty of producers want original sound without crate-digging through clearance nightmares. The fight is not synthesis versus humans. It is consent versus scraping, and credit versus erasure. Those are different arguments, and the people who blur them — usually to defend a training practice — are doing it on purpose.
Back to the vocal
I kept the cloned version of my own voice. I play it for people sometimes and ask if they can tell. Most cannot, at first. Then I tell them it is not me, and they listen again, and they hear it — the slightly too-even vibrato, the phrasing that never quite breathes. The tell is there if you know to look for it. For now.
That "for now" is the whole unsettled thing. SZA's lawyers can argue about her 238 songs, and they should. But the question underneath the legal fight is one no one has answered: when the tell disappears, when the model's impression of an artist is indistinguishable from the artist, what exactly has been taken, and from whom?