“A propulsive and fascinating portrait of the people who helped upend an industry and challenge how music and media are consumed.”
What happens when an entire generation commits the same crime?
How Music Got Free is a riveting story of obsession, music, crime, and money, featuring visionaries and criminals, moguls and tech-savvy teenagers. It’s about the greatest pirate in history, the most powerful executive in the music business, a revolutionary invention and an illegal website four times the size of the iTunes Music Store.
Journalist Stephen Witt traces the secret history of digital music piracy, from the German audio engineers who invented the mp3, to a North Carolina compact-disc manufacturing plant where factory worker Dell Glover leaked nearly two thousand albums over the course of a decade, to the high-rises of midtown Manhattan where music executive Doug Morris cornered the global market on rap, and, finally, into the darkest recesses of the Internet.
Through these interwoven narratives, Witt has written a thrilling book that depicts the moment in history when ordinary life became forever entwined with the world online — when, suddenly, all the music ever recorded was available for free. In the page-turning tradition of writers like Michael Lewis and Lawrence Wright, Witt’s deeply-reported first book introduces the unforgettable characters—inventors, executives, factory workers, and smugglers—who revolutionized an entire art form, and reveals for the first time the secret underworld of media pirates that transformed our digital lives.
An irresistible never-before-told story of greed, cunning, genius, and deceit, How Music Got Free isn’t just a story of the music industry—it’s a must-read history of the Internet itself.
I am a member of the pirate generation. When I arrived at college in 1997, I had never heard of an mp3. By the end of my first term I had filled my 2-gigabyte hard drive with hundreds of bootlegged songs. By graduation, I had six 20-gigabyte drives, all full. By 2005, when I moved to New York, I had collected 1,500 gigabytes of music, nearly 15,000 albums’ worth. It took an hour just to queue up my library, and if you ordered the songs alphabetically by artist, you’d have to listen for a year and a half to get from ABBA to ZZ Top.
I pirated on an industrial scale, but told no one. It was an easy secret to keep. You never saw me at the record store and I didn’t DJ parties. The files were procured in chat channels, and through Napster and BitTorrent; I haven’t purchased an album with my own money since the turn of the millennium. The vinyl collectors of old had filled whole basements with dusty album jackets, but my digital collection could fit in a shoebox.
Most of this music I never listened to. I actually hated ABBA, and although I owned four ZZ Top albums, I couldn’t tell you the name of one. What was really driving me, I wonder? Curiosity played a role, but now, years later, I can see that what I really wanted was to belong to an elite and rarefied group. This was not a conscious impulse, and, had you suggested it to me, I would have denied it. But that was the perverse lure of the piracy underground, the point that almost everyone missed. It wasn’t just a way to get the music; it was its own subculture.
I was at the very forefront of the digital download trend. Had I been just a couple of years older, I doubt I would have become so involved. My older friends regarded piracy with skepticism, and sometimes outright hostility. This was true even for those who loved music—in fact, it was especially true for them. Record collecting had been a subculture too, and, for that vanishing breed, finding albums proved to be an exhilarating challenge, one that involved scouring garage sales, sifting through bargain bins, joining mailing lists for bands, and Tuesday visits to the record store. But for me, and those younger, collecting was effortless: the music was simply there. The only hard part was figuring out what to listen to.
As I was browsing through my enormous list of albums one day a few years ago, a fundamental question struck me: where had all this music come from, anyway? I didn’t know the answer, and as I researched it, I realized that no one else did either. There had been heavy coverage of the mp3 phenomenon, of course, and of Apple and Napster and the Pirate Bay, but there had been little talk of the inventors, and almost none at all of those who actually pirated the files.
I became obsessed, and as I researched more, I began to find the most wonderful things. I found the manifesto from the original mp3 piracy clique, a document so old I needed an MS-DOS emulator just to view it. I found the cracked shareware demo for the original mp3 encoder, which even its inventors had considered lost. I found a secret database that tracked thirty years of leaks—software, music, movies—from every major piracy crew, dating back to 1982. I found secret websites in Micronesia and the Congo, registered to shell corporations in Panama, the true proprietors being anyone’s guess. Buried in thousands of pages of court documents, I found wiretap transcripts and FBI surveillance logs and testimony from collaborators in which the details of insidious global conspiracies had been laid bare.
My assumption had been that music piracy was a crowdsourced phenomenon. That is, I believed the mp3s I’d downloaded had been sourced from scattered uploaders around the globe and that this diffuse network of rippers was not organized in any meaningful way. This assumption was wrong. While some of the files were indeed untraceable artifacts from random denizens of the Internet, the vast majority of pirated mp3s came from just a few organized releasing groups. By using forensic data analysis, it was often possible to trace those mp3s back to their place of primary origination. Combining the technical approach with classic investigative reporting, I found I could narrow this down even further. Many times it was possible not just to track the pirated file back to a general origin, but actually to a specific time and a specific person.
That was the real secret, of course: the Internet was made of people. Piracy was a social phenomenon, and once you knew where to look, you could begin to make out individuals in the crowd. Engineers, executives, employees, investigators, convicts, even burnouts—they all played a role.
I started in Germany, where a team of ignored inventors, in a blithe attempt to make a few thousand bucks from a struggling business venture, had accidentally crippled a global industry. In so doing, they became extremely wealthy. In interviews, these men dissembled, and attempted to distance themselves from the chaos they had unleashed. Occasionally, they were even disingenuous, but it was impossible to begrudge them their success. After cloistering themselves for years in a listening lab, they had emerged with a technology that would conquer the world.
Then to New York, where I found a powerful music executive in his early 70s who had twice cornered the global market on rap. Nor was that his only achievement; as I researched more, I realized that this man was popular music. From Stevie Nicks to Taylor Swift, there had been almost no major act from the last four decades that he had not somehow touched. Facing an unprecedented onslaught of piracy, his business had suffered, but he had fought valiantly to protect the industry and the artists that he loved. To my eyes, it seemed unquestionable that he had outperformed all of his competitors; for his trouble, he’d become one of the most vilified executives in recent memory.
From the high-rises of midtown Manhattan I turned my attention to Scotland Yard and FBI headquarters, where dogged teams of investigators had been assigned the thankless task of tracking this digital samizdat back to its source, a process that often took years. Following their trail to a flat in northern England, I found a high-fidelity obsessive who had overseen a digital library that would have impressed even Borges. From there to Silicon Valley, where another entrepreneur had also designed a mind-bending technology, but one that he had utterly failed to monetize. Then to Iowa, then to Los Angeles, back to New York again, London, Sarasota, Oslo, Baltimore, Tokyo, and then, for a long time, a string of dead ends.
Until finally I found myself in the strangest place of all, a small town in western North Carolina that seemed as far from the global confluence of technology and music as could be. This was Shelby, a landscape of clapboard Baptist churches and faceless corporate franchises, where one man, acting in almost total isolation, had over a period of eight years cemented his reputation as the most fearsome digital pirate of all. Many of the files I had pirated—perhaps even a majority of them—had originated with him. He was the Patient Zero of Internet music piracy, but almost no one knew his name.
Over the course of more than three years I endeavored to gain his trust. Sitting in the living room of his sister’s ranch house, we often talked for hours. The things he told me were astonishing—at times they seemed almost beyond belief. But the details all checked out, and once, at the end of an interview, I was moved to ask:
“Dell, why haven’t you told anybody any of this before?”
“Man, no one ever asked.”
The death of the mp3 was announced in a conference room in Erlangen, Germany, in the spring of 1995. For the final time, a group of supposedly impartial experts snubbed the technology, favoring its eternal rival, the mp2. This was the end, and the mp3’s inventors knew it. They were running out of state funding, their corporate sponsors were abandoning them, and, after a four-year sales push, the technology had yet to secure a single long-term customer.
Attention in the conference room turned to Karlheinz Brandenburg, the driving intellectual force behind the technology and the leader of the mp3 team. Brandenburg’s work as a graduate student had pointed the way to the technology, and for the last eight years he had worked to commercialize his ideas. He was ambitious and intelligent, with a contagious vision for the future of music. Fifteen engineers worked under him, and he oversaw a million-dollar research budget. But with the latest announcement, it looked as if he had led his team into a graveyard.
Brandenburg did not possess a commanding physical presence. He was very tall, but he hunched, and his body language was erratic. He constantly rocked on his heels, lurching his gangly body forward and back, and when he talked, he nodded his head in gentle circles. His hair was dark and kept too long, and his nervous, perpetual smile exposed teeth that were uneven and small. His wire-frame glasses sat over dark, narrow eyes, and stray hairs protruded like whiskers from his scraggly beard.
He spoke quietly, in long, grammatically perfect sentences, punctuated with short, sharp intakes of breath. He was polite, and overwhelmingly kind, and he always tried his best to put people at ease, but this only made things more awkward. When he talked, he tended to dwell on practical matters, and, perhaps sensing boredom on the part of the listener, he would occasionally pepper this rambling technical discourse with weakly delivered, unfunny jokes. In his personality were united two powerful antiseptic forces: the skepticism of the engineer, and the stuffy, nation-specific conservatism they called typisch Deutsch.
He was brilliant, though. His mathematical talent was surpassing, and he held his contemporaries in thrall. These were men who had excelled in difficult academic disciplines and who had spent their lives near the top of competitive fields. They were not, as a rule, given to intellectual modesty, but when they talked of Brandenburg, their arrogance subsided and they reverted to quiet, confessional tones. “He’s very good at math,” said one. “He really is quite smart,” said another. “He solved a problem I could not,” said a third, and this, for an engineer, was the most terrible admission of all.
When challenged on a point, Brandenburg would pause, then squint, then subject the contrasting claim to a piercing scientific dismissal. In disagreement, his voice grew almost imperceptible, and in his responses he was guarded in the extreme, careful to never make an assertion without the data to back it up. In the conference room then, as he lodged his final objection to the committee, the mp3 went out with a whisper.
Defeat was always bitter, but this one was more so since, after 13 years of work, Brandenburg had solved one of the great open questions in the field of digital audio. The body of research the committee was dismissing went back decades, and engineers had been theorizing about something like the mp3 since the late 1970s. Now from this murky scientific backwater something beautiful had emerged, the refined product of a line of inquiry that went back three generations. Only the suits in the room didn’t care.
Brandenburg’s thesis adviser, a bald, stentorian computer engineer by the name of Dieter Seitzer, had started him down this path. Seitzer himself was indebted to his own thesis adviser, an obsessive investigator named Eberhard Zwicker, the father of an obscure discipline called “psychoacoustics”—the scientific study of the way humans perceive sound. Seitzer had been Zwicker’s protégé, his experimental audio subject, and, most important, his mortal opponent. For nearly a decade, the two had met every weekday after lunch for a game of table tennis, during which, over the course of an hour, Zwicker would school his pupil on the liminal contours of human perception while blasting ping-pong balls at his head. Zwicker’s chief finding, accrued over decades of research with real-world test subjects, was that the human ear did not act like a microphone. Instead it was an adaptive organ, one that natural selection had determined should 1) hear and interpret language and 2) provide an early warning system against enormous carnivorous cats.
The ear was only as good as it needed to be to achieve these goals, and no better. Thus, it had inherited a legacy of anatomical imperfections, and Zwicker’s research had revealed the unsuspected breadth of these errors. For example, anyone could distinguish two simultaneous tones separated by a half step or more, but Zwicker had found that, by moving the tones closer together in pitch, he could trick people into hearing just one. This effect was especially true when the lower-pitched tone was louder than the higher one. Similarly, any listener could distinguish between two clicks spaced a half second apart, but Zwicker had found that, by shortening this interval to just a few milliseconds, he could trick the ear into combining them. Here, too, increasing the relative loudness of one of the clicks made the effect more pronounced. The aggregate effect of these “psychoacoustic masking” illusions meant that reality, as humans heard it, was something of a fiction.
With time, Seitzer began to outplay the master. Zwicker was an anatomist, and his insights were products of the analog era. Seitzer, by contrast, was a computer scientist, and he anticipated the coming era of digitization. In particular, he suspected that, by exploiting Zwicker’s research into the ear’s inherent flaws, it might be possible to record high-fidelity music with very small amounts of data. This unique education gave him an unusual perspective. When the compact disc debuted in 1982, the engineering community celebrated it as one of the most important achievements in the history of the field. Seitzer, practically alone, saw it as a ridiculous exercise in overkill. Where the sales literature promised “Perfect Sound Forever,” Seitzer saw a maximalist repository of irrelevant information, most of which was ignored by the human ear. He knew that most of the data from a compact disc could be discarded—the human auditory system was already doing it.
That same year, Seitzer applied for a patent for a digital jukebox. Under this more elegant model of distribution, consumers could dial into a centralized computer server, then use the keypad to request music over the new digital telephone lines that Germany was just beginning to install. Rather than pressing millions of discs into jewel cases and distributing them through stores, everything would be saved in a single electronic database and accessed as needed. A subscription-based service of this kind could skip the manifold inefficiencies of physical distribution by hooking the stereo directly to the phone.
The patent was rejected. The earliest digital phone lines were primitive affairs, and the enormous amount of audio data on the compact disc could never fit down such a narrow pipe. For Seitzer’s scheme to work, the files on the disc would have to be shrunk to one-twelfth their original size, and no known approach to data compression would get you anywhere near this level. Seitzer battled with the patent examiner for a few years, citing the importance of Zwicker’s findings, but without a working implementation it was hopeless. Eventually, he withdrew his application.
Still, the idea stayed with him. If the limitations of the human ear had been mapped by Zwicker, then the remaining task was to quantify these limitations with math. Seitzer himself had never been able to solve this problem, nor had any of the many other researchers who had tried. But he directed his own protégé toward the problem with enthusiasm: the young electrical engineering student named Karlheinz Brandenburg was one of the smartest people he’d ever met.
Privately, Brandenburg wondered if a decade of table tennis with an eccentric otological experimenter had driven Seitzer insane. Information in the digital age was stored in binary units of zero or one, termed “bits,” and the goal of compression was to use as few of these bits as possible. CD audio used more than 1.4 million bits to store a single second of stereo sound. Seitzer wanted to do it with 128,000.
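The gap between those two figures can be checked with a few lines of arithmetic. The sketch below assumes the standard compact disc parameters (44,100 samples per second, 16 bits per sample, two channels), which the text does not state; only the 128,000-bit target comes from the passage above.

```python
# Standard CD audio parameters. These are assumptions: the text gives
# only the resulting figure of "more than 1.4 million bits" per second.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

# Seitzer's target, taken from the text.
TARGET_BITS_PER_SECOND = 128_000

# Bits needed for one second of uncompressed stereo sound.
cd_bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS

# How many times smaller the compressed stream has to be.
compression_ratio = cd_bits_per_second / TARGET_BITS_PER_SECOND

print(f"CD audio: {cd_bits_per_second:,} bits per second")
print(f"Required shrink factor: about {compression_ratio:.1f} to 1")
```

Under these assumptions the required factor comes out a little above 11 to 1, which is in line with the "one-twelfth" figure used as a round number elsewhere in the story.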
Brandenburg thought this goal was preposterous—it was like trying to build a car on a budget of two hundred dollars. But he also thought it was a worthy target for his own ambitions. He worked on the problem for the next three years, until in early 1986 he spotted an avenue of inquiry that had never been explored. Dubbing this insight “analysis by synthesis,” he spent the next few sleepless weeks writing a set of mathematical instructions for how those precious bits could be assigned.
He began by chopping the audio up. With a “sampler,” he divided the incoming sound into fractional slivers of a second. With a “filter bank,” he then further sorted the audio into different frequency partitions. (The filter bank worked on sound the way a prism worked on light.) The result was a grid of time and frequency, consisting of microscopic snippets of sound, sorted into narrow bands of pitch—the audio version of pixels.
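The grid of time and frequency can be illustrated with a toy example. This is only a sketch: it uses non-overlapping frames and a plain discrete Fourier transform as a stand-in for the filter bank, whereas real audio codecs use more elaborate overlapping filter banks. The function names and the 32-sample frame length are hypothetical choices made for the illustration.

```python
import cmath
import math

def split_into_frames(signal, frame_len):
    """Chop a list of samples into consecutive, non-overlapping slivers of time."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def band_magnitudes(frame):
    """Naive DFT: how much energy one frame holds in each frequency band."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def time_frequency_grid(signal, frame_len):
    """Rows are slivers of time, columns are bands of pitch: the audio 'pixels'."""
    return [band_magnitudes(f) for f in split_into_frames(signal, frame_len)]

# Toy input: a pure tone that completes 4 cycles per 32-sample frame,
# lasting for 4 frames.
FRAME_LEN = 32
tone = [math.sin(2 * math.pi * 4 * t / FRAME_LEN) for t in range(FRAME_LEN * 4)]

grid = time_frequency_grid(tone, FRAME_LEN)
loudest_band = max(range(len(grid[0])), key=lambda k: grid[0][k])
print(f"{len(grid)} time slices x {len(grid[0])} bands, loudest band: {loudest_band}")
```

Each row of `grid` is one sliver of time and each column is one band of pitch, so a pure tone shows up as a single bright column, much as a single color shows up as one line behind a prism.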
Brandenburg then told the computer how to simplify these audio “pixels” using four of Zwicker’s psychoacoustic tricks:
First, Zwicker had shown that human hearing was best at a certain range of pitch frequencies, roughly corresponding to the tonal range of the human voice. At registers beyond that, hearing degraded, particularly as you went higher on the scale. That meant you could assign fewer bits to the extreme ends of the spectrum.
Second, Zwicker had shown that tones that were close in pitch tended to cancel each other out. In particular, lower tones overrode higher ones, so if you were digitizing music with overlapping instrumentation—say a violin and a cello at the same time—you could assign fewer bits to the violin.
Third, Zwicker had shown that the auditory system canceled out noise following a loud click. So if you were digitizing music with, say, a cymbal crash every few measures, you could assign fewer bits to the first few milliseconds following the beat.
Fourth, and this is where it gets weird, Zwicker had shown that the auditory system also canceled out noise prior to a loud click. This was because it took a few milliseconds for the ear to actually process what it was sensing, and this processing could be disrupted by a sudden onrush of louder noise. So, going back to the cymbal crash, you could also assign fewer bits to the last few milliseconds before the beat.
Relying on decades of empirical auditory research, Brandenburg told the bits where to go. But this was just the first step. Brandenburg’s real achievement was figuring out that you could run this process iteratively. In other words, you could take the output of his bit-assignment algorithm, feed it back into the algorithm, and run it again. And you could do this as many times as you wished, each time reducing the number of bits you were spending, making the audio file as small as you liked. There was degradation of course: like a copy of a copy or a fourth-generation cassette dub, with each successive pass of the algorithm, audio quality got worse. In fact, if you ran the process a million times, you’d end up with nothing more than a single bit. But if you struck the right balance, it would be possible to both compress the audio and preserve fidelity, using only those bits you knew the human ear could actually hear.
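The idea of running a bit-assignment pass over and over until the file fits a budget can be sketched as a simple loop. This is not Brandenburg's algorithm, whose details the text does not give; it is a toy greedy scheme in which each pass removes one bit from the band where the loss should be least noticeable. The `audibility` weights and the function name are hypothetical.

```python
def shrink_to_budget(bits, audibility, budget):
    """Repeatedly take one bit from the least audible band until the total fits.

    bits:       starting bits per frequency band (list of ints)
    audibility: hypothetical weight per band; higher means the ear is more
                likely to notice a loss there
    budget:     maximum total number of bits allowed (non-negative)
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    bits = list(bits)
    passes = 0
    while sum(bits) > budget:
        # Only bands that still hold bits can give one up.
        candidates = [i for i, b in enumerate(bits) if b > 0]
        # Taking a bit costs more as a band becomes starved, so the
        # estimated cost is the band's weight divided by its remaining bits.
        victim = min(candidates, key=lambda i: audibility[i] / bits[i])
        bits[victim] -= 1
        passes += 1
    return bits, passes

# Four bands: the middle two sit in the range where hearing is sharpest.
start = [16, 16, 16, 16]
weights = [1.0, 4.0, 4.0, 0.5]
result, passes = shrink_to_budget(start, weights, budget=32)
print(result, "after", passes, "passes")
```

Lowering `budget` makes the loop run longer and the result coarser, which mirrors the trade-off described above: every additional pass saves bits and costs some fidelity.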
Of course, not all musical work employed such complex instrumentation. A violin concerto might have all sorts of psychoacoustic redundancies; a violin solo would not. Without cymbal crashes, or an overlapping cello, or high register information to be simplified, there was just a pure tone and nowhere to hide. What Brandenburg could do here, though, was dump the output bits from his compression method into a second, completely different one.
Termed “Huffman coding,” this approach had been developed by the pioneering computer scientist David Huffman at MIT in the 1950s. Working at the dawn of the Information Age, Huffman had observed that if you wanted to save on bits, you had to look for patterns, because patterns, by definition, repeated. Which meant that rather than assigning bits to the pattern every time it occurred, you just had to do it once, then refer back to those bits as needed. And from the perspective of information theory, that was all a violin solo was: a vibrating string, cutting predictable, repetitive patterns of sound in the air.
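A compact illustration of Huffman coding follows. In its standard form the method counts how often each symbol occurs and gives the most frequent symbols the shortest bit strings, which is how repetition turns into savings. The note sequence here is invented for the example, and the sketch says nothing about how the mp3 format itself applies the technique.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, {symbol: code so far}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two rarest groups, prefixing their codes with 0 and 1.
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

def encode(symbols, code):
    """Concatenate the bit string of every symbol."""
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    """Read bits until they match a code word, emit the symbol, repeat."""
    reverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return out

# A made-up "violin solo": one note dominates, two others appear rarely.
notes = list("AAAAAAAABBBC")
code = huffman_code(notes)
encoded = encode(notes, code)
print(code)
print(len(encoded), "bits, versus", 2 * len(notes), "with a fixed 2-bit code")
```

Because no code word is the beginning of another, the decoder can read the bit stream from left to right without any separators between symbols.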
The two methods complemented each other perfectly: Brandenburg’s algorithm for complicated, overlapping noise; Huffman’s for pure, simple tones. The combined result united decades of research into acoustic physics and human anatomy with basic principles of information theory and complex higher math. By the middle of 1986, Brandenburg had even written a rudimentary computer program that provided a working demonstration of this approach. It was the signature achievement of his career: a proven method for capturing audio data that could stick to even the stingiest budget for bits. He was 31 years old.
He received his first patent before he’d even defended his thesis. For a graduate student, Brandenburg was unusually interested in the dynamic potential of the marketplace. With a mind like his, a tenure-track position was guaranteed, but academia held little interest for him. As a child he’d read biographies of the great inventors, and at an early age had internalized the importance of the hands-on approach. Brandenburg—like Bell, like Edison—was an inventor first.
These ambitions were encouraged. After escaping from Zwicker, Dieter Seitzer had spent most of his own career at IBM, accruing basic patents and developing keen commercial instincts. He directed his graduate students to do likewise. When he saw the progress that Brandenburg was making in psychoacoustic research, he pushed him away from the university and toward the nearby Fraunhofer Institute for Integrated Circuits, the newly founded Bavarian technology incubator that Seitzer oversaw.
The institute was a division of the Fraunhofer Society, a massive state-run research organization with dozens of campuses across the country—Germany’s answer to Bell Labs. Fraunhofer allocated taxpayer money toward promising research across a wide variety of academic disciplines, and, as the research matured, brokered commercial relationships with large consumer industrial firms. For a stake in the future revenues of Brandenburg’s ideas, Fraunhofer offered state-of-the-art supercomputers, high-end acoustic equipment, professional intellectual property expertise, and skilled engineering manpower.
The last was critical. Brandenburg’s method was complex, and required several computationally demanding mathematical operations to be conducted simultaneously. 1980s computing technology was barely up to the task, and algorithmic efficiency was key. Brandenburg needed a virtuoso, a caffeine-addled superstar who could translate graduate-level mathematical concepts into flawless computer code. At Fraunhofer he found his man: a 26-year-old computer programmer by the name of Bernhard Grill.
Grill was shorter than Brandenburg and his manner was far more calm. His face was broad and friendly and he wore his sandy hair a little long. He spoke more loudly than Brandenburg, with more passion, and conversations with him were composed and natural. He told jokes, too, jokes that were—well, not all that funny either, but certainly better than Brandenburg’s.
In the world of audio, Grill stood out, for it was possible to imagine him as something other than an engineer. Like Brandenburg, he was Bavarian, but his attitude was more bohemian. He had a relaxed, wonkish nature to him, and was the sort of person who, had he lived in America, might have favored sandals and a Hawaiian shirt. Perhaps it was his background. While Brandenburg’s father was himself a professor, and most of the other Fraunhofer researchers hailed from the upper middle class, Grill’s father had worked in a factory. For Brandenburg, a university education had been a given, practically a birthright, but for Grill it had real meaning.
In his own way he had rebelled against the typisch Deutsch mentality. His original passion had been music. At a young age Grill had taken up the trumpet, and by his teens he was practicing six hours a day. During a brief period in his early 20s he had played professionally in a nine-piece swing band. When the economic realities of that career choice became apparent, he’d returned to engineering, and ended up studying computers. But music remained close to his heart, and over the years he amassed an enormous, eclectic collection of recorded music from a variety of obscure genres. His other hobby was building loudspeakers.
Brandenburg and Grill were joined by four other Fraunhofer researchers. Heinz Gerhäuser oversaw the institute’s audio research group; Harald Popp was a hardware specialist; Ernst Eberlein was a signal processing expert; Jürgen Herre was another graduate student whose mathematical prowess rivaled Brandenburg’s own. In later years this group would refer to themselves as “the original six.”
Beginning in 1987, they took on the full-time task of creating commercial products based on Brandenburg’s patent. The group saw two potential avenues for development. First, Brandenburg’s compression algorithm could be used to “stream” music—that is, send it directly to the user from a central server, as Seitzer had envisioned. Alternatively, Brandenburg’s compression algorithm could be used to “store” music—that is, create replayable music files that the user would keep on a personal computer. Either way, size mattered, and getting the compression ratio to 12 to 1 was the key.
It was slow going. Computing was still emerging from its homebrew origins, and the team built most of its equipment by hand. The lab was a sea of cables, speakers, signal processors, CD players, woofers, and converters. Brandenburg’s algorithm had to be coded directly onto programmable chips, a process that could take days. Once a chip was created, the team would use it to compress a ten-second sample from a compact disc, then compare it with the original to see if they could hear the difference. When they could—which, in the early days, was almost always—they refined the algorithm and tried again.
They started at the top, with the piccolo, then worked down the scale. Grill, who had obsessed over acoustics since childhood, could see at once that the compression technology was far from being marketable. Brandenburg’s algorithm generated a variety of unpredictable errors, and at times it was all Grill could do to take inventory. Sometimes, the encoding was “muddy,” as if the music were being played underwater. Sometimes it “hissed,” like static from an AM radio. Sometimes there was “double-speak,” as if the same recording had been overlaid twice. Worst of all was “pre-echo,” a peculiar phenomenon where ghostly remnants of musical phrases popped up several milliseconds early.
Brandenburg’s math was elegant, even beautiful, but it couldn’t fully account for the messy reality of perception. To truly model human hearing, they needed human test subjects. And these subjects required training to understand the vocabulary of failure as well as Grill did. And once this expertise was established, it would have to be submitted to thousands upon thousands of controlled, randomized, double-blind trials.
Grill approached this time-consuming endeavor with enthusiasm. He was what they called a “golden ear”: he could distinguish between microtones and pick up on frequencies normally available only to children and dogs. He approached the sense of hearing the way a perfumer approached the sense of smell, and this sharpened sense allowed him to name and grade certain sensory phenomena—certain aspects of reality, really—that others could never know.
Charged with selecting the reference material, Grill combed his massive compact disc archive for every conceivable form of music: funk, jazz, rock, R&B, metal, classical—every genre except rap, which he disliked. He wanted to throw everything he could find at Brandenburg’s algorithm, to be sure it could handle every conceivable case. Funded by Fraunhofer’s generous research budget, Grill went beyond music to become a collector of exotic noise. He found recordings of fast talkers with difficult accents. He found recordings of birdcalls and crowd noise. He found recordings of clacking castanets and mistuned harpsichords. His personal favorite came from a visit to Boeing headquarters in Seattle, where, in the gift shop, he found a collection of audio samples from roaring jet engines.
Under Grill’s direction, Fraunhofer also purchased several pairs of thousand-dollar Stax headphones. Made in Japan, these “electrostatic earspeakers” were the size of bricks and required their own dedicated amplifiers. They were impractical and expensive, but Grill considered the Stax to be the finest piece of equipment in the history of audio. They revealed every imperfection with grating clarity, and the ability to isolate these digital glitches spurred a cycle of continuous improvement.
Like a shrinking ray, the compression algorithm could target different output sizes. At half size, the files sounded decent. At quarter size, they sounded OK. In March 1988, Brandenburg isolated a recording of a piano solo, then dialed the encoding ratio as low as he dared—all the way down to Seitzer’s crazy stretch goal of one-twelfth CD size. The resulting encoding was lousy with errors. Brandenburg would later say the pianist sounded “drunk.” But even so, this experiment in uneasy listening gave him confidence, and he began to see for the first time how Seitzer’s vision might be achieved.
Increases in processing power spurred progress. Within a year Brandenburg’s algorithm was handling a wide variety of recorded music. The team hit a milestone with the 1812 Overture, then another with Tracy Chapman, then another with a track by Gloria Estefan (Grill was on a Latin kick). In late 1988, the team made its first sale, and shipped a hand-built decoder to the first-ever end user of mp3 technology: a tiny radio station run by missionaries on the remote Micronesian island of Saipan.
But one audio source was proving intractable: what Grill, with his imperfect command of English, called “the lonely voice.” (He meant “lone.”) Human speech could not, in isolation, be psychoacoustically masked. Nor could you use Huffman’s pattern recognition approach—the essence of speech was its dynamic nature, its plosives and sibilants and glottal stops. Brandenburg’s shrinking algorithm could handle symphonies, guitar solos, cannons, even “Oye Mi Canto,” but it still couldn’t handle a newscast.
Stuck, Brandenburg isolated samples of “lonely” voices. The first was a recording of a difficult German dialect that had plagued audio engineers for years. The second was a snippet of Suzanne Vega singing the opening bars of “Tom’s Diner,” her 1987 radio hit. Perhaps you remember the a cappella intro to “Tom’s Diner.” It goes like this:
Dut dut duh dut
Dut dut duh dut
Dut dut duh dut
Dut dut duh dut
Vega had a beautiful voice, but on the early stereo encodings it sounded as if there were rats scratching at the tape.
In 1989, Brandenburg defended his thesis and was awarded his PhD. He then took the voice samples with him on a fellowship to AT&T’s Bell Labs in Murray Hill, New Jersey. There, he worked with James Johnston, a specialist in voice encoding. Johnston was the Newton to Brandenburg’s Leibniz—independently, he had hit upon an identical mathematical approach to psychoacoustic modeling, at almost exactly the same time. After an initial period spent marking territory, the two decided to cooperate. Throughout 1989, listening tests continued in parallel in Erlangen and Murray Hill, but the American test subjects proved less patient than the Germans. After listening to the same rat-eaten, four-second sample of “Tom’s Diner” several hundred times, the volunteers at Bell Labs revolted, and Brandenburg was forced to finish the experiment on his own. He was there in New Jersey, listening to Suzanne Vega, when the Berlin Wall came down.
Johnston was impressed by Brandenburg. He’d spent his life around academic researchers and was accustomed to brilliance, but he’d never seen anybody work so hard. Their collaboration spurred several breakthroughs, and soon the scratching rats were banished. In early 1990, Brandenburg returned to Germany with a nearly finished product in hand. Many compressed samples now revealed a state of perfect “transparency”: even to a discriminating listener like Grill, using the best equipment, they were indistinguishable from the original compact discs.
Impressed, AT&T officially graced the technology with its imprimatur and a modicum of corporate funding. Thomson, a French consumer electronics concern, also began to provide money and technical support. Both firms were seeking an edge in psychoacoustics, as this long-ignored academic discipline was suddenly white hot. Research teams from Europe, Japan, and the United States had been working on the same problem, and other large corporations were jockeying for position. Many had thrown their weight behind Fraunhofer’s better-established competitors. Seeking to mediate, the Moving Picture Experts Group (MPEG)—the standards committee that even today decides which technology makes it to the consumer marketplace—convened a contest in Stockholm in June 1990 to conduct formalized listening tests for the competing methods.
As the ’90s opened, MPEG was preparing for a decade of disruption, shaping technological standards for near-future technologies like high-definition television and the digital video disc. Being moving picture experts, the committee had first focused exclusively on video quality. Audio encoding problems were an afterthought, one they’d tackled only after Brandenburg pointed out that there was no longer much of a market for silent movies. (This was the sort of joke that Brandenburg liked to make.)
An MPEG endorsement might mean a fortune in licensing fees, but Brandenburg knew it would be tough to get. The Stockholm contest was to be graded against ten audio benchmarks: an Ornette Coleman solo, the Tracy Chapman song “Fast Car,” a trumpet solo, a glockenspiel, a recording of fireworks, two separate bass solos, a ten-second castanet sample, a snippet of a newscast, and a recording of Suzanne Vega performing “Tom’s Diner.” (The last was suggested by Fraunhofer.) The judges were neutral participants, selected from a group of Swedish graduate students. And, as MPEG needed undamaged ears that could still hear high-pitched frequencies, the evaluators skewed young.
Fourteen different groups submitted entries to the MPEG trials—the high-stakes version of a middle school science fair. On the eve of the contest, the competing groups conducted informal demonstrations. Brandenburg was confident his group would win. He felt that access to Zwicker’s seminal research, still untranslated from German, gave him an insurmountable edge.
The next day a room full of fair-haired, clear-eared Scandinavian virgins spent the morning listening to “Fast Car” ripped 14 different ways. The listeners scored the results for sound quality on a five-point scale. After tabulating the answers, MPEG announced the results—it was a tie! At the top was Fraunhofer, locked in a statistical dead heat with a rival group called MUSICAM. No one else was close.
Fraunhofer’s strong showing in the contest was unexpected. They were a dark horse candidate from a research institution, a bunch of graduate students competing against established corporate players. MUSICAM was more representative of the typical MPEG contest winner—a well-funded consortium of inventors from four different European universities, with deep ties to the Dutch corporation Philips, which held the patents on the compact disc. MUSICAM also had several German researchers on staff, and Brandenburg suspected this was not a coincidence. They’d had access to Zwicker’s untranslated research, too.
MPEG had not anticipated a tie, and had not made provisions to break one. Fraunhofer’s approach provided better audio quality with less data, but MUSICAM’s required less processing power. Brandenburg felt this disparity worked in his favor, as computer processing speed improved with each new chip cycle, and doubled every 24 months or so. Improving bandwidth was more difficult, as it required digging up city streets and replacing thousands of miles of cable. Thus, Brandenburg felt, MPEG should look to conserve bandwidth rather than processing cycles, and he repeatedly made this argument to the audio committee. But he felt he was being ignored.
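Brandenburg's reasoning rests on compounding, which is easy to underestimate. As a minimal sketch, assuming the "doubled every 24 months" figure holds exactly (the text itself says "or so"), the cumulative gain in processing speed looks like this; the function name is hypothetical.

```python
def speedup(years: float, doubling_period_years: float = 2.0) -> float:
    """Multiplicative speed gain after `years`, assuming steady doubling."""
    return 2 ** (years / doubling_period_years)


for years in (2, 4, 5, 10):
    print(f"after {years:>2} years: about {speedup(years):.1f}x")
```

On that assumption, a method that seems too demanding for today's chips faces hardware more than five times faster within five years, while the cables in the street stay the same.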
After Stockholm the team waited for months for a ruling from MPEG. In October 1990, Germany was reunified, and Grill kept himself busy by applying Brandenburg’s algorithm to his new favorite song: the Scorpions’ “Wind of Change.” In November, Eberhard Zwicker, hearing researcher and table tennis enthusiast, passed away at the age of 66. In January 1991, the Fraunhofer team rolled out its first commercial product, a 25-pound hardware rack for broadcast transmission. It made an early sale to the bus shelters of a reunified Berlin.
Finally, MPEG approached Fraunhofer with a compromise. The committee would make multiple endorsements. Fraunhofer would be included, but only if they agreed to play by certain rules, dictated by MUSICAM. In particular, they would have to adopt a gangrenous piece of proprietary technology called a “polyphase quadrature filter bank.” Four uglier words did not exist. Some kind of filter bank was necessary—this was the technology that split sound into component frequencies, the same way a prism did to light. But the Fraunhofer team already had its own filter bank, which worked fine. Adding another would double the complexity of the algorithm, with no increase in sound quality. Worse, Philips had a patent on the code, which meant giving an economic stake in Fraunhofer’s project to its primary competitor. After a long and heated internal debate, Brandenburg finally agreed to this compromise, as he didn’t see a way forward without MPEG’s endorsement. But to others on the project, it looked like Fraunhofer had been fleeced.
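The prism comparison can be made concrete with a toy example. The sketch below uses a plain discrete Fourier transform, which is neither the polyphase quadrature filter bank nor Fraunhofer's own filter bank; it only illustrates the general idea of splitting a sound into its component frequencies. The sample rate and the two tones are made-up values chosen for the demonstration.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive discrete Fourier transform; returns one magnitude per frequency bin."""
    n = len(samples)
    mags = []
    for k in range(n // 2):  # bins below the Nyquist frequency
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        mags.append(abs(total) / n)
    return mags


SAMPLE_RATE = 400  # samples per second (hypothetical value for the demo)
N = 400            # one second of samples, so bin k corresponds to k Hz

# A toy "sound": a strong 50 Hz tone mixed with a quieter 120 Hz tone.
signal = [math.sin(2 * math.pi * 50 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 120 * t / SAMPLE_RATE)
          for t in range(N)]

mags = dft_magnitudes(signal)

# The two strongest bins recover the two tones hidden in the mixture.
peaks = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
print(sorted(peaks))
```

The mixed signal goes in as a single stream of numbers and comes out sorted by frequency, with the two tones standing apart from everything else. A real codec's filter bank does the same kind of separation far more efficiently, which is what makes it possible to decide, band by band, what the ear will not miss.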
In April 1991, MPEG made its endorsements public. Of the 14 original contenders, three methods would survive. The first was termed Moving Picture Experts Group, Audio Layer I, a compression method optimized for digital cassette tape that was obsolete practically the moment the press release was distributed. Then, with a naming scheme that could only have come from a committee of engineers, MPEG announced the other two methods: MUSICAM’s method, which would henceforth be known as the Moving Picture Experts Group, Audio Layer II—better known today as the mp2—and Brandenburg’s method, which would henceforth be known as the Moving Picture Experts Group, Audio Layer III—better known today as the mp3.
Seeking to create a unified framework for collaboration, MPEG had instead sparked a format war. The mp3 had the technical edge, but the mp2 had name recognition and deeper corporate backing. The MUSICAM group was really just a proxy for Philips, and Philips was visionary. The company was making a fortune in licensing from the compact disc, but already, in 1990, with CD sales just starting to outpace vinyl, it was looking to control the market for its eventual replacement.
This farsighted strategic planning was complemented by a certain gift for low cunning. By this time, both Brandenburg and Grill were beginning to suspect that the suits at Philips were influencing MPEG’s decisions by lobbying behind the scenes. Johnston, the American, shared these suspicions of favoritism, and scoffed at the ridiculous three-tiered “layer” scheme, a last-minute rule change MPEG had made only when its favored team looked likely to lose. Brandenburg, Grill, and Johnston all used the same word to describe this emergent phenomenon: “politics”—a hateful state of affairs in which personal relationships and business considerations trumped raw scientific data.
MPEG defended its decisions and denied any allegations of bias. MUSICAM researchers were indignant at the suggestion. Still, history showed that, from the AC/DC “Current Wars” of the late nineteenth century to the VHS-Betamax battle of the 1980s, victory didn’t necessarily go to the best, but to the most vicious. From Edison to Sony, the spoils were won by those who not only promoted their own standard, but who cleverly undermined the competition. There was a reason they called it a format “war.”
The Fraunhofer team, consisting of young, naive academics, were unprepared for such a battle. Over the next few years, in five straight head-to-head competitions, they got swept. Standardization committees chose the mp2 for digital FM radio, for interactive CD-ROMs, for Video Compact Disc (the predecessor to the DVD), for Digital Audio Tape, and for the soundtrack to over-the-air HDTV broadcasting. They chose the mp3 for nothing.
In discussions with other engineers, the team kept hearing the same criticism: that the mp3 was “too complicated.” In other words, it ate up too much computer processing power for what it spit out. The problem could be traced to Philips’ baneful filter bank. Half of the “work” the mp3 did was just getting around it. In the engineering schematics explaining mp3 technology, the flowchart showed how Brandenburg’s algorithm sidestepped the filter bank entirely, like a detour around a car crash.
The Fraunhofer team began to see how they’d been outmaneuvered. Philips had convinced Fraunhofer to adopt its own inefficient methodology, then pointed to this exact inefficiency to sink them with the standards committees. Worse, engineers there seemed to have started a whisper campaign, to spread the word about these failures to the audio engineering community at large. It was a commendable piece of corporate sabotage. They’d tricked Fraunhofer into wearing an ugly dress to the pageant, then made fun of them behind their backs.
But Brandenburg was not one to cry in the corner—ugly dress or not, he was determined to win. In July 1993, he was given a Fraunhofer directorship. Though he had zero business experience and was fighting from a losing position, he drove his team at all hours. Around this time a gang of thieves broke into the Erlangen campus in the middle of the night, making off with tens of thousands of dollars in computing equipment. Every division was hit, save for the floor that housed audio research. There, at some dead hour of the night, long after everyone else had gone home, two mp3 researchers were still in the listening lab, deaf to the world in their expensive Japanese headphones.
This dedication brought results. By 1994, the mp3 offered substantial improvements in audio quality over the mp2, although it still took slightly longer to encode. Even at the aggressive 12 to 1 compression ratio, the mp3 sounded decent, if not quite stereo quality. Twelve years after a patent examiner had told Seitzer it was impossible, the ability to stream music over digital phone lines was nearly at hand. Plus, there was the growing home PC market, and the prospect of locally stored mp3 media applications.
They just had to make it that far. In early 1995, the mp2 again beat the mp3 in a standards competition, this time for a massive market: the audio track for the home DVD player. Having watched Brandenburg’s team go zero for six, the budget directors at Fraunhofer were starting to ask hard questions. Like: why haven’t you won a standards competition yet? And: why do you have fewer than 100 customers? And: do you think perhaps we could borrow some of your engineers for a different project? And: remind me again why the German taxpayer has sunk millions of deutsche marks into this idea?
So in the spring of 1995, when Fraunhofer entered its final competition, for a subset of multicast frequencies on the European radio band, winning was everything. This was a small market, certainly, but one that would provide enough revenue to keep the team together. And for once there was reason for optimism: the group’s meetings rotated through its membership base, and this time Fraunhofer was scheduled to host. They’d be on home turf, and the final decision on the mp3 would be hashed out in a conference room just down the hall from the laboratory where, seven years earlier, the work on the piccolo had begun.
For months in advance, the broadcasting group strung Fraunhofer along. They promised to revisit the decisions of the past and encouraged them to continue the development of the mp3. They welcomed Brandenburg’s presence in committee meetings and told him they understood the funding difficulties his team was facing. They urged him to hold on just a little bit longer. In advance of the meeting, the committee’s specialized audio subgroup even formally recommended the adoption of the mp3.
Still, Brandenburg wanted nothing left to chance. He put together an engineering document that comprehensively debunked the complexity myth. Fifty pages long, it included a chart showing how, for the past five years, processing speed had outpaced bandwidth gains, just as he had predicted.
The meeting began late in the morning. The conference room in Erlangen was small and the working group was large, so Grill and the other nonpresenting members of the team had to wait outside. Brandenburg was optimistic as he took his seat. He distributed bound copies of his fifty-page presentation, then worked through his talking points with quiet precision. The mp3 could encode higher-quality sound with less data, he said. When planning standards, it was important to look to the future, he said. Computer processing speed would catch up with the algorithm, he said. The complexity argument was a myth, he said. Throughout, he referred to the presentation.
When he was done, it was MUSICAM’s turn. They handed out a presentation, too. It was two pages long. Their spiel was equally brief: a slick reminder of the elegant simplicity of the mp2. Then the committee began its discussions.
Brandenburg quickly realized that, despite the subgroup’s official recommendation, the mp3 was guaranteed nothing. Deliberations continued for the next five hours. The talks grew acrimonious, and once again Brandenburg sensed behind-the-scenes machinations of a political nature. An increasingly agitated Grill repeatedly stopped by the conference room, then left to pace the hall with his colleagues. Finally, a representative from Philips took the floor. His argument was concise: two separate radio standards would lead to fear, uncertainty, and doubt. The whole point of standards was that you needed only one. After a subtle dig at the mp3’s processing power requirements, he concluded with a direct plea to the working group’s voting members: “Don’t destabilize the system.” Then the steering committee—in the interests of stability, presumably—voted to abandon the mp3 forever.
This was the end. There was nothing left to hope for. MPEG had barred them from the video disc and the broadcasting committees had kicked them off the airwaves. In head-to-head competitions against the mp2, Fraunhofer was now zero for seven. The mp3 was Betamax.
Bernhard Grill was crushed. He had been working on this technology for the better part of a decade. Standing in the crowded conference room, his back against the wall, he considered challenging the ruling. He was emotional, and he knew that, once he began speaking, he might lose control and unleash an angry harangue, fueled by the pent-up frustration he felt toward this group of know-nothing corporate big shots who’d been stringing him along for years. Instead, he remained quiet.
Typisch Deutsch, after all. Grill’s failure to speak up at this moment would haunt him for years to come. The budget vultures were smelling blood, and he knew that the mp3’s corporate underwriters would now pull the plug. The German state was happy to sponsor a technology with a fighting chance, but now the format war was plainly lost. Grill was stubborn, and determined to go down swinging, but he foresaw tough conversations ahead: the abandonment of a dead-end project, the breakup of the team, the patronizing commiseration over years of work spent for nothing.
Karlheinz Brandenburg, too, was devastated. He had handled the previous losses with equanimity, but this time they’d let him get his hopes up. The Philips delegate hadn’t even made a real argument. He’d just exercised his political muscle, and that was it. The whole experience seemed sadistic, a deliberate attempt to crush his spirits. For years to come, when he talked of this meeting, the nervous smile would fade, his lips would tighten, and a distant look would appear upon his face.
Still, this was engineering, where verified results should by necessity triumph over human sentiment. After the meeting, Brandenburg gathered his team for a brief pep talk, during which—the forced smile having returned—he explained how the “standards” people had simply made a mistake. Again. The team was baffled by this upbeat attitude, but Brandenburg could point to a binder full of engineering data, full of double-blind tests, that consistently showed his technology was better. Political dickering aside, that was all that mattered. Some way, somehow, the mp3 had to win in the end. They just had to find someone to listen.
On a Saturday morning later that same year, 1995, two men commuted to work at the PolyGram compact disc manufacturing plant in Kings Mountain, North Carolina. They traveled in a black Jeep Grand Cherokee four-by-four with heavily tinted windows. The men were both part-timers at the plant, and their weekend gigs supplemented the income they earned from other jobs moving furniture and serving fast food. The passenger’s name was James Anthony Dockery, but everyone called him “Tony.” The driver’s name was Bennie Lydell Glover, but everyone called him “Dell.”
The men had met a few months earlier on the factory floor, where Dockery, a talker, had convinced Glover, a listener, to provide him with a standing ride to work. They both lived in Shelby, a small town of 15,000 people located about twenty minutes to the northwest. Glover was 21 years old. Dockery was 25. Neither man had graduated from college. Both were practicing Baptists. Neither had lived more than a few miles away from the place where he’d been born.
Glover was black, wore a chinstrap beard and a well-manicured fade, and dressed in T-shirts and blue jeans. His physique was wiry and muscular, and the corners of his mouth turned down into a grimace. His heavy eyelids gave his face a look of perpetual indifference, his body language was slow and deliberate, and there was a stillness to his presence that approached torpor. When he spoke, which wasn’t often, he would first take several moments to collect his thoughts. Then his voice emerged, extremely deep and drenched in the syrupy tones of the small-town South, the medium of delivery for a pithy sentence, maybe less.
Dockery was white, with close-cropped sandy blond hair and bulbous, glassy eyes. He was shorter than Glover, and his weight vacillated between merely girthy and positively obese. He was a fast-talking jokester, emotional and volatile, and although he could be quick to anger, he tended to laugh as he cursed you out. He made his opinions available to anyone who would listen, and even to many who would not.