Audiobook Production Tips

A Step-By-Step Guide to Swearing at Your Cats

The audiobooks I’ve recorded include Empire of the Goddess and Cursed by Christ. Sure, doing short recordings for Variant Frequencies was fun, but sustaining 10- and 8-hour performances intended for sale took me into a whole new realm of hurt.

So, do you want to do it? Then pour yourself a warm beverage, cough out your morning phlegm, and take a seat on your least creaky chair, cuz here’s a crash course on what I’ve learned so far.

Pre-Production

Manuscript

You gotta have something to read, of course. Make a copy of your manuscript file to become your audio script. Then you’ll be free to make inline notations to yourself like, “(SOUTHERN ACCENT)” or the names of corresponding audio files.

Equipment

You don’t need fancy hardware, but you do need something decent. I use a Blue Yeti USB Microphone, colored black like my soul, with a clamp-on Dragonpad pop filter. Total cost: $117. My computer is a Dell PC with a sweet solid-state drive, which is another $1,000. You’ll see from the picture above I also have a tiled monitor and peripheral mouse. I display the manuscript I’m reading on my big screen and run the sound program on the little one.

For software, my manuscript is in MS Word, and I do all the sound recording and editing with Audacity, which is free. During post-production, as I’ll detail below, I use the fre:ac audio converter, which is also free. Some audiobook performers like to read their manuscripts off an iPad for silent scrolling. I get around that issue with my Logitech mouse, which has a non-click, non-wheel scrolling mode.

Anything else equipment-wise is gravy, and the next time I record a book, I might just buy the gravy — like one of these $600 booths — or lease some time at a recording studio. Because here’s the thing: sound isolation. Holy Christ, was that a pain in the ass.

My office is probably the noisiest room in the house, what with the hot water heater, the furnace, and the washer and dryer. Add to that having three cats who, after plowing in Cosmo Kramer-like through the upstairs cat door, like to stomp down the stairs to me, meow for attention, and start chucking litter. After that, they’ll go back outside and proceed to claw my window screen for attention. They drive me nuts!

Don’t believe me? Check out this blooper reel:

Production

Record just one chapter at a time. In my books, an average chapter runs 5,000 words, which takes about 30-45 minutes. After that, it’s time for a nap. So don’t push yourself; the point is to give a great dramatic performance, even if you suck at character voices. Even unadorned prose requires attention to enunciation, pace, and inflection to sound natural, because you don’t want to sound like you’re reading off a screen — even if you are.

Don’t make my mistake and read as if you’re on stage. You don’t have to emote for the back row. Your listener’s ear is effectively an inch or two from your mouth. Instead, your technique will be more akin to screen acting: intimate and up close.

Unless you’re a great cold reader (and some of us just have it, baby), it’s a good idea to pre-read or even rehearse before the nerve-wracking act of tapping the red circle to record. Make notations if necessary about who’s speaking, because the writer may not have been clear.

Consult YouTube on how to perform an accent. Decide what voice you’re going to use for each character. And after you’ve canned a given chapter, go back and copy samples of your voices to separate audio files for future reference. You may not hear from Scarlet Witch again until fifteen chapters from now, and it would be a pain to locate that stretch of tape where you first introduced her. It would be better to have a private reference library of your characters. I consider this exercise essential. (I’m looking at you, Elizabeth Olsen!)

From my VX library: “Detective Wilson”

As for the main narrator, the person who reads everything not inside quotation marks, that voice should be your everyday one. It is your most natural voice, after all, and it’s the one that will speak during 95% of the story. Make this part easy on yourself. And please, do us a favor and polish how you read — and by that, I mean sound natural. Make us wonder if you’re even reading something at all. I can’t stand narrators who sound like they’re delivering a valedictorian speech.

Of course you’ll mess up. I once read a claim you should be good enough not to make more than one mistake per page — but I think that’s baloney. I screw up every damn paragraph, depending on how caffeinated I am, and you know what? It doesn’t matter. Because all those false starts will be edited out. Relax.

If you screw up, pause for a second, and restart from the beginning of that sentence. It’s that simple. Always re-record from the beginning of a sentence because a sentence is like a musical phrase with a discrete beginning, middle, and end. It won’t sound natural to splice together different recordings to make a single sentence because each “take” may have been performed differently; that way lies a Frankenstein sentence. That second’s pause will also make it easier to grab chunks of waveform with your mouse during editing.

A few more tips:

Record at the same time every day. Due to my need to record between 5:30 and 6:30 a.m. (see rant above about sound isolation), I developed a wonderful, early morning Love Doctor timbre. But that always left me by 10 a.m., when my voice was noticeably different. So it’s best not to change narrative voices mid-chapter.
If your mic is like mine, it will pick up bodily noises. My stomach starts the morning rumba the moment anything passes my lips. So if you’re an early morning recorder like me, wait until after the session to begin the day’s fueling.
Your voice and lips are your musical instrument, and it can take a while to warm that horn up, especially if it’s 5:30 a.m.! You might have to re-record that entire first page. Once it’s clear, however, my voice will tend to stay clear unless I’m congested, in which case it’s not a good day to record. A warm beverage might help, but again, I personally minimize that to just a mouthwash because I’m afraid of waking the sleeping dragon in my stomach. You’ll discover what works best for you.

Post-Production

Rough Editing

So now you have a 45-minute recording of false starts, burps, and swearing at the cats. Somewhere in there is 20 minutes of usable audio. This is where the delete key is your friend.

Rough editing is about more than deletion, though. It’s also about pasting silence back in, in tiny increments, to cover up such things as that gross sound your lips make when you open them. (Yes, your mic will pick that up.) You’ll also want to paste over all your distracting inhalations, an editing process called “de-breathing.” There are programs that purport to de-breathe recordings for you, but I’m old and don’t trust ’em.

Not all silence is created equal, however, so don’t use Audacity’s nifty “generate silence” tool, or it will sound like Neo shut down power to the Matrix for that period. (Yes, I just made an outdated pop culture reference. Get off my lawn.) What you need on your clipboard is a half second of room tone, i.e., the sound of silence in your recording studio.

This 0.5 seconds of room tone proved so magical that I saved it to a separate file. I used it to fix everything, including even the pacing of my performance. Here’s my formula:

0.5 seconds of room tone between sentences of the same paragraph. Less for higher-tension passages like action scenes.
1.0 seconds between paragraphs, assuming the writer correctly stuck to just one topic per paragraph.
2.0-3.0 seconds at scene breaks.
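If it helps to see the formula as numbers, here is a minimal sketch in Python. It only does the arithmetic: how many pastes of the 0.5-second room-tone clip each kind of break needs, and how long the gap is in samples. The names `GAP_SECONDS`, `pastes_needed`, and `gap_samples` are made up for illustration, the 2.5-second scene break is just the midpoint of the range above, and the 44,100 Hz sample rate is an assumption about your project settings.

```python
# Hypothetical helper for the pacing formula above; all names are illustrative.
GAP_SECONDS = {
    "sentence": 0.5,   # between sentences of the same paragraph
    "paragraph": 1.0,  # between paragraphs
    "scene": 2.5,      # scene break; midpoint of the 2.0-3.0 range (my choice)
}

ROOM_TONE_CLIP_SECONDS = 0.5  # length of the saved room-tone clip


def pastes_needed(boundary: str) -> int:
    """Number of times to paste the room-tone clip at this kind of break."""
    return round(GAP_SECONDS[boundary] / ROOM_TONE_CLIP_SECONDS)


def gap_samples(boundary: str, sample_rate: int = 44100) -> int:
    """Gap length in samples (the sample rate is an assumed project setting)."""
    return round(GAP_SECONDS[boundary] * sample_rate)


if __name__ == "__main__":
    for boundary in GAP_SECONDS:
        print(boundary, pastes_needed(boundary), gap_samples(boundary))
```

For higher-tension passages you would shrink the "sentence" value, as the formula suggests; the sketch does not try to decide that for you.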

Like I said, this is a bit like composing music. There should be appropriate amounts of silence between sentences so the listener can mentally process what they’ve just heard. This part of editing makes the difference between garbage and “holy shit, that’s good.”

This is time-consuming, however, and you will wear out your Ctrl + V shortcut. It took me an hour to generate just five minutes of rough-edited audio.

Mastering

1 Hour of Finished Audio Equals

2 hours studio time
9-12 hours post-production
10,000 words (10 days of writing)
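To turn those ratios into a rough budget for a whole book, here is a small sketch. It assumes the ratios above scale linearly with word count, which is my simplification, and the function name `production_estimate` is hypothetical.

```python
def production_estimate(word_count: int) -> dict:
    """Rough time budget from the ratios above, assuming they scale linearly."""
    finished_hours = word_count / 10_000  # 10,000 words per finished hour
    return {
        "finished_hours": finished_hours,
        "studio_hours": finished_hours * 2,      # 2 hours of studio time per finished hour
        "post_hours_low": finished_hours * 9,    # low end of post-production
        "post_hours_high": finished_hours * 12,  # high end of post-production
    }


if __name__ == "__main__":
    # Example: an 80,000-word manuscript.
    for key, hours in production_estimate(80_000).items():
        print(f"{key}: {hours:g}")
```

By this estimate, an 80,000-word book comes out to about 8 finished hours, 16 hours in the studio, and somewhere between 72 and 96 hours of post-production.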

The platform I use for my audiobooks is Author’s Republic. It’s a clearinghouse that places audiobooks with distributors such as Audible (i.e., Amazon), iTunes, Hoopla, and Overdrive. Author’s Republic takes a sales commission.

Their technical specifications for audio files, which dictate things like acceptable peak values and noise floors, appear to match those of the Amazon audiobook self-publishing platform, ACX, so it’s a good idea to use Audacity’s ACX Check plug-in to analyze your files. It’s easy to use: just select the waveform you want to check (it tends to choke if you select more than 30 minutes at a time), run the ACX Check, and wait for the popup to tell you how you’ve utterly failed as a human being.

Through trial, error, and long, curse-filled jags of watching other people’s technobabble on YouTube, I eventually figured out how to react to ACX Check’s various warnings:

If peak value and RMS are both high, then run the compressor.
If just peak value is high, then run the limiter.
If necessary, use amplify to bring RMS up or down, as well.
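Those three rules can be written down as a tiny decision function. This is only a sketch of the rules of thumb above, not anything Audacity or ACX provides. The thresholds are assumptions based on ACX's published targets as I understand them (peak no higher than -3 dB, RMS between -23 dB and -18 dB), so check them against the current spec. The function name `suggest_fixes` is hypothetical, and you would type in the numbers that ACX Check reports.

```python
# Assumed ACX-style targets in dB; verify against the current ACX requirements.
PEAK_MAX_DB = -3.0
RMS_MIN_DB = -23.0
RMS_MAX_DB = -18.0


def suggest_fixes(peak_db: float, rms_db: float) -> list:
    """Map ACX Check readings to the effects suggested by the rules above."""
    peak_high = peak_db > PEAK_MAX_DB
    rms_high = rms_db > RMS_MAX_DB
    rms_low = rms_db < RMS_MIN_DB

    steps = []
    if peak_high and rms_high:
        steps.append("compressor")  # both too hot
    elif peak_high:
        steps.append("limiter")     # only the peaks are too hot
    if rms_low or (rms_high and not peak_high):
        steps.append("amplify")     # nudge RMS up or down into range
    return steps


if __name__ == "__main__":
    print(suggest_fixes(-1.0, -15.0))
    print(suggest_fixes(-1.0, -20.0))
    print(suggest_fixes(-6.0, -26.0))
```

An empty list means both readings are inside the assumed targets. Whatever it suggests, run ACX Check again afterward, since each effect changes both numbers.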

Although Audacity can export MP3 files, it doesn’t appear to offer as much fine-grained control as the fre:ac audio converter discussed in ACX’s tutorial on encoding MP3s. Since they’re apparently the tough audience to please, I figure why not go through whatever arcane steps they want.

With all these complicated steps, it might be helpful to create a file tracking grid like I did for Sizzle. Here it is in all its ugly glory. Feel free to download it and adapt it to your own purposes. And hey, if you have a better way or can spot where I’ve done something wrong, then show me yours, eh?

Download [14.57 KB]

(File updated June 2021. Aren’t you lucky?)

Did you see that column called “Car listen”? Yeah, I think it’s a good idea to copy your MP3s over to your cell phone so you can Bluetooth to your car and listen while driving on the interstate. It’s how most of your listeners will experience the book, after all, and it’s your last chance to catch any errors in sound mastering or (gasp!) the book itself. The real reason for the car listen, however, is to spend all those hours telling yourself, “Damn, I sound good.”

Oh, and did I mention that Author’s Republic won’t distribute your audiobook without a concomitant Kindle eBook out there in the ether? I don’t know why; I think it’s stupid, but that’s what they want. So here’s where you can self-publish your ebook with Amazon.

I hope this helps you on your audiobook journey!

Check Out These Links

Cursed by Christ, the one I learned on.
Empire of the Goddess, my second full-length audiobook.
Sizzle, my third audiobook, which is a story for kids aged 8-12.
Deena Warner Design for all your cover image and cover design needs.

Matthew Warner