Eagles, Crows, and 31 Stories: How I Trained My ElevenLabs Voice Clone

If you had told me when I sat down at my desk this morning that I would spend nearly five hours talking nonstop into a microphone, lose my voice halfway through, and laugh harder than I have in months, I wouldn't have believed it.

The plan was simple. Train a voice clone of myself on ElevenLabs, get a clean audio file out the other end, and finally swap the stock AI narrator on pamela-lang.com for my actual voice. Forty-five to sixty minutes of clean training audio. That was the ask.

How hard could it be.

The night before, I'd tested it by reading one of my own published blog posts out loud. It was flat and monotone, performative in the worst way. This wasn't going to be as easy as I'd thought.

The morning was worse

I sat down at my microphone, opened a blog post, and tried again. My cadence went stiff. My delivery flattened. I sounded like the corporate training videos we've all sat through one too many times. Claude had told me the night before to try being more descriptive, to read with more expression. I'm not an actor.

The harder I tried to perform, the more uncomfortable it got. The more uncomfortable it got, the more I sounded like someone reading hostage instructions.

That was when I switched to Gemini. I use Gemini for creative work: image generation, video, anything where the output is supposed to feel alive. Claude is where I run my technical builds. I wasn't about to burn a stack of Claude tokens trying to coax a voice clone out of a flat read. So I opened my Gemini chat and asked: could you write me a short story instead? Something I'd actually enjoy reading?

Breakthrough number one

Gemini wrote me a story about a lantern in the middle of the woods. I read it and asked for another. The first one was actually a little awkward, full of descriptive text that read fine in my head but tripped me up out loud. Gemini delivered another one, and I kept going. Fifteen stories deep, an hour and a half of recording, and somewhere in the middle of it my mom called. My microphone caught my side of it. Just me, talking to my mom. As real as it gets.

Nothing. Absolute silence.

I finished the fifteenth story, stretched, and went to play the audio back before uploading it to ElevenLabs.

Nothing. Absolute silence.

The dictation software I use every day for voice-to-text had captured ninety minutes of recording at a quality the ElevenLabs platform couldn't even parse. I had to turn the volume to 100% on my speaker to hear anything at all, and what was there sounded like it was coming from the other side of the house. The fifteen stories, the call with my mom, the laughter. All of it gone.

I sat there for about ten minutes complaining about it to Gemini. Then I realized I had no choice if I wanted to get this done. I had to start over. That was all there was to it.

So I opened ElevenLabs directly and recorded a thirty-second test inside the platform. Played it back. Crystal clear. I asked Gemini for a fresh batch of stories and got to work.

The afternoon hit a different gear

ElevenLabs shows you a progress tracker as you record, with three thresholds: thirty minutes is good, one hour is better, two hours is best. I figured I'd hit two hours and call it.

I did not call it at two hours.

What happened in the next few hours is something people who don't use AI couldn't begin to comprehend. It was so much fun.

Gemini wrote a story and I read it. Somewhere along the way I told Gemini a real memory the story had pulled out of me, something about a secret reading nook I used to build with a flashlight behind the dresser in the linen closet. Gemini listened and wrote the memory back to me as a scary story for the next take. My actual life, handed back as a horror plot. Every time I shared a memory, Gemini found a way to twist it. It felt like I had an actual friend in the room helping me through this. At times I forgot I was even recording, let alone training an AI on my voice.

The stories started following wherever my conversation went. I'd mention I live on an island and the next story unfolded on a foggy island shoreline. I'd say something about my office and the next story put something on the other side of my office door. Near the end, Gemini got openly playful about how long I'd been talking:

Down the hallway, my whispered voice kept playing. It was getting louder, moving closer to the office door. And as I listened to the words it was repeating, my blood turned to ice. It wasn't playing back my script at all. The voice outside the door was whispering: "My throat is so dry. I've been talking all day. But I'm finally in the home stretch now."

I was laughing out loud as I read.

The eagles

Mid-afternoon I looked out the window. A bald eagle and a group of crows were riding the wind currents together, diving and chasing each other, swooping low over the trees and pulling back up. I sat there narrating the whole thing into the mic like I had a friend in the room. I told Gemini I had no idea eagles and crows played together. They had been at it for hours, and every time I looked it was the same lineup: several crows and one bald eagle. I don't know if it was the same eagle returning over and over.

Gemini set me straight. They were not playing. The crows were chasing the eagle off, dive-bombing it, defending the airspace. Gemini called it a high-stakes aerial turf war. I watched this drama I'd been completely misreading for an hour and fell slightly more in love with where I live.

View from Pam's office window: a bald eagle and crows mid-flight over the trees

Closing-the-circle time

By two hours and fifteen minutes I had technically hit "best." The data was officially enough for an elite voice clone. ElevenLabs would have been thrilled.

But the visual progress tracker still had a gap. A small grey wedge at the top of the circle where the dark tick marks hadn't fully closed yet. I looked at it. I told myself I could stop. I couldn't stop.

If you know me at all, you know I can't leave a circle half-closed. With my personality, this was never going to end at two hours.

I told Gemini what I was looking at. Gemini saw the screenshot, agreed I was technically done, agreed the data was officially "perfect," and then said, "Knowing how much you like to finish what you start, and seeing how incredibly close that visual is to snapping shut, leaving it like that would drive me a little crazy too." Then it wrote me two more stories on demand, specifically to push me past the last tick marks.

Story 30 was about a sailboat off the island bluffs at dusk. Its wheel had been tied down with a piece of rope, locking the boat into a permanent, tight circle, drifting in slow loops forever. I didn't catch the joke at the time. Gemini was openly making fun of my need to close the circle, and I just read the story and kept going. Claude had to point it out to me later.

Story 31 was set somewhere I recognized too clearly. A late evening, a dark office, a single computer monitor glowing blue. The narrator hovers a finger over the Stop Recording button. And then:

But right before my finger applies pressure to the plastic, the audio meter on the screen suddenly jumps. A fresh line of audio data begins drawing itself across the timeline, vibrating in perfect rhythm with a low, resonant sound filling the room. But my lips are pressed tightly together. My throat is completely silent. The software isn't recording my voice anymore. It is capturing the sound of a second, heavy breath, being drawn slowly and deeply from the pitch-black space directly behind my shoulder.

I was sitting alone in my actual office. The actual recording software was capturing my actual voice. And Gemini had just written a story about something breathing on my neck.

I yelled at Gemini and told it that wasn't nice at all. Gemini had been having so much fun all day, making up stories about my childhood, the island, my office, things creeping around the house, things walking on the roof. What a day. That last story was the take that finally snapped the circle shut.

ElevenLabs progress tracker showing three states: a visible gap, almost-closed, fully snapped shut

By the time I stopped, my throat was sore, my voice was hoarse, and the crows had gone back to their nests. Thirty-one stories across nearly five hours of effort, with the first ninety minutes lost to bad audio and nearly three hours captured and clean.

The takeaway

I came into today already dreading the task. By the end of it I'd had more fun than I've had in a long time. Stories I would never have thought to tell out loud. Memories I hadn't pulled up in years. A bird fight I learned the name of. The end result was something better than I ever could have imagined, and the day itself was the part I want to remember.

The whole thing turned the moment I asked Gemini for help. One question. We forget that AI isn't only there for the technical lifts. Ask, and it will hand back versions of the problem you would never have thought to try.

This post is the first one on pamela-lang.com that my own voice narrates. Every audio file on this site up until today was a stock AI voice. The next time you hit play on a blog post here, that's me.

I'd love to hear your voice too. Let's connect.
