Tips to Improve the Quality of Your Clone

This guide aims to help you understand how the Voice Cloning feature works and how to generate an accurate, high-quality voice clone for your projects.

1. Difference Between Instant and High Fidelity Cloning Methods

Instant Cloning

This method captures the most prominent qualities of a speaker's voice and reproduces that voice profile in the generated audio. It is currently not available for PlayHT 1.0 voices.

  • The cloning process requires very little audio (as little as 30 seconds), and the voice is cloned within a few seconds.
  • Because this method captures the most prominent qualities of a speaker's voice, you can also use it to create a customized voice style, emotional tone, or delivery for an existing voice clone.
  • Works well with almost all English and multilingual accents.

High Fidelity Cloning

This method builds a much deeper model of a voice’s nuances and accent, so it requires more training audio. The resulting voice is versatile, complex, and capable of shifting tone to suit the context of the sentence.

  • It requires at least 20 to 30 minutes of audio for decent results.
  • Cloning may take anywhere from 20 minutes to a few hours to complete, depending on the length of the training audio uploaded.
  • Works for almost any accent and produces a very close resemblance to the original voice.

2. Improving the Quality of Your Voice Clone

You can always delete your voice clones and create new ones with better training audio. Here are some guidelines on what type of training audio can help you improve the quality of your voice clone.

  • Avoid audio with excessive background noise, music, or sound effects.
  • The Instant Cloning method creates a voice clone using only the first 30 seconds of the training audio you upload. Please ensure that you upload a short, high-quality audio file.
  • For High-Fidelity Cloning, uploading approximately 30 minutes of high-quality training audio is one of the most effective ways to enhance the quality of your cloned voice.
  • Consider the amount of reverb and/or echo in the training audio, as it will likely show up in your voice clone as well. Generally, it is best to minimize the amount of reverb for better quality.
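Because Instant Cloning uses only the first 30 seconds of the upload, it can help to cut the cleanest 30-second passage out of a longer recording before uploading. The following is a minimal sketch using Python's standard wave module, assuming the recording is an uncompressed WAV file. The function name extract_clip and the file names are illustrative and are not part of any PlayHT tool.

```python
import wave

# From this guide: Instant Cloning uses only the first 30 seconds of audio.
CLIP_SECONDS = 30


def extract_clip(src_path, dst_path, start_seconds=0.0, length_seconds=CLIP_SECONDS):
    """Copy up to `length_seconds` of audio, starting at `start_seconds`,
    from the WAV file at src_path into a new WAV file at dst_path.

    Returns the duration (in seconds) of the clip that was written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total_frames = src.getnframes()

        # Clamp the requested window to the audio that actually exists.
        start_frame = min(int(start_seconds * rate), total_frames)
        frame_count = min(int(length_seconds * rate), total_frames - start_frame)

        src.setpos(start_frame)
        audio = src.readframes(frame_count)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)   # keep channels, sample width, and sample rate
        dst.writeframes(audio)  # the header's frame count is corrected on write

    return frame_count / rate
```

For example, extract_clip("interview.wav", "clone_sample.wav", start_seconds=95) would write the 30 seconds that begin at 1:35 of a hypothetical interview.wav to a new file.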

3. Getting the Accent Right

High Fidelity Cloning gives the closest accent resemblance. If the accent is still not quite right, try uploading more training audio of higher quality. The more high-quality training audio you provide, the better the resulting voice clone will be. Almost any accent can be accurately cloned with 4 to 6 hours of high-quality training audio.

4. Making Your Cloned Voice Energetic and Full of Life

If your cloned voice sounds bland and devoid of personality, take a closer look at the tone of voice in the audio you used for cloning. The most prominent tone of voice in the training audio carries over into the cloned voice. So, if you’re looking for an energetic and lively cloned voice, make sure your training audio reflects that tone of speech.

5. Reasons for the Cloning Process to Fail

  • The audio you submitted was too short. If you’re using Instant Cloning, make sure you upload at least 30 seconds of training audio. If you’re using High Fidelity Cloning, make sure you upload at least 30 minutes of training audio.
  • The training audio was in a language that the model does not support. Currently, our AI model supports English and Multilingual.
  • The training audio contained excessive background noise, or featured music and sound effects throughout most of the recording.
  • The audio contained multiple speakers, and you did not specify which voice to clone. Speaker selection is available only with High Fidelity Cloning, not with Instant Cloning.
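The first failure reason above, audio that is too short, can be checked locally before uploading. The following is a rough sketch, assuming a WAV file and Python's standard wave module. The minimums are the ones stated in this guide, and the function names are hypothetical helpers rather than part of the PlayHT API.

```python
import wave

# Minimum training-audio lengths stated in this guide, in seconds.
MIN_SECONDS = {
    "instant": 30,             # Instant Cloning: at least 30 seconds
    "high_fidelity": 30 * 60,  # High Fidelity Cloning: at least 30 minutes
}


def wav_duration_seconds(path):
    """Return the length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def duration_problem(duration_seconds, method):
    """Return a warning message if the audio is too short for `method`
    ("instant" or "high_fidelity"), or None if it is long enough."""
    minimum = MIN_SECONDS[method]
    if duration_seconds < minimum:
        return (
            f"Audio is {duration_seconds:.0f}s long; "
            f"{method} cloning needs at least {minimum}s."
        )
    return None
```

A call such as duration_problem(wav_duration_seconds("training.wav"), "high_fidelity") returns a message when the hypothetical training.wav is shorter than 30 minutes. This check covers length only; background noise, music, and multiple speakers still need to be reviewed by ear.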

6. What should the speaker be reading/talking about in the training audio?

There is no required subject matter; what matters is the kind of content you plan to create with the cloned voice. If you want an audiobook narrated in your cloned voice, it's best to record yourself reading a book. If you want a more conversational tone, try using a recording from a podcast. As a rule of thumb, whatever tone of voice you’re aiming for in your cloned voice, submit training audio that reflects the same tone of speech.

7. Using the API to Access Cloned Voices

To learn how to use your cloned voices through our API, please refer to our API Documentation.
