![Image of digital waveforms](https://images.newscientist.com/wp-content/uploads/2023/03/06143328/SEI_144483909.jpg?width=300)
An AI can generate more natural-sounding synthetic speech by including pauses
Shutterstock/PrinceOfLove
Generating speech with different rhythms and pauses makes it sound more human-like, according to an assessment of an artificial intelligence trained on speech taken from YouTube and podcasts.
Most artificial intelligence text-to-speech systems are trained on data sets of acted speech, which can make the output sound stilted and one-dimensional. Natural speech, by contrast, often displays a wide range of rhythms and patterns that convey different meanings and emotions.
Now, Alexander Rudnicky at Carnegie Mellon University in Pittsburgh, Pennsylvania, …