Second, Tomashenko says, researchers are looking at distributed and federated learning—where your data doesn’t leave your device but machine learning models still learn to recognize speech by sharing their training with a bigger system. Another approach involves building encrypted infrastructure to protect people’s voices from snooping. However, most efforts are focused on voice anonymization.
Anonymization attempts to keep your voice sounding human while stripping out as much identifying information as possible. Speech anonymization efforts currently involve two separate strands: anonymizing the content of what someone is saying, by deleting or replacing any sensitive words in files before they are saved, and anonymizing the voice itself. Most voice anonymization efforts at the moment involve passing someone’s voice through experimental software that changes some of the parameters in the voice signal to make it sound different. This can involve altering the pitch, replacing segments of speech with information from other voices, and synthesizing the final output.
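The first strand, content anonymization, can be illustrated with a few lines of Python. This is only a minimal sketch and not how any of the systems mentioned here work: the function name `redact_transcript`, the placeholder tokens, and the rule that any run of four or more digits counts as sensitive are all assumptions made for the example.

```python
import re

def redact_transcript(text, sensitive_terms):
    """Replace listed terms and long digit runs with placeholders.

    Hypothetical helper: real systems would need far broader coverage
    (names, addresses, dates) than this simple word list.
    """
    redacted = text
    for term in sensitive_terms:
        # Match the term as a whole word or phrase, ignoring case.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        redacted = pattern.sub("[REDACTED]", redacted)
    # Assumption: any run of four or more digits is an account or phone number.
    redacted = re.sub(r"\b\d{4,}\b", "[NUMBER]", redacted)
    return redacted

transcript = "Hi, this is Alice Smith, my account number is 12345678."
print(redact_transcript(transcript, ["Alice Smith"]))
# prints: Hi, this is [REDACTED], my account number is [NUMBER].
```

A word list like this only catches what it has been told to look for, which is one reason the second strand, changing the voice signal itself, gets most of the research attention.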
Does anonymization technology work? Male and female voice clips that were anonymized as part of the Voice Privacy Challenge in 2020 definitely do sound different. They’re more robotic, sound slightly pained, and could, to some listeners at least, be from a different person than the original voice clips. “I think it can already guarantee a much higher level of protection than doing nothing, which is the current status,” says Vincent, whose anonymization research has reduced how easily people can be identified. However, humans aren’t the only listeners. Rita Singh, an associate professor in Carnegie Mellon University’s Language Technologies Institute, says that total de-identification of the voice signal is not possible, as machines will always have the potential to make links between attributes and individuals, even connections that aren’t clear to humans. “Is the anonymization with respect to a human listener or is it with respect to a machine listener?” says Shri Narayanan, a professor of electrical and computer engineering at the University of Southern California.
“True anonymization is not possible without completely changing the voice,” Singh says. “When you completely change the voice, then it’s not the same voice.” Despite this, it is still worth developing voice-privacy technology, Singh adds, as no privacy or security system is totally secure. Fingerprint and face identification systems on iPhones have been spoofed in the past, but overall, they’re still an effective method of protecting people’s privacy.
Bye, Alexa
Your voice is increasingly being used as a way to verify your identity. For example, a growing number of banks and other companies are analyzing your voiceprints, with your permission, to replace your password. There’s also the potential for voice analysis to detect illness before other signs are obvious. But the technology to clone or fake someone’s voice is advancing quickly.
If you have a few minutes of someone’s voice recorded, or in some instances a few seconds, it’s possible to recreate that voice using machine learning. The Simpsons’ voice actors could be replaced by deepfake voice clones, for instance. And commercial tools for recreating voices are readily available online. “There’s definitely more work in speaker identification and producing speech to text and text to speech than there is in protecting people from any of those technologies,” Turner says.
Many of the voice anonymization techniques being developed at the moment are still a long way from being used in the real world. When they are ready to be used, it’s likely that companies will have to implement the tools themselves to protect their customers’ privacy; there’s currently little individuals can do to protect their own voices. Avoiding calls with call centers or companies that use voice analysis, and not using voice assistants, could limit how much your voice is recorded and reduce possible attack opportunities.
But the biggest protections may come from laws and court cases. Europe’s GDPR covers biometric data, including people’s voices, in its privacy protections. Guidelines say people should be told how their data is being used and should give consent if they’re being identified, and that some restrictions should be placed on personalization. Meanwhile, in the US, courts in Illinois, home to some of the strongest biometric laws in the country, are increasingly examining cases involving people’s voice data. McDonald’s, Amazon, and Google are all facing judicial scrutiny over how they use people’s voice data. The decisions in these cases could lay down new rules for the protection of people’s voices.