By Catherine E. Shoichet, CNN
The Stanford students heard the sadness in their friend’s voice as he broke the news.
“Guys, I had to quit my job.”
To them, it made no sense. He was fluent in English and Spanish, extremely friendly and an expert in systems engineering. Why hadn’t he been able to get a job in a call center?
His accent, the friend said, made it difficult for many customers to understand him; some even hurled insults at the way he spoke.
The three students realized that the problem was even bigger than their friend’s experience. So they founded a startup to solve it.
Now their company, Sanas, is testing artificial intelligence-based software that aims to eliminate miscommunication by changing people’s accents in real time. A call center employee in the Philippines, for example, might speak normally into the microphone and end up sounding more like someone from Kansas to a customer on the other end of the line.
Call centers, say the startup founders, are just the beginning. The company’s website touts its plans as “Speech, Reinvented.”
Eventually, they hope that the application they are developing will be used by a variety of industries and individuals. It could help doctors better understand patients, they say, or help grandchildren better understand their grandparents.
“We have a very big vision for Sanas,” says CEO Maxim Serebryakov.
And for Serebryakov and his co-founders, the project is personal.
“People’s voices are not heard as much as their accents”
The trio that founded Sanas met at Stanford University, but they’re all from different countries — Serebryakov, now CEO, is from Russia; Andrés Pérez Soderi, now CFO, is from Venezuela; and Shawn Zhang, now chief technology officer, is from China.
They are no longer Stanford students. Serebryakov and Pérez graduated; Zhang put his studies on hold to focus on Sanas.
They launched the company last year and gave it a name that can be easily pronounced in different languages “to emphasize our global mission and our desire to bring people together,” says Pérez.
Over the years, all three say they have experienced how accents can get in the way.
“We all come from international backgrounds. We have seen with our own eyes how people treat you differently just because of the way you speak,” Serebryakov says. “It’s heartbreaking at times.”
Zhang says his mother, who came to the United States from China more than 20 years ago, always asks him to speak to the cashier when they go shopping together because she is self-conscious about her accent.
“That’s one of the reasons I joined Max and Andrés in building this business, trying to help people who feel their voice isn’t heard as much as their accent,” he says.
Serebryakov says he has seen how his parents are treated in hotels when they come to visit him in the United States, how people make assumptions when they hear their accents.
“They speak a little louder. They change their behavior,” he says.
Pérez says that after attending a British school, he initially struggled to understand American accents when he arrived in the United States.
And don’t ask him what happens when his dad tries to use the Amazon Alexa his family gave him for Christmas.
“We quickly discovered, when Alexa turned on the lights in random places in the house and turned them pink, that Alexa didn’t understand my dad’s accent at all,” Pérez explains.
Call centers test technology
English is the most widely spoken language in the world. An estimated 1.5 billion people speak it, and most of them are not native speakers. In the United States alone, millions of people speak English as a second language.
This has created a growing market for apps that help users practice their English pronunciation. But Sanas uses AI to take a different approach.
The premise: rather than learning to pronounce words differently, technology could do it for you. There would no longer be a need for expensive or time-consuming accent reduction training. And understanding would be almost instantaneous.
Serebryakov says he knows people’s accents and identities can be intertwined, and he stresses that the company isn’t trying to erase accents or imply that one way of speaking is better than another.
“We allow people not to have to change the way they speak to hold a position, to hold a job. Identity and accents are essential. They are linked,” he says. “You never want someone to change their accent just to please someone.”
Currently, Sanas’ algorithm can convert English to and from American, Australian, British, Filipino, Indian, and Spanish accents, and the team plans to add more. They can add a new accent to the system by training a neural network with professional actors’ audio recordings and other data, a process that takes several weeks.
The Sanas team ran two demos for CNN. In one, a man with an Indian accent reads a series of literary phrases; those same sentences are then converted to an American accent.
Another example features phrases that might be more common in a call center, like “if you give me your full name and order number, we can go ahead and start making the correction for you.”
The American-accented results sound somewhat contrived and stilted, like virtual assistant voices such as Siri and Alexa, but Pérez says the team is working on improving the technology.
“The accent changes, but the intonation is maintained,” he says. “We continue to work on how to make the result as natural and as emotional and exciting as possible.”
Initial feedback from call centers that have tried the technology has been positive, Pérez says. And inquiries have been coming in through their website as word spreads about the company.
And they say their plans for the company helped them secure $5.5 million in seed funding from investors earlier this year.
How the founders of the startup see its future
That funding has allowed Sanas to expand its staff. Most employees at the Palo Alto, Calif.-based company come from international backgrounds, and that’s no coincidence, says Serebryakov.
“What we’re building has touched so many people, even the people we hire. … It’s really exciting to see,” he says.
Even as the company grows, it may still be a while before Sanas lands in an app store or on a cell phone near you.
The team says it’s working with large call center outsourcing companies for now, opting for a slower rollout to individual users so they can fine-tune the technology and ensure security.
But ultimately they hope that Sanas will be used by everyone who needs it – in other areas as well.
Pérez predicts it will play an important role in helping people communicate with their doctors.
“Every second wasted, whether through a misunderstanding or a wrong message, is potentially very, very impactful,” he says. “We really want to make sure there’s nothing lost in translation.”
One day, he says, it could also help people learn languages, improve voice acting in movies, and help smart speakers in homes and voice assistants in cars understand different accents.
And not just in English – the Sanas team also hopes to add other languages to the algorithm.
The three co-founders are still working out the details. But how this technology might improve communication in the future is, they say, easy to understand.
™ & © 2021 Cable News Network, Inc., a WarnerMedia company. All rights reserved.