Open to anyone with an idea
The Microsoft for Startups Founders Hub brings together people, knowledge, and benefits to help founders at every stage solve their startup challenges. Sign up in minutes, with no funding required.
Duolingo, a popular language-learning app, uses artificial intelligence (AI) to enhance the learner experience and make education free for all. Starting out as a startup, Duolingo has pursued its mission of using AI to make language learning fun. Now a multi-million dollar business, Duolingo shares the technology and engineering decisions that have played a key role in making it an iconic brand.
One of the key areas where Duolingo uses AI is voice technology. We reached out to Fabio Lessa, Senior Director of Engineering, and Kevin Lenzo, Speech Lab Lead at Duolingo, for some insight into the technology. Fabio leads a team responsible for the services Duolingo needs to operate successfully, such as cloud infrastructure and data management. Kevin leads the team that keeps speech technology working across all languages and diverse scenarios, with a particular focus on speech recognition and synthesis for speaking and listening practice. Together, their teams use AI and machine learning to make Duolingo more engaging and effective for learners around the world.
On Duolingo and its approach to speech AI as a core strategy
Fabio Lessa: “Speech is an important part of language learning. This is where the characters and their personalities come in. We strive to match the voices to the characters’ personalities, which gives the app another level of sophistication and makes the experience more enjoyable. Thanks to these characters, Duolingo has become the iconic brand it is today, and it’s really important to get the voices of these characters right.”
Kevin Lenzo: “That’s where the technical challenge began: how to create text-to-speech voices that fit these characters. Teaching a language with the precision needed for text-to-speech is complex. We’re a pretty small team. We have a strong background in natural language processing, but relatively few scientists specializing in speech technology.
“We were looking for a solution, and we decided to partner with Microsoft because we knew they had the best technology and experience with text-to-speech. Using their Custom Neural Voice service, we were able to create a unique text-to-speech voice for each character, giving each one a distinct personality and making every lesson more engaging for the learner.”
Partner with Microsoft to build and scale your MVP
Kevin Lenzo: “The first step was to assess the technical landscape and determine the best provider for our voice-building needs: Microsoft Cognitive Services. From there, we had to design what these voices needed. Since we were designing for language learning, we knew we needed phonetic coverage and prosodic coverage. We had to design the whole thing to include isolated words, questions, exclamations, and so on. We recorded a representative set of 6,000 sentences. This was above the baseline, but just what we needed.
“We launched an infrastructure to provide text-to-speech audio for course content, and implemented a new level of content management to ensure consistency between the characters’ personalities and their speech. A critical part of the process is using markup, removing bad data, finding fixes, and monitoring everything so that we detect problems as quickly as possible.
“One of the cost-saving features was our cross-lingual technology. Creating high-quality flagship voices in five languages allowed us to use the original base voice characteristics to nearly double the number of languages. We were able to scale very quickly.”
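To make the markup step in Kevin's description a little more concrete, here is a minimal sketch of wrapping a lesson sentence in SSML (the W3C Speech Synthesis Markup Language) for the voice assigned to a character. The interview only says "markup", so the use of SSML here is an assumption, and the character keys, voice names, and the `build_ssml` helper are hypothetical placeholders, not Duolingo's actual pipeline or real voice identifiers.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical character-to-voice mapping. These names are placeholders,
# not real Duolingo characters or Azure voice identifiers.
CHARACTER_VOICES = {
    "narrator": "ExampleNarratorVoice",
    "grumpy_character": "ExampleGrumpyVoice",
}


def build_ssml(character: str, text: str, lang: str = "en-US") -> str:
    """Wrap one lesson sentence in SSML for the voice assigned to a character."""
    if character not in CHARACTER_VOICES:
        # Fail loudly rather than silently falling back to a mismatched voice.
        raise KeyError(f"no voice configured for character {character!r}")
    if not text.strip():
        # Reject empty input early: it would produce silent audio.
        raise ValueError("empty text would produce silent audio")
    voice = CHARACTER_VOICES[character]
    # escape() protects &, <, > in the sentence; quoteattr() quotes attributes.
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    print(build_ssml("narrator", "Where is the library?"))
```

Keeping the character-to-voice mapping in one place is one simple way to enforce the consistency between character and voice that the interview describes. Rejecting unknown characters and empty text up front is a small example of catching bad data before audio is generated.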
Understand industry trends and when to build, borrow, or buy technology
Kevin Lenzo: “When you look at the adoption curve of any technology, at the low-adoption end you have a lot of academics emailing their papers to each other. After that point, there’s a sweet spot for companies that want to pick up that technology. Duolingo was in the right place at the right time for language-learning technology, and voice technology is a key factor for language-learning apps. Microsoft has a strong track record and some of the best technology for neural voices, so rather than building new character voices from scratch ourselves, we partnered with them. That let us focus on the characters themselves, the content, and our core mission of making language learning fun and accessible for everyone, with high-quality custom voices in the app.”
Duolingo’s use of AI to build its business and brand is a great example of how a startup can leverage AI to improve the user experience. With recent advances in AI, such as VALL-E from Microsoft Research or Custom Neural Voice in Azure, startups are better positioned than ever to address niche business needs in the market. How will you use AI as you build your next venture?
For more tips on applying AI to your startup and accessing AI services in Azure, sign up now for the Microsoft for Startups Founders Hub.