From Haptic Devices to AI: How One Mathematician Changed Speech Recognition
Dimitri Kanevsky started his career researching at top academic institutions like the Weizmann Institute and Princeton's Institute for Advanced Study. Now, as an AI Research Scientist at Google, he's developing practical solutions that support everyday challenges for those experiencing hearing loss.
In this episode, Dimitri shares how his personal experience with hearing loss informs his understanding of accessibility and his approach to designing technology.
His work demonstrates how accessibility drives innovation: solutions initially developed for people with specific needs often become valuable tools for everyone. From early haptic devices to modern AI-powered captioning, he's seen how technology evolves when developers focus on user needs. He advocates for considering accessibility from day one of development and emphasizes that even imperfect solutions can make meaningful differences in people's daily lives.
Watch the full interview with Dimitri and read on below for a snapshot of some key moments from our conversation.
Dimitri's Early Journey
After going deaf at age 3, Dimitri thrived socially in Russia through lipreading. While waiting to emigrate from the Soviet Union, he realized lipreading would be more challenging in Hebrew and English. This led him to develop a wearable haptic device that converted speech into haptic sensations on his hand, helping him detect different sounds, particularly the high-frequency ones in Hebrew words like "Shabbat" and "Shalom." When he brought the device to Israel, medical doctors saw its potential, which led to his first startup, developing multi-channel haptic devices. This early work would later influence his approach to more advanced communication technologies, showing how practical solutions often emerge from immediate needs.
From Theory to Real-World Impact
With a PhD from Moscow State University, Dimitri worked at several prestigious institutions: the Weizmann Institute of Science in Israel, the Max Planck Institute in Bonn, and the Institute for Advanced Study in Princeton. In 1986, realizing that no good transcription tools for communication existed, he moved from mathematics to technology. This wasn't a complete departure from his past: his experience developing haptic devices in Israel had already given him credibility in speech technologies. This blend of theoretical expertise and practical innovation helped him go from "pure mathematician" to someone technology companies actively sought out. Notably, the move didn't mean abandoning mathematics. Recently, he solved a 50-year-old mathematical problem and presented his findings at an international conference using the very speech recognition technology he helped develop.
Evolution at Google
Joining Google in 2014, Dimitri first worked in New York on closed captions for YouTube, where his team improved caption quality significantly. Seeing the potential for mobile applications, he moved to Mountain View to develop Live Transcribe for Android. His team then tackled a new challenge: YouTube's speech recognition models weren't equipped for non-standard speech patterns, which led to Project Relate's personalized approach. The impact was clear: at a recent international mathematics conference, he could freely present his work using speech recognition for the first time. This progression from early haptic devices to AI-powered speech recognition demonstrates how accessibility technology has evolved, from physical solutions to sophisticated digital tools that can adapt to individual users' needs.
How Early Adoption Drives Technology Forward
Dimitri emphasizes that when new technology is first developed, it may not be good enough for mainstream use but can still help people with specific needs. Speech recognition followed this pattern - while it had many errors initially, it was valuable for deaf users who needed communication tools. When developing AI systems specifically, he stresses that accessibility features must be planned from the beginning, as it's much more difficult to change designs after product development. This pattern of development - starting with focused solutions for specific needs before expanding to broader applications - continues to shape how his team approaches new challenges in accessibility.
Sign Language's Role in Culture
Dimitri views sign language as a fundamental part of culture, emphasizing it shouldn't be eliminated. He notes that babies don't develop hearing models until around six months old, which is why some parents teach sign language to their hearing babies to help develop better language models. He envisions sign language becoming like another cultural language, similar to speaking multiple languages like Russian, French, or Hebrew. This holistic view of communication technology - embracing both digital tools and traditional methods like sign language - reflects Dimitri's understanding that the best solutions often come from combining multiple approaches to meet diverse needs.