DeepL, the German company known for its high-quality online text translations, is moving into real-time audio translation, aiming to deliver more accurate and nuanced results than competitors like Google Translate. Now valued at $2 billion with over 100,000 paying customers, DeepL is betting big on its latest feature, DeepL Voice.
With DeepL Voice, users can listen to someone speak in one language and see the translation appear instantly in another. Currently, it supports audio input in 13 languages, including English, German, Japanese, Korean, and French. The translations show up as text captions, making it ideal for live conversations and video conferencing, though audio output isn’t yet available.
DeepL’s new voice feature is designed to work in real time, something that distinguishes it from other AI translation tools, which often have delays. For in-person meetings, translations can appear as a mirrored display on a smartphone, letting each participant read the translated text as the other person speaks. In video conferencing, translations appear as subtitles, a mode that currently works only with Microsoft Teams.
Demand for voice translation tools is on the rise, with major players like Google introducing similar features in Meet, and newer AI companies like ElevenLabs creating advanced voice dubbing services. DeepL founder and CEO Jarek Kutylowski hinted at potential future updates, including an audio-output feature and expanded video conferencing integrations.
DeepL Voice doesn’t yet offer an API, as DeepL’s focus remains on B2B partnerships. For now, the service also doesn’t integrate with Google Meet or Zoom. But the release marks a significant step for DeepL, as voice translation has been the most requested feature since the company’s launch in 2017.
Unlike many AI services that rely on third-party large language models, DeepL has developed its own translation-optimized LLM, which it claims outperforms GPT-4 and comparable models from Google and Microsoft. This tailored approach, along with DeepL’s focus on real-time processing, sets it apart as a high-precision tool for live translation.
DeepL Voice is expected to enhance online meetings and could also be a game-changer for industries like hospitality. Imagine restaurant staff communicating effortlessly with international customers through real-time translations. However, privacy concerns linger, especially given that audio data is processed on DeepL’s servers rather than locally. Kutylowski says that no data is stored or used to train the company’s models, emphasizing compliance with GDPR and other data protection regulations.
With DeepL Voice, the company takes a significant step into the voice translation space, hinting at more developments to come as it continues to build its platform from the ground up.