Today, Easy Cloud AI is proud to launch Beluga, an extremely fast and highly accurate AI-powered service that provides transcriptions, closed captions, summaries, and translations. 

Established in 2022, Easy Cloud is an AI research and deployment company. Our first AI project, Beluga, builds on OpenAI’s suite of products. It uses the OpenAI API to access Davinci, the most powerful GPT-3 model, for natural language processing tasks, and it uses Whisper, OpenAI’s automatic speech recognition (ASR) system, to turn speech into text.

Accuracy on transcription and translation tasks has been extremely high, close to 99% in many use cases.

For those requiring privacy or HIPAA compliance, Beluga provides a human-free process: no human will ever look at your files or at the resulting transcripts and translations. (Translations won’t be available until version 2 is released later this year.)

Our Process

How is Easy Cloud AI’s Beluga better than other AI-generated transcription and translation services?

Our secret sauce is our use of OpenAI’s GPT-3 Davinci model along with Whisper.

Davinci is the most capable model family and can perform any task the other models can perform and often with less instruction. For applications requiring a lot of understanding of the content, like summarization for a specific audience and creative content generation, Davinci is going to produce the best results. These increased capabilities require more compute resources, so Davinci costs more per API call and is not as fast as the other models.

Another area where Davinci shines is in understanding the intent of text. Davinci is quite good at solving many kinds of logic problems and explaining the motives of characters. Davinci has been able to solve some of the most challenging AI problems involving cause and effect.

In addition, we use OpenAI’s Whisper, an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web.

Whisper’s large and diverse training dataset makes it far more robust to accents, background noise, and technical language.

Moreover, Whisper enables transcription in multiple languages, as well as translation from those languages into English.

How Is Privacy Maintained?

To start, files are uploaded and processed on our secure servers. All output files (transcripts, translations, .MP3s) are stored temporarily in our encrypted Amazon S3 buckets. After 12 hours, the download links to your output files expire. Lastly, all output files are deleted within 48 hours.
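
For readers who want to see the two time limits side by side, here is a minimal sketch of the retention rules described above. It is an illustration only, not Beluga's actual implementation. The 12-hour and 48-hour values come from the text. Everything else is a labeled assumption: the OutputFile type, the files_to_delete helper, the example storage key, and the choice to start both clocks at the moment the output file is written.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Time limits taken from the text above.
LINK_TTL = timedelta(hours=12)      # download links expire after 12 hours
DELETE_AFTER = timedelta(hours=48)  # output files are deleted within 48 hours


@dataclass(frozen=True)
class OutputFile:
    """Hypothetical record for one stored output file (transcript, translation, .MP3)."""
    key: str              # hypothetical storage key, e.g. an S3 object key
    created_at: datetime  # assumption: both clocks start when the file is written

    def link_expired(self, now: datetime) -> bool:
        """True once the 12-hour download window has passed."""
        return now >= self.created_at + LINK_TTL

    def due_for_deletion(self, now: datetime) -> bool:
        """True once the 48-hour retention window has passed."""
        return now >= self.created_at + DELETE_AFTER


def files_to_delete(files, now):
    """Hypothetical cleanup helper: return the files that are past retention."""
    return [f for f in files if f.due_for_deletion(now)]


if __name__ == "__main__":
    created = datetime(2023, 1, 1, 9, 0, tzinfo=timezone.utc)
    example = OutputFile("transcripts/example.txt", created)
    later = created + timedelta(hours=13)
    # 13 hours in: the link is no longer valid, but the file is still stored.
    print(example.link_expired(later), example.due_for_deletion(later))
```

In this sketch there is a window between hour 12 and hour 48 in which a file still exists but can no longer be downloaded through its original link. That follows from the two limits stated above.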

If you are interested in learning more, please contact our Beluga team.