Open source Speech Recognition transformers

Introducing Whisper

7 April 2024

Other existing approaches frequently use smaller, more closely paired audio-text training datasets,[^reference-1][^reference-2][^reference-3] or use broad but unsupervised audio pretraining.[^reference-4][^reference-5][^reference-6] Because Whisper was trained on a large and diverse dataset and was not fine-tuned to any specific one, it does not beat models that specialize in LibriSpeech performance, a famously competitive benchmark in speech recognition. However, when we measure Whisper’s zero-shot performance across many diverse datasets, we find it is much more robust and makes 50% fewer errors than those models.
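To make the "50% fewer errors" comparison concrete, here is a minimal sketch of how such a figure could be computed. It assumes errors are counted as word error rate (WER), the usual speech recognition metric; the text above does not name the metric, and the function names and sample sentences are purely illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programme).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_error_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline's errors that the new system avoids."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative sentences only, not data from any benchmark.
reference = "the cat sat on the mat"
baseline_output = "the cat sit on mat"   # one substitution, one missing word
robust_output = "the cat sat on mat"     # one missing word

baseline_wer = word_error_rate(reference, baseline_output)
robust_wer = word_error_rate(reference, robust_output)
reduction = relative_error_reduction(baseline_wer, robust_wer)
```

In this toy case the second output makes half as many word errors as the first, so `reduction` comes out at about 0.5. A real evaluation would average over many utterances and datasets and would normalize the text (casing, punctuation) before scoring.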

About a third of Whisper’s audio dataset is non-English, and the model is alternately given the task of transcribing in the original language or translating into English. We find this approach is particularly effective at learning speech-to-text translation, and it outperforms the supervised state of the art (SOTA) on CoVoST2 to-English translation in a zero-shot setting.
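The two tasks described above, transcribing in the original language and translating into English, can be sketched with the open-source `openai-whisper` Python package. This is a hedged illustration rather than the authors' own code: it assumes the package is installed, the checkpoint size `"base"` is an arbitrary choice, and the helper names and audio path are hypothetical.

```python
from typing import Any, Dict


def load_whisper(size: str = "base") -> Any:
    """Load a checkpoint through the `openai-whisper` package (assumed installed)."""
    import whisper  # imported lazily so the helper below works without the package

    return whisper.load_model(size)


def transcribe_and_translate(model: Any, audio_path: str) -> Dict[str, Any]:
    """Run both tasks on one recording with the same model."""
    # task="transcribe": text in the language spoken in the recording.
    original = model.transcribe(audio_path, task="transcribe")
    # task="translate": English text, whatever the spoken language.
    english = model.transcribe(audio_path, task="translate")
    return {
        "language": original.get("language"),
        "transcript": original["text"].strip(),
        "english": english["text"].strip(),
    }
```

A call might look like `transcribe_and_translate(load_whisper(), "interview_es.mp3")`, where the file name is a placeholder. Passing the model in as an argument keeps the helper easy to exercise with a stand-in object that exposes the same `transcribe` method.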
