Transcribe speech to text using the largest and most powerful AI models available, including the OpenAI Whisper large-v3 model. Excellent handling of background noise, multiple accents, and live speech.
Be an early adopter and receive additional free transcription hours each month!
No recurring costs
$40 per month, billed annually |
Pre-recorded Transcriptions | $0.35 per hour |
---|---|
Live Transcription | $0.80 per hour |
API Access | |
---|---|
Database Access | |
Managed Services | |
Transcription Rate Limit | max 50 concurrent sessions |
Server Start | warm boot in non-peak times |
Transcribe Audio from Uploaded File | |
---|---|
Transcribe Audio from URL | |
Transcribe Audio from Microphone | |
Export Subtitles and Files | |
Translate Transcriptions | |
Polyglot |
Transcribe from Microphone | |
---|---|
Transcribe from Live Stream | |
Real-Time Transcriptions via Public URL | |
Real-Time Translations via Public URL | |
Historical Transcriptions via Public URL | |
Enable Password Protection | |
Scheduled Livestream Transcriptions |
Language Support | 57 languages plus dialects & accents |
---|---|
Automatic Language Detection | |
Paragraph Segmentation | |
Summarization | |
Word-Level Time Stamps | |
Word-Level Alignment | |
Speaker Diarization |
Help & Support | Email and Live Chat Support |
---|---|
SLA |
VocalStack uses large language models (LLMs) to get the best transcription quality possible, even in the most challenging audio environments. This includes Whisper, which serves as the core model for the VocalStack platform. The large Whisper model is a state-of-the-art AI model that has been trained on a vast amount of data to understand and transcribe speech accurately.
To better understand the impact of an AI model's size, let's use the different Whisper models to transcribe a fictitious excerpt:
No, you will not be billed for the whole hour. Billing is always calculated per second of transcribed audio, whether the audio is pre-recorded or live, so you only pay for what you actually have transcribed. The only exception is a one-minute minimum: audio shorter than one minute is billed as a full minute.
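As a rough illustration of the per-second rule and the one-minute minimum, the calculation can be sketched in Python. This is not VocalStack's billing code: the function name `transcription_cost`, the constant `MIN_BILLABLE_SECONDS`, and the rounding to four decimal places are assumptions made for the example. The rates used are the per-hour prices from the pricing table above.

```python
from decimal import Decimal, ROUND_HALF_UP

# One-minute minimum from the billing description (constant name is hypothetical).
MIN_BILLABLE_SECONDS = 60


def transcription_cost(duration_seconds, rate_per_hour):
    """Estimate the charge in dollars for one transcription, billed per second.

    duration_seconds: length of the transcribed audio in seconds.
    rate_per_hour: hourly price as a string, e.g. "0.35".
    """
    # Audio shorter than one minute is billed as a full minute.
    billable_seconds = max(duration_seconds, MIN_BILLABLE_SECONDS)
    cost = Decimal(str(billable_seconds)) * Decimal(rate_per_hour) / Decimal(3600)
    # Rounding to four decimal places is an assumption, not a documented rule.
    return cost.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


# A 30-minute pre-recorded file at $0.35 per hour.
print(transcription_cost(30 * 60, "0.35"))
# A 20-second live clip at $0.80 per hour, billed as 60 seconds.
print(transcription_cost(20, "0.80"))
```

Using `Decimal` rather than floats keeps the arithmetic exact for currency amounts. How the real service rounds fractional cents is not stated on this page.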
To simplify this further, here is what you will be billed in each plan for a prerecorded transcription (assuming you've used up all your free transcription hours for the month):
No, there are no hidden costs. You only pay for the transcription of your audio content. (In other words, only for the costs listed on the pricing table.) Other features such as automatic language detection, translations, summarizations, paragraph segmentation, keyword detection, and timestamps are included for free.
Importantly, the number of translations does not affect the transcription cost. For example, if you transcribe an audio file in English and then translate it into Spanish, French, and German, you will only be billed for the transcription of the English audio. This also applies to live transcriptions using Polyglot. You can perform an unlimited number of translations at any time without any additional charges.
Pre-recorded transcription refers to the process of transcribing audio that has been previously recorded. It can be uploaded as an audio file and transcribed at a later time, making it suitable for podcasts, interviews, videos, and other recorded content.
Live transcription refers to the process of transcribing audio in real time as it is being spoken. This is useful for live streams, podcasts, events, meetings, lectures, and other scenarios where immediate transcription (and possibly translation) is required.