Groq Whisper from the command line.

Groq runs OpenAI's Whisper models on its LPU hardware — purpose-built inference chips that transcribe a 10-minute recording in about 4 seconds, 4–5x faster than the OpenAI Whisper API at a fraction of the cost. OSTT connects Groq to your terminal, hotkey, and shell pipeline on Linux and macOS.

Groq Whisper

216x real-time speed. $0.04 per hour.

Groq's LPU (Language Processing Unit) infrastructure runs Whisper Large v3 Turbo at 216x real-time speed — a 10-minute recording returns in roughly 4 seconds. Independent benchmarks put Groq at 4–5x lower latency than the OpenAI Whisper endpoint, at about 9x lower cost. Two models: whisper-large-v3-turbo for speed and cost, whisper-large-v3 for maximum accuracy and translation support.

# ~/.config/ostt/ostt.toml
[transcription]
provider = "groq"
model = "whisper-large-v3-turbo"

# For maximum accuracy or audio translation support
# model = "whisper-large-v3"

[groq.whisper-large-v3-turbo.params]
language = "en"
response_format = "verbose_json"
timestamp_granularities = ["word", "segment"]

# Pick interactively
ostt model

# Record with hotkey, transcribe with Groq, copy to clipboard
ostt launch -c

216x real-time transcription

Groq's LPU hardware processes audio dramatically faster than GPU-based providers. A 10-minute recording returns in ~4 seconds. For a hotkey dictation workflow, that means text in your clipboard before your hand reaches the keyboard.
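As a back-of-envelope check on that figure, the sketch below divides audio duration by the 216x speed factor. It models pure processing time only; upload and network overhead are not included, which is presumably why the quoted end-to-end figure is closer to 4 seconds. The function name `estimate_seconds` is illustrative and not part of OSTT.

```shell
# Estimate pure processing time (seconds) for a recording at a real-time speed factor.
# Usage: estimate_seconds <audio_minutes> [speed_factor]   (speed_factor defaults to 216)
estimate_seconds() {
  awk -v min="$1" -v factor="${2:-216}" 'BEGIN { printf "%.1f\n", (min * 60) / factor }'
}

estimate_seconds 10    # 10-minute recording: prints 2.8
estimate_seconds 60    # 1-hour recording: prints 16.7
```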

Best cost/performance ratio

Whisper Large v3 Turbo costs $0.04/hr ($0.00067/min), roughly 9x cheaper than the OpenAI Whisper API for the same audio. At high volume, that difference compounds quickly. Whisper Large v3 is available at $0.111/hr for translation and accuracy-critical work.
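To see how the difference compounds, here is a small arithmetic sketch for 100 hours of audio. The two Groq rates come from the text above. The OpenAI rate of $0.36/hr ($0.006/min) is an assumption that matches the 9x figure, so check current pricing before relying on it. `audio_cost` is an illustrative helper, not an OSTT command.

```shell
# Cost in USD for a number of audio hours at a per-hour rate.
# Usage: audio_cost <hours> <usd_per_hour>
audio_cost() {
  awk -v h="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", h * rate }'
}

audio_cost 100 0.04     # Groq Whisper Large v3 Turbo: prints 4.00
audio_cost 100 0.111    # Groq Whisper Large v3: prints 11.10
audio_cost 100 0.36     # OpenAI Whisper API (assumed rate): prints 36.00
```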

Two models, one decision

Groq documents whisper-large-v3-turbo as the best price/performance choice for multilingual transcription and whisper-large-v3 as the accuracy-sensitive model with translation support. Switch between them via ostt model.

99+ languages

Groq Whisper supports the same 99+ language breadth as the base Whisper model. Turbo handles multilingual audio; Large v3 adds translation to English for non-English source audio.

Validated Groq options

Pass options with --param: language=en, prompt=..., or temperature=0. To get timestamp metadata, set response_format=verbose_json and timestamp_granularities=word,segment.
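A minimal sketch of how these options might be combined in a reusable wrapper. It assumes that --param can be repeated and passed alongside ostt transcribe and -o, which this page does not state explicitly. `OSTT_BIN` and `transcribe_en` are hypothetical names introduced here for illustration.

```shell
# OSTT_BIN is a hypothetical override for this sketch, not an OSTT setting.
# Set OSTT_BIN=echo to print the arguments instead of running ostt.
OSTT_BIN="${OSTT_BIN:-ostt}"

# Transcribe a file as English at temperature 0.
# Usage: transcribe_en <audio_file> <output_file>
transcribe_en() {
  "$OSTT_BIN" transcribe "$1" -o "$2" \
    --param language=en \
    --param temperature=0
}
```

Call it as `transcribe_en meeting.mp3 notes.md`. Running `ostt model params groq/whisper-large-v3-turbo` first shows which keys the selected model actually accepts.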

Retry without re-recording

OSTT saves every recording locally. Run ostt retry to re-transcribe with Groq, or switch to any other provider — without speaking again.

Workflow

From speech to useful output.

1. Record: Press your global hotkey or run ostt in the terminal.
2. Transcribe: Groq's LPU infrastructure processes the audio in seconds.
3. Process: Optionally run AI prompts or shell commands on the result.
4. Send: Print to stdout, copy to clipboard, write to a file, or pipe onward.

Pipeline

The fastest path from voice to shell.

Groq's throughput makes OSTT feel instant. Press your hotkey, speak, and the transcript is in your clipboard before you've switched windows. Pipe the output through any CLI tool. Use ostt model params groq/whisper-large-v3-turbo to list supported Groq --param keys.

# Transcribe a 10-minute recording — returns in ~4 seconds
ostt transcribe meeting.mp3 -o notes.md

# Record, process with AI action, copy to clipboard
ostt -p clean -c

# Pipe directly to another command
# (-0 stops xargs from choking on apostrophes or line breaks in the transcript)
ostt | xargs -0 -I{} notify-send "Transcribed" "{}"
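Because the transcript arrives on stdout, any shell function that reads stdin can sit at the end of the pipe. The sketch below appends each transcript to a log file with a timestamp. `append_note` and the file path in the usage line are illustrative, not part of OSTT.

```shell
# Append stdin (a transcript) to a notes file, prefixed with a timestamp.
# Usage: some_command | append_note <file>
append_note() {
  notes_file="$1"
  transcript="$(cat)"
  # Write nothing if the transcript is empty.
  [ -n "$transcript" ] || return 0
  printf '[%s] %s\n' "$(date '+%Y-%m-%d %H:%M')" "$transcript" >> "$notes_file"
}
```

Usage would look like `ostt | append_note ~/notes/dictation.log`, assuming ostt prints only the transcript to stdout.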

Groq speed in your terminal.