Mistral
Mistral Voxtral Mini Transcribe is a speech-to-text model with diarization, timestamp granularity controls, context biasing for vocabulary, and optional language selection.
API documentation:
Models
| Model ID | Notes |
|---|---|
| `mistral/voxtral-mini-latest` | Latest Voxtral Mini transcription model. |
| `mistral/voxtral-mini-2602` | Pinned Voxtral Mini 2602 model. |
Params
```toml
[mistral.voxtral-mini-latest.params]
language = "en"
diarize = true
context_bias = ["OSTT", "Voxtral", "Rust"]
temperature = 0.2
```

```bash
ostt transcribe lecture.mp3 -m mistral/voxtral-mini-latest --param language=en --param context_bias=OSTT,Voxtral
ostt model params mistral/voxtral-mini-latest --format json
```

| Param | Type | Description |
|---|---|---|
| `language` | string | Language code, for example `en`. Providing the language can improve accuracy. Cannot be combined with `timestamp_granularities`. |
| `diarize` | boolean | Enable speaker diarization. |
| `timestamp_granularities` | string list | Requested timestamp granularity: `segment` or `word`. Cannot be combined with `language`. |
| `context_bias` | string list | Vocabulary hints. Saved ostt keyword terms are used as a fallback only when `context_bias` is not set. |
| `temperature` | number | Sampling temperature. |
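
Two rules in the table interact: `language` and `timestamp_granularities` are mutually exclusive, and saved keyword terms only apply when `context_bias` is absent. The following Python sketch illustrates how those rules could be checked before a request is built. It is not OSTT's implementation; `resolve_params` and `saved_keywords` are hypothetical names, and treating "not set" as "key absent" is an assumption.

```python
from typing import Any

# Granularity values listed in the params table.
ALLOWED_GRANULARITIES = {"segment", "word"}


def resolve_params(params: dict[str, Any], saved_keywords: list[str]) -> dict[str, Any]:
    """Return effective params after applying the rules from the table.

    Hypothetical helper for illustration only.
    """
    resolved = dict(params)  # copy so the caller's dict is left untouched

    # Rule 1: language and timestamp_granularities cannot be combined.
    if "language" in resolved and "timestamp_granularities" in resolved:
        raise ValueError("language cannot be combined with timestamp_granularities")

    # Only the documented granularity values are accepted.
    for value in resolved.get("timestamp_granularities", []):
        if value not in ALLOWED_GRANULARITIES:
            raise ValueError(f"unknown timestamp granularity: {value!r}")

    # Rule 2: saved keyword terms are a fallback, used only when
    # context_bias is absent (assumption: "not set" means the key is missing).
    if "context_bias" not in resolved and saved_keywords:
        resolved["context_bias"] = list(saved_keywords)

    return resolved
```

Under these assumptions, an explicit `context_bias` always wins over saved keywords, and a config that sets both `language` and `timestamp_granularities` fails early instead of being sent to the API.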
OSTT uses the synchronous `/v1/audio/transcriptions` endpoint with local file uploads. It does not expose Mistral streaming, `file_url`, or uploaded file IDs.
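
Because only local file uploads are supported, a remote URL is not a valid input. As a rough Python illustration of that constraint (not OSTT's actual code; `resolve_local_audio` is a hypothetical name), an input check might look like this:

```python
from pathlib import Path
from urllib.parse import urlparse


def resolve_local_audio(source: str) -> Path:
    """Accept only an existing local file; reject http(s) URLs.

    Hypothetical helper for illustration only.
    """
    # Remote URLs would correspond to file_url, which is not exposed.
    if urlparse(source).scheme in ("http", "https"):
        raise ValueError("remote URLs are not supported; pass a local file path")

    path = Path(source)
    if not path.is_file():
        raise FileNotFoundError(f"audio file not found: {source}")
    return path
```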