Mistral

Mistral Voxtral Mini Transcribe is a speech-to-text model with diarization, timestamp granularity controls, context biasing for vocabulary, and optional language selection.

API documentation:

Models

| Model ID | Notes |
| --- | --- |
| `mistral/voxtral-mini-latest` | Latest Voxtral Mini transcription model. |
| `mistral/voxtral-mini-2602` | Pinned Voxtral Mini 2602 model. |

Params

```toml
[mistral.voxtral-mini-latest.params]
language = "en"
diarize = true
context_bias = ["OSTT", "Voxtral", "Rust"]
temperature = 0.2
```

```bash
ostt transcribe lecture.mp3 -m mistral/voxtral-mini-latest --param language=en --param context_bias=OSTT,Voxtral
ostt model params mistral/voxtral-mini-latest --format json
```


| Param | Type | Description |
| --- | --- | --- |
| `language` | string | Language code, for example `en`. Providing the language can improve accuracy. Cannot be combined with `timestamp_granularities`. |
| `diarize` | boolean | Enable speaker diarization. |
| `timestamp_granularities` | string list | Requested timestamp granularity: `segment` or `word`. Cannot be combined with `language`. |
| `context_bias` | string list | Vocabulary hints. Saved ostt keyword terms are used as a fallback only when `context_bias` is not set. |
| `temperature` | number | Sampling temperature. |
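
Because `language` and `timestamp_granularities` cannot be combined, a configuration that requests timestamps has to leave `language` unset. The following is a minimal sketch, not a confirmed configuration: it assumes the pinned model's table follows the same `[mistral.<model>.params]` naming as the `voxtral-mini-latest` example, and that `timestamp_granularities` is written as a TOML array in the same way as `context_bias`.

```toml
# Sketch only: the table name for the pinned model is assumed to mirror
# the voxtral-mini-latest example and is not confirmed by this page.
[mistral.voxtral-mini-2602.params]
# `language` is deliberately omitted here, because it cannot be combined
# with timestamp_granularities.
timestamp_granularities = ["word"]
# Setting context_bias explicitly means saved ostt keyword terms are not used as fallback.
context_bias = ["OSTT", "Voxtral"]
```

Keeping the timestamped settings on a separate model entry from the `language = "en"` settings avoids putting both mutually exclusive parameters in one table.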

OSTT uses the synchronous /v1/audio/transcriptions endpoint with local file uploads. It does not expose Mistral streaming, file_url, or uploaded file IDs.