AssemblyAI

AssemblyAI Universal-3 Pro is a promptable speech-to-text model with language detection, formatting, speaker labels, and vocabulary guidance.

AssemblyAI documentation:

Model

| Model ID | Notes |
| --- | --- |
| assemblyai/universal-3-pro | Universal-3 Pro cloud transcription model. OSTT sends this as speech_models = ["universal-3-pro"]. |
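
To illustrate the mapping described above, here is a minimal sketch of how an OSTT model ID could be turned into the `speech_models` field of a request body. The function name `build_request_body`, the `audio_url` field, and the way extra params are merged are assumptions made for illustration; this is not OSTT's actual implementation.

```python
from typing import Optional


def build_request_body(model_id: str, audio_url: str,
                       params: Optional[dict] = None) -> dict:
    """Build a hypothetical request body from an OSTT model ID.

    Assumes model IDs have the form "<provider>/<model>".
    """
    provider, _, model = model_id.partition("/")
    if provider != "assemblyai" or not model:
        raise ValueError(f"not an AssemblyAI model ID: {model_id!r}")

    # The part after the slash becomes the single entry of speech_models.
    body = {"audio_url": audio_url, "speech_models": [model]}

    # Assumption: remaining params are passed through unchanged.
    body.update(params or {})
    return body


body = build_request_body(
    "assemblyai/universal-3-pro",
    "https://example.com/meeting.mp3",  # placeholder URL
    {"speaker_labels": True},
)
print(body["speech_models"])
```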

Params

```toml
[assemblyai.universal-3-pro.params]
language_detection = true
format_text = true
speaker_labels = true
keyterms_prompt = ["OSTT", "Whisper", "Rust"]
```
```bash
ostt transcribe meeting.mp3 -m assemblyai/universal-3-pro --param speaker_labels=true --param keyterms_prompt=OSTT,Whisper
ostt model params assemblyai/universal-3-pro --format json
```

Do not set prompt and keyterms_prompt together. AssemblyAI documents them as alternatives: use prompt for transcription style and behavior, and keyterms_prompt when you know the specific terms that should be recognized. OSTT validates this before sending a request.
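
The mutual-exclusion rule above can be sketched as a small pre-flight check. This is only an illustration of the idea: the function name `check_prompt_params` and the error message are hypothetical, and treating empty values as "not set" is an assumption rather than documented OSTT behavior.

```python
def check_prompt_params(params: dict) -> None:
    """Raise ValueError if both prompt and keyterms_prompt are set."""
    # Assumption: an empty string or empty list counts as "not set".
    has_prompt = bool(params.get("prompt"))
    has_keyterms = bool(params.get("keyterms_prompt"))
    if has_prompt and has_keyterms:
        raise ValueError(
            "prompt and keyterms_prompt are alternatives; set only one"
        )


# Either param on its own passes the check.
check_prompt_params({"prompt": "Transcribe verbatim, including filler words."})
check_prompt_params({"keyterms_prompt": ["OSTT", "Whisper"]})

# Setting both is rejected before any request would be sent.
try:
    check_prompt_params({"prompt": "Verbatim.", "keyterms_prompt": ["OSTT"]})
except ValueError as err:
    print(f"rejected: {err}")
```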

| Param | Type | Description |
| --- | --- | --- |
| prompt | string | Natural-language instruction for transcription style, vocabulary, speaker roles, or domain context. AssemblyAI supports up to 1,500 words for Universal-3 Pro. |
| keyterms_prompt | string list | Terms to bias recognition. AssemblyAI supports up to 1,000 words or phrases for Universal-3 Pro, with a maximum of 6 words per phrase. Saved ostt keyword terms are used as fallback only when keyterms_prompt and prompt are not set. |
| language_code | string | Known source language code. |
| language_detection | boolean | Detect language automatically. |
| format_text | boolean | Apply punctuation, casing, and numeric formatting. |
| punctuate | boolean | Add punctuation. |
| disfluencies | boolean | Include filler words and false starts. |
| filter_profanity | boolean | Mask profanity. |
| speaker_labels | boolean | Enable speaker labels. |
| speakers_expected | integer | Expected speaker count. |
| speech_threshold | number | Threshold for speech detection. |
| temperature | number | Sampling temperature, 0.0 to 1.0. |
| word_boost | string list | Vocabulary boost terms. |
| boost_param | string | Boost strength parameter. |
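
The limits and the keyword fallback listed in the table can be sketched as follows. This is an illustrative check under stated assumptions, not OSTT's or AssemblyAI's actual validation: the function names `resolve_keyterms` and `validate_limits` are hypothetical, words are counted by splitting on whitespace, and empty values are treated as "not set".

```python
from typing import List

MAX_PROMPT_WORDS = 1500      # prompt limit from the table
MAX_KEYTERMS = 1000          # keyterms_prompt entry limit from the table
MAX_WORDS_PER_KEYTERM = 6    # per-phrase word limit from the table


def resolve_keyterms(params: dict, saved_keywords: List[str]) -> List[str]:
    """Pick the key terms to send, using saved keywords only as a fallback."""
    if params.get("keyterms_prompt"):
        return list(params["keyterms_prompt"])
    if params.get("prompt"):
        # A prompt is set, so saved keywords are not applied.
        return []
    return list(saved_keywords)


def validate_limits(params: dict) -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    errors = []

    # Assumption: "words" means whitespace-separated tokens.
    prompt = params.get("prompt") or ""
    if len(prompt.split()) > MAX_PROMPT_WORDS:
        errors.append(f"prompt exceeds {MAX_PROMPT_WORDS} words")

    keyterms = params.get("keyterms_prompt") or []
    if len(keyterms) > MAX_KEYTERMS:
        errors.append(f"keyterms_prompt exceeds {MAX_KEYTERMS} entries")
    for term in keyterms:
        if len(term.split()) > MAX_WORDS_PER_KEYTERM:
            errors.append(
                f"key term {term!r} exceeds {MAX_WORDS_PER_KEYTERM} words"
            )

    temperature = params.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        errors.append("temperature must be between 0.0 and 1.0")

    return errors


saved = ["OSTT", "Rust"]  # stand-in for saved ostt keyword terms
print(resolve_keyterms({}, saved))
print(validate_limits({"keyterms_prompt": ["OSTT"], "temperature": 1.5}))
```

Returning a list of violations instead of raising on the first one lets a caller report every problem at once; raising immediately would be an equally reasonable choice.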