
Deepgram

Deepgram Nova models are fast cloud speech-to-text models with rich pre-recorded audio options: language detection, diarization, formatting, redaction, search, topic detection, sentiment, and keyword/keyterm prompting.

API documentation: Deepgram pre-recorded listen API

Models

| Model ID | Notes |
| --- | --- |
| deepgram/nova-3 | Latest Nova model. Supports keyterm prompting. |
| deepgram/nova-2 | Previous Nova generation. Uses keywords prompting. |
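
Because the prompting parameter differs between the two models, a wrapper script can pick the parameter name from the model ID. The following is a minimal sketch, not part of ostt: the helper name `prompt_param` and the file name `meeting.mp3` are hypothetical, and it relies only on the `ostt transcribe`, `-m`, and `--param` forms shown on this page. It prints the command instead of running it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a Deepgram model ID to its prompting param name.
prompt_param() {
  case "$1" in
    deepgram/nova-3) echo "keyterm" ;;   # Nova-3 supports keyterm prompting
    deepgram/nova-2) echo "keywords" ;;  # Nova-2 uses keywords prompting
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

model="deepgram/nova-2"
terms="OSTT,Deepgram"   # comma-separated list value for --param

# Assemble the command as an array so each argument stays intact.
cmd=(ostt transcribe meeting.mp3 -m "$model" --param "$(prompt_param "$model")=$terms")

# Print the command for review; run it with "${cmd[@]}" once it looks right.
printf '%s\n' "${cmd[*]}"
```

Switching `model` to `deepgram/nova-3` changes the emitted parameter to `keyterm` without touching the rest of the script.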

Select A Model

```bash
ostt model select deepgram/nova-3
```

Per command:

```bash
ostt transcribe meeting.mp3 -m deepgram/nova-3
```

Params

Persistent config:

```toml
[deepgram.nova-3.params]
detect_language = ["en", "sv"]
diarize = true
smart_format = true
keyterm = ["OSTT", "Whisper", "Deepgram"]
```

Per invocation:

```bash
ostt transcribe meeting.mp3 -m deepgram/nova-3 --param diarize=true --param smart_format=true --param detect_language=en,sv
ostt model params deepgram/nova-3 --format json
```

In the TOML config, list values must use TOML array syntax (["en", "sv"]). On the command line, --param takes list values as comma-separated strings (--param detect_language=en,sv).
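
To make the two list notations concrete, here is a small sketch that renders the same list of language codes in both forms. The variable names are illustrative, and the script only prints text; it does not call ostt or write a config file.

```bash
#!/usr/bin/env bash
# Language codes held as a bash array.
languages=(en sv)

# CLI form: join the array with commas.
# "${languages[*]}" joins elements with the first character of IFS.
joined=$(IFS=,; echo "${languages[*]}")

# TOML form: quote each element, separate with ", ", then drop the trailing separator.
toml_items=$(printf '"%s", ' "${languages[@]}")
toml_items=${toml_items%, }

printf 'detect_language = [%s]\n' "$toml_items"         # line for the config file
printf -- '--param detect_language=%s\n' "$joined"      # argument for the command line
```

The first printed line matches the persistent config entry above, and the second matches the per-invocation flag.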

Params

| Param | Type | Description |
| --- | --- | --- |
| detect_language | boolean or string list | true detects the dominant language. A list restricts detection to the given BCP-47 language codes, e.g. ["en", "sv"]. |
| language | string | Primary language hint. Use when the language is known instead of detection. |
| diarize | boolean | Assign speaker numbers to words. |
| diarize_model | string | Select a batch diarization model: latest, v1, or v2. Do not combine with diarize=true unless Deepgram documents support for that combination. |
| dictation | boolean | Enable spoken formatting commands. |
| filler_words | boolean | Include filler words such as “uh” and “um”. |
| measurements | boolean | Convert spoken measurements to abbreviations. |
| multichannel | boolean | Transcribe each audio channel independently. |
| numerals | boolean | Convert spoken numbers to numeric form. |
| paragraphs | boolean | Add paragraph structure to the transcript. |
| profanity_filter | boolean | Mask or remove profanity. |
| punctuate | boolean | Add punctuation and capitalization. |
| smart_format | boolean | Format dates, times, currency, phone numbers, emails, URLs, and related text. |
| utterances | boolean | Segment speech into meaningful utterances. |
| utt_split | number | Pause duration in seconds before a new utterance. |
| mip_opt_out | boolean | Opt out of the Deepgram Model Improvement Program. |
| keyterm | string list | Nova-3 keyterm prompting. Saved ostt keyword terms are used as a fallback only when keyterm is not set. |
| keywords | string list | Keyword boosting. On Nova-2, saved ostt keyword terms are used as a fallback only when keywords is not set. |
| search | string list | Search for terms in submitted audio. |
| replace | string list | Search and replace terms or phrases. |
| redact | string list | Redact sensitive information, such as pci, pii, or numbers. |
| tag | string list | Attach labels for reporting. |
| extra | string list | Attach arbitrary key-value metadata to the response. |
| detect_entities | boolean | Extract named entities. |
| sentiment | boolean | Detect sentiment. |
| summarize | boolean or string | Enable summarization. Deepgram also supports version strings such as v2. |
| topics | boolean | Detect topics. |
| custom_topic | string list | Topics to detect. |
| custom_topic_mode | string | Topic mode: extended or strict. |
| intents | boolean | Detect speaker intent. |
| custom_intent | string list | Intents to detect. |
| custom_intent_mode | string | Intent mode: extended or strict. |
| encoding | string | Expected submitted audio encoding, e.g. linear16, flac, opus. Usually unnecessary because OSTT submits encoded audio. |
| version | string | Model version, such as latest. |
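
Several of these params are often combined in one run. As a sketch of how that might look, the script below assembles repeated `--param` flags from a list of key=value pairs, using only the `ostt transcribe`, `-m`, and `--param` forms shown earlier. The file name `call.wav` and the `utt_split` value are illustrative assumptions, and the script prints the command instead of executing it.

```bash
#!/usr/bin/env bash
# Params for one run, written as key=value; list values use commas.
params=(
  diarize=true      # assign speaker numbers to words
  utterances=true   # segment speech into utterances
  utt_split=1.2     # pause in seconds before a new utterance (illustrative value)
  redact=pci,pii    # string list of redaction categories
)

# Start from the base command and append one --param flag per entry.
cmd=(ostt transcribe call.wav -m deepgram/nova-3)
for p in "${params[@]}"; do
  cmd+=(--param "$p")
done

# Print the command for review; run it with "${cmd[@]}" once it looks right.
printf '%s\n' "${cmd[*]}"
```

Keeping the params in an array makes it easy to comment individual entries in or out while experimenting, before moving a stable set into the persistent TOML config.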