Changelog
0.0.19 - 2026-06-03
Added
- Added --paste output mode for record, transcribe, retry, process, and launch flows, using configurable [output.paste] clipboard-backed paste settings.
- Added deterministic text replace rules configured under [text.replace], applied before processing/output/history, with an ostt replace TUI for managing rules.
- Added external command/<profile> transcription providers that run configured shell commands with {audio_path} and read transcript text from stdout.
- Added external http/<profile> transcription providers for OpenAI-compatible /v1/audio/transcriptions endpoints with validated request params.
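The command/<profile> contract described above (substitute {audio_path}, run the shell command, read the transcript from stdout) can be pictured with a minimal Python sketch. This is not OSTT's implementation: the function name `run_command_provider`, the shell quoting of the path, and the trimming of stdout are assumptions made for illustration.

```python
import shlex
import subprocess


def run_command_provider(command_template: str, audio_path: str) -> str:
    """Run a configured shell command and return its stdout as the transcript.

    Hypothetical helper: only the {audio_path} placeholder and the
    "transcript on stdout" contract come from the changelog entry.
    """
    # Quote the path so spaces or shell metacharacters cannot break the command.
    command = command_template.replace("{audio_path}", shlex.quote(audio_path))
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    # Assumption: surrounding whitespace is not part of the transcript.
    return result.stdout.strip()


if __name__ == "__main__":
    # `echo` stands in for a real speech-to-text command.
    print(run_command_provider("echo transcript for {audio_path}", "/tmp/clip.wav"))
```

With `check=True`, a non-zero exit status raises `subprocess.CalledProcessError`, so a failing command is not mistaken for an empty transcript.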
Changed
- ostt model now uses a unified picker with custom, cloud, and local models in one list.
- ostt model list and ostt model select now support configured command/* and http/* profiles.
0.0.18 - 2026-06-02
Changed
- Replaced public transcription model option terminology with params: use repeatable --param key=value and ostt model params [PROVIDER/MODEL] --format table|json.
- Moved persistent transcription params to top-level provider tables: [provider.params] for provider defaults and [provider."model".params] for model overrides.
- Renamed the built-in local Whisper provider ID from local to whisper, so local model IDs use whisper/<model>.
- Local model registry entries now carry provider_id, and ostt model local download/remove accepts either MODEL_ID or PROVIDER/MODEL.
- Local Whisper audio format is now resolved from [whisper].output_format, [whisper."model"].output_format, or the built-in Whisper default instead of mutating global [audio].output_format.
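The layering of provider defaults, model overrides, and repeatable --param key=value arguments can be sketched as a simple merge. The function names are hypothetical, and the assumption that per-run params take precedence over saved params is mine; the changelog only states that model tables override provider tables.

```python
def parse_param(raw: str) -> tuple[str, str]:
    """Split one `key=value` argument at the first `=`."""
    key, sep, value = raw.partition("=")
    if not sep or not key:
        raise ValueError(f"expected key=value, got {raw!r}")
    return key, value


def resolve_params(
    provider_defaults: dict, model_overrides: dict, cli_params: list[str]
) -> dict:
    """Merge params from least to most specific (assumed precedence)."""
    merged = dict(provider_defaults)  # [provider.params]
    merged.update(model_overrides)  # [provider."model".params]
    # Repeatable --param key=value arguments; later ones win over earlier ones.
    merged.update(parse_param(raw) for raw in cli_params)
    return merged


if __name__ == "__main__":
    print(
        resolve_params(
            {"language": "en", "temperature": "0"},
            {"temperature": "0.2"},
            ["language=sv"],
        )
    )
```

Splitting at the first `=` keeps values that themselves contain `=` intact, which matters for free-text params such as prompts.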
Removed
- Removed support for deprecated [providers], [model_options], [audio].sample_rate, [[process.actions]], and provider = "local" config shapes. OSTT now fails loudly with targeted configuration errors for these old formats.
0.0.17 - 2026-06-02
Added
- Added per-run model option overrides with repeatable --mo key=value, validated against the selected provider/model option schema for supported transcription providers.
- Added ostt model options [PROVIDER/MODEL] --format table|json to list supported model option keys for scripting.
- Added per-model local Whisper options under [model_options."local/<model>"], using the same keys as [providers.local].
- Added OpenAI gpt-4o-transcribe-diarize and validated JSON-safe OpenAI transcription options for logprobs, Whisper timestamp metadata, and diarization requests.
- Added validated JSON-safe Groq transcription options for response_format and timestamp_granularities based on Groq's speech-to-text documentation.
- Added more DeepInfra speech-recognition models and validated DeepInfra-native options including initial_prompt, task, chunk_level, and chunk_length_s.
- Added validated Berget transcription options for JSON-safe response format, word alignment, diarization, speaker embeddings, chunk size, and batch size.
- Added validated ElevenLabs Scribe options for timestamps, file format, entity detection/redaction, diarization rules, and speaker-role detection rules.
- Added validated Mistral Voxtral transcription options for language, temperature, timestamp_granularities, diarize, and context_bias.
Changed
- Transcription request options are now model-scoped under [model_options."provider/model"] instead of provider-scoped settings under [providers.*].
- openai/gpt-4o-transcribe now honors --mo prompt=... and saved ostt keyword terms as prompt context.
Removed
- Removed the misleading audio.sample_rate configuration setting. Recording now uses the input device sample rate, while local transcription compatibility is controlled by audio.output_format = "pcm_s16le -ar 16000".
0.0.16 - 2026-06-01
Added
- Added per-run transcription model selection with -m, --model <PROVIDER/MODEL> for ostt, ostt record, ostt transcribe, ostt retry, and ostt launch. The override applies only to the current invocation and does not change the saved default model.
- Added scriptable CLI actions for models, keywords, auth, history, config helpers, logs, and local model download/removal.
Changed
- Cloud transcription models now use provider/model identities such as deepgram/nova-3, deepinfra/openai/whisper-large-v3, and berget/KBLab/kb-whisper-large, replacing older OSTT-invented model IDs.
- Local model UI and error messages now show full public IDs such as local/turbo.
- Cleaned up command syntax: ostt keyword replaces ostt keywords, ostt config list-devices replaces ostt list-devices, ostt process list replaces ostt process --list, and ostt completions install <shell> replaces ostt completions <shell> --install.
- Record-only options no longer appear in unrelated management command help.
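The provider/model identities listed above include model names that themselves contain a slash (for example deepinfra/openai/whisper-large-v3). A minimal sketch of how such an ID could be split, assuming the provider is everything before the first slash; the function name is hypothetical and OSTT's own parsing may differ.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a public model ID into (provider, model) at the first slash."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected PROVIDER/MODEL, got {model_id!r}")
    return provider, model


if __name__ == "__main__":
    for model_id in (
        "deepgram/nova-3",
        "deepinfra/openai/whisper-large-v3",
        "berget/KBLab/kb-whisper-large",
    ):
        print(split_model_id(model_id))
```

Splitting only once keeps the upstream model name (such as openai/whisper-large-v3) unchanged, which fits the stated goal of replacing OSTT-invented model IDs.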
0.0.15 - 2026-05-27
Fixed
- Fixed Kitty popup launch behavior by using cell-based dimensions, supported window flags, macOS app bundle detection, configured positioning, and automatic quit when the popup closes.
0.0.14 - 2026-05-22
Fixed
- Fixed macOS local artifact builds by removing a Linux-only runtime directory fallback from shared code.
0.0.13 - 2026-05-22
Changed
- ostt model now shows the currently selected provider/model before choosing between local and cloud models.
- Routine popup/recording lifecycle messages now log at debug level to keep normal logs quieter.
Fixed
- ostt launch now targets only the active recorder process via a recorder-owned runtime PID file, preventing the hotkey from accidentally signaling the local model daemon.
- Model selection now uses only [transcription] in ostt.toml; the legacy ~/.local/share/ostt/model file is ignored and no longer migrated.
- Avoided printing duplicate logos during authentication flows.
- Improved error screen text color for better readability.
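The PID-file approach behind the ostt launch fix can be illustrated with a short POSIX-only Python sketch. The function name and the file layout (a single decimal PID) are assumptions for illustration, not OSTT's actual format.

```python
import os
from pathlib import Path
from typing import Optional


def read_recorder_pid(pid_file: Path) -> Optional[int]:
    """Return the PID stored in a recorder-owned PID file if that process is alive."""
    try:
        pid = int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None  # no recorder running, or the file is corrupt
    if pid <= 0:
        return None  # never address process groups
    try:
        os.kill(pid, 0)  # POSIX: signal 0 checks existence without delivering a signal
    except ProcessLookupError:
        return None  # stale PID file
    except PermissionError:
        pass  # the process exists but belongs to another user
    return pid
```

Because only the recorder writes this file, a caller that signals the returned PID cannot hit an unrelated process such as a model daemon, provided stale files are rejected as shown.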
0.0.12 - 2026-05-21
Added
- Local transcription - Run transcription fully offline using whisper.cpp. Metal is enabled on all macOS builds (Apple Silicon ~180 ms). CUDA and Vulkan builds are available for Linux. The install script auto-selects the right GPU variant.
- Local model management - ostt model lets users browse, download, activate, and delete local models from a hosted registry. Custom models can be added via Hugging Face URLs or direct .gguf / ggml-*.bin links.
- Local model daemon - A persistent background process keeps the model loaded between transcriptions, eliminating per-call startup cost. Managed via ostt daemon start|stop|restart|status|install|uninstall. Installs as a launchd service (macOS) or systemd user unit (Linux). Daemon logs are merged into ostt logs.
- Mistral provider - New transcription provider with Voxtral Mini Transcribe and Voxtral Mini 2602 models. Supports optional language hints via [providers.mistral].language.
Changed
- Updated model picker UI, dialogs, and toasts to use terminal ANSI colors and consistent local/cloud model screens.
0.0.11 - 2026-05-13
Fixed
- Preserved user ostt.toml changes during version updates and added validation for the embedded default config template.
Added
- Added process-level default AI tool and model settings for processing actions.
- Shell completions - ostt completions <shell> --install auto-detects the shell and writes completions to the standard system directory. Native packages (.deb, .rpm) ship completion files pre-installed. The AUR PKGBUILD generates them at build time.
0.0.10 - 2026-05-12
Fixed
- Fixed invalid TOML in the embedded default config template for AssemblyAI language detection options.
- Suppressed VLC GUI and DBus noise during ostt replay on Linux by using terminal-friendly playback commands.
- Added ARM64 Linux package builds so Ubuntu arm64 users can install a matching .deb package.
0.0.9 - 2026-05-12
Added
- Process command - New ostt process subcommand for running processing actions on history items. New -p/--process flag on record, transcribe, and retry commands for post-transcription processing. Includes AI tool execution (OpenCode, Claude Code, Gemini CLI, Codex CLI), bash command execution, and an action picker TUI for selecting which action to run. Actions are configured under [process.actions] in ~/.config/ostt/ostt.toml.
- Launch command - New ostt launch subcommand for cross-platform popup recording. Opens a terminal popup that starts recording immediately, with toggle support (press hotkey again to stop). Auto-detects ghostty, kitty, alacritty, foot, konsole, gnome-terminal, and xfce4-terminal. Replaces the old Hyprland-specific float script.
- ElevenLabs provider - New transcription provider with Scribe v2 and Scribe v1 models. Supports optional language hints via [providers.elevenlabs].language_code.
- .deb package for Debian/Ubuntu/Mint installation via cargo-deb
- .rpm package for Fedora/RHEL/openSUSE installation via cargo-generate-rpm
- Both packages are automatically built and uploaded to GitHub Releases via CI
- GNOME setup guide - Platform-specific README in environments/gnome/
- KDE Plasma setup guide - Platform-specific README in environments/kde/
Removed
- Hyprland float script - Removed legacy ostt-float.sh and alacritty-float.toml generation. Superseded by the new ostt launch command.
- macOS Hammerspoon config - Removed init.lua generation. Superseded by the new ostt launch command.
0.0.8 - 2026-03-31
Added
- Transcribe command - Transcribe pre-recorded audio files without recording (ostt transcribe <file>). Enables use of OSTT's transcription pipeline in non-interactive workflows such as CI pipelines, GitHub Actions, or agentic scripts. Supports the same output flags as record and retry (-c for clipboard, -o for file, stdout by default). Alias: t.
- Deepgram language detection - Enable automatic language detection with detect_language option (default: true). Previously Deepgram defaulted to English only.
- Deepgram language detection restriction - Restrict detectable languages with detect_language_codes option. For example, detect_language_codes = ["en", "es"] will only detect English or Spanish.
- AssemblyAI provider - New transcription provider with the universal-3-pro model. Configurable via [providers.assemblyai] in ostt.toml.
- Berget provider - New transcription provider with 3 models: KB Whisper Large (Swedish optimized), NB Whisper Large (Norwegian optimized), and Whisper Large V3 (general-purpose). All data stays within Sweden.
Fixed
- Transcription cancel support - Users can now press Escape, q, or Ctrl+C to cancel during transcription. Previously the UI was stuck until the API responded.
- ErrorScreen reserved for TUI commands - Non-TUI commands (retry, transcribe) now use standard error output instead of launching a full-screen error display.
- macOS Hammerspoon popup - Fixed terminal noise in Ghostty popup by launching ostt via clear; exec through a login shell.
0.0.7 - 2026-02-05
Added
- Output mode configuration - Control transcription output destination with CLI flags:
  - Default: outputs to stdout for piping to other commands
  - -c flag: copy to clipboard
  - -o <file> flag: write to file
- Top-level record options - -c and -o flags now available at CLI top level without explicit record command (e.g., ostt -c equivalent to ostt record -c)
- Automatic log rotation - Log files kept for 7 most recent days; older logs automatically deleted on startup
- Version tracking and auto-updates - Application version tracked in config; app-managed files (float script, Alacritty config) automatically updated on version changes
- Retry command - Re-transcribe previous recordings without re-recording audio (ostt retry or ostt retry N)
- Replay command - Playback previous recordings using system audio player (ostt replay or ostt replay N)
- Recording history - Maintains history of 10 most recent audio recordings with automatic rotation
- Command aliases - Short aliases for common commands: r (record), a (auth), h (history), k (keywords), c (config), rp (replay)
- Rich help system - Two-tier help with -h (short) and --help (long with examples)
- Improved error messages - Typo suggestions and better command-not-found errors
- Shell completions - Generate completion scripts for bash, zsh, fish, and PowerShell (ostt completions <shell>)
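The automatic log rotation entry above can be sketched as a pure function that decides which log files to delete. Two things here are assumptions: "7 most recent days" is read as the seven most recent dates that have a log (not a calendar window), and the `ostt-YYYY-MM-DD.log` file names in the demo are hypothetical.

```python
from datetime import date, timedelta


def logs_to_delete(log_dates: dict[str, date], keep_days: int = 7) -> list[str]:
    """Return log file names that fall outside the `keep_days` most recent dates."""
    # Distinct dates that have at least one log, newest first.
    recent = sorted(set(log_dates.values()), reverse=True)[:keep_days]
    return sorted(name for name, day in log_dates.items() if day not in recent)


if __name__ == "__main__":
    # Nine consecutive daily logs: the two oldest should be selected for deletion.
    start = date(2026, 2, 1)
    logs = {}
    for offset in range(9):
        day = start + timedelta(days=offset)
        logs[f"ostt-{day.isoformat()}.log"] = day
    print(logs_to_delete(logs))
```

Keeping the decision separate from the actual file deletion makes the rule easy to test, which is relevant given the later fix for logs that accumulated indefinitely.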
Changed
- CLI framework migration - Migrated from manual argument parsing to clap for better UX and maintainability
- ostt record now outputs to stdout by default (enables shell piping) instead of clipboard
- Audio player priority on Linux - Replay command now prefers mpv for better user experience (falls back to vlc, ffplay, paplay, xdg-open)
- Hyprland window rules syntax - Updated to new Hyprland window rule syntax with dynamic expressions and match: patterns (BREAKING CHANGE)
- Float script defaults to clipboard - ostt-float.sh now defaults to -c (clipboard) if no arguments provided; existing Hyprland configs continue to work
Fixed
- Transcribed text no longer includes leading/trailing whitespace added by transcription models
- Log rotation now properly removes old log files (previously accumulated indefinitely)
0.0.5 - 2025-12-27
Added
- Frequency spectrum visualization - Real-time FFT-based audio spectrum display (new default visualization)
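As a rough illustration of what an FFT-based spectrum display computes, the sketch below turns one frame of samples into per-bin magnitudes using a textbook recursive radix-2 FFT. The function names, the power-of-two frame length, and the lack of windowing or dB scaling are simplifications chosen for this example, not a description of OSTT's renderer.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("frame length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spectrum(samples):
    """Magnitude of each frequency bin below the Nyquist bin."""
    return [abs(c) for c in fft(samples)[: len(samples) // 2]]


if __name__ == "__main__":
    # A pure tone that completes 5 cycles in a 64-sample frame peaks in bin 5.
    frame = [math.sin(2 * math.pi * 5 * i / 64) for i in range(64)]
    bars = spectrum(frame)
    print(max(range(len(bars)), key=bars.__getitem__))
```

Each magnitude would map to the height of one bar; a real-time display repeats this per audio frame and typically groups bins to fit the terminal width.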
Changed
- Error message centering now accounts for multi-line text, centering entire message block vertically
Fixed
- Segmentation fault on macOS when listing audio devices with incompatible hardware
- Vertical centering of multi-line error messages
0.0.4 - 2025-12-05
Added
- DeepInfra provider with 2 models: Whisper Large V3 and Whisper Base
- Groq provider with 2 models: Whisper Large V3 and Whisper Large V3 Turbo
- Logging configuration documentation in README with RUST_LOG environment variable support
Changed
- Default log level changed from debug to info for cleaner log output
- Improved logging clarity: reduced redundant messages and moved verbose logs to DEBUG level
- Keywords UI input text now uses clean white color instead of yellow
- Help text on keywords and history screens now shows "esc/q exit" instead of "q quit"
- Suppressed redundant error message when canceling auth command
0.0.3 - 2025-12-02
Fixed
- Fixed ostt-float.sh script to correctly locate ostt binary when installed via package managers (Homebrew, AUR, shell installer)
- Fixed Hyprland hotkey binding syntax in documentation - added missing description parameter for bind command
Migration Notes
Linux users upgrading from v0.0.2:
- Update ~/.local/bin/ostt-float script: Manually update using the latest version
- Update Hyprland hotkey binding in ~/.config/hypr/hyprland.conf:
    - bind = SUPER, R, exec, bash ~/.local/bin/ostt-float
    + bind = SUPER, R, ostt, exec, bash ~/.local/bin/ostt-float
- Reload Hyprland config: hyprctl reload
0.0.2 - 2025-11-28
Added
- Initial public release
- Real-time audio recording with waveform visualization
- Speech-to-text transcription via OpenAI and Deepgram providers
- Transcription history browser with clipboard integration
- Keyword management for improved transcription accuracy
- Hyprland/Omarchy floating window integration
- Cross-platform support (Linux and macOS)
- Multiple installation methods (Homebrew, AUR, shell installer)