OSTT vs Hyprwhspr: Linux dictation or terminal-native speech-to-text?

Hyprwhspr is a polished Linux system-wide dictation project with Wayland-focused setup, visual feedback, local backends, model controls, and automatic paste into the active buffer. OSTT is the better fit when you want open speech-to-text that works as a developer tool: choose any provider, use local or cloud models, paste into apps, retry saved recordings, transcribe files, and process text with AI prompts or shell commands.

Short Answer

Choose OSTT when you want speech-to-text to behave like a Unix tool.

Hyprwhspr is designed around fast Linux dictation: press a hotkey, speak, stop, and text appears in the active buffer. OSTT can also paste into the focused app, but its bigger advantage is what happens around the transcript: model switching, retry, history, file transcription, stdout, clipboard, custom engines, provider params, deterministic replace rules, and AI or bash processing.

# Hotkey-friendly dictation into the focused app
ostt launch --paste

# Use a different model for one recording
ostt launch --paste -m openai/gpt-4o-transcribe

# Retry the same recording with a local model
ostt retry -m whisper/turbo

# Send transcript through an action before paste
ostt launch --paste -p clean

Feature Comparison

Hyprwhspr is a Linux dictation app. OSTT is a speech-to-text pipeline.

Capability	OSTT	Hyprwhspr
Open source	✅ MIT, Rust	✅ MIT, Python
Linux support	✅ Linux-first platform guides for Omarchy/Hyprland, GNOME, KDE, and other desktops	✅ Linux system-wide dictation focus with Wayland/systemd setup
macOS support	✅ macOS supported	❌ Public docs describe Linux support only
Focused app insertion	✅ `--paste` sends text to the focused app and can restore the previous clipboard	✅ Auto-paste into the active buffer is a core documented workflow
Cloud transcription providers	✅ OpenAI, Deepgram, Groq, DeepInfra, AssemblyAI, Berget, ElevenLabs, Mistral	✅ REST API and realtime WebSocket backends are documented for cloud-style integrations
Built-in local transcription	✅ Built-in Whisper-compatible local models	✅ Public docs describe Cohere Transcribe, Parakeet, Whisper, onnx-asr, and related local backends
External local engines	✅ `command/<profile>` and `http/<profile>` integrations for user-managed engines	✅ Broad backend setup is a core Hyprwhspr feature
Retry same recording with another model	✅ First-class `ostt retry -m PROVIDER/MODEL`	⚠️ Not the main documented workflow
File/stdout/shell workflows	✅ Core workflow: stdout, files, clipboard, paste, scripts, and processing actions	⚠️ Dictation-oriented, with command controls and capture workflows documented
Recording modes	⚠️ Terminal recorder, popup launcher, pause/resume, file transcription, retry, replay	✅ Public docs describe toggle, push-to-talk, auto, and long-form modes
Visual/audio feedback	⚠️ Terminal visualization and popup workflow	✅ Themed visualizer, notifications, audio cues, Waybar integration, and audio ducking are documented
Text cleanup	✅ Keywords, deterministic replace rules, AI actions, and bash actions	✅ Word overrides, prompts, filler handling, and symbol replacement are documented
Provider-neutral params	✅ `--param` validation and per-model config across cloud, local, command, and HTTP providers	⚠️ Backend-specific configuration rather than OSTT-style provider/model IDs

Use OSTT for provider choice

Switch between OpenAI, Deepgram, Groq, DeepInfra, AssemblyAI, Berget, ElevenLabs, Mistral, local Whisper, command engines, and HTTP endpoints.

Use OSTT for retry

Speak once, then compare models on the same saved audio with `ostt retry -m PROVIDER/MODEL`. This is useful for accents, technical vocabulary, and noisy rooms.
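
As a rough sketch of that comparison, the loop below re-runs the saved recording through each model it is given. It assumes that `ostt retry -m PROVIDER/MODEL` prints the transcript to stdout, which is not confirmed here. `compare_models` and the `OSTT_BIN` override are hypothetical names added for illustration.

```shell
#!/bin/sh
# Sketch: re-run the last saved recording through several models.
# Assumption: `ostt retry -m PROVIDER/MODEL` prints the transcript to stdout.
# OSTT_BIN is a hypothetical override so the function can be tried with a stub.
OSTT_BIN="${OSTT_BIN:-ostt}"

compare_models() {
    for model in "$@"; do
        # Label each transcript with the model that produced it.
        printf '== %s ==\n' "$model"
        # A failed provider should not stop the remaining comparisons.
        "$OSTT_BIN" retry -m "$model" || printf '(failed: %s)\n' "$model"
    done
}
```

Calling `compare_models openai/gpt-4o-transcribe whisper/turbo` would then print one labeled transcript per model, using the two model IDs shown earlier on this page.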

Use OSTT for files and scripts

Transcribe `meeting.mp3`, write `notes.md`, pipe stdout into other tools, or process history entries, so dictation is not limited to the active window.
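
To illustrate the stdout side of this, here is a minimal sketch of a helper that reads a transcript from stdin and appends it to a notes file as a bullet. It does not call OSTT itself. `append_note` is a hypothetical name, and the usage line afterwards assumes that running `ostt` without `--paste` or `-c` prints the transcript to stdout.

```shell
#!/bin/sh
# Sketch: append a transcript read from stdin to a notes file as one bullet.
# append_note is a hypothetical helper, not part of OSTT.
append_note() {
    notes_file="$1"
    # Read the whole transcript from stdin.
    transcript="$(cat)"
    if [ -z "$transcript" ]; then
        echo "append_note: empty transcript, nothing written" >&2
        return 1
    fi
    # Write the bullet via %s so text starting with '-' is never parsed as an option.
    printf '%s\n' "- $transcript" >> "$notes_file"
}
```

Under that stdout assumption, a dictated note could be captured with `ostt -m whisper/turbo | append_note notes.md`.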

Use OSTT for custom engines

Keep OSTT lean while calling your own Parakeet, faster-whisper, Speaches, LocalAI, Cohere Transcribe, or research-model wrapper.

Use OSTT for developer cleanup

Combine `ostt keyword`, `ostt replace`, and processing actions so transcripts spell product names, acronyms, code terms, and project vocabulary correctly.

Use Hyprwhspr for pure Linux dictation

If your priority is a Linux-only dictation daemon with a visualizer, Waybar integration, long-form modes, and broad local backend setup, Hyprwhspr is a strong choice.

Workflow Difference

OSTT turns dictation into reusable text operations.

1. Record: Use the terminal, a popup hotkey, or an existing audio file.

2. Choose: Select a cloud model, local Whisper model, command engine, or HTTP endpoint.

3. Transform: Apply replace rules, AI processing, or bash commands.

4. Route: Paste into the app, copy, write a file, print stdout, or retry with another model.
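
The Transform step can be pictured as a plain stdin-to-stdout filter. The sketch below is not OSTT's replace engine: it uses `sed` to stand in for deterministic replace rules, and it assumes a bash action receives the transcript on stdin and returns the result on stdout. `fix_vocab` and the sample misrecognitions are hypothetical.

```shell
#!/bin/sh
# Sketch: deterministic vocabulary fixes as a stdin-to-stdout filter.
# Assumption: the transcript arrives on stdin and the result goes to stdout.
# fix_vocab and the sample misrecognitions are hypothetical.
fix_vocab() {
    # Each -e rule rewrites one common misrecognition; rules apply in order.
    sed -e 's/[Hh]yper land/Hyprland/g' \
        -e 's/[Ww]ay bar/Waybar/g' \
        -e 's/[Ww]hisper turbo/whisper\/turbo/g'
}
```

Because the filter only reads stdin and writes stdout, it can be checked without a microphone, for example with `printf 'open hyper land\n' | fix_vocab`.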

Model Choice

Local when you need privacy. Cloud when you need hosted quality. External when you want experiments.

OSTT does not ask you to bet everything on one backend. Start with a hosted model, use local Whisper for sensitive work, add Berget for Swedish and EU-focused transcription, or connect a custom local HTTP endpoint when you want newer ASR engines.

# Pick a provider interactively
ostt model

# Use OpenAI for one run
ostt -m openai/gpt-4o-transcribe --paste

# Use Berget for Swedish
ostt -m berget/KBLab/kb-whisper-large --param language=sv -c

# Use a custom local HTTP engine
ostt -m http/speaches --paste

Try OSTT as your Hyprwhspr alternative.
