dabri

Version: v2.0.1
Published: May 18, 2026 License: MIT Imports: 0 Imported by: 0

README

Dabri

🗣️ Linux speech-to-text, the Unix way

Dabri is a minimalist, privacy-first application for offline voice recognition that types directly into any active window (editors, browsers, IDEs, AI assistants).

Written in pure Go, it leverages whisper.cpp for fast, offline transcription. The architecture is built from the ground up with a minimal set of dependencies and a custom dependency injection factory, keeping the design lean and maintainable.

Features

Daemon: Dabri runs quietly in the background and integrates into the system tray for convenient management.

Terminal: it can also be invoked as a CLI tool (see CLI Usage Guide) for scripting purposes.

▸ For integration enthusiasts, a WebSocket server is available at localhost:8080. Enable it in your config with web_server.enabled: true (disabled by default).

  • Offline speech-to-text, privacy-first: all processing happens locally
  • Portable: AppImage package
  • Cross-platform support for X11 and Wayland
  • Linux DEs: native integration with GNOME, KDE, and others
  • GPU + CPU support: Vulkan backend for faster transcription (auto-fallback to CPU)
  • Voice typing or clipboard mode
  • Flexible audio recording: arecord (ALSA) or ffmpeg (PulseAudio/PipeWire), see audio pipeline
  • Multi-language support, custom hotkey binding, visual notifications
  • Model management: switch between base, small, medium, large-v3-turbo, and large-v3 whisper models via tray or CLI

Beyond Minimalism

Intuitive minimalist UX, robust STT infrastructure. A foundation for voice-controlled automation:

  • Dual API: Unix socket IPC + WebSocket — script locally or integrate remotely
  • Interface-driven: focused contracts — swap STT engines, add I/O methods, extend hotkey providers
  • Daemon + CLI: background hub + stateless commands — perfect for IoT pipelines
  • Graceful degradation: provider fallbacks, optional components, no crashes

# Voice command → smart home action
transcript=$(dabri --json stop | jq -r '.data.transcript')
[[ "$transcript" == *"lights off"* ]] && curl -X POST http://hub/lights/off
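
The same idea can be sketched in Go for anyone who would rather not shell out to jq. This is a minimal sketch, not part of Dabri: it assumes only that `dabri --json stop` prints JSON with a `data.transcript` field, as the jq filter above implies. The names `stopResponse`, `parseTranscript`, `matchCommand`, and `readStopOutput`, and the phrase list, are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
	"strings"
)

// stopResponse mirrors only the field the shell example reads with jq
// ('.data.transcript'). Any other fields in the real output are ignored.
type stopResponse struct {
	Data struct {
		Transcript string `json:"transcript"`
	} `json:"data"`
}

// parseTranscript extracts data.transcript from the JSON printed by
// `dabri --json stop`.
func parseTranscript(raw []byte) (string, error) {
	var resp stopResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return "", fmt.Errorf("decode dabri output: %w", err)
	}
	return resp.Data.Transcript, nil
}

// matchCommand returns the first phrase found in the transcript
// (case-insensitive), or "" if none matches.
func matchCommand(transcript string, phrases []string) string {
	lower := strings.ToLower(transcript)
	for _, p := range phrases {
		if strings.Contains(lower, strings.ToLower(p)) {
			return p
		}
	}
	return ""
}

// readStopOutput runs the CLI exactly as the shell example does.
// It is not called from main so the sketch runs without Dabri installed.
func readStopOutput() ([]byte, error) {
	return exec.Command("dabri", "--json", "stop").Output()
}

func main() {
	// Sample payload for illustration; in real use, call readStopOutput().
	sample := []byte(`{"data":{"transcript":"Turn the lights off please"}}`)
	transcript, err := parseTranscript(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if cmd := matchCommand(transcript, []string{"lights on", "lights off"}); cmd != "" {
		fmt.Println("matched command:", cmd)
	}
}
```

Keeping the parsing and matching separate from the process call makes the logic easy to test with canned JSON before wiring it to a real action.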

✦ Installation

AppImage

Download the latest AppImage from Releases:

# Download the file, then:
chmod +x dabri-*.AppImage
# Ensure user is in input group for hotkeys to work:
sudo usermod -a -G input $USER
# then log out and back in, or reboot
# Open via GUI or with terminal command:
./dabri-*.AppImage

Arch Linux AUR:

yay -S dabri
# Ensure user is in input group:
sudo usermod -a -G input $USER

Fedora COPR:

sudo dnf copr enable ashbuk/dabri
sudo dnf install dabri
# Ensure user is in input group:
sudo usermod -a -G input $USER

Desktop Environment Compatibility

📋 Desktop Environment Support Guide - help us test different desktop environments!

For system tray integration on GNOME — install the AppIndicator extension

KDE and other DEs have built-in system tray support out of the box

For automatic typing on GNOME — see setup guide

Other Wayland compositors (KDE, Hyprland, Sway, etc.): wtype works without setup — automatically detected!
X11: Native support with xdotool out of the box

If automatic typing is unavailable, the app falls back to clipboard mode (paste with Ctrl + V)

For issues and bug reports: GitHub Issues

See changes: Releases

System Requirements

| Category | Requirement |
|----------|-------------|
| OS | Linux with glibc 2.35+ |
| Desktop | X11 or Wayland |
| Audio | Microphone capability |
| Storage | ~54MB + model (57MB–1.1GB) |
| Memory | ~300MB RAM |
| CPU | AVX-capable (Intel/AMD 2011+) |
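
To check the CPU requirement on a given machine, one option is to look for the `avx` flag in `/proc/cpuinfo`. The sketch below is illustrative and not part of Dabri; the function name `hasCPUFlag` is hypothetical, and it assumes the x86 Linux layout in which each processor entry has a `flags` line.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// hasCPUFlag reports whether any "flags" line in /proc/cpuinfo-style text
// lists the given flag as a whole word (so "avx2" does not count as "avx").
func hasCPUFlag(cpuinfo, flag string) bool {
	sc := bufio.NewScanner(strings.NewReader(cpuinfo))
	for sc.Scan() {
		key, value, ok := strings.Cut(sc.Text(), ":")
		if !ok || strings.TrimSpace(key) != "flags" {
			continue
		}
		for _, f := range strings.Fields(value) {
			if f == flag {
				return true
			}
		}
	}
	return false
}

func main() {
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Println("could not read /proc/cpuinfo:", err)
		return
	}
	fmt.Println("AVX supported:", hasCPUFlag(string(data), "avx"))
}
```
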

📋 Supported Distributions

| Family | Distributions |
|--------|---------------|
| Ubuntu-based | Ubuntu 22.04+, Linux Mint 21+, Pop!_OS 22.04+, Elementary OS 7+, Zorin OS 17+ |
| Debian-based | Debian 12+ |
| Fedora | Fedora 36+ |
| Rolling release | Arch Linux, Manjaro, EndeavourOS, openSUSE Tumbleweed |

For Developers

Start onboarding with:

Technical dive into architecture and engineering challenges: Building Dabri on Hashnode

✦ Acknowledgments

  • whisper.cpp for the excellent C++ implementation of OpenAI Whisper
  • fyne.io/systray for cross-platform system tray support
  • ydotool and wtype for Wayland-compatible input automation
  • OpenAI for the original Whisper model

✦ MIT LICENSE

If you use this project, please link back to this repo and ⭐ it if it helped you.

  • Consider contributing back improvements

Sharing with the community for privacy-conscious Linux users


Sponsor

Please consider supporting development

Documentation

Overview

Package dabri provides a high-level overview of the Dabri project.

Dabri is a minimalist, privacy-focused desktop application written in Go that converts speech to text offline using local Whisper models.

Dual-mode architecture:

  • Daemon mode: Background service with system tray integration for GUI usage
  • CLI mode: Command-line interface for scripting and tiling window managers

Core responsibilities:

  • Global hotkeys using DBus GlobalShortcuts portal (primary) or evdev (fallback)
  • Audio recording via arecord/ffmpeg backends
  • Local transcription using go-whisper (whisper.cpp)
  • Text output routing: clipboard, active window typing, or combined
  • X11 and Wayland support with smart tool selection (xdotool, wtype, ydotool)
  • IPC communication via Unix socket for low-latency CLI operations
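
The tool selection and clipboard fallback described above can be sketched roughly as follows. This is an assumption-laden illustration, not Dabri's actual code: the function `chooseTypingTool`, the candidate order on Wayland (wtype before ydotool), and the use of the `XDG_SESSION_TYPE` environment variable to detect the session are all guesses made for the example.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// chooseTypingTool picks an input tool for the session type and returns
// "clipboard" when no candidate is available. The availability check is
// passed in so the logic can be exercised without touching the system.
func chooseTypingTool(sessionType string, available func(name string) bool) string {
	var candidates []string
	switch sessionType {
	case "x11":
		candidates = []string{"xdotool"}
	case "wayland":
		// Assumed order: try wtype first, then ydotool.
		candidates = []string{"wtype", "ydotool"}
	}
	for _, tool := range candidates {
		if available(tool) {
			return tool
		}
	}
	return "clipboard"
}

// inPath reports whether an executable with this name is on PATH.
func inPath(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

func main() {
	session := os.Getenv("XDG_SESSION_TYPE")
	fmt.Printf("session=%q, output via %s\n", session, chooseTypingTool(session, inPath))
}
```

Note that a tool being installed does not guarantee it works on every compositor, which is why a real implementation would likely also need a runtime fallback to clipboard mode.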

Optional WebSocket API:

  • Real-time speech-to-text API for external clients
  • Enabled via config: web_server.enabled: true (default: false)
  • Endpoint: ws://localhost:8080/ws (or /api/v1/ws)
  • Supports authentication, CORS, and connection limits

Packaging:

  • AppImage package with first-run configuration and model copy

Testing strategy:

  • Unit tests colocated with packages (default go test ./...)
  • Integration tests in tests/integration (run with -tags=integration)

For more details, see docs/

Directories

Path Synopsis
Package audio provides a high-level facade for audio recording functionality.
cmd
dabri command
Package config provides configuration management functionality with support for multiple configuration formats, validation, and security features.
Package hotkeys provides a high-level facade for hotkey management. It abstracts the underlying implementation of providers and event handling.
internal
app
ipc
Package output provides a high-level facade for text output functionality. It abstracts the underlying implementation of clipboard and typing operations.
tests
integration
Package integration contains integration tests that are built with the "integration" build tag.
Provides a high-level facade for interacting with the speech-to-text functionality.