Expert guidance for developing and maintaining the VoiceBridge project - a bidirectional voice-text CLI tool with professional-grade accuracy, GPU acceleration, and hotkey-driven workflows. Covers speech-to-text, text-to-speech, audio processing, GPU acceleration, and hexagonal architecture patterns.
VoiceBridge bridges speech and text seamlessly using OpenAI's Whisper (STT) and VibeVoice (TTS) with advanced features including GPU acceleration, memory optimization, streaming, resume capability, audio processing, and hexagonal architecture.
**CRITICAL**: This project uses `uv` for fast Python package management with a virtual environment at `.venv/`. ALWAYS use Makefile commands or `uv run` for operations.
When setting up or working with the project:
1. Check whether `.venv/` exists; if not, run `make prepare`
2. For CUDA support: `make prepare-cuda`
3. For system tray support: `make prepare-tray`
4. NEVER use `pip` or `python` directly - ALWAYS use `uv run` or `.venv/bin/python`
Example commands:
```bash
uv run python -m voicebridge --help
uv run pytest
uv run ruff check --fix .
uv pip install package-name
```
VoiceBridge follows the **hexagonal architecture** (ports and adapters) pattern.
When adding features:
1. Define models in `domain/models.py`
2. Define interfaces in `ports/interfaces.py`
3. Implement adapters in `adapters/`
4. Orchestrate in `services/`
5. Expose via `cli/app.py` and `cli/commands.py`
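The layering above can be sketched as follows. This is an illustrative example only; names like `TranscriptionPort` and `FakeWhisperAdapter` are hypothetical, not VoiceBridge's actual identifiers:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


# 1. Domain model (domain/models.py)
@dataclass
class Transcript:
    text: str
    confidence: float


# 2. Port: the interface the core depends on (ports/interfaces.py)
class TranscriptionPort(ABC):
    @abstractmethod
    def transcribe(self, audio_path: str) -> Transcript: ...


# 3. Adapter: a concrete implementation (adapters/)
class FakeWhisperAdapter(TranscriptionPort):
    def transcribe(self, audio_path: str) -> Transcript:
        return Transcript(text=f"transcribed {audio_path}", confidence=0.9)


# 4. Service: orchestrates via the port, knows nothing about adapters (services/)
class TranscriptionService:
    def __init__(self, engine: TranscriptionPort):
        self.engine = engine

    def run(self, audio_path: str) -> str:
        return self.engine.transcribe(audio_path).text


service = TranscriptionService(FakeWhisperAdapter())
print(service.run("audio.mp3"))  # transcribed audio.mp3
```

The key property: the service depends only on the port, so adapters (Whisper, a mock, a remote API) can be swapped without touching business logic.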
Before any code changes:
1. Run `make lint` to check code style
2. Review existing tests in `tests/`
3. Follow hexagonal architecture patterns
While developing:
1. Use type hints for all public interfaces
2. Add comprehensive docstrings
3. Write tests for new functionality
4. Run `make lint` frequently to auto-fix style issues
Before committing:
1. Ensure `make lint` passes
2. Ensure `make test` passes with adequate coverage
3. Update CLAUDE.md if architecture changes
#### Speech-to-Text (STT)
Commands:
```bash
uv run python -m voicebridge listen
uv run python -m voicebridge transcribe audio.mp3 --output transcript.txt
uv run python -m voicebridge batch-transcribe /path/to/audio/ --workers 4
uv run python -m voicebridge listen-resumable audio.wav --session-name "my-session"
```
#### Text-to-Speech (TTS)
Commands:
```bash
uv run python -m voicebridge tts generate "Hello, VoiceBridge!" --voice en-Alice_woman
uv run python -m voicebridge tts listen-clipboard --streaming
uv run python -m voicebridge tts daemon start --mode clipboard
uv run python -m voicebridge tts voices
```
#### Audio Processing
Commands:
```bash
uv run python -m voicebridge audio info audio.mp3
uv run python -m voicebridge audio preprocess input.wav output.wav --noise-reduction 0.8
uv run python -m voicebridge audio split large_file.mp3 --method duration --chunk-duration 300
```
#### Performance & Monitoring
Commands:
```bash
uv run python -m voicebridge gpu status
uv run python -m voicebridge performance stats
uv run python -m voicebridge sessions list
uv run python -m voicebridge confidence analyze <session-id>
```
#### Configuration & Profiles
Commands:
```bash
uv run python -m voicebridge config --show
uv run python -m voicebridge profile save my-profile
uv run python -m voicebridge tts config set --default-voice en-Alice_woman
```
When adding functionality:
1. Write unit tests in `tests/` following existing patterns
2. Test both STT and TTS functionality if applicable
3. Mock external dependencies (GPU, audio devices, file I/O)
4. Run `make test` to verify coverage
5. Run `make test-fast` for quick iteration
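Mocking external dependencies might look like the sketch below. The service class and its signature are assumptions for illustration, not the project's real API; only the mocking pattern is the point:

```python
from unittest.mock import MagicMock


def test_transcription_uses_engine_without_touching_hardware():
    # Mock the port so no real GPU, audio device, or file I/O is needed.
    engine = MagicMock()
    engine.transcribe.return_value = MagicMock(text="hello")

    # Hypothetical service that depends only on the port interface.
    class TranscriptionService:
        def __init__(self, engine):
            self.engine = engine

        def run(self, path):
            return self.engine.transcribe(path).text

    service = TranscriptionService(engine)
    assert service.run("clip.wav") == "hello"
    engine.transcribe.assert_called_once_with("clip.wav")


test_transcription_uses_engine_without_touching_hardware()
```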
**Add new CLI command:**
1. Define command in `cli/commands.py`
2. Register in `cli/app.py`
3. Implement service logic in `services/`
4. Add adapter if external integration needed
5. Write tests
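The command/service split can be sketched with stdlib `argparse`. VoiceBridge's real CLI layer in `cli/app.py` and `cli/commands.py` may use a different framework, so treat every name here as illustrative only:

```python
import argparse


# services/: business logic, no CLI concerns
def summarize_session(session_id: str) -> str:
    return f"summary of {session_id}"


# cli/commands.py: the command delegates to the service
def cmd_summarize(args: argparse.Namespace) -> str:
    return summarize_session(args.session_id)


# cli/app.py: registration
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="voicebridge")
    sub = parser.add_subparsers(dest="command", required=True)
    p = sub.add_parser("summarize", help="Summarize a session")
    p.add_argument("session_id")
    p.set_defaults(func=cmd_summarize)
    return parser


args = build_parser().parse_args(["summarize", "abc123"])
print(args.func(args))  # summary of abc123
```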
**Add new TTS voice:**
1. Prepare 3-10s WAV sample at 24kHz
2. Name: `language-name_gender.wav` (e.g., `en-Alice_woman.wav`)
3. Place in `demo/voices/` or configured directory
4. Verify with: `uv run python -m voicebridge tts voices`
**Optimize GPU usage:**
1. Check current status: `uv run python -m voicebridge gpu status`
2. Benchmark: `uv run python -m voicebridge gpu benchmark --model base`
3. Adjust config: `uv run python -m voicebridge config --set-key use_gpu --value true`
**Debug transcription issues:**
1. Check audio info: `uv run python -m voicebridge audio info <file>`
2. Preprocess if needed: `uv run python -m voicebridge audio preprocess`
3. Review confidence: `uv run python -m voicebridge confidence analyze <session-id>`
4. Check session logs in `sessions/`
**Setting up for development:**
```bash
git clone <repo>
cd VoiceBridge
make prepare
make test
uv run python -m voicebridge hotkey --key f9 --mode toggle
```
**Adding a new export format:**
1. Add format type to `domain/models.py`
2. Implement converter in `services/export_service.py`
3. Add CLI option in `cli/commands.py`
4. Test with: `uv run python -m voicebridge export session <id> --format <new-format>`
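The extension points might fit together like this. `ExportFormat` and the converter signature are hypothetical stand-ins for whatever `domain/models.py` and `services/export_service.py` actually define:

```python
import json
from enum import Enum


# 1. domain/models.py: the format type
class ExportFormat(Enum):
    TXT = "txt"
    SRT = "srt"
    JSON = "json"  # the new format being added


# 2. services/export_service.py: one converter per format
def export_segments(segments: list[dict], fmt: ExportFormat) -> str:
    if fmt is ExportFormat.TXT:
        return "\n".join(s["text"] for s in segments)
    if fmt is ExportFormat.SRT:
        blocks = []
        for i, s in enumerate(segments, 1):
            blocks.append(f"{i}\n{s['start']} --> {s['end']}\n{s['text']}\n")
        return "\n".join(blocks)
    if fmt is ExportFormat.JSON:
        return json.dumps(segments)
    raise ValueError(f"unsupported format: {fmt}")


segments = [{"start": "00:00:00", "end": "00:00:02", "text": "hello"}]
print(export_segments(segments, ExportFormat.TXT))  # hello
```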
**Implementing custom audio processor:**
1. Define interface in `ports/interfaces.py`
2. Implement adapter in `adapters/audio/`
3. Integrate in `services/transcription_service.py`
4. Add configuration option
5. Write comprehensive tests
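As a concrete (but hypothetical) example of steps 1-2, a noise-gate processor behind a port; `AudioProcessor` and the sample representation are illustrative, not VoiceBridge's actual interface:

```python
from abc import ABC, abstractmethod


# ports/interfaces.py: the contract the service depends on
class AudioProcessor(ABC):
    @abstractmethod
    def process(self, samples: list[float]) -> list[float]: ...


# adapters/audio/: a concrete implementation, e.g. a simple noise gate
class NoiseGateProcessor(AudioProcessor):
    def __init__(self, threshold: float = 0.05):
        self.threshold = threshold

    def process(self, samples: list[float]) -> list[float]:
        # Zero out samples whose amplitude falls below the threshold.
        return [s if abs(s) >= self.threshold else 0.0 for s in samples]


# services/transcription_service.py would apply the processor before transcribing
gate = NoiseGateProcessor(threshold=0.1)
print(gate.process([0.02, 0.5, -0.03, -0.8]))  # [0.0, 0.5, 0.0, -0.8]
```

Because the service only sees the `AudioProcessor` port, processors can be chained or swapped via configuration without changing transcription logic.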