VoiceBridge Project Assistant

Expert assistant for working with the VoiceBridge project, a bidirectional voice-text CLI tool that uses OpenAI's Whisper for speech-to-text (STT) and VibeVoice for text-to-speech (TTS).

What This Skill Does

This skill provides specialized guidance for developing and maintaining the VoiceBridge project, including:

Speech-to-Text (STT) features with GPU acceleration and real-time transcription

Text-to-Speech (TTS) synthesis with custom voice management

Audio processing, enhancement, and format conversion

Hexagonal architecture maintenance and development

Python package management using `uv`

Performance optimization and monitoring

Session management and resume capabilities

Instructions

1. Environment Setup

**CRITICAL**: This project uses `uv` for fast Python package management and a virtual environment at `.venv/`. Always use Makefile commands or `uv run` for operations.

When setting up or working with the project:

1. **Initialize environment**: Use `make prepare` to set up `.venv` and install dependencies

2. **CUDA support**: Use `make prepare-cuda` for GPU acceleration

3. **System tray support**: Use `make prepare-tray` if needed

4. **Manual commands**: Prefix Python tools with `uv run` (e.g., `uv run pytest`) so they execute inside `.venv`

2. Architecture Awareness

Follow the hexagonal architecture pattern:

```
voicebridge/
├── domain/     # Core business logic and models
├── ports/      # Interfaces/abstract base classes
├── adapters/   # External integrations (audio, system, transcription, TTS)
├── services/   # Application services
├── cli/        # Command line interface
└── tests/      # Test suite
```

When adding features:

Define models in `domain/models.py`

Create interfaces in `ports/interfaces.py`

Implement adapters in `adapters/`

Orchestrate with services in `services/`

Expose via CLI in `cli/`
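
The layering above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: `Transcript`, `TranscriptionPort`, `EchoTranscriber`, and `TranscriptionService` are hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


# domain/models.py: plain data, no infrastructure imports (hypothetical model)
@dataclass(frozen=True)
class Transcript:
    text: str
    language: str


# ports/interfaces.py: the interface the service depends on (hypothetical port)
class TranscriptionPort(ABC):
    @abstractmethod
    def transcribe(self, audio: bytes) -> Transcript:
        """Turn raw audio bytes into a Transcript."""


# adapters/: a stand-in adapter; a real one would wrap Whisper
class EchoTranscriber(TranscriptionPort):
    def transcribe(self, audio: bytes) -> Transcript:
        return Transcript(text=f"<{len(audio)} bytes of audio>", language="en")


# services/: orchestration that only knows about the port
class TranscriptionService:
    def __init__(self, transcriber: TranscriptionPort) -> None:
        self._transcriber = transcriber

    def transcribe_upper(self, audio: bytes) -> str:
        return self._transcriber.transcribe(audio).text.upper()


# cli/: the composition root wires a concrete adapter into the service
service = TranscriptionService(EchoTranscriber())
result = service.transcribe_upper(b"\x00\x01\x02")
```

Because the service receives its adapter through the constructor, tests can pass in a fake adapter and the domain code never imports Whisper or VibeVoice directly.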

3. Development Workflow

Before making changes:

1. **Understand the codebase**: Review relevant files in the hexagonal structure

2. **Check existing tests**: Look at `tests/` for patterns

3. **Lint frequently**: Run `make lint` during development

4. **Test comprehensively**: Run `make test` before committing

When implementing new features:

1. **STT features**: Extend `transcription_service.py` and `adapters/transcription.py`

2. **TTS features**: Extend `tts_service.py` and `adapters/vibevoice_tts.py`

3. **Audio processing**: Modify `adapters/audio/` modules

4. **CLI commands**: Add to `cli/commands.py` with Typer

5. **Performance**: Integrate with `performance_service.py` for metrics

4. Key Commands Reference

Development commands:

`make help` - Show all available commands

`make prepare` - Initialize environment

`make lint` - Run ruff linting with auto-fix

`make test` - Run tests with coverage

`make test-fast` - Quick test run

`make clean` - Clean cache and virtual environment

Manual operations (with uv):

`uv run ruff check --fix .` - Linting

`uv run pytest` - Testing

`uv run python -m voicebridge --help` - Run CLI

`uv pip install package-name` - Install packages

5. Testing Standards

When adding or modifying code:

1. **Write comprehensive tests**: Cover both STT and TTS functionality

2. **Test both success and failure paths**: Include edge cases

3. **Mock external dependencies**: Use pytest fixtures for Whisper/VibeVoice

4. **Verify GPU code paths**: Test CUDA/Metal detection and fallback

5. **Check memory limits**: Test chunking and streaming behavior

Run tests with: `make test` (full coverage) or `make test-fast` (quick validation)
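
As a rough sketch of points 2 and 3, the tests below replace a speech engine with a `unittest.mock.Mock` and cover one success path and one failure path. `transcribe_with_fallback` is a hypothetical helper written for this example, and the mocks are built inline to keep the sketch standard-library only; in the real suite they would more likely live in pytest fixtures.

```python
from unittest.mock import Mock


def transcribe_with_fallback(primary, fallback, audio: bytes) -> str:
    """Try the primary engine; use the fallback if it raises RuntimeError."""
    try:
        return primary.transcribe(audio)
    except RuntimeError:
        return fallback.transcribe(audio)


def test_success_path_uses_primary():
    primary = Mock()
    primary.transcribe.return_value = "hello"
    fallback = Mock()
    assert transcribe_with_fallback(primary, fallback, b"audio") == "hello"
    # The fallback engine must stay untouched when the primary succeeds.
    fallback.transcribe.assert_not_called()


def test_failure_path_uses_fallback():
    primary = Mock()
    primary.transcribe.side_effect = RuntimeError("CUDA out of memory")
    fallback = Mock()
    fallback.transcribe.return_value = "hello from cpu"
    assert transcribe_with_fallback(primary, fallback, b"audio") == "hello from cpu"
    primary.transcribe.assert_called_once_with(b"audio")
```

pytest collects the `test_` functions by name, so no model weights or GPU are needed to run them.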

6. Code Standards

Follow these requirements:

**Python Version**: 3.10+ (for modern type-hint syntax such as `X | Y` unions, and async support)

**Type Hints**: Required for all public interfaces

**Linting**: Use ruff with auto-fix (`make lint`)

**Architecture**: Maintain hexagonal/ports & adapters pattern

**Documentation**: Update docstrings and CLAUDE.md for significant changes

7. System Integration

Be aware of system requirements:

**FFmpeg**: Required for audio format conversion

**GPU Support**: Auto-detect CUDA (NVIDIA) and Metal (Apple Silicon)

**Audio Libraries**: pygame, pyaudio for playback

**Input Handling**: pyperclip, pynput for clipboard and hotkeys

When modifying GPU code:

Update `adapters/system.py` for device detection

Test fallback to CPU when GPU unavailable

Verify memory optimization for large files
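
One way to keep the CPU fallback testable is to inject the availability probes instead of calling the GPU library directly. The sketch below assumes that design; `select_device` and the device labels are illustrative and are not taken from `adapters/system.py`.

```python
from typing import Callable


def select_device(
    cuda_available: Callable[[], bool],
    metal_available: Callable[[], bool],
) -> str:
    """Return "cuda", "mps", or "cpu", treating a failing probe as unavailable."""
    for name, probe in (("cuda", cuda_available), ("mps", metal_available)):
        try:
            if probe():
                return name
        except Exception:
            # A broken driver or missing library should not crash startup.
            continue
    return "cpu"


def broken_probe() -> bool:
    """Simulates a probe that fails, e.g. because a driver is missing."""
    raise OSError("driver not loaded")


print(select_device(lambda: False, lambda: True))
print(select_device(broken_probe, lambda: False))
```

If the backend is PyTorch (an assumption; the source does not say), the probes could be `torch.cuda.is_available` and `torch.backends.mps.is_available`, while tests pass simple lambdas as shown.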

8. Configuration Management

Configuration locations:

Main config: `~/.config/voicebridge/`

Session files: Local `sessions/` directory

TTS voice samples: `demo/voices/` or configured directory

Profiles: Saved in config directory

When adding configuration options:

Update `adapters/config.py`

Add CLI commands in `cli/commands.py`

Document in help text and CLAUDE.md

9. TTS Voice Setup

VibeVoice voice sample requirements:

**Format**: WAV files, 3-10 seconds duration

**Sample rate**: 24kHz recommended

**Naming**: `language-name_gender.wav` (e.g., `en-Alice_woman.wav`)

**Location**: `demo/voices/` or user-configured directory

For TTS setup assistance: Reference `setup_tts.py` for guided configuration
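
The requirements above can be checked with the standard-library `wave` module before a sample is handed to VibeVoice. This is a hedged sketch: `check_voice_sample` and the exact filename pattern are assumptions made for illustration, not part of `setup_tts.py`.

```python
import re
import wave
from pathlib import Path

# Pattern for `language-name_gender.wav`, e.g. `en-Alice_woman.wav`.
# The exact character classes are an assumption.
VOICE_NAME = re.compile(r"^[a-z]{2,3}-[A-Za-z0-9]+_[a-z]+\.wav$")


def check_voice_sample(path: Path) -> list[str]:
    """Return a list of problems; an empty list means the sample looks usable."""
    problems = []
    if not VOICE_NAME.match(path.name):
        problems.append("name does not match language-name_gender.wav")
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    if not 3.0 <= duration <= 10.0:
        problems.append(f"duration {duration:.1f}s is outside 3-10s")
    if rate != 24_000:
        # 24 kHz is recommended rather than required, so callers may
        # choose to treat this entry as a warning.
        problems.append(f"sample rate {rate} Hz (24000 Hz recommended)")
    return problems
```

Returning a list of problems rather than raising on the first one lets a CLI command report every issue with a sample in a single pass.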

10. Common Development Tasks

**Adding a new CLI command**:

1. Define command function in `cli/commands.py`

2. Use Typer decorators for arguments/options

3. Call appropriate service methods

4. Add tests in `tests/cli/`

5. Update help documentation

**Implementing a new export format**:

1. Add format to `services/export_service.py`

2. Update `ExportFormat` enum in `domain/models.py`

3. Implement conversion logic

4. Add CLI option in export commands

5. Test with various transcription results
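
The steps above might look like the following for a subtitle format. Only the `ExportFormat` name comes from the project; its members, the `(start, end, text)` segment shape, and the `export` function are assumptions made for this sketch.

```python
from enum import Enum


class ExportFormat(Enum):
    # Members are illustrative; the real enum lives in domain/models.py.
    TXT = "txt"
    SRT = "srt"


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (the SubRip timestamp layout)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def export(segments: list[tuple[float, float, str]], fmt: ExportFormat) -> str:
    """Render (start, end, text) segments in the requested format."""
    if fmt is ExportFormat.TXT:
        return "\n".join(text for _, _, text in segments)
    if fmt is ExportFormat.SRT:
        blocks = [
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}"
            for i, (start, end, text) in enumerate(segments, start=1)
        ]
        return "\n\n".join(blocks) + "\n"
    raise ValueError(f"unsupported format: {fmt}")
```

Raising on an unhandled member means that adding a value to the enum without conversion logic fails loudly in tests instead of silently producing empty output.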

**Adding audio processing feature**:

1. Implement in `adapters/audio/processor.py`

2. Expose via `services/transcription_service.py` or new service

3. Add CLI command if user-facing

4. Test with various audio formats

5. Document parameters and behavior

Example Usage

When a user asks to:

**"Add support for a new audio format"**:

1. Check `adapters/audio/formats.py` for supported formats

2. Verify FFmpeg support for the format

3. Update format detection in `adapters/audio/processor.py`

4. Add format to documentation

5. Test conversion and processing

6. Run `make lint` and `make test`
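
Steps 1 to 3 can be prototyped with the standard library alone. The extension set and both helper names below are placeholders for illustration; the project's real list lives in `adapters/audio/formats.py`.

```python
import shutil
from pathlib import Path

# Illustrative set; the project's real list is in adapters/audio/formats.py.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg"}


def ffmpeg_available() -> bool:
    """True if an `ffmpeg` executable is on PATH."""
    return shutil.which("ffmpeg") is not None


def needs_conversion(path: Path) -> bool:
    """WAV is read directly in this sketch; everything else goes through FFmpeg."""
    suffix = path.suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {suffix or '(none)'}")
    return suffix != ".wav"
```

A caller would check `ffmpeg_available()` only when `needs_conversion()` returns `True`, so plain WAV input keeps working on machines without FFmpeg.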

**"Improve GPU memory handling"**:

1. Review `adapters/system.py` for GPU detection

2. Check `services/transcription_service.py` for chunking logic

3. Implement memory monitoring improvements

4. Test with various file sizes and GPU configurations

5. Verify CPU fallback behavior

6. Update performance metrics collection
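
Chunking is the usual lever for bounding memory on long recordings. The generator below is a minimal sketch of fixed-size chunking with optional overlap; `chunk_bounds` is a hypothetical helper, and whether the real `transcription_service.py` overlaps its chunks is not stated in the source.

```python
from collections.abc import Iterator


def chunk_bounds(total: int, size: int, overlap: int = 0) -> Iterator[tuple[int, int]]:
    """Yield (start, end) sample indices covering `total` samples."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    start = 0
    while start < total:
        end = min(start + size, total)
        yield start, end
        if end == total:
            break
        # Step back by `overlap` so words cut at a boundary appear in both chunks.
        start = end - overlap


print(list(chunk_bounds(10, 4, overlap=1)))
```

Yielding index pairs instead of audio slices keeps only one chunk in memory at a time, and the pure function is easy to test across file sizes without a GPU.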

**"Add a new TTS voice management feature"**:

1. Review `adapters/vibevoice_tts.py` for voice handling

2. Implement feature in `services/tts_service.py`

3. Add CLI command in `cli/commands.py`

4. Update voice sample detection if needed

5. Test with various voice samples

6. Document usage and voice requirements

Important Notes

**Always use `uv run`** for Python commands—never run directly from system Python

**Follow hexagonal architecture**—keep domain logic separate from infrastructure

**Test both STT and TTS paths**—features often interact across both systems

**Verify GPU code**—test CUDA, Metal, and CPU fallback paths

**Document hotkeys**—F9 (STT), F12 (TTS), Ctrl+Alt+S (TTS stop) are defaults

**Check FFmpeg dependency**—many audio features require it
