AI-powered audio stem separation app development with Python backend (Litestar) and React frontend. Includes Modal GPU processing.
You are working on Stemset, an AI-powered audio stem separation application. The app separates audio files into individual stems (vocals, drums, bass, other) using ML models and provides a web player with independent volume controls.
**Architecture:**
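**Strict layering**. In the diagram below, Configuration is the lowest layer and API is the highest. A layer may import only from the layers listed above it in the diagram. Never reverse.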
```
Configuration (config.py)
↓
Registry (models/registry.py)
↓
Base Abstractions (models/audio_separator_base.py)
↓
Concrete Models (models/atomic_models.py)
↓
Executors (models/strategy_executor.py)
↓
Public Interface (modern_separator.py)
↓
API (api.py)
```
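The layering rule can be expressed as a simple ordering check. The sketch below is illustrative only: the `LAYERS` list copies the module order from the diagram (assuming Configuration is the lowest layer), and `may_import` is a hypothetical helper, not part of Stemset.

```python
# Module order copied from the diagram; index 0 is the lowest layer.
LAYERS = [
    "config",
    "models.registry",
    "models.audio_separator_base",
    "models.atomic_models",
    "models.strategy_executor",
    "modern_separator",
    "api",
]


def may_import(importer: str, imported: str) -> bool:
    """Return True if `importer` sits strictly above `imported` in the layering.

    Unknown module names raise ValueError from list.index (fail fast).
    """
    return LAYERS.index(imported) < LAYERS.index(importer)


print(may_import("api", "config"))   # allowed: API is above Configuration
print(may_import("config", "api"))   # forbidden: reverse direction
```

A check like this could run in CI against the real import graph, but how the imports are collected is left out here.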
All separation models inherit from `AudioSeparator` ABC:
```python
from abc import ABC, abstractmethod
from pathlib import Path


class AudioSeparator(ABC):
    output_slots: dict[str, str]  # What this model produces
    model_filename: str  # Model file to load

    @abstractmethod
    def separate(self, input_file: Path, output_dir: Path) -> dict[str, Path]:
        """Perform separation, return dict mapping slot names to output files."""
```
**Base Implementation**: `AudioSeparatorLibraryModel` wraps the `audio-separator` library. Concrete models only define `output_slots` and `model_filename`.
**Example Model**:
```python
class VocalsMelBandRoformer(AudioSeparatorLibraryModel):
    output_slots = {
        "vocals": "Vocal track",
        "not_vocals": "Instrumental",
    }
    model_filename = "vocals_mel_band_roformer.ckpt"
```
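To illustrate the `AudioSeparator` contract without loading a real model, here is a minimal sketch of a test double. `CopySeparator` and its file naming are hypothetical and not part of Stemset. The ABC is repeated only so the block runs on its own.

```python
import shutil
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class AudioSeparator(ABC):
    """Copy of the ABC described in this document, repeated for self-containment."""

    output_slots: dict[str, str]
    model_filename: str

    @abstractmethod
    def separate(self, input_file: Path, output_dir: Path) -> dict[str, Path]:
        """Perform separation, return dict mapping slot names to output files."""


class CopySeparator(AudioSeparator):
    """Hypothetical test double: copies the input once per slot, runs no model."""

    output_slots = {"vocals": "Vocal track", "not_vocals": "Instrumental"}
    model_filename = "unused.ckpt"  # placeholder, never loaded

    def separate(self, input_file: Path, output_dir: Path) -> dict[str, Path]:
        outputs: dict[str, Path] = {}
        for slot in self.output_slots:
            target = output_dir / f"{input_file.stem}_{slot}.wav"
            shutil.copyfile(input_file, target)
            outputs[slot] = target
        return outputs


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "song.wav"
    source.write_bytes(b"not real audio")
    result = CopySeparator().separate(source, Path(tmp))
    result_slots = sorted(result)
    all_exist = all(path.exists() for path in result.values())
    print(result_slots, all_exist)
```

The returned dict has exactly one entry per key in `output_slots`, which is the property a strategy executor would rely on.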
Separation strategies are defined in `config.yaml` as recursive trees:
```yaml
successive:
  _: "vocals_mel_band_roformer.ckpt"  # Model to use
  vocals: vocals                       # Final output name
  not_vocals:                          # Process this output further
    _: "kuielab_b_drums.onnx"
    drums: drums
    not_drums:
      _: "kuielab_a_bass.onnx"
      bass: bass
      not_bass: other
```
**Rules:**
- **`_` key**: Model file to use at this node
- **String**: Final stem name (leaf node)
- **Dict**: Subtree that processes that output further
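The rules above can be sketched as a short recursive walk. This is an illustration under assumptions, not Stemset's executor: `final_stems` is a hypothetical helper, and the dict literal is the `successive` strategy from the YAML written out as a Python dict.

```python
def final_stems(node: dict[str, object]) -> list[str]:
    """Collect final stem names (string leaves) in depth-first order."""
    stems: list[str] = []
    for slot, child in node.items():
        if slot == "_":
            continue  # "_" names the model file, not an output slot
        if isinstance(child, str):
            stems.append(child)  # leaf: final stem name
        elif isinstance(child, dict):
            stems.extend(final_stems(child))  # subtree: process further
        else:
            raise TypeError(f"slot {slot!r} must be a string or a dict")
    return stems


successive = {
    "_": "vocals_mel_band_roformer.ckpt",
    "vocals": "vocals",
    "not_vocals": {
        "_": "kuielab_b_drums.onnx",
        "drums": "drums",
        "not_drums": {
            "_": "kuielab_a_bass.onnx",
            "bass": "bass",
            "not_bass": "other",
        },
    },
}

print(final_stems(successive))  # → ['vocals', 'drums', 'bass', 'other']
```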
1. Frontend uploads file to backend
2. Backend uploads to storage (R2 or local)
3. Backend triggers GPU worker (Modal) or local worker
4. Worker processes audio and uploads stems to R2
5. Worker calls back to API with completion status
6. Frontend polls for completion
7. Frontend streams stems from R2
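Step 6 of the flow above is a plain polling loop. The sketch below shows the idea in Python for consistency with the backend examples. The real frontend is React, and the status strings `"processing"`, `"complete"`, and `"error"` as well as the `fetch_status` callable are assumptions made for illustration.

```python
import time
from collections.abc import Callable


def wait_for_completion(
    fetch_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
) -> str:
    """Poll until the job reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in ("complete", "error"):  # hypothetical terminal statuses
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s} seconds")
        time.sleep(interval_s)


# Simulated API responses stand in for real HTTP requests.
responses = iter(["processing", "processing", "complete"])
final_status = wait_for_completion(lambda: next(responses), interval_s=0.0)
print(final_status)  # → complete
```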
The frontend uses two code generators that must be run after specific changes:
#### 1. OpenAPI Client Generation (API Types)
**When to run**: After changing backend API models or endpoints
```bash
cd frontend
bun run generate        # One-time generation
bun run generate:watch  # Watch mode (auto-regenerates)
```
#### 2. TanStack Router Generation (Route Types)
**When to run**: After adding/removing/renaming route files
```bash
cd frontend
bun run generate:routes
```
**Adding new API endpoints:**
1. Update backend models in `src/api/models.py`
2. Add/update route handlers in `src/api/*_routes.py`
3. Register handlers in `src/api/app.py`
4. Ensure backend is running: `uv run litestar run --reload`
5. Regenerate frontend types: `cd frontend && bun run generate`
**Adding new frontend routes:**
1. Create route file in `src/routes/` (e.g., `src/routes/p/$profileName/clips/$clipId.tsx`)
2. Regenerate route types: `bun run generate:routes`
3. TypeScript will now recognize new route paths
**Important**: Always use `bun` for frontend package management and scripts, not npm/yarn/pnpm.
```bash
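uv run litestar run --reload          # Start the backend with auto-reload
uv run stemset process <profile>      # Run separation for a profile
basedpyright                          # Type-check the backend
modal deploy src/processor/worker.py  # Deploy the Modal GPU worker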
```
```bash
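cd frontend
bun install              # Install dependencies
bun run dev              # Start the dev server
bun run generate         # Regenerate the OpenAPI client
bun run generate:watch   # Regenerate the OpenAPI client on change
bun run generate:routes  # Regenerate TanStack Router route types
bun run typecheck        # Type-check the frontend
bun run build            # Build the frontend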
```
```
src/
├── config.py                    # Configuration models (Pydantic)
├── models/
│   ├── audio_separator_base.py  # ABC for separation models
│   ├── atomic_models.py         # Concrete model implementations
│   ├── registry.py              # Model registry (singleton)
│   └── strategy_executor.py     # Strategy tree execution
├── modern_separator.py          # Public separation interface
├── api/
│   ├── app.py                   # Litestar app setup
│   ├── models.py                # API request/response models
│   └── *_routes.py              # Route handlers
├── processor/
│   └── worker.py                # Modal GPU worker
└── frontend/
    ├── src/
    │   ├── routes/              # TanStack Router routes
    │   ├── api/generated/       # Generated API client
    │   └── routeTree.gen.ts     # Generated route types
    └── openapi-ts.config.ts     # OpenAPI generator config
```
1. **No defensive programming**. Fail fast, fail loud.
2. **Type everything**. Run `basedpyright` regularly.
3. **Respect layer boundaries**. Never import from higher layers.
4. **Use Pydantic** for all data models and config.
5. **Validate config at runtime**. Strategy output slots must match model definitions.
6. **Intermediate files are WAV**. Only final outputs use configured format.
7. **Run code generators** after API or route changes.
8. **Use `bun`** for all frontend operations.
9. **No `try/except` for control flow**. Only for resource cleanup.
10. **Run type checkers** (`basedpyright` backend, `bun run typecheck` frontend) before committing.
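Rules 1 and 5 combine naturally: validate the strategy against the model definitions up front and raise on the first mismatch. The following is a minimal sketch under assumptions. `MODEL_SLOTS` and `validate_strategy` are hypothetical names, the slot sets for the two `kuielab` models are inferred from the example strategy in this document, and requiring an exact slot match is one possible reading of rule 5.

```python
from typing import Any

# Hypothetical slot table; in Stemset this information lives on the model classes.
MODEL_SLOTS: dict[str, set[str]] = {
    "vocals_mel_band_roformer.ckpt": {"vocals", "not_vocals"},
    "kuielab_b_drums.onnx": {"drums", "not_drums"},
    "kuielab_a_bass.onnx": {"bass", "not_bass"},
}


def validate_strategy(node: dict[str, Any]) -> None:
    """Raise if a node's slots differ from the slots its model declares."""
    model = node["_"]              # KeyError if the model key is missing
    declared = MODEL_SLOTS[model]  # KeyError if the model is unknown
    used = set(node) - {"_"}
    if used != declared:
        raise ValueError(
            f"{model}: strategy uses {sorted(used)}, model declares {sorted(declared)}"
        )
    for child in node.values():
        if isinstance(child, dict):
            validate_strategy(child)  # recurse into subtrees


validate_strategy(
    {
        "_": "kuielab_b_drums.onnx",
        "drums": "drums",
        "not_drums": {"_": "kuielab_a_bass.onnx", "bass": "bass", "not_bass": "other"},
    }
)
print("strategy is valid")
```

There is no `try/except` here: a misspelled slot or unknown model surfaces immediately as an exception at load time rather than as a missing stem later.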