Local-first AI orchestrator with a multi-agent message bus, H-Net hierarchical memory, and dynamic chunking. Spins up FastAPI services, a Redis message bus, and specialized agents. Offline by default, cloud optional.
Expert assistant for working with the CLI AI multi-agent orchestration system - a local-first AI platform that uses a message-bus architecture to coordinate specialized agents with hierarchical memory management.
CLI AI is a multi-agent command-line interface that uses a Redis Streams-based message bus to orchestrate specialized AI agents. It features H-Net hierarchical memory with dynamic chunking to keep long-term context within token budgets. The system is offline by default, with optional cloud integration.
1. **Message Bus Pattern** - Central communication hub using Redis Streams
- `bus_server.py`: FastAPI app on port 7088, requires `BUS_TOKEN` env var
- All components publish/subscribe through topics with bearer token auth
2. **Agent System** - Specialized agents with defined roles
- `agent_server.py`: FastAPI wrapper for individual agents on separate ports
- Agents defined in `configs/agents.yaml` (CEO, CTO, CFO, etc.)
- Each runs as separate FastAPI server communicating via bus
3. **Orchestrator** - Process manager
- `orchestrator.py`: Spawns and manages bus + agent processes
- Follows `wake_order` from agent config
4. **CLI Interface** - Primary user interface
- `ch_cli.py`: Router → generator → QC flow
- Three-stage pipeline: intent classification → Qwen generation → DeepSeek QC
5. **H-Net Memory** - Hierarchical memory management
- `hnet/dynamic_chunker.py`: Token-budget-aware chunking with soft overlap
- `hnet/hierarchical_memory.py`: Vector-based (FAISS/NumPy) memory with recursive summarization
- Persistent storage in `data/hier_mem/` with versioned metadata
6. **Knowledge Base** - SQLite with FTS5
- `core/kb.py`: Conversation context, bus messages, hierarchical chunks
- WAL mode for concurrent access
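The KB's concurrency setup can be sketched with plain `sqlite3`; the table and column names below are illustrative, not the actual `core/kb.py` schema:

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "kb.db")
conn = sqlite3.connect(db_path)

# Write-ahead logging lets readers proceed while a writer is active
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# FTS5 virtual table for full-text search over stored chunks
conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS chunks USING fts5(topic, body)")
conn.execute(
    "INSERT INTO chunks VALUES (?, ?)", ("memory", "hierarchical chunk text")
)
conn.commit()

rows = conn.execute(
    "SELECT topic FROM chunks WHERE chunks MATCH ?", ("hierarchical",)
).fetchall()
```

The same `PRAGMA journal_mode=WAL` call is what must be preserved when touching the real KB code.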
```bash
pip install ".[dev]"                            # standard install with dev extras
uv pip install ".[dev]"                         # or with uv
pip install ".[dev,token,observability,faiss]"  # full install with optional extras
```
1. **Run full test suite:**
```bash
pytest
```
2. **Run with coverage (CI mode):**
```bash
pytest --maxfail=1 --cov --cov-config=.coveragerc
```
3. **Skip heavy dependencies in test environment:**
```bash
SKIP_OPENVINO=true SKIP_DB=true pytest --maxfail=1
```
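How the test suite interprets these flags internally is an assumption, but a conftest-style helper would look roughly like this:

```python
import os

def env_flag(name: str) -> bool:
    """Interpret SKIP_*-style environment variables as booleans."""
    return os.getenv(name, "").strip().lower() in {"1", "true", "yes"}

# Example: guard a heavy import the way a conftest.py might
if env_flag("SKIP_OPENVINO"):
    openvino = None  # tests that need it should skip themselves
```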
1. **Lint checking:**
```bash
ruff check .
```
2. **Type checking:**
```bash
mypy
```
3. **Run all checks:**
```bash
make check
```
1. **Basic CLI workflow:**
```bash
python ch_cli.py new --name "demo" --task "Hello world"
```
2. **Boot full orchestrator (bus + agents):**
```bash
python orchestrator.py boot
```
3. **Interactive shell:**
```bash
python orchestrator.py shell
```
4. **Frontend development:**
```bash
cd frontend
npm install
npm run dev
```
5. **Build frontend:**
```bash
cd frontend
npm run build
```
Both `bus_server.py` and `agent_server.py` use FastAPI's `lifespan` context manager (not deprecated `on_event`). Always respect lifespan hooks when embedding these apps.
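A minimal sketch of the lifespan pattern - the startup/shutdown bodies here are placeholders, not the actual `bus_server.py` logic:

```python
from contextlib import asynccontextmanager

@asynccontextmanager
async def lifespan(app):
    # startup: acquire resources (e.g. Redis connection, KB handle)
    app.state = {"ready": True}
    yield
    # shutdown: release them
    app.state["ready"] = False

# Wired into the server as: app = FastAPI(lifespan=lifespan)
```

Embedding code that skips this context manager (e.g. calling route handlers directly) will run without the resources the hooks set up.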
Critical environment variables:
Model configuration overrides:
Use `BusClient` from `core/bus_client.py` for publish/subscribe:
```python
{"topic": "agent_role", "data": {"sender": "source", "text": "message"}}
```
All bus requests require valid `BUS_TOKEN` header.
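Building a message in that shape can be sketched as follows; the helper function is hypothetical, and the commented `BusClient` calls are assumed signatures - check `core/bus_client.py` for the real API:

```python
def make_bus_message(topic: str, sender: str, text: str) -> dict:
    # Matches the topic/data shape shown above (hypothetical helper)
    return {"topic": topic, "data": {"sender": sender, "text": text}}

msg = make_bus_message("cto", "ceo", "Review the H-Net chunking budget")

# client = BusClient(token=os.environ["BUS_TOKEN"])  # assumed constructor
# client.publish(msg["topic"], msg["data"])          # assumed method name
```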
For long context, use `DynamicChunker` from `hnet/dynamic_chunker.py` instead of naive string splitting.
```
ch_cli.py # Main CLI entry point
orchestrator.py # System orchestration
agent_server.py # Generic agent wrapper
bus_server.py # Message bus server
core/ # Core utilities (KB, bus client, settings)
hnet/ # H-Net memory implementation
server/ # Backend API (database, auth, billing)
models/ # Model client wrappers
configs/ # YAML configuration files
prompts/ # Agent system prompts (markdown)
tests/ # Pytest test suite
frontend/ # Vue 3 frontend
integrations/ # Social hooks, external integrations
plugins/ # Plugin system (profiles, badges, avatars)
observability/ # OpenTelemetry setup
```
1. **Token authentication**: All bus requests require valid `BUS_TOKEN` header
2. **Port conflicts**: Agents run on dedicated ports (7001-7009), bus on 7088
3. **Redis required**: Bus server needs Redis running (or set `REDIS_URL`)
4. **FAISS optional**: Falls back to NumPy-based search if unavailable
5. **Model paths**: Router can use in-process llama.cpp or HTTP endpoint
6. **WAL mode**: SQLite KB uses `PRAGMA journal_mode=WAL` for concurrent access
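A quick pre-flight check for the agent and bus ports can use a generic socket probe (this helper is not part of the codebase):

```python
import socket

def port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) != 0

# Agents on 7001-7009, bus on 7088
conflicts = [p for p in [*range(7001, 7010), 7088] if not port_free(p)]
```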
When working with this codebase:
1. **Always verify environment variables** - Check `BUS_TOKEN`, Redis URL, and model endpoints before running
2. **Respect lifespan hooks** - When modifying FastAPI apps, preserve lifespan context managers
3. **Test with optional deps** - Use skip flags for dependencies that may not be available
4. **Check port availability** - Ensure no conflicts with agent ports (7001-7009) and bus (7088)
5. **Use dynamic chunking** - For long context work, leverage `DynamicChunker` instead of naive splitting
6. **Follow message patterns** - Use `BusClient` with proper topic/data structure for agent communication
7. **Maintain WAL mode** - When modifying KB, preserve SQLite WAL journal mode
1. Define agent role in `configs/agents.yaml`
2. Create system prompt in `prompts/{role}.md`
3. Add port assignment in orchestrator
4. Update wake_order if startup sequence matters
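Put together, the pieces above might look like this; the exact `agents.yaml` schema is an assumption, so verify field names against an existing entry before copying:

```yaml
# configs/agents.yaml -- hypothetical shape, check existing entries
agents:
  cmo:
    prompt: prompts/cmo.md   # step 2: system prompt file
    port: 7010               # step 3: avoid existing 7001-7009 and bus 7088
wake_order:                  # step 4: startup sequence
  - ceo
  - cmo
```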
```python
from hnet.dynamic_chunker import DynamicChunker, recursive_summarize
chunker = DynamicChunker(budget=800, overlap=80)
chunks = chunker.chunk_text(long_text)
summary = recursive_summarize(chunks, model_client, budget=800)
```
```python
pytest.importorskip("faiss") # Skip test if FAISS unavailable
```