Expert guidance for working with context-creator, a high-performance Rust CLI tool that converts entire codebases into LLM-optimized Markdown with semantic analysis. The tool intelligently filters, prioritizes, and formats code from git repositories into cohesive documents optimized for consumption by large language models.
This skill provides comprehensive knowledge about the context-creator project architecture, development workflows, configuration system, semantic analysis capabilities, and best practices for extending the tool. Use this when developing features, debugging issues, adding language support, or optimizing the codebase-to-context pipeline.
**Architecture overview:**
1. **CLI Layer** (`src/cli.rs`)
- Command-line argument parsing via clap
- Configuration validation and loading
- Supports directories, glob patterns, and GitHub repos as input
2. **Core Processing** (`src/core/`)
- `walker.rs`: Directory traversal with .gitignore support
- `context_builder.rs`: Markdown generation with token management
- `prioritizer.rs`: File importance scoring and selection
- `semantic/`: AST-based import tracing and dependency resolution
- `cache.rs`: File caching for performance optimization
3. **Configuration System** (`src/config.rs`)
- TOML-based configuration (`.context-creator.toml`)
- Custom priorities, token limits, ignore patterns
- Hierarchical loading: CLI > config file > defaults
4. **Semantic Analysis** (`src/core/semantic/`)
- Tree-sitter AST parsing for 20+ languages
- Import tracing, dependency resolution, caller analysis
- Language-specific analyzers in `languages/` subdirectory
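As a rough illustration of the CLI layer's input handling (the real implementation in `src/cli.rs` uses clap; the type and function names below are invented for this sketch), the three supported input kinds might be distinguished like this:

```rust
// Hypothetical sketch of input classification; names are not from the codebase.
#[derive(Debug, PartialEq)]
enum InputSource {
    GitHubRepo(String),
    GlobPattern(String),
    Directory(String),
}

fn classify_input(arg: &str) -> InputSource {
    if arg.starts_with("https://github.com/") || arg.starts_with("git@github.com:") {
        InputSource::GitHubRepo(arg.to_string())
    } else if arg.contains('*') || arg.contains('?') || arg.contains('[') {
        InputSource::GlobPattern(arg.to_string())
    } else {
        InputSource::Directory(arg.to_string())
    }
}

fn main() {
    assert!(matches!(
        classify_input("https://github.com/user/repo"),
        InputSource::GitHubRepo(_)
    ));
    assert!(matches!(classify_input("src/**/*.rs"), InputSource::GlobPattern(_)));
    assert!(matches!(classify_input("/path/to/repo"), InputSource::Directory(_)));
    println!("ok");
}
```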
**Development conventions:**
1. **Run validation pipeline before committing**
- Execute `make validate` to run format check + lint
- Execute `make test` to run full test suite
- Fix any failures before proceeding
2. **Follow error handling conventions**
- Use `anyhow::Result` for error propagation
- Add context with `.context()` or `.with_context()`
- Check `src/utils/error.rs` for custom error types
3. **Maintain performance characteristics**
- Use rayon for parallel processing where appropriate
- Check for file caching opportunities
- Profile with `cargo bench` for critical paths
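The `.with_context()` convention attaches a human-readable layer to low-level errors as they propagate. A dependency-free sketch of the same idea (the real code uses `anyhow::Result` with `.with_context()`; here the wrapping is done with `map_err` so the snippet stays self-contained):

```rust
use std::fs;

// Stand-in for the anyhow context pattern: wrap a low-level error with a
// descriptive message as it propagates up the call stack. With anyhow this
// would be:
//   fs::read_to_string(path)
//       .with_context(|| format!("failed to read config file `{path}`"))
fn read_config(path: &str) -> Result<String, String> {
    fs::read_to_string(path)
        .map_err(|e| format!("failed to read config file `{}`: {}", path, e))
}

fn main() {
    let err = read_config("/nonexistent/.context-creator.toml").unwrap_err();
    assert!(err.starts_with("failed to read config file"));
    println!("{}", err);
}
```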
**Build and validate:**
```bash
make build # Format check + lint + build
make validate # Format check + lint only
```
**Testing:**
```bash
make test # Run all tests
cargo test # Run unit/integration tests
cargo test test_name # Run specific test
make coverage # Generate coverage report
```
**Development iteration:**
```bash
make dev # Build and run with example
make install # Install to cargo bin directory
make doc # Generate and open documentation
```
**Code quality:**
```bash
make fmt # Auto-format code
make fmt-check # Check formatting (CI-safe)
make lint # Run clippy lints
```
When adding support for a new programming language:
1. **Create language-specific analyzer** in `src/core/semantic/languages/`
- Implement trait for import extraction
- Define tree-sitter query patterns
- Handle language-specific import syntax
2. **Add tree-sitter dependency** to `Cargo.toml`
- Include the `tree-sitter-{language}` crate
- Specify compatible version
3. **Register in language registry** (`src/core/semantic/languages/mod.rs`)
- Add enum variant for language
- Map file extensions to language
- Add grammar initialization
4. **Write comprehensive tests**
- Unit tests for import extraction
- Integration tests with real code samples
- Edge case coverage
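The shape of a language analyzer can be sketched as a trait with extension mapping and import extraction. Note this is a hypothetical outline: the trait name, methods, and the toy line-based extraction below are invented for illustration; the real analyzers in `src/core/semantic/languages/` run tree-sitter queries against a parsed AST.

```rust
// Hypothetical analyzer trait; the real one uses tree-sitter AST queries.
trait LanguageAnalyzer {
    /// File extensions handled by this analyzer.
    fn extensions(&self) -> &[&str];
    /// Extract imported module names from source code.
    fn extract_imports(&self, source: &str) -> Vec<String>;
}

struct PythonAnalyzer;

impl LanguageAnalyzer for PythonAnalyzer {
    fn extensions(&self) -> &[&str] {
        &["py", "pyi"]
    }

    // Toy line-based extraction; a real analyzer would query the AST to
    // handle all import forms (aliases, relative imports, etc.) correctly.
    fn extract_imports(&self, source: &str) -> Vec<String> {
        source
            .lines()
            .filter_map(|line| {
                let line = line.trim();
                if let Some(rest) = line.strip_prefix("import ") {
                    Some(rest.split_whitespace().next()?.to_string())
                } else if let Some(rest) = line.strip_prefix("from ") {
                    Some(rest.split_whitespace().next()?.to_string())
                } else {
                    None
                }
            })
            .collect()
    }
}

fn main() {
    let analyzer = PythonAnalyzer;
    let imports = analyzer.extract_imports("import os\nfrom pathlib import Path\nx = 1\n");
    assert_eq!(imports, vec!["os".to_string(), "pathlib".to_string()]);
    println!("{:?}", imports);
}
```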
Understand configuration priority when debugging:
1. **Explicit CLI arguments** (highest priority)
2. **Config file token limits** (for prompt tokens)
3. **Config file defaults**
4. **Hard-coded defaults** (lowest priority)
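The precedence chain above amounts to `Option` chaining for each setting. A minimal sketch, assuming a single `max_tokens` setting (the function name and the default value here are illustrative, not the tool's actual default):

```rust
// Hypothetical default; check src/config.rs for the real value.
const DEFAULT_MAX_TOKENS: usize = 150_000;

// Explicit CLI value wins, then the config-file value, then the default.
fn resolve_max_tokens(cli: Option<usize>, config_file: Option<usize>) -> usize {
    cli.or(config_file).unwrap_or(DEFAULT_MAX_TOKENS)
}

fn main() {
    assert_eq!(resolve_max_tokens(Some(50_000), Some(100_000)), 50_000); // CLI wins
    assert_eq!(resolve_max_tokens(None, Some(100_000)), 100_000);       // config file
    assert_eq!(resolve_max_tokens(None, None), 150_000);                // default
}
```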
The primary configuration file is `.context-creator.toml`. Token budgeting is woven through the entire processing pipeline, so when modifying token logic, keep the full flow in mind:
1. **Input Processing**: CLI args → Config validation → Directory resolution
2. **File Discovery**: Walker scans → Applies ignore patterns → Filters includes
3. **Semantic Analysis**: Import tracing → Dependency resolution → Relationships
4. **Prioritization**: Importance scoring → Token budget → Selection
5. **Output Generation**: Markdown formatting → Token counting → Final assembly
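The prioritization stage (step 4) can be sketched as a greedy selection: sort by importance, then take files until the token budget is exhausted. All names here are invented for illustration; the real logic lives in `src/core/prioritizer.rs` and `src/core/context_builder.rs`.

```rust
// Toy file record flowing through the pipeline (fields invented).
#[derive(Debug, Clone)]
struct FileEntry {
    path: String,
    priority: f64,
    tokens: usize,
}

// Greedy sketch of stage 4: highest-priority files first, stop when the
// next file would exceed the token budget.
fn select_within_budget(mut files: Vec<FileEntry>, budget: usize) -> Vec<FileEntry> {
    files.sort_by(|a, b| b.priority.partial_cmp(&a.priority).unwrap());
    let mut used = 0;
    files
        .into_iter()
        .take_while(|f| {
            if used + f.tokens <= budget {
                used += f.tokens;
                true
            } else {
                false
            }
        })
        .collect()
}

fn main() {
    let files = vec![
        FileEntry { path: "src/main.rs".into(), priority: 1.0, tokens: 400 },
        FileEntry { path: "README.md".into(), priority: 0.5, tokens: 300 },
        FileEntry { path: "tests/big.rs".into(), priority: 0.2, tokens: 900 },
    ];
    let selected = select_within_budget(files, 800);
    let paths: Vec<_> = selected.iter().map(|f| f.path.as_str()).collect();
    assert_eq!(paths, vec!["src/main.rs", "README.md"]);
}
```

A real implementation would also account for dependency relationships found during semantic analysis, not just standalone scores.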
For testing, follow the patterns established in the existing suite: unit and integration tests run via `make test`, with coverage reports from `make coverage`.
When optimizing performance:
1. **Profile first**: Use `cargo bench` to identify bottlenecks
2. **Consider parallelization**: Use rayon for independent operations
3. **Cache strategically**: File reads, AST parsing, token counts
4. **Pool resources**: Tree-sitter parsers are expensive to create
5. **Benchmark changes**: Compare before/after with `cargo bench`
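"Cache strategically" (point 3) can be as simple as memoizing per-file token counts. A toy sketch, assuming a rough 4-characters-per-token estimate (the real cache in `src/core/cache.rs` and the tool's actual tokenizer are more sophisticated):

```rust
use std::collections::HashMap;

// Toy memoization of per-file token counts, keyed by path.
struct TokenCache {
    counts: HashMap<String, usize>,
}

impl TokenCache {
    fn new() -> Self {
        TokenCache { counts: HashMap::new() }
    }

    // Computes the estimate once per path; later calls hit the cache.
    // (4 chars/token is a common rough heuristic, not the real tokenizer.)
    fn token_count(&mut self, path: &str, contents: &str) -> usize {
        *self
            .counts
            .entry(path.to_string())
            .or_insert_with(|| (contents.len() + 3) / 4)
    }
}

fn main() {
    let mut cache = TokenCache::new();
    let first = cache.token_count("src/lib.rs", "fn main() {}");
    let second = cache.token_count("src/lib.rs", "fn main() {}");
    assert_eq!(first, second);
    assert_eq!(first, 3); // 12 bytes at ~4 chars per token
}
```

A production cache would also invalidate on file modification time or content hash.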
**Debugging semantic analysis issues:**
1. Enable debug logging for the semantic module
2. Check tree-sitter query patterns in language analyzer
3. Verify AST structure matches expectations
4. Test with minimal reproduction case
**Tuning file prioritization:**
1. Locate the priority logic in `src/core/prioritizer.rs`
2. Understand glob pattern matching (first-match-wins)
3. Test with representative codebase
4. Update config documentation if adding new rules
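First-match-wins (step 2) means rule order matters: a specific pattern listed after a broad one will never fire. A minimal sketch, with glob matching simplified to suffix matching so the snippet is self-contained (the real prioritizer uses proper glob patterns):

```rust
// First matching rule wins; returns None when no rule matches.
fn priority_for(path: &str, rules: &[(&str, f64)]) -> Option<f64> {
    rules
        .iter()
        .find(|(suffix, _)| path.ends_with(suffix))
        .map(|(_, weight)| *weight)
}

fn main() {
    // Order matters: put specific rules before broad ones.
    let rules = [("main.rs", 2.0), (".rs", 1.0), (".md", 0.5)];
    assert_eq!(priority_for("src/main.rs", &rules), Some(2.0)); // not 1.0
    assert_eq!(priority_for("src/lib.rs", &rules), Some(1.0));
    assert_eq!(priority_for("image.png", &rules), None);
}
```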
**Adding a configuration option:**
1. Update the `Config` struct in `src/config.rs`
2. Add TOML deserialization handling
3. Document new option in README
4. Add validation for new option
5. Update CLI help text if exposing via CLI
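Steps 1 and 4 can be sketched as follows. The field name `max_file_size` is a hypothetical example, and in the real code the struct derives `serde::Deserialize` for TOML parsing (omitted here so the snippet has no external dependencies):

```rust
// Sketch of src/config.rs with one hypothetical new option added.
#[derive(Debug, Default)]
struct Config {
    max_tokens: Option<usize>,
    // New option (invented example): files larger than this, in bytes,
    // are skipped. Optional so existing config files keep working.
    max_file_size: Option<u64>,
}

impl Config {
    // Step 4: validate the new option alongside the existing ones.
    fn validate(&self) -> Result<(), String> {
        if let Some(0) = self.max_file_size {
            return Err("max_file_size must be greater than zero".to_string());
        }
        Ok(())
    }
}

fn main() {
    let good = Config { max_file_size: Some(1_048_576), ..Default::default() };
    assert!(good.validate().is_ok());

    let bad = Config { max_file_size: Some(0), ..Default::default() };
    assert!(bad.validate().is_err());
}
```

Keeping new options as `Option<T>` with validation separated from deserialization preserves backward compatibility with existing config files.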
**Troubleshooting build failures:**
1. Ensure the Rust toolchain is up to date: `rustup update`
2. Clean build artifacts: `cargo clean`
3. Check for conflicting dependencies: `cargo tree`
4. Verify tree-sitter grammars compile correctly
5. Review error output for missing system dependencies
**Basic codebase conversion:**
```bash
context-creator /path/to/repo
```
**With semantic analysis:**
```bash
context-creator /path/to/repo --semantic
```
**Trace specific file imports:**
```bash
context-creator /path/to/repo --trace-imports src/main.rs
```
**Custom token limit:**
```bash
context-creator /path/to/repo --max-tokens 50000
```
**With configuration file:**
```bash
context-creator /path/to/repo --config .context-creator.toml
```
When in doubt, run `make doc` to browse full documentation locally.