A production-ready binary decompilation service that translates compiled binaries into natural-language explanations using multiple LLM providers. It features a clean architecture with direct provider specification, open access (no user accounts), comprehensive testing, and full Docker deployment.
This skill guides you through building and operating a complete binary analysis service. The codebase is organized as a modular monolith:
```
src/
├── api/ # FastAPI endpoints and routing
├── analysis/ # Binary decompilation engine
├── llm/ # LLM provider integrations
├── models/ # Pydantic data models
├── cache/ # Result caching layer
└── core/ # Shared utilities and config
tests/
├── unit/ # Unit tests
├── integration/ # Integration tests
└── performance/ # Performance tests
docker/ # Container configurations
docs/ # Operational documentation
scripts/ # Deployment and housekeeping
0xcc/ # AI Dev Tasks Framework documents
```
**Project Foundation**

1. **Create Project Structure**
- Set up modular monolith architecture: `src/{api,analysis,llm,models,cache,core}/`
- Initialize Python project with pyproject.toml
- Configure development tools: black, isort, mypy, pytest
2. **Establish Documentation Framework**
- Create `000_PPRD|bin2nlp.md` in `0xcc/prds/` (Project PRD)
- Create `000_PADR|bin2nlp.md` in `0xcc/adrs/` (Architecture Decision Record)
- Update CLAUDE.md with project standards and technology decisions
3. **Set Up Development Environment**
- Configure Docker Compose with API, PostgreSQL, and worker services
- Create environment templates: `.env.example`, `.env.development` (a typed-settings sketch follows this list)
- Set up pre-commit hooks for code quality
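As a concrete starting point, the sketch below shows how `src/core/config.py` might load those templates with the pydantic-settings package; the field names and defaults are illustrative, not prescribed by this skill.

```python
# src/core/config.py -- illustrative sketch (assumes the pydantic-settings
# package; field names and defaults are assumptions)
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    openai_api_key: str = ""
    openai_base_url: str = "https://api.openai.com/v1"
    openai_model: str = "phi4:latest"
    llm_default_provider: str = "openai"
    max_binary_size_mb: int = 100   # enforced at upload time
    cache_ttl_hours: int = 24       # result cache lifetime

settings = Settings()
```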
**Decompilation Engine**

1. **Implement Radare2 Integration**
- Create `src/analysis/decompiler.py` with an r2pipe wrapper (sketched after this list)
- Build function extraction: entry point detection, assembly code capture
- Add binary validation and size limits (100MB default)
2. **Develop Storage Layer**
- Implement PostgreSQL models for jobs, results, and binaries
- Create file caching system with TTL support
- Add cleanup routines for temporary files
3. **Test Decompilation Engine**
- Unit tests for function detection (target: 10 functions in test binary)
- Integration tests for file processing pipeline
- Performance tests for 453KB binary processing (target: <50ms)
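A minimal sketch of the r2pipe wrapper, assuming radare2 is installed and the `r2pipe` package is available; the function shape and returned fields are illustrative:

```python
# src/analysis/decompiler.py -- minimal r2pipe wrapper sketch
# (assumes radare2 is installed and the r2pipe package is available)
import r2pipe

def extract_functions(path: str) -> list[dict]:
    """Analyze a binary and return its functions with disassembly."""
    r2 = r2pipe.open(path)
    try:
        r2.cmd("aaa")                      # full auto-analysis
        functions = r2.cmdj("aflj") or []  # function list as JSON
        return [
            {
                "name": fn.get("name"),
                "offset": fn.get("offset"),
                "size": fn.get("size"),
                "assembly": r2.cmd(f"pdf @ {fn['offset']}"),  # disassemble function
            }
            for fn in functions
        ]
    finally:
        r2.quit()
```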
**LLM Integration**

1. **Implement Direct Provider Architecture**
- Create `src/llm/base.py` with an `LLMProvider` base class (sketched after this list)
- Implement on-demand provider creation from API request parameters
- Support OpenAI, Anthropic, Gemini, and Ollama providers
2. **Configure Primary LLM Service**
```bash
# Environment configuration: the OpenAI provider pointed at a local
# Ollama instance exposing an OpenAI-compatible API
OPENAI_API_KEY=ollama-local-key
OPENAI_BASE_URL=http://ollama.mcslab.io:80/v1
OPENAI_MODEL=phi4:latest
LLM_DEFAULT_PROVIDER=openai
```
3. **Add Translation Logic**
- Integrate assembly code with LLM prompts
- Parse LLM responses into structured explanations
- Implement error handling for API failures
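A sketch of what the `LLMProvider` interface and prompt wiring could look like; the method name `translate` and the prompt text are assumptions, not part of this skill's contract:

```python
# src/llm/base.py -- interface sketch (the method name `translate`
# and the prompt text are assumptions)
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Common interface behind OpenAI, Anthropic, Gemini, and Ollama."""

    @abstractmethod
    async def translate(self, assembly: str) -> str:
        """Return a natural-language explanation of the assembly code."""

PROMPT_TEMPLATE = (
    "Explain in plain language what the following disassembled "
    "function does:\n\n{assembly}"
)
```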
**REST API Layer**

1. **Build Core Endpoints**
- `POST /api/v1/analyze` - Binary upload and analysis
- `GET /api/v1/jobs/{job_id}` - Job status and results
- `GET /api/v1/health` - System health checks
- `GET /docs` - Auto-generated OpenAPI documentation
2. **Implement Job Management**
- Async background processing for long-running analyses
- Job queue with status tracking (pending, processing, completed, failed); a model sketch follows this list
- Result caching with configurable TTL (1-24 hours)
3. **Add Validation & Error Handling**
- Pydantic models for request/response validation
- Custom exception hierarchy for business logic errors
- Structured error responses with appropriate HTTP status codes
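One possible shape for the job model backing the queue, using the status values above; the field names are illustrative:

```python
# src/models/job.py -- possible job shape (field names are illustrative)
from datetime import datetime
from enum import Enum

from pydantic import BaseModel

class JobStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"

class Job(BaseModel):
    job_id: str
    status: JobStatus = JobStatus.PENDING
    created_at: datetime
    result: dict | None = None   # populated on completion
    error: str | None = None     # populated on failure
```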
**Deployment & Operations**

1. **Docker Containerization**
- Multi-stage Dockerfile for API service
- PostgreSQL container with persistent volumes
- Nginx reverse proxy for production deployment
- Resource limits: API (512MB), Workers (2GB), PostgreSQL (1GB)
2. **Deployment Automation**
```bash
# One-command deployment
./scripts/deploy.sh development # or production
# Health validation
./scripts/health_check.sh
```
3. **Monitoring & Observability**
- Structured logging with correlation IDs (sketched after this list)
- Performance metrics collection (response times, LLM latency)
- Web dashboard at `/dashboard/` for real-time monitoring
- Background alert system for automated health checks
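One way to wire correlation IDs into structured logging is a `ContextVar` set once per request; the logger name and format below are illustrative:

```python
# Correlation-ID logging sketch (logger name and format are illustrative)
import logging
import uuid
from contextvars import ContextVar

correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")

class CorrelationFilter(logging.Filter):
    """Copy the current correlation ID onto every log record."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True

logger = logging.getLogger("bin2nlp")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s [%(correlation_id)s] %(message)s"
))
handler.addFilter(CorrelationFilter())
logger.addHandler(handler)

# Set once per request, e.g. in FastAPI middleware:
correlation_id.set(str(uuid.uuid4()))
```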
**Validation Test Plan**

1. **Phase A: Foundation Validation** (a test sketch follows this list)
- Health endpoints: Component status, system capabilities
- API documentation: OpenAPI spec availability
- Response times: Target 6-12ms for standard endpoints
2. **Phase B: Core Functionality**
- Decompilation accuracy: 10 functions detection in test binary
- File processing: 453KB binary upload in <50ms
- Job management: Queue operations and status tracking
3. **Phase C: Advanced Features**
- LLM integration: Provider health checks and API latency
- Translation quality: Assembly code explanation accuracy
- Admin functions: Metrics and dashboard operations
4. **Phase D: Performance & Scale**
- Concurrent handling: 10 simultaneous requests
- Resource efficiency: <50% memory usage under load
- Scalability: Horizontal scaling validation
5. **Phase E: Resilience & Failure**
- Database outage recovery: Graceful degradation
- Container restart: Health restoration in <10 seconds
- LLM provider failures: Proper error handling
- Input validation: File size limits and format checks
6. **Phase F: Security & Compliance**
- Input sanitization: File type validation
- Credential protection: No API key exposure
- Container security: Non-root execution
- Error message safety: No internal path disclosure
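A Phase A test sketch using FastAPI's `TestClient`; the app import path and the relaxed 100ms bound (around the 6-12ms production target) are assumptions:

```python
# tests/integration/test_phase_a.py -- Phase A sketch
# (the app import path and the 100ms bound are assumptions)
import time

from fastapi.testclient import TestClient

from src.api.main import app  # assumed application entry point

client = TestClient(app)

def test_health_reports_healthy_status():
    response = client.get("/api/v1/health")
    assert response.status_code == 200
    assert response.json()["status"] == "healthy"

def test_health_response_time():
    start = time.perf_counter()
    client.get("/api/v1/health")
    elapsed_ms = (time.perf_counter() - start) * 1000
    # generous bound around the 6-12ms production target
    assert elapsed_ms < 100
```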
**Documentation**

1. **Operational Guides**
- `docs/deployment.md` - Production deployment procedures
- `docs/runbooks.md` - Common operational scenarios
- `docs/llm-providers.md` - Multi-provider setup guide
- `docs/troubleshooting.md` - Diagnostic procedures
2. **Developer Documentation**
- API reference with example requests/responses
- Architecture diagrams and data flow
- Testing guidelines and coverage requirements
- Contributing guide and code review standards
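The key implementation patterns follow. The direct provider architecture reduces to a small factory that instantiates a provider from each request's parameters: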
```python
# Import paths assume the src/ layout shown above
from src.core.config import settings
from src.llm.base import LLMProvider
from src.llm.openai import OpenAIProvider
from src.llm.anthropic import AnthropicProvider

def create_provider(provider_name: str, api_key: str) -> LLMProvider:
    """Create an LLM provider dynamically based on request parameters."""
    if provider_name == "openai":
        return OpenAIProvider(api_key=api_key, base_url=settings.openai_base_url)
    elif provider_name == "anthropic":
        return AnthropicProvider(api_key=api_key)
    # ... other providers
    raise ValueError(f"Unsupported provider: {provider_name}")
```
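Analysis runs asynchronously: the upload endpoint registers a job, schedules processing as a background task, and returns immediately: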
```python
from fastapi import APIRouter, BackgroundTasks, UploadFile

router = APIRouter(prefix="/api/v1")

@router.post("/analyze")
async def analyze_binary(
    file: UploadFile,
    background_tasks: BackgroundTasks,
    provider: str = "openai",
):
    job_id = create_job()  # persist a pending job record
    background_tasks.add_task(process_binary, job_id, file, provider)
    return {"job_id": job_id, "status": "pending"}
```
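The health endpoint reports per-component status; LLM providers show as on-demand because they are constructed per request rather than held open: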
```python
@router.get("/health")
async def health_check():
    return {
        "status": "healthy",
        "components": {
            "api": "operational",
            "database": await check_database(),
            "llm_providers": "on-demand",
        },
    }
```
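Commits use a conventional `type: description` subject plus an extended body: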
```bash
git commit -m "feat: [brief description]" -m "
[extended details: what changed, why, and task references]
"
```
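Between work sessions, checkpoint progress with the housekeeping scripts: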
```bash
./scripts/clear_resume    # reset previous resume state
./scripts/hk --summary "Completed phase X" --next-steps "Begin phase Y"
@.housekeeping/QUICK_RESUME.md    # reload the generated resume file
```
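To resume a session, compact the context and reload the core framework documents: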
```bash
/compact                                  # compact the session context
@CLAUDE.md                                # project standards
@0xcc/spec/050_Developer_Coding_S&Ps.md   # coding standards and practices
@0xcc/prds/000_PPRD|bin2nlp.md            # project PRD
@0xcc/adrs/000_PADR|bin2nlp.md            # architecture decision record
```
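Quick start: copy the environment template and fill in provider credentials and resource limits: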
```bash
cp .env.example .env
# Then edit .env with your provider and resource settings:
OPENAI_API_KEY=your-key-here
OPENAI_BASE_URL=http://ollama.mcslab.io:80/v1
OPENAI_MODEL=phi4:latest
API_MEMORY_LIMIT=512m
WORKER_MEMORY_LIMIT=2g
POSTGRES_MEMORY_LIMIT=1g
```
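Then deploy and verify: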
```bash
./scripts/deploy.sh production    # one-command deployment
./scripts/health_check.sh         # validate all components
docker-compose logs -f api        # follow API logs
```
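Run the tests: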
```bash
pytest                                      # full suite
pytest --cov=src --cov-report=html          # with HTML coverage report
pytest tests/integration/test_phase_*.py    # phase validation tests only
```
**Design Principles**

1. **No Persistent Binary Storage**: Binaries are processed and discarded
2. **File Size Limits**: 100MB default maximum (configurable)
3. **Sandboxed Execution**: radare2 runs in isolated containers
4. **Open Access**: No user authentication required
5. **Stateless Operations**: Enables horizontal scaling
6. **Fail-Fast Validation**: Clear error messages for invalid inputs (sketched below)
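A fail-fast upload validation sketch illustrating the size-limit and clear-error principles; the status codes and messages are illustrative:

```python
# Fail-fast upload validation sketch (limit, codes, and messages are illustrative)
from fastapi import HTTPException, UploadFile

MAX_SIZE = 100 * 1024 * 1024  # 100MB default, configurable via settings

async def validate_upload(file: UploadFile) -> bytes:
    """Reject invalid uploads with a clear error before any analysis runs."""
    data = await file.read()  # simple version; stream in chunks for large files
    if not data:
        raise HTTPException(status_code=400, detail="Empty upload")
    if len(data) > MAX_SIZE:
        raise HTTPException(status_code=413, detail="File exceeds the 100MB limit")
    return data
```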