# MAP PRO Development Guide

Guide for AI assistants working on MAP PRO (Multi-market XBRL Analysis Platform) - a production-grade XBRL financial filing processing system.

## Project Overview

MAP PRO processes XBRL filings from regulatory markets (SEC, FCA, ESMA) to extract financial data. It is built on six strict architectural principles:

1. **Market Agnostic** - Supports any regulatory market

2. **Taxonomy Agnostic** - Handles any XBRL taxonomy

3. **Database Reflects Reality** - Filesystem is source of truth

4. **No Hardcoding** - All configurations externalized

5. **Modular Architecture** - Six independent modules

6. **Output Separation** - Program files vs data files strictly separated

## Repository Structure

`/home/user/map_pro/` - Program files (code, config)

- `database/` - Metadata coordination layer (PostgreSQL + SQLAlchemy)

- `searcher/` - Filing search across regulatory markets

- `downloader/` - Downloads XBRL filings and taxonomies

- `library/` - Taxonomy library management

- `parser/` - Parse XBRL instance documents to JSON

- `mapper/` - Extract financial statements from parsed filings

- `env` - Master configuration file

- `standards.py` - Complete coding standards and validation

`/mnt/map_pro/` - Data partition (all outputs)

- `downloader/` - Downloaded filings and logs

- `taxonomies/` - Standard taxonomies and cache

- `parser/` - Parsed reports and logs

- `mapper/` - Mapped statements and logs

- `database/` - PostgreSQL data files

## Instructions

### Before Starting Any Work

1. **Read the coding standards**

   - Read `/home/user/map_pro/standards.py` completely
   - This contains non-negotiable architectural principles
   - Run `python standards_checker.py` to verify compliance

2. **Understand the module structure**

   - Review the module's README.md if it exists
   - Check current git branch
   - Verify folder structure with user before creating new files

3. **Review configuration**

   - Master config: `/home/user/map_pro/env`
   - Module configs in `{module}/core/config_loader.py`
   - Never hardcode - always use config loader

### Critical Coding Standards

#### 1. NO HARDCODING - Nowhere, for any value

**FORBIDDEN:**

```python
url = "https://data.sec.gov/submissions/CIK0000320193.json"
path = "/home/user/data/filings"
```

**CORRECT:**

```python
from core.config_loader import ConfigLoader

config = ConfigLoader()
url = config.sec_submissions_url.format(cik=cik)
path = config.entities_dir
```

Rules:

- NO URLs/URIs in code
- NO file paths in code
- NO IP addresses or hostnames
- Use `.env` + `core/config_loader.py` for ALL configuration
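
To make the rule concrete, here is a minimal sketch of what a loader of this kind might look like. It is not the project's `core/config_loader.py`: the class name `EnvConfig` and the key names `MAP_PRO_SEC_SUBMISSIONS_URL` and `MAP_PRO_ENTITIES_DIR` are assumptions made for illustration, and only the attribute names `sec_submissions_url` and `entities_dir` come from the example above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


class EnvConfig:
    """Illustrative stand-in for a config loader (hypothetical, not the real class)."""

    def __init__(self, environ: Optional[Mapping[str, str]] = None) -> None:
        # Accept an injected mapping so callers and tests need not touch os.environ.
        self._env = os.environ if environ is None else environ

    def _require(self, key: str) -> str:
        # Fail loudly instead of falling back to a hardcoded default.
        try:
            return self._env[key]
        except KeyError:
            raise KeyError(f"[FAIL] missing configuration key: {key}") from None

    @property
    def sec_submissions_url(self) -> str:
        # Hypothetical key name; the real key is whatever the master config defines.
        return self._require("MAP_PRO_SEC_SUBMISSIONS_URL")

    @property
    def entities_dir(self) -> Path:
        # Hypothetical key name, returned as a Path so callers can join segments.
        return Path(self._require("MAP_PRO_ENTITIES_DIR"))
```

Raising on a missing key, rather than supplying a default, keeps every value traceable to the configuration file.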

#### 2. MARKET AGNOSTIC - Market-specific code only in `/market/` directory

**FORBIDDEN (outside market/):**

```python
if "us-gaap" in namespace:
    # Market-specific logic
    ...
```

**CORRECT:**

```python
from xbrl_parser.market import MarketRegistry

validator = MarketRegistry.get_validator(market_code)
validator.validate(filing)
```

Rules:

- Market-specific code ONLY in `xbrl_parser/market/` directory
- Rest of system must work for ANY market
- Use market detector + registry pattern
- Avoid keywords outside `/market/`: us-gaap, sec, edgar, ifrs, esef, esma, uk-gaap, frc
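
The registry pattern named in the rules can be sketched as follows. This is an illustration under assumptions, not the actual `xbrl_parser.market.MarketRegistry`: the `register` decorator, the `DemoValidator` class, and the `"demo"` market code are hypothetical. Only `get_validator(market_code)` and `validate(filing)` appear in the example above.

```python
from typing import Callable, Dict


class MarketRegistry:
    """Illustrative registry; the real implementation may differ."""

    _validators: Dict[str, type] = {}

    @classmethod
    def register(cls, market_code: str) -> Callable[[type], type]:
        """Class decorator that files a validator under a market code."""
        def decorator(validator_cls: type) -> type:
            cls._validators[market_code.lower()] = validator_cls
            return validator_cls
        return decorator

    @classmethod
    def get_validator(cls, market_code: str):
        """Instantiate the validator registered for market_code."""
        try:
            return cls._validators[market_code.lower()]()
        except KeyError:
            raise ValueError(
                f"[FAIL] no validator registered for market: {market_code}"
            ) from None


# A registration like this would live inside the market/ directory only.
@MarketRegistry.register("demo")
class DemoValidator:
    def validate(self, filing: dict) -> bool:
        # Hypothetical rule: a filing is valid when it carries at least one fact.
        return bool(filing.get("facts"))
```

Core code then depends only on the market code it is handed, so adding a market means adding one registered class and no `if` branches elsewhere.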

#### 3. OUTPUT LOCATION - NEVER write to program files

**FORBIDDEN:**

```python
output_path = "xbrl_parser/output/parsed.json"  # In program directory!
```

**CORRECT:**

```python
from core.config_loader import ConfigLoader

config = ConfigLoader()
output_path = config.output_parsed_dir / market / company / form / date / "parsed.json"
# Writes to: /mnt/map_pro/parser/parsed_reports/sec/apple/10-K/2024-09-30/parsed.json
```

Rules:

- **NEVER** write reports/data under project files
- Program files: `.py`, `.env`, config, source code
- Data files: `.json`, `.xml`, `.htm`, `.txt`, `.csv`, generated output
- Always write to DATA PARTITION (`/mnt/map_pro/`)
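
One way to enforce this at write time is a small guard that refuses any path outside the data partition. The sketch below is an assumption about how such a check could look; `ensure_in_data_partition` is a hypothetical helper, and the data root is expected to come from the config loader rather than a literal.

```python
from pathlib import Path


def ensure_in_data_partition(candidate: Path, data_root: Path) -> Path:
    """Return the resolved candidate path, or raise if it escapes data_root."""
    # Resolve both sides so ".." segments and symlinks cannot bypass the check.
    resolved = candidate.resolve()
    root = data_root.resolve()
    try:
        resolved.relative_to(root)
    except ValueError:
        raise ValueError(
            f"[FAIL] {resolved} is outside the data partition {root}"
        ) from None
    return resolved
```

Calling the guard just before opening a file for writing turns an accidental write into program directories into an immediate error.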

#### 4. PATH REGIME CHECK - Before adding ANY path

Before adding a new path:

1. Read `/home/user/map_pro/env` completely

2. Read `core/data_paths.py` to understand path construction

3. Read `core/config_loader.py` to see how paths are loaded

4. Check if required path already exists

5. If it exists: USE IT (do not create duplicates)

6. If it does not exist: add it, following the existing patterns
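
Step 4 can be partly automated. The sketch below assumes the master config uses plain `KEY=VALUE` lines with `#` comments, which this guide does not confirm, and the helper names `parse_env` and `keys_for_path` are hypothetical.

```python
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    entries: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        entries[key.strip()] = value.strip().strip('"').strip("'")
    return entries


def keys_for_path(env_text: str, wanted: str) -> List[str]:
    """Return the keys whose value already equals the wanted path."""
    target = wanted.rstrip("/")
    return [
        key
        for key, value in parse_env(env_text).items()
        if value.rstrip("/") == target
    ]
```

A non-empty result means the path is already defined and the existing key should be reused.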

#### 5. FILE HEADER PATH - Every Python file

**Required first line:**

```python
# Path: xbrl_parser/foundation/xml_parser.py
"""
XML Parser for XBRL documents.
"""
```

Rules:

- FIRST line of EVERY Python file
- Format: `# Path: relative/path/from/project/root.py`
- Start from project root (e.g., `parser/`, `database/`, `mapper/`)
- Use forward slashes, include `.py` extension
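
A check for this rule might look like the sketch below. It is not the logic of `standards_checker.py`, whose implementation is not shown in this guide. The function names are hypothetical, and the project root is passed in by the caller.

```python
from pathlib import Path


def expected_header(file_path: Path, project_root: Path) -> str:
    """Build the header a file should start with, using forward slashes."""
    relative = file_path.resolve().relative_to(project_root.resolve())
    return f"# Path: {relative.as_posix()}"


def has_path_header(file_path: Path, project_root: Path) -> bool:
    """Check that the first line of the file matches the expected header."""
    with file_path.open(encoding="utf-8") as handle:
        first_line = handle.readline().rstrip("\r\n")
    return first_line == expected_header(file_path, project_root)
```

Using `as_posix()` keeps the header in forward-slash form regardless of the operating system the check runs on.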

#### 6. ASCII ONLY - No emojis anywhere

**FORBIDDEN:**

```python
print("✓ Success")
log.info("⚠️ Warning")
```

**CORRECT:**

```python
print("[OK] Success")
log.info("[WARN] Warning")
```

Use: `[OK]` `[FAIL]` `[WARN]` `->` `*` instead of emojis
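
Non-ASCII characters are easy to miss by eye, so a scan helps. The following is a minimal sketch of such a scan; `find_non_ascii` is a hypothetical helper and not part of the project's tooling.

```python
from typing import List, Tuple


def find_non_ascii(text: str) -> List[Tuple[int, int, str]]:
    """Return (line, column, character) for each non-ASCII character, 1-indexed."""
    hits: List[Tuple[int, int, str]] = []
    # Split on "\n" only, so unusual Unicode line separators are reported too.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, char in enumerate(line, start=1):
            if ord(char) > 127:
                hits.append((line_no, col_no, char))
    return hits
```

An empty list means the text is ASCII-only. Otherwise each entry points at the exact position to fix.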

#### 7. FOLDER TREE - Always verify first

Before writing ANY new files:

1. Ask user for current folder tree (Claude sees files flat)

2. Create virtual copy of folder structure

3. Verify file locations

4. DO NOT forget `__init__.py` files
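
Once the folder tree is known, the last step can be checked mechanically. The sketch below assumes that any directory containing `.py` files is meant to be a package, which may not hold for every folder in the project (script-only directories, for instance). `packages_missing_init` is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def packages_missing_init(root: Path) -> List[Path]:
    """List directories under root that hold .py files but no __init__.py."""
    missing: List[Path] = []
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        has_python = any(
            child.is_file() and child.suffix == ".py"
            for child in directory.iterdir()
        )
        if has_python and not (directory / "__init__.py").exists():
            # Report paths relative to root so the output is easy to read.
            missing.append(directory.relative_to(root))
    return missing
```

The root directory itself is not inspected, only the directories beneath it.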

## Development Workflow

1. **Setting up:**

   ```bash
   cd map_pro
   # Install a module's dependencies in a subshell so the shell stays at the project root
   (cd {module} && pip install -r requirements.txt)
   # Configure /home/user/map_pro/env
   cd database && python -m database.operations.initialize
   ```

2. **While coding:**

   - Run `python standards_checker.py` regularly
   - Add `# Path:` header to all new files
   - Use type hints on all functions
   - Write docstrings (Google or NumPy style)
   - No hardcoded values - use config loader

3. **Before committing:**

   - Run `python standards_checker.py`
   - Run `pytest tests/`
   - Run `mypy module_name/`
   - Run `flake8 module_name/`
   - Format: `black module_name/` and `isort module_name/`

4. **Commit conventions:**

   - Concise messages focusing on "why" not "what"
   - Follow existing commit style (`git log`)
   - Analyze ALL changes before drafting message

## Code Quality Standards

```python
MAX_CYCLOMATIC_COMPLEXITY = 10
MAX_FUNCTION_LENGTH = 50  # lines
MAX_CLASS_LENGTH = 300  # lines
MAX_LINE_LENGTH = 100  # characters
MIN_UNIT_TEST_COVERAGE = 80  # percent
MIN_INTEGRATION_TEST_COVERAGE = 70  # percent
TARGET_OVERALL_COVERAGE = 95  # percent
```
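
As an illustration of how one of these limits could be measured, the sketch below counts function length with the standard `ast` module. How `standards_checker.py` actually measures length is not stated in this guide, so this is an assumption: length here runs from the `def` line to the last line of the body, decorators excluded. `long_functions` is a hypothetical helper.

```python
import ast
from typing import List, Tuple

MAX_FUNCTION_LENGTH = 50  # lines, mirroring the limit listed above


def long_functions(
    source: str, limit: int = MAX_FUNCTION_LENGTH
) -> List[Tuple[str, int]]:
    """Return (name, length) for every function longer than limit lines."""
    offenders: List[Tuple[str, int]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available on AST nodes from Python 3.8 onward.
            length = node.end_lineno - node.lineno + 1
            if length > limit:
                offenders.append((node.name, length))
    return offenders
```

Blank lines and comments inside a function count toward its length under this definition.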

## Module Descriptions

**database/** - Metadata coordination layer

- Models: markets, entities, filing_searches, downloaded_filings, taxonomy_libraries
- Philosophy: Database is metadata hub, NOT source of truth
- Always verify filesystem before trusting database
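
The "verify filesystem first" rule can be sketched as a filter over database rows. The record shape is an assumption: the `"path"` field is a hypothetical column name, and `split_by_existence` is not an existing project function.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Tuple


def split_by_existence(
    records: Iterable[Dict[str, str]],
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Split records into (present, stale) by checking each recorded path on disk."""
    present: List[Dict[str, str]] = []
    stale: List[Dict[str, str]] = []
    for record in records:
        # "path" is a hypothetical field name used only for this sketch.
        target = present if Path(record["path"]).exists() else stale
        target.append(record)
    return present, stale
```

Only the `present` records would be handed to later stages. The `stale` ones mark rows where the database has drifted from the filesystem.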

**searcher/** - Filing search across regulatory markets

- Supported: SEC, ESMA, FCA (extensible)
- Registry pattern for market-specific implementations
- Stores results in database

**downloader/** - Downloads XBRL filings and taxonomies

- Streaming downloads with resume capability
- Archive auto-detection and extraction
- Database synchronization post-download
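
Archive auto-detection is usually done by inspecting file content rather than trusting the extension. The sketch below shows that idea with the standard library only. It is not the downloader's actual detection code, and `detect_archive_type` is a hypothetical helper that covers only zip and tar formats.

```python
import tarfile
import zipfile
from pathlib import Path


def detect_archive_type(path: Path) -> str:
    """Return "zip", "tar", or "none" based on file content, not extension."""
    if zipfile.is_zipfile(path):
        return "zip"
    # is_tarfile also recognizes compressed tar archives such as .tar.gz.
    if tarfile.is_tarfile(path):
        return "tar"
    return "none"
```

A caller could dispatch on the returned string to pick an extraction routine and treat `"none"` as a plain file.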

**library/** - Taxonomy library management

- Monitors parsed filings for taxonomy requirements
- Registers and downloads missing taxonomies

**parser/** - Parse XBRL instance documents to JSON

- Market-agnostic parsing core
- Market-specific logic ONLY in `xbrl_parser/market/`
- Outputs: parsed.json, facts.csv, summary.txt

**mapper/** - Extract financial statements from parsed filings

- Builds statements from linkbases
- Handles XBRL dimensions
- Outputs structured financial statements

## Testing Structure

```
{module}/tests/
├── unit/
│   ├── test_foundation/
│   ├── test_instance/
│   └── test_taxonomy/
├── integration/
├── regression/
└── fixtures/
```

## Common Operations

**Query filings ready for parsing:**

```python
from database.operations import FilingQueries

filings = FilingQueries.get_ready_for_parsing(market='sec', limit=10)
```

**Download filings:**

```bash
cd downloader
python download.py
```

**Parse filing:**

```bash
cd parser
python parser.py --filing-path /mnt/map_pro/downloader/entities/sec/apple/...
```

**Map statements:**

```bash
cd mapper
python mapper.py --parsed-path /mnt/map_pro/parser/parsed_reports/...
```

## Important Notes

- **Filesystem is source of truth** - Database is metadata only
- **Data partition separation** - `/mnt/map_pro/` for all outputs
- **Path regime** - Check existing paths before adding new ones
- **Standards compliance** - Run `standards_checker.py` before commits
- **Market agnostic** - Market-specific code ONLY in `/market/` directories
- **No hardcoding** - Configuration externalized to `.env` and config loaders
