Development guide for the Original Drug Database (原研药数据库), an open-source database of original (innovator) drugs that aims to give Chinese users comprehensive, reliable drug data. The guide covers data validation, entry creation, schema compliance, and automated testing.
The Original Drug Database is a Chinese-language database that covers drugs available in mainland China. All drug names and regulatory information follow the standards of China's National Medical Products Administration (NMPA). The data is shared under the CC BY-SA 4.0 license.
1. **Data Layer** (`/data/`)
- Individual drug entries in Markdown format with YAML frontmatter
- Files numbered sequentially (1.md, 2.md, etc.)
- Strict validation against schema.yaml
2. **Schema Definition** (`schema.yaml`)
- Comprehensive field definitions with validation rules
- Supports both domestic (境内生产药品) and imported (境外生产药品) drugs
- Original drug identification logic embedded
3. **Validation Pipeline** (`/scripts/`)
- `validate.py`: Ensures data compliance with schema
- `export.py`: Converts data to JSON/CSV formats
- `stats.py`: Generates statistics and updates badges
- Test coverage with pytest
4. **CI/CD** (`.github/workflows/`)
- Automated validation on every PR
- Auto-updates statistics and badges
- Enforces data quality standards
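Because entry files are numbered sequentially, a gap or a wrong next ID can be caught before a PR is opened. The following is a minimal sketch and not part of the project's scripts: `find_id_gaps` and `next_id` are hypothetical helper names, and the code assumes every entry file is named `<positive integer>.md` and sits directly inside the data directory.

```python
import re
from pathlib import Path


def _entry_ids(data_dir):
    """Collect the numeric IDs of files named like 1.md, 2.md, ... (assumed layout)."""
    return sorted(
        int(path.stem)
        for path in Path(data_dir).glob("*.md")
        if re.fullmatch(r"[1-9][0-9]*", path.stem)  # skips README.md and similar
    )


def find_id_gaps(data_dir):
    """Return the IDs missing between 1 and the highest ID present."""
    ids = _entry_ids(data_dir)
    if not ids:
        return []
    present = set(ids)
    return [n for n in range(1, ids[-1] + 1) if n not in present]


def next_id(data_dir):
    """Return the ID a new entry file should use."""
    return max(_entry_ids(data_dir), default=0) + 1
```

Calling `find_id_gaps("data")` from the repository root would list any missing numbers, and `next_id("data")` would suggest the filename for a new entry.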
1. **Data Authority**: All data must be verifiable through official sources (NMPA)
2. **Strict Validation**: Approval numbers must match exact regulatory format
3. **Change Tracking**: Complete Git history with PR review mechanism
4. **Community-driven**: Open for contributions with clear guidelines
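The strict-validation principle can be illustrated with a pattern check. This is only a sketch under the approval number format stated in this guide's validation rules (`国药准字`, then one letter from H, Z, S, J, T, B, then eight digits). The function name `is_valid_approval_number` is hypothetical, and the actual logic in `validate.py` may differ.

```python
import re

# Assumed format: 国药准字 + one of H/Z/S/J/T/B + exactly eight ASCII digits.
# [0-9] is used instead of \d so that full-width digits are rejected.
APPROVAL_NUMBER = re.compile(r"国药准字[HZSJTB][0-9]{8}")


def is_valid_approval_number(value: str) -> bool:
    """Return True only if the whole string matches the assumed format."""
    return APPROVAL_NUMBER.fullmatch(value) is not None
```

Using `fullmatch` rather than `search` means trailing characters or a ninth digit cause the check to fail, which is the behavior an exact-format rule calls for.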
When working with data files, use these commands:
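```bash
# Validate a single entry against schema.yaml
python scripts/validate.py data/1.md
# Run the test suite
pytest scripts/test_*.py -v
# Export the data to JSON/CSV
python scripts/export.py
# Regenerate statistics and update badges
python scripts/stats.py
```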
Follow these commit message conventions:
```bash
# Data: add [drug name] ([count] entries)
git commit -m "数据:添加[药品名]([数量]种)"
# Fix: [describe the problem]
git commit -m "修复:[描述问题]"
# Other: [description]
git commit -m "其他:[描述]"
```
Follow these steps when adding new drug entries:
1. **Create New File**
- Create new file in `/data/` with next sequential ID (e.g., `123.md`)
- Use Markdown format with YAML frontmatter
2. **Include Required Fields**
- ID: Sequential number (1-99999, no gaps)
- Registration type: 境内生产药品 (domestic) or 境外生产药品 (imported)
- Generic name (通用名)
- Formulation (剂型)
- Specification (规格)
- Approval date (批准日期, format: YYYY-MM-DD)
- Category (类别)
3. **Conditional Fields by Registration Type**
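- **Imported Drugs (境外生产药品)**: Include the originator, marketing authorization holder (MAH), manufacturer, and packager
- **Domestic Drugs (境内生产药品)**: Include the MAH and manufacturer
4. **Original Drug Logic**
- **Imported**: The entry is treated as an original drug when the originator matches the MAH, manufacturer, or packager
- **Domestic**: The entry is treated as an original drug when the originator matches the MAH or manufacturer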
5. **Provide Sources**
- Include at least one official source reference
- Follow the data source priority, highest first:
- Grade A (A级): NMPA official documents
- Grade B (B级): Company official statements
- Grade C (C级): Industry consensus
- Grade D (D级): Market research data
6. **Validation Rules**
- Approval number format: 国药准字[HZSJTB] followed by eight digits
- Registration number format: 国药准字H[J] followed by eight digits
- Date format: YYYY-MM-DD
- All Chinese regulatory standards must be followed
7. **Run Validation**
- Before committing, run: `python scripts/validate.py data/[filename].md`
- Ensure all validation passes before creating PR
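The conditional-field, date, and original-drug rules in the steps above can be sketched as a small pre-commit check. Everything here is illustrative: the field keys (`registration_type`, `originator`, `mah`, `manufacturer`, `packager`) are hypothetical stand-ins for whatever `schema.yaml` actually defines, and the sketch assumes that a match with any one of the listed companies is enough to mark an entry as an original drug. It does not replace `scripts/validate.py`.

```python
from datetime import datetime

# Hypothetical keys; the real field names live in schema.yaml.
REQUIRED_BY_TYPE = {
    "境外生产药品": ("originator", "mah", "manufacturer", "packager"),  # imported
    "境内生产药品": ("mah", "manufacturer"),  # domestic
}


def is_iso_date(value):
    """True if value is a real calendar date written exactly as YYYY-MM-DD."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except (TypeError, ValueError):
        return False
    # strptime tolerates "2024-2-9"; the round trip enforces zero padding.
    return parsed.strftime("%Y-%m-%d") == value


def missing_fields(entry):
    """List the conditional fields that are absent or empty for this entry."""
    required = REQUIRED_BY_TYPE[entry["registration_type"]]
    return [key for key in required if not entry.get(key)]


def is_original(entry):
    """Assumption: the originator matching any one listed company is sufficient."""
    originator = entry.get("originator")
    if not originator:
        return False
    companies = [
        key for key in REQUIRED_BY_TYPE[entry["registration_type"]]
        if key != "originator"
    ]
    return any(entry.get(key) == originator for key in companies)
```

An entry dictionary would typically come from parsing the YAML frontmatter of a file in `/data/`. Exact string equality is used for the company comparison, so differences in spelling or whitespace would need to be normalized first.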
1. All data must be verifiable through official NMPA sources
2. Automated validation runs on every PR
3. Statistics and badges are auto-updated
4. Community review through PR process
5. Complete Git history for change tracking