Convert FHIR (Fast Healthcare Interoperability Resources) bundles, NDJSON files, and server queries into pandas DataFrames for health data analytics, machine learning, and AI applications.
FHIRy is a Python package that flattens complex nested FHIR healthcare data into tabular pandas DataFrames. It supports multiple FHIR data sources (bundles, NDJSON files, FHIR servers, BigQuery) and handles medical coding systems like SNOMED, LOINC, and ICD-10.
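To make the flattening idea concrete, here is a minimal sketch that uses plain `pandas.json_normalize` rather than FHIRy itself. The sample bundle and the `bundle_to_frame` helper are hypothetical, and the column names are whatever pandas produces, which may differ from FHIRy's own output.

```python
import pandas as pd

# Hypothetical two-entry FHIR Bundle, trimmed to a few fields for illustration.
bundle = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "p1", "gender": "female",
                      "managingOrganization": {"reference": "Organization/1"}}},
        {"resource": {"resourceType": "Patient", "id": "p2", "gender": "male",
                      "managingOrganization": {"reference": "Organization/2"}}},
    ],
}


def bundle_to_frame(bundle: dict) -> pd.DataFrame:
    """Flatten each entry's resource into one row with dotted column names."""
    resources = [entry["resource"] for entry in bundle.get("entry", [])]
    # Nested dictionaries become columns such as "managingOrganization.reference".
    return pd.json_normalize(resources)


df = bundle_to_frame(bundle)
print(df[["id", "gender", "managingOrganization.reference"]])
```

Nested dictionaries are expanded into dotted columns, while lists (for example `name` or `identifier`) stay as Python objects in a single cell, which is one reason a dedicated flattening layer is useful.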
You are working with the FHIRy project, which provides these capabilities:
**Python Requirements:**
**Project Structure:**
```
src/fhiry/ # Core modules
├── fhiry.py # Bundle processor
├── fhirndjson.py # NDJSON processor
├── fhirsearch.py # FHIR server API
├── bqsearch.py # BigQuery integration
├── flattenfhir.py # Resource flattening
├── parallel.py # Parallel processing
├── base_fhiry.py # Base class
└── main.py # CLI entry point
tests/ # pytest test suite
docs/ # MkDocs documentation
examples/ # Usage examples
```
1. **Formatting:** Ruff formatter (120 char line length)
2. **Type Hints:** Required for all function signatures (enforced by mypy)
3. **Docstrings:** Google-style for classes and public methods
4. **Imports:** Auto-organized by ruff (isort-compatible)
5. **Type Checking:** No implicit optional types, return annotations required
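The style rules above can be illustrated with a small function. This is a sketch only: `first_code` is a hypothetical helper, not part of FHIRy, chosen to show full type hints, an explicit `Optional` for a parameter that defaults to `None`, and a Google-style docstring.

```python
from typing import Any, Optional


def first_code(codings: list[dict[str, Any]], system: Optional[str] = None) -> Optional[str]:
    """Return the first code in a list of FHIR Coding dictionaries.

    Args:
        codings: Coding elements, each a dictionary with optional
            ``system`` and ``code`` keys.
        system: If given, only codings with this system URI are considered.

    Returns:
        The first matching code, or ``None`` when nothing matches.
    """
    for coding in codings:
        if system is None or coding.get("system") == system:
            code = coding.get("code")
            if isinstance(code, str):
                return code
    return None
```

Writing `system: str = None` instead would be an implicit optional, which the type-checking rule above disallows.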
**Framework:** pytest with coverage reporting
**Commands:**
```bash
uv run pytest --cov=src/fhiry tests/ # Run with coverage
uv run pytest tests/ # Run without coverage
uv run pytest tests/test_specific.py # Run specific file
```
**Conventions:**
**FHIR Basics:**
**Key Concepts:**
**Adding a New Resource Processor:**
1. Add logic to the appropriate module (`fhiry.py`, `fhirsearch.py`, etc.)
2. Follow existing flattening patterns
3. Add complete type hints
4. Write tests with sample FHIR resources
5. Update documentation for public APIs
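Step 4 might look like the following pytest sketch. Both `count_resource_types` and the inline `SAMPLE_BUNDLE` are hypothetical stand-ins; a real test would exercise the new processor and would probably load a sample resource from the test suite's data files instead.

```python
from collections import Counter
from typing import Any

# Hypothetical sample data; kept inline so the sketch is self-contained.
SAMPLE_BUNDLE: dict[str, Any] = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "p1"}},
        {"resource": {"resourceType": "Observation", "id": "o1"}},
        {"resource": {"resourceType": "Observation", "id": "o2"}},
    ],
}


def count_resource_types(bundle: dict[str, Any]) -> dict[str, int]:
    """Hypothetical function under test: count bundle entries per resource type."""
    resources = (entry.get("resource", {}) for entry in bundle.get("entry", []))
    return dict(Counter(r["resourceType"] for r in resources if "resourceType" in r))


def test_counts_are_grouped_by_resource_type() -> None:
    assert count_resource_types(SAMPLE_BUNDLE) == {"Patient": 1, "Observation": 2}


def test_bundle_without_entries_yields_no_counts() -> None:
    assert count_resource_types({"resourceType": "Bundle"}) == {}
```

Covering the empty-bundle case alongside the typical one keeps edge cases from regressing when the flattening logic changes.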
**Modifying DataFrame Output:**
1. Update `base_fhiry.py` or the specific processor
2. Test with multiple FHIR resource types
3. Verify config JSON filtering still works
4. Add tests for new column extraction logic
**Adding CLI Commands:**
1. Edit `src/fhiry/main.py`
2. Use Click decorators for commands
3. Add tests in `tests/test_cli.py`
4. Update documentation
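As a rough illustration of steps 2 and 3, the sketch below defines a Click command and tests it with Click's `CliRunner`. The `cli` group, the `preview` command, and its `--limit` option are all hypothetical; the actual layout of `src/fhiry/main.py` may differ.

```python
import click
from click.testing import CliRunner


@click.group()
def cli() -> None:
    """Hypothetical command group standing in for the real CLI entry point."""


@cli.command()
@click.argument("path")
@click.option("--limit", default=10, show_default=True, help="Maximum rows to report.")
def preview(path: str, limit: int) -> None:
    """Report which source would be previewed (placeholder for real processing)."""
    click.echo(f"Previewing up to {limit} rows from {path}")


def test_preview_reports_path_and_limit() -> None:
    # CliRunner invokes the command in-process and captures its output.
    result = CliRunner().invoke(cli, ["preview", "bundle.json", "--limit", "5"])
    assert result.exit_code == 0
    assert "Previewing up to 5 rows from bundle.json" in result.output
```

Testing through `CliRunner` avoids spawning a subprocess, so the test would live comfortably next to the others in `tests/test_cli.py`.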
**Core:** pandas, google-cloud-bigquery, tqdm, click, numpy, timeago, prodict, responses, openpyxl
**Optional:** llama-index, langchain (llm extra for LLM-based queries)
**Adding New Dependencies:**
1. Add to `dependencies` in `pyproject.toml`
2. Run `uv sync` to update lock file
3. Check for obsolete deps with `make check` (uses deptry)
Before making changes, check:
1. **Always run tests** before submitting changes
2. **Respect FHIR standards** - consult HL7 FHIR specification
3. **Preserve test coverage** - add tests for new functionality
4. **Use type hints** - required by mypy configuration
5. **Follow existing patterns** - check similar code first
6. **Target develop branch** - never push directly to main
7. **Keep dependencies minimal** - only add if absolutely necessary
8. **Document public APIs** - update docstrings and README
9. **Test with real FHIR data** - use samples in tests/resources/
10. **Handle medical coding systems** - properly extract SNOMED, LOINC, ICD-10 codes
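For rule 10, the following sketch shows one way to pull codes out of a FHIR CodeableConcept by terminology. It is not FHIRy's implementation: `codes_by_system` and the sample concept are hypothetical, and only the three system URIs are standard FHIR identifiers. National variants such as ICD-10-CM use different URIs and would need their own entries.

```python
from typing import Any, Optional

# Canonical system URIs used in FHIR Coding elements.
SYSTEMS = {
    "SNOMED": "http://snomed.info/sct",
    "LOINC": "http://loinc.org",
    "ICD-10": "http://hl7.org/fhir/sid/icd-10",
}


def codes_by_system(concept: dict[str, Any]) -> dict[str, Optional[str]]:
    """Map each known terminology label to the first matching code, if any."""
    found: dict[str, Optional[str]] = {label: None for label in SYSTEMS}
    for coding in concept.get("coding", []):
        for label, uri in SYSTEMS.items():
            if coding.get("system") == uri and found[label] is None:
                found[label] = coding.get("code")
    return found


# Hypothetical CodeableConcept carrying both a LOINC and a SNOMED coding.
heart_rate = {
    "coding": [
        {"system": "http://loinc.org", "code": "8867-4", "display": "Heart rate"},
        {"system": "http://snomed.info/sct", "code": "364075005", "display": "Heart rate"},
    ],
    "text": "Heart rate",
}

print(codes_by_system(heart_rate))
```

Matching on the `system` URI rather than on the display text keeps codes from different terminologies in separate columns once the result is placed in a DataFrame.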
When helping users with FHIRy:
**Convert FHIR Bundle:**
```python
from fhiry import Fhiry

# bundle_dict: a FHIR Bundle already parsed into a Python dict (e.g. with json.load)
df = Fhiry.from_bundle(bundle_dict)
```
**Process NDJSON File:**
```python
from fhiry import FhirNDJSON

# "data.ndjson": a file with one FHIR resource per line
df = FhirNDJSON.from_file("data.ndjson")
```
**Query FHIR Server:**
```python
from fhiry import FhirSearch

# Replace the placeholder URL with the base URL of your FHIR server
searcher = FhirSearch("https://server.com/fhir")
df = searcher.search("Patient", {"name": "Smith"})
```
**BigQuery FHIR Dataset:**
```python
from fhiry import BQSearch

# project_id and dataset_id: your Google Cloud project and BigQuery dataset (defined elsewhere)
df = BQSearch(project_id, dataset_id).search("Observation", limit=1000)
```