# FHIRy Data Analytics

Process FHIR (Fast Healthcare Interoperability Resources) bundles and NDJSON files for health data analytics, machine learning, and AI applications using Python and pandas.

## What This Skill Does

This skill helps you work with the FHIRy Python package to convert complex FHIR healthcare data into structured pandas DataFrames. It supports FHIR bundles, NDJSON files, FHIR server search integration, BigQuery queries, and LLM-based natural language queries for healthcare data analysis.

## Instructions

### Environment Setup

1. **Verify Python version** (3.10, 3.11, or 3.12 required):

```bash
python --version
```

2. **Install dependencies** using the uv package manager:

```bash
uv sync
```

3. **Understand the project structure**:

- `src/fhiry/fhiry.py` — Core FHIR Bundle processor

- `src/fhiry/fhirndjson.py` — NDJSON file processor

- `src/fhiry/fhirsearch.py` — FHIR server search API integration

- `src/fhiry/bqsearch.py` — BigQuery FHIR dataset queries

- `src/fhiry/flattenfhir.py` — FHIR resource flattening logic

- `src/fhiry/main.py` — CLI entry point

- `tests/resources/` — Sample FHIR bundles for testing

### Development Guidelines

1. **Follow code style conventions**:

- Use Ruff formatter (120 char line length)

- Add type hints to all function signatures

- Write Google-style docstrings for classes and public methods

- Run pre-commit hooks before committing

2. **Type checking requirements**:

- All functions must have type hints

- Return type annotations are required

- Use mypy for type checking

- Only add `# type: ignore` with justification comments

3. **FHIR domain knowledge**:

- FHIR resources have nested structures that need flattening

- Common resource types: Patient, Observation, Condition, Medication, Procedure

- FHIR bundles are collections of related resources

- NDJSON is newline-delimited JSON for bulk FHIR exports

- Handle the coding systems (SNOMED CT, LOINC, ICD-10) that appear inside `CodeableConcept` elements

- Process references between resources (e.g., Patient references in Observations)
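To make the flattening, coding-system, and reference points above concrete, here is a minimal standard-library sketch. It is not FHIRy's own flattening logic: the helpers `flatten` and `first_code` and the sample Observation are hypothetical and written only for illustration. The resulting dict can be passed to pandas as one row, for example `pandas.DataFrame([row])`.

```python
from typing import Any, Optional


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts and lists into one dict with dotted keys."""
    flat: dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}.{key}" if prefix else key))
    elif isinstance(value, list):
        # List positions become part of the key, e.g. "code.coding.0.code".
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}.{index}" if prefix else str(index)))
    else:
        flat[prefix] = value
    return flat


def first_code(concept: dict[str, Any], system: str) -> Optional[str]:
    """Return the first code in a CodeableConcept that belongs to `system`."""
    for coding in concept.get("coding", []):
        if coding.get("system") == system:
            return coding.get("code")
    return None


# Hand-written sample Observation (illustrative data, not from tests/resources/).
observation = {
    "resourceType": "Observation",
    "id": "obs-1",
    "subject": {"reference": "Patient/pat-1"},
    "code": {
        "coding": [
            {"system": "http://loinc.org", "code": "8867-4", "display": "Heart rate"}
        ]
    },
    "valueQuantity": {"value": 72, "unit": "beats/minute"},
}

row = flatten(observation)
loinc_code = first_code(observation["code"], "http://loinc.org")

# A relative reference such as "Patient/pat-1" splits into resource type and id.
ref_type, ref_id = observation["subject"]["reference"].split("/", 1)

print(row["code.coding.0.code"], loinc_code, ref_type, ref_id)
```

Dotted keys keep every nested value addressable as a column name, at the cost of wide tables when resources contain long lists.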

### Testing

1. **Run the tests before making any changes**:

```bash
uv run pytest --cov=src/fhiry tests/
```

2. **Test conventions**:

- Test files must start with `test_`

- Test functions must start with `test_`

- Use fixtures from `tests/conftest.py`

- Use sample FHIR resources from `tests/resources/`

- Maintain >70% test coverage

3. **Add tests for new functionality**:

- Create test file in `tests/` directory

- Use existing patterns from similar tests

- Test with various FHIR resource types

- Verify config JSON filtering works
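The naming conventions above can be sketched as a small test module. The helper `count_resource_types` is hypothetical and is defined inline only so the example is self-contained. In the real project the code under test would be imported from `src/fhiry/`, and the bundle would come from `tests/resources/` or a fixture in `tests/conftest.py`.

```python
from typing import Any


def count_resource_types(bundle: dict[str, Any]) -> dict[str, int]:
    """Hypothetical helper under test: count bundle entries per resourceType."""
    counts: dict[str, int] = {}
    for entry in bundle.get("entry", []):
        resource_type = entry.get("resource", {}).get("resourceType", "Unknown")
        counts[resource_type] = counts.get(resource_type, 0) + 1
    return counts


def test_count_resource_types_handles_mixed_bundle() -> None:
    # Inline bundle for illustration; real tests would load a sample file instead.
    bundle = {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [
            {"resource": {"resourceType": "Patient", "id": "pat-1"}},
            {"resource": {"resourceType": "Observation", "id": "obs-1"}},
            {"resource": {"resourceType": "Observation", "id": "obs-2"}},
        ],
    }
    assert count_resource_types(bundle) == {"Patient": 1, "Observation": 2}


def test_count_resource_types_handles_empty_bundle() -> None:
    # A bundle without an "entry" key should produce no counts.
    assert count_resource_types({"resourceType": "Bundle", "type": "collection"}) == {}
```

Saved as a file whose name starts with `test_` inside `tests/`, both functions would be collected by `uv run pytest`.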

### Common Development Tasks

#### Adding a New FHIR Resource Processor

1. Identify the appropriate module (`fhiry.py`, `fhirsearch.py`, or `bqsearch.py`)

2. Follow existing patterns for resource flattening

3. Add type hints for all methods

4. Write tests with sample FHIR resources in `tests/resources/`

5. Update docstrings and README if adding public API

#### Modifying DataFrame Output

1. Locate the column extraction logic in `base_fhiry.py` or in the specific processor module

2. Test with various FHIR resource types

3. Verify config JSON filtering still works

4. Run full test suite to check for regressions
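A regression check for step 3 might look like the sketch below. The config shape, with `REMOVE` and `RENAME` keys, and the `apply_config` helper are assumptions made for illustration. Check the README for the config format FHIRy actually accepts. Rows are plain dicts here so the example needs only the standard library.

```python
import json
from typing import Any

# Hypothetical config shape, assumed for illustration only.
CONFIG_JSON = '{"REMOVE": ["resource.text.div"], "RENAME": {"resource.id": "id"}}'


def apply_config(rows: list[dict[str, Any]], config: dict[str, Any]) -> list[dict[str, Any]]:
    """Drop the columns listed under REMOVE, then rename columns per RENAME."""
    remove = set(config.get("REMOVE", []))
    rename = config.get("RENAME", {})
    return [
        {rename.get(key, key): value for key, value in row.items() if key not in remove}
        for row in rows
    ]


rows = [
    {
        "resource.id": "pat-1",
        "resource.text.div": "<div>narrative</div>",
        "resource.gender": "female",
    },
]
filtered = apply_config(rows, json.loads(CONFIG_JSON))

# After a change to the output columns, the filtered shape should stay the same.
assert filtered == [{"id": "pat-1", "resource.gender": "female"}]
```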

#### Adding CLI Commands

1. Edit `src/fhiry/main.py`

2. Use Click decorators for command definition

3. Add type hints and docstrings

4. Create tests in `tests/test_cli.py`

5. Update CLI documentation

#### Adding Dependencies

1. Add to `dependencies` in `pyproject.toml`

2. Run `uv sync` to update lock file

3. Check for obsolete dependencies with `make check`

4. Only add if absolutely necessary

## Key Configuration Files

- `pyproject.toml`: Dependencies, tool configuration, project metadata

- `Makefile`: Build, test, and development commands

- `.pre-commit-config.yaml`: Formatting and linting hooks

- `CONTRIBUTING.md`: Contribution guidelines

- `README.md`: Public API documentation and usage examples

## Best Practices

1. Always run `uv run pytest` before submitting changes

2. Respect HL7 FHIR specification standards when handling resources

3. Preserve or improve test coverage with new code

4. Use type hints (required by mypy configuration)

5. Follow existing code patterns before implementing new features

6. Target the `develop` branch for changes; never push directly to `main`

7. Keep dependencies minimal

8. Document public APIs with docstrings and README updates

9. Test with the sample FHIR data in `tests/resources/`

10. Use `uv` for all package management operations

## Example Usage

```python
# Class and method names are as given in this guide; README.md documents the public API.
from fhiry import FHIRy

# Process a FHIR bundle (bundle_data is a FHIR bundle loaded beforehand)
fhir = FHIRy(bundle_data)
df = fhir.to_dataframe()

# Process an NDJSON file
from fhiry import FHIRndjson

ndjson = FHIRndjson('path/to/file.ndjson')
df = ndjson.to_dataframe()

# Search a FHIR server
from fhiry import FHIRsearch

search = FHIRsearch('https://fhir.server.com', resource_type='Patient')
df = search.to_dataframe()
```
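For intuition about the NDJSON case, here is a minimal standard-library sketch of reading newline-delimited JSON one resource at a time. It does not show how `fhirndjson.py` is implemented. The `iter_ndjson` helper and the sample lines are hypothetical.

```python
import io
import json
from typing import Any, Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[dict[str, Any]]:
    """Yield one parsed resource per non-empty line of an NDJSON stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"Invalid JSON on line {line_number}: {exc}") from exc


# In-memory sample standing in for a bulk export file.
sample = io.StringIO(
    '{"resourceType": "Patient", "id": "pat-1"}\n'
    "\n"
    '{"resourceType": "Observation", "id": "obs-1"}\n'
)
resources = list(iter_ndjson(sample))
print([resource["resourceType"] for resource in resources])
```

Because the generator reads line by line, a large export never has to be held in memory as a single JSON document.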

## Important Notes

- This is a healthcare data processing tool; ensure HIPAA compliance and data security when handling patient data

- FHIR data structures are complex and nested, so the flattening logic is critical

- BigQuery integration requires appropriate GCP credentials

- LLM features require installing the `llm` extra: `uv sync --extra llm`

- Always validate output DataFrames against the expected FHIR resource schemas
