FHIRy Data Processor

Convert FHIR (Fast Healthcare Interoperability Resources) bundles, NDJSON files, and server queries into pandas DataFrames for health data analytics, machine learning, and AI applications.

What This Skill Does

FHIRy is a Python package that flattens complex nested FHIR healthcare data into tabular pandas DataFrames. It supports multiple FHIR data sources (bundles, NDJSON files, FHIR servers, BigQuery) and handles medical coding systems like SNOMED, LOINC, and ICD-10.

Instructions for AI Agent

Core Capabilities

You are working with the FHIRy project, which provides these capabilities:

Convert FHIR bundles to DataFrames

Process NDJSON FHIR files

Query FHIR servers and convert results

Query BigQuery FHIR datasets

Flatten nested FHIR resource structures

Extract medical coding systems (SNOMED, LOINC, ICD-10)

Filter and rename columns via JSON configuration

Support parallel processing for large datasets

Development Environment

**Python Requirements:**

Python 3.10+ (tested on 3.10, 3.11, 3.12)

Package manager: `uv` for dependency management

Setup: `uv sync` to install dependencies from pyproject.toml

**Project Structure:**

```
src/fhiry/              # Core modules
├── fhiry.py            # Bundle processor
├── fhirndjson.py       # NDJSON processor
├── fhirsearch.py       # FHIR server API
├── bqsearch.py         # BigQuery integration
├── flattenfhir.py      # Resource flattening
├── parallel.py         # Parallel processing
├── base_fhiry.py       # Base class
└── main.py             # CLI entry point
tests/                  # pytest test suite
docs/                   # MkDocs documentation
examples/               # Usage examples
```

Code Style Requirements

1. **Formatting:** Ruff formatter (120 char line length)

2. **Type Hints:** Required for all function signatures (enforced by mypy)

3. **Docstrings:** Google-style for classes and public methods

4. **Imports:** Auto-organized by ruff (isort-compatible)

5. **Type Checking:** No implicit optional types, return annotations required

Testing Protocol

**Framework:** pytest with coverage reporting

**Commands:**

```bash
uv run pytest --cov=src/fhiry tests/   # Run with coverage
uv run pytest tests/                   # Run without coverage
uv run pytest tests/test_specific.py   # Run specific file
```

**Conventions:**

Test files start with `test_`

Test functions start with `test_`

Use fixtures from `tests/conftest.py`

Sample FHIR data in `tests/resources/`

Maintain >70% test coverage
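A test module following these conventions might look like the sketch below. Everything in it is an assumption made for illustration: `patient_bundle` stands in for a fixture that would normally live in `tests/conftest.py`, and `count_resources` stands in for the code under test.

```python
# Hypothetical contents of a file such as tests/test_bundle_counts.py


def patient_bundle() -> dict:
    """Build a tiny in-memory bundle; in the real suite this would be a pytest fixture."""
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [
            {"resource": {"resourceType": "Patient", "id": "p1"}},
            {"resource": {"resourceType": "Observation", "id": "o1"}},
        ],
    }


def count_resources(bundle: dict, resource_type: str) -> int:
    """Hypothetical stand-in for the code under test."""
    return sum(
        1
        for entry in bundle.get("entry", [])
        if entry.get("resource", {}).get("resourceType") == resource_type
    )


def test_bundle_contains_one_patient() -> None:
    assert count_resources(patient_bundle(), "Patient") == 1


def test_empty_bundle_has_no_resources() -> None:
    # A bundle without an "entry" key should be handled without errors.
    assert count_resources({"resourceType": "Bundle"}, "Patient") == 0
```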

FHIR Domain Knowledge

**FHIR Basics:**

HL7 FHIR standard for healthcare data interoperability

Resources: Patient, Observation, Condition, Medication, Procedure, etc.

Bundles: Collections of related FHIR resources

NDJSON: Newline-delimited JSON for bulk FHIR exports

**Key Concepts:**

Resources have nested structures requiring flattening

CodeableConcept holds one or more `coding` entries, each naming a code system (SNOMED CT, LOINC, ICD-10) together with a `code` and an optional `display`

Resources reference each other (e.g., Observation → Patient)

FHIR Search API uses RESTful queries with specific parameters

BigQuery can hold FHIR datasets, for example FHIR stores exported from the Google Cloud Healthcare API
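To make the CodeableConcept and reference concepts concrete, here is a minimal standard-library sketch. The function name `codes_for_system` and the sample Observation are illustrative assumptions, not FHIRy code; the system URIs are the ones FHIR uses for LOINC and SNOMED CT.

```python
LOINC = "http://loinc.org"
SNOMED = "http://snomed.info/sct"


def codes_for_system(concept: dict, system: str) -> list[str]:
    """Return every code in a CodeableConcept that belongs to one code system."""
    return [
        coding["code"]
        for coding in concept.get("coding", [])
        if coding.get("system") == system and "code" in coding
    ]


# Hand-written sample resource (illustrative only).
observation = {
    "resourceType": "Observation",
    "id": "obs-1",
    "subject": {"reference": "Patient/p1"},
    "code": {
        "coding": [
            {"system": LOINC, "code": "8867-4", "display": "Heart rate"},
            {"system": SNOMED, "code": "364075005", "display": "Heart rate"},
        ],
        "text": "Heart rate",
    },
}

loinc_codes = codes_for_system(observation["code"], LOINC)
# The Observation points at its Patient through a relative reference string.
patient_id = observation["subject"]["reference"].split("/")[-1]
```

One column per code system, as sketched here, is only one possible tabular layout; the layout FHIRy itself produces should be checked against its documentation.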

Common Development Tasks

**Adding a New Resource Processor:**

1. Add the logic to the appropriate module (`fhiry.py`, `fhirsearch.py`, etc.)

2. Follow existing flattening patterns

3. Add complete type hints

4. Write tests with sample FHIR resources

5. Update documentation for public APIs

**Modifying DataFrame Output:**

1. Update `base_fhiry.py` or the specific processor

2. Test with multiple FHIR resource types

3. Verify config JSON filtering still works

4. Add tests for new column extraction logic
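When checking that config-driven filtering still works, it can help to have a small reference behaviour to compare against. The sketch below applies a remove/rename configuration to flattened rows using only the standard library. The `REMOVE` and `RENAME` keys and the function name `apply_column_config` are assumptions for illustration; confirm the actual configuration schema in `README.md` before relying on them.

```python
import json


def apply_column_config(rows: list[dict], config_json: str) -> list[dict]:
    """Drop and rename columns in flattened rows according to a JSON config.

    The "REMOVE" / "RENAME" keys are assumed here, not confirmed FHIRy behaviour.
    """
    config = json.loads(config_json)
    remove = set(config.get("REMOVE", []))
    rename = config.get("RENAME", {})
    return [
        # Skip removed columns, then map each remaining name through the rename table.
        {rename.get(col, col): val for col, val in row.items() if col not in remove}
        for row in rows
    ]


rows = [{"resource.id": "p1", "resource.text.div": "<div/>", "resource.gender": "female"}]
config = '{"REMOVE": ["resource.text.div"], "RENAME": {"resource.id": "id"}}'
filtered = apply_column_config(rows, config)
```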

**Adding CLI Commands:**

1. Edit `src/fhiry/main.py`

2. Use Click decorators for commands

3. Add tests in `tests/test_cli.py`

4. Update documentation

Dependencies

**Core:** pandas, google-cloud-bigquery, tqdm, click, numpy, timeago, prodict, responses, openpyxl

**Optional:** llama-index, langchain (installed through the `llm` extra, for LLM-based queries)

**Adding New Dependencies:**

1. Add to `dependencies` in `pyproject.toml`

2. Run `uv sync` to update lock file

3. Check for obsolete deps with `make check` (uses deptry)

Key Files to Review

Before making changes, check:

`pyproject.toml` - Dependencies, tool config, metadata

`Makefile` - Build, test, development commands

`.pre-commit-config.yaml` - Formatting and linting config

`CONTRIBUTING.md` - Contribution guidelines

`README.md` - Public API and usage examples

Best Practices

1. **Always run tests** before submitting changes

2. **Respect FHIR standards** - consult the HL7 FHIR specification

3. **Preserve test coverage** - add tests for new functionality

4. **Use type hints** - required by mypy configuration

5. **Follow existing patterns** - check similar code first

6. **Target develop branch** - never push directly to main

7. **Keep dependencies minimal** - only add if absolutely necessary

8. **Document public APIs** - update docstrings and README

9. **Test with real FHIR data** - use samples in tests/resources/

10. **Handle medical coding systems** - properly extract SNOMED, LOINC, ICD-10 codes

Example Usage Patterns

When helping users with FHIRy:

**Convert FHIR Bundle:**

```python
from fhiry import Fhiry

# bundle_dict: a FHIR Bundle already parsed into a Python dict (e.g. with json.load)
df = Fhiry.from_bundle(bundle_dict)
```
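For intuition about what bundle flattening involves, the following standard-library sketch turns each resource in a bundle into a flat row with dotted column names. It is not FHIRy's actual algorithm; the names `flatten` and `bundle_rows` and the column-naming scheme are assumptions made for illustration.

```python
from typing import Any


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Recursively flatten nested dicts and lists into dotted keys."""
    row: dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            row.update(flatten(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        # List positions become part of the column name (e.g. name.0.family).
        for index, child in enumerate(value):
            row.update(flatten(child, f"{prefix}{index}."))
    else:
        row[prefix[:-1]] = value  # drop the trailing dot
    return row


def bundle_rows(bundle: dict) -> list[dict[str, Any]]:
    """Return one flat row per resource found in the bundle's entries."""
    return [flatten(entry["resource"]) for entry in bundle.get("entry", []) if "resource" in entry]


bundle = {
    "resourceType": "Bundle",
    "entry": [
        {
            "resource": {
                "resourceType": "Patient",
                "id": "p1",
                "gender": "female",
                "name": [{"family": "Smith", "given": ["Ann"]}],
            }
        }
    ],
}
rows = bundle_rows(bundle)
```

A list of flat dicts like `rows` can be handed to `pandas.DataFrame(rows)` to obtain a table with one column per dotted key.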

**Process NDJSON File:**

```python
from fhiry import FhirNDJSON

df = FhirNDJSON.from_file("data.ndjson")
```
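As background on the file format itself, NDJSON stores one JSON resource per line, so it can be read lazily without loading the whole file. The sketch below uses only the standard library; `iter_ndjson` is a hypothetical helper and the sample data is invented for illustration.

```python
import io
import json
from typing import Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed resource per non-empty line of an NDJSON stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


# An in-memory stand-in for an open bulk-export file.
sample = io.StringIO(
    '{"resourceType": "Patient", "id": "p1"}\n'
    "\n"
    '{"resourceType": "Patient", "id": "p2"}\n'
)
resources = list(iter_ndjson(sample))
```

With a real file, the same function can be called on the handle returned by `open("data.ndjson", encoding="utf-8")`.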

**Query FHIR Server:**

```python
from fhiry import FhirSearch

df = FhirSearch("https://server.com/fhir").search("Patient", {"name": "Smith"})
```

**BigQuery FHIR Dataset:**

```python
from fhiry import BQSearch

# project_id and dataset_id: placeholders for your GCP project and BigQuery dataset names
df = BQSearch(project_id, dataset_id).search("Observation", limit=1000)
```

Notes

This skill is for the FHIRy Python package for healthcare data processing

Always maintain backward compatibility with existing FHIR resource processors

Consult the HL7 FHIR specification when handling new resource types

Test with diverse FHIR resource structures before finalizing changes
