Multi-factor health analysis using INDRA bio-ontology and structural causal models to discover synergistic interventions across interconnected conditions
Build production-grade systems medicine infrastructure that discovers synergistic interventions across multiple interconnected health conditions using INDRA bio-ontology and structural causal models.
This skill helps you architect and implement **systems medicine platforms** that:
1. **Analyze interconnected conditions** (e.g., inflammation + prediabetes) as unified syndromes rather than isolated diseases
2. **Discover shared molecular mechanisms** using INDRA bio-ontology for evidence-based causal pathways
3. **Identify synergistic interventions** where single changes (e.g., reducing air pollution) simultaneously benefit multiple conditions
4. **Integrate multi-factor data** including genetics, biomarkers, environmental exposures, and location history
5. **Calculate super-additive effects** from cross-pathway benefits, where the combined benefit exceeds the sum of the individual benefits
**Traditional Medicine** (siloed):
**Systems Medicine** (this skill):
**Patient**: 34-year-old with chronic inflammation (elevated CRP, IL-6) and emerging prediabetes (HbA1c 5.9%)
**Causal Model**:
```
PM2.5 → Oxidative Stress (ROS) → {
├─→ NF-κB → IL-6 → CRP (Inflammation)
└─→ JNK → IRS-1 inhibition → Insulin Resistance
}
```
**Insight**: A single intervention (reducing PM2.5 exposure) acts upstream of both branches, so it breaks the feedback loop and benefits both conditions at once.
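The shared-upstream structure of this model can be checked mechanically. The following is a minimal sketch, assuming the model is stored as a plain adjacency dict; the node names mirror the diagram above, and `ancestors` is a hypothetical helper, not part of INDRA.

```python
# Toy adjacency list mirroring the causal model above (edges point downstream).
CAUSAL_GRAPH = {
    "PM2.5": ["Oxidative Stress (ROS)"],
    "Oxidative Stress (ROS)": ["NF-κB", "JNK"],
    "NF-κB": ["IL-6"],
    "IL-6": ["CRP"],
    "JNK": ["IRS-1 inhibition"],
    "IRS-1 inhibition": ["Insulin Resistance"],
}


def ancestors(graph: dict, target: str) -> set:
    """Return every node that has a directed path to `target`."""
    # Build the reversed graph (child -> parents) once.
    parents = {}
    for src, dsts in graph.items():
        for dst in dsts:
            parents.setdefault(dst, set()).add(src)
    found, stack = set(), [target]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# Nodes upstream of both outcomes are candidate single-point interventions.
shared = ancestors(CAUSAL_GRAPH, "CRP") & ancestors(CAUSAL_GRAPH, "Insulin Resistance")
print(sorted(shared))
```

In this toy graph the only shared ancestors are PM2.5 and oxidative stress, which is why an exposure change can reach both conditions.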
1. **Define your integration strategy**:
- Direct Python imports (recommended for performance)
- HTTP API (for microservices architecture)
- Hybrid approach (local ontology + cloud services)
2. **Select core technologies**:
- **LangGraph**: Multi-agent orchestration (Supervisor, INDRA Query Agent, Web Researcher)
- **AWS Bedrock**: Claude Sonnet 4.5 for intelligent analysis
- **INDRA Bio-Ontology API**: Causal pathway discovery (https://db.indra.bio)
- **Local Graph Database**: Memgraph for fast ontology queries (<100ms)
3. **Design data models**:
```python
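from dataclasses import dataclass, field

# Sketched as dataclasses; field names follow the outline of the two models.
@dataclass
class UserContext:
    """User health context."""
    user_id: str
    genetics: dict = field(default_factory=dict)            # e.g., {'GSTM1': 'null'}
    current_biomarkers: dict = field(default_factory=dict)  # e.g., {'CRP': 5.2, 'IL-6': 3.8}
    location_history: list = field(default_factory=list)    # Environmental exposure tracking

@dataclass
class CausalDiscoveryRequest:
    """Causal discovery request."""
    request_id: str
    user_context: UserContext
    query: "Query"             # Query and RequestOptions are project types,
    options: "RequestOptions"  # quoted here as forward references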
```
1. **Deploy Memgraph graph database**:
```bash
docker-compose -f docker-compose.local-ontology.yml up -d
```
2. **Integrate four core ontologies**:
- **FPLX**: 579 protein families for pathway aggregation
- **GO**: 12,182 biological processes + 180,317 gene relationships
- **CHEBI**: 218,261 chemical compounds with hierarchical relationships
- **HGNC**: 34,667 genes (auto-created from GO relationships)
3. **Implement query strategy pattern**:
```python
from abc import ABC, abstractmethod
from typing import List

# GroundedEntity and CausalPath are project types, quoted as forward references.
class OntologyQueryStrategy(ABC):
    @abstractmethod
    async def ground_entity(self, entity_name: str) -> List["GroundedEntity"]:
        """Map natural language → database IDs"""

    @abstractmethod
    async def find_causal_path(self, source_id: str, target_id: str) -> List["CausalPath"]:
        """Discover mechanisms between entities"""
```
4. **Verify database health**:
```python
import asyncio
from indra_agent.services.local_ontology import MemgraphClient

async def check_health():
    client = MemgraphClient(uri='bolt://localhost:7687')
    return await client.get_stats()
# Expected: 265,689 entities, 464,894 relationships
print(asyncio.run(check_health()))
```
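To make the strategy pattern concrete, here is a minimal in-memory sketch that could stand in for a Memgraph- or API-backed strategy in unit tests. It is an assumption-laden illustration: the return types are simplified to plain ID strings, `InMemoryStrategy` is a hypothetical class, and the `demo:` IDs are placeholders rather than real ontology identifiers.

```python
import asyncio
from abc import ABC, abstractmethod
from collections import deque
from typing import Dict, List


class OntologyQueryStrategy(ABC):
    """Simplified copy of the interface, returning plain ID strings."""

    @abstractmethod
    async def ground_entity(self, entity_name: str) -> List[str]: ...

    @abstractmethod
    async def find_causal_path(self, source_id: str, target_id: str) -> List[str]: ...


class InMemoryStrategy(OntologyQueryStrategy):
    """Hypothetical stand-in for a database-backed strategy."""

    def __init__(self, names: Dict[str, str], edges: Dict[str, List[str]]):
        self.names = names  # lowercase name -> ontology ID
        self.edges = edges  # ID -> downstream IDs

    async def ground_entity(self, entity_name: str) -> List[str]:
        # Case-insensitive prefix search over the known names.
        key = entity_name.lower()
        return [eid for name, eid in self.names.items() if name.startswith(key)]

    async def find_causal_path(self, source_id: str, target_id: str) -> List[str]:
        # Breadth-first search: returns one shortest path, or [] if none exists.
        queue, seen = deque([[source_id]]), {source_id}
        while queue:
            path = queue.popleft()
            if path[-1] == target_id:
                return path
            for nxt in self.edges.get(path[-1], []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return []


strategy = InMemoryStrategy(
    names={"il-6": "demo:il6", "il-6 receptor": "demo:il6r", "crp": "demo:crp"},
    edges={"demo:pm25": ["demo:nfkb"], "demo:nfkb": ["demo:il6"], "demo:il6": ["demo:crp"]},
)
grounded = asyncio.run(strategy.ground_entity("IL-6"))
path = asyncio.run(strategy.find_causal_path("demo:pm25", "demo:crp"))
print(grounded, path)
```

Keeping both backends behind the same abstract interface lets the calling code switch between local and remote lookups without changes.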
1. **Create entity grounding service**:
- Map user queries (e.g., "CRP biomarkers") → INDRA database IDs
- Use local ontology for fast prefix search
- Fall back to INDRA API for complex entity resolution
2. **Build causal graph constructor**:
- Query INDRA for evidence-backed pathways
- Score relationships by paper count + confidence
- Apply genetic modifiers (e.g., GSTM1_null variants amplify oxidative stress)
3. **Implement multi-agent workflow**:
```
# LangGraph state machine
Supervisor → {
    INDRA Query Agent: Ground entities + discover pathways
    Web Researcher: Fetch environmental data (pollution, exposures)
} → Aggregate → Format response
```
4. **Cache common pathways** for reliability:
```python
CACHED_PATHS = {
    ('PM2.5', 'CRP'): {
        'path': ['PM2.5', 'NF-κB', 'IL-6', 'CRP'],
        'evidence_count': 312,
        'effect_size': 0.87
    }
}
```
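One possible way to use such a cache is as a fallback when the live lookup fails or times out. The sketch below assumes that policy; `get_path`, `fetch_live`, `unavailable`, and the `origin` field are hypothetical names introduced for illustration, and the 3-second timeout is only an example value.

```python
import asyncio
from typing import Awaitable, Callable, Optional

CACHED_PATHS = {
    ("PM2.5", "CRP"): {
        "path": ["PM2.5", "NF-κB", "IL-6", "CRP"],
        "evidence_count": 312,
        "effect_size": 0.87,
    }
}


async def get_path(
    source: str,
    target: str,
    fetch_live: Callable[[str, str], Awaitable[dict]],
    timeout_s: float = 3.0,
) -> Optional[dict]:
    """Try the live lookup first; fall back to the cache on failure or timeout."""
    try:
        result = await asyncio.wait_for(fetch_live(source, target), timeout=timeout_s)
        return {**result, "origin": "live"}
    except (asyncio.TimeoutError, ConnectionError):
        cached = CACHED_PATHS.get((source, target))
        return {**cached, "origin": "cache"} if cached else None


async def unavailable(source: str, target: str) -> dict:
    # Hypothetical stand-in for a remote call that fails.
    raise ConnectionError("service unavailable")


result = asyncio.run(get_path("PM2.5", "CRP", unavailable))
print(result["origin"], result["path"])
```

Tagging each result with its origin lets the response formatter tell users when they are seeing cached rather than freshly queried evidence.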
1. **Implement health query detection**:
```python
HEALTH_KEYWORDS = [
    'biomarker', 'crp', 'il-6', 'inflammation',
    'pollution', 'pm2.5', 'air quality',
    'gene', 'genetic', 'variant', 'gstm1',
    'causal', 'pathway', 'mechanism'
]

def is_health_query(text: str) -> bool:
    lowered = text.lower()  # lowercase once; matching is by substring
    return any(kw in lowered for kw in HEALTH_KEYWORDS)
```
2. **Route queries intelligently**:
- Health queries → INDRA Agent (bio-ontology analysis)
- General queries → LLM fallback (conversational AI)
3. **Process INDRA requests**:
```python
import uuid

# Runs inside an async message handler; db and indra_client are application objects.
request = CausalDiscoveryRequest(
    request_id=str(uuid.uuid4()),
    user_context=UserContext(
        user_id=str(user_id),
        genetics=db.get_user_genetics(user_id),
        current_biomarkers=db.get_user_biomarkers(user_id),
        location_history=db.get_user_locations(user_id)
    ),
    query=Query(text=message_text),
    options=RequestOptions()
)
response = await indra_client.process_request(request)
```
4. **Format evidence-based responses**:
```
🧬 Health Intelligence Report
📊 Key Insights:
1. [Top-level finding with mechanism]
2. [Synergistic effects identified]
3. [Evidence base summary]
🔬 Causal Analysis:
• X biological entities identified
• Y causal relationships found
• Based on Z scientific papers
🔗 Top Causal Pathways:
Entity A ⬆️ Entity B
Evidence: N papers, Effect: 0.XX, Lag: Xh
💡 [Clinical interpretation]
```
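The detection and routing steps above can be tied together in a few lines. This is a minimal sketch, assuming two async handlers; `handle_indra`, `handle_llm`, and `route` are hypothetical placeholders for the INDRA agent call and the conversational fallback.

```python
import asyncio

HEALTH_KEYWORDS = [
    "biomarker", "crp", "il-6", "inflammation",
    "pollution", "pm2.5", "air quality",
    "gene", "genetic", "variant", "gstm1",
    "causal", "pathway", "mechanism",
]


def is_health_query(text: str) -> bool:
    lowered = text.lower()
    return any(kw in lowered for kw in HEALTH_KEYWORDS)


async def handle_indra(text: str) -> str:
    # Hypothetical placeholder for the bio-ontology analysis path.
    return f"[INDRA] {text}"


async def handle_llm(text: str) -> str:
    # Hypothetical placeholder for the conversational fallback path.
    return f"[LLM] {text}"


async def route(text: str) -> str:
    """Send health queries to the INDRA handler and everything else to the LLM."""
    handler = handle_indra if is_health_query(text) else handle_llm
    return await handler(text)


health_reply = asyncio.run(route("How does PM2.5 affect my CRP?"))
general_reply = asyncio.run(route("What's the weather tomorrow?"))
print(health_reply)
print(general_reply)
```

Because the check is a substring match, a word such as "general" also triggers the `gene` keyword; a stricter deployment might switch to token-level matching.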
1. **Store user health context**:
```python
# MongoDB or PostgreSQL
db.set_user_attribute(user_id, 'health_genetics', {
    'GSTM1': 'null',   # Glutathione S-transferase deletion
    'CYP1A1': 'T/T'    # Cytochrome P450 variant
})
db.set_user_attribute(user_id, 'health_biomarkers', {
    'CRP': 5.2,    # mg/L (inflammation marker)
    'IL-6': 3.8,   # pg/mL (cytokine)
    'HbA1c': 5.9   # % (diabetes marker)
})
db.set_user_attribute(user_id, 'health_location_history', [
    {
        'city': 'Los Angeles',
        'start_date': '2024-01-01',
        'end_date': '2024-12-31',
        'avg_pm25': 35   # µg/m³
    }
])
```
2. **Apply genetic modifiers**:
- GSTM1_null → 1.4x oxidative stress amplification
- CYP1A1 variants → altered xenobiotic metabolism
3. **Calculate synergy scores**:
```python
synergy_score = (observed_benefit) / (sum_of_independent_benefits)
# Score > 1.0 indicates super-additive effects
# Score = 1.34 → 34% synergistic benefit from cross-pathway interactions
```
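As a worked version of the two calculations in this list, the sketch below implements the synergy ratio and a simple multiplicative genetic modifier. The function names and the input benefit values are hypothetical; the inputs were chosen only to reproduce the 1.34 score, and the 1.4 multiplier is the GSTM1_null factor quoted above.

```python
import math
from typing import Sequence


def synergy_score(observed_benefit: float, independent_benefits: Sequence[float]) -> float:
    """Ratio of the observed combined benefit to the sum of the independent benefits."""
    expected = sum(independent_benefits)
    if expected <= 0:
        raise ValueError("independent benefits must sum to a positive value")
    return observed_benefit / expected


def apply_genetic_modifier(effect: float, multiplier: float = 1.0) -> float:
    """Scale a baseline effect by a genotype-specific multiplier."""
    return effect * multiplier


# Hypothetical benefit values: 0.67 observed vs. 0.30 + 0.20 expected independently.
score = synergy_score(0.67, [0.30, 0.20])
# Illustrative baseline effect of 0.82 scaled by the 1.4x GSTM1_null multiplier.
modified = apply_genetic_modifier(0.82, 1.4)
print(round(score, 2), round(modified, 2))
```

A score of exactly 1.0 would mean the benefits simply add up; anything below 1.0 indicates the interventions partly overlap.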
1. **Docker containerization** (single container for simplicity):
```dockerfile
FROM cgr.dev/chainguard-private/python:3.11-dev
# Install INDRA agent dependencies
COPY indra_agent /opt/indra_agent
COPY pyproject.toml /opt/pyproject.toml
RUN cd /opt && pip3 install -e .
# Install application dependencies
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt
# Copy application code
COPY . /code
WORKDIR /code
CMD ["python3", "app.py"]
```
2. **Environment configuration**:
```bash
# AWS Bedrock (Claude Sonnet 4.5)
AWS_ACCESS_KEY_ID=your-key
AWS_SECRET_ACCESS_KEY=your-secret
AWS_REGION=us-east-1
# INDRA Bio-Ontology
INDRA_BASE_URL=https://db.indra.bio
# Local Ontology Database
MEMGRAPH_URI=bolt://localhost:7687
# Optional: Environmental data APIs
IQAIR_API_KEY=your-iqair-key
```
3. **Run with Docker Compose**:
```yaml
services:
  health_intelligence:
    build:
      context: .
    env_file: config/.env
    ports:
      - "8000:8000"
    depends_on:
      - memgraph
  memgraph:
    image: memgraph/memgraph:latest
    ports:
      - "7687:7687"
    volumes:
      - memgraph_data:/var/lib/memgraph

# Named volumes must be declared at the top level
volumes:
  memgraph_data:
```
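On the application side, the environment variables listed above could be read into one typed settings object at startup. This is a minimal sketch using only the standard library; `Settings` and `load_settings` are hypothetical names, and the defaults simply repeat the values shown in the environment file.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    aws_region: str
    indra_base_url: str
    memgraph_uri: str
    iqair_api_key: Optional[str]  # optional environmental data API key


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    return Settings(
        aws_region=env.get("AWS_REGION", "us-east-1"),
        indra_base_url=env.get("INDRA_BASE_URL", "https://db.indra.bio"),
        memgraph_uri=env.get("MEMGRAPH_URI", "bolt://localhost:7687"),
        iqair_api_key=env.get("IQAIR_API_KEY"),
    )


# Passing a plain dict instead of os.environ keeps the loader easy to test.
settings = load_settings({"MEMGRAPH_URI": "bolt://memgraph:7687"})
print(settings.memgraph_uri)
```

When the application runs as a Compose service, `localhost` refers to its own container, so `MEMGRAPH_URI` would normally point at the service name (`bolt://memgraph:7687`) as in the usage line.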
**Input**: "If I move from Los Angeles (PM2.5: 35 µg/m³) to Seattle (PM2.5: 10 µg/m³), how will my inflammation and metabolic markers respond?"
**Output**:
```
🧬 Health Intelligence Report
📊 Predicted Changes:
• CRP: 5.2 → 4.36 mg/L (-16%, still above the 3 mg/L high-risk cutoff)
• IL-6: 3.8 → 3.12 pg/mL (-18%)
• HbA1c: 5.9% → 4.77% (-19%, exits prediabetes range)
🎯 Synergy Score: 1.34 (34% super-additive benefit)
🔬 Critical Pathway:
PM2.5 ⬇️ ROS ⬇️ NF-κB ⬇️ {IL-6 + Insulin Resistance}
Evidence: 312 papers
Mechanism: Breaks inflammation-insulin resistance feedback loop
💡 A single environmental intervention provides super-additive benefits across two interconnected conditions.
```
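The percentages in this report are relative changes, which can be verified with a one-line helper. A small sketch (`percent_change` is a hypothetical name; the before and after values are taken from the report above):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


REPORT_VALUES = [("CRP", 5.2, 4.36), ("IL-6", 3.8, 3.12), ("HbA1c", 5.9, 4.77)]
changes = {name: percent_change(before, after) for name, before, after in REPORT_VALUES}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

For HbA1c, which is itself reported in percent, the roughly 19% relative change corresponds to an absolute drop of 1.13 percentage points, so reports should state which of the two is meant.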
**Input**: "How does my GSTM1_null variant affect my response to air pollution?"
**Output**:
```
🧬 Genetic Modifier Analysis
⚠️ GSTM1_null Status: Detected
Effect: 1.4x oxidative stress amplification
📊 Your Risk Profile:
• Baseline PM2.5 → CRP effect: 0.82
• With GSTM1_null: 1.15 (40% higher sensitivity)
🔬 Mechanism:
GSTM1 normally detoxifies oxidative stress byproducts
Null variant → impaired glutathione conjugation
Result: Elevated ROS accumulation from pollution exposure
💡 Recommendation: Prioritize low-pollution environments and antioxidant interventions (evidence-based targets: Nrf2 pathway activation).
```
1. **Evidence Quality**: Always cite paper counts and confidence scores from INDRA. Low-evidence pathways (<10 papers) should be flagged as preliminary.
2. **Clinical Disclaimers**: This system provides research insights, not medical advice. Include appropriate disclaimers in user-facing interfaces.
3. **Performance Targets**:
- Entity grounding: <200ms (local ontology)
- Causal path discovery: <3 seconds (INDRA API + caching)
- Total query processing: <5 seconds end-to-end
4. **Data Privacy**: Health data is PHI (Protected Health Information). Use encryption at rest and in transit. Consider HIPAA compliance for US deployments.
5. **Fallback Strategies**:
- Pre-cache common pathways (PM2.5→CRP, Inflammation→Insulin Resistance)
- Implement graceful degradation if INDRA API is unavailable
- Provide generic health information when personalization data is missing
6. **Known Limitations**:
- CHEBI ontology: Only hierarchical relationships, no causal interactions
- Genetic variants: Limited to well-studied SNPs (GSTM1, CYP1A1, etc.)
- Environmental data: Depends on third-party APIs (IQAir, EPA)
7. **ID Format Issues**: Memgraph may create double-prefixed IDs (GO:go:12 vs go:12). Implement normalization in grounding service:
```python
def normalize_id(entity_id: str) -> str:
    # Strip duplicate prefixes, e.g. 'GO:go:12' -> 'go:12'
    if ':' in entity_id:
        prefix, rest = entity_id.split(':', 1)
        if rest.startswith(prefix.lower() + ':'):
            return rest
    return entity_id
```
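The evidence-quality rule from the first item in this list (flag pathways with fewer than 10 papers as preliminary) is easy to enforce in code. A minimal sketch, assuming pathways are dicts with an `evidence_count` key as in the cached-path example; `label_evidence` and the second pathway record are hypothetical.

```python
from typing import Dict, List

PRELIMINARY_THRESHOLD = 10  # papers; pathways below this are flagged as preliminary


def label_evidence(pathways: List[Dict]) -> List[Dict]:
    """Return copies of the pathways with a boolean `preliminary` flag added."""
    return [
        {**p, "preliminary": p["evidence_count"] < PRELIMINARY_THRESHOLD}
        for p in pathways
    ]


# Illustrative records: the second entry is invented to show the flag firing.
labeled = label_evidence([
    {"path": ["PM2.5", "CRP"], "evidence_count": 312},
    {"path": ["X", "Y"], "evidence_count": 4},
])
for entry in labeled:
    print(entry["path"], "preliminary" if entry["preliminary"] else "established")
```

Returning copies rather than mutating the input keeps cached pathway records unchanged between requests.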
Your implementation should:
1. ✅ Identify synergistic interventions across interconnected conditions
2. ✅ Ground entities to INDRA IDs with >90% accuracy for common biomarkers
3. ✅ Discover evidence-backed causal pathways in <3 seconds
4. ✅ Apply genetic modifiers correctly (e.g., GSTM1_null amplification)
5. ✅ Calculate synergy scores with transparent methodology
6. ✅ Provide actionable insights with clinical context
7. ✅ Handle graceful degradation when services are unavailable