Initialize and work with a Julia+Demetrios hybrid scientific computing project for analyzing operator-defined symmetries in bacterial genomes. Handles architecture setup, cross-validation, NCBI data pipelines, and Scientific Data publication requirements.
A specialized skill for the Darwin Operator Symmetry Atlas (DOSA), a hybrid Julia+Demetrios scientific computing project that analyzes operator-defined symmetries in bacterial genomes.
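This skill helps you navigate and work with a complex scientific computing project that combines a pure-Julia reference implementation, Demetrios performance kernels, NCBI data pipelines, cross-validation between layers, and Scientific Data publication requirements.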
The project uses a four-layer architecture (Layer 0 through Layer 3), listed here from top to bottom:
- **Layer 3 (Artifacts)**: CSV/JSONL/Parquet outputs with Zenodo DOI
- **Layer 2 (Demetrios)**: High-performance kernels with units of measure and refinement types
- **Layer 1 (Julia)**: Orchestration, NCBI fetch, validation
- **Layer 0 (Julia Pure)**: Reference implementation for cross-validation
When working with this project, ALWAYS first verify the canonical directory structure:
```
darwin-atlas/
├── CLAUDE.md              # Project instructions
├── demetrios/             # Layer 2 implementation
│   ├── src/
│   │   ├── operators.d
│   │   ├── exact_symmetry.d
│   │   ├── approx_metric.d
│   │   ├── quaternion.d
│   │   └── ffi.d
├── julia/                 # Layers 0+1 implementation
│   ├── src/
│   │   ├── Operators.jl
│   │   ├── ExactSymmetry.jl
│   │   ├── ApproxMetric.jl
│   │   ├── QuaternionLift.jl
│   │   ├── NCBIFetch.jl
│   │   ├── Validation.jl
│   │   ├── DemetriosFFI.jl
│   │   └── CrossValidation.jl
├── data/                  # Outputs and manifest
└── Makefile               # Build orchestration
```
ALWAYS enforce these non-negotiable rules:
When implementing or validating operators, use these exact definitions:
| Operator | Symbol | Definition | Group |
|----------|--------|------------|-------|
| Identity | I | σ(i) = s_i | D_4 |
| Reverse | R | σ(i) = s_{n-1-i} | D_4 |
| Complement | K | σ(i) = complement(s_i) | D_4 |
| Rev-Comp | RC | σ(i) = complement(s_{n-1-i}) | D_4 |
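
The four definitions in the table can be sketched in a few lines of Julia. This is a minimal illustration on plain `String` values restricted to uppercase A/C/G/T, not the project's `Operators.jl`; the names `BASE_COMPLEMENT`, `op_identity`, `op_reverse`, `op_complement`, and `op_revcomp` are hypothetical.

```julia
# Complement map for the four canonical bases.
# Assumption: input contains only uppercase A/C/G/T (no ambiguity codes).
const BASE_COMPLEMENT = Dict('A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C')

# I: sigma(i) = s_i
op_identity(s::String) = s

# R: sigma(i) = s_{n-1-i}
op_reverse(s::String) = reverse(s)

# K: sigma(i) = complement(s_i)
op_complement(s::String) = map(c -> BASE_COMPLEMENT[c], s)

# RC: sigma(i) = complement(s_{n-1-i})
op_revcomp(s::String) = op_complement(op_reverse(s))

op_reverse("AACG")     # "GCAA"
op_complement("AACG")  # "TTGC"
op_revcomp("AACG")     # "CGTT"
```

Each non-identity operator is its own inverse and the operators commute, so together with the identity they close into a group of four elements. That count is the divisor used in the orbit-ratio definition.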
When working with outputs, validate against these canonical schemas:
- **atlas_replicons.csv**: `assembly_accession`, `replicon_id`, `replicon_type`, `length_bp`, `gc_fraction`, `taxonomy_id`, `checksum_sha256`
- **atlas_windows_exact.csv**: `replicon_id`, `window_length`, `window_start`, `orbit_ratio`, `is_palindrome_R`, `is_fixed_RC`, `orbit_size`
- **approx_symmetry_stats.csv**: `replicon_id`, `window_length`, `d_min`, `d_min_over_L`, `transform_family`
- **dicyclic_lifts.csv**: `dihedral_order`, `verified_double_cover`, `lift_group`, `relations_satisfied`
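
One way to check an output file against these schemas is to compare its header line with the expected column list. The sketch below uses only Base Julia and assumes a plain comma-separated header without quoting; `CANONICAL_SCHEMAS` and `check_header` are hypothetical names, not part of the project.

```julia
# Expected column order per output table, copied from the schema list.
const CANONICAL_SCHEMAS = Dict(
    "atlas_replicons.csv" => ["assembly_accession", "replicon_id", "replicon_type",
                              "length_bp", "gc_fraction", "taxonomy_id", "checksum_sha256"],
    "atlas_windows_exact.csv" => ["replicon_id", "window_length", "window_start",
                                  "orbit_ratio", "is_palindrome_R", "is_fixed_RC", "orbit_size"],
    "approx_symmetry_stats.csv" => ["replicon_id", "window_length", "d_min",
                                    "d_min_over_L", "transform_family"],
    "dicyclic_lifts.csv" => ["dihedral_order", "verified_double_cover",
                             "lift_group", "relations_satisfied"],
)

# Compare a CSV header line with the canonical schema for `table`.
# Assumption: the header is unquoted and comma-separated.
function check_header(table::AbstractString, header_line::AbstractString)
    expected = CANONICAL_SCHEMAS[table]
    actual = String.(strip.(split(header_line, ',')))
    return (
        ok = actual == expected,                      # same names, same order
        missing_columns = setdiff(expected, actual),  # required but absent
        extra_columns = setdiff(actual, expected),    # present but not in schema
    )
end
```

For a file on disk, the header could be obtained with `readline(path)` and passed to `check_header` before any row-level validation runs.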
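When writing Julia code:
```julia
using BioSequences  # provides LongDNA{4}

"""
    orbit_ratio(seq::LongDNA{4}) -> Float64

Compute orbit ratio: |orbit| / |D₄|.
"""
function orbit_ratio(seq::LongDNA{4})
    # `orbit_size` is defined elsewhere in the project.
    return orbit_size(seq) / 4.0
end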
```
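
To make the ratio concrete, here is a minimal sketch of how an orbit size could be computed by collecting the distinct images of a sequence under I, R, K, and RC. It works on plain `String` values with uppercase A/C/G/T rather than `LongDNA{4}`, and `BASE_PAIR`, `complement_str`, `orbit_size_sketch`, and `orbit_ratio_sketch` are hypothetical names, not the project's implementation.

```julia
# Assumption: sequences are uppercase A/C/G/T strings.
const BASE_PAIR = Dict('A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C')
complement_str(s::String) = map(c -> BASE_PAIR[c], s)

# Count the distinct images of `s` under I, R, K, RC.
function orbit_size_sketch(s::String)
    images = Set([s, reverse(s), complement_str(s), complement_str(reverse(s))])
    return length(images)
end

orbit_ratio_sketch(s::String) = orbit_size_sketch(s) / 4.0

orbit_ratio_sketch("AACG")  # 1.0: all four images differ
orbit_ratio_sketch("ACGT")  # 0.5: fixed by RC
orbit_ratio_sketch("AAAA")  # 0.5: fixed by R
```

Because complementing a nonempty A/C/G/T string always changes it, this sketch only ever returns 0.5 or 1.0 for nonempty input.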
When writing Demetrios code:
```d
// Use units of measure for physical quantities
// Use refinement types for domain constraints
// Explicit effect declarations
// FFI exports with #[export] #[no_mangle]

type OrbitRatio = { r: f64 | 0.25 <= r && r <= 1.0 }

pub fn orbit_ratio(seq: &DNASeq) -> OrbitRatio with Alloc {
    let size = orbit_size(seq) as f64
    size / 4.0
}
```
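
On the Julia side, the same domain constraint can be mirrored with a checked constructor so that both layers reject out-of-range values. This is a sketch under that assumption; `CheckedOrbitRatio` is a hypothetical type, not part of the project.

```julia
# Julia counterpart of the refinement type { r: f64 | 0.25 <= r && r <= 1.0 }.
# The inner constructor enforces the bound at construction time.
struct CheckedOrbitRatio
    value::Float64
    function CheckedOrbitRatio(r::Real)
        0.25 <= r <= 1.0 || throw(DomainError(r, "orbit ratio must lie in [0.25, 1.0]"))
        return new(Float64(r))
    end
end

CheckedOrbitRatio(0.5).value  # 0.5
```

Unlike a refinement type, this check runs at runtime, so it catches violations only on the code paths that tests actually exercise.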
When implementing any algorithm:
1. Implement in Julia first (Layer 0 - reference)
2. Test Julia implementation thoroughly
3. Implement in Demetrios (Layer 2 - performance)
4. Create FFI wrappers in `DemetriosFFI.jl`
5. Run cross-validation with tolerance: 0 for discrete, 1e-12 for floats
6. **ANY DIVERGENCE IS A BLOCKING BUG** - debug until resolved
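
The tolerance rule in step 5 can be expressed through dispatch: integers must match exactly, floats may differ by at most 1e-12. The sketch below is illustrative only; `agrees` and `cross_validate` are hypothetical names, and the project's `CrossValidation.jl` may be structured differently.

```julia
# Discrete outputs must match exactly (tolerance 0).
agrees(a::Integer, b::Integer) = a == b

# Floating-point outputs may differ by at most `atol`.
agrees(a::AbstractFloat, b::AbstractFloat; atol::Float64 = 1e-12) = abs(a - b) <= atol

# Return the indices at which the reference (Layer 0) and candidate (Layer 2)
# outputs disagree. An empty result means the two layers agree.
function cross_validate(reference::AbstractVector, candidate::AbstractVector)
    length(reference) == length(candidate) || error("length mismatch")
    return [i for i in eachindex(reference, candidate)
            if !agrees(reference[i], candidate[i])]
end

cross_validate([4, 2, 4], [4, 2, 4])            # empty: no divergence
cross_validate([1.0, 0.5], [1.0, 0.5 + 1e-9])   # [2]: blocking divergence
```

Returning the offending indices rather than a single Boolean makes it easier to find the first diverging window when debugging.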
| Error Type | Action |
|------------|--------|
| Compilation error | Fix immediately, do not proceed |
| Test failure | Debug root cause, fix before continuing |
| Cross-validation divergence | **STOP** - this is critical |
| NCBI fetch failure | Retry with exponential backoff |
| Memory issue | Profile, optimize, or batch |
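
For the NCBI fetch row, retrying with exponential backoff could look like the following. This is a generic sketch, not the project's `NCBIFetch.jl`; `with_backoff`, `max_attempts`, and `base_delay` are hypothetical names, and the delay values are placeholders.

```julia
# Call `f()` up to `max_attempts` times, doubling the wait after each failure.
# The last failure is rethrown so the caller still sees the original error.
function with_backoff(f; max_attempts::Int = 5, base_delay::Float64 = 1.0)
    for attempt in 1:max_attempts
        try
            return f()
        catch
            attempt == max_attempts && rethrow()
            delay = base_delay * 2.0^(attempt - 1)  # 1x, 2x, 4x, ... the base delay
            @warn "attempt failed; retrying" attempt delay
            sleep(delay)
        end
    end
end
```

A fetch call would be wrapped as `with_backoff() do ... end`. Base Julia also ships `retry` with `ExponentialBackOff`, which may be preferable where jitter and a capped delay are wanted.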
Before marking any component complete, verify:
When asked to build or test:
```bash
make all
make julia
make demetrios
make test
make cross-validate
make pipeline
make reproduce
```
Follow this implementation order:
**Phase 1 (Foundation)**:
**Phase 2 (Core Algorithms)**:
**Phase 3 (Pipeline)**:
**Phase 4 (Outputs)**:
Use these commit prefixes:
```
feat: add quaternion lift verification
fix: correct circular window extraction
docs: update schema documentation
test: add property-based tests for operators
refactor: extract common validation logic
```
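
If the prefix convention should be checked automatically, a small predicate is enough. The sketch below assumes the format is `prefix: message` with a single space after the colon, which is inferred from the examples rather than stated by the project; `COMMIT_PREFIXES`, `COMMIT_PATTERN`, and `is_valid_commit` are hypothetical names.

```julia
# Allowed prefixes, taken from the examples above.
const COMMIT_PREFIXES = ("feat", "fix", "docs", "test", "refactor")

# Assumption: a valid subject is "<prefix>: <non-empty message>".
const COMMIT_PATTERN = Regex("^(" * join(COMMIT_PREFIXES, "|") * "): \\S")

is_valid_commit(msg::AbstractString) = occursin(COMMIT_PATTERN, msg)

is_valid_commit("feat: add quaternion lift verification")  # true
is_valid_commit("update stuff")                            # false
```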
After each major component, provide:
1. What was implemented
2. Test results summary (pass/fail counts)
3. Any deviations from plan
4. Next steps
When stuck, report:
1. What is blocking
2. What was attempted
3. Proposed solutions
4. Decision needed from user
**Example 1: Implementing a new operator**
1. Add function to `julia/src/Operators.jl` with docstring
2. Add unit tests to `julia/test/runtests.jl`
3. Run `make julia` to verify
4. Implement equivalent in `demetrios/src/operators.d`
5. Export via FFI in `demetrios/src/ffi.d`
6. Create wrapper in `julia/src/DemetriosFFI.jl`
7. Run `make cross-validate` and verify identical outputs
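
For step 2, the unit tests can include randomized property checks alongside fixed cases. The sketch below uses the `Test` and `Random` standard libraries on plain A/C/G/T strings; `rev_op`, `comp_op`, and `rc_op` are hypothetical stand-ins for the functions in `Operators.jl`.

```julia
using Test
using Random

# Stand-in operators on uppercase A/C/G/T strings (assumption: no ambiguity codes).
const COMP_TABLE = Dict('A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C')
rev_op(s::String) = reverse(s)
comp_op(s::String) = map(c -> COMP_TABLE[c], s)
rc_op(s::String) = comp_op(rev_op(s))

@testset "operator group laws (sketch)" begin
    rng = MersenneTwister(42)  # fixed seed keeps the test reproducible
    for _ in 1:200
        s = String(rand(rng, ['A', 'C', 'G', 'T'], rand(rng, 1:50)))
        @test rev_op(rev_op(s)) == s                    # R is an involution
        @test comp_op(comp_op(s)) == s                  # K is an involution
        @test rc_op(rc_op(s)) == s                      # RC is an involution
        @test rev_op(comp_op(s)) == comp_op(rev_op(s))  # R and K commute
        @test comp_op(s) != s                           # K never fixes a nonempty sequence
    end
end
```

The same properties can then be asserted against the Demetrios wrappers in step 7, since any law that holds in Layer 0 should hold identically in Layer 2.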
**Example 2: Adding a new data table**
1. Define schema in CLAUDE.md with types and constraints
2. Add type definition to `julia/src/Types.jl`
3. Implement generation logic
4. Add validation in `julia/src/Validation.jl`
5. Update pipeline script to generate table
6. Verify CSV conforms to schema
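
As an illustration of step 3, the sketch below builds one row in the column order of `atlas_replicons.csv`. It assumes the checksum is taken over the raw sequence string and that sequences are uppercase A/C/G/T; neither is confirmed by the project. `gc_fraction_sketch` and `replicon_row` are hypothetical names, and the identifiers in the usage line are placeholders, not real accessions.

```julia
using SHA  # standard library: sha256

# Fraction of G and C bases; returns NaN for an empty sequence.
gc_fraction_sketch(seq::AbstractString) = count(c -> c in ('G', 'C'), seq) / length(seq)

# Build one row with fields in the same order as the atlas_replicons.csv schema.
function replicon_row(; assembly_accession, replicon_id, replicon_type, taxonomy_id, sequence)
    return (
        assembly_accession = assembly_accession,
        replicon_id = replicon_id,
        replicon_type = replicon_type,
        length_bp = length(sequence),
        gc_fraction = gc_fraction_sketch(sequence),
        taxonomy_id = taxonomy_id,
        # Assumption: checksum of the raw sequence text, hex-encoded.
        checksum_sha256 = bytes2hex(sha256(sequence)),
    )
end

row = replicon_row(assembly_accession = "EXAMPLE_ASSEMBLY", replicon_id = "EXAMPLE_REPLICON_1",
                   replicon_type = "chromosome", taxonomy_id = 0, sequence = "ACGTGGCC")
row.gc_fraction  # 0.75
```

Keeping the named tuple in schema order means the field names can be written out directly as the CSV header, which makes the conformance check in step 6 straightforward.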