IIIF Digital Library Manager
This skill helps AI agents manage IIIF (International Image Interoperability Framework) manifest generation systems for digital book collections, specifically designed for East Asian digital libraries.
What This Skill Does
Enables AI agents to work with a complete digital library pipeline that:
- Extracts book metadata and structure from HTML archive files
- Generates IIIF Presentation API 3.0 compliant manifests
- Creates browsable HTML indexes for manifest collections
- Validates manifests for IIIF compliance
- Generates database schemas for persistent storage
- Handles multi-language metadata (Chinese, Japanese, English)
- Processes complex volume structures and page ranges

Instructions
1. Understanding the Project Architecture
First, identify these core components in the codebase:
**Data Pipeline Scripts:**
- `index.ts` - HTML parsing and metadata extraction
- `generate-iiif-manifests.ts` - IIIF manifest generation
- `update-index.ts` - HTML index creation
- `validate-manifests.ts` - Manifest validation
- `generate-prisma-schema.ts` - Database schema generation

**Key Data Structures:**
- `BookEntry` - Core book metadata with structure information
- `BookVolume` - Individual volume/chapter with page ranges
- `IIIFManifest` - IIIF Presentation API 3.0 compliant structure
- `LibraryData` - Complete library collection with statistics

**Directory Structure:**
- `html/` - Input HTML files with book metadata
- `manifests/` - Output IIIF manifest files and index
- `toho-data.json` - Intermediate parsed book data

2. Running the Processing Pipeline
Execute the pipeline in this order:
**Step 1: Parse HTML files**
```bash
bun run index.ts
```
This extracts book metadata from HTML files and generates `toho-data.json`.
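As a rough illustration of what this step involves, the sketch below pulls the volume-level JavaScript variables (`volNum`, `volName`, `volStartPos`, `volMaxPage`, `bookNum`) out of a volume page. It assumes the variables appear as simple literal assignments such as `var volNum = 3;`; the actual declaration syntax in the archive files, and the `VolumeVars` type and `extractVolumeVars` function, are assumptions made for this example rather than the parser in `index.ts`.

```typescript
// Hypothetical result shape; the real field types in the codebase may differ.
interface VolumeVars {
  volNum?: string;
  volName?: string;
  volStartPos?: string;
  volMaxPage?: string;
  bookNum?: string;
}

const VOLUME_VAR_NAMES = ["volNum", "volName", "volStartPos", "volMaxPage", "bookNum"] as const;

// Find `name = value` assignments in the page source.
// Only quoted strings and bare literals (e.g. numbers) are recognized.
function extractVolumeVars(html: string): VolumeVars {
  const result: VolumeVars = {};
  for (const name of VOLUME_VAR_NAMES) {
    const pattern = new RegExp(`\\b${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)'|([^;\\s]+))`);
    const match = pattern.exec(html);
    if (match) {
      // Exactly one of the three capture groups is set for a given match.
      result[name] = match[1] ?? match[2] ?? match[3];
    }
  }
  return result;
}

// Made-up sample input for illustration only.
const sampleVolumePage =
  `<script>var volNum = 3; var volName = "卷三"; var volStartPos = 41; var volMaxPage = 58; var bookNum = "A001";</script>`;
console.log(extractVolumeVars(sampleVolumePage));
```

Values are kept as strings here so that the caller decides how to convert them; variables that are missing from a page are simply left undefined.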
**Step 2: Generate IIIF manifests**
```bash
bun run generate-manifests
```
Creates IIIF Presentation API 3.0 compliant JSON manifests in the `manifests/` directory.
**Step 3: Validate manifests**
```bash
bun run validate-manifests
```
Checks generated manifests for IIIF compliance and structural issues.
**Step 4: Update browsable index**
```bash
bun run update-index
```
Generates an HTML interface for browsing the manifest collection.
3. HTML Parsing Logic
When working with HTML input files, understand these patterns:
**Expected HTML Structure:**
- `<h2>` tags define book categories
- `<a href="...">BookTitle</a>` links define books (with optional descriptions)
- `{BookID}menu.html` files contain links to individual volumes
- `{BookID}{SequenceNumber}.html` files are individual volume pages

**JavaScript Variables Extracted from Volume Files:**
- `volNum` - Volume number
- `volName` - Volume title in Chinese/Japanese
- `volStartPos` - Starting page position
- `volMaxPage` - Maximum pages in this volume
- `bookNum` - Book identifier

**Chinese Text Processing:**
- Convert Chinese numerals (一二三四五) to Arabic numbers
- Extract dynasty names from Chinese descriptions
- Parse authors using patterns like "某某撰", "某某輯"
- Handle complex traditional volume numbering systems

4. IIIF Manifest Generation
When generating or modifying manifests:
**Core Configuration (in generate-iiif-manifests.ts):**
```typescript
const BASE_URL = "https://toho-digital-library.zinbun.kyoto-u.ac.jp";
const IMAGE_SERVICE_BASE_URL = "https://iiif.toyjack.net/iiif";
```
**Manifest Structure Requirements:**
- Follow IIIF Presentation API 3.0 specification
- Include multi-language labels: Chinese (zh), Japanese (ja), English (en)
- Set viewing direction to right-to-left for traditional Chinese texts
- Include metadata: dynasty, authors, publication info, volume count
- Create one canvas per page with proper dimensions
- Link to IIIF Image API service for images

**Image URL Pattern:**
```
{IMAGE_SERVICE_BASE_URL}/{BookID}/{VolumeID}_{PageNumber}.jpg
```
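The requirements and URL pattern above can be sketched as follows. This is a minimal illustration, not the project's `generateManifest()`: the helper names, the canvas and manifest id scheme under `BASE_URL`, and the page dimensions passed in are all assumptions. Page numbers are used as given, since the source does not say whether they are zero-padded, and the image service description is omitted.

```typescript
const BASE_URL = "https://toho-digital-library.zinbun.kyoto-u.ac.jp";
const IMAGE_SERVICE_BASE_URL = "https://iiif.toyjack.net/iiif";

type LanguageMap = Record<string, string[]>;

// Follows {IMAGE_SERVICE_BASE_URL}/{BookID}/{VolumeID}_{PageNumber}.jpg
function buildImageUrl(bookId: string, volumeId: string, page: number | string): string {
  return `${IMAGE_SERVICE_BASE_URL}/${bookId}/${volumeId}_${page}.jpg`;
}

// One canvas per page, with a single painting annotation for the page image.
// The id scheme below is hypothetical.
function buildCanvas(bookId: string, volumeId: string, page: number, width: number, height: number) {
  const canvasId = `${BASE_URL}/iiif/${bookId}/canvas/${volumeId}_${page}`;
  return {
    id: canvasId,
    type: "Canvas",
    label: { none: [String(page)] },
    width,
    height,
    items: [
      {
        id: `${canvasId}/page`,
        type: "AnnotationPage",
        items: [
          {
            id: `${canvasId}/annotation`,
            type: "Annotation",
            motivation: "painting",
            body: {
              id: buildImageUrl(bookId, volumeId, page),
              type: "Image",
              format: "image/jpeg",
              width,
              height,
            },
            target: canvasId,
          },
        ],
      },
    ],
  };
}

// Wrap canvases in a Presentation API 3.0 manifest with right-to-left viewing.
function buildManifest(bookId: string, label: LanguageMap, canvases: object[]) {
  return {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    id: `${BASE_URL}/iiif/${bookId}/manifest.json`,
    type: "Manifest",
    label,
    viewingDirection: "right-to-left",
    items: canvases,
  };
}
```

A real manifest would also carry the metadata entries (dynasty, authors, publication info, volume count) and a `service` entry on each image body pointing at the IIIF Image API endpoint.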
5. Database Integration (Optional)
If database persistence is needed:
**Generate Prisma schema:**
```bash
bun run generate-prisma-schema
```
**Setup database:**
```bash
bun run db:generate # Generate Prisma client
bun run db:push # Push schema to database
bun run db:seed # Seed initial data
```
**Database administration:**
```bash
bun run db:studio # Open Prisma Studio GUI
bun run db:reset # Reset database
```
6. Adding New Features
When extending functionality:
**Adding Metadata Fields:**
1. Update `BookEntry` interface in type definitions
2. Modify parsing logic in `index.ts` to extract new field
3. Update `generateManifest()` in `generate-iiif-manifests.ts` to include field
4. Add validation rules in `validate-manifests.ts`
5. Update HTML template in `update-index.ts` if field should be displayed
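To make these steps concrete, here is a minimal sketch that threads one new field through to the manifest metadata. The `edition` field, the `BookEntrySketch` type, and the `editionMetadata()` helper are hypothetical and stand in for whatever field is actually being added; tagging the value as `zh` is likewise an assumption about the source language.

```typescript
type LanguageMap = Record<string, string[]>;

interface MetadataEntry {
  label: LanguageMap;
  value: LanguageMap;
}

// Trimmed-down stand-in for BookEntry; only `edition` is the new (hypothetical) field.
interface BookEntrySketch {
  id: string;
  title: string;
  edition?: string;
}

// Build the IIIF metadata entry for the new field, or nothing if the field is absent.
function editionMetadata(book: BookEntrySketch): MetadataEntry[] {
  if (!book.edition) return [];
  return [
    {
      // Labels are provided in all three languages used by the collection.
      label: { zh: ["版本"], ja: ["版本"], en: ["Edition"] },
      // The value is kept in its original language rather than translated.
      value: { zh: [book.edition] },
    },
  ];
}

console.log(editionMetadata({ id: "A001", title: "Sample", edition: "明刊本" }));
```

Returning an empty array for a missing field lets the caller spread the result into the manifest's `metadata` list without special-casing books that lack the new data.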
**Modifying IIIF Output:**
1. Edit `generateManifest()` function in `generate-iiif-manifests.ts`
2. Ensure changes comply with IIIF Presentation API 3.0 spec
3. Run validation script to check for issues
4. Test with IIIF viewers (Mirador, Universal Viewer)
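Before running the full validation script, a quick structural check can catch the most common mistakes after editing the output. The sketch below covers only the `@context`, the required fields (`id`, `type`, `label`, `items`), and the language-map shape of `label`; the `checkManifest()` function is an assumption made for illustration and is not the logic in `validate-manifests.ts`.

```typescript
const PRESENTATION_3_CONTEXT = "http://iiif.io/api/presentation/3/context.json";
const REQUIRED_FIELDS = ["id", "type", "label", "items"];

// A language map is an object whose values are arrays of strings, e.g. { en: ["Title"] }.
function isLanguageMap(value: unknown): boolean {
  if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
  const map = value as Record<string, unknown>;
  return Object.keys(map).every((lang) => {
    const entries = map[lang];
    return Array.isArray(entries) && entries.every((e) => typeof e === "string");
  });
}

// Return a list of human-readable problems; an empty list means the checks passed.
function checkManifest(manifest: Record<string, unknown>): string[] {
  const errors: string[] = [];

  // @context may be a single string or an array of context URLs.
  const context = manifest["@context"];
  const contexts: unknown[] = Array.isArray(context) ? context : [context];
  if (contexts.indexOf(PRESENTATION_3_CONTEXT) === -1) {
    errors.push("@context does not reference IIIF Presentation API 3.0");
  }

  for (const field of REQUIRED_FIELDS) {
    if (manifest[field] === undefined) errors.push(`missing required field: ${field}`);
  }

  if (manifest["type"] !== undefined && manifest["type"] !== "Manifest") {
    errors.push("type must be \"Manifest\"");
  }
  if (manifest["label"] !== undefined && !isLanguageMap(manifest["label"])) {
    errors.push("label must be a language map");
  }
  if (manifest["items"] !== undefined && !Array.isArray(manifest["items"])) {
    errors.push("items must be an array");
  }
  return errors;
}
```

Collecting all problems into a list, rather than throwing on the first one, gives a fuller picture of a broken manifest in a single pass.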
**Changing HTML Interface:**
1. Modify `generateHTML()` function in `update-index.ts`
2. Update CSS styles and layout as needed
3. Ensure responsive design for mobile devices
4. Test browsing functionality
7. Error Handling Patterns
Implement these error-handling strategies:
- **File existence checks**: Gracefully handle missing menu/volume HTML files
- **Backup scanning**: Fall back to sequential file scanning when menu files are missing
- **JSON validation**: Validate manifest structure before writing output
- **Encoding detection**: Handle various character encodings in HTML files
- **Malformed HTML**: Use tolerant HTML parsers that recover from errors

8. Performance Considerations
Be aware of these performance characteristics:
- Processing large HTML collections is memory-intensive (load incrementally if needed)
- The system processes books sequentially (not parallelized)
- Generated manifest files can be large for multi-volume works with hundreds of pages
- Validate incrementally during generation rather than all at once for faster feedback
- Consider pagination for HTML index with very large collections (1000+ books)

9. Validation and Quality Assurance
Always validate outputs:
**IIIF Manifest Validation:**
- Check `@context` field points to IIIF Presentation API 3.0
- Verify all required fields are present (id, type, label, items)
- Validate canvas structure and image service links
- Check multi-language labels are properly formatted
- Ensure metadata fields follow IIIF conventions

**Data Quality Checks:**
- Verify all book IDs are unique
- Check volume sequences are complete and consecutive
- Validate page ranges don't overlap within volumes
- Ensure all referenced image files exist
- Check for missing or malformed metadata

10. Internationalization
Handle multi-language content properly:
- Use IIIF language maps for all labels and metadata
- Provide Chinese (zh), Japanese (ja), and English (en) translations
- Handle traditional and simplified Chinese characters
- Support right-to-left reading direction for classical texts
- Preserve original language in metadata while providing translations

Example Usage
**Scenario 1: Initial setup and complete pipeline run**
```bash
# Install dependencies
bun install
# Parse HTML archive
bun run index.ts
# Generate manifests
bun run generate-manifests
# Validate output
bun run validate-manifests
# Create browsable index
bun run update-index
```
**Scenario 2: Adding a new book collection**
1. Place new HTML files in `html/` directory following naming conventions
2. Run parser: `bun run index.ts`
3. Generate new manifests: `bun run generate-manifests`
4. Validate: `bun run validate-manifests`
5. Update index: `bun run update-index`
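After adding a collection, some of the data quality checks listed earlier (unique book IDs, consecutive volume sequences, non-overlapping page ranges) can be spot-checked on the parsed data before manifests are regenerated. The sketch below assumes simplified, hypothetical shapes (`BookSketch`, `VolumeSketch` with `volNum`, `startPage`, `endPage`); the real `BookEntry` and `BookVolume` fields may be named differently.

```typescript
// Hypothetical, simplified stand-ins for BookVolume and BookEntry.
interface VolumeSketch {
  volNum: number;
  startPage: number;
  endPage: number;
}

interface BookSketch {
  id: string;
  volumes: VolumeSketch[];
}

// Return a list of problems found; an empty list means all checks passed.
function checkDataQuality(books: BookSketch[]): string[] {
  const problems: string[] = [];
  const seenIds = new Set<string>();

  for (const book of books) {
    if (seenIds.has(book.id)) problems.push(`duplicate book id: ${book.id}`);
    seenIds.add(book.id);

    // Compare each volume with the one before it, in volume-number order.
    const sorted = book.volumes.slice().sort((a, b) => a.volNum - b.volNum);
    let previous: VolumeSketch | undefined;
    for (const volume of sorted) {
      if (previous) {
        if (volume.volNum !== previous.volNum + 1) {
          problems.push(`${book.id}: volumes ${previous.volNum} and ${volume.volNum} are not consecutive`);
        }
        if (volume.startPage <= previous.endPage) {
          problems.push(`${book.id}: page ranges of volumes ${previous.volNum} and ${volume.volNum} overlap`);
        }
      }
      previous = volume;
    }
  }
  return problems;
}
```

The check for referenced image files is left out here because it depends on how the image server is reached.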
**Scenario 3: Modifying IIIF output structure**
1. Update `generateManifest()` in `generate-iiif-manifests.ts`
2. Regenerate manifests: `bun run generate-manifests`
3. Validate changes: `bun run validate-manifests`
4. Test with IIIF viewer to ensure compatibility
Important Constraints
- The system expects Bun runtime (not Node.js)
- HTML files must follow specific structural patterns for parsing
- Image files must be accessible via configured IIIF Image API server
- Generated manifests follow IIIF Presentation API 3.0 (not earlier versions)
- Character encoding must be handled for Chinese/Japanese text
- Some scripts referenced in package.json may not exist yet and need to be created
- The system processes all files in memory - very large collections may require streaming

Notes for AI Agents
- Always run the validation script after generating manifests to catch structural issues early
- When parsing fails, check HTML structure matches expected patterns
- Test generated manifests in IIIF viewers (Mirador, Universal Viewer) before deployment
- Preserve multi-language metadata - never drop language variants
- Follow IIIF Presentation API 3.0 specification strictly for interoperability
- Consider cultural context when handling classical Chinese/Japanese texts
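One concrete piece of the Chinese text processing described in the HTML parsing section is numeral conversion. The sketch below is a minimal version, assuming numerals up to the thousands written either with unit characters (十, 百, 千) or as plain digit sequences; the `chineseToArabic()` function is hypothetical, and forms such as 廿, 卅, or 萬 are not handled.

```typescript
const DIGITS = new Map<string, number>([
  ["〇", 0], ["零", 0], ["一", 1], ["二", 2], ["三", 3], ["四", 4],
  ["五", 5], ["六", 6], ["七", 7], ["八", 8], ["九", 9],
]);
const UNITS = new Map<string, number>([["十", 10], ["百", 100], ["千", 1000]]);

// Convert a Chinese numeral string to a number, or return null if it
// contains any character that is not a known digit or unit.
function chineseToArabic(text: string): number | null {
  const chars = Array.from(text.trim());
  if (chars.length === 0) return null;

  // Without unit characters, read the digits positionally (一二三 = 123).
  const positional = !chars.some((c) => UNITS.has(c));
  let total = 0;
  let pending = 0; // digit(s) not yet multiplied by a unit

  for (const c of chars) {
    const digit = DIGITS.get(c);
    const unit = UNITS.get(c);
    if (digit !== undefined) {
      pending = positional ? pending * 10 + digit : digit;
    } else if (unit !== undefined) {
      // A bare unit such as 十 means 1 × 10.
      total += (pending === 0 ? 1 : pending) * unit;
      pending = 0;
    } else {
      return null;
    }
  }
  return total + pending;
}

console.log(chineseToArabic("二十三"));   // 23
console.log(chineseToArabic("一百零五")); // 105
console.log(chineseToArabic("一二三"));   // 123
```

Returning `null` for unrecognized input, rather than guessing, makes it easier to flag volume titles that need manual review.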