Interactive CLI tool that converts web articles to EPUB format using Playwright for web scraping and OpenAI for content conversion with visible browser feedback.
A Node.js CLI tool that converts web articles to EPUB format through a 4-step pipeline: user input collection, web scraping with Playwright, AI-powered content conversion via OpenAI, and EPUB generation.
The application follows a linear 4-step pipeline orchestrated by `src/index.ts`:
1. **User Input** (`src/cli.ts`) - Interactive prompts collect article URL, title, author, and language
2. **Web Scraping** (`src/scraper.ts`) - Playwright extracts article HTML content with visible browser feedback
3. **Content Conversion** (`src/converter.ts`) - OpenAI API converts HTML to clean Markdown
4. **EPUB Generation** (`src/epub-generator.ts`) - Creates EPUB file from Markdown content
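The four steps above can be sketched as one orchestrating function. This is a minimal illustration, not the actual contents of `src/index.ts`: the `ArticleInput` and `PipelineSteps` types and the step names `collectInput`, `scrape`, `convert`, and `generateEpub` are assumptions, and the steps are injected so the flow can be exercised with stubs.

```typescript
// Hypothetical types and names; the real modules may expose different signatures.
interface ArticleInput {
  url: string;
  title: string;
  author: string;
  language: string;
}

interface PipelineSteps {
  collectInput: () => Promise<ArticleInput>;  // step 1 (src/cli.ts)
  scrape: (url: string) => Promise<string>;   // step 2 (src/scraper.ts), returns HTML
  convert: (html: string) => Promise<string>; // step 3 (src/converter.ts), returns Markdown
  generateEpub: (markdown: string, meta: ArticleInput) => Promise<string>; // step 4, returns output path
}

// Runs the steps strictly in order; each step consumes the previous step's output.
async function runPipeline(steps: PipelineSteps): Promise<string> {
  const input = await steps.collectInput();
  const html = await steps.scrape(input.url);
  const markdown = await steps.convert(html);
  return steps.generateEpub(markdown, input);
}
```

Because the pipeline is linear, a failure in any step rejects the returned promise and the later steps never run.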
**No Build Step**: Uses Node.js 24's `--experimental-strip-types` flag to run TypeScript directly. All imports MUST use `.js` extensions despite files being `.ts`:
```typescript
// Correct
import { logger } from './logger.js';
// Wrong
import { logger } from './logger.ts';
```
**Visible Browser Mode**: Playwright runs with `headless: false` and `slowMo: 100` for visual feedback. This is intentional UX, not debug mode.
**Content Selector Strategy**: Tries multiple semantic HTML selectors in priority order (`article`, `main`, `[role="main"]`, `.post-content`, `.entry-content`, `body`) in `src/scraper.ts:59-67`.
**Markdown Conversion**: Custom regex-based converter in `src/epub-generator.ts:markdownToHtml()`. If advanced Markdown features are needed, consider integrating the `marked` library.
**Model Selection**: Uses `gpt-5-mini` in `src/converter.ts:33` (verify that this model name exists, or update it to `gpt-4o-mini`).
1. Ensure `OPENAI_API_KEY` exists in the `.env` file (validated at runtime in `src/index.ts:37-43`)
2. Install dependencies: `npm install`
3. The Chromium browser installs automatically via the postinstall hook
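The runtime check for `OPENAI_API_KEY` could look roughly like the sketch below. The helper name `requireApiKey` and the error wording are assumptions; the actual validation in `src/index.ts` may be written differently.

```typescript
// Hypothetical helper; not taken from src/index.ts.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env['OPENAI_API_KEY'];
  // Treat a missing or whitespace-only value as "not configured".
  if (key === undefined || key.trim() === '') {
    throw new Error('OPENAI_API_KEY is missing. Add it to your .env file.');
  }
  return key;
}

// Typical call site once the .env file has been loaded:
//   const apiKey = requireApiKey(process.env);
```

Taking the environment as a parameter keeps the check easy to test without touching the real process environment.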
Run the tool:
```bash
npm start
```
If the postinstall hook did not install Chromium, install it manually:
```bash
npx playwright install chromium
```
Edit `src/scraper.ts:59-67` to add selectors to the priority list, placing more specific selectors first:
```typescript
const selectors = [
  'article',
  'main',
  '[role="main"]',
  '.your-new-selector', // Add here
  'body'
];
```
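To show why position in the list matters, here is a hedged sketch of a first-match-wins lookup. The `QueryFn` type and `extractContent` function are hypothetical: `query` stands in for whatever the scraper uses to look up a selector on the page, and the real loop in `src/scraper.ts` may differ.

```typescript
// Hypothetical lookup: returns the matched element's HTML, or null when nothing matches.
type QueryFn = (selector: string) => Promise<string | null>;

// Priority order as described in this document.
const prioritySelectors = ['article', 'main', '[role="main"]', '.post-content', '.entry-content', 'body'];

// Returns the HTML of the first selector that matches with non-empty content.
async function extractContent(query: QueryFn, candidates: string[] = prioritySelectors): Promise<string> {
  for (const selector of candidates) {
    const html = await query(selector);
    if (html !== null && html.trim().length > 0) {
      return html; // first match wins, so earlier entries take priority
    }
  }
  throw new Error('No content found with any selector');
}
```

Since `body` matches nearly every page, it works as the catch-all and should stay last; any selector placed after it would never be reached.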
Modify launch options in `src/scraper.ts:24-27`:
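The options themselves are not shown here. Going by the `headless: false` and `slowMo: 100` values mentioned earlier, a minimal sketch might look like this. The `LaunchSettings` type and `buildLaunchOptions` helper are hypothetical, and the headless branch is an illustrative alternative rather than something the project is known to offer.

```typescript
// Hypothetical shape mirroring the two Playwright launch options this document mentions.
interface LaunchSettings {
  headless: boolean; // false keeps the browser window visible
  slowMo: number;    // delay in milliseconds added to each Playwright operation
}

// Visible mode uses the documented values; headless mode drops the artificial delay.
function buildLaunchOptions(visible: boolean = true): LaunchSettings {
  return visible ? { headless: false, slowMo: 100 } : { headless: true, slowMo: 0 };
}

// With Playwright installed, the result would be passed to the launcher, e.g.:
//   import { chromium } from 'playwright';
//   const browser = await chromium.launch(buildLaunchOptions());
```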
Add patterns in `src/epub-generator.ts:68-91`. Order matters, so place specific patterns before general ones:
```typescript
// Headers
html = html.replace(/^### (.*$)/gim, '<h3>$1</h3>');
// Links
html = html.replace(/\[([^\]]+)\]\(([^)]+)\)/g, '<a href="$2">$1</a>');
```
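The ordering rule is easiest to see with patterns that overlap. The sketch below is a simplified, hypothetical converter (`markdownToHtmlSketch` is not the project's `markdownToHtml()`), and the image, bold, and italic patterns are assumptions added for illustration.

```typescript
// Hypothetical, simplified converter; the real markdownToHtml() may differ.
function markdownToHtmlSketch(markdown: string): string {
  let html = markdown;
  // Headers, deepest level first
  html = html.replace(/^### (.*)$/gm, '<h3>$1</h3>');
  html = html.replace(/^## (.*)$/gm, '<h2>$1</h2>');
  html = html.replace(/^# (.*)$/gm, '<h1>$1</h1>');
  // Images before links: the link pattern would otherwise consume the "[alt](src)" part of an image
  html = html.replace(/!\[([^\]]*)\]\(([^)]+)\)/g, '<img src="$2" alt="$1" />');
  html = html.replace(/\[([^\]]+)\]\(([^)]+)\)/g, '<a href="$2">$1</a>');
  // Bold before italic: "**text**" must be handled before the single-asterisk pattern runs
  html = html.replace(/\*\*([^*]+)\*\*/g, '<strong>$1</strong>');
  html = html.replace(/\*([^*]+)\*/g, '<em>$1</em>');
  return html;
}
```

If the link pattern ran before the image pattern, `![logo](a.png)` would come out as a stray `!` followed by an anchor tag instead of an image.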
Update model name in `src/converter.ts:33`:
```typescript
model: 'gpt-4o-mini', // or other available model
```
Use the custom logger (`src/logger.ts`) for progress tracking:
```typescript
logger.startSpinner('Starting operation');
try {
  // ... async work
  logger.succeedSpinner('Operation complete');
} catch (error) {
  logger.failSpinner('Operation failed');
  throw error;
}
```
**Log Types**:
All pipeline functions follow this structure:
```typescript
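export async function pipelineStep() {
  logger.startSpinner('Starting step');
  try {
    // ... implementation that produces `result`
    logger.succeedSpinner('Step complete');
    return result;
  } catch (error) {
    logger.failSpinner('Step failed');
    // `error` is `unknown` in a catch clause, so narrow it before reading `.message`
    const message = error instanceof Error ? error.message : String(error);
    throw new Error(`Step failed: ${message}`);
  }
}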
```
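Since every step repeats the same start/succeed/fail structure, it could be factored into a wrapper. The following is only a sketch of that idea: `withSpinner` and the `SpinnerLogger` interface are hypothetical, and the interface assumes nothing beyond the three spinner methods named in this document.

```typescript
// Stand-in for the project's logger so the sketch is self-contained.
interface SpinnerLogger {
  startSpinner(text: string): void;
  succeedSpinner(text: string): void;
  failSpinner(text: string): void;
}

// Hypothetical wrapper that applies the start/succeed/fail structure to any async step.
async function withSpinner<T>(
  logger: SpinnerLogger,
  label: string,
  work: () => Promise<T>,
): Promise<T> {
  logger.startSpinner(`Starting ${label}`);
  try {
    const result = await work();
    logger.succeedSpinner(`${label} complete`);
    return result;
  } catch (error) {
    logger.failSpinner(`${label} failed`);
    // Narrow the unknown error before building the wrapped message.
    const message = error instanceof Error ? error.message : String(error);
    throw new Error(`${label} failed: ${message}`);
  }
}
```

A step would then reduce to something like `withSpinner(logger, 'scrape', () => doTheScrape())`, where `doTheScrape` is a placeholder for the step's own logic.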
The main function in `src/index.ts` catches all errors and exits with code 1.
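A top-level handler of this kind might be sketched as follows. The name `runWithExitCode` is hypothetical, and the exit function is injected here (an assumption made for testability) rather than calling `process.exit` directly.

```typescript
// Hypothetical top-level wrapper; names are illustrative, not taken from the project.
async function runWithExitCode(
  main: () => Promise<void>,
  exit: (code: number) => void,
): Promise<void> {
  try {
    await main();
  } catch (error) {
    // Report the failure once, then signal it to the shell with a non-zero code.
    const message = error instanceof Error ? error.message : String(error);
    console.error(`Fatal: ${message}`);
    exit(1);
  }
}

// In a real entry point this would be wired up as:
//   runWithExitCode(main, (code) => process.exit(code));
```

Because each pipeline step rethrows with its own prefix, the message printed here identifies which step failed.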