Build and maintain a WhatsApp chatbot that fact-checks messages using AI-powered claim extraction, web retrieval, and LLM adjudication with transparent citations.

The bot receives user messages via Evolution API, extracts claims from text or media, runs a retrieval and LLM adjudication pipeline, and replies with source-grounded verdicts directly inside WhatsApp with minimal friction. Users forward or paste content, and the bot answers with a concise verdict (True/False/Misleading/Unverifiable) plus 3-5 citations.
**Architecture:**
User on WhatsApp → Evolution API (webhook) → Backend (FastAPI) → Claim Builder (merge text/OCR/URLs) → Retrieval Service (search/filter/dedupe) → LLM Adjudicator (verdict + citations) → Evolution API → User receives reply
**Goals:**
1. **One-message verification**: Users send content, bot replies with verdict
2. **Hybrid context**: Handle text, images with OCR, and URLs
3. **Grounded verdicts**: Return verdict with 3-5 citations and rationale
4. **Trust and transparency**: Show what was analyzed and why
5. **Low latency**: P50 under 5s for text, under 12s with OCR
6. **Feedback loop**: Support thumbs up/down and continuous improvement
**Core capabilities:**
- Analyze text-only messages.
- Analyze images using OCR.
- Analyze text + image combinations.
- Expose a health check endpoint.
**Backend foundation** (a minimal skeleton follows this list):
1. Create FastAPI app with endpoints: `/api/text`, `/api/images`, `/api/multimodal`, `/health`
2. Use Python dataclasses for request/response schemas (no Pydantic)
3. Configure environment-based settings without Pydantic dependencies
4. Set up SQLAlchemy ORM with PostgreSQL/SQLite
5. Add Redis for caching repeated claims
6. Implement basic logging with configurable levels
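A minimal sketch of this skeleton, assuming plain dataclasses and manual JSON parsing keep Pydantic out of the schema layer; only `/health` and a stubbed `/api/text` are shown:

```python
from dataclasses import dataclass, asdict, field
from fastapi import FastAPI, Request

app = FastAPI(title="fact-check-bot")

@dataclass
class VerdictResponse:
    message_id: str
    verdict: str            # true | false | misleading | unverifiable
    confidence: float
    rationale: str
    citations: list = field(default_factory=list)

@app.get("/health")
async def health() -> dict:
    return {"status": "ok"}

@app.post("/api/text")
async def analyze_text(request: Request) -> dict:
    payload = await request.json()  # parsed by hand, so no Pydantic models
    # claim building, retrieval, and adjudication would run here
    result = VerdictResponse(payload["message_id"], "unverifiable", 0.0,
                             "pipeline not wired up yet")
    return asdict(result)
```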
**Evolution API integration** (a webhook-handling sketch follows this list):
1. Connect to Evolution API for webhook events and sending replies
2. Implement webhook signature validation
3. Set up retry handling for failed message sends
4. Parse incoming message payloads: text, caption, media URLs
5. Extract: message_id, sender_id, timestamp, message_type
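A hedged sketch of webhook handling. Evolution API's exact signature scheme and payload shape depend on your version and configuration, so the secret variable and field names below are assumptions to map onto the real payload:

```python
import hashlib
import hmac
import os

WEBHOOK_SECRET = os.environ["EVOLUTION_WEBHOOK_SECRET"]  # assumed env var

def signature_is_valid(raw_body: bytes, signature_header: str) -> bool:
    # HMAC-SHA256 over the raw body, compared in constant time.
    expected = hmac.new(WEBHOOK_SECRET.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

def parse_message(event: dict) -> dict:
    # Field names are illustrative; align them with the actual Evolution payload.
    return {
        "message_id": event.get("id"),
        "sender_id": event.get("from"),
        "timestamp": event.get("timestamp"),
        "message_type": event.get("type"),  # text | image | ...
        "text": event.get("text"),
        "caption": event.get("caption"),
        "media_url": event.get("mediaUrl"),
    }
```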
**Claim extraction** (a merge-step sketch follows this list):
1. Build claim builder that merges text, OCR results, and URL metadata
2. Implement heuristics and NLP for isolating central statements
3. Add entity recognition and date normalization
4. Support language detection (English and Portuguese)
5. Handle quoted messages and forwarded content
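A sketch of the merge step, using `langdetect` as one possible detector (the project only needs to distinguish English and Portuguese); the origin labels are illustrative:

```python
from dataclasses import dataclass, field

from langdetect import detect  # pip install langdetect; any detector works

@dataclass
class Claim:
    text: str
    language: str                                # "en" or "pt" expected
    origins: list = field(default_factory=list)  # "caption" | "ocr" | "url"

def build_claim(caption: str | None, ocr_text: str | None,
                url_title: str | None) -> Claim:
    parts, origins = [], []
    for label, value in (("caption", caption), ("ocr", ocr_text),
                         ("url", url_title)):
        if value and value.strip():
            parts.append(value.strip())
            origins.append(label)
    merged = " ".join(parts)
    language = detect(merged) if merged else "en"  # fall back to English
    return Claim(text=merged, language=language, origins=origins)
```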
**Media and OCR** (a fast-path heuristic follows this list):
1. Integrate OCR service for image processing
2. Merge caption text with OCR output
3. Add image size limits and validation
4. Implement fast-path to skip OCR when text is sufficient
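One possible fast-path heuristic, with an assumed 40-character threshold for "caption is sufficient":

```python
# Assumed threshold: a caption of 40+ characters usually carries the claim
# by itself, so the image never hits the OCR service.
def needs_ocr(caption: str | None, has_image: bool) -> bool:
    if not has_image:
        return False
    if caption and len(caption.strip()) >= 40:
        return False  # fast path: caption alone is enough to build a claim
    return True
```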
**Evidence retrieval** (a dedupe-and-timeout sketch follows this list):
1. Build web and news search integration with quality filters
2. Implement deduplication logic for sources
3. Add date normalization and timestamp checks
4. Create caching layer for recent evidence
5. Set up circuit breaker for external search APIs
6. Enforce timeouts with graceful degradation
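A sketch of deduplication plus the retrieval timeout, with the search backend injected as an async callable so no specific provider is assumed:

```python
import asyncio
from typing import Awaitable, Callable
from urllib.parse import urlparse

def dedupe_sources(results: list[dict]) -> list[dict]:
    # Two results pointing at the same host+path count as one source.
    seen, unique = set(), []
    for result in results:
        parsed = urlparse(result["url"])
        key = (parsed.netloc.removeprefix("www."), parsed.path.rstrip("/"))
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique

async def retrieve(query: str,
                   search: Callable[[str], Awaitable[list[dict]]],
                   timeout_s: float = 3.0) -> list[dict]:
    try:
        results = await asyncio.wait_for(search(query), timeout_s)
    except asyncio.TimeoutError:
        return []  # degrade gracefully; the adjudicator sees no evidence
    return dedupe_sources(results)
```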
**LLM adjudication** (an adapter sketch follows this list):
1. Create provider-agnostic LLM adapter
2. Implement structured output format: verdict, confidence, rationale, citations
3. Enforce citation grounding (at least 3 and at most 5 sources)
4. Add timeout handling (return "Unverifiable" on timeout)
5. Format citations with: title, source, url, snippet, published_at
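A sketch of the adapter contract, with the provider call injected as `call_llm` (a placeholder) and the 5-second budget matching the performance targets later in this plan:

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class Adjudication:
    verdict: str                # true | false | misleading | unverifiable
    confidence: float
    rationale: str
    citations: list = field(default_factory=list)

async def adjudicate(claim: str, evidence: list[dict], call_llm,
                     timeout_s: float = 5.0) -> Adjudication:
    try:
        raw = await asyncio.wait_for(call_llm(claim, evidence), timeout_s)
    except asyncio.TimeoutError:
        return Adjudication("unverifiable", 0.0,
                            "Timed out before the evidence could be weighed.")
    citations = raw.get("citations", [])
    if len(citations) < 3:      # enforce the grounding floor
        return Adjudication("unverifiable", 0.0,
                            "Too few grounded citations to issue a verdict.")
    return Adjudication(raw["verdict"], raw["confidence"],
                        raw["rationale"], citations[:5])  # cap at 5
```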
**Conversation UX** (a reply-formatting sketch follows this list):
1. **Onboarding**: Welcome message with consent request
2. **Receipt acknowledgment**: Confirm message received
3. **Result formatting**: Verdict badge + rationale + numbered citations
4. **Follow-ups**: Quick replies for "check another", "disagree", "learn more"
5. **Commands**: "help", "delete my data", "stop"
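A sketch of the result formatter; the badge strings are illustrative choices, not a fixed spec:

```python
BADGES = {
    "true": "✅ True",
    "false": "❌ False",
    "misleading": "⚠️ Misleading",
    "unverifiable": "❓ Unverifiable",
}

def format_reply(result: dict) -> str:
    # Badge, blank line, rationale, blank line, then numbered citations.
    lines = [BADGES.get(result["verdict"], result["verdict"]), "",
             result["rationale"], ""]
    for i, citation in enumerate(result["citations"], start=1):
        lines.append(f"{i}. {citation['title']} ({citation['source']})")
        lines.append(f"   {citation['url']}")
    return "\n".join(lines)
```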
**Privacy and consent** (a hashed-logging sketch follows this list):
1. Implement first-time consent flow before any analysis
2. Add data deletion endpoint for user requests
3. Set retention policy (default 7-30 days)
4. Do not log raw media or full messages (use hashed references)
5. Add redaction guidance in onboarding
6. Document Privacy Policy and Data Use terms
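A sketch of hashed log references so neither raw media nor full message bodies ever reach the logs; the `LOG_SALT` variable is an assumed deployment setting:

```python
import hashlib
import os

LOG_SALT = os.environ.get("LOG_SALT", "")  # assumed deployment setting

def log_ref(message_id: str, body: str) -> str:
    # Log a salted hash of the content, never the content itself.
    digest = hashlib.sha256((LOG_SALT + body).encode()).hexdigest()[:16]
    return f"msg={message_id} body_sha256={digest}"
```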
**Security** (a rate-limiting sketch follows this list):
1. Use HTTPS only with HSTS
2. Validate Evolution API webhook signatures
3. Add rate limits per sender
4. Implement anomaly detection for abuse
5. Set maximum image size and page fetch limits
6. Add antivirus scanning for stored media
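A per-sender rate limit sketch using a fixed Redis window (at most `limit` messages per `window_s` seconds); the key prefix and defaults are assumptions:

```python
import redis

r = redis.Redis()

def allow_message(sender_id: str, limit: int = 10, window_s: int = 60) -> bool:
    key = f"rl:{sender_id}"
    count = r.incr(key)          # atomic increment per sender
    if count == 1:
        r.expire(key, window_s)  # first message opens the window
    return count <= limit
```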
**Performance** (a claim-cache sketch follows this list):
1. Cache normalized claims and recent adjudications
2. Set hard timeouts for retrieval (e.g., 3s) and LLM (e.g., 5s)
3. Use fast-path when OCR is not needed
4. Implement circuit breaker for external services
5. Target P50 latency: 5s for text, 12s for OCR
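A sketch of the claim cache: whitespace/case normalization produces a stable Redis key, and the TTL is an assumed default:

```python
import hashlib
import json

import redis

r = redis.Redis()

def claim_key(text: str) -> str:
    # Collapse case and whitespace so trivially reworded forwards hit the cache.
    normalized = " ".join(text.lower().split())
    return "claim:" + hashlib.sha256(normalized.encode()).hexdigest()

def cached_verdict(text: str) -> dict | None:
    raw = r.get(claim_key(text))
    return json.loads(raw) if raw else None

def store_verdict(text: str, verdict: dict, ttl_s: int = 6 * 3600) -> None:
    r.setex(claim_key(text), ttl_s, json.dumps(verdict))
```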
**Testing** (an example unit test follows this list):
1. Unit tests: claim parsing, retrieval ranking, citation formatting
2. Integration tests: end-to-end for text and OCR flows
3. Real site checks: news outlets, social posts, blogs
4. Compliance tests: consent, deletion, retention, opt-out
5. Load tests: spike scenarios during breaking news
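An illustrative unit test for citation formatting, assuming the `format_reply` sketch above lives at a hypothetical `bot.reply` module:

```python
from bot.reply import format_reply  # hypothetical module path

def test_reply_numbers_citations():
    result = {
        "verdict": "misleading",
        "rationale": "Mixes accurate and unsupported elements.",
        "citations": [{"title": "T1", "source": "Reuters",
                       "url": "https://example.com/1"}],
    }
    reply = format_reply(result)
    assert reply.startswith("⚠️ Misleading")
    assert "1. T1 (Reuters)" in reply
    assert "https://example.com/1" in reply
```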
**Metrics:**
1. Track: requests received/completed, OCR rate, link rate
2. Measure: helpfulness rating, disagreement rate, citation coverage
3. Monitor: OCR failures, retrieval misses, LLM timeouts
4. Use sampled and hashed metrics (opt-in only, no per-user tracking)
**Operations:**
1. Set up uptime monitoring and error rate alerts
2. Track latency SLOs (P50, P95, P99)
3. Monitor queue depth for webhook processing
4. Create runbooks for: retrieval provider down, LLM quota exceeded, webhook retries
5. Implement staged rollout for updates
**Example response:**
```json
{
  "message_id": "abc123",
  "verdict": "misleading",
  "confidence": 0.85,
  "rationale": "The claim mixes true and false elements. While X is accurate, Y is not supported by current evidence.",
  "citations": [
    {
      "title": "Study confirms X but debunks Y",
      "source": "Reuters",
      "url": "https://reuters.com/article/123",
      "snippet": "Research shows X is correct, but Y has no scientific basis.",
      "published_at": "2024-01-15T10:00:00Z"
    }
  ],
  "processing_time_ms": 4200
}
```