# RAG Customer Support System

Build and maintain a complete RAG (Retrieval-Augmented Generation) customer support system with Node.js, Pinecone vector database, OpenAI GPT, and real-time chat capabilities.

## Project Architecture

This is a full-stack RAG system with the following components:

- **Backend**: Node.js with an Express.js REST API
- **Vector Database**: Pinecone for storing document embeddings
- **AI**: OpenAI API, using GPT models for response generation and embedding models for embeddings
- **Frontend**: HTML, CSS (Tailwind), vanilla JavaScript
- **WebSockets**: Socket.io for real-time chat
- **Authentication**: JWT-based authentication with role-based access (user/admin)

## Directory Structure

Follow this standard structure:

```
src/
├── config/       # Configurations (database, environment variables)
├── controllers/  # Controllers (currently unused)
├── middleware/   # Middleware (auth, rate limiting, error handling)
├── models/       # Data models (currently unused)
├── routes/       # API routes (auth, chat, documents, admin)
├── services/     # Core services (chat, documents, openai, socket)
└── utils/        # Utilities (logger, helpers)
```

## Core Functionalities

### 1. Document Processing

- Support PDF, TXT, MD, and DOCX file uploads
- Extract text content from uploaded documents
- Generate embeddings using OpenAI
- Store embeddings in the Pinecone vector database with metadata
- Implement a chunking strategy for large documents

### 2. Intelligent Chat System

- Real-time chat interface using Socket.io
- Query the vector database with the user's question
- Retrieve relevant document chunks
- Generate context-aware responses using OpenAI GPT
- Include source citations in responses

### 3. Administration Panel

- User management (view, create, edit, delete users)
- System statistics and monitoring
- Document management
- Role-based access control (admin/user)
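
Role-based access control for the admin routes can be sketched as an Express-style middleware. This is a minimal sketch, not the project's actual implementation: it assumes an earlier authentication middleware has verified the JWT and attached the decoded payload as `req.user` with a `role` field. The name `requireRole` and the `req.user` shape are hypothetical.

```javascript
// Hypothetical helper: returns Express-style middleware that only lets a
// request through when req.user.role is one of the allowed roles.
// Assumes an earlier auth middleware populated req.user from a verified JWT.
function requireRole(...allowedRoles) {
  return (req, res, next) => {
    if (!req.user) {
      // No authenticated user on the request.
      return res.status(401).json({ error: 'Authentication required' });
    }
    if (!allowedRoles.includes(req.user.role)) {
      // Authenticated, but the role is not permitted for this route.
      return res.status(403).json({ error: 'Insufficient permissions' });
    }
    return next();
  };
}
```

A route could then be registered as `router.get('/users', requireRole('admin'), listUsers)`, where `listUsers` stands in for whatever handler serves the admin user list.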

### 4. Authentication & Security

- JWT-based authentication
- Role-based authorization (user/admin)
- Rate limiting to prevent API abuse
- Input validation using express-validator
- Centralized error handling
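
To make the rate-limiting idea concrete, here is a minimal in-memory fixed-window limiter. It is an illustrative sketch only: `createRateLimiter`, `windowMs`, `max`, and the example limits are assumptions, and a deployed system would more likely use an established middleware package backed by a shared store.

```javascript
// Minimal in-memory fixed-window rate limiter (illustrative only).
// createRateLimiter, windowMs, and max are hypothetical names.
function createRateLimiter({ windowMs, max, now = () => Date.now() }) {
  const hits = new Map(); // key -> { count, windowStart }

  // Returns { allowed, remaining, retryAfterMs } for the given key,
  // for example a user ID or client IP.
  return function check(key) {
    const t = now();
    let entry = hits.get(key);
    if (!entry || t - entry.windowStart >= windowMs) {
      // Start a fresh window for this key.
      entry = { count: 0, windowStart: t };
      hits.set(key, entry);
    }
    if (entry.count >= max) {
      return { allowed: false, remaining: 0, retryAfterMs: entry.windowStart + windowMs - t };
    }
    entry.count += 1;
    return { allowed: true, remaining: max - entry.count, retryAfterMs: 0 };
  };
}

// Stricter limit for AI queries than for general API traffic (values are assumptions).
const chatLimiter = createRateLimiter({ windowMs: 60_000, max: 10 });
const apiLimiter = createRateLimiter({ windowMs: 60_000, max: 100 });
```

The `retryAfterMs` value lets the route return a clear message telling the client when to retry. Note that this sketch never evicts old keys, so it is unsuitable for long-running processes as written.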

### 5. Real-Time Communication

- WebSocket connections with Socket.io
- Real-time message delivery
- Connection state management
- Error handling for socket events
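
Authenticating socket connections can be sketched as a Socket.io middleware that runs before a connection is accepted. This sketch assumes the client passes its JWT in the handshake `auth` payload. `createSocketAuth` and the injected `verifyToken` function are hypothetical names used so the example stays independent of any particular JWT library.

```javascript
// Builds a Socket.io middleware (same shape as io.use((socket, next) => ...)).
// verifyToken is a hypothetical injected function that returns the decoded
// JWT payload, or throws when the token is invalid or expired.
function createSocketAuth(verifyToken) {
  return (socket, next) => {
    const token = socket.handshake?.auth?.token;
    if (!token) {
      return next(new Error('Authentication token missing'));
    }
    try {
      // Keep the decoded user on the socket for later event handlers.
      socket.data = { ...socket.data, user: verifyToken(token) };
      return next();
    } catch (err) {
      // Do not leak verification details to the client.
      return next(new Error('Invalid or expired token'));
    }
  };
}
```

Passing an error to `next` rejects the connection, and the client receives it as a `connect_error` event. On the server this would be registered with `io.use(createSocketAuth(verifyToken))`.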

## Required Environment Variables

Ensure these environment variables are configured:

```
OPENAI_API_KEY=your_openai_api_key
PINECONE_API_KEY=your_pinecone_api_key
JWT_SECRET=your_jwt_secret
```
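
One way to enforce this at startup is a small loader that fails fast when a variable is missing. This is a sketch under assumptions: `loadConfig`, `REQUIRED_ENV_VARS`, and the returned property names are hypothetical and could live in `src/config/`.

```javascript
// Names of the variables listed above; loadConfig is a hypothetical helper.
const REQUIRED_ENV_VARS = ['OPENAI_API_KEY', 'PINECONE_API_KEY', 'JWT_SECRET'];

// Reads the required variables from env (defaults to process.env) and throws
// a single error naming every missing one, so misconfiguration fails fast.
function loadConfig(env = process.env) {
  const missing = REQUIRED_ENV_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return {
    openaiApiKey: env.OPENAI_API_KEY,
    pineconeApiKey: env.PINECONE_API_KEY,
    jwtSecret: env.JWT_SECRET,
  };
}
```

Calling `loadConfig()` once during startup keeps secret access in one place and avoids scattered `process.env` reads across services.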

## Implementation Instructions

### When Writing Code

1. **Use Modern JavaScript Patterns**
   - Use async/await for all asynchronous operations
   - Avoid callback hell
   - Use Promise.all() for parallel operations when appropriate

2. **Implement Comprehensive Logging**
   - Use Winston for structured logging
   - Log all important operations (auth, document processing, chat queries)
   - Include context: user ID, request ID, timestamps
   - Use appropriate log levels (error, warn, info, debug)

3. **Follow Error Handling Best Practices**
   - Use centralized error handling middleware
   - Create custom error classes for different error types
   - Never expose internal errors to clients
   - Log detailed errors server-side
   - Return user-friendly error messages

4. **Implement Input Validation**
   - Use express-validator for all API endpoints
   - Validate file uploads (type, size)
   - Sanitize user inputs
   - Validate JWT tokens properly

5. **Apply Rate Limiting**
   - Implement rate limiting middleware
   - Apply different limits to different endpoints (e.g., stricter for AI queries)
   - Return clear error messages when limits are exceeded

6. **Socket.io Best Practices**
   - Authenticate socket connections using JWT
   - Implement connection/disconnection handlers
   - Handle errors gracefully
   - Emit typed events with clear naming conventions
   - Implement reconnection logic on the client side
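
The error handling guidance in item 3 can be sketched with a small class hierarchy and an Express-style error middleware. This is an illustration, not the project's actual code: `AppError`, `ValidationError`, `NotFoundError`, and `createErrorHandler` are hypothetical names, and the logger is only assumed to expose an `error(message, meta)` method.

```javascript
// Base class for errors whose message is safe to show to clients.
// AppError, ValidationError, and NotFoundError are hypothetical names.
class AppError extends Error {
  constructor(message, statusCode = 500) {
    super(message);
    this.name = this.constructor.name;
    this.statusCode = statusCode;
    this.isOperational = true; // expected failure, not a programming bug
  }
}

class ValidationError extends AppError {
  constructor(message = 'Invalid request') { super(message, 400); }
}

class NotFoundError extends AppError {
  constructor(message = 'Resource not found') { super(message, 404); }
}

// Express-style error middleware (four arguments). Logs full details
// server-side and returns a generic message for unexpected errors.
function createErrorHandler(logger = console) {
  return (err, req, res, next) => {
    logger.error(err.message, { stack: err.stack, path: req.path, userId: req.user?.id });
    const known = err instanceof AppError;
    res.status(known ? err.statusCode : 500).json({
      error: known ? err.message : 'Internal server error',
    });
  };
}
```

Registering the handler last, after all routes, lets services throw `new NotFoundError('Document not found')` without knowing anything about HTTP responses.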

### Service Layer Pattern

#### OpenAI Service

- Separate methods for embeddings and chat completions
- Implement retry logic for API failures
- Handle rate limits gracefully
- Cache embeddings when possible
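
Retry logic for transient API failures is commonly written as a wrapper with exponential backoff. The sketch below is one possible shape, assuming that a failed call throws an error carrying the HTTP status in `err.status`. `withRetry` and its option names are hypothetical.

```javascript
// Retries an async operation with exponential backoff.
// withRetry and its options are hypothetical; shouldRetry decides which
// errors are transient (by default HTTP 429 and 5xx statuses on err.status).
async function withRetry(operation, {
  retries = 3,
  baseDelayMs = 500,
  shouldRetry = (err) => err.status === 429 || err.status >= 500,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (err) {
      // Give up on non-transient errors or when retries are exhausted.
      if (attempt >= retries || !shouldRetry(err)) throw err;
      // Delay doubles on each attempt: baseDelayMs, 2x, 4x, ...
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

An embeddings or chat completion call would be wrapped as `withRetry(() => callOpenAI(input))`, where `callOpenAI` is a placeholder for the service method. Client errors such as 400 are rethrown immediately because retrying them cannot succeed.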

#### Document Service

- Parse different file formats (PDF, DOCX, TXT, MD)
- Implement text chunking with overlap
- Generate metadata for each chunk
- Store in Pinecone with proper indexing
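
Generating per-chunk metadata can be sketched as a pure function that pairs each chunk with its embedding. The records use the `{ id, values, metadata }` shape that Pinecone upserts accept. `buildChunkRecords` and the `documentId#chunkIndex` ID scheme are assumptions made for illustration.

```javascript
// Turns text chunks and their embeddings into records shaped like
// { id, values, metadata }. buildChunkRecords and the ID scheme are hypothetical.
function buildChunkRecords({ documentId, source, uploadedBy, chunks, embeddings }) {
  if (chunks.length !== embeddings.length) {
    throw new Error('Each chunk needs exactly one embedding');
  }
  const timestamp = new Date().toISOString();
  return chunks.map((text, chunkIndex) => ({
    id: `${documentId}#${chunkIndex}`, // deterministic, so re-uploads overwrite
    values: embeddings[chunkIndex],
    metadata: { documentId, chunkIndex, text, source, uploadedBy, timestamp },
  }));
}
```

Keeping `documentId` in the metadata makes it possible to filter queries by document and to locate every chunk when a document is deleted.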

#### Chat Service

- Query Pinecone for relevant chunks
- Construct context from retrieved chunks
- Generate prompts with system instructions
- Stream responses when possible
- Track conversation history
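
Constructing the prompt from retrieved chunks and recent history might look like the sketch below. It produces a messages array of `{ role, content }` objects as used by chat completion requests. `buildChatMessages`, `SYSTEM_PROMPT`, the chunk shape `{ text, source }`, and the `maxHistory` default are all assumptions.

```javascript
// Hypothetical system instructions; the wording is an assumption.
const SYSTEM_PROMPT =
  'You are a customer support assistant. Answer only from the provided context. ' +
  'Cite sources by their bracketed number. If the context does not contain the answer, say so.';

// Builds the messages array for a chat completion request.
// maxHistory caps how many past turns are sent along with the question.
function buildChatMessages({ question, chunks, history = [], maxHistory = 6 }) {
  // Number each chunk so the model can cite it as [1], [2], ...
  const context = chunks
    .map((chunk, i) => `[${i + 1}] (${chunk.source}) ${chunk.text}`)
    .join('\n\n');
  // Prior { role, content } turns, oldest first.
  const recent = maxHistory > 0 ? history.slice(-maxHistory) : [];
  return [
    { role: 'system', content: SYSTEM_PROMPT },
    ...recent,
    { role: 'user', content: `Context:\n${context}\n\nQuestion: ${question}` },
  ];
}
```

Numbering the chunks gives the model a simple handle for source citations, and capping history keeps the prompt size bounded as conversations grow.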

### API Routes Structure

- `POST /api/auth/login` - User authentication
- `POST /api/auth/register` - User registration
- `POST /api/documents/upload` - Upload documents
- `GET /api/documents` - List documents
- `DELETE /api/documents/:id` - Delete document
- `POST /api/chat/message` - Send chat message
- `GET /api/admin/users` - Admin: List users
- `GET /api/admin/stats` - Admin: System statistics

### Example: Document Upload Flow

```javascript
// 1. Receive file upload
// 2. Validate file type and size
// 3. Extract text content
// 4. Split into chunks (e.g., 500 words with 50-word overlap)
// 5. Generate embeddings for each chunk
// 6. Store in Pinecone with metadata:
//    - documentId, chunkIndex, text, source, uploadedBy, timestamp
// 7. Return success response with document ID
```
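
Step 4 of the flow above can be sketched as a word-based splitter. This is a minimal sketch, assuming whitespace-delimited words are an acceptable unit. `chunkText` is a hypothetical helper whose defaults mirror the 500-word and 50-word figures given as examples.

```javascript
// Splits text into chunks of at most chunkSize words, where each chunk
// repeats the last `overlap` words of the previous one.
// chunkText is a hypothetical helper.
function chunkText(text, chunkSize = 500, overlap = 50) {
  if (overlap >= chunkSize) {
    throw new Error('overlap must be smaller than chunkSize');
  }
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(' '));
    if (start + chunkSize >= words.length) break; // this chunk reached the end
  }
  return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some words twice.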

### Example: Chat Query Flow

```javascript
// 1. Receive user question via WebSocket or API
// 2. Generate embedding for question
// 3. Query Pinecone for top K similar chunks (e.g., K=5)
// 4. Construct prompt with system instructions + context chunks + question
// 5. Call OpenAI GPT with constructed prompt
// 6. Stream response back to user
// 7. Log query and response for analytics
```
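
The steps above can be tied together in one orchestration function that receives its dependencies as arguments, which keeps the flow testable without network calls. Everything named here is hypothetical: `answerQuestion`, and the injected `embed`, `search`, `buildMessages`, and `complete` functions that stand in for the OpenAI and Pinecone services. Streaming (step 6) is left out to keep the sketch short.

```javascript
// Orchestrates the chat query steps with injected dependencies.
// All names are hypothetical:
//   embed(text)                     -> Promise<number[]>
//   search(vector, topK)            -> Promise<Array<{ text, source }>>
//   buildMessages(question, chunks) -> Array<{ role, content }>
//   complete(messages)              -> Promise<string>
//   logger.info(message, meta)
async function answerQuestion(question, {
  embed, search, buildMessages, complete, logger = console, topK = 5,
}) {
  const vector = await embed(question);              // step 2
  const chunks = await search(vector, topK);         // step 3
  const messages = buildMessages(question, chunks);  // step 4
  const answer = await complete(messages);           // step 5
  // De-duplicate sources so each document is cited once.
  const sources = [...new Set(chunks.map((chunk) => chunk.source))];
  logger.info('chat query answered', { question, sources }); // step 7
  return { answer, sources };
}
```

The same function can serve both the WebSocket handler and `POST /api/chat/message`, since neither transport is referenced inside it.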

## Code Style Guidelines

- Use ES6+ features (arrow functions, destructuring, template literals)
- Wrap awaited calls in try/catch blocks
- Write JSDoc comments with TypeScript-style type annotations
- Use meaningful variable and function names
- Keep functions focused and single-purpose
- Separate business logic from route handlers
- Use middleware for cross-cutting concerns

## Testing Considerations

- Test document parsing for all supported formats
- Test embedding generation and retrieval
- Test rate limiting behavior
- Test authentication flows
- Test WebSocket connections and disconnections
- Test error handling for external API failures

## Performance Optimization

- Implement caching for frequently accessed data
- Use connection pooling for database connections
- Compress API responses
- Implement pagination for list endpoints
- Use streaming for large responses
- Optimize Pinecone queries (appropriate K value, metadata filtering)
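
Pagination for list endpoints such as `GET /api/documents` could follow the shape sketched here. The helper name `paginate`, the `page` and `limit` parameters, and the defaults (20 per page, 100 maximum) are assumptions. Slicing an in-memory array is for illustration only, and a real data store would apply the limit and offset itself.

```javascript
// Parses page/limit values (numbers or query strings) and slices a list.
// paginate and its defaults are hypothetical.
function paginate(items, { page = 1, limit = 20, maxLimit = 100 } = {}) {
  // Fall back to defaults on invalid input and clamp to sane bounds.
  const safeLimit = Math.min(Math.max(Number.parseInt(limit, 10) || 20, 1), maxLimit);
  const safePage = Math.max(Number.parseInt(page, 10) || 1, 1);
  const start = (safePage - 1) * safeLimit;
  return {
    data: items.slice(start, start + safeLimit),
    page: safePage,
    limit: safeLimit,
    total: items.length,
    totalPages: Math.ceil(items.length / safeLimit),
  };
}
```

Capping `limit` on the server prevents a single request from pulling an unbounded list, which matters most for the admin endpoints.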

## Security Checklist

- Validate and sanitize all user inputs
- Use HTTPS in production
- Implement CORS properly
- Store secrets in environment variables
- Hash passwords using bcrypt
- Implement request size limits
- Use helmet.js for security headers
- Implement proper session management
