A comprehensive setup guide for developing and maintaining a CrewAI-based proofreading multiagent system, covering environment management, logging, and development workflow patterns for automated review and severity assessment of medical and scientific documents.
This system uses two specialized AI agents to review document issues from Excel files and reassess their severity levels using configurable rulesets. The architecture is optimized for 8GB development environments with production deployment on Google Cloud Run Functions.
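As a sketch of the record each agent works on (the class and field names below are illustrative assumptions, not the project's actual schema in `models/`):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Severity(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"

@dataclass
class DocumentIssue:
    """One row from the input Excel file; field names are illustrative."""
    issue_id: str
    description: str
    severity: Severity
    reassessed_severity: Optional[Severity] = None  # set by the review agents
```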
**CRITICAL**: Always use the existing virtual environment - never create new ones.
```bash
cd /path/to/proof_reading_multiagent_system
source .virtenv/bin/activate
pip install -r requirements.txt
```
For 8GB laptop development environments, cap the Node.js heap (used by Claude Code) at 3 GB:
```bash
echo 'export NODE_OPTIONS="--max-old-space-size=3072"' >> ~/.zshrc
source ~/.zshrc
```
Ensure the following directory structure exists:
```
proof_reading_multiagent_system/
├── logs/                                    # Root-level logs with archive/
├── proof_reading_multiagent_system/
│   ├── data/                                # Data files (benchmarks/, output/, samples/)
│   ├── docs/                                # Documentation
│   ├── knowledge/                           # CrewAI knowledge base
│   ├── scripts/                             # Utility scripts
│   ├── src/proof_reading_multiagent_system/
│   │   ├── config/                          # Agent and task YAML configs
│   │   ├── models/                          # Data models and schemas
│   │   ├── tools/                           # Custom CrewAI tools
│   │   ├── utils/                           # Logging infrastructure
│   │   ├── config.yaml                      # Main system configuration
│   │   ├── crew.py                          # CrewAI crew definition
│   │   └── main.py                          # Main execution entry point
│   └── tests/                               # Test suite
```
All configuration must be self-contained in config.yaml files - no manual environment variables required:
```python
import os

def _setup_google_cloud_environment(self) -> None:
    """Set up the Google Cloud environment from configuration."""
    try:
        gemini_config = self.config.get('gemini_config', {})
        credentials_path = gemini_config.get('credentials_path')
        if credentials_path:
            # Resolve relative paths against the config directory
            if not os.path.isabs(credentials_path):
                config_dir = os.path.dirname(__file__)
                credentials_path = os.path.join(config_dir, credentials_path)
            os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = credentials_path
    except Exception as e:
        self.logger.error(f"Failed to set up Google Cloud environment: {e}")
```
Example config.yaml:

```yaml
document_context:
  document_type: "Clinical Study Report"
  domain: "Oncology"
  urgency: "Standard"

behavior_config:
  verbose: false

logging:
  log_file: "/absolute/path/to/logs/proof_reading_system.log"
  output_directory: "/absolute/path/to/output/"

gemini_config:
  credentials_path: "$HOME/.config/gcloud/credentials.json"
```
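Note that `$HOME` inside a YAML string is a literal, not an expanded variable, so the loader must expand it itself. A minimal loader sketch (the project's actual loader may differ; `load_config` is an illustrative name):

```python
import os
import yaml

def load_config(path: str) -> dict:
    """Load config.yaml and expand $HOME-style variables in path fields."""
    with open(path) as f:
        config = yaml.safe_load(f)
    gemini = config.get('gemini_config', {})
    if 'credentials_path' in gemini:
        # "$HOME/..." arrives as a literal string; expand it before use
        gemini['credentials_path'] = os.path.expandvars(gemini['credentials_path'])
    return config
```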
Always use the structured logging system:
```python
from proof_reading_multiagent_system.utils import (
    get_logger, log_structured, correlation_context,
    performance_monitor, audit_logger, log_execution_time
)

# Track a processing operation with performance metrics
with performance_monitor.track_processing_operation("excel_processing") as metrics:
    # Processing logic
    metrics.records_processed = record_count

# Audit every severity change
audit_logger.log_severity_change(issue_id, old_severity, new_severity, reason, agent_name)

# Time critical functions
@log_execution_time("critical_function")
def process_excel_data():
    pass
```
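The project's `utils` module is not shown here, but `correlation_context` can be sketched with the standard-library `contextvars` module (an illustrative assumption, not the actual implementation):

```python
import contextvars
import uuid
from contextlib import contextmanager

# Holds the correlation id for the current execution context
_correlation_id = contextvars.ContextVar("correlation_id", default=None)

def generate_correlation_id() -> str:
    """Create a new opaque correlation id."""
    return uuid.uuid4().hex

def get_correlation_id():
    """Return the correlation id bound in the current context, if any."""
    return _correlation_id.get()

@contextmanager
def correlation_context(cid: str):
    """Bind a correlation id for the duration of the block."""
    token = _correlation_id.set(cid)
    try:
        yield cid
    finally:
        _correlation_id.reset(token)
```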
Wire session logging into the crew lifecycle hooks:

```python
from typing import Any, Dict

from crewai.project import CrewBase, before_kickoff, after_kickoff

@CrewBase
class ProofReadingCrew:
    def __init__(self):
        self.logger = setup_logging()
        self.session_id = generate_correlation_id()

    @before_kickoff
    def setup_session_logging(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        with correlation_context(self.session_id):
            self.logger.info("Starting proof reading session",
                             extra={'session_id': self.session_id})
        return inputs

    @after_kickoff
    def finalize_session_logging(self, output: Any) -> Any:
        performance_monitor.log_session_summary()
        return output
```
```bash
# Run against a sample file
python -m proof_reading_multiagent_system.main --input "data/samples/issues.xlsx"

# Custom output path
python -m proof_reading_multiagent_system.main --input "issues.xlsx" --output "/custom/path/result.xlsx"

# Verbose agent output
python -m proof_reading_multiagent_system.main --input "issues.xlsx" --verbose

# CrewAI project scripts
run_crew
train
replay
test
```
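The CLI flags above can be sketched with `argparse` (illustrative; `main.py` may parse its arguments differently):

```python
import argparse

def parse_args(argv=None):
    """Parse the CLI flags used in the invocations above."""
    parser = argparse.ArgumentParser(prog="proof_reading_multiagent_system")
    parser.add_argument("--input", required=True,
                        help="Path to the issues XLSX file")
    parser.add_argument("--output",
                        help="Optional path for the result XLSX")
    parser.add_argument("--verbose", action="store_true",
                        help="Enable verbose agent output")
    return parser.parse_args(argv)
```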
```bash
# Always activate the existing environment before committing
source .virtenv/bin/activate
# Stage only the files for the current feature
git add specific_files_for_feature
git commit -m "Add ExcelReaderTool: implement XLSX to DocumentIssue conversion"
```
```python
try:
    # Operation here
    result = process_document_issues(excel_data)
except Exception as e:
    log_error(
        f"Document processing failed: {e}",
        error_type=type(e).__name__,
        operation="document_processing",
        correlation_id=get_correlation_id(),
        additional_context={"file_path": input_file},
    )
    raise  # Re-raise after logging
```
Process only Medium/High severity issues for 60-80% efficiency gain:
```
XLSX Input → SeverityFilter → [Medium/High issues] + [Low issues (untouched)]
                                       ↓                       ↓
                          Process Medium/High only   Keep Low issues separate
                                       ↓                       ↓
                     Updated Medium/High issues → DataMergeTool ← Untouched Low issues
                                                       ↓
                                       XLSX Output (complete dataset)
```
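The split-and-merge flow above can be sketched with pandas (a sketch only; the project's actual SeverityFilter and DataMergeTool may differ, and the "Severity" column name is an assumption):

```python
import pandas as pd

def split_by_severity(df: pd.DataFrame):
    """Split issues into rows the agents process and rows passed through untouched."""
    mask = df["Severity"].isin(["Medium", "High"])
    return df[mask].copy(), df[~mask].copy()

def merge_results(processed: pd.DataFrame, untouched: pd.DataFrame) -> pd.DataFrame:
    """Recombine both partitions and restore the original row order."""
    return pd.concat([processed, untouched]).sort_index()
```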
1. **Check Node.js memory limit**: Ensure NODE_OPTIONS is set to 3GB
2. **Monitor file sizes**: Keep development Excel files under 5MB
3. **Restart when needed**: Don't hesitate to restart Claude Code if memory usage is high
4. **Browser management**: Close unnecessary tabs to free system memory
This setup ensures a robust, scalable proof reading system with proper logging, configuration management, and development workflows optimized for both local development and production deployment.