# Plexure API Search Cursor Rules
Apply these rules when building semantic API search platforms with vector embeddings, natural language processing, and API contract indexing.
## Code Preservation Principles

**Critical: Never remove existing functionality without explicit approval.**

- Preserve all existing code not directly related to the current change scope
- Maintain all existing functionality unless explicitly requested to change it
- Keep all existing imports and dependencies unless they are specifically being refactored
- Preserve existing error handling and logging mechanisms
- Maintain existing configuration options and defaults
- Keep all existing CLI commands and their functionality
- Do not remove or modify existing tests unless directly related to the changes
- Preserve existing documentation unless it needs updating for new changes

## Core File Responsibilities
**Search Engine Components:**

- `searcher.py` — Search engine core logic, query processing, result ranking
- `indexer.py` — API contract indexing, vector creation, database management
- `embeddings.py` — Vector embedding generation and model management
- `boosting.py` — Search result ranking and score boosting logic
- `expansion.py` — Query expansion and enhancement
- `understanding.py` — Natural language processing and query understanding

**Infrastructure Components:**

- `cli.py` — Command-line interface and user interaction handling
- `config.py` — Configuration management and environment variables
- `cache.py` — Caching mechanisms for search results and embeddings
- `metrics.py` — Performance tracking and statistical measurements
- `validation.py` — Input validation and data sanitization
- `consistency.py` — Data consistency and validation checks
- `quality.py` — Search quality metrics and improvements
- `monitoring.py` — System monitoring and logging

## Code Style Standards
Follow PEP 8 with these specific requirements:
- Use type hints for all function parameters and return values
- Maximum line length: 100 characters
- Use docstrings for all public functions and classes (Google or NumPy style)
- Use consistent naming conventions (snake_case for functions/variables, PascalCase for classes)
- Add comments for complex logic or business rules
- Keep functions focused and single-purpose
- Use meaningful variable names that describe their purpose

Example:
```python
def calculate_semantic_similarity(
    query_embedding: np.ndarray,
    document_embedding: np.ndarray,
    boost_factor: float = 1.0,
) -> float:
    """Calculate cosine similarity between query and document embeddings.

    Args:
        query_embedding: Query vector representation.
        document_embedding: Document vector representation.
        boost_factor: Score multiplier for ranking adjustments.

    Returns:
        Similarity score between 0 and 1.
    """
    # Normalize so the dot product is a true cosine similarity,
    # even when embeddings are not unit-length.
    denominator = np.linalg.norm(query_embedding) * np.linalg.norm(document_embedding)
    similarity = np.dot(query_embedding, document_embedding) / denominator
    return min(similarity * boost_factor, 1.0)
```
## Project Structure
Maintain this directory organization:
```
plexure-api-search/
├── plexure_api_search/     # Core logic (all modules go here)
│   ├── __init__.py
│   ├── searcher.py
│   ├── indexer.py
│   ├── embeddings.py
│   ├── cli.py
│   ├── config.py
│   └── ...
├── tests/                  # All tests
│   ├── test_searcher.py
│   ├── test_indexer.py
│   └── ...
├── docs/                   # Documentation
│   ├── api.md
│   ├── architecture.md
│   ├── deployment.md
│   ├── development.md
│   └── testing.md
├── pyproject.toml # Project metadata and dependencies
├── poetry.lock # Locked dependency versions
├── .env.sample # Example environment variables
├── .gitignore
├── README.md
├── LICENSE
└── NOTICE
```
**Key principles:**
- Organize modules by feature/domain
- Use `__init__.py` files to control public APIs
- Keep circular dependencies strictly forbidden
- Maintain clear separation of concerns

## Testing Requirements
Maintain rigorous test coverage:
- Write unit tests for all new functionality
- Maintain test coverage above 80%
- Place tests in the `tests/` directory with a `test_` prefix
- Test file names must match source file names (`test_searcher.py` for `searcher.py`)
- Include integration tests for critical paths (indexing, search, ranking)
- Mock external dependencies (embedding models, databases, APIs)
- Test error conditions and edge cases
- Use pytest fixtures for common setup
- Include performance tests for critical operations (search latency, indexing throughput)

Example test structure:
```python
import numpy as np
import pytest

from plexure_api_search.searcher import SemanticSearcher


@pytest.fixture
def mock_embeddings():
    """Provide mock embedding vectors for testing."""
    return np.random.rand(10, 768)


def test_search_returns_results(mock_embeddings):
    """Test that search returns ranked results."""
    searcher = SemanticSearcher()
    results = searcher.search("user authentication API")
    assert len(results) > 0
    assert results[0].score >= results[-1].score


def test_search_handles_empty_query():
    """Test graceful handling of empty queries."""
    searcher = SemanticSearcher()
    results = searcher.search("")
    assert len(results) == 0
```
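The performance-test requirement above has no example of its own. One minimal approach is a reusable latency-budget helper; the `assert_latency` name is illustrative, and the 200ms default mirrors the response-time target stated later in this document:

```python
import time


def assert_latency(func, *args, budget_ms: float = 200.0):
    """Call func with args, returning (result, elapsed_ms); fail if over budget."""
    start = time.perf_counter()
    result = func(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    assert elapsed_ms <= budget_ms, f"{elapsed_ms:.1f}ms exceeds {budget_ms}ms budget"
    return result, elapsed_ms
```

In a pytest performance test, wrap the search call (e.g. `assert_latency(searcher.search, "user auth")`) and let the assertion enforce the budget.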
## Documentation Standards
Keep documentation comprehensive and current:
- Document all public APIs with docstrings (Google or NumPy style)
- Keep README.md up to date with new features
- Include usage examples in docstrings
- Document complex algorithms with inline comments
- Maintain API documentation in `docs/api.md`
- Include type hints in documentation
- Document error conditions and handling
- Keep configuration options documented in `docs/development.md`
- Include troubleshooting guides in `docs/deployment.md`

## Dependency Management
Use Poetry for all dependency management:
- Pin dependency versions in `pyproject.toml`
- Keep dependencies up to date (run `poetry update` monthly)
- Document new dependencies in README.md with justification
- Keep development dependencies (linters, formatters, test tools) in the Poetry dev group
- Make optional dependencies truly optional with graceful fallbacks
- Handle dependency conflicts by pinning compatible versions
- Document minimum version requirements in README.md
- Test with multiple Python versions (3.9, 3.10, 3.11+)
- Keep `poetry.lock` committed to version control

Example `pyproject.toml` section:
```toml
[tool.poetry.dependencies]
python = "^3.9"
numpy = "^1.24.0"
sentence-transformers = "^2.2.0"
chromadb = "^0.4.0"
click = "^8.1.0"
[tool.poetry.group.dev.dependencies]
pytest = "^7.4.0"
pytest-cov = "^4.1.0"
black = "^23.7.0"
mypy = "^1.4.0"
```
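The "truly optional with graceful fallbacks" rule can be sketched as a guarded import. `chromadb` is used here because it appears in the dependency example above; the in-memory dict fallback is a placeholder illustrating the pattern, not the project's actual degradation path:

```python
# Guarded import: the feature degrades instead of crashing when the extra is absent.
try:
    import chromadb  # optional vector-DB backend
    HAS_CHROMADB = True
except ImportError:
    chromadb = None
    HAS_CHROMADB = False


def get_vector_store():
    """Return the Chroma-backed client if available, else a minimal in-memory stand-in."""
    if HAS_CHROMADB:
        return chromadb.Client()
    return {}  # placeholder fallback so core code paths stay importable
```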
## Environment Configuration
Use `.env` for environment-specific configuration:
- Use `.env` for environment variables (never commit it)
- Keep `.env.sample` updated with all required variables
- Use Python 3.9 or higher
- Document all environment variables with descriptions
- Provide default values when appropriate
- Validate environment variables at startup
- Handle missing variables gracefully with clear error messages
- Support different environments (development, staging, production)

Example `.env.sample`:
```bash
# Embedding Model Configuration
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
EMBEDDING_DIMENSION=384

# Database Configuration
VECTOR_DB_PATH=./data/chroma_db
CACHE_TTL_SECONDS=3600

# Performance Tuning
MAX_SEARCH_RESULTS=50
QUERY_TIMEOUT_SECONDS=30
```
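Startup validation of these variables could look like the sketch below. The variable names come from the sample above; the `load_settings` helper and its split between required and defaulted variables are illustrative assumptions:

```python
import os

REQUIRED_VARS = ("EMBEDDING_MODEL", "VECTOR_DB_PATH")
DEFAULTS = {"MAX_SEARCH_RESULTS": "50", "CACHE_TTL_SECONDS": "3600"}


def load_settings(env=os.environ) -> dict:
    """Fail fast on missing required variables; apply defaults for the rest."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    settings = {name: env[name] for name in REQUIRED_VARS}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name, default)
    return settings
```

Calling this once at startup gives every missing variable a clear error message instead of a failure deep inside the search path.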
## Model Management
Handle embedding models robustly:
- Use public, well-established models by default (`sentence-transformers/all-MiniLM-L6-v2`)
- Handle model loading errors gracefully with retry logic
- Implement fallback models for robustness (smaller models as backup)
- Cache model artifacts appropriately (disk cache, not memory, for large models)
- Monitor model performance metrics (inference latency, embedding quality)
- Support model versioning in configuration
- Handle tokenization errors gracefully (truncate, skip, warn)
- Validate model compatibility at startup
- Document model requirements and limitations in README.md

Example model loading:
```python
import logging

from sentence_transformers import SentenceTransformer

logger = logging.getLogger(__name__)


def load_embedding_model(
    model_name: str, fallback: str = "all-MiniLM-L6-v2"
) -> SentenceTransformer:
    """Load an embedding model, falling back to a known-good default."""
    try:
        return SentenceTransformer(model_name)
    except Exception as e:
        logger.warning(f"Failed to load {model_name}, using fallback: {e}")
        return SentenceTransformer(fallback)
```
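The retry-logic rule above is not shown in the fallback example. One way to add it without touching the loader is a small generic wrapper; `with_retries` and its fixed-delay policy are an illustrative sketch, not the project's actual retry strategy:

```python
import time


def with_retries(load_fn, attempts: int = 3, delay_s: float = 1.0):
    """Call load_fn, retrying on any exception with a fixed delay between attempts."""
    last_error = None
    for attempt in range(attempts):
        try:
            return load_fn()
        except Exception as e:  # model downloads can fail transiently
            last_error = e
            if attempt < attempts - 1:
                time.sleep(delay_s)
    raise last_error
```

Combined with the loader above: `with_retries(lambda: load_embedding_model(model_name))`.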
## Error Handling
Implement comprehensive error handling:
- Use specific exception types (`ValueError`, `FileNotFoundError`, a custom `ModelLoadError`)
- Log errors with appropriate context (stack traces, input data, system state)
- Implement graceful fallbacks (cached results, simplified queries, default responses)
- Validate inputs early, at API boundaries
- Handle resource cleanup properly (close files, connections, model handles)
- Provide meaningful error messages to users
- Include error recovery mechanisms (retry logic, circuit breakers)
- Monitor error rates and patterns in production
- Document error handling procedures in `docs/troubleshooting.md`

## Performance Optimization
Keep search response times under 200ms:
- Cache expensive operations (embedding generation, search results)
- Use vectorized operations where possible (NumPy, batch processing)
- Monitor memory usage (track embedding cache size, vector DB memory)
- Profile performance-critical code (search, indexing, ranking)
- Implement request timeouts (30s default)
- Use connection pooling for databases
- Implement rate limiting to prevent abuse
- Monitor resource utilization (CPU, memory, disk I/O)
- Optimize database queries (use indexes, batch operations)

Example caching:
```python
from functools import lru_cache


@lru_cache(maxsize=1000)
def get_embedding(text: str) -> np.ndarray:
    """Get a text embedding, memoized with an LRU cache."""
    # embedding_model is assumed to be a module-level SentenceTransformer instance.
    return embedding_model.encode(text)
```
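The vectorized-operations rule can be illustrated by scoring one query against many documents in a single NumPy pass instead of a Python loop; `batch_cosine_similarity` is a sketch assuming rows of `docs` are document embeddings:

```python
import numpy as np


def batch_cosine_similarity(query: np.ndarray, docs: np.ndarray) -> np.ndarray:
    """Return cosine similarity of one query vector against every row of docs."""
    query_norm = query / np.linalg.norm(query)
    doc_norms = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    # A single matrix-vector product replaces a per-document Python loop.
    return doc_norms @ query_norm
```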
## Security Best Practices
Protect sensitive data and prevent abuse:
- Validate all inputs (sanitize queries, check lengths, reject malicious patterns)
- Sanitize API responses (strip internal paths, remove sensitive metadata)
- Use environment variables for secrets (never hardcode them)
- Never log sensitive data (API keys, user PII, internal paths)
- Implement rate limiting (per-user, per-IP)
- Use secure communication (HTTPS, TLS for database connections)
- Validate authentication tokens if applicable
- Implement access controls for admin operations
- Monitor security events (failed auth, rate limit hits)

## Monitoring and Observability
Track system health and performance:
- Log important operations (search queries, indexing jobs, cache hits)
- Track performance metrics (latency percentiles, throughput, cache hit rate)
- Monitor error rates (by type, endpoint, user)
- Use structured logging (JSON format, consistent fields)
- Implement health checks (`/health` endpoint)
- Track resource utilization (memory, CPU, disk)
- Monitor API latencies (p50, p95, p99)
- Implement alerting for anomalies
- Track business metrics (search volume, top queries, user engagement)
- Monitor cache effectiveness (hit rate, eviction rate)

Example structured logging:
```python
import json
import logging
from datetime import datetime

logger = logging.getLogger(__name__)


def log_search(query: str, results_count: int, latency_ms: float) -> None:
    """Log a search operation as structured JSON."""
    logger.info(json.dumps({
        "event": "search",
        "query": query,
        "results_count": results_count,
        "latency_ms": latency_ms,
        "timestamp": datetime.utcnow().isoformat(),
    }))
```
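Tracking the p50/p95/p99 latencies mentioned above needs only the standard library; this `LatencyTracker` is a minimal in-process sketch (production systems typically delegate this to a metrics backend):

```python
import statistics


class LatencyTracker:
    """Record request latencies and report percentiles such as p50, p95, p99."""

    def __init__(self) -> None:
        self.samples: list = []

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def percentile(self, p: int) -> float:
        # quantiles(n=100) yields 99 cut points; index p-1 is the p-th percentile.
        cuts = statistics.quantiles(self.samples, n=100, method="inclusive")
        return cuts[p - 1]
```

For example, `tracker.percentile(95)` gives the p95 latency over everything recorded so far.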
## Version Control
Follow conventional commits format:
- Write clear commit messages (imperative mood, present tense)
- Follow semantic versioning (MAJOR.MINOR.PATCH)
- Keep feature branches small (merge within 2-3 days)
- Merge only after tests pass (CI/CD gate)
- Use the conventional commits format (`feat:`, `fix:`, `docs:`, `refactor:`, etc.)
- Review code changes before merging
- Maintain CHANGELOG.md with release notes
- Tag releases properly (`v1.2.3`)
- Document breaking changes in the CHANGELOG
- Keep the main branch stable (all tests passing)

Example commit messages:
```
feat(search): add query expansion with synonyms
fix(indexer): handle malformed API contracts gracefully
docs(readme): update model requirements section
refactor(cache): extract cache logic into separate module
```
## Implementation Checklist
When adding new features:
1. Design the feature with clear interfaces
2. Write tests first (TDD approach)
3. Implement core logic with type hints
4. Add comprehensive docstrings
5. Update relevant documentation
6. Add monitoring and logging
7. Test error conditions
8. Profile performance if critical path
9. Update CHANGELOG.md
10. Review code preservation rules
## Common Patterns
**Query Processing Pipeline:**
```python
def process_query(raw_query: str, query_metadata: dict):
    query = validate_input(raw_query)
    expanded_query = expand_query(query)
    embedding = generate_embedding(expanded_query)
    results = vector_search(embedding)
    boosted_results = apply_boosting(results, query_metadata)
    return rank_and_format(boosted_results)
```
**Indexing Pipeline:**
```python
api_contracts = load_contracts(source_path)
validated_contracts = validate_contracts(api_contracts)
embeddings = generate_embeddings_batch(validated_contracts)
index_to_vector_db(embeddings, metadata)
update_search_index()
invalidate_cache()
```
## Anti-Patterns to Avoid

- Do not hardcode model names or paths
- Do not ignore model loading failures
- Do not skip input validation
- Do not log full query results (they may contain sensitive data)
- Do not block on synchronous embedding generation for large batches
- Do not store embeddings in memory without size limits
- Do not skip error handling in production code
- Do not remove existing tests without replacement
- Do not modify core logic without preserving backward compatibility
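The "no unbounded in-memory embeddings" rule above can be enforced with a hard cap on cache entries; this `BoundedEmbeddingCache` is an illustrative LRU sketch (in practice, the `@lru_cache` decorator shown earlier or a dedicated cache library achieves the same bound):

```python
from collections import OrderedDict


class BoundedEmbeddingCache:
    """In-memory embedding cache with a hard entry limit and LRU eviction."""

    def __init__(self, max_entries: int = 10_000) -> None:
        self.max_entries = max_entries
        self._store: OrderedDict = OrderedDict()

    def get(self, key: str):
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            return self._store[key]
        return None

    def put(self, key: str, value) -> None:
        self._store[key] = value
        self._store.move_to_end(key)
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict the least recently used entry
```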