Enforces PEP 8, Ruff formatting, type hints, and Google-style docstrings for Python data science projects. Generates complete, production-ready code with pandas, numpy, and plotly defaults.
A Cursor rules configuration for Python data science development that enforces strict PEP 8 compliance, comprehensive type hints, and Google-style documentation standards.
This skill configures Cursor to apply the following rules whenever it generates Python code.
**Code Style and Formatting**

1. **Adhere strictly to PEP 8 and Ruff formatting standards**
2. Use descriptive variable names and avoid unnecessary abbreviations
3. Prefer early returns for clarity and minimal nesting
4. Always generate complete, working code with no TODOs or placeholders
5. Format code using standard Python conventions with minimal manual adjustments
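The early-return style from rule 3 can be sketched as follows; `normalize_scores` is a hypothetical example function, not part of the skill itself:

```python
def normalize_scores(scores: list[float]) -> list[float]:
    """Scale scores to the 0-1 range, returning early on edge cases."""
    if not scores:
        return []
    maximum_score = max(scores)
    if maximum_score == 0:
        # All scores are zero; avoid division by zero.
        return [0.0 for _ in scores]
    return [score / maximum_score for score in scores]
```

Handling the empty and all-zero cases up front keeps the main computation at the shallowest indentation level.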
**Type Hints and Documentation**

1. **Automatically include Python type hints for every function and method**
2. For every function or class, generate a Google-style docstring that includes:
- A brief description of the function's purpose
- A Parameters section with types and descriptions
- A Returns section with type and explanation
- An Example section when applicable (especially for data manipulation functions)
3. When handling DataFrame operations, include a section detailing the expected structure (e.g., column names and data types)
4. Ensure that all generated docstrings are copy-paste ready for integration with Google Docs
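A docstring that documents the expected DataFrame structure (rule 3) might look like this sketch; `total_revenue` and its column names are illustrative assumptions:

```python
import pandas as pd


def total_revenue(orders: pd.DataFrame) -> float:
    """Compute total revenue from an orders table.

    Expected DataFrame structure:
        - quantity (int): Number of units per order.
        - unit_price (float): Price per unit in USD.

    Args:
        orders (pd.DataFrame): Order records with the columns above.

    Returns:
        float: Sum of quantity * unit_price across all rows.

    Example:
        >>> df = pd.DataFrame({'quantity': [2, 1], 'unit_price': [3.0, 5.0]})
        >>> total_revenue(df)
        11.0
    """
    return float((orders['quantity'] * orders['unit_price']).sum())
```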
**Code Structure**

1. Always produce code that is modular and broken into logical functions or classes
2. If generating data science code, default to using libraries such as pandas, numpy, and plotly
3. When handling statistical or machine learning tasks, include basic logging and error handling
4. If code is complex, add inline comments describing the "why" behind key operations to facilitate future documentation
5. Generate complete code blocks including all necessary imports at the top
6. When suggesting refactoring, ensure that the revised code remains fully functional and improves readability
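The basic logging and error handling from rule 3 could look like this sketch; the `safe_zscore` helper and its degenerate-input policy are illustrative assumptions:

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)


def safe_zscore(values: np.ndarray) -> np.ndarray:
    """Standardize an array, logging and handling degenerate inputs."""
    if values.size == 0:
        logger.warning("Received an empty array; returning it unchanged.")
        return values
    std = values.std()
    if std == 0:
        # All values identical: define the z-scores as zero in this case.
        logger.info("Zero-variance input; returning zeros.")
        return np.zeros_like(values, dtype=float)
    return (values - values.mean()) / std
```

Logging the edge cases rather than raising keeps exploratory pipelines running while leaving a trace for later debugging.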
**Context Awareness**

1. Assume a data science context. When generating functions, check if the operation is on DataFrames or numerical arrays
2. Automatically integrate typing hints and Google-style documentation in every generated snippet
3. When referencing external documentation (e.g., for Google Docs integration), include a comment explaining the integration process
4. For multi-file projects, support context via the "@" syntax so that code spanning multiple modules is well-coordinated
5. Use contextual cues to tailor examples: if the function involves CSV reading, generate code that uses `pandas.read_csv` with proper error checks
**Library Defaults**

1. **Default to generating code that uses:**
```python
import pandas as pd
import numpy as np
import plotly.express as px
```
2. When a function deals with data cleaning or analysis, include an explanation of assumptions (e.g., missing values handling)
3. If generating machine learning code, assume a PyTorch workflow with device settings using `torch.device(...)` as needed
4. When generating code snippets that involve documentation export (e.g., to Google Docs), include instructions as comments
5. Prioritize readability: if a more concise or vectorized solution exists (e.g., using list comprehensions or pandas vectorized methods), use it
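Rule 5's preference for vectorized pandas operations over explicit loops can be illustrated with a hypothetical discount column:

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "on_sale": [True, False, True]})

# Loop-based version (avoid): row-wise iteration is slow on large frames.
# discounted = [p * 0.9 if s else p for p, s in zip(df["price"], df["on_sale"])]

# Vectorized version (prefer): operates on whole columns at once.
# Series.where keeps values where the condition holds, so keep the price
# where the item is NOT on sale, otherwise apply the 10% discount.
df["discounted"] = df["price"].where(~df["on_sale"], df["price"] * 0.9)
```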
When generating a new Python function, always provide a docstring in the following format:
```python
def function_name(param1: Type, param2: Type) -> ReturnType:
    """Brief one-sentence description.

    Args:
        param1 (Type): Description.
        param2 (Type): Description.

    Returns:
        ReturnType: Description of the returned value.

    Example:
        >>> function_name(example_param1, example_param2)
        Expected output description
    """
    # Function implementation
    pass
```
This rule should be applied universally to all functions and methods.
When you ask Cursor to generate a data processing function:
```python
import pandas as pd
import numpy as np
from typing import Optional


def clean_dataframe(
    df: pd.DataFrame,
    columns_to_drop: Optional[list[str]] = None,
    fill_na_value: float = 0.0,
) -> pd.DataFrame:
    """Clean a DataFrame by dropping specified columns and filling missing values.

    Args:
        df (pd.DataFrame): Input DataFrame to clean.
        columns_to_drop (Optional[list[str]]): Column names to drop. Defaults to None.
        fill_na_value (float): Value used to fill missing data. Defaults to 0.0.

    Returns:
        pd.DataFrame: Cleaned DataFrame with the specified columns removed and
        NaN values filled.

    Example:
        >>> df = pd.DataFrame({'a': [1, np.nan, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})
        >>> clean_dataframe(df, columns_to_drop=['c'], fill_na_value=0)
             a  b
        0  1.0  4
        1  0.0  5
        2  3.0  6

    Note:
        Assumes the input DataFrame has numeric or otherwise compatible dtypes
        for the fill operation.
    """
    # Work on a copy to avoid mutating the caller's DataFrame.
    df_cleaned = df.copy()

    # Drop the requested columns, ignoring names that are not present.
    if columns_to_drop:
        df_cleaned = df_cleaned.drop(columns=columns_to_drop, errors='ignore')

    # Fill missing values.
    return df_cleaned.fillna(fill_na_value)
```