Enforces PEP 8 standards, automatic type hints, Google-style docstrings, and data science best practices for Python development with pandas, numpy, and plotly.
This skill enforces strict Python coding standards with a focus on data science workflows. It ensures all generated code follows PEP 8, includes comprehensive type hints and Google-style docstrings, and integrates best practices for pandas, numpy, and plotly development.
1. **Adhere strictly to PEP 8 and Ruff formatting standards**
- Use descriptive variable names and avoid unnecessary abbreviations
- Prefer early returns for clarity and minimal nesting
- Format code using standard Python conventions with minimal manual adjustments
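As a minimal sketch of these conventions (the function name and behavior are invented for illustration):

```python
def average_order_value(order_totals: list[float]) -> float:
    """Return the mean order value, or 0.0 for an empty input."""
    # Early return avoids nesting the main computation in an else branch
    if not order_totals:
        return 0.0
    return sum(order_totals) / len(order_totals)
```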
2. **Always generate complete, working code**
- No TODOs or placeholders
- Include all necessary imports at the top
- Ensure code is fully functional and ready to run
3. **Automatically include Python type hints for every function and method**
- Add type hints to all parameters and return values
- Use appropriate types from `typing` module when needed
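For example, a hypothetical function whose return value may be absent would be annotated with `Optional`:

```python
from typing import Optional


def first_positive(values: list[float]) -> Optional[float]:
    """Return the first value greater than zero, or None if there is none."""
    for value in values:
        if value > 0:
            return value
    return None
```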
4. **Generate Google-style docstrings for every function and class**
- Brief description of the function's purpose
- Parameters section with types and descriptions
- Returns section with type and explanation
- Example section when applicable (especially for data manipulation functions)
- For DataFrame operations, include expected structure (column names and data types)
- Ensure docstrings are copy-paste ready
5. **Follow this docstring template universally:**
```python
def function_name(param1: Type, param2: Type) -> ReturnType:
    """
    Brief one-sentence description.

    Args:
        param1 (Type): Description.
        param2 (Type): Description.

    Returns:
        ReturnType: Description of the returned value.

    Example:
        >>> function_name(example_param1, example_param2)
        expected_output
    """
    # function body
```
6. **Produce modular code broken into logical functions or classes**
- Each function should have a single, clear responsibility
- When suggesting refactoring, ensure revised code remains fully functional
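A small sketch of this modular style, splitting a data-preparation pipeline into single-responsibility steps (the column names `price` and `tax` are hypothetical):

```python
import pandas as pd


def drop_incomplete_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Remove rows containing any missing values."""
    return df.dropna()


def add_total_column(df: pd.DataFrame) -> pd.DataFrame:
    """Add a 'total' column as the sum of 'price' and 'tax'."""
    df = df.copy()
    df['total'] = df['price'] + df['tax']
    return df


def prepare_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Compose the single-responsibility steps into one pipeline."""
    return add_total_column(drop_incomplete_rows(df))
```

Each step stays independently testable, and the composing function documents the overall flow.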
7. **Use data science libraries by default:**
- pandas for data manipulation
- numpy for numerical operations
- plotly for visualization
- Include standard imports: `import pandas as pd`, `import numpy as np`, `import plotly.express as px`
8. **Include robust error handling and logging**
- Add try-except blocks for statistical or machine learning tasks
- Include basic logging for complex operations
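One way this can look in practice (a sketch, not a prescribed pattern; the function and message are invented):

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)


def safe_mean(values: np.ndarray) -> float:
    """Compute the mean of an array, logging failures instead of crashing."""
    try:
        if values.size == 0:
            raise ValueError("cannot compute the mean of an empty array")
        return float(np.mean(values))
    except ValueError:
        # logger.exception records the full traceback for later debugging
        logger.exception("Mean computation failed")
        return float('nan')
```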
9. **Add inline comments for complex operations**
- Describe the "why" behind key operations
- Facilitate future documentation and maintenance
10. **Assume data science context by default**
- Check if operations are on DataFrames or numerical arrays
- Tailor examples to the data structure being used
11. **For DataFrame operations:**
- Use `pandas.read_csv` with proper error checks for CSV reading
- Include explanations of assumptions (e.g., missing values handling)
- Prioritize vectorized operations over loops
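A hedged sketch combining these points, assuming a CSV with a `price` column (the function name, column, and discount factor are invented for the example):

```python
import logging

import pandas as pd

logger = logging.getLogger(__name__)


def load_prices(csv_path: str) -> pd.DataFrame:
    """Load a price table from CSV with basic error checks.

    Assumption: the file has a 'price' column, and missing prices
    are treated as zero.
    """
    try:
        df = pd.read_csv(csv_path)
    except FileNotFoundError:
        logger.error("CSV file not found: %s", csv_path)
        return pd.DataFrame()
    # Stated assumption: missing prices are filled with zero
    df['price'] = df['price'].fillna(0.0)
    # Vectorized arithmetic instead of a row-by-row loop
    df['discounted'] = df['price'] * 0.5
    return df
```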
12. **For machine learning code:**
- Assume PyTorch workflow when applicable
- Include device settings using `torch.device(...)`
13. **Prioritize readability and efficiency:**
- Use list comprehensions or pandas vectorized methods when available
- Choose concise, clear solutions over verbose implementations
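The trade-off can be illustrated with a simple squaring operation on a NumPy array (the values are arbitrary):

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# Verbose loop version
squared_loop = []
for value in values:
    squared_loop.append(value ** 2)

# Concise list comprehension
squared_comp = [value ** 2 for value in values]

# Vectorized NumPy version (preferred when working with arrays)
squared_vec = values ** 2
```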
14. **Support multi-file projects:**
- Coordinate code across modules using proper imports
- Include comments explaining integration with external systems (e.g., Google Docs export)
When asked to "create a function to clean a DataFrame":
```python
import pandas as pd
from typing import List, Optional


def clean_dataframe(
    df: pd.DataFrame,
    columns_to_drop: Optional[List[str]] = None,
    fill_na_value: float = 0.0,
) -> pd.DataFrame:
    """
    Clean a DataFrame by dropping specified columns and filling missing values.

    Args:
        df (pd.DataFrame): Input DataFrame to clean.
        columns_to_drop (Optional[List[str]]): List of column names to drop.
            Defaults to None.
        fill_na_value (float): Value used to fill NaN entries. Defaults to 0.0.

    Returns:
        pd.DataFrame: Cleaned DataFrame with specified columns removed and
            NaN values filled.

    Example:
        >>> data = pd.DataFrame({'A': [1, 2, None], 'B': [4, None, 6], 'C': [7, 8, 9]})
        >>> clean_dataframe(data, columns_to_drop=['C'], fill_na_value=0)
             A    B
        0  1.0  4.0
        1  2.0  0.0
        2  0.0  6.0
    """
    # Create a copy to avoid modifying the original DataFrame
    df_cleaned = df.copy()

    # Drop specified columns if provided; ignore names that do not exist
    if columns_to_drop:
        df_cleaned = df_cleaned.drop(columns=columns_to_drop, errors='ignore')

    # Fill missing values with the specified value
    return df_cleaned.fillna(fill_na_value)
```