Install and use scikit-learn, a Python library for machine learning and data mining built on NumPy and SciPy and distributed under the 3-Clause BSD license. It provides classification, regression, clustering, and dimensionality reduction capabilities.
This skill helps you set up and use scikit-learn (v1.8.0+) for machine learning tasks including classification, regression, clustering, dimensionality reduction, model selection, and preprocessing. The project has been maintained by a community of volunteers since 2007.
Check that Python 3.11 or higher is installed:
```bash
python --version
```
Scikit-learn requires NumPy (>= 1.24.1), SciPy (>= 1.10.0), joblib (>= 1.3.0), and threadpoolctl (>= 3.2.0). These will be installed automatically.
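To see whether an existing environment already meets these minimums, a check along the following lines may help. This is a sketch that uses only the standard library. The `MINIMUMS` table copies the versions listed above, and the helper names `parse_release`, `meets_minimum`, and `check_dependencies` are illustrative and not part of scikit-learn. The version parsing is deliberately simplified (it ignores pre-release suffixes), so treat the result as a hint rather than a guarantee.

```python
import re
from importlib import metadata

# Minimum versions copied from the requirements listed above
MINIMUMS = {
    "numpy": "1.24.1",
    "scipy": "1.10.0",
    "joblib": "1.3.0",
    "threadpoolctl": "3.2.0",
}


def parse_release(version):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings by their numeric release components."""
    return parse_release(installed) >= parse_release(minimum)


def check_dependencies(minimums=MINIMUMS):
    """Map each package name to 'ok', 'too old', or 'not installed'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_dependencies().items():
        print(f"{name}: {status}")
```

Packages reported as "not installed" or "too old" should be handled by pip or conda during the installation steps that follow.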
Using pip:
```bash
pip install -U scikit-learn
```
Using conda:
```bash
conda install -c conda-forge scikit-learn
```
For plotting capabilities:
```bash
pip install "matplotlib>=3.6.1"
```
For specific examples:
```bash
pip install "scikit-image>=0.22.0" "pandas>=1.5.0" "seaborn>=0.13.0" "plotly>=5.18.0"
```
Verify the installation:
```bash
python -c "import sklearn; print(sklearn.__version__)"
```
Classification with a random forest on the iris dataset:
```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Load the iris dataset and hold out 30% of the samples for testing
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

# Fit the classifier on the training split
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)

# Score predictions on the held-out split
y_pred = clf.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
```
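A single train/test split gives one accuracy figure that depends on how the data happened to be divided. As a complement, the following sketch touches on the preprocessing and model selection tasks mentioned in the introduction by chaining a scaler and a classifier in a pipeline and scoring it with cross-validation. The choice of `StandardScaler`, `LogisticRegression`, `max_iter=1000`, and five folds is an assumption made for illustration; other estimators and settings could take their place.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Chain scaling and the classifier so the scaler is refit on each training fold
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validation; with a classifier, an integer cv uses stratified folds
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"Mean accuracy: {scores.mean():.2f} (+/- {scores.std():.2f})")
```

Keeping the scaler inside the pipeline means its statistics are computed only from each training fold, which avoids leaking information from the validation fold into preprocessing.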
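Linear regression on synthetic data:
```python
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression

# Generate a noisy one-feature regression problem
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)

# Fit ordinary least squares and predict on the training inputs
model = LinearRegression()
model.fit(X, y)
predictions = model.predict(X)
```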
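The regression example predicts on the same data it was fitted on, which says little about how the model generalizes. A possible extension, sketched here with the same synthetic data, holds out part of the samples and reports error metrics on them. The 30% test size and the choice of R^2 and mean squared error as metrics are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)

# Hold out 30% of the samples so the metrics reflect unseen data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Inspect the fitted line and its quality on the held-out split
print(f"Slope: {model.coef_[0]:.2f}, intercept: {model.intercept_:.2f}")
print(f"R^2: {r2_score(y_test, y_pred):.2f}")
print(f"MSE: {mean_squared_error(y_test, y_pred):.2f}")
```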
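K-means clustering on synthetic blobs:
```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Generate 300 points around 4 centers; the true labels are discarded
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Fit k-means and assign each point to its nearest centroid
kmeans = KMeans(n_clusters=4, random_state=42)
labels = kmeans.fit_predict(X)
```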
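Clustering has no ground-truth labels to score against, so it can help to inspect the fitted model and compute an internal quality measure. The following is a minimal sketch on the same synthetic data. Using the silhouette score as the measure and setting `n_init=10` explicitly are assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# n_init is set explicitly so the behaviour does not depend on the library default
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

# One centroid per cluster, with one coordinate per input feature
print(kmeans.cluster_centers_.shape)

# Silhouette score lies in [-1, 1]; higher values suggest better-separated clusters
print(f"Silhouette: {silhouette_score(X, labels):.2f}")
```

Repeating the fit for several values of `n_clusters` and comparing the scores is one common way to choose the number of clusters when it is not known in advance.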
Run the test suite after installation:
```bash
pip install "pytest>=7.1.2"
pytest sklearn
```
Control random number generation during testing by setting the SKLEARN_SEED environment variable:
```bash
export SKLEARN_SEED=42
pytest sklearn
```
If using scikit-learn in scientific publications, please cite appropriately: https://scikit-learn.org/stable/about.html#citing-scikit-learn