simple-knn
Date: 2025-04-22 11:01:25
### Simple KNN Algorithm Implementation and Explanation
K-nearest neighbors (KNN) is one of the simplest yet most effective algorithms for both classification and regression problems. In essence, KNN makes predictions from similarity measures between instances.
#### How KNN Works
For any new instance requiring prediction, KNN finds the k most similar training samples in the dataset according to a distance metric such as Euclidean distance. For classification tasks, the majority class among those nearest neighbors determines the predicted label; for regression tasks, the prediction is typically the average of the neighbors' target values[^1].
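The two prediction rules described above can be sketched on a toy dataset. This is a minimal illustration, not a reference implementation: the helper `k_nearest_indices` and all data values are hypothetical and were invented for this example.

```python
import numpy as np
from collections import Counter

def k_nearest_indices(X_train, x, k):
    """Return the indices of the k training rows closest to x (Euclidean distance)."""
    distances = np.sqrt(np.sum((X_train - x) ** 2, axis=1))
    return np.argsort(distances)[:k]

# Toy one-feature dataset (values invented for illustration).
X_toy = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
labels = np.array([0, 0, 0, 1, 1, 1])                   # classification targets
values = np.array([1.5, 2.5, 3.5, 10.5, 11.5, 12.5])    # regression targets

query = np.array([2.2])
idx = k_nearest_indices(X_toy, query, k=3)

# Classification: majority vote among the k neighbors.
predicted_label = Counter(labels[idx].tolist()).most_common(1)[0][0]

# Regression: mean of the neighbors' target values.
predicted_value = float(np.mean(values[idx]))

print(predicted_label, predicted_value)
```

The same neighbor search serves both tasks; only the final aggregation step (vote versus mean) differs.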
#### Implementing Simple KNN in Python
The following is a simple implementation of a KNN classifier in Python:
```python
import numpy as np
from collections import Counter
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris


def euclidean_distance(x1, x2):
    # Straight-line distance between two feature vectors
    return np.sqrt(np.sum((x1 - x2)**2))


class SimpleKNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X_train, y_train):
        # KNN is a lazy learner: fitting only stores the training data
        self.X_train = X_train
        self.y_train = y_train

    def predict(self, X_test):
        predictions = [self._predict(x) for x in X_test]
        return np.array(predictions)

    def _predict(self, x):
        # Distance from x to every training sample
        distances = [euclidean_distance(x, x_train) for x_train in self.X_train]
        # Indices of the k closest training samples
        k_indices = np.argsort(distances)[:self.k]
        k_nearest_labels = [self.y_train[i] for i in k_indices]
        # Majority vote among the k nearest labels
        most_common = Counter(k_nearest_labels).most_common(1)
        return most_common[0][0]


# Load iris dataset
data = load_iris()
X, y = data.data, data.target

# Splitting into training/testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize our custom knn model
clf = SimpleKNN(k=3)
clf.fit(X_train, y_train)

# Predictions
predictions = clf.predict(X_test)
print(f"Predicted labels: {predictions}")
```
This code snippet defines `SimpleKNN`, which provides the basic functionality: storing the training data (`fit` method), predicting classes for unseen samples (`predict` method), and classifying a single sample via the helper method `_predict`, which calls `euclidean_distance` to measure the distance to every training point. The example uses the Iris flower species dataset provided by the Scikit-Learn library for demonstration purposes only[^4].
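The example above prints the predicted labels but does not score them. The following is a minimal sketch of one way to measure accuracy, paired with a vectorized variant of the distance computation that avoids the per-sample Python loop. The function `knn_predict`, the synthetic two-cluster data, and the hold-out split are all assumptions made for illustration; they are not part of the original code.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Vectorized KNN classification; y_train must hold non-negative integer labels."""
    # Pairwise Euclidean distances, shape (n_test, n_train), via broadcasting.
    diffs = X_test[:, None, :] - X_train[None, :, :]
    distances = np.sqrt(np.sum(diffs ** 2, axis=2))
    # Indices of the k nearest training points for each test row.
    nearest = np.argsort(distances, axis=1)[:, :k]
    # Majority vote per row; argmax of bincount picks the smallest label on ties.
    return np.array([np.bincount(y_train[row]).argmax() for row in nearest])

# Synthetic two-cluster data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X0 = rng.normal(loc=0.0, scale=0.5, size=(20, 2))
X1 = rng.normal(loc=5.0, scale=0.5, size=(20, 2))
X_demo = np.vstack([X0, X1])
y_demo = np.array([0] * 20 + [1] * 20)

# Hold out every fourth sample as a test set.
test_mask = np.arange(len(X_demo)) % 4 == 0
preds = knn_predict(X_demo[~test_mask], y_demo[~test_mask], X_demo[test_mask], k=3)

# Accuracy: fraction of test samples whose predicted label matches the true label.
accuracy = float(np.mean(preds == y_demo[test_mask]))
print(f"Accuracy: {accuracy:.2f}")
```

The broadcasting approach builds an array of shape (n_test, n_train, n_features), so it trades memory for speed and may not suit large datasets.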