Recognizing MNIST with a hand-written KNN algorithm

Article link: https://blog.csdn.net/qq_38681174/article/details/108492741

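This article implements the KNN algorithm by hand to recognize MNIST handwritten digits. For each test sample it computes the Manhattan distance to every training sample, takes the k nearest training samples, and predicts the label that occurs most often among them.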
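
To make the distance in the summary concrete, here is a minimal NumPy sketch of the Manhattan (L1) distance, with the Euclidean (L2) distance alongside for comparison. The helper names `manhattan_distance` and `euclidean_distance` and the 4-pixel toy images are assumptions made for illustration; they are not part of the original script.

```python
import numpy as np

def manhattan_distance(a, b):
    """L1 distance: sum of absolute coordinate differences along the last axis."""
    diff = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return np.sum(np.abs(diff), axis=-1)

def euclidean_distance(a, b):
    """L2 distance: square root of the sum of squared differences."""
    diff = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return np.sqrt(np.sum(diff ** 2, axis=-1))

# Toy "training set": three images with 4 pixels each (hypothetical data)
train = np.array([[0.0, 0.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0, 0.0],
                  [1.0, 1.0, 1.0, 1.0]])
# One query image; broadcasting compares it against every training row
query = np.array([1.0, 1.0, 0.0, 0.5])

l1 = manhattan_distance(train, query)  # one distance per training row
l2 = euclidean_distance(train, query)
print('L1 distances:', l1)
print('L2 distances:', l2)
```

Both metrics rank the second training row as the closest here, but they can disagree in general, so the choice of metric is part of the classifier's design.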


Python TensorFlow study notes (2): implementing KNN by hand for handwritten digit recognition

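# This script targets TensorFlow 1.x (the tensorflow.examples.tutorials module was removed in TensorFlow 2.x)
import tensorflow as tf
from tensorflow.examples.tutorials.mnist.input_data import read_data_sets
from pandas import Series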

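# Load MNIST from the current directory (downloaded if missing); labels are integer class ids
digits = read_data_sets('./')
# Use 5000 training samples as the reference set and 200 test samples for evaluation
X_train, y_train = digits.train.next_batch(5000)
X_test, y_test = digits.test.next_batch(200)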

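# Placeholder for one flattened 28x28 test image (784 pixel values)
X = tf.placeholder(dtype=tf.float64, shape=784)
# k is fixed to 8 in the loop below, so no placeholder is needed for it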

# Define the Manhattan distance
# X_train has shape (5000, 784) and X has shape (784,); broadcasting yields 5000 distances
l1 = tf.reduce_sum(tf.abs(X_train - X), axis=1)

# Two common ways to measure distance:
# Euclidean distance (L2 distance): (∑(x - y)^2)^0.5
# Manhattan distance (L1 distance): ∑|x - y|, here ∑|X_train - X|

min_distance_index = tf.argsort(l1)

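with tf.Session() as sess:
    # Classify each of the 200 test samples
    correct = 0
    for x, y in zip(X_test, y_test):
        # Indices of the training samples, sorted by increasing distance to x
        dis_index = sess.run(min_distance_index, feed_dict={X: x})
        # Keep the k = 8 nearest neighbours
        k_index = dis_index[:8]
        # Find the label that occurs most often among those k neighbours
        y_k = y_train[k_index]
        s = Series(y_k)
        y_ = s.value_counts().idxmax()
        print('Neighbour labels:', y_k)
        print('True label:', y)
        print('Predicted label:', y_)
        if y_ == y:
            correct += 1
    # Accuracy = correct predictions / number of test samples
    print('Accuracy:', correct / len(y_test))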
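
For readers without a TensorFlow 1.x environment, the same procedure (L1 distance, sort, take the k nearest, majority vote) can be sketched in plain NumPy. This is an illustrative sketch, not the author's code: `knn_predict` and `knn_accuracy` are hypothetical helper names, and the data is a synthetic two-class stand-in for MNIST so that the block runs without any download.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=8):
    """Predict the label of one sample x by majority vote among its
    k nearest training samples under the Manhattan (L1) distance."""
    distances = np.sum(np.abs(X_train - x), axis=1)  # one distance per training row
    nearest = np.argsort(distances)[:k]              # indices of the k smallest distances
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                 # ties resolve to the smallest label

def knn_accuracy(X_train, y_train, X_test, y_test, k=8):
    """Fraction of test samples whose predicted label matches the true label."""
    predictions = np.array([knn_predict(X_train, y_train, x, k) for x in X_test])
    return np.mean(predictions == y_test)

# Synthetic stand-in for MNIST (assumption): two well-separated clusters in 784 dimensions
rng = np.random.default_rng(0)
X_train_demo = np.vstack([rng.normal(0.0, 0.1, (50, 784)),
                          rng.normal(1.0, 0.1, (50, 784))])
y_train_demo = np.array([0] * 50 + [1] * 50)
X_test_demo = np.vstack([rng.normal(0.0, 0.1, (10, 784)),
                         rng.normal(1.0, 0.1, (10, 784))])
y_test_demo = np.array([0] * 10 + [1] * 10)

demo_accuracy = knn_accuracy(X_train_demo, y_train_demo, X_test_demo, y_test_demo, k=8)
print('Accuracy on synthetic data:', demo_accuracy)
```

With real MNIST arrays of shape (n, 784) and integer labels, the same two functions should apply unchanged. Note that the tie-breaking rule here (smallest label wins) may differ from what `Series.value_counts().idxmax()` picks when two labels are equally frequent.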