Implementing a Logistic Regression Model - CSDN Blog

Article link: https://blog.csdn.net/boating06/article/details/148045581

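I. What is logistic regression?

Logistic regression is a classification model, most often used for binary classification (for example, predicting whether an email is "spam" or "not spam"). Despite the word "regression" in its name, it is used for classification.

Given the input features, its goal is to predict the probability that an event occurs, i.e. that the sample belongs to the positive class.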

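II. Why use logistic regression?

Linear regression predicts a continuous value, but a classification problem needs a class label (0 or 1) as its output.

Logistic regression passes the result of a linear combination through a "sigmoid function", which maps it into the interval (0, 1) so that it can be read as a probability.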
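To make the mapping concrete, here is a minimal sketch of the sigmoid function in plain Python. The split into two branches is an added precaution against overflow for large negative inputs, not something the article itself does, and the sample inputs are arbitrary.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0 here, so exp(z) cannot overflow
    return ez / (1.0 + ez)

# Any real number is squeezed into the range 0..1
for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"z = {z:6.1f}  ->  sigmoid(z) = {sigmoid(z):.5f}")
```

Large negative inputs come out close to 0, large positive inputs close to 1, and an input of 0 gives exactly 0.5.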

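III. Mathematical formulation

Suppose the input features form a vector x = (x1, x2, ..., xn), and the model has parameters w = (w1, w2, ..., wn) and a bias term b.

1. Linear combination

First, take a linear combination of the inputs:

$$z = w^{T}x + b = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b$$

2. Sigmoid function

Next, use the sigmoid function to convert z into a probability:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

3. Final probability expression

The predicted probability of the positive class is therefore:

$$P(y=1 \mid x) = \sigma(w^{T}x + b) = \frac{1}{1 + e^{-(w^{T}x + b)}}$$

Correspondingly, the predicted probability of the negative class is:

$$P(y=0 \mid x) = 1 - P(y=1 \mid x) = \frac{e^{-(w^{T}x + b)}}{1 + e^{-(w^{T}x + b)}}$$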
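As a worked example of the three steps in this section (linear combination, sigmoid, class probabilities), the sketch below evaluates them for a single sample. The weights, bias and feature values are made up purely for illustration.

```python
import math

# Hypothetical parameters and input, chosen only for illustration
w = [0.8, -0.5]   # weights w1, w2
b = 0.1           # bias term
x = [2.0, 1.0]    # features x1, x2

# Step 1: linear combination z = w1*x1 + w2*x2 + b
z = sum(wi * xi for wi, xi in zip(w, x)) + b

# Step 2: sigmoid turns z into the positive-class probability
p_positive = 1.0 / (1.0 + math.exp(-z))

# Step 3: the negative-class probability is the complement
p_negative = 1.0 - p_positive

print(f"z = {z:.2f}")
print(f"P(y=1|x) = {p_positive:.4f}")
print(f"P(y=0|x) = {p_negative:.4f}")
```

Here z works out to 1.2, so the positive class gets a probability of roughly 0.77 and the negative class the remaining 0.23.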

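IV. Decision boundary

Once the model outputs a probability, a threshold is usually applied (0.5 by default):

If P(y=1∣x) > 0.5, predict class 1.

Otherwise, predict class 0.

Because the sigmoid exceeds 0.5 exactly when its input is positive, a threshold of 0.5 places the decision boundary on the hyperplane w·x + b = 0, which is linear in the features.

The threshold can also be adjusted to suit the application.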
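The thresholding rule can be sketched in a few lines. The helper name `classify` and the probabilities are hypothetical. Raising the threshold to 0.8 is just one example of an adjustment, such as a spam filter that should only flag an email when it is fairly sure.

```python
def classify(prob: float, threshold: float = 0.5) -> int:
    """Return 1 if the predicted probability exceeds the threshold, else 0."""
    return 1 if prob > threshold else 0

# Hypothetical predicted probabilities P(y=1 | x) for three emails
probs = [0.20, 0.55, 0.90]

default_labels = [classify(p) for p in probs]                 # threshold 0.5
strict_labels = [classify(p, threshold=0.8) for p in probs]   # stricter threshold

print(default_labels)  # [0, 1, 1]
print(strict_labels)   # [0, 0, 1]
```

The middle sample changes class when the threshold moves, which shows how the choice trades false positives against false negatives.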

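V. Training the model

Training aims to find the parameters w and b that make the predicted probabilities fit the training data as closely as possible.

This is done with maximum likelihood estimation (MLE):

For given parameters θ (here, w and b together), the likelihood function is defined as the joint probability of the data under those parameters:

$$L(\theta) = P(y_1, y_2, \ldots, y_m \mid x_1, x_2, \ldots, x_m; \theta)$$

The samples are usually assumed to be independent and identically distributed, so the joint probability is the product of the individual probabilities:

$$L(\theta) = \prod_{i=1}^{m} P(y_i \mid x_i; \theta) = \prod_{i=1}^{m} \hat{y}_i^{\,y_i} (1 - \hat{y}_i)^{1 - y_i}, \qquad \hat{y}_i = P(y_i = 1 \mid x_i; \theta)$$

The loss function is the cross-entropy loss (log loss), which is the negative log-likelihood averaged over the m samples:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right]$$

The loss is then minimized with an optimization method such as gradient descent to find the optimal parameters.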
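For readers who want to see where the gradient-descent update comes from, the following is a sketch of the standard derivation. Here m is the number of training samples and α is the learning rate (the `alpha` argument of `gradient_descent` in the code of section VII). The symbol ŷ_i for the predicted probability of sample i is notation introduced here.

```latex
\begin{aligned}
\hat{y}_i &= \sigma(z_i), \qquad z_i = w^{T}x_i + b \\
\ell(w, b) &= \log \prod_{i=1}^{m} \hat{y}_i^{\,y_i}(1-\hat{y}_i)^{1-y_i}
  = \sum_{i=1}^{m}\left[\,y_i\log\hat{y}_i + (1-y_i)\log(1-\hat{y}_i)\right] \\
J(w, b) &= -\frac{1}{m}\,\ell(w, b) \\
\frac{\partial J}{\partial w} &= \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)\,x_i,
\qquad
\frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i) \\
w &\leftarrow w - \alpha\,\frac{\partial J}{\partial w},
\qquad
b \leftarrow b - \alpha\,\frac{\partial J}{\partial b}
\end{aligned}
```

The gradient takes this simple form because the sigmoid satisfies σ'(z) = σ(z)(1 − σ(z)), which cancels the denominators that come from differentiating the logarithms. In the code of section VII the same quantities appear as `error = h - y` and `gradient = X.T.dot(error) / m`, with the bias folded into the parameter vector through a column of ones.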

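VI. Strengths and limitations of logistic regression

Strengths:

1. Simple to compute and fast to train

2. Outputs probabilities and is easy to interpret

3. Works well on problems that are linearly separable

4. Supports regularization (L1, L2) to reduce overfitting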
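As an illustration of the regularization point, the sketch below adds an L2 penalty to a single gradient-descent step. It is not part of the article's implementation. The function name `l2_gradient_step`, the penalty strength `lam`, and the toy data are assumptions made for this example. The penalty used is (lam / (2m)) times the sum of squared weights, with the intercept left unpenalized, which is one common convention.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def l2_gradient_step(X, y, theta, alpha, lam):
    """One gradient-descent step on cross-entropy plus an L2 penalty.

    X is expected to carry a leading column of ones, so theta[0] is the
    intercept; it is excluded from the penalty.
    """
    m = len(y)
    h = sigmoid(X @ theta)              # predicted probabilities
    grad = X.T @ (h - y) / m            # gradient of the cross-entropy loss
    penalty = (lam / m) * theta         # gradient of (lam / (2m)) * sum(theta**2)
    penalty[0] = 0.0                    # do not shrink the intercept
    return theta - alpha * (grad + penalty)

# Toy data: a column of ones followed by two features (made-up values)
X = np.array([[1.0, 0.5, 1.2],
              [1.0, -1.0, 0.3],
              [1.0, 2.0, -0.7],
              [1.0, -0.4, -1.5]])
y = np.array([1.0, 0.0, 1.0, 0.0])
theta = np.array([0.0, 2.0, -1.0])

plain = l2_gradient_step(X, y, theta, alpha=0.1, lam=0.0)
shrunk = l2_gradient_step(X, y, theta, alpha=0.1, lam=1.0)
print("without penalty:", plain)
print("with L2 penalty:", shrunk)
```

The two results share the same intercept, while the penalized step pulls the two feature weights slightly further toward zero. An L1 penalty would use the sign of each weight instead of the weight itself and is not shown here.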

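Limitations:

1. Can only learn a linear decision boundary (it assumes a linear relationship between the features and the log-odds of the outcome)

2. Performs poorly on nonlinear relationships, which call for feature engineering or a nonlinear model

3. Sensitive to outliers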

VII. Code implementation of the logistic regression model

Implementation:

import numpy as np
import matplotlib.pyplot as plt

def load_data(filename):
    # Load the data: the first two columns are features, the third is the label
    data = np.loadtxt(filename)
    X = data[:, 0:2]  # first two columns: features
    y = data[:, 2]    # third column: labels
    return X, y

def sigmoid(z):
    # Sigmoid function
    return 1.0 / (1.0 + np.exp(-z))

def add_intercept(X):
    # Add the intercept (bias) term so that the first column is all ones
    if X.ndim == 1:
        return np.concatenate(([1], X))
    else:
        intercept = np.ones((X.shape[0], 1))
        return np.concatenate((intercept, X), axis=1)

def gradient_descent(X, y, theta, alpha, num_iters):
    # Optimize the parameter vector theta with gradient descent
    m = len(y)
    J_history = []
    
    for i in range(num_iters):
        h = sigmoid(X.dot(theta))        # predicted probabilities
        error = h - y                   # prediction error
        gradient = X.T.dot(error) / m   # gradient of the loss
        # Rebind instead of using -= so the caller's initial array is not modified
        theta = theta - alpha * gradient
        
        if i % 100 == 0:
            # Cross-entropy loss; probabilities are clipped to avoid log(0)
            cost = (-y * np.log(np.clip(h, 1e-5, 1-1e-5)) - 
                   (1 - y) * np.log(np.clip(1 - h, 1e-5, 1-1e-5))).mean()
            J_history.append(cost)
            print(f'Iteration {i}, Cost: {cost}')
    
    return theta, J_history

def predict(X, theta):
    # Predict the class and the probability from the model parameters
    prob = sigmoid(np.dot(X, theta))
    prediction = (prob >= 0.5).astype(int)
    # For a single sample, return scalars rather than arrays
    if isinstance(prob, np.ndarray) and prob.size == 1:
        return prediction.item(), prob.item()
    return prediction, prob

def plot_decision_boundary(X, y, theta, user_input=None, user_prediction=None):
    # Plot the data points and the decision boundary
    plt.figure(figsize=(10, 6))
    
    # Plot the original data points, with a different color and marker per class
    plt.scatter(X[y==0, 0], X[y==0, 1], c='blue', marker='o', label='Class 0')
    plt.scatter(X[y==1, 0], X[y==1, 1], c='red', marker='x', label='Class 1')
    
    # Plot the user's input point; its color and marker depend on the predicted class
    if user_input is not None and user_prediction is not None:
        marker_color = 'green' if user_prediction == 0 else 'purple'
        # 'X' is the filled variant of 'x', so the black edge color below applies to it
        marker_style = 'o' if user_prediction == 0 else 'X'
        plt.scatter([user_input[0]], [user_input[1]], 
                   c=marker_color, marker=marker_style, s=100, edgecolors='black', 
                   label=f'User Input (Prediction: {user_prediction})')
    
    # Decision boundary: solve theta[0] + theta[1]*x1 + theta[2]*x2 = 0 for x2
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    plot_x = np.array([x_min, x_max])
    plot_y = (-1 / theta[2]) * (theta[1] * plot_x + theta[0])
    
    plt.plot(plot_x, plot_y, 'g-', label='Decision Boundary')
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.legend()
    
    # Show the equation of the decision boundary on the plot
    equation = f'x2 = {-theta[0]/theta[2]:.4f} + {-theta[1]/theta[2]:.4f} * x1'
    plt.annotate(equation, xy=(0.05, 0.95), xycoords='axes fraction')
    
    plt.title('Logistic Regression Classification Result')
    plt.show()

def get_user_input():
    # Read two feature values from the user, separated by a space or a comma
    user_input = input("Enter two feature values (separated by space or comma, e.g. '1.5 3.2' or 'q' to quit): ")
    
    if user_input.strip().lower() == 'q':
        return None
    
    try:
        # Treat commas as whitespace so both '1.5 3.2' and '1.5,3.2' parse
        values = user_input.replace(',', ' ').split()
        x1 = float(values[0])
        x2 = float(values[1])
        return np.array([x1, x2])
    except (ValueError, IndexError):
        # Non-numeric text or fewer than two values: ask again
        print("Invalid input, please enter two numbers or 'q' to quit")
        return get_user_input()

def main():
    # Train the model
    print("Loading data and training logistic regression model...")
    X, y = load_data(r'C:\Users\26687\Desktop\test\deep learning\e5\testSet (1).txt')
    X_with_intercept = add_intercept(X)
    
    initial_theta = np.zeros(X_with_intercept.shape[1])
    alpha = 0.01
    num_iters = 1000
    
    theta, _ = gradient_descent(X_with_intercept, y, initial_theta, alpha, num_iters)
    
    # Compute the accuracy on the training data
    predictions, _ = predict(X_with_intercept, theta)
    accuracy = (predictions == y).mean() * 100
    print(f"Model accuracy: {accuracy:.2f}%")
    print(f"Decision boundary equation: x2 = {-theta[0]/theta[2]:.4f} + {-theta[1]/theta[2]:.4f} * x1")
    
    # Predict for user-supplied input
    print("\n===== Prediction Mode =====")
    print("Now you can enter feature values for prediction")
    
    while True:
        user_features = get_user_input()
        
        if user_features is None:
            print("Exiting program.")
            break
            
        # Add the intercept term and predict
        user_features_with_intercept = add_intercept(user_features)
        user_prediction, probability = predict(user_features_with_intercept, theta)
        
        # Print the prediction result
        print(f"\nInput features: X1={user_features[0]}, X2={user_features[1]}")
        print(f"Prediction result: Class {user_prediction}")
        print(f"Prediction probability: {probability:.4f}")
        
        # Visualize the result
        plot_decision_boundary(X, y, theta, user_features, user_prediction)

if __name__ == '__main__':
    main()
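The script reads its training data from a file on the author's machine, which readers will not have. As a side note, the sketch below generates a stand-in dataset in the layout that `load_data` expects (two whitespace-separated feature columns followed by a 0/1 label). The cluster centers, sample count and file name are arbitrary choices made for this example and say nothing about the author's actual data.

```python
import os
import tempfile
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the file is reproducible
n = 50                           # samples per class (arbitrary)

# Two Gaussian clusters, one per class (hypothetical centers)
class0 = rng.normal(loc=[-1.0, -1.0], scale=1.0, size=(n, 2))
class1 = rng.normal(loc=[1.5, 1.5], scale=1.0, size=(n, 2))

features = np.vstack([class0, class1])
labels = np.concatenate([np.zeros(n), np.ones(n)])
data = np.column_stack([features, labels])   # columns: x1, x2, label

# Write whitespace-separated text that np.loadtxt can read back
path = os.path.join(tempfile.gettempdir(), "synthetic_testSet.txt")
np.savetxt(path, data, fmt="%.6f")

loaded = np.loadtxt(path)
print(path)
print(loaded.shape)
```

Passing this path to `load_data` in place of the hard-coded one should let the rest of the script run unchanged, although the accuracy and decision boundary will of course differ from the author's.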

Training results: