Predicting Car Fuel Efficiency (MPG) with a Quasi-Newton Method
Date: 2025-04-23 15:28:34
### Predicting car fuel efficiency (MPG) with a quasi-Newton method
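To build a machine learning model that predicts car fuel efficiency (MPG) with a quasi-Newton method, the BFGS algorithm can serve as the optimizer. BFGS needs only gradients: it builds an approximation of the Hessian from successive gradient differences, so it exploits curvature information without computing second derivatives. On smooth nonlinear objectives this typically gives much faster convergence than plain gradient descent[^1].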
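As a quick illustration of a quasi-Newton solver in isolation, the sketch below runs SciPy's BFGS implementation on the Rosenbrock test function. The Rosenbrock function is not part of the MPG task; it is assumed here only because its minimum at (1, 1) is known, which makes the result easy to check.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

# Classic starting point for the 2-D Rosenbrock function (an illustrative choice)
x0 = np.array([-1.2, 1.0])

# BFGS uses the analytic gradient (rosen_der) and builds its own
# curvature estimate internally; no Hessian is supplied.
res = minimize(rosen, x0, jac=rosen_der, method="BFGS")

print("solution:", res.x)
print("iterations:", res.nit)
```

The solver should end up close to (1, 1) after a few dozen iterations, where gradient descent with a fixed step usually needs far more on this function.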
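#### Data preprocessing
The dataset typically contains several feature variables plus the target label, miles per gallon (MPG). Before modeling, complete the following steps:
- Clean missing values and outliers.
- One-hot encode the categorical fields.
- Standardize the numeric attributes to a common scale.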
```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline

# Load the dataset
data = pd.read_csv('auto-mpg.csv')

# Handle missing values (assumes missing horsepower entries are marked '?'):
# coerce non-numeric markers to NaN, then drop incomplete rows
data['horsepower'] = pd.to_numeric(data['horsepower'], errors='coerce')
data = data.dropna()

X = data.drop(columns=['mpg'])
y = data['mpg'].to_numpy()

numeric_features = ['cylinders', 'displacement', 'horsepower', 'weight', 'acceleration']
categorical_features = ['model year', 'origin']

# Columns not listed below (e.g. a car-name column, if present) are dropped by default.
# sparse_threshold=0 forces a dense array that can be used directly in NumPy products.
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_features),
        ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_features)],
    sparse_threshold=0)

pipeline = Pipeline(steps=[('preprocessor', preprocessor)])
X_processed = pipeline.fit_transform(X)
```
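To see what these preprocessing steps produce without needing the CSV file, here is a minimal sketch on a tiny made-up frame. The four rows and the reduced column set are illustrative assumptions, not values taken from the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy rows (hypothetical values) with one '?' marker in horsepower
toy = pd.DataFrame({
    "horsepower": ["130", "?", "150", "95"],
    "weight": [3504, 3693, 3436, 2372],
    "origin": [1, 1, 1, 3],
    "mpg": [18.0, 15.0, 18.0, 24.0],
})

# '?' becomes NaN, and the incomplete row is dropped
toy["horsepower"] = pd.to_numeric(toy["horsepower"], errors="coerce")
toy = toy.dropna()

toy_pre = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), ["horsepower", "weight"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["origin"]),
    ],
    sparse_threshold=0,  # always return a dense NumPy array
)
toy_X = toy_pre.fit_transform(toy.drop(columns=["mpg"]))

# 3 remaining rows; 2 scaled numeric columns + 2 one-hot columns for origin
print(toy_X.shape)
```

Each row's one-hot columns sum to 1, so a linear model on this matrix can represent an intercept even though no explicit bias column is added.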
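#### Building and training the model
The model here is a plain linear regression whose weight vector w is fitted by minimizing the mean squared error with the `minimize` function from `scipy.optimize`. `minimize` returns a result object, and the optimal parameter vector w* is read from its `x` attribute: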
```python
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics import mean_squared_error

def objective_function(w, X_train, y_train):
    # Linear model: predictions are X @ w; the loss is the mean squared error
    predictions = X_train @ w
    return mean_squared_error(y_train, predictions)

# Start from all-zero weights, one per processed feature column
initial_guess = np.zeros(X_processed.shape[1])

# No gradient is passed, so SciPy approximates it with finite differences
result = minimize(objective_function,
                  initial_guess,
                  args=(X_processed, y),
                  method="L-BFGS-B")

optimal_weights = result.x
print(f'Converged: {result.success} ({result.message})')
print(f'Optimized Weights: {optimal_weights}')
```
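The snippet above uses L-BFGS-B, the limited-memory variant of BFGS with optional bound constraints, to find the weight vector that minimizes the mean squared error loss[^2].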
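The call above passes no gradient, so SciPy estimates it numerically. For a linear model the MSE gradient has the closed form (2/n)·Xᵀ(Xw − y), and passing it through the `jac` argument avoids the extra function evaluations. The following is a minimal sketch on synthetic data; `mse_loss`, `mse_grad`, `X_demo`, and `y_demo` are names invented for this illustration, and the least-squares solution serves only as a sanity check.

```python
import numpy as np
from scipy.optimize import minimize

def mse_loss(w, X, y):
    # Mean squared error of the linear model X @ w
    residual = X @ w - y
    return np.mean(residual ** 2)

def mse_grad(w, X, y):
    # Analytic gradient of the MSE: (2 / n) * X^T (X w - y)
    residual = X @ w - y
    return 2.0 * (X.T @ residual) / len(y)

# Synthetic regression problem (illustrative, not the MPG data)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y_demo = X_demo @ true_w + 0.1 * rng.normal(size=200)

res = minimize(mse_loss, np.zeros(3), args=(X_demo, y_demo),
               jac=mse_grad, method="L-BFGS-B")

# Closed-form least-squares solution for comparison
w_lstsq, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)
print("max abs difference:", np.max(np.abs(res.x - w_lstsq)))
```

Because the MSE of a linear model is convex, the quasi-Newton result and the least-squares solution should agree up to the solver's tolerance.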
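#### Performance evaluation
The final step is to evaluate the model on held-out samples, computing statistics such as the R² score to gauge how well it generalizes.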
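```python
import numpy as np
from sklearn.metrics import r2_score

# In-sample R²: optimistic, because these rows were used to fit the weights
predictions = X_processed @ optimal_weights
r2 = r2_score(y, predictions)
print(f'R^2 Score on Training Data: {r2:.4f}')

# Hold out 20% and refit on the training part only for a fairer estimate
X_train, X_val, y_train, y_val = train_test_split(X_processed, y, test_size=0.2, random_state=42)
val_result = minimize(objective_function, np.zeros(X_train.shape[1]),
                      args=(X_train, y_train), method="L-BFGS-B")
val_predictions = X_val @ val_result.x
validation_r2 = r2_score(y_val, val_predictions)
print(f'R^2 Score on Validation Data: {validation_r2:.4f}')
```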
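For reference, R² compares the model's squared error with that of always predicting the mean: R² = 1 − SS_res / SS_tot. The sketch below computes it by hand and checks it against `r2_score`; `r2_manual` and the sample arrays are hypothetical names and values used only for illustration.

```python
import numpy as np
from sklearn.metrics import r2_score

def r2_manual(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Made-up targets and predictions
y_true_demo = np.array([18.0, 15.0, 24.0, 30.0])
y_pred_demo = np.array([17.0, 16.5, 25.0, 28.0])

print(f"manual:  {r2_manual(y_true_demo, y_pred_demo):.4f}")
print(f"sklearn: {r2_score(y_true_demo, y_pred_demo):.4f}")
```

A score of 1 means perfect predictions, 0 means no better than predicting the mean, and negative values mean worse than that baseline.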