Time-Series Weather Forecasting Resource - CSDN Download

10 files in total: 5 xml, 2 csv, 1 gitignore


Uploaded 2023-08-26 20:57:35 · 19KB RAR

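Time-series weather forecasting uses historical meteorological data to predict future weather conditions. The field combines time-series analysis, machine learning, and meteorology, with the aim of producing accurate short-term and long-term forecasts. This project explores how to apply these techniques.

Time-series analysis comes first. A time series is a sequence of observations recorded at successive points in time, such as the daily highest temperature, lowest temperature, or precipitation. Such data usually show trend, seasonality, and cyclical behavior. Models such as ARIMA (autoregressive integrated moving average) and state-space models are widely used to capture these properties and produce forecasts.

The two CSV files in the project, "最新数字化处理历史真实数据.csv" and "数字化处理历史真实数据.csv", hold the historical weather records. The column comment in the script lists date, weekday, highest temperature, lowest temperature, weather, wind direction, and wind level. The file names describe the data as digitized, and before modeling such data may also be cleaned, standardized, and normalized so that it is better suited as model input. During analysis we may also need to handle missing values, detect outliers, and decompose the series into trend, seasonal, and residual components.

The tag "时间序列天气预测" (time-series weather forecasting) indicates that a machine learning algorithm is used to build the forecasting model. Here, "8-26xgbboost.py" is a Python script that uses XGBoost (eXtreme Gradient Boosting). XGBoost is a gradient boosting framework suited to regression and classification problems, and it can be applied to time-series forecasting by treating past observations as input features and a future value as the target variable. Training an XGBoost model typically involves the following steps:

1. Feature selection: choose the features most relevant to weather changes, such as the temperature and other readings from the past few days.
2. Model training: fit the model on the training set and tune hyperparameters to improve performance.
3. Model validation: evaluate predictions on a validation set to guard against overfitting.
4. Model testing: finally, check generalization on an independent test set.

The ".idea" directory belongs to the development environment (a JetBrains IDE such as IntelliJ IDEA or PyCharm). It stores project settings and configuration, so it does little to explain the model itself, but it suggests that the developer wrote and debugged the code in such an IDE.

In summary, this project uses time-series analysis and machine learning, XGBoost in particular, to forecast weather. A model built from historical meteorological data can predict future conditions, which is useful in agriculture, transportation, energy, and other fields. For this kind of task, data preprocessing, feature engineering, and model selection and tuning are the key steps.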
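The steps above (past days as features, then separate training, validation, and test sets) can be sketched as follows. This is a minimal illustration, assuming the data is a NumPy array with one row per day in date order. The function names `make_windows` and `chronological_split` and the 70/15/15 proportions are hypothetical choices for this sketch and do not come from the archive.

```python
import numpy as np


def make_windows(values, window, target_col):
    """Turn a (n_days, n_features) array into supervised samples.

    Each sample flattens `window` consecutive days into one feature row;
    the target is column `target_col` of the day right after the window.
    """
    values = np.asarray(values, dtype=np.float32)
    X, y = [], []
    for start in range(len(values) - window):
        X.append(values[start:start + window].ravel())
        y.append(values[start + window, target_col])
    return np.array(X), np.array(y)


def chronological_split(X, y, train_frac=0.7, val_frac=0.15):
    """Split samples in time order: oldest for training, newest for testing."""
    n = len(X)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return (
        (X[:train_end], y[:train_end]),
        (X[train_end:val_end], y[train_end:val_end]),
        (X[val_end:], y[val_end:]),
    )
```

With a window of 3 days and 5 features per day, each sample has 15 input values. Splitting by position rather than at random keeps every validation and test sample later in time than the training samples.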


时间序列天气预测-xgboost网格调参实战.rar (10 files)

8-26xgbboost.py 4KB
数字化处理历史真实数据.csv 27KB
最新数字化处理历史真实数据.csv 28KB
.idea
  workspace.xml 3KB
  misc.xml 192B
  inspectionProfiles
    Project_Default.xml 9KB
    profiles_settings.xml 174B
  modules.xml 359B
  .gitignore 50B
  时间序列天气预测-xgboost网格调参实战.iml 291B

import warnings

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

warnings.filterwarnings(module='sklearn*', action='ignore', category=DeprecationWarning)

data = pd.read_csv("数字化处理历史真实数据.csv")
# Note from the original script: columns 1, 3 and 7 are the prediction columns
data = data.fillna(0)
# Columns: ['日期', '星期', '最高气温', '最低气温', '天气', '风向', '级数']
# (date, weekday, highest temperature, lowest temperature, weather, wind direction, wind level)
data_x = data[['最高气温', '最低气温', '天气', '风向', '级数']].values
data_x = np.array(data_x, dtype=np.float32)
print(data_x)

# Number of consecutive days in each input window
int_sequence_len = 3
# Number of features per day (not used below)
int_a = 5

train_x = []
train_y = []
for i in range(0, len(data_x) - int_sequence_len, 1):
    train_x.append(data_x[i:i + int_sequence_len])
    # Target: lowest temperature (column index 1 of data_x) on the day after the window
    train_y.append(data_x[i + int_sequence_len][1])
print(len(train_x), len(train_y))

# Split into training and test sets
x_train, x_test, y_train, y_test = train_test_split(
    np.array(train_x), np.array(train_y), test_size=0.2, random_state=1)
# Flatten each (sequence length, features) window into a single feature row
x_train = x_train.reshape(len(x_train), -1)
x_test = x_test.reshape(len(x_test), -1)
print(x_train.shape, y_train.shape)

# Still to investigate: number of boosting rounds, max_depth
# Define the XGBoost model
xgb1 = xgb.XGBRegressor()
# Hand-set values left commented out in the original script:
# learning_rate=0.12,
# n_estimators=150,
# max_depth=5,
# min_child_weight=1,
# gamma=0,
# subsample=0.8,
# colsample_bytree=0.8,
# objective='reg:squarederror',
# reg_alpha=0,
# reg_lambda=1,
# nthread=4,
# scale_pos_weight=1,
# seed=27

# Grid of hyperparameters to search (7 * 8 * 5 * 3 = 840 combinations, 5 folds each)
param_test1 = {
    'min_child_weight': [1, 2, 3, 4, 5, 6, 8],
    'max_depth': [1, 2, 3, 4, 5, 6, 7, 8],
    'learning_rate': [0.09, 0.1, 0.12, 0.13, 0.14],
    'n_estimators': [100, 120, 150],
}
sklearn_model_raw = GridSearchCV(estimator=xgb1, param_grid=param_test1, n_jobs=4, cv=5)
sklearn_model_raw.fit(x_train, y_train)
pred_test_raw = sklearn_model_raw.predict(x_test)

# Mean squared error
test_mse = mean_squared_error(y_test, pred_test_raw)
print('mse Score: %.6f mse' % (test_mse))
# Mean absolute error
test_mae = mean_absolute_error(y_test, pred_test_raw)
print('mae Score: %.6f mae' % (test_mae))
# R squared
test_R2 = r2_score(y_test, pred_test_raw)
print('R2 Score: %.6f R2' % (test_R2))

# Plot true values against predictions on the test set
len_ = list(range(len(y_test)))
plt.xlabel('Test sample index', fontsize=8)
plt.ylabel('Lowest temperature', fontsize=8)
plt.plot(len_, y_test, label='truth', color="blue")
plt.plot(len_, pred_test_raw, label='predict', color="yellow")
plt.legend()
plt.title("Lowest-temperature forecast")
plt.show()
plt.clf()

# Predict from one hand-written window: 3 days x 5 features
x_new = [22, 18, 4, 3, 1, 27, 17, 0, 5, 2, 28, 19, 1, 7, 2]
x_new = np.array(x_new).reshape(1, 15)
pred_new = sklearn_model_raw.predict(x_new)
print("Prediction:", pred_new)
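The script shuffles the windows with `train_test_split` before cross-validating, so a model may be scored on days that are older than, or overlap with, days it was trained on. One possible alternative, sketched here under the assumption that the samples are kept in date order, is to pass scikit-learn's `TimeSeriesSplit` as the `cv` argument of `GridSearchCV`. The helper name `time_ordered_grid_search`, the default of 5 splits, and the choice of negative MSE as the score are illustrative and not part of the original script.

```python
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit


def time_ordered_grid_search(estimator, param_grid, X, y, n_splits=5):
    """Grid search where every validation fold comes after its training fold.

    Assumes the rows of X and y are already sorted by date (oldest first).
    """
    cv = TimeSeriesSplit(n_splits=n_splits)
    search = GridSearchCV(
        estimator=estimator,
        param_grid=param_grid,
        cv=cv,
        scoring="neg_mean_squared_error",
    )
    search.fit(X, y)
    return search
```

With the script's names, this could be called with `xgb.XGBRegressor()` and `param_test1`, passing the flattened windows and targets before any shuffling. The fitted object exposes the chosen settings as `best_params_`, just like the original `sklearn_model_raw`.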
