    data.fillna(method='ffill', inplace=True)
    date_history = pd.DataFrame(data.iloc[:, 0])
    data_history = pd.DataFrame(data.iloc[:, 1])
    date_history = np.array(date_history)
    data_history = [x for item in np.array(data_history).tolist() for x in item]

Code optimization

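The code above can be optimized as follows:

```python
import numpy as np

# Forward-fill missing values (equivalent to fillna(method='ffill'),
# whose `method` argument is deprecated in recent pandas versions)
data.ffill(inplace=True)
# Take the first column of data as date_history, shaped as a column vector
date_history = data.iloc[:, 0].to_numpy().reshape(-1, 1)
# Take the second column of data as data_history, shaped as a column vector
data_history = data.iloc[:, 1].to_numpy().reshape(-1, 1)
# Join date_history and data_history into a single two-column array
history = np.concatenate([date_history, data_history], axis=1)
```

This code first forward-fills the missing values, then takes the first column of `data` as `date_history` and the second column as `data_history`, and joins them into one array, `history`. Note that `data_history` is now an array of shape (n, 1) rather than the flat list the original code produced; call `.ravel().tolist()` on it if a flat list is needed. Dropping the intermediate `pd.DataFrame` wrappers and the nested list comprehension makes the code shorter and easier to read.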
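To make the comparison with the question's code concrete, here is a minimal sketch showing that the nested list comprehension and a direct `.tolist()` call on the column give the same flat list. The sample DataFrame and its column names (`date`, `load`) are assumptions made up for illustration; the question does not show its data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a timestamp column and a load column with one NaN
data = pd.DataFrame({
    'date': ['2023/01/01 00:00', '2023/01/01 00:15', '2023/01/01 00:30'],
    'load': [1.0, np.nan, 3.0],
})

# Forward-fill the missing load value
data = data.ffill()

# The question's approach: wrap in a DataFrame, convert to a nested list, flatten
nested = [x for item in np.array(pd.DataFrame(data.iloc[:, 1])).tolist() for x in item]

# Direct equivalents without the intermediate objects
data_history = data.iloc[:, 1].tolist()        # flat Python list
date_history = data.iloc[:, [0]].to_numpy()    # 2-D array of shape (n, 1)

print(nested == data_history)
print(date_history.shape)
```

Selecting with `[0]` (a list) instead of `0` keeps the result two-dimensional, which matches what `np.array(pd.DataFrame(...))` produces in the question.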

    data.fillna(method='ffill', inplace=True)
    date_history = pd.DataFrame(data.iloc[:, 0])
    data_history = pd.DataFrame(data.iloc[:, 1])
    date_history = np.array(date_history)
    data_history = [x for item in np.array(data_history).tolist() for x in item]
    # Missing-value handling
    history_time_list = []
    for date in date_history:
        date_obj = datetime.datetime.strptime(date[0], '%Y/%m/%d %H:%M')  # convert the string to a datetime object
        history_time_list.append(date_obj)
    start_time = history_time_list[0]  # start time
    end_time = history_time_list[-1]  # end time
    delta = datetime.timedelta(minutes=15)  # 15-minute interval
    time_new_list = []
    current_time = start_time
    while current_time <= end_time:
        time_new_list.append(current_time)
        current_time += delta
    # Record the missing positions
    code_list = []
    for i in range(len(time_new_list)):
        code_list = code_list
        history_time_list = history_time_list
        while (time_new_list[i] - history_time_list[i]) != datetime.timedelta(minutes=0):
            history_time_list.insert(i, time_new_list[i])
            code_list.append(i)
    for i in code_list:
        data_history.insert(i, data_history[i - 1])
    # Output the completed data
    data = pd.DataFrame({'date': time_new_list, 'load': data_history})
    return data

Code optimization

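Here is an optimized version of the code:

```python
import datetime

import numpy as np
import pandas as pd


def data_processing(data):
    # Forward-fill missing values in the existing rows
    data.ffill(inplace=True)
    # Parse the first column into datetime objects; take the second column as a 1-D array
    date_history = [datetime.datetime.strptime(d, '%Y/%m/%d %H:%M') for d in data.iloc[:, 0]]
    data_history = data.iloc[:, 1].to_numpy()

    # Build the complete 15-minute grid from the first to the last timestamp
    start_time, end_time = date_history[0], date_history[-1]
    delta = datetime.timedelta(minutes=15)
    n_steps = (end_time - start_time) // delta + 1
    time_new_list = [start_time + k * delta for k in range(n_steps)]

    # Positions in the full grid that are absent from the original timestamps
    known = set(date_history)
    code_list = [i for i, t in enumerate(time_new_list) if t not in known]

    # Insert the preceding value at each missing position, in ascending order
    for i in code_list:
        data_history = np.insert(data_history, i, data_history[i - 1])

    return pd.DataFrame({'date': time_new_list, 'load': data_history})
```

The main ideas behind the optimization are:

1. Take the second column directly as a one-dimensional array, which avoids the nested flattening loop.
2. Convert the date strings to datetime objects.
3. Generate the full time series with a list comprehension.
4. Generate the list of missing positions with a list comprehension: every slot of the full 15-minute grid that does not appear among the original timestamps.
5. Use NumPy's `insert()` to fill each missing position with the preceding value.
6. Finally, convert the completed data to a DataFrame and return it.

This makes the code more concise and removes the unnecessary loops. Like the original, it assumes the timestamps are sorted and fall exactly on the 15-minute grid.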
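Inserting one value per gap with `np.insert` copies the whole array each time. As an alternative sketch (a swapped-in technique, not the method used in the answer), the same forward-filled result can be obtained by reindexing onto a complete 15-minute grid built with `pd.date_range`. The sample data and the names `raw`, `series`, `full_grid`, and `result` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: the 00:30 slot is missing
raw = pd.DataFrame({
    'date': ['2023/01/01 00:00', '2023/01/01 00:15', '2023/01/01 00:45'],
    'load': [1.0, 2.0, 4.0],
})

# Index the load values by their parsed timestamps
timestamps = pd.to_datetime(raw['date'].tolist(), format='%Y/%m/%d %H:%M')
series = pd.Series(raw['load'].to_numpy(), index=timestamps)

# Complete 15-minute grid from the first to the last timestamp
full_grid = pd.date_range(series.index[0], series.index[-1], freq='15min')

# Reindex onto the grid; each missing slot takes the previous known value
filled = series.reindex(full_grid, method='ffill')

result = pd.DataFrame({'date': filled.index, 'load': filled.to_numpy()})
print(result)
```

Here `reindex(..., method='ffill')` requires a sorted index, which is the same assumption the loop-based versions make.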

    def data_processing(data):
        # Fill in missing dates
        data.fillna(method='ffill', inplace=True)
        date_history = pd.DataFrame(data.iloc[:, 0])
        data_history = pd.DataFrame(data.iloc[:, 1])
        date_history = np.array(date_history)
        data_history = [x for item in np.array(data_history).tolist() for x in item]
        # Missing-value handling
        history_time_list = []
        for date in date_history:
            date_obj = datetime.datetime.strptime(date[0], '%Y/%m/%d %H:%M')  # convert the string to a datetime object
            history_time_list.append(date_obj)
        start_time = history_time_list[0]  # start time
        end_time = history_time_list[-1]  # end time
        delta = datetime.timedelta(minutes=15)  # 15-minute interval
        time_new_list = []
        current_time = start_time
        while current_time <= end_time:
            time_new_list.append(current_time)
            current_time += delta
        # Record the missing positions
        code_list = []
        for i in range(len(time_new_list)):
            code_list = code_list
            history_time_list = history_time_list
            while (time_new_list[i] - history_time_list[i]) != datetime.timedelta(minutes=0):
                history_time_list.insert(i, time_new_list[i])
                code_list.append(i)
        for i in code_list:
            data_history.insert(i, data_history[i - 1])
        # Output the completed data
        data = pd.DataFrame({'date': time_new_list, 'load': data_history})
        return data

Code optimization

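This code can be optimized in the following ways:

1. The conversion of dates to datetime objects can be done when the data is read, so it is not repeated during processing.
2. pandas' `resample` function makes time-series resampling straightforward. Set the date column as the index, resample at the required interval, and fill the missing slots with `ffill`.
3. The resampled result already contains both the original rows and the filled-in rows, so no loop and no merge with the original data is needed; `reset_index` restores the date as an ordinary column.

Here is the optimized code:

```python
import pandas as pd


def data_processing(data):
    # Work on the first two columns and name them, as the original code selects them by position
    data = data.iloc[:, :2].copy()
    data.columns = ['date', 'load']
    # Forward-fill missing values in the existing rows
    data = data.ffill()
    # Convert the date strings to datetime objects
    data['date'] = pd.to_datetime(data['date'], format='%Y/%m/%d %H:%M')
    # Resample onto a 15-minute grid; empty slots take the previous value
    data_processed = data.set_index('date').resample('15min').ffill()
    # Return 'date' as a regular column again
    return data_processed.reset_index()
```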
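The first point, parsing dates at read time, can be sketched as follows. This is an illustration under assumptions: the question does not say how the data is loaded, so the CSV text, the column names `date` and `load`, and the use of `pd.read_csv` are all hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real input file; 00:30 is missing
csv_text = """date,load
2023/01/01 00:00,1.0
2023/01/01 00:15,2.0
2023/01/01 00:45,4.0
"""

# Parse the date column while reading and use it as the index
data = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'], index_col='date')

# Resample onto a 15-minute grid and forward-fill the gaps
filled = data.resample('15min').ffill().reset_index()
print(filled)
```

With the dates already parsed on load, the processing function no longer needs its own string-to-datetime step.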
