    import pandas as pd

    df = pd.ExcelFile(r'F:\Python_file\test_sheet\output3.xlsx')
    df.sheet_names
    df_list = []
    for sheet in df.sheet_names:
        print(sheet)
        read_sheet = pd.read_excel(df, sheet, header=2)
        rows = range(0, read_sheet.shape[0])
        for row in rows:
            try:
                col_test = read_sheet.iloc[row, 4]
                # print(col_test)
                # exit()
                if ("25" in col_test) and ('4.2v' in col_test or '4.2V' in col_test):
                    # if '4.15V' in col_test:
                    #     continue
                    # if ('60' in col_test or '45' in col_test):
                    #     continue
                    df_list.append(read_sheet.iloc[[row], :5])
                else:
                    continue
            except Exception as err:
                print('col nan')
    df_list = pd.DataFrame(df_list)
    df_concat = pd.concat(df_list, axis=0)
    print(df_concat)

Posted: 2024-02-15 15:26:51 Views: 140

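This code processes an Excel workbook. It opens the file with the pandas `ExcelFile` class and reads the worksheet names from its `sheet_names` attribute. It then loops over the worksheets, loads each one with `read_excel` (`header=2` makes the third row the column header), and uses `iloc` to read column index 4 of every row. The `if` statement keeps rows whose cell contains both "25" and "4.2v" or "4.2V", and appends the first five columns of each matching row to `df_list` as a one-row DataFrame. Finally, `concat` merges the collected rows into a single DataFrame, which is printed. Note that `pd.concat` accepts the list of DataFrames directly, so the `df_list = pd.DataFrame(df_list)` line before it is unnecessary and should be removed.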
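The row-by-row loop described above can also be written as a vectorized filter. The following is a minimal sketch, not the asker's code: the helper name `filter_rows`, the column names, and the in-memory `sheets` dictionary are assumptions made for illustration so the example runs without an Excel file.

```python
import pandas as pd


def filter_rows(frame: pd.DataFrame, col_idx: int, n_cols: int) -> pd.DataFrame:
    """Keep rows whose cell at col_idx contains both "25" and "4.2v" (any case)."""
    # Cast to str so NaN or numeric cells cannot raise inside the string checks.
    text = frame.iloc[:, col_idx].astype(str)
    mask = (
        text.str.contains("25", regex=False)
        & text.str.contains("4.2v", case=False, regex=False)
    )
    # Boolean NumPy mask selects the matching rows; then keep the first n_cols columns.
    return frame.loc[mask.to_numpy()].iloc[:, :n_cols]


# Hypothetical stand-in for the worksheets of the workbook.
sheets = {
    "Sheet1": pd.DataFrame(
        {"id": [1, 2, 3], "cond": ["25C 4.2V", "45C 4.2V", None], "extra": [0, 0, 0]}
    ),
    "Sheet2": pd.DataFrame(
        {"id": [4, 5], "cond": ["25C 4.2v", "25C 4.15V"], "extra": [0, 0]}
    ),
}

matches = [filter_rows(sheet, col_idx=1, n_cols=2) for sheet in sheets.values()]
df_concat = pd.concat(matches, axis=0, ignore_index=True)
print(df_concat)
```

With a real workbook, `pd.read_excel(path, sheet_name=None, header=2)` returns a similar dictionary of sheet name to DataFrame, so the same list comprehension would apply.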

    import pandas as pd

    df = pd.ExcelFile(r'F:\Python_file\read_sheet\CLP_LFP\xfun\output.xlsx')
    df.sheet_names
    df_flist = []
    for sheet in df.sheet_names:
        print(sheet)
        read_sheet = pd.read_excel(df, sheet, header=2)
        rows = range(0, read_sheet.shape[0])
        for row in rows:
            try:
                col_test = read_sheet.iloc[row, 3]
                if ("25" in col_test) and ('4.2v' in col_test or '4.2V' in col_test):
                    # if '4.15V' in col_test:
                    #     continue
                    # if ('60' in col_test or '45' in col_test):
                    #     continue
                    df_flist.append(read_sheet.iloc[[row], :4])
            except Exception as err:
                print('col nan')
    df_flist = pd.DataFrame(df_flist)
    df_concat = pd.concat(df_flist, axis=0)
    print(df_concat)

This raises `ValueError: Must pass 2-d input. shape=(120, 1, 4)`.

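This Python code reads every sheet of an Excel file, collects the rows that match the filter into a list, and then tries to merge the list into one DataFrame. The error comes from the line `df_flist = pd.DataFrame(df_flist)`, not from `pd.concat`. At that point `df_flist` is a list of 120 one-row DataFrames, each of shape (1, 4). The `DataFrame` constructor stacks them into a 3-D array of shape (120, 1, 4), which is exactly what the message reports, and a DataFrame must be 2-D. Delete that line and pass the list straight to `pd.concat(df_flist, axis=0)`. If the result still looks wrong, print intermediate values such as `len(df_flist)` and `df_flist[0].shape` to debug. As an AI I cannot run your code directly, so if you need more help, please share more details or code snippets.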
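A small reproduction may make the shape problem easier to see. This is a sketch under assumptions: the column names `a` to `d` and the row values are made up, and only the list structure (120 one-row DataFrames with 4 columns) mirrors the question.

```python
import pandas as pd

# Hypothetical stand-in for the rows collected from the workbook:
# 120 one-row DataFrames with 4 columns each, like df_flist in the question.
df_flist = [
    pd.DataFrame([[i, "25C", "4.2V", i * 0.1]], columns=list("abcd"))
    for i in range(120)
]

# Wrapping the list in pd.DataFrame is the step that fails in the question.
try:
    pd.DataFrame(df_flist)
except ValueError as err:
    print(f"pd.DataFrame(df_flist) failed: {err}")

# pd.concat takes the list of DataFrames directly and stacks the rows.
df_concat = pd.concat(df_flist, axis=0, ignore_index=True)
print(df_concat.shape)
```

One caveat: `pd.concat` raises a `ValueError` when the list is empty, so it is worth checking `if df_flist:` first when no row may match the filter.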

    import torch
    import torch.nn as nn
    import torch.optim as optim
    import os
    import pandas as pd
    import numpy as np
    from sklearn.preprocessing import MinMaxScaler
    import matplotlib.pyplot as plt
    from sklearn.metrics import mean_squared_error, mean_absolute_error
    from sklearn.model_selection import train_test_split

    # Check if GPU is available, otherwise use CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Data loading and preprocessing: read every xlsx file in the folder
    def load_data_from_folder(folder_path, time_step, pred_step):
        all_data = []
        for file_name in os.listdir(folder_path):
            if file_name.endswith('.xlsx'):
                file_path = os.path.join(folder_path, file_name)
                print(f"Loading data from {file_path}")
                try:
                    excel_file = pd.ExcelFile(file_path)
                    sheet_name = excel_file.sheet_names[0]
                    print(f"Sheet names: {excel_file.sheet_names}")
                    df = pd.read_excel(file_path, sheet_name=sheet_name)
                except Exception as e:
                    print(f"Error reading {file_name}: {e}")
                    continue
                if df.empty:
                    print(f"Warning: {file_name} is empty")
                    continue
                print(f"Original data: {df.head()}")
                df = df.dropna()
                print(f"Data after dropping NaN values: {df.head()}")
                # Try converting to numeric and print which columns have problems
                df = df.apply(pd.to_numeric, errors='coerce')
                print(f"Data after applying pd.to_numeric: {df.head()}")
                print(f"Columns with NaN values: {df.isna().sum()}")
                # If the data is empty, skip this file
                if df.empty:
                    print(f"Warning: Data is empty after cleaning in file {file_name}")
                    continue
                all_data.append(df)
        if len(all_data) == 0:
            raise ValueError("No valid data found in any of the files.")
        all_data_df = pd.concat(all_data, ignore_index=True)
        features = all_data_df.column

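### Building a time-series forecasting model with PyTorch

#### Data loading and preprocessing

To build a PyTorch time-series forecasting model, the first problem to solve is data loading and preprocessing. If the data is spread across several Excel workbooks, `pandas` can read them and convert them into one uniformly structured table:

1. **Reading the Excel files**

   Read each file with `pandas.read_excel()` and merge the results into a single DataFrame.

   ```python
   import pandas as pd

   # Assume the files are named 'data_1.xlsx', 'data_2.xlsx', ...
   file_list = ['data_1.xlsx', 'data_2.xlsx']
   data_frames = [pd.read_excel(file) for file in file_list]

   # Merge all DataFrames
   combined_data = pd.concat(data_frames, ignore_index=True)
   ```

2. **Cleaning and standardizing the data**

   Cleaning may include dropping missing values, detecting outliers, and scaling features. Time-series data is usually scaled with Min-Max normalization or Z-score standardization[^3].

   ```python
   from sklearn.preprocessing import StandardScaler

   # Z-score standardization; every column must already be numeric
   scaler = StandardScaler()
   scaled_features = scaler.fit_transform(combined_data.values)
   ```

3. **Splitting into training and test sets**

   Use `train_test_split` to divide the data into a training set and a test set.

   ```python
   from sklearn.model_selection import train_test_split

   X_train, X_test, y_train, y_test = train_test_split(
       scaled_features[:, :-1],  # feature columns
       scaled_features[:, -1],   # target column (assumed to be the last one)
       test_size=0.2,
       random_state=42
   )
   ```

#### Model definition and training

Next, define a neural network architecture suited to time-series forecasting in PyTorch. A CNN-BiGRU structure is one possible base[^1].

1. **Model design**

   Define a class that inherits from `nn.Module` and combines a convolutional layer, a bidirectional GRU layer, and a final linear layer.

   ```python
   import torch
   import torch.nn as nn
   import torch.nn.functional as F

   class TimeSeriesModel(nn.Module):
       def __init__(self, input_dim, hidden_dim, output_dim, num_layers):
           super(TimeSeriesModel, self).__init__()
           self.conv_layer = nn.Conv1d(in_channels=input_dim, out_channels=hidden_dim,
                                       kernel_size=3, padding=1)
           self.bigru = nn.GRU(input_size=hidden_dim, hidden_size=hidden_dim,
                               num_layers=num_layers, batch_first=True, bidirectional=True)
           # The GRU is bidirectional, so its output width is hidden_dim * 2
           self.fc = nn.Linear(hidden_dim * 2, output_dim)

       def forward(self, x):
           # x: (batch, seq_len, input_dim); Conv1d expects (batch, channels, seq_len)
           conv_out = F.relu(self.conv_layer(x.permute(0, 2, 1)))
           gru_input = conv_out.permute(0, 2, 1)
           bigru_output, _ = self.bigru(gru_input)
           final_output = self.fc(bigru_output[:, -1, :])  # output of the last time step
           return final_output
   ```

2. **Optimizer and loss function**

   Configure the Adam optimizer and a mean squared error (MSE) loss for the supervised task.

   ```python
   import torch.optim as optim

   model = TimeSeriesModel(input_dim=X_train.shape[1], hidden_dim=64, output_dim=1, num_layers=2)
   criterion = nn.MSELoss()
   optimizer = optim.Adam(model.parameters(), lr=0.001)
   ```

3. **Training loop**

   Write the training logic that updates the parameters iteratively until the loss converges.

   ```python
   epochs = 50
   # Shape (batch, seq_len=1, n_features) so the channel count matches input_dim
   inputs = torch.tensor(X_train, dtype=torch.float32).unsqueeze(1)
   targets = torch.tensor(y_train, dtype=torch.float32)

   for epoch in range(epochs):
       model.train()
       outputs = model(inputs)
       loss = criterion(outputs.squeeze(-1), targets)

       optimizer.zero_grad()
       loss.backward()
       optimizer.step()

       if (epoch + 1) % 10 == 0:
           print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')
   ```

#### Evaluating performance on the test set

The last step is to quantify performance on the test set, for example with the mean absolute error (MAE) or the coefficient of determination \(R^2\).

```python
import numpy as np

model.eval()
with torch.no_grad():
    test_inputs = torch.tensor(X_test, dtype=torch.float32).unsqueeze(1)
    predictions = model(test_inputs).squeeze(-1).numpy()

mae = np.mean(np.abs(predictions - y_test))
print(f'Mean Absolute Error on Test Set: {mae:.4f}')
```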
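The answer mentions both MAE and \(R^2\) as evaluation metrics. As a rough sketch, both can be computed with NumPy alone; the helper name `regression_metrics` and the sample values are assumptions for illustration, not part of the original answer.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return (MAE, R^2) for two equally long 1-D sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mae = float(np.mean(np.abs(y_pred - y_true)))
    ss_res = float(np.sum((y_true - y_pred) ** 2))         # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares
    # R^2 is undefined when the targets are constant (ss_tot == 0).
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return mae, r2


# Made-up targets and predictions, only to show the call.
mae, r2 = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(f"MAE: {mae:.4f}, R^2: {r2:.4f}")
```

In the evaluation step above, the call would be `regression_metrics(y_test, predictions)`. The `sklearn.metrics` functions `mean_absolute_error` and `r2_score` compute the same quantities.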
