Modify the code so that it generates an Excel sheet containing the years and provinces from the web page, with years as rows and provinces as columns.

    import requests
    from bs4 import BeautifulSoup
    import re
    import xlwt
    headers={
        "User-Agent":"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36 Core/1.94.218.400 QQBrowser/12.1.5496.400"}
    url="https://data.stats.gov.cn/easyquery.htm?cn=E0103"
    r=requests.get(url,headers=headers)
    html = (r.text)
    soup = BeautifulSoup(html, "html.parser")
    all_years = soup.find_all("table", attrs={"class": "tr-title"})
    all_princes = soup.find_all("table", attrs={"class": "table_container_column"})
    for year in all_years:
        year_string = year.string
        if "/" not in year_string:
            print(year_string)
    for prince in all_princes:
        prince_string = prince.string
        if "/" not in prince_string:
            print(prince_string)

Time: 2025-03-28 10:00:34

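To store the year and province data in an Excel sheet, you can modify your code as follows:

### Modified code

```python
import requests
from bs4 import BeautifulSoup
import xlwt

# Request headers
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36 Core/1.94.218.400 QQBrowser/12.1.5496.400"
}

# Fetch the page
url = "https://data.stats.gov.cn/easyquery.htm?cn=E0103"
response = requests.get(url, headers=headers)

# Parse the HTML
html_content = response.text
soup = BeautifulSoup(html_content, "html.parser")

# Extract all years (assumes they are <a> tags inside div.treeYear)
year_trees = soup.find_all("div", class_="treeYear")
years = []
for tree in year_trees:
    for a_tag in tree.find_all("a"):
        years.append(a_tag.text.strip())

# Extract all provinces, skipping blanks and duplicates while keeping page order
province_tables = soup.find_all("table", attrs={"class": "table_container_column"})
provinces = []
for province_table in province_tables:
    for td in province_table.find_all("td"):
        name = td.text.strip()
        if name and name not in provinces:
            provinces.append(name)

# Create the workbook and add a sheet
workbook = xlwt.Workbook(encoding="utf-8")
worksheet = workbook.add_sheet("Years_and_Provinces")

# Header row: "Year" in the first column, then one column per province
worksheet.write(0, 0, "Year")
for col_idx, province in enumerate(provinces, start=1):
    worksheet.write(0, col_idx, province)

# One row per year, written into the first column below the header row
for row_idx, year in enumerate(years, start=1):
    worksheet.write(row_idx, 0, year)

# Save the file
file_name = "./output.xls"
workbook.save(file_name)
print(f"Excel file generated at {file_name}")
```

---

### Key points:

1. **Extracting the years**
   `BeautifulSoup` parses the page and collects the text of each year link into a list. The code assumes the years sit under elements with a class such as `.treeYear`; check this against the actual page HTML. The original draft defined `year_tables` but looped over `year_trees`, which raised a `NameError`; the names are now consistent.

2. **Extracting the provinces**
   The code walks every `td` in the province tables, skips empty cells, and drops duplicates. It keeps the order in which names appear on the page, because converting a plain `set` to a list would leave the column order arbitrary.

3. **Creating the Excel sheet**
   `xlwt` creates a workbook with one sheet. The province names go into the first row as column headers, and the years go into the first column as row headers.

4. **Layout of the data**
   Each row corresponds to one year and each column to one province. Only the headers are written; the cells in the body of the grid are left empty.

---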
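The row and column layout described above can be checked without the network or `xlwt`. The following is a minimal sketch, not the answer's own method: `clean_labels` and `build_grid` are hypothetical helper names, the sample values are made up to stand in for scraped text, and CSV output through the standard library is swapped in for the `.xls` file so the example has no third-party dependencies. The `"/"` filter mirrors the check in the question's code.

```python
import csv
import io


def clean_labels(raw_labels):
    """Drop None, blank and '/'-containing entries; dedupe in first-seen order."""
    cleaned = []
    seen = set()
    for label in raw_labels:
        if label is None:  # BeautifulSoup's tag.string is None when a tag has more than one child
            continue
        text = label.strip()
        if not text or "/" in text or text in seen:
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned


def build_grid(years, provinces, corner="Year"):
    """Return the header row followed by one row per year with empty body cells."""
    header = [corner] + list(provinces)
    body = [[year] + [""] * len(provinces) for year in years]
    return [header] + body


# Hypothetical sample values standing in for text scraped from the page
years = clean_labels(["2023", " 2022 ", None, "2021/2020", "2022"])
provinces = clean_labels(["北京市", "天津市", "", "北京市"])
grid = build_grid(years, provinces)

# Write the grid as CSV into an in-memory buffer instead of an .xls file
buffer = io.StringIO()
csv.writer(buffer).writerows(grid)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce the `.xls` file instead, the same grid can be fed to the worksheet cell by cell, calling `worksheet.write(r, c, grid[r][c])` for each row index `r` and column index `c`.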
