Scraping Eastmoney (东方财富网) stock data with Python
### Writing a Python crawler to scrape stock data from Eastmoney
To achieve this, you can use the `requests` library to send HTTP requests and parse the HTML with `BeautifulSoup` or `lxml`. Since the task involves a fair amount of structured data, it is also worth bringing in `pandas` for data cleaning and preliminary analysis.
#### Preparation
Install the required third-party libraries:
```bash
pip install requests beautifulsoup4 lxml pandas mysql-connector-python
```
#### Fetching page content
Define a function that retrieves the content of a given URL:
```python
import requests
from bs4 import BeautifulSoup

def fetch_page(url):
    """Fetch the raw HTML of the given URL, raising on failure."""
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
        'Accept-Language': 'zh-CN,zh;q=0.9'
    }
    response = requests.get(url, headers=headers, timeout=10)
    if response.status_code == 200:
        return response.text
    raise Exception(f"Failed to load page {url}")
```
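As a quick sanity check, the function can be called with the URL of the page you intend to scrape. The URL below is only a placeholder; the real address, and whether the quote table is actually rendered server-side rather than loaded via JavaScript, should be confirmed in the browser's developer tools.
```python
# Hypothetical example URL -- replace it with the actual Eastmoney page you target.
html = fetch_page("https://quote.eastmoney.com/sh600519.html")
print(html[:200])  # peek at the beginning of the returned HTML
```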
#### Parsing the page and extracting the data
Define a function that parses the returned HTML string and extracts the relevant pieces of information:
```python
import pandas as pd
from bs4 import BeautifulSoup

def parse_html(html_content):
    """Parse the HTML and return the quote table as a DataFrame."""
    soup = BeautifulSoup(html_content, "lxml")
    # The class name below is an assumption -- adjust it to match the actual page markup.
    table = soup.find('table', attrs={'class': 'data-table'})
    rows = []
    for tr in table.tbody.find_all('tr'):
        row_data = [td.text.strip() for td in tr.find_all('td')]
        rows.append(row_data)
    # Adjust the column names to match the actual table structure.
    columns = ['日期', '开盘价', '最高价', '最低价', '收盘价', '涨跌幅', '成交额']
    df = pd.DataFrame(rows, columns=columns)
    return df
```
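Because the scraped cells are plain strings while the MySQL table defined below uses `DECIMAL` and `BIGINT` columns, it can help to convert the numeric columns before inserting. The helper below is a best-effort sketch that is not part of the original article; it assumes the Chinese column names used in `parse_html` above.
```python
import pandas as pd

def coerce_types(df):
    """Best-effort conversion of the scraped string columns to proper types.

    Assumes the column names produced by parse_html; adjust as needed.
    """
    df = df.copy()
    df['日期'] = pd.to_datetime(df['日期']).dt.date
    for col in ['开盘价', '最高价', '最低价', '收盘价', '成交额']:
        # Strip thousands separators before the numeric conversion.
        df[col] = pd.to_numeric(df[col].str.replace(',', ''), errors='coerce')
    # 涨跌幅 (change rate) is stored as VARCHAR in the table below, so it is left as text.
    return df
```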
#### Saving to a MySQL database
Create a connection and persist the DataFrame to the relational database:
```python
import mysql.connector as mc
import pandas as pd

def save_to_db(dataframe):
    """Create the target table if needed and bulk-insert the DataFrame rows."""
    conn_params = {
        'host': 'localhost',
        'port': 3306,
        'database': 'finance',
        'user': 'root',
        'password': ''
    }
    connection = mc.connect(**conn_params)
    cursor = connection.cursor()
    create_table_sql = """
    CREATE TABLE IF NOT EXISTS stock_prices (
        date DATE PRIMARY KEY,
        open_price DECIMAL(10, 2),
        high_price DECIMAL(10, 2),
        low_price DECIMAL(10, 2),
        close_price DECIMAL(10, 2),
        change_rate VARCHAR(10),
        volume BIGINT
    );
    """
    # The DataFrame columns are named in Chinese, so the table columns are listed
    # explicitly in matching order instead of being derived from the DataFrame.
    insert_into_sql = """
    INSERT INTO stock_prices
        (date, open_price, high_price, low_price, close_price, change_rate, volume)
    VALUES (%s,%s,%s,%s,%s,%s,%s);
    """
    try:
        cursor.execute(create_table_sql)
        records = list(dataframe.itertuples(index=False, name=None))
        cursor.executemany(insert_into_sql, records)
        connection.commit()
    except Exception as e:
        print(e)
        connection.rollback()
    finally:
        cursor.close()
        connection.close()
```
The code above shows how to write a basic Python crawler that collects historical price records from Eastmoney and persists them to a locally running MySQL instance[^1].
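Tying the pieces together, a minimal driver might look like the sketch below. The URL is a placeholder and `coerce_types` is the optional helper sketched earlier; both are assumptions rather than part of the original article.
```python
if __name__ == "__main__":
    # Placeholder URL -- replace with the actual Eastmoney page containing the quote table.
    url = "https://quote.eastmoney.com/sh600519.html"
    html = fetch_page(url)
    df = parse_html(html)
    df = coerce_types(df)  # optional type conversion before hitting DECIMAL/BIGINT columns
    save_to_db(df)
```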