
Scraping Douban's New Movies Chart with Python

Posted: 2023-09-28 10:10:32 Views: 233

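Sure. Here are the steps for scraping the Douban new movies chart with Python:

1. Import the required libraries:

```python
import requests
from bs4 import BeautifulSoup
```

2. Send a GET request to fetch the page. Douban typically rejects requests that do not carry a browser-like `User-Agent` header, so set one explicitly:

```python
url = 'https://movie.douban.com/chart'
# Douban tends to reject requests that lack a browser-like User-Agent
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers)
```

3. Parse the page content with BeautifulSoup:

```python
soup = BeautifulSoup(response.text, 'html.parser')
```

4. Locate the HTML tags that hold the data and extract it. Each film sits in a `div` with class `pl2` inside the `article` container. A film that has not collected enough votes may have no rating element, so check for it before reading its text:

```python
movie_list = soup.find('div', class_='article').find_all('div', class_='pl2')
for movie in movie_list:
    title = movie.find('a').text.strip()
    # Films without enough votes may have no rating span, so guard against None
    rating_tag = movie.find('span', class_='rating_nums')
    rating = rating_tag.text.strip() if rating_tag else 'N/A'
    print(title, rating)
```

The complete code:

```python
import requests
from bs4 import BeautifulSoup

url = 'https://movie.douban.com/chart'
# Douban tends to reject requests that lack a browser-like User-Agent
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers)

# Parse the returned HTML
soup = BeautifulSoup(response.text, 'html.parser')

# Each film sits in a div.pl2 inside the div.article container
movie_list = soup.find('div', class_='article').find_all('div', class_='pl2')
for movie in movie_list:
    title = movie.find('a').text.strip()
    # Films without enough votes may have no rating span, so guard against None
    rating_tag = movie.find('span', class_='rating_nums')
    rating = rating_tag.text.strip() if rating_tag else 'N/A'
    print(title, rating)
```

The output has the following form, one title and rating per line (the actual titles and ratings depend on what the chart contains when the script runs):

```
肖申克的救赎 9.7
霸王别姬 9.6
阿甘正传 9.5
这个杀手不太冷 9.4
美丽人生 9.5
泰坦尼克号 9.4
千与千寻 9.4
辛德勒的名单 9.4
盗梦空间 9.3
机器人总动员 9.3
```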
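The extraction step can also be wrapped in a function and tried offline before hitting the live site. The following is a minimal sketch, not a definitive implementation: `parse_chart` and `SAMPLE_HTML` are hypothetical names, and the sample markup is invented to mirror the `div.article` / `div.pl2` / `span.rating_nums` structure that the steps above assume. The real page may differ or change over time.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption) mirroring the structure used above
SAMPLE_HTML = """
<div class="article">
  <div class="pl2">
    <a href="#">Example Film A
      / <span>Alternate Title</span></a>
    <span class="rating_nums">8.1</span>
  </div>
  <div class="pl2">
    <a href="#">Example Film B</a>
    <span class="pl">(not enough ratings yet)</span>
  </div>
</div>
"""


def parse_chart(html):
    """Return a list of (title, rating) tuples; rating is None when absent."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_="article")
    if container is None:
        # The page layout is not what we expected (or the request was blocked)
        return []
    results = []
    for movie in container.find_all("div", class_="pl2"):
        link = movie.find("a")
        if link is None:
            continue
        # Collapse the newlines and runs of spaces inside the link text
        title = " ".join(link.get_text().split())
        rating_tag = movie.find("span", class_="rating_nums")
        rating_text = rating_tag.get_text(strip=True) if rating_tag else ""
        rating = float(rating_text) if rating_text else None
        results.append((title, rating))
    return results


movies = parse_chart(SAMPLE_HTML)
for title, rating in movies:
    print(title, rating if rating is not None else "no rating")
```

To use it against the live page, pass `response.text` from the request step to `parse_chart`. Returning an empty list when the container is missing keeps the script from raising `AttributeError` if the site serves a block page instead of the chart.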
