Crawling Weibo with Python Scrapy
Date: 2023-11-29 14:44:41
Scrapy is a Python web-crawling framework that can be used to scrape data from a wide range of websites. Below are the steps for crawling Sina Weibo user information with Scrapy:
1. Create a Scrapy project
```shell
scrapy startproject weibo
```
2. Generate a spider
```shell
scrapy genspider weibo_spider weibo.com
```
3. Write the spider code
```python
import scrapy


class WeiboSpider(scrapy.Spider):
    name = 'weibo_spider'
    allowed_domains = ['weibo.com']
    start_urls = ['https://weibo.com/']

    def parse(self, response):
        # Write the response-parsing logic here
        pass
```
4. Parse the response
```python
import scrapy


class WeiboSpider(scrapy.Spider):
    name = 'weibo_spider'
    allowed_domains = ['weibo.com']
    start_urls = ['https://weibo.com/']

    def parse(self, response):
        # Extract one item per user block on the page
        for user in response.css('div.user'):
            yield {
                'name': user.css('a.name::text').get(),
                'location': user.css('span.location::text').get(),
                'description': user.css('p.description::text').get(),
            }
        # Follow the next-page link, if any, and keep crawling
        next_page = response.css('a.next-page::attr(href)').get()
        if next_page is not None:
            yield response.follow(next_page, self.parse)
```
5. Run the spider
```shell
scrapy crawl weibo_spider -o users.csv
```
The code above crawls Sina Weibo user information and saves the results to a CSV file. Note that the CSS selectors shown are illustrative: the real weibo.com pages are rendered largely by JavaScript and require login, so in practice you would adapt the selectors (or use Weibo's mobile/API endpoints) to match the actual markup.