The Complete Biquge (笔趣阁) Novel Collection: Read to Your Heart's Content with a Python Crawler

Article link: https://blog.csdn.net/Vergelight/article/details/144792766

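Opening:

Hi everyone! It's that time again: curled up at home, phone in hand, reading novels. Biquge, an old friend of novel fans, puts new stories on the shelf every day, and the choice can be dizzying. Tired of hunting for a book? Don't worry. I'll show you how to use a Python crawler to look up a novel by title and download the whole thing in one go, so your reading never stalls.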

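Step 1: Build the Python crawler

Let's start by building a small tool: a Python crawler. It's as easy as stacking building blocks. requests fetches pages from the web, BeautifulSoup helps make sense of what comes back, and pandas tidies up the data for storage. Here is part of our crawler code: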

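import requests
from bs4 import BeautifulSoup


def searchChapter(url, file):
    """Download one chapter page and append its title and text to an open file."""
    res = requests.get(url, timeout=10)
    res.raise_for_status()
    # Let requests guess the encoding from the body; the page may not be UTF-8
    res.encoding = res.apparent_encoding
    soup = BeautifulSoup(res.text, "html.parser")
    title_tag = soup.find('h1')
    content_tag = soup.find('div', id='content')
    if title_tag is None or content_tag is None:
        # Unexpected page layout: skip this chapter instead of crashing
        print('Skipped:', url)
        return
    chapterName = title_tag.text
    # Treat each run of four spaces as a paragraph break
    content = content_tag.text.replace('    ', '\n\n  ')
    file.write(chapterName)
    file.write(content)
    file.write('\n\n\n\n\n')
    print(chapterName, 'downloaded!')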


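def searchAllUrl(url, name):
    """Download every chapter linked from a book's index page into <name>.txt."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    # Chapter links are read from the <dd> elements of the index page
    dd = soup.find_all('dd')
    # The with-block closes the file even if a chapter download fails
    with open(name + '.txt', 'w', encoding='utf-8') as file:
        for d in dd:
            if d.a is None:
                continue  # a <dd> without a link has nothing to download
            chapterUrl = 'https://www.xbiquge.la' + d.a['href']
            searchChapter(chapterUrl, file)
    print('Whole novel downloaded!')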


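def searchBook(name):
    """Search the site for a title; return the first result's link, or 0 if none."""
    url = 'https://www.xbiquge.la/modules/article/waps.php'
    params = {'searchkey': name}
    response = requests.get(url, params=params, timeout=10)
    soup = BeautifulSoup(response.text, "html.parser")
    # The first <td> on the results page is taken as the top search hit
    target = soup.find('td')
    if target is None or target.find('a') is None:
        return 0

    return target.find('a')['href']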
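
The excerpt above stops before showing how the three functions fit together, so here is a minimal sketch of one way to chain them. It is an assumption, not the author's code: `download_novel` is a hypothetical helper, and `search_book` / `download_all` are stand-ins for `searchBook` and `searchAllUrl`, passed in as parameters so the sketch runs on its own without touching the network. The `example.invalid` address is a placeholder too.

```python
def download_novel(name, search_book, download_all):
    """Look up a title and, if it is found, download every chapter.

    search_book and download_all stand in for searchBook and searchAllUrl
    from the listing above (hypothetical wiring, not the author's code).
    """
    book_url = search_book(name)
    # searchBook signals "not found" by returning 0
    if not book_url:
        print('No result for:', name)
        return False
    download_all(book_url, name)
    return True


if __name__ == '__main__':
    # Fake stand-ins so the sketch can be tried offline
    recorded = []

    def fake_search(name):
        return 'https://example.invalid/book/1/' if name == 'demo' else 0

    def fake_download(url, name):
        recorded.append((url, name))

    download_novel('demo', fake_search, fake_download)
    download_novel('missing', fake_search, fake_download)
    print(recorded)
```

With the real functions in scope, the call would look like `download_novel(title, searchBook, searchAllUrl)`. Checking the return value of the search first avoids handing `0` to the downloader when the site finds nothing.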