Python Web Scraping and Data Visualization for BOSS Zhipin

The following steps use Python to scrape job listings from BOSS Zhipin and visualize them:

1. Import the required libraries

```python
import requests
from bs4 import BeautifulSoup
import pymysql
from pyecharts.charts import Bar, Pie, WordCloud
from pyecharts import options as opts
from flask import Flask, render_template
```

2. Fetch a page

```python
# Fetch one page from the BOSS Zhipin site and parse its HTML
def get_data(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
    response = requests.get(url, headers=headers, timeout=10)
    soup = BeautifulSoup(response.text, 'html.parser')
    return soup
```

3. Scrape multiple pages

```python
# Scrape pages 1 to 10 and collect one row per job listing
def get_all_data():
    all_data = []
    for i in range(1, 11):
        url = 'https://www.zhipin.com/c101280100-p100109/?page={}&ka=page-{}'.format(i, i)
        soup = get_data(url)
        data_list = soup.find_all('div', class_='job-primary')
        for data in data_list:
            job_name = data.find('div', class_='job-title').text.strip()
            salary = data.find('span', class_='red').text.strip()
            company = data.find('div', class_='company-text').find('a').text.strip()
            education = data.find('div', class_='job-limit clearfix').find_all('span')[1].text.strip()
            welfare = data.find('div', class_='info-append').find_all('span')
            # Join the tags into one string so the value fits a single text column
            welfare_text = ','.join(w.text.strip() for w in welfare)
            all_data.append([job_name, salary, company, education, welfare_text])
    return all_data
```

4. Store the data

```python
# Save the scraped rows to a MySQL database
def save_data(data):
    db = pymysql.connect(host='localhost', user='root', password='123456', port=3306, database='boss_zhipin')
    cursor = db.cursor()
    sql = 'INSERT INTO job_info(job_name, salary, company, education, welfare) values(%s, %s, %s, %s, %s)'
    try:
        cursor.executemany(sql, data)
        db.commit()
    except Exception as e:
        print(e)
        db.rollback()
    finally:
        db.close()
```

5. Visualize the data

```python
# Serve the chart options as JSON through Flask
app = Flask(__name__)


def fetch_column(column):
    # column is always a hard-coded name from the routes below, never user input
    db = pymysql.connect(host='localhost', user='root', password='123456', port=3306, database='boss_zhipin')
    try:
        with db.cursor() as cursor:
            cursor.execute('SELECT {} FROM job_info'.format(column))
            return [row[0] for row in cursor.fetchall()]
    finally:
        db.close()


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/salary')
def salary():
    salary_list = []
    for text in fetch_column('salary'):
        # Values are assumed to look like '15-25K'; use the lower bound of the range
        lower = (text or '').replace('k', '').replace('K', '').split('-')[0]
        if lower.isdigit():
            salary_list.append(int(lower))
    salary_dict = {}
    for i in range(0, 31, 5):
        salary_dict['{}k-{}k'.format(i, i + 5)] = 0
    for value in salary_list:
        for key in salary_dict:
            low, high = (int(part.rstrip('k')) for part in key.split('-'))
            # Half-open range so a boundary value is counted only once
            if low <= value < high:
                salary_dict[key] += 1
    bar = Bar()
    bar.add_xaxis(list(salary_dict.keys()))
    bar.add_yaxis('薪资分布', list(salary_dict.values()))
    bar.set_global_opts(title_opts=opts.TitleOpts(title='BOSS直聘薪资分布图'))
    return bar.dump_options_with_quotes()


@app.route('/education')
def education():
    education_dict = {}
    for level in fetch_column('education'):
        education_dict[level] = education_dict.get(level, 0) + 1
    pie = Pie()
    pie.add('', list(education_dict.items()))
    pie.set_global_opts(title_opts=opts.TitleOpts(title='BOSS直聘学历要求分布图'))
    return pie.dump_options_with_quotes()


@app.route('/welfare')
def welfare():
    welfare_dict = {}
    for text in fetch_column('welfare'):
        # Each row holds the comma-separated tags written by get_all_data()
        for tag in (text or '').split(','):
            if tag:
                welfare_dict[tag] = welfare_dict.get(tag, 0) + 1
    wordcloud = WordCloud()
    wordcloud.add('', list(welfare_dict.items()), word_size_range=[20, 100])
    wordcloud.set_global_opts(title_opts=opts.TitleOpts(title='BOSS直聘福利词云图'))
    return wordcloud.dump_options_with_quotes()


if __name__ == '__main__':
    # Run save_data(get_all_data()) once beforehand to populate job_info
    app.run()
```
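
Salary text on listing pages is usually a range such as `15-25K`, sometimes with a suffix, rather than a single number. That format is an assumption here, since the original does not show sample data. The sketch below shows one way to parse such strings and count them in 5k-wide buckets. The helper names `parse_salary_k` and `bucket_salaries` are hypothetical and not part of the original code.

```python
import re
from collections import Counter

# Assumed format: "<low>-<high>K", optionally followed by extra text.
_RANGE = re.compile(r'(\d+)\s*-\s*(\d+)\s*[kK]')


def parse_salary_k(text):
    """Return (low, high) in thousands, or None if the text does not match."""
    match = _RANGE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def bucket_salaries(texts, step=5, top=30):
    """Count salaries by range midpoint in half-open buckets [i, i + step)."""
    counts = Counter()
    for text in texts:
        parsed = parse_salary_k(text)
        if parsed is None:
            continue  # skip values that are not monthly "K" ranges
        midpoint = (parsed[0] + parsed[1]) / 2
        # Everything at or above `top` goes into one open-ended bucket
        start = min(int(midpoint // step) * step, top)
        if start == top:
            label = '{}k+'.format(top)
        else:
            label = '{}k-{}k'.format(start, start + step)
        counts[label] += 1
    return counts
```

Bucketing by midpoint is only one choice; the lower or upper bound of each range would work the same way. The resulting labels and counts can be passed to `Bar.add_xaxis` and `Bar.add_yaxis` as lists.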
