
Python helps you collect all kinds of memes, so you can be the coolest kid in the group chat~

Author: 搬砖python中~  Published: 2022-07-22 19:08:01

Preface

Good morning, good afternoon, and good evening, everyone~

Environment:

Python 3.8
PyCharm 2021.2

Modules:

import requests >>> pip install requests
import re >>> built in, no installation needed

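The basic crawler routine

I. Data source analysis

Be clear about what you need
URL: Uniform Resource Locator, the address of the data you want

II. Code implementation steps

Send the request: use Python code to simulate a browser sending a request to the URL
Get the data: receive the response data returned by the server
Parse the data: extract the image URLs and image titles we want
Save the data: write the image content to a local folder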
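
The four steps above can be sketched as one small pipeline. This is only a minimal illustration under stated assumptions: the `fake_fetch` stand-in, the `<img>` markup in `SAMPLE_HTML`, and the function names are all made up here, and the request step is faked so the sketch runs without network access.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical page markup; a real site will look different.
SAMPLE_HTML = (
    '<img src="https://example.com/a.gif" alt="cat">'
    '<img src="https://example.com/b.gif" alt="dog">'
)


def fake_fetch(url):
    """Stand-in for steps 1-2: text for a page URL, bytes for an image URL."""
    if url.endswith(".gif"):
        return b"GIF89a"  # placeholder bytes, not a real image
    return SAMPLE_HTML


def parse(html):
    """Step 3: extract (image URL, title) pairs with a regular expression."""
    return re.findall(r'<img src="(.*?)" alt="(.*?)">', html)


def save(folder, title, url, content):
    """Step 4: write the image bytes to a local folder, keeping the URL's suffix."""
    path = Path(folder) / (title + Path(url).suffix)
    path.write_bytes(content)
    return path


def run(page_url, folder, fetch=fake_fetch):
    html = fetch(page_url)  # steps 1-2: send the request, get the response data
    saved = []
    for url, title in parse(html):  # step 3: parse
        saved.append(save(folder, title, url, fetch(url)))  # step 4: save
    return saved


out_dir = tempfile.mkdtemp()
saved_files = run("https://example.com/memes", out_dir)
print([p.name for p in saved_files])
```

In a real run, `fetch` would be replaced by a function that performs the HTTP request and returns `response.text` for pages and `response.content` for images.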

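Code

import requests  # HTTP request module
import re  # regular expressions
import time  # time module
import concurrent.futures  # thread and process pools


def get_response(html_url):
    """
    Send a request.
    :param html_url: formal parameter, a placeholder for the URL passed in by the caller
    :return: the response object
    Some sites return data even if you do not disguise the request.
    headers: a dict of request headers
    Cookie: user information, commonly used to check whether an account is logged in
    User-Agent: identifies the browser making the request
    """
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36'
    }
    response = requests.get(url=html_url, headers=headers)
    return response  # response object; a status code of 200 means the request succeeded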
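
One possible way to harden the request function is to add a timeout and to raise on error status codes instead of silently returning them. The sketch below assumes a 10-second timeout is acceptable; the `http_get` parameter and the `DEFAULT_HEADERS` name are hypothetical additions made here so the function can be tried without network access, and they are not part of the original code.

```python
DEFAULT_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36"
    )
}


def get_response(html_url, http_get=None, timeout=10):
    """Request html_url and return the response, raising on 4xx/5xx status codes."""
    if http_get is None:
        # Imported lazily so the sketch can be exercised offline with a fake.
        import requests
        http_get = requests.get
    response = http_get(html_url, headers=DEFAULT_HEADERS, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses
    return response
```

Called as `get_response(url)`, this uses `requests.get`; passing a fake `http_get` is only meant for experiments and tests.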

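def get_img_info(html_url):
    """
    Get image information.
    :param html_url: page URL
    :return:
    """
    response = get_response(html_url=html_url)  # call the request function
    # print(response.text)
    # () marks a capture group, i.e. the data we want to extract; .*? is a non-greedy wildcard that matches any characters except \n (newline)
    # Writing the regex is mostly copy and paste: replace the data you want with (.*?)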
    title_list = re.findall('', response.text)
    url_list = re.findall('
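
To make the `(.*?)` idea concrete, here is a small sketch run against made-up markup. The `<img>` tag layout and the `data-original` and `title` attribute names are assumptions for illustration only, not the target site's real HTML. It also shows the `re.S` flag, which lets `.` match newlines when a tag spans several lines.

```python
import re

# Hypothetical markup; the second tag is split across two lines on purpose.
html = '''
<img data-original="https://example.com/1.jpg" title="happy">
<img data-original="https://example.com/2.gif"
     title="sleepy">
'''

# Each pair of parentheses is a capture group; .*? matches as little as possible.
pattern = r'<img data-original="(.*?)".*?title="(.*?)">'

# Without re.S, "." stops at the newline, so the two-line tag is missed.
without_flag = re.findall(pattern, html)

# With re.S, "." also matches "\n", so both tags are found.
with_flag = re.findall(pattern, html, re.S)

for url, title in with_flag:
    print(title, url)
```

With two capture groups, `re.findall` returns a list of `(url, title)` tuples, so the image URL and its title stay paired without a separate `zip` step.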


    
        
            
        
        
            
                
                