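Web Scraping: Using the cookies Parameter in the requests Module
The previous article carried the cookie inside the headers parameter; requests also provides a dedicated cookies parameter for the same purpose.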
-
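Form of the cookies parameter: a dictionary
cookies = {"cookie name": "cookie value"}
- The dictionary corresponds to the Cookie string in the request headers, where each name=value pair is separated from the next by a semicolon and a space
- The part to the left of each equals sign is a cookie's name, which becomes a key of the cookies dictionary
- The part to the right of the equals sign is the corresponding value in the cookies dictionary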
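The correspondence described above can be sketched with the standard library alone. `build_cookie_header` is a hypothetical helper written for illustration, not part of requests, and the cookie names and values are made up:

```python
def build_cookie_header(cookies):
    """Join a cookies dictionary into a Cookie header string (illustrative helper)."""
    # Each key/value pair becomes name=value; pairs are joined with "; "
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


cookies = {"logged_in": "yes", "tz": "Asia%2FShanghai"}  # made-up example values
print(build_cookie_header(cookies))  # logged_in=yes; tz=Asia%2FShanghai
```

In practice requests builds this header itself; the helper only makes the mapping between the dictionary and the header string visible.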
-
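How to use the cookies parameter
Pass the dictionary by keyword; the second positional parameter of requests.get is params, not cookies.
response = requests.get(url, cookies=cookies)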
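The difference between passing the dictionary as `cookies=` and passing it positionally can be checked without sending anything, by preparing a request and inspecting it. This is a minimal sketch; the URL and cookie values are placeholders, not taken from the article:

```python
import requests

url = "https://example.com/profile"  # placeholder URL for illustration
cookies = {"logged_in": "yes"}       # made-up cookie

# Keyword argument: the dictionary ends up in the Cookie request header
as_cookies = requests.Request("GET", url, cookies=cookies).prepare()
print(as_cookies.headers["Cookie"])

# Equivalent of requests.get(url, cookies): the second positional
# parameter is params, so the dictionary becomes a query string instead
as_params = requests.Request("GET", url, params=cookies).prepare()
print(as_params.url)
print("Cookie" in as_params.headers)
```

Preparing a request this way is also a convenient way to see exactly which headers will be sent before touching the network.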
-
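Converting a cookie string into the dictionary that the cookies parameter expects (each pair is split on the first '=' only, so a value that itself contains '=' is kept intact):
cookies_dict = {cookie.split('=', 1)[0]: cookie.split('=', 1)[-1] for cookie in cookies_str.split('; ')}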
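A slightly more defensive version of the conversion above might look like the following. `cookie_str_to_dict` is a hypothetical helper name and the sample string is made up; the sketch assumes the string uses the usual `name=value; name=value` layout:

```python
def cookie_str_to_dict(cookies_str):
    """Parse 'name=value; name2=value2' into a dict (illustrative helper)."""
    cookies = {}
    for pair in cookies_str.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip empty pieces such as a trailing ';'
        # partition splits on the first '=' only, so '=' inside a value survives
        name, _, value = pair.partition("=")
        cookies[name] = value
    return cookies


sample = "logged_in=yes; token=abc==; tz=Asia%2FShanghai"  # made-up cookie string
print(cookie_str_to_dict(sample))

# Splitting on every '=' and taking the last piece would lose the token value
print("token=abc==".split("=")[-1] == "")  # True
```

Stripping each piece also tolerates strings where the separator is a bare semicolon without the following space.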
-
Note: cookies usually have an expiration time; once they expire, they must be obtained again.
Sample code:
import requests

# Send a browser-like User-Agent
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36',
}
# Cookie string copied from the browser's request headers after logging in
cookies_str = '__Host-user_session_same_site=F6064TJp50tOvhIhRulkPqD3w0PlDdg_2f71xYB_59wG5NjS; logged_in=yes; dotcom_user=keepmoving-521; has_recent_activity=1; tz=Asia%2FShanghai; _gh_sess=%2F7EcUsIfVeLID64pja2DW1XmVFfqJObek2mSQkqa8joz7h3ni%2FGlaRFR3ETrDvuhKiiUgJP8jStjNLNiFfpvWQy7U10IGCY15XhHxudYu3tRlt%2Fawt4SHEaDct0LNUQ%2B%2Fi2rHiCLVsL1Y8w%2BC9HpTtd2S6gxDLzfHK5dvBPc4TB6WDn%2BaRt9ljs4lSdlT0mn--qas9T0w68J9araMi--Se2WerqwQ6PlXV5xa2W9lw%3D%3D'
# Split each pair on the first '=' only, so '=' inside a value is preserved
cookies_dict = {cookie.split('=', 1)[0]: cookie.split('=', 1)[-1] for cookie in cookies_str.split('; ')}
url = 'https://github.com/kxxxx'
# Pass the dictionary through the dedicated cookies parameter
response = requests.get(url, headers=headers, cookies=cookies_dict)
print(response.content.decode())
Result: the decoded HTML of the response is printed to the console.