ICode9

精准搜索请尝试: 精确搜索
首页 > 编程语言> 文章详细

python采集快手美女内容,快来学会把你喜欢的作者内容全部下载吧~

2022-06-22 16:04:05  阅读:170  来源: 互联网

标签:__ String typename 快手 photo python json 内容 id



本篇代码提供者: 青灯教育-巳月老师


知识点:

  • 动态数据抓包
  • requests发送请求
  • json数据解析

开发环境:

运行代码

  • python 3.8

辅助敲代码

  • pycharm 2021.2

第三方模块

  • requests

如果安装python第三方模块:

  1. win + R 输入 cmd 点击确定, 输入安装命令 pip install 模块名 (pip install requests) 回车
  2. 在pycharm中点击Terminal(终端) 输入安装命令

如何配置pycharm里面的python解释器?

  1. 选择file(文件) >>> setting(设置) >>> Project(项目) >>> python interpreter(python解释器)

  2. 点击齿轮, 选择add

  3. 添加python安装路径


pycharm如何安装插件?

  1. 选择file(文件) >>> setting(设置) >>> Plugins(插件)

  2. 点击 Marketplace 输入想要安装的插件名字 比如:翻译插件 输入 translation / 汉化插件 输入 Chinese

  3. 选择相应的插件点击 install(安装) 即可

  4. 安装成功之后 是会弹出 重启pycharm的选项 点击确定, 重启即可生效


基本思路流程(通用)

  1. 发送请求
  2. 获取数据
  3. 解析数据
  4. 保存数据

代码

导入模块

import requests     # 发送请求 访问网站
import re

加入伪装

url = 'https://www.kuaishou.com/graphql'
# 伪装
headers = {
    'content-type': 'application/json',
    'Cookie': 'kpf=PC_WEB; kpn=KUAISHOU_VISION; clientid=3; did=web_ea128125517a46bd491ae9ccb255e242; client_key=65890b29; didv=1646739254078; _bl_uid=pCldq3L00L61qCzj6fytnk2wmhz5; userId=270932146; kuaishou.server.web_st=ChZrdWFpc2hvdS5zZXJ2ZXIud2ViLnN0EqABJBZGcI4Czt3EqaG90aWFm2EPPYcAQlkuV3ZOkHUBcqEtWV--udx6stXFOhEkGgx4tNCBS9Vhl-GstWLkvn-r_eV1072IPsO8d5sUcUuTJv3nicPWVBcfHW813ST2a4uN7HyHsnpnRkjx2BFXoMRmdSO4tbAgy3-3QRTaw05tiEp79qGRfQgVmK4kVJLOTF9X7o9vSjrLrTaRxEf0rwsXJhoSsmhEcimAl3NtJGybSc8y6sdlIiCkLEL3ZiZwp85TGjXIHaw7KGda_VNpfdZ1qigsOkLmESgFMAE; kuaishou.server.web_ph=ad8a744eb59aab3bf509625671ad16837e66',
    'Host': 'www.kuaishou.com',
    'Origin': 'https://www.kuaishou.com',
    'Referer': 'https://www.kuaishou.com/search/video?searchKey=%E5%8F%8C%E9%A9%AC%E5%B0%BE',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36',
}
# 传数据
for page in range(1, 10):
    json = {
        'operationName': "visionSearchPhoto",
        'query': "fragment photoContent on PhotoEntity {\n  id\n  duration\n  caption\n  likeCount\n  viewCount\n  realLikeCount\n  coverUrl\n  photoUrl\n  photoH265Url\n  manifest\n  manifestH265\n  videoResource\n  coverUrls {\n    url\n    __typename\n  }\n  timestamp\n  expTag\n  animatedCoverUrl\n  distance\n  videoRatio\n  liked\n  stereoType\n  profileUserTopPhoto\n  __typename\n}\n\nfragment feedContent on Feed {\n  type\n  author {\n    id\n    name\n    headerUrl\n    following\n    headerUrls {\n      url\n      __typename\n    }\n    __typename\n  }\n  photo {\n    ...photoContent\n    __typename\n  }\n  canAddComment\n  llsid\n  status\n  currentPcursor\n  __typename\n}\n\nquery visionSearchPhoto($keyword: String, $pcursor: String, $searchSessionId: String, $page: String, $webPageArea: String) {\n  visionSearchPhoto(keyword: $keyword, pcursor: $pcursor, searchSessionId: $searchSessionId, page: $page, webPageArea: $webPageArea) {\n    result\n    llsid\n    webPageArea\n    feeds {\n      ...feedContent\n      __typename\n    }\n    searchSessionId\n    pcursor\n    aladdinBanner {\n      imgUrl\n      link\n      __typename\n    }\n    __typename\n  }\n}\n",
        'variables': {'keyword': "双马尾", 'pcursor': str(page), 'page': "search", 'searchSessionId': "MTRfMjcwOTMyMTQ2XzE2NTU3MjcyNTU3NjFf5Y-M6ams5bC-XzUwMDQ"}
    }

1. 发送请求

    response = requests.post(url=url, headers=headers, json=json)

2. 获取数据

<Response [200]>: 请求成功

    feeds = response.json()['data']['visionSearchPhoto']['feeds']

    for feed in feeds:
        author_id = feed['author']['id']
        photo_id = feed['photo']['id']
        print(author_id, photo_id)
        caption = feed['photo']['caption']
        photoUrl = feed['photo']['photoUrl']
        print(caption, photoUrl)
        caption = re.sub('[\\/:"?<>|\\n]', '', caption)
        json_like = {
            'operationName': "visionVideoLike",
            'query': "mutation visionVideoLike($photoId: String, $photoAuthorId: String, $cancel: Int, $expTag: String) {\n  visionVideoLike(photoId: $photoId, photoAuthorId: $photoAuthorId, cancel: $cancel, expTag: $expTag) {\n    result\n    __typename\n  }\n}\n",
            'variables': {
                'cancel': 0,
                'expTag': "1_i/2005246154318902369_xpcwebsearchxxnull0",
                'photoAuthorId': author_id,
                'photoId': photo_id
            }
        }
        resp_ = requests.post(url=url, headers=headers, json=json_like)
        # video_data = requests.get(photoUrl).content
        # with open(f'video/{caption}.mp4', mode='wb') as f:
        #     f.write(video_data)

尾语

好了,我的这篇文章写到这里就结束啦!

有更多建议或问题可以评论区或私信我哦!一起加油努力叭(ง •_•)ง

喜欢就关注一下博主,或点赞收藏评论一下我的文章叭!!!

标签:__,String,typename,快手,photo,python,json,内容,id
来源: https://www.cnblogs.com/Qqun261823976/p/16398674.html

本站声明: 1. iCode9 技术分享网(下文简称本站)提供的所有内容,仅供技术学习、探讨和分享;
2. 关于本站的所有留言、评论、转载及引用,纯属内容发起人的个人观点,与本站观点和立场无关;
3. 关于本站的所有言论和文字,纯属内容发起人的个人观点,与本站观点和立场无关;
4. 本站文章均是网友提供,不完全保证技术分享内容的完整性、准确性、时效性、风险性和版权归属;如您发现该文章侵犯了您的权益,可联系我们第一时间进行删除;
5. 本站为非盈利性的个人网站,所有内容不会用来进行牟利,也不会利用任何形式的广告来间接获益,纯粹是为了广大技术爱好者提供技术内容和技术思想的分享性交流网站。

专注分享技术,共同学习,共同进步。侵权联系[81616952@qq.com]

Copyright (C)ICode9.com, All Rights Reserved.

ICode9版权所有