python 并发爬虫的快感
2023-09-14 09:06:37 时间
import time
from tomorrow import threads
from requests_html import HTMLSession
session=HTMLSession()
@threads(50) # 使用装饰器,这个函数异步执行
def download(url):
return session.get(url)
def main():
start = time.time()
urls = [
'https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879','https://pypi.org/project/tomorrow/0.2.0/',
'https://www.cnblogs.com/pyld/p/4716744.html',
'http://www.xicidaili.com/nn/10',
'http://baidu.com',
'http://www.bubuko.com/infodetail-1028793.html?yyue=a21bo.50862.201879'
]
req_list=[]
for i in urls:
req_list.append(download(i))
print(req_list)
responses = [i.html.xpath("//title/text()") for i in req_list]
print(responses)
end = time.time()
print("Time: %f seconds" % (end - start))
if __name__ == "__main__":
main()
相关文章
- Python爬虫之多线程
- Python爬虫之BeautifulSoup
- Python入门系列(二)语法风格
- python教程:用简单的Python编写Web应用程序
- Python编程 print输出函数
- 【说站】python最短路径算法如何选择
- 【说站】python统计字符串字符出现次数
- 遗传算法的应用实例python实现_遗传算法Python解决一个问题
- 超实用!使用Python快速对比两个Excel表格之间的差异
- python里面的缩进是什么意思_Python缩进规则(一看即懂)[通俗易懂]
- python制作自动交易程序_Python如何实现自动化交易
- Python绘制旭日图_python绘制散点图
- postman汉化包_python模拟post请求
- 分享Python网络爬虫过程中编码和解码的一个库
- Python 爬虫进阶必备 | 某策网数据返回值 data 解密逻辑分析
- Python爬虫模拟登陆和异步爬虫
- Python爬虫学习:Cookie 和 Session 的区别是什么?
- Python 平台是独立的吗?
- Python升级tensorflow2.x版本相关问题:No module named ‘tensorflow.contrib‘ 问题解决
- Python爬虫:requests的headers该怎么填
- Python发送带附件的SMTP邮件代码详解编程语言
- Python爬虫笔记(一):爬虫基本入门详解编程语言
- 掌握Python访问MySQL的新技能(python访问mysql)
- 搞定!Linux下快速设置Python环境变量(linux设置python环境变量)
- python-pandas:StringMethods详解编程语言
- Python doctest模块:文档测试(超级详细)
- 运维学python之爬虫基础篇(三)urllib模块高级用法
- Linux中如何离开Python环境(linux怎么退出python)
- Python开发实例分享bt种子爬虫程序和种子解析