Python Web Scraping for Beginners: Looking Up Phone Number Locations on Baidu

Posted: 2023-06-13 09:15:04

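Enumerate a range of mobile phone numbers and search Baidu for each one to find the location where the number is registered.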
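The enumeration step can be sketched on its own. This is a minimal illustration, not the author's code: `candidate_numbers` is a hypothetical helper, and it assumes 11-digit mainland China mobile numbers, so the numeric suffix is zero-padded to fill the remaining digits.

```python
def candidate_numbers(prefix, start, stop):
    """Yield prefix + suffix for each suffix in [start, stop).

    The suffix is zero-padded so every result has 11 digits
    (assumption: mainland China mobile numbers).
    """
    width = 11 - len(prefix)
    for n in range(start, stop):
        yield f"{prefix}{n:0{width}d}"


numbers = list(candidate_numbers("13363460", 111, 119))
print(numbers[0], numbers[-1], len(numbers))
```

As with `range` itself, the upper bound is exclusive, so `111, 119` covers suffixes 111 through 118.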

Tools

requests: HTTP library
BeautifulSoup: HTML parsing library (the bs4 package)
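When an HTTP library is given a dictionary of query parameters, it URL-encodes them and appends them to the URL as a query string. The sketch below shows roughly the same result using only the standard library's `urllib.parse`. `build_search_url` is a hypothetical helper introduced for illustration; it is not part of requests.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.baidu.com/s"  # the search endpoint used in this article


def build_search_url(word):
    """Return the URL a GET request with params {'wd': word, 'ie': 'utf-8'} would target."""
    query = urlencode({"wd": word, "ie": "utf-8"})
    return f"{BASE_URL}?{query}"


print(build_search_url("13363460111"))
```

Printing the final URL like this is a quick way to check what is actually being requested before any parsing code is written.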

Code

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import requests
from bs4 import BeautifulSoup

headersPara = {    # Request headers that mimic a browser
    'Connection': 'Keep-Alive',
    'Accept': 'text/html, application/xhtml+xml, */*',
    'Accept-Language': 'en-US,en;q=0.8,zh-Hans-CN;q=0.5,zh-Hans;q=0.3',
    'Accept-Encoding': 'gzip, deflate',
    'User-Agent': 'Mozilla/6.1 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko'
}
url="https://www.baidu.com/s"
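# Open the output file as UTF-8 so the Chinese location text is written consistently
f=open('./phonenumber.txt','w',encoding='utf-8')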

for i in range(111,119):
    word="13363460"+str(i)
    print(i)
    f.write(word+' ')
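    params={
        'wd':word,
        'ie':'utf-8'
    }
    # Send a GET request, passing the optional params and headers arguments
    response =requests.get(url=url,params=params,headers=headersPara)
    response.encoding="utf-8"
    # Get the HTML of the result page
    html=response.text
    # Parse the HTML into a tree with the lxml parser (requires the lxml package) and keep it in soup
    soup=BeautifulSoup(html,'lxml')
    #results=soup.select('#main > div > div.result-right > div.c-border.op_fraudphone_container > div > div.c-span21.c-span-last > div.op_fraudphone_row')
    # select() takes a CSS selector: a leading . means a class name, and a space selects descendants
    results=soup.select('.c-gap-bottom-small span')
    # Check the length first: results[1] raises IndexError when fewer than two tags match
    if len(results)>1:
        #print(results[1].get_text())
        #print(results[1].get('href'))
        # Write the text inside the tag
        f.write(results[1].get_text())
    f.write('\n')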

f.close()
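The script writes the text of the second matched element (index 1) after each number. A result page with fewer than two matches makes that index invalid, so a length check is the safe guard. Here is a minimal sketch of that logic on plain strings. `format_line` is a hypothetical helper, and the sample texts are placeholders, not real Baidu output.

```python
def format_line(number, texts, index=1):
    """Build one output line: the number, a space, then texts[index] if it exists."""
    location = texts[index] if len(texts) > index else ""
    return f"{number} {location}\n"


# Placeholder strings standing in for the text of each matched tag
print(repr(format_line("13363460111", ["first span", "second span"])))
print(repr(format_line("13363460112", ["only one span"])))
```

Keeping the formatting in a small pure function like this makes the missing-match case easy to check without sending any requests.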

References

Liao Xuefeng's Python tutorial

Feel free to share your thoughts with me. If you repost this article, please credit the source: http://taowusheng.cn/
