25 Web Crawler - re Regular Expressions: the finditer Method

Published: 2023-09-11 14:15:43

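The finditer method behaves much like findall: it searches the entire string and finds every non-overlapping match. The difference is the return value. Instead of a list, finditer returns an iterator that yields each match in order as a Match object.

Consider an example: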

import re

pattern = re.compile(r'\d+')

# Search the whole string.
result_iter1 = pattern.finditer('hello 123456 789')
# Search only from pos=0 up to (but not including) endpos=10.
result_iter2 = pattern.finditer('one1two2three3four4', 0, 10)

print(type(result_iter1))
print(type(result_iter2))

print('result1...')
for m1 in result_iter1:   # m1 is a Match object
    print('matching string: {}, position: {}'.format(m1.group(), m1.span()))

print('result2...')
for m2 in result_iter2:   # m2 is a Match object
    print('matching string: {}, position: {}'.format(m2.group(), m2.span()))

Output (Python 3):

<class 'callable_iterator'>
<class 'callable_iterator'>
result1...
matching string: 123456, position: (6, 12)
matching string: 789, position: (13, 16)
result2...
matching string: 1, position: (3, 4)
matching string: 2, position: (7, 8)
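
To make the contrast with findall concrete, here is a minimal Python 3 sketch. The sample text, the named groups `key` and `value`, and the variable names are all made up for illustration and do not come from the example above. It shows that findall keeps only the captured strings, while the Match objects from finditer also carry named groups and positions.

```python
import re

# Hypothetical sample text and pattern, chosen only for illustration.
text = 'width=640 height=480 depth=24'
pattern = re.compile(r'(?P<key>[a-z]+)=(?P<value>\d+)')

# findall returns a list of tuples of captured groups; offsets are not kept.
pairs = pattern.findall(text)

# finditer yields Match objects one at a time, so named groups and
# spans remain available for each match.
settings = {}
spans = []
for m in pattern.finditer(text):
    settings[m.group('key')] = int(m.group('value'))
    spans.append(m.span())

# pos and endpos restrict the search window; the reported span is still
# relative to the start of the full string.
window = [(m.group(), m.span()) for m in pattern.finditer(text, 10, 20)]

print(pairs)     # [('width', '640'), ('height', '480'), ('depth', '24')]
print(settings)  # {'width': 640, 'height': 480, 'depth': 24}
print(spans)     # [(0, 9), (10, 20), (21, 29)]
print(window)    # [('height=480', (10, 20))]
```

Because the iterator produces matches on demand, finditer is usually the better fit when a crawler scans large pages and needs match positions or only the first few results, whereas findall is convenient when a plain list of strings is enough.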
