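How can cnocr be used to recognize seven-segment digits?
The post "Two open-source Chinese OCR tools" introduced two OCR tools. One of them, **CNOCR**, is tested here; it may serve as a tool for future work.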
01 Installing cnocr
cnocr can be installed with pip:
pip install cnocr
Version 1.1.0 can also be installed explicitly with the following command:
pip install cnocr==1.1.0
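To confirm which version actually ended up in the environment, a small check such as the following can be used. This is only a sketch: the helper name `installed_version` is made up for illustration, and it relies on the standard-library `importlib.metadata` module (Python 3.8 or later).

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints None if cnocr has not been installed in this environment.
    print("cnocr version:", installed_version("cnocr"))
```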
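02 Initial experiments
1. Text captured from the screen
▲ A passage of text captured from the screen
- Recognition time: 1.98 s
- Recognition result: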
[[‘●’, ‘更’, ‘新’, ‘了’, ‘训’, ‘练’, ‘代’, ‘码’, ‘,’, ‘使’, ‘用’, ‘m’, ‘x’, ‘n’, ‘e’, ‘t’, ‘的’, ‘r’, ‘e’, ‘c’, ‘o’, ‘r’, ‘d’, ‘i’, ‘o’, ‘首’, ‘先’, ‘把’, ‘数’, ‘据’, ‘转’, ‘换’, ‘成’, ‘二’, ‘进’, ‘制’, ‘格’, ‘式’, ‘,’, ‘提’, ‘升’, ‘后’, ‘续’, ‘的’], [‘训’, ‘练’, ‘效’, ‘率’, ‘。’, ‘训’, ‘练’, ‘时’, ‘支’, ‘持’, ‘对’, ‘图’, ‘片’, ‘做’, ‘实’, ‘时’, ‘数’, ‘据’, ‘增’, ‘强’, ‘。’, ‘也’, ‘加’, ‘入’, ‘了’, ‘更’, ‘多’, ‘可’, ‘传’, ‘入’, ‘的’, ‘参’, ‘数’, ‘。’], [‘●’, ‘允’, ‘许’, ‘训’, ‘练’, ‘集’, ‘中’, ‘的’, ‘文’, ‘字’, ‘数’, ‘量’, ‘不’, ‘同’, ‘,’, ‘目’, ‘前’, ‘是’, ‘中’, ‘文’, ‘1’, ‘0’, ‘个’, ‘字’, ‘,’, ‘英’, ‘文’, ‘2’, ‘0’, ‘个’, ‘字’, ‘母’, ‘。’], [’。’, ‘提’, ‘供’, ‘了’, ‘更’, ‘多’, ‘的’, ‘模’, ‘型’, ‘选’, ‘择’, ‘,’, ‘允’, ‘许’, ‘大’, ‘家’, ‘按’, ‘需’, ‘训’, ‘练’, ‘多’, ‘种’, ‘不’, ‘同’, ‘大’, ‘小’, ‘的’, ‘识’, ‘别’, ‘模’, ‘型’, ‘。’], [‘●’, ’ ', ‘内’, ‘置’, ‘了’, ‘各’, ‘种’, ‘训’, ‘练’, ‘好’, ‘的’, ‘模’, ‘型’, ‘,’, ‘最’, ‘小’, ‘的’, ‘模’, ‘型’, ‘只’, ‘有’, ‘之’, ‘前’, ‘模’, ‘型’, ‘的’, ‘1’, ‘/’, ‘5’, ‘大’, ‘小’, ‘。’, ‘所’, ‘有’, ‘模’, ‘型’, ‘都’, ‘可’, ‘免’, ‘费’], [‘使’, ‘用’, ‘。’]]
2. English text captured from the screen
▲ English text captured from the screen
- Recognition time: 2.376136064529419 s
- Recognition result:
[[‘E’, ‘r’, ‘n’, ‘e’, ‘s’, ‘t’, ’ ', ‘R’, ‘u’, ‘t’, ‘h’, ‘e’, ‘r’, ‘f’, ‘o’, ‘r’, ‘d’, ‘,’, ‘i’, ‘n’, ’ ', ‘f’, ‘u’, ‘l’, ‘l’, ’ ', ‘E’, ‘r’, ‘n’, ‘e’, ‘s’, ‘t’, ’ ', ‘R’, ‘u’, ‘t’, ‘h’, ‘e’, ‘r’, ‘f’, ‘o’, ‘r’, ‘d’, ‘,’, ‘B’, ‘a’, ‘r’, ‘o’, ‘n’, ’ ', ‘R’, ‘u’, ‘t’, ‘h’, ‘e’, ‘r’, ‘f’, ‘o’, ‘r’, ‘d’, ’ ', ‘o’, ‘f’], [‘N’, ‘e’, ‘l’, ‘s’, ‘o’, ‘n’, ‘,’, ‘o’, ‘f’, ’ ', ‘C’, ‘a’, ‘m’, ‘b’, ‘r’, ‘i’, ‘d’, ‘g’, ‘e’, ‘,’, ‘(’, ‘b’, ‘o’, ‘r’, ‘n’, ’ ', ‘A’, ‘u’, ‘g’, ‘s’, ‘t’, ’ ', ‘3’, ‘0’, ‘,’, ‘1’, ‘8’, ‘7’, ‘1’, ‘,’, ‘S’, ‘p’, ‘r’, ‘i’, ‘n’, ‘g’, ’ ', ‘G’, ‘r’, ‘o’, ‘v’, ‘e’, ‘,’, ’ ', ‘N’, ‘e’, ‘w’, ’ ', ‘Z’, ‘e’, ‘a’, ‘l’, ‘a’, ‘n’, ‘d’, ‘-’], [‘d’, ‘i’, ‘e’, ‘d’, ’ ', ‘O’, ‘c’, ‘t’, ‘o’, ‘b’, ‘e’, ‘r’, ’ ', ‘1’, ‘9’, ‘,’, ‘1’, ‘9’, ‘3’, ‘7’, ‘,’, ‘C’, ‘a’, ‘m’, ‘b’, ‘r’, ‘i’, ‘d’, ‘g’, ‘e’, ‘,’, ‘C’, ‘a’, ‘m’, ‘b’, ‘r’, ‘i’, ‘d’, ‘g’, ‘e’, ‘s’, ‘h’, ‘i’, ‘r’, ‘e’, ‘,’, ‘E’, ‘n’, ‘g’, ‘l’, ‘a’, ‘n’, ‘d’, ‘)’, ‘,’, ‘N’, ‘e’, ‘w’, ’ ', ‘Z’, ‘e’, ‘a’, ‘l’, ‘a’, ‘n’, ‘d’, ‘-’], [‘b’, ‘o’, ‘r’, ‘n’, ’ ', ‘B’, ‘r’, ‘i’, ‘t’, ‘i’, ‘s’, ‘h’, ’ ', ‘p’, ‘h’, ‘y’, ‘s’, ‘i’, ‘c’, ‘i’, ‘s’, ‘t’, ’ ', ‘c’, ‘o’, ‘n’, ‘s’, ‘i’, ‘d’, ‘e’, ‘r’, ‘e’, ‘d’, ’ ', ‘t’, ‘h’, ‘e’, ’ ', ‘g’, ‘r’, ‘e’, ‘a’, ‘t’, ‘e’, ‘s’, ‘t’, ’ ', ‘e’, ‘x’, ‘p’, ‘e’, ‘r’, ‘i’, ‘m’, ‘e’, ‘n’, ‘t’, ‘a’, ‘l’, ‘i’, ‘s’, ‘t’, ’ ', ‘s’, ‘i’, ‘n’, ‘c’, ‘e’, ’ ', ‘M’, ‘i’, ‘c’, ‘h’, ‘a’, ‘e’, ‘l’], [‘F’, ‘a’, ‘r’, ‘a’, ‘d’, ‘a’, ‘y’, ’ ', ‘(’, ‘1’, ‘7’, ‘9’, ‘1’, ‘-’, ‘1’, ‘8’, ‘6’, ‘7’, ‘)’, ‘.’, ’ ', ‘R’, ‘u’, ‘t’, ‘h’, ‘e’, ‘r’, ‘f’, ‘o’, ‘r’, ‘d’, ’ ', ‘w’, ‘a’, ‘s’, ’ ', ‘t’, ‘h’, ‘e’, ’ ', ‘c’, ‘e’, ‘n’, ‘t’, ‘r’, ‘a’, ‘l’, ’ ', ‘f’, ‘i’, ‘g’, ‘u’, ‘r’, ‘e’, ’ ', ‘i’, ‘n’, ’ ', ‘t’, ‘h’, ‘e’, ’ ', ‘s’, ‘t’, ‘u’, ‘d’, ‘y’, ’ ', ‘o’, ‘f’], [‘r’, ‘a’, ‘d’, ‘i’, ‘o’, ‘a’, ‘c’, ‘t’, ‘i’, ‘v’, ‘i’, ‘t’, ‘y’, ‘,’, ‘a’, ‘n’, ‘d’, ’ ', ‘w’, ‘i’, ‘t’, ‘h’, ’ ', ‘h’, ‘i’, ‘s’, ’ ', ‘c’, ‘o’, ‘n’, ‘c’, ‘e’, ‘p’, ‘t’, ’ ', ‘o’, ‘f’, ’ ', ‘t’, ‘h’, ‘e’, ’ ', ‘n’, ‘u’, ‘c’, ‘l’, ‘e’, ‘a’, ‘r’, ’ ', ‘a’, ‘t’, ‘o’, ‘m’, ’ ', 
‘h’, ‘e’, ’ ', ‘l’, ‘e’, ‘d’, ’ ', ‘t’, ‘h’, ‘e’, ’ ', ‘e’, ‘x’, ‘p’, ‘l’, ‘o’, ‘r’, ‘a’, ‘t’, ‘i’, ‘o’, ‘n’, ’ ', ‘o’, ‘f’], [‘n’, ‘u’, ‘c’, ‘l’, ‘e’, ‘a’, ‘r’, ’ ', ‘p’, ‘h’, ‘y’, ‘s’, ‘i’, ‘c’, ‘s’, ‘.’, ‘H’, ‘e’, ’ ', ‘w’, ‘o’, ‘n’, ’ ', ‘t’, ‘h’, ‘e’, ’ ', ‘N’, ‘o’, ‘b’, ‘e’, ‘l’, ’ ', ‘P’, ‘r’, ‘i’, ‘z’, ‘e’, ’ ', ‘f’, ‘o’, ‘r’, ’ ', ‘C’, ‘h’, ‘e’, ‘m’, ‘i’, ‘s’, ‘t’, ‘r’, ‘y’, ’ ', ‘i’, ‘n’, ’ ', ‘1’, ‘9’, ‘0’, ‘8’, ‘,’, ‘w’, ‘a’, ‘s’, ’ ', ‘p’, ‘r’, ‘e’, ‘s’, ‘i’, ‘d’, ‘e’, ‘n’, ‘t’, ’ ', ‘o’, ‘f’], [‘t’, ‘h’, ‘e’, ’ ', ‘R’, ‘o’, ‘y’, ‘a’, ‘l’, ’ ', ‘S’, ‘o’, ‘c’, ‘i’, ‘e’, ‘t’, ‘y’, ’ ', ‘(’, ‘1’, ‘9’, ‘2’, ‘5’, ‘-’, ‘3’, ‘0’, ‘)’, ‘a’, ‘n’, ‘d’, ’ ', ‘t’, ‘h’, ‘e’, ’ ', ‘B’, ‘r’, ‘i’, ‘t’, ‘i’, ‘s’, ‘h’, ’ ', ‘A’, ‘s’, ‘s’, ‘o’, ‘c’, ‘i’, ‘a’, ‘t’, ‘i’, ‘o’, ‘n’, ’ ', ‘f’, ‘o’, ‘r’, ’ ', ‘t’, ‘h’, ‘e’, ’ ', ‘A’, ‘d’, ‘v’, ‘a’, ‘n’, ‘c’, ‘e’, ‘m’, ‘e’, ‘n’, ‘t’, ’ ', ‘o’, ‘f’], [‘S’, ‘c’, ‘i’, ‘e’, ‘n’, ‘c’, ‘e’, ’ ', ‘(’, ‘1’, ‘9’, ‘2’, ‘3’, ‘)’, ‘,’, ‘w’, ‘a’, ‘s’, ’ ', ‘c’, ‘o’, ‘n’, ‘f’, ‘e’, ‘r’, ‘r’, ‘e’, ‘d’, ’ ', ‘t’, ‘h’, ‘e’, ’ ', ‘O’, ‘r’, ‘d’, ‘e’, ‘r’, ’ ', ‘o’, ‘f’, ’ ', ‘M’, ‘e’, ‘r’, ‘i’, ‘t’, ’ ', ‘i’, ‘n’, ’ ', ‘1’, ‘9’, ‘2’, ‘5’, ‘,’, ‘a’, ‘n’, ‘d’, ’ ', ‘w’, ‘a’, ‘s’, ’ ', ‘r’, ‘a’, ‘i’, ‘s’, ‘e’, ‘d’, ’ ', ‘t’, ‘o’, ’ ', ‘t’, ‘h’, ‘e’], [‘p’, ‘e’, ‘e’, ‘r’, ‘a’, ‘g’, ‘e’, ’ ', ‘a’, ‘s’, ’ ', ‘L’, ‘o’, ‘r’, ‘d’, ’ ', ‘R’, ‘u’, ‘t’, ‘h’, ‘e’, ‘r’, ‘f’, ‘o’, ‘r’, ‘d’, ’ ', ‘o’, ‘f’, ’ ', ‘N’, ‘e’, ‘l’, ‘s’, ‘o’, ‘n’, ’ ', ‘i’, ‘n’, ’ ', ‘1’, ‘9’, ‘3’, ‘1’, ‘.’]]
Ernest Rutherford,in full Ernest Rutherford,Baron Rutherford of
Nelson,of Cambridge,(born Augst 30,1871,Spring Grove, New Zealand-
died October 19,1937,Cambridge,Cambridgeshire,England),New Zealand-
born British physicist considered the greatest experimentalist since Michael
Faraday (1791-1867). Rutherford was the central figure in the study of
radioactivity,and with his concept of the nuclear atom he led the exploration of
nuclear physics.He won the Nobel Prize for Chemistry in 1908,was president of
the Royal Society (1925-30)and the British Association for the Advancement of
Science (1923),was conferred the Order of Merit in 1925,and was raised to the
peerage as Lord Rutherford of Nelson in 1931.
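The readable text above corresponds to the nested character lists joined line by line. A minimal sketch of that step, assuming the result has the shape shown in the printed output (a list of lines, each a list of single-character strings); the function name `lines_to_text` is illustrative and not part of cnocr:

```python
def lines_to_text(res):
    """Join a list of per-line character lists into newline-separated text."""
    return "\n".join("".join(chars) for chars in res)


# Small sample in the same shape as the printed results.
sample = [["使", "用", "。"], ["A", "B", " ", "C"]]
print(lines_to_text(sample))
```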
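3. Recognizing a seven-segment display
(1) Seven-segment display
▲ The seven-segment characters used in the test
- Recognition time: 0.922 s
- Recognition result:
[['目', '囱', '巳', '曰', '臼'], ['S', '日', '囱', '日', ' ', '臼']]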
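A failure like this one can be flagged automatically by checking the recognized characters against the set a numeric display is expected to show. The sketch below is an assumption-laden illustration: the allowed character set and the name `unexpected_chars` are chosen here for the example and do not come from cnocr.

```python
# Characters assumed plausible on a numeric display (an assumption for this sketch).
ALLOWED = set("0123456789.-: ")


def unexpected_chars(res, allowed=ALLOWED):
    """Return every recognized character that is not in the allowed set."""
    return [ch for line in res for ch in line if ch not in allowed]


# The recognition result reported above for the seven-segment image.
result = [["目", "囱", "巳", "曰", "臼"], ["S", "日", "囱", "日", " ", "臼"]]
bad = unexpected_chars(result)
print(len(bad), "unexpected characters:", "".join(bad))
```

If the returned list is non-empty, the reading should not be trusted as a number.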
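(2) Content in a table
▲ Content of a character table
No result was output.
(3) Handwriting recognition
▲ Handwritten text
- Recognition time: 0.865 s
- Recognition result: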
手怎体文字
AB数寇
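4. Test program
#!/usr/local/bin/python
# -*- coding: gbk -*-
#============================================================
# TEST1.PY                     -- by Dr. ZhuoQing 2020-05-26
#
# Note: tspgetdopfile() and printf() are not built-ins; they
#       appear to come from the author's headm helper module.
#============================================================
from headm import *
from cnocr import CnOcr

imageid = 7                          # index of the image to test
file = tspgetdopfile(imageid)        # resolve the image to a file path
#img = mx.image.imread(file, 1)

ocr = CnOcr()                        # load the default recognition model
res = ocr.ocr(file)                  # one list of characters per text line
printf(res)
printf('\a')                         # bell character: signals completion
#------------------------------------------------------------
#        END OF FILE : TEST1.PY
#============================================================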
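The program above does not show how the recognition times were obtained. One possible way to measure them is sketched below; it is not necessarily the author's method. The names `timed` and `recognize` and the file name in the usage comment are hypothetical, while `CnOcr().ocr(path)` is the same call used in the test program.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def recognize(path):
    """Run cnocr on an image file; requires cnocr to be installed."""
    from cnocr import CnOcr  # imported lazily so timed() works without cnocr
    return CnOcr().ocr(path)


# Example with a hypothetical file name:
# res, seconds = timed(recognize, "screenshot.png")
# print(res, seconds)
```

Note that `recognize` creates the `CnOcr` object inside the timed call, so the measured time includes model loading. To time recognition alone, create the object once beforehand and pass its `ocr` method to `timed`.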
03 Further test results
1. Characters
▲ A short string of characters on a black background
- Recognition time: 0.774 s
- Recognition result:
–by Dr.ZhuoQing 2020-05-26
2. A full line of characters
▲ Black text
- Recognition time: 0.883 s
- Recognition result:
两款开源的中文OCR工具,简直碉堡了
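04 Conclusion
CNOCR recognizes printed English and Chinese text well in these tests. For seven-segment characters, however, it has clear limitations: the digits on the display were returned as Chinese characters such as 目 and 日, and the character table produced no output at all.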