
How to Get the Pronunciation of English Words: Adding IPA-to-SAMPA Conversion

Date: 2023-09-11 14:15:20

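Abstract: To obtain and display the pronunciation of English words, eng_to_ipa is used to get a word's IPA transcription, and a small self-written conversion routine then converts the IPA to SAMPA for display.

Keywords: SAMPA, IPA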

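Word Pronunciation
Contents
Text-to-speech
wordnet
eng-to-ipa
Displaying IPA
Conversion program
gruut-ipa
SAMPA character table
Modifying the Programs
Ingredients
Modifying the small programs
Summary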

 

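§01 Word Pronunciation


  Learning English words works better when each word is paired with its pronunciation. In an earlier test of Chinese-English translation in Python, I used the Youdao online translator to add English lookup to the Python add-on programs of the TEASOFT software, but pronunciation hints are still missing. Below I look at a few helper tools to see whether this gap can be filled.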

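1.1 Text-to-speech

  The post "Python (9): Audio and Text Conversion" lists several speech modules:

  • pyttsx3: a package that speaks through the system's built-in speech engine; it does not generate an MP3 file.
  • win32com: speaks text through the speech engine built into Windows.

  The post "Updating pip3 and pyttsx3 text-to-speech" points out that pypiwin32 must be installed before pyttsx3.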

1.1.1 Installing the packages

(1) Install pypiwin32

python -m pip install pypiwin32

(2) Install pyttsx3

python -m pip install pyttsx3

1.1.2 Testing the packages

 Ⅰ. Chinese speech
import pyttsx3
engine = pyttsx3.init()
engine.say('人类真帅')
engine.runAndWait()

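  The phrase “人类真帅” ("Humans are really handsome") is spoken aloud.

 Ⅱ. English speech

engine.say('Windows is an OS.')
engine.runAndWait()

  The English sentence "Windows is an OS." is read aloud as well.

  However, the output has an obviously "robotic" tone and does not sound very natural.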

1.2 wordnet

  According to the post on the vocabulary package, vocabulary can retrieve information about a word. However, installing vocabulary failed with an error.

python -m pip install nltk

1.2.1 Testing wordnet

from nltk.corpus import wordnet

# Look up the WordNet synsets (sets of synonyms) for "car".
syns = wordnet.synsets('car')
print(syns)

  Test result: the test failed; it does not work.

1.3 eng-to-ipa

  The article "Convert English text into the Phonetics using Python" describes how to use eng-to-ipa to obtain the pronunciation of English words.

  This package converts English words to IPA (International Phonetic Alphabet).

  For details, see: English to IPA (eng_to_ipa)

1.3.1 Installing the package

python -m pip install eng-to-ipa

▲ Figure 1.3.1 python -m pip install eng-to-ipa

1.3.2 Testing the package

import eng_to_ipa as p

# Convert the sentence to IPA.
ipa = p.convert("Hello Geeks.")

# Write the result as UTF-8 so the IPA characters survive; view the file in an editor.
filename = r'd:\temp\1.txt'
with open(filename, 'w', encoding='utf-8') as f:
    f.write(ipa)

print('\a')  # ring the terminal bell to signal completion

▲ Figure 1.3.2 Display of the phonetic symbols

ipa = p.convert("Using ipa-list() instead of convert()")

▲ Figure 1.3.3 Phonetic symbols displayed

ipa = p.ipa_list("Yes I am geeks, How are you.")

▲ Figure 1.3.4 Phonetic symbols displayed

1.3.3 get_rhymes

py = p.get_rhymes('test')
print(py)
['abreast', 'acquiesced', 'addressed', 'addwest', 'arrest', 'assessed', 'attest', 'behest', 'bequest', 'best', 'beste', 'blessed', 'blest', 'breast', 'brest', 'bud-test', "c'est", 'caressed', 'celeste', 'charest', 'chest', 'chrest', 'coalesced', 'compressed', 'confessed', 'congest', 'contest', 'crest', "d'allest", 'depressed', 'dest', 'detest', 'digest', 'digressed', 'dispossessed', 'distressed', 'divest', 'dressed', 'eastern-west', 'est', 'expressed', 'farwest', 'fessed', 'fest', 'finessed', 'gest', 'guessed', 'guest', 'impressed', 'indigest', 'infest', 'ingest', 'intrawest', 'invest', 'jest', 'key_west', 'lest', 'messed', 'mest', 'midwest', 'molest', 'natwest', 'nest', 'neste', 'northwest', 'norwest', 'obsessed', 'oppressed', 'penwest', 'pest', 'possessed', 'pressed', 'prest', 'professed', 'progressed', 'protest', 'quest', 'rearrest', 'reassessed', 'recessed', 'reinvest', 'repossessed', 'repressed', 'request', 'rest', 'retest', 'self-professed', 'southwest', 'stateswest', 'stressed', 'suggest', 'suppressed', 'sylvest', 'telequest', 'telewest', 'transgressed', 'trest', 'unaddressed', 'undressed', 'unimpressed', 'unrest', 'vest', 'west', 'wrest', 'yest', 'yoest', 'zest']

1.3.4 How to display IPA?

  An awkward problem: the text returned by eng_to_ipa contains many characters that cannot be displayed directly. How can it be converted to ASCII?
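  To see which characters cause the trouble, it helps to list them. The following is a minimal sketch using only the standard library; the sample string is hand-typed to resemble a transcription of "instead" (an assumption, not captured eng_to_ipa output), and `non_ascii_chars` is a hypothetical helper name.

```python
import unicodedata

def non_ascii_chars(text):
    """Return (char, code point, Unicode name) for each distinct non-ASCII character."""
    seen = []
    for ch in text:
        if ord(ch) >= 0x80 and ch not in seen:
            seen.append(ch)
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in seen]

# Hand-typed sample transcription (assumed, not captured from eng_to_ipa).
sample = "ˌɪnˈstɛd"

for ch, code, name in non_ascii_chars(sample):
    print(ch, code, name)
```

  Every character listed this way needs either an ASCII replacement or has to be dropped before the text can be shown on an ASCII-only display.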

  ipapy 0.0.9.0 is a Python package for working with IPA.

(1) Install ipapy

python -m pip install ipapy

  I tested this package, but it did not seem able to perform the conversion.

 

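§02 Displaying IPA


  TEASOFT currently cannot display IPA characters. How can IPA be converted to plain ASCII?

  The question "IPA to plain simple English translator" raises exactly this issue, namely converting IPA into the notation that the American Heritage Dictionary uses.

2.1 Conversion program

  ipa_converter provides a program that converts IPA to SAMPA, a computer-readable phonetic alphabet.

2.1.1 Conversion code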

import eng_to_ipa as p
import string

#------------------------------------------------------------
# Build the IPA -> SAMPA lookup table from sampa.cfg.
sampafile = r'D:\Temp\sampa.cfg'
table = {}
with open(sampafile, 'r', encoding='utf8') as f:
    for line in f:
        line = line.strip()
        if line == '': continue
        row = line.split()
        sampa_symb = row[0]   # column 1: SAMPA symbol
        ipa_symb = row[1]     # column 2: IPA symbol
        table[ipa_symb] = sampa_symb

#------------------------------------------------------------
ipa = p.convert("Yes I am geeks, How are you.")

# Replace known IPA symbols; drop any other non-printable-ASCII character.
out = []
for c in ipa:
    if c in table: c= table[c]
    elif c not in string.printable: c = ''

    out.append(c)

print(''.join(out))

2.1.2 Test results

ipa = p.convert("Yes I am geeks, How are you.")
jEs aI {m giks, haU @r ju.
ipa = p.convert("instead")
%In"stEd
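  For a quick check, the results above can be reproduced without the config file. The sketch below hard-codes a handful of rows from sampa.cfg (only those needed for the two test phrases) and uses `str.maketrans`/`str.translate` in place of the explicit loop. The input strings are hand-typed IPA (an assumption about what eng_to_ipa returns), and `SAMPA_ROWS` and `ipa_to_sampa` are hypothetical names.

```python
import string

# A few IPA -> SAMPA pairs copied from the sampa.cfg table (not the full table).
SAMPA_ROWS = {
    "ɛ": "E", "ɪ": "I", "æ": "{", "ʊ": "U", "ə": "@",
    "ˈ": '"', "ˌ": "%",
}

def ipa_to_sampa(ipa, rows=SAMPA_ROWS):
    """Map known IPA symbols to SAMPA, then drop anything that is not printable ASCII."""
    mapped = ipa.translate(str.maketrans(rows))
    return "".join(c for c in mapped if c in string.printable)

# Hand-typed IPA strings, assumed to match the library output for the two tests.
print(ipa_to_sampa("jɛs aɪ æm giks, haʊ ər ju."))
print(ipa_to_sampa("ˌɪnˈstɛd"))
```

  Symbols missing from the table are silently discarded, just as in the loop version, so an incomplete table shortens the output instead of raising an error.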

2.2 gruut-ipa

  Install the package from Gruut IPA.

▲ Figure 2.1.1 SAMPA correspondence table

2.2.1 Installing the package

python -m pip install gruut-ipa

IPA     eSpeak  Sampa   Description
i       i       i       close front unrounded vowel
y       y       y       close front rounded vowel
ɨ       i       1       close central unrounded vowel
ʉ       u       }       close central rounded vowel
ɯ       u-      M       close back unrounded vowel
u       u       u       close back rounded vowel
ɪ       I       I       near-close near-front unrounded vowel
ʏ       I.      Y       near-close near-front rounded vowel
ʊ       U       U       near-close near-back rounded vowel
e       e       e       close-mid front unrounded vowel
ø       Y       2       close-mid front rounded vowel
ɘ       @       @       close-mid central unrounded vowel
ɵ       @.      8       close-mid central rounded vowel
ɤ       o-      7       close-mid back unrounded vowel
o       o       o       close-mid back rounded vowel
ɛ       E       E       open-mid front unrounded vowel
œ       W       9       open-mid front rounded vowel
ɜ       V       3       open-mid central unrounded vowel
ɞ       O       3       open-mid central rounded vowel
ʌ       V       V       open-mid back unrounded vowel
ɔ       O       O       open-mid back rounded vowel
æ       a       {       near-open front unrounded vowel
ɐ       V       6       near-open central unrounded vowel
a       a       a       open front unrounded vowel
ɶ       W       &       open front rounded vowel
ɑ       A       A       open back unrounded vowel
ɒ       A.      Q       open back rounded vowel
m       m       m       voiced bilabial nasal
ɱ       M       F       voiced labio-dental nasal
n       n       n       voiced alveolar nasal
ɳ       n.      n`      voiced retroflex nasal
ŋ       N       N       voiced velar nasal
ɴ       n       N       voiced uvular nasal
p       p       p       voiceless bilabial plosive
b       b       b       voiced bilabial plosive
t       t       t       voiceless alveolar plosive
d       d       d       voiced alveolar plosive
ʈ       t.      t`      voiceless retroflex plosive
ɖ       d.      d`      voiced retroflex plosive
c       c       c       voiceless palatal plosive
ɟ       J       J       voiced palatal plosive
k       k       k       voiceless velar plosive
ɡ       g       g       voiced velar plosive
g       g       g       voiced velar plosive
q       q       q       voiceless uvular plosive
ɢ       G       G       voiced uvular plosive
ʡ               >       voiceless pharyngeal plosive
ʔ       ?       ?       voiceless glottal plosive
p͡f      pf      pf      voiceless labio-dental affricate
b͡v      bv      bv      voiced dental affricate
t̪͡s      ts      t ds    voiceless dental affricate
t͡s      ts      ts      voiceless alveolar affricate
d͡z      dz      dz      voiced alveolar affricate
t͡ʃ      tS      tS      voiceless post-alveolar affricate
d͡ʒ      dZ      dZ      voiced post-alveolar affricate
ʈ͡ʂ      tS      ts`     voiceless retroflex affricate
ɖ͡ʐ      dz      dz`     voiced retroflex affricate
t͡ɕ      tS;     ts      voiceless palatal affricate
d͡ʑ      dZ;     dz      voiced palatal affricate
k͡x      k       k x     voiceless velar affricate
ɸ       F       p       voiceless bilabial fricative
β       B       B       voiced bilabial fricative
f       f       f       voiceless labio-dental fricative
v       v       v       voiced labio-dental fricative
θ       T       T       voiceless dental fricative
ð       D       D       voiced dental fricative
s       s       s       voiceless alveolar fricative
z       z       z       voiced alveolar fricative
ʃ       S       S       voiceless post-alveolar fricative
ʒ       Z       Z       voiced post-alveolar fricative
ʂ       s.      s`      voiceless retroflex fricative
ʐ       z.      z`      voiced palatal fricative
ç       C       C       voiceless palatal fricative
x       x       x       voiceless velar fricative
ɣ       Q       G       voiced velar fricative
χ       X       X       voiceless uvular fricative
ʁ       g       R       voiced uvular fricative
ħ       H       X       voiceless pharyngeal fricative
h       h       h       voiceless glottal fricative
ɦ       h<?>    h       voiced glottal fricative
w       w       w       voiced bilabial approximant
ʋ       v#      v       voiced labio-dental approximant
ɹ       r       r       voiced alveolar approximant
ɻ       r.      r `     voiced retroflex approximant
j       j       j       voiced palatal approximant
ɰ       Q       M       voiced velar approximant
                        voiced labio-dental flap
ɾ       *       4       voiced alveolar flap
ɽ       *.      r`      voiced retroflex flap
ʙ       b       B       voiced bilabial trill
r       r       r       voiced alveolar trill
ʀ       r       R       voiced uvular trill
l       l       l       voiced alveolar lateral-approximant
ɫ       l       5       voiced alveolar lateral-approximant
ɭ       l.      l`      voiced retroflex lateral-approximant
ʎ       l^      L       voiced palatal lateral-approximant
ʟ       L       L       voiced velar lateral-approximant
ə       @       @       schwa
ɚ       3       @`      r-coloured schwa
ɝ       3       @`      r-coloured schwa
ɹ̩       r-      r ̩      voiced alveolar approximant

  This package appears to be usable only from the command line.

2.3 SAMPA character table

  The table below was copied from sampa.cfg.

A       ɑ       script a        open back unrounded, Cardinal 5, Eng. start
{       æ       ae ligature     near-open front unrounded, Eng. trap
6       ɐ       turned a        open schwa, Ger. besser
Q       ɒ       turned script a open back rounded, Eng. lot
E       ɛ       epsilon         open-mid front unrounded, C3, Fr. même
@       ə       turned e        schwa, Eng. banana
3       ɜ       rev. epsilon    long mid central, Eng. nurse
I       ɪ       small cap I     lax close front unrounded, Eng. kit
O       ɔ       turned c        open-mid back rounded, Eng. thought
2       ø       o-slash         close-mid front rounded, Fr. deux
9       œ       oe ligature     open-mid front rounded, Fr. neuf
&       ɶ       s.c. OE lig.    open front rounded
U       ʊ       upsilon         lax close back rounded, Eng. foot
}       ʉ       barred u        close central rounded, Swedish sju
V       ʌ       turned v        open-mid back unrounded, Eng. strut
Y       ʏ       small cap Y     lax [y], Ger. hübsch
B       β       beta            voiced bilabial fricative, Sp. cabo
C       ç       c-cedilla       voiceless palatal fricative, Ger. ich
D       ð       eth             voiced dental fricative, Eng. then
G       ɣ       gamma           voiced velar fricative, Sp. fuego
L       ʎ       turned y        palatal lateral, It. famiglia
J       ɲ       left-tail n     palatal nasal, Sp. año
N       ŋ       eng             velar nasal, Eng. thing
R       ʁ       inv. s.c. R     vd. uvular fric. or trill, Fr. roi
S       ʃ       esh             voiceless palatoalveolar fricative, Eng. ship
T       θ       theta           voiceless dental fricative, Eng. thin
H       ɥ       turned h        labial-palatal semivowel, Fr. huit
Z       ʒ       ezh (yogh)      vd. palatoalveolar fric., Eng. measure
?       ʔ       dotless ?       glottal stop, Ger. Verein, also Danish stød
:       ː       length mark     length mark
"       ˈ       vertical stroke primary stress *
%       ˌ       low vert. str.  secondary stress
s'      ʂ       Added for Russian support (ш)
s\      ɕ       Added for Russian support (щ)
s'      ʐ       Added for Russian support (ж)
z\      ʑ       Added for Russian support (ж)
1       ɨ       Added for Russian support (и, sometimes ы)
8       ɵ       Added for Russian support (ё)
_j      ʲ       Added for Russian support (ь)

  Paste the text above into Notepad and save it with UTF-8 encoding.

#!/usr/bin/env python3
# Python 3 version: str is already Unicode, so no .decode() calls are needed.

from string import printable

if __name__ == '__main__':
    from argparse import ArgumentParser
    parser = ArgumentParser(description='Converts IPA input to ASCII output. Does not handle certain cases: nasals, tones, syllabic consonants, and non-SAMPA representable phonetic symbols. Writes to stdout.')
    parser.add_argument('-c','--config',help='Name of the configuration table. Defaults to "sampa.cfg" in the working directory.',default='sampa.cfg')
    parser.add_argument('source',help='IPA symbols to convert.')
    args = parser.parse_args()
    source = args.source
    print(source)

    # Build the IPA -> SAMPA lookup table from the UTF-8 config file.
    table = {}
    with open(args.config, 'r', encoding='utf-8') as config:
        for line in config:
            line = line.strip()
            if line == '': continue
            row = line.split()
            sampa_symb = row[0]
            ipa_symb = row[1]
            table[ipa_symb] = sampa_symb

    # Replace known IPA symbols; drop any other non-printable-ASCII character.
    out = []
    for c in source:
        if c in table: c = table[c]
        elif c not in printable: c = ''
        out.append(c)
    print(''.join(out))
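  Because the lookup runs from IPA to SAMPA, two IPA symbols can share one SAMPA code without causing any error, but the output then cannot be mapped back unambiguously. A small check such as the following sketch can flag these cases. It parses rows in the sampa.cfg layout from an in-memory string (the four rows are copied from the table in 2.3); `parse_rows` and `shared_codes` are hypothetical helper names.

```python
from collections import defaultdict

# Four rows in the sampa.cfg layout, copied from the table in 2.3.
SAMPLE_CFG = r"""
S       ʃ       esh             voiceless palatoalveolar fricative, Eng. ship
s'      ʂ       Added for Russian support (ш)
s\      ɕ       Added for Russian support (щ)
s'      ʐ       Added for Russian support (ж)
"""

def parse_rows(text):
    """Yield (sampa, ipa) pairs from lines whose first two fields are SAMPA and IPA."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            yield fields[0], fields[1]

def shared_codes(pairs):
    """Return {sampa: [ipa, ...]} for SAMPA codes assigned to more than one IPA symbol."""
    by_code = defaultdict(list)
    for sampa, ipa in pairs:
        if ipa not in by_code[sampa]:
            by_code[sampa].append(ipa)
    return {code: ipas for code, ipas in by_code.items() if len(ipas) > 1}

print(shared_codes(parse_rows(SAMPLE_CFG)))
```

  Running the same check over the full file shows whether any other codes are reused.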

 

§03 Modifying the Programs


3.1 Ingredients

  • sampa.cfg: the IPA-to-SAMPA mapping table, a UTF-8 encoded text file.

  • Code:

import string

sampafile = r'D:\Python\Cmd\sampa.cfg'
table = {}
with open(sampafile, 'r', encoding='utf8') as f:
    for line in f:
        line = line.strip()
        if line == '': continue
        row = line.split()
        sampa_symb = row[0]
        ipa_symb = row[1]
        table[ipa_symb] = sampa_symb

def ipa2sampa(ipa):
    out = []
    for c in ipa:
        if c in table: c = table[c]
        elif c not in string.printable: c = ''

        out.append(c)

    return ''.join(out)

3.2 Modifying the small programs

  The small programs to be modified are mainly:

  • cal
  • cdtm

3.2.1 Adding IPA-SAMPA

#------------------------------------------------------------
import eng_to_ipa as ipa
import string

sampafile = r'D:\Python\Cmd\sampa.cfg'
table = {}
with open(sampafile, 'r', encoding='utf8') as f:
    for line in f:
        line = line.strip()
        if line == '': continue
        row = line.split()
        sampa_symb = row[0]
        ipa_symb = row[1]
        table[ipa_symb] = sampa_symb

def ipa2sampa(ipa):
    out = []
    for c in ipa:
        if c in table: c = table[c]
        elif c not in string.printable: c = ''

        out.append(c)

    return ''.join(out)

#------------------------------------------------------------
import json
import requests

def translate(word):
    url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule&smartresult=ugc&sessionFrom=null'
    key = {
        'type': "AUTO",
        'i': word,
        "doctype": "json",
        "version": "2.1",
        "keyfrom": "fanyi.web",
        "ue": "UTF-8",
        "action": "FY_BY_CLICKBUTTON",
        "typoResult": "true"
    }
    response = requests.post(url, data=key)
    if response.status_code == 200:
        return response.text
    else:
        return None

def get_reuslt(repsonse):
    result = json.loads(repsonse)

    origin = result['translateResult'][0][0]['src']
    target = result['translateResult'][0][0]['tgt']

    originflag = 1
    targetflag = 1

    for c in origin:
        if ord(c) >= 0x80:
            originflag = 0
            break

    for c in target:
        if ord(c) >= 0x80:
            targetflag = 0
            break

    originsampa = ''
    targetsampa = ''

    if originflag > 0:
        originsampa = "/%s/"%ipa2sampa(ipa.convert(origin))
        for c in originsampa:
            if c not in string.printable:
                originsampa = ''
                break

    if targetflag > 0:
        targetsampa = "/%s/"%ipa2sampa(ipa.convert(target))
        for c in targetsampa:
            if c not in string.printable:
                targetsampa = ''
                break

    # Print "source/SAMPA/ --> target/SAMPA/"; the SAMPA parts are empty for non-English text.
    print("%s%s --> %s%s" %(result['translateResult'][0][0]['src'], originsampa,
                            result['translateResult'][0][0]['tgt'], targetsampa))
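  The ASCII checks and the final formatting in get_reuslt can be exercised without calling the translation service. The sketch below is an offline variant under stated assumptions: the JSON string only mimics the `translateResult[0][0]` shape read by the code above (it is not a captured Youdao response), `fake_sampa` stands in for `ipa2sampa(ipa.convert(...))`, and `format_result` is a hypothetical name. It uses `str.isascii()` (Python 3.7+) in place of the `ord(c) >= 0x80` loops, which is a slightly looser test than membership in `string.printable`.

```python
import json

def format_result(response_text, to_sampa):
    """Build 'src/sampa/ --> tgt/sampa/', adding SAMPA only to all-ASCII sides."""
    entry = json.loads(response_text)['translateResult'][0][0]

    def annotate(text):
        if not text.isascii():
            return text  # e.g. Chinese text: no transcription is attempted
        sampa = to_sampa(text)
        # Skip the annotation if the transcription still contains non-ASCII characters.
        return f"{text}/{sampa}/" if sampa.isascii() else text

    return f"{annotate(entry['src'])} --> {annotate(entry['tgt'])}"

# Stand-in transcriber with one hard-coded entry (hypothetical, for illustration only).
def fake_sampa(text):
    return {'append': '@"pEnd'}.get(text, text)

# Hand-written JSON mimicking the response shape used above.
sample = json.dumps({'translateResult': [[{'src': 'append', 'tgt': '附加'}]]})
print(format_result(sample, fake_sampa))
```

  Passing the transcriber in as a parameter keeps the formatting logic testable on its own; in the real program the argument would be a function that wraps eng_to_ipa and ipa2sampa.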

3.2.2 Run results

靠谱 --> By spectrum/baI "spEktr@m/
command/k@"m{nd/ --> 命令
python --> python
input/"In%pUt/ --> 输入
python --> python
cmd/cmd*/ --> cmd/cmd*/

append/@"pEnd/ --> 附加

input/"In%pUt/ --> 输入
python --> python
command/k@"m{nd/ --> 命令

命令 --> The command/D@ k@"m{nd/
事情 --> things/TINz/
input/"In%pUt/ --> 输入

backup/"b{%k@p/ --> 备份

 

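  ※ Summary ※


  To obtain and display the pronunciation of English words, eng_to_ipa is used to get a word's IPA transcription, and a small self-written conversion routine then converts the IPA to SAMPA for display.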

