MFCC feature extraction: Python implementation and walkthrough
Time: 2023-09-11 14:17:11
#!/usr/bin/python
# -*- coding: UTF-8 -*-

import numpy
import scipy.io.wavfile
from matplotlib import pyplot as plt
from scipy.fftpack import dct

sample_rate, signal = scipy.io.wavfile.read('stop.wav')
if signal.ndim > 1:
    signal = signal[:, 0]  # keep only the first channel of a multi-channel file

print(sample_rate, len(signal))
# Keep only the first 3.5 s of audio
signal = signal[0:int(3.5 * sample_rate)]
print(signal)

# Pre-emphasis: y[t] = x[t] - 0.97 * x[t-1], which boosts the high frequencies
pre_emphasis = 0.97
emphasized_signal = numpy.append(signal[0], signal[1:] - pre_emphasis * signal[:-1])

# Framing: split the signal into short frames
frame_size = 0.025   # 25 ms window
frame_stride = 0.01  # 10 ms hop; a hop larger than the window would skip samples between frames
frame_length = int(round(frame_size * sample_rate))
frame_step = int(round(frame_stride * sample_rate))
signal_length = len(emphasized_signal)
num_frames = int(numpy.ceil(float(numpy.abs(signal_length - frame_length)) / frame_step))

# Zero-pad so that every frame has exactly frame_length samples
pad_signal_length = num_frames * frame_step + frame_length
z = numpy.zeros((pad_signal_length - signal_length))
pad_signal = numpy.append(emphasized_signal, z)

# Row i of indices holds the sample positions of frame i
indices = numpy.tile(numpy.arange(0, frame_length), (num_frames, 1)) + numpy.tile(numpy.arange(0, num_frames * frame_step, frame_step), (frame_length, 1)).T
frames = pad_signal[indices.astype(numpy.int32, copy=False)]

# Apply a Hamming window to each frame
frames *= numpy.hamming(frame_length)
# Explicit form: w[n] = 0.54 - 0.46 * cos(2 * pi * n / (frame_length - 1)), n = 0 .. frame_length - 1

# Fourier transform and power spectrum
NFFT = 512
mag_frames = numpy.absolute(numpy.fft.rfft(frames, NFFT))  # Magnitude of the FFT
# print(mag_frames.shape)
pow_frames = (1.0 / NFFT) * (mag_frames ** 2)  # Power spectrum

# Mel filter bank: nfilt triangular filters equally spaced on the Mel scale
low_freq_mel = 0
nfilt = 40
high_freq_mel = 2595 * numpy.log10(1 + (sample_rate / 2) / 700)  # Convert Hz to Mel
mel_points = numpy.linspace(low_freq_mel, high_freq_mel, nfilt + 2)  # Equally spaced in Mel scale
hz_points = 700 * (10 ** (mel_points / 2595) - 1)  # Convert Mel to Hz
bins = numpy.floor((NFFT + 1) * hz_points / sample_rate)  # FFT bin of each filter edge

fbank = numpy.zeros((nfilt, int(numpy.floor(NFFT / 2 + 1))))
for m in range(1, nfilt + 1):
    f_m_minus = int(bins[m - 1])  # left
    f_m = int(bins[m])            # center
    f_m_plus = int(bins[m + 1])   # right
    for k in range(f_m_minus, f_m):
        fbank[m - 1, k] = (k - bins[m - 1]) / (bins[m] - bins[m - 1])
    for k in range(f_m, f_m_plus):
        fbank[m - 1, k] = (bins[m + 1] - k) / (bins[m + 1] - bins[m])

filter_banks = numpy.dot(pow_frames, fbank.T)
filter_banks = numpy.where(filter_banks == 0, numpy.finfo(float).eps, filter_banks)  # Numerical stability
filter_banks = 20 * numpy.log10(filter_banks)  # dB

# MFCCs: DCT of the log filter-bank energies, keeping coefficients 2 to 13
num_ceps = 12
mfcc = dct(filter_banks, type=2, axis=1, norm='ortho')[:, 1:(num_ceps + 1)]
(nframes, ncoeff) = mfcc.shape

# Sinusoidal liftering of the cepstral coefficients
n = numpy.arange(ncoeff)
cep_lifter = 22
lift = 1 + (cep_lifter / 2) * numpy.sin(numpy.pi * n / cep_lifter)
mfcc *= lift

# Mean normalization across frames
# filter_banks -= (numpy.mean(filter_banks, axis=0) + 1e-8)
mfcc -= (numpy.mean(mfcc, axis=0) + 1e-8)

print(mfcc.shape)
plt.plot(filter_banks)
plt.show()
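The Mel filter bank is the step of the script that is easiest to get wrong, so it can help to isolate it. The following is a minimal sketch, not part of the original post: the function names `hz_to_mel`, `mel_to_hz`, and `mel_filter_bank` are hypothetical, and the formulas are the same 2595 / 700 conversions used in the script above.

```python
import numpy


def hz_to_mel(hz):
    """Hz -> Mel, same formula as in the script (hypothetical helper)."""
    return 2595.0 * numpy.log10(1.0 + numpy.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    """Mel -> Hz, inverse of hz_to_mel (hypothetical helper)."""
    return 700.0 * (10.0 ** (numpy.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filter_bank(sample_rate, nfft=512, nfilt=40):
    """Build nfilt triangular filters, one row per filter, nfft // 2 + 1 columns."""
    # nfilt + 2 edge points, equally spaced on the Mel scale from 0 to Nyquist
    mel_points = numpy.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), nfilt + 2)
    # FFT bin index of every edge point
    bins = numpy.floor((nfft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)

    fbank = numpy.zeros((nfilt, nfft // 2 + 1))
    for m in range(1, nfilt + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):       # rising slope
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling slope
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank


fbank = mel_filter_bank(16000)  # 16 kHz is an assumed sample rate for illustration
print(fbank.shape)
```

With such a helper, the filter-bank energies in the script would reduce to `numpy.dot(pow_frames, mel_filter_bank(sample_rate, NFFT, nfilt).T)`.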
Test results: [figure not preserved: the plot produced by plt.plot(filter_banks)]
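Without `stop.wav` at hand, one way to sanity-check the framing and power-spectrum steps is to feed them a synthetic tone whose peak frequency is known. The sketch below is an illustration under stated assumptions, not the author's test: the helper name `power_spectrum_frames`, the 16 kHz sample rate, and the 440 Hz tone are all chosen here for the example. With a 512-point FFT the bin spacing is 16000 / 512 = 31.25 Hz, so the peak should land near bin 440 / 31.25 ≈ 14.

```python
import numpy


def power_spectrum_frames(signal, sample_rate, frame_size=0.025, frame_stride=0.01, nfft=512):
    """Frame, window, and return the per-frame power spectrum (hypothetical helper)."""
    frame_length = int(round(frame_size * sample_rate))
    frame_step = int(round(frame_stride * sample_rate))
    num_frames = int(numpy.ceil(abs(len(signal) - frame_length) / frame_step))
    # Zero-pad so that every frame is complete
    pad_length = num_frames * frame_step + frame_length
    padded = numpy.append(signal, numpy.zeros(pad_length - len(signal)))
    # Row i holds the sample indices of frame i
    indices = (numpy.arange(frame_length)[None, :]
               + frame_step * numpy.arange(num_frames)[:, None])
    frames = padded[indices] * numpy.hamming(frame_length)
    return (numpy.absolute(numpy.fft.rfft(frames, nfft)) ** 2) / nfft


sample_rate = 16000  # assumed sample rate for this synthetic check
t = numpy.arange(sample_rate) / sample_rate
tone = numpy.sin(2 * numpy.pi * 440.0 * t)  # 1 s of a 440 Hz sine

pow_frames = power_spectrum_frames(tone, sample_rate)
peak_bin = int(numpy.argmax(pow_frames.mean(axis=0)))  # strongest bin, averaged over frames
peak_hz = peak_bin * sample_rate / 512
print(pow_frames.shape, peak_bin, peak_hz)
```

If the reported peak frequency is far from 440 Hz, the framing indices or the FFT length are the first things to inspect.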