声振论坛

[Acoustics Basics] How do I call this function to extract MFCC parameters? (help text included)

Posted on 2013-1-15 12:15

[y, sr, nbits] = wavread( 'filename.wav' );
...
for i = 1 : framenumber
        % store frame i in vector x; the frame length is framelength
        % [call melfcc to extract the 12-dimensional feature vector of frame i (i.e. x)]
end
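
The framing step in the pseudocode above could be written out roughly as follows. This is only a sketch: `frameshift` (the distance between the starts of successive frames) is not given in the post and is a hypothetical value here, as are the frame length and the synthetic stand-in signal.

```matlab
% Hypothetical values for illustration; the post does not specify them.
y = (1:1000)';           % stand-in for the samples returned by wavread
framelength = 400;       % samples per frame
frameshift  = 160;       % samples between the starts of successive frames

% Number of complete frames that fit in the signal.
framenumber = floor((length(y) - framelength) / frameshift) + 1;

for i = 1 : framenumber
    first = (i - 1) * frameshift + 1;        % index of the first sample of frame i
    x = y(first : first + framelength - 1);  % frame i as a column vector
    % ... per-frame processing of x goes here ...
end
```

Setting `frameshift` equal to `framelength` gives non-overlapping frames; a smaller value makes neighbouring frames overlap.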
-------------------------------
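My problem is that I cannot make sense of the function help below, so I do not know how to use this ready-made melfcc function. As a result, I cannot write the call marked above as "[call melfcc to extract the 12-dimensional feature vector of frame i (i.e. x)]".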
-------------------------------
The function help is as follows:
>> help melfcc
[cepstra,aspectrum,pspectrum] = melfcc(samples, sr[, opts ...])
   Calculate Mel-frequency cepstral coefficients by:
    - take the absolute value of the STFT
    - warp to a Mel frequency scale
    - take the DCT of the log-Mel-spectrum
    - return the first <ncep> components
   This version allows a lot of options to be controlled, as optional
   'name', value pairs from the 3rd argument on: (defaults in parens)
     'wintime' (0.025): window length in sec
     'hoptime' (0.010): step between successive windows in sec
     'numcep'     (13): number of cepstra to return
     'lifterexp' (0.6): exponent for liftering; 0 = none; < 0 = HTK sin lifter
     'sumpower'    (1): 1 = sum abs(fft)^2; 0 = sum abs(fft)
     'preemph'  (0.97): apply pre-emphasis filter [1 -preemph] (0 = none)
     'dither'      (0): 1 = add offset to spectrum as if dither noise
     'minfreq'     (0): lowest band edge of mel filters (Hz)
     'maxfreq'  (4000): highest band edge of mel filters (Hz)
     'nbands'     (40): number of warped spectral bands to use
     'bwidth'    (1.0): width of aud spec filters relative to default
     'dcttype'     (2): type of DCT used - 1 or 2 (or 3 for HTK or 4 for feac)
     'fbtype'  ('mel'): frequency warp: 'mel','bark','htkmel','fcmel'
     'usecmp'      (0): apply equal-loudness weighting and cube-root compr.
     'modelorder'  (0): if > 0, fit a PLP model of this order
     'broaden'     (0): flag to retain the (useless?) first and last bands
     'useenergy'   (0): overwrite C0 with true log energy
  The following non-default values nearly duplicate Malcolm Slaney's mfcc
  (i.e. melfcc(d,16000,opts...) =~= log(10)*2*mfcc(d*(2^17),16000) )
        'wintime': 0.016
      'lifterexp': 0
        'minfreq': 133.33
        'maxfreq': 6855.6
       'sumpower': 0
  The following non-default values nearly duplicate HTK's MFCC
  (i.e. melfcc(d,16000,opts...) =~= 2*htkmelfcc(:,[13,[1:12]])'
   where HTK config has PREEMCOEF = 0.97, NUMCHANS = 20, CEPLIFTER = 22,
   NUMCEPS = 12, WINDOWSIZE = 250000.0, USEHAMMING = T, TARGETKIND = MFCC_0)
      'lifterexp': -22
         'nbands': 20
        'maxfreq': 8000
       'sumpower': 0
         'fbtype': 'htkmel'
        'dcttype': 3
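
Going by the help text above, melfcc appears to do the framing itself: 'wintime' and 'hoptime' set the window length and the step in seconds. If so, the per-frame loop is unnecessary, and a single call on the whole signal might look like the sketch below. It assumes that melfcc.m and its helper files are on the MATLAB path, that the output has one column per frame with C0 in the first row (the transpose and column reordering in the HTK comparison hint at this, but it is an assumption), and that `framelength` and `frameshift` are hypothetical sample counts rather than values from the post.

```matlab
% Assumptions: melfcc.m and its helpers are on the path, and 'filename.wav' exists.
% wavread was removed from later MATLAB releases; audioread is its replacement.
[y, sr, nbits] = wavread('filename.wav');
y = y(:, 1);                      % keep the first channel if the file is stereo

framelength = 400;                % hypothetical frame length in samples
frameshift  = 160;                % hypothetical step between frames in samples

% melfcc takes times in seconds, so convert the sample counts using sr.
cepstra = melfcc(y, sr, ...
    'wintime', framelength / sr, ...
    'hoptime', frameshift / sr, ...
    'numcep',  13);

% Assumed layout: rows are coefficients (C0 first), columns are frames.
mfcc12 = cepstra(2:13, :);        % drop C0, keep the next 12 coefficients
frame_i = mfcc12(:, 1);           % the 12-dimensional vector of frame 1
```

If C0 should be part of the 12 values instead, passing 'numcep', 12 and keeping all rows would be the alternative. The HTK-like or Slaney-like settings listed in the help can be appended to the same call as further 'name', value pairs, for example 'lifterexp', -22, 'nbands', 20.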


Posted on 2013-1-15 14:27
From what you describe, it looks like you want to extract some kind of parameters from a sound file, is that right?