How do I call this function to extract MFCC parameters? (help text included)
= wavread( 'filename.wav' );...
for i = 1 : framenumber
store frame i in vector x (frame length: framelength);
[call the melfcc function to extract the 12-dimensional feature parameters of frame i (i.e. x);]
end
-------------------------------
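My problem is that I can't make sense of the function description below, so I don't know how to use this ready-made melfcc function, and therefore can't write the call statement in the loop above: "[call the melfcc function to extract the 12-dimensional feature parameters of frame i (i.e. x);]"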
-------------------------------
The function's help text is as follows:
>> help melfcc
[cepstra,aspectrum,pspectrum] = melfcc(samples, sr[, opts ...])
Calculate Mel-frequency cepstral coefficients by:
- take the absolute value of the STFT
- warp to a Mel frequency scale
- take the DCT of the log-Mel-spectrum
- return the first <ncep> components
This version allows a lot of options to be controlled, as optional
'name', value pairs from the 3rd argument on: (defaults in parens)
'wintime' (0.025): window length in sec
'hoptime' (0.010): step between successive windows in sec
'numcep' (13): number of cepstra to return
'lifterexp' (0.6): exponent for liftering; 0 = none; < 0 = HTK sin lifter
'sumpower' (1): 1 = sum abs(fft)^2; 0 = sum abs(fft)
'preemph'(0.97): apply pre-emphasis filter (0 = none)
'dither' (0): 1 = add offset to spectrum as if dither noise
'minfreq' (0): lowest band edge of mel filters (Hz)
'maxfreq'(4000): highest band edge of mel filters (Hz)
'nbands' (40): number of warped spectral bands to use
'bwidth' (1.0): width of aud spec filters relative to default
'dcttype' (2): type of DCT used - 1 or 2 (or 3 for HTK or 4 for feac)
'fbtype'('mel'): frequency warp: 'mel','bark','htkmel','fcmel'
'usecmp' (0): apply equal-loudness weighting and cube-root compr.
'modelorder'(0): if > 0, fit a PLP model of this order
'broaden' (0): flag to retain the (useless?) first and last bands
'useenergy' (0): overwrite C0 with true log energy
The following non-default values nearly duplicate Malcolm Slaney's mfcc
(i.e. melfcc(d,16000,opts...) =~= log(10)*2*mfcc(d*(2^17),16000) )
'wintime': 0.016
'lifterexp': 0
'minfreq': 133.33
'maxfreq': 6855.6
'sumpower': 0
The following non-default values nearly duplicate HTK's MFCC
(i.e. melfcc(d,16000,opts...) =~= 2*htkmelfcc(:,[13,[1:12]])'
where HTK config has PREEMCOEF = 0.97, NUMCHANS = 20, CEPLIFTER = 22,
NUMCEPS = 12, WINDOWSIZE = 250000.0, USEHAMMING = T, TARGETKIND = MFCC_0)
'lifterexp': -22
'nbands': 20
'maxfreq': 8000
'sumpower': 0
'fbtype': 'htkmel'
'dcttype': 3
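Going by the help text above, options are passed as 'name', value pairs after the sample rate, and melfcc does its own framing ('wintime' and 'hoptime'), so it can be given the whole signal rather than one frame at a time. A minimal sketch, assuming melfcc.m and its helper files are on the MATLAB path; the variable names d, sr, cep and c_i are placeholders, not from the original post:

```matlab
% Read the whole file. wavread is what the original post uses; newer MATLAB
% releases provide audioread instead, called the same way for these two outputs.
[d, sr] = wavread('filename.wav');

% Assumption: keep only the first channel in case the file is stereo.
d = d(:, 1);

% Ask for 12 cepstra per frame; window (25 ms) and hop (10 ms) stay at
% their defaults. Options follow the sample rate as 'name', value pairs.
cep = melfcc(d, sr, 'numcep', 12);

% Each column of cep is one frame, each row one coefficient.
[ncoef, nframes] = size(cep);
fprintf('%d coefficients x %d frames\n', ncoef, nframes);

% Feature vector of frame i (12 x 1):
i = 1;
c_i = cep(:, i);

% The HTK-like settings listed in the help text are passed the same way.
% The help text gives them for 16 kHz audio, hence the guard.
if sr == 16000
    cep_htk = melfcc(d, sr, 'numcep', 12, 'lifterexp', -22, 'nbands', 20, ...
                     'maxfreq', 8000, 'sumpower', 0, ...
                     'fbtype', 'htkmel', 'dcttype', 3);
end
```

As far as I can tell from the 'useenergy' entry, the first returned coefficient is C0. If the 12 values wanted are C1 to C12 without C0, one option is to request 'numcep', 13 and drop the first row with cep(2:end, :).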
Judging by the original post, you want to extract some kind of parameters from an audio file, right?
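If the frame-by-frame loop from the question has to stay, the framing step itself can be sketched on its own. This is only an illustration under assumptions: the signal is synthetic noise standing in for the audio, and the hop size is a guess, since the question gives a frame length but no step between frames.

```matlab
% Hypothetical stand-in for the audio: 1 second of noise at 16 kHz.
sr = 16000;
d  = randn(sr, 1);

framelength = round(0.025 * sr);   % 25 ms -> 400 samples
hop         = round(0.010 * sr);   % 10 ms -> 160 samples (assumed step)

% Number of complete frames that fit in the signal.
framenumber = 1 + floor((length(d) - framelength) / hop);

frames = zeros(framelength, framenumber);
for i = 1 : framenumber
    first = (i - 1) * hop + 1;                % index of first sample of frame i
    x = d(first : first + framelength - 1);   % frame i as a column vector
    frames(:, i) = x;
end
```

Inside the loop, an untested way to get one feature vector per frame would be something like c = melfcc(x, sr, 'numcep', 12, 'wintime', framelength/sr, 'hoptime', framelength/sr); so that the single window covers exactly the frame. The result will not match a whole-signal call exactly, because the pre-emphasis filter then restarts at every frame boundary.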