Factbites
 Where results make sense
About us   |   Why use us?   |   Reviews   |   PR   |   Contact us  

Topic: Mel Frequency Cepstral Coefficients


Related Topics

  
  Feature extraction   (Site not responding. Last check: 2007-11-04)
The cepstral coefficients, which are the coefficients of the Fourier transform representation of the log magnitude spectrum, have been shown to be a more robust, reliable feature set for speech recognition than the LPC coefficients.
Because the low-order cepstral coefficients are sensitive to overall spectral slope and the high-order cepstral coefficients are sensitive to noise, it has become a standard technique to weight the cepstral coefficients by a tapered window so as to minimize these sensitivities.
For this project 13 mel frequency cepstral coefficients are generated per frame and these are used as the feature vector.
www.cnel.ufl.edu /~kkale/featext.html   (556 words)
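
The tapered weighting described in this snippet is often called liftering. The following is a minimal sketch, assuming a raised-sine window of the form w(n) = 1 + (L/2) sin(pi n / L); the snippet does not say which window that project used, and the function name and the default L = 22 (a value commonly seen in HTK-style configurations) are assumptions made for illustration.

```python
import math

def lifter(cepstra, L=22):
    """Weight cepstral coefficients c[0], c[1], ... by a raised-sine window.

    Assumption: w(n) = 1 + (L / 2) * sin(pi * n / L). The window peaks at
    n = L / 2 and tapers toward both ends, so the lowest-order and (for
    n approaching L) the highest-order coefficients get the least weight.
    """
    return [c * (1.0 + (L / 2.0) * math.sin(math.pi * n / L))
            for n, c in enumerate(cepstra)]

# Example: weight a 13-coefficient feature vector (hypothetical values).
weighted = lifter([1.0] * 13)
```

Choosing L close to the number of coefficients makes the taper act on both ends of the vector; a larger L mainly de-emphasizes the low-order terms.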

  
 Mel frequency cepstral coefficient - Wikipedia, the free encyclopedia
They are derived from a type of cepstral representation of the audio clip.
The basic difference between the cepstrum and the MFCC is that in the MFCC, the frequency bands are positioned logarithmically (on the mel scale) which approximates the human auditory system's response more closely than the linearly-spaced frequency bands obtained directly from the FFT or DCT.
MFCCs are often used in speech recognition systems, such as the systems which can automatically recognise numbers spoken into a telephone.
en.wikipedia.org /wiki/Mel_frequency_cepstral_coefficient   (227 words)

  
 Mel Frequency Cepstral Coefficients   (Site not responding. Last check: 2007-11-04)
The low-order cepstral coefficients are sensitive to overall spectral slope and the high-order cepstral coefficients are susceptible to noise.
This property of the speech spectrum is captured by the mel spectrum. The mel spectrum operates on the basis of selective weighting of the frequencies in the power spectrum. Higher frequencies are weighted on a logarithmic scale, whereas lower frequencies are weighted on a linear scale.
The mel spectrum of the power spectrum is computed by multiplying the power spectrum by each of the triangular filters and integrating the result.
www1.cs.columbia.edu /~dhruv/report/node17.html   (190 words)
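
The multiply-and-integrate step in this snippet can be sketched as follows. This is only an illustration under stated assumptions: the filter edges are given directly as FFT bin indices chosen by the caller (in practice they would be spaced on the mel scale), and the function names are hypothetical rather than taken from the cited report.

```python
def triangular_filter(n_bins, left, center, right):
    """Triangle that rises from 0 at `left` to 1 at `center`, then falls to 0 at `right`.

    All three arguments are bin indices with left < center < right.
    """
    weights = [0.0] * n_bins
    for k in range(left, center):
        weights[k] = (k - left) / (center - left)
    for k in range(center, right):
        weights[k] = (right - k) / (right - center)
    return weights

def mel_spectrum(power, edges):
    """Apply overlapping triangular filters to a power spectrum.

    `edges` is an ascending list of bin indices; filter i spans
    edges[i]..edges[i + 2] with its peak at edges[i + 1].
    Each output value is the sum of power[k] * weight[k] over all bins.
    """
    out = []
    for i in range(len(edges) - 2):
        w = triangular_filter(len(power), edges[i], edges[i + 1], edges[i + 2])
        out.append(sum(wk * pk for wk, pk in zip(w, power)))
    return out

# Example: a flat power spectrum through three equal-width filters.
energies = mel_spectrum([1.0] * 9, [0, 2, 4, 6, 8])
```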

  
 Mel-frequency Cepstral Coefficients (MFCC) and Spectrograms
The model itself employs mel-frequency cepstral coefficients (MFCC's), which are often used in speech recognition and synthesis.
Thus, MFCC's approximate the shape of a spectrum by placing special emphasis on perceptually pertinent frequency regions.
Using MFCC's, we may calculate a rough approximation of the frequency spectrum by performing a low-dimensional inverse cosine transform with them.
www.music.mcgill.ca /~wes/thesis/thesis/html/node16.html   (548 words)
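
The "low-dimensional inverse cosine transform" mentioned in this snippet can be sketched as below. The sketch assumes an unnormalised DCT-II as the forward transform and treats all discarded coefficients as zero when inverting; the thesis may use a different normalisation, and the function names are hypothetical.

```python
import math

def dct2(x, n_out):
    """First n_out coefficients of the unnormalised DCT-II of x."""
    N = len(x)
    return [sum(x[k] * math.cos(math.pi * i * (k + 0.5) / N) for k in range(N))
            for i in range(n_out)]

def idct2(c, N):
    """Invert dct2 onto N points, treating coefficients beyond len(c) as zero."""
    return [c[0] / N
            + (2.0 / N) * sum(c[i] * math.cos(math.pi * i * (k + 0.5) / N)
                              for i in range(1, len(c)))
            for k in range(N)]

# Example: log filter-bank energies (made-up values) smoothed by keeping 2 of 4 coefficients.
log_energies = [1.0, 2.0, 3.0, 4.0]
smoothed = idct2(dct2(log_energies, 2), len(log_energies))
```

Keeping all N coefficients reproduces the input exactly; keeping fewer yields the smoothed spectral envelope that the snippet calls a rough approximation.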

  
 Signal Modeling
In the frequency domain, the slowly varying vocal tract and quickly varying excitation signal are multiplied to produce the speech spectrum.
By taking the inverse DFT of the log-spectrum, the first coefficients in the cepstrum represent the slowly varying vocal tract parameters, and the remaining coefficients model the quickly varying excitation signal and pitch (see figure 2).
Finally, a better perceptual measure is given by mel-frequency cepstral coefficients, which approximate the critical bands of the human auditory system by warping the frequency axis prior to a linear transform.
users.ece.gatech.edu /users/gt4670a/qualifier/html/node7.html   (490 words)
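
The cepstrum described in this snippet (inverse DFT of the log-spectrum) can be sketched as follows. This is a minimal illustration, assuming the log of the magnitude spectrum is used and a small floor avoids log(0); the naive O(N^2) DFT keeps the example self-contained, whereas a real front end would use an FFT routine. All names are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / N) for n in range(N))
           for k in range(N)]
    return [v / N for v in out] if inverse else out

def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of one frame."""
    spectrum = dft(frame)
    log_mag = [math.log(max(abs(v), floor)) for v in spectrum]
    return [v.real for v in dft(log_mag, inverse=True)]

# Example: a unit impulse has a flat spectrum, so its cepstrum is all zeros.
cep = real_cepstrum([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

Because the logarithm turns the product of vocal tract and excitation spectra into a sum, the two components end up in different regions of the cepstrum, which is the separation the snippet describes.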

  
 Untitled
Thus, for each tone with an actual frequency f, measured in Hz, a subjective pitch is measured on a scale called the "mel" scale.
Other subjective pitch values are obtained by adjusting the frequency of a tone such that it is half or twice the perceived pitch of a reference tone (with a known mel frequency).
Feature extraction based on Mel Frequency Cepstral Coefficients (MFCC) uses a filter bank whose center frequencies and bandwidths are scaled according to this subjective measure, the mel.
ispl.korea.ac.kr /speech/research/mel.html   (202 words)

  
 Method and apparatus for speech reconstruction in a distributed speech recognition system - Patent 6633839   (Site not responding. Last check: 2007-11-04)
performing an inverse discrete cosine transform on the mel-frequency cepstral coefficients at the mel-frequencies to determine log-spectral magnitudes of the speech input at the mel-frequencies.
performing an inverse discrete cosine transform on the mel-frequency cepstral coefficients at harmonic mel-frequencies corresponding to a pitch period of the speech input to determine log-spectral magnitudes of the speech input at the mel-harmonic frequencies; and
At a step 104, the MFCC values corresponding to the impulse response of the pre-emphasis filter are subtracted from the received MFCC values to remove the effect of the pre-emphasis filter as well as the effect of the Mel-filter.
www.freepatentsonline.com /6633839.html   (6431 words)

  
 Effect of high-frequency spectral components in computer recognition of dysarthric speech based on a Mel-cepstral ...   (Site not responding. Last check: 2007-11-04)
Mel-frequency cepstral coefficients extracted with the use of 15 ms frames served as training input to an ergodic HMM setup.
where Ci is the ith cepstral coefficient, P is the order, k is the number of discrete Fourier transform magnitude coefficients, Xk is the kth order log-energy output from the filter bank, and N is the number of filters (usually 20).
This difference may be due to higher variability in both lower and higher frequency components in some utterances, e.g., "seven," "start," "front," etc., which may be due to the inability of the subjects to articulate those terms consistently.
www.vard.org /jour/05/42/3/polur.html   (4123 words)
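
Frame-based extraction such as the 15 ms frames mentioned in this snippet can be sketched as below. The 15 ms frame length comes from the snippet; the 50% overlap, the Hamming window, and the function name are assumptions made for illustration and are not taken from the cited study.

```python
import math

def frame_signal(signal, sample_rate, frame_ms=15.0, hop_ms=7.5):
    """Split a signal into overlapping, Hamming-windowed frames.

    Assumptions: hop_ms (frame shift) and the Hamming window are
    illustrative choices; trailing samples that do not fill a whole
    frame are dropped.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames

# Example: 300 samples at 8 kHz give 120-sample frames every 60 samples.
frames = frame_signal([1.0] * 300, sample_rate=8000)
```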

  
 Contents of Cepstral Features
MFCCs are the parameterisation of choice for many speech recognition applications.
In particular, the effect of inserting a transmission channel on the input speech is to multiply the speech spectrum by the channel transfer function.
In the log cepstral domain, this multiplication becomes a simple addition which can be removed by subtracting the cepstral mean from all input vectors.
users.ece.gatech.edu /~antonio/htkbook/node61_ct.html   (523 words)
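
The cepstral mean subtraction described in this snippet can be sketched as follows, assuming the mean is taken over all frames of one utterance. The function name is hypothetical and this is not HTK's own code.

```python
def cepstral_mean_subtract(frames):
    """Subtract the per-coefficient mean over all frames from every frame.

    `frames` is a list of equal-length cepstral vectors. A constant channel
    offset added to every vector is removed by this operation.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]

# Example: the same vectors with and without a constant "channel" offset.
clean = cepstral_mean_subtract([[1.0, 2.0], [3.0, 6.0]])
shifted = cepstral_mean_subtract([[6.0, 9.0], [8.0, 13.0]])
```

Both calls return the same normalised vectors, which illustrates why a fixed channel drops out after the subtraction.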

  
 Audio::MFCC - Perl module for computing mel-frequency cepstral coefficients
Currently, Sphinx-II also uses delta and double-delta cepstral vectors as input to its vector quantization module, but the calculation of these values is done inside the recognizer's utterance processing module.
In the future it may be possible to move the extraction of these features into the feature extraction library, or to use entirely different features as input (for example, LPC coefficients, though currently, mel-scale cepstra give the best recognition performance).
Returns a list of array references, each of which points to the vector of cepstral coefficients extracted from one frame of data.
cpan.uwinnipeg.ca /htdocs/Speech-Recognizer-SPX/Audio/MFCC.html   (437 words)
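
Delta and double-delta vectors like those mentioned in this snippet can be computed in several ways. The sketch below assumes the common regression formula d_t = sum(th * (c[t+th] - c[t-th])) / (2 * sum(th^2)) with edge frames repeated; it is not necessarily how Sphinx-II computes them, and the function name and window size are hypothetical.

```python
def deltas(frames, window=2):
    """Regression-based time derivatives of a sequence of feature vectors.

    Assumption: frames before the start or after the end of the
    sequence are replaced by the first or last frame.
    """
    T = len(frames)
    dim = len(frames[0])
    denom = 2.0 * sum(th * th for th in range(1, window + 1))
    out = []
    for t in range(T):
        row = []
        for d in range(dim):
            num = 0.0
            for th in range(1, window + 1):
                ahead = frames[min(t + th, T - 1)][d]
                behind = frames[max(t - th, 0)][d]
                num += th * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out

# Example: a linear ramp has a delta of 1 per frame away from the edges.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
delta = deltas(ramp)
double_delta = deltas(delta)
```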

  
 Mel scale - Wikipedia, the free encyclopedia
The mel scale, proposed by Stevens, Volkman and Newman in 1937 (J. Acoust.
The reference point between this scale and normal frequency measurement is defined by equating a 1000 Hz tone, 40 dB above the listener's threshold, with a pitch of 1000 mels.
The name mel comes from the word melody to indicate that the scale is based on pitch comparisons.
en.wikipedia.org /wiki/Mel_scale   (200 words)
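
A conversion between Hz and mels can be sketched as below. It assumes one widely used analytic approximation, m = 2595 log10(1 + f / 700), rather than the original 1937 listening data; the function names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """Approximate mel value for a frequency in Hz (assumed formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_spaced_frequencies(low_hz, high_hz, count):
    """`count` (at least 2) frequencies in Hz, evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (hi - lo) / (count - 1)
    return [mel_to_hz(lo + i * step) for i in range(count)]

# Example: candidate filter-bank frequencies between 0 and 4000 Hz.
centers = mel_spaced_frequencies(0.0, 4000.0, 10)
```

With this approximation a 1000 Hz tone maps to roughly 1000 mels, in line with the reference point quoted in the snippet, and the spacing in Hz widens toward higher frequencies.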

  
 PLP and RASTA (and MFCC, and inversion) in Matlab using melfcc.m and invmelfcc.m
PLP and RASTA (and MFCC, and inversion) in Matlab
Since Mel-frequency Cepstral Coefficients, the other really popular speech feature, involve almost the same processing steps, I decided to make an implementation for them as well, using the same blocks as far as possible.
The de-facto standard Matlab implementation of MFCCs is the one in Malcolm Slaney's Auditory Toolbox.
www.ee.columbia.edu /~dpwe/resources/matlab/rastamat   (1285 words)

  
 5.7.1 HTK Format Parameter Files
If delta coefficients are added, these follow the base coefficients and energy value.
If the 0'th order cepstral coefficient is included as well as energy then it is inserted immediately before the energy coefficient, otherwise it replaces it.
The coefficients A and B are defined as
www.ee.uwa.edu.au /~roberto/research/speech/local/entropic/HTKBook/node63.html   (617 words)

  
 Tanner Labs Research - General Speech
The most common front end extracts Mel-frequency Cepstral coefficients (MFCs), but recent research indicates that the Auditory Image Model (AIM) might also offer high performance.
For example, we are currently developing lip reading technology to exploit visual cues that can play a significant role in human speech recognition, even in those with typical hearing abilities, a phenomenon known as the McGurk Effect.
Mel-frequency Cepstral coefficients (MFCs) are obtained by a Fourier transform of short speech segments into the frequency domain, a computation of the logarithm of the amplitude spectrum, and an inverse Fourier transform back to the time domain.
www.tanner.com /Labs/research/technologies/speech_recognition/general_speech.htm   (976 words)

  
 New time-frequency derived cepstral coefficients for automatic speech recognition   (Site not responding. Last check: 2007-11-04)
The goal is to improve recognition rate by optimisation of Mel Frequency Cepstral Coefficients (MFCCs): modifications concern the time-frequency representation used to estimate these coefficients.
There are many ways to obtain a spectrum out of a signal which differ in the method itself (Fourier, Wavelets,...), and in the normalisation.
We show here that we can obtain noise-resistant cepstral coefficients for speaker-independent connected word recognition. The recognition system is based on a continuous whole-word hidden Markov model.
www.idiap.ch /publications/wassner-eusipco96.bib.abs.html   (161 words)

  
 ECS EPrints Service - A Novel Approach to Noisy Speech recognition using DTW algorithm with Mel-Frequency Cepstral ...
Shafik, R. and Yousaf-Zai, F. A Novel Approach to Noisy Speech recognition using DTW algorithm with Mel-Frequency Cepstral Coefficients.
The speech signal was then sampled and speech features were extracted using low-level and customized Mel-Frequency Cepstral Coefficients (MFCC), which were later dynamically time-warped to find the average minimal distance from Euclidean distance matrices to help facilitate the recognition of speech.
For generalization, speech data from three speakers with three different pitch levels were collected and compared to a mid-pitch speaker to establish both speaker-independent and speaker-dependent efficacy and accuracy.
eprints.ecs.soton.ac.uk /13218   (278 words)
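
Dynamic time warping over sequences of feature vectors, as mentioned in this snippet, can be sketched as follows. This is the textbook recurrence with Euclidean local distance and unit step weights, not the customized procedure of the cited paper; the function name is hypothetical.

```python
import math

def dtw_distance(a, b):
    """Accumulated cost of the best alignment between two vector sequences.

    cost[i][j] holds the minimum accumulated Euclidean distance for
    aligning the first i vectors of `a` with the first j vectors of `b`.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Example: the second sequence is the first with one frame repeated.
same_word = dtw_distance([[0.0], [1.0], [2.0]], [[0.0], [1.0], [1.0], [2.0]])
```

A recognizer built this way would compare an utterance against stored templates and pick the template with the smallest accumulated distance.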

  
 Using Mel-Frequency Cepstral Coefficients in Missing Data Technique
Filter bank coefficients are the most common features employed in research on marginalisation approaches for robust speech recognition, because unreliable data are simple to detect in the frequency domain.
In this paper, we propose a hybrid approach based on the marginalisation and the soft decision techniques that make use of the Mel-frequency cepstral coefficients (MFCCs) instead of filter bank coefficients.
A new technique for estimating the reliability of each cepstral component is also presented.
www.hindawi.com /GetArticle.aspx?doi=10.1155/S1110865704309030   (156 words)

  
 Speaker Independent Voice Recognition for Application to Racquetball Tournaments   (Site not responding. Last check: 2007-11-04)
The pre-processing includes scaling each word to roughly normalize its energy, pre-emphasis to enhance the high-frequency components, and removal of leading and trailing silence.
The LPC and CEP features were created through functions inherent to Matlab, and the MFCC coefficients were generated using Malcolm Slaney's Auditory Toolbox.
We are using ten coefficients for the LPC and CEP features, and thirteen for the MFCC.
www.ecel.ufl.edu /~liszewsk/PatternRec/report.html   (1391 words)
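
The pre-emphasis step mentioned in this snippet is commonly a first-order high-pass filter. The sketch below assumes y[n] = x[n] - alpha * x[n-1] with alpha = 0.97; the report does not state its filter or coefficient, so both are illustrative, as is the function name.

```python
def pre_emphasis(signal, alpha=0.97):
    """Boost high frequencies with y[n] = x[n] - alpha * x[n - 1].

    Assumption: the first sample is passed through unchanged.
    """
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

# Example: a constant (zero-frequency) signal is strongly attenuated.
flat = pre_emphasis([1.0, 1.0, 1.0], alpha=0.5)
```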

  
 Mel Frequency Cepstral Coefficients with Pitch   (Site not responding. Last check: 2007-11-04)
Here we add one more feature, the pitch of the frame, to the speech vector comprising the MFCCs of that frame.
The pitch is determined by the difference between the peaks of the autocorrelation of the frame.
The distance between two autocorrelation peaks gives pitch information about the block.
www1.cs.columbia.edu /~dhruv/report/node18.html   (61 words)
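
Pitch estimation from autocorrelation peaks, as described in this snippet, can be sketched as below. The sketch takes the lag of the strongest autocorrelation peak after lag 0, which is the distance between the zero-lag peak and the first pitch peak; the 50 to 500 Hz search range and the function names are assumptions, not details from the cited report.

```python
import math

def autocorrelation(frame, lag):
    """Unnormalised autocorrelation of a frame at a given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

def estimate_pitch(frame, sample_rate, min_hz=50.0, max_hz=500.0):
    """Pitch in Hz from the lag with the largest autocorrelation.

    Only lags corresponding to min_hz..max_hz (assumed range) are searched,
    which excludes the trivial peak at lag 0.
    """
    min_lag = int(sample_rate / max_hz)
    max_lag = min(int(sample_rate / min_hz), len(frame) - 1)
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(frame, lag))
    return sample_rate / best_lag

# Example: a 100 Hz sine sampled at 8 kHz for 50 ms.
tone = [math.sin(2.0 * math.pi * 100.0 * n / 8000.0) for n in range(400)]
pitch = estimate_pitch(tone, sample_rate=8000)
```

The resulting value can be appended to the MFCC vector of the same frame to form the extended feature vector the snippet describes.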

  
 Amazon.com: "Mel-Frequency Cepstral Coefficients": Key Phrase page   (Site not responding. Last check: 2007-11-04)
We choose Short Time Energy (STE), pitch, Mel-Frequency Cepstral Coefficients (MFCCs), and pause rate.
A Fast Fourier Transform was applied on each frame.
After that 24 critical band energies and 16 mel-frequency cepstral coefficients were calculated.
www.amazon.com /phrase/Mel_Frequency-Cepstral-Coefficients   (553 words)

  
 featureExtraction
Cepstral coefficients are calculated from the output of the Non-linear Transformation method
the output of mel filtering is subjected to a logarithm function (natural logarithm)
Mel Frequency Cepstral Coefficients (32 bit floating point data)
www.music.mcgill.ca /~mcennis/doc/org/oc/ocvolume/dsp/featureExtraction.html   (171 words)


Copyright © 2005-2007 www.factbites.com Usage implies agreement with terms.