Abstract
The source-filter model, originally devised to represent the sound production process, has been widely used to estimate, from a speech signal, both the source signal, which carries pitch information, and the synthesis filter, which carries vowel information. We use this model to identify musical instruments from their sound signals. However, the model suffers from an indeterminacy problem. To resolve it, we exploit three elements of sound: loudness, pitch, and timbre. We assume that the source signal is represented by time-varying pitch and amplitude, and the synthesis filter by time-invariant line spectral frequency parameters. We construct a probabilistic model that expresses this assumption as an extension of the source-filter model. To learn the model parameters, we employ an EM-like algorithm that minimizes a cost function called the free energy. We evaluate our approach by reconstructing the spectrum from the estimated source signal and synthesis filter, and by identifying instruments using the model parameters of the estimated synthesis filter. The results show that this learning scheme can achieve simultaneous estimation of the source signal and the synthesis filter.
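The following minimal sketch illustrates one simple form of the indeterminacy in a source-filter decomposition: a gain can be moved freely between the source and the filter without changing the observed spectrum. It is not the probabilistic model of this work. It assumes an idealized harmonic line spectrum for the source and parameterizes the all-pole filter directly by its polynomial coefficients rather than by line spectral frequencies; the function names, the one-pole coefficient, and the sampling settings are hypothetical and chosen only for illustration.

```python
import numpy as np

def filter_response(a, n_bins):
    """Magnitude response |1/A(e^{jw})| of an all-pole filter,
    with A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..., on n_bins frequencies in [0, pi)."""
    w = np.linspace(0.0, np.pi, n_bins, endpoint=False)
    k = np.arange(1, len(a) + 1)
    A = 1.0 + np.exp(-1j * np.outer(w, k)) @ np.asarray(a, dtype=float)
    return 1.0 / np.abs(A)

def source_spectrum(f0, amplitude, sr, n_bins, n_harmonics=10):
    """Idealized source: a line of height `amplitude` at each harmonic of f0 below Nyquist."""
    spec = np.zeros(n_bins)
    for h in range(1, n_harmonics + 1):
        f = h * f0
        if f < sr / 2:
            spec[int(f / (sr / 2) * n_bins)] = amplitude
    return spec

sr, n_bins = 16000, 512        # assumed sampling rate and frequency resolution
a = [-0.9]                     # hypothetical one-pole synthesis filter

S = source_spectrum(440.0, 1.0, sr, n_bins)   # pitch and amplitude (source)
H = filter_response(a, n_bins)                # spectral envelope (filter)

X = S * H                      # observed magnitude spectrum
X_alt = (2.0 * S) * (H / 2.0)  # louder source, quieter filter: same observation
```

Because `X` and `X_alt` coincide, the observed spectrum alone cannot determine how the overall level is split between source and filter, which is why additional assumptions about the two components are needed.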