Evans, Benjamin Peter (2012) A Review of Automatic Music Transcription Low Level Processing Techniques and the evaluation and Optimisation of Multiresolution FFT Parameters. Masters thesis, University of Huddersfield.
- Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
The Fast Fourier Transform (FFT) is commonly used in the field of digital signal processing to move a signal from the time domain to the frequency domain. The FFT is popular as a low level processing technique in automatic music transcription algorithms, but there is a trade-off between suitable time and frequency resolutions for music transcription. To address this problem, multiresolution methods that employ several FFTs across the frequency spectrum have become popular. The purpose of this investigation was to assess the properties of the FFT in the context of Automatic Music Transcription (AMT) and to optimise the main parameters of a multiresolution FFT to improve the spectral output.
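The trade-off described above follows from two standard relations: for a sample rate fs and FFT length N, the bin spacing is fs / N and the analysis window lasts N / fs. The short sketch below tabulates both for a few lengths. The 44.1 kHz sample rate, the chosen lengths, and the function name `fft_resolution` are illustrative assumptions and are not taken from the thesis.

```python
def fft_resolution(sample_rate, fft_length):
    """Return (bin spacing in Hz, window duration in seconds) for one FFT."""
    bin_spacing_hz = sample_rate / fft_length
    window_seconds = fft_length / sample_rate
    return bin_spacing_hz, window_seconds


if __name__ == "__main__":
    SAMPLE_RATE = 44100  # assumed CD-quality rate, not specified in the abstract

    # Longer FFTs give finer frequency bins but a longer (coarser) time window.
    for n in (1024, 4096, 16384):
        df, dt = fft_resolution(SAMPLE_RATE, n)
        print(f"N={n:6d}  bin spacing={df:7.2f} Hz  window={dt * 1000:7.1f} ms")
```

Because the product of the two quantities is always 1, improving one necessarily degrades the other for a single FFT, which is the motivation for using different lengths in different frequency bands.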
Background theory of AMT and current low level processing techniques is presented. Discussion of FFT decomposition theory and multiresolution techniques is followed by a brief overview of spectral processing and current high level processing approaches. These topics are presented within the context of Western music harmony as a foundation for the presentation of an optimised multiresolution FFT.
A novel method of scoring FFT parameters was developed, based upon frequency resolution, time resolution, and the alignment of the fundamental frequencies of equal-tempered musical notes with the frequency bins of the FFT. A 4-band multiresolution FFT with optimised sub-band divisions and FFT lengths is derived from the exhaustive evaluation of parameters based upon the scoring method.
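The abstract does not give the scoring formula, so the following is only a rough sketch of how the alignment between equal-tempered note fundamentals and FFT bin centres could be measured. It assumes A4 = 440 Hz, MIDI note numbering, and a 44.1 kHz sample rate; the function names and the choice of "mean distance to the nearest bin centre" as a measure are hypothetical and should not be read as the method used in the thesis.

```python
def note_frequency(midi_note, a4_hz=440.0):
    """Equal-tempered fundamental for a MIDI note number (69 = A4)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12)


def bin_misalignment(freq_hz, sample_rate, fft_length):
    """Distance from freq_hz to the nearest bin centre, in bins (0 to 0.5)."""
    exact_bin = freq_hz * fft_length / sample_rate
    return abs(exact_bin - round(exact_bin))


def mean_misalignment(midi_notes, sample_rate, fft_length):
    """Average misalignment over a set of notes; lower means better aligned."""
    offsets = [
        bin_misalignment(note_frequency(m), sample_rate, fft_length)
        for m in midi_notes
    ]
    return sum(offsets) / len(offsets)


if __name__ == "__main__":
    SAMPLE_RATE = 44100            # assumed, not specified in the abstract
    octave = range(60, 72)         # C4 to B4, an arbitrary illustrative band

    # Compare candidate FFT lengths for the same band of notes.
    for n in (2048, 4096, 8192):
        score = mean_misalignment(octave, SAMPLE_RATE, n)
        print(f"N={n:5d}  mean misalignment={score:.3f} bins")
```

An exhaustive search of the kind the abstract mentions would evaluate such a measure, together with time and frequency resolution terms, over every candidate combination of band edges and FFT lengths.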
The optimised 4-band multiresolution FFT is evaluated against a single-band FFT, a 3-band optimised solution, an existing 4-band multiresolution FFT solution, and two variations of that existing solution, comparing optimisation scores and performance in sinusoidal extraction tasks.
Theoretical results show that the optimised 4-band multiresolution FFT offers improved performance for automatic music transcription compared to a non-optimised solution. Preliminary real-world testing indicated issues that require further investigation.
|Item Type:||Thesis (Masters)|
|Subjects:||M Music and Books on Music > M Music; Q Science > QA Mathematics > QA76 Computer software|
|Schools:||School of Computing and Engineering|
|Depositing User:||Gail Hurst|
|Date Deposited:||19 Jun 2013 08:35|
|Last Modified:||07 Dec 2016 18:37|