Abstract
Similarity measures are indispensable in music information retrieval. In recent years, various proposals have been made for measuring melodic similarity in symbolically encoded scores. Many of these approaches are ultimately based on dynamic programming, such as sequence alignment or edit distance, which has several drawbacks. First, the similarity scores are not necessarily metrics and are not directly comparable. Second, the algorithms are mostly first-order and of quadratic time complexity. Finally, the features and weights need to be defined precisely. We propose an alternative approach which employs deep neural networks for end-to-end similarity metric learning. We contrast and compare different recurrent neural architectures (LSTM and GRU) for representing symbolic melodies as continuous vectors, and demonstrate how duplet and triplet loss functions can be employed to learn compact distributional representations of symbolic music in an induced melody space. This approach is contrasted with an alignment-based baseline. We present results for the Meertens Tune Collections, which consist of a large number of vocal and instrumental monophonic pieces from Dutch musical sources, spanning five centuries, and demonstrate the robustness of the learned similarity metrics.
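To illustrate the idea of triplet-based metric learning over a recurrent melody encoder, the sketch below shows one possible setup in PyTorch. It is not the authors' implementation: the feature dimensionality, single-layer GRU, embedding size, margin value, and the use of `nn.TripletMarginLoss` are all illustrative assumptions.

```python
# Hypothetical sketch: a GRU encoder maps a symbolic melody (a sequence of
# per-note feature vectors) to a fixed-size embedding, trained with a triplet
# loss so that variants of the same tune lie close together in the induced
# melody space. All dimensions and hyperparameters are assumptions.
import torch
import torch.nn as nn

class MelodyEncoder(nn.Module):
    def __init__(self, feature_dim=4, hidden_dim=128, embed_dim=64):
        super().__init__()
        self.rnn = nn.GRU(feature_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, embed_dim)

    def forward(self, x):                   # x: (batch, time, feature_dim)
        _, h = self.rnn(x)                  # h: (num_layers, batch, hidden_dim)
        z = self.proj(h[-1])                # (batch, embed_dim)
        return nn.functional.normalize(z, dim=-1)  # unit-length embeddings

encoder = MelodyEncoder()
loss_fn = nn.TripletMarginLoss(margin=0.2)

# Toy batch: anchor and positive would be variants from the same tune family,
# negative a melody from a different family (random tensors for illustration).
anchor   = torch.randn(8, 50, 4)
positive = torch.randn(8, 50, 4)
negative = torch.randn(8, 50, 4)

loss = loss_fn(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
```

A duplet (contrastive) variant would replace the triplet loss with a pairwise loss over labeled same/different pairs; retrieval then reduces to nearest-neighbor search over the learned embeddings.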
Original language | English
---|---
Title of host publication | Proceedings of the 20th International Society for Music Information Retrieval Conference |
Pages | 478-485 |
Publication status | Published - Oct 2019 |
Datasets
- MTCFeatures 1.1
van Kranenburg, P. (Creator), Meertens Instituut, Nov 2019
DOI: 10.5281/zenodo.3551003, https://zenodo.org/record/3551003, https://pvankranenburg.github.io/MTCFeatures/
Dataset