Computational Musicology, Algorithms, and Datasets


This presentation gives an overview of how algorithms are applied in recent Computational Musicology studies and presents musical datasets that are currently in use.

Dec 8, 2020 4:00 PM — 6:00 PM
UFRJ. Online (Zoom)
Rio de Janeiro, RJ

Algorithms
  1. Precision, Recall, F1-score
  2. Frequency analysis
  3. Hypothesis tests
  4. Effect-size
  5. Cluster analysis (dendrograms, k-means)
  6. Principal component analysis
  7. Entropy
  8. Support Vector Machine
  9. Hidden Markov Models
  10. Neural Networks (Convolutional and Long Short-Term Memory)
  11. TF-IDF distribution (term frequency-inverse document frequency)
  12. Latent Semantic Analysis
  13. String alignment
  14. String similarity
  15. Longest common substring and subsequence (LCS)
  16. Edit distance (between strings and between trees)
  17. Latent Dirichlet Allocation
  18. Kullback-Leibler divergence
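Several of the measures listed above are small enough to sketch directly. The block below is a minimal illustration in plain Python, not code from any of the studies cited here. It assumes melodies are encoded as strings of note names, and all function names and the `eps` smoothing value are hypothetical choices made for this example.

```python
import math
from collections import Counter


def precision_recall_f1(predicted, reference):
    """Precision, recall and F1 between predicted and reference label sets."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def distribution(symbols):
    """Relative frequency of each symbol (frequency analysis)."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return {symbol: count / n for symbol, count in counts.items()}


def entropy(p):
    """Shannon entropy of a distribution, in bits."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def kl_divergence(p, q, eps=1e-9):
    """Kullback-Leibler divergence D(p || q) in bits.

    Symbols missing from q are given probability eps (an assumption made
    here to avoid division by zero; other smoothing schemes are possible).
    """
    return sum(v * math.log2(v / q.get(s, eps)) for s, v in p.items() if v > 0)


def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    melody_a, melody_b = "CDEFG", "CDEGG"  # toy melodies as note-name strings
    print(edit_distance(melody_a, melody_b))  # 1 (F replaced by G)
    print(lcs_length(melody_a, melody_b))     # 4 (C, D, E, G)
    print(entropy(distribution(melody_a)))
    print(kl_divergence(distribution(melody_a), distribution(melody_b)))
```

Real corpus studies usually work on richer encodings (intervals, durations, chord labels) and rely on tested libraries, so this sketch is only meant to make the definitions concrete.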

Datasets
  1. KernScores
  2. Music21
  3. ABC Beethoven
  4. Troubadour Melodies Database
  5. Cantus Database
  6. GregoBaseCorpus
  7. Weimar Jazz Database
  8. RAG-C Database (Ragtime)
  9. McGill Billboard
  10. Spotify (Metadata)
  11. LastFM (Metadata)
  12. e-chords
  13. Ultimate guitar
  14. Lakh MIDI Dataset (LMD)
  15. Chordify Annotator Subjectivity Dataset

References

2020
  1. Chapman, Katie Elizabeth. 2020. “Digital Approaches to Troubadour Song.” Indiana University.
  2. Cornelissen, Bas, Willem Zuidema, and John Ashley Burgoyne. 2020. “Mode Classification and Natural Units in Plainchant.” In Proceedings of the 21st International Society for Music Information Retrieval Conference. Montréal, Canada.
  3. Cornelissen, Bas, Willem Zuidema, and John Ashley Burgoyne. 2020. “Studying Large Plainchant Corpora Using Chant21.” In Proceedings of the 7th International Conference on Digital Libraries for Musicology. Montréal, Canada.
  4. Frieler, Klaus. 2020. “Miles Vs. Trane: Computational and Statistical Comparison of the Improvisatory Styles of Miles Davis and John Coltrane.” Jazz Perspectives 12 (1): 123–45.
  5. Kirlin, Phillip B. 2020. “A Corpus-Based Analysis of Syncopated Patterns in Ragtime.” In Proceedings of the 21st International Society for Music Information Retrieval Conference. Montréal, Canada.
  6. Kutschke, Beate Ruth, and Tobias Bachmann. 2020. “Historiography of the Form of Symbolic Music through a Computer-Assisted Analysis.” In Proceedings of the 17th Sound and Music Computing Conference, 386–93. Torino.
  7. Lieck, Robert, Fabian Claude Moss, and Martin Rohrmeier. 2020. “The Tonal Diffusion Model.” Transactions of the International Society for Music Information Retrieval 3 (1): 153–64.
  8. Mor, Bhavya, Sunita Garhwal, and Ajay Kumar. 2020. “A Systematic Literature Review on Computational Musicology.” Archives of Computational Methods in Engineering 27 (3): 923–37.
  9. Neubarth, Kerstin, and Darrell Conklin. 2020. “Mining Characteristic Patterns for Comparative Music Corpus Analysis.” Applied Sciences 10 (6).
  10. Shaffer, Kris, Esther Vasiete, Brandon Jacquez, Aaron Davis, Diego Escalante, Calvin Hicks, Joshua McCann, Camille Noufi, and Paul Salminen. 2020. “A Cluster Analysis of Harmony in the McGill Billboard Dataset.” Empirical Musicology Review 14 (3–4): 146.

2019
  1. Allegraud, Pierre, Louis Bigo, Laurent Feisthauer, Mathieu Giraud, Richard Groult, Emmanuel Leguy, and Florence Levé. 2019. “Learning Sonata Form Structure on Mozart’s String Quartets.” Transactions of the International Society for Music Information Retrieval 2 (1): 82–96.
  2. Clark, Beach, and Claire Arthur. 2019. “Alternative Measures: A Musicologist Workbench for Popular Music.” In Proceedings of the Sound and Music Computing Conferences, 407–14.
  3. Feisthauer, Laurent, Louis Bigo, and Mathieu Giraud. 2019. “Modeling and Learning Structural Breaks in Sonata Forms.” In Proceedings of the 20th International Society for Music Information Retrieval Conference. Delft, Netherlands.
  4. Georges, Patrick, and Ngoc Nguyen. 2019. “Visualizing Music Similarity: Clustering and Mapping 500 Classical Music Composers.” Scientometrics.
  5. Gotham, Mark, and Matthew T Ireland. 2019. “Taking Form: A Representation Standard, Conversion Code, and Example Corpus for Recording, Visualizing, and Studying Analysis of Musical Form.” In Proceedings of the 20th International Society for Music Information Retrieval Conference. Delft, Netherlands.
  6. Janssen, Berit, Tom Collins, and Iris Yuping Ren. 2019. “Algorithmic Ability to Predict the Musical Future: Datasets and Evaluation.” In Proceedings of the 20th International Society for Music Information Retrieval Conference, 208–15. Delft, Netherlands.
  7. Koops, Hendrik Vincent, W. Bas de Haas, John Ashley Burgoyne, Jeroen Bransen, Anna Kent-Muller, and Anja Volk. 2019. “Annotator Subjectivity in Harmony Annotations of Popular Music.” Journal of New Music Research 0 (0): 1–21.
  8. Moss, Fabian Claude, Markus Neuwirth, Daniel Harasim, and Martin Rohrmeier. 2019. “Statistical Characteristics of Tonal Harmony: A Corpus Study of Beethoven’s String Quartets.” PLoS ONE, 1–16.
  9. Moss, Fabian Claude. 2019. “Transitions of Tonality: A Model-Based Corpus Study.” Ph.D. Thesis. École Polytechnique Fédérale de Lausanne.
  10. Nuttall, Thomas, Miguel García-Casado, Víctor Núñez-Tarifa, Rafael Caro Repetto, and Xavier Serra. 2019. “Contributing to New Musicological Theories with Computational Methods: The Case of Centonization in Arab-Andalusian Music.” In Proceedings of the 20th International Society for Music Information Retrieval Conference, 223–28. Delft, Netherlands.
  11. Simonetta, Federico, Carlos Cancino-Chacón, Gerhard Widmer, and Stavros Ntalampiras. 2019. “A Convolutional Approach to Melody Line Identification in Symbolic Scores.” Computing Research Repository abs/1906.1.
  12. Temperley, David. 2019. “Second-Position Syncopation in European and American Vocal Music.” Empirical Musicology Review 14 (1–2): 66.
  13. Tymoczko, Dmitri, Mark Gotham, Michael Cuthbert, and Christopher Ariza. 2019. “The RomanText Format: A Flexible and Standard Method for Representing Roman Numeral Analyses.” In Proceedings of the 20th International Society for Music Information Retrieval Conference, 123–29. Delft, Netherlands.
  14. Warrenburg, Lindsay A, and David Huron. 2019. “Tests of Contrasting Expressive Content between First and Second Musical Themes.” Journal of New Music Research 48 (1): 21–35.
  15. Wu, Yusong, and Shengchen Li. 2019. “Distinguishing Chinese Guqin and Western Baroque Pieces Based on Statistical Model.” In Proceedings of Computer Music Multidisciplinary Research 2019, 1–12.

2018
  1. Bigo, Louis, Laurent Feisthauer, Mathieu Giraud, and Florence Levé. 2018. “Relevance of Musical Features for Cadence Detection.” In Proceedings of the 19th International Society for Music Information Retrieval Conference. Paris.
  2. Chon, Song Hui, David Huron, and Dana DeVlieger. 2018. “An Exploratory Study of Western Orchestration: Patterns through History.” Empirical Musicology Review 12 (3–4): 116.
  3. Neuwirth, Markus, Daniel Harasim, Fabian C. Moss, and Martin Rohrmeier. 2018. “The Annotated Beethoven Corpus (ABC): A Dataset of Harmonic Analyses of All Beethoven String Quartets.” Frontiers in Digital Humanities 5 (July): 1–5.
  4. Sears, David R.W., Marcus Thomas Pearce, William E. Caplin, and Stephen McAdams. 2018. “Simulating Melodic and Harmonic Expectations for Tonal Cadences Using Probabilistic Models.” Journal of New Music Research 47 (1): 29–52.
Marcos Sampaio
Professor of Music Theory and Composition

My research interests include Computational Musicology, Music Contour, Music Theory and Joseph Haydn.