Christopher Raphael and Joshua Stoddard describe a model for chord labelling from symbolic data (MIDI) in their 2004 Computer Music Journal paper, "Functional Harmonic Analysis Using Probabilistic Models". An earlier version of the work appeared at ISMIR 2003.

This is a prime candidate for a chord labelling model that I could (a) use as a baseline for the chord labelling task and (b) extend into a category tagger for processing MIDI input.


Model Details

The model is described in detail in the paper. I've summarized it in this document.


The description of the model omits some details that I need in order to replicate their experiments.

  1. Training dataset is unavailable: they trained on a collection of 5 or 6 Haydn piano sonata movements.
  2. Test data and analyses referenced in the paper were not originally available. They can now be found on Christopher Raphael's current homepage.
  3. Initialization parameters are not given.
  4. They say in the paper that some components of the transition distribution don't get learned well by the model, so these are set by hand. The hand-set values are not given.

Training data

I'm not sure what's best to train the model on to replicate R&S's experiments. They no longer have the files they used. One possibility is to train on the test MIDI files themselves. R&S used Haydn piano sonatas to train their model (personal communication, not in the paper), so I've collected and cleaned up some Haydn sonata MIDI files that I can use as training data.


The exact initialization parameters for the emission distribution probably don't matter that much. Setting them to the kind of ballpark figures you'd imagine from reading the paper (highest probability for the root, then the other chord notes, then the remaining scale notes, with non-scale notes lowest) should get the training off to a good enough start.
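A minimal sketch of this kind of ballpark initialization, for one chord in one key. Only the ordering (root > chord notes > scale notes > non-scale notes) comes from the notes above. The weight values, the function names and the restriction to a major scale are my own assumptions for illustration, not figures from the paper.

```python
# Hypothetical per-pitch-class weights: only their ordering is taken from the
# notes above, the actual numbers are placeholders.
WEIGHTS = {"root": 8.0, "chord": 4.0, "scale": 2.0, "other": 1.0}

# Semitone offsets of the major scale from the tonic (assumed mode).
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)


def pitch_class_role(pc, root, chord_pcs, scale_pcs):
    """Classify a pitch class relative to the current chord and key."""
    if pc == root:
        return "root"
    if pc in chord_pcs:
        return "chord"
    if pc in scale_pcs:
        return "scale"
    return "other"


def initial_emission(tonic, root, chord_intervals, scale=MAJOR_SCALE,
                     weights=WEIGHTS):
    """Return a normalized distribution over the 12 pitch classes."""
    chord_pcs = {(root + i) % 12 for i in chord_intervals}
    scale_pcs = {(tonic + s) % 12 for s in scale}
    raw = [weights[pitch_class_role(pc, root, chord_pcs, scale_pcs)]
           for pc in range(12)]
    total = sum(raw)
    return [w / total for w in raw]


# Tonic triad in C major: C is the root, E and G are chord notes.
dist = initial_emission(tonic=0, root=0, chord_intervals=(0, 4, 7))
```

Training should then pull these values towards whatever the data supports, which is why the exact starting numbers are unlikely to matter much.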

Transition distribution parameters

Setting the transition distribution parameters by hand is more dangerous. In the paper, they say that the initialization of the transition distributions doesn't make much difference. I'll try it anyway: set the transition distribution parameters by hand as an initialization, then train.

I have confirmed that training without initializing the transition distributions (i.e. initializing to uniform distributions) gives nonsensical parameters after training. I need to try a hand-set initialization to see if it does any better.

If that still doesn't work, I'll try fixing the parameters by hand instead of learning them, as they do.
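The two starting points being compared can be sketched roughly as follows. This treats the transition distribution as a single flat matrix over states, which is a simplification. The self-transition bias and its value are purely hypothetical stand-ins for a hand-set initialization; they are not the parameters R&S used, which are not given.

```python
def uniform_transitions(n_states):
    """Every state is equally likely to follow every other state."""
    p = 1.0 / n_states
    return [[p] * n_states for _ in range(n_states)]


def biased_transitions(n_states, stay_prob=0.5):
    """Hand-set starting point: extra mass on staying in the same state.

    stay_prob is a hypothetical value chosen for illustration only.
    """
    if n_states < 2:
        raise ValueError("need at least two states")
    if not 0.0 < stay_prob < 1.0:
        raise ValueError("stay_prob must be strictly between 0 and 1")
    move = (1.0 - stay_prob) / (n_states - 1)
    return [[stay_prob if i == j else move for j in range(n_states)]
            for i in range(n_states)]


# Each row is a distribution over next states and must sum to 1.
uniform_init = uniform_transitions(4)
hand_init = biased_transitions(4, stay_prob=0.7)
```

Either matrix can be handed to the training procedure as its initial transition parameters, so the two runs differ only in where they start.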

Trained Models

I'm experimenting with training different models and seeing how they perform (pretty informally).




Trained on 5 Haydn piano sonata movements, each truncated to 50 chords.


Trained on MIDI files of 12 jazz standards, each truncated to 50 chords.


Trained first on the Haydn data, then retrained on the jazz data.

To do

I've got the model training basically working. These are things I want to do next.