I have trained the C&C supertagger on my dataset, which currently consists of 3,756 chords. The annotation is incomplete: some chords have no category, which will slightly reduce the tagger's performance.
Heldout Cross-Validation
To perform a basic evaluation of the tagger I split the corpus 10 ways (by sequences, not chords). I trained a model on each combination of 9 partitions and evaluated it on the remaining one, combining the results of all the tests.
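The split described above could be sketched roughly as follows. This is only an illustration: the source does not say how sequences were assigned to partitions, so the round-robin assignment, the function name `kfold_by_sequence`, and the toy corpus are all assumptions.

```python
from typing import Iterator, List, Sequence, Tuple


def kfold_by_sequence(sequences: Sequence, k: int = 10) -> Iterator[Tuple[List, List]]:
    """Yield (train, heldout) pairs; whole sequences, not chords, are partitioned."""
    # Assumption: round-robin assignment (sequence i goes to fold i % k).
    folds = [list(sequences[i::k]) for i in range(k)]
    for held in range(k):
        # Train on the other k - 1 folds, evaluate on the held-out one.
        train = [seq for j, fold in enumerate(folds) if j != held for seq in fold]
        yield train, folds[held]


# Hypothetical toy corpus: 25 sequences, each a list of (chord, tag) pairs.
corpus = [[(f"chord{i}_{j}", "T") for j in range(4)] for i in range(25)]
splits = list(kfold_by_sequence(corpus, k=10))
```

Splitting by sequence keeps every chord of a sequence on the same side of the train/test boundary, so the tagger is never evaluated on a sequence it has partly seen.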
Tag Agreement
I used the tagger to pick the highest-probability tag for each chord and computed the proportion of these tags that matched the gold-standard tag. Chords that currently have no gold-standard tag were ignored.
- Tags matching gold standard: 80.8%
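A minimal sketch of the agreement measure, assuming that chords without a gold-standard tag are represented by `None` (the representation and the function name `tag_agreement` are hypothetical, not taken from the tagger):

```python
def tag_agreement(gold, predicted):
    """Proportion of scored chords whose top predicted tag equals the gold tag."""
    # Chords with no gold-standard tag (None here, by assumption) are skipped.
    scored = [(g, p) for g, p in zip(gold, predicted) if g is not None]
    if not scored:
        raise ValueError("no chords with a gold-standard tag")
    matches = sum(1 for g, p in scored if g == p)
    return matches / len(scored)


# Hypothetical example: the third chord has no gold tag and is ignored.
gold = ["T", "D", None, "D_Tt", "D"]
predicted = ["T", "T", "D", "D_Tt", "D"]
agreement = tag_agreement(gold, predicted)  # 3 of the 4 scored chords match
```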
Perplexity
I retrieved a large number of tags, with their probabilities, from the tagger for each chord. From these I computed the entropy per chord over each partition, and hence the perplexity of the model over the whole evaluated set.
- Entropy per chord (averaged over partitions): 0.144
- Perplexity: 1.105
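The two figures are related by perplexity = 2^entropy, with entropy measured in bits: 2^0.144 is approximately 1.105. The sketch below assumes the entropy is the mean negative log2 probability the tagger assigns to each chord's gold-standard tag. The source does not spell out the exact definition, so treat this as one plausible reading, and the function names as hypothetical.

```python
import math


def cross_entropy_bits(gold_probs):
    """Mean negative log2 probability assigned to each chord's gold tag."""
    return -sum(math.log2(p) for p in gold_probs) / len(gold_probs)


def perplexity(entropy_bits):
    """Perplexity corresponding to a per-chord entropy given in bits."""
    return 2.0 ** entropy_bits


# Hypothetical probabilities the tagger gave to the gold tag of four chords.
h = cross_entropy_bits([0.5, 0.25, 1.0, 0.5])  # (1 + 2 + 0 + 1) / 4 bits
ppl = perplexity(h)

# Consistency check against the reported entropy of 0.144.
reported = perplexity(0.144)
```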
Confusions
The following table counts the incorrectly chosen tags in the agreement test, over the whole evaluation.
Question marks appear in this table because unknown tags in the training set were not excluded. The tagger has learnt the unknown value and is occasionally choosing it as the most probable tag. Eventually there should be no unknown tags in the training set.
For brevity, this table excludes confusions that occur only once or twice.
Correct tag | Chosen tag | Count |
T | D | 102 |
D | T | 47 |
D_Tt | D | 41 |
Rep | T | 41 |
Rep_D | T | 26 |
Rep_D | D | 26 |
D | D_Tt | 22 |
D_Bd | D | 21 |
TC_IV | T | 21 |
T | ? | 20 |
D_Tt | T | 17 |
Rep | D | 15 |
T | Rep_D | 13 |
D_Bd | ? | 12 |
T | D_Tt | 12 |
Rep_D_Tt | D | 12 |
TC_IV | D | 12 |
D | ? | 10 |
Rep_D_Tt | T | 10 |
TC_IVR | D | 10 |
D_Bd | T | 9 |
Pass_VI | ? | 9 |
TC_IVR | T | 9 |
S | T | 8 |
9c | D | 8 |
Dim_bVII | ? | 7 |
T | Rep | 7 |
S | D | 7 |
Pass_bV | D | 7 |
TC_II | ? | 7 |
Aug_bII | D | 6 |
9e | T | 6 |
TC_IIR | ? | 5 |
Dim_bII | ? | 5 |
T | TC_IV | 5 |
D | Rep_D | 5 |
11a | D | 5 |
T_III | D | 5 |
0a | D | 5 |
D_Btk | D | 5 |
Rep | Rep_D | 5 |
2a | D_Bd | 4 |
S | ? | 4 |
D | TC_IV | 4 |
TC_IV | D_Tt | 4 |
Pass_I | D | 4 |
Rep_bVI | T | 4 |
9e | D | 4 |
Rep_D_Bd | D | 3 |
TC_IIR | T | 3 |
Dim_bII | D_Tt | 3 |
T | Rep_D_Tt | 3 |
T_III | ? | 3 |
TC_IVR | ? | 3 |
TC_IV | D_Bd | 3 |
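A confusion table of this kind can be tallied with a counter over (correct tag, chosen tag) pairs. The sketch below is illustrative only: the function name `confusion_counts` and the use of `None` for chords with no gold-standard tag are assumptions, and the `min_count` threshold mirrors the exclusion of confusions that occur once or twice.

```python
from collections import Counter


def confusion_counts(gold, predicted, min_count=3):
    """Return (correct, chosen, count) rows for wrong predictions, most frequent first."""
    errors = Counter(
        (g, p)
        for g, p in zip(gold, predicted)
        if g is not None and g != p  # skip untagged chords and correct predictions
    )
    return [(g, p, n) for (g, p), n in errors.most_common() if n >= min_count]


# Hypothetical example: only the T -> D confusion reaches the threshold.
gold = ["T", "T", "T", "T", "D", "D", "D", None]
predicted = ["D", "D", "D", "T", "T", "D", "?", "T"]
rows = confusion_counts(gold, predicted)
```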
Category Distribution
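At the time of training, the distribution of chords over categories was as follows. Percentages are relative to all 3,928 chords counted in the table, including the 172 with no category. Numeric category names (such as 9e or 0a) come from an older annotation; the chords carrying them have not yet been re-annotated and make up only a small proportion of the data.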
Category | Count | % |
D | 1,984 | 50.51 |
T | 875 | 22.28 |
D_Tt | 271 | 6.90 |
No category | 172 | 4.38 |
Rep_D | 130 | 3.31 |
Rep | 105 | 2.67 |
TC_IV | 58 | 1.48 |
Dim_bII | 49 | 1.25 |
D_Bd | 48 | 1.22 |
Rep_D_Tt | 38 | 0.97 |
TC_IVR | 35 | 0.89 |
S | 22 | 0.56 |
9e | 11 | 0.28 |
0a | 11 | 0.28 |
T_III | 10 | 0.25 |
TC_II | 10 | 0.25 |
Pass_VI | 9 | 0.23 |
Pass_bV | 9 | 0.23 |
9c | 9 | 0.23 |
TC_IIR | 8 | 0.20 |
Dim_bVII | 8 | 0.20 |
Aug_bII | 8 | 0.20 |
Rep_bVI | 6 | 0.15 |
D_Btk | 6 | 0.15 |
Rep_D_Bd | 5 | 0.13 |
11a | 5 | 0.13 |
Pass_I | 4 | 0.10 |
2a | 4 | 0.10 |
Rep_S | 3 | 0.08 |
Rep_Aug_bII | 3 | 0.08 |
Dim_V | 3 | 0.08 |
T_bVI | 2 | 0.05 |
Aug_VI | 2 | 0.05 |
11b | 2 | 0.05 |
Rep_Aug_VI | 1 | 0.03 |
Dim_III | 1 | 0.03 |
9b | 1 | 0.03 |
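A distribution like the one above can be derived from a flat list of chord categories. The following is a minimal sketch rather than the script actually used; the function name `category_distribution` and the "No category" label for `None` are assumptions for illustration.

```python
from collections import Counter


def category_distribution(tags):
    """Return (category, count, percent) rows, most frequent category first."""
    # Assumption: chords without a category are stored as None.
    counts = Counter("No category" if t is None else t for t in tags)
    total = sum(counts.values())
    return [(cat, n, round(100.0 * n / total, 2)) for cat, n in counts.most_common()]


# Hypothetical toy data: 5 D, 3 T and 2 uncategorised chords.
tags = ["D"] * 5 + ["T"] * 3 + [None] * 2
dist = category_distribution(tags)

# Check against the table: 1,984 D chords out of the 3,928 counted in total.
share_d = round(100.0 * 1984 / 3928, 2)
```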