Module output
Output chord corpus data to a text file that others can use.
Data structures and utilities are provided elsewhere in the codebase
for loading, editing, converting, saving, etc. chord sequence data with
annotations. The data is stored either as a sqlite database or as a pickled
Python object, neither of which is useful to many other people. This format
is designed to be easily readable by others.
I don't currently provide any implementation of reading this file
format, since all scripts take their input from an internally-used
format. The description below of the file format should be enough to
implement a function to read this in the language of your choice.
File format
The standard file extension to use for these files is
jcc.
The first line is always:
JAZZPARSER CHORD CORPUS
Chord sequences are preceded by a blank line. They begin with the
line:
BEGIN SEQUENCE
The lines that follow, up to the BEGIN CHORDS line,
contain metadata about the sequence.
- INDEX: sequences are numbered sequentially and this is
  the index of the sequence within the file.
- ID: database id of the sequence. This provides a way
  of referring to a sequence in a corpus that is not tied to its
  position in the file (you might want a different ordering, or
  selection of sequences).
- NAME: unicode name of the song (utf-8 encoded).
- KEY: key of the piece in the source. Chords are stored
  relative to this key. E.g. in C major, a chord with root 5 is F. The
  formatting of this wasn't originally intended to be machine
  readable, so it might be a little inconsistent. It is generally a note
  name (using b and # for flat and sharp)
  followed by major or minor
  (major assumed if omitted).
- BAR LENGTH: integer number of beats per bar (durations
  of chords are stored in beats).
- SOURCE: where the chord sequence was taken from.
  Almost always "The Real Book, Sixth Edition".
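Since no reader is provided, here is a minimal sketch of how the metadata lines might be handled. It assumes each line has the form FIELD: value, which is inferred from the field list above rather than confirmed by it. The helper names parse_metadata and key_to_pitch_class are hypothetical. Key parsing only covers the documented "note name plus optional major/minor" form and returns None for anything else, since the KEY field may be inconsistently formatted.

```python
import re

# Pitch classes of the natural note names.
NOTE_PCS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def parse_metadata(lines):
    """Collect 'FIELD: value' lines into a dict (assumed line layout)."""
    meta = {}
    for line in lines:
        # Split on the first colon only, so song names may contain colons.
        field, sep, value = line.partition(":")
        if sep:  # ignore lines that have no colon at all
            meta[field.strip()] = value.strip()
    return meta


def key_to_pitch_class(key):
    """Return (pitch_class, mode) for a key string, or None if unparseable."""
    match = re.match(r"\s*([A-Ga-g])([b#]*)\s*(major|minor)?", key)
    if match is None:
        return None
    letter, accidentals, mode = match.groups()
    pc = NOTE_PCS[letter.upper()]
    pc += accidentals.count("#") - accidentals.count("b")
    # Major is assumed when the mode is omitted.
    return pc % 12, mode or "major"
```

All values are kept as strings; a caller would convert INDEX, ID and BAR LENGTH with int() as needed.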
Lines between BEGIN CHORDS and END CHORDS
each represent a single chord, with comma-separated fields. The fields
are the following:
- root: equal-temperament pitch class (integer) relative to
  key.
- chord type: chord type label.
- duration: integer number of beats.
- additions: any further additions to the chord notated in the
  input not covered by the chord type (anything above the seventh
  degree).
- bass: integer pitch class of bass note, if written in the
  input (e.g. C7/G). Otherwise blank.
- category: lexical category of annotation, from the jazz CCG
  grammar.
- coordination middle: unresolved dominant/subdominant chord
  which marks the middle point of a coordination. E.g. G7 in (Dm7 G7)
  (A7 Dm7 G7) CM7.
  T or F.
- coordination end: dominant/subdominant sharing its
  resolution with a previously marked coordination-middle chord.
  T or F.
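The format described above could be read along the following lines. This is only a sketch under stated assumptions, not part of the module: the names Chord, parse_chord_line, absolute_root and iter_sequences are hypothetical, each chord line is assumed to hold exactly eight fields with no commas inside a field, and any lines outside the markers described above are ignored.

```python
from collections import namedtuple

Chord = namedtuple(
    "Chord",
    "root chord_type duration additions bass category coord_middle coord_end",
)


def parse_chord_line(line):
    """Parse one comma-separated chord line into a Chord tuple."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 8:
        raise ValueError("expected 8 fields, got %d: %r" % (len(fields), line))
    root, ctype, duration, additions, bass, category, mid, end = fields
    return Chord(
        root=int(root),
        chord_type=ctype,
        duration=int(duration),
        additions=additions,
        bass=int(bass) if bass else None,  # blank when no bass is written
        category=category,
        coord_middle=(mid == "T"),
        coord_end=(end == "T"),
    )


def absolute_root(chord, key_pc):
    """Pitch class of the chord root, given the pitch class of the key."""
    return (chord.root + key_pc) % 12


def iter_sequences(lines):
    """Yield (metadata_lines, chords) for each sequence in the file."""
    lines = iter(lines)
    if next(lines, "").strip() != "JAZZPARSER CHORD CORPUS":
        raise ValueError("not a chord corpus file")
    state, meta, chords = None, [], []
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN SEQUENCE":
            state, meta, chords = "meta", [], []
        elif line == "BEGIN CHORDS":
            state = "chords"
        elif line == "END CHORDS":
            yield meta, chords
            state = None
        elif state == "meta" and line:
            meta.append(line)  # raw metadata line, left unparsed here
        elif state == "chords" and line:
            chords.append(parse_chord_line(line))
```

Any iterable of text lines works as input, for example a file opened with io.open(path, encoding="utf-8"), since song names are utf-8 encoded.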
Author:
Mark Granroth-Wilding <mark.granroth-wilding@ed.ac.uk>
_write_header(index, outfile)
Writes a header to the outfile for this sequence index.
_write_sequence(seq, index, outfile)
Writes the data for one chord sequence to the outfile.
_write_chord(crd, outfile)
Writes a single line of data for a chord to the outfile.
Outputs the sequences in the sequence index to a text file.
- Parameters:
index (jazzparser.data.db_mirrors.SequenceIndex) - index to get sequences from
outfile (file-like object) - file to write to