Module output

Output chord corpus data to a text file that others can use.

Data structures and utilities for loading, editing, converting and saving annotated chord sequence data are provided elsewhere in the codebase. The data is stored either as an SQLite database or as a pickled Python object, neither of which is much use to other people. This format is designed to be easily readable by others.

I don't currently provide any implementation of reading this file format, since all scripts take their input from an internally-used format. The description below of the file format should be enough to implement a function to read this in the language of your choice.

File format

The standard file extension for these files is jcc.

The first line is always:

JAZZPARSER CHORD CORPUS

Chord sequences are preceded by a blank line. They begin with the line:

BEGIN SEQUENCE

The lines that follow, up to the BEGIN CHORDS line, contain metadata about the sequence.

INDEX: sequences are numbered sequentially and this is the index of the sequence within the file.
ID: database id of the sequence. This provides a way of referring to a sequence in a corpus that is not tied to its position in the file (you might want a different ordering, or selection of sequences).
NAME: unicode name of the song (utf-8 encoded).
KEY: key of the piece in the source. Chord roots are stored relative to this key: e.g. in C major, a chord with root 5 is F. The format of this field was not originally intended to be machine-readable, so it may be slightly inconsistent. It is generally a note name (using b and # for flat and sharp) followed by major or minor (major is assumed if omitted).
BAR LENGTH: integer number of beats per bar (durations of chords are stored in beats).
SOURCE: where the chord sequence was taken from. Almost always "The Real Book, Sixth Edition".
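
The key-relative arithmetic described under KEY can be illustrated with a short sketch. This is not part of jazzparser: the names key_pitch_class and absolute_root are hypothetical, the code assumes the KEY value starts with a note name such as "C", "Bb" or "F#", and the choice of flat spellings for the output is arbitrary. Since the KEY field may be inconsistent, real data may need more forgiving handling.

```python
# Pitch classes of the natural notes, with C = 0
NATURAL_PITCH_CLASSES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
# One arbitrary spelling per pitch class (flats are used for the black notes)
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def key_pitch_class(key):
    """Pitch class of the tonic of a KEY value such as "Bb major" or "F#".

    Assumption: the value begins with a note name; anything after the
    first whitespace (e.g. "major", "minor") is ignored.
    """
    tonic = key.split()[0]
    pitch_class = NATURAL_PITCH_CLASSES[tonic[0].upper()]
    for accidental in tonic[1:]:
        if accidental == "b":
            pitch_class -= 1
        elif accidental == "#":
            pitch_class += 1
    return pitch_class % 12


def absolute_root(key, relative_root):
    """Note name of a chord root that is stored relative to the key."""
    return NOTE_NAMES[(key_pitch_class(key) + relative_root) % 12]


print(absolute_root("C major", 5))  # F
print(absolute_root("Bb", 7))       # F
```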

Lines between BEGIN CHORDS and END CHORDS each represent a single chord, with comma-separated fields. The fields are the following:

root: equal-temperament pitch class (integer) relative to key.
chord type: chord type label.
duration: integer number of beats.
additions: any further additions to the chord notated in the input not covered by the chord type (anything above the seventh degree).
bass: integer pitch class of bass note, if written in the input (e.g. C7/G). Otherwise blank.
category: lexical category of annotation, from the jazz CCG grammar.
coordination middle: unresolved dominant/subdominant chord which marks the middle point of a coordination. E.g. G7 in (Dm7 G7) (A7 Dm7 G7) CM7. T or F.
coordination end: dominant/subdominant sharing its resolution with a previously marked coordination-middle chord. T or F.
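
Since no reader is provided, the description above might be turned into one roughly as follows. This is a minimal sketch under stated assumptions, not an implementation taken from jazzparser: the names read_jcc and parse_chord_line are hypothetical, metadata lines are assumed to have the form "FIELD: value", each chord line is assumed to have exactly the eight fields listed above with no commas inside a field, and a sequence is taken to end at its END CHORDS line. Metadata values are left as strings.

```python
def parse_chord_line(line):
    """Parse one line found between BEGIN CHORDS and END CHORDS.

    Assumes eight comma-separated fields in the documented order.
    """
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 8:
        raise ValueError("expected 8 fields, got %d: %r" % (len(fields), line))
    root, chord_type, duration, additions, bass, category, middle, end = fields
    return {
        "root": int(root),
        "chord_type": chord_type,
        "duration": int(duration),
        "additions": additions,
        # The bass field is blank when no bass note was written
        "bass": int(bass) if bass else None,
        "category": category,
        "coordination_middle": middle == "T",
        "coordination_end": end == "T",
    }


def read_jcc(lines):
    """Return a list of (metadata, chords) pairs, one per sequence.

    `lines` is any iterable of text lines, e.g. an open file.
    """
    sequences = []
    metadata, chords, state = None, None, "outside"
    for line in lines:
        line = line.strip()
        if line == "BEGIN SEQUENCE":
            metadata, chords, state = {}, [], "metadata"
        elif line == "BEGIN CHORDS" and state == "metadata":
            state = "chords"
        elif line == "END CHORDS" and state == "chords":
            sequences.append((metadata, chords))
            state = "outside"
        elif state == "metadata" and line:
            # Split on the first colon only: song names may contain colons
            field, sep, value = line.partition(":")
            if sep:
                metadata[field.strip()] = value.strip()
        elif state == "chords" and line:
            chords.append(parse_chord_line(line))
        # Anything else (file header, blank lines) is ignored
    return sequences
```

Because NAME is utf-8 encoded, a file would be opened with an explicit encoding, e.g. open(path, encoding="utf-8"), before being passed to read_jcc.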

Author: Mark Granroth-Wilding <mark.granroth-wilding@ed.ac.uk>

Functions

output_sequence_index(index, outfile)
Outputs the sequences in the sequence index to a text file.

_write_header(index, outfile)
Writes a header to the outfile for this sequence index.

_write_sequence(seq, index, outfile)
Writes the data for one chord sequence to the outfile.

_write_chord(crd, outfile)
Writes a single line of data for a chord to the outfile.

Variables

__package__ = None

Function Details

output_sequence_index(index, outfile)

Outputs the sequences in the sequence index to a text file.

Parameters:

index (jazzparser.data.db_mirrors.SequenceIndex) - index to get sequences from
outfile (file-like object) - file to write to