object --+
         |
        Tagger
The superclass of all taggers. Subclass this to create tagger components.
The tagger returns a probability along with each sign. For the C&C supertagging approach, these are posterior probabilities: that is, Pr(tag | observations). For the PCFG parser approach, taggers must instead yield likelihoods: Pr(observation | tag). A tagger of this sort should have POSTERIOR set to False.
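The two conventions are related by Bayes' rule: Pr(observation | tag) = Pr(tag | observation) * Pr(observation) / Pr(tag). For a fixed observation, dividing a posterior by the tag's prior therefore gives the likelihood up to a constant factor. The sketch below illustrates that conversion only; the function name and the dictionary inputs are hypothetical and are not part of the Tagger interface.

```python
def posterior_to_scaled_likelihood(posteriors, priors):
    """Convert Pr(tag | obs) into values proportional to Pr(obs | tag).

    Both arguments are hypothetical dicts mapping tag names to
    probabilities; neither is part of the documented Tagger interface.
    """
    scaled = {}
    for tag, posterior in posteriors.items():
        prior = priors[tag]
        if prior <= 0.0:
            raise ValueError("prior for tag %r must be positive" % tag)
        # Bayes' rule: Pr(obs | tag) = Pr(tag | obs) * Pr(obs) / Pr(tag).
        # Pr(obs) is the same for every tag, so it is dropped here.
        scaled[tag] = posterior / prior
    return scaled


# Made-up numbers for a single observation.
posteriors = {"T": 0.6, "D": 0.3, "S": 0.1}
priors = {"T": 0.5, "D": 0.25, "S": 0.25}
scaled = posterior_to_scaled_likelihood(posteriors, priors)
```

The results are only proportional to the likelihoods, which is enough when they are compared across tags for the same observation.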
COMPATIBLE_FORMALISMS =

INPUT_TYPES =
List of allowed input datatypes.

LEXICAL_PROBABILITY = False
Some models provide lexical probabilities that the parsing models can use.

TAGGER_OPTIONS =
Tagger-specific options.

shell_tools =
Interactive shell tools available when this tagger is used.

input_length
Should return the number of words (chords) in the input, or some other measure of input length appropriate to the type of tagger.

name
The tagger must hold a reference to the grammar being used to parse the input. It must also be given the full input when instantiated. The format of this input depends on the tagger: for example, it might be a string or a MIDI file.

Normally, options are validated when the tagger is instantiated. This allows you to check them before that.
Gets all signs that the tagger will return, regardless of offset. This just uses get_signs to get the signs for every offset.
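As a rough illustration of "every offset", the sketch below keeps calling a get_signs-style function with increasing offsets until it returns nothing. The stopping rule (an empty list means no more signs) and the stand-in function are assumptions; the source does not say how the real method decides that the offsets are exhausted.

```python
def collect_all_signs(get_signs):
    """Gather the results of get_signs(offset) for offset = 0, 1, 2, ...

    Assumes (hypothetically) that get_signs returns an empty list once
    the tagger has nothing further to offer.
    """
    all_signs = []
    offset = 0
    while True:
        batch = get_signs(offset)
        if not batch:
            break
        all_signs.extend(batch)
        offset += 1
    return all_signs


# Stand-in for a tagger's get_signs: two batches, then nothing.
_batches = [["sign_a", "sign_b"], ["sign_c"]]


def fake_get_signs(offset=0):
    return _batches[offset] if offset < len(_batches) else []


result = collect_all_signs(fake_get_signs)
```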
Returns a list of tuples. Each signtup is a (sign, tag, probability) tuple representing a sign that the tagger wishes to add to the chart in this position. How many are returned is up to the tagger (it may wish to return more in cases where there are no clear winners, for example). If the tag is not found in the grammar, sign will be None. The returned list is sorted by probability, highest first.

offset may be set >0 in order to retrieve further signs once some have already been returned. If offset=k, the tagger should disregard all the signs that would have been returned for offset<k and return the next batch, as many as it sees fit. offset is incremented each time the parse fails.

The simplest approach, and that employed by most taggers, has some signs for each word and none spanning more than one word. That is, the tuples in the list would be of the form

Note: This functionality used to be provided by
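One way to read the offset contract is as paging through a ranked list. The sketch below is a minimal illustration under that assumption: it takes the (sign, tag, probability) tuples for a single word, sorts them by probability, and returns the batch for a given offset. The fixed batch size, the function name, and the placeholder sign strings are hypothetical; a real tagger chooses how many signs to return each time, and the outer tuple form that wraps each signtup is not shown because it is missing from the text above.

```python
def signs_for_offset(signtups, offset=0, batch_size=2):
    """Return the offset-th batch of (sign, tag, probability) tuples.

    batch_size is a hypothetical fixed size; the documented contract
    lets each tagger return as many signs as it sees fit.
    """
    # Highest probability first, as the contract requires.
    ranked = sorted(signtups, key=lambda st: st[2], reverse=True)
    start = offset * batch_size
    # Slicing past the end yields an empty list, i.e. nothing left.
    return ranked[start:start + batch_size]


# sign is None where the tag is not found in the grammar.
candidates = [
    ("<sign T>", "T", 0.5),
    (None, "X", 0.1),
    ("<sign D>", "D", 0.3),
    ("<sign S>", "S", 0.1),
]
first = signs_for_offset(candidates, offset=0)
second = signs_for_offset(candidates, offset=1)
```

Here the first call yields the two most probable tags, and a second call after a failed parse yields the remaining, less probable ones.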
Returns a list of string representations of the inputs. This is just a convenience function, which uses whatever representation gets returned by get_word() to produce a representation of the whole input.
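The convenience function described above could be sketched as follows. ToyTagger is a hypothetical stand-in rather than the real Tagger class, and the method name string_input is an assumption, since the actual name does not appear in this text; only get_word() and input_length are taken from the documentation.

```python
class ToyTagger:
    """Hypothetical stand-in, not the real Tagger class."""

    def __init__(self, words):
        self.words = list(words)

    @property
    def input_length(self):
        # Number of words (chords) in the input.
        return len(self.words)

    def get_word(self, index):
        # May return any object with a sensible __str__.
        return self.words[index]

    def string_input(self):
        # Build the whole-input representation from get_word().
        return [str(self.get_word(i)) for i in range(self.input_length)]


strings = ToyTagger(["C", "Am7", 7]).string_input()
```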
Returns, as a float, the probability with which the tagger judges that the given span will be assigned the given sign. If
Returns the input word at this index. This does not need to be a string, but must have a sensible __str__, so that it can be converted to a readable string. The purpose of this is to provide a readable form of the input for the parser to store in derivation traces.
Returns the duration of the word at this index if durations are available. Otherwise raises an AttributeError.
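A caller therefore has to be prepared for the AttributeError. The sketch below shows one way that behaviour might arise and how it could be handled; the class, the method name get_duration, and the durations attribute are all hypothetical, as none of them is named in this text.

```python
class TimedInputTagger:
    """Hypothetical stand-in, not the real Tagger class."""

    def __init__(self, words, durations=None):
        self.words = list(words)
        # Only set the attribute when durations are actually available.
        if durations is not None:
            self.durations = list(durations)

    def get_duration(self, index):
        # If self.durations was never set, this attribute lookup
        # raises AttributeError, matching the documented behaviour.
        return self.durations[index]


def duration_or_none(tagger, index):
    """Return the duration at index, or None if durations are unavailable."""
    try:
        return tagger.get_duration(index)
    except AttributeError:
        return None


timed = TimedInputTagger(["C", "G7"], durations=[4, 2])
untimed = TimedInputTagger(["C", "G7"])
```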
INPUT_TYPES
List of allowed input datatypes. See jazzparser.data.input.INPUT_TYPES.

LEXICAL_PROBABILITY
Some models provide lexical probabilities that the parsing models can use. They should set this to True. They should also provide a method

TAGGER_OPTIONS
Tagger-specific options. List of ModuleOptions.

input_length
Should return the number of words (chords) in the input, or some other measure of input length appropriate to the type of tagger.

name
| Generated by Epydoc 3.0.1 on Mon Nov 26 16:04:58 2012 | http://epydoc.sourceforge.net |