
Module model

Generic n-gram model implementation, using NLTK's probability handling.

NLTK provides an n-gram POS tagger, but it can only assign the most likely tag sequence to a sequence of observations. It does not calculate probabilities, so it is of no use for our supertagger component. Here I provide a generic n-gram model that uses NLTK's probability handling.


Author: Mark Granroth-Wilding <mark.granroth-wilding@ed.ac.uk>

Classes
  NgramModel
A general n-gram model, trained on some labelled data.
  PrecomputedNgramModel
Overrides parts of NgramModel to provide exactly the same interface, but stores the precomputed transition matrix and uses this to provide transition probabilities.
  NgramError
Functions
matrix_log_probs_to_probs(matrix)
Utility function for methods below.
sum_matrix_dims(matrix, dims=2)
Takes an n-dimensional matrix and sums over all but the first dims dimensions, returning a dims-dimensional matrix.
_all_indices(length, num_labels)
Function to generate all index n-grams of a given length.
Variables
  NGRAM_JOINER = '::'
  logger = logging.getLogger("main_logger")
  __package__ = 'jazzparser.utils.nltk.ngram'
Function Details

matrix_log_probs_to_probs(matrix)

Utility function for methods below. Converts all the probabilities in the time-state matrix from log probabilities to probabilities.
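
The conversion described above can be sketched as follows. This is an illustration only, not the module's actual source: the function name log_probs_to_probs, the list-of-lists layout (one row per time step, one entry per state) and the choice to return a new matrix are all assumptions. The default base of 2 is also an assumption, chosen because NLTK's logprob methods return base-2 logarithms.

```python
def log_probs_to_probs(matrix, base=2.0):
    """Return a new time-state matrix with every log probability exponentiated.

    Hypothetical helper: `matrix` is assumed to be a list of rows (time steps),
    each row a list of log probabilities (one per state), taken in `base`.
    """
    # base ** -inf evaluates to 0.0, so impossible states map to probability 0
    return [[base ** log_prob for log_prob in row] for row in matrix]


# Two time steps, two states, base-2 log probabilities
log_matrix = [[-1.0, -1.0], [-2.0, 0.0]]
prob_matrix = log_probs_to_probs(log_matrix)
```

Returning a copy keeps the log-domain matrix available for further computation; the real function may well modify its argument in place instead.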

sum_matrix_dims(matrix, dims=2)

Takes an n-dimensional matrix and sums over all but the first dims dimensions, returning a dims-dimensional matrix. Later versions of NumPy (1.7 onwards) can do this directly, because sum accepts a tuple of axes.
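
As a rough sketch of the same reduction in a recent NumPy, the trailing axes can be passed to sum as a tuple. The name sum_trailing_dims is hypothetical and this is not the module's own implementation, which presumably had to work with older NumPy versions.

```python
import numpy as np


def sum_trailing_dims(matrix, dims=2):
    """Sum over every axis after the first `dims`, keeping the leading axes.

    Hypothetical stand-in for sum_matrix_dims, assuming a NumPy version
    whose sum() accepts a tuple of axes.
    """
    matrix = np.asarray(matrix)
    # Axes dims, dims+1, ..., ndim-1 are the ones to collapse
    trailing_axes = tuple(range(dims, matrix.ndim))
    return matrix.sum(axis=trailing_axes)


# A 2x3x4 array reduced to its first two dimensions
example = np.arange(24).reshape(2, 3, 4)
reduced = sum_trailing_dims(example, dims=2)
```

Here reduced has shape (2, 3), and each entry is the sum of the four values along the last axis of example.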