Module probability

Extensions to NLTK's probability module.


Author: Mark Granroth-Wilding <mark.granroth-wilding@ed.ac.uk>

Classes
  WittenBellProbDistFix
There's a nasty bug in WittenBellProbDist, but the fix is very simple.
  CutoffFreqDist
Like FreqDist, but returns zero counts for everything with a count less than a given cutoff.
  CutoffConditionalFreqDist
A version of ConditionalFreqDist that uses a CutoffFreqDist for each distribution instead of FreqDist.
  CutoffFreqDistStorer
  CutoffConditionalFreqDistStorer
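
The cutoff behaviour described for CutoffFreqDist can be illustrated with a minimal sketch. It is not the module's implementation: the class name CutoffCounts, its inc method and its constructor arguments are hypothetical, and it wraps collections.Counter rather than extending NLTK's FreqDist.

```python
from collections import Counter


class CutoffCounts:
    """Hypothetical stand-in: reports 0 for any count below the cutoff."""

    def __init__(self, cutoff, samples=()):
        self.cutoff = cutoff
        self._counts = Counter(samples)  # raw counts are kept untouched

    def inc(self, sample, count=1):
        # Record observations; the cutoff is applied only when reading.
        self._counts[sample] += count

    def __getitem__(self, sample):
        count = self._counts[sample]  # Counter gives 0 for unseen samples
        return count if count >= self.cutoff else 0
```

Because the raw counts are stored unchanged, a sample whose count later reaches the cutoff starts reporting its true count again.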
Functions

logprob(prob)
Returns the base 2 log of the given probability (or other float).

add_logs(logx, logy)
Identical to NLTK's nltk.probability.add_logs, but handles the special case where one or both of the numbers is -inf.

sum_logs(logs)
Identical to NLTK's nltk.probability.sum_logs, but uses our version of add_logs and -inf for zero probs.

generate_from_prob_dist(dist)
Generates a sample chosen randomly from the observed samples of an NLTK prob dist, weighted according to their probability.

prob_dist_to_dictionary_prob_dist(dist, mutable=False, samples=None)
Takes a probability distribution estimated in any way (e.g. from a freq dist) and produces a corresponding dictionary prob dist that just stores the probability of every sample.

cond_prob_dist_to_dictionary_cond_prob_dist(dist, mutable=False, samples=None, conditions=None)
Takes a conditional probability distribution which may estimate its probabilities in any way (most likely from a set of frequency distributions) and produces an equivalent dictionary conditional distribution, whose distributions are dictionary prob dists.

estimator_name(name)
Decorator to add a name attribute to the estimator functions.

mle_estimator(fdist, bins)

laplace_estimator(fdist, bins)

witten_bell_estimator(fdist, bins)

good_turing_estimator(fdist, bins)

simple_good_turing_estimator(fdist, bins)

get_estimator_name(estimator)
Variables
  ESTIMATORS = {'mle': mle_estimator, 'laplace': laplace_estimat...
  __package__ = 'jazzparser.utils.nltk'
Function Details

logprob(prob)

Returns the base 2 log of the given probability (or other float). If prob == 0.0, returns -inf.
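
A minimal sketch of the behaviour described, assuming nothing beyond the standard library. The body is an illustration, not the module's source.

```python
import math


def logprob(prob):
    """Return log2(prob), mapping a zero probability to -inf."""
    if prob == 0.0:
        # math.log2(0.0) would raise ValueError, so handle it explicitly.
        return float("-inf")
    return math.log2(prob)
```

For example, logprob(0.25) gives -2.0 and logprob(0.0) gives -inf rather than raising an exception.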

add_logs(logx, logy)

Identical to NLTK's nltk.probability.add_logs, but handles the special case where one or both of the numbers is -inf. NLTK's version gives a nan in this case.
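
The special case can be sketched as follows, together with a sum_logs built on top of it. This is an illustration under assumptions, not the module's source: the constant NEG_INF is a name introduced here, and factoring out the larger exponent is one common way to keep the computation stable.

```python
import math

NEG_INF = float("-inf")  # log2 of a zero probability


def add_logs(logx, logy):
    """Return log2(2**logx + 2**logy), treating -inf as log2(0)."""
    if logx == NEG_INF:
        return logy
    if logy == NEG_INF:
        return logx
    # Factor out the larger exponent so that neither power can overflow.
    base = max(logx, logy)
    return base + math.log2(2.0 ** (logx - base) + 2.0 ** (logy - base))


def sum_logs(logs):
    """Fold add_logs over a sequence; an empty sum is log2(0) = -inf."""
    total = NEG_INF
    for value in logs:
        total = add_logs(total, value)
    return total
```

Returning the other operand when one argument is -inf avoids the -inf minus -inf subtraction that produces nan.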

generate_from_prob_dist(dist)

Generates a sample chosen randomly from the observed samples of an NLTK prob dist, weighted according to their probability. NLTK provides this, but does not allow for the probabilities of the observed samples summing to something other than 1.0, which is exactly what happens when we're smoothing.
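
One way to sample when the observed mass is below 1.0 is to scale the random threshold by the total observed probability. The sketch below assumes only that the distribution exposes NLTK-style samples() and prob() methods; ToyDist and the rng parameter are hypothetical additions for illustration.

```python
import random


def generate_from_prob_dist(dist, rng=random):
    """Draw one observed sample, weighted by dist.prob(sample)."""
    samples = list(dist.samples())
    weights = [dist.prob(s) for s in samples]
    total = sum(weights)  # may be below 1.0 for a smoothed distribution
    threshold = rng.random() * total
    running = 0.0
    for sample, weight in zip(samples, weights):
        running += weight
        if threshold < running:
            return sample
    return samples[-1]  # guard against floating-point rounding


class ToyDist:
    """Hypothetical stand-in exposing samples() and prob()."""

    def __init__(self, probs):
        self._probs = probs

    def samples(self):
        return self._probs.keys()

    def prob(self, sample):
        return self._probs.get(sample, 0.0)
```

With ToyDist({"a": 0.5, "b": 0.2}) the observed mass is 0.7, and the sketch still returns "a" about five times out of seven.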

prob_dist_to_dictionary_prob_dist(dist, mutable=False, samples=None)

Takes a probability distribution estimated in any way (e.g. from a freq dist) and produces a corresponding dictionary prob dist that just stores the probability of every sample.

Can be used to turn any kind of prob dist into a dictionary-based one, including a MutableProbDist.

Parameters:
  • mutable (bool) - if True, the returned dist is a mutable prob dist

cond_prob_dist_to_dictionary_cond_prob_dist(dist, mutable=False, samples=None, conditions=None)

Takes a conditional probability distribution which may estimate its probabilities in any way (most likely from a set of frequency distributions) and produces an equivalent dictionary conditional distribution, whose distributions are dictionary prob dists.

Parameters:
  • mutable (bool) - if True, the returned dist contains mutable prob dists
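
The two conversions can be sketched as plain tabulation. This is an illustration, not the module's source: it returns ordinary dicts instead of NLTK dictionary prob dists, it ignores the mutable flag, and it assumes that samples and conditions, when given, restrict which entries are stored. ToyDist and ToyCondDist are hypothetical stand-ins for objects with NLTK-style samples(), prob() and conditions() methods.

```python
def to_prob_dict(dist, samples=None):
    """Tabulate dist.prob(s) for each sample (default: dist.samples())."""
    if samples is None:
        samples = dist.samples()
    return {s: dist.prob(s) for s in samples}


def to_cond_prob_dict(cond_dist, samples=None, conditions=None):
    """Tabulate the distribution stored under each condition."""
    if conditions is None:
        conditions = cond_dist.conditions()
    return {c: to_prob_dict(cond_dist[c], samples) for c in conditions}


class ToyDist:
    """Hypothetical stand-in exposing samples() and prob()."""

    def __init__(self, probs):
        self._probs = probs

    def samples(self):
        return self._probs.keys()

    def prob(self, sample):
        return self._probs.get(sample, 0.0)


class ToyCondDist:
    """Hypothetical stand-in exposing conditions() and indexing."""

    def __init__(self, dists):
        self._dists = dists

    def conditions(self):
        return self._dists.keys()

    def __getitem__(self, condition):
        return self._dists[condition]
```

In real use, each inner dict could be handed to nltk.probability.DictionaryProbDist to obtain a prob dist object again.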

mle_estimator(fdist, bins)

Decorators:
  • @estimator_name('mle')

laplace_estimator(fdist, bins)

Decorators:
  • @estimator_name('laplace')

witten_bell_estimator(fdist, bins)

Decorators:
  • @estimator_name('witten-bell')

good_turing_estimator(fdist, bins)

Decorators:
  • @estimator_name('good_turing')

simple_good_turing_estimator(fdist, bins)

Decorators:
  • @estimator_name('good_turing')

Variables Details

ESTIMATORS

Value:
{'mle': mle_estimator,
 'laplace': laplace_estimator,
 'witten-bell': witten_bell_estimator,
 'good-turing': good_turing_estimator,
 'simple-good-turing': simple_good_turing_estimator,}
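
The decorator and registry pattern can be sketched as follows. This is an assumption-laden illustration: toy_mle_estimator and TOY_ESTIMATORS are hypothetical, the toy estimator returns a plain dict rather than an NLTK prob dist, and get_estimator_name is assumed to read back the attribute that the decorator sets.

```python
def estimator_name(name):
    """Decorator factory: attach a name attribute to an estimator."""
    def decorate(func):
        func.name = name
        return func
    return decorate


@estimator_name("mle")
def toy_mle_estimator(fdist, bins):
    # Hypothetical stand-in: relative frequencies from a plain count dict.
    # bins is unused here; it is kept to match the (fdist, bins) signature.
    total = sum(fdist.values())
    return {sample: count / total for sample, count in fdist.items()}


# Hypothetical registry in the style of ESTIMATORS: name -> estimator.
TOY_ESTIMATORS = {"mle": toy_mle_estimator}


def get_estimator_name(estimator):
    # Assumption: the name is the attribute set by estimator_name.
    return getattr(estimator, "name", None)
```

Looking an estimator up by key and reading its name back gives the same string only when the decorator argument matches the registry key.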