Package jazzparser :: Package utils :: Package nltk :: Module probability :: Class CutoffFreqDist

Class CutoffFreqDist

           object --+        
                    |        
                 dict --+    
                        |    
nltk.probability.FreqDist --+
                            |
                           CutoffFreqDist

Like FreqDist, but returns zero counts for everything with a count less than a given cutoff. Also adjusts the total count to account for the lost counts.
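The cutoff behaviour described above can be sketched with plain dictionaries. This is an illustration only, not the class's implementation: `apply_cutoff` is a hypothetical helper, and it assumes that "adjusting the total" means subtracting the counts that fell below the cutoff.

```python
# Illustrative sketch only: plain dicts stand in for FreqDist, and
# apply_cutoff is a hypothetical helper, not part of jazzparser.
def apply_cutoff(raw_counts, cutoff):
    """Return (visible_counts, adjusted_total, lost_total)."""
    visible = {}
    lost = 0
    for sample, count in raw_counts.items():
        if count < cutoff:
            # Below the cutoff: the sample reads as zero and its count is lost
            visible[sample] = 0
            lost += count
        else:
            visible[sample] = count
    # Assumption: the adjusted total is the raw total minus the lost counts
    total = sum(raw_counts.values()) - lost
    return visible, total, lost

raw = {"C": 5, "G7": 3, "Dm": 1, "F#dim": 1}
visible, total, lost = apply_cutoff(raw, cutoff=2)
print(visible)      # samples seen once now read as zero
print(total, lost)  # adjusted total and the counts that were dropped
```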

Instance Methods

__init__(self, cutoff, *args, **kwargs)
Construct a new frequency distribution.
Returns: new empty dictionary

__getitem__(self, key)
x[y]

raw_counts(self)
Returns the raw counts (i.e. without the cutoff applied) as a dictionary.

raw_count(self, sample)
Returns the raw count of this sample (doesn't apply a cutoff).

N(self)
The total number of sample outcomes that have been recorded by this FreqDist.
Returns: int

B(self)
Slightly more complicated than in the superclass, because only samples that have non-zero counts after the cutoff has been applied are counted.
Returns: int

__len__(self)
len(x)

freq(self, sample)
Overridden because the superclass calculates the frequency from the internal _N rather than from N().
Returns: float

copy(self)
Create a copy of this frequency distribution.
Returns: FreqDist

__add__(self, other)
Returns a CutoffFreqDist like this one, but with counts from the other added.

_reset_caches(self)
Adds our own caches to those reset by the superclass.

lost_N(self)
The number of counts lost by applying the cutoff.

_get_cutoff(self)
Make cutoff a read-only attribute.

_sort_keys_by_value(self)
Overridden because dict.items(self) accesses the non-cutoff values.

Inherited from nltk.probability.FreqDist: Nr, __eq__, __ge__, __gt__, __iter__, __le__, __lt__, __ne__, __repr__, __setitem__, __str__, clear, count, hapaxes, inc, items, iteritems, iterkeys, itervalues, keys, max, plot, pop, popitem, samples, sorted, sorted_samples, tabulate, update, values

Inherited from dict: __cmp__, __contains__, __delitem__, __getattribute__, __new__, __sizeof__, fromkeys, get, has_key, setdefault, viewitems, viewkeys, viewvalues

Inherited from object: __delattr__, __format__, __reduce__, __reduce_ex__, __setattr__, __subclasshook__

Class Variables

Inherited from dict: __hash__

Properties
  cutoff
Read-only attribute holding the cutoff value.

Inherited from object: __class__

Method Details

__init__(self, cutoff, *args, **kwargs)
(Constructor)

Construct a new frequency distribution. If samples is given, then the frequency distribution will be initialized with the count of each object in samples; otherwise, it will be initialized to be empty.

In particular, FreqDist() returns an empty frequency distribution; and FreqDist(samples) first creates an empty frequency distribution, and then calls update with the list samples.

Parameters:
  • samples - The samples to initialize the frequency distribution with.
Returns:
new empty dictionary

Overrides: object.__init__
(inherited documentation)

__getitem__(self, key)
(Indexing operator)

x[y]

Overrides: dict.__getitem__
(inherited documentation)

raw_counts(self)

Returns the raw counts (i.e. without the cutoff applied) as a dictionary. This could, for example, be used as init data to another FreqDist.

N(self)

Returns: int
The total number of sample outcomes that have been recorded by this FreqDist. For the number of unique sample values (or bins) with counts greater than zero, use FreqDist.B().
Overrides: nltk.probability.FreqDist.N
(inherited documentation)

B(self)

Slightly more complicated than in the superclass, because only samples that have non-zero counts after the cutoff has been applied are counted.

Returns: int
The total number of sample values (or bins) that have counts greater than zero. For the total number of sample outcomes recorded, use FreqDist.N(). (FreqDist.B() is the same as len(FreqDist).)
Overrides: nltk.probability.FreqDist.B
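
As a rough illustration of the bin count described here, the sketch below counts only samples whose count is positive and not below the cutoff. `count_bins` is a hypothetical stand-alone function over a plain dict, not the method's actual code.

```python
# Illustrative sketch only: count_bins is a hypothetical helper that
# mimics the described behaviour of B() on a plain dict of raw counts.
def count_bins(raw_counts, cutoff):
    """Number of samples whose count is non-zero once the cutoff is applied."""
    return sum(
        1 for count in raw_counts.values()
        if count > 0 and count >= cutoff
    )

raw = {"a": 4, "b": 1, "c": 2, "d": 0}
bins_without_cutoff = count_bins(raw, cutoff=0)  # every sample with a positive count
bins_with_cutoff = count_bins(raw, cutoff=2)     # "b" no longer counts as a bin
print(bins_without_cutoff, bins_with_cutoff)
```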

__len__(self)
(Length operator)

len(x)

Overrides: dict.__len__
(inherited documentation)

freq(self, sample)

Overridden because the superclass calculates the frequency from the internal _N rather than from N().

Parameters:
  • sample - the sample whose frequency should be returned.
Returns: float
The frequency of a given sample.
Overrides: nltk.probability.FreqDist.freq
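
To see why the choice of denominator matters, the sketch below compares a frequency computed against the raw total with one computed against a cutoff-adjusted total. `cutoff_freq` is a hypothetical function over a plain dict, and it assumes the adjusted total is the sum of the counts that survive the cutoff.

```python
# Illustrative sketch only: cutoff_freq is a hypothetical helper, and the
# adjusted total is assumed to be the sum of counts that survive the cutoff.
def cutoff_freq(raw_counts, cutoff, sample):
    """Frequency of sample relative to the cutoff-adjusted total."""
    kept = {s: c for s, c in raw_counts.items() if c >= cutoff}
    total = sum(kept.values())
    if total == 0:
        return 0.0
    return kept.get(sample, 0) / total

raw = {"a": 6, "b": 3, "c": 1}
adjusted = cutoff_freq(raw, 2, "a")   # denominator excludes the lost count of "c"
naive = raw["a"] / sum(raw.values())  # denominator still includes it
print(adjusted, naive)
```

With the adjusted denominator, the frequencies of the surviving samples sum to one; with the raw total they would not.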

copy(self)

Create a copy of this frequency distribution.

Returns: FreqDist
A copy of this frequency distribution object.
Overrides: dict.copy
(inherited documentation)

__add__(self, other)
(Addition operator)

Returns a CutoffFreqDist like this one, but with counts from the other added. The other may only be another CutoffFreqDist.

Overrides: nltk.probability.FreqDist.__add__
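
One plausible reading of this addition, shown here purely as an assumption, is that the raw counts are summed first and the cutoff is applied to the result. Under that reading a sample that falls below the cutoff in both operands can still survive in the sum. `add_then_cutoff` is a hypothetical helper over plain dicts.

```python
from collections import Counter

# Illustrative sketch only: add_then_cutoff is hypothetical, and summing
# raw counts before applying the cutoff is an assumption about __add__.
def add_then_cutoff(raw_a, raw_b, cutoff):
    """Sum two raw count dicts, then zero every count below the cutoff."""
    combined = Counter(raw_a) + Counter(raw_b)
    return {s: (c if c >= cutoff else 0) for s, c in combined.items()}

left = {"x": 1, "y": 3}
right = {"x": 1, "z": 1}
merged = add_then_cutoff(left, right, cutoff=2)
print(merged)  # "x" is below the cutoff in each operand but not in the sum
```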

_reset_caches(self)

Adds our own caches to those reset by the superclass.

Overrides: nltk.probability.FreqDist._reset_caches

_sort_keys_by_value(self)

Overridden because dict.items(self) accesses the non-cutoff values.

Overrides: nltk.probability.FreqDist._sort_keys_by_value

Property Details

cutoff

Read-only attribute holding the cutoff value.

Get Method:
_get_cutoff(self) - Make cutoff a read-only attribute
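
A getter-only property is the usual way to make an attribute read-only in Python. The sketch below shows the pattern on a hypothetical `CutoffHolder` class; the `_cutoff` storage name is an assumption and is not taken from the class documented here.

```python
# Illustrative sketch only: CutoffHolder and its _cutoff attribute are
# hypothetical names used to demonstrate a getter-only property.
class CutoffHolder(object):
    def __init__(self, cutoff):
        self._cutoff = cutoff

    def _get_cutoff(self):
        return self._cutoff

    # No setter is supplied, so assigning to .cutoff raises AttributeError
    cutoff = property(_get_cutoff)

holder = CutoffHolder(3)
try:
    holder.cutoff = 5
except AttributeError:
    assignment_blocked = True
else:
    assignment_blocked = False
print(holder.cutoff, assignment_blocked)
```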