beamed_batch_sizes(probabilities, batch_ratio, max_batch=0)
An alternative to batch_sizes that processes many lists of probabilities
at once (i.e. one list per word).
Each returned list gives, for one word, the number of values that should be
returned in each batch. Successive batches represent a progressively widening
probability beam, which is the same across all the words. The main difference
between this and applying batch_sizes to each word independently is that some
words may end up with empty batches, when the beam is not yet wide enough to
reach that word's next highest probability.
The one exception is the first batch, which always contains at least one
value for every word, even if this means effectively lowering the beam for
that one word.
Every batch includes at least one value for at least one word.
Every word is assumed to have at least one value.
If max_batch is non-zero, at most max_batch items are included in each
batch for each word.
- Parameters:
probabilities (list of lists of floats) - one list per word, holding the probabilities to batch up.
batch_ratio (float) - maximum ratio between the highest probability in a particular batch and the lowest (over all words).
max_batch - if non-zero, the maximum number of items included in each batch for each word (default 0).
- Returns: list of lists of ints
- one list per word, giving the size of each batch for that word.
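
The behaviour described above can be illustrated with a minimal sketch. This is not the library's source code: the name beamed_batch_sizes_sketch is hypothetical, and the sketch assumes that each word's probabilities are sorted in descending order, that they are plain (non-negative) probabilities, and that batch_ratio lies in (0, 1], so that a value joins a batch when it is at least batch_ratio times the highest probability still unassigned over all words.

```python
def beamed_batch_sizes_sketch(probabilities, batch_ratio, max_batch=0):
    """Illustrative sketch only; not the library's implementation.

    Assumptions (not confirmed by the documentation):
      * each inner list is sorted in descending order;
      * 0 < batch_ratio <= 1, and a value p joins the current batch when
        p >= batch_ratio * (highest probability not yet assigned).
    """
    positions = [0] * len(probabilities)   # next unassigned index per word
    sizes = [[] for _ in probabilities]    # batch sizes collected per word
    first_batch = True

    while any(pos < len(probs) for pos, probs in zip(positions, probabilities)):
        # The beam is anchored on the best remaining value over all words.
        top = max(probs[pos]
                  for pos, probs in zip(positions, probabilities)
                  if pos < len(probs))
        threshold = top * batch_ratio

        for word, probs in enumerate(probabilities):
            start = end = positions[word]
            # Take values inside the beam, up to max_batch if it is non-zero.
            while end < len(probs) and probs[end] >= threshold:
                if max_batch and end - start >= max_batch:
                    break
                end += 1
            # First-batch exception: every word gets at least one value.
            if first_batch and end == start and start < len(probs):
                end = start + 1
            sizes[word].append(end - start)
            positions[word] = end
        first_batch = False

    return sizes


# Two words; the second word's values mostly fall outside the shared beam.
example = beamed_batch_sizes_sketch([[0.5, 0.4, 0.1], [0.2, 0.01]], 0.5)
print(example)  # [[2, 1, 0], [1, 0, 1]]
```

In this example the second word receives one value in the first batch only because of the first-batch exception (0.2 is below the beam threshold of 0.25), then an empty second batch, and its remaining value arrives in a third batch once the first word is exhausted.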