Pruning speedups beam search (#115) · Issue · PaddlePaddle / models

You need to sign in or sign up before continuing.

Pruning speedups beam search

Created by: kuke

The ctc beam search in DS2, or prefix beam search consists of appending candidate characters to prefixes and repeatedly looking up the n-gram language model. Both the two processes are time-consuming, and bring difficulty in parameters tuning and deployment.

A proven effective way is to prune the beam search. Specifically, in extending prefix only the fewest number of characters whose cumulative probability is at least p need to be considered, instead of all the characters in the vocabulary. When p is taken 0.99 as recommended by the DS2 paper, about 20x speedup is yielded in English transcription than that without pruning, with very little loss in accuracy. And for the Mandarin, the speedup ratio is reported to be up to 150x.

Due to pruning, the tuning of parameters gets more efficiently. There are two important parameters alpha and beta in beam search, associated with language model and word insertion respectively. With a more acceptable speed, alpha and beta can be searched elaborately. And the relation between WER and the two parameters turns out to be:

With the optimal parameters alpha=0.26 and beta=0.1 as shown in above figure, currently the beam search decoder has decreased WER to 13% from the best path decoding accuracy 22%.

PaddlePaddle / models 大约 2 年 前同步成功

Pruning speedups beam search

PaddlePaddle / models
大约 2 年前同步成功