diff --git a/doc/design/speech/README.MD b/doc/design/speech/README.MD
index cc03aac7b4970a66a5c8c7591aecf6349ae92f8a..4509d6453d6774eba7268004d2fcca041eeaa7fe 100644
--- a/doc/design/speech/README.MD
+++ b/doc/design/speech/README.MD
@@ -142,13 +142,15 @@ TODO by Assignees

-Figure 2. Algorithm for Beam Search Decoder.
+Figure 2. Algorithm for CTC Beam Search Decoder.
-- The **Beam Search Decoder** for DS2 CTC-trained network follows the similar approach in \[[3](#references)\] with a modification for the ambiguous part, as shown in Figure 2.
-- An **external defined scorer** would be passed into the decoder to evaluate a candidate prefix during decoding whenever a space character appended.
-- Such scorer is a unified class, may consisting of language model, word count or any customed evaluators.
-- The **language model** is built from Task 5, with a parameter should be carefully tuned to achieve minimum WER/CER (c.f. Task 7)
+- The **Beam Search Decoder** for the DS2 CTC-trained network follows a similar approach to \[[3](#references)\], as shown in Figure 2, with two important modifications for the ambiguous parts:
+    - 1) in the iterative computation of probabilities, the assignment operation is changed to accumulation, because one prefix may come from different paths;
+    - 2) the if condition ```if l^+ not in A_prev then``` after the probability computation is dropped, because it is hard to understand and seems unnecessary.
+- An **external scorer** is passed into the decoder to evaluate a candidate prefix during decoding, whenever a white space is appended in English decoding or any character is appended in Mandarin decoding.
+- This external scorer consists of a language model, a word count, or any other custom scorers.
+- The **language model** is built in Task 5; its parameters should be carefully tuned to achieve minimum WER/CER (cf. Task 7).
 - This decoder needs to perform with **high efficiency** for the convenience of parameters tuning and speech recognition in reality.
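
The two modifications described in this hunk can be illustrated with a minimal sketch of CTC prefix beam search. It is not the project's implementation: the function name `ctc_beam_search`, the `scorer` callable, and the `score_on` parameter are hypothetical, probabilities are kept in the linear domain for readability (a real decoder would likely work in log space to avoid underflow), and the scorer is assumed to return a multiplicative weight for the extended prefix. Probabilities are accumulated with `+=` because the same prefix can be reached from several parent prefixes, and there is no `if l^+ not in A_prev` branch.

```python
from collections import defaultdict
from typing import Callable, List, Optional, Sequence, Tuple


def ctc_beam_search(
    probs: Sequence[Sequence[float]],
    vocab: Sequence[str],
    beam_size: int = 8,
    blank: int = 0,
    scorer: Optional[Callable[[str], float]] = None,  # hypothetical external scorer
    score_on: Optional[str] = " ",  # " " for English; None scores every character
) -> List[Tuple[str, float]]:
    """Return (prefix, probability) pairs, best first.

    probs[t][i] is the network's probability of symbol i at frame t;
    vocab[i] is the single character for index i (vocab[blank] is unused).
    """
    # prefix -> [prob. of paths ending in blank, prob. of paths ending in non-blank]
    beams = {"": [1.0, 0.0]}
    for frame in probs:
        nxt = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            for idx, p in enumerate(frame):
                if idx == blank:
                    # A blank keeps the prefix unchanged.
                    nxt[prefix][0] += p * (p_b + p_nb)
                    continue
                ch = vocab[idx]
                ext = prefix + ch
                if prefix and ch == prefix[-1]:
                    # A repeated character collapses into the same prefix
                    # unless the previous frame ended in a blank.
                    nxt[prefix][1] += p * p_nb
                    gain = p * p_b
                else:
                    gain = p * (p_b + p_nb)
                if scorer is not None and (score_on is None or ch == score_on):
                    gain *= scorer(ext)
                # Accumulate rather than assign: `ext` may also be reached
                # from another prefix in the same frame.
                nxt[ext][1] += gain
        ranked = sorted(nxt.items(), key=lambda kv: kv[1][0] + kv[1][1], reverse=True)
        beams = dict(ranked[:beam_size])
    return [(prefix, p_b + p_nb) for prefix, (p_b, p_nb) in beams.items()]


# Two frames over {blank, "a", "b"}: "a" is reachable via three paths
# (a-blank, a-a, blank-a), so its mass is 0.48 + 0.32 + 0.08 = 0.88.
best = ctc_beam_search([[0.2, 0.8, 0.0], [0.6, 0.4, 0.0]], ["_", "a", "b"])[0]
```

In this toy input, the prefix "a" receives mass both from the existing prefix "a" and from the empty prefix in the second frame. With plain assignment one of those contributions would be overwritten, which is the ambiguity the first modification addresses.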