Fix detokenization for char tokens

8ebd4245 · Hui Zhang · eef8847a
Showing 2 changed files with 2 additions and 8 deletions

deepspeech/frontend/featurizer/text_featurizer.py +1 -1

examples/aishell/s1/README.md +1 -7

--- a/deepspeech/frontend/featurizer/text_featurizer.py
+++ b/deepspeech/frontend/featurizer/text_featurizer.py
@@ -140,7 +140,7 @@ class TextFeaturizer():
        Returns:
           str: text string.
        """
-        tokens = tokens.replace(SPACE, " ")
+        tokens = [t.replace(SPACE, " ") for t in tokens ]
        return "".join(tokens)

    def word_tokenize(self, text):
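
Note on the hunk above: the removed line calls `.replace` on `tokens` itself, which only works when `tokens` is a single string. The sketch below illustrates why the per-token comprehension is needed when `tokens` is a list of char tokens. It is a standalone approximation, not the project's code: the function names `detokenize_old` and `detokenize_new` are hypothetical, and the value of `SPACE` is assumed here for illustration (the real constant is defined elsewhere in the repository).

```python
# Assumed placeholder value; the real SPACE constant is defined in the repo.
SPACE = "<space>"


def detokenize_old(tokens):
    """Pre-fix logic: str.replace is called on the whole argument."""
    tokens = tokens.replace(SPACE, " ")
    return "".join(tokens)


def detokenize_new(tokens):
    """Post-fix logic: replace the space marker token by token, then join."""
    tokens = [t.replace(SPACE, " ") for t in tokens]
    return "".join(tokens)


if __name__ == "__main__":
    chars = ["h", "i", SPACE, "y", "o", "u"]

    # A list has no .replace method, so the old logic raises AttributeError.
    try:
        detokenize_old(chars)
    except AttributeError as err:
        print("old logic failed:", err)

    # The new logic handles a list of tokens.
    print("new logic:", repr(detokenize_new(chars)))
```

The old logic still works if a caller passes one already-joined string, which may be why the problem only surfaced for char-level token lists.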

--- a/examples/aishell/s1/README.md
+++ b/examples/aishell/s1/README.md
@@ -11,6 +11,7 @@


 ## Chunk Conformer
+Set `decoding.decoding_chunk_size=16` when decoding.

 | Model | Params | Config | Augmentation| Test set | Decode method | Chunk Size & Left Chunks | Loss | WER |  
 | --- | --- | --- | --- | --- | --- | --- | --- | --- |  
@@ -18,10 +19,3 @@
 | conformer | 47.06M | conf/chunk_conformer.yaml | spec_aug + shift | test | ctc_greedy_search | 16, -1 | - | 0.070806 |  
 | conformer | 47.06M | conf/chunk_conformer.yaml | spec_aug + shift | test | ctc_prefix_beam_search | 16, -1 | - | 0.070739 |  
 | conformer | 47.06M | conf/chunk_conformer.yaml | spec_aug + shift | test | attention_rescoring | 16, -1 |  - | 0.059400 |  
-
-
-## Transformer
-
-| Model | Params | Config | Augmentation| Test set | Decode method | Loss | WER |  
-| --- | --- | --- | --- | --- | --- | --- | ---|  
-| transformer | - | conf/transformer.yaml | spec_aug + shift | test | attention | - | - |