[ASR]cherrypick change optimizer and fix import error, test=asr (#3049)

* optional tokenizer and fix some doc
* cherry pick with pr:3040

d103cb8f · zxcd · GitHub · 4d1787dc
4 changed files
--- a/examples/aishell/asr3/RESULT.md
+++ b/examples/aishell/asr3/RESULT.md
@@ -4,6 +4,7 @@
 * paddle version: develop (commit id: daea892c67e85da91906864de40ce9f6f1b893ae)
 * paddlespeech version: develop (commit id: c14b4238b256693281e59605abff7c9435b3e2b2)
+* paddlenlp version: 2.5.2
 ## Device
 * python: 3.7

--- a/examples/aishell/asr3/conf/train_with_wav2vec.yaml
+++ b/examples/aishell/asr3/conf/train_with_wav2vec.yaml
@@ -83,7 +83,7 @@ dnn_neurons: 1024
 freeze_wav2vec: False
 dropout: 0.15
-tokenizer: !apply:transformers.BertTokenizer.from_pretrained
+tokenizer: !apply:paddlenlp.transformers.AutoTokenizer.from_pretrained
   pretrained_model_name_or_path: bert-base-chinese
 # bert-base-chinese tokens length
 output_neurons: 21128

--- a/examples/aishell/asr3/local/aishell_prepare.py
+++ b/examples/aishell/asr3/local/aishell_prepare.py
@@ -21,7 +21,7 @@ import glob
 import logging
 import os
-from paddlespeech.s2t.models.wav2vec2.io.dataio import read_audio
+from paddlespeech.s2t.io.speechbrain.dataio import read_audio
 logger = logging.getLogger(__name__)

--- a/examples/aishell/asr3/local/data.sh
+++ b/examples/aishell/asr3/local/data.sh
 #!/bin/bash
 stage=-1
-stop_stage=-1
+stop_stage=3
 dict_dir=data/lang_char
 . ${MAIN_ROOT}/utils/parse_options.sh || exit -1;