E2E/Streaming Transformer/Conformer ASR (#578)

* add cmvn and label smoothing loss layer
* add layer for transformer
* add glu and conformer conv
* add torch compatible hack, mask funcs
* not hack size since it exists
* add test; attention
* add attention, common utils, hack paddle
* add audio utils
* conformer batch padding mask bug fix #223
* fix typo, python infer fix rnn mem opt name error and batchnorm1d, will be available at 2.0.2
* fix ci
* fix ci
* add encoder
* refactor egs
* add decoder
* refactor ctc, add ctc align, refactor ckpt, add warmup lr scheduler, cmvn utils
* refactor docs
* add fix
* fix readme
* fix bugs, refactor collator, add pad_sequence, fix ckpt bugs
* fix docstring
* refactor data feed order
* add u2 model
* refactor cmvn, test
* add utils
* add u2 config
* fix bugs
* fix bugs
* fix autograd, which may have problems when using inplace operation
* refactor data, build vocab; add format data
* fix text featurizer
* refactor build vocab
* add fbank, refactor feature of speech
* refactor audio feat
* refactor data prepare
* refactor data
* model init from config
* add u2 bins
* flake8
* can train
* fix bugs, add coverage, add scripts
* test can run
* fix data
* speed perturb with sox
* add spec aug
* fix for train
* fix train logic
* fix logger
* log valid loss, time dataset process
* using np for speed perturb, remove some debug log of grad clip
* fix logger
* fix build vocab
* fix logger name
* using module logger as default
* fix
* fix install
* reorder imports
* fix board logger
* fix logger
* kaldi fbank and mfcc
* fix cmvn and print params
* fix add_eos_sos and cmvn
* fix cmvn compute
* fix logger and cmvn
* fix subsampling, label smoothing loss, remove useless
* add notebook test
* fix log
* fix tb logger
* multi gpu valid
* fix log
* fix log
* fix config
* fix compute cmvn, need paddle 2.1
* add cmvn notebook
* fix layer tools
* fix compute cmvn
* add rtf
* fix decoding
* fix layer tools
* fix log, add avg script
* more avg and test info
* fix dataset pickle problem; using 2.1 paddle; num_workers can > 0; ckpt save in exp dir; fix setup.sh
* add vimrc
* refactor tiny script, add transformer and stream conf
* spm demo; librispeech scripts and confs
* fix log
* add librispeech scripts
* refactor data pipe; fix conf; fix u2 default params
* fix bugs
* refactor aishell scripts
* fix test
* fix cmvn
* fix s0 scripts
* fix ds2 scripts and bugs
* fix dev & test dataset filter
* fix dataset filter
* filter dev
* fix ckpt path
* filter test, since librispeech will cause OOM, but all test wer will be worse, since mismatch train with test
* add comment
* add syllable doc
* fix ds2 configs
* add doc
* add pypinyin tools
* fix decoder using blank_id=0
* mmseg with pybind11
* format code

71e046b0 · Hui Zhang · GitHub · 3a2de9e4
440 changed files
--- a/.clang-format
+++ b/.clang-format
@@ -16,8 +16,8 @@
 ---
 Language:        Cpp
 BasedOnStyle:  Google
-IndentWidth:     2
-TabWidth:        2
+IndentWidth:     4
+TabWidth:        4
 ContinuationIndentWidth: 4
 MaxEmptyLinesToKeep: 2
 AccessModifierOffset: -2  # The private/protected/public has no indent in class

--- a/.flake8
+++ b/.flake8
+[flake8]
+
+########## OPTIONS ##########
+# Set the maximum length that any line (with some exceptions) may be.
+max-line-length = 120
+
+
+################### FILE PATTERNS ##########################
+# Provide a comma-separated list of glob patterns to exclude from checks.
+exclude =
+    # git folder
+    .git,
+    # python cache
+    __pycache__,
+    third_party/,
+# Provide a comma-separate list of glob patterns to include for checks.
+filename =
+    *.py
+
+
+########## RULES ##########
+
+# ERROR CODES
+#
+# E/W  - PEP8 errors/warnings (pycodestyle)
+# F    - linting errors (pyflakes)
+# C    - McCabe complexity error (mccabe)
+#
+# W503 - line break before binary operator
+
+# Specify a list of codes to ignore.
+ignore =
+    W503
+    E252,E262,E127,E265,E126,E266,E241,E261,E128,E125
+    W291,W293,W605
+    E203,E305,E402,E501,E721,E741,F403,F405,F821,F841,F999,W503,W504,C408,E302,W291,E303,
+    # shebang has extra meaning in fbcode lints, so I think it's not worth trying
+    # to line this up with executable bit
+    EXE001,
+    # these ignores are from flake8-bugbear; please fix!
+    B007,B008,
+    # these ignores are from flake8-comprehensions; please fix!
+    C400,C401,C402,C403,C404,C405,C407,C411,C413,C414,C415
+
+# Specify the list of error codes you wish Flake8 to report.
+select =
+    E,
+    W,
+    F,
+    C
--- a/.gitconfig
+++ b/.gitconfig
+[alias]
+  st = status
+  ci = commit
+  br = branch
+  co = checkout
+  df = diff
+  l = log --pretty=format:\"%h %ad | %s%d [%an]\" --graph --date=short
+  ll = log --stat
+
+[merge]
+  tool = vimdiff
+
+[core]
+  excludesfile = ~/.gitignore
+  editor = vim
+
+[color]
+  branch = auto
+  diff = auto
+  status = auto
+
+[color "branch"]
+  current = yellow reverse
+  local = yellow
+  remote = green
+
+[color "diff"]
+  meta = yellow bold
+  frag = magenta bold
+  old = red bold
+  new = green bold
+
+[color "status"]
+  added = yellow
+  changed = green
+  untracked = cyan
+
+[push]
+  default = matching
+
+[credential]
+  helper = store
+
+[user]
+  name =
+  email =
+
+
--- a/.gitignore
+++ b/.gitignore
@@ -5,3 +5,8 @@ tools/venv
 *.log
 *.pdmodel
 *.pdiparams*
+*.zip
+*.tar
+*.tar.gz
+.ipynb_checkpoints
+*.npz
--- a/.notebook/Linear_test.ipynb
+++ b/.notebook/Linear_test.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "academic-surname",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import paddle\n",
+    "from paddle import nn"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "fundamental-treasure",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/workspace/DeepSpeech-2.x/tools/venv-dev/lib/python3.7/site-packages/ipykernel/ipkernel.py:283: DeprecationWarning: `should_run_async` will not call `transform_cell` automatically in the future. Please pass the result to `transformed_cell` argument and any exception that happen during thetransform in `preprocessing_exc_tuple` in IPython 7.17 and above.\n",
+      "  and should_run_async(code)\n"
+     ]
+    }
+   ],
+   "source": [
+    "L = nn.Linear(256, 2048)\n",
+    "L2 = nn.Linear(2048, 256)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "consolidated-elephant",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "import torch\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "moderate-noise",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "float64\n",
+      "Tensor(shape=[2, 51, 256], dtype=float32, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[[-1.54171216, -2.61531472, -1.79881978, ..., -0.31395876,  0.56513089, -0.44516513],\n",
+      "         [-0.79492962,  1.91157901,  0.66567147, ...,  0.54825783, -1.01471853, -0.84924090],\n",
+      "         [-1.22556651, -0.36225814,  0.65063190, ...,  0.65726501,  0.05563191,  0.09009409],\n",
+      "         ...,\n",
+      "         [ 0.38615900, -0.77905393,  0.99732304, ..., -1.38463700, -3.32365036, -1.31089687],\n",
+      "         [ 0.05579993,  0.06885809, -1.66662002, ..., -0.23346378, -3.29372883,  1.30561364],\n",
+      "         [ 1.90676069,  1.95093191, -0.28849599, ..., -0.06860496,  0.95347673,  1.00475824]],\n",
+      "\n",
+      "        [[-0.91453546,  0.55298805, -1.06146812, ..., -0.86378336,  1.00454640,  1.26062179],\n",
+      "         [ 0.10223761,  0.81301165,  2.36865163, ...,  0.16821407,  0.29240361,  1.05408621],\n",
+      "         [-1.33196676,  1.94433689,  0.01934209, ...,  0.48036841,  0.51585966,  1.22893548],\n",
+      "         ...,\n",
+      "         [-0.19558455, -0.47075930,  0.90796155, ..., -1.28598249, -0.24321797,  0.17734711],\n",
+      "         [ 0.89819717, -1.39516675,  0.17138045, ...,  2.39761519,  1.76364994, -0.52177650],\n",
+      "         [ 0.94122332, -0.18581429,  1.36099780, ...,  0.67647684, -0.04699665,  1.51205540]]])\n",
+      "tensor([[[-1.5417, -2.6153, -1.7988,  ..., -0.3140,  0.5651, -0.4452],\n",
+      "         [-0.7949,  1.9116,  0.6657,  ...,  0.5483, -1.0147, -0.8492],\n",
+      "         [-1.2256, -0.3623,  0.6506,  ...,  0.6573,  0.0556,  0.0901],\n",
+      "         ...,\n",
+      "         [ 0.3862, -0.7791,  0.9973,  ..., -1.3846, -3.3237, -1.3109],\n",
+      "         [ 0.0558,  0.0689, -1.6666,  ..., -0.2335, -3.2937,  1.3056],\n",
+      "         [ 1.9068,  1.9509, -0.2885,  ..., -0.0686,  0.9535,  1.0048]],\n",
+      "\n",
+      "        [[-0.9145,  0.5530, -1.0615,  ..., -0.8638,  1.0045,  1.2606],\n",
+      "         [ 0.1022,  0.8130,  2.3687,  ...,  0.1682,  0.2924,  1.0541],\n",
+      "         [-1.3320,  1.9443,  0.0193,  ...,  0.4804,  0.5159,  1.2289],\n",
+      "         ...,\n",
+      "         [-0.1956, -0.4708,  0.9080,  ..., -1.2860, -0.2432,  0.1773],\n",
+      "         [ 0.8982, -1.3952,  0.1714,  ...,  2.3976,  1.7636, -0.5218],\n",
+      "         [ 0.9412, -0.1858,  1.3610,  ...,  0.6765, -0.0470,  1.5121]]])\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/workspace/DeepSpeech-2.x/tools/venv-dev/lib/python3.7/site-packages/ipykernel/ipkernel.py:283: DeprecationWarning: `should_run_async` will not call `transform_cell` automatically in the future. Please pass the result to `transformed_cell` argument and any exception that happen during thetransform in `preprocessing_exc_tuple` in IPython 7.17 and above.\n",
+      "  and should_run_async(code)\n"
+     ]
+    }
+   ],
+   "source": [
+    "x = np.random.randn(2, 51, 256)\n",
+    "print(x.dtype)\n",
+    "px = paddle.to_tensor(x, dtype='float32')\n",
+    "tx = torch.tensor(x, dtype=torch.float32)\n",
+    "print(px)\n",
+    "print(tx)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cooked-progressive",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "mechanical-prisoner",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "data = np.load('enc_0_ff_out.npz', allow_pickle=True)\n",
+    "t_norm_ff = data['norm_ff']\n",
+    "t_ff_out = data['ff_out']\n",
+    "t_ff_l_x = data['ff_l_x']\n",
+    "t_ff_l_a_x = data['ff_l_a_x']\n",
+    "t_ff_l_a_l_x = data['ff_l_a_l_x']\n",
+    "t_ps = data['ps']"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "indie-marriage",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "assured-zambia",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "True\n",
+      "True\n",
+      "True\n",
+      "True\n"
+     ]
+    }
+   ],
+   "source": [
+    "L.set_state_dict({'weight': t_ps[0].T, 'bias': t_ps[1]})\n",
+    "L2.set_state_dict({'weight': t_ps[2].T, 'bias': t_ps[3]})\n",
+    "\n",
+    "ps = []\n",
+    "for n, p in L.named_parameters():\n",
+    "   ps.append(p)\n",
+    "\n",
+    "for n, p in L2.state_dict().items():\n",
+    "    ps.append(p)\n",
+    "    \n",
+    "for p, tp in zip(ps, t_ps):\n",
+    "    print(np.allclose(p.numpy(), tp.T))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "committed-jacob",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "extreme-traffic",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "optimum-milwaukee",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "viral-indian",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "True\n",
+      "True\n",
+      "True\n",
+      "True\n"
+     ]
+    }
+   ],
+   "source": [
+    "# data = np.load('enc_0_ff_out.npz', allow_pickle=True)\n",
+    "# t_norm_ff = data['norm_ff']\n",
+    "# t_ff_out = data['ff_out']\n",
+    "# t_ff_l_x = data['ff_l_x']\n",
+    "# t_ff_l_a_x = data['ff_l_a_x']\n",
+    "# t_ff_l_a_l_x = data['ff_l_a_l_x']\n",
+    "# t_ps = data['ps']\n",
+    "TL = torch.nn.Linear(256, 2048)\n",
+    "TL2 = torch.nn.Linear(2048, 256)\n",
+    "TL.load_state_dict({'weight': torch.tensor(t_ps[0]), 'bias': torch.tensor(t_ps[1])})\n",
+    "TL2.load_state_dict({'weight': torch.tensor(t_ps[2]), 'bias': torch.tensor(t_ps[3])})\n",
+    "\n",
+    "# for n, p in TL.named_parameters():\n",
+    "#    print(n, p)\n",
+    "# for n, p in TL2.named_parameters():\n",
+    "#    print(n, p)\n",
+    "\n",
+    "ps = []\n",
+    "for n, p in TL.state_dict().items():\n",
+    "    ps.append(p.data.numpy())\n",
+    "    \n",
+    "for n, p in TL2.state_dict().items():\n",
+    "    ps.append(p.data.numpy())\n",
+    "    \n",
+    "for p, tp in zip(ps, t_ps):\n",
+    "    print(np.allclose(p, tp))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "skilled-vietnamese",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[[[ 0.67277956  0.08313607 -0.62761104 ... -0.17480263  0.42718208\n",
+      "   -0.5787626 ]\n",
+      "  [ 0.91516656  0.5393416   1.7159258  ...  0.06144593  0.06486575\n",
+      "   -0.03350811]\n",
+      "  [ 0.438351    0.6227843   0.24096036 ...  1.0912522  -0.90929437\n",
+      "   -1.012989  ]\n",
+      "  ...\n",
+      "  [ 0.68631977  0.14240924  0.10763275 ... -0.11513516  0.48065388\n",
+      "    0.04070369]\n",
+      "  [-0.9525228   0.23197874  0.31264272 ...  0.5312439   0.18773697\n",
+      "   -0.8450228 ]\n",
+      "  [ 0.42024016 -0.04561988  0.54541194 ... -0.41933843 -0.00436018\n",
+      "   -0.06663495]]\n",
+      "\n",
+      " [[-0.11638781 -0.33566502 -0.20887226 ...  0.17423287 -0.9195841\n",
+      "   -0.8161046 ]\n",
+      "  [-0.3469874   0.88269687 -0.11887559 ... -0.15566081  0.16357468\n",
+      "   -0.20766167]\n",
+      "  [-0.3847657   0.3984318  -0.06963477 ... -0.00360622  1.2360432\n",
+      "   -0.26811332]\n",
+      "  ...\n",
+      "  [ 0.08230796 -0.46158582  0.54582864 ...  0.15747628 -0.44790155\n",
+      "    0.06020184]\n",
+      "  [-0.8095085   0.43163058 -0.42837143 ...  0.8627463   0.90656304\n",
+      "    0.15847842]\n",
+      "  [-1.485811   -0.18216592 -0.8882585  ...  0.32596245  0.7822631\n",
+      "   -0.6460344 ]]]\n",
+      "[[[ 0.67278004  0.08313602 -0.6276114  ... -0.17480245  0.42718196\n",
+      "   -0.5787625 ]\n",
+      "  [ 0.91516703  0.5393413   1.7159253  ...  0.06144581  0.06486579\n",
+      "   -0.03350812]\n",
+      "  [ 0.43835106  0.62278455  0.24096027 ...  1.0912521  -0.9092943\n",
+      "   -1.0129892 ]\n",
+      "  ...\n",
+      "  [ 0.6863195   0.14240888  0.10763284 ... -0.11513527  0.48065376\n",
+      "    0.04070365]\n",
+      "  [-0.9525231   0.23197863  0.31264275 ...  0.53124386  0.18773702\n",
+      "   -0.84502304]\n",
+      "  [ 0.42024007 -0.04561983  0.545412   ... -0.41933888 -0.00436005\n",
+      "   -0.066635  ]]\n",
+      "\n",
+      " [[-0.11638767 -0.33566508 -0.20887226 ...  0.17423296 -0.9195838\n",
+      "   -0.8161046 ]\n",
+      "  [-0.34698725  0.88269705 -0.11887549 ... -0.15566081  0.16357464\n",
+      "   -0.20766166]\n",
+      "  [-0.3847657   0.3984319  -0.06963488 ... -0.00360619  1.2360426\n",
+      "   -0.26811326]\n",
+      "  ...\n",
+      "  [ 0.08230786 -0.4615857   0.5458287  ...  0.15747619 -0.44790167\n",
+      "    0.06020182]\n",
+      "  [-0.8095083   0.4316307  -0.42837155 ...  0.862746    0.9065631\n",
+      "    0.15847899]\n",
+      "  [-1.485811   -0.18216613 -0.8882584  ...  0.32596254  0.7822631\n",
+      "   -0.6460344 ]]]\n",
+      "True\n",
+      "False\n"
+     ]
+    }
+   ],
+   "source": [
+    "y = L(px)\n",
+    "print(y.numpy())\n",
+    "\n",
+    "ty = TL(tx)\n",
+    "print(ty.data.numpy())\n",
+    "print(np.allclose(px.numpy(), tx.detach().numpy()))\n",
+    "print(np.allclose(y.numpy(), ty.detach().numpy()))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "incorrect-allah",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "prostate-cameroon",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "governmental-surge",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[[ 0.04476918  0.554463   -0.3027508  ... -0.49600336  0.3751858\n",
+      "   0.8254095 ]\n",
+      " [ 0.95594174 -0.29528382 -1.2899452  ...  0.43718258  0.05584608\n",
+      "  -0.06974669]]\n",
+      "[[ 0.04476918  0.5544631  -0.3027507  ... -0.49600336  0.37518573\n",
+      "   0.8254096 ]\n",
+      " [ 0.95594174 -0.29528376 -1.2899454  ...  0.4371827   0.05584623\n",
+      "  -0.0697467 ]]\n",
+      "True\n",
+      "False\n",
+      "True\n"
+     ]
+    }
+   ],
+   "source": [
+    "x = np.random.randn(2, 256)\n",
+    "px = paddle.to_tensor(x, dtype='float32')\n",
+    "tx = torch.tensor(x, dtype=torch.float32)\n",
+    "y = L(px)\n",
+    "print(y.numpy())\n",
+    "ty = TL(tx)\n",
+    "print(ty.data.numpy())\n",
+    "print(np.allclose(px.numpy(), tx.detach().numpy()))\n",
+    "print(np.allclose(y.numpy(), ty.detach().numpy()))\n",
+    "print(np.allclose(y.numpy(), ty.detach().numpy(), atol=1e-5))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "confidential-jacket",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "improved-civilization",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "5e7e7c9fde8350084abf1898cf52651cfc84b17a\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(paddle.version.commit)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "d1e2d3b4",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['__builtins__',\n",
+       " '__cached__',\n",
+       " '__doc__',\n",
+       " '__file__',\n",
+       " '__loader__',\n",
+       " '__name__',\n",
+       " '__package__',\n",
+       " '__spec__',\n",
+       " 'commit',\n",
+       " 'full_version',\n",
+       " 'istaged',\n",
+       " 'major',\n",
+       " 'minor',\n",
+       " 'mkl',\n",
+       " 'patch',\n",
+       " 'rc',\n",
+       " 'show',\n",
+       " 'with_mkl']"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "dir(paddle.version)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "id": "c880c719",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "2.1.0\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(paddle.version.full_version)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "id": "f26977bf",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "commit: 5e7e7c9fde8350084abf1898cf52651cfc84b17a\n",
+      "None\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(paddle.version.show())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "id": "04ad47f6",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "1.6.0\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(torch.__version__)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "id": "e1e03830",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['__builtins__',\n",
+       " '__cached__',\n",
+       " '__doc__',\n",
+       " '__file__',\n",
+       " '__loader__',\n",
+       " '__name__',\n",
+       " '__package__',\n",
+       " '__spec__',\n",
+       " '__version__',\n",
+       " 'cuda',\n",
+       " 'debug',\n",
+       " 'git_version',\n",
+       " 'hip']"
+      ]
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "dir(torch.version)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 19,
+   "id": "4ad0389b",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'b31f58de6fa8bbda5353b3c77d9be4914399724d'"
+      ]
+     },
+     "execution_count": 19,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "torch.version.git_version"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "id": "7870ea10",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'10.2'"
+      ]
+     },
+     "execution_count": 21,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "torch.version.cuda"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "db8ee5a7",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6321ec2a",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
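The `Linear_test.ipynb` notebook above copies PyTorch weights into Paddle layers with a transpose (`t_ps[0].T`), and its outputs compare equal only once `atol=1e-5` is passed to `np.allclose`. Both points can be sketched with NumPy alone. This is a minimal illustration, not code from the commit: it assumes an `[in_features, out_features]` weight layout for the Paddle-style layer and `[out_features, in_features]` for the PyTorch-style one, and the helper names `linear_in_out` and `linear_out_in` are hypothetical.

```python
import numpy as np


def linear_in_out(x, weight, bias):
    """Affine map with the weight stored as [in_features, out_features]."""
    return x @ weight + bias


def linear_out_in(x, weight, bias):
    """Affine map with the weight stored as [out_features, in_features]."""
    return x @ weight.T + bias


# Small integer-valued inputs, so both layouts give exactly the same result.
x = np.arange(6, dtype=np.float32).reshape(2, 3)
w_out_in = np.arange(12, dtype=np.float32).reshape(4, 3)  # [out=4, in=3]
b = np.ones(4, dtype=np.float32)

y_out_in = linear_out_in(x, w_out_in, b)
# Transpose once when copying the weight into the other layout.
y_in_out = linear_in_out(x, w_out_in.T, b)
same_layout_result = np.array_equal(y_out_in, y_in_out)

# Float32 round-off: a tiny absolute error on a near-zero entry fails the
# default tolerances (rtol=1e-5, atol=1e-8) but passes with atol=1e-5.
ref = np.array([1.0, 1e-4], dtype=np.float32)
noisy = ref + np.float32(1e-7)
strict = np.allclose(noisy, ref)
relaxed = np.allclose(noisy, ref, atol=1e-5)

print(same_layout_result, strict, relaxed)
```

The second half suggests one plausible reason the notebook prints `False` for the default comparison and `True` with `atol=1e-5`: entries close to zero leave almost no room under a purely relative tolerance.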
--- a/.notebook/compute_cmvn_loader_test.ipynb
+++ b/.notebook/compute_cmvn_loader_test.ipynb
--- a/.notebook/dataloader.ipynb
+++ b/.notebook/dataloader.ipynb
@@ -338,7 +338,7 @@
    }
   ],
   "source": [
-    "for idx, (audio, text, audio_len, text_len) in enumerate(batch_reader()):\n",
+    "for idx, (audio, audio_len, text, text_len) in enumerate(batch_reader()):\n",
    "    print('test', text)\n",
    "    print(\"test raw\", ''.join( chr(i) for i in text[0][:int(text_len[0])] ))\n",
    "    print(\"test raw\", ''.join( chr(i) for i in text[-1][:int(text_len[-1])] ))\n",
@@ -386,4 +386,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 5
-}
+}
\ No newline at end of file
--- a/.notebook/dataloader_with_tokens_tokenids.ipynb
+++ b/.notebook/dataloader_with_tokens_tokenids.ipynb
--- a/.notebook/hack_api_test.ipynb
+++ b/.notebook/hack_api_test.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "breeding-haven",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "/home/ssd5/zhanghui/DeepSpeech2.x\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "'/home/ssd5/zhanghui/DeepSpeech2.x'"
+      ]
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "%cd ..\n",
+    "%pwd"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "appropriate-theta",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "LICENSE       deepspeech  examples\t\t    requirements.txt  tools\r\n",
+      "README.md     docs\t  libsndfile-1.0.28\t    setup.sh\t      utils\r\n",
+      "README_cn.md  env.sh\t  libsndfile-1.0.28.tar.gz  tests\r\n"
+     ]
+    }
+   ],
+   "source": [
+    "!ls"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "entire-bloom",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/home/ssd5/zhanghui/DeepSpeech2.x/tools/venv/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.\n",
+      "Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations\n",
+      "  def convert_to_list(value, n, name, dtype=np.int):\n",
+      "WARNING:root:override cat of paddle.Tensor if exists or register, remove this when fixed!\n",
+      "WARNING:root:register user masked_fill to paddle.Tensor, remove this when fixed!\n",
+      "WARNING:root:register user masked_fill_ to paddle.Tensor, remove this when fixed!\n",
+      "WARNING:root:register user repeat to paddle.Tensor, remove this when fixed!\n",
+      "WARNING:root:register user glu to paddle.nn.functional, remove this when fixed!\n",
+      "WARNING:root:register user GLU to paddle.nn, remove this when fixed!\n",
+      "WARNING:root:register user ConstantPad2d to paddle.nn, remove this when fixed!\n",
+      "WARNING:root:override ctc_loss of paddle.nn.functional if exists, remove this when fixed!\n"
+     ]
+    }
+   ],
+   "source": [
+    "from deepspeech.modules import loss"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "governmental-aircraft",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/home/ssd5/zhanghui/DeepSpeech2.x/tools/venv/lib/python3.7/site-packages/ipykernel/ipkernel.py:283: DeprecationWarning: `should_run_async` will not call `transform_cell` automatically in the future. Please pass the result to `transformed_cell` argument and any exception that happen during thetransform in `preprocessing_exc_tuple` in IPython 7.17 and above.\n",
+      "  and should_run_async(code)\n"
+     ]
+    }
+   ],
+   "source": [
+    "import paddle"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "proprietary-disaster",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<function deepspeech.modules.repeat(xs: paddle.VarBase, *size: Any) -> paddle.VarBase>"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "paddle.Tensor.repeat"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "first-diagram",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<property at 0x7fb515eeeb88>"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "paddle.Tensor.size"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "intelligent-david",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<function paddle.tensor.manipulation.concat(x, axis=0, name=None)>"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "paddle.Tensor.cat"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "id": "bronze-tenant",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a = paddle.to_tensor([12,32, 10, 12, 123,32 ,4])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "id": "balanced-bearing",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "7"
+      ]
+     },
+     "execution_count": 13,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "a.size"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 20,
+   "id": "extreme-republic",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def size(xs: paddle.Tensor, *args: int) -> paddle.Tensor:\n",
+    "    nargs = len(args)\n",
+    "    assert (nargs <= 1)\n",
+    "    s = paddle.shape(xs)\n",
+    "    if nargs == 1:\n",
+    "        return s[args[0]]\n",
+    "    else:\n",
+    "        return s\n",
+    "\n",
+    "# logger.warn(\n",
+    "#     \"override size of paddle.Tensor if exists or register, remove this when fixed!\"\n",
+    "# )\n",
+    "paddle.Tensor.size = size"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "id": "gross-addiction",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Tensor(shape=[1], dtype=int32, place=CPUPlace, stop_gradient=True,\n",
+       "       [7])"
+      ]
+     },
+     "execution_count": 21,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "a.size(0)\n",
+    "a.size()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "id": "adverse-dining",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Tensor(shape=[1], dtype=int32, place=CPUPlace, stop_gradient=True,\n",
+       "       [7])"
+      ]
+     },
+     "execution_count": 22,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "a.size()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "popular-potato",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
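Review note on the notebook above: its cells replace the `paddle.Tensor.size` property with a torch-style `size()` / `size(dim)` method. A minimal sketch of the same patching pattern follows, using a hypothetical `FakeTensor` stand-in class (not part of Paddle) so it runs without the framework. Unlike the notebook version, which returns `paddle.shape(xs)` tensors, this sketch returns plain Python values.

```python
from typing import List, Union


class FakeTensor:
    """Hypothetical stand-in for a tensor type; it only carries a shape."""

    def __init__(self, shape: List[int]):
        self.shape = list(shape)


def size(xs: FakeTensor, *args: int) -> Union[int, List[int]]:
    """Return the full shape, or a single dimension when an index is given."""
    if len(args) > 1:
        raise TypeError(f"size() takes at most 1 index, got {len(args)}")
    if len(args) == 1:
        return xs.shape[args[0]]
    return list(xs.shape)


# Attach the function as a method, as the notebook does for paddle.Tensor.
FakeTensor.size = size

a = FakeTensor([7])
flag = FakeTensor([2, 1])
```

Raising `TypeError` instead of using `assert` is a design choice of this sketch: the check then survives `python -O`, which strips assertions.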
--- a/.notebook/jit_infer.ipynb
+++ b/.notebook/jit_infer.ipynb
--- a/.notebook/layer_norm_test.ipynb
+++ b/.notebook/layer_norm_test.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 32,
+   "id": "academic-surname",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import paddle\n",
+    "from paddle import nn"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 33,
+   "id": "fundamental-treasure",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Parameter containing:\n",
+      "Tensor(shape=[256], dtype=float32, place=CUDAPlace(0), stop_gradient=False,\n",
+      "       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])\n",
+      "Parameter containing:\n",
+      "Tensor(shape=[256], dtype=float32, place=CUDAPlace(0), stop_gradient=False,\n",
+      "       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])\n"
+     ]
+    }
+   ],
+   "source": [
+    "L = nn.LayerNorm(256, epsilon=1e-12)\n",
+    "for p in L.parameters():\n",
+    "    print(p)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 34,
+   "id": "consolidated-elephant",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 46,
+   "id": "moderate-noise",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "float64\n"
+     ]
+    }
+   ],
+   "source": [
+    "x = np.random.randn(2, 51, 256)\n",
+    "print(x.dtype)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 47,
+   "id": "cooked-progressive",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "y = L(paddle.to_tensor(x, dtype='float32'))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 48,
+   "id": "optimum-milwaukee",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import torch"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 49,
+   "id": "viral-indian",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Parameter containing:\n",
+      "tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n",
+      "        1., 1., 1., 1.], requires_grad=True)\n",
+      "Parameter containing:\n",
+      "tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
+      "        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
+      "        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
+      "        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
+      "        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
+      "        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
+      "        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
+      "        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
+      "        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
+      "        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
+      "        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
+      "       requires_grad=True)\n"
+     ]
+    }
+   ],
+   "source": [
+    "TL = torch.nn.LayerNorm(256, eps=1e-12)\n",
+    "for p in TL.parameters():\n",
+    "    print(p)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 50,
+   "id": "skilled-vietnamese",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ty = TL(torch.tensor(x, dtype=torch.float32))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 51,
+   "id": "incorrect-allah",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "False"
+      ]
+     },
+     "execution_count": 51,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "np.allclose(y.numpy(), ty.detach().numpy())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "prostate-cameroon",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 52,
+   "id": "governmental-surge",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 52,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "x = np.random.randn(2, 256)\n",
+    "y = L(paddle.to_tensor(x, dtype='float32'))\n",
+    "ty = TL(torch.tensor(x, dtype=torch.float32))\n",
+    "np.allclose(y.numpy(), ty.detach().numpy())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "confidential-jacket",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
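Review note on the notebook above: it compares `paddle.nn.LayerNorm` with `torch.nn.LayerNorm` and gets `False` from `np.allclose` for the `(2, 51, 256)` input but `True` for `(2, 256)`. A framework-independent reference can help show which side deviates. The sketch below is such a reference, assuming both layers normalize over the last axis with the biased (population) variance. The helper name `layer_norm` is hypothetical, and it works in float64 rather than float32.

```python
import math
from typing import List, Optional


def layer_norm(row: List[float],
               gamma: Optional[List[float]] = None,
               beta: Optional[List[float]] = None,
               eps: float = 1e-12) -> List[float]:
    """Normalize one feature vector to zero mean and unit variance."""
    n = len(row)
    mean = sum(row) / n
    # Biased (population) variance: divide by n, not n - 1.
    var = sum((v - mean) ** 2 for v in row) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    gamma = gamma if gamma is not None else [1.0] * n  # scale, ones by default
    beta = beta if beta is not None else [0.0] * n     # shift, zeros by default
    return [(v - mean) * inv_std * g + b for v, g, b in zip(row, gamma, beta)]


normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
```

For a 3-D input, the function would be applied to each vector along the last axis and the results compared with both framework outputs under an explicit tolerance.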
--- a/.notebook/mask_and_masked_fill_test.ipynb
+++ b/.notebook/mask_and_masked_fill_test.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "primary-organic",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import torch"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 38,
+   "id": "stopped-semester",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def mask_finished_scores(score: torch.Tensor,\n",
+    "                         flag: torch.Tensor) -> torch.Tensor:\n",
+    "    \"\"\"\n",
+    "    If a sequence is finished, we only allow one alive branch. This function\n",
+    "    aims to give one branch a zero score and the rest -inf score.\n",
+    "    Args:\n",
+    "        score (torch.Tensor): A real value array with shape\n",
+    "            (batch_size * beam_size, beam_size).\n",
+    "        flag (torch.Tensor): A bool array with shape\n",
+    "            (batch_size * beam_size, 1).\n",
+    "    Returns:\n",
+    "        torch.Tensor: (batch_size * beam_size, beam_size).\n",
+    "    \"\"\"\n",
+    "    beam_size = score.size(-1)\n",
+    "    zero_mask = torch.zeros_like(flag, dtype=torch.bool)\n",
+    "    if beam_size > 1:\n",
+    "        unfinished = torch.cat((zero_mask, flag.repeat([1, beam_size - 1])),\n",
+    "                               dim=1)\n",
+    "        finished = torch.cat((flag, zero_mask.repeat([1, beam_size - 1])),\n",
+    "                             dim=1)\n",
+    "    else:\n",
+    "        unfinished = zero_mask\n",
+    "        finished = flag\n",
+    "    print(unfinished)\n",
+    "    print(finished)\n",
+    "    score.masked_fill_(unfinished, -float('inf'))\n",
+    "    score.masked_fill_(finished, 0)\n",
+    "    return score"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 58,
+   "id": "agreed-portuguese",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "tensor([[ True],\n",
+      "        [False]])\n",
+      "tensor([[-0.8841,  0.7381, -0.9986],\n",
+      "        [ 0.2675, -0.7971,  0.3798]])\n",
+      "tensor([[ True,  True],\n",
+      "        [False, False]])\n"
+     ]
+    }
+   ],
+   "source": [
+    "score = torch.randn((2, 3))\n",
+    "flag = torch.ones((2, 1), dtype=torch.bool)\n",
+    "flag[1] = False\n",
+    "print(flag)\n",
+    "print(score)\n",
+    "print(flag.repeat([1, 2]))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 59,
+   "id": "clean-aspect",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "tensor([[False,  True,  True],\n",
+      "        [False, False, False]])\n",
+      "tensor([[ True, False, False],\n",
+      "        [False, False, False]])\n",
+      "tensor([[ 0.0000,    -inf,    -inf],\n",
+      "        [ 0.2675, -0.7971,  0.3798]])\n",
+      "tensor([[ 0.0000,    -inf,    -inf],\n",
+      "        [ 0.2675, -0.7971,  0.3798]])\n"
+     ]
+    }
+   ],
+   "source": [
+    "r  = mask_finished_scores(score, flag)\n",
+    "print(r)\n",
+    "print(score)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 55,
+   "id": "thrown-airline",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Tensor(shape=[2, 1], dtype=bool, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[True ],\n",
+      "        [False]])\n",
+      "Tensor(shape=[2, 3], dtype=float32, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[ 2.05994511,  1.87704289,  0.01988174],\n",
+      "        [-0.40165186,  0.77547729, -0.64469045]])\n",
+      "Tensor(shape=[2, 2], dtype=bool, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[True , True ],\n",
+      "        [False, False]])\n"
+     ]
+    }
+   ],
+   "source": [
+    "import paddle\n",
+    "\n",
+    "score = paddle.randn((2, 3))\n",
+    "flag = paddle.ones((2, 1), dtype='bool')\n",
+    "flag[1] = False\n",
+    "print(flag)\n",
+    "print(score)\n",
+    "print(flag.tile([1, 2]))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 56,
+   "id": "internal-patent",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Tensor(shape=[2, 3], dtype=bool, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[False, True , True ],\n",
+      "        [False, False, False]])\n",
+      "Tensor(shape=[2, 3], dtype=bool, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[True , False, False],\n",
+      "        [False, False, False]])\n",
+      "x Tensor(shape=[2, 3], dtype=float32, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[ 2.05994511,  1.87704289,  0.01988174],\n",
+      "        [-0.40165186,  0.77547729, -0.64469045]])\n",
+      "2 Tensor(shape=[2, 3], dtype=float32, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[ 2.05994511,  1.87704289,  0.01988174],\n",
+      "        [-0.40165186,  0.77547729, -0.64469045]])\n",
+      "3 Tensor(shape=[2, 3], dtype=float32, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[ 2.05994511, -inf.      , -inf.      ],\n",
+      "        [-0.40165186,  0.77547729, -0.64469045]])\n",
+      "x Tensor(shape=[2, 3], dtype=float32, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[ 2.05994511, -inf.      , -inf.      ],\n",
+      "        [-0.40165186,  0.77547729, -0.64469045]])\n",
+      "2 Tensor(shape=[2, 3], dtype=float32, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[ 2.05994511, -inf.      , -inf.      ],\n",
+      "        [-0.40165186,  0.77547729, -0.64469045]])\n",
+      "3 Tensor(shape=[2, 3], dtype=float32, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[ 0.        , -inf.      , -inf.      ],\n",
+      "        [-0.40165186,  0.77547729, -0.64469045]])\n",
+      "Tensor(shape=[2, 3], dtype=float32, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[ 0.        , -inf.      , -inf.      ],\n",
+      "        [-0.40165186,  0.77547729, -0.64469045]])\n"
+     ]
+    }
+   ],
+   "source": [
+    "paddle.bool = 'bool'\n",
+    "\n",
+    "def masked_fill(xs:paddle.Tensor, mask:paddle.Tensor, value:float):\n",
+    "    print(xs)\n",
+    "    trues = paddle.ones_like(xs) * value\n",
+    "    assert xs.shape == mask.shape\n",
+    "    xs = paddle.where(mask, trues, xs)\n",
+    "    return xs\n",
+    "\n",
+    "def masked_fill_(xs:paddle.Tensor, mask:paddle.Tensor, value:float):\n",
+    "    print('x', xs)\n",
+    "    trues = paddle.ones_like(xs) * value\n",
+    "    assert xs.shape == mask.shape\n",
+    "    ret = paddle.where(mask, trues, xs)\n",
+    "    print('2', xs)\n",
+    "    paddle.assign(ret, output=xs)\n",
+    "    print('3', xs)\n",
+    "\n",
+    "paddle.Tensor.masked_fill = masked_fill\n",
+    "paddle.Tensor.masked_fill_ = masked_fill_\n",
+    "\n",
+    "def mask_finished_scores_pd(score: paddle.Tensor,\n",
+    "                         flag: paddle.Tensor) -> paddle.Tensor:\n",
+    "    \"\"\"\n",
+    "    If a sequence is finished, we only allow one alive branch. This function\n",
+    "    aims to give one branch a zero score and the rest -inf score.\n",
+    "    Args:\n",
+    "        score (paddle.Tensor): A real value array with shape\n",
+    "            (batch_size * beam_size, beam_size).\n",
+    "        flag (paddle.Tensor): A bool array with shape\n",
+    "            (batch_size * beam_size, 1).\n",
+    "    Returns:\n",
+    "        paddle.Tensor: (batch_size * beam_size, beam_size).\n",
+    "    \"\"\"\n",
+    "    beam_size = score.shape[-1]\n",
+    "    zero_mask = paddle.zeros_like(flag, dtype=paddle.bool)\n",
+    "    if beam_size > 1:\n",
+    "        unfinished = paddle.concat((zero_mask, flag.tile([1, beam_size - 1])),\n",
+    "                               axis=1)\n",
+    "        finished = paddle.concat((flag, zero_mask.tile([1, beam_size - 1])),\n",
+    "                             axis=1)\n",
+    "    else:\n",
+    "        unfinished = zero_mask\n",
+    "        finished = flag\n",
+    "    print(unfinished)\n",
+    "    print(finished)\n",
+    "    \n",
+    "    #score.masked_fill_(unfinished, -float('inf'))\n",
+    "    #score.masked_fill_(finished, 0)\n",
+    "#     infs = paddle.ones_like(score) * -float('inf')\n",
+    "#     score = paddle.where(unfinished, infs, score)\n",
+    "#     score = paddle.where(finished, paddle.zeros_like(score), score)\n",
+    "\n",
+    "#     score = score.masked_fill(unfinished, -float('inf'))\n",
+    "#     score = score.masked_fill(finished, 0)\n",
+    "    score.masked_fill_(unfinished, -float('inf'))\n",
+    "    score.masked_fill_(finished, 0)\n",
+    "    return score\n",
+    "\n",
+    "r  = mask_finished_scores_pd(score, flag)\n",
+    "print(r)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 57,
+   "id": "vocal-prime",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<bound method PyCapsule.value of Tensor(shape=[2, 3], dtype=float32, place=CUDAPlace(0), stop_gradient=True,\n",
+       "       [[ 0.        , -inf.      , -inf.      ],\n",
+       "        [-0.40165186,  0.77547729, -0.64469045]])>"
+      ]
+     },
+     "execution_count": 57,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "score.value"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 71,
+   "id": "bacterial-adolescent",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from typing import Union, Any"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 72,
+   "id": "absent-fiber",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def repeat(xs : paddle.Tensor, *size: Any):\n",
+    "    print(size)\n",
+    "    return paddle.tile(xs, size)\n",
+    "paddle.Tensor.repeat = repeat"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 73,
+   "id": "material-harbor",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "(1, 2)\n",
+      "Tensor(shape=[2, 2], dtype=bool, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[True , True ],\n",
+      "        [False, False]])\n"
+     ]
+    }
+   ],
+   "source": [
+    "flag = paddle.ones((2, 1), dtype='bool')\n",
+    "flag[1] = False\n",
+    "print(flag.repeat(1, 2))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 84,
+   "id": "acute-brighton",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "(Tensor(shape=[1], dtype=int64, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [1]), 2)\n",
+      "Tensor(shape=[2, 2], dtype=bool, place=CUDAPlace(0), stop_gradient=True,\n",
+      "       [[True , True ],\n",
+      "        [False, False]])\n"
+     ]
+    }
+   ],
+   "source": [
+    "flag = paddle.ones((2, 1), dtype='bool')\n",
+    "flag[1] = False\n",
+    "print(flag.repeat(paddle.to_tensor(1), 2))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 85,
+   "id": "european-rugby",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def size(xs, *args: int):\n",
+    "    nargs = len(args)\n",
+    "    s = paddle.shape(xs)\n",
+    "    assert(nargs <= 1)\n",
+    "    if nargs == 1:\n",
+    "        return s[args[0]]\n",
+    "    else:\n",
+    "        return s\n",
+    "paddle.Tensor.size = size"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 86,
+   "id": "moral-special",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Tensor(shape=[2], dtype=int32, place=CPUPlace, stop_gradient=True,\n",
+       "       [2, 1])"
+      ]
+     },
+     "execution_count": 86,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "flag.size()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 87,
+   "id": "ahead-coach",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Tensor(shape=[1], dtype=int32, place=CPUPlace, stop_gradient=True,\n",
+       "       [1])"
+      ]
+     },
+     "execution_count": 87,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "flag.size(1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 88,
+   "id": "incomplete-fitness",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Tensor(shape=[1], dtype=int32, place=CPUPlace, stop_gradient=True,\n",
+       "       [2])"
+      ]
+     },
+     "execution_count": 88,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "flag.size(0)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "upset-connectivity",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
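Review note on the notebook above: `mask_finished_scores` keeps one live branch per finished hypothesis, giving column 0 a score of zero and every other column `-inf`. The sketch below restates that rule on plain Python lists so the expected result can be checked without torch or paddle. It is an illustration only: it returns a new list instead of filling in place, and `flag` is one bool per row rather than a `(N, 1)` tensor.

```python
from typing import List

NEG_INF = float("-inf")


def mask_finished_scores(score: List[List[float]],
                         flag: List[bool]) -> List[List[float]]:
    """For each finished row, keep only branch 0 alive.

    score: (batch_size * beam_size) rows of beam_size floats.
    flag:  one bool per row, True when that hypothesis is finished.
    Returns a new list; the input is left unmodified.
    """
    out = []
    for row, finished in zip(score, flag):
        if finished:
            # Branch 0 gets score 0, every other branch gets -inf.
            out.append([0.0] + [NEG_INF] * (len(row) - 1))
        else:
            out.append(list(row))
    return out


score = [[-0.8841, 0.7381, -0.9986],
         [0.2675, -0.7971, 0.3798]]
flag = [True, False]
masked = mask_finished_scores(score, flag)
```

The input values are taken from the torch cell's printed output, so `masked` should match the tensor printed after the torch call.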
--- a/.notebook/position_embeding_check.ipynb
+++ b/.notebook/position_embeding_check.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "designing-borough",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/workspace/DeepSpeech-2.x/tools/venv/lib/python3.7/site-packages/ipykernel/ipkernel.py:283: DeprecationWarning: `should_run_async` will not call `transform_cell` automatically in the future. Please pass the result to `transformed_cell` argument and any exception that happen during thetransform in `preprocessing_exc_tuple` in IPython 7.17 and above.\n",
+      "  and should_run_async(code)\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[[ 0.0000000e+00  0.0000000e+00  0.0000000e+00 ...  0.0000000e+00\n",
+      "   0.0000000e+00  0.0000000e+00]\n",
+      " [ 8.4147096e-01  8.0196178e-01  7.6172036e-01 ...  1.2409373e-04\n",
+      "   1.1547816e-04  1.0746076e-04]\n",
+      " [ 9.0929741e-01  9.5814437e-01  9.8704624e-01 ...  2.4818745e-04\n",
+      "   2.3095631e-04  2.1492151e-04]\n",
+      " ...\n",
+      " [ 3.7960774e-01  7.4510968e-01  7.3418564e-01 ...  1.2036801e-02\n",
+      "   1.1201146e-02  1.0423505e-02]\n",
+      " [-5.7338190e-01 -8.9752287e-02 -4.1488394e-02 ...  1.2160885e-02\n",
+      "   1.1316618e-02  1.0530960e-02]\n",
+      " [-9.9920684e-01 -8.5234123e-01 -7.8794664e-01 ...  1.2284970e-02\n",
+      "   1.1432089e-02  1.0638415e-02]]\n",
+      "True\n",
+      "True\n"
+     ]
+    }
+   ],
+   "source": [
+    "import torch\n",
+    "import math\n",
+    "import numpy as np\n",
+    "\n",
+    "max_len=100\n",
+    "d_model=256\n",
+    "\n",
+    "pe = torch.zeros(max_len, d_model)\n",
+    "position = torch.arange(0, max_len,\n",
+    "                        dtype=torch.float32).unsqueeze(1)\n",
+    "toruch_position = position\n",
+    "div_term = torch.exp(\n",
+    "    torch.arange(0, d_model, 2, dtype=torch.float32) *\n",
+    "    -(math.log(10000.0) / d_model))\n",
+    "tourch_div_term = div_term.cpu().detach().numpy()\n",
+    "\n",
+    "\n",
+    "\n",
+    "torhc_sin = torch.sin(position * div_term)\n",
+    "torhc_cos = torch.cos(position * div_term)\n",
+    "print(torhc_sin.cpu().detach().numpy())\n",
+    "np_sin = np.sin((position * div_term).cpu().detach().numpy())\n",
+    "np_cos = np.cos((position * div_term).cpu().detach().numpy())\n",
+    "print(np.allclose(np_sin, torhc_sin.cpu().detach().numpy()))\n",
+    "print(np.allclose(np_cos, torhc_cos.cpu().detach().numpy()))\n",
+    "pe[:, 0::2] = torhc_sin\n",
+    "pe[:, 1::2] = torhc_cos\n",
+    "tourch_pe = pe.cpu().detach().numpy()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "swiss-referral",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "True\n",
+      "True\n",
+      "False\n",
+      "False\n",
+      "False\n",
+      "False\n",
+      "[[ 1.          1.          1.         ...  1.          1.\n",
+      "   1.        ]\n",
+      " [ 0.5403023   0.59737533  0.6479059  ...  1.          1.\n",
+      "   1.        ]\n",
+      " [-0.41614684 -0.28628543 -0.1604359  ...  0.99999994  1.\n",
+      "   1.        ]\n",
+      " ...\n",
+      " [-0.92514753 -0.66694194 -0.67894876 ...  0.9999276   0.99993724\n",
+      "   0.9999457 ]\n",
+      " [-0.81928825 -0.9959641  -0.999139   ...  0.99992603  0.999936\n",
+      "   0.99994457]\n",
+      " [ 0.03982088 -0.52298605 -0.6157435  ...  0.99992454  0.9999347\n",
+      "   0.99994344]]\n",
+      "----\n",
+      "[[ 1.          1.          1.         ...  1.          1.\n",
+      "   1.        ]\n",
+      " [ 0.54030234  0.59737533  0.6479059  ...  1.          1.\n",
+      "   1.        ]\n",
+      " [-0.41614684 -0.28628543 -0.1604359  ...  1.          1.\n",
+      "   1.        ]\n",
+      " ...\n",
+      " [-0.92514753 -0.66694194 -0.67894876 ...  0.9999276   0.9999373\n",
+      "   0.9999457 ]\n",
+      " [-0.81928825 -0.9959641  -0.999139   ...  0.99992603  0.999936\n",
+      "   0.99994457]\n",
+      " [ 0.03982088 -0.5229861  -0.6157435  ...  0.99992454  0.9999347\n",
+      "   0.99994344]]\n",
+      ")))))))\n",
+      "[[ 0.0000000e+00  0.0000000e+00  0.0000000e+00 ...  0.0000000e+00\n",
+      "   0.0000000e+00  0.0000000e+00]\n",
+      " [ 8.4147096e-01  8.0196178e-01  7.6172036e-01 ...  1.2409373e-04\n",
+      "   1.1547816e-04  1.0746076e-04]\n",
+      " [ 9.0929741e-01  9.5814437e-01  9.8704624e-01 ...  2.4818745e-04\n",
+      "   2.3095631e-04  2.1492151e-04]\n",
+      " ...\n",
+      " [ 3.7960774e-01  7.4510968e-01  7.3418564e-01 ...  1.2036801e-02\n",
+      "   1.1201146e-02  1.0423505e-02]\n",
+      " [-5.7338190e-01 -8.9752287e-02 -4.1488394e-02 ...  1.2160885e-02\n",
+      "   1.1316618e-02  1.0530960e-02]\n",
+      " [-9.9920684e-01 -8.5234123e-01 -7.8794664e-01 ...  1.2284970e-02\n",
+      "   1.1432089e-02  1.0638415e-02]]\n",
+      "----\n",
+      "[[ 0.0000000e+00  0.0000000e+00  0.0000000e+00 ...  0.0000000e+00\n",
+      "   0.0000000e+00  0.0000000e+00]\n",
+      " [ 8.4147096e-01  8.0196178e-01  7.6172036e-01 ...  1.2409373e-04\n",
+      "   1.1547816e-04  1.0746076e-04]\n",
+      " [ 9.0929741e-01  9.5814437e-01  9.8704624e-01 ...  2.4818745e-04\n",
+      "   2.3095631e-04  2.1492151e-04]\n",
+      " ...\n",
+      " [ 3.7960774e-01  7.4510968e-01  7.3418564e-01 ...  1.2036801e-02\n",
+      "   1.1201146e-02  1.0423505e-02]\n",
+      " [-5.7338190e-01 -8.9752287e-02 -4.1488394e-02 ...  1.2160885e-02\n",
+      "   1.1316618e-02  1.0530960e-02]\n",
+      " [-9.9920684e-01 -8.5234123e-01 -7.8794664e-01 ...  1.2284970e-02\n",
+      "   1.1432089e-02  1.0638415e-02]]\n"
+     ]
+    }
+   ],
+   "source": [
+    "import paddle\n",
+    "paddle.set_device('cpu')\n",
+    "ppe = paddle.zeros((max_len, d_model), dtype='float32')\n",
+    "position = paddle.arange(0, max_len,\n",
+    "                        dtype='float32').unsqueeze(1)\n",
+    "print(np.allclose(position.numpy(), toruch_position))\n",
+    "div_term = paddle.exp(\n",
+    "    paddle.arange(0, d_model, 2, dtype='float32') *\n",
+    "    -(math.log(10000.0) / d_model))\n",
+    "print(np.allclose(div_term.numpy(), tourch_div_term))\n",
+    "\n",
+    "\n",
+    "\n",
+    "p_sin = paddle.sin(position * div_term)\n",
+    "p_cos = paddle.cos(position * div_term)\n",
+    "print(np.allclose(np_sin, p_sin.numpy(), rtol=1.e-6, atol=0))\n",
+    "print(np.allclose(np_cos, p_cos.numpy(), rtol=1.e-6, atol=0))\n",
+    "ppe[:, 0::2] = p_sin\n",
+    "ppe[:, 1::2] = p_cos\n",
+    "print(np.allclose(p_sin.numpy(), torhc_sin.cpu().detach().numpy()))\n",
+    "print(np.allclose(p_cos.numpy(), torhc_cos.cpu().detach().numpy()))\n",
+    "print(p_cos.numpy())\n",
+    "print(\"----\")\n",
+    "print(torhc_cos.cpu().detach().numpy())\n",
+    "print(\")))))))\")\n",
+    "print(p_sin.numpy())\n",
+    "print(\"----\")\n",
+    "print(torhc_sin.cpu().detach().numpy())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "integrated-boards",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "False\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(np.allclose(ppe.numpy(), pe.numpy()))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "flying-reserve",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "revised-divide",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
--- a/.notebook/python_test.ipynb
+++ b/.notebook/python_test.ipynb
--- a/.notebook/train_test.ipynb
+++ b/.notebook/train_test.ipynb
@@ -249,7 +249,7 @@
    }
   ],
   "source": [
-    "    for idx, (audio, text, audio_len, text_len) in enumerate(batch_reader()):\n",
+    "    for idx, (audio, audio_len, text, text_len) in enumerate(batch_reader()):\n",
    "        print('test', text)\n",
    "        print(\"test raw\", ''.join(batch_reader.dataset.vocab_list[i] for i in text[0]))\n",
    "        print(\"test raw\", ''.join(batch_reader.dataset.vocab_list[i] for i in text[-1]))\n",
@@ -454,7 +454,7 @@
    "            act='brelu')\n",
    "\n",
    "        out_channel = 32\n",
-    "        self.conv_stack = nn.LayerList([\n",
+    "        self.conv_stack = nn.Sequential([\n",
    "            ConvBn(\n",
    "                num_channels_in=32,\n",
    "                num_channels_out=out_channel,\n",
@@ -835,7 +835,7 @@
    "\n",
    "        return logits, probs, audio_len\n",
    "\n",
-    "    def forward(self, audio, text, audio_len, text_len):\n",
+    "    def forward(self, audio, audio_len, text, text_len):\n",
    "        \"\"\"\n",
    "        audio: shape [B, D, T]\n",
    "        text: shape [B, T]\n",
@@ -877,10 +877,10 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "audio, text, audio_len, text_len = None, None, None, None\n",
+    "audio, audio_len, text, text_len = None, None, None, None\n",
    "\n",
    "for idx, inputs in enumerate(batch_reader):\n",
-    "    audio, text, audio_len, text_len = inputs\n",
+    "    audio, audio_len, text, text_len = inputs\n",
    "#     print(idx)\n",
    "#     print('a', audio.shape, audio.place)\n",
    "#     print('t', text)\n",
@@ -960,7 +960,7 @@
    }
   ],
   "source": [
-    "outputs = dp_model(audio, text, audio_len, text_len)\n",
+    "outputs = dp_model(audio, audio_len, text, text_len)\n",
    "logits, _, logits_len = outputs\n",
    "print('logits len', logits_len)\n",
    "loss = loss_fn.forward(logits, text, logits_len, text_len)\n",
@@ -1884,4 +1884,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 5
-}
+}
\ No newline at end of file
--- a/.notebook/u2_model.ipynb
+++ b/.notebook/u2_model.ipynb
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -3,6 +3,7 @@
    hooks:
    -   id: yapf
        files: \.py$
+        exclude: (?=third_party).*(\.py)$
 -   repo: https://github.com/pre-commit/pre-commit-hooks
    sha: a11d9314b22d8f8c7556443875b731ef05965464
    hooks:
@@ -14,7 +15,22 @@
        files: \.md$
    -   id: trailing-whitespace
        files: \.md$
-   repo: https://github.com/Lucas-C/pre-commit-hooks
+    -   id: requirements-txt-fixer
+        exclude: (?=third_party).*$
+    -   id: check-yaml
+    -   id: check-json
+    -   id: pretty-format-json
+        args:
+        - --no-sort-keys
+        - --autofix
+    -   id: check-merge-conflict
+    -   id: flake8
+        args:
+        -  --ignore=E501,E228,E226,E261,E266,E128,E402,W503
+        -  --builtins=G,request
+        -  --jobs=1
+        exclude: (?=third_party).*(\.py)$
+-   repo: https://github.com/Lucas-C/pre-commit-hooks
    sha: v1.0.1
    hooks:
    -   id: forbid-crlf
@@ -38,4 +54,9 @@
        entry: python .pre-commit-hooks/copyright-check.hook
        language: system
        files: \.(c|cc|cxx|cpp|cu|h|hpp|hxx|proto|py)$
-        #exclude: (?=decoders/swig).*(\.cpp|\.h)$
+        exclude: (?=third_party|pypinyin).*(\.cpp|\.h|\.py)$
+-   repo: https://github.com/asottile/reorder_python_imports
+    rev: v2.4.0
+    hooks:
+      - id: reorder-python-imports
+        exclude: (?=third_party).*(\.py)$
--- a/.travis.yml
+++ b/.travis.yml
@@ -19,14 +19,14 @@ addons:
 before_install:
  -  python3 --version
  -  python3 -m pip --version
-  -  sudo pip install -U virtualenv pre-commit pip
+  -  pip3 --version
+  -  sudo pip3 install -U virtualenv pre-commit pip
  -  docker pull paddlepaddle/paddle:latest

 script:
  - exit_code=0
-  - .travis/precommit.sh || exit_code=$(( exit_code | $? ))
  - docker run -i --rm -v "$PWD:/py_unittest" paddlepaddle/paddle:latest /bin/bash -c
-    'cd /py_unittest; source env.sh; bash .travis/unittest.sh' || exit_code=$(( exit_code | $? ))
+    'cd /py_unittest && bash .travis/precommit.sh && source env.sh && bash .travis/unittest.sh' || exit_code=$(( exit_code | $? ))
    exit $exit_code

 notifications:

--- a/.travis/install.sh
+++ b/.travis/install.sh
+#!/bin/bash
+
+setup_env(){
+    cd tools && make && cd -
+}
+
+install(){
+    if [ -f "setup.sh" ]; then
+        bash setup.sh
+        #export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
+    fi
+    if [ $? != 0 ]; then
+        exit 1
+    fi
+}
+
+print_env(){
+    cat /etc/lsb-release
+    gcc -v
+    g++ -v
+}
+
+abort(){
+    echo "Run install failed" 1>&2
+    echo "Please check your code" 1>&2
+    exit 1
+}
+
+trap 'abort' 0
+set -e
+
+print_env
+setup_env
+source tools/venv/bin/activate
+install
+
+trap : 0
--- a/.travis/precommit.sh
+++ b/.travis/precommit.sh
 #!/bin/bash
+
 function abort(){
    echo "Your commit not fit PaddlePaddle code style" 1>&2
    echo "Please use pre-commit scripts to auto-format your code" 1>&2
    exit 1
 }

+
 trap 'abort' 0
 set -e
-cd `dirname $0`
-cd ..
-export PATH=/usr/bin:$PATH
-pre-commit install
+
+source tools/venv/bin/activate
+
+python3 --version

 if ! pre-commit run -a ; then
  ls -lh

--- a/.travis/unittest.sh
+++ b/.travis/unittest.sh
 #!/bin/bash

+
+
 abort(){
    echo "Run unittest failed" 1>&2
    echo "Please check your code" 1>&2
    exit 1
 }

+
 unittest(){
    cd $1 > /dev/null
    if [ -f "setup.sh" ]; then
@@ -21,13 +24,31 @@ unittest(){
    cd - > /dev/null
 }

+coverage(){
+    cd $1 > /dev/null
+
+    if [ -f "setup.sh" ]; then
+        bash setup.sh
+        export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
+    fi
+    if [ $? != 0 ]; then
+        exit 1
+    fi
+
+    find . -path ./tools/venv -prune -false -o -name 'tests' -type d -print0 | \
+        xargs -0 -I{} -n1 bash -c \
+        'python3 -m coverage run --branch {}'
+    python3 -m coverage report -m
+    python3 -m coverage html
+    cd - > /dev/null
+}
+
 trap 'abort' 0
 set -e

-cd tools; make; cd - 
-. tools/venv/bin/activate
-pip3 install pytest
-
-unittest .
+source tools/venv/bin/activate
+#pip3 install pytest
+#unittest .
+coverage .

 trap : 0
--- a/.vimrc
+++ b/.vimrc
--- a/README.md
+++ b/README.md
@@ -11,7 +11,10 @@

 ## Models

-* [Baidu's Deep Speech2](http://proceedings.mlr.press/v48/amodei16.pdf)
+* [Baidu's DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf)
+* [Transformer](https://arxiv.org/abs/1706.03762)
+* [Conformer](https://arxiv.org/abs/2005.08100)
+* [U2](https://arxiv.org/pdf/2012.05481.pdf)

 ## Setup

@@ -22,19 +25,20 @@ Please see [install](docs/install.md).

 ## Getting Started

-Please see [Getting Started](docs/getting_started.md) and [tiny egs](examples/tiny/README.md).
+Please see [Getting Started](docs/src/getting_started.md) and [tiny egs](examples/tiny/README.md).
+

 ## More Information  

-* [Install](docs/install.md)  
-* [Getting Started](docs/getting_started.md)  
-* [Data Prepration](docs/data_preparation.md)  
-* [Data Augmentation](docs/augmentation.md)  
-* [Ngram LM](docs/ngram_lm.md)  
-* [Server Demo](docs/server.md)  
-* [Benchmark](docs/benchmark.md)  
-* [Relased Model](docs/released_model.md)  
-* [FAQ](docs/faq.md)  
+* [Install](docs/src/install.md)  
+* [Getting Started](docs/src/getting_started.md)  
+* [Data Preparation](docs/src/data_preparation.md)  
+* [Data Augmentation](docs/src/augmentation.md)  
+* [Ngram LM](docs/src/ngram_lm.md)  
+* [Server Demo](docs/src/server.md)  
+* [Benchmark](docs/src/benchmark.md)  
+* [Released Model](docs/src/released_model.md)  
+* [FAQ](docs/src/faq.md)  


 ## Questions and Help
@@ -45,3 +49,7 @@ You are welcome to submit questions in [Github Discussions](https://github.com/P
 ## License

 DeepSpeech is provided under the [Apache-2.0 License](./LICENSE).
+
+## Acknowledgement
+
+We depend on many open-source repos. See [References](docs/src/reference.md) for more information.
--- a/README_cn.md
+++ b/README_cn.md
@@ -11,7 +11,11 @@

 ## 模型

-* [Baidu's Deep Speech2](http://proceedings.mlr.press/v48/amodei16.pdf)
+* [Baidu's DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf)
+* [Transformer](https://arxiv.org/abs/1706.03762)
+* [Conformer](https://arxiv.org/abs/2005.08100)
+* [U2](https://arxiv.org/pdf/2012.05481.pdf)
+

 ## 安装

@@ -22,19 +26,19 @@

 ## 开始

-请查看 [Getting Started](docs/getting_started.md) 和 [tiny egs](examples/tiny/README.md)。
+请查看 [Getting Started](docs/src/getting_started.md) 和 [tiny egs](examples/tiny/README.md)。

 ## 更多信息

-* [安装](docs/install.md)  
-* [开始](docs/getting_started.md)  
-* [数据处理](docs/data_preparation.md)  
-* [数据增强](docs/augmentation.md)  
-* [语言模型](docs/ngram_lm.md)  
-* [服务部署](docs/server.md)  
-* [Benchmark](docs/benchmark.md)  
-* [Relased Model](docs/released_model.md)  
-* [FAQ](docs/faq.md)  
+* [安装](docs/src/install.md)  
+* [开始](docs/src/getting_started.md)  
+* [数据处理](docs/src/data_preparation.md)  
+* [数据增强](docs/src/augmentation.md)  
+* [语言模型](docs/src/ngram_lm.md)  
+* [服务部署](docs/src/server.md)  
+* [Benchmark](docs/src/benchmark.md)  
+* [Released Model](docs/src/released_model.md)  
+* [FAQ](docs/src/faq.md)  

 ## 问题和帮助

@@ -43,3 +47,7 @@
 ## License

 DeepSpeech遵循[Apache-2.0开源协议](./LICENSE)。
+
+## 感谢
+
+开发中参考一些优秀的仓库，详情参见 [References](docs/src/reference.md)。
--- a/deepspeech/__init__.py
+++ b/deepspeech/__init__.py
--- a/deepspeech/decoders/decoders_deprecated.py
+++ b/deepspeech/decoders/decoders_deprecated.py
@@ -12,11 +12,11 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 """Contains various CTC decoders."""
-
+import multiprocessing
 from itertools import groupby
-import numpy as np
 from math import log
-import multiprocessing
+
+import numpy as np


 def ctc_greedy_decoder(probs_seq, vocabulary):
@@ -104,14 +104,14 @@ def ctc_beam_search_decoder(probs_seq,
        global ext_nproc_scorer
        ext_scoring_func = ext_nproc_scorer

-    ## initialize
+    # initialize
    # prefix_set_prev: the set containing selected prefixes
    # probs_b_prev: prefixes' probability ending with blank in previous step
    # probs_nb_prev: prefixes' probability ending with non-blank in previous step
    prefix_set_prev = {'\t': 1.0}
    probs_b_prev, probs_nb_prev = {'\t': 1.0}, {'\t': 0.0}

-    ## extend prefix in loop
+    # extend prefix in loop
    for time_step in range(len(probs_seq)):
        # prefix_set_next: the set containing candidate prefixes
        # probs_b_cur: prefixes' probability ending with blank in current step
@@ -120,7 +120,7 @@ def ctc_beam_search_decoder(probs_seq,

        prob_idx = list(enumerate(probs_seq[time_step]))
        cutoff_len = len(prob_idx)
-        #If pruning is enabled
+        # If pruning is enabled
        if cutoff_prob < 1.0 or cutoff_top_n < cutoff_len:
            prob_idx = sorted(prob_idx, key=lambda asd: asd[1], reverse=True)
            cutoff_len, cum_prob = 0, 0.0
@@ -172,7 +172,7 @@ def ctc_beam_search_decoder(probs_seq,
        # update probs
        probs_b_prev, probs_nb_prev = probs_b_cur, probs_nb_cur

-        ## store top beam_size prefixes
+        # store top beam_size prefixes
        prefix_set_prev = sorted(
            prefix_set_next.items(), key=lambda asd: asd[1], reverse=True)
        if beam_size < len(prefix_set_prev):
@@ -191,7 +191,7 @@ def ctc_beam_search_decoder(probs_seq,
        else:
            beam_result.append((float('-inf'), ''))

-    ## output top beam_size decoding results
+    # output top beam_size decoding results
    beam_result = sorted(beam_result, key=lambda asd: asd[0], reverse=True)
    return beam_result


--- a/deepspeech/decoders/scorer_deprecated.py
+++ b/deepspeech/decoders/scorer_deprecated.py
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 """External Scorer for Beam Search Decoder."""
-
 import os
+
 import kenlm
 import numpy as np

@@ -71,7 +71,7 @@ class Scorer(object):
        """
        lm = self._language_model_score(sentence)
        word_cnt = self._word_count(sentence)
-        if log == False:
+        if log is False:
            score = np.power(lm, self._alpha) * np.power(word_cnt, self._beta)
        else:
            score = self._alpha * np.log(lm) + self._beta * np.log(word_cnt)

--- a/deepspeech/decoders/swig/ctc_beam_search_decoder.cpp
+++ b/deepspeech/decoders/swig/ctc_beam_search_decoder.cpp
--- a/deepspeech/decoders/swig/ctc_greedy_decoder.cpp
+++ b/deepspeech/decoders/swig/ctc_greedy_decoder.cpp
--- a/deepspeech/decoders/swig/decoder_utils.cpp
+++ b/deepspeech/decoders/swig/decoder_utils.cpp
--- a/deepspeech/decoders/swig/decoder_utils.h
+++ b/deepspeech/decoders/swig/decoder_utils.h
--- a/deepspeech/decoders/swig/path_trie.cpp
+++ b/deepspeech/decoders/swig/path_trie.cpp
--- a/deepspeech/decoders/swig/path_trie.h
+++ b/deepspeech/decoders/swig/path_trie.h
--- a/deepspeech/decoders/swig/scorer.cpp
+++ b/deepspeech/decoders/swig/scorer.cpp
--- a/deepspeech/decoders/swig/scorer.h
+++ b/deepspeech/decoders/swig/scorer.h
--- a/deepspeech/decoders/swig/setup.py
+++ b/deepspeech/decoders/swig/setup.py
--- a/deepspeech/decoders/swig/setup.sh
+++ b/deepspeech/decoders/swig/setup.sh
--- a/deepspeech/decoders/swig_wrapper.py
+++ b/deepspeech/decoders/swig_wrapper.py
@@ -12,7 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 """Wrapper for various CTC decoders in SWIG."""
-
 import swig_decoders



--- a/deepspeech/decoders/tests/test_decoders.py
+++ b/deepspeech/decoders/tests/test_decoders.py
--- a/deepspeech/exps/deepspeech2/bin/deploy/client.py
+++ b/deepspeech/exps/deepspeech2/bin/deploy/client.py
--- a/deepspeech/exps/deepspeech2/bin/deploy/record.py
+++ b/deepspeech/exps/deepspeech2/bin/deploy/record.py
--- a/deepspeech/exps/deepspeech2/bin/deploy/runtime.py
+++ b/deepspeech/exps/deepspeech2/bin/deploy/runtime.py
--- a/deepspeech/exps/deepspeech2/bin/deploy/send.py
+++ b/deepspeech/exps/deepspeech2/bin/deploy/send.py
--- a/deepspeech/exps/deepspeech2/bin/deploy/server.py
+++ b/deepspeech/exps/deepspeech2/bin/deploy/server.py
--- a/deepspeech/exps/deepspeech2/bin/export.py
+++ b/deepspeech/exps/deepspeech2/bin/export.py
--- a/deepspeech/exps/deepspeech2/bin/test.py
+++ b/deepspeech/exps/deepspeech2/bin/test.py
--- a/deepspeech/exps/deepspeech2/bin/train.py
+++ b/deepspeech/exps/deepspeech2/bin/train.py
--- a/deepspeech/exps/deepspeech2/bin/tune.py
+++ b/deepspeech/exps/deepspeech2/bin/tune.py
--- a/deepspeech/exps/deepspeech2/config.py
+++ b/deepspeech/exps/deepspeech2/config.py
--- a/deepspeech/exps/deepspeech2/model.py
+++ b/deepspeech/exps/deepspeech2/model.py
--- a/deepspeech/exps/u2/__init__.py
+++ b/deepspeech/exps/u2/__init__.py
--- a/deepspeech/exps/u2/bin/export.py
+++ b/deepspeech/exps/u2/bin/export.py
--- a/deepspeech/exps/deepspeech2/bin/infer.py
+++ b/deepspeech/exps/deepspeech2/bin/infer.py
--- a/deepspeech/exps/u2/bin/train.py
+++ b/deepspeech/exps/u2/bin/train.py
--- a/deepspeech/exps/u2/config.py
+++ b/deepspeech/exps/u2/config.py
--- a/deepspeech/exps/u2/model.py
+++ b/deepspeech/exps/u2/model.py
--- a/deepspeech/frontend/audio.py
+++ b/deepspeech/frontend/audio.py
--- a/deepspeech/frontend/augmentor/augmentation.py
+++ b/deepspeech/frontend/augmentor/augmentation.py
--- a/deepspeech/frontend/augmentor/base.py
+++ b/deepspeech/frontend/augmentor/base.py
--- a/deepspeech/frontend/augmentor/impulse_response.py
+++ b/deepspeech/frontend/augmentor/impulse_response.py
--- a/deepspeech/frontend/augmentor/noise_perturb.py
+++ b/deepspeech/frontend/augmentor/noise_perturb.py
--- a/deepspeech/frontend/augmentor/online_bayesian_normalization.py
+++ b/deepspeech/frontend/augmentor/online_bayesian_normalization.py
--- a/deepspeech/frontend/augmentor/resample.py
+++ b/deepspeech/frontend/augmentor/resample.py
--- a/deepspeech/frontend/augmentor/shift_perturb.py
+++ b/deepspeech/frontend/augmentor/shift_perturb.py
--- a/deepspeech/frontend/augmentor/spec_augment.py
+++ b/deepspeech/frontend/augmentor/spec_augment.py
--- a/deepspeech/frontend/augmentor/speed_perturb.py
+++ b/deepspeech/frontend/augmentor/speed_perturb.py
--- a/deepspeech/frontend/augmentor/volume_perturb.py
+++ b/deepspeech/frontend/augmentor/volume_perturb.py
--- a/deepspeech/frontend/featurizer/audio_featurizer.py
+++ b/deepspeech/frontend/featurizer/audio_featurizer.py
--- a/deepspeech/frontend/featurizer/speech_featurizer.py
+++ b/deepspeech/frontend/featurizer/speech_featurizer.py
--- a/deepspeech/frontend/featurizer/text_featurizer.py
+++ b/deepspeech/frontend/featurizer/text_featurizer.py
--- a/deepspeech/frontend/normalizer.py
+++ b/deepspeech/frontend/normalizer.py
--- a/deepspeech/frontend/speech.py
+++ b/deepspeech/frontend/speech.py
--- a/deepspeech/frontend/utility.py
+++ b/deepspeech/frontend/utility.py
--- a/deepspeech/io/__init__.py
+++ b/deepspeech/io/__init__.py
--- a/deepspeech/io/collator.py
+++ b/deepspeech/io/collator.py
--- a/deepspeech/io/dataset.py
+++ b/deepspeech/io/dataset.py
--- a/deepspeech/io/sampler.py
+++ b/deepspeech/io/sampler.py
--- a/deepspeech/io/utility.py
+++ b/deepspeech/io/utility.py
--- a/deepspeech/models/deepspeech2.py
+++ b/deepspeech/models/deepspeech2.py
--- a/deepspeech/models/u2.py
+++ b/deepspeech/models/u2.py
--- a/deepspeech/modules/activation.py
+++ b/deepspeech/modules/activation.py
--- a/deepspeech/modules/attention.py
+++ b/deepspeech/modules/attention.py
--- a/deepspeech/modules/cmvn.py
+++ b/deepspeech/modules/cmvn.py
--- a/deepspeech/modules/conformer_convolution.py
+++ b/deepspeech/modules/conformer_convolution.py
--- a/deepspeech/modules/conv.py
+++ b/deepspeech/modules/conv.py
--- a/deepspeech/modules/ctc.py
+++ b/deepspeech/modules/ctc.py
--- a/deepspeech/modules/decoder.py
+++ b/deepspeech/modules/decoder.py
--- a/deepspeech/modules/decoder_layer.py
+++ b/deepspeech/modules/decoder_layer.py
--- a/deepspeech/modules/embedding.py
+++ b/deepspeech/modules/embedding.py
--- a/deepspeech/modules/encoder.py
+++ b/deepspeech/modules/encoder.py
--- a/deepspeech/modules/encoder_layer.py
+++ b/deepspeech/modules/encoder_layer.py
--- a/deepspeech/modules/loss.py
+++ b/deepspeech/modules/loss.py
--- a/deepspeech/modules/mask.py
+++ b/deepspeech/modules/mask.py
--- a/deepspeech/modules/positionwise_feed_forward.py
+++ b/deepspeech/modules/positionwise_feed_forward.py
--- a/deepspeech/modules/rnn.py
+++ b/deepspeech/modules/rnn.py
--- a/deepspeech/modules/subsampling.py
+++ b/deepspeech/modules/subsampling.py
--- a/deepspeech/training/__init__.py
+++ b/deepspeech/training/__init__.py
@@ -11,5 +11,3 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-
-from deepspeech.training.trainer import *
--- a/deepspeech/training/cli.py
+++ b/deepspeech/training/cli.py
--- a/deepspeech/training/gradclip.py
+++ b/deepspeech/training/gradclip.py
--- a/deepspeech/training/scheduler.py
+++ b/deepspeech/training/scheduler.py
--- a/deepspeech/training/trainer.py
+++ b/deepspeech/training/trainer.py
--- a/deepspeech/utils/checkpoint.py
+++ b/deepspeech/utils/checkpoint.py
--- a/deepspeech/utils/ctc_utils.py
+++ b/deepspeech/utils/ctc_utils.py
--- a/deepspeech/utils/error_rate.py
+++ b/deepspeech/utils/error_rate.py
--- a/deepspeech/utils/layer_tools.py
+++ b/deepspeech/utils/layer_tools.py
--- a/deepspeech/utils/log.py
+++ b/deepspeech/utils/log.py
--- a/deepspeech/utils/mp_tools.py
+++ b/deepspeech/utils/mp_tools.py
--- a/deepspeech/utils/socket_server.py
+++ b/deepspeech/utils/socket_server.py
--- a/deepspeech/utils/tensor_utils.py
+++ b/deepspeech/utils/tensor_utils.py
--- a/deepspeech/utils/utility.py
+++ b/deepspeech/utils/utility.py
--- a/docs/augmentation.md
+++ b/docs/augmentation.md
--- a/docs/benchmark.md
+++ b/docs/benchmark.md
--- a/docs/src/chinese_syllable.md
+++ b/docs/src/chinese_syllable.md
--- a/docs/data_preparation.md
+++ b/docs/data_preparation.md
--- a/docs/faq.md
+++ b/docs/faq.md
--- a/docs/getting_started.md
+++ b/docs/getting_started.md
--- a/docs/install.md
+++ b/docs/install.md
--- a/docs/ngram_lm.md
+++ b/docs/ngram_lm.md
--- a/docs/src/reference.md
+++ b/docs/src/reference.md
--- a/docs/released_model.md
+++ b/docs/released_model.md
--- a/docs/server.md
+++ b/docs/server.md
--- a/docs/src/text_front_end.md
+++ b/docs/src/text_front_end.md
--- a/env.sh
+++ b/env.sh
--- a/examples/aishell/.gitignore
+++ b/examples/aishell/.gitignore
--- a/examples/aishell/README.md
+++ b/examples/aishell/README.md
--- a/examples/aishell/conf/augmentation.config
+++ b/examples/aishell/conf/augmentation.config
--- a/examples/aishell/local/data.sh
+++ b/examples/aishell/local/data.sh
--- a/examples/aishell/local/download_model.sh
+++ b/examples/aishell/local/download_model.sh
--- a/examples/aishell/local/export.sh
+++ b/examples/aishell/local/export.sh
--- a/examples/aishell/local/infer.sh
+++ b/examples/aishell/local/infer.sh
--- a/examples/aishell/run.sh
+++ b/examples/aishell/run.sh
--- a/examples/aishell/s0/.gitignore
+++ b/examples/aishell/s0/.gitignore
--- a/examples/aishell/s0/README.md
+++ b/examples/aishell/s0/README.md
--- a/examples/aishell/s0/conf/augmentation.json
+++ b/examples/aishell/s0/conf/augmentation.json
--- a/examples/aishell/conf/deepspeech2.yaml
+++ b/examples/aishell/conf/deepspeech2.yaml
--- a/examples/aishell/s0/local/avg.sh
+++ b/examples/aishell/s0/local/avg.sh
--- a/examples/aishell/local/client.sh
+++ b/examples/aishell/local/client.sh
--- a/examples/aishell/s0/local/data.sh
+++ b/examples/aishell/s0/local/data.sh
--- a/examples/aishell/local/download_lm_ch.sh
+++ b/examples/aishell/local/download_lm_ch.sh
--- a/examples/aishell/s0/local/export.sh
+++ b/examples/aishell/s0/local/export.sh
--- a/examples/aishell/local/server.sh
+++ b/examples/aishell/local/server.sh
--- a/examples/aishell/local/test.sh
+++ b/examples/aishell/local/test.sh
--- a/examples/aishell/local/train.sh
+++ b/examples/aishell/local/train.sh
--- a/examples/aishell/local/tune.sh
+++ b/examples/aishell/local/tune.sh
--- a/examples/aishell/path.sh
+++ b/examples/aishell/path.sh
--- a/examples/aishell/s0/run.sh
+++ b/examples/aishell/s0/run.sh
--- a/examples/aishell/s1/.gitignore
+++ b/examples/aishell/s1/.gitignore
--- a/examples/aishell/s1/conf/augmentation.json
+++ b/examples/aishell/s1/conf/augmentation.json
--- a/examples/aishell/s1/conf/conformer.yaml
+++ b/examples/aishell/s1/conf/conformer.yaml
--- a/examples/aishell/s1/local/avg.sh
+++ b/examples/aishell/s1/local/avg.sh
--- a/examples/aishell/s1/local/data.sh
+++ b/examples/aishell/s1/local/data.sh
--- a/examples/aishell/s1/local/download_lm_ch.sh
+++ b/examples/aishell/s1/local/download_lm_ch.sh
--- a/examples/aishell/s1/local/export.sh
+++ b/examples/aishell/s1/local/export.sh
--- a/examples/aishell/s1/local/test.sh
+++ b/examples/aishell/s1/local/test.sh
--- a/examples/aishell/s1/local/train.sh
+++ b/examples/aishell/s1/local/train.sh
--- a/examples/aishell/s1/path.sh
+++ b/examples/aishell/s1/path.sh
--- a/examples/aishell/s1/run.sh
+++ b/examples/aishell/s1/run.sh
--- a/examples/aug_conf/augmentation.config
+++ b/examples/aug_conf/augmentation.config
--- a/examples/aug_conf/augmentation.config.example
+++ b/examples/aug_conf/augmentation.config.example
--- a/examples/aug_conf/augmentation.example.json
+++ b/examples/aug_conf/augmentation.example.json
--- a/examples/aug_conf/augmentation.json
+++ b/examples/aug_conf/augmentation.json
--- a/examples/dataset/aishell/aishell.py
+++ b/examples/dataset/aishell/aishell.py
--- a/examples/dataset/chime3_background/chime3_background.py
+++ b/examples/dataset/chime3_background/chime3_background.py
--- a/examples/dataset/librispeech/librispeech.py
+++ b/examples/dataset/librispeech/librispeech.py
--- a/examples/dataset/mini_librispeech/mini_librispeech.py
+++ b/examples/dataset/mini_librispeech/mini_librispeech.py
--- a/examples/dataset/musan/musan.py
+++ b/examples/dataset/musan/musan.py
--- a/examples/dataset/rir_noise/rir_noise.py
+++ b/examples/dataset/rir_noise/rir_noise.py
--- a/examples/dataset/voxforge/voxforge.py
+++ b/examples/dataset/voxforge/voxforge.py
--- a/examples/librispeech/.gitignore
+++ b/examples/librispeech/.gitignore
--- a/examples/librispeech/README.md
+++ b/examples/librispeech/README.md
--- a/examples/librispeech/conf/augmentation.config
+++ b/examples/librispeech/conf/augmentation.config
--- a/examples/librispeech/local/data.sh
+++ b/examples/librispeech/local/data.sh
--- a/examples/librispeech/local/export.sh
+++ b/examples/librispeech/local/export.sh
--- a/examples/librispeech/local/infer.sh
+++ b/examples/librispeech/local/infer.sh
--- a/examples/librispeech/run.sh
+++ b/examples/librispeech/run.sh
--- a/examples/librispeech/s0/README.md
+++ b/examples/librispeech/s0/README.md
--- a/examples/librispeech/s0/conf/augmentation.json
+++ b/examples/librispeech/s0/conf/augmentation.json
--- a/examples/librispeech/conf/deepspeech2.yaml
+++ b/examples/librispeech/conf/deepspeech2.yaml
--- a/examples/librispeech/s0/local/avg.sh
+++ b/examples/librispeech/s0/local/avg.sh
--- a/examples/librispeech/s0/local/data.sh
+++ b/examples/librispeech/s0/local/data.sh
--- a/examples/librispeech/local/download_lm_en.sh
+++ b/examples/librispeech/local/download_lm_en.sh
--- a/examples/librispeech/s0/local/export.sh
+++ b/examples/librispeech/s0/local/export.sh
--- a/examples/librispeech/local/test.sh
+++ b/examples/librispeech/local/test.sh
--- a/examples/librispeech/s0/local/train.sh
+++ b/examples/librispeech/s0/local/train.sh
--- a/examples/librispeech/local/tune.sh
+++ b/examples/librispeech/local/tune.sh
--- a/examples/librispeech/path.sh
+++ b/examples/librispeech/path.sh
--- a/examples/librispeech/s0/run.sh
+++ b/examples/librispeech/s0/run.sh
--- a/examples/librispeech/s1/.gitignore
+++ b/examples/librispeech/s1/.gitignore
--- a/examples/librispeech/s1/conf/augmentation.json
+++ b/examples/librispeech/s1/conf/augmentation.json
--- a/examples/librispeech/s1/conf/chunk_confermer.yaml
+++ b/examples/librispeech/s1/conf/chunk_confermer.yaml
--- a/examples/librispeech/s1/conf/chunk_transformer.yaml
+++ b/examples/librispeech/s1/conf/chunk_transformer.yaml
--- a/examples/librispeech/s1/conf/conformer.yaml
+++ b/examples/librispeech/s1/conf/conformer.yaml
--- a/examples/librispeech/s1/conf/transformer.yaml
+++ b/examples/librispeech/s1/conf/transformer.yaml
--- a/examples/librispeech/s1/local/avg.sh
+++ b/examples/librispeech/s1/local/avg.sh
--- a/examples/librispeech/s1/local/data.sh
+++ b/examples/librispeech/s1/local/data.sh
--- a/examples/tiny/local/download_lm_en.sh
+++ b/examples/tiny/local/download_lm_en.sh
--- a/examples/librispeech/s1/local/export.sh
+++ b/examples/librispeech/s1/local/export.sh
--- a/examples/librispeech/s1/local/test.sh
+++ b/examples/librispeech/s1/local/test.sh
--- a/examples/librispeech/s1/local/train.sh
+++ b/examples/librispeech/s1/local/train.sh
--- a/examples/librispeech/s1/path.sh
+++ b/examples/librispeech/s1/path.sh
--- a/examples/librispeech/s1/run.sh
+++ b/examples/librispeech/s1/run.sh
--- a/examples/spm/.gitignore
+++ b/examples/spm/.gitignore
--- a/examples/spm/README.md
+++ b/examples/spm/README.md
--- a/examples/spm/path.sh
+++ b/examples/spm/path.sh
--- a/examples/spm/run.sh
+++ b/examples/spm/run.sh
--- a/examples/spm/text
+++ b/examples/spm/text
--- a/examples/tiny/README.md
+++ b/examples/tiny/README.md
--- a/examples/tiny/conf/augmentation.config
+++ b/examples/tiny/conf/augmentation.config
--- a/examples/tiny/local/data.sh
+++ b/examples/tiny/local/data.sh
--- a/examples/tiny/local/download_model.sh
+++ b/examples/tiny/local/download_model.sh
--- a/examples/tiny/local/export.sh
+++ b/examples/tiny/local/export.sh
--- a/examples/tiny/local/infer.sh
+++ b/examples/tiny/local/infer.sh
--- a/examples/tiny/local/train.sh
+++ b/examples/tiny/local/train.sh
--- a/examples/tiny/run.sh
+++ b/examples/tiny/run.sh
--- a/examples/tiny/s0/.gitignore
+++ b/examples/tiny/s0/.gitignore
--- a/examples/tiny/s0/README.md
+++ b/examples/tiny/s0/README.md
--- a/examples/tiny/s0/conf/augmentation.json
+++ b/examples/tiny/s0/conf/augmentation.json
--- a/examples/tiny/conf/deepspeech2.yaml
+++ b/examples/tiny/conf/deepspeech2.yaml
--- a/examples/tiny/s0/local/avg.sh
+++ b/examples/tiny/s0/local/avg.sh
--- a/examples/tiny/s0/local/data.sh
+++ b/examples/tiny/s0/local/data.sh
--- a/examples/librispeech/local/download_model.sh
+++ b/examples/librispeech/local/download_model.sh
--- a/examples/tiny/s0/local/export.sh
+++ b/examples/tiny/s0/local/export.sh
--- a/examples/tiny/local/test.sh
+++ b/examples/tiny/local/test.sh
--- a/examples/librispeech/local/train.sh
+++ b/examples/librispeech/local/train.sh
--- a/examples/tiny/local/tune.sh
+++ b/examples/tiny/local/tune.sh
--- a/examples/tiny/path.sh
+++ b/examples/tiny/path.sh
--- a/examples/tiny/s0/run.sh
+++ b/examples/tiny/s0/run.sh
--- a/examples/tiny/s1/.gitignore
+++ b/examples/tiny/s1/.gitignore
--- a/examples/tiny/s1/conf/augmentation.json
+++ b/examples/tiny/s1/conf/augmentation.json
--- a/examples/tiny/s1/conf/chunk_confermer.yaml
+++ b/examples/tiny/s1/conf/chunk_confermer.yaml
--- a/examples/tiny/s1/conf/chunk_transformer.yaml
+++ b/examples/tiny/s1/conf/chunk_transformer.yaml
--- a/examples/tiny/s1/conf/conformer.yaml
+++ b/examples/tiny/s1/conf/conformer.yaml
--- a/examples/tiny/s1/conf/transformer.yaml
+++ b/examples/tiny/s1/conf/transformer.yaml
--- a/examples/tiny/s1/local/avg.sh
+++ b/examples/tiny/s1/local/avg.sh
--- a/examples/tiny/s1/local/data.sh
+++ b/examples/tiny/s1/local/data.sh
--- a/examples/tiny/s1/local/download_lm_en.sh
+++ b/examples/tiny/s1/local/download_lm_en.sh
--- a/examples/tiny/s1/local/export.sh
+++ b/examples/tiny/s1/local/export.sh
--- a/examples/tiny/s1/local/test.sh
+++ b/examples/tiny/s1/local/test.sh
--- a/examples/tiny/s1/local/train.sh
+++ b/examples/tiny/s1/local/train.sh
--- a/examples/tiny/s1/path.sh
+++ b/examples/tiny/s1/path.sh
--- a/examples/tiny/s1/run.sh
+++ b/examples/tiny/s1/run.sh
--- a/requirements.txt
+++ b/requirements.txt
--- a/setup.sh
+++ b/setup.sh
--- a/tests/deepspeech2_model_test.py
+++ b/tests/deepspeech2_model_test.py
--- a/tests/test_error_rate.py
+++ b/tests/test_error_rate.py
--- a/tests/mask_test.py
+++ b/tests/mask_test.py
--- a/tests/network_test.py
+++ b/tests/network_test.py
--- a/tests/u2_model_test.py
+++ b/tests/u2_model_test.py
--- a/third_party/README.md
+++ b/third_party/README.md
--- a/third_party/pymmseg-cpp/.gitignore
+++ b/third_party/pymmseg-cpp/.gitignore
--- a/third_party/pymmseg-cpp/DESCRIPTION
+++ b/third_party/pymmseg-cpp/DESCRIPTION
--- a/third_party/pymmseg-cpp/MANIFEST.in
+++ b/third_party/pymmseg-cpp/MANIFEST.in
--- a/third_party/pymmseg-cpp/README.md
+++ b/third_party/pymmseg-cpp/README.md
--- a/third_party/pymmseg-cpp/bin/pymmseg
+++ b/third_party/pymmseg-cpp/bin/pymmseg
--- a/third_party/pymmseg-cpp/mmseg/data/chars.dic
+++ b/third_party/pymmseg-cpp/mmseg/data/chars.dic
--- a/third_party/pymmseg-cpp/mmseg/data/words.dic
+++ b/third_party/pymmseg-cpp/mmseg/data/words.dic
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/algor.cpp
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/algor.cpp
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/algor.h
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/algor.h
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/chunk.h
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/chunk.h
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/dict.cpp
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/dict.cpp
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/dict.h
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/dict.h
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/memory.cpp
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/memory.cpp
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/memory.h
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/memory.h
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/mmseg.cpp
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/mmseg.cpp
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/rules.h
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/rules.h
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/token.h
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/token.h
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/utils.h
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/utils.h
--- a/third_party/pymmseg-cpp/mmseg/mmseg-cpp/word.h
+++ b/third_party/pymmseg-cpp/mmseg/mmseg-cpp/word.h
--- a/third_party/pymmseg-cpp/setup.cfg
+++ b/third_party/pymmseg-cpp/setup.cfg
--- a/third_party/pymmseg-cpp/setup.py
+++ b/third_party/pymmseg-cpp/setup.py
--- a/third_party/pymmseg-cpp/tests/mmseg_test.py
+++ b/third_party/pymmseg-cpp/tests/mmseg_test.py
--- a/third_party/pymmseg-cpp/tests/test.sh
+++ b/third_party/pymmseg-cpp/tests/test.sh
--- a/third_party/python-pinyin/.bumpversion.cfg
+++ b/third_party/python-pinyin/.bumpversion.cfg
--- a/third_party/python-pinyin/.circleci/config.yml
+++ b/third_party/python-pinyin/.circleci/config.yml
--- a/third_party/python-pinyin/.coveragerc
+++ b/third_party/python-pinyin/.coveragerc
--- a/third_party/python-pinyin/.editorconfig
+++ b/third_party/python-pinyin/.editorconfig
--- a/third_party/python-pinyin/.flake8
+++ b/third_party/python-pinyin/.flake8
--- a/third_party/python-pinyin/.github/CONTRIBUTING.md
+++ b/third_party/python-pinyin/.github/CONTRIBUTING.md
--- a/third_party/python-pinyin/.github/ISSUE_TEMPLATE.md
+++ b/third_party/python-pinyin/.github/ISSUE_TEMPLATE.md
--- a/third_party/python-pinyin/.github/PULL_REQUEST_TEMPLATE.md
+++ b/third_party/python-pinyin/.github/PULL_REQUEST_TEMPLATE.md
--- a/third_party/python-pinyin/.github/workflows/ci.yml
+++ b/third_party/python-pinyin/.github/workflows/ci.yml
--- a/third_party/python-pinyin/.github/workflows/codeql-analysis.yml
+++ b/third_party/python-pinyin/.github/workflows/codeql-analysis.yml
--- a/third_party/python-pinyin/.gitignore
+++ b/third_party/python-pinyin/.gitignore
--- a/third_party/python-pinyin/.gitmodules
+++ b/third_party/python-pinyin/.gitmodules
--- a/third_party/python-pinyin/.pre-commit-config.yaml
+++ b/third_party/python-pinyin/.pre-commit-config.yaml
--- a/third_party/python-pinyin/.style.yapf
+++ b/third_party/python-pinyin/.style.yapf
--- a/third_party/python-pinyin/.whitesource
+++ b/third_party/python-pinyin/.whitesource
--- a/third_party/python-pinyin/CHANGELOG.rst
+++ b/third_party/python-pinyin/CHANGELOG.rst
--- a/third_party/python-pinyin/CODE_OF_CONDUCT.md
+++ b/third_party/python-pinyin/CODE_OF_CONDUCT.md
--- a/third_party/python-pinyin/LICENSE.txt
+++ b/third_party/python-pinyin/LICENSE.txt
--- a/third_party/python-pinyin/MANIFEST.in
+++ b/third_party/python-pinyin/MANIFEST.in
--- a/third_party/python-pinyin/Makefile
+++ b/third_party/python-pinyin/Makefile
--- a/third_party/python-pinyin/README.md
+++ b/third_party/python-pinyin/README.md
--- a/third_party/python-pinyin/README.rst
+++ b/third_party/python-pinyin/README.rst
--- a/third_party/python-pinyin/docs/CHANGELOG.rst
+++ b/third_party/python-pinyin/docs/CHANGELOG.rst
--- a/third_party/python-pinyin/docs/Makefile
+++ b/third_party/python-pinyin/docs/Makefile
--- a/third_party/python-pinyin/docs/api.rst
+++ b/third_party/python-pinyin/docs/api.rst
--- a/third_party/python-pinyin/docs/conf.py
+++ b/third_party/python-pinyin/docs/conf.py
--- a/third_party/python-pinyin/docs/contrib.rst
+++ b/third_party/python-pinyin/docs/contrib.rst
--- a/third_party/python-pinyin/docs/develop.rst
+++ b/third_party/python-pinyin/docs/develop.rst
--- a/third_party/python-pinyin/docs/faq.rst
+++ b/third_party/python-pinyin/docs/faq.rst
--- a/third_party/python-pinyin/docs/index.rst
+++ b/third_party/python-pinyin/docs/index.rst
--- a/third_party/python-pinyin/docs/installation.rst
+++ b/third_party/python-pinyin/docs/installation.rst
--- a/third_party/python-pinyin/docs/make.bat
+++ b/third_party/python-pinyin/docs/make.bat
--- a/third_party/python-pinyin/docs/related.rst
+++ b/third_party/python-pinyin/docs/related.rst
--- a/third_party/python-pinyin/docs/usage.rst
+++ b/third_party/python-pinyin/docs/usage.rst
--- a/third_party/python-pinyin/gen_phrases_dict.py
+++ b/third_party/python-pinyin/gen_phrases_dict.py
--- a/third_party/python-pinyin/gen_pinyin_dict.py
+++ b/third_party/python-pinyin/gen_pinyin_dict.py
--- a/third_party/python-pinyin/phrase-pinyin-data/.bumpversion.cfg
+++ b/third_party/python-pinyin/phrase-pinyin-data/.bumpversion.cfg
--- a/third_party/python-pinyin/phrase-pinyin-data/.gitignore
+++ b/third_party/python-pinyin/phrase-pinyin-data/.gitignore
--- a/third_party/python-pinyin/phrase-pinyin-data/.travis.yml
+++ b/third_party/python-pinyin/phrase-pinyin-data/.travis.yml
--- a/third_party/python-pinyin/phrase-pinyin-data/CHANGELOG.md
+++ b/third_party/python-pinyin/phrase-pinyin-data/CHANGELOG.md
--- a/third_party/python-pinyin/phrase-pinyin-data/LICENSE
+++ b/third_party/python-pinyin/phrase-pinyin-data/LICENSE
--- a/third_party/python-pinyin/phrase-pinyin-data/Makefile
+++ b/third_party/python-pinyin/phrase-pinyin-data/Makefile
--- a/third_party/python-pinyin/phrase-pinyin-data/README.md
+++ b/third_party/python-pinyin/phrase-pinyin-data/README.md
--- a/third_party/python-pinyin/phrase-pinyin-data/cc_cedict.txt
+++ b/third_party/python-pinyin/phrase-pinyin-data/cc_cedict.txt
--- a/third_party/python-pinyin/phrase-pinyin-data/get_latest_cc_cedict.py
+++ b/third_party/python-pinyin/phrase-pinyin-data/get_latest_cc_cedict.py
--- a/third_party/python-pinyin/phrase-pinyin-data/large_pinyin.txt
+++ b/third_party/python-pinyin/phrase-pinyin-data/large_pinyin.txt
--- a/third_party/python-pinyin/phrase-pinyin-data/merge.py
+++ b/third_party/python-pinyin/phrase-pinyin-data/merge.py
--- a/third_party/python-pinyin/phrase-pinyin-data/overwrite.txt
+++ b/third_party/python-pinyin/phrase-pinyin-data/overwrite.txt
--- a/third_party/python-pinyin/phrase-pinyin-data/parse_latest_cc_cedict.py
+++ b/third_party/python-pinyin/phrase-pinyin-data/parse_latest_cc_cedict.py
--- a/third_party/python-pinyin/phrase-pinyin-data/pinyin.txt
+++ b/third_party/python-pinyin/phrase-pinyin-data/pinyin.txt
--- a/third_party/python-pinyin/phrase-pinyin-data/requirements_dev.txt
+++ b/third_party/python-pinyin/phrase-pinyin-data/requirements_dev.txt
--- a/third_party/python-pinyin/phrase-pinyin-data/zdic_cibs.txt
+++ b/third_party/python-pinyin/phrase-pinyin-data/zdic_cibs.txt
--- a/third_party/python-pinyin/phrase-pinyin-data/zdic_cybs.txt
+++ b/third_party/python-pinyin/phrase-pinyin-data/zdic_cybs.txt
--- a/third_party/python-pinyin/pinyin-data/.bumpversion.cfg
+++ b/third_party/python-pinyin/pinyin-data/.bumpversion.cfg
--- a/third_party/python-pinyin/pinyin-data/.github/workflows/python-app.yml
+++ b/third_party/python-pinyin/pinyin-data/.github/workflows/python-app.yml
--- a/third_party/python-pinyin/pinyin-data/.gitignore
+++ b/third_party/python-pinyin/pinyin-data/.gitignore
--- a/third_party/python-pinyin/pinyin-data/.travis.yml
+++ b/third_party/python-pinyin/pinyin-data/.travis.yml
--- a/third_party/python-pinyin/pinyin-data/CHANGELOG.md
+++ b/third_party/python-pinyin/pinyin-data/CHANGELOG.md
--- a/third_party/python-pinyin/pinyin-data/GBK_PUA.txt
+++ b/third_party/python-pinyin/pinyin-data/GBK_PUA.txt
--- a/third_party/python-pinyin/pinyin-data/LICENSE
+++ b/third_party/python-pinyin/pinyin-data/LICENSE
--- a/third_party/python-pinyin/pinyin-data/Makefile
+++ b/third_party/python-pinyin/pinyin-data/Makefile
--- a/third_party/python-pinyin/pinyin-data/README.md
+++ b/third_party/python-pinyin/pinyin-data/README.md
--- a/third_party/python-pinyin/pinyin-data/kHanyuPinlu.txt
+++ b/third_party/python-pinyin/pinyin-data/kHanyuPinlu.txt
--- a/third_party/python-pinyin/pinyin-data/kHanyuPinyin.txt
+++ b/third_party/python-pinyin/pinyin-data/kHanyuPinyin.txt
--- a/third_party/python-pinyin/pinyin-data/kMandarin.txt
+++ b/third_party/python-pinyin/pinyin-data/kMandarin.txt
--- a/third_party/python-pinyin/pinyin-data/kMandarin_8105.txt
+++ b/third_party/python-pinyin/pinyin-data/kMandarin_8105.txt
--- a/third_party/python-pinyin/pinyin-data/kMandarin_overwrite.txt
+++ b/third_party/python-pinyin/pinyin-data/kMandarin_overwrite.txt
--- a/third_party/python-pinyin/pinyin-data/kTGHZ2013.txt
+++ b/third_party/python-pinyin/pinyin-data/kTGHZ2013.txt
--- a/third_party/python-pinyin/pinyin-data/kXHC1983.txt
+++ b/third_party/python-pinyin/pinyin-data/kXHC1983.txt
--- a/third_party/python-pinyin/pinyin-data/kanji.txt
+++ b/third_party/python-pinyin/pinyin-data/kanji.txt
--- a/third_party/python-pinyin/pinyin-data/merge_unihan.py
+++ b/third_party/python-pinyin/pinyin-data/merge_unihan.py
--- a/third_party/python-pinyin/pinyin-data/nonCJKUI.txt
+++ b/third_party/python-pinyin/pinyin-data/nonCJKUI.txt
--- a/third_party/python-pinyin/pinyin-data/overwrite.txt
+++ b/third_party/python-pinyin/pinyin-data/overwrite.txt
--- a/third_party/python-pinyin/pinyin-data/pinyin.txt
+++ b/third_party/python-pinyin/pinyin-data/pinyin.txt
--- a/third_party/python-pinyin/pinyin-data/tools/china-8105-06062014.txt
+++ b/third_party/python-pinyin/pinyin-data/tools/china-8105-06062014.txt
--- a/third_party/python-pinyin/pinyin-data/tools/gen_8105.py
+++ b/third_party/python-pinyin/pinyin-data/tools/gen_8105.py
--- a/third_party/python-pinyin/pinyin-data/tools/gen_gb_pua.py
+++ b/third_party/python-pinyin/pinyin-data/tools/gen_gb_pua.py
--- a/third_party/python-pinyin/pinyin-data/tools/improve_8105.py
+++ b/third_party/python-pinyin/pinyin-data/tools/improve_8105.py
--- a/third_party/python-pinyin/pinyin-data/tools/requirements.txt
+++ b/third_party/python-pinyin/pinyin-data/tools/requirements.txt
--- a/third_party/python-pinyin/pinyin-data/unihan/.gitignore
+++ b/third_party/python-pinyin/pinyin-data/unihan/.gitignore
--- a/third_party/python-pinyin/pinyin-data/unihan/Makefile
+++ b/third_party/python-pinyin/pinyin-data/unihan/Makefile
--- a/third_party/python-pinyin/pinyin-data/unihan/README.md
+++ b/third_party/python-pinyin/pinyin-data/unihan/README.md
--- a/third_party/python-pinyin/pinyin-data/unihan/diff.sh
+++ b/third_party/python-pinyin/pinyin-data/unihan/diff.sh
--- a/third_party/python-pinyin/pinyin-data/unihan/kHanyuPinlu.txt
+++ b/third_party/python-pinyin/pinyin-data/unihan/kHanyuPinlu.txt
--- a/third_party/python-pinyin/pinyin-data/unihan/kHanyuPinyin.txt
+++ b/third_party/python-pinyin/pinyin-data/unihan/kHanyuPinyin.txt
--- a/third_party/python-pinyin/pinyin-data/unihan/kMandarin.txt
+++ b/third_party/python-pinyin/pinyin-data/unihan/kMandarin.txt
--- a/third_party/python-pinyin/pinyin-data/unihan/kTGHZ2013.txt
+++ b/third_party/python-pinyin/pinyin-data/unihan/kTGHZ2013.txt
--- a/third_party/python-pinyin/pinyin-data/unihan/kXHC1983.txt
+++ b/third_party/python-pinyin/pinyin-data/unihan/kXHC1983.txt
--- a/third_party/python-pinyin/pinyin-data/unihan/parse_pinyin.py
+++ b/third_party/python-pinyin/pinyin-data/unihan/parse_pinyin.py
--- a/third_party/python-pinyin/pinyin-data/zdic.txt
+++ b/third_party/python-pinyin/pinyin-data/zdic.txt
--- a/third_party/python-pinyin/pypinyin/__init__.py
+++ b/third_party/python-pinyin/pypinyin/__init__.py
--- a/third_party/python-pinyin/pypinyin/__main__.py
+++ b/third_party/python-pinyin/pypinyin/__main__.py
--- a/third_party/python-pinyin/pypinyin/constants.py
+++ b/third_party/python-pinyin/pypinyin/constants.py
--- a/third_party/python-pinyin/pypinyin/contrib/_tone_rule.py
+++ b/third_party/python-pinyin/pypinyin/contrib/_tone_rule.py
--- a/third_party/python-pinyin/pypinyin/contrib/neutral_tone.py
+++ b/third_party/python-pinyin/pypinyin/contrib/neutral_tone.py
--- a/third_party/python-pinyin/pypinyin/contrib/tone_convert.py
+++ b/third_party/python-pinyin/pypinyin/contrib/tone_convert.py
--- a/third_party/python-pinyin/pypinyin/contrib/uv.py
+++ b/third_party/python-pinyin/pypinyin/contrib/uv.py
--- a/third_party/python-pinyin/pypinyin/converter.py
+++ b/third_party/python-pinyin/pypinyin/converter.py
--- a/third_party/python-pinyin/pypinyin/core.py
+++ b/third_party/python-pinyin/pypinyin/core.py
--- a/third_party/python-pinyin/pypinyin/phonetic_symbol.py
+++ b/third_party/python-pinyin/pypinyin/phonetic_symbol.py
--- a/third_party/python-pinyin/pypinyin/phrases_dict.py
+++ b/third_party/python-pinyin/pypinyin/phrases_dict.py
--- a/third_party/python-pinyin/pypinyin/pinyin_dict.py
+++ b/third_party/python-pinyin/pypinyin/pinyin_dict.py
--- a/third_party/python-pinyin/pypinyin/runner.py
+++ b/third_party/python-pinyin/pypinyin/runner.py
--- a/third_party/python-pinyin/pypinyin/seg/__init__.py
+++ b/third_party/python-pinyin/pypinyin/seg/__init__.py
--- a/third_party/python-pinyin/pypinyin/seg/mmseg.py
+++ b/third_party/python-pinyin/pypinyin/seg/mmseg.py
--- a/third_party/python-pinyin/pypinyin/seg/simpleseg.py
+++ b/third_party/python-pinyin/pypinyin/seg/simpleseg.py
--- a/third_party/python-pinyin/pypinyin/standard.py
+++ b/third_party/python-pinyin/pypinyin/standard.py
--- a/third_party/python-pinyin/pypinyin/style/__init__.py
+++ b/third_party/python-pinyin/pypinyin/style/__init__.py
--- a/third_party/python-pinyin/pypinyin/style/_constants.py
+++ b/third_party/python-pinyin/pypinyin/style/_constants.py
--- a/third_party/python-pinyin/pypinyin/style/_utils.py
+++ b/third_party/python-pinyin/pypinyin/style/_utils.py
--- a/third_party/python-pinyin/pypinyin/style/bopomofo.py
+++ b/third_party/python-pinyin/pypinyin/style/bopomofo.py
--- a/third_party/python-pinyin/pypinyin/style/cyrillic.py
+++ b/third_party/python-pinyin/pypinyin/style/cyrillic.py
--- a/third_party/python-pinyin/pypinyin/style/finals.py
+++ b/third_party/python-pinyin/pypinyin/style/finals.py
--- a/third_party/python-pinyin/pypinyin/style/initials.py
+++ b/third_party/python-pinyin/pypinyin/style/initials.py
--- a/third_party/python-pinyin/pypinyin/style/others.py
+++ b/third_party/python-pinyin/pypinyin/style/others.py
--- a/third_party/python-pinyin/pypinyin/style/tone.py
+++ b/third_party/python-pinyin/pypinyin/style/tone.py
--- a/third_party/python-pinyin/pypinyin/utils.py
+++ b/third_party/python-pinyin/pypinyin/utils.py
--- a/third_party/python-pinyin/pytest.ini
+++ b/third_party/python-pinyin/pytest.ini
--- a/third_party/python-pinyin/requirements.txt
+++ b/third_party/python-pinyin/requirements.txt
--- a/third_party/python-pinyin/setup.cfg
+++ b/third_party/python-pinyin/setup.cfg
--- a/third_party/python-pinyin/setup.py
+++ b/third_party/python-pinyin/setup.py
--- a/third_party/python-pinyin/tests/__init__.py
+++ b/third_party/python-pinyin/tests/__init__.py
--- a/third_party/python-pinyin/tests/conftest.py
+++ b/third_party/python-pinyin/tests/conftest.py
--- a/third_party/python-pinyin/tests/contrib/__init__.py
+++ b/third_party/python-pinyin/tests/contrib/__init__.py
--- a/third_party/python-pinyin/tests/contrib/test_neutral_tone.py
+++ b/third_party/python-pinyin/tests/contrib/test_neutral_tone.py
--- a/third_party/python-pinyin/tests/contrib/test_tone_convert.py
+++ b/third_party/python-pinyin/tests/contrib/test_tone_convert.py
--- a/third_party/python-pinyin/tests/contrib/test_tone_rule.py
+++ b/third_party/python-pinyin/tests/contrib/test_tone_rule.py
--- a/third_party/python-pinyin/tests/contrib/test_uv.py
+++ b/third_party/python-pinyin/tests/contrib/test_uv.py
--- a/third_party/python-pinyin/tests/seg/__init__.py
+++ b/third_party/python-pinyin/tests/seg/__init__.py
--- a/third_party/python-pinyin/tests/seg/test_mmseg.py
+++ b/third_party/python-pinyin/tests/seg/test_mmseg.py
--- a/third_party/python-pinyin/tests/test_cmd.py
+++ b/third_party/python-pinyin/tests/test_cmd.py
--- a/third_party/python-pinyin/tests/test_converter.py
+++ b/third_party/python-pinyin/tests/test_converter.py
--- a/third_party/python-pinyin/tests/test_core_cls.py
+++ b/third_party/python-pinyin/tests/test_core_cls.py
--- a/third_party/python-pinyin/tests/test_env.py
+++ b/third_party/python-pinyin/tests/test_env.py
--- a/third_party/python-pinyin/tests/test_others.py
+++ b/third_party/python-pinyin/tests/test_others.py
--- a/third_party/python-pinyin/tests/test_pinyin.py
+++ b/third_party/python-pinyin/tests/test_pinyin.py
--- a/third_party/python-pinyin/tests/test_standard.py
+++ b/third_party/python-pinyin/tests/test_standard.py
--- a/third_party/python-pinyin/tests/test_style.py
+++ b/third_party/python-pinyin/tests/test_style.py
--- a/third_party/python-pinyin/tests/utils.py
+++ b/third_party/python-pinyin/tests/utils.py
--- a/third_party/python-pinyin/tidy_phrases_dict.py
+++ b/third_party/python-pinyin/tidy_phrases_dict.py
--- a/third_party/python-pinyin/tox.ini
+++ b/third_party/python-pinyin/tox.ini
--- a/third_party/python_kaldi_features/.gitignore
+++ b/third_party/python_kaldi_features/.gitignore
--- a/third_party/python_kaldi_features/LICENSE
+++ b/third_party/python_kaldi_features/LICENSE
--- a/third_party/python_kaldi_features/MANIFEST
+++ b/third_party/python_kaldi_features/MANIFEST
--- a/third_party/python_kaldi_features/README.rst
+++ b/third_party/python_kaldi_features/README.rst
--- a/third_party/python_kaldi_features/docs/Makefile
+++ b/third_party/python_kaldi_features/docs/Makefile
--- a/third_party/python_kaldi_features/docs/make.bat
+++ b/third_party/python_kaldi_features/docs/make.bat
--- a/third_party/python_kaldi_features/docs/source/conf.py
+++ b/third_party/python_kaldi_features/docs/source/conf.py
--- a/third_party/python_kaldi_features/docs/source/index.rst
+++ b/third_party/python_kaldi_features/docs/source/index.rst
--- a/third_party/python_kaldi_features/english.wav
+++ b/third_party/python_kaldi_features/english.wav
--- a/third_party/python_kaldi_features/example.py
+++ b/third_party/python_kaldi_features/example.py
--- a/third_party/python_kaldi_features/python_speech_features/__init__.py
+++ b/third_party/python_kaldi_features/python_speech_features/__init__.py
--- a/third_party/python_kaldi_features/python_speech_features/base.py
+++ b/third_party/python_kaldi_features/python_speech_features/base.py
--- a/third_party/python_kaldi_features/python_speech_features/base_orig.py
+++ b/third_party/python_kaldi_features/python_speech_features/base_orig.py
--- a/third_party/python_kaldi_features/python_speech_features/sigproc.py
+++ b/third_party/python_kaldi_features/python_speech_features/sigproc.py
--- a/third_party/python_kaldi_features/python_speech_features/sigproc_orig.py
+++ b/third_party/python_kaldi_features/python_speech_features/sigproc_orig.py
--- a/third_party/python_kaldi_features/requirements.txt
+++ b/third_party/python_kaldi_features/requirements.txt
--- a/third_party/python_kaldi_features/setup.py
+++ b/third_party/python_kaldi_features/setup.py
--- a/third_party/python_kaldi_features/test/test_sigproc.py
+++ b/third_party/python_kaldi_features/test/test_sigproc.py
--- a/utils/avg_model.py
+++ b/utils/avg_model.py
--- a/utils/build_vocab.py
+++ b/utils/build_vocab.py
--- a/utils/compute_mean_std.py
+++ b/utils/compute_mean_std.py
--- a/utils/format_data.py
+++ b/utils/format_data.py
--- a/utils/parse_options.sh
+++ b/utils/parse_options.sh
--- a/utils/spm_decode
+++ b/utils/spm_decode
--- a/utils/spm_encode
+++ b/utils/spm_encode
--- a/utils/spm_train
+++ b/utils/spm_train
--- a/utils/utility.py
+++ b/utils/utility.py