Commits · 41e586314963a2b9346f2b5414524bab23ac6fd4 · PaddlePaddle / DeepSpeech

07 Sep, 2021 1 commit

release librispeech audio max len to 30 seconds · 184d30dd

Committed by Hui Zhang on Sep 07, 2021

06 Sep, 2021 1 commit

add blank_id parameter · 04d9db19

Committed by huangyuxin on Sep 06, 2021

08 Aug, 2021 1 commit

fix the bidirect rnn, add deepspeech2.yaml for aishell, tiny, librispeech · 7a3d1641

Committed by huangyuxin on Aug 08, 2021

29 Jul, 2021 1 commit

add the subsampling as conv · e4ef8ed3

Committed by huangyuxin on Jul 29, 2021

28 Jul, 2021 1 commit

Modified part of the LSTM and GRU code in deepspeech2.py and added LayerNorm · 2cacbaf4

Committed by huangyuxin on Jul 28, 2021

30 Jun, 2021 1 commit

revise conf/*.yaml · c0f7aac8

Committed by Haoxin Ma on Jun 30, 2021

25 Jun, 2021 1 commit

fix conf for ds2 · 019ae4b3

Committed by Hui Zhang on Jun 25, 2021

28 May, 2021 1 commit

add libri ds2 exp result · de780a0c

Committed by Hui Zhang on May 28, 2021

19 May, 2021 1 commit

Committed by Hui Zhang on May 19, 2021

* default cmvn compute config; more log of grad clip; diff ds2 cmvn compute and conf; ds2 lr step by epoch;

* fix ds2 config

* fix install and egs link

* sox speed perturb shape (T, C), float64, process using int32

* fix libri ds2 scripts; add ngram and spm doc

* aishell ds2 cer 7.86

* fix ds2 result

295f8bda

12 May, 2021 1 commit

E2E/Streaming Transformer/Conformer ASR (#578) · 71e046b0

Committed by Hui Zhang on May 12, 2021

* add cmvn and label smoothing loss layer

* add layer for transformer

* add glu and conformer conv

* add torch compatible hack, mask funcs

* not hack size since it exists

* add test; attention

* add attention, common utils, hack paddle

* add audio utils

* conformer batch padding mask bug fix #223

* fix typo, python infer fix rnn mem opt name error and batchnorm1d, will be available at 2.0.2

* fix ci

* fix ci

* add encoder

* refactor egs

* add decoder

* refactor ctc, add ctc align, refactor ckpt, add warmup lr scheduler, cmvn utils

* refactor docs

* add fix

* fix readme

* fix bugs, refactor collator, add pad_sequence, fix ckpt bugs

* fix docstring

* refactor data feed order

* add u2 model

* refactor cmvn, test

* add utils

* add u2 config

* fix bugs

* fix bugs

* fix a possible autograd problem when using in-place operations

* refactor data, build vocab; add format data

* fix text featurizer

* refactor build vocab

* add fbank, refactor feature of speech

* refactor audio feat

* refactor data prepare

* refactor data

* model init from config

* add u2 bins

* flake8

* can train

* fix bugs, add coverage, add scripts

* test can run

* fix data

* speed perturb with sox

* add spec aug

* fix for train

* fix train logic

* fix logger

* log valid loss, time dataset process

* using np for speed perturb, remove some debug log of grad clip

* fix logger

* fix build vocab

* fix logger name

* using module logger as default

* fix

* fix install

* reorder imports

* fix board logger

* fix logger

* kaldi fbank and mfcc

* fix cmvn and print params

* fix add_eos_sos and cmvn

* fix cmvn compute

* fix logger and cmvn

* fix subsampling, label smoothing loss, remove useless

* add notebook test

* fix log

* fix tb logger

* multi gpu valid

* fix log

* fix log

* fix config

* fix compute cmvn, need paddle 2.1

* add cmvn notebook

* fix layer tools

* fix compute cmvn

* add rtf

* fix decoding

* fix layer tools

* fix log, add avg script

* more avg and test info

* fix dataset pickle problem; use paddle 2.1; num_workers can be > 0; ckpt saved in exp dir; fix setup.sh

* add vimrc

* refactor tiny script, add transformer and stream conf

* spm demo; librispeech scripts and confs

* fix log

* add librispeech scripts

* refactor data pipe; fix conf; fix u2 default params

* fix bugs

* refactor aishell scripts

* fix test

* fix cmvn

* fix s0 scripts

* fix ds2 scripts and bugs

* fix dev & test dataset filter

* fix dataset filter

* filter dev

* fix ckpt path

* filter the test set, since librispeech will otherwise cause OOM; all test wer will be worse because of the train/test mismatch

* add comment

* add syllable doc

* fix ds2 configs

* add doc

* add pypinyin tools

* fix decoder using blank_id=0

* mmseg with pybind11

* format code
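
Several items above concern CMVN (cepstral mean and variance normalization) of the acoustic features. As a rough illustration only, the following plain-Python sketch computes per-dimension statistics over a few frames and applies them. The function names and the toy frame values are hypothetical and are not taken from the repository, which may compute its statistics differently.

```python
import math

def compute_cmvn_stats(frames):
    """Per-dimension mean and standard deviation over all frames (hypothetical helper)."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dim)]
    # Floor the variance so a constant dimension does not divide by zero.
    std = [math.sqrt(max(v, 1e-20)) for v in var]
    return mean, std

def apply_cmvn(frames, mean, std):
    """Shift each dimension to zero mean and scale it to unit variance."""
    return [[(x - m) / s for x, m, s in zip(f, mean, std)] for f in frames]

# Toy features: 3 frames, 2 dimensions (made-up values).
frames = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
mean, std = compute_cmvn_stats(frames)
normed = apply_cmvn(frames, mean, std)
```

In practice the statistics would normally be accumulated once over the training set and reused at validation and test time, so that all splits are normalized the same way.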

22 Mar, 2021 1 commit

batch average ctc loss (#567) · e0a87a5a

Committed by Hui Zhang on Mar 22, 2021

* when the loss is divided by the batch size, changing lr and training for more epochs lets the loss decrease further and gives a lower cer than before

* since the loss decreases more when it is divided by the batch size, a smaller lm alpha can be better

* smaller lm alpha, larger cer reduction

* alpha 2.2, cer 0.077478

* alpha 1.9, cer 0.077249

* large librispeech lr for batch_average ctc loss

* since the loss decreases and the model is more confident, use a smaller lm alpha
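
The notes above depend on how the loss reduction interacts with the learning rate. The sketch below is a toy illustration, not the repository's CTC code: it replaces each per-utterance CTC loss with a simple quadratic, and the targets and function names are made up. It only shows that dividing the summed loss by the batch size B scales the gradient by 1/B, which is why the learning rate would need retuning after such a change.

```python
# Toy stand-in for per-utterance CTC losses: loss_i(w) = (w - t_i) ** 2.
# The targets and the function names are hypothetical, for illustration only.

def per_utt_grads(w, targets):
    """Gradient of each per-utterance loss with respect to w."""
    return [2.0 * (w - t) for t in targets]

def grad_sum_reduced(w, targets):
    """Gradient when the batch loss is the plain sum of the utterance losses."""
    return sum(per_utt_grads(w, targets))

def grad_batch_averaged(w, targets):
    """Gradient when the summed loss is divided by the batch size."""
    return grad_sum_reduced(w, targets) / len(targets)

targets = [1.0, 2.0, 3.0, 6.0]  # batch size B = 4
w = 0.0
g_sum = grad_sum_reduced(w, targets)     # -24.0
g_avg = grad_batch_averaged(w, targets)  # -6.0, i.e. g_sum / B
```

Under plain SGD, a step with the averaged loss matches a step with the summed loss only if the learning rate is B times larger, which is consistent with the larger learning rate mentioned in the notes.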

08 Mar, 2021 1 commit

Support paddle 2.x (#538) · d7e75354

Committed by Hui Zhang on Mar 08, 2021

* 2.x model

* model test pass

* fix data

* fix soundfile with flac support

* one thread dataloader test pass

* export feature size
add trainer and utils
add setup model and dataloader
update travis using Bionic dist

* add venv; test under venv

* fix unittest; train and valid

* add train and config

* add config and train script

* fix ctc cuda memcopy error

* fix imports

* fix train valid log

* fix dataset batch shuffle shift start from 1
fix rank_zero_only decorator error
close tensorboard when train over
add decoding config and code

* test process can run

* test with decoding

* test and infer with decoding

* fix infer

* fix ctc loss
lr schedule
sortagrad
logger

* aishell egs

* refactor train
add aishell egs

* fix dataset batch shuffle and add batch sampler log
print model parameter

* fix model and ctc

* sequence_mask makes all inputs zero, which causes the grad to be zero; this is a bug of LessThanOp
add grad clip by global norm
add model train test notebook

* ctc loss
remove run prefix
using ord value as text id

* using unk when training
compute_loss need text ids
ord id using in test mode, which compute wer/cer

* fix tester

* add lr_decay
refactor code

* fix tools

* fix ci
add tune
fix gru model bugs
add dataset and model test

* fix decoding

* refactor repo
fix decoding

* fix musan and rir dataset

* refactor io, loss, conv, rnn, gradclip, model, utils

* fix ci and import

* refactor model
add export jit model

* add deploy bin and test it

* rm useless egs

* add layer tools

* refactor socket server
new model from pretrain

* remove useless

* fix instability loss and grad nan or inf for librispeech training

* fix sampler

* fix libri train.sh

* fix doc

* add license on cpp

* fix doc

* fix libri script

* fix install

* clip 5 wer 7.39, clip 400 wer 7.54, 1.8 clip 400 baseline 7.49
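
The list above adds gradient clipping by global norm, and the last item compares what appear to be clip thresholds of 5 and 400. Paddle 2.x ships this as `paddle.nn.ClipGradByGlobalNorm`; the plain-Python sketch below only illustrates the arithmetic and is not the repository's code. The helper name and the gradient values are made up.

```python
import math

def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients by one shared factor so their joint L2 norm
    does not exceed clip_norm. Returns (clipped_grads, original_norm)."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if global_norm <= clip_norm:
        # Already within the threshold: return an unchanged copy.
        return [list(grad) for grad in grads], global_norm
    scale = clip_norm / global_norm
    return [[g * scale for g in grad] for grad in grads], global_norm

# Two hypothetical parameter gradients; global norm = sqrt(9 + 16 + 144) = 13.
grads = [[3.0, 4.0], [12.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=5.0)
```

Because every gradient is scaled by the same factor, the update direction is preserved and only its length shrinks. A threshold far above the typical norm, such as 400 in this toy case, leaves the gradients untouched.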

PaddlePaddle / DeepSpeech synced successfully about 1 year ago
