Unable to train DeepSpeech from scratch on Indian Hinglish (mixed Hindi-English) 8 kHz audio data
Created by: alamnasim
Hi,
I want to train a model from scratch on Hinglish (mixed Hindi-English) 8 kHz conversational audio data, but training fails with: `ValueError: Input signal length=0 is too small to resample from 8000->16000`.
Training Command
CUDA_VISIBLE_DEVICES=0,1 python train.py
Training Log:

----------- Configuration Arguments -----------
batch_size: 256
dev_manifest: data/audio_data/manifest.dev-clean
init_from_pretrained_model: None
is_local: True
learning_rate: 0.0005
max_duration: 27.0
mean_std_path: data/audio_data/mean_std.npz
min_duration: 0.0
num_conv_layers: 2
num_epoch: 200
num_iter_print: 100
num_rnn_layers: 3
num_samples: 489948
output_model_dir: checkpoint_dir
rnn_layer_size: 2048
save_epoch: 10
share_rnn_weights: False
shuffle_method: batch_shuffle_clipped
specgram_type: mfcc
test_off: False
train_manifest: data/audio_data/manifest.train-clean
use_gpu: True
use_gru: True
use_sortagrad: True
vocab_path: data/audio_data/eng_vocab.txt
W0702 16:43:10.574574  7779 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 75, Driver API Version: 10.2, Runtime API Version: 10.0
W0702 16:43:10.577214  7779 device_context.cc:260] device: 0, cuDNN Version: 7.6.
W0702 16:43:14.229719  7779 fuse_all_reduce_op_pass.cc:74] Find all_reduce operators: 38. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 38.
/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/resampy/core.py:90: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  if not np.issubdtype(x.dtype, np.float):
2020-07-02 16:43:19,910-WARNING: Your reader has raised an exception!
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/reader.py", line 1157, in thread_main
    six.reraise(*sys.exc_info())
  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/reader.py", line 1137, in thread_main
    for tensors in self._tensor_reader():
  File "/home/ubuntu/drive_a/Paddlepaddle_DeepSpeech/DeepSpeech/data_utils/data.py", line 200, in batch_reader
    for instance in instance_reader():
  File "/home/ubuntu/drive_a/Paddlepaddle_DeepSpeech/DeepSpeech/data_utils/data.py", line 279, in reader
    instance["text"]),
  File "/home/ubuntu/drive_a/Paddlepaddle_DeepSpeech/DeepSpeech/data_utils/data.py", line 121, in process_utterance
    speech_segment, self._keep_transcription_text)
  File "/home/ubuntu/drive_a/Paddlepaddle_DeepSpeech/DeepSpeech/data_utils/featurizer/speech_featurizer.py", line 76, in featurize
    audio_feature = self._audio_featurizer.featurize(speech_segment)
  File "/home/ubuntu/drive_a/Paddlepaddle_DeepSpeech/DeepSpeech/data_utils/featurizer/audio_featurizer.py", line 80, in featurize
    audio_segment.resample(self._target_sample_rate)
  File "/home/ubuntu/drive_a/Paddlepaddle_DeepSpeech/DeepSpeech/data_utils/audio.py", line 400, in resample
    self.samples, self.sample_rate, target_sample_rate, filter=filter)
  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/resampy/core.py", line 102, in resample
    'resample from {}->{}'.format(x.shape[axis], sr_orig, sr_new))
ValueError: Input signal length=0 is too small to resample from 8000->16000
/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/executor.py:1070: UserWarning: The following exception is not an EOF exception.
Traceback (most recent call last):
  File "train.py", line 144, in <module>
    main()
  File "train.py", line 140, in main
    train()
  File "train.py", line 135, in train
    test_off=args.test_off)
  File "/home/ubuntu/drive_a/Paddlepaddle_DeepSpeech/DeepSpeech/model_utils/model.py", line 336, in train
    return_numpy=False)
  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/executor.py", line 1071, in run
    six.reraise(*sys.exc_info())
  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/executor.py", line 1066, in run
    return_merged=return_merged)
  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/executor.py", line 1167, in _run_impl
    return_merged=return_merged)
  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/executor.py", line 879, in _run_parallel
    tensors = exe.run(fetch_var_names, return_merged)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:

C++ Call Stacks (More useful to developers):

0   std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2   paddle::operators::reader::BlockingQueue<std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> > >::Receive(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >)
3   paddle::operators::reader::PyReader::ReadNext(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >)
4   std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, unsigned long> >::_M_invoke(std::_Any_data const&)
5   std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
6   ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const

Python Call Stacks (More useful to users):

  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/framework.py", line 2610, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/reader.py", line 1079, in _init_non_iterable
    attrs={'drop_last': self._drop_last})
  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/reader.py", line 977, in __init__
    self._init_non_iterable()
  File "/home/ubuntu/tmp/deepspeech2-venv/local/lib/python2.7/site-packages/paddle/fluid/reader.py", line 608, in from_generator
    iterable, return_list, drop_last)
  File "/home/ubuntu/drive_a/Paddlepaddle_DeepSpeech/DeepSpeech/model_utils/model.py", line 112, in create_network
    use_double_buffer=True)
  File "/home/ubuntu/drive_a/Paddlepaddle_DeepSpeech/DeepSpeech/model_utils/model.py", line 281, in train
    train_reader, log_probs, ctc_loss = self.create_network()
  File "train.py", line 135, in train
    test_off=args.test_off)
  File "train.py", line 140, in main
    train()
  File "train.py", line 144, in <module>
    main()

Error Message Summary:

Error: Blocking queue is killed because the data reader raises an exception
  [Hint: Expected killed_ != true, but received killed_:1 == true:1.] at (/paddle/paddle/fluid/operators/reader/blocking_queue.h:141)
  [operator < read > error]
Can anyone help me resolve this issue? Is it possible to train from scratch on 8 kHz audio? I am able to run inference on 8 kHz audio using the baidu_en8k model.
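One thing I noticed: with `min_duration: 0.0` in my configuration, zero-length clips are not filtered out by the duration check, and the error message says the signal being resampled has length 0. So I suspect at least one entry in my manifest points to an empty audio file. Below is a minimal sketch I am considering to drop such entries before training (it assumes the manifest is JSON lines with an `audio_filepath` key, as in the DeepSpeech data format, and that the clips are WAV files; the function name is mine):

```python
import json
import wave

def filter_empty_clips(manifest_path, out_path):
    """Copy a DeepSpeech-style JSON-lines manifest to out_path, skipping
    entries whose WAV file contains zero audio frames -- the clips that
    trigger "Input signal length=0 is too small to resample".
    Returns the list of dropped file paths."""
    dropped = []
    with open(manifest_path) as src, open(out_path, "w") as dst:
        for line in src:
            entry = json.loads(line)
            path = entry["audio_filepath"]
            try:
                with wave.open(path, "rb") as w:
                    empty = w.getnframes() == 0
            except (wave.Error, OSError):
                empty = True  # treat unreadable files as bad as well
            if empty:
                dropped.append(path)
            else:
                dst.write(line)
    return dropped
```

Running this over `manifest.train-clean` (and the dev manifest) and training on the filtered copies should at least get past the reader crash, though it does not answer whether 8 kHz training itself is supported.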
Thanks,
Nasim