After deployment, does the model support long audio?
Created by: bigcash
Hi, thanks for open-sourcing this! I deployed and started deploy/demo_server.py and tested it with a roughly 40-minute recording (an 8000 Hz sample-rate WAV file of about 38 MB). It failed with an out-of-memory error: Out of memory error on GPU 0. I am testing on a single 2080Ti with 11019 MiB of GPU memory. Does long audio like this need to be split before recognition? If so, which splitting strategy gives the most accurate final result: fixed-duration segments (e.g. a cut every 10 s), or cutting at points in the recording where there is no speech signal? I have also used the speech-to-text API on Baidu Cloud, submitting the 40-minute recording directly. Does the server side first split the audio, recognize each segment, and then merge all the partial results into the final transcript? Thanks for the help!
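To make the question concrete: the silence-based splitting I have in mind looks roughly like the sketch below. This is only a minimal illustration assuming pydub is installed; `transcribe_chunk` is a hypothetical placeholder for the actual recognition call (for example, the file_to_transcript handler in demo_server.py), and the silence thresholds are guesses that would need tuning on real recordings.

```python
# Minimal sketch of silence-based splitting (assumes pydub; thresholds are guesses).
from pydub import AudioSegment
from pydub.silence import split_on_silence

def transcribe_long_wav(path, transcribe_chunk):
    """Split a long WAV at silent points, recognize each chunk, merge transcripts."""
    audio = AudioSegment.from_wav(path)
    # Cut wherever at least 500 ms falls below -40 dBFS; keep 200 ms of
    # padding around each cut so words at chunk edges are not clipped.
    chunks = split_on_silence(audio,
                              min_silence_len=500,
                              silence_thresh=-40,
                              keep_silence=200)
    transcripts = []
    for i, chunk in enumerate(chunks):
        chunk_path = "demo_cache/chunk_%04d.wav" % i
        chunk.export(chunk_path, format="wav")
        # transcribe_chunk is a hypothetical stand-in for the model's
        # recognition call; each chunk is short enough to fit in GPU memory.
        transcripts.append(transcribe_chunk(chunk_path))
    # Concatenate the per-segment results in order to form the final transcript.
    return "".join(transcripts)
```

Fixed 10 s windows would be simpler to implement, but I worry they would cut words in half at the boundaries; splitting at silence seems more likely to keep each segment a complete utterance.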
The server-side error is as follows (using the BaiduCN1.2k model):
Received utterance[length=40108716] from 172.16.1.18, saved to demo_cache/20200109031539_172.16.1.18.wav.
finish initing model from pretrained params from models/BaiduCN1.2k/
W0109 11:16:28.406390 11887 operator.cc:179] mul raises an exception paddle::memory::allocation::BadAlloc,
C++ Call Stacks (More useful to developers):
0   paddle::memory::detail::GPUAllocator::Alloc(unsigned long*, unsigned long)
1   paddle::memory::detail::BuddyAllocator::RefillPool(unsigned long)
2   paddle::memory::detail::BuddyAllocator::Alloc(unsigned long)
3   void* paddle::memory::legacy::Alloc<paddle::platform::CUDAPlace>(paddle::platform::CUDAPlace const&, unsigned long)
4   paddle::memory::allocation::NaiveBestFitAllocator::AllocateImpl(unsigned long)
5   paddle::memory::allocation::Allocator::Allocate(unsigned long)
6   paddle::memory::allocation::RetryAllocator::AllocateImpl(unsigned long)
7   paddle::memory::allocation::AllocatorFacade::Alloc(paddle::platform::Place const&, unsigned long)
8   paddle::memory::allocation::AllocatorFacade::AllocShared(paddle::platform::Place const&, unsigned long)
9   paddle::memory::AllocShared(paddle::platform::Place const&, unsigned long)
10  paddle::framework::Tensor::mutable_data(paddle::platform::Place, paddle::framework::proto::VarType_Type, unsigned long)
11  paddle::operators::MulKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
12  std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::MulKernel<paddle::platform::CUDADeviceContext, float>, paddle::operators::MulKernel<paddle::platform::CUDADeviceContext, double>, paddle::operators::MulKernel<paddle::platform::CUDADeviceContext, paddle::platform::float16> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
13  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
14  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
15  paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
16  paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
17  paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool)
Error Message Summary:
Out of memory error on GPU 0. Cannot allocate 1.912537GB memory on GPU 0, available memory is only 381.125000MB.
Please check whether there is any other process using GPU 0.
- If yes, please stop them, or start PaddlePaddle on another GPU.
- If no, please try one of the following suggestions:
- Decrease the batch size of your model.
- FLAGS_fraction_of_gpu_memory_to_use is 0.92 now, please set it to a higher value but less than 1.0.
The command is
export FLAGS_fraction_of_gpu_memory_to_use=xxx
.
at (/paddle/paddle/fluid/memory/detail/system_allocator.cc:151)
Exception happened during processing of request from ('172.16.1.18', 55944)
Traceback (most recent call last):
  File "/home/lingbao/soft/anaconda3/envs/baidu27/lib/python2.7/SocketServer.py", line 293, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/home/lingbao/soft/anaconda3/envs/baidu27/lib/python2.7/SocketServer.py", line 321, in process_request
    self.finish_request(request, client_address)
  File "/home/lingbao/soft/anaconda3/envs/baidu27/lib/python2.7/SocketServer.py", line 334, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/home/lingbao/soft/anaconda3/envs/baidu27/lib/python2.7/SocketServer.py", line 655, in __init__
    self.handle()
  File "deploy/demo_server.py", line 101, in handle
    transcript = self.server.audio_process_handler(filename)
  File "deploy/demo_server.py", line 195, in file_to_transcript
    feeding_dict=data_generator.feeding)
  File "/home/lingbao/work/text/DeepSpeech/deploy/../model_utils/model.py", line 424, in infer_batch_probs
    return_numpy=False)
  File "/home/lingbao/soft/anaconda3/envs/baidu27/lib/python2.7/site-packages/paddle/fluid/executor.py", line 780, in run
    six.reraise(*sys.exc_info())
  File "/home/lingbao/soft/anaconda3/envs/baidu27/lib/python2.7/site-packages/paddle/fluid/executor.py", line 775, in run
    use_program_cache=use_program_cache)
  File "/home/lingbao/soft/anaconda3/envs/baidu27/lib/python2.7/site-packages/paddle/fluid/executor.py", line 822, in _run_impl
    use_program_cache=use_program_cache)
  File "/home/lingbao/soft/anaconda3/envs/baidu27/lib/python2.7/site-packages/paddle/fluid/executor.py", line 899, in _run_program
    fetch_var_name)
RuntimeError:
C++ Call Stacks (More useful to developers):
0   paddle::memory::detail::GPUAllocator::Alloc(unsigned long*, unsigned long)
1   paddle::memory::detail::BuddyAllocator::RefillPool(unsigned long)
2   paddle::memory::detail::BuddyAllocator::Alloc(unsigned long)
3   void* paddle::memory::legacy::Alloc<paddle::platform::CUDAPlace>(paddle::platform::CUDAPlace const&, unsigned long)
4   paddle::memory::allocation::NaiveBestFitAllocator::AllocateImpl(unsigned long)
5   paddle::memory::allocation::Allocator::Allocate(unsigned long)
6   paddle::memory::allocation::RetryAllocator::AllocateImpl(unsigned long)
7   paddle::memory::allocation::AllocatorFacade::Alloc(paddle::platform::Place const&, unsigned long)
8   paddle::memory::allocation::AllocatorFacade::AllocShared(paddle::platform::Place const&, unsigned long)
9   paddle::memory::AllocShared(paddle::platform::Place const&, unsigned long)
10  paddle::framework::Tensor::mutable_data(paddle::platform::Place, paddle::framework::proto::VarType_Type, unsigned long)
11  paddle::operators::MulKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
12  std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::MulKernel<paddle::platform::CUDADeviceContext, float>, paddle::operators::MulKernel<paddle::platform::CUDADeviceContext, double>, paddle::operators::MulKernel<paddle::platform::CUDADeviceContext, paddle::platform::float16> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
13  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
14  paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
15  paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
16  paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
17  paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool)
Error Message Summary:
Out of memory error on GPU 0. Cannot allocate 1.912537GB memory on GPU 0, available memory is only 381.125000MB.
Please check whether there is any other process using GPU 0.
- If yes, please stop them, or start PaddlePaddle on another GPU.
- If no, please try one of the following suggestions:
- Decrease the batch size of your model.
- FLAGS_fraction_of_gpu_memory_to_use is 0.92 now, please set it to a higher value but less than 1.0.
The command is
export FLAGS_fraction_of_gpu_memory_to_use=xxx
.
at (/paddle/paddle/fluid/memory/detail/system_allocator.cc:151)