C++ inference fails after adding shape/range ops and enabling memory_optimize
Created by: Meiyim
paddle 1.5.1, fluid_inference (develop). Prediction network code (the newly added part); inference worked fine before this addition:
src_ids, sent_ids = features
zero = L.fill_constant([1], dtype='int64', value=0)
input_mask = L.cast(L.equal(src_ids, zero), dtype) # assume pad id == 0
d_shape = L.shape(src_ids)  # introduces the `shape` op mentioned in the title
seqlen = d_shape[1]
batch_size = d_shape[0]
pos_ids = L.unsqueeze(L.range(0, seqlen, 1, dtype='int32'), axes=[0])  # introduces the `range` op
pos_ids = L.expand(pos_ids, [batch_size, 1])
pos_ids = L.unsqueeze(pos_ids, axes=[2])
pos_ids = L.cast(pos_ids, 'int64')
pos_ids.stop_gradient = True
input_mask.stop_gradient = True
task_ids = L.zeros_like(src_ids)
task_ids.stop_gradient = True
bert = ErnieModel(
    src_ids=src_ids,
    position_ids=pos_ids,
    sentence_ids=sent_ids,
    task_ids=task_ids,
    input_mask=input_mask,
    config=self.hparam,
    use_fp16=self.hparam['fp16'],
)
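The memory_optimize step referenced in the title is not shown above. For context, here is a minimal, self-contained sketch of how that export step is typically wired up with the Paddle 1.5 fluid API; the two dummy layers standing in for ErnieModel, the variable names, and the output path are illustrative assumptions, not the actual export code:
import paddle.fluid as fluid

main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    # stand-ins for the real inputs and the ErnieModel forward pass built above
    src_ids = fluid.layers.data(name='src_ids', shape=[-1, -1, 1],
                                dtype='int64', append_batch_size=False)
    sent_ids = fluid.layers.data(name='sent_ids', shape=[-1, -1, 1],
                                 dtype='int64', append_batch_size=False)
    emb = fluid.layers.embedding(src_ids, size=[1000, 768])
    logits = fluid.layers.fc(emb, size=2, num_flatten_dims=2)

# the in-place memory reuse pass named in the issue title
fluid.memory_optimize(main_prog)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(startup_prog)
fluid.io.save_inference_model(
    dirname='./inference_model',  # loaded as FLAGS_model_dir by the C++ predictor below
    feeded_var_names=[src_ids.name, sent_ids.name],
    target_vars=[logits],
    executor=exe,
    main_program=main_prog)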
Paddle predictor configuration (C++):
paddle::AnalysisConfig config;
config.SetModel(FLAGS_model_dir);
config.EnableUseGpu(8000);  // initial GPU memory pool size in MB, device 0 by default
config.SwitchSpecifyInputNames(true);
auto predictor = CreatePaddlePredictor(config);
It runs out of GPU memory no matter how much memory the machine has (monitoring shows the card being maxed out almost instantly):
I0828 15:42:42.367769 207866 buddy_allocator.cc:69] Allocate 150994944 bytes from chunk size 150995200
I0828 15:42:42.367838 207866 gpu_info.cc:195] GPU usage 8M/7611M, 8M available to allocate
I0828 15:42:42.367843 207866 gpu_info.cc:216] Alloc size is 4 MiB, is it Re-alloc: 1
W0828 15:42:42.368458 207866 system_allocator.cc:121] Cannot malloc 144 MB GPU memory. Please shrink FLAGS_fraction_of_gpu_memory_to_use or FLAGS_initial_gpu_memory_in_mb or FLAGS_reallocate_gpu_memory_in_mbenvironment variable to a lower value. Current FLAGS_fraction_of_gpu_memory_to_use value is 0.531531. Current FLAGS_initial_gpu_memory_in_mb value is 0. Current FLAGS_reallocate_gpu_memory_in_mb value is 0
F0828 15:42:42.368579 207866 naive_best_fit_allocator.cc:164] Cannot allocate 144.000000MB in GPU 0, available 8.937500MB, total 7.433533GB, GpuMinChunkSize 256.000000B, GpuMaxChunkSize 3.668286GB, GPU memory used: 7.089328GB
*** Check failure stack trace: ***
./gpu.sh: line 13: 207866 Aborted (core dumped) ./build/inference --logtostderr --model_dir $2 --data $1 --repeat 5 --output_prediction true --use_gpu true --device 0