- 08 Apr, 2020 1 commit
Committed by dingminghui
Bug caused by lazy malloc of tensor memory: when the batch size is larger than 1, the output tensor is divided into sub-tensors that share memory, and each divided tensor does its own malloc.
- 20 Feb, 2020 1 commit
Committed by yiicy
add multiclass_nms2 kernel, test=develop
- 24 Dec, 2019 1 commit
Committed by yiicy
- 03 Sep, 2019 1 commit
Committed by juncaipeng
* add ops for faster rcnn
* disable test for generate_proposals and roi_align, test=develop
* remove .swp file
* remove log in tensor slice
* finish the unit test for roi_align, test=develop
* add box_clip op and fix tensor slice bug
* remove add four op twice
* rewrite the implement for box_coder and sequence_expand, add faster_rcnn_test, test=develop
* fix test bug of box_clip in x86 server, test=develop
* rewrite multiclass_nms according to fluid, test=develop
* fix param load bug in box_coder and multiclass_nms op, test=develop
* fix value transfor error in multiclass_nms, test=develop
- 29 Aug, 2019 1 commit
Committed by Wilber
* add yolo_box_compute cuda
* move multiclass_nms(arm) to host
* add lod in scale op
* add yolo_box_cuda cmake config
* modify shuffle_channel_fuse and transpose_softmax_transpose_fuse to support run ssd model. test=develop
* reshape and transpose op don't have xshape output.
* modify yolo_box_compute_cuda, use tensor to manage cuda memory test=develop
* add yolo_box use kernel test=develop
- 16 Aug, 2019 1 commit
Committed by Yan Chunwei