- 16 Sep, 2021 10 commits
-
-
Committed by crystal
-
Committed by Wangzheee
* fix gather * fix
-
Committed by 0x45f
* fix no_grad context error in dy2stat * remove useless comments * fix error by drop_kids in python * add test and fix review
-
Committed by Guoxia Wang
* support fp16 dtype
-
Committed by Wilber
-
Committed by lilong12
-
Committed by zhangkaihuo
-
Committed by wuhuanzhou
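Main purpose of this PR: support developing and registering Passes on the Python side for subgraph-replacement scenarios such as fusion.
Background: a Pass takes a deep learning computation graph (Graph) as input, modifies it according to certain conditions, and outputs the modified Graph. Writing Pass code in the current PaddlePaddle framework has the following problems: * users must hand-write both the condition matching on the Graph and the code that modifies the Graph * operating on the Graph requires digging into low-level framework code, understanding the Graph structure, and knowing how Passes are written
We propose an optimization for subgraph-replacement Passes such as fusion that lets users develop and register Passes on the Python side, improving the secondary-development experience: * users only supply descriptions of the subgraph to match and the subgraph to replace it with; code written in the framework generates the matching and replacement logic, so users never match or replace on the Graph themselves * replacement works at the API level: users can build subgraphs with Paddle's Python API, so they can write Paddle Graph Pass code without knowing the Graph structure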
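The idea described above can be illustrated with a minimal, framework-free sketch. It is not Paddle's implementation or API: `Op`, `register_pass`, `apply_pass`, and the `toy_fc_fuse` pass are hypothetical names invented for this example, and the "graph" is just a list of ops. The user supplies only a description of the subgraph to match and a function that builds its replacement; the generic matching and rewriting logic lives on the "framework" side.

```python
from dataclasses import dataclass

# Hypothetical toy IR: these names are illustrative and are not Paddle APIs.
@dataclass(frozen=True)
class Op:
    type: str        # operator name, e.g. "mul"
    inputs: tuple    # names of input variables
    output: str      # name of the single output variable

PASS_REGISTRY = {}

def register_pass(name):
    """Store the (pattern, replace) pair returned by the decorated function."""
    def decorator(fn):
        PASS_REGISTRY[name] = fn()
        return fn
    return decorator

@register_pass("toy_fc_fuse")
def toy_fc_fuse():
    # Pattern: a "mul" whose output feeds an "elementwise_add".
    pattern = ("mul", "elementwise_add")

    def replace(mul_op, add_op):
        # Replacement: one "fc" op taking the mul inputs plus the bias.
        bias = tuple(v for v in add_op.inputs if v != mul_op.output)
        return Op("fc", mul_op.inputs + bias, add_op.output)

    return pattern, replace

def apply_pass(name, graph):
    """Framework side: generic matching and rewriting driven by the description."""
    (first, second), replace = PASS_REGISTRY[name]
    ops = list(graph)
    for producer in graph:
        if producer.type != first or producer not in ops:
            continue
        consumers = [op for op in ops if producer.output in op.inputs]
        # Fuse only when the intermediate value has exactly one consumer.
        if len(consumers) == 1 and consumers[0].type == second:
            ops[ops.index(consumers[0])] = replace(producer, consumers[0])
            ops.remove(producer)
    return ops

graph = [
    Op("mul", ("x", "w"), "tmp"),
    Op("elementwise_add", ("tmp", "b"), "out"),
    Op("relu", ("out",), "y"),
]
fused = apply_pass("toy_fc_fuse", graph)
print([op.type for op in fused])
```

In this sketch the pass author never walks the graph directly; a real framework would additionally need to handle attributes, multiple outputs, and more general pattern shapes.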
-
Committed by wanghuancoder
-
Committed by zhangkaihuo
-
- 15 Sep, 2021 24 commits
-
-
Committed by Sing_chan
-
Committed by jakpiase
* fixed slice error * added handling of StartsTensor+List and EndsTensor+List * fix for ppyolo model
-
Committed by jakpiase
-
Committed by wanghuancoder
* add inplace logic into new_executor, test=develop * check shape and add inplace FLAGS, test=develop * refine, test=develop * refine, test=develop
-
Committed by 王明冬
* clip op extra information when export model,test=ocr * rename clip_extra parameter to kwargs in save_inference_model, test=ocr
-
Committed by zyfncg
* Change the invoking method of setitem from numpy to set_value op when value is not tensor * fix the check logic for inplace in setitem * fix the unittest problem caused by setitem not supporting fp16 * modify some code format in setitem
-
Committed by zhaoyingli
* add dist_attr for dist op * add unittest * update inputname * update function name * add unittest * update CMakeLists.txt for CI * fix dis_matmul * fix compile error * update matmul to matmul_v2
-
Committed by Liu-xiandong
Put Nvidia's cusparse library into paddle.
-
Committed by pangyoki
* add beam_search npu op * fix CMakeList and add unittest * fix bug of beam search npu op * fix unittest * let input ids become int64 * set output ids to int64_t * delete check_dygraph * fix beam_width=1
-
Committed by houj04
-
Committed by Sing_chan
-
Committed by Qi Li
* [NPU] fix depthwise_conv2d_grad, test=develop * remove debug files, test=develop
-
Committed by Yiqun Liu
-
Committed by JingZhuangzhuang
Co-authored-by: Nxiaoxiaohehe001 <hiteezsf@163.com>
-
Committed by YuanRisheng
* Add New Op: gumbel_softmax * add __main__ function in unit test * fix bugs when test in windows ci * update en docs * delete relative error in unit test * set hard=True in unit test
-
Committed by Siming Dai
Add paddle.cuda.device.stream_guard API
-
Committed by WangXi
-
Committed by Li Min
-
Committed by xiaoxiaohehe001
* add_split_op * add_split_teller
-
Committed by xiaoxiaohehe001
* add_transpose_teller
-
Committed by xiaoxiaohehe001
* add_scale_teller
-
Committed by 津
-
Committed by 津
-
Committed by ronnywang
-
- 14 Sep, 2021 6 commits
-
-
Committed by zhaoyingli
* add layerwise learning rate for adamw * fix format * add unittest * add NotImplementedError * add gpu unittest * update gpuplace
-
Committed by pangyoki
-
Committed by JingZhuangzhuang
* add anchor_generator test * add anchor_generator test * add clip convert test * add swish convert test * modify Co-authored-by: Nxiaoxiaohehe001 <hiteezsf@163.com>
-
Committed by Sing_chan
* new function: share third party cache among servers to speed up builds * modified code according to zhouwei25's comment * add wget install step, move cd build to the end of the if condition * block note and error of third_party share; change bce upload method * change third_party sub_dir in bos, since third party builds for different cuda versions can't be shared * set sub_dir by getting the nvcc version * change third_party local path to be the same as the bos path
-
Committed by JingZhuangzhuang
* add anchor_generator test * add anchor_generator test * add clip convert test * delete wrong file Co-authored-by: Nxiaoxiaohehe001 <hiteezsf@163.com>
-
Committed by Yiqun Liu
Implement FunctionTraits to support two kinds of elementwise functor and remove some old codes for broadcast. (#35688)
-