- 16 Jun, 2021 1 commit
-
-
Committed by TTerror
* fix gather op and add logsumexp op on kunlun
* update xpu dependence
* update tests and fix elementwise_add
-
- 15 Jun, 2021 3 commits
-
-
Committed by wawltor
-
Committed by ShenLiang
* Fix gather infer shape using axis (#33413)
* fix gather shape bug
* fix None
* fix topo
* Fix hang of hybrid parallel in new_group (#33141)
* fix hang of hybrid parallel
* fix new_group for hang problem
* fix hang
-
Committed by WeiXin
Fix the bug where PyLayer triggers a segmentation fault when it returns a tensor created by to_tensor. Cause: if the stop_gradient attribute has been modified on the Python side, InnerSetOverridedStopGradient on the C++ side cannot modify it, so SetOverridedStopGradient is now called on the C++ side to modify stop_gradient. The DataType of the grad var of a tensor produced by to_tensor is the default value (-1), but during backward the grad var's DataType must not be the default value (-1), so ForwardDataType is now called to set the grad var's DataType. Original PR: #33303
-
- 12 Jun, 2021 1 commit
-
-
Committed by zhiboniu
* Eliminate numerical differences of LayerNorm; fix LayerNorm NaN bug with large data input
* fix bug with large shape of data input
-
- 11 Jun, 2021 3 commits
-
-
Committed by liuyuhui
* add uint8 for concat (#32850)
* add bool type for tril api (#33402)
-
Committed by Chen Weihang
Support diff dataset tensor place in single process dataloader. Cherry-pick of #33470
-
Committed by Lijunhui
While using the op benchmark, it was found that when the input data size is below a certain value, the input of the Python-side log_softmax API is changed to nan after the computation. The output is normal. Cherry-picked from #32937
-
- 10 Jun, 2021 1 commit
-
-
Committed by wangguanzhong
-
- 09 Jun, 2021 2 commits
- 08 Jun, 2021 1 commit
-
-
Committed by TeslaZhao
* Fix two English api documents, transpose and strided_slice
* OP:strided_slice_op supports bool type inputs
-
- 04 Jun, 2021 1 commit
-
-
Committed by wawltor
* fix compare op in for in the cuda device
* fix the paddle compare op for the broadcast
-
- 01 Jun, 2021 1 commit
-
-
Committed by whs
-
- 07 May, 2021 4 commits
-
-
Committed by Jiawei Wang
-
Committed by LielinJiang
* fix compile error on jetson platform
* remove unused header file
* rm decode_jpeg op on jetson platform
-
Committed by WeiXin
Fix the memory (GPU memory) leak in py_layer_op caused by PyLayerContext not being destructed. Original PR: #32707
-
Committed by WeiXin
* clear 'BasicEngine' when an exception occurs in the backward. (#32546)
* clear 'BasicEngine' when an exception occurs in the backward.
* deal with conflict.
* deal with conflict.
* forward return any type. (#32661)
-
- 06 May, 2021 3 commits
-
-
Committed by Adam Osewski
-
Committed by jakpiase
* base changes for fix
* minor change
* fix for bwd kernel
* removed unnecessary import
* implemented reviewers suggestions
* CI fix
-
Committed by chajchaj
cherry-pick: change softmax_with_cross_entropy_op's parameter name from softmax_switch to use_softmax (#32750)
* change parameter name from softmax_switch to use_softmax, test=develop
* cherry-pick: change parameter name from softmax_switch to use_softmax, test=develop
-
- 04 May, 2021 1 commit
-
-
Committed by Baibaifan
-
- 01 May, 2021 1 commit
-
-
Committed by Baibaifan
-
- 30 Apr, 2021 3 commits
-
-
Committed by pangyoki
* add relu6_ hardsigmoid_ leaky_relu_ Inplace APIs
* add softmax_with_cross_entropy_ Inplace API
* add clip_ scale_ add_ subtract_ Inplace APIs
* add wlist
* fix parameter of scale api
* add add_n_ Inplace API and remove log_ Inplace API
* fix elementwise_add_ and elementwise_sub_ broadcast problem
* elementwise inplace api give error message before run the op
* use broadcast_shape in elementwise inplace op
* add 8 inplace apis that are auto generated
* add unittest for all inplace apis
* add decorator for inplace apis in static mode
* fix windows blas fail of exp inplace api, change array_equal to allclose
* add flatten inplace api
* add flatten unittest
* fix flatten unittest
* add decorator
* fix grad.numpy in test_pylayer_op
* unsupport softmax_with_cross_entropy_
* add test_inplace_softmax_with_cross_entropy to static_mode_white_list
* delete __all__ in inplace_utils
* delete activation inplace function and add Tensor.inplace_func
* change paddle.inplace_ to Tensor.inplace_
* fix little problem
* add paddle in inplace_utils
-
Committed by ceci3
-
Committed by LielinJiang
* add op read_file and decode_jpeg
-
- 29 Apr, 2021 3 commits
-
-
Committed by joanna.wozna.intel
* Add bf16 uniform random initializer
* Remove duplicated section
* Change UT to CPU place only
* Put detail functions into anonymous namespace
-
Committed by arlesniak
This is cherry-pick of #32281
-
Committed by Jacek Czaja
- Executor does not always have FLAGS_use_mkldnn set to true
-
- 28 Apr, 2021 1 commit
-
-
Committed by jiangcheng
* optimize update_loss_scaling_op by fusing the for loop into one kernel, test=develop
* remove useless while loop and optimize variable name, test=develop
* optimize variable name from out_addrs_tensor to out_addrs_mem, test=develop
* optimize variable names for readability by changing the prefix identifier from t_ to local_
-
- 27 Apr, 2021 2 commits
-
-
Committed by Zhong Hui
* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.
-
Committed by Aurelius84
-
- 26 Apr, 2021 5 commits
-
-
Committed by jiangcheng
* new optimize for where_index_op with prefix sum version.
* write a scan prefix sum kernel with stream for where index op.
* optimize where_index by using cub::DeviceScan::InclusiveSum instead of imperfect self-kernel.
* remove CheckTrue struct and rename stide_array for readability.
* optimize variable name for readability.
* optimize function name and annotation.
-
Committed by WangXi
-
Committed by ShenLiang
* fix model parallel
* rm parallel_help.py
* add embedding
-
Committed by WeiXin
* support backward return None.
* edit unittest.
* edit code according to CI
* Improve error information
-
Committed by jiangcheng
* optimize slice op and slice grad op, test=develop
* optimize variable name and annotation information, test=develop
-
- 25 Apr, 2021 3 commits
-
-
Committed by liym27
-
Committed by Baibaifan
-
Committed by Zhang Ting
-