- 02 Dec, 2020 3 commits
-
-
Committed by Wojciech Uss
-
Committed by furnace
* add fp16 for layer_norm op
* revert layernorm api
* fix forward
* fix forward
* fix backward for layernorm with fp16
* fix unit test for layernorm with fp16
* fix with_mkldnn compile error for layernorm with fp16
* 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
* fix with_mkldnn compile error for layernorm with fp16
* fix with_mkldnn compile error for layernorm with fp16
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
-
Committed by Shang Zhizhou
-
- 01 Dec, 2020 9 commits
-
-
Committed by Leo Chen
* pass stop_gradient for cast op
* improve performance of elementwise_add grad
* use tensor copy async
* dygraph branch
* fix dygraph branch
* add ut
-
Committed by 卖鱼的哲学
* rebase develop
* update deformable_conv op on xpu
* update deformable_conv op on xpu
-
Committed by Chen Weihang
* hot fix compile failed in gcc4.8
* fix failed unittest
-
Committed by GeminiCarrie
* Fix a bug when running on an operating system without "bash".
* add execution condition
* for ci-coverage
-
Committed by ShenLiang
-
Committed by QingshuChen
* update conv2d & softmax to new xpu api
* test=kunlun
* remove useless comments
* test=kunlun
* remote softmax xpu op
* test=kunlun
* update kunlun softmax
* test=kunlun
* update xpu unittest
* test=kunlun
* fix elementwise_grad bug for kunlun
* test=kunlun
-
Committed by chentianyu03
* add complex64 and complex128 type; add +-*/@ and slice operator for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
-
Committed by Zhou Wei
* The leaf tensor concept is exposed and the gradient accumulation of leaf tensor
* The leaf tensor concept is exposed and the gradient accumulation of leaf tensor
* fix coverage
* fix api doc
* fix CI unittest
* fix CI unittest
* fix unittest
* empty tensor doesn't need inner_var_
* fix some error message
-
Committed by Wilber
-
- 30 Nov, 2020 10 commits
-
-
Committed by Adam Osewski
- Make sure that oneDNN memory descriptors are created only once at first iteration.
-
Committed by joanna.wozna.intel
-
Committed by Wilber
-
Committed by 123malin
* fix parameter prefetch & device guard
Co-authored-by: MrChengmo <cmchengmo@163.com>
Co-authored-by: chengmo <chengmo@baidu.com>
-
Committed by liym27
* Add a class TensorInplaceVersion to count the inplace version and put it in framework::Tensor instead of Allocation or Variable.
* Add a new attribute `_inplace_version` for VarBase.
* Raise exception if an inplace operation can result in incorrect gradient computation.
* Add a new interface _bump_inplace_version() for VarBase to bump the version whenever the Tensor is modified through an inplace operation.
* For api assign, call _bump_inplace_version() when it's an inplace operation in dynamic mode.
* Use original var_wrapper if the inplace_version is not changed.
* Replace SnapshotVarWrapperList with SnapshotVarWrapper to optimize performance.
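The version-counting idea described in this commit message can be illustrated with a small, framework-free Python sketch. It is not Paddle's implementation: `ToyTensor`, `scale_`, and the toy `SnapshotVarWrapper` below are hypothetical stand-ins that only borrow names from the message, and the real check lives in C++ inside the autograd engine. The sketch assumes that a tensor saved for the backward pass records its version at capture time and that any later in-place write makes the saved value unsafe to use.

```python
class TensorInplaceVersion:
    """Counter that is bumped on every in-place modification (toy version)."""

    def __init__(self):
        self._version = 0

    def bump(self):
        self._version += 1

    @property
    def current(self):
        return self._version


class ToyTensor:
    """Hypothetical tensor: a list of floats plus an in-place version counter."""

    def __init__(self, values):
        self.values = list(values)
        self._version_counter = TensorInplaceVersion()

    @property
    def inplace_version(self):
        return self._version_counter.current

    def bump_inplace_version(self):
        self._version_counter.bump()

    def scale_(self, factor):
        # In-place op: overwrite the stored values, then bump the version.
        for i, v in enumerate(self.values):
            self.values[i] = v * factor
        self.bump_inplace_version()
        return self


class SnapshotVarWrapper:
    """Remembers a tensor together with its version at capture time."""

    def __init__(self, tensor):
        self.tensor = tensor
        self.saved_version = tensor.inplace_version

    def get(self):
        # A changed version means the value needed for the gradient was
        # overwritten, so refuse to continue instead of computing a wrong result.
        if self.tensor.inplace_version != self.saved_version:
            raise RuntimeError(
                "tensor was modified in place after being saved: "
                f"saved version {self.saved_version}, "
                f"current version {self.tensor.inplace_version}"
            )
        return self.tensor
```

In this sketch, wrapping a tensor and calling `get()` succeeds until an in-place method such as `scale_` runs; after that, `get()` raises `RuntimeError` because the recorded and current versions differ.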
-
Committed by 123malin
* test=develop, optimize async prefetch
-
Committed by WangXi
-
Committed by Chen Weihang
* fix failed tests in the list given by yingchun
* add unittests into static_mode_white_list
* add enable static
* fix dist unittest
* skip test_sigmoid_focal_loss_op & add gym
* revert no need skip unittests
* remove gym
-
Committed by Wojciech Uss
-
Committed by Jack Zhou
fix gcc7.4 compile bug for gru
-
- 28 Nov, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 27 Nov, 2020 8 commits
-
-
Committed by ShenLiang
* add reducer
* refine event for memorycopy
* add concat&split for allreduce
* apply concat & split for fuse tensor
* fix nccl dep
* fix the unittest, compile problem and ddp initialize problem
* fix unittest for mac & add some comments & solve the repeated param in sublayers
* fix unittest for windows & fix document
-
Committed by lilong12
update expand as op to use the shape of the target tensor instead of the target tensor itself. (#29020)
* update, test=develop
-
Committed by Zhou Wei
-
Committed by Jack Zhou
Add eigen gru and fix the dropout bug in the rnn
-
Committed by yaoxuefeng
-
Committed by arlesniak
-
Committed by Shang Zhizhou
* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake
* compile with cuda9
* add some unittest
* notest;test=coverage
* add unittest for trt plugin swish && split
* update ernie unittest
* fix some error message
* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter
* fix compile error when CUDA_ARCH_NAME < Pascal
* fix compile error
* update unittest timeout
* compile with cuda9
* update error msg
* fix code style
* add some comments
* add define IF_CUDA_ARCH_SUPPORT_FP16
* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED
-
Committed by Leo Chen
-
- 26 Nov, 2020 9 commits
-
-
Committed by Noel
Fix ops doc for some ops
-
Committed by Leo Chen
* split train_mode and has_grad
* fix format
* fix ci problems
* fix sample code
-
Committed by Aurelius84
-
Committed by WangXi
-
Committed by Shang Zhizhou
-
Committed by Shibo Tao
add API serialize_program, serialize_persistables, save_to_file, deserialize_program, deserialize_persistables, load_from_file. (#29034)
-
Committed by joanna.wozna.intel
* Add bf16 pool2d and unify bf16 unit tests
* Add change default ops test
-
Committed by joanna.wozna.intel
* Fix cpu_bfloat16_pass
* Add output_format
* Fix incorrect SetOutput
* Change formatting
-
Committed by Qi Li
* fix win ci failure, test=develop
* add ci test, test=develop
-