Commits · 6330fc948d15499220d28a3911aaff3dd01e2d59 · 机器未来 / Paddle

01 Mar, 2021 2 commits

cherry-pick (#31279) · 6330fc94
Committed by Wilber on Mar 01, 2021

[Cherry-pick] The 4th part of new custom op (#31282) · 777d1a45

Committed by Chen Weihang on Mar 01, 2021

* modify custom op dependent from paddle_framework to paddle_custom_op (#31195)

* [Custom Op] Remove unsupport dtypes (#31232)

* remove remove_unsupport_dtype

* remove remove_unsupport_dtype

* remove test dtype

* add more include

* change dtype.h's enum as enum class to avoid conflict with inference lib

* make enum as enum class

* remove additional test

* merge develop

* polish code

* [Custom OP] Support stream set on Custom Op (#31257)

* [Custom OP] change the user header file format, test=develop (#31274)

* [Custom OP]add PD_THROW and PD_CHECK for User Error message (#31253)

* [Custom OP]add PD_THROW and PD_CHECK for User error message

* PD_THROW and PD_CHECK, fix comment

* fix Windows error message

* fix Windows error message

* fix CI

* [Custom OP]add MSVC compile check on Windows (#31265)

* fix test_check_abi
Co-authored-by: Zhou Wei <52485244+zhouwei25@users.noreply.github.com>
Co-authored-by: Jiabin Yang <marsyang199376@gmail.com>
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: zhouwei25 <zhouwei25@baidu.com>

26 Feb, 2021 2 commits
- [Cherry-pick] The Second part of new custom op extension in 2.0.1 (#31237) · d3e60959
  Committed by Chen Weihang on Feb 26, 2021
```
[Cherry-pick] The Second part of new custom op extension in 2.0.1
```
- fix xpu compile error (#31224) · 37b71828
  Committed by Wilber on Feb 26, 2021
22 Feb, 2021 1 commit
- update paddle_fluid.so to paddle_inference.so (#30850) (#31076) · 6ec5f0fb
  Committed by Wilber on Feb 22, 2021
04 Feb, 2021 1 commit
- support xpu with analysis predictor, test=develop (#30832) (#30863) · d199edd8
  Committed by 石晓伟 on Feb 04, 2021
20 Jan, 2021 1 commit
- update kunlun dependence for aarch64 & sunway platform (#30516) (#30570) · 3688d9e9
  Committed by QingshuChen on Jan 20, 2021
19 Jan, 2021 3 commits
- [cherry pick]if pybind.cc changed, generate total report (#30557) · dbbfbccd
  Committed by wanghuancoder on Jan 19, 2021
```
* if pybind.cc changed, generate total report
```
- Ascend Framework Part1: OP & Wrapper (#30281) (#30546) · 6f563ace
  Committed by hutuxian on Jan 19, 2021
- Pd2.0 (#30532) · 1323e5e7
  Committed by taixiurong on Jan 19, 2021
```
* support transformer v2.0

* fix range op crash in dygraph xpu place
```
18 Jan, 2021 1 commit
- 【Release/2.0】fix compile error in ARM subgraph (#30488) · 3e49fdcc
  Committed by Wilber on Jan 18, 2021
15 Jan, 2021 1 commit

[cherry-pick2.0]Enhance installation error message after separating AVX and... · 8ab8c620

Committed by Zhou Wei on Jan 15, 2021

[cherry-pick2.0]Enhance installation error message after separating AVX and NO_AVX compilation #30442

cherry-pick #30413
1. The 30 architecture corresponds to very early GPUs, so compilation for it is removed in 2.0 and later.
2. Separate the avx and core_avx builds, and improve the installation error messages.

13 Jan, 2021 2 commits
- split ps with distributed (#30337) · a97ca56a
  Committed by tangwei12 on Jan 13, 2021
```
Change-Id: I3c788e7576688e63181e7f01562529b85a09cc59
```
- resolve #30141 (#30145) (#30345) · 0fbfbeac
  Committed by Wilber on Jan 13, 2021
```
fix compile problem on FT
Co-authored-by: houj04 <35131887+houj04@users.noreply.github.com>
```
11 Jan, 2021 1 commit

add aarch64 and sunway kunlun lib (#30027) (#30237) · eacbd488

Committed by QingshuChen on Jan 11, 2021

* add aarch64 and sunway kunlun lib

* minor

* optimize elementwise_add for kunlun

* update kunlun dependence

* minor

* minor

04 Jan, 2021 1 commit
- make lite subgraph support multiple tensor precision. (#30055) · 878b6972
  Committed by Wilber on Jan 04, 2021
29 Dec, 2020 3 commits

[Kunlun] 2.0 cherry-pick:Support for Baidu Kunlun XPU multi card training (#29713) · 847aa172

Committed by liuyuhui on Dec 29, 2020

* [Kunlun] PR1:Support one Kunlun card training in parallel executor (#29337)

* [Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574)

* [Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor  (#29926)

* add bkcl.so in whl for kunlun (#29947)

* [Kunlun] bug fix of PR2: Support MultiDevicePass and BKCL in parallel executor  (#29961)
Co-authored-by: QingshuChen <qingshu.chen714@gmail.com>

[cherry-pick] #26920 , #22924 (#29948) · bea300dd
Committed by 石晓伟 on Dec 29, 2020

Support mips (#29943) · 5a8d43bb
Committed by Wilber on Dec 29, 2020

25 Dec, 2020 1 commit

2 0 ps core 2 (#29894) · f781ab08

Committed by tangwei12 on Dec 25, 2020

* add ps table (#29463)

* add ps table

Change-Id: I468a04bd071d21ff52654926fcf4d5f3da19e178

* add service (#29560)

* add service, remove ut on mac

* fix heter_profiler & add heter stop method

* fix code style

* merge pscore

Change-Id: Ie7f60d1cdde6755a0c29db26863c6283e9843d57

* fix cmake

Change-Id: I6773509a7b4ca79139ecc40b7bf3eb318ceff8bb

* fix conflit

Change-Id: I35575be0c96a8520f9d756ea7f1ff0b904a165ba

* fix conflit

Change-Id: Ic926ea0b0d67803226d51241397ba3b510226bfa

17 Dec, 2020 1 commit

update activation op on kunlun (#29577) (#29717) · e82efc0c

Committed by TTerror on Dec 17, 2020

* fix expand && concat/transpose to new api

* update xpu_header

* update activation op on kunlun

* update activation op on kunlun

* update activation op on kunlun

* update activation op on kunlun

* update activation op on kunlun

* add nearest_interp on kunlun

* update error message

16 Dec, 2020 1 commit
- support roi_align & affine_channel for kunlun (#29561) (#29657) · d82b0300
  Committed by QingshuChen on Dec 16, 2020
```
* support roi_align & affine_channel for kunlun

* minor
```
15 Dec, 2020 1 commit

cherry-pick kunlun PR: 29458, 29539 (#29583) · 03ddf690

Committed by QingshuChen on Dec 15, 2020

* support mobilenet for kunlun (#29458)

* add xpu ops for training transformer in kunlun (#29539)

* 1.fix matmul bug 2. add one hot

* add xpu error msg
Co-authored-by: procr <procrboo@gmail.com>
Co-authored-by: taixiurong <taixiurong@126.com>

08 Dec, 2020 1 commit

[2.0 rc1/cherrypick] cherry-pick kunlun PR:29234/29229/29293/29367/29280/29448 (#29466) · 6bfc5721

Committed by liuyuhui on Dec 08, 2020

* add deformable_conv op on xpu (#29234)

* rebase develop

* update deformable_conv op on xpu

* update deformable_conv op on xpu

* update kunlun conv2d/softmax/elementwise implemetation (#29229)

* update conv2d & softmax to new xpu api
* test=kunlun

* remove useless comments
* test=kunlun

* remote softmax xpu op
* test=kunlun

* update kunlun softmax
* test=kunlun

* update xpu unitest
* test=kunlun

* fix elementwise_grad bug for kunlun
*test=kunlun

* support global pooling for kunlun (#29293)

* test=kunlun

* update reduce_sum op on xpu (#29367)

* update reduce_sum op on xpu

* update reduce_sum op on xpu

* support running on xpu

* fix expand/uniform_random && concat/transpose to new api on xpu (#29280)

* fix expand && concat/transpose to new api

* update uniform_random_op

* update xpu_header

* 1. fix elementwise ops'bug 2. fix softmax_with_cross_entropy_op 3. add biliner_interp_op (#29448)
Co-authored-by: root <root@bjhw-sys-rpm0223.bjhw.baidu.com>
Co-authored-by: 卖鱼的哲学 <tangzhiyi11@users.noreply.github.com>
Co-authored-by: QingshuChen <qingshu.chen714@gmail.com>
Co-authored-by: taixiurong <taixiurong@126.com>
Co-authored-by: root <root@bjhw-sys-rpm0223.bjhw.baidu.com>

07 Dec, 2020 2 commits
- [Release/2.0 rc1] fix cmake error message. (#29420) · 401cc1e0
  Committed by Wilber on Dec 07, 2020
```
* update  lite tag.

* fix cmake error log.
```
- update lite tag. (#29394) · 07a7cd4b
  Committed by Wilber on Dec 07, 2020
05 Dec, 2020 1 commit
- update cmake for FT openbals version (#29383) · 4a8aef49
  Committed by Wilber on Dec 05, 2020
02 Dec, 2020 1 commit

add compile option WITH_TENSORRT (#29208) (#29264) · f5afeef1

Committed by Shang Zhizhou on Dec 02, 2020

* add compile option WITH_TENSORRT

* add WITH_TENSORRT to ci paddle_buils.sh

* add WITH_TENSORRT to paddle_build.sh

* change FATAL to WARNING when TensorRT is not found and WITN_TENSORRT=ON, just to pass ci-py3 temporarily

01 Dec, 2020 1 commit
- revert python file coverage, delete coverage run --include, test=develop (#29230) · 2b2cd186
  Committed by wanghuancoder on Dec 01, 2020
30 Nov, 2020 2 commits

[Lite-Subgraph] Fix compile error for lite subgraph. (#29146) · 4fec182d
Committed by Wilber on Nov 30, 2020

Generate code coverage reports only for incremental files (#28508) · 0239f796

Committed by wanghuancoder on Nov 30, 2020

* Generate code coverage reports only for incremental files, test=develop

* Generate code coverage reports only for incremental files, test=develop

* Generate code coverage reports only for incremental files, test=develop

* test for diff python file, test=develop

* fix no python diff report, test=develop

* add cc test file, test=develop

* fix bug in generic.cmake, test=develop

* for debug no cc report, test=develp

* modify compire branch form test_pr to test, test=develop

* fix bug, test=develop

* test for h file changed, test=develop

* debug for redefinition of argument optimize error, test=develop

* close -o3 for test, test=develop

* remove -o3 for test, test=develop

* remove coverage option for nvcc, test=develop

* use CMAKE_CXX_FLAGS open coverage option when header file changed, test=develop

* reopen -o3, test=develop

* remove debug code, test=develop

* remove unused code, test=develop

27 Nov, 2020 2 commits

fix CUDA 11 error on windows (#29101) · e668cb07
Committed by Zhou Wei on Nov 27, 2020

detect tensorRT plugin fp16 in runtime (#27933) · b9e76a01

Committed by Shang Zhizhou on Nov 27, 2020

* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake

* comile with cuda9

* add some unittest

* notest;test=coverage

* add unittest for trt plugin swish && split

* update ernie unittest

* fix some error message

* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter

* fix comile errror when CUDA_ARCH_NAME < Pascal"

* fix comile error

* update unittest timeout

* compile with cuda9

* update error msg

* fix code style

* add some comments

* add define IF_CUDA_ARCH_SUPPORT_FP16

* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED

24 Nov, 2020 1 commit
- restore timeout value (#29027) · 5cb8e17a
  Committed by YUNSHEN XIE on Nov 24, 2020
20 Nov, 2020 1 commit

add kunlun kernel: slice, slice_grad, top_k, cast. *test=kunlun (#28542) · d3d1a6b6

Committed by taixiurong on Nov 20, 2020

* 1.add xpu slice op 2. add xpu top_k op 3.modify xpu cast to new api

* 1.add xpu slice op 2. add xpu top_k op 3.modify xpu cast to new api

16 Nov, 2020 1 commit
- open a part of GPU unittest for windows (#28378) · 93c39779
  Committed by Zhou Wei on Nov 16, 2020
```
* open a part of GPU unittest for windows

* open a part of GPU unittest for windows
```
12 Nov, 2020 1 commit
- Add TRT support for the pruned transformer model; fix the bug that TensorRT does not support DeletePass (#28517) · 8699f38d
  Committed by Shang Zhizhou on Nov 12, 2020
```
* skip_layernorm_op done

* add unittest

* slice op convertor support trt < 6

* skip_layernorm only work in ernie
```
09 Nov, 2020 2 commits
- modified timeout value on windows (#28499) · d3b2d07d
  Committed by YUNSHEN XIE on Nov 09, 2020
```
* modified timeout value on windows

* fix some error
```
- exec ut no more than 15s 2 (#28441) · 72c78e4d
  Committed by YUNSHEN XIE on Nov 09, 2020
```
* exec ut no more than 15s 2

* fix for ut test_inplace_addto_strategy timeout
```
06 Nov, 2020 1 commit
- fix batch_norm_xpu bug & remove xpusimulator dependence (#28430) · 6bba8e57
  Committed by QingshuChen on Nov 06, 2020
```
*test=kunlun
```