Commits · 8ce2482b8011536edee493590808411de8839582 · 机器未来 / Paddle

09 Jan, 2021 1 commit

add View(reuse allocation) strategy on squeeze, unsqueeze, reshape, flatten op (#29913) · da16b33f

Authored by pangyoki on Jan 09, 2021

* add view strategy on squeeze,unsqueeze,reshape,flatten

* add squeeze unittest

* add unittests

* use View strategy as name rather than Reuse Allocation

* fix view api doc

* fix format

* use core.ops when input of reshape2 is Tensor

* fix test_cross_entropy_loss error because of reshape2

* delete selected_rows

* change op_function

* little change

* solve HandleViewBetweenInputAndOutput

da16b33f
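
The "View (reuse allocation)" idea in the commit above can be illustrated with a toy example: a reshape that returns a view shares the input's storage instead of copying it, so a write through one is visible through the other. This is only a sketch of the general concept. `ToyTensor`, `reshape_view` and `reshape_copy` are hypothetical names, not Paddle APIs, and Paddle's actual implementation is not shown here.

```python
# Toy illustration of a "view": reshape returns a new object that shares
# the same underlying buffer instead of copying it.
# ToyTensor is a hypothetical class, not part of Paddle.
from math import prod


class ToyTensor:
    def __init__(self, buffer, shape):
        if prod(shape) != len(buffer):
            raise ValueError("shape does not match buffer size")
        self.buffer = buffer          # flat Python list used as storage
        self.shape = tuple(shape)

    def reshape_view(self, shape):
        # No copy: the new tensor points at the very same list object.
        return ToyTensor(self.buffer, shape)

    def reshape_copy(self, shape):
        # Copy: the new tensor gets its own storage.
        return ToyTensor(list(self.buffer), shape)


x = ToyTensor([0, 1, 2, 3, 4, 5], (2, 3))
v = x.reshape_view((3, 2))   # shares storage with x
c = x.reshape_copy((6,))     # independent storage

v.buffer[0] = 42             # write through the view
print(x.buffer[0], c.buffer[0])
```

The view avoids an allocation and a copy, at the cost that in-place writes are shared between `x` and `v`.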

08 Jan, 2021 3 commits

Fix dtype of ungenerated grad var (#28511) · 8696335f

Authored by Leo Chen on Jan 08, 2021

* fix dtype of ungenerated grad var

* update ut

* refine code

* set default dtype

* fix could_use_cudnn bug

* remove debug code

* re-implement

* fix bug

8696335f

Add callback after TensorCopy (#30123) · 1f97d61c

Authored by Leo Chen on Jan 08, 2021

* change to tensor copy sync

* change to tensor copy sync

* make copy_to safe when use TensorCopy

* refine code

* add ut

* add cudapinned garbagecollector

* add testcase: cpu place -> cuda pinned place

1f97d61c

【Paddle.Fleet】Fix tensor table (#30075) · 528e03fc

Authored by Chengmo on Jan 08, 2021

* add tensor table

528e03fc

07 Jan, 2021 1 commit

Add Lookahead and ModelAverage Optimizer (#30004) · 198fbdfb

Authored by 123malin on Jan 07, 2021

* test=develop, add model_average and lookahead

198fbdfb

06 Jan, 2021 2 commits

add dispensable input for core.ops.reshape2/expand/slice (#30072) · adac38c5

Authored by Leo Chen on Jan 06, 2021

* add dispensable input 'shape' for core.ops.reshape2

* add dispensable inputs for core.ops.reshape2/expand/slice

* add ut

adac38c5

Fix bug: In dynamic mode, if start or end is negative, __getitem__ returns wrong result (#30003) · 9922bd41

Authored by liym27 on Jan 06, 2021

1. when slice_item is a slice:
 1) the start of __getitem__ should be std::max(start, 0)
 2) the end of __getitem__ should be std::min(end, dim)
2. when slice_item is an integer, it should be in [-dim_len, dim_len)
3. Fix error message to use accurate data

9922bd41
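
The bound handling described in the commit above can be sketched in Python. This is an illustration under assumptions, not the C++ code from the commit: it assumes a step of 1, that negative bounds are first offset by the dimension length, and that both bounds are clamped into `[0, dim_len]` (which matches Python's own `slice.indices`). The function names are hypothetical.

```python
def normalize_slice(start, end, dim_len):
    """Map possibly negative slice bounds (step 1) into [0, dim_len]."""
    if start < 0:
        start += dim_len
    if end < 0:
        end += dim_len
    # Clamp after the offset, so out-of-range bounds select nothing extra.
    start = min(max(start, 0), dim_len)
    end = min(max(end, 0), dim_len)
    return start, end


def normalize_index(index, dim_len):
    """An integer index must lie in [-dim_len, dim_len)."""
    if not -dim_len <= index < dim_len:
        raise IndexError(
            f"index {index} is out of range [{-dim_len}, {dim_len})"
        )
    return index + dim_len if index < 0 else index


data = [10, 11, 12, 13, 14]
s, e = normalize_slice(-3, 100, len(data))
print(data[s:e])                                 # same elements as data[-3:100]
print(data[normalize_index(-1, len(data))])      # last element
```

Comparing against `slice(start, end).indices(dim_len)` is a convenient way to check such a helper for every combination of bounds.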

05 Jan, 2021 1 commit

add topo-aware in heter-ps (#30087) · 0b8e1fad

Authored by Thunderbrook on Jan 05, 2021

* add topo aware

* resource.h

* topo aware

* format

0b8e1fad

04 Jan, 2021 1 commit

[Inference] zero_copy_tensor supports int8_t (#30053) · 68398abc

Authored by cc on Jan 04, 2021

* zero_copy_tensor supports int8_t

68398abc

27 Dec, 2020 1 commit

[Dynamic Inplace] Support ShareInplaceVersionCounterWith for C++ Tensor (#29842) · 9602a182

Authored by liym27 on Dec 27, 2020

* Revert "[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows (#29267)"

This reverts commit b10ecd9d.

* Support ShareInplaceVersionCounterWith to share the same inplace version counter for VarBase

9602a182

26 Dec, 2020 1 commit

[Kunlun] PR2: Support MultiDevicePass and BKCL in parallel executor (#29574) · 4427df37

Authored by liuyuhui on Dec 26, 2020

4427df37

24 Dec, 2020 1 commit

[Feature] one ps (3/4) (#29604) · 032414ca

Authored by tangwei12 on Dec 24, 2020

* oneps (3/4)

Co-authored-by: MrChengmo <cmchengmo@163.com>
Co-authored-by: malin10 <malin10@baidu.com>
Co-authored-by: chengmo <chengmo@baidu.com>

032414ca

23 Dec, 2020 1 commit

heter box (#29734) · 09b6e719

Authored by Thunderbrook on Dec 23, 2020

* add heter box

* add trainer, worker, wrapper...

* format

* for ci

* format

* remove boost get

* boost & copyright

* rename

* rename

* format

* format

* format

Co-authored-by: yaoxuefeng6 <yaoxuefeng@baidu.com>

09b6e719

22 Dec, 2020 1 commit

Support multi-stream communication for dynamic graph distributed (#29525) · 01e2874a

Authored by ShenLiang on Dec 22, 2020

* fix fleet for multi-stream

* fix memcpy for ncclid

* use sync to solve move operation

01e2874a

16 Dec, 2020 2 commits

[Kunlun] PR1: Support one Kunlun card training in parallel executor (#29337) · f13c3a9c

Authored by liuyuhui on Dec 16, 2020

f13c3a9c

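Add ROCm platform support code (#29342) · 76738504

Authored by Y_Xuan on Dec 16, 2020

* Add ROCm platform support code

* Fix some issues

* Resolve some ambiguities and add comments

* Fix code formatting

* Code changes after resolving conflicts

* Modify operators.cmake

* Fix formatting

* Fix errors

* Unify interfaces

* Update dates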

76738504

15 Dec, 2020 2 commits

Add tf32 support for A100 tensor core acceleration for cuBLAS (#28732) · efea540c

Authored by AshburnLee on Dec 15, 2020

efea540c

fix non-contiguous bug for python api. (#29615) · 78dad786

Authored by Wilber on Dec 15, 2020

78dad786

09 Dec, 2020 2 commits

support deepcopy for Layer/Tensor/Paramerbase (#29387) · e74e1a22

Authored by Zhou Wei on Dec 09, 2020

* support deepcopy for Layer/Tensor/Paramerbase

* fix some code

e74e1a22

Rebuild group automatically in dynamic graph distributed (#29255) · 2ef9e0e2

Authored by ShenLiang on Dec 09, 2020

* add tensor_indices in AssignGroupBySize

* add rebuild group in reducer

2ef9e0e2

05 Dec, 2020 1 commit

update unbind norm add CUDAPlace api doc information (#29322) · 7c508d86

Authored by myq406450149 on Dec 05, 2020

* enhance array_to_lod_tensor_op lod_tensor_to_array_op errors information. test=develop

* fix format. test=develop

* format fix. test=develop

* add lod_rank_table. test=develop

* fix format. test=develop

* fix doc info. test=develop

* fix np error

* add unbind dygraph api. test=develop

* fix unbind doc. test=develop

7c508d86

04 Dec, 2020 2 commits

[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in... · b10ecd9d

Authored by liym27 on Dec 04, 2020

[inplace] Add ShareHolderWith for class Variable and SharePlaceholderWith in VarBase.detach() to share the same Tensor/SelectedRows (#29267)

b10ecd9d

Support type promote for basic math ops (quantum required) (#29265) · 9ad800eb

Authored by Chen Weihang on Dec 04, 2020

* basic impl of type promote

* add comment & another testcase

* fix complex bugs & support python op promote type

* fix failed unittests & polish code

* add unittest for coverage

* change to only promote complex type

* polish code details

* polish several comments

9ad800eb

02 Dec, 2020 1 commit

Add pure fp16 training with master weights. (#27712) · be3777a5

Authored by Zhen Wang on Dec 02, 2020

* add the weight decay func for the momentum op

* Add the multi_precision function in Momentum Optimizer.

* Make sure that the initial values of the master weights are the same as the fp16 weights.

* add static loss scaling.

* add the rescale_grad function in the pure fp16 training.

* use the original momentum updating method.

* Polish some code, such as variable names.

* add docstring for apis.

* update the var creation details of _create_master_weight.

* do not modify the code for imperative momentum updating.

* Fix the error of test_dist_sparse_tensor_load_momentum UT.

* add unit test for multi precision fp16 training.

* add more unit tests for CI.

* Use lower threshold values for allclose comparing in test_multi_precision_fp16_train UT.

* For CI Coverage Checking.

be3777a5
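
Why pure fp16 training keeps master weights can be seen in a small numeric sketch. It is not the Momentum optimizer from the commit above: it uses plain SGD with a constant gradient, emulates fp16 rounding with the standard library's `struct` half-precision format, and uses Python floats (double precision) as a stand-in for fp32 master weights. All variable names are illustrative.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


lr, grad, steps = 0.1, 1e-3, 10   # each update is lr * grad = 1e-4

# (a) fp16-only: the weight itself is stored in half precision, so an
# update much smaller than the fp16 spacing near 1.0 is rounded away.
w_fp16_only = to_fp16(1.0)
for _ in range(steps):
    w_fp16_only = to_fp16(w_fp16_only - lr * grad)

# (b) master weights: update a higher-precision copy, then cast down
# to fp16 for use in the forward and backward pass.
master = 1.0
w_fp16 = to_fp16(master)
for _ in range(steps):
    master -= lr * grad
    w_fp16 = to_fp16(master)

print(w_fp16_only, master, w_fp16)
```

In (a) the weight never moves because each small update is lost to rounding, while in (b) the updates accumulate in the master copy and eventually show up in the fp16 copy as well.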

01 Dec, 2020 2 commits

add complex64 and complex128 type; add +-*/@ and slice operator for c… (#29199) · 8f45d142

Authored by chentianyu03 on Dec 01, 2020

* add complex64 and complex128 type; add +-*/@ and slice operator for complex types

* add test cases for complex elementwise, matmul and getitem unittest

* add test cases for complex types

* add test cases for complex matmul unittest

8f45d142

accumulate gradient for leaf tensor with previous graph and expose leaf tensor concept (#28429) · c0a991c8

Authored by Zhou Wei on Dec 01, 2020

* The leaf tensor concept is exposed and the gradient accumulation of leaf tensor

* The leaf tensor concept is exposed and the gradient accumulation of leaf tensor

* fix coverage

* fix api doc

* fix CI unittest

* fix CI unittest

* fix unittest

* empty tensor doesn't need inner_var_

* fix some error message

c0a991c8

30 Nov, 2020 1 commit

Check whether there is any inplace operation affecting gradient calculation. (#27901) · 865a4598

Authored by liym27 on Nov 30, 2020

* Add a class TensorInplaceVersion to count the inplace version and put it in framework::Tensor instead of Allocation or Variable.

* Add a new attribute `_inplace_version` for VarBase.

* Raise exception if an inplace operation can result in incorrect gradient computation.

* Add a new interface _bump_inplace_version() for VarBase to bump the version whenever the Tensor is modified through an inplace operation.

* For api assign, call _bump_inplace_version() when it's an inplace operation in dynamic mode.

* Use original var_wrapper if the inplace_version is not changed.

* Replace SnapshotVarWrapperList with SnapshotVarWrapper to optimize performance.

865a4598
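
The version-counter check described in the commit above can be sketched in a few lines of Python. This is a conceptual toy, not Paddle's `TensorInplaceVersion` or `VarBase` code: `ToyVar`, `Snapshot` and `square_forward` are hypothetical names, and the only behavior modeled is "record the version at forward time, compare it at backward time".

```python
class ToyVar:
    """Hypothetical stand-in for a tensor with an inplace version counter."""

    def __init__(self, value):
        self.value = value
        self.inplace_version = 0

    def bump_inplace_version(self):
        self.inplace_version += 1

    def add_(self, delta):
        # In-place update: modify the value and bump the version.
        self.value += delta
        self.bump_inplace_version()
        return self


class Snapshot:
    """Remembers which version of a variable the forward pass saw."""

    def __init__(self, var):
        self.var = var
        self.version = var.inplace_version

    def get(self):
        if self.var.inplace_version != self.version:
            raise RuntimeError(
                "variable was modified in place after the forward pass; "
                "its gradient would be computed from stale data"
            )
        return self.var.value


def square_forward(x):
    saved = Snapshot(x)   # the backward of x * x needs the original x

    def backward(grad_out):
        return 2 * saved.get() * grad_out

    return x.value ** 2, backward


x = ToyVar(3.0)
y, backward = square_forward(x)
grad_ok = backward(1.0)        # version unchanged, gradient is valid
x.add_(1.0)                    # in-place change bumps the version
try:
    backward(1.0)
    stale_detected = False
except RuntimeError:
    stale_detected = True
print(grad_ok, stale_detected)
```

Keeping the counter on the tensor itself means every object that shares the same data observes the same version.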

27 Nov, 2020 1 commit

Support dynamic graph distributed (#28997) · e2d01eb6

Authored by ShenLiang on Nov 27, 2020

* add reducer

* refine event for memorycopy

* add concat&split for allreduce

* apply concat & split for fuse tensor

* fix nccl dep

* fix the unittest, compile problem and ddp initialization problem

* fix unittest for mac & add some comments & solve the repeated param in sublayers

* fix unittest for windows & fix document

e2d01eb6

26 Nov, 2020 1 commit

Split train_mode and has_grad for tracer (#29064) · 770395cb

Authored by Leo Chen on Nov 26, 2020

* split train_mode and has_grad

* fix format

* fix ci problems

* fix sample code

770395cb

25 Nov, 2020 1 commit

fix tensor detach to zero copy (#27921) · 8ca0a8a8

Authored by Zhou Wei on Nov 25, 2020

* fix tensor detach to zero copy

* fix tensor detach to zero copy

8ca0a8a8

23 Nov, 2020 1 commit

polish two api doc detail, test=document_fix (#28971) · 768dab44

Authored by Chen Weihang on Nov 23, 2020

768dab44

20 Nov, 2020 2 commits

Fix gpu memory allocation bug. (#28703) · 1dad8cea

Authored by gongweibao on Nov 20, 2020

1dad8cea

fix bug that to_tensor not support paddle.Place (#28717) · 3b0dd5f6

Authored by Zhou Wei on Nov 20, 2020

3b0dd5f6

18 Nov, 2020 1 commit

Add check for non-dispensable input (#28666) · 3d09929b

Authored by Leo Chen on Nov 18, 2020

* Add check for non-dispensable input

* fix typo

3d09929b

13 Nov, 2020 1 commit

update 2.0 API english doc (#28525) · bf6e7cba

Authored by Zhou Wei on Nov 13, 2020

* make sure the NumPy version is below 1.19.3

* fix 2.0 doc

bf6e7cba

11 Nov, 2020 1 commit

[Inference] Add TryShrinkMemory interface. (#28409) · 1bf48365

Authored by Wilber on Nov 11, 2020

1bf48365

05 Nov, 2020 1 commit

check op_version_registry in CI test, test=develop (#28402) · c41fd033

Authored by 石晓伟 on Nov 05, 2020

c41fd033

04 Nov, 2020 2 commits

Add broadcast_shape api (#28257) · 8b2436a7

Authored by Leo Chen on Nov 04, 2020

* add broadcast_shape api

* add ut

* follow comments

* add example code, test=document_fix

* update example code, test=document_fix

8b2436a7
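
For readers unfamiliar with what a broadcast-shape helper computes, here is a pure-Python sketch of the usual NumPy-style broadcasting rule: align shapes from the right, treat missing dimensions as 1, and require each pair of sizes to match or contain a 1. It is an assumption that the API added above follows exactly this rule in every case; the function below is an illustration, not Paddle's implementation.

```python
from itertools import zip_longest


def broadcast_shape(x_shape, y_shape):
    """Return the shape that results from broadcasting two shapes."""
    result = []
    # Walk both shapes from the last dimension; pad the shorter one with 1.
    for a, b in zip_longest(reversed(x_shape), reversed(y_shape), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(
                f"shapes {list(x_shape)} and {list(y_shape)} cannot be broadcast"
            )
    return result[::-1]


print(broadcast_shape([2, 1, 3], [1, 3, 1]))
print(broadcast_shape([5], [2, 1]))
```

Incompatible shapes such as `[2, 3]` and `[4, 3]` raise `ValueError` in this sketch.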

enhance the op_version_registry, test=develop (#28347) · 21a63f6f

Authored by 石晓伟 on Nov 04, 2020

* enhance the op_version_registry, test=develop

* add unittests, test=develop

* enhance the op_version_registry, test=develop

* fix bugs, test=develop

* revert pybind_boost_headers.h, test=develop

* fix a attribute bug, test=develop

21a63f6f

03 Nov, 2020 1 commit

Optimize ERNIE model inference performance in TensorRT and support variable-length input (#28367) · ea851796

Authored by Shang Zhizhou on Nov 03, 2020

* fp16 result ok

* change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS

* auto detect special slice op converter for ernie with trt oss

* ernie oss only support fp16

* fix special_slice_plugin serialize bug

* matmul in tensorrt ok

* ernie unittest ok

* add matmul tensorrt unittest

* remove demo code

ea851796
