Commits · 3e088aafe4a0c5c169cfa2cc21cc047cfc320081 · Crayon鑫 / Paddle

25 Nov, 2021 · 21 commits
- [NPU] add int64 support for argsort op (#37434) · 3e088aaf
  Committed by furnace on Nov 25, 2021
```
* [NPU] add int64 support for argsort op

* [NPU] delete debug codes
```
- [NPU] add NPU kernel for prior_box op (#37519) · 1127fecb
  Committed by furnace on Nov 25, 2021
```
* [NPU] add NPU kernel for prior_box op

* [NPU] delete debug codes
```
- Disable the check of missing op benchmark script temporarily. (#37535) · 65056742
  Committed by Yiqun Liu on Nov 25, 2021
- Pass the stream created by Paddle to CINN. (#37337) · c249556d
  Committed by Zhen Wang on Nov 25, 2021
- fix pass_desc.proto compilation error, test=develop (#37536) · a4ef88ed
  Committed by wuhuanzhou on Nov 25, 2021
- Add InternalStorage and add ShardingOptimizerStage2 (#37489) · 5af64631
  Committed by Baibaifan on Nov 25, 2021
- [cherry-pick 2.2 heterps]bug fix for launch_utils.py (#37521) · 8bb1038c
  Committed by zmx on Nov 25, 2021
```
* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* [heterps]bug fix for _run_from_dataset

* fix heter_server.cc

* fix launch_utils.py

* fix heter_section_worker.cc

* fix. test=develop

* fix. test=develop
```
- Support multi-stream allocation for CUDA place (#37290) · b9c464c3
  Committed by From00 on Nov 25, 2021
```
* Support multi-stream allocation for CUDA place

* Do not notify the retrying from other streams when free CUDA allocation

* Fix compile error for CPU

* Fix compile error for HIP

* Release memory for StreamSafeCUDAAllocaRetry in malloc_test

* Add FLAGS_use_stream_safe_cuda_allocator

* Fix CI error for 'set_tests_properties'

* Invalidate stream safe CUDA allocator for naive_best_fit and thread_local strategy

* Performance improvement: insert allocation pair to outstanding_events_map when free but not alloc; replace recursive_mutex with SpinLock

* FLAGS priority changes: FLAGS_use_system_allocator > FLAGS_use_stream_safe_cuda_allocator

* Performance improvement: directly delete allocation when the recorded_streams is empty in FreeImpl of StreamSafeCUDAAllocator

* Add UT for alloc interface

* Changes multi-stream interface; move retry code from AllocatorFacadePrivate to StreamSafeCUDAAllocator
```
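The flag ordering described in the commit body above (`FLAGS_use_system_allocator` taking priority over `FLAGS_use_stream_safe_cuda_allocator`, with the stream-safe allocator disabled for the `naive_best_fit` and `thread_local` strategies) can be pictured with a small decision function. This is only an illustrative sketch, not Paddle's code: the function name, the returned labels, and the `auto_growth` default are assumptions made for the example.

```python
def select_cuda_allocator(use_system_allocator,
                          use_stream_safe_cuda_allocator,
                          strategy="auto_growth"):
    """Return a label for the allocator that would be chosen.

    Hypothetical helper: names and return values are illustrative only.
    """
    # Highest priority: the system allocator flag overrides everything else.
    if use_system_allocator:
        return "system"
    # The stream-safe allocator is assumed to be invalid for these strategies.
    if use_stream_safe_cuda_allocator and strategy not in (
            "naive_best_fit", "thread_local"):
        return "stream_safe_cuda"
    # Otherwise fall back to the plain allocation strategy.
    return strategy


print(select_cuda_allocator(True, True))
print(select_cuda_allocator(False, True, "naive_best_fit"))
```

Checking the flags in priority order keeps the override rule in one place, so a caller never has to reason about conflicting flag combinations.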
- block unknown option /arch:SSE3 (#37439) · adb54eb0
  Committed by Sing_chan on Nov 25, 2021
```
* block unknown option /arch:SSE3

* modify according to zhouwei's comment
```
- add new API paddle.nn.initializer.Dirac (#37389) · bbb9b28a
  Committed by zhouweiwei2014 on Nov 25, 2021
```
* add new API paddle.nn.initializer.Dirac

* fix doc
```
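For readers unfamiliar with Dirac initialization, the idea behind the initializer named in the commit above can be sketched in plain Python: the convolution weight is all zeros except for a single 1.0 at the kernel centre that links output channel i to input channel i, so the layer initially passes its input through unchanged. The helper names below are hypothetical, the sketch covers only the 1-D, single-group case, and it is not taken from the Paddle implementation.

```python
def dirac_conv1d_weight(out_channels, in_channels, kernel_size):
    """Build a [out_channels][in_channels][kernel_size] weight (nested lists)
    that is zero except for a 1.0 at the kernel centre on the channel diagonal."""
    weight = [[[0.0] * kernel_size for _ in range(in_channels)]
              for _ in range(out_channels)]
    center = kernel_size // 2
    for d in range(min(out_channels, in_channels)):
        weight[d][d][center] = 1.0
    return weight


def conv1d_same(signal, weight):
    """Naive 1-D cross-correlation, stride 1, zero padding, odd kernel size.

    signal: [in_channels][length]; returns [out_channels][length].
    """
    kernel_size = len(weight[0][0])
    pad = kernel_size // 2
    length = len(signal[0])
    out = []
    for w_out in weight:
        row = []
        for t in range(length):
            acc = 0.0
            for c, w_in in enumerate(w_out):
                for k, w in enumerate(w_in):
                    idx = t + k - pad
                    if 0 <= idx < length:  # positions outside the signal count as zero
                        acc += w * signal[c][idx]
            row.append(acc)
        out.append(row)
    return out


x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
# With matching channel counts the convolution reproduces its input.
print(conv1d_same(x, dirac_conv1d_weight(2, 2, 3)) == x)
```

When there are more output channels than input channels, the extra output channels in this sketch stay all-zero.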
- [new-exec] fix program cache key (#37500) · e64829e2
  Committed by Leo Chen on Nov 25, 2021
```
* fix program cache key

* bug fix

* fix cache problem

* remove unused code
```
- [fleet_executor] Compute Interceptor stop along data flow (#37531) · 50f75fb5
  Committed by WangXi on Nov 25, 2021
- Fix static-ci (#37504) · 992d4ebb
  Committed by tianshuo78520a on Nov 25, 2021
```
* Fix static-ci
```
- Added GradTensorHolder to Eager Dygraph (#37458) · bc9f9f43
  Committed by Zhanlue Yang on Nov 25, 2021
```
* Added GradTensorHolder to Eager Dygraph

* Added accumulation codes to Eager Dygraph

* Fix windows-ci issue

* Fix NPU-CI issue

* Fixed CI-Coverage issue
```
- Export task node to python (#37509) · 3f815e76
  Committed by LiYuRio on Nov 25, 2021
- Hot fix for dataloader thread error because of pten (#37520) · ed7a21de
  Committed by Chen Weihang on Nov 24, 2021
```
* hot fix for dataloader thread error

* polish comment

* fix type in comment, test=document_fix
```
- Fix test rnn memory helper op (#37474) · e4791d88
  Committed by xiongkun on Nov 25, 2021
```
* clear LoDTensorArray

* fix  bugs

* fix

* fix gpu
```
- [PaddlePaddle Hackathon] 6. Add ZeroPad2d to Paddle (#37151) · 81861f69
  Committed by Matsumoto GAO on Nov 25, 2021
```
* add zeropad2d v0.1

* add zeropad2d v0.2

* add zeropad2d v0.3

* add zeropad2d v0.3

* add zeropad2d v0.3

* add zeropad2d v0.4

* add zeropad2d v0.5

* add zeropad2d v0.5 codestyle

* add zeropad2d v0.5 codestyle

* add zeropad2d v0.6 functional

* add zeropad2d v0.6 functional

* add zeropad2d v0.6 functional
```
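What a zero-padding layer such as the ZeroPad2d added in the commit above computes can be shown with a minimal pure-Python sketch on a single 2-D map. The function name is hypothetical, and the `[left, right, top, bottom]` padding order is an assumption of this example rather than something stated in the commit.

```python
def zero_pad_2d(image, padding):
    """Pad a 2-D nested list with zeros.

    padding: a single int applied to all sides, or a sequence
    [left, right, top, bottom] (order assumed for this sketch).
    """
    if isinstance(padding, int):
        left = right = top = bottom = padding
    else:
        left, right, top, bottom = padding
    width = len(image[0]) + left + right
    # Rows of zeros above the image.
    padded = [[0] * width for _ in range(top)]
    # Original rows with zeros added on the left and right.
    for row in image:
        padded.append([0] * left + list(row) + [0] * right)
    # Rows of zeros below the image.
    padded.extend([0] * width for _ in range(bottom))
    return padded


for row in zero_pad_2d([[1, 2], [3, 4]], [1, 0, 0, 1]):
    print(row)
```

A real layer would apply the same padding to the last two dimensions of a batched tensor; the sketch handles one height-by-width map to keep the indexing visible.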
- fix_matmul_op_int8_plugin (#37525) · 0fd70d71
  Committed by Wangzheee on Nov 25, 2021
- infershape func to infermeta (#37524) · 2a905f6b
  Committed by Chen Weihang on Nov 24, 2021
- [new-exec] skip compiled program (#37512) · 171da2ce
  Committed by Leo Chen on Nov 25, 2021
```
* skip compiled program

* fix ut
```
24 Nov, 2021 · 19 commits
- Changed second batch of deprecated mkldnn header and function names to new oneDNN names (#37351) · 7db7a0ec
  Committed by piotrekobiIntel on Nov 24, 2021
```
* Add second batch of deprecated mkldnn namespace and macro changes

* Unlock CI

* Fix temporary namespace alias placing
```
- [fleet_executor] fix message bus bug (#37507) · 10d8d6b6
  Committed by Yuang Liu on Nov 24, 2021
- Added EagerUtils to Eager Dygraph (#37479) · 7de99d8c
  Committed by Zhanlue Yang on Nov 24, 2021
```
* Added EagerUtils to Eager Dygraph

* Purified include dependencies for global_utils

* Fixed merge conflicts
```
- bring forward check added_ut (#37511) · 486b77f2
  Committed by Sing_chan on Nov 24, 2021
- [GpuPs]pybind core (#37287) · d69daed1
  Committed by Thunderbrook on Nov 24, 2021
```
* pybind core

* set use psgpu
```
- Fix lod in fetch_v2 (#37514) · acbf9974
  Committed by Aurelius84 on Nov 24, 2021
- fix range op (#37486) · d5c51e62
  Committed by Jiawei Wang on Nov 24, 2021
- [new-exec] support skipping infershape (#37510) · e76b601b
  Committed by Leo Chen on Nov 24, 2021
- elementwise_mul refactor (#37471) · c5e857d4
  Committed by YuanRisheng on Nov 24, 2021
```
* elementwise_mul refactor

* perfect code in test

* delete redundant code

* fix bugs when run test_multiply

* adjust the location of macro

* fix bugs when run ci
```
- [PTen] Add Scalar and ScalarArray in pten (#37409) · 0f24de83
  Committed by zyfncg on Nov 24, 2021
```
* add scalar and scalar_array

* remove DenseTensor include from Scalar and ScalarArray

* remove inner header from scalar_array

* refactor the method of fill_constant and add some comment
```
- [Paddle-Inference] Matmul_int8_convert: tensor*tensor (#37285) · 16590799
  Committed by Wangzheee on Nov 24, 2021
```
* matmul_convert_int8

* matmul_convert_int8

* matmulconvert_int8

* Matmul_int8_convert: tensor*tensor

* Matmul_int8_convert: tensor*tensor

* Matmul_int8_convert: tensor*tensor
```
- Adapt auto search (#37490) · 025053b4
  Committed by zhaoyingli on Nov 24, 2021
```
* adapt auto search

* adapt auto search

* fix matmulv2 compatible

* del debug
```
- Fix op-benchmark CI (#37487) · 5ff1ff5a
  Committed by tianshuo78520a on Nov 24, 2021
```
Fix op-benchmark CI
```
- [Auto Parallel] Add the unified cluster representation (#37091) · db727551
  Committed by Yulong Ao on Nov 24, 2021
```
* [Auto Parallel]  Add the unified cluster representation

* Add the local id for devices

* Add some comments
```
- [NewExe] Support HandleComplexGradToRealGrad to cast complex into Real (#37450) · 8b87d5eb
  Committed by Aurelius84 on Nov 24, 2021
- [PTen] Standardized unittest namespace (#37456) · 1c969d20
  Committed by Chen Weihang on Nov 23, 2021
```
* standarded unittest namespace

* fix detail error
```
- [Dy2stat]support pure fp16 for dy2stat (#36944) · 52edad6a
  Committed by 0x45f on Nov 24, 2021
```
* run dy2stat pure fp16 in Linear model

* no use self._pure_fp16_inputs

* add test and fix Adam error in dy2stat pure fp16 training

* use paddle.optimizer.Adam

* run test in gpu

* change test time for CI

* enlarge atol for test_resnet_pure_fp16

* refine code and enlarge atol

* make custom_white_list and custom_black_list take effect for AMP and pure fp16

* check tracer is not None

* use default atol

* change filter_size

* change atol and add some NOTE
```
- fix lite with xpu or nnadapter (#37449) · 93aefceb
  Committed by zhupengyang on Nov 24, 2021
- fix:transform the data from cpu to gpu when trt is used (#37427) · 49366a63
  Committed by feng_shuai on Nov 24, 2021

Crayon鑫 / Paddle: in sync with the fork source project