- 23 Aug, 2021 10 commits
-
-
Committed by Jacek Czaja
* - disabled interpolate onednn * - compilation fix * - draft of batch_norm cache disabling * - fixes to UT
-
Committed by Peihan
* enable infer_ut on windows * remove lib calculation & time * unset http_proxy when download bos file on windows
-
Committed by Li Min
Refactor the organization of layer_norm cuda impl so that it can be reused in fused attention op. Extract the layer_norm cuda impl from layer_norm_op.cu to layer_norm_kernel.cu.h. Define fused/attention_layer_norm.h, which can be used in fused attention op in the next PR.
-
Committed by zyfncg
* Support getitem by Bool index * delete some debug info of bool index * support the case that the shape of bool index is different from indexed tensor
-
Committed by wanghuancoder
This reverts commit 6bacfb0e.
-
Committed by pangyoki
-
Committed by pangyoki
-
Committed by TeslaZhao
-
Committed by seemingwang
-
Committed by zhaoyingli
* adamw support cuda * adamw support cuda
-
- 22 Aug, 2021 1 commit
-
-
Committed by Zhang Zheng
-
- 20 Aug, 2021 10 commits
-
-
Committed by Hao Lin
-
Committed by Yuang Liu
-
Committed by lzzyzlbb
* add rmsprop npu * add argsort npu * add argsort npu * modify according to review * modify sharedatawith according to review * modify reshape according to review * rm dygraph=false
-
Committed by Sing_chan
* [NPU] Support npu kernel for pad3d op * fix for comment of zhouwei25 * fix some bugs according to qili93's comments * add support and test for paddings in input * delete VLOG used for debug
-
Committed by wanghuancoder
* use spin lock in auto growth allocator, test=develop * use pthread spin lock, test=develop * use lock guard, test=develop * use malloc spin lock, test=develop * use lock_guard, test=develop
-
Committed by wangguanqun
* add trainer desc config to distributed strategy * code style modified * data_feed set lod
-
Committed by zhaoyingli
* add depthwise_conv2d npu * add some tests * Delete test_unique_op_npu.py * delete trans input
-
Committed by zhaoyingli
* [NPU] Support npu op where and where grad * fix use const_cast * delete a test
-
Committed by Peihan
-
Committed by JYChen
* add (N,C,*) input support for GroupNorm * --amend
-
- 19 Aug, 2021 6 commits
-
-
Committed by JingZhuangzhuang
* add npu sin op * [NPU] Support npu kernel for sin op * modify support npu kernel for sin op * modify support npu kernel for sin op * modify npu sin op * modify npu sin op * add sin op npu
-
Committed by Peihan
* add slim resnet50 quant model in pr-ci-inference * enable resnet50_quant multi_thread4_trt_int8_bz1 * remove LOG(FATAL)
-
Committed by Yiqun Liu
Add dimension check for inverse to avoid dividing by 0 error when input's shape is [0, 0, 0]. (#34996)
-
Committed by ceci3
* fix batch_norm and instance norm when input is []
-
Committed by tianshuo78520a
* notest;test=gpu-inference * notest;test=gpu-inference * notest;test=gpu-inference * notest;test=gpu-inference * fix error * notest;test=gpu-inference * notest;test=gpu-inference * notest;test=gpu-inference * test=gpu-inference
-
Committed by Aurelius84
* add device_context * add gtest for device_event_gpu * Remove duplicate DeviceType * push for test * add unittest * fix macros * fix MSVC using usage
-
- 18 Aug, 2021 13 commits
-
-
Committed by lzzyzlbb
* [npu]add rmsprop op
-
Committed by xiongkun
* Add NPU kernel for norm Op: float16 and float32 * fix code for code review * fix for code review * add type for paddle_throw * remove unnecessary header file. Add more test cases * remove a broadcast
-
Committed by littletomatodonkey
* fix pad outliers err * fix pad api input type and doc * fix example of pad * add unittest for pad3d * fix unittest * fix error format * fix pad doc
-
Committed by wanghuancoder
* code refactoring, test=develop * refine, test=develop * refine, test=develop * refine, test=develop
-
Committed by Peihan
-
Committed by Jackwaterveg
* test=develop * test=develop
-
Committed by Jackwaterveg
* test=develop * test=develop
-
Committed by WangXi
[Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer (#34965)
-
Committed by Chen Weihang
* fix ext_tensor.cast failed bug * remove useless deps * fix windows cmake failed * try to fix windows make failed * fix make error on windows
-
Committed by Zhanlue Yang
* Add function to disable paddle signal handler. Paddle uses google::InstallFaultSignalHandler to handle selected system signals, mainly for debugging and bug-report purposes. However, this can conflict with other Python packages that capture similar signals, such as tvm. To resolve this issue, we add a function to disable the signal handler * Remove signal test from WIN32 platform * Remove redundant return from disable_signal_handler() function * Add detailed messages to en_doc
-
Committed by wawltor
-
Committed by Leo Chen
* add retry for HcclGetRootInfo * refine code * reduce retry interval
-
Committed by Guoxia Wang
* support class center sample of PartialFC
-