- 04 Aug, 2021 2 commits
  - Committed by Jacek Czaja
  - Committed by Lijunhui
- 03 Aug, 2021 2 commits
  - Committed by WangXi
  - Committed by QingshuChen
    * support Kunlun2
    * support KL2
    * support KL2
- 30 Jul, 2021 2 commits
- 29 Jul, 2021 1 commit
  - Committed by Zeng Jinle
    * add fix op run order pass
    * add ut for fix_op_run_order
    * fix ci error
    * improve coverage
    * improve coverage again and fix cpu test case
    * follow some comments
- 20 Jul, 2021 2 commits
  - Committed by 李季
    * fix cast
  - Committed by chentianyu03
- 19 Jul, 2021 2 commits
  - Committed by Qi Li
  - Committed by chentianyu03
    * add cuda event and stream api
    * add cuda event and stream api
    * add get_current_stream api
    * add get_current_stream api
    * init streams
    * modify get_current_stream
    * modify get_cuttent_stream
    * add synchronize func
    * add current_stream doc and test file
    * move get_current_stream into CUDA macro
    * move CudaEvent into CUDA macro
    * move _get_current_stream and _device_synchronize into cuda macro
    * modify the macro of cuda stream and event
    * add test case for synchronize
    * add paddle.devices.cuda module
    * event and stream support hip
    * add doc for stream and event class
    * move cuda stream and event into single pybind
    * add cuda_streams_py.cc to cmakelist
    * add _device_synchronize and _get_current_stream to core module
    * add test case for cudastream and cudaevent
    * move __all__ in streams.py
    * fix test fail
    * add cuda to devices __all__
    * fix current_stream doc writing error
    * move devices to device direction, and merge device.py into __init__.py
    * add required:gpu to sample codes
    * remove cuda direction from device/__init__.py
- 15 Jul, 2021 1 commit
  - Committed by Aurelius84
    * Refine Constructor logic of ParallelExecutor
    * Replace executor into ParallelExecutor in run_program_op
- 13 Jul, 2021 1 commit
  - Committed by LiuWei
- 12 Jul, 2021 1 commit
  - Committed by Zhang Zheng
- 07 Jul, 2021 1 commit
  - Committed by feng_shuai
- 29 Jun, 2021 1 commit
  - Committed by taixiurong
- 24 Jun, 2021 2 commits
  - Committed by Jacek Czaja
    * - fix to #33282
    * - Increased threshold for elementwise_mul_bf16 grad
    * - disabled faulty UT
    * - fix to approval
  - Committed by Zhou Wei
    * Modify the search order of dynamic library
    * Modify the search order of dynamic library
- 23 Jun, 2021 1 commit
  - Committed by jakpiase
    * base changes for split op
    * 90% of split functionality added
    * full fp32 functionality
    * added bf16 test
    * added submemory caching
    * added bf test to static mode whitelist
    * minor change
    * enabled split op for inference
    * minor fix
    * minor fix
- 21 Jun, 2021 1 commit
  - Committed by Leo Chen
    * enable npu alignment
    * support flatten_params/grads
    * support clip by global norm
    * remove memset in coalesce_tensor_op
    * fix npu kernel of sum op when input is one tensor
    * add ut for flatten_param_grads+regularizer
    * fix ut
    * fix typo
- 16 Jun, 2021 1 commit
  - Committed by Jacek Czaja
    * - Draft of implementation of refactoring - compilation fix
    * - Fixes after review
    * - Removed unnecessary comment
- 11 Jun, 2021 1 commit
  - Committed by ronnywang
- 10 Jun, 2021 2 commits
- 09 Jun, 2021 2 commits
  - Committed by Jacek Czaja
    * - First fix to #33021
  - Committed by s.feng
- 02 Jun, 2021 4 commits
  - Committed by Qi Li
  - Committed by chentianyu03
  - Committed by Qi Li
  - Committed by wuhuanzhou
- 01 Jun, 2021 2 commits
  - Committed by chentianyu03
  - Committed by chentianyu03
    * replace and remove complex64/128 types in custom OP and other files
    * fix custom_tensor_test fail bug
    * fix custom_conj_test fail bug
    * fix dispatch_test_op build fail bug
- 27 May, 2021 2 commits
  - Committed by Jacek Czaja
  - Committed by Zhou Wei
    * Unify all external API error message mechanism and enhance third-party API error msg
    * fix some comment
    * fix some comment
- 26 May, 2021 2 commits
- 25 May, 2021 1 commit
  - Committed by chentianyu03
    * modify complex template for elementwise ops
    * modify mul, div grad struct
    * add complex template for CudaShuffleDownSync CudaShuffleXorSync funcs and fix the bug when delete cuda<9000
    * fix shuffle func args bug
    * fix shuffle func args bug
    * fix shuffle func args bug
- 20 May, 2021 2 commits
  - Committed by chentianyu03
    * add complex template file
    * add numtraits for complex template
    * add complex template type register
    * modify specify template of complex
    * modify specify template of complex
    * modify specify template of complex
    * modify specify template of complex
    * make TensorCheckerVisitor support complex type
    * fix operator= error
    * add complex template
    * add complex template type
    * add complex template type to pyarray transform
    * add complex template type to pyarray transform
    * remove complex type for dlpack register
    * set dlpack support complex type
    * set dlpack support complex type
    * set dlpack support complex type
    * remove explicit for complex constructor
    * add complex unit test file
  - Committed by limingshu
- 19 May, 2021 1 commit
  - Committed by Jacek Czaja