- 08 Dec, 2021: 12 commits
-
-
Committed by WangXi
-
Committed by Yuang Liu
-
Committed by crystal
* add boardcast_sub
* add boardcast_sub
-
Committed by From00
* Fix CUDAGraph bug for StreamSafeCUDAAllocator
* Add CUDAGrapthAllocator check in multi-stream interface
* Set FLAGS_use_stream_safe_cuda_allocator to default to false
* Fix environment error for cmake
* Fix cmake error
* Add UT of GetAllocatorInterfaceTest
* Add UT of CUDAGraphExceptionTest
* Enhance CUDAGraphExceptionTest
-
Committed by chentianyu03
-
Committed by feng_shuai
fix: when ceil_model==true && Padding_algo!=SAME and (x-size)/stride is not an integer, the conversion is wrong (#37929)
-
Committed by wanghuancoder
* refine a test case, test=develop
* publish python c api for eager, test=develop
* revert modify about test_allclose_layer.py, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* delete numpy includes, use pybind11 numpy.h, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* support eager error msg, and add grad test case, test=develop
* refine, test=develop
* refine, test=develop
* generate eager core ops, only 4 ops, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
* refine, test=develop
-
Committed by Zhanlue Yang
* Rearranged Eager AutoCodeGen directory structure
* Removed USE_OP in Eager AutoCodeGen
* Enabled generation for Operators without Grad/Inputs/Outputs
* Resolved operators without input
* Fixed merge conflicts
* Enabled Eager AutoCodeGen for 10+ more operators
-
Committed by Yanxing Shi
-
Committed by Sing_chan
-
Committed by sneaxiy
* fix CUDA Graph H2D bug again
* fix no return bug
-
Committed by Yuang Liu
-
- 07 Dec, 2021: 28 commits
-
-
Committed by LiYuRio
-
Committed by Yan Chunwei
* add infrt code refined with Paddle's code style
* rename CinnRtConfig to InfRtConfig
* rename CinnRt to InfRt in some code
* rename CINNRT to INFRT
* remove unnecessary code
* replace CINN with INFRT in the source code
* replace all "cinn" in code with "infrt"
* remove some const_cast
-
Committed by xiaoting
* add maxunpool2d in __all__
* fix MaxUnPool2D example
-
Committed by Sing_chan
-
Committed by Zhanlue Yang
* Debug
* Fixed issue with reset_grad_inplace_version when used with clear_gradient & cross-batch accumulation
* Rearranged interfaces
* Fixed ci issues
-
Committed by Shang Zhizhou
* update logsumexp doc
* update api doc
* update api doc
-
Committed by zyfncg
-
Committed by tianshuo78520a
* fix static git diff check
* test=document_fix
-
Committed by Wilber
-
Committed by danleifeng
-
Committed by Sing_chan
* make some non_parallel unittests execute in parallel
* delete duplicate ut
-
Committed by JingZhuangzhuang
* multithread_memory_optimize
-
Committed by Huihuang Zheng
Paddle doesn't have to set runtime_include_dir when running CINN.
-
Committed by 0x45f
* polish for zip in dy2stat
* polish comment
* polish is_builtin_len
* fix comment
-
Committed by TTerror
* format xpu op list
* format xpu op list
* update xpu1 op list
-
Committed by wanghuancoder
* refine a test case, test=develop
* rm python, test=develop
* refine, test=develop
* fix cmake generate error, and fix circular import, test=develop
-
Committed by Yulong Ao
* [Auto Parallel] Add the unified cluster representation
* [Auto Parallel] Add the graph class for physical mapping
* [Auto Parallel] Add the simple physical mapper
* Set the timeout of the mapper
* Merge the upstream develop unittests cmake files
* Fix a bug of the process group
* Remove mapper unittest from platforms which are not GPU
* Move the instantiation of process group after resharding
* Add the local id for devices
* Update the rank mapping format
* [Auto Parallel] Relaunch with the rank mapping file
* Remove the unnecessary json file
* Avoid entering get_device_proc_info for auto mapping
* Correct the mapper unit test
* Add some comments
* Remove the related files about mapping
* Update the unittest for auto mapping
* Remove unused rank_mapping unittest
* Improve the unittest coverage
* Improve the unittest coverage
* Improve the unittest of relaunch
* Fix the unittest problem in CI
* Improve the unittest of relaunch
* Remove unnecessary statements
* Update the unittest cmakefile
* Correct the cmakefile of auto parallel unittests
* Modify code based on the new elastic change
* Use the GPUs exclusively in the unittest
* Correct the cmakefile
* Set the timeout of the unittest
-
Committed by YuanRisheng
* add inplace op adaptation
* optimize inplace logic and fix bugs when running a kernel that has args of vector<DenseTensor>
* move func in kernel_context.h into kernel_context.cc
* refactor logic that transforms variable to densetensor
* fix bugs when compiling
* update func name
* fix bugs when running windows-ci
-
Committed by zmxdream
* fix heter service. test=develop
* fix heter section worker in debug mode
-
Committed by wenbin
don't exit if requested_size < size
-
Committed by zyfncg
-
Committed by tianshuo78520a
-
Committed by Zuza
* quantize slice op
* correct test
* fix code formatting
-
Committed by jianghaicheng
-
Committed by Zhanlue Yang
-
Committed by Aurelius84
-
Committed by Leo Chen
-
Committed by Yuang Liu
-