- 27 Mar, 2022 5 commits
-
-
Committed by hong
* move slice to pten * merge develop; test=develop * fix slice bug; * update * update * fix error * update * fix bug * polish code * polish code * polish code * try to fix windows bug * add gpu compile flag; * try to fix * remove template; * polish code; * fix npu bug; * fix npu bug * fix npu bug; test=develop * fix slice bug; * remove unneeded dep
-
Committed by From00
* Make StreamSafeCUDAAllocator compatible with NaiveBestFit strategy * Set FLAGS_use_stream_safe_cuda_allocator to false * Update * Remove unnecessary code * Fix CI errors * Add UT
-
Committed by pangyoki
* fix inplace bug in final_state eager_gen * fix python_c_gen
-
Committed by zhangbo9674
* fix amp with optional api bug * refine optional code for amp
-
Committed by Jack Zhou
* add string tensor and case convert kernels * Add strings empty kernel; Reorganize the structure of case convert kernel * Add string infermeta * Update mutable_data of string tensor * rename kernel name * add string copy tmp * Fix strings copy device bug * add utf8 gpu converter * add string tensor c++ api * Remove mutable_data of string tensor * update string tensor interface * remove charcases_flag.h * remove some fluid headers * Add make_ddim * __HIPCC__ -> PADDLE_WITH_HIP * remove fluid headers * fix cpu compile * remove std::hash * Fix cudaMalloc * Remove strings/impl directory * Fix infrt/get_phi_kernel_info.py;Add custom_kernels deps * Add empty kernel test * Remove some comments * Modify lower/upper api encoding type: string->bool * STRING->PSTRING; Add CreateInferLikeMeta * Add code gen for C++ String API * remove strings_api_utils.h * Add ignore file (strings_api.h, strings_api.cc) * update strings gen script * change args order of case convert kernels * Add comments for pstring, StringTensor * cpstring_internal.h -> cpstring_impl.h * Update according to comments: 1. Remove fluid headers 2. paddle::platform::errors -> phi::errors 3. Use 'place.GetType() == phi::AllocationType::GPU' instead of 'paddle::platform::is_cpu_space()' 4. Use camel code style * Remove all singletons in strings kernels * fix rocm compile * Fix py3 compile * Fix c++ coverage * 1. Add pstring proto type 2. Add StringTensor debug info 3. Rename case_convert_kernel to strings_lower_upper 4. Remove serialize deserialize strings kernel * DataLayout::PSTRING -> DataLayout::PSTRING_UNION * Register pstring data type * Fix strings api gen * Fix dense tensor register pstring dtype * Fix error messages * remove line * add pstring unittest * remove test string api unittest * remove empty line * Remove some headers to decrease the size of executable file
-
- 26 Mar, 2022 2 commits
-
-
Committed by zhangbo9674
* add amp for final status * solve compile error
-
Committed by Chen Weihang
* move mean infershape into phi * try to run ci * share layout for mkldnn * revert grad infershape * revert grad infershape
-
- 25 Mar, 2022 20 commits
-
-
Committed by hong
* update * remove useless code * remove label smooth test * polish code * polish code * polish code * remove _in_eager_mode error;
-
Committed by duanboqiang
* fix lars optimizer bug * Update optimizer.py
-
Committed by zn
-
Committed by Zhanlue Yang
-
Committed by YuanRisheng
-
Committed by Aurelius84
* [Phi] Migrate strided_slice into Phi * [Phi] Migrate strided_slice into Phi * fix compilation problem
-
Committed by Aurelius84
* [Phi] Migrate Adam and Adamw into Phi * fix compile error and unittest ok * fix compile error and unittest ok * fix undefined reference to fLI::FLAGS * test depend on operator * fix cmake * fix xpu compile * fix infrt * fix amp_type_traits * fix amp_type_traits * modify according to reviewer * modify according to reviewer * fix dtype float16 * fix typo * fix Cmake * fix code style
-
Committed by liutiexing
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * Update ThreadDataRegistry Co-authored-by: liutiexing <liutiexing@google.com>
-
Committed by z8hanghuan
* support multi_dims for tril_triu, *test=kunlun * support multi_dims for tril_triu, *test=kunlun * support multi_dims for tril_triu, *test=kunlun * update xpu.cmake date, support multi_dims for tril_triu, *test=kunlun
-
Committed by FlyingQianMM
* add maximum limit for grid of reduce, elementwise and gather * add {} after if
-
Committed by Chen Weihang
-
Committed by Chen Weihang
* move part sum op kernel * remove deprecated names
-
Committed by zhouweiwei2014
-
Committed by Jiabin Yang
* refactor eager flags * fix flags error when we switch from eager to dygraph * fix ci problem * fix ci * fix ci * merge develop and fix code style * merge develop and fix code style * fix op test error * fix op test error * fix op test error * fix op test error * fix op test error * merge develop
-
Committed by FlyingQianMM
-
Committed by 0x45f
* Fix loop index for FillZeroForEmptyGradInputs * Call fill zero in run_program_grad
-
Committed by seemingwang
-
Committed by Aganlengzi
* [NPU] add merged_momentum * fix * fix device
-
Committed by Zhangjingyu06
-
Committed by zyfncg
* Scalar support marking data_type in yaml * fix code-gen bug
-
- 24 Mar, 2022 13 commits
-
-
Committed by Chen Weihang
* add mean phi kernel * remove original mean kernel * add alias name
-
Committed by Chen Weihang
* move batch size like infershape * revert other op change * call infermeta in infershape * adjust batchsize like pos
-
Committed by zhiboniu
-
Committed by Leo Chen
-
Committed by jiangcheng
* fix build_cinn_pass internal var may be control var problem * add annotation and vlog by review advice
-
Committed by zyfncg
* support intermediate for sparse api * close intermediate in yaml * fix dygraph_api dep for eager
-
Committed by zhangbo9674
* approve amp for intermediate_dygraph * add amp_utils for intermediate_dygraph * add amp needcast check for mlu & npu * test unittest * add SetGradNode for set_stop_gradient && add checktensor for GradientHooks * refine code * refine unittest of imperative_amp for new dygraph * inplace api skip amp * add test_imperative_qat_amp for intermediate amp * refine code * refine test_amp ci strategy * refine unittest code * refine amp_utils code * refine amp getpromotetype for some special op * refine unittest code
-
Committed by joanna.wozna.intel
* Correct MultipleQuantizeSquash * Correct logging
-
Committed by Roc
* # This is a combination of 10 commits. # The first commit's message is: add expert count op add ut for expert_count # This is the 2nd commit message: update UT only for cuda # This is the 3rd commit message: fix for rocm # This is the 4th commit message: update ut # This is the 5th commit message: add moe module # This is the 6th commit message: add expert count op add ut for expert_count # This is the 7th commit message: update UT only for cuda # This is the 8th commit message: update ut # This is the 9th commit message: add moe module # This is the 10th commit message: make expert count private * add assign pos op * fix upper num name * add api _assign pos * add ut for assign pos op * update date * fix for win * update for test (timeout) * fix ut * update * fix ut for number count Co-authored-by: hlygit66666 <2570058140@qq.com>
-
Committed by liutiexing
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Add EventsWaiter * update * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * update * update Error MSG * update EventsWaiter * update Co-authored-by: liutiexing <liutiexing@google.com>
-
Committed by zhangkaihuo
-
Committed by caozhou
* migrate infershape * fix tril_triu infershape error * fix qr_op infershape * add parse qr mode func * move order
-
Committed by Zhanlue Yang
* [Refactor] refactored eager_gen.py PR #1 * [Refactor] refactored eager_gen.py PR #1 * Refactored version 2 * Added automatic code generation utils * Fixed merge issues
-