- 24 Mar, 2022 33 commits
-
-
Committed by zhangbo9674
* approve amp for intermediate_dygraph
* add amp_utils for intermediate_dygraph
* add amp needcast check for mlu & npu
* test unittest
* add SetGradNode for set_stop_gradient && add checktensor for GradientHooks
* refine code
* refine unittest of imperative_amp for new dygraph
* inplace api skip amp
* add test_imperative_qat_amp for intermediate amp
* refine code
* refine test_amp ci strategy
* refine unittest code
* refine amp_utils code
* refine amp getpromotetype for some special op
* refine unittest code
-
Committed by Aurelius84
-
Committed by joanna.wozna.intel
* Correct MultipleQuantizeSquash
* Correct logging
-
Committed by Ren Wei (任卫)
-
Committed by Roc
* # This is a combination of 10 commits.
  # The first commit's message is: add expert count op add ut for expert_count
  # This is the 2nd commit message: update UT only for cuda
  # This is the 3rd commit message: fix for rocm
  # This is the 4th commit message: update ut
  # This is the 5th commit message: add moe module
  # This is the 6th commit message: add expert count op add ut for expert_count
  # This is the 7th commit message: update UT only for cuda
  # This is the 8th commit message: update ut
  # This is the 9th commit message: add moe module
  # This is the 10th commit message: make expert count private
* add assign pos op
* fix upper num name
* add api _assign pos
* add ut for assign pos op
* update date
* fix for win
* update for test (timeout)
* fix ut
* update
* fix ut for number count

Co-authored-by: hlygit66666 <2570058140@qq.com>
-
Committed by lilong12
-
Committed by Guoxia Wang
-
Committed by Sing_chan
* allow vcvars64 and cuda_version to be set in xly pipe
* make third_party_path reused by ci and build pipe;test=windows_ci_inference;test=windows_op;test=windows_ci
-
Committed by tianshuo78520a
-
Committed by liutiexing
* add align for WorkQueue
* add spinlock
* merge develop
* merge
* Add EventsWaiter
* Add EventsWaiter
* update
* Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
* update
* update Error MSG
* update EventsWaiter
* update

Co-authored-by: liutiexing <liutiexing@google.com>
-
Committed by kuizhiqing
* test=document_fix , fix launch doc
* test=document_fix , fix typo
-
Committed by Jack Zhou
* Fix rnn, wmt16 docs;test=document_fix
* Fix wmt14 docs;test=document_fix
* Add more description;test=document_fix
-
Committed by xiayanming
* [Auto Parallel] gradient merge pass support dist attribute
-
Committed by zhangkaihuo
-
Committed by zhaocaibei123
-
Committed by caozhou
* migrate infershape
* fix tril_triu infershape error
* fix qr_op infershape
* add parse qr mode func
* move order
-
Committed by huzhiqiang
-
Committed by Zhanlue Yang
* [Refactor] refactored eager_gen.py PR #1
* [Refactor] refactored eager_gen.py PR #1
* Refactored version 2
* Added automatic code generation utils
* Fixed merge issues
-
Committed by huzhiqiang
-
Committed by Sing_chan
-
Committed by ronnywang
-
Committed by Jiabin Yang
-
Committed by kuizhiqing
-
Committed by Zhanlue Yang
-
Committed by 王明冬
-
Committed by seemingwang
* extract sub-graph
* graph-engine merging
* fix
* fix
* fix heter-ps config
* test performance
* test performance
* test performance
* test
* test
* update bfs
* change cmake
-
Committed by xiongkun
Polish optest: refine the optest parameter logic. support name, dtype, out, output in arbitrary position (#40824)
* 1. add the python api grad 2. add final and intermediate state vlog 3. change the python_api error logic
* add python api or close the check_eager=True
* fix the compatibility
* matmul
* disable unittests: test_elementwise_add_op test_scatter_nd_op test_gather_nd_op test_scatter_op test_index_sample_op test_elementwise_add_mkldnn_op
* refine the logic of prepara_parameter logic
* fix Tensor(gpu) 2 Scalar segmentation fault.
-
Committed by 0x45f
* Refine eager run_program OP for dy2st UT
* append run_program error string and refine run_program_grad
* remove some comments
* refine ConstructXGradTensors
-
Committed by Aurelius84
* [phi] Split selected_rows CMake compilation
* move file back
* move file back
-
Committed by caozhou
* refactor cost model
-
Committed by Chen Weihang
* add mul phi kernel
* remove mul op kernel
* remove original mul grad op
* fix cinn test
* fix dygraph test failed
-
Committed by Wilber
* infrt add trt engine
* fix register
* file generate
* fix ci error
* fix conflict
* add copyright
* update
* update
* update
* update engine name
* refactor trt code
* update
* update
* update
* update
* fix conflict
* update
* refactor code
* first commit
* update pdtensor to denseTensor
* code
* style
* code
* code style
* add the tensor map, test=develop
* update
* update
* update
* trt engine
* update trt mlir and runtime
* update mlir test
* update
* update
* update

Co-authored-by: DannyIsFunny <912790387@qq.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
-
Committed by niuliling123
-
- 23 Mar, 2022 7 commits
-
-
Committed by Leo Chen
-
Committed by 王明冬
-
Committed by jakpiase
* added missing BF16 activations
* added softplus bf16
* minor change
* disabled tests for GPU
-
Committed by furnace
-
Committed by furnace
* [NPU] add npu support for conv3d and conv3d_grad
* [NPU] delete failed unittests that Ascend does not support
* [NPU] delete debug codes
* [NPU] optimize codes, notest
* [NPU] remove const_cast
* [NPU] optimize for remove const_cast
* [NPU] fix written errors
-
Committed by Zhanlue Yang
-
Committed by zhaocaibei123
* fix benchmark and communicator config
* fix bugs of the_one_ps
* multi program and fix bug in optimizer
* multi program in the_one_ps
* public commcontext
* ps optimizer multi programs
* cvm & datanorm backend
* fix dim
* fix unittest
* fix
* the one ps merge
* remove comm
* add DownpourLiteWorker
* all
* fix
* fix
* device worker downpour lite
* fix
* fix bug in global shuffle
* save inference model
* fix & add log
* fix
* remove log
* fix
* fix save summary
* fix
* fix pscore
* fix
* fix
* fix
* fix
* fix
* remove logs
* fix
* fix
* fix
* fix
* fix
* add some comments
* fix

Co-authored-by: esythan <esythan@126.com>
-