- 15 Apr, 2022 6 commits
-
-
Committed by zhangxiaoci
-
Committed by limingshu
* change cudnn helper for auto-tune * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm. * Fix the bug in calculating and printing current step cache hit rate. * Improve the autotune cache and fix unittest. * Change the key from AlgorithmType to int64_t. * Fix unittest for cpu-only env. * change ChooseAlgoByWorkspace for heuristic mode Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
Committed by fwenguang
-
Committed by fwenguang
* [MLU] add mlu new profiler * fix format
-
Committed by caozhou
* update cluster
-
Committed by hong
* try to fix batch norm memory issue * fix batch norm memory alloc bug * polish some code
-
- 14 Apr, 2022 29 commits
-
-
Committed by caozhou
-
Committed by chenjian
-
Committed by houj04
-
Committed by Lijunhui
* register elementwise_xxx
-
Committed by Chen Weihang
-
Committed by YuanRisheng
* support construct scalar using non-cpu tensor * fix bugs when run unittest * fix compile bugs * fix bugs when run ci * fix compile bugs * fix bugs when move copy * perfect unit test * perfect unittest * update according to comment * add target dependency * deal with conflict * fix bugs when run unit test * fix unit test bugs
-
Committed by Yiqun Liu
-
Committed by zhangkaihuo
-
Committed by liutiexing
* executor perf statistics * fix ut * fix ut * fix ut * add ut * add ut
-
Committed by Jacek Czaja
* Add UT - Added missed data_layout - Added missing conversions - NDHWC added - NDHWC support in data_transform - another fix - condddate change - fix u- fix - fix - fix - fix - fix - fix to hack - compilation fix - fix to automatic merge * - reduced UT * - fix * - lint * - fix to lint
-
Committed by Sławomir Siwek
* Change tensor name to match activation * declare fc_eltwise_add pass * merge conv_eltwise refactor PR * first compilable draft * unittest feedback tools * Fuse pass tester * Move IsReachable() to shared file * 100% coverage of fuse_pass_tester.cc * register pass * Add bias node * Improve unit tests / remove bias node from pattern * improve fc_eltwiseadd_unittest * cancel eltwise_add fuse if act is already fused * Add elementwise_input scale * Residual MVP * Add new FC attrs * Add more test cases * Add missing op attrs * Adapt code to new Elementwise pattern * reuse existing fcpattern * improve code style * remove unused arguments * fix typo * remove whitespace * remove int8 related code * Remove attributes from base ops * style * style check * Remove input from base op * Set attribute during fuse * ut timeout * download and test model * DRY * apply feedback from review * Style check * fix typo * cosmetic changes * explicitly set residual as output * VIT-OCR accuracy check * trigger CI * remove whitespaces * fix missing data file
-
Committed by Sing_chan
-
Committed by zmxdream
* modify xpu_kp.cmake with HETERPS&PSLIB * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop
-
Committed by Vigi Zhang
-
Committed by z8hanghuan
* support multi layer and bidirection of lstm_grad, *test=kunlun * support multi layer and bidirection of lstm_grad, *test=kunlun
-
Committed by Sing_chan
-
Committed by zyfncg
* support some c++ api in paddle namespace * change c++ api namespace in custom op
-
Committed by Zhanlue Yang
* [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad * Fixed elementwise issue * Addressed CI failures * [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode * [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode * Enabled more test cases * Fixed performance issues * Fixed minor issue
-
Committed by Sing_chan
* fix bfgs_doc; test=document_fix * add parameter name; test=document_fix * modify according to chenlong's comments;test=document_fix
-
Committed by Vigi Zhang
-
Committed by Aurelius84
-
Committed by Chen Weihang
-
Committed by Sing_chan
-
Committed by xiayanming
-
Committed by zhangbo9674
-
Committed by Wilber
* temporarily run once * update * update * update * update * fix ci problem
-
Committed by Chen Weihang
* change dispatch to visit * resolve conflict
-
Committed by baoachun
* add mkldnn int8 pass [step3] * Add test for compute_propagate_scales_mkldnn_pass * update pass * update api comment and python api Co-authored-by: wozna <joanna.wozna@intel.com>
-
Committed by jakpiase
* added shuffle_channel bf16/fp32 fwd kernel * added missing files * CI fix * changed from pten to phi * tmp save * added reviewers suggestions * fix for test
-
- 13 Apr, 2022 5 commits
-
-
Committed by levi131
* native commit for triple grad of sigmoid * Updated unittests files * init functional jacobian api * Updated trible_test func * Updated gradient_checker & test_script * finish test with dtype float32 * add float64 test case * polish code * use atol=1e-5 with dtype float64 * fix for ci * set timeout for test_jacobian * fix dygraph grad to support high differential * polish API docstring * Updated gradient checker and some related files * fix double grad strip error for high differential * fix double grad strip error for high differential * Add Sigmoid triple grad tests * fix dygraph double grad dtype error when calling for high differential scenario * Updated triple grad tests func * Use np.random to initialize ddx * Updated triple_grad_check func * add todo for gradient checker and refine some comments * remove additional code * add test for warning in backward.py * format python code * support multi input in triple gradient checker * Add matmul triple grad kernel * Updated comments of TODO * Supported some special tests * Change code-format to follow CI std * Updated gradient_checker.py * Fix conflicts * Removed unnecessary printing log * Change code style to follow CI std * merge upstream * add_p * rm useless files * add sub_p mul_p div_p * add sqrt_p and tanh_p * add reshape_p * add broadcast_p * add broadcast_p fill_constant_p matmul_p reduce_p reshape_p transpose_p * add split_p and concat_p * add gather_p and scatter_add_p * add slice_select_p and slice_assign_p * add multi input check for add_p, sub_p, mul_p, div_p * update concat_p * refine gather_p and scatter_add_p * refine slice_assign_p and slice_select_p * add 9 test for prim ops * add more test and fix some bug * add more test * register proto * add shape valid check for broadcast_p op, and add keepdim attr into reduce_p op proto * support multi input and multi output for split_p and concat_p * fix slice bug for slice_select_p and slice_assign_p * dtype for axis attr should be long int * update dtype for axis attr int64_t * update for iscan CI * add more shape and dtype check * change IndexTensor into int32 dtype
-
Committed by wangguanqun
* the one ps proto * the one ps proto * fix * fix * fix * fix windows ci * fix windows ci * add dependency * add dependency
-
Committed by zyfncg
* adjust the slice end in getitem * fix bug * fix bug * fix bug * recover start change
-
Committed by hong
-
Committed by zmxdream
[XPUPS]add support for kunlun2 Co-authored-by: WorgenZhang <frank08081993@gmail.com>
-