- 10 Aug, 2022 1 commit
-
-
Committed by ceci3
* fix quant scale name (#44116) * fix acc diff problem caused by pr #44116 (#44311) Co-authored-by: handiz <35895648+ZhangHandi@users.noreply.github.com>
-
- 04 Aug, 2022 1 commit
-
-
Committed by Guanghua Yu
* fix QuantizeLinear kernel and pass in QAT (#44784) * Add Reduce Max in Quant (#44825) Co-authored-by: Chang Xu <molixu7@gmail.com>
-
- 27 Jun, 2022 1 commit
-
-
Committed by Guanghua Yu
* update quantization clip and round * fix quantization clip and round Attribute * fix typo
-
- 23 Jun, 2022 1 commit
-
-
Committed by lidanqing
-
- 22 Jun, 2022 1 commit
-
-
Committed by shiyutang
* merge_release_and_dev * merge_release_dev * update * Use tempfile to place the temporary files (#43237) * tempfile_fix * update * fix_CI * update_word2vec.inference.model * remove_change_in_word2vec_book * fix_word2vec_book * rm_affine * update
-
- 16 Jun, 2022 1 commit
-
-
Committed by Guanghua Yu
* Add progress bar and speed up Quantization Pass * fix typo
-
- 09 Jun, 2022 2 commits
-
-
Committed by Guanghua Yu
* support fuse conv and bn in QAT (#42255) * support skip_op_list in PostTrainingQuantization (#42378) * fix unittest
-
Committed by Guanghua Yu
-
- 04 May, 2022 2 commits
-
-
Committed by Guanghua Yu
* fix PTQ unittest timeout * fix ut
-
Committed by cc
Co-authored-by: joanna.wozna.intel <joanna.wozna@intel.com>
-
- 29 Apr, 2022 1 commit
-
-
Committed by WangXi
[cherry-pick 2.3] Add fused_multi_transformer op to optimize transformer generation performance (#42311) * Add fused_multi_transformer op to optimize transformer generation performance (#41814) * fix fused_multi_transformer compile failed in cuda arch < sm53 (#42315) * fix ci timeout
-
- 22 Apr, 2022 1 commit
-
-
Committed by Allen Guo
add mixed-precision support for ipu cherry-pick from #41733
-
- 05 Apr, 2022 1 commit
-
-
Committed by Guanghua Yu
-
- 01 Apr, 2022 1 commit
-
-
Committed by danleifeng
-
- 28 Mar, 2022 3 commits
-
-
Committed by danleifeng
* add fused_seqpool_cvm op;test=develop
-
Committed by Ligoml
* update docs dtype(core.VarDesc.VarType) * fix code style, test=document_fix fix code style, test=document_fix Co-authored-by: Chen Long <1300851984@qq.com>
-
Committed by Guanghua Yu
* add adaround post-quant method
-
- 25 Mar, 2022 1 commit
-
-
Committed by Jiabin Yang
* refactor eager flags * fix flags error when we switch from eager to dygraph * fix ci problem * fix ci * fix ci * merge develop and fix code style * merge develop and fix code style * fix op test error * fix op test error * fix op test error * fix op test error * fix op test error * merge develop
-
- 24 Mar, 2022 1 commit
-
-
Committed by zhangbo9674
* approve amp for intermediate_dygraph * add amp_utils for intermediate_dygraph * add amp needcast check for mlu & npu * test unittest * add SetGradNode for set_stop_gradient && add checktensor for GradientHooks * refine code * refine unittest of imperative_amp for new dygraph * inplace api skip amp * add test_imperative_qat_amp for intermediate amp * refine code * refine test_amp ci strategy * refine unittest code * refine amp_utils code * refine amp getpromotetype for some special op * refine unittest code
-
- 16 Mar, 2022 3 commits
-
-
Committed by joanna.wozna.intel
* Modify save_quant_model.py to support different input and output filenames * Correct wrong order of arguments
-
Committed by Ming-Xu Huang
-
Committed by qipengh
-
- 15 Mar, 2022 1 commit
-
-
Committed by Guanghua Yu
* add some op for full_quantization
-
- 11 Mar, 2022 1 commit
-
-
Committed by Guanghua Yu
-
- 04 Mar, 2022 1 commit
-
-
Committed by Jiabin Yang
-
- 03 Mar, 2022 2 commits
-
-
Committed by Baibaifan
-
Committed by Jiabin Yang
* eager, test=develop * fix bug, test=develop * eager, test=develop * merge legacy to fluid * eager, test=develop * eager, test=develop * Refactor TensorAdd func by template and remove gradient_accumulation in eager * Remove needless target name * eager, test=develop * eager, test=develop * Use overload instead of template * Remove legacy code * Remove legacy code * selectedrows, test=develop * Remove DataType test * eager, test=develop * eager, test=develop * support gan, test=develop * Using Tensor directly instead of using EagerTensor * support gradient_accumulation * make test_imperative_lod_tensor_to_selected_rows longer * make test_imperative_lod_tensor_to_selected_rows longer * refine code * ptb, test=develop * Rename all EagerTensor to Tensor * Rename some EagerTensor to Tensor * rename EagerTensor to EagerVariable * eager, test=develop * eager, test=develop * eager, test=develop * eager, test=develop * add more test * eager, test=develop * Support copiable selected rows and merge develop * save load, eager, test=develop * save load, eager, test=develop * refine, test=develop * remove useless _set_value method * refine, test=develop * refine, test=develop * revert static_runner, test=develop * EagerTensor to Tensor, test=develop * refine, test=develop * refine, test=develop * clear grad, test=develop * merge, develop * merge, develop * merge, test=develop * merge, test=develop * Support quant and part of slice * support legacy static save * extend slim tests time * remove imperative on inference * remove imperative on inference * merge develop * fix typo * fix typo * split slice related code into 2 parts for imperative and eager * split slice from inference * split slice from inference * fix test_tensor_register_hook Co-authored-by: Wang Huan <wanghuan29@baidu.com> Co-authored-by: Weilong Wu <veyron_wu@163.com> Co-authored-by: wanghuancoder <wanghuancoder@163.com>
-
- 01 Mar, 2022 2 commits
-
-
Committed by joanna.wozna.intel
* Add mobilenetv3_large performance test * Disable the BF16 test if the device does not support BF16 computations * Change test timeout
-
Committed by wenbin
* remove * pass * more pass
-
- 19 Feb, 2022 1 commit
-
-
Committed by sneaxiy
* add DistributedFusedLamb op * polish code * fix compile error * compatible with pten changes * fix rocm compile error * improve coverage * update upstream/develop * fix cast_with_ptr.h * add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1 * fix clip before allreduce * add use_master_param_norm * code polish * fix bug * fix ROCM ci
-
- 14 Feb, 2022 1 commit
-
-
Committed by Sławomir Siwek
* mish unit tests * code format * remove unused imports * code format * remove hard-coded shape values * remove timeouts * remove timeouts v2 * restore timeouts
-
- 09 Feb, 2022 1 commit
-
-
Committed by Wangzheee
* rebuild matmul pass: trt and gpu_cpu * rebuild matmul pass: trt and gpu_cpu * rebuild matmul pass: trt and gpu_cpu * rebuild matmul pass: trt and gpu_cpu
-
- 07 Feb, 2022 1 commit
-
-
Committed by arlesniak
* amp list updated * tests updated * gray list updated * amp list updated * test updated
-
- 27 Jan, 2022 1 commit
-
-
Committed by joanna.wozna.intel
* Update pass in quant2_int8_mkldnn_pass * Back to the previous scale_matmul order * Change place of cpu_quantize_placement_pass
-
- 21 Jan, 2022 1 commit
-
-
Committed by ceci3
-
- 13 Jan, 2022 1 commit
-
-
Committed by jakpiase
* base changes for mul reimplementation * empty commit * tmp save * full implementation of mul bf16/fp32 fwd bwd * CI fix * CI rerun * changed unity build cmake to avoid gpu issues * removed mul mkldnn from unity build * added skipping tests if not cpu_bf16 * CI fix * CI fix * CI fix
-
- 12 Jan, 2022 1 commit
-
-
Committed by Sylwester Fraczek
* fix conv act int8 scale * add unit test for conv+hard_swish
-
- 06 Jan, 2022 1 commit
-
-
Committed by minghaoBD
-
- 05 Jan, 2022 2 commits
-
-
Committed by Jiaqi Liu
* make post training quant API support dataloader
-
Committed by joanna.wozna.intel
* Quantize nearest_interp and nearest_interp_v2 * Check if avx_core supported * Add depthwise_conv2d to supported quantization list
-