- 10 Dec, 2021 1 commit
-
-
Committed by Guanghua Yu
-
- 07 Dec, 2021 1 commit
-
-
Committed by Zuza
* quantize slice op * correct test * fix code formatting
-
- 01 Dec, 2021 2 commits
-
-
Committed by Sylwester Fraczek
* dequantize matmul and matmul_v2 Y weights in qat2_int8 * review fix * split conv and mul tests, add matmul test * fixup * fix ci build * remove unused variables * formatting fix * remove extra newline at end of file
-
Committed by Guanghua Yu
-
- 30 Nov, 2021 1 commit
-
-
Committed by Sylwester Fraczek
-
- 26 Nov, 2021 1 commit
-
-
Committed by zhaocaibei123
* test * test * rm test * update * update * update * add unittest * update * update save
-
- 04 Nov, 2021 1 commit
-
-
Committed by XGZhang
* fix a quantization bug
-
- 29 Oct, 2021 1 commit
-
-
Committed by Ming-Xu Huang
-
- 28 Oct, 2021 1 commit
-
-
Committed by XGZhang
-
- 27 Oct, 2021 1 commit
-
-
Committed by zhangkaihuo
This PR adds the layer-level code for fused_transformer, covering the layer code for FusedFeedForward and for FusedTransformerEncoderLayer.
-
- 20 Oct, 2021 1 commit
-
-
Committed by Zeng Jinle
-
- 19 Oct, 2021 1 commit
-
-
Committed by Zeng Jinle
* add pow2_warmup op * remove contrib __all__ * add AttrT * rename * follow comments * fix duplicate PADDLE_RESTRICT
-
- 18 Oct, 2021 1 commit
-
-
Committed by ceci3
* quant support matmul_v2 * fix format
-
- 14 Oct, 2021 2 commits
-
-
Committed by Yanxing Shi
* add sparse_embedding doc * delete wrong space * fix error for sample code * fix error for doc compile * delete __all__ * modify sample code
-
Committed by Zhang Zheng
-
- 11 Oct, 2021 1 commit
-
-
Committed by zlsh80826
Sparse tensor cores for convolution require the input channel dimension to be 2:4 structured sparse, so the input channel dimension has to be masked in order to use them.
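To make the 2:4 constraint concrete, here is a minimal pure-Python sketch. It is not the code from this commit: the function names, the `[out_c][in_c][kh][kw]` weight layout, and the choice to keep the two largest-magnitude values in each group of four are all assumptions made for illustration.

```python
def mask_2_of_4(values):
    """Zero the two smallest-magnitude entries in each group of four.

    Assumption: magnitude-based selection; the commit does not say how
    the mask is chosen.
    """
    if len(values) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    out = []
    for start in range(0, len(values), 4):
        group = values[start:start + 4]
        # Indices of the two largest-magnitude entries; ties go to the
        # earlier position.
        keep = sorted(range(4), key=lambda i: (-abs(group[i]), i))[:2]
        out.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return out


def mask_conv_weight(weight):
    """Apply the 2:4 mask along the input channel dimension.

    Assumption: weight is a nested list indexed as [out_c][in_c][kh][kw].
    """
    out_c, in_c = len(weight), len(weight[0])
    kh, kw = len(weight[0][0]), len(weight[0][0][0])
    masked = [[[[0.0] * kw for _ in range(kh)] for _ in range(in_c)]
              for _ in range(out_c)]
    for o in range(out_c):
        for y in range(kh):
            for x in range(kw):
                # Gather one vector across input channels, mask it,
                # and scatter it back.
                column = [weight[o][c][y][x] for c in range(in_c)]
                for c, v in enumerate(mask_2_of_4(column)):
                    masked[o][c][y][x] = v
    return masked
```

After masking, every run of four consecutive input channels at a given output channel and kernel position holds at most two non-zero values, which is the pattern the commit message refers to.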
-
- 22 Sep, 2021 1 commit
-
-
Committed by joanna.wozna.intel
-
- 21 Sep, 2021 1 commit
-
-
Committed by Adam Osewski
* Create stateful OneDNNAXPYHandler object. This makes it possible to call it multiple times without recreating the oneDNN primitives every time. * Prepare SGDOpKernel to reuse its implementation from OneDNN kernel. * OneDNN SGD kernel. * Update call to use new OneDNNAXPYHandler object api. * Setup seed in proper place. * Enable OneDNN kernel only for single case. * For dense param and sparse grad. * Small refactor. * Enable oneDNN by op attr or by cmd line flag. * Use int64_t type for number of elements. * Support dense param and grad from OneDNN kernel. * Enable SGD OneDNN kernel when use MP BF16 optimizer. * Force non-copyable/movable OneDNNAXPYHandler. * Reuse OneDNNAXPYHandler for sparse tensors in SUM op. * Fix SFINAE rules. * Remove recording event inside AXPY. * Get rid of internal primitive caching. * Stop using PP cache mechanism to store mem and primitive obj. * Handler obj store and reuse needed desc & prim * Do not derive from MKLDNNHandlerT
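For context on why an AXPY handler can back an SGD kernel: AXPY is the BLAS-style update y ← a·x + y, and a plain SGD step is that update with a = -lr. The sketch below is illustrative only and is not the oneDNN implementation; the function names and the row-indexed representation of the sparse gradient are assumptions.

```python
def axpy(alpha, x, y):
    """In-place y <- alpha * x + y on two equal-length lists."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    for i, xi in enumerate(x):
        y[i] += alpha * xi
    return y


def sgd_dense(param, grad, lr):
    """Dense param, dense grad: a single AXPY call with alpha = -lr."""
    return axpy(-lr, grad, param)


def sgd_sparse_grad(param_rows, grad_rows, row_ids, lr):
    """Dense param, sparse grad: one AXPY per row that has a gradient.

    Assumption: the sparse gradient is given as a list of rows plus the
    index of the parameter row each one updates.
    """
    for row_id, grad_row in zip(row_ids, grad_rows):
        axpy(-lr, grad_row, param_rows[row_id])
    return param_rows
```

Because the same `axpy` routine is called repeatedly with different buffers, keeping its setup in a reusable object rather than rebuilding it per call is the kind of saving the first bullet of the commit message describes.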
-
- 17 Sep, 2021 1 commit
-
-
Committed by zhangbo9674
* add pure fp16 major function in auto_cast & tracer * support master weight in dygraph for pure fp16 * check mix dtype of fp16&fp32 for check_finite_and_unscale op * change pure fp16 function name * fix some bugs in auto_cast * refine auto_cast interface logic * add param _casted_by_pure_fp16 for class Layer * support state_dict hook for saving the model in a user-specified dtype in pure_fp16_decorator * refine pure_fp16_decorator as decorator * add unittest * add comment * add comment * support recompute * add comment for auto_cast and decorator * support to_static_state_dict for paddle.jit.save * remove the limit on the number of models and optimizers * add lookup_table in black_list * fix momentum and layer state_dict * fix bug in layer state_dict * fix bug in layer state_dict_helper * refine unittest * refine test_momentun_op * refine interface and some code * refine amp_decorator interface * refine pure fp16 interface * refine master weight interface
-
- 15 Sep, 2021 1 commit
-
-
Committed by 王明冬
* clip op extra information when exporting model, test=ocr * rename clip_extra parameter to kwargs in save_inference_model, test=ocr
-
- 13 Sep, 2021 3 commits
-
-
Committed by zhulei
* [RC22] Fix linear with matmul_op replace
-
Committed by lidanqing
-
Committed by joanna.wozna.intel
-
- 10 Sep, 2021 2 commits
- 09 Sep, 2021 1 commit
-
-
Committed by XGZhang
-
- 06 Sep, 2021 1 commit
-
-
Committed by joanna.wozna.intel
* Add fusion_lstm INT8 PTQ * Correct mkldnn_cache_capacity and enable fc_lstm_fuse_pass only for this test * Change mkldnn_cache_capacity
-
- 03 Sep, 2021 1 commit
-
-
Committed by XGZhang
-
- 01 Sep, 2021 1 commit
-
-
Committed by cc
-
- 31 Aug, 2021 1 commit
-
-
Committed by XGZhang
-
- 26 Aug, 2021 1 commit
-
-
Committed by XGZhang
-
- 24 Aug, 2021 1 commit
-
-
Committed by Adam Osewski
* Small corrections. * Fix lr for bf16. * Revert some changes.
-
- 18 Aug, 2021 1 commit
-
-
Committed by XGZhang
-
- 17 Aug, 2021 1 commit
-
-
Committed by Roc
-
- 16 Aug, 2021 1 commit
-
-
Committed by zhangchunle
-
- 10 Aug, 2021 1 commit
-
-
Committed by XGZhang
-
- 05 Aug, 2021 1 commit
-
-
Committed by WangXi
-
- 30 Jul, 2021 1 commit
-
-
Committed by zhangchunle
-
- 28 Jul, 2021 1 commit
-
-
Committed by cc
-
- 22 Jul, 2021 1 commit
-
-
Committed by Leo Chen
* copy found_inf to cpu in advance to improve performance * add npu test * add npu test * refine code * refine memcpy op * fix adam
-