- 12 Apr, 2022 1 commit
-
-
Committed by liutiexing
-
- 11 Apr, 2022 2 commits
-
-
Committed by zhouweiwei2014
-
Committed by Allen Guo
-
- 09 Apr, 2022 1 commit
-
-
Committed by limingshu
* Using the maximum workspace_size of all algorithms to limit the workspace size in exhaustive search mode. * Use the system cudaMalloc and cudaFree to allocate workspace during searching. * Enable switch of two kinds of workspace setting methods. Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
- 08 Apr, 2022 2 commits
-
-
Committed by Qi Li
* [ROCm] fix dcu error in device event base, test=develop * fix, test=develop
-
Committed by taixiurong
-
- 07 Apr, 2022 5 commits
-
-
Committed by zhouweiwei2014
-
Committed by chenjian
* no * maintain old profiler * fix old dygraph record event
-
Committed by QingshuChen
* ignore some failed test for KL2 *test=kunlun * minor *test=kunlun * minor *test=kunlun
-
Committed by JingZhuangzhuang
* modify infer gpu memory strategy * modify infer gpu memory strategy
-
Committed by Yiqun Liu
* Add GPU memory usage information in the print of profiler. * Add ifdef.
-
- 06 Apr, 2022 1 commit
-
-
Committed by Allen Guo
* remove paddle_ipu shared library * fix unique_name
-
- 03 Apr, 2022 1 commit
-
-
Committed by FlyingQianMM
* limit grid dim for index select * mv LimitGridDim into gpu_launch_config.h * fix conflicts * fix conflicts * fix code style * set block to 256 * fix grid setting * set dtype of block_dim to unsigned int
-
- 01 Apr, 2022 2 commits
-
-
Committed by wanghuancoder
* support pinned, test=develop * support async_write, test=develop * refine, test=develop * refine, test=develop * refine, test=develop * refine,test=develop * refine, test=develop * refine, test=develop * refine, test=develop * refine, test=develop
-
Committed by z8hanghuan
* support multi_layer of bilstm,*test=kunlun * support multi_layer of bilstm, *test=kunlun * support multi_layer of bilstm, *test=kunlun * support multi_layer of bilstm, *test=kunlun
-
- 31 Mar, 2022 3 commits
-
-
Committed by Leo Chen
* fix bug that some op has no op_role attr * add mkldnn support for new executor * fit for mkldnn data_transfer * fit for mkldnn data_transfer
-
Committed by chenjian
* no * maintain old profiler * exclude new python record events for old profiler * maintain old profiler * maintain * maintain old profiler * maintain * fix cmakes
-
Committed by chenjian
* no * fix bugs * fix doc according to review * fix api doc format * fix api doc according to review * fix bug and add unit test * fix record event bug * optimize chrome tracing display * fix bug * add comment * add unit test * fix a bug * fix * fix * fix format
-
- 30 Mar, 2022 3 commits
-
-
Committed by From00
Add new APIs for GPU memory monitoring (max_memory_allocated, max_memory_reserved, memory_allocated, memory_reserved) (#38657) * Add new API memory_reserved * Add memory_allocated, max_memory_reserved and max_memory_allocated * Fix CI error * Fix CI error * Enhance UT * Add FLAGS_memory_stats_opt * Add STATS macro functions * Add StatAllocator * Fix CI errors * Add UT * Fix CI errors
-
Committed by ykkk2333
* add bilinear interpolate v2 to xpu list and unittest, *test=kunlun * Delete ps_usr_print_log * Delete ps_usr_print_log * Delete xpu_op_test
-
Committed by houj04
* swish and pow op for xpu. test=kunlun * fix code style. test=kunlun. * use pow_grad xdnn api. test=kunlun.
-
- 29 Mar, 2022 1 commit
-
-
Committed by zhangyikun02
-
- 28 Mar, 2022 1 commit
-
-
Committed by chenjian
* no * fix bugs * fix doc according to review * fix api doc format * fix api doc according to review * fix bug and add unit test * fix record event bug
-
- 27 Mar, 2022 1 commit
-
-
Committed by Leo Chen
* fit for mkldnn and inplace op * fix compile * refine ut * register op version * fix inplace op * fix transfer_layout
-
- 25 Mar, 2022 2 commits
-
-
Committed by Aurelius84
* [Phi] Migrate Adam and Adamw into Phi * fix compile error and unittest ok * fix compile error and unittest ok * fix undefined reference to fLI::FLAGS * test depend on operator * fix cmake * fix xpu compile * fix infrt * fix amp_type_traits * fix amp_type_traits * modify according reviewer * modify according reviewer * fix dtype float16 * fix typo * fix Cmake * fix code style
-
Committed by FlyingQianMM
* add maximum limit for grid of reduce, elementwise and gather * add {} after if
-
- 23 Mar, 2022 3 commits
-
-
Committed by furnace
* [NPU] add npu support for conv3d and conv3d_grad * [NPU] delete failed unittests due to Ascend not support * [NPU] delete debug codes * [NPU] optimize codes, notest * [NPU] remove const_cast * [NPU] optimize for remove const_cast * [NPU] fix written errors
-
Committed by From00
* Performance optimize * Optimize GetAllocator, RWLock and ProcessUnfreedAllocation * Remove test file * Fix CI error * Fix CI errors * Fix CI errors
-
Committed by chenjian
* add event record for model profiling * fix format * fix format * fix code example bug * no * add profiler statistic * add profiler feature * fix bug * fix bug * fix bug * fix bug * required: gpu * required: gpu * fix bug * required: gpu * fix ci bug * fix ci error * fix ci error * upgrade document * fix doc * fix ci bug * add doc and fix bug * nothing * fix bug * fix format bug * modify format * add deprecated description for old profiler * fix bug * fix bug * fix * add load_profiler_reuslt doc * add load_profiler_reuslt doc * add load_profiler_reuslt doc * help fix old profiler sample code * add api doc * fix format * fix api doc * fix api doc format * fix api doc format * fix api doc c format * fix api doc format
-
- 21 Mar, 2022 4 commits
-
-
Committed by Chen Weihang
* add phi device context pool * change year * fix compile error * fix operator = error * refine init impl * polish details * refine init impl
-
Committed by zhangyikun02
-
Committed by Allen Guo
* add more ops * add authors Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Allen Guo <alleng@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai> * rm ipu_strategy.check() * fix UT fail * fix typo Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai>
-
Committed by Allen Guo
* sync changes * copy sOpNamescope * fix UTs * add authors Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Allen Guo <alleng@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai> * fix code-format * fix compile error * add comments for feed_op Co-authored-by: Xiaobing Wang <xiaobingw@graphcore.ai> Co-authored-by: Zhixin Yao <zhixiny@graphcore.ai> Co-authored-by: Zhaorui Chen <zhaoruic@graphcore.ai> Co-authored-by: Han Zhao <hanzhao@graphcore.ai>
-
- 15 Mar, 2022 1 commit
-
-
Committed by Jacek Czaja
* - Prototype of third solution - fix - compilation fixes - fix - fixe - fix - fix - compilation fix - comment fix - lint update mkldnn conv_elementwise_add_fuse_pass ut - NHWC changes to prelu - alhpa dims - UT fix - fix to UT - lint - Some fixes - added to BWD of prelu NHWC support - reverted removal of resetting cu_layout in clearing of caching * - Small changes * - compilation fix * - fix * - fix * lint * - fixes after internal review * - compilation fix * - lint
-
- 14 Mar, 2022 3 commits
-
-
Committed by Tomasz Socha
* Add elementwise add and activation fuse pass * Fix copy ellision * More flexible pattern detector * More flexible fusion pass * Update lists for pass * Add support for Pow operator * Add support for more activation types * Style * Rename fusion pass * First version of tests * Dirty version of pass * Polished version * Update pbtxt * Style * Update names * Style * Use PADDLE_ENFORCE_EQ * Save error message to variable * WO for error checks * CR * Static style check * Add missing 'activation_scale' attribute * Add relu6 and sigmoid activations * Style * Fix fuse list formating * Sync filenames for fuse pass files * Fix cmake after move * Fix registration * Fix pass name in tests * Add missing activations to checker * WIPS * Working mul op * Working sub * Working Add * Remove pten includes * Remove some forward declarations * Remove Includes * Fixes * Remove default kernels * Add check if post_ops attributes are avaliable * Style * Code adjustment * Register default kernels * We have year 2022 not 2021... Co-authored-by: jakpiase <jakpia21@gmail.com> Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com> * Fast review fixes Co-authored-by: jakpiase <jakpia21@gmail.com> Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com> * Review Fix * Rename one_dnn -> onednn * Style after review * Fast and dirty fix for quantization * Update tests * Style * Fix mkldnn_quantizer config * Add Joanna's suggestion. * Check if operator is explicitly disables on OneDNN * Try to use unregistered attributes * Style * Test new framework * FXI * FXII * Update test * Style Co-authored-by: jakpiase <jakpia21@gmail.com> Co-authored-by: Sylwester Fraczek <sylwester.fraczek@intel.com>
-
Committed by Lijunhui
[KP] Add unittests for brelu,ceil,celu,elu,floor,hard_shrink,hard_sigmoid,log1p,logsigmoid,relu6,silu,soft_relu,softsign,swish (#40448) * solve unexecuted UT * add 24 activation op UT * append swish&thresholded_relu to kpfirst_list * rm thresholded_relu
-
Committed by liutiexing
-
- 11 Mar, 2022 2 commits
-
-
Committed by zhouweiwei2014
-
Committed by houj04
-
- 10 Mar, 2022 1 commit
-
-
Committed by Lijunhui
-