- 30 Nov, 2021 4 commits
-
-
Committed by Guoxia Wang
* support data_format='NHWC' for prelu channel mode
-
Committed by Yulong Ao
* [Auto Parallel] Add the unified cluster representation
* [Auto Parallel] Add the graph class for physical mapping
* [Auto Parallel] Add the simple physical mapper
* Set the timeout of the mapper
* Merge the upstream develop unittests cmake files
* Fix a bug of the process group
* Remove mapper unittest from platforms which are not GPU
* Move the instantiation of process group after resharding
* Add the local id for devices
* Update the rank mapping format
* Add some comments
* Remove the related files about mapping
* Update the unittest for auto mapping
* Remove unused rank_mapping unittest
* Improve the unittest coverage
* Improve the unittest coverage
-
Committed by LiYuRio
-
Committed by xiongkun
* add scope_guard
* 1. fix control flow cases 2. fix calc_gradient
-
- 29 Nov, 2021 7 commits
-
-
Committed by TTerror
* add expand_v2/expand_as_v2 for kunlun
* update expand_as_v2
* update expand_as_v2
* support float16/bool
* update xpu.cmake
-
Committed by zhangbo9674
* amp.decorate optimizers set to None is ok
* refine unittest
* add unittest and refine example code
* refine unittest
-
Committed by Yuang Liu
-
Committed by Weilong Wu
* native commit for triple grad of sigmoid
* Updated unittests files
* init functional jacobian api
* Updated trible_test func
* Updated gradient_checker & test_script
* finish test with dtype float32
* add float64 test case
* polish code
* use atol=1e-5 with dtype float64
* fix for ci
* set timeout for test_jacobian
* fix dygraph grad to support high differential
* polish API docstring
* Updated gradient checker and some related files
* fix double grad strip error for high differential
* fix double grad strip error for high differential
* Add Sigmoid triple grad tests
* fix dygraph double grad dtype error when calling for high differential scenario
* Updated triple grad tests func
* Use np.random to initialize ddx
* Updated triple_grad_check func
* add todo for gradient checker and refine some comments
* remove additional code
* add test for warning in backward.py
* format python code
* support multi input in triple gradient checker
* Add matmul triple grad kernel
* Updated comments of TODO
* Supported some special tests
* Change code-format to follow CI std
* Updated gradient_checker.py
* Fix conflicts
* Removed unnecessary printing log
* Change code style to follow CI std
* support batch in jacobian and hessian
* add batch jacobian and batch hessian
* Add batch_jacobian test, draft version
* [New features] Add elementwise_mul triple grad kernel (#37152)
* Add elementwise_mul triple grad kernel
* Removed InplaceInferer and polished code
* Add numerical_batch_jacobian,numerical_batch_hessian and tests
* Support batch_jacobian and batch_numerical
* Use pre-commit to check code format
* Update doc, polish code, add unit test
* Reset the TIMEOUT properties of test_jacobian to pass CI
Co-authored-by: levi131 <limaolin01@baidu.com>
Co-authored-by: Jiabin Yang <360788950@qq.com>
-
Committed by Baibaifan
-
Committed by 李季
Co-authored-by: Chen Long <1300851984@qq.com>
-
Committed by Wilber
-
- 27 Nov, 2021 2 commits
-
-
Committed by Yulong Ao
* [Auto Parallel] Add the unified cluster representation
* [Auto Parallel] Add the graph class for physical mapping
* [Auto Parallel] Add the simple physical mapper
* Set the timeout of the mapper
* Merge the upstream develop unittests cmake files
* Fix a bug of the process group
* Remove mapper unittest from platforms which are not GPU
* Move the instantiation of process group after resharding
* Add the local id for devices
* Update the rank mapping format
* Add some comments
* Remove the related files about mapping
* Remove unused rank_mapping unittest
* Improve the unittest coverage
-
Committed by JingZhuangzhuang
-
- 26 Nov, 2021 7 commits
-
-
Committed by Steffy-zxf
* fix data parallel when VOCAB var in program
-
Committed by wanghuancoder
-
Committed by zhaocaibei123
* test
* test
* rm test
* update
* update
* update
* add unittest
* update
* update save
-
Committed by Li Min
* Fix bugs when bias is none for static graph for fused_attention op.
-
Committed by Zhanlue Yang
`reset_inplace_version` removes all inplace-related records from a VarBase/VariableWrapper. Its purpose is to let you use inplace operations as if they were their non-inplace versions, which can cause unexpected consequences if not used with care. This is essentially a hack interface to satisfy one specific request.
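The idea behind the message above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyTensor`, `SavedForBackward`, and the version-counter check are hypothetical stand-ins written for illustration, not Paddle's VarBase/VariableWrapper implementation.

```python
class ToyTensor:
    """Hypothetical tensor that counts how often it was modified in place."""

    def __init__(self, values):
        self.values = list(values)
        self.inplace_version = 0  # assumed bookkeeping, not Paddle's real field

    def add_(self, delta):
        # In-place update: the stored values change and the counter is bumped.
        for i in range(len(self.values)):
            self.values[i] += delta
        self.inplace_version += 1
        return self

    def reset_inplace_version(self):
        # Forget that any in-place modification ever happened.
        self.inplace_version = 0


class SavedForBackward:
    """Hypothetical record of a tensor that a later backward pass will read."""

    def __init__(self, tensor):
        self.tensor = tensor
        self.version_at_save = tensor.inplace_version

    def get(self):
        # Refuse to hand out values that were modified after being saved.
        if self.tensor.inplace_version != self.version_at_save:
            raise RuntimeError("tensor was modified in place after being saved")
        return self.tensor.values
```

In this toy model, calling `add_` after saving makes `get()` raise, while calling `reset_inplace_version()` afterwards makes `get()` succeed again and return the already-modified values. That silent success is the kind of unexpected consequence the commit message warns about.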
-
Committed by wangzhen38
* add tdm sample
* add tdm sample in c++
* update tdm sample
* modify sample count
* fix conflict
* add set_date
* fix cmake error
* fix bug of proto
* update index_dataset proto
* update cmake
* fix error cmake
* fix cmake mkldnn
* fix cmake proto
* update cmake proto
* update cmake
* update rec
* update dataset
* update dataset
* update dataset
* update dataset
* update dataset
* update coverage
* update ci
* goback4
* fix npu ci
* add xxhash dep
-
Committed by smallv0221
* fix dropout static when axis != None
* update dropout test
* add dropout test
* fix test
* Update test_dropout_op.py
* Update test_dropout_op.py
* fix testcase
* fix testcase
* Update test_dropout_op.py
* fix testcase
* fix testcase
* optimize perf
* add new test
* fix testcase
-
- 25 Nov, 2021 10 commits
-
-
Committed by furnace
* [NPU] add int64 support for argsort op
* [NPU] delete debug codes
-
Committed by furnace
* [NPU] add NPU kernel for prior_box op
* [NPU] delete debug codes
-
Committed by Baibaifan
-
Committed by zmx
* fix. test=develop
* fix. test=develop
* fix. test=develop
* fix. test=develop
* fix. test=develop
* [heterps]bug fix for _run_from_dataset
* fix heter_server.cc
* fix launch_utils.py
* fix heter_section_worker.cc
* fix. test=develop
* fix. test=develop
-
Committed by zhouweiwei2014
* add new API paddle.nn.initializer.Dirac
* fix doc
-
Committed by Leo Chen
* fix program cache key
* bug fix
* fix cache problem
* remove unused code
-
Committed by LiYuRio
-
Committed by Chen Weihang
* hot fix for dataloader thread error
* polish comment
* fix type in comment, test=document_fix
-
Committed by Matsumoto GAO
* add zeropad2d v0.1
* add zeropad2d v0.2
* add zeropad2d v0.3
* add zeropad2d v0.3
* add zeropad2d v0.3
* add zeropad2d v0.4
* add zeropad2d v0.5
* add zeropad2d v0.5 codestyle
* add zeropad2d v0.5 codestyle
* add zeropad2d v0.6 functional
* add zeropad2d v0.6 functional
* add zeropad2d v0.6 functional
-
Committed by Leo Chen
* skip compiled program
* fix ut
-
- 24 Nov, 2021 6 commits
-
-
Committed by Thunderbrook
* pybind core
* set use psgpu
-
Committed by Jiawei Wang
-
Committed by Wangzheee
* matmul_convert_int8
* matmul_convert_int8
* matmulconvert_int8
* Matmul_int8_convert: tensor*tensor
* Matmul_int8_convert: tensor*tensor
* Matmul_int8_convert: tensor*tensor
-
Committed by zhaoyingli
* adapt auto search
* adapt auto search
* fix matmulv2 compatible
* del debug
-
Committed by Yulong Ao
* [Auto Parallel] Add the unified cluster representation
* Add the local id for devices
* Add some comments
-
Committed by 0x45f
* run dy2stat pure fp16 in Linear model
* no use self._pure_fp16_inputs
* add test and fix Adam error in dy2stat pure fp16 training
* use paddle.optimizer.Adam
* run test in gpu
* change test time for CI
* enlarge atol for test_resnet_pure_fp16
* refine code and enlarge atol
* make custom_white_list and custom_black_list take effect for AMP and pure fp16
* check tracer is not None
* use default atol
* change filter_size
* change atol and add some NOTE
-
- 23 Nov, 2021 4 commits
-
-
Committed by pangyoki
* fix inplace bug
* fix custom grad input error
* add unittest
* fix inplace bug
-
Committed by Li Min
Add support for bias being None in the fused_attention op.
-
Committed by CtfGo
`paddle.utils.download`: call `extractall` on tar/zip compressed files to speed up decompression when they contain many files.
--- result of decompression speed comparison ---
1. dataset: https://paddlenlp.bj.bcebos.com/datasets/cnn_dailymail/cnn_stories.tgz, decompression time: 5m50s vs 20s
2. dataset: https://paddlenlp.bj.bcebos.com/datasets/cnn_dailymail/dailymail_stories.tgz, decompression time: 33m20s vs 47s
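The change described above can be sketched with the Python standard library alone. This is a minimal illustration, not the actual `paddle.utils.download` code: the helper names `extract_archive` and `make_demo_tgz` are hypothetical, and the sketch only shows the general pattern of unpacking an archive with one `extractall` call.

```python
import os
import tarfile
import zipfile


def extract_archive(archive_path, dest_dir):
    """Unpack a tar or zip archive into dest_dir with a single extractall call.

    Hypothetical helper for illustration. Note that extractall trusts the
    paths stored in the archive, so it should only be used on trusted files.
    """
    os.makedirs(dest_dir, exist_ok=True)
    if tarfile.is_tarfile(archive_path):
        # "r:*" lets tarfile detect the compression (gzip, bz2, xz, none).
        with tarfile.open(archive_path, "r:*") as archive:
            names = archive.getnames()
            archive.extractall(dest_dir)
    elif zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as archive:
            names = archive.namelist()
            archive.extractall(dest_dir)
    else:
        raise ValueError("unsupported archive format: " + archive_path)
    return names


def make_demo_tgz(work_dir, n_files=200):
    """Build a small .tgz with many tiny files, used only to try the helper."""
    src = os.path.join(work_dir, "stories")
    os.makedirs(src)
    for i in range(n_files):
        with open(os.path.join(src, "%d.story" % i), "w") as f:
            f.write("story %d\n" % i)
    archive_path = os.path.join(work_dir, "stories.tgz")
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(src, arcname="stories")
    return archive_path
```

To try it, create a temporary directory, build an archive with `make_demo_tgz`, and pass the result to `extract_archive`. The returned list contains every member name, including the top-level `stories` directory entry.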
-
Committed by Leo Chen
* skip compiled program with places > 1
* fix corner case and add ut
-