- 02 7月, 2019 4 次提交
-
-
由 Yi Liu 提交于
1. Since allreduce op has 4 reduce types, We split these four reduce types into four ops 2. We also refined the collective op code, e.g. we separated the collective op kernel into CPUKernel and CUDAKernel, and remove the device specified DeviceContext parameter in template as we already knew the target DeviceContext 3. We remove the newly added Collective op role to reduce the complexity of program and graph analysis
-
由 tianshuo78520a 提交于
* fix the api.spec file does not get the class comment problem * cat new.spec * check api.spec * test=develop
-
由 guru4elephant 提交于
make fleet support mpi job submit directly.
-
由 chengduo 提交于
* add not_been_used_vars to no_grad_set test=develop
-
- 01 7月, 2019 9 次提交
-
-
由 LielinJiang 提交于
* modify roi_perspective_transform_op to output mask and transform matrix * modify comment * modify comment * modify API.spec * update API.spec * remove no use header, test=develop * resolve conflict
-
由 XiaoguangHu 提交于
update README_cn.md with latest version
-
由 XiaoguangHu 提交于
update README.md with latest release & install version
-
由 tensor-tang 提交于
* fix mac ci random fail * use platform instead
-
由 Michał Gallus 提交于
* Int8: Fix Pooling output scale test=develop * Update scales quantization for certain operators These include: concat, transpose, pool and reshape. test=develop * Move concat minimum scale finding to quantizer test=develop
-
由 Brian Liu 提交于
* Fix bug in quantize kernel which cause crash in vgg16/19 model test=develop * refine the code to reduce verbose code; test=develop * remove useless code; test=develop
-
由 xiaoting 提交于
replace mnist dataset url
-
由 xsrobin 提交于
-
由 tianshuo78520a 提交于
-
- 30 6月, 2019 1 次提交
-
-
由 hutuxian 提交于
* update api format test=develop * update API.spec test=develop
-
- 29 6月, 2019 3 次提交
-
-
由 jiaqi 提交于
fix data feed ptr runtime error, pipeline trainer will core in some cases, so set it nullptr as default value.
-
由 guru4elephant 提交于
change url of pslib.tar.gz
-
由 tensor-tang 提交于
* fix py-cpuinfo mac random fail * differentiate version on windows
-
- 28 6月, 2019 10 次提交
-
-
由 Jie Fang 提交于
test=develop
-
由 tianshuo78520a 提交于
* test=develop * test=develop
-
由 guru4elephant 提交于
* add MultiSlotStringDataGenerator for speedup of string based user input data
-
由 Leo Zhao 提交于
1. some key generation method is not aligned with PR#17965 2. enlarge ptr lifetime to avoid memory release if SetBlob fails otherwise it will get core dump. test=develop
-
由 tianshuo78520a 提交于
* change nproc 8
-
由 Zeng Jinle 提交于
* add_elementwise_add_inplace_test,test=develop * rename file, test=develop
-
由 Jiabin Yang 提交于
* test=develop, add some comments for Program.clone * test=develop, add API.spec * test=develop, refine comments * refine Program doc and clone doc * test=develop, refine doc
-
由 Jiabin Yang 提交于
-
由 chengduo 提交于
* add cuda_is_available test=develop * Fix api.spec test=develop * fix api doc test=develop
-
由 Wojciech Uss 提交于
test=develop
-
- 27 6月, 2019 12 次提交
-
-
由 lujun 提交于
Fix dygraph show style for FluidDoc.
-
由 HaoRen 提交于
* add dependecy of collective_helper * test=develop fix dependecy of collective_helper
-
由 翟飞跃 提交于
-
由 chengduo 提交于
* update pe reduce config test=develop * drop the local_exe_scopes of the previous parallel_executor test=develop
-
由 tangwei12 提交于
* add is_runnning in communicator, test=develop
-
由 tianshuo78520a 提交于
需要在avx_noavx build时候,生成dockerfile。 使用combine_avx_noavx 参数生成whl后发现不能build镜像,原因:没有生成dockerfile。需要添加生成dockerfile选项。
-
由 kh2se2013 提交于
* add WITH_COVERAGE option, default OFF test=develop * add coverage for python sdk test=develop * fix code style * fix COVERAGE_FILE path test=develop * remove coverage package test=develop * test = develop, run coverage as module
-
由 Michał Gallus 提交于
test=develop
-
由 HaoRen 提交于
* fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop * supports collective training in executor * make fetch_list runable with variables, add more unittest for use_program_cache test=develop * fix comment test=develop * use unique name for nccl_id * supports output to stream in program_to_code * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code * set op role in collective training * add collective op role * remove orig file * add build optimizer by strategy * add collective strategy * refine collective strategy * add multi-process role maker * refine strategy building factory so that we can easily plugin more strategy * scale loss grad in collective sgd transpiler * add support for distributed fc * code format * revert some features for dist fc * add support for distributed fc training * fix prepare context redundant code problem, optimize executor by caching create_varaiables test=develop * supports collective training in executor * make fetch_list runable with variables, add more unittest for use_program_cache test=develop * use unique name for nccl_id * supports output to stream in program_to_code * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code * set op role in collective training * add collective op role * fix comment test=develop * remove orig file * add build optimizer by strategy * add collective strategy * refine collective strategy * add multi-process role maker * refine strategy building factory so that we can easily plugin more strategy * scale loss grad in collective sgd transpiler * add support for distributed fc * code format * revert some features for dist fc * add support for distributed fc training * test=develop add collective op unittest standard * test=develop remove the test_collective directory * test=develop remove the test_collective directory * remove slicegather test * code format for reducescatter * update attr of shard_index_op * Modify macro nccl_helper * remove test without distribute * macro collective_helper * marcro update * test=develop update support python3.5 * test=develop change gpu memory use to 0.1 when test * test=develop update ut equal func * test=develop set flags to 1.5 * test=develop fix pickle dumple py35 * test=develop fix divide in slice and add sync_comm_stream update atol and rtol to 1e-05 rm shard_index op and test modify read input from file to read from memory remove origin_program in framework and add i/o in c_sync_calc_stream * test=develop update unittest sync operator I/O
-
由 Sylwester Fraczek 提交于
add prior_box quantization code add scale algo rules for prior box test=develop
-
由 lidanqing 提交于
* some fixes for int8 mobilenet_ssd tester test=develop * change wrong data file name test=develop * change test images bin file from 200 images to 100 images * change directory existence to file existence during downloading test=develop * reuse download_data test=develop * run full dataset when iterations=0 test=develop
-
由 Jacek Czaja 提交于
* - Reusing of reuder used in elementwise_add_mkldnn - Added MKL-DNN sum prim reusing test=develop - Compilation fixes test=develop - Yet another compilation fix test=develop - Yet another compilation fix test=develo - Yet another linking fix test=develop - Final compilation fix test=develop - lint fixes test=develop - Lint fixes test=develop * - Fixes after review test=develop
-
- 26 6月, 2019 1 次提交
-
-
由 qingqing01 提交于
* Simplify multi_box_head API in detection.py and remove assign op.
-