diff --git a/.clang-format b/.clang-format
index aff93435f58c522f5ed1090aef2005f76e91cf31..8b5830627348c6bff12260b7d9adbd357f074718 100644
--- a/.clang-format
+++ b/.clang-format
@@ -19,7 +19,7 @@ BasedOnStyle: Google
 IndentWidth: 2
 TabWidth: 2
 ContinuationIndentWidth: 4
-AccessModifierOffset: -2 # The private/protected/public has no indent in class
+AccessModifierOffset: -1 # private/protected/public is indented by 1 space inside class
 Standard: Cpp11
 AllowAllParametersOfDeclarationOnNextLine: true
 BinPackParameters: false
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 6140340890c0e5025eb08209e8ea78df918b4dc0..eeda759ff18ccb86ce6a585fe41cb972ea3ae295 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -34,6 +34,14 @@ repos:
     entry: bash ./tools/codestyle/cpplint_pre_commit.hook
     language: system
     files: \.(c|cc|cxx|cpp|cu|h|hpp|hxx)$
+- repo: local
+  hooks:
+  - id: pylint-doc-string
+    name: pylint
+    description: Check python docstring style using docstring_checker.
+    entry: bash ./tools/codestyle/pylint_pre_commit.hook
+    language: system
+    files: \.(py)$
 - repo: https://github.com/PaddlePaddle/pre-commit-golang
   sha: 8337620115c25ff8333f1b1a493bd031049bd7c0
   hooks:
diff --git a/.travis.yml b/.travis.yml
index 3391e2c3cab9938c9dc5705b51367c707d3bbe9d..8c772030925dcad3909f142b08e4d8057a3f89b7 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -18,6 +18,8 @@ env:
 addons:
   ssh_known_hosts: 13.229.163.131
 before_install:
+  # For pylint docstring checker
+  - sudo pip install pylint pytest astroid isort
   - |
     function timeout() { perl -e 'alarm shift; exec @ARGV' "$@"; }
 script:
diff --git a/Dockerfile b/Dockerfile
index e5508486d6df6a7465998b7e2926b21a1604dfb4..80a96983ec1ca6b9ec440f7e95de6c328eb1ed40 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -79,6 +79,9 @@ RUN pip install pre-commit 'ipython==5.3.0' && \
     pip install 'ipykernel==4.6.0' 'jupyter==1.0.0' && \
     pip install opencv-python
 
+# For docstring checker
+RUN pip install pylint pytest astroid isort
+
 COPY ./python/requirements.txt /root/
 RUN pip install -r /root/requirements.txt
 
diff --git a/benchmark/fluid/README.md b/benchmark/fluid/README.md
index 0fc02b704362f79f2219252538b4b3195e665b2c..7071e9fdcd394a5a4db4d0d599610a72d98c0a3c 100644
--- a/benchmark/fluid/README.md
+++ b/benchmark/fluid/README.md
@@ -24,22 +24,22 @@ Currently supported `--model` argument include:
 * Run the following command to start a benchmark job locally:
   ```bash
-  python fluid_benchmark.py --model mnist --parallel 1 --device GPU --with_test
+  python fluid_benchmark.py --model mnist --device GPU
   ```
   You can choose to use GPU/CPU training. With GPU training, you can specify
-  `--parallel 1` to run multi GPU training.
+  `--gpus <num>` to run multi GPU training.
 * Run distributed training with parameter servers:
   * start parameter servers:
     ```bash
-    PADDLE_TRAINING_ROLE=PSERVER PADDLE_PSERVER_PORT=7164 PADDLE_PSERVER_IPS=127.0.0.1 PADDLE_TRAINERS=1 PADDLE_CURRENT_IP=127.0.0.1 PADDLE_TRAINER_ID=0 python fluid_benchmark.py --model mnist --parallel 0 --device GPU --update_method pserver
+    PADDLE_TRAINING_ROLE=PSERVER PADDLE_PSERVER_PORT=7164 PADDLE_PSERVER_IPS=127.0.0.1 PADDLE_TRAINERS=1 PADDLE_CURRENT_IP=127.0.0.1 PADDLE_TRAINER_ID=0 python fluid_benchmark.py --model mnist --device GPU --update_method pserver
     ```
   * start trainers:
     ```bash
-    PADDLE_TRAINING_ROLE=PSERVER PADDLE_PSERVER_PORT=7164 PADDLE_PSERVER_IPS=127.0.0.1 PADDLE_TRAINERS=1 PADDLE_CURRENT_IP=127.0.0.1 PADDLE_TRAINER_ID=0 python fluid_benchmark.py --model mnist --parallel 0 --device GPU --update_method pserver
+    PADDLE_TRAINING_ROLE=TRAINER PADDLE_PSERVER_PORT=7164 PADDLE_PSERVER_IPS=127.0.0.1 PADDLE_TRAINERS=1 PADDLE_CURRENT_IP=127.0.0.1 PADDLE_TRAINER_ID=0 python fluid_benchmark.py --model mnist --device GPU --update_method pserver
     ```
 * Run distributed training using NCCL2
   ```bash
-  PADDLE_PSERVER_PORT=7164 PADDLE_TRAINER_IPS=192.168.0.2,192.168.0.3 PADDLE_CURRENT_IP=127.0.0.1 PADDLE_TRAINER_ID=0 python fluid_benchmark.py --model mnist --parallel 0 --device GPU --update_method nccl2
+  PADDLE_PSERVER_PORT=7164 PADDLE_TRAINER_IPS=192.168.0.2,192.168.0.3 PADDLE_CURRENT_IP=127.0.0.1 PADDLE_TRAINER_ID=0 python fluid_benchmark.py --model mnist --device GPU --update_method nccl2
   ```
 
 ## Run Distributed Benchmark on Kubernetes Cluster
@@ -48,7 +48,7 @@ We provide a script `kube_gen_job.py` to generate Kubernetes yaml files to submi
 distributed benchmark jobs to your cluster. To generate a job yaml, just run:
 
 ```bash
-python kube_gen_job.py --jobname myjob --pscpu 4 --cpu 8 --gpu 8 --psmemory 20 --memory 40 --pservers 4 --trainers 4 --entry "python fluid_benchmark.py --model mnist --parallel 1 --device GPU --update_method pserver --with_test" --disttype pserver
+python kube_gen_job.py --jobname myjob --pscpu 4 --cpu 8 --gpu 8 --psmemory 20 --memory 40 --pservers 4 --trainers 4 --entry "python fluid_benchmark.py --model mnist --parallel 1 --device GPU --update_method pserver" --disttype pserver
 ```
 
 The YAML files are then generated under the directory `myjob`; you can run:
@@ -58,3 +58,14 @@ kubectl create -f myjob/
 ```
 
 The job shall start.
+
+
+## Notes on Running Fluid Distributed Training with NCCL2 and RDMA
+
+Before running NCCL2 distributed jobs, check whether your node has multiple network
+interfaces; if so, set the environment variable `export NCCL_SOCKET_IFNAME=eth0` so that
+NCCL uses your actual network device.
+
+To run high-performance distributed training, your hardware environment must support
+RDMA-enabled network communication; see [this note](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/howto/cluster/nccl2_rdma_training.md)
+for details.
diff --git a/cmake/inference_lib.cmake b/cmake/inference_lib.cmake
index 3b13b2150514bd615667241272d287c7e55d4e74..236a55d332a91c88d1c5515e7aca4142930a079f 100644
--- a/cmake/inference_lib.cmake
+++ b/cmake/inference_lib.cmake
@@ -56,24 +56,28 @@ set(dst_dir "${FLUID_INSTALL_DIR}/third_party/eigen3")
 copy(eigen3_lib
   SRCS ${EIGEN_INCLUDE_DIR}/Eigen/Core ${EIGEN_INCLUDE_DIR}/Eigen/src ${EIGEN_INCLUDE_DIR}/unsupported/Eigen
   DSTS ${dst_dir}/Eigen ${dst_dir}/Eigen ${dst_dir}/unsupported
+  DEPS eigen3
 )
 
 set(dst_dir "${FLUID_INSTALL_DIR}/third_party/install/gflags")
 copy(gflags_lib
   SRCS ${GFLAGS_INCLUDE_DIR} ${GFLAGS_LIBRARIES}
   DSTS ${dst_dir} ${dst_dir}/lib
+  DEPS gflags
 )
 
 set(dst_dir "${FLUID_INSTALL_DIR}/third_party/install/glog")
 copy(glog_lib
   SRCS ${GLOG_INCLUDE_DIR} ${GLOG_LIBRARIES}
   DSTS ${dst_dir} ${dst_dir}/lib
+  DEPS glog
 )
 
 set(dst_dir "${FLUID_INSTALL_DIR}/third_party/boost/")
 copy(boost_lib
   SRCS ${BOOST_INCLUDE_DIR}/boost
   DSTS ${dst_dir}
+  DEPS boost
 )
 
 if(NOT PROTOBUF_FOUND)
@@ -81,6 +85,7 @@ if(NOT PROTOBUF_FOUND)
   copy(protobuf_lib
     SRCS ${PROTOBUF_INCLUDE_DIR} ${PROTOBUF_LIBRARY}
     DSTS ${dst_dir} ${dst_dir}/lib
+    DEPS extern_protobuf
   )
 endif()
 
@@ -89,12 +94,14 @@ if(NOT CBLAS_FOUND)
   copy(openblas_lib
     SRCS ${CBLAS_INSTALL_DIR}/lib ${CBLAS_INSTALL_DIR}/include
     DSTS ${dst_dir} ${dst_dir}
+    DEPS extern_openblas
   )
 elseif (WITH_MKLML)
   set(dst_dir "${FLUID_INSTALL_DIR}/third_party/install/mklml")
   copy(mklml_lib
     SRCS ${MKLML_LIB} ${MKLML_IOMP_LIB} ${MKLML_INC_DIR}
     DSTS ${dst_dir}/lib ${dst_dir}/lib ${dst_dir}
+    DEPS mklml
   )
 endif()
 
@@ -103,6 +110,7 @@ if(WITH_MKLDNN)
   copy(mkldnn_lib
     SRCS ${MKLDNN_INC_DIR} ${MKLDNN_SHARED_LIB}
     DSTS ${dst_dir} ${dst_dir}/lib
+    DEPS mkldnn
   )
 endif()
 
@@ -110,17 +118,20 @@ if(NOT MOBILE_INFERENCE AND NOT RPI)
   set(dst_dir "${FLUID_INSTALL_DIR}/third_party/install/snappy")
   copy(snappy_lib
     SRCS ${SNAPPY_INCLUDE_DIR} ${SNAPPY_LIBRARIES}
-    DSTS ${dst_dir} ${dst_dir}/lib)
+    DSTS ${dst_dir} ${dst_dir}/lib
+    DEPS snappy)
 
   set(dst_dir "${FLUID_INSTALL_DIR}/third_party/install/snappystream")
   copy(snappystream_lib
     SRCS ${SNAPPYSTREAM_INCLUDE_DIR} ${SNAPPYSTREAM_LIBRARIES}
-    DSTS ${dst_dir} ${dst_dir}/lib)
+    DSTS ${dst_dir} ${dst_dir}/lib
+    DEPS snappystream)
 
   set(dst_dir "${FLUID_INSTALL_DIR}/third_party/install/zlib")
   copy(zlib_lib
     SRCS ${ZLIB_INCLUDE_DIR} ${ZLIB_LIBRARIES}
-    DSTS ${dst_dir} ${dst_dir}/lib)
+    DSTS ${dst_dir} ${dst_dir}/lib
+    DEPS zlib)
 endif()
 
 # paddle fluid module
diff --git a/doc/fluid/getstarted/Developer's_Guide_to_Paddle_Fluid.md b/doc/fluid/getstarted/Developer's_Guide_to_Paddle_Fluid.md
new file mode 100644
index 0000000000000000000000000000000000000000..0c0156c8e46378e7bbeea8072938b8ccfb9ab6d7
--- /dev/null
+++ b/doc/fluid/getstarted/Developer's_Guide_to_Paddle_Fluid.md
@@ -0,0 +1,1819 @@
+
+# A Developer's Guide to Paddle Fluid
+
+---
+
+### ==1==. Why do we need PaddlePaddle Fluid?
+
+---
+
+### Two fundamental questions
+
+
+
+1. How do we describe a machine-learning model and its optimization process?
+  - Completely and self-consistently, expressive enough to cover any computation that may arise
+1. How do we make full use of resources for efficient computation?
+  - Support asynchronous devices, multi-GPU, and distributed computation
+  - Lower the development cost of computation and of computation optimization
+  - ……
+
+
+
+---
+
+### How to describe the model and the optimization process?
+
A sequence of layers executed one after another | A computation graph built from variables and operators | No explicit notion of a "model" anymore
2013 | Caffe, Theano, Torch, PaddlePaddle
2015 | TensorFlow, MxNet, Caffe2, ONNX, n-graph
2016 | PyTorch, TensorFlow Eager Execution, **==PaddlePaddle Fluid==**
+ +--- + + +###

Goals

+
+- Improve the ability to describe all kinds of machine-learning tasks: be able to express any machine-learning model that may arise.
+- Keep the code structure clear and the modules fully decoupled: internal and external contributors can focus on the functional module they need and build further on top of the framework.
+- By design, leave room and potential for technical optimization.
+- With the code decoupled, lower the development cost of multi-device support, computation optimization, and the like.
+- Under one unified design, achieve automatically scalable, automatically fault-tolerant distributed computation.
+
+
+
+---
+
+## ==2.== Design Overview
+
+---
+
+# Fluid: the shape of the system
+
+- [A compiler-like execution flow, separating compile time from run time](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/motivation/fluid_compiler.md)
+
+ +

+ +

+
+---
+
+#### Let us separate compile time from run time in a concrete Fluid program
+
+---
+### Fluid at compile time
+
+- ==**Define the forward computation**==
+
+  ```python
+  x = fluid.layers.data(name='x',shape=[13], dtype='float32')
+  y_predict = fluid.layers.fc(input=x, size=1, act=None)
+  y = fluid.layers.data(name='y', shape=[1], dtype='float32')
+  cost = fluid.layers.square_error_cost(input=y_predict, label=y)
+  avg_cost = fluid.layers.mean(x=cost)
+  ```
+
+- ==**Add backward, regularization, and optimization**==
+  ```python
+  learning_rate = 0.01
+  sgd_optimizer = fluid.optimizer.SGD(learning_rate)
+  sgd_optimizer.minimize(avg_cost)
+  ```
+
+---
+
+### `Program` vs. computation graph
+
+- In scientific computing, the computation graph is a classic way to describe computation. The figure below shows how the full graph is built up, starting from the forward graph (blue) and adding the backward (red) and optimizer-related (green) operations:
+-
+

+ +

+
+- Fluid ==uses a `Program`, not a computation graph,== to describe the model and its optimization. A `Program` is made of `Block`s, `Operator`s, and `Variable`s; these concepts are unpacked in detail later.
+- At compile time, Fluid takes the forward-computation `Program` (for now, just think of it as an ordered flow of computation) and, in the order forward -> backward -> gradient clip -> regularization -> optimization, appends the corresponding `Operator`s and `Variable`s to the `Program` until it describes the complete computation.
+
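+
+An editorial sketch (not in the original slides) of that appending at work: after `minimize`, printing the default `Program` shows the backward and optimizer `Operator`s the framework added behind the user-defined forward section. Module paths follow this guide's examples and may differ across Paddle versions:
+
+  ```python
+  import paddle.v2.fluid as fluid  # newer versions: import paddle.fluid as fluid
+
+  # user-defined forward computation
+  x = fluid.layers.data(name='x', shape=[13], dtype='float32')
+  y = fluid.layers.data(name='y', shape=[1], dtype='float32')
+  y_predict = fluid.layers.fc(input=x, size=1, act=None)
+  cost = fluid.layers.square_error_cost(input=y_predict, label=y)
+  avg_cost = fluid.layers.mean(x=cost)
+
+  # appends backward and optimization Operators/Variables to the Program
+  fluid.optimizer.SGD(learning_rate=0.01).minimize(avg_cost)
+
+  # the print-out now also contains *_grad ops and the sgd op
+  print(fluid.default_main_program())
+  ```
+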
+
+---
+
+### Fluid at run time
+
+- ==**Read in the data**==
+
+  ```python
+  train_reader = paddle.batch(
+      paddle.reader.shuffle(paddle.dataset.uci_housing.train(), buf_size=500),
+      batch_size=20)
+  feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
+  ```
+- ==**Define the device that executes the program**==
+  ```python
+  place = fluid.CPUPlace()
+  feeder = fluid.DataFeeder(place=place,feed_list=[x, y])
+  ```
+
+- ==Create an executor (`Executor`) and run the startup `Program` and the training `Program`==
+
+  ```python
+  exe = fluid.Executor(place)
+  exe.run(fluid.default_startup_program())
+  PASS_NUM = 100
+  for pass_id in range(PASS_NUM):
+      for data in train_reader():
+          avg_loss_value, = exe.run(fluid.default_main_program(),
+                                    feed=feeder.feed(data),
+                                    fetch_list=[avg_cost])
+          print(avg_loss_value)
+  ```
+
+---
+
+### Summary: what does the framework do? What does the user do?
+
+ + + + + + + + + + + + + + + + +
Build the training program | Execute the training program
+User: describes the forward computation
Framework: adds the backward computation
Framework: adds the optimization computation
Framework: adds memory optimization
Framework: adds parallel / multi-device / distributed computation units +
+Framework: creates Operators (computation) + Variables (data)
Framework: creates `Block`s
Framework: memory management / device management
Framework: executes the computation +
+
+ +--- + +###

Summary: compile time

+
+
+**The user writes a Python program describing the model's forward computation**
+1. Create variable descriptions (`VarDesc`)
+1. Create operator descriptions (`OpDesc`)
+1. Create operator attributes
+1. Infer variable types and shapes and run static checks: `inferShape`
+1. Plan memory reuse among variables
+1. Create the backward computation
+1. Add optimization-related Operators
+1. (Optional) Add multi-GPU/multi-node Operators, producing a program that runs on many GPUs/machines
+
+
+
+---
+
+### 

Summary: run time

+
+
+**Execute the planned computation**
+1. Create an `Executor`
+1. For the piece of computation about to run, create a `Scope` inside the hierarchical `Scope` space
+1. Create the `Block`s and execute them one by one
+

+
Figure. Overview of compile time and run time

+ +
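+
+A minimal end-to-end sketch of the two phases (an editorial addition; module paths follow this guide's examples): nothing is computed while the `Program` is built; computation happens only once the `Executor` interprets it:
+
+  ```python
+  import numpy as np
+  import paddle.v2.fluid as fluid  # newer versions: import paddle.fluid as fluid
+
+  # compile time: this only builds the ProgramDesc
+  x = fluid.layers.data(name='x', shape=[13], dtype='float32')
+  y_predict = fluid.layers.fc(input=x, size=1, act=None)
+
+  # run time: the Executor interprets the ProgramDesc on a concrete device
+  exe = fluid.Executor(fluid.CPUPlace())
+  exe.run(fluid.default_startup_program())
+  out, = exe.run(feed={'x': np.random.random((8, 13)).astype('float32')},
+                 fetch_list=[y_predict])
+  ```
+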
+
+---
+
+## ==3==. How does the user describe computation?
+---
+
+### Fluid: define computation ==like writing an ordinary program==
+
+- Sequential execution
+  ```python
+  x = fluid.layers.data(name='x',shape=[13], dtype='float32')
+  y_predict = fluid.layers.fc(input=x, size=1, act=None)
+  y = fluid.layers.data(name='y', shape=[1], dtype='float32')
+  cost = fluid.layers.square_error_cost(input=y_predict, label=y)
+  ```
+
+- Conditional branches: [switch](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/execution/switch.md), [ifelse](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/execution/if_else_op.md)
+
+  ```python
+  a = fluid.Var(10)
+  b = fluid.Var(0)
+
+  switch = fluid.switch()
+  with switch.block():
+      with switch.case(fluid.less_equal(a, 10)):
+          fluid.print("Case 1")
+      with switch.case(fluid.larger(a, 0)):
+          fluid.print("Case 2")
+      with switch.default():
+          fluid.print("Case 3")
+  ```
+
+>[A Lisp cond form may be compared to a continued if-then-else as found in many algebraic programming languages](https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node84.html).
+
+
+
+---
+
+### Fluid: define computation ==like writing an ordinary program==
+
+- Loops: [while](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_machine_translation.py#L105)
+
+  ```python
+  d0 = layers.data("d0", shape=[10], dtype='float32')
+  i = layers.zeros(shape=[1], dtype='int64')
+  data_array = layers.array_write(x=d0, i=i)
+  array_len = layers.fill_constant(shape=[1],dtype='int64', value=3)
+
+  cond = layers.less_than(x=i, y=array_len)
+  while_op = layers.While(cond=cond)
+  with while_op.block():
+      d = layers.array_read(array=data_array, i=i)
+      i = layers.increment(x=i, in_place=True)
+      layers.array_write(d, i=i, array=data_array)
+      layers.less_than(x=i, y=array_len, cond=cond)
+  ```
+
+- For the complete example, see [->](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/unittests/test_while_op.py#L36-L44)
+- beam search [->]( https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_machine_translation.py#L105)
+
+
+
+---
+
+#### 

Summary

+
+
+
+1. The description syntax offered to the user is complete and self-consistent, capable of describing complex computation processes
+1. Its usage and core concepts map onto ordinary programming languages, so existing intuition transfers directly
+1. It supports defining a problem and solving it step by step
+
+
+
+---
+
+## ==3.== Core concepts
+
+---
+### Compile-time concepts: ==describing variables and computation==
+
+
+
+- `VarDesc` + `TensorDesc` + `OpDesc` -> `BlockDesc` -> `ProgramDesc`
+    - https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/framework.proto
+
+- What is a Fluid Program?
+
+    - In Fluid, a neural-network task (training/inference) is described by a `Program`
+    - A `Program` holds the descriptions of `Variable`s (data) and `Operator`s (operations on the data)
+    - `Variable`s and `Operator`s are organized into nestable `Block`s, which together form a complete Fluid `Program`
+
+
+>At the end of the compile phase, after the Transpiler's execution planning and transformations, a `protobuf`-serialized `ProgramDesc` is produced. It can be shipped to multiple GPUs, or to other compute nodes over the network, for execution
+
+
+
+---
+
+### Compile-time concepts: ==**[Transpiler](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/motivation/fluid_compiler.md)**==
+
+1. Takes a `ProgramDesc` as input and generates a new `ProgramDesc`
+
+    - *Memory optimization transpiler*: inserts `FreeMemoryOps` into the original `ProgramDesc`, releasing memory early, before an iteration finishes, so that a small memory footprint can be maintained
+
+    - *Distributed training transpiler*: rewrites the original `ProgramDesc` into its distributed counterpart, generating two new `ProgramDesc`s (a Python usage sketch is given after the `ProgramDesc`-printing example below):
+        1. the `ProgramDesc` run by trainer processes
+        1. the `ProgramDesc` run by parameter servers
+
+1. ==**WIP**==: take a `ProgramDesc` and emit code directly compilable by `gcc`, `nvcc`, `icc`, etc., producing an executable
+
+
+
+---
+### Transpiler
+

+ +

+
+---
+
+### Printing the `ProgramDesc`
+

+ +

+
+
+
+- `default_startup_program`: creates the learnable parameters and initializes them
+- `default_main_program`: the user-defined model, including the forward, backward, optimization, and all other necessary computation
+
+- Print a human-readable `Program`
+  ```python
+  from paddle.v2.fluid import debuger
+  print debuger.pprint_program_codes(framework.default_main_program().desc)
+  ```
+
+---
+### Sample output
+
variable in block 0 | variable in block 0
+
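+
+As promised in the Transpiler section above, a sketch of how the distributed training transpiler is driven from Python. This is an editorial illustration, not part of the original slides; the exact `DistributeTranspiler` signature varies across Paddle versions:
+
+  ```python
+  import paddle.v2.fluid as fluid  # newer versions: import paddle.fluid as fluid
+
+  t = fluid.DistributeTranspiler()
+  # rewrite default_main_program into trainer- and pserver-side programs
+  t.transpile(trainer_id=0, pservers="127.0.0.1:6174", trainers=1)
+
+  pserver_prog = t.get_pserver_program("127.0.0.1:6174")  # run on the pserver
+  trainer_prog = t.get_trainer_program()                  # run on each trainer
+  ```
+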
+
+---
+
+### Run-time concepts
+
+
+
+- Data-related
+  - `Tensor` / `LoDTensor` / `Variable`
+  - `Scope`
+
+- Computation-related
+  - `Block`
+  - `Kernel`, `OpWithKernel`, `OpWithoutKernel`
+
protobuf messages | C++ class objects
Data | [VarDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/framework.proto#L107) | [Variable](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/variable.h#L24)
Operation | [OpDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/framework.proto#L35) | [Operator](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/operator.h#L64)
Block | BlockDesc | Block
+
+- Execution-related: `Executor`
+
+
+---
+#### Tensor and LoD (Level-of-Detail) Tensor
+
+- A Tensor generalizes the $n$-dimensional array; a LoDTensor is a Tensor with sequence information attached
+- In Fluid, inputs, outputs, and the learnable parameters of the network are all uniformly represented as LoDTensors (n-dimensional arrays)
+- One mini-batch of input data is one LoDTensor
+  - In Fluid, RNNs process variable-length sequences without padding, thanks to the `LoDTensor` representation
+  - LoD can be understood simply as `std::vector<std::vector<int>>`
+  - For non-sequence data, the LoD information is empty
+
TensorFlow | PaddlePaddle
RNN | Support | Support
recursive RNN | Support | Support
padding zeros | Must | No need
blob data type | Tensor | LoDTensor
+ +
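+
+To make the "no padding" row above concrete, an illustrative snippet (an editorial addition; it assumes the `lod_level` argument of `fluid.layers.data` and the list-per-sequence feeding convention of `DataFeeder`): a 3-word and a 5-word sentence share one mini-batch with no padding at all:
+
+  ```python
+  import paddle.v2.fluid as fluid  # newer versions: import paddle.fluid as fluid
+
+  words = fluid.layers.data(name='words', shape=[1], dtype='int64', lod_level=1)
+  feeder = fluid.DataFeeder(place=fluid.CPUPlace(), feed_list=[words])
+
+  # one mini-batch holding two sequences of different lengths
+  feed_dict = feeder.feed([([1, 2, 3],), ([4, 5, 6, 7, 8],)])
+  # the result is a single LoDTensor whose LoD is [[0, 3, 8]]
+  ```
+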
+
+---
+#### LoD examples
+

+ +

+
+- The LoD information of figure (a)
+  ```cpp
+  [0, 5, 8, 10, 14]
+  ```
+- The LoD information of figure (b)
+  ```cpp
+  [[0, 5, 8, 10, 14] /*level=1*/, [0, 2, 3, 5, 7, 8, 10, 13, 14] /*level=2*/]
+  ```
+
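+
+The same LoD as figure (a), set from Python (an editorial sketch; it assumes the `core.LoDTensor` binding with `set`/`set_lod`, using the offset-based LoD shown above):
+
+  ```python
+  import numpy as np
+  import paddle.v2.fluid as fluid
+  import paddle.v2.fluid.core as core
+
+  t = core.LoDTensor()
+  t.set(np.random.random((14, 8)).astype('float32'), fluid.CPUPlace())
+  t.set_lod([[0, 5, 8, 10, 14]])  # 4 sequences of lengths 5, 3, 2, 4
+  ```
+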
+
+---
+#### How Tensor, Variable, and Scope relate
+

+ +

+
+
+1. A `Block` is an implementation-level concept that is not exposed at the application level. Users currently cannot create or manipulate `Block`s themselves; the only concept they see is the `Program`.
+1. Logically, a `Block` is analogous to the curly braces of a programming language: it delimits a scope in which a piece of code runs
+1. The `Executor` creates one `Scope` for every `Block`; `Block`s can nest, so `Scope`s can nest too
+
+
+
+---
+### Executor
+
Interface | Description

+ +

Inputs
1. `ProgramDesc`
2. `Scope`
3. `block_id`

Steps of interpretation
1. Create all Variables
2. Create each Operator in turn and run it +
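+
+On the Python side these inputs surface as arguments of `Executor.run` (an editorial sketch; the `scope` keyword exists in the Fluid-era API, and `block_id` defaults to the root block 0):
+
+  ```python
+  import paddle.v2.fluid as fluid
+  import paddle.v2.fluid.core as core
+
+  x = fluid.layers.fill_constant(shape=[2, 2], dtype='float32', value=1.0)
+
+  exe = fluid.Executor(fluid.CPUPlace())
+  scope = core.Scope()  # the hierarchical variable space described above
+  exe.run(fluid.default_startup_program(), scope=scope)
+  out, = exe.run(fluid.default_main_program(), scope=scope, fetch_list=[x])
+  ```
+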
+ +--- +### Operator/OpWithKernel/Kernel + + +

+ +

+
+
+- Operators are stateless; the heart of an Operator is its ==Run== method
+- One operator can register multiple kernels
+- An operator may have no kernel at all: while_op, ifelse op
+
+ +--- +#### Fluid Operator vs. PaddlePaddle layers + + + + + + + + + + + + + + + + + + +
Layer | Operator

+ +

+ +

1. Maintains internal state
2. Has forward and backward methods
1. No internal state
2. Only a Run method
+ +
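+
+How the "stateless" column shows up in user code (an editorial sketch; the parameter name `fc_w` is made up): the learnable weight is an ordinary `Variable` owned by the `Program`/`Scope`, not a member of a layer object:
+
+  ```python
+  import paddle.v2.fluid as fluid
+
+  x = fluid.layers.data(name='x', shape=[13], dtype='float32')
+  y = fluid.layers.fc(input=x, size=1,
+                      param_attr=fluid.ParamAttr(name='fc_w'))
+
+  # the weight lives in the Program, not inside a "layer" object
+  w = fluid.default_main_program().global_block().var('fc_w')
+  print(w.shape)
+  ```
+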
+
+---
+
+### ==4.== Memory management
+
+---
+### Goals
+
+- Provide a unified allocation/deallocation interface for heterogeneous devices
+- Minimize the time needed to manage memory, and minimize the management overhead
+- Reduce memory fragmentation
+- Decouple memory management completely from computation (Operators/Kernels)
+- Unified memory management is the foundation of memory optimization
+
+---
+
+
+
+### The Memory interface
+
+- The memory-management module offers three basic interfaces to the application logic above it:
+  ```cpp
+  template <typename Place>
+  void* Alloc(Place place, size_t size);
+
+  template <typename Place>
+  void Free(Place place, void* ptr);
+
+  template <typename Place>
+  size_t Used(Place place);
+
+  struct Usage : public boost::static_visitor<size_t> {
+    size_t operator()(const platform::CPUPlace& cpu) const;
+    size_t operator()(const platform::CUDAPlace& gpu) const;
+  };
+  ```
+- The template parameter `Place` indicates on which device the allocation happens
+- An implementation specializes the supported `Place`s and provides these three interfaces for each
+
+---
+### Code structure
+
+The memory-management module can be seen as two parts:
+
+1. SystemAllocator: the interface that actually allocates and frees memory on the physical device
+1. BuddyAllocator: the memory-management algorithm
+
+---
+### System Allocator
+
+- SystemAllocator is the base class for physical memory allocation and reclamation
+  - Allocation and reclamation on different devices ultimately become calls on this standard interface
+  - A MemoryAllocator is implemented per device, inheriting from SystemAllocator
+
+  ```cpp
+  class SystemAllocator {
+   public:
+    virtual ~SystemAllocator() {}
+    virtual void* Alloc(size_t& index, size_t size) = 0;
+    virtual void Free(void* p, size_t size, size_t index) = 0;
+    virtual bool UseGpu() const = 0;
+  };
+  ```
+
+---
+
+### CPU/GPU Allocator
+
+```cpp
+class CPUAllocator : public SystemAllocator {
+ public:
+  virtual void* Alloc(size_t& index, size_t size);
+  virtual void Free(void* p, size_t size, size_t index);
+  virtual bool UseGpu() const;
+};
+
+#ifdef PADDLE_WITH_CUDA
+class GPUAllocator : public SystemAllocator {
+ public:
+  virtual void* Alloc(size_t& index, size_t size);
+  virtual void Free(void* p, size_t size, size_t index);
+  virtual bool UseGpu() const;
+ private:
+  size_t gpu_alloc_size_ = 0;
+  size_t fallback_alloc_size_ = 0;
+};
+#endif
+```
+- CPUAllocator and GPUAllocator inherit from SystemAllocator and call the corresponding standard library functions to allocate and free physical memory.
+- Once a large, contiguous chunk of physical memory has been allocated, the memory-management algorithm takes over per-block allocation, reclamation, and reuse within it.
+
+---
+### CPU Allocator
+
+- CPU memory allocation offers two options:
+  1. non-pinned memory: pageable memory
+  2. 
pinned memory: page-locked memory
+    - Allocating too much page-locked memory can hurt overall system performance by shrinking the pageable memory left to the system; by default, CPU allocations are pageable
+
+- gflags control the size of the one-shot allocation and whether page-locked memory is used.
+
+  ```cpp
+  DEFINE_bool(use_pinned_memory, true, "If set, allocate cpu pinned memory.");
+  DEFINE_double(fraction_of_cpu_memory_to_use, 1,
+                "Default use 100% of CPU memory for PaddlePaddle,"
+                "reserve the rest for page tables, etc");
+  ```
+
+---
+### GPU Allocator
+
+- GPU memory is allocated with cudaMalloc
+- GPUAllocator::Alloc first computes the available memory on the given GPU device
+  - If the available memory is sufficient for the requested size, cudaMalloc performs the allocation
+  - If the available memory is insufficient, the allocator currently reports an error and exits.
+- A gflag controls how much GPU memory is claimed in one shot:
+
+  ```cpp
+  DEFINE_double(fraction_of_gpu_memory_to_use, 0.92,
+                "Default use 92% of GPU memory for PaddlePaddle,"
+                "reserve the rest for page tables, etc");
+  ```
+
+---
+#### The memory-management algorithm: [Buddy Memory Allocation](https://en.wikipedia.org/wiki/Buddy_memory_allocation)
+
+- Memory Arena: a large contiguous chunk is allocated once, and memory is then managed inside it: blocks are dynamically allocated, freed, and reused.
+- Buddy memory allocation:
+  - Memory is divided into power-of-two partitions, and requests are served with a best-fit strategy.
+  - On free, the buddy block is checked to see whether the adjacent block has been freed as well; if so, the two blocks are merged to minimize fragmentation.
+  - Allocations are aligned to natural physical-memory boundaries, which improves memory-access efficiency.
+  - The algorithm is time-efficient, but the best-fit strategy makes it waste some memory
+
+---
+
+### Buddy Allocator
+
+- The BuddyAllocator is a singleton: each device (e.g. CPU(0), GPU(0), GPU(1)) owns one BuddyAllocator
+- A BuddyAllocator holds a private SystemAllocator member
+- When a request exceeds the free memory managed by the BuddyAllocator, the SystemAllocator is called to allocate physical memory on the target device
+
+---
+### Example: the memory interface implemented for CPU
+
+- Upper layers allocate, free, and query usage uniformly through the BuddyAllocator
+  ```cpp
+  template <>
+  void* Alloc<platform::CPUPlace>(platform::CPUPlace place, size_t size) {
+    VLOG(10) << "Allocate " << size << " bytes on " << platform::Place(place);
+    void* p = GetCPUBuddyAllocator()->Alloc(size);
+    VLOG(10) << "  pointer=" << p;
+    return p;
+  }
+
+  template <>
+  void Free<platform::CPUPlace>(platform::CPUPlace place, void* p) {
+    VLOG(10) << "Free pointer=" << p << " on " << platform::Place(place);
+    GetCPUBuddyAllocator()->Free(p);
+  }
+
+  template <>
+  size_t Used<platform::CPUPlace>(platform::CPUPlace place) {
+    return GetCPUBuddyAllocator()->Used();
+  }
+  ```
+
+---
+### ==5.== Multi-device support
+
+---
+### Multi-device support (part 1)
+
+- Step 1: add a Place type; implemented by the contributor and added to the framework
+  - A Place can be understood as an integer plus an enum: device id + device type
+

+ +

+- DeviceContext
+  - Each Place maps to a corresponding DeviceContext, which organizes and manages device-related information
+      - For example, the GpuDeviceContext manages the CUDA stream
+  - In the current implementation, some special libraries also get their own DeviceContext, for example:
+  ```cpp
+  class MKLDNNDeviceContext : public CPUDeviceContext {……}
+  ```
+  - What each device's DeviceContext has to manage differs from device to device; implement it according to the concrete needs
+
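+
+On the Python side, the user's choice of Place is what ultimately selects among the kernels below (a small editorial sketch using the places that appear elsewhere in this guide):
+
+  ```python
+  import paddle.v2.fluid as fluid
+
+  use_cuda = False  # flip to True to pick CUDA kernels on GPU 0
+  place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
+  exe = fluid.Executor(place)
+  ```
+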
+
+---
+
+### Multi-device support (part 2)
+
+- Step 2: add a KernelType and register Kernel objects for it; implemented by the contributor and registered with the framework. Kernels can be distinguished by:
+  1. Place: the execution device
+  1. DataType: the data type, FP32/FP64/INT32/INT64
+  1. Memory layout: how the run-time Tensor is arranged in memory, NCHW or NHWC
+  1. The library used
+
+  so that one operator can register multiple Kernels.
+
+  ```cpp
+  struct OpKernelType {
+    proto::DataType data_type_;
+    DataLayout data_layout_;
+    platform::Place place_;
+    LibraryType library_type_;
+  }
+  ```
+
+---
+
+### Multi-device support (part 3)
+
+Step 3: run-time KernelType inference and Kernel switching; adjust the inference and switching rules as needed
+- Expected Kernel: the kernel we intend to call, determined by (1) the `Place` and the compute precision, or (2) a compute library explicitly chosen by the user in the configuration, e.g. `cudnn` or `mkldnn`.
+- Actual Kernel: at run time, the `KernelType` actually required can be inferred from the `Operator`'s inputs (`Variable`s)
+- When the Expected and Actual Kernels differ, the framework inserts `data_transformer` or `data_layout_transform` steps so that the Expected Kernel can run, including:
+  - CPUPlace -> GPUPlace: cross-device memory copy
+  - NCHW -> nChw8c: layout transformation
+  - FP32 -> FP16: precision conversion _**not yet supported**_
+  - ……
+- This process is implemented in the Run method of the OperatorWithKernel class [->](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/operator.cc#L497)
+
+---
+## ==6.== while_op
+
+---
+### while_op
+
+- Executes a `Program` in a loop until the condition operator decides the loop condition no longer holds
+- What makes while_op special:
+  1. while_op has no kernel
+  1. while_op owns its own `Block`, forming a nested `Block`
+  1. ==while_op creates an Executor internally to execute its `Block` repeatedly==
+
+- while_op input/output: LoDTensorArray
+  ```cpp
+  namespace paddle {
+  namespace framework {
+  using LoDTensorArray = std::vector<LoDTensor>;
+  }
+  }
+  ```
+  - Each iteration "slices" one piece out of the original input
+  - LoDTensorArray is exposed on the Python side; it is one of Fluid's basic data structures, and users can create and use it directly
+
+---
+### A look at while_op's [Run](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/while_op.cc#L42) method
+
+```cpp
+
+void Run(const framework::Scope &scope,
+         const platform::Place &dev_place) const override {
+  PADDLE_ENFORCE_NOT_NULL(scope.FindVar(Input(kCondition)));
+  auto &cond = scope.FindVar(Input(kCondition))->Get<LoDTensor>();
+  PADDLE_ENFORCE_EQ(cond.dims(), paddle::framework::make_ddim({1}));
+
+  framework::Executor executor(dev_place);
+  auto *block = Attr<framework::BlockDesc *>(kStepBlock);
+
+  auto *program = block->Program();
+  auto step_scopes =
+      scope.FindVar(Output(kStepScopes))->GetMutable<StepScopeVar>();
+
+  while (cond.data<bool>()[0]) {
+    auto &current_scope = scope.NewScope();
+    step_scopes->push_back(&current_scope);
+    executor.Run(*program, &current_scope, block->ID(),
+                 false /*create_local_scope*/);
+  }
+}
+
+```
+
+---
+### An important application of while_op: Dynamic RNN
+
+---
+
+### What is `dynamicRNN`?
+
+
+1. The user defines the computation inside one time step; the framework takes the sequence input and calls the user-defined single-step computation over it in a loop
+1. Learnable parameters are shared across time steps
+1. `dynamicRNN` is implemented on top of `while_op`
+1. If a `memory` is defined inside `dynamicRNN`, it forms a recurrent neural network; otherwise it simply loops the predefined single-step computation over the input sequence
+
+
+---
+
+#### The `dynamic RNN` user interface
+
+

+ +

+
+- Key elements of `dynamicRNN` (a code sketch follows this list)
+  1. **step input**: the input of `dynamicRNN` at each time step
+  1. **step function**: the user-defined single-step computation
+  1. **memory**: used to form the recurrent connection
+  1. **external/static memory**: external input fully readable at every step of the single-step computation
+
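+
+A hedged sketch of these elements in the Python API (an editorial addition; it assumes the Fluid-era `fluid.layers.DynamicRNN` interface with `step_input`/`memory`/`update_memory`/`output`):
+
+  ```python
+  import paddle.v2.fluid as fluid
+
+  sentence = fluid.layers.data(name='sentence', shape=[32],
+                               dtype='float32', lod_level=1)
+
+  drnn = fluid.layers.DynamicRNN()
+  with drnn.block():
+      word = drnn.step_input(sentence)        # step input
+      prev = drnn.memory(shape=[200])         # memory: the recurrent connection
+      hidden = fluid.layers.fc(input=[word, prev], size=200, act='tanh')
+      drnn.update_memory(prev, hidden)        # feeds back into the next step
+      drnn.output(hidden)                     # per-step output
+
+  last = fluid.layers.sequence_last_step(drnn())
+  ```
+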
+
+---
+
+#### Memory in dynamicRNN
+
+The behavior of a `memory` in `dynamicRNN` closely resembles a reference variable in C++
+  - a `memory` "points to" the output variable of some operator, call it A
+  - a `memory` can be initialized from a LoDTensor (non-sequence if its LoD is empty, sequence otherwise); by default a `memory` is initialized to zero
+  - a `memory` runs its forward step after operator A's forward computation
+  - after that forward step, the `memory` "points to" A's output LoDTensor
+  - the output of the `memory` can be another operator's input, which closes the "recurrent" connection
+
+---
+
+### DynamicRNN implementation details
+
+- `while_op` alone cannot make up a dynamicRNN; it has to cooperate with a set of related operators and data structures
+  - operators it depends on (only the most important ones are listed, not all):
+    - `lod_rank_table` operator
+    - `lod_tensor_to_array` operator
+    - `array_to_lod_tensor` operator
+    - `shrink_memory` operator
+  - data structures it depends on
+    - `TensorArray`
+    - `LoDRankTable`
+
+- In Fluid, RNNs take variable-length sequence input without padding; the data structures and operators above work together to realize batched computation over variable-length input
+
+---
+
+### How does `dynamicRNN` compute in batches?
+
+- The problem:
+  - An RNN can be viewed as an unrolled feed-forward network whose depth equals the length of the longest sequence
+  - Without padding variable-length sequences to a common length, the samples in a mini-batch unroll to different depths, which makes the forward and backward computation hard to implement
+
+----
+##### Example: RNN encoder-decoder with attention
+
+- Take the machine-translation RNN encoder-decoder model (it touches every design element of `dynamicRNN`) as an example; the figure below shows the raw input of the RNN encoder-decoder:
+

+
Figure. Raw batch input data of the RNN encoder-decoder +

+
+- the source word sequences are the input to the encoder RNN, a LoDTensor
+- the target word sequences are the input to the lookup_table, a LoDTensor
+- each rectangle in the figure above is a contiguous region of CPU/GPU memory holding one dense vector
+
+
+---
+
+### How does `dynamicRNN` compute in batches?
+
+1. Sort the unequal-length samples of a mini-batch, so the longest sample comes first in the batch and the shortest last
+    - `LoDTensor` -> `LoDRankTable` :heavy_plus_sign: `lod_rank_table` operator
+        - think of `LoDRankTable` as sorting the sequences of a LoDTensor by length; the LoDRankTable stores the post-sort indices
+
+2. Build the batch input for every time step: as the time step grows, the batch input of each step may gradually shrink
+    - `TensorArray` :heavy_plus_sign: `lod_tensor_to_array` -> `LoDTensor` (without LoD)
+3. Each time step writes its output into an output `LoDTensorArray`
+3. When the `dynamicRNN` loop finishes, reorder the output `LoDTensorArray` according to the information recorded in the `LoDRankTable`, restoring the original input order
+    - `TensorArray` :heavy_plus_sign: `array_to_lod_tensor` -> `LoDTensor`
+
+---
+
+### Execution example
+

+ +

+
+---
+### Execution example
+

+ +

+
+
+- When execution reaches the 5th–7th step batches, the batch size shrinks
+
+
+
+---
+### Execution example
+

+ +

+
+
+- What happens to the RNN's `memory` at step batches 5–7?
+  - a `memory` points to some operator's output Tensor, and "fetches back" the result after that operator's forward computation
+  - at steps 5–7 the ends of some sequences are reached; ==the next time step no longer needs to unroll over sequences that have already ended==
+  - inside `dynamicRNN`, the `shrink_memory` operator is used to shrink the batch input of `memory`
+
+
+
+---
+### Execution example: batches 1–2
+

+
Figure. The 1st and 2nd batches fed into dynamicRNN +

+
+---
+### Execution example: batches 3–4
+

+
Figure. The 3rd and 4th batches fed into dynamicRNN +

+
+---
+
+### Execution example: batches 5–7
+

+
Figure. The 5th, 6th, and 7th batches fed into dynamicRNN +

+
+---
+### ==7.== The Fluid code layout
+
+---
+### The Fluid code layout
+
Code layout | Module structure
+

+ +

+
+

+ +

+
+
+---
+
+### ==8.== Documentation roundup
+
+---
+
+
+- Design overview
+  - Refactoring overview [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/refactorization.md)
+  - fluid [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/fluid.md)
+  - fluid_compiler [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/design/motivation/fluid_compiler.md)
+- Core concepts
+  - variable description [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/var_desc.md)
+  - Tensor [->](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/tensor.md)
+  - LoDTensor [->](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md)
+  - TensorArray [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/tensor_array.md)
+  - Program [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/program.md)
+  - Block [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/block.md)
+  - Scope [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/scope.md)
+
+---
+
+- Key functional modules
+  - backward [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/backward.md)
+  - memory optimization [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/memory_optimization.md)
+  - evaluator [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/executor.md)
+  - python API [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/python_api.md)
+  - regularization [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/regularization.md)
+
+- Development guides
+  - Supporting libraries for new hardware devices [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/support_new_device.md)
+  - Adding a new Operator [->](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/howto/dev/new_op_cn.md)
+  - Adding a new Kernel [->](
+https://github.com/PaddlePaddle/Paddle/blob/develop/doc/howto/dev/new_op_kernel_en.md)
+
+
+
+---
+
+### ==9.== Development guide
+
+---
+
+#### Recommended development environment: build and test with Docker
+
+
+
+Building the PaddlePaddle source with Docker: [->](http://www.paddlepaddle.org/docs/develop/documentation/fluid/zh/build_and_install/docker_install_cn.html)
+
+PaddlePaddle on Docker Hub: [->](
+  https://hub.docker.com/r/paddlepaddle/paddle/tags/)
+
+1. Pull the PaddlePaddle Docker image
+   ```bash
+   docker pull paddlepaddle/paddle:latest-dev
+   ```
+
+1. Start the docker container
+
+   ```bash
+   docker run -it -v $PWD/Paddle:/paddle paddlepaddle/paddle:latest-dev /bin/bash
+   ```
+
+1. Once inside the container, build from source; see the documentation [->]( http://www.paddlepaddle.org/docs/develop/documentation/fluid/zh/build_and_install/build_from_source_cn.html)
+
+
+
+---
+
+### A few notes
+
+
+
+1. To stay small, the PaddlePaddle Docker image ships without vim; run `apt-get install -y vim` inside the container to install it.
+1. For development, use the image tagged `latest-dev`, which bundles all build dependencies. `latest` and `latest-gpu` are production images, meant mainly for running PaddlePaddle programs.
+2. To run GPU programs in Docker, nvidia-docker is recommended; [otherwise the CUDA libraries and devices must be mounted into the container](http://www.paddlepaddle.org/docs/develop/documentation/fluid/zh/build_and_install/docker_install_cn.html).
+
+   ```bash
+   nvidia-docker run -it -v $PWD/Paddle:/paddle paddlepaddle/paddle:latest-dev /bin/bash
+   ```
+
+
+
+---
+
+### [How to contribute](http://www.paddlepaddle.org/docs/develop/documentation/fluid/zh/dev/contribute_to_paddle_cn.html)
+
+
+
+- ==Please read this before submitting a Pull Request==: [->](http://www.paddlepaddle.org/docs/develop/documentation/fluid/zh/dev/contribute_to_paddle_cn.html)
+- Code requirements
+  1. Code comments follow the Doxygen style
+  1. Make sure the compile option WITH_STYLE_CHECK is on, and that the build passes the code-style check
+  1. All code must come with unit tests and pass all of them
+- Use the `pre-commit` hook for Pull Requests
+  1. It helps format the source code (C++, Python)
+  1. It automatically checks some basics before a commit: e.g. a single EOL per file, no large files added to Git, etc.
+  1. 
Install pre-commit, then run in the PaddlePaddle root directory:
+     ```bash
+     ➜  pip install pre-commit
+     ➜  pre-commit install
+     ```
+
+---
+
+### How to contribute
+
+
+
+1. Open an issue before starting to develop.
+   - It lets others know that someone is already working on the task, so that the same feature is not developed twice.
+1. A submitted PR must be linked to its related issue; see [->](https://help.github.com/articles/closing-issues-using-keywords/) for how.
+   - Purpose: the submitted history then records which feature the PR develops and which problem it solves.
+   - When the PR is merged, the linked issue is closed automatically.
+1. During PR review, every reviewer comment must be answered.
+   - If the fix is done, a plain "Done" is enough.
+   - Purpose: review comments may be (1) questions, (2) points that can be fixed in a follow-up PR, or (3) suggestions that are unreasonable. Explicit replies give the reviewer and everyone else a traceable record of whether something was fixed, deferred to the next PR, or reasonably declined.
+
+
+
+---
+
+### ==10.== Adding a new Operator
+
+---
+
+### Concepts in brief
+
+
+
+Adding a new operator involves deriving from the following C++ classes:
+
+1. `framework::OperatorBase`: the base class of Operators (Op for short).
+1. `framework::OpKernel`: the base class of an Op's compute functions, called Kernels.
+1. `framework::OperatorWithKernel`: inherits from OperatorBase; an Op with compute functions is said to have Kernels.
+1. `class OpProtoAndCheckerMaker`: describes the Op's inputs, outputs, attributes, and comments; used mainly for generating the Python API
+
+By whether they contain kernels, Ops fall into two kinds:
+1. Ops with Kernels: inherit from OperatorWithKernel; ==the vast majority of operators belong to this kind==
+1. Ops without kernels: inherit from OperatorBase; only a few Ops belong here, e.g. while_op and ifelse_op
+
+The rest of this section covers how to write an Op with Kernels.
+
+
+
+---
+
+#### Which files must be modified or added for a new Operator?
+
Content | Where it is defined
+OpProtoMaker definition
+
+the `.cc` file; a backward Op needs no OpProtoMaker
+
+Op definition
+
+the `.cc` file
+
+Kernel implementation
+
+a Kernel shared by CPU and CUDA lives in the `.h` file; otherwise the CPU implementation goes into the `.cc` file and the CUDA implementation into the `.cu` file.
+
+Op registration
+
+the Op is registered in the `.cc` file; the CPU Kernel registration goes into the `.cc` file, the CUDA one into the `.cu` file
+
+
+- Before adding an Operator, read the [Operator naming convention](https://github.com/PaddlePaddle/Paddle/blob/63cca04cfd488a4dab6d6273fd04a8017ef45932/doc/fluid/dev/name_convention.md) and the [Operator Markdown comment convention](https://github.com/PaddlePaddle/Paddle/blob/63cca04cfd488a4dab6d6273fd04a8017ef45932/doc/fluid/dev/op_markdown_format.md).
+- New ops all go under [paddle/operators](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/fluid/operators); the files are named with the suffixes `*_op.h` (if any), `*_op.cc`, and `*_op.cu` (if any).
+- The op and its Python-side binding are built automatically from the file names, so stick to this naming; otherwise the PyBind-related files and CMakeLists.txt must be changed as well.
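+
+The Python half of a new op is its unit test, detailed in step 6 below. A minimal sketch of the usual `OpTest` pattern, with clip_op as the example (editorial; the shapes and attribute values here are made up):
+
+  ```python
+  import unittest
+  import numpy as np
+  from op_test import OpTest  # harness under python/paddle/.../tests/unittests
+
+
+  class TestClipOp(OpTest):
+      def setUp(self):
+          self.op_type = "clip"
+          x = np.random.random((4, 5)).astype("float32")
+          self.inputs = {'X': x}
+          self.attrs = {'min': 0.2, 'max': 0.8}
+          # reference result computed in Python, compared with the op's output
+          self.outputs = {'Out': np.clip(x, 0.2, 0.8)}
+
+      def test_check_output(self):
+          self.check_output()
+
+      def test_check_grad(self):
+          self.check_grad(['X'], 'Out')
+
+
+  if __name__ == '__main__':
+      unittest.main()
+  ```
+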
+
+---
+
+###### Implementing an Operator with Kernels, step 1: define the ProtoMaker class
+
+
+
+[clip_op](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/clip_op.h) serves as the running example below
+
+- clip_op computes: $Out = \min(\max(X, min), max)$
+- First define a `ProtoMaker` describing the Op's inputs and outputs, with comments (*the comments in the snippet below are simplified; a real implementation must follow the comment conventions*):
+
+  ```cpp
+  template <typename AttrType>
+  class ClipOpMaker : public framework::OpProtoAndCheckerMaker {
+   public:
+    ClipOpMaker(OpProto* proto, OpAttrChecker* op_checker)
+        : OpProtoAndCheckerMaker(proto, op_checker) {
+      AddInput("X","(Tensor)The input of clip op.");
+      AddOutput("Out", "(Tensor),The output of clip op.");
+      AddAttr<AttrType>(
+          "min", "(float),Minimum value.");
+      AddAttr<AttrType>(
+          "max", "(float),Maximum value.");
+      AddComment(R"DOC(
+……
+)DOC");
+    }
+  };
+  ```
+
+---
+
+###### Implementing an Operator with Kernels, step 2: define the Operator class
+
+
+
+The following snippet defines `clip_op`:
+
+```cpp
+class ClipOp : public framework::OperatorWithKernel {
+ public:
+  using framework::OperatorWithKernel::OperatorWithKernel;
+
+  void InferShape(framework::InferShapeContext* ctx) const override {
+    PADDLE_ENFORCE(ctx->HasInput("X"),
+                   "Input(X) of ClipOp should not be null.");
+    PADDLE_ENFORCE(ctx->HasOutput("Out"),
+                   "Output(Out) of ClipOp should not be null.");
+    auto x_dims = ctx->GetInputDim("X");
+    auto max = ctx->Attrs().Get<float>("max");
+    auto min = ctx->Attrs().Get<float>("min");
+    PADDLE_ENFORCE_LT(min, max, "max should be greater than min.");
+    ctx->SetOutputDim("Out", x_dims);
+    ctx->ShareLoD("X", /*->*/ "Out");
+  }
+};
+```
+
+---
+
+### What the Operator class must do
+
+
+
+1. clip_op inherits from `OperatorWithKernel`;
+
+   ```cpp
+   using framework::OperatorWithKernel::OperatorWithKernel;
+   ```
+   says it uses the base class `OperatorWithKernel`'s constructors.
+
+1. It overrides the `InferShape` interface.
+   - `InferShape` is a const function; it must not modify the Op's member variables
+   - its parameter is `const framework::InferShapeContext &ctx`, from which inputs, outputs, and attributes can be fetched
+   - `InferShape` is called twice — once at compile time (when the op is created) and once at run time (when the op's `Run` method is called) — and must:
+     1. Check early, fail early: validate that input dimensions, types, etc. are legal
+     2. Set the shapes of the output Tensors
+
+The `OpProtoMaker` and `Op` class definitions are usually written in the `.cc` file.
+
+
+
+---
+
+### Additional notes
+
+
+
+1. `InferShape` currently supports two implementation styles; both end up producing a functor registered into the OpInfo struct.
+   1. Inherit framework::InferShapeBase and implement it as a functor (see [mul_op](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/mul_op.cc#L22))
+   2. Override the InferShape function (see [clip_op](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/clip_op.cc#L24))
+
+1. What is a `functor`?
+
+   - A class or struct that only overloads `()`; usually a compute function reusable by multiple kernels.
+
+   ```cpp
+   template <typename T>
+   class CrossEntropyFunctor {
+    public:
+     void operator()(const platform::CPUDeviceContext& ctx,
+                     framework::Tensor* out,
+                     const framework::Tensor* prob,
+                     const framework::Tensor* labels, const bool softLabel) {
+       ……
+     }
+   };
+   ```
+
+   - clip_op also shows a compute function abstracted into a functor: [->](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/clip_op.h#L27).
+
+
+
+---
+
+###### Implementing an Operator with Kernels, step 3: define the OpKernel class
+
+
+
+- `ClipKernel` inherits from `framework::OpKernel` and takes two template parameters:
+  1. `typename DeviceContext`: the device type; add this parameter when different devices share one Kernel. When they do not, provide a specialized implementation per device.
+  1. `typename T`: the supported data type, e.g. `float`, `double`
+
+- Override the `Compute` method in the `ClipKernel` class
+  1. `Compute` takes one input parameter: `const framework::ExecutionContext& context`
+     - `ExecutionContext` gathers the op's run-time input and output `Variable`s from the `Scope`, so that inside `Compute` the op can fetch what it needs simply by name
+     - compared with `InferShapeContext`, `ExecutionContext` adds the device type
+  1. 
Implement the `OpKernel`'s concrete compute logic inside the `Compute` function
+
+
+
+---
+#### ClipKernel at a glance
+
+
+
+```cpp
+template <typename DeviceContext, typename T>
+class ClipKernel : public framework::OpKernel<T> {
+ public:
+  void Compute(const framework::ExecutionContext& context) const override {
+    auto max = context.Attr<T>("max");
+    auto min = context.Attr<T>("min");
+    auto* x = context.Input<Tensor>("X");
+    auto* out = context.Output<Tensor>("Out");
+    T* out_data = out->mutable_data<T>(context.GetPlace());
+    const T* x_data = x->data<T>();
+    int64_t numel = x->numel();
+    Transform<DeviceContext> trans;
+    trans(context.template device_context<DeviceContext>(), x_data,
+          x_data + numel, out_data, ClipFunctor<T>(min, max));
+  }
+};
+```
+
+- To make `OpKernel` computations easy to write and reusable between CPU and CUDA, Fluid uses Eigen as its basic matrix library
+- Fluid provides some basic wrappers over Eigen's unsupported Tensor module that can be called directly inside `Compute`
+  - For how to use Eigen in PaddlePaddle, see the [usage doc](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/dev/use_eigen_cn.md).
+
+
+
+---
+###### Implementing an Operator with Kernels, step 4: implement the backward Op
+
+
+
+- ==**A backward Op has no `ProtoMaker`**==; beyond that, its definition and implementation mirror the forward Op exactly and are not repeated here
+- Only the backward Op's inputs and outputs need a note:
+  1. Inputs of the backward Op
+     - the outputs of the forward Op
+     - the gradients passed to this Op during back-propagation
+     - note that Fluid does not distinguish cost Ops from intermediate-layer Ops: every Op must handle the gradients it receives correctly
+  2. Outputs of the backward Op
+     - the derivatives with respect to the learnable parameters
+     - the derivatives with respect to all inputs
+
+
+
+---
+
+###### Implementing an Operator with Kernels, step 5: register the Op and its Kernels
+
+
+
+With the Op and its kernels implemented, both must now be registered in the `.cc` and `.cu` files
+
+1. In the `.cc` file, register the forward and backward Op classes and the CPU Kernels.
+
+   ```cpp
+   namespace ops = paddle::operators;
+   REGISTER_OP(clip, ops::ClipOp, ops::ClipOpMaker<float>, clip_grad,
+               ops::ClipOpGrad);
+   REGISTER_OP_CPU_KERNEL(
+       clip, ops::ClipKernel<paddle::platform::CPUPlace, float>);
+   REGISTER_OP_CPU_KERNEL(
+       clip_grad, ops::ClipGradKernel<paddle::platform::CPUPlace, float>);
+   ```
+
+   - In the snippet above:
+
+     1. `REGISTER_OP`: registers the `ops::ClipOp` class under the type name `clip`, with `ops::ClipOpMaker` as its `ProtoMaker`, and registers `ops::ClipOpGrad` under the type name `clip_grad`
+     1. `REGISTER_OP_WITHOUT_GRADIENT`: registers an Op without a backward pass, e.g. the optimizer-related Ops
+     1. `REGISTER_OP_CPU_KERNEL`: registers the `ops::ClipKernel` class specialized for `paddle::platform::CPUPlace` and `float`, and likewise the `ops::ClipGradKernel` class
+
+1. Register the GPU Kernel in the `.cu` file the same way
+   - If the CUDA Kernel's implementation is based on Eigen, add the macro `#define EIGEN_USE_GPU` at the top of the `.cu` file
+
+
+
+---
+
+##### Building and the Python-side binding
+
+
+
+- The following command builds only the newly added Op:
+
+  ```
+  make mul_op
+  ```
+  - note that running the unit tests requires building the whole project
+
+- If the file-naming rules above are followed, the build automatically adds the Python-side binding for the new op and links it into the generated lib
+
+
+
+---
+
+###### Implementing an Operator with Kernels, step 6: add the forward unit test and the gradient check
+
+
+
+- Unit tests for new Ops all go under [python/paddle/v2/fluid/tests/unittests](https://github.com/PaddlePaddle/Paddle/tree/develop/python/paddle/fluid/tests/unittests)
+- The forward-Operator unit test
+
+  1. Op unit tests inherit from `OpTest`; the concrete tests live in `TestClipOp`, and every test case is named `TestXX`
+  1. Unit-testing an Operator requires:
+     1. Defining the inputs, outputs, and relevant attribute parameters in `setUp`
+     1. Generating random input data
+     1. Reimplementing the forward operator's computation in the Python script and comparing its output with the operator's forward output
+     1. 
The backward gradient check flow is already implemented by the test framework; simply call the `check_grad` interface
+
+- For the `clip_op` unit-test code see [->](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/unittests/test_clip_op.py); it is not expanded here
+
+
+
+---
+#### Building and running the unit tests
+
+
+
+- New `test_*.py` unit tests under `python/paddle/v2/framework/tests` are automatically added to the project and built
+
+  - Running the unit tests requires building the whole project, with `WITH_TESTING` switched on: `cmake paddle_dir -DWITH_TESTING=ON`
+- After a successful build, run the unit tests with:
+
+  ```bash
+  make test ARGS="-R test_mul_op -V"
+  ```
+
+  or:
+
+  ```
+  ctest -R test_mul_op
+  ```
+
+
+---
+
+### Caveats when adding an Op
+
+
+
+- Give every Op its own `*_op.h` (if any), `*_op.cc`, and `*_op.cu` (if any). Putting several Ops in one file is not allowed and breaks the build.
+- The type name used when registering the Op must equal the Op's name. Registering `REGISTER_OP(B, ...)` inside `A_op.cc` is not allowed and breaks the unit tests.
+- If an Op has no CUDA Kernel, do not create an empty `*_op.cu`; that breaks the unit tests.
+- If several Ops depend on shared helper functions, put them in a file that does not match the `*_op.*` pattern, e.g. `gather.h`.
+
+
+
+---
+
+### ==10.== Usage questions
+
+---
+
+### Defining the forward computation
+
+
+
+- When executing on the python side:
+  ```python
+  import paddle.v2.fluid as fluid
+  ```
+  [`framework.py`](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/framework.py#L1040) defines two global `Program`s:
+  ```python
+  # program is a global instance.
+  _main_program_ = Program()
+  _startup_program_ = Program()
+  ```
+
+- Defining the forward pass is just continually adding Ops and Variables to the `main_program`
+- To execute a new `main_program`, call:
+  ```python
+  def switch_main_program(program):
+      """
+      Switch the main program to a new program.
+      This function returns the previous main program.
+      """
+      ……
+  ```
+
+
+---
+
+### Customizing parameter initialization
+
+
+
+- Call the `fluid.ParamAttr(……)` interface to customize how a parameter is initialized
+
+  ```python
+  w_param_attrs = ParamAttr(name=None,
+      initializer=UniformInitializer(low=-1.0, high=1.0, seed=0),
+      learning_rate=1.0,
+      regularizer=L1Decay(1.0),
+      trainable=True,
+      clip=GradientClipByValue(-1.0, 1.0),
+  )
+  y_predict = fluid.layers.fc(input=x, size=1, param_attr=w_param_attrs)
+  ```
+
+- A related question: how to create a `Variable`
+  ```python
+  cur_program = Program()
+  cur_block = cur_program.current_block()
+  new_var = cur_block.create_var(name="X", shape=[-1, 16, 16], dtype="float32")
+  ```
+
+
+
+---
+
+### Adding backward Ops
+
+
+
+- Call `fluid.backward.append_backward(X)` (where `X` is a Variable) to add the backward Ops for a forward `ProgramDesc`
+
+  ```python
+  data = fluid.layers.data(name="data", shape=(2,3,4))
+  out = fluid.layers.fc(input=data,size=128,act=None)
+  loss = fluid.layers.reduce_sum(out)
+  fluid.backward.append_backward(loss=loss)
+  ```
+
+- Add the optimization-related Ops
+  ```python
+  sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
+  sgd_optimizer.minimize(loss)
+  ```
+
+- You can call `print(fluid.default_main_program())` at any time to dump the current `main_program`
+
+- Once the whole `Program` is built, run memory optimization with:
+  ```python
+  fluid.memory_optimize(fluid.default_main_program())
+  ```
+  - _Note: memory optimization is still under active development and may not be entirely stable._
+
+
+
+---
+
+### Summary: the compile-time flow
+
+
+
+- The user defines the forward computation
+- Backward Ops are appended to `default_main_program`
+- Gradient clipping Ops are appended to `default_main_program`
+- Regularization Ops are appended to `default_main_program`
+- For the chosen optimization algorithm, the optimizer's state variables are added to `default_startup_program`
+  - state variables are things like the learning rate, the momentum history, second-order momentum, etc.
+- Variable-initialization Ops are added to `default_startup_program`
+- For the last op of the whole network, an Op that sets the gradient it receives is appended to `default_main_program`
+- Memory-optimization planning is performed
+
+
+
+---
+
+### Feeding data (1): via the feed dict
+
+
+
+- When calling the executor's run method, pass the feed dict; the feed op places the given data into the `x` and `y` Variables
+  ```python
+  y_data = np.random.randint(0, 8, [1]).astype("int32")
+  y_tensor = core.Tensor()
+  y_tensor.set(y_data, place)
+
+  x_data = np.random.uniform(0.1, 1, [11, 8]).astype("float32")
+  x_tensor = core.Tensor()
+  x_tensor.set(x_data, place)
+  ……
+  cost = exe.run(
+      fluid.default_main_program(),
+      feed={'x': x_tensor,
+            'y': y_tensor},
+      fetch_list=[avg_cost])
+  ```
+
+- This approach is fairly low-level and is mostly used in unit tests
+
+
+
+---
+
+### Feeding data 
(2): via the DataFeeder interface
+
+
+
+- Write a data_reader function; a data_reader is a Python generator
+
+  ```python
+  def demo_reader():
+      def random_generator():
+          yield np.random.uniform(0.1, 1, [4]), np.random.randint(0, 1, [1])
+      return random_generator
+  ```
+- Use the DataFeeder interface in the training job
+  ```python
+  train_reader = paddle.batch(
+      paddle.reader.shuffle(demo_reader(), buf_size=500), batch_size=4)
+  feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
+  for data in train_reader():
+      cost = exe.run(
+          fluid.default_main_program(),
+          feed=feeder.feed(data),
+          fetch_list=[cost])
+  ```
+
+
+
+---
+
+### Frequently asked questions
+
+
+
+- How do I use an evaluator? [->](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_label_semantic_roles.py#L168)
+
+  ```python
+  accuracy = fluid.evaluator.Accuracy(input=predict, label=label)
+  for pass_id in range(PASS_NUM):
+      accuracy.reset()
+      for data in train_reader():
+          loss, acc = exe.run(fluid.default_main_program(),
+                              feed=feeder.feed(data),
+                              fetch_list=[avg_cost] + accuracy.metrics)
+          pass_acc = accuracy.eval(exe)
+          # acc: accuracy of the current mini-batch
+          # pass_acc: accumulated accuracy within the current pass
+      pass_total_acc = accuracy.eval(exe)  # accuracy of the whole pass
+  ```
+
+- How do I evaluate during training? [->](https://github.com/dzhwinter/benchmark/blob/master/fluid/vgg16.py#L144)
+- How do I save a trained model? [->](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_recognize_digits.py#L143)
+- How do I load a trained model for inference? [->](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_recognize_digits.py#L154)
+- How do I define several Programs in one training job and run them alternately? [->](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/demo/fc_gan.py)
+- How do I profile? Fluid ships a profiler that can be called directly; see the example [->](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/unittests/test_profiler.py)
+
+
+
+---
diff --git a/doc/fluid/images/1.png b/doc/fluid/images/1.png
new file mode 100644
index 0000000000000000000000000000000000000000..67daf566f91aab570e60971c4ea8e2be876e214d
Binary files /dev/null and b/doc/fluid/images/1.png differ
diff --git a/doc/fluid/images/2.png b/doc/fluid/images/2.png
new file mode 100644
index 0000000000000000000000000000000000000000..43367777f41449a666e7a3b571f09ac5d5dfb1ae
Binary files /dev/null and b/doc/fluid/images/2.png differ
diff --git a/doc/fluid/images/3.png b/doc/fluid/images/3.png
new file mode 100644
index 0000000000000000000000000000000000000000..481021ef306e2596818aab7fe17a570754f63635
Binary files /dev/null and b/doc/fluid/images/3.png differ
diff --git a/doc/fluid/images/4.png b/doc/fluid/images/4.png
new file mode 100644
index 0000000000000000000000000000000000000000..4279f41e06de459f18b9a622539511d555e9a0af
Binary files /dev/null and b/doc/fluid/images/4.png differ
diff --git a/doc/fluid/images/LoDTensor.png b/doc/fluid/images/LoDTensor.png
new file mode 100644
index 0000000000000000000000000000000000000000..75369f5378309e0f304b83f6bb69bdb195eac079
Binary files /dev/null and b/doc/fluid/images/LoDTensor.png differ
diff --git a/doc/fluid/images/compile_run_time.png b/doc/fluid/images/compile_run_time.png
new file mode 100644
index 0000000000000000000000000000000000000000..0bc9b2fd0e81b4851e6d96171ccb9a05d0f42a48
Binary files /dev/null and b/doc/fluid/images/compile_run_time.png differ
diff --git a/doc/fluid/images/executor.png b/doc/fluid/images/executor.png
new file mode 100644
index 
0000000000000000000000000000000000000000..b29c0d779e3d46b779b5baeabe3176adaeb00a6d Binary files /dev/null and b/doc/fluid/images/executor.png differ diff --git a/doc/fluid/images/fluid_examples.png b/doc/fluid/images/fluid_examples.png new file mode 100644 index 0000000000000000000000000000000000000000..aa99472c0f914cde128fd7b3bd8dc29ac24f94b6 Binary files /dev/null and b/doc/fluid/images/fluid_examples.png differ diff --git a/doc/fluid/images/fluid_module_1.png b/doc/fluid/images/fluid_module_1.png new file mode 100644 index 0000000000000000000000000000000000000000..554782ba54e43efc3d6babbb94e3cac3530ac649 Binary files /dev/null and b/doc/fluid/images/fluid_module_1.png differ diff --git a/doc/fluid/images/fluid_module_2.png b/doc/fluid/images/fluid_module_2.png new file mode 100644 index 0000000000000000000000000000000000000000..4219efccbb1e87839adf6b5720fe46808b7d2fcf Binary files /dev/null and b/doc/fluid/images/fluid_module_2.png differ diff --git a/doc/fluid/images/layer.png b/doc/fluid/images/layer.png new file mode 100644 index 0000000000000000000000000000000000000000..e46db4c9c6f5b65ff274b498b716b11de343a8b0 Binary files /dev/null and b/doc/fluid/images/layer.png differ diff --git a/doc/fluid/images/operator1.png b/doc/fluid/images/operator1.png new file mode 100644 index 0000000000000000000000000000000000000000..3975b06f615b7a88dfc11e71b6451fdf4ce42d60 Binary files /dev/null and b/doc/fluid/images/operator1.png differ diff --git a/doc/fluid/images/operator2.png b/doc/fluid/images/operator2.png new file mode 100644 index 0000000000000000000000000000000000000000..b7bb1fae2050d3a70797517bc20dbbdef3dfcb7c Binary files /dev/null and b/doc/fluid/images/operator2.png differ diff --git a/doc/fluid/images/place.png b/doc/fluid/images/place.png new file mode 100644 index 0000000000000000000000000000000000000000..14e77511d639af155e5a3725cde05323e0cc94f2 Binary files /dev/null and b/doc/fluid/images/place.png differ diff --git a/doc/fluid/images/print_fluid_program.png b/doc/fluid/images/print_fluid_program.png new file mode 100644 index 0000000000000000000000000000000000000000..e8e459e1b3d5c8706b3caa05dc371db8d46df4a5 Binary files /dev/null and b/doc/fluid/images/print_fluid_program.png differ diff --git a/doc/fluid/images/program_desc1.png b/doc/fluid/images/program_desc1.png new file mode 100644 index 0000000000000000000000000000000000000000..0656336914ece957f2e5bb4d70ad337a63e31d88 Binary files /dev/null and b/doc/fluid/images/program_desc1.png differ diff --git a/doc/fluid/images/program_desc2.png b/doc/fluid/images/program_desc2.png new file mode 100644 index 0000000000000000000000000000000000000000..db5bfa1231345add8661b4f8ef0fc9d861f40d24 Binary files /dev/null and b/doc/fluid/images/program_desc2.png differ diff --git a/doc/fluid/images/raw_input.png b/doc/fluid/images/raw_input.png new file mode 100644 index 0000000000000000000000000000000000000000..0725f92d2b169c2b59ec7c68b402859c2a2dd1d8 Binary files /dev/null and b/doc/fluid/images/raw_input.png differ diff --git a/doc/fluid/images/scope_variable_tensor.png b/doc/fluid/images/scope_variable_tensor.png new file mode 100644 index 0000000000000000000000000000000000000000..59b0de6fb36f9f6b469227c05760a7612bb30b4d Binary files /dev/null and b/doc/fluid/images/scope_variable_tensor.png differ diff --git a/doc/fluid/images/sorted_input.png b/doc/fluid/images/sorted_input.png new file mode 100644 index 0000000000000000000000000000000000000000..ff601128368ee179e3fd33e5e295a9ddd3dcbaeb Binary files /dev/null and 
b/doc/fluid/images/sorted_input.png differ diff --git a/doc/fluid/images/transpiler.png b/doc/fluid/images/transpiler.png new file mode 100644 index 0000000000000000000000000000000000000000..422973c0dc7aa2b544d2fc86a97ace706388cb9e Binary files /dev/null and b/doc/fluid/images/transpiler.png differ diff --git a/doc/fluid/images/user_interface.png b/doc/fluid/images/user_interface.png new file mode 100644 index 0000000000000000000000000000000000000000..ffc94e3d8945ec6291460afd90e8fcc600828390 Binary files /dev/null and b/doc/fluid/images/user_interface.png differ diff --git a/paddle/api/GradientMachine.cpp b/paddle/api/GradientMachine.cpp index a3d6f0f080abcf1f45d9bc5fbdb39bb6b6ca1553..0d9ad30de9c1f3f8f58c856a748abdc050ff8740 100644 --- a/paddle/api/GradientMachine.cpp +++ b/paddle/api/GradientMachine.cpp @@ -94,7 +94,7 @@ void UpdateCallback::apply(Parameter* p) { } class UpdateCallbackWrapper { -public: + public: explicit UpdateCallbackWrapper(const UpdateCallback& callback) : callback(const_cast(callback)) {} @@ -105,7 +105,7 @@ public: delete p; } -private: + private: UpdateCallback& callback; }; diff --git a/paddle/api/PaddleAPI.h b/paddle/api/PaddleAPI.h index 67368d1a99d980b248789d24a2ea4f466255687a..7866122006a996cbe5201c661cab9c81aa82a219 100644 --- a/paddle/api/PaddleAPI.h +++ b/paddle/api/PaddleAPI.h @@ -59,9 +59,10 @@ class RangeError {}; /// Not support Error, such as access GPU memory directly, etc. class UnsupportError : public std::runtime_error { -public: - UnsupportError() : std::runtime_error(" "){}; - UnsupportError(const std::string& message) : std::runtime_error(message){}; + public: + UnsupportError() : std::runtime_error(" ") {} + explicit UnsupportError(const std::string& message) + : std::runtime_error(message) {} }; /// This type will map to python's list of float. @@ -105,7 +106,7 @@ class Matrix { DISABLE_COPY(Matrix); static Matrix* createByPaddleMatrixPtr(void* sharedPtr); -public: + public: virtual ~Matrix(); /** @@ -231,7 +232,7 @@ public: bool isGpu() const; -private: + private: void* getSharedPtr() const; MatrixPrivate* m; @@ -248,7 +249,7 @@ class Vector { void* getSharedPtr(); -public: + public: ~Vector(); /// Create Vector filled with zero. @@ -310,10 +311,10 @@ public: /// __len__ in python size_t getSize() const; -private: + private: VectorPrivate* m; -private: + private: friend class Parameter; friend class ParameterOptimizer; friend struct ParameterTraverseCallbackPrivate; @@ -325,7 +326,7 @@ class IVector { DISABLE_COPY(IVector); static IVector* createByPaddleVectorPtr(void* ptr); -public: + public: /// Create IVector filled with zero static IVector* createZero(size_t sz, bool useGpu = isUsingGpu()); @@ -389,7 +390,7 @@ public: /// This method will map to python __len__(); size_t getSize() const; -private: + private: void* getSharedPtr() const; friend class Arguments; @@ -400,11 +401,11 @@ struct ArgumentsPrivate; /// The Arguments is actual a std::vector in paddle. class Arguments { -private: + private: Arguments(); // Internal Create. DISABLE_COPY(Arguments); -public: + public: /** * Create a arguments with size. * Note that it can be zero. 
@@ -475,12 +476,12 @@ public: float sum() const; -private: + private: static Arguments* createByPaddleArgumentVector(void* ptr); static Arguments* createByPaddleArgument(const void* ptr); void* getInternalArgumentsPtr() const; -private: + private: ArgumentsPrivate* m; friend class Trainer; friend class GradientMachine; @@ -507,7 +508,7 @@ class ParameterConfig { static ParameterConfig* createParameterConfigFromParameterPtr(void* ptr); void* getRawPtr(); -public: + public: ~ParameterConfig(); /** @@ -515,10 +516,10 @@ public: */ std::string toProtoString() const; -private: + private: ParameterConfigPrivate* m; -private: + private: friend class Parameter; friend class ParameterOptimizer; friend struct ParameterTraverseCallbackPrivate; @@ -529,7 +530,7 @@ class OptimizationConfig { DISABLE_COPY(OptimizationConfig); OptimizationConfig(); -public: + public: static OptimizationConfig* createFromProtoString(const std::string& str); ~OptimizationConfig(); @@ -538,7 +539,7 @@ public: */ std::string toProtoString(); -private: + private: OptimizationConfigPrivate* m; friend class TrainerConfig; @@ -549,11 +550,11 @@ private: struct ParameterPrivate; class Parameter { -private: + private: Parameter(); DISABLE_COPY(Parameter); -public: + public: virtual ~Parameter(); /** @@ -580,11 +581,11 @@ public: size_t getSize() const; -private: + private: static Parameter* createFromRawPtr(void* ptr); static Parameter* createFromSharedPtr(void* ptr); -private: + private: ParameterPrivate* m; friend class UpdateCallbackWrapper; friend class GradientMachine; @@ -598,14 +599,14 @@ struct ModelConfigPrivate; * It is used by GradientMachine. */ class ModelConfig { -private: + private: ModelConfig(); DISABLE_COPY(ModelConfig); -public: + public: virtual ~ModelConfig(); -private: + private: ModelConfigPrivate* m; friend class TrainerConfig; friend struct TrainerConfigPrivate; @@ -619,11 +620,11 @@ struct TrainerConfigPrivate; * It is used by GradientMachine. 
*/ class TrainerConfig { -private: + private: TrainerConfig(); DISABLE_COPY(TrainerConfig); -public: + public: virtual ~TrainerConfig(); static TrainerConfig* createFromTrainerConfigFile( @@ -634,7 +635,7 @@ public: OptimizationConfig* getOptimizationConfig() const; -private: + private: TrainerConfigPrivate* m; friend class Trainer; }; @@ -654,7 +655,7 @@ private: * @endcode */ class UpdateCallback { -public: + public: virtual ~UpdateCallback(); virtual void apply(Parameter* p); }; @@ -664,14 +665,14 @@ class ParameterTraverseCallback { DISABLE_COPY(ParameterTraverseCallback); ParameterTraverseCallback(); -public: + public: ~ParameterTraverseCallback(); void apply(const std::vector& vecs, const ParameterConfig& config, size_t sparseId); -private: + private: ParameterTraverseCallbackPrivate* m; friend class ParameterOptimizer; }; @@ -686,7 +687,7 @@ class ParameterOptimizer { DISABLE_COPY(ParameterOptimizer); ParameterOptimizer(); -public: + public: static ParameterOptimizer* create(OptimizationConfig* config); ~ParameterOptimizer(); @@ -710,7 +711,7 @@ public: ParameterTraverseCallback* needSpecialTraversal( const ParameterConfig& config) const; -private: + private: ParameterOptimizerPrivate* m; }; @@ -718,11 +719,11 @@ class SequenceGenerator; class Evaluator; struct GradientMachinePrivate; class GradientMachine { -private: + private: GradientMachine(); DISABLE_COPY(GradientMachine); -public: + public: virtual ~GradientMachine(); /** @@ -817,7 +818,7 @@ public: void eval(Evaluator* evaluator); -private: + private: GradientMachinePrivate* m; static GradientMachine* createFromPaddleModelPtr( @@ -833,10 +834,10 @@ private: struct ParameterUpdaterPrivate; class ParameterUpdater { -private: + private: ParameterUpdater(); -public: + public: static ParameterUpdater* createLocalUpdater(OptimizationConfig* config); static ParameterUpdater* createRemoteUpdater(OptimizationConfig* config, int passCount, @@ -911,17 +912,17 @@ public: */ void catchUpWith(); -private: + private: ParameterUpdaterPrivate* m; }; struct EvaluatorPrivate; class Evaluator { -private: + private: Evaluator(); DISABLE_COPY(Evaluator); -public: + public: ~Evaluator(); /** @@ -945,7 +946,7 @@ public: double getValue(const std::string name) const; -private: + private: EvaluatorPrivate* m; friend class GradientMachine; @@ -953,13 +954,13 @@ private: struct TrainerPrivate; class Trainer { -private: + private: TrainerPrivate* m; Trainer(); Trainer(TrainerConfig* optConfig, GradientMachine* gm); DISABLE_COPY(Trainer); -public: + public: virtual ~Trainer(); /// Create A Trainer By TrainerConfig. using paddle command line. @@ -1002,7 +1003,7 @@ public: /// the N-Best results generated from one input sequence. class ISequenceResults { -public: + public: virtual ~ISequenceResults(); /// Number of result. 
@@ -1026,7 +1027,7 @@ class SequenceGenerator { DISABLE_COPY(SequenceGenerator); SequenceGenerator(); -public: + public: virtual ~SequenceGenerator(); /** @@ -1044,10 +1045,10 @@ public: void setMaxLength(size_t maxlength); void setBeamSize(size_t beamSize); -private: + private: static SequenceGenerator* createByGradientMachineSharedPtr(void* ptr); friend class GradientMachine; -private: + private: SequenceGeneratorPrivate* m; }; diff --git a/paddle/api/SequenceGenerator.cpp b/paddle/api/SequenceGenerator.cpp index 1b30aec8f6b6b73764886a7c7274be67851e4815..1446c3084238859a759669f3a32c7efde67dcc2b 100644 --- a/paddle/api/SequenceGenerator.cpp +++ b/paddle/api/SequenceGenerator.cpp @@ -138,7 +138,7 @@ struct SequenceGeneratorPrivate { maxLength(0UL), feedback(__create_feedback__()) {} -private: + private: static paddle::Argument __create_feedback__() { paddle::Argument feedback; feedback.ids = paddle::IVector::create(/* size= */ 1, FLAGS_use_gpu); @@ -157,7 +157,7 @@ SequenceGenerator::~SequenceGenerator() { delete m; } class PathSequenceResults : public ISequenceResults { // ISequenceResults interface -public: + public: PathSequenceResults(const std::shared_ptr>& path, const std::shared_ptr>& dict) : path_(path), dict_(dict) {} @@ -196,7 +196,7 @@ public: } } -private: + private: std::shared_ptr> path_; std::shared_ptr> dict_; }; diff --git a/paddle/capi/gradient_machine.cpp b/paddle/capi/gradient_machine.cpp index ea9aab00e3d05f1e2ef0c91eab93b67e0a3d5f37..8c3f504e5a2d807c0cc664af486ebab4a82ddec3 100644 --- a/paddle/capi/gradient_machine.cpp +++ b/paddle/capi/gradient_machine.cpp @@ -26,7 +26,7 @@ enum GradientMatchineCreateMode { namespace paddle { class MyNeuralNetwork : public NeuralNetwork { -public: + public: MyNeuralNetwork(const std::string& name, NeuralNetwork* network) : NeuralNetwork(name, network) {} }; diff --git a/paddle/contrib/inference/paddle_inference_api.h b/paddle/contrib/inference/paddle_inference_api.h index 9ac8ebdef8151f2a144b479fa258b8bc830fc2e9..f804d9b28697a6703d63d9a640c4ec337effaba6 100644 --- a/paddle/contrib/inference/paddle_inference_api.h +++ b/paddle/contrib/inference/paddle_inference_api.h @@ -50,7 +50,7 @@ struct PaddleTensor { * TODO(Superjomn) Prepare another API for NLP-related usages. */ class PaddlePredictor { -public: + public: struct Config; PaddlePredictor() = default; PaddlePredictor(const PaddlePredictor&) = delete; @@ -66,6 +66,7 @@ public: // be thread-safe. virtual std::unique_ptr Clone() = 0; + virtual bool InitShared() { return false; } // Destroy the Predictor. virtual ~PaddlePredictor() {} diff --git a/paddle/contrib/inference/paddle_inference_api_impl.cc b/paddle/contrib/inference/paddle_inference_api_impl.cc index ecca16d3f82bbeee6858883a0f9e577a479f9d06..e7a0b341dda1ca8d2ccfc0d6c12a7ac3d4c691d5 100644 --- a/paddle/contrib/inference/paddle_inference_api_impl.cc +++ b/paddle/contrib/inference/paddle_inference_api_impl.cc @@ -28,7 +28,7 @@ namespace { // Timer for timer class Timer { -public: + public: double start; double startu; void tic() { @@ -135,16 +135,17 @@ bool PaddlePredictorImpl::Run(const std::vector &inputs, std::unique_ptr PaddlePredictorImpl::Clone() { VLOG(3) << "Predictor::clone"; - std::unique_ptr cls(new PaddlePredictorImpl(config_)); - if (!cls->InitShared(this)) { + std::unique_ptr cls(new PaddlePredictorImpl(config_)); + if (!cls->InitShared()) { LOG(ERROR) << "fail to call InitShared"; return nullptr; } - return cls; + // fix manylinux compile error. 
+ return std::move(cls); } // TODO(panyx0718): Consider merge with Init()? -bool PaddlePredictorImpl::InitShared(PaddlePredictorImpl *cls) { +bool PaddlePredictorImpl::InitShared() { VLOG(3) << "Predictor::init_shared"; // 1. Define place, executor, scope if (this->config_.device >= 0) { diff --git a/paddle/contrib/inference/paddle_inference_api_impl.h b/paddle/contrib/inference/paddle_inference_api_impl.h index 831abce5da58f90b38e27b5638e953de5167647a..a0c7ff030735fc1c6b9d717f8f9e4addc7e0c6b0 100644 --- a/paddle/contrib/inference/paddle_inference_api_impl.h +++ b/paddle/contrib/inference/paddle_inference_api_impl.h @@ -41,7 +41,7 @@ struct VisConfig : public PaddlePredictor::Config { * Do not use this, just a demo indicating how to customize a Predictor. */ class PaddlePredictorImpl : public PaddlePredictor { -public: + public: explicit PaddlePredictorImpl(const VisConfig &config) : config_(config) {} bool Init(); @@ -53,8 +53,8 @@ public: ~PaddlePredictorImpl() override{}; -private: - bool InitShared(PaddlePredictorImpl *cls); + private: + bool InitShared() override; bool SetFeed(const std::vector &input_datas, std::vector *feeds); bool GetFetch(const std::vector &fetchs, diff --git a/paddle/contrib/inference/test_paddle_inference_api.cc b/paddle/contrib/inference/test_paddle_inference_api.cc index a19173087649e8493b8c72e758456cc5b8970e23..bc7faab6e208a66d7a56e41a56bd743c7644eea2 100644 --- a/paddle/contrib/inference/test_paddle_inference_api.cc +++ b/paddle/contrib/inference/test_paddle_inference_api.cc @@ -31,7 +31,7 @@ struct DemoConfig : public PaddlePredictor::Config { * Do not use this, just a demo indicating how to customize a Predictor. */ class DemoPredictor : public PaddlePredictor { -public: + public: explicit DemoPredictor(const DemoConfig &config) { LOG(INFO) << "I get other_config " << config.other_config; } diff --git a/paddle/contrib/inference/test_paddle_inference_api_impl.cc b/paddle/contrib/inference/test_paddle_inference_api_impl.cc index 43b068fb42c5e5c58f2932c3e528fd0fa0502ec3..2a58f6989d5dad23b2f267adafde2cc105bf5651 100644 --- a/paddle/contrib/inference/test_paddle_inference_api_impl.cc +++ b/paddle/contrib/inference/test_paddle_inference_api_impl.cc @@ -44,7 +44,7 @@ TEST(paddle_inference_api_impl, word2vec) { VisConfig config; config.model_dir = FLAGS_dirname + "word2vec.inference.model"; LOG(INFO) << "dirname " << config.model_dir; - config.fraction_of_gpu_memory = 0.85; + config.fraction_of_gpu_memory = 0.15; config.device = 0; config.share_variables = true; @@ -68,11 +68,11 @@ TEST(paddle_inference_api_impl, word2vec) { std::vector outputs; ASSERT_TRUE(predictor->Run(cpu_feeds, &outputs)); - ASSERT_EQ(outputs.size(), 1); + ASSERT_EQ(outputs.size(), 1UL); for (size_t i = 0; i < outputs.size(); ++i) { size_t len = outputs[i].data.length; float* data = static_cast(outputs[i].data.data); - for (int j = 0; j < len / sizeof(float); ++j) { + for (size_t j = 0; j < len / sizeof(float); ++j) { ASSERT_LT(data[j], 1.0); ASSERT_GT(data[j], -1.0); } diff --git a/paddle/cuda/include/hl_activation_functions.h b/paddle/cuda/include/hl_activation_functions.h index 29ec248420058db08bd1932f702d26074d49f38c..66a69db545b541409f895820ad621a2a9a684e20 100644 --- a/paddle/cuda/include/hl_activation_functions.h +++ b/paddle/cuda/include/hl_activation_functions.h @@ -31,7 +31,7 @@ namespace hppl { */ template class Active { -public: + public: typedef T (*forward)(T); typedef T (*backward)(T, T); }; diff --git a/paddle/cuda/include/hl_tensor_ops.h b/paddle/cuda/include/hl_tensor_ops.h 
index 85a022ff5e26daab97be52b7ea9814c6b8078561..bc5e5da53d5c6ac2bae3b0067f46e39accd1b9d8 100644 --- a/paddle/cuda/include/hl_tensor_ops.h +++ b/paddle/cuda/include/hl_tensor_ops.h @@ -23,128 +23,128 @@ namespace unary { template class add_scale { -private: + private: const T p; -public: + public: INLINE add_scale(const T s) : p(s) {} INLINE T operator()(const T a) const { return a + p; } }; template class sub_scale { -private: + private: const T p; -public: + public: INLINE sub_scale(const T s) : p(s) {} INLINE T operator()(const T a) const { return a - p; } }; template class mul_scale { -private: + private: const T p; -public: + public: INLINE mul_scale(const T s) : p(s) {} INLINE T operator()(const T a) const { return a * p; } }; template class div_scale { -private: + private: const T p; -public: + public: INLINE div_scale(const T s) : p(s) {} INLINE T operator()(const T a) const { return a / p; } }; template class neg { -public: + public: INLINE T operator()(const T a) const { return -a; } }; template class exp_op { -public: + public: INLINE T operator()(const T a) const { return std::exp(a); } }; template class log_op { -public: + public: INLINE T operator()(const T a) const { return std::log(a); } }; template class sqrt_op { -public: + public: INLINE T operator()(const T a) const { return std::sqrt(a); } }; template class square { -public: + public: INLINE T operator()(const T a) const { return a * a; } }; template class reciprocal { -public: + public: INLINE T operator()(const T a) const { return T(1) / a; } }; template class abs { -public: + public: INLINE T operator()(const T a) const { return a > 0 ? a : -a; } }; template class sign { -public: + public: INLINE T operator()(const T a) const { return (a > 0) - (a < 0); } }; template class min { -private: + private: const T p; -public: + public: INLINE min(const T s) : p(s) {} INLINE T operator()(const T a) const { return a > p ? p : a; } }; template class max { -private: + private: const T p; -public: + public: INLINE max(const T s) : p(s) {} INLINE T operator()(const T a) const { return a < p ? 
p : a; } }; template class pow_op { -private: + private: const T p; -public: + public: INLINE pow_op(const T s) : p(s) {} INLINE T operator()(const T a) const { return std::pow(a, p); } }; template class constant { -private: + private: const T p; -public: + public: INLINE constant(const T s) : p(s) {} INLINE T operator()(int i) const { return p; } INLINE T operator()(int i, int j) const { return p; } @@ -152,80 +152,80 @@ public: template class cmp_eq { -private: + private: const T p; -public: + public: INLINE cmp_eq(const T s) : p(s) {} INLINE bool operator()(const T a) const { return a == p; } }; template class cmp_ne { -private: + private: const T p; -public: + public: INLINE cmp_ne(const T s) : p(s) {} INLINE bool operator()(const T a) const { return a != p; } }; template class cmp_le { -private: + private: const T p; -public: + public: INLINE cmp_le(const T s) : p(s) {} INLINE bool operator()(const T a) const { return a <= p; } }; template class cmp_lt { -private: + private: const T p; -public: + public: INLINE cmp_lt(const T s) : p(s) {} INLINE bool operator()(const T a) const { return a < p; } }; template class cmp_ge { -private: + private: const T p; -public: + public: INLINE cmp_ge(const T s) : p(s) {} INLINE bool operator()(const T a) const { return a >= p; } }; template class cmp_gt { -private: + private: const T p; -public: + public: INLINE cmp_gt(const T s) : p(s) {} INLINE bool operator()(const T a) const { return a > p; } }; template class and_op { -private: + private: const T p; -public: + public: INLINE and_op(const T s) : p(s) {} INLINE bool operator()(const T a) const { return a && p; } }; template class or_op { -private: + private: const T p; -public: + public: INLINE or_op(const T s) : p(s) {} INLINE bool operator()(const T a) const { return a || p; } }; @@ -235,96 +235,96 @@ public: namespace binary { template class add { -public: + public: INLINE T operator()(const T a, const T b) const { return a + b; } }; template class add_scale { -private: + private: const T p1; const T p2; -public: + public: INLINE add_scale(const T s1, const T s2) : p1(s1), p2(s2) {} INLINE T operator()(const T a, const T b) const { return p1 * a + p2 * b; } }; template class sub { -public: + public: INLINE T operator()(const T a, const T b) const { return a - b; } }; template class mul { -public: + public: INLINE T operator()(const T a, const T b) const { return a * b; } }; template class div { -public: + public: INLINE T operator()(const T a, const T b) const { return a / b; } }; template class cmp_eq { -public: + public: INLINE bool operator()(const T a, const T b) const { return a == b; } }; template class cmp_ne { -public: + public: INLINE bool operator()(const T a, const T b) const { return a != b; } }; template class cmp_le { -public: + public: INLINE bool operator()(const T a, const T b) const { return a <= b; } }; template class cmp_lt { -public: + public: INLINE bool operator()(const T a, const T b) const { return a < b; } }; template class cmp_ge { -public: + public: INLINE bool operator()(const T a, const T b) const { return a >= b; } }; template class cmp_gt { -public: + public: INLINE bool operator()(const T a, const T b) const { return a > b; } }; template class and_op { -public: + public: INLINE bool operator()(const T a, const T b) const { return a && b; } }; template class or_op { -public: + public: INLINE bool operator()(const T a, const T b) const { return a || b; } }; template class min { -public: + public: INLINE T operator()(const T a, const T b) const { return a > b ? 
b : a; } }; template class max { -public: + public: INLINE T operator()(const T a, const T b) const { return a < b ? b : a; } }; @@ -332,7 +332,7 @@ public: #ifndef PADDLE_TYPE_DOUBLE template <> class add<__m128> { -public: + public: INLINE __m128 operator()(const __m128 a, const __m128 b) const { return _mm_add_ps(a, b); } @@ -340,11 +340,11 @@ public: template <> class add_scale<__m128> { -private: + private: const __m128 p1; const __m128 p2; -public: + public: INLINE add_scale(const __m128 s1, const __m128 s2) : p1(s1), p2(s2) {} INLINE __m128 operator()(const __m128 a, const __m128 b) const { return _mm_add_ps(_mm_mul_ps(p1, a), _mm_mul_ps(p2, b)); @@ -353,7 +353,7 @@ public: template <> class sub<__m128> { -public: + public: INLINE __m128 operator()(const __m128 a, const __m128 b) const { return _mm_sub_ps(a, b); } @@ -361,7 +361,7 @@ public: template <> class mul<__m128> { -public: + public: INLINE __m128 operator()(const __m128 a, const __m128 b) const { return _mm_mul_ps(a, b); } @@ -369,7 +369,7 @@ public: template <> class div<__m128> { -public: + public: INLINE __m128 operator()(const __m128 a, const __m128 b) const { return _mm_div_ps(a, b); } @@ -377,7 +377,7 @@ public: template <> class min<__m128> { -public: + public: INLINE __m128 operator()(const __m128 a, const __m128 b) const { return _mm_min_ps(a, b); } @@ -385,7 +385,7 @@ public: template <> class max<__m128> { -public: + public: INLINE __m128 operator()(const __m128 a, const __m128 b) const { return _mm_max_ps(a, b); } @@ -393,7 +393,7 @@ public: #else template <> class add<__m128d> { -public: + public: INLINE __m128d operator()(const __m128d a, const __m128d b) const { return _mm_add_pd(a, b); } @@ -401,11 +401,11 @@ public: template <> class add_scale<__m128d> { -private: + private: const __m128d p1; const __m128d p2; -public: + public: INLINE add_scale(const __m128d s1, const __m128d s2) : p1(s1), p2(s2) {} INLINE __m128d operator()(const __m128d a, const __m128d b) const { return _mm_add_pd(_mm_mul_pd(p1, a), _mm_mul_pd(p2, b)); @@ -414,7 +414,7 @@ public: template <> class sub<__m128d> { -public: + public: INLINE __m128d operator()(const __m128d a, const __m128d b) const { return _mm_sub_pd(a, b); } @@ -422,7 +422,7 @@ public: template <> class mul<__m128d> { -public: + public: INLINE __m128d operator()(const __m128d a, const __m128d b) const { return _mm_mul_pd(a, b); } @@ -430,7 +430,7 @@ public: template <> class div<__m128d> { -public: + public: INLINE __m128d operator()(const __m128d a, const __m128d b) const { return _mm_div_pd(a, b); } @@ -438,7 +438,7 @@ public: template <> class min<__m128d> { -public: + public: INLINE __m128d operator()(const __m128d a, const __m128d b) const { return _mm_min_pd(a, b); } @@ -446,7 +446,7 @@ public: template <> class max<__m128d> { -public: + public: INLINE __m128d operator()(const __m128d a, const __m128d b) const { return _mm_max_pd(a, b); } @@ -458,7 +458,7 @@ public: #ifndef PADDLE_TYPE_DOUBLE template <> class add { -public: + public: INLINE float32x4_t operator()(const float32x4_t a, const float32x4_t b) const { return vaddq_f32(a, b); @@ -467,11 +467,11 @@ public: template <> class add_scale { -private: + private: const float32x4_t p1; const float32x4_t p2; -public: + public: INLINE add_scale(const float32x4_t s1, const float32x4_t s2) : p1(s1), p2(s2) {} INLINE float32x4_t operator()(const float32x4_t a, @@ -482,7 +482,7 @@ public: template <> class sub { -public: + public: INLINE float32x4_t operator()(const float32x4_t a, const float32x4_t b) const { return 
vsubq_f32(a, b);
@@ -491,7 +491,7 @@ public:

 template <>
 class mul<float32x4_t> {
-public:
+ public:
  INLINE float32x4_t operator()(const float32x4_t a,
                                const float32x4_t b) const {
    return vmulq_f32(a, b);
@@ -500,7 +500,7 @@ public:

 template <>
 class div<float32x4_t> {
-public:
+ public:
  INLINE float32x4_t operator()(const float32x4_t a,
                                const float32x4_t b) const {
    float32x4_t tmp = vrecpeq_f32(b);
@@ -510,7 +510,7 @@ public:

 template <>
 class min<float32x4_t> {
-public:
+ public:
  INLINE float32x4_t operator()(const float32x4_t a,
                                const float32x4_t b) const {
    return vminq_f32(a, b);
@@ -519,7 +519,7 @@ public:

 template <>
 class max<float32x4_t> {
-public:
+ public:
  INLINE float32x4_t operator()(const float32x4_t a,
                                const float32x4_t b) const {
    return vmaxq_f32(a, b);
diff --git a/paddle/cuda/src/hl_cuda_lstm.cu b/paddle/cuda/src/hl_cuda_lstm.cu
index e30fcddffdf99417a4b9b811a0b0cb0a12e79b99..b8c4e433a118fb1c5af753751f91c34543b1114c 100644
--- a/paddle/cuda/src/hl_cuda_lstm.cu
+++ b/paddle/cuda/src/hl_cuda_lstm.cu
@@ -30,7 +30,7 @@ bool hl_lstm_sequence_parallel(int frameSize) {
 }

 class frameValue {
-public:
+ public:
  real *value_;
  __device__ frameValue(real *value) : value_(value) {}
  template
diff --git a/paddle/fluid/framework/data_device_transform.cc b/paddle/fluid/framework/data_device_transform.cc
index a876725ac0f17838458065c4b4753a03e2812801..6bcfc6cd55f02f0d4f0f6e3170e7cc19ce666a28 100644
--- a/paddle/fluid/framework/data_device_transform.cc
+++ b/paddle/fluid/framework/data_device_transform.cc
@@ -16,31 +16,25 @@ limitations under the License. */
 namespace paddle {
 namespace framework {

-static const platform::DeviceContext* GetDeviceContext(
-    const platform::Place& src_place, const platform::Place& dst_place) {
-  platform::DeviceContextPool& pool = platform::DeviceContextPool::Instance();
-
-  if (platform::is_gpu_place(src_place) && platform::is_cpu_place(dst_place)) {
-    return pool.Get(src_place);
-  } else if (platform::is_cpu_place(src_place) &&
-             platform::is_gpu_place(dst_place)) {
-    return pool.Get(dst_place);
-  } else {
-    PADDLE_THROW(
-        "Currently, model parallelism is only supported between CPU and CUDA");
-  }
-}
-
-void TransDataDevice(const Tensor& in, const platform::Place& dst_place,
-                     Tensor* out) {
+void TransDataDevice(const Tensor &in, const platform::Place &dst_place,
+                     Tensor *out) {
  VLOG(3) << "DeviceTransform in, src_place " << in.place()
          << " dst_place: " << dst_place;
-  auto* dev_ctx = GetDeviceContext(in.place(), dst_place);
-  TensorCopy(in, dst_place, *dev_ctx, out);
-  if (platform::is_gpu_place(in.place()) && platform::is_cpu_place(dst_place)) {
-    dev_ctx->Wait();
-  }
+  PADDLE_ENFORCE_NE(
+      in.place().which(), dst_place.which(),
+      "Currently, model parallelism is only supported between CPU and CUDA");
+
+  // FIXME(zcd): TransDataDevice is used to transform data from GPU to CPU and
+  // the enforced checks have been done in GetDeviceContext, so the
+  // `dev_ctx->Wait()` is necessary. But `dev_ctx->Wait()` will make the program
+  // slow, especially when the number of elements is small; for example, the
+  // learning rate has only one element and it lives on the CPU side.
+  // One solution is to use a CUDA kernel to complete the copy operation when
+  // the transform is from CPU to GPU and the number of elements is small.
+  // But the embarrassment is that this solution makes training slower.
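The data_device_transform.cc hunk above replaces the asynchronous `TensorCopy` plus conditional `dev_ctx->Wait()` pair with a single blocking call; the `TensorCopySync` statement follows below. A minimal sketch of the two call patterns, using only the signatures visible in this patch (the surrounding variables are stand-ins for the ones in the removed code):

```cpp
// Before: enqueue an asynchronous copy on the device context's stream, then
// block only when the destination is CPU, so the caller never reads the
// output before the copy lands.
TensorCopy(in, dst_place, *dev_ctx, out);
if (platform::is_gpu_place(in.place()) && platform::is_cpu_place(dst_place)) {
  dev_ctx->Wait();
}

// After: one call that performs the copy and synchronizes internally. The
// cost is that it always blocks -- tiny tensors such as a one-element
// learning rate pay the synchronization too, which is what the FIXME above
// is weighing.
TensorCopySync(in, dst_place, out);
```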
+ TensorCopySync(in, dst_place, out); } } // namespace framework diff --git a/paddle/fluid/framework/details/CMakeLists.txt b/paddle/fluid/framework/details/CMakeLists.txt index b69de2ced03569d5e9ffe313527ab776ee798496..1bcd8412eb2d618b923bcd0557d118af62271f4a 100644 --- a/paddle/fluid/framework/details/CMakeLists.txt +++ b/paddle/fluid/framework/details/CMakeLists.txt @@ -3,7 +3,7 @@ cc_library(op_handle_base SRCS op_handle_base.cc DEPS var_handle device_context cc_library(scale_loss_grad_op_handle SRCS scale_loss_grad_op_handle.cc DEPS op_handle_base scope lod_tensor ddim memory) cc_library(fetch_op_handle SRCS fetch_op_handle.cc DEPS op_handle_base scope lod_tensor ddim memory) cc_library(computation_op_handle SRCS computation_op_handle.cc DEPS framework_proto scope place operator op_registry) -cc_library(send_op_handle SRCS send_op_handle.cc DEPS framework_proto scope place operator op_registry) +cc_library(rpc_op_handle SRCS rpc_op_handle.cc DEPS framework_proto scope place operator op_registry) cc_library(ssa_graph SRCS ssa_graph.cc DEPS var_handle op_handle_base) cc_library(ssa_graph_builder SRCS ssa_graph_builder.cc DEPS ssa_graph) @@ -26,7 +26,7 @@ endif() cc_library(gather_op_handle SRCS gather_op_handle.cc DEPS op_handle_base scope ddim memory variable_visitor) cc_library(multi_devices_graph_builder SRCS multi_devices_graph_builder.cc DEPS ssa_graph_builder computation_op_handle - scale_loss_grad_op_handle send_op_handle ${multi_devices_graph_builder_deps} reduce_op_handle broadcast_op_handle) + scale_loss_grad_op_handle rpc_op_handle ${multi_devices_graph_builder_deps} reduce_op_handle broadcast_op_handle) cc_library(ssa_graph_executor SRCS ssa_graph_executor.cc DEPS ssa_graph framework_proto) cc_library(threaded_ssa_graph_executor SRCS threaded_ssa_graph_executor.cc DEPS fetch_op_handle ssa_graph_executor scope diff --git a/paddle/fluid/framework/details/multi_devices_graph_builder.cc b/paddle/fluid/framework/details/multi_devices_graph_builder.cc index 35d23d68c0dd26a05544a72316d5764129aa8d40..d8e711994c5dba15ce0a1c237558b121888902e3 100644 --- a/paddle/fluid/framework/details/multi_devices_graph_builder.cc +++ b/paddle/fluid/framework/details/multi_devices_graph_builder.cc @@ -12,12 +12,13 @@ // See the License for the specific language governing permissions and // limitations under the License. 
#include "paddle/fluid/framework/details/multi_devices_graph_builder.h" +#include #include #include "paddle/fluid/framework/details/broadcast_op_handle.h" #include "paddle/fluid/framework/details/computation_op_handle.h" #include "paddle/fluid/framework/details/reduce_op_handle.h" +#include "paddle/fluid/framework/details/rpc_op_handle.h" #include "paddle/fluid/framework/details/scale_loss_grad_op_handle.h" -#include "paddle/fluid/framework/details/send_op_handle.h" #include "paddle/fluid/framework/op_info.h" #include "paddle/fluid/framework/scope.h" @@ -28,6 +29,10 @@ #include #include +DEFINE_string(ssa_graph_path, "/tmp/ssa_graph.dot", + "the ssa graph path only print with GLOG_v=10," + "default /tmp/graph.dot"); + namespace paddle { namespace framework { namespace details { @@ -79,9 +84,44 @@ void MultiDevSSAGraphBuilder::CreateOpHandleIOs(SSAGraph *result, } } -bool MultiDevSSAGraphBuilder::IsDistTrainOp(const OpDesc &op, - OpDesc *send_op) const { - if (send_op == nullptr) { +std::vector MultiDevSSAGraphBuilder::FindDistTrainSendVars( + const ProgramDesc &program) const { + std::vector send_vars; + // since parameters are all in block 0, + // it's enough to only scan send ops in block 0 + for (auto *op : program.Block(0).AllOps()) { + // TODO(Yancey1989): use a graceful method to find send op, + // instead of the the hard code string + if (op->Type() == "send_vars") { + auto op_vars = op->InputArgumentNames(); + send_vars.reserve(send_vars.size() + + std::distance(op_vars.begin(), op_vars.end())); + send_vars.insert(send_vars.end(), op_vars.begin(), op_vars.end()); + } + } + return send_vars; +} + +std::vector MultiDevSSAGraphBuilder::FindDistTrainRecvVars( + const ProgramDesc &program) const { + std::vector recv_vars; + for (auto *op : program.Block(0).AllOps()) { + // TODO(Yancey1989): use a graceful method to find recv op, + // instead of the hard code string + if (op->Type() == "recv") { + auto op_vars = op->OutputArgumentNames(); + recv_vars.reserve(recv_vars.size() + + std::distance(op_vars.begin(), op_vars.end())); + recv_vars.insert(recv_vars.end(), op_vars.begin(), op_vars.end()); + } + } + return recv_vars; +} + +bool MultiDevSSAGraphBuilder::IsDistTrainOp( + const OpDesc &op, const std::vector &send_vars, + const std::vector &recv_vars) const { + if (send_vars.size() == 0 || recv_vars.size() == 0) { return false; } @@ -89,22 +129,21 @@ bool MultiDevSSAGraphBuilder::IsDistTrainOp(const OpDesc &op, * Check any of opvars contains `.block` and in sendvars */ auto checker = [](const std::vector &opvars, - const std::vector &sendvars) -> bool { + const std::vector &rpc_vars) -> bool { for (auto &var : opvars) { + // a variable name with the suffix `.block` means it's a splited + // variable by (DistributeTranspiler) + // [python/paddle/fluid/transpiler/distribute_transpiler.py] if (var.find(".block") != std::string::npos && - std::find(sendvars.begin(), sendvars.end(), var) != sendvars.end()) { + std::find(rpc_vars.begin(), rpc_vars.end(), var) != rpc_vars.end()) { return true; } } return false; }; - if (op.Type() == "split" || op.Type() == "split_byref") { - return checker(op.OutputArgumentNames(), send_op->InputArgumentNames()); - } else if (op.Type() == "concat") { - return checker(op.InputArgumentNames(), send_op->OutputArgumentNames()); - } - return false; + return checker(op.OutputArgumentNames(), send_vars) || + checker(op.InputArgumentNames(), recv_vars); } std::unique_ptr MultiDevSSAGraphBuilder::Build( @@ -123,8 +162,10 @@ std::unique_ptr 
MultiDevSSAGraphBuilder::Build(
          std::unordered_map>>>(
          places_.size());
-  // Find "send" op first for split is in front of send.
-  OpDesc *send_op = GetSendOpDesc(program);
+  // find send/recv vars so that we can place the distributed training
+  // related ops on place 0
+  auto send_vars = FindDistTrainSendVars(program);
+  auto recv_vars = FindDistTrainRecvVars(program);

  size_t cur_device_id = 0;
  std::vector> var_name_on_devices;
@@ -134,12 +175,14 @@ std::unique_ptr<SSAGraph> MultiDevSSAGraphBuilder::Build(
  bool is_forwarding = true;

  for (auto *op : program.Block(0).AllOps()) {
-    if (op->Type() == "send") {
-      // append send op if program is distributed trainer main program.
+    if (boost::get<int>(
+            op->GetAttr(OpProtoAndCheckerMaker::OpRoleAttrName())) ==
+        static_cast<int>(OpRole::kRPC)) {
+      // append rpc op if the program is a distributed trainer main program.
       // always use the first device
-      CreateSendOp(&result, *op);
-    } else if (IsDistTrainOp(*op, send_op)) {
-      CreateComputationalOps(&result, *op, 1);
+      CreateRPCOp(&result, *op);
+    } else if (IsDistTrainOp(*op, send_vars, recv_vars)) {
+      CreateDistTrainOp(&result, *op);
     } else if (IsScaleLossOp(*op)) {
       // user can customize loss@grad if not use_default_grad_scale_
       if (strategy_.gradient_scale_ !=
@@ -218,9 +261,8 @@ std::unique_ptr<SSAGraph> MultiDevSSAGraphBuilder::Build(
  AddOutputToLeafOps(&result);

  if (VLOG_IS_ON(10)) {
-    std::ostringstream sout;
-    PrintGraphviz(*graph, sout);
-    VLOG(10) << sout.str();
+    std::ofstream fout(FLAGS_ssa_graph_path);
+    PrintGraphviz(*graph, fout);
  }

  return std::unique_ptr<SSAGraph>(graph);
@@ -270,15 +312,6 @@ void MultiDevSSAGraphBuilder::CreateComputationalOp(SSAGraph *result,
  CreateOpHandleIOs(result, op, dev_id);
 }

-OpDesc *MultiDevSSAGraphBuilder::GetSendOpDesc(
-    const ProgramDesc &program) const {
-  for (auto *op : program.Block(0).AllOps()) {
-    if (op->Type() == "send") {
-      return op;
-    }
-  }
-  return nullptr;
-}
 void MultiDevSSAGraphBuilder::InsertNCCLAllReduceOp(
     SSAGraph *result, const std::string &og) const {
 #ifdef PADDLE_WITH_CUDA
@@ -401,14 +434,48 @@ VarHandle *MultiDevSSAGraphBuilder::CreateReduceOp(SSAGraph *result,
  return var;
 }

-void MultiDevSSAGraphBuilder::CreateSendOp(SSAGraph *result,
-                                           const OpDesc &op) const {
+void MultiDevSSAGraphBuilder::ConnectOp(SSAGraph *result, OpHandleBase *op,
+                                        const std::string &prev_op_name) const {
+  for (auto &prev_op : result->ops_) {
+    if (prev_op->Name() == prev_op_name) {
+      auto *dep_var = new DummyVarHandle();
+      prev_op->AddOutput(dep_var);
+      result->dep_vars_.emplace(dep_var);
+      op->AddInput(dep_var);
+    }
+  }
+}
+
+void MultiDevSSAGraphBuilder::CreateDistTrainOp(SSAGraph *result,
+                                                const OpDesc &op) const {
+  CreateComputationalOp(result, op, 0);
+  if (op.Type() == "concat") {
+    ConnectOp(result, result->ops_.back().get(), "fetch_barrier");
+  }
+}
+
+void MultiDevSSAGraphBuilder::CreateRPCOp(SSAGraph *result,
+                                          const OpDesc &op) const {
  auto &p = places_[0];
  auto *s = local_scopes_[0];
-  // FIXME(wuyi): send op always copy from GPU 0
-  result->ops_.emplace_back(new SendOpHandle(op, s, p));
-  // Create inputs for output on original place and no ssa output
-  // is created for send op.
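`Build()` now recognizes RPC ops by their role attribute instead of the hard-coded `"send"` type string. A sketch of that predicate in isolation, assuming the role attribute is stored as a plain `int` as the `static_cast<int>` comparison in the hunk suggests (the function name here is hypothetical):

```cpp
// Sketch: classify an op by its role attribute rather than its type string.
bool IsRPCOp(const framework::OpDesc &op) {
  int role = boost::get<int>(
      op.GetAttr(framework::OpProtoAndCheckerMaker::OpRoleAttrName()));
  return role == static_cast<int>(framework::OpRole::kRPC);
}
```

This is why the op_proto_maker.h change later in this patch adds `OpRole::kRPC` to the role enum and to the checker's `InEnum` list. The `CreateRPCOp` body continues below.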
+  result->ops_.emplace_back(new RPCOpHandle(op, s, p, op.Type()));
+
+  if (op.Type() == "send_barrier") {
+    ConnectOp(result, result->ops_.back().get(), "send_vars");
+  } else if (op.Type() == "recv") {
+    ConnectOp(result, result->ops_.back().get(), "send_barrier");
+  } else if (op.Type() == "fetch_barrier") {
+    ConnectOp(result, result->ops_.back().get(), "recv");
+  } else if (op.Type() == "send_vars") {
+    // do nothing
+  } else {
+    PADDLE_THROW(
+        "rpc op should be in ["
+        "send_vars, send_barrier, recv, fetch_barrier]");
+  }
+
+  // TODO(Yancey1989): scheduling rpc ops on different places may
+  // increase throughput
  CreateOpHandleIOs(result, op, 0);
 }

diff --git a/paddle/fluid/framework/details/multi_devices_graph_builder.h b/paddle/fluid/framework/details/multi_devices_graph_builder.h
index 4f708521884247fc013f0ae336ab683c3fe7ef2f..e07597dbd80889c366babe79455beb12c9eb80d9 100644
--- a/paddle/fluid/framework/details/multi_devices_graph_builder.h
+++ b/paddle/fluid/framework/details/multi_devices_graph_builder.h
@@ -64,12 +64,24 @@ class MultiDevSSAGraphBuilder : public SSAGraphBuilder {

  bool IsScaleLossOp(const OpDesc &op) const;

-  void CreateSendOp(SSAGraph *result, const OpDesc &op) const;
+  void CreateRPCOp(SSAGraph *result, const OpDesc &op) const;
+  void CreateDistTrainOp(SSAGraph *result, const OpDesc &op) const;

  /**
   * Is this operator as the end-point operator before/after send operator.
   */
-  bool IsDistTrainOp(const OpDesc &op, OpDesc *send_op) const;
+  bool IsDistTrainOp(const OpDesc &op,
+                     const std::vector<std::string> &send_vars,
+                     const std::vector<std::string> &recv_vars) const;
+
+  std::vector<std::string> FindDistTrainSendVars(
+      const ProgramDesc &program) const;
+
+  std::vector<std::string> FindDistTrainRecvVars(
+      const ProgramDesc &program) const;
+
+  void ConnectOp(SSAGraph *result, OpHandleBase *op,
+                 const std::string &prev_op_name) const;

  void CreateComputationalOps(SSAGraph *result, const OpDesc &op,
                              size_t num_places) const;
@@ -93,12 +105,6 @@ class MultiDevSSAGraphBuilder : public SSAGraphBuilder {
  void CreateBroadcastOp(SSAGraph *result, const std::string &p_name,
                         size_t src_dev_id) const;

-  /**
-   * Get send op in the global block of program.
-   * nullptr if not found.
-   */
-  OpDesc *GetSendOpDesc(const ProgramDesc &program) const;
-
  bool IsSparseGradient(
      const std::unordered_map &var_types,
      const std::string &og) const;
diff --git a/paddle/fluid/framework/details/send_op_handle.cc b/paddle/fluid/framework/details/rpc_op_handle.cc
similarity index 75%
rename from paddle/fluid/framework/details/send_op_handle.cc
rename to paddle/fluid/framework/details/rpc_op_handle.cc
index 7109659dd7001f91e7674ac7bebbe3a59794cfc0..7f4da4c01de1010467d839ee5490c5e0d02d8c24 100644
--- a/paddle/fluid/framework/details/send_op_handle.cc
+++ b/paddle/fluid/framework/details/rpc_op_handle.cc
@@ -12,24 +12,26 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.
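`CreateRPCOp` above serializes the RPC stages by threading `DummyVarHandle` dependencies between op handles, so the scheduler always runs them in the order send_vars, send_barrier, recv, fetch_barrier. A condensed view of the chaining, with each call shown next to the op type being appended (a reading aid, not additional code):

```cpp
// op.Type() == "send_vars":     no ConnectOp; it depends only on its inputs.
// op.Type() == "send_barrier":  runs after every "send_vars" handle.
ConnectOp(result, result->ops_.back().get(), "send_vars");
// op.Type() == "recv":          runs after every "send_barrier" handle.
ConnectOp(result, result->ops_.back().get(), "send_barrier");
// op.Type() == "fetch_barrier": runs after every "recv" handle.
ConnectOp(result, result->ops_.back().get(), "recv");
```

`CreateDistTrainOp` uses the same mechanism to make a trainer-side `concat` wait for `fetch_barrier`, so parameter blocks are only concatenated after all of them have been fetched.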
-#include "paddle/fluid/framework/details/send_op_handle.h" +#include "paddle/fluid/framework/details/rpc_op_handle.h" namespace paddle { namespace framework { namespace details { -SendOpHandle::SendOpHandle(const framework::OpDesc &op_desc, - const Scope *local_scope, - const platform::Place &place) +RPCOpHandle::RPCOpHandle(const framework::OpDesc &op_desc, + const Scope *local_scope, const platform::Place &place, + const std::string &name) : op_(framework::OpRegistry::CreateOp(op_desc)), local_scope_(local_scope), - place_(place) {} + place_(place), + name_(name) {} -void SendOpHandle::RunImpl() { +void RPCOpHandle::RunImpl() { // TODO(wuyi): need further analysis whether wait VarDummyHandle. // Wait input done for (auto *in : inputs_) { auto &p = static_cast(in)->place_; + // FIXME(Yancey1989): need a better solution instead of use DebugString() if (in->DebugString() == "dummy") { // HACK continue; } @@ -43,7 +45,7 @@ void SendOpHandle::RunImpl() { op_->Run(*tmp_scope, place_); } -std::string SendOpHandle::Name() const { return "send"; } +std::string RPCOpHandle::Name() const { return name_; } } // namespace details } // namespace framework } // namespace paddle diff --git a/paddle/fluid/framework/details/send_op_handle.h b/paddle/fluid/framework/details/rpc_op_handle.h similarity index 87% rename from paddle/fluid/framework/details/send_op_handle.h rename to paddle/fluid/framework/details/rpc_op_handle.h index 2f78811fad50642b5e45776c41910df6f4cc48f6..d28b7721720d808a8d81701c3811eae16121fb41 100644 --- a/paddle/fluid/framework/details/send_op_handle.h +++ b/paddle/fluid/framework/details/rpc_op_handle.h @@ -27,9 +27,9 @@ namespace paddle { namespace framework { namespace details { -struct SendOpHandle : public OpHandleBase { - SendOpHandle(const framework::OpDesc& op_desc, const Scope* local_scope, - const platform::Place& place); +struct RPCOpHandle : public OpHandleBase { + RPCOpHandle(const framework::OpDesc& op_desc, const Scope* local_scope, + const platform::Place& place, const std::string& name); std::string Name() const override; @@ -44,6 +44,7 @@ struct SendOpHandle : public OpHandleBase { std::unique_ptr op_; const Scope* local_scope_; const platform::Place& place_; + const std::string name_; }; } // namespace details diff --git a/paddle/fluid/framework/executor.cc b/paddle/fluid/framework/executor.cc index 4e431561f81b2a84c06dff9fcb041317ebc84ae3..863053c32b190f4e8497b16f3edd76cb2f76168b 100644 --- a/paddle/fluid/framework/executor.cc +++ b/paddle/fluid/framework/executor.cc @@ -24,9 +24,6 @@ limitations under the License. */ #include "paddle/fluid/platform/profiler.h" DECLARE_bool(benchmark); -DEFINE_bool(check_nan_inf, false, - "Checking whether operator produce NAN/INF or not. 
It will be " - "extremely slow so please use this flag wisely."); namespace paddle { namespace framework { @@ -78,21 +75,6 @@ void InitializeVariable(Variable* var, proto::VarType::Type var_type) { } } -static void CheckTensorNANOrInf(const std::string& name, - const framework::Tensor& tensor) { - if (tensor.memory_size() == 0) { - return; - } - if (tensor.type().hash_code() != typeid(float).hash_code() && // NOLINT - tensor.type().hash_code() != typeid(double).hash_code()) { // NOLINT - return; - } - PADDLE_ENFORCE(!framework::TensorContainsInf(tensor), - "Tensor %s contains Inf", name); - PADDLE_ENFORCE(!framework::TensorContainsNAN(tensor), - "Tensor %s contains NAN", name); -} - void Executor::CreateVariables(const ProgramDesc& pdesc, Scope* scope, int block_id) { auto& global_block = pdesc.Block(block_id); @@ -340,15 +322,6 @@ void Executor::RunPreparedContext(ExecutorPrepareContext* ctx, Scope* scope, VLOG(2) << "Memory used after operator " + op->Type() + " running: " << memory::memory_usage(place_); } - if (FLAGS_check_nan_inf) { - for (auto& vname : op->OutputVars(true)) { - auto* var = local_scope->FindVar(vname); - if (var == nullptr) continue; - if (var->IsType()) { - CheckTensorNANOrInf(vname, var->Get()); - } - } - } } platform::DeviceContextPool::Instance().Get(place_)->Wait(); if (create_vars && create_local_scope) { diff --git a/paddle/fluid/framework/op_proto_maker.cc b/paddle/fluid/framework/op_proto_maker.cc index 5a4380a83a2e5bf492098032cd9de7bf274fe47e..ae9f4efd44acdcdff2806deea6826e4089459a78 100644 --- a/paddle/fluid/framework/op_proto_maker.cc +++ b/paddle/fluid/framework/op_proto_maker.cc @@ -66,7 +66,7 @@ void OpProtoAndCheckerMaker::operator()(proto::OpProto* proto, .InEnum( {static_cast(OpRole::kForward), static_cast(OpRole::kBackward), - static_cast(OpRole::kOptimize), + static_cast(OpRole::kOptimize), static_cast(OpRole::kRPC), static_cast(OpRole::kLoss) | static_cast(OpRole::kForward), static_cast(OpRole::kLoss) | static_cast(OpRole::kBackward), diff --git a/paddle/fluid/framework/op_proto_maker.h b/paddle/fluid/framework/op_proto_maker.h index 9bd6ca6ea32734707a5c37b3ecfe449436c04c8c..8493b9d8b326c71a33b95bf95e5fc1743c686eb7 100644 --- a/paddle/fluid/framework/op_proto_maker.h +++ b/paddle/fluid/framework/op_proto_maker.h @@ -24,6 +24,7 @@ enum class OpRole { kForward = 0x0000, kBackward = 0x0001, kOptimize = 0x0002, + kRPC = 0x0003, kLoss = 0x0100, // The default value of op's role. This should be only used for unittests and diff --git a/paddle/fluid/framework/operator.cc b/paddle/fluid/framework/operator.cc index d70f26026c28867e592a9f8e37cc53e6c1d6d85e..e3d2e5377eac49003b0082c39c9dd0460e2acd92 100644 --- a/paddle/fluid/framework/operator.cc +++ b/paddle/fluid/framework/operator.cc @@ -24,6 +24,9 @@ limitations under the License. */ #include "paddle/fluid/platform/profiler.h" DECLARE_bool(benchmark); +DEFINE_bool(check_nan_inf, false, + "Checking whether operator produce NAN/INF or not. 
It will be " + "extremely slow so please use this flag wisely."); namespace paddle { namespace framework { @@ -513,6 +516,21 @@ class RuntimeInferShapeContext : public InferShapeContext { const Scope& scope_; }; +static void CheckTensorNANOrInf(const std::string& name, + const framework::Tensor& tensor) { + if (tensor.memory_size() == 0) { + return; + } + if (tensor.type().hash_code() != typeid(float).hash_code() && // NOLINT + tensor.type().hash_code() != typeid(double).hash_code()) { // NOLINT + return; + } + PADDLE_ENFORCE(!framework::TensorContainsInf(tensor), + "Tensor %s contains Inf", name); + PADDLE_ENFORCE(!framework::TensorContainsNAN(tensor), + "Tensor %s contains NAN", name); +} + void OperatorWithKernel::RunImpl(const Scope& scope, const platform::Place& place) const { RuntimeInferShapeContext infer_shape_ctx(*this, scope); @@ -597,6 +615,16 @@ void OperatorWithKernel::RunImpl(const Scope& scope, if (FLAGS_benchmark) { new_dev_ctx->Wait(); } + + if (FLAGS_check_nan_inf) { + for (auto& vname : OutputVars(true)) { + auto* var = new_scope.FindVar(vname); + if (var == nullptr) continue; + if (var->IsType()) { + CheckTensorNANOrInf(vname, var->Get()); + } + } + } } proto::VarType::Type OperatorWithKernel::IndicateDataType( diff --git a/paddle/fluid/framework/selected_rows.cc b/paddle/fluid/framework/selected_rows.cc index 56cf6693caf4529d6e157e6e9a0d5c27d05ee0c3..b4168f38949c7fcb057ec8c5c562d0529a6d9e48 100644 --- a/paddle/fluid/framework/selected_rows.cc +++ b/paddle/fluid/framework/selected_rows.cc @@ -121,24 +121,29 @@ bool SelectedRows::HasKey(int64_t key) const { } std::vector> SelectedRows::Get( - std::vector keys, framework::Tensor* value) const { + const std::vector& keys, framework::Tensor* value) const { PADDLE_ENFORCE(value->IsInitialized(), "The value tensor should be initialized."); std::vector> non_keys_pair; - int64_t value_width = value_->numel() / value_->dims()[0]; - PADDLE_ENFORCE_EQ(value_width, value->numel() / value->dims()[0], - "output tensor should have the same shape with table " - "execpt the dims[0]."); - - for (size_t i = 0; i < keys.size(); ++i) { - int64_t index = Index(keys[i]); - if (index == -1) { - non_keys_pair.push_back(std::make_pair(keys[i], static_cast(i))); - } else { - framework::VisitDataType( - framework::ToDataType(value_->type()), - TensorCopyVisitor(value, i * value_width, *value_.get(), - index * value_width, value_width)); + if (keys.empty()) { + VLOG(3) << "keys is empty, please check data!"; + } else { + int64_t value_width = value_->numel() / value_->dims()[0]; + PADDLE_ENFORCE_EQ(value_width, value->numel() / value->dims()[0], + "output tensor should have the same shape with table " + "except the dims[0]."); + + for (size_t i = 0; i < keys.size(); ++i) { + int64_t index = Index(keys[i]); + if (index == -1) { + non_keys_pair.push_back( + std::make_pair(keys[i], static_cast(i))); + } else { + framework::VisitDataType( + framework::ToDataType(value_->type()), + TensorCopyVisitor(value, i * value_width, *value_.get(), + index * value_width, value_width)); + } } } return non_keys_pair; diff --git a/paddle/fluid/framework/selected_rows.h b/paddle/fluid/framework/selected_rows.h index c27c927ee751c4392840bfb71f4814991b23a8c9..c80b05eed9b1c50325316057a8afc26d5d52e82c 100644 --- a/paddle/fluid/framework/selected_rows.h +++ b/paddle/fluid/framework/selected_rows.h @@ -82,7 +82,7 @@ class SelectedRows { * @return a list of pair which contains the non-exists key and the index in * the value */ - std::vector> Get(std::vector keys, + 
std::vector<std::pair<int64_t, size_t>> Get(const std::vector<int64_t>& keys,
                                              framework::Tensor* value) const;

  /*
diff --git a/paddle/fluid/inference/analysis/data_flow_graph_tester.cc b/paddle/fluid/inference/analysis/data_flow_graph_tester.cc
index 51d38d6251d853fa8a02a4e22f819cfc44294453..9d7cceeb65888b8ba3fdf39e88fc2877abd82d11 100644
--- a/paddle/fluid/inference/analysis/data_flow_graph_tester.cc
+++ b/paddle/fluid/inference/analysis/data_flow_graph_tester.cc
@@ -35,7 +35,7 @@ TEST(DataFlowGraph, BFS) {
  GraphTraits<DataFlowGraph> trait(&dfg);
  auto nodes = trait.nodes();
-  int count = 0;
+  size_t count = 0;
  for (auto it = nodes.begin(); it != nodes.end(); ++it) {
    LOG(INFO) << "visiting " << it->name();
    ++count;
@@ -49,7 +49,7 @@ TEST(DataFlowGraph, DFS) {
  dfg.Build();
  GraphTraits<DataFlowGraph> trait(&dfg);
  auto nodes = trait.nodes_in_DFS();
-  int count = 0;
+  size_t count = 0;
  for (auto it = nodes.begin(); it != nodes.end(); ++it) {
    LOG(INFO) << "visiting " << it->name();
    ++count;
diff --git a/paddle/fluid/inference/analysis/helper.h b/paddle/fluid/inference/analysis/helper.h
index ea39ba4ddb5e8d5d6cce9b116ab968764e578c26..24ea9a4bae7132eb1692b0ffb02f8ab5e02b21a9 100644
--- a/paddle/fluid/inference/analysis/helper.h
+++ b/paddle/fluid/inference/analysis/helper.h
@@ -24,6 +24,15 @@ namespace paddle {
 namespace inference {
 namespace analysis {

+template <typename Vec>
+int AccuDims(Vec &&vec, int size) {
+  int res = 1;
+  for (int i = 0; i < size; i++) {
+    res *= std::forward<Vec>(vec)[i];
+  }
+  return res;
+}
+
 #define SET_TYPE(type__) dic_[typeid(type__).hash_code()] = #type__;
 /*
  * Map typeid to representation.
@@ -101,7 +110,5 @@ class OrderedRegistry {
 }  // namespace paddle

 #define PADDLE_DISALLOW_COPY_AND_ASSIGN(type__) \
-  \
  type__(const type__ &) = delete;              \
-  \
  void operator=(const type__ &) = delete;
diff --git a/paddle/fluid/inference/tensorrt/convert/CMakeLists.txt b/paddle/fluid/inference/tensorrt/convert/CMakeLists.txt
index 7cd777de27e9457260a1b2f5936dc917f0821984..5ada1d631269209e912e2d4817382ea2c6c67353 100644
--- a/paddle/fluid/inference/tensorrt/convert/CMakeLists.txt
+++ b/paddle/fluid/inference/tensorrt/convert/CMakeLists.txt
@@ -1,7 +1,10 @@
-nv_test(test_op_converter SRCS test_op_converter.cc mul_op.cc conv2d_op.cc DEPS ${FLUID_CORE_MODULES})
+# Add TRT tests
+nv_test(test_op_converter SRCS test_op_converter.cc mul_op.cc conv2d_op.cc DEPS ${FLUID_CORE_MODULES} tensorrt_engine)
 # This test is not stable
 # See https://paddleci.ngrok.io/viewLog.html?tab=buildLog&buildTypeId=Paddle_PrCi2&buildId=36834&_focus=8828
 #nv_test(test_trt_activation_op SRCS test_activation_op.cc activation_op.cc io_converter.cc
 #        DEPS ${FLUID_CORE_MODULES} activation_op tensorrt_engine
 #        SERIAL)
 nv_test(test_io_converter SRCS test_io_converter.cc io_converter.cc DEPS dynload_cuda dynamic_loader lod_tensor)
+nv_test(test_trt_mul_op SRCS test_mul_op.cc mul_op.cc
+        DEPS ${FLUID_CORE_MODULES} tensorrt_engine mul_op SERIAL)
diff --git a/paddle/fluid/inference/tensorrt/convert/mul_op.cc b/paddle/fluid/inference/tensorrt/convert/mul_op.cc
index 3ca58b139bd3af1947ae7f063060e11d2ea7d577..ed09f54bde00d12aaec829ba90cc08ebfef57e92 100644
--- a/paddle/fluid/inference/tensorrt/convert/mul_op.cc
+++ b/paddle/fluid/inference/tensorrt/convert/mul_op.cc
@@ -18,11 +18,25 @@ namespace paddle {
 namespace inference {
 namespace tensorrt {

+/*
+ * MulOp, IMatrixMultiplyLayer in TRT. This layer doesn't have weights.
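The `AccuDims` helper added to analysis/helper.h above folds the first `size` extents of any indexable dims object into an element count; the TensorRT changes below use it for buffer sizing. A quick usage sketch with a plain array:

```cpp
int dims[3] = {2, 3, 4};
int elements = AccuDims(dims, 3);  // 1 * 2 * 3 * 4 = 24
```

Note the design difference from the `AccumDims` it replaces (removed from tensorrt/helper.h below): `AccuDims` returns 1 rather than 0 when `size` is 0, and it performs no positivity checks on the extents.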
+ */ class MulOpConverter : public OpConverter { public: MulOpConverter() {} void operator()(const framework::proto::OpDesc& op) override { - LOG(INFO) << "convert a fluid mul op to tensorrt fc layer without bias"; + VLOG(4) << "convert a fluid mul op to tensorrt fc layer without bias"; + + framework::OpDesc op_desc(op, nullptr, nullptr); + // Declare inputs + auto* input1 = engine_->GetITensor(op_desc.Input("X")[0]); + auto* input2 = engine_->GetITensor(op_desc.Input("Y")[0]); + // Both the input1 and input2 do not need transpose. + auto* layer = TRT_ENGINE_ADD_LAYER( + engine_, MatrixMultiply, *const_cast(input1), false, + *const_cast(input2), false); + + engine_->DeclareOutput(layer, 0, op_desc.Output("Out")[0]); } }; diff --git a/paddle/fluid/inference/tensorrt/convert/test_activation_op.cc b/paddle/fluid/inference/tensorrt/convert/test_activation_op.cc index ec33f97c8240dfc09a203d68599bffe78a4abb12..86ca2ca08eb14265e1bfe7abd5eb6af5c83b8a5c 100644 --- a/paddle/fluid/inference/tensorrt/convert/test_activation_op.cc +++ b/paddle/fluid/inference/tensorrt/convert/test_activation_op.cc @@ -102,3 +102,5 @@ TEST(OpConverter, ConvertRelu) { } // namespace tensorrt } // namespace inference } // namespace paddle + +USE_OP(activation); diff --git a/paddle/fluid/inference/tensorrt/convert/test_mul_op.cc b/paddle/fluid/inference/tensorrt/convert/test_mul_op.cc new file mode 100644 index 0000000000000000000000000000000000000000..d8b61d5f08ffd071c112b4677fcb6f6f50784bbc --- /dev/null +++ b/paddle/fluid/inference/tensorrt/convert/test_mul_op.cc @@ -0,0 +1,47 @@ +/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
*/ + +#include +#include "paddle/fluid/framework/op_registry.h" +#include "paddle/fluid/inference/tensorrt/convert/ut_helper.h" + +namespace paddle { +namespace inference { +namespace tensorrt { + +TEST(MulOpConverter, main) { + TRTConvertValidation validator(10, 1000); + validator.DeclInputVar("mul-X", nvinfer1::Dims2(10, 6)); + validator.DeclInputVar("mul-Y", nvinfer1::Dims2(6, 10)); + validator.DeclOutputVar("mul-Out", nvinfer1::Dims2(10, 10)); + + // Prepare Op description + framework::OpDesc desc; + desc.SetType("mul"); + desc.SetInput("X", {"mul-X"}); + desc.SetInput("Y", {"mul-Y"}); + desc.SetOutput("Out", {"mul-Out"}); + + LOG(INFO) << "set OP"; + validator.SetOp(*desc.Proto()); + LOG(INFO) << "execute"; + + validator.Execute(10); +} + +} // namespace tensorrt +} // namespace inference +} // namespace paddle + +USE_OP(mul); diff --git a/paddle/fluid/inference/tensorrt/convert/test_op_converter.cc b/paddle/fluid/inference/tensorrt/convert/test_op_converter.cc index 8d66543eb7637c5a8ae670b89ef5996954ba2e7b..9ae7de9cbfa656fbcbb48557bd4b548115897c6d 100644 --- a/paddle/fluid/inference/tensorrt/convert/test_op_converter.cc +++ b/paddle/fluid/inference/tensorrt/convert/test_op_converter.cc @@ -23,8 +23,6 @@ namespace tensorrt { TEST(OpConverter, ConvertBlock) { framework::ProgramDesc prog; auto* block = prog.MutableBlock(0); - auto* mul_op = block->AppendOp(); - mul_op->SetType("mul"); auto* conv2d_op = block->AppendOp(); conv2d_op->SetType("conv2d"); diff --git a/paddle/fluid/inference/tensorrt/convert/ut_helper.h b/paddle/fluid/inference/tensorrt/convert/ut_helper.h new file mode 100644 index 0000000000000000000000000000000000000000..37fcb5c50309db0ad0924a057a6b481750665531 --- /dev/null +++ b/paddle/fluid/inference/tensorrt/convert/ut_helper.h @@ -0,0 +1,156 @@ +/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. */ + +/* + * This file implements a UT framework to make the validation of transforming + * Fluid Op to TRT Layer. 
+ */
+
+#pragma once
+
+#include "paddle/fluid/framework/lod_tensor.h"
+#include "paddle/fluid/framework/op_registry.h"
+#include "paddle/fluid/inference/analysis/helper.h"
+#include "paddle/fluid/inference/tensorrt/convert/op_converter.h"
+#include "paddle/fluid/inference/tensorrt/engine.h"
+
+namespace paddle {
+namespace inference {
+namespace tensorrt {
+
+/*
+ * Get a random float value between [low, high]
+ */
+float random(float low, float high) {
+  static std::random_device rd;
+  static std::mt19937 mt(rd());
+  std::uniform_real_distribution<float> dist(low, high);
+  return dist(mt);
+}
+
+void RandomizeTensor(framework::LoDTensor* tensor, const platform::Place& place,
+                     const platform::DeviceContext& ctx) {
+  auto dims = tensor->dims();
+  size_t num_elements = analysis::AccuDims(dims, dims.size());
+  PADDLE_ENFORCE_GT(num_elements, 0);
+  auto* data = tensor->mutable_data<float>(place);
+  for (size_t i = 0; i < num_elements; i++) {
+    *(data + i) = random(0., 1.);
+  }
+}
+
+/*
+ * Helps validate the correctness of a Fluid Op against the corresponding TRT
+ * layer.
+ */
+class TRTConvertValidation {
+ public:
+  TRTConvertValidation() = delete;
+
+  TRTConvertValidation(int batch_size, int workspace_size = 1 << 10) {
+    // create engine.
+    engine_.reset(new TensorRTEngine(batch_size, workspace_size, &stream_));
+    engine_->InitNetwork();
+
+    PADDLE_ENFORCE_EQ(cudaStreamCreate(&stream_), 0);
+  }
+
+  // Declare a Variable as input with random initialization.
+  void DeclInputVar(const std::string& name, const nvinfer1::Dims& dims) {
+    DeclVar(name, dims);
+    // Declare TRT inputs.
+    engine_->DeclareInput(name, nvinfer1::DataType::kFLOAT, dims);
+  }
+
+  void DeclOutputVar(const std::string& name, const nvinfer1::Dims& dims) {
+    DeclVar(name, dims);
+  }
+
+  void DeclVar(const std::string& name, const nvinfer1::Dims& dims) {
+    platform::CPUPlace place;
+    platform::CPUDeviceContext ctx(place);
+
+    // Init Fluid tensor.
+    std::vector<int64_t> dim_vec(dims.nbDims);
+    for (int i = 0; i < dims.nbDims; i++) {
+      dim_vec[i] = dims.d[i];
+    }
+    auto* x = scope_.Var(name);
+    auto* x_tensor = x->GetMutable<framework::LoDTensor>();
+    x_tensor->Resize(framework::make_ddim(dim_vec));
+    RandomizeTensor(x_tensor, place, ctx);
+  }
+
+  void SetOp(const framework::proto::OpDesc& desc) {
+    op_ = framework::OpRegistry::CreateOp(desc);
+
+    OpConverter op_converter;
+    op_converter.ConvertOp(desc, engine_.get());
+
+    engine_->FreezeNetwork();
+
+    // Declare outputs.
+    op_desc_.reset(new framework::OpDesc(desc, nullptr, nullptr));
+
+    // Set Inputs.
+ for (const auto& input : op_desc_->InputArgumentNames()) { + auto* var = scope_.FindVar(input); + PADDLE_ENFORCE(var); + auto tensor = var->GetMutable(); + engine_->SetInputFromCPU( + input, static_cast(tensor->data()), + sizeof(float) * + analysis::AccuDims(tensor->dims(), tensor->dims().size())); + } + } + + void Execute(int batch_size) { + // Execute Fluid Op + // Execute TRT + platform::CPUPlace place; + platform::CPUDeviceContext ctx(place); + engine_->Execute(batch_size); + + op_->Run(scope_, place); + + ASSERT_FALSE(op_desc_->OutputArgumentNames().empty()); + for (const auto& output : op_desc_->OutputArgumentNames()) { + std::vector fluid_out; + std::vector trt_out(200); + engine_->GetOutputInCPU(output, &trt_out[0], 200 * sizeof(float)); + + auto* var = scope_.FindVar(output); + auto tensor = var->GetMutable(); + framework::TensorToVector(*tensor, ctx, &fluid_out); + // Compare two output + ASSERT_FALSE(fluid_out.empty()); + for (size_t i = 0; i < fluid_out.size(); i++) { + EXPECT_LT(std::abs(fluid_out[i] - trt_out[i]), 0.001); + } + } + } + + framework::Scope& scope() { return scope_; } + + private: + std::unique_ptr engine_; + cudaStream_t stream_; + framework::Scope scope_; + std::unique_ptr op_; + std::unique_ptr op_desc_; +}; + +} // namespace tensorrt +} // namespace inference +} // namespace paddle diff --git a/paddle/fluid/inference/tensorrt/engine.cc b/paddle/fluid/inference/tensorrt/engine.cc index 1c296e33a610493b889359c43629003fd76b893c..fb27c8394c1f94953093ed90627e63e6241130ed 100644 --- a/paddle/fluid/inference/tensorrt/engine.cc +++ b/paddle/fluid/inference/tensorrt/engine.cc @@ -18,6 +18,7 @@ limitations under the License. */ #include #include #include +#include "paddle/fluid/inference/analysis/helper.h" #include "paddle/fluid/inference/tensorrt/helper.h" #include "paddle/fluid/platform/enforce.h" @@ -71,9 +72,10 @@ void TensorRTEngine::FreezeNetwork() { for (auto& item : buffer_sizes_) { if (item.second == 0) { auto slot_offset = infer_engine_->getBindingIndex(item.first.c_str()); + auto dims = infer_engine_->getBindingDimensions(slot_offset); item.second = kDataTypeSize[static_cast( infer_engine_->getBindingDataType(slot_offset))] * - AccumDims(infer_engine_->getBindingDimensions(slot_offset)); + analysis::AccuDims(dims.d, dims.nbDims); } auto& buf = buffer(item.first); CHECK(buf.buffer == nullptr); // buffer should be allocated only once. 
@@ -85,14 +87,15 @@ void TensorRTEngine::FreezeNetwork() { nvinfer1::ITensor* TensorRTEngine::DeclareInput(const std::string& name, nvinfer1::DataType dtype, - const nvinfer1::Dims& dim) { + const nvinfer1::Dims& dims) { PADDLE_ENFORCE_EQ(0, buffer_sizes_.count(name), "duplicate input name %s", name); PADDLE_ENFORCE(infer_network_ != nullptr, "should initnetwork first"); - auto* input = infer_network_->addInput(name.c_str(), dtype, dim); + auto* input = infer_network_->addInput(name.c_str(), dtype, dims); PADDLE_ENFORCE(input, "infer network add input %s failed", name); - buffer_sizes_[name] = kDataTypeSize[static_cast(dtype)] * AccumDims(dim); + buffer_sizes_[name] = kDataTypeSize[static_cast(dtype)] * + analysis::AccuDims(dims.d, dims.nbDims); TensorRTEngine::SetITensor(name, input); return input; } @@ -162,13 +165,13 @@ void TensorRTEngine::SetInputFromCPU(const std::string& name, void* data, void TensorRTEngine::SetITensor(const std::string& name, nvinfer1::ITensor* tensor) { PADDLE_ENFORCE(tensor != nullptr); - PADDLE_ENFORCE_EQ(0, itensor_map_.count(name), "duplicate itensor name %s", + PADDLE_ENFORCE_EQ(0, itensor_map_.count(name), "duplicate ITensor name %s", name); itensor_map_[name] = tensor; } nvinfer1::ITensor* TensorRTEngine::GetITensor(const std::string& name) { - PADDLE_ENFORCE(itensor_map_.count(name), "no itensor %s", name); + PADDLE_ENFORCE(itensor_map_.count(name), "no ITensor %s", name); return itensor_map_[name]; } diff --git a/paddle/fluid/inference/tensorrt/helper.h b/paddle/fluid/inference/tensorrt/helper.h index 2b402cce60762d774cd7b371e448b2b88794b6a8..b6e7968108403c9c9c192759c44eac040d1c5073 100644 --- a/paddle/fluid/inference/tensorrt/helper.h +++ b/paddle/fluid/inference/tensorrt/helper.h @@ -26,15 +26,6 @@ namespace tensorrt { namespace dy = paddle::platform::dynload; -static size_t AccumDims(nvinfer1::Dims dims) { - size_t num = dims.nbDims == 0 ? 
0 : 1; - for (int i = 0; i < dims.nbDims; i++) { - PADDLE_ENFORCE_GT(dims.d[i], 0); - num *= dims.d[i]; - } - return num; -} - // TensorRT data type to size const int kDataTypeSize[] = { 4, // kFLOAT diff --git a/paddle/fluid/operators/CMakeLists.txt b/paddle/fluid/operators/CMakeLists.txt index f72997ca24ed837f761b52cbecdc05998424a675..e00cc73565fc98615090367606b6ba4f58feacfd 100644 --- a/paddle/fluid/operators/CMakeLists.txt +++ b/paddle/fluid/operators/CMakeLists.txt @@ -200,7 +200,9 @@ if(WITH_DISTRIBUTE) op_library(send_vars_op DEPS ${DISTRIBUTE_DEPS}) set_source_files_properties(send_vars_op.cc PROPERTIES COMPILE_FLAGS ${DISTRIBUTE_COMPILE_FLAGS}) op_library(send_barrier_op DEPS ${DISTRIBUTE_DEPS}) + op_library(fetch_barrier_op DEPS ${DISTRIBUTE_DEPS}) set_source_files_properties(send_barrier_op.cc PROPERTIES COMPILE_FLAGS ${DISTRIBUTE_COMPILE_FLAGS}) + set_source_files_properties(fetch_barrier_op.cc PROPERTIES COMPILE_FLAGS ${DISTRIBUTE_COMPILE_FLAGS}) #set_source_files_properties(send_recv_op_test.cc PROPERTIES COMPILE_FLAGS ${DISTRIBUTE_COMPILE_FLAGS}) #cc_test(test_send_recv SRCS send_recv_op_test.cc DEPS prefetch_op send_op # listen_and_serv_op sum_op executor SERIAL) @@ -214,7 +216,7 @@ if(WITH_DISTRIBUTE) set(DEPS_OPS ${DEPS_OPS} gen_nccl_id_op) endif() else() - set(DEPS_OPS ${DEPS_OPS} send_op prefetch_op recv_op listen_and_serv_op send_vars_op send_barrier_op gen_nccl_id_op) + set(DEPS_OPS ${DEPS_OPS} send_op prefetch_op recv_op listen_and_serv_op send_vars_op send_barrier_op fetch_barrier_op gen_nccl_id_op) endif() op_library(cross_entropy_op DEPS cross_entropy) diff --git a/paddle/fluid/operators/detail/grpc_client.cc b/paddle/fluid/operators/detail/grpc_client.cc index 47892b1bcc073d24ea617ea1c680138a88925177..f7ce7786874285795878b655365974f082c00b44 100644 --- a/paddle/fluid/operators/detail/grpc_client.cc +++ b/paddle/fluid/operators/detail/grpc_client.cc @@ -25,6 +25,21 @@ namespace paddle { namespace operators { namespace detail { +std::once_flag RPCClient::init_flag_; + +std::unique_ptr RPCClient::rpc_client_(nullptr); + +RPCClient* RPCClient::GetInstance() { + std::call_once(init_flag_, &RPCClient::Init); + return rpc_client_.get(); +} + +void RPCClient::Init() { + if (rpc_client_.get() == nullptr) { + rpc_client_.reset(new RPCClient()); + } +} + bool RPCClient::AsyncSendVariable(const std::string& ep, const platform::DeviceContext& ctx, const framework::Scope& scope, @@ -60,7 +75,6 @@ bool RPCClient::AsyncSendVariable(const std::string& ep, call->StartCall(); call->Finish(&s->reply_, &s->status_, reinterpret_cast(s)); }); - req_count_++; return true; @@ -249,8 +263,9 @@ bool RPCClient::Proceed() { delete c; return true; } - std::shared_ptr RPCClient::GetChannel(const std::string& ep) { + // TODO(Yancey1989): make grpc client completely thread-safe + std::unique_lock lock(mutex_); auto it = channels_.find(ep); if (it != channels_.end()) { return it->second; @@ -263,7 +278,6 @@ std::shared_ptr RPCClient::GetChannel(const std::string& ep) { auto ch = grpc::CreateCustomChannel(ep, grpc::InsecureChannelCredentials(), args); - channels_[ep] = ch; return ch; } diff --git a/paddle/fluid/operators/detail/grpc_client.h b/paddle/fluid/operators/detail/grpc_client.h index dabce7414d2f0dca74193f1cd10c341793c10ec9..449d5105afb8c02294a0ef57610e7de1b1631b35 100644 --- a/paddle/fluid/operators/detail/grpc_client.h +++ b/paddle/fluid/operators/detail/grpc_client.h @@ -21,6 +21,7 @@ limitations under the License. 
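The grpc_client changes above turn `RPCClient` into a lazily initialized, process-wide singleton: `GetInstance` funnels through `std::call_once`, so all send/recv/barrier ops share one client and one completion queue instead of each constructing their own (the grpc_server_test.cc hunk below switches the test to this pattern). A minimal sketch of the same lazy-init idiom, with hypothetical names:

```cpp
#include <memory>
#include <mutex>

class Client {
 public:
  static Client* GetInstance() {
    // std::call_once guarantees Init runs exactly once, even when many
    // threads race to fetch the instance.
    std::call_once(init_flag_, &Client::Init);
    return instance_.get();
  }

 private:
  static void Init() { instance_.reset(new Client()); }

  static std::once_flag init_flag_;
  static std::unique_ptr<Client> instance_;
};

// Definitions for the statics live in exactly one translation unit.
std::once_flag Client::init_flag_;
std::unique_ptr<Client> Client::instance_(nullptr);
```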
*/ #include #include #include +#include // NOLINT #include #include @@ -35,6 +36,7 @@ limitations under the License. */ #include "paddle/fluid/framework/scope.h" #include "paddle/fluid/framework/selected_rows.h" #include "paddle/fluid/operators/detail/sendrecvop_utils.h" +#include "paddle/fluid/platform/macros.h" // for DISABLE_COPY_AND_ASSIGN namespace paddle { namespace operators { @@ -161,6 +163,10 @@ class FetchBarrierProcessor : public BaseProcessor { class RPCClient { public: + RPCClient() {} + + static RPCClient* GetInstance(); + bool AsyncSendVariable(const std::string& ep, const platform::DeviceContext& ctx, const framework::Scope& scope, @@ -191,11 +197,17 @@ class RPCClient { private: bool Proceed(); std::shared_ptr GetChannel(const std::string& ep); + // Init is called by GetInstance. + static void Init(); private: grpc::CompletionQueue cq_; std::map> channels_; - int64_t req_count_ = 0; + std::atomic req_count_{0}; + std::mutex mutex_; + static std::unique_ptr rpc_client_; + static std::once_flag init_flag_; + DISABLE_COPY_AND_ASSIGN(RPCClient); }; } // namespace detail diff --git a/paddle/fluid/operators/detail/grpc_server.cc b/paddle/fluid/operators/detail/grpc_server.cc index 58faead2bdf9a89749e08207d964836bbf5cb68e..361cc24b5ba11e2654f1282327730befaeca9f55 100644 --- a/paddle/fluid/operators/detail/grpc_server.cc +++ b/paddle/fluid/operators/detail/grpc_server.cc @@ -177,11 +177,8 @@ class RequestPrefetch final : public RequestBase { program_(program), prefetch_ctx_(prefetch_ctx), req_id_(req_id) { - if (sync_mode_) { - request_.reset(new VariableResponse(scope, dev_ctx_, false)); - } else { - request_.reset(new VariableResponse(scope, dev_ctx_, true)); - } + // prefetch always create a new sub scope + request_.reset(new VariableResponse(scope, dev_ctx_, true)); int method_id = static_cast(detail::GrpcMethod::kPrefetchVariable); service_->RequestAsyncUnary( method_id, &ctx_, request_.get(), &responder_, cq_, cq_, @@ -198,10 +195,10 @@ class RequestPrefetch final : public RequestBase { std::string var_name = request_->OutVarname(); VLOG(3) << "RequestPrefetch " << var_name; auto var_desc = program_->Block(0).FindVar(var_name); - framework::Scope* local_scope = &scope_->NewScope(); + framework::Scope* local_scope = request_->GetMutableLocalScope(); auto* var = local_scope->FindVar(var_name); InitializeVariable(var, var_desc->GetType()); - executor_->RunPreparedContext(prefetch_ctx_, scope_); + executor_->RunPreparedContext(prefetch_ctx_, local_scope); SerializeToByteBuffer(var_name, var, *dev_ctx_, &reply_); diff --git a/paddle/fluid/operators/detail/grpc_server_test.cc b/paddle/fluid/operators/detail/grpc_server_test.cc index 73e75c9087fef756840c76db249f8996253ced64..350a7ee1234da5b88d09ea955ce14b7c161d804e 100644 --- a/paddle/fluid/operators/detail/grpc_server_test.cc +++ b/paddle/fluid/operators/detail/grpc_server_test.cc @@ -121,10 +121,10 @@ TEST(PREFETCH, DISABLED_CPU) { std::string in_var_name("ids"); std::string out_var_name("out"); - detail::RPCClient client; - client.AsyncPrefetchVariable("127.0.0.1:8889", ctx, scope, in_var_name, - out_var_name); - client.Wait(); + auto client = detail::RPCClient::GetInstance(); + client->AsyncPrefetchVariable("127.0.0.1:8889", ctx, scope, in_var_name, + out_var_name); + client->Wait(); auto var = scope.Var(out_var_name); auto value = var->GetMutable()->value(); diff --git a/paddle/fluid/operators/fetch_barrier_op.cc b/paddle/fluid/operators/fetch_barrier_op.cc new file mode 100644 index 
0000000000000000000000000000000000000000..79ec02f52094121d01c6bda2a5d99d2211893e89 --- /dev/null +++ b/paddle/fluid/operators/fetch_barrier_op.cc @@ -0,0 +1,87 @@ +/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. */ + +#include <future> // NOLINT +#include <ostream> + +#include "paddle/fluid/framework/data_type.h" +#include "paddle/fluid/framework/framework.pb.h" +#include "paddle/fluid/framework/lod_tensor.h" +#include "paddle/fluid/framework/op_registry.h" + +#include "paddle/fluid/operators/detail/grpc_client.h" +#include "paddle/fluid/platform/profiler.h" + +namespace paddle { +namespace operators { + +class FetchBarrierOp : public framework::OperatorBase { + public: + FetchBarrierOp(const std::string& type, + const framework::VariableNameMap& inputs, + const framework::VariableNameMap& outputs, + const framework::AttributeMap& attrs) + : OperatorBase(type, inputs, outputs, attrs) {} + + void RunImpl(const framework::Scope& scope, + const platform::Place& place) const override { + std::vector<std::string> eps = Attr<std::vector<std::string>>("endpoints"); + + platform::DeviceContextPool& pool = platform::DeviceContextPool::Instance(); + auto& ctx = *pool.Get(place); + // For profiling + platform::RecordEvent record_event(Type(), &ctx); + + auto rpc_client = detail::RPCClient::GetInstance(); + + PADDLE_ENFORCE(rpc_client->Wait()); + + for (auto& ep : eps) { + VLOG(3) << "fetch barrier, ep: " << ep; + rpc_client->AsyncSendFetchBarrier(ep); + } + PADDLE_ENFORCE(rpc_client->Wait()); + } +}; + +class FetchBarrierOpMaker : public framework::OpProtoAndCheckerMaker { + public: + void Make() { + AddComment(R"DOC( +FetchBarrier operator + +This operator will send a fetch barrier signal to listen_and_serv op, so that +the Parameter Server knows the trainer has fetched all variables. +)DOC"); + + AddAttr<std::vector<std::string>>("endpoints", + "(string vector, default 127.0.0.1:6164)" + "Server endpoints to send variables to.") + .SetDefault({"127.0.0.1:6164"}); + } +}; + +class FetchBarrierOpShapeInference : public framework::InferShapeBase { + public: + void operator()(framework::InferShapeContext* ctx) const override {} +}; + +} // namespace operators +} // namespace paddle + +namespace ops = paddle::operators; + +REGISTER_OPERATOR(fetch_barrier, ops::FetchBarrierOp, + paddle::framework::EmptyGradOpMaker, ops::FetchBarrierOpMaker, + ops::FetchBarrierOpShapeInference); diff --git a/paddle/fluid/operators/listen_and_serv_op.cc b/paddle/fluid/operators/listen_and_serv_op.cc index 3e693ed7170530c5ca5cf8820e469146c2eb0c02..df5f229acd75ee3df55d46444a63d9f1915f9d22 100644 --- a/paddle/fluid/operators/listen_and_serv_op.cc +++ b/paddle/fluid/operators/listen_and_serv_op.cc @@ -13,8 +13,9 @@ See the License for the specific language governing permissions and limitations under the License.
*/ #include // for removing the port file +#include +#include #include -#include #include // NOLINT #include @@ -28,7 +29,6 @@ void RunServer(std::shared_ptr service) { service->RunSyncUpdate(); VLOG(4) << "RunServer thread end"; } - static void split(const std::string &str, char sep, std::vector *pieces) { pieces->clear(); @@ -59,7 +59,7 @@ static void ParallelExecuteBlocks( int run_block = idx; // thread local try { executor->RunPreparedContext(prepared[run_block].get(), scope); - } catch (std::exception &e) { + } catch (const std::exception &e) { LOG(ERROR) << "run sub program error " << e.what(); } })); @@ -75,8 +75,11 @@ ListenAndServOp::ListenAndServOp(const std::string &type, const framework::AttributeMap &attrs) : OperatorBase(type, inputs, outputs, attrs) {} +ListenAndServOp::~ListenAndServOp() { Stop(); } + void ListenAndServOp::Stop() { rpc_service_->Push(LISTEN_TERMINATE_MESSAGE); + rpc_service_->ShutDown(); server_thread_->join(); auto file_path = string::Sprintf("/tmp/paddle.%d.port", ::getpid()); remove(file_path.c_str()); @@ -122,7 +125,7 @@ void ListenAndServOp::RunSyncLoop(framework::Executor *executor, // Record received sparse variables, so that // we could reset those after execute optimize program std::vector sparse_vars; - while (!exit_flag) { + while (!exit_flag && !SignalHandler::IsProgramExit()) { // Get from multiple trainers, we don't care about the order in which // the gradients arrives, just add suffix 0~n and merge the gradient. rpc_service_->SetCond(0); @@ -187,7 +190,7 @@ void ListenAndServOp::RunSyncLoop(framework::Executor *executor, // mini-batch. // TODO(Yancey1989): move the reset action into an operator, we couldn't // have any hide logic in the operator. - for (auto &var : sparse_vars) { + for (framework::Variable *var : sparse_vars) { var->GetMutable()->mutable_rows()->clear(); } @@ -204,9 +207,14 @@ static void AsyncUpdateThread( framework::Executor *executor, framework::ExecutorPrepareContext *prepared) { VLOG(3) << "update thread for " << var_name << " started"; - while (!exit_flag) { + while (!exit_flag && !SignalHandler::IsProgramExit()) { const detail::ReceivedMessage v = queue->Pop(); + if (SignalHandler::IsProgramExit()) { + VLOG(3) << "update thread for " << var_name << " exit"; + break; + } auto recv_var_name = v.first; + VLOG(4) << "async update " << recv_var_name; auto var = v.second->GetVar(); if (var == nullptr) { LOG(ERROR) << "Can not find server side var: " << recv_var_name; @@ -216,7 +224,7 @@ static void AsyncUpdateThread( try { executor->RunPreparedContext(prepared, v.second->GetMutableLocalScope()); - } catch (std::exception &e) { + } catch (const std::exception &e) { LOG(ERROR) << "run sub program error " << e.what(); } }); @@ -235,7 +243,7 @@ void ListenAndServOp::RunAsyncLoop(framework::Executor *executor, auto grad_to_block_id_str = Attr>("grad_to_block_id"); - for (auto &grad_and_id : grad_to_block_id_str) { + for (const auto &grad_and_id : grad_to_block_id_str) { std::vector pieces; split(grad_and_id, ':', &pieces); VLOG(3) << "after split, grad = " << pieces[0] << ", id=" << pieces[1]; @@ -243,7 +251,11 @@ void ListenAndServOp::RunAsyncLoop(framework::Executor *executor, PADDLE_ENFORCE_EQ(grad_to_block_id.count(pieces[0]), 0); int block_id = std::stoi(pieces[1]); grad_to_block_id[pieces[0]] = block_id; - grad_to_queue[pieces[0]] = std::make_shared(); + std::shared_ptr queue = + std::make_shared(); + grad_to_queue[pieces[0]] = queue; + // record blocking queue in SignalHandler + 
SignalHandler::RegisterBlockingQueue(queue); id_to_grad[block_id] = pieces[0]; } size_t num_blocks = program->Size(); @@ -275,9 +287,8 @@ void ListenAndServOp::RunAsyncLoop(framework::Executor *executor, executor, grad_to_prepared_ctx[grad_name].get()); })); } - VLOG(3) << "RunAsyncLoop into while"; - while (!exit_flag) { + while (!exit_flag && !SignalHandler::IsProgramExit()) { const detail::ReceivedMessage v = rpc_service_->Get(); auto recv_var_name = v.first; if (recv_var_name == LISTEN_TERMINATE_MESSAGE) { @@ -332,6 +343,10 @@ void ListenAndServOp::RunImpl(const framework::Scope &scope, VLOG(3) << "wait server thread to become ready..."; rpc_service_->WaitServerReady(); + // register SIGINT(from ctrl+C) and SIGTERM(from kill) signal handlers + signal(SIGINT, SignalHandler::StopAndExit); + signal(SIGTERM, SignalHandler::StopAndExit); + // Write to a file of server selected port for python use. std::string file_path = string::Sprintf("/tmp/paddle.%d.selected_port", static_cast<int>(::getpid())); @@ -347,12 +362,9 @@ class ListenAndServOpMaker : public framework::OpProtoAndCheckerMaker { public: void Make() { AddInput("X", "(Tensor) Variables that server recv.").AsDuplicable(); - AddComment(R"DOC( -ListenAndServ operator - -This operator will start a RPC server which can receive variables -from send_op and send back variables to recv_op. -)DOC"); + AddComment(R"DOC( +ListenAndServ operator + +This operator will start a RPC server which can receive variables +from send_op and send back variables to recv_op. +)DOC"); AddAttr<std::string>("endpoint", "(string, default 127.0.0.1:6164)" "IP address to listen on.") @@ -373,6 +385,29 @@ from send_op and send back variables to recv_op. } }; +bool SignalHandler::program_exit_flag_ = false; + +SignalHandler::BlockingQueueSet SignalHandler::blocking_queue_set_{}; + +void SignalHandler::StopAndExit(int signal_num) { + VLOG(3) << "Catch interrupt signal: " << signal_num << ", program will exit"; + + program_exit_flag_ = true; + + // awake all blocking queues + for (BlockingQueueSet::iterator iter = blocking_queue_set_.begin(); + iter != blocking_queue_set_.end(); iter++) { + iter->get()->Push( + std::make_pair(std::string(LISTEN_TERMINATE_MESSAGE), nullptr)); + } + + exit(EXIT_SUCCESS); +} + +void SignalHandler::RegisterBlockingQueue(BlockingQueue &queue) { + blocking_queue_set_.insert(queue); +} + } // namespace operators } // namespace paddle diff --git a/paddle/fluid/operators/listen_and_serv_op.h b/paddle/fluid/operators/listen_and_serv_op.h index 8af061eaf2bec4a9edd264c8c77ac69e228b0669..6f868369dcf2067fd71f4107d20c79ead0cf9f56 100644 --- a/paddle/fluid/operators/listen_and_serv_op.h +++ b/paddle/fluid/operators/listen_and_serv_op.h @@ -16,7 +16,7 @@ limitations under the License.
*/ #include #include -#include +#include #include #include "paddle/fluid/framework/executor.h" @@ -40,6 +40,8 @@ class ListenAndServOp : public framework::OperatorBase { const framework::VariableNameMap& outputs, const framework::AttributeMap& attrs); + virtual ~ListenAndServOp(); + void RunSyncLoop(framework::Executor* executor, framework::ProgramDesc* program, framework::Scope* recv_scope, @@ -68,5 +70,25 @@ class ListenAndServOp : public framework::OperatorBase { static std::atomic_int selected_port_; }; +class SignalHandler { + public: + typedef std::shared_ptr BlockingQueue; + typedef std::unordered_set BlockingQueueSet; + + public: + static void StopAndExit(int signal_num); + + static void RegisterBlockingQueue(BlockingQueue&); + + static inline bool IsProgramExit() { return program_exit_flag_; } + + private: + static bool program_exit_flag_; + + static BlockingQueueSet blocking_queue_set_; + + DISABLE_COPY_AND_ASSIGN(SignalHandler); +}; + } // namespace operators } // namespace paddle diff --git a/paddle/fluid/operators/lookup_sparse_table_op.cc b/paddle/fluid/operators/lookup_sparse_table_op.cc index d07a81968565f095cdb6425d104bc7a11bc9cfad..2ce11e712fb1a8aa9748313ec7cf4e895a931465 100644 --- a/paddle/fluid/operators/lookup_sparse_table_op.cc +++ b/paddle/fluid/operators/lookup_sparse_table_op.cc @@ -127,7 +127,7 @@ class LookupSparseTableOpMaker : public framework::OpProtoAndCheckerMaker { .SetDefault(-1.0f); AddAttr("max", "(float, default 1.0) " - "Maximun value of uniform random") + "Maximum value of uniform random") .SetDefault(1.0f); AddAttr("seed", "(int, default 0) " diff --git a/paddle/fluid/operators/math/cross_entropy.cc b/paddle/fluid/operators/math/cross_entropy.cc index fc0fca5ad3370633b2f60db65fdb7c01c417dc50..caff35e03ae3a144f799d982c859ded62cb3e93d 100644 --- a/paddle/fluid/operators/math/cross_entropy.cc +++ b/paddle/fluid/operators/math/cross_entropy.cc @@ -46,7 +46,10 @@ class CrossEntropyFunctor { const int64_t* label_data = labels->data(); for (int i = 0; i < batch_size; ++i) { - int index = i * class_num + label_data[i]; + int lbl = label_data[i]; + PADDLE_ENFORCE_GE(lbl, 0); + PADDLE_ENFORCE_LT(lbl, class_num); + int index = i * class_num + lbl; loss_data[i] = -math::TolerableValue()(std::log(prob_data[index])); } } diff --git a/paddle/fluid/operators/prefetch_op.cc b/paddle/fluid/operators/prefetch_op.cc index 4cfea958e8e50156c90af8806414b043e15f8a9c..e0a9b24ac8978418a1a4ece62286e022bec8b834 100644 --- a/paddle/fluid/operators/prefetch_op.cc +++ b/paddle/fluid/operators/prefetch_op.cc @@ -41,12 +41,7 @@ class PrefetchOp : public framework::OperatorBase { platform::DeviceContextPool& pool = platform::DeviceContextPool::Instance(); auto& ctx = *pool.Get(place); - auto client_var_name = Output("RPCClient"); - PADDLE_ENFORCE_NOT_NULL(scope.FindVar(client_var_name), - "Can not find variable '%s' in the scope.", - client_var_name); - auto* client_var = scope.FindVar(client_var_name); - detail::RPCClient* rpc_client = client_var->GetMutable(); + auto rpc_client = detail::RPCClient::GetInstance(); for (size_t i = 0; i < ins.size(); i++) { if (NeedSend(scope, ins[i])) { @@ -66,9 +61,6 @@ class PrefetchOpMaker : public framework::OpProtoAndCheckerMaker { public: void Make() { AddInput("X", "(LoDTensor) Input Id variables to be sent").AsDuplicable(); - AddOutput("RPCClient", - "(RPCClient) The RPC client object which will be" - "initialized at most once."); AddOutput("Out", "(LoDTensor) result " "to be fetched from parameter server") @@ -87,17 +79,6 @@ the parameter 
server and fetch result back. } }; -class PrefetchOpVarTypeInference : public framework::VarTypeInference { - public: - void operator()(const framework::OpDesc& op_desc, - framework::BlockDesc* block) const override { - auto out_var_name = op_desc.Output("RPCClient").front(); - auto& out_var = block->FindRecursiveOrCreateVar(out_var_name); - auto var_type = framework::proto::VarType::RAW; - out_var.SetType(var_type); - } -}; - class PrefetchOpShapeInference : public framework::InferShapeBase { public: void operator()(framework::InferShapeContext* ctx) const override {} @@ -110,5 +91,4 @@ namespace ops = paddle::operators; REGISTER_OPERATOR(prefetch, ops::PrefetchOp, paddle::framework::EmptyGradOpMaker, ops::PrefetchOpMaker, - ops::PrefetchOpVarTypeInference, ops::PrefetchOpShapeInference); diff --git a/paddle/fluid/operators/recv_op.cc b/paddle/fluid/operators/recv_op.cc index 7148bd0e363a71b58581a6c3c5f245d98d5b9d02..d8ddb7b448910b5e0e6e71742eb2fdc6a225c919 100644 --- a/paddle/fluid/operators/recv_op.cc +++ b/paddle/fluid/operators/recv_op.cc @@ -21,6 +21,7 @@ limitations under the License. */ #include "paddle/fluid/framework/op_registry.h" #include "paddle/fluid/operators/detail/grpc_client.h" +#include "paddle/fluid/platform/profiler.h" namespace paddle { namespace operators { @@ -36,19 +37,23 @@ class RecvOp : public framework::OperatorBase { const platform::Place& place) const override { auto outs = Outputs("Out"); std::vector epmap = Attr>("epmap"); + int sync_mode = Attr("sync_mode"); platform::DeviceContextPool& pool = platform::DeviceContextPool::Instance(); auto& ctx = *pool.Get(place); + // For profiling + platform::RecordEvent record_event(Type(), &ctx); + + auto rpc_client = detail::RPCClient::GetInstance(); for (size_t i = 0; i < outs.size(); i++) { - VLOG(3) << "getting " << outs[i]; - client_.AsyncGetVariable(epmap[i], ctx, scope, outs[i]); + VLOG(3) << "getting " << outs[i] << " from " << epmap[i]; + rpc_client->AsyncGetVariable(epmap[i], ctx, scope, outs[i]); + } + if (sync_mode) { + PADDLE_ENFORCE(rpc_client->Wait()); } - PADDLE_ENFORCE(client_.Wait()); } - - private: - mutable detail::RPCClient client_; }; class RecvOpMaker : public framework::OpProtoAndCheckerMaker { @@ -65,6 +70,10 @@ This operator can get variables from server side. "Server endpoints in the order of input " "variables for mapping") .SetDefault({}); + AddAttr("sync_mode", + "(int, default 0)" + "sync recv or async recv.") + .SetDefault(0); } }; diff --git a/paddle/fluid/operators/send_barrier_op.cc b/paddle/fluid/operators/send_barrier_op.cc index 1ce0907f3a9473e37f53bf7b2d42cddcb629dfa6..2c77ee2e2792d6fdd76bacd68b6c3b4a296b2e3a 100644 --- a/paddle/fluid/operators/send_barrier_op.cc +++ b/paddle/fluid/operators/send_barrier_op.cc @@ -21,6 +21,7 @@ limitations under the License. 
*/ #include "paddle/fluid/framework/op_registry.h" #include "paddle/fluid/operators/detail/grpc_client.h" +#include "paddle/fluid/platform/profiler.h" namespace paddle { namespace operators { @@ -36,31 +37,30 @@ class SendBarrierOp : public framework::OperatorBase { void RunImpl(const framework::Scope& scope, const platform::Place& place) const override { std::vector eps = Attr>("endpoints"); + bool sync_mode = Attr("sync_mode"); - auto client_var_name = Output("RPCClient"); - PADDLE_ENFORCE_NOT_NULL(scope.FindVar(client_var_name), - "Can not find variable '%s' in the scope.", - client_var_name); - auto* client_var = scope.FindVar(client_var_name); - detail::RPCClient* rpc_client = client_var->GetMutable(); + platform::DeviceContextPool& pool = platform::DeviceContextPool::Instance(); + auto& ctx = *pool.Get(place); + // For profiling + platform::RecordEvent record_event(Type(), &ctx); + + auto rpc_client = detail::RPCClient::GetInstance(); // need to wait before sending send_barrier message PADDLE_ENFORCE(rpc_client->Wait()); - - for (auto& ep : eps) { - VLOG(3) << "send barrier, ep: " << ep; - rpc_client->AsyncSendBatchBarrier(ep); + if (sync_mode) { + for (auto& ep : eps) { + VLOG(3) << "send barrier, ep: " << ep; + rpc_client->AsyncSendBatchBarrier(ep); + } + PADDLE_ENFORCE(rpc_client->Wait()); } - PADDLE_ENFORCE(rpc_client->Wait()); } }; class SendBarrierOpMaker : public framework::OpProtoAndCheckerMaker { public: void Make() { - AddOutput("RPCClient", - "(RPCClient) The RPC client object which is" - "initialized at most once."); AddComment(R"DOC( SendBarrier operator @@ -72,17 +72,7 @@ the Parameter Server would knew all variables have been sent. "(string vector, default 127.0.0.1:6164)" "Server endpoints to send variables to.") .SetDefault({"127.0.0.1:6164"}); - } -}; - -class SendBarrierOpVarTypeInference : public framework::VarTypeInference { - public: - void operator()(const framework::OpDesc& op_desc, - framework::BlockDesc* block) const override { - auto out_var_name = op_desc.Output("RPCClient").front(); - auto& out_var = block->FindRecursiveOrCreateVar(out_var_name); - auto var_type = framework::proto::VarType::RAW; - out_var.SetType(var_type); + AddAttr("sync_mode", "work in sync_mode or not").SetDefault(true); } }; @@ -98,5 +88,4 @@ namespace ops = paddle::operators; REGISTER_OPERATOR(send_barrier, ops::SendBarrierOp, paddle::framework::EmptyGradOpMaker, ops::SendBarrierOpMaker, - ops::SendBarrierOpVarTypeInference, ops::SendBarrierOpShapeInference); diff --git a/paddle/fluid/operators/send_op.cc b/paddle/fluid/operators/send_op.cc index 95bb1f3c695297e6d8134a647925310207118a9b..a5150f242ca3b0befafa2443f0bc466e2aea85e4 100644 --- a/paddle/fluid/operators/send_op.cc +++ b/paddle/fluid/operators/send_op.cc @@ -49,12 +49,7 @@ class SendOp : public framework::OperatorBase { // For profiling platform::RecordEvent record_event(Type(), &ctx); - auto client_var_name = Output("RPCClient"); - PADDLE_ENFORCE_NOT_NULL(scope.FindVar(client_var_name), - "Can not find variable '%s' in the scope.", - client_var_name); - auto* client_var = scope.FindVar(client_var_name); - detail::RPCClient* rpc_client = client_var->GetMutable(); + auto rpc_client = detail::RPCClient::GetInstance(); for (size_t i = 0; i < ins.size(); i++) { if (NeedSend(scope, ins[i])) { @@ -96,9 +91,6 @@ class SendOpMaker : public framework::OpProtoAndCheckerMaker { AddInput("X", "(Tensor) Input tensor to be sent").AsDuplicable(); AddOutput("Out", "(Tensor) Output tensor to be received from server") .AsDuplicable(); - 
AddOutput("RPCClient", - "(RPCClient) The RPC client object which is" - "initialized at most once."); AddComment(R"DOC( Send operator @@ -119,17 +111,6 @@ This operator will send tensor to recv_op at the parameter server. } }; -class SendOpVarTypeInference : public framework::VarTypeInference { - public: - void operator()(const framework::OpDesc& op_desc, - framework::BlockDesc* block) const override { - auto out_var_name = op_desc.Output("RPCClient").front(); - auto& out_var = block->FindRecursiveOrCreateVar(out_var_name); - auto var_type = framework::proto::VarType::RAW; - out_var.SetType(var_type); - } -}; - class SendOpShapeInference : public framework::InferShapeBase { public: void operator()(framework::InferShapeContext* ctx) const override {} @@ -141,5 +122,4 @@ class SendOpShapeInference : public framework::InferShapeBase { namespace ops = paddle::operators; REGISTER_OPERATOR(send, ops::SendOp, paddle::framework::EmptyGradOpMaker, - ops::SendOpMaker, ops::SendOpVarTypeInference, - ops::SendOpShapeInference); + ops::SendOpMaker, ops::SendOpShapeInference); diff --git a/paddle/fluid/operators/send_recv_op_test.cc b/paddle/fluid/operators/send_recv_op_test.cc index d5303eaf50722234d205264e56892b1723104d53..e550552b195b768d68ec64e9c3b5889b56ca719f 100644 --- a/paddle/fluid/operators/send_recv_op_test.cc +++ b/paddle/fluid/operators/send_recv_op_test.cc @@ -156,6 +156,7 @@ TEST(SendRecvOp, CPUDense) { std::thread server_thread(StartServerNet, false, &initialized); while (!initialized) { } + static_cast(listen_and_serv_op.get()) ->WaitServerReady(); @@ -175,9 +176,10 @@ TEST(SendRecvOp, CPUDense) { std::string endpoint = paddle::string::Sprintf("127.0.0.1:%d", selected_port); attrs.insert({"endpoints", std::vector({endpoint})}); attrs.insert({"epmap", std::vector({endpoint})}); - auto send_op = f::OpRegistry::CreateOp( - "send", {{"X", {"x1"}}}, - {{"Out", {"Out"}}, {"RPCClient", {"RPC_CLIENT_VAR"}}}, attrs); + const f::VariableNameMap &inputs = {{"X", {"x1"}}}; + const f::VariableNameMap &outputs = {{"Out", {"Out"}}}; + + auto send_op = f::OpRegistry::CreateOp("send", inputs, outputs, attrs); send_op->Run(scope, place); auto in_var = scope.Var("x1"); @@ -220,9 +222,8 @@ TEST(SendRecvOp, CPUSparse) { std::string endpoint = paddle::string::Sprintf("127.0.0.1:%d", selected_port); attrs.insert({"endpoints", std::vector({endpoint})}); attrs.insert({"epmap", std::vector({endpoint})}); - auto send_op = f::OpRegistry::CreateOp( - "send", {{"X", {"x1"}}}, - {{"Out", {"Out"}}, {"RPCClient", {"RPC_CLIENT_VAR"}}}, attrs); + auto send_op = f::OpRegistry::CreateOp("send", {{"X", {"x1"}}}, + {{"Out", {"Out"}}}, attrs); send_op->Run(scope, place); auto x0 = scope.Var("x0")->GetMutable(); diff --git a/paddle/fluid/operators/send_recv_util.h b/paddle/fluid/operators/send_recv_util.h index 113513eb6b327773ab4a1c062fb8a3f06fddfbca..deab005149027caffa962783df944fad7110382f 100644 --- a/paddle/fluid/operators/send_recv_util.h +++ b/paddle/fluid/operators/send_recv_util.h @@ -20,6 +20,9 @@ namespace operators { inline bool NeedSend(const framework::Scope& scope, const std::string& varname) { + // dummy variable is only used in parallel executor to represent + // some dependency relationship, we don't need to send/recv it. 
+ if (varname == "dummy") return false; auto* var = scope.FindVar(varname); PADDLE_ENFORCE_NOT_NULL(var, "Can not find variable '%s' in the send side.", varname); diff --git a/paddle/fluid/operators/send_vars_op.cc b/paddle/fluid/operators/send_vars_op.cc index f11e84c176ae97dff0fda560ce3ebe2ab72c7bcc..fe839dab6924618c8a4c39868d9bf86056a0be40 100644 --- a/paddle/fluid/operators/send_vars_op.cc +++ b/paddle/fluid/operators/send_vars_op.cc @@ -20,6 +20,7 @@ limitations under the License. */ #include "paddle/fluid/framework/op_registry.h" #include "paddle/fluid/operators/detail/grpc_client.h" #include "paddle/fluid/operators/send_recv_util.h" +#include "paddle/fluid/platform/profiler.h" namespace paddle { namespace operators { @@ -41,12 +42,10 @@ class SendVarsOp : public framework::OperatorBase { platform::DeviceContextPool& pool = platform::DeviceContextPool::Instance(); auto& ctx = *pool.Get(place); - auto client_var_name = Output("RPCClient"); - PADDLE_ENFORCE_NOT_NULL(scope.FindVar(client_var_name), - "Can not find variable '%s' in the scope.", - client_var_name); - auto* client_var = scope.FindVar(client_var_name); - detail::RPCClient* rpc_client = client_var->GetMutable(); + // For profiling + platform::RecordEvent record_event(Type(), &ctx); + + auto rpc_client = detail::RPCClient::GetInstance(); for (size_t i = 0; i < ins.size(); i++) { if (NeedSend(scope, ins[i])) { @@ -69,9 +68,6 @@ class SendVarsOpMaker : public framework::OpProtoAndCheckerMaker { void Make() { AddInput("X", "(Tensor, SelectedRows) Input variables to be sent") .AsDuplicable(); - AddOutput("RPCClient", - "(RPCClient) The RPC client object which will be" - "initialized at most once."); AddComment(R"DOC( Send operator @@ -89,17 +85,6 @@ This operator will send variables to listen_and_serve op at the parameter server } }; -class SendVarsOpVarTypeInference : public framework::VarTypeInference { - public: - void operator()(const framework::OpDesc& op_desc, - framework::BlockDesc* block) const override { - auto out_var_name = op_desc.Output("RPCClient").front(); - auto& out_var = block->FindRecursiveOrCreateVar(out_var_name); - auto var_type = framework::proto::VarType::RAW; - out_var.SetType(var_type); - } -}; - class SendVarsOpShapeInference : public framework::InferShapeBase { public: void operator()(framework::InferShapeContext* ctx) const override {} @@ -112,5 +97,4 @@ namespace ops = paddle::operators; REGISTER_OPERATOR(send_vars, ops::SendVarsOp, paddle::framework::EmptyGradOpMaker, ops::SendVarsOpMaker, - ops::SendVarsOpVarTypeInference, ops::SendVarsOpShapeInference); diff --git a/paddle/fluid/operators/sgd_op.h b/paddle/fluid/operators/sgd_op.h index f3e88b0a0b05ef792b2cc8e880bdfddb6e6124d1..f9e0596191d0b86686e0fa36265806111c774b38 100644 --- a/paddle/fluid/operators/sgd_op.h +++ b/paddle/fluid/operators/sgd_op.h @@ -96,8 +96,12 @@ class SGDOpKernel : public framework::OpKernel { return; } - size_t param_row_width = param.value().numel() / param.rows().size(); - size_t grad_row_width = grad.value().numel() / grad.rows().size(); + auto param_row_width = param.value().dims()[1]; + auto grad_row_width = grad.value().dims()[1]; + VLOG(4) << " param rows: " << param.rows().size() + << " param memory rows: " << param.value().dims()[0] + << " grad rows: " << grad.rows().size() + << " grad memory rows: " << grad.value().dims()[0]; PADDLE_ENFORCE_EQ(param_row_width, grad_row_width, "param_row should have the same size with grad_row"); diff --git a/paddle/fluid/pybind/const_value.cc b/paddle/fluid/pybind/const_value.cc 
index 9111abca5aac97e9d5c7b00ce5173f08e49cda12..76aa7d2010682416f68e982e9b89da9813abb078 100644 --- a/paddle/fluid/pybind/const_value.cc +++ b/paddle/fluid/pybind/const_value.cc @@ -32,7 +32,8 @@ void BindConstValue(pybind11::module* m) { .value("Forward", framework::OpRole::kForward) .value("Backward", framework::OpRole::kBackward) .value("Optimize", framework::OpRole::kOptimize) - .value("Loss", framework::OpRole::kLoss); + .value("Loss", framework::OpRole::kLoss) + .value("RPC", framework::OpRole::kRPC); op_proto_and_checker_maker.def( "kOpRoleAttrName", framework::OpProtoAndCheckerMaker::OpRoleAttrName); diff --git a/paddle/function/BlockExpandOp.cpp b/paddle/function/BlockExpandOp.cpp index aa53853e08716ff0dd8dce7c73766d9543bed2b9..f01f89a7277acc5fe494b92a3e7ca3ca18498c97 100644 --- a/paddle/function/BlockExpandOp.cpp +++ b/paddle/function/BlockExpandOp.cpp @@ -33,7 +33,7 @@ namespace paddle { * \param outputs[0] Image data of NCHW format. */ class BlockExpandFunction : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { // function arguments strides_ = config.get>("strides"); @@ -81,7 +81,7 @@ public: (size_t)blockW()}); } -protected: + protected: std::vector strides_; std::vector paddings_; std::vector blocks_; @@ -101,7 +101,7 @@ protected: template class BlockExpandForward : public BlockExpandFunction { -public: + public: void init(const FuncConfig& config) override { BlockExpandFunction::init(config); } @@ -149,7 +149,7 @@ public: template class BlockExpandBackward : public BlockExpandFunction { -public: + public: void init(const FuncConfig& config) override { BlockExpandFunction::init(config); } diff --git a/paddle/function/BufferArg.h b/paddle/function/BufferArg.h index 89ee09837db69d79bbd678312f02f6dc87e8067c..6de8c94e778c8d1439b2a2aa3c581a5a3cf70261 100644 --- a/paddle/function/BufferArg.h +++ b/paddle/function/BufferArg.h @@ -63,12 +63,12 @@ enum ArgType { ADD_TO = 2, }; class BufferArg { -public: + public: void setArgType(ArgType argType) { argType_ = argType; } ArgType getArgType() const { return argType_; } -public: + public: BufferArg(ValueType valueType, const TensorShape& shape, ArgType argType = UNSPECIFIED) @@ -169,7 +169,7 @@ public: const SequenceArg& sequence() const; const SparseMatrixArg& sparse() const; -protected: + protected: void* buf_; ValueType valueType_; TensorShape shape_; @@ -185,7 +185,7 @@ protected: // valueType_ = int32 // if a < b then value_.buf_[a] < value_.buf_[b] class SequenceIdArg : public BufferArg { -public: + public: SequenceIdArg(const TensorShape& shape, ArgType argType = UNSPECIFIED) : BufferArg(VALUE_TYPE_INT32, shape, argType) { bufferType_ = TENSOR_SEQUENCE_ID; @@ -212,7 +212,7 @@ public: size_t numSeqs() const { return numSeqs_; } -private: + private: size_t numSeqs_; }; @@ -222,7 +222,7 @@ private: // SequenceArg can be used to represent sequences that contain multiple // unequal lengths. 
class SequenceArg : public BufferArg { -public: + public: SequenceArg(ValueType valueType, const TensorShape& shape, ArgType argType = UNSPECIFIED) @@ -255,7 +255,7 @@ public: SequenceIdArg& getSequenceId() { return startPositions_; } const SequenceIdArg& getSequenceId() const { return startPositions_; } -private: + private: SequenceIdArg startPositions_; }; @@ -263,7 +263,7 @@ private: // valueType_ == float or double // shape_.ndims() == 2 class SparseMatrixArg : public BufferArg { -public: + public: SparseMatrixArg(void* buf, ValueType valueType, const TensorShape& shape, @@ -353,7 +353,7 @@ public: SparseDataType dataType() const { return type_; } -private: + private: BufferArg row_; BufferArg col_; size_t nnz_; diff --git a/paddle/function/ContextProjectionOp.cpp b/paddle/function/ContextProjectionOp.cpp index 904b0958e6f2c1b8fb8cf56f3cd7d07ad8e24f19..1187842452460ac3fd71f48150fab6467f93dc6c 100644 --- a/paddle/function/ContextProjectionOp.cpp +++ b/paddle/function/ContextProjectionOp.cpp @@ -100,7 +100,7 @@ void ContextProjectionForward(CpuMatrix& out_mat, */ template class ContextProjectionForwardFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { context_length_ = config.get("context_length"); context_start_ = config.get("context_start"); @@ -146,7 +146,7 @@ public: begin_pad_); } -private: + private: size_t context_length_; int context_start_; size_t begin_pad_; @@ -223,7 +223,7 @@ void ContextProjectionBackward(const CpuMatrix& out_grad_mat, */ template class ContextProjectionBackwardFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { context_length_ = config.get("context_length"); context_start_ = config.get("context_start"); @@ -278,7 +278,7 @@ public: total_pad_); } -private: + private: size_t context_length_; int context_start_; size_t begin_pad_; @@ -299,7 +299,7 @@ private: */ template class ContextProjectionBackwardDataFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { context_length_ = config.get("context_length"); context_start_ = config.get("context_start"); @@ -331,7 +331,7 @@ public: out_grad_mat, in_grad_mat, seq_vec, context_length_, context_start_); } -private: + private: size_t context_length_; int context_start_; }; @@ -348,7 +348,7 @@ private: */ template class ContextProjectionBackwardWeightFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { context_length_ = config.get("context_length"); context_start_ = config.get("context_start"); @@ -382,7 +382,7 @@ public: begin_pad_); } -private: + private: size_t context_length_; int context_start_; size_t begin_pad_; diff --git a/paddle/function/ConvOp.h b/paddle/function/ConvOp.h index 7d23d0079c8f62b2c8912dfcb9f191c622a60bc9..2d8437bcfe60d1d81897f1c4be1cbfecb5b27fe0 100644 --- a/paddle/function/ConvOp.h +++ b/paddle/function/ConvOp.h @@ -56,7 +56,7 @@ namespace paddle { * H and W is height and width of filter. 
*/ class ConvFunctionBase : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { // function arguments strides_ = config.get>("strides"); @@ -101,7 +101,7 @@ public: } } -protected: + protected: size_t getFilterHeight(const TensorShape& filter) const { return filter[filter.ndims() - 2]; } diff --git a/paddle/function/CosSimOp.cpp b/paddle/function/CosSimOp.cpp index 81bccc1a9c7d614763a10e3838271b57eef2c603..2c25e1af44965d30591faeccc9a181e36c7e0a0f 100644 --- a/paddle/function/CosSimOp.cpp +++ b/paddle/function/CosSimOp.cpp @@ -97,7 +97,7 @@ class CosSimForwardFunc : public FunctionBase { CosSimForward(out_mat, in1_mat, in2_mat, scale_); } -private: + private: real scale_; }; @@ -227,7 +227,7 @@ class CosSimBackwardFunc : public FunctionBase { out_grad, out_val, in1_val, in2_val, in1_grad, in2_grad, scale_); } -private: + private: real scale_; }; diff --git a/paddle/function/CropOp.cpp b/paddle/function/CropOp.cpp index 7aa527d21615e19257bd003d0563b5e26b2fcb2f..5bd98910fe838751935f8ef2387ce96e755c6df1 100644 --- a/paddle/function/CropOp.cpp +++ b/paddle/function/CropOp.cpp @@ -112,7 +112,7 @@ void CropGrad(const real* inGrad, */ template class CropFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { conf_ = config; } void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { @@ -130,7 +130,7 @@ public: conf_); } -private: + private: FuncConfig conf_; }; @@ -145,7 +145,7 @@ private: template class CropGradFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { conf_ = config; } void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { @@ -163,7 +163,7 @@ public: conf_); } -private: + private: FuncConfig conf_; }; diff --git a/paddle/function/CrossMapNormalOp.cpp b/paddle/function/CrossMapNormalOp.cpp index 75c0fc2a3d047a9162d49809a717629f2270872d..7ff9227e5c2702d9d5334db501730b57ec10bfe3 100644 --- a/paddle/function/CrossMapNormalOp.cpp +++ b/paddle/function/CrossMapNormalOp.cpp @@ -160,7 +160,7 @@ void CrossMapNormalGrad(real* inputsGrad, */ template class CrossMapNormalFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { // function arguments size_ = config.get("size"); @@ -220,7 +220,7 @@ public: return ops; } -private: + private: size_t size_; real scale_; real pow_; @@ -260,7 +260,7 @@ private: */ template class CrossMapNormalGradFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { // function arguments size_ = config.get("size"); @@ -328,7 +328,7 @@ public: return ops; } -private: + private: size_t size_; real scale_; real pow_; diff --git a/paddle/function/DepthwiseConvOp.cpp b/paddle/function/DepthwiseConvOp.cpp index 46651345b45e4ced9a3ef3373af437d939a66716..958034e08e60c9a63d1c480bde7c84b760205ae4 100644 --- a/paddle/function/DepthwiseConvOp.cpp +++ b/paddle/function/DepthwiseConvOp.cpp @@ -19,7 +19,7 @@ namespace paddle { template class DepthwiseConvFunctor { -public: + public: void operator()(const T* inputData, const T* filterData, int batchSize, @@ -43,7 +43,7 @@ public: template class DepthwiseConvGradInputFunctor { -public: + public: void operator()(const T* outputGrad, const T* filterData, int batchSize, @@ -66,7 +66,7 @@ public: template class DepthwiseConvGradFilterFunctor { -public: + public: void operator()(const T* outputGrad, const T* inputData, int batchSize, @@ -93,7 +93,7 @@ public: */ template class DepthwiseConvFunction : public 
ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); } @@ -156,7 +156,7 @@ public: */ template class DepthwiseConvGradInputFunction : public ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); } @@ -220,7 +220,7 @@ public: */ template class DepthwiseConvGradFilterFunction : public ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); } diff --git a/paddle/function/DepthwiseConvOp.h b/paddle/function/DepthwiseConvOp.h index 6700747314fa8377828dab0c436eb4b2053f46f6..7837edd1c071980592b1cf36ecb69a3b7c12cc5e 100644 --- a/paddle/function/DepthwiseConvOp.h +++ b/paddle/function/DepthwiseConvOp.h @@ -44,7 +44,7 @@ namespace paddle { */ template class DepthwiseConvFunctor { -public: + public: void operator()(const T* inputData, const T* filterData, int batchSize, @@ -89,7 +89,7 @@ public: */ template class DepthwiseConvGradInputFunctor { -public: + public: void operator()(const T* outputGrad, const T* filterData, int batchSize, @@ -135,7 +135,7 @@ public: */ template class DepthwiseConvGradFilterFunctor { -public: + public: void operator()(const T* outputGrad, const T* inputData, int batchSize, diff --git a/paddle/function/DepthwiseConvOpGpu.cu b/paddle/function/DepthwiseConvOpGpu.cu index cd1d55a416c84c6327226ffaae4d5d9d5be81038..2c0e71b19b22abac25d273d8bbeddc330e67f8b0 100644 --- a/paddle/function/DepthwiseConvOpGpu.cu +++ b/paddle/function/DepthwiseConvOpGpu.cu @@ -199,7 +199,7 @@ __global__ void ConvolutionDepthwiseFilterBackward(const int num_i, template class DepthwiseConvFunctor { -public: + public: void operator()(const T* inputData, const T* filterData, int batchSize, @@ -249,7 +249,7 @@ public: template class DepthwiseConvGradInputFunctor { -public: + public: void operator()(const T* outputGrad, const T* filterData, int batchSize, @@ -300,7 +300,7 @@ public: template class DepthwiseConvGradFilterFunctor { -public: + public: void operator()(const T* outputGrad, const T* inputData, int batchSize, diff --git a/paddle/function/EigenThreadDevice.h b/paddle/function/EigenThreadDevice.h index 74269aa664a711c905e12a61958c9ab01e2340c0..eb92251c827a26d55ca021c4418182bae28dd6a5 100644 --- a/paddle/function/EigenThreadDevice.h +++ b/paddle/function/EigenThreadDevice.h @@ -46,7 +46,7 @@ int GetCpuCount() { return 1; } #endif class EigenDeviceWarpper { -public: // NOLINT + public: // NOLINT #if EIGEN_USE_THREADS static Eigen::ThreadPoolDevice* device() { const int num_cpus = GetCpuCount(); diff --git a/paddle/function/Function.h b/paddle/function/Function.h index 01288ef92e7b59d7958e6e23daf641b30a60eed1..a6c14ef29b760faa393c37bd2357824a061c7b38 100644 --- a/paddle/function/Function.h +++ b/paddle/function/Function.h @@ -29,7 +29,7 @@ namespace paddle { * The argument type of Function::init. */ class FuncConfig { -public: + public: template T get(const std::string& key, Error* err = nullptr) const { try { @@ -59,7 +59,7 @@ public: return *this; } -protected: + protected: mutable std::unordered_map valueMap_; }; @@ -77,7 +77,7 @@ protected: * in the BufferArgs life time. */ class BufferArgs { -public: + public: BufferArgs() {} ~BufferArgs() { @@ -137,7 +137,7 @@ public: void addArg(SparseMatrixArg& arg) { args_.push_back(&arg); } -private: + private: std::vector args_; // The BufferArg object is constructed and freed by BufferArgs. 
std::vector _args_; @@ -163,7 +163,7 @@ private: * If Function has more than one output, each output can have different modes. */ class FunctionBase { -public: + public: virtual ~FunctionBase() {} virtual void init(const FuncConfig& config) {} @@ -192,7 +192,7 @@ public: static ClassRegistrar funcRegistrar_; -protected: + protected: // numInputs_ and numOutputs_ represents the maximum // input and output supported by Function. // Some functions are optimized for input and output, diff --git a/paddle/function/FunctionTest.h b/paddle/function/FunctionTest.h index 56c3537b6a96c8042d172f8aca2163fa18c813c1..14003d2c885c8f846f9445ad8844869c9112816e 100644 --- a/paddle/function/FunctionTest.h +++ b/paddle/function/FunctionTest.h @@ -39,7 +39,7 @@ struct Allocator { // Copy argument1 to argument2 template class CopyArgument { -public: + public: void operator()(const BufferArg& arg1, BufferArg& arg2) { CHECK_EQ(arg1.valueType(), arg2.valueType()); CHECK_LE(arg1.shape().getElements(), arg2.shape().getElements()); @@ -95,7 +95,7 @@ public: */ template class Compare2Function { -public: + public: typedef typename test::Allocator::type Allocator1; typedef typename test::Allocator::type Allocator2; typedef typename Tensor::Vector Vector1; @@ -305,7 +305,7 @@ public: std::shared_ptr getFunction2() const { return function2_; } -protected: + protected: // only init cpu argument, gpu argument copy from cpu argument. void initArg(BufferArg& arg) { Vector1 vector(arg.shape().getElements(), (real*)arg.data()); @@ -381,7 +381,7 @@ protected: } } -protected: + protected: std::shared_ptr function1_; std::shared_ptr function2_; std::vector> func1Memory_; @@ -400,7 +400,7 @@ protected: class CpuGpuFuncCompare : public Compare2Function { -public: + public: CpuGpuFuncCompare(const std::string& name, const FuncConfig& config) : Compare2Function(name + "-CPU", name + "-GPU", config) {} diff --git a/paddle/function/GemmConvOp.cpp b/paddle/function/GemmConvOp.cpp index 2b7c6f9eab223c8d6a2107ff4605ac6e60295f7d..5b023e2c10e5040a28660d555efceb0e26b40d49 100644 --- a/paddle/function/GemmConvOp.cpp +++ b/paddle/function/GemmConvOp.cpp @@ -24,7 +24,7 @@ namespace paddle { */ template class GemmConvFunction : public ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); } @@ -136,7 +136,7 @@ public: */ template class GemmConvMobileFunction : public ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); } @@ -297,7 +297,7 @@ public: */ template class GemmConvGradInputFunction : public ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); } @@ -404,7 +404,7 @@ public: */ template class GemmConvGradFilterFunction : public ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); } diff --git a/paddle/function/Im2Col.h b/paddle/function/Im2Col.h index 6a0778700037c142d62fdb99667403ade806f7c1..e0ce6918a2a5324a396ade734945cf426b81ab56 100644 --- a/paddle/function/Im2Col.h +++ b/paddle/function/Im2Col.h @@ -70,7 +70,7 @@ enum ColFormat { kCFO = 0, kOCF = 1 }; */ template class Im2ColFunctor { -public: + public: void operator()(const T* imData, const TensorShape& imShape, T* colData, @@ -85,7 +85,7 @@ public: template class Col2ImFunctor { -public: + public: void operator()(T* imData, const TensorShape& imShape, const T* colData, @@ -100,7 +100,7 @@ public: template class Im2ColMobileFunctor { 
-public: + public: void operator()(const T* imData, const TensorShape& imShape, T* colData, diff --git a/paddle/function/Im2ColOp.cpp b/paddle/function/Im2ColOp.cpp index ad2aed8f3c237cf9c0f7f0dcc4900cac807e25ea..55a3ff98db63ede96094a3d3fdeedf03b573294f 100644 --- a/paddle/function/Im2ColOp.cpp +++ b/paddle/function/Im2ColOp.cpp @@ -23,7 +23,7 @@ namespace paddle { */ template class Im2ColFunctor { -public: + public: void operator()(const T* imData, const TensorShape& imShape, T* colData, @@ -75,7 +75,7 @@ public: */ template class Col2ImFunctor { -public: + public: void operator()(T* imData, const TensorShape& imShape, const T* colData, @@ -130,7 +130,7 @@ template class Col2ImFunctor; */ template class Im2ColFunctor { -public: + public: void operator()(const T* imData, const TensorShape& imShape, T* colData, @@ -188,7 +188,7 @@ public: */ template class Col2ImFunctor { -public: + public: void operator()(T* imData, const TensorShape& imShape, const T* colData, diff --git a/paddle/function/Im2ColOpGpu.cu b/paddle/function/Im2ColOpGpu.cu index a944a0ee687fefc5e002096b9c5b869495554167..96dd8f528eaa38f9d174ab7c2a5ea5eb96e2a060 100644 --- a/paddle/function/Im2ColOpGpu.cu +++ b/paddle/function/Im2ColOpGpu.cu @@ -71,7 +71,7 @@ __global__ void im2col(const T* data_im, */ template class Im2ColFunctor { -public: + public: void operator()(const T* imData, const TensorShape& imShape, T* colData, @@ -184,7 +184,7 @@ __global__ void col2im(size_t n, */ template class Col2ImFunctor { -public: + public: void operator()(T* imData, const TensorShape& imShape, const T* colData, @@ -292,7 +292,7 @@ __global__ void im2colOCF(const T* imData, */ template class Im2ColFunctor { -public: + public: void operator()(const T* imData, const TensorShape& imShape, T* colData, @@ -399,7 +399,7 @@ __global__ void col2imOCF(T* imData, */ template class Col2ImFunctor { -public: + public: void operator()(T* imData, const TensorShape& imShape, const T* colData, diff --git a/paddle/function/MulOp.cpp b/paddle/function/MulOp.cpp index 90cd4a2b6d1bfb2529e1c966cf7a1fb904a844d7..7bf36c8050a8c33d836ce98dc7f3cf6d3de38d55 100644 --- a/paddle/function/MulOp.cpp +++ b/paddle/function/MulOp.cpp @@ -240,7 +240,7 @@ void MulOp(CpuMatrix& out, */ template class MulFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { aTrans_ = config.get("aTrans"); bTrans_ = config.get("bTrans"); @@ -335,7 +335,7 @@ public: } } -private: + private: bool aTrans_; bool bTrans_; }; diff --git a/paddle/function/NaiveConvOp.cpp b/paddle/function/NaiveConvOp.cpp index 22d3b33d0f4a730691234c6c742978abd72294a6..99c8b81acbbb16a91bc0faa1c7f2873fa94ab108 100644 --- a/paddle/function/NaiveConvOp.cpp +++ b/paddle/function/NaiveConvOp.cpp @@ -24,7 +24,7 @@ namespace paddle { */ template class NaiveConvFunctor { -public: + public: void operator()(const T* inputData, size_t batchSize, size_t inputChannels, @@ -85,7 +85,7 @@ public: template class NaiveConvFunction : public ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); } diff --git a/paddle/function/PadOp.cpp b/paddle/function/PadOp.cpp index db6dd518ca5df9d852e545b37f61f1141c81f57c..5d7515e8c053439b95fb18de3c8ffe70705600a3 100644 --- a/paddle/function/PadOp.cpp +++ b/paddle/function/PadOp.cpp @@ -132,7 +132,7 @@ static inline PadConf castToPadConf(const FuncConfig& conf) { template class PadFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { pad_ = 
castToPadConf(config); } void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { @@ -157,7 +157,7 @@ public: pad_); } -private: + private: PadConf pad_; }; @@ -173,7 +173,7 @@ private: template class PadGradFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { pad_ = castToPadConf(config); } void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { @@ -201,7 +201,7 @@ public: pad_); } -private: + private: PadConf pad_; }; diff --git a/paddle/function/RowConvOp.cpp b/paddle/function/RowConvOp.cpp index 925860346e1a53065b0fe4ccbd26853afc8898a1..129e9334582fad011c259e8ab8268b00a7fab7b6 100644 --- a/paddle/function/RowConvOp.cpp +++ b/paddle/function/RowConvOp.cpp @@ -129,7 +129,7 @@ void RowConvGrad(const CpuMatrix& outG, template class RowConvFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override {} void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { @@ -176,7 +176,7 @@ public: template class RowConvGradFunc : public FunctionBase { // TODO(qingqing): split into RowConvDataFunc and RowConvWeightFunc -public: + public: void init(const FuncConfig& config) override {} void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { diff --git a/paddle/function/ScaleSubRegionOp.cpp b/paddle/function/ScaleSubRegionOp.cpp index 6ed6eb2dba477722664ca4a29f4689114f368846..9a06ef2a96f25b5b7326049df2a708637f319561 100644 --- a/paddle/function/ScaleSubRegionOp.cpp +++ b/paddle/function/ScaleSubRegionOp.cpp @@ -92,7 +92,7 @@ void ScaleSubRegionGrad(const real* inGrad, */ template class ScaleSubRegionFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { conf_ = config; } void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { @@ -109,7 +109,7 @@ public: conf_); } -private: + private: FuncConfig conf_; }; @@ -124,7 +124,7 @@ private: template class ScaleSubRegionGradFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override { conf_ = config; } void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { @@ -141,7 +141,7 @@ public: conf_); } -private: + private: FuncConfig conf_; }; diff --git a/paddle/function/SwitchOp.cpp b/paddle/function/SwitchOp.cpp index 50e1d6c04c54fed5b847aa10dbb253f00cfa42d4..750fb6bf28baf050b1f9f965a1a9b315363e5645 100644 --- a/paddle/function/SwitchOp.cpp +++ b/paddle/function/SwitchOp.cpp @@ -75,7 +75,7 @@ void NHWC2NCHW(real* outputs, */ template class NCHW2NHWCFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override {} void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { @@ -108,7 +108,7 @@ public: */ template class NHWC2NCHWFunc : public FunctionBase { -public: + public: void init(const FuncConfig& config) override {} void calc(const BufferArgs& inputs, const BufferArgs& outputs) override { diff --git a/paddle/function/TensorShape.h b/paddle/function/TensorShape.h index 02d38c32c007325a928910d136d48214ba5f6bc3..d4d1eae3960c333a2a7dc6099ae7a68677fdcd5f 100644 --- a/paddle/function/TensorShape.h +++ b/paddle/function/TensorShape.h @@ -22,7 +22,7 @@ namespace paddle { * TensorShape used to represent shape of normal tensor. 
*/ class TensorShape { -public: + public: TensorShape() : ndims_(0), nelements_(0) { initDims(0); } TensorShape(size_t ndims) : ndims_(ndims), nelements_(1) { initDims(ndims); }; @@ -80,7 +80,7 @@ public: bool operator!=(const TensorShape& t) const { return !(*this == t); } -private: + private: // compute number of elements void numElements() { nelements_ = 1; diff --git a/paddle/function/neon/NeonDepthwiseConv.cpp b/paddle/function/neon/NeonDepthwiseConv.cpp index d3298c753853ca6d212a619cf8d0bd9356a8dbd7..85bc95bb88ca606e289fb6dad4946a77faf3d5fb 100644 --- a/paddle/function/neon/NeonDepthwiseConv.cpp +++ b/paddle/function/neon/NeonDepthwiseConv.cpp @@ -21,7 +21,7 @@ namespace paddle { template class NeonDepthwiseConvFunction : public ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); } diff --git a/paddle/function/neon/NeonDepthwiseConvTranspose.cpp b/paddle/function/neon/NeonDepthwiseConvTranspose.cpp index d443d3fa4902f998230651c5c64355d93c4c4f6a..1fc5daf6078bbd5b4506ff2e0832e2cc3ec48fe3 100644 --- a/paddle/function/neon/NeonDepthwiseConvTranspose.cpp +++ b/paddle/function/neon/NeonDepthwiseConvTranspose.cpp @@ -21,7 +21,7 @@ namespace paddle { template class NeonDepthwiseConvTransposeFunction : public ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); } diff --git a/paddle/function/nnpack/NNPACKConvOp.cpp b/paddle/function/nnpack/NNPACKConvOp.cpp index 3cdba4f2ed0dad42035fe2d0de87ad5aeeef20ca..48c997b50d8c73b25c58801c30e597c9d1f3232a 100644 --- a/paddle/function/nnpack/NNPACKConvOp.cpp +++ b/paddle/function/nnpack/NNPACKConvOp.cpp @@ -46,7 +46,7 @@ nnp_convolution_algorithm get_nnp_convolution_algorithm( template class NNPACKConvFunction : public ConvFunctionBase { -public: + public: void init(const FuncConfig& config) override { ConvFunctionBase::init(config); algorithm_ = get_nnp_convolution_algorithm(config.get("algo")); @@ -231,7 +231,7 @@ public: } } -private: + private: nnp_convolution_algorithm algorithm_; nnp_convolution_transform_strategy transform_strategy_; void* workspaceBuffer_; diff --git a/paddle/gserver/activations/ActivationFunction.cpp b/paddle/gserver/activations/ActivationFunction.cpp index 8d8f01234fe3859989e44fe6147105fb72b832ff..71c238fbfe9f32f3764601ebb441336931f8ef5f 100644 --- a/paddle/gserver/activations/ActivationFunction.cpp +++ b/paddle/gserver/activations/ActivationFunction.cpp @@ -44,10 +44,10 @@ static ClassRegistrar gActivationRegistrar; */ #define BEGIN_DEFINE_ACTIVATION(ACTIVATION_NAME) \ class ACTIVATION_CLASS_NAME(ACTIVATION_NAME) : public ActivationFunction { \ - private: \ + private: \ static const std::string name; \ \ - public: \ + public: \ const std::string& getName() const { return name; } /** * @def END_DEFINE_ACTIVATION @@ -70,7 +70,7 @@ static ClassRegistrar gActivationRegistrar; * Do nothing when forward/backward. 
*/ class IdentityActivation : public ActivationFunction { -public: + public: static const std::string name; Error __must_check forward(Argument& act) { (void)act; diff --git a/paddle/gserver/activations/ActivationFunction.h b/paddle/gserver/activations/ActivationFunction.h index 0f4b0fe0abb85403d42fc8a2ac28560e10058c20..8e2e144769f2e668a9a8f02890d29c4a7fe128a3 100644 --- a/paddle/gserver/activations/ActivationFunction.h +++ b/paddle/gserver/activations/ActivationFunction.h @@ -31,7 +31,7 @@ struct Argument; * */ class ActivationFunction { -public: + public: static ActivationFunction* create(const std::string& type); static std::vector getAllRegisteredTypes(); diff --git a/paddle/gserver/activations/MKLDNNActivation.cpp b/paddle/gserver/activations/MKLDNNActivation.cpp index 56ffb839344aabe43eaae0bd46e6dbf95e4d8f20..672444c6561adbeb78c3c453f12ab6aaedeed646 100644 --- a/paddle/gserver/activations/MKLDNNActivation.cpp +++ b/paddle/gserver/activations/MKLDNNActivation.cpp @@ -35,10 +35,10 @@ static ClassRegistrar gMKLDNNActivationRegistrar; * @def END_MKLDNN_ACTIVATION */ #define END_MKLDNN_ACTIVATION(ACT_TYPE) \ -private: \ + private: \ static const std::string name; \ \ -public: \ + public: \ const std::string& getName() const { return name; } \ } \ ; \ @@ -63,11 +63,11 @@ public: \ #define DEFINE_MKLDNN_ELTWISE_ACTIVATION( \ ACT_TYPE, BASE_CLASS, ALPHA, BWD_ALPHA) \ BEGIN_MKLDNN_ACTIVATION(ACT_TYPE, BASE_CLASS) \ -private: \ + private: \ static const float alpha; \ static const float bwdAlpha; \ \ -public: \ + public: \ float getAlpha() const { return alpha; } \ float getBwdAlpha() const { return bwdAlpha; } \ END_MKLDNN_ACTIVATION(ACT_TYPE) \ diff --git a/paddle/gserver/activations/MKLDNNActivation.h b/paddle/gserver/activations/MKLDNNActivation.h index 392b32c70dae3728e13ee64f09f135c015c122cf..eece1b9c37e72624dffd119804c65f7bd36e20fb 100644 --- a/paddle/gserver/activations/MKLDNNActivation.h +++ b/paddle/gserver/activations/MKLDNNActivation.h @@ -27,7 +27,7 @@ namespace paddle { * including mkldnn_relu, mkldnn_elu, mkldnn_tanh, mkldnn_softmax */ class MKLDNNActivation : public ActivationFunction { -protected: + protected: // input value element count size_t cnt_; // should not merge the resetBwd into resetFwd, @@ -43,7 +43,7 @@ protected: std::vector pipelineFwd_; std::vector pipelineBwd_; -public: + public: MKLDNNActivation() : cnt_(0), needResetBwd_(true) {} ~MKLDNNActivation() {} static ActivationFunction* create(const std::string& type); @@ -72,7 +72,7 @@ class MKLDNNEltwiseActivation : public MKLDNNActivation { typedef mkldnn::eltwise_backward eltwise_bwd; typedef mkldnn::algorithm algorithm; -protected: + protected: // save the forward primitive desc, which can be used backward std::shared_ptr fwdPD_; // eltwise_bwd need src input value @@ -80,7 +80,7 @@ protected: // use for copy data std::shared_ptr copyInVal_; -public: + public: MKLDNNEltwiseActivation() {} ~MKLDNNEltwiseActivation() {} virtual const std::string& getName() const = 0; @@ -102,12 +102,12 @@ public: class MKLDNNSoftmaxActivation : public MKLDNNActivation { typedef mkldnn::softmax_forward softmax_fwd; -private: + private: // for backward MatrixPtr sftMaxSum_; MatrixPtr sftMaxDot_; -public: + public: MKLDNNSoftmaxActivation() {} ~MKLDNNSoftmaxActivation() {} virtual const std::string& getName() const = 0; diff --git a/paddle/gserver/dataproviders/DataProvider.h b/paddle/gserver/dataproviders/DataProvider.h index 4851168abab7179d552648c88923a529d55e6a7e..21822b10c2ebf1d353195794cf8f49e02b64c177 100644 --- 
a/paddle/gserver/dataproviders/DataProvider.h +++ b/paddle/gserver/dataproviders/DataProvider.h @@ -71,7 +71,7 @@ typedef std::shared_ptr BufferBatchPtr; * @brief Data for batch training a neural network */ class DataBatch { -public: + public: DataBatch() : size_(0) { data_.clear(); } /** * @brief Get batch size @@ -181,7 +181,7 @@ public: } } -protected: + protected: /** * @brief batch size */ @@ -194,7 +194,7 @@ protected: }; class BufferBatch { -public: + public: BufferBatch() { hlStream_ = HPPL_STREAM_DEFAULT; hlEvent_ = NULL; @@ -235,7 +235,7 @@ public: void swap(BufferBatch* bufBatch); void clone(DataBatch* srcBatch, bool useGpu); -protected: + protected: DataBatch* batchData_; hl_stream_t hlStream_; hl_event_t hlEvent_; @@ -247,7 +247,7 @@ typedef std::shared_ptr DataProviderPtr; typedef Queue BufferBatchQueue; class DoubleBuffer { -public: + public: DoubleBuffer(DataProvider* dataPool, bool useGpu, int64_t batchSize = 0); virtual ~DoubleBuffer(); void removeOneBatch(DataBatch* dataBatch); @@ -267,7 +267,7 @@ public: void setPending(bool pending) { pending_ = pending; } -protected: + protected: virtual void asyncLoadBatch(); void insertOneBatch(DataBatch* batch); @@ -290,7 +290,7 @@ protected: * one is for input, one is for label. */ class DataProvider { -public: + public: static ClassRegistrar registrar_; static DataProvider* create(const DataConfig& config, const ModelConfig& modelConfig, @@ -359,7 +359,7 @@ public: */ virtual int64_t getNextBatchInternal(int64_t size, DataBatch* batch) = 0; -protected: + protected: DataConfig config_; bool skipShuffle_; float usageRatio_; @@ -382,7 +382,7 @@ protected: * necessary configurations such as stream_names */ class DummyDataProvider : public DataProvider { -public: + public: DummyDataProvider(const DataConfig& config, bool useGpu) : DataProvider(config, useGpu) {} virtual void shuffle() {} @@ -399,7 +399,7 @@ public: * Data provider for one input and one integer label. */ class SimpleDataProviderBase : public DataProvider { -protected: + protected: /// sample feature dimension int64_t sampleDim_; /// the number of samples @@ -425,7 +425,7 @@ protected: RWLock lock_; -public: + public: SimpleDataProviderBase(const DataConfig& config, bool useGpu, bool withInfo); ~SimpleDataProviderBase() {} @@ -440,7 +440,7 @@ public: /// return the number of samples in the buffer int64_t fillBuffer(); -protected: + protected: /** * @brief Fill at most size samples into data and label. 
* @@ -458,12 +458,12 @@ protected: }; class SimpleDataProvider : public SimpleDataProviderBase { -public: + public: SimpleDataProvider(const DataConfig& config, bool useGpu); ~SimpleDataProvider(); virtual void reset(); -protected: + protected: void loadData(const std::string& fileName); void loadDataFile(const std::string& fileName); virtual int64_t fillBufferImp(real* data, @@ -471,7 +471,7 @@ protected: int* info, int64_t size); -protected: + protected: size_t currentSampleIndex_; std::vector labels_; std::vector data_; diff --git a/paddle/gserver/dataproviders/DataProviderGroup.h b/paddle/gserver/dataproviders/DataProviderGroup.h index 768e54fe82bedd6faca5ad9eb2b6f2ee0017dc3d..91c94dc986c7aeb70df25511ce14a5f9c312a159 100644 --- a/paddle/gserver/dataproviders/DataProviderGroup.h +++ b/paddle/gserver/dataproviders/DataProviderGroup.h @@ -20,7 +20,7 @@ namespace paddle { template class DataProviderGroup : public DataProvider { -protected: + protected: typedef T ProviderType; typedef std::shared_ptr ProviderPtrType; ProviderPtrType provider_; @@ -29,7 +29,7 @@ protected: std::mutex lock_; std::unique_ptr> loader_; -public: + public: DataProviderGroup(const DataConfig& config, bool useGpu); ~DataProviderGroup() {} @@ -38,7 +38,7 @@ public: virtual int64_t getSize() { return -1; } virtual int64_t getNextBatchInternal(int64_t size, DataBatch* batch); -private: + private: void startLoader(); void stopLoader(); void forceStopLoader(); diff --git a/paddle/gserver/dataproviders/MultiDataProvider.h b/paddle/gserver/dataproviders/MultiDataProvider.h index 9a863c896773d71a99e21660fc13e3dd477a0c12..baa1fc019002f86414c9c45734ad65cda916d457 100644 --- a/paddle/gserver/dataproviders/MultiDataProvider.h +++ b/paddle/gserver/dataproviders/MultiDataProvider.h @@ -19,10 +19,10 @@ limitations under the License. */ namespace paddle { class MultiDataProvider : public DataProvider { -protected: + protected: std::vector> subDataProviders_; -public: + public: MultiDataProvider(const DataConfig& config, const ModelConfig& modelConfig, bool useGpu); @@ -33,7 +33,7 @@ public: virtual int64_t getNextBatchInternal(int64_t size, DataBatch* batch); bool isTestMode() const { return isTestMode_; } -private: + private: int totalDataRatio_; bool isTestMode_; }; diff --git a/paddle/gserver/dataproviders/ProtoReader.h b/paddle/gserver/dataproviders/ProtoReader.h index 786703f4dee4802bb967f9d15fb69ebcbc15d997..08d045226e1ebb014bdd91ebf0e8f0353179b0c8 100644 --- a/paddle/gserver/dataproviders/ProtoReader.h +++ b/paddle/gserver/dataproviders/ProtoReader.h @@ -28,7 +28,7 @@ namespace paddle { * messages from/to i/ostream. 
*/ class ProtoReader { -public: + public: explicit ProtoReader(std::istream* s, bool dataCompression = false) { CHECK(s) << "istream pointer is nullptr"; istreamInput_.reset(new google::protobuf::io::IstreamInputStream(s)); @@ -109,7 +109,7 @@ public: return true; } -protected: + protected: std::unique_ptr istreamInput_; std::unique_ptr gzipInput_; std::unique_ptr codedInput_; @@ -144,7 +144,7 @@ protected: }; class ProtoWriter { -public: + public: explicit ProtoWriter(std::ostream* s, bool dataCompression = false) { CHECK(s) << "ostream pointer is nullptr"; ostreamOutput_.reset(new google::protobuf::io::OstreamOutputStream(s)); @@ -168,7 +168,7 @@ public: return ret; } -protected: + protected: std::unique_ptr ostreamOutput_; std::unique_ptr gzipOutput_; std::unique_ptr codedOutput_; diff --git a/paddle/gserver/dataproviders/PyDataProvider.h b/paddle/gserver/dataproviders/PyDataProvider.h index e53354c9e43ea9dc58fd4bd38a533025b6f17482..da50dd4e2ebb743ef45af319bc713ed7ac3d3e10 100644 --- a/paddle/gserver/dataproviders/PyDataProvider.h +++ b/paddle/gserver/dataproviders/PyDataProvider.h @@ -23,7 +23,7 @@ limitations under the License. */ namespace paddle { class PyDataProvider : public DataProvider { -public: + public: PyDataProvider(const DataConfig& config, bool useGpu, bool loadDataAll = true); @@ -40,7 +40,7 @@ public: virtual int64_t getNextBatchInternal(int64_t size, DataBatch* batch); -protected: + protected: struct ProtoSlot; // return false if each each sample is one sequence, i.e., independent // of other samples. @@ -73,7 +73,7 @@ protected: void resetSlots(); void loadData(const std::vector& fileList); -protected: + protected: struct ProtoSlot { SlotDef::SlotType type; int dim; diff --git a/paddle/gserver/dataproviders/PyDataProvider2.cpp b/paddle/gserver/dataproviders/PyDataProvider2.cpp index b4215bb307cc31ce64bb724986b88fdc20bbbf45..54ee091e8f257f76b113d4ca6f8a7c3989c0c1df 100644 --- a/paddle/gserver/dataproviders/PyDataProvider2.cpp +++ b/paddle/gserver/dataproviders/PyDataProvider2.cpp @@ -93,7 +93,7 @@ inline std::ostream& operator<<(std::ostream& os, const SlotHeader& header) { * prepare step, fill data into argument during fill step. */ class IFieldScanner { -public: + public: DISABLE_COPY(IFieldScanner); /** * Ctor. @@ -146,7 +146,7 @@ public: */ static IFieldScanner* create(SlotHeader* header); -protected: + protected: SlotHeader* headerPtr_; }; @@ -154,7 +154,7 @@ protected: * Py Data Provider Cache Interface. */ class IPyDataProviderCache { -public: + public: virtual ~IPyDataProviderCache() {} /** @@ -193,7 +193,7 @@ public: * data. And it support cache strategies. */ class PyDataProvider2 : public DataProvider { -public: + public: /** * Ctor */ @@ -234,7 +234,7 @@ public: */ virtual ~PyDataProvider2() { resetImpl(false); } -private: + private: void createPyDataObj(const std::string& model, const std::string& className, const std::string& fileListName, @@ -435,7 +435,7 @@ private: exit_ = false; } -private: + private: std::unique_ptr loadThread_; std::atomic exit_; std::deque callingContexts_; @@ -461,7 +461,7 @@ private: static PyObjectPtr zeroTuple_; class PositionRandom { - public: + public: inline explicit PositionRandom(bool skipRand) : eng_(ThreadLocalRandomEngine::get()), skipRand_(skipRand) {} @@ -476,14 +476,14 @@ private: } } - private: + private: std::default_random_engine& eng_; std::unique_ptr> dist_; bool skipRand_; }; // DataProvider interface -public: + public: /** * Resetting the PyDataProvider. May start reading thread here. 
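The field scanners touched below follow a two-phase protocol: a prepare pass sizes the output argument, then a fill pass copies sample data into it. A stripped-down sketch of the idea (types and names are stand-ins, not Paddle's `IFieldScanner` interface):

```cpp
#include <cstddef>
#include <vector>

// Stand-in for paddle::Argument: just a dense buffer here.
struct Argument {
  std::vector<float> value;
};

class FieldScanner {
 public:
  virtual ~FieldScanner() {}
  // Pass 1: inspect each sample and accumulate the required capacity.
  virtual void prepare(const std::vector<float>& sample) = 0;
  // Pass 2: copy each sample into the pre-sized argument.
  virtual void fill(Argument& arg, const std::vector<float>& sample) = 0;
};

class DenseScanner : public FieldScanner {
 public:
  void prepare(const std::vector<float>& sample) override {
    capacity_ += sample.size();  // count first, allocate once
  }
  void fill(Argument& arg, const std::vector<float>& sample) override {
    arg.value.reserve(capacity_);
    arg.value.insert(arg.value.end(), sample.begin(), sample.end());
  }

 private:
  size_t capacity_ = 0;
};
```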
*/ @@ -666,7 +666,7 @@ REGISTER_DATA_PROVIDER_EX(py2, PyDataProvider2); * Scanner for dense slot. */ class DenseScanner : public IFieldScanner { -public: + public: explicit DenseScanner(SlotHeader* ptr) : IFieldScanner(ptr), height_(0) {} /** @@ -708,7 +708,7 @@ public: ++height_; } -private: + private: size_t height_; }; @@ -716,7 +716,7 @@ private: * Scanner for index slot */ class IndexScanner : public IFieldScanner { -public: + public: explicit IndexScanner(SlotHeader* ptr) : IFieldScanner(ptr), cnt_(0) {} /** @@ -740,12 +740,12 @@ public: CHECK(ok) << "Cannot cast int " << py::repr(obj); } -private: + private: size_t cnt_; }; class SparseNonValueScanner : public IFieldScanner { -public: + public: explicit SparseNonValueScanner(SlotHeader* ptr) : IFieldScanner(ptr), nnz_(0), height_(0) {} @@ -790,7 +790,7 @@ public: ++height_; } -protected: + protected: /** * Set a single sparse index and value. * @param [out] col sparse index @@ -809,7 +809,7 @@ protected: }; class SparseValueScanner : public SparseNonValueScanner { -public: + public: explicit SparseValueScanner(SlotHeader* ptr) : SparseNonValueScanner(ptr) {} virtual void finishPrepare(Argument& argument) { @@ -817,7 +817,7 @@ public: argument.value, height_, headerPtr_->dim, nnz_, FLOAT_VALUE); } -protected: + protected: virtual void setData(int* col, real* dat, PyObject* obj) { py::SequenceHelper s(obj); SparseNonValueScanner::setData(col, dat, s[0]); @@ -829,7 +829,7 @@ protected: * Sequence Scanner. Scanner for sequence or sub-sequence. */ class SequenceScanner : public IFieldScanner { -public: + public: /** * Ctor * @param innerScanner inner scanner for each timestep or sub-sequence. @@ -902,7 +902,7 @@ public: */ virtual void finishFill(Argument& argument) { inner_->finishFill(argument); } -protected: + protected: size_t getSize(PyObject* obj) { py::SequenceHelper s(obj); auto sc = dynamic_cast(inner_.get()); @@ -917,7 +917,7 @@ protected: } } -private: + private: std::unique_ptr inner_; size_t cnt_; std::function getSeqStartPos_; @@ -969,7 +969,7 @@ IFieldScanner* IFieldScanner::create(SlotHeader* header) { * python every pass. */ class NoCacheStrategy : public IPyDataProviderCache { -public: + public: virtual bool reset() { return true; } virtual void drop(std::deque* data) { data->clear(); } @@ -984,7 +984,7 @@ public: * The rest passes, will load data from memory. 
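A minimal sketch of the cache-one-pass-in-memory strategy described above: the first pass loads from the source and retains every dropped object, and later passes simply replay the retained pool (element type and names are illustrative):

```cpp
#include <deque>

using Element = int;  // stands in for the cached PyObjectPtr type

class OnePassCache {
 public:
  // Returns true when the caller must load data from the original source.
  bool reset() {
    if (firstPass_) {
      firstPass_ = false;
      return true;
    }
    pool_.swap(dropped_);  // replay everything cached during pass one
    return false;
  }
  // Instead of freeing processed items, retain them for later passes.
  void drop(std::deque<Element>* data) {
    dropped_.insert(dropped_.end(), data->begin(), data->end());
    data->clear();
  }
  std::deque<Element>* load() { return &pool_; }

 private:
  bool firstPass_ = true;
  std::deque<Element> pool_;
  std::deque<Element> dropped_;
};
```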
*/ class CacheOnePassInMemory : public IPyDataProviderCache { -public: + public: CacheOnePassInMemory() : objPool_(new std::deque()), droppedPool_(new std::deque()) {} @@ -1011,7 +1011,7 @@ public: virtual std::deque* load() { return objPool_.get(); } -private: + private: std::unique_ptr> objPool_; std::unique_ptr> droppedPool_; }; diff --git a/paddle/gserver/evaluators/CTCErrorEvaluator.cpp b/paddle/gserver/evaluators/CTCErrorEvaluator.cpp index 0f680de776f4755ca5fe83c86ea759d88f93ed01..c6cd41de9a1a22470d8659eb90d1ac2b075b2df9 100644 --- a/paddle/gserver/evaluators/CTCErrorEvaluator.cpp +++ b/paddle/gserver/evaluators/CTCErrorEvaluator.cpp @@ -22,7 +22,7 @@ namespace paddle { * calculate sequence-to-sequence edit distance */ class CTCErrorEvaluator : public Evaluator { -private: + private: MatrixPtr outActivations_; int numTimes_, numClasses_, numSequences_, blank_; real deletions_, insertions_, substitutions_; @@ -197,7 +197,7 @@ private: (real)seqClassficationError_ / numSequences_; } -public: + public: CTCErrorEvaluator() : numTimes_(0), numClasses_(0), diff --git a/paddle/gserver/evaluators/ChunkEvaluator.cpp b/paddle/gserver/evaluators/ChunkEvaluator.cpp index 755b91d05caf33745e66415e7b111ba348c575d9..a2216293b1ab3a32e9cc903b805ca0aca10d58c1 100644 --- a/paddle/gserver/evaluators/ChunkEvaluator.cpp +++ b/paddle/gserver/evaluators/ChunkEvaluator.cpp @@ -77,7 +77,7 @@ class ChunkEvaluator : public Evaluator { std::set excludedChunkTypes_; mutable std::unordered_map values_; -public: + public: virtual void init(const EvaluatorConfig& config) { Evaluator::init(config); if (config.chunk_scheme() == "IOB") { @@ -276,7 +276,7 @@ public: return "chunk"; } -private: + private: void storeLocalValues() const { CHECK_GE(numOutputSegments_, 0); CHECK_GE(numLabelSegments_, 0); diff --git a/paddle/gserver/evaluators/DetectionMAPEvaluator.cpp b/paddle/gserver/evaluators/DetectionMAPEvaluator.cpp index f43ef5dd51407236a3a36b300b33f92a9fad885a..ddb8ebca784db4a83c328ff75f5c50c7aecd7352 100644 --- a/paddle/gserver/evaluators/DetectionMAPEvaluator.cpp +++ b/paddle/gserver/evaluators/DetectionMAPEvaluator.cpp @@ -28,7 +28,7 @@ namespace paddle { * The config file api is detection_map_evaluator. */ class DetectionMAPEvaluator : public Evaluator { -public: + public: DetectionMAPEvaluator() : evaluateDifficult_(false), cpuOutput_(nullptr), cpuLabel_(nullptr) {} @@ -132,7 +132,7 @@ public: LOG(FATAL) << "Distribute detection evaluation not implemented."; } -protected: + protected: void calcTFPos(const size_t batchSize, const vector>>& allGTBBoxes, const vector>>>& @@ -287,7 +287,7 @@ protected: real getValueImpl() const { return calcMAP(); } -private: + private: real overlapThreshold_; // overlap threshold when determining whether matched bool evaluateDifficult_; // whether evaluate difficult ground truth size_t backgroundId_; // class index of background diff --git a/paddle/gserver/evaluators/Evaluator.cpp b/paddle/gserver/evaluators/Evaluator.cpp index 79478e7fac63a49c494105d53a6944b4b89e6c63..941fb8fb539d58cca22ecf563d2effa816243c3b 100644 --- a/paddle/gserver/evaluators/Evaluator.cpp +++ b/paddle/gserver/evaluators/Evaluator.cpp @@ -38,7 +38,7 @@ void Evaluator::eval(const NeuralNetwork& nn) { * The config file api is classification_error_evaluator. 
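`CTCErrorEvaluator` above is built on sequence-to-sequence edit distance. For reference, the classic dynamic program (Paddle's version additionally tracks insertions, deletions, and substitutions separately):

```cpp
#include <algorithm>
#include <vector>

int editDistance(const std::vector<int>& a, const std::vector<int>& b) {
  std::vector<std::vector<int>> d(a.size() + 1,
                                  std::vector<int>(b.size() + 1, 0));
  for (size_t i = 0; i <= a.size(); ++i) d[i][0] = static_cast<int>(i);
  for (size_t j = 0; j <= b.size(); ++j) d[0][j] = static_cast<int>(j);
  for (size_t i = 1; i <= a.size(); ++i) {
    for (size_t j = 1; j <= b.size(); ++j) {
      d[i][j] = std::min({d[i - 1][j] + 1,  // deletion
                          d[i][j - 1] + 1,  // insertion
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1])});  // subst.
    }
  }
  return d[a.size()][b.size()];
}
```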
*/ class ClassificationErrorEvaluator : public Evaluator { -public: + public: /* ClassificationErrorEvaluator() : totalScore2_(0) {} @@ -124,7 +124,7 @@ public: } // Evaluator interface -protected: + protected: std::string getTypeImpl() const { return "classification_error"; } }; @@ -135,7 +135,7 @@ protected: */ class SequenceClassificationErrorEvaluator : public ClassificationErrorEvaluator { -public: + public: virtual void updateSamplesNum(const std::vector& arguments) { numSamples_ += arguments[0].getNumSequences(); } @@ -166,7 +166,7 @@ public: } // Evaluator interface -protected: + protected: std::string getTypeImpl() const { return "seq_classification_error"; } }; REGISTER_EVALUATOR(seq_classification_error, @@ -178,7 +178,7 @@ REGISTER_EVALUATOR(seq_classification_error, * The config file api is sum_evaluator. */ class SumEvaluator : public Evaluator { -public: + public: SumEvaluator() : cpuLabel_(nullptr), cpuWeight_(nullptr) {} virtual void updateSamplesNum(const std::vector& arguments) { @@ -255,12 +255,12 @@ public: mergeResultsOfAllClients(client); } -private: + private: IVectorPtr cpuLabel_; MatrixPtr cpuWeight_; // Evaluator interface -protected: + protected: std::string getTypeImpl() const { return "sum"; } }; /** @@ -274,7 +274,7 @@ protected: * */ class ColumnSumEvaluator : public Evaluator { -public: + public: explicit ColumnSumEvaluator(int32_t colIdx) : colIdx_(colIdx), colNum_(0), sum_(nullptr) {} @@ -368,13 +368,13 @@ public: client->reduce(&numSamples_, &numSamples_, 1, FLAGS_trainer_id, 0); } -private: + private: int32_t colIdx_; size_t colNum_; MatrixPtr sum_; /* cpu matrix */ // Evaluator interface -protected: + protected: std::string getTypeImpl() const { if (colIdx_ == -1) return "last-column-sum"; @@ -1018,7 +1018,7 @@ static InitFunction __reg_type_auc_sum__([]() { * The config file api is value_printer_evaluator. */ class ValuePrinter : public NotGetableEvaluator { -public: + public: virtual void eval(const NeuralNetwork& nn) { for (const std::string& name : config_.input_layers()) { nn.getLayer(name)->getOutput().printValueString(LOG(INFO), @@ -1038,7 +1038,7 @@ REGISTER_EVALUATOR(value_printer, ValuePrinter); * The config file api is gradient_printer_evaluator. */ class GradientPrinter : public NotGetableEvaluator { -public: + public: virtual void eval(const NeuralNetwork& nn) { for (const std::string& name : config_.input_layers()) { const Argument& argu = nn.getLayer(name)->getOutput(); @@ -1061,11 +1061,11 @@ REGISTER_EVALUATOR(gradient_printer, GradientPrinter); * The config file api is maxid_printer_evaluator. */ class MaxIdPrinter : public NotGetableEvaluator { -private: + private: IVectorPtr maxIds_; MatrixPtr maxValues_; -public: + public: MaxIdPrinter() {} virtual void eval(const NeuralNetwork& nn) { @@ -1103,12 +1103,12 @@ REGISTER_EVALUATOR(max_id_printer, MaxIdPrinter); * The config file api is maxframe_printer_evaluator. 
*/ class MaxFramePrinter : public NotGetableEvaluator { -private: + private: IVectorPtr maxIds_; MatrixPtr maxValues_; MatrixPtr value_; -public: + public: MaxFramePrinter() { value_ = Matrix::create(nullptr, /* height= */ 1, 1, /* trans= */ false, false); @@ -1190,7 +1190,7 @@ REGISTER_EVALUATOR(max_frame_printer, MaxFramePrinter); * */ class SequenceTextPrinter : public NotGetableEvaluator { -private: + private: /// dict_file, which contains a list of tokens std::vector dict_; /// result_file, which is the output file @@ -1203,7 +1203,7 @@ private: /// store the probability associated with each sequence std::vector cpuIn_; -public: + public: SequenceTextPrinter() {} virtual void init(const EvaluatorConfig& config) { @@ -1334,7 +1334,7 @@ REGISTER_EVALUATOR(seq_text_printer, SequenceTextPrinter); * The config file api is classification_error_printer_evaluator. */ class ClassificationErrorPrinter : public ClassificationErrorEvaluator { -public: + public: virtual void updateSamplesNum(const std::vector& arguments) {} virtual real evalImp(std::vector& arguments) { diff --git a/paddle/gserver/evaluators/Evaluator.h b/paddle/gserver/evaluators/Evaluator.h index be2032992c455fe2b442dbe05d84128ef8ebf82f..42948f1097d9a12600f4b11646a47e45b9bf4e96 100644 --- a/paddle/gserver/evaluators/Evaluator.h +++ b/paddle/gserver/evaluators/Evaluator.h @@ -40,7 +40,7 @@ class NeuralNetwork; * has been by a trained model. */ class Evaluator { -public: + public: static Evaluator* create(const EvaluatorConfig& config); Evaluator() : numSamples_(0), totalScore_(0) {} @@ -172,7 +172,7 @@ public: return this->getTypeImpl(); } -protected: + protected: /** * @brief getValueImpl The simplest way to define getValue result. If this * evaluator doesn't contain multiple fields, and do not throw any error, just @@ -191,7 +191,7 @@ protected: */ virtual std::string getTypeImpl() const { return "base"; } -protected: + protected: EvaluatorConfig config_; double numSamples_; double totalScore_; @@ -204,7 +204,7 @@ protected: */ class NotGetableEvaluator : public Evaluator { // Evaluator interface -public: + public: void getNames(std::vector* names) {} real getValue(const std::string& name, Error* err) const { @@ -219,7 +219,7 @@ public: }; class DummyEvaluator : public Evaluator { -public: + public: DummyEvaluator() {} virtual void init(const EvaluatorConfig&) {} virtual void start() {} @@ -232,7 +232,7 @@ public: virtual void printStats(std::ostream&) const {} // Evaluator interface -protected: + protected: std::string getTypeImpl() const; }; /** @@ -251,7 +251,7 @@ protected: * */ class AucEvaluator : public Evaluator { -public: + public: AucEvaluator(int32_t colIdx) : colIdx_(colIdx), realColumnIdx_(0), @@ -269,7 +269,7 @@ public: virtual void distributeEval(ParameterClient2* client); -private: + private: static const uint32_t kBinNum_ = (1 << 24) - 1; static const int kNegativeLabel_ = 0; double statPos_[kBinNum_ + 1]; @@ -292,7 +292,7 @@ private: double calcAuc() const; // Evaluator interface -protected: + protected: real getValueImpl() const; std::string getTypeImpl() const; }; @@ -305,7 +305,7 @@ protected: * dense value. 
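`AucEvaluator` above buckets prediction scores into `kBinNum_` histogram bins (`statPos_`/`statNeg_`), so AUC falls out of one linear sweep instead of a sort. A sketch of that computation, assuming a higher bin index means a higher score (simplified relative to Paddle's version):

```cpp
#include <cstddef>
#include <vector>

double binnedAuc(const std::vector<double>& statPos,
                 const std::vector<double>& statNeg) {
  double totPos = 0.0, totNeg = 0.0, auc = 0.0;
  // Sweep from the highest-score bin down, adding one ROC trapezoid per bin.
  for (size_t i = statPos.size(); i-- > 0;) {
    double newPos = totPos + statPos[i];
    double newNeg = totNeg + statNeg[i];
    auc += (newNeg - totNeg) * (totPos + newPos) / 2.0;
    totPos = newPos;
    totNeg = newNeg;
  }
  return (totPos > 0.0 && totNeg > 0.0) ? auc / (totPos * totNeg) : 0.0;
}
```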
*/ class RankAucEvaluator : public Evaluator { -public: + public: // evaluate ranking AUC virtual void start(); @@ -317,7 +317,7 @@ public: mergeResultsOfAllClients(client); } -private: + private: MatrixPtr output_; MatrixPtr click_; MatrixPtr pv_; @@ -329,7 +329,7 @@ private: size_t size); // Evaluator interface -protected: + protected: std::string getTypeImpl() const; }; @@ -344,7 +344,7 @@ protected: * The config file api is precision_recall_evaluator. */ class PrecisionRecallEvaluator : public Evaluator { -public: + public: // Evaluate precision, recall and F1 score PrecisionRecallEvaluator() : isMultiBinaryLabel_(false), @@ -379,7 +379,7 @@ public: StatsInfo() : TP(0.0), TN(0.0), FP(0.0), FN(0.0) {} }; -private: + private: bool isMultiBinaryLabel_; std::vector statsInfo_; @@ -444,7 +444,7 @@ private: * The config file api is pnpair_evaluator. */ class PnpairEvaluator : public Evaluator { -public: + public: PnpairEvaluator() : cpuOutput_(nullptr), cpuLabel_(nullptr), @@ -491,7 +491,7 @@ public: << " calc total neg pair: " << pairArray_[1]; } -private: + private: static const uint32_t kPairArrayNum_ = 2; double pairArray_[kPairArrayNum_]; MatrixPtr cpuOutput_; @@ -500,7 +500,7 @@ private: MatrixPtr cpuWeight_; // Evaluator interface -protected: + protected: real getValueImpl() const { return pairArray_[0] / ((pairArray_[1] <= 0) ? 1.0 : pairArray_[1]); } diff --git a/paddle/gserver/gradientmachines/GradientMachine.h b/paddle/gserver/gradientmachines/GradientMachine.h index 60936c311d1b0119186c76d5c95b8819294446ce..22cf5d265f429ecbcea1808a54c85d7e89f8bc99 100644 --- a/paddle/gserver/gradientmachines/GradientMachine.h +++ b/paddle/gserver/gradientmachines/GradientMachine.h @@ -73,7 +73,7 @@ class GradientMachine; typedef std::shared_ptr GradientMachinePtr; class GradientMachine { -public: + public: enum CreateMode { kNormal = 0, kSgdSparseCpuTraining = 3, @@ -240,7 +240,7 @@ public: */ virtual void releaseOutput() {} -protected: + protected: virtual void onLoadParameter() {} std::vector parameters_; diff --git a/paddle/gserver/gradientmachines/GradientMachineMode.h b/paddle/gserver/gradientmachines/GradientMachineMode.h index 898b68fbbc329145109ad0ae4b97c872d4f9a37c..dd944a35f8952e354f8e4f3eb5c67b136c5f080e 100644 --- a/paddle/gserver/gradientmachines/GradientMachineMode.h +++ b/paddle/gserver/gradientmachines/GradientMachineMode.h @@ -19,14 +19,14 @@ limitations under the License. */ namespace paddle { class IGradientMachineMode { -public: + public: virtual ~IGradientMachineMode() {} -public: // interfaces - /** - * @brief create current mode's gradient machine by model config. - * @param config model config - */ + public: // interfaces + /** + * @brief create current mode's gradient machine by model config. + * @param config model config + */ virtual GradientMachine* create(const ModelConfig& config) = 0; /** @@ -55,14 +55,14 @@ public: // interfaces */ virtual bool needTrainWholeDataInOneBatch() const = 0; -public: // static methods. - /** - * @brief register a custom gradient machine mode. - * @note For user to register a custom gradient machine mode, id should >= - * kCustom. - * @param mode mode id. - * @param ptr mode description object. - */ + public: // static methods. + /** + * @brief register a custom gradient machine mode. + * @note For user to register a custom gradient machine mode, id should >= + * kCustom. + * @param mode mode id. + * @param ptr mode description object. 
+ */ static void regGradientMachineMode( int32_t mode, std::unique_ptr&& ptr) { modes_.insert(std::make_pair(mode, std::move(ptr))); @@ -141,7 +141,7 @@ public: // static methods. } } -private: + private: static std::unordered_map> modes_; }; diff --git a/paddle/gserver/gradientmachines/MultiGradientMachine.h b/paddle/gserver/gradientmachines/MultiGradientMachine.h index 83d2651f34b3698848427f29b1a90e606e57950e..eff7d5284c6dd4898344203b50acc94ae61b4d59 100644 --- a/paddle/gserver/gradientmachines/MultiGradientMachine.h +++ b/paddle/gserver/gradientmachines/MultiGradientMachine.h @@ -166,7 +166,7 @@ struct GradBuffer { * the merged gradient to parameter server. */ class MultiGradientMachine : public GradientMachine { -public: + public: enum TaskType { TASK_FORWARD_BACKWARD = 0, TASK_FORWARD = 1, @@ -213,7 +213,7 @@ public: /// The gradietns will be copied to each thread in the computing threads. virtual void setOutputGrad(const std::vector& args); -protected: + protected: friend class TrainerThread; std::vector& getAllThreads() { return threads_; } @@ -281,7 +281,7 @@ protected: int paraMainThread(int pid) const { return paraMainThread_[pid]; } -protected: + protected: virtual void forwardImp(const std::vector& inArgs, std::vector* outArgs, PassType passType, @@ -298,7 +298,7 @@ protected: void allocGradBufs(); -protected: + protected: bool useGpu_; bool hasNonstaticCpuParamters_; @@ -342,7 +342,7 @@ protected: }; class TrainerThread { -public: + public: TrainerThread(const ModelConfig& config, int threadId, MultiGradientMachine* multiMachine); @@ -392,7 +392,7 @@ public: /// Whether the thread has input data. bool hasInputData() { return batchSize_ != 0; } -protected: + protected: void mergeCpuGradients(); void mergeGradSparse( @@ -421,7 +421,7 @@ protected: /// GradientMachine::backward void doCallback(int pid); -protected: + protected: MultiGradientMachine* multiMachine_; ModelConfig config_; /// whether the thread should stop diff --git a/paddle/gserver/gradientmachines/MultiNetwork.cpp b/paddle/gserver/gradientmachines/MultiNetwork.cpp index a1140402b8baaae20e20802ebf87462e301b60f9..5f3d09dda26772850828e6d44e8cc65635b314dc 100644 --- a/paddle/gserver/gradientmachines/MultiNetwork.cpp +++ b/paddle/gserver/gradientmachines/MultiNetwork.cpp @@ -122,7 +122,7 @@ void MultiNetwork::finish() { } class MultiCombinedEvaluator : public Evaluator { -public: + public: MultiCombinedEvaluator() {} void addEvaluator(std::unique_ptr&& evaluator) { evaluators_.emplace_back(std::move(evaluator)); @@ -167,7 +167,7 @@ public: } } -protected: + protected: std::vector> evaluators_; }; diff --git a/paddle/gserver/gradientmachines/MultiNetwork.h b/paddle/gserver/gradientmachines/MultiNetwork.h index 186a9ad0a39cd7815aea6738e6c6bc4a0c944aa9..495d5592017b5fb937fb8243bf12a5f2f30d67e7 100644 --- a/paddle/gserver/gradientmachines/MultiNetwork.h +++ b/paddle/gserver/gradientmachines/MultiNetwork.h @@ -22,7 +22,7 @@ limitations under the License. 
*/ namespace paddle { class MultiNetwork : public NeuralNetwork { -public: + public: explicit MultiNetwork(std::string subModelName = "") : NeuralNetwork(subModelName) {} @@ -58,7 +58,7 @@ public: virtual void finish(); -protected: + protected: std::vector> subNetworks_; }; } // namespace paddle diff --git a/paddle/gserver/gradientmachines/NeuralNetwork.cpp b/paddle/gserver/gradientmachines/NeuralNetwork.cpp index a3c13df3dbad973505d8919bce8b95348527e273..ac60a3a3408d37b66cb712d893c6b93a1750f448 100644 --- a/paddle/gserver/gradientmachines/NeuralNetwork.cpp +++ b/paddle/gserver/gradientmachines/NeuralNetwork.cpp @@ -362,7 +362,7 @@ void NeuralNetwork::releaseOutput() { #ifndef PADDLE_MOBILE_INFERENCE class CombinedEvaluator : public Evaluator { -public: + public: void addEvaluator(std::unique_ptr&& evaluator) { evaluators_.emplace_back(std::move(evaluator)); } @@ -400,11 +400,11 @@ public: } } -protected: + protected: std::vector> evaluators_; // Evaluator interface -public: + public: /** * @brief getNames will return all inside evaluators' names. * @param names [out]: return names. @@ -435,7 +435,7 @@ public: }); } -private: + private: template T getMethodHelper(const std::string& name, Error* err, @@ -454,7 +454,7 @@ private: }; class SubnetEvaluator : public CombinedEvaluator { -public: + public: SubnetEvaluator(const std::string& layerName, std::unique_ptr&& evaluator) : layerName_(layerName) { @@ -473,7 +473,7 @@ public: << " in submodel " << nn.getName(); } -protected: + protected: std::string layerName_; }; diff --git a/paddle/gserver/gradientmachines/NeuralNetwork.h b/paddle/gserver/gradientmachines/NeuralNetwork.h index 5b32f844f742c07c8bee6638cb46dc00285f49b0..3e5615c8f0b30ab1283d41e025496051869289dc 100644 --- a/paddle/gserver/gradientmachines/NeuralNetwork.h +++ b/paddle/gserver/gradientmachines/NeuralNetwork.h @@ -56,7 +56,7 @@ void parameterInitNN(int paramId, std::vector* sharedParams); class NeuralNetwork : public GradientMachine { -public: + public: virtual void init(const ModelConfig& config, ParamInitCallback callback = nullptr, const std::vector& parameterTypes = @@ -144,7 +144,7 @@ public: */ void releaseOutput(); -protected: + protected: /** * The constructor of NeuralNetwork. * The sub networks can get parameters_ and parameterMap_ diff --git a/paddle/gserver/gradientmachines/ParallelNeuralNetwork.h b/paddle/gserver/gradientmachines/ParallelNeuralNetwork.h index e3b6812123141e8e0afb9368fb06f2b34f526800..c091459506ad477bed3f429a22071eccedd664bb 100644 --- a/paddle/gserver/gradientmachines/ParallelNeuralNetwork.h +++ b/paddle/gserver/gradientmachines/ParallelNeuralNetwork.h @@ -32,7 +32,7 @@ enum TaskType { * multiple threads in parallel. 
*/ class ParallelNeuralNetwork : public NeuralNetwork { -public: + public: ParallelNeuralNetwork(std::string subModelName = "", NeuralNetwork *rootNetwork = nullptr) : NeuralNetwork(subModelName, rootNetwork) {} @@ -66,7 +66,7 @@ public: // virtual void eval(Evaluator* evaluator); -protected: + protected: bool useGpu_; /// number of gpu devices int numDevices_; @@ -74,7 +74,7 @@ protected: }; class ParallelThread { -public: + public: ParallelThread(int threadId, int deviceId, bool useGpu); ~ParallelThread(); void jobEnqueue(LayerPtr layer, TaskType task); @@ -87,10 +87,10 @@ public: } void setForwardPassType(PassType passType) { passType_ = passType; } -protected: + protected: void computeThread(); -public: + public: struct Job { LayerPtr layer_; TaskType task_; @@ -98,7 +98,7 @@ public: typedef Queue JobQueue; JobQueue queue_; -protected: + protected: /// from 0 to threads-1 int threadId_; /// the GPU device Id which the computeThread_ used diff --git a/paddle/gserver/gradientmachines/RecurrentGradientMachine.cpp b/paddle/gserver/gradientmachines/RecurrentGradientMachine.cpp index 2429b5d1a0a5ccf66db365b82c494c53d8e1fd4b..73ac8cda721f200c1a02cd9c1d9456df70d7b7d2 100644 --- a/paddle/gserver/gradientmachines/RecurrentGradientMachine.cpp +++ b/paddle/gserver/gradientmachines/RecurrentGradientMachine.cpp @@ -96,7 +96,7 @@ static InitFunction __init__diy_prob_method( std::numeric_limits::max()); class BeamSearchControlCallbacks { -public: + public: RecurrentGradientMachine::BeamSearchCandidatesAdjustCallback beamSearchCandidateAdjust; RecurrentGradientMachine::NormOrDropNodeCallback normOrDropNode; @@ -115,7 +115,7 @@ public: }; class BeamSearchStatisticsCallbacks { -public: + public: RecurrentGradientMachine::EachStepCallback onEachStepStarted; RecurrentGradientMachine::EachStepCallback onEachStepStoped; @@ -148,11 +148,11 @@ RecurrentGradientMachine::RecurrentGradientMachine( * so it's should not be placed in root network. */ class BootBiasLayer : public Layer { -protected: + protected: std::unique_ptr biases_; IVectorPtr cpuIds_; -public: + public: explicit BootBiasLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/gradientmachines/RecurrentGradientMachine.h b/paddle/gserver/gradientmachines/RecurrentGradientMachine.h index 0032b72cdae44588af976f1ac542149545f551f1..7e943cebd35234ba7af357c9f64fde6b0a9546ce 100644 --- a/paddle/gserver/gradientmachines/RecurrentGradientMachine.h +++ b/paddle/gserver/gradientmachines/RecurrentGradientMachine.h @@ -30,7 +30,7 @@ class BeamSearchControlCallbacks; class BeamSearchStatisticsCallbacks; class RecurrentGradientMachine : public NeuralNetwork { -public: + public: RecurrentGradientMachine(const std::string& subModelName, NeuralNetwork* rootNetwork); @@ -290,7 +290,7 @@ public: return this->finalPaths_; } -protected: + protected: std::vector commonSeqInfo_; ICpuGpuVectorPtr sequenceStartPositions_; void calcSequenceStartPositions(); @@ -447,7 +447,7 @@ protected: MatrixPtr cpuProb_; IVectorPtr cpuEos_; -private: + private: /* * @return beam size in beam search */ diff --git a/paddle/gserver/layers/AddtoLayer.h b/paddle/gserver/layers/AddtoLayer.h index 1d000630567cb1116ab0ff69e42380fc0eae6173..6ea54f4a53d466594055db2fb5167fa1a9d6c9da 100644 --- a/paddle/gserver/layers/AddtoLayer.h +++ b/paddle/gserver/layers/AddtoLayer.h @@ -33,10 +33,10 @@ namespace paddle { * The config file api is addto_layer. 
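`ParallelThread` above is a worker draining a `JobQueue`. A minimal self-contained sketch of that pattern (Paddle's jobs carry a layer plus a `TaskType` rather than a plain closure):

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

class Worker {
 public:
  Worker() : thread_([this] { run(); }) {}
  ~Worker() {
    enqueue(nullptr);  // an empty job is the stop signal
    thread_.join();
  }
  void enqueue(std::function<void()> job) {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      jobs_.push(std::move(job));
    }
    cond_.notify_one();
  }

 private:
  void run() {
    for (;;) {
      std::unique_lock<std::mutex> lock(mutex_);
      cond_.wait(lock, [this] { return !jobs_.empty(); });
      std::function<void()> job = std::move(jobs_.front());
      jobs_.pop();
      lock.unlock();
      if (!job) return;  // stop requested
      job();             // run outside the lock
    }
  }
  std::mutex mutex_;
  std::condition_variable cond_;
  std::queue<std::function<void()>> jobs_;
  std::thread thread_;  // declared last so the queue is ready before run()
};
```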
*/ class AddtoLayer : public Layer { -protected: + protected: std::unique_ptr biases_; -public: + public: explicit AddtoLayer(const LayerConfig& config) : Layer(config) {} ~AddtoLayer() {} diff --git a/paddle/gserver/layers/AgentLayer.h b/paddle/gserver/layers/AgentLayer.h index da0ac4530836205757399ac8eb64dd003740a53f..51f346d5c9fdf9599cddf4b668c128035fd94187 100644 --- a/paddle/gserver/layers/AgentLayer.h +++ b/paddle/gserver/layers/AgentLayer.h @@ -26,11 +26,11 @@ namespace paddle { * called to set one and only one real layer */ class AgentLayer : public Layer { -protected: + protected: LayerPtr realLayer_; int numSamples_; -public: + public: explicit AgentLayer(const LayerConfig& config) : Layer(config) {} ~AgentLayer() {} @@ -55,14 +55,14 @@ public: * GatherAgentLayer collects a complete sequence. */ class GatherAgentLayer : public Layer { -protected: + protected: std::vector realLayers_; std::vector idsVec_; // we don't clear idsVec_ vector to avoid IVector alloc/free IVectorPtr allIds_; std::vector idIndex_; -public: + public: explicit GatherAgentLayer(const LayerConfig& config) : Layer(config) {} virtual ~GatherAgentLayer() {} @@ -95,7 +95,7 @@ public: * if it is, the agent will select a few ids in real layer. */ class ScatterAgentLayer : public Layer { -protected: + protected: LayerPtr realLayer_; IVectorPtr ids_; IVectorPtr cpuIds_; @@ -113,7 +113,7 @@ protected: // true for setRealLayer, false for setRealLayerAndOutput bool selectionMode_; -public: + public: explicit ScatterAgentLayer(const LayerConfig& config) : Layer(config) {} virtual ~ScatterAgentLayer() {} diff --git a/paddle/gserver/layers/AverageLayer.h b/paddle/gserver/layers/AverageLayer.h index 24602d2a9c3e08cf76f6f98b5f9e3f593118e6e1..03e2673b55ceca7a698f1b858327ad6fad739087 100644 --- a/paddle/gserver/layers/AverageLayer.h +++ b/paddle/gserver/layers/AverageLayer.h @@ -37,7 +37,7 @@ namespace paddle { * The config file api is pooling_layer. */ class AverageLayer : public SequencePoolLayer { -public: + public: enum AverageStrategy { kAverage = 0, kSum = 1, kAverageSquareRootN = 2 }; explicit AverageLayer(const LayerConfig& config) : SequencePoolLayer(config) {} @@ -48,7 +48,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -protected: + protected: int mode_; }; } // namespace paddle diff --git a/paddle/gserver/layers/BatchNormBaseLayer.h b/paddle/gserver/layers/BatchNormBaseLayer.h index 69d642af4f12593e8db8a726310e6b1934c8e3be..5a446c0843a22adecbaf2ae09fcd526b68865ae2 100644 --- a/paddle/gserver/layers/BatchNormBaseLayer.h +++ b/paddle/gserver/layers/BatchNormBaseLayer.h @@ -40,7 +40,7 @@ namespace paddle { */ class BatchNormBaseLayer : public Layer { -public: + public: explicit BatchNormBaseLayer(const LayerConfig& config) : Layer(config) {} ~BatchNormBaseLayer() {} @@ -61,7 +61,7 @@ public: */ void calFeatureMapSize(); -protected: + protected: /// Batch normalization scale parameter, which is referred to as gamma in /// the original paper.
std::unique_ptr weight_; diff --git a/paddle/gserver/layers/BatchNormalizationLayer.h b/paddle/gserver/layers/BatchNormalizationLayer.h index 95add69215e3ea0b0225d0a245fe37905c33127b..e5e4e690b6017f32de0f4d7557065c02c03d689f 100644 --- a/paddle/gserver/layers/BatchNormalizationLayer.h +++ b/paddle/gserver/layers/BatchNormalizationLayer.h @@ -27,7 +27,7 @@ namespace paddle { */ class BatchNormalizationLayer : public BatchNormBaseLayer { -public: + public: explicit BatchNormalizationLayer(const LayerConfig& config) : BatchNormBaseLayer(config), firstTest_(true) {} @@ -38,7 +38,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -protected: + protected: /// Load pre-calculated mean and std. void setMeanAndStd(); diff --git a/paddle/gserver/layers/BilinearInterpLayer.h b/paddle/gserver/layers/BilinearInterpLayer.h index acd320420f4bbfe313f3ae77577ffc6b5cbfbfdf..8e08c2e1ce80172f55c93d8242821f683fa1a731 100644 --- a/paddle/gserver/layers/BilinearInterpLayer.h +++ b/paddle/gserver/layers/BilinearInterpLayer.h @@ -26,13 +26,13 @@ namespace paddle { * @note The config file api is bilinear_interp_layer. */ class BilinearInterpLayer : public Layer { -protected: + protected: size_t outImgH_, outImgW_; size_t inImgH_, inImgW_; real ratioH_, ratioW_; size_t numChannels_; -public: + public: explicit BilinearInterpLayer(const LayerConfig& config) : Layer(config) {} virtual ~BilinearInterpLayer() {} diff --git a/paddle/gserver/layers/BlockExpandLayer.h b/paddle/gserver/layers/BlockExpandLayer.h index 1797b64036b5cb9f97477d5a44b2f58e2d6c0cd4..9d76584f3a4eda19a9e8f806256a7b8da617cc37 100644 --- a/paddle/gserver/layers/BlockExpandLayer.h +++ b/paddle/gserver/layers/BlockExpandLayer.h @@ -40,7 +40,7 @@ namespace paddle { * The config file api is block_expand_layer. */ class BlockExpandLayer : public Layer { -protected: + protected: /** * @brief Calculate outputH_ and outputW_ and return block number which * actually is time steps. @@ -53,7 +53,7 @@ protected: TensorShape inputShape_; TensorShape outputShape_; -public: + public: explicit BlockExpandLayer(const LayerConfig& config) : Layer(config) {} ~BlockExpandLayer() {} diff --git a/paddle/gserver/layers/CRFDecodingLayer.h b/paddle/gserver/layers/CRFDecodingLayer.h index fba3cebac1a375008c58d21c458d9e0b98305ffa..018162e146fa93725fe84bdf2da9a6124f3cea6f 100644 --- a/paddle/gserver/layers/CRFDecodingLayer.h +++ b/paddle/gserver/layers/CRFDecodingLayer.h @@ -30,14 +30,14 @@ namespace paddle { * See LinearChainCRF.h for the detail of the CRF formulation. */ class CRFDecodingLayer : public CRFLayer { -public: + public: explicit CRFDecodingLayer(const LayerConfig& config) : CRFLayer(config) {} bool init(const LayerMap& layerMap, const ParameterMap& parameterMap) override; void forward(PassType passType) override; void backward(const UpdateCallback& callback) override; -protected: + protected: std::unique_ptr crf_; }; diff --git a/paddle/gserver/layers/CRFLayer.h b/paddle/gserver/layers/CRFLayer.h index cb5bd05568cc79c0093d6af0791cf0b3ce2dae47..88c2ed343ad5743068c871fe351437270d85f223 100644 --- a/paddle/gserver/layers/CRFLayer.h +++ b/paddle/gserver/layers/CRFLayer.h @@ -27,14 +27,14 @@ namespace paddle { * See class LinearChainCRF for the detail of the CRF formulation. 
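`CRFDecodingLayer` above performs Viterbi decoding over a linear-chain CRF. For reference, the standard dynamic program over dense `T x L` emission scores and an `L x L` transition matrix (simplified relative to `LinearChainCRF`):

```cpp
#include <vector>

std::vector<int> viterbi(
    const std::vector<std::vector<double>>& emit,     // T x L emission scores
    const std::vector<std::vector<double>>& trans) {  // L x L transitions
  const size_t T = emit.size(), L = emit[0].size();
  std::vector<std::vector<double>> score(T, std::vector<double>(L, 0.0));
  std::vector<std::vector<int>> back(T, std::vector<int>(L, 0));
  score[0] = emit[0];
  for (size_t t = 1; t < T; ++t) {
    for (size_t j = 0; j < L; ++j) {
      double best = score[t - 1][0] + trans[0][j];
      int arg = 0;
      for (size_t i = 1; i < L; ++i) {
        double s = score[t - 1][i] + trans[i][j];
        if (s > best) { best = s; arg = static_cast<int>(i); }
      }
      score[t][j] = best + emit[t][j];
      back[t][j] = arg;
    }
  }
  // Trace the best path backwards from the best final label.
  int argBest = 0;
  for (size_t j = 1; j < L; ++j)
    if (score[T - 1][j] > score[T - 1][argBest]) argBest = static_cast<int>(j);
  std::vector<int> path(T, argBest);
  for (size_t t = T - 1; t > 0; --t) path[t - 1] = back[t][path[t]];
  return path;
}
```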
*/ class CRFLayer : public Layer { -public: + public: explicit CRFLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, const ParameterMap& parameterMap) override; void forward(PassType passType) override; void backward(const UpdateCallback& callback) override; -protected: + protected: size_t numClasses_; ParameterPtr parameter_; std::vector crfs_; diff --git a/paddle/gserver/layers/CTCLayer.h b/paddle/gserver/layers/CTCLayer.h index fcbc42565e9340903d05aca2d0ba2091ffe20be0..5d70b1f4ceb03028865378d1d01b5706b35b10de 100644 --- a/paddle/gserver/layers/CTCLayer.h +++ b/paddle/gserver/layers/CTCLayer.h @@ -20,7 +20,7 @@ limitations under the License. */ namespace paddle { class CTCLayer : public Layer { -public: + public: explicit CTCLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, const ParameterMap& parameterMap) override; @@ -31,7 +31,7 @@ public: const Argument& softmaxSeqs, const Argument& labelSeqs); -protected: + protected: size_t numClasses_; bool normByTimes_; std::vector ctcs_; diff --git a/paddle/gserver/layers/ClipLayer.cpp b/paddle/gserver/layers/ClipLayer.cpp index dbc3337499788af5a9b6f68a6016e94c2072d61b..6aa3c8fe64f5a59e82f3271baed99fd17fd6653f 100644 --- a/paddle/gserver/layers/ClipLayer.cpp +++ b/paddle/gserver/layers/ClipLayer.cpp @@ -24,11 +24,11 @@ namespace paddle { */ class ClipLayer : public Layer { -protected: + protected: double min_; double max_; -public: + public: explicit ClipLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/ConcatenateLayer.cpp b/paddle/gserver/layers/ConcatenateLayer.cpp index f5ab29a509e45e72c71ba122c73aeba1b3b6a827..e6de329ff3f9ccfdd1cbe697c1de1a9cd8c7926a 100644 --- a/paddle/gserver/layers/ConcatenateLayer.cpp +++ b/paddle/gserver/layers/ConcatenateLayer.cpp @@ -23,7 +23,7 @@ namespace paddle { * each input as one row for the output of this layer and apply activation. */ class ConcatenateLayer : public Layer { -public: + public: explicit ConcatenateLayer(const LayerConfig& config) : Layer(config) {} ~ConcatenateLayer() {} @@ -97,7 +97,7 @@ void ConcatenateLayer::backward(const UpdateCallback& callback) { * processed by a Projection. */ class ConcatenateLayer2 : public Layer { -public: + public: explicit ConcatenateLayer2(const LayerConfig& config) : Layer(config) {} ~ConcatenateLayer2() {} @@ -108,7 +108,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -protected: + protected: std::vector> projections_; std::vector projOutput_; std::vector> projCol_; diff --git a/paddle/gserver/layers/ContextProjection.h b/paddle/gserver/layers/ContextProjection.h index e30f98f58d2be9ac538f6385efe68990b705ac5f..9c217145419048282a9a09ad899dc970e7c9704f 100644 --- a/paddle/gserver/layers/ContextProjection.h +++ b/paddle/gserver/layers/ContextProjection.h @@ -42,7 +42,7 @@ namespace paddle { * The config file api is context_projection. */ class ContextProjection : public Projection { -public: + public: /** * Constructor. If context_start is zero and context_lenth is one, it will * set trainable_padding false. 
trainable_padding is an optional arguments @@ -63,7 +63,7 @@ public: virtual bool init(); -protected: + protected: std::unique_ptr weight_; /// number of extra timesteps added at the beginning size_t beginPad_; diff --git a/paddle/gserver/layers/Conv3DLayer.h b/paddle/gserver/layers/Conv3DLayer.h index 5ab5ff3d4af07449484c441958c31c8fb06de894..07b804bad02beb6ec9c3e9fd43c3cd3aa6d50b22 100644 --- a/paddle/gserver/layers/Conv3DLayer.h +++ b/paddle/gserver/layers/Conv3DLayer.h @@ -26,7 +26,7 @@ namespace paddle { * calculate convolution operation. */ class Conv3DLayer : public ConvBaseLayer { -public: + public: explicit Conv3DLayer(const LayerConfig& config) : ConvBaseLayer(config) {} ~Conv3DLayer() {} @@ -40,7 +40,7 @@ public: void bpropWeights(int i); size_t getSize(); -protected: + protected: // Figure out the dimensions for individual gemms. IntV M_; /// numFilters_ / filter_group_; IntV N_; /// channels_ * filterSizeZ_ * filterSize_ * filterSizeY_ diff --git a/paddle/gserver/layers/ConvBaseLayer.h b/paddle/gserver/layers/ConvBaseLayer.h index 93869fe68d15b1cf38296fa8e2f6197dc74f879f..801bc4f888c5a60e803c882dcf807678c64af20c 100644 --- a/paddle/gserver/layers/ConvBaseLayer.h +++ b/paddle/gserver/layers/ConvBaseLayer.h @@ -24,7 +24,7 @@ namespace paddle { */ class ConvBaseLayer : public Layer { -protected: + protected: typedef std::vector IntV; /// True if it's deconv layer, false if it's convolution layer @@ -88,7 +88,7 @@ protected: /// of output size. bool caffeMode_; -public: + public: explicit ConvBaseLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/ConvBaseOperator.h b/paddle/gserver/layers/ConvBaseOperator.h index 27fb0362d3c9518a263eac54206e00974d08eb20..c3c647cb69da5a70eb5346737cc0092e2201c89e 100644 --- a/paddle/gserver/layers/ConvBaseOperator.h +++ b/paddle/gserver/layers/ConvBaseOperator.h @@ -29,7 +29,7 @@ namespace paddle { */ class ConvBaseOperator : public Operator { -public: + public: ConvBaseOperator(const OperatorConfig &config, bool useGpu); /** * Free workspace in device and destroy cudnn tensor descriptor. @@ -46,7 +46,7 @@ public: hl_destroy_convolution_descriptor(convDesc_); } -protected: + protected: /** * Get convolution parameters from layer config and * initialize member variables. diff --git a/paddle/gserver/layers/ConvBaseProjection.h b/paddle/gserver/layers/ConvBaseProjection.h index ba76d236d901187093a2e372a61c5d29d661e8bb..f3266ae1ab945042cde9f24b7c2673c18d37bc11 100644 --- a/paddle/gserver/layers/ConvBaseProjection.h +++ b/paddle/gserver/layers/ConvBaseProjection.h @@ -23,7 +23,7 @@ namespace paddle { * @brief Base class for ConvProjection and ConvTransProjection. */ class ConvBaseProjection : public Projection { -public: + public: /** * Constructor. 
*/ @@ -33,7 +33,7 @@ public: ~ConvBaseProjection(); -protected: + protected: void getConvParams(); void initCudnn(); diff --git a/paddle/gserver/layers/ConvOperator.h b/paddle/gserver/layers/ConvOperator.h index fbdb7bb1cd2b81bd72912dffdc9d059c520068a8..527dbf8c270f35e19ca23acd8a3ba8197d03b988 100644 --- a/paddle/gserver/layers/ConvOperator.h +++ b/paddle/gserver/layers/ConvOperator.h @@ -29,7 +29,7 @@ namespace paddle { */ class ConvOperator : public ConvBaseOperator { -public: + public: ConvOperator(const OperatorConfig &config, bool useGpu) : ConvBaseOperator(config, useGpu) {} /** diff --git a/paddle/gserver/layers/ConvProjection.h b/paddle/gserver/layers/ConvProjection.h index e8ecb99431a421d4b52228600909568b0808649a..22a2202bb6cc256a4a5897724d8eb8a93fefb79f 100644 --- a/paddle/gserver/layers/ConvProjection.h +++ b/paddle/gserver/layers/ConvProjection.h @@ -23,7 +23,7 @@ namespace paddle { * @brief Convolution projection does the same calculation as CudnnConvLayer. */ class ConvProjection : public ConvBaseProjection { -public: + public: /** * Constructor. */ diff --git a/paddle/gserver/layers/ConvShiftLayer.cpp b/paddle/gserver/layers/ConvShiftLayer.cpp index fb877710196835e025466f37b5da27bcf80a3db4..615c3478061b591ea30cbf0b3d27ef2551c0dd28 100644 --- a/paddle/gserver/layers/ConvShiftLayer.cpp +++ b/paddle/gserver/layers/ConvShiftLayer.cpp @@ -42,7 +42,7 @@ namespace paddle { */ class ConvShiftLayer : public Layer { -public: + public: explicit ConvShiftLayer(const LayerConfig& config) : Layer(config) {} ~ConvShiftLayer() {} diff --git a/paddle/gserver/layers/ConvTransOperator.h b/paddle/gserver/layers/ConvTransOperator.h index 1bf58f2bfb78ae7dee433455ece37d908b113045..53cb7a21b49189898d09aa20cd46d04cc5c20198 100644 --- a/paddle/gserver/layers/ConvTransOperator.h +++ b/paddle/gserver/layers/ConvTransOperator.h @@ -29,7 +29,7 @@ namespace paddle { */ class ConvTransOperator : public ConvBaseOperator { -public: + public: ConvTransOperator(const OperatorConfig &config, bool useGpu) : ConvBaseOperator(config, useGpu) {} /** diff --git a/paddle/gserver/layers/ConvTransProjection.h b/paddle/gserver/layers/ConvTransProjection.h index 269b2694c82ea076102633537d7c961139a19a43..0f9ed720d3b8855a3a24ac25a1c3917c4b98e81d 100644 --- a/paddle/gserver/layers/ConvTransProjection.h +++ b/paddle/gserver/layers/ConvTransProjection.h @@ -23,7 +23,7 @@ namespace paddle { * @brief Convolution projection does the same calculation as CudnnConvLayer. */ class ConvTransProjection : public ConvBaseProjection { -public: + public: /** * Constructor. */ diff --git a/paddle/gserver/layers/ConvexCombinationLayer.cpp b/paddle/gserver/layers/ConvexCombinationLayer.cpp index dce751940c1bf1695a034a3c551412dcb9b7b8b5..31363d97c4fd318ec2c6d48f9200f6ba1f49ba11 100644 --- a/paddle/gserver/layers/ConvexCombinationLayer.cpp +++ b/paddle/gserver/layers/ConvexCombinationLayer.cpp @@ -36,7 +36,7 @@ namespace paddle { * The config file api is linear_comb_layer. */ class ConvexCombinationLayer : public Layer { -protected: + protected: /// A matrix pointer pointing to second input. MatrixPtr tmpMtx0; /// A matrix pointer pointing to first input. @@ -44,7 +44,7 @@ protected: /// A matrix pointer pointing to output.
MatrixPtr tmpRow1; -public: + public: explicit ConvexCombinationLayer(const LayerConfig& config) : Layer(config) {} ~ConvexCombinationLayer() {} diff --git a/paddle/gserver/layers/CosSimLayer.h b/paddle/gserver/layers/CosSimLayer.h index 675cdb16b563faa7acf9e701096bd334ed661160..d9fe1ff270f1f76e3b246dca374ddf45445419f9 100644 --- a/paddle/gserver/layers/CosSimLayer.h +++ b/paddle/gserver/layers/CosSimLayer.h @@ -33,7 +33,7 @@ namespace paddle { * The config file api is cos_sim. */ class CosSimLayer : public Layer { -public: + public: explicit CosSimLayer(const LayerConfig& config) : Layer(config) {} ~CosSimLayer() {} diff --git a/paddle/gserver/layers/CosSimVecMatLayer.cpp b/paddle/gserver/layers/CosSimVecMatLayer.cpp index 685b4e8ef376b76b3058eeba82d803d460e7105c..230ecc768b4d7314b21ac1d76899c3c3bab12309 100644 --- a/paddle/gserver/layers/CosSimVecMatLayer.cpp +++ b/paddle/gserver/layers/CosSimVecMatLayer.cpp @@ -32,7 +32,7 @@ namespace paddle { */ class CosSimVecMatLayer : public Layer { -protected: + protected: MatrixPtr tmpMtx0; MatrixPtr tmpMtx1; MatrixPtr tmpRow0; @@ -40,7 +40,7 @@ protected: MatrixPtr tmpRow2; MatrixPtr tmpRow3; -public: + public: explicit CosSimVecMatLayer(const LayerConfig& config) : Layer(config) {} ~CosSimVecMatLayer() {} diff --git a/paddle/gserver/layers/CostLayer.cpp b/paddle/gserver/layers/CostLayer.cpp index 484f803a8387a16152c5911d7d5c72b0111283ae..1327616950a8887efa2cba410fa7ae8b5bd97da4 100644 --- a/paddle/gserver/layers/CostLayer.cpp +++ b/paddle/gserver/layers/CostLayer.cpp @@ -716,7 +716,7 @@ void HuberTwoClassification::backwardImp(Matrix& output, * \f] */ class SumCostLayer : public Layer { -public: + public: explicit SumCostLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/CostLayer.h b/paddle/gserver/layers/CostLayer.h index 306c067ed1c040555d2b03996cc0749faf0ea68c..9bfec0e2b169fac4f235fd13347be687c4f1a222 100644 --- a/paddle/gserver/layers/CostLayer.h +++ b/paddle/gserver/layers/CostLayer.h @@ -29,7 +29,7 @@ namespace paddle { * handled by the base class. */ class CostLayer : public Layer { -public: + public: explicit CostLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, @@ -51,7 +51,7 @@ public: Argument& label, Matrix& outputGrad) = 0; -protected: + protected: LayerPtr weightLayer_; real coeff_; }; @@ -65,7 +65,7 @@ protected: * \f] */ class MultiClassCrossEntropy : public CostLayer { -public: + public: explicit MultiClassCrossEntropy(const LayerConfig& config) : CostLayer(config) {} @@ -95,7 +95,7 @@ public: * In Proceedings of the ACL 2014 Conference. 
*/ class MultiClassCrossEntropyWithSelfNorm : public CostLayer { -public: + public: explicit MultiClassCrossEntropyWithSelfNorm(const LayerConfig& config) : CostLayer(config) {} @@ -108,7 +108,7 @@ public: Argument& label, Matrix& outputGrad) override; -protected: + protected: MatrixPtr sftMaxSum_; MatrixPtr sumInv_; }; @@ -120,7 +120,7 @@ protected: * \f] */ class SoftBinaryClassCrossEntropy : public CostLayer { -public: + public: explicit SoftBinaryClassCrossEntropy(const LayerConfig& config) : CostLayer(config) {} @@ -133,7 +133,7 @@ public: Argument& label, Matrix& outputGrad) override; -protected: + protected: MatrixPtr targetPerDim_; }; @@ -145,7 +145,7 @@ protected: * \f] */ class SumOfSquaresCostLayer : public CostLayer { -public: + public: explicit SumOfSquaresCostLayer(const LayerConfig& config) : CostLayer(config) {} @@ -171,7 +171,7 @@ public: * x = output - label */ class SmoothL1CostLayer : public CostLayer { -public: + public: explicit SmoothL1CostLayer(const LayerConfig& config) : CostLayer(config) {} bool init(const LayerMap& layerMap, @@ -197,7 +197,7 @@ public: * Rank using Gradient Descent. */ class RankingCost : public Layer { -public: + public: explicit RankingCost(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, @@ -225,7 +225,7 @@ public: (void)outputGrad; } -private: + private: double posPairCount_; double negPairCount_; MatrixPtr margin_; @@ -250,7 +250,7 @@ private: * with Nonsmooth Cost Functions. */ class LambdaCost : public Layer { -public: + public: explicit LambdaCost(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, @@ -270,7 +270,7 @@ public: real* gradData, int size); -private: + private: MatrixPtr marginGrad_; int truncationSize_; int maxSortSize_; @@ -287,10 +287,10 @@ private: * \f] */ class MultiBinaryLabelCrossEntropy : public CostLayer { -protected: + protected: MatrixPtr targetPerDim_; -public: + public: explicit MultiBinaryLabelCrossEntropy(const LayerConfig& config) : CostLayer(config) {} @@ -308,7 +308,7 @@ public: * A base layer for HuberRegressionLoss and HuberTwoClassification.
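The piecewise Huber losses referenced above and below limit the influence of outliers: quadratic near zero, linear in the tails. A worked sketch of the regression variant and its gradient (illustrative, not the Paddle member functions):

```cpp
#include <cmath>

// Loss = 0.5 * (y - f)^2                  if |y - f| <= delta
//      = delta * |y - f| - 0.5 * delta^2  otherwise
double huberLoss(double y, double f, double delta) {
  const double a = std::fabs(y - f);
  return (a <= delta) ? 0.5 * a * a : delta * a - 0.5 * delta * delta;
}

// Derivative with respect to the prediction f, as used in a backward pass.
double huberGrad(double y, double f, double delta) {
  const double r = f - y;
  if (std::fabs(r) <= delta) return r;
  return r > 0 ? delta : -delta;
}
```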
*/ class HuberCost : public CostLayer { -public: + public: std::vector tmpCpuInput_; explicit HuberCost(const LayerConfig& config) : CostLayer(config) {} @@ -331,7 +331,7 @@ public: * Loss = delta * abs(y - f) - 0.5 * delta^2, otherwise */ class HuberRegressionLoss : public HuberCost { -public: + public: explicit HuberRegressionLoss(const LayerConfig& config) : HuberCost(config) {} bool init(const LayerMap& layerMap, @@ -343,7 +343,7 @@ public: Argument& label, Matrix& outputGrad) override; -protected: + protected: real delta_; }; @@ -356,7 +356,7 @@ protected: * Loss = 0, otherwise */ class HuberTwoClassification : public HuberCost { -public: + public: explicit HuberTwoClassification(const LayerConfig& config) : HuberCost(config) {} diff --git a/paddle/gserver/layers/CropLayer.h b/paddle/gserver/layers/CropLayer.h index 1a85911ef75e992df587a60cfc9a727eafa4cc76..ef88bc483d157406a0f5a7924c14c345ea0df8c4 100644 --- a/paddle/gserver/layers/CropLayer.h +++ b/paddle/gserver/layers/CropLayer.h @@ -28,7 +28,7 @@ namespace paddle { * crop input as this shape conf */ class CropLayer : public Layer { -public: + public: explicit CropLayer(const LayerConfig& config) : Layer(config) {} ~CropLayer() {} @@ -38,7 +38,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -protected: + protected: void setOutDims(); void setInDims(); diff --git a/paddle/gserver/layers/CrossEntropyOverBeam.h b/paddle/gserver/layers/CrossEntropyOverBeam.h index b47a2933c255c264ba780b2d87c9fbe53cb5665d..c8702b16165eee8d552c563082ffc708ce443deb 100644 --- a/paddle/gserver/layers/CrossEntropyOverBeam.h +++ b/paddle/gserver/layers/CrossEntropyOverBeam.h @@ -44,7 +44,7 @@ struct BeamExpansion { typedef std::shared_ptr BeamExpansionPtr; class CostForOneSequence { -public: + public: CostForOneSequence() : beamSize_(0), validExpansionCount_(0), goldAsExtraPath_(false) {} void setData(const BeamExpansionPtr bPtr, size_t beamSize) { @@ -64,7 +64,7 @@ public: real forward(); void backward(); -private: + private: void calValidExpandStep(); void constructTotalExpansion(); size_t initLastExpansion(); @@ -93,14 +93,14 @@ private: }; class CrossEntropyOverBeam : public Layer { -public: + public: explicit CrossEntropyOverBeam(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, const ParameterMap& parameterMap) override; void forward(PassType passType) override; void backward(const UpdateCallback& callback) override; -private: + private: void checkInputs(); void copyInputsToCpu(); void resizeOutput(); diff --git a/paddle/gserver/layers/CudnnBatchNormLayer.h b/paddle/gserver/layers/CudnnBatchNormLayer.h index aa279f73d66770384815cad4d9e2ee0b04a4a1ad..1bb4eff8d2372660caa4ec4a4a20a27f365bebd0 100644 --- a/paddle/gserver/layers/CudnnBatchNormLayer.h +++ b/paddle/gserver/layers/CudnnBatchNormLayer.h @@ -30,7 +30,7 @@ namespace paddle { */ class CudnnBatchNormLayer : public BatchNormBaseLayer { -public: + public: explicit CudnnBatchNormLayer(const LayerConfig& config) : BatchNormBaseLayer(config) {} @@ -46,7 +46,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -protected: + protected: /// Epsilon value used in the batch normalization formula. /// Same epsilon value should be used in forward and backward functions. 
double eps_; diff --git a/paddle/gserver/layers/CudnnConvBaseLayer.h b/paddle/gserver/layers/CudnnConvBaseLayer.h index 698104e4fbd2556f426001687a581153f32773d8..1ee1aa100d8adaed04ce24ee12b5b9af52c14b13 100644 --- a/paddle/gserver/layers/CudnnConvBaseLayer.h +++ b/paddle/gserver/layers/CudnnConvBaseLayer.h @@ -31,14 +31,14 @@ namespace paddle { * The config file api is img_conv_layer. */ class CudnnConvBaseLayer : public ConvBaseLayer { -protected: + protected: std::vector> projConf_; std::vector> projections_; hl_tensor_descriptor biasDesc_; hl_tensor_descriptor outputDesc_; -public: + public: explicit CudnnConvBaseLayer(const LayerConfig& config) : ConvBaseLayer(config) {} diff --git a/paddle/gserver/layers/CudnnPoolLayer.h b/paddle/gserver/layers/CudnnPoolLayer.h index 9eb4fc6138b0bce59660406705d15291eb38af9b..fc249354d10333211691b6844bffa3c8da8a79ee 100644 --- a/paddle/gserver/layers/CudnnPoolLayer.h +++ b/paddle/gserver/layers/CudnnPoolLayer.h @@ -26,7 +26,7 @@ namespace paddle { */ class CudnnPoolLayer : public PoolLayer { -protected: + protected: int windowHeight, windowWidth; int heightPadding, widthPadding, strideHeight, strideWidth; int imageH_, imageW_, outputH_, outputW_; @@ -40,7 +40,7 @@ protected: /// A description of a pooling operation. hl_pooling_descriptor poolingDesc_; -public: + public: static bool typeCheck(const std::string& poolType, hl_pooling_mode_t* mode = nullptr); explicit CudnnPoolLayer(const LayerConfig& config); diff --git a/paddle/gserver/layers/DataLayer.h b/paddle/gserver/layers/DataLayer.h index 4b12afe0efe81843b58e459ca1e58b4f7f4a1664..d02f5a4697b9067f7d34e4d0b2d34f8c63ffe020 100644 --- a/paddle/gserver/layers/DataLayer.h +++ b/paddle/gserver/layers/DataLayer.h @@ -25,7 +25,7 @@ namespace paddle { * The config file api is data_layer. */ class DataLayer : public Layer { -public: + public: explicit DataLayer(const LayerConfig& config) : Layer(config) {} virtual void setData(const Argument& data) { data_ = data; } @@ -58,10 +58,10 @@ public: } } -private: + private: void copyDataToOutput(Argument& output); -protected: + protected: Argument data_; }; diff --git a/paddle/gserver/layers/DataNormLayer.h b/paddle/gserver/layers/DataNormLayer.h index 2a2a2a4aa76e8e315d9d66da1b738d6d615d10f2..7ae67a877b488c8d197896b8b1e3e90057fbe1c9 100644 --- a/paddle/gserver/layers/DataNormLayer.h +++ b/paddle/gserver/layers/DataNormLayer.h @@ -37,7 +37,7 @@ namespace paddle { */ class DataNormLayer : public Layer { -public: + public: enum NormalizationStrategy { kZScore = 0, kMinMax = 1, kDecimalScaling = 2 }; explicit DataNormLayer(const LayerConfig& config) : Layer(config) {} @@ -50,7 +50,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -protected: + protected: int mode_; std::unique_ptr weight_; MatrixPtr min_; diff --git a/paddle/gserver/layers/DeConv3DLayer.h b/paddle/gserver/layers/DeConv3DLayer.h index 57d51cdec66930b9b79c0c0395da66922cd53ae4..13d1d07cf5cc6e2a6ea89768e29b1fe8cda5e81c 100644 --- a/paddle/gserver/layers/DeConv3DLayer.h +++ b/paddle/gserver/layers/DeConv3DLayer.h @@ -27,7 +27,7 @@ namespace paddle { * calculate deconvolution3D operation. 
*/ class DeConv3DLayer : public ConvBaseLayer { -public: + public: explicit DeConv3DLayer(const LayerConfig& config) : ConvBaseLayer(config) {} ~DeConv3DLayer() {} bool init(const LayerMap& layerMap, const ParameterMap& parameterMap); @@ -40,7 +40,7 @@ public: void bpropWeights(int i); size_t getSize(); -protected: + protected: // Figure out the dimensions for individual gemms. IntV M_; /// numFilters_ / filter_group_; IntV N_; /// channels_ * filterSizeZ_ * filterSize_ * filterSizeY_ diff --git a/paddle/gserver/layers/DetectionOutputLayer.h b/paddle/gserver/layers/DetectionOutputLayer.h index 174a6e5d9acb476276b66627b4aabce2ae6c1037..b0270ed33141993665aeabdc53829600a4403643 100644 --- a/paddle/gserver/layers/DetectionOutputLayer.h +++ b/paddle/gserver/layers/DetectionOutputLayer.h @@ -33,7 +33,7 @@ namespace paddle { */ class DetectionOutputLayer : public Layer { -public: + public: explicit DetectionOutputLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, const ParameterMap& parameterMap); @@ -42,7 +42,7 @@ public: void backward(const UpdateCallback& callback = nullptr) {} -protected: + protected: inline LayerPtr getPriorBoxLayer() { return inputLayers_[0]; } inline LayerPtr getLocInputLayer(size_t index) { @@ -53,7 +53,7 @@ protected: return inputLayers_[1 + inputNum_ + index]; } -private: + private: size_t numClasses_; // number of classes size_t inputNum_; // number of input layers real nmsThreshold_; diff --git a/paddle/gserver/layers/DotMulOperator.cpp b/paddle/gserver/layers/DotMulOperator.cpp index 68db2929adee1336e52abfcb8e6495e589afa683..03d18d9b239e57dc41334462f2324ae2d0505a62 100644 --- a/paddle/gserver/layers/DotMulOperator.cpp +++ b/paddle/gserver/layers/DotMulOperator.cpp @@ -27,7 +27,7 @@ namespace paddle { * The config file api is dotmul_operator. */ class DotMulOperator : public Operator { -public: + public: DotMulOperator(const OperatorConfig& config, bool useGpu); virtual void forward(); virtual void backward(); diff --git a/paddle/gserver/layers/DotMulProjection.cpp b/paddle/gserver/layers/DotMulProjection.cpp index 86453aae84142f9f534182d085f4a96a2c7a3e15..d7780387670e83af24fa342be3d596b618b1f677 100644 --- a/paddle/gserver/layers/DotMulProjection.cpp +++ b/paddle/gserver/layers/DotMulProjection.cpp @@ -26,14 +26,14 @@ namespace paddle { * The config file api is dotmul_projection. */ class DotMulProjection : public Projection { -public: + public: DotMulProjection(const ProjectionConfig& config, const ParameterPtr& parameter, bool useGpu); virtual void forward(); virtual void backward(const UpdateCallback& callback); -protected: + protected: /// shared memory with parameter std::unique_ptr weight_; }; diff --git a/paddle/gserver/layers/DotProdLayer.cpp b/paddle/gserver/layers/DotProdLayer.cpp index 5148d93e27d199b0c373221cedd4f03d6d32c8ab..72b0c707b2131dc275ba604cd20ae0007c34a9a9 100644 --- a/paddle/gserver/layers/DotProdLayer.cpp +++ b/paddle/gserver/layers/DotProdLayer.cpp @@ -27,7 +27,7 @@ namespace paddle { */ class DotProdLayer : public Layer { -public: + public: explicit DotProdLayer(const LayerConfig& config) : Layer(config) {} ~DotProdLayer() {} diff --git a/paddle/gserver/layers/EosIdCheckLayer.cpp b/paddle/gserver/layers/EosIdCheckLayer.cpp index 470a5b8ea208ad0acb64e3067881e0d183e1dc39..04400f2836581179849a4dd1c256bbddcc82530f 100644 --- a/paddle/gserver/layers/EosIdCheckLayer.cpp +++ b/paddle/gserver/layers/EosIdCheckLayer.cpp @@ -24,7 +24,7 @@ namespace paddle { * It is used by recurrent layer group. 
 */
class EosIdCheckLayer : public Layer {
-public:
+ public:
  explicit EosIdCheckLayer(const LayerConfig& config) : Layer(config) {}

  bool init(const LayerMap& layerMap,
diff --git a/paddle/gserver/layers/ExpandConvLayer.h b/paddle/gserver/layers/ExpandConvLayer.h
index be968155efd0b8f19503c996ccd329379c6b1104..6919ef71355a4c660b9ddd60bff75fee399cfaa9 100644
--- a/paddle/gserver/layers/ExpandConvLayer.h
+++ b/paddle/gserver/layers/ExpandConvLayer.h
@@ -29,7 +29,7 @@ namespace paddle {
 */
class ExpandConvLayer : public ConvBaseLayer {
-public:
+ public:
  explicit ExpandConvLayer(const LayerConfig& config) : ConvBaseLayer(config) {}

  ~ExpandConvLayer() {}
@@ -42,7 +42,7 @@ public:

  size_t getOutputSize();

-protected:
+ protected:
  std::vector inputShape_;
  std::vector filterShape_;
  std::vector outputShape_;
diff --git a/paddle/gserver/layers/ExpandLayer.h b/paddle/gserver/layers/ExpandLayer.h
index 04bbfcbd04931fa11d11a9fcc74f0e4f19767f1b..06bd4ef05ee206628d981fee8e7eec3c91b18b7a 100644
--- a/paddle/gserver/layers/ExpandLayer.h
+++ b/paddle/gserver/layers/ExpandLayer.h
@@ -37,7 +37,7 @@ namespace paddle {
 */
class ExpandLayer : public Layer {
-protected:
+ protected:
  std::unique_ptr biases_;
  /// if input[0] is dense data, ExpandLevel=kNonSeq;
  /// if input[0] is sequence data, ExpandLevel=kSeq
@@ -48,7 +48,7 @@
  /// of input[1]
  ICpuGpuVectorPtr expandStartsPos_;

-public:
+ public:
  explicit ExpandLayer(const LayerConfig& config) : Layer(config) {}

  ~ExpandLayer() {}
diff --git a/paddle/gserver/layers/FactorizationMachineLayer.h b/paddle/gserver/layers/FactorizationMachineLayer.h
index 684da4e65a461d46204c348b3374b0e9e00eb389..148abe238173dd44cd0fcf3f5cda732f70078706 100644
--- a/paddle/gserver/layers/FactorizationMachineLayer.h
+++ b/paddle/gserver/layers/FactorizationMachineLayer.h
@@ -42,7 +42,7 @@ namespace paddle {
 */
class FactorizationMachineLayer : public Layer {
-protected:
+ protected:
  // The latent vectors, shape: (size, factorSize_)
  // Each row of the latentVectors_ matrix is the latent vector
  // corresponding to one input feature dimension
@@ -50,7 +50,7 @@ protected:
  // The hyperparameter that defines the dimensionality of the factorization
  size_t factorSize_;

-private:
+ private:
  // Store the square values of the latent vectors matrix
  MatrixPtr latentVectorsSquare_;
  // Store the square values of input matrix
@@ -65,7 +65,7 @@ private:
  // Negative identity matrix
  MatrixPtr negOnes_;

-public:
+ public:
  explicit FactorizationMachineLayer(const LayerConfig& config)
      : Layer(config) {}
  ~FactorizationMachineLayer() {}
diff --git a/paddle/gserver/layers/FeatureMapExpandLayer.cpp b/paddle/gserver/layers/FeatureMapExpandLayer.cpp
index 81b98da45bc4b9b8ef0723dd6ea2db809860e219..d95f0b9b3d13e8bff635373cb4d5705c2351bd97 100644
--- a/paddle/gserver/layers/FeatureMapExpandLayer.cpp
+++ b/paddle/gserver/layers/FeatureMapExpandLayer.cpp
@@ -38,11 +38,11 @@ namespace paddle {
 */
class FeatureMapExpandLayer : public Layer {
-private:
+ private:
  int numFilters_;
  bool asRowVector_;

-public:
+ public:
  explicit FeatureMapExpandLayer(const LayerConfig& config) : Layer(config) {}

  ~FeatureMapExpandLayer() {}
diff --git a/paddle/gserver/layers/FullMatrixProjection.h b/paddle/gserver/layers/FullMatrixProjection.h
index 7c4cd1a7066d427f54e1a280a956acb025e6dc16..a27aa4a12327ac39ec3418a849b1230e13f759ee 100644
--- a/paddle/gserver/layers/FullMatrixProjection.h
+++ b/paddle/gserver/layers/FullMatrixProjection.h
@@ -28,14 +28,14 @@ namespace paddle {
 * The config file api is full_matrix_projection.
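 * (The projection computes out = in * W, where W is the dense weight_
 * matrix declared below.)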
 */
class FullMatrixProjection : public Projection {
-public:
+ public:
  FullMatrixProjection(const ProjectionConfig& config,
                       const ParameterPtr& parameter,
                       bool useGpu);
  virtual void forward();
  virtual void backward(const UpdateCallback& callback);

-protected:
+ protected:
  std::unique_ptr weight_;
};
diff --git a/paddle/gserver/layers/FullyConnectedLayer.h b/paddle/gserver/layers/FullyConnectedLayer.h
index e66aeeb7334c9c871749196d77474a02ecf82b09..e0f9d6ce55fbdf73e5507032c108c735bf04597b 100644
--- a/paddle/gserver/layers/FullyConnectedLayer.h
+++ b/paddle/gserver/layers/FullyConnectedLayer.h
@@ -28,11 +28,11 @@ namespace paddle {
 */
class FullyConnectedLayer : public Layer {
-protected:
+ protected:
  WeightList weights_;
  std::unique_ptr biases_;

-public:
+ public:
  explicit FullyConnectedLayer(const LayerConfig& config) : Layer(config) {}
  ~FullyConnectedLayer() {}
diff --git a/paddle/gserver/layers/GatedRecurrentLayer.h b/paddle/gserver/layers/GatedRecurrentLayer.h
index f0a3a823018f3943b0295c172b19d0fe9d0674b4..46508dc977bf1a6fd33dc1fb024bd1aed36a0ff3 100644
--- a/paddle/gserver/layers/GatedRecurrentLayer.h
+++ b/paddle/gserver/layers/GatedRecurrentLayer.h
@@ -47,7 +47,7 @@ namespace paddle {
 */
class GatedRecurrentLayer : public Layer, public GruCompute {
-public:
+ public:
  explicit GatedRecurrentLayer(const LayerConfig& config) : Layer(config) {}

  bool init(const LayerMap& layerMap,
@@ -63,7 +63,7 @@ public:

  LayerStatePtr getState() override;

-protected:
+ protected:
  void forwardSequence(int batchSize,
                       size_t numSequences,
                       const int* starts,
@@ -79,7 +79,7 @@ protected:
                    MatrixPtr inputValue);
  void backwardBatch(int batchSize, MatrixPtr inputGrad);

-protected:
+ protected:
  std::unique_ptr weight_;
  std::unique_ptr gateWeight_;
  std::unique_ptr stateWeight_;
diff --git a/paddle/gserver/layers/GetOutputLayer.cpp b/paddle/gserver/layers/GetOutputLayer.cpp
index f255681f3e678e51f069522f965fd2776680b595..7c1e3c407cca374c7aa238d07e2263c4a142b6a5 100644
--- a/paddle/gserver/layers/GetOutputLayer.cpp
+++ b/paddle/gserver/layers/GetOutputLayer.cpp
@@ -17,7 +17,7 @@ limitations under the License. */
namespace paddle {

class GetOutputLayer : public Layer {
-public:
+ public:
  explicit GetOutputLayer(const LayerConfig& config) : Layer(config) {}

  ~GetOutputLayer() {}
diff --git a/paddle/gserver/layers/GruCompute.h b/paddle/gserver/layers/GruCompute.h
index fb6bc56422002b4d4080ccb8438767b27ceef064..50006325ce9969c4941aaf28604260f0aeb9b97a 100644
--- a/paddle/gserver/layers/GruCompute.h
+++ b/paddle/gserver/layers/GruCompute.h
@@ -21,7 +21,7 @@ limitations under the License. */
namespace paddle {

class GruCompute {
-public:
+ public:
  void init(LayerConfig &config);

  template
@@ -33,7 +33,7 @@ public:
                   int frameSize,
                   int batchSize = 1);

-public:
+ public:
  hl_activation_mode_t activeNode_;
  hl_activation_mode_t activeGate_;
};
diff --git a/paddle/gserver/layers/GruStepLayer.cpp b/paddle/gserver/layers/GruStepLayer.cpp
index 917c50250c1c04d6c8f113c8d42ef029e1028606..114f287411c2fccbc08b7da4c05462967c81b268 100644
--- a/paddle/gserver/layers/GruStepLayer.cpp
+++ b/paddle/gserver/layers/GruStepLayer.cpp
@@ -44,13 +44,13 @@ namespace paddle {
 * The config file api is gru_step_layer.
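 * (One GRU step gates the previous output with sigmoid update and reset
 * gates and blends it with a tanh candidate state; the activations actually
 * used come from GruCompute above.)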
*/ class GruStepLayer : public Layer, public GruCompute { -protected: + protected: Argument gate_; Argument resetOutput_; std::unique_ptr weight_; std::unique_ptr bias_; -public: + public: explicit GruStepLayer(const LayerConfig& config) : Layer(config) {} ~GruStepLayer() {} diff --git a/paddle/gserver/layers/HierarchicalSigmoidLayer.h b/paddle/gserver/layers/HierarchicalSigmoidLayer.h index 10e501f1807ef6ba03d326a1bcf257ede0ee850a..73ef252fd5a5443fe065f3b7bd8c49951ae0b4bd 100644 --- a/paddle/gserver/layers/HierarchicalSigmoidLayer.h +++ b/paddle/gserver/layers/HierarchicalSigmoidLayer.h @@ -58,7 +58,7 @@ namespace paddle { * The config file api is hsigmod_layer. */ class HierarchicalSigmoidLayer : public Layer { -public: + public: explicit HierarchicalSigmoidLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, @@ -66,7 +66,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback) override; -protected: + protected: /** * The last of inputs is label layer. */ diff --git a/paddle/gserver/layers/IdentityProjection.cpp b/paddle/gserver/layers/IdentityProjection.cpp index 6c70f77acc0c890e11a4929ea013d7745d8bbed0..34e9eb90161f7942c528b70f177e30f301a8f53f 100644 --- a/paddle/gserver/layers/IdentityProjection.cpp +++ b/paddle/gserver/layers/IdentityProjection.cpp @@ -26,7 +26,7 @@ namespace paddle { * The config file api is identity_projection. */ class IdentityProjection : public Projection { -public: + public: IdentityProjection(const ProjectionConfig& config, const ParameterPtr& parameter, bool useGpu); @@ -68,7 +68,7 @@ void IdentityProjection::backward(const UpdateCallback& callback) { * The config file api is identity_projection. */ class IdentityOffsetProjection : public Projection { -public: + public: IdentityOffsetProjection(const ProjectionConfig& config, const ParameterPtr& parameter, bool useGpu); diff --git a/paddle/gserver/layers/InterpolationLayer.cpp b/paddle/gserver/layers/InterpolationLayer.cpp index 0ac92024bc7eddf05ce023708537d0aa7bab6426..509c07cf22c9bcbe9283241b38540162b3dbe26b 100644 --- a/paddle/gserver/layers/InterpolationLayer.cpp +++ b/paddle/gserver/layers/InterpolationLayer.cpp @@ -33,12 +33,12 @@ namespace paddle { */ class InterpolationLayer : public Layer { -protected: + protected: /// weightLast = 1 - weight MatrixPtr weightLast_; MatrixPtr tmpMatrix; -public: + public: explicit InterpolationLayer(const LayerConfig& config) : Layer(config) {} ~InterpolationLayer() {} diff --git a/paddle/gserver/layers/KmaxSeqScoreLayer.cpp b/paddle/gserver/layers/KmaxSeqScoreLayer.cpp index 0ea960902efc10007896b3f4ce915dea79d0d12d..7fd25954efeb9d9e672040f9909198f2ae3c0449 100644 --- a/paddle/gserver/layers/KmaxSeqScoreLayer.cpp +++ b/paddle/gserver/layers/KmaxSeqScoreLayer.cpp @@ -17,14 +17,14 @@ limitations under the License. 
*/ namespace paddle { class KmaxSeqScoreLayer : public Layer { -private: + private: MatrixPtr scores_; size_t beamSize_; void kmaxScorePerSeq(const real* score, real* sortedRes, const ICpuGpuVectorPtr seqStartPos); -public: + public: explicit KmaxSeqScoreLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/L2DistanceLayer.h b/paddle/gserver/layers/L2DistanceLayer.h index 97f35daf7860fb3b082ef03203327e09dca67371..44e688e1377145845033d9d5cc3f31f5594a11f6 100644 --- a/paddle/gserver/layers/L2DistanceLayer.h +++ b/paddle/gserver/layers/L2DistanceLayer.h @@ -33,7 +33,7 @@ namespace paddle { */ class L2DistanceLayer : public Layer { -public: + public: explicit L2DistanceLayer(const LayerConfig& config) : Layer(config) {} ~L2DistanceLayer() {} @@ -43,7 +43,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -private: + private: // Store the result of subtracting Input2 from Input1 in forward computation, // which will be reused in backward computation. MatrixPtr inputSub_; diff --git a/paddle/gserver/layers/Layer.h b/paddle/gserver/layers/Layer.h index 8da342a00f72ee1196c4af24104ce92c6bbf9f5c..13e20e8316323f9082a9615041584685853aa395 100644 --- a/paddle/gserver/layers/Layer.h +++ b/paddle/gserver/layers/Layer.h @@ -60,7 +60,7 @@ enum PADDLE_DEVICE_ID { * Define necessary variables and functions for every layer. */ class Layer { -protected: + protected: /// Layer config LayerConfig config_; /// whether to use GPU @@ -112,7 +112,7 @@ protected: /// Layer backward function std::vector> backward_; -public: + public: /** * Wait until all input value ready. * Called before Layer::forward() function. @@ -137,7 +137,7 @@ public: */ virtual void markAllInputGrad(); -protected: + protected: /** * Create layer function. Function is called in forward or backward. * \param function, Layer::forward_ or Layer::backward_ @@ -252,7 +252,7 @@ protected: */ void addOutputArgument(int deviceId); -public: + public: explicit Layer(const LayerConfig& config, bool useGpu = FLAGS_use_gpu); virtual ~Layer() {} @@ -490,7 +490,7 @@ public: */ virtual void onPassEnd() {} -protected: + protected: /** * Forward of activation function. */ diff --git a/paddle/gserver/layers/LinearChainCRF.h b/paddle/gserver/layers/LinearChainCRF.h index 1ea4c7e105703b76601499bf3944648cdc98ec99..e802b701d0237bed44adc83273fe53c3e18c92ec 100644 --- a/paddle/gserver/layers/LinearChainCRF.h +++ b/paddle/gserver/layers/LinearChainCRF.h @@ -19,7 +19,7 @@ limitations under the License. */ namespace paddle { class LinearChainCRF { -public: + public: /** * The size of para must be \f$(numClasses + 2) * numClasses\f$. * The first numClasses values of para are for starting weights (\f$a\f$). @@ -71,7 +71,7 @@ public: */ MatrixPtr getXGrad() { return matGrad_; } -protected: + protected: int numClasses_; MatrixPtr a_; MatrixPtr b_; diff --git a/paddle/gserver/layers/LinearChainCTC.h b/paddle/gserver/layers/LinearChainCTC.h index 0b774277dc8cf27f48c6905168cdea047365c99d..5b325a0deb0e9d8df241175159321e52f527f6c4 100644 --- a/paddle/gserver/layers/LinearChainCTC.h +++ b/paddle/gserver/layers/LinearChainCTC.h @@ -20,7 +20,7 @@ limitations under the License. 
*/ namespace paddle { class LinearChainCTC { -public: + public: LinearChainCTC(int numClasses, bool normByTimes); // Calculate the negative log probability as loss @@ -35,7 +35,7 @@ public: int* labelSeq, int labelSeqLen); -protected: + protected: int numClasses_, blank_, totalSegments_, totalTime_; bool normByTimes_; bool isInvalid_; diff --git a/paddle/gserver/layers/LstmCompute.h b/paddle/gserver/layers/LstmCompute.h index b7d55eb1f984d102802cab87ba12ca9c69a2f4be..80fb01cd1885151c8d62a4b5dfdb4ba08327926d 100644 --- a/paddle/gserver/layers/LstmCompute.h +++ b/paddle/gserver/layers/LstmCompute.h @@ -21,7 +21,7 @@ limitations under the License. */ namespace paddle { class LstmCompute { -public: + public: void init(LayerConfig &config); /** @@ -57,7 +57,7 @@ public: hl_lstm_grad grad, int frameSize); -public: + public: hl_activation_mode_t activeNode_; hl_activation_mode_t activeGate_; hl_activation_mode_t activeState_; diff --git a/paddle/gserver/layers/LstmLayer.h b/paddle/gserver/layers/LstmLayer.h index 4568b13ade5555e3cff703ceda1bbce3007c409d..76dfe8146bf67a0b7b4fd4835851fae6ac38d80f 100644 --- a/paddle/gserver/layers/LstmLayer.h +++ b/paddle/gserver/layers/LstmLayer.h @@ -71,7 +71,7 @@ namespace paddle { */ class LstmLayer : public Layer, public LstmCompute { -public: + public: explicit LstmLayer(const LayerConfig &config) : Layer(config) {} bool init(const LayerMap &layerMap, @@ -87,7 +87,7 @@ public: LayerStatePtr getState() override; -protected: + protected: /** * @brief Compute lstm forward one sequence by one sequence. * @param batchSize The batchSize is not equal to the batch_size in @@ -165,7 +165,7 @@ protected: */ void getPrevBatchState(size_t numSequences); -protected: + protected: /// Learned parameters, shape: (size, 4*size). /// The weight ([size, 4*size]) contains \f$W_{hi}, W_{hf}, W_{hc}, W_{ho}\f$. std::unique_ptr weight_; diff --git a/paddle/gserver/layers/LstmStepLayer.cpp b/paddle/gserver/layers/LstmStepLayer.cpp index 8faaa1c4e138fe1ec04b1911449d05528bb5b8b5..c44768ddb2b903763288465325899d86176df73a 100644 --- a/paddle/gserver/layers/LstmStepLayer.cpp +++ b/paddle/gserver/layers/LstmStepLayer.cpp @@ -22,7 +22,7 @@ namespace paddle { * LstmStepLayer used in recurrent layer group. */ class LstmStepLayer : public Layer, public LstmCompute { -protected: + protected: Argument state_; Argument gate_; Argument stateActive_; @@ -30,7 +30,7 @@ protected: MatrixPtr checkIgGrad_, checkFgGrad_, checkOgGrad_; std::unique_ptr weight_; -public: + public: explicit LstmStepLayer(const LayerConfig& config) : Layer(config) {} ~LstmStepLayer() {} diff --git a/paddle/gserver/layers/MDLstmLayer.cpp b/paddle/gserver/layers/MDLstmLayer.cpp index 7cfdb3ff25096ad06c09434cdee48b5f85d650af..22c28157c5a5b19aa54b3151a6c9a4cdcfb01765 100644 --- a/paddle/gserver/layers/MDLstmLayer.cpp +++ b/paddle/gserver/layers/MDLstmLayer.cpp @@ -19,7 +19,7 @@ limitations under the License. 
*/ namespace paddle { class CoordIterator { -public: + public: std::vector dims_; std::vector directions_; std::vector curPos_; @@ -51,7 +51,7 @@ public: } } -public: + public: CoordIterator(std::vector dim, std::vector directions) : dims_(dim), directions_(directions), end_(false) { CHECK_EQ(dims_.size(), directions_.size()); @@ -178,7 +178,7 @@ public: * */ class MDLstmLayer : public LstmLayer { -public: + public: explicit MDLstmLayer(const LayerConfig& config) : LstmLayer(config) {} bool init(const LayerMap& layerMap, @@ -188,13 +188,13 @@ public: void backward(const UpdateCallback& callback) override; -protected: + protected: void forwardOneSequence(int start, CoordIterator& coordIter); void backwardOneSequence(int start, CoordIterator& coordIter); void forwardGate2OutputSequence(int start, CoordIterator& coordIter); void backwardGate2OutputSequence(int start, CoordIterator& coordIter); -protected: + protected: std::vector frameInputGate_; std::vector frameForgetGate_; std::vector frameOutputGate_; diff --git a/paddle/gserver/layers/MKLDNNAddtoLayer.h b/paddle/gserver/layers/MKLDNNAddtoLayer.h index e40e2f2251a1b739958773b8e6dc95a70ed58c76..0b385e804fdbc74c8612031cf415d06f15ce311a 100644 --- a/paddle/gserver/layers/MKLDNNAddtoLayer.h +++ b/paddle/gserver/layers/MKLDNNAddtoLayer.h @@ -25,7 +25,7 @@ namespace paddle { * The config file api is mkldnn_addto */ class MKLDNNAddtoLayer : public MKLDNNLayer { -protected: + protected: // layer size == ic * ih * iw == oc * oh *ow, and can not be changed size_t layerSize_; @@ -38,7 +38,7 @@ protected: std::vector> fwdBias_; std::shared_ptr bwdBias_; -public: + public: explicit MKLDNNAddtoLayer(const LayerConfig& config) : MKLDNNLayer(config) {} ~MKLDNNAddtoLayer() {} @@ -59,7 +59,7 @@ public: void updateWeights(const UpdateCallback& callback) override; -protected: + protected: void resetFwdBuffers(std::vector& inputs, MKLDNNMatrixPtr& bias, MKLDNNMatrixPtr& out); diff --git a/paddle/gserver/layers/MKLDNNBase.h b/paddle/gserver/layers/MKLDNNBase.h index d84e2859407711c13c475a19e140e2f5f51e61c2..786ceaf86086d7c04331641693181809ac019597 100644 --- a/paddle/gserver/layers/MKLDNNBase.h +++ b/paddle/gserver/layers/MKLDNNBase.h @@ -31,7 +31,7 @@ typedef enum { * */ class CPUEngine { -public: + public: static CPUEngine& Instance() { // Thread-safe in C++11. 
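    // (C++11 guarantees a function-local static is initialized exactly once,
    // even when several threads call Instance() concurrently, so no explicit
    // locking is needed here.)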
static CPUEngine myInstance; @@ -46,12 +46,12 @@ public: mkldnn::engine& getEngine() { return cpuEngine_; } -protected: + protected: CPUEngine() : cpuEngine_(mkldnn::engine::cpu, 0) {} // CPUEngine() : cpuEngine_(mkldnn::engine::cpu_lazy, 0) {} ~CPUEngine() {} -private: + private: mkldnn::engine cpuEngine_; }; @@ -60,7 +60,7 @@ private: * */ class MKLDNNStream { -public: + public: MKLDNNStream() : ready_(false) { resetState(); } virtual ~MKLDNNStream() {} @@ -89,7 +89,7 @@ public: ready_ = true; } -private: + private: bool ready_; std::shared_ptr stream_; }; diff --git a/paddle/gserver/layers/MKLDNNBatchNormLayer.h b/paddle/gserver/layers/MKLDNNBatchNormLayer.h index 93e182206a1ab1f06087cb808bb266ddea1468c9..9aa20df98f30837e1b80b4269d05d85b7d99ba76 100644 --- a/paddle/gserver/layers/MKLDNNBatchNormLayer.h +++ b/paddle/gserver/layers/MKLDNNBatchNormLayer.h @@ -27,7 +27,7 @@ typedef mkldnn::batch_normalization_backward bn_bwd; * The config file api is mkldnn_batch_norm */ class MKLDNNBatchNormLayer : public MKLDNNLayer { -protected: + protected: // save forward primitive_desc, which can be used backward std::shared_ptr fwdPD_; @@ -62,7 +62,7 @@ protected: MKLDNNMatrixPtr mean_; MKLDNNMatrixPtr var_; -public: + public: explicit MKLDNNBatchNormLayer(const LayerConfig& config) : MKLDNNLayer(config), useGlobalStats_(true), hasInitedWgt_(false) {} @@ -88,7 +88,7 @@ public: void convertWeightsFromPaddle() override; -protected: + protected: void initWeight(); /** * cal moving mean and variance. diff --git a/paddle/gserver/layers/MKLDNNConcatLayer.h b/paddle/gserver/layers/MKLDNNConcatLayer.h index f7abdabfb51df27f8db4e6d4d88c80546eeba248..d7738df6c106c68f55b313f2d119e31c6e444cbf 100644 --- a/paddle/gserver/layers/MKLDNNConcatLayer.h +++ b/paddle/gserver/layers/MKLDNNConcatLayer.h @@ -25,7 +25,7 @@ namespace paddle { * The config file api is mkldnn_concat */ class MKLDNNConcatLayer : public MKLDNNLayer { -protected: + protected: std::vector> bwds_; // input channel numbers std::vector channels_; @@ -35,7 +35,7 @@ protected: // if axis_ == 1, concat channel (default) int axis_; -public: + public: explicit MKLDNNConcatLayer(const LayerConfig& config) : MKLDNNLayer(config), axis_(1) {} @@ -75,7 +75,7 @@ public: return totalSize; } -protected: + protected: void resetFwdBuffers(std::vector& inputs, MKLDNNMatrixPtr& out); void resetFwdPD(std::shared_ptr& pd, diff --git a/paddle/gserver/layers/MKLDNNConvLayer.h b/paddle/gserver/layers/MKLDNNConvLayer.h index 29c8735fbb91e7418797874238eb87759420f181..d399035ed3ae2f411587c1fcf1799bb71c8de63e 100644 --- a/paddle/gserver/layers/MKLDNNConvLayer.h +++ b/paddle/gserver/layers/MKLDNNConvLayer.h @@ -28,7 +28,7 @@ typedef mkldnn::convolution_backward_data conv_bwdData; * The config file api is mkldnn_conv */ class MKLDNNConvLayer : public MKLDNNLayer { -protected: + protected: // padding height and width int ph_, pw_; // stride height and width @@ -59,7 +59,7 @@ protected: std::unique_ptr weight_; std::unique_ptr biases_; -public: + public: explicit MKLDNNConvLayer(const LayerConfig& config) : MKLDNNLayer(config), hasInitedWgt_(false), caffeMode_(true) {} @@ -92,7 +92,7 @@ public: << ", sw: " << sw_ << ", dh: " << dh_ << ", dw: " << dw_; } -protected: + protected: /** * load the dims settings of this conv */ diff --git a/paddle/gserver/layers/MKLDNNFcLayer.h b/paddle/gserver/layers/MKLDNNFcLayer.h index 0d41a4379d677f86f672852fec09b1241009597b..a704066cc818a6b33bd0eed4612d62b674fa72ca 100644 --- a/paddle/gserver/layers/MKLDNNFcLayer.h +++ 
b/paddle/gserver/layers/MKLDNNFcLayer.h @@ -28,7 +28,7 @@ typedef mkldnn::inner_product_backward_data fc_bwdData; * The config file api is mkldnn_fc */ class MKLDNNFcLayer : public MKLDNNLayer { -protected: + protected: // input layer size, can not be change after init size_t iLayerSize_; // == ic * ih * iw @@ -42,7 +42,7 @@ protected: std::unique_ptr weight_; std::unique_ptr biases_; -public: + public: explicit MKLDNNFcLayer(const LayerConfig& config) : MKLDNNLayer(config), hasInitedWgt_(false) {} @@ -68,7 +68,7 @@ public: void convertWeightsToPaddle() override; -protected: + protected: void resetFwdBuffers(MKLDNNMatrixPtr& in, MKLDNNMatrixPtr& wgt, MKLDNNMatrixPtr& bias, diff --git a/paddle/gserver/layers/MKLDNNLRNLayer.h b/paddle/gserver/layers/MKLDNNLRNLayer.h index b503ee55947294d7c44d1760058f8c26bceed142..028438f2c93b2182318c53cd348351376d491e79 100644 --- a/paddle/gserver/layers/MKLDNNLRNLayer.h +++ b/paddle/gserver/layers/MKLDNNLRNLayer.h @@ -27,7 +27,7 @@ typedef mkldnn::lrn_backward lrn_bwd; * The config file api is mkldnn_lrn */ class MKLDNNLRNLayer : public MKLDNNLayer { -protected: + protected: // save forward primitive_desc, which can be used in backward std::shared_ptr fwdPD_; // according to https://github.com/01org/mkl-dnn/blob/master/tests/gtests/ @@ -37,7 +37,7 @@ protected: int localSize_; float alpha_, beta_; // scale and pow in paddle -public: + public: explicit MKLDNNLRNLayer(const LayerConfig& config) : MKLDNNLayer(config) {} ~MKLDNNLRNLayer() {} @@ -56,7 +56,7 @@ public: std::vector& inputs, MKLDNNMatrixPtr& out) override; -protected: + protected: void resetFwdBuffers(MKLDNNMatrixPtr& in, MKLDNNMatrixPtr& out); void resetFwdPD(std::shared_ptr& pd, MKLDNNMatrixPtr in, diff --git a/paddle/gserver/layers/MKLDNNLayer.h b/paddle/gserver/layers/MKLDNNLayer.h index 4a7eb74ce3a13ed38be3548d8ce34382c594205a..2b164d0d3bc0e1446d7e4d82bb8a713195dbd927 100644 --- a/paddle/gserver/layers/MKLDNNLayer.h +++ b/paddle/gserver/layers/MKLDNNLayer.h @@ -33,7 +33,7 @@ typedef std::shared_ptr MKLDNNLayerPtr; * */ class MKLDNNLayer : public Layer { -protected: + protected: // batch size int bs_; // their sizes are always from the first input layer @@ -95,7 +95,7 @@ protected: // tmp input argument to save input grad, only used to merge grad Argument tmpInArg_; -public: + public: explicit MKLDNNLayer(const LayerConfig& config) : Layer(config), ih_(0), @@ -162,7 +162,7 @@ public: */ void addOutputArgument(int deviceId) { Layer::addOutputArgument(deviceId); } -protected: + protected: /** * Some layers may have different condition to reset the forward. * The function returns the condition that do not need reset forward. @@ -233,7 +233,7 @@ protected: */ void resetMergeGrad(MKLDNNMatrixPtr& out); -protected: + protected: /** * Set deviceId of this layer. 
*/ @@ -340,7 +340,7 @@ protected: } } -private: + private: /** * clear all grad */ diff --git a/paddle/gserver/layers/MKLDNNPoolLayer.h b/paddle/gserver/layers/MKLDNNPoolLayer.h index 12821cda7308602dd2fe834f52c614e6112b7cea..1eb0ee4ad946f61e32b7d4f4fd376dda89d6acf7 100644 --- a/paddle/gserver/layers/MKLDNNPoolLayer.h +++ b/paddle/gserver/layers/MKLDNNPoolLayer.h @@ -27,7 +27,7 @@ typedef mkldnn::pooling_backward pool_bwd; * The config file api is mkldnn_pool */ class MKLDNNPoolLayer : public MKLDNNLayer { -protected: + protected: // padding height and width int ph_, pw_; // stride height and width @@ -44,7 +44,7 @@ protected: // test_pooling_forward.cpp, pool need workspace for backward std::shared_ptr workspace_; -public: + public: explicit MKLDNNPoolLayer(const LayerConfig& config) : MKLDNNLayer(config) {} ~MKLDNNPoolLayer() {} @@ -70,7 +70,7 @@ public: << ", sw: " << sw_; } -protected: + protected: void resetFwdBuffers(MKLDNNMatrixPtr& in, MKLDNNMatrixPtr& out); void resetFwdPD(std::shared_ptr& pd, MKLDNNMatrixPtr in, diff --git a/paddle/gserver/layers/MKLPackedRecurrentLayer.h b/paddle/gserver/layers/MKLPackedRecurrentLayer.h index 37eb362d45215edc736984f8da784fe74bb43f2b..441025a9c9d75786b17db84c74995a96b6a06ea8 100644 --- a/paddle/gserver/layers/MKLPackedRecurrentLayer.h +++ b/paddle/gserver/layers/MKLPackedRecurrentLayer.h @@ -29,7 +29,7 @@ namespace paddle { */ class MKLPackedRecurrentLayer : public RecurrentLayer { -public: + public: explicit MKLPackedRecurrentLayer(const LayerConfig& config) : RecurrentLayer(config) {} @@ -38,7 +38,7 @@ public: void backward(const UpdateCallback& callback) override; -protected: + protected: void forwardBatch(int batchSize, size_t numSequences, const int* starts) override; @@ -47,7 +47,7 @@ protected: size_t numSequences, const int* starts) override; -protected: + protected: /// packed_weight_ contains same data with /// RecurrentLayer::weight_ but is packed std::unique_ptr packed_weight_; diff --git a/paddle/gserver/layers/MKLPackedWeight.h b/paddle/gserver/layers/MKLPackedWeight.h index 28b8a7db7cc3d2be12d6ce9291de1e415cf77bbc..b01a961d007a0e2e343db7b51e50fd3ee776435e 100644 --- a/paddle/gserver/layers/MKLPackedWeight.h +++ b/paddle/gserver/layers/MKLPackedWeight.h @@ -21,7 +21,7 @@ limitations under the License. */ namespace paddle { class MKLPackedWeight { -protected: + protected: /// The pointer of weight real *weight_; /// The pointer of cblas packed gemm to weight @@ -30,7 +30,7 @@ protected: size_t width_; bool transW_; -public: + public: explicit MKLPackedWeight(MatrixPtr weight, bool transW = false) { packedWeight_ = nullptr; weight_ = weight->getData(); @@ -59,7 +59,7 @@ public: dst->getWidth()); } -protected: + protected: void pack_(real *src) { if (!packedWeight_) { packedWeight_ = cblas_sgemm_alloc(CblasBMatrix, 1, width_, height_); diff --git a/paddle/gserver/layers/MaxIdLayer.cpp b/paddle/gserver/layers/MaxIdLayer.cpp index 84e375d7441ce3ccd8a5df94df22d85d104b5d96..eecd4996e962857b09001a1bb36bc027cbaa4308 100644 --- a/paddle/gserver/layers/MaxIdLayer.cpp +++ b/paddle/gserver/layers/MaxIdLayer.cpp @@ -23,11 +23,11 @@ namespace paddle { * The config file api is maxid_layer. 
*/ class MaxIdLayer : public Layer { -private: + private: /// a predetermined number of best states at each level size_t beamSize_; -public: + public: explicit MaxIdLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/MaxLayer.h b/paddle/gserver/layers/MaxLayer.h index 9dbc672652dc2670a775f02ecd3a9de9919c8ae0..e46f997c342ce5d6b724629dff6950c4f1680ce8 100644 --- a/paddle/gserver/layers/MaxLayer.h +++ b/paddle/gserver/layers/MaxLayer.h @@ -39,11 +39,11 @@ namespace paddle { */ class MaxLayer : public SequencePoolLayer { -protected: + protected: // maxIndex_[i][j] = k : the value at (i, j) is from input[k]. IVectorPtr maxIndex_; -public: + public: explicit MaxLayer(const LayerConfig& config) : SequencePoolLayer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/MaxOutLayer.h b/paddle/gserver/layers/MaxOutLayer.h index 1fb371836bacb9e02cc32eabfd21bf24165b0734..0eb8674b4c4f3f58b103c6b59ad13931a6992a1b 100644 --- a/paddle/gserver/layers/MaxOutLayer.h +++ b/paddle/gserver/layers/MaxOutLayer.h @@ -29,7 +29,7 @@ namespace paddle { */ class MaxOutLayer : public Layer { -protected: + protected: size_t groups_; size_t imgSizeH_, imgSizeW_; /// outputChannels_ = channels_ / groups_ @@ -38,7 +38,7 @@ protected: size_t featLen_; IVectorPtr maxoutId_; -public: + public: /// return imgSizeH_ * imgSizeW_ * outputChannels_; size_t getSize(); diff --git a/paddle/gserver/layers/MaxPoolWithMaskLayer.h b/paddle/gserver/layers/MaxPoolWithMaskLayer.h index 74cc8acf3515b10257ffb185061344fbcc94a337..c948364f6b83b0de1ee07cc185b69346f5cb1a7e 100644 --- a/paddle/gserver/layers/MaxPoolWithMaskLayer.h +++ b/paddle/gserver/layers/MaxPoolWithMaskLayer.h @@ -23,10 +23,10 @@ namespace paddle { * @brief Basic parent layer of different kinds of pooling */ class MaxPoolWithMaskLayer : public PoolLayer { -protected: + protected: Argument mask_; -public: + public: explicit MaxPoolWithMaskLayer(const LayerConfig& config) : PoolLayer(config) {} diff --git a/paddle/gserver/layers/MixedLayer.h b/paddle/gserver/layers/MixedLayer.h index a1a43c52e4f503178a66ad8aa6c12bec89566081..43ee2bd81854f2dea837734f556c197613f6fdaf 100644 --- a/paddle/gserver/layers/MixedLayer.h +++ b/paddle/gserver/layers/MixedLayer.h @@ -30,7 +30,7 @@ namespace paddle { * The config file api is mixed_layer. 
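 * (The output of a mixed layer is the sum of the outputs of all its
 * projections and operators, each applied to its own input.)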
*/ class MixedLayer : public Layer { -public: + public: explicit MixedLayer(const LayerConfig& config) : Layer(config) {} ~MixedLayer() {} @@ -52,7 +52,7 @@ public: */ LayerStatePtr getState() override; -protected: + protected: std::vector> projections_; std::vector> operators_; /// the matrix size of projection state diff --git a/paddle/gserver/layers/MultiBoxLossLayer.h b/paddle/gserver/layers/MultiBoxLossLayer.h index 9935da56446c1508549906becfd28548d5deecde..a358cded00bb01bfe5d02f9a6d8a24e4b2e51b74 100644 --- a/paddle/gserver/layers/MultiBoxLossLayer.h +++ b/paddle/gserver/layers/MultiBoxLossLayer.h @@ -41,7 +41,7 @@ namespace paddle { */ class MultiBoxLossLayer : public CostLayer { -public: + public: explicit MultiBoxLossLayer(const LayerConfig& config) : CostLayer(config) {} bool init(const LayerMap& layerMap, const ParameterMap& parameterMap); @@ -54,7 +54,7 @@ public: void backwardImp(Matrix& outputValue, Argument& label, Matrix& outputGrad) {} -protected: + protected: inline LayerPtr getPriorBoxLayer() { return inputLayers_[0]; } inline LayerPtr getLabelLayer() { return inputLayers_[1]; } inline LayerPtr getLocInputLayer(size_t index) { @@ -64,7 +64,7 @@ protected: return inputLayers_[2 + inputNum_ + index]; } -protected: + protected: size_t numClasses_; real overlapThreshold_; real negPosRatio_; diff --git a/paddle/gserver/layers/MultinomialSampler.h b/paddle/gserver/layers/MultinomialSampler.h index 1f9e818ee5d21188e3bd39d1225912a1a2ae1598..8cbb229f157c0904e63a696f860ec6739d5167c4 100644 --- a/paddle/gserver/layers/MultinomialSampler.h +++ b/paddle/gserver/layers/MultinomialSampler.h @@ -29,7 +29,7 @@ namespace paddle { * The computational complexity of generate one sample is O(1). */ class MultinomialSampler { -public: + public: MultinomialSampler(const real* prob, int size); //! protobuf always using double. @@ -53,7 +53,7 @@ public: return gen1([&g, this]() { return rand_(g); }); } -protected: + protected: /** * @brief Generation * @param[in] rand rand is a real random number distribution diff --git a/paddle/gserver/layers/MultiplexLayer.cpp b/paddle/gserver/layers/MultiplexLayer.cpp index 82857f8c3ef3e39ec451c1f26bac4996c12350a5..43ecc48cd97fb54d8dc4eb1d87ebf60f5aa040d8 100644 --- a/paddle/gserver/layers/MultiplexLayer.cpp +++ b/paddle/gserver/layers/MultiplexLayer.cpp @@ -37,7 +37,7 @@ namespace paddle { */ class MultiplexLayer : public Layer { -protected: + protected: /** * @brief A struct is used to save the copy information, includes input * layer index and copy size. @@ -64,7 +64,7 @@ protected: /// Temporary matrix pointer to point to output data. MatrixPtr tmpDest_; -public: + public: explicit MultiplexLayer(const LayerConfig& config) : Layer(config) {} ~MultiplexLayer() {} @@ -75,7 +75,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -private: + private: /** * @brief Calculate copy info for input layers. 
 */
diff --git a/paddle/gserver/layers/NCELayer.cpp b/paddle/gserver/layers/NCELayer.cpp
index d3d7b1fd9ac3c366d11c3060848e89c24a16a70b..cc48fe100f12446f9522078119ae2ead039a82cc 100644
--- a/paddle/gserver/layers/NCELayer.cpp
+++ b/paddle/gserver/layers/NCELayer.cpp
@@ -54,7 +54,7 @@ class NCELayer : public Layer {

  IVectorPtr labelIds_;

-public:
+ public:
  explicit NCELayer(const LayerConfig& config)
      : Layer(config),
        numClasses_(config.num_classes()),
diff --git a/paddle/gserver/layers/NormLayer.h b/paddle/gserver/layers/NormLayer.h
index c89cbbfce9d9e35a6dd300864ee094ef8f9e283a..3807584415f99a7110170748501589dac85eac52 100644
--- a/paddle/gserver/layers/NormLayer.h
+++ b/paddle/gserver/layers/NormLayer.h
@@ -27,7 +27,7 @@ namespace paddle {
 * @note Normalize the input in local region
 */
class NormLayer : public Layer {
-public:
+ public:
  explicit NormLayer(const LayerConfig& config) : Layer(config) {}

  bool init(const LayerMap& layerMap,
@@ -49,12 +49,12 @@ public:
 * Need to implement in the future.
 */
class ResponseNormLayer : public NormLayer {
-protected:
+ protected:
  size_t channels_, size_, outputX_, imgSize_, outputY_, imgSizeY_;
  real scale_, pow_;
  MatrixPtr denoms_;

-public:
+ public:
  explicit ResponseNormLayer(const LayerConfig& config) : NormLayer(config) {}

  bool init(const LayerMap& layerMap,
@@ -76,7 +76,7 @@ public:
 * Cheng-Yang Fu, Alexander C. Berg. SSD: Single Shot MultiBox Detector
 */
class CrossChannelNormLayer : public NormLayer {
-public:
+ public:
  explicit CrossChannelNormLayer(const LayerConfig& config)
      : NormLayer(config) {}
  bool init(const LayerMap& layerMap, const ParameterMap& parameterMap);
@@ -85,7 +85,7 @@ public:
  MatrixPtr createSampleMatrix(MatrixPtr data, size_t iter, size_t spatialDim);
  MatrixPtr createSpatialMatrix(MatrixPtr data, size_t iter, size_t spatialDim);

-protected:
+ protected:
  size_t channels_;
  std::unique_ptr scale_;
  MatrixPtr scaleDiff_;
diff --git a/paddle/gserver/layers/NormProjectionLayer.h b/paddle/gserver/layers/NormProjectionLayer.h
index 898b5823a9011c4b66e045c54afba070dd5cf772..64803a1603599f2e393ec772a32d64f4d271fe71 100644
--- a/paddle/gserver/layers/NormProjectionLayer.h
+++ b/paddle/gserver/layers/NormProjectionLayer.h
@@ -28,7 +28,7 @@ class CMRProjectionNormLayer : public ResponseNormLayer {
  size_t imgSizeH_, imgSizeW_;
  size_t outputH_, outputW_;

-public:
+ public:
  explicit CMRProjectionNormLayer(const LayerConfig& config)
      : ResponseNormLayer(config) {}
@@ -41,7 +41,7 @@ public:
  void forward(PassType passType) override;
  void backward(const UpdateCallback& callback = nullptr) override;

-protected:
+ protected:
  TensorShape shape_;
};
} // namespace paddle
diff --git a/paddle/gserver/layers/Operator.h b/paddle/gserver/layers/Operator.h
index a620926cccd3004d7bef57976047a190b4b566e2..42d525ef3e4534acea7512d5ecdbe8a0e1d110d9 100644
--- a/paddle/gserver/layers/Operator.h
+++ b/paddle/gserver/layers/Operator.h
@@ -34,7 +34,7 @@ namespace paddle {
 * @note: Operator can't have parameters.
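 * (This is what separates an Operator from a Projection: a Projection owns a
 * Parameter, while an Operator only transforms its input Arguments.)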
 */
class Operator {
-public:
+ public:
  static Operator* create(const OperatorConfig& config, bool useGpu);

  Operator(const OperatorConfig& config, bool useGpu)
@@ -81,7 +81,7 @@ public:
   */
  virtual LayerStatePtr getState() { return nullptr; }

-protected:
+ protected:
  /// Config of operator
  OperatorConfig config_;
  bool useGpu_;
diff --git a/paddle/gserver/layers/OuterProdLayer.cpp b/paddle/gserver/layers/OuterProdLayer.cpp
index 75f4abf93e5db11dc688f8f2e0b2a36bf70fbccc..11a910f3316114b309efe9007a156e842b3d6229 100644
--- a/paddle/gserver/layers/OuterProdLayer.cpp
+++ b/paddle/gserver/layers/OuterProdLayer.cpp
@@ -28,12 +28,12 @@ namespace paddle {
 */
class OuterProdLayer : public Layer {
-protected:
+ protected:
  MatrixPtr tmpMtx0;
  MatrixPtr tmpRow0;
  MatrixPtr tmpRow1;

-public:
+ public:
  explicit OuterProdLayer(const LayerConfig& config) : Layer(config) {}

  ~OuterProdLayer() {}
diff --git a/paddle/gserver/layers/PadLayer.h b/paddle/gserver/layers/PadLayer.h
index 7e09d7f8a0d4dfd5300298ad0514b69781d87016..46b8a595978489c630b3ff2429ecb19d7c12521a 100644
--- a/paddle/gserver/layers/PadLayer.h
+++ b/paddle/gserver/layers/PadLayer.h
@@ -24,7 +24,7 @@ namespace paddle {
 * the 4th dimension according to padc_, padh_ and padw_.
 */
class PadLayer : public Layer {
-public:
+ public:
  explicit PadLayer(const LayerConfig& config) : Layer(config) {}

  ~PadLayer() {}
@@ -34,7 +34,7 @@ public:
  void forward(PassType passType) override;
  void backward(const UpdateCallback& callback = nullptr) override;

-protected:
+ protected:
  void setOutDims(const size_t batchSize);
  void setTensorDim(const size_t batchSize);
diff --git a/paddle/gserver/layers/ParameterReluLayer.h b/paddle/gserver/layers/ParameterReluLayer.h
index 3725fa4a1199285b703590255af492ebffdaab2c..4553413fcdbecbc83e1f50e8ffbe874fdf05d828 100644
--- a/paddle/gserver/layers/ParameterReluLayer.h
+++ b/paddle/gserver/layers/ParameterReluLayer.h
@@ -36,7 +36,7 @@ namespace paddle {
 */
class ParameterReluLayer : public Layer {
-protected:
+ protected:
  std::unique_ptr weight_;

  /**
@@ -51,7 +51,7 @@ protected:
   */
  size_t partialSum_;

-public:
+ public:
  explicit ParameterReluLayer(const LayerConfig& config) : Layer(config) {}

  ~ParameterReluLayer() {}
diff --git a/paddle/gserver/layers/Pool3DLayer.h b/paddle/gserver/layers/Pool3DLayer.h
index 59ee73f7cb9fb4287c12f3c7d0cacfc812484770..32605f8b7028cfb4909c885e83017a8cffa79575 100644
--- a/paddle/gserver/layers/Pool3DLayer.h
+++ b/paddle/gserver/layers/Pool3DLayer.h
@@ -26,7 +26,7 @@ namespace paddle {
 * Pools the input within regions
 */
class Pool3DLayer : public Layer {
-public:
+ public:
  explicit Pool3DLayer(const LayerConfig& config) : Layer(config) {}
  ~Pool3DLayer() {}
@@ -36,7 +36,7 @@ public:
  void backward(const UpdateCallback& callback) override;
  size_t getSize();

-protected:
+ protected:
  int channels_;
  int sizeX_, sizeY_, sizeZ_;
  int strideW_, strideH_, strideD_;
diff --git a/paddle/gserver/layers/PoolLayer.h b/paddle/gserver/layers/PoolLayer.h
index 58d5fb0a095e8326f9b6f9cb2a97bb88022ceed8..99f8f148e2eb00f7e431e7d8c5acbf9e27574017 100644
--- a/paddle/gserver/layers/PoolLayer.h
+++ b/paddle/gserver/layers/PoolLayer.h
@@ -26,7 +26,7 @@ namespace paddle {
 * Pools the input within regions
 */
class PoolLayer : public Layer {
-protected:
+ protected:
  size_t channels_, sizeX_, stride_, outputX_, imgSize_;
  int confPadding_;
@@ -40,7 +40,7 @@ protected:

  bool excludeMode_;

-public:
+ public:
  explicit PoolLayer(const LayerConfig& config) : Layer(config) {}

  /**
diff --git a/paddle/gserver/layers/PoolProjection.h
b/paddle/gserver/layers/PoolProjection.h index c99287dbf0f4503c180b9b4e9e46abafa67bf64d..8004cc1550337160b7f022c97a23ed8eb9d43ca4 100644 --- a/paddle/gserver/layers/PoolProjection.h +++ b/paddle/gserver/layers/PoolProjection.h @@ -20,7 +20,7 @@ limitations under the License. */ namespace paddle { class PoolProjection : public Projection { -protected: + protected: size_t imgSizeY_, imgSize_; size_t outputY_, outputX_; size_t strideY_, stride_; @@ -30,7 +30,7 @@ protected: std::string poolType_; bool excludeMode_; -public: + public: PoolProjection(const ProjectionConfig& config, ParameterPtr parameter, bool useGpu); @@ -45,7 +45,7 @@ public: }; class MaxPoolProjection : public PoolProjection { -public: + public: MaxPoolProjection(const ProjectionConfig& config, ParameterPtr parameter, bool useGpu) @@ -56,7 +56,7 @@ public: }; class AvgPoolProjection : public PoolProjection { -public: + public: AvgPoolProjection(const ProjectionConfig& config, ParameterPtr parameter, bool useGpu) diff --git a/paddle/gserver/layers/PoolProjectionLayer.h b/paddle/gserver/layers/PoolProjectionLayer.h index 5a97a7769aaeebcfd4fe2c10d8ac0cc8892f68e3..9ad144cc2ad426caa522bf1061a750d47e64a755 100644 --- a/paddle/gserver/layers/PoolProjectionLayer.h +++ b/paddle/gserver/layers/PoolProjectionLayer.h @@ -24,13 +24,13 @@ namespace paddle { * @brief Basic parent layer of different kinds of pooling */ class PoolProjectionLayer : public PoolLayer { -protected: + protected: size_t imgSizeH_, imgSizeW_; size_t outputH_, outputW_; std::unique_ptr poolProjection_; ProjectionConfig projectionConfig_; -public: + public: explicit PoolProjectionLayer(const LayerConfig& config) : PoolLayer(config) { PoolConfig* conf = projectionConfig_.mutable_pool_conf(); *conf = config_.inputs(0).pool_conf(); diff --git a/paddle/gserver/layers/PowerLayer.cpp b/paddle/gserver/layers/PowerLayer.cpp index 18f650fcdaded5ad7199510594b873fc18c3d7b5..7e8d60db8fe588026c6040099745c3aefd7237b5 100644 --- a/paddle/gserver/layers/PowerLayer.cpp +++ b/paddle/gserver/layers/PowerLayer.cpp @@ -32,10 +32,10 @@ namespace paddle { */ class PowerLayer : public Layer { -protected: + protected: MatrixPtr tmpMtx; -public: + public: explicit PowerLayer(const LayerConfig& config) : Layer(config) {} ~PowerLayer() {} diff --git a/paddle/gserver/layers/PrintLayer.cpp b/paddle/gserver/layers/PrintLayer.cpp index 5a527d598dd5e11ae0b74a32c9b9884e73ed45a8..6fbcc447f92208439bddd14d421d62cab30d81f4 100644 --- a/paddle/gserver/layers/PrintLayer.cpp +++ b/paddle/gserver/layers/PrintLayer.cpp @@ -17,7 +17,7 @@ limitations under the License. 
 */
namespace paddle {

class PrintLayer : public Layer {
-public:
+ public:
  explicit PrintLayer(const LayerConfig& config) : Layer(config) {}

  void forward(PassType passType) override {
diff --git a/paddle/gserver/layers/PriorBox.cpp b/paddle/gserver/layers/PriorBox.cpp
index 56a4d942f0fdcb981f52f6ce0f644ec57a0e3c9a..39d2c2d737fa90737635efdb209610e156c8662f 100644
--- a/paddle/gserver/layers/PriorBox.cpp
+++ b/paddle/gserver/layers/PriorBox.cpp
@@ -28,7 +28,7 @@ namespace paddle {
 */
class PriorBoxLayer : public Layer {
-public: // NOLINT
+ public: // NOLINT
  explicit PriorBoxLayer(const LayerConfig& config) : Layer(config) {}
  bool init(const LayerMap& layerMap,
            const ParameterMap& parameterMap) override;
@@ -36,7 +36,7 @@ public: // NOLINT
  void forward(PassType passType) override;
  void backward(const UpdateCallback& callback) override {}

-protected: // NOLINT
+ protected: // NOLINT
  int numPriors_;
  std::vector minSize_;
  std::vector maxSize_;
diff --git a/paddle/gserver/layers/Projection.h b/paddle/gserver/layers/Projection.h
index 1f0b96c79ec7313cd9c5ff9139a455b3269b222b..88a41355cfce711e1e9522655058d0f1198e4e76 100644
--- a/paddle/gserver/layers/Projection.h
+++ b/paddle/gserver/layers/Projection.h
@@ -37,7 +37,7 @@ namespace paddle {
 * to output Argument.
 */
class Projection {
-public:
+ public:
  static Projection* create(const ProjectionConfig& config,
                            ParameterPtr parameter,
                            bool useGpu);
@@ -98,7 +98,7 @@ public:
   */
  size_t getOutputSize() const { return config_.output_size(); }

-protected:
+ protected:
  /**
   * Create layer function. Function is called in forward or backward.
   * \param function, Layer::forward_ or Layer::backward_
@@ -119,7 +119,7 @@ protected:
    func->init(config);
  }

-protected:
+ protected:
  /// Config of projection
  ProjectionConfig config_;
  /// Parameter of projection
diff --git a/paddle/gserver/layers/ROIPoolLayer.h b/paddle/gserver/layers/ROIPoolLayer.h
index b1735e9748dc3956aade010f33303b55d4f9f439..801a9b3aebe6d718ea38b76246a6056891d0b1f6 100644
--- a/paddle/gserver/layers/ROIPoolLayer.h
+++ b/paddle/gserver/layers/ROIPoolLayer.h
@@ -33,7 +33,7 @@ namespace paddle {
 */
class ROIPoolLayer : public Layer {
-protected:
+ protected:
  size_t channels_;
  size_t width_;
  size_t height_;
@@ -44,7 +44,7 @@ protected:
  // Since there is no int matrix, use real matrix instead.
  MatrixPtr maxIdxs_;

-public:
+ public:
  explicit ROIPoolLayer(const LayerConfig& config) : Layer(config) {}

  bool init(const LayerMap& layerMap,
diff --git a/paddle/gserver/layers/RecurrentLayer.h b/paddle/gserver/layers/RecurrentLayer.h
index 8fd4fe6b78ae6474f3cfcec605f25b72af8295bb..94e633e65777aad540738ea67ea1b4e03dd75954 100644
--- a/paddle/gserver/layers/RecurrentLayer.h
+++ b/paddle/gserver/layers/RecurrentLayer.h
@@ -40,7 +40,7 @@ namespace paddle {
 */
class RecurrentLayer : public Layer {
-public:
+ public:
  explicit RecurrentLayer(const LayerConfig& config) : Layer(config) {}

  bool init(const LayerMap& layerMap,
@@ -56,7 +56,7 @@ public:

  LayerStatePtr getState() override;

-protected:
+ protected:
  /**
   * @brief If the user does not set --rnn_use_batch=true, it will
   * compute rnn forward one sequence by one sequence by default.
@@ -110,7 +110,7 @@ protected: size_t numSequences, const int* starts); -protected: + protected: std::unique_ptr weight_; std::unique_ptr bias_; diff --git a/paddle/gserver/layers/RecurrentLayerGroup.cpp b/paddle/gserver/layers/RecurrentLayerGroup.cpp index 44b57185c5a5fa7703ca477b990a73cdad2c2aa1..6694e8f2996fdd2c98da1507e5fb3b90b271c850 100644 --- a/paddle/gserver/layers/RecurrentLayerGroup.cpp +++ b/paddle/gserver/layers/RecurrentLayerGroup.cpp @@ -27,7 +27,7 @@ namespace paddle { * between RecurrentLayerGroupBegin and RecurrentLayerGroupEnd. */ class RecurrentLayerGroup : public Layer { -public: + public: explicit RecurrentLayerGroup(const LayerConfig& config) : Layer(config) {} void initSubNetwork(NeuralNetwork* rootNetwork, @@ -58,7 +58,7 @@ public: callback(*network_); } -private: + private: std::unique_ptr network_; }; diff --git a/paddle/gserver/layers/ResizeLayer.cpp b/paddle/gserver/layers/ResizeLayer.cpp index 831f4c3b7e103bc51d870cfa44616980adca08e8..d4ae9945934a40719d253d4b53915530423448af 100644 --- a/paddle/gserver/layers/ResizeLayer.cpp +++ b/paddle/gserver/layers/ResizeLayer.cpp @@ -24,7 +24,7 @@ namespace paddle { * resize matrix: (height * width / size) * size */ class ResizeLayer : public Layer { -public: + public: explicit ResizeLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/RotateLayer.h b/paddle/gserver/layers/RotateLayer.h index 3b619921ab741e1236a495e497e18e265bd6e110..7ecbff20167dd95f782f2d61dc34697ab3273934 100644 --- a/paddle/gserver/layers/RotateLayer.h +++ b/paddle/gserver/layers/RotateLayer.h @@ -32,7 +32,7 @@ namespace paddle { */ class RotateLayer : public Layer { -public: + public: explicit RotateLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, const ParameterMap& parameterMap); @@ -40,7 +40,7 @@ public: void forward(PassType passType); void backward(const UpdateCallback& callback = nullptr); -private: + private: int batchSize_; int size_; int height_; diff --git a/paddle/gserver/layers/RowConvLayer.h b/paddle/gserver/layers/RowConvLayer.h index ba0af1de68a5f77d9ffefac6ef5193bb9d1b4f83..3b74df0b1af5caef1a1abd3d3c5b3ae3b67c429b 100644 --- a/paddle/gserver/layers/RowConvLayer.h +++ b/paddle/gserver/layers/RowConvLayer.h @@ -22,7 +22,7 @@ namespace paddle { * \brief Row Convolution Layer. */ class RowConvLayer : public Layer { -public: + public: explicit RowConvLayer(const LayerConfig& config) : Layer(config) {} ~RowConvLayer() {} @@ -32,7 +32,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -protected: + protected: // Row convolution weight, context_lenght_ * fan_out. // fan_out is the size of output feature. 
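  // (Row convolution is the lookahead convolution used in unidirectional
  // models: each output frame mixes the current input row with the next
  // few context rows, so a little future context is available without
  // running a second, backward RNN.)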
std::unique_ptr weight_; diff --git a/paddle/gserver/layers/RowL2NormLayer.cpp b/paddle/gserver/layers/RowL2NormLayer.cpp index 7ff0c9bae927cae2bc6a332bc0bde013e07edd0a..d5e6e10a0276adb74ec31c13d9e8acc77414a85b 100644 --- a/paddle/gserver/layers/RowL2NormLayer.cpp +++ b/paddle/gserver/layers/RowL2NormLayer.cpp @@ -26,12 +26,12 @@ namespace paddle { */ class RowL2NormLayer : public Layer { -protected: + protected: MatrixPtr inSquare_; MatrixPtr l2NormReciprocal_; MatrixPtr dotSum_; -public: + public: explicit RowL2NormLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/SamplingIdLayer.cpp b/paddle/gserver/layers/SamplingIdLayer.cpp index 2edd915d226edfd7e48df1a066d5a6f51f259511..dbce63588126c012e3b9713e8be749e0001ddec7 100644 --- a/paddle/gserver/layers/SamplingIdLayer.cpp +++ b/paddle/gserver/layers/SamplingIdLayer.cpp @@ -31,7 +31,7 @@ class SamplingIdLayer : public Layer { std::uniform_real_distribution rand1_; std::vector tmpCpuInput_; -public: + public: explicit SamplingIdLayer(const LayerConfig& config) : Layer(config), rand1_(0, 1) {} diff --git a/paddle/gserver/layers/ScaleShiftLayer.cpp b/paddle/gserver/layers/ScaleShiftLayer.cpp index 799d1fe51a65da10bef637894931627315daf0a2..8af78a2e27d2b50572f8bdd6e98696f3d1967eb1 100644 --- a/paddle/gserver/layers/ScaleShiftLayer.cpp +++ b/paddle/gserver/layers/ScaleShiftLayer.cpp @@ -30,11 +30,11 @@ namespace paddle { */ class ScaleShiftLayer : public Layer { -protected: + protected: std::unique_ptr scale_; std::unique_ptr offset_; -public: + public: explicit ScaleShiftLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/ScaleSubRegionLayer.h b/paddle/gserver/layers/ScaleSubRegionLayer.h index 6e861be4858cfc21a42ef7293652d5cdf81be5f5..fe431698bc6cd5e52e2c545756b40be8b307e644 100644 --- a/paddle/gserver/layers/ScaleSubRegionLayer.h +++ b/paddle/gserver/layers/ScaleSubRegionLayer.h @@ -29,7 +29,7 @@ namespace paddle { * region. */ class ScaleSubRegionLayer : public Layer { -public: + public: explicit ScaleSubRegionLayer(const LayerConfig& config) : Layer(config) {} ~ScaleSubRegionLayer() {} @@ -40,7 +40,7 @@ public: void backward(const UpdateCallback& callback = nullptr); -protected: + protected: TensorShape shape_; TensorShape indicesShape_; size_t imgH_; diff --git a/paddle/gserver/layers/ScalingLayer.cpp b/paddle/gserver/layers/ScalingLayer.cpp index 1d98a7373d172d40cddc9b4611cb00434f17e00b..15e07daebee194a789da52d37a192e031348300c 100644 --- a/paddle/gserver/layers/ScalingLayer.cpp +++ b/paddle/gserver/layers/ScalingLayer.cpp @@ -32,7 +32,7 @@ namespace paddle { */ class ScalingLayer : public Layer { -public: + public: explicit ScalingLayer(const LayerConfig& config) : Layer(config) {} ~ScalingLayer() {} diff --git a/paddle/gserver/layers/ScalingProjection.cpp b/paddle/gserver/layers/ScalingProjection.cpp index 99b5b68f543842d23f20b626fddd66b677ebe059..4d871cafc4d0194a61044d76a766236209c33d47 100644 --- a/paddle/gserver/layers/ScalingProjection.cpp +++ b/paddle/gserver/layers/ScalingProjection.cpp @@ -17,7 +17,7 @@ limitations under the License. 
*/ namespace paddle { class ScalingProjection : public Projection { -public: + public: ScalingProjection(const ProjectionConfig& config, const ParameterPtr& parameter, bool useGpu) @@ -48,7 +48,7 @@ public: } } -protected: + protected: std::unique_ptr weight_; }; diff --git a/paddle/gserver/layers/SelectiveFullyConnectedLayer.h b/paddle/gserver/layers/SelectiveFullyConnectedLayer.h index 81564074185a5d9fc80d4d3a64af998098ab5472..4b32ce8b162c2a8b1a6c34adc0885a7701f5f91e 100644 --- a/paddle/gserver/layers/SelectiveFullyConnectedLayer.h +++ b/paddle/gserver/layers/SelectiveFullyConnectedLayer.h @@ -33,11 +33,11 @@ namespace paddle { * The config file api is selective_fc_layer. */ class SelectiveFullyConnectedLayer : public Layer { -protected: + protected: WeightList weights_; std::unique_ptr biases_; -private: + private: /** * Get selected columns each forward. */ @@ -60,7 +60,7 @@ private: /// if true, means output_.value is the same as Fc Layer bool fullOutput_; -public: + public: explicit SelectiveFullyConnectedLayer(const LayerConfig& config) : Layer(config), selCols_(nullptr) {} @@ -94,7 +94,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -private: + private: /** * @brief Make SelectiveFC act as FullyConnectedLayer */ diff --git a/paddle/gserver/layers/SequenceConcatLayer.cpp b/paddle/gserver/layers/SequenceConcatLayer.cpp index cf573f3f33fcd70c6768b164f158cb1f545414fc..c84c3ce4f080cc19f4937f04585accb5b2b347f9 100644 --- a/paddle/gserver/layers/SequenceConcatLayer.cpp +++ b/paddle/gserver/layers/SequenceConcatLayer.cpp @@ -29,10 +29,10 @@ namespace paddle { */ class SequenceConcatLayer : public Layer { -protected: + protected: std::unique_ptr biases_; -public: + public: explicit SequenceConcatLayer(const LayerConfig& config) : Layer(config) {} ~SequenceConcatLayer() {} diff --git a/paddle/gserver/layers/SequenceLastInstanceLayer.cpp b/paddle/gserver/layers/SequenceLastInstanceLayer.cpp index 6c4ae775c16ac76e237fb8f8ee5ec9ed8f11802e..28d0a9296d4accd4152e886ccae12a776fdb8f7f 100644 --- a/paddle/gserver/layers/SequenceLastInstanceLayer.cpp +++ b/paddle/gserver/layers/SequenceLastInstanceLayer.cpp @@ -38,12 +38,12 @@ namespace paddle { */ class SequenceLastInstanceLayer : public SequencePoolLayer { -protected: + protected: MatrixPtr tmpSrc_; MatrixPtr tmpDest_; std::vector instanceIds_; -public: + public: explicit SequenceLastInstanceLayer(const LayerConfig& config) : SequencePoolLayer(config) {} diff --git a/paddle/gserver/layers/SequencePoolLayer.h b/paddle/gserver/layers/SequencePoolLayer.h index 254e4cc6b3aacf21565cb03e5bdb52a2beb9fea8..01183060afd58376bb718dda64d8106cce4899f9 100644 --- a/paddle/gserver/layers/SequencePoolLayer.h +++ b/paddle/gserver/layers/SequencePoolLayer.h @@ -41,7 +41,7 @@ namespace paddle { */ class SequencePoolLayer : public Layer { -protected: + protected: int type_; std::unique_ptr biases_; enum SequenceLevel { kNonSeq = 0, kSeq = 1 }; @@ -51,7 +51,7 @@ protected: // Whether the input sequence is reversed or not. 
bool reversed_ = false; -public: + public: explicit SequencePoolLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/SequenceReshapeLayer.cpp b/paddle/gserver/layers/SequenceReshapeLayer.cpp index fb96669917236b98809f1cda0d023600f1e76731..319310af8c4ac3bdefd814ad05b7fde6070f2340 100644 --- a/paddle/gserver/layers/SequenceReshapeLayer.cpp +++ b/paddle/gserver/layers/SequenceReshapeLayer.cpp @@ -29,12 +29,12 @@ namespace paddle { */ class SequenceReshapeLayer : public Layer { -protected: + protected: std::unique_ptr<Weight> biases_; MatrixPtr reshapedOutputGrad; -public: + public: explicit SequenceReshapeLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/SequenceSliceLayer.cpp b/paddle/gserver/layers/SequenceSliceLayer.cpp index 1b7c33477ea64c1cdb7c8e85d7a5302b299d7552..a6d810b583aab6e44faa583795686f06e17beeb9 100644 --- a/paddle/gserver/layers/SequenceSliceLayer.cpp +++ b/paddle/gserver/layers/SequenceSliceLayer.cpp @@ -21,7 +21,7 @@ limitations under the License. */ namespace paddle { class SequenceSliceLayer : public Layer { -public: + public: explicit SequenceSliceLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, @@ -30,7 +30,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -private: + private: /* * TODO(caoying) * In PaddlePaddle, currently all matrices are real number types, diff --git a/paddle/gserver/layers/SequenceToBatch.h b/paddle/gserver/layers/SequenceToBatch.h index 8743a5ef10f61970d3d48b105b9da29bcd10ba83..5200e702d9bc947746567c19ca7d552750828131 100644 --- a/paddle/gserver/layers/SequenceToBatch.h +++ b/paddle/gserver/layers/SequenceToBatch.h @@ -39,7 +39,7 @@ namespace paddle { * */ class SequenceToBatch { -public: + public: explicit SequenceToBatch(bool useGpu) : useGpu_(useGpu) {} /* resize and calculate the batchIndex_ */ @@ -82,7 +82,7 @@ public: numBatch_ = seq2batch.numBatch_; } -protected: + protected: void sequence2BatchCopy(Matrix &batch, Matrix &sequence, IVector &seq2BatchIdx, diff --git a/paddle/gserver/layers/SliceProjection.cpp b/paddle/gserver/layers/SliceProjection.cpp index 5627ad1eb3a49a73261bc2197cbd3735489509d2..b474f2db759adfad337f9485a5a38588b6839c54 100644 --- a/paddle/gserver/layers/SliceProjection.cpp +++ b/paddle/gserver/layers/SliceProjection.cpp @@ -44,14 +44,14 @@ namespace paddle { * The config file api is slice_projection.
*/ class SliceProjection : public Projection { -public: + public: SliceProjection(const ProjectionConfig& config, const ParameterPtr& parameter, bool useGpu); virtual void forward(); virtual void backward(const UpdateCallback& callback); -protected: + protected: std::vector<std::pair<size_t, size_t>> slices_; }; diff --git a/paddle/gserver/layers/SlopeInterceptLayer.cpp b/paddle/gserver/layers/SlopeInterceptLayer.cpp index c94a07e5da7442bba1ce7e9c09c4ffea3e5cd4ac..f7f4735c1b72d4ac6540714573fd7e15ef99ea5b 100644 --- a/paddle/gserver/layers/SlopeInterceptLayer.cpp +++ b/paddle/gserver/layers/SlopeInterceptLayer.cpp @@ -36,7 +36,7 @@ namespace paddle { */ class SlopeInterceptLayer : public Layer { -public: + public: explicit SlopeInterceptLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/SpatialPyramidPoolLayer.h b/paddle/gserver/layers/SpatialPyramidPoolLayer.h index 6cb5fdf83e2b88ce4adb392807a1fdbac253c51c..421bdfe09c46f656f500daff195c755274bf8bb7 100644 --- a/paddle/gserver/layers/SpatialPyramidPoolLayer.h +++ b/paddle/gserver/layers/SpatialPyramidPoolLayer.h @@ -29,7 +29,7 @@ namespace paddle { */ class SpatialPyramidPoolLayer : public Layer { -protected: + protected: size_t channels_; size_t imgSizeW_; size_t imgSizeH_; @@ -40,7 +40,7 @@ protected: std::vector<Argument> projOutput_; std::vector<std::pair<size_t, size_t>> projCol_; -public: + public: explicit SpatialPyramidPoolLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/SubNestedSequenceLayer.cpp b/paddle/gserver/layers/SubNestedSequenceLayer.cpp index db240ab0c96510263d90b291f6396ac51a73fbbd..e2bb00bbfacb26dc736a63877119b379f22b5983 100644 --- a/paddle/gserver/layers/SubNestedSequenceLayer.cpp +++ b/paddle/gserver/layers/SubNestedSequenceLayer.cpp @@ -21,7 +21,7 @@ limitations under the License. */ namespace paddle { class SubNestedSequenceLayer : public Layer { -public: + public: explicit SubNestedSequenceLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, @@ -30,7 +30,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback = nullptr) override; -private: + private: /* * This function generates the indices of rows in a batch according to the * indices of selected sub-sequence in each sequence.
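Every hunk in this run makes the same mechanical change: the access specifiers `public:`, `protected:`, and `private:` move from column zero to a one-space indent, with the class bodies otherwise untouched. For orientation, the sketch below shows the shared layer skeleton these files keep repeating: an `explicit` constructor taking a `LayerConfig`, `init` over `LayerMap`/`ParameterMap`, and `forward`/`backward` overrides. `ExampleLayer` and its member set are hypothetical, not part of this patch; the surrounding types (`Layer`, `Weight`, `PassType`, `UpdateCallback`) are the ones declared in the headers being reformatted here.

```cpp
// Minimal sketch of the layer pattern reformatted throughout this patch.
// ExampleLayer is hypothetical; all other names come from paddle/gserver.
namespace paddle {

class ExampleLayer : public Layer {
 protected:  // one-space indent, per the new style applied in these hunks
  // Optional trainable bias, as in SequenceConcatLayer and friends.
  std::unique_ptr<Weight> biases_;

 public:
  explicit ExampleLayer(const LayerConfig& config) : Layer(config) {}

  bool init(const LayerMap& layerMap,
            const ParameterMap& parameterMap) override;

  void forward(PassType passType) override;
  void backward(const UpdateCallback& callback = nullptr) override;
};

}  // namespace paddle
```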
diff --git a/paddle/gserver/layers/SubSequenceLayer.cpp b/paddle/gserver/layers/SubSequenceLayer.cpp index 808627f09273950bb6f52a4a6e497bcb8ea170f7..ba49f5710f9d0bb985cf1e80d5c4a972d8f046a6 100644 --- a/paddle/gserver/layers/SubSequenceLayer.cpp +++ b/paddle/gserver/layers/SubSequenceLayer.cpp @@ -27,12 +27,12 @@ namespace paddle { */ class SubSequenceLayer : public Layer { -protected: + protected: std::unique_ptr<Weight> biases_; MatrixPtr tmpSrc_; MatrixPtr tmpDest_; -public: + public: explicit SubSequenceLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/SumToOneNormLayer.cpp b/paddle/gserver/layers/SumToOneNormLayer.cpp index ffbe14925300ad1ffbd33f43a6c0afadddd231e6..00764717e8b6be30230e44626974033e929352da 100644 --- a/paddle/gserver/layers/SumToOneNormLayer.cpp +++ b/paddle/gserver/layers/SumToOneNormLayer.cpp @@ -32,13 +32,13 @@ namespace paddle { */ class SumToOneNormLayer : public Layer { -protected: + protected: /// reciprocalRowSum_ = \f$1 / \sum_{k=1}^N in[k]\f$ MatrixPtr reciprocalRowSum_; /// dotSum = output_.grad \f$.*\f$ output_.value MatrixPtr dotSum_; -public: + public: explicit SumToOneNormLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/SwitchOrderLayer.h b/paddle/gserver/layers/SwitchOrderLayer.h index 882437f4434c2e61a5b08328d2f79c1e7f589204..8a551a2bba698374841e73dc4dbad403034dd300 100644 --- a/paddle/gserver/layers/SwitchOrderLayer.h +++ b/paddle/gserver/layers/SwitchOrderLayer.h @@ -22,7 +22,7 @@ namespace paddle { * \brief This layer calculates softmax in image channel dimension. */ class SwitchOrderLayer : public Layer { -public: + public: explicit SwitchOrderLayer(const LayerConfig& config) : Layer(config) {} ~SwitchOrderLayer() {} @@ -34,7 +34,7 @@ public: void setInDims(); void setOutDims(); -protected: + protected: std::vector<std::shared_ptr<FunctionBase>> nchw2nhwc_; std::vector<std::shared_ptr<FunctionBase>> nhwc2nchw_; TensorShape inDims_; diff --git a/paddle/gserver/layers/TableProjection.h b/paddle/gserver/layers/TableProjection.h index ffb05e68f068a7b9abb0db5cea6133e64300cb55..60286149f4227fbc758dca7864c6d1f67782c7ae 100644 --- a/paddle/gserver/layers/TableProjection.h +++ b/paddle/gserver/layers/TableProjection.h @@ -32,7 +32,7 @@ namespace paddle { * @note If \f$ids[i] = -1\f$, it will be ignored. */ class TableProjection : public Projection { -public: + public: TableProjection(const ProjectionConfig& config, const ParameterPtr& parameter, bool useGpu); @@ -43,7 +43,7 @@ public: virtual void forward(); virtual void backward(const UpdateCallback& callback); -protected: + protected: std::unique_ptr<Weight> table_; }; diff --git a/paddle/gserver/layers/TensorLayer.h b/paddle/gserver/layers/TensorLayer.h index 8a323aa15f6f3761c45b6ca7e3be8f15621a189e..5c1ee40ceda9387138a82368ec4edcbae4bd3419 100644 --- a/paddle/gserver/layers/TensorLayer.h +++ b/paddle/gserver/layers/TensorLayer.h @@ -37,11 +37,11 @@ namespace paddle { */ class TensorLayer : public Layer { -protected: + protected: WeightList weights_; std::unique_ptr<Weight> biases_; -public: + public: explicit TensorLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/TransLayer.h b/paddle/gserver/layers/TransLayer.h index 03d094862459c80aee8899c0352ffce732db08af..1cd8fd91f785d5a43fc7d7663e657702b32fa534 100644 --- a/paddle/gserver/layers/TransLayer.h +++ b/paddle/gserver/layers/TransLayer.h @@ -29,7 +29,7 @@ namespace paddle { * The config file api is trans_layer.
*/ class TransLayer : public Layer { -public: + public: explicit TransLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, diff --git a/paddle/gserver/layers/TransposedFullMatrixProjection.cpp b/paddle/gserver/layers/TransposedFullMatrixProjection.cpp index 755389f7074c252c0fad396e629c6ffedc74b531..45f59779896f993aface284e3485e1e3d801f4c5 100644 --- a/paddle/gserver/layers/TransposedFullMatrixProjection.cpp +++ b/paddle/gserver/layers/TransposedFullMatrixProjection.cpp @@ -24,14 +24,14 @@ namespace paddle { * The config file api is trans_full_matrix_projection. */ class TransposedFullMatrixProjection : public Projection { -public: + public: TransposedFullMatrixProjection(const ProjectionConfig& config, ParameterPtr parameter, bool useGPu); virtual void forward(); virtual void backward(const UpdateCallback& callback); -protected: + protected: std::unique_ptr<Weight> weight_; }; diff --git a/paddle/gserver/layers/UpsampleLayer.h b/paddle/gserver/layers/UpsampleLayer.h index 25efbac5e9e6e92653f7c2b2f4dca9221737e5d6..c9d079c3141c37517866bfdad10d9b2cdb89f7d5 100644 --- a/paddle/gserver/layers/UpsampleLayer.h +++ b/paddle/gserver/layers/UpsampleLayer.h @@ -30,7 +30,7 @@ namespace paddle { */ class UpsampleLayer : public Layer { -public: + public: explicit UpsampleLayer(const LayerConfig& config) : Layer(config) {} ~UpsampleLayer() {} @@ -42,7 +42,7 @@ public: size_t getOutputSize(); -protected: + protected: size_t scale_, scaleY_; size_t upsampleSize_, upsampleSizeY_; size_t padOutX_, padOutY_; diff --git a/paddle/gserver/layers/ValidationLayer.h b/paddle/gserver/layers/ValidationLayer.h index f412d685c0541537bd4318fec2dae06215c4afbe..be41128ef4530f32a63c757648c2f393fd118ea6 100644 --- a/paddle/gserver/layers/ValidationLayer.h +++ b/paddle/gserver/layers/ValidationLayer.h @@ -23,7 +23,7 @@ DECLARE_int32(trainer_id); namespace paddle { class ValidationLayer : public Layer { -public: + public: explicit ValidationLayer(const LayerConfig& config) : Layer(config) {} bool init(const LayerMap& layerMap, @@ -51,7 +51,7 @@ public: * AucValidation */ class AucValidation : public ValidationLayer { -public: + public: explicit AucValidation(const LayerConfig& config) : ValidationLayer(config), cpuOutput_(nullptr), @@ -72,7 +72,7 @@ public: }; std::vector<PredictionResult> predictArray_; -private: + private: bool passBegin_; std::unique_ptr<Evaluator> evaluator_; MatrixPtr cpuOutput_; @@ -84,7 +84,7 @@ private: * positive-negative pair rate Validation */ class PnpairValidation : public ValidationLayer { -public: + public: explicit PnpairValidation(const LayerConfig& config) : ValidationLayer(config) {} @@ -95,7 +95,7 @@ public: void onPassEnd() override; -private: + private: bool passBegin_; std::unique_ptr<Evaluator> evaluator_; }; diff --git a/paddle/gserver/layers/WarpCTCLayer.h b/paddle/gserver/layers/WarpCTCLayer.h index 6f6be359c0aa46a4f3775f8405e1aa51ca1ae147..3017ca794ecc14f5a3cbd0b302a4953a191a5065 100644 --- a/paddle/gserver/layers/WarpCTCLayer.h +++ b/paddle/gserver/layers/WarpCTCLayer.h @@ -26,7 +26,7 @@ namespace paddle { * The config file api is warp_ctc_layer.
*/ class WarpCTCLayer : public Layer { -public: + public: explicit WarpCTCLayer(const LayerConfig& config) : Layer(config) {} ~WarpCTCLayer() {} @@ -35,7 +35,7 @@ public: void forward(PassType passType) override; void backward(const UpdateCallback& callback) override; -protected: + protected: /** * sequence matrix and batch matrix copy: * sequence (s0, s0, s0, s0; s1, s1; s2, s2, s2; s3) @@ -49,7 +49,7 @@ protected: const ICpuGpuVectorPtr& seqStartPositions, bool normByTimes); -protected: + protected: size_t numClasses_; size_t blank_; size_t maxSequenceLength_; diff --git a/paddle/gserver/tests/MKLDNNTester.h b/paddle/gserver/tests/MKLDNNTester.h index c1faa6fd90e06d8c742e97c9ce51eeba3c24a550..41ac46b70ab08d4071f4e6abfca94667268015d7 100644 --- a/paddle/gserver/tests/MKLDNNTester.h +++ b/paddle/gserver/tests/MKLDNNTester.h @@ -44,7 +44,7 @@ class MKLDNNTester { std::vector paraValues; }; -protected: + protected: std::vector configs_; vector layerNames_; vector> dataLayers_; @@ -65,7 +65,7 @@ protected: /// passType, PASS_TRAIN, PASS_TEST or PASS_GC (Gradient Check pass) PassType passType_; -public: + public: explicit MKLDNNTester(size_t iter = 3, float epsilon = 1e-4) { iter_ = iter; eps_ = epsilon; @@ -75,7 +75,7 @@ public: ~MKLDNNTester() {} -public: + public: void run(const TestConfig& dnn, const TestConfig& ref, size_t batchSize, @@ -97,7 +97,7 @@ public: bool use_mkldnn, size_t iter = 2); -private: + private: void reset(const TestConfig& dnn, const TestConfig& ref, size_t batchSize); void setInputImgSize(); void runOnce(); diff --git a/paddle/gserver/tests/test_MultinomialSampler.cpp b/paddle/gserver/tests/test_MultinomialSampler.cpp index 4a295ea9d51788f988fe79f8439cc7769f661d8e..043025239e744601cbef3ca5c241509872963bd8 100644 --- a/paddle/gserver/tests/test_MultinomialSampler.cpp +++ b/paddle/gserver/tests/test_MultinomialSampler.cpp @@ -27,7 +27,7 @@ using namespace paddle; // NOLINT using namespace std; // NOLINT class MultinomialSamplerTester : public MultinomialSampler { -public: + public: MultinomialSamplerTester(real* prob, int size) : MultinomialSampler(prob, size) {} diff --git a/paddle/gserver/tests/test_RecurrentGradientMachine.cpp b/paddle/gserver/tests/test_RecurrentGradientMachine.cpp index 72324fcf29cc60867005da25b35a8075fd590a89..9770567b88a2af946b30439300540ed61694ba10 100644 --- a/paddle/gserver/tests/test_RecurrentGradientMachine.cpp +++ b/paddle/gserver/tests/test_RecurrentGradientMachine.cpp @@ -26,7 +26,7 @@ DECLARE_int32(seed); using namespace paddle; // NOLINT using namespace std; // NOLINT class TrainerForTest : public paddle::Trainer { -public: + public: void startTrain() { GradientMachine& gm = *this->trainerInternal_.getGradientMachine(); gm.start(); diff --git a/paddle/gserver/tests/test_RecurrentLayer.cpp b/paddle/gserver/tests/test_RecurrentLayer.cpp index e5ce922f15749cb18b93f64e0e08f437c5633065..b54e37b7dbf8bffeb949f709e6a4f9ec86ea13c3 100644 --- a/paddle/gserver/tests/test_RecurrentLayer.cpp +++ b/paddle/gserver/tests/test_RecurrentLayer.cpp @@ -225,7 +225,7 @@ TEST(Layer, RecurrentLayer) { #include "paddle/gserver/layers/RecurrentLayer.h" template class TestRecurrentLayer { -public: + public: LayerConfig config_; bool useGpu_; bool useBatch_; diff --git a/paddle/math/Allocator.h b/paddle/math/Allocator.h index ae60f6fe5fa142bdffeafc31b5816b8fcc94ad5c..c43a83891eb6b7eae278169736149ad1d89e950e 100644 --- a/paddle/math/Allocator.h +++ b/paddle/math/Allocator.h @@ -27,7 +27,7 @@ namespace paddle { * This is the base class of all Allocator class. 
*/ class Allocator { -public: + public: virtual ~Allocator() {} virtual void* alloc(size_t size) = 0; virtual void free(void* ptr) = 0; @@ -38,7 +38,7 @@ public: * @brief CPU allocator implementation. */ class CpuAllocator : public Allocator { -public: + public: ~CpuAllocator() {} /** @@ -76,7 +76,7 @@ public: * @brief GPU allocator implementation. */ class GpuAllocator : public Allocator { -public: + public: ~GpuAllocator() {} /** @@ -107,7 +107,7 @@ public: * @brief CPU pinned memory allocator implementation. */ class CudaHostAllocator : public Allocator { -public: + public: ~CudaHostAllocator() {} /** diff --git a/paddle/math/BaseMatrix.h b/paddle/math/BaseMatrix.h index 00ce5a19491048f3339d608ac37669816a9ad3f5..1958629aa0354fcc332b1e5677a64c29397e0d26 100644 --- a/paddle/math/BaseMatrix.h +++ b/paddle/math/BaseMatrix.h @@ -43,7 +43,7 @@ typedef bool_constant true_type; address += row * ld + col; class MatrixOffset { -public: + public: size_t aCol_; size_t aRow_; size_t bCol_; @@ -72,14 +72,14 @@ public: template class BaseMatrixT : public TensorExpression, T> { -public: + public: size_t height_, width_; size_t stride_; T* data_; bool trans_; bool useGpu_; -public: + public: virtual ~BaseMatrixT() {} BaseMatrixT(size_t height, size_t width, T* data, bool trans, bool useGpu) : height_(height), diff --git a/paddle/math/CpuSparseMatrix.h b/paddle/math/CpuSparseMatrix.h index 22b6b71688bd555cf8bf8a29088ad01b092d67cf..172792c2950ce56281715cb7f3eb076da252d77e 100644 --- a/paddle/math/CpuSparseMatrix.h +++ b/paddle/math/CpuSparseMatrix.h @@ -22,7 +22,7 @@ limitations under the License. */ namespace paddle { class CpuSparseMatrix : public Matrix { -public: + public: CpuSparseMatrix(size_t height, size_t width, size_t nnz, /* used to allocate space */ @@ -291,10 +291,10 @@ public: LOG(FATAL) << "not supported!"; } -private: + private: MatrixPtr clone(size_t height = 0, size_t width = 0, bool useGpu = false); -protected: + protected: void sparseResize(); /*for csr , record row start position, for csc, record row index for every no * zero value*/ @@ -310,10 +310,10 @@ protected: static ThreadLocal> cpuLocalMats_; // BaseMatrixT interface -public: + public: bool isSparse() const { return true; } -private: + private: using Matrix::mul; using Matrix::copyFrom; using Matrix::rowMax; @@ -329,7 +329,7 @@ private: namespace paddle { class CpuSparseMatrix : public Matrix { -public: + public: CpuSparseMatrix(size_t height, size_t width, size_t nnz, /* used to allocate space */ diff --git a/paddle/math/ExecViaCpu.h b/paddle/math/ExecViaCpu.h index 9b2a3c2b8accd384aac896e86ef8315a744633e1..ec2337545e9e3efdf31d3d786a096a67283715f2 100644 --- a/paddle/math/ExecViaCpu.h +++ b/paddle/math/ExecViaCpu.h @@ -31,17 +31,17 @@ namespace paddle { template class CopyToCpu { -public: + public: explicit CopyToCpu(Arg& arg) : arg_(arg) {} Arg& copiedArg() const { return arg_; } -private: + private: Arg& arg_; }; template <> class CopyToCpu { -public: + public: explicit CopyToCpu(Matrix& arg) : arg_(arg) { if (arg.useGpu()) { CHECK(!arg.isTransposed()) << "Not supported"; @@ -59,14 +59,14 @@ public: } Matrix& copiedArg() const { return copied_ ? *copied_ : arg_; } -private: + private: Matrix& arg_; MatrixPtr copied_; }; template <> class CopyToCpu { -public: + public: explicit CopyToCpu(const Matrix& arg) : arg_(arg) { if (arg.useGpu()) { CHECK(!arg.isTransposed()) << "Not supported"; @@ -79,14 +79,14 @@ public: } const Matrix& copiedArg() const { return copied_ ? 
*copied_ : arg_; } -private: + private: const Matrix& arg_; MatrixPtr copied_; }; template <> class CopyToCpu { -public: + public: explicit CopyToCpu(IVector& arg) : arg_(arg) { if (arg.useGpu()) { copied_ = IVector::create(arg.getSize(), /* useGpu= */ false); @@ -100,14 +100,14 @@ public: } IVector& copiedArg() const { return copied_ ? *copied_ : arg_; } -private: + private: IVector& arg_; IVectorPtr copied_; }; template <> class CopyToCpu { -public: + public: explicit CopyToCpu(const IVector& arg) : arg_(arg) { if (arg.useGpu()) { copied_ = IVector::create(arg.getSize(), /* useGpu= */ false); @@ -116,7 +116,7 @@ public: } const IVector& copiedArg() const { return copied_ ? *copied_ : arg_; } -private: + private: const IVector& arg_; IVectorPtr copied_; }; @@ -128,7 +128,7 @@ class GpuFuncWrapperImp; template class GpuFuncWrapperBase { -public: + public: typedef R ResultType; R operator()(F&& f, Args... args) { return f(CopyToCpu::type>(args) diff --git a/paddle/math/MKLDNNMatrix.h b/paddle/math/MKLDNNMatrix.h index e1fb81679adf4658a58ceee73c8d5da6c0b61050..d4a78f3e54b73add3c00e17f13d91359839d3d14 100644 --- a/paddle/math/MKLDNNMatrix.h +++ b/paddle/math/MKLDNNMatrix.h @@ -35,7 +35,7 @@ typedef std::shared_ptr MKLDNNMatrixPtr; * */ class MKLDNNMatrix : public CpuMatrix, public mkldnn::memory { -public: + public: MKLDNNMatrix(CpuMatrixPtr m, mkldnn::memory::primitive_desc pd) : CpuMatrix(m->getData(), m->getHeight(), m->getWidth(), false), mkldnn::memory(pd, m->getData()), @@ -107,7 +107,7 @@ public: dst.copyFrom(*m_); } -public: + public: /** * Reorder this MKLDNNMatrix from other format. * Support inplace reorder. @@ -226,7 +226,7 @@ public: */ mkldnn::engine getEngine() { return getPrimitiveDesc().get_engine(); } -protected: + protected: /** * Do reorder once. * Can support inplace. @@ -248,7 +248,7 @@ protected: set_data_handle(data); } -private: + private: // save the CpuMatrixPtr in case the buffer released outside CpuMatrixPtr m_; }; diff --git a/paddle/math/MathFunctions.cpp b/paddle/math/MathFunctions.cpp index de404cad89fba8021b8645a40e25c1f5b7e86596..f48119aa511578b21602a225277f01b4c6a9e9a8 100644 --- a/paddle/math/MathFunctions.cpp +++ b/paddle/math/MathFunctions.cpp @@ -12,7 +12,7 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. 
*/ -#include "MathFunctions.h" +#include "paddle/math/MathFunctions.h" #include "hl_matrix_apply.cuh" #include "hl_matrix_ops.cuh" #include "paddle/utils/DynamicLoader.h" @@ -240,6 +240,36 @@ template <> void vAdd(const int n, const double* a, const double* b, double* r) { vdAdd(n, a, b, r); } + +template <> +void vTanh(const int n, const float* a, float* r) { + vsTanh(n, a, r); +} + +template <> +void vTanh(const int n, const double* a, double* r) { + vdTanh(n, a, r); +} + +template <> +void vInvSqrt(const int n, const float* a, float* r) { + vsInvSqrt(n, a, r); +} + +template <> +void vInvSqrt(const int n, const double* a, double* r) { + vdInvSqrt(n, a, r); +} + +template <> +void vLog1p(const int n, const float* a, float* r) { + vsLog1p(n, a, r); +} + +template <> +void vLog1p(const int n, const double* a, double* r) { + vdLog1p(n, a, r); +} #else DEFINE_MATRIX_BINARY_OP(vExp, b = std::exp(a)); @@ -277,17 +307,6 @@ void vAdd(const int n, const T* a, const T* b, T* r) { n); } -template void vExp(const int n, const float* a, float* r); -template void vExp(const int n, const double* a, double* r); -template void vLog(const int n, const float* a, float* r); -template void vLog(const int n, const double* a, double* r); -template void vPow(const int n, const float* a, const float b, float* r); -template void vPow(const int n, const double* a, const double b, double* r); -template void vAdd(const int n, const float* a, const float* b, float* r); -template void vAdd(const int n, const double* a, const double* b, double* r); - -#endif - DEFINE_MATRIX_BINARY_OP(vInvSqrt, b = 1.0f / std::sqrt(a)); template void vInvSqrt(const int n, const T* a, T* r) { @@ -311,11 +330,19 @@ void vTanh(const int n, const T* a, T* r) { binary::vTanh(), const_cast(a), r, 1, n, n, n); } +template void vExp(const int n, const float* a, float* r); +template void vExp(const int n, const double* a, double* r); +template void vLog(const int n, const float* a, float* r); +template void vLog(const int n, const double* a, double* r); +template void vPow(const int n, const float* a, const float b, float* r); +template void vPow(const int n, const double* a, const double b, double* r); +template void vAdd(const int n, const float* a, const float* b, float* r); +template void vAdd(const int n, const double* a, const double* b, double* r); template void vInvSqrt(const int n, const double* a, double* r); template void vInvSqrt(const int n, const float* a, float* r); template void vLog1p(const int n, const float* a, float* r); template void vLog1p(const int n, const double* a, double* r); template void vTanh(const int n, const float* a, float* r); template void vTanh(const int n, const double* a, double* r); - +#endif } // namespace paddle diff --git a/paddle/math/Matrix.h b/paddle/math/Matrix.h index 04e9614eabc47c4c661ace2106e8ca96f45a1d49..4c3b2c95361065372f5969a2da73bce0eb9d123f 100644 --- a/paddle/math/Matrix.h +++ b/paddle/math/Matrix.h @@ -77,7 +77,7 @@ typedef std::shared_ptr CpuSparseMatrixPtr; * instead. 
*/ class Matrix : public BaseMatrix { -protected: + protected: Matrix(MemoryHandlePtr memHandle, size_t height, size_t width, @@ -95,11 +95,11 @@ protected: static ThreadLocal tmpMat_; -public: + public: size_t elementCnt_; // maximal number of elements which can be held in data_ MemoryHandlePtr memoryHandle_; -public: + public: virtual ~Matrix() {} static MatrixPtr create(MemoryHandlePtr memHandle, @@ -412,7 +412,7 @@ public: LOG(FATAL) << "Not implemented"; } -public: + public: /// Only set all variables to 0 or NULL but not free them. virtual void clear() { height_ = 0; @@ -1228,7 +1228,7 @@ inline std::ostream& operator<<(std::ostream& os, const Matrix& mat) { } class GpuMatrix : public Matrix { -public: + public: GpuMatrix(); GpuMatrix(size_t height, size_t width, bool trans = false); @@ -1660,11 +1660,11 @@ public: }; class CpuMatrix : public Matrix { -private: + private: MatrixPtr sftmaxSum_; MatrixPtr sftmaxDot_; -public: + public: CpuMatrix(size_t height, size_t width, bool trans = false); CpuMatrix(real* data, size_t height, size_t width, bool trans = false) : Matrix(data, height, width, trans, false) {} @@ -1892,7 +1892,7 @@ public: real* getRow(size_t row) { return BaseMatrix::rowBuf(row); } virtual real* getRowBuf(size_t row) { return getRow(row); } -public: + public: /// add b to each sample of this. void addBias(Matrix& b, real scale); void addSharedBias(Matrix& b, real scale); @@ -2128,7 +2128,7 @@ public: }; class SharedCpuMatrix : public CpuMatrix { -public: + public: #ifndef PADDLE_MOBILE_INFERENCE /* blockNum is number of partitions of the matrix */ SharedCpuMatrix(int blockNum, size_t height, size_t width, bool trans = false) @@ -2160,12 +2160,12 @@ public: ~SharedCpuMatrix() {} -public: + public: virtual void mul(CpuSparseMatrix* a, CpuMatrix* b, real scaleAB, real scaleT); virtual void add(Matrix& b, real p1, real p2); virtual void add(real p1, real p2); -private: + private: using Matrix::mul; void initShared(int blockNum); void initBlock(int blockNum); diff --git a/paddle/math/MatrixBitCode.cpp b/paddle/math/MatrixBitCode.cpp index 61a9923bc2e6f358738f80de4a30d83c0cc00656..f7a949294b54a5a874e1239a13ca9dce3ba18e94 100644 --- a/paddle/math/MatrixBitCode.cpp +++ b/paddle/math/MatrixBitCode.cpp @@ -27,7 +27,7 @@ struct SimpleCode { inline bool calcBit(int bit) const { return c_ & (1 << bit); } inline int getLength() const { return findLastSet(c_) - 1; } -private: + private: size_t c_; }; @@ -39,7 +39,7 @@ struct SimpleCodeTable { size_t size() const { return numClasses_; } int getMaxCodeLength() const { return findLastSet(numClasses_ - 1); } -private: + private: size_t numClasses_; int maxCodeLength_; }; diff --git a/paddle/math/MemoryHandle.h b/paddle/math/MemoryHandle.h index 03ee413c1218376635c4696ebb774c584aa67aa4..516e09dbed47ac6b039ccb094614c9588eeb3cd5 100644 --- a/paddle/math/MemoryHandle.h +++ b/paddle/math/MemoryHandle.h @@ -20,16 +20,16 @@ limitations under the License. 
*/ namespace paddle { class MemoryHandle { -protected: + protected: explicit MemoryHandle(size_t size); virtual ~MemoryHandle() {} -public: + public: void* getBuf() const { return buf_; } size_t getSize() const { return size_; } size_t getAllocSize() const { return allocSize_; } -protected: + protected: PoolAllocator* allocator_; size_t size_; // the requested size size_t allocSize_; // the allocated size @@ -43,7 +43,7 @@ protected: * The raw handle will be released at destructor */ class GpuMemoryHandle : public MemoryHandle { -public: + public: explicit GpuMemoryHandle(size_t size); virtual ~GpuMemoryHandle(); }; @@ -54,7 +54,7 @@ public: * The raw handle will be released at destructor */ class CpuMemoryHandle : public MemoryHandle { -public: + public: explicit CpuMemoryHandle(size_t size); virtual ~CpuMemoryHandle(); }; diff --git a/paddle/math/PoolAllocator.h b/paddle/math/PoolAllocator.h index 90141fef3fd43fe221874cc50e688f6db9e2dee6..7239cf1c4494e207081e325a7e6067ba26a9c852 100644 --- a/paddle/math/PoolAllocator.h +++ b/paddle/math/PoolAllocator.h @@ -27,7 +27,7 @@ namespace paddle { * @brief Memory pool allocator implementation. */ class PoolAllocator { -public: + public: /** * @brief constructor. * @param allocator a Allocator object. @@ -47,7 +47,7 @@ public: void free(void* ptr, size_t size); std::string getName() { return name_; } -private: + private: void freeAll(); void printAll(); std::unique_ptr allocator_; diff --git a/paddle/math/RowBuffer.h b/paddle/math/RowBuffer.h index 2e4d11a86bf8bd1308b2972f549bc7c201044785..6950afaa21d60615b27c06a151b0afbb296653bf 100644 --- a/paddle/math/RowBuffer.h +++ b/paddle/math/RowBuffer.h @@ -26,7 +26,7 @@ namespace paddle { * If not set memory handler, then the data could be auto growth. */ class RowBuffer { -public: + public: /** * @brief RowBuffer create a auto-growth row buffer. The row length is width. * @param width the length of each row, a.k.a matrix width. @@ -129,7 +129,7 @@ public: */ inline size_t getWidth() const { return width_; } -private: + private: //! TODO(yuyang18): Add resize method to CpuMemHandlePtr, then we can get rid //! of std::vector here. 
CpuMemHandlePtr preallocatedBuf_; diff --git a/paddle/math/SparseMatrix.h b/paddle/math/SparseMatrix.h index 7c525f4edf3d53544c195f8e253c27a03854a793..9181fa29233677d8f4fac503905cc31eb66cb6c1 100644 --- a/paddle/math/SparseMatrix.h +++ b/paddle/math/SparseMatrix.h @@ -25,7 +25,7 @@ namespace paddle { typedef std::shared_ptr<_hl_sparse_matrix_s> hl_sparse_matrix_s_ptr; class GpuSparseMatrix : public Matrix { -public: + public: MemoryHandlePtr sMemoryHandle_; int* rows_; int* cols_; @@ -36,7 +36,7 @@ public: SparseValueType valueType_; SparseFormat format_; -public: + public: GpuSparseMatrix(size_t height, size_t width, size_t nnz, /* used to allocate space */ @@ -73,7 +73,7 @@ public: bool trans, MemoryHandlePtr sMemoryHandle); -protected: + protected: struct Element { int row; int col; @@ -82,7 +82,7 @@ protected: : row(rowIn), col(colIn), val(valIn) {} }; -public: + public: ~GpuSparseMatrix() {} void resize(size_t newHeight, @@ -211,13 +211,13 @@ public: */ void rowMax(IVector& maxIds, Matrix& maxVal); -protected: + protected: void sparseResize(); void copyRow(int offsets, size_t colNum, const sparse_non_value_t* row); void copyRow(int offsets, size_t colNum, const sparse_float_value_t* row); -public: + public: void mul(const Matrix& a, const Matrix& b, real scaleAB, real scaleT); void copyFrom(CpuSparseMatrix& src, hl_stream_t stream); @@ -228,10 +228,10 @@ public: void trimFromCSC(const CpuSparseMatrix& src); // BaseMatrixT interface -public: + public: bool isSparse() const { return true; } -private: + private: using Matrix::mul; using Matrix::copyFrom; using Matrix::rowMax; @@ -248,7 +248,7 @@ private: namespace paddle { class GpuSparseMatrix : public Matrix { -public: + public: GpuSparseMatrix(size_t height, size_t width, size_t nnz, /* used to allocate space */ diff --git a/paddle/math/SparseRowMatrix.h b/paddle/math/SparseRowMatrix.h index 3920de32df7de925d6e22e17b93b15bff8785675..cf6779e8b0b1d6b0c13b21a08ffff5af76e57ba6 100644 --- a/paddle/math/SparseRowMatrix.h +++ b/paddle/math/SparseRowMatrix.h @@ -29,7 +29,7 @@ namespace paddle { * Sparse Row */ class SparseRowCpuMatrix : public CpuMatrix { -public: + public: struct IndexDict { // In the following, global id means the row id in the original matrix. 
// Local id means the row id in the local storage which only contains @@ -53,7 +53,7 @@ public: virtual ~SparseRowCpuMatrix() {} -public: + public: /** * Get the row buf * @@ -163,7 +163,7 @@ public: return indexDictHandle_->localIndices; } -protected: + protected: template void apply(Func f) { f(buf_->data(), localIndices_->size() * width_); @@ -204,7 +204,7 @@ class SyncThreadPool; /// For prefetching parameters from remote Parameter server class SparsePrefetchRowCpuMatrix : public SparseRowCpuMatrix { -public: + public: SparsePrefetchRowCpuMatrix(CpuMemHandlePtr dataHandle, size_t height, size_t width, @@ -229,13 +229,13 @@ public: */ void setupIndices(); -protected: + protected: void addRows(const unsigned int* ids, size_t len); SyncThreadPool* pool_; }; class SparseAutoGrowRowCpuMatrix : public SparseRowCpuMatrix { -public: + public: SparseAutoGrowRowCpuMatrix(size_t height, size_t width, IndexDictPtr indexDictHandle = nullptr, @@ -258,7 +258,7 @@ public: }; class CacheRowCpuMatrix : public SparseAutoGrowRowCpuMatrix { -public: + public: CacheRowCpuMatrix(size_t height, size_t width, IndexDictPtr indexDictHandle = nullptr, @@ -287,7 +287,7 @@ public: virtual void mul(CpuSparseMatrix* a, CpuMatrix* b, real scaleAB, real scaleT); -public: + public: CpuVectorPtr sourceDataVec_; real* sourceData_; }; @@ -299,7 +299,7 @@ public: * ids are hashed by worker thread id. */ class SparseRowIdsCpuMatrix : public CpuMatrix { -public: + public: SparseRowIdsCpuMatrix(CpuMemHandlePtr dataHandle, size_t height, size_t width, @@ -310,7 +310,7 @@ public: std::vector& getIds(size_t threadId) { return idsArray_[threadId]; } -private: + private: std::vector> idsArray_; }; @@ -320,13 +320,13 @@ private: namespace paddle { class SparseRowCpuMatrix : public CpuMatrix { -public: + public: void reserveStore() {} void clearIndices() {} }; class SparsePrefetchRowCpuMatrix : public SparseRowCpuMatrix { -public: + public: void setupIndices() {} void addRows(MatrixPtr input) {} void addRows(IVectorPtr ids) {} diff --git a/paddle/math/Storage.h b/paddle/math/Storage.h index ba8f4689a1e896304aa14821b40fc8ff0c304bb2..61a9aa2a07442d9e4ede80c961e17e079eb8b3ba 100644 --- a/paddle/math/Storage.h +++ b/paddle/math/Storage.h @@ -25,7 +25,7 @@ namespace paddle { * @brief Storage manager for multiple devices. 
*/ class StorageEngine { -public: + public: /** * @return Storage singleton */ @@ -41,7 +41,7 @@ public: */ PoolAllocator* getCpuAllocator(); -protected: + protected: StorageEngine(); ~StorageEngine(); RWLock lock_; diff --git a/paddle/math/TensorApply.h b/paddle/math/TensorApply.h index 7d79cae5a11851b190afbb9ac94efdf2ba2510b7..8b642047bffa33b47dfb8ffc8e3fd2a9b7dbae3a 100644 --- a/paddle/math/TensorApply.h +++ b/paddle/math/TensorApply.h @@ -21,7 +21,7 @@ namespace paddle { */ template class TensorApply { -public: + public: explicit INLINE TensorApply(const Derived& p) : data_(p.data_), stride_(p.stride_), @@ -52,7 +52,7 @@ public: */ template class TensorApply { -public: + public: explicit INLINE TensorApply(const Derived& p) : data_(p.data_), stride_(p.stride_), @@ -77,7 +77,7 @@ public: template class TensorApply, T> { -public: + public: explicit TensorApply(const TensorExpression& expr) : expr_(expr.derived()) {} @@ -97,7 +97,7 @@ public: */ template class TensorApply, T> { -public: + public: explicit INLINE TensorApply(const TensorUnaryOp& expr) : op_(expr.op_), expr_(expr.expr_) {} @@ -118,7 +118,7 @@ public: */ template class TensorApply, T> { -public: + public: explicit INLINE TensorApply( const TensorBinaryOp& expr) : op_(expr.op_), lhs_(expr.lhs_), rhs_(expr.rhs_) { @@ -153,7 +153,7 @@ public: */ template class TensorApply, T> { -public: + public: explicit INLINE TensorApply( const TensorTernaryOp& expr) : expr1_(expr.expr1_), expr2_(expr.expr2_), expr3_(expr.expr3_) { @@ -192,7 +192,7 @@ public: */ template class TensorApply, T> { -public: + public: explicit INLINE TensorApply(const TensorConstant& expr) : op_(expr.op_), expr_(expr.expr_) {} diff --git a/paddle/math/TensorAssign.h b/paddle/math/TensorAssign.h index 113d98c16b22b06971040b1a1ce52c696f6c3c14..7d4726ddba43202970c37dd1a08f842104b24ada 100644 --- a/paddle/math/TensorAssign.h +++ b/paddle/math/TensorAssign.h @@ -25,7 +25,7 @@ namespace paddle { */ template class TensorAssignOp { -public: + public: explicit TensorAssignOp(const LhsType& lhs, const RhsType& rhs) : lhs_(lhs), rhs_(rhs) { #ifndef __CUDA_ARCH__ @@ -49,7 +49,7 @@ public: } INLINE bool useGpu() const { return lhs_.useGpu(); } -private: + private: TensorApply lhs_; TensorApply rhs_; }; diff --git a/paddle/math/TensorExpression.h b/paddle/math/TensorExpression.h index 83229ae65dd1f4ed6b885c3d6195b3758b8ba039..f6da9adfca50e49ca260e20313c8979a38e1b06b 100644 --- a/paddle/math/TensorExpression.h +++ b/paddle/math/TensorExpression.h @@ -40,7 +40,7 @@ class TensorAssignOp; */ template class TensorExpression { -public: + public: /** * Element wise unary expression. 
*/ @@ -355,7 +355,7 @@ public: return TensorAssignOp(derived(), expr); } -protected: + protected: const Derived& derived() const { return *static_cast(this); } }; @@ -365,7 +365,7 @@ protected: template class TensorUnaryOp : public TensorExpression, T> { -public: + public: explicit TensorUnaryOp(const OP op, const ExprType& expr) : op_(op), expr_(expr) {} @@ -379,7 +379,7 @@ public: template class TensorBinaryOp : public TensorExpression, T> { -public: + public: explicit TensorBinaryOp(const OP op, const LhsType& lhs, const RhsType& rhs) : op_(op), lhs_(lhs), rhs_(rhs) {} @@ -395,7 +395,7 @@ template class TensorTernaryOp : public TensorExpression< TensorTernaryOp, T> { -public: + public: explicit TensorTernaryOp(const ExprType1& expr1, const ExprType2& expr2, const ExprType3& expr3) @@ -412,7 +412,7 @@ public: template class TensorConstant : public TensorExpression, T> { -public: + public: explicit TensorConstant(const OP op, const ExprType& expr) : op_(op), expr_(expr) {} diff --git a/paddle/math/Vector.h b/paddle/math/Vector.h index 3efbc769dff5aa1dbc9d5015b0cbac313710d70d..964b42cae52af9b487ab17103bc5e999514e4dd1 100644 --- a/paddle/math/Vector.h +++ b/paddle/math/Vector.h @@ -40,13 +40,13 @@ class Matrix; template class BaseVector : public BaseMatrixT { -public: + public: BaseVector(size_t size, T* data, bool useGpu) : BaseMatrixT(1, size, data, false, useGpu), size_(this->width_) {} ~BaseVector() {} -protected: + protected: size_t& size_; }; @@ -57,7 +57,7 @@ protected: */ template class VectorT : public BaseVector { -protected: + protected: VectorT(size_t size, MemoryHandlePtr memoryHandle, size_t offset, bool useGpu) : BaseVector(size, reinterpret_cast(memoryHandle->getBuf()) + offset, @@ -71,7 +71,7 @@ protected: VectorT(size_t size, T* data, bool useGpu) : BaseVector(size, data, useGpu) {} -public: + public: virtual ~VectorT() {} static std::shared_ptr> create(size_t size, bool useGpu); @@ -281,7 +281,7 @@ public: } } -protected: + protected: friend class GpuVectorT; friend class CpuVectorT; virtual void copyTo(CpuVectorT* dest) const = 0; @@ -297,7 +297,7 @@ std::ostream& operator<<(std::ostream& os, const VectorT& vec) { template class GpuVectorT : public VectorT { -public: + public: explicit GpuVectorT(size_t size); GpuVectorT(size_t size, GpuMemHandlePtr memHandle, size_t offset) : VectorT(size, memHandle, offset, true) {} @@ -343,14 +343,14 @@ public: TensorGpuApply(*this, expr); } -protected: + protected: virtual void copyTo(CpuVectorT* dest) const; virtual void copyTo(GpuVectorT* dest) const; }; template class CpuVectorT : public VectorT { -public: + public: explicit CpuVectorT(size_t size); CpuVectorT(size_t size, MemoryHandlePtr memoryHandle, size_t offset) : VectorT(size, memoryHandle, offset, false) {} @@ -415,7 +415,7 @@ public: template class ParallelCpuVectorT : public CpuVectorT { -public: + public: ParallelCpuVectorT(size_t size, SyncThreadPool* pool) : CpuVectorT(size), pool_(pool) {} @@ -434,7 +434,7 @@ public: virtual void exec(SyncThreadPool::JobFunc jobFunc); -private: + private: typedef std::function& vec)> ExecFunc; void parallelExec(ExecFunc func); SyncThreadPool* pool_; @@ -445,7 +445,7 @@ private: */ template class CpuGpuVectorT { -public: + public: /** * @brief An enum type of SyncedFlag using to * mark data memory is in CPU or GPU. 
@@ -670,7 +670,7 @@ public: setSync(flag); } -protected: + protected: void resizeOrCreate(size_t size, bool useGpu); /** diff --git a/paddle/math/tests/TensorCheck.h b/paddle/math/tests/TensorCheck.h index f4332ede36356bc666612a240448c1be71e5170e..40ac04ef5d4baa0239bb03b04c3a6cce0fcac5a5 100644 --- a/paddle/math/tests/TensorCheck.h +++ b/paddle/math/tests/TensorCheck.h @@ -32,7 +32,7 @@ using paddle::CpuVectorT; using paddle::GpuVectorT; class AssertEqual { -public: + public: AssertEqual(real err = 0) : err_(err) {} inline bool operator()(real a, real b) { @@ -51,7 +51,7 @@ public: return true; } -private: + private: real err_; }; @@ -60,71 +60,71 @@ class CopyToCpu; template <> class CopyToCpu { -public: + public: explicit CopyToCpu(const CpuMatrix& arg) : arg_(arg) {} const CpuMatrix& copiedArg() const { return arg_; } -private: + private: const CpuMatrix& arg_; }; template <> class CopyToCpu { -public: + public: explicit CopyToCpu(const GpuMatrix& arg) : arg_(arg.getHeight(), arg.getWidth()) { arg_.copyFrom(arg); } CpuMatrix& copiedArg() { return arg_; } -private: + private: CpuMatrix arg_; }; template <> class CopyToCpu { -public: + public: explicit CopyToCpu(const Matrix& arg) : arg_(arg.getHeight(), arg.getWidth()) { arg_.copyFrom(arg); } CpuMatrix& copiedArg() { return arg_; } -private: + private: CpuMatrix arg_; }; template class CopyToCpu> { -public: + public: explicit CopyToCpu(const CpuVectorT& arg) : arg_(arg) {} const CpuVectorT& copiedArg() const { return arg_; } -private: + private: const CpuVectorT& arg_; }; template class CopyToCpu> { -public: + public: explicit CopyToCpu(const GpuVectorT& arg) : arg_(arg.getSize()) { arg_.copyFrom(arg); } CpuVectorT& copiedArg() { return arg_; } -private: + private: CpuVectorT arg_; }; template class CopyToCpu> { -public: + public: explicit CopyToCpu(const VectorT& arg) : arg_(arg.getSize()) { arg_.copyFrom(arg); } CpuVectorT& copiedArg() { return arg_; } -private: + private: CpuVectorT arg_; }; diff --git a/paddle/math/tests/TestUtils.h b/paddle/math/tests/TestUtils.h index d2b9706432f84fa082e071eb09d2ffe7402a085f..e1966ec8a74747960420ec80fdfbb957f7cf177f 100644 --- a/paddle/math/tests/TestUtils.h +++ b/paddle/math/tests/TestUtils.h @@ -56,31 +56,31 @@ using paddle::GpuSparseMatrix; template class ReplaceType { -public: + public: typedef T1 type; }; template <> class ReplaceType { -public: + public: typedef CpuMatrix type; }; template <> class ReplaceType { -public: + public: typedef GpuMatrix type; }; template <> class ReplaceType { -public: + public: typedef CpuMatrix type; }; template <> class ReplaceType { -public: + public: typedef GpuMatrix type; }; @@ -180,25 +180,25 @@ R call(C& obj, R (FC::*f)(FArgs...), Args&&... args) { template class ReturnType { -public: + public: typedef T type; }; template <> class ReturnType { -public: + public: typedef GpuMatrix type; }; template <> class ReturnType { -public: + public: typedef GpuIVector type; }; template <> class ReturnType { -public: + public: typedef GpuSparseMatrix type; }; @@ -234,7 +234,7 @@ GpuSparseMatrix autoArgs(CpuSparseMatrix& v) { } class AutoCompare { -public: + public: /** * err is the allowed calculation error. 
* The smaller the value of err, @@ -285,7 +285,7 @@ public: TensorCheck(compare, cpu, gpu); } -protected: + protected: CpuMatrix cpu; GpuMatrix gpu; AssertEqual compare; diff --git a/paddle/math/tests/test_ExecViaCpu.cpp b/paddle/math/tests/test_ExecViaCpu.cpp index 513c7b440e0aa6f20cc8209a3624f32f4892225b..72256cb9d4c93159418d27c7ca0d4f8b9a412a64 100644 --- a/paddle/math/tests/test_ExecViaCpu.cpp +++ b/paddle/math/tests/test_ExecViaCpu.cpp @@ -39,7 +39,7 @@ real f(Matrix& mat1, } class Functor { -public: + public: real operator()(Matrix& mat1, const Matrix& mat2, IVector& vec1, @@ -49,7 +49,7 @@ public: return a_; } -private: + private: real a_; }; diff --git a/paddle/math/tests/test_TrainingAlgorithm.cpp b/paddle/math/tests/test_TrainingAlgorithm.cpp index fb146176ca8eb97a9cdbaf9ebd5c4997a8439718..fb58d26734cab5d7d7bbbbe1cf8a920e4195b4bb 100644 --- a/paddle/math/tests/test_TrainingAlgorithm.cpp +++ b/paddle/math/tests/test_TrainingAlgorithm.cpp @@ -28,14 +28,14 @@ DEFINE_double(max_diff, 1e-13, "max diff allowed"); #endif class SetMaxDiff { -public: + public: explicit SetMaxDiff(double max_diff) { max_diff_ = FLAGS_max_diff; FLAGS_max_diff = max_diff; } ~SetMaxDiff() { FLAGS_max_diff = max_diff_; } -private: + private: double max_diff_; }; diff --git a/paddle/math/tests/test_perturbation.cpp b/paddle/math/tests/test_perturbation.cpp index ef99dab60a874846d04c5ce07d38b2857640ad7b..969400666f12e4c6001f270be3ec144e7e4d0702 100644 --- a/paddle/math/tests/test_perturbation.cpp +++ b/paddle/math/tests/test_perturbation.cpp @@ -32,7 +32,7 @@ const int TGT_SIZE = 21; const int CHANNELS = 3; class PerturbationTest : public testing::Test { -protected: + protected: virtual void SetUp() { generateTestImages(gpuImages_); } virtual void TearDown() {} diff --git a/paddle/optimizer/adadelta_optimizer.h b/paddle/optimizer/adadelta_optimizer.h index 74df9d54be734fedec8aeddff5f50b1d1aefb1d3..5beb62295a83ba4826e9a6b9caf21de78d2e8ced 100644 --- a/paddle/optimizer/adadelta_optimizer.h +++ b/paddle/optimizer/adadelta_optimizer.h @@ -20,7 +20,7 @@ namespace paddle { namespace optimizer { class AdadeltaOptimizer : public ParameterOptimizer { -public: + public: AdadeltaOptimizer( Tensor *parameter, LrPolicy *lr, double rho, double epsilon, double decay) : ParameterOptimizer(parameter, lr), @@ -40,7 +40,7 @@ public: std::string SerializeState(); void DeserializeState(const std::string &state); -private: + private: Tensor *accum_gradient_; Tensor *accum_delta_; Tensor *update_delta_; diff --git a/paddle/optimizer/adagrad_optimizer.h b/paddle/optimizer/adagrad_optimizer.h index 1d58402d78ff9ada8b084a472d46c96580d01e5b..b6fc06739970984cf4bbd27d3e6e1e9066bc350f 100644 --- a/paddle/optimizer/adagrad_optimizer.h +++ b/paddle/optimizer/adagrad_optimizer.h @@ -20,7 +20,7 @@ namespace paddle { namespace optimizer { class AdagradOptimizer : public ParameterOptimizer { -public: + public: AdagradOptimizer(Tensor *parameter, LrPolicy *lr, double epsilon, @@ -36,7 +36,7 @@ public: std::string SerializeState(); void DeserializeState(const std::string &state); -private: + private: Tensor *accum_gradient_; double epsilon_; double decay_; diff --git a/paddle/optimizer/adam_optimizer.h b/paddle/optimizer/adam_optimizer.h index 7977226c8602745d5733021a51fc03d932b0921a..fce10960068364b40592b26a6b439494d75cfa03 100644 --- a/paddle/optimizer/adam_optimizer.h +++ b/paddle/optimizer/adam_optimizer.h @@ -20,7 +20,7 @@ namespace paddle { namespace optimizer { class AdamOptimizer : public ParameterOptimizer { -public: + public: 
AdamOptimizer(Tensor *parameter, LrPolicy *lr, double beta_1, @@ -42,7 +42,7 @@ public: std::string SerializeState(); void DeserializeState(const std::string &state); -private: + private: Tensor *momentums_; Tensor *velocitys_; double beta_1_; diff --git a/paddle/optimizer/lr_policy.h b/paddle/optimizer/lr_policy.h index 14422d1f42fc45d5e9a560c45259d4003a0b3d11..d639c9f22c8ad77267f68e2c3b35257211bf90df 100644 --- a/paddle/optimizer/lr_policy.h +++ b/paddle/optimizer/lr_policy.h @@ -20,7 +20,7 @@ namespace paddle { namespace optimizer { class LrPolicy { -public: + public: virtual ~LrPolicy() {} virtual double LearningRate(const uint64_t num_sample_passed) = 0; virtual std::string SerializeState() = 0; @@ -29,7 +29,7 @@ public: // constant learning rate policy class ConstLr final : public LrPolicy { -public: + public: ConstLr(double lr) : learning_rate_(lr){}; double LearningRate(const uint64_t num_sample_passed) { return learning_rate_; @@ -45,12 +45,12 @@ public: learning_rate_ = state.learning_rate(); } -private: + private: double learning_rate_; }; class LinearLr final : public LrPolicy { -public: + public: LinearLr(double lr, double lr_decay_a, double lr_decay_b) : learning_rate_(lr), lr_decay_a_(lr_decay_a), lr_decay_b_(lr_decay_b) {} double LearningRate(const uint64_t num_sample_passed) { @@ -72,7 +72,7 @@ public: lr_decay_b_ = state.lr_decay_b(); } -private: + private: double learning_rate_; double lr_decay_a_; double lr_decay_b_; diff --git a/paddle/optimizer/parameter_optimizer.h b/paddle/optimizer/parameter_optimizer.h index c7cf8db3ee05c75c171b68bcbcb06a5ae8fa5b48..d5abca82d55c12aed0f4fca0c4c1f21d20586155 100644 --- a/paddle/optimizer/parameter_optimizer.h +++ b/paddle/optimizer/parameter_optimizer.h @@ -26,7 +26,7 @@ namespace paddle { namespace optimizer { class ParameterOptimizer { -public: + public: /** * @brief update hook for algorithm need to traverse parameter more than * once. 
@@ -45,7 +45,7 @@ public: virtual std::string SerializeState() = 0; virtual void DeserializeState(const std::string &state) = 0; -protected: + protected: Tensor *parameter_; // learning rate policy LrPolicy *lr_policy_; diff --git a/paddle/optimizer/parameter_optimizer_test.cc b/paddle/optimizer/parameter_optimizer_test.cc index d663e2fd007febd3b9f0f43d213d63d2b20656b8..1d9572999e9e0f10092eecbc1b41369a89629da7 100644 --- a/paddle/optimizer/parameter_optimizer_test.cc +++ b/paddle/optimizer/parameter_optimizer_test.cc @@ -38,7 +38,7 @@ paddle::optimizer::Tensor* FixedTensor(size_t size) { } class OptimizerTest : public testing::Test { -public: + public: virtual ~OptimizerTest() {} // init paddle::optimizer::Tensor shape const size_t kSize = 5; @@ -115,7 +115,7 @@ public: } } -private: + private: std::vector opts_; paddle::OptimizerConfig config_; }; diff --git a/paddle/optimizer/sgd_optimizer.h b/paddle/optimizer/sgd_optimizer.h index f504d98adb8a01fd69ff313075b4c417222c765e..a8957cde54abd6667143d2a8265d732c849294e3 100644 --- a/paddle/optimizer/sgd_optimizer.h +++ b/paddle/optimizer/sgd_optimizer.h @@ -20,7 +20,7 @@ namespace paddle { namespace optimizer { class SGDOptimizer : public ParameterOptimizer { -public: + public: SGDOptimizer(Tensor* parameter, LrPolicy* lr, double m, double d, bool n) : ParameterOptimizer(parameter, lr), momentums_(nullptr), @@ -39,7 +39,7 @@ public: std::string SerializeState(); void DeserializeState(const std::string& state); -private: + private: Tensor* momentums_; double momentum_; double decay_; diff --git a/paddle/optimizer/tensor.h b/paddle/optimizer/tensor.h index fd32398a237e7e08a198707347cd3c0a4ed77bb3..d2cef99074335be6f9852d60daa103b9b45a550d 100644 --- a/paddle/optimizer/tensor.h +++ b/paddle/optimizer/tensor.h @@ -26,7 +26,7 @@ namespace optimizer { template class TensorT { -public: + public: TensorT(size_t size) : height_(1), width_(size) { // new T[size]() initializes all element to zero value. data_ptr_ = std::shared_ptr(new T[size](), std::default_delete()); @@ -54,7 +54,7 @@ public: // TODO: replace with tensorshape size_t size() const { return this->width_ * this->height_; } -protected: + protected: size_t height_; size_t width_; std::shared_ptr data_ptr_; diff --git a/paddle/parameter/AverageOptimizer.h b/paddle/parameter/AverageOptimizer.h index 4ad3c18d56abf16d1274c5b3b8e0347b85e64dea..f0fe2fd28e4be7df8ebc52fd9b9b5540f3d76949 100644 --- a/paddle/parameter/AverageOptimizer.h +++ b/paddle/parameter/AverageOptimizer.h @@ -21,7 +21,7 @@ namespace paddle { // After Optimization, parameter values are further averaged within // time range. class AverageOptimizer : public ParameterOptimizer { -public: + public: // if *useParameterApply* set, use PARAMETER_APPLY to store averaged parameter // else use PARAMETER_VALUE, and value backup in PARAMETER_GRADIENT AverageOptimizer(const OptimizationConfig& optConfig, @@ -65,7 +65,7 @@ public: virtual void setNoDecay() { optimizer_->setNoDecay(); } -protected: + protected: std::unique_ptr optimizer_; bool useApply_; @@ -98,7 +98,7 @@ protected: // Average Optimizer with Sparse support. 
class AverageSparseOptimizer : public AverageOptimizer { -public: + public: AverageSparseOptimizer(const OptimizationConfig& optConfig, ParameterOptimizer* optimizer, bool useParameterApply) @@ -130,7 +130,7 @@ public: t0Vec_.assign(t0Vec_.size(), 0); } -protected: + protected: /** * counting batches, clear after catch up with * t(timer_) is current time, diff --git a/paddle/parameter/FirstOrderOptimizer.h b/paddle/parameter/FirstOrderOptimizer.h index 047989fcad52afc1d4d4c347258d0fb2f069f3d4..86b9a591aff7a58aafa194c64cb09cd6636d0454 100644 --- a/paddle/parameter/FirstOrderOptimizer.h +++ b/paddle/parameter/FirstOrderOptimizer.h @@ -22,7 +22,7 @@ namespace paddle { // Plain SGD optimization. class SgdOptimizer : public ParameterOptimizer { -public: + public: explicit SgdOptimizer(const OptimizationConfig& optConfig) : ParameterOptimizer(optConfig) { addParameterType(PARAMETER_MOMENTUM); @@ -77,7 +77,7 @@ class SparseMomentumParameterOptimizer : public ParameterOptimizer { \gamma_t: learning rate at the t'th step */ -public: + public: explicit SparseMomentumParameterOptimizer( const OptimizationConfig& optConfig); virtual void init(size_t numRows, const ParameterConfig* config); @@ -89,7 +89,7 @@ public: const ParameterConfig& config) const; virtual void finishBatch(); -private: + private: real alpha_; real beta_; real tau_; @@ -98,7 +98,7 @@ private: real momentum_; real decayRate_; -protected: + protected: int64_t timer_; mutable std::vector t0Vec_; bool isParameterSparse_; @@ -109,7 +109,7 @@ protected: * http://www.magicbroom.info/Papers/DuchiHaSi10.pdf */ class AdagradParameterOptimizer : public ParameterOptimizer { -public: + public: explicit AdagradParameterOptimizer(const OptimizationConfig& optConfig) : ParameterOptimizer(optConfig) { addParameterType(PARAMETER_MOMENTUM); @@ -129,7 +129,7 @@ public: virtual TraverseCallback needSpecialTraversal( const ParameterConfig& config) const; -protected: + protected: int64_t numUpdates_; static const int64_t kMaxNumAccumulates = 16384; }; @@ -139,7 +139,7 @@ protected: * http://www.matthewzeiler.com/pubs/googleTR2012/googleTR2012.pdf */ class AdaDeltaParameterOptimizer : public ParameterOptimizer { -public: + public: explicit AdaDeltaParameterOptimizer(const OptimizationConfig& optConfig) : ParameterOptimizer(optConfig) { addParameterType(PARAMETER_MOMENTUM); @@ -158,14 +158,14 @@ public: const ParameterConfig& config, size_t sparseId) const; -protected: + protected: real rou_; real epsilon_; }; // RMSProp Parameter Optimization. class RMSPropParameterOptimizer : public ParameterOptimizer { -public: + public: explicit RMSPropParameterOptimizer(const OptimizationConfig& optConfig) : ParameterOptimizer(optConfig) { addParameterType(PARAMETER_MOMENTUM); @@ -191,7 +191,7 @@ public: const ParameterConfig& config, size_t sparseId) const; -protected: + protected: real rou_; real epsilon_; @@ -208,7 +208,7 @@ protected: // Decayed AdaGrad Optimization. 
class DecayedAdagradParameterOptimizer : public ParameterOptimizer { -public: + public: explicit DecayedAdagradParameterOptimizer(const OptimizationConfig& optConfig) : ParameterOptimizer(optConfig) { addParameterType(PARAMETER_MOMENTUM); @@ -233,7 +233,7 @@ public: const ParameterConfig& config, size_t sparseId) const; -protected: + protected: real rou_; real epsilon_; @@ -253,7 +253,7 @@ protected: * Reference Paper: http://arxiv.org/abs/1412.6980 Algorithm 1 */ class AdamParameterOptimizer : public ParameterOptimizer { -public: + public: explicit AdamParameterOptimizer(const OptimizationConfig& optConfig) : ParameterOptimizer(optConfig), beta1_(optConfig.adam_beta1()), @@ -275,7 +275,7 @@ public: const ParameterConfig& config, size_t sparseId) const; -protected: + protected: real beta1_; real beta2_; real epsilon_; @@ -288,7 +288,7 @@ protected: * Reference Paper: http://arxiv.org/abs/1412.6980 Algorithm 2 */ class AdamaxParameterOptimizer : public ParameterOptimizer { -public: + public: explicit AdamaxParameterOptimizer(const OptimizationConfig& optConfig) : ParameterOptimizer(optConfig), beta1_(optConfig.adam_beta1()), @@ -305,7 +305,7 @@ public: const ParameterConfig& config, size_t sparseId) const; -protected: + protected: real beta1_; real beta2_; int64_t step_; @@ -315,7 +315,7 @@ protected: // Used in pserver, // when PARAMETER_DELTA stores in PARAMETER_GRADIENT. class AddOptimizer : public ParameterOptimizer { -public: + public: explicit AddOptimizer(const OptimizationConfig& optConfig) : ParameterOptimizer(optConfig) {} @@ -333,7 +333,7 @@ public: // A optimizer which does nothing. class DummyOptimizer : public ParameterOptimizer { -public: + public: explicit DummyOptimizer(const OptimizationConfig& optConfig) : ParameterOptimizer(optConfig) {} @@ -344,7 +344,7 @@ public: // Do gradient clipping before sgd update class OptimizerWithGradientClipping : public ParameterOptimizer { -public: + public: OptimizerWithGradientClipping(const OptimizationConfig& optConfig, ParameterOptimizer* optimizer) : ParameterOptimizer(optConfig), optimizer_(optimizer) { @@ -374,7 +374,7 @@ public: virtual void setNoDecay() { optimizer_->setNoDecay(); } -protected: + protected: std::unique_ptr optimizer_; }; diff --git a/paddle/parameter/LearningRateScheduler.cpp b/paddle/parameter/LearningRateScheduler.cpp index b6b58e3ddad6a0e8811bf56502c3f2f0c8728f5c..d57d2189a45dc8cbcea7a8a5f25c5ec7ac71cca3 100644 --- a/paddle/parameter/LearningRateScheduler.cpp +++ b/paddle/parameter/LearningRateScheduler.cpp @@ -28,20 +28,20 @@ LearningRateScheduler* LearningRateScheduler::create( // LRS stands for LearningRateScheduler class BaseLRS : public LearningRateScheduler { -public: + public: explicit BaseLRS(const OptimizationConfig& config) : learningRate_(config.learning_rate()), a_(config.learning_rate_decay_a()), b_(config.learning_rate_decay_b()) {} -protected: + protected: real learningRate_; real a_; real b_; }; class ConstLRS : public BaseLRS { -public: + public: explicit ConstLRS(const OptimizationConfig& config) : BaseLRS(config) {} virtual real calcLearningRate(int64_t numSamplesProcessed, int64_t pass) { return learningRate_; @@ -50,7 +50,7 @@ public: REGISTER_LEARNING_RATE_SCHEDULER(constant, ConstLRS); class PolyLRS : public BaseLRS { -public: + public: explicit PolyLRS(const OptimizationConfig& config) : BaseLRS(config) {} virtual real calcLearningRate(int64_t numSamplesProcessed, int64_t pass) { return learningRate_ * pow(1.0 + a_ * numSamplesProcessed, -b_); @@ -59,7 +59,7 @@ public: 
REGISTER_LEARNING_RATE_SCHEDULER(poly, PolyLRS); class CaffePolyLRS : public BaseLRS { -public: + public: explicit CaffePolyLRS(const OptimizationConfig& config) : BaseLRS(config) {} virtual real calcLearningRate(int64_t numSamplesProcessed, int64_t pass) { if (numSamplesProcessed > a_) { @@ -78,7 +78,7 @@ public: REGISTER_LEARNING_RATE_SCHEDULER(caffe_poly, CaffePolyLRS); class ExpLRS : public BaseLRS { -public: + public: explicit ExpLRS(const OptimizationConfig& config) : BaseLRS(config) {} virtual real calcLearningRate(int64_t numSamplesProcessed, int64_t pass) { double decayRatio = (double)numSamplesProcessed / b_; @@ -88,7 +88,7 @@ public: REGISTER_LEARNING_RATE_SCHEDULER(exp, ExpLRS); class DiscreteExpLRS : public BaseLRS { -public: + public: explicit DiscreteExpLRS(const OptimizationConfig& config) : BaseLRS(config) {} virtual real calcLearningRate(int64_t numSamplesProcessed, int64_t pass) { int numDecays = floor(numSamplesProcessed / b_); @@ -98,7 +98,7 @@ public: REGISTER_LEARNING_RATE_SCHEDULER(discexp, DiscreteExpLRS); class LinearLRS : public BaseLRS { -public: + public: explicit LinearLRS(const OptimizationConfig& config) : BaseLRS(config) {} virtual real calcLearningRate(int64_t numSamplesProcessed, int64_t pass) { return std::max(learningRate_ - a_ * numSamplesProcessed, b_); @@ -113,7 +113,7 @@ REGISTER_LEARNING_RATE_SCHEDULER(linear, LinearLRS); then learning_rate = learning_rate_base * rate_i */ class ManualLRS : public BaseLRS { -public: + public: explicit ManualLRS(const OptimizationConfig& config) : BaseLRS(config), currentSegment_(0), lastNum_(0) { std::vector pieces; @@ -151,7 +151,7 @@ public: return learningRate_ * rates_.back(); } -protected: + protected: std::vector rates_; std::vector segments_; size_t currentSegment_; @@ -161,7 +161,7 @@ protected: REGISTER_LEARNING_RATE_SCHEDULER(manual, ManualLRS); class PassManualLRS : public ManualLRS { -public: + public: explicit PassManualLRS(const OptimizationConfig& config) : ManualLRS(config) {} virtual real calcLearningRate(int64_t numSamplesProcessed, int64_t pass) { diff --git a/paddle/parameter/LearningRateScheduler.h b/paddle/parameter/LearningRateScheduler.h index aea99a1c204b46e937135cbde22360a12d087ae2..3fad97040248dcf8a22988c38153df31f267ed37 100644 --- a/paddle/parameter/LearningRateScheduler.h +++ b/paddle/parameter/LearningRateScheduler.h @@ -26,7 +26,7 @@ namespace paddle { }) class LearningRateScheduler { -public: + public: static LearningRateScheduler* create(const OptimizationConfig& config); virtual ~LearningRateScheduler() {} virtual real calcLearningRate(int64_t numSamplesProcessed, int64_t pass) = 0; diff --git a/paddle/parameter/OptimizerWithRegularizer.h b/paddle/parameter/OptimizerWithRegularizer.h index 7219d96d924dfa26d3ab52b8c6a2ce1249e4f45c..bd29b3966324b2e206cfe56cc15678539d1e870e 100644 --- a/paddle/parameter/OptimizerWithRegularizer.h +++ b/paddle/parameter/OptimizerWithRegularizer.h @@ -20,7 +20,7 @@ namespace paddle { // add regularizer for objective function to do optimization class OptimizerWithRegularizer : public ParameterOptimizer { -public: + public: static ParameterOptimizer* create(const OptimizationConfig& optConfig, const ParameterConfig& paraConfig, bool isParameterSparse, @@ -67,7 +67,7 @@ public: regularizer_->update(vecs, config, optimizer_->getLearningRate(), 0, 1); } -protected: + protected: std::unique_ptr optimizer_; Regularizer* regularizer_; @@ -84,7 +84,7 @@ protected: // Regularized Loss function for every num of batches class 
OptimizerWithRegularizerEveryNumBatches : public OptimizerWithRegularizer { -public: + public: OptimizerWithRegularizerEveryNumBatches(const OptimizationConfig& optConfig, ParameterOptimizer* optimizer, Regularizer* regularizer) @@ -112,7 +112,7 @@ public: virtual TraverseCallback startCatchUpWith() const; virtual void finishCatchUpWith() { baseTimer_ = timer_; } -protected: + protected: bool isRegularizationBatch(const ParameterConfig& config) const { return ((timer_ + 1) % config.num_batches_regularization() == 0); } @@ -125,7 +125,7 @@ protected: // Regularized Loss function with Sparse support class OptimizerWithRegularizerSparse : public OptimizerWithRegularizer { -public: + public: OptimizerWithRegularizerSparse(const OptimizationConfig& optConfig, ParameterOptimizer* optimizer, Regularizer* regularizer) @@ -145,7 +145,7 @@ public: t0Vec_.assign(t0Vec_.size(), 0); } -protected: + protected: /** * t0Vec_ are last occur time of i rows * if one block is update by multi threads, diff --git a/paddle/parameter/Parameter.h b/paddle/parameter/Parameter.h index 24ac10f3fe5977553332a9a8402d6795577b5ad8..ef519bf35a4f051b4477eb04b5eb2c5f0b5e29e8 100644 --- a/paddle/parameter/Parameter.h +++ b/paddle/parameter/Parameter.h @@ -58,7 +58,7 @@ class Parameter; typedef std::shared_ptr ParameterPtr; class Parameter { -public: + public: Parameter(const ParameterConfig& config, bool useGpu, bool doInit = true); const std::string& getName() const { return config_.name(); } @@ -311,7 +311,7 @@ public: } } -protected: + protected: /** * @brief create matrix to matType. * @@ -326,7 +326,7 @@ protected: void clearUpdate() { updateCounter_ = 0; } -protected: + protected: ParameterConfig config_; bool useGpu_; @@ -363,7 +363,7 @@ protected: std::vector> updaterHooks_; -public: + public: void setSharedCount(int cnt) { sharedCount_ = cnt; } int getSharedCount() { return sharedCount_; } diff --git a/paddle/parameter/ParameterOptimizer.h b/paddle/parameter/ParameterOptimizer.h index a8d0ca72f21d04e0e65a9dd6a07e8f53b23e4223..019afa1358ae255fd096e84e5eb1d7b0b9d6859f 100644 --- a/paddle/parameter/ParameterOptimizer.h +++ b/paddle/parameter/ParameterOptimizer.h @@ -30,12 +30,12 @@ namespace paddle { * may be called many times, should be no state change between calls. */ class ParameterOptimizer { -public: + public: typedef std::function TraverseCallback; -public: + public: explicit ParameterOptimizer(const OptimizationConfig& optConfig) : applyDecay_(true), optConfig_(optConfig), @@ -175,7 +175,7 @@ public: static ParameterOptimizer* create(const OptimizationConfig& optConfig, bool inPserver = false); -protected: + protected: typedef std::vector TraverseCallbackVec; static TraverseCallback composeCallbacks( diff --git a/paddle/parameter/ParameterUpdaterBase.h b/paddle/parameter/ParameterUpdaterBase.h index 717e1c6721b6e4d3ff81172eb06213677c3bff98..493512886cad3ea9b74026d6dfcc4fc90f6aadb9 100644 --- a/paddle/parameter/ParameterUpdaterBase.h +++ b/paddle/parameter/ParameterUpdaterBase.h @@ -21,7 +21,7 @@ namespace paddle { class ParameterOptimizer; class ParameterUpdater { -public: + public: ParameterUpdater() : parameterTypes_{PARAMETER_VALUE, PARAMETER_GRADIENT} {} virtual ~ParameterUpdater() {} @@ -89,7 +89,7 @@ public: virtual void setForwardbackwardTime(uint64_t delta) {} #endif -protected: + protected: virtual void updateImpl(Parameter* para) = 0; std::vector parameterTypes_; @@ -101,7 +101,7 @@ protected: // part of all Parameters. It's useful when we need different // update strategy for different Parameter. 
class ParameterUpdaterComposite : public ParameterUpdater { -public: + public: ParameterUpdaterComposite() {} virtual ~ParameterUpdaterComposite() {} @@ -173,7 +173,7 @@ public: [&](int tid, size_t numThreads) { updaters_[tid]->restore(); }); } -protected: + protected: virtual void updateImpl(Parameter* para) {} std::vector> updaters_; std::unique_ptr syncThreadPool_; diff --git a/paddle/parameter/ParameterUpdaterHook.cpp b/paddle/parameter/ParameterUpdaterHook.cpp index e6aec3c34820764b3515f47f13a432961de1a673..989185b66a5b7785bb0572fba59a72adeef9797b 100644 --- a/paddle/parameter/ParameterUpdaterHook.cpp +++ b/paddle/parameter/ParameterUpdaterHook.cpp @@ -37,7 +37,7 @@ namespace paddle { */ class StaticPruningHook : public IParameterUpdaterHook { -public: + public: explicit StaticPruningHook(const ParameterUpdaterHookConfig &hookConfig) : initCount_(0) { sparsityRatio_ = hookConfig.sparsity_ratio(); @@ -96,7 +96,7 @@ public: paraVec->dotMul(*maskVec_); } -private: + private: SameThreadChecker updateThreadChecker_; std::atomic initCount_; VectorPtr maskVec_; @@ -116,12 +116,12 @@ IParameterUpdaterHook::~IParameterUpdaterHook() {} * May be extracted to Util.h to unify the hasher. */ class StringIntPairHasher { -public: + public: size_t operator()(const std::pair &k) const { return intHasher_(strHasher_(k.first) + k.second); } -private: + private: std::hash strHasher_; std::hash intHasher_; }; diff --git a/paddle/parameter/ParameterUpdaterHook.h b/paddle/parameter/ParameterUpdaterHook.h index d30530ec393c097bf77e5e376e3c4dc84b321ed8..cb96e4cf007572e9688c11719017a9d2771ecd51 100644 --- a/paddle/parameter/ParameterUpdaterHook.h +++ b/paddle/parameter/ParameterUpdaterHook.h @@ -29,7 +29,7 @@ class Parameter; * parameter optimization. */ class IParameterUpdaterHook { -public: + public: virtual ~IParameterUpdaterHook(); /** @@ -53,7 +53,7 @@ public: */ virtual void init(Parameter* para) = 0; -protected: + protected: /** * Ctor. */ diff --git a/paddle/parameter/Regularizer.h b/paddle/parameter/Regularizer.h index 6bed7b0ddfe7b72c697af60f5243f9037999d54a..fa5384e23251b918cc914df36c16ad790a5c59c5 100644 --- a/paddle/parameter/Regularizer.h +++ b/paddle/parameter/Regularizer.h @@ -20,7 +20,7 @@ namespace paddle { // Regularizer function for parameter, e.g. L1/L2 class Regularizer { -public: + public: virtual void update(const VectorPtr vecs[], const ParameterConfig& paraConfig, real learningRate, // learningrate from optimizer diff --git a/paddle/parameter/Weight.h b/paddle/parameter/Weight.h index 7314c29d0db92db06d5b921c09de39d3b0029ef3..113dd6530c82fe1e831ad4a35e9cbcb9880b9243 100644 --- a/paddle/parameter/Weight.h +++ b/paddle/parameter/Weight.h @@ -23,12 +23,12 @@ limitations under the License. */ namespace paddle { class Weight { -private: + private: MatrixPtr weight_; MatrixPtr weightGrad_; ParameterPtr parameter_; -public: + public: Weight(size_t height, size_t width, ParameterPtr parameter); Weight(size_t height, size_t width, ParameterPtr parameter, size_t offset); diff --git a/paddle/parameter/tests/test_common.cpp b/paddle/parameter/tests/test_common.cpp index 6e10becabbbbb8861095fed5aab9ac1e05bcac91..89dcc6c751eb2ec07bfe8297c93d56c824086211 100644 --- a/paddle/parameter/tests/test_common.cpp +++ b/paddle/parameter/tests/test_common.cpp @@ -24,7 +24,7 @@ limitations under the License. 
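StaticPruningHook above keeps a mask vector built from `sparsity_ratio` and applies it with `dotMul`. Below is a rough standalone sketch of magnitude pruning; the threshold selection here is an assumption, and the hook's actual mask construction may differ:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Zero the smallest-magnitude sparsityRatio fraction of weights, i.e.
// multiply by a 0/1 mask as the dotMul call above does.
void applyStaticPruning(std::vector<float>& para, float sparsityRatio) {
  size_t k = (size_t)(para.size() * sparsityRatio);  // entries to prune
  if (k == 0) return;
  std::vector<float> mag(para.size());
  for (size_t i = 0; i < para.size(); ++i) mag[i] = std::fabs(para[i]);
  std::nth_element(mag.begin(), mag.begin() + k - 1, mag.end());
  float threshold = mag[k - 1];  // k-th smallest magnitude
  for (auto& w : para) {
    if (std::fabs(w) <= threshold) w = 0.0f;  // masked entry
  }
}
```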
*/ using namespace paddle; // NOLINT class CommonTest : public ::testing::Test { -protected: + protected: CommonTest() : testStat_("test") {} virtual ~CommonTest() {} virtual void SetUp() { @@ -51,7 +51,7 @@ protected: virtual void TreaDown() { LOG(INFO) << "All Test Finished."; } -protected: + protected: std::vector> valueUint_; std::vector sizeVec_; real learningRate_; diff --git a/paddle/pserver/BaseClient.h b/paddle/pserver/BaseClient.h index a932d34712f56de1cbbf84a9db4476f862febca0..d50230e73a3a7d128cbfd1d70517fddd228fb1bb 100644 --- a/paddle/pserver/BaseClient.h +++ b/paddle/pserver/BaseClient.h @@ -32,7 +32,7 @@ namespace paddle { * connections. */ class BaseClient { -protected: + protected: typedef std::unique_ptr ThreadPtr; typedef std::vector> InputIovs; typedef std::vector SendRequest; @@ -49,7 +49,7 @@ protected: SendDataRequestVec parallelDataRequests; }; -public: + public: explicit BaseClient(bool separate = false, int numPorts = FLAGS_ports_num); virtual ~BaseClient(); @@ -141,7 +141,7 @@ public: return dataType; } -protected: + protected: /// for a > 0, b > 0: /// return the smallest x s.t. b*x >= a static int divup(int a, int b) { return (a + b - 1) / b; } @@ -264,7 +264,7 @@ protected: */ virtual void recv(int threadId) = 0; -protected: + protected: bool stopping_; /// nodes * ports that means the number of real pservers int serviceNum_; diff --git a/paddle/pserver/LightNetwork.h b/paddle/pserver/LightNetwork.h index 2aaa26a5c708f9c01f006136619f599bcfe0db71..bcfc9655e989e80e08e9dce9b8734c0643cbf661 100644 --- a/paddle/pserver/LightNetwork.h +++ b/paddle/pserver/LightNetwork.h @@ -41,7 +41,7 @@ class SocketServer : public Thread { // rdmaCpu controls the cpu affinity of RDMA server daemon, // which could benefit performance. rdmaCpu = -1 means TCP // is used instead of RDMA transport.
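BaseClient::divup above is plain ceiling division: the smallest x with b*x >= a. A quick self-contained check (the main function is only for illustration):

```cpp
#include <cassert>

static int divup(int a, int b) { return (a + b - 1) / b; }  // as in BaseClient

int main() {
  assert(divup(10, 4) == 3);  // 4*3 = 12 >= 10
  assert(divup(12, 4) == 3);  // exact multiples are unchanged
  assert(divup(1, 8) == 1);
  return 0;
}
```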
-public: + public: SocketServer(const std::string& addr, int port, int rdmaCpu); ~SocketServer(); @@ -50,7 +50,7 @@ public: typedef std::function& outputIovs)> ResponseCallback; -protected: + protected: // // The derived class needs to implement this function // to handle the request received by SocketWorker @@ -70,13 +70,13 @@ protected: friend class SocketWorker; -private: + private: void rdmaServer(); void tcpServer(); void detach() {} // detach accept thread is forbidden -protected: + protected: enum ChannelType tcpRdma_; // for rdma int rdmaCpu_; @@ -96,7 +96,7 @@ protected: * @note all parameter processing will run in the context of this worker */ class SocketWorker : public Thread { -public: + public: SocketWorker(std::unique_ptr&& channel, SocketServer* server) : channel_(std::move(channel)), server_(server) {} @@ -104,7 +104,7 @@ public: virtual void run(); -protected: + protected: std::unique_ptr channel_; SocketServer* server_; enum ChannelType tcpRdma_; @@ -118,12 +118,12 @@ protected: * single cpu core for better load balance performance */ class RdmaClientDaemons { -private: + private: RdmaClientDaemons(); static std::unique_ptr daemons_; -public: + public: static RdmaClientDaemons* get() { std::call_once(RdmaClientDaemons::initDataFlag_, &RdmaClientDaemons::getInstance); @@ -141,10 +141,10 @@ public: ~RdmaClientDaemons(); -public: + public: friend class SocketClient; -private: + private: static std::once_flag initDataFlag_; static void getInstance() { if (!daemons_.get()) daemons_.reset(new RdmaClientDaemons()); @@ -162,19 +162,19 @@ private: * read data */ class SocketClient { -public: + public: SocketClient(const std::string& serverAddr, int serverPort, enum ChannelType channelType); SocketChannel* getChannel() { return channel_.get(); } -protected: + protected: std::unique_ptr channel_; struct sxi_socket* socketDaemon_; enum ChannelType tcpRdma_; -private: + private: void RdmaClient(const std::string& serverAddr, int serverPort); void TcpClient(const std::string& serverAddr, int serverPort); }; diff --git a/paddle/pserver/ParameterClient2.h b/paddle/pserver/ParameterClient2.h index d63273ccbc8ed30d9df50d9f8b1a4d1e4fba6720..c96bb787151a525556c8217629109de201762cff 100644 --- a/paddle/pserver/ParameterClient2.h +++ b/paddle/pserver/ParameterClient2.h @@ -50,11 +50,11 @@ struct PServerVector { * @brief A class to help to prepare server-side operations. */ class PreparedOperations { -protected: + protected: class ResultsAdder; struct LocalOperationResult; -public: + public: /** * Offers an easy way to prepare operations that will be performed on * server-side. @@ -93,7 +93,7 @@ public: return ResultsAdder(&localResults_.back()); } -protected: + protected: void addOperationHelper(Operation* op) {} /** @@ -151,7 +151,7 @@ protected: * @brief ResultsAdder offers easy ways to quickly store operation results. */ class ResultsAdder { - public: + public: explicit ResultsAdder(LocalOperationResult* localResult) : localResult_(localResult) {} template @@ -172,11 +172,11 @@ protected: addResult(args...); } - protected: + protected: LocalOperationResult* localResult_; }; -protected: + protected: DoOperationRequest request_; std::vector inputIovs_; struct LocalOperationResult { @@ -214,7 +214,7 @@ struct ParameterSegments { * waiting until all parameters are received to CPU host end. */ class ParameterClient2 : public BaseClient { -public: + public: /** Constructor. * @param separate True if sending and receiving activities are separated * into 2 threads, otherwise false.
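RdmaClientDaemons::get above is the classic call_once lazy singleton; the same shape in isolation (the class name here is illustrative):

```cpp
#include <memory>
#include <mutex>

class Daemons {
 public:
  static Daemons* get() {
    // std::call_once guarantees the initializer runs exactly once,
    // even when several threads race into get().
    std::call_once(initFlag_, [] { instance_.reset(new Daemons()); });
    return instance_.get();
  }

 private:
  Daemons() {}
  static std::once_flag initFlag_;
  static std::unique_ptr<Daemons> instance_;
};

std::once_flag Daemons::initFlag_;
std::unique_ptr<Daemons> Daemons::instance_;
```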
@@ -232,7 +232,7 @@ public: static int calcParameterBlockSize(const std::vector& parameters, size_t serviceNum); -public: + public: bool init(const std::vector& parameters); /// service functions @@ -514,7 +514,7 @@ public: void setForwardbackwardTime(uint64_t delta) { forwardbackwordTime_ = delta; } #endif -protected: + protected: template void multiCall(const char* funcName, const ProtoIn& request, @@ -529,7 +529,7 @@ protected: } } -private: + private: void destroy(); /** @@ -573,7 +573,7 @@ private: /// start necessary threads for threadPool void initThreads(); -protected: + protected: /// start port number of pserver /// it deduces all ports for dense and sparse with some rules int port_; diff --git a/paddle/pserver/ParameterServer2.h b/paddle/pserver/ParameterServer2.h index 3ed06b6b045802bcfd48bcff6bd0c1b34e9bbb86..0b8ef5c170c01ec8a5d53f01db9888f82ca68eec 100644 --- a/paddle/pserver/ParameterServer2.h +++ b/paddle/pserver/ParameterServer2.h @@ -71,7 +71,7 @@ namespace paddle { * to prevent from being polluted. */ class ParameterServer2 : public ProtoServer { -protected: + protected: /// parameter_ mutex. RWLock parameterMutex_; @@ -169,7 +169,7 @@ protected: template class ReadWriteBuffer : public std::vector> { - public: + public: static_assert(sizeof(T) % AlignBytes == 0 || AlignBytes % sizeof(T) == 0, "Type T must be able to aligned."); @@ -229,7 +229,7 @@ protected: return r; } - private: + private: size_t curOffset_; }; @@ -298,17 +298,17 @@ protected: /// barrier performance tuning sync-sgd required std::atomic batchId_; -public: + public: struct Buffer { real* base; size_t size; }; -protected: + protected: /// async gradient commit control bool asyncGrdientCommitCheckAndStat(const SendParameterRequest& request); -public: + public: /// disable default parameter for overloading /// @rdmaCpu:the id of cpu core hosting RDMA server(0-N) /// -1 means using TCP transport instead of RDMA @@ -437,7 +437,7 @@ public: void saveValueVector(const SaveValueRequest& request, ProtoResponseCallback callback); -public: + public: /** * @brief initialize parameter server */ @@ -512,7 +512,7 @@ public: SendParameterResponse* response, std::vector* outputBuffers); -protected: + protected: void mergeSegments(BlockSegments* segments); /// set the unused segments to zero @@ -641,7 +641,7 @@ protected: const VectorPtr vecs[], const ParameterOptimizer::TraverseCallback& callback); -public: + public: typedef void (ParameterServer2::*OperatorFunction)(const Operation& operation, OperationResult* result); diff --git a/paddle/pserver/ParameterServerController.h b/paddle/pserver/ParameterServerController.h index 3a9bc74edf240a12fe1f7bd266f0311555349311..1308d62fb1787f19123fe37d49f8e14039c5a39a 100644 --- a/paddle/pserver/ParameterServerController.h +++ b/paddle/pserver/ParameterServerController.h @@ -28,7 +28,7 @@ namespace paddle { * by gflags or proto. */ class ParameterServerController final { -public: + public: DISABLE_COPY(ParameterServerController); /** @@ -67,7 +67,7 @@ public: */ void wait(); -private: + private: std::vector> parameterServers_; }; diff --git a/paddle/pserver/ProtoServer.h b/paddle/pserver/ProtoServer.h index 3f78799dbfe1d4b80249e8cb27f269e6358903dd..2943867de5885ab1af1aa0f69e93a931092b28e3 100644 --- a/paddle/pserver/ProtoServer.h +++ b/paddle/pserver/ProtoServer.h @@ -34,7 +34,7 @@ namespace paddle { * for single NIC hardware with --port=N(N>1) for small cluster job.
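The static_assert in ReadWriteBuffer above (sizeof(T) and AlignBytes must divide one another) is what makes aligned byte offsets land on whole elements; a sketch of that arithmetic (the helper name is hypothetical):

```cpp
#include <cstddef>

// Round an element offset up to the next AlignBytes boundary. The
// static_assert guarantees the final division is exact: either every T
// already spans whole alignment units, or each unit spans whole Ts.
template <class T, size_t AlignBytes>
size_t nextAlignedElement(size_t elemOffset) {
  static_assert(sizeof(T) % AlignBytes == 0 || AlignBytes % sizeof(T) == 0,
                "Type T must be able to aligned.");
  size_t byteOffset = elemOffset * sizeof(T);
  size_t alignedBytes = (byteOffset + AlignBytes - 1) / AlignBytes * AlignBytes;
  return alignedBytes / sizeof(T);
}
```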
*/ class ProtoServer : public SocketServer { -public: + public: /// rdmaCpu controls the cpu affinity of RDMA server daemon, /// which could benefit performance. rdmaCpu = -1 means TCP /// is used instead of RDMA transport. @@ -87,7 +87,7 @@ public: std::unique_ptr msgReader, ProtoResponseCallbackEx callback)> func); -protected: + protected: /** * @brief handle rpc request * @param[in] msgReader Message reader for reading data from connection @@ -111,7 +111,7 @@ protected: void registerServiceFunctionImp(const std::string& funcName, ServiceFunction func); -protected: + protected: /// Tuning bare network overhead: the beginning of receiving request ThreadLocal handleRequestBegin_; @@ -120,7 +120,7 @@ protected: }; class ProtoClient : public SocketClient { -public: + public: ProtoClient(const std::string& serverAddr, int serverPort, enum ChannelType channelType = F_TCP) diff --git a/paddle/pserver/SocketChannel.h b/paddle/pserver/SocketChannel.h index c0f30d0db760045a8c0cb001fcadaae8f0c03f9d..8b45ac56090ef82e77514566e7df6b366958655e 100644 --- a/paddle/pserver/SocketChannel.h +++ b/paddle/pserver/SocketChannel.h @@ -33,7 +33,7 @@ enum ChannelType { /// reading a set of blocks of data from SocketChannel. class MsgReader { -public: + public: MsgReader(SocketChannel* channel, size_t numIovs); ~MsgReader() { /// ensure all data blocks have been processed @@ -75,7 +75,7 @@ public: void readBlocks(const std::vector& bufs); void readNextBlock(void* buf); -protected: + protected: SocketChannel* channel_; std::vector blockLengths_; size_t currentBlockIndex_; @@ -84,7 +84,7 @@ protected: /// APIs for reading and writing byte stream data or naive iov data /// from the APIs both RDMA and TCP exhibit byte stream style class SocketChannel { -public: + public: SocketChannel(int socket, const std::string& peerName) : tcpSocket_(socket), peerName_(peerName) { tcpRdma_ = F_TCP; @@ -137,7 +137,7 @@ public: /// return null to indicate socket is closed std::unique_ptr readMessage(); -protected: + protected: struct MessageHeader { int64_t totalLength; /// include the header int64_t numIovs; diff --git a/paddle/pserver/SparseParameterDistribution.h b/paddle/pserver/SparseParameterDistribution.h index 13f199548d56262e77e91e45052f3e435dea407c..e168f36c75e9452fff547f139a67a553cc6b796a 100644 --- a/paddle/pserver/SparseParameterDistribution.h +++ b/paddle/pserver/SparseParameterDistribution.h @@ -31,7 +31,7 @@ namespace paddle { * if unbalanced distribution exhibits by default.
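MessageHeader above implies simple length-prefixed framing: a total length (header included) and a block count, followed by per-block lengths and payloads. A hedged sketch that serializes into a flat buffer; the real channel writes iovecs straight to the socket rather than copying:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

std::vector<char> frameMessage(const std::vector<std::vector<char>>& blocks) {
  int64_t total = 2 * sizeof(int64_t);  // header: totalLength + numIovs
  for (const auto& b : blocks) total += sizeof(int64_t) + (int64_t)b.size();
  std::vector<char> out((size_t)total);
  char* p = out.data();
  int64_t numIovs = (int64_t)blocks.size();
  std::memcpy(p, &total, sizeof(total));      p += sizeof(total);
  std::memcpy(p, &numIovs, sizeof(numIovs));  p += sizeof(numIovs);
  for (const auto& b : blocks) {
    int64_t len = (int64_t)b.size();          // each block: length, then data
    std::memcpy(p, &len, sizeof(len));        p += sizeof(len);
    std::memcpy(p, b.data(), b.size());       p += b.size();
  }
  return out;
}
```

A receiver can read the fixed header first, then consume the remaining `total` minus header bytes block by block, which is the bookkeeping MsgReader's blockLengths_ supports.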
*/ class SparseParameterDistribution { -public: + public: /// serviceNum means the number of ParameterServers explicit SparseParameterDistribution(size_t serviceNum); ~SparseParameterDistribution() {} @@ -39,7 +39,7 @@ public: void probeDistribution(int serverId, size_t data); void checkAndResetDistribution(); -private: + private: std::vector data_; std::atomic totBytes_; diff --git a/paddle/pserver/test/SocketTest.cpp b/paddle/pserver/test/SocketTest.cpp index 6019dccaadf7fab5a1db7183c07cbbd9562dab2e..206cd17c379f529579c103893cfb492524bc6f8d 100644 --- a/paddle/pserver/test/SocketTest.cpp +++ b/paddle/pserver/test/SocketTest.cpp @@ -30,12 +30,12 @@ struct MessageHeader { }; class Thread { -public: + public: void start(); virtual void run() = 0; virtual ~Thread() {} -protected: + protected: std::unique_ptr thread_; }; @@ -44,13 +44,13 @@ void Thread::start() { } class SocketChannel { -public: + public: explicit SocketChannel(int socket) : socket_(socket) {} int getSocketFd() const { return socket_; } uint64_t readAll(void* buf, size_t size); uint64_t writeAll(const void* buf, size_t size); -protected: + protected: int socket_; }; @@ -79,7 +79,7 @@ uint64_t SocketChannel::writeAll(const void* buf, size_t size) { } class SocketWorker : public Thread { -public: + public: explicit SocketWorker(int socket) : channel_(socket) {} virtual void run(); @@ -88,19 +88,19 @@ public: // write n bytes -protected: + protected: SocketChannel channel_; std::string buffer_; }; class SocketServer : public Thread { -public: + public: explicit SocketServer(int port) : port_(port), socket_(0), maxPendingConnections_(100) {} virtual void run(); -protected: + protected: int port_; int socket_; int maxPendingConnections_; @@ -161,11 +161,11 @@ void SocketWorker::run() { } class SocketClient { -public: + public: SocketClient(const std::string& serverAddr, int serverPort); SocketChannel* getChannel() const { return channel_.get(); } -protected: + protected: std::unique_ptr channel_; }; diff --git a/paddle/pserver/test/test_ParameterServer2.cpp b/paddle/pserver/test/test_ParameterServer2.cpp index e742cd0871da865e02a60a125a936eea8f15e575..01d179258dffaf996a57022801ee3bd60a268f77 100644 --- a/paddle/pserver/test/test_ParameterServer2.cpp +++ b/paddle/pserver/test/test_ParameterServer2.cpp @@ -26,7 +26,7 @@ DEFINE_string(server_addr, "127.0.0.1", "assign server address"); DEFINE_int32(server_cpu, 0, "assign server cpu"); class ParameterServer2Tester : public ParameterServer2 { -public: + public: ParameterServer2Tester(std::string serverAddr, int port, int rdmaCpu = -1, @@ -88,7 +88,7 @@ public: void waitPassFinishTest(); void synchronizeTest(); -protected: + protected: ParameterClient2 client_; vector clientConfigs_; vector parameters_; diff --git a/paddle/pserver/test/test_ProtoServer.cpp b/paddle/pserver/test/test_ProtoServer.cpp index d68a8d2180cc3081346106132799498f6dc3fa20..a66b14a1cc58d11988e4936a9c35d98b8bf5edc1 100644 --- a/paddle/pserver/test/test_ProtoServer.cpp +++ b/paddle/pserver/test/test_ProtoServer.cpp @@ -28,7 +28,7 @@ DEFINE_bool(benchmark, false, "Do benchmark. 
Skip some tests"); using namespace paddle; // NOLINT class MyServer : public ProtoServer { -public: + public: explicit MyServer(int port, int rdmaCpu = -1) : ProtoServer(FLAGS_server_addr, port, rdmaCpu), status_(PSERVER_STATUS_NOT_SET) { @@ -62,7 +62,7 @@ public: callback(response); } -protected: + protected: PServerStatus status_; std::string buffer_; }; diff --git a/paddle/scripts/paddle_build.sh b/paddle/scripts/paddle_build.sh index 624203132fe929ad7d641328d7a3a81aa9906c42..fd3834ee21d8858016c3039cfea152904ac573e2 100755 --- a/paddle/scripts/paddle_build.sh +++ b/paddle/scripts/paddle_build.sh @@ -105,7 +105,7 @@ function cmake_gen() { -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DWITH_FLUID_ONLY=${WITH_FLUID_ONLY:-OFF} -DCMAKE_EXPORT_COMPILE_COMMANDS=ON - -DWITH_CONTRIB=ON + -DWITH_CONTRIB=${WITH_CONTRIB:-ON} ======================================== EOF # Disable UNITTEST_USE_VIRTUALENV in docker because @@ -132,7 +132,7 @@ EOF -DCMAKE_MODULE_PATH=/opt/rocm/hip/cmake \ -DWITH_FLUID_ONLY=${WITH_FLUID_ONLY:-OFF} \ -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \ - -DWITH_CONTRIB=ON + -DWITH_CONTRIB=${WITH_CONTRIB:-ON} } function abort(){ diff --git a/paddle/trainer/NewRemoteParameterUpdater.h b/paddle/trainer/NewRemoteParameterUpdater.h index 6223ba427c9b94494c2bee8f0847442f1b0574c9..02693c675e6f5cb574e52e9681963a5904676028 100644 --- a/paddle/trainer/NewRemoteParameterUpdater.h +++ b/paddle/trainer/NewRemoteParameterUpdater.h @@ -29,7 +29,7 @@ namespace paddle { * New remote parameter updater for dense parameters that use cclient of go. */ class NewRemoteParameterUpdater : public ParameterUpdater { -public: + public: NewRemoteParameterUpdater(const OptimizationConfig& config, const std::string pserverSpec); NewRemoteParameterUpdater(const OptimizationConfig& config, @@ -61,13 +61,13 @@ public: virtual void startPass(); virtual bool finishPass(); -protected: + protected: /** * work need to do after finishBatch */ virtual void updateImpl(Parameter* para); -private: + private: int parameterSize() { return (int)parameters_.size(); } /** @@ -104,7 +104,7 @@ private: } } -protected: + protected: const OptimizationConfig& trainerConfig_; /// internal parameter client object for exchanging data with pserver paddle_pserver_client parameterClient_; diff --git a/paddle/trainer/ParamUtil.h b/paddle/trainer/ParamUtil.h index 2e05595848760c9abd7d916003656c8103151abf..10746b4d58e3a82c081987a6aaad9e0b42272a03 100644 --- a/paddle/trainer/ParamUtil.h +++ b/paddle/trainer/ParamUtil.h @@ -56,7 +56,7 @@ struct ParameterUtilConfig { * Utility class for loading and saving parameters */ class ParameterUtil { -public: + public: /** * Ctor. * @@ -115,7 +115,7 @@ public: } } -private: + private: std::shared_ptr config_; std::unique_ptr intConfig_; GradientMachinePtr gserver_; diff --git a/paddle/trainer/ParameterUpdater.h b/paddle/trainer/ParameterUpdater.h index 9e9e948b8856d2712f8894b3d14db9c795d5f694..ef7ab92eca77bab2a8481561713f8034d2b8505d 100644 --- a/paddle/trainer/ParameterUpdater.h +++ b/paddle/trainer/ParameterUpdater.h @@ -36,7 +36,7 @@ namespace paddle { * @brief Parameter Updater for SGD, and local(not cluster) run. */ class SgdLocalUpdater : public ParameterUpdater { -public: + public: /** * @brief Ctor. Initialize optimizer locally by optConfig. * @param optConfig optimization config. @@ -131,7 +131,7 @@ public: } } -protected: + protected: /** * @brief update method. Update value from gradient. * @param para parameter that will be updated. 
@@ -159,7 +159,7 @@ protected: * @deprecated */ class SgdCpuUpdater : public SgdLocalUpdater, public Deprecated { -public: + public: explicit SgdCpuUpdater(const OptimizationConfig& optConfig) : SgdLocalUpdater(optConfig), Deprecated( @@ -178,7 +178,7 @@ public: optimizer_->finishBatch(); } -protected: + protected: /** * @brief do nothing. * @param para @@ -192,7 +192,7 @@ protected: * It will do model average in cpu to reduce gpu memory consumption. */ class SgdUpdaterWithCpuAverager : public SgdLocalUpdater { -public: + public: /** * @brief Ctor. * @@ -233,12 +233,12 @@ public: */ virtual void restore(); -protected: + protected: virtual void updateImpl(Parameter* para); void updateFunc(Parameter* para); -protected: + protected: std::unique_ptr averager_; /** diff --git a/paddle/trainer/RemoteParameterUpdater.h b/paddle/trainer/RemoteParameterUpdater.h index 5e82c944751629632ea8d16992bd8f4178a2fbd5..3a40a46354efd6b92278884c8f5b72504a3ff283 100644 --- a/paddle/trainer/RemoteParameterUpdater.h +++ b/paddle/trainer/RemoteParameterUpdater.h @@ -53,7 +53,7 @@ namespace paddle { * backward and communication is not supported. */ class RemoteParameterUpdater : public ParameterUpdater { -public: + public: RemoteParameterUpdater( const OptimizationConfig& config, int expectedPassCount, @@ -101,7 +101,7 @@ public: virtual void apply(); virtual void restore(); -protected: + protected: /** * control all pservers with all trainers for sync-sgd */ @@ -128,7 +128,7 @@ protected: */ void copyParametersFromDevice(ParameterType parameterType); -protected: + protected: /// Optimization config used to guide initialization and finishBatch OptimizationConfig config_; /// internal parameter client object for exchanging data with pserver @@ -178,7 +178,7 @@ protected: * It contains separate send and recv thread for pipeline usage. */ class ConcurrentRemoteParameterUpdater : public RemoteParameterUpdater { -public: + public: ConcurrentRemoteParameterUpdater( OptimizationConfig config, int expectedPassCount, @@ -194,7 +194,7 @@ public: */ virtual void finishBatch(real cost); -protected: + protected: virtual void updateImpl(Parameter* para); /// internal thread called in send thread void send(Parameter* para); // para == NULL indicate end of a minibatch @@ -221,7 +221,7 @@ protected: return (numBatches_ + 1) % config_.num_batches_per_send_parameter() == 0; } -private: + private: /// send thread used for overlapping std::unique_ptr sendThread_; /// recv thread used for overlapping @@ -263,7 +263,7 @@ private: * to encapsulate sparse specified message for all pservers. */ class SparseRemoteParameterUpdater : public ParameterUpdater { -public: + public: SparseRemoteParameterUpdater(const OptimizationConfig& config, int expectedPassCount, bool testing); @@ -303,7 +303,7 @@ public: } #endif -protected: + protected: /// update implementation, not implemented virtual void updateImpl(Parameter* para) {} @@ -313,7 +313,7 @@ protected: /// start controller thread void startController(); -protected: + protected: /// optimization config OptimizationConfig config_; /// internal parameter client @@ -335,7 +335,7 @@ protected: * it directly calls internal dense and sparse updaters individually.
*/ class SparseRemoteParameterUpdaterComposite : public ParameterUpdaterComposite { -public: + public: enum { UPDATER_SPARSE_REMOTE = 0, // execute in sync thread pool(tid:0) UPDATER_NORMAL = 1, // execute in Owner thread(tid:1) @@ -364,7 +364,7 @@ public: }; class ParameterUpdaterCreators { -public: + public: /** * @brief add a creator to create custom ParameterUpdater while training. * The creator is a function with type (algorithm, optConfig, isLocal, @@ -407,7 +407,7 @@ public: return nullptr; } -private: + private: static std::vector> constructors_; diff --git a/paddle/trainer/Tester.h b/paddle/trainer/Tester.h index e892744db278586f2fd5b3cb527aa7c17752c477..801c77e3116369732bf4b03107adce6a71dc2184 100644 --- a/paddle/trainer/Tester.h +++ b/paddle/trainer/Tester.h @@ -38,7 +38,7 @@ namespace paddle { * It is a private class for Trainer. */ class Tester { -public: + public: /** * Ctor * @param config Trainer Config. @@ -87,7 +87,7 @@ public: */ void test(); -protected: + protected: std::shared_ptr testParameterClient_; std::shared_ptr config_; std::unique_ptr intconfig_; @@ -107,7 +107,7 @@ protected: real cost; } testContext_; -private: + private: /** * Test one batch by batchId. It is only used for testOnePass. * diff --git a/paddle/trainer/ThreadParameterUpdater.h b/paddle/trainer/ThreadParameterUpdater.h index bc08a9e9f0eda1cab7776ba76c67e88add1028a9..b5e6a7ce3c8457364b10c921bca3386fbb6f6cbf 100644 --- a/paddle/trainer/ThreadParameterUpdater.h +++ b/paddle/trainer/ThreadParameterUpdater.h @@ -39,7 +39,7 @@ namespace paddle { class. */ class SgdThreadUpdater : public ParameterUpdater { -public: + public: explicit SgdThreadUpdater(const OptimizationConfig& optConfig); virtual ~SgdThreadUpdater() {} @@ -57,7 +57,7 @@ public: virtual void apply(); virtual void restore(); -protected: + protected: // This is the function that will be eventually called by the GradientMachine. // used only for GPU update. virtual void updateImpl(Parameter* para); diff --git a/paddle/trainer/Trainer.h b/paddle/trainer/Trainer.h index fac589d1d711affcd008f90edf87d865c8362f69..78127b7be5cef34f51a4b540852c139625b571dd 100644 --- a/paddle/trainer/Trainer.h +++ b/paddle/trainer/Trainer.h @@ -41,7 +41,7 @@ namespace paddle { * train/test a NeuralNetwork. */ class Trainer { -public: + public: /** * Ctor. * @return @@ -138,7 +138,7 @@ public: */ ParameterUtil* getParameterUtilPtr(); -protected: + protected: /** * Train one pass of data.
* @@ -159,10 +159,10 @@ protected: void createTester(); -private: + private: std::unique_ptr createTesterConfig(); -protected: + protected: std::shared_ptr config_; std::shared_ptr stats_; diff --git a/paddle/trainer/TrainerConfigHelper.h b/paddle/trainer/TrainerConfigHelper.h index f1366cc041b0d983e65a1bf5b02ec2128324c5a8..b21dda964e70fce6e5e9672cc131595ad5af3bbc 100644 --- a/paddle/trainer/TrainerConfigHelper.h +++ b/paddle/trainer/TrainerConfigHelper.h @@ -37,7 +37,7 @@ class DataConfig; * Define a macro to unify 'final' keyword */ class TrainerConfigHelper /*final*/ { -public: + public: DISABLE_COPY(TrainerConfigHelper); /** @@ -193,7 +193,7 @@ public: */ static std::shared_ptr createFromFlagConfig(); -private: + private: static std::string getConfigNameFromPassId(int passId, const std::string& modelPath); diff --git a/paddle/trainer/TrainerInternal.h b/paddle/trainer/TrainerInternal.h index 7018faab24744f7a087a53130acc56ec6314101e..48ee53a5e60f950bfc3cc299c754b0e72601c818 100644 --- a/paddle/trainer/TrainerInternal.h +++ b/paddle/trainer/TrainerInternal.h @@ -34,7 +34,7 @@ namespace paddle { * the core training class for driving training logic */ class TrainerInternal { -public: + public: struct ParaStat { real maxAbsGrad; real avgAbsGrad; @@ -126,7 +126,7 @@ public: UpdateCallback updateCallback, bool doPipelineUpdate); -protected: + protected: std::shared_ptr parameterUpdater_; GradientMachinePtr gradientMachine_; std::shared_ptr config_; diff --git a/paddle/trainer/TrainerInternalConfig.h b/paddle/trainer/TrainerInternalConfig.h index b47692720efc2ed4f2db84f61ca81fcb52d234c0..43aae381029784278ad58c9398f64af24dffa1df 100644 --- a/paddle/trainer/TrainerInternalConfig.h +++ b/paddle/trainer/TrainerInternalConfig.h @@ -37,7 +37,7 @@ namespace paddle { * through one mini-batch. */ class TrainerStats { -public: + public: /** * @brief reset all stats. 
* @@ -147,7 +147,7 @@ public: return os.str(); } -private: + private: int64_t numProcessed_; real totalCost_; real currentCost_; diff --git a/paddle/trainer/tests/picojson.h b/paddle/trainer/tests/picojson.h index eaa8b9baf6e4e753a441ab77811f494cbdab80cf..75349537b1c7f10d23bae788e8414a753c7ccab0 100644 --- a/paddle/trainer/tests/picojson.h +++ b/paddle/trainer/tests/picojson.h @@ -125,7 +125,7 @@ enum { INDENT_WIDTH = 2 }; struct null {}; class value { -public: + public: typedef std::vector array; typedef std::map object; union _storage { @@ -139,11 +139,11 @@ public: object* object_; }; -protected: + protected: int type_; _storage u_; -public: + public: value(); value(int type, bool); explicit value(bool b); @@ -179,7 +179,7 @@ public: void serialize(Iter os, bool prettify = false) const; std::string serialize(bool prettify = false) const; -private: + private: template value(const T*); // intentionally defined to block implicit conversion of // pointer to bool @@ -588,13 +588,13 @@ inline std::string value::_serialize(int indent) const { template class input { -protected: + protected: Iter cur_, end_; int last_ch_; bool ungot_; int line_; -public: + public: input(const Iter& first, const Iter& last) : cur_(first), end_(last), last_ch_(-1), ungot_(false), line_(1) {} int getc() { @@ -873,7 +873,7 @@ inline bool _parse(Context& ctx, input& in) { } class deny_parse_context { -public: + public: bool set_null() { return false; } bool set_bool(bool) { return false; } #ifdef PICOJSON_USE_INT64 @@ -898,10 +898,10 @@ public: }; class default_parse_context { -protected: + protected: value* out_; -public: + public: default_parse_context(value* out) : out_(out) {} bool set_null() { *out_ = value(); @@ -949,18 +949,18 @@ public: return _parse(ctx, in); } -private: + private: default_parse_context(const default_parse_context&); default_parse_context& operator=(const default_parse_context&); }; class null_parse_context { -public: + public: struct dummy_str { void push_back(int) {} }; -public: + public: null_parse_context() {} bool set_null() { return true; } bool set_bool(bool) { return true; } @@ -985,7 +985,7 @@ public: return _parse(*this, in); } -private: + private: null_parse_context(const null_parse_context&); null_parse_context& operator=(const null_parse_context&); }; diff --git a/paddle/trainer/tests/test_TrainerOnePass.cpp b/paddle/trainer/tests/test_TrainerOnePass.cpp index b2a93d4d5eea37ad716b59427f2aa4409d2f537d..de12c4d649c6041f497c0eeac0904ebfc0d5bf97 100644 --- a/paddle/trainer/tests/test_TrainerOnePass.cpp +++ b/paddle/trainer/tests/test_TrainerOnePass.cpp @@ -38,7 +38,7 @@ DECLARE_int32(num_passes); DECLARE_int32(saving_period); class TrainerForTest : public paddle::Trainer { -public: + public: inline const std::shared_ptr& getParameterUpdaterForTest() { return this->trainerInternal_.getParameterUpdater(); } diff --git a/paddle/utils/ClassRegistrar.h b/paddle/utils/ClassRegistrar.h index 1ac27bafabd1945d1d01e3bead22b0dd200d8688..5f40a0b25e92c7adcfe3f8c4be96016be801da3b 100644 --- a/paddle/utils/ClassRegistrar.h +++ b/paddle/utils/ClassRegistrar.h @@ -41,7 +41,7 @@ namespace paddle { */ template class ClassRegistrar { -public: + public: typedef std::function ClassCreator; // Register a class using a creation function. 
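ClassRegistrar, whose second hunk follows, is a name-to-creator factory; its essential shape in isolation (Registry and createByType are illustrative stand-ins for the real template):

```cpp
#include <functional>
#include <map>
#include <string>

template <class BaseClass>
class Registry {
 public:
  typedef std::function<BaseClass*()> Creator;

  // Register a class under a type name taken from the config.
  void registerClass(const std::string& name, Creator creator) {
    creatorMap_[name] = creator;
  }

  // Instantiate by name; nullptr when the name was never registered.
  BaseClass* createByType(const std::string& name) const {
    auto it = creatorMap_.find(name);
    return it == creatorMap_.end() ? nullptr : (it->second)();
  }

 protected:
  std::map<std::string, Creator> creatorMap_;
};
```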
@@ -74,7 +74,7 @@ public: } } -protected: + protected: std::map creatorMap_; }; diff --git a/paddle/utils/CpuId.h b/paddle/utils/CpuId.h index 869be5be541dafd699a87a8e8893aadadf59b711..ed58211d13ac1e0f80d6728950f0b88dc0ae625f 100644 --- a/paddle/utils/CpuId.h +++ b/paddle/utils/CpuId.h @@ -35,7 +35,7 @@ enum simd_t { // clang-format on class SIMDFlags final { -public: + public: DISABLE_COPY(SIMDFlags); SIMDFlags(); @@ -46,7 +46,7 @@ public: return !((simd_flags_ & flags) ^ flags); } -private: + private: int simd_flags_ = SIMD_NONE; }; diff --git a/paddle/utils/CustomStackTrace.h b/paddle/utils/CustomStackTrace.h index 52a6df94979fd3d8d7d540ed0e3898bb3375d975..b60077ea2d946366910780eeb773635972211e04 100644 --- a/paddle/utils/CustomStackTrace.h +++ b/paddle/utils/CustomStackTrace.h @@ -49,7 +49,7 @@ namespace paddle { */ template class CustomStackTrace { -public: + public: /** * @brief Pop out an item from the top of the stack if item == top. * Else, just set status to popping. @@ -136,7 +136,7 @@ public: p.push(item); } -private: + private: /** * Get thread local attribute, and save them into a map (threadId => TYPE*) * @@ -174,7 +174,7 @@ private: return this->getThreadLocal(this->isPushing_, this->pushingBuffers_); } -private: + private: mutable std::mutex mtx_; std::unordered_map*> stackBuffers_; diff --git a/paddle/utils/Error.h b/paddle/utils/Error.h index 7cde98306026ca1de76089749aaea265d151da33..1fc8482e3a1bef869d4df147bbd3cab6e62ccf49 100644 --- a/paddle/utils/Error.h +++ b/paddle/utils/Error.h @@ -95,7 +95,7 @@ namespace paddle { * log(FATAL) and CHECK in Paddle, 'check' method will be removed. */ class Error { -public: + public: /** * Construct a no-error value. */ @@ -138,7 +138,7 @@ public: */ bool isOK() const { return msg_ == nullptr; } -private: + private: std::shared_ptr msg_; }; diff --git a/paddle/utils/GlobalConstants.h b/paddle/utils/GlobalConstants.h index 0ec1c28dfbb2a7db9fa84c9eb2bc4dad806b78e9..3f45e82268435e4c22d1879e909b0c90838d6693 100644 --- a/paddle/utils/GlobalConstants.h +++ b/paddle/utils/GlobalConstants.h @@ -78,7 +78,7 @@ enum ParameterType { using namespace enumeration_wrapper; // NOLINT class TrainAlgorithm { -public: + public: static const std::string SGD; static const std::string AsyncSGD; static const std::string OWLQN; diff --git a/paddle/utils/Locks.h b/paddle/utils/Locks.h index e87abb9139f1c3f250f8b8fe1afdd8883f682647..65f983685f5e178345a6a875a79a6573ce1ccca1 100644 --- a/paddle/utils/Locks.h +++ b/paddle/utils/Locks.h @@ -42,7 +42,7 @@ namespace paddle { * Use unlock() to unlock the lock. */ class RWLock { -public: + public: RWLock() { pthread_rwlock_init(&rwlock_, NULL); } ~RWLock() { pthread_rwlock_destroy(&rwlock_); } RWLock(const RWLock&) = delete; @@ -62,7 +62,7 @@ public: void lock_shared() { pthread_rwlock_rdlock(&rwlock_); } void unlock() { pthread_rwlock_unlock(&rwlock_); } -protected: + protected: pthread_rwlock_t rwlock_; }; @@ -71,7 +71,7 @@ protected: * using RAII management mechanism. */ class ReadLockGuard { -public: + public: /** * @brief Construct Function. Lock on rwlock in read mode. 
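SIMDFlags::isApply above tests feature support with `!((simd_flags_ & flags) ^ flags)`, which is equivalent to `(have & want) == want`; a tiny check with illustrative bit values:

```cpp
#include <cassert>

int main() {
  const int have = 0x6;      // pretend the CPU reports two SIMD feature bits
  const int wantOne = 0x2;
  const int wantBoth = 0x6;
  const int wantMore = 0xA;  // requests a bit the CPU lacks
  assert(!((have & wantOne) ^ wantOne));        // supported
  assert(!((have & wantBoth) ^ wantBoth));      // supported
  assert(((have & wantMore) ^ wantMore) != 0);  // not fully supported
  return 0;
}
```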
*/ @@ -86,7 +86,7 @@ public: */ ~ReadLockGuard() { rwlock_->unlock(); } -protected: + protected: RWLock* rwlock_; }; @@ -98,7 +98,7 @@ protected: */ class SpinLockPrivate; class SpinLock { -public: + public: DISABLE_COPY(SpinLock); SpinLock(); ~SpinLock(); @@ -107,7 +107,7 @@ public: void lock(); void unlock(); -private: + private: SpinLockPrivate* m; }; @@ -116,7 +116,7 @@ private: */ class SemaphorePrivate; class Semaphore { -public: + public: //! Disable copy & assign Semaphore(const Semaphore& other) = delete; Semaphore& operator=(const Semaphore&& other) = delete; @@ -124,7 +124,7 @@ public: //! Enable move. Semaphore(Semaphore&& other) : m(std::move(other.m)) {} -public: + public: /** * @brief Construct Function. * @param[in] initValue the initial value of the @@ -156,7 +156,7 @@ public: */ void post(); -private: + private: SemaphorePrivate* m; }; @@ -166,7 +166,7 @@ private: */ class ThreadBarrierPrivate; class ThreadBarrier { -public: + public: DISABLE_COPY(ThreadBarrier); /** @@ -184,7 +184,7 @@ public: */ void wait(); -private: + private: ThreadBarrierPrivate* m; }; @@ -192,7 +192,7 @@ private: * A wrapper for condition variable with mutex. */ class LockedCondition : public std::condition_variable { -public: + public: /** * @brief execute op and notify one thread which was blocked. * @param[in] op a thread can do something in op before notify. @@ -235,7 +235,7 @@ public: */ std::mutex* mutex() { return &mutex_; } -protected: + protected: std::mutex mutex_; }; diff --git a/paddle/utils/PythonUtil.h b/paddle/utils/PythonUtil.h index daebaffc855518425ae43942c22ec150d2e327f0..6f8d7e09309503e47aca7ae2d20774c748703b21 100644 --- a/paddle/utils/PythonUtil.h +++ b/paddle/utils/PythonUtil.h @@ -55,12 +55,12 @@ std::string callPythonFunc(const std::string& moduleName, * NOTE: the lock of this guard is reentrant or recursive. */ class PyGuard { -public: + public: PyGuard(); PyGuard(const PyGuard& other) = delete; PyGuard& operator=(const PyGuard& other) = delete; -private: + private: std::lock_guard guard_; }; @@ -133,7 +133,7 @@ std::string getPyCallStack(); * Implements getAttr method for object. */ class ObjectHelper { -public: + public: explicit ObjectHelper(const PyObjectPtr& obj) : obj_(obj) {} /** @@ -192,7 +192,7 @@ public: return PyObject_IsTrue(tmp.get()); } -private: + private: const PyObjectPtr& obj_; }; @@ -202,7 +202,7 @@ private: * The python sequence means list or tuple. */ class SequenceHelper { -public: + public: explicit SequenceHelper(const PyObjectPtr& seq) : seq_(seq.get()) { CHECK(PySequence_Check(seq_)); } @@ -248,12 +248,12 @@ public: } } -private: + private: PyObject* seq_; }; class DictHelper { -public: + public: explicit DictHelper(PyObject* d) : dict_(d) {} explicit DictHelper(const PyObjectPtr& d) : dict_(d.get()) {} @@ -275,7 +275,7 @@ public: this->set(key, list); } -private: + private: inline void checkDict() { CHECK(PyDict_Check(this->dict_)); } PyObject* dict_; @@ -289,7 +289,7 @@ inline static bool isCallable(const PyObjectPtr& obj) { * Wrap a callable object. 
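ReadLockGuard above is RAII over RWLock; the same pattern written directly against pthreads, assuming the caller has initialized the rwlock:

```cpp
#include <pthread.h>

// Acquire in shared (read) mode on construction, release on destruction,
// so early returns and exceptions cannot leak the lock.
class ScopedReadLock {
 public:
  explicit ScopedReadLock(pthread_rwlock_t* lock) : lock_(lock) {
    pthread_rwlock_rdlock(lock_);
  }
  ~ScopedReadLock() { pthread_rwlock_unlock(lock_); }
  ScopedReadLock(const ScopedReadLock&) = delete;
  ScopedReadLock& operator=(const ScopedReadLock&) = delete;

 private:
  pthread_rwlock_t* lock_;
};
```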
*/ class CallableHelper { -public: + public: explicit CallableHelper(const PyObjectPtr& obj) : obj_(obj) { CHECK(py::isCallable(obj_)); } @@ -315,7 +315,7 @@ public: return PyObject_Call(obj_.get(), args.get(), kwargs.get()); } -private: + private: const PyObjectPtr& obj_; PyObjectPtr args; PyObjectPtr kwargs; diff --git a/paddle/utils/Queue.h b/paddle/utils/Queue.h index f054738f87c02d2d749eec8d6c7bb55b506a6d91..189e1a14f7b2d133408a50418d96431164248f0e 100644 --- a/paddle/utils/Queue.h +++ b/paddle/utils/Queue.h @@ -56,7 +56,7 @@ namespace paddle { */ template class Queue { -public: + public: /** * @brief Construct Function. Default capacity of Queue is zero. */ @@ -147,7 +147,7 @@ public: }); } -private: + private: std::deque elements_; int numElements_; std::mutex queueLock_; @@ -185,7 +185,7 @@ private: */ template class BlockingQueue { -public: + public: /** * @brief Construct Function. * @param[in] capacity the max number of elements the queue can have. @@ -244,7 +244,7 @@ public: return queue_.empty(); } -private: + private: std::mutex mutex_; std::condition_variable notEmpty_; std::condition_variable notFull_; diff --git a/paddle/utils/Stat.h b/paddle/utils/Stat.h index 79fd3b8cf043e62922dfd046754ee8ac261990c5..100e9eba909466fcca57f755405ab63b638a8ebd 100644 --- a/paddle/utils/Stat.h +++ b/paddle/utils/Stat.h @@ -33,7 +33,7 @@ namespace paddle { class Stat; class StatInfo { -public: + public: explicit StatInfo(Stat* stat = nullptr) : stat_(stat) { total_ = 0; max_ = 0; @@ -61,7 +61,7 @@ class Stat; typedef std::shared_ptr StatPtr; class StatSet { -public: + public: explicit StatSet(const std::string& name) : name_(name) {} ~StatSet() {} @@ -102,7 +102,7 @@ public: // pserver code logic, -_- ). void reset(bool clearRawData = true); -private: + private: std::unordered_map statSet_; const std::string name_; RWLock lock_; @@ -112,7 +112,7 @@ extern StatSet globalStat; /*@brief : a simple stat*/ class Stat { -public: + public: explicit Stat(const std::string& statName) : destructStat_(nullptr), name_(statName), openThreadInfo_(false) {} ~Stat() {} @@ -137,7 +137,7 @@ public: friend class StatInfo; -private: + private: void mergeThreadStat(StatInfo& allThreadStat); std::mutex lock_; @@ -164,7 +164,7 @@ inline uint64_t nowInMicroSec() { * A simple helper class to measure time intervals */ class Timer { -public: + public: explicit Timer(bool autoStart = true) : total_(0), startStamp_(0) { if (autoStart) { start(); @@ -181,13 +181,13 @@ public: void reset() { total_ = 0; } -protected: + protected: uint64_t total_; uint64_t startStamp_; }; class TimerOnce { -public: + public: TimerOnce(Stat* stat, const char* info = "", uint64_t threshold = -1, @@ -208,7 +208,7 @@ public: stat_->addSample(span); } -private: + private: Stat* stat_; const char* info_; Timer timer_; @@ -280,11 +280,11 @@ inline StatSet& registerTimerArg2(uint64_t threshold = -1, #endif // DISABLE_TIMER class GpuProfiler final { -public: + public: GpuProfiler(std::string statName, std::string info); ~GpuProfiler(); -private: + private: std::lock_guard guard_; }; diff --git a/paddle/utils/Thread.h b/paddle/utils/Thread.h index ef36a8c5b2b0e95d759da8a781d781b71d067b7a..2ee6eba1a68202282537788160a77f7689a2ffdb 100644 --- a/paddle/utils/Thread.h +++ b/paddle/utils/Thread.h @@ -29,7 +29,7 @@ namespace paddle { */ class Thread { -public: + public: /** * @brief Construct Function. Default thread pointer is null.
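BlockingQueue in Queue.h pairs one mutex with notEmpty_/notFull_ condition variables; a minimal bounded queue built from the same ingredients (BoundedQueue is an illustrative name, not Paddle's class):

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>

template <class T>
class BoundedQueue {
 public:
  explicit BoundedQueue(size_t capacity) : capacity_(capacity) {}

  void enqueue(T x) {
    std::unique_lock<std::mutex> lock(mutex_);
    notFull_.wait(lock, [this] { return queue_.size() < capacity_; });
    queue_.push_back(std::move(x));
    notEmpty_.notify_one();  // wake one blocked consumer
  }

  T dequeue() {
    std::unique_lock<std::mutex> lock(mutex_);
    notEmpty_.wait(lock, [this] { return !queue_.empty(); });
    T x = std::move(queue_.front());
    queue_.pop_front();
    notFull_.notify_one();  // wake one blocked producer
    return x;
  }

 private:
  size_t capacity_;
  std::deque<T> queue_;
  std::mutex mutex_;
  std::condition_variable notEmpty_, notFull_;
};
```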
*/ @@ -62,7 +62,7 @@ public: */ virtual void run() = 0; -protected: + protected: std::unique_ptr thread_; }; @@ -73,7 +73,7 @@ protected: * Use addJob() to add a new job to the job queue. */ class ThreadWorker : protected Thread { -public: + public: typedef std::function JobFunc; /** @@ -116,7 +116,7 @@ public: finishCV_.wait([this] { return empty_; }); } -protected: + protected: /** * @brief Execute jobs in the job queue sequentially, * @note If finish all the jobs in the job queue, @@ -150,7 +150,7 @@ protected: * JobFunc can use tid to divide input data. */ class SyncThreadPool { -public: + public: typedef std::function JobFunc; /** @@ -236,7 +236,7 @@ public: } } -protected: + protected: /** * @brief Start all the workers in the pool, call their run() function. */ @@ -285,7 +285,7 @@ protected: } } -protected: + protected: pid_t ownerThreadId_; bool stopping_; ThreadBarrier jobStartBarrier_; @@ -323,7 +323,7 @@ protected: */ template class MultiThreadWorker { -public: + public: typedef T ResultType; typedef std::shared_ptr ResultPtrType; typedef std::function JobFunc; @@ -424,7 +424,7 @@ public: */ bool testResult() { return results_.empty(); } -protected: + protected: /** * @brief Do the jobs in the job queue sequentially * and enqueue the result into the result queue. @@ -476,7 +476,7 @@ protected: * thread pool. */ class AsyncThreadPool { -public: + public: typedef std::function JobFunc; AsyncThreadPool() { LOG(FATAL) << "Not implemented"; } @@ -594,7 +594,7 @@ public: } } -protected: + protected: /** * @brief Execute the jobs in the job queue. */ @@ -606,7 +606,7 @@ protected: } } -private: + private: std::vector> workers_; Queue jobs_; bool stopping_; diff --git a/paddle/utils/ThreadLocal.h b/paddle/utils/ThreadLocal.h index 0a27b8b97b83a9066af23039a317c437ea56777a..c5b07506d36875ead65887ea2e221e762be0d621 100644 --- a/paddle/utils/ThreadLocal.h +++ b/paddle/utils/ThreadLocal.h @@ -49,7 +49,7 @@ namespace paddle { */ template class ThreadLocal { -public: + public: ThreadLocal() { CHECK_EQ(pthread_key_create(&threadSpecificKey_, dataDestructor), 0); } @@ -92,7 +92,7 @@ public: */ operator T*() { return get(); } -private: + private: static void dataDestructor(void* p) { delete (T*)p; } pthread_key_t threadSpecificKey_; @@ -111,7 +111,7 @@ private: */ template class ThreadLocalD { -public: + public: ThreadLocalD() { CHECK_EQ(pthread_key_create(&threadSpecificKey_, NULL), 0); } ~ThreadLocalD() { pthread_key_delete(threadSpecificKey_); @@ -150,7 +150,7 @@ public: */ T& operator*() { return *get(); } -private: + private: static void dataDestructor(void* p) { delete (T*)p; } void updateMap(T* p) { @@ -172,7 +172,7 @@ private: * @brief Thread-safe C-style random API. */ class ThreadLocalRand { -public: + public: /** * initSeed just like srand, * called by main thread, @@ -205,7 +205,7 @@ public: */ static int getDefaultSeed() { return defaultSeed_; } -protected: + protected: static unsigned int defaultSeed_; static ThreadLocal seed_; }; @@ -214,7 +214,7 @@ protected: * @brief Thread-safe C++ style random engine. */ class ThreadLocalRandomEngine { -public: + public: /** * get random_engine for each thread.
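ThreadLocal above wraps pthread-specific keys; stripped to its core, the pattern is as follows (SimpleThreadLocal is an illustrative name; the real class adds CHECK macros and a resetting variant):

```cpp
#include <pthread.h>

template <class T>
class SimpleThreadLocal {
 public:
  SimpleThreadLocal() { pthread_key_create(&key_, destroy); }
  ~SimpleThreadLocal() { pthread_key_delete(key_); }

  // Each thread lazily allocates its own T; the key's destructor
  // frees it when that thread exits.
  T* get() {
    T* p = static_cast<T*>(pthread_getspecific(key_));
    if (!p) {
      p = new T();
      pthread_setspecific(key_, p);
    }
    return p;
  }

 private:
  static void destroy(void* p) { delete static_cast<T*>(p); }
  pthread_key_t key_;
};
```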
* @@ -222,7 +222,7 @@ public: */ static std::default_random_engine& get(); -protected: + protected: static ThreadLocal engine_; }; diff --git a/paddle/utils/Util.h b/paddle/utils/Util.h index 9579881ea3b92abab0189631184bab515afb67a3..e6f05e30d308b8b94935897e947350934a5971ee 100644 --- a/paddle/utils/Util.h +++ b/paddle/utils/Util.h @@ -179,7 +179,7 @@ void loadFileList(const std::string& fileListFileName, */ void registerInitFunction(std::function func, int priority = 0); class InitFunction { -public: + public: explicit InitFunction(std::function func, int priority = 0) { registerInitFunction(func, priority); } @@ -191,7 +191,7 @@ public: * When the SetDevice object is destructed, it will restore device environment. */ class SetDevice { -public: + public: explicit SetDevice(int deviceId) { isSet_ = deviceId >= 0; devId_ = 0; @@ -206,7 +206,7 @@ public: } } -protected: + protected: bool isSet_; int devId_; }; @@ -240,7 +240,7 @@ inline void enablePeerAccess(int d1, int d2) { * } */ class AsyncGpuBlock { -public: + public: AsyncGpuBlock() : syncFlag_(hl_get_sync_flag()) { hl_set_sync_flag(false); } ~AsyncGpuBlock() { if (syncFlag_) { @@ -249,7 +249,7 @@ public: } } -private: + private: bool syncFlag_; }; @@ -378,7 +378,7 @@ std::string join(const std::string& part1, * A checker that each invocation of a method happens in the same thread. */ class SameThreadChecker { -public: + public: SameThreadChecker() {} /** @@ -400,7 +400,7 @@ public: << invokeThreadId_ << " current invoked in " << curThreadId; } -private: + private: std::once_flag onceFlag_; std::thread::id invokeThreadId_; }; @@ -421,7 +421,7 @@ private: */ template class WeakKVCache { -public: + public: WeakKVCache() {} std::shared_ptr get(const KType& key, @@ -442,7 +442,7 @@ public: return retVal; } -private: + private: std::mutex lock_; std::unordered_map, Hash> storage_; }; @@ -453,7 +453,7 @@ private: */ template class ScopedCallbacks { -public: + public: ScopedCallbacks(CallbackType enter, CallbackType exit, Args&... args) : exit_(std::bind(exit, args...)) { enter(args...); @@ -464,7 +464,7 @@ public: ~ScopedCallbacks() { exit_(); } -private: + private: std::function exit_; }; @@ -475,7 +475,7 @@ private: */ template class AlignedAllocator { -public: + public: /// std compatible typedefs. typedef T* pointer; typedef const T* const_pointer; @@ -552,12 +552,12 @@ public: return this->allocate(n); } -private: + private: AlignedAllocator& operator=(const AlignedAllocator&); // disable }; class Deprecated { -public: + public: explicit Deprecated(const std::string& msg = "") { if (msg.empty()) { LOG(WARNING) << "This class is deprecated, please do not use this class."; diff --git a/paddle/utils/arch/linux/Locks.cpp b/paddle/utils/arch/linux/Locks.cpp index a4e6c8f7b8397adc262588612c250bac5ef5eaa6..409af8bce3621c51bfd7a69c6b4ec1f9cc6be8e4 100644 --- a/paddle/utils/arch/linux/Locks.cpp +++ b/paddle/utils/arch/linux/Locks.cpp @@ -19,7 +19,7 @@ limitations under the License.
*/ namespace paddle { class SemaphorePrivate { -public: + public: sem_t sem; }; @@ -45,7 +45,7 @@ void Semaphore::post() { sem_post(&m->sem); } #ifdef PADDLE_USE_PTHREAD_SPINLOCK class SpinLockPrivate { -public: + public: inline SpinLockPrivate() { pthread_spin_init(&lock_, 0); } inline ~SpinLockPrivate() { pthread_spin_destroy(&lock_); } @@ -63,7 +63,7 @@ public: // clang-format on class SpinLockPrivate { -public: + public: inline void lock() { while (lock_.test_and_set(std::memory_order_acquire)) { } @@ -86,7 +86,7 @@ void SpinLock::unlock() { m->unlock(); } #ifdef PADDLE_USE_PTHREAD_BARRIER class ThreadBarrierPrivate { -public: + public: pthread_barrier_t barrier_; inline explicit ThreadBarrierPrivate(int count) { @@ -101,7 +101,7 @@ public: #else class ThreadBarrierPrivate { -public: + public: pthread_mutex_t mutex_; pthread_cond_t cond_; int count_; diff --git a/paddle/utils/arch/osx/Locks.cpp b/paddle/utils/arch/osx/Locks.cpp index e03992363fd6051a1970664d63406b2e7a47fce3..f3905091bd024ab02c3f5d39cfed6dbc38fabbbc 100644 --- a/paddle/utils/arch/osx/Locks.cpp +++ b/paddle/utils/arch/osx/Locks.cpp @@ -21,7 +21,7 @@ limitations under the License. */ namespace paddle { class SemaphorePrivate { -public: + public: ~SemaphorePrivate() { dispatch_release(sem); } dispatch_semaphore_t sem; @@ -45,7 +45,7 @@ void Semaphore::wait() { void Semaphore::post() { dispatch_semaphore_signal(m->sem); } class SpinLockPrivate { -public: + public: std::atomic_flag lock_ = ATOMIC_FLAG_INIT; char padding_[64 - sizeof(lock_)]; // Padding to cache line size }; @@ -61,7 +61,7 @@ void SpinLock::lock() { void SpinLock::unlock() { m->lock_.clear(std::memory_order_release); } class ThreadBarrierPrivate { -public: + public: pthread_mutex_t mutex_; pthread_cond_t cond_; int count_; diff --git a/python/paddle/fluid/framework.py b/python/paddle/fluid/framework.py index 08b756d95b9b72db5d978afbe437bbfcb52025b0..33b5caa0eab0ec192eb4a3b63cf82a672c58d2cb 100644 --- a/python/paddle/fluid/framework.py +++ b/python/paddle/fluid/framework.py @@ -797,7 +797,7 @@ class Block(object): Rename variable in vars and ops' inputs and outputs """ if not self.has_var(name): - raise ValueError("var %s is not in current" % name) + raise ValueError("var %s is not in current block" % name) v = self.var(name) if type(v) == Parameter: var_type = "Parameter" @@ -843,6 +843,7 @@ class Block(object): self.vars[new_name] = var del self.vars[name] self.sync_with_cpp() + return var def remove_var(self, name): self.sync_with_cpp() diff --git a/python/paddle/fluid/layers/io.py b/python/paddle/fluid/layers/io.py index 03d4602f7a99dc335260cffdcdc30a839f3988cd..8758ac9f94ab91b5be5fc70917c64db38997d1c1 100644 --- a/python/paddle/fluid/layers/io.py +++ b/python/paddle/fluid/layers/io.py @@ -195,21 +195,23 @@ def Send(endpoints, send_vars, get_vars=None): endpoints = list(set(epmap)) helper = LayerHelper("Send", **locals()) - rpc_client_var = default_main_program().global_block().create_var( - name="RPC_CLIENT_VAR", persistable=True, type=core.VarDesc.VarType.RAW) if not get_vars: get_vars = [] for s in send_vars: v = helper.create_tmp_variable(dtype=s.dtype, stop_gradient=True) get_vars.append(v) + rpc_op_role_name = core.op_proto_and_checker_maker.kOpRoleAttrName() helper.append_op( type="send", inputs={"X": send_vars}, - outputs={"Out": get_vars, - "RPCClient": rpc_client_var}, - attrs={"endpoints": endpoints, - "epmap": epmap}) + outputs={"Out": get_vars}, + attrs={ + "endpoints": endpoints, + "epmap": epmap, + rpc_op_role_name: 
core.op_proto_and_checker_maker.OpRole.RPC + }) + return get_vars diff --git a/python/paddle/fluid/layers/nn.py b/python/paddle/fluid/layers/nn.py index b6c47aa9a65b9145983513715233784d77e3d904..21d74deab70182b52ccf60537d85d2359cc0ceb7 100644 --- a/python/paddle/fluid/layers/nn.py +++ b/python/paddle/fluid/layers/nn.py @@ -1855,6 +1855,7 @@ def conv2d_transpose(input, 'strides': stride, 'paddings': padding, 'dilations': dilation, + 'groups': groups, 'use_cudnn': use_cudnn }) diff --git a/python/paddle/fluid/tests/unittests/CMakeLists.txt b/python/paddle/fluid/tests/unittests/CMakeLists.txt index eed1412ba4f2b8f2209c0573359bea1e4b20d8d5..fead95ffdab25c7ea96b7ef223efc0abf7eea3e3 100644 --- a/python/paddle/fluid/tests/unittests/CMakeLists.txt +++ b/python/paddle/fluid/tests/unittests/CMakeLists.txt @@ -48,3 +48,5 @@ foreach(TEST_OP ${TEST_OPS}) endforeach(TEST_OP) py_test_modules(test_warpctc_op MODULES test_warpctc_op ENVS FLAGS_warpctc_dir=${WARPCTC_LIB_DIR} SERIAL) py_test_modules(test_dist_train MODULES test_dist_train SERIAL) +# tests that need to be done in fixed timeout +set_tests_properties(test_listen_and_serv_op PROPERTIES TIMEOUT 20) diff --git a/python/paddle/fluid/tests/unittests/test_dist_transpiler.py b/python/paddle/fluid/tests/unittests/test_dist_transpiler.py index 10f8c4f3f0167632bb4a3d454ab026ba73a8f305..fa49bd41a5876847d046682dce5c3d3868a18500 100644 --- a/python/paddle/fluid/tests/unittests/test_dist_transpiler.py +++ b/python/paddle/fluid/tests/unittests/test_dist_transpiler.py @@ -49,7 +49,6 @@ class TestDistTranspiler(unittest.TestCase): def test_transpiler(self): trainer = self.get_trainer() pserver, startup = self.get_pserver(self.current_pserver_ep) - self.assertEqual([op.type for op in trainer.global_block().ops], self.get_expect_trainer_ops()) @@ -67,7 +66,7 @@ class TestDistTranspiler(unittest.TestCase): "fill_constant", "fill_constant", "uniform_random", "uniform_random" ]) - # the variable #fc_w will be split into two blocks + # the variable #fc_w will be split into two blocks fc_w_var = startup.global_block().var("fc_w.block1") self.assertEqual(fc_w_var.shape, (500, 1000)) @@ -86,8 +85,12 @@ class TestDistTranspiler(unittest.TestCase): optimize_ops, params_grads = self.net_conf() delete_ops(trainer.global_block(), optimize_ops) - return [op.type for op in trainer.global_block().ops - ] + ["split_byref", "send", "concat"] + ops = [op.type for op in trainer.global_block().ops] + [ + "split_byref", "send_vars", "send_barrier", "recv", "recv", + "fetch_barrier", "concat" + ] + ops.insert(ops.index("elementwise_add_grad") + 1, "send_vars") + return ops def get_trainer(self): return self._transpiler_instance().get_trainer_program() diff --git a/python/paddle/fluid/tests/unittests/test_listen_and_serv_op.py b/python/paddle/fluid/tests/unittests/test_listen_and_serv_op.py new file mode 100644 index 0000000000000000000000000000000000000000..cf89f9d0ebf6200933e539ef7fa8cbdc8f6db058 --- /dev/null +++ b/python/paddle/fluid/tests/unittests/test_listen_and_serv_op.py @@ -0,0 +1,109 @@ +# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
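The `conv2d_transpose` change above is worth a second look: the Python layer already accepted a `groups` argument, but it was never forwarded into the op's attribute map, so grouped deconvolution silently ran as if `groups` were unset. A minimal sketch of a call that now behaves as documented (shapes and names are illustrative, not taken from this patch):

```python
import paddle.fluid as fluid

data = fluid.layers.data(name='data', shape=[8, 32, 32], dtype='float32')
# With the fix, groups=2 reaches the underlying conv2d_transpose op as an
# attribute instead of being dropped on the floor.
up = fluid.layers.conv2d_transpose(
    input=data, num_filters=8, filter_size=4, stride=2, groups=2)
```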
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import paddle +import paddle.fluid as fluid +import os +import signal +import subprocess +import time +import unittest +from multiprocessing import Process +from op_test import OpTest + + +def run_pserver(use_cuda, sync_mode, ip, port, trainer_count, trainer_id): + x = fluid.layers.data(name='x', shape=[1], dtype='float32') + y_predict = fluid.layers.fc(input=x, size=1, act=None) + y = fluid.layers.data(name='y', shape=[1], dtype='float32') + + # loss function + cost = fluid.layers.square_error_cost(input=y_predict, label=y) + avg_cost = fluid.layers.mean(cost) + + # optimizer + sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001) + sgd_optimizer.minimize(avg_cost) + + place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace() + exe = fluid.Executor(place) + + port = os.getenv("PADDLE_INIT_PORT", port) + pserver_ips = os.getenv("PADDLE_INIT_PSERVERS", ip) # ip,ip... + eplist = [] + for ip in pserver_ips.split(","): + eplist.append(':'.join([ip, port])) + pserver_endpoints = ",".join(eplist) # ip:port,ip:port... + trainers = int(os.getenv("TRAINERS", trainer_count)) + current_endpoint = os.getenv("POD_IP", ip) + ":" + port + trainer_id = int(os.getenv("PADDLE_INIT_TRAINER_ID", trainer_id)) + t = fluid.DistributeTranspiler() + t.transpile( + trainer_id, + pservers=pserver_endpoints, + trainers=trainers, + sync_mode=sync_mode) + pserver_prog = t.get_pserver_program(current_endpoint) + pserver_startup = t.get_startup_program(current_endpoint, pserver_prog) + exe.run(pserver_startup) + exe.run(pserver_prog) + + +class TestListenAndServOp(OpTest): + def setUp(self): + self.sleep_time = 5 + self.ip = "127.0.0.1" + self.port = "6173" + self.trainer_count = 1 + self.trainer_id = 1 + + def _raise_signal(self, parent_pid, raised_signal): + time.sleep(self.sleep_time) + ps_command = subprocess.Popen( + "ps -o pid --ppid %d --noheaders" % parent_pid, + shell=True, + stdout=subprocess.PIPE) + ps_output = ps_command.stdout.read() + retcode = ps_command.wait() + assert retcode == 0, "ps command returned %d" % retcode + + for pid_str in ps_output.split("\n")[:-1]: + try: + os.kill(int(pid_str), raised_signal) + except Exception: + continue + + def _start_pserver(self, use_cuda, sync_mode): + p = Process( + target=run_pserver, + args=(use_cuda, sync_mode, self.ip, self.port, self.trainer_count, + self.trainer_id)) + p.start() + + def test_handle_signal_in_serv_op(self): + # run pserver on CPU in sync mode + self._start_pserver(False, True) + + # raise SIGINT to pserver + self._raise_signal(os.getpid(), signal.SIGINT) + + # run pserver on CPU in async mode + self._start_pserver(False, False) + + # raise SIGTERM to pserver + self._raise_signal(os.getpid(), signal.SIGTERM) + + +if __name__ == '__main__': + unittest.main() diff --git a/python/paddle/fluid/transpiler/__init__.py b/python/paddle/fluid/transpiler/__init__.py index 413c36c5c41bbe0169f1c050ccdac040202d66df..045ca537b2e84c02298d6375a7ef5bdbb5517380 100644 --- a/python/paddle/fluid/transpiler/__init__.py +++ b/python/paddle/fluid/transpiler/__init__.py @@ -16,8 +16,9 @@ from distribute_transpiler import DistributeTranspiler from 
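The new `test_listen_and_serv_op.py` above relies on one trick: `exe.run(pserver_prog)` blocks forever, so the pserver is launched in a child process and shut down by signalling it, and the test passes if the serving op exits cleanly on `SIGINT`/`SIGTERM`. The essential pattern, reduced to a self-contained sketch (the busy loop stands in for the blocking executor call):

```python
import os
import signal
import time
from multiprocessing import Process


def serve():
    # Stand-in for exe.run(pserver_prog), which blocks until signalled.
    while True:
        time.sleep(1)


if __name__ == '__main__':
    p = Process(target=serve)
    p.start()
    time.sleep(1)                   # give the "server" time to come up
    os.kill(p.pid, signal.SIGINT)   # what _raise_signal does via ps --ppid
    p.join(timeout=5)
    assert not p.is_alive(), "server did not exit on SIGINT"
```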
inference_transpiler import InferenceTranspiler from memory_optimization_transpiler import memory_optimize, release_memory from distribute_transpiler_simple import SimpleDistributeTranspiler +from ps_dispatcher import HashName, RoundRobin __all__ = [ "DistributeTranspiler", "InferenceTranspiler", "SimpleDistributeTranspiler", - "memory_optimize", "release_memory" + "memory_optimize", "release_memory", "HashName", "RoundRobin" ] diff --git a/python/paddle/fluid/transpiler/distribute_transpiler.py b/python/paddle/fluid/transpiler/distribute_transpiler.py index 42ff0a9eb1112ed5709749e3867794c80be8f1d1..e9b7d9e9d2dea54a33068d5c3fe3fbf22620d1ea 100644 --- a/python/paddle/fluid/transpiler/distribute_transpiler.py +++ b/python/paddle/fluid/transpiler/distribute_transpiler.py @@ -16,7 +16,7 @@ from __future__ import print_function import math -import distributed_splitter as splitter +from ps_dispatcher import RoundRobin, HashName, PSDispatcher from .. import core, framework from ..framework import Program, default_main_program, \ default_startup_program, \ @@ -24,7 +24,9 @@ from ..framework import Program, default_main_program, \ LOOKUP_TABLE_TYPE = "lookup_table" LOOKUP_TABLE_GRAD_TYPE = "lookup_table_grad" -RPC_CLIENT_VAR_NAME = "RPC_CLIENT_VAR" +RPC_OP_ROLE_ATTR_NAME = op_role_attr_name = core.op_proto_and_checker_maker.kOpRoleAttrName( +) +RPC_OP_ROLE_ATTR_VALUE = core.op_proto_and_checker_maker.OpRole.RPC class VarBlock: @@ -149,13 +151,27 @@ def delete_ops(block, ops): block.program.sync_with_cpp() +def find_op_by_input_arg(block, arg_name): + for index, op in enumerate(block.ops): + if arg_name in op.input_arg_names: + return index + return -1 + + +def find_op_by_output_arg(block, arg_name): + for index, op in enumerate(block.ops): + if arg_name in op.output_arg_names: + return index + return -1 + + class DistributeTranspiler: def transpile(self, trainer_id, program=None, pservers="127.0.0.1:6174", trainers=1, - split_method=splitter.round_robin, + split_method=RoundRobin, sync_mode=True): """ Transpile the program to distributed data-parallelism programs. @@ -196,7 +212,7 @@ class DistributeTranspiler: :param sync_mode: if sync_mode is set True, it means that dist transpiler will transpile the program into sync_mode pserver and trainer program. """ - assert (callable(split_method)) + assert (split_method.__bases__[0] == PSDispatcher) if program is None: program = default_main_program() self.origin_program = program @@ -209,6 +225,7 @@ class DistributeTranspiler: pserver_endpoints = pservers.split(",") self.pserver_endpoints = pserver_endpoints self.optimize_ops, params_grads = self._get_optimize_pass() + ps_dispatcher = split_method(pserver_endpoints) # process lookup_table_op # 1. 
check all lookup_table_op is distributed @@ -256,66 +273,132 @@ class DistributeTranspiler: if param_grad[0].name == self.table_name ][0] table_grad_var = self.table_param_grad[1] - self.table_grad_list = [ - program.global_block().create_var( - name="%s.trainer_%d.pserver_%d" % - (table_grad_var.name, trainer_id, index), - type=table_grad_var.type, - shape=table_grad_var.shape, - dtype=table_grad_var.dtype) - for index in range(len(self.pserver_endpoints)) - ] + if self.sync_mode: + self.trainer_side_table_grad_list = [ + program.global_block().create_var( + name="%s.trainer_%d.pserver_%d" % + (table_grad_var.name, trainer_id, index), + type=table_grad_var.type, + shape=table_grad_var.shape, + dtype=table_grad_var.dtype) + for index in range(len(self.pserver_endpoints)) + ] + else: + self.trainer_side_table_grad_list = [ + program.global_block().create_var( + name="%s.pserver_%d" % (table_grad_var.name, index), + type=table_grad_var.type, + shape=table_grad_var.shape, + dtype=table_grad_var.dtype) + for index in range(len(self.pserver_endpoints)) + ] grad_blocks = split_dense_variable(grad_list, len(pserver_endpoints)) param_blocks = split_dense_variable(param_list, len(pserver_endpoints)) + assert (len(grad_blocks) == len(param_blocks)) # step2: Create new vars for the parameters and gradients blocks and # add ops to do the split. - grad_var_mapping = self._append_split_op(program, grad_blocks) param_var_mapping = self._create_vars_from_blocklist(program, param_blocks) + grad_var_mapping = self._create_vars_from_blocklist( + program, grad_blocks, add_trainer_suffix=self.trainer_num > 1) + grad_param_mapping = dict() + for g, p in zip(grad_blocks, param_blocks): + g_name, g_bid, _ = g.split(":") + p_name, p_bid, _ = p.split(":") + grad_param_mapping[grad_var_mapping[g_name][int(g_bid)]] = \ + param_var_mapping[p_name][int(p_bid)] + + # step 3: transpile trainer side program, insert recv op and send op. - # step3: Add gradients as send op inputs and parameters as send - # op outputs. - send_inputs = [] - send_outputs = [] - for b in grad_blocks: # append by order - varname, block_id, _ = b.split(":") - send_inputs.append(grad_var_mapping[varname][int(block_id)]) - - for b in param_blocks: - varname, block_id, _ = b.split(":") - send_outputs.append(param_var_mapping[varname][int(block_id)]) - - # let send_op know which endpoint to send which var to, eplist has the same - # order as send_inputs. 
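The `grad_param_mapping` built above links each split gradient Variable back to the parameter block it updates, keyed off the plain `"varname:block_id:size"` descriptors that `split_dense_variable` emits. A hypothetical illustration of that bookkeeping (variable names invented; the real mapping stores framework Variables, not strings):

```python
# Two pservers: a 1000x1000 fc weight and its gradient are each split
# into two 500x1000 blocks, described as "varname:block_id:size".
grad_blocks = ["fc_w@GRAD:0:500000", "fc_w@GRAD:1:500000"]
param_blocks = ["fc_w:0:500000", "fc_w:1:500000"]

grad_param_mapping = dict()
for g, p in zip(grad_blocks, param_blocks):
    g_name, g_bid, _ = g.split(":")
    p_name, p_bid, _ = p.split(":")
    grad_param_mapping[(g_name, int(g_bid))] = (p_name, int(p_bid))

# The recv path later walks this mapping: each sent gradient block
# has exactly one parameter block to fetch back.
assert grad_param_mapping[("fc_w@GRAD", 1)] == ("fc_w", 1)
```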
- eplist = split_method(send_inputs, pserver_endpoints) # create mapping of endpoint -> split var to create pserver side program self.param_grad_ep_mapping = dict() + [ + self.param_grad_ep_mapping.update({ + ep: { + "params": [], + "grads": [] + } + }) for ep in self.pserver_endpoints + ] + + # step 3.1: insert send op to send gradient vars to parameter servers + ps_dispatcher.reset() + send_vars = [] + for orig_varname, splited_vars in grad_var_mapping.items(): + eplist = ps_dispatcher.dispatch(splited_vars) + if len(splited_vars) == 1: + orig_varname = splited_vars[0].name + index = find_op_by_output_arg(program.global_block(), + orig_varname) + elif len(splited_vars) > 1: + orig_var = program.global_block().vars[orig_varname] + index = find_op_by_output_arg(program.global_block(), + orig_varname) + self._insert_split_op(program, orig_var, index, splited_vars) + index += 1 + else: + AssertionError("Can not insert the send op by original " + "variable name :", orig_varname) + + program.global_block().insert_op( + index=index + 1, + type="send_vars", + inputs={"X": splited_vars}, + outputs={}, + attrs={ + "epmap": eplist, + RPC_OP_ROLE_ATTR_NAME: RPC_OP_ROLE_ATTR_VALUE + }) + for _, var in enumerate(splited_vars): + send_vars.append(var) + + if self.sync_mode: + program.global_block().append_op( + type="send_barrier", + inputs={}, + outputs={}, + attrs={ + "endpoints": pserver_endpoints, + "sync_mode": self.sync_mode, + RPC_OP_ROLE_ATTR_NAME: RPC_OP_ROLE_ATTR_VALUE + }) + + # step 3.2: insert recv op to receive parameters from parameter server + recv_vars = [] + for _, var in enumerate(send_vars): + recv_vars.append(grad_param_mapping[var]) + ps_dispatcher.reset() + eplist = ps_dispatcher.dispatch(recv_vars) + for i, ep in enumerate(eplist): - param = send_outputs[i] - grad = send_inputs[i] - if not self.param_grad_ep_mapping.has_key(ep): - self.param_grad_ep_mapping[ep] = {"params": [], "grads": []} - self.param_grad_ep_mapping[ep]["params"].append(param) - self.param_grad_ep_mapping[ep]["grads"].append(grad) - - rpc_client_var = program.global_block().create_var( - name=RPC_CLIENT_VAR_NAME, - persistable=True, - type=core.VarDesc.VarType.RAW) - - # create send_op + self.param_grad_ep_mapping[ep]["params"].append(recv_vars[i]) + self.param_grad_ep_mapping[ep]["grads"].append(send_vars[i]) + # step4: Concat the parameters splits together after recv. + for varname, splited_var in param_var_mapping.iteritems(): + eps = [] + for var in splited_var: + index = [v.name for v in recv_vars].index(var.name) + eps.append(eplist[index]) + + program.global_block().append_op( + type="recv", + inputs={}, + outputs={"Out": splited_var}, + attrs={ + "epmap": eps, + RPC_OP_ROLE_ATTR_NAME: RPC_OP_ROLE_ATTR_VALUE + }) + program.global_block().append_op( - type="send", - inputs={"X": send_inputs}, - outputs={"Out": send_outputs, - "RPCClient": rpc_client_var}, + type="fetch_barrier", + inputs={}, + outputs={}, attrs={ "endpoints": pserver_endpoints, - "epmap": eplist, - "sync_mode": self.sync_mode + RPC_OP_ROLE_ATTR_NAME: RPC_OP_ROLE_ATTR_VALUE }) - # step4: Concat the parameters splits together after recv. 
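Steps 3.1 and 3.2 above replace the old monolithic `send` op with a pipeline of finer-grained RPC ops, each tagged with `OpRole.RPC`; this is exactly what the updated `test_dist_transpiler` expectations assert. Assuming `t` is a `DistributeTranspiler` that has already run `transpile()`, the tail of the sync-mode trainer block can be inspected like this (op names per the test; the comments summarize each stage):

```python
# Expected tail of the trainer block in sync mode:
#   split_byref    - split fc_w@GRAD into per-pserver blocks
#   send_vars      - ship each block to its endpoint
#   send_barrier   - wait until all gradient sends are confirmed
#   recv, recv     - fetch the updated parameter blocks
#   fetch_barrier  - wait until all fetches complete
#   concat         - reassemble fc_w from its received blocks
for op in t.get_trainer_program().global_block().ops:
    print(op.type)
```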
+ for varname, splited_var in param_var_mapping.iteritems(): if len(splited_var) <= 1: continue @@ -327,10 +410,9 @@ class DistributeTranspiler: attrs={"axis": 0}) if self.has_distributed_lookup_table: - self._replace_lookup_table_op_with_prefetch(program, rpc_client_var, - eplist) - self._split_table_grad_and_add_send_vars(program, rpc_client_var, - pserver_endpoints) + self._replace_lookup_table_op_with_prefetch(program, + pserver_endpoints) + self._split_table_grad_and_add_send_vars(program, pserver_endpoints) def get_trainer_program(self): # remove optimize ops and add a send op to main_program @@ -466,7 +548,7 @@ class DistributeTranspiler: if self.has_distributed_lookup_table: pserver_index = self.pserver_endpoints.index(endpoint) table_opt_block = self._create_table_optimize_block( - pserver_index, pserver_program, pre_block_idx) + pserver_index, pserver_program, pre_block_idx, grad_to_block_id) prefetch_block = self._create_prefetch_block( pserver_index, pserver_program, table_opt_block) @@ -550,8 +632,8 @@ class DistributeTranspiler: return s_prog # transpiler function for dis lookup_table - def _replace_lookup_table_op_with_prefetch(self, program, rpc_client_var, - eplist): + def _replace_lookup_table_op_with_prefetch(self, program, + pserver_endpoints): # 1. replace lookup_table_op with split_ids_op -> prefetch_op -> sum_op self.prefetch_input_vars = None self.prefetch_output_vars = None @@ -598,11 +680,11 @@ class DistributeTranspiler: index=op_index + 1, type="prefetch", inputs={'X': self.prefetch_input_vars}, - outputs={ - "Out": self.prefetch_output_vars, - "RPCClient": rpc_client_var - }, - attrs={"epmap": eplist}) + outputs={"Out": self.prefetch_output_vars}, + attrs={ + "epmap": pserver_endpoints, + RPC_OP_ROLE_ATTR_NAME: RPC_OP_ROLE_ATTR_VALUE + }) # insert concat_op program.global_block().insert_op( @@ -622,8 +704,7 @@ class DistributeTranspiler: # break for loop break - def _split_table_grad_and_add_send_vars(self, program, rpc_client_var, - pserver_endpoints): + def _split_table_grad_and_add_send_vars(self, program, pserver_endpoints): # 2. 
add split_ids_op and send_vars_op to send gradient to pservers # there should only be one table_name all_ops = program.global_block().ops @@ -638,14 +719,17 @@ class DistributeTranspiler: inputs={ 'Ids': [program.global_block().vars[table_grad_name]] }, - outputs={"Out": self.table_grad_list}) + outputs={"Out": self.trainer_side_table_grad_list}) program.global_block().insert_op( index=op_index + 2, type="send_vars", - inputs={'X': self.table_grad_list}, - outputs={"RPCClient": rpc_client_var}, - attrs={"sync_send": True, - "epmap": pserver_endpoints}) + inputs={'X': self.trainer_side_table_grad_list}, + outputs={}, + attrs={ + "sync_send": True, + "epmap": pserver_endpoints, + RPC_OP_ROLE_ATTR_NAME: RPC_OP_ROLE_ATTR_VALUE + }) break def _create_prefetch_block(self, pserver_index, pserver_program, @@ -678,16 +762,7 @@ class DistributeTranspiler: return prefetch_block def _create_table_optimize_block(self, pserver_index, pserver_program, - pre_block_idx): - def _clone_var(block, var, persistable=True): - assert isinstance(var, Variable) - return block.create_var( - name=var.name, - shape=var.shape, - dtype=var.dtype, - type=var.type, - persistable=persistable) - + pre_block_idx, grad_to_block_id): # STEP: create table optimize block # create table param and grad var in pserver program origin_param_var = self.origin_program.global_block().vars[ @@ -698,11 +773,11 @@ class DistributeTranspiler: dtype=origin_param_var.dtype, type=core.VarDesc.VarType.SELECTED_ROWS, persistable=True) - grad_var = _clone_var( - pserver_program.global_block(), + # parameter must be selected rows + param_var.desc.set_type(core.VarDesc.VarType.SELECTED_ROWS) + grad_var = pserver_program.global_block().clone_variable( self.origin_program.global_block().vars[grad_var_name( - self.table_name)], - persistable=False) + self.table_name)]) # create table optimize block in pserver program table_opt_op = [ @@ -716,7 +791,7 @@ class DistributeTranspiler: if self.sync_mode: # create grad vars in pserver program table_grad_var = self.table_param_grad[1] - table_grad_list = [ + pserver_side_table_grad_list = [ pserver_program.global_block().create_var( name="%s.trainer_%d.pserver_%d" % (table_grad_var.name, index, pserver_index), @@ -726,11 +801,21 @@ class DistributeTranspiler: for index in range(self.trainer_num) ] - # append sum op for table_grad_list + # append sum op for pserver_side_table_grad_list table_opt_block.append_op( type="sum", - inputs={"X": table_grad_list}, + inputs={"X": pserver_side_table_grad_list}, outputs={"Out": [grad_var]}) + else: + # in async_mode, for table gradient, it also need to be splited to each parameter server + origin_grad_name = grad_var.name + splited_grad_name = self.trainer_side_table_grad_list[ + pserver_index].name + if not splited_grad_name.startswith(origin_grad_name): + raise ValueError("origin_grad_var: " + splited_grad_name + + " grad_var:" + grad_var.name) + grad_var = pserver_program.global_block().rename_var( + origin_grad_name, splited_grad_name) lr_var = pserver_program.global_block().vars[table_opt_op.input( "LearningRate")[0]] @@ -746,6 +831,9 @@ class DistributeTranspiler: outputs=outputs, attrs=table_opt_op.attrs) + # add table parameter gradient and it's block id to grad_to_block_id + grad_to_block_id.append(grad_var.name + ":" + str(table_opt_block.idx)) + return table_opt_block # ====================== private transpiler functions ===================== @@ -838,50 +926,31 @@ class DistributeTranspiler: lod_level=var.lod_level, persistable=persistable) - def 
_append_split_op(self, program, gradblocks): - """ - Split variables that need to be split and append respective ops - Args: - program (ProgramDesc): ProgramDesc that gradients blong. - gradblocks (list[(varname, block_id, block_size)]): List of gradient blocks. - Returns: - var_mapping (dict(varname->[new_splitted_variable])):A dict mapping - from original var name to each var split. - """ - - add_suffix = False - if self.trainer_num > 1: - add_suffix = True - var_mapping = self._create_vars_from_blocklist( - program, gradblocks, add_trainer_suffix=add_suffix) - for varname, splited_vars in var_mapping.iteritems(): - # variable that don't need to split have empty splited_vars - if len(splited_vars) <= 1: - continue - orig_var = program.global_block().vars[varname] - if orig_var.type == core.VarDesc.VarType.SELECTED_ROWS: - height_sections = [] - for v in splited_vars: - height_sections.append(v.shape[0]) - program.global_block().append_op( - type="split_selected_rows", - inputs={"X": orig_var}, - outputs={"Out": splited_vars}, - attrs={"height_sections": height_sections}) - elif orig_var.type == core.VarDesc.VarType.LOD_TENSOR: - sections = [] - for v in splited_vars: - sections.append(v.shape[0]) - program.global_block().append_op( - type="split_byref", - inputs={"X": orig_var}, - outputs={"Out": splited_vars}, - attrs={"sections": sections} # assume split evenly - ) - else: - AssertionError("Variable type should be in set " - "[LOD_TENSOR, SELECTED_ROWS]") - return var_mapping + def _insert_split_op(self, program, orig_var, index, splited_vars): + if orig_var.type == core.VarDesc.VarType.SELECTED_ROWS: + height_sections = [] + for v in splited_vars: + height_sections.append(v.shape[0]) + program.global_block().insert_op( + index=index + 1, + type="split_selected_rows", + inputs={"X": orig_var}, + outputs={"Out": splited_vars}, + attrs={"height_sections": height_sections}) + elif orig_var.type == core.VarDesc.VarType.LOD_TENSOR: + sections = [] + for v in splited_vars: + sections.append(v.shape[0]) + program.global_block().insert_op( + index=index + 1, + type="split_byref", + inputs={"X": orig_var}, + outputs={"Out": splited_vars}, + attrs={"sections": sections} # assume split evenly + ) + else: + AssertionError("Variable type should be in set " + "[LOD_TENSOR, SELECTED_ROWS]") def _get_optimizer_input_shape(self, op_type, varkey, orig_shape, param_shape): diff --git a/python/paddle/fluid/transpiler/distributed_splitter.py b/python/paddle/fluid/transpiler/distributed_splitter.py deleted file mode 100644 index 060c1df8ad2badc5132f45ff0f44d136d828faa1..0000000000000000000000000000000000000000 --- a/python/paddle/fluid/transpiler/distributed_splitter.py +++ /dev/null @@ -1,57 +0,0 @@ -# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - - -def hash_name(varlist, pserver_endpoints): - """ - hash variable names to several endpoints. 
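`_insert_split_op` above keeps the section arithmetic of the removed `_append_split_op` but changes the placement: the split is inserted immediately after the op that produces the gradient (found via `find_op_by_output_arg`) rather than appended at the end of the block, so the following `send_vars` can fire as soon as that gradient exists. A worked example of the section computation under assumed shapes:

```python
# A LOD_TENSOR gradient of shape (1000, 784) split across two pservers:
splited_shapes = [(500, 784), (500, 784)]   # shapes of splited_vars
sections = [s[0] for s in splited_shapes]   # -> [500, 500]
# "sections" becomes the attr of split_byref; for SELECTED_ROWS gradients
# the same numbers are passed as "height_sections" to split_selected_rows.
```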
- - Args: - varlist(list): a list of Variables - - Returns(dict): a map of pserver endpoint -> varname - """ - - def _hash_block(block_str, total): - return hash(block_str) % total - - eplist = [] - for var in varlist: - server_id = _hash_block(var.name(), len(pserver_endpoints)) - server_for_param = pserver_endpoints[server_id] - eplist.append(server_for_param) - return eplist - - -def round_robin(varlist, pserver_endpoints): - """ - Distribute variables to several endpoints. - Args: - varlist(list): a list of variables - pserver_endpoints(list): a list of pserver endpoints - - Returns(list[int]): the endpoint for each variable - """ - assert (len(varlist) >= len(pserver_endpoints)) - - eplist = [] - pserver_idx = 0 - for var in varlist: - server_for_param = pserver_endpoints[pserver_idx] - eplist.append(server_for_param) - - pserver_idx += 1 - if pserver_idx >= len(pserver_endpoints): - pserver_idx = 0 - return eplist diff --git a/python/paddle/fluid/transpiler/ps_dispatcher.py b/python/paddle/fluid/transpiler/ps_dispatcher.py new file mode 100644 index 0000000000000000000000000000000000000000..d6a68677527deb09ace0e3a23cbc093d6d7b4349 --- /dev/null +++ b/python/paddle/fluid/transpiler/ps_dispatcher.py @@ -0,0 +1,78 @@ +# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +class PSDispatcher(object): + """ + PSDispatcher is the base class for dispatching vars + into different pserver instance. + You need to implement the `dispatch` inferface. + """ + + def __init__(self, pserver_endpoints): + self._eps = pserver_endpoints + self._step = 0 + + @property + def eps(self): + return self._eps + + def reset(self): + self._step = 0 + + def dispatch(self, varlist): + """ + :param varlist: a list of Variables + :return: a map of pserver endpoint -> varname + """ + AssertionError("Interface has not been implemented.") + + +class HashName(PSDispatcher): + """ + Hash variable names to several endpoints + """ + + def __init__(self, pserver_endpoints): + super(self.__class__, self).__init__(pserver_endpoints) + + def _hash_block(self, block_str, total): + return hash(block_str) % total + + def dispatch(self, varlist): + eplist = [] + for var in varlist: + server_id = self._hash_block(var.name(), len(self._eps)) + server_for_param = self._eps[server_id] + eplist.append(server_for_param) + return eplist + + +class RoundRobin(PSDispatcher): + """ + Distribute variables to serveral endpoints. 
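`PSDispatcher` turns endpoint assignment into a small strategy object: `HashName` pins each variable to an endpoint by name hash, so placement is stable across runs, while `RoundRobin` simply cycles through the endpoint list in dispatch order. A usage sketch with a stand-in variable class (note that `HashName.dispatch` invokes `var.name()` as a method, mirroring the deleted `hash_name` helper):

```python
from paddle.fluid.transpiler.ps_dispatcher import HashName, RoundRobin


class FakeVar(object):
    """Stand-in for a framework Variable; HashName calls name()."""

    def __init__(self, name):
        self._name = name

    def name(self):
        return self._name


eps = ["127.0.0.1:6174", "127.0.0.1:6175"]
rr = RoundRobin(eps)
print(rr.dispatch([FakeVar("w1"), FakeVar("w2"), FakeVar("w3")]))
# -> ['127.0.0.1:6174', '127.0.0.1:6175', '127.0.0.1:6174']

hn = HashName(eps)
print(hn.dispatch([FakeVar("w1")]))  # endpoint picked by hash(name) % 2
```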
+ """ + + def __init__(self, pserver_endpoints): + super(self.__class__, self).__init__(pserver_endpoints) + + def dispatch(self, varlist): + eplist = [] + for var in varlist: + server_for_param = self._eps[self._step] + eplist.append(server_for_param) + self._step += 1 + if self._step >= len(self._eps): + self._step = 0 + return eplist diff --git a/tools/codestyle/cpplint_pre_commit.hook b/tools/codestyle/cpplint_pre_commit.hook index 94d1e23ce716f7f1d723bad5f1f4c60030f19eb7..b194af76dc529fd52b0aedfab9c41d625fe64c0d 100755 --- a/tools/codestyle/cpplint_pre_commit.hook +++ b/tools/codestyle/cpplint_pre_commit.hook @@ -4,8 +4,12 @@ TOTAL_ERRORS=0 # The trick to remove deleted files: https://stackoverflow.com/a/2413151 for file in $(git diff --cached --name-status | awk '$1 != "D" {print $2}'); do - cpplint $file; - TOTAL_ERRORS=$(expr $TOTAL_ERRORS + $?); + if [[ $file =~ ^(paddle/api/.*|paddle/capi/.*|paddle/contrib/.*|paddle/cuda/.*|paddle/function/.*|paddle/gserver/.*|paddle/math/.*|paddle/optimizer/.*|paddle/parameter/.*|paddle/pserver/.*|paddle/trainer/.*|paddle/utils/.*) ]]; then + continue; + else + cpplint $file; + TOTAL_ERRORS=$(expr $TOTAL_ERRORS + $?); + fi done exit $TOTAL_ERRORS diff --git a/tools/codestyle/docstring_checker.py b/tools/codestyle/docstring_checker.py new file mode 100644 index 0000000000000000000000000000000000000000..48100e5bf989520043b5ca372b02883faea8a9fd --- /dev/null +++ b/tools/codestyle/docstring_checker.py @@ -0,0 +1,334 @@ +# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""DocstringChecker is used to check python doc string's style.""" + +import six +import astroid + +from pylint.checkers import BaseChecker, utils +from pylint.interfaces import IAstroidChecker + +from collections import defaultdict +import re + + +def register(linter): + """Register checkers.""" + linter.register_checker(DocstringChecker(linter)) + + +class Docstring(object): + """Docstring class holds the parsed doc string elements. + """ + + def __init__(self): + self.d = defaultdict(list) #name->[] + self.clear() + + def clear(self): + self.d['Args'] = [] + self.d['Examples'] = [] + self.d['Returns'] = [] + self.d['Raises'] = [] + self.args = {} #arg_name->arg_type + + def get_level(self, string, indent=' '): + level = 0 + unit_size = len(indent) + while string[:unit_size] == indent: + string = string[unit_size:] + level += 1 + + return level + + def parse(self, doc): + """parse gets sections from doc + Such as Args, Returns, Raises, Examples s + Args: + doc (string): is the astroid node doc string. + Returns: + True if doc is parsed successfully. 
+ """ + self.clear() + + lines = doc.splitlines() + state = ("others", -1) + for l in lines: + c = l.strip() + if len(c) <= 0: + continue + + level = self.get_level(l) + if c.startswith("Args:"): + state = ("Args", level) + elif c.startswith("Returns:"): + state = ("Returns", level) + elif c.startswith("Raises:"): + state = ("Raises", level) + elif c.startswith("Examples:"): + state = ("Examples", level) + else: + if level > state[1]: + self.d[state[0]].append(c) + continue + + state = ("others", -1) + self.d[state[0]].append(c) + + self._arg_with_type() + return True + + def get_returns(self): + return self.d['Returns'] + + def get_raises(self): + return self.d['Raises'] + + def get_examples(self): + return self.d['Examples'] + + def _arg_with_type(self): + + for t in self.d['Args']: + m = re.search('([A-Za-z0-9_-]+)\s{0,4}(\(.+\))\s{0,4}:', t) + if m: + self.args[m.group(1)] = m.group(2) + + return self.args + + +class DocstringChecker(BaseChecker): + """DosstringChecker is pylint checker to + check docstring style. + """ + __implements__ = (IAstroidChecker, ) + + POSITIONAL_MESSAGE_ID = 'str-used-on-positional-format-argument' + KEYWORD_MESSAGE_ID = 'str-used-on-keyword-format-argument' + + name = 'doc-string-checker' + symbol = "doc-string" + priority = -1 + msgs = { + 'W9001': ('One line doc string on > 1 lines', symbol + "-one-line", + 'Used when a short doc string is on multiple lines'), + 'W9002': + ('Doc string does not end with "." period', symbol + "-end-with", + 'Used when a doc string does not end with a period'), + 'W9003': ('All args with their types must be mentioned in doc string', + symbol + "-with-all-args", + 'Used when not all arguments are in the doc string '), + 'W9005': ('Missing docstring or docstring is too short', + symbol + "-missing", 'Add docstring longer >=10'), + 'W9006': ('Docstring indent error, use 4 space for indent', + symbol + "-indent-error", 'Use 4 space for indent'), + 'W9007': ('You should add `Returns` in comments', + symbol + "-with-returns", + 'There should be a `Returns` section in comments'), + 'W9008': ('You should add `Raises` section in comments', + symbol + "-with-raises", + 'There should be a `Raises` section in comments'), + } + options = () + + def visit_functiondef(self, node): + """visit_functiondef checks Function node docstring style. + Args: + node (astroid.node): The visiting node. + Returns: + True if successful other wise False. + """ + + self.check_doc_string(node) + + if node.tolineno - node.fromlineno <= 10: + return True + + if not node.doc: + return True + + doc = Docstring() + doc.parse(node.doc) + + self.all_args_in_doc(node, doc) + self.with_returns(node, doc) + self.with_raises(node, doc) + + def visit_module(self, node): + self.check_doc_string(node) + + def visit_classdef(self, node): + self.check_doc_string(node) + + def check_doc_string(self, node): + self.missing_doc_string(node) + self.one_line(node) + self.has_period(node) + self.indent_style(node) + + def missing_doc_string(self, node): + if node.tolineno - node.fromlineno <= 10: + return True + + if node.doc is None or len(node.doc) < 10: + self.add_message('W9005', node=node, line=node.fromlineno) + return False + + # FIXME(gongwb): give the docstring line-no + def indent_style(self, node, indent=4): + """indent_style checks docstring's indent style + Args: + node (astroid.node): The visiting node. + indent (int): The default indent of style + Returns: + True if successful other wise False. 
+ """ + if node.doc is None: + return True + + doc = node.doc + lines = doc.splitlines() + + for l in lines: + cur_indent = len(l) - len(l.lstrip()) + if cur_indent % indent != 0: + self.add_message('W9006', node=node, line=node.fromlineno) + return False + + return True + + def one_line(self, node): + """one_line checks if docstring (len < 40) is on one line. + Args: + node (astroid.node): The node visiting. + Returns: + True if successful otherwise False. + """ + + doc = node.doc + if doc is None: + return True + + if len(doc) > 40: + return True + elif sum(doc.find(nl) for nl in ('\n', '\r', '\n\r')) == -3: + return True + else: + self.add_message('W9001', node=node, line=node.fromlineno) + return False + + return True + + def has_period(self, node): + """has_period checks if one line doc end-with '.' . + Args: + node (astroid.node): the node is visiting. + Returns: + True if successful otherwise False. + """ + if node.doc is None: + return True + + if len(node.doc.splitlines()) > 1: + return True + + if not node.doc.strip().endswith('.'): + self.add_message('W9002', node=node, line=node.fromlineno) + return False + + return True + + def with_raises(self, node, doc): + """with_raises checks if one line doc end-with '.' . + Args: + node (astroid.node): the node is visiting. + doc (Docstring): Docstring object. + Returns: + True if successful otherwise False. + """ + + find = False + for t in node.body: + if not isinstance(t, astroid.Raise): + continue + + find = True + break + + if not find: + return True + + if len(doc.get_raises()) == 0: + self.add_message('W9008', node=node, line=node.fromlineno) + return False + + return True + + def with_returns(self, node, doc): + """with_returns checks if docstring comments what are returned . + Args: + node (astroid.node): the node is visiting. + doc (Docstring): Docstring object. + Returns: + True if successful otherwise False. + """ + + find = False + for t in node.body: + if not isinstance(t, astroid.Return): + continue + + find = True + break + + if not find: + return True + + if len(doc.get_returns()) == 0: + self.add_message('W9007', node=node, line=node.fromlineno) + return False + + return True + + def all_args_in_doc(self, node, doc): + """all_args_in_doc checks if arguments are mentioned in doc + Args: + node (astroid.node): the node is visiting. + doc (Docstring): Docstring object + Returns: + True if successful otherwise False. 
+ """ + args = [] + for arg in node.args.get_children(): + if (not isinstance(arg, astroid.AssignName)) \ + or arg.name == "self": + continue + args.append(arg.name) + + if len(args) <= 0: + return True + + parsed_args = doc.args + if len(args) > 0 and len(parsed_args) <= 0: + print "debug:parsed args: ", parsed_args + self.add_message('W9003', node=node, line=node.fromlineno) + return False + + for t in args: + if t not in parsed_args: + print t, " with (type) not in ", parsed_args + self.add_message('W9003', node=node, line=node.fromlineno) + return False + + return True diff --git a/tools/codestyle/pylint_pre_commit.hook b/tools/codestyle/pylint_pre_commit.hook new file mode 100755 index 0000000000000000000000000000000000000000..e7c92ba671e0eb778b2ab5447bea7c4b14fe761b --- /dev/null +++ b/tools/codestyle/pylint_pre_commit.hook @@ -0,0 +1,19 @@ +#!/bin/bash + +TOTAL_ERRORS=0 + + +DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" +export PYTHONPATH=$DIR:$PYTHONPATH + +# The trick to remove deleted files: https://stackoverflow.com/a/2413151 +for file in $(git diff --cached --name-status | awk '$1 != "D" {print $2}'); do + pylint --disable=all --load-plugins=docstring_checker \ + --enable=doc-string-one-line,doc-string-end-with,doc-string-with-all-args,doc-string-triple-quotes,doc-string-missing,doc-string-indent-error,doc-string-with-returns,doc-string-with-raises $file; + TOTAL_ERRORS=$(expr $TOTAL_ERRORS + $?); +done + +#exit $TOTAL_ERRORS +#For now, just warning: +exit 0 + diff --git a/tools/codestyle/test_docstring_checker.py b/tools/codestyle/test_docstring_checker.py new file mode 100644 index 0000000000000000000000000000000000000000..0547f7d1610c64b0ca6efa9384e97d658c8276fe --- /dev/null +++ b/tools/codestyle/test_docstring_checker.py @@ -0,0 +1,232 @@ +# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import docstring_checker +import pylint.testutils +import astroid +import pytest +import sys + + +class TestDocstring(pylint.testutils.CheckerTestCase): + CHECKER_CLASS = docstring_checker.DocstringChecker + + def test_one_line(self): + func_node = astroid.extract_node(''' + def test(): + """get + news. + """ + if True: + return 5 + return 5 + ''') + + self.checker.visit_functiondef(func_node) + got = self.linter.release_messages() + assert len(got) == 1 + assert 'W9001' == got[0][0] + + def test_one_line(self): + func_node = astroid.extract_node(''' + def test(): + """get news""" + if True: + return 5 + return 5 + ''') + + self.checker.visit_functiondef(func_node) + got = self.linter.release_messages() + assert len(got) == 1 + assert 'W9002' == got[0][0] + + def test_args(self): + func_node = astroid.extract_node(''' + def test(scale, mean): + """get news. + Args: + scale (int): scale is the number. 
+ """ + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + ''') + + self.checker.visit_functiondef(func_node) + got = self.linter.release_messages() + assert len(got) == 1 + assert 'W9003' == got[0][0] + + def test_missing(self): + func_node = astroid.extract_node(''' + def test(): + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + ''') + + self.checker.visit_functiondef(func_node) + got = self.linter.release_messages() + assert len(got) == 1 + assert 'W9005' == got[0][0] + + def test_indent(self): + func_node = astroid.extract_node(''' + def test(): + """ get get get get get get get get + get get get get get get get get. + """ + pass + ''') + + self.checker.visit_functiondef(func_node) + got = self.linter.release_messages() + assert len(got) == 1 + assert 'W9006' == got[0][0] + + def test_with_resturns(self): + func_node = astroid.extract_node(''' + def test(): + """get news. + Args: + scale (int): scale is the number. + """ + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + return mean + ''') + + self.checker.visit_functiondef(func_node) + got = self.linter.release_messages() + assert len(got) == 1 + assert 'W9007' == got[0][0] + + def test_with_raises(self): + func_node = astroid.extract_node(''' + def test(): + """get news. + Args: + scale (int): scale is the number. + """ + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + mean=scale + raise ValueError('A very specific bad thing happened.') + ''') + + self.checker.visit_functiondef(func_node) + got = self.linter.release_messages() + assert len(got) == 1 + assert 'W9008' == got[0][0] + + def test_no_message(self): + p = ''' +def fc(input, + size, + num_flatten_dims=1, + param_attr=None, + bias_attr=None, + act=None, + name=None): + """ + **Fully Connected Layer** + The fully connected layer can take multiple tensors as its inputs. It + creates a variable called weights for each input tensor, which represents + a fully connected weight matrix from each input unit to each output unit. + The fully connected layer multiplies each input tensor with its coresponding + weight to produce an output Tensor. If multiple input tensors are given, + the results of multiple multiplications will be sumed up. If bias_attr is + not None, a bias variable will be created and added to the output. Finally, + if activation is not None, it will be applied to the output as well. + This process can be formulated as follows: + + Args: + input (Variable|list of Variable): The input tensor(s) of this layer, and the dimension of + the input tensor(s) is at least 2. + size(int): The number of output units in this layer. + num_flatten_dims (int, default 1): The fc layer can accept an input tensor with more than + two dimensions. If this happens, the multidimensional tensor will first be flattened + into a 2-dimensional matrix. The parameter `num_flatten_dims` determines how the input + tensor is flattened: the first `num_flatten_dims` (inclusive, index starts from 1) + dimensions will be flatten to form the first dimension of the final matrix (height of + the matrix), and the rest `rank(X) - num_flatten_dims` dimensions are flattened to + form the second dimension of the final matrix (width of the matrix). 
For example, suppose + `X` is a 5-dimensional tensor with a shape [2, 3, 4, 5, 6], and `num_flatten_dims` = 3. + Then, the flattened matrix will have a shape [2 x 3 x 4, 5 x 6] = [24, 30]. + param_attr (ParamAttr|list of ParamAttr, default None): The parameter attribute for learnable + parameters/weights of this layer. + bias_attr (ParamAttr|list of ParamAttr, default None): The parameter attribute for the bias + of this layer. If it is set to None, no bias will be added to the output units. + act (str, default None): Activation to be applied to the output of this layer. + name (str, default None): The name of this layer. + Returns: + A tensor variable storing the transformation result. + Raises: + ValueError: If rank of the input tensor is less than 2. + Examples: + .. code-block:: python + data = fluid.layers.data(name="data", shape=[32, 32], dtype="float32") + fc = fluid.layers.fc(input=data, size=1000, act="tanh") + """ + raise ValueError('A very specific bad thing happened.') + size = 1 + size = 1 + size = 1 + size = 1 + size = 1 + size = 1 + size = 1 + size = 1 + size = 1 + size = 1 + size = 1 + size = 1 + size = 1 + return size + ''' + + func_node = astroid.extract_node(p) + self.checker.visit_functiondef(func_node) + got = self.linter.release_messages() + assert len(got) == 0
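These checker tests run under plain pytest. A quick way to exercise them locally from Python, assuming `pylint`, `pytest`, and `astroid` are installed (as the hooks in this patch require):

```python
import pytest

# Equivalent to running: pytest -v tools/codestyle/test_docstring_checker.py
raise SystemExit(pytest.main(["-v", "tools/codestyle/test_docstring_checker.py"]))
```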