BaiXuePrincess / Paddle
Forked from PaddlePaddle / Paddle
Commit ebbe6e1a
Authored Nov 23, 2016 by dangqingqing

Merge branch 'develop' of https://github.com/baidu/Paddle into config_parse_bug_fix

Parents: 1a8fcc00, 5c0eb23d

Showing 377 changed files with 10817 additions and 7795 deletions
paddle/api/Arguments.cpp  +1 -2
paddle/api/ConfigParser.cpp  +1 -3
paddle/api/GradientMachine.cpp  +16 -9
paddle/api/Internal.h  +4 -4
paddle/api/Matrix.cpp  +34 -17
paddle/api/PaddleAPI.h  +56 -32
paddle/api/Parameter.cpp  +0 -1
paddle/api/ParameterOptimizer.cpp  +21 -15
paddle/api/SequenceGenerator.cpp  +6 -4
paddle/api/Trainer.cpp  +8 -8
paddle/api/Util.cpp  +5 -3
paddle/api/Vector.cpp  +12 -9
paddle/cuda/include/hl_activation_functions.h  +5 -9
paddle/cuda/include/hl_aggregate.h  +0 -1
paddle/cuda/include/hl_avx_functions.h  +9 -10
paddle/cuda/include/hl_base.h  +47 -54
paddle/cuda/include/hl_batch_transpose.h  +2 -6
paddle/cuda/include/hl_cnn.h  +150 -92
paddle/cuda/include/hl_cuda.h  +10 -13
paddle/cuda/include/hl_cuda_cublas.h  +44 -34
paddle/cuda/include/hl_cuda_cudnn.h  +49 -51
paddle/cuda/include/hl_dso_loader.h  +0 -1
paddle/cuda/include/hl_functions.h  +17 -18
paddle/cuda/include/hl_gpu.h  +0 -1
paddle/cuda/include/hl_lstm.h  +0 -1
paddle/cuda/include/hl_matrix.h  +19 -47
paddle/cuda/include/hl_sequence.h  +13 -15
paddle/cuda/include/hl_sparse.h  +41 -37
paddle/cuda/include/hl_table_apply.h  +11 -9
paddle/cuda/include/hl_time.h  +0 -1
paddle/cuda/include/hl_top_k.h  +8 -6
paddle/cuda/include/stub/hl_aggregate_stub.h  +6 -13
paddle/cuda/include/stub/hl_cnn_stub.h  +151 -93
paddle/cuda/include/stub/hl_cuda_cublas_stub.h  +30 -29
paddle/cuda/include/stub/hl_cuda_cudnn_stub.h  +83 -88
paddle/cuda/include/stub/hl_cuda_stub.h  +12 -15
paddle/cuda/include/stub/hl_lstm_stub.h  +0 -1
paddle/cuda/include/stub/hl_matrix_stub.h  +20 -40
paddle/cuda/include/stub/hl_sequence_stub.h  +9 -13
paddle/cuda/include/stub/hl_sparse_stub.h  +41 -39
paddle/cuda/src/avx_mathfun.h  +254 -252
paddle/cuda/src/hl_avx_functions.cc  +40 -44
paddle/cuda/src/hl_cpu_functions.cc  +24 -37
paddle/cuda/src/hl_cuda_cublas.cc  +187 -121
paddle/cuda/src/hl_cuda_cudnn.cc  +626 -620
paddle/cuda/src/hl_cuda_device.cc  +167 -183
paddle/cuda/src/hl_cudart_wrap.cc  +87 -88
paddle/cuda/src/hl_math.cc  +4 -13
paddle/cuda/src/hl_time.cc  +3 -5
paddle/gserver/activations/ActivationFunction.cpp  +46 -24
paddle/gserver/activations/ActivationFunction.h  +0 -1
paddle/gserver/dataproviders/DataProvider.cpp  +13 -8
paddle/gserver/dataproviders/DataProvider.h  +21 -19
paddle/gserver/dataproviders/DataProviderGroup.h  +5 -5
paddle/gserver/dataproviders/MultiDataProvider.cpp  +2 -5
paddle/gserver/dataproviders/MultiDataProvider.h  +0 -1
paddle/gserver/dataproviders/ProtoDataProvider.cpp  +79 -50
paddle/gserver/dataproviders/ProtoDataProvider.h  +6 -4
paddle/gserver/dataproviders/ProtoReader.h  +2 -2
paddle/gserver/dataproviders/PyDataProvider.cpp  +82 -51
paddle/gserver/dataproviders/PyDataProvider.h  +14 -8
paddle/gserver/dataproviders/PyDataProvider2.cpp  +139 -173
paddle/gserver/evaluators/CTCErrorEvaluator.cpp  +11 -7
paddle/gserver/evaluators/ChunkEvaluator.cpp  +2 -1
paddle/gserver/evaluators/Evaluator.cpp  +44 -25
paddle/gserver/evaluators/Evaluator.h  +17 -8
paddle/gserver/gradientmachines/GradientMachine.cpp  +7 -6
paddle/gserver/gradientmachines/GradientMachine.h  +7 -7
paddle/gserver/gradientmachines/GradientMachineMode.h  +22 -21
paddle/gserver/gradientmachines/MultiGradientMachine.cpp  +77 -103
paddle/gserver/gradientmachines/MultiGradientMachine.h  +72 -107
paddle/gserver/gradientmachines/MultiNetwork.cpp  +6 -5
paddle/gserver/gradientmachines/MultiNetwork.h  +7 -6
paddle/gserver/gradientmachines/NeuralNetwork.cpp  +44 -37
paddle/gserver/gradientmachines/NeuralNetwork.h  +11 -11
paddle/gserver/gradientmachines/ParallelNeuralNetwork.cpp  +10 -8
paddle/gserver/gradientmachines/ParallelNeuralNetwork.h  +12 -10
paddle/gserver/gradientmachines/RecurrentGradientMachine.cpp  +91 -54
paddle/gserver/gradientmachines/RecurrentGradientMachine.h  +26 -15
paddle/gserver/layers/AddtoLayer.cpp  +0 -1
paddle/gserver/layers/AddtoLayer.h  +12 -12
paddle/gserver/layers/AgentLayer.cpp  +24 -12
paddle/gserver/layers/AgentLayer.h  +14 -10
paddle/gserver/layers/AverageLayer.cpp  +2 -2
paddle/gserver/layers/BatchNormBaseLayer.cpp  +0 -1
paddle/gserver/layers/BatchNormBaseLayer.h  +6 -7
paddle/gserver/layers/BatchNormalizationLayer.cpp  +43 -34
paddle/gserver/layers/BatchNormalizationLayer.h  +0 -1
paddle/gserver/layers/BilinearInterpLayer.cpp  +21 -9
paddle/gserver/layers/BlockExpandLayer.cpp  +49 -18
paddle/gserver/layers/BlockExpandLayer.h  +0 -1
paddle/gserver/layers/CRFDecodingLayer.cpp  +2 -2
paddle/gserver/layers/CRFDecodingLayer.h  +0 -1
paddle/gserver/layers/CRFLayer.cpp  +8 -8
paddle/gserver/layers/CRFLayer.h  +1 -2
paddle/gserver/layers/CTCLayer.cpp  +11 -14
paddle/gserver/layers/CTCLayer.h  +2 -2
paddle/gserver/layers/ConcatenateLayer.cpp  +3 -4
paddle/gserver/layers/ContextProjection.cpp  +30 -16
paddle/gserver/layers/ContextProjection.h  +2 -2
paddle/gserver/layers/ConvBaseLayer.cpp  +21 -28
paddle/gserver/layers/ConvBaseLayer.h  +0 -1
paddle/gserver/layers/ConvOperator.cpp  +76 -25
paddle/gserver/layers/ConvProjection.cpp  +86 -42
paddle/gserver/layers/ConvProjection.h  +10 -3
paddle/gserver/layers/ConvShiftLayer.cpp  +0 -1
paddle/gserver/layers/ConvexCombinationLayer.cpp  +15 -7
paddle/gserver/layers/CosSimLayer.cpp  +6 -4
paddle/gserver/layers/CosSimLayer.h  +1 -3
paddle/gserver/layers/CosSimVecMatLayer.cpp  +37 -16
paddle/gserver/layers/CostLayer.cpp  +70 -49
paddle/gserver/layers/CostLayer.h  +8 -3
paddle/gserver/layers/CudnnBatchNormLayer.cpp  +35 -12
paddle/gserver/layers/CudnnBatchNormLayer.h  +2 -2
paddle/gserver/layers/CudnnConvLayer.cpp  +15 -9
paddle/gserver/layers/CudnnConvLayer.h  +0 -1
paddle/gserver/layers/CudnnPoolLayer.cpp  +18 -5
paddle/gserver/layers/CudnnPoolLayer.h  +6 -7
paddle/gserver/layers/DataLayer.cpp  +7 -6
paddle/gserver/layers/DataLayer.h  +4 -7
paddle/gserver/layers/DataNormLayer.cpp  +22 -11
paddle/gserver/layers/DataNormLayer.h  +0 -1
paddle/gserver/layers/DotMulOperator.cpp  +2 -3
paddle/gserver/layers/DotMulProjection.cpp  +4 -3
paddle/gserver/layers/EosIdCheckLayer.cpp  +1 -2
paddle/gserver/layers/ExpandConvBaseLayer.cpp  +67 -38
paddle/gserver/layers/ExpandConvBaseLayer.h  +1 -2
paddle/gserver/layers/ExpandConvLayer.cpp  +0 -2
paddle/gserver/layers/ExpandConvLayer.h  +2 -3
paddle/gserver/layers/ExpandConvTransLayer.cpp  +1 -3
paddle/gserver/layers/ExpandConvTransLayer.h  +2 -3
paddle/gserver/layers/FeatureMapExpandLayer.cpp  +12 -7
paddle/gserver/layers/FullMatrixProjection.cpp  +0 -1
paddle/gserver/layers/FullMatrixProjection.h  +2 -1
paddle/gserver/layers/FullyConnectedLayer.cpp  +0 -1
paddle/gserver/layers/FullyConnectedLayer.h  +3 -5
paddle/gserver/layers/GatedRecurrentLayer.cpp  +53 -49
paddle/gserver/layers/GatedRecurrentLayer.h  +13 -7
paddle/gserver/layers/GetOutputLayer.cpp  +0 -1
paddle/gserver/layers/GruCompute.cpp  +14 -17
paddle/gserver/layers/GruCompute.h  +3 -2
paddle/gserver/layers/GruStepLayer.cpp  +19 -12
paddle/gserver/layers/HierarchicalSigmoidLayer.cpp  +19 -12
paddle/gserver/layers/HierarchicalSigmoidLayer.h  +7 -8
paddle/gserver/layers/IdentityProjection.cpp  +4 -3
paddle/gserver/layers/InterpolationLayer.cpp  +2 -3
paddle/gserver/layers/Layer.cpp  +31 -23
paddle/gserver/layers/Layer.h  +21 -18
paddle/gserver/layers/LinearChainCRF.cpp  +0 -1
paddle/gserver/layers/LinearChainCRF.h  +2 -2
paddle/gserver/layers/LinearChainCTC.cpp  +9 -6
paddle/gserver/layers/LinearChainCTC.h  +6 -3
paddle/gserver/layers/LstmCompute.cpp  +21 -11
paddle/gserver/layers/LstmCompute.h  +5 -2
paddle/gserver/layers/LstmLayer.cpp  +183 -78
paddle/gserver/layers/LstmLayer.h  +19 -7
paddle/gserver/layers/LstmStepLayer.cpp  +46 -28
paddle/gserver/layers/MDLstmLayer.cpp  +174 -81
paddle/gserver/layers/MaxIdLayer.cpp  +4 -1
paddle/gserver/layers/MaxLayer.cpp  +2 -2
paddle/gserver/layers/MaxLayer.h  +0 -1
paddle/gserver/layers/MixedLayer.cpp  +3 -5
paddle/gserver/layers/MixedLayer.h  +3 -4
paddle/gserver/layers/MultinomialSampler.cpp  +0 -1
paddle/gserver/layers/MultinomialSampler.h  +0 -1
paddle/gserver/layers/MultiplexLayer.cpp  +0 -1
paddle/gserver/layers/NCELayer.cpp  +20 -8
paddle/gserver/layers/NormLayer.cpp  +0 -1
paddle/gserver/layers/NormLayer.h  +2 -3
paddle/gserver/layers/NormProjectionLayer.cpp  +11 -5
paddle/gserver/layers/NormProjectionLayer.h  +0 -1
paddle/gserver/layers/Operator.cpp  +0 -1
paddle/gserver/layers/Operator.h  +4 -3
paddle/gserver/layers/OuterProdLayer.cpp  +8 -6
paddle/gserver/layers/ParameterReluLayer.cpp  +2 -3
paddle/gserver/layers/ParameterReluLayer.h  +0 -1
paddle/gserver/layers/PoolLayer.cpp  +0 -1
paddle/gserver/layers/PoolLayer.h  +0 -1
paddle/gserver/layers/PoolProjection.cpp  +62 -14
paddle/gserver/layers/PoolProjection.h  +8 -4
paddle/gserver/layers/PoolProjectionLayer.cpp  +8 -3
paddle/gserver/layers/PowerLayer.cpp  +1 -2
paddle/gserver/layers/PrintLayer.cpp  +1 -2
paddle/gserver/layers/Projection.cpp  +2 -2
paddle/gserver/layers/Projection.h  +4 -2
paddle/gserver/layers/RecurrentLayer.cpp  +40 -23
paddle/gserver/layers/RecurrentLayerGroup.cpp  +7 -5
paddle/gserver/layers/ResizeLayer.cpp  +5 -4
paddle/gserver/layers/ScalingLayer.cpp  +1 -2
paddle/gserver/layers/ScalingProjection.cpp  +8 -4
paddle/gserver/layers/SelectiveFullyConnectedLayer.cpp  +78 -35
paddle/gserver/layers/SelectiveFullyConnectedLayer.h  +1 -4
paddle/gserver/layers/SequenceConcatLayer.cpp  +6 -11
paddle/gserver/layers/SequenceLastInstanceLayer.cpp  +0 -1
paddle/gserver/layers/SequencePoolLayer.cpp  +1 -1
paddle/gserver/layers/SequenceReshapeLayer.cpp  +7 -8
paddle/gserver/layers/SequenceToBatch.cpp  +36 -20
paddle/gserver/layers/SequenceToBatch.h  +11 -5
paddle/gserver/layers/SlopeInterceptLayer.cpp  +4 -3
paddle/gserver/layers/SpatialPyramidPoolLayer.cpp  +2 -1
paddle/gserver/layers/SpatialPyramidPoolLayer.h  +6 -3
paddle/gserver/layers/SubSequenceLayer.cpp  +6 -11
paddle/gserver/layers/SumToOneNormLayer.cpp  +1 -2
paddle/gserver/layers/TableProjection.cpp  +2 -2
paddle/gserver/layers/TableProjection.h  +2 -2
paddle/gserver/layers/TensorLayer.cpp  +6 -3
paddle/gserver/layers/TensorLayer.h  +0 -1
paddle/gserver/layers/TransLayer.cpp  +0 -1
paddle/gserver/layers/TransLayer.h  +0 -1
paddle/gserver/layers/TransposedFullMatrixProjection.cpp  +2 -2
paddle/gserver/layers/ValidationLayer.cpp  +5 -3
paddle/gserver/tests/LayerGradUtil.cpp  +135 -57
paddle/gserver/tests/LayerGradUtil.h  +73 -30
paddle/gserver/tests/TestUtil.cpp  +13 -13
paddle/gserver/tests/TestUtil.h  +13 -14
paddle/gserver/tests/test_ActivationGrad.cpp  +2 -2
paddle/gserver/tests/test_ConvTrans.cpp  +193 -193
paddle/gserver/tests/test_Evaluator.cpp  +15 -11
paddle/gserver/tests/test_LayerGrad.cpp  +176 -76
paddle/gserver/tests/test_LinearChainCRF.cpp  +0 -1
paddle/gserver/tests/test_MultinomialSampler.cpp  +1 -3
paddle/gserver/tests/test_NetworkCompare.cpp  +20 -11
paddle/gserver/tests/test_ProtoDataProvider.cpp  +51 -34
paddle/gserver/tests/test_PyDataProvider.cpp  +4 -4
paddle/gserver/tests/test_PyDataProvider2.cpp  +48 -48
paddle/gserver/tests/test_RecurrentGradientMachine.cpp  +20 -12
paddle/gserver/tests/test_RecurrentLayer.cpp  +55 -32
paddle/gserver/tests/test_SelectiveFCLayer.cpp  +92 -66
paddle/math/Allocator.h  +11 -16
paddle/math/BaseMatrix.h  +99 -46
paddle/math/CpuSparseMatrix.cpp  +107 -42
paddle/math/CpuSparseMatrix.h  +30 -18
paddle/math/ExecViaCpu.h  +10 -6
paddle/math/MathFunctions.cpp  +128 -69
paddle/math/MathFunctions.h  +36 -24
paddle/math/MathUtils.cpp  +11 -10
paddle/math/MathUtils.h  +6 -6
paddle/math/Matrix.cpp  +797 -346
paddle/math/Matrix.h  +554 -245
paddle/math/MatrixBitCode.cpp  +47 -29
paddle/math/MemoryHandle.cpp  +3 -8
paddle/math/MemoryHandle.h  +3 -4
paddle/math/PoolAllocator.cpp  +3 -5
paddle/math/PoolAllocator.h  +0 -1
paddle/math/SIMDFunctions.cpp  +10 -8
paddle/math/SIMDFunctions.h  +4 -6
paddle/math/SparseMatrix.cpp  +178 -85
paddle/math/SparseMatrix.h  +46 -23
paddle/math/SparseRowMatrix.cpp  +41 -32
paddle/math/SparseRowMatrix.h  +31 -16
paddle/math/Storage.cpp  +11 -12
paddle/math/Vector.cpp  +75 -57
paddle/math/Vector.h  +19 -26
paddle/math/tests/test_Allocator.cpp  +4 -4
paddle/math/tests/test_ExecViaCpu.cpp  +16 -6
paddle/math/tests/test_FPException.cpp  +5 -6
paddle/math/tests/test_SIMDFunctions.cpp  +4 -6
paddle/math/tests/test_batchTranspose.cpp  +2 -3
paddle/math/tests/test_matrix.cpp  +95 -41
paddle/math/tests/test_matrixCompare.cpp  +302 -164
paddle/math/tests/test_matrixUtil.h  +3 -3
paddle/math/tests/test_perturbation.cpp  +120 -54
paddle/math/tests/test_sparseMatrixCompare.cpp  +1 -1
paddle/parameter/Argument.cpp  +136 -83
paddle/parameter/Argument.h  +25 -14
paddle/parameter/AverageOptimizer.cpp  +26 -14
paddle/parameter/AverageOptimizer.h  +10 -6
paddle/parameter/FirstOrderOptimizer.cpp  +48 -33
paddle/parameter/FirstOrderOptimizer.h  +29 -19
paddle/parameter/LearningRateScheduler.cpp  +0 -1
paddle/parameter/LearningRateScheduler.h  +4 -4
paddle/parameter/OptimizerFunctions.cpp  +9 -7
paddle/parameter/OptimizerFunctions.h  +2 -2
paddle/parameter/OptimizerWithRegularizer.cpp  +35 -24
paddle/parameter/OptimizerWithRegularizer.h  +12 -7
paddle/parameter/ParallelParameter.cpp  +2 -2
paddle/parameter/ParallelParameter.h  +10 -10
paddle/parameter/Parameter.cpp  +88 -51
paddle/parameter/Parameter.h  +10 -8
paddle/parameter/ParameterOptimizer.cpp  +0 -1
paddle/parameter/ParameterOptimizer.h  +70 -69
paddle/parameter/ParameterUpdateFunctions.cpp  +43 -13
paddle/parameter/ParameterUpdateFunctions.h  +21 -9
paddle/parameter/ParameterUpdaterBase.cpp  +0 -1
paddle/parameter/ParameterUpdaterBase.h  +0 -1
paddle/parameter/ParameterUpdaterHook.cpp  +2 -2
paddle/parameter/ParameterUpdaterHook.h  +0 -1
paddle/parameter/Regularizer.cpp  +3 -3
paddle/parameter/Regularizer.h  +32 -14
paddle/parameter/Weight.cpp  +10 -4
paddle/parameter/tests/test_common.cpp  +43 -29
paddle/pserver/BaseClient.cpp  +0 -1
paddle/pserver/BaseClient.h  +32 -11
paddle/pserver/LightNetwork.cpp  +25 -18
paddle/pserver/LightNetwork.h  +5 -6
paddle/pserver/ParameterClient2.cpp  +74 -41
paddle/pserver/ParameterClient2.h  +64 -33
paddle/pserver/ParameterServer2.cpp  +97 -60
paddle/pserver/ParameterServer2.h  +29 -21
paddle/pserver/ProtoServer.cpp  +2 -3
paddle/pserver/ProtoServer.h  +82 -67
paddle/pserver/RDMANetwork.h  +1 -3
paddle/pserver/SocketChannel.cpp  +32 -15
paddle/pserver/SocketChannel.h  +0 -1
paddle/pserver/SparseParameterDistribution.cpp  +10 -6
paddle/pserver/test/SocketTest.cpp  +2 -2
paddle/pserver/test/test_ParameterServer2.cpp  +8 -6
paddle/pserver/test/test_ProtoServer.cpp  +5 -3
paddle/trainer/ParamUtil.cpp  +11 -10
paddle/trainer/ParamUtil.h  +12 -13
paddle/trainer/ParameterUpdater.cpp  +2 -2
paddle/trainer/ParameterUpdater.h  +6 -7
paddle/trainer/RemoteParameterUpdater.cpp  +27 -17
paddle/trainer/RemoteParameterUpdater.h  +15 -11
paddle/trainer/Tester.cpp  +45 -52
paddle/trainer/Tester.h  +4 -7
paddle/trainer/TesterConfig.h  +0 -1
paddle/trainer/ThreadParameterUpdater.cpp  +25 -24
paddle/trainer/ThreadParameterUpdater.h  +4 -6
paddle/trainer/Trainer.cpp  +81 -74
paddle/trainer/Trainer.h  +9 -10
paddle/trainer/TrainerConfigHelper.cpp  +14 -23
paddle/trainer/TrainerConfigHelper.h  +4 -15
paddle/trainer/TrainerInternal.cpp  +54 -56
paddle/trainer/TrainerInternal.h  +9 -17
paddle/trainer/TrainerInternalConfig.cpp  +2 -1
paddle/trainer/TrainerInternalConfig.h  +4 -9
paddle/trainer/TrainerMain.cpp  +1 -2
paddle/trainer/tests/picojson.h  +14 -6
paddle/trainer/tests/test_Compare.cpp  +2 -2
paddle/trainer/tests/test_CompareSparse.cpp  +21 -17
paddle/trainer/tests/test_CompareTwoNets.cpp  +28 -12
paddle/trainer/tests/test_CompareTwoOpts.cpp  +25 -9
paddle/trainer/tests/test_Prediction.cpp  +8 -5
paddle/trainer/tests/test_PyDataProviderWrapper.cpp  +0 -1
paddle/trainer/tests/test_Trainer.cpp  +4 -2
paddle/trainer/tests/test_TrainerOnePass.cpp  +19 -16
paddle/trainer/tests/test_recurrent_machine_generation.cpp  +14 -6
paddle/utils/BarrierStat.cpp  +11 -7
paddle/utils/BarrierStat.h  +54 -55
paddle/utils/ClassRegistrar.h  +5 -6
paddle/utils/CommandLineParser.cpp  +19 -25
paddle/utils/CommandLineParser.h  +4 -3
paddle/utils/CustomStackTrace.cpp  +19 -17
paddle/utils/CustomStackTrace.h  +12 -15
paddle/utils/DisableCopy.h  +3 -4
paddle/utils/Excepts.cpp  +6 -6
paddle/utils/Flags.cpp  +20 -11
paddle/utils/Flags.h  +0 -1
paddle/utils/GlobalConstants.cpp  +0 -1
paddle/utils/GlobalConstants.h  +4 -5
paddle/utils/Locks.h  +24 -25
paddle/utils/Logging.cpp  +15 -11
paddle/utils/Logging.h  +4 -4
paddle/utils/PythonUtil.cpp  +17 -22
paddle/utils/PythonUtil.h  +19 -32
paddle/utils/Queue.h  +7 -10
paddle/utils/Stat.h  +16 -13
paddle/utils/StringUtil.h  +0 -3
paddle/utils/Thread.h  +31 -23
paddle/utils/ThreadLocal.cpp  +2 -1
paddle/utils/ThreadLocal.h  +1 -4
paddle/utils/TypeDefs.h  +0 -1
paddle/utils/Util.cpp  +37 -33
paddle/utils/Util.h  +23 -31
paddle/utils/Version.cpp  +14 -11
paddle/utils/Version.h  +3 -13
paddle/utils/arch/linux/Locks.cpp  +9 -23
paddle/utils/arch/osx/Locks.cpp  +10 -18
paddle/utils/tests/test_CommandLineParser.cpp  +10 -7
paddle/utils/tests/test_CustomStackTrace.cpp  +25 -18
paddle/utils/tests/test_CustomStackTracePrint.cpp  +1 -1
paddle/utils/tests/test_Logging.cpp  +2 -4
paddle/utils/tests/test_SpinLock.cpp  +13 -11
paddle/utils/tests/test_StringUtils.cpp  +0 -1
paddle/utils/tests/test_Thread.cpp  +14 -17
paddle/utils/tests/test_ThreadBarrier.cpp  +31 -29
paddle/api/Arguments.cpp
浏览文件 @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "PaddleAPI.h"
#include "PaddleAPIPrivate.h"
...
...
@@ -112,7 +111,7 @@ void Arguments::setSlotSequenceStartPositions(size_t idx,
 }
 void Arguments::setSlotSubSequenceStartPositions(
-    size_t idx, IVector* vec) throw(RangeError) {
+    size_t idx, IVector* vec) throw(RangeError) {
   auto& a = m->getArg(idx);
   auto& v = m->cast<paddle::IVector>(vec->getSharedPtr());
   a.subSequenceStartPositions = std::make_shared<paddle::ICpuGpuVector>(v);
...
...
paddle/api/ConfigParser.cpp
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "PaddleAPI.h"
#include "PaddleAPIPrivate.h"
#include "paddle/trainer/Trainer.h"
...
...
@@ -44,8 +43,7 @@ TrainerConfig* TrainerConfig::createFromTrainerConfigFile(
   return retv;
 }
-TrainerConfig* TrainerConfig::createFromProtoString(
-    const std::string& str) {
+TrainerConfig* TrainerConfig::createFromProtoString(const std::string& str) {
   auto retv = new TrainerConfig();
   paddle::TrainerConfig trainerConfigProto;
   auto conf = std::make_shared<paddle::TrainerConfigHelper>(trainerConfigProto);
...
...
paddle/api/GradientMachine.cpp
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "PaddleAPI.h"
#include "PaddleAPIPrivate.h"
...
...
@@ -27,7 +26,8 @@ GradientMachine::GradientMachine() : m(new GradientMachinePrivate()) {}
 GradientMachine::~GradientMachine() { delete m; }
 GradientMachine* GradientMachine::createFromPaddleModelPtr(
-    const void* confPtr, GradientMatchineCreateMode mode,
+    const void* confPtr,
+    GradientMatchineCreateMode mode,
     const std::vector<int>& types) {
   auto& conf = *(const paddle::ModelConfig*)(confPtr);
   std::vector<ParameterType> realTypes;
...
...
@@ -44,7 +44,8 @@ GradientMachine* GradientMachine::createFromPaddleModelPtr(
 }
 GradientMachine* GradientMachine::createByConfigProtoStr(
-    const std::string& protoStr, GradientMatchineCreateMode mode,
+    const std::string& protoStr,
+    GradientMatchineCreateMode mode,
     const std::vector<int>& types) {
   paddle::ModelConfig conf;
   conf.ParseFromString(protoStr);
...
...
@@ -56,13 +57,15 @@ GradientMachine* GradientMachine::createByConfigProtoStr(
 }
 GradientMachine* GradientMachine::createByModelConfig(
-    ModelConfig* conf, GradientMatchineCreateMode mode,
+    ModelConfig* conf,
+    GradientMatchineCreateMode mode,
     const std::vector<int>& types) {
   auto confPtr = &conf->m->conf->getModelConfig();
   return GradientMachine::createFromPaddleModelPtr(confPtr, mode, types);
 }
-void GradientMachine::forward(const Arguments& inArgs, Arguments* outArgs,
+void GradientMachine::forward(const Arguments& inArgs,
+                              Arguments* outArgs,
                               PassType passType) {
   auto& in =
       m->cast<std::vector<paddle::Argument>>(inArgs.getInternalArgumentsPtr());
...
...
@@ -99,7 +102,8 @@ void GradientMachine::backward(const UpdateCallback& callback) {
 }
 void GradientMachine::forwardBackward(const Arguments& inArgs,
-                                      Arguments* outArgs, PassType passType,
+                                      Arguments* outArgs,
+                                      PassType passType,
                                       const UpdateCallback& callback) {
   auto& in =
       m->cast<std::vector<paddle::Argument>>(inArgs.getInternalArgumentsPtr());
...
...
@@ -129,7 +133,7 @@ Parameter* GradientMachine::getParameter(size_t i) throw(RangeError) {
 void GradientMachine::randParameters() { m->machine->randParameters(); }
 Matrix* GradientMachine::getLayerOutput(const std::string& layerName) const
-    throw(UnsupportError) {
+    throw(UnsupportError) {
   auto nn = std::dynamic_pointer_cast<paddle::NeuralNetwork>(m->machine);
   if (nn) {
     auto mat = nn->getLayerOutput(layerName);
...
...
@@ -140,8 +144,11 @@ Matrix* GradientMachine::getLayerOutput(const std::string& layerName) const
 }
 SequenceGenerator* GradientMachine::asSequenceGenerator(
-    const std::vector<std::string>& dict, size_t begin_id, size_t end_id,
-    size_t max_length, size_t beam_size) {
+    const std::vector<std::string>& dict,
+    size_t begin_id,
+    size_t end_id,
+    size_t max_length,
+    size_t beam_size) {
   SequenceGenerator* r =
       SequenceGenerator::createByGradientMachineSharedPtr(&m->machine);
   r->setDict(dict);
...
...
paddle/api/Internal.h
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include "PaddleAPI.h"
...
...
@@ -23,7 +22,8 @@ limitations under the License. */
 template <typename T1, typename T2>
 void staticCastVector(std::vector<T2>* dest, const std::vector<T1>& src) {
   dest->resize(src.size());
-  std::transform(src.begin(), src.end(), dest->begin(),
-                 [](T1 t){ return static_cast<T2>(t); });
+  std::transform(src.begin(), src.end(), dest->begin(), [](T1 t) {
+    return static_cast<T2>(t);
+  });
 }
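Reviewer note on `staticCastVector`: the helper resizes the destination and then converts each element with `static_cast`. Below is a minimal standalone sketch of how it can be called, assuming only the standard library. The function body follows the hunk above; `DemoType` and `toDemoTypes` are hypothetical names introduced purely for illustration and are not part of Paddle.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Same shape as the helper in paddle/api/Internal.h: element-wise static_cast
// from a vector of T1 into a caller-provided vector of T2.
template <typename T1, typename T2>
void staticCastVector(std::vector<T2>* dest, const std::vector<T1>& src) {
  dest->resize(src.size());
  std::transform(src.begin(), src.end(), dest->begin(), [](T1 t) {
    return static_cast<T2>(t);
  });
}

// Hypothetical enum standing in for a type such as ParameterType.
enum class DemoType { kValue = 0, kGradient = 1 };

// Hypothetical wrapper: converts raw ints into DemoType values.
inline std::vector<DemoType> toDemoTypes(const std::vector<int>& raw) {
  std::vector<DemoType> out;
  staticCastVector(&out, raw);  // T1 = int, T2 = DemoType are deduced
  return out;
}
```

Both template arguments are deduced from the call, so callers do not have to spell them out. This is presumably how the `std::vector<int> types` arguments in GradientMachine.cpp end up in `std::vector<ParameterType> realTypes`, although that call site is not shown in this diff.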
paddle/api/Matrix.cpp
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "PaddleAPI.h"
#include "paddle/math/Matrix.h"
#include "paddle/math/SparseMatrix.h"
...
...
@@ -44,17 +43,21 @@ Matrix* Matrix::createZero(size_t height, size_t width, bool useGpu) {
   return m;
 }
-Matrix* Matrix::createDense(const std::vector<float>& data, size_t height,
-                            size_t width, bool useGpu) {
+Matrix* Matrix::createDense(const std::vector<float>& data,
+                            size_t height,
+                            size_t width,
+                            bool useGpu) {
   auto m = new Matrix();
   m->m->mat = paddle::Matrix::create(height, width, useGpu);
   m->m->mat->copyFrom(data.data(), data.size());
   return m;
 }
-Matrix* Matrix::createDenseFromNumpy(float* data, int dim1, int dim2,
-                                     bool copy, bool useGpu)
-    throw(UnsupportError) {
+Matrix* Matrix::createDenseFromNumpy(float* data,
+                                     int dim1,
+                                     int dim2,
+                                     bool copy,
+                                     bool useGpu) throw(UnsupportError) {
   if (useGpu) {
     /// Gpu mode only supports copy=True
     if (!copy) {
...
...
@@ -66,7 +69,9 @@ Matrix* Matrix::createDenseFromNumpy(float* data, int dim1, int dim2,
   }
 }
-Matrix* Matrix::createCpuDenseFromNumpy(float* data, int dim1, int dim2,
+Matrix* Matrix::createCpuDenseFromNumpy(float* data,
+                                        int dim1,
+                                        int dim2,
                                         bool copy) {
   auto m = new Matrix();
   if (copy) {
...
...
@@ -85,12 +90,20 @@ Matrix* Matrix::createGpuDenseFromNumpy(float* data, int dim1, int dim2) {
   return m;
 }
-Matrix* Matrix::createSparse(size_t height, size_t width, size_t nnz,
-                             bool isNonVal, bool isTrans, bool useGpu) {
+Matrix* Matrix::createSparse(size_t height,
+                             size_t width,
+                             size_t nnz,
+                             bool isNonVal,
+                             bool isTrans,
+                             bool useGpu) {
   auto m = new Matrix();
   m->m->mat = paddle::Matrix::createSparseMatrix(
-      height, width, nnz, isNonVal ? paddle::NO_VALUE : paddle::FLOAT_VALUE,
-      isTrans, useGpu);
+      height,
+      width,
+      nnz,
+      isNonVal ? paddle::NO_VALUE : paddle::FLOAT_VALUE,
+      isTrans,
+      useGpu);
   return m;
 }
...
...
@@ -221,7 +234,8 @@ FloatArray Matrix::getData() const {
 }
 void Matrix::sparseCopyFrom(
-    const std::vector<int>& rows, const std::vector<int>& cols,
+    const std::vector<int>& rows,
+    const std::vector<int>& cols,
     const std::vector<float>& vals) throw(UnsupportError) {
   auto cpuSparseMat =
       std::dynamic_pointer_cast<paddle::CpuSparseMatrix>(m->mat);
...
...
@@ -240,7 +254,8 @@ void Matrix::sparseCopyFrom(
 void* Matrix::getSharedPtr() const { return &m->mat; }
-void Matrix::toNumpyMatInplace(float** view_data, int* dim1,
+void Matrix::toNumpyMatInplace(float** view_data,
+                               int* dim1,
                                int* dim2) throw(UnsupportError) {
   auto cpuMat = std::dynamic_pointer_cast<paddle::CpuMatrix>(m->mat);
   if (cpuMat) {
...
...
@@ -251,7 +266,8 @@ void Matrix::toNumpyMatInplace(float** view_data, int* dim1,
     throw UnsupportError();
   }
 }
-void Matrix::copyToNumpyMat(float** view_m_data, int* dim1,
+void Matrix::copyToNumpyMat(float** view_m_data,
+                            int* dim1,
                             int* dim2) throw(UnsupportError) {
   static_assert(sizeof(paddle::real) == sizeof(float),
                 "Currently PaddleAPI only support for single "
...
...
@@ -269,8 +285,8 @@ void Matrix::copyToNumpyMat(float** view_m_data, int* dim1,
   } else if (auto gpuMat = dynamic_cast<paddle::GpuMatrix*>(m->mat.get())) {
     auto src = gpuMat->getData();
     auto dest = *view_m_data;
-    hl_memcpy_device2host(dest, src,
-                          sizeof(paddle::real) * (*dim1) * (*dim2));
+    hl_memcpy_device2host(
+        dest, src, sizeof(paddle::real) * (*dim1) * (*dim2));
   } else {
     LOG(WARNING) << "Unexpected Situation";
     throw UnsupportError();
...
...
@@ -278,7 +294,8 @@ void Matrix::copyToNumpyMat(float** view_m_data, int* dim1,
   }
 }
-void Matrix::copyFromNumpyMat(float* data, int dim1,
+void Matrix::copyFromNumpyMat(float* data,
+                              int dim1,
                               int dim2) throw(UnsupportError, RangeError) {
   if (isSparse()) {
     throw UnsupportError();
...
...
paddle/api/PaddleAPI.h
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include <stddef.h>
...
...
@@ -61,8 +60,8 @@ class RangeError {};
 /// Not support Error, such as access GPU memory directly, etc.
 class UnsupportError : public std::runtime_error {
 public:
-  UnsupportError() : std::runtime_error(" ") {};
-  UnsupportError(const std::string& message) : std::runtime_error(message) {};
+  UnsupportError() : std::runtime_error(" "){};
+  UnsupportError(const std::string& message) : std::runtime_error(message){};
 };
 /// This type will map to python's list of float.
...
...
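Reviewer note on `UnsupportError`: because the class derives from `std::runtime_error`, callers can catch it through the base class and read the message with `what()`. The following is a minimal sketch under stated assumptions: the class mirrors the hunk above (with `explicit` added and the stray trailing semicolons dropped), while `requireCopyOnGpu` and `describeFailure` are hypothetical helpers that imitate the "Gpu mode only supports copy=True" check seen in Matrix.cpp. They are not Paddle functions.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Mirrors the class in the hunk: a runtime_error with an optional message.
class UnsupportError : public std::runtime_error {
public:
  UnsupportError() : std::runtime_error(" ") {}
  explicit UnsupportError(const std::string& message)
      : std::runtime_error(message) {}
};

// Hypothetical stand-in for an API call that rejects an unsupported mode.
inline void requireCopyOnGpu(bool useGpu, bool copy) {
  if (useGpu && !copy) {
    throw UnsupportError("GPU mode only supports copy=true");
  }
}

// Returns the message carried by the exception, or "" if nothing was thrown.
inline std::string describeFailure(bool useGpu, bool copy) {
  try {
    requireCopyOnGpu(useGpu, copy);
  } catch (const std::runtime_error& e) {
    return e.what();
  }
  return "";
}
```

Catching by `const std::runtime_error&` is enough to handle `UnsupportError` here, which is one practical benefit of the inheritance shown in the header.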
@@ -112,7 +111,8 @@ public:
   /**
    * Create A Matrix with height,width, which is filled by zero.
    */
-  static Matrix* createZero(size_t height, size_t width,
+  static Matrix* createZero(size_t height,
+                            size_t width,
                             bool useGpu = isUsingGpu());
   /**
...
...
@@ -124,8 +124,11 @@ public:
    *
    * @note the default sparse type is SPARSE_CSR.
    */
-  static Matrix* createSparse(size_t height, size_t width, size_t nnz,
-                              bool isNonVal = true, bool trans = false,
+  static Matrix* createSparse(size_t height,
+                              size_t width,
+                              size_t nnz,
+                              bool isNonVal = true,
+                              bool trans = false,
                               bool useGpu = isUsingGpu());
   /**
...
...
@@ -134,13 +137,17 @@ public:
    * @param data  list of float should be passed in python.
    * @note        the value will be copy into a new matrix.
    */
-  static Matrix* createDense(const std::vector<float>& data, size_t height,
-                             size_t width, bool useGpu = isUsingGpu());
-  static Matrix* createDenseFromNumpy(float* data, int dim1, int dim2,
-                                      bool copy = true,
-                                      bool useGpu = isUsingGpu())
-      throw(UnsupportError);
+  static Matrix* createDense(const std::vector<float>& data,
+                             size_t height,
+                             size_t width,
+                             bool useGpu = isUsingGpu());
+  static Matrix* createDenseFromNumpy(
+      float* data,
+      int dim1,
+      int dim2,
+      bool copy = true,
+      bool useGpu = isUsingGpu()) throw(UnsupportError);
   /**
    *  Create Cpu Dense Matrix from numpy matrix, dtype=float32
...
...
@@ -151,7 +158,9 @@ public:
    *  @param copy  true if copy into a new matrix, false will create
    *               matrix inplace.
    */
-  static Matrix* createCpuDenseFromNumpy(float* data, int dim1, int dim2,
+  static Matrix* createCpuDenseFromNumpy(float* data,
+                                         int dim1,
+                                         int dim2,
                                          bool copy = false);
   /// Create Gpu Dense Matrix from numpy matrix, dtype=float32
...
...
@@ -171,11 +180,13 @@ public:
    * numpy_mat = m.toNumpyMat()
    * @endcode
    */
-  void toNumpyMatInplace(float** view_data, int* dim1,
+  void toNumpyMatInplace(float** view_data,
+                         int* dim1,
                          int* dim2) throw(UnsupportError);
   /// Copy To numpy mat.
-  void copyToNumpyMat(float** view_m_data, int* dim1,
+  void copyToNumpyMat(float** view_m_data,
+                      int* dim1,
                       int* dim2) throw(UnsupportError);
   /// Copy From Numpy Mat
...
...
@@ -248,15 +259,18 @@ public:
   static Vector* create(const std::vector<float>& data,
                         bool useGpu = isUsingGpu());
-  static Vector* createVectorFromNumpy(float* data, int dim, bool copy = true,
-                                       bool useGpu = isUsingGpu())
-      throw(UnsupportError);
+  static Vector* createVectorFromNumpy(
+      float* data,
+      int dim,
+      bool copy = true,
+      bool useGpu = isUsingGpu()) throw(UnsupportError);
   /**
    * Create Cpu Vector from numpy array, which dtype=float32
    *
    * If copy is false, it will create vector inplace.
    */
-  static Vector* createCpuVectorFromNumpy(float* data, int dim,
+  static Vector* createCpuVectorFromNumpy(float* data,
+                                          int dim,
                                           bool copy = false);
   /// Create Gpu Vector from numpy array, which dtype=float32
...
...
@@ -312,16 +326,19 @@ public:
   static IVector* create(const std::vector<int>& data,
                          bool useGpu = isUsingGpu());
-  static IVector* createVectorFromNumpy(int* data, int dim, bool copy = true,
-                                        bool useGpu = isUsingGpu())
-      throw(UnsupportError);
+  static IVector* createVectorFromNumpy(
+      int* data,
+      int dim,
+      bool copy = true,
+      bool useGpu = isUsingGpu()) throw(UnsupportError);
   /**
    * Create Cpu IVector from numpy array, which dtype=int32
    *
    * If copy is false, it will create vector inplace
    */
-  static IVector* createCpuVectorFromNumpy(int* data, int dim,
+  static IVector* createCpuVectorFromNumpy(int* data,
+                                           int dim,
                                            bool copy = false);
   /**
    * Create Gpu IVector from numpy array, which dtype=int32
...
...
@@ -605,7 +622,8 @@ class ParameterTraverseCallback {
 public:
   ~ParameterTraverseCallback();
-  void apply(const std::vector<Vector*>& vecs, const ParameterConfig& config,
+  void apply(const std::vector<Vector*>& vecs,
+             const ParameterConfig& config,
              size_t sparseId);
 private:
...
...
@@ -638,7 +656,8 @@ public:
   void finishBatch();
-  void update(const std::vector<Vector*>& vecs, const ParameterConfig& conf,
+  void update(const std::vector<Vector*>& vecs,
+              const ParameterConfig& conf,
               size_t sparseId = NO_SPARSE_ID);
   std::vector<int> getParameterTypes() const;
...
...
@@ -678,7 +697,8 @@ public:
    * model config by TrainerConfig
    */
   static GradientMachine* createByModelConfig(
-      ModelConfig* conf, GradientMatchineCreateMode mode = CREATE_MODE_NORMAL,
+      ModelConfig* conf,
+      GradientMatchineCreateMode mode = CREATE_MODE_NORMAL,
       const std::vector<int>& parameterTypes = defaultParamTypes);
   /**
...
...
@@ -701,7 +721,8 @@ public:
   /**
    * Combine forward/backward
    */
-  void forwardBackward(const Arguments& inArgs, Arguments* outArgs,
+  void forwardBackward(const Arguments& inArgs,
+                       Arguments* outArgs,
                        PassType passType,
                        const UpdateCallback& callback = UpdateCallback());
...
...
@@ -722,14 +743,17 @@ public:
    */
   SequenceGenerator* asSequenceGenerator(
       const std::vector<std::string>& dict = std::vector<std::string>(),
-      size_t begin_id = 0UL, size_t end_id = 0UL, size_t max_length = 100UL,
+      size_t begin_id = 0UL,
+      size_t end_id = 0UL,
+      size_t max_length = 100UL,
       size_t beam_size = -1UL);
 private:
   GradientMachinePrivate* m;
   static GradientMachine* createFromPaddleModelPtr(
-      const void* confPtr, GradientMatchineCreateMode mode,
+      const void* confPtr,
+      GradientMatchineCreateMode mode,
       const std::vector<int>& types);
   // Not to use c++ 11 init-list, so we use static var as function default arg.
...
...
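Reviewer note on `size_t beam_size = -1UL`: applying unary minus to the unsigned literal `1UL` wraps around, so the default is the largest `unsigned long` value rather than a negative number. A small sketch of that language rule follows; `defaultBeamSize` and `isDefaultBeam` are hypothetical names for illustration only, and how Paddle interprets this default internally is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical function that only echoes a default written like the one above.
inline std::size_t defaultBeamSize(std::size_t beam_size = -1UL) {
  return beam_size;
}

// Hypothetical check for "caller left the default in place".
inline bool isDefaultBeam(std::size_t beam_size) {
  return beam_size == static_cast<std::size_t>(-1UL);
}
```

On platforms where `size_t` is wider than `unsigned long`, the default is still the maximum `unsigned long`, not the maximum `size_t`; writing `std::numeric_limits<std::size_t>::max()` would state the intent more directly if the largest `size_t` is what is meant.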
@@ -751,8 +775,8 @@ public:
   /// Create A Trainer By TrainerConfig. using paddle command line.
   static Trainer* createByCommandLine() throw(IOError);
-  static Trainer* create(TrainerConfig* optConfig, GradientMachine* gm)
-      throw(IOError);
+  static Trainer* create(TrainerConfig* optConfig,
+                         GradientMachine* gm) throw(IOError);
   /// Start training
   void startTrain();
...
...
paddle/api/Parameter.cpp
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "PaddleAPI.h"
#include "paddle/parameter/Parameter.h"
...
...
paddle/api/ParameterOptimizer.cpp
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "PaddleAPI.h"
#include "PaddleAPIPrivate.h"
#include "paddle/parameter/ParameterOptimizer.h"
...
...
@@ -32,17 +31,21 @@ struct ParameterTraverseCallbackPrivate {
       const paddle::ParameterOptimizer::TraverseCallback& callback)
       : callback(callback) {}
-  void apply(const std::vector<Vector*>& vecs, const ParameterConfig& conf,
+  void apply(const std::vector<Vector*>& vecs,
+             const ParameterConfig& conf,
              size_t sparseId) {
     std::vector<paddle::VectorPtr> real_vecs;
     real_vecs.resize(vecs.size());
-    std::transform(vecs.begin(), vecs.end(), real_vecs.begin(), [](Vector* v) {
-      if (v) {
-        return *(paddle::VectorPtr*)(v->getSharedPtr());
-      } else {
-        return paddle::VectorPtr();
-      }
-    });
+    std::transform(vecs.begin(),
+                   vecs.end(),
+                   real_vecs.begin(),
+                   [](Vector* v) {
+                     if (v) {
+                       return *(paddle::VectorPtr*)(v->getSharedPtr());
+                     } else {
+                       return paddle::VectorPtr();
+                     }
+                   });
     paddle::ParameterConfig& real_conf =
         *(paddle::ParameterConfig*)(const_cast<ParameterConfig&>(conf)
...
...
@@ -86,10 +89,12 @@ void ParameterOptimizer::startBatch(size_t numSamplesProcessed) {
 void ParameterOptimizer::finishBatch() { m->optimizer->finishBatch(); }
 void ParameterOptimizer::update(const std::vector<Vector*>& vecs,
-                                const ParameterConfig& conf, size_t sparseId) {
-  ParameterTraverseCallbackPrivate invoker([&](const paddle::VectorPtr _vecs[],
-      const paddle::ParameterConfig& config, size_t sid = -1UL) {
-    m->optimizer->update(_vecs, config, sid);
-  });
+                                const ParameterConfig& conf,
+                                size_t sparseId) {
+  ParameterTraverseCallbackPrivate invoker(
+      [&](const paddle::VectorPtr _vecs[],
+          const paddle::ParameterConfig& config,
+          size_t sid = -1UL) { m->optimizer->update(_vecs, config, sid); });
   invoker.apply(vecs, conf, sparseId);
 }
...
...
@@ -116,8 +121,9 @@ void ParameterTraverseCallback::apply(const std::vector<Vector*>& vecs,
 ParameterTraverseCallback* ParameterOptimizer::needSpecialTraversal(
     const ParameterConfig& config) const {
-  auto& param_config = *(paddle::ParameterConfig*)const_cast<ParameterConfig&>(
-      config).getRawPtr();
+  auto& param_config =
+      *(paddle::ParameterConfig*)const_cast<ParameterConfig&>(config)
+           .getRawPtr();
   auto callback = m->optimizer->needSpecialTraversal(param_config);
   if (callback) {
     auto retCallback = new ParameterTraverseCallback();
...
...
paddle/api/SequenceGenerator.cpp
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "PaddleAPI.h"
#include "paddle/gserver/gradientmachines/GradientMachine.h"
#include "paddle/parameter/Argument.h"
...
...
@@ -42,8 +41,10 @@ struct Path {
 // position
 static void findNBest(paddle::GradientMachine* gradMachine,
                       std::vector<paddle::Argument>& inArgs,
-                      std::vector<Path>& finalPaths, size_t bos_id,
-                      size_t eos_id, size_t max_length) {
+                      std::vector<Path>& finalPaths,
+                      size_t bos_id,
+                      size_t eos_id,
+                      size_t max_length) {
   std::vector<Path> paths;
   Path emptyPath;
   paths.push_back(emptyPath);
...
...
@@ -166,7 +167,8 @@ public:
     if (id < getSize()) {
       Path& p = (*path_)[id];
       std::ostringstream sout;
-      std::transform(p.ids.begin(), p.ids.end(),
+      std::transform(p.ids.begin(),
+                     p.ids.end(),
                      std::ostream_iterator<std::string>(sout, split ? " " : ""),
                      [&](int id) { return (*dict_)[id]; });
       return sout.str();
...
...
paddle/api/Trainer.cpp
View file @ ebbe6e1a
...
...
@@ -64,12 +64,11 @@ Trainer* Trainer::createByCommandLine() throw(IOError) {
 Trainer::Trainer(TrainerConfig* config, GradientMachine* gm)
     : m(new TrainerPrivate()) {
-  m->init(config->m->conf, /* testing= */ false, gm ? gm->m->machine : nullptr);
+  m->init(config->m->conf, /* testing= */ false, gm ? gm->m->machine : nullptr);
 }
-Trainer* Trainer::create(TrainerConfig* config, GradientMachine* gm)
-    throw(IOError) {
+Trainer* Trainer::create(TrainerConfig* config,
+                         GradientMachine* gm) throw(IOError) {
   auto retv = new Trainer(config, gm);
   if (retv->m->getConfig().IsInitialized()) {
     return retv;
...
...
@@ -134,15 +133,17 @@ void Trainer::finishTestPeriod() { m->finishTestPeriod(); }
 Matrix* Trainer::getLayerOutput(const std::string& layerName) {
   auto nn = std::dynamic_pointer_cast<paddle::NeuralNetwork>(
-      this->m->getGradientMachine());
+      this->m->getGradientMachine());
   CHECK(nn) << "trainerInternal_.getGradientMachine() is not NeuralNetwork";
   auto m = nn->getLayerOutput(layerName);
   return Matrix::createByPaddleMatrixPtr(&m);
 }
-void Trainer::forwardOneBatch(size_t batchSize) { m->forwardOneBatch(batchSize); }
+void Trainer::forwardOneBatch(size_t batchSize) {
+  m->forwardOneBatch(batchSize);
+}
-bool TrainerPrivate::forwardOneBatch(size_t batchSize) {
+bool TrainerPrivate::forwardOneBatch(size_t batchSize) {
   CHECK(dataProvider_) << "data_provider is not specified";
   paddle::DataBatch dataBatch;
   int num = dataProvider_->getNextBatch(batchSize, &dataBatch);
...
...
@@ -156,7 +157,6 @@ bool TrainerPrivate::forwardOneBatch(size_t batchSize) {
 void TrainerPrivate::forwardOneDataBatch(
     const std::vector<paddle::Argument>& inArgs) {
   std::vector<paddle::Argument>& outArgs = forwardOutput_;
   if (config_->getOptConfig().use_sparse_remote_updater()) {
...
...
paddle/api/Util.cpp
View file @ ebbe6e1a
...
...
@@ -37,13 +37,15 @@ FloatArray::FloatArray(const float* b, const size_t l)
 IntArray::IntArray(const int* b, const size_t l, bool f)
     : buf(b), length(l), needFree(f) {}
-IntWithFloatArray::IntWithFloatArray(const float* v, const int* i, size_t l,
+IntWithFloatArray::IntWithFloatArray(const float* v,
+                                     const int* i,
+                                     size_t l,
                                      bool f)
     : valBuf(v), idxBuf(i), length(l), needFree(f) {}
-bool isUsingGpu() { return FLAGS_use_gpu; }
+bool isUsingGpu() { return FLAGS_use_gpu; }
-void setUseGpu(bool useGpu) { FLAGS_use_gpu = useGpu; }
+void setUseGpu(bool useGpu) { FLAGS_use_gpu = useGpu; }
 bool isGpuVersion() {
 #ifdef PADDLE_ONLY_CPU
...
...
paddle/api/Vector.cpp
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "PaddleAPI.h"
#include "paddle/math/Vector.h"
...
...
@@ -39,8 +38,10 @@ IVector* IVector::create(const std::vector<int>& data, bool useGpu) {
   return v;
 }
-IVector* IVector::createVectorFromNumpy(int* data, int dim, bool copy,
-                                        bool useGpu) throw(UnsupportError){
+IVector* IVector::createVectorFromNumpy(int* data,
+                                        int dim,
+                                        bool copy,
+                                        bool useGpu) throw(UnsupportError) {
   if (useGpu) {
     /// if use gpu only copy=true is supported
     if (!copy) {
...
...
@@ -137,8 +138,8 @@ void IVector::copyToNumpyArray(int** view_m_data, int* dim1) {
   if (auto cpuVec = dynamic_cast<paddle::CpuIVector*>(m->vec.get())) {
     std::memcpy(*view_m_data, cpuVec->getData(), sizeof(int) * (*dim1));
   } else if (auto gpuVec = dynamic_cast<paddle::GpuIVector*>(m->vec.get())) {
-    hl_memcpy_device2host(*view_m_data, gpuVec->getData(),
-                          sizeof(int) * (*dim1));
+    hl_memcpy_device2host(
+        *view_m_data, gpuVec->getData(), sizeof(int) * (*dim1));
   } else {
     LOG(INFO) << "Unexpected situation";
   }
...
...
@@ -201,8 +202,10 @@ Vector* Vector::createByPaddleVectorPtr(void* ptr) {
   }
 }
-Vector* Vector::createVectorFromNumpy(float* data, int dim, bool copy,
-                                      bool useGpu) throw(UnsupportError){
+Vector* Vector::createVectorFromNumpy(float* data,
+                                      int dim,
+                                      bool copy,
+                                      bool useGpu) throw(UnsupportError) {
   if (useGpu) {
     /// if use gpu only copy=True is supported
     if (!copy) {
...
...
@@ -251,8 +254,8 @@ void Vector::copyToNumpyArray(float** view_m_data, int* dim1) {
   if (auto cpuVec = dynamic_cast<paddle::CpuVector*>(m->vec.get())) {
     std::memcpy(*view_m_data, cpuVec->getData(), sizeof(float) * (*dim1));
   } else if (auto gpuVec = dynamic_cast<paddle::CpuVector*>(m->vec.get())) {
-    hl_memcpy_device2host(*view_m_data, gpuVec->getData(),
-                          sizeof(float) * (*dim1));
+    hl_memcpy_device2host(
+        *view_m_data, gpuVec->getData(), sizeof(float) * (*dim1));
   } else {
     LOG(INFO) << "Unexpected situation";
   }
...
...
paddle/cuda/include/hl_activation_functions.h
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_ACTIVATION_FUNCTIONS_H_
#define HL_ACTIVATION_FUNCTIONS_H_
...
...
@@ -21,11 +20,8 @@ limitations under the License. */
 /**
  * Active functions: sigmoid, relu, tanh and linear.
  */
-#define HPPL_ACTIVE_FUNCTION {hppl::sigmoid, \
-                              hppl::relu,    \
-                              hppl::tanh,    \
-                              hppl::linear   \
-                             }
+#define HPPL_ACTIVE_FUNCTION \
+  { hppl::sigmoid, hppl::relu, hppl::tanh, hppl::linear }
 namespace hppl {
...
...
@@ -42,18 +38,18 @@ public:
 #ifdef __NVCC__
 namespace gpu {
-static __device__ Active<real>::forward forward[] = HPPL_ACTIVE_FUNCTION;
+static __device__ Active<real>::forward forward[] = HPPL_ACTIVE_FUNCTION;
 static __device__ Active<real>::backward backward[] = HPPL_ACTIVE_FUNCTION;
 }
 #else
 namespace cpu {
-static Active<real>::forward forward[] = HPPL_ACTIVE_FUNCTION;
+static Active<real>::forward forward[] = HPPL_ACTIVE_FUNCTION;
 static Active<real>::backward backward[] = HPPL_ACTIVE_FUNCTION;
 }
 #ifdef __AVX__
 namespace avx {
-static Active<__m256>::forward forward[] = HPPL_ACTIVE_FUNCTION;
+static Active<__m256>::forward forward[] = HPPL_ACTIVE_FUNCTION;
 static Active<__m256>::backward backward[] = HPPL_ACTIVE_FUNCTION;
 }
 #endif
...
...
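Reviewer note on `HPPL_ACTIVE_FUNCTION`: the macro expands to an initializer list in the order sigmoid, relu, tanh, linear, which matches the numeric order of `hl_activation_mode_t` in hl_base.h. That ordering suggests the arrays are meant to be indexed by the enum value. The sketch below illustrates the pattern on the CPU with plain `float`; the `demo` namespace, its scalar functions, and `activate` are hypothetical stand-ins rather than the real `hppl::` implementations.

```cpp
#include <cassert>
#include <cmath>

// Enum values copied from hl_activation_mode_t in hl_base.h.
typedef enum {
  HL_ACTIVATION_SIGMOID = 0,
  HL_ACTIVATION_RELU = 1,
  HL_ACTIVATION_TANH = 2,
  HL_ACTIVATION_LINEAR = 3,
  HL_ACTIVATION_END
} hl_activation_mode_t;

namespace demo {  // hypothetical scalar stand-ins for the hppl:: functions
inline float sigmoid(float a) { return 1.0f / (1.0f + std::exp(-a)); }
inline float relu(float a) { return a > 0.0f ? a : 0.0f; }
inline float tanh(float a) { return std::tanh(a); }
inline float linear(float a) { return a; }

typedef float (*forward_fn)(float);

// Same ordering as HPPL_ACTIVE_FUNCTION, so the enum value is the index.
static const forward_fn forward[] = {sigmoid, relu, tanh, linear};

// Dispatches through the table instead of a switch statement.
inline float activate(hl_activation_mode_t mode, float x) {
  return forward[mode](x);
}
}  // namespace demo
```

The table only stays correct while the enum and the initializer list keep the same order, so any new activation would need to be added to both in the same position.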
paddle/cuda/include/hl_aggregate.h
View file @ ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_AGGREGATE_H_
#define HL_AGGREGATE_H_
...
...
paddle/cuda/include/hl_avx_functions.h
View file @ ebbe6e1a
...
...
@@ -12,22 +12,21 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_AVX_FUNCTIONS_H_
#define HL_AVX_FUNCTIONS_H_
#include <immintrin.h>
 namespace hppl {
-  __m256 relu(const __m256 a);
-  __m256 sigmoid(const __m256 a);
-  __m256 tanh(const __m256 a);
-  __m256 linear(const __m256 a);
-  __m256 relu(const __m256 a, const __m256 b);
-  __m256 sigmoid(const __m256 a, const __m256 b);
-  __m256 tanh(const __m256 a, const __m256 b);
-  __m256 linear(const __m256 a, const __m256 b);
+__m256 relu(const __m256 a);
+__m256 sigmoid(const __m256 a);
+__m256 tanh(const __m256 a);
+__m256 linear(const __m256 a);
+__m256 relu(const __m256 a, const __m256 b);
+__m256 sigmoid(const __m256 a, const __m256 b);
+__m256 tanh(const __m256 a, const __m256 b);
+__m256 linear(const __m256 a, const __m256 b);
 }  // namespace hppl
 #endif  // HL_AVX_FUNCTIONS_H_
paddle/cuda/include/hl_base.h
View file @ ebbe6e1a
...
...
@@ -12,8 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_BASE_H_
#define HL_BASE_H_
...
...
@@ -33,36 +31,36 @@ limitations under the License. */
  * HPPL_STREAM_DEFAULT is HPPL default stream.
  */
 typedef enum {
-    HPPL_STREAM_DEFAULT = 0, /* Thread Default Stream*/
-    HPPL_STREAM_1 = 1,
-    HPPL_STREAM_2 = 2,
-    HPPL_STREAM_3 = 3,
-    HPPL_STREAM_4 = 4,
-    HPPL_THREAD_STREAM_1 = 5,
-    HPPL_THREAD_STREAM_2 = 6,
-    HPPL_THREAD_STREAM_3 = 7,
-    HPPL_THREAD_STREAM_4 = 8,
-    HPPL_STREAM_END
+  HPPL_STREAM_DEFAULT = 0, /* Thread Default Stream*/
+  HPPL_STREAM_1 = 1,
+  HPPL_STREAM_2 = 2,
+  HPPL_STREAM_3 = 3,
+  HPPL_STREAM_4 = 4,
+  HPPL_THREAD_STREAM_1 = 5,
+  HPPL_THREAD_STREAM_2 = 6,
+  HPPL_THREAD_STREAM_3 = 7,
+  HPPL_THREAD_STREAM_4 = 8,
+  HPPL_STREAM_END
 } hl_stream_t;
 /**
  * @brief HPPL activation mode.
  */
 typedef enum {
-    HL_ACTIVATION_SIGMOID = 0,
-    HL_ACTIVATION_RELU = 1,
-    HL_ACTIVATION_TANH = 2,
-    HL_ACTIVATION_LINEAR = 3,
-    HL_ACTIVATION_END
+  HL_ACTIVATION_SIGMOID = 0,
+  HL_ACTIVATION_RELU = 1,
+  HL_ACTIVATION_TANH = 2,
+  HL_ACTIVATION_LINEAR = 3,
+  HL_ACTIVATION_END
 } hl_activation_mode_t;
 /**
  * @brief Transpose type.
  */
 typedef enum {
-    HPPL_OP_N = 0, /* transpose */
-    HPPL_OP_T = 1, /* non transpose */
-    HPPL_OP_END
+  HPPL_OP_N = 0, /* transpose */
+  HPPL_OP_T = 1, /* non transpose */
+  HPPL_OP_END
 } hl_trans_op_t;
 /**
...
...
@@ -148,23 +146,21 @@ typedef struct {
  * @brief Sparse matrix value type.
  */
 typedef enum {
-    HL_NO_VALUE = 0, /* matrix values only 0 or 1 */
-    HL_FLOAT_VALUE = 1,
-    HL_VALUE_END
+  HL_NO_VALUE = 0, /* matrix values only 0 or 1 */
+  HL_FLOAT_VALUE = 1,
+  HL_VALUE_END
 } hl_matrix_value_t;
 /**
  * @brief HPPL matrix format.
  */
 typedef enum {
-    HL_SPARSE_CSR = 0,
-    HL_SPARSE_CSC = 1,
-    HL_SPARSE_END
+  HL_SPARSE_CSR = 0,
+  HL_SPARSE_CSC = 1,
+  HL_SPARSE_END
 } hl_matrix_format_t;
-typedef struct _hl_matrix_s * hl_matrix_s;
+typedef struct _hl_matrix_s *hl_matrix_s;
 /**
  * @brief HPPL sparse matrix.
...
...
@@ -177,12 +173,12 @@ typedef struct _hl_matrix_s * hl_matrix_s;
  * @param  nnz      nonzero values of sparse matrix.
  */
 typedef struct {
-    hl_matrix_s matrix;
-    hl_matrix_format_t format;
-    hl_matrix_value_t type;
-    int rows;
-    int cols;
-    size_t nnz;
+  hl_matrix_s matrix;
+  hl_matrix_format_t format;
+  hl_matrix_value_t type;
+  int rows;
+  int cols;
+  size_t nnz;
 } _hl_sparse_matrix_s, *hl_sparse_matrix_s;
 #ifndef PADDLE_TYPE_DOUBLE
...
...
@@ -195,7 +191,7 @@ typedef struct {
  *
  * HL_FLOAT_MIN: 1.17549435e-38F
  */
-#define HL_FLOAT_MAX        3.40282347e+38F
+#define HL_FLOAT_MAX 3.40282347e+38F
 /**
  * if real == double
  *
...
...
@@ -203,20 +199,18 @@ typedef struct {
  *
  * HL_FLOAT_MIN: 2.2250738585072014e-308
  */
-#define HL_FLOAT_MIN        1.17549435e-38F
+#define HL_FLOAT_MIN 1.17549435e-38F
 #else
-#define HL_FLOAT_MAX        1.7976931348623157e+308
-#define HL_FLOAT_MIN        2.2250738585072014e-308
+#define HL_FLOAT_MAX 1.7976931348623157e+308
+#define HL_FLOAT_MIN 2.2250738585072014e-308
 #endif
 /**
  * The maximum input value for exp, used to avoid overflow problem.
  *
  * Currently only used for tanh function.
  */
-#define EXP_MAX_INPUT       40.0
+#define EXP_MAX_INPUT 40.0
 /**
  * @brief DIVUP(x, y) is similar to ceil(x / y).
...
...
@@ -224,7 +218,7 @@ typedef struct {
  *        the size of blockDim.
  */
 #ifndef DIVUP
-#define DIVUP(x, y) (((x) + (y) - 1) / (y))
+#define DIVUP(x, y) (((x) + (y)-1) / (y))
 #endif
 #ifdef __NVCC__
...
...
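Reviewer note on `DIVUP`: the header comment describes it as similar to `ceil(x / y)`, and mentions the size of `blockDim`. With integer operands the macro rounds the quotient up without going through floating point. The sketch below copies the macro from the hunk above; `blocksFor` is a hypothetical helper used only to show the arithmetic, assuming non-negative `n` and positive `blockSize`.

```cpp
#include <cassert>

// Macro copied from hl_base.h: integer ceiling division for positive operands.
#ifndef DIVUP
#define DIVUP(x, y) (((x) + (y)-1) / (y))
#endif

// Hypothetical helper: number of fixed-size blocks needed to cover n elements.
inline int blocksFor(int n, int blockSize) { return DIVUP(n, blockSize); }
```

For example, covering 1000 elements with blocks of 256 needs 4 blocks, because (1000 + 255) / 256 is 4 in integer arithmetic, while 1025 elements need 5. Note that the macro evaluates `y` twice, so arguments with side effects should be avoided.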
@@ -233,7 +227,7 @@ typedef struct {
 #include "hl_cuda.h"
 #include "cuda_runtime.h"
-extern __thread bool g_sync_flag;
+extern __thread bool g_sync_flag;
 extern __thread cudaStream_t default_stream;
 #define STREAM_DEFAULT default_stream
...
...
@@ -241,16 +235,15 @@ extern __thread cudaStream_t default_stream;
* @brief Check cuda kernel execution.
* @param msg error string
*/
-#define CHECK_SYNC(msg) \
-  if (true == g_sync_flag) { \
-    hl_stream_synchronize(HPPL_STREAM_DEFAULT); \
-    cudaError_t err \
-      = (cudaError_t)hl_get_device_last_error(); \
-    CHECK_EQ(cudaSuccess, err) << "[" << msg << "] " \
-      << "CUDA error: " \
-      << hl_get_device_error_string((size_t)err); \
+#define CHECK_SYNC(msg) \
+  if (true == g_sync_flag) { \
+    hl_stream_synchronize(HPPL_STREAM_DEFAULT); \
+    cudaError_t err = (cudaError_t)hl_get_device_last_error(); \
+    CHECK_EQ(cudaSuccess, err) \
+        << "[" << msg << "] " \
+        << "CUDA error: " << hl_get_device_error_string((size_t)err); \
  }
-#endif /* __NVCC__ */
+#endif /* __NVCC__ */
-#endif /* HL_BASE_H_ */
+#endif /* HL_BASE_H_ */
paddle/cuda/include/hl_batch_transpose.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_BATCH_TRANSPOSE_H_
#define HL_BATCH_TRANSPOSE_H_
...
...
@@ -31,10 +30,7 @@ limitations under the License. */
* order. Each batch has height * width data, which are
* arranged in height-first (or row-first) manner.
*/
-extern void batchTranspose(const real* input, real* output,
-                           int width, int height, int batchSize);
+extern void batchTranspose(const real* input, real* output,
+                           int width, int height, int batchSize);
#endif // HL_BATCH_TRANSPOSE_H_
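The comment above describes each batch as `height * width` values stored row-first. A CPU reference under that reading might look like the sketch below; `batchTransposeRef` is a hypothetical name, `float` stands in for `real`, and the output layout (each batch written as a `width x height` row-major matrix) is an assumption.

```cpp
// CPU reference: for each of batchSize row-major height x width matrices,
// write its width x height transpose to the same batch slot in `output`.
inline void batchTransposeRef(const float* input, float* output,
                              int width, int height, int batchSize) {
  for (int b = 0; b < batchSize; ++b) {
    const float* in = input + b * height * width;
    float* out = output + b * height * width;
    for (int r = 0; r < height; ++r) {
      for (int c = 0; c < width; ++c) {
        out[c * height + r] = in[r * width + c];
      }
    }
  }
}
```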
paddle/cuda/include/hl_cnn.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_CNN_H_
#define HL_CNN_H_
...
...
@@ -37,15 +36,21 @@ limitations under the License. */
* @param[in] alpha
* @param[in] beta
*/
-extern void hl_shrink_col2feature(
-    const real* dataCol, size_t channels, size_t height, size_t width,
-    size_t blockH, size_t blockW, size_t strideH, size_t strideW,
-    size_t paddingH, size_t paddingW, size_t outputH, size_t outputW,
-    real* dataIm, real alpha = 1.0f, real beta = 0.0f);
+extern void hl_shrink_col2feature(
+    const real* dataCol, size_t channels, size_t height, size_t width,
+    size_t blockH, size_t blockW, size_t strideH, size_t strideW,
+    size_t paddingH, size_t paddingW, size_t outputH, size_t outputW,
+    real* dataIm, real alpha = 1.0f, real beta = 0.0f);
/**
* @brief Expand feature to column.
...
...
@@ -65,14 +70,19 @@ extern void hl_shrink_col2feature(
* @param[out] dataCol expand data.
*
*/
-extern void hl_expand_feature2col(
-    const real* dataIm, size_t channels, size_t height, size_t width,
-    size_t blockH, size_t blockW, size_t strideH, size_t strideW,
-    size_t paddingH, size_t paddingW, size_t outputH, size_t outputW,
-    real* dataCol);
+extern void hl_expand_feature2col(
+    const real* dataIm, size_t channels, size_t height, size_t width,
+    size_t blockH, size_t blockW, size_t strideH, size_t strideW,
+    size_t paddingH, size_t paddingW, size_t outputH, size_t outputW,
+    real* dataCol);
/**
* @brief Maximum pool forward.
...
...
@@ -94,15 +104,21 @@ extern void hl_expand_feature2col(
* @param[in] tgtStride stride between output data samples.
*
*/
-extern void hl_maxpool_forward(
-    const int frameCnt, const real* inputData, const int channels,
-    const int height, const int width, const int pooledH, const int pooledW,
-    const int sizeX, const int sizeY, const int strideH, const int strideW,
-    const int paddingH, const int paddingW, real* tgtData,
-    const int tgtStride);
+extern void hl_maxpool_forward(
+    const int frameCnt, const real* inputData, const int channels,
+    const int height, const int width, const int pooledH, const int pooledW,
+    const int sizeX, const int sizeY, const int strideH, const int strideW,
+    const int paddingH, const int paddingW, real* tgtData,
+    const int tgtStride);
/**
* @brief Maximum pool backward.
...
...
@@ -125,20 +141,28 @@ extern void hl_maxpool_forward(
* @param[in] paddingH padding height.
* @param[in] paddingW padding width.
* @param[out] targetGrad output grad.
- * @param[in] outStride stride between output data samples.
+ * @param[in] outStride stride between output data samples.
 *
 */
-extern void hl_maxpool_backward(
-    const int frameCnt, const real* inputData, const real* outData,
-    const real* outGrad, const int channels, const int height,
-    const int width, const int pooledH, const int pooledW, const int sizeX,
-    const int sizeY, const int strideH, const int strideW,
-    const int paddingH, const int paddingW, real scaleA, real scaleB,
-    real* targetGrad, const int outStride);
+extern void hl_maxpool_backward(
+    const int frameCnt, const real* inputData, const real* outData,
+    const real* outGrad, const int channels, const int height,
+    const int width, const int pooledH, const int pooledW, const int sizeX,
+    const int sizeY, const int strideH, const int strideW,
+    const int paddingH, const int paddingW, real scaleA, real scaleB,
+    real* targetGrad, const int outStride);
/**
* @brief Average pool forward.
...
...
@@ -160,15 +184,21 @@ extern void hl_maxpool_backward(
* @param[in] tgtStride stride between output data samples.
*
*/
-extern void hl_avgpool_forward(
-    const int frameCnt, const real* inputData, const int channels,
-    const int height, const int width, const int pooledH, const int pooledW,
-    const int sizeX, const int sizeY, const int strideH, const int strideW,
-    const int paddingH, const int paddingW, real* tgtData,
-    const int tgtStride);
+extern void hl_avgpool_forward(
+    const int frameCnt, const real* inputData, const int channels,
+    const int height, const int width, const int pooledH, const int pooledW,
+    const int sizeX, const int sizeY, const int strideH, const int strideW,
+    const int paddingH, const int paddingW, real* tgtData,
+    const int tgtStride);
/**
* @brief Average pool backward.
...
...
@@ -189,19 +219,26 @@ extern void hl_avgpool_forward(
* @param[in] scaleA scale.
* @param[in] scaleB scale.
* @param[out] backGrad output grad.
- * @param[in] outStride stride between output data samples.
+ * @param[in] outStride stride between output data samples.
 *
 */
-extern void hl_avgpool_backward(
-    const int frameCnt, const real* outGrad, const int channels,
-    const int height, const int width, const int pooledH, const int pooledW,
-    const int sizeX, const int sizeY, const int strideH, const int strideW,
-    int paddingH, int paddingW, real scaleA, real scaleB, real* backGrad,
-    const int outStride);
+extern void hl_avgpool_backward(
+    const int frameCnt, const real* outGrad, const int channels,
+    const int height, const int width, const int pooledH, const int pooledW,
+    const int sizeX, const int sizeY, const int strideH, const int strideW,
+    int paddingH, int paddingW, real scaleA, real scaleB, real* backGrad,
+    const int outStride);
/**
* @brief Cross-map-response normalize forward.
...
...
@@ -218,10 +255,16 @@ extern void hl_avgpool_backward(
* @param[in] beta scale.
*
*/
-extern void hl_CMRNorm_forward(
-    size_t frameCnt, const real* in, real* scale, real* out,
-    size_t channels, size_t height, size_t width, size_t sizeX,
-    real alpha, real beta);
+extern void hl_CMRNorm_forward(
+    size_t frameCnt, const real* in, real* scale, real* out,
+    size_t channels, size_t height, size_t width, size_t sizeX,
+    real alpha, real beta);
/**
* @brief Cross-map-response normalize backward.
...
...
@@ -240,11 +283,18 @@ extern void hl_CMRNorm_forward(
* @param[in] beta scale.
*
*/
-extern void hl_CMRNorm_backward(
-    size_t frameCnt, const real* inV, const real* scale, const real* outV,
-    const real* outDiff, real* inDiff, size_t channels, size_t height,
-    size_t width, size_t sizeX, real alpha, real beta);
+extern void hl_CMRNorm_backward(
+    size_t frameCnt, const real* inV, const real* scale, const real* outV,
+    const real* outDiff, real* inDiff, size_t channels, size_t height,
+    size_t width, size_t sizeX, real alpha, real beta);
/**
* @brief Bilinear interpolation forward.
...
...
@@ -278,24 +328,24 @@ extern void hl_bilinear_forward(const real* inData,
const real ratioH,
const real ratioW);
/**
* @brief Bilinear interpolation backward.
*
* @param[out] inGrad input gradient.
* @param[in] inImgH input image height.
* @param[in] inImgW input image width.
* @param[in] inputH input batchSize.
* @param[in] inputW input image data dim.
* @param[in] outGrad output gradient.
* @param[in] outImgH output image height.
* @param[in] outImgW output image width.
* @param[in] outputH output batchSize.
* @param[in] outputW output image data dim.
* @param[in] numChannels number of channels.
* @param[in] ratioH inImgH / outImgH.
* @param[in] ratioW inImgW / outImgW.
*
*/
/**
* @brief Bilinear interpolation backward.
*
* @param[out] inGrad input gradient.
* @param[in] inImgH input image height.
* @param[in] inImgW input image width.
* @param[in] inputH input batchSize.
* @param[in] inputW input image data dim.
* @param[in] outGrad output gradient.
* @param[in] outImgH output image height.
* @param[in] outImgW output image width.
* @param[in] outputH output batchSize.
* @param[in] outputW output image data dim.
* @param[in] numChannels number of channels.
* @param[in] ratioH inImgH / outImgH.
* @param[in] ratioW inImgW / outImgW.
*
*/
extern void hl_bilinear_backward(real* inGrad,
const size_t inImgH,
const size_t inImgW,
...
...
@@ -321,9 +371,13 @@ extern void hl_bilinear_backward(real* inGrad,
* @param[in] featLen feature length = image height * image width.
* @param[in] groups number of groups.
*/
-extern void hl_maxout_forward(
-    const real* inData, real* outData, int* idData,
-    size_t batchSize, size_t size, size_t featLen, size_t groups);
+extern void hl_maxout_forward(
+    const real* inData, real* outData, int* idData,
+    size_t batchSize, size_t size, size_t featLen, size_t groups);
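The parameter list suggests that each output element is the maximum over `groups` candidate inputs, with the winning group recorded in `idData` for the backward pass. The CPU sketch below follows that reading. `maxoutForwardRef` is a hypothetical name, `float` stands in for `real`, and the input layout (for each output channel, `groups` consecutive feature maps of `featLen` values) is an assumption that this diff does not confirm.

```cpp
#include <cstddef>

// Assumed layout: `size` is the per-sample output length
// (output channels * featLen); each output channel draws from `groups`
// consecutive input feature maps of featLen values each.
inline void maxoutForwardRef(const float* inData, float* outData, int* idData,
                             std::size_t batchSize, std::size_t size,
                             std::size_t featLen, std::size_t groups) {
  for (std::size_t index = 0; index < batchSize * size; ++index) {
    std::size_t batchIdx = index / size;
    std::size_t i = index % size;
    std::size_t channelIdx = i / featLen;
    std::size_t featIdx = i % featLen;
    std::size_t base = (batchIdx * size + channelIdx * featLen) * groups + featIdx;
    float best = inData[base];
    int bestId = 0;
    for (std::size_t g = 1; g < groups; ++g) {
      float v = inData[base + g * featLen];
      if (v > best) {
        best = v;
        bestId = static_cast<int>(g);
      }
    }
    outData[index] = best;  // maximum over the group
    idData[index] = bestId; // which group member won
  }
}
```

A backward pass under the same assumptions would route each `outGrad` element to the input position selected by `idData`.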
/**
* @brief MaxOut backward.
...
...
@@ -336,8 +390,12 @@ extern void hl_maxout_forward(
* @param[in] featLen feature length = image height * image width.
* @param[in] groups number of groups.
*/
-extern void hl_maxout_backward(
-    real* inGrad, const real* outGrad, const int* idData,
-    size_t batchSize, size_t size, size_t featLen, size_t groups);
+extern void hl_maxout_backward(
+    real* inGrad, const real* outGrad, const int* idData,
+    size_t batchSize, size_t size, size_t featLen, size_t groups);
#endif /* HL_CNN_H_ */
paddle/cuda/include/hl_cuda.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_CUDA_H_
#define HL_CUDA_H_
...
...
@@ -22,8 +21,7 @@ limitations under the License. */
/**
* @brief HPPL event.
*/
-typedef struct _hl_event_st *hl_event_t;
+typedef struct _hl_event_st *hl_event_t;
/**
* @brief return cuda runtime api version.
...
...
@@ -42,7 +40,7 @@ extern void hl_start();
* if device is NULL, will start all GPU.
* @param[in] number number of devices.
*/
-extern void hl_specify_devices_start(int *device, int number);
+extern void hl_specify_devices_start(int *device, int number);
/**
* @brief Queries if a device may directly access a peer device's memory.
...
...
@@ -126,7 +124,7 @@ extern int hl_get_device();
*
* @return dest_d pointer to device memory.
*/
-extern void *hl_malloc_device(size_t size);
+extern void *hl_malloc_device(size_t size);
/**
* @brief Free device memory.
...
...
@@ -143,7 +141,7 @@ extern void hl_free_mem_device(void *dest_d);
*
* @return dest_h pointer to host memory.
*/
-extern void *hl_malloc_host(size_t size);
+extern void *hl_malloc_host(size_t size);
/**
* @brief Free host page-lock memory.
...
...
@@ -228,9 +226,9 @@ extern void hl_srand(unsigned int seed);
* @param[in] stream stream id.
*/
extern void hl_memcpy_async(void *dst,
-                            void *src, size_t size, hl_stream_t stream);
+                            void *src, size_t size, hl_stream_t stream);
/**
* @brief Waits for stream tasks to complete.
...
...
@@ -261,8 +259,7 @@ extern void hl_destroy_event(hl_event_t event);
*
* @return time Time between start and end in ms.
*/
-extern float hl_event_elapsed_time(hl_event_t start, hl_event_t end);
+extern float hl_event_elapsed_time(hl_event_t start, hl_event_t end);
/**
* @brief Records an event.
...
...
@@ -300,7 +297,7 @@ extern void hl_set_device_flags_block();
/**
* @brief Returns the last error string from a cuda runtime call.
*/
-extern const char *hl_get_device_error_string();
+extern const char *hl_get_device_error_string();
/**
* @brief Returns the last error string from a cuda runtime call.
...
...
@@ -309,7 +306,7 @@ extern const char* hl_get_device_error_string();
*
* @see hl_get_device_last_error()
*/
-extern const char *hl_get_device_error_string(size_t err);
+extern const char *hl_get_device_error_string(size_t err);
/**
* @brief Returns the last error number.
...
...
paddle/cuda/include/hl_cuda_cublas.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_CUDA_CUBLAS_H_
#define HL_CUDA_CUBLAS_H_
...
...
@@ -29,12 +28,8 @@ limitations under the License. */
* @param[in] ldc the first dimension of C_d.
*
*/
-extern void hl_matrix_transpose(real *A_d, real *C_d,
-                                int dimM, int dimN, int lda, int ldc);
+extern void hl_matrix_transpose(real *A_d, real *C_d,
+                                int dimM, int dimN, int lda, int ldc);
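The role of `lda` and `ldc` can be illustrated with a CPU reference. The sketch assumes row-major storage in which the leading dimension is the distance, in elements, between consecutive rows; `transposeRef` is a hypothetical name and `float` stands in for `real`.

```cpp
// C (dimN x dimM, row stride ldc) = transpose of A (dimM x dimN, row stride lda).
inline void transposeRef(const float* A, float* C,
                         int dimM, int dimN, int lda, int ldc) {
  for (int i = 0; i < dimM; ++i) {
    for (int j = 0; j < dimN; ++j) {
      C[j * ldc + i] = A[i * lda + j];
    }
  }
}
```

With `lda = dimN` and `ldc = dimM` the matrices are densely packed, which matches the shorter overload described next in the header.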
/*
* @brief Matrix transpose, while lda = dimN, ldc = dimM.
...
...
@@ -45,10 +40,7 @@ extern void hl_matrix_transpose(real *A_d,
* @param[in] dimN matrix width.
*
*/
-extern void hl_matrix_transpose(real *A_d, real *C_d, int dimM, int dimN);
+extern void hl_matrix_transpose(real *A_d, real *C_d, int dimM, int dimN);
/*
* @brief Matrix inverse
...
...
@@ -60,11 +52,7 @@ extern void hl_matrix_transpose(real *A_d,
* @param[in] ldc the first dimension of C_d
*
*/
-extern void hl_matrix_inverse(real *A_d, real *C_d, int dimN, int lda, int ldc);
+extern void hl_matrix_inverse(real *A_d, real *C_d, int dimN, int lda, int ldc);
/**
* @brief C_d = alpha*(op(A_d) * op(B_d)) + beta*C_d
...
...
@@ -84,12 +72,19 @@ extern void hl_matrix_inverse(real *A_d,
* @param[in] ldc the first dimension of C_d.
*
*/
-extern void hl_matrix_mul(real *A_d, hl_trans_op_t transa,
-                          real *B_d, hl_trans_op_t transb,
+extern void hl_matrix_mul(real *A_d, hl_trans_op_t transa,
+                          real *B_d, hl_trans_op_t transb,
                          real *C_d,
-                          int dimM, int dimN, int dimK, real alpha, real beta,
-                          int lda, int ldb, int ldc);
+                          int dimM, int dimN, int dimK, real alpha, real beta,
+                          int lda, int ldb, int ldc);
/**
* @brief C_d = alpha*(op(A_d) * op(B_d)) + beta*C_d
...
...
@@ -106,11 +101,16 @@ extern void hl_matrix_mul(real *A_d, hl_trans_op_t transa,
* @param[in] beta scalar used for multiplication.
*
*/
-extern void hl_matrix_mul(real *A_d, hl_trans_op_t transa,
-                          real *B_d, hl_trans_op_t transb,
+extern void hl_matrix_mul(real *A_d, hl_trans_op_t transa,
+                          real *B_d, hl_trans_op_t transb,
                          real *C_d,
-                          int dimM, int dimN, int dimK, real alpha, real beta);
+                          int dimM, int dimN, int dimK, real alpha, real beta);
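The formula in the doc comment, C_d = alpha*(op(A_d) * op(B_d)) + beta*C_d, can be written as a plain CPU reference. In this sketch `gemmRef` is a hypothetical name, the `bool` flags stand in for `hl_trans_op_t`, `float` stands in for `real`, and dense row-major storage is assumed.

```cpp
// C (dimM x dimN) = alpha * op(A) * op(B) + beta * C, where op(A) is
// dimM x dimK and op(B) is dimK x dimN. A true flag means "use the transpose".
inline void gemmRef(const float* A, bool transA, const float* B, bool transB,
                    float* C, int dimM, int dimN, int dimK,
                    float alpha, float beta) {
  for (int i = 0; i < dimM; ++i) {
    for (int j = 0; j < dimN; ++j) {
      float acc = 0.0f;
      for (int k = 0; k < dimK; ++k) {
        // When transposed, A is stored as dimK x dimM and B as dimN x dimK.
        float a = transA ? A[k * dimM + i] : A[i * dimK + k];
        float b = transB ? B[j * dimK + k] : B[k * dimN + j];
        acc += a * b;
      }
      C[i * dimN + j] = alpha * acc + beta * C[i * dimN + j];
    }
  }
}
```

Setting `beta` to 0 overwrites `C`, while `beta = 1` accumulates into it.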
/**
* @brief This function performs the matrix-vector multiplication.
...
...
@@ -132,11 +132,17 @@ extern void hl_matrix_mul(real *A_d, hl_trans_op_t transa,
*
*/
-extern void hl_matrix_mul_vector(real *A_d, hl_trans_op_t trans,
-                                 real *B_d, real *C_d, int dimM, int dimN,
-                                 real alpha, real beta,
-                                 int lda, int incb, int incc);
+extern void hl_matrix_mul_vector(real *A_d, hl_trans_op_t trans,
+                                 real *B_d, real *C_d, int dimM, int dimN,
+                                 real alpha, real beta,
+                                 int lda, int incb, int incc);
/**
* @brief This function performs the matrix-vector multiplication.
...
...
@@ -154,9 +160,13 @@ extern void hl_matrix_mul_vector(real *A_d, hl_trans_op_t trans,
* @param[in] beta scalar used for multiplication.
*
*/
-extern void hl_matrix_mul_vector(real *A_d, hl_trans_op_t trans,
-                                 real *B_d, real *C_d, int dimM, int dimN,
-                                 real alpha, real beta);
+extern void hl_matrix_mul_vector(real *A_d, hl_trans_op_t trans,
+                                 real *B_d, real *C_d, int dimM, int dimN,
+                                 real alpha, real beta);
#endif /* HL_CUDA_CUBLAS_H_ */
paddle/cuda/include/hl_cuda_cudnn.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_CUDA_CUDNN_H_
#define HL_CUDA_CUDNN_H_
...
...
@@ -22,7 +21,7 @@ limitations under the License. */
* hppl pooling mode
*/
typedef enum {
-  HL_POOLING_MAX = 0,
+  HL_POOLING_MAX = 0,
  // average includes padded values
  HL_POOLING_AVERAGE = 1,
// average does not include padded values
...
...
@@ -324,17 +323,16 @@ extern void hl_convolution_forward_add_bias(hl_tensor_descriptor bias,
* @param[in] sizeInBytes gpu workspace size (bytes).
* @param[in] convBwdFilterAlgo backward filter algorithm.
*/
-extern void hl_convolution_backward_filter(
-    hl_tensor_descriptor input, real *input_data,
-    hl_tensor_descriptor output, real *output_grad_data,
-    hl_filter_descriptor filter, real *filter_grad_data,
-    hl_convolution_descriptor conv, void *gpuWorkSpace,
-    size_t sizeInBytes, int convBwdFilterAlgo);
+extern void hl_convolution_backward_filter(
+    hl_tensor_descriptor input, real *input_data,
+    hl_tensor_descriptor output, real *output_grad_data,
+    hl_filter_descriptor filter, real *filter_grad_data,
+    hl_convolution_descriptor conv, void *gpuWorkSpace,
+    size_t sizeInBytes, int convBwdFilterAlgo);
/**
* @brief convolution backward data(calculate input image grad data).
...
...
@@ -350,17 +348,16 @@ extern void hl_convolution_backward_filter(
* @param[in] sizeInBytes gpu workspace size (bytes).
* @param[in] convBwdDataAlgo backward data algorithm.
*/
-extern void hl_convolution_backward_data(
-    hl_tensor_descriptor input, real *input_data_grad,
-    hl_tensor_descriptor output, real *output_grad_data,
-    hl_filter_descriptor filter, real *filter_data,
-    hl_convolution_descriptor conv, void *gpuWorkSpace,
-    size_t sizeInBytes, int convBwdDataAlgo);
+extern void hl_convolution_backward_data(
+    hl_tensor_descriptor input, real *input_data_grad,
+    hl_tensor_descriptor output, real *output_grad_data,
+    hl_filter_descriptor filter, real *filter_data,
+    hl_convolution_descriptor conv, void *gpuWorkSpace,
+    size_t sizeInBytes, int convBwdDataAlgo);
/**
* @brief convolution backward bias(calculate bias grad data).
...
...
@@ -383,8 +380,8 @@ extern void hl_convolution_backward_bias(hl_tensor_descriptor bias,
* @param[in] height matrix height.
* @param[in] width matrix width.
*/
-extern void hl_softmax_forward(real *input, real *output,
+extern void hl_softmax_forward(real *input, real *output,
                               int height,
                               int width);
...
...
@@ -396,8 +393,8 @@ extern void hl_softmax_forward(real *input,
* @param[in] height matrix height.
* @param[in] width matrix width.
*/
-extern void hl_softmax_backward(real *output_value, real *output_grad,
+extern void hl_softmax_backward(real *output_value, real *output_grad,
                                int height,
                                int width);
...
...
@@ -426,18 +423,18 @@ extern void hl_softmax_backward(real *output_value,
*
*/
extern void hl_batch_norm_forward_training(hl_tensor_descriptor inputDesc,
-                                           real *input,
+                                           real *input,
                                           hl_tensor_descriptor outputDesc,
-                                           real *output,
+                                           real *output,
                                           hl_tensor_descriptor bnParamDesc,
-                                           real *scale, real *bias,
+                                           real *scale, real *bias,
                                           double factor,
-                                           real *runningMean, real *runningInvVar,
+                                           real *runningMean, real *runningInvVar,
                                           double epsilon,
-                                           real *savedMean, real *savedVar);
+                                           real *savedMean, real *savedVar);
/**
* @brief cudnn batch norm forward.
...
...
@@ -463,14 +460,14 @@ extern void hl_batch_norm_forward_training(hl_tensor_descriptor inputDesc,
*
*/
extern void hl_batch_norm_forward_inference(hl_tensor_descriptor inputDesc,
-                                            real *input,
+                                            real *input,
                                            hl_tensor_descriptor outputDesc,
-                                            real *output,
+                                            real *output,
                                            hl_tensor_descriptor bnParamDesc,
-                                            real *scale, real *bias,
-                                            real *estimatedMean, real *estimatedVar,
+                                            real *scale, real *bias,
+                                            real *estimatedMean, real *estimatedVar,
                                            double epsilon);
/**
...
...
@@ -483,7 +480,8 @@ extern void hl_batch_norm_forward_inference(hl_tensor_descriptor inputDesc,
* @param[in] inGradDesc input tensor descriptor desc.
* @param[in] inGrad input data.
* @param[in] dBnParamDesc tensor descriptor desc.
- * bnScale, bnBias, running mean/var, save_mean/var.
+ * bnScale, bnBias, running mean/var,
+ * save_mean/var.
* @param[in] scale batch normalization scale parameter (in original
* paper scale is referred to as gamma).
* @param[in] scaleGrad batch normalization scale parameter (in original
...
...
@@ -497,17 +495,17 @@ extern void hl_batch_norm_forward_inference(hl_tensor_descriptor inputDesc,
*
*/
extern void hl_batch_norm_backward(hl_tensor_descriptor inputDesc,
-                                   real *input,
+                                   real *input,
                                   hl_tensor_descriptor outGradDesc,
-                                   real *outGrad,
+                                   real *outGrad,
                                   hl_tensor_descriptor inGradDesc,
-                                   real *inGrad,
+                                   real *inGrad,
                                   hl_tensor_descriptor dBnParamDesc,
-                                   real *scale, real *scaleGrad, real *biasGrad,
+                                   real *scale, real *scaleGrad, real *biasGrad,
                                   double epsilon,
-                                   real *savedMean, real *savedInvVar);
+                                   real *savedMean, real *savedInvVar);
#endif // HL_CUDA_CUDNN_H_
paddle/cuda/include/hl_dso_loader.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_DSO_LOADER_H_
#define HL_DSO_LOADER_H_
...
...
paddle/cuda/include/hl_functions.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_FUNCTIONS_H_
#define HL_FUNCTIONS_H_
...
...
@@ -21,30 +20,30 @@ limitations under the License. */
/**
* sigmoid threshold minimum
*/
-#define SIGMOID_THRESHOLD_MIN -40.0
+#define SIGMOID_THRESHOLD_MIN -40.0
/**
* sigmoid threshold maximum
*/
-#define SIGMOID_THRESHOLD_MAX 13.0
+#define SIGMOID_THRESHOLD_MAX 13.0
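One plausible use of the two thresholds is to clamp the input before evaluating the logistic function, so that `exp` stays in a well-behaved range. The sketch below assumes that usage; `clampedSigmoid` and the `k...` constants are illustrative names and are not taken from the HPPL sources.

```cpp
#include <cmath>

// Local copies of the two thresholds, named to avoid clashing with the macros.
constexpr double kSigmoidMin = -40.0;
constexpr double kSigmoidMax = 13.0;

// Logistic function 1 / (1 + exp(-x)) evaluated on the clamped input.
inline double clampedSigmoid(double a) {
  double x = a < kSigmoidMin ? kSigmoidMin : (a > kSigmoidMax ? kSigmoidMax : a);
  return 1.0 / (1.0 + std::exp(-x));
}
```

Inputs above 13 all map to the same value just below 1, and inputs below -40 map to a tiny positive value rather than exactly 0.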
#ifndef __NVCC__
namespace hppl {
-/*
- * forward activation
- */
-real relu(const real a);
-real sigmoid(const real a);
-real tanh(const real a);
-real linear(const real a);
-/*
- * backward activation
- */
-real relu(const real a, const real b);
-real sigmoid(const real a, const real b);
-real tanh(const real a, const real b);
-real linear(const real a, const real b);
+/*
+ * forward activation
+ */
+real relu(const real a);
+real sigmoid(const real a);
+real tanh(const real a);
+real linear(const real a);
+/*
+ * backward activation
+ */
+real relu(const real a, const real b);
+real sigmoid(const real a, const real b);
+real tanh(const real a, const real b);
+real linear(const real a, const real b);
}  // namespace hppl
#ifdef __AVX__
...
...
paddle/cuda/include/hl_gpu.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_GPU_H_
#define HL_GPU_H_
...
...
paddle/cuda/include/hl_lstm.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_LSTM_H_
#define HL_LSTM_H_
...
...
paddle/cuda/include/hl_matrix.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_MATRIX_H_
#define HL_MATRIX_H_
...
...
@@ -30,13 +29,8 @@ limitations under the License. */
* @param[in] beta scalar used for addition.
*
*/
-extern void hl_matrix_add(real* A_d, real* B_d, real* C_d,
-                          int dimM, int dimN, real alpha, real beta);
+extern void hl_matrix_add(real* A_d, real* B_d, real* C_d,
+                          int dimM, int dimN, real alpha, real beta);
/**
* @brief Matrix Softmax.
*
...
...
@@ -46,7 +40,7 @@ extern void hl_matrix_add(real* A_d,
* @param[in] dimN matrix width.
*
*/
-extern void hl_matrix_softmax(real *A_d, real *C_d, int dimM, int dimN);
+extern void hl_matrix_softmax(real *A_d, real *C_d, int dimM, int dimN);
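For reference, a row-wise softmax over a `dimM x dimN` row-major matrix can be sketched on the CPU as follows. `softmaxRows` is a hypothetical name, `float` stands in for `real`, and subtracting the row maximum before exponentiating is a common stabilisation added here as an assumption; the diff does not show how the GPU kernel handles it.

```cpp
#include <algorithm>
#include <cmath>

// C[i][j] = exp(A[i][j] - max_i) / sum_j exp(A[i][j] - max_i), row by row.
inline void softmaxRows(const float* A, float* C, int dimM, int dimN) {
  for (int i = 0; i < dimM; ++i) {
    const float* in = A + i * dimN;
    float* out = C + i * dimN;
    float rowMax = *std::max_element(in, in + dimN);
    float sum = 0.0f;
    for (int j = 0; j < dimN; ++j) {
      out[j] = std::exp(in[j] - rowMax);
      sum += out[j];
    }
    for (int j = 0; j < dimN; ++j) out[j] /= sum;
  }
}
```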
/**
* @brief Matrix softmax derivative.
...
...
@@ -58,11 +52,8 @@ extern void hl_matrix_softmax(real *A_d, real *C_d, int dimM, int dimN);
* @param[in] dimN matrix width.
*
*/
-extern void hl_matrix_softmax_derivative(real* grad_d, real* output_d,
-                                         real* sftmaxSum_d, int dimM, int dimN);
+extern void hl_matrix_softmax_derivative(real* grad_d, real* output_d,
+                                         real* sftmaxSum_d, int dimM, int dimN);
/**
* @brief Sequence softmax.
...
...
@@ -73,8 +64,8 @@ extern void hl_matrix_softmax_derivative(real* grad_d,
* @param[in] numSequence sequence number.
*
*/
-extern void hl_sequence_softmax_forward(real *A_d, real *C_d,
+extern void hl_sequence_softmax_forward(real *A_d, real *C_d,
                                        const int *index,
                                        int numSequence);
...
...
@@ -88,11 +79,8 @@ extern void hl_sequence_softmax_forward(real *A_d,
* @param[in] dimN matrix width.
*
*/
-extern void hl_matrix_classification_error(real* A_d, int* B_d, real* C_d,
-                                           int dimM, int dimN);
+extern void hl_matrix_classification_error(real* A_d, int* B_d, real* C_d,
+                                           int dimM, int dimN);
/**
* @brief Matrix cross entropy.
...
...
@@ -104,11 +92,8 @@ extern void hl_matrix_classification_error(real* A_d,
* @param[in] dimN matrix width.
*
*/
-extern void hl_matrix_cross_entropy(real* A_d, real* C_d, int* label_d,
-                                    int dimM, int dimN);
+extern void hl_matrix_cross_entropy(real* A_d, real* C_d, int* label_d,
+                                    int dimM, int dimN);
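A CPU sketch of what a per-row cross entropy with integer labels usually computes is given below. It assumes that `A_d` holds one probability row per sample, that `label_d` holds the target class index of each row, and that `C_d` receives one value per row; these semantics are inferred from the parameter names, not confirmed by this diff. `crossEntropyRef` is a hypothetical name and `float` stands in for `real`.

```cpp
#include <cmath>

// C[i] = -log(A[i][label[i]]) for each of the dimM rows of width dimN.
inline void crossEntropyRef(const float* A, float* C, const int* label,
                            int dimM, int dimN) {
  for (int i = 0; i < dimM; ++i) {
    C[i] = -std::log(A[i * dimN + label[i]]);
  }
}
```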
/**
* @brief Matrix cross entropy back propagation.
...
...
@@ -120,11 +105,8 @@ extern void hl_matrix_cross_entropy(real* A_d,
* @param[in] dimN matrix width.
*
*/
-extern void hl_matrix_cross_entropy_bp(real* grad_d, real* output_d,
-                                       int* label_d, int dimM, int dimN);
+extern void hl_matrix_cross_entropy_bp(real* grad_d, real* output_d,
+                                       int* label_d, int dimM, int dimN);
/**
* @brief Matrix multi-binary label cross entropy
...
...
@@ -135,11 +117,8 @@ extern void hl_matrix_cross_entropy_bp(real* grad_d,
* @param[in] dimM matrix height.
* @param[in] dimN matrix width.
*/
-extern void hl_matrix_multi_binary_cross_entropy(
-    real* output, real* entropy, hl_sparse_matrix_s mat, int dimM, int dimN);
+extern void hl_matrix_multi_binary_cross_entropy(
+    real* output, real* entropy, hl_sparse_matrix_s mat, int dimM, int dimN);
/**
* @brief Matrix multi-binary label cross entropy backprop
...
...
@@ -150,11 +129,8 @@ extern void hl_matrix_multi_binary_cross_entropy(real* output,
* @param[in] dimM matrix height.
* @param[in] dimN matrix width.
*/
-extern void hl_matrix_multi_binary_cross_entropy_bp(
-    real* output, real* grad, hl_sparse_matrix_s mat, int dimM, int dimN);
+extern void hl_matrix_multi_binary_cross_entropy_bp(
+    real* output, real* grad, hl_sparse_matrix_s mat, int dimM, int dimN);
/**
* @brief Matrix zero memory.
...
...
@@ -176,12 +152,8 @@ extern void hl_matrix_zero_mem(real* data, int num);
* @param[in] partial_sum
*/
-extern void hl_param_relu_forward(real* output, real* input, real* w,
-                                  int width, int height, int partial_sum);
+extern void hl_param_relu_forward(real* output, real* input, real* w,
+                                  int width, int height, int partial_sum);
/**
* @brief parameter relu backward w
*
...
...
paddle/cuda/include/hl_sequence.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_SEQUENCE_H_
#define HL_SEQUENCE_H_
...
...
@@ -32,7 +31,7 @@ limitations under the License. */
extern void hl_max_sequence_forward(real* input,
                                    const int* sequence,
                                    real* output,
-                                    int* index,
+                                    int* index,
                                    int numSequences,
                                    int dim);
...
...
@@ -46,11 +45,8 @@ extern void hl_max_sequence_forward(real* input,
* @param[in] dim input dimension.
*
*/
-extern void hl_max_sequence_backward(real* outputGrad, int* index,
-                                     real* inputGrad, int numSequences, int dim);
+extern void hl_max_sequence_backward(real* outputGrad, int* index,
+                                     real* inputGrad, int numSequences, int dim);
/**
* @brief Context projection forward.
...
...
@@ -63,7 +59,8 @@ extern void hl_max_sequence_backward(real* outputGrad,
* @param[in] inputDim input sequence dimension.
* @param[in] contextLength context length.
* @param[in] contextStart context start.
- * @param[in] beginPad number of extra timesteps added at the beginning.
+ * @param[in] beginPad number of extra timesteps added at the
+ * beginning.
* @param[in] isPadding trainable padding.
*
*/
...
...
@@ -109,7 +106,8 @@ extern void hl_context_projection_backward_data(real* outputGrad,
* @param[in] totalPad number of extra timesteps.
* @param[in] contextLength context length.
* @param[in] contextStart context start.
- * @param[in] beginPad number of extra timesteps added at the beginning.
+ * @param[in] beginPad number of extra timesteps added at the
+ * beginning.
*
*/
extern void hl_context_projection_backward_weight(real* outputGrad,
...
...
@@ -141,9 +139,9 @@ extern void hl_context_projection_backward_weight(real* outputGrad,
* @param[in] seq2batch copy direction.
*
*/
-extern void hl_sequence2batch_copy(real *batch, real *sequence,
-                                   const int *batchIndex,
+extern void hl_sequence2batch_copy(real *batch, real *sequence,
+                                   const int *batchIndex,
                                   int seqWidth,
                                   int batchCount,
                                   bool seq2batch);
...
...
@@ -167,9 +165,9 @@ extern void hl_sequence2batch_copy(real *batch,
* @param[in] seq2batch copy direction.
*
*/
-extern void hl_sequence2batch_add(real *batch, real *sequence,
-                                  int *batchIndex,
+extern void hl_sequence2batch_add(real *batch, real *sequence,
+                                  int *batchIndex,
                                  int seqWidth,
                                  int batchCount,
                                  bool seq2batch);
...
...
paddle/cuda/include/hl_sparse.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_SPARSE_H_
#define HL_SPARSE_H_
...
...
@@ -31,7 +30,7 @@ limitations under the License. */
*/
extern void hl_malloc_sparse_matrix(hl_sparse_matrix_s *A_d,
                                    hl_matrix_format_t format,
-                                    hl_matrix_value_t value_type,
+                                    hl_matrix_value_t value_type,
                                    int dimM,
                                    int dimN,
                                    int nnz);
...
...
@@ -60,10 +59,10 @@ extern void hl_free_sparse_matrix(hl_sparse_matrix_s A_d);
*
*/
extern void hl_construct_sparse_matrix(hl_sparse_matrix_s *A_d,
-                                       void *dest_d,
+                                       void *dest_d,
                                       size_t size,
                                       hl_matrix_format_t format,
-                                       hl_matrix_value_t value_type,
+                                       hl_matrix_value_t value_type,
                                       int dimM,
                                       int dimN,
                                       int nnz);
...
...
@@ -94,11 +93,11 @@ extern void hl_construct_sparse_matrix(hl_sparse_matrix_s *A_d,
*
*/
extern void hl_construct_sparse_matrix(hl_sparse_matrix_s *A_d,
-                                       real *value_d, int *rows_d, int *cols_d,
+                                       real *value_d, int *rows_d, int *cols_d,
                                       hl_matrix_format_t format,
-                                       hl_matrix_value_t value_type,
+                                       hl_matrix_value_t value_type,
                                       int dimM,
                                       int dimN,
                                       int nnz);
...
...
@@ -259,10 +258,14 @@ extern void hl_matrix_csr_mul_dense(hl_sparse_matrix_s A_d,
*/
extern
void
hl_matrix_csc_mul_dense
(
hl_sparse_matrix_s
A_d
,
hl_trans_op_t
transa
,
real
*
B_d
,
hl_trans_op_t
transb
,
real
*
B_d
,
hl_trans_op_t
transb
,
real
*
C_d
,
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
);
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
);
/**
* @brief C_d = alpha*(op(A_d) * op(B_d)) + beta*C_d.
...
...
@@ -311,11 +314,16 @@ extern void hl_matrix_dense_mul_csc(real *A_d,
* @note transb does not support HPPL_OP_T.
*
*/
extern
void
hl_sparse_matrix_mul
(
real
*
A_d
,
hl_trans_op_t
transa
,
real
*
B_d
,
hl_trans_op_t
transb
,
extern
void
hl_sparse_matrix_mul
(
real
*
A_d
,
hl_trans_op_t
transa
,
real
*
B_d
,
hl_trans_op_t
transb
,
hl_sparse_matrix_s
C_d
,
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
);
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
);
/**
* @brief C_d = alpha*(op(A_d) * op(B_d)) + beta*C_d
...
...
@@ -336,12 +344,16 @@ extern void hl_sparse_matrix_mul(real* A_d, hl_trans_op_t transa,
* @note transa does not support HPPL_OP_T.
*
*/
extern
void
hl_matrix_dense_mul_csr
(
real
*
A_d
,
hl_trans_op_t
transa
,
extern
void
hl_matrix_dense_mul_csr
(
real
*
A_d
,
hl_trans_op_t
transa
,
hl_sparse_matrix_s
B_d
,
hl_trans_op_t
transb
,
real
*
C_d
,
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
);
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
);
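The multiply declarations in this header document the update `C_d = alpha*(op(A_d) * op(B_d)) + beta*C_d`. As a rough illustration of that formula only, here is a dense CPU reference with no transposition. The name `dense_gemm_ref`, the row-major layout, and the use of `std::vector<float>` are assumptions made for this sketch and are not part of the Paddle API; the real routines operate on device pointers and sparse handles.

```cpp
#include <vector>

// Reference update C = alpha * (A * B) + beta * C for row-major dense
// matrices with no transposition: A is M x K, B is K x N, C is M x N.
// Hypothetical helper for illustration; not a Paddle function.
inline void dense_gemm_ref(const std::vector<float>& A,
                           const std::vector<float>& B,
                           std::vector<float>& C,
                           int M, int N, int K,
                           float alpha, float beta) {
  for (int i = 0; i < M; ++i) {
    for (int j = 0; j < N; ++j) {
      float acc = 0.0f;
      for (int k = 0; k < K; ++k) {
        acc += A[i * K + k] * B[k * N + j];
      }
      // Scale the product by alpha and blend in the old C value via beta.
      C[i * N + j] = alpha * acc + beta * C[i * N + j];
    }
  }
}
```

With `beta` set to 0 the previous contents of `C` are discarded; with `beta` set to 1 the product is accumulated into `C`.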
/**
* @brief Memcpy csc_matrix to host.
...
...
@@ -412,7 +424,6 @@ extern void hl_memcpy_from_csr_matrix(real *csr_val,
hl_sparse_matrix_s
csr_matrix
,
hl_stream_t
stream
);
/**
* @brief A_d[j] += B_d[i,j] for i in range(height)
*
...
...
@@ -423,19 +434,13 @@ extern void hl_memcpy_from_csr_matrix(real *csr_val,
* @param[in] scale scale of B_d
*
*/
extern
void
hl_sparse_matrix_column_sum
(
real
*
A_d
,
hl_sparse_matrix_s
B_d
,
int
dimM
,
int
dimN
,
real
scale
);
extern
void
hl_sparse_matrix_column_sum
(
real
*
A_d
,
hl_sparse_matrix_s
B_d
,
int
dimM
,
int
dimN
,
real
scale
);
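The brief for `hl_sparse_matrix_column_sum` reads `A_d[j] += B_d[i,j] for i in range(height)`. A minimal CPU sketch of that accumulation over a CSR layout follows. It assumes zero-based row offsets and column indices, and it assumes that `scale` multiplies each value of B (the source only says "scale of B_d"). `csr_column_sum` and its parameters are illustrative names, not Paddle functions.

```cpp
#include <cstddef>
#include <vector>

// a[j] += scale * B[i][j] for every stored entry of a CSR matrix B.
// row_ptr holds height + 1 offsets; cols and vals hold the non-zeros.
// Hypothetical helper for illustration; not a Paddle function.
inline void csr_column_sum(std::vector<float>& a,
                           const std::vector<int>& row_ptr,
                           const std::vector<int>& cols,
                           const std::vector<float>& vals,
                           float scale) {
  for (std::size_t i = 0; i + 1 < row_ptr.size(); ++i) {
    // Entries of row i live in [row_ptr[i], row_ptr[i + 1]).
    for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
      a[cols[k]] += scale * vals[k];
    }
  }
}
```

Only stored entries are visited, so the cost is proportional to the number of non-zeros rather than to `dimM * dimN`.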
/**
* @brief implementation of csr sparse matrix in hl_sparse_matrix_column_sum
*/
extern
void
hl_matrix_csr_column_sum
(
real
*
A_d
,
hl_sparse_matrix_s
B_d
,
int
dimM
,
int
dimN
,
real
scale
);
extern
void
hl_matrix_csr_column_sum
(
real
*
A_d
,
hl_sparse_matrix_s
B_d
,
int
dimM
,
int
dimN
,
real
scale
);
/**
* @brief A_d[i,j] += B_d[j]
...
...
@@ -446,13 +451,13 @@ extern void hl_matrix_csr_column_sum(real* A_d,
*
*/
extern
void
hl_sparse_matrix_add_bias
(
hl_sparse_matrix_s
A_d
,
real
*
B_d
,
real
*
B_d
,
real
scale
);
/**
* @brief implementation of csr sparse matrix in hl_sparse_matrix_add_bias
*/
extern
void
hl_matrix_csr_add_bias
(
hl_sparse_matrix_s
A_d
,
real
*
B_d
,
real
*
B_d
,
real
scale
);
/**
...
...
@@ -470,7 +475,7 @@ extern void hl_matrix_csr_add_bias(hl_sparse_matrix_s A_d,
*
*/
extern
void
hl_sparse_matrix_add_dense
(
hl_sparse_matrix_s
A_d
,
real
*
B_d
,
real
*
B_d
,
int
dimM
,
int
dimN
,
real
alpha
,
...
...
@@ -479,7 +484,7 @@ extern void hl_sparse_matrix_add_dense(hl_sparse_matrix_s A_d,
* @brief implementation of csr sparse matrix in hl_sparse_matrix_add_dense
*/
extern
void
hl_matrix_csr_add_dense
(
hl_sparse_matrix_s
A_d
,
real
*
B_d
,
real
*
B_d
,
int
dimM
,
int
dimN
,
real
alpha
,
...
...
@@ -493,7 +498,7 @@ extern void hl_matrix_csr_add_dense(hl_sparse_matrix_s A_d,
* @return return rows pointer, which is gpu address
*
*/
extern int* hl_sparse_matrix_get_rows(hl_sparse_matrix_s sMat);
extern int* hl_sparse_matrix_get_rows(hl_sparse_matrix_s sMat);
/**
* @brief get cols pointer of GpuSparseMatrix
...
...
@@ -503,7 +508,7 @@ extern int* hl_sparse_matrix_get_rows(hl_sparse_matrix_s sMat);
* @return return cols pointer, which is gpu address
*
*/
extern int* hl_sparse_matrix_get_cols(hl_sparse_matrix_s sMat);
extern int* hl_sparse_matrix_get_cols(hl_sparse_matrix_s sMat);
/**
* @brief get value pointer of GpuSparseMatrix
...
...
@@ -513,7 +518,6 @@ extern int* hl_sparse_matrix_get_cols(hl_sparse_matrix_s sMat);
* @return return value pointer, which is gpu address
*
*/
extern real* hl_sparse_matrix_get_value(hl_sparse_matrix_s sMat);
extern real* hl_sparse_matrix_get_value(hl_sparse_matrix_s sMat);
#endif /* HL_SPARSE_H_ */
paddle/cuda/include/hl_table_apply.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_TABLE_APPLY_H_
#define HL_TABLE_APPLY_H_
...
...
@@ -31,8 +30,10 @@ limitations under the License. */
* @param[in] dim width of table.
*
*/
extern
void
hl_matrix_select_rows
(
real
*
output
,
int
ldo
,
real
*
table
,
int
ldt
,
extern
void
hl_matrix_select_rows
(
real
*
output
,
int
ldo
,
real
*
table
,
int
ldt
,
int
*
ids
,
int
numSamples
,
int
tableSize
,
...
...
@@ -53,8 +54,10 @@ extern void hl_matrix_select_rows(real* output, int ldo,
* @param[in] dim width of table.
*
*/
extern
void
hl_matrix_add_to_rows
(
real
*
table
,
int
ldt
,
real
*
input
,
int
ldi
,
extern
void
hl_matrix_add_to_rows
(
real
*
table
,
int
ldt
,
real
*
input
,
int
ldi
,
int
*
ids
,
int
numSamples
,
int
tableSize
,
...
...
@@ -72,8 +75,7 @@ extern void hl_matrix_add_to_rows(real* table, int ldt,
*
*/
template
<
class
T
>
extern
void
hl_vector_select_from
(
T
*
dst
,
int
sized
,
const
T
*
src
,
int
sizes
,
const
int
*
ids
,
int
sizei
);
extern
void
hl_vector_select_from
(
T
*
dst
,
int
sized
,
const
T
*
src
,
int
sizes
,
const
int
*
ids
,
int
sizei
);
#endif
/* HL_TABLE_APPLY_H_ */
#endif
/* HL_TABLE_APPLY_H_ */
paddle/cuda/include/hl_time.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_TIME_H_
#define HL_TIME_H_
...
...
paddle/cuda/include/hl_top_k.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_TOP_K_H_
#define HL_TOP_K_H_
...
...
@@ -31,9 +30,11 @@ limitations under the License. */
* @param[in] numSamples height of input value.
*
*/
extern
void
hl_matrix_top_k
(
real
*
topVal
,
int
ldv
,
int
*
topIds
,
real
*
src
,
int
lds
,
extern
void
hl_matrix_top_k
(
real
*
topVal
,
int
ldv
,
int
*
topIds
,
real
*
src
,
int
lds
,
int
dim
,
int
beamSize
,
int
numSamples
);
...
...
@@ -50,8 +51,9 @@ extern void hl_matrix_top_k(real* topVal, int ldv,
*
* @note Only supports HL_SPARSE_CSR format.
*/
extern
void
hl_sparse_matrix_top_k
(
real
*
topVal
,
int
ldv
,
int
*
topIds
,
extern
void
hl_sparse_matrix_top_k
(
real
*
topVal
,
int
ldv
,
int
*
topIds
,
hl_sparse_matrix_s
src
,
int
beamSize
,
int
numSamples
);
...
...
paddle/cuda/include/stub/hl_aggregate_stub.h
View file @
ebbe6e1a
...
...
@@ -12,29 +12,22 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_AGGREGATE_STUB_H_
#define HL_AGGREGATE_STUB_H_
#include "hl_aggregate.h"
inline void hl_matrix_row_sum(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_row_sum(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_row_max(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_row_max(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_row_min(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_row_min(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_column_sum(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_column_sum(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_column_max(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_column_max(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_column_min(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_matrix_column_min(real* A_d, real* C_d, int dimM, int dimN) {}
inline void hl_vector_sum(real* A_d, real* C_h, int dimM) {}
...
...
paddle/cuda/include/stub/hl_cnn_stub.h
View file @
ebbe6e1a
...
...
@@ -12,84 +12,134 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_CNN_STUB_H_
#define HL_CNN_STUB_H_
#include "hl_cnn.h"
inline
void
hl_shrink_col2feature
(
const
real
*
dataCol
,
size_t
channels
,
size_t
height
,
size_t
width
,
size_t
blockH
,
size_t
blockW
,
size_t
strideH
,
size_t
strideW
,
size_t
paddingH
,
size_t
paddingW
,
size_t
outputH
,
size_t
outputW
,
real
*
dataIm
,
real
alpha
,
real
beta
)
{}
inline
void
hl_expand_feature2col
(
const
real
*
dataIm
,
size_t
channels
,
size_t
height
,
size_t
width
,
size_t
blockH
,
size_t
blockW
,
size_t
strideH
,
size_t
strideW
,
size_t
paddingH
,
size_t
paddingW
,
size_t
outputH
,
size_t
outputW
,
real
*
dataCol
)
{}
inline
void
hl_maxpool_forward
(
const
int
frameCnt
,
const
real
*
inputData
,
const
int
channels
,
const
int
height
,
const
int
width
,
const
int
pooledH
,
const
int
pooledW
,
const
int
sizeX
,
const
int
sizeY
,
const
int
strideH
,
const
int
strideW
,
const
int
paddingH
,
const
int
paddingW
,
real
*
tgtData
,
const
int
tgtStride
)
{}
inline
void
hl_maxpool_backward
(
const
int
frameCnt
,
const
real
*
inputData
,
const
real
*
outData
,
const
real
*
outGrad
,
const
int
channels
,
const
int
height
,
const
int
width
,
const
int
pooledH
,
const
int
pooledW
,
const
int
sizeX
,
const
int
sizeY
,
const
int
strideH
,
const
int
strideW
,
const
int
paddingH
,
const
int
paddingW
,
real
scaleA
,
real
scaleB
,
real
*
targetGrad
,
const
int
outStride
)
{}
inline
void
hl_avgpool_forward
(
const
int
frameCnt
,
const
real
*
inputData
,
const
int
channels
,
const
int
height
,
const
int
width
,
const
int
pooledH
,
const
int
pooledW
,
const
int
sizeX
,
const
int
sizeY
,
const
int
strideH
,
const
int
strideW
,
const
int
paddingH
,
const
int
paddingW
,
real
*
tgtData
,
const
int
tgtStride
)
{}
inline
void
hl_avgpool_backward
(
const
int
frameCnt
,
const
real
*
outGrad
,
const
int
channels
,
const
int
height
,
const
int
width
,
const
int
pooledH
,
const
int
pooledW
,
const
int
sizeX
,
const
int
sizeY
,
const
int
strideH
,
const
int
strideW
,
int
paddingH
,
int
paddingW
,
real
scaleA
,
real
scaleB
,
real
*
backGrad
,
const
int
outStride
)
{}
inline
void
hl_CMRNorm_forward
(
size_t
frameCnt
,
const
real
*
in
,
real
*
scale
,
real
*
out
,
size_t
channels
,
size_t
height
,
size_t
width
,
size_t
sizeX
,
real
alpha
,
real
beta
)
{}
inline
void
hl_CMRNorm_backward
(
size_t
frameCnt
,
const
real
*
inV
,
const
real
*
scale
,
const
real
*
outV
,
const
real
*
outDiff
,
real
*
inDiff
,
size_t
channels
,
size_t
height
,
size_t
width
,
size_t
sizeX
,
real
alpha
,
real
beta
)
{}
inline
void
hl_shrink_col2feature
(
const
real
*
dataCol
,
size_t
channels
,
size_t
height
,
size_t
width
,
size_t
blockH
,
size_t
blockW
,
size_t
strideH
,
size_t
strideW
,
size_t
paddingH
,
size_t
paddingW
,
size_t
outputH
,
size_t
outputW
,
real
*
dataIm
,
real
alpha
,
real
beta
)
{}
inline
void
hl_expand_feature2col
(
const
real
*
dataIm
,
size_t
channels
,
size_t
height
,
size_t
width
,
size_t
blockH
,
size_t
blockW
,
size_t
strideH
,
size_t
strideW
,
size_t
paddingH
,
size_t
paddingW
,
size_t
outputH
,
size_t
outputW
,
real
*
dataCol
)
{}
inline
void
hl_maxpool_forward
(
const
int
frameCnt
,
const
real
*
inputData
,
const
int
channels
,
const
int
height
,
const
int
width
,
const
int
pooledH
,
const
int
pooledW
,
const
int
sizeX
,
const
int
sizeY
,
const
int
strideH
,
const
int
strideW
,
const
int
paddingH
,
const
int
paddingW
,
real
*
tgtData
,
const
int
tgtStride
)
{}
inline
void
hl_maxpool_backward
(
const
int
frameCnt
,
const
real
*
inputData
,
const
real
*
outData
,
const
real
*
outGrad
,
const
int
channels
,
const
int
height
,
const
int
width
,
const
int
pooledH
,
const
int
pooledW
,
const
int
sizeX
,
const
int
sizeY
,
const
int
strideH
,
const
int
strideW
,
const
int
paddingH
,
const
int
paddingW
,
real
scaleA
,
real
scaleB
,
real
*
targetGrad
,
const
int
outStride
)
{}
inline
void
hl_avgpool_forward
(
const
int
frameCnt
,
const
real
*
inputData
,
const
int
channels
,
const
int
height
,
const
int
width
,
const
int
pooledH
,
const
int
pooledW
,
const
int
sizeX
,
const
int
sizeY
,
const
int
strideH
,
const
int
strideW
,
const
int
paddingH
,
const
int
paddingW
,
real
*
tgtData
,
const
int
tgtStride
)
{}
inline
void
hl_avgpool_backward
(
const
int
frameCnt
,
const
real
*
outGrad
,
const
int
channels
,
const
int
height
,
const
int
width
,
const
int
pooledH
,
const
int
pooledW
,
const
int
sizeX
,
const
int
sizeY
,
const
int
strideH
,
const
int
strideW
,
int
paddingH
,
int
paddingW
,
real
scaleA
,
real
scaleB
,
real
*
backGrad
,
const
int
outStride
)
{}
inline
void
hl_CMRNorm_forward
(
size_t
frameCnt
,
const
real
*
in
,
real
*
scale
,
real
*
out
,
size_t
channels
,
size_t
height
,
size_t
width
,
size_t
sizeX
,
real
alpha
,
real
beta
)
{}
inline
void
hl_CMRNorm_backward
(
size_t
frameCnt
,
const
real
*
inV
,
const
real
*
scale
,
const
real
*
outV
,
const
real
*
outDiff
,
real
*
inDiff
,
size_t
channels
,
size_t
height
,
size_t
width
,
size_t
sizeX
,
real
alpha
,
real
beta
)
{}
inline
void
hl_bilinear_forward
(
const
real
*
inData
,
const
size_t
inImgH
,
...
...
@@ -106,25 +156,33 @@ inline void hl_bilinear_forward(const real* inData,
const
real
ratioW
)
{}
inline
void
hl_bilinear_backward
(
real
*
inGrad
,
const
size_t
inImgH
,
const
size_t
inImgW
,
const
size_t
inputH
,
const
size_t
inputW
,
const
real
*
outGrad
,
const
size_t
outImgH
,
const
size_t
outImgW
,
const
size_t
outputH
,
const
size_t
outputW
,
const
size_t
numChannels
,
const
real
ratioH
,
const
real
ratioW
)
{}
inline
void
hl_maxout_forward
(
const
real
*
inData
,
real
*
outData
,
int
*
idData
,
size_t
batchSize
,
size_t
size
,
size_t
featLen
,
size_t
group
)
{}
inline
void
hl_maxout_backward
(
real
*
inGrad
,
const
real
*
outGrad
,
const
int
*
idData
,
size_t
batchSize
,
size_t
size
,
size_t
featLen
,
size_t
group
)
{}
const
size_t
inImgH
,
const
size_t
inImgW
,
const
size_t
inputH
,
const
size_t
inputW
,
const
real
*
outGrad
,
const
size_t
outImgH
,
const
size_t
outImgW
,
const
size_t
outputH
,
const
size_t
outputW
,
const
size_t
numChannels
,
const
real
ratioH
,
const
real
ratioW
)
{}
inline
void
hl_maxout_forward
(
const
real
*
inData
,
real
*
outData
,
int
*
idData
,
size_t
batchSize
,
size_t
size
,
size_t
featLen
,
size_t
group
)
{}
inline
void
hl_maxout_backward
(
real
*
inGrad
,
const
real
*
outGrad
,
const
int
*
idData
,
size_t
batchSize
,
size_t
size
,
size_t
featLen
,
size_t
group
)
{}
#endif // HL_CNN_STUB_H_
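The stub header above gives every GPU entry point an inline body that does nothing, which presumably lets CPU-only builds compile and link against the same call sites. A minimal sketch of that pattern follows. The `DEMO_WITH_GPU` switch, the `demo_` function names, and the `float` definition of `real` are assumptions for illustration and do not come from Paddle.

```cpp
#include <cstddef>

typedef float real;  // assumption: single-precision build

#ifdef DEMO_WITH_GPU
// Real declaration; the definition would live in a GPU source file.
extern void demo_maxout_forward(const real* in, real* out, std::size_t n);
#else
// Stub: same signature, empty body, so callers compile unchanged
// and the output buffer is left untouched.
inline void demo_maxout_forward(const real* in, real* out, std::size_t n) {}
#endif

// Stubs that return a value hand back a neutral default,
// mirroring the `return 0;` and `return NULL;` bodies in the stub headers.
inline int demo_get_device_count() { return 0; }
```

Because the stub and the real declaration share one signature, calling code needs no conditional compilation of its own.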
paddle/cuda/include/stub/hl_cuda_cublas_stub.h
View file @
ebbe6e1a
...
...
@@ -12,41 +12,42 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_CUDA_CUBLAS_STUB_H_
#define HL_CUDA_CUBLAS_STUB_H_
#include "hl_cuda_cublas.h"
inline
void
hl_matrix_transpose
(
real
*
A_d
,
real
*
C_d
,
int
dimM
,
int
dimN
,
int
lda
,
int
ldc
)
{}
inline
void
hl_matrix_transpose
(
real
*
A_d
,
real
*
C_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_inverse
(
real
*
A_d
,
real
*
C_d
,
int
dimN
,
int
lda
,
int
ldc
)
{}
inline
void
hl_matrix_mul
(
real
*
A_d
,
hl_trans_op_t
transa
,
real
*
B_d
,
hl_trans_op_t
transb
,
real
*
C_d
,
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
,
int
lda
,
int
ldb
,
int
ldc
)
{}
inline
void
hl_matrix_transpose
(
real
*
A_d
,
real
*
C_d
,
int
dimM
,
int
dimN
,
int
lda
,
int
ldc
)
{}
inline
void
hl_matrix_transpose
(
real
*
A_d
,
real
*
C_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_mul
(
real
*
A_d
,
hl_trans_op_t
transa
,
real
*
B_d
,
hl_trans_op_t
transb
,
inline
void
hl_matrix_inverse
(
real
*
A_d
,
real
*
C_d
,
int
dimN
,
int
lda
,
int
ldc
)
{}
inline
void
hl_matrix_mul
(
real
*
A_d
,
hl_trans_op_t
transa
,
real
*
B_d
,
hl_trans_op_t
transb
,
real
*
C_d
,
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
,
int
lda
,
int
ldb
,
int
ldc
)
{}
inline
void
hl_matrix_mul
(
real
*
A_d
,
hl_trans_op_t
transa
,
real
*
B_d
,
hl_trans_op_t
transb
,
real
*
C_d
,
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
)
{}
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
)
{}
#endif // HL_CUDA_CUBLAS_STUB_H_
paddle/cuda/include/stub/hl_cuda_cudnn_stub.h
View file @
ebbe6e1a
...
...
@@ -12,15 +12,12 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_CUDA_CUDNN_STUB_H_
#define HL_CUDA_CUDNN_STUB_H_
#include "hl_cuda_cudnn.h"
inline int hl_get_cudnn_lib_version() { return 0; }
inline int hl_get_cudnn_lib_version() { return 0; }
inline void hl_create_tensor_descriptor(hl_tensor_descriptor* image_desc) {}
...
...
@@ -68,41 +65,41 @@ inline void hl_pooling_backward(hl_tensor_descriptor input,
hl_pooling_descriptor
pooling
)
{}
inline
void
hl_create_filter_descriptor
(
hl_filter_descriptor
*
filter
,
int
input_feature_maps
,
int
output_feature_maps
,
int
height
,
int
width
)
{}
int
input_feature_maps
,
int
output_feature_maps
,
int
height
,
int
width
)
{}
inline
void
hl_destroy_filter_descriptor
(
hl_filter_descriptor
filter
)
{}
inline
void
hl_create_convolution_descriptor
(
hl_convolution_descriptor
*
conv
,
hl_tensor_descriptor
image
,
hl_filter_descriptor
filter
,
int
padding_height
,
int
padding_width
,
int
stride_height
,
int
stride_width
)
{}
hl_tensor_descriptor
image
,
hl_filter_descriptor
filter
,
int
padding_height
,
int
padding_width
,
int
stride_height
,
int
stride_width
)
{}
inline
void
hl_reset_convolution_descriptor
(
hl_convolution_descriptor
conv
,
hl_tensor_descriptor
image
,
hl_filter_descriptor
filter
,
int
padding_height
,
int
padding_width
,
int
stride_height
,
int
stride_width
)
{}
hl_tensor_descriptor
image
,
hl_filter_descriptor
filter
,
int
padding_height
,
int
padding_width
,
int
stride_height
,
int
stride_width
)
{}
inline
void
hl_destroy_convolution_descriptor
(
hl_convolution_descriptor
conv
)
{}
inline
void
hl_conv_workspace
(
hl_tensor_descriptor
input
,
hl_tensor_descriptor
output
,
hl_filter_descriptor
filter
,
hl_convolution_descriptor
conv
,
int
*
convFwdAlgo
,
size_t
*
fwdLimitBytes
,
int
*
convBwdDataAlgo
,
size_t
*
bwdDataLimitBytes
,
int
*
convBwdFilterAlgo
,
size_t
*
bwdFilterLimitBytes
)
{}
hl_tensor_descriptor
output
,
hl_filter_descriptor
filter
,
hl_convolution_descriptor
conv
,
int
*
convFwdAlgo
,
size_t
*
fwdLimitBytes
,
int
*
convBwdDataAlgo
,
size_t
*
bwdDataLimitBytes
,
int
*
convBwdFilterAlgo
,
size_t
*
bwdFilterLimitBytes
)
{}
inline
void
hl_convolution_forward
(
hl_tensor_descriptor
input
,
real
*
input_data
,
...
...
@@ -116,86 +113,84 @@ inline void hl_convolution_forward(hl_tensor_descriptor input,
int
convFwdAlgo
)
{}
inline
void
hl_convolution_forward_add_bias
(
hl_tensor_descriptor
bias
,
real
*
bias_data
,
hl_tensor_descriptor
output
,
real
*
output_data
)
{}
inline
void
hl_convolution_backward_filter
(
hl_tensor_descriptor
input
,
real
*
input_data
,
hl_tensor_descriptor
output
,
real
*
output_grad_data
,
hl_filter_descriptor
filter
,
real
*
filter_grad_data
,
hl_convolution_descriptor
conv
,
void
*
gpuWorkSpace
,
size_t
sizeInBytes
,
int
convBwdFilterAlgo
)
{}
inline
void
hl_convolution_backward_data
(
hl_tensor_descriptor
input
,
real
*
input_data_grad
,
hl_tensor_descriptor
output
,
real
*
output_grad_data
,
hl_filter_descriptor
filter
,
real
*
filter_data
,
hl_convolution_descriptor
conv
,
void
*
gpuWorkSpace
,
size_t
sizeInBytes
,
int
convBwdDataAlgo
)
{}
real
*
bias_data
,
hl_tensor_descriptor
output
,
real
*
output_data
)
{}
inline
void
hl_convolution_backward_filter
(
hl_tensor_descriptor
input
,
real
*
input_data
,
hl_tensor_descriptor
output
,
real
*
output_grad_data
,
hl_filter_descriptor
filter
,
real
*
filter_grad_data
,
hl_convolution_descriptor
conv
,
void
*
gpuWorkSpace
,
size_t
sizeInBytes
,
int
convBwdFilterAlgo
)
{}
inline
void
hl_convolution_backward_data
(
hl_tensor_descriptor
input
,
real
*
input_data_grad
,
hl_tensor_descriptor
output
,
real
*
output_grad_data
,
hl_filter_descriptor
filter
,
real
*
filter_data
,
hl_convolution_descriptor
conv
,
void
*
gpuWorkSpace
,
size_t
sizeInBytes
,
int
convBwdDataAlgo
)
{}
inline
void
hl_convolution_backward_bias
(
hl_tensor_descriptor
bias
,
real
*
bias_grad_data
,
hl_tensor_descriptor
output
,
real
*
output_grad_data
)
{}
real
*
bias_grad_data
,
hl_tensor_descriptor
output
,
real
*
output_grad_data
)
{}
inline
void
hl_softmax_forward
(
real
*
input
,
real
*
output
,
int
height
,
int
width
)
{}
inline
void
hl_softmax_backward
(
real
*
output_value
,
real
*
output_grad
,
inline
void
hl_softmax_forward
(
real
*
input
,
real
*
output
,
int
height
,
int
width
)
{}
inline
void
hl_softmax_backward
(
real
*
output_value
,
real
*
output_grad
,
int
height
,
int
width
)
{}
inline
void
hl_batch_norm_forward_training
(
hl_tensor_descriptor
inputDesc
,
real
*
input
,
real
*
input
,
hl_tensor_descriptor
outputDesc
,
real
*
output
,
real
*
output
,
hl_tensor_descriptor
bnParamDesc
,
real
*
scale
,
real
*
bias
,
real
*
scale
,
real
*
bias
,
double
factor
,
real
*
runningMean
,
real
*
runningInvVar
,
real
*
runningMean
,
real
*
runningInvVar
,
double
epsilon
,
real
*
savedMean
,
real
*
savedVar
)
{}
real
*
savedMean
,
real
*
savedVar
)
{}
inline
void
hl_batch_norm_forward_inference
(
hl_tensor_descriptor
inputDesc
,
real
*
input
,
real
*
input
,
hl_tensor_descriptor
outputDesc
,
real
*
output
,
real
*
output
,
hl_tensor_descriptor
bnParamDesc
,
real
*
scale
,
real
*
bias
,
real
*
estimatedMean
,
real
*
estimatedVar
,
real
*
scale
,
real
*
bias
,
real
*
estimatedMean
,
real
*
estimatedVar
,
double
epsilon
)
{}
inline
void
hl_batch_norm_backward
(
hl_tensor_descriptor
inputDesc
,
real
*
input
,
real
*
input
,
hl_tensor_descriptor
outGradDesc
,
real
*
outGrad
,
real
*
outGrad
,
hl_tensor_descriptor
inGradDesc
,
real
*
inGrad
,
real
*
inGrad
,
hl_tensor_descriptor
dBnParamDesc
,
real
*
scale
,
real
*
scaleGrad
,
real
*
biasGrad
,
real
*
scale
,
real
*
scaleGrad
,
real
*
biasGrad
,
double
epsilon
,
real
*
savedMean
,
real
*
savedInvVar
)
{}
real
*
savedMean
,
real
*
savedInvVar
)
{}
#endif // HL_CUDA_CUDNN_STUB_H_
paddle/cuda/include/stub/hl_cuda_stub.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_CUDA_STUB_H_
#define HL_CUDA_STUB_H_
...
...
@@ -24,29 +23,25 @@ inline void hl_specify_devices_start(int *device, int number) {}
inline void hl_init(int device) {}
inline int hl_get_cuda_lib_version(int device) { return 0; }
inline int hl_get_cuda_lib_version(int device) { return 0; }
inline void hl_fini() {}
inline
void
hl_set_sync_flag
(
bool
flag
)
{}
inline
bool
hl_get_sync_flag
()
{
return
false
;
}
inline
bool
hl_get_sync_flag
()
{
return
false
;
}
inline
int
hl_get_device_count
()
{
return
0
;
}
inline
int
hl_get_device_count
()
{
return
0
;
}
inline
void
hl_set_device
(
int
device
)
{}
inline
int
hl_get_device
()
{
return
0
;
}
inline
int
hl_get_device
()
{
return
0
;
}
inline
void
*
hl_malloc_device
(
size_t
size
)
{
return
NULL
;
}
inline
void
*
hl_malloc_device
(
size_t
size
)
{
return
NULL
;
}
inline
void
hl_free_mem_device
(
void
*
dest_d
)
{}
inline
void
*
hl_malloc_host
(
size_t
size
)
{
return
NULL
;
}
inline
void
*
hl_malloc_host
(
size_t
size
)
{
return
NULL
;
}
inline
void
hl_free_mem_host
(
void
*
dest_h
)
{}
...
...
@@ -64,7 +59,9 @@ inline void hl_rand(real *dest_d, size_t num) {}
inline
void
hl_srand
(
unsigned
int
seed
)
{}
inline
void
hl_memcpy_async
(
void
*
dst
,
void
*
src
,
size_t
size
,
inline
void
hl_memcpy_async
(
void
*
dst
,
void
*
src
,
size_t
size
,
hl_stream_t
stream
)
{}
inline
void
hl_stream_synchronize
(
hl_stream_t
stream
)
{}
...
...
@@ -83,11 +80,11 @@ inline void hl_stream_wait_event(hl_stream_t stream, hl_event_t event) {}
inline
void
hl_event_synchronize
(
hl_event_t
event
)
{}
inline
int
hl_get_device_last_error
()
{
return
0
;
}
inline
int
hl_get_device_last_error
()
{
return
0
;
}
inline
const
char
*
hl_get_device_error_string
()
{
return
NULL
;
}
inline
const
char
*
hl_get_device_error_string
()
{
return
NULL
;
}
inline
const
char
*
hl_get_device_error_string
(
size_t
err
)
{
return
NULL
;
}
inline
const
char
*
hl_get_device_error_string
(
size_t
err
)
{
return
NULL
;
}
inline
bool
hl_cuda_event_is_ready
(
hl_event_t
event
)
{
return
true
;
}
...
...
paddle/cuda/include/stub/hl_lstm_stub.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_LSTM_STUB_H_
#define HL_LSTM_STUB_H_
...
...
paddle/cuda/include/stub/hl_matrix_stub.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_MATRIX_STUB_H_
#define HL_MATRIX_STUB_H_
...
...
@@ -26,48 +25,30 @@ inline void hl_matrix_add(real* A_d,
real
alpha
,
real
beta
)
{}
inline
void
hl_matrix_softmax
(
real
*
A_d
,
real
*
C_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_softmax
(
real
*
A_d
,
real
*
C_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_sequence_softmax_forward
(
real
*
A_d
,
real
*
C_d
,
inline
void
hl_sequence_softmax_forward
(
real
*
A_d
,
real
*
C_d
,
const
int
*
index
,
int
numSequence
)
{}
inline
void
hl_matrix_softmax_derivative
(
real
*
grad_d
,
real
*
output_d
,
real
*
sftmaxSum_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_classification_error
(
real
*
A_d
,
int
*
B_d
,
real
*
C_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_cross_entropy
(
real
*
A_d
,
real
*
C_d
,
int
*
label_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_cross_entropy_bp
(
real
*
grad_d
,
real
*
output_d
,
int
*
label_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_multi_binary_cross_entropy
(
real
*
output
,
real
*
entropy
,
hl_sparse_matrix_s
mat
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_multi_binary_cross_entropy_bp
(
real
*
output
,
real
*
grad
,
hl_sparse_matrix_s
mat
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_softmax_derivative
(
real
*
grad_d
,
real
*
output_d
,
real
*
sftmaxSum_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_classification_error
(
real
*
A_d
,
int
*
B_d
,
real
*
C_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_cross_entropy
(
real
*
A_d
,
real
*
C_d
,
int
*
label_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_cross_entropy_bp
(
real
*
grad_d
,
real
*
output_d
,
int
*
label_d
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_multi_binary_cross_entropy
(
real
*
output
,
real
*
entropy
,
hl_sparse_matrix_s
mat
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_multi_binary_cross_entropy_bp
(
real
*
output
,
real
*
grad
,
hl_sparse_matrix_s
mat
,
int
dimM
,
int
dimN
)
{}
inline
void
hl_matrix_zero_mem
(
real
*
data
,
int
num
)
{}
...
...
@@ -101,7 +82,6 @@ inline void hl_cossim(real* output,
int
input2_height
,
real
scale
)
{}
inline
void
hl_cossim_derivative
(
real
*
grad
,
real
*
output
,
real
*
prevOutX
,
...
...
paddle/cuda/include/stub/hl_sequence_stub.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_SEQUENCE_STUB_H_
#define HL_SEQUENCE_STUB_H_
...
...
@@ -21,15 +20,12 @@ limitations under the License. */
inline
void
hl_max_sequence_forward
(
real
*
input
,
const
int
*
sequence
,
real
*
output
,
int
*
index
,
int
*
index
,
int
numSequences
,
int
dim
)
{}
inline
void
hl_max_sequence_backward
(
real
*
outputGrad
,
int
*
index
,
real
*
inputGrad
,
int
numSequences
,
int
dim
)
{}
inline
void
hl_max_sequence_backward
(
real
*
outputGrad
,
int
*
index
,
real
*
inputGrad
,
int
numSequences
,
int
dim
)
{}
inline
void
hl_context_projection_forward
(
real
*
input
,
const
int
*
sequence
,
...
...
@@ -60,16 +56,16 @@ inline void hl_context_projection_backward_weight(real* outputGrad,
int
contextStart
,
int
beginPad
)
{}
inline
void
hl_sequence2batch_copy
(
real
*
batch
,
real
*
sequence
,
const
int
*
batchIndex
,
inline
void
hl_sequence2batch_copy
(
real
*
batch
,
real
*
sequence
,
const
int
*
batchIndex
,
int
seqWidth
,
int
batchCount
,
bool
seq2batch
)
{}
inline
void
hl_sequence2batch_add
(
real
*
batch
,
real
*
sequence
,
int
*
batchIndex
,
inline
void
hl_sequence2batch_add
(
real
*
batch
,
real
*
sequence
,
int
*
batchIndex
,
int
seqWidth
,
int
batchCount
,
bool
seq2batch
)
{}
...
...
paddle/cuda/include/stub/hl_sparse_stub.h
View file @
ebbe6e1a
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifndef HL_SPARSE_STUB_H_
#define HL_SPARSE_STUB_H_
...
...
@@ -20,7 +19,7 @@ limitations under the License. */
inline
void
hl_malloc_sparse_matrix
(
hl_sparse_matrix_s
*
A_d
,
hl_matrix_format_t
format
,
hl_matrix_value_t
value_type
,
hl_matrix_value_t
value_type
,
int
dimM
,
int
dimN
,
int
nnz
)
{}
...
...
@@ -28,20 +27,20 @@ inline void hl_malloc_sparse_matrix(hl_sparse_matrix_s *A_d,
inline
void
hl_free_sparse_matrix
(
hl_sparse_matrix_s
A_d
)
{}
inline
void
hl_construct_sparse_matrix
(
hl_sparse_matrix_s
*
A_d
,
void
*
dest_d
,
void
*
dest_d
,
size_t
size
,
hl_matrix_format_t
format
,
hl_matrix_value_t
value_type
,
hl_matrix_value_t
value_type
,
int
dimM
,
int
dimN
,
int
nnz
)
{}
inline
void
hl_construct_sparse_matrix
(
hl_sparse_matrix_s
*
A_d
,
real
*
value_d
,
int
*
rows_d
,
int
*
cols_d
,
real
*
value_d
,
int
*
rows_d
,
int
*
cols_d
,
hl_matrix_format_t
format
,
hl_matrix_value_t
value_type
,
hl_matrix_value_t
value_type
,
int
dimM
,
int
dimN
,
int
nnz
)
{}
...
...
@@ -87,10 +86,14 @@ inline void hl_matrix_csr_mul_dense(hl_sparse_matrix_s A_d,
inline
void
hl_matrix_csc_mul_dense
(
hl_sparse_matrix_s
A_d
,
hl_trans_op_t
transa
,
real
*
B_d
,
hl_trans_op_t
transb
,
real
*
B_d
,
hl_trans_op_t
transb
,
real
*
C_d
,
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
)
{}
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
)
{}
inline
void
hl_matrix_dense_mul_csc
(
real
*
A_d
,
hl_trans_op_t
transa
,
...
...
@@ -103,18 +106,27 @@ inline void hl_matrix_dense_mul_csc(real *A_d,
real
alpha
,
real
beta
)
{}
inline
void
hl_sparse_matrix_mul
(
real
*
A_d
,
hl_trans_op_t
transa
,
real
*
B_d
,
hl_trans_op_t
transb
,
inline
void
hl_sparse_matrix_mul
(
real
*
A_d
,
hl_trans_op_t
transa
,
real
*
B_d
,
hl_trans_op_t
transb
,
hl_sparse_matrix_s
C_d
,
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
)
{}
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
)
{}
inline
void
hl_matrix_dense_mul_csr
(
real
*
A_d
,
hl_trans_op_t
transa
,
inline
void
hl_matrix_dense_mul_csr
(
real
*
A_d
,
hl_trans_op_t
transa
,
hl_sparse_matrix_s
B_d
,
hl_trans_op_t
transb
,
real
*
C_d
,
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
)
{}
int
dimM
,
int
dimN
,
int
dimK
,
real
alpha
,
real
beta
)
{}
inline
void
hl_memcpy_from_csc_matrix
(
real
*
csc_val
,
size_t
val_size
,
...
...
@@ -134,49 +146,39 @@ inline void hl_memcpy_from_csr_matrix(real *csr_val,
                                      hl_sparse_matrix_s csr_matrix,
                                      hl_stream_t stream) {}

inline void hl_sparse_matrix_column_sum(
    real *A_d, hl_sparse_matrix_s B_d, int dimM, int dimN, real scale) {}

inline void hl_matrix_csr_column_sum(
    real *A_d, hl_sparse_matrix_s B_d, int dimM, int dimN, real scale) {}

inline void hl_sparse_matrix_add_bias(hl_sparse_matrix_s A_d,
                                      real *B_d,
                                      real scale) {}

inline void hl_matrix_csr_add_bias(hl_sparse_matrix_s A_d,
                                   real *B_d,
                                   real scale) {}

inline void hl_sparse_matrix_add_dense(hl_sparse_matrix_s A_d,
                                       real *B_d,
                                       int dimM,
                                       int dimN,
                                       real alpha,
                                       real beta) {}

inline void hl_matrix_csr_add_dense(hl_sparse_matrix_s A_d,
                                    real *B_d,
                                    int dimM,
                                    int dimN,
                                    real alpha,
                                    real beta) {}

inline int *hl_sparse_matrix_get_rows(hl_sparse_matrix_s sMat) { return NULL; }

inline int *hl_sparse_matrix_get_cols(hl_sparse_matrix_s sMat) { return NULL; }

inline real *hl_sparse_matrix_get_value(hl_sparse_matrix_s sMat) {
  return NULL;
}
...
...
paddle/cuda/src/avx_mathfun.h (diff collapsed)
paddle/cuda/src/hl_avx_functions.cc
...
...
@@ -12,62 +12,58 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include <immintrin.h>
#include "hl_functions.h"
namespace hppl {

extern __m256 exp(__m256 a);

__m256 relu(const __m256 a) {
  __m256 tmp = _mm256_set1_ps(0.0f);
  return _mm256_max_ps(a, tmp);
}

__m256 sigmoid(const __m256 a) {
  __m256 max = _mm256_set1_ps(SIGMOID_THRESHOLD_MAX);
  __m256 min = _mm256_set1_ps(SIGMOID_THRESHOLD_MIN);
  __m256 tmp = _mm256_max_ps(a, min);
  tmp = _mm256_min_ps(tmp, max);
  tmp = _mm256_sub_ps(_mm256_set1_ps(0.0f), tmp);
  tmp = exp(tmp);
  tmp = _mm256_add_ps(_mm256_set1_ps(1.0f), tmp);
  tmp = _mm256_div_ps(_mm256_set1_ps(1.0f), tmp);
  return tmp;
}

__m256 tanh(const __m256 a) {
  __m256 max = _mm256_set1_ps(EXP_MAX_INPUT);
  __m256 tmp = _mm256_mul_ps(_mm256_set1_ps(-2.0f), a);
  tmp = _mm256_min_ps(tmp, max);
  tmp = exp(tmp);
  return _mm256_sub_ps(_mm256_div_ps(_mm256_set1_ps(2.0f),
                                     _mm256_add_ps(_mm256_set1_ps(1.0f), tmp)),
                       _mm256_set1_ps(1.0f));
}

__m256 linear(const __m256 a) { return a; }

__m256 relu(const __m256 a, const __m256 b) {
  return _mm256_mul_ps(
      a,
      _mm256_and_ps(_mm256_cmp_ps(b, _mm256_set1_ps(0.0f), _CMP_GT_OS),
                    _mm256_set1_ps(1.0f)));
}

__m256 sigmoid(const __m256 a, const __m256 b) {
  return _mm256_mul_ps(_mm256_mul_ps(a, b),
                       _mm256_sub_ps(_mm256_set1_ps(1.0f), b));
}

__m256 tanh(const __m256 a, const __m256 b) {
  return _mm256_mul_ps(
      a, _mm256_sub_ps(_mm256_set1_ps(1.0f), _mm256_mul_ps(b, b)));
}

__m256 linear(const __m256 a, const __m256 b) { return a; }
}  // namespace hppl
paddle/cuda/src/hl_cpu_functions.cc
...
...
@@ -12,46 +12,33 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include <math.h>
#include "hl_functions.h"
namespace hppl {

real relu(const real a) { return a > 0.0f ? a : 0.0f; }

real sigmoid(const real a) {
  const real min = SIGMOID_THRESHOLD_MIN;
  const real max = SIGMOID_THRESHOLD_MAX;
  real tmp = (a < min) ? min : ((a > max) ? max : a);
  return 1.0 / (1.0 + exp(-tmp));
}

real tanh(const real a) {
  real tmp = -2.0 * a;
  tmp = (tmp > EXP_MAX_INPUT) ? EXP_MAX_INPUT : tmp;
  return (2.0 / (1.0 + exp(tmp))) - 1.0;
}

real linear(const real a) { return a; }

real relu(const real a, const real b) { return a * (b > 0.0f ? 1.0f : 0.0f); }

real sigmoid(const real a, const real b) { return a * b * (1 - b); }

real tanh(const real a, const real b) { return a * (1.0f - b * b); }

real linear(const real a, const real b) { return a; }
}  // namespace hppl
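The scalar activations above clamp their input before calling `exp`, and each two-argument (backward) overload is written in terms of the forward output. Below is a minimal standalone sketch of that pattern. The threshold constants are stand-ins chosen for illustration (the real `SIGMOID_THRESHOLD_MIN`, `SIGMOID_THRESHOLD_MAX` and `EXP_MAX_INPUT` come from `hl_functions.h`, which this diff does not show), `float` replaces `real`, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Assumed stand-ins for the macros in hl_functions.h (values not shown in the diff).
constexpr float kSigmoidMin = -40.0f;
constexpr float kSigmoidMax = 13.0f;
constexpr float kExpMaxInput = 40.0f;

// Forward sigmoid: clamp the input, then evaluate 1 / (1 + e^-x).
inline float sigmoid_fwd(float a) {
  const float t = std::fmin(std::fmax(a, kSigmoidMin), kSigmoidMax);
  return 1.0f / (1.0f + std::exp(-t));
}

// Backward sigmoid: incoming gradient times y * (1 - y), y being the forward output.
inline float sigmoid_bwd(float grad, float y) { return grad * y * (1.0f - y); }

// Forward tanh written as 2 / (1 + e^(-2x)) - 1, with the exponent capped.
inline float tanh_fwd(float a) {
  const float t = std::fmin(-2.0f * a, kExpMaxInput);
  return 2.0f / (1.0f + std::exp(t)) - 1.0f;
}

// Backward tanh: incoming gradient times (1 - y^2).
inline float tanh_bwd(float grad, float y) { return grad * (1.0f - y * y); }

// Small self-check showing how the helpers are meant to be called.
inline void check_activations() {
  assert(std::fabs(sigmoid_fwd(0.0f) - 0.5f) < 1e-6f);
  assert(std::fabs(tanh_fwd(0.5f) - std::tanh(0.5f)) < 1e-5f);
  assert(std::isfinite(tanh_fwd(-1000.0f)));  // the exponent cap avoids overflow
}
```

Capping the argument of `exp` keeps the intermediate value finite for large-magnitude inputs, which is presumably why the original code clamps before exponentiating.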
paddle/cuda/src/hl_cuda_cublas.cc
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include <sys/time.h>
#include <mutex>
#include "hl_cuda.h"
...
...
@@ -24,7 +23,7 @@ limitations under the License. */
namespace dynload {

std::once_flag cublas_dso_flag;
void *cublas_dso_handle = nullptr;

/**
* The following macro definition can generate structs
...
...
@@ -34,31 +33,30 @@ void* cublas_dso_handle = nullptr;
* note: default dynamic linked libs
*/
#ifdef PADDLE_USE_DSO
#define DYNAMIC_LOAD_CUBLAS_WRAP(__name)                                       \
  struct DynLoad__##__name {                                                   \
    template <typename... Args>                                                \
    cublasStatus_t operator()(Args... args) {                                  \
      typedef cublasStatus_t (*cublasFunc)(Args...);                           \
      std::call_once(cublas_dso_flag, GetCublasDsoHandle, &cublas_dso_handle); \
      void *p_##__name = dlsym(cublas_dso_handle, #__name);                    \
      return reinterpret_cast<cublasFunc>(p_##__name)(args...);                \
    }                                                                          \
  } __name;  // struct DynLoad__##__name
#else
#define DYNAMIC_LOAD_CUBLAS_WRAP(__name)      \
  struct DynLoad__##__name {                  \
    template <typename... Args>               \
    cublasStatus_t operator()(Args... args) { \
      return __name(args...);                 \
    }                                         \
  } __name;  // struct DynLoad__##__name
#endif

#define DYNAMIC_LOAD_CUBLAS_V2_WRAP(__name) DYNAMIC_LOAD_CUBLAS_WRAP(__name)

// include all needed cublas functions in HPPL
// clang-format off
#define CUBLAS_BLAS_ROUTINE_EACH(__macro) \
__macro(cublasSgemv) \
__macro(cublasDgemv) \
...
...
@@ -88,41 +86,41 @@ CUBLAS_BLAS_ROUTINE_EACH(DYNAMIC_LOAD_CUBLAS_V2_WRAP)
} /* namespace dynload */

// clang-format on
#ifndef PADDLE_TYPE_DOUBLE
#define CUBLAS_GEAM dynload::cublasSgeam
#define CUBLAS_GEMV dynload::cublasSgemv
#define CUBLAS_GEMM dynload::cublasSgemm
#define CUBLAS_GETRF dynload::cublasSgetrfBatched
#define CUBLAS_GETRI dynload::cublasSgetriBatched
#else
#define CUBLAS_GEAM dynload::cublasDgeam
#define CUBLAS_GEMV dynload::cublasDgemv
#define CUBLAS_GEMM dynload::cublasDgemm
#define CUBLAS_GETRF dynload::cublasDgetrfBatched
#define CUBLAS_GETRI dynload::cublasDgetriBatched
#endif

const char *hl_cublas_get_error_string(cublasStatus_t status) {
  switch (status) {
    case CUBLAS_STATUS_NOT_INITIALIZED:
      return "[cublas status]: not initialized";
    case CUBLAS_STATUS_ALLOC_FAILED:
      return "[cublas status]: allocate failed";
    case CUBLAS_STATUS_INVALID_VALUE:
      return "[cublas status]: invalid value";
    case CUBLAS_STATUS_ARCH_MISMATCH:
      return "[cublas status]: arch mismatch";
    case CUBLAS_STATUS_MAPPING_ERROR:
      return "[cublas status]: mapping error";
    case CUBLAS_STATUS_EXECUTION_FAILED:
      return "[cublas status]: execution failed";
    case CUBLAS_STATUS_INTERNAL_ERROR:
      return "[cublas status]: internal error";
    case CUBLAS_STATUS_SUCCESS:
      return "[cublas status]: success";
    default:
      return "[cublas status]: unknown error";
  }
}
...
...
@@ -131,27 +129,21 @@ const char* hl_cublas_get_error_string(cublasStatus_t status) {
* support << operator for more details error info.
*/
cublasStatus_t g_cublasStat;
#define CHECK_CUBLAS(cublas_func)               \
  g_cublasStat = cublas_func;                   \
  CHECK_EQ(CUBLAS_STATUS_SUCCESS, g_cublasStat) \
      << "Cublas Error: " << hl_cublas_get_error_string(g_cublasStat) << " "

void hl_cublas_init(cublasHandle_t *cublas_handle, cudaStream_t stream) {
  CHECK_CUBLAS(dynload::cublasCreate(cublas_handle))
      << "[cublas init] Cublas create handle faild!";

  CHECK_CUBLAS(dynload::cublasSetStream(*cublas_handle, stream))
      << "[cublas init] Cublas set stream faild!";
}

void hl_matrix_transpose(
    real *A_d, real *C_d, int dimM, int dimN, int lda, int ldc) {
  real alpha = 1.0;
  real beta = 0.0;
...
...
@@ -159,11 +151,18 @@ void hl_matrix_transpose(real *A_d,
  CHECK_NOTNULL(C_d);

  CHECK_CUBLAS(CUBLAS_GEAM(t_resource.handle,
                           CUBLAS_OP_T,
                           CUBLAS_OP_N,
                           dimM,
                           dimN,
                           &alpha,
                           A_d,
                           lda,
                           &beta,
                           nullptr,
                           dimM,
                           C_d,
                           ldc));
  CHECK_SYNC("hl_matrix_transpose failed");
}
...
...
@@ -188,13 +187,13 @@ void hl_matrix_inverse(real *A_d, real *C_d, int dimN, int lda, int ldc) {
small-sized matrices. There may be a better way to reconstruct
the API for better performance.
*/
  CHECK_CUBLAS(
      CUBLAS_GETRF(t_resource.handle, dimN, inout_d, lda, pivot_d, info_d, 1));

  int info_h;
  hl_memcpy(&info_h, info_d, sizeof(int));
  if (info_h != 0) {
    LOG(FATAL) << "Factorization of matrix failed: matrix may be singular.\n";
  }

/* Step 2: Compute the inverse of the matrix given its LU decomposition */
...
...
@@ -203,12 +202,18 @@ void hl_matrix_inverse(real *A_d, real *C_d, int dimN, int lda, int ldc) {
  hl_memcpy(out_d, out_h, sizeof(real *));

  CHECK_CUBLAS(CUBLAS_GETRI(t_resource.handle,
                            dimN,
                            (const real **)inout_d,
                            lda,
                            pivot_d,
                            out_d,
                            ldc,
                            info_d,
                            1));

  hl_memcpy(&info_h, info_d, sizeof(int));
  if (info_h != 0) {
    LOG(FATAL) << "Inversion of matrix failed: matrix may be singular.\n";
  }

  hl_free_mem_device(inout_d);
...
...
@@ -218,12 +223,19 @@ void hl_matrix_inverse(real *A_d, real *C_d, int dimN, int lda, int ldc) {
  CHECK_SYNC("hl_matrix_inverse failed");
}

void hl_matrix_mul(real *A_d,
                   hl_trans_op_t transa,
                   real *B_d,
                   hl_trans_op_t transb,
                   real *C_d,
                   int dimM,
                   int dimN,
                   int dimK,
                   real alpha,
                   real beta,
                   int lda,
                   int ldb,
                   int ldc) {
  CHECK_NOTNULL(A_d);
  CHECK_NOTNULL(B_d);
  CHECK_NOTNULL(C_d);
...
...
@@ -231,8 +243,8 @@ void hl_matrix_mul(real *A_d, hl_trans_op_t transa,
  if (dimN == 1 && dimM != 1 && dimK != 1 && transb == HPPL_OP_N) {
    int m = (transa == HPPL_OP_N) ? dimM : dimK;
    int n = (transa == HPPL_OP_N) ? dimK : dimM;
    hl_matrix_mul_vector(
        A_d, transa, B_d, C_d, m, n, alpha, beta, lda, ldb, ldc);
    return;
  }
...
...
@@ -240,8 +252,7 @@ void hl_matrix_mul(real *A_d, hl_trans_op_t transa,
    int m = (transb == HPPL_OP_N) ? dimK : dimN;
    int n = (transb == HPPL_OP_N) ? dimN : dimK;
    hl_trans_op_t trans = (transb == HPPL_OP_N) ? HPPL_OP_T : HPPL_OP_N;
    hl_matrix_mul_vector(B_d, trans, A_d, C_d, m, n, alpha, beta, ldb, 1, 1);
    return;
  }
...
...
@@ -250,26 +261,47 @@ void hl_matrix_mul(real *A_d, hl_trans_op_t transa,
    stat = CUBLAS_GEMM(t_resource.handle,
                       CUBLAS_OP_N,
                       CUBLAS_OP_N,
                       dimN,
                       dimM,
                       dimK,
                       &alpha,
                       B_d,
                       ldb,
                       A_d,
                       lda,
                       &beta,
                       C_d,
                       ldc);
  } else if ((HPPL_OP_T == transa) && (HPPL_OP_N == transb)) {
    stat = CUBLAS_GEMM(t_resource.handle,
                       CUBLAS_OP_N,
                       CUBLAS_OP_T,
                       dimN,
                       dimM,
                       dimK,
                       &alpha,
                       B_d,
                       ldb,
                       A_d,
                       lda,
                       &beta,
                       C_d,
                       ldc);
  } else if ((HPPL_OP_N == transa) && (HPPL_OP_T == transb)) {
    stat = CUBLAS_GEMM(t_resource.handle,
                       CUBLAS_OP_T,
                       CUBLAS_OP_N,
                       dimN,
                       dimM,
                       dimK,
                       &alpha,
                       B_d,
                       ldb,
                       A_d,
                       lda,
                       &beta,
                       C_d,
                       ldc);
  } else {
    LOG(FATAL) << "parameter transa error!";
  }
...
...
@@ -277,24 +309,46 @@ void hl_matrix_mul(real *A_d, hl_trans_op_t transa,
  CHECK_SYNC("hl_matrix_mul failed");
}

void hl_matrix_mul(real *A_d,
                   hl_trans_op_t transa,
                   real *B_d,
                   hl_trans_op_t transb,
                   real *C_d,
                   int dimM,
                   int dimN,
                   int dimK,
                   real alpha,
                   real beta) {
  int lda = (HPPL_OP_N == transa) ? dimK : dimM;
  int ldb = (HPPL_OP_N == transb) ? dimN : dimK;
  int ldc = dimN;

  hl_matrix_mul(A_d,
                transa,
                B_d,
                transb,
                C_d,
                dimM,
                dimN,
                dimK,
                alpha,
                beta,
                lda,
                ldb,
                ldc);
}

void hl_matrix_mul_vector(real *A_d,
                          hl_trans_op_t trans,
                          real *B_d,
                          real *C_d,
                          int dimM,
                          int dimN,
                          real alpha,
                          real beta,
                          int lda,
                          int incb,
                          int incc) {
  CHECK_NOTNULL(A_d);
  CHECK_NOTNULL(B_d);
  CHECK_NOTNULL(C_d);
...
...
@@ -303,21 +357,29 @@ void hl_matrix_mul_vector(real *A_d, hl_trans_op_t trans,
  if (HPPL_OP_N == trans) {
    stat = CUBLAS_GEMV(t_resource.handle,
                       CUBLAS_OP_T,
                       dimN,
                       dimM,
                       &alpha,
                       A_d,
                       lda,
                       B_d,
                       incb,
                       &beta,
                       C_d,
                       incc);
  } else if (HPPL_OP_T == trans) {
    stat = CUBLAS_GEMV(t_resource.handle,
                       CUBLAS_OP_N,
                       dimN,
                       dimM,
                       &alpha,
                       A_d,
                       lda,
                       B_d,
                       incb,
                       &beta,
                       C_d,
                       incc);
  } else {
    LOG(FATAL) << "parameter transa error!";
  }
...
...
@@ -326,10 +388,14 @@ void hl_matrix_mul_vector(real *A_d, hl_trans_op_t trans,
  CHECK_SYNC("hl_matrix_mul_vector");
}

void hl_matrix_mul_vector(real *A_d,
                          hl_trans_op_t trans,
                          real *B_d,
                          real *C_d,
                          int dimM,
                          int dimN,
                          real alpha,
                          real beta) {
  hl_matrix_mul_vector(
      A_d, trans, B_d, C_d, dimM, dimN, alpha, beta, dimN, 1, 1);
}
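In `hl_matrix_mul` above, every `CUBLAS_GEMM` call passes `B_d` before `A_d` and `dimN` before `dimM`. One plausible reading, assuming the `hl_` API stores matrices row-major while cuBLAS works in column-major order, is that the call computes C^T = B^T * A^T, which leaves row-major C in the output buffer. The sketch below reproduces that trick on the CPU for the untransposed case only; `gemm_col_major` and `matmul_row_major` are hypothetical names and the naive loop is a stand-in for the BLAS routine, not the cuBLAS API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive column-major GEMM without transposes: C (m x n) = A (m x k) * B (k x n).
// A stand-in for a BLAS routine, not the cuBLAS API itself.
inline void gemm_col_major(int m, int n, int k,
                           const float* A, int lda,
                           const float* B, int ldb,
                           float* C, int ldc) {
  assert(m > 0 && n > 0 && k > 0);
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < m; ++i) {
      float acc = 0.0f;
      for (int p = 0; p < k; ++p) acc += A[i + p * lda] * B[p + j * ldb];
      C[i + j * ldc] = acc;
    }
  }
}

// Row-major C (M x N) = A (M x K) * B (K x N).
// A row-major buffer read as column-major is the transpose, so this call
// computes C^T = B^T * A^T by swapping the operands and the M/N extents.
inline std::vector<float> matmul_row_major(const std::vector<float>& A,
                                           const std::vector<float>& B,
                                           int M, int N, int K) {
  std::vector<float> C(static_cast<std::size_t>(M) * N, 0.0f);
  gemm_col_major(N, M, K, B.data(), /*lda=*/N, A.data(), /*ldb=*/K,
                 C.data(), /*ldc=*/N);
  return C;
}
```

The leading dimensions match the defaults in the shorter `hl_matrix_mul` overload for the untransposed case (`lda = dimK`, `ldb = dimN`, `ldc = dimN`), which is consistent with the row-major assumption but does not prove it.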
paddle/cuda/src/hl_cuda_cudnn.cc (diff collapsed)
paddle/cuda/src/hl_cuda_device.cc (diff collapsed)
paddle/cuda/src/hl_cudart_wrap.cc
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#ifdef PADDLE_USE_DSO
#include <mutex>
...
...
@@ -29,26 +28,26 @@ limitations under the License. */
namespace dynload {

extern std::once_flag cudart_dso_flag;
extern void *cudart_dso_handle;

/**
 * The following macro definition can generate structs
 * (for each function) to dynamic load cuda routine
 * via operator overloading.
 **/
#define DYNAMIC_LOAD_CUDART_WRAP(__name, __type)                               \
  struct DynLoad__##__name {                                                   \
    template <typename... Args>                                                \
    __type operator()(Args... args) {                                          \
      typedef __type (*cudartFunc)(Args...);                                   \
      std::call_once(cudart_dso_flag, GetCudartDsoHandle, &cudart_dso_handle); \
      void *p_##__name = dlsym(cudart_dso_handle, #__name);                    \
      return reinterpret_cast<cudartFunc>(p_##__name)(args...);                \
    }                                                                          \
  } __name; /* struct DynLoad__##__name */

/* include all needed cuda functions in HPPL */
// clang-format off
#define CUDA_ROUTINE_EACH(__macro) \
__macro(cudaLaunch, cudaError_t) \
__macro(cudaSetupArgument, cudaError_t) \
...
...
@@ -61,16 +60,17 @@ extern void* cudart_dso_handle;
__macro(__cudaInitModule, char) \
__macro(__cudaRegisterTexture, void) \
__macro(__cudaRegisterSurface, void)
// clang-format on

CUDA_ROUTINE_EACH(DYNAMIC_LOAD_CUDART_WRAP)

#if CUDART_VERSION >= 7000
DYNAMIC_LOAD_CUDART_WRAP(cudaLaunchKernel, cudaError_t)
#endif

#undef CUDA_ROUNTINE_EACH

} /* namespace dynload */

#if CUDART_VERSION >= 7000
__host__ cudaError_t CUDARTAPI cudaLaunchKernel(const void *func,
...
...
@@ -79,12 +79,11 @@ __host__ cudaError_t CUDARTAPI cudaLaunchKernel(const void *func,
                                                void **args,
                                                size_t sharedMem,
                                                cudaStream_t stream) {
  return dynload::cudaLaunchKernel(
      func, gridDim, blockDim, args, sharedMem, stream);
}
#endif /* CUDART_VERSION >= 7000 */

__host__ cudaError_t CUDARTAPI cudaLaunch(const void *func) {
  return dynload::cudaLaunch(func);
}
...
...
@@ -99,13 +98,12 @@ __host__ cudaError_t CUDARTAPI cudaConfigureCall(dim3 gridDim,
                                                 dim3 blockDim,
                                                 size_t sharedMem,
                                                 cudaStream_t stream) {
  return dynload::cudaConfigureCall(gridDim, blockDim, sharedMem, stream);
}

extern "C" {

void **CUDARTAPI __cudaRegisterFatBinary(void *fatCubin) {
  return dynload::__cudaRegisterFatBinary(fatCubin);
}
...
...
@@ -113,86 +111,87 @@ void CUDARTAPI __cudaUnregisterFatBinary(void **fatCubinHandle) {
  return dynload::__cudaUnregisterFatBinary(fatCubinHandle);
}

void CUDARTAPI __cudaRegisterFunction(void **fatCubinHandle,
                                      const char *hostFun,
                                      char *deviceFun,
                                      const char *deviceName,
                                      int thread_limit,
                                      uint3 *tid,
                                      uint3 *bid,
                                      dim3 *bDim,
                                      dim3 *gDim,
                                      int *wSize) {
  return dynload::__cudaRegisterFunction(fatCubinHandle,
                                         hostFun,
                                         deviceFun,
                                         deviceName,
                                         thread_limit,
                                         tid,
                                         bid,
                                         bDim,
                                         gDim,
                                         wSize);
}

void CUDARTAPI __cudaRegisterVar(void **fatCubinHandle,
                                 char *hostVar,
                                 char *deviceAddress,
                                 const char *deviceName,
                                 int ext,
                                 int size,
                                 int constant,
                                 int global) {
  return dynload::__cudaRegisterVar(fatCubinHandle,
                                    hostVar,
                                    deviceAddress,
                                    deviceName,
                                    ext,
                                    size,
                                    constant,
                                    global);
}

extern void CUDARTAPI __cudaRegisterManagedVar(void **fatCubinHandle,
                                               void **hostVarPtrAddress,
                                               char *deviceAddress,
                                               const char *deviceName,
                                               int ext,
                                               int size,
                                               int constant,
                                               int global) {
  return dynload::__cudaRegisterManagedVar(fatCubinHandle,
                                           hostVarPtrAddress,
                                           deviceAddress,
                                           deviceName,
                                           ext,
                                           size,
                                           constant,
                                           global);
}

char CUDARTAPI __cudaInitModule(void **fatCubinHandle) {
  return dynload::__cudaInitModule(fatCubinHandle);
}

void CUDARTAPI __cudaRegisterTexture(void **fatCubinHandle,
                                     const struct textureReference *hostVar,
                                     const void **deviceAddress,
                                     const char *deviceName,
                                     int dim,
                                     int norm,
                                     int ext) {
  return dynload::__cudaRegisterTexture(
      fatCubinHandle, hostVar, deviceAddress, deviceName, dim, norm, ext);
}

void CUDARTAPI __cudaRegisterSurface(void **fatCubinHandle,
                                     const struct surfaceReference *hostVar,
                                     const void **deviceAddress,
                                     const char *deviceName,
                                     int dim,
                                     int ext) {
  return dynload::__cudaRegisterSurface(
      fatCubinHandle, hostVar, deviceAddress, deviceName, dim, ext);
}

} /* extern "C" */
...
...
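The `DYNAMIC_LOAD_CUDART_WRAP` macro in this file generates, for each routine, a function object with the routine's own name whose templated `operator()` resolves the symbol by name and forwards the arguments. The sketch below imitates that shape without any real shared library: `lookup` is a hypothetical stand-in for `dlsym`, `ensure_open` is a hypothetical stand-in for the one-time handle setup (using a function-local static rather than `std::call_once`), and `add_impl` / `mul_impl` play the role of exported routines.

```cpp
#include <cassert>
#include <cstring>

// Ordinary functions standing in for routines exported by a shared library.
static int add_impl(int a, int b) { return a + b; }
static int mul_impl(int a, int b) { return a * b; }

namespace demo {

using AnyFn = void (*)();

int g_opens = 0;  // how many times the pretend library was opened

// Hypothetical stand-in for the one-time handle setup. A function-local
// static (instead of std::call_once) makes the increment run exactly once.
inline void ensure_open() {
  static const int opened = ++g_opens;
  (void)opened;
}

// Hypothetical stand-in for dlsym: map a symbol name to a type-erased pointer.
inline AnyFn lookup(const char* name) {
  if (std::strcmp(name, "add_impl") == 0)
    return reinterpret_cast<AnyFn>(&add_impl);
  if (std::strcmp(name, "mul_impl") == 0)
    return reinterpret_cast<AnyFn>(&mul_impl);
  return nullptr;
}

// Generates a functor object named after the wrapped routine. The pointer
// type is rebuilt from the argument types deduced at the call site.
#define DEMO_DYNAMIC_WRAP(fn, ret)             \
  struct DynLoad_##fn {                        \
    template <typename... Args>                \
    ret operator()(Args... args) {             \
      using Fn = ret (*)(Args...);             \
      ensure_open();                           \
      AnyFn p = lookup(#fn);                   \
      assert(p != nullptr);                    \
      return reinterpret_cast<Fn>(p)(args...); \
    }                                          \
  } fn;

DEMO_DYNAMIC_WRAP(add_impl, int)
DEMO_DYNAMIC_WRAP(mul_impl, int)

}  // namespace demo
```

With this in place `demo::add_impl(2, 3)` goes through the lookup and returns the sum. Because the function-pointer type is rebuilt from the deduced `Args...`, callers of such a wrapper need to pass arguments whose types match the real signature exactly; a mismatched type would be cast to the wrong pointer type without any compiler diagnostic.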
paddle/cuda/src/hl_math.cc
...
...
@@ -12,24 +12,15 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "avx_mathfun.h"
namespace hppl {

__m256 exp(__m256 a) { return exp256_ps(a); }

__m256 log(__m256 a) { return log256_ps(a); }

__m256 sin(__m256 a) { return sin256_ps(a); }

__m256 cos(__m256 a) { return cos256_ps(a); }

}  // namespace hppl
paddle/cuda/src/hl_time.cc
...
...
@@ -12,7 +12,6 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include <chrono>
#include <stdlib.h>
#include <iostream>
...
...
@@ -21,8 +20,7 @@ limitations under the License. */
using std::chrono::high_resolution_clock;

int64_t getCurrentTimeStick() {
  high_resolution_clock::time_point tp = high_resolution_clock::now();
  high_resolution_clock::duration dtn = tp.time_since_epoch();
  return dtn.count();
}
paddle/gserver/activations/ActivationFunction.cpp
...
...
@@ -51,12 +51,14 @@ static ClassRegistrar<ActivationFunction> gActivationRegistrar;
* @brief Macro for registering a derived activation class
*/
#define END_DEFINE_ACTIVATION(ACTIVATION_NAME)                     \
  }                                                                \
  ;                                                                \
  const std::string ACTIVATION_CLASS_NAME(ACTIVATION_NAME)::name = \
      #ACTIVATION_NAME;                                            \
  static InitFunction __reg_activation__##ACTIVATION_NAME([] {     \
    gActivationRegistrar                                           \
        .registerClass<ACTIVATION_CLASS_NAME(ACTIVATION_NAME)>(    \
            #ACTIVATION_NAME);                                     \
  });
/**
...
...
@@ -111,14 +113,22 @@ void backward(Argument& act) {
    outputG->softmaxBackward(*outputV);
  } else {
    SetDevice device(act.deviceId);
    Matrix::resizeOrCreate(sftMaxDot_,
                           outputG->getHeight(),
                           outputG->getWidth(),
                           /* trans */ false,
                           useGpu(act.deviceId));
    Matrix::resizeOrCreate(sftMaxSum_,
                           outputG->getHeight(),
                           1,
                           /* trans */ false,
                           useGpu(act.deviceId));
    if (!one_ || one_->getWidth() != outputG->getWidth()) {
      Matrix::resizeOrCreate(one_,
                             1,
                             outputG->getWidth(),
                             /* trans */ false,
                             useGpu(act.deviceId));
      one_->one();
    }
...
...
@@ -130,7 +140,6 @@ void backward(Argument& act) {
}
END_DEFINE_ACTIVATION(softmax)

/**
* @brief Sequence_softmax Activation
* @note Softmax on all frames of one sequence.
...
...
@@ -146,10 +155,16 @@ void forward(Argument& act) {
  CHECK_EQ(act.value->getWidth(), 1UL);
  if (!argument_.value) {
    argument_.value = Matrix::create(nullptr,
                                     /* height= */ 1,
                                     1,
                                     /* trans= */ false,
                                     useGpu(act.deviceId));
    argument_.grad = Matrix::create(nullptr,
                                    /* height= */ 1,
                                    1,
                                    /* trans= */ false,
                                    useGpu(act.deviceId));
  }

  auto starts = act.sequenceStartPositions->getVector(useGpu(act.deviceId));
...
...
@@ -267,8 +282,11 @@ END_DEFINE_ACTIVATION(softrelu)
BEGIN_DEFINE_ACTIVATION(abs)
void forward(Argument& act) {
  SetDevice device(act.deviceId);
  Matrix::resizeOrCreate(act.in,
                         act.value->getHeight(),
                         act.value->getWidth(),
                         /* trans */ false,
                         useGpu(act.deviceId));

  act.in->copyFrom(*act.value);
  act.value->abs(*act.value);
...
...
@@ -286,8 +304,11 @@ END_DEFINE_ACTIVATION(abs)
BEGIN_DEFINE_ACTIVATION(square)
void forward(Argument& act) {
  SetDevice device(act.deviceId);
  Matrix::resizeOrCreate(act.in,
                         act.value->getHeight(),
                         act.value->getWidth(),
                         /* trans */ false,
                         useGpu(act.deviceId));

  act.in->copyFrom(*act.value);
  act.value->square(*act.value);
...
...
@@ -317,8 +338,11 @@ END_DEFINE_ACTIVATION(exponential)
BEGIN_DEFINE_ACTIVATION(log)
void forward(Argument& act) {
  SetDevice device(act.deviceId);
  Matrix::resizeOrCreate(act.in,
                         act.value->getHeight(),
                         act.value->getWidth(),
                         /* trans */ false,
                         useGpu(act.deviceId));

  act.in->copyFrom(*act.value);
  act.value->log(*act.value);
...
...
@@ -333,11 +357,9 @@ ActivationFunction* ActivationFunction::create(const std::string& type) {
std::vector<std::string> ActivationFunction::getAllRegisteredTypes() {
  std::vector<std::string> types;
  gActivationRegistrar.forEachType(
      [&](const std::string& type) { types.push_back(type); });
  return types;
}

}  // namespace paddle
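`END_DEFINE_ACTIVATION` registers each activation class under its name through a static `InitFunction`, and `getAllRegisteredTypes` later walks the registrar. The sketch below shows the general name-to-factory registration idea in isolation. It is not Paddle's `ClassRegistrar`: `Activation`, `Registrar`, `registry`, `create_activation` and `DEMO_REGISTER_ACTIVATION` are all hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Activation {
  virtual ~Activation() = default;
  virtual float forward(float x) const = 0;
};

using Factory = std::function<std::unique_ptr<Activation>()>;

// Function-local static, so the map exists before any registrar object runs.
inline std::map<std::string, Factory>& registry() {
  static std::map<std::string, Factory> r;
  return r;
}

// Constructing one of these at namespace scope registers a factory at startup.
struct Registrar {
  Registrar(const std::string& name, Factory f) {
    registry()[name] = std::move(f);
  }
};

#define DEMO_REGISTER_ACTIVATION(NAME, CLASS) \
  static Registrar reg_activation_##NAME(     \
      #NAME, [] { return std::unique_ptr<Activation>(new CLASS()); });

struct Relu : Activation {
  float forward(float x) const override { return x > 0.0f ? x : 0.0f; }
};
struct Linear : Activation {
  float forward(float x) const override { return x; }
};

DEMO_REGISTER_ACTIVATION(relu, Relu)
DEMO_REGISTER_ACTIVATION(linear, Linear)

// Look a type up by name; returns nullptr for unknown names.
inline std::unique_ptr<Activation> create_activation(const std::string& type) {
  assert(!type.empty());
  auto it = registry().find(type);
  if (it == registry().end()) return nullptr;
  return it->second();
}
```

Calling `create_activation("relu")` then yields an object whose `forward` clamps negatives to zero, and iterating over `registry()` gives the list of registered names, much as `forEachType` does in the code above.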
paddle/gserver/activations/ActivationFunction.h (diff collapsed)
paddle/gserver/dataproviders/DataProvider.cpp (diff collapsed)
paddle/gserver/dataproviders/DataProvider.h (diff collapsed)
paddle/gserver/dataproviders/DataProviderGroup.h (diff collapsed)
paddle/gserver/dataproviders/MultiDataProvider.cpp (diff collapsed)
paddle/gserver/dataproviders/MultiDataProvider.h (diff collapsed)
paddle/gserver/dataproviders/ProtoDataProvider.cpp (diff collapsed)
paddle/gserver/dataproviders/ProtoDataProvider.h (diff collapsed)
paddle/gserver/dataproviders/ProtoReader.h (diff collapsed)
paddle/gserver/dataproviders/PyDataProvider.cpp (diff collapsed)
paddle/gserver/dataproviders/PyDataProvider.h (diff collapsed)
paddle/gserver/dataproviders/PyDataProvider2.cpp (diff collapsed)
paddle/gserver/evaluators/CTCErrorEvaluator.cpp (diff collapsed)
paddle/gserver/evaluators/ChunkEvaluator.cpp (diff collapsed)
paddle/gserver/evaluators/Evaluator.cpp (diff collapsed)
paddle/gserver/evaluators/Evaluator.h (diff collapsed)
paddle/gserver/gradientmachines/GradientMachine.cpp (diff collapsed)
paddle/gserver/gradientmachines/GradientMachine.h (diff collapsed)
paddle/gserver/gradientmachines/GradientMachineMode.h (diff collapsed)
paddle/gserver/gradientmachines/MultiGradientMachine.cpp (diff collapsed)
paddle/gserver/gradientmachines/MultiGradientMachine.h (diff collapsed)
paddle/gserver/gradientmachines/MultiNetwork.cpp (diff collapsed)
paddle/gserver/gradientmachines/MultiNetwork.h (diff collapsed)
paddle/gserver/gradientmachines/NeuralNetwork.cpp (diff collapsed)
paddle/gserver/gradientmachines/NeuralNetwork.h (diff collapsed)
paddle/gserver/gradientmachines/ParallelNeuralNetwork.cpp (diff collapsed)
paddle/gserver/gradientmachines/ParallelNeuralNetwork.h (diff collapsed)
paddle/gserver/gradientmachines/RecurrentGradientMachine.cpp
浏览文件 @
ebbe6e1a
此差异已折叠。
点击以展开。
paddle/gserver/gradientmachines/RecurrentGradientMachine.h
浏览文件 @
ebbe6e1a
此差异已折叠。
点击以展开。
paddle/gserver/layers/AddtoLayer.cpp (diff collapsed)
paddle/gserver/layers/AddtoLayer.h (diff collapsed)
paddle/gserver/layers/AgentLayer.cpp (diff collapsed)
paddle/gserver/layers/AgentLayer.h (diff collapsed)
paddle/gserver/layers/AverageLayer.cpp (diff collapsed)
paddle/gserver/layers/BatchNormBaseLayer.cpp (diff collapsed)
paddle/gserver/layers/BatchNormBaseLayer.h (diff collapsed)
paddle/gserver/layers/BatchNormalizationLayer.cpp (diff collapsed)
paddle/gserver/layers/BatchNormalizationLayer.h (diff collapsed)
paddle/gserver/layers/BilinearInterpLayer.cpp (diff collapsed)
paddle/gserver/layers/BlockExpandLayer.cpp (diff collapsed)
paddle/gserver/layers/BlockExpandLayer.h (diff collapsed)
paddle/gserver/layers/CRFDecodingLayer.cpp (diff collapsed)
paddle/gserver/layers/CRFDecodingLayer.h (diff collapsed)
paddle/gserver/layers/CRFLayer.cpp (diff collapsed)
paddle/gserver/layers/CRFLayer.h (diff collapsed)
paddle/gserver/layers/CTCLayer.cpp (diff collapsed)
paddle/gserver/layers/CTCLayer.h (diff collapsed)
paddle/gserver/layers/ConcatenateLayer.cpp (diff collapsed)
paddle/gserver/layers/ContextProjection.cpp (diff collapsed)
paddle/gserver/layers/ContextProjection.h (diff collapsed)
paddle/gserver/layers/ConvBaseLayer.cpp (diff collapsed)
paddle/gserver/layers/ConvBaseLayer.h (diff collapsed)
paddle/gserver/layers/ConvOperator.cpp (diff collapsed)
paddle/gserver/layers/ConvProjection.cpp (diff collapsed)
paddle/gserver/layers/ConvProjection.h (diff collapsed)
paddle/gserver/layers/ConvShiftLayer.cpp (diff collapsed)
paddle/gserver/layers/ConvexCombinationLayer.cpp (diff collapsed)
paddle/gserver/layers/CosSimLayer.cpp (diff collapsed)
paddle/gserver/layers/CosSimLayer.h (diff collapsed)
paddle/gserver/layers/CosSimVecMatLayer.cpp (diff collapsed)
paddle/gserver/layers/CostLayer.cpp (diff collapsed)
paddle/gserver/layers/CostLayer.h (diff collapsed)
paddle/gserver/layers/CudnnBatchNormLayer.cpp (diff collapsed)
paddle/gserver/layers/CudnnBatchNormLayer.h (diff collapsed)
paddle/gserver/layers/CudnnConvLayer.cpp (diff collapsed)
paddle/gserver/layers/CudnnConvLayer.h (diff collapsed)
paddle/gserver/layers/CudnnPoolLayer.cpp (diff collapsed)
paddle/gserver/layers/CudnnPoolLayer.h (diff collapsed)
paddle/gserver/layers/DataLayer.cpp (diff collapsed)
paddle/gserver/layers/DataLayer.h (diff collapsed)
paddle/gserver/layers/DataNormLayer.cpp (diff collapsed)
paddle/gserver/layers/DataNormLayer.h (diff collapsed)
paddle/gserver/layers/DotMulOperator.cpp (diff collapsed)
paddle/gserver/layers/DotMulProjection.cpp (diff collapsed)
paddle/gserver/layers/EosIdCheckLayer.cpp (diff collapsed)
paddle/gserver/layers/ExpandConvBaseLayer.cpp (diff collapsed)
paddle/gserver/layers/ExpandConvBaseLayer.h (diff collapsed)
paddle/gserver/layers/ExpandConvLayer.cpp (diff collapsed)
paddle/gserver/layers/ExpandConvLayer.h (diff collapsed)
paddle/gserver/layers/ExpandConvTransLayer.cpp (diff collapsed)
paddle/gserver/layers/ExpandConvTransLayer.h (diff collapsed)
paddle/gserver/layers/FeatureMapExpandLayer.cpp (diff collapsed)
paddle/gserver/layers/FullMatrixProjection.cpp (diff collapsed)
paddle/gserver/layers/FullMatrixProjection.h (diff collapsed)
paddle/gserver/layers/FullyConnectedLayer.cpp (diff collapsed)
paddle/gserver/layers/FullyConnectedLayer.h (diff collapsed)
paddle/gserver/layers/GatedRecurrentLayer.cpp (diff collapsed)
paddle/gserver/layers/GatedRecurrentLayer.h (diff collapsed)
paddle/gserver/layers/GetOutputLayer.cpp (diff collapsed)
paddle/gserver/layers/GruCompute.cpp (diff collapsed)
paddle/gserver/layers/GruCompute.h (diff collapsed)
paddle/gserver/layers/GruStepLayer.cpp (diff collapsed)
paddle/gserver/layers/HierarchicalSigmoidLayer.cpp (diff collapsed)
paddle/gserver/layers/HierarchicalSigmoidLayer.h (diff collapsed)
paddle/gserver/layers/IdentityProjection.cpp (diff collapsed)
paddle/gserver/layers/InterpolationLayer.cpp (diff collapsed)
paddle/gserver/layers/Layer.cpp (diff collapsed)
paddle/gserver/layers/Layer.h (diff collapsed)
paddle/gserver/layers/LinearChainCRF.cpp (diff collapsed)
paddle/gserver/layers/LinearChainCRF.h (diff collapsed)
paddle/gserver/layers/LinearChainCTC.cpp (diff collapsed)
paddle/gserver/layers/LinearChainCTC.h (diff collapsed)
paddle/gserver/layers/LstmCompute.cpp (diff collapsed)
paddle/gserver/layers/LstmCompute.h (diff collapsed)
paddle/gserver/layers/LstmLayer.cpp (diff collapsed)
paddle/gserver/layers/LstmLayer.h (diff collapsed)
paddle/gserver/layers/LstmStepLayer.cpp (diff collapsed)
paddle/gserver/layers/MDLstmLayer.cpp (diff collapsed)
paddle/gserver/layers/MaxIdLayer.cpp (diff collapsed)
paddle/gserver/layers/MaxLayer.cpp (diff collapsed)
paddle/gserver/layers/MaxLayer.h (diff collapsed)
paddle/gserver/layers/MixedLayer.cpp (diff collapsed)
paddle/gserver/layers/MixedLayer.h (diff collapsed)
paddle/gserver/layers/MultinomialSampler.cpp (diff collapsed)
paddle/gserver/layers/MultinomialSampler.h (diff collapsed)
paddle/gserver/layers/MultiplexLayer.cpp (diff collapsed)
paddle/gserver/layers/NCELayer.cpp (diff collapsed)
paddle/gserver/layers/NormLayer.cpp (diff collapsed)
paddle/gserver/layers/NormLayer.h (diff collapsed)
paddle/gserver/layers/NormProjectionLayer.cpp (diff collapsed)
paddle/gserver/layers/NormProjectionLayer.h (diff collapsed)
paddle/gserver/layers/Operator.cpp (diff collapsed)
paddle/gserver/layers/Operator.h (diff collapsed)
paddle/gserver/layers/OuterProdLayer.cpp (diff collapsed)
paddle/gserver/layers/ParameterReluLayer.cpp (diff collapsed)
paddle/gserver/layers/ParameterReluLayer.h (diff collapsed)
paddle/gserver/layers/PoolLayer.cpp (diff collapsed)
paddle/gserver/layers/PoolLayer.h (diff collapsed)
paddle/gserver/layers/PoolProjection.cpp (diff collapsed)
paddle/gserver/layers/PoolProjection.h (diff collapsed)
paddle/gserver/layers/PoolProjectionLayer.cpp (diff collapsed)
paddle/gserver/layers/PowerLayer.cpp (diff collapsed)
paddle/gserver/layers/PrintLayer.cpp (diff collapsed)
paddle/gserver/layers/Projection.cpp (diff collapsed)
paddle/gserver/layers/Projection.h (diff collapsed)
paddle/gserver/layers/RecurrentLayer.cpp (diff collapsed)
paddle/gserver/layers/RecurrentLayerGroup.cpp (diff collapsed)
paddle/gserver/layers/ResizeLayer.cpp (diff collapsed)
paddle/gserver/layers/ScalingLayer.cpp (diff collapsed)
paddle/gserver/layers/ScalingProjection.cpp (diff collapsed)
paddle/gserver/layers/SelectiveFullyConnectedLayer.cpp (diff collapsed)
paddle/gserver/layers/SelectiveFullyConnectedLayer.h (diff collapsed)
paddle/gserver/layers/SequenceConcatLayer.cpp (diff collapsed)
paddle/gserver/layers/SequenceLastInstanceLayer.cpp (diff collapsed)
paddle/gserver/layers/SequencePoolLayer.cpp (diff collapsed)
paddle/gserver/layers/SequenceReshapeLayer.cpp (diff collapsed)
paddle/gserver/layers/SequenceToBatch.cpp (diff collapsed)
paddle/gserver/layers/SequenceToBatch.h (diff collapsed)
paddle/gserver/layers/SlopeInterceptLayer.cpp (diff collapsed)
paddle/gserver/layers/SpatialPyramidPoolLayer.cpp (diff collapsed)
paddle/gserver/layers/SpatialPyramidPoolLayer.h (diff collapsed)
paddle/gserver/layers/SubSequenceLayer.cpp (diff collapsed)
paddle/gserver/layers/SumToOneNormLayer.cpp (diff collapsed)
paddle/gserver/layers/TableProjection.cpp (diff collapsed)
paddle/gserver/layers/TableProjection.h (diff collapsed)
paddle/gserver/layers/TensorLayer.cpp (diff collapsed)
paddle/gserver/layers/TensorLayer.h (diff collapsed)
paddle/gserver/layers/TransLayer.cpp (diff collapsed)
paddle/gserver/layers/TransLayer.h (diff collapsed)
paddle/gserver/layers/TransposedFullMatrixProjection.cpp (diff collapsed)
paddle/gserver/layers/ValidationLayer.cpp (diff collapsed)
paddle/gserver/tests/LayerGradUtil.cpp (diff collapsed)
paddle/gserver/tests/LayerGradUtil.h (diff collapsed)
paddle/gserver/tests/TestUtil.cpp (diff collapsed)
paddle/gserver/tests/TestUtil.h (diff collapsed)
paddle/gserver/tests/test_ActivationGrad.cpp (diff collapsed)
paddle/gserver/tests/test_ConvTrans.cpp (diff collapsed)
paddle/gserver/tests/test_Evaluator.cpp (diff collapsed)
paddle/gserver/tests/test_LayerGrad.cpp (diff collapsed)
paddle/gserver/tests/test_LinearChainCRF.cpp (diff collapsed)
paddle/gserver/tests/test_MultinomialSampler.cpp (diff collapsed)
paddle/gserver/tests/test_NetworkCompare.cpp (diff collapsed)
paddle/gserver/tests/test_ProtoDataProvider.cpp (diff collapsed)
paddle/gserver/tests/test_PyDataProvider.cpp (diff collapsed)
paddle/gserver/tests/test_PyDataProvider2.cpp (diff collapsed)
paddle/gserver/tests/test_RecurrentGradientMachine.cpp (diff collapsed)
paddle/gserver/tests/test_RecurrentLayer.cpp (diff collapsed)
paddle/gserver/tests/test_SelectiveFCLayer.cpp (diff collapsed)
paddle/math/Allocator.h (diff collapsed)
paddle/math/BaseMatrix.h (diff collapsed)
paddle/math/CpuSparseMatrix.cpp (diff collapsed)
paddle/math/CpuSparseMatrix.h (diff collapsed)
paddle/math/ExecViaCpu.h (diff collapsed)
paddle/math/MathFunctions.cpp (diff collapsed)
paddle/math/MathFunctions.h (diff collapsed)
paddle/math/MathUtils.cpp (diff collapsed)
paddle/math/MathUtils.h (diff collapsed)
paddle/math/Matrix.cpp (diff collapsed)
paddle/math/Matrix.h (diff collapsed)
paddle/math/MatrixBitCode.cpp (diff collapsed)
paddle/math/MemoryHandle.cpp (diff collapsed)
paddle/math/MemoryHandle.h (diff collapsed)
paddle/math/PoolAllocator.cpp (diff collapsed)
paddle/math/PoolAllocator.h (diff collapsed)
paddle/math/SIMDFunctions.cpp (diff collapsed)
paddle/math/SIMDFunctions.h (diff collapsed)
paddle/math/SparseMatrix.cpp (diff collapsed)
paddle/math/SparseMatrix.h (diff collapsed)
paddle/math/SparseRowMatrix.cpp (diff collapsed)
paddle/math/SparseRowMatrix.h (diff collapsed)
paddle/math/Storage.cpp (diff collapsed)
paddle/math/Vector.cpp (diff collapsed)
paddle/math/Vector.h (diff collapsed)
paddle/math/tests/test_Allocator.cpp (diff collapsed)
paddle/math/tests/test_ExecViaCpu.cpp (diff collapsed)
paddle/math/tests/test_FPException.cpp (diff collapsed)
paddle/math/tests/test_SIMDFunctions.cpp (diff collapsed)
paddle/math/tests/test_batchTranspose.cpp (diff collapsed)
paddle/math/tests/test_matrix.cpp (diff collapsed)
paddle/math/tests/test_matrixCompare.cpp (diff collapsed)
paddle/math/tests/test_matrixUtil.h (diff collapsed)
paddle/math/tests/test_perturbation.cpp (diff collapsed)
paddle/math/tests/test_sparseMatrixCompare.cpp (diff collapsed)
paddle/parameter/Argument.cpp (diff collapsed)
paddle/parameter/Argument.h (diff collapsed)
paddle/parameter/AverageOptimizer.cpp (diff collapsed)
paddle/parameter/AverageOptimizer.h (diff collapsed)
paddle/parameter/FirstOrderOptimizer.cpp (diff collapsed)
paddle/parameter/FirstOrderOptimizer.h (diff collapsed)
paddle/parameter/LearningRateScheduler.cpp (diff collapsed)
paddle/parameter/LearningRateScheduler.h (diff collapsed)
paddle/parameter/OptimizerFunctions.cpp (diff collapsed)
paddle/parameter/OptimizerFunctions.h (diff collapsed)
paddle/parameter/OptimizerWithRegularizer.cpp (diff collapsed)
paddle/parameter/OptimizerWithRegularizer.h (diff collapsed)
paddle/parameter/ParallelParameter.cpp (diff collapsed)
paddle/parameter/ParallelParameter.h (diff collapsed)
paddle/parameter/Parameter.cpp (diff collapsed)
paddle/parameter/Parameter.h (diff collapsed)
paddle/parameter/ParameterOptimizer.cpp (diff collapsed)
paddle/parameter/ParameterOptimizer.h (diff collapsed)
paddle/parameter/ParameterUpdateFunctions.cpp (diff collapsed)
paddle/parameter/ParameterUpdateFunctions.h (diff collapsed)
paddle/parameter/ParameterUpdaterBase.cpp (diff collapsed)
paddle/parameter/ParameterUpdaterBase.h (diff collapsed)
paddle/parameter/ParameterUpdaterHook.cpp (diff collapsed)
paddle/parameter/ParameterUpdaterHook.h (diff collapsed)
paddle/parameter/Regularizer.cpp (diff collapsed)
paddle/parameter/Regularizer.h (diff collapsed)
paddle/parameter/Weight.cpp (diff collapsed)
paddle/parameter/tests/test_common.cpp (diff collapsed)
paddle/pserver/BaseClient.cpp (diff collapsed)
paddle/pserver/BaseClient.h (diff collapsed)
paddle/pserver/LightNetwork.cpp (diff collapsed)
paddle/pserver/LightNetwork.h (diff collapsed)
paddle/pserver/ParameterClient2.cpp (diff collapsed)
paddle/pserver/ParameterClient2.h (diff collapsed)
paddle/pserver/ParameterServer2.cpp (diff collapsed)
paddle/pserver/ParameterServer2.h (diff collapsed)
paddle/pserver/ProtoServer.cpp (diff collapsed)
paddle/pserver/ProtoServer.h (diff collapsed)
paddle/pserver/RDMANetwork.h (diff collapsed)
paddle/pserver/SocketChannel.cpp (diff collapsed)
paddle/pserver/SocketChannel.h (diff collapsed)
paddle/pserver/SparseParameterDistribution.cpp (diff collapsed)
paddle/pserver/test/SocketTest.cpp (diff collapsed)
paddle/pserver/test/test_ParameterServer2.cpp (diff collapsed)
paddle/pserver/test/test_ProtoServer.cpp (diff collapsed)
paddle/trainer/ParamUtil.cpp (diff collapsed)
paddle/trainer/ParamUtil.h (diff collapsed)
paddle/trainer/ParameterUpdater.cpp (diff collapsed)
paddle/trainer/ParameterUpdater.h (diff collapsed)
paddle/trainer/RemoteParameterUpdater.cpp (diff collapsed)
paddle/trainer/RemoteParameterUpdater.h (diff collapsed)
paddle/trainer/Tester.cpp (diff collapsed)
paddle/trainer/Tester.h (diff collapsed)
paddle/trainer/TesterConfig.h (diff collapsed)
paddle/trainer/ThreadParameterUpdater.cpp (diff collapsed)
paddle/trainer/ThreadParameterUpdater.h (diff collapsed)
paddle/trainer/Trainer.cpp (diff collapsed)
paddle/trainer/Trainer.h (diff collapsed)
paddle/trainer/TrainerConfigHelper.cpp (diff collapsed)
paddle/trainer/TrainerConfigHelper.h (diff collapsed)
paddle/trainer/TrainerInternal.cpp (diff collapsed)
paddle/trainer/TrainerInternal.h (diff collapsed)
paddle/trainer/TrainerInternalConfig.cpp (diff collapsed)
paddle/trainer/TrainerInternalConfig.h (diff collapsed)
paddle/trainer/TrainerMain.cpp (diff collapsed)
paddle/trainer/tests/picojson.h (diff collapsed)
paddle/trainer/tests/test_Compare.cpp (diff collapsed)
paddle/trainer/tests/test_CompareSparse.cpp (diff collapsed)
paddle/trainer/tests/test_CompareTwoNets.cpp (diff collapsed)
paddle/trainer/tests/test_CompareTwoOpts.cpp (diff collapsed)
paddle/trainer/tests/test_Prediction.cpp (diff collapsed)
paddle/trainer/tests/test_PyDataProviderWrapper.cpp (diff collapsed)
paddle/trainer/tests/test_Trainer.cpp (diff collapsed)
paddle/trainer/tests/test_TrainerOnePass.cpp (diff collapsed)
paddle/trainer/tests/test_recurrent_machine_generation.cpp (diff collapsed)
paddle/utils/BarrierStat.cpp (diff collapsed)
paddle/utils/BarrierStat.h (diff collapsed)
paddle/utils/ClassRegistrar.h (diff collapsed)
paddle/utils/CommandLineParser.cpp (diff collapsed)
paddle/utils/CommandLineParser.h (diff collapsed)
paddle/utils/CustomStackTrace.cpp (diff collapsed)
paddle/utils/CustomStackTrace.h (diff collapsed)
paddle/utils/DisableCopy.h (diff collapsed)
paddle/utils/Excepts.cpp (diff collapsed)
paddle/utils/Flags.cpp (diff collapsed)
paddle/utils/Flags.h (diff collapsed)
paddle/utils/GlobalConstants.cpp (diff collapsed)
paddle/utils/GlobalConstants.h (diff collapsed)
paddle/utils/Locks.h (diff collapsed)
paddle/utils/Logging.cpp (diff collapsed)
paddle/utils/Logging.h (diff collapsed)
paddle/utils/PythonUtil.cpp (diff collapsed)
paddle/utils/PythonUtil.h (diff collapsed)
paddle/utils/Queue.h (diff collapsed)
paddle/utils/Stat.h (diff collapsed)
paddle/utils/StringUtil.h (diff collapsed)
paddle/utils/Thread.h (diff collapsed)
paddle/utils/ThreadLocal.cpp (diff collapsed)
paddle/utils/ThreadLocal.h (diff collapsed)
paddle/utils/TypeDefs.h (diff collapsed)
paddle/utils/Util.cpp (diff collapsed)
paddle/utils/Util.h (diff collapsed)
paddle/utils/Version.cpp (diff collapsed)
paddle/utils/Version.h (diff collapsed)
paddle/utils/arch/linux/Locks.cpp (diff collapsed)
paddle/utils/arch/osx/Locks.cpp (diff collapsed)
paddle/utils/tests/test_CommandLineParser.cpp (diff collapsed)
paddle/utils/tests/test_CustomStackTrace.cpp (diff collapsed)
paddle/utils/tests/test_CustomStackTracePrint.cpp (diff collapsed)
paddle/utils/tests/test_Logging.cpp (diff collapsed)
paddle/utils/tests/test_SpinLock.cpp (diff collapsed)
paddle/utils/tests/test_StringUtils.cpp (diff collapsed)
paddle/utils/tests/test_Thread.cpp (diff collapsed)
paddle/utils/tests/test_ThreadBarrier.cpp (diff collapsed)