    Add job=time in trainer, refine cudnn_conv to reduce gpu memory and speed up training. (#218) · 45c81a41
    qingqing01 committed
    * Add benchmark for PaddlePaddle, TensorFlow and Caffe
    
    * Add ConvProjection to reduce memory for GoogLeNet
    
    * Add unit test for ConvProjection.
    1. Add a unit test in test_LayerGrad.
    2. Compare ConvProjection with CudnnConvLayer, and compare concat_layer+img_conv_layer with concat_layer_conv_projection.
    
    * Reduce cudnn_conv memory and add benchmark document.
    1. Use TmpMatrix as the workspace in cudnn_conv, which substantially reduces GPU memory usage.
    2. Add benchmark document.
    3. Fix smallnet_mnist_cifar.py in paddle.
    
    * Add job=time and refine cudnn_conv to reduce GPU memory and speed up training
    
    * Refine cudnn_conv and shared biases operation in concat_layer and mixed_layer.
    
    * follow comments
    
    * follow comments
    
    * Use unique_ptr to prevent memory leaks in CudnnConvLayer.
hl_cuda_matrix.cu 22.4 KB