Figure 1. Schematic of sparse matrix storage

Figure 2. Schematic of sequence input
| Python data type | C-API input data type |
|---|---|
| paddle.data_type.integer_value | Integer array; no sequence information required |
| paddle.data_type.dense_vector | Dense float matrix; no sequence information required |
| paddle.data_type.sparse_binary_vector | Sparse float matrix; non-zero values need not be provided (default 1); no sequence information required |
| paddle.data_type.sparse_vector | Sparse float matrix; non-zero values must be provided; no sequence information required |
| paddle.data_type.integer_value_sequence | Integer array; sequence information required |
| paddle.data_type.dense_vector_sequence | Dense float matrix; sequence information required |
| paddle.data_type.sparse_binary_vector_sequence | Sparse float matrix; non-zero values need not be provided (default 1); sequence information required |
| paddle.data_type.sparse_vector_sequence | Sparse float matrix; non-zero values must be provided; sequence information required |
| paddle.data_type.integer_value_sub_sequence | Integer array; nested (two-level) sequence information required |
| paddle.data_type.dense_vector_sub_sequence | Dense float matrix; nested (two-level) sequence information required |
| paddle.data_type.sparse_binary_vector_sub_sequence | Sparse float matrix; non-zero values need not be provided (default 1); nested (two-level) sequence information required |
| paddle.data_type.sparse_vector_sub_sequence | Sparse float matrix; non-zero values must be provided; nested (two-level) sequence information required |
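As a brief illustration of how these types appear on the Python side, the sketch below declares data layers with a few of the types from the table. This assumes the v2-style `paddle.v2` Python API; layer names and dimensions are illustrative:
```python
# Hedged sketch: declaring data layers whose Python types map to the
# C-API input types listed above. Names and sizes are illustrative.
import paddle.v2 as paddle

# Dense float input, no sequence information.
image = paddle.layer.data(name='image',
                          type=paddle.data_type.dense_vector(784))
# Integer label, no sequence information.
label = paddle.layer.data(name='label',
                          type=paddle.data_type.integer_value(10))
# Integer IDs with single-level sequence information.
words = paddle.layer.data(name='words',
                          type=paddle.data_type.integer_value_sequence(10000))
```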
Figure 1. Schematic of the C-API usage workflow
| IOS_PLATFORM | IOS_ARCH |
|---|---|
| OS | armv7, armv7s, arm64 |
| SIMULATOR | i386, x86_64 |
Figure 1. GAN model structure (figure credit)
The generator and discriminator take turns being trained with SGD. The objective of the generator is to have its generated images classified as real by the discriminator, while the objective of the discriminator is to correctly classify real and fake images. When the GAN model converges to the equilibrium state, the generator transforms the given noise distribution into the distribution of real images, and the discriminator can no longer distinguish real images from fake ones.

## Implementation of GAN Model Structure
Since the GAN model involves multiple neural networks, it requires the use of the Paddle Python API. The code walk-through below can therefore also partially serve as an introduction to the usage of the Paddle Python API.

There are three networks defined in gan_conf.py, namely **generator_training**, **discriminator_training** and **generator**. Their relationship to the model structure defined above is as follows: **discriminator_training** is the discriminator, **generator** is the generator, and **generator_training** combines the generator and the discriminator, since training the generator requires the discriminator to provide the loss function. This relationship is described in the following code:
```python
if is_generator_training:
    noise = data_layer(name="noise", size=noise_dim)
    sample = generator(noise)

if is_discriminator_training:
    sample = data_layer(name="sample", size=sample_dim)

if is_generator_training or is_discriminator_training:
    label = data_layer(name="label", size=1)
    prob = discriminator(sample)
    cost = cross_entropy(input=prob, label=label)
    classification_error_evaluator(
        input=prob, label=label, name=mode + '_error')
    outputs(cost)

if is_generator:
    noise = data_layer(name="noise", size=noise_dim)
    outputs(generator(noise))
```

In order to train the networks defined in gan_conf.py, one first needs to initialize a Paddle environment, parse the config, create a GradientMachine from the config, and create a trainer from the GradientMachine, as done in the code chunk below:
```python
import py_paddle.swig_paddle as api
# Initialize the Paddle environment.
api.initPaddle('--use_gpu=' + use_gpu, '--dot_period=10',
               '--log_period=100', '--gpu_id=' + args.gpu_id,
               '--save_dir=' + "./%s_params/" % data_source)

# Parse the configs.
gen_conf = parse_config(conf, "mode=generator_training,data=" + data_source)
dis_conf = parse_config(conf, "mode=discriminator_training,data=" + data_source)
generator_conf = parse_config(conf, "mode=generator,data=" + data_source)

# Create the GradientMachines.
dis_training_machine = api.GradientMachine.createFromConfigProto(
    dis_conf.model_config)
gen_training_machine = api.GradientMachine.createFromConfigProto(
    gen_conf.model_config)
generator_machine = api.GradientMachine.createFromConfigProto(
    generator_conf.model_config)

# Create the trainers.
dis_trainer = api.Trainer.create(dis_conf, dis_training_machine)
gen_trainer = api.Trainer.create(gen_conf, gen_training_machine)
```

In order to balance the strength of the generator and the discriminator, we schedule training for whichever network is currently performing worse, as determined by comparing their loss values.
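A hedged sketch of that scheduling decision is shown below. It relies on the helper `get_training_loss` defined next, and assumes the input batches (`gen_inputs`, `dis_inputs`, `data_batch_gen`, `data_batch_dis`) have been prepared elsewhere in gan_trainer.py:
```python
# Illustrative scheduling step: train whichever network currently has
# the higher loss. Batch variables are assumed to be prepared elsewhere.
g_loss = get_training_loss(gen_training_machine, gen_inputs)
d_loss = get_training_loss(dis_training_machine, dis_inputs)

if g_loss > d_loss:
    # The generator is doing worse: give it a gradient step.
    gen_trainer.trainOneDataBatch(batch_size, data_batch_gen)
else:
    dis_trainer.trainOneDataBatch(batch_size, data_batch_dis)
```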
The loss value itself can be calculated by a forward pass through the GradientMachine:
```python
def get_training_loss(training_machine, inputs):
    outputs = api.Arguments.createArguments(0)
    training_machine.forward(inputs, outputs, api.PASS_TEST)
    loss = outputs.getSlotValue(0).copyToNumpyMat()
    return numpy.mean(loss)
```

After training one network, one needs to sync the new parameters to the other networks. The code below demonstrates one example of such a use case:
```python
# Train the gen_training network.
gen_trainer.trainOneDataBatch(batch_size, data_batch_gen)

# Copy the parameters from gen_training to dis_training and generator.
copy_shared_parameters(gen_training_machine, dis_training_machine)
copy_shared_parameters(gen_training_machine, generator_machine)
```

## A Toy Example
With the infrastructure explained above, we can now walk you through a toy example of generating a two-dimensional uniform distribution from 10-dimensional Gaussian noise.

The Gaussian noise is generated using the code below:
```python
def get_noise(batch_size, noise_dim):
    return numpy.random.normal(size=(batch_size, noise_dim)).astype('float32')
```

The real samples (2-D uniform) are generated using the code below:
```python
# Synthesize 2-D uniform data in gan_trainer.py:114.
def load_uniform_data():
    data = numpy.random.rand(1000000, 2).astype('float32')
    return data
```

The generator and discriminator networks are built from fully-connected and batch_norm layers, and are defined in gan_conf.py.

To train the GAN model, one can use the command below. The flag -d specifies the training data (cifar, mnist or uniform) and the flag --useGpu specifies whether to use a GPU for training (0 is CPU, 1 is GPU).
```bash
$python gan_trainer.py -d uniform --useGpu 1
```
The generated samples can be found in ./uniform_samples/ and one example is shown below as Figure 2. One can see that it roughly recovers the 2-D uniform distribution.

Figure 2. Uniform Sample
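To draw samples from the trained generator programmatically, one can mirror the forward-pass pattern used for the loss above. A minimal sketch, assuming the slot layout from the earlier snippets (the function name is illustrative):
```python
# Hedged sketch: generate samples with a forward pass through
# generator_machine (created earlier from generator_conf).
def get_fake_samples(generator_machine, batch_size, noise_dim):
    gen_inputs = api.Arguments.createArguments(1)
    gen_inputs.setSlotValue(0, api.Matrix.createDenseFromNumpy(
        get_noise(batch_size, noise_dim)))
    gen_outputs = api.Arguments.createArguments(0)
    generator_machine.forward(gen_inputs, gen_outputs, api.PASS_TEST)
    return gen_outputs.getSlotValue(0).copyToNumpyMat()
```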
## MNIST Example
### Data preparation
To download the MNIST data, one can use the following commands:
```bash
$cd data/
$./get_mnist_data.sh
```

### Model description
Following the DCGAN paper (https://arxiv.org/abs/1511.06434), we use convolution/convolution-transpose layers in the discriminator/generator networks to better deal with images. The details of the network structures are defined in gan_conf_image.py.

### Training the model
To train the GAN model on MNIST data, one can use the following command:
```bash
$python gan_trainer.py -d mnist --useGpu 1
```
The generated sample images can be found at ./mnist_samples/ and one example is shown below as Figure 3.

Figure 3. MNIST Sample
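To inspect a generated sample outside the training script, a small viewing sketch (the file name is hypothetical; adjust it to whatever appears in ./mnist_samples/):
```python
# Hedged sketch: display one generated MNIST sample with matplotlib.
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

img = mpimg.imread('./mnist_samples/sample_0.png')  # hypothetical file name
plt.imshow(img, cmap='gray')
plt.axis('off')
plt.show()
```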
# Model Zoo - ImageNet #

[ImageNet](http://www.image-net.org/) is a well-known database for general object classification. This tutorial provides a convolutional classification network model trained on ImageNet.

## Introduction to ResNet

The ResNet architecture proposed in the paper [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385) won first place in the classification task of the ImageNet Large Scale Visual Recognition Challenge 2015 (ILSVRC 2015). The authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously. The figure below shows the residual connection: the block on the left is used in the 34-layer network, while the bottleneck block on the right is used in the 50-, 101- and 152-layer networks.
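Formally, with $\mathbf{x}$ and $\mathbf{y}$ the input and output of a block, a residual block learns a residual mapping $\mathcal{F}$ on top of an identity shortcut (notation as in the paper):

$$\mathbf{y} = \mathcal{F}(\mathbf{x}, \{W_i\}) + \mathbf{x}$$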
| ResNet | Top-1 | Model Size |
|---|---|---|
| ResNet-50 | 24.9% | 99M |
| ResNet-101 | 23.7% | 173M |
| ResNet-152 | 23.2% | 234M |
| Parameter Name | Size | Meaning |
|---|---|---|
| _res2_1_branch1_bn.w0 | 256 | gamma, the scale parameter |
| _res2_1_branch1_bn.w1 | 256 | mean of the feature map |
| _res2_1_branch1_bn.w2 | 256 | variance of the feature map |
| _res2_1_branch1_bn.wbias | 256 | beta, the shift parameter |
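To inspect one of these parameter files outside Paddle, a minimal sketch, assuming the v1 parameter file layout of a small fixed-size header followed by raw float32 values (the 16-byte header size is an assumption):
```python
import numpy as np

def load_parameter(file_name):
    # Assumed layout: a 16-byte header, then raw float32 values.
    with open(file_name, 'rb') as f:
        f.read(16)  # skip the header
        return np.fromfile(f, dtype=np.float32)

# e.g. the 256 scale values of the batch-norm layer above:
gamma = load_parameter('_res2_1_branch1_bn.w0')
```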
| Network name | Number of parameters | Test error |
|---|---|---|
| Logistic regression | 252 KB | 8.652% |
| Word embedding model | 15 MB | 8.484% |
| Convolutional model | 16 MB | 5.628% |
| Recurrent model | 16 MB | 4.812% |
| Network name | Number of parameters | Error rate | Configuration file name |
|---|---|---|---|
| Logistic regression model (BOW) | 252 KB | 8.652% | trainer_config.lr.py |
| Word embedding model | 15 MB | 8.484% | trainer_config.emb.py |
| Convolutional model | 16 MB | 5.628% | trainer_config.cnn.py |
| Time sequence model | 16 MB | 4.812% | trainer_config.lstm.py |
| Name | Explanation |
|---|---|
| Batch=20 | You have trained 20 batches. |
| samples=2560 | You have trained 2560 examples. |
| AvgCost | The average cost from the first batch to the current batch. |
| CurrentCost | The average cost of the last log_period batches. |
| Eval: classification_error_evaluator | The average classification error from the first batch to the current batch. |
| CurrentEval: classification_error_evaluator | The average error rate of the last log_period batches. |
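For reference, these fields appear together on a single line of the training log. An illustrative line assembled from the fields above (the cost and error values are made up):
```
Batch=20 samples=2560 AvgCost=0.6288 CurrentCost=0.6288 Eval: classification_error_evaluator=0.3043 CurrentEval: classification_error_evaluator=0.3043
```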
Figure 1. Sharing variables in operators.

Figure 2. Replacing the shared variable's gradient with an `Add` operator.
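The two figures describe how the backward pass handles a variable consumed by more than one operator: each consumer produces its own partial gradient, and an explicit `Add` operator accumulates them instead of letting one overwrite the other. A conceptual sketch of that accumulation (plain numpy, names illustrative):
```python
import numpy as np

# Partial gradients produced by two operators that share the same input
# variable during the backward pass (values are illustrative).
grad_from_op1 = np.array([0.10, -0.20, 0.30], dtype=np.float32)
grad_from_op2 = np.array([0.05, 0.40, -0.10], dtype=np.float32)

# The inserted `Add` operator sums the partial gradients into the
# shared variable's gradient, rather than overwriting it.
shared_var_grad = grad_from_op1 + grad_from_op2
```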