Parameter sharing
Created by: emailweixu
The current way of sharing parameters between two parts of a model is to use the same full name for the parameter. This can become very cumbersome when sharing large models. There are two ways of achieving parameter sharing:
1) Object-oriented approach. This is used by PyTorch (https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#define-the-network). Currently, our reinforcement learning framework also chooses this approach (https://github.com/PaddlePaddle/PARL/blob/develop/parl/layers/tests/test_param_sharing.py) because it does not need to use any names, which is error-prone.
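To illustrate the idea, here is a minimal pure-Python sketch of the object-oriented style (the `Linear` class is hypothetical, not the actual PyTorch or PARL API): a layer object owns its parameters, so applying the same object twice shares them by identity, with no names involved.

```python
class Linear:
    """Toy fully-connected layer that owns its own weights (hypothetical class)."""

    def __init__(self, in_dim, out_dim):
        # Parameters live on the object; object identity, not a name,
        # ties the two uses together.
        self.weight = [[0.1] * in_dim for _ in range(out_dim)]

    def __call__(self, x):
        # y[i] = sum_j w[i][j] * x[j]
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]


shared = Linear(3, 2)

# Calling the same object on two inputs reuses the same parameters:
out_a = shared([1.0, 2.0, 3.0])
out_b = shared([4.0, 5.0, 6.0])
```

Both calls read the identical `weight` attribute, so any update to it affects every place the layer object is used.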
2) variable_scope. This is used by TensorFlow (https://www.tensorflow.org/api_docs/python/tf/variable_scope).
1) and 2) result in very different ways of writing models. Given our current state, perhaps we should implement a mechanism similar to variable_scope.