Forked from PaddlePaddle / Paddle
* add bernoulli op
* fix cuda kernel and add unit test
* refine doc
* fix uniform