RNN: High GPU memory usage for recurrent_group + simple_attention
Created by: byzhang
We observed significantly more GPU memory usage when using recurrent_group + simple_attention compared with the standard gru or lstm layers. @lyxm added some logging in https://github.com/lyxm/Paddle/commit/9ce03727c273542492d63db9bb16088153d1edc4, along with an analysis of the logging results. It appears that some of the GPU memory copies made in mixed_layer could be eliminated. @emailweixu commented: "The change probably belongs in RecurrentGradientMachine, around the `// connect in_links` part, so that each RNN frame is connected directly to the static input."

This is effectively a blocking issue for training a reasonably sized model on long sequences, so any help triaging it would be highly appreciated. cc @xliux
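For context, here is a minimal sketch of the shape of config we are running, assuming the trainer_config_helpers API; the dimensions and data layer names (`dict_dim`, `word_dim`, `hidden_dim`, `source_words`, `target_words`) are placeholders, not our actual model:

```python
# Minimal sketch of an attention decoder built with recurrent_group +
# simple_attention. All sizes and layer names below are hypothetical.
from paddle.trainer_config_helpers import *

dict_dim = 10000   # hypothetical vocabulary size
word_dim = 512     # hypothetical embedding size
hidden_dim = 512   # hypothetical hidden size

src = embedding_layer(input=data_layer(name='source_words', size=dict_dim),
                      size=word_dim)
trg = embedding_layer(input=data_layer(name='target_words', size=dict_dim),
                      size=word_dim)

# Encoder: a plain GRU over the source sequence.
encoded = simple_gru(input=src, size=hidden_dim)
encoded_proj = mixed_layer(input=full_matrix_projection(input=encoded),
                           size=hidden_dim)

def attention_step(enc_vec, enc_proj, current_word):
    # Decoder state carried across frames of the recurrent_group.
    decoder_mem = memory(name='decoder', size=hidden_dim)
    # simple_attention re-reads the whole encoder output (passed in as a
    # StaticInput) at every frame; this is where the extra GPU memory shows up.
    context = simple_attention(encoded_sequence=enc_vec,
                               encoded_proj=enc_proj,
                               decoder_state=decoder_mem)
    with mixed_layer(size=hidden_dim * 3) as step_input:
        step_input += full_matrix_projection(input=context)
        step_input += full_matrix_projection(input=current_word)
    return gru_step_layer(name='decoder',
                          input=step_input,
                          output_mem=decoder_mem,
                          size=hidden_dim)

decoder = recurrent_group(step=attention_step,
                          input=[StaticInput(input=encoded, is_seq=True),
                                 StaticInput(input=encoded_proj, is_seq=True),
                                 trg])

outputs(last_seq(input=decoder))
```

The baseline we compare against simply replaces the recurrent_group decoder above with a plain grumemory or lstmemory over the target sequence, with everything else unchanged; that version does not show the same memory growth.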