ngraph cannot be used together with the garbage collection strategy
Created by: sneaxiy
We found that when FLAGS_use_ngraph is true and FLAGS_eager_delete_tensor_gb is 0, there is a diff in almost all ngraph unit tests test_xxx_ngraph_op.py.
FLAGS_eager_delete_tensor_gb is an environment variable that controls whether the garbage collection strategy is enabled. If FLAGS_eager_delete_tensor_gb is greater than or equal to 0, the garbage collection strategy is enabled.
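The enabling rule can be written down as a small check. This is only a sketch of the rule stated above, not PaddlePaddle code: the helper name gc_enabled_from_env is made up for illustration, and the unset case is deliberately left as "unknown" because the default value depends on the build.

```python
import os


def gc_enabled_from_env(env=os.environ):
    """Hypothetical helper: apply the rule "value >= 0 enables GC".

    Returns None when the variable is unset, because the default
    depends on the PaddlePaddle build and is not modeled here.
    """
    raw = env.get("FLAGS_eager_delete_tensor_gb")
    if raw is None:
        return None
    return float(raw) >= 0


# The configuration from this issue: value 0 means GC is enabled.
print(gc_enabled_from_env({"FLAGS_eager_delete_tensor_gb": "0"}))
```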
The garbage collection strategy in PaddlePaddle is designed to reduce both GPU and CPU memory usage. It releases the memory of Tensors that will not be used by any later operator in the network.
For example, there is a network with only 3 operators:
x2 = op1(x1)
x3 = op2(x2)
x4 = op3(x3)
Before op2 runs, we can release the memory of x1, because x1 is not used any more after op1 runs. For the same reason, we can release the memory of x2 after op2 runs, and the memory of x3 after op3 runs. In effect, the network becomes something like:
x2 = op1(x1)
release_memory(x1)
x3 = op2(x2)
release_memory(x2)
x4 = op3(x3)
release_memory(x3)
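The release points above can be derived mechanically from the last use of each variable. The following is a minimal sketch of that idea, not PaddlePaddle's actual implementation: plan_releases and render are hypothetical names, the network is modeled as a plain list of (op, inputs, outputs) tuples, and only variables that are read by some operator are released (x4 is kept as the result).

```python
from collections import defaultdict


def plan_releases(ops):
    """Map each op index to the variables whose last read happens at that op.

    ops is a list of (op_name, inputs, outputs) tuples in execution order.
    Hypothetical helper for illustration only.
    """
    last_use = {}
    for idx, (_, inputs, _) in enumerate(ops):
        for var in inputs:
            last_use[var] = idx  # later reads overwrite earlier ones
    releases = defaultdict(list)
    for var, idx in last_use.items():
        releases[idx].append(var)
    return releases


def render(ops):
    """Return the op sequence with release_memory(...) calls interleaved."""
    releases = plan_releases(ops)
    lines = []
    for idx, (name, inputs, outputs) in enumerate(ops):
        lines.append(f"{', '.join(outputs)} = {name}({', '.join(inputs)})")
        for var in releases.get(idx, []):
            lines.append(f"release_memory({var})")
    return lines


# The 3-operator network from the example above.
network = [
    ("op1", ["x1"], ["x2"]),
    ("op2", ["x2"], ["x3"]),
    ("op3", ["x3"], ["x4"]),
]
schedule = render(network)
print("\n".join(schedule))
```

If an executor fuses or reorders operators without updating this last-use information, a variable could be released while it is still needed, which is the kind of conflict this issue asks about.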
Does the ngraph implementation in PaddlePaddle conflict with the garbage collection strategy described above?
How to reproduce
If you want to reproduce the problem, please comment out this line and run any test_xxx_ngraph_op.py.