Fork自 PaddlePaddle / Paddle
* [CustomDevice] add profiler apis * migrate CalculateEstOccupancy into cuda_tracer * update * add ut