Created by: Superjomn
Pile Allocator
This is a buddy-allocator-based, high-level memory allocator for both CPU and GPU memory. Besides working as a normal memory allocator, it can reduce memory consumption in some special scenarios:
- In inference, when several models are executed sequentially, it helps reuse the memory for temporary variables.
- In training, it can likewise squeeze the memory used by sequential model execution.
An example scenario:
Suppose we have N models, each with an average memory size of W for weights and T for temporary variables, and all of these models run sequentially. By default, they take N*(W+T) memory in total.
With PileAllocator, all the models can share the memory space for temporary variables, so in total they take only N*W+T memory. The effect is remarkable when T is large.
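The savings above can be sketched with a little arithmetic. This is a hypothetical illustration, not the PileAllocator API: when models have differing temporary requirements, the shared space must cover the largest one, so the shared total is N*W plus the maximum T rather than an average.

```python
def default_footprint(sizes):
    """Memory when every model owns both its weights and its temporaries."""
    return sum(w + t for w, t in sizes)

def pile_footprint(sizes):
    """Memory when all models share a single temporary-variable space.

    Each model keeps its own weight buffer; the shared space must be as
    large as the biggest temporary requirement among the models.
    """
    return sum(w for w, _ in sizes) + max(t for _, t in sizes)

# Three models, each with W = 100 MB of weights and T = 400 MB of temporaries.
sizes = [(100, 400)] * 3
print(default_footprint(sizes))  # N*(W+T) = 3*(100+400) = 1500
print(pile_footprint(sizes))    # N*W + T = 3*100 + 400 = 700
```

With T four times larger than W, sharing the temporary space cuts total memory by more than half in this example.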
TODOs:
- TODO: add a performance benchmark
- TODO: reconsider the overall allocator interface; the design of Allocation as the return type seems to add overhead and make the interface less flexible.