powerpc: Speed up clear_page by unrolling it
Unroll clear_page 8 times. A simple microbenchmark which allocates and frees a zeroed page: for (i = 0; i < iterations; i++) { unsigned long p = __get_free_page(GFP_KERNEL | __GFP_ZERO); free_page(p); } improves 20% on POWER8. This assumes cacheline sizes won't grow beyond 512 bytes or page sizes wont drop below 1kB, which is unlikely, but we could add a runtime check during early init if it makes people nervous. Michael found that some versions of gcc produce quite bad code (all multiplies), so we give gcc a hand by using shifts and adds. Signed-off-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
Showing
想要评论请 注册 或 登录