    powerpc: Feature nop out reservation clear when stcx checks address · f89451fb
    Anton Blanchard committed
    The POWER architecture does not require stcx to check that it is operating
    on the same address as the larx. This means it is possible for an
    exception handler to execute a larx, get a reservation, decide
    not to do the stcx and then return with an active reservation. If the
    interrupted code was in the middle of a larx/stcx sequence, the stcx could
    incorrectly succeed.
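
The failure mode above can be sketched with a toy model in C. This is only an illustration under stated assumptions, not real hardware behaviour or kernel code: `struct cpu`, `larx()`, `stcx()` and `run_scenario()` are hypothetical names, and the handler's update of the counter is an invented example of work done while the reservation is lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one CPU's reservation state (hypothetical, not hardware). */
struct cpu {
    bool reserved;             /* is a reservation currently held? */
    const uint64_t *res_addr;  /* address the reservation was taken on */
    bool stcx_checks_address;  /* models CPUs whose stcx compares addresses */
};

/* larx: load the value and take a reservation on addr. */
static uint64_t larx(struct cpu *c, const uint64_t *addr)
{
    c->reserved = true;
    c->res_addr = addr;
    return *addr;
}

/* stcx: store only if a reservation is held (and matches, if checked). */
static bool stcx(struct cpu *c, uint64_t *addr, uint64_t val)
{
    bool ok = c->reserved &&
              (!c->stcx_checks_address || c->res_addr == addr);

    c->reserved = false;       /* any stcx clears the reservation */
    if (ok)
        *addr = val;
    return ok;
}

/*
 * Interrupted code increments 'counter' with a larx/stcx loop. Between its
 * larx and stcx, an exception handler adds 5 to the counter, then does a
 * larx on another address and returns without a matching stcx.
 * Returns the final counter value; 6 is the correct result.
 */
static uint64_t run_scenario(bool checks_address, bool clear_on_exit)
{
    struct cpu c = { false, NULL, checks_address };
    uint64_t counter = 0, other = 0, scratch = 0;

    uint64_t old = larx(&c, &counter);   /* interrupted code: reads 0 */

    /* --- exception handler --- */
    uint64_t h = larx(&c, &counter);
    stcx(&c, &counter, h + 5);           /* succeeds: counter is now 5 */
    (void)larx(&c, &other);              /* reservation taken, no stcx */
    if (clear_on_exit)
        stcx(&c, &scratch, 0);           /* dummy stcx clears reservation */
    /* --- return from exception --- */

    while (!stcx(&c, &counter, old + 1)) /* retry until the store lands */
        old = larx(&c, &counter);
    return counter;
}
```

In this model, `run_scenario(false, false)` loses the handler's update, while either clearing the reservation on exit or having stcx check the address gives the correct result.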
    
    All recent POWER CPUs check the address before letting the stcx succeed,
    so we can create a CPU feature and nop out the reservation clear on those
    CPUs. As Ben suggested, we can only do this in our syscall path, because
    there is a remote possibility that some kernel code gets interrupted by an
    exception that ends up operating on the same cacheline.
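
The idea of nopping out an instruction based on a CPU feature can be sketched in C as follows. This is a simplified model and not the kernel's feature fixup code: `struct ftr_section`, `apply_fixup()`, the bit value of `FTR_STCX_CHECKS_ADDRESS` and the `INSN_RESERVATION_CLEAR` word are assumptions made for illustration. The section is kept only when the masked feature bits equal the expected value, and is overwritten with nops otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PPC_NOP                 0x60000000u  /* ori 0,0,0 */
#define FTR_STCX_CHECKS_ADDRESS (1ull << 0)  /* hypothetical feature bit */
#define INSN_RESERVATION_CLEAR  0x7C0009ADu  /* stand-in word for the stcx */

/* One patchable range of instructions (indices into a code array). */
struct ftr_section {
    uint64_t mask;    /* feature bits to test */
    uint64_t value;   /* keep the section if (features & mask) == value */
    size_t start;     /* first instruction index in the section */
    size_t end;       /* one past the last instruction index */
};

/*
 * Overwrite the section with nops when the CPU's features do not match.
 * Returns the number of instructions that were patched.
 */
static size_t apply_fixup(uint32_t *code, const struct ftr_section *s,
                          uint64_t features)
{
    assert(s->start <= s->end);

    if ((features & s->mask) == s->value)
        return 0;                 /* features match: leave code alone */

    for (size_t i = s->start; i < s->end; i++)
        code[i] = PPC_NOP;
    return s->end - s->start;
}
```

A section with `mask = FTR_STCX_CHECKS_ADDRESS` and `value = 0` keeps the reservation clear on CPUs without the feature and replaces it with a nop on CPUs that have it.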
    
    Thanks to Paul Mackerras and Derek Williams for the idea.
    
    To test this I used a very simple null syscall (actually getppid) testcase
    at http://ozlabs.org/~anton/junkcode/null_syscall.c
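
For readers without access to that file, a null syscall measurement along these lines might look like the sketch below. It is not the testcase linked above: the function name `null_syscall_ns()`, the choice of `CLOCK_MONOTONIC` and the iteration count in the usage note are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/*
 * Time 'iterations' calls to getppid() and return the average cost of
 * one call in nanoseconds. getppid() does almost no work in the kernel,
 * so the result is dominated by syscall entry and exit overhead.
 */
static double null_syscall_ns(unsigned long iterations)
{
    struct timespec start, end;
    volatile pid_t sink = 0;    /* keeps the calls from being optimised out */

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (unsigned long i = 0; i < iterations; i++)
        sink = getppid();
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void)sink;

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9 +
                        (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Calling `null_syscall_ns(10000000)` on kernels built with and without the change gives two per-call times that can be compared directly.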
    
    I tested against 2.6.35-git10 with the following changes against the
    pseries_defconfig:
    
    CONFIG_VIRT_CPU_ACCOUNTING=n
    CONFIG_AUDIT=n
    CONFIG_PPC_4K_PAGES=n
    CONFIG_PPC_64K_PAGES=y
    CONFIG_FORCE_MAX_ZONEORDER=9
    CONFIG_PPC_SUBPAGE_PROT=n
    CONFIG_FUNCTION_TRACER=n
    CONFIG_FUNCTION_GRAPH_TRACER=n
    CONFIG_IRQSOFF_TRACER=n
    CONFIG_STACK_TRACER=n
    
    to remove the overhead of virtual CPU accounting, syscall auditing and
    the ftrace mcount tracers. 64kB pages were enabled to minimise TLB misses.
    
    POWER6: +8.2%
    POWER7: +7.0%
    
    Another suggestion was to use a larx to something in the L1 cache instead
    of a stcx. This was almost as fast as removing the stcx on POWER6, but only
    3.5% faster on POWER7. We can use this to speed up the reservation clear in
    our exception exit code.
    
    Signed-off-by: Anton Blanchard <anton@samba.org>
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>