    percpu: improve generic percpu modify-return implementation · 1b5ca121
    Committed by Nicholas Piggin
    Some architectures require an additional load to find the address of
    percpu pointers. In some implementations, the C aliasing rules do not
    allow the result of that load to be kept over the store that modifies
    the percpu variable, which causes additional loads.
    
    Work around this by finding the pointer first, then operating on that.
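
    The workaround can be sketched in plain user-space C. This is only an
    illustration, not the kernel macros touched by the patch: the names
    this_area, add_return_two_lookups and add_return_one_lookup are
    hypothetical, and this_area merely stands in for the per-CPU base that
    an architecture has to load before it can address a percpu variable.
    Whether a compiler really emits the extra loads for the first form
    depends on the target and on the aliasing options in use.

```c
/* Hypothetical stand-in for a per-CPU area; not a kernel structure. */
struct cpu_area {
    long counter;
};

static struct cpu_area areas[2];

/* Hypothetical "per-CPU base": must be loaded to find the variable. */
static struct cpu_area *this_area = &areas[0];

/*
 * Shape before the change: modify, then read back through the base again.
 * If the compiler cannot prove that the store leaves this_area untouched,
 * it has to reload the base and the value for the return.
 */
long add_return_two_lookups(long val)
{
    this_area->counter += val;
    return this_area->counter;
}

/*
 * Shape after the change: resolve the address once, then operate on
 * that pointer for both the update and the returned value.
 */
long add_return_one_lookup(long val)
{
    long *p = &this_area->counter;

    *p += val;
    return *p;
}
```

    Both functions return the same value; the second only makes it explicit
    that a single address computation serves the update and the result.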
    
    It's also possible to mark things as restrict and play those kinds of games,
    but that can require larger and arch specific changes.
    
    On powerpc, __this_cpu_inc_return compiles to:
    
            ld 10,48(13)
            ldx 9,3,10
            addi 9,9,1
            stdx 9,3,10
            ld 9,48(13)
            ldx 3,9,3
    
    With this patch it compiles to:
    
            ld 10,48(13)
            ldx 9,3,10
            addi 9,9,1
            stdx 9,3,10

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    To: Tejun Heo <tj@kernel.org>
    To: Christoph Lameter <cl@linux.com>
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Signed-off-by: Tejun Heo <tj@kernel.org>
percpu.h 12.2 KB