    cxl: Use call_rcu to reduce latency when releasing the afu fd · 8ac75b96
    Committed by Ian Munsie
    The afu fd release path was identified as a significant bottleneck in
    the overall performance of cxl. While an optimal AFU design would
    minimise the need to close & reopen the AFU fd, it is not always
    practical to avoid.
    
    The bottleneck seems to be down to the call to synchronize_rcu(), which
    will block until every other thread is guaranteed to be out of an RCU
    critical section. Replace it with call_rcu() to free the context
    structures later so we can return to the application sooner.
    
    This reduces the time spent in the fd release path from 13356 usec to
    13.3 usec, which by these figures is roughly a 1000x speed up.

    Reported-by: Fei K Chen <uchen@cn.ibm.com>
    Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
context.c 6.4 KB