• A
    NTB: Remove dma_sync_wait from ntb_async_rx · 905921e7
    Allen Hubbe 提交于
    The dma_sync_wait can hurt the performance of workloads mixed with both
    large and small frames.  Large frames will be copied using the dma
    engine.  Small frames will be copied by the cpu.  The dma_sync_wait
    prevents the cpu and dma engine copying in parallel.
    
    In the period where the cpu is copying, the dma engine is stopped.  The
    dma engine is not doing any useful work to copy large frames during that
    time, and the additional time to restart the dma engine for the next
    large frame.  This will decrease the throughput for the portion of a
    workload with large frames.
    
    In the period where the dma engine is copying, the cpu is held up
    waiting for dma to complete.  The small frames processing will be
    delayed until the dma is complete.  The RX frames are completed
    in-order, and the processing of small frames takes very little time, so
    dma_sync_wait may have an insignificant impact on the respose time of
    frames.  The more significant impact is to the system, because the delay
    in dma_sync_wait is implemented as busy non-blocking wait.  This can
    prevent the delayed core from doing any useful work, even if it could be
    processing work for other drivers, unrelated to transport RX processing.
    
    After applying the earlier patch to fix out-of-order RX acknoledgement,
    the dma_sync_wait is no longer necessary.  Remove it, so that cpu memcpy
    will proceed immediately for small frames, in parallel with ongoing dma
    for large frames.  Do not hold up the cpu from doing work while dma is
    in progress.  The prior fix will continue to ensure in-order completion
    of the RX frames to the upper layer, and in-order delivery of the RX
    acknoledgement.
    Signed-off-by: NAllen Hubbe <Allen.Hubbe@emc.com>
    Signed-off-by: NJon Mason <jdmason@kudzu.us>
    905921e7
ntb_transport.c 50.6 KB