    xfs: Don't allocate new buffers on every call to _xfs_buf_find · 3815832a
    Dave Chinner committed
    Stats show that for an 8-way unlink @ ~80,000 unlinks/s we are doing
    ~1 million cache hit lookups to ~3000 buffer creates. That's almost
    3 orders of magnitude more cache hits than misses, so optimising for
    cache hits is quite important. In the cache hit case, we do not need
    the new buffer that is allocated in case of a cache miss, so we are
    effectively hitting the allocator for no good reason for the vast
    majority of calls to _xfs_buf_find. 8-way create workloads are
    showing similar cache hit/miss ratios.
    
    The result is profiles that look like this:
    
         samples  pcnt function                        DSO
         _______ _____ _______________________________ _________________
    
         1036.00 10.0% _xfs_buf_find                   [kernel.kallsyms]
          582.00  5.6% kmem_cache_alloc                [kernel.kallsyms]
          519.00  5.0% __memcpy                        [kernel.kallsyms]
          468.00  4.5% __ticket_spin_lock              [kernel.kallsyms]
          388.00  3.7% kmem_cache_free                 [kernel.kallsyms]
          331.00  3.2% xfs_log_commit_cil              [kernel.kallsyms]
    
    
    Further, there is a fair bit of work involved in initialising a new
    buffer once a cache miss has occurred and we currently do that under
    the rbtree spinlock. That increases spinlock hold time on what are
    heavily used trees.
    
    To fix this, remove the initialisation of the buffer from
    _xfs_buf_find() and only allocate the new buffer once we've had a
    cache miss. Initialise the buffer immediately after allocating it in
    xfs_buf_get, too, so that it is ready for insert if we get another
    cache miss after allocation. This minimises lock hold time and
    avoids unnecessary allocator churn. The resulting profiles look
    like:
    
         samples  pcnt function                    DSO
         _______ _____ ___________________________ _________________
    
         8111.00  9.1% _xfs_buf_find               [kernel.kallsyms]
         4380.00  4.9% __memcpy                    [kernel.kallsyms]
         4341.00  4.8% __ticket_spin_lock          [kernel.kallsyms]
         3401.00  3.8% kmem_cache_alloc            [kernel.kallsyms]
         2856.00  3.2% xfs_log_commit_cil          [kernel.kallsyms]
         2625.00  2.9% __kmalloc                   [kernel.kallsyms]
         2380.00  2.7% kfree                       [kernel.kallsyms]
         2016.00  2.3% kmem_cache_free             [kernel.kallsyms]
    
    This shows a significant reduction in time spent doing allocation and
    freeing from slabs (kmem_cache_alloc and kmem_cache_free).
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Alex Elder <aelder@sgi.com>
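
The lookup-first, allocate-on-miss pattern described in the commit message can be sketched roughly as below. This is a user-space illustration, not the XFS code: `struct buf`, `struct cache`, `cache_find()` and `cache_get()` are hypothetical names, a linked list stands in for the rbtree, and an assertion-checked flag stands in for the spinlock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the XFS code. */
struct buf {
    unsigned long key;      /* lookup key, e.g. a block number */
    struct buf *next;       /* simple list; the real cache uses an rbtree */
};

struct cache {
    struct buf *head;
    int locked;             /* models the tree spinlock (single-threaded sketch) */
    unsigned long allocs;   /* counts allocations, for illustration only */
};

static void cache_lock(struct cache *c)   { assert(!c->locked); c->locked = 1; }
static void cache_unlock(struct cache *c) { assert(c->locked);  c->locked = 0; }

/*
 * Look up key under the lock. On a miss, insert new_buf if the caller
 * supplied one. Returns the cached buffer, the newly inserted buffer,
 * or NULL on a miss when new_buf is NULL. Never allocates.
 */
static struct buf *cache_find(struct cache *c, unsigned long key,
                              struct buf *new_buf)
{
    struct buf *b;

    cache_lock(c);
    for (b = c->head; b != NULL; b = b->next) {
        if (b->key == key) {
            cache_unlock(c);
            return b;               /* cache hit: allocator untouched */
        }
    }
    if (new_buf != NULL) {          /* miss: insert the pre-initialised buffer */
        new_buf->next = c->head;
        c->head = new_buf;
    }
    cache_unlock(c);
    return new_buf;
}

/* Get a buffer for key, allocating only after a confirmed cache miss. */
static struct buf *cache_get(struct cache *c, unsigned long key)
{
    struct buf *b, *new_buf;

    b = cache_find(c, key, NULL);   /* first pass: pure lookup */
    if (b != NULL)
        return b;

    /* Allocate and initialise outside the lock to keep hold time short. */
    assert(!c->locked);
    new_buf = malloc(sizeof(*new_buf));
    if (new_buf == NULL)
        return NULL;
    new_buf->key = key;
    new_buf->next = NULL;
    c->allocs++;

    /* Second pass: insert, unless someone else inserted the key meanwhile. */
    b = cache_find(c, key, new_buf);
    if (b != new_buf)
        free(new_buf);              /* lost the race: discard our copy */
    return b;
}
```

Calling `cache_get()` twice with the same key on a zero-initialised `struct cache` allocates once; the second call returns the same pointer without touching `malloc()`, which is the behaviour the hit-heavy workloads above benefit from.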
xfs_buf.c 41.2 KB