    mm/vmalloc: switch to bulk allocator in __vmalloc_area_node() · 5c1f4e69
    Committed by Uladzislau Rezki (Sony)
    Recently a page bulk allocator was introduced for users that need to
    get a number of pages in one call.
    
    For order-0 pages, switch from alloc_pages_node() to
    alloc_pages_bulk_array_node(); the former is not capable of allocating
    a set of pages, so it requires one call per page.
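    The batching pattern described above can be sketched in userspace C.
    This is not the kernel code; alloc_one_page(), bulk_alloc_pages() and
    alloc_area_pages() are hypothetical mocks standing in for
    alloc_pages_node(), __alloc_pages_bulk() and the loop in
    __vmalloc_area_node(). The bulk call fills NULL slots in a page array
    and may return fewer pages than requested, so a single-page fallback
    loop covers the remainder:

    ```c
    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    /* Mock of a single-page allocator: one call per page. */
    static void *alloc_one_page(void)
    {
    	return malloc(4096);
    }

    /*
     * Mock of a bulk allocator: fills NULL slots in 'pages' and returns
     * how many entries are populated afterwards.  A real bulk allocator
     * may stop early, which is why callers need a fallback path.
     */
    static int bulk_alloc_pages(void **pages, int nr_pages)
    {
    	int populated = 0;
    	int i;

    	for (i = 0; i < nr_pages; i++) {
    		if (!pages[i])
    			pages[i] = malloc(4096);
    		if (pages[i])
    			populated++;
    	}
    	return populated;
    }

    /*
     * Allocate nr_pages: prefer the bulk path, then fall back to
     * single-page calls for any slots the bulk call left empty.
     */
    static int alloc_area_pages(void **pages, int nr_pages)
    {
    	int got = bulk_alloc_pages(pages, nr_pages);
    	int i;

    	for (i = 0; got < nr_pages && i < nr_pages; i++) {
    		if (!pages[i]) {
    			pages[i] = alloc_one_page();
    			if (pages[i])
    				got++;
    		}
    	}
    	return got;
    }

    int main(void)
    {
    	void *pages[8];
    	int i;

    	memset(pages, 0, sizeof(pages));
    	assert(alloc_area_pages(pages, 8) == 8);
    	for (i = 0; i < 8; i++)
    		free(pages[i]);
    	return 0;
    }
    ```

    The win comes from amortizing per-call overhead (locking, zonelist
    walks in the real allocator) over many pages instead of paying it
    once per page.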
    
    Second, according to my tests the bulk allocator uses fewer cycles
    even when only one page is requested.  Running "perf" on the same
    test case shows the difference below:
    
    <default>
      - 45.18% __vmalloc_node
         - __vmalloc_node_range
            - 35.60% __alloc_pages
               - get_page_from_freelist
                    3.36% __list_del_entry_valid
                    3.00% check_preemption_disabled
                    1.42% prep_new_page
    <default>
    
    <patch>
      - 31.00% __vmalloc_node
         - __vmalloc_node_range
            - 14.48% __alloc_pages_bulk
                 3.22% __list_del_entry_valid
               - 0.83% __alloc_pages
                    get_page_from_freelist
    <patch>
    
    The "test_vmalloc.sh" also shows performance improvements:
    
    <default>
    fix_size_alloc_test_4MB   loops: 1000000 avg: 89105095 usec
    fix_size_alloc_test       loops: 1000000 avg: 513672   usec
    full_fit_alloc_test       loops: 1000000 avg: 748900   usec
    long_busy_list_alloc_test loops: 1000000 avg: 8043038  usec
    random_size_alloc_test    loops: 1000000 avg: 4028582  usec
    fix_align_alloc_test      loops: 1000000 avg: 1457671  usec
    <default>
    
    <patch>
    fix_size_alloc_test_4MB   loops: 1000000 avg: 62083711 usec
    fix_size_alloc_test       loops: 1000000 avg: 449207   usec
    full_fit_alloc_test       loops: 1000000 avg: 735985   usec
    long_busy_list_alloc_test loops: 1000000 avg: 5176052  usec
    random_size_alloc_test    loops: 1000000 avg: 2589252  usec
    fix_align_alloc_test      loops: 1000000 avg: 1365009  usec
    <patch>
    
    For example, the 4MB allocation test shows a ~30% gain; all the rest
    are also better.
    
    Link: https://lkml.kernel.org/r/20210516202056.2120-3-urezki@gmail.com
    Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Acked-by: Mel Gorman <mgorman@suse.de>
    Cc: Hillf Danton <hdanton@sina.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>