• K
    check_updates(): effective removal of cache entries marked CE_REMOVE · 36419c8e
    Kjetil Barvik 提交于
    Below is oprofile output from GIT command 'git chekcout -q my-v2.6.25'
    (move from tag v2.6.27 to tag v2.6.25 of the Linux kernel):
    
    CPU: Core 2, speed 1999.95 MHz (estimated)
    Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
                             mask of 0x00 (Unhalted core cycles) count 20000
    Counted INST_RETIRED_ANY_P events (number of instructions retired) with a
                               unit mask of 0x00 (No unit mask) count 20000
    CPU_CLK_UNHALT...|INST_RETIRED:2...|
      samples|      %|  samples|      %|
    ------------------------------------
       409247 100.000    342878 100.000 git
            CPU_CLK_UNHALT...|INST_RETIRED:2...|
              samples|      %|  samples|      %|
            ------------------------------------
               260476 63.6476    257843 75.1996 libz.so.1.2.3
               100876 24.6492     64378 18.7758 kernel-2.6.28.4_2.vmlinux
                30850  7.5382      7874  2.2964 libc-2.9.so
                14775  3.6103      8390  2.4469 git
                 2020  0.4936      4325  1.2614 libcrypto.so.0.9.8
                  191  0.0467        32  0.0093 libpthread-2.9.so
                   58  0.0142        36  0.0105 ld-2.9.so
                    1 2.4e-04         0       0 libldap-2.3.so.0.2.31
    
    Detail list of the top 20 function entries (libz counted in one blob):
    
    CPU_CLK_UNHALTED  INST_RETIRED_ANY_P
    samples  %        samples  %        image name               symbol name
    260476   63.6862  257843   75.2725  libz.so.1.2.3            /lib/libz.so.1.2.3
    16587     4.0555  3636      1.0615  libc-2.9.so              memcpy
    7710      1.8851  277       0.0809  libc-2.9.so              memmove
    3679      0.8995  1108      0.3235  kernel-2.6.28.4_2.vmlinux d_validate
    3546      0.8670  2607      0.7611  kernel-2.6.28.4_2.vmlinux __getblk
    3174      0.7760  1813      0.5293  libc-2.9.so              _int_malloc
    2396      0.5858  3681      1.0746  kernel-2.6.28.4_2.vmlinux copy_to_user
    2270      0.5550  2528      0.7380  kernel-2.6.28.4_2.vmlinux __link_path_walk
    2205      0.5391  1797      0.5246  kernel-2.6.28.4_2.vmlinux ext4_mark_iloc_dirty
    2103      0.5142  1203      0.3512  kernel-2.6.28.4_2.vmlinux find_first_zero_bit
    2077      0.5078  997       0.2911  kernel-2.6.28.4_2.vmlinux do_get_write_access
    2070      0.5061  514       0.1501  git                      cache_name_compare
    2043      0.4995  1501      0.4382  kernel-2.6.28.4_2.vmlinux rcu_irq_exit
    2022      0.4944  1732      0.5056  kernel-2.6.28.4_2.vmlinux __ext4_get_inode_loc
    2020      0.4939  4325      1.2626  libcrypto.so.0.9.8       /usr/lib/libcrypto.so.0.9.8
    1965      0.4804  1384      0.4040  git                      patch_delta
    1708      0.4176  984       0.2873  kernel-2.6.28.4_2.vmlinux rcu_sched_grace_period
    1682      0.4112  727       0.2122  kernel-2.6.28.4_2.vmlinux sysfs_slab_alias
    1659      0.4056  290       0.0847  git                      find_pack_entry_one
    1480      0.3619  1307      0.3816  kernel-2.6.28.4_2.vmlinux ext4_writepage_trans_blocks
    
    Notice the memmove line, where the CPU did 7710 / 277 = 27.8 cycles
    per instruction, and compared to the total cycles spent inside the
    source code of GIT for this command, all the memmove() calls
    translates to (7710 * 100) / 14775 = 52.2% of this.
    
    Retesting with a GIT program compiled for gcov usage, I found out that
    the memmove() calls came from remove_index_entry_at() in read-cache.c,
    where we have:
    
            memmove(istate->cache + pos,
                    istate->cache + pos + 1,
                    (istate->cache_nr - pos) * sizeof(struct cache_entry *));
    
    remove_index_entry_at() is called 4902 times from check_updates() in
    unpack-trees.c, and each time called we move each cache_entry pointers
    (from the removed one) one step to the left.
    
    Since we have 28828 entries in the cache this time, and if we on
    average move half of them each time, we in total move approximately
    4902 * 0.5 * 28828 * 4 = 282 629 712 bytes, or twice this amount if
    each pointer is 8 bytes (64 bit).
    
    OK, is seems that the function check_updates() is called 28 times, so
    the estimated guess above had been more correct if check_updates() had
    been called only once, but the point is: we get lots of bytes moved.
    
    To fix this, and use an O(N) algorithm instead, where N is the number
    of cache_entries, we delete/remove all entries in one loop through all
    entries.
    
    From a retest, the new remove_marked_cache_entries() from the patch
    below, ended up with the following output line from oprofile:
    
    46        0.0105  15        0.0041  git                      remove_marked_cache_entries
    
    If we can trust the numbers from oprofile in this case, we saved
    approximately ((7710 - 46) * 20000) / (2 * 1000 * 1000 * 1000) = 0.077
    seconds CPU time with this fix for this particular test.  And notice
    that now the CPU did only 46 / 15 = 3.1 cycles/instruction.
    Signed-off-by: NKjetil Barvik <barvik@broadpark.no>
    Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: NJunio C Hamano <gitster@pobox.com>
    36419c8e
cache.h 33.2 KB