    ext4: make ext4_has_inline_data() an inline function · 83447ccb
    Zheng Liu committed
    ext4_has_inline_data() is now called from a wide range of code paths, so
    make it an inline function to avoid spending CPU cycles on the call.
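
    As a rough illustration of the change, the sketch below shows a small
    predicate defined as a static inline function in a header.  The struct,
    the flag and the sketch_ names are hypothetical stand-ins, not the real
    definitions from fs/ext4/ext4.h, and the predicate (a state bit plus a
    nonzero offset) is assumed for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the per-inode bookkeeping; the real
 * definitions in fs/ext4/ext4.h differ from these. */
#define SKETCH_STATE_MAY_INLINE_DATA (1UL << 0)

struct sketch_inode_info {
	unsigned long i_state_flags;	/* per-inode state bits */
	uint16_t i_inline_off;		/* offset of the inline data; 0 if none */
};

/*
 * Defined in the header as static inline, so the compiler can expand the
 * body at each call site instead of emitting a call to an out-of-line
 * function.  Assumed predicate: the state bit is set and the offset is
 * nonzero.
 */
static inline bool sketch_has_inline_data(const struct sketch_inode_info *ei)
{
	return (ei->i_state_flags & SKETCH_STATE_MAY_INLINE_DATA) &&
	       ei->i_inline_off != 0;
}
```

    Because the body is visible in every translation unit that includes the
    header, each caller may carry its own copy of the check, which is why
    the text size can grow slightly.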
    
    Change in text size (the text segment grows by 117 bytes):
    
             text     data      bss     dec     hex filename
    before: 326110    19258    5528  350896   55ab0 fs/ext4/ext4.o
    after:  326227    19258    5528  351013   55b25 fs/ext4/ext4.o
    
    I use the following script to measure the CPU usage.
    
      #!/bin/bash
    
      shm_base='/dev/shm'
      img=${shm_base}/ext4-img
      mnt=/mnt/loop
    
      e2fsprogs_base=$HOME/e2fsprogs
      mkfs=${e2fsprogs_base}/misc/mke2fs
      fsck=${e2fsprogs_base}/e2fsck/e2fsck
    
      sudo umount $mnt
      dd if=/dev/zero of=$img bs=4k count=3145728
      ${mkfs} -t ext4 -O inline_data -F $img
      sudo mount -t ext4 -o loop $img $mnt
    
      # start testing...
      testdir="${mnt}/testdir"
      mkdir $testdir
      cd $testdir
    
      echo "start testing..."
      for ((cnt=0;cnt<100;cnt++)); do
    
      for ((i=0;i<5;i++)); do
      	for ((j=0;j<5;j++)); do
      		for ((k=0;k<5;k++)); do
      			for ((l=0;l<5;l++)); do
      				mkdir -p $i/$j/$k/$l
      				echo "$i-$j-$k-$l" > $i/$j/$k/$l/testfile
      			done
      		done
      	done
      done
    
      ls -R $testdir > /dev/null
      rm -rf $testdir/*
    
      done
    
    The result of `perf top -G -U` is as below.
    
    vanilla:
     13.92%  [ext4]  [k] ext4_do_update_inode
      9.36%  [ext4]  [k] __ext4_get_inode_loc
      4.07%  [ext4]  [k] ftrace_define_fields_ext4_writepages
      3.83%  [ext4]  [k] __ext4_handle_dirty_metadata
      3.42%  [ext4]  [k] ext4_get_inode_flags
      2.71%  [ext4]  [k] ext4_mark_iloc_dirty
      2.46%  [ext4]  [k] ftrace_define_fields_ext4_direct_IO_enter
      2.26%  [ext4]  [k] ext4_get_inode_loc
      2.22%  [ext4]  [k] ext4_has_inline_data
      [...]
    
    After applying the patch, ext4_has_inline_data() no longer appears in the
    profile because it has been inlined, so perf cannot sample it as a separate
    symbol.  This does not prove that CPU cycles are saved overall, but at least
    the function call overhead is eliminated.  So IMHO we'd better inline this
    function.
    
    Cc: Andreas Dilger <adilger.kernel@dilger.ca>
    Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>