1. 11 Apr, 2015 2 commits
    • f2fs: persist system.advise into on-disk inode · 30c62fdb
      Chao Yu committed
      This patch marks the inode dirty so that i_advise in the f2fs inode info is
      persisted into the on-disk inode when the user sets system.advise through
      setxattr. Otherwise the new value will be lost.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid NULL pointer dereference in f2fs_xattr_advise_get · 84e97c27
      Chao Yu committed
      We encounter an oops when executing the command below.
      getfattr -n system.advise /mnt/f2fs/file
      Killed
      
      message log:
      BUG: unable to handle kernel NULL pointer dereference at   (null)
      IP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs]
      *pdpt = 00000000319b7001 *pde = 0000000000000000
      Oops: 0002 [#1] SMP
      Modules linked in: f2fs(O) snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq joydev
      snd_seq_device snd_timer bnep snd rfcomm microcode bluetooth soundcore i2c_piix4 mac_hid serio_raw parport_pc ppdev lp parport
      binfmt_misc hid_generic psmouse usbhid hid e1000 [last unloaded: f2fs]
      CPU: 3 PID: 3134 Comm: getfattr Tainted: G           O    4.0.0-rc1 #6
      Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      task: f3a71b60 ti: f19a6000 task.ti: f19a6000
      EIP: 0060:[<f8b54d69>] EFLAGS: 00010246 CPU: 3
      EIP is at f2fs_xattr_advise_get+0x29/0x40 [f2fs]
      EAX: 00000000 EBX: f19a7e71 ECX: 00000000 EDX: f8b5b467
      ESI: 00000000 EDI: f2008570 EBP: f19a7e14 ESP: f19a7e08
       DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      CR0: 80050033 CR2: 00000000 CR3: 319b8000 CR4: 000007f0
      Stack:
       f8b5a634 c0cbb580 00000000 f19a7e34 c1193850 00000000 00000007 f19a7e71
       f19a7e64 c0cbb580 c1193810 f19a7e50 c1193c00 00000000 00000000 00000000
       c0cbb580 00000000 f19a7f70 c1194097 00000000 00000000 00000000 74737973
      Call Trace:
       [<c1193850>] generic_getxattr+0x40/0x50
       [<c1193810>] ? xattr_resolve_name+0x80/0x80
       [<c1193c00>] vfs_getxattr+0x70/0xa0
       [<c1194097>] getxattr+0x87/0x190
       [<c11801d7>] ? path_lookupat+0x57/0x5f0
       [<c11819d2>] ? putname+0x32/0x50
       [<c116653a>] ? kmem_cache_alloc+0x2a/0x130
       [<c11819d2>] ? putname+0x32/0x50
       [<c11819d2>] ? putname+0x32/0x50
       [<c11819d2>] ? putname+0x32/0x50
       [<c11827f9>] ? user_path_at_empty+0x49/0x70
       [<c118283f>] ? user_path_at+0x1f/0x30
       [<c11941e7>] path_getxattr+0x47/0x80
       [<c11948e7>] SyS_getxattr+0x27/0x30
       [<c163f748>] sysenter_do_call+0x12/0x12
      Code: 66 90 55 89 e5 57 56 53 66 66 66 66 90 8b 78 20 89 d3 ba 67 b4 b5 f8 89 d8 89 ce e8 42 7c 7b c8 85 c0 75 16 0f b6 87 44 01 00
      00 <88> 06 b8 01 00 00 00 5b 5e 5f 5d c3 8d 76 00 b8 ea ff ff ff eb
      EIP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs] SS:ESP 0068:f19a7e08
      CR2: 0000000000000000
      ---[ end trace 860260654f1f416a ]---
      
      The reason is that getfattr works in two steps, as indicated by the strace info:
      1) try to look up the specified xattr and get its size.
      2) get the value of the extended attribute.
      
      strace info:
      getxattr("/mnt/f2fs/file", "system.advise", 0x0, 0) = 1
      getxattr("/mnt/f2fs/file", "system.advise", "\x00", 256) = 1
      
      For the first step, getfattr may pass a NULL pointer in @value and zero in @size
      as parameters for ->getxattr, but we access this @value pointer directly without
      checking whether the pointer is valid or not in f2fs_xattr_advise_get, so the
      oops occurs.
      
      This patch fixes the issue by verifying the @value pointer before using it.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
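The two-step getxattr pattern described above can be sketched in user-space C. This is a minimal illustration, not the kernel code: `struct demo_inode` and `demo_advise_get` are hypothetical names, and the only point shown is that the value is copied only when the caller supplied a buffer, while the size is returned in both cases.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the inode field that holds the advise byte. */
struct demo_inode {
	unsigned char i_advise;
};

/*
 * Sketch of a "get" handler: always report the attribute size, and copy
 * the value only when a buffer was passed. A size query (step 1 in the
 * strace output) arrives with buffer == NULL and size == 0.
 */
ssize_t demo_advise_get(const struct demo_inode *inode, void *buffer, size_t size)
{
	(void)size;	/* a one-byte value; bounds handling is omitted in this sketch */

	if (buffer)
		*(unsigned char *)buffer = inode->i_advise;

	return sizeof(inode->i_advise);
}
```

Calling it first with a NULL buffer and then with a real one mirrors the two getxattr calls made by getfattr.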
  2. 04 Nov, 2014 1 commit
    • f2fs: avoid deadlock on init_inode_metadata · bce8d112
      Jaegeuk Kim committed
      Previously, init_inode_metadata did not hold any parent directory's inode
      page, so f2fs_init_acl could grab its parent inode page without any problem.
      But when we use inline_dentry, that page is grabbed during f2fs_add_link,
      so we can fall into a deadlock condition like the one below.
      
      INFO: task mknod:11006 blocked for more than 120 seconds.
            Tainted: G           OE  3.17.0-rc1+ #13
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mknod           D ffff88003fc94580     0 11006  11004 0x00000000
       ffff880007717b10 0000000000000002 ffff88003c323220 ffff880007717fd8
       0000000000014580 0000000000014580 ffff88003daecb30 ffff88003c323220
       ffff88003fc94e80 ffff88003ffbb4e8 ffff880007717ba0 0000000000000002
      Call Trace:
       [<ffffffff8173dc40>] ? bit_wait+0x50/0x50
       [<ffffffff8173d4cd>] io_schedule+0x9d/0x130
       [<ffffffff8173dc6c>] bit_wait_io+0x2c/0x50
       [<ffffffff8173da3b>] __wait_on_bit_lock+0x4b/0xb0
       [<ffffffff811640a7>] __lock_page+0x67/0x70
       [<ffffffff810acf50>] ? autoremove_wake_function+0x40/0x40
       [<ffffffff811652cc>] pagecache_get_page+0x14c/0x1e0
       [<ffffffffa029afa9>] get_node_page+0x59/0x130 [f2fs]
       [<ffffffffa02a63ad>] read_all_xattrs+0x24d/0x430 [f2fs]
       [<ffffffffa02a6ca2>] f2fs_getxattr+0x52/0xe0 [f2fs]
       [<ffffffffa02a7481>] f2fs_get_acl+0x41/0x2d0 [f2fs]
       [<ffffffff8122d847>] get_acl+0x47/0x70
       [<ffffffff8122db5a>] posix_acl_create+0x5a/0x150
       [<ffffffffa02a7759>] f2fs_init_acl+0x29/0xcb [f2fs]
       [<ffffffffa0286a8d>] init_inode_metadata+0x5d/0x340 [f2fs]
       [<ffffffffa029253a>] f2fs_add_inline_entry+0x12a/0x2e0 [f2fs]
       [<ffffffffa0286ea5>] __f2fs_add_link+0x45/0x4a0 [f2fs]
       [<ffffffffa028b5b6>] ? f2fs_new_inode+0x146/0x220 [f2fs]
       [<ffffffffa028b816>] f2fs_mknod+0x86/0xf0 [f2fs]
       [<ffffffff811e3ec1>] vfs_mknod+0xe1/0x160
       [<ffffffff811e4b26>] SyS_mknod+0x1f6/0x200
       [<ffffffff81741d7f>] tracesys+0xe1/0xe6
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  3. 10 Sep, 2014 1 commit
  4. 04 Sep, 2014 1 commit
  5. 20 Aug, 2014 1 commit
  6. 02 Jun, 2014 1 commit
    • f2fs: fix recursive lock by f2fs_setxattr · d631abda
      Jaegeuk Kim committed
      This patch should resolve the following recursive lock.
      
      [<ffffffff8135a9c3>] call_rwsem_down_write_failed+0x13/0x20
      [<ffffffffa01749dc>] f2fs_setxattr+0x5c/0xa0 [f2fs]
      [<ffffffffa0174c99>] __f2fs_set_acl+0x1b9/0x340 [f2fs]
      [<ffffffffa017515a>] f2fs_init_acl+0x4a/0xcb [f2fs]
      [<ffffffffa0159abe>] __f2fs_add_link+0x26e/0x780 [f2fs]
      [<ffffffffa015d4d8>] f2fs_mkdir+0xb8/0x150 [f2fs]
      [<ffffffff811cebd7>] vfs_mkdir+0xb7/0x160
      [<ffffffff811cf89b>] SyS_mkdir+0xab/0xe0
      [<ffffffff817244bf>] tracesys+0xe1/0xe6
      [<ffffffffffffffff>] 0xffffffffffffffff
      
      The call path indicates:
      - f2fs_add_link
         : down_write(&fi->i_sem);
      
       - init_inode_metadata
         - f2fs_init_acl
           - __f2fs_set_acl
             - f2fs_setxattr
               : down_write(&fi->i_sem);
      
      Here we should call __f2fs_setxattr rather than f2fs_setxattr.
      But __f2fs_setxattr is a static function in xattr.c, so I took another, more
      generic approach that keeps using f2fs_setxattr.
      
      In f2fs_setxattr, the page pointer is given only from init_inode_metadata.
      So this patch checks that condition in f2fs_setxattr to avoid the recursive lock.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
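The approach described above (skip the lock when a page pointer shows that the caller is the inode-creation path) can be sketched as follows. Everything here is hypothetical and simplified: `struct demo_lock` is a toy flag standing in for the rw_semaphore so that a second acquisition is reported as an error instead of hanging, and `demo_setxattr` is not the real f2fs function.

```c
#include <stddef.h>

/* Toy stand-in for a non-recursive lock: a second acquire fails
 * instead of blocking forever, which keeps the sketch testable. */
struct demo_lock {
	int held;
};

int demo_lock_acquire(struct demo_lock *l)
{
	if (l->held)
		return -1;	/* a real rw_semaphore would deadlock here */
	l->held = 1;
	return 0;
}

void demo_lock_release(struct demo_lock *l)
{
	l->held = 0;
}

struct demo_inode {
	struct demo_lock i_sem;
	int nr_xattrs;
};

/* Inner worker: the caller must already hold i_sem. */
int demo_setxattr_locked(struct demo_inode *inode)
{
	inode->nr_xattrs++;
	return 0;
}

/* Outer entry point: a non-NULL ipage is taken as the sign that the
 * caller already holds i_sem, so the lock is not taken again. */
int demo_setxattr(struct demo_inode *inode, void *ipage)
{
	int err;

	if (ipage)
		return demo_setxattr_locked(inode);

	if (demo_lock_acquire(&inode->i_sem))
		return -1;
	err = demo_setxattr_locked(inode);
	demo_lock_release(&inode->i_sem);
	return err;
}
```

With the lock already held, a call without a page fails in this toy model, while a call that passes a page goes straight to the inner worker.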
  7. 07 May, 2014 4 commits
  8. 01 Apr, 2014 1 commit
  9. 20 Mar, 2014 2 commits
    • f2fs: avoid RECLAIM_FS-ON-W warning · 808a1d74
      Jaegeuk Kim committed
      This patch should resolve the following possible bug.
      
      RECLAIM_FS-ON-W at:
       mark_held_locks+0xb9/0x140
       lockdep_trace_alloc+0x85/0xf0
       __kmalloc+0x53/0x1d0
       read_all_xattrs+0x3d1/0x3f0 [f2fs]
       f2fs_getxattr+0x4f/0x100 [f2fs]
       f2fs_get_acl+0x4c/0x290 [f2fs]
       get_acl+0x4f/0x80
       posix_acl_create+0x72/0x180
       f2fs_init_acl+0x29/0xcc [f2fs]
       __f2fs_add_link+0x259/0x710 [f2fs]
       f2fs_create+0xad/0x1c0 [f2fs]
       vfs_create+0xed/0x150
       do_last+0xd36/0xed0
       path_openat+0xc5/0x680
       do_filp_open+0x43/0xa0
       do_sys_open+0x13c/0x230
       SyS_creat+0x1e/0x20
       system_call_fastpath+0x16/0x1b
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: introduce fi->i_sem to protect fi's info · d928bfbf
      Jaegeuk Kim committed
      This patch introduces fi->i_sem to protect fi's info, which includes xattr_ver,
      pino, and i_nlink.
      This makes it possible to drop i_mutex during f2fs_sync_file, resulting in a
      performance improvement when a number of fsync calls are triggered from many
      concurrent threads.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
  10. 26 Jan, 2014 1 commit
  11. 23 Dec, 2013 1 commit
  12. 29 Oct, 2013 2 commits
  13. 07 Oct, 2013 1 commit
    • f2fs: use rw_sem instead of fs_lock(locks mutex) · e479556b
      Gu Zheng committed
      The fs_locks are used to block other ops (e.g., recovery) while a checkpoint is
      in progress, and every operation other than checkpoint needs to acquire an fs_lock.
      This causes a serious problem: if too many concurrent threads are acquiring
      fs_locks, they block each other, which may hurt performance.
      Although some optimization patches have been introduced to improve the usage of
      fs_lock, the thorough solution is to replace the fs_lock with a *rw_sem*.
      The checkpoint routine takes the write_sem and other ops take the read_sem, so
      we can still block other ops (e.g., recovery) during checkpoint, while other ops
      no longer disturb each other. This avoids the problem described above completely.
      Because of a weakness of rw_sem, the above change may introduce a potential problem:
      the checkpoint thread might be starved if other threads are intensively taking
      the read semaphore for I/O. (Pointed out by Xu Jin)
      In order to avoid this, a wait_list is introduced. Newly arriving read semaphore
      ops are put on the wait_list if the checkpoint thread is waiting for the write
      semaphore, and are woken up when the checkpoint thread releases the write semaphore.
      Thanks to Kim for the previous review and testing; I would be very glad to see
      other people's performance tests of this patch.
      
      V2:
        -fix the potential starvation problem.
        -use more suitable func name suggested by Xu Jin.
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      [Jaegeuk Kim: adjust minor coding standard]
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
  14. 25 Sep, 2013 1 commit
    • f2fs: don't GC or take an fs_lock from f2fs_initxattrs() · 52ab9560
      Russ Knize committed
      f2fs_initxattrs() is called internally from within F2FS and should
      not call functions that are used by VFS handlers.  This avoids
      certain deadlocks:
      
      - vfs_create()
       - f2fs_create() <-- takes an fs_lock
        - f2fs_add_link()
         - __f2fs_add_link()
          - init_inode_metadata()
           - f2fs_init_security()
            - security_inode_init_security()
             - f2fs_initxattrs()
              - f2fs_setxattr() <-- also takes an fs_lock
      
      If the caller happens to grab the same fs_lock from the pool in both
      places, they will deadlock.  There are also deadlocks involving
      multiple threads and mutexes:
      
      - f2fs_write_begin()
       - f2fs_balance_fs() <-- takes gc_mutex
        - f2fs_gc()
         - write_checkpoint()
          - block_operations()
           - mutex_lock_all() <-- blocks trying to grab all fs_locks
      
      - f2fs_mkdir() <-- takes an fs_lock
       - __f2fs_add_link()
        - f2fs_init_security()
         - security_inode_init_security()
          - f2fs_initxattrs()
           - f2fs_setxattr()
            - f2fs_balance_fs() <-- blocks trying to take gc_mutex
      Signed-off-by: Russ Knize <Russ.Knize@motorola.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
  15. 26 Aug, 2013 2 commits
    • f2fs: support the inline xattrs · 65985d93
      Jaegeuk Kim committed
      0. modified inode structure
      --------------------------------------
      metadata (e.g., i_mtime, i_ctime, etc)
      --------------------------------------
      direct pointers [0 ~ 873]
      
      inline xattrs (200 bytes by default)
      
      indirect pointers [0 ~ 4]
      --------------------------------------
      node footer
      --------------------------------------
      
      1. setxattr flow
       - read_all_xattrs copies all the xattrs from inline and xattr node block.
       - handle xattr entries
       - write_all_xattrs copies modified xattrs into inline and xattr node block.
      
      2. getxattr flow
       - read_all_xattrs copies all the xattrs from inline and xattr node block.
       - check target entries
      
      3. Usage
       # mount -t f2fs -o inline_xattr $DEV $MNT
      
       Once mounted with the inline_xattr option, f2fs marks all newly created
       files to explicitly reserve an amount of inline xattr space inside the inode
       block. Without the mount option, f2fs touches neither existing files nor
       newly created ones.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
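The "read_all_xattrs copies all the xattrs from inline and xattr node block" step above can be pictured as gathering two regions into one contiguous buffer. The sketch below assumes the inline area comes first and the node-block area follows; `demo_read_all_xattrs` and its parameters are hypothetical, not the f2fs signature.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Gather the inline xattr area and the xattr node block area into one
 * heap buffer so that entries can be searched and edited in a single
 * pass. Returns NULL on allocation failure; the caller frees the buffer.
 */
unsigned char *demo_read_all_xattrs(const void *inline_area, size_t inline_size,
				    const void *node_area, size_t node_size,
				    size_t *total)
{
	size_t len = inline_size + node_size;
	unsigned char *buf = calloc(1, len ? len : 1);

	if (!buf)
		return NULL;

	if (inline_size)
		memcpy(buf, inline_area, inline_size);
	if (node_size)
		memcpy(buf + inline_size, node_area, node_size);

	*total = len;
	return buf;
}
```

A matching write step would split the buffer back at `inline_size`, which is why both flows in the commit message start from the same combined view.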
    • f2fs: introduce __find_xattr for readability · dd9cfe23
      Jaegeuk Kim committed
      __find_xattr searches for the wanted xattr entry, starting from
      base_addr.
      
      If it is not found, the returned entry is the last, empty xattr entry, which
      can be newly allocated.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
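The search contract described above (return the match, or else the terminating empty entry) can be sketched with a simplified, fixed-size entry layout. The struct and `demo_find_xattr` are hypothetical; the on-disk f2fs entries are variable-length, which this sketch does not model.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_NAME_MAX 16

/* Simplified entry: name_index == 0 marks the terminating empty entry. */
struct demo_xattr_entry {
	unsigned char name_index;
	unsigned char name_len;
	char name[DEMO_NAME_MAX];
};

/*
 * Walk the entries from base. Returns the matching entry, or the
 * terminating empty entry when nothing matches, so the caller can both
 * detect "not found" and reuse the slot for a new attribute.
 */
struct demo_xattr_entry *demo_find_xattr(struct demo_xattr_entry *base,
					 int index, const char *name)
{
	size_t len = strlen(name);
	struct demo_xattr_entry *e;

	for (e = base; e->name_index != 0; e++) {
		if (e->name_index == index && e->name_len == len &&
		    memcmp(e->name, name, len) == 0)
			break;
	}
	return e;
}
```

The caller distinguishes the two outcomes by checking whether the returned entry's `name_index` is zero.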
  16. 12 Aug, 2013 1 commit
    • f2fs: should cover i_xattr_nid with its xattr node page lock · 479bd73a
      Jaegeuk Kim committed
      Previously, f2fs_setxattr assigned i_xattr_nid in the inode page inconsistently.
      
      The scenario is:
      
      = Thread 1 =         = Thread 2 =     = fi->i_xattr_nid =  = on-disk nid =
      
      f2fs_setxattr                                   0                 0
        new_node_page                                 X                 0
                         sync_inode_page              X                 X
                         checkpoint                   X                 X -.
          grab_cache_page                             X                 X  |
      --> allocate a new xattr node block or -ENOSPC      <----------------'
      
      At this moment, the checkpoint stores inconsistent data where the inode has
      i_xattr_nid but the actual xattr node block is not allocated yet.
      
      So, we should assign the real i_xattr_nid only after its xattr node block is
      allocated.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
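The ordering rule stated above (publish i_xattr_nid only after the block allocation has succeeded) can be sketched as below. The allocator and all `demo_*` names are hypothetical; the page locking that the real fix relies on is not modeled.

```c
#include <errno.h>

struct demo_inode {
	unsigned int i_xattr_nid;	/* 0 means "no xattr node" */
};

/* Hypothetical allocator: fails with -ENOSPC when no blocks remain. */
int demo_alloc_xattr_block(unsigned int *free_blocks)
{
	if (*free_blocks == 0)
		return -ENOSPC;
	(*free_blocks)--;
	return 0;
}

/* Allocate first, then record the nid in the inode. On failure the
 * inode still says "no xattr node", so a checkpoint taken at any point
 * never sees a nid without a block behind it. */
int demo_new_xattr_node(struct demo_inode *inode, unsigned int new_nid,
			unsigned int *free_blocks)
{
	int err = demo_alloc_xattr_block(free_blocks);

	if (err)
		return err;

	inode->i_xattr_nid = new_nid;
	return 0;
}
```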
  17. 09 Aug, 2013 2 commits
  18. 11 Jun, 2013 1 commit
    • f2fs: support xattr security labels · 8ae8f162
      Jaegeuk Kim committed
      This patch adds support for security labels to f2fs, which will be used
      by Linux Security Modules (LSMs).
      
      Quote from http://en.wikipedia.org/wiki/Linux_Security_Modules:
      "Linux Security Modules (LSM) is a framework that allows the Linux kernel to
      support a variety of computer security models while avoiding favoritism toward
      any single security implementation. The framework is licensed under the terms of
      the GNU General Public License and is standard part of the Linux kernel since
      Linux 2.6. AppArmor, SELinux, Smack and TOMOYO Linux are the currently accepted
      modules in the official kernel.".
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
  19. 03 Jun, 2013 1 commit
  20. 09 Apr, 2013 1 commit
    • f2fs: introduce a new global lock scheme · 39936837
      Jaegeuk Kim committed
      In the previous version, f2fs used global locks according to the usage types,
      such as directory operations, block allocation, block write, and so on.
      
      See the following lock types in f2fs.h.
      enum lock_type {
      	RENAME,		/* for renaming operations */
      	DENTRY_OPS,	/* for directory operations */
      	DATA_WRITE,	/* for data write */
      	DATA_NEW,	/* for data allocation */
      	DATA_TRUNC,	/* for data truncate */
      	NODE_NEW,	/* for node allocation */
      	NODE_TRUNC,	/* for node truncate */
      	NODE_WRITE,	/* for node write */
      	NR_LOCK_TYPE,
      };
      
      In that case, we lose performance in a multi-threaded environment,
      since every type of operation must be conducted one at a time.
      
      In order to address the problem, let's share the locks globally with a mutex
      array regardless of the operation type.
      So, let users grab a mutex and perform their jobs in parallel as much as
      possible.
      
      For this, I propose a new global lock scheme as follows.
      
      0. Data structure
       - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS]
       - f2fs_sb_info -> node_write
      
      1. mutex_lock_op(sbi)
       - try to get an available lock from the array.
       - returns the index of the acquired lock.
      
      2. mutex_unlock_op(sbi, index of the lock)
       - unlock the given index of the lock.
      
      3. mutex_lock_all(sbi)
       - grab all the locks in the array before the checkpoint.
      
      4. mutex_unlock_all(sbi)
       - release all the locks in the array after checkpoint.
      
      5. block_operations()
       - call mutex_lock_all()
       - sync_dirty_dir_inodes()
       - grab node_write
       - sync_node_pages()
      
      Note that,
       the pairs of mutex_lock_op()/mutex_unlock_op() and
       mutex_lock_all()/mutex_unlock_all() should be used together.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
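The mutex-array scheme listed in items 0 to 4 above can be sketched with POSIX mutexes. This is an illustration under assumptions, not the f2fs code: the array size, the `demo_*` names, and the fallback of blocking on slot 0 when every slot is busy are all choices made for this sketch.

```c
#include <pthread.h>

#define NR_GLOBAL_LOCKS 8	/* assumed size, for illustration only */

struct demo_sb {
	pthread_mutex_t fs_lock[NR_GLOBAL_LOCKS];
};

void demo_init(struct demo_sb *sbi)
{
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		pthread_mutex_init(&sbi->fs_lock[i], NULL);
}

/* Grab any free lock in the array and return its index. If every slot
 * is busy, block on slot 0 (a simplification chosen for this sketch). */
int demo_mutex_lock_op(struct demo_sb *sbi)
{
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++) {
		if (pthread_mutex_trylock(&sbi->fs_lock[i]) == 0)
			return i;
	}
	pthread_mutex_lock(&sbi->fs_lock[0]);
	return 0;
}

void demo_mutex_unlock_op(struct demo_sb *sbi, int idx)
{
	pthread_mutex_unlock(&sbi->fs_lock[idx]);
}

/* Checkpoint side: take every lock, in index order, so that no
 * operation can be in flight while the checkpoint runs. */
void demo_mutex_lock_all(struct demo_sb *sbi)
{
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		pthread_mutex_lock(&sbi->fs_lock[i]);
}

void demo_mutex_unlock_all(struct demo_sb *sbi)
{
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		pthread_mutex_unlock(&sbi->fs_lock[i]);
}
```

Each operation keeps the index it was handed and passes it back on unlock, which is the pairing the note at the end of the commit message asks for.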
  21. 20 Mar, 2013 1 commit
  22. 11 Jan, 2013 1 commit
    • f2fs: add f2fs_balance_fs in several interfaces · 7d82db83
      Jaegeuk Kim committed
      f2fs_balance_fs() checks the number of free sections and decides whether
      cleaning is needed. If there are not enough free sections, the
      cleaning job should be started.
      
      In order to keep enough free sections even under high utilization, f2fs
      should call f2fs_balance_fs in all the VFS interfaces that are able to produce
      dirty pages.
      This patch adds the function calls in the missing interfaces as follows.
      
      1. f2fs_setxattr()
      f2fs_setxattr() produces dirty node pages, so we should call
      f2fs_balance_fs() here too, as is done in other VFS interfaces such as
      f2fs_lookup(), f2fs_mkdir(), and so on.
      
      2. f2fs_sync_file()
      We should guarantee that free sections are available for syncing metadata during
      fsync. Previously, there was no space check before triggering checkpoint and
      sync_node_pages.
      Therefore, if a bunch of fsync calls are triggered under 100% FS utilization,
      f2fs can end up with no free sections, resulting in BUG_ON().
      
      3. f2fs_sync_fs()
      Before calling write_checkpoint(), we should guarantee that there are minimum
      free sections.
      
      4. f2fs_write_inode()
      f2fs_write_inode() is also able to produce dirty node pages.
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
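The "check free sections before producing dirty pages" idea in this commit can be sketched as below. The structure, the low-water mark, and the pretend cleaner that reclaims one section per pass are all assumptions made for illustration; the real function triggers garbage collection instead.

```c
struct demo_sb {
	unsigned int free_sections;
	unsigned int reserved_sections;	/* hypothetical low-water mark */
	unsigned int gc_runs;
};

int demo_has_not_enough_free_secs(const struct demo_sb *sbi)
{
	return sbi->free_sections <= sbi->reserved_sections;
}

/* Stand-in for cleaning: pretend each pass reclaims one section. */
void demo_balance_fs(struct demo_sb *sbi)
{
	while (demo_has_not_enough_free_secs(sbi)) {
		sbi->free_sections++;
		sbi->gc_runs++;
	}
}

/* An interface that dirties pages balances first, then consumes space,
 * so it never starts with the free-section count at or below the mark. */
void demo_setxattr(struct demo_sb *sbi)
{
	demo_balance_fs(sbi);
	sbi->free_sections--;
}
```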
  23. 28 Dec, 2012 1 commit
  24. 11 Dec, 2012 3 commits
    • f2fs: resolve build failures · 573ea5fc
      Jaegeuk Kim committed
      There exist two build failures reported by Randy Dunlap as follows.
      
      (on i386)
       a. (config-r8857)
      	ERROR: "f2fs_xattr_advise_handler" [fs/f2fs/f2fs.ko] undefined!
      
      Key configs in (config-r8857) are as follows.
       CONFIG_F2FS_FS=m
       # CONFIG_F2FS_STAT_FS is not set
       CONFIG_F2FS_FS_XATTR=y
       # CONFIG_F2FS_FS_POSIX_ACL is not set
      
      The error occurred because we put the function in the wrong place.
      Recently we added new functionality for users to indicate cold files
      explicitly through xattr operations (i.e., f2fs_xattr_advise_handler).
      
      This handler should have been added in xattr.c instead of acl.c in order
      to avoid leaving it undefined in a case like this, where XATTR is set and
      ACL is not set.
      
       b. (config-r8855)
      	fs/f2fs/file.c: In function 'f2fs_vm_page_mkwrite':
      	fs/f2fs/file.c:97:2: error: implicit declaration of function
      	'block_page_mkwrite_return'
      
      Key config in (config-r8855) is CONFIG_BLOCK.
      
      Obviously, f2fs works on top of the block device, so we should carefully
      consider this kind of config dependency.
      
      The error occurred because f2fs_vm_page_mkwrite() calls
      block_page_mkwrite_return(), which is enabled only if CONFIG_BLOCK is set.
      Reported-by: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      Acked-by: Randy Dunlap <rdunlap@xenotime.net>
    • f2fs: adjust kernel coding style · 0a8165d7
      Jaegeuk Kim committed
      As pointed out by Randy Dunlap, this patch removes all usage of "/**" for comment
      blocks. Instead, just use "/*".
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
    • f2fs: add xattr and acl functionalities · af48b85b
      Jaegeuk Kim committed
      This implements xattr and acl functionalities.
      
      - F2FS uses a node page to contain extended attributes.
      Signed-off-by: Changman Lee <cm224.lee@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>