1. 23 12月, 2013 3 次提交
    • J
      f2fs: use sbi->write_mutex for write bios · 971767ca
      Jaegeuk Kim 提交于
      This patch removes an unnecessary semaphore (i.e., sbi->bio_sem).
      There is no reason to use the semaphore when f2fs submits read and write IOs.
      Instead, let's use a write mutex and cover the sbi->bio[] by the lock.
      
      Change log from v1:
       o split write_mutex suggested by Chao Yu
      
      Chao described,
      "All DATA/NODE/META bio buffers in superblock is protected by
      'sbi->write_mutex', but each bio buffer area is independent, So we
      should split write_mutex to three for DATA/NODE/META."
      Signed-off-by: NChao Yu <chao2.yu@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      971767ca
    • J
      f2fs: add a sysfs entry to control max_discards · 7ac8c3b0
      Jaegeuk Kim 提交于
      If frequent small discards are issued to the device, the performance would
      be degraded significantly.
      So, this patch adds a sysfs entry to control the number of discards to be
      issued during a checkpoint procedure.
      
      By default, f2fs does not issue any small discards, which means max_discards
      is zero.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      7ac8c3b0
    • J
      f2fs: add a slab cache entry for small discards · 7fd9e544
      Jaegeuk Kim 提交于
      This patch adds a slab cache entry for small discards.
      
      Each entry consists of:
      
      struct discard_entry {
      	struct list_head list;	/* list head */
      	block_t blkaddr;	/* block address to be discarded */
      	int len;		/* # of consecutive blocks of the discard */
      };
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      7fd9e544
  2. 08 11月, 2013 2 次提交
  3. 25 10月, 2013 2 次提交
  4. 18 10月, 2013 2 次提交
  5. 08 10月, 2013 1 次提交
  6. 07 10月, 2013 2 次提交
    • K
      f2fs: handle remount options correctly · 4058c511
      Kelly Anderson 提交于
      The current f2fs code errors if the xattr or acl options are passed when
      remounting.  This is important in a typical scenario where f2fs is mounted
      as a "ro" root file-system by the boot loader and then the init process wants
      to remount it "rw" with the "remount,rw" option.
      Signed-off-by: NKelly Anderson <kelly@xilka.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      4058c511
    • G
      f2fs: use rw_sem instead of fs_lock(locks mutex) · e479556b
      Gu Zheng 提交于
      The fs_locks is used to block other ops(ex, recovery) when doing checkpoint.
      And each other operate routine(besides checkpoint) needs to acquire a fs_lock,
      there is a terrible problem here, if these are too many concurrency threads acquiring
      fs_lock, so that they will block each other and may lead to some performance problem,
      but this is not the phenomenon we want to see.
      Though there are some optimization patches introduced to enhance the usage of fs_lock,
      but the thorough solution is using a *rw_sem* to replace the fs_lock.
      Checkpoint routine takes write_sem, and other ops take read_sem, so that we can block
      other ops(ex, recovery) when doing checkpoint, and other ops will not disturb each other,
      this can avoid the problem described above completely.
      Because of the weakness of rw_sem, the above change may introduce a potential problem
      that the checkpoint thread might get starved if other threads are intensively locking
      the read semaphore for I/O.(Pointed out by Xu Jin)
      In order to avoid this, a wait_list is introduced, the appending read semaphore ops
      will be dropped into the wait_list if checkpoint thread is waiting for write semaphore,
      and will be waked up when checkpoint thread gives up write semaphore.
      Thanks to Kim's previous review and test, and will be very glad to see other guys'
      performance tests about this patch.
      
      V2:
        -fix the potential starvation problem.
        -use more suitable func name suggested by Xu Jin.
      Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
      [Jaegeuk Kim: adjust minor coding standard]
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      e479556b
  7. 26 8月, 2013 3 次提交
  8. 20 8月, 2013 1 次提交
  9. 12 8月, 2013 1 次提交
  10. 08 8月, 2013 1 次提交
    • J
      f2fs: fix a build failure due to missing the kobject header · c2d715d1
      Jaegeuk Kim 提交于
      This patch should resolve the following error reported by kbuild test robot.
      
      All error/warnings:
      
         In file included from fs/f2fs/dir.c:13:0:
         >> fs/f2fs/f2fs.h:435:17: error: field 's_kobj' has incomplete type
              struct kobject s_kobj;
      
      The failure was caused by missing the kobject header file in dir.c.
      So, this patch move the header file to the right location, f2fs.h.
      
      CC: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      c2d715d1
  11. 06 8月, 2013 2 次提交
  12. 30 7月, 2013 1 次提交
  13. 17 6月, 2013 1 次提交
    • N
      f2fs: add remount_fs callback support · 696c018c
      Namjae Jeon 提交于
      Add the f2fs_remount function call which will be used
      during the filesystem remounting. This function
      will help us to change the mount options specific to
      f2fs.
      
      Also modify the f2fs background_gc mount option, which
      will allow the user to dynamically trun on/off the
      garbage collection in f2fs based on the background_gc
      value. If background_gc=on, Garbage collection will
      be turned off & if background_gc=off, Garbage collection
      will be truned on.
      
      By default the garbage collection is on in f2fs.
      
      Change Log:
      v2: Incorporated the review comments by Gu Zheng.
          Removing the restore part for VFS flags
          Updating comments with proper flag conditions
          Display GC background option as ON/OFF
          Revised conditions to stop GC in case of remount
      
      v1: Initial changes for adding remount_fs callback
      support.
      
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NPankaj Kumar <pankaj.km@samsung.com>
      Reviewed-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
      [Jaegeuk Kim: change /** with /* for the coding style]
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      696c018c
  14. 14 6月, 2013 1 次提交
  15. 11 6月, 2013 1 次提交
  16. 28 5月, 2013 1 次提交
  17. 08 5月, 2013 2 次提交
    • C
      f2fs: continue to mount after failing recovery · bde582b2
      Chris Fries 提交于
      When unable to roll forward the journal, we shouldn't bail out and
      not mount, we should continue to attempt the mount.  Bad recovery data
      is likely unrecoverable at this point, and requiring the user to try
      to mount again doesn't solve any issues.
      Signed-off-by: NChris Fries <C.Fries@motorola.com>
      Reviewed-by: NRussell Knize <rknize2@motorola.com>
      Reviewed-by: NJason Hrycay <jason.hrycay@motorola.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      bde582b2
    • J
      f2fs: avoid deadlock during evict after f2fs_gc · 531ad7d5
      Jaegeuk Kim 提交于
      o Deadlock case #1
      
      Thread 1:
      - writeback_sb_inodes
       - do_writepages
        - f2fs_write_data_pages
         - write_cache_pages
          - f2fs_write_data_page
           - f2fs_balance_fs
            - wait mutex_lock(gc_mutex)
      
      Thread 2:
      - f2fs_balance_fs
       - mutex_lock(gc_mutex)
       - f2fs_gc
        - f2fs_iget
         - wait iget_locked(inode->i_lock)
      
      Thread 3:
      - do_unlinkat
       - iput
        - lock(inode->i_lock)
         - evict
          - inode_wait_for_writeback
      
      o Deadlock case #2
      
      Thread 1:
      - __writeback_single_inode
       : set I_SYNC
        - do_writepages
         - f2fs_write_data_page
          - f2fs_balance_fs
           - f2fs_gc
            - iput
             - evict
              - inode_wait_for_writeback(I_SYNC)
      
      In order to avoid this, even though iput is called with the zero-reference
      count, we need to stop the eviction procedure if the inode is on writeback.
      So this patch links f2fs_drop_inode which checks the I_SYNC flag.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      531ad7d5
  18. 30 4月, 2013 1 次提交
  19. 23 4月, 2013 1 次提交
  20. 22 4月, 2013 1 次提交
  21. 09 4月, 2013 1 次提交
    • J
      f2fs: introduce a new global lock scheme · 39936837
      Jaegeuk Kim 提交于
      In the previous version, f2fs uses global locks according to the usage types,
      such as directory operations, block allocation, block write, and so on.
      
      Reference the following lock types in f2fs.h.
      enum lock_type {
      	RENAME,		/* for renaming operations */
      	DENTRY_OPS,	/* for directory operations */
      	DATA_WRITE,	/* for data write */
      	DATA_NEW,	/* for data allocation */
      	DATA_TRUNC,	/* for data truncate */
      	NODE_NEW,	/* for node allocation */
      	NODE_TRUNC,	/* for node truncate */
      	NODE_WRITE,	/* for node write */
      	NR_LOCK_TYPE,
      };
      
      In that case, we lose the performance under the multi-threading environment,
      since every types of operations must be conducted one at a time.
      
      In order to address the problem, let's share the locks globally with a mutex
      array regardless of any types.
      So, let users grab a mutex and perform their jobs in parallel as much as
      possbile.
      
      For this, I propose a new global lock scheme as follows.
      
      0. Data structure
       - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS]
       - f2fs_sb_info -> node_write
      
      1. mutex_lock_op(sbi)
       - try to get an avaiable lock from the array.
       - returns the index of the gottern lock variable.
      
      2. mutex_unlock_op(sbi, index of the lock)
       - unlock the given index of the lock.
      
      3. mutex_lock_all(sbi)
       - grab all the locks in the array before the checkpoint.
      
      4. mutex_unlock_all(sbi)
       - release all the locks in the array after checkpoint.
      
      5. block_operations()
       - call mutex_lock_all()
       - sync_dirty_dir_inodes()
       - grab node_write
       - sync_node_pages()
      
      Note that,
       the pairs of mutex_lock_op()/mutex_unlock_op() and
       mutex_lock_all()/mutex_unlock_all() should be used together.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      39936837
  22. 03 4月, 2013 2 次提交
    • J
      f2fs: avoid race for summary information · b7473754
      Jaegeuk Kim 提交于
      In order to do GC more reliably, I'd like to lock the vicitm summary page
      until its GC is completed, and also prevent any checkpoint process.
      Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      b7473754
    • J
      f2fs: change GC bitmaps to apply the section granularity · 5ec4e49f
      Jaegeuk Kim 提交于
      This patch removes a bitmap for victim segments selected by foreground GC, and
      modifies the other bitmap for victim segments selected by background GC.
      
      1) foreground GC bitmap
       : We don't need to manage this, since we just only one previous victim section
         number instead of the whole victim history.
         The f2fs uses the victim section number in order not to allocate currently
         GC'ed section to current active logs.
      
      2) background GC bitmap
       : This bitmap is used to avoid selecting victims repeatedly by background GCs.
         In addition, the victims are able to be selected by foreground GCs, since
         there is no need to read victim blocks during foreground GCs.
      
         By the fact that the foreground GC reclaims segments in a section unit, it'd
         be better to manage this bitmap based on the section granularity.
      Reviewed-by: NNamjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      5ec4e49f
  23. 27 3月, 2013 1 次提交
  24. 20 3月, 2013 3 次提交
  25. 19 3月, 2013 1 次提交
  26. 18 3月, 2013 1 次提交
  27. 04 3月, 2013 1 次提交
    • E
      fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Eric W. Biederman 提交于
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems trivially
      making things safer with no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives.  Allowing simple, safe,
      well understood work-arounds to known problematic software.
      
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as it's module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases.  The most significant being autofs that lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regards to the users permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep.
      Which means there is not much attack surface exposed by loading a
      filesytem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Reported-by: NKees Cook <keescook@google.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      7f78e035