1. 08 Jan, 2017 1 commit
  2. 01 Jan, 2017 1 commit
    • fscrypt: use ENOKEY when file cannot be created w/o key · 54475f53
      Eric Biggers committed
      As part of an effort to clean up fscrypt-related error codes, make
      attempting to create a file in an encrypted directory that hasn't been
      "unlocked" fail with ENOKEY.  Previously, several error codes were used
      for this case, including ENOENT, EACCES, and EPERM, and they were not
      consistent between and within filesystems.  ENOKEY is a better choice
      because it expresses that the failure is due to lacking the encryption
      key.  It also matches the error code returned when trying to open an
      encrypted regular file without the key.
      
      I am not aware of any users who might be relying on the previous
      inconsistent error codes, which were never documented anywhere.
      
      This failure case will be exercised by an xfstest.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
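A minimal userspace sketch of how a caller might tell "no encryption key" apart from other creation failures once ENOKEY is returned consistently. The helper names `classify_create_errno` and `try_create` are hypothetical and not part of any kernel or libc API; ENOKEY itself is the Linux errno value named in the commit.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
enum create_result { CREATE_OK, CREATE_NO_KEY, CREATE_OTHER_ERROR };

/* Map an errno value from a failed create to a coarse result. */
static enum create_result classify_create_errno(int err)
{
    if (err == 0)
        return CREATE_OK;
    if (err == ENOKEY)          /* directory is encrypted and not unlocked */
        return CREATE_NO_KEY;
    return CREATE_OTHER_ERROR;  /* ENOENT, EACCES, EPERM, ... */
}

/* Try to create a new file and classify the outcome. */
static enum create_result try_create(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);

    if (fd < 0)
        return classify_create_errno(errno);
    close(fd);
    return CREATE_OK;
}
```

A caller could invoke `try_create()` on a path inside an encrypted directory and prompt for the key only when it gets `CREATE_NO_KEY`.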
  3. 13 Dec, 2016 1 commit
  4. 12 Dec, 2016 2 commits
  5. 09 Dec, 2016 1 commit
  6. 08 Dec, 2016 3 commits
  7. 06 Dec, 2016 2 commits
  8. 30 Nov, 2016 2 commits
  9. 29 Nov, 2016 1 commit
  10. 26 Nov, 2016 17 commits
    • f2fs: fix 32-bit build · 19c52651
      Arnd Bergmann committed
      The addition of multiple-device support broke CONFIG_BLK_DEV_ZONED
      on 32-bit machines because of a 64-bit division:
      
      fs/f2fs/f2fs.o: In function `__issue_discard_async':
      extent_cache.c:(.text.__issue_discard_async+0xd4): undefined reference to `__aeabi_uldivmod'
      
      Fortunately, bdev_zone_size() is guaranteed to return a power-of-two
      number, so we can replace the % operator with a cheaper bit mask.
      
      Fixes: 792b84b74b54 ("f2fs: support multiple devices")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
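The replacement of `%` with a bit mask relies on the divisor being a power of two. A small sketch of that identity in plain C follows; `offset_in_zone` and `is_power_of_two` are hypothetical helper names for illustration, not the functions touched by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if n is a non-zero power of two. */
static bool is_power_of_two(uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Offset of block `blk` inside its zone.
 * Assumes zone_blocks is a power of two, in which case
 * blk % zone_blocks == blk & (zone_blocks - 1) and no
 * 64-bit division is needed.
 */
static uint64_t offset_in_zone(uint64_t blk, uint64_t zone_blocks)
{
    return blk & (zone_blocks - 1);
}
```

The mask form only matches the modulo when the power-of-two precondition holds, so a caller that cannot guarantee it should check `is_power_of_two()` first.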
    • f2fs: set ->owner for debugfs status file's file_operations · 05e6ea26
      Nicolai Stange committed
      The struct file_operations instance serving the f2fs/status debugfs file
      lacks an initialization of its ->owner.
      
      This means that although that file might have been opened, the f2fs module
      can still get removed. Any further operation on that opened file, releasing
      included, will cause accesses to unmapped memory.
      
      Indeed, Mike Marshall reported the following:
      
        BUG: unable to handle kernel paging request at ffffffffa0307430
        IP: [<ffffffff8132a224>] full_proxy_release+0x24/0x90
        <...>
        Call Trace:
         [] __fput+0xdf/0x1d0
         [] ____fput+0xe/0x10
         [] task_work_run+0x8e/0xc0
         [] do_exit+0x2ae/0xae0
         [] ? __audit_syscall_entry+0xae/0x100
         [] ? syscall_trace_enter+0x1ca/0x310
         [] do_group_exit+0x44/0xc0
         [] SyS_exit_group+0x14/0x20
         [] do_syscall_64+0x61/0x150
         [] entry_SYSCALL64_slow_path+0x25/0x25
        <...>
        ---[ end trace f22ae883fa3ea6b8 ]---
        Fixing recursive fault but reboot is needed!
      
      Fix this by initializing the f2fs/status file_operations' ->owner with
      THIS_MODULE.
      
      This will allow debugfs to grab a reference to the f2fs module upon any
      open on that file, thus preventing it from getting removed.
      
      Fixes: 902829aa ("f2fs: move proc files to debugfs")
      Reported-by: Mike Marshall <hubcap@omnibond.com>
      Reported-by: Martin Brandenburg <martin@omnibond.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Nicolai Stange <nicstange@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix incorrect free inode count in ->statfs · b08b12d2
      Chao Yu committed
      When calculating the maximum number of inodes that can still be created
      in the remaining space, we should account for the space already occupied
      by data and node blocks, since data and node blocks are mixed in the main
      area. So fix the wrong calculation in ->statfs.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: drop duplicate header timer.h · b4ceec29
      Geliang Tang committed
      Drop duplicate header timer.h from segment.c.
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix wrong AUTO_RECOVER condition · 97dd26ad
      Jaegeuk Kim committed
      If i_size is not aligned to f2fs's block size, we should not skip the inode
      update during fsync.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
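The rule stated above can be sketched as a small predicate. This is a simplified illustration, not the kernel's code: the 4096-byte block size, the helper names, and modelling the AUTO_RECOVER flag as a plain boolean are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB blocks. */
#define SKETCH_BLOCK_SIZE 4096u

/* True if i_size is a whole number of blocks. */
static bool size_is_block_aligned(uint64_t i_size)
{
    return (i_size & (uint64_t)(SKETCH_BLOCK_SIZE - 1)) == 0;
}

/*
 * Hypothetical decision helper: the inode update may be skipped
 * during fsync only when the auto-recover flag is set AND i_size
 * is block aligned. An unaligned i_size always forces the update.
 */
static bool may_skip_inode_update(bool auto_recover, uint64_t i_size)
{
    return auto_recover && size_is_block_aligned(i_size);
}
```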
    • f2fs: do not recover i_size if it's valid · 3a3a5ead
      Jaegeuk Kim committed
      If i_size is already valid during roll_forward recovery, we should not update
      it according to the block alignment.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix fdatasync · 281518c6
      Chao Yu committed
      In the two cases below, we can't guarantee data consistency:
      
      a)
      1. xfs_io "pwrite 0 4195328" "fsync"
      2. xfs_io "pwrite 4195328 1024" "fdatasync"
      3. godown
      4. umount & mount
      --> isize we updated before fdatasync won't be recovered
      
      b)
      1. xfs_io "pwrite -S 0xcc 0 4202496" "fsync"
      2. xfs_io "fpunch 4194304 4096" "fdatasync"
      3. godown
      4. umount & mount
      --> dnode we punched before fdatasync won't be recovered
      
      The reason is that fdatasync is normally not aware of metadata
      modifications in a file, e.g. isize changes or dnode updates, so in ->fsync
      we skip flushing node pages in the cases above, and the fdatasynced data
      is lost during recovery.
      
      We have now introduced a global DIRTY_META list in sbi for selectively
      tracking dirty inodes, so in fdatasync we can decide whether to flush nodes
      depending on the dirty state of the current inode in that list.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
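For readers without xfs_io at hand, the write pattern of case a) can be approximated in C with `pwrite`, `fsync` and `fdatasync`. This sketch only issues the I/O; the `godown` and remount steps from the commit message are not reproduced, and the helper names `pwrite_zeros` and `issue_case_a_writes` are hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `len` zero bytes at offset `off`, retrying on short writes. */
static int pwrite_zeros(int fd, size_t len, off_t off)
{
    char *buf = calloc(1, len);
    size_t done = 0;

    if (!buf)
        return -1;
    while (done < len) {
        ssize_t n = pwrite(fd, buf + done, len - done, off + (off_t)done);

        if (n < 0) {
            free(buf);
            return -1;
        }
        done += (size_t)n;
    }
    free(buf);
    return 0;
}

/*
 * Case a): write 4195328 bytes and fsync, then extend the file by
 * 1024 bytes and fdatasync. The second step changes i_size, which is
 * the metadata the commit says must survive recovery.
 */
static int issue_case_a_writes(int fd)
{
    if (pwrite_zeros(fd, 4195328, 0) != 0 || fsync(fd) != 0)
        return -1;
    if (pwrite_zeros(fd, 1024, 4195328) != 0 || fdatasync(fd) != 0)
        return -1;
    return 0;
}
```

After a successful run the file size is 4195328 + 1024 = 4196352 bytes, which is the i_size that should still be visible after recovery.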
    • f2fs: fix to account total free nid correctly · 04d47e67
      Chao Yu committed
      Thread A		Thread B		Thread C
      - f2fs_create
       - f2fs_new_inode
        - f2fs_lock_op
         - alloc_nid
          alloc last nid
        - f2fs_unlock_op
      			- f2fs_create
      			 - f2fs_new_inode
      			  - f2fs_lock_op
      			   - alloc_nid
      			    as node count still not
      			    be increased, we will
      			    loop in alloc_nid
      						- f2fs_write_node_pages
      						 - f2fs_balance_fs_bg
      						  - f2fs_sync_fs
      						   - write_checkpoint
      						    - block_operations
      						     - f2fs_lock_all
       - f2fs_lock_op
      
      When creating a new inode, we do not allocate and account the nid
      atomically, so when there are almost no free nids left, we may hit an
      endless loop like the stack above.
      
      To avoid that, reuse nm_i::available_nids for accounting free nids, and
      make nid allocation and counting atomic during node creation.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
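The idea of checking and consuming the free-nid count in one atomic step can be illustrated in userspace with C11 atomics. This is an analogue under stated assumptions, not the kernel's mechanism: the function name `reserve_nid` is hypothetical, and the real patch does its accounting inside f2fs's own data structures.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Reserve one id from a shared counter of available ids.
 * The check for "any left?" and the decrement happen in a single
 * compare-and-swap, so two threads can never both take the last id,
 * and a caller fails immediately instead of looping when none remain.
 */
static bool reserve_nid(atomic_uint *available)
{
    unsigned int cur = atomic_load(available);

    while (cur != 0) {
        /* On failure, cur is reloaded with the current value. */
        if (atomic_compare_exchange_weak(available, &cur, cur - 1))
            return true;
    }
    return false;
}
```

With a counter initialised to 1, the first call to `reserve_nid()` succeeds and every later call returns false until ids are given back.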
    • f2fs: fix an infinite loop when flush nodes in cp · d40a43af
      Yunlei He committed
      Thread A			Thread B
      
      - write_checkpoint
       - block_operations
         -blk_start_plug
          -sync_node_pages		- f2fs_do_sync_file
      				 - fsync_node_pages
      				  - f2fs_wait_on_page_writeback
      
      Thread A waits for the global F2FS_DIRTY_NODES count to drop to zero.
      It starts a plug list, and some requests have been added to this list.
      Thread B locks one dirty node page and waits for this page to be written
      back. But this page is already in thread A's plug list with the
      PG_writeback flag set. Thread A keeps running and its plug list never gets
      a chance to finish, so this looks like a deadlock between the cp and fsync
      paths.
      
      This patch adds a wait on page writeback before setting a node page dirty
      to avoid this problem.
      Signed-off-by: Yunlei He <heyunlei@huawei.com>
      Signed-off-by: Pengyang Hou <houpengyang@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: don't wait writeback for datas during checkpoint · 36951b38
      Chao Yu committed
      Normally, while committing a checkpoint, we wait for all pages to be
      written back, whether a page holds data or metadata. So in a scenario
      where lots of data IO is submitted together with metadata, we may suffer
      long latency waiting for writeback during checkpoint.
      
      In fact, we only care about persistence of pages with metadata, not pages
      with data, since file system consistency depends only on metadata. To
      avoid the long latency in the scenario above, let's recognize and
      reference metadata in submitted IOs, and wait for writeback only on
      metadata.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix wrong written_valid_blocks counting · c79b7ff1
      Jaegeuk Kim committed
      Previously, written_valid_blocks was taken from ckpt->valid_block_count.
      But if the last checkpoint has some NEW_ADDR entries due to a power cut, we
      can get a wrong value. Fix it by taking the number from the actual written
      block count in the sit entries.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid BG_GC in f2fs_balance_fs · 7702bdbe
      Jaegeuk Kim committed
      If many threads hit has_not_enough_free_secs() in f2fs_balance_fs() at the same
      time, all the threads would do FG_GC or BG_GC.
      In this critical path, we don't need to do BG_GC at all.
      Let's avoid that.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix redundant block allocation · c040ff9d
      Jaegeuk Kim committed
      In direct_IO path of f2fs_file_write_iter(),
      1. f2fs_preallocate_blocks(F2FS_GET_BLOCK_PRE_DIO)
         -> allocate LBA X
      2. f2fs_direct_IO()
         -> return 0;
      
      Then,
      f2fs_write_data_page() will allocate another LBA X+1.
      
      This causes EIO to be triggered by HM-SMR.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: use err for f2fs_preallocate_blocks · a7de6086
      Jaegeuk Kim committed
      This patch has no functional change.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: support multiple devices · 3c62be17
      Jaegeuk Kim committed
      This patch implements multiple-device support for f2fs.
      Given multiple devices by mkfs.f2fs, f2fs presents them as one big
      volume under a single f2fs instance.
      
      Internal block management is very simple, but we will modify the block
      allocation and background GC policies to boost IO speed by exploiting the
      devices according to each device's speed.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: allow dio read for LFS mode · e57e9ae5
      Jaegeuk Kim committed
      We can allow dio reads for LFS mode, while doing buffered writes for dio writes.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: revert segment allocation for direct IO · 6ae1be13
      Jaegeuk Kim committed
      Now we don't need to be too careful about storage alignment for dio, since
      its speed has become quite fast and we'd better avoid any misalignment first.
      
      Revert: 38aa0889 (f2fs: align direct_io'ed data to section)
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  11. 24 Nov, 2016 9 commits
    • f2fs: return directly if block has been removed from the victim · 20614711
      Yunlei He committed
      If a block has already been written to a new place, just return
      from the move data process. This patch checks it again while holding
      the page lock.
      Signed-off-by: Yunlei He <heyunlei@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • Revert "f2fs: do not recover from previous remained wrong dnodes" · d47b8715
      Chao Yu committed
      The i_times of an inode are set from the current system time, which can
      be configured through 'date', so it's not safe to judge a dnode block as
      garbage data or an unchanged inode based on i_times.
      
      Now that we use the enhanced 'cp_ver + cp' crc method to verify valid
      dnode blocks, I expect that recovering an invalid dnode is almost
      impossible.
      
      This reverts commit 807b1e1c.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: remove checkpoint in f2fs_freeze · b4b9d34c
      Jaegeuk Kim committed
      The generic freeze_super() calls sync_filesystems() before f2fs_freeze().
      So, basically we don't need to do a checkpoint in f2fs_freeze(). But in xfs/068,
      it triggers the circular locking problem below due to gc_mutex for checkpoint.
      
      ======================================================
      [ INFO: possible circular locking dependency detected ]
      4.9.0-rc1+ #132 Tainted: G           OE
      -------------------------------------------------------
      
      1. wait for __sb_start_write() by
      
       [<ffffffff9845f353>] dump_stack+0x85/0xc2
       [<ffffffff980e80bf>] print_circular_bug+0x1cf/0x230
       [<ffffffff980eb4d0>] __lock_acquire+0x19e0/0x1bc0
       [<ffffffff980ebdcb>] lock_acquire+0x11b/0x220
       [<ffffffffc08c7c3b>] ? f2fs_drop_inode+0x9b/0x160 [f2fs]
       [<ffffffff9826bdd0>] __sb_start_write+0x130/0x200
       [<ffffffffc08c7c3b>] ? f2fs_drop_inode+0x9b/0x160 [f2fs]
       [<ffffffffc08c7c3b>] f2fs_drop_inode+0x9b/0x160 [f2fs]
       [<ffffffff98289991>] iput+0x171/0x2c0
       [<ffffffffc08cfccf>] f2fs_sync_inode_meta+0x3f/0xf0 [f2fs]
       [<ffffffffc08cfe04>] block_operations+0x84/0x110 [f2fs]
       [<ffffffffc08cff78>] write_checkpoint+0xe8/0xf20 [f2fs]
       [<ffffffff980e979d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffffc08c6de9>] ? f2fs_sync_fs+0x79/0x190 [f2fs]
       [<ffffffff9803e9d9>] ? sched_clock+0x9/0x10
       [<ffffffffc08c6de9>] ? f2fs_sync_fs+0x79/0x190 [f2fs]
       [<ffffffffc08c6df5>] f2fs_sync_fs+0x85/0x190 [f2fs]
       [<ffffffff982a4f90>] ? do_fsync+0x70/0x70
       [<ffffffff982a4f90>] ? do_fsync+0x70/0x70
       [<ffffffff982a4fb0>] sync_fs_one_sb+0x20/0x30
       [<ffffffff9826ca3e>] iterate_supers+0xae/0x100
       [<ffffffff982a50b5>] sys_sync+0x55/0x90
       [<ffffffff9890b345>] entry_SYSCALL_64_fastpath+0x23/0xc6
      
      2. wait for sbi->gc_mutex by
      
       [<ffffffff980ebdcb>] lock_acquire+0x11b/0x220
       [<ffffffff989063d6>] mutex_lock_nested+0x76/0x3f0
       [<ffffffffc08c6de9>] f2fs_sync_fs+0x79/0x190 [f2fs]
       [<ffffffffc08c7a6c>] f2fs_freeze+0x1c/0x20 [f2fs]
       [<ffffffff9826b6ef>] freeze_super+0xcf/0x190
       [<ffffffff9827eebc>] do_vfs_ioctl+0x53c/0x6a0
       [<ffffffff9827f099>] SyS_ioctl+0x79/0x90
       [<ffffffff9890b345>] entry_SYSCALL_64_fastpath+0x23/0xc6
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: assign segments correctly for direct_io · bdb7d964
      Jaegeuk Kim committed
      Previously, we assigned CURSEG_WARM_DATA for direct_io, but if we have two or
      four logs, we do not use that type at all.
      Let's fix it.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix wrong i_atime recovery · 9f0552e0
      Chao Yu committed
      We shouldn't update the in-memory i_atime with the on-disk i_mtime of the
      inode when recovering an inode.
      
      Shuoran found this bug, which had been hidden for a long time; the credit
      belongs to him.
      Signed-off-by: Shuoran Liu <liushuoran@huawei.com>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: record inode updating status correctly · 60dcedc9
      Chao Yu committed
      We should record the updating status of an inode only for live inodes. For
      an unlinked inode, we need to clear its ino cache; otherwise, after the ino
      has been reused, it will cause unneeded node page writes during ->fsync.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Trace reset zone events · 126606c7
      Damien Le Moal committed
      Similarly to the regular discard, trace zone reset events.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Reset sequential zones on zoned block devices · f46e8809
      Damien Le Moal committed
      When a zoned block device is mounted, discarding sections
      contained in sequential zones must reset the zone write pointer.
      For sections contained in conventional zones, the regular discard
      is used if the drive supports it.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Cache zoned block devices zone type · 178053e2
      Damien Le Moal committed
      With the zoned block device feature enabled, section discard
      needs to do a zone reset for sections contained in sequential
      zones, and a regular discard (if supported) for sections
      stored in conventional zones. Avoid the need for a costly
      report zones operation to obtain a section's zone type when
      discarding it by caching the types of the device zones in the
      super block information. This cache is initialized at mount time
      for mounts with the zoned block device feature enabled.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
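A rough sketch of the lookup described in the two zoned-device commits: zone types are recorded once in a table, and the discard path consults that table to choose between a zone reset and a regular discard. All type and function names here (`zone_cache`, `pick_discard_action`, and the enums) are hypothetical and do not mirror the kernel's structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum zone_type { ZONE_CONVENTIONAL = 0, ZONE_SEQUENTIAL = 1 };

enum discard_action { ACTION_NONE, ACTION_REGULAR_DISCARD, ACTION_ZONE_RESET };

/* Hypothetical per-device cache, filled once at mount time. */
struct zone_cache {
    const uint8_t *types;      /* one enum zone_type value per zone */
    size_t nr_zones;
    bool discard_supported;    /* does the device support regular discard? */
};

/*
 * Decide how to discard a section living in zone `zone_idx` using
 * only the cached type, without querying the device again.
 */
static enum discard_action
pick_discard_action(const struct zone_cache *zc, size_t zone_idx)
{
    if (zone_idx >= zc->nr_zones)
        return ACTION_NONE;
    if (zc->types[zone_idx] == ZONE_SEQUENTIAL)
        return ACTION_ZONE_RESET;       /* must reset the write pointer */
    return zc->discard_supported ? ACTION_REGULAR_DISCARD : ACTION_NONE;
}
```

The table costs one byte per zone in this sketch; a bitmap would be more compact if only two zone types need to be distinguished.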