1. 30 Nov, 2016 1 commit
    • J
      f2fs: do not activate auto_recovery for fallocated i_size · 26787236
      Jaegeuk Kim committed
      If a file needs to keep its i_size by fallocate, we need to turn off auto
      recovery during roll-forward recovery.
      
      This will resolve the below scenario.
      
      1. xfs_io -f /mnt/f2fs/file -c "pwrite 0 4096" -c "fsync"
      2. xfs_io -f /mnt/f2fs/file -c "falloc -k 4096 4096" -c "fsync"
      3. md5sum /mnt/f2fs/file;
      4. godown /mnt/f2fs/
      5. umount /mnt/f2fs/
      6. mount -t f2fs /dev/sdx /mnt/f2fs
      7. md5sum /mnt/f2fs/file
      Reported-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      26787236
  2. 29 Nov, 2016 1 commit
  3. 26 Nov, 2016 17 commits
    • A
      f2fs: fix 32-bit build · 19c52651
      Arnd Bergmann committed
      The addition of multiple-device support broke CONFIG_BLK_DEV_ZONED
      on 32-bit machines because of a 64-bit division:
      
      fs/f2fs/f2fs.o: In function `__issue_discard_async':
      extent_cache.c:(.text.__issue_discard_async+0xd4): undefined reference to `__aeabi_uldivmod'
      
      Fortunately, bdev_zone_size() is guaranteed to return a power-of-two
      number, so we can replace the % operator with a cheaper bit mask.
      
      Fixes: 792b84b74b54 ("f2fs: support multiple devices")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      19c52651
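The mask trick described in the commit above can be sketched in user-space C. This is a minimal illustration, not the kernel patch itself: `zone_offset` is a hypothetical helper, and it assumes `zone_size` is a non-zero power of two, which the commit message says bdev_zone_size() guarantees.

```c
#include <stdint.h>

/* Hypothetical helper: offset of blk within its zone.
 * Assumes zone_size is a non-zero power of two, so
 * blk % zone_size equals blk & (zone_size - 1) and no
 * 64-bit division is needed. */
static inline uint64_t zone_offset(uint64_t blk, uint64_t zone_size)
{
	return blk & (zone_size - 1);
}
```

On 32-bit ARM, a 64-bit `%` compiles to a call to a helper such as `__aeabi_uldivmod`, which the kernel does not provide; the mask form avoids that call entirely.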
    • N
      f2fs: set ->owner for debugfs status file's file_operations · 05e6ea26
      Nicolai Stange committed
      The struct file_operations instance serving the f2fs/status debugfs file
      lacks an initialization of its ->owner.
      
      This means that although that file might have been opened, the f2fs module
      can still get removed. Any further operation on that opened file, releasing
      included, will cause accesses to unmapped memory.
      
      Indeed, Mike Marshall reported the following:
      
        BUG: unable to handle kernel paging request at ffffffffa0307430
        IP: [<ffffffff8132a224>] full_proxy_release+0x24/0x90
        <...>
        Call Trace:
         [] __fput+0xdf/0x1d0
         [] ____fput+0xe/0x10
         [] task_work_run+0x8e/0xc0
         [] do_exit+0x2ae/0xae0
         [] ? __audit_syscall_entry+0xae/0x100
         [] ? syscall_trace_enter+0x1ca/0x310
         [] do_group_exit+0x44/0xc0
         [] SyS_exit_group+0x14/0x20
         [] do_syscall_64+0x61/0x150
         [] entry_SYSCALL64_slow_path+0x25/0x25
        <...>
        ---[ end trace f22ae883fa3ea6b8 ]---
        Fixing recursive fault but reboot is needed!
      
      Fix this by initializing the f2fs/status file_operations' ->owner with
      THIS_MODULE.
      
      This will allow debugfs to grab a reference to the f2fs module upon any
      open on that file, thus preventing it from getting removed.
      
      Fixes: 902829aa ("f2fs: move proc files to debugfs")
      Reported-by: Mike Marshall <hubcap@omnibond.com>
      Reported-by: Martin Brandenburg <martin@omnibond.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Nicolai Stange <nicstange@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      05e6ea26
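The effect of setting ->owner can be modelled with a toy user-space sketch. Everything here (`toy_module`, `toy_fops`, `toy_open`, `toy_release`, `toy_unload`) is hypothetical and only mimics the reference-counting idea from the commit message; it is not kernel or debugfs code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; all names are hypothetical. */
struct toy_module { int refcount; bool loaded; };
struct toy_fops { struct toy_module *owner; };

/* Opening the file pins the owning module, if an owner is set. */
static bool toy_open(const struct toy_fops *fops)
{
	if (fops->owner) {
		if (!fops->owner->loaded)
			return false;
		fops->owner->refcount++;
	}
	return true;
}

/* Releasing the file drops the reference taken in toy_open(). */
static void toy_release(const struct toy_fops *fops)
{
	if (fops->owner)
		fops->owner->refcount--;
}

/* Unloading succeeds only when nothing holds a reference. */
static bool toy_unload(struct toy_module *mod)
{
	if (mod->refcount > 0)
		return false;
	mod->loaded = false;
	return true;
}
```

With `owner` left NULL, `toy_unload()` succeeds while the file is still open, which corresponds to the crash scenario; with `owner` set, unloading is refused until the file is released.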
    • C
      f2fs: fix incorrect free inode count in ->statfs · b08b12d2
      Chao Yu committed
      When calculating the maximum number of inodes that can still be created in
      the remaining space, we should account for the space occupied by data and
      node blocks, since data and node blocks are mixed in the main area. So fix
      the wrong calculation in ->statfs.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      b08b12d2
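One way to read the fix above: a new inode needs both a free node ID and at least one free block, so the reported free inode count is bounded by whichever is smaller. The sketch below is an assumption about the shape of the calculation, not the patch itself; `free_inode_estimate` and its parameters are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical helper: upper bound on inodes that can still be created.
 * free_nids:    node IDs not yet in use
 * avail_blocks: blocks still available in the main area, which data
 *               and node blocks share */
static uint64_t free_inode_estimate(uint64_t free_nids, uint64_t avail_blocks)
{
	return free_nids < avail_blocks ? free_nids : avail_blocks;
}
```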
    • G
      f2fs: drop duplicate header timer.h · b4ceec29
      Geliang Tang committed
      Drop duplicate header timer.h from segment.c.
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      b4ceec29
    • J
      f2fs: fix wrong AUTO_RECOVER condition · 97dd26ad
      Jaegeuk Kim committed
      If i_size is not aligned to f2fs's block size, we should not skip the inode
      update during fsync.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      97dd26ad
    • J
      f2fs: do not recover i_size if it's valid · 3a3a5ead
      Jaegeuk Kim committed
      If i_size is already valid during roll-forward recovery, we should not update
      it according to the block alignment.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      3a3a5ead
    • C
      f2fs: fix fdatasync · 281518c6
      Chao Yu committed
      In the two cases below, we can't guarantee data consistency:
      
      a)
      1. xfs_io "pwrite 0 4195328" "fsync"
      2. xfs_io "pwrite 4195328 1024" "fdatasync"
      3. godown
      4. umount & mount
      --> isize we updated before fdatasync won't be recovered
      
      b)
      1. xfs_io "pwrite -S 0xcc 0 4202496" "fsync"
      2. xfs_io "fpunch 4194304 4096" "fdatasync"
      3. godown
      4. umount & mount
      --> dnode we punched before fdatasync won't be recovered
      
      The reason is that fdatasync is normally unaware of metadata modifications
      to a file, e.g. isize changes or dnode updates, so in ->fsync we skip
      flushing node pages for the above cases, and the fdatasynced changes are
      lost during recovery.
      
      We have introduced the DIRTY_META global list in sbi for tracking dirty
      inodes selectively, so in fdatasync we can choose to flush nodes depending
      on the dirty state of the current inode in the list.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      281518c6
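The decision described in the commit above can be sketched as a small predicate. This is a simplified model under stated assumptions: `need_node_flush` and its parameters are hypothetical, and the model treats fsync as always flushing node pages, which ignores other shortcuts the real code may take.

```c
#include <stdbool.h>

/* Hypothetical decision helper.
 * datasync:         true for fdatasync(), false for fsync()
 * inode_meta_dirty: true if the inode is tracked as having dirty metadata */
static bool need_node_flush(bool datasync, bool inode_meta_dirty)
{
	if (!datasync)
		return true;         /* simplified model: fsync always flushes */
	return inode_meta_dirty;     /* fdatasync flushes only if metadata changed */
}
```

In both failing cases from the commit message (an isize change and a punched dnode), the inode's metadata is dirty, so this predicate would request a node flush even under fdatasync.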
    • C
      f2fs: fix to account total free nid correctly · 04d47e67
      Chao Yu committed
      Thread A		Thread B		Thread C
      - f2fs_create
       - f2fs_new_inode
        - f2fs_lock_op
         - alloc_nid
          alloc last nid
        - f2fs_unlock_op
      			- f2fs_create
      			 - f2fs_new_inode
      			  - f2fs_lock_op
      			   - alloc_nid
      			    as node count still not
      			    be increased, we will
      			    loop in alloc_nid
      						- f2fs_write_node_pages
      						 - f2fs_balance_fs_bg
      						  - f2fs_sync_fs
      						   - write_checkpoint
      						    - block_operations
      						     - f2fs_lock_all
       - f2fs_lock_op
      
      While creating a new inode, we do not allocate and account nids atomically,
      so when there are almost no free nids left, we may hit an endless loop as
      in the stack above.
      
      To avoid that, reuse nm_i::available_nids for accounting free nids and make
      nid allocation and counting atomic during node creation.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      04d47e67
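The idea of making allocation and accounting one atomic step can be sketched with C11 atomics. This is a swapped-in technique for illustration only; the actual patch may use a lock instead. The file-scope counter `available_nids` merely borrows the name from the commit message, and `reserve_nid` is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter of free node IDs, standing in for
 * nm_i::available_nids. */
static atomic_uint available_nids;

/* Reserve one nid: the check and the decrement happen as a single
 * atomic step, so two creators can never both claim the last nid. */
static bool reserve_nid(void)
{
	unsigned int cur = atomic_load(&available_nids);

	while (cur > 0) {
		if (atomic_compare_exchange_weak(&available_nids, &cur, cur - 1))
			return true;
		/* The failed exchange reloaded cur; retry with the new value. */
	}
	return false; /* no free nid left: fail instead of looping forever */
}
```

A caller that gets `false` can return an error such as ENOSPC right away rather than spinning while a checkpoint waits for the lock it holds.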
    • Y
      f2fs: fix an infinite loop when flush nodes in cp · d40a43af
      Yunlei He committed
      Thread A			Thread B
      
      - write_checkpoint
       - block_operations
         -blk_start_plug
          -sync_node_pages		- f2fs_do_sync_file
      				 - fsync_node_pages
      				  - f2fs_wait_on_page_writeback
      
      Thread A waits for the global F2FS_DIRTY_NODES count to drop to zero.
      It starts a plug list, and some requests have been added to this list.
      Thread B locks one dirty node page and waits for that page to be written back.
      But this page is already in thread A's plug list with the PG_writeback flag set.
      Thread A keeps running and its plug list never gets a chance to finish,
      so this looks like a deadlock between the cp and fsync paths.
      
      This patch adds a wait on page writeback before setting a node page dirty
      to avoid this problem.
      Signed-off-by: Yunlei He <heyunlei@huawei.com>
      Signed-off-by: Pengyang Hou <houpengyang@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      d40a43af
    • C
      f2fs: don't wait writeback for datas during checkpoint · 36951b38
      Chao Yu committed
      Normally, while committing a checkpoint, we wait for all pages to be
      written back, whether they hold data or metadata. So when lots of data IO
      is submitted along with metadata, we may suffer long latency waiting for
      writeback during checkpoint.
      
      In fact, we only care about the persistence of metadata pages, not data
      pages, since file system consistency depends only on metadata. To avoid
      the long latency in the above scenario, let's recognize and reference
      metadata in submitted IOs, and wait for writeback of metadata only.
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      36951b38
    • J
      f2fs: fix wrong written_valid_blocks counting · c79b7ff1
      Jaegeuk Kim committed
      Previously, written_valid_blocks was taken from ckpt->valid_block_count. But if
      the last checkpoint has some NEW_ADDR entries due to a power cut, we can get a
      wrong value. Fix it by taking the number from the actual written block count in
      the sit entries.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c79b7ff1
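Counting written blocks from per-segment records, as the commit above describes, amounts to a sum over the SIT entries. The sketch below uses a hypothetical, heavily simplified entry type (`toy_sit_entry`) and helper name; the real f2fs structures carry more fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified SIT entry: valid blocks recorded per segment. */
struct toy_sit_entry { uint16_t valid_blocks; };

/* Sum the valid block counts of all segments. */
static uint64_t count_written_valid_blocks(const struct toy_sit_entry *sit,
					   size_t nsegs)
{
	uint64_t total = 0;

	for (size_t i = 0; i < nsegs; i++)
		total += sit[i].valid_blocks;
	return total;
}
```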
    • J
      f2fs: avoid BG_GC in f2fs_balance_fs · 7702bdbe
      Jaegeuk Kim committed
      If many threads hit has_not_enough_free_secs() in f2fs_balance_fs() at the same
      time, all the threads would do FG_GC or BG_GC.
      In this critical path, we don't need to do BG_GC at all.
      Let's avoid that.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      7702bdbe
    • J
      f2fs: fix redundant block allocation · c040ff9d
      Jaegeuk Kim committed
      In direct_IO path of f2fs_file_write_iter(),
      1. f2fs_preallocate_blocks(F2FS_GET_BLOCK_PRE_DIO)
         -> allocate LBA X
      2. f2fs_direct_IO()
         -> return 0;
      
      Then,
      f2fs_write_data_page() will allocate another LBA X+1.
      
      This triggers EIO on HM-SMR devices.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c040ff9d
    • J
      f2fs: use err for f2fs_preallocate_blocks · a7de6086
      Jaegeuk Kim committed
      This patch has no functional change.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      a7de6086
    • J
      f2fs: support multiple devices · 3c62be17
      Jaegeuk Kim committed
      This patch implements multiple-device support for f2fs.
      Given multiple devices by mkfs.f2fs, f2fs presents them as one big
      volume under one f2fs instance.
      
      Internal block management is very simple, but we will modify block allocation
      and background GC policy to boost IO speed by exploiting them according to
      each device's speed.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      3c62be17
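Presenting several devices as one volume implies mapping a volume-wide block address to the device that holds it. A minimal sketch of such a lookup follows, assuming each device covers one contiguous, inclusive block range; `toy_dev` and `find_device` are hypothetical names and not the f2fs implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device range of block addresses, inclusive. */
struct toy_dev { uint64_t start_blk; uint64_t end_blk; };

/* Return the index of the device holding blk, or -1 if none does. */
static int find_device(const struct toy_dev *devs, size_t ndevs, uint64_t blk)
{
	for (size_t i = 0; i < ndevs; i++)
		if (blk >= devs[i].start_blk && blk <= devs[i].end_blk)
			return (int)i;
	return -1;
}
```

A linear scan is enough for a sketch because the number of devices in one volume is expected to be small.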
    • J
      f2fs: allow dio read for LFS mode · e57e9ae5
      Jaegeuk Kim committed
      We can allow dio reads for LFS mode, while doing buffered writes for dio writes.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      e57e9ae5
    • J
      f2fs: revert segment allocation for direct IO · 6ae1be13
      Jaegeuk Kim committed
      We no longer need to be overly careful about storage alignment for dio, since
      its speed has become quite fast, and we'd better avoid any misalignment first.
      
      Revert: 38aa0889 (f2fs: align direct_io'ed data to section)
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      6ae1be13
  4. 24 Nov, 2016 21 commits