1. 12 2月, 2019 17 次提交
  2. 07 2月, 2019 2 次提交
    • T
      nfsd: Fix error return values for nfsd4_clone_file_range() · e3fdc89c
      Trond Myklebust 提交于
      If the parameter 'count' is non-zero, nfsd4_clone_file_range() will
      currently clobber all errors returned by vfs_clone_file_range() and
      replace them with EINVAL.
      
      Fixes: 42ec3d4c ("vfs: make remap_file_range functions take and...")
      Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
      Cc: stable@vger.kernel.org # v4.20+
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      e3fdc89c
    • T
      fs: ratelimit __find_get_block_slow() failure message. · 43636c80
      Tetsuo Handa 提交于
      When something let __find_get_block_slow() hit all_mapped path, it calls
      printk() for 100+ times per a second. But there is no need to print same
      message with such high frequency; it is just asking for stall warning, or
      at least bloating log files.
      
        [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
        [  399.873324][T15342] b_state=0x00000029, b_size=512
        [  399.878403][T15342] device loop0 blocksize: 4096
        [  399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
        [  399.890400][T15342] b_state=0x00000029, b_size=512
        [  399.895595][T15342] device loop0 blocksize: 4096
        [  399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8
        [  399.907471][T15342] b_state=0x00000029, b_size=512
        [  399.912506][T15342] device loop0 blocksize: 4096
      
      This patch reduces frequency to up to once per a second, in addition to
      concatenating three lines into one.
      
        [  399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096
      Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      43636c80
  3. 06 2月, 2019 1 次提交
  4. 04 2月, 2019 3 次提交
    • D
      xfs: set buffer ops when repair probes for btree type · add46b3b
      Darrick J. Wong 提交于
      In xrep_findroot_block, we work out the btree type and correctness of a
      given block by calling different btree verifiers on root block
      candidates.  However, we leave the NULL b_ops while ->verify_read
      validates the block, which means that if the verifier calls
      xfs_buf_verifier_error it'll crash on the null b_ops.  Fix it to set
      b_ops before calling the verifier and unsetting it if the verifier
      fails.
      
      Furthermore, improve the documentation around xfs_buf_ensure_ops, which
      is the function that is responsible for cleaning up the b_ops state of
      buffers that go through xrep_findroot_block but don't match anything.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      add46b3b
    • B
      xfs: end sync buffer I/O properly on shutdown error · 465fa17f
      Brian Foster 提交于
      As of commit e339dd8d ("xfs: use sync buffer I/O for sync delwri
      queue submission"), the delwri submission code uses sync buffer I/O
      for sync delwri I/O. Instead of waiting on async I/O to unlock the
      buffer, it uses the underlying sync I/O completion mechanism.
      
      If delwri buffer submission fails due to a shutdown scenario, an
      error is set on the buffer and buffer completion never occurs. This
      can cause xfs_buf_delwri_submit() to deadlock waiting on a
      completion event.
      
      We could check the error state before waiting on such buffers, but
      that doesn't serialize against the case of an error set via a racing
      I/O completion. Instead, invoke I/O completion in the shutdown case
      regardless of buffer I/O type.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      465fa17f
    • B
      xfs: eof trim writeback mapping as soon as it is cached · aa6ee4ab
      Brian Foster 提交于
      The cached writeback mapping is EOF trimmed to try and avoid races
      between post-eof block management and writeback that result in
      sending cached data to a stale location. The cached mapping is
      currently trimmed on the validation check, which leaves a race
      window between the time the mapping is cached and when it is trimmed
      against the current inode size.
      
      For example, if a new mapping is cached by delalloc conversion on a
      blocksize == page size fs, we could cycle various locks, perform
      memory allocations, etc.  in the writeback codepath before the
      associated mapping is eventually trimmed to i_size. This leaves
      enough time for a post-eof truncate and file append before the
      cached mapping is trimmed. The former event essentially invalidates
      a range of the cached mapping and the latter bumps the inode size
      such the trim on the next writepage event won't trim all of the
      invalid blocks. fstest generic/464 reproduces this scenario
      occasionally and causes a lost writeback and stale delalloc blocks
      warning on inode inactivation.
      
      To work around this problem, trim the cached writeback mapping as
      soon as it is cached in addition to on subsequent validation checks.
      This is a minor tweak to tighten the race window as much as possible
      until a proper invalidation mechanism is available.
      
      Fixes: 40214d12 ("xfs: trim writepage mapping to within eof")
      Cc: <stable@vger.kernel.org> # v4.14+
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      aa6ee4ab
  5. 02 2月, 2019 4 次提交
  6. 01 2月, 2019 1 次提交
  7. 31 1月, 2019 7 次提交
    • S
      cifs: update internal module version number · b9b9378b
      Steve French 提交于
      To 2.17
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      b9b9378b
    • A
      CIFS: fix use-after-free of the lease keys · d339adc1
      Aurelien Aptel 提交于
      The request buffers are freed right before copying the pointers.
      Use the func args instead which are identical and still valid.
      
      Simple reproducer (requires KASAN enabled) on a cifs mount:
      
      echo foo > foo ; tail -f foo & rm foo
      
      Cc: <stable@vger.kernel.org> # 4.20
      Fixes: 179e44d4 ("smb3: add tracepoint for sending lease break responses to server")
      Signed-off-by: NAurelien Aptel <aaptel@suse.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      Reviewed-by: NPaulo Alcantara <palcantara@suse.de>
      d339adc1
    • W
      fs/dcache: Track & report number of negative dentries · af0c9af1
      Waiman Long 提交于
      The current dentry number tracking code doesn't distinguish between
      positive & negative dentries.  It just reports the total number of
      dentries in the LRU lists.
      
      As excessive number of negative dentries can have an impact on system
      performance, it will be wise to track the number of positive and
      negative dentries separately.
      
      This patch adds tracking for the total number of negative dentries in
      the system LRU lists and reports it in the 5th field in the
      /proc/sys/fs/dentry-state file.  The number, however, does not include
      negative dentries that are in flight but not in the LRU yet as well as
      those in the shrinker lists which are on the way out anyway.
      
      The number of positive dentries in the LRU lists can be roughly found by
      subtracting the number of negative dentries from the unused count.
      
      Matthew Wilcox had confirmed that since the introduction of the
      dentry_stat structure in 2.1.60, the dummy array was there, probably for
      future extension.  They were not replacements of pre-existing fields.
      So no sane applications that read the value of /proc/sys/fs/dentry-state
      will do dummy thing if the last 2 fields of the sysctl parameter are not
      zero.  IOW, it will be safe to use one of the dummy array entry for
      negative dentry count.
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      af0c9af1
    • W
      fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb() · 1dbd449c
      Waiman Long 提交于
      The nr_dentry_unused per-cpu counter tracks dentries in both the LRU
      lists and the shrink lists where the DCACHE_LRU_LIST bit is set.
      
      The shrink_dcache_sb() function moves dentries from the LRU list to a
      shrink list and subtracts the dentry count from nr_dentry_unused.  This
      is incorrect as the nr_dentry_unused count will also be decremented in
      shrink_dentry_list() via d_shrink_del().
      
      To fix this double decrement, the decrement in the shrink_dcache_sb()
      function is taken out.
      
      Fixes: 4e717f5c ("list_lru: remove special case function list_lru_dispose_all."
      Cc: stable@kernel.org
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1dbd449c
    • E
      btrfs: On error always free subvol_name in btrfs_mount · 532b618b
      Eric W. Biederman 提交于
      The subvol_name is allocated in btrfs_parse_subvol_options and is
      consumed and freed in mount_subvol.  Add a free to the error paths that
      don't call mount_subvol so that it is guaranteed that subvol_name is
      freed when an error happens.
      
      Fixes: 312c89fb ("btrfs: cleanup btrfs_mount() using btrfs_mount_root()")
      Cc: stable@vger.kernel.org # v4.19+
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      532b618b
    • D
      btrfs: clean up pending block groups when transaction commit aborts · c7cc64a9
      David Sterba 提交于
      The fstests generic/475 stresses transaction aborts and can reveal
      space accounting or use-after-free bugs regarding block goups.
      
      In this case the pending block groups that remain linked to the
      structures after transaction commit aborts in the middle.
      
      The corrupted slabs lead to failures in following tests, eg. generic/476
      
        [ 8172.752887] BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
        [ 8172.755799] #PF error: [normal kernel read fault]
        [ 8172.757571] PGD 661ae067 P4D 661ae067 PUD 3db8e067 PMD 0
        [ 8172.759000] Oops: 0000 [#1] PREEMPT SMP
        [ 8172.760209] CPU: 0 PID: 39 Comm: kswapd0 Tainted: G        W         5.0.0-rc2-default #408
        [ 8172.762495] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014
        [ 8172.765772] RIP: 0010:shrink_page_list+0x2f9/0xe90
        [ 8172.770453] RSP: 0018:ffff967f00663b18 EFLAGS: 00010287
        [ 8172.771184] RAX: 0000000000000000 RBX: ffff967f00663c20 RCX: 0000000000000000
        [ 8172.772850] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8c0620ab20e0
        [ 8172.774629] RBP: ffff967f00663dd8 R08: 0000000000000000 R09: 0000000000000000
        [ 8172.776094] R10: ffff8c0620ab22f8 R11: ffff8c063f772688 R12: ffff967f00663b78
        [ 8172.777533] R13: ffff8c063f625600 R14: ffff8c063f625608 R15: dead000000000200
        [ 8172.778886] FS:  0000000000000000(0000) GS:ffff8c063d400000(0000) knlGS:0000000000000000
        [ 8172.780545] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [ 8172.781787] CR2: 0000000000000058 CR3: 000000004e962000 CR4: 00000000000006f0
        [ 8172.783547] Call Trace:
        [ 8172.784112]  shrink_inactive_list+0x194/0x410
        [ 8172.784747]  shrink_node_memcg.constprop.85+0x3a5/0x6a0
        [ 8172.785472]  shrink_node+0x62/0x1e0
        [ 8172.786011]  balance_pgdat+0x216/0x460
        [ 8172.786577]  kswapd+0xe3/0x4a0
        [ 8172.787085]  ? finish_wait+0x80/0x80
        [ 8172.787795]  ? balance_pgdat+0x460/0x460
        [ 8172.788799]  kthread+0x116/0x130
        [ 8172.789640]  ? kthread_create_on_node+0x60/0x60
        [ 8172.790323]  ret_from_fork+0x24/0x30
        [ 8172.794253] CR2: 0000000000000058
      
      or accounting errors at umount time:
      
        [ 8159.537251] WARNING: CPU: 2 PID: 19031 at fs/btrfs/extent-tree.c:5987 btrfs_free_block_groups+0x3d5/0x410 [btrfs]
        [ 8159.543325] CPU: 2 PID: 19031 Comm: umount Tainted: G        W         5.0.0-rc2-default #408
        [ 8159.545472] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014
        [ 8159.548155] RIP: 0010:btrfs_free_block_groups+0x3d5/0x410 [btrfs]
        [ 8159.554030] RSP: 0018:ffff967f079cbde8 EFLAGS: 00010206
        [ 8159.555144] RAX: 0000000001000000 RBX: ffff8c06366cf800 RCX: 0000000000000000
        [ 8159.556730] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffff8c06255ad800
        [ 8159.558279] RBP: ffff8c0637ac0000 R08: 0000000000000001 R09: 0000000000000000
        [ 8159.559797] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8c0637ac0108
        [ 8159.561296] R13: ffff8c0637ac0158 R14: 0000000000000000 R15: dead000000000100
        [ 8159.562852] FS:  00007f7f693b9fc0(0000) GS:ffff8c063d800000(0000) knlGS:0000000000000000
        [ 8159.564839] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [ 8159.566160] CR2: 00007f7f68fab7b0 CR3: 000000000aec7000 CR4: 00000000000006e0
        [ 8159.567898] Call Trace:
        [ 8159.568597]  close_ctree+0x17f/0x350 [btrfs]
        [ 8159.569628]  generic_shutdown_super+0x64/0x100
        [ 8159.570808]  kill_anon_super+0x14/0x30
        [ 8159.571857]  btrfs_kill_super+0x12/0xa0 [btrfs]
        [ 8159.573063]  deactivate_locked_super+0x29/0x60
        [ 8159.574234]  cleanup_mnt+0x3b/0x70
        [ 8159.575176]  task_work_run+0x98/0xc0
        [ 8159.576177]  exit_to_usermode_loop+0x83/0x90
        [ 8159.577315]  do_syscall_64+0x15b/0x180
        [ 8159.578339]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      This fix is based on 2 Josef's patches that used sideefects of
      btrfs_create_pending_block_groups, this fix introduces the helper that
      does what we need.
      
      CC: stable@vger.kernel.org # 4.4+
      CC: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      c7cc64a9
    • A
      btrfs: fix potential oops in device_list_add · 92900e51
      Al Viro 提交于
      alloc_fs_devices() can return ERR_PTR(-ENOMEM), so dereferencing its
      result before the check for IS_ERR() is a bad idea.
      
      Fixes: d1a63002 ("btrfs: add members to fs_devices to track fsid changes")
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NAnand Jain <anand.jain@oracle.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      92900e51
  8. 30 1月, 2019 5 次提交