1. 01 Jun, 2020 3 commits
    • cifs: set up next DFS target before generic_ip_connect() · aaa3aef3
      Paulo Alcantara authored
      If we mount a very specific DFS link
      
          \\FS0.FOO.COM\dfs\link -> \FS0\share1, \FS1\share2
      
      where its target list contains NB names ("FS0" & "FS1") rather than
      FQDN ones ("FS0.FOO.COM" & "FS1.FOO.COM"), we end up connecting to
      \FS0\share1 but server->hostname will have "FS0.FOO.COM".  This is
      because both "FS0" and "FS0.FOO.COM" resolve to the same IP address and
      share the same TCP server connection, but "FS0.FOO.COM" was the first
      hostname set -- which is OK.
      
      However, if the echo thread times out and we still have a good
      connection to "FS0", in cifs_reconnect()
      
          rc = generic_ip_connect(server) -> success
          if (rc) {
                  ...
                  reconn_inval_dfs_target(server, cifs_sb, &tgt_list,
      	                            &tgt_it);
                  ...
           }
           ...
      
      it successfully reconnects to "FS0" server but does not set up next
      DFS target - which should be the same target server "\FS0\share1" -
      and server->hostname remains set to "FS0.FOO.COM" rather than "FS0",
      as reconn_inval_dfs_target() would have it set to "FS0" if called
      earlier.
      
      Finally, in __smb2_reconnect(), the reconnect of tcons would fail
      because tcon->ses->server->hostname (FS0.FOO.COM) does not match DFS
      target's hostname (FS0).
      
      Fix that by calling reconn_inval_dfs_target() before
      generic_ip_connect() so server->hostname will get updated correctly
      prior to reconnecting its tcons in __smb2_reconnect().
      
      With "cifs: handle hostnames that resolve to same ip in failover"
      patch
      
          - The above problem would not occur.
          - We could save a DNS query to find out that they both resolve to
            the same IP address.
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
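The ordering problem described in this commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct toy_server`, `toy_pick_next_target()`, `toy_tcon_reconnect()`, and `toy_reconnect()` are hypothetical stand-ins and are not the cifs code. The model assumes the TCP connect succeeds and only tracks which hostname the server record holds when the tcon reconnect check runs.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the part of the server record that matters here. */
struct toy_server {
	char hostname[64];
};

/* Models setting up the next DFS target: adopt the target's hostname. */
void toy_pick_next_target(struct toy_server *srv, const char *target)
{
	snprintf(srv->hostname, sizeof(srv->hostname), "%s", target);
}

/* Models the tcon reconnect check: hostname must match the DFS target. */
bool toy_tcon_reconnect(const struct toy_server *srv, const char *target)
{
	return strcmp(srv->hostname, target) == 0;
}

/*
 * Models one reconnect attempt in which the TCP connect succeeds.
 * target_first == false: the target is only set up when the connect
 * fails, so a successful connect leaves the old hostname in place.
 * target_first == true: the target is set up before connecting.
 */
bool toy_reconnect(struct toy_server *srv, const char *target,
		   bool target_first)
{
	if (target_first)
		toy_pick_next_target(srv, target);
	/* the connect itself would run here and succeed */
	return toy_tcon_reconnect(srv, target);
}
```

Starting from a hostname of "FS0.FOO.COM" and a target of "FS0", the model reports a failed tcon reconnect with `target_first == false` and a successful one with `target_first == true`.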
    • cifs: remove redundant initialization of variable rc · 136a5dc3
      Colin Ian King authored
      The variable rc is being initialized with a value that is never read
      and it is being updated later with a new value.  The initialization is
      redundant and can be removed.
      
      Addresses-Coverity: ("Unused value")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: handle "nolease" option for vers=1.0 · 8fd6e1d6
      Kenneth D'souza authored
      The "nolease" mount option is only supported for SMB2+ mounts.
      Fail with an appropriate error message if the vers=1.0 option is passed.
      Signed-off-by: Kenneth D'souza <kdsouza@redhat.com>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
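A check of this kind might look roughly like the following. It is a minimal sketch, not the cifs mount-option parser: the enum, the function name `toy_check_nolease()`, and the error text are all hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical protocol version tags; not the enum used by cifs. */
enum toy_smb_vers { TOY_SMB1, TOY_SMB2_OR_LATER };

/* Returns 0 if the option combination is acceptable, -EINVAL otherwise. */
int toy_check_nolease(enum toy_smb_vers vers, bool nolease)
{
	if (nolease && vers == TOY_SMB1) {
		/* Illustrative message only; the real wording may differ. */
		fprintf(stderr, "nolease is only supported for SMB2+ mounts\n");
		return -EINVAL;
	}
	return 0;
}
```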
  2. 29 May, 2020 2 commits
  3. 28 May, 2020 1 commit
    • fanotify: turn off support for FAN_DIR_MODIFY · f1793699
      Amir Goldstein authored
      FAN_DIR_MODIFY has been enabled by commit 44d705b0 ("fanotify:
      report name info for FAN_DIR_MODIFY event") in 5.7-rc1. Now we are
      planning further extensions to the fanotify API and during that we
      realized that FAN_DIR_MODIFY may behave slightly differently to be more
      consistent with extensions we plan. So until we finalize these
      extensions, let's not bind our hands with exposing FAN_DIR_MODIFY to
      userland.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
  4. 27 May, 2020 1 commit
  5. 23 May, 2020 1 commit
    • rxrpc: Fix a warning · 8a1d24e1
      David Howells authored
      Fix a warning due to an uninitialised variable.
      
      In file included from ../fs/afs/fs_probe.c:11:
      ../fs/afs/fs_probe.c: In function 'afs_fileserver_probe_result':
      ../fs/afs/internal.h:1453:2: warning: 'rtt_us' may be used uninitialized in this function [-Wmaybe-uninitialized]
       1453 |  printk("[%-6.6s] "FMT"\n", current->comm ,##__VA_ARGS__)
            |  ^~~~~~
      ../fs/afs/fs_probe.c:35:15: note: 'rtt_us' was declared here
      Signed-off-by: David Howells <dhowells@redhat.com>
  6. 22 May, 2020 1 commit
  7. 21 May, 2020 1 commit
    • pipe: Fix pipe_full() test in opipe_prep(). · 566d1362
      Tetsuo Handa authored
      syzbot is reporting that splice()ing from a non-empty read side to an
      already-full write side causes an unkillable task, because opipe_prep()
      mistakenly does not invert the pipe_full() test.
      
        CPU: 0 PID: 9460 Comm: syz-executor.5 Not tainted 5.6.0-rc3-next-20200228-syzkaller #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        RIP: 0010:rol32 include/linux/bitops.h:105 [inline]
        RIP: 0010:iterate_chain_key kernel/locking/lockdep.c:369 [inline]
        RIP: 0010:__lock_acquire+0x6a3/0x5270 kernel/locking/lockdep.c:4178
        Call Trace:
           lock_acquire+0x197/0x420 kernel/locking/lockdep.c:4720
           __mutex_lock_common kernel/locking/mutex.c:956 [inline]
           __mutex_lock+0x156/0x13c0 kernel/locking/mutex.c:1103
           pipe_lock_nested fs/pipe.c:66 [inline]
           pipe_double_lock+0x1a0/0x1e0 fs/pipe.c:104
           splice_pipe_to_pipe fs/splice.c:1562 [inline]
           do_splice+0x35f/0x1520 fs/splice.c:1141
           __do_sys_splice fs/splice.c:1447 [inline]
           __se_sys_splice fs/splice.c:1427 [inline]
           __x64_sys_splice+0x2b5/0x320 fs/splice.c:1427
           do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:295
           entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Reported-by: syzbot+b48daca8639150bc5e73@syzkaller.appspotmail.com
      Link: https://syzkaller.appspot.com/bug?id=9386d051e11e09973d5a4cf79af5e8cedf79386d
      Fixes: 8cefc107 ("pipe: Use head and tail pointers for the ring, not cursor and length")
      Cc: stable@vger.kernel.org # 5.5+
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
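The effect of a missing negation on a fullness test can be seen in a small userspace model of a ring with free-running head and tail counters. This is a sketch only: the `toy_` names are hypothetical, and the code is not copied from the kernel's pipe implementation.

```c
#include <stdbool.h>

/* Number of occupied slots; unsigned subtraction handles counter wraparound. */
unsigned int toy_pipe_occupancy(unsigned int head, unsigned int tail)
{
	return head - tail;
}

/* The ring is full once the occupancy reaches the usage limit. */
bool toy_pipe_full(unsigned int head, unsigned int tail, unsigned int limit)
{
	return toy_pipe_occupancy(head, tail) >= limit;
}

/* True when a writer may proceed without waiting for a reader. */
bool toy_can_write_now(unsigned int head, unsigned int tail, unsigned int limit)
{
	/* Dropping the '!' here would treat a full ring as writable. */
	return !toy_pipe_full(head, tail, limit);
}
```

With the negation in place, a writer facing a full ring is told to wait, and a writer facing a ring with free slots proceeds immediately.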
  8. 20 May, 2020 6 commits
    • io_uring: reset -EBUSY error when io sq thread is waken up · d4ae271d
      Xiaoguang Wang authored
      In io_sq_thread(), if we currently get an -EBUSY error and go to sleep,
      we never clear it again, so io_sq_thread() never gets another chance
      to submit sqes. The test program test.c below reproduces the bug:
      
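      #define _GNU_SOURCE	/* for O_DIRECT */
      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <unistd.h>
      #include <liburing.h>
      
      int main(int argc, char *argv[])
      {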
              struct io_uring ring;
              int i, fd, ret;
              struct io_uring_sqe *sqe;
              struct io_uring_cqe *cqe;
              struct iovec *iovecs;
              void *buf;
              struct io_uring_params p;
      
              if (argc < 2) {
                      printf("%s: file\n", argv[0]);
                      return 1;
              }
      
              memset(&p, 0, sizeof(p));
              p.flags = IORING_SETUP_SQPOLL;
              ret = io_uring_queue_init_params(4, &ring, &p);
              if (ret < 0) {
                      fprintf(stderr, "queue_init: %s\n", strerror(-ret));
                      return 1;
              }
      
              fd = open(argv[1], O_RDONLY | O_DIRECT);
              if (fd < 0) {
                      perror("open");
                      return 1;
              }
      
              iovecs = calloc(10, sizeof(struct iovec));
              for (i = 0; i < 10; i++) {
                      if (posix_memalign(&buf, 4096, 4096))
                              return 1;
                      iovecs[i].iov_base = buf;
                      iovecs[i].iov_len = 4096;
              }
      
              ret = io_uring_register_files(&ring, &fd, 1);
              if (ret < 0) {
                      fprintf(stderr, "%s: register %d\n", __FUNCTION__, ret);
                      return ret;
              }
      
              for (i = 0; i < 10; i++) {
                      sqe = io_uring_get_sqe(&ring);
                      if (!sqe)
                              break;
      
                      io_uring_prep_readv(sqe, 0, &iovecs[i], 1, 0);
                      sqe->flags |= IOSQE_FIXED_FILE;
      
                      ret = io_uring_submit(&ring);
                      sleep(1);
                      printf("submit %d\n", i);
              }
      
              for (i = 0; i < 10; i++) {
                      io_uring_wait_cqe(&ring, &cqe);
                      printf("receive: %d\n", i);
                      if (cqe->res != 4096) {
                              fprintf(stderr, "ret=%d, wanted 4096\n", cqe->res);
                              ret = 1;
                      }
                      io_uring_cqe_seen(&ring, cqe);
              }
      
              close(fd);
              io_uring_queue_exit(&ring);
              return 0;
      }
      
      sudo ./test testfile
      
      The above command hangs on the tenth request. To fix this bug, reset
      the variable 'ret' to zero when the io sq_thread is woken up.
      Suggested-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
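The sticky-error pattern described here can be modelled in a few lines. This is a deliberately simplified sketch and does not reproduce the control flow of io_sq_thread(): the function `toy_sq_loop()` and its parameters are hypothetical, and each loop iteration stands for one wakeup of the polling thread.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Toy model of a submission loop that remembers its last error.
 * Returns how many of the wakeups led to a submission attempt.
 */
int toy_sq_loop(int wakeups, bool reset_on_wakeup)
{
	int ret = -EBUSY;	/* pretend the previous submit hit -EBUSY */
	int attempts = 0;

	for (int i = 0; i < wakeups; i++) {
		if (reset_on_wakeup)
			ret = 0;	/* forget the stale error after waking */
		if (ret == -EBUSY)
			continue;	/* stale error: sleep again, nothing submitted */
		attempts++;
		ret = 0;		/* pretend this submit succeeded */
	}
	return attempts;
}
```

Without the reset the model never submits again, no matter how often it is woken; with the reset every wakeup leads to a submission attempt.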
    • io_uring: don't add non-IO requests to iopoll pending list · b532576e
      Jens Axboe authored
      We normally disable any commands that aren't specifically poll commands
      for a ring that is set up for polling, but we do allow buffer provide and
      remove commands to support buffer selection for polled IO. Once a
      request is issued, we add it to the poll list to poll for completion. But
      we should not do that for non-IO commands, as those requests complete
      inline immediately and aren't pollable. If we do, we can leave requests
      on the iopoll list after they are freed.
      
      Fixes: ddf0322d ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • fix multiplication overflow in copy_fdtable() · 4e89b721
      Al Viro authored
      cpy and set really should be size_t; we won't get an overflow on that,
      since sysctl_nr_open can't be set above ~(size_t)0 / sizeof(void *),
      so nr that would've managed to overflow size_t on that multiplication
      won't get anywhere near copy_fdtable() - we'll fail with EMFILE
      before that.
      
      Cc: stable@kernel.org # v2.6.25+
      Fixes: 9cfe015a (get rid of NR_OPEN and introduce a sysctl_nr_open)
      Reported-by: Thiago Macieira <thiago.macieira@intel.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
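The difference between holding a byte count in an `unsigned int` and in a `size_t` can be shown with a short sketch. The helper names are hypothetical, and the wraparound shown assumes a platform where pointers are 8 bytes and `unsigned int` is 32 bits.

```c
#include <stddef.h>

/* Byte count narrowed to unsigned int: can wrap when pointers are 8 bytes. */
unsigned int toy_bytes_uint(unsigned int nr_fds)
{
	return (unsigned int)(nr_fds * sizeof(void *));
}

/* Byte count kept in size_t: the product is not truncated. */
size_t toy_bytes_size_t(unsigned int nr_fds)
{
	return nr_fds * sizeof(void *);
}
```

Under those assumptions, `1u << 29` entries need 2^32 bytes, which the narrowed helper reports as 0 while the `size_t` helper reports the full value.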
    • io_uring: don't use kiocb.private to store buf_index · 4f4eeba8
      Bijan Mottahedeh authored
      kiocb.private is used in iomap_dio_rw() so store buf_index separately.
      Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
      
      Move 'buf_index' to a hole in io_kiocb.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • ext4: fix fiemap size checks for bitmap files · 959f7584
      Christoph Hellwig authored
      Add an extra validation of the len parameter, as for ext4 some files
      might have smaller file size limits than others.  This also means the
      redundant size check in ext4_ioctl_get_es_cache can go away, as all
      size checking is done in the shared fiemap handler.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20200505154324.3226743-3-hch@lst.de
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix EXT4_MAX_LOGICAL_BLOCK macro · 9f44eda1
      Ritesh Harjani authored
      ext4 supports a maximum of 0xffffffff logical blocks in a file.
      (This is because ext4_extent's ee_block is __le32.)
      This means that EXT4_MAX_LOGICAL_BLOCK should be 0xfffffffe (logical
      offsets start from 0). This patch fixes this.
      
      The issue was seen when ext4 moved to the iomap_fiemap API and when
      overlayfs was mounted on top of ext4. Since overlayfs was missing
      filemap_check_ranges(), it could pass an arbitrarily large length, which
      led to an overflow in the map.m_len logic.
      
      This patch fixes that.
      
      Fixes: d3b6f23f ("ext4: move ext4_fiemap to use iomap framework")
      Reported-by: syzbot+77fa5bdb65cc39711820@syzkaller.appspotmail.com
      Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20200505154324.3226743-2-hch@lst.de
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
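The arithmetic behind the corrected limit can be sketched as a range clamp. This is an illustration only, not ext4 code: `TOY_MAX_LOGICAL_BLOCK` and `toy_clamp_len()` are hypothetical names, and the constant simply mirrors the value stated in the commit message.

```c
#include <stdint.h>

/* Last valid logical block index when indices are 32-bit and start at 0. */
#define TOY_MAX_LOGICAL_BLOCK 0xFFFFFFFEu

/*
 * Clamp a (start, len) block range so it ends at or before the last
 * valid block. Returns the clamped length in blocks, or 0 if start is
 * already out of range.
 */
uint64_t toy_clamp_len(uint64_t start, uint64_t len)
{
	uint64_t max_len;

	if (start > TOY_MAX_LOGICAL_BLOCK)
		return 0;
	max_len = (uint64_t)TOY_MAX_LOGICAL_BLOCK - start + 1;
	return len < max_len ? len : max_len;
}
```

Clamping an arbitrarily large length from block 0 yields 0xffffffff blocks, matching the maximum block count given above.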
  9. 19 May, 2020 2 commits
    • afs: Don't unlock fetched data pages until the op completes successfully · 9d1be4f4
      David Howells authored
      Don't call req->page_done() on each page as we finish filling it with
      the data coming from the network.  Whilst this might speed up the
      application a bit, it's a problem if there's a network failure and the
      operation has to be reissued.
      
      If this happens, an oops occurs because afs_readpages_page_done() clears
      the pointer to each page it unlocks and when a retry happens, the
      pointers to the pages it wants to fill are now NULL (and the pages have
      been unlocked anyway).
      
      Instead, wait till the operation completes successfully and only then
      release all the pages after clearing any terminal gap (the server can
      give us less data than we requested as we're allowed to ask for more
      than is available).
      
      KASAN produces a bug like the following, and even without KASAN, it can
      oops and panic.
      
          BUG: KASAN: wild-memory-access in _copy_to_iter+0x323/0x5f4
          Write of size 1404 at addr 0005088000000000 by task md5sum/5235
      
          CPU: 0 PID: 5235 Comm: md5sum Not tainted 5.7.0-rc3-fscache+ #250
          Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
          Call Trace:
           memcpy+0x39/0x58
           _copy_to_iter+0x323/0x5f4
           __skb_datagram_iter+0x89/0x2a6
           skb_copy_datagram_iter+0x129/0x135
           rxrpc_recvmsg_data.isra.0+0x615/0xd42
           rxrpc_kernel_recv_data+0x1e9/0x3ae
           afs_extract_data+0x139/0x33a
           yfs_deliver_fs_fetch_data64+0x47a/0x91b
           afs_deliver_to_call+0x304/0x709
           afs_wait_for_call_to_complete+0x1cc/0x4ad
           yfs_fs_fetch_data+0x279/0x288
           afs_fetch_data+0x1e1/0x38d
           afs_readpages+0x593/0x72e
           read_pages+0xf5/0x21e
           __do_page_cache_readahead+0x128/0x23f
           ondemand_readahead+0x36e/0x37f
           generic_file_buffered_read+0x234/0x680
           new_sync_read+0x109/0x17e
           vfs_read+0xe6/0x138
           ksys_read+0xd8/0x14d
           do_syscall_64+0x6e/0x8a
           entry_SYSCALL_64_after_hwframe+0x49/0xb3
      
      Fixes: 196ee9cd ("afs: Make afs_fs_fetch_data() take a list of pages")
      Fixes: 30062bd1 ("afs: Implement YFS support in the fs client")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • io_uring: cancel work if task_work_add() fails · e3aabf95
      Jens Axboe authored
      We currently move it to the io_wqe_manager for execution, but we cannot
      safely do so as we may lack some of the state to execute it out of
      context. As we cancel work anyway when the ring/task exits, just mark
      this request as canceled and io_async_task_func() will do the right
      thing.
      
      Fixes: aa96bf8a ("io_uring: use io-wq manager as backup task if task is exiting")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  10. 18 May, 2020 4 commits
    • exfat: fix possible memory leak in exfat_find() · 94182167
      Wei Yongjun authored
      'es' is allocated by exfat_get_dentry_set() in exfat_find() and should
      be freed before returning from the error handling paths; otherwise it
      will cause a memory leak.
      
      Fixes: 5f2aa075 ("exfat: add inode operations")
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
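A common way to avoid this kind of leak is to route every exit taken after the allocation through one cleanup label. The sketch below shows that pattern in userspace; it is not the exfat code, and `toy_get_set()`, `toy_free_set()`, `toy_find()`, and the allocation counter are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Counts live allocations so the example can be checked for leaks. */
int toy_live_allocs;

void *toy_get_set(void)
{
	void *p = malloc(32);

	if (p)
		toy_live_allocs++;
	return p;
}

void toy_free_set(void *p)
{
	if (p) {
		free(p);
		toy_live_allocs--;
	}
}

/* fail_midway simulates an error detected after the allocation. */
int toy_find(int fail_midway)
{
	int err = 0;
	void *es = toy_get_set();

	if (!es)
		return -ENOMEM;
	if (fail_midway) {
		err = -EIO;
		goto out;	/* the error path still releases 'es' */
	}
	/* ... use 'es' ... */
out:
	toy_free_set(es);
	return err;
}
```

After either a failing or a succeeding call to `toy_find()`, `toy_live_allocs` is back at zero.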
    • exfat: use iter_file_splice_write · 03577948
      Eric Sandeen authored
      Doing copy_file_range() on exfat with a file opened for direct IO leads
      to an -EFAULT:
      
      # xfs_io -f -d -c "truncate 32768" \
             -c "copy_range -d 16384 -l 16384 -f 0" /mnt/test/junk
      copy_range: Bad address
      
      and the reason seems to be that we go through:
      
      default_file_splice_write
       splice_from_pipe
        __splice_from_pipe
         write_pipe_buf
          __kernel_write
           new_sync_write
            generic_file_write_iter
             generic_file_direct_write
              exfat_direct_IO
               do_blockdev_direct_IO
                iov_iter_get_pages
      
      and land in iterate_all_kinds(), which does "return -EFAULT" for our kvec
      iter.
      
      Setting exfat's splice_write to iter_file_splice_write fixes this and lets
      fsx (which originally detected the problem) run to success from
      the xfstests harness.
      Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
    • ubifs: fix wrong use of crypto_shash_descsize() · 3c3c32f8
      Eric Biggers authored
      crypto_shash_descsize() returns the size of the shash_desc context
      needed to compute the hash, not the size of the hash itself.
      
      crypto_shash_digestsize() would be correct, or alternatively using
      c->hash_len and c->hmac_desc_len which already store the correct values.
      But actually it's simpler to just use stack arrays, so do that instead.
      
      Fixes: 49525e5e ("ubifs: Add helper functions for authentication support")
      Fixes: da8ef65f ("ubifs: Authenticate replayed journal")
      Cc: <stable@vger.kernel.org> # v4.20+
      Cc: Sascha Hauer <s.hauer@pengutronix.de>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
      Signed-off-by: Richard Weinberger <richard@nod.at>
    • io_uring: remove dead check in io_splice() · 948a7749
      Jens Axboe authored
      We checked for 'force_nonblock' higher up, so it's definitely false
      at this point. Kill the check, it's a remnant of when we tried to do
      inline splice without always punting to async context.
      
      Fixes: 2fb3e822 ("io_uring: punt splice async because of inode mutex")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  11. 17 May, 2020 4 commits
    • exec: Move would_dump into flush_old_exec · f87d1c95
      Eric W. Biederman authored
      I goofed when I added mm->user_ns support to would_dump.  I missed the
      fact that in the case of binfmt_loader, binfmt_em86, binfmt_misc, and
      binfmt_script bprm->file is reassigned.  Which made the move of
      would_dump from setup_new_exec to __do_execve_file before exec_binprm
      incorrect as it can result in would_dump running on the script instead
      of the interpreter of the script.
      
      The net result is that the code stopped making unreadable interpreters
      undumpable.  Which allows them to be ptraced and written to disk
      without special permissions.  Oops.
      
      The move was necessary because the call in setup_new_exec was after
      bprm->mm was no longer valid.
      
      To correct this mistake move the misplaced would_dump from
      __do_execve_file into flush_old_exec, before exec_mmap is called.
      
      I tested and confirmed that without this fix I can attach with gdb to
      a script with an unreadable interpreter, and with this fix I cannot.
      
      Cc: stable@vger.kernel.org
      Fixes: f84df2a6 ("exec: Ensure mm->user_ns contains the execed files")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • io_uring: fix FORCE_ASYNC req preparation · bd2ab18a
      Pavel Begunkov authored
      As with other non-inlined requests, allocate req->io for FORCE_ASYNC
      reqs so they can be prepared properly.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: don't prepare DRAIN reqs twice · 650b5481
      Pavel Begunkov authored
      If req->io is not NULL, it's already prepared. Don't do it again,
      it's dangerous.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: initialize ctx->sqo_wait earlier · 583863ed
      Jens Axboe authored
      Ensure that ctx->sqo_wait is initialized as soon as the ctx is allocated,
      instead of deferring it to the offload setup. This fixes a syzbot
      reported lockdep complaint, which is really due to trying to wake_up
      on an uninitialized wait queue:
      
      RSP: 002b:00007fffb1fb9aa8 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441319
      RDX: 0000000000000001 RSI: 0000000020000140 RDI: 000000000000047b
      RBP: 0000000000010475 R08: 0000000000000001 R09: 00000000004002c8
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402260
      R13: 00000000004022f0 R14: 0000000000000000 R15: 0000000000000000
      INFO: trying to register non-static key.
      the code is fine but needs lockdep annotation.
      turning off the locking correctness validator.
      CPU: 1 PID: 7090 Comm: syz-executor222 Not tainted 5.7.0-rc1-next-20200415-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x188/0x20d lib/dump_stack.c:118
       assign_lock_key kernel/locking/lockdep.c:913 [inline]
       register_lock_class+0x1664/0x1760 kernel/locking/lockdep.c:1225
       __lock_acquire+0x104/0x4c50 kernel/locking/lockdep.c:4234
       lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4934
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x8c/0xbf kernel/locking/spinlock.c:159
       __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:122
       io_cqring_ev_posted+0xa5/0x1e0 fs/io_uring.c:1160
       io_poll_remove_all fs/io_uring.c:4357 [inline]
       io_ring_ctx_wait_and_kill+0x2bc/0x5a0 fs/io_uring.c:7305
       io_uring_create fs/io_uring.c:7843 [inline]
       io_uring_setup+0x115e/0x22b0 fs/io_uring.c:7870
       do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      RIP: 0033:0x441319
      Code: e8 5c ae 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 bb 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fffb1fb9aa8 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
      
      Reported-by: syzbot+8c91f5d054e998721c57@syzkaller.appspotmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  12. 15 May, 2020 3 commits
  13. 14 May, 2020 2 commits
  14. 13 May, 2020 2 commits
  15. 12 May, 2020 2 commits
  16. 11 May, 2020 1 commit
    • rxrpc: Fix the excessive initial retransmission timeout · c410bf01
      David Howells authored
      rxrpc currently uses a fixed 4s retransmission timeout until the RTT is
      sufficiently sampled.  This can cause problems with some fileservers with
      calls to the cache manager in the afs filesystem being dropped from the
      fileserver because a packet goes missing and the retransmission timeout is
      greater than the call expiry timeout.
      
      Fix this by:
      
       (1) Copying the RTT/RTO calculation code from Linux's TCP implementation
           and altering it to fit rxrpc.
      
       (2) Altering the various users of the RTT to make use of the new SRTT
           value.
      
       (3) Replacing the use of rxrpc_resend_timeout to use the calculated RTO
           value instead (which is needed in jiffies), along with a backoff.
      
      Notes:
      
       (1) rxrpc provides RTT samples by matching the serial numbers on outgoing
           DATA packets that have the RXRPC_REQUEST_ACK set and PING ACK packets
           against the reference serial number in incoming REQUESTED ACK and
           PING-RESPONSE ACK packets.
      
       (2) Each packet that is transmitted on an rxrpc connection gets a new
           per-connection serial number, even for retransmissions, so an ACK can
           be cross-referenced to a specific trigger packet.  This allows RTT
           information to be drawn from retransmitted DATA packets also.
      
       (3) rxrpc maintains the RTT/RTO state on the rxrpc_peer record rather than
           on an rxrpc_call because many RPC calls won't live long enough to
           generate more than one sample.
      
       (4) The calculated SRTT value is in units of 8ths of a microsecond rather
           than nanoseconds.
      
      The (S)RTT and RTO values are displayed in /proc/net/rxrpc/peers.
      
      Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
      Signed-off-by: David Howells <dhowells@redhat.com>
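The idea of deriving a retransmission timeout from smoothed RTT samples, with SRTT held in units of 1/8 microsecond, can be sketched as follows. This is a simplified estimator in the style of RFC 6298 and is an assumption made for illustration: it is not the code copied from the TCP implementation, it omits clamping and backoff, and `struct toy_rtt`, `toy_rtt_sample()`, and `toy_rto_us()` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical estimator state, kept per peer rather than per call. */
struct toy_rtt {
	uint32_t srtt_us8;	/* smoothed RTT in units of 1/8 microsecond */
	uint32_t rttvar_us;	/* RTT variation in microseconds */
	int have_sample;	/* nonzero once the first sample is in */
};

/* Fold one RTT sample (in microseconds) into the estimator. */
void toy_rtt_sample(struct toy_rtt *r, uint32_t sample_us)
{
	uint32_t srtt_us, diff;

	if (!r->have_sample) {
		/* First sample: SRTT = R, RTTVAR = R / 2 */
		r->srtt_us8 = sample_us << 3;
		r->rttvar_us = sample_us / 2;
		r->have_sample = 1;
		return;
	}
	srtt_us = r->srtt_us8 >> 3;
	diff = srtt_us > sample_us ? srtt_us - sample_us : sample_us - srtt_us;
	/* RTTVAR = 3/4 * RTTVAR + 1/4 * |SRTT - R| */
	r->rttvar_us = r->rttvar_us - (r->rttvar_us >> 2) + (diff >> 2);
	/* SRTT = 7/8 * SRTT + 1/8 * R, computed on the scaled value */
	r->srtt_us8 = r->srtt_us8 - (r->srtt_us8 >> 3) + sample_us;
}

/* RTO = SRTT + 4 * RTTVAR, in microseconds. */
uint32_t toy_rto_us(const struct toy_rtt *r)
{
	return (r->srtt_us8 >> 3) + 4 * r->rttvar_us;
}
```

Storing SRTT scaled by 8 lets the 1/8 weighting be done with shifts and keeps the fractional part between samples, which is one reason a scaled unit is convenient.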
  17. 10 May, 2020 2 commits
  18. 09 May, 2020 2 commits
    • cachefiles: Fix race between read_waiter and read_copier involving op->to_do · 7bb0c533
      Lei Xue authored
      There is a potential race in fscache operation enqueuing for reading and
      copying multiple pages from cachefiles to netfs.  The problem can be seen
      easily on a heavily loaded system (for example, many processes reading
      files continually on an NFS share covered by fscache triggered this
      problem within a few minutes).
      
      The race is due to cachefiles_read_waiter() adding the op to the monitor
      to_do list and then dropping the object->work_lock spinlock before
      completing fscache_enqueue_operation().  Once the lock is dropped,
      cachefiles_read_copier() grabs the op, completes processing it, and
      makes it through fscache_retrieval_complete() which sets the op->state to
      the final state of FSCACHE_OP_ST_COMPLETE(4).  When cachefiles_read_waiter()
      finally gets through the remainder of fscache_enqueue_operation()
      it sees the invalid state, and hits the ASSERTCMP and the following
      oops is seen:
      [ 2259.612361] FS-Cache:
      [ 2259.614785] FS-Cache: Assertion failed
      [ 2259.618639] FS-Cache: 4 == 5 is false
      [ 2259.622456] ------------[ cut here ]------------
      [ 2259.627190] kernel BUG at fs/fscache/operation.c:70!
      ...
      [ 2259.791675] RIP: 0010:[<ffffffffc061b4cf>]  [<ffffffffc061b4cf>] fscache_enqueue_operation+0xff/0x170 [fscache]
      [ 2259.802059] RSP: 0000:ffffa0263d543be0  EFLAGS: 00010046
      [ 2259.807521] RAX: 0000000000000019 RBX: ffffa01a4d390480 RCX: 0000000000000006
      [ 2259.814847] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffa0263d553890
      [ 2259.822176] RBP: ffffa0263d543be8 R08: 0000000000000000 R09: ffffa0263c2d8708
      [ 2259.829502] R10: 0000000000001e7f R11: 0000000000000000 R12: ffffa01a4d390480
      [ 2259.844483] R13: ffff9fa9546c5920 R14: ffffa0263d543c80 R15: ffffa0293ff9bf10
      [ 2259.859554] FS:  00007f4b6efbd700(0000) GS:ffffa0263d540000(0000) knlGS:0000000000000000
      [ 2259.875571] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 2259.889117] CR2: 00007f49e1624ff0 CR3: 0000012b38b38000 CR4: 00000000007607e0
      [ 2259.904015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 2259.918764] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 2259.933449] PKRU: 55555554
      [ 2259.943654] Call Trace:
      [ 2259.953592]  <IRQ>
      [ 2259.955577]  [<ffffffffc03a7c12>] cachefiles_read_waiter+0x92/0xf0 [cachefiles]
      [ 2259.978039]  [<ffffffffa34d3942>] __wake_up_common+0x82/0x120
      [ 2259.991392]  [<ffffffffa34d3a63>] __wake_up_common_lock+0x83/0xc0
      [ 2260.004930]  [<ffffffffa34d3510>] ? task_rq_unlock+0x20/0x20
      [ 2260.017863]  [<ffffffffa34d3ab3>] __wake_up+0x13/0x20
      [ 2260.030230]  [<ffffffffa34c72a0>] __wake_up_bit+0x50/0x70
      [ 2260.042535]  [<ffffffffa35bdcdb>] unlock_page+0x2b/0x30
      [ 2260.054495]  [<ffffffffa35bdd09>] page_endio+0x29/0x90
      [ 2260.066184]  [<ffffffffa368fc81>] mpage_end_io+0x51/0x80
      
      CPU1
      cachefiles_read_waiter()
       20 static int cachefiles_read_waiter(wait_queue_entry_t *wait, unsigned mode,
       21                                   int sync, void *_key)
       22 {
      ...
       61         spin_lock(&object->work_lock);
       62         list_add_tail(&monitor->op_link, &op->to_do);
       63         spin_unlock(&object->work_lock);
      <begin race window>
       64
       65         fscache_enqueue_retrieval(op);
      182 static inline void fscache_enqueue_retrieval(struct fscache_retrieval *op)
      183 {
      184         fscache_enqueue_operation(&op->op);
      185 }
       58 void fscache_enqueue_operation(struct fscache_operation *op)
       59 {
       60         struct fscache_cookie *cookie = op->object->cookie;
       61
       62         _enter("{OBJ%x OP%x,%u}",
       63                op->object->debug_id, op->debug_id, atomic_read(&op->usage));
       64
       65         ASSERT(list_empty(&op->pend_link));
       66         ASSERT(op->processor != NULL);
       67         ASSERT(fscache_object_is_available(op->object));
       68         ASSERTCMP(atomic_read(&op->usage), >, 0);
      <end race window>
      
      CPU2
      cachefiles_read_copier()
      168         while (!list_empty(&op->to_do)) {
      ...
      202                 fscache_end_io(op, monitor->netfs_page, error);
      203                 put_page(monitor->netfs_page);
      204                 fscache_retrieval_complete(op, 1);
      
      CPU1
       58 void fscache_enqueue_operation(struct fscache_operation *op)
       59 {
      ...
       69         ASSERTIFCMP(op->state != FSCACHE_OP_ST_IN_PROGRESS,
       70                     op->state, ==,  FSCACHE_OP_ST_CANCELLED);
      Signed-off-by: Lei Xue <carmark.dlut@gmail.com>
      Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • NFSv4: Fix fscache cookie aux_data to ensure change_attr is included · 50eaa652
      Dave Wysochanski authored
      Commit 402cb8dd ("fscache: Attach the index key and aux data to
      the cookie") added the aux_data and aux_data_len parameters to
      fscache_acquire_cookie(), and updated the callers in the NFS client.
      In the process it modified the aux_data to include the change_attr,
      but missed adding change_attr in a couple of places where aux_data was
      used.  Specifically, when a file is opened and the change_attr is not
      added, the following attempt to look up an object fails inside
      cachefiles_check_object_xattr() with -116, because
      nfs_fscache_inode_check_aux() fails the memcmp on auxdata and returns
      FSCACHE_CHECKAUX_OBSOLETE.
      
      Fix this by adding nfs_fscache_update_auxdata() to set the auxdata
      from all relevant fields in the inode, including the change_attr.
      
      Fixes: 402cb8dd ("fscache: Attach the index key and aux data to the cookie")
      Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
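Why a single fill helper matters for a memcmp-based check can be shown with a small model. This is a sketch under assumptions: the struct layouts and the `toy_` functions are hypothetical and do not match the NFS client's auxdata structure; the point is only that every writer and the checker must populate the same fields.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical inode fields that feed the cached auxiliary data. */
struct toy_inode {
	int64_t mtime;
	int64_t ctime;
	uint64_t change_attr;
};

/* Hypothetical auxdata blob stored alongside the cache object. */
struct toy_auxdata {
	int64_t mtime;
	int64_t ctime;
	uint64_t change_attr;
};

/* One helper fills every field, so all users build identical blobs. */
void toy_update_auxdata(struct toy_auxdata *aux, const struct toy_inode *inode)
{
	memset(aux, 0, sizeof(*aux));
	aux->mtime = inode->mtime;
	aux->ctime = inode->ctime;
	aux->change_attr = inode->change_attr;
}

/* Returns 1 if the stored blob still matches the inode, 0 if obsolete. */
int toy_check_aux(const struct toy_auxdata *stored, const struct toy_inode *inode)
{
	struct toy_auxdata cur;

	toy_update_auxdata(&cur, inode);
	return memcmp(stored, &cur, sizeof(cur)) == 0;
}
```

A blob written without `change_attr` compares as obsolete against one built by the helper, even though the timestamps agree.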