1. 16 7月, 2017 1 次提交
    • B
      fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks · 9d5b86ac
      Benjamin Coddington 提交于
      Since commit c69899a1 "NFSv4: Update of VFS byte range lock must be
      atomic with the stateid update", NFSv4 has been inserting locks in rpciod
      worker context.  The result is that the file_lock's fl_nspid is the
      kworker's pid instead of the original userspace pid.
      
      The fl_nspid is only used to represent the namespaced virtual pid number
      when displaying locks or returning from F_GETLK.  There's no reason to set
      it for every inserted lock, since we can usually just look it up from
      fl_pid.  So, instead of looking up and holding struct pid for every lock,
      let's just look up the virtual pid number from fl_pid when it is needed.
      That means we can remove fl_nspid entirely.
      
      The translaton and presentation of fl_pid should handle the following four
      cases:
      
      1 - F_GETLK on a remote file with a remote lock:
          In this case, the filesystem should determine the l_pid to return here.
          Filesystems should indicate that the fl_pid represents a non-local pid
          value that should not be translated by returning an fl_pid <= 0.
      
      2 - F_GETLK on a local file with a remote lock:
          This should be the l_pid of the lock manager process, and translated.
      
      3 - F_GETLK on a remote file with a local lock, and
      4 - F_GETLK on a local file with a local lock:
          These should be the translated l_pid of the local locking process.
      
      Fuse was already doing the correct thing by translating the pid into the
      caller's namespace.  With this change we must update fuse to translate
      to init's pid namespace, so that the locks API can then translate from
      init's pid namespace into the pid namespace of the caller.
      
      With this change, the locks API will expect that if a filesystem returns
      a remote pid as opposed to a local pid for F_GETLK, that remote pid will
      be <= 0.  This signifies that the pid is remote, and the locks API will
      forego translating that pid into the pid namespace of the local calling
      process.
      
      Finally, we convert remote filesystems to present remote pids using
      negative numbers. Have lustre, 9p, ceph, cifs, and dlm negate the remote
      pid returned for F_GETLK lock requests.
      
      Since local pids will never be larger than PID_MAX_LIMIT (which is
      currently defined as <= 4 million), but pid_t is an unsigned int, we
      should have plenty of room to represent remote pids with negative
      numbers if we assume that remote pid numbers are similarly limited.
      
      If this is not the case, then we run the risk of having a remote pid
      returned for which there is also a corresponding local pid.  This is a
      problem we have now, but this patch should reduce the chances of that
      occurring, while also returning those remote pid numbers, for whatever
      that may be worth.
      Signed-off-by: NBenjamin Coddington <bcodding@redhat.com>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      9d5b86ac
  2. 10 7月, 2017 1 次提交
  3. 09 7月, 2017 6 次提交
  4. 06 7月, 2017 9 次提交
  5. 03 7月, 2017 1 次提交
  6. 21 6月, 2017 5 次提交
  7. 20 6月, 2017 2 次提交
    • I
      sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> · 5dd43ce2
      Ingo Molnar 提交于
      The wait_bit*() types and APIs are mixed into wait.h, but they
      are a pretty orthogonal extension of wait-queues.
      
      Furthermore, only about 50 kernel files use these APIs, while
      over 1000 use the regular wait-queue functionality.
      
      So clean up the main wait.h by moving the wait-bit functionality
      out of it, into a separate .h and .c file:
      
        include/linux/wait_bit.h  for types and APIs
        kernel/sched/wait_bit.c   for the implementation
      
      Update all header dependencies.
      
      This reduces the size of wait.h rather significantly, by about 30%.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      5dd43ce2
    • J
      cifs: use get_random_u32 for 32-bit lock random · 51b0817b
      Jason A. Donenfeld 提交于
      Using get_random_u32 here is faster, more fitting of the use case, and
      just as cryptographically secure. It also has the benefit of providing
      better randomness at early boot, which is sometimes when this is used.
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      Cc: Steve French <sfrench@samba.org>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      51b0817b
  8. 13 5月, 2017 4 次提交
  9. 10 5月, 2017 2 次提交
    • S
      Don't delay freeing mids when blocked on slow socket write of request · de1892b8
      Steve French 提交于
      When processing responses, and in particular freeing mids (DeleteMidQEntry),
      which is very important since it also frees the associated buffers (cifs_buf_release),
      we can block a long time if (writes to) socket is slow due to low memory or networking
      issues.
      
      We can block in send (smb request) waiting for memory, and be blocked in processing
      responess (which could free memory if we let it) - since they both grab the
      server->srv_mutex.
      
      In practice, in the DeleteMidQEntry case - there is no reason we need to
      grab the srv_mutex so remove these around DeleteMidQEntry, and it allows
      us to free memory faster.
      Signed-off-by: NSteve French <steve.french@primarydata.com>
      Acked-by: NPavel Shilovsky <pshilov@microsoft.com>
      de1892b8
    • R
      CIFS: silence lockdep splat in cifs_relock_file() · 560d3889
      Rabin Vincent 提交于
      cifs_relock_file() can perform a down_write() on the inode's lock_sem even
      though it was already performed in cifs_strict_readv().  Lockdep complains
      about this.  AFAICS, there is no problem here, and lockdep just needs to be
      told that this nesting is OK.
      
       =============================================
       [ INFO: possible recursive locking detected ]
       4.11.0+ #20 Not tainted
       ---------------------------------------------
       cat/701 is trying to acquire lock:
        (&cifsi->lock_sem){++++.+}, at: cifs_reopen_file+0x7a7/0xc00
      
       but task is already holding lock:
        (&cifsi->lock_sem){++++.+}, at: cifs_strict_readv+0x177/0x310
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&cifsi->lock_sem);
         lock(&cifsi->lock_sem);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by cat/701:
        #0:  (&cifsi->lock_sem){++++.+}, at: cifs_strict_readv+0x177/0x310
      
       stack backtrace:
       CPU: 0 PID: 701 Comm: cat Not tainted 4.11.0+ #20
       Call Trace:
        dump_stack+0x85/0xc2
        __lock_acquire+0x17dd/0x2260
        ? trace_hardirqs_on_thunk+0x1a/0x1c
        ? preempt_schedule_irq+0x6b/0x80
        lock_acquire+0xcc/0x260
        ? lock_acquire+0xcc/0x260
        ? cifs_reopen_file+0x7a7/0xc00
        down_read+0x2d/0x70
        ? cifs_reopen_file+0x7a7/0xc00
        cifs_reopen_file+0x7a7/0xc00
        ? printk+0x43/0x4b
        cifs_readpage_worker+0x327/0x8a0
        cifs_readpage+0x8c/0x2a0
        generic_file_read_iter+0x692/0xd00
        cifs_strict_readv+0x29f/0x310
        generic_file_splice_read+0x11c/0x1c0
        do_splice_to+0xa5/0xc0
        splice_direct_to_actor+0xfa/0x350
        ? generic_pipe_buf_nosteal+0x10/0x10
        do_splice_direct+0xb5/0xe0
        do_sendfile+0x278/0x3a0
        SyS_sendfile64+0xc4/0xe0
        entry_SYSCALL_64_fastpath+0x1f/0xbe
      Signed-off-by: NRabin Vincent <rabinv@axis.com>
      Acked-by: NPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: NSteve French <smfrench@gmail.com>
      560d3889
  10. 09 5月, 2017 1 次提交
  11. 05 5月, 2017 1 次提交
  12. 04 5月, 2017 3 次提交
  13. 03 5月, 2017 4 次提交
    • R
      CIFS: fix oplock break deadlocks · 3998e6b8
      Rabin Vincent 提交于
      When the final cifsFileInfo_put() is called from cifsiod and an oplock
      break work is queued, lockdep complains loudly:
      
       =============================================
       [ INFO: possible recursive locking detected ]
       4.11.0+ #21 Not tainted
       ---------------------------------------------
       kworker/0:2/78 is trying to acquire lock:
        ("cifsiod"){++++.+}, at: flush_work+0x215/0x350
      
       but task is already holding lock:
        ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock("cifsiod");
         lock("cifsiod");
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       2 locks held by kworker/0:2/78:
        #0:  ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0
        #1:  ((&wdata->work)){+.+...}, at: process_one_work+0x255/0x8e0
      
       stack backtrace:
       CPU: 0 PID: 78 Comm: kworker/0:2 Not tainted 4.11.0+ #21
       Workqueue: cifsiod cifs_writev_complete
       Call Trace:
        dump_stack+0x85/0xc2
        __lock_acquire+0x17dd/0x2260
        ? match_held_lock+0x20/0x2b0
        ? trace_hardirqs_off_caller+0x86/0x130
        ? mark_lock+0xa6/0x920
        lock_acquire+0xcc/0x260
        ? lock_acquire+0xcc/0x260
        ? flush_work+0x215/0x350
        flush_work+0x236/0x350
        ? flush_work+0x215/0x350
        ? destroy_worker+0x170/0x170
        __cancel_work_timer+0x17d/0x210
        ? ___preempt_schedule+0x16/0x18
        cancel_work_sync+0x10/0x20
        cifsFileInfo_put+0x338/0x7f0
        cifs_writedata_release+0x2a/0x40
        ? cifs_writedata_release+0x2a/0x40
        cifs_writev_complete+0x29d/0x850
        ? preempt_count_sub+0x18/0xd0
        process_one_work+0x304/0x8e0
        worker_thread+0x9b/0x6a0
        kthread+0x1b2/0x200
        ? process_one_work+0x8e0/0x8e0
        ? kthread_create_on_node+0x40/0x40
        ret_from_fork+0x31/0x40
      
      This is a real warning.  Since the oplock is queued on the same
      workqueue this can deadlock if there is only one worker thread active
      for the workqueue (which will be the case during memory pressure when
      the rescuer thread is handling it).
      
      Furthermore, there is at least one other kind of hang possible due to
      the oplock break handling if there is only worker.  (This can be
      reproduced without introducing memory pressure by having passing 1 for
      the max_active parameter of cifsiod.) cifs_oplock_break() can wait
      indefintely in the filemap_fdatawait() while the cifs_writev_complete()
      work is blocked:
      
       sysrq: SysRq : Show Blocked State
         task                        PC stack   pid father
       kworker/0:1     D    0    16      2 0x00000000
       Workqueue: cifsiod cifs_oplock_break
       Call Trace:
        __schedule+0x562/0xf40
        ? mark_held_locks+0x4a/0xb0
        schedule+0x57/0xe0
        io_schedule+0x21/0x50
        wait_on_page_bit+0x143/0x190
        ? add_to_page_cache_lru+0x150/0x150
        __filemap_fdatawait_range+0x134/0x190
        ? do_writepages+0x51/0x70
        filemap_fdatawait_range+0x14/0x30
        filemap_fdatawait+0x3b/0x40
        cifs_oplock_break+0x651/0x710
        ? preempt_count_sub+0x18/0xd0
        process_one_work+0x304/0x8e0
        worker_thread+0x9b/0x6a0
        kthread+0x1b2/0x200
        ? process_one_work+0x8e0/0x8e0
        ? kthread_create_on_node+0x40/0x40
        ret_from_fork+0x31/0x40
       dd              D    0   683    171 0x00000000
       Call Trace:
        __schedule+0x562/0xf40
        ? mark_held_locks+0x29/0xb0
        schedule+0x57/0xe0
        io_schedule+0x21/0x50
        wait_on_page_bit+0x143/0x190
        ? add_to_page_cache_lru+0x150/0x150
        __filemap_fdatawait_range+0x134/0x190
        ? do_writepages+0x51/0x70
        filemap_fdatawait_range+0x14/0x30
        filemap_fdatawait+0x3b/0x40
        filemap_write_and_wait+0x4e/0x70
        cifs_flush+0x6a/0xb0
        filp_close+0x52/0xa0
        __close_fd+0xdc/0x150
        SyS_close+0x33/0x60
        entry_SYSCALL_64_fastpath+0x1f/0xbe
      
       Showing all locks held in the system:
       2 locks held by kworker/0:1/16:
        #0:  ("cifsiod"){.+.+.+}, at: process_one_work+0x255/0x8e0
        #1:  ((&cfile->oplock_break)){+.+.+.}, at: process_one_work+0x255/0x8e0
      
       Showing busy workqueues and worker pools:
       workqueue cifsiod: flags=0xc
         pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
           in-flight: 16:cifs_oplock_break
           delayed: cifs_writev_complete, cifs_echo_request
       pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 750 3
      
      Fix these problems by creating a a new workqueue (with a rescuer) for
      the oplock break work.
      Signed-off-by: NRabin Vincent <rabinv@axis.com>
      Signed-off-by: NSteve French <smfrench@gmail.com>
      CC: Stable <stable@vger.kernel.org>
      3998e6b8
    • D
      cifs: fix CIFS_ENUMERATE_SNAPSHOTS oops · 6026685d
      David Disseldorp 提交于
      As with 61876395, an open directory may have a NULL private_data
      pointer prior to readdir. CIFS_ENUMERATE_SNAPSHOTS must check for this
      before dereference.
      
      Fixes: 834170c8 ("Enable previous version support")
      Signed-off-by: NDavid Disseldorp <ddiss@suse.de>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: NSteve French <smfrench@gmail.com>
      6026685d
    • D
      cifs: fix leak in FSCTL_ENUM_SNAPS response handling · 0e5c7955
      David Disseldorp 提交于
      The server may respond with success, and an output buffer less than
      sizeof(struct smb_snapshot_array) in length. Do not leak the output
      buffer in this case.
      
      Fixes: 834170c8 ("Enable previous version support")
      Signed-off-by: NDavid Disseldorp <ddiss@suse.de>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: NSteve French <smfrench@gmail.com>
      0e5c7955
    • S
      Set unicode flag on cifs echo request to avoid Mac error · 26c9cb66
      Steve French 提交于
      Mac requires the unicode flag to be set for cifs, even for the smb
      echo request (which doesn't have strings).
      
      Without this Mac rejects the periodic echo requests (when mounting
      with cifs) that we use to check if server is down
      Signed-off-by: NSteve French <smfrench@gmail.com>
      CC: Stable <stable@vger.kernel.org>
      26c9cb66