1. 08 7月, 2020 1 次提交
  2. 24 6月, 2020 1 次提交
    • Z
      cifs: Fix double add page to memcg when cifs_readpages · 95a3d8f3
      Zhang Xiaoxu 提交于
      When xfstests generic/451, there is an BUG at mm/memcontrol.c:
        page:ffffea000560f2c0 refcount:2 mapcount:0 mapping:000000008544e0ea
             index:0xf
        mapping->aops:cifs_addr_ops dentry name:"tst-aio-dio-cycle-write.451"
        flags: 0x2fffff80000001(locked)
        raw: 002fffff80000001 ffffc90002023c50 ffffea0005280088 ffff88815cda0210
        raw: 000000000000000f 0000000000000000 00000002ffffffff ffff88817287d000
        page dumped because: VM_BUG_ON_PAGE(page->mem_cgroup)
        page->mem_cgroup:ffff88817287d000
        ------------[ cut here ]------------
        kernel BUG at mm/memcontrol.c:2659!
        invalid opcode: 0000 [#1] SMP
        CPU: 2 PID: 2038 Comm: xfs_io Not tainted 5.8.0-rc1 #44
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_
          073836-buildvm-ppc64le-16.ppc.4
        RIP: 0010:commit_charge+0x35/0x50
        Code: 0d 48 83 05 54 b2 02 05 01 48 89 77 38 c3 48 c7
              c6 78 4a ea ba 48 83 05 38 b2 02 05 01 e8 63 0d9
        RSP: 0018:ffffc90002023a50 EFLAGS: 00010202
        RAX: 0000000000000000 RBX: ffff88817287d000 RCX: 0000000000000000
        RDX: 0000000000000000 RSI: ffff88817ac97ea0 RDI: ffff88817ac97ea0
        RBP: ffffea000560f2c0 R08: 0000000000000203 R09: 0000000000000005
        R10: 0000000000000030 R11: ffffc900020237a8 R12: 0000000000000000
        R13: 0000000000000001 R14: 0000000000000001 R15: ffff88815a1272c0
        FS:  00007f5071ab0800(0000) GS:ffff88817ac80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 000055efcd5ca000 CR3: 000000015d312000 CR4: 00000000000006e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         mem_cgroup_charge+0x166/0x4f0
         __add_to_page_cache_locked+0x4a9/0x710
         add_to_page_cache_locked+0x15/0x20
         cifs_readpages+0x217/0x1270
         read_pages+0x29a/0x670
         page_cache_readahead_unbounded+0x24f/0x390
         __do_page_cache_readahead+0x3f/0x60
         ondemand_readahead+0x1f1/0x470
         page_cache_async_readahead+0x14c/0x170
         generic_file_buffered_read+0x5df/0x1100
         generic_file_read_iter+0x10c/0x1d0
         cifs_strict_readv+0x139/0x170
         new_sync_read+0x164/0x250
         __vfs_read+0x39/0x60
         vfs_read+0xb5/0x1e0
         ksys_pread64+0x85/0xf0
         __x64_sys_pread64+0x22/0x30
         do_syscall_64+0x69/0x150
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7f5071fcb1af
        Code: Bad RIP value.
        RSP: 002b:00007ffde2cdb8e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
        RAX: ffffffffffffffda RBX: 00007ffde2cdb990 RCX: 00007f5071fcb1af
        RDX: 0000000000001000 RSI: 000055efcd5ca000 RDI: 0000000000000003
        RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000001000 R11: 0000000000000293 R12: 0000000000000001
        R13: 000000000009f000 R14: 0000000000000000 R15: 0000000000001000
        Modules linked in:
        ---[ end trace 725fa14a3e1af65c ]---
      
      Since commit 3fea5a49 ("mm: memcontrol: convert page cache to a new
      mem_cgroup_charge() API") not cancel the page charge, the pages maybe
      double add to pagecache:
      thread1                       | thread2
      cifs_readpages
      readpages_get_pages
       add_to_page_cache_locked(head,index=n)=0
                                    | readpages_get_pages
                                    | add_to_page_cache_locked(head,index=n+1)=0
       add_to_page_cache_locked(head, index=n+1)=-EEXIST
       then, will next loop with list head page's
       index=n+1 and the page->mapping not NULL
      readpages_get_pages
      add_to_page_cache_locked(head, index=n+1)
       commit_charge
        VM_BUG_ON_PAGE
      
      So, we should not do the next loop when any page add to page cache
      failed.
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Signed-off-by: NZhang Xiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      Acked-by: NRonnie Sahlberg <lsahlber@redhat.com>
      95a3d8f3
  3. 12 6月, 2020 1 次提交
  4. 05 6月, 2020 2 次提交
    • A
      cifs: multichannel: move channel selection above transport layer · 352d96f3
      Aurelien Aptel 提交于
      Move the channel (TCP_Server_Info*) selection from the tranport
      layer to higher in the call stack so that:
      
      - credit handling is done with the server that will actually be used
        to send.
        * ->wait_mtu_credit
        * ->set_credits / set_credits
        * ->add_credits / add_credits
        * add_credits_and_wake_if
      
      - potential reconnection (smb2_reconnect) done when initializing a
        request is checked and done with the server that will actually be
        used to send.
      
      To do this:
      
      - remove the cifs_pick_channel() call out of compound_send_recv()
      
      - select channel and pass it down by adding a cifs_pick_channel(ses)
        call in:
        - smb311_posix_mkdir
        - SMB2_open
        - SMB2_ioctl
        - __SMB2_close
        - query_info
        - SMB2_change_notify
        - SMB2_flush
        - smb2_async_readv  (if none provided in context param)
        - SMB2_read         (if none provided in context param)
        - smb2_async_writev (if none provided in context param)
        - SMB2_write        (if none provided in context param)
        - SMB2_query_directory
        - send_set_info
        - SMB2_oplock_break
        - SMB311_posix_qfs_info
        - SMB2_QFS_info
        - SMB2_QFS_attr
        - smb2_lockv
        - SMB2_lease_break
          - smb2_compound_op
        - smb2_set_ea
        - smb2_ioctl_query_info
        - smb2_query_dir_first
        - smb2_query_info_comound
        - smb2_query_symlink
        - cifs_writepages
        - cifs_write_from_iter
        - cifs_send_async_read
        - cifs_read
        - cifs_readpages
      
      - add TCP_Server_Info *server param argument to:
        - cifs_send_recv
        - compound_send_recv
        - SMB2_open_init
        - SMB2_query_info_init
        - SMB2_set_info_init
        - SMB2_close_init
        - SMB2_ioctl_init
        - smb2_iotcl_req_init
        - SMB2_query_directory_init
        - SMB2_notify_init
        - SMB2_flush_init
        - build_qfs_info_req
        - smb2_hdr_assemble
        - smb2_reconnect
        - fill_small_buf
        - smb2_plain_req_init
        - __smb2_plain_req_init
      
      The read/write codepath is different than the rest as it is using
      pages, io iterators and async calls. To deal with those we add a
      server pointer in the cifs_writedata/cifs_readdata/cifs_io_parms
      context struct and set it in:
      
      - cifs_writepages      (wdata)
      - cifs_write_from_iter (wdata)
      - cifs_readpages       (rdata)
      - cifs_send_async_read (rdata)
      
      The [rw]data->server pointer is eventually copied to
      cifs_io_parms->server to pass it down to SMB2_read/SMB2_write.
      If SMB2_read/SMB2_write is called from a different place that doesn't
      set the server field it will pick a channel.
      
      Some places do not pick a channel and just use ses->server or
      cifs_ses_server(ses). All cifs_ses_server(ses) calls are in codepaths
      involving negprot/sess.setup.
      
      - SMB2_negotiate         (binding channel)
      - SMB2_sess_alloc_buffer (binding channel)
      - SMB2_echo              (uses provided one)
      - SMB2_logoff            (uses master)
      - SMB2_tdis              (uses master)
      
      (list not exhaustive)
      Signed-off-by: NAurelien Aptel <aaptel@suse.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      352d96f3
    • A
      cifs: multichannel: always zero struct cifs_io_parms · 7c06514a
      Aurelien Aptel 提交于
      SMB2_read/SMB2_write check and use cifs_io_parms->server, which might
      be uninitialized memory.
      
      This change makes all callers zero-initialize the struct.
      Signed-off-by: NAurelien Aptel <aaptel@suse.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      7c06514a
  5. 04 6月, 2020 1 次提交
  6. 01 6月, 2020 1 次提交
    • J
      cifs: Standardize logging output · a0a3036b
      Joe Perches 提交于
      Use pr_fmt to standardize all logging for fs/cifs.
      
      Some logging output had no CIFS: specific prefix.
      
      Now all output has one of three prefixes:
      
      o CIFS:
      o CIFS: VFS:
      o Root-CIFS:
      
      Miscellanea:
      
      o Convert printks to pr_<level>
      o Neaten macro definitions
      o Remove embedded CIFS: prefixes from formats
      o Convert "illegal" to "invalid"
      o Coalesce formats
      o Add missing '\n' format terminations
      o Consolidate multiple cifs_dbg continuations into single calls
      o More consistent use of upper case first word output logging
      o Multiline statement argument alignment and wrapping
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      a0a3036b
  7. 14 5月, 2020 1 次提交
  8. 11 4月, 2020 1 次提交
    • S
      smb3: enable swap on SMB3 mounts · 4e8aea30
      Steve French 提交于
      Add experimental support for allowing a swap file to be on an SMB3
      mount.  There are use cases where swapping over a secure network
      filesystem is preferable. In some cases there are no local
      block devices large enough, and network block devices can be
      hard to setup and secure.  And in some cases there are no
      local block devices at all (e.g. with the recent addition of
      remote boot over SMB3 mounts).
      
      There are various enhancements that can be added later e.g.:
      - doing a mandatory byte range lock over the swapfile (until
      the Linux VFS is modified to notify the file system that an open
      is for a swapfile, when the file can be opened "DENY_ALL" to prevent
      others from opening it).
      - pinning more buffers in the underlying transport to minimize memory
      allocations in the TCP stack under the fs
      - documenting how to create ACLs (on the server) to secure the
      swapfile (or adding additional tools to cifs-utils to make it easier)
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      Acked-by: NPavel Shilovsky <pshilov@microsoft.com>
      Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com>
      4e8aea30
  9. 23 3月, 2020 1 次提交
    • Y
      CIFS: Fix bug which the return value by asynchronous read is error · 97adda8b
      Yilu Lin 提交于
      This patch is used to fix the bug in collect_uncached_read_data()
      that rc is automatically converted from a signed number to an
      unsigned number when the CIFS asynchronous read fails.
      It will cause ctx->rc is error.
      
      Example:
      Share a directory and create a file on the Windows OS.
      Mount the directory to the Linux OS using CIFS.
      On the CIFS client of the Linux OS, invoke the pread interface to
      deliver the read request.
      
      The size of the read length plus offset of the read request is greater
      than the maximum file size.
      
      In this case, the CIFS server on the Windows OS returns a failure
      message (for example, the return value of
      smb2.nt_status is STATUS_INVALID_PARAMETER).
      
      After receiving the response message, the CIFS client parses
      smb2.nt_status to STATUS_INVALID_PARAMETER
      and converts it to the Linux error code (rdata->result=-22).
      
      Then the CIFS client invokes the collect_uncached_read_data function to
      assign the value of rdata->result to rc, that is, rc=rdata->result=-22.
      
      The type of the ctx->total_len variable is unsigned integer,
      the type of the rc variable is integer, and the type of
      the ctx->rc variable is ssize_t.
      
      Therefore, during the ternary operation, the value of rc is
      automatically converted to an unsigned number. The final result is
      ctx->rc=4294967274. However, the expected result is ctx->rc=-22.
      Signed-off-by: NYilu Lin <linyilu@huawei.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Acked-by: NRonnie Sahlberg <lsahlber@redhat.com>
      97adda8b
  10. 19 3月, 2020 1 次提交
  11. 25 2月, 2020 1 次提交
    • A
      cifs: fix rename() by ensuring source handle opened with DELETE bit · 86f740f2
      Aurelien Aptel 提交于
      To rename a file in SMB2 we open it with the DELETE access and do a
      special SetInfo on it. If the handle is missing the DELETE bit the
      server will fail the SetInfo with STATUS_ACCESS_DENIED.
      
      We currently try to reuse any existing opened handle we have with
      cifs_get_writable_path(). That function looks for handles with WRITE
      access but doesn't check for DELETE, making rename() fail if it finds
      a handle to reuse. Simple reproducer below.
      
      To select handles with the DELETE bit, this patch adds a flag argument
      to cifs_get_writable_path() and find_writable_file() and the existing
      'bool fsuid_only' argument is converted to a flag.
      
      The cifsFileInfo struct only stores the UNIX open mode but not the
      original SMB access flags. Since the DELETE bit is not mapped in that
      mode, this patch stores the access mask in cifs_fid on file open,
      which is accessible from cifsFileInfo.
      
      Simple reproducer:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <sys/types.h>
      	#include <sys/stat.h>
      	#include <fcntl.h>
      	#include <unistd.h>
      	#define E(s) perror(s), exit(1)
      
      	int main(int argc, char *argv[])
      	{
      		int fd, ret;
      		if (argc != 3) {
      			fprintf(stderr, "Usage: %s A B\n"
      			"create&open A in write mode, "
      			"rename A to B, close A\n", argv[0]);
      			return 0;
      		}
      
      		fd = openat(AT_FDCWD, argv[1], O_WRONLY|O_CREAT|O_SYNC, 0666);
      		if (fd == -1) E("openat()");
      
      		ret = rename(argv[1], argv[2]);
      		if (ret) E("rename()");
      
      		ret = close(fd);
      		if (ret) E("close()");
      
      		return ret;
      	}
      
      $ gcc -o bugrename bugrename.c
      $ ./bugrename /mnt/a /mnt/b
      rename(): Permission denied
      
      Fixes: 8de9e86c ("cifs: create a helper to find a writeable handle by path name")
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: NAurelien Aptel <aaptel@suse.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
      Reviewed-by: NPaulo Alcantara (SUSE) <pc@cjr.nz>
      86f740f2
  12. 07 2月, 2020 1 次提交
  13. 06 2月, 2020 1 次提交
  14. 04 2月, 2020 1 次提交
  15. 27 1月, 2020 1 次提交
  16. 04 12月, 2019 1 次提交
    • S
      smb3: query attributes on file close · 43f8a6a7
      Steve French 提交于
      Since timestamps on files on most servers can be updated at
      close, and since timestamps on our dentries default to one
      second we can have stale timestamps in some common cases
      (e.g. open, write, close, stat, wait one second, stat - will
      show different mtime for the first and second stat).
      
      The SMB2/SMB3 protocol allows querying timestamps at close
      so add the code to request timestamp and attr information
      (which is cheap for the server to provide) to be returned
      when a file is closed (it is not needed for the many
      paths that call SMB2_close that are from compounded
      query infos and close nor is it needed for some of
      the cases where a directory close immediately follows a
      directory open.
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      Acked-by: NRonnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: NAurelien Aptel <aaptel@suse.com>
      Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
      43f8a6a7
  17. 03 12月, 2019 1 次提交
    • P
      CIFS: Fix NULL-pointer dereference in smb2_push_mandatory_locks · 6f582b27
      Pavel Shilovsky 提交于
      Currently when the client creates a cifsFileInfo structure for
      a newly opened file, it allocates a list of byte-range locks
      with a pointer to the new cfile and attaches this list to the
      inode's lock list. The latter happens before initializing all
      other fields, e.g. cfile->tlink. Thus a partially initialized
      cifsFileInfo structure becomes available to other threads that
      walk through the inode's lock list. One example of such a thread
      may be an oplock break worker thread that tries to push all
      cached byte-range locks. This causes NULL-pointer dereference
      in smb2_push_mandatory_locks() when accessing cfile->tlink:
      
      [598428.945633] BUG: kernel NULL pointer dereference, address: 0000000000000038
      ...
      [598428.945749] Workqueue: cifsoplockd cifs_oplock_break [cifs]
      [598428.945793] RIP: 0010:smb2_push_mandatory_locks+0xd6/0x5a0 [cifs]
      ...
      [598428.945834] Call Trace:
      [598428.945870]  ? cifs_revalidate_mapping+0x45/0x90 [cifs]
      [598428.945901]  cifs_oplock_break+0x13d/0x450 [cifs]
      [598428.945909]  process_one_work+0x1db/0x380
      [598428.945914]  worker_thread+0x4d/0x400
      [598428.945921]  kthread+0x104/0x140
      [598428.945925]  ? process_one_work+0x380/0x380
      [598428.945931]  ? kthread_park+0x80/0x80
      [598428.945937]  ret_from_fork+0x35/0x40
      
      Fix this by reordering initialization steps of the cifsFileInfo
      structure: initialize all the fields first and then add the new
      byte-range lock list to the inode's lock list.
      
      Cc: Stable <stable@vger.kernel.org>
      Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
      Reviewed-by: NAurelien Aptel <aaptel@suse.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      6f582b27
  18. 25 11月, 2019 6 次提交
    • P
      CIFS: Properly process SMB3 lease breaks · 9bd45408
      Pavel Shilovsky 提交于
      Currenly we doesn't assume that a server may break a lease
      from RWH to RW which causes us setting a wrong lease state
      on a file and thus mistakenly flushing data and byte-range
      locks and purging cached data on the client. This leads to
      performance degradation because subsequent IOs go directly
      to the server.
      
      Fix this by propagating new lease state and epoch values
      to the oplock break handler through cifsFileInfo structure
      and removing the use of cifsInodeInfo flags for that. It
      allows to avoid some races of several lease/oplock breaks
      using those flags in parallel.
      Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      9bd45408
    • R
      cifs: move cifsFileInfo_put logic into a work-queue · 32546a95
      Ronnie Sahlberg 提交于
      This patch moves the final part of the cifsFileInfo_put() logic where we
      need a write lock on lock_sem to be processed in a separate thread that
      holds no other locks.
      This is to prevent deadlocks like the one below:
      
      > there are 6 processes looping to while trying to down_write
      > cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from
      > cifs_new_fileinfo
      >
      > and there are 5 other processes which are blocked, several of them
      > waiting on either PG_writeback or PG_locked (which are both set), all
      > for the same page of the file
      >
      > 2 inode_lock() (inode->i_rwsem) for the file
      > 1 wait_on_page_writeback() for the page
      > 1 down_read(inode->i_rwsem) for the inode of the directory
      > 1 inode_lock()(inode->i_rwsem) for the inode of the directory
      > 1 __lock_page
      >
      >
      > so processes are blocked waiting on:
      >   page flags PG_locked and PG_writeback for one specific page
      >   inode->i_rwsem for the directory
      >   inode->i_rwsem for the file
      >   cifsInodeInflock_sem
      >
      >
      >
      > here are the more gory details (let me know if I need to provide
      > anything more/better):
      >
      > [0 00:48:22.765] [UN]  PID: 8863   TASK: ffff8c691547c5c0  CPU: 3
      > COMMAND: "reopen_file"
      >  #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095
      >  #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df
      >  #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7
      >  #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d
      >  #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d
      >  #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33
      >  #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6
      >  #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315
      > * (I think legitimize_path is bogus)
      >
      > in path_openat
      >         } else {
      >                 const char *s = path_init(nd, flags);
      >                 while (!(error = link_path_walk(s, nd)) &&
      >                         (error = do_last(nd, file, op)) > 0) {  <<<<
      >
      > do_last:
      >         if (open_flag & O_CREAT)
      >                 inode_lock(dir->d_inode);  <<<<
      >         else
      > so it's trying to take inode->i_rwsem for the directory
      >
      >      DENTRY           INODE           SUPERBLK     TYPE PATH
      > ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR  /mnt/vm1_smb/
      > inode.i_rwsem is ffff8c691158efc0
      >
      > <struct rw_semaphore 0xffff8c691158efc0>:
      >         owner: <struct task_struct 0xffff8c6914275d00> (UN -   8856 -
      > reopen_file), counter: 0x0000000000000003
      >         waitlist: 2
      >         0xffff9965007e3c90     8863   reopen_file      UN 0  1:29:22.926
      >   RWSEM_WAITING_FOR_WRITE
      >         0xffff996500393e00     9802   ls               UN 0  1:17:26.700
      >   RWSEM_WAITING_FOR_READ
      >
      >
      > the owner of the inode.i_rwsem of the directory is:
      >
      > [0 00:00:00.109] [UN]  PID: 8856   TASK: ffff8c6914275d00  CPU: 3
      > COMMAND: "reopen_file"
      >  #0 [ffff99650065b828] __schedule at ffffffff9b6e6095
      >  #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df
      >  #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89
      >  #3 [ffff99650065b940] msleep at ffffffff9af573a9
      >  #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
      >  #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs]
      >  #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs]
      >  #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd
      >  #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs]
      >  #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs]
      > #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs]
      > #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7
      > #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33
      > #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6
      > #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315
      >
      > cifs_launder_page is for page 0xffffd1e2c07d2480
      >
      > crash> page.index,mapping,flags 0xffffd1e2c07d2480
      >       index = 0x8
      >       mapping = 0xffff8c68f3cd0db0
      >   flags = 0xfffffc0008095
      >
      >   PAGE-FLAG       BIT  VALUE
      >   PG_locked         0  0000001
      >   PG_uptodate       2  0000004
      >   PG_lru            4  0000010
      >   PG_waiters        7  0000080
      >   PG_writeback     15  0008000
      >
      >
      > inode is ffff8c68f3cd0c40
      > inode.i_rwsem is ffff8c68f3cd0ce0
      >      DENTRY           INODE           SUPERBLK     TYPE PATH
      > ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG
      > /mnt/vm1_smb/testfile.8853
      >
      >
      > this process holds the inode->i_rwsem for the parent directory, is
      > laundering a page attached to the inode of the file it's opening, and in
      > _cifsFileInfo_put is trying to down_write the cifsInodeInflock_sem
      > for the file itself.
      >
      >
      > <struct rw_semaphore 0xffff8c68f3cd0ce0>:
      >         owner: <struct task_struct 0xffff8c6914272e80> (UN -   8854 -
      > reopen_file), counter: 0x0000000000000003
      >         waitlist: 1
      >         0xffff9965005dfd80     8855   reopen_file      UN 0  1:29:22.912
      >   RWSEM_WAITING_FOR_WRITE
      >
      > this is the inode.i_rwsem for the file
      >
      > the owner:
      >
      > [0 00:48:22.739] [UN]  PID: 8854   TASK: ffff8c6914272e80  CPU: 2
      > COMMAND: "reopen_file"
      >  #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095
      >  #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df
      >  #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2
      >  #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f
      >  #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf
      >  #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c
      >  #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs]
      >  #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4
      >  #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a
      >  #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs]
      > #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd
      > #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35
      > #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9
      > #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315
      >
      > the process holds the inode->i_rwsem for the file to which it's writing,
      > and is trying to __lock_page for the same page as in the other processes
      >
      >
      > the other tasks:
      > [0 00:00:00.028] [UN]  PID: 8859   TASK: ffff8c6915479740  CPU: 2
      > COMMAND: "reopen_file"
      >  #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095
      >  #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df
      >  #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89
      >  #3 [ffff9965007b3af0] msleep at ffffffff9af573a9
      >  #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs]
      >  #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs]
      >  #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a
      >  #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f
      >  #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33
      >  #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6
      > #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315
      >
      > this is opening the file, and is trying to down_write cinode->lock_sem
      >
      >
      > [0 00:00:00.041] [UN]  PID: 8860   TASK: ffff8c691547ae80  CPU: 2
      > COMMAND: "reopen_file"
      > [0 00:00:00.057] [UN]  PID: 8861   TASK: ffff8c6915478000  CPU: 3
      > COMMAND: "reopen_file"
      > [0 00:00:00.059] [UN]  PID: 8858   TASK: ffff8c6914271740  CPU: 2
      > COMMAND: "reopen_file"
      > [0 00:00:00.109] [UN]  PID: 8862   TASK: ffff8c691547dd00  CPU: 6
      > COMMAND: "reopen_file"
      >  #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095
      >  #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df
      >  #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89
      >  #3 [ffff9965007c3d90] msleep at ffffffff9af573a9
      >  #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
      >  #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs]
      >  #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e
      >  #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614
      >  #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f
      >  #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c
      >
      > closing the file, and trying to down_write cifsi->lock_sem
      >
      >
      > [0 00:48:22.839] [UN]  PID: 8857   TASK: ffff8c6914270000  CPU: 7
      > COMMAND: "reopen_file"
      >  #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095
      >  #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df
      >  #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2
      >  #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6
      >  #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028
      >  #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165
      >  #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs]
      >  #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1
      >  #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e
      >  #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315
      >
      > in __filemap_fdatawait_range
      >                         wait_on_page_writeback(page);
      > for the same page of the file
      >
      >
      >
      > [0 00:48:22.718] [UN]  PID: 8855   TASK: ffff8c69142745c0  CPU: 7
      > COMMAND: "reopen_file"
      >  #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095
      >  #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df
      >  #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7
      >  #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs]
      >  #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd
      >  #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35
      >  #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9
      >  #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315
      >
      >         inode_lock(inode);
      >
      >
      > and one 'ls' later on, to see whether the rest of the mount is available
      > (the test file is in the root, so we get blocked up on the directory
      > ->i_rwsem), so the entire mount is unavailable
      >
      > [0 00:36:26.473] [UN]  PID: 9802   TASK: ffff8c691436ae80  CPU: 4
      > COMMAND: "ls"
      >  #0 [ffff996500393d28] __schedule at ffffffff9b6e6095
      >  #1 [ffff996500393db8] schedule at ffffffff9b6e64df
      >  #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421
      >  #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2
      >  #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56
      >  #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c
      >  #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6
      >  #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315
      >
      > in iterate_dir:
      >         if (shared)
      >                 res = down_read_killable(&inode->i_rwsem);  <<<<
      >         else
      >                 res = down_write_killable(&inode->i_rwsem);
      >
      Reported-by: NFrank Sorenson <sorenson@redhat.com>
      Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: NRonnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      32546a95
    • P
      CIFS: Respect O_SYNC and O_DIRECT flags during reconnect · 44805b0e
      Pavel Shilovsky 提交于
      Currently the client translates O_SYNC and O_DIRECT flags
      into corresponding SMB create options when openning a file.
      The problem is that on reconnect when the file is being
      re-opened the client doesn't set those flags and it causes
      a server to reject re-open requests because create options
      don't match. The latter means that any subsequent system
      call against that open file fail until a share is re-mounted.
      
      Fix this by properly setting SMB create options when
      re-openning files after reconnects.
      
      Fixes: 1013e760: ("SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags")
      Cc: Stable <stable@vger.kernel.org>
      Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      44805b0e
    • L
      cifs: smbd: Invalidate and deregister memory registration on re-send for direct I/O · b7a55bbd
      Long Li 提交于
      On re-send, there might be a reconnect and all prevoius memory registrations
      need to be invalidated and deregistered.
      Signed-off-by: NLong Li <longli@microsoft.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      b7a55bbd
    • Y
      CIFS: remove set but not used variables 'cinode' and 'netfid' · f28a2e5e
      YueHaibing 提交于
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      fs/cifs/file.c: In function 'cifs_flock':
      fs/cifs/file.c:1704:8: warning:
       variable 'netfid' set but not used [-Wunused-but-set-variable]
      
      fs/cifs/file.c:1702:24: warning:
       variable 'cinode' set but not used [-Wunused-but-set-variable]
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      f28a2e5e
    • S
      cifs: add support for flock · d0677992
      Steve French 提交于
      The flock system call locks the whole file rather than a byte
      range and so is currently emulated by various other file systems
      by simply sending a byte range lock for the whole file.
      Add flock handling for cifs.ko in similar way.
      
      xfstest generic/504 passes with this as well
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
      Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com>
      d0677992
  19. 25 10月, 2019 2 次提交
    • D
      cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs · d46b0da7
      Dave Wysochanski 提交于
      There's a deadlock that is possible and can easily be seen with
      a test where multiple readers open/read/close of the same file
      and a disruption occurs causing reconnect.  The deadlock is due
      a reader thread inside cifs_strict_readv calling down_read and
      obtaining lock_sem, and then after reconnect inside
      cifs_reopen_file calling down_read a second time.  If in
      between the two down_read calls, a down_write comes from
      another process, deadlock occurs.
      
              CPU0                    CPU1
              ----                    ----
      cifs_strict_readv()
       down_read(&cifsi->lock_sem);
                                     _cifsFileInfo_put
                                        OR
                                     cifs_new_fileinfo
                                      down_write(&cifsi->lock_sem);
      cifs_reopen_file()
       down_read(&cifsi->lock_sem);
      
      Fix the above by changing all down_write(lock_sem) calls to
      down_write_trylock(lock_sem)/msleep() loop, which in turn
      makes the second down_read call benign since it will never
      block behind the writer while holding lock_sem.
      Signed-off-by: NDave Wysochanski <dwysocha@redhat.com>
      Suggested-by: NRonnie Sahlberg <lsahlber@redhat.com>
      Reviewed--by: NRonnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
      d46b0da7
    • P
      CIFS: Fix use after free of file info structures · 1a67c415
      Pavel Shilovsky 提交于
      Currently the code assumes that if a file info entry belongs
      to lists of open file handles of an inode and a tcon then
      it has non-zero reference. The recent changes broke that
      assumption when putting the last reference of the file info.
      There may be a situation when a file is being deleted but
      nothing prevents another thread to reference it again
      and start using it. This happens because we do not hold
      the inode list lock while checking the number of references
      of the file info structure. Fix this by doing the proper
      locking when doing the check.
      
      Fixes: 487317c9 ("cifs: add spinlock for the openFileList to cifsInodeInfo")
      Fixes: cb248819 ("cifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic")
      Cc: Stable <stable@vger.kernel.org>
      Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      1a67c415
  20. 07 10月, 2019 2 次提交
    • P
      CIFS: Gracefully handle QueryInfo errors during open · 30573a82
      Pavel Shilovsky 提交于
      Currently if the client identifies problems when processing
      metadata returned in CREATE response, the open handle is being
      leaked. This causes multiple problems like a file missing a lease
      break by that client which causes high latencies to other clients
      accessing the file. Another side-effect of this is that the file
      can't be deleted.
      
      Fix this by closing the file after the client hits an error after
      the file was opened and the open descriptor wasn't returned to
      the user space. Also convert -ESTALE to -EOPENSTALE to allow
      the VFS to revalidate a dentry and retry the open.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      30573a82
    • D
      cifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic · cb248819
      Dave Wysochanski 提交于
      Commit 487317c9 ("cifs: add spinlock for the openFileList to
      cifsInodeInfo") added cifsInodeInfo->open_file_lock spin_lock to protect
      the openFileList, but missed a few places where cifs_inode->openFileList
      was enumerated.  Change these remaining tcon->open_file_lock to
      cifsInodeInfo->open_file_lock to avoid panic in is_size_safe_to_change.
      
      [17313.245641] RIP: 0010:is_size_safe_to_change+0x57/0xb0 [cifs]
      [17313.245645] Code: 68 40 48 89 ef e8 19 67 b7 f1 48 8b 43 40 48 8d 4b 40 48 8d 50 f0 48 39 c1 75 0f eb 47 48 8b 42 10 48 8d 50 f0 48 39 c1 74 3a <8b> 80 88 00 00 00 83 c0 01 a8 02 74 e6 48 89 ef c6 07 00 0f 1f 40
      [17313.245649] RSP: 0018:ffff94ae1baefa30 EFLAGS: 00010202
      [17313.245654] RAX: dead000000000100 RBX: ffff88dc72243300 RCX: ffff88dc72243340
      [17313.245657] RDX: dead0000000000f0 RSI: 00000000098f7940 RDI: ffff88dd3102f040
      [17313.245659] RBP: ffff88dd3102f040 R08: 0000000000000000 R09: ffff94ae1baefc40
      [17313.245661] R10: ffffcdc8bb1c4e80 R11: ffffcdc8b50adb08 R12: 00000000098f7940
      [17313.245663] R13: ffff88dc72243300 R14: ffff88dbc8f19600 R15: ffff88dc72243428
      [17313.245667] FS:  00007fb145485700(0000) GS:ffff88dd3e000000(0000) knlGS:0000000000000000
      [17313.245670] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [17313.245672] CR2: 0000026bb46c6000 CR3: 0000004edb110003 CR4: 00000000007606e0
      [17313.245753] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [17313.245756] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [17313.245759] PKRU: 55555554
      [17313.245761] Call Trace:
      [17313.245803]  cifs_fattr_to_inode+0x16b/0x580 [cifs]
      [17313.245838]  cifs_get_inode_info+0x35c/0xa60 [cifs]
      [17313.245852]  ? kmem_cache_alloc_trace+0x151/0x1d0
      [17313.245885]  cifs_open+0x38f/0x990 [cifs]
      [17313.245921]  ? cifs_revalidate_dentry_attr+0x3e/0x350 [cifs]
      [17313.245953]  ? cifsFileInfo_get+0x30/0x30 [cifs]
      [17313.245960]  ? do_dentry_open+0x132/0x330
      [17313.245963]  do_dentry_open+0x132/0x330
      [17313.245969]  path_openat+0x573/0x14d0
      [17313.245974]  do_filp_open+0x93/0x100
      [17313.245979]  ? __check_object_size+0xa3/0x181
      [17313.245986]  ? audit_alloc_name+0x7e/0xd0
      [17313.245992]  do_sys_open+0x184/0x220
      [17313.245999]  do_syscall_64+0x5b/0x1b0
      
      Fixes: 487317c9 ("cifs: add spinlock for the openFileList to cifsInodeInfo")
      
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: NDave Wysochanski <dwysocha@redhat.com>
      Reviewed-by: NRonnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      cb248819
  21. 17 9月, 2019 4 次提交
  22. 14 6月, 2019 1 次提交
  23. 30 5月, 2019 1 次提交
  24. 08 5月, 2019 1 次提交
  25. 25 4月, 2019 1 次提交
    • J
      cifs: fix page reference leak with readv/writev · 13f5938d
      Jérôme Glisse 提交于
      CIFS can leak pages reference gotten through GUP (get_user_pages*()
      through iov_iter_get_pages()). This happen if cifs_send_async_read()
      or cifs_write_from_iter() calls fail from within __cifs_readv() and
      __cifs_writev() respectively. This patch move page unreference to
      cifs_aio_ctx_release() which will happens on all code paths this is
      all simpler to follow for correctness.
      Signed-off-by: NJérôme Glisse <jglisse@redhat.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: linux-cifs@vger.kernel.org
      Cc: samba-technical@lists.samba.org
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stable <stable@vger.kernel.org>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
      13f5938d
  26. 16 4月, 2019 1 次提交
    • A
      CIFS: keep FileInfo handle live during oplock break · b98749ca
      Aurelien Aptel 提交于
      In the oplock break handler, writing pending changes from pages puts
      the FileInfo handle. If the refcount reaches zero it closes the handle
      and waits for any oplock break handler to return, thus causing a deadlock.
      
      To prevent this situation:
      
      * We add a wait flag to cifsFileInfo_put() to decide whether we should
        wait for running/pending oplock break handlers
      
      * We keep an additionnal reference of the SMB FileInfo handle so that
        for the rest of the handler putting the handle won't close it.
        - The ref is bumped everytime we queue the handler via the
          cifs_queue_oplock_break() helper.
        - The ref is decremented at the end of the handler
      
      This bug was triggered by xfstest 464.
      
      Also important fix to address the various reports of
      oops in smb2_push_mandatory_locks
      Signed-off-by: NAurelien Aptel <aaptel@suse.com>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      b98749ca
  27. 23 3月, 2019 2 次提交
  28. 15 3月, 2019 1 次提交
    • A
      CIFS: fix POSIX lock leak and invalid ptr deref · bc31d0cd
      Aurelien Aptel 提交于
      We have a customer reporting crashes in lock_get_status() with many
      "Leaked POSIX lock" messages preceeding the crash.
      
       Leaked POSIX lock on dev=0x0:0x56 ...
       Leaked POSIX lock on dev=0x0:0x56 ...
       Leaked POSIX lock on dev=0x0:0x56 ...
       Leaked POSIX lock on dev=0x0:0x53 ...
       Leaked POSIX lock on dev=0x0:0x53 ...
       Leaked POSIX lock on dev=0x0:0x53 ...
       Leaked POSIX lock on dev=0x0:0x53 ...
       POSIX: fl_owner=ffff8900e7b79380 fl_flags=0x1 fl_type=0x1 fl_pid=20709
       Leaked POSIX lock on dev=0x0:0x4b ino...
       Leaked locks on dev=0x0:0x4b ino=0xf911400000029:
       POSIX: fl_owner=ffff89f41c870e00 fl_flags=0x1 fl_type=0x1 fl_pid=19592
       stack segment: 0000 [#1] SMP
       Modules linked in: binfmt_misc msr tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcsec_gss_krb5 arc4 ecb auth_rpcgss nfsv4 md4 nfs nls_utf8 lockd grace cifs sunrpc ccm dns_resolver fscache af_packet iscsi_ibft iscsi_boot_sysfs vmw_vsock_vmci_transport vsock xfs libcrc32c sb_edac edac_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drbg ansi_cprng vmw_balloon aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd joydev pcspkr vmxnet3 i2c_piix4 vmw_vmci shpchp fjes processor button ac btrfs xor raid6_pq sr_mod cdrom ata_generic sd_mod ata_piix vmwgfx crc32c_intel drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm serio_raw ahci libahci drm libata vmw_pvscsi sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
      
       Supported: Yes
       CPU: 6 PID: 28250 Comm: lsof Not tainted 4.4.156-94.64-default #1
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
       task: ffff88a345f28740 ti: ffff88c74005c000 task.ti: ffff88c74005c000
       RIP: 0010:[<ffffffff8125dcab>]  [<ffffffff8125dcab>] lock_get_status+0x9b/0x3b0
       RSP: 0018:ffff88c74005fd90  EFLAGS: 00010202
       RAX: ffff89bde83e20ae RBX: ffff89e870003d18 RCX: 0000000049534f50
       RDX: ffffffff81a3541f RSI: ffffffff81a3544e RDI: ffff89bde83e20ae
       RBP: 0026252423222120 R08: 0000000020584953 R09: 000000000000ffff
       R10: 0000000000000000 R11: ffff88c74005fc70 R12: ffff89e5ca7b1340
       R13: 00000000000050e5 R14: ffff89e870003d30 R15: ffff89e5ca7b1340
       FS:  00007fafd64be800(0000) GS:ffff89f41fd00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000001c80018 CR3: 000000a522048000 CR4: 0000000000360670
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Stack:
        0000000000000208 ffffffff81a3d6b6 ffff89e870003d30 ffff89e870003d18
        ffff89e5ca7b1340 ffff89f41738d7c0 ffff89e870003d30 ffff89e5ca7b1340
        ffffffff8125e08f 0000000000000000 ffff89bc22b67d00 ffff88c74005ff28
       Call Trace:
        [<ffffffff8125e08f>] locks_show+0x2f/0x70
        [<ffffffff81230ad1>] seq_read+0x251/0x3a0
        [<ffffffff81275bbc>] proc_reg_read+0x3c/0x70
        [<ffffffff8120e456>] __vfs_read+0x26/0x140
        [<ffffffff8120e9da>] vfs_read+0x7a/0x120
        [<ffffffff8120faf2>] SyS_read+0x42/0xa0
        [<ffffffff8161cbc3>] entry_SYSCALL_64_fastpath+0x1e/0xb7
      
      When Linux closes a FD (close(), close-on-exec, dup2(), ...) it calls
      filp_close() which also removes all posix locks.
      
      The lock struct is initialized like so in filp_close() and passed
      down to cifs
      
      	...
              lock.fl_type = F_UNLCK;
              lock.fl_flags = FL_POSIX | FL_CLOSE;
              lock.fl_start = 0;
              lock.fl_end = OFFSET_MAX;
      	...
      
      Note the FL_CLOSE flag, which hints the VFS code that this unlocking
      is done for closing the fd.
      
      filp_close()
        locks_remove_posix(filp, id);
          vfs_lock_file(filp, F_SETLK, &lock, NULL);
            return filp->f_op->lock(filp, cmd, fl) => cifs_lock()
              rc = cifs_setlk(file, flock, type, wait_flag, posix_lck, lock, unlock, xid);
                rc = server->ops->mand_unlock_range(cfile, flock, xid);
                if (flock->fl_flags & FL_POSIX && !rc)
                        rc = locks_lock_file_wait(file, flock)
      
      Notice how we don't call locks_lock_file_wait() which does the
      generic VFS lock/unlock/wait work on the inode if rc != 0.
      
      If we are closing the handle, the SMB server is supposed to remove any
      locks associated with it. Similarly, cifs.ko frees and wakes up any
      lock and lock waiter when closing the file:
      
      cifs_close()
        cifsFileInfo_put(file->private_data)
      	/*
      	 * Delete any outstanding lock records. We'll lose them when the file
      	 * is closed anyway.
      	 */
      	down_write(&cifsi->lock_sem);
      	list_for_each_entry_safe(li, tmp, &cifs_file->llist->locks, llist) {
      		list_del(&li->llist);
      		cifs_del_lock_waiters(li);
      		kfree(li);
      	}
      	list_del(&cifs_file->llist->llist);
      	kfree(cifs_file->llist);
      	up_write(&cifsi->lock_sem);
      
      So we can safely ignore unlocking failures in cifs_lock() if they
      happen with the FL_CLOSE flag hint set as both the server and the
      client take care of it during the actual closing.
      
      This is not a proper fix for the unlocking failure but it's safe and
      it seems to prevent the lock leakages and crashes the customer
      experiences.
      Signed-off-by: NAurelien Aptel <aaptel@suse.com>
      Signed-off-by: NNeilBrown <neil@brown.name>
      Signed-off-by: NSteve French <stfrench@microsoft.com>
      Acked-by: NPavel Shilovsky <pshilov@microsoft.com>
      bc31d0cd