1. 03 Nov, 2015 2 commits
    • T
      pNFS/flexfiles: Add support for FF_FLAGS_NO_IO_THRU_MDS · 260074cd
      Committed by Trond Myklebust
      For loosely coupled pNFS/flexfiles systems, there is often no advantage
      at all in going through the MDS for I/O, since the MDS is subject to
      the same limitations as all other clients when talking to DSes. If a
      DS is unresponsive, I/O through the MDS will fail.
      
      For such systems, the only scalable solution is to have the pNFS clients
      retry doing pNFS, and so the protocol now provides a flag that allows
      the pNFS server to signal this.
      
      If LAYOUTGET returns FF_FLAGS_NO_IO_THRU_MDS, then we should assume that
      the MDS wants the client to retry using these devices, even if they were
      previously marked as being unavailable. To do so, we add a helper,
      ff_layout_mark_devices_valid(), that will be called from layoutget.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
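
As a rough illustration of the behaviour described in this commit, the sketch below clears the "unavailable" mark on every device when a LAYOUTGET reply carries the flag. It is a user-space approximation and not the kernel code: the `sketch_` names, the `struct sketch_device` type, and the flag value are assumptions made for the example. Only the flag name and the idea of a mark-devices-valid helper come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag value, for illustration only. */
#define SKETCH_FF_FLAGS_NO_IO_THRU_MDS 0x2u

/* Hypothetical stand-in for a data server (DS) device entry. */
struct sketch_device {
    bool unavailable; /* set when earlier I/O to this DS failed */
};

/* Clear the unavailable mark on every device; return how many were cleared. */
size_t sketch_mark_devices_valid(struct sketch_device *devs, size_t n)
{
    size_t cleared = 0;

    for (size_t i = 0; i < n; i++) {
        if (devs[i].unavailable) {
            devs[i].unavailable = false;
            cleared++;
        }
    }
    return cleared;
}

/*
 * Called with the flags from a LAYOUTGET reply. If the server asked
 * for no I/O through the MDS, give every device another chance.
 */
size_t sketch_on_layoutget(uint32_t flags, struct sketch_device *devs, size_t n)
{
    if (!(flags & SKETCH_FF_FLAGS_NO_IO_THRU_MDS))
        return 0;
    return sketch_mark_devices_valid(devs, n);
}
```

Without the flag the device state is left alone, so a client in that case would presumably keep whatever fallback behaviour it already had.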
    • T
      pNFS/flexfiles: When mirrored, retry failed reads by switching mirrors · 13544412
      Committed by Trond Myklebust
      If the pNFS/flexfiles file is mirrored, and a read to one mirror fails,
      then we should bump the mirror index, so that we retry to a different
      mirror. Once we've iterated through all mirrors and all failed, we can
      return the layout and issue a new LAYOUTGET.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
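
The retry policy in this commit message can be sketched as a small helper that bumps the mirror index after a failed read and reports when no mirrors are left. This is a simplified user-space sketch: the function name and the `SKETCH_MIRRORS_EXHAUSTED` sentinel are hypothetical, and it assumes mirrors are tried in increasing index order without wrapping around.

```c
#include <stddef.h>

/*
 * Hypothetical sentinel: every mirror has failed, so the caller should
 * return the layout and issue a new LAYOUTGET.
 */
#define SKETCH_MIRRORS_EXHAUSTED (-1)

/*
 * After a failed read on mirror `cur`, return the index of the next
 * mirror to try, or SKETCH_MIRRORS_EXHAUSTED if none remain.
 */
int sketch_next_mirror_after_failure(size_t cur, size_t nmirrors)
{
    size_t next = cur + 1; /* bump the mirror index */

    if (next < nmirrors)
        return (int)next;
    return SKETCH_MIRRORS_EXHAUSTED;
}
```

With three mirrors, a failure on mirror 0 moves the read to mirror 1, and a failure on mirror 2 yields the sentinel.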
  2. 22 Oct, 2015 11 commits
    • K
      NFSv4.1/pnfs: Retry through MDS when getting bad length of data · f8417b48
      Committed by Kinglong Mee
      If a non-RPC-based layout driver returns a bad data length, NFS retries
      by calling rpc_restart_call_prepare(), which causes a NULL pointer
      dereference panic.
      
      This patch makes NFS retry through the MDS when a non-RPC-based layout
      driver returns a bad data length.
      
      [13034.883329] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [13034.884902] IP: [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc]
      [13034.886558] PGD 0
      [13034.888126] Oops: 0000 [#1] KASAN
      [13034.889710] Modules linked in: blocklayoutdriver(OE) nfsv4(OE) nfs(OE) fscache(E) nfsd(OE) xfs libcrc32c coretemp btrfs crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev vmw_balloon auth_rpcgss shpchp nfs_acl lockd vmw_vmci parport_pc xor raid6_pq grace parport sunrpc i2c_piix4 vmwgfx drm_kms_helper ttm drm mptspi e1000 serio_raw scsi_transport_spi mptscsih mptbase ata_generic pata_acpi [last unloaded: fscache]
      [13034.898260] CPU: 0 PID: 10112 Comm: kworker/0:1 Tainted: G           OE   4.3.0-rc5+ #279
      [13034.899932] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
      [13034.903342] Workqueue: events bl_read_cleanup [blocklayoutdriver]
      [13034.905059] task: ffff88006a9148c0 ti: ffff880035e90000 task.ti: ffff880035e90000
      [13034.906827] RIP: 0010:[<ffffffffa00db372>]  [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc]
      [13034.910522] RSP: 0018:ffff880035e97b58  EFLAGS: 00010282
      [13034.912378] RAX: fffffbfff04a5a94 RBX: ffff880068fe4858 RCX: 0000000000000003
      [13034.914339] RDX: dffffc0000000000 RSI: 0000000000000003 RDI: 0000000000000282
      [13034.916236] RBP: ffff880035e97b68 R08: 0000000000000001 R09: 0000000000000001
      [13034.918229] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
      [13034.920007] R13: ffff880068fe4858 R14: ffff880068fe4a60 R15: 0000000000001000
      [13034.921845] FS:  0000000000000000(0000) GS:ffffffff82247000(0000) knlGS:0000000000000000
      [13034.923645] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [13034.925525] CR2: 0000000000000000 CR3: 00000000063dd000 CR4: 00000000001406f0
      [13034.932808] Stack:
      [13034.934813]  ffff880068fe4780 0000000000001000 ffff880035e97ba8 ffffffffa08800d2
      [13034.936675]  ffffffffa088029d ffff880068fe4780 ffff880068fe4858 ffffffffa089c0a0
      [13034.938593]  ffff880068fe47e0 ffff88005d59faf0 ffff880035e97be0 ffffffffa087e08f
      [13034.940454] Call Trace:
      [13034.942388]  [<ffffffffa08800d2>] nfs_readpage_result+0x112/0x200 [nfs]
      [13034.944317]  [<ffffffffa088029d>] ? nfs_readpage_done+0xdd/0x160 [nfs]
      [13034.946267]  [<ffffffffa087e08f>] nfs_pgio_result+0x9f/0x120 [nfs]
      [13034.948166]  [<ffffffffa09266cc>] pnfs_ld_read_done+0x7c/0x1e0 [nfsv4]
      [13034.950247]  [<ffffffffa03b07ee>] bl_read_cleanup+0x2e/0x60 [blocklayoutdriver]
      [13034.952156]  [<ffffffff810ebf62>] process_one_work+0x412/0x870
      [13034.954102]  [<ffffffff810ebe84>] ? process_one_work+0x334/0x870
      [13034.955949]  [<ffffffff810ebb50>] ? queue_delayed_work_on+0x40/0x40
      [13034.957985]  [<ffffffff810ec441>] worker_thread+0x81/0x6a0
      [13034.959817]  [<ffffffff810ec3c0>] ? process_one_work+0x870/0x870
      [13034.961785]  [<ffffffff810f43bd>] kthread+0x17d/0x1a0
      [13034.963544]  [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330
      [13034.965479]  [<ffffffff81100428>] ? finish_task_switch+0x88/0x220
      [13034.967223]  [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330
      [13034.968929]  [<ffffffff81b6ae5f>] ret_from_fork+0x3f/0x70
      [13034.970534]  [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330
      [13034.972176] Code: c7 43 50 40 84 0d a0 e8 3d fe 1c e1 48 8d 7b 58 c7 83 e4 00 00 00 00 00 00 00 e8 ca fe 1c e1 4c 8b 63 58 4c 89 e7 e8 be fe 1c e1 <49> 83 3c 24 00 74 12 48 c7 43 50 f0 a2 0e a0 b8 01 00 00 00 5b
      [13034.977148] RIP  [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc]
      [13034.978780]  RSP <ffff880035e97b58>
      [13034.980399] CR2: 0000000000000000
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      nfs/blocklayout: Fix bad using of page offset in bl_read_pagelist · 15ae2c7b
      Committed by Kinglong Mee
      Blocklayout uses the file offset as the offset into the first page that
      read data is written back to. That is definitely wrong: it writes data to
      the wrong address within the page, which causes the userspace application
      to segfault. The offset must be the page base stored in header->args.pgbase.
      
      Also, pg_offset has no influence on isect or the extent length.
      
      Note: the offset of every non-first page is always zero.
      
      PS: The following test program segfaults at read():
      #define _GNU_SOURCE
      
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/types.h>
      #include <sys/stat.h>
      #include <unistd.h>
      #include <fcntl.h>
      #include <errno.h>
      
      int main(int argc, char **argv)
      {
              char buf[2049];
              char *filename = NULL;
              int fd = -1;
      
              if (argc < 2) {
                      printf("Usage: %s filename\n", argv[0]);
                      return 0;
              }
      
              filename = argv[1];
              fd = open(filename, O_RDONLY | O_DIRECT);
              if (fd < 0) {
                      printf("Open %s fail: %m\n", filename);
                      return 1;
              }
      
              lseek(fd, 2048, SEEK_SET);
              if (read(fd, buf, sizeof(buf) - 1) != (sizeof(buf) - 1))
                      printf("Read 2048 bytes of data from %s fail: %m\n", filename);
      out:
              close(fd);
              return 0;
      }
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
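
The offset rule stated in this commit message (the first page starts at the page base, every later page starts at zero) can be sketched as a helper that splits a read into per-page segments. This is a user-space illustration under assumptions: the `sketch_` names and the fixed 4096-byte page size are invented for the example and do not mirror the blocklayout driver's actual data structures.

```c
#include <stddef.h>

/* Assumed page size for the example. */
#define SKETCH_PAGE_SIZE 4096u

/* One page-sized piece of a read. */
struct sketch_segment {
    size_t pg_offset; /* where in the page the data lands */
    size_t pg_len;    /* how many bytes land in this page */
};

/*
 * Describe how `count` bytes map onto consecutive pages when the first
 * page starts at in-page offset `pgbase`. Returns the number of
 * segments written to segs[] (at most `max`), or 0 if pgbase is not a
 * valid in-page offset.
 */
size_t sketch_split_read(size_t pgbase, size_t count,
                         struct sketch_segment *segs, size_t max)
{
    size_t n = 0;
    size_t off = pgbase; /* only the first page uses pgbase */

    if (pgbase >= SKETCH_PAGE_SIZE)
        return 0;

    while (count > 0 && n < max) {
        size_t len = SKETCH_PAGE_SIZE - off;

        if (len > count)
            len = count;
        segs[n].pg_offset = off;
        segs[n].pg_len = len;
        n++;
        count -= len;
        off = 0; /* every non-first page starts at offset zero */
    }
    return n;
}
```

Note that the file offset does not appear anywhere in the computation, which is the point the commit message makes.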
    • K
      NFS: Return directly if encode_sessionid fail · e0a63c0b
      Committed by Kinglong Mee
      encode_sessionid() may return an error, so NFS needs to handle the return value.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Fix bad checking of max taglen in callback request · 403889c0
      Committed by Kinglong Mee
      The taglen should be checked against CB_OP_TAGLEN_MAXSZ directly.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Fix bad defines of callback response maxsize · 45724e8a
      Committed by Kinglong Mee
      As CB_OP_TAGLEN_MAXSZ, all XXX_MAXSZ should be defined as bit.
      No operation should contain CB_OP_TAGLEN_MAXSZ.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Use NFS4_MAX_SESSIONID_LEN directly for decode/encode sessionid · 590184a6
      Committed by Kinglong Mee
      There is no need to define a temporary variable for NFS4_MAX_SESSIONID_LEN.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Remove unneeded NFS_DEBUG checking before define NFSDBG_FACILITY · 39de493e
      Committed by Kinglong Mee
      There is no need to check NFS_DEBUG before defining NFSDBG_FACILITY; remove the check.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Remove the left function defines in callback.h · f765bf76
      Committed by Kinglong Mee
      Commit 778be232 "NFS do not find client in NFSv4 pg_authenticate" removed
      the definition and use of nfs4_set_callback_sessionid(), and
      commit 36281caa "NFSv4: Further clean-ups of delegation stateid validation"
      updated the stateid checking and moved the code to nfs4proc.c.
      
      This patch removes the function declarations left behind in callback.h.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Remove the left global variable nfs_callback_tcpport · 8c163d8e
      Committed by Kinglong Mee
      Commit bbe0a3aa "NFS: make nfs_callback_tcpport per network context"
      made nfs_callback_tcpport per network context but left the global
      nfs_callback_tcpport behind; remove it.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Get rid of the unneeded addr stored in callback arguments · d4e2ce09
      Committed by Kinglong Mee
      Commit c36fca52 "NFS refactor nfs_find_client and reference client
      across callback processing" stores clp in cb_process_state,
      which is set in cb_sequence.
      
      There is therefore no need to store the address pointer in any callback arguments.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • L
      nfsroot: make nfsroot to accept the 1024 bytes long directory name · c6466193
      Committed by Li RongQing
      Although NFS_MAXPATHLEN is defined as 1024 and the NFS client expects to
      accept a 1024-byte path, nfs_root_parms is limited to 256 bytes, so the NFS
      path is truncated when a user passes it on the kernel command line.
      
      Enlarge nfs_root_parms to 1024 bytes so that it accepts a 1024-byte
      directory name. Since nfs_root_parms is defined as __initdata, it is
      released after system boot.
      Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
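
The truncation problem described in this commit can be reproduced in user space with a bounded copy into a 256-byte buffer versus a 1024-byte one. The sketch below is an assumption-laden illustration: `sketch_copy_param` is a hypothetical helper built on standard `snprintf`, and it does not reproduce how the kernel actually parses the nfsroot parameters.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Copy `src` into `dst`, which holds `dstsz` bytes.
 * Returns 0 if the whole string fit, -1 if it was truncated
 * (dst then holds the first dstsz - 1 characters).
 */
int sketch_copy_param(char *dst, size_t dstsz, const char *src)
{
    /* snprintf returns the length the full output would have had. */
    int n = snprintf(dst, dstsz, "%s", src);

    if (n < 0 || (size_t)n >= dstsz)
        return -1;
    return 0;
}
```

A 1000-character path is cut to 255 characters in a 256-byte buffer but fits unchanged in a 1024-byte buffer, which is the effect the size increase is meant to achieve.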
  3. 16 Oct, 2015 7 commits
  4. 08 Oct, 2015 5 commits
  5. 07 Oct, 2015 1 commit
  6. 04 Oct, 2015 1 commit
  7. 03 Oct, 2015 6 commits
  8. 02 Oct, 2015 2 commits
    • S
      [SMB3] Do not fall back to SMBWriteX in set_file_size error cases · 646200a0
      Committed by Steve French
      The error paths in set_file_size for cifs and smb3 are incorrect.
      
      In the unlikely event that a server did not support set file info
      of the file size, the code incorrectly falls back to trying SMBWriteX
      (note that only the original core SMB Write, used for example by DOS,
      can set the file size this way - this actually does not work for the more
      recent SMBWriteX).  The idea was that, since the old DOS SMB Write could set
      the file size if you write zero bytes at that offset, it could be used if
      the server rejects the normal set file info call.
      
      Fortunately the SMBWriteX will never be sent on the wire (except when
      file size is zero) since the length and offset fields were reversed
      in the two places in this function that call SMBWriteX, causing
      the fallback path to return an error. It is also important to never call
      an SMB request from an SMB2/SMB3 session (which theoretically would
      be possible, and can cause a brief session drop, although the client
      recovers) so this should be fixed.  In practice this path does not happen
      with modern servers but the error fallback to SMBWriteX is clearly wrong.
      
      Remove the calls to SMBWriteX in the error paths in cifs_set_file_size.
      
      Pointed out by PaX/grsecurity team
      Signed-off-by: Steve French <steve.french@primarydata.com>
      Reported-by: PaX Team <pageexec@freemail.hu>
      CC: Emese Revfy <re.emese@gmail.com>
      CC: Brad Spengler <spender@grsecurity.net>
      CC: Stable <stable@vger.kernel.org>
    • R
      dax: fix NULL pointer in __dax_pmd_fault() · 8346c416
      Committed by Ross Zwisler
      Commit 46c043ed ("mm: take i_mmap_lock in unmap_mapping_range() for
      DAX") moved some code in __dax_pmd_fault() that was responsible for
      zeroing newly allocated PMD pages.  The new location didn't properly set
      up 'kaddr', so when run this code resulted in a NULL pointer BUG.
      
      Fix this by getting the correct 'kaddr' via bdev_direct_access().
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reported-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 29 Sep, 2015 1 commit
    • R
      UBIFS: Kill unneeded locking in ubifs_init_security · cf6f54e3
      Committed by Richard Weinberger
      Fixes the following lockdep splat:
      [    1.244527] =============================================
      [    1.245193] [ INFO: possible recursive locking detected ]
      [    1.245193] 4.2.0-rc1+ #37 Not tainted
      [    1.245193] ---------------------------------------------
      [    1.245193] cp/742 is trying to acquire lock:
      [    1.245193]  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff812b3f69>] ubifs_init_security+0x29/0xb0
      [    1.245193]
      [    1.245193] but task is already holding lock:
      [    1.245193]  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff81198e7f>] path_openat+0x3af/0x1280
      [    1.245193]
      [    1.245193] other info that might help us debug this:
      [    1.245193]  Possible unsafe locking scenario:
      [    1.245193]
      [    1.245193]        CPU0
      [    1.245193]        ----
      [    1.245193]   lock(&sb->s_type->i_mutex_key#9);
      [    1.245193]   lock(&sb->s_type->i_mutex_key#9);
      [    1.245193]
      [    1.245193]  *** DEADLOCK ***
      [    1.245193]
      [    1.245193]  May be due to missing lock nesting notation
      [    1.245193]
      [    1.245193] 2 locks held by cp/742:
      [    1.245193]  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff811ad37f>] mnt_want_write+0x1f/0x50
      [    1.245193]  #1:  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff81198e7f>] path_openat+0x3af/0x1280
      [    1.245193]
      [    1.245193] stack backtrace:
      [    1.245193] CPU: 2 PID: 742 Comm: cp Not tainted 4.2.0-rc1+ #37
      [    1.245193] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140816_022509-build35 04/01/2014
      [    1.245193]  ffffffff8252d530 ffff88007b023a38 ffffffff814f6f49 ffffffff810b56c5
      [    1.245193]  ffff88007c30cc80 ffff88007b023af8 ffffffff810a150d ffff88007b023a68
      [    1.245193]  000000008101302a ffff880000000000 00000008f447e23f ffffffff8252d500
      [    1.245193] Call Trace:
      [    1.245193]  [<ffffffff814f6f49>] dump_stack+0x4c/0x65
      [    1.245193]  [<ffffffff810b56c5>] ? console_unlock+0x1c5/0x510
      [    1.245193]  [<ffffffff810a150d>] __lock_acquire+0x1a6d/0x1ea0
      [    1.245193]  [<ffffffff8109fa78>] ? __lock_is_held+0x58/0x80
      [    1.245193]  [<ffffffff810a1a93>] lock_acquire+0xd3/0x270
      [    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
      [    1.245193]  [<ffffffff814fc83b>] mutex_lock_nested+0x6b/0x3a0
      [    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
      [    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
      [    1.245193]  [<ffffffff812b3f69>] ubifs_init_security+0x29/0xb0
      [    1.245193]  [<ffffffff8128e286>] ubifs_create+0xa6/0x1f0
      [    1.245193]  [<ffffffff81198e7f>] ? path_openat+0x3af/0x1280
      [    1.245193]  [<ffffffff81195d15>] vfs_create+0x95/0xc0
      [    1.245193]  [<ffffffff8119929c>] path_openat+0x7cc/0x1280
      [    1.245193]  [<ffffffff8109ffe3>] ? __lock_acquire+0x543/0x1ea0
      [    1.245193]  [<ffffffff81088f20>] ? sched_clock_cpu+0x90/0xc0
      [    1.245193]  [<ffffffff81088c00>] ? calc_global_load_tick+0x60/0x90
      [    1.245193]  [<ffffffff81088f20>] ? sched_clock_cpu+0x90/0xc0
      [    1.245193]  [<ffffffff811a9cef>] ? __alloc_fd+0xaf/0x180
      [    1.245193]  [<ffffffff8119ac55>] do_filp_open+0x75/0xd0
      [    1.245193]  [<ffffffff814ffd86>] ? _raw_spin_unlock+0x26/0x40
      [    1.245193]  [<ffffffff811a9cef>] ? __alloc_fd+0xaf/0x180
      [    1.245193]  [<ffffffff81189bd9>] do_sys_open+0x129/0x200
      [    1.245193]  [<ffffffff81189cc9>] SyS_open+0x19/0x20
      [    1.245193]  [<ffffffff81500717>] entry_SYSCALL_64_fastpath+0x12/0x6f
      
      While the lockdep splat is a false positive, because path_openat holds i_mutex
      of the parent directory and ubifs_init_security() tries to acquire i_mutex
      of a new inode, it reveals that taking i_mutex in ubifs_init_security() is
      in vain because it is only being called in the inode allocation path
      and therefore nobody else can see the inode yet.
      
      Cc: stable@vger.kernel.org # 3.20-
      Reported-and-tested-by: Boris Brezillon <boris.brezillon@free-electrons.com>
      Reviewed-and-tested-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: dedekind1@gmail.com
  10. 26 Sep, 2015 1 commit
  11. 24 Sep, 2015 2 commits
  12. 23 Sep, 2015 1 commit