1. 29 Dec, 2015 3 commits
  2. 28 Dec, 2015 5 commits
  3. 14 Dec, 2015 1 commit
  4. 13 Dec, 2015 1 commit
    • H
      osd fs: __r4w_get_page rely on PageUptodate for uptodate · 3066a967
      Hugh Dickins committed
      Commit 42cb14b1 ("mm: migrate dirty page without
      clear_page_dirty_for_io etc") simplified the migration of a PageDirty
      pagecache page: one stat needs moving from zone to zone and that's about
      all.
      
      It's convenient and safest for it to shift the PageDirty bit from old
      page to new, just before updating the zone stats: before copying data
      and marking the new PageUptodate.  This is all done while both pages are
      isolated and locked, just as before; and just as before, there's a
      moment when the new page is visible in the radix_tree, but not yet
      PageUptodate.  What's new is that it may now be briefly visible as
      PageDirty before it is PageUptodate.
      
      When I scoured the tree to see if this could cause a problem anywhere,
      the only places I found were in two similar functions __r4w_get_page():
      which look up a page with find_get_page() (not using page lock), then
      claim it's uptodate if it's PageDirty or PageWriteback or PageUptodate.
      
      I'm not sure whether that was right before, but now it might be wrong
      (on rare occasions): only claim the page is uptodate if PageUptodate.
      Or perhaps the page in question could never be migratable anyway?
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Tested-by: Boaz Harrosh <ooo@electrozaur.com>
      Cc: Benny Halevy <bhalevy@panasas.com>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 08 Dec, 2015 1 commit
    • T
      SUNRPC: Fix callback channel · 756b9b37
      Trond Myklebust committed
      The NFSv4.1 callback channel is currently broken: the backchannel
      receive buffer size never gets reset, so the received message keeps
      shrinking.
      The easiest solution is to adjust the copied request rather than
      changing the receive buffer.
      
      Fixes: 38b7631f ("nfs4: limit callback decoding to received bytes")
      Cc: Benjamin Coddington <bcodding@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  6. 26 Nov, 2015 3 commits
  7. 24 Nov, 2015 10 commits
  8. 14 Nov, 2015 1 commit
  9. 07 Nov, 2015 1 commit
    • M
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Mel Gorman committed
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access to one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimistic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM identifies
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites:
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and were depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 04 Nov, 2015 3 commits
  11. 03 Nov, 2015 3 commits
  12. 23 Oct, 2015 1 commit
  13. 22 Oct, 2015 7 commits
    • K
      NFSv4.1/pnfs: Retry through MDS when getting bad length of data · f8417b48
      Kinglong Mee committed
      If a non-RPC-based layout driver returns a bad length of data, nfs retries
      by calling rpc_restart_call_prepare(), which causes a NULL pointer
      dereference panic.

      This patch lets nfs retry through the MDS when a non-RPC-based layout
      driver returns a bad length of data.
      
      [13034.883329] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [13034.884902] IP: [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc]
      [13034.886558] PGD 0
      [13034.888126] Oops: 0000 [#1] KASAN
      [13034.889710] Modules linked in: blocklayoutdriver(OE) nfsv4(OE) nfs(OE) fscache(E) nfsd(OE) xfs libcrc32c coretemp btrfs crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev vmw_balloon auth_rpcgss shpchp nfs_acl lockd vmw_vmci parport_pc xor raid6_pq grace parport sunrpc i2c_piix4 vmwgfx drm_kms_helper ttm drm mptspi e1000 serio_raw scsi_transport_spi mptscsih mptbase ata_generic pata_acpi [last unloaded: fscache]
      [13034.898260] CPU: 0 PID: 10112 Comm: kworker/0:1 Tainted: G           OE   4.3.0-rc5+ #279
      [13034.899932] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
      [13034.903342] Workqueue: events bl_read_cleanup [blocklayoutdriver]
      [13034.905059] task: ffff88006a9148c0 ti: ffff880035e90000 task.ti: ffff880035e90000
      [13034.906827] RIP: 0010:[<ffffffffa00db372>]  [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc]
      [13034.910522] RSP: 0018:ffff880035e97b58  EFLAGS: 00010282
      [13034.912378] RAX: fffffbfff04a5a94 RBX: ffff880068fe4858 RCX: 0000000000000003
      [13034.914339] RDX: dffffc0000000000 RSI: 0000000000000003 RDI: 0000000000000282
      [13034.916236] RBP: ffff880035e97b68 R08: 0000000000000001 R09: 0000000000000001
      [13034.918229] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
      [13034.920007] R13: ffff880068fe4858 R14: ffff880068fe4a60 R15: 0000000000001000
      [13034.921845] FS:  0000000000000000(0000) GS:ffffffff82247000(0000) knlGS:0000000000000000
      [13034.923645] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [13034.925525] CR2: 0000000000000000 CR3: 00000000063dd000 CR4: 00000000001406f0
      [13034.932808] Stack:
      [13034.934813]  ffff880068fe4780 0000000000001000 ffff880035e97ba8 ffffffffa08800d2
      [13034.936675]  ffffffffa088029d ffff880068fe4780 ffff880068fe4858 ffffffffa089c0a0
      [13034.938593]  ffff880068fe47e0 ffff88005d59faf0 ffff880035e97be0 ffffffffa087e08f
      [13034.940454] Call Trace:
      [13034.942388]  [<ffffffffa08800d2>] nfs_readpage_result+0x112/0x200 [nfs]
      [13034.944317]  [<ffffffffa088029d>] ? nfs_readpage_done+0xdd/0x160 [nfs]
      [13034.946267]  [<ffffffffa087e08f>] nfs_pgio_result+0x9f/0x120 [nfs]
      [13034.948166]  [<ffffffffa09266cc>] pnfs_ld_read_done+0x7c/0x1e0 [nfsv4]
      [13034.950247]  [<ffffffffa03b07ee>] bl_read_cleanup+0x2e/0x60 [blocklayoutdriver]
      [13034.952156]  [<ffffffff810ebf62>] process_one_work+0x412/0x870
      [13034.954102]  [<ffffffff810ebe84>] ? process_one_work+0x334/0x870
      [13034.955949]  [<ffffffff810ebb50>] ? queue_delayed_work_on+0x40/0x40
      [13034.957985]  [<ffffffff810ec441>] worker_thread+0x81/0x6a0
      [13034.959817]  [<ffffffff810ec3c0>] ? process_one_work+0x870/0x870
      [13034.961785]  [<ffffffff810f43bd>] kthread+0x17d/0x1a0
      [13034.963544]  [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330
      [13034.965479]  [<ffffffff81100428>] ? finish_task_switch+0x88/0x220
      [13034.967223]  [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330
      [13034.968929]  [<ffffffff81b6ae5f>] ret_from_fork+0x3f/0x70
      [13034.970534]  [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330
      [13034.972176] Code: c7 43 50 40 84 0d a0 e8 3d fe 1c e1 48 8d 7b 58 c7 83 e4 00 00 00 00 00 00 00 e8 ca fe 1c e1 4c 8b 63 58 4c 89 e7 e8 be fe 1c e1 <49> 83 3c 24 00 74 12 48 c7 43 50 f0 a2 0e a0 b8 01 00 00 00 5b
      [13034.977148] RIP  [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc]
      [13034.978780]  RSP <ffff880035e97b58>
      [13034.980399] CR2: 0000000000000000
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      nfs/blocklayout: Fix bad using of page offset in bl_read_pagelist · 15ae2c7b
      Kinglong Mee committed
      Blocklayout uses the file offset as the offset of the read-back page for the
      first write, which is definitely wrong: it writes data to a bad address in the
      page and causes userspace applications to segfault. The offset must be the page
      base stored in header->args.pgbase.

      Also, the pg_offset has no influence on isect or the extent length.

      Note: The offset of the non-first page is always zero.

      PS: A test program will segfault at read():
      #define _GNU_SOURCE
      
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/types.h>
      #include <sys/stat.h>
      #include <unistd.h>
      #include <fcntl.h>
      #include <errno.h>
      
      int main(int argc, char **argv)
      {
              char buf[2049];
              char *filename = NULL;
              int fd = -1;
      
              if (argc < 2) {
                      printf("Usage: %s filename\n", argv[0]);
                      return 0;
              }
      
              filename = argv[1];
              fd = open(filename, O_RDONLY | O_DIRECT);
              if (fd < 0) {
                      printf("Open %s fail: %m\n", filename);
                      return 1;
              }
      
              lseek(fd, 2048, SEEK_SET);
              if (read(fd, buf, sizeof(buf) - 1) != (sizeof(buf) - 1))
                      printf("Read 2048 bytes of data from %s fail: %m\n", filename);
      out:
              close(fd);
              return 0;
      }
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Return directly if encode_sessionid fail · e0a63c0b
      Kinglong Mee committed
      encode_sessionid() may return an error; nfs needs to process the return value.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Fix bad checking of max taglen in callback request · 403889c0
      Kinglong Mee committed
      The taglen should be checked with CB_OP_TAGLEN_MAXSZ directly.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Fix bad defines of callback response maxsize · 45724e8a
      Kinglong Mee committed
      As CB_OP_TAGLEN_MAXSZ, all XXX_MAXSZ should be defined as bit.
      Each operation should not contain CB_OP_TAGLEN_MAXSZ.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Use NFS4_MAX_SESSIONID_LEN directly for decode/encode sessionid · 590184a6
      Kinglong Mee committed
      There is no need to define a temporary variable for NFS4_MAX_SESSIONID_LEN.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • K
      NFS: Remove unneeded NFS_DEBUG checking before define NFSDBG_FACILITY · 39de493e
      Kinglong Mee committed
      There is no need to check NFS_DEBUG before defining NFSDBG_FACILITY; remove the check.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>