1. 09 Jan, 2016 12 commits
    • nfsd: don't hold i_mutex over userspace upcalls · bbddca8e
      NeilBrown committed
      We need information about exports when crossing mountpoints during
      lookup or NFSv4 readdir.  If we don't already have that information
      cached, we may have to ask (and wait for) rpc.mountd.
      
      In both cases we currently hold the i_mutex on the parent of the
      directory we're asking rpc.mountd about.  We've seen situations where
      rpc.mountd performs some operation on that directory that tries to take
      the i_mutex again, resulting in deadlock.
      
      With some care, we may be able to avoid that in rpc.mountd.  But it
      seems better just to avoid holding a mutex while waiting on userspace.
      
      It appears that lookup_one_len is pretty much the only operation that
      needs the i_mutex.  So we could just drop the i_mutex elsewhere and do
      something like
      
      	mutex_lock()
      	lookup_one_len()
      	mutex_unlock()
      
      In many cases though the lookup would have been cached and not required
      the i_mutex, so it's more efficient to create a lookup_one_len() variant
      that only takes the i_mutex when necessary.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
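The nfsd commit above argues for a lookup that takes the lock only on a cache miss. The following is a minimal userspace sketch of that idea, not the kernel implementation: `lookup_unlocked`, `cache_find`, `directory`, and `slow_lookups` are hypothetical names, a pthread mutex stands in for i_mutex, and a small array of atomic pointers stands in for the dentry cache.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 16

struct entry {
	const char *name;
	int value;
};

/* Hypothetical backing store standing in for the slow, locked lookup. */
static struct entry directory[] = { { "export", 1 }, { "home", 2 } };

/* Published cache slots; readers load them without taking dir_lock. */
static _Atomic(struct entry *) cache[CACHE_SLOTS];
static pthread_mutex_t dir_lock = PTHREAD_MUTEX_INITIALIZER;
static int slow_lookups;	/* how often the locked slow path ran */

static struct entry *cache_find(const char *name)
{
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		struct entry *e =
			atomic_load_explicit(&cache[i], memory_order_acquire);
		if (e && strcmp(e->name, name) == 0)
			return e;
	}
	return NULL;
}

static struct entry *lookup_unlocked(const char *name)
{
	struct entry *e = cache_find(name);	/* fast path: no lock */

	if (e)
		return e;

	pthread_mutex_lock(&dir_lock);
	e = cache_find(name);	/* recheck: another thread may have filled it */
	if (!e) {
		slow_lookups++;
		for (size_t i = 0; i < sizeof(directory) / sizeof(directory[0]); i++) {
			if (strcmp(directory[i].name, name) == 0) {
				e = &directory[i];
				break;
			}
		}
		/* Publish the result in the first free slot, if any. */
		for (size_t i = 0; e && i < CACHE_SLOTS; i++) {
			if (!atomic_load_explicit(&cache[i], memory_order_relaxed)) {
				atomic_store_explicit(&cache[i], e, memory_order_release);
				break;
			}
		}
	}
	pthread_mutex_unlock(&dir_lock);
	return e;
}
```

A repeated lookup of the same name is served from the cache and never touches the mutex, which is the efficiency argument the commit message makes.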
    • fs:affs:Replace time_t with time64_t · db39c167
      DengChao committed
      The affs code uses "time_t" and "get_seconds()". This will cause
      problems on 32-bit architectures in 2038 when time_t overflows.
      This patch replaces them with "time64_t" and
      "ktime_get_real_seconds()". This patch introduces an expensive 64-bit
      division in "secs_to_datestamp()", but since this function is not
      called often, the cost should be acceptable.
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: DengChao <chao.deng@linaro.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
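To illustrate the 64-bit division the affs commit mentions, here is a hedged userspace sketch of a seconds-to-datestamp conversion. It assumes the usual Amiga convention (days since 1978-01-01, minutes within the day, ticks of 1/50 second) and deliberately ignores the timezone adjustment and big-endian storage; `secs_to_datestamp_sketch` and `struct datestamp` are illustrative names, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds from 1970-01-01 to 1978-01-01: 8 years including 2 leap days. */
#define AMIGA_EPOCH_OFFSET ((int64_t)(8 * 365 + 2) * 24 * 60 * 60)

struct datestamp {
	uint32_t days;	/* days since 1978-01-01 */
	uint32_t mins;	/* minutes since midnight */
	uint32_t ticks;	/* 1/50 second units within the minute */
};

static struct datestamp secs_to_datestamp_sketch(int64_t secs)
{
	struct datestamp ds;
	int64_t rem;

	secs -= AMIGA_EPOCH_OFFSET;
	if (secs < 0)
		secs = 0;	/* clamp times before the Amiga epoch */

	ds.days = (uint32_t)(secs / 86400);	/* the 64-bit division */
	rem = secs % 86400;
	ds.mins = (uint32_t)(rem / 60);
	ds.ticks = (uint32_t)((rem % 60) * 50);
	return ds;
}
```

Because `secs` is 64 bits wide, a value just past the 32-bit limit (2147483648, early 2038) converts without wrapping.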
    • fs/9p: use fscache mutex rather than spinlock · 8f5fed1e
      Sasha Levin committed
      We may sleep inside the lock, so use a mutex rather than a spinlock.
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • proc: add a reschedule point in proc_readfd_common() · 3cc4a84e
      Eric Dumazet committed
      A user can pass an arbitrarily large buffer to getdents().
      
      It is typically a 32KB buffer used by the libc scandir() implementation.
      
      When scanning /proc/{pid}/fd, we can hold the CPU for far too long,
      so add a cond_resched() to be kind to other tasks.
      
      We've seen latencies of more than 50ms on real workloads.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • logfs: constify logfs_block_ops structures · bc51b2a9
      Julia Lawall committed
      The logfs_block_ops structures are never modified, so declare them as
      const.
      
      Done with the help of Coccinelle.
      Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fcntl: allow to set O_DIRECT flag on pipe · 0dbf5f20
      Stanislav Kinsburskiy committed
      With packetized mode for pipes, it's not possible to set O_DIRECT on a pipe
      file via sys_fcntl, because the sanity checks reject it.
      The ability to set this flag will be used by CRIU to migrate packetized pipes.
      
      v2:
      Fixed typos and mode variable to check.
      Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE · 90330e68
      Abhi Das committed
      During testing, I discovered that __generic_file_splice_read() returns
      0 (EOF) when aops->readpage fails with AOP_TRUNCATED_PAGE on the first
      page of a single/multi-page splice read operation. This EOF return code
      causes the userspace test to (correctly) report a zero-length read error
      when it was expecting otherwise.
      
      The current strategy of returning a partial non-zero read when ->readpage
      returns AOP_TRUNCATED_PAGE works only when the failed page is not the
      first of the lot being processed.
      
      This patch attempts to retry lookup and call ->readpage again on pages
      that had previously failed with AOP_TRUNCATED_PAGE. With this patch, my
      tests pass and I haven't noticed any unwanted side effects.
      
      This version removes the thrice-retry loop and instead indefinitely
      retries lookups on AOP_TRUNCATED_PAGE errors from ->readpage. This
      behavior is now similar to do_generic_file_read().
      Signed-off-by: Abhi Das <adas@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
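The retry behaviour described in the splice commit above can be sketched as a loop that repeats the lookup and read for as long as the page is reported truncated. This is an illustration under stated assumptions, not the kernel code: `TRUNCATED_PAGE` is a hypothetical stand-in for AOP_TRUNCATED_PAGE, and `read_page_retry` and `flaky_readpage` are invented names, with a single callback standing in for the lookup plus ->readpage pair.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's AOP_TRUNCATED_PAGE return code. */
#define TRUNCATED_PAGE 1

typedef int (*readpage_fn)(void *ctx, unsigned long index);

/*
 * Read one page, repeating the lookup and read whenever the page was
 * truncated underneath us. There is no retry limit. Returns 0 on
 * success or whatever other error the callback reports.
 */
static int read_page_retry(readpage_fn readpage, void *ctx,
			   unsigned long index, unsigned int *attempts)
{
	int err;

	*attempts = 0;
	do {
		(*attempts)++;
		err = readpage(ctx, index);	/* stands in for lookup + ->readpage */
	} while (err == TRUNCATED_PAGE);

	return err;
}

/* Test double: reports TRUNCATED_PAGE a configurable number of times. */
static int flaky_readpage(void *ctx, unsigned long index)
{
	int *failures_left = ctx;

	(void)index;
	if (*failures_left > 0) {
		(*failures_left)--;
		return TRUNCATED_PAGE;
	}
	return 0;
}
```

With this shape, a truncation on the first page no longer surfaces to the caller as a zero-length read; the caller only sees the final result of the retried read.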
    • fs: xattr: Use kvfree() · 0b2a6f23
      Richard Weinberger committed
      ... instead of open coding it.
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • nbd: use ->compat_ioctl() · 263a3df1
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • compat_ioctl: don't call do_ioctl under set_fs(KERNEL_DS) · a7f61e89
      Jann Horn committed
      This replaces all code in fs/compat_ioctl.c that translated
      ioctl arguments into an in-kernel structure, then performed
      do_ioctl under set_fs(KERNEL_DS), with code that allocates
      data on the user stack and can call the VFS ioctl handler
      under USER_DS.
      
      This is done as a hardening measure because the caller
      does not know what kind of ioctl handler will be invoked,
      only that no corresponding compat_ioctl handler exists and
      what the ioctl command number is. The accidental
      invocation of an unlocked_ioctl handler that unexpectedly
      calls copy_to_user could be a severe security issue.
      Signed-off-by: Jann Horn <jann@thejh.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • compat_ioctl: don't pass fd around when not needed · 66cf191f
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • compat_ioctl: don't look up the fd twice · b4341721
      Jann Horn committed
      In code in fs/compat_ioctl.c that translates ioctl arguments
      into an in-kernel structure, then performs sys_ioctl, possibly
      under set_fs(KERNEL_DS), this commit changes the sys_ioctl
      calls to do_ioctl calls. do_ioctl is a new function that does
      the same thing as sys_ioctl, but doesn't look up the fd again.
      
      This change is made to avoid (potential) security issues
      because of ioctl handlers that accept one of the ioctl
      commands I2C_FUNCS, VIDEO_GET_EVENT, MTIOCPOS, MTIOCGET,
      TIOCGSERIAL, TIOCSSERIAL, RTC_IRQP_READ, RTC_EPOCH_READ.
      This can happen for multiple reasons:
      
       - The ioctl command number could be reused.
       - The ioctl handler might not check the full ioctl
         command. This is e.g. true for drm_ioctl.
       - The ioctl handler is very special, e.g. cuse_file_ioctl
      
      The real issue is that set_fs(KERNEL_DS) is used here,
      but that's fixed in a separate commit
      "compat_ioctl: don't call do_ioctl under set_fs(KERNEL_DS)".
      
      This change mitigates potential security issues by
      preventing a race that permits invocation of
      unlocked_ioctl handlers under KERNEL_DS through compat
      code even if a corresponding compat_ioctl handler exists.
      
      So far, no way has been identified to use this to damage
      kernel memory without having CAP_SYS_ADMIN in the init ns
      (with the capability, doing reads/writes at arbitrary
      kernel addresses should be easy through CUSE's ioctl
      handler with FUSE_IOCTL_UNRESTRICTED set).
      
      [AV: two missed sys_ioctl() taken care of]
      Signed-off-by: Jann Horn <jann@thejh.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  2. 07 Jan, 2016 2 commits
  3. 06 Jan, 2016 1 commit
  4. 04 Jan, 2016 7 commits
  5. 30 Dec, 2015 3 commits
    • ocfs2/dlm: clear migration_pending when migration target goes down · cc28d6d8
      xuejiufei committed
      We have found a BUG on res->migration_pending when migrating lock
      resources.  The situation is as follows.
      
      dlm_mark_lockres_migrating
        res->migration_pending = 1;
        __dlm_lockres_reserve_ast
        dlm_lockres_release_ast returns with res->migration_pending still set
            because other threads reserve asts
        wait dlm_migration_can_proceed returns 1
        >>>>>>> o2hb found that the target went down and removed it
                from domain_map
        dlm_migration_can_proceed returns 1
        dlm_mark_lockres_migrating returns -ESHUTDOWN with
            res->migration_pending still set.
      
      When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON
      with res->migration_pending.  So clear migration_pending when target is
      down.
      Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix flock panic issue · b5a8bc33
      Junxiao Bi committed
      Commit 4f656367 ("Move locks API users to locks_lock_inode_wait()")
      moved the flock/posix lock identification code to locks_lock_inode_wait(),
      but failed to set fl_flags to FL_FLOCK, which caused the following kernel
      panic on 4.4.0-rc5.
      
        kernel BUG at fs/locks.c:1895!
        invalid opcode: 0000 [#1] SMP
        Modules linked in: ocfs2(O) ocfs2_dlmfs(O) ocfs2_stack_o2cb(O) ocfs2_dlm(O) ocfs2_nodemanager(O) ocfs2_stackglue(O) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xen_kbdfront xen_netfront xen_fbfront xen_blkfront
        CPU: 0 PID: 20268 Comm: flock_unit_test Tainted: G           O    4.4.0-rc5-next-20151217 #1
        Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014
        task: ffff88007b3672c0 ti: ffff880028b58000 task.ti: ffff880028b58000
        RIP: locks_lock_inode_wait+0x2e/0x160
        Call Trace:
          ocfs2_do_flock+0x91/0x160 [ocfs2]
          ocfs2_flock+0x76/0xd0 [ocfs2]
          SyS_flock+0x10f/0x1a0
          entry_SYSCALL_64_fastpath+0x12/0x71
        Code: e5 41 57 41 56 49 89 fe 41 55 41 54 53 48 89 f3 48 81 ec 88 00 00 00 8b 46 40 83 e0 03 83 f8 01 0f 84 ad 00 00 00 83 f8 02 74 04 <0f> 0b eb fe 4c 8d ad 60 ff ff ff 4c 8d 7b 58 e8 0e 8e 73 00 4d
        RIP  locks_lock_inode_wait+0x2e/0x160
         RSP <ffff880028b5bce8>
        ---[ end trace dfca74ec9b5b274c ]---
      
      Fixes: 4f656367 ("Move locks API users to locks_lock_inode_wait()")
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix BUG when calculate new backup super · 5c9ee4cb
      Joseph Qi committed
      When resizing, ocfs2 first extends the last group descriptor (gd).  If a
      backup superblock belongs in that gd, it calculates the new backup super
      and updates the corresponding value.
      
      But it currently doesn't consider the case where the backup super has
      already been done.  In that case it still sets the bit in the gd bitmap
      and then decrements bg_free_bits_count, which leads to a corrupted gd and
      triggers the BUG in ocfs2_block_group_set_bits:
      
          BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);
      
      So check whether the backup super is done and then do the updates.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
      Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
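The fix described in the backup-super commit above amounts to testing the bitmap bit before setting it and adjusting the free count only when the bit was newly set. A minimal sketch of that check follows; `struct group_desc`, `gd_test_bit`, and `gd_reserve_backup_bit` are hypothetical names for illustration and do not mirror the on-disk ocfs2 layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a group descriptor: 256 bits plus a free count. */
struct group_desc {
	uint8_t bitmap[32];
	uint16_t free_bits_count;
};

/* Caller must pass bit < 256. */
static bool gd_test_bit(const struct group_desc *gd, unsigned int bit)
{
	return gd->bitmap[bit / 8] & (1u << (bit % 8));
}

/*
 * Mark the backup-super bit as used. Returns true if the bit was newly
 * set, false if it was already set, in which case the free count is
 * left untouched so it cannot drift out of sync with the bitmap.
 */
static bool gd_reserve_backup_bit(struct group_desc *gd, unsigned int bit)
{
	if (gd_test_bit(gd, bit))
		return false;	/* backup already accounted for */

	gd->bitmap[bit / 8] |= (uint8_t)(1u << (bit % 8));
	gd->free_bits_count--;
	return true;
}
```

Calling the helper twice for the same bit decrements the count only once, which is the invariant the BUG_ON in the commit message protects.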
  6. 23 Dec, 2015 1 commit
  7. 19 Dec, 2015 1 commit
  8. 17 Dec, 2015 1 commit
  9. 16 Dec, 2015 2 commits
    • Btrfs: check prepare_uptodate_page() error code earlier · bb1591b4
      Chris Mason committed
      prepare_pages() may end up calling prepare_uptodate_page() twice if our
      write only spans a single page.  But if the first call returns an error,
      our page will be unlocked and it's not safe to call it again.
      
      This bug goes all the way back to 2011, and it's not something commonly
      hit.
      
      While we're here, add a more explicit check for the page being truncated
      away.  The bare lock_page() alone is protected only by good thoughts and
      i_mutex, which we're sure to regret eventually.
      Reported-by: Dave Jones <dsj@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: check for empty bitmap list in setup_cluster_bitmaps · 1b9b922a
      Chris Mason committed
      Dave Jones found a warning from kasan in setup_cluster_bitmaps()
      
      ==================================================================
      BUG: KASAN: stack-out-of-bounds in setup_cluster_bitmap+0xc4/0x5a0 at
      addr ffff88039bef6828
      Read of size 8 by task nfsd/1009
      page:ffffea000e6fbd80 count:0 mapcount:0 mapping:          (null)
      index:0x0
      flags: 0x8000000000000000()
      page dumped because: kasan: bad access detected
      CPU: 1 PID: 1009 Comm: nfsd Tainted: G        W
      4.4.0-rc3-backup-debug+ #1
       ffff880065647b50 000000006bb712c2 ffff88039bef6640 ffffffffa680a43e
       0000004559c00000 ffff88039bef66c8 ffffffffa62638d1 ffffffffa61121c0
       ffff8803a5769de8 0000000000000296 ffff8803a5769df0 0000000000046280
      Call Trace:
       [<ffffffffa680a43e>] dump_stack+0x4b/0x6d
       [<ffffffffa62638d1>] kasan_report_error+0x501/0x520
       [<ffffffffa61121c0>] ? debug_show_all_locks+0x1e0/0x1e0
       [<ffffffffa6263948>] kasan_report+0x58/0x60
       [<ffffffffa6814b00>] ? rb_last+0x10/0x40
       [<ffffffffa66f8af4>] ? setup_cluster_bitmap+0xc4/0x5a0
       [<ffffffffa6262ead>] __asan_load8+0x5d/0x70
       [<ffffffffa66f8af4>] setup_cluster_bitmap+0xc4/0x5a0
       [<ffffffffa66f675a>] ? setup_cluster_no_bitmap+0x6a/0x400
       [<ffffffffa66fcd16>] btrfs_find_space_cluster+0x4b6/0x640
       [<ffffffffa66fc860>] ? btrfs_alloc_from_cluster+0x4e0/0x4e0
       [<ffffffffa66fc36e>] ? btrfs_return_cluster_to_free_space+0x9e/0xb0
       [<ffffffffa702dc37>] ? _raw_spin_unlock+0x27/0x40
       [<ffffffffa666a1a1>] find_free_extent+0xba1/0x1520
      
      Andrey noticed this was because we were doing list_first_entry on a list
      that might be empty.  Rework the tests a bit so we don't do that.
      Signed-off-by: Chris Mason <clm@fb.com>
      Reported-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Reported-by: Dave Jones <dsj@fb.com>
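The hazard named in the Btrfs commit above is taking the first entry of a list that may be empty: on an empty circular list, the "first" node is the list head itself, so converting it to its container yields a pointer into unrelated memory. A self-contained userspace sketch of the guarded access follows; it reimplements a tiny circular list for illustration, and `list_init`, `list_is_empty`, `list_append`, `struct bitmap_entry`, and `first_bitmap_or_null` are hypothetical names rather than the kernel's helpers.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its list member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static int list_is_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_append(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

struct bitmap_entry {
	unsigned long offset;
	struct list_head list;
};

/*
 * Return the first entry, or NULL when the list is empty. Without the
 * emptiness check, container_of() would be applied to the head itself.
 */
static struct bitmap_entry *first_bitmap_or_null(struct list_head *bitmaps)
{
	if (list_is_empty(bitmaps))
		return NULL;
	return container_of(bitmaps->next, struct bitmap_entry, list);
}
```

Callers then test for NULL instead of dereferencing a bogus entry, which is the kind of out-of-bounds read KASAN flagged in the trace.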
  10. 14 Dec, 2015 1 commit
  11. 13 Dec, 2015 2 commits
    • ocfs2: fix SGID not inherited issue · 854ee2e9
      Junxiao Bi committed
      Commit 8f1eb487 ("ocfs2: fix umask ignored issue") introduced an
      issue: the SGID of a subdirectory was not inherited from its parent
      directory.  This is because SGID is set in "inode->i_mode" in
      ocfs2_get_init_inode(), but is later overwritten by "mode", which does
      not have SGID set.
      
      Fixes: 8f1eb487 ("ocfs2: fix umask ignored issue")
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • osd fs: __r4w_get_page rely on PageUptodate for uptodate · 3066a967
      Hugh Dickins committed
      Commit 42cb14b1 ("mm: migrate dirty page without
      clear_page_dirty_for_io etc") simplified the migration of a PageDirty
      pagecache page: one stat needs moving from zone to zone and that's about
      all.
      
      It's convenient and safest for it to shift the PageDirty bit from old
      page to new, just before updating the zone stats: before copying data
      and marking the new PageUptodate.  This is all done while both pages are
      isolated and locked, just as before; and just as before, there's a
      moment when the new page is visible in the radix_tree, but not yet
      PageUptodate.  What's new is that it may now be briefly visible as
      PageDirty before it is PageUptodate.
      
      When I scoured the tree to see if this could cause a problem anywhere,
      the only places I found were in two similar functions __r4w_get_page():
      which look up a page with find_get_page() (not using page lock), then
      claim it's uptodate if it's PageDirty or PageWriteback or PageUptodate.
      
      I'm not sure whether that was right before, but now it might be wrong
      (on rare occasions): only claim the page is uptodate if PageUptodate.
      Or perhaps the page in question could never be migratable anyway?
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Tested-by: Boaz Harrosh <ooo@electrozaur.com>
      Cc: Benny Halevy <bhalevy@panasas.com>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 10 Dec, 2015 3 commits
    • btrfs: fix misleading warning when space cache failed to load · 94356889
      Holger Hoffstätte committed
      When an inconsistent space cache is detected during loading we log a
      warning that users frequently mistake for an instruction to invalidate the
      cache manually, even though this is not required. Fix the message to
      indicate that the cache will be rebuilt automatically.
      Signed-off-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
      Acked-by: Filipe Manana <fdmanana@suse.com>
    • Btrfs: fix transaction handle leak in balance · 8a7d656f
      Filipe Manana committed
      If we failed to allocate a new data chunk, we jumped to the error path
      without releasing the transaction handle we had obtained earlier. Fix this
      by always releasing it before doing the jump.
      
      Fixes: 2c9fe835 ("btrfs: Fix lost-data-profile caused by balance bg")
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
    • Btrfs: fix unprotected list move from unused_bgs to deleted_bgs list · 348a0013
      Filipe Manana committed
      As of my previous change titled "Btrfs: fix scrub preventing unused block
      groups from being deleted", the following warning at
      extent-tree.c:btrfs_delete_unused_bgs() can be hit when we mount a
      filesystem with "-o discard":
      
       void btrfs_delete_unused_bgs(struct btrfs_fs_info *fs_info)
       {
       (...)
                       if (trimming) {
                               WARN_ON(!list_empty(&block_group->bg_list));
                               spin_lock(&trans->transaction->deleted_bgs_lock);
                               list_move(&block_group->bg_list,
                                         &trans->transaction->deleted_bgs);
                               spin_unlock(&trans->transaction->deleted_bgs_lock);
                               btrfs_get_block_group(block_group);
                       }
       (...)
      
      This happens because scrub can now add back the block group to the list of
      unused block groups (fs_info->unused_bgs). This is dangerous because we
      are moving the block group from the unused block groups list to the list
      of deleted block groups without holding the lock that protects the source
      list (fs_info->unused_bgs_lock).
      
      The following diagram illustrates how this happens:
      
                  CPU 1                                     CPU 2
      
       cleaner_kthread()
         btrfs_delete_unused_bgs()
      
           sees bg X in list
            fs_info->unused_bgs
      
           deletes bg X from list
            fs_info->unused_bgs
      
                                                  scrub_enumerate_chunks()
      
                                                    searches device tree using
                                                    its commit root
      
                                                    finds device extent for
                                                    block group X
      
                                                    gets block group X from the tree
                                                    fs_info->block_group_cache_tree
                                                    (via btrfs_lookup_block_group())
      
                                                    sets bg X to RO (again)
      
                                                    scrub_chunk(bg X)
      
                                                    sets bg X back to RW mode
      
                                                    adds bg X to the list
                                                    fs_info->unused_bgs again,
                                                    since it's still unused and
                                                    currently not in that list
      
           sets bg X to RO mode
      
           btrfs_remove_chunk(bg X)
      
           --> discard is enabled and bg X
               is in the fs_info->unused_bgs
               list again so the warning is
               triggered
           --> we move it from that list into
               the transaction's deleted_bgs
               list, but we can have another
               task currently manipulating
               the first list (fs_info->unused_bgs)
      
      Fix this by using the same lock (fs_info->unused_bgs_lock) to protect both
      the list of unused block groups and the list of deleted block groups. This
      makes it safe and there's not much worry for more lock contention, as this
      lock is seldom used and only the cleaner kthread adds elements to the list
      of deleted block groups. The warning goes away too, as this was previously
      an impossible case (and would have been better as a BUG_ON/ASSERT) but it's
      not impossible anymore.
      Reproduced with fstest btrfs/073 (using MOUNT_OPTIONS="-o discard").
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
  13. 09 Dec, 2015 2 commits
    • fix the regression from "direct-io: Fix negative return from dio read beyond eof" · 2d4594ac
      Al Viro committed
      Sure, it's better to bail out of past-the-eof read and return 0 than return
      a bogus negative value on such.  Only we'd better make sure we are bailing out
      with 0 and not -ENOMEM...
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • 9p: ->evict_inode() should kick out ->i_data, not ->i_mapping · 4ad78628
      Al Viro committed
      For block devices the pagecache is associated with the inode
      on bdevfs, not with the aliasing ones on the mountable filesystems.
      The latter have their own ->i_data empty and ->i_mapping pointing
      to the (unique per major/minor) bdevfs inode.  That guarantees
      cache coherence between all block device inodes with the same
      device number.
      
      Eviction of an alias inode has no business trying to evict the
      pages belonging to bdevfs one; moreover, ->i_mapping is only
      safe to access when the thing is opened.  At the time of
      ->evict_inode() the victim is definitely *not* opened.  We are
      about to kill the address space embedded into struct inode
      (inode->i_data) and that's what we need to empty of any pages.
      
      The 9p instance tries to empty inode->i_mapping instead, which is
      both unsafe and bogus - if we have several device nodes with
      the same device number in different places, closing one of them
      should not try to empty the (shared) page cache.
      
      Fortunately, other instances in the tree are OK; they are
      evicting from &inode->i_data instead, as 9p one should.
      
      Cc: stable@vger.kernel.org # v2.6.32+, ones prior to 2.6.36 need only half of that
      Reported-by: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
      Tested-by: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  14. 08 Dec, 2015 1 commit
    • SUNRPC: Fix callback channel · 756b9b37
      Trond Myklebust committed
      The NFSv4.1 callback channel is currently broken: the received message
      keeps shrinking because the backchannel receive buffer size never gets
      reset.
      The easiest solution is to adjust the copied request rather than
      change the receive buffer.
      
      Fixes: 38b7631f ("nfs4: limit callback decoding to received bytes")
      Cc: Benjamin Coddington <bcodding@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  15. 07 Dec, 2015 1 commit