1. 27 7月, 2016 37 次提交
  2. 23 7月, 2016 2 次提交
    • A
      radix-tree: fix radix_tree_iter_retry() for tagged iterators. · 3cb9185c
      Andrey Ryabinin 提交于
      radix_tree_iter_retry() resets slot to NULL, but it doesn't reset tags.
      Then NULL slot and non-zero iter.tags passed to radix_tree_next_slot()
      leading to crash:
      
        RIP: radix_tree_next_slot include/linux/radix-tree.h:473
          find_get_pages_tag+0x334/0x930 mm/filemap.c:1452
        ....
        Call Trace:
          pagevec_lookup_tag+0x3a/0x80 mm/swap.c:960
          mpage_prepare_extent_to_map+0x321/0xa90 fs/ext4/inode.c:2516
          ext4_writepages+0x10be/0x2b20 fs/ext4/inode.c:2736
          do_writepages+0x97/0x100 mm/page-writeback.c:2364
          __filemap_fdatawrite_range+0x248/0x2e0 mm/filemap.c:300
          filemap_write_and_wait_range+0x121/0x1b0 mm/filemap.c:490
          ext4_sync_file+0x34d/0xdb0 fs/ext4/fsync.c:115
          vfs_fsync_range+0x10a/0x250 fs/sync.c:195
          vfs_fsync fs/sync.c:209
          do_fsync+0x42/0x70 fs/sync.c:219
          SYSC_fdatasync fs/sync.c:232
          SyS_fdatasync+0x19/0x20 fs/sync.c:230
          entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207
      
      We must reset iterator's tags to bail out from radix_tree_next_slot()
      and go to the slow-path in radix_tree_next_chunk().
      
      Fixes: 46437f9a ("radix-tree: fix race in gang lookup")
      Link: http://lkml.kernel.org/r/1468495196-10604-1-git-send-email-aryabinin@virtuozzo.comSigned-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Acked-by: NKonstantin Khlebnikov <koct9i@gmail.com>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3cb9185c
    • J
      mm: memcontrol: fix cgroup creation failure after many small jobs · 73f576c0
      Johannes Weiner 提交于
      The memory controller has quite a bit of state that usually outlives the
      cgroup and pins its CSS until said state disappears.  At the same time
      it imposes a 16-bit limit on the CSS ID space to economically store IDs
      in the wild.  Consequently, when we use cgroups to contain frequent but
      small and short-lived jobs that leave behind some page cache, we quickly
      run into the 64k limitations of outstanding CSSs.  Creating a new cgroup
      fails with -ENOSPC while there are only a few, or even no user-visible
      cgroups in existence.
      
      Although pinning CSSs past cgroup removal is common, there are only two
      instances that actually need an ID after a cgroup is deleted: cache
      shadow entries and swapout records.
      
      Cache shadow entries reference the ID weakly and can deal with the CSS
      having disappeared when it's looked up later.  They pose no hurdle.
      
      Swap-out records do need to pin the css to hierarchically attribute
      swapins after the cgroup has been deleted; though the only pages that
      remain swapped out after offlining are tmpfs/shmem pages.  And those
      references are under the user's control, so they are manageable.
      
      This patch introduces a private 16-bit memcg ID and switches swap and
      cache shadow entries over to using that.  This ID can then be recycled
      after offlining when the CSS remains pinned only by objects that don't
      specifically need it.
      
      This script demonstrates the problem by faulting one cache page in a new
      cgroup and deleting it again:
      
        set -e
        mkdir -p pages
        for x in `seq 128000`; do
          [ $((x % 1000)) -eq 0 ] && echo $x
          mkdir /cgroup/foo
          echo $$ >/cgroup/foo/cgroup.procs
          echo trex >pages/$x
          echo $$ >/cgroup/cgroup.procs
          rmdir /cgroup/foo
        done
      
      When run on an unpatched kernel, we eventually run out of possible IDs
      even though there are no visible cgroups:
      
        [root@ham ~]# ./cssidstress.sh
        [...]
        65000
        mkdir: cannot create directory '/cgroup/foo': No space left on device
      
      After this patch, the IDs get released upon cgroup destruction and the
      cache and css objects get released once memory reclaim kicks in.
      
      [hannes@cmpxchg.org: init the IDR]
        Link: http://lkml.kernel.org/r/20160621154601.GA22431@cmpxchg.org
      Fixes: b2052564 ("mm: memcontrol: continue cache reclaim from offlined groups")
      Link: http://lkml.kernel.org/r/20160617162516.GD19084@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reported-by: NJohn Garcia <john.garcia@mesosphere.io>
      Reviewed-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Cc: Nikolay Borisov <kernel@kyup.com>
      Cc: <stable@vger.kernel.org>	[3.19+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      73f576c0
  3. 17 7月, 2016 1 次提交
    • P
      vlan: use a valid default mtu value for vlan over macsec · 18d3df3e
      Paolo Abeni 提交于
      macsec can't cope with mtu frames which need vlan tag insertion, and
      vlan device set the default mtu equal to the underlying dev's one.
      By default vlan over macsec devices use invalid mtu, dropping
      all the large packets.
      This patch adds a netif helper to check if an upper vlan device
      needs mtu reduction. The helper is used during vlan devices
      initialization to set a valid default and during mtu updating to
      forbid invalid, too bit, mtu values.
      The helper currently only check if the lower dev is a macsec device,
      if we get more users, we need to update only the helper (possibly
      reserving an additional IFF bit).
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      18d3df3e