1. 12 1月, 2013 1 次提交
    • M
      mm: compaction: Partially revert capture of suitable high-order page · 47ecfcb7
      Mel Gorman 提交于
      Eric Wong reported on 3.7 and 3.8-rc2 that ppoll() got stuck when
      waiting for POLLIN on a local TCP socket.  It was easier to trigger if
      there was disk IO and dirty pages at the same time and he bisected it to
      commit 1fb3f8ca ("mm: compaction: capture a suitable high-order page
      immediately when it is made available").
      
      The intention of that patch was to improve high-order allocations under
      memory pressure after changes made to reclaim in 3.6 drastically hurt
      THP allocations but the approach was flawed.  For Eric, the problem was
      that page->pfmemalloc was not being cleared for captured pages leading
      to a poor interaction with swap-over-NFS support causing the packets to
      be dropped.  However, I identified a few more problems with the patch
      including the fact that it can increase contention on zone->lock in some
      cases which could result in async direct compaction being aborted early.
      
      In retrospect the capture patch took the wrong approach.  What it should
      have done is mark the pageblock being migrated as MIGRATE_ISOLATE if it
      was allocating for THP and avoided races that way.  While the patch was
      showing to improve allocation success rates at the time, the benefit is
      marginal given the relative complexity and it should be revisited from
      scratch in the context of the other reclaim-related changes that have
      taken place since the patch was first written and tested.  This patch
      partially reverts commit 1fb3f8ca "mm: compaction: capture a suitable
      high-order page immediately when it is made available".
      Reported-and-tested-by: NEric Wong <normalperson@yhbt.net>
      Tested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      47ecfcb7
  2. 10 1月, 2013 1 次提交
  3. 05 1月, 2013 2 次提交
    • M
      mm: limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT · 53a59fc6
      Michal Hocko 提交于
      Since commit e303297e ("mm: extended batches for generic
      mmu_gather") we are batching pages to be freed until either
      tlb_next_batch cannot allocate a new batch or we are done.
      
      This works just fine most of the time but we can get in troubles with
      non-preemptible kernel (CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY)
      on large machines where too aggressive batching might lead to soft
      lockups during process exit path (exit_mmap) because there are no
      scheduling points down the free_pages_and_swap_cache path and so the
      freeing can take long enough to trigger the soft lockup.
      
      The lockup is harmless except when the system is setup to panic on
      softlockup which is not that unusual.
      
      The simplest way to work around this issue is to limit the maximum
      number of batches in a single mmu_gather.  10k of collected pages should
      be safe to prevent from soft lockups (we would have 2ms for one) even if
      they are all freed without an explicit scheduling point.
      
      This patch doesn't add any new explicit scheduling points because it
      relies on zap_pmd_range during page tables zapping which calls
      cond_resched per PMD.
      
      The following lockup has been reported for 3.0 kernel with a huge
      process (in order of hundreds gigs but I do know any more details).
      
        BUG: soft lockup - CPU#56 stuck for 22s! [kernel:31053]
        Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc mptctl mptbase autofs4 binfmt_misc dm_round_robin dm_multipath bonding cpufreq_conservative cpufreq_userspace cpufreq_powersave pcc_cpufreq mperf microcode fuse loop osst sg sd_mod crc_t10dif st qla2xxx scsi_transport_fc scsi_tgt netxen_nic i7core_edac iTCO_wdt joydev e1000e serio_raw pcspkr edac_core iTCO_vendor_support acpi_power_meter rtc_cmos hpwdt hpilo button container usbhid hid dm_mirror dm_region_hash dm_log linear uhci_hcd ehci_hcd usbcore usb_common scsi_dh_emc scsi_dh_alua scsi_dh_hp_sw scsi_dh_rdac scsi_dh dm_snapshot pcnet32 mii edd dm_mod raid1 ext3 mbcache jbd fan thermal processor thermal_sys hwmon cciss scsi_mod
        Supported: Yes
        CPU 56
        Pid: 31053, comm: kernel Not tainted 3.0.31-0.9-default #1 HP ProLiant DL580 G7
        RIP: 0010:  _raw_spin_unlock_irqrestore+0x8/0x10
        RSP: 0018:ffff883ec1037af0  EFLAGS: 00000206
        RAX: 0000000000000e00 RBX: ffffea01a0817e28 RCX: ffff88803ffd9e80
        RDX: 0000000000000200 RSI: 0000000000000206 RDI: 0000000000000206
        RBP: 0000000000000002 R08: 0000000000000001 R09: ffff887ec724a400
        R10: 0000000000000000 R11: dead000000200200 R12: ffffffff8144c26e
        R13: 0000000000000030 R14: 0000000000000297 R15: 000000000000000e
        FS:  00007ed834282700(0000) GS:ffff88c03f200000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 000000000068b240 CR3: 0000003ec13c5000 CR4: 00000000000006e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Process kernel (pid: 31053, threadinfo ffff883ec1036000, task ffff883ebd5d4100)
        Call Trace:
          release_pages+0xc5/0x260
          free_pages_and_swap_cache+0x9d/0xc0
          tlb_flush_mmu+0x5c/0x80
          tlb_finish_mmu+0xe/0x50
          exit_mmap+0xbd/0x120
          mmput+0x49/0x120
          exit_mm+0x122/0x160
          do_exit+0x17a/0x430
          do_group_exit+0x3d/0xb0
          get_signal_to_deliver+0x247/0x480
          do_signal+0x71/0x1b0
          do_notify_resume+0x98/0xb0
          int_signal+0x12/0x17
        DWARF2 unwinder stuck at int_signal+0x12/0x17
      Signed-off-by: NMichal Hocko <mhocko@suse.cz>
      Cc: <stable@vger.kernel.org>	[3.0+]
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      53a59fc6
    • B
      mm: fix zone_watermark_ok_safe() accounting of isolated pages · a458431e
      Bartlomiej Zolnierkiewicz 提交于
      Commit 702d1a6e ("memory-hotplug: fix kswapd looping forever
      problem") added an isolated pageblocks counter (nr_pageblock_isolate in
      struct zone) and used it to adjust free pages counter in
      zone_watermark_ok_safe() to prevent kswapd looping forever problem.
      
      Then later, commit 2139cbe6 ("cma: fix counting of isolated pages")
      fixed accounting of isolated pages in global free pages counter.  It
      made the previous zone_watermark_ok_safe() fix unnecessary and
      potentially harmful (cause now isolated pages may be accounted twice
      making free pages counter incorrect).
      
      This patch removes the special isolated pageblocks counter altogether
      which fixes zone_watermark_ok_safe() free pages check.
      Reported-by: NTomasz Stanislawski <t.stanislaws@samsung.com>
      Signed-off-by: NBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: NKyungmin Park <kyungmin.park@samsung.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Aaditya Kumar <aaditya.kumar.30@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a458431e
  4. 04 1月, 2013 1 次提交
    • G
      MM: vmscan: remove __devinit attribute. · fcb35a9b
      Greg Kroah-Hartman 提交于
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit from the file.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fcb35a9b
  5. 03 1月, 2013 3 次提交
    • M
      mm: mempolicy: Convert shared_policy mutex to spinlock · 42288fe3
      Mel Gorman 提交于
      Sasha was fuzzing with trinity and reported the following problem:
      
        BUG: sleeping function called from invalid context at kernel/mutex.c:269
        in_atomic(): 1, irqs_disabled(): 0, pid: 6361, name: trinity-main
        2 locks held by trinity-main/6361:
         #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff810aa314>] __do_page_fault+0x1e4/0x4f0
         #1:  (&(&mm->page_table_lock)->rlock){+.+...}, at: [<ffffffff8122f017>] handle_pte_fault+0x3f7/0x6a0
        Pid: 6361, comm: trinity-main Tainted: G        W
        3.7.0-rc2-next-20121024-sasha-00001-gd95ef01-dirty #74
        Call Trace:
          __might_sleep+0x1c3/0x1e0
          mutex_lock_nested+0x29/0x50
          mpol_shared_policy_lookup+0x2e/0x90
          shmem_get_policy+0x2e/0x30
          get_vma_policy+0x5a/0xa0
          mpol_misplaced+0x41/0x1d0
          handle_pte_fault+0x465/0x6a0
      
      This was triggered by a different version of automatic NUMA balancing
      but in theory the current version is vunerable to the same problem.
      
      do_numa_page
        -> numa_migrate_prep
          -> mpol_misplaced
            -> get_vma_policy
              -> shmem_get_policy
      
      It's very unlikely this will happen as shared pages are not marked
      pte_numa -- see the page_mapcount() check in change_pte_range() -- but
      it is possible.
      
      To address this, this patch restores sp->lock as originally implemented
      by Kosaki Motohiro.  In the path where get_vma_policy() is called, it
      should not be calling sp_alloc() so it is not necessary to treat the PTL
      specially.
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Tested-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      42288fe3
    • H
      mempolicy: remove arg from mpol_parse_str, mpol_to_str · a7a88b23
      Hugh Dickins 提交于
      Remove the unused argument (formerly no_context) from mpol_parse_str()
      and from mpol_to_str().
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a7a88b23
    • H
      tmpfs mempolicy: fix /proc/mounts corrupting memory · f2a07f40
      Hugh Dickins 提交于
      Recently I suggested using "mount -o remount,mpol=local /tmp" in NUMA
      mempolicy testing.  Very nasty.  Reading /proc/mounts, /proc/pid/mounts
      or /proc/pid/mountinfo may then corrupt one bit of kernel memory, often
      in a page table (causing "Bad swap" or "Bad page map" warning or "Bad
      pagetable" oops), sometimes in a vm_area_struct or rbnode or somewhere
      worse.  "mpol=prefer" and "mpol=prefer:Node" are equally toxic.
      
      Recent NUMA enhancements are not to blame: this dates back to 2.6.35,
      when commit e17f74af "mempolicy: don't call mpol_set_nodemask() when
      no_context" skipped mpol_parse_str()'s call to mpol_set_nodemask(),
      which used to initialize v.preferred_node, or set MPOL_F_LOCAL in flags.
      With slab poisoning, you can then rely on mpol_to_str() to set the bit
      for node 0x6b6b, probably in the next page above the caller's stack.
      
      mpol_parse_str() is only called from shmem_parse_options(): no_context
      is always true, so call it unused for now, and remove !no_context code.
      Set v.nodes or v.preferred_node or MPOL_F_LOCAL as mpol_to_str() might
      expect.  Then mpol_to_str() can ignore its no_context argument also,
      the mpol being appropriately initialized whether contextualized or not.
      Rename its no_context unused too, and let subsequent patch remove them
      (that's not needed for stable backporting, which would involve rejects).
      
      I don't understand why MPOL_LOCAL is described as a pseudo-policy:
      it's a reasonable policy which suffers from a confusing implementation
      in terms of MPOL_PREFERRED with MPOL_F_LOCAL.  I believe this would be
      much more robust if MPOL_LOCAL were recognized in switch statements
      throughout, MPOL_F_LOCAL deleted, and MPOL_PREFERRED use the (possibly
      empty) nodes mask like everyone else, instead of its preferred_node
      variant (I presume an optimization from the days before MPOL_LOCAL).
      But that would take me too long to get right and fully tested.
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f2a07f40
  6. 29 12月, 2012 1 次提交
  7. 24 12月, 2012 1 次提交
  8. 21 12月, 2012 6 次提交
  9. 20 12月, 2012 2 次提交
  10. 19 12月, 2012 22 次提交
    • F
      mm/vmscan.c: avoid possible deadlock caused by too_many_isolated() · 3cf23841
      Fengguang Wu 提交于
      Neil found that if too_many_isolated() returns true while performing
      direct reclaim we can end up waiting for other threads to complete their
      direct reclaim.  If those threads are allowed to enter the FS or IO to
      free memory, but this thread is not, then it is possible that those
      threads will be waiting on this thread and so we get a circular deadlock.
      
      some task enters direct reclaim with GFP_KERNEL
        => too_many_isolated() false
          => vmscan and run into dirty pages
            => pageout()
              => take some FS lock
                => fs/block code does GFP_NOIO allocation
                  => enter direct reclaim again
                    => too_many_isolated() true
                      => waiting for others to progress, however the other
                         tasks may be circular waiting for the FS lock..
      
      The fix is to let !__GFP_IO and !__GFP_FS direct reclaims enjoy higher
      priority than normal ones, by lowering the throttle threshold for the
      latter.
      
      Allowing ~1/8 isolated pages in normal is large enough.  For example, for
      a 1GB LRU list, that's ~128MB isolated pages, or 1k blocked tasks (each
      isolates 32 4KB pages), or 64 blocked tasks per logical CPU (assuming 16
      logical CPUs per NUMA node).  So it's not likely some CPU goes idle
      waiting (when it could make progress) because of this limit: there are
      much more sleeping reclaim tasks than the number of CPU, so the task may
      well be blocked by some low level queue/lock anyway.
      
      Now !GFP_IOFS reclaims won't be waiting for GFP_IOFS reclaims to progress.
       They will be blocked only when there are too many concurrent !GFP_IOFS
      reclaims, however that's very unlikely because the IO-less direct reclaims
      is able to progress much more faster, and they won't deadlock each other.
      The threshold is raised high enough for them, so that there can be
      sufficient parallel progress of !GFP_IOFS reclaims.
      
      [akpm@linux-foundation.org: tweak comment]
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Cc: Torsten Kaiser <just.for.lkml@googlemail.com>
      Tested-by: NNeilBrown <neilb@suse.de>
      Reviewed-by: NMinchan Kim <minchan.kim@gmail.com>
      Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: NRik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3cf23841
    • F
      vmscan: comment too_many_isolated() · d37dd5dc
      Fengguang Wu 提交于
      Comment "Why it's doing so" rather than "What it does" as proposed by
      Andrew Morton.
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Reviewed-by: NMinchan Kim <minchan.kim@gmail.com>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d37dd5dc
    • A
      mm/kmemleak.c: remove obsolete simple_strtoul · dc053733
      Abhijit Pawar 提交于
      Replace the obsolete simple_strtoul() with kstrtoul().
      Signed-off-by: NAbhijit Pawar <abhi.c.pawar@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dc053733
    • T
      mm/memory_hotplug.c: improve comments · 79a4dcef
      Tang Chen 提交于
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      79a4dcef
    • J
      mm/hugetlb: create hugetlb cgroup file in hugetlb_init · 7179e7bf
      Jianguo Wu 提交于
      Build kernel with CONFIG_HUGETLBFS=y,CONFIG_HUGETLB_PAGE=y and
      CONFIG_CGROUP_HUGETLB=y, then specify hugepagesz=xx boot option, system
      will fail to boot.
      
      This failure is caused by following code path:
      
        setup_hugepagesz
          hugetlb_add_hstate
            hugetlb_cgroup_file_init
              cgroup_add_cftypes
                kzalloc <--slab is *not available* yet
      
      For this path, slab is not available yet, so memory allocated will be
      failed, and cause WARN_ON() in hugetlb_cgroup_file_init().
      
      So I move hugetlb_cgroup_file_init() into hugetlb_init().
      
      [akpm@linux-foundation.org: tweak coding-style, remove pointless __init on inlined function]
      [akpm@linux-foundation.org: fix warning]
      Signed-off-by: NJianguo Wu <wujianguo@huawei.com>
      Signed-off-by: NJiang Liu <jiang.liu@huawei.com>
      Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7179e7bf
    • A
      mm/mprotect.c: coding-style cleanups · 7d12efae
      Andrew Morton 提交于
      A few gremlins have recently crept in.
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7d12efae
    • G
      slub: drop mutex before deleting sysfs entry · 5413dfba
      Glauber Costa 提交于
      Sasha Levin recently reported a lockdep problem resulting from the new
      attribute propagation introduced by kmemcg series.  In short, slab_mutex
      will be called from within the sysfs attribute store function.  This will
      create a dependency, that will later be held backwards when a cache is
      destroyed - since destruction occurs with the slab_mutex held, and then
      calls in to the sysfs directory removal function.
      
      In this patch, I propose to adopt a strategy close to what
      __kmem_cache_create does before calling sysfs_slab_add, and release the
      lock before the call to sysfs_slab_remove.  This is pretty much the last
      operation in the kmem_cache_shutdown() path, so we could do better by
      splitting this and moving this call alone to later on.  This will fit
      nicely when sysfs handling is consistent between all caches, but will look
      weird now.
      
      Lockdep info:
      
        ======================================================
        [ INFO: possible circular locking dependency detected ]
        3.7.0-rc4-next-20121106-sasha-00008-g353b62f #117 Tainted: G        W
        -------------------------------------------------------
        trinity-child13/6961 is trying to acquire lock:
         (s_active#43){++++.+}, at:  sysfs_addrm_finish+0x31/0x60
      
        but task is already holding lock:
         (slab_mutex){+.+.+.}, at:  kmem_cache_destroy+0x22/0xe0
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
        -> #1 (slab_mutex){+.+.+.}:
                lock_acquire+0x1aa/0x240
                __mutex_lock_common+0x59/0x5a0
                mutex_lock_nested+0x3f/0x50
                slab_attr_store+0xde/0x110
                sysfs_write_file+0xfa/0x150
                vfs_write+0xb0/0x180
                sys_pwrite64+0x60/0xb0
                tracesys+0xe1/0xe6
        -> #0 (s_active#43){++++.+}:
                __lock_acquire+0x14df/0x1ca0
                lock_acquire+0x1aa/0x240
                sysfs_deactivate+0x122/0x1a0
                sysfs_addrm_finish+0x31/0x60
                sysfs_remove_dir+0x89/0xd0
                kobject_del+0x16/0x40
                __kmem_cache_shutdown+0x40/0x60
                kmem_cache_destroy+0x40/0xe0
                mon_text_release+0x78/0xe0
                __fput+0x122/0x2d0
                ____fput+0x9/0x10
                task_work_run+0xbe/0x100
                do_exit+0x432/0xbd0
                do_group_exit+0x84/0xd0
                get_signal_to_deliver+0x81d/0x930
                do_signal+0x3a/0x950
                do_notify_resume+0x3e/0x90
                int_signal+0x12/0x17
      
        other info that might help us debug this:
      
         Possible unsafe locking scenario:
      
               CPU0                    CPU1
               ----                    ----
          lock(slab_mutex);
                                       lock(s_active#43);
                                       lock(slab_mutex);
          lock(s_active#43);
      
         *** DEADLOCK ***
      
        2 locks held by trinity-child13/6961:
         #0:  (mon_lock){+.+.+.}, at:  mon_text_release+0x25/0xe0
         #1:  (slab_mutex){+.+.+.}, at:  kmem_cache_destroy+0x22/0xe0
      
        stack backtrace:
        Pid: 6961, comm: trinity-child13 Tainted: G        W    3.7.0-rc4-next-20121106-sasha-00008-g353b62f #117
        Call Trace:
          print_circular_bug+0x1fb/0x20c
          __lock_acquire+0x14df/0x1ca0
          lock_acquire+0x1aa/0x240
          sysfs_deactivate+0x122/0x1a0
          sysfs_addrm_finish+0x31/0x60
          sysfs_remove_dir+0x89/0xd0
          kobject_del+0x16/0x40
          __kmem_cache_shutdown+0x40/0x60
          kmem_cache_destroy+0x40/0xe0
          mon_text_release+0x78/0xe0
          __fput+0x122/0x2d0
          ____fput+0x9/0x10
          task_work_run+0xbe/0x100
          do_exit+0x432/0xbd0
          do_group_exit+0x84/0xd0
          get_signal_to_deliver+0x81d/0x930
          do_signal+0x3a/0x950
          do_notify_resume+0x3e/0x90
          int_signal+0x12/0x17
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Reported-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5413dfba
    • G
      memcg: add comments clarifying aspects of cache attribute propagation · ebe945c2
      Glauber Costa 提交于
      This patch clarifies two aspects of cache attribute propagation.
      
      First, the expected context for the for_each_memcg_cache macro in
      memcontrol.h.  The usages already in the codebase are safe.  In mm/slub.c,
      it is trivially safe because the lock is acquired right before the loop.
      In mm/slab.c, it is less so: the lock is acquired by an outer function a
      few steps back in the stack, so a VM_BUG_ON() is added to make sure it is
      indeed safe.
      
      A comment is also added to detail why we are returning the value of the
      parent cache and ignoring the children's when we propagate the attributes.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ebe945c2
    • G
      slub: slub-specific propagation changes · 107dab5c
      Glauber Costa 提交于
      SLUB allows us to tune a particular cache behavior with sysfs-based
      tunables.  When creating a new memcg cache copy, we'd like to preserve any
      tunables the parent cache already had.
      
      This can be done by tapping into the store attribute function provided by
      the allocator.  We of course don't need to mess with read-only fields.
      Since the attributes can have multiple types and are stored internally by
      sysfs, the best strategy is to issue a ->show() in the root cache, and
      then ->store() in the memcg cache.
      
      The drawback of that, is that sysfs can allocate up to a page in buffering
      for show(), that we are likely not to need, but also can't guarantee.  To
      avoid always allocating a page for that, we can update the caches at store
      time with the maximum attribute size ever stored to the root cache.  We
      will then get a buffer big enough to hold it.  The corolary to this, is
      that if no stores happened, nothing will be propagated.
      
      It can also happen that a root cache has its tunables updated during
      normal system operation.  In this case, we will propagate the change to
      all caches that are already active.
      
      [akpm@linux-foundation.org: tweak code to avoid __maybe_unused]
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      107dab5c
    • G
      slab: propagate tunable values · 943a451a
      Glauber Costa 提交于
      SLAB allows us to tune a particular cache behavior with tunables.  When
      creating a new memcg cache copy, we'd like to preserve any tunables the
      parent cache already had.
      
      This could be done by an explicit call to do_tune_cpucache() after the
      cache is created.  But this is not very convenient now that the caches are
      created from common code, since this function is SLAB-specific.
      
      Another method of doing that is taking advantage of the fact that
      do_tune_cpucache() is always called from enable_cpucache(), which is
      called at cache initialization.  We can just preset the values, and then
      things work as expected.
      
      It can also happen that a root cache has its tunables updated during
      normal system operation.  In this case, we will propagate the change to
      all caches that are already active.
      
      This change will require us to move the assignment of root_cache in
      memcg_params a bit earlier.  We need this to be already set - which
      memcg_kmem_register_cache will do - when we reach __kmem_cache_create()
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      943a451a
    • G
      memcg: aggregate memcg cache values in slabinfo · 749c5415
      Glauber Costa 提交于
      When we create caches in memcgs, we need to display their usage
      information somewhere.  We'll adopt a scheme similar to /proc/meminfo,
      with aggregate totals shown in the global file, and per-group information
      stored in the group itself.
      
      For the time being, only reads are allowed in the per-group cache.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      749c5415
    • G
      memcg/sl[au]b: shrink dead caches · 22933152
      Glauber Costa 提交于
      This means that when we destroy a memcg cache that happened to be empty,
      those caches may take a lot of time to go away: removing the memcg
      reference won't destroy them - because there are pending references, and
      the empty pages will stay there, until a shrinker is called upon for any
      reason.
      
      In this patch, we will call kmem_cache_shrink() for all dead caches that
      cannot be destroyed because of remaining pages.  After shrinking, it is
      possible that it could be freed.  If this is not the case, we'll schedule
      a lazy worker to keep trying.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      22933152
    • G
      memcg/sl[au]b: track all the memcg children of a kmem_cache · 7cf27982
      Glauber Costa 提交于
      This enables us to remove all the children of a kmem_cache being
      destroyed, if for example the kernel module it's being used in gets
      unloaded.  Otherwise, the children will still point to the destroyed
      parent.
      Signed-off-by: NSuleiman Souhlal <suleiman@google.com>
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7cf27982
    • G
      memcg: destroy memcg caches · 1f458cbf
      Glauber Costa 提交于
      Implement destruction of memcg caches.  Right now, only caches where our
      reference counter is the last remaining are deleted.  If there are any
      other reference counters around, we just leave the caches lying around
      until they go away.
      
      When that happens, a destruction function is called from the cache code.
      Caches are only destroyed in process context, so we queue them up for
      later processing in the general case.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1f458cbf
    • G
      sl[au]b: allocate objects from memcg cache · d79923fa
      Glauber Costa 提交于
      We are able to match a cache allocation to a particular memcg.  If the
      task doesn't change groups during the allocation itself - a rare event,
      this will give us a good picture about who is the first group to touch a
      cache page.
      
      This patch uses the now available infrastructure by calling
      memcg_kmem_get_cache() before all the cache allocations.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d79923fa
    • G
      sl[au]b: always get the cache from its page in kmem_cache_free() · b9ce5ef4
      Glauber Costa 提交于
      struct page already has this information.  If we start chaining caches,
      this information will always be more trustworthy than whatever is passed
      into the function.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b9ce5ef4
    • G
      memcg: skip memcg kmem allocations in specified code regions · 0e9d92f2
      Glauber Costa 提交于
      Create a mechanism that skip memcg allocations during certain pieces of
      our core code.  It basically works in the same way as
      preempt_disable()/preempt_enable(): By marking a region under which all
      allocations will be accounted to the root memcg.
      
      We need this to prevent races in early cache creation, when we
      allocate data using caches that are not necessarily created already.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      yCc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0e9d92f2
    • G
      memcg: infrastructure to match an allocation to the right cache · d7f25f8a
      Glauber Costa 提交于
      The page allocator is able to bind a page to a memcg when it is
      allocated.  But for the caches, we'd like to have as many objects as
      possible in a page belonging to the same cache.
      
      This is done in this patch by calling memcg_kmem_get_cache in the
      beginning of every allocation function.  This function is patched out by
      static branches when kernel memory controller is not being used.
      
      It assumes that the task allocating, which determines the memcg in the
      page allocator, belongs to the same cgroup throughout the whole process.
      Misaccounting can happen if the task calls memcg_kmem_get_cache() while
      belonging to a cgroup, and later on changes.  This is considered
      acceptable, and should only happen upon task migration.
      
      Before the cache is created by the memcg core, there is also a possible
      imbalance: the task belongs to a memcg, but the cache being allocated from
      is the global cache, since the child cache is not yet guaranteed to be
      ready.  This case is also fine, since in this case the GFP_KMEMCG will not
      be passed and the page allocator will not attempt any cgroup accounting.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d7f25f8a
    • G
      memcg: allocate memory for memcg caches whenever a new memcg appears · 55007d84
      Glauber Costa 提交于
      Every cache that is considered a root cache (basically the "original"
      caches, tied to the root memcg/no-memcg) will have an array that should be
      large enough to store a cache pointer per each memcg in the system.
      
      Theoreticaly, this is as high as 1 << sizeof(css_id), which is currently
      in the 64k pointers range.  Most of the time, we won't be using that much.
      
      What goes in this patch, is a simple scheme to dynamically allocate such
      an array, in order to minimize memory usage for memcg caches.  Because we
      would also like to avoid allocations all the time, at least for now, the
      array will only grow.  It will tend to be big enough to hold the maximum
      number of kmem-limited memcgs ever achieved.
      
      We'll allocate it to be a minimum of 64 kmem-limited memcgs.  When we have
      more than that, we'll start doubling the size of this array every time the
      limit is reached.
      
      Because we are only considering kmem limited memcgs, a natural point for
      this to happen is when we write to the limit.  At that point, we already
      have set_limit_mutex held, so that will become our natural synchronization
      mechanism.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      55007d84
    • G
      slab/slub: consider a memcg parameter in kmem_create_cache · 2633d7a0
      Glauber Costa 提交于
      Allow a memcg parameter to be passed during cache creation.  When the slub
      allocator is being used, it will only merge caches that belong to the same
      memcg.  We'll do this by scanning the global list, and then translating
      the cache to a memcg-specific cache
      
      Default function is created as a wrapper, passing NULL to the memcg
      version.  We only merge caches that belong to the same memcg.
      
      A helper is provided, memcg_css_id: because slub needs a unique cache name
      for sysfs.  Since this is visible, but not the canonical location for slab
      data, the cache name is not used, the css_id should suffice.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2633d7a0
    • G
      slab: annotate on-slab caches nodelist locks · 6ccfb5bc
      Glauber Costa 提交于
      We currently provide lockdep annotation for kmalloc caches, and also
      caches that have SLAB_DEBUG_OBJECTS enabled.  The reason for this is that
      we can quite frequently nest in the l3->list_lock lock, which is not
      something trivial to avoid.
      
      My proposal with this patch, is to extend this to caches whose slab
      management object lives within the slab as well ("on_slab").  The need for
      this arose in the context of testing kmemcg-slab patches.  With such
      patchset, we can have per-memcg kmalloc caches.  So the same path that led
      to nesting between kmalloc caches will could then lead to in-memcg
      nesting.  Because they are not annotated, lockdep will trigger.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6ccfb5bc
    • G
      slab/slub: struct memcg_params · ba6c496e
      Glauber Costa 提交于
      For the kmem slab controller, we need to record some extra information in
      the kmem_cache structure.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Signed-off-by: NSuleiman Souhlal <suleiman@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ba6c496e