1. 14 1月, 2014 1 次提交
  2. 29 12月, 2013 1 次提交
    • L
      slub: Fix calculation of cpu slabs · 8afb1474
      Li Zefan 提交于
        /sys/kernel/slab/:t-0000048 # cat cpu_slabs
        231 N0=16 N1=215
        /sys/kernel/slab/:t-0000048 # cat slabs
        145 N0=36 N1=109
      
      See, the number of slabs is smaller than that of cpu slabs.
      
      The bug was introduced by commit 49e22585
      ("slub: per cpu cache for partial pages").
      
      We should use page->pages instead of page->pobjects when calculating
      the number of cpu partial slabs. This also fixes the mapping of slabs
      and nodes.
      
      As there's no variable storing the number of total/active objects in
      cpu partial slabs, and we don't have user interfaces requiring those
      statistics, I just add WARN_ON for those cases.
      
      Cc: <stable@vger.kernel.org> # 3.2+
      Acked-by: NChristoph Lameter <cl@linux.com>
      Reviewed-by: NWanpeng Li <liwanp@linux.vnet.ibm.com>
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      8afb1474
  3. 13 11月, 2013 1 次提交
  4. 12 11月, 2013 2 次提交
    • Z
      mm, slub: fix the typo in mm/slub.c · 721ae22a
      Zhi Yong Wu 提交于
      Acked-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NZhi Yong Wu <wuzhy@linux.vnet.ibm.com>
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      721ae22a
    • C
      slub: Handle NULL parameter in kmem_cache_flags · c6f58d9b
      Christoph Lameter 提交于
      Andreas Herrmann writes:
      
        When I've used slub_debug kernel option (e.g.
        "slub_debug=,skbuff_fclone_cache" or similar) on a debug session I've
        seen a panic like:
      
          Highbank #setenv bootargs console=ttyAMA0 root=/dev/sda2 kgdboc.kgdboc=ttyAMA0,115200 slub_debug=,kmalloc-4096 earlyprintk=ttyAMA0
          ...
          Unable to handle kernel NULL pointer dereference at virtual address 00000000
          pgd = c0004000
          [00000000] *pgd=00000000
          Internal error: Oops: 5 [#1] SMP ARM
          Modules linked in:
          CPU: 0 PID: 0 Comm: swapper Tainted: G        W    3.12.0-00048-gbe408cd3 #314
          task: c0898360 ti: c088a000 task.ti: c088a000
          PC is at strncmp+0x1c/0x84
          LR is at kmem_cache_flags.isra.46.part.47+0x44/0x60
          pc : [<c02c6da0>]    lr : [<c0110a3c>]    psr: 200001d3
          sp : c088bea8  ip : c088beb8  fp : c088beb4
          r10: 00000000  r9 : 413fc090  r8 : 00000001
          r7 : 00000000  r6 : c2984a08  r5 : c0966e78  r4 : 00000000
          r3 : 0000006b  r2 : 0000000c  r1 : 00000000  r0 : c2984a08
          Flags: nzCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
          Control: 10c5387d  Table: 0000404a  DAC: 00000015
          Process swapper (pid: 0, stack limit = 0xc088a248)
          Stack: (0xc088bea8 to 0xc088c000)
          bea0:                   c088bed4 c088beb8 c0110a3c c02c6d90 c0966e78 00000040
          bec0: ef001f00 00000040 c088bf14 c088bed8 c0112070 c0110a04 00000005 c010fac8
          bee0: c088bf5c c088bef0 c010fac8 ef001f00 00000040 00000000 00000040 00000001
          bf00: 413fc090 00000000 c088bf34 c088bf18 c0839190 c0112040 00000000 ef001f00
          bf20: 00000000 00000000 c088bf54 c088bf38 c0839200 c083914c 00000006 c0961c4c
          bf40: c0961c28 00000000 c088bf7c c088bf58 c08392ac c08391c0 c08a2ed8 c0966e78
          bf60: c086b874 c08a3f50 c0961c28 00000001 c088bfb4 c088bf80 c083b258 c0839248
          bf80: 2f800000 0f000000 c08935b4 ffffffff c08cd400 ffffffff c08cd400 c0868408
          bfa0: c29849c0 00000000 c088bff4 c088bfb8 c0824974 c083b1e4 ffffffff ffffffff
          bfc0: c08245c0 00000000 00000000 c0868408 00000000 10c5387d c0892bcc c0868404
          bfe0: c0899440 0000406a 00000000 c088bff8 00008074 c0824824 00000000 00000000
          [<c02c6da0>] (strncmp+0x1c/0x84) from [<c0110a3c>] (kmem_cache_flags.isra.46.part.47+0x44/0x60)
          [<c0110a3c>] (kmem_cache_flags.isra.46.part.47+0x44/0x60) from [<c0112070>] (__kmem_cache_create+0x3c/0x410)
          [<c0112070>] (__kmem_cache_create+0x3c/0x410) from [<c0839190>] (create_boot_cache+0x50/0x74)
          [<c0839190>] (create_boot_cache+0x50/0x74) from [<c0839200>] (create_kmalloc_cache+0x4c/0x88)
          [<c0839200>] (create_kmalloc_cache+0x4c/0x88) from [<c08392ac>] (create_kmalloc_caches+0x70/0x114)
          [<c08392ac>] (create_kmalloc_caches+0x70/0x114) from [<c083b258>] (kmem_cache_init+0x80/0xe0)
          [<c083b258>] (kmem_cache_init+0x80/0xe0) from [<c0824974>] (start_kernel+0x15c/0x318)
          [<c0824974>] (start_kernel+0x15c/0x318) from [<00008074>] (0x8074)
          Code: e3520000 01a00002 089da800 e5d03000 (e5d1c000)
          ---[ end trace 1b75b31a2719ed1d ]---
          Kernel panic - not syncing: Fatal exception
      
        Problem is that slub_debug option is not parsed before
        create_boot_cache is called. Solve this by changing slub_debug to
        early_param.
      
        Kernels 3.11, 3.10 are also affected.  I am not sure about older
        kernels.
      
      Christoph Lameter explains:
      
        kmem_cache_flags may be called with NULL parameter during early boot.
        Skip the test in that case.
      
      Cc: stable@vger.kernel.org # 3.10 and 3.11
      Reported-by: NAndreas Herrmann <andreas.herrmann@calxeda.com>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      c6f58d9b
  5. 25 10月, 2013 1 次提交
  6. 18 10月, 2013 1 次提交
  7. 12 9月, 2013 1 次提交
  8. 05 9月, 2013 2 次提交
  9. 13 8月, 2013 2 次提交
  10. 09 8月, 2013 1 次提交
    • L
      Revert "slub: do not put a slab to cpu partial list when cpu_partial is 0" · 37090506
      Linus Torvalds 提交于
      This reverts commit 318df36e.
      
      This commit caused Steven Rostedt's hackbench runs to run out of memory
      due to a leak.  As noted by Joonsoo Kim, it is buggy in the following
      scenario:
      
       "I guess, you may set 0 to all kmem caches's cpu_partial via sysfs,
        doesn't it?
      
        In this case, memory leak is possible in following case.  Code flow of
        possible leak is follwing case.
      
         * in __slab_free()
         1. (!new.inuse || !prior) && !was_frozen
         2. !kmem_cache_debug && !prior
         3. new.frozen = 1
         4. after cmpxchg_double_slab, run the (!n) case with new.frozen=1
         5. with this patch, put_cpu_partial() doesn't do anything,
            because this cache's cpu_partial is 0
         6. return
      
        In step 5, leak occur"
      
      And Steven does indeed have cpu_partial set to 0 due to RT testing.
      
      Joonsoo is cooking up a patch, but everybody agrees that reverting this
      for now is the right thing to do.
      Reported-and-bisected-by: NSteven Rostedt <rostedt@goodmis.org>
      Acked-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: NPekka Enberg <penberg@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      37090506
  11. 17 7月, 2013 1 次提交
  12. 15 7月, 2013 3 次提交
    • C
      mm/slub: remove 'per_cpu' which is useless variable · e35e1a97
      Chen Gang 提交于
      Remove 'per_cpu', since it is useless now after the patch: "205ab99d
      slub: Update statistics handling for variable order slabs". And the
      partial list is handled in the same way as the per cpu slab.
      Acked-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NChen Gang <gang.chen@asianux.com>
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      e35e1a97
    • P
      kernel: delete __cpuinit usage from all core kernel files · 0db0628d
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the uses of the __cpuinit macros from C files in
      the core kernel directories (kernel, init, lib, mm, and include)
      that don't really have a specific maintainer.
      
      [1] https://lkml.org/lkml/2013/5/20/589Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      0db0628d
    • S
      slub: Check for page NULL before doing the node_match check · c25f195e
      Steven Rostedt 提交于
      In the -rt kernel (mrg), we hit the following dump:
      
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180
      PGD a2d39067 PUD b1641067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in: sunrpc cpufreq_ondemand ipv6 tg3 joydev sg serio_raw pcspkr k8temp amd64_edac_mod edac_core i2c_piix4 e100 mii shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sata_svw ata_generic pata_acpi pata_serverworks radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
      CPU 3
      Pid: 20878, comm: hackbench Not tainted 3.6.11-rt25.14.el6rt.x86_64 #1 empty empty/Tyan Transport GT24-B3992
      RIP: 0010:[<ffffffff811573f1>]  [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180
      RSP: 0018:ffff8800a9b17d70  EFLAGS: 00010213
      RAX: 0000000000000000 RBX: 0000000001200011 RCX: ffff8800a06d8000
      RDX: 0000000004d92a03 RSI: 00000000000000d0 RDI: ffff88013b805500
      RBP: ffff8800a9b17dc0 R08: ffff88023fd14d10 R09: ffffffff81041cbd
      R10: 00007f4e3f06e9d0 R11: 0000000000000246 R12: ffff88013b805500
      R13: ffff8801ff46af40 R14: 0000000000000001 R15: 0000000000000000
      FS:  00007f4e3f06e700(0000) GS:ffff88023fd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000000 CR3: 00000000a2d3a000 CR4: 00000000000007e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process hackbench (pid: 20878, threadinfo ffff8800a9b16000, task ffff8800a06d8000)
      Stack:
       ffff8800a9b17da0 ffffffff81202e08 ffff8800a9b17de0 000000d001200011
       0000000001200011 0000000001200011 0000000000000000 0000000000000000
       00007f4e3f06e9d0 0000000000000000 ffff8800a9b17e60 ffffffff81041cbd
      Call Trace:
       [<ffffffff81202e08>] ? current_has_perm+0x68/0x80
       [<ffffffff81041cbd>] copy_process+0xdd/0x15b0
       [<ffffffff810a2125>] ? rt_up_read+0x25/0x30
       [<ffffffff8104369a>] do_fork+0x5a/0x360
       [<ffffffff8107c66b>] ? migrate_enable+0xeb/0x220
       [<ffffffff8100b068>] sys_clone+0x28/0x30
       [<ffffffff81527423>] stub_clone+0x13/0x20
       [<ffffffff81527152>] ? system_call_fastpath+0x16/0x1b
      Code: 89 fc 89 75 cc 41 89 d6 4d 8b 04 24 65 4c 03 04 25 48 ae 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 74 12 41 83 fe ff 74 27 <48> 8b 00 48 c1 e8 3a 41 39 c6 74 1b 8b 75 cc 4c 89 c9 44 89 f2
      RIP  [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180
       RSP <ffff8800a9b17d70>
      CR2: 0000000000000000
      ---[ end trace 0000000000000002 ]---
      
      Now, this uses SLUB pretty much unmodified, but as it is the -rt kernel
      with CONFIG_PREEMPT_RT set, spinlocks are mutexes, although they do
      disable migration. But the SLUB code is relatively lockless, and the
      spin_locks there are raw_spin_locks (not converted to mutexes), thus I
      believe this bug can happen in mainline without -rt features. The -rt
      patch is just good at triggering mainline bugs ;-)
      
      Anyway, looking at where this crashed, it seems that the page variable
      can be NULL when passed to the node_match() function (which does not
      check if it is NULL). When this happens we get the above panic.
      
      As page is only used in slab_alloc() to check if the node matches, if
      it's NULL I'm assuming that we can say it doesn't and call the
      __slab_alloc() code. Is this a correct assumption?
      Acked-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c25f195e
  13. 08 7月, 2013 1 次提交
  14. 07 7月, 2013 3 次提交
  15. 30 4月, 2013 1 次提交
  16. 05 4月, 2013 2 次提交
  17. 02 4月, 2013 2 次提交
  18. 28 2月, 2013 1 次提交
  19. 24 2月, 2013 1 次提交
  20. 01 2月, 2013 4 次提交
  21. 21 1月, 2013 1 次提交
  22. 19 12月, 2012 7 次提交
    • G
      slub: drop mutex before deleting sysfs entry · 5413dfba
      Glauber Costa 提交于
      Sasha Levin recently reported a lockdep problem resulting from the new
      attribute propagation introduced by kmemcg series.  In short, slab_mutex
      will be called from within the sysfs attribute store function.  This will
      create a dependency, that will later be held backwards when a cache is
      destroyed - since destruction occurs with the slab_mutex held, and then
      calls in to the sysfs directory removal function.
      
      In this patch, I propose to adopt a strategy close to what
      __kmem_cache_create does before calling sysfs_slab_add, and release the
      lock before the call to sysfs_slab_remove.  This is pretty much the last
      operation in the kmem_cache_shutdown() path, so we could do better by
      splitting this and moving this call alone to later on.  This will fit
      nicely when sysfs handling is consistent between all caches, but will look
      weird now.
      
      Lockdep info:
      
        ======================================================
        [ INFO: possible circular locking dependency detected ]
        3.7.0-rc4-next-20121106-sasha-00008-g353b62f #117 Tainted: G        W
        -------------------------------------------------------
        trinity-child13/6961 is trying to acquire lock:
         (s_active#43){++++.+}, at:  sysfs_addrm_finish+0x31/0x60
      
        but task is already holding lock:
         (slab_mutex){+.+.+.}, at:  kmem_cache_destroy+0x22/0xe0
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
        -> #1 (slab_mutex){+.+.+.}:
                lock_acquire+0x1aa/0x240
                __mutex_lock_common+0x59/0x5a0
                mutex_lock_nested+0x3f/0x50
                slab_attr_store+0xde/0x110
                sysfs_write_file+0xfa/0x150
                vfs_write+0xb0/0x180
                sys_pwrite64+0x60/0xb0
                tracesys+0xe1/0xe6
        -> #0 (s_active#43){++++.+}:
                __lock_acquire+0x14df/0x1ca0
                lock_acquire+0x1aa/0x240
                sysfs_deactivate+0x122/0x1a0
                sysfs_addrm_finish+0x31/0x60
                sysfs_remove_dir+0x89/0xd0
                kobject_del+0x16/0x40
                __kmem_cache_shutdown+0x40/0x60
                kmem_cache_destroy+0x40/0xe0
                mon_text_release+0x78/0xe0
                __fput+0x122/0x2d0
                ____fput+0x9/0x10
                task_work_run+0xbe/0x100
                do_exit+0x432/0xbd0
                do_group_exit+0x84/0xd0
                get_signal_to_deliver+0x81d/0x930
                do_signal+0x3a/0x950
                do_notify_resume+0x3e/0x90
                int_signal+0x12/0x17
      
        other info that might help us debug this:
      
         Possible unsafe locking scenario:
      
               CPU0                    CPU1
               ----                    ----
          lock(slab_mutex);
                                       lock(s_active#43);
                                       lock(slab_mutex);
          lock(s_active#43);
      
         *** DEADLOCK ***
      
        2 locks held by trinity-child13/6961:
         #0:  (mon_lock){+.+.+.}, at:  mon_text_release+0x25/0xe0
         #1:  (slab_mutex){+.+.+.}, at:  kmem_cache_destroy+0x22/0xe0
      
        stack backtrace:
        Pid: 6961, comm: trinity-child13 Tainted: G        W    3.7.0-rc4-next-20121106-sasha-00008-g353b62f #117
        Call Trace:
          print_circular_bug+0x1fb/0x20c
          __lock_acquire+0x14df/0x1ca0
          lock_acquire+0x1aa/0x240
          sysfs_deactivate+0x122/0x1a0
          sysfs_addrm_finish+0x31/0x60
          sysfs_remove_dir+0x89/0xd0
          kobject_del+0x16/0x40
          __kmem_cache_shutdown+0x40/0x60
          kmem_cache_destroy+0x40/0xe0
          mon_text_release+0x78/0xe0
          __fput+0x122/0x2d0
          ____fput+0x9/0x10
          task_work_run+0xbe/0x100
          do_exit+0x432/0xbd0
          do_group_exit+0x84/0xd0
          get_signal_to_deliver+0x81d/0x930
          do_signal+0x3a/0x950
          do_notify_resume+0x3e/0x90
          int_signal+0x12/0x17
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Reported-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5413dfba
    • G
      memcg: add comments clarifying aspects of cache attribute propagation · ebe945c2
      Glauber Costa 提交于
      This patch clarifies two aspects of cache attribute propagation.
      
      First, the expected context for the for_each_memcg_cache macro in
      memcontrol.h.  The usages already in the codebase are safe.  In mm/slub.c,
      it is trivially safe because the lock is acquired right before the loop.
      In mm/slab.c, it is less so: the lock is acquired by an outer function a
      few steps back in the stack, so a VM_BUG_ON() is added to make sure it is
      indeed safe.
      
      A comment is also added to detail why we are returning the value of the
      parent cache and ignoring the children's when we propagate the attributes.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ebe945c2
    • G
      slub: slub-specific propagation changes · 107dab5c
      Glauber Costa 提交于
      SLUB allows us to tune a particular cache behavior with sysfs-based
      tunables.  When creating a new memcg cache copy, we'd like to preserve any
      tunables the parent cache already had.
      
      This can be done by tapping into the store attribute function provided by
      the allocator.  We of course don't need to mess with read-only fields.
      Since the attributes can have multiple types and are stored internally by
      sysfs, the best strategy is to issue a ->show() in the root cache, and
      then ->store() in the memcg cache.
      
      The drawback of that, is that sysfs can allocate up to a page in buffering
      for show(), that we are likely not to need, but also can't guarantee.  To
      avoid always allocating a page for that, we can update the caches at store
      time with the maximum attribute size ever stored to the root cache.  We
      will then get a buffer big enough to hold it.  The corolary to this, is
      that if no stores happened, nothing will be propagated.
      
      It can also happen that a root cache has its tunables updated during
      normal system operation.  In this case, we will propagate the change to
      all caches that are already active.
      
      [akpm@linux-foundation.org: tweak code to avoid __maybe_unused]
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      107dab5c
    • G
      memcg: destroy memcg caches · 1f458cbf
      Glauber Costa 提交于
      Implement destruction of memcg caches.  Right now, only caches where our
      reference counter is the last remaining are deleted.  If there are any
      other reference counters around, we just leave the caches lying around
      until they go away.
      
      When that happens, a destruction function is called from the cache code.
      Caches are only destroyed in process context, so we queue them up for
      later processing in the general case.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1f458cbf
    • G
      sl[au]b: allocate objects from memcg cache · d79923fa
      Glauber Costa 提交于
      We are able to match a cache allocation to a particular memcg.  If the
      task doesn't change groups during the allocation itself - a rare event,
      this will give us a good picture about who is the first group to touch a
      cache page.
      
      This patch uses the now available infrastructure by calling
      memcg_kmem_get_cache() before all the cache allocations.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d79923fa
    • G
      sl[au]b: always get the cache from its page in kmem_cache_free() · b9ce5ef4
      Glauber Costa 提交于
      struct page already has this information.  If we start chaining caches,
      this information will always be more trustworthy than whatever is passed
      into the function.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b9ce5ef4
    • G
      slab/slub: consider a memcg parameter in kmem_create_cache · 2633d7a0
      Glauber Costa 提交于
      Allow a memcg parameter to be passed during cache creation.  When the slub
      allocator is being used, it will only merge caches that belong to the same
      memcg.  We'll do this by scanning the global list, and then translating
      the cache to a memcg-specific cache
      
      Default function is created as a wrapper, passing NULL to the memcg
      version.  We only merge caches that belong to the same memcg.
      
      A helper is provided, memcg_css_id: because slub needs a unique cache name
      for sysfs.  Since this is visible, but not the canonical location for slab
      data, the cache name is not used, the css_id should suffice.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2633d7a0