1. 29 Aug, 2013 1 commit
    • memcg: check that kmem_cache has memcg_params before accessing it · 6f6b8951
      Committed by Andrey Vagin
      If the system had a few memory groups and all of them were destroyed,
      memcg_limited_groups_array_size has a non-zero value, but all new caches
      are created without memcg_params, because memcg_kmem_enabled() returns
      false.
      
      We try to enumerate child caches in a few places and all of them are
      potentially dangerous.
      
      For example, my kernel is compiled with CONFIG_SLAB and it crashed when I
      tried to mount an NFS share after a few experiments with kmemcg.
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
        IP: [<ffffffff8118166a>] do_tune_cpucache+0x8a/0xd0
        PGD b942a067 PUD b999f067 PMD 0
        Oops: 0000 [#1] SMP
        Modules linked in: fscache(+) ip6table_filter ip6_tables iptable_filter ip_tables i2c_piix4 pcspkr virtio_net virtio_balloon i2c_core floppy
        CPU: 0 PID: 357 Comm: modprobe Not tainted 3.11.0-rc7+ #59
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        task: ffff8800b9f98240 ti: ffff8800ba32e000 task.ti: ffff8800ba32e000
        RIP: 0010:[<ffffffff8118166a>]  [<ffffffff8118166a>] do_tune_cpucache+0x8a/0xd0
        RSP: 0018:ffff8800ba32fb70  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
        RDX: 0000000000000000 RSI: ffff8800b9f98910 RDI: 0000000000000246
        RBP: ffff8800ba32fba0 R08: 0000000000000002 R09: 0000000000000004
        R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000010
        R13: 0000000000000008 R14: 00000000000000d0 R15: ffff8800375d0200
        FS:  00007f55f1378740(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 00007f24feba57a0 CR3: 0000000037b51000 CR4: 00000000000006f0
        Call Trace:
          enable_cpucache+0x49/0x100
          setup_cpu_cache+0x215/0x280
          __kmem_cache_create+0x2fa/0x450
          kmem_cache_create_memcg+0x214/0x350
          kmem_cache_create+0x2b/0x30
          fscache_init+0x19b/0x230 [fscache]
          do_one_initcall+0xfa/0x1b0
          load_module+0x1c41/0x26d0
          SyS_finit_module+0x86/0xb0
          system_call_fastpath+0x16/0x1b
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Glauber Costa <glommer@openvz.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
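The failure mode described in this commit can be modelled in userspace. The following is a minimal sketch, not the kernel patch itself: the struct layout and the helper names `child_cache` and `tune_all` are hypothetical, and `array_size` merely stands in for memcg_limited_groups_array_size. It shows why every walk over child caches needs to tolerate a root cache whose memcg_params is NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace model; all names and layouts here are illustrative. */
struct kmem_cache;

struct memcg_cache_params {
    size_t nr_children;              /* number of valid slots in children[] */
    struct kmem_cache *children[4];  /* per-memcg copies; a slot may be NULL */
};

struct kmem_cache {
    const char *name;
    unsigned int limit;                      /* a tunable to update */
    struct memcg_cache_params *memcg_params; /* NULL if created while kmemcg was inactive */
};

/* Return the child cache at idx, or NULL when the root has no memcg_params. */
static struct kmem_cache *child_cache(struct kmem_cache *root, size_t idx)
{
    if (!root->memcg_params)         /* the kind of check the commit title refers to */
        return NULL;
    if (idx >= root->memcg_params->nr_children)
        return NULL;
    return root->memcg_params->children[idx];
}

/*
 * Apply a tunable to the root and to every child; returns the number of
 * caches updated. array_size can be non-zero even when this particular
 * cache has no memcg_params, which is the situation the commit describes.
 */
static int tune_all(struct kmem_cache *root, unsigned int limit, size_t array_size)
{
    int updated = 1;

    root->limit = limit;
    for (size_t i = 0; i < array_size; i++) {
        struct kmem_cache *c = child_cache(root, i);

        if (!c)
            continue;
        c->limit = limit;
        updated++;
    }
    return updated;
}
```

With the guard in `child_cache`, calling `tune_all` on a cache that has no memcg_params updates only the root instead of dereferencing a NULL pointer.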
  2. 08 Jul, 2013 1 commit
  3. 07 Jul, 2013 1 commit
  4. 01 Feb, 2013 4 commits
  5. 19 Dec, 2012 6 commits
    • slab: propagate tunable values · 943a451a
      Committed by Glauber Costa
      SLAB allows us to tune a particular cache behavior with tunables.  When
      creating a new memcg cache copy, we'd like to preserve any tunables the
      parent cache already had.
      
      This could be done by an explicit call to do_tune_cpucache() after the
      cache is created.  But this is not very convenient now that the caches are
      created from common code, since this function is SLAB-specific.
      
      Another method of doing that is taking advantage of the fact that
      do_tune_cpucache() is always called from enable_cpucache(), which is
      called at cache initialization.  We can just preset the values, and then
      things work as expected.
      
      It can also happen that a root cache has its tunables updated during
      normal system operation.  In this case, we will propagate the change to
      all caches that are already active.
      
      This change will require us to move the assignment of root_cache in
      memcg_params a bit earlier.  We need this to be already set - which
      memcg_kmem_register_cache will do - when we reach __kmem_cache_create().
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
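The two propagation paths in this commit message (presetting values at cache initialization, and pushing later updates to already-active children) can be sketched as follows. This is a userspace model under stated assumptions: `struct cache`, `enable_cache`, `add_child`, `tune`, and the default values are all hypothetical stand-ins, not the SLAB code.

```c
#include <assert.h>
#include <stddef.h>

struct tunables {
    unsigned int limit, batchcount, shared;
};

struct cache {
    struct tunables t;
    struct cache *root;         /* NULL for a root cache */
    struct cache *children[4];
    size_t nr_children;
};

/* Illustrative defaults only; the real values depend on object size. */
static const struct tunables default_tunables = { 120, 60, 8 };

/* Plays the role of enable_cpucache(): choose tunables at initialization. */
static void enable_cache(struct cache *c)
{
    if (c->root)
        c->t = c->root->t;      /* preset from the parent cache */
    else
        c->t = default_tunables;
}

/* The root pointer must be assigned before enable_cache() runs. */
static int add_child(struct cache *root, struct cache *child)
{
    if (root->nr_children >= 4)
        return -1;
    child->root = root;
    child->nr_children = 0;
    enable_cache(child);
    root->children[root->nr_children++] = child;
    return 0;
}

/* Runtime update of a root cache: propagate to every active child. */
static void tune(struct cache *root, struct tunables t)
{
    root->t = t;
    for (size_t i = 0; i < root->nr_children; i++)
        root->children[i]->t = t;
}
```

The ordering inside `add_child` mirrors the last paragraph of the commit message: the parent link has to exist before initialization, otherwise there is nothing to copy the tunables from.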
    • memcg: aggregate memcg cache values in slabinfo · 749c5415
      Committed by Glauber Costa
      When we create caches in memcgs, we need to display their usage
      information somewhere.  We'll adopt a scheme similar to /proc/meminfo,
      with aggregate totals shown in the global file, and per-group information
      stored in the group itself.
      
      For the time being, only reads are allowed in the per-group cache.
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
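The reporting scheme described here, aggregate totals in the global file and per-group figures in the group, amounts to a simple sum over child caches. A minimal sketch, assuming a hypothetical `struct slab_stats` with just two counters (the real slabinfo output has more columns):

```c
#include <assert.h>
#include <stddef.h>

/* Two illustrative counters; field names are assumptions for this sketch. */
struct slab_stats {
    unsigned long active_objs;
    unsigned long num_objs;
};

struct cache {
    struct slab_stats own;            /* usage of this cache alone */
    const struct cache *children[4];  /* per-memcg copies of a root cache */
    size_t nr_children;
};

/* Global view: the root cache's own usage plus that of every memcg copy. */
static struct slab_stats global_slabinfo(const struct cache *root)
{
    struct slab_stats total = root->own;

    for (size_t i = 0; i < root->nr_children; i++) {
        total.active_objs += root->children[i]->own.active_objs;
        total.num_objs += root->children[i]->own.num_objs;
    }
    return total;
}

/* Per-group view: read-only, reports only the group's own copy. */
static struct slab_stats group_slabinfo(const struct cache *child)
{
    return child->own;
}
```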
    • memcg: destroy memcg caches · 1f458cbf
      Committed by Glauber Costa
      Implement destruction of memcg caches.  Right now, only caches where our
      reference counter is the last remaining are deleted.  If there are any
      other reference counters around, we just leave the caches lying around
      until they go away.
      
      When that happens, a destruction function is called from the cache code.
      Caches are only destroyed in process context, so we queue them up for
      later processing in the general case.
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
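The destruction rule in this commit (delete only when the last reference goes away, and do the actual teardown later from process context) can be illustrated with a reference count and a pending list. This is a userspace sketch under assumptions: the names `cache_put`, `group_release_cache`, and `run_destroy_work` are hypothetical, and a plain linked list stands in for the kernel's work queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cache {
    int refcount;
    bool dead;           /* the owning group has been removed */
    bool destroyed;
    struct cache *next;  /* link in the pending-destruction list */
};

/* Stand-in for a work queue drained later in process context. */
static struct cache *pending;

static void queue_destroy(struct cache *c)
{
    c->next = pending;
    pending = c;
}

/* Drop one reference. May run in any context, so it only queues work. */
static void cache_put(struct cache *c)
{
    assert(c->refcount > 0);
    if (--c->refcount == 0 && c->dead)
        queue_destroy(c);
}

/* Group removal: mark the cache dead and drop the group's own reference. */
static void group_release_cache(struct cache *c)
{
    c->dead = true;
    cache_put(c);
}

/* Worker: performs the real teardown; returns how many caches it destroyed. */
static int run_destroy_work(void)
{
    int n = 0;

    while (pending) {
        struct cache *c = pending;

        pending = c->next;
        c->next = NULL;
        c->destroyed = true;
        n++;
    }
    return n;
}
```

A cache that still has outstanding references when its group goes away simply stays around; it is queued for destruction only when the final `cache_put` arrives.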
    • sl[au]b: always get the cache from its page in kmem_cache_free() · b9ce5ef4
      Committed by Glauber Costa
      struct page already has this information.  If we start chaining caches,
      this information will always be more trustworthy than whatever is passed
      into the function.
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
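To illustrate why the page is the more trustworthy source once caches are chained, here is a small userspace sketch. The arena, the `pages` table, and the functions `page_of` and `cache_free` are hypothetical; they only mimic the idea that each page records the cache that owns it, so the free path can ignore a stale or wrong cache argument.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u
#define NR_PAGES  8u

struct cache {
    const char *name;
    int nr_freed;
};

/* Per-page metadata: each page remembers the cache that owns it. */
struct page {
    struct cache *slab_cache;
};

static unsigned char arena[NR_PAGES * PAGE_SIZE];
static struct page pages[NR_PAGES];

/* Map an object address inside the arena to its page descriptor. */
static struct page *page_of(const void *obj)
{
    size_t off = (size_t)((const unsigned char *)obj - arena);

    assert(off < sizeof(arena));
    return &pages[off / PAGE_SIZE];
}

/*
 * Free path: the caller's cache argument is not trusted. The owning cache
 * is looked up from the object's page and returned.
 */
static struct cache *cache_free(struct cache *claimed, void *obj)
{
    struct cache *actual = page_of(obj)->slab_cache;

    (void)claimed;  /* may be the root cache while obj lives in a memcg copy */
    actual->nr_freed++;
    return actual;
}
```

A caller that only knows the root cache can pass it in, and the object is still accounted against the per-memcg copy that actually allocated it.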
    • slab/slub: consider a memcg parameter in kmem_create_cache · 2633d7a0
      Committed by Glauber Costa
      Allow a memcg parameter to be passed during cache creation.  When the slub
      allocator is being used, it will only merge caches that belong to the same
      memcg.  We'll do this by scanning the global list, and then translating
      the cache to a memcg-specific cache.
      
      The default function is created as a wrapper, passing NULL to the memcg
      version.  We only merge caches that belong to the same memcg.
      
      A helper, memcg_css_id, is provided because slub needs a unique cache name
      for sysfs.  Since sysfs is visible but is not the canonical location for
      slab data, the cache name is not used there; the css_id should suffice.
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slab/slub: struct memcg_params · ba6c496e
      Committed by Glauber Costa
      For the kmem slab controller, we need to record some extra information in
      the kmem_cache structure.
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Signed-off-by: Suleiman Souhlal <suleiman@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 11 Dec, 2012 2 commits
  7. 31 Oct, 2012 1 commit
    • slab: Ignore internal flags in cache creation · d8843922
      Committed by Glauber Costa
      Some flags are used internally by the allocators for management
      purposes. One example of that is the CFLGS_OFF_SLAB flag that slab uses
      to mark that the metadata for that cache is stored outside of the slab.
      
      No cache should ever pass those as creation flags. We can just ignore
      this bit if it happens to be passed (such as when duplicating a cache in
      the kmem memcg patches).
      
      Because such flags can vary from allocator to allocator, we allow them
      to make their own decisions on that, defining SLAB_AVAILABLE_FLAGS with
      all flags that are valid at creation time.  Allocators that don't have
      any specific flag requirement should define that to mean all flags.
      
      Common code will mask out all flags not belonging to that set.
      Acked-by: Christoph Lameter <cl@linux.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
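The masking step described in this commit is a single bitwise AND against the set of flags an allocator accepts at creation time. A minimal sketch, assuming illustrative bit values (they are not copied from the kernel headers) and a hypothetical helper name `sanitize_create_flags`:

```c
#include <assert.h>

/* Illustrative bit values for this sketch only. */
#define SLAB_HWCACHE_ALIGN 0x00002000UL
#define SLAB_PANIC         0x00040000UL
#define CFLGS_OFF_SLAB     0x80000000UL  /* allocator-internal, never valid at creation */

/* Per-allocator set of flags that callers may legitimately pass. */
#define SLAB_AVAILABLE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_PANIC)

/* Common code: silently drop anything outside the allowed set. */
static unsigned long sanitize_create_flags(unsigned long flags)
{
    return flags & SLAB_AVAILABLE_FLAGS;
}
```

This matters when a cache is duplicated: flags copied from an existing cache may carry internal bits such as CFLGS_OFF_SLAB, and masking removes them before the copy is created.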
  8. 24 Oct, 2012 3 commits
  9. 05 Sep, 2012 8 commits
    • Revert "mm/sl[aou]b: Move sysfs_slab_add to common" · aac3a166
      Committed by Pekka Enberg
      This reverts commit 96d17b7b which
      caused the following errors at boot:
      
        [    1.114885] kobject (ffff88001a802578): tried to init an initialized object, something is seriously wrong.
        [    1.114885] Pid: 1, comm: swapper/0 Tainted: G        W    3.6.0-rc1+ #6
        [    1.114885] Call Trace:
        [    1.114885]  [<ffffffff81273f37>] kobject_init+0x87/0xa0
        [    1.115555]  [<ffffffff8127426a>] kobject_init_and_add+0x2a/0x90
        [    1.115555]  [<ffffffff8127c870>] ? sprintf+0x40/0x50
        [    1.115555]  [<ffffffff81124c60>] sysfs_slab_add+0x80/0x210
        [    1.115555]  [<ffffffff81100175>] kmem_cache_create+0xa5/0x250
        [    1.115555]  [<ffffffff81cf24cd>] ? md_init+0x144/0x144
        [    1.115555]  [<ffffffff81cf25b6>] local_init+0xa4/0x11b
        [    1.115555]  [<ffffffff81cf24e1>] dm_init+0x14/0x45
        [    1.115836]  [<ffffffff810001ba>] do_one_initcall+0x3a/0x160
        [    1.116834]  [<ffffffff81cc2c90>] kernel_init+0x133/0x1b7
        [    1.117835]  [<ffffffff81cc25c4>] ? do_early_param+0x86/0x86
        [    1.117835]  [<ffffffff8171aff4>] kernel_thread_helper+0x4/0x10
        [    1.118401]  [<ffffffff81cc2b5d>] ? start_kernel+0x33f/0x33f
        [    1.119832]  [<ffffffff8171aff0>] ? gs_change+0xb/0xb
        [    1.120325] ------------[ cut here ]------------
        [    1.120835] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0xc1/0xf0()
        [    1.121437] sysfs: cannot create duplicate filename '/kernel/slab/:t-0000016'
        [    1.121831] Modules linked in:
        [    1.122138] Pid: 1, comm: swapper/0 Tainted: G        W    3.6.0-rc1+ #6
        [    1.122831] Call Trace:
        [    1.123074]  [<ffffffff81195ce1>] ? sysfs_add_one+0xc1/0xf0
        [    1.123833]  [<ffffffff8103adfa>] warn_slowpath_common+0x7a/0xb0
        [    1.124405]  [<ffffffff8103aed1>] warn_slowpath_fmt+0x41/0x50
        [    1.124832]  [<ffffffff81195ce1>] sysfs_add_one+0xc1/0xf0
        [    1.125337]  [<ffffffff81195eb3>] create_dir+0x73/0xd0
        [    1.125832]  [<ffffffff81196221>] sysfs_create_dir+0x81/0xe0
        [    1.126363]  [<ffffffff81273d3d>] kobject_add_internal+0x9d/0x210
        [    1.126832]  [<ffffffff812742a3>] kobject_init_and_add+0x63/0x90
        [    1.127406]  [<ffffffff81124c60>] sysfs_slab_add+0x80/0x210
        [    1.127832]  [<ffffffff81100175>] kmem_cache_create+0xa5/0x250
        [    1.128384]  [<ffffffff81cf24cd>] ? md_init+0x144/0x144
        [    1.128833]  [<ffffffff81cf25b6>] local_init+0xa4/0x11b
        [    1.129831]  [<ffffffff81cf24e1>] dm_init+0x14/0x45
        [    1.130305]  [<ffffffff810001ba>] do_one_initcall+0x3a/0x160
        [    1.130831]  [<ffffffff81cc2c90>] kernel_init+0x133/0x1b7
        [    1.131351]  [<ffffffff81cc25c4>] ? do_early_param+0x86/0x86
        [    1.131830]  [<ffffffff8171aff4>] kernel_thread_helper+0x4/0x10
        [    1.132392]  [<ffffffff81cc2b5d>] ? start_kernel+0x33f/0x33f
        [    1.132830]  [<ffffffff8171aff0>] ? gs_change+0xb/0xb
        [    1.133315] ---[ end trace 2703540871c8fab7 ]---
        [    1.133830] ------------[ cut here ]------------
        [    1.134274] WARNING: at lib/kobject.c:196 kobject_add_internal+0x1f5/0x210()
        [    1.134829] kobject_add_internal failed for :t-0000016 with -EEXIST, don't try to register things with the same name in the same directory.
        [    1.135829] Modules linked in:
        [    1.136135] Pid: 1, comm: swapper/0 Tainted: G        W    3.6.0-rc1+ #6
        [    1.136828] Call Trace:
        [    1.137071]  [<ffffffff81273e95>] ? kobject_add_internal+0x1f5/0x210
        [    1.137830]  [<ffffffff8103adfa>] warn_slowpath_common+0x7a/0xb0
        [    1.138402]  [<ffffffff8103aed1>] warn_slowpath_fmt+0x41/0x50
        [    1.138830]  [<ffffffff811955a3>] ? release_sysfs_dirent+0x73/0xf0
        [    1.139419]  [<ffffffff81273e95>] kobject_add_internal+0x1f5/0x210
        [    1.139830]  [<ffffffff812742a3>] kobject_init_and_add+0x63/0x90
        [    1.140429]  [<ffffffff81124c60>] sysfs_slab_add+0x80/0x210
        [    1.140830]  [<ffffffff81100175>] kmem_cache_create+0xa5/0x250
        [    1.141829]  [<ffffffff81cf24cd>] ? md_init+0x144/0x144
        [    1.142307]  [<ffffffff81cf25b6>] local_init+0xa4/0x11b
        [    1.142829]  [<ffffffff81cf24e1>] dm_init+0x14/0x45
        [    1.143307]  [<ffffffff810001ba>] do_one_initcall+0x3a/0x160
        [    1.143829]  [<ffffffff81cc2c90>] kernel_init+0x133/0x1b7
        [    1.144352]  [<ffffffff81cc25c4>] ? do_early_param+0x86/0x86
        [    1.144829]  [<ffffffff8171aff4>] kernel_thread_helper+0x4/0x10
        [    1.145405]  [<ffffffff81cc2b5d>] ? start_kernel+0x33f/0x33f
        [    1.145828]  [<ffffffff8171aff0>] ? gs_change+0xb/0xb
        [    1.146313] ---[ end trace 2703540871c8fab8 ]---
      
      Conflicts:
      
      	mm/slub.c
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
    • mm/sl[aou]b: Shrink __kmem_cache_create() parameter lists · 8a13a4cc
      Committed by Christoph Lameter
      Do the initial settings of the fields in common code. This will allow us
      to push more processing into common code later and improve readability.
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
    • mm/sl[aou]b: Move kmem_cache allocations into common code · 278b1bb1
      Committed by Christoph Lameter
      Shift the allocations to common code. That way the allocation and
      freeing of the kmem_cache structures is handled by common code.
      Reviewed-by: Glauber Costa <glommer@parallels.com>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
    • mm/sl[aou]b: Move sysfs_slab_add to common · 96d17b7b
      Committed by Christoph Lameter
      Simplify locking by moving the sysfs_slab_add call to after all locks have
      been dropped. This eases the upcoming move to provide sysfs support for all
      allocators.
      Reviewed-by: Glauber Costa <glommer@parallels.com>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
    • mm/sl[aou]b: Do slab aliasing call from common code · cbb79694
      Committed by Christoph Lameter
      The slab aliasing logic causes some strange contortions in slub. So add
      a call to deal with aliases to slab_common.c but disable it for other
      slab allocators by providing stubs that fail to create aliases.
      
      Full general support for aliases will require additional cleanup passes
      and more standardization of fields in kmem_cache.
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
    • mm/sl[aou]b: Get rid of __kmem_cache_destroy · 12c3667f
      Committed by Christoph Lameter
      What is done there can be done in __kmem_cache_shutdown.
      
      This affects RCU handling somewhat. On RCU free, no slab allocator refers
      to management structures other than the kmem_cache structure.
      Therefore these other structures can be freed before the RCU-deferred
      free to the page allocator occurs.
      Reviewed-by: Joonsoo Kim <js1304@gmail.com>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
    • mm/sl[aou]b: Use "kmem_cache" name for slab cache with kmem_cache struct · 9b030cb8
      Committed by Christoph Lameter
      Make all allocators use the "kmem_cache" slabname for the "kmem_cache"
      structure.
      Reviewed-by: Glauber Costa <glommer@parallels.com>
      Reviewed-by: Joonsoo Kim <js1304@gmail.com>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
    • mm/sl[aou]b: Extract a common function for kmem_cache_destroy · 945cf2b6
      Committed by Christoph Lameter
      kmem_cache_destroy does basically the same thing in all allocators.
      
      Extract the common code, which is easy since we already have common mutex
      handling.
      Reviewed-by: Glauber Costa <glommer@parallels.com>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
  10. 09 Jul, 2012 2 commits