1. 19 12月, 2012 9 次提交
    • G
      memcg: allocate memory for memcg caches whenever a new memcg appears · 55007d84
      Glauber Costa 提交于
      Every cache that is considered a root cache (basically the "original"
      caches, tied to the root memcg/no-memcg) will have an array that should be
      large enough to store a cache pointer per each memcg in the system.
      
      Theoreticaly, this is as high as 1 << sizeof(css_id), which is currently
      in the 64k pointers range.  Most of the time, we won't be using that much.
      
      What goes in this patch, is a simple scheme to dynamically allocate such
      an array, in order to minimize memory usage for memcg caches.  Because we
      would also like to avoid allocations all the time, at least for now, the
      array will only grow.  It will tend to be big enough to hold the maximum
      number of kmem-limited memcgs ever achieved.
      
      We'll allocate it to be a minimum of 64 kmem-limited memcgs.  When we have
      more than that, we'll start doubling the size of this array every time the
      limit is reached.
      
      Because we are only considering kmem limited memcgs, a natural point for
      this to happen is when we write to the limit.  At that point, we already
      have set_limit_mutex held, so that will become our natural synchronization
      mechanism.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      55007d84
    • G
      slab/slub: consider a memcg parameter in kmem_create_cache · 2633d7a0
      Glauber Costa 提交于
      Allow a memcg parameter to be passed during cache creation.  When the slub
      allocator is being used, it will only merge caches that belong to the same
      memcg.  We'll do this by scanning the global list, and then translating
      the cache to a memcg-specific cache
      
      Default function is created as a wrapper, passing NULL to the memcg
      version.  We only merge caches that belong to the same memcg.
      
      A helper is provided, memcg_css_id: because slub needs a unique cache name
      for sysfs.  Since this is visible, but not the canonical location for slab
      data, the cache name is not used, the css_id should suffice.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2633d7a0
    • G
      slab/slub: struct memcg_params · ba6c496e
      Glauber Costa 提交于
      For the kmem slab controller, we need to record some extra information in
      the kmem_cache structure.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Signed-off-by: NSuleiman Souhlal <suleiman@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ba6c496e
    • G
      fork: protect architectures where THREAD_SIZE >= PAGE_SIZE against fork bombs · 2ad306b1
      Glauber Costa 提交于
      Because those architectures will draw their stacks directly from the page
      allocator, rather than the slab cache, we can directly pass __GFP_KMEMCG
      flag, and issue the corresponding free_pages.
      
      This code path is taken when the architecture doesn't define
      CONFIG_ARCH_THREAD_INFO_ALLOCATOR (only ia64 seems to), and has
      THREAD_SIZE >= PAGE_SIZE.  Luckily, most - if not all - of the remaining
      architectures fall in this category.
      
      This will guarantee that every stack page is accounted to the memcg the
      process currently lives on, and will have the allocations to fail if they
      go over limit.
      
      For the time being, I am defining a new variant of THREADINFO_GFP, not to
      mess with the other path.  Once the slab is also tracked by memcg, we can
      get rid of that flag.
      
      Tested to successfully protect against :(){ :|:& };:
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Acked-by: NFrederic Weisbecker <fweisbec@redhat.com>
      Acked-by: NKamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2ad306b1
    • G
      memcg: use static branches when code not in use · a8964b9b
      Glauber Costa 提交于
      We can use static branches to patch the code in or out when not used.
      
      Because the _ACTIVE bit on kmem_accounted is only set after the increment
      is done, we guarantee that the root memcg will always be selected for kmem
      charges until all call sites are patched (see memcg_kmem_enabled).  This
      guarantees that no mischarges are applied.
      
      Static branch decrement happens when the last reference count from the
      kmem accounting in memcg dies.  This will only happen when the charges
      drop down to 0.
      
      When that happens, we need to disable the static branch only on those
      memcgs that enabled it.  To achieve this, we would be forced to complicate
      the code by keeping track of which memcgs were the ones that actually
      enabled limits, and which ones got it from its parents.
      
      It is a lot simpler just to do static_key_slow_inc() on every child
      that is accounted.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NKamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a8964b9b
    • G
      res_counter: return amount of charges after res_counter_uncharge() · 50bdd430
      Glauber Costa 提交于
      It is useful to know how many charges are still left after a call to
      res_counter_uncharge.  While it is possible to issue a res_counter_read
      after uncharge, this can be racy.
      
      If we need, for instance, to take some action when the counters drop down
      to 0, only one of the callers should see it.  This is the same semantics
      as the atomic variables in the kernel.
      
      Since the current return value is void, we don't need to worry about
      anything breaking due to this change: nobody relied on that, and only
      users appearing from now on will be checking this value.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NKamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      50bdd430
    • G
      mm: allocate kernel pages to the right memcg · 6a1a0d3b
      Glauber Costa 提交于
      When a process tries to allocate a page with the __GFP_KMEMCG flag, the
      page allocator will call the corresponding memcg functions to validate
      the allocation.  Tasks in the root memcg can always proceed.
      
      To avoid adding markers to the page - and a kmem flag that would
      necessarily follow, as much as doing page_cgroup lookups for no reason,
      whoever is marking its allocations with __GFP_KMEMCG flag is responsible
      for telling the page allocator that this is such an allocation at
      free_pages() time.  This is done by the invocation of
      __free_accounted_pages() and free_accounted_pages().
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NKamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6a1a0d3b
    • G
      memcg: kmem controller infrastructure · 7ae1e1d0
      Glauber Costa 提交于
      Introduce infrastructure for tracking kernel memory pages to a given
      memcg.  This will happen whenever the caller includes the flag
      __GFP_KMEMCG flag, and the task belong to a memcg other than the root.
      
      In memcontrol.h those functions are wrapped in inline acessors.  The idea
      is to later on, patch those with static branches, so we don't incur any
      overhead when no mem cgroups with limited kmem are being used.
      
      Users of this functionality shall interact with the memcg core code
      through the following functions:
      
      memcg_kmem_newpage_charge: will return true if the group can handle the
                                 allocation. At this point, struct page is not
                                 yet allocated.
      
      memcg_kmem_commit_charge: will either revert the charge, if struct page
                                allocation failed, or embed memcg information
                                into page_cgroup.
      
      memcg_kmem_uncharge_page: called at free time, will revert the charge.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NKamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7ae1e1d0
    • G
      mm: add a __GFP_KMEMCG flag · 7a64bf05
      Glauber Costa 提交于
      This flag is used to indicate to the callees that this allocation is a
      kernel allocation in process context, and should be accounted to current's
      memcg.
      Signed-off-by: NGlauber Costa <glommer@parallels.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NRik van Riel <riel@redhat.com>
      Acked-by: NMel Gorman <mel@csn.ul.ie>
      Acked-by: NKamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: JoonSoo Kim <js1304@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7a64bf05
  2. 18 12月, 2012 20 次提交
  3. 16 12月, 2012 2 次提交
  4. 15 12月, 2012 1 次提交
  5. 14 12月, 2012 1 次提交
    • J
      Revert "libata: check SATA_SETTINGS log with HW Feature Ctrl" · 8349e5ae
      Jeff Garzik 提交于
      This reverts commit de90cd71.
      
      Shane Huang writes:
      
        Please suspend this patch because I just received two new
        DevSlp drives but found word 78 bit 5 is _not_ set.
      
        I'm checking with the drive vendor whether he gave me
        the wrong information. If bit 5 is not the necessary and
        sufficient condition, I will implement another patch to
        replace ata_device->sata_settings into ->devslp_timing.
      Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
      8349e5ae
  6. 13 12月, 2012 7 次提交