1. 15 May, 2015 3 commits
    • mm, numa: really disable NUMA balancing by default on single node machines · b0dc2b9b
      Mel Gorman committed
      NUMA balancing is meant to be disabled by default on UMA machines but
      the check is using nr_node_ids (highest node) instead of
      num_online_nodes (online nodes).
      
      The consequences are that a UMA machine with a node ID of 1 or higher
      will enable NUMA balancing.  This will incur useless overhead due to
      minor faults with the impact depending on the workload.  This is the
      impact on the stats when running a kernel build on a single node machine
      whose node ID happened to be 1:
      
                                       vanilla     patched
        NUMA base PTE updates          5113158           0
        NUMA huge PMD updates              643           0
        NUMA page range updates        5442374           0
        NUMA hint faults               2109622           0
        NUMA hint local faults         2109622           0
        NUMA hint local percent            100         100
        NUMA pages migrated                  0           0
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: <stable@vger.kernel.org> [3.8+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
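
The difference between the two counters can be illustrated with a small user-space sketch. Nothing here is kernel code: `MAX_NODES`, the `online` array and all helper names are hypothetical, the "highest node" value is simplified to the highest online node ID plus one, and the `> 1` comparison is an assumption about the shape of the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8 /* hypothetical upper bound, for illustration only */

/* Simplified analogue of nr_node_ids: highest online node ID plus one. */
static int node_id_limit(const bool online[MAX_NODES])
{
	int limit = 0;

	assert(online != NULL);
	for (int i = 0; i < MAX_NODES; i++)
		if (online[i])
			limit = i + 1;
	return limit;
}

/* Analogue of num_online_nodes(): how many nodes are actually online. */
static int online_node_count(const bool online[MAX_NODES])
{
	int count = 0;

	assert(online != NULL);
	for (int i = 0; i < MAX_NODES; i++)
		if (online[i])
			count++;
	return count;
}

/* Assumed shape of the old check: based on the highest node ID. */
static bool balancing_enabled_old(const bool online[MAX_NODES])
{
	return node_id_limit(online) > 1;
}

/* Assumed shape of the fixed check: based on the number of online nodes. */
static bool balancing_enabled_new(const bool online[MAX_NODES])
{
	return online_node_count(online) > 1;
}
```

With a single online node whose ID is 1, the old predicate reports that balancing should be enabled while the new one does not, which matches the behaviour described in the commit message.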
    • CMA: page_isolation: check buddy before accessing it · 1ae7013d
      Hui Zhu committed
      I had an issue:
      
          Unable to handle kernel NULL pointer dereference at virtual address 0000082a
          pgd = cc970000
          [0000082a] *pgd=00000000
          Internal error: Oops: 5 [#1] PREEMPT SMP ARM
          PC is at get_pageblock_flags_group+0x5c/0xb0
          LR is at unset_migratetype_isolate+0x148/0x1b0
          pc : [<c00cc9a0>]    lr : [<c0109874>]    psr: 80000093
          sp : c7029d00  ip : 00000105  fp : c7029d1c
          r10: 00000001  r9 : 0000000a  r8 : 00000004
          r7 : 60000013  r6 : 000000a4  r5 : c0a357e4  r4 : 00000000
          r3 : 00000826  r2 : 00000002  r1 : 00000000  r0 : 0000003f
          Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
          Control: 10c5387d  Table: 2cb7006a  DAC: 00000015
          Backtrace:
              get_pageblock_flags_group+0x0/0xb0
              unset_migratetype_isolate+0x0/0x1b0
              undo_isolate_page_range+0x0/0xdc
              __alloc_contig_range+0x0/0x34c
              alloc_contig_range+0x0/0x18
      
      This issue occurs because, when unset_migratetype_isolate() is called to
      unset a part of CMA memory, it tries to access the buddy page to get its
      status:
      
      		if (order >= pageblock_order) {
      			page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
      			buddy_idx = __find_buddy_index(page_idx, order);
      			buddy = page + (buddy_idx - page_idx);
      
      			if (!is_migrate_isolate_page(buddy)) {
      
      But the start address of this part of CMA memory is very close to a
      region of memory that is reserved at boot time (and is not in the buddy
      system), so the buddy page may lie in that reserved region.  Add a check
      before accessing it.
      
      [akpm@linux-foundation.org: use conventional code layout]
      Signed-off-by: Hui Zhu <zhuhui@xiaomi.com>
      Suggested-by: Laura Abbott <labbott@redhat.com>
      Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
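
The idea of checking the buddy before touching it can be sketched in user space as follows. This is only an illustration under stated assumptions: the buddy index is assumed to be computed by flipping the bit for the given order, and `struct page_info`, its `valid` flag and `buddy_is_usable()` are hypothetical stand-ins, not the check the patch actually adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state for a toy memory map. */
struct page_info {
	bool valid;    /* false for pages outside the buddy system */
	bool isolated; /* true if the page's block is isolated */
};

/* Assumed buddy computation: flip the bit that corresponds to the order. */
static size_t find_buddy_index(size_t page_idx, unsigned int order)
{
	return page_idx ^ ((size_t)1 << order);
}

/*
 * Return true only if the buddy exists in the map, is managed by the
 * allocator, and is not isolated. The validity test comes first so the
 * buddy's state is never read for a page that is not tracked.
 */
static bool buddy_is_usable(const struct page_info *map, size_t npages,
			    size_t page_idx, unsigned int order)
{
	size_t buddy_idx;

	assert(map != NULL);
	buddy_idx = find_buddy_index(page_idx, order);
	if (buddy_idx >= npages || !map[buddy_idx].valid)
		return false;
	return !map[buddy_idx].isolated;
}
```

Ordering the tests this way means a buddy that falls into a reserved region is rejected before any of its fields are inspected.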
    • gfp: add __GFP_NOACCOUNT · 8f4fc071
      Vladimir Davydov committed
      Not all kmem allocations should be accounted to memcg.  The following
      patch gives an example where accounting a certain type of allocation to
      memcg can effectively result in a memory leak.  This patch adds the
      __GFP_NOACCOUNT flag which, if passed to kmalloc and friends, will force
      the allocation to go through the root cgroup.  It will be used by the
      next patch.
      
      Note that, with kmemleak enabled, each kmalloc implies yet another
      allocation from the kmemleak_object cache, so we add __GFP_NOACCOUNT to
      gfp_kmemleak_mask.
      
      Alternatively, we could introduce a per kmem cache flag disabling
      accounting for all allocations of a particular kind, but (a) we would not
      be able to bypass accounting for kmalloc then and (b) a kmem cache with
      this flag set could not be merged with a kmem cache without this flag,
      which would increase the number of global caches and therefore
      fragmentation even if the memory cgroup controller is not used.
      
      Despite its generic name, currently __GFP_NOACCOUNT disables accounting
      only for kmem allocations while user page allocations are always charged.
      To catch abuse of this flag, a warning is issued on any attempt to pass
      it to mem_cgroup_try_charge.
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: <stable@vger.kernel.org> [4.0.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
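
As a rough illustration of the flag's effect, the sketch below routes a kmem charge to the root group when the flag is set. All names are hypothetical (`DEMO_GFP_NOACCOUNT`, `struct demo_cgroup`, `demo_charge_kmem()`), and the bit value is made up; it does not reflect the kernel's real flag value or its memcg charging code.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_GFP_NOACCOUNT 0x1u /* hypothetical bit, not the kernel's value */

/* Hypothetical stand-in for a memory cgroup's kmem counter. */
struct demo_cgroup {
	size_t kmem_charged; /* bytes charged to this group */
};

/* Pick the group to charge: the root group whenever the flag is set. */
static struct demo_cgroup *demo_charge_target(struct demo_cgroup *root,
					      struct demo_cgroup *cg,
					      unsigned int flags)
{
	return (flags & DEMO_GFP_NOACCOUNT) ? root : cg;
}

/* Charge `size` bytes to whichever group the flags select. */
static void demo_charge_kmem(struct demo_cgroup *root, struct demo_cgroup *cg,
			     size_t size, unsigned int flags)
{
	assert(root != NULL && cg != NULL);
	demo_charge_target(root, cg, flags)->kmem_charged += size;
}
```

A per-allocation flag like this leaves the decision to each call site, which is the property the commit message contrasts with a per-cache flag.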
  2. 06 May, 2015 4 commits
  3. 24 Apr, 2015 1 commit
    • writeback: use |1 instead of +1 to protect against div by zero · 464d1387
      Tejun Heo committed
      mm/page-writeback.c has several places where 1 is added to the divisor
      to prevent division by zero exceptions; however, if the original
      divisor is equivalent to -1, adding 1 leads to division by zero.
      
      There are three places where +1 is used for this purpose - one in
      pos_ratio_polynom() and two in bdi_position_ratio().  The second one
      in bdi_position_ratio() actually triggered div-by-zero oops on a
      machine running a 3.10 kernel.  The divisor is
      
        x_intercept - bdi_setpoint + 1 == span + 1
      
      span is confirmed to be (u32)-1.  It isn't clear how it ended up that
      way, but it could be from the write bandwidth calculation underflow fixed by
      c72efb65 ("writeback: fix possible underflow in write bandwidth
      calculation").
      
      At any rate, +1 isn't a proper protection against div-by-zero.  This
      patch converts all +1 protections to |1.  Note that
      bdi_update_dirty_ratelimit() was already using |1 before this patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@fb.com>
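
The arithmetic behind the change can be shown with a minimal user-space sketch. The helper names (`guard_plus_one`, `guard_or_one`, `safe_div`) are hypothetical and `uint32_t` stands in for the kernel's u32; this is not the code from mm/page-writeback.c.

```c
#include <assert.h>
#include <stdint.h>

/* +1 guard: wraps around to 0 when span is (uint32_t)-1. */
static uint32_t guard_plus_one(uint32_t span)
{
	return span + 1;
}

/* |1 guard: forces the lowest bit on, so the result can never be 0. */
static uint32_t guard_or_one(uint32_t span)
{
	return span | 1;
}

/* Divide using the |1 guard; the divisor is nonzero for every span. */
static uint32_t safe_div(uint32_t numerator, uint32_t span)
{
	uint32_t divisor = guard_or_one(span);

	assert(divisor != 0);
	return numerator / divisor;
}
```

The cost of the |1 form is that an even divisor is bumped to the next odd value while an odd divisor is left unchanged, a small inaccuracy in exchange for a divisor that is nonzero for every input.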
  4. 16 Apr, 2015 32 commits