1. 20 2月, 2013 1 次提交
  2. 11 12月, 2012 1 次提交
    • L
      Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage · caf49191
      Linus Torvalds 提交于
      This reverts commits a5091539 and
      d7c3b937.
      
      This is a revert of a revert of a revert.  In addition, it reverts the
      even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the
      original commits in linux-next.
      
      It turns out that the original patch really was bogus, and that the
      original revert was the correct thing to do after all.  We thought we
      had fixed the problem, and then reverted the revert, but the problem
      really is fundamental: waking up kswapd simply isn't the right thing to
      do, and direct reclaim sometimes simply _is_ the right thing to do.
      
      When certain allocations fail, we simply should try some direct reclaim,
      and if that fails, fail the allocation.  That's the right thing to do
      for THP allocations, which can easily fail, and the GPU allocations want
      to do that too.
      
      So starting kswapd is sometimes simply wrong, and removing the flag that
      said "don't start kswapd" was a mistake.  Let's hope we never revisit
      this mistake again - and certainly not this many times ;)
      Acked-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      caf49191
  3. 08 12月, 2012 1 次提交
    • E
      net: gro: fix possible panic in skb_gro_receive() · c3c7c254
      Eric Dumazet 提交于
      commit 2e71a6f8 (net: gro: selective flush of packets) added
      a bug for skbs using frag_list. This part of the GRO stack is rarely
      used, as it needs skb not using a page fragment for their skb->head.
      
      Most drivers do use a page fragment, but some of them use GFP_KERNEL
      allocations for the initial fill of their RX ring buffer.
      
      napi_gro_flush() overwrite skb->prev that was used for these skb to
      point to the last skb in frag_list.
      
      Fix this using a separate field in struct napi_gro_cb to point to the
      last fragment.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c3c7c254
  4. 07 12月, 2012 1 次提交
    • M
      tmpfs: fix shared mempolicy leak · 18a2f371
      Mel Gorman 提交于
      This fixes a regression in 3.7-rc, which has since gone into stable.
      
      Commit 00442ad0 ("mempolicy: fix a memory corruption by refcount
      imbalance in alloc_pages_vma()") changed get_vma_policy() to raise the
      refcount on a shmem shared mempolicy; whereas shmem_alloc_page() went
      on expecting alloc_page_vma() to drop the refcount it had acquired.
      This deserves a rework: but for now fix the leak in shmem_alloc_page().
      
      Hugh: shmem_swapin() did not need a fix, but surely it's clearer to use
      the same refcounting there as in shmem_alloc_page(), delete its onstack
      mempolicy, and the strange mpol_cond_copy() and __mpol_cond_copy() -
      those were invented to let swapin_readahead() make an unknown number of
      calls to alloc_pages_vma() with one mempolicy; but since 00442ad0,
      alloc_pages_vma() has kept refcount in balance, so now no problem.
      Reported-and-tested-by: NTommi Rantala <tt.rantala@gmail.com>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      18a2f371
  5. 03 12月, 2012 1 次提交
    • J
      linux/kernel.h: define SYMBOL_PREFIX · cbdbf2ab
      James Hogan 提交于
      Define SYMBOL_PREFIX to be the same as CONFIG_SYMBOL_PREFIX if set by
      the architecture, or "" otherwise. This avoids the need for ugly #ifdefs
      whenever symbols are referenced in asm blocks.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      cbdbf2ab
  6. 01 12月, 2012 1 次提交
    • A
      revert "Revert "mm: remove __GFP_NO_KSWAPD"" · a5091539
      Andrew Morton 提交于
      It apepars that this patch was innocent, and we hope that "mm: avoid
      waking kswapd for THP allocations when compaction is deferred or
      contended" will fix the final kswapd-spinning cause.
      
      Cc: Zdenek Kabelac <zkabelac@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a5091539
  7. 30 11月, 2012 2 次提交
  8. 28 11月, 2012 1 次提交
  9. 27 11月, 2012 2 次提交
    • M
      Revert "mm: remove __GFP_NO_KSWAPD" · 82b212f4
      Mel Gorman 提交于
      With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
      based on failures" reverted, Zdenek Kabelac reported the following
      
        Hmm,  so it's just took longer to hit the problem and observe
        kswapd0 spinning on my CPU again - it's not as endless like before -
        but still it easily eats minutes - it helps to	turn off  Firefox
        or TB  (memory hungry apps) so kswapd0 stops soon - and restart
        those apps again.  (And I still have like >1GB of cached memory)
      
        kswapd0         R  running task        0    30      2 0x00000000
        Call Trace:
          preempt_schedule+0x42/0x60
          _raw_spin_unlock+0x55/0x60
          put_super+0x31/0x40
          drop_super+0x22/0x30
          prune_super+0x149/0x1b0
          shrink_slab+0xba/0x510
      
      The sysrq+m indicates the system has no swap so it'll never reclaim
      anonymous pages as part of reclaim/compaction.  That is one part of the
      problem but not the root cause as file-backed pages could also be
      reclaimed.
      
      The likely underlying problem is that kswapd is woken up or kept awake
      for each THP allocation request in the page allocator slow path.
      
      If compaction fails for the requesting process then compaction will be
      deferred for a time and direct reclaim is avoided.  However, if there
      are a storm of THP requests that are simply rejected, it will still be
      the the case that kswapd is awake for a prolonged period of time as
      pgdat->kswapd_max_order is updated each time.  This is noticed by the
      main kswapd() loop and it will not call kswapd_try_to_sleep().  Instead
      it will loopp, shrinking a small number of pages and calling
      shrink_slab() on each iteration.
      
      The temptation is to supply a patch that checks if kswapd was woken for
      THP and if so ignore pgdat->kswapd_max_order but it'll be a hack and not
      backed up by proper testing.  As 3.7 is very close to release and this
      is not a bug we should release with, a safer path is to revert "mm:
      remove __GFP_NO_KSWAPD" for now and revisit it with the view to ironing
      out the balance_pgdat() logic in general.
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: Zdenek Kabelac <zkabelac@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      82b212f4
    • T
      include/linux/bug.h: fix sparse warning related to BUILD_BUG_ON_INVALID · c5782e9f
      Tushar Behera 提交于
      Commit baf05aa9 ("bug: introduce BUILD_BUG_ON_INVALID() macro")
      introduces this macro only when _CHECKER_ is not defined.  Define a
      silent macro in the else condition to fix following sparse warning:
      
        mm/filemap.c:395:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        mm/filemap.c:396:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        mm/filemap.c:397:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        include/linux/mm.h:419:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        include/linux/mm.h:419:9: error: not a function <noident>
      Signed-off-by: NTushar Behera <tushar.behera@linaro.org>
      Acked-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c5782e9f
  10. 24 11月, 2012 1 次提交
  11. 22 11月, 2012 1 次提交
  12. 20 11月, 2012 1 次提交
    • D
      perf: Make perf build for x86 with UAPI disintegration applied · d2709c7c
      David Howells 提交于
      Make perf build for x86 once the UAPI disintegration patches for that arch
      have been applied by adding the appropriate -I flags - in the right order -
      and then converting some #includes that use ../.. notation to find main kernel
      headerfiles to use <asm/foo.h> and <linux/foo.h> instead.
      
      Note that -Iarch/foo/include/uapi is present _before_ -Iarch/foo/include.
      This makes sure we get the userspace version of the pt_regs struct.  Ideally,
      we wouldn't have the latter -I flag at all, but unfortunately we want
      asm/svm.h and asm/vmx.h in builtin-kvm.c and these aren't part of the UAPI -
      at least not for x86.  I wonder if the bits outside of the __KERNEL__ guards
      *should* be transferred there.
      
      I note also that perf seems to do its dependency handling manually by listing
      all the header files it might want to use in LIB_H in the Makefile.  Can this
      be changed to use -MD?
      
      Note that to do make this work, we need to export and UAPI disintegrate
      linux/hw_breakpoint.h, which I think should've been exported previously so that
      perf can access the bits.  We have to do this in the same patch to maintain
      bisectability.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      d2709c7c
  13. 17 11月, 2012 3 次提交
    • A
      revert "mm: fix-up zone present pages" · 5576646f
      Andrew Morton 提交于
      Revert commit 7f1290f2 ("mm: fix-up zone present pages")
      
      That patch tried to fix a issue when calculating zone->present_pages,
      but it caused a regression on 32bit systems with HIGHMEM.  With that
      change, reset_zone_present_pages() resets all zone->present_pages to
      zero, and fixup_zone_present_pages() is called to recalculate
      zone->present_pages when the boot allocator frees core memory pages into
      buddy allocator.  Because highmem pages are not freed by bootmem
      allocator, all highmem zones' present_pages becomes zero.
      
      Various options for improving the situation are being discussed but for
      now, let's return to the 3.6 code.
      
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Tested-by: NChris Clayton <chris2553@googlemail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5576646f
    • R
      rapidio: fix kernel-doc warnings · 2ca3cb50
      Randy Dunlap 提交于
      Fix rapidio kernel-doc warnings:
      
        Warning(drivers/rapidio/rio.c:415): No description found for parameter 'local'
        Warning(drivers/rapidio/rio.c:415): Excess function parameter 'lstart' description in 'rio_map_inb_region'
        Warning(include/linux/rio.h:290): No description found for parameter 'switches'
        Warning(include/linux/rio.h:290): No description found for parameter 'destid_table'
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Acked-by: NAlexandre Bounine <alexandre.bounine@idt.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2ca3cb50
    • H
      memcg: fix hotplugged memory zone oops · bea8c150
      Hugh Dickins 提交于
      When MEMCG is configured on (even when it's disabled by boot option),
      when adding or removing a page to/from its lru list, the zone pointer
      used for stats updates is nowadays taken from the struct lruvec.  (On
      many configurations, calculating zone from page is slower.)
      
      But we have no code to update all the lruvecs (per zone, per memcg) when
      a memory node is hotadded.  Here's an extract from the oops which
      results when running numactl to bind a program to a newly onlined node:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000f60
        IP:  __mod_zone_page_state+0x9/0x60
        Pid: 1219, comm: numactl Not tainted 3.6.0-rc5+ #180 Bochs Bochs
        Process numactl (pid: 1219, threadinfo ffff880039abc000, task ffff8800383c4ce0)
        Call Trace:
          __pagevec_lru_add_fn+0xdf/0x140
          pagevec_lru_move_fn+0xb1/0x100
          __pagevec_lru_add+0x1c/0x30
          lru_add_drain_cpu+0xa3/0x130
          lru_add_drain+0x2f/0x40
         ...
      
      The natural solution might be to use a memcg callback whenever memory is
      hotadded; but that solution has not been scoped out, and it happens that
      we do have an easy location at which to update lruvec->zone.  The lruvec
      pointer is discovered either by mem_cgroup_zone_lruvec() or by
      mem_cgroup_page_lruvec(), and both of those do know the right zone.
      
      So check and set lruvec->zone in those; and remove the inadequate
      attempt to set lruvec->zone from lruvec_init(), which is called before
      NODE_DATA(node) has been allocated in such cases.
      
      Ah, there was one exceptionr.  For no particularly good reason,
      mem_cgroup_force_empty_list() has its own code for deciding lruvec.
      Change it to use the standard mem_cgroup_zone_lruvec() and
      mem_cgroup_get_lru_size() too.  In fact it was already safe against such
      an oops (the lru lists in danger could only be empty), but we're better
      proofed against future changes this way.
      
      I've marked this for stable (3.6) since we introduced the problem in 3.5
      (now closed to stable); but I have no idea if this is the only fix
      needed to get memory hotadd working with memcg in 3.6, and received no
      answer when I enquired twice before.
      Reported-by: NTang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bea8c150
  14. 16 11月, 2012 1 次提交
    • I
      clk: remove inline usage from clk-provider.h · 93532c8a
      Igor Mazanov 提交于
      Users of GCC 4.7 have reported compiler errors due to having inline
      applied to function declarations in clk-provider.h.  The definitions
      exist in drivers/clk/clk.c.  An example error:
      
      In file included from arch/arm/mach-omap2/clockdomain.c:25:0:
      arch/arm/mach-omap2/clockdomain.c: In function ‘clkdm_clk_disable’:
      include/linux/clk-provider.h:338:12: error: inlining failed in call to always_inline ‘__clk_get_enable_count’: function body not available
      arch/arm/mach-omap2/clockdomain.c:1001:28: error: called from here
      make[1]: *** [arch/arm/mach-omap2/clockdomain.o] Error 1
      make: *** [arch/arm/mach-omap2] Error 2
      
      This patch removes the use of inline from include/linux/clk-provider.h
      but keeps the function definitions in drivers/clk/clk.c as inlined since
      they are one-liners.
      Signed-off-by: NIgor Mazanov <i.mazanov@gmail.com>
      Acked-by: NPaul Walmsley <paul@pwsan.com>
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      [mturquette@linaro.org: improved subject, added changelog]
      93532c8a
  15. 14 11月, 2012 1 次提交
  16. 10 11月, 2012 1 次提交
    • A
      of/address: sparc: Declare of_address_to_resource() as an extern function for sparc again · 0bce04be
      Andreas Larsson 提交于
      This bug-fix makes sure that of_address_to_resource is defined extern for sparc
      so that the sparc-specific implementation of of_address_to_resource() is once
      again used when including include/linux/of_address.h in a sparc context. A
      number of drivers in mainline relies on this function working for sparc.
      
      The bug was introduced in a850a755, "of/address:
      add empty static inlines for !CONFIG_OF". Contrary to that commit title, the
      static inlines are added for !CONFIG_OF_ADDRESS, and CONFIG_OF_ADDRESS is never
      defined for sparc. This is good behavior for the other functions in
      include/linux/of_address.h, as the extern functions defined in
      drivers/of/address.c only gets linked when OF_ADDRESS is configured. However,
      for of_address_to_resource there exists a sparc-specific implementation in
      arch/sparc/arch/sparc/kernel/of_device_common.c
      
      Solution suggested by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NAndreas Larsson <andreas@gaisler.com>
      Acked-by: NRob Herring <rob.herring@calxeda.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0bce04be
  17. 08 11月, 2012 4 次提交
  18. 04 11月, 2012 1 次提交
  19. 03 11月, 2012 1 次提交
  20. 01 11月, 2012 1 次提交
    • X
      KVM: x86: fix vcpu->mmio_fragments overflow · 87da7e66
      Xiao Guangrong 提交于
      After commit b3356bf0 (KVM: emulator: optimize "rep ins" handling),
      the pieces of io data can be collected and write them to the guest memory
      or MMIO together
      
      Unfortunately, kvm splits the mmio access into 8 bytes and store them to
      vcpu->mmio_fragments. If the guest uses "rep ins" to move large data, it
      will cause vcpu->mmio_fragments overflow
      
      The bug can be exposed by isapc (-M isapc):
      
      [23154.818733] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ ......]
      [23154.858083] Call Trace:
      [23154.859874]  [<ffffffffa04f0e17>] kvm_get_cr8+0x1d/0x28 [kvm]
      [23154.861677]  [<ffffffffa04fa6d4>] kvm_arch_vcpu_ioctl_run+0xcda/0xe45 [kvm]
      [23154.863604]  [<ffffffffa04f5a1a>] ? kvm_arch_vcpu_load+0x17b/0x180 [kvm]
      
      Actually, we can use one mmio_fragment to store a large mmio access then
      split it when we pass the mmio-exit-info to userspace. After that, we only
      need two entries to store mmio info for the cross-mmio pages access
      Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      87da7e66
  21. 29 10月, 2012 2 次提交
  22. 26 10月, 2012 1 次提交
  23. 25 10月, 2012 2 次提交
  24. 24 10月, 2012 2 次提交
  25. 23 10月, 2012 2 次提交
  26. 22 10月, 2012 1 次提交
  27. 20 10月, 2012 2 次提交
    • C
      pidns: remove recursion from free_pid_ns() · bbc2e3ef
      Cyrill Gorcunov 提交于
      free_pid_ns() operates in a recursive fashion:
      
      free_pid_ns(parent)
        put_pid_ns(parent)
          kref_put(&ns->kref, free_pid_ns);
            free_pid_ns
      
      thus if there was a huge nesting of namespaces the userspace may trigger
      avalanche calling of free_pid_ns leading to kernel stack exhausting and a
      panic eventually.
      
      This patch turns the recursion into an iterative loop.
      
      Based on a patch by Andrew Vagin.
      
      [akpm@linux-foundation.org: export put_pid_ns() to modules]
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Andrew Vagin <avagin@openvz.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bbc2e3ef
    • R
      linux/coredump.h needs asm/siginfo.h · 1d46e232
      Richard Weinberger 提交于
      Commit 5ab1c309 ("coredump: pass siginfo_t* to do_coredump() and
      below, not merely signr") added siginfo_t to linux/coredump.h but forgot
      to include asm/siginfo.h.  This breaks the build for UML/i386.  (And any
      other arch where asm/siginfo.h is not magically preincluded...)
      
        In file included from arch/x86/um/elfcore.c:2:0: include/linux/coredump.h:15:25: error: unknown type name 'siginfo_t'
        make[1]: *** [arch/x86/um/elfcore.o] Error 1
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      Cc: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Amerigo Wang <amwang@redhat.com>
      Cc: "Jonathan M. Foote" <jmfoote@cert.org>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Pedro Alves <palves@redhat.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1d46e232
  28. 19 10月, 2012 1 次提交