1. 14 4月, 2013 2 次提交
  2. 26 3月, 2013 1 次提交
  3. 23 3月, 2013 3 次提交
    • R
      mm: zone_end_pfn is too small · f9228b20
      Russ Anderson 提交于
      Booting with 32 TBytes memory hits BUG at mm/page_alloc.c:552! (output
      below).
      
      The key hint is "page 4294967296 outside zone".
      4294967296 = 0x100000000 (bit 32 is set).
      
      The problem is in include/linux/mmzone.h:
      
        530 static inline unsigned zone_end_pfn(const struct zone *zone)
        531 {
        532         return zone->zone_start_pfn + zone->spanned_pages;
        533 }
      
      zone_end_pfn is "unsigned" (32 bits).  Changing it to "unsigned long"
      (64 bits) fixes the problem.
      
      zone_end_pfn() was added recently in commit 108bcc96 ("mm: add & use
      zone_end_pfn() and zone_spans_pfn()")
      
      Output from the failure.
      
        No AGP bridge found
        page 4294967296 outside zone [ 4294967296 - 4327469056 ]
        ------------[ cut here ]------------
        kernel BUG at mm/page_alloc.c:552!
        invalid opcode: 0000 [#1] SMP
        Modules linked in:
        CPU 0
        Pid: 0, comm: swapper Not tainted 3.9.0-rc2.dtp+ #10
        RIP: free_one_page+0x382/0x430
        Process swapper (pid: 0, threadinfo ffffffff81942000, task ffffffff81955420)
        Call Trace:
          __free_pages_ok+0x96/0xb0
          __free_pages+0x25/0x50
          __free_pages_bootmem+0x8a/0x8c
          __free_memory_core+0xea/0x131
          free_low_memory_core_early+0x4a/0x98
          free_all_bootmem+0x45/0x47
          mem_init+0x7b/0x14c
          start_kernel+0x216/0x433
          x86_64_start_reservations+0x2a/0x2c
          x86_64_start_kernel+0x144/0x153
        Code: 89 f1 ba 01 00 00 00 31 f6 d3 e2 4c 89 ef e8 66 a4 01 00 e9 2c fe ff ff 0f 0b eb fe 0f 0b 66 66 2e 0f 1f 84 00 00 00 00 00 eb f3 <0f> 0b eb fe 0f 0b 0f 1f 84 00 00 00 00 00 eb f6 0f 0b eb fe 49
      Signed-off-by: NRuss Anderson <rja@sgi.com>
      Reported-by: NGeorge Beshers <gbeshers@sgi.com>
      Acked-by: NHedi Berriche <hedi@sgi.com>
      Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f9228b20
    • F
      printk: Provide a wake_up_klogd() off-case · dc72c32e
      Frederic Weisbecker 提交于
      wake_up_klogd() is useless when CONFIG_PRINTK=n because neither printk()
      nor printk_sched() are in use and there are actually no waiter on
      log_wait waitqueue.  It should be a stub in this case for users like
      bust_spinlocks().
      
      Otherwise this results in this warning when CONFIG_PRINTK=n and
      CONFIG_IRQ_WORK=n:
      
      	kernel/built-in.o In function `wake_up_klogd':
      	(.text.wake_up_klogd+0xb4): undefined reference to `irq_work_queue'
      
      To fix this, provide an off-case for wake_up_klogd() when
      CONFIG_PRINTK=n.
      
      There is much more from console_unlock() and other console related code
      in printk.c that should be moved under CONFIG_PRINTK.  But for now,
      focus on a minimal fix as we passed the merged window already.
      
      [akpm@linux-foundation.org: include printk.h in bust_spinlocks.c]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Reported-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dc72c32e
    • J
      irq_work.h: fix warning when CONFIG_IRQ_WORK=n · fe8d5261
      James Hogan 提交于
      A randconfig caught repeated compiler warnings when CONFIG_IRQ_WORK=n
      due to the definition of a non-inline static function in
      <linux/irq_work.h>:
      
        include/linux/irq_work.h +40 : warning: 'irq_work_needs_cpu' defined but not used
      
      Make it inline to supress the warning.  This is caused commit
      00b42959 ("irq_work: Don't stop the tick with pending works") merged
      in v3.9-rc1.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fe8d5261
  4. 22 3月, 2013 2 次提交
  5. 20 3月, 2013 1 次提交
  6. 19 3月, 2013 2 次提交
  7. 18 3月, 2013 2 次提交
  8. 17 3月, 2013 1 次提交
  9. 16 3月, 2013 4 次提交
  10. 15 3月, 2013 2 次提交
    • Z
      Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug · aaa0c23c
      Zhouyi Zhou 提交于
      When neighbour table is full, dst_neigh_lookup/dst_neigh_lookup_skb will return
      -ENOBUFS which is absolutely non zero, while all the code in kernel which use
      above functions assume failure only on zero return which will cause panic. (for
      example: : https://bugzilla.kernel.org/show_bug.cgi?id=54731).
      
      This patch corrects above error with smallest changes to kernel source code and
      also correct two return value check missing bugs in drivers/infiniband/hw/cxgb4/cm.c
      
      Tested on my x86_64 SMP machine
      Reported-by: NZhouyi Zhou <zhouzhouyi@gmail.com>
      Tested-by: NZhouyi Zhou <zhouzhouyi@gmail.com>
      Signed-off-by: NZhouyi Zhou <zhouzhouyi@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aaa0c23c
    • P
      list: Fix double fetch of pointer in hlist_entry_safe() · f65846a1
      Paul E. McKenney 提交于
      The current version of hlist_entry_safe() fetches the pointer twice,
      once to test for NULL and the other to compute the offset back to the
      enclosing structure.  This is OK for normal lock-based use because in
      that case, the pointer cannot change.  However, when the pointer is
      protected by RCU (as in "rcu_dereference(p)"), then the pointer can
      change at any time.  This use case can result in the following sequence
      of events:
      
      1.	CPU 0 invokes hlist_entry_safe(), fetches the RCU-protected
      	pointer as sees that it is non-NULL.
      
      2.	CPU 1 invokes hlist_del_rcu(), deleting the entry that CPU 0
      	just fetched a pointer to.  Because this is the last entry
      	in the list, the pointer fetched by CPU 0 is now NULL.
      
      3.	CPU 0 refetches the pointer, obtains NULL, and then gets a
      	NULL-pointer crash.
      
      This commit therefore applies gcc's "({ })" statement expression to
      create a temporary variable so that the specified pointer is fetched
      only once, avoiding the above sequence of events.  Please note that
      it is the caller's responsibility to use rcu_dereference() as needed.
      This allows RCU-protected uses to work correctly without imposing
      any additional overhead on the non-RCU case.
      
      Many thanks to Eric Dumazet for spotting root cause!
      Reported-by: NCAI Qian <caiqian@redhat.com>
      Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: NLi Zefan <lizefan@huawei.com>
      f65846a1
  11. 14 3月, 2013 8 次提交
    • P
      skb: Propagate pfmemalloc on skb from head page only · cca7af38
      Pavel Emelyanov 提交于
      Hi.
      
      I'm trying to send big chunks of memory from application address space via
      TCP socket using vmsplice + splice like this
      
         mem = mmap(128Mb);
         vmsplice(pipe[1], mem); /* splice memory into pipe */
         splice(pipe[0], tcp_socket); /* send it into network */
      
      When I'm lucky and a huge page splices into the pipe and then into the socket
      _and_ client and server ends of the TCP connection are on the same host,
      communicating via lo, the whole connection gets stuck! The sending queue
      becomes full and app stops writing/splicing more into it, but the receiving
      queue remains empty, and that's why.
      
      The __skb_fill_page_desc observes a tail page of a huge page and erroneously
      propagates its page->pfmemalloc value onto socket (the pfmemalloc on tail pages
      contain garbage). Then this skb->pfmemalloc leaks through lo and due to the
      
          tcp_v4_rcv
          sk_filter
              if (skb->pfmemalloc && !sock_flag(sk, SOCK_MEMALLOC)) /* true */
                  return -ENOMEM
              goto release_and_discard;
      
      no packets reach the socket. Even TCP re-transmits are dropped by this, as skb
      cloning clones the pfmemalloc flag as well.
      
      That said, here's the proper page->pfmemalloc propagation onto socket: we
      must check the huge-page's head page only, other pages' pfmemalloc and mapping
      values do not contain what is expected in this place. However, I'm not sure
      whether this fix is _complete_, since pfmemalloc propagation via lo also
      oesn't look great.
      
      Both, bit propagation from page to skb and this check in sk_filter, were
      introduced by c48a11c7 (netvm: propagate page->pfmemalloc to skb), in v3.5 so
      Mel and stable@ are in Cc.
      Signed-off-by: NPavel Emelyanov <xemul@parallels.com>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cca7af38
    • E
      tcp: fix skb_availroom() · 16fad69c
      Eric Dumazet 提交于
      Chrome OS team reported a crash on a Pixel ChromeBook in TCP stack :
      
      https://code.google.com/p/chromium/issues/detail?id=182056
      
      commit a21d4572 (tcp: avoid order-1 allocations on wifi and tx
      path) did a poor choice adding an 'avail_size' field to skb, while
      what we really needed was a 'reserved_tailroom' one.
      
      It would have avoided commit 22b4a4f2 (tcp: fix retransmit of
      partially acked frames) and this commit.
      
      Crash occurs because skb_split() is not aware of the 'avail_size'
      management (and should not be aware)
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NMukesh Agrawal <quiche@chromium.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      16fad69c
    • B
      mtd: nand: reintroduce NAND_NO_READRDY as NAND_NEED_READRDY · 5bc7c33c
      Brian Norris 提交于
      This partially reverts commit 1696e6bc
      ("mtd: nand: kill NAND_NO_READRDY").
      
      In that patch I overlooked a few things.
      
      The original documentation for NAND_NO_READRDY included "True for all
      large page devices, as they do not support autoincrement." I was
      conflating "not support autoincrement" with the NAND_NO_AUTOINCR option,
      which was in fact doing nothing. So, when I dropped NAND_NO_AUTOINCR, I
      concluded that I then could harmlessly drop NAND_NO_READRDY. But of
      course the fact the NAND_NO_AUTOINCR was doing nothing didn't mean
      NAND_NO_READRDY was doing nothing...
      
      So, NAND_NO_READRDY is re-introduced as NAND_NEED_READRDY and applied
      only to those few remaining small-page NAND which needed it in the first
      place.
      
      Cc: stable@kernel.org [3.5+]
      Reported-by: NAlexander Shiyan <shc_work@mail.ru>
      Tested-by: NAlexander Shiyan <shc_work@mail.ru>
      Signed-off-by: NBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
      5bc7c33c
    • D
      UAPI: fix endianness conditionals in linux/raid/md_p.h · ca044f9a
      David Howells 提交于
      In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
      compared against __BYTE_ORDER in preprocessor conditionals where these are
      exposed to userspace (that is they're not inside __KERNEL__ conditionals).
      
      However, in the main kernel the norm is to check for
      "defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
      this has incorrectly leaked into the userspace headers.
      
      The definition of struct mdp_superblock_s in linux/raid/md_p.h is wrong in
      this way.  Note that userspace will likely interpret the ordering of the
      fields incorrectly as the big-endian variant on a little-endian machines -
      depending on header inclusion order.
      
      [!!!] NOTE [!!!]  This patch may adversely change the userspace API.  It might
      be better to fix the ordering of events_hi, events_lo, cp_events_hi and
      cp_events_lo in struct mdp_superblock_s / typedef mdp_super_t.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NNeilBrown <neilb@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ca044f9a
    • D
      UAPI: fix endianness conditionals in linux/acct.h · 29ba06b9
      David Howells 提交于
      In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
      compared against __BYTE_ORDER in preprocessor conditionals where these are
      exposed to userspace (that is they're not inside __KERNEL__ conditionals).
      
      However, in the main kernel the norm is to check for
      "defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
      this has incorrectly leaked into the userspace headers.
      
      The definition of ACCT_BYTEORDER in linux/acct.h is wrong in this way.
      Note that userspace will likely interpret this incorrectly as the
      big-endian variant on little-endian machines - depending on header
      inclusion order.
      
      [!!!] NOTE [!!!]  This patch may adversely change the userspace API.  It might
      be better to fix the value of ACCT_BYTEORDER.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      29ba06b9
    • D
      UAPI: fix endianness conditionals in linux/aio_abi.h · 51b154ed
      David Howells 提交于
      In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
      compared against __BYTE_ORDER in preprocessor conditionals where these are
      exposed to userspace (that is they're not inside __KERNEL__ conditionals).
      
      However, in the main kernel the norm is to check for
      "defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
      this has incorrectly leaked into the userspace headers.
      
      The definition of PADDED() in linux/aio_abi.h is wrong in this way.  Note
      that userspace will likely interpret this and thus the order of fields in
      struct iocb incorrectly as the little-endian variant on big-endian
      machines - depending on header inclusion order.
      
      [!!!] NOTE [!!!]  This patch may adversely change the userspace API.  It might
      be better to fix the ordering of aio_key and aio_reserved1 in struct iocb.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NBenjamin LaHaise <bcrl@kvack.org>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      51b154ed
    • T
      idr: deprecate idr_pre_get() and idr_get_new[_above]() · c8615d37
      Tejun Heo 提交于
      Now that all in-kernel users are converted to ues the new alloc
      interface, mark the old interface deprecated.  We should be able to
      remove these in a few releases.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c8615d37
    • A
      include/linux/res_counter.h needs errno.h · ebf47beb
      Andrew Morton 提交于
      alpha allmodconfig:
      
        In file included from mm/memcontrol.c:28:
        include/linux/res_counter.h: In function 'res_counter_set_limit':
        include/linux/res_counter.h:203: error: 'EBUSY' undeclared (first use in this function)
        include/linux/res_counter.h:203: error: (Each undeclared identifier is reported only once
        include/linux/res_counter.h:203: error: for each function it appears in.)
      
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ebf47beb
  12. 13 3月, 2013 5 次提交
  13. 12 3月, 2013 4 次提交
  14. 09 3月, 2013 1 次提交
  15. 08 3月, 2013 1 次提交
  16. 06 3月, 2013 1 次提交
    • K
      acpi: Export the acpi_processor_get_performance_info · c705c78c
      Konrad Rzeszutek Wilk 提交于
      The git commit d5aaffa9
      (cpufreq: handle cpufreq being disabled for all exported function)
      tightens the cpufreq API by returning errors when disable_cpufreq()
      had been called.
      
      The problem we are hitting is that the module xen-acpi-processor which
      uses the ACPI's functions: acpi_processor_register_performance,
      acpi_processor_preregister_performance, and acpi_processor_notify_smm
      fails at acpi_processor_register_performance with -22.
      
      Note that earlier during bootup in arch/x86/xen/setup.c there is also
      an call to cpufreq's API: disable_cpufreq().
      
      This is b/c we want the Linux kernel to parse the ACPI data, but leave
      the cpufreq decisions to the hypervisor.
      
      In v3.9 all the checks that d5aaffa9
      added are now hit and the calls to cpufreq_register_notifier will now
      fail. This means that acpi_processor_ppc_init ends up printing:
      
      "Warning: Processor Platform Limit not supported"
      
      and the acpi_processor_ppc_status is not set.
      
      The repercussions of that is that the call to
      acpi_processor_register_performance fails right away at:
      
      	if (!(acpi_processor_ppc_status & PPC_REGISTERED))
      
      and we don't progress any further on parsing and extracting the _P*
      objects.
      
      The only reason the Xen code called that function was b/c it was
      exported and the only way to gather the P-states. But we can also
      just make acpi_processor_get_performance_info be exported and not
      use acpi_processor_register_performance. This patch does so.
      Acked-by: NRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      c705c78c