1. 24 11月, 2009 1 次提交
    • C
      x86: SGI UV: Fix BAU initialization · e38e2af1
      Cliff Wickman 提交于
      A memory mapped register that affects the SGI UV Broadcast
      Assist Unit's interrupt handling may sometimes be unintialized.
      
      Remove the condition on its initialization, as that condition
      can be randomly satisfied by a hardware reset.
      Signed-off-by: NCliff Wickman <cpw@sgi.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <E1NBGB9-0005nU-Dp@eag09.americas.sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e38e2af1
  2. 23 11月, 2009 2 次提交
    • Y
      x86, numa: Use near(er) online node instead of roundrobin for NUMA · d9c2d5ac
      Yinghai Lu 提交于
      CPU to node mapping is set via the following sequence:
      
       1. numa_init_array(): Set up roundrobin from cpu to online node
      
       2. init_cpu_to_node(): Set that according to apicid_to_node[]
      			according to srat only handle the node that
      			is online, and leave other cpu on node
      			without ram (aka not online) to still
      			roundrobin.
      
      3. later call srat_detect_node for Intel/AMD, will use first_online
         node or nearby node.
      
      Problem is that setup_per_cpu_areas() is not called between 2 and 3,
      the per_cpu for cpu on node with ram is on different node, and could
      put that on node with two hops away.
      
      So try to optimize this and add find_near_online_node() and call
      init_cpu_to_node().
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4B07A739.3030104@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d9c2d5ac
    • Y
      x86: Change crash kernel to reserve via reserve_early() · 44280733
      Yinghai Lu 提交于
      use find_e820_area()/reserve_early() instead.
      
      -v2: address Eric's request, to restore original semantics.
           will fail, if the provided address can not be used.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NEric W. Biederman <ebiederm@xmission.com>
      LKML-Reference: <4B09E2F9.7040403@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      44280733
  3. 19 11月, 2009 1 次提交
    • J
      x86: Eliminate redundant/contradicting cache line size config options · 350f8f56
      Jan Beulich 提交于
      Rather than having X86_L1_CACHE_BYTES and X86_L1_CACHE_SHIFT
      (with inconsistent defaults), just having the latter suffices as
      the former can be easily calculated from it.
      
      To be consistent, also change X86_INTERNODE_CACHE_BYTES to
      X86_INTERNODE_CACHE_SHIFT, and set it to 7 (128 bytes) for NUMA
      to account for last level cache line size (which here matters
      more than L1 cache line size).
      
      Finally, make sure the default value for X86_L1_CACHE_SHIFT,
      when X86_GENERIC is selected, is being seen before that for the
      individual CPU model options (other than on x86-64, where
      GENERIC_CPU is part of the choice construct, X86_GENERIC is a
      separate option on ix86).
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Acked-by: NRavikiran Thirumalai <kiran@scalex86.org>
      Acked-by: NNick Piggin <npiggin@suse.de>
      LKML-Reference: <4AFD5710020000780001F8F0@vpn.id2.novell.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      350f8f56
  4. 18 11月, 2009 1 次提交
  5. 17 11月, 2009 5 次提交
    • K
      x86, mm: Report state of NX protections during boot · 4b0f3b81
      Kees Cook 提交于
      It is possible for x86_64 systems to lack the NX bit either due to the
      hardware lacking support or the BIOS having turned off the CPU capability,
      so NX status should be reported.  Additionally, anyone booting NX-capable
      CPUs in 32bit mode without PAE will lack NX functionality, so this change
      provides feedback for that case as well.
      Signed-off-by: NKees Cook <kees.cook@canonical.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <1258154897-6770-6-git-send-email-hpa@zytor.com>
      4b0f3b81
    • H
      x86, mm: Clean up and simplify NX enablement · 4763ed4d
      H. Peter Anvin 提交于
      The 32- and 64-bit code used very different mechanisms for enabling
      NX, but even the 32-bit code was enabling NX in head_32.S if it is
      available.  Furthermore, we had a bewildering collection of tests for
      the available of NX.
      
      This patch:
      
      a) merges the 32-bit set_nx() and the 64-bit check_efer() function
         into a single x86_configure_nx() function.  EFER control is left
         to the head code.
      
      b) eliminates the nx_enabled variable entirely.  Things that need to
         test for NX enablement can verify __supported_pte_mask directly,
         and cpu_has_nx gives the supported status of NX.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Vegard Nossum <vegardno@ifi.uio.no>
      Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      LKML-Reference: <1258154897-6770-5-git-send-email-hpa@zytor.com>
      Acked-by: NKees Cook <kees.cook@canonical.com>
      4763ed4d
    • H
      x86, pageattr: Make set_memory_(x|nx) aware of NX support · 583140af
      H. Peter Anvin 提交于
      Make set_memory_x/set_memory_nx directly aware of if NX is supported
      in the system or not, rather than requiring that every caller assesses
      that support independently.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Tim Starling <tstarling@wikimedia.org>
      Cc: Hannes Eder <hannes@hanneseder.net>
      LKML-Reference: <1258154897-6770-4-git-send-email-hpa@zytor.com>
      Acked-by: NKees Cook <kees.cook@canonical.com>
      583140af
    • H
      x86, sleep: Always save the value of EFER · a7c4c0d9
      H. Peter Anvin 提交于
      Always save the value of EFER, regardless of the state of NX.  Since
      EFER may not actually exist, use rdmsr_safe() to do so.
      
      v2: check the return value from rdmsr_safe() instead of relying on
          the output values being unchanged on error.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Acked-by: NRafael J. Wysocki <rjw@sisk.pl>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Nigel Cunningham <nigel@tuxonice.net>
      LKML-Reference: <1258154897-6770-3-git-send-email-hpa@zytor.com>
      Acked-by: NKees Cook <kees.cook@canonical.com>
      a7c4c0d9
    • H
      x86-32: Use symbolic constants, safer CPUID when enabling EFER.NX · 8a50e513
      H. Peter Anvin 提交于
      Use symbolic constants rather than hard-coded values when setting
      EFER.NX in head_32.S, and do a more rigorous test for the validity of
      the response when probing for the extended CPUID range.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <1258154897-6770-2-git-send-email-hpa@zytor.com>
      Acked-by: NKees Cook <kees.cook@canonical.com>
      8a50e513
  6. 12 11月, 2009 1 次提交
  7. 03 11月, 2009 1 次提交
  8. 20 10月, 2009 3 次提交
    • S
      x86-64: add comment for RODATA large page retainment · d6cc1c3a
      Suresh Siddha 提交于
      Add a comment explaining why RODATA is aligned to 2 MB.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      d6cc1c3a
    • S
      x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATA · 74e08179
      Suresh Siddha 提交于
      CONFIG_DEBUG_RODATA chops the large pages spanning boundaries of kernel
      text/rodata/data to small 4KB pages as they are mapped with different
      attributes (text as RO, RODATA as RO and NX etc).
      
      On x86_64, preserve the large page mappings for kernel text/rodata/data
      boundaries when CONFIG_DEBUG_RODATA is enabled. This is done by allowing the
      RODATA section to be hugepage aligned and having same RWX attributes
      for the 2MB page boundaries
      
      Extra Memory pages padding the sections will be freed during the end of the boot
      and the kernel identity mappings will have different RWX permissions compared to
      the kernel text mappings.
      
      Kernel identity mappings to these physical pages will be mapped with smaller
      pages but large page mappings are still retained for kernel text,rodata,data
      mappings.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20091014220254.190119924@sbs-t61.sc.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      74e08179
    • S
      x86-64: preserve large page mapping for 1st 2MB kernel txt with CONFIG_DEBUG_RODATA · b9af7c0d
      Suresh Siddha 提交于
      In the first 2MB, kernel text is co-located with kernel static
      page tables setup by head_64.S.  CONFIG_DEBUG_RODATA chops this
      2MB large page mapping to small 4KB pages as we mark the kernel text as RO,
      leaving the static page tables as RW.
      
      With CONFIG_DEBUG_RODATA disabled, OLTP run on NHM-EP shows 1% improvement
      with 2% reduction in system time and 1% improvement in iowait idle time.
      
      To recover this, move the kernel static page tables to .data section, so that
      we don't have to break the first 2MB of kernel text to small pages with
      CONFIG_DEBUG_RODATA.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20091014220254.063193621@sbs-t61.sc.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      b9af7c0d
  9. 13 10月, 2009 2 次提交
    • D
      x86: Export srat physical topology · 8716273c
      David Rientjes 提交于
      This is the counterpart to "x86: export k8 physical topology" for
      SRAT. It is not as invasive because the acpi code already seperates
      node setup into detection and registration steps, with the
      exception of registering e820 active regions in
      acpi_numa_memory_affinity_init().  This is now moved to
      acpi_scan_nodes() if NUMA emulation is disabled or deferred.
      
      acpi_numa_init() now returns a value which specifies whether an
      underlying SRAT was located.  If so, that topology can be used by
      the emulation code to interleave emulated nodes over physical nodes
      or to register the nodes for ACPI.
      
      acpi_get_nodes() may now be used to export the srat physical
      topology of the machine for NUMA emulation.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Ankita Garg <ankita@in.ibm.com>
      Cc: Len Brown <len.brown@intel.com>
      LKML-Reference: <alpine.DEB.1.00.0909251518580.14754@chino.kir.corp.google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8716273c
    • D
      x86: Export k8 physical topology · 8ee2debc
      David Rientjes 提交于
      To eventually interleave emulated nodes over physical nodes, we
      need to know the physical topology of the machine without actually
      registering it.  This does the k8 node setup in two parts:
      detection and registration.  NUMA emulation can then used the
      physical topology detected to setup the address ranges of emulated
      nodes accordingly.  If emulation isn't used, the k8 nodes are
      registered as normal.
      
      Two formals are added to the x86 NUMA setup functions: `acpi' and
      `k8'. These represent whether ACPI or K8 NUMA has been detected;
      both cannot be true at the same time.  This specifies to the NUMA
      emulation code whether an underlying physical NUMA topology exists
      and which interface to use.
      
      This patch deals solely with separating the k8 setup path into
      Northbridge detection and registration steps and leaves the ACPI
      changes for a subsequent patch.  The `acpi' formal is added here,
      however, to avoid touching all the header files again in the next
      patch.
      
      This approach also ensures emulated nodes will not span physical
      nodes so the true memory latency is not misrepresented.
      
      k8_get_nodes() may now be used to export the k8 physical topology
      of the machine for NUMA emulation.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Ankita Garg <ankita@in.ibm.com>
      Cc: Len Brown <len.brown@intel.com>
      LKML-Reference: <alpine.DEB.1.00.0909251518400.14754@chino.kir.corp.google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8ee2debc
  10. 12 10月, 2009 2 次提交
  11. 08 10月, 2009 1 次提交
    • A
      x86, timers: Check for pending timers after (device) interrupts · 9bcbdd9c
      Arjan van de Ven 提交于
      Now that range timers and deferred timers are common, I found a
      problem with these using the "perf timechart" tool. Frans Pop also
      reported high scheduler latencies via LatencyTop, when using
      iwlagn.
      
      It turns out that on x86, these two 'opportunistic' timers only get
      checked when another "real" timer happens. These opportunistic
      timers have the objective to save power by hitchhiking on other
      wakeups, as to avoid CPU wakeups by themselves as much as possible.
      
      The change in this patch runs this check not only at timer
      interrupts, but at all (device) interrupts. The effect is that:
      
       1) the deferred timers/range timers get delayed less
      
       2) the range timers cause less wakeups by themselves because
          the percentage of hitchhiking on existing wakeup events goes up.
      
      I've verified the working of the patch using "perf timechart", the
      original exposed bug is gone with this patch. Frans also reported
      success - the latencies are now down in the expected ~10 msec
      range.
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Tested-by: NFrans Pop <elendil@planet.nl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <20091008064041.67219b13@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9bcbdd9c
  12. 04 10月, 2009 1 次提交
  13. 03 10月, 2009 1 次提交
    • A
      x86: Simplify bound checks in the MTRR code · 11879ba5
      Arjan van de Ven 提交于
      The current bound checks for copy_from_user in the MTRR driver are
      not as obvious as they could be, and gcc agrees with that.
      
      This patch simplifies the boundary checks to the point that gcc can
      now prove to itself that the copy_from_user() is never going past
      its bounds.
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <20090926205150.30797709@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      11879ba5
  14. 02 10月, 2009 1 次提交
    • I
      x86: EDAC: MCE: Fix MCE decoding callback logic · f436f8bb
      Ingo Molnar 提交于
      Make decoding of MCEs happen only on AMD hardware by registering a
      non-default callback only on CPU families which support it.
      
      While looking at the interaction of decode_mce() with the other MCE
      code i also noticed a few other things and made the following
      cleanups/fixes:
      
       - Fixed the mce_decode() weak alias - a weak alias is really not
         good here, it should be a proper callback. A weak alias will be
         overriden if a piece of code is built into the kernel - not
         good, obviously.
      
       - The patch initializes the callback on AMD family 10h and 11h.
      
       - Added the more correct fallback printk of:
      
      	No support for human readable MCE decoding on this CPU type.
      	Transcribe the message and run it through 'mcelog --ascii' to decode.
      
         On CPUs that dont have a decoder.
      
       - Made the surrounding code more readable.
      
      Note that the callback allows us to have a default fallback -
      without having to check the CPU versions during the printout
      itself. When an EDAC module registers itself, it can install the
      decode-print function.
      
      (there's no unregister needed as this is core code.)
      
      version -v2 by Borislav Petkov:
      
       - add K8 to the set of supported CPUs
      
       - always build in edac_mce_amd since we use an early_initcall now
      
       - fix checkpatch warnings
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      LKML-Reference: <20091001141432.GA11410@aftab>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f436f8bb
  15. 01 10月, 2009 3 次提交
    • J
      x86: earlyprintk: Fix regression to handle serial,ttySn as 1 arg · ea3acb19
      Jason Wessel 提交于
      Commit c9530948 ("early_printk: Allow more than one early console")
      introduced a regression in the parsing of the earlyprintk= kernel
      arguments.
      
      If you specify "earlyprintk=serial,ttyS0,115200" as a kernel
      argument, the "serial,ttyS" should be parsed as a single argument
      and not as "serial" and then "ttyS".
      
      Also update the documentation to reflect you can specify the ttyS
      directly without the "serial" argument.
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Greg KH <gregkh@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      LKML-Reference: <4ABB7D5E.6000301@windriver.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ea3acb19
    • E
      x86: Don't generate cmpxchg8b_emu if CONFIG_X86_CMPXCHG64=y · 04edbdef
      Eric Dumazet 提交于
      Conditionaly compile cmpxchg8b_emu.o and EXPORT_SYMBOL(cmpxchg8b_emu).
      
      This reduces the kernel size a bit.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <johnstul@us.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4AC43E7E.1000600@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      04edbdef
    • A
      x86: Provide an alternative() based cmpxchg64() · 79e1dd05
      Arjan van de Ven 提交于
      cmpxchg64() today generates, to quote Linus, "barf bag" code.
      
      cmpxchg64() is about to get used in the scheduler to fix a bug there,
      but it's a prerequisite that cmpxchg64() first be made non-sucking.
      
      This patch turns cmpxchg64() into an efficient implementation that
      uses the alternative() mechanism to just use the raw instruction on
      all modern systems.
      
      Note: the fallback is NOT smp safe, just like the current fallback
      is not SMP safe. (Interested parties with i486 based SMP systems
      are welcome to submit fix patches for that.)
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      [ fixed asm constraint bug ]
      Fixed-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <johnstul@us.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20090930170754.0886ff2e@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      79e1dd05
  16. 30 9月, 2009 1 次提交
  17. 27 9月, 2009 1 次提交
  18. 24 9月, 2009 8 次提交
    • A
      sysctl: remove "struct file *" argument of ->proc_handler · 8d65af78
      Alexey Dobriyan 提交于
      It's unused.
      
      It isn't needed -- read or write flag is already passed and sysctl
      shouldn't care about the rest.
      
      It _was_ used in two places at arch/frv for some reason.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: James Morris <jmorris@namei.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8d65af78
    • J
      x86: early_printk: Protect against using the same device twice · 429a6e5e
      Jason Wessel 提交于
      If you use the kernel argument:
      
        earlyprintk=serial,ttyS0,115200
      
      This will cause a recursive hang printing the same line
      again and again:
      
       BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
       BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
       BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
      bootconsole [earlyser0] enabled
      Linux version 2.6.31-07863-gb64ada6b (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
      Linux version 2.6.31-07863-gb64ada6b (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
      Linux version 2.6.31-07863-gb64ada6b (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
      Linux version 2.6.31-07863-gb64ada6b (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
      Linux version 2.6.31-07863-gb64ada6b (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
      
      Instead warn the end user that they specified the device
      a second time, and ignore that second console.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Greg KH <gregkh@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4ABAAB89.1080407@windriver.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      429a6e5e
    • R
      x86: Reduce verbosity of "TSC is reliable" message · ea01c0d7
      Roland Dreier 提交于
      On modern systems, the kernel prints the message
      
          Skipping synchronization checks as TSC is reliable.
      
      once for every non-boot CPU.
      
      This gets kind of ridiculous on huge systems; for example, on a
      64-thread system I was lucky enough to get:
      
          $ dmesg | grep 'TSC is reliable' | wc
               63     567    4221
      
      There's no point to doing this for every CPU, since the code is
      just checking the boot CPU anyway, so change this to a
      printk_once() to make the message appears only once.
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      LKML-Reference: <adazl8l2swc.fsf@cisco.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ea01c0d7
    • A
      headers: utsname.h redux · 2bcd57ab
      Alexey Dobriyan 提交于
      * remove asm/atomic.h inclusion from linux/utsname.h --
         not needed after kref conversion
       * remove linux/utsname.h inclusion from files which do not need it
      
      NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
      due to some personality stuff it _is_ needed -- cowardly leave ELF-related
      headers and files alone.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2bcd57ab
    • R
      cpumask: use mm_cpumask() wrapper: x86 · 78f1c4d6
      Rusty Russell 提交于
      Makes code futureproof against the impending change to mm->cpu_vm_mask (to be a pointer).
      
      It's also a chance to use the new cpumask_ ops which take a pointer
      (the older ones are deprecated, but there's no hurry for arch code).
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      78f1c4d6
    • R
      cpumask: remove last assignment to mask field of struct irqaction. · 1d1afc19
      Rusty Russell 提交于
      This snuck in after the patch which removed all the others.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      1d1afc19
    • L
      cpumask: use zalloc_cpumask_var() where possible · 79f55997
      Li Zefan 提交于
      Remove open-coded zalloc_cpumask_var() and zalloc_cpumask_var_node().
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      79f55997
    • I
      x86: mce: Use safer ways to access MCE registers · 11868a2d
      Ingo Molnar 提交于
      Use rdmsrl_safe() when accessing MCE registers. While in
      theory we always 'know' which ones are safe to access from
      the capability bits, there's a lot of hardware variations
      and reality might differ from theory, as it did in this case:
      
         http://bugzilla.kernel.org/show_bug.cgi?id=14204
      
      [    0.010016] mce: CPU supports 5 MCE banks
      [    0.011029] general protection fault: 0000 [#1]
      [    0.011998] last sysfs file:
      [    0.011998] Modules linked in:
      [    0.011998]
      [    0.011998] Pid: 0, comm: swapper Not tainted (2.6.31_router #1) HP Vectra
      [    0.011998] EIP: 0060:[<c100d9b9>] EFLAGS: 00010246 CPU: 0
      [    0.011998] EIP is at mce_rdmsrl+0x19/0x60
      [    0.011998] EAX: 00000000 EBX: 00000001 ECX: 00000407 EDX: 08000000
      [    0.011998] ESI: 00000000 EDI: 8c000000 EBP: 00000405 ESP: c17d5eac
      
      So WARN_ONCE() instead of crashing the box.
      
      ( also fix a number of stylistic inconsistencies in the code. )
      
      Note, we might still crash in wrmsrl() if we get that far, but
      we shouldnt if the registers are truly inaccessible.
      Reported-by: NGNUtoo <GNUtoo@no-log.org>
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      LKML-Reference: <bug-14204-5438@http.bugzilla.kernel.org/>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      11868a2d
  19. 23 9月, 2009 4 次提交
    • J
      early_printk: Allow more than one early console · c9530948
      Jason Wessel 提交于
      It is desirable to be able to use one early boot device to debug
      another or to have multiple places you can see the early boot
      diagnostics, such as the vga screen or serial device.
      
      This patch changes the early_printk console device registration to
      allow more than one early printk device to get registered via
      register_console().
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      c9530948
    • J
      USB: ehci,dbgp,early_printk: split ehci debug driver from early_printk.c · df6c5169
      Jason Wessel 提交于
      Move the dbgp early printk driver in advance of refactoring and adding
      new code, so the changes to this code are tracked separately from the
      move of the code.
      
      The drivers/usb/early directory will be the location of the current
      and future early usb code for driving usb devices prior initializing
      the standard interrupt driven USB drivers.
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      df6c5169
    • P
      perf_event, x86: Fix 'perf sched record' crashing the machine · 7d428966
      Peter Zijlstra 提交于
      Chris Malley reported that 'perf sched record' sometimes
      crashes his box with:
      
      [  389.272175] BUG: unable to handle kernel paging request at ffffb300
      [  389.272294] IP: [<c011b0bd>] default_send_IPI_self+0x1d/0x50
      [  389.272366] *pde = 0073f067 *pte = 00000000
      [  389.274708] Call Trace:
      [  389.274752]  [<c010e3b4>] ?  set_perf_event_pending+0x14/0x20
      [  389.274801]  [<c01b9751>] ?  perf_output_unlock+0x121/0x1a0
      [  389.274848]  [<c01b981a>] ? perf_output_end+0x4a/0x70
      [  389.274893]  [<c01ba690>] ?  __perf_event_overflow+0x240/0x2f0
      [  389.274942]  [<c030963e>] ? atomic64_cmpxchg+0x1e/0x30
      [  389.274988]  [<c01ba8f4>] ?  perf_swevent_ctx_event+0x1b4/0x1c0
      [  389.275035]  [<c01ba773>] ?  perf_swevent_ctx_event+0x33/0x1c0
      [  389.275081]  [<c01ba9a7>] ? do_perf_sw_event+0xa7/0x160
      [  389.275127]  [<c01baae2>] ? perf_tp_event+0x82/0xa0
      [  389.275174]  [<c012e9c6>] ?  ftrace_profile_sched_stat_runtime+0xe6/0x120
      [  389.275224]  [<c012e8e0>] ?  ftrace_profile_sched_stat_runtime+0x0/0x120
      [  389.275273]  [<c013c85a>] ? update_curr+0x18a/0x230
      [  389.275318]  [<c013cdc5>] ?  put_prev_task_fair+0x155/0x160
      [  389.275366]  [<c01618b5>] ? sched_clock_cpu+0xd5/0x110
      [  389.275413]  [<c04e7525>] ? _spin_lock_irq+0x45/0x50
      [  389.275458]  [<c04e424e>] ? schedule+0x20e/0xb10
      
      The problem is that the box has no lapic enabled:
      
        [    0.042445] Local APIC not detected. Using dummy APIC emulation.
      
      The below seems like the best fix. We disabled all lapic bits, except
      the self-IPI-resend logic.
      Reported-by: NChris Malley <mail@chrismalley.co.uk>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <7863dc4c0909221409v7893bfd3o4b590d5951a233ba@mail.gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7d428966
    • R
      x86: ptrace: set TS_COMPAT when 32-bit ptrace sets orig_eax>=0 · 8cb3ed13
      Roland McGrath 提交于
      The 32-bit ptrace syscall on a 64-bit kernel (32-bit debugger on
      32-bit task) behaves differently than a native 32-bit kernel.  When
      setting a register state of orig_eax>=0 and eax=-ERESTART* when the
      debugged task is NOT on its way out of a 32-bit syscall, the task will
      fail to do the syscall restart logic that it should do.
      
      Test case available at http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/erestartsys-trap.c?cvsroot=systemtap
      
      This happens because the 32-bit ptrace syscall sets eax=0xffffffff
      when it sets orig_eax>=0.  The resuming task will not sign-extend this
      for the -ERESTART* check because TS_COMPAT is not set.  (So the task
      thinks it is restarting after a 64-bit syscall, not a 32-bit one.)
      
      The fix is to have 32-bit ptrace calls set TS_COMPAT when setting
      orig_eax>=0.  This ensures that the 32-bit syscall restart logic
      will apply when the child resumes.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      8cb3ed13