1. 27 8月, 2009 1 次提交
    • M
      x86: Instruction decoder API · eb13296c
      Masami Hiramatsu 提交于
      Add x86 instruction decoder to arch-specific libraries. This decoder
      can decode x86 instructions used in kernel into prefix, opcode, modrm,
      sib, displacement and immediates. This can also show the length of
      instructions.
      
      This version introduces instruction attributes for decoding
      instructions.
      The instruction attribute tables are generated from the opcode map file
      (x86-opcode-map.txt) by the generator script(gen-insn-attr-x86.awk).
      
      Currently, the opcode maps are based on opcode maps in Intel(R) 64 and
      IA-32 Architectures Software Developers Manual Vol.2: Appendix.A,
      and consist of below two types of opcode tables.
      
      1-byte/2-bytes/3-bytes opcodes, which has 256 elements, are
      written as below;
      
       Table: table-name
       Referrer: escaped-name
       opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
        (or)
       opcode: escape # escaped-name
       EndTable
      
      Group opcodes, which has 8 elements, are written as below;
      
       GrpTable: GrpXXX
       reg:  mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
       EndTable
      
      These opcode maps include a few SSE and FP opcodes (for setup), because
      those opcodes are used in the kernel.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NJim Keniston <jkenisto@us.ibm.com>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      LKML-Reference: <20090813203413.31965.49709.stgit@localhost.localdomain>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      eb13296c
  2. 26 8月, 2009 3 次提交
    • J
      tracing: Create generic syscall TRACE_EVENTs · 1c569f02
      Josh Stone 提交于
      This converts the syscall_enter/exit tracepoints into TRACE_EVENTs, so
      you can have generic ftrace events that capture all system calls with
      arguments and return values.  These generic events are also renamed to
      sys_enter/exit, so they're more closely aligned to the specific
      sys_enter_foo events.
      Signed-off-by: NJosh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-5-git-send-email-jistone@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      1c569f02
    • J
      tracing: Move tracepoint callbacks from declaration to definition · 97419875
      Josh Stone 提交于
      It's not strictly correct for the tracepoint reg/unreg callbacks to
      occur when a client is hooking up, because the actual tracepoint may not
      be present yet.  This happens to be fine for syscall, since that's in
      the core kernel, but it would cause problems for tracepoints defined in
      a module that hasn't been loaded yet.  It also means the reg/unreg has
      to be EXPORTed for any modules to use the tracepoint (as in SystemTap).
      
      This patch removes DECLARE_TRACE_WITH_CALLBACK, and instead introduces
      DEFINE_TRACE_FN which stores the callbacks in struct tracepoint.  The
      callbacks are used now when the active state of the tracepoint changes
      in set_tracepoint & disable_tracepoint.
      
      This also introduces TRACE_EVENT_FN, so ftrace events can also provide
      registration callbacks if needed.
      Signed-off-by: NJosh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-4-git-send-email-jistone@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      97419875
    • J
      tracing: Rename FTRACE_SYSCALLS for tracepoints · 66700001
      Josh Stone 提交于
      s/HAVE_FTRACE_SYSCALLS/HAVE_SYSCALL_TRACEPOINTS/g
      s/TIF_SYSCALL_FTRACE/TIF_SYSCALL_TRACEPOINT/g
      
      The syscall enter/exit tracing is no longer specific to just ftrace, so
      they now have names that reflect their tie to tracepoints instead.
      Signed-off-by: NJosh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-2-git-send-email-jistone@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      66700001
  3. 12 8月, 2009 6 次提交
    • J
      tracing: Convert x86_64 mmap and uname to use DEFINE_SYSCALL · 0ac676fb
      Jason Baron 提交于
      A number of syscalls are not using 'DEFINE_SYSCALL'. I'm not sure why.
      Convert x86_64 uname and mmap to use DEFINE_SYSCALL.
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      0ac676fb
    • J
      tracing: Add individual syscalls tracepoint id support · 64c12e04
      Jason Baron 提交于
      The current state of syscalls tracepoints generates only one event id
      for every syscall events.
      
      This patch associates an id with each syscall trace event, so that we
      can identify each syscall trace event using the 'perf' tool.
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      64c12e04
    • J
      tracing: Update FTRACE_SYSCALL_MAX · 9daa77e2
      Jason Baron 提交于
      update FTRACE_SYSCALL_MAX to the current number of syscalls
      
      FTRACE_SYSCALL_MAX is a temporary solution to get the number of
      syscalls supported by the arch until we find a more dynamic way
      to get this number.
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      9daa77e2
    • J
      tracing: Add syscall tracepoints · a871bd33
      Jason Baron 提交于
      add two tracepoints in syscall exit and entry path, conditioned on
      TIF_SYSCALL_FTRACE. Supports the syscall trace event code.
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      a871bd33
    • J
      tracing: Call arch_init_ftrace_syscalls at boot · 066e0378
      Jason Baron 提交于
      Call arch_init_ftrace_syscalls at boot, so we can determine early the
      set of syscalls for the syscall trace events.
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      066e0378
    • J
      tracing: Map syscall name to number · eeac19a7
      Jason Baron 提交于
      Add a new function to support translating a syscall name to number at
      runtime.
      This allows the syscall event tracer to map syscall names to number.
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      eeac19a7
  4. 11 8月, 2009 1 次提交
    • L
      x86: Fix serialization in pit_expect_msb() · b6e61eef
      Linus Torvalds 提交于
      Wei Chong Tan reported a fast-PIT-calibration corner-case:
      
      | pit_expect_msb() is vulnerable to SMI disturbance corner case
      | in some platforms which causes /proc/cpuinfo to show wrong
      | CPU MHz value when quick_pit_calibrate() jumps to success
      | section.
      
      I think that the real issue isn't even an SMI - but the fact
      that in the very last iteration of the loop, there's no
      serializing instruction _after_ the last 'rdtsc'. So even in
      the absense of SMI's, we do have a situation where the cycle
      counter was read without proper serialization.
      
      The last check should be done outside the outer loop, since
      _inside_ the outer loop, we'll be testing that the PIT has
      the right MSB value has the right value in the next iteration.
      
      So only the _last_ iteration is special, because that's the one
      that will not check the PIT MSB value any more, and because the
      final 'get_cycles()' isn't serialized.
      
      In other words:
      
       - I'd like to move the PIT MSB check to after the last
         iteration, rather than in every iteration
      
       - I think we should comment on the fact that it's also a
         serializing instruction and so 'fences in' the TSC read.
      
      Here's a suggested replacement.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Reported-by: N"Tan, Wei Chong" <wei.chong.tan@intel.com>
      Tested-by: N"Tan, Wei Chong" <wei.chong.tan@intel.com>
      LKML-Reference: <B28277FD4E0F9247A3D55704C440A140D5D683F3@pgsmsx504.gar.corp.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b6e61eef
  5. 09 8月, 2009 1 次提交
  6. 08 8月, 2009 2 次提交
    • O
      x86: Add quirk to make Apple MacBookPro5,1 use reboot=pci · 498cdbfb
      Ozan Çağlayan 提交于
      MacBookPro5,1 is not able to reboot unless reboot=pci is set.
      This patch forces it through a DMI quirk specific to this
      device.
      Signed-off-by: NOzan Çağlayan <ozan@pardus.org.tr>
      LKML-Reference: <1249403971-6543-1-git-send-email-ozan@pardus.org.tr>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      498cdbfb
    • Y
      x86: Fix MSI-X initialization by using online_mask for x2apic target_cpus · 087d7e56
      Yinghai Lu 提交于
      found a system where x2apic reports an MSI-X irq initialization
      failure:
      
      [  302.859446] igbvf 0000:81:10.4: enabling device (0000 -> 0002)
      [  302.874369] igbvf 0000:81:10.4: using 64bit DMA mask
      [  302.879023] igbvf 0000:81:10.4: using 64bit consistent DMA mask
      [  302.894386] igbvf 0000:81:10.4: enabling bus mastering
      [  302.898171] igbvf 0000:81:10.4: setting latency timer to 64
      [  302.914050] reserve_memtype added 0xefb08000-0xefb0c000, track uncached-minus, req uncached-minus, ret uncached-minus
      [  302.933839] reserve_memtype added 0xefb28000-0xefb29000, track uncached-minus, req uncached-minus, ret uncached-minus
      [  302.940367]   alloc irq_desc for 265 on node 4
      [  302.956874]   alloc kstat_irqs on node 4
      [  302.959452] alloc irq_2_iommu on node 0
      [  302.974328] igbvf 0000:81:10.4: irq 265 for MSI/MSI-X
      [  302.977778]   alloc irq_desc for 266 on node 4
      [  302.980347]   alloc kstat_irqs on node 4
      [  302.995312] free_memtype request 0xefb28000-0xefb29000
      [  302.998816] igbvf 0000:81:10.4: Failed to initialize MSI-X interrupts.
      
      ... it turns out that when trying to enable MSI-X,
      __assign_irq_vector(new, cfg_new, apic->target_cpus()) can not
      get vector because for x2apic target-cpus returns cpumask_of(0)
      
      Update that to online_mask like xapic.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <4A785AFF.3050902@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      087d7e56
  7. 06 8月, 2009 2 次提交
  8. 05 8月, 2009 8 次提交
  9. 04 8月, 2009 14 次提交
    • S
      x86: Work around compilation warning in arch/x86/kernel/apm_32.c · dc731fbb
      Subrata Modak 提交于
      The following fix was initially inspired by David Howells fix
      few days back:
      
        http://lkml.org/lkml/2009/7/9/109
      
      However, Ingo disapproves such fixes as it's dangerous (it can
      hide future, relevant warnings) - in something as
      performance-uncritical.
      
      So, initialize 'err' to '0' to work around a GCC false positive
      warning:
      
        http://lkml.org/lkml/2009/7/18/89
      
      Signed-off-by: Subrata Modak<subrata@linux.vnet.ibm.com>
      Cc: Sachin P Sant <sachinp@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      LKML-Reference: <20090721023226.31855.67236.sendpatchset@subratamodak.linux.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dc731fbb
    • J
      x86, UV: Complete IRQ interrupt migration in arch_enable_uv_irq() · 2a5ef416
      Jack Steiner 提交于
      In uv_setup_irq(), the call to create_irq() initially assigns
      IRQ vectors to cpu 0. The subsequent call to
      assign_irq_vector() in arch_enable_uv_irq() migrates the IRQ to
      another cpu and frees the cpu 0 vector - at least it will be
      freed as soon as the "IRQ move" completes.
      
      arch_enable_uv_irq() needs to send a cleanup IPI to complete
      the IRQ move. Otherwise, assignment of GRU interrupts on large
      systems (>200 cpus) will exhaust the cpu 0 interrupt vectors
      and initialization of the GRU driver will fail.
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      LKML-Reference: <20090720142840.GA8885@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2a5ef416
    • J
      x86, 32-bit: Fix double accounting in reserve_top_address() · 6abf6551
      Jan Beulich 提交于
      With VMALLOC_END included in the calculation of MAXMEM (as of
      2.6.28) it is no longer correct to also bump __VMALLOC_RESERVE
      in reserve_top_address(). Doing so results in needlessly small
      lowmem.
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      LKML-Reference: <4A71DD2A020000780000D482@vpn.id2.novell.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6abf6551
    • Y
      x86: Don't use current_cpu_data in x2apic phys_pkg_id · d8c7eb34
      Yinghai Lu 提交于
      One system has socket 1 come up as BSP.
      
      kexeced kernel reports BSP as:
      
      [    1.524550] Initializing cgroup subsys cpuacct
      [    1.536064] initial_apicid:20
      [    1.537135] ht_mask_width:1
      [    1.538128] core_select_mask:f
      [    1.539126] core_plus_mask_width:5
      [    1.558479] CPU: Physical Processor ID: 0
      [    1.559501] CPU: Processor Core ID: 0
      [    1.560539] CPU: L1 I cache: 32K, L1 D cache: 32K
      [    1.579098] CPU: L2 cache: 256K
      [    1.580085] CPU: L3 cache: 24576K
      [    1.581108] CPU 0/0x20 -> Node 0
      [    1.596193] CPU 0 microcode level: 0xffff0008
      
      It doesn't have correct physical processor id and will get an
      error:
      
      [   38.840859] CPU0 attaching sched-domain:
      [   38.848287]  domain 0: span 0,8,72 level SIBLING
      [   38.851151]   groups: 0 8 72
      [   38.858137]   domain 1: span 0,8-15,72-79 level MC
      [   38.868944]    groups: 0,8,72 9,73 10,74 11,75 12,76 13,77 14,78 15,79
      [   38.881383] ERROR: parent span is not a superset of domain->span
      [   38.890724]    domain 2: span 0-7,64-71 level CPU
      [   38.899237] ERROR: domain->groups does not contain CPU0
      [   38.909229]     groups: 8-15,72-79
      [   38.912547] ERROR: groups don't span domain->span
      [   38.919665]     domain 3: span 0-127 level NODE
      [   38.930739]      groups: 0-7,64-71 8-15,72-79 16-23,80-87 24-31,88-95 32-39,96-103 40-47,104-111 48-55,112-119 56-63,120-127
      
      it turns out: we can not use current_cpu_data in phys_pgd_id
      for x2apic.
      
      identify_boot_cpu() is called by check_bugs() before
      smp_prepare_cpus() and till smp_prepare_cpus() current_cpu_data
      for bsp is assigned with boot_cpu_data.
      
      Just make phys_pkg_id for x2apic is aligned to xapic.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A6ADD0D.10002@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d8c7eb34
    • J
      x86, UV: Fix UV apic mode · c5997fa8
      Jack Steiner 提交于
      Change SGI UV default apicid mode to "physical". This is
      required to match settings in the UV hub chip.
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      LKML-Reference: <20090727143856.GA8905@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c5997fa8
    • J
      x86, UV: Fix macros for accessing large node numbers · 67e83f30
      Jack Steiner 提交于
      The UV chipset automatically supplies the upper bits on nodes
      being referenced by MMR accesses. These bit can be deleted from
      the hub addressing macros.
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      LKML-Reference: <20090727143808.GA8076@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      67e83f30
    • J
      x86, UV: Delete mapping of MMR rangs mapped by BIOS · cc5e4fa1
      Jack Steiner 提交于
      The UV BIOS has added additional MMR ranges that are mapped via
      EFI virtual mode mappings. These ranges should be deleted from
      ranges mapped by uv_system_init().
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      Cc: linux-mm@kvack.org
      LKML-Reference: <20090727143656.GA7698@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cc5e4fa1
    • J
      x86, UV: Handle missing blade-local memory correctly · 6c7184b7
      Jack Steiner 提交于
      UV blades may not have any blade-local memory. Add a field
      (nid) to the UV blade structure to indicates whether the node
      has local memory. This is needed by the GRU driver (pushed
      separately).
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      Cc: linux-mm@kvack.org
      LKML-Reference: <20090727143507.GA7006@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6c7184b7
    • H
      x86: fix assembly constraints in native_save_fl() · f1f029c7
      H. Peter Anvin 提交于
      From Gabe Black in bugzilla 13888:
      
      native_save_fl is implemented as follows:
      
        11static inline unsigned long native_save_fl(void)
        12{
        13        unsigned long flags;
        14
        15        asm volatile("# __raw_save_flags\n\t"
        16                     "pushf ; pop %0"
        17                     : "=g" (flags)
        18                     : /* no input */
        19                     : "memory");
        20
        21        return flags;
        22}
      
      If gcc chooses to put flags on the stack, for instance because this is
      inlined into a larger function with more register pressure, the offset
      of the flags variable from the stack pointer will change when the
      pushf is performed. gcc doesn't attempt to understand that fact, and
      address used for pop will still be the same. It will write to
      somewhere near flags on the stack but not actually into it and
      overwrite some other value.
      
      I saw this happen in the ide_device_add_all function when running in a
      simulator I work on. I'm assuming that some quirk of how the simulated
      hardware is set up caused the code path this is on to be executed when
      it normally wouldn't.
      
      A simple fix might be to change "=g" to "=r".
      Reported-by: NGabe Black <spamforgabe@umich.edu>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Stable Team <stable@kernel.org>
      f1f029c7
    • B
      x86, msr: execute on the correct CPU subset · bab9a3da
      Borislav Petkov 提交于
      Make rdmsr_on_cpus/wrmsr_on_cpus execute on the current CPU only if it
      is in the supplied bitmask.
      Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      bab9a3da
    • H
      x86: Fix assert syntax in vmlinux.lds.S · d2ba8b21
      H. Peter Anvin 提交于
      Older versions of binutils did not accept the naked "ASSERT" syntax;
      it is considered an expression whose value needs to be assigned to
      something.
      Reported-tested-and-fixed-by: NJean Delvare <khali@linux-fr.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      d2ba8b21
    • P
      x86: Make 64-bit efi_ioremap use ioremap on MMIO regions · 6a7bbd57
      Paul Mackerras 提交于
      Booting current 64-bit x86 kernels on the latest Apple MacBook
      (MacBook5,2) via EFI gives the following warning:
      
      [    0.182209] ------------[ cut here ]------------
      [    0.182222] WARNING: at arch/x86/mm/pageattr.c:581 __cpa_process_fault+0x44/0xa0()
      [    0.182227] Hardware name: MacBook5,2
      [    0.182231] CPA: called for zero pte. vaddr = ffff8800ffe00000 cpa->vaddr = ffff8800ffe00000
      [    0.182236] Modules linked in:
      [    0.182242] Pid: 0, comm: swapper Not tainted 2.6.31-rc4 #6
      [    0.182246] Call Trace:
      [    0.182254]  [<ffffffff8102c754>] ? __cpa_process_fault+0x44/0xa0
      [    0.182261]  [<ffffffff81048668>] warn_slowpath_common+0x78/0xd0
      [    0.182266]  [<ffffffff81048744>] warn_slowpath_fmt+0x64/0x70
      [    0.182272]  [<ffffffff8102c7ec>] ? update_page_count+0x3c/0x50
      [    0.182280]  [<ffffffff818d25c5>] ? phys_pmd_init+0x140/0x22e
      [    0.182286]  [<ffffffff8102c754>] __cpa_process_fault+0x44/0xa0
      [    0.182292]  [<ffffffff8102ce60>] __change_page_attr_set_clr+0x5f0/0xb40
      [    0.182301]  [<ffffffff810d1035>] ? vm_unmap_aliases+0x175/0x190
      [    0.182307]  [<ffffffff8102d4ae>] change_page_attr_set_clr+0xfe/0x3d0
      [    0.182314]  [<ffffffff8102dcca>] _set_memory_uc+0x2a/0x30
      [    0.182319]  [<ffffffff8102dd4b>] set_memory_uc+0x7b/0xb0
      [    0.182327]  [<ffffffff818afe31>] efi_enter_virtual_mode+0x2ad/0x2c9
      [    0.182334]  [<ffffffff818a1c66>] start_kernel+0x2db/0x3f4
      [    0.182340]  [<ffffffff818a1289>] x86_64_start_reservations+0x99/0xb9
      [    0.182345]  [<ffffffff818a1389>] x86_64_start_kernel+0xe0/0xf2
      [    0.182357] ---[ end trace 4eaa2a86a8e2da22 ]---
      [    0.182982] init_memory_mapping: 00000000ffffc000-0000000100000000
      [    0.182993]  00ffffc000 - 0100000000 page 4k
      
      This happens because the 64-bit version of efi_ioremap calls
      init_memory_mapping for all addresses, regardless of whether they are
      RAM or MMIO.  The EFI tables on this machine ask for runtime access to
      some MMIO regions:
      
      [    0.000000] EFI: mem195: type=11, attr=0x8000000000000000, range=[0x0000000093400000-0x0000000093401000) (0MB)
      [    0.000000] EFI: mem196: type=11, attr=0x8000000000000000, range=[0x00000000ffc00000-0x00000000ffc40000) (0MB)
      [    0.000000] EFI: mem197: type=11, attr=0x8000000000000000, range=[0x00000000ffc40000-0x00000000ffc80000) (0MB)
      [    0.000000] EFI: mem198: type=11, attr=0x8000000000000000, range=[0x00000000ffc80000-0x00000000ffca4000) (0MB)
      [    0.000000] EFI: mem199: type=11, attr=0x8000000000000000, range=[0x00000000ffca4000-0x00000000ffcb4000) (0MB)
      [    0.000000] EFI: mem200: type=11, attr=0x8000000000000000, range=[0x00000000ffcb4000-0x00000000ffffc000) (3MB)
      [    0.000000] EFI: mem201: type=11, attr=0x8000000000000000, range=[0x00000000ffffc000-0x0000000100000000) (0MB)
      
      This arranges to pass the EFI memory type through to efi_ioremap, and
      makes efi_ioremap use ioremap rather than init_memory_mapping if the
      type is EFI_MEMORY_MAPPED_IO.  With this, the above warning goes away.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      LKML-Reference: <19062.55858.533494.471153@cargo.ozlabs.ibm.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      6a7bbd57
    • P
      x86: Add quirk to make Apple MacBook5,2 use reboot=pci · 6c6c51e4
      Paul Mackerras 提交于
      The latest Apple MacBook (MacBook5,2) doesn't reboot successfully
      under Linux; neither the EFI reboot method nor the default method
      using the keyboard controller works (the system just hangs and doesn't
      reset).  However, the method using the "PCI reset register" at 0xcf9
      does work.
      
      This adds a quirk to detect this machine via DMI and force the
      reboot_type to BOOT_CF9.  With this it reboots successfully without
      requiring a command-line option.  Note that the EFI code forces
      reboot_type to BOOT_EFI when the machine is booted via EFI, but this
      overrides that since the core_initcall runs after the EFI
      initialization code.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      LKML-Reference: <19062.56420.501516.316181@cargo.ozlabs.ibm.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      6c6c51e4
    • T
      x86: Fix CPA memtype reserving in the set_pages_array*() cases · 8523acfe
      Thomas Hellstrom 提交于
      The code was incorrectly reserving memtypes using the page
      virtual address instead of the physical address. Furthermore,
      the code was not ignoring highmem pages as it ought to.
      
      ( upstream does not pass in highmem pages yet - but upcoming
        graphics code will do it and there's no reason to not handle
        this properly in the CPA APIs.)
      
      Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=13884Signed-off-by: NThomas Hellstrom <thellstrom@vmware.com>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: <stable@kernel.org>
      Cc: dri-devel@lists.sourceforge.net
      Cc: venkatesh.pallipadi@intel.com
      LKML-Reference: <1249284345-7654-1-git-send-email-thellstrom@vmware.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8523acfe
  10. 31 7月, 2009 1 次提交
    • P
      x86, pat: Fix set_memory_wc related corruption · bdc6340f
      Pallipadi, Venkatesh 提交于
      Changeset 3869c4aa
      that went in after 2.6.30-rc1 was a seemingly small change to _set_memory_wc()
      to make it complaint with SDM requirements. But, introduced a nasty bug, which
      can result in crash and/or strange corruptions when set_memory_wc is used.
      One such crash reported here
      http://lkml.org/lkml/2009/7/30/94
      
      Actually, that changeset introduced two bugs.
      * change_page_attr_set() takes &addr as first argument and can the addr value
        might have changed on return, even for single page change_page_attr_set()
        call. That will make the second change_page_attr_set() in this routine
        operate on unrelated addr, that can eventually cause strange corruptions
        and bad page state crash.
      * The second change_page_attr_set() call, before setting _PAGE_CACHE_WC, should
        clear the earlier _PAGE_CACHE_UC_MINUS, as otherwise cache attribute will not
        be WC (will be UC instead).
      
      The patch below fixes both these problems. Sending a single patch to fix both
      the problems, as the change is to the same line of code. The change to have a
      addr_copy is not very clean. But, it is simpler than making more changes
      through various routines in pageattr.c.
      
      A huge thanks to Jerome for reporting this problem and providing a simple test
      case that helped us root cause the problem.
      Reported-by: NJerome Glisse <glisse@freedesktop.org>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20090730214319.GA1889@linux-os.sc.intel.com>
      Acked-by: NDave Airlie <airlied@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      bdc6340f
  11. 30 7月, 2009 1 次提交
    • R
      lguest: update commentry · a91d74a3
      Rusty Russell 提交于
      Every so often, after code shuffles, I need to go through and unbitrot
      the Lguest Journey (see drivers/lguest/README).  Since we now use RCU in
      a simple form in one place I took the opportunity to expand that explanation.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      a91d74a3