  1. 18 Mar, 2009 · 8 commits
    • x86: add x2apic_wrmsr_fence() to x2apic flush tlb paths · ce4e240c
      Suresh Siddha committed
      Impact: optimize APIC IPI related barriers
      
      Uncached MMIO accesses for xapic are inherently serializing and hence
      we don't need explicit barriers for xapic IPI paths.
      
      x2apic MSR writes/reads don't have serializing semantics and hence need
      a serializing instruction or mfence, to make all the previous memory
      stores globally visible before the x2apic msr write for IPI.
      
      Add x2apic_wrmsr_fence() to the x2apic specific flush tlb paths.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "steiner@sgi.com" <steiner@sgi.com>
      Cc: Nick Piggin <npiggin@suse.de>
      LKML-Reference: <1237313814.27006.203.camel@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
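Illustration (not part of the commit): a minimal user-space sketch of the ordering requirement described in the x2apic_wrmsr_fence() message, assuming an x86_64 or GCC/Clang-compatible toolchain. A real x2apic MSR write needs ring 0, so `fake_wrmsr_icr()`, `flush_request`, `wrmsr_fence_sketch()`, and `send_flush_ipi_sketch()` are hypothetical stand-ins. The point is only the sequence: store, full fence, then the non-serializing write.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared state that the IPI target is expected to read. */
static volatile int flush_request;
/* What the stand-in "MSR write" observed at the moment it ran. */
static int seen_at_write;
static uint64_t last_icr_value;

/* Full memory fence: mfence on x86_64, a seq-cst fence elsewhere. */
static inline void wrmsr_fence_sketch(void)
{
#if defined(__x86_64__)
	__asm__ volatile("mfence" : : : "memory");
#else
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Stand-in for the non-serializing x2apic MSR write that sends the IPI. */
static void fake_wrmsr_icr(uint64_t value)
{
	seen_at_write = flush_request;
	last_icr_value = value;
}

static void send_flush_ipi_sketch(uint64_t icr_value)
{
	flush_request = 1;          /* earlier store the target CPU must see */
	wrmsr_fence_sketch();       /* make it globally visible first */
	fake_wrmsr_icr(icr_value);  /* only then issue the "MSR write" */
}
```

In a single thread the fence has no observable effect; the sketch shows where it sits, not a reproduction of the race.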
    • x86: fix broken irq migration logic while cleaning up multiple vectors · 68a8ca59
      Suresh Siddha committed
      Impact: fix spurious IRQs
      
      During irq migration, we send a low priority interrupt to the previous
      irq destination. In the non interrupt-remapping case, this happens after
      the interrupt starts arriving at the new destination; in the
      interrupt-remapping case, it happens after modifying and flushing the
      interrupt-remapping table entry caches.
      
      This low priority irq cleanup handler can clean up multiple vectors, as
      multiple irqs can be migrated at almost the same time. While
      there will be multiple invocations of the irq cleanup handler (one cleanup
      IPI for each irq migration), the first invocation of the cleanup handler
      can potentially clean up more than one vector (as the first invocation can
      see the requests for more than one vector cleanup). When we clean up multiple
      vectors during the first invocation of smp_irq_move_cleanup_interrupt(),
      other vectors that are to be cleaned up can still be pending in the local
      cpu's IRR (as smp_irq_move_cleanup_interrupt() runs with interrupts disabled).
      
      When we are ready to unhook a vector corresponding to an irq, check if that
      vector is registered in the local cpu's IRR. If so, skip that cleanup and
      do a self IPI with the cleanup vector, so that we give a chance to
      service the pending vector interrupt and then clean up that vector
      allocation once we execute the lowest priority handler.
      
      This fixes spurious interrupts seen when migrating multiple vectors
      at the same time.
      
      [ This is apparently possible even on conventional xapic, although to
        the best of our knowledge it has never been seen.  The stable
        maintainers may wish to consider this one for -stable. ]
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: stable@kernel.org
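Illustration (not part of the commit): the IRR check described in the irq migration fix can be modeled in user space roughly as follows. All state here is simulated, and the names (`irr_sketch`, `cleanup_wanted`, `vector_hooked`, `self_ipis_sent`, `irq_move_cleanup_sketch()`) are hypothetical. The real handler reads the local APIC and sends an actual self IPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS_SKETCH 256

/* Simulated per-CPU state. */
static uint32_t irr_sketch[NR_VECTORS_SKETCH / 32]; /* pending-interrupt bits */
static bool cleanup_wanted[NR_VECTORS_SKETCH];      /* vectors awaiting unhook */
static bool vector_hooked[NR_VECTORS_SKETCH];       /* vector still allocated */
static int self_ipis_sent;                          /* retries requested */

static bool vector_pending_in_irr(unsigned int vector)
{
	return (irr_sketch[vector / 32] & (1u << (vector % 32))) != 0;
}

/* One invocation of the cleanup handler, as if run with interrupts disabled. */
static void irq_move_cleanup_sketch(void)
{
	for (unsigned int v = 0; v < NR_VECTORS_SKETCH; v++) {
		if (!cleanup_wanted[v])
			continue;
		if (vector_pending_in_irr(v)) {
			/* Still pending: leave it hooked and ask for another pass. */
			self_ipis_sent++;
			continue;
		}
		vector_hooked[v] = false;
		cleanup_wanted[v] = false;
	}
}
```

A vector whose IRR bit is set survives the first pass and is released on a later pass, once the pending interrupt has been serviced and the bit is clear.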
    • x86, ioapic: Fix non atomic allocation with interrupts disabled · 05c3dc2c
      Suresh Siddha committed
      Impact: fix possible race
      
      save_mask_IO_APIC_setup() was using non-atomic memory allocation while getting
      called with interrupts disabled. Fix this by splitting it into two different
      functions. The allocation part, save_IO_APIC_setup(), now happens before
      disabling interrupts.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
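Illustration (not part of the commit): a user-space sketch of the split described in the ioapic allocation fix, with the allocating phase separated from the phase that runs with interrupts off. The interrupt state is only a flag here, and every name ending in `_sketch` is hypothetical; the kernel functions operate on real IO-APIC entries.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_PINS_SKETCH 4

struct rte_sketch {
	unsigned int vector;
	int masked;
};

static int irqs_disabled_sketch;                       /* simulated IRQ state */
static struct rte_sketch pins_sketch[NR_PINS_SKETCH];  /* simulated entries */
static struct rte_sketch *saved_sketch;                /* saved copy */

/* Phase 1: allocates, so it must run while interrupts are still enabled. */
static int save_setup_sketch(void)
{
	assert(!irqs_disabled_sketch);
	saved_sketch = malloc(sizeof(pins_sketch));
	if (!saved_sketch)
		return -1;
	for (int i = 0; i < NR_PINS_SKETCH; i++)
		saved_sketch[i] = pins_sketch[i];
	return 0;
}

/* Phase 2: no allocation, so it is safe with interrupts disabled. */
static void mask_setup_sketch(void)
{
	for (int i = 0; i < NR_PINS_SKETCH; i++)
		pins_sketch[i].masked = 1;
}
```

The caller would run the save phase first, then disable interrupts, then mask.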
    • x86, x2apic: cleanup ifdef CONFIG_INTR_REMAP in io_apic code · 29b61be6
      Suresh Siddha committed
      Impact: cleanup
      
      Clean up #ifdefs and replace them with helper functions.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
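Illustration (not part of the commit): the general pattern of replacing call-site #ifdefs with helper functions can be sketched as below. `SKETCH_INTR_REMAP` stands in for CONFIG_INTR_REMAP, and `irq_remapped_sketch()` and `route_irq_sketch()` are hypothetical names, not necessarily the helpers this commit introduced.

```c
#include <assert.h>

enum route_sketch { ROUTE_DIRECT, ROUTE_REMAPPED };

#ifdef SKETCH_INTR_REMAP
/* Feature built in: consult a (simulated) per-irq remapping table. */
static int remapped_irqs[16];
static inline int irq_remapped_sketch(int irq)
{
	return irq >= 0 && irq < 16 && remapped_irqs[irq];
}
#else
/* Feature compiled out: a constant stub, so callers' remap branches fold away. */
static inline int irq_remapped_sketch(int irq)
{
	(void)irq;
	return 0;
}
#endif

/* The call site needs no #ifdef of its own. */
static enum route_sketch route_irq_sketch(int irq)
{
	if (irq_remapped_sketch(irq))
		return ROUTE_REMAPPED;
	return ROUTE_DIRECT;
}
```

With the macro undefined, every irq takes the direct route and the compiler can drop the remapped branch.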
    • x86, x2apic: cleanup the IO-APIC level migration with interrupt-remapping · 0280f7c4
      Suresh Siddha committed
      Impact: simplification
      
      In the current code, for level triggered migration, we need to modify the
      io-apic RTE with the updated vector information, along with modifying the interrupt
      remapping table entry (IRTE) with vector and destination. This is to ensure that
      the remote IRR bit in the IOAPIC RTE gets cleared when the cpu does EOI.
      
      With this patch, for level triggered, we eliminate the io-apic RTE modification
      (with the updated vector information), by using a virtual vector (io-apic pin
      number).  The real vector that is used for interrupting the cpu will come from
      the interrupt-remapping table entry. Trigger mode in the IRTE will always be
      edge, and the actual level or edge trigger will be set up in the IO-APIC RTE.
      So a level triggered interrupt will appear as an edge to the local apic
      cpu but still as level to the IO-APIC.
      
      With this change, level irq migration can be done by simply modifying
      the interrupt-remapping table entry without changing the io-apic RTE.
      And as the interrupt appears as edge at the cpu, in addition to doing the
      local apic EOI, we need to do an IO-APIC directed EOI to clear the remote
      IRR bit in the IO-APIC RTE.
      
      This simplifies the irq migration in the presence of interrupt-remapping.
      Idea-by: Rajesh Sankaran <rajesh.sankaran@intel.com>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • x86, x2apic: fix clear_local_APIC() in the presence of x2apic · cf6567fe
      Suresh Siddha committed
      Impact: cleanup, paranoia
      
      We were not clearing the local APIC in clear_local_APIC() in the
      presence of x2apic. Fix it.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • x86, x2apic: use virtual wire A mode in disable_IO_APIC() with interrupt-remapping · 7c6d9f97
      Suresh Siddha committed
      Impact: make kexec work with x2apic
      
      disable_IO_APIC() gets called during crashdump as well, which configures the
      IO-APIC/LAPIC so that legacy interrupts can be delivered for the kexec'd kernel.
      
      In the presence of interrupt-remapping, we need to change the
      interrupt-remapping configuration as well as modifying IO-APIC for virtual wire
      B mode.
      
      To keep things simple during the crash, use virtual wire A mode
      (for which we don't need to touch io-apic and interrupt-remapping tables).
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • x86, x2apic: enable fault handling for intr-remapping · 9d783ba0
      Suresh Siddha committed
      Impact: interface augmentation (not yet used)
      
      Enable fault handling flow for intr-remapping as well. Fault handling
      code is now shared by both dma-remapping and intr-remapping.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  2. 14 Mar, 2009 · 3 commits
    • x86: cpu/common.c more cleanups · 0f3fa48a
      Ingo Molnar committed
      Complete/fix the cleanups of cpu/common.c:
      
       - fix ugly warning due to asm/topology.h -> linux/topology.h change
       - standardize the style across the file
       - simplify/refactor the code flow where possible
      
      Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
      LKML-Reference: <1237009789.4387.2.camel@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: entry_32.S fix compile warnings - fix work mask bit width · 88200bc2
      Jaswinder Singh Rajput committed
      Fix:
      
       arch/x86/kernel/entry_32.S:446: Warning: 00000000080001d1 shortened to 00000000000001d1
       arch/x86/kernel/entry_32.S:457: Warning: 000000000800feff shortened to 000000000000feff
       arch/x86/kernel/entry_32.S:527: Warning: 00000000080001d1 shortened to 00000000000001d1
       arch/x86/kernel/entry_32.S:541: Warning: 000000000800feff shortened to 000000000000feff
       arch/x86/kernel/entry_32.S:676: Warning: 0000000008000091 shortened to 0000000000000091
      
      TIF_SYSCALL_FTRACE is 0x08000000 and until now we checked the
      first 16 bits of the work mask - bit 27 falls outside of that.
      
      Update the entry_32.S code to check the full 32-bit mask.
      
      [ %cx => %ecx fix from Cyrill Gorcunov <gorcunov@gmail.com> ]
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "H. Peter Anvin" <hpa@kernel.org>
      LKML-Reference: <1237012693.18733.3.camel@ht.satnam>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
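Illustration (not part of the commit): the truncation reported by the assembler in the entry_32.S warnings can be reproduced with plain integer arithmetic. The mask value 0x080001d1 and the flag value 0x08000000 come from the message; the function names are hypothetical, and the 16-bit versus 32-bit comparison is a C model of the assembly test, consistent with the `%cx => %ecx` note.

```c
#include <assert.h>
#include <stdint.h>

#define SYSCALL_FTRACE_FLAG 0x08000000u  /* bit 27, as quoted in the message */
#define WORK_MASK_SKETCH    0x080001d1u  /* mask from the first warning */

/* Old behaviour: a 16-bit test silently drops the upper half of the mask. */
static int work_pending_16(uint32_t flags)
{
	return ((uint16_t)flags & (uint16_t)WORK_MASK_SKETCH) != 0;
}

/* Fixed behaviour: a full 32-bit test sees bit 27. */
static int work_pending_32(uint32_t flags)
{
	return (flags & WORK_MASK_SKETCH) != 0;
}
```

With only bit 27 set, the 16-bit variant reports no pending work while the 32-bit variant does. The truncated mask 0x01d1 matches the "shortened to" value in the warning.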
    • x86: cpu/common.c cleanups · 9766cdbc
      Jaswinder Singh Rajput committed
       - fix various style problems
       - declare variables before they get used
       - introduce clear_all_debug_regs
       - fix header files issues
      
      LKML-Reference: <1237009789.4387.2.camel@localhost.localdomain>
      Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 13 Mar, 2009 · 8 commits
  4. 12 Mar, 2009 · 3 commits
  5. 11 Mar, 2009 · 7 commits
  6. 10 Mar, 2009 · 4 commits
    • x86: BUG to BUG_ON changes · 8c5dfd25
      Stoyan Gaydarov committed
      Impact: cleanup
      Signed-off-by: Stoyan Gaydarov <stoyboyker@gmail.com>
      LKML-Reference: <1236661850-8237-8-git-send-email-stoyboyker@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • percpu: generalize embedding first chunk setup helper · 66c3a757
      Tejun Heo committed
      Impact: code reorganization
      
      Separate out embedding first chunk setup helper from x86 embedding
      first chunk allocator and put it in mm/percpu.c.  This will be used by
      the default percpu first chunk allocator and possibly by other archs.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk() · 6074d5b0
      Tejun Heo committed
      Impact: cleanup, more flexibility for first chunk init
      
      Non-negative @dyn_size used to be allowed iff @unit_size wasn't auto.
      This restriction stemmed from implementation detail and made things a
      bit less intuitive.  This patch allows @dyn_size to be specified
      regardless of @unit_size and swaps the positions of @dyn_size and
      @unit_size so that the parameter order makes more sense (static,
      reserved and dyn sizes followed by enclosing unit_size).
      
      While at it, add @unit_size >= PCPU_MIN_UNIT_SIZE sanity check.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod." · 129f8ae9
      Dave Jones committed
      This reverts commit e088e4c9.
      
      Removing the sysfs interface for p4-clockmod was flagged as a
      regression in bug 12826.
      
      Course of action:
       - Find out the remaining causes of overheating, and fix them
         if possible. ACPI should be doing the right thing automatically.
         If it isn't, we need to fix that.
       - mark p4-clockmod ui as deprecated
       - try again with the removal in six months.
      
      It's not really feasible to printk about the deprecation, because
      it needs to happen at all the sysfs entry points, which means adding
      a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.
      Signed-off-by: Dave Jones <davej@redhat.com>
  7. 08 Mar, 2009 · 2 commits
    • x86: remove smp_apply_quirks()/smp_checks() · 1f442d70
      Yinghai Lu committed
      Impact: cleanup and code size reduction on 64-bit
      
      This code is only applied to Intel Pentium and AMD K7 32-bit cpus.
      
      Move those checks to intel_init()/amd_init() for 32-bit
      so 64-bit will not build this code.
      
      Also change to use cpu_index check to see if we need to emit warning.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <49B377D2.8030108@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: UV: remove uv_flush_tlb_others() WARN_ON · 3a450de1
      Cliff Wickman committed
      In uv_flush_tlb_others() (arch/x86/kernel/tlb_uv.c),
      the "WARN_ON(!in_atomic())" fails if CONFIG_PREEMPT is not enabled.
      
      And CONFIG_PREEMPT is not enabled by default in the distribution that
      most UV owners will use.
      
      We could #ifdef CONFIG_PREEMPT the warning, but that is not good form.
      And there seems to be no suitable fix to in_atomic() when CONFIG_PREEMPT
      is not on.
      
      As Ingo commented:
      
        > and we have no proper primitive to test for atomicity. (mainly
        > because we dont know about atomicity on a non-preempt kernel)
      
      So we drop the WARN_ON.
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
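Illustration (not part of the commit): one way to picture why an in_atomic() style check is unreliable without CONFIG_PREEMPT is that the preemption count may simply not be maintained. The model below assumes that lock acquisition bumps a counter only in preemptible builds. `SKETCH_PREEMPT` and all `_sketch` names are hypothetical, and the model is a simplification of the kernel's actual accounting.

```c
#include <assert.h>

static int preempt_count_sketch;

#ifdef SKETCH_PREEMPT
/* Preemptible build: entering a critical section is counted. */
#define preempt_disable_sketch() ((void)preempt_count_sketch++)
#define preempt_enable_sketch()  ((void)preempt_count_sketch--)
#else
/* Non-preempt build: the count is not touched at all. */
#define preempt_disable_sketch() ((void)0)
#define preempt_enable_sketch()  ((void)0)
#endif

static int in_atomic_sketch(void)
{
	return preempt_count_sketch != 0;
}

/* Reports what in_atomic_sketch() says while a "spinlock" is held. */
static int atomic_seen_under_lock(void)
{
	int seen;

	preempt_disable_sketch();   /* stands in for taking a spinlock */
	seen = in_atomic_sketch();
	preempt_enable_sketch();
	return seen;
}
```

Built without `SKETCH_PREEMPT`, the check reports "not atomic" even inside the locked region, which is the kind of false warning the message describes.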
  8. 06 Mar, 2009 · 5 commits
    • x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs() · 73bf1b62
      Markus Metzger committed
      ds_write_config() can write the BTS as well as the PEBS part of
      the DS config. ds_request_pebs() passes the wrong qualifier, which
      results in the wrong configuration being written.
      Reported-by: Stephane Eranian <eranian@googlemail.com>
      Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
      LKML-Reference: <20090305085721.A22550@sedona.ch.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, bts: remove bad warning · 9ca0791d
      Markus Metzger committed
      In case a ptraced task is reaped (while the tracer is still attached),
      ds_exit_thread() is called before ptrace_exit(). The latter will
      release the bts_tracer and remove the thread's ds_ctx.
      The former will WARN() if the context is not NULL.
      
      Oleg Nesterov submitted patches that move ptrace_exit() before
      exit_thread() and thus reverse the order of the above calls.
      
      Remove the bad warning. I will add it again when Oleg's changes are in.
      Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
      LKML-Reference: <20090305084954.A22000@sedona.ch.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, percpu: setup reserved percpu area for x86_64 · 6b19b0c2
      Tejun Heo committed
      Impact: fix relocation overflow during module load
      
      x86_64 uses 32-bit relocations for symbol access, and static percpu
      symbols, whether in core or modules, must be inside 2GB of the percpu
      segment base, which the dynamic percpu allocator doesn't guarantee.
      This patch makes x86_64 reserve PERCPU_MODULE_RESERVE bytes in the
      first chunk so that module percpu areas are always allocated from the
      first chunk, which is always inside the relocatable range.
      
      This problem exists for any percpu allocator but is easily triggered
      when using the embedding allocator because the second chunk is located
      beyond 2GB on it.
      
      This patch also changes the meaning of PERCPU_DYNAMIC_RESERVE such
      that it only indicates the size of the area to reserve for dynamic
      allocation as static and dynamic areas can be separate.  New
      PERCPU_DYNAMIC_RESERVE is increased by 4k for both 32 and 64bits as
      the reserved area separation eats away some allocatable space and
      having slightly more headroom (currently between 4 and 8k after
      minimal boot sans module area) makes sense for common case
      performance.
      
      x86_32 can address anywhere from anywhere and doesn't need reserving.
      
      Mike Galbraith first reported the problem and bisected it to the
      embedding percpu allocator commit.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Mike Galbraith <efault@gmx.de>
      Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
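Illustration (not part of the commit): the 2GB constraint in the x86_64 percpu message amounts to the symbol's distance from the percpu base fitting in a signed 32-bit field. The helper below is a hypothetical check of that arithmetic, not code from the kernel or the module loader, and the addresses in the note are made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if the distance from the percpu base to a symbol can be
 * encoded in a signed 32-bit relocation, 0 otherwise.
 */
static int fits_rel32(uint64_t base, uint64_t sym)
{
	int64_t delta = (int64_t)(sym - base);

	return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

For a base at 4GB, a symbol 1GB above it is reachable while one 3GB above it is not. A module percpu area placed in a distant second chunk would hit this overflow, which is why the patch reserves space in the first chunk.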
    • percpu, module: implement reserved allocation and use it for module percpu variables · edcb4639
      Tejun Heo committed
      Impact: add reserved allocation functionality and use it for module
      	percpu variables
      
      This patch implements reserved allocation from the first chunk.  When
      setting up the first chunk, arch can ask to set aside certain number
      of bytes right after the core static area which is available only
      through a separate reserved allocator.  This will be used primarily
      for module static percpu variables on architectures with limited
      relocation range to ensure that the module percpu symbols are inside
      the relocatable range.
      
      If reserved area is requested, the first chunk becomes reserved and
      isn't available for regular allocation.  If the first chunk also
      includes piggy-back dynamic allocation area, a separate chunk mapping
      the same region is created to serve dynamic allocation.  The first one
      is called static first chunk and the second dynamic first chunk.
      Although they share the page map, their different area map
      initializations guarantee they serve disjoint areas according to their
      purposes.
      
      If arch doesn't set up reserved area, reserved allocation is handled
      like any other allocation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
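Illustration (not part of the commit): the first-chunk layout described in the reserved allocation message, with a static area followed by a reserved area and then a dynamic area, can be sketched with two bump allocators that serve disjoint ranges. This is a deliberate simplification, since the message describes separate area maps rather than bump pointers, and all names and sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ALLOC_FAIL_SKETCH ((size_t)-1)

/* Hypothetical first-chunk layout: [static | reserved | dynamic]. */
struct first_chunk_sketch {
	size_t static_size;
	size_t reserved_size;
	size_t dyn_size;
	size_t reserved_used;
	size_t dyn_used;
};

/* Allocate from the reserved area; returns an offset from the chunk start. */
static size_t alloc_reserved_sketch(struct first_chunk_sketch *c, size_t size)
{
	size_t off;

	if (size > c->reserved_size - c->reserved_used)
		return ALLOC_FAIL_SKETCH;
	off = c->static_size + c->reserved_used;
	c->reserved_used += size;
	return off;
}

/* Allocate from the dynamic area, which starts after the reserved area. */
static size_t alloc_dynamic_sketch(struct first_chunk_sketch *c, size_t size)
{
	size_t off;

	if (size > c->dyn_size - c->dyn_used)
		return ALLOC_FAIL_SKETCH;
	off = c->static_size + c->reserved_size + c->dyn_used;
	c->dyn_used += size;
	return off;
}
```

Reserved allocations always land right after the static area, so they stay inside the range a limited relocation can reach, and regular dynamic allocations never consume that space.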
    • x86: make embedding percpu allocator return excessive free space · 9a4f8a87
      Tejun Heo committed
      Impact: reduce unnecessary memory usage on certain configurations
      
      Embedding percpu allocator allocates unit_size *
      smp_num_possible_cpus() bytes consecutively and uses it for the first
      chunk.  However, if the static area is small, this can result in
      excessive preallocated free space in the first chunk due to
      PCPU_MIN_UNIT_SIZE restriction.
      
      This patch makes embedding percpu allocator preallocate only what's
      necessary as described by PERCPU_DYNAMIC_RESERVE and return the
      leftover to the bootmem allocator.
      Signed-off-by: Tejun Heo <tj@kernel.org>
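Illustration (not part of the commit): the leftover described in the embedding allocator message is, per unit, the gap between the enforced unit size and what the static area plus the dynamic reserve actually need. The arithmetic below assumes 4k pages and takes the minimum unit size as a parameter; the function names and the numbers in the note are hypothetical, not the kernel's constants.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_SKETCH ((size_t)4096)

/* Round a byte count up to a whole number of pages. */
static size_t page_align_sketch(size_t n)
{
	return (n + PAGE_SIZE_SKETCH - 1) & ~(PAGE_SIZE_SKETCH - 1);
}

/* Bytes per unit that could be handed back to the boot-time allocator. */
static size_t unit_leftover_sketch(size_t static_size, size_t dyn_reserve,
				   size_t min_unit_size)
{
	size_t needed = page_align_sketch(static_size + dyn_reserve);
	size_t unit_size = needed > min_unit_size ? needed : min_unit_size;

	return unit_size - needed;
}
```

With a 10000-byte static area, a 20k dynamic reserve, and a 64k minimum unit, 32k of each unit is needed and the other 32k is leftover. Once the needed size exceeds the minimum, the leftover drops to zero.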