1. 25 Sep 2013, 2 commits
  2. 13 Sep 2013, 1 commit
    • perf/x86/intel: Fix Silvermont offcore masks · 06c939c1
      Committed by Peter Zijlstra
      Fengguang Wu reported:
      
      > sparse warnings: (new ones prefixed by >>)
      >
      > >> arch/x86/kernel/cpu/perf_event_intel.c:901:9: sparse: constant 0x768005ffff is so big it is long
      > >> arch/x86/kernel/cpu/perf_event_intel.c:902:9: sparse: constant 0x768005ffff is so big it is long
      >
      > vim +901 arch/x86/kernel/cpu/perf_event_intel.c
      >
      >    895	 },
      >    896	};
      >    897
      >    898	static struct extra_reg intel_slm_extra_regs[] __read_mostly =
      >    899	{
      >    900		/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
      >  > 901		INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x768005ffff, RSP_0),
      >  > 902		INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0x768005ffff, RSP_1),
      >    903		EVENT_EXTRA_END
      >    904	};
      >    905
      
      Extend those constants to 64 bits.
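
      A minimal, self-contained sketch of the underlying issue (the real site is
      the INTEL_UEVENT_EXTRA_REG table quoted above; the demo below only models
      the constant, and the variable names are illustrative):

        #include <stdio.h>
        #include <stdint.h>

        int main(void)
        {
                /* 0x768005ffff needs 39 bits, so without a width suffix the
                 * literal's type varies with the target (long on 64-bit,
                 * long long on 32-bit), which is what sparse flags. */
                uint64_t no_suffix   = 0x768005ffff;
                uint64_t with_suffix = 0x768005ffffULL;  /* explicitly 64-bit */

                printf("%#llx %#llx\n",
                       (unsigned long long)no_suffix,
                       (unsigned long long)with_suffix);
                return 0;
        }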
      
      Reported-by: fengguang.wu@intel.com
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20130909112636.GQ31370@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      06c939c1
  3. 12 Sep 2013, 3 commits
    • perf/x86: Fix uncore PCI fixed counter handling · dbc33f70
      Committed by Stephane Eranian
      There was a bug in the handling of SNB-EP/IVB-EP uncore PCI
      fixed counters, e.g., IMC.
      
      It would cause erratic values to be returned for the IMC
      clockticks event. This was due to a bogus hwc->config value
      which was then written to PCI config space.
      
      The erratic values can be seen via:
      
        $ perf stat -a -C 0 -e uncore_imc_0/clockticks/ -I 1000 sleep 10
      
      The fixed counter has most fields marked as reserved with
      hw reset values of 0. Yet the kernel was defaulting to a
      hwc->config = ~0 and that was causing the issues.
      
      This patch sets the hwc->config value for fixed uncore events
      to 0. Now the IMC clockticks values are correct.
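
      A hedged sketch of the idea; the struct and helper below are simplified
      stand-ins for the uncore code, not the kernel's actual types:

        /* Fixed uncore counters have most config fields reserved with a hw
         * reset value of 0, so the value pushed into PCI config space must
         * be 0 as well; the old ~0 default hit reserved bits and produced
         * the erratic clockticks readings. */
        struct hw_event_sketch {
                unsigned long long config;      /* written to PCI config space */
        };

        static void uncore_pci_fixed_event_init(struct hw_event_sketch *hwc)
        {
                hwc->config = 0;                /* was effectively ~0 before the fix */
        }
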
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: peterz@infradead.org
      Cc: zheng.z.yan@intel.com
      Link: http://lkml.kernel.org/r/20130909195350.GA17643@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      dbc33f70
    • perf/x86: Add constraint for IVB CYCLE_ACTIVITY:CYCLES_LDM_PENDING · 6113af14
      Committed by Stephane Eranian
      The IvyBridge event CYCLE_ACTIVITY:CYCLES_LDM_PENDING can only
      be measured on counters 0-3, even when HT is off and all eight
      counters are available. (When HT is on, only counters 0-3 exist
      anyway.)
      
      If you program it on the eight counters for 1s on a 3GHz
      IVB laptop running a noploop, you see:
      
                 2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
                 2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
                 2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
                 2 747 527 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
             3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
             3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
             3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
             3 280 563 608 CYCLE_ACTIVITY:CYCLES_LDM_PENDING
      
      Clearly the last 4 values are bogus.
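
      A hedged sketch of the kind of entry such a fix adds to the IvyBridge
      constraint table; the event/umask encoding and the counter mask below
      are illustrative assumptions, not the exact upstream values:

        static struct event_constraint intel_ivb_event_constraints[] __read_mostly =
        {
                /* CYCLE_ACTIVITY:CYCLES_LDM_PENDING: pin the event to the
                 * counters that can measure it, so perf never schedules it
                 * where it would return the bogus values shown above. */
                INTEL_UEVENT_CONSTRAINT(0x08a3, 0xf),
                EVENT_CONSTRAINT_END
        };
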
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: ak@linux.intel.com
      Cc: zheng.z.yan@intel.com
      Cc: dhsharp@google.com
      Link: http://lkml.kernel.org/r/20130911152222.GA28761@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6113af14
    • mm: vmstats: track TLB flush stats on UP too · 6df46865
      Committed by Dave Hansen
      The previous patch doing vmstats for TLB flushes ("mm: vmstats: tlb flush
      counters") effectively missed UP since arch/x86/mm/tlb.c is only compiled
      for SMP.
      
      UP systems do not do remote TLB flushes, so compile those counters out on
      UP.
      
      arch/x86/kernel/cpu/mtrr/generic.c calls __flush_tlb() directly.  This is
      probably an optimization since both the mtrr code and __flush_tlb() write
      cr4.  It would probably be safe to make that a flush_tlb_all() (and then
      get these statistics), but the mtrr code is ancient and I'm hesitant to
      touch it other than to just stick in the counters.
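
      A hedged sketch of the resulting pattern; the vmstat event names follow
      the counters introduced by the earlier patch, but the helpers and their
      placement here are paraphrased, not the upstream diff:

        /* Local flushes happen, and are now counted, on UP and SMP alike. */
        static inline void flush_tlb_all_counted(void)
        {
                count_vm_event(NR_TLB_LOCAL_FLUSH_ALL);
                __flush_tlb_all();
        }

        #ifdef CONFIG_SMP
        /* Remote flushes, and their counter, only exist on SMP builds. */
        static inline void note_remote_tlb_flush(void)
        {
                count_vm_event(NR_TLB_REMOTE_FLUSH);
        }
        #endif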
      
      [akpm@linux-foundation.org: tweak comments]
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6df46865
  4. 05 Sep 2013, 2 commits
  5. 04 Sep 2013, 1 commit
  6. 02 Sep 2013, 4 commits
  7. 26 Aug 2013, 1 commit
  8. 23 Aug 2013, 2 commits
  9. 20 Aug 2013, 1 commit
    • x86/ioapic/kcrash: Prevent crash_kexec() from deadlocking on ioapic_lock · 17405453
      Committed by Yoshihiro YUNOMAE
      Prevent crash_kexec() from deadlocking on ioapic_lock. When
      crash_kexec() is executed on a CPU, the CPU will take ioapic_lock
      in disable_IO_APIC(). So if the CPU receives the NMI while it is
      already holding ioapic_lock, a deadlock will happen.
      
      In this patch, ioapic_lock is zapped/initialized before disable_IO_APIC().
      
      You can reproduce this deadlock the following way:
      
      1. Add mdelay(1000) after raw_spin_lock_irqsave() in
         native_ioapic_set_affinity()@arch/x86/kernel/apic/io_apic.c
      
         The deadlock can occur without this modification, but the added
         delay makes it much easier to hit.
      
      2. Build and install the kernel
      
      3. Set up the OS which will run panic() and kexec when NMI is injected
          # echo "kernel.unknown_nmi_panic=1" >> /etc/sysctl.conf
          # vim /etc/default/grub
            add "nmi_watchdog=0 crashkernel=256M" in GRUB_CMDLINE_LINUX line
          # grub2-mkconfig
      
      4. Reboot the OS
      
      5. Run following command for each vcpu on the guest
          # while true; do echo <CPU num> > /proc/irq/<IO-APIC-edge or IO-APIC-fasteoi>/smp_affinity; done;
         By running this command, cpus will get ioapic_lock for setting affinity.
      
      6. Inject NMI (push a dump button or execute 'virsh inject-nmi <domain>' if you
         use VM). After injecting NMI, panic() is called in an nmi-handler context.
         Then, kexec will normally run in panic(), but the operation will be stopped
         by deadlock on ioapic_lock in crash_kexec()->machine_crash_shutdown()->
         native_machine_crash_shutdown()->disable_IO_APIC()->clear_IO_APIC()->
         clear_IO_APIC_pin()->ioapic_read_entry().
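
      A hedged sketch of the approach; the helper name and its call site are
      paraphrased from the description above rather than copied from the patch:

        /* Crash path only: the other CPUs have already been stopped, so
         * forcibly re-initializing the lock cannot race with a legitimate
         * holder, and it guarantees disable_IO_APIC() will not spin forever
         * on a lock that was held when the NMI arrived. */
        void ioapic_zap_locks(void)
        {
                raw_spin_lock_init(&ioapic_lock);
        }

        /* native_machine_crash_shutdown():
         *         ...
         *         ioapic_zap_locks();
         *         disable_IO_APIC();
         */
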
      Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Link: http://lkml.kernel.org/r/20130820070107.28245.83806.stgit@yunodevel
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      17405453
  10. 19 Aug 2013, 1 commit
  11. 16 Aug 2013, 3 commits
  12. 14 Aug 2013, 4 commits
  13. 13 Aug 2013, 2 commits
    • x86, microcode, AMD: Fix early microcode loading · 84516098
      Committed by Torsten Kaiser
      load_microcode_amd() (and the helper it is using) should not have a
      cpu parameter. The microcode loading does not depend on the CPU wrt the
      patches loaded since they will end up in a global list for all CPUs
      anyway.
      
      The change from cpu to x86family in load_microcode_amd()
      now allows dropping the code messing with cpu_data(cpu) from
      collect_cpu_info_amd_early(), which is wrong anyway because at that
      point the per-cpu cpu_info is not yet set up (these values would later be
      overwritten by smp_store_boot_cpu_info() / smp_store_cpu_info()).
      
      Fold the rest of collect_cpu_info_amd_early() into load_ucode_amd_ap(),
      because it is only used in one place and, without the cpuinfo_x86
      accesses, not much of it was left.
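
      A hedged sketch of the resulting interface; the return type and exact
      prototype are paraphrased from the text above, not copied from the tree:

        /* The parsed patches land in one global list shared by all CPUs, so
         * the loader only needs the family to pick the right patch container,
         * not a cpu number (and not per-cpu data that is not initialized yet
         * when the APs load microcode early). */
        enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t size);
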
      Signed-off-by: Torsten Kaiser <just.for.lkml@googlemail.com>
      [ Fengguang: build fix ]
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      [ Boris: adapt it to current tree. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
      84516098
    • x86, microcode, AMD: Make cpu_has_amd_erratum() use the correct struct cpuinfo_x86 · 8c6b79bb
      Committed by Torsten Kaiser
      cpu_has_amd_erratum() is buggy, because it uses the per-cpu cpu_info
      before it is filled by smp_store_boot_cpu_info() / smp_store_cpu_info().
      
      If early microcode loading is enabled, its collect_cpu_info_amd_early()
      will fill ->x86, so the fallback to boot_cpu_data is not used. But
      ->x86_vendor is never filled and stays X86_VENDOR_INTEL, so no errata
      fixes get applied and my system hangs on boot.
      
      Using cpu_info in cpu_has_amd_erratum() is wrong anyway: its only
      caller init_amd() will have a struct cpuinfo_x86 as parameter and the
      set_cpu_bug() that is controlled by cpu_has_amd_erratum() also only uses
      that struct.
      
      So pass the struct cpuinfo_x86 from init_amd() to cpu_has_amd_erratum()
      and the broken fallback can be dropped.
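
      A hedged sketch of the new calling convention; the prototype, erratum
      table and bug bit below are illustrative, based only on the description
      above:

        static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum);

        static void init_amd(struct cpuinfo_x86 *c)
        {
                /* c is the cpuinfo the caller is initializing; no peeking at
                 * per-cpu cpu_info, which may not be set up yet. */
                if (cpu_has_amd_erratum(c, amd_erratum_400))
                        set_cpu_bug(c, X86_BUG_AMD_E400);   /* illustrative bug bit */
        }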
      
      [ Boris: Drop WARN_ON() since we're called only from init_amd() ]
      Signed-off-by: Torsten Kaiser <just.for.lkml@googlemail.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      8c6b79bb
  14. 12 Aug 2013, 3 commits
  15. 09 Aug 2013, 3 commits
    • x86, ticketlock: Add slowpath logic · 96f853ea
      Committed by Jeremy Fitzhardinge
      Maintain a flag in the LSB of the ticket lock tail which indicates
      whether anyone is in the lock slowpath and may need kicking when
      the current holder unlocks.  The flag is set when the first locker
      enters the slowpath, and cleared when unlocking to an empty queue (ie,
      no contention).
      
      In the specific implementation of lock_spinning(), make sure to set
      the slowpath flags on the lock just before blocking.  We must do
      this before the last-chance pickup test to prevent a deadlock
      with the unlocker:
      
      Unlocker			Locker
      				test for lock pickup
      					-> fail
      unlock
      test slowpath
      	-> false
      				set slowpath flags
      				block
      
      Whereas this works in any ordering:
      
      Unlocker			Locker
      				set slowpath flags
      				test for lock pickup
      					-> fail
      				block
      unlock
      test slowpath
      	-> true, kick
      
      If the unlocker finds that the lock has the slowpath flag set but it is
      actually uncontended (ie, head == tail, so nobody is waiting), then it
      clears the slowpath flag.
      
      The unlock code uses a locked add to update the head counter.  This also
      acts as a full memory barrier, so it is safe to subsequently
      read back the slowpath flag state, knowing that the updated lock is visible
      to the other CPUs.  If it were an unlocked add, then the flag read may
      just be forwarded from the store buffer before it was visible to the other
      CPUs, which could result in a deadlock.
      
      Unfortunately this means we need to do a locked instruction when
      unlocking with PV ticketlocks.  However, if PV ticketlocks are not
      enabled, then the old non-locked "add" is the only unlocking code.
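
      A hedged, userspace model of the tail-LSB flag and the unlock-side check
      described above; the field widths, constants and the kick/clear helpers
      are illustrative, not the kernel's arch_spinlock_t:

        #include <stdatomic.h>
        #include <stdint.h>

        #define TICKET_SLOWPATH_FLAG  0x1   /* LSB of tail: somebody is blocked */
        #define TICKET_INC            0x2   /* tickets advance by 2; LSB stays free */

        struct ticketlock {
                _Atomic uint16_t head;
                _Atomic uint16_t tail;      /* LSB doubles as the slowpath flag */
        };

        static void kick_next_waiter(struct ticketlock *lk, uint16_t ticket)
        {
                (void)lk; (void)ticket;     /* the pv "kick" call would go here */
        }

        static void ticket_unlock(struct ticketlock *lk)
        {
                /* Locked add: releases the lock and is a full barrier, so the
                 * flag read below cannot be satisfied early from the store
                 * buffer (the deadlock ordering shown above). */
                uint16_t new_head = atomic_fetch_add(&lk->head, TICKET_INC) + TICKET_INC;
                uint16_t tail = atomic_load(&lk->tail);

                if (tail & TICKET_SLOWPATH_FLAG) {
                        if ((tail & ~TICKET_SLOWPATH_FLAG) == new_head)
                                /* flag set but nobody is waiting: clear it */
                                atomic_fetch_and(&lk->tail, (uint16_t)~TICKET_SLOWPATH_FLAG);
                        else
                                kick_next_waiter(lk, new_head);
                }
        }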
      
      Note: this code relies on gcc making sure that unlikely() code is out of
      line of the fastpath, which only happens when OPTIMIZE_SIZE=n.  If it
      doesn't, the generated code isn't too bad, but it's definitely suboptimal.
      
      Thanks to Srivatsa Vaddagiri for providing a bugfix to the original
      version of this change, which has been folded in.
      Thanks to Stephan Diestelhorst for commenting on some code which relied
      on an inaccurate reading of the x86 memory ordering rules.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Link: http://lkml.kernel.org/r/1376058122-8248-11-git-send-email-raghavendra.kt@linux.vnet.ibm.com
      Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Stephan Diestelhorst <stephan.diestelhorst@amd.com>
      Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      96f853ea
    • x86, pvticketlock: Use callee-save for lock_spinning · 354714dd
      Committed by Jeremy Fitzhardinge
      Although the lock_spinning calls in the spinlock code are on the
      uncommon path, their presence can cause the compiler to generate many
      more register save/restores in the function pre/postamble, which is in
      the fast path.  To avoid this, convert it to using the pvops callee-save
      calling convention, which defers all the save/restores until the actual
      function is called, keeping the fastpath clean.
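
      A hedged sketch of what the change looks like at the registration end
      (Xen shown here); the function and field names are paraphrased from the
      pvops conventions, not copied from the patch:

        static void xen_lock_spinning(struct arch_spinlock *lock, __ticket_t want);
        PV_CALLEE_SAVE_REGS_THUNK(xen_lock_spinning);

        void __init xen_init_spinlocks(void)
        {
                /* The thunk does the register saving itself, so the inlined
                 * fast path around the pvop call site stays clean. */
                pv_lock_ops.lock_spinning = PV_CALLEE_SAVE(xen_lock_spinning);
        }
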
      Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Link: http://lkml.kernel.org/r/1376058122-8248-8-git-send-email-raghavendra.kt@linux.vnet.ibm.com
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Tested-by: Attilio Rao <attilio.rao@citrix.com>
      Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      354714dd
    • x86, spinlock: Replace pv spinlocks with pv ticketlocks · 545ac138
      Committed by Jeremy Fitzhardinge
      Rather than outright replacing the entire spinlock implementation in
      order to paravirtualize it, keep the ticket lock implementation but add
      a couple of pvops hooks on the slow path (long spin on lock, unlocking
      a contended lock).
      
      Ticket locks have a number of nice properties, but they also have some
      surprising behaviours in virtual environments.  They enforce a strict
      FIFO ordering on cpus trying to take a lock; however, if the hypervisor
      scheduler does not schedule the cpus in the correct order, the system can
      waste a huge amount of time spinning until the next cpu can take the lock.
      
      (See Thomas Friebel's talk "Prevent Guests from Spinning Around"
      http://www.xen.org/files/xensummitboston08/LHP.pdf for more details.)
      
      To address this, we add two hooks:
       - __ticket_spin_lock which is called after the cpu has been
         spinning on the lock for a significant number of iterations but has
         failed to take the lock (presumably because the cpu holding the lock
         has been descheduled).  The lock_spinning pvop is expected to block
         the cpu until it has been kicked by the current lock holder.
       - __ticket_spin_unlock, which, on releasing a contended lock (there
         are more cpus with tail tickets), looks to see if the next cpu is
         blocked and wakes it if so.
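
      A hedged sketch of where the first hook sits in the ticket-lock slow
      path; constants and helper names are paraphrased, not the exact
      upstream code:

        static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
        {
                register struct __raw_tickets inc = { .tail = 1 };

                inc = xadd(&lock->tickets, inc);        /* grab a ticket */
                for (;;) {
                        unsigned count = SPIN_THRESHOLD;

                        do {
                                if (ACCESS_ONCE(lock->tickets.head) == inc.tail)
                                        return;         /* our turn; lock taken */
                                cpu_relax();
                        } while (--count);

                        /* Spun "long enough" without progress: the holder is
                         * probably descheduled, so block until it kicks us.
                         * The unlock side calls the second hook to wake the
                         * next blocked ticket; with CONFIG_PARAVIRT_SPINLOCKS
                         * off both hooks compile to nothing. */
                        __ticket_lock_spinning(lock, inc.tail);
                }
        }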
      
      When compiled with CONFIG_PARAVIRT_SPINLOCKS disabled, a set of stub
      functions causes all the extra code to go away.
      
      Results:
      =======
      setup: 32 core machine with 32 vcpu KVM guest (HT off)  with 8GB RAM
      base = 3.11-rc
      patched = base + pvspinlock V12
      
      +-----------------+----------------+--------+
       dbench (Throughput in MB/sec. Higher is better)
      +-----------------+----------------+--------+
      |   base (stdev %)|patched(stdev%) | %gain  |
      +-----------------+----------------+--------+
      | 15035.3   (0.3) |15150.0   (0.6) |   0.8  |
      |  1470.0   (2.2) | 1713.7   (1.9) |  16.6  |
      |   848.6   (4.3) |  967.8   (4.3) |  14.0  |
      |   652.9   (3.5) |  685.3   (3.7) |   5.0  |
      +-----------------+----------------+--------+
      
      pvspinlock shows benefits for overcommit ratio > 1 in PLE-enabled cases,
      and undercommit results are flat.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Link: http://lkml.kernel.org/r/1376058122-8248-2-git-send-email-raghavendra.kt@linux.vnet.ibm.com
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Tested-by: Attilio Rao <attilio.rao@citrix.com>
      [ Raghavendra: Changed SPIN_THRESHOLD, fixed redefinition of arch_spinlock_t]
      Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      545ac138
  16. 07 Aug 2013, 7 commits