1. 29 1月, 2016 2 次提交
    • P
      perf/x86: De-obfuscate code · 8f04b853
      Peter Zijlstra 提交于
      Get rid of the 'onln' obfuscation.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      8f04b853
    • P
      perf/x86: Fix uninitialized value usage · e01d8718
      Peter Zijlstra 提交于
      When calling intel_alt_er() with .idx != EXTRA_REG_RSP_* we will not
      initialize alt_idx and then use this uninitialized value to index an
      array.
      
      When that is not fatal, it can result in an infinite loop in its
      caller __intel_shared_reg_get_constraints(), with IRQs disabled.
      
      Alternative error modes are random memory corruption due to the
      cpuc->shared_regs->regs[] array overrun, which manifest in either
      get_constraints or put_constraints doing weird stuff.
      
      Only took 6 hours of painful debugging to find this. Neither GCC nor
      Smatch warnings flagged this bug.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: ae3f011f ("perf/x86/intel: Fix SLM MSR_OFFCORE_RSP1 valid_mask")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      e01d8718
  2. 22 1月, 2016 1 次提交
  3. 14 1月, 2016 1 次提交
  4. 06 1月, 2016 14 次提交
  5. 19 12月, 2015 4 次提交
  6. 06 12月, 2015 4 次提交
  7. 04 12月, 2015 1 次提交
    • R
      x86/mm/mtrr: Mark the 'range_new' static variable in mtrr_calc_range_state() as __initdata · c332813b
      Rasmus Villemoes 提交于
      'range_new' doesn't seem to be used after init. It is only passed
      to memset(), sum_ranges(), memcmp() and x86_get_mtrr_mem_range(), the
      latter of which also only passes it on to various *range*
      library functions.
      
      So mark it __initdata to free up an extra page after init.
      
      Its contents are wiped at every call to mtrr_calc_range_state(),
      so it being static is not about preserving state between calls,
      but simply to avoid a 4k+ stack frame. While there, add a
      comment explaining this and why it's safe.
      
      We could also mark nr_range_new as __initdata, but since it's
      just a single int and also doesn't carry state between calls (it
      is unconditionally assigned to before it is read), we might as
      well make it an ordinary automatic variable.
      Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Link: http://lkml.kernel.org/r/1449002691-20783-1-git-send-email-linux@rasmusvillemoes.dkSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c332813b
  8. 26 11月, 2015 1 次提交
    • L
      x86: Replace RDRAND forced-reseed with simple sanity check · 0007bccc
      Len Brown 提交于
      x86_init_rdrand() was added with 2 goals:
      
      1. Sanity check that the built-in-self-test circuit on the Digital
         Random Number Generator (DRNG) is not complaining.  As RDRAND
         HW self-checks on every invocation, this goal is achieved
         by simply invoking RDRAND and checking its return code.
      
      2. Force a full re-seed of the random number generator.
         This was done out of paranoia to benefit the most un-sophisticated
         DRNG implementation conceivable in the architecture,
         an implementation that does not exist, and unlikely ever will.
         This worst-case full-re-seed is achieved by invoking
         a 64-bit RDRAND 8192 times.
      
      Unfortunately, this worst-case re-seed costs O(1,000us).
      Magnifying this cost, it is done from identify_cpu(), which is the
      synchronous critical path to bring a processor on-line -- repeated
      for every logical processor in the system at boot and resume from S3.
      
      As it is very expensive, and of highly dubious value, we delete the
      worst-case re-seed from the kernel.
      
      We keep the 1st goal -- sanity check the hardware, and mark it absent
      if it complains.
      
      This change reduces the cost of x86_init_rdrand() by a factor of 1,000x,
      to O(1us) from O(1,000us).
      Signed-off-by: NLen Brown <len.brown@intel.com>
      Link: http://lkml.kernel.org/r/058618cc56ec6611171427ad7205e37e377aa8d4.1439738240.git.len.brown@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      0007bccc
  9. 24 11月, 2015 7 次提交
    • B
      x86/cpu: Fix MSR value truncation issue · 31ac34ca
      Borislav Petkov 提交于
      So sparse rightfully complains that the u64 MSR value we're
      writing into the STAR MSR, i.e. 0xc0000081, is being truncated:
      
      ./arch/x86/include/asm/msr.h:193:36: warning: cast truncates
      bits from constant value (23001000000000 becomes 0)
      
      because the actual value doesn't fit into the unsigned 32-bit
      quantity which are the @low and @high wrmsrl() parameters.
      
      This is not a problem, practically, because gcc is actually
      being smart enough here and does the right thing:
      
        .loc 3 87 0
        xorl    %esi, %esi		# we needz a 32-bit zero
        movl    $2293776, %edx	# 0x00230010 == (__USER32_CS << 16) | __KERNEL_CS go into the high bits
        movl    $-1073741695, %ecx	# MSR_STAR, i.e., 0xc0000081
        movl    %esi, %eax		# low order 32 bits in the MSR which are 0
        #APP
        # 87 "./arch/x86/include/asm/msr.h" 1
                wrmsr
      
      More specifically, MSR_STAR[31:0] is being set to 0. That field
      is reserved on Intel and on AMD it is 32-bit SYSCALL Target EIP.
      
      I'd strongly guess because Intel doesn't have SYSCALL in
      compat/legacy mode and we're using SYSENTER and INT80 there. And
      for compat syscalls in long mode we use CSTAR.
      
      So let's fix the sparse warning by writing SYSRET and SYSCALL CS
      and SS into the high 32-bit half of STAR and 0 in the low half
      explicitly.
      
       [ Actually, if we had to be precise, we would have to read what's in
         STAR[31:0] and write it back unchanged on Intel and write 0 on AMD. I
         guess the current writing to 0 is still ok since Intel can apparently
         stomach it. ]
      
      The resulting code is identical to what we have above:
      
        .loc 3 87 0
        xorl    %esi, %esi      # tmp104
        movl    $2293776, %eax  #, tmp103
        movl    $-1073741695, %ecx      #, tmp102
        movl    %esi, %edx      # tmp104, tmp104
      
        ...
      
              wrmsr
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1448273546-2567-6-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      31ac34ca
    • B
      x86/cpu/amd, kvm: Satisfy guest kernel reads of IC_CFG MSR · ae8b7875
      Borislav Petkov 提交于
      The kernel accesses IC_CFG MSR (0xc0011021) on AMD because it
      checks whether the way access filter is enabled on some F15h
      models, and, if so, disables it.
      
      kvm doesn't handle that MSR access and complains about it, which
      can get really noisy in dmesg when one starts kvm guests all the
      time for testing. And it is useless anyway - guest kernel
      shouldn't be doing such changes anyway so tell it that that
      filter is disabled.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1448273546-2567-4-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ae8b7875
    • B
      x86/cpu: Unify CPU family, model, stepping calculation · 99f925ce
      Borislav Petkov 提交于
      Add generic functions which calc family, model and stepping from
      the CPUID_1.EAX leaf and stick them into the library we have.
      
      Rename those which do call CPUID with the prefix "x86_cpuid" as
      suggested by Paolo Bonzini.
      
      No functionality change.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1448273546-2567-2-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      99f925ce
    • B
      x86/mce: Make usable address checks Intel-only · feab21f8
      Borislav Petkov 提交于
      The MCi_MISC bitfield definitions mce_usable_address() checks
      are Intel-only. Make them so.
      
      While at it, move mce_usable_address() up, before all its
      callers and get rid of the forward declaration.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1448350880-5573-5-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      feab21f8
    • B
      x86/mce: Add the missing memory error check on AMD · db548a28
      Borislav Petkov 提交于
      We simply need to look at the extended error code when detecting
      whether the error is of type memory.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1448350880-5573-4-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      db548a28
    • B
      x86/RAS: Remove mce.usable_addr · c0ec382e
      Borislav Petkov 提交于
      It is useless and we can use the function instead. Besides,
      mcelog(8) hasn't managed to make use of it yet. So kill it.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NTony Luck <tony.luck@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1448350880-5573-3-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c0ec382e
    • T
      x86/mce: Do not enter deferred errors into the generic pool twice · 8b38937b
      Tony Luck 提交于
      We used to have a special ring buffer for deferred errors that
      was used to mark problem pages. We replaced that with a generic
      pool. Then later converted mce_log() to also use the same pool.
      As a result, we end up adding all deferred errors to the pool
      twice.
      
      Rearrange this code. Make sure to set the m.severity and
      m.usable_addr fields for deferred errors. Then if flags and
      mca_cfg.dont_log_ce mean we call mce_log() we are done, because
      that will add this entry to the generic pool.
      
      If we skipped mce_log(), then we still want to take action for
      the deferred error, so add to the pool.
      
      Change the name of the boolean "error_logged" to "error_seen",
      we should set it whether of not we logged an error because the
      return value from machine_check_poll() is used to decide whether
      storms have subsided or not.
      Reported-by: NGong Chen <gong.chen@linux.intel.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1448350880-5573-2-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8b38937b
  10. 23 11月, 2015 5 次提交
    • B
      x86/microcode: Initialize the driver late when facilities are up · 2d5be37d
      Borislav Petkov 提交于
      Running microcode_init() from setup_arch() is a bad idea because
      not even kmalloc() is ready at that point and the loader does
      all kinds of allocations and init/registration with various
      subsystems.
      
      Make it a late initcall when required facilities are initialized
      so that the microcode driver initialization can succeed too.
      Reported-and-tested-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20151120112400.GC4028@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2d5be37d
    • A
      perf/x86: Handle multiple umask bits for BDW CYCLE_ACTIVITY.* · b7883a1c
      Andi Kleen 提交于
      The earlier constraint fix for Broadwell CYCLE_ACTIVITY.*
      forced umask 8 to counter 2. For this it used UEVENT,
      to match the complete umask.
      
      The event list for Broadwell has an additional
      STALLS_L1D_PENDIND event that uses umask 8, but also
      sets other bits in the umask.  The earlier strict umask match
      didn't handle this case.
      
      Add a new UBIT_EVENT constraint macro that only matches
      the specified bits in the umask. Then use that macro
      to handle CYCLE_ACTIVITY.* on Broadwell.
      
      The documented event also uses cmask, but there's no
      need to let the event scheduler know about the cmask,
      as the scheduling restriction is only tied to the umask.
      Reported-by: NGrant Ayers <ayers@cs.stanford.edu>
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1447719667-9998-1-git-send-email-andi@firstfloor.org
      [ Filled in the missing email address of Grant Ayers - hopefully I got the right one. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      b7883a1c
    • T
      perf/x86/intel/pt: Add interface to stop Intel PT logging · 24cc12b1
      Takao Indoh 提交于
      This patch add a function for external components to stop Intel PT.
      Basically this function is used when kernel panic occurs. When it is
      called, the intel_pt driver disables Intel PT and saves its registers
      using pt_event_stop(), which is also used by pmu.stop handler.
      
      This function stops Intel PT on the CPU where it is working, therefore
      users of it need to call it for each CPU to stop all logging.
      Signed-off-by: NTakao Indoh <indou.takao@jp.fujitsu.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin<alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: H.Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Link: http://lkml.kernel.org/r/1446614553-6072-2-git-send-email-indou.takao@jp.fujitsu.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      24cc12b1
    • A
      perf/x86: Add option to disable reading branch flags/cycles · b16a5b52
      Andi Kleen 提交于
      With LBRv5 reading the extra LBR flags like mispredict, TSX, cycles is
      not free anymore, as it has moved to a separate MSR.
      
      For callstack mode we don't need any of this information; so we can
      avoid the unnecessary MSR read. Add flags to the perf interface where
      perf record can request not collecting this information.
      
      Add branch_sample_type flags for CYCLES and FLAGS. It's a bit unusual
      for branch_sample_types to be negative (disable), not positive (enable),
      but since the legacy ABI reported the flags we need some form of
      explicit disabling to avoid breaking the ABI.
      
      After we have the flags the x86 perf code can keep track if any users
      need the flags. If noone needs it the information is not collected.
      
      This cuts down the cost of LBR callstack on Skylake significantly.
      Profiling a kernel build with LBR call stack the average run time of
      the PMI handler drops by 43%.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1445366797-30894-2-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b16a5b52
    • A
      perf/x86: Optimize stack walk user accesses · 75925e1a
      Andi Kleen 提交于
      Change the perf user stack walking to use the new
      __copy_from_user_nmi(), and split each access into word sized transfer
      sizes. This allows to inline the complete access and optimize it all
      into a single load.
      
      The main advantage is that this avoids the overhead of double page
      faults.  When normal copy_from_user() fails it reexecutes the copy to
      compute an accurate number of non copied bytes. This leads to
      executing the expensive page fault twice.
      
      While walking stacks having a fault at some point is relatively common
      (typically when some part of the program isn't compiled with frame
      pointers), so this is a large overhead.
      
      With the optimized copies we avoid this problem because they only do
      all accesses once. And of course they're much faster too when the
      access does not fault because they're just single instructions instead
      of complex function calls.
      
      While profiling a kernel build with -g, the patch brings down the
      average time of the PMI handler from 966ns to 552ns (-43%).
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1445551641-13379-2-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      75925e1a