1. 12 8月, 2014 1 次提交
    • D
      sparc64: Fix pcr_ops initialization and usage bugs. · 8bccf5b3
      David S. Miller 提交于
      Christopher reports that perf_event_print_debug() can crash in uniprocessor
      builds.  The crash is due to pcr_ops being NULL.
      
      This happens because pcr_arch_init() is only invoked by smp_cpus_done() which
      only executes in SMP builds.
      
      init_hw_perf_events() is closely intertwined with pcr_ops being setup properly,
      therefore:
      
      1) Call pcr_arch_init() early on from init_hw_perf_events(), instead of
         from smp_cpus_done().
      
      2) Do not hook up a PMU type if pcr_ops is NULL after pcr_arch_init().
      
      3) Move init_hw_perf_events to a later initcall so that it we will be
         sure to invoke pcr_arch_init() after all cpus are brought up.
      
      Finally, guard the one naked sequence of pcr_ops dereferences in
      __global_pmu_self() with an appropriate NULL check.
      Reported-by: NChristopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8bccf5b3
  2. 19 5月, 2014 1 次提交
    • S
      sparc64: fix sparse warnings in perf_event.c · 265c1ffa
      Sam Ravnborg 提交于
      Fix following sparse warnings:
      kernel/perf_event.c:113:1: warning: symbol 'cpu_hw_events' was not declared. Should it be static?
      kernel/perf_event.c:1156:6: warning: symbol 'perf_event_grab_pmc' was not declared. Should it be static?
      kernel/perf_event.c:1172:6: warning: symbol 'perf_event_release_pmc' was not declared. Should it be static?
      kernel/perf_event.c:1672:12: warning: symbol 'init_hw_perf_events' was not declared. Should it be static?
      kernel/perf_event.c:1749:52: warning: incorrect type in argument 2 (different address spaces)
      kernel/perf_event.c:1772:60: warning: incorrect type in argument 2 (different address spaces)
      kernel/perf_event.c:1779:60: warning: incorrect type in argument 2 (different address spaces)
      
      Define the functions static as they are not used outside this file.
      Fix it so copy_from_user are supplied with pointers annotated _user
      Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      265c1ffa
  3. 27 10月, 2012 1 次提交
    • D
      sparc64: Make montmul/montsqr/mpmul usable in 32-bit threads. · 517ffce4
      David S. Miller 提交于
      The Montgomery Multiply, Montgomery Square, and Multiple-Precision
      Multiply instructions work by loading a combination of the floating
      point and multiple register windows worth of integer registers
      with the inputs.
      
      These values are 64-bit.  But for 32-bit userland processes we only
      save the low 32-bits of each integer register during a register spill.
      This is because the register window save area is in the user stack and
      has a fixed layout.
      
      Therefore, the only way to use these instruction in 32-bit mode is to
      perform the following sequence:
      
      1) Load the top-32bits of a choosen integer register with a sentinel,
         say "-1".  This will be in the outer-most register window.
      
         The idea is that we're trying to see if the outer-most register
         window gets spilled, and thus the 64-bit values were truncated.
      
      2) Load all the inputs for the montmul/montsqr/mpmul instruction,
         down to the inner-most register window.
      
      3) Execute the opcode.
      
      4) Traverse back up to the outer-most register window.
      
      5) Check the sentinel, if it's still "-1" store the results.
         Otherwise retry the entire sequence.
      
      This retry is extremely troublesome.  If you're just unlucky and an
      interrupt or other trap happens, it'll push that outer-most window to
      the stack and clear the sentinel when we restore it.
      
      We could retry forever and never make forward progress if interrupts
      arrive at a fast enough rate (consider perf events as one example).
      So we have do limited retries and fallback to software which is
      extremely non-deterministic.
      
      Luckily it's very straightforward to provide a mechanism to let
      32-bit applications use a 64-bit stack.  Stacks in 64-bit mode are
      biased by 2047 bytes, which means that the lowest bit is set in the
      actual %sp register value.
      
      So if we see bit zero set in a 32-bit application's stack we treat
      it like a 64-bit stack.
      
      Runtime detection of such a facility is tricky, and cumbersome at
      best.  For example, just trying to use a biased stack and seeing if it
      works is hard to recover from (the signal handler will need to use an
      alt stack, plus something along the lines of longjmp).  Therefore, we
      add a system call to report a bitmask of arch specific features like
      this in a cheap and less hairy way.
      
      With help from Andy Polyakov.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      517ffce4
  4. 17 10月, 2012 1 次提交
    • D
      sparc64: Fix bit twiddling in sparc_pmu_enable_event(). · e793d8c6
      David S. Miller 提交于
      There was a serious disconnect in the logic happening in
      sparc_pmu_disable_event() vs. sparc_pmu_enable_event().
      
      Event disable is implemented by programming a NOP event into the PCR.
      
      However, event enable was not reversing this operation.  Instead, it
      was setting the User/Priv/Hypervisor trace enable bits.
      
      That's not sparc_pmu_enable_event()'s job, that's what
      sparc_pmu_enable() and sparc_pmu_disable() do .
      
      The intent of sparc_pmu_enable_event() is clear, since it first clear
      out the event type encoding field.  So fix this by OR'ing in the event
      encoding rather than the trace enable bits.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e793d8c6
  5. 15 10月, 2012 1 次提交
  6. 19 8月, 2012 12 次提交
  7. 09 5月, 2012 1 次提交
  8. 29 3月, 2012 1 次提交
  9. 05 3月, 2012 1 次提交
  10. 28 7月, 2011 1 次提交
    • D
      sparc: Detect and handle UltraSPARC-T3 cpu types. · 4ba991d3
      David S. Miller 提交于
      The cpu compatible string we look for is "SPARC-T3".
      
      As far as memset/memcpy optimizations go, we treat this chip the same
      as Niagara-T2/T2+.  Use cache initializing stores for memset, and use
      perfetch, FPU block loads, cache initializing stores, and block stores
      for copies.
      
      We use the Niagara-T2 perf support, since T3 is a close relative in
      this regard.  Later we'll add support for the new events T3 can
      report, plus enable T3's new "sample" mode.
      
      For now I haven't added any new ELF hwcap flags.  We probably need
      to add a couple, for example:
      
      T2 and T3 both support the population count instruction in hardware.
      
      T3 supports VIS3 instructions, including support (finally) for
      partitioned shift.  One can also now move directly between float
      and integer registers.
      
      T3 supports instructions meant to help with Galois Field and other HPC
      calculations, such as XOR multiply.  Also there are "OP and negate"
      instructions, for example "fnmul" which is multiply-and-negate.
      
      T3 recognizes the transactional memory opcodes, however since
      transactional memory isn't supported: 1) 'commit' behaves as a NOP and
      2) 'chkpt' always branches 3) 'rdcps' returns all zeros and 4) 'wrcps'
      behaves as a NOP.
      
      So we'll need about 3 new elf capability flags in the end to represent
      all of these things.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4ba991d3
  11. 27 7月, 2011 1 次提交
  12. 01 7月, 2011 2 次提交
    • P
      perf, arch: Add generic NODE cache events · 89d6c0b5
      Peter Zijlstra 提交于
      Add a NODE level to the generic cache events which is used to measure
      local vs remote memory accesses. Like all other cache events, an
      ACCESS is HIT+MISS, if there is no way to distinguish between reads
      and writes do reads only etc..
      
      The below needs filling out for !x86 (which I filled out with
      unsupported events).
      
      I'm fairly sure ARM can leave it like that since it doesn't strike me as
      an architecture that even has NUMA support. SH might have something since
      it does appear to have some NUMA bits.
      
      Sparc64, PowerPC and MIPS certainly want a good look there since they
      clearly are NUMA capable.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: David Miller <davem@davemloft.net>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: David Daney <ddaney@caviumnetworks.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1303508226.4865.8.camel@laptopSigned-off-by: NIngo Molnar <mingo@elte.hu>
      89d6c0b5
    • P
      perf: Remove the nmi parameter from the swevent and overflow interface · a8b0ca17
      Peter Zijlstra 提交于
      The nmi parameter indicated if we could do wakeups from the current
      context, if not, we would set some state and self-IPI and let the
      resulting interrupt do the wakeup.
      
      For the various event classes:
      
        - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
          the PMI-tail (ARM etc.)
        - tracepoint: nmi=0; since tracepoint could be from NMI context.
        - software: nmi=[0,1]; some, like the schedule thing cannot
          perform wakeups, and hence need 0.
      
      As one can see, there is very little nmi=1 usage, and the down-side of
      not using it is that on some platforms some software events can have a
      jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).
      
      The up-side however is that we can remove the nmi parameter and save a
      bunch of conditionals in fast paths.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Michael Cree <mcree@orcon.net.nz>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      a8b0ca17
  13. 22 4月, 2011 1 次提交
  14. 31 3月, 2011 1 次提交
  15. 16 12月, 2010 1 次提交
    • P
      perf: Dynamic pmu types · 2e80a82a
      Peter Zijlstra 提交于
      Extend the perf_pmu_register() interface to allow for named and
      dynamic pmu types.
      
      Because we need to support the existing static types we cannot use
      dynamic types for everything, hence provide a type argument.
      
      If we want to enumerate the PMUs they need a name, provide one.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20101117222056.259707703@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2e80a82a
  16. 10 12月, 2010 1 次提交
  17. 26 11月, 2010 1 次提交
  18. 13 9月, 2010 1 次提交
  19. 10 9月, 2010 6 次提交
    • P
      perf: Remove the sysfs bits · 15ac9a39
      Peter Zijlstra 提交于
      Neither the overcommit nor the reservation sysfs parameter were
      actually working, remove them as they'll only get in the way.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      15ac9a39
    • P
      perf: Rework the PMU methods · a4eaf7f1
      Peter Zijlstra 提交于
      Replace pmu::{enable,disable,start,stop,unthrottle} with
      pmu::{add,del,start,stop}, all of which take a flags argument.
      
      The new interface extends the capability to stop a counter while
      keeping it scheduled on the PMU. We replace the throttled state with
      the generic stopped state.
      
      This also allows us to efficiently stop/start counters over certain
      code paths (like IRQ handlers).
      
      It also allows scheduling a counter without it starting, allowing for
      a generic frozen state (useful for rotating stopped counters).
      
      The stopped state is implemented in two different ways, depending on
      how the architecture implemented the throttled state:
      
       1) We disable the counter:
          a) the pmu has per-counter enable bits, we flip that
          b) we program a NOP event, preserving the counter state
      
       2) We store the counter state and ignore all read/overflow events
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a4eaf7f1
    • P
      perf: Per PMU disable · 33696fc0
      Peter Zijlstra 提交于
      Changes perf_disable() into perf_pmu_disable().
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      33696fc0
    • P
      perf: Reduce perf_disable() usage · 24cd7f54
      Peter Zijlstra 提交于
      Since the current perf_disable() usage is only an optimization,
      remove it for now. This eases the removal of the __weak
      hw_perf_enable() interface.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      24cd7f54
    • P
      perf: Register PMU implementations · b0a873eb
      Peter Zijlstra 提交于
      Simple registration interface for struct pmu, this provides the
      infrastructure for removing all the weak functions.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b0a873eb
    • P
      perf: Deconstify struct pmu · 51b0fe39
      Peter Zijlstra 提交于
      sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus <paulus@samba.org>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Michael Cree <mcree@orcon.net.nz>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      51b0fe39
  20. 19 8月, 2010 3 次提交
  21. 24 6月, 2010 1 次提交