  1. 11 Feb 2009, 1 commit
    • perf_counters: allow users to count user, kernel and/or hypervisor events · 0475f9ea
      Committed by Paul Mackerras
      Impact: new perf_counter feature
      
      This extends the perf_counter_hw_event struct with bits that specify
      that events in user, kernel and/or hypervisor mode should not be
      counted (i.e. should be excluded), and adds code to program the PMU
      mode selection bits accordingly on x86 and powerpc.
      
      For software counters, we don't currently have the infrastructure to
      distinguish which mode an event occurs in, so we currently fail the
      counter initialization if the setting of the hw_event.exclude_* bits
      would require us to distinguish.  Context switches and CPU migrations
      are currently considered to occur in kernel mode.
      
      On x86, this changes the previous policy that only root can count
      kernel events.  Now non-root users can count kernel events or exclude
      them.  Non-root users still can't use NMI events, though.  On x86 we
      don't appear to have any way to control whether hypervisor events are
      counted or not, so hw_event.exclude_hv is ignored.
      
      On powerpc, the selection of whether to count events in user, kernel
      and/or hypervisor mode is PMU-wide, not per-counter, so this adds a
      check that the hw_event.exclude_* settings are the same as other events
      on the PMU.  Counters being added to a group have to have the same
      settings as the other hardware counters in the group.  Counters and
      groups can only be enabled in hw_perf_group_sched_in or power_perf_enable
      if they have the same settings as any other counters already on the
      PMU.  If we are not running on a hypervisor, the exclude_hv setting
      is ignored (by forcing it to 0) since we can't ever get any
      hypervisor events.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
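
      To make the mode selection concrete, here is a minimal sketch, assuming a
      struct shaped like the perf_counter_hw_event extension described above.
      The field layout and helper name are illustrative, not the exact upstream
      definitions; the MSR bits used are the architectural USR (bit 16) and
      OS (bit 17) enables of the x86 event-select register:

        #include <stdint.h>

        /* Illustrative only: the real perf_counter_hw_event has many more
         * fields; this sketch shows just the new exclusion selectors. */
        struct hw_event_sketch {
                uint64_t config;                /* raw event code            */
                uint64_t exclude_user   : 1,    /* don't count user mode     */
                         exclude_kernel : 1,    /* don't count kernel mode   */
                         exclude_hv     : 1,    /* don't count hypervisor    */
                         __reserved     : 61;
        };

        /* "Excluding" a mode on x86 just means leaving the corresponding
         * enable bit clear; there is no hypervisor enable bit, which is why
         * exclude_hv is ignored on x86. */
        static uint64_t build_evtsel(const struct hw_event_sketch *ev)
        {
                uint64_t val = ev->config;

                if (!ev->exclude_user)
                        val |= 1ULL << 16;      /* USR: count user-mode events   */
                if (!ev->exclude_kernel)
                        val |= 1ULL << 17;      /* OS:  count kernel-mode events */
                return val;
        }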
  2. 09 Feb 2009, 2 commits
    • perf_counters: account NMI interrupts · d278c484
      Committed by Mike Galbraith
      I noticed that kerneltop interrupts were accounted as NMI, but not their
      perf counter origin.
      
      Account NMI performance counter interrupts.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      
       arch/x86/kernel/cpu/perf_counter.c |    2 +-
       1 file changed, 1 insertion(+), 1 deletion(-)
    • perf_counters: make software counters work as per-cpu counters · 23a185ca
      Committed by Paul Mackerras
      Impact: kernel crash fix
      
      Yanmin Zhang reported that using a PERF_COUNT_TASK_CLOCK software
      counter as a per-cpu counter would reliably crash the system, because
      it calls __task_delta_exec with a null pointer.  The page fault,
      context switch and cpu migration counters also won't function
      correctly as per-cpu counters since they reference the current task.
      
      This fixes the problem by redirecting the task_clock counter to the
      cpu_clock counter when used as a per-cpu counter, and by implementing
      per-cpu page fault, context switch and cpu migration counters.
      
      Along the way, this:
      
      - Initializes counter->ctx earlier, in perf_counter_alloc, so that
        sw_perf_counter_init can use it
      - Adds code to kernel/sched.c to count task migrations into each
        cpu, in rq->nr_migrations_in
      - Exports the per-cpu context switch and task migration counts
        via new functions added to kernel/sched.c
      - Makes sure that if sw_perf_counter_init fails, we don't try to
        initialize the counter as a hardware counter.  Since the user has
        passed a negative, non-raw event type, they clearly don't intend
        for it to be interpreted as a hardware event.
      Reported-by: "Zhang Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
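
      A rough sketch of that redirection, using the function and field names
      quoted in the message above; the ops structure and names such as
      perf_ops_cpu_clock are placeholders for this era's internals, not the
      exact upstream code:

        /* Sketch: counter->ctx is now set by perf_counter_alloc() before this
         * runs, so a per-cpu counter is recognisable by ctx->task == NULL. */
        static const struct hw_perf_counter_ops *
        sw_perf_counter_init(struct perf_counter *counter)
        {
                switch (counter->hw_event.type) {
                case PERF_COUNT_TASK_CLOCK:
                        /* A per-cpu task_clock counter has no task whose
                         * runtime it could charge (__task_delta_exec would be
                         * handed a NULL pointer), so use cpu_clock instead. */
                        if (!counter->ctx->task)
                                return &perf_ops_cpu_clock;
                        return &perf_ops_task_clock;
                default:
                        /* NULL means "not a software counter we support"; the
                         * caller now fails cleanly instead of falling back to
                         * hardware counter initialization. */
                        return NULL;
                }
        }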
  3. 05 Feb 2009, 2 commits
    • perfcounters: fix "perf counters kills oprofile" bug, v2 · 82aa9a18
      Committed by Ingo Molnar
      Impact: fix kernel crash
      
      Both oprofile and perfcounters register an NMI die handler, but only one
      can handle the NMI.  Conveniently, oprofile unregisters its notifier
      when not actively in use, so setting its notifier priority higher than
      perfcounter's allows oprofile to borrow the NMI for the duration of its
      run.  Tested/works both as module and built-in.
      
      While testing, I found that if kerneltop was generating NMIs at very
      high frequency, the kernel could panic when oprofile registered its
      handler.  This turned out to be because oprofile registers its handler
      before reset_value has been allocated, so if an NMI comes in while it's
      still setting up, kabOom.  Rather than try more invasive changes, I
      followed the lead of other places in op_model_ppro.c, and simply
      returned in that highly unlikely event.  (debug warnings attached)
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
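
      The "borrowing" is plain notifier-priority ordering; a hedged sketch of
      the arrangement (the handler name and the priority value are
      illustrative, not the exact oprofile code):

        #include <linux/notifier.h>
        #include <linux/kdebug.h>

        /* Registered when profiling starts, unregistered when it stops, so
         * perfcounters gets the PMU NMIs back as soon as oprofile is idle. */
        static int oprofile_nmi_notify(struct notifier_block *self,
                                       unsigned long val, void *data)
        {
                if (val != DIE_NMI)
                        return NOTIFY_DONE;
                /* ... handle the profiling interrupt ... */
                return NOTIFY_STOP;     /* consumed; perfcounters never sees it */
        }

        static struct notifier_block oprofile_nmi_nb = {
                .notifier_call = oprofile_nmi_notify,
                .priority      = 2,     /* higher than perfcounters' notifier */
        };

        /* register_die_notifier(&oprofile_nmi_nb);    called on profiling start */
        /* unregister_die_notifier(&oprofile_nmi_nb);  called on profiling stop  */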
    • perfcounters: fix "perf counters kill oprofile" bug · 5b75af0a
      Committed by Mike Galbraith
      With oprofile as a module, and unloaded by the profiling script, both
      oprofile and kerneltop work fine, unless you leave kerneltop running
      when you start profiling; then you may see badness.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 02 Feb 2009, 1 commit
  5. 29 Jan 2009, 1 commit
  6. 27 Jan 2009, 1 commit
  7. 23 Jan 2009, 12 commits
  8. 21 Jan 2009, 19 commits
  9. 20 Jan 2009, 1 commit
    • x86: optimise x86's do_page_fault (C entry point for the page fault path) · 92181f19
      Committed by Nick Piggin
      Impact: cleanup, restructure code to improve assembly
      
      gcc isn't _all_ that smart about spilling registers to stack or reusing
      stack slots, even with branch annotations. do_page_fault contained a lot
      of functionality, so split unlikely paths into their own functions, and
      mark them as noinline just to be sure. I consider this actually to be
      somewhat of a cleanup too: the main function now contains about half
      the number of lines so the normal path is easier to read, while the error
      cases are also nicely split away.
      
      Also, ensure the order of arguments to functions is always the same: regs,
      addr, error_code. This can reduce code size a tiny bit, and just looks neater
      too.
      
      And add a couple of branch annotations.
      
      Before:
        do_page_fault:
                subq    $360, %rsp      #,
      
      After:
        do_page_fault:
                subq    $56, %rsp       #,
      
      bloat-o-meter:
        add/remove: 8/0 grow/shrink: 0/1 up/down: 2222/-1680 (542)
        function                                     old     new   delta
        __bad_area_nosemaphore                         -     506    +506
        no_context                                     -     474    +474
        vmalloc_fault                                  -     424    +424
        spurious_fault                                 -     358    +358
        mm_fault_error                                 -     272    +272
        bad_area_access_error                          -      89     +89
        bad_area                                       -      89     +89
        bad_area_nosemaphore                           -      10     +10
        do_page_fault                               2464     784   -1680
      
      Yes, the total size increases by 542 bytes, due to the extra function calls.
      But these will very rarely be called (except for vmalloc_fault) in a normal
      workload. Importantly, do_page_fault is less than 1/3rd of its original size,
      and touches far less stack.
      
      Existing gotos and branch hints did move a lot of the infrequently used text
      out of the fastpath, but that's even further improved after this patch.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
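
      The shape of the restructuring, as an illustrative sketch: the helper
      names come from the bloat-o-meter output above, while the signatures,
      the kernel-address check and the bodies are simplified stand-ins rather
      than the actual fault.c code:

        /* Rarely taken paths move out of line so the fast path stays small
         * and cheap on stack. */
        static noinline void
        bad_area_nosemaphore(struct pt_regs *regs, unsigned long address,
                             unsigned long error_code)
        {
                /* out of line: signal delivery, no_context(), etc. */
        }

        static noinline int vmalloc_fault(unsigned long address)
        {
                /* out of line: sync the vmalloc mapping from init_mm */
                return 0;
        }

        void do_page_fault(struct pt_regs *regs, unsigned long error_code)
        {
                unsigned long address = read_cr2();

                /* Kernel-space faults are rare; handle them out of line
                 * (simplified check, the real code is more careful). */
                if (unlikely(address >= PAGE_OFFSET)) {
                        if (vmalloc_fault(address) >= 0)
                                return;
                        bad_area_nosemaphore(regs, address, error_code);
                        return;
                }

                /* The common user-space fault handling continues inline here;
                 * with the slow paths gone, gcc needs far less stack for the
                 * fast path (the subq $56 vs. $360 shown above). */
        }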