1. 26 5月, 2009 3 次提交
  2. 22 5月, 2009 1 次提交
    • P
      perf_counter: Dynamically allocate tasks' perf_counter_context struct · a63eaf34
      Paul Mackerras 提交于
      This replaces the struct perf_counter_context in the task_struct with
      a pointer to a dynamically allocated perf_counter_context struct.  The
      main reason for doing is this is to allow us to transfer a
      perf_counter_context from one task to another when we do lazy PMU
      switching in a later patch.
      
      This has a few side-benefits: the task_struct becomes a little smaller,
      we save some memory because only tasks that have perf_counters attached
      get a perf_counter_context allocated for them, and we can remove the
      inclusion of <linux/perf_counter.h> in sched.h, meaning that we don't
      end up recompiling nearly everything whenever perf_counter.h changes.
      
      The perf_counter_context structures are reference-counted and freed
      when the last reference is dropped.  A context can have references
      from its task and the counters on its task.  Counters can outlive the
      task so it is possible that a context will be freed well after its
      task has exited.
      
      Contexts are allocated on fork if the parent had a context, or
      otherwise the first time that a per-task counter is created on a task.
      In the latter case, we set the context pointer in the task struct
      locklessly using an atomic compare-and-exchange operation in case we
      raced with some other task in creating a context for the subject task.
      
      This also removes the task pointer from the perf_counter struct.  The
      task pointer was not used anywhere and would make it harder to move a
      context from one task to another.  Anything that needed to know which
      task a counter was attached to was already using counter->ctx->task.
      
      The __perf_counter_init_context function moves up in perf_counter.c
      so that it can be called from find_get_context, and now initializes
      the refcount, but is otherwise unchanged.
      
      We were potentially calling list_del_counter twice: once from
      __perf_counter_exit_task when the task exits and once from
      __perf_counter_remove_from_context when the counter's fd gets closed.
      This adds a check in list_del_counter so it doesn't do anything if
      the counter has already been removed from the lists.
      
      Since perf_counter_task_sched_in doesn't do anything if the task doesn't
      have a context, and leaves cpuctx->task_ctx = NULL, this adds code to
      __perf_install_in_context to set cpuctx->task_ctx if necessary, i.e. in
      the case where the current task adds the first counter to itself and
      thus creates a context for itself.
      
      This also adds similar code to __perf_counter_enable to handle a
      similar situation which can arise when the counters have been disabled
      using prctl; that also leaves cpuctx->task_ctx = NULL.
      
      [ Impact: refactor counter context management to prepare for new feature ]
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <18966.10075.781053.231153@cargo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a63eaf34
  3. 21 5月, 2009 1 次提交
    • I
      perf_counter: Fix context removal deadlock · 34adc806
      Ingo Molnar 提交于
      Disable the PMU globally before removing a counter from a
      context. This fixes the following lockup:
      
      [22081.741922] ------------[ cut here ]------------
      [22081.746668] WARNING: at arch/x86/kernel/cpu/perf_counter.c:803 intel_pmu_handle_irq+0x9b/0x24e()
      [22081.755624] Hardware name: X8DTN
      [22081.758903] perfcounters: irq loop stuck!
      [22081.762985] Modules linked in:
      [22081.766136] Pid: 11082, comm: perf Not tainted 2.6.30-rc6-tip #226
      [22081.772432] Call Trace:
      [22081.774940]  <NMI>  [<ffffffff81019aed>] ? intel_pmu_handle_irq+0x9b/0x24e
      [22081.781993]  [<ffffffff81019aed>] ? intel_pmu_handle_irq+0x9b/0x24e
      [22081.788368]  [<ffffffff8104505c>] ? warn_slowpath_common+0x77/0xa3
      [22081.794649]  [<ffffffff810450d3>] ? warn_slowpath_fmt+0x40/0x45
      [22081.800696]  [<ffffffff81019aed>] ? intel_pmu_handle_irq+0x9b/0x24e
      [22081.807080]  [<ffffffff814d1a72>] ? perf_counter_nmi_handler+0x3f/0x4a
      [22081.813751]  [<ffffffff814d2d09>] ? notifier_call_chain+0x58/0x86
      [22081.819951]  [<ffffffff8105b250>] ? notify_die+0x2d/0x32
      [22081.825392]  [<ffffffff814d1414>] ? do_nmi+0x8e/0x242
      [22081.830538]  [<ffffffff814d0f0a>] ? nmi+0x1a/0x20
      [22081.835342]  [<ffffffff8117e102>] ? selinux_file_free_security+0x0/0x1a
      [22081.842105]  [<ffffffff81018793>] ? x86_pmu_disable_counter+0x15/0x41
      [22081.848673]  <<EOE>>  [<ffffffff81018f3d>] ? x86_pmu_disable+0x86/0x103
      [22081.855512]  [<ffffffff8108fedd>] ? __perf_counter_remove_from_context+0x0/0xfe
      [22081.862926]  [<ffffffff8108fcbc>] ? counter_sched_out+0x30/0xce
      [22081.868909]  [<ffffffff8108ff36>] ? __perf_counter_remove_from_context+0x59/0xfe
      [22081.876382]  [<ffffffff8106808a>] ? smp_call_function_single+0x6c/0xe6
      [22081.882955]  [<ffffffff81091b96>] ? perf_release+0x86/0x14c
      [22081.888600]  [<ffffffff810c4c84>] ? __fput+0xe7/0x195
      [22081.893718]  [<ffffffff810c213e>] ? filp_close+0x5b/0x62
      [22081.899107]  [<ffffffff81046a70>] ? put_files_struct+0x64/0xc2
      [22081.905031]  [<ffffffff8104841a>] ? do_exit+0x1e2/0x6ef
      [22081.910360]  [<ffffffff814d0a60>] ? _spin_lock_irqsave+0x9/0xe
      [22081.916292]  [<ffffffff8104898e>] ? do_group_exit+0x67/0x93
      [22081.921953]  [<ffffffff810489cc>] ? sys_exit_group+0x12/0x16
      [22081.927759]  [<ffffffff8100baab>] ? system_call_fastpath+0x16/0x1b
      [22081.934076] ---[ end trace 3a3936ce3e1b4505 ]---
      
      And could potentially also fix the lockup reported by Marcelo Tosatti.
      
      Also, print more debug info in case of a detected lockup.
      
      [ Impact: fix lockup ]
      Reported-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      34adc806
  4. 18 5月, 2009 1 次提交
    • I
      perf_counter, x86: speed up the scheduling fast-path · b68f1d2e
      Ingo Molnar 提交于
      We have to set up the LVT entry only at counter init time, not at
      every switch-in time.
      
      There's friction between NMI and non-NMI use here - we'll probably
      remove the per counter configurability of it - but until then, dont
      slow down things ...
      
      [ Impact: micro-optimization ]
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b68f1d2e
  5. 17 5月, 2009 1 次提交
    • I
      perf_counter, x86: fix zero irq_period counters · d2517a49
      Ingo Molnar 提交于
      The quirk to irq_period unearthed an unrobustness we had in the
      hw_counter initialization sequence: we left irq_period at 0, which
      was then quirked up to 2 ... which then generated a _lot_ of
      interrupts during 'perf stat' runs, slowed them down and skewed
      the counter results in general.
      
      Initialize irq_period to the maximum instead.
      
      [ Impact: fix perf stat results ]
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d2517a49
  6. 15 5月, 2009 10 次提交
  7. 13 5月, 2009 1 次提交
  8. 11 5月, 2009 5 次提交
  9. 08 5月, 2009 5 次提交
  10. 06 5月, 2009 2 次提交
  11. 05 5月, 2009 3 次提交
    • A
      x86: show number of core_siblings instead of thread_siblings in /proc/cpuinfo · 35d11680
      Andreas Herrmann 提交于
      Commit 7ad728f9
      (cpumask: x86: convert cpu_sibling_map/cpu_core_map to cpumask_var_t)
      changed the output of /proc/cpuinfo for siblings:
      
      Example on an AMD Phenom:
      
        physical id   : 0
        siblings : 1
        core id	   : 3
        cpu cores  : 4
      
      Before that commit it was:
      
        physical id	: 0
        siblings : 4
        core id	   : 3
        cpu cores  : 4
      
      Instead of cpu_core_mask it now uses cpu_sibling_mask to count siblings.
      This is due to the following hunk of above commit:
      
      |  --- a/arch/x86/kernel/cpu/proc.c
      |  +++ b/arch/x86/kernel/cpu/proc.c
      |  @@ -14,7 +14,7 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinf
      |          if (c->x86_max_cores * smp_num_siblings > 1) {
      |                  seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
      |                  seq_printf(m, "siblings\t: %d\n",
      |  -                          cpus_weight(per_cpu(cpu_core_map, cpu)));
      |  +                          cpumask_weight(cpu_sibling_mask(cpu)));
      |                  seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
      |                  seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
      |                  seq_printf(m, "apicid\t\t: %d\n", c->apicid);
      
      This was a mistake, because the impact line shows that this side-effect
      was not anticipated:
      
         Impact: reduce per-cpu size for CONFIG_CPUMASK_OFFSTACK=y
      
      So revert the respective hunk to restore the old behavior.
      
      [ Impact: fix sibling-info regression in /proc/cpuinfo ]
      Signed-off-by: NAndreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      LKML-Reference: <20090504182859.GA29045@alberich.amd.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      35d11680
    • I
      perf_counter: fix fixed-purpose counter support on v2 Intel-PERFMON · 066d7dea
      Ingo Molnar 提交于
      Fixed-purpose counters stopped working in a simple 'perf stat ls' run:
      
         <not counted>  cache references
         <not counted>  cache misses
      
      Due to:
      
        ef7b3e09: perf_counter, x86: remove vendor check in fixed_mode_idx()
      
      Which made x86_pmu.num_counters_fixed matter: if it's nonzero, the
      fixed-purpose counters are utilized.
      
      But on v2 perfmon this field is not set (despite there being
      fixed-purpose PMCs). So add a quirk to set the number of fixed-purpose
      counters to at least three.
      
      [ Impact: add quirk for three fixed-purpose counters on certain Intel CPUs ]
      
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1241002046-8832-28-git-send-email-robert.richter@amd.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      066d7dea
    • P
      perf_counter: x86: fixup nmi_watchdog vs perf_counter boo-boo · ba77813a
      Peter Zijlstra 提交于
      Invert the atomic_inc_not_zero() test so that we will indeed detect the
      first activation.
      
      Also rename the global num_counters, since its easy to confuse with
      x86_pmu.num_counters.
      
      [ Impact: fix non-working perfcounters on AMD CPUs, cleanup ]
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1241455664.7620.4938.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ba77813a
  12. 04 5月, 2009 1 次提交
  13. 02 5月, 2009 1 次提交
  14. 01 5月, 2009 2 次提交
  15. 30 4月, 2009 2 次提交
    • J
      x86: gettimeofday() vDSO: fix segfault when tv == NULL · 2f65dd47
      John Wright 提交于
      According to the gettimeofday(2) manual:
      
             If either tv or tz is NULL, the corresponding structure is not
             set or returned.
      
      Since it is legal to give NULL as the tv argument, the code should make
      sure tv is not NULL before trying to dereference it.
      
      This issue manifests itself on x86_64 when vdso=0 is not on the kernel
      command-line and libc uses the vDSO for gettimeofday() (e.g. glibc >=
      2.7).  A simple reproducer:
      
        #include <stdio.h>
        #include <sys/time.h>
      
        int main(void)
        {
            struct timezone tz;
      
            gettimeofday(NULL, &tz);
      
            return 0;
        }
      
      See http://bugs.debian.org/466491 for more details.
      
      [ Impact: fix gettimeofday(NULL, &tz) segfault ]
      Signed-off-by: NJohn Wright <john.wright@hp.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: John Wright <john.wright@hp.com>
      LKML-Reference: <1241037121-14805-1-git-send-email-john.wright@hp.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2f65dd47
    • R
      perf_counter, x86: rename bitmasks to ->used_mask and ->active_mask · 43f6201a
      Robert Richter 提交于
      Standardize on explicitly mentioning '_mask' in fields that
      are not plain flags but masks. This avoids typos like:
      
             if (cpuc->used)
      
      (which could easily slip through review unnoticed), while if a
      typo looks like this:
      
             if (cpuc->used_mask)
      
      it might get noticed during review.
      
      [ Impact: cleanup ]
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1241016956-24648-1-git-send-email-robert.richter@amd.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      43f6201a
  16. 29 4月, 2009 1 次提交