  1. 11 Feb 2009, 1 commit
    • perf_counters: allow users to count user, kernel and/or hypervisor events · 0475f9ea
      Committed by Paul Mackerras
      Impact: new perf_counter feature
      
      This extends the perf_counter_hw_event struct with bits that specify
      that events in user, kernel and/or hypervisor mode should not be
      counted (i.e. should be excluded), and adds code to program the PMU
      mode selection bits accordingly on x86 and powerpc.
      
      For software counters, we don't currently have the infrastructure to
      distinguish which mode an event occurs in, so we currently fail the
      counter initialization if the setting of the hw_event.exclude_* bits
      would require us to distinguish.  Context switches and CPU migrations
      are currently considered to occur in kernel mode.
      
      On x86, this changes the previous policy that only root can count
      kernel events.  Now non-root users can count kernel events or exclude
      them.  Non-root users still can't use NMI events, though.  On x86 we
      don't appear to have any way to control whether hypervisor events are
      counted or not, so hw_event.exclude_hv is ignored.
      
      On powerpc, the selection of whether to count events in user, kernel
      and/or hypervisor mode is PMU-wide, not per-counter, so this adds a
      check that the hw_event.exclude_* settings are the same as other events
      on the PMU.  Counters being added to a group have to have the same
      settings as the other hardware counters in the group.  Counters and
      groups can only be enabled in hw_perf_group_sched_in or power_perf_enable
      if they have the same settings as any other counters already on the
      PMU.  If we are not running on a hypervisor, the exclude_hv setting
      is ignored (by forcing it to 0) since we can't ever get any
      hypervisor events.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
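
      To make the mode selection concrete, here is a minimal sketch, assuming a
      struct shaped like the perf_counter_hw_event extension described above.
      The field layout and helper name are illustrative, not the exact upstream
      definitions; the MSR bits used are the architectural USR (bit 16) and
      OS (bit 17) enables of the x86 event-select register:

        #include <stdint.h>

        /* Illustrative only: the real perf_counter_hw_event has many more
         * fields; this sketch shows just the new exclusion selectors. */
        struct hw_event_sketch {
                uint64_t config;                /* raw event code            */
                uint64_t exclude_user   : 1,    /* don't count user mode     */
                         exclude_kernel : 1,    /* don't count kernel mode   */
                         exclude_hv     : 1,    /* don't count hypervisor    */
                         __reserved     : 61;
        };

        /* "Excluding" a mode on x86 just means leaving the corresponding
         * enable bit clear; there is no hypervisor enable bit, which is why
         * exclude_hv is ignored on x86. */
        static uint64_t build_evtsel(const struct hw_event_sketch *ev)
        {
                uint64_t val = ev->config;

                if (!ev->exclude_user)
                        val |= 1ULL << 16;      /* USR: count user-mode events   */
                if (!ev->exclude_kernel)
                        val |= 1ULL << 17;      /* OS:  count kernel-mode events */
                return val;
        }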
  2. 09 Feb 2009, 2 commits
    • perf_counters: account NMI interrupts · d278c484
      Committed by Mike Galbraith
      I noticed that kerneltop interrupts were accounted as NMI, but not their
      perf counter origin.
      
      Account NMI performance counter interrupts.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      
       arch/x86/kernel/cpu/perf_counter.c |    2 +-
       1 file changed, 1 insertion(+), 1 deletion(-)
    • perf_counters: make software counters work as per-cpu counters · 23a185ca
      Committed by Paul Mackerras
      Impact: kernel crash fix
      
      Yanmin Zhang reported that using a PERF_COUNT_TASK_CLOCK software
      counter as a per-cpu counter would reliably crash the system, because
      it calls __task_delta_exec with a null pointer.  The page fault,
      context switch and cpu migration counters also won't function
      correctly as per-cpu counters since they reference the current task.
      
      This fixes the problem by redirecting the task_clock counter to the
      cpu_clock counter when used as a per-cpu counter, and by implementing
      per-cpu page fault, context switch and cpu migration counters.
      
      Along the way, this:
      
      - Initializes counter->ctx earlier, in perf_counter_alloc, so that
        sw_perf_counter_init can use it
      - Adds code to kernel/sched.c to count task migrations into each
        cpu, in rq->nr_migrations_in
      - Exports the per-cpu context switch and task migration counts
        via new functions added to kernel/sched.c
      - Makes sure that if sw_perf_counter_init fails, we don't try to
        initialize the counter as a hardware counter.  Since the user has
        passed a negative, non-raw event type, they clearly don't intend
        for it to be interpreted as a hardware event.
      Reported-by: "Zhang Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
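
      A rough sketch of that redirection, using the function and field names
      quoted in the message above; the ops structure and names such as
      perf_ops_cpu_clock are placeholders for this era's internals, not the
      exact upstream code:

        /* Sketch: counter->ctx is now set by perf_counter_alloc() before this
         * runs, so a per-cpu counter is recognisable by ctx->task == NULL. */
        static const struct hw_perf_counter_ops *
        sw_perf_counter_init(struct perf_counter *counter)
        {
                switch (counter->hw_event.type) {
                case PERF_COUNT_TASK_CLOCK:
                        /* A per-cpu task_clock counter has no task whose
                         * runtime it could charge (__task_delta_exec would be
                         * handed a NULL pointer), so use cpu_clock instead. */
                        if (!counter->ctx->task)
                                return &perf_ops_cpu_clock;
                        return &perf_ops_task_clock;
                default:
                        /* NULL means "not a software counter we support"; the
                         * caller now fails cleanly instead of falling back to
                         * hardware counter initialization. */
                        return NULL;
                }
        }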
  3. 05 Feb 2009, 2 commits
    • perfcounters: fix "perf counters kills oprofile" bug, v2 · 82aa9a18
      Committed by Ingo Molnar
      Impact: fix kernel crash
      
      Both oprofile and perfcounters register an NMI die handler, but only one
      can handle the NMI.  Conveniently, oprofile unregisters its notifier
      when not actively in use, so setting its notifier priority higher than
      perfcounter's allows oprofile to borrow the NMI for the duration of its
      run.  Tested/works both as module and built-in.
      
      While testing, I found that if kerneltop was generating NMIs at very
      high frequency, the kernel could panic when oprofile registered its
      handler.  This turned out to be because oprofile registers its handler
      before reset_value has been allocated, so if an NMI comes in while it's
      still setting up, kabOom.  Rather than try more invasive changes, I
      followed the lead of other places in op_model_ppro.c, and simply
      returned in that highly unlikely event.  (debug warnings attached)
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
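
      The "borrowing" is plain notifier-priority ordering; a hedged sketch of
      the arrangement (the handler name and the priority value are
      illustrative, not the exact oprofile code):

        #include <linux/notifier.h>
        #include <linux/kdebug.h>

        /* Registered when profiling starts, unregistered when it stops, so
         * perfcounters gets the PMU NMIs back as soon as oprofile is idle. */
        static int oprofile_nmi_notify(struct notifier_block *self,
                                       unsigned long val, void *data)
        {
                if (val != DIE_NMI)
                        return NOTIFY_DONE;
                /* ... handle the profiling interrupt ... */
                return NOTIFY_STOP;     /* consumed; perfcounters never sees it */
        }

        static struct notifier_block oprofile_nmi_nb = {
                .notifier_call = oprofile_nmi_notify,
                .priority      = 2,     /* higher than perfcounters' notifier */
        };

        /* register_die_notifier(&oprofile_nmi_nb);    called on profiling start */
        /* unregister_die_notifier(&oprofile_nmi_nb);  called on profiling stop  */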
    • perfcounters: fix "perf counters kill oprofile" bug · 5b75af0a
      Committed by Mike Galbraith
      With oprofile as a module, and unloaded by the profiling script, both
      oprofile and kerneltop work fine, unless you leave kerneltop running
      when you start profiling; then you may see badness.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 02 Feb 2009, 1 commit
  5. 29 Jan 2009, 1 commit
  6. 27 Jan 2009, 1 commit
  7. 23 Jan 2009, 12 commits
  8. 21 Jan 2009, 19 commits
  9. 20 Jan 2009, 1 commit
    • x86: optimise x86's do_page_fault (C entry point for the page fault path) · 92181f19
      Committed by Nick Piggin
      Impact: cleanup, restructure code to improve assembly
      
      gcc isn't _all_ that smart about spilling registers to stack or reusing
      stack slots, even with branch annotations. do_page_fault contained a lot
      of functionality, so split unlikely paths into their own functions, and
      mark them as noinline just to be sure. I consider this actually to be
      somewhat of a cleanup too: the main function now contains about half
      the number of lines so the normal path is easier to read, while the error
      cases are also nicely split away.
      
      Also, ensure the order of arguments to functions is always the same: regs,
      addr, error_code. This can reduce code size a tiny bit, and just looks neater
      too.
      
      And add a couple of branch annotations.
      
      Before:
        do_page_fault:
                subq    $360, %rsp      #,
      
      After:
        do_page_fault:
                subq    $56, %rsp       #,
      
      bloat-o-meter:
        add/remove: 8/0 grow/shrink: 0/1 up/down: 2222/-1680 (542)
        function                                     old     new   delta
        __bad_area_nosemaphore                         -     506    +506
        no_context                                     -     474    +474
        vmalloc_fault                                  -     424    +424
        spurious_fault                                 -     358    +358
        mm_fault_error                                 -     272    +272
        bad_area_access_error                          -      89     +89
        bad_area                                       -      89     +89
        bad_area_nosemaphore                           -      10     +10
        do_page_fault                               2464     784   -1680
      
      Yes, the total size increases by 542 bytes, due to the extra function calls.
      But these will very rarely be called (except for vmalloc_fault) in a normal
      workload. Importantly, do_page_fault is less than 1/3rd of its original size,
      and touches far less stack.
      
      Existing gotos and branch hints did move a lot of the infrequently used text
      out of the fastpath, but that's even further improved after this patch.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
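
      The shape of the restructuring, as an illustrative sketch: the helper
      names come from the bloat-o-meter output above, while the signatures,
      the kernel-address check and the bodies are simplified stand-ins rather
      than the actual fault.c code:

        /* Rarely taken paths move out of line so the fast path stays small
         * and cheap on stack. */
        static noinline void
        bad_area_nosemaphore(struct pt_regs *regs, unsigned long address,
                             unsigned long error_code)
        {
                /* out of line: signal delivery, no_context(), etc. */
        }

        static noinline int vmalloc_fault(unsigned long address)
        {
                /* out of line: sync the vmalloc mapping from init_mm */
                return 0;
        }

        void do_page_fault(struct pt_regs *regs, unsigned long error_code)
        {
                unsigned long address = read_cr2();

                /* Kernel-space faults are rare; handle them out of line
                 * (simplified check, the real code is more careful). */
                if (unlikely(address >= PAGE_OFFSET)) {
                        if (vmalloc_fault(address) >= 0)
                                return;
                        bad_area_nosemaphore(regs, address, error_code);
                        return;
                }

                /* The common user-space fault handling continues inline here;
                 * with the slow paths gone, gcc needs far less stack for the
                 * fast path (the subq $56 vs. $360 shown above). */
        }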