提交 · 78f13e9525ba777da25c4ddab89f28e9366a8b7c · openanolis / cloud-kernel

09 4月, 2009 1 次提交

perf_counter: allow for data addresses to be recorded · 78f13e95

由 Peter Zijlstra 提交于 4月 08, 2009

Paul suggested we allow for data addresses to be recorded along with
the traditional IPs as power can provide these.

For now, only the software pagefault events provide data addresses,
but in the future power might as well for some events.

x86 doesn't seem capable of providing this atm.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130409.394816925@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

78f13e95

08 4月, 2009 2 次提交

perf_counter: powerpc: set sample enable bit for marked instruction events · f708223d

由 Paul Mackerras 提交于 4月 08, 2009

Impact: enable access to hardware feature

POWER processors have the ability to "mark" a subset of the instructions
and provide more detailed information on what happens to the marked
instructions as they flow through the pipeline.  This marking is
enabled by the "sample enable" bit in MMCRA, and there are
synchronization requirements around setting and clearing the bit.

This adds logic to the processor-specific back-ends so that they know
which events relate to marked instructions and set the sampling enable
bit if any event that we want to put on the PMU is a marked instruction
event.  It also adds logic to the generic powerpc code to do the
necessary synchronization if that bit is set.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18908.31930.1024.228867@cargo.ozlabs.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f708223d

perf_counter: fix powerpc build · dc66270b

由 Paul Mackerras 提交于 4月 08, 2009

Commit 4af4998b ("perf_counter: rework context time") changed struct
perf_counter_context to have a 'time' field instead of a 'time_now'
field, but neglected to fix the place in the powerpc perf_counter.c
where the time_now field was accessed.  This fixes it.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18908.31922.411398.147810@cargo.ozlabs.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

dc66270b

07 4月, 2009 19 次提交

dma-mapping: replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32) · 284901a9

由 Yang Hongyang 提交于 4月 06, 2009

Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)

Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

284901a9

dma-mapping: replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64) · 6a35528a

由 Yang Hongyang 提交于 4月 06, 2009

Replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)

Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6a35528a

powerpc/85xx: i2c-mpc: use new I2C bindings for the Socates board · c724d67d

由 Wolfgang Grandegger 提交于 4月 07, 2009

Preserve I2C clock settings for the Socrates MPC8544 board.
Signed-off-by: NWolfgang Grandegger <wg@grandegger.com>
Signed-off-by: NBen Dooks <ben-linux@fluff.org>

c724d67d

perf_counter: theres more to overflow than writing events · f6c7d5fe

由 Peter Zijlstra 提交于 4月 06, 2009

Prepare for more generic overflow handling. The new perf_counter_overflow()
method will handle the generic bits of the counter overflow, and can return
a !0 return value, in which case the counter should be (soft) disabled, so
that it won't count until it's properly disabled.

XXX: do powerpc and swcounter
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094517.812109629@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f6c7d5fe

powerpc: Fix oops when loading modules · 11b55da7

由 Paul Mackerras 提交于 4月 06, 2009

This fixes a problem reported by Sean MacLennan where loading any
module would cause an oops.  We weren't marking the pages containing
the module text as having hardware execute permission, due to a bug
introduced in commit 8d1cf34e ("powerpc/mm: Tweak PTE bit combination
definitions"), hence trying to execute the module text caused an
exception on processors that support hardware execute permission.

This adds _PAGE_HWEXEC to the definitions of PAGE_KERNEL_X and
PAGE_KERNEL_ROX to fix this problem.
Reported-by: NSean MacLennan <smaclennan@pikatech.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

11b55da7

powerpc: Wire up preadv and pwritev · 1a917bb5

由 Stephen Rothwell 提交于 4月 06, 2009

[paulus@samba.org: changed to use syscall numbers 320 and 321 since
 perf_counters is currently using 319.]
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

1a917bb5

powerpc/ftrace: Fix printf format warning · 7ddb7ad1

由 Michael Ellerman 提交于 4月 06, 2009

'tramp' is an unsigned long, so print it with %lx.

Fixes the following build warning:
arch/powerpc/kernel/ftrace.c:291: error: format ‘%x’ expects type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

7ddb7ad1

powerpc/ftrace: Fix #if that should be #ifdef · f4952f6c

由 Michael Ellerman 提交于 4月 06, 2009

Commit bb725340 ("powerpc64,
ftrace: save toc only on modules for function graph"), added an
#if CONFIG_PPC64.  This changes it to #ifdef.

Fixes the following warning on 32-bit builds:
 arch/powerpc/kernel/ftrace.c:562:5: error: "CONFIG_PPC64" is not defined
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

f4952f6c

powerpc: Fix ptrace compat wrapper for FPU register access · bc826666

由 Michael Neuling 提交于 4月 05, 2009

The ptrace compat wrapper mishandles access to the fpu registers.  The
PTRACE_PEEKUSR and PTRACE_POKEUSR requests miscalculate the index into
the fpr array due to the broken FPINDEX macro.  The
PPC_PTRACE_PEEKUSR_3264 request needs to use the same formula that the
native ptrace interface uses when operating on the register number (as
opposed to the 4-byte offset).  The PPC_PTRACE_POKEUSR_3264 request
didn't take TS_FPRWIDTH into account.
Signed-off-by: NAndreas Schwab <schwab@linux-m68k.org>
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

bc826666

powerpc: Print information about mapping hw irqs to virtual irqs · c7d07fdd

由 Michael Ellerman 提交于 4月 05, 2009

The irq remapping layer seems to cause some confusion when people
see a different irq number in /proc/interrupts vs the one they
request in their driver or DTS.

So have the irq remapping layer print out a message when we map an
irq. The message is only printed the first time the irq is mapped,
and it's KERN_DEBUG so most people won't see it.
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Acked-by: NGrant Likely <grant.likely@secretlab.ca>
Acked-by: NWolfram Sang <w.sang@pengutronix.de>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

c7d07fdd

powerpc: Correct dependency of KEXEC · cb93d568

由 Geert Uytterhoeven 提交于 4月 02, 2009

commit 28794d34 ("powerpc/kconfig: Kill
PPC_MULTIPLATFORM") broke KEXEC, by making it dependent on BOOK3S, while it
should be PPC_BOOK3S.
Signed-off-by: NGeert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

cb93d568

powerpc: Disable VSX or current process in giveup_fpu/altivec · 7e875e9d

由 Michael Neuling 提交于 4月 01, 2009

When we call giveup_fpu, we need to need to turn off VSX for the
current process.  If we don't, on return to userspace it may execute a
VSX instruction before the next FP instruction, and not have its
register state refreshed correctly from the thread_struct.  Ditto for
altivec.

This caused a bug where an unaligned lfs or stfs results in
fix_alignment calling giveup_fpu so it can use the FPRs (in order to
do a single <-> double conversion), and then returning to userspace
with FP off but VSX on.  Then if a VSX instruction is executed, before
another FP instruction, it will proceed without another exception and
hence have the incorrect register state for VSX registers 0-31.

   lfs unaligned   <- alignment exception turns FP off but leaves VSX on

   VSX instruction <- no exception since VSX on, hence we get the
                      wrong VSX register values for VSX registers 0-31,
                      which overlap the FPRs.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

7e875e9d

powerpc/pseries: Enable relay in pseries_defconfig · 4c6cf428

由 Anton Blanchard 提交于 3月 31, 2009

Enable relay in pseries config, ppc64_defconfig had it enabled but pseries
did not.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

4c6cf428

powerpc/pseries: Fix ibm,client-architecture comment · 856cc2f0

由 Anton Blanchard 提交于 3月 31, 2009

We specify a 64MB RMO, but the comment says 128MB.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

856cc2f0

powerpc/pseries: Scan for all events in rtasd · 4a9f9506

由 Anton Blanchard 提交于 3月 31, 2009

Instead of checking for known events, pass in all 1s so we handle future
event types.  We were currently missing the IO event type.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

4a9f9506

powerpc/pseries: Add dispatch dispersion statistics · 0559f0a7

由 Anton Blanchard 提交于 3月 31, 2009

PHYP tells us how often a shared processor dispatch changed physical cpus.
This can highlight performance problems caused by the hypervisor.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

0559f0a7

powerpc: Clean up some prom printouts · 1f8737aa

由 Anton Blanchard 提交于 3月 31, 2009

Make all messages consistent, some have spaces before the "...", some do not.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

1f8737aa

powerpc: Print progress of ibm,client-architecture method · 4da727ae

由 Anton Blanchard 提交于 3月 31, 2009

The ibm,client-architecture method will often cause a reconfiguration reboot.
When this happens the last thing we see is:

	Hypertas detected, assuming LPAR !

Which doesn't explain what just happened.  Wrap the ibm,client-architecture
so it's clear what is going on:

	Calling ibm,client-architecture... done

In order to maintain the law of conservation of screen real estate, downgrade
two other messages to debug.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

4da727ae

powerpc: Remove duplicated #include's · 85701e6a

由 Huang Weiyi 提交于 3月 31, 2009

Remove duplicated #include's in
  - arch/powerpc/include/asm/ps3fb.h
  - arch/powerpc/kernel/setup-common.c
Signed-off-by: NHuang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

85701e6a

06 4月, 2009 18 次提交

powerpc/85xx: TQM8548: Update DTS file for multi-chip support · 7995c7e9

由 Wolfgang Grandegger 提交于 3月 30, 2009

This patch adds multi-chip support for the Micron MT29F8G08FAB NAND
flash memory on the TQM8548 modules.
Signed-off-by: NWolfgang Grandegger <wg@grandegger.com>
Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>

7995c7e9

[MTD] [NAND] FSL-UPM: add multi chip support · b6e0e8c0

由 Wolfgang Grandegger 提交于 3月 30, 2009

This patch adds support for multi-chip NAND devices to the FSL-UPM
driver. This requires support for multiple GPIOs for the RNB pins.
The NAND chips are selected through address lines defined by the
FDT property "fsl,upm-addr-line-cs-offsets".
Signed-off-by: NWolfgang Grandegger <wg@grandegger.com>
Acked-by: NAnton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>

b6e0e8c0

powerpc/pq2fads: Update device tree for use with device-tree-aware u-boot. · 65cc0fa3

由 Scott Wood 提交于 4月 03, 2009

Add aliases, and correct CS0 offset to match how u-boot
programs it (this was not a problem with cuImage because
the wrapper would reprogram the localbus to match the device
tree).
Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

65cc0fa3

powerpc: Add support for CoreInt delivery of interrupts on MPIC · d91e4ea7

由 Kumar Gala 提交于 1月 07, 2009

CoreInt provides a mechansim to deliver the IRQ vector directly
into the core on an interrupt (via the SPR EPR) rather than having
to go IACK on the PIC. This is suppose to provide an improvment
in interrupt latency by reducing the time to get the IRQ vector.
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

d91e4ea7

perf_counter: make it possible for hw_perf_counter_init to return error codes · d5d2bc0d

由 Paul Mackerras 提交于 3月 30, 2009

Impact: better error reporting

At present, if hw_perf_counter_init encounters an error, all it can do
is return NULL, which causes sys_perf_counter_open to return an EINVAL
error to userspace.  This isn't very informative for userspace; it means
that userspace can't tell the difference between "sorry, oprofile is
already using the PMU" and "we don't support this CPU" and "this CPU
doesn't support the requested generic hardware event".

This commit uses the PTR_ERR/ERR_PTR/IS_ERR set of macros to let
hw_perf_counter_init return an error code on error rather than just NULL
if it wishes.  If it does so, that error code will be returned from
sys_perf_counter_open to userspace.  If it returns NULL, an EINVAL
error will be returned to userspace, as before.

This also adapts the powerpc hw_perf_counter_init to make use of this
to return ENXIO, EINVAL, EBUSY, or EOPNOTSUPP as appropriate.  It would
be good to add extra error numbers in future to allow userspace to
distinguish the various errors that are currently reported as EINVAL,
i.e. irq_period < 0, too many events in a group, conflict between
exclude_* settings in a group, and PMU resource conflict in a group.

[ v2: fix a bug pointed out by Corey Ashford where error returns from
      hw_perf_counter_init were not handled correctly in the case of
      raw hardware events.]
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Orig-LKML-Reference: <20090330171023.682428180@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d5d2bc0d

perf_counter: powerpc: only reserve PMU hardware when we need it · 7595d63b

由 Paul Mackerras 提交于 3月 30, 2009

Impact: cooperate with oprofile

At present, on PowerPC, if you have perf_counters compiled in, oprofile
doesn't work.  There is code to allow the PMU to be shared between
competing subsystems, such as perf_counters and oprofile, but currently
the perf_counter subsystem reserves the PMU for itself at boot time,
and never releases it.

This makes perf_counter play nicely with oprofile.  Now we keep a count
of how many perf_counter instances are counting hardware events, and
reserve the PMU when that count becomes non-zero, and release the PMU
when that count becomes zero.  This means that it is possible to have
perf_counters compiled in and still use oprofile, as long as there are
no hardware perf_counters active.  This also means that if oprofile is
active, sys_perf_counter_open will fail if the hw_event specifies a
hardware event.

To avoid races with other tasks creating and destroying perf_counters,
we use a mutex.  We use atomic_inc_not_zero and atomic_add_unless to
avoid having to take the mutex unless there is a possibility of the
count going between 0 and 1.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Orig-LKML-Reference: <20090330171023.627912475@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7595d63b

perf_counter: unify and fix delayed counter wakeup · 925d519a

由 Peter Zijlstra 提交于 3月 30, 2009

While going over the wakeup code I noticed delayed wakeups only work
for hardware counters but basically all software counters rely on
them.

This patch unifies and generalizes the delayed wakeup to fix this
issue.

Since we're dealing with NMI context bits here, use a cmpxchg() based
single link list implementation to track counters that have pending
wakeups.

[ This should really be generic code for delayed wakeups, but since we
  cannot use cmpxchg()/xchg() in generic code, I've let it live in the
  perf_counter code. -- Eric Dumazet could use it to aggregate the
  network wakeups. ]

Furthermore, the x86 method of using TIF flags was flawed in that its
quite possible to end up setting the bit on the idle task, loosing the
wakeup.

The powerpc method uses per-cpu storage and does appear to be
sufficient.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NPaul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090330171023.153932974@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

925d519a

perf_counter: record time running and time enabled for each counter · 53cfbf59

由 Paul Mackerras 提交于 3月 25, 2009

Impact: new functionality

Currently, if there are more counters enabled than can fit on the CPU,
the kernel will multiplex the counters on to the hardware using
round-robin scheduling.  That isn't too bad for sampling counters, but
for counting counters it means that the value read from a counter
represents some unknown fraction of the true count of events that
occurred while the counter was enabled.

This remedies the situation by keeping track of how long each counter
is enabled for, and how long it is actually on the cpu and counting
events.  These times are recorded in nanoseconds using the task clock
for per-task counters and the cpu clock for per-cpu counters.

These values can be supplied to userspace on a read from the counter.
Userspace requests that they be supplied after the counter value by
setting the PERF_FORMAT_TOTAL_TIME_ENABLED and/or
PERF_FORMAT_TOTAL_TIME_RUNNING bits in the hw_event.read_format field
when creating the counter.  (There is no way to change the read format
after the counter is created, though it would be possible to add some
way to do that.)

Using this information it is possible for userspace to scale the count
it reads from the counter to get an estimate of the true count:

true_count_estimate = count * total_time_enabled / total_time_running

This also lets userspace detect the situation where the counter never
got to go on the cpu: total_time_running == 0.

This functionality has been requested by the PAPI developers, and will
be generally needed for interpreting the count values from counting
counters correctly.

In the implementation, this keeps 5 time values (in nanoseconds) for
each counter: total_time_enabled and total_time_running are used when
the counter is in state OFF or ERROR and for reporting back to
userspace.  When the counter is in state INACTIVE or ACTIVE, it is the
tstamp_enabled, tstamp_running and tstamp_stopped values that are
relevant, and total_time_enabled and total_time_running are determined
from them.  (tstamp_stopped is only used in INACTIVE state.)  The
reason for doing it like this is that it means that only counters
being enabled or disabled at sched-in and sched-out time need to be
updated.  There are no new loops that iterate over all counters to
update total_time_enabled or total_time_running.

This also keeps separate child_total_time_running and
child_total_time_enabled fields that get added in when reporting the
totals to userspace.  They are separate fields so that they can be
atomic.  We don't want to use atomics for total_time_running,
total_time_enabled etc., because then we would have to use atomic
sequences to update them, which are slower than regular arithmetic and
memory accesses.

It is possible to measure total_time_running by adding a task_clock
counter to each group of counters, and total_time_enabled can be
measured approximately with a top-level task_clock counter (though
inaccuracies will creep in if you need to disable and enable groups
since it is not possible in general to disable/enable the top-level
task_clock counter simultaneously with another group).  However, that
adds extra overhead - I measured around 15% increase in the context
switch latency reported by lat_ctx (from lmbench) when a task_clock
counter was added to each of 2 groups, and around 25% increase when a
task_clock counter was added to each of 4 groups.  (In both cases a
top-level task-clock counter was also added.)

In contrast, the code added in this commit gives better information
with no overhead that I could measure (in fact in some cases I
measured lower times with this code, but the differences were all less
than one standard deviation).

[ v2: address review comments by Andrew Morton. ]
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Orig-LKML-Reference: <18890.6578.728637.139402@cargo.ozlabs.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

53cfbf59

perf_counter: new output ABI - part 1 · 7b732a75

由 Peter Zijlstra 提交于 3月 23, 2009

Impact: Rework the perfcounter output ABI

use sys_read() only for instant data and provide mmap() output for all
async overflow data.

The first mmap() determines the size of the output buffer. The mmap()
size must be a PAGE_SIZE multiple of 1+pages, where pages must be a
power of 2 or 0. Further mmap()s of the same fd must have the same
size. Once all maps are gone, you can again mmap() with a new size.

In case of 0 extra pages there is no data output and the first page
only contains meta data.

When there are data pages, a poll() event will be generated for each
full page of data. Furthermore, the output is circular. This means
that although 1 page is a valid configuration, its useless, since
we'll start overwriting it the instant we report a full page.

Future work will focus on the output format (currently maintained)
where we'll likey want each entry denoted by a header which includes a
type and length.

Further future work will allow to splice() the fd, also containing the
async overflow data -- splice() would be mutually exclusive with
mmap() of the data.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.470536358@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7b732a75

perf_counter: add an mmap method to allow userspace to read hardware counters · 37d81828

由 Paul Mackerras 提交于 3月 23, 2009

Impact: new feature giving performance improvement

This adds the ability for userspace to do an mmap on a hardware counter
fd and get access to a read-only page that contains the information
needed to translate a hardware counter value to the full 64-bit
counter value that would be returned by a read on the fd.  This is
useful on architectures that allow user programs to read the hardware
counters, such as PowerPC.

The mmap will only succeed if the counter is a hardware counter
monitoring the current process.

On my quad 2.5GHz PowerPC 970MP machine, userspace can read a counter
and translate it to the full 64-bit value in about 30ns using the
mmapped page, compared to about 830ns for the read syscall on the
counter, so this does give a significant performance improvement.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Orig-LKML-Reference: <20090323172417.297057964@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

37d81828

perf_counter: remove the event config bitfields · f4a2deb4

由 Peter Zijlstra 提交于 3月 23, 2009

Since the bitfields turned into a bit of a mess, remove them and rely on
good old masks.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.059499915@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f4a2deb4

perf_counter: fix type/event_id layout on big-endian systems · 9aaa131a

由 Paul Mackerras 提交于 3月 21, 2009

Impact: build fix for powerpc

Commit db3a944aca35ae61 ("perf_counter: revamp syscall input ABI")
expanded the hw_event.type field into a union of structs containing
bitfields.  In particular it introduced a type field and a raw_type
field, with the intention that the 1-bit raw_type field should
overlay the most-significant bit of the 8-bit type field, and in fact
perf_counter_alloc() now assumes that (or at least, assumes that
raw_type doesn't overlay any of the bits that are 1 in the values of
PERF_TYPE_{HARDWARE,SOFTWARE,TRACEPOINT}).

Unfortunately this is not true on big-endian systems such as PowerPC,
where bitfields are laid out from left to right, i.e. from most
significant bit to least significant.  This means that setting
hw_event.type = PERF_TYPE_SOFTWARE will set hw_event.raw_type to 1.

This fixes it by making the layout depend on whether or not
__BIG_ENDIAN_BITFIELD is defined.  It's a bit ugly, but that's what
we get for using bitfields in a user/kernel ABI.

Also, that commit didn't fix up some places in arch/powerpc/kernel/
perf_counter.c where hw_event.raw and hw_event.event_id were used.
This fixes them too.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

9aaa131a

perf_counter: powerpc: clean up perc_counter_interrupt · db4fb5ac

由 Paul Mackerras 提交于 3月 19, 2009

Impact: cleanup

This updates the powerpc perf_counter_interrupt following on from the
"perf_counter: unify irq output code" patch.  Since we now use the
generic perf_counter_output code, which sets the perf_counter_pending
flag directly, we no longer need the need_wakeup variable.

This removes need_wakeup and makes perf_counter_interrupt use
get_perf_counter_pending() instead.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194234.024464535@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

db4fb5ac

perf_counter: unify irq output code · 0322cd6e

由 Peter Zijlstra 提交于 3月 19, 2009

Impact: cleanup

Having 3 slightly different copies of the same code around does nobody
any good. First step in revamping the output format.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.929962222@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0322cd6e

perf_counter: revamp syscall input ABI · b8e83514

由 Peter Zijlstra 提交于 3月 19, 2009

Impact: modify ABI

The hardware/software classification in hw_event->type became a little
strained due to the addition of tracepoint tracing.

Instead split up the field and provide a type field to explicitly specify
the counter type, while using the event_id field to specify which event to
use.

Raw counters still work as before, only the raw config now goes into
raw_event.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.836807573@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b8e83514

perf_counter: abstract wakeup flag setting in core to fix powerpc build · b6c5a71d

由 Paul Mackerras 提交于 3月 16, 2009

Impact: build fix for powerpc

Commit bd753921015e7905 ("perf_counter: software counter event
infrastructure") introduced a use of TIF_PERF_COUNTERS into the core
perfcounter code.  This breaks the build on powerpc because we use
a flag in a per-cpu area to signal wakeups on powerpc rather than
a thread_info flag, because the thread_info flags have to be
manipulated with atomic operations and are thus slower than per-cpu
flags.

This fixes the by changing the core to use an abstracted
set_perf_counter_pending() function, which is defined on x86 to set
the TIF_PERF_COUNTERS flag and on powerpc to set the per-cpu flag
(paca->perf_counter_pending).  It changes the previous powerpc
definition of set_perf_counter_pending to not take an argument and
adds a clear_perf_counter_pending, so as to simplify the definition
on x86.

On x86, set_perf_counter_pending() is defined as a macro.  Defining
it as a static inline in arch/x86/include/asm/perf_counters.h causes
compile failures because <asm/perf_counters.h> gets included early in
<linux/sched.h>, and the definitions of set_tsk_thread_flag etc. are
therefore not available in <asm/perf_counters.h>.  (On powerpc this
problem is avoided by defining set_perf_counter_pending etc. in
<asm/hw_irq.h>.)
Signed-off-by: NPaul Mackerras <paulus@samba.org>

b6c5a71d

perf_counter: provide major/minor page fault software events · ac17dc8e

由 Peter Zijlstra 提交于 3月 13, 2009

Provide separate sw counters for major and minor page faults.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ac17dc8e

perf_counter: provide pagefault software events · 7dd1fcc2

由 Peter Zijlstra 提交于 3月 13, 2009

We use the generic software counter infrastructure to provide
page fault events.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

7dd1fcc2

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功