- 03 November 2014, 3 commits
-
Committed by Hendrik Brueckner
The git commit c719f560 "perf: Fix and clean up initialization of pmu::event_idx" removed the PMU event index callback for all architectures but x86, so remove the initialization of the event index here as well.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
Fix the following warnings from the sparse code checker:
    arch/s390/kernel/signal.c:374:38: warning: cast removes address space of expression
    arch/s390/kernel/signal.c:374:65: warning: incorrect type in initializer (different address spaces)
    arch/s390/kernel/signal.c:374:65:    expected unsigned short [noderef] [usertype] <asn:1>*svc
    arch/s390/kernel/signal.c:374:65:    got void *
    arch/s390/kernel/compat_signal.c:437:38: warning: cast removes address space of expression
    arch/s390/kernel/compat_signal.c:437:65: warning: incorrect type in initializer (different address spaces)
    arch/s390/kernel/compat_signal.c:437:65:    expected unsigned short [noderef] [usertype] <asn:1>*svc
    arch/s390/kernel/compat_signal.c:437:65:    got void *
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
The kernel build for s390 fails with gcc 3.x compilers, so set the minimum required gcc version to 4.3. As the atomic builtins are available with all gcc 4.x compilers, use the __sync_val_compare_and_swap and __sync_bool_compare_and_swap functions to replace the complex macro and inline assembler magic in include/asm/cmpxchg.h. The compiler can handle it and generates better code with the builtins. While we are at it, use __sync_bool_compare_and_swap for the _raw_compare_and_swap function in the spinlock code as well.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
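A minimal, self-contained illustration of the approach: a cmpxchg pair built on the GCC __sync builtins named above. This is a userspace sketch of the idea, not the actual arch/s390/include/asm/cmpxchg.h code.

    #include <stdbool.h>
    #include <stdio.h>

    /* Return the previous value; the swap happened iff it equals 'old_val'. */
    static unsigned int cmpxchg_val(unsigned int *ptr, unsigned int old_val, unsigned int new_val)
    {
            return __sync_val_compare_and_swap(ptr, old_val, new_val);
    }

    /* Return true iff the swap was performed. */
    static bool cmpxchg_bool(unsigned int *ptr, unsigned int old_val, unsigned int new_val)
    {
            return __sync_bool_compare_and_swap(ptr, old_val, new_val);
    }

    int main(void)
    {
            unsigned int lock = 0;

            if (cmpxchg_bool(&lock, 0, 1))                 /* take the "lock" */
                    printf("acquired, lock=%u\n", lock);

            unsigned int prev = cmpxchg_val(&lock, 0, 1);  /* fails, lock is already 1 */
            printf("second attempt saw prev=%u\n", prev);
            return 0;
    }

On s390 the compiler expands both builtins to a single compare-and-swap instruction, which is why the hand-written inline assembler could be dropped.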
-
- 27 October 2014, 4 commits
-
Committed by Heiko Carstens
Use NOKPROBE_SYMBOL() instead of the __kprobes annotation.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens
If the function tracer is enabled, allow setting kprobes on the first instruction of a function (which is the function trace caller): if no kprobe is set, enabling and disabling function tracing for a function simply patches the first instruction. Either it is a nop (right now it is an unconditional branch, which skips the mcount block), or it is a branch to the ftrace_caller() function. If a kprobe is placed on a function tracer calling instruction, we encode whether we actually have a nop or a branch in the remaining bytes after the breakpoint instruction (an illegal opcode). This is possible since the instruction used for the nop and the branch is six bytes long, while the breakpoint is only two bytes. Therefore the first two bytes contain the illegal opcode and the last four bytes contain either "0" for nop or "1" for branch. The kprobes code will then execute/simulate the correct instruction. Instruction patching for kprobes and the function tracer is always done with stop_machine(), therefore we do not have any races where an instruction is patched concurrently on a different cpu. Besides that, the program check handler which executes the function trace caller instruction is never executed concurrently with any stop_machine() execution. This allows us to keep the full fault-based kprobes handling, which generates correct pt_regs contents automatically.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
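A compact userspace illustration of the byte layout described above. The opcode value, struct, and helper names are invented for the example; the real s390 kprobes/ftrace code patches live kernel text under stop_machine().

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define BREAKPOINT_OPC 0x0002u            /* placeholder, not the real illegal opcode */

    struct insn_slot { uint8_t bytes[6]; };   /* size of the 6-byte nop/branch instruction */

    /* Arm a probe: breakpoint in the first two bytes, bookkeeping in the last four. */
    static void arm_probe(struct insn_slot *slot, int was_branch)
    {
            uint16_t opc = BREAKPOINT_OPC;
            uint32_t tag = was_branch ? 1 : 0;   /* 0 = nop, 1 = branch to ftrace_caller() */

            memcpy(slot->bytes, &opc, 2);
            memcpy(slot->bytes + 2, &tag, 4);
    }

    /* The probe handler later reads the tag to know which instruction to simulate. */
    static const char *decode(const struct insn_slot *slot)
    {
            uint32_t tag;

            memcpy(&tag, slot->bytes + 2, 4);
            return tag ? "simulate branch to ftrace_caller()" : "simulate nop";
    }

    int main(void)
    {
            struct insn_slot slot;

            arm_probe(&slot, 1);
            puts(decode(&slot));
            return 0;
    }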
-
Committed by Heiko Carstens
The kernel-provided vdso functions do not get a stack frame from the calling function and therefore may not change the stack contents unless they allocate space on their own. This problem was exposed by git commit 070b7be6 "s390/vdso: replace stck with stcke", which writes 16 bytes instead of 8 bytes into the stack frame. These additional 8 bytes, however, were used by the caller (glibc) to save data, so that data was corrupted by the vdso code.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
The last high-frequency call site of the STCK instruction is do_account_vtime. Replace it with the faster STCKF instruction.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 17 October 2014, 2 commits
-
Committed by Jan Willeke
If kprobes is disabled, uprobes will not compile. Fix this by including the correct header files.
Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 10 October 2014, 1 commit
-
Committed by Geert Uytterhoeven
The different architectures used their own (and different) declarations:
    extern __visible const void __nosave_begin, __nosave_end;
    extern const void __nosave_begin, __nosave_end;
    extern long __nosave_begin, __nosave_end;
Consolidate them using the first variant in <asm/sections.h>.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 09 October 2014, 8 commits
-
Committed by Heiko Carstens
We can simply patch the mask field within the branch-relative-on-condition instruction at the beginning of the ftrace_graph_caller code block. This makes the logic even simpler, and we get rid of the displacement calculation.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens
The 31-bit and 64-bit code diverges more and more, and it is rather painful to keep both parts running. To make things simpler, just remove the 31-bit support, which nobody uses anyway.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Michael Holzheu
With this patch for kdump the s390 vector registers are stored into the prepared save areas in the old kernel and into the REGSET_VX_LOW and REGSET_VX_HIGH ELF notes for /proc/vmcore in the new kernel. The NT_S390_VXRS_LOW note contains the lower halves of the first 16 vector registers 0-15; the upper halves are stored in the floating point register ELF note. The NT_S390_VXRS_HIGH note contains the full vector registers 16-31. The kernel provides a save area for storing vector registers in case of machine checks. A pointer to this save area is stored in the CPU lowcore at offset 0x11b0. This save area is also used to save the registers for kdump. In case of a crashed kdump, those areas are used to extract the registers of the production system. The vector registers of remote CPUs are stored using the "store additional status at address" SIGP order. For the dump CPU the vector registers are stored with the VSTM instruction. With this patch zfcpdump also stores the vector registers.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
Add the instructions introduced with the vector extension to the in-kernel disassembler.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
The vector extension introduces 32 128-bit vector registers and a set of instructions to operate on the vector registers. The kernel can control the use of vector registers for the problem-state program with a bit in control register 0. Once enabled for a process, the kernel needs to retain the content of the vector registers on context switch. The signal frame is extended to include the vector registers. Two new register sets, NT_S390_VXRS_LOW and NT_S390_VXRS_HIGH, are added to the regset interface for the debugger and core dumps.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
Move the C functions and definitions related to the idle state handling to arch/s390/include/asm/idle.h and arch/s390/kernel/idle.c. The function s390_get_idle_time is renamed to arch_cpu_idle_time and vtime_stop_cpu to enabled_wait.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
Move the nohz_delay bit from the s390_idle data structure to the per-cpu flags. Clear the nohz delay flag in __cpu_disable and remove the cpu hotplug notifier that used to do this.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
The sysfs attributes /sys/devices/system/cpu/cpu0/idle_count and /sys/devices/system/cpu/cpu0/idle_time_us are reset to zero every time a CPU is set online. The idle and iowait fields in /proc/stat corresponding to idle_time_us are not reset. To make things consistent, do not reset the data for the sysfs attributes either.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 26 September 2014, 2 commits
-
Committed by Martin Schwidefsky
Fix the calculation that decides whether a 4-level kernel page table is required. Git commit c972cc60 "s390/vmalloc: have separate modules area" added the separate modules area, which reduces the size of the vmalloc area, but failed to take it into account in the 3 vs. 4 level page table decision.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
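A hedged sketch of the corrected decision only; the constant, names, and exact formula below are illustrative assumptions rather than the real arch/s390 setup code. The point is merely that the modules area now enters the calculation.

    #include <stdio.h>

    #define REGION3_LIMIT (1UL << 42)   /* range coverable by a 3-level table (assumption) */
    #define MODULES_LEN   (1UL << 31)   /* illustrative size of the separate modules area */

    /* Decide how many page table levels the kernel mapping needs. */
    static int pgtable_levels(unsigned long memory_end, unsigned long vmalloc_size)
    {
            /* the modules area sits on top of memory + vmalloc and must fit as well */
            unsigned long needed = memory_end + vmalloc_size + MODULES_LEN;

            return needed <= REGION3_LIMIT ? 3 : 4;
    }

    int main(void)
    {
            printf("%d levels\n", pgtable_levels(1UL << 38, 1UL << 37));
            return 0;
    }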
-
Committed by Martin Schwidefsky
The call to topology_init is too late for the set_sched_topology call. The initial scheduling domain structure has already been established with the default topology array. Use the smp_cpus_done() call to get the s390-specific topology array registered early enough.
Cc: stable@vger.kernel.org # v3.16+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 25 September 2014, 5 commits
-
Committed by Jan Willeke
Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Jan Willeke
This patch moves common functions from kprobes.c to probes.c. Thus it is possible for uprobes to use them without enabling kprobes.
Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Christian Borntraeger
The architecture suggests using address 0 as the parameter for stfl, to allow for future extensions. Using __LC_STFL_FAC_LIST (0x200) shows which address is used, but might not be future proof.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
Add an owner field to arch_rwlock_t to be able to pass the timeslice of a virtual CPU with diagnose 0x9c to the lock owner in case the rwlock is write-locked. The undirected yield that was done when the rwlock is to be acquired for writing but is currently read-locked is removed.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
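A sketch of the idea only: the lock remembers the writing CPU so that a contending CPU can yield its timeslice directly to the current writer (diagnose 0x9c under a hypervisor on s390). The layout, names, and stub helpers below are illustrative, not the real arch_rwlock_t code.

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct {
            int lock;    /* <0: write-locked, >0: reader count, 0: free */
            int owner;   /* CPU id of the current writer, -1 if none */
    } rwlock_sketch_t;

    /* Stand-ins for the real primitives. */
    static bool try_read_lock(rwlock_sketch_t *rw)
    {
            if (rw->lock < 0)
                    return false;
            rw->lock++;
            return true;
    }

    static void smp_yield_cpu(int cpu)
    {
            (void)cpu;   /* on s390 this would issue diagnose 0x9c for 'cpu' */
    }

    /* Slow path: yield to the writer instead of yielding to "anyone". */
    static void read_lock_slow(rwlock_sketch_t *rw)
    {
            while (!try_read_lock(rw)) {
                    int owner = rw->owner;

                    if (owner >= 0)
                            smp_yield_cpu(owner);
            }
    }

    int main(void)
    {
            rwlock_sketch_t rw = { .lock = 0, .owner = -1 };

            read_lock_slow(&rw);
            printf("readers: %d\n", rw.lock);
            return 0;
    }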
-
Committed by Ralf Hoppe
This device driver allows accessing an HMC drive CD/DVD-ROM. It can be used in LPAR and z/VM environments.
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ralf Hoppe <rhoppe@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 24 September 2014, 1 commit
-
Committed by Eric Paris
We have a function through which the arch can be queried, syscall_get_arch(). So rather than have every single piece of arch-specific code use and/or duplicate syscall_get_arch(), just have the audit code use the syscall_get_arch() code.
Based-on-patch-by: Richard Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-ia64@vger.kernel.org
Cc: microblaze-uclinux@itee.uq.edu.au
Cc: linux-mips@linux-mips.org
Cc: linux@lists.openrisc.net
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: linux-xtensa@linux-xtensa.org
Cc: x86@kernel.org
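A small userspace sketch of the consolidation: the audit entry hook asks syscall_get_arch() itself instead of every architecture passing its AUDIT_ARCH_* value in. The constant and the stub bodies are illustrative; only the call shape mirrors the change described above.

    #include <stdio.h>

    #define AUDIT_ARCH_S390X 0x80000016u   /* illustrative value, check <uapi/linux/audit.h> */

    /* Stand-in for the per-arch helper the audit code now calls directly. */
    static unsigned int syscall_get_arch(void)
    {
            return AUDIT_ARCH_S390X;
    }

    /* Before: callers passed 'arch' as an argument.
     * After:  the argument is gone and audit derives it internally. */
    static void audit_syscall_entry(int major, unsigned long a0)
    {
            unsigned int arch = syscall_get_arch();

            printf("audit: arch=0x%x syscall=%d arg0=%lu\n", arch, major, a0);
    }

    int main(void)
    {
            audit_syscall_entry(1, 42);
            return 0;
    }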
-
- 09 September 2014, 8 commits
-
Committed by Heiko Carstens
Reduce the number of instructions executed within the mcount block when function tracing is enabled. We achieve that by using a non-standard C function call ABI; since the called function is also written in assembler, this is not a problem. This also allows replacing the unconditional store at the beginning of the mcount block with a larl instruction, which does not touch memory. In theory we could also patch the first instruction of the mcount block to enable and disable function tracing, however this would break kprobes. That could be fixed by implementing the "kprobes_on_ftrace" feature; however, keeping the odd jprobes working seems not to be possible without a lot of code churn. Therefore keep the code simple and accept one wasted 1-cycle "larl" instruction per function prologue.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens
Even though it has a __used annotation, it is actually unused.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens
We have too many combinations for function tracing. Let's simply stick to the most advanced option, so we don't have to care about other combinations. This means we always select DYNAMIC_FTRACE if FUNCTION_TRACER is selected. In the s390 Makefile also remove CONFIG_FTRACE_SYSCALLS, since that functionality got moved to architecture-independent code in the meantime.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens
This code is based on a patch from Vojtech Pavlik:
http://marc.info/?l=linux-s390&m=140438885114413&w=2
The actual implementation now differs significantly: instead of adding a second function "ftrace_regs_caller", which would be nearly identical to the existing ftrace_caller function, the current ftrace_caller function is now an alias to ftrace_regs_caller and always passes the needed pt_regs structure and function_trace_op parameters unconditionally. Besides that, use asm offsets to correctly allocate and access the new struct pt_regs on the stack. While at it, we can make use of new instructions to get rid of some indirect loads if compiled for new machines. The passed struct pt_regs can be changed by the called function and its new contents will replace the current contents. Note: to change the return address, the embedded psw member of the pt_regs structure must be changed. The psw member is right now incomplete, since the mask part is missing. For all current use cases this should be sufficient. Providing and restoring a sane mask would mean we need to add an epsw/lpswe pair to the mcount code. These two instructions alone would cost us ~120 cycles, which currently does not seem necessary.
Cc: Vojtech Pavlik <vojtech@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens
When the function graph tracer is disabled, we can skip three additional instructions, so let's just do that. If function tracing is enabled but function graph tracing is disabled at runtime, we get away with a single unconditional branch.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
Add the CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE optimization to the 64-bit and 31-bit vdso.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
If gettimeofday / clock_gettime are called multiple times in a row, the STCK instruction will stall until a difference in the result is visible. This unnecessarily slows down the vdso calls; use stcke instead of stck to get rid of the stall.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 04 September 2014, 1 commit
-
Committed by Andy Lutomirski
The secure_computing function took a syscall number parameter, but it only paid any attention to that parameter if seccomp mode 1 was enabled. Rather than coming up with a kludge to get the parameter to work in mode 2, just remove the parameter. To avoid churn in arches that don't have seccomp filters (and may not even support syscall_get_nr right now), this leaves the parameter in secure_computing_strict, which is now a real function. For ARM, this is a bit ugly due to the fact that ARM conditionally supports seccomp filters. Fixing that would probably only be a couple of lines of code, but it should be coordinated with the audit maintainers. This will be a slight slowdown on some arches. The right fix is to pass in all of seccomp_data instead of trying to make just the syscall nr part be fast. This is a prerequisite for making two-phase seccomp work cleanly.
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: x86@kernel.org
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
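A stub-level sketch of the interface change only; the bodies and the exact return convention are assumptions modeled on the description above, not the kernel implementation of that time.

    #include <stdio.h>

    /* Mode-1 ("strict") keeps the syscall-number parameter and is now a real function. */
    static void secure_computing_strict(int this_syscall)
    {
            printf("strict check of syscall %d\n", this_syscall);
    }

    /* The general entry point lost its parameter; filter mode is expected to
     * fetch the syscall number and arguments itself later on. */
    static int secure_computing(void)
    {
            return 0;   /* 0 = run the syscall, -1 = skip it (assumed convention) */
    }

    int main(void)
    {
            int nr = 4;   /* pretend syscall number */

            if (secure_computing() == -1)   /* new, parameterless call site */
                    return 1;               /* arch code would abort the syscall */

            secure_computing_strict(nr);    /* legacy strict-mode call site */
            return 0;
    }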
-
- 01 September 2014, 2 commits
-
Committed by Martin Schwidefsky
The explicit NULL pointer check on the timespec argument is only required for clock_getres, but not for clock_gettime.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
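A userspace illustration of why the two differ: POSIX allows a NULL resolution pointer for clock_getres(), whereas clock_gettime() callers must always pass a valid timespec, so the vdso does not need to guard against NULL there.

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
            struct timespec ts;

            /* Legal: a NULL res just means "I only care about the return code". */
            if (clock_getres(CLOCK_MONOTONIC, NULL) == 0)
                    printf("CLOCK_MONOTONIC is supported\n");

            /* clock_gettime() has no such convention; the pointer must be valid. */
            if (clock_gettime(CLOCK_MONOTONIC, &ts) == 0)
                    printf("now: %lld.%09ld\n", (long long)ts.tv_sec, ts.tv_nsec);

            return 0;
    }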
-
Committed by Michael Holzheu
Currently the loadparm is only supported for CCW IPL. But for SCSI IPL it can also be specified, either on the HMC load panel or the z/VM console, or via diagnose 308. So fix this for SCSI and add the required sysfs attributes for reading the IPL loadparm and for setting the loadparm for re-IPL. With this patch the following two sysfs attributes are introduced:
- /sys/firmware/ipl/loadparm (for systems that have been IPLed from SCSI)
- /sys/firmware/reipl/fcp/loadparm
Because the loadparm is now available for SCSI and CCW, it is moved from "struct ipl_block_ccw" to the generic "struct ipl_list_hdr".
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- 27 August 2014, 1 commit
-
Committed by Christoph Lameter
__get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processor's percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as:
    #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
__get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and fewer registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed, so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout, specialized macros can be defined in non-x86 arches as well in order to optimize per-cpu access, e.g. by using a global register that may be set to the per-cpu base.
Transformations done to __get_cpu_var():
1. Determine the address of the percpu instance of the current processor.
    DEFINE_PER_CPU(int, y);
    int *x = &__get_cpu_var(y);
  Converts to
    int *x = this_cpu_ptr(&y);
2. Same as #1 but this time an array structure is involved.
    DEFINE_PER_CPU(int, y[20]);
    int *x = __get_cpu_var(y);
  Converts to
    int *x = this_cpu_ptr(y);
3. Retrieve the content of the current processor's instance of a per cpu variable.
    DEFINE_PER_CPU(int, y);
    int x = __get_cpu_var(y)
  Converts to
    int x = __this_cpu_read(y);
4. Retrieve the content of a percpu struct.
    DEFINE_PER_CPU(struct mystruct, y);
    struct mystruct x = __get_cpu_var(y);
  Converts to
    memcpy(&x, this_cpu_ptr(&y), sizeof(x));
5. Assignment to a per cpu variable.
    DEFINE_PER_CPU(int, y)
    __get_cpu_var(y) = x;
  Converts to
    this_cpu_write(y, x);
6. Increment/decrement etc. of a per cpu variable.
    DEFINE_PER_CPU(int, y);
    __get_cpu_var(y)++
  Converts to
    this_cpu_inc(y)
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: linux390@de.ibm.com
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 12 August 2014, 2 commits
-
Committed by Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Committed by Martin Schwidefsky
The virtual-machine cpu information data block and the cpu-id of the boot cpu can be used as a source of device randomness.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
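A kernel-style sketch of the idea: feed the boot-time machine identification data into the entropy pool with add_device_randomness(). The structure below only stands in for the data mentioned above; it is not the real sysinfo/cpu-id layout.

    #include <linux/init.h>
    #include <linux/random.h>

    /* Illustrative container only. */
    struct machine_seed_sketch {
            unsigned long long boot_cpu_id;   /* CPU id of the boot CPU */
            char vm_cpu_info[256];            /* virtual-machine CPU information block */
    };

    static void __init seed_pool_from_machine_data(const struct machine_seed_sketch *d)
    {
            /* mixes the bytes into the input pool without crediting entropy */
            add_device_randomness(d, sizeof(*d));
    }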
-