Commits · a1b5344620a3e6291afaf7542714ba9c391ef1c7 · openanolis / cloud-kernel

12 Mar, 2016 9 commits

powerpc32: Remove one insn in mulhdu · 737b01fc

Authored by Christophe Leroy on Feb 09, 2016

Remove one instruction in mulhdu
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>

powerpc32: small optimisation in flush_icache_range() · 716fa91d

Authored by Christophe Leroy on Feb 09, 2016

Inlining of _dcache_range() functions has shown that the compiler
does the same thing a bit better, with one insn less.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>

powerpc32: move xxxxx_dcache_range() functions inline · affe587b

Authored by Christophe Leroy on Feb 09, 2016

The flush/clean/invalidate _dcache_range() functions are all very
similar and quite short. They are mainly used in __dma_sync().
perf_event places them among the top 3 consuming functions during
heavy ethernet activity.

They are good candidates for inlining, as __dma_sync() does
almost nothing but call them.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>

powerpc32: Remove clear_pages() and define clear_page() inline · 5736f96d

Authored by Christophe Leroy on Feb 09, 2016

clear_pages() is never used except by clear_page(), and PPC32 is the
only architecture (still) having this function. Neither PPC64 nor
any other architecture has it.

This patch removes clear_pages() and moves the clear_page() function
inline (same as PPC64), as it is only a few insns.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>

powerpc/8xx: rewrite flush_instruction_cache() in C · 766d45cb

Authored by Christophe Leroy on Feb 09, 2016

On PPC8xx, flushing the instruction cache is performed by writing
to register SPRN_IC_CST. This register is affected by the CPU6 erratum.
The patch rewrites the function in C so that the CPU6 erratum will
be handled transparently.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>

powerpc/8xx: rewrite set_context() in C · a7761fe4

Authored by Christophe Leroy on Feb 09, 2016

There is no real need to have set_context() in assembly.
Now that we have mtspr() handling the CPU6 erratum directly, we
can rewrite set_context() in C for easier maintenance.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>

powerpc/8xx: remove special handling of CPU6 errata in set_dec() · 63e9e1c2

Authored by Christophe Leroy on Feb 09, 2016

The CPU6 erratum is now handled directly in mtspr(), so we can use the
standard set_dec() function in all cases.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>

powerpc/8xx: Map linear kernel RAM with 8M pages · a372acfa

Authored by Christophe Leroy on Feb 09, 2016

On a live running system (VoIP gateway for Air Traffic Control), over
a 10-minute period (with 277s idle), we get 87 million DTLB misses
and approximately 35 seconds are spent in the DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 million DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.

MPC8xx has no BATs but it has an 8 MB page size. This patch implements
mapping of kernel RAM using 8 MB pages, on the same model as what is
done on the 40x.

In 4k pages mode, each PGD entry maps a 4 MB area: we map every two
entries to the same 8 MB physical page. In each second entry, we add
4 MB to the page physical address to ease the life of the FixupDAR
routine. This is simply ignored by the hardware.

In 16k pages mode, each PGD entry maps a 64 MB area: each PGD entry
will point to the first page of the area. The DTLB handler adds
the 3 bits from the EPN to map the correct page.
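
The two layouts described above can be modelled with a little address arithmetic. The sketch below is a simplified illustration, not the kernel's assembly: all function and macro names are hypothetical, and the 16k case assumes that the 3 EPN bits in question are the ones selecting one of the eight 8 MB pages inside a 64 MB area.

```c
#include <stdint.h>

#define SZ_4M_MODEL (4u << 20)
#define SZ_8M_MODEL (8u << 20)

/*
 * 4k pages mode: PGD entry `index` covers 4 MB. Entries 2k and 2k+1 share
 * one 8 MB physical page; the odd entry carries an extra 4 MB in its
 * physical address.
 */
uint32_t pgd_phys_4k(uint32_t index)
{
    uint32_t page_base = (index / 2) * SZ_8M_MODEL;

    return page_base + (index % 2) * SZ_4M_MODEL;
}

/* Hardware view of an 8 MB page: bits below the 8 MB boundary are ignored. */
uint32_t hw_page_base_8m(uint32_t phys)
{
    return phys & ~(SZ_8M_MODEL - 1);
}

/*
 * 16k pages mode: the PGD entry holds the first 8 MB page of a 64 MB area.
 * Three bits of the effective address select one of the eight 8 MB pages.
 */
uint32_t dtlb_phys_16k(uint32_t area_base, uint32_t ea)
{
    uint32_t page_in_area = (ea / SZ_8M_MODEL) & 7u;

    return area_base + page_in_area * SZ_8M_MODEL;
}
```

In this model, entries 0 and 1 resolve to the same hardware page even though their stored addresses differ by 4 MB, which is the property the message relies on when it says the extra 4 MB is ignored.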

With this patch applied, we now get only 13 million TLB misses
during the 10-minute period. The idle time has increased to 313s
and the overall time spent in the DTLB miss handler is 6.3s, which
represents 1% of the overall time and 2.2% of non-idle time.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
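
The percentages quoted in this message follow directly from the measured times. The helper below is only a worked check of that arithmetic; the function names are made up for illustration, and the 600 s period is the 10 minutes stated in the message.

```c
#include <stdio.h>

/* Share of `whole_s` taken by `part_s`, as a percentage. */
double percent_of(double part_s, double whole_s)
{
    return 100.0 * part_s / whole_s;
}

/* Print the handler's share of overall time and of non-idle time. */
void print_dtlb_share(const char *label, double period_s,
                      double idle_s, double handler_s)
{
    double busy_s = period_s - idle_s;  /* non-idle time */

    printf("%s: %.1f%% of overall time, %.1f%% of non-idle time\n",
           label,
           percent_of(handler_s, period_s),
           percent_of(handler_s, busy_s));
}
```

Calling `print_dtlb_share("before", 600.0, 277.0, 35.0)` and `print_dtlb_share("after", 600.0, 313.0, 6.3)` reproduces the quoted figures to within rounding: 35 s out of 600 s and out of 323 s non-idle before the patch, 6.3 s out of 600 s and out of 287 s non-idle after it.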

powerpc/8xx: Save r3 all the time in DTLB miss handler · 913a6b3d

Authored by Christophe Leroy on Feb 09, 2016

We are spending between 40 and 160 cycles, with a mean of 65 cycles, in
the DTLB handling routine (measured with mftbl), so make it simpler,
although this adds one instruction.
With this modification, we get three registers available at all times,
which will help with the following patch.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>

10 Mar, 2016 1 commit

powerpc/8xx: CONFIG_DEBUG_PAGEALLOC requires ITLBmiss for kernel addresses · 921fff35

Authored by Christophe Leroy on Feb 03, 2016

When CONFIG_DEBUG_PAGEALLOC is activated, the initial TLB mapping gets
flushed to track accesses to wrong areas. Therefore, kernel addresses
will also generate ITLB misses.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>

09 Mar, 2016 12 commits

powerpc/eeh: eeh_pci_enable(): fix checking of post-request state · 949e9b82

Authored by Andrew Donnellan on Oct 23, 2015

In eeh_pci_enable(), after making the request to set the new options, we
call eeh_ops->wait_state() to check that the request finished successfully.

At the moment, if eeh_ops->wait_state() returns 0, we return 0 without
checking that it reflects the expected outcome. This can lead to callers
further up the chain incorrectly assuming the slot has been successfully
unfrozen and continuing to attempt recovery.

On powernv, this will occur if pnv_eeh_get_pe_state() or
pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL
call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or
OPAL_EEH_PHB_ERROR respectively.

On pseries, this will occur if pseries_eeh_get_state() returns 0, which in
turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA
Stopped states.

Obviously, none of these cases represent a successful completion of a
request to thaw MMIO or DMA.

Fix the check so that a wait_state() return value of 0 won't be considered
successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases.
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
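
The corrected check can be sketched as a small standalone function. This is a model of the logic described in the message, not the kernel source: the option codes and state bits are hypothetical stand-ins for the kernel's EEH constants, and returning `-EIO` for an unexpected state is an assumption made for the illustration.

```c
#include <errno.h>

/* Hypothetical stand-ins for the EEH option codes. */
enum eeh_option_model {
    OPT_DISABLE,
    OPT_ENABLE,
    OPT_THAW_MMIO,
    OPT_THAW_DMA,
};

/* Hypothetical stand-ins for the PE state bits reported by wait_state(). */
#define STATE_MMIO_ACTIVE (1 << 0)
#define STATE_DMA_ACTIVE  (1 << 1)

/*
 * wait_rc is the value returned by the wait_state() callback:
 * negative on error, otherwise a mask of state bits (possibly 0).
 */
int check_post_request_state(enum eeh_option_model option, int wait_rc)
{
    int expected = 0;

    if (option == OPT_THAW_MMIO)
        expected = STATE_MMIO_ACTIVE;
    else if (option == OPT_THAW_DMA)
        expected = STATE_DMA_ACTIVE;

    if (wait_rc < 0)
        return wait_rc;          /* propagate the error */
    if (expected == 0)
        return 0;                /* nothing specific to verify */

    /* A thaw only succeeded if the matching "active" bit is set. */
    return (wait_rc & expected) ? 0 : -EIO;
}
```

With this shape, a `wait_rc` of 0 after a thaw request is reported as a failure rather than silently treated as success.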

powerpc/eeh: Remove duplicated check in eeh_dump_pe_log() · b6c7347f

Authored by Gavin Shan on Feb 26, 2016

As eeh_dump_pe_log() is only called by eeh_slot_error_detail(),
which already checks that the PE isn't in the PCI config blocked
state, the same check in eeh_dump_pe_log() is redundant.

This removes the duplicated check in eeh_dump_pe_log(). No logical
changes introduced.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: Synchronize recovery in host/guest · eca036ee

Authored by Gavin Shan on Mar 04, 2016

When passing through SRIOV VFs to a guest, we may encounter an EEH
error on the PF. In this case, the VF PEs are put into frozen state.
The error could be reported to the guest before it's captured by the
host. That means the guest could attempt to recover errors on VFs
before the host gets a chance to recover errors on PFs. The VFs won't
be recovered successfully.

This enforces the recovery order for the above case: the recovery on
the child PE in the guest is held until the recovery on the parent PE
in the host is completed.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: Don't remove passed VFs · 3fa7bf72

Authored by Gavin Shan on Mar 04, 2016

When we have partial hotplug as part of the error recovery on a PF,
the VFs that are bound to the vfio-pci driver will experience hotplug.
That's not allowed.

This checks whether the VF PE is passed through or not. If it is, we
leave the VF in place without removing it.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: Don't propagate error to guest · 2311cca5

Authored by Gavin Shan on Mar 04, 2016

When an EEH error happens on the parent PE of PEs that have
been passed through to a guest, the error is propagated to the guest
domain and the VFIO driver's error handlers are called. This is not
correct, as an error in the host domain shouldn't be propagated
to guests and affect them.

This adds one more limitation when calling EEH error handlers.
If the PE has been passed through to a guest, the error handlers
won't be called.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: powerpc/eeh: Support error recovery for VF PE · 67086e32

Authored by Wei Yang on Mar 04, 2016

PFs are enumerated on the PCI bus, while VFs are created by the PF's driver.

EEH recovery has two cases:
1. Device and driver are EEH aware: error handlers are called.
2. Device and driver are not EEH aware: un-plug the device and plug it again
by enumerating it.

The special handling is in the second case. For a PF, we can use the
original PCI core to enumerate the bus, while for VFs we need to record the
VFs which are un-plugged and then plug them again.

The patch also caches the VF index in pci_dn, which can be used to
calculate the VF's bus, device and function number. This information helps to
locate the VF's PCI device instance when doing hotplug during EEH recovery
if necessary.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
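
How a cached VF index yields a bus, device and function number can be illustrated with the usual SR-IOV routing-ID arithmetic (PF routing ID plus First VF Offset plus VF Stride times the index). The sketch below assumes that this is the calculation the message refers to; the struct and function names are hypothetical.

```c
#include <stdint.h>

struct vf_addr_model {
    uint8_t bus;
    uint8_t dev;  /* device number, 0..31 */
    uint8_t fn;   /* function number, 0..7 */
};

/*
 * pf_devfn packs the PF's device (upper 5 bits) and function (lower 3 bits).
 * first_vf_offset and vf_stride come from the PF's SR-IOV capability.
 */
struct vf_addr_model vf_address(uint8_t pf_bus, uint8_t pf_devfn,
                                uint16_t first_vf_offset, uint16_t vf_stride,
                                unsigned int vf_index)
{
    struct vf_addr_model a;
    uint32_t rid = pf_devfn + first_vf_offset + vf_stride * vf_index;

    /* Anything above 8 bits of devfn spills over into the bus number. */
    a.bus = (uint8_t)(pf_bus + (rid >> 8));
    a.dev = (uint8_t)((rid & 0xff) >> 3);
    a.fn = (uint8_t)(rid & 0x7);
    return a;
}
```

For a PF at 01:00.0 with an offset of 16 and a stride of 2, VF index 3 lands at routing ID 22, that is bus 1, device 2, function 6.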

powerpc/powernv: Support EEH reset for VF PE · 9312bc5b

Authored by Wei Yang on Mar 04, 2016

PEs for VFs don't have a primary bus, so they need their own reset
backend, which is used during EEH recovery. The patch implements the reset
backend for a VF's PE by issuing FLR or AF FLR to the VFs contained
in the PE.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: Create PE for VFs · c29fa27d

Authored by Wei Yang on Mar 04, 2016

This creates PEs for VFs in the weak function pcibios_bus_add_device().
Those PEs for VFs are identified with the newly introduced flag EEH_PE_VF
so that we treat them differently during EEH recovery.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: EEH device for VF · 39218cd0

Authored by Wei Yang on Mar 04, 2016

VFs and their corresponding pdn are created and released dynamically
when their PF's SRIOV capability is enabled and disabled. This creates
and releases EEH devices for VFs when creating and releasing their pdn
instances, which means EEH devices and pdn instances have the same life
cycle. Also, a VF's EEH device is identified by (struct eeh_dev::physfn).
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: Cache normal BARs, not windows or IOV BARs · 51c0e87e

Authored by Wei Yang on Mar 04, 2016

This restricts the EEH address cache to use only the first 7 BARs. This
makes __eeh_addr_cache_insert_dev() ignore PCI bridge windows and IOV BARs.
As a result of this change, eeh_addr_cache_get_dev() will return VFs from
the VF's resource addresses instead of parent PFs.

This also removes the PCI bridge check, as we limit __eeh_addr_cache_insert_dev()
to 7 BARs and this effectively excludes PCI bridges from being cached.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/pci: Remove VFs prior to PF · 971427f5

Authored by Wei Yang on Mar 04, 2016

As commit ac205b7b ("PCI: make sriov work with hotplug remove")
indicates, VFs which are on the same PCI bus as their PF should be
removed before the PF. Otherwise, we might run into a kernel crash
at PCI unplugging time.

This applies the above pattern to the powerpc PCI hotplug path.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/eeh: Reworked eeh_pe_bus_get() · 4eb0799f

Authored by Gavin Shan on Feb 09, 2016

The original implementation is ugly: unnecessary if statements and an
"out" label. This reworks the function to avoid the above weaknesses. No
functional changes introduced.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

07 Mar, 2016 7 commits

powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI · 15308664

Authored by Torsten Duwe on Mar 03, 2016

The gcc switch -mprofile-kernel defines a new ABI for calling _mcount()
very early in the function with minimal overhead.

Although mprofile-kernel has been available since GCC 3.4, there were
bugs which were only fixed recently. Currently it is known to work in
GCC 4.9, 5 and 6.

Additionally there are two possible code sequences generated by the
flag, the first uses mflr/std/bl and the second is optimised to omit the
std. Currently only gcc 6 has the optimised sequence. This patch
supports both sequences.

Initial work started by Vojtech Pavlik, used with permission.

Key changes:
 - rework _mcount() to work for both the old and new ABIs.
 - implement new versions of ftrace_caller() and ftrace_graph_caller()
   which deal with the new ABI.
 - updates to __ftrace_make_nop() to recognise the new mcount calling
   sequence.
 - updates to __ftrace_make_call() to recognise the nop'ed sequence.
 - implement ftrace_modify_call().
 - updates to the module loader to suppress the toc save in the module
   stub when calling mcount with the new ABI.
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/ftrace: Use $(CC_FLAGS_FTRACE) when disabling ftrace · 9a7841ae

Authored by Torsten Duwe on Mar 03, 2016

Rather than open-coding -pg wherever we want to disable ftrace, use the
existing $(CC_FLAGS_FTRACE) variable.

This has the advantage that it will work in future when we use a
different set of flags to enable ftrace.
Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/ftrace: Use generic ftrace_modify_all_code() · c96f8385

Authored by Torsten Duwe on Mar 03, 2016

Convert powerpc's arch_ftrace_update_code() from its own version to use
the generic default functionality (without stop_machine -- our
instructions are properly aligned and the replacements atomic).

With this we gain error checking and the much-needed function_trace_op
handling.
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/module: Create a special stub for ftrace_caller() · 336a7b5d

Authored by Michael Ellerman on Mar 03, 2016

In order to support the new -mprofile-kernel ABI, we need to be able to
call from the module back to ftrace_caller() (in the kernel) without
using the module's r2. That is because the function in this module which
is calling ftrace_caller() may not have set up r2, if it doesn't
otherwise need it (i.e. it accesses no globals).

To make that work we add a new stub which is used for calling
ftrace_caller(), which uses the kernel toc instead of the module toc.
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc/module: Mark module stubs with a magic value · f17c4e01

Authored by Michael Ellerman on Mar 03, 2016

When a module is loaded, calls out to the kernel go via a stub which is
generated at runtime. One of these stubs is used to call _mcount(),
which is the default target of tracing calls generated by the compiler
with -pg.

If dynamic ftrace is enabled (which it typically is), another stub is
used to call ftrace_caller(), which is the target of tracing calls when
ftrace is actually active.

ftrace then wants to disable the calls to _mcount() at module startup,
and enable/disable the calls to ftrace_caller() when enabling/disabling
tracing - all of these it does by patching the code.

As part of that code patching, the ftrace code wants to confirm that the
branch it is about to modify, is in fact a call to a module stub which
calls _mcount() or ftrace_caller().

Currently it does that by inspecting the instructions and confirming
they are what it expects. Although that works, the code to do it is
pretty intricate because it requires lots of knowledge about the exact
format of the stub.

We can make that process easier by marking the generated stubs with a
magic value, and then looking for that magic value. Although this is not
as rigorous as the current method, I believe it is sufficient in
practice.
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
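
The idea of tagging stubs with a magic value, rather than decoding their instructions, can be sketched as follows. This is an illustrative model only: the struct layout, the field names and the magic constant are assumptions, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative marker value (ASCII "stub"). */
#define STUB_MAGIC_MODEL 0x73747562u

/* Hypothetical stub layout: branch sequence, marker, then the call target. */
struct stub_model {
    uint32_t jump[7];
    uint32_t magic;
    uint64_t target;
};

/* Generate a stub for `target` and mark it as one of ours. */
void create_stub_model(struct stub_model *stub, uint64_t target)
{
    memset(stub, 0, sizeof(*stub));
    stub->magic = STUB_MAGIC_MODEL;
    stub->target = target;
}

/*
 * Before patching a branch, confirm it really points at a stub calling
 * `target`. Only the marker and the target are inspected, not the
 * instruction words themselves.
 */
bool is_stub_for(const struct stub_model *stub, uint64_t target)
{
    return stub->magic == STUB_MAGIC_MODEL && stub->target == target;
}
```

The check is weaker than verifying every instruction, which matches the trade-off the message acknowledges, but it no longer depends on the exact format of the stub.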

powerpc/module: Only try to generate the ftrace_caller() stub once · 136cd345

Authored by Michael Ellerman on Mar 03, 2016

Currently we generate the module stub for ftrace_caller() at the bottom
of apply_relocate_add(). However apply_relocate_add() is potentially
called more than once per module, which means we will try to generate
the ftrace_caller() stub multiple times.

Although the current code deals with that correctly, ie. it only
generates a stub the first time, it would be clearer to only try to
generate the stub once.

Note also on first reading it may appear that we generate a different
stub for each section that requires relocation, but that is not the
case. The code in stub_for_addr() that searches for an existing stub
uses sechdrs[me->arch.stubs_section], ie. the single stub section for
this module.

A cleaner approach is to only generate the ftrace_caller() stub once,
from module_finalize(). Although the original code didn't check to see
if the stub was actually generated correctly, it seems prudent to add a
check, so do that. And an additional benefit is we can clean the ifdefs
up a little.

Finally we must propagate the const'ness of some of the pointers passed
to module_finalize(), but that is also an improvement.
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc: Create a helper for getting the kernel toc value · a5cab83c

Authored by Michael Ellerman on Mar 03, 2016

Move the logic to work out the kernel toc pointer into a header. This is
a good cleanup, and also means we can use it elsewhere in future.
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>

05 Mar, 2016 4 commits

powerpc/mpc85xx: Add CPU hotplug support for E6500 · 6becef7e

Authored by chenhui zhao on Nov 20, 2015

Support Freescale E6500 core-based platforms, like t4240.
Support disabling/enabling individual CPU threads dynamically.
Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>

powerpc/mpc85xx: Add hotplug support on E5500 and E500MC cores · 2f4f1f81

Authored by chenhui zhao on Nov 20, 2015

Freescale E500MC and E5500 core-based platforms, like P4080 and T1040,
support disabling/enabling CPUs dynamically.
This patch adds this feature on those platforms.
Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com>
[scottwood: removed unused pr_fmt]
Signed-off-by: Scott Wood <oss@buserror.net>

powerpc/rcpm: add RCPM driver · d17799f9

Authored by chenhui zhao on Nov 20, 2015

Freescale QorIQ series processors contain an RCPM (Run Control/Power
Management) block. The device performs tasks associated with device
run control and power management.

The driver implements some features: mask/unmask irq, enter/exit low
power states, freeze time base, etc.
Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@freescale.com>
[scottwood: remove __KERNEL__ ifdef]
Signed-off-by: Scott Wood <oss@buserror.net>

powerpc/cache: add cache flush operation for various e500 · e7affb1d

Authored by chenhui zhao on Nov 20, 2015

The various e500 cores have different cache architectures, so they
need different cache flush operations. Therefore, add a callback
function cpu_flush_caches to the struct cpu_spec. The cache flush
operation for the specific kind of e500 is selected at init time.
The callback function will flush all caches inside the current cpu.
Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com>
Signed-off-by: Scott Wood <oss@buserror.net>
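
The callback arrangement described here is a per-CPU-type function pointer chosen once at init time. A minimal sketch, assuming a simplified table: only the names `cpu_spec` and `cpu_flush_caches` come from the message, while the table entries, the selection helper and the flush bodies are hypothetical placeholders.

```c
#include <stddef.h>
#include <string.h>

/* Simplified model of struct cpu_spec with the new callback. */
struct cpu_spec_model {
    const char *cpu_name;
    void (*cpu_flush_caches)(void);
};

/* Records which variant ran; a real flush would touch the hardware. */
static const char *last_flush = "none";

static void flush_caches_variant_a(void) { last_flush = "variant-a"; }
static void flush_caches_variant_b(void) { last_flush = "variant-b"; }

static struct cpu_spec_model cpu_table[] = {
    { "core-a", flush_caches_variant_a },
    { "core-b", flush_caches_variant_b },
};

static struct cpu_spec_model *cur_cpu_spec_model;

/* Init-time selection: pick the table entry matching the detected core. */
int select_cpu_spec(const char *name)
{
    for (size_t i = 0; i < sizeof(cpu_table) / sizeof(cpu_table[0]); i++) {
        if (strcmp(cpu_table[i].cpu_name, name) == 0) {
            cur_cpu_spec_model = &cpu_table[i];
            return 0;
        }
    }
    return -1;
}

/* Callers flush through the callback without knowing the core type. */
const char *flush_current_cpu(void)
{
    if (!cur_cpu_spec_model || !cur_cpu_spec_model->cpu_flush_caches)
        return NULL;
    cur_cpu_spec_model->cpu_flush_caches();
    return last_flush;
}
```

Code that needs a full cache flush, such as a CPU hotplug path, then calls one function regardless of which core variant it runs on.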

03 Mar, 2016 1 commit

powerpc/mm: Move hash related mmu-*.h headers to book3s/ · f64e8084

Authored by Aneesh Kumar K.V on Mar 01, 2016

No code changes.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

02 Mar, 2016 6 commits

powerpc: Add the ability to save VSX without giving it up · bf6a4d5b

Authored by Cyril Bur on Feb 29, 2016

This patch adds the ability to save the VSX registers to the
thread struct without giving up (disabling the facility) the next time the
process returns to userspace.

This patch builds on a previous optimisation for the FPU and VEC registers
in the thread copy path to avoid a possibly pointless reload of VSX state.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc: Add the ability to save Altivec without giving it up · 6f515d84

Authored by Cyril Bur on Feb 29, 2016

This patch adds the ability to save the VEC registers to the
thread struct without giving up (disabling the facility) the next time the
process returns to userspace.

This patch builds on a previous optimisation for the FPU registers in the
thread copy path to avoid a possibly pointless reload of VEC state.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc: Add the ability to save FPU without giving it up · 8792468d

Authored by Cyril Bur on Feb 29, 2016

This patch adds the ability to save the FPU registers to the
thread struct without giving up (disabling the facility) the next time the
process returns to userspace.

This patch optimises the thread copy path (as a result of a fork() or
clone()) so that the parent thread can return to userspace with hot
registers, avoiding a possibly pointless reload of FPU register state.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

powerpc: Prepare for splitting giveup_{fpu, altivec, vsx} in two · de2a20aa

Authored by Cyril Bur on Feb 29, 2016

This prepares for the decoupling of saving {fpu,altivec,vsx} registers and
marking {fpu,altivec,vsx} as being unused by a thread.

Currently giveup_{fpu,altivec,vsx}() does both; however, optimisations to
task switching can be made if these two operations are decoupled.
save_all() will permit the saving of registers to thread structs and leave
the thread's MSR with bits enabled.

This patch introduces no functional change.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
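
The split between saving registers and giving up a facility can be modelled in a few lines. This is a sketch under stated assumptions, not the kernel code: the thread struct, the bit position used for the FP enable flag and the `live_fp` array standing in for the CPU's register file are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical MSR bit meaning "FP facility enabled for this thread". */
#define MSR_FP_MODEL (1u << 13)

struct fpu_thread_model {
    uint32_t msr;
    double fp_saved[32];   /* copy of the registers in the thread struct */
};

/* Stand-in for the live FP register file. */
static double live_fp[32];

/* Save only: the thread keeps the facility, its MSR bit stays set. */
void save_fpu_model(struct fpu_thread_model *t)
{
    memcpy(t->fp_saved, live_fp, sizeof(live_fp));
}

/* Give up: save, then mark the facility as no longer in use. */
void giveup_fpu_model(struct fpu_thread_model *t)
{
    save_fpu_model(t);
    t->msr &= ~MSR_FP_MODEL;
}
```

After `save_fpu_model()` the thread can return to userspace with its registers still hot, whereas after `giveup_fpu_model()` its next FP instruction has to go through a reload.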

powerpc: Restore FPU/VEC/VSX if previously used · 70fe3d98

Authored by Cyril Bur on Feb 29, 2016

Currently the FPU, VEC and VSX facilities are lazily loaded. This is not
a problem unless a process is using these facilities.

Modern versions of GCC are very good at automatically vectorising code;
new and modernised workloads make use of floating point and vector
facilities; even the kernel makes use of vectorised memcpy.

All this combined greatly increases the cost of a syscall since the
kernel uses the facilities sometimes even in syscall fast-path making it
increasingly common for a thread to take an *_unavailable exception soon
after a syscall, not to mention potentially taking all three.

The obvious overcompensation to this problem is to simply always load
all the facilities on every exit to userspace. Loading up all FPU, VEC
and VSX registers every time can be expensive and if a workload does
avoid using them, it should not be forced to incur this penalty.

An 8-bit counter is used to detect if the registers have been used in the
past, and the registers are always loaded until the value wraps back
to zero.
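
The wrapping counter can be modelled as below. This is a sketch of the heuristic as the message describes it, with hypothetical names; it assumes the counter is bumped on every load, whether the load comes from an unavailable exception or from the eager path on return to userspace.

```c
#include <stdbool.h>
#include <stdint.h>

struct counter_thread_model {
    uint8_t load_fp;       /* hypothetical name for the 8-bit use counter */
    unsigned long loads;   /* number of register loads, for illustration */
};

/* The thread touched the facility while it was disabled: load and count. */
void fp_unavailable_model(struct counter_thread_model *t)
{
    t->loads++;
    t->load_fp++;
}

/*
 * On return to userspace: load eagerly while the counter is non-zero.
 * Returns true if the registers were loaded on this exit.
 */
bool return_to_user_model(struct counter_thread_model *t)
{
    if (t->load_fp == 0)
        return false;      /* not used recently: stay lazy */
    t->loads++;
    t->load_fp++;          /* uint8_t wraps from 255 back to 0 */
    return true;
}
```

In this model a thread that uses the facility once gets eager loads on its next 255 exits to userspace, after which the counter wraps to zero and the thread falls back to lazy loading until it faults again.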

Several versions of the assembly in entry_64.S were tested:

  1. Always calling C.
  2. Performing a common case check and then calling C.
  3. A complex check in asm.

After some benchmarking it was determined that avoiding C in the common
case is a performance benefit (option 2). The full check in asm (option
3) greatly complicated that codepath for a negligible performance gain
and the trade-off was deemed not worth it.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Move load_vec in the struct to fill an existing hole, reword change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

fixup

powerpc: Explicitly disable math features when copying thread · d272f667

Authored by Cyril Bur on Feb 29, 2016

Currently when threads get scheduled off they always give up the FPU,
Altivec (VMX) and Vector (VSX) units if they were using them. When they are
scheduled back on, a fault is then taken to enable each facility and load
registers. As a result, explicitly disabling FPU/VMX/VSX has not been
necessary.

Future changes and optimisations remove this mandatory giveup and fault,
which could cause calls such as clone() and fork() to copy threads and run
them later with FPU/VMX/VSX enabled but no registers loaded.

This patch starts the process of having MSR_{FP,VEC,VSX} mean that a
thread's registers are hot, while not having MSR_{FP,VEC,VSX} means that the
registers must be loaded. This allows for a smarter return to userspace.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
