- 23 January 2017, 3 commits
-
-
Submitted by Denys Vlasenko

A lot of asm-optimized routines in arch/x86/crypto/ keep their constants in .data. This is wrong; they should be in .rodata. Many of these constants are the same in different modules. For example, the 128-bit shuffle mask 0x000102030405060708090A0B0C0D0E0F exists in at least half a dozen places. There is a way to let the linker merge them and use just one copy. The rules are as follows: mergeable objects of different sizes should not share sections. You can't put them all in one .rodata section, or they will lose their "mergeability". GCC puts its mergeable constants in ".rodata.cstSIZE" sections, or ".rodata.cstSIZE.<object_name>" if -fdata-sections is used. This patch does the same:

  .section .rodata.cst16.SHUF_MASK, "aM", @progbits, 16

It is important that all data in such a section consists of 16-byte elements, not larger ones, and that there is no implicit use of one element from another. When this is not the case, use a non-mergeable section:

  .section .rodata[.VAR_NAME], "a", @progbits

This reduces .data by ~15 kbytes:

      text     data      bss       dec     hex  filename
  11097415  2705840  2630712  16433967  fac32f  vmlinux-prev.o
  11112095  2690672  2630712  16433479  fac147  vmlinux.o

Merged objects are visible in System.map:

  ffffffff81a28810 r POLY
  ffffffff81a28810 r POLY
  ffffffff81a28820 r TWOONE
  ffffffff81a28820 r TWOONE
  ffffffff81a28830 r PSHUFFLE_BYTE_FLIP_MASK <- merged regardless of
  ffffffff81a28830 r SHUF_MASK   <------------- the name difference
  ffffffff81a28830 r SHUF_MASK
  ffffffff81a28830 r SHUF_MASK
  ..
  ffffffff81a28d00 r K512 <- merged three identical 640-byte tables
  ffffffff81a28d00 r K512
  ffffffff81a28d00 r K512

Use of object names in section name suffixes is not strictly necessary, but might help if the link stage someday uses garbage collection to eliminate unused sections (ld --gc-sections). Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: Josh Poimboeuf <jpoimboe@redhat.com> CC: Xiaodong Liu <xiaodong.liu@intel.com> CC: Megha Dey <megha.dey@intel.com> CC: linux-crypto@vger.kernel.org CC: x86@kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
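As an illustration of the directive usage, here is a minimal sketch written as file-scope asm in C (the constant and its name are just the SHUF_MASK example from above; the actual patch edits the .S files directly):

  /*
   * Sketch: emit one mergeable 16-byte constant. The "aM" flags plus
   * the entity size of 16 let the linker fold identical 16-byte
   * entries from different object files into a single copy.
   */
  asm(
  "        .section .rodata.cst16.SHUF_MASK, \"aM\", @progbits, 16\n"
  "        .align 16\n"
  "SHUF_MASK:\n"
  "        .octa 0x000102030405060708090A0B0C0D0E0F\n"
  "        .previous\n"
  );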
-
Submitted by Denys Vlasenko

The %progbits form is used on ARM (where @ is a comment character); x86 consistently uses @progbits everywhere else. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: Josh Poimboeuf <jpoimboe@redhat.com> CC: Xiaodong Liu <xiaodong.liu@intel.com> CC: Megha Dey <megha.dey@intel.com> CC: George Spelvin <linux@horizon.com> CC: linux-crypto@vger.kernel.org CC: x86@kernel.org CC: linux-kernel@vger.kernel.org Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Submitted by Ard Biesheuvel

The GNU assembler for ARM, version 2.22 or older, fails to infer the element size from the vmov instructions and aborts the build as follows:

  .../aes-neonbs-core.S: Assembler messages:
  .../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1h[1],r10'
  .../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1h[0],r9'
  .../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1l[1],r8'
  .../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1l[0],r7'
  .../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2h[1],r10'
  .../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2h[0],r9'
  .../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2l[1],r8'
  .../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2l[0],r7'

Fix this by setting the element size explicitly, replacing vmov with vmov.32. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 13 January 2017, 9 commits
-
-
Submitted by Ard Biesheuvel

The ARMv8-M architecture introduces the 'tt' and 'ttt' instructions, which means we can no longer use 'tt' as a register alias on recent versions of binutils for ARM. So replace the alias with 'ttab'. Fixes: 81edb426 ("crypto: arm/aes - replace scalar AES cipher") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
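For illustration, the aliasing mechanism in GNU as syntax, sketched as file-scope asm (r12 is an arbitrary choice here, not necessarily the register the real code uses):

  asm(
  "        ttab    .req    r12     @ was: 'tt .req r12', which now collides\n"
  "        mov     ttab, #0        @ uses of the alias are otherwise unchanged\n"
  "        .unreq  ttab\n"
  );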
-
Submitted by Ard Biesheuvel

This replaces the unwieldy generated implementation of bit-sliced AES in CBC/CTR/XTS modes that originated in the OpenSSL project with a new version that is heavily based on the OpenSSL implementation, but has a number of advantages over the old version:
- it does not rely on the scalar AES cipher that also originated in the OpenSSL project and contains redundant lookup tables and key schedule generation routines (which we already have in crypto/aes_generic.c)
- it uses the same expanded key schedule for encryption and decryption, reducing the size of the per-key data structure by 1696 bytes
- it adds an implementation of AES in ECB mode, which can be wrapped by other generic chaining mode implementations
- it moves the handling of corner cases that are non-critical to performance into the glue layer, which is written in C
- it was written directly in assembler rather than generated from a Perl script
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Submitted by Ard Biesheuvel

This is a reimplementation of the NEON version of the bit-sliced AES algorithm. The code is heavily based on Andy Polyakov's OpenSSL version for ARM, which is also available in the kernel. It is an alternative to the existing NEON implementation for arm64 authored by me, which suffers from poor performance due to its reliance on the pathologically slow four-register variant of the tbl/tbx NEON instruction. This version is about ~30% (*) faster than the generic C code, but only in cases where the input can be 8x interleaved (this is a fundamental property of bit slicing). For this reason, only the chaining modes ECB, XTS and CTR are implemented. (The significance of ECB is that it could potentially be used by other chaining modes.) (*) Measured on Cortex-A57. Note that this is still an order of magnitude slower than the implementations that use the dedicated AES instructions introduced in ARMv8, but those are part of an optional extension, so it is good to have a fallback. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Submitted by Ard Biesheuvel

This replaces the scalar AES cipher that originates in the OpenSSL project with a new implementation that is ~15% (*) faster (on modern cores) and reuses the lookup tables and the key schedule generation routines from the generic C implementation (which is usually compiled in anyway, since networking and other subsystems depend on it). Note that the bit-sliced NEON code for AES still depends on the scalar cipher that this patch replaces, so it is not removed entirely yet. (*) On Cortex-A57, performance improves from 17.0 to 14.9 cycles per byte for 128-bit keys. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Submitted by Ard Biesheuvel

This adds a scalar implementation of AES, based on the precomputed tables that are exposed by the generic AES code. Since rotates are cheap on arm64, this implementation uses only the 4 core tables (of 1 KB each) and avoids the prerotated ones, reducing the D-cache footprint by 75%. On Cortex-A57, this code manages 13.0 cycles per byte, which is ~34% faster than the generic C code. (Note that this is still >13x slower than the code that uses the optional ARMv8 Crypto Extensions, which manages <1 cycle per byte.) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
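The trade-off can be sketched in C: with cheap rotates, a single 1 KB table stands in for all four pre-rotated variants of the classic four-table formulation (function and table names here are hypothetical):

  #include <linux/types.h>
  #include <linux/bitops.h>       /* rol32() */

  /*
   * Sketch: one output column of an AES round computed from a single
   * forward table ft0[] plus rotations, instead of four pre-rotated
   * 1 KB tables ft0..ft3.
   */
  static u32 aes_round_column(const u32 ft0[256], u32 s0, u32 s1, u32 s2, u32 s3)
  {
          return       ft0[ s0        & 0xff]      ^
                 rol32(ft0[(s1 >>  8) & 0xff],  8) ^
                 rol32(ft0[(s2 >> 16) & 0xff], 16) ^
                 rol32(ft0[ s3 >> 24        ], 24);
  }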
-
Submitted by Ard Biesheuvel

In addition to wrapping the AES-CTR cipher into the async SIMD wrapper, which exposes it as an async skcipher that defers processing to process context, expose our AES-CTR implementation directly as a synchronous cipher as well, but with a lower priority. This makes the AES-CTR transform usable in places where synchronous transforms are required, such as the mac80211 encryption code, which executes in softirq context, where SIMD processing is allowed on arm64. Users of the async transform will keep the existing behavior. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Submitted by Ard Biesheuvel

This is a straight port to ARM/NEON of the x86 SSE3 implementation of the ChaCha20 stream cipher. It uses the new skcipher walksize attribute to process the input in strides of 4x the block size. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Submitted by Ard Biesheuvel

This is a straight port to arm64/NEON of the x86 SSE3 implementation of the ChaCha20 stream cipher. It uses the new skcipher walksize attribute to process the input in strides of 4x the block size. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Submitted by Herbert Xu

The kernel on x86-64 cannot use the gcc "aligned" attribute to align a stack variable to a 16-byte boundary. This patch reverts to the old way of aligning it by hand. Fixes: 9ae433bc ("crypto: chacha20 - convert generic and...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
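A minimal sketch of the by-hand alignment pattern, with an illustrative buffer (the function name and sizes are assumptions, not the literal patch):

  #include <linux/kernel.h>       /* PTR_ALIGN() */
  #include <linux/types.h>

  static void chacha20_example(void)
  {
          /*
           * Over-allocate by (alignment - 1) bytes and round the pointer
           * up, since __aligned(16) on a stack variable is not honoured
           * on x86-64 kernel stacks.
           */
          u8 buf[64 + 15];                /* 64 == ChaCha20 block size */
          u8 *state = PTR_ALIGN(buf, 16);

          (void)state;    /* use 'state' as the 16-byte aligned buffer */
  }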
-
- 05 January 2017, 5 commits
-
-
Submitted by Jan Dakinevich

The declaration of VMX_VPID_EXTENT_SUPPORTED_MASK occurs twice in the code, probably as the result of an unsuccessful merge. Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by James Hogan

Flush the KVM entry code from the icache on all CPUs, not just the one that built the entry code. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> # 3.16.x- Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Submitted by James Hogan

On 64-bit kernels, MIPS KVM will clear CP0_Status.UX to prevent the guest (running in user mode) from accessing the 64-bit memory segments. However, the previous value of CP0_Status.UX is never restored when exiting from the guest. If the user process uses 64-bit addressing (the n64 ABI), this can result in address error exceptions from the kernel if it needs to deliver a signal before returning to user mode, as the kernel will need to write a sigframe to high user addresses on the user stack, which are disallowed by CP0_Status.UX=0. This is fixed by explicitly setting SX and UX again when exiting from the guest, and explicitly clearing those bits when returning to the guest. Having the SX and UX bits set when handling guest exits (rather than only when exiting to userland) will be helpful when we support VZ, since we shouldn't need to directly read or write guest memory, so it will be valid for cache management IPIs to access host user addresses. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> # 4.8.x- Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
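At the level of the CP0 accessor macros, the fix amounts to the following sketch (function names here are hypothetical; the actual change lives in the assembler guest entry/exit code):

  #include <asm/mipsregs.h>       /* set_c0_status(), clear_c0_status(), ST0_* */

  static void guest_exit_restore_segments(void)
  {
          /* restore kernel/user access to the 64-bit segments */
          set_c0_status(ST0_SX | ST0_UX);
  }

  static void guest_enter_hide_segments(void)
  {
          /* hide the 64-bit segments from guest user mode again */
          clear_c0_status(ST0_SX | ST0_UX);
  }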
-
Submitted by Mark Rutland

Commit c02433dd ("arm64: split thread_info from task stack") inverted the relationship between get_current() and current_thread_info(), with sp_el0 now holding the current task_struct rather than the current thread_info. The new implementation of get_current() prevents the compiler from optimizing repeated calls to either, resulting in a noticeable penalty in some microbenchmarks. This patch restores the previous optimisation by implementing get_current() in the same way as our old current_thread_info(), using a non-volatile asm statement. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
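The restored implementation is essentially the following; because the asm is not volatile and has no memory clobber, the compiler is free to common up repeated get_current() calls within a function:

  #include <linux/compiler.h>

  struct task_struct;

  static __always_inline struct task_struct *get_current(void)
  {
          unsigned long sp_el0;

          /* non-volatile: repeated reads of sp_el0 may be CSE'd */
          asm ("mrs %0, sp_el0" : "=r" (sp_el0));

          return (struct task_struct *)sp_el0;
  }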
-
Submitted by Mark Rutland

Recent changes made KERN_CONT mandatory for continued lines. In the absence of KERN_CONT, a newline may be implicitly inserted by the core printk code. In show_pte, we (erroneously) use printk without KERN_CONT for continued prints, resulting in output being split across a number of lines and not matching the intended output, e.g.:

  [ff000000000000] *pgd=00000009f511b003
  , *pud=00000009f4a80003
  , *pmd=0000000000000000

Fix this by using pr_cont() for all the continuations. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
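A sketch of the corrected pattern (the printed fields are illustrative):

  pr_alert("[%016lx] *pgd=%016llx", addr, pgd_val(pgd));
  pr_cont(", *pud=%016llx", pud_val(pud));
  pr_cont(", *pmd=%016llx", pmd_val(pmd));
  pr_cont("\n");

Each pr_cont() continues the preceding line instead of letting the core printk code start a new one.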
-
- 04 January 2017, 3 commits
-
-
Submitted by Kevin Hilman

Signed-off-by: Kevin Hilman <khilman@baylibre.com>
-
Submitted by Neil Armstrong

Add Video Processing Unit and CVBS Output nodes, and enable CVBS on selected boards. Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Kevin Hilman <khilman@baylibre.com>
-
Submitted by Kevin Hilman

Signed-off-by: Kevin Hilman <khilman@baylibre.com>
-
- 03 January 2017, 3 commits
-
-
Submitted by Fabio Estevam

Commit 1be81ea5 ("ARM: dts: imx6: Add imx-weim parameters to dtsi's") causes the following probe error when the weim node is not present in the board dts (such as imx6q-sabresd):

  imx-weim 21b8000.weim: Invalid 'ranges' configuration
  imx-weim: probe of 21b8000.weim failed with error -22

There is no need to always enable the "weim" node on mx6. Do the same as in the other i.MX dtsi files, where "weim" is disabled and only gets enabled on a per-dts basis. All the imx6 weim dts users explicitly provide 'status = "okay"', so this change has no impact on current imx6 weim users. If a board does not use the weim driver it will not describe its 'ranges' property, so simply disable the 'weim' node in the imx6 dtsi files to avoid such probe error messages. Fixes: 1be81ea5 ("ARM: dts: imx6: Add imx-weim parameters to dtsi's") Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
-
Submitted by Helge Deller

Add a leading line break, otherwise the printed line gets too long. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.9
-
Submitted by Bjorn Andersson

As per the device tree binding, the apq8064 scm node requires the core clock to be specified, so add it. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Andy Gross <andy.gross@linaro.org>
-
- 02 January 2017, 9 commits
-
-
Submitted by Alexandre Bailon

Every time the usb20 phy is enabled, there is a "sleeping function called from invalid context" BUG. In addition, recursive locking happens because of the recursive call to clk_enable(). clk_enable() in arch/arm/mach-davinci/clock.c takes spin_lock_irqsave() before invoking the callback usb20_phy_clk_enable(), while usb20_phy_clk_enable() uses clk_get() and clk_prepare_enable(), which may sleep. Replace clk_prepare_enable() with davinci_clk_enable(). Signed-off-by: Alexandre Bailon <abailon@baylibre.com> Suggested-by: David Lechner <david@lechnology.com> [nsekhar@ti.com: minor commit description adjustment] Signed-off-by: Sekhar Nori <nsekhar@ti.com>
-
Submitted by Alexandre Bailon

In some cases, there is a need to enable a clock as part of the clock enable callback of a different clock. For example, the USB 2.0 PHY clock enable requires the USB 2.0 clock to be enabled. In this case, it is safe to instead call __clk_enable(), since the clock framework lock is already taken; calling clk_enable() causes a recursive locking error. A similar case arises in the clock disable path. To enable such usage, make the __clk_{enable,disable} functions publicly available outside of clock.c. Also, name them davinci_clk_{enable,disable} to be consistent with how other davinci-specific clock functions are named. Note that these functions are not exported to drivers; they are meant for use in platform-specific clock management code. Signed-off-by: Alexandre Bailon <abailon@baylibre.com> Suggested-by: David Lechner <david@lechnology.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
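Simplified from arch/arm/mach-davinci/clock.c, the recursion problem is easy to see (declarations trimmed to a sketch):

  #include <linux/spinlock.h>

  struct clk;
  void __clk_enable(struct clk *clk);     /* i.e. davinci_clk_enable() */

  static DEFINE_SPINLOCK(clockfw_lock);

  int clk_enable(struct clk *clk)
  {
          unsigned long flags;

          spin_lock_irqsave(&clockfw_lock, flags);
          __clk_enable(clk);      /* may call back into usb20_phy_clk_enable() */
          spin_unlock_irqrestore(&clockfw_lock, flags);

          return 0;
  }

  /*
   * Inside usb20_phy_clk_enable() the framework lock is already held,
   * so a nested clk_enable() would spin on clockfw_lock forever; the
   * unlocked davinci_clk_enable() must be used instead.
   */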
-
Submitted by Bartosz Golaszewski

Similarly to the aemif clock, this screws up the linked list of clock children. Create a separate clock for mdio, inheriting the rate from emac_clk. Cc: <stable@vger.kernel.org> # 3.12.x- Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> [nsekhar@ti.com: add a comment over mdio_clk to explain its existence + commit headline updates] Signed-off-by: Sekhar Nori <nsekhar@ti.com>
-
Submitted by Bartosz Golaszewski

The aemif clock is added twice to the lookup table in da850.c. This breaks the children list of pll0_sysclk3, since we're reusing the same list links in struct clk. When calling clk_set_rate(), we then get stuck in propagate_rate(). Create a separate clock for nand, inheriting the rate of the aemif clock, and retrieve it in the davinci_nand module. Cc: <stable@vger.kernel.org> # 4.9.x Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
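The corruption is the classic one of reusing a list_head (a simplified sketch; the field names follow mach-davinci's struct clk):

  #include <linux/list.h>

  struct clk {
          struct list_head childnode;     /* our entry in the parent's list */
          struct list_head children;      /* our own children */
          /* other fields omitted */
  };

  static void example(struct clk *aemif, struct clk *pll0_sysclk3)
  {
          /* two lookup-table entries for the same struct clk do this: */
          list_add(&aemif->childnode, &pll0_sysclk3->children);
          list_add(&aemif->childnode, &pll0_sysclk3->children);
          /*
           * After the second add, aemif->childnode points to itself, so
           * any walk of the children list (e.g. propagate_rate()) loops
           * forever.
           */
  }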
-
Submitted by Vladimir Murzin

There is no need to define map_io only for debug_ll_io_init(), since it is already called in devicemaps_init() when map_io is NULL. Apart from that, for a NOMMU build debug_ll_io_init() is a nop, which leads to the following error:

    CC      arch/arm/mach-imx/mach-imx1.o
  arch/arm/mach-imx/mach-imx1.c:40:13: error: 'debug_ll_io_init' undeclared here (not in a function)
    .map_io = debug_ll_io_init,
              ^
  make[1]: *** [arch/arm/mach-imx/mach-imx1.o] Error 1

Cc: Alexander Shiyan <shc_work@mail.ru> Cc: Sascha Hauer <kernel@pengutronix.de> Cc: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
-
Submitted by Andreas Färber

Found while reviewing Marvell dsa bindings usage. Fixes: f283745b ("arm: vf610: zii devel b: Add support for switch interrupts") Cc: Andrew Lunn <andrew@lunn.ch> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
-
Submitted by Gary Bisson

The NANDF_CS2 pad is also part of the wlan-vmmcgrp iomux group. Removing it from the usdhc2grp group avoids the following error:

  imx6q-pinctrl 20e0000.iomuxc: pin MX6Q_PAD_NANDF_CS2 already requested by regulators:regulator@4; cannot claim for 2194000.usdhc
  imx6q-pinctrl 20e0000.iomuxc: pin-187 (2194000.usdhc) status -22
  imx6q-pinctrl 20e0000.iomuxc: could not request pin 187 (MX6Q_PAD_NANDF_CS2) from group usdhc2grp on device 20e0000.iomuxc

Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
-
Submitted by Vladimir Zapolskiy

On i.MX31 the AVIC interrupt controller base address is 0x68000000. The problem was shadowed by the AVIC driver, which takes the correct base address from a SoC-specific header file. Fixes: d2a37b3d ("ARM i.MX31: Add devicetree support") Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
-
Submitted by Stafford Horne

The build robot reports:

  .tmp_kallsyms1.o: In function `kallsyms_relative_base':
  >> (.rodata+0x8a18): undefined reference to `_text'

This happens when using 'make alldefconfig'. Adding the _text symbol to mark the start of the kernel, as on other architectures, fixes this. Signed-off-by: Stafford Horne <shorne@gmail.com> Acked-by: Jonas Bonn <jonas@southpole.se>
-
- 31 December 2016, 1 commit
-
-
Submitted by Kishon Vijay Abraham I

Add a 'gpios' property to the pcie1 dt node and populate it with GPIO3_23 in order to drive PCIE_RESETn high. This gets PCIe cards detected on the AM572X IDK board. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
-
- 30 December 2016, 5 commits
-
-
Submitted by Sudeep Holla

The GICv2 CPU interface registers span 8K, not 4K as indicated in the DT. Only the GICC_DIR register is located after the initial 4K boundary, leaving a functional system but without support for separately EOI'ing and deactivating interrupts. After this change the system supports split priority drop and interrupt deactivation. This patch is based on a similar one from Christoffer Dall: commit 368400e2 ("ARM: dts: vexpress: Support GICC_DIR operations") Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
-
Submitted by Christoffer Dall

The GICv2 CPU interface registers span 8K, not 4K as indicated in the DT. Only the GICC_DIR register is located after the initial 4K boundary, leaving a functional system but without support for separately EOI'ing and deactivating interrupts. After this change the system supports split priority drop and interrupt deactivation. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> [sudeep.holla@arm.com: included same fix for tc1 platform too] Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
-
Submitted by Helge Deller

Commit 7e781418 ("signal: consolidate {TS,TLF}_RESTORE_SIGMASK code") introduced code with which the "restore sigmask" flag lives in task_struct instead of in ti->flags. Let's use this optimization on parisc too. Signed-off-by: Helge Deller <deller@gmx.de>
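With that consolidation in place, the generic helpers operate on the task_struct field whenever the arch stops defining TIF_RESTORE_SIGMASK, roughly as follows (a sketch of the pattern from the consolidating commit):

  static inline void set_restore_sigmask(void)
  {
          current->restore_sigmask = true;
          WARN_ON(!test_thread_flag(TIF_SIGPENDING));
  }

  static inline bool test_restore_sigmask(void)
  {
          return current->restore_sigmask;
  }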
-
Submitted by Helge Deller

The cr16 interval timer of each CPU is not synchronized to the cr16 timers in the other CPUs of an SMP system. So, delay the registration of the cr16 clocksource until all CPUs have been detected and then, if we are on an SMP machine, mark the cr16 clocksource as unstable and lower its rating before registering it with the clocksource framework. This patch fixes the stalled-CPU warnings which we have seen since the introduction of the cr16 clocksource. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.8+
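A sketch of the demotion at registration time (the condition, rate expression and clocksource fields shown are illustrative, not the literal patch):

  #include <linux/clocksource.h>
  #include <linux/cpumask.h>

  static struct clocksource clocksource_cr16 = {
          .name   = "cr16",
          .rating = 300,
          /* .read and the remaining fields omitted in this sketch */
  };

  static void __init register_cr16(unsigned long cr16_hz)
  {
          /* runs late, once all CPUs have been brought up */
          if (num_online_cpus() > 1) {
                  /* per-CPU cr16 counters are not synchronized */
                  clocksource_cr16.name = "cr16_unstable";
                  clocksource_cr16.flags = CLOCK_SOURCE_UNSTABLE;
                  clocksource_cr16.rating = 0;
          }
          clocksource_register_hz(&clocksource_cr16, cr16_hz);
  }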
-
Submitted by Linus Torvalds

In commit 62906027 ("mm: add PageWaiters indicating tasks are waiting for a page bit") Nick Piggin made our page locking no longer unconditionally touch the hashed page waitqueue, which not only helps performance in general, but is particularly helpful on NUMA machines where the hashed wait queues can bounce around a lot.

However, the "clear lock bit atomically and then test the waiters bit" sequence turns out to be much more expensive than it needs to be, because you get a nasty stall when trying to access the same word that just got updated atomically.

On architectures where locking is done with LL/SC, this would be trivial to fix with a new primitive that clears one bit and tests another atomically, but that ends up not working on x86, where the only atomic operations that return the result end up being cmpxchg and xadd. The atomic bit operations return the old value of the same bit we changed, not the value of an unrelated bit.

On x86, we could put the lock bit in the high bit of the byte, and use "xadd" with that bit (where the overflow ends up not touching other bits), and look at the other bits of the result. However, an even simpler model is to just use a regular atomic "and" to clear the lock bit, and then the sign bit in eflags will indicate the resulting state of the unrelated bit #7.

So by moving the PageWaiters bit up to bit #7, we can atomically clear the lock bit and test the waiters bit on x86 too. And on architectures with LL/SC (which is all the usual RISC suspects), the particular bit doesn't matter, so they are fine with this approach too.

This avoids the extra access to the same atomic word, and thus avoids the costly stall at page unlock time.

The only downside is that the interface ends up being a bit odd and specialized: clear a bit in a byte, and test the sign bit. Nick doesn't love the resulting name of the new primitive, but I'd rather make the name descriptive and very clear about the limitation imposed by trying to work across all relevant architectures than make it some generic thing that doesn't make the odd semantics explicit.

So this introduces the new architecture primitive clear_bit_unlock_is_negative_byte() and adds the trivial implementation for x86. We have a generic non-optimized fallback (that just does a "clear_bit()"+"test_bit(7)" combination) which can be overridden by any architecture that can do better.

According to Nick, Power has the same hiccup x86 has, for example, but some other architectures may not even care.

All these optimizations mean that my page locking stress-test (which is just executing a lot of small short-lived shell scripts: "make test" in the git source tree) no longer makes our page locking look horribly bad. Before all these optimizations, just the unlock_page() costs were just over 3% of all CPU overhead on "make test". After this, it's down to 0.66%, so just a quarter of the cost it used to be.

(The difference on NUMA is bigger, but there this micro-optimization is likely less noticeable, since the big issue on NUMA was not the accesses to 'struct page', but the waitqueue accesses that were already removed by Nick's earlier commit.)
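The generic fallback described above is essentially:

  #ifndef clear_bit_unlock_is_negative_byte
  /*
   * Portable fallback: an ordinary unlock followed by a test of bit 7,
   * where PG_waiters now lives. Architectures that can clear one bit
   * and test another in a single atomic sequence (like x86) override
   * this with their own definition.
   */
  static inline bool clear_bit_unlock_is_negative_byte(long nr, volatile void *mem)
  {
          clear_bit_unlock(nr, mem);
          return test_bit(7, mem);
  }
  #endif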
Acked-by: Nick Piggin <npiggin@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Lutomirski <luto@kernel.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 29 December 2016, 1 commit
-
-
Submitted by Stephen Boyd

This patch adds the required memory carveouts so that the kernel does not access memory that is in use or has been reserved for use by other remote processors. Signed-off-by: Andy Gross <andy.gross@linaro.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
-
- 28 December 2016, 1 commit
-
-
Submitted by Herbert Xu

This patch reverts the following commits:

  8621caa0
  80966672

I should not have applied them, because they had already been obsoleted by a subsequent patch series. They also cause a build failure because of the subsequent commit 9ae433bc. Fixes: 9ae433bc ("crypto: chacha20 - convert generic and...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-