提交 · 671f9a3e2e24cdeb2d2856abee7422f093e23e29 · openeuler / Kernel

10 7月, 2019 1 次提交

RISC-V: Setup initial page tables in two stages · 671f9a3e

由 Anup Patel 提交于 6月 28, 2019

Currently, the setup_vm() does initial page table setup in one-shot
very early before enabling MMU. Due to this, the setup_vm() has to map
all possible kernel virtual addresses since it does not know size and
location of RAM. This means we have kernel mappings for non-existent
RAM and any buggy driver (or kernel) code doing out-of-bound access
to RAM will not fault and cause underterministic behaviour.

Further, the setup_vm() creates PMD mappings (i.e. 2M mappings) for
RV64 systems. This means for PAGE_OFFSET=0xffffffe000000000 (i.e.
MAXPHYSMEM_128GB=y), the setup_vm() will require 129 pages (i.e.
516 KB) of memory for initial page tables which is never freed. The
memory required for initial page tables will further increase if
we chose a lower value of PAGE_OFFSET (e.g. 0xffffff0000000000)

This patch implements two-staged initial page table setup, as follows:
1. Early (i.e. setup_vm()): This stage maps kernel image and DTB in
a early page table (i.e. early_pg_dir). The early_pg_dir will be used
only by boot HART so it can be freed as-part of init memory free-up.
2. Final (i.e. setup_vm_final()): This stage maps all possible RAM
banks in the final page table (i.e. swapper_pg_dir). The boot HART
will start using swapper_pg_dir at the end of setup_vm_final(). All
non-boot HARTs directly use the swapper_pg_dir created by boot HART.

We have following advantages with this new approach:
1. Kernel mappings for non-existent RAM don't exists anymore.
2. Memory consumed by initial page tables is now indpendent of the
chosen PAGE_OFFSET.
3. Memory consumed by initial page tables on RV64 system is 2 pages
(i.e. 8 KB) which has significantly reduced and these pages will be
freed as-part of the init memory free-up.

The patch also provides a foundation for implementing strict kernel
mappings where we protect kernel text and rodata using PTE permissions.
Suggested-by: NMike Rapoport <rppt@linux.ibm.com>
Signed-off-by: NAnup Patel <anup.patel@wdc.com>
[paul.walmsley@sifive.com: updated to apply; fixed a checkpatch warning]
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

671f9a3e

04 7月, 2019 4 次提交

riscv: remove free_initrd_mem · 2ebca1cb

由 Christoph Hellwig 提交于 5月 20, 2019

The RISC-V free_initrd_mem is identical to the default one, except
that it doesn't poison the freed memory.  Remove it so that the
default implementations gets used instead.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NAnup Patel <anup@brainfault.org>
Reviewed-by: NPalmer Dabbelt <palmer@sifive.com>
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

2ebca1cb

riscv: ccache: Remove unused variable · df7e9059

由 Yash Shah 提交于 7月 01, 2019

Reading the count register clears the interrupt signal. Currently, the
count registers are read into 'regval' variable but the variable is
never used. Therefore remove it. V2 of this patch add comments to
justify the readl calls without checking the return value.
Signed-off-by: NYash Shah <yash.shah@sifive.com>
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

df7e9059

riscv: Introduce huge page support for 32/64bit kernel · 9e953cda

由 Alexandre Ghiti 提交于 5月 26, 2019

This patch implements both 4MB huge page support for 32bit kernel
and 2MB/1GB huge pages support for 64bit kernel.
Signed-off-by: NAlexandre Ghiti <alex@ghiti.fr>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

9e953cda

x86, arm64: Move ARCH_WANT_HUGE_PMD_SHARE config in arch/Kconfig · 3876d4a3

由 Alexandre Ghiti 提交于 6月 27, 2019

ARCH_WANT_HUGE_PMD_SHARE config was declared in both architectures:
move this declaration in arch/Kconfig and make those architectures
select it.
Signed-off-by: NAlexandre Ghiti <alex@ghiti.fr>
Reviewed-by: NPalmer Dabbelt <palmer@sifive.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> # for arm64
Reviewed-by: NHanjun Guo <guohanjun@huawei.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

3876d4a3

02 7月, 2019 8 次提交

RISC-V: Fix memory reservation in setup_bootmem() · d90d45d7

由 Anup Patel 提交于 6月 07, 2019

Currently, the setup_bootmem() reserves memory from RAM start to the
kernel end. This prevents us from exploring ways to use the RAM below
(or before) the kernel start hence this patch updates setup_bootmem()
to only reserve memory from the kernel start to the kernel end.
Suggested-by: NMike Rapoport <rppt@linux.ibm.com>
Signed-off-by: NAnup Patel <anup.patel@wdc.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

d90d45d7

riscv: defconfig: enable SOC_SIFIVE · bbc5dc51

由 Loys Ollivier 提交于 6月 27, 2019

Enable SOC_SIFIVE so the default upstream config is bootable on the SiFive
Unleashed Board.
And have basic support for future boards based on the same SoC.
Signed-off-by: NLoys Ollivier <lollivier@baylibre.com>
Reviewed-by: NPalmer Dabbelt <palmer@sifive.com>
[paul.walmsley@sifive.com: updated to apply]
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

bbc5dc51

riscv: select SiFive platform drivers with SOC_SIFIVE · edb7f21c

由 Loys Ollivier 提交于 6月 17, 2019

On selection of SOC_SIFIVE select the corresponding platform drivers.
Signed-off-by: NLoys Ollivier <lollivier@baylibre.com>
Reviewed-by: NPalmer Dabbelt <palmer@sifive.com>
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

edb7f21c

arch: riscv: add config option for building SiFive's SoC resource · 0cbb8a32

由 Loys Ollivier 提交于 6月 17, 2019

Create a config option for building SiFive SoC specific resources
e.g. SiFive device tree, platform drivers...
Signed-off-by: NLoys Ollivier <lollivier@baylibre.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: NPalmer Dabbelt <palmer@sifive.com>
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

0cbb8a32

riscv: Remove gate area stubs · 556024d4

由 Andy Lutomirski 提交于 6月 26, 2019

Since commit a6c19dfe ("arm64,ia64,ppc,s390,sh,tile,um,x86,mm:
remove default gate area"), which predates riscv's inclusion in
Linux by almost three years, the default behavior wrt the gate area
is sane.  Remove riscv's gate area stubs.

Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: linux-riscv@lists.infradead.org
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

556024d4

MAINTAINERS: change the arch/riscv git tree to the new shared tree · 71ec982f

由 Paul Walmsley 提交于 6月 12, 2019

Palmer, with Konstantin's gracious help, set up a shared kernel.org
git tree for arch/riscv patches going forward.  Change the MAINTAINERS
file accordingly.
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: NPalmer Dabbelt <palmer@sifive.com>

71ec982f

MAINTAINERS: don't automatically patches involving SiFive to the linux-riscv list · 3e9d80a3

由 Paul Walmsley 提交于 6月 12, 2019

The current K: entry in the "SIFIVE DRIVERS" section causes
scripts/get_maintainer.pl to recommend that all patches that originate
from, or are sent or copied to, anyone with a @sifive.com E-mail
address to be copied to the linux-riscv@lists.infradead.org mailing
list:

https://lore.kernel.org/linux-riscv/CABEDWGxKCqCq2HBU8u1-=QgmMCdb69oXxN5rz65nxNODxdCAnw@mail.gmail.com/

This is undesirable, since not all of these patches may be relevant to
the linux-riscv@ mailing list. Fix by excluding K: matches that look
like a sifive.com E-mail address.

Based on the following patch from Palmer Dabbelt <palmer@sifive.com>:

https://lore.kernel.org/linux-riscv/mhng-2a897a66-1f3d-4878-ba47-1ae36b555540@palmer-si-x1e/Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: NPalmer Dabbelt <palmer@sifive.com>

3e9d80a3

RISC-V: defconfig: Enable NO_HZ_IDLE and HIGH_RES_TIMERS · 6dd91e0e

由 Anup Patel 提交于 5月 15, 2019

This patch enables NO_HZ_IDLE (idle dynamic ticks) and HIGH_RES_TIMERS
(hrtimers) in RV32 and RV64 defconfigs.

Both of the above options are enabled by default for architectures
such as x86, ARM, and ARM64.

The idle dynamic ticks helps use save power by stopping timer ticks
when the system is idle whereas hrtimers is a much improved timer
subsystem compared to the old "timer wheel" based system.

This patch is tested on SiFive Unleashed board and QEMU Virt machine.
Signed-off-by: NAnup Patel <anup.patel@wdc.com>
Reviewed-by: NPalmer Dabbelt <palmer@sifive.com>
Signed-off-by: NPaul Walmsley <paul.walmsley@sifive.com>

6dd91e0e

30 6月, 2019 3 次提交

L

Linux 5.2-rc7 · 6fbc7275
由 Linus Torvalds 提交于 6月 30, 2019

6fbc7275

Merge tag 'powerpc-5.2-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 39132f74

由 Linus Torvalds 提交于 6月 30, 2019

Pull powerpc fix from Michael Ellerman:
 "One fix for a regression in my commit adding KUAP (Kernel User Access
  Prevention) on Radix, which incorrectly touched the AMR in the early
  machine check handler.

  Thanks to Nicholas Piggin"

* tag 'powerpc-5.2-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s/exception: Fix machine check early corrupting AMR

39132f74

Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7c15f41e

由 Linus Torvalds 提交于 6月 30, 2019

Pull SMP fixes from Thomas Gleixner:
 "Two small changes for the cpu hotplug code:

   - Prevent out of bounds access which actually might crash the machine
     caused by a missing bounds check in the fail injection code

   - Warn about unsupported migitation mode command line arguments to
     make people aware that they typoed the paramater. Not necessarily a
     fix but quite some people tripped over that"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Fix out-of-bounds read when setting fail state
  cpu/speculation: Warn on unsupported mitigations= parameter

7c15f41e

29 6月, 2019 24 次提交

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 72825454

由 Linus Torvalds 提交于 6月 29, 2019

Pull x86 fixes from Ingo Molnar:
 "Misc fixes all over the place:

   - might_sleep() atomicity fix in the microcode loader

   - resctrl boundary condition fix

   - APIC arithmethics bug fix for frequencies >= 4.2 GHz

   - three 5-level paging crash fixes

   - two speculation fixes

   - a perf/stacktrace fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/unwind/orc: Fall back to using frame pointers for generated code
  perf/x86: Always store regs->ip in perf_callchain_kernel()
  x86/speculation: Allow guests to use SSBD even if host does not
  x86/mm: Handle physical-virtual alignment mismatch in phys_p4d_init()
  x86/boot/64: Add missing fixup_pointer() for next_early_pgt access
  x86/boot/64: Fix crash if kernel image crosses page table boundary
  x86/apic: Fix integer overflow on 10 bit left shift of cpu_khz
  x86/resctrl: Prevent possible overrun during bitmap operations
  x86/microcode: Fix the microcode load on CPU hotplug for real

72825454

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 57103eb7

由 Linus Torvalds 提交于 6月 29, 2019

Pull perf fixes from Ingo Molnar:
 "Various fixes, most of them related to bugs perf fuzzing found in the
  x86 code"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/regs: Use PERF_REG_EXTENDED_MASK
  perf/x86: Remove pmu->pebs_no_xmm_regs
  perf/x86: Clean up PEBS_XMM_REGS
  perf/x86/regs: Check reserved bits
  perf/x86: Disable extended registers for non-supported PMUs
  perf/ioctl: Add check for the sample_period value
  perf/core: Fix perf_sample_regs_user() mm check

57103eb7

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · eed7d30e

由 Linus Torvalds 提交于 6月 29, 2019

Pull irq fixes from Ingo Molnar:
 "Diverse irqchip driver fixes"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Fix command queue pointer comparison bug
  irqchip/mips-gic: Use the correct local interrupt map registers
  irqchip/ti-sci-inta: Fix kernel crash if irq_create_fwspec_mapping fail
  irqchip/irq-csky-mpintc: Support auto irq deliver to all cpus

eed7d30e

Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a7211bc9

由 Linus Torvalds 提交于 6月 29, 2019

Pull EFI fixes from Ingo Molnar:
 "Four fixes:
   - fix a kexec crash on arm64
   - fix a reboot crash on some Android platforms
   - future-proof the code for upcoming ACPI 6.2 changes
   - fix a build warning on x86"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efibc: Replace variable set function in notifier call
  x86/efi: fix a -Wtype-limits compilation warning
  efi/bgrt: Drop BGRT status field reserved bits check
  efi/memreserve: deal with memreserve entries in unmapped memory

a7211bc9

Merge tag 'pm-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 2407e486

由 Linus Torvalds 提交于 6月 29, 2019

Pull power management fix from Rafael Wysocki:
 "Avoid skipping bus-level PCI power management during system resume for
  PCIe ports left in D0 during the preceding suspend transition on
  platforms where the power states of those ports can change out of the
  PCI layer's control"

* tag 'pm-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PCI: PM: Avoid skipping bus-level PM on platforms without ACPI

2407e486

Merge tag 'xarray-5.2-rc6' of git://git.infradead.org/users/willy/linux-dax · 01305db8

由 Linus Torvalds 提交于 6月 29, 2019

Pull XArray fixes from Matthew Wilcox:

 - Account XArray nodes for the page cache to the appropriate cgroup
   (Johannes Weiner)

 - Fix idr_get_next() when called under the RCU lock (Matthew Wilcox)

 - Add a test for xa_insert() (Matthew Wilcox)

* tag 'xarray-5.2-rc6' of git://git.infradead.org/users/willy/linux-dax:
  XArray tests: Add check_insert
  idr: Fix idr_get_next race with idr_remove
  mm: fix page cache convergence regression

01305db8

Merge branch 'akpm' (patches from Andrew) · 0839c537

由 Linus Torvalds 提交于 6月 29, 2019

Merge misc fixes from Andrew Morton:
 "15 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  linux/kernel.h: fix overflow for DIV_ROUND_UP_ULL
  mm, swap: fix THP swap out
  fork,memcg: alloc_thread_stack_node needs to set tsk->stack
  MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info
  mm/vmalloc.c: avoid bogus -Wmaybe-uninitialized warning
  mm/page_idle.c: fix oops because end_pfn is larger than max_pfn
  initramfs: fix populate_initrd_image() section mismatch
  mm/oom_kill.c: fix uninitialized oc->constraint
  mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge
  mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails
  signal: remove the wrong signal_pending() check in restore_user_sigmask()
  fs/binfmt_flat.c: make load_flat_shared_library() work
  mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask
  fs/proc/array.c: allow reporting eip/esp for all coredumping threads
  mm/dev_pfn: exclude MEMORY_DEVICE_PRIVATE while computing virtual address

0839c537

Merge tag 'arc-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · f8b5c722

由 Linus Torvalds 提交于 6月 29, 2019

Pull ARC fixes from Vineet Gupta:

 - hsdk platform unifying apertures

 - build system CROSS_COMPILE prefix

* tag 'arc-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: [plat-hsdk]: unify memory apertures configuration
  ARC: build: Try to guess CROSS_COMPILE with cc-cross-prefix

f8b5c722

Merge tag 'riscv-for-v5.2/fixes-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · c57582ad

由 Linus Torvalds 提交于 6月 29, 2019

Pull RISC-V fixes from Paul Walmsley:
 "Minor RISC-V fixes and one defconfig update.

  The fixes have no functional impact:

   - Fix some comment text in the memory management vmalloc_fault path.

   - Fix some warnings from the DT compiler in our newly-added DT files.

   - Change the newly-added DT bindings such that SoC IP blocks with
     external I/O are marked as "disabled" by default, then enable them
     explicitly in board DT files when the devices are used on the
     board. This aligns the bindings with existing upstream practice.

   - Add the MIT license as an option for a minor header file, at the
     request of one of the U-Boot maintainers.

  The RISC-V defconfig update builds the SiFive SPI driver and the
  MMC-SPI driver by default. The intention here is to make v5.2 more
  usable for testers and users with RISC-V hardware"

* tag 'riscv-for-v5.2/fixes-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: mm: Fix code comment
  dt-bindings: clock: sifive: add MIT license as an option for the header file
  dt-bindings: riscv: resolve 'make dt_binding_check' warnings
  riscv: dts: Re-organize the DT nodes
  RISC-V: defconfig: enable MMC & SPI for RISC-V

c57582ad

Merge tag 'nfs-for-5.2-4' of git://git.linux-nfs.org/projects/anna/linux-nfs · c949c30b

由 Linus Torvalds 提交于 6月 29, 2019

Pull two more NFS client fixes from Anna Schumaker:
 "These are both stable fixes.

  One to calculate the correct client message length in the case of
  partial transmissions. And the other to set the proper TCP timeout for
  flexfiles"

* tag 'nfs-for-5.2-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O
  SUNRPC: Fix up calculation of client message length

c949c30b

Merge tag 'ceph-for-5.2-rc7' of git://github.com/ceph/ceph-client · 43251dbd

由 Linus Torvalds 提交于 6月 29, 2019

Pull ceph fix from Ilya Dryomov:
 "A small fix for a potential -rc1 regression from Jeff"

* tag 'ceph-for-5.2-rc7' of git://github.com/ceph/ceph-client:
  ceph: fix ceph_mdsc_build_path to not stop on first component

43251dbd

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 5b607ba8

由 Linus Torvalds 提交于 6月 29, 2019

Pull SCSI fix from James Bottomley:
 "One simple fix for a driver use after free"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck()

5b607ba8

Merge tag 'for-linus-20190628' of git://git.kernel.dk/linux-block · 9dda12b6

由 Linus Torvalds 提交于 6月 29, 2019

Pull block fixes from Jens Axboe:
 "Just two small fixes.

  One from Paolo, fixing a silly mistake in BFQ. The other one is from
  me, ensuring that we have ->file cleared in the io_uring request a bit
  earlier. That avoids a use-before-free, if we encounter an error
  before ->file is assigned"

* tag 'for-linus-20190628' of git://git.kernel.dk/linux-block:
  block, bfq: fix operator in BFQQ_TOTALLY_SEEKY
  io_uring: ensure req->file is cleared on allocation

9dda12b6

Merge tag 'pinctrl-v5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 06191371

由 Linus Torvalds 提交于 6月 29, 2019

Pull pin control fixes from Linus Walleij:
 "Sorry to bomb in fixes this late. Maybe I can comfort you by saying it
  is only driver fixes, and mostly IRQ handling which is something GPIO
  and pin control drivers never get right. You think it works and then
  it doesn't.

  Summary:

   - Fix IRQ setup in the MCP23s08.

   - Fix pin setup on pins > 31 in the Ocelot driver.

   - Fix IRQs in the Mediatek driver"

* tag 'pinctrl-v5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: mediatek: Update cur_mask in mask/mask ops
  pinctrl: mediatek: Ignore interrupts that are wake only during resume
  pinctrl: ocelot: fix pinmuxing for pins after 31
  pinctrl: ocelot: fix gpio direction for pins after 31
  pinctrl: mcp23s08: Fix add_data and irqchip_add_nested call order

06191371

linux/kernel.h: fix overflow for DIV_ROUND_UP_ULL · 8f9fab48

由 Vinod Koul 提交于 6月 28, 2019

DIV_ROUND_UP_ULL adds the two arguments and then invokes
DIV_ROUND_DOWN_ULL.  But on a 32bit system the addition of two 32 bit
values can overflow.  DIV_ROUND_DOWN_ULL does it correctly and stashes
the addition into a unsigned long long so cast the result to unsigned
long long here to avoid the overflow condition.

[akpm@linux-foundation.org: DIV_ROUND_UP_ULL must be an rval]
Link: http://lkml.kernel.org/r/20190625100518.30753-1-vkoul@kernel.orgSigned-off-by: NVinod Koul <vkoul@kernel.org>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8f9fab48

mm, swap: fix THP swap out · 1a5f439c

由 Huang Ying 提交于 6月 28, 2019

0-Day test system reported some OOM regressions for several THP
(Transparent Huge Page) swap test cases.  These regressions are bisected
to 68614289 ("block: always define BIO_MAX_PAGES as 256").  In the
commit, BIO_MAX_PAGES is set to 256 even when THP swap is enabled.  So the
bio_alloc(gfp_flags, 512) in get_swap_bio() may fail when swapping out
THP.  That causes the OOM.

As in the patch description of 68614289 ("block: always define
BIO_MAX_PAGES as 256"), THP swap should use multi-page bvec to write THP
to swap space.  So the issue is fixed via doing that in get_swap_bio().

BTW: I remember I have checked the THP swap code when 68614289
("block: always define BIO_MAX_PAGES as 256") was merged, and thought the
THP swap code needn't to be changed.  But apparently, I was wrong.  I
should have done this at that time.

Link: http://lkml.kernel.org/r/20190624075515.31040-1-ying.huang@intel.com
Fixes: 68614289 ("block: always define BIO_MAX_PAGES as 256")
Signed-off-by: N"Huang, Ying" <ying.huang@intel.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1a5f439c

fork,memcg: alloc_thread_stack_node needs to set tsk->stack · 1bf4580e

由 Andrea Arcangeli 提交于 6月 28, 2019

Commit 5eed6f1d ("fork,memcg: fix crash in free_thread_stack on
memcg charge fail") corrected two instances, but there was a third
instance of this bug.

Without setting tsk->stack, if memcg_charge_kernel_stack fails, it'll
execute free_thread_stack() on a dangling pointer.

Enterprise kernels are compiled with VMAP_STACK=y so this isn't
critical, but custom VMAP_STACK=n builds should have some performance
advantage, with the drawback of risking to fail fork because compaction
didn't succeed.  So as long as VMAP_STACK=n is a supported option it's
worth fixing it upstream.

Link: http://lkml.kernel.org/r/20190619011450.28048-1-aarcange@redhat.com
Fixes: 9b6f7e16 ("mm: rework memcg kernel stack accounting")
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Reviewed-by: NRik van Riel <riel@surriel.com>
Acked-by: NRoman Gushchin <guro@fb.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1bf4580e

MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info · 8708e13c

由 Nick Desaulniers 提交于 6月 28, 2019

Add keyword support so that our mailing list gets cc'ed for clang/llvm
patches. We're pretty active on our mailing list so far as code review.
There are numerous Googlers like myself that are paid to support
building the Linux kernel with Clang and LLVM.

Link: http://lkml.kernel.org/r/20190620001907.255803-1-ndesaulniers@google.comSigned-off-by: NNick Desaulniers <ndesaulniers@google.com>
Reviewed-by: NNathan Chancellor <natechancellor@gmail.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8708e13c

mm/vmalloc.c: avoid bogus -Wmaybe-uninitialized warning · 2c929233

由 Arnd Bergmann 提交于 6月 28, 2019

gcc gets confused in pcpu_get_vm_areas() because there are too many
branches that affect whether 'lva' was initialized before it gets used:

  mm/vmalloc.c: In function 'pcpu_get_vm_areas':
  mm/vmalloc.c:991:4: error: 'lva' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      insert_vmap_area_augment(lva, &va->rb_node,
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       &free_vmap_area_root, &free_vmap_area_list);
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  mm/vmalloc.c:916:20: note: 'lva' was declared here
    struct vmap_area *lva;
                      ^~~

Add an intialization to NULL, and check whether this has changed before
the first use.

[akpm@linux-foundation.org: tweak comments]
Link: http://lkml.kernel.org/r/20190618092650.2943749-1-arnd@arndb.de
Fixes: 68ad4a33 ("mm/vmalloc.c: keep track of free blocks for vmap allocation")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Reviewed-by: NUladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Joel Fernandes <joelaf@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2c929233

mm/page_idle.c: fix oops because end_pfn is larger than max_pfn · 7298e3b0

由 Colin Ian King 提交于 6月 28, 2019

Currently the calcuation of end_pfn can round up the pfn number to more
than the actual maximum number of pfns, causing an Oops.  Fix this by
ensuring end_pfn is never more than max_pfn.

This can be easily triggered when on systems where the end_pfn gets
rounded up to more than max_pfn using the idle-page stress-ng stress test:

sudo stress-ng --idle-page 0

  BUG: unable to handle kernel paging request at 00000000000020d8
  #PF error: [normal kernel read fault]
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 1 PID: 11039 Comm: stress-ng-idle- Not tainted 5.0.0-5-generic #6-Ubuntu
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
  RIP: 0010:page_idle_get_page+0xc8/0x1a0
  Code: 0f b1 0a 75 7d 48 8b 03 48 89 c2 48 c1 e8 33 83 e0 07 48 c1 ea 36 48 8d 0c 40 4c 8d 24 88 49 c1 e4 07 4c 03 24 d5 00 89 c3 be <49> 8b 44 24 58 48 8d b8 80 a1 02 00 e8 07 d5 77 00 48 8b 53 08 48
  RSP: 0018:ffffafd7c672fde8 EFLAGS: 00010202
  RAX: 0000000000000005 RBX: ffffe36341fff700 RCX: 000000000000000f
  RDX: 0000000000000284 RSI: 0000000000000275 RDI: 0000000001fff700
  RBP: ffffafd7c672fe00 R08: ffffa0bc34056410 R09: 0000000000000276
  R10: ffffa0bc754e9b40 R11: ffffa0bc330f6400 R12: 0000000000002080
  R13: ffffe36341fff700 R14: 0000000000080000 R15: ffffa0bc330f6400
  FS: 00007f0ec1ea5740(0000) GS:ffffa0bc7db00000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000020d8 CR3: 0000000077d68000 CR4: 00000000000006e0
  Call Trace:
    page_idle_bitmap_write+0x8c/0x140
    sysfs_kf_bin_write+0x5c/0x70
    kernfs_fop_write+0x12e/0x1b0
    __vfs_write+0x1b/0x40
    vfs_write+0xab/0x1b0
    ksys_write+0x55/0xc0
    __x64_sys_write+0x1a/0x20
    do_syscall_64+0x5a/0x110
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Link: http://lkml.kernel.org/r/20190618124352.28307-1-colin.king@canonical.com
Fixes: 33c3fc71 ("mm: introduce idle page tracking")
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Acked-by: NVladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7298e3b0

initramfs: fix populate_initrd_image() section mismatch · 4ada1e81

由 Geert Uytterhoeven 提交于 6月 28, 2019

With gcc-4.6.3:

WARNING: vmlinux.o(.text.unlikely+0x140): Section mismatch in reference from the function populate_initrd_image() to the variable .init.ramfs.info:__initramfs_size
The function populate_initrd_image() references
the variable __init __initramfs_size.
This is often because populate_initrd_image lacks a __init
annotation or the annotation of __initramfs_size is wrong.

WARNING: vmlinux.o(.text.unlikely+0x14c): Section mismatch in reference from the function populate_initrd_image() to the function .init.text:unpack_to_rootfs()
The function populate_initrd_image() references
the function __init unpack_to_rootfs().
This is often because populate_initrd_image lacks a __init
annotation or the annotation of unpack_to_rootfs is wrong.

WARNING: vmlinux.o(.text.unlikely+0x198): Section mismatch in reference from the function populate_initrd_image() to the function .init.text:xwrite()
The function populate_initrd_image() references
the function __init xwrite().
This is often because populate_initrd_image lacks a __init
annotation or the annotation of xwrite is wrong.

Indeed, if the compiler decides not to inline populate_initrd_image(), a
warning is generated.

Fix this by adding the missing __init annotations.

Link: http://lkml.kernel.org/r/20190617074340.12779-1-geert@linux-m68k.org
Fixes: 7c184ecd ("initramfs: factor out a helper to populate the initrd image")
Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4ada1e81

mm/oom_kill.c: fix uninitialized oc->constraint · 432b1de0

由 Yafang Shao 提交于 6月 28, 2019

In dump_oom_summary() oc->constraint is used to show oom_constraint_text,
but it hasn't been set before. So the value of it is always the default
value 0. We should inititialize it before.

Bellow is the output when memcg oom occurs,

before this patch:
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null), cpuset=/,mems_allowed=0,oom_memcg=/foo,task_memcg=/foo,task=bash,pid=7997,uid=0

after this patch:
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null), cpuset=/,mems_allowed=0,oom_memcg=/foo,task_memcg=/foo,task=bash,pid=13681,uid=0

Link: http://lkml.kernel.org/r/1560522038-15879-1-git-send-email-laoar.shao@gmail.com
Fixes: ef8444ea ("mm, oom: reorganize the oom report in dump_header")
Signed-off-by: NYafang Shao <laoar.shao@gmail.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Wind Yu <yuzhoujian@didichuxing.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

432b1de0

mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge · faf53def

由 Naoya Horiguchi 提交于 6月 28, 2019

madvise(MADV_SOFT_OFFLINE) often returns -EBUSY when calling soft offline
for hugepages with overcommitting enabled.  That was caused by the
suboptimal code in current soft-offline code.  See the following part:

    ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
                            MIGRATE_SYNC, MR_MEMORY_FAILURE);
    if (ret) {
            ...
    } else {
            /*
             * We set PG_hwpoison only when the migration source hugepage
             * was successfully dissolved, because otherwise hwpoisoned
             * hugepage remains on free hugepage list, then userspace will
             * find it as SIGBUS by allocation failure. That's not expected
             * in soft-offlining.
             */
            ret = dissolve_free_huge_page(page);
            if (!ret) {
                    if (set_hwpoison_free_buddy_page(page))
                            num_poisoned_pages_inc();
            }
    }
    return ret;

Here dissolve_free_huge_page() returns -EBUSY if the migration source page
was freed into buddy in migrate_pages(), but even in that case we actually
has a chance that set_hwpoison_free_buddy_page() succeeds.  So that means
current code gives up offlining too early now.

dissolve_free_huge_page() checks that a given hugepage is suitable for
dissolving, where we should return success for !PageHuge() case because
the given hugepage is considered as already dissolved.

This change also affects other callers of dissolve_free_huge_page(), which
are cleaned up together.

[n-horiguchi@ah.jp.nec.com: v3]
  Link: http://lkml.kernel.org/r/1560761476-4651-3-git-send-email-n-horiguchi@ah.jp.nec.comLink: http://lkml.kernel.org/r/1560154686-18497-3-git-send-email-n-horiguchi@ah.jp.nec.com
Fixes: 6bc9b564 ("mm: fix race on soft-offlining")
Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: NChen, Jerry T <jerry.t.chen@intel.com>
Tested-by: NChen, Jerry T <jerry.t.chen@intel.com>
Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: NOscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com>
Cc: "Chen, Jerry T" <jerry.t.chen@intel.com>
Cc: "Zhuo, Qiuxu" <qiuxu.zhuo@intel.com>
Cc: <stable@vger.kernel.org>	[4.19+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

faf53def

mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails · b38e5962

由 Naoya Horiguchi 提交于 6月 28, 2019

The pass/fail of soft offline should be judged by checking whether the
raw error page was finally contained or not (i.e.  the result of
set_hwpoison_free_buddy_page()), but current code do not work like
that.  It might lead us to misjudge the test result when
set_hwpoison_free_buddy_page() fails.

Without this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may
not offline the original page and will not return an error.

Link: http://lkml.kernel.org/r/1560154686-18497-2-git-send-email-n-horiguchi@ah.jp.nec.comSigned-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Fixes: 6bc9b564 ("mm: fix race on soft-offlining")
Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: NOscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com>
Cc: "Chen, Jerry T" <jerry.t.chen@intel.com>
Cc: "Zhuo, Qiuxu" <qiuxu.zhuo@intel.com>
Cc: <stable@vger.kernel.org>	[4.19+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b38e5962

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功