- 06 12月, 2018 5 次提交
-
-
由 Suzuki K Poulose 提交于
Make use of the sorted capability list to access the capability entry in this_cpu_has_cap() to avoid iterating over the two tables. Reviewed-by: NVladimir Murzin <vladimir.murzin@arm.com> Tested-by: NVladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Suzuki K Poulose 提交于
We maintain two separate tables of capabilities, errata and features, which decide the system capabilities. We iterate over each of these tables for various operations (e.g, detection, verification etc.). We do not have a way to map a system "capability" to its entry, (i.e, cap -> struct arm64_cpu_capabilities) which is needed for this_cpu_has_cap(). So we iterate over the table one by one to find the entry and then do the operation. Also, this prevents us from optimizing the way we "enable" the capabilities on the CPUs, where we now issue a stop_machine() for each available capability. One solution is to merge the two tables into a single table, sorted by the capability. But this is has the following disadvantages: - We loose the "classification" of an errata vs. feature - It is quite easy to make a mistake when adding an entry, unless we sort the table at runtime. So we maintain a list of pointers to the capability entry, sorted by the "cap number" in a separate array, initialized at boot time. The only restriction is that we can have one "entry" per capability. While at it, remove the duplicate declaration of arm64_errata table. Reviewed-by: NVladimir Murzin <vladimir.murzin@arm.com> Tested-by: NVladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Suzuki K Poulose 提交于
Remove duplicate entries for Qualcomm erratum 1003. Since the entries are not purely based on generic MIDR checks, use the multi_cap_entry type to merge the entries. Cc: Christopher Covington <cov@codeaurora.org> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: NVladimir Murzin <vladimir.murzin@arm.com> Tested-by: NVladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Suzuki K Poulose 提交于
Merge duplicate entries for a single capability using the midr range list for Cavium errata 30115 and 27456. Cc: Andrew Pinski <apinski@cavium.com> Cc: David Daney <david.daney@cavium.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: NVladimir Murzin <vladimir.murzin@arm.com> Tested-by: NVladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Suzuki K Poulose 提交于
We have two entries for ARM64_WORKAROUND_CLEAN_CACHE capability : 1) ARM Errata 826319, 827319, 824069, 819472 on A53 r0p[012] 2) ARM Errata 819472 on A53 r0p[01] Both have the same work around. Merge these entries to avoid duplicate entries for a single capability. Add a new Kconfig entry to control the "capability" entry to make it easier to handle combinations of the CONFIGs. Cc: Will Deacon <will.deacon@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 04 12月, 2018 1 次提交
-
-
由 Ard Biesheuvel 提交于
readelf complains about the section layout of vmlinux when building with CONFIG_RELOCATABLE=y (for KASLR): readelf: Warning: [21]: Link field (0) should index a symtab section. readelf: Warning: [21]: Info field (0) should index a relocatable section. Also, it seems that our use of '-pie -shared' is contradictory, and thus ambiguous. In general, the way KASLR is wired up at the moment is highly tailored to how ld.bfd happens to implement (and conflate) PIE executables and shared libraries, so given the current effort to support other toolchains, let's fix some of these issues as well. - Drop the -pie linker argument and just leave -shared. In ld.bfd, the differences between them are unclear (except for the ELF type of the produced image [0]) but lld chokes on seeing both at the same time. - Rename the .rela output section to .rela.dyn, as is customary for shared libraries and PIE executables, so that it is not misidentified by readelf as a static relocation section (producing the warnings above). - Pass the -z notext and -z norelro options to explicitly instruct the linker to permit text relocations, and to omit the RELRO program header (which requires a certain section layout that we don't adhere to in the kernel). These are the defaults for current versions of ld.bfd. - Discard .eh_frame and .gnu.hash sections to avoid them from being emitted between .head.text and .text, screwing up the section layout. These changes only affect the ELF image, and produce the same binary image. [0] b9dce7f1 ("arm64: kernel: force ET_DYN ELF type for ...") Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Smith <peter.smith@linaro.org> Tested-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 30 11月, 2018 9 次提交
-
-
由 Ard Biesheuvel 提交于
Improve the performance of the crc32() asm routines by getting rid of most of the branches and small sized loads on the common path. Instead, use a branchless code path involving overlapping 16 byte loads to process the first (length % 32) bytes, and process the remainder using a loop that processes 32 bytes at a time. Tested using the following test program: #include <stdlib.h> extern void crc32_le(unsigned short, char const*, int); int main(void) { static const char buf[4096]; srand(20181126); for (int i = 0; i < 100 * 1000 * 1000; i++) crc32_le(0, buf, rand() % 1024); return 0; } On Cortex-A53 and Cortex-A57, the performance regresses but only very slightly. On Cortex-A72 however, the performance improves from $ time ./crc32 real 0m10.149s user 0m10.149s sys 0m0.000s to $ time ./crc32 real 0m7.915s user 0m7.915s sys 0m0.000s Cc: Rui Sun <sunrui26@huawei.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
The core ftrace hooks take the instrumented PC in x0, but for some reason arm64's prepare_ftrace_return() takes this in x1. For consistency, let's flip the argument order and always pass the instrumented PC in x0. There should be no functional change as a result of this patch. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
The save_return_regs and restore_return_regs macros are only used by return_to_handler, and having them defined out-of-line only serves to obscure the logic. Before we complicate, let's clean this up and fold the logic directly into return_to_handler, saving a few lines of macro boilerplate in the process. At the same time, a missing trailing space is added to the comments, fixing a code style violation. There should be no functional change as a result of this patch. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
The core ftrace code requires that when it is handed the PC of an instrumented function, this PC is the address of the instrumented instruction. This is necessary so that the core ftrace code can identify the specific instrumentation site. Since the instrumented function will be a BL, the address of the instrumented function is LR - 4 at entry to the ftrace code. This fixup is applied in the mcount_get_pc and mcount_get_pc0 helpers, which acquire the PC of the instrumented function. The mcount_get_lr helper is used to acquire the LR of the instrumented function, whose value does not require this adjustment, and cannot be adjusted to anything meaningful. No adjustment of this value is made on other architectures, including arm. However, arm64 adjusts this value by 4. This patch brings arm64 in line with other architectures and removes the adjustment of the LR value. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
The core frace code has an optional sanity check on the frame pointer passed by ftrace_graph_caller and return_to_handler. This is cheap, useful, and enabled unconditionally on x86, sparc, and riscv. Let's do the same on arm64, so that we can catch any problems early. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
The global exports of ftrace_call and ftrace_graph_call are somewhat painful to read. Let's use the generic GLOBAL() macro to ameliorate matters. There should be no functional change as a result of this patch. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
Declaring a global symbol in assembly is tedious, error-prone, and painful to read. While ENTRY() exists, this is supposed to be used for function entry points, and this affects alignment in a potentially undesireable manner. Instead, let's add a generic GLOBAL() macro for this, as x86 added locally in commit: 95695547 ("x86: asm linkage - introduce GLOBAL macro") ... thus allowing us to use this more freely in the kernel. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Ard Biesheuvel 提交于
Commit 1212f7a1 ("scripts/kallsyms: filter arm64's __efistub_ symbols") updated the kallsyms code to filter out symbols with the __efistub_ prefix explicitly, so we no longer require the hack in our linker script to emit them as absolute symbols. Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Will Deacon 提交于
As of commit 6460d320 ("arm64: io: Ensure calls to delay routines are ordered against prior readX()"), MMIO reads smaller than 64 bits fail to compile under clang because we end up mixing 32-bit and 64-bit register operands for the same data processing instruction: ./include/asm-generic/io.h:695:9: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths] return readb(addr); ^ ./arch/arm64/include/asm/io.h:147:58: note: expanded from macro 'readb' ^ ./include/asm-generic/io.h:695:9: note: use constraint modifier "w" ./arch/arm64/include/asm/io.h:147:50: note: expanded from macro 'readb' ^ ./arch/arm64/include/asm/io.h:118:24: note: expanded from macro '__iormb' asm volatile("eor %0, %1, %1\n" \ ^ Fix the build by casting the macro argument to 'unsigned long' when used as an input to the inline asm. Reported-by: NNick Desaulniers <nick.desaulniers@gmail.com> Reported-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 28 11月, 2018 5 次提交
-
-
由 Will Deacon 提交于
In order to reduce the possibility of soft lock-ups, we bound the maximum number of TLBI operations performed by a single call to flush_tlb_range() to an arbitrary constant of 1024. Whilst this does the job of avoiding lock-ups, we can actually be a bit smarter by defining this as PTRS_PER_PTE. Due to the structure of our page tables, using PTRS_PER_PTE means that an outer loop calling flush_tlb_range() for entire table entries will end up performing just a single TLBI operation for each entry. As an example, mremap()ing a 1GB range mapped using 4k pages now requires only 512 TLBI operations when moving the page tables as opposed to 262144 operations (512*512) when using the current threshold of 1024. Cc: Joel Fernandes <joel@joelfernandes.org> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Ard Biesheuvel 提交于
Now that we have switched to the small code model entirely, and reduced the extended KASLR range to 4 GB, we can be sure that the targets of relative branches that are out of range are in range for a ADRP/ADD pair, which is one instruction shorter than our current MOVN/MOVK/MOVK sequence, and is more idiomatic and so it is more likely to be implemented efficiently by micro-architectures. So switch over the ordinary PLT code and the special handling of the Cortex-A53 ADRP errata, as well as the ftrace trampline handling. Reviewed-by: NTorsten Duwe <duwe@lst.de> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> [will: Added a couple of comments in the plt equality check] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Ard Biesheuvel 提交于
Add support for emitting ADR and ADRP instructions so we can switch over our PLT generation code in a subsequent patch. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 James Morse 提交于
__install_bp_hardening_cb() is called via stop_machine() as part of the cpu_enable callback. To force each CPU to take its turn when allocating slots, they take a spinlock. With the RT patches applied, the spinlock becomes a mutex, and we get warnings about sleeping while in stop_machine(): | [ 0.319176] CPU features: detected: RAS Extension Support | [ 0.319950] BUG: scheduling while atomic: migration/3/36/0x00000002 | [ 0.319955] Modules linked in: | [ 0.319958] Preemption disabled at: | [ 0.319969] [<ffff000008181ae4>] cpu_stopper_thread+0x7c/0x108 | [ 0.319973] CPU: 3 PID: 36 Comm: migration/3 Not tainted 4.19.1-rt3-00250-g330fc2c2a880 #2 | [ 0.319975] Hardware name: linux,dummy-virt (DT) | [ 0.319976] Call trace: | [ 0.319981] dump_backtrace+0x0/0x148 | [ 0.319983] show_stack+0x14/0x20 | [ 0.319987] dump_stack+0x80/0xa4 | [ 0.319989] __schedule_bug+0x94/0xb0 | [ 0.319991] __schedule+0x510/0x560 | [ 0.319992] schedule+0x38/0xe8 | [ 0.319994] rt_spin_lock_slowlock_locked+0xf0/0x278 | [ 0.319996] rt_spin_lock_slowlock+0x5c/0x90 | [ 0.319998] rt_spin_lock+0x54/0x58 | [ 0.320000] enable_smccc_arch_workaround_1+0xdc/0x260 | [ 0.320001] __enable_cpu_capability+0x10/0x20 | [ 0.320003] multi_cpu_stop+0x84/0x108 | [ 0.320004] cpu_stopper_thread+0x84/0x108 | [ 0.320008] smpboot_thread_fn+0x1e8/0x2b0 | [ 0.320009] kthread+0x124/0x128 | [ 0.320010] ret_from_fork+0x10/0x18 Switch this to a raw spinlock, as we know this is only called with IRQs masked. Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Jeremy Linton 提交于
The BAD_MADT_GICC_ENTRY check is a little too strict because it rejects MADT entries that don't match the currently known lengths. We should remove this restriction to avoid problems if the table length changes. Future code which might depend on additional fields should be written to validate those fields before using them, rather than trying to globally check known MADT version lengths. Link: https://lkml.kernel.org/r/20181012192937.3819951-1-jeremy.linton@arm.comSigned-off-by: NJeremy Linton <jeremy.linton@arm.com> [lorenzo.pieralisi@arm.com: added MADT macro comments] Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: NSudeep Holla <sudeep.holla@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Al Stone <ahs3@redhat.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 27 11月, 2018 2 次提交
-
-
由 Will Deacon 提交于
A relatively standard idiom for ensuring that a pair of MMIO writes to a device arrive at that device with a specified minimum delay between them is as follows: writel_relaxed(42, dev_base + CTL1); readl(dev_base + CTL1); udelay(10); writel_relaxed(42, dev_base + CTL2); the intention being that the read-back from the device will push the prior write to CTL1, and the udelay will hold up the write to CTL1 until at least 10us have elapsed. Unfortunately, on arm64 where the underlying delay loop is implemented as a read of the architected counter, the CPU does not guarantee ordering from the readl() to the delay loop and therefore the delay loop could in theory be speculated and not provide the desired interval between the two writes. Fix this in a similar manner to PowerPC by introducing a dummy control dependency on the output of readX() which, combined with the ISB in the read of the architected counter, guarantees that a subsequent delay loop can not be executed until the readX() has returned its result. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Alex Van Brunt 提交于
When transitioning a PTE from young to old as part of page aging, we can avoid waiting for the TLB invalidation to complete and therefore drop the subsequent DSB instruction. Whilst this opens up a race with page reclaim, where a PTE in active use via a stale, young TLB entry does not update the underlying descriptor, the worst thing that happens is that the page is reclaimed and then immediately faulted back in. Given that we have a DSB in our context-switch path, the window for a spurious reclaim is fairly limited and eliding the barrier claims to boost NVMe/SSD accesses by over 10% on some platforms. A similar optimisation was made for x86 in commit b13b1d2d ("x86/mm: In the PTE swapout page reclaim case clear the accessed bit instead of flushing the TLB"). Signed-off-by: NAlex Van Brunt <avanbrunt@nvidia.com> Signed-off-by: NAshish Mhetre <amhetre@nvidia.com> [will: rewrote patch] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 20 11月, 2018 3 次提交
-
-
由 Jessica Yu 提交于
Instead of saving a pointer to the .plt and .init.plt sections to apply plt-based relocations, save and use their section indices instead. The mod->arch.{core,init}.plt pointers were problematic for livepatch because they pointed within temporary section headers (provided by the module loader via info->sechdrs) that would be freed after module load. Since livepatch modules may need to apply relocations post-module-load (for example, to patch a module that is loaded later), using section indices to offset into the section headers (instead of accessing them through a saved pointer) allows livepatch modules on arm64 to pass in their own copy of the section headers to apply_relocate_add() to apply delayed relocations. Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Signed-off-by: NJessica Yu <jeyu@kernel.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Ard Biesheuvel 提交于
On arm64, we use block mappings and contiguous hints to map the linear region, to minimize the TLB footprint. However, this means that the entire region is mapped using read/write permissions, which we cannot modify at page granularity without having to take intrusive measures to prevent TLB conflicts. This means the linear aliases of pages belonging to read-only mappings (executable or otherwise) in the vmalloc region are also mapped read/write, and could potentially be abused to modify things like module code, bpf JIT code or other read-only data. So let's fix this, by extending the set_memory_ro/rw routines to take the linear alias into account. The consequence of enabling this is that we can no longer use block mappings or contiguous hints, so in cases where the TLB footprint of the linear region is a bottleneck, performance may be affected. Therefore, allow this feature to be runtime en/disabled, by setting rodata=full (or 'on' to disable just this enhancement, or 'off' to disable read-only mappings for code and r/o data entirely) on the kernel command line. Also, allow the default value to be set via a Kconfig option. Tested-by: NLaura Abbott <labbott@redhat.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Ard Biesheuvel 提交于
Call vm_unmap_aliases() every time we apply any changes to permission attributes of mappings in the vmalloc region. This avoids any potential issues resulting from lingering writable or executable aliases of mappings that should be read-only or non-executable, respectively. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 19 11月, 2018 15 次提交
-
-
由 Linus Torvalds 提交于
-
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm由 Linus Torvalds 提交于
Pull libnvdimm fixes from Dan Williams: "A small batch of fixes for v4.20-rc3. The overflow continuation fix addresses something that has been broken for several releases. Arguably it could wait even longer, but it's a one line fix and this finishes the last of the known address range scrub bug reports. The revert addresses a lockdep regression. The unit tests are not critical to fix, but no reason to hold this fix back. Summary: - Address Range Scrub overflow continuation handling has been broken since it was initially merged. It was only recently that error injection and platform-BIOS support enabled this corner case to be exercised. - The recent attempt to provide more isolation for the kernel Address Range Scrub state machine from userapace initiated sessions triggers a lockdep report. Revert and try again at the next merge window. - Fix a kasan reported buffer overflow in libnvdimm unit test infrastrucutre (nfit_test)" * tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: Revert "acpi, nfit: Further restrict userspace ARS start requests" acpi, nfit: Fix ARS overflow continuation tools/testing/nvdimm: Fix the array size for dimm devices.
-
由 Linus Torvalds 提交于
Merge misc fixes from Andrew Morton: "16 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/memblock.c: fix a typo in __next_mem_pfn_range() comments mm, page_alloc: check for max order in hot path scripts/spdxcheck.py: make python3 compliant tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn mm/vmstat.c: fix NUMA statistics updates mm/gup.c: fix follow_page_mask() kerneldoc comment ocfs2: free up write context when direct IO failed scripts/faddr2line: fix location of start_kernel in comment mm: don't reclaim inodes with many attached pages mm, memory_hotplug: check zone_movable in has_unmovable_pages mm/swapfile.c: use kvzalloc for swap_info_struct allocation MAINTAINERS: update OMAP MMC entry hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444! kernel/sched/psi.c: simplify cgroup_move_task() z3fold: fix possible reclaim races
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull scheduler fix from Ingo Molnar: "Fix an exec() related scalability/performance regression, which was caused by incorrectly calculating load and migrating tasks on exec() when they shouldn't be" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix cpu_util_wake() for 'execl' type workloads
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull perf fixes from Ingo Molnar: "Fix uncore PMU enumeration for CofeeLake CPUs" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Support CoffeeLake 8th CBOX perf/x86/intel/uncore: Add more IMC PCI IDs for KabyLake and CoffeeLake CPUs
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull EFI fixes from Ingo Molnar: "Misc fixes: two warning splat fixes, a leak fix and persistent memory allocation fixes for ARM" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Permit calling efi_mem_reserve_persistent() from atomic context efi/arm: Defer persistent reservations until after paging_init() efi/arm/libstub: Pack FDT after populating it efi/arm: Revert deferred unmap of early memmap mapping efi: Fix debugobjects warning on 'efi_rts_work'
-
git://git.armlinux.org.uk/~rmk/linux-arm由 Linus Torvalds 提交于
Pull ARM spectre updates from Russell King: "These are the currently known final bits that resolve the Spectre issues. big.Little systems used to be sufficiently identical in that there were no differences between individual CPUs in the system that mattered to the kernel. With the advent of the Spectre problem, the CPUs now have differences in how the workaround is applied. As a result of previous Spectre patches, these systems ended up reporting quite a lot of: "CPUx: Spectre v2: incorrect context switching function, system vulnerable" messages due to the action of the big.Little switcher causing the CPUs to be re-initialised regularly. This series resolves that issue by making the CPU vtable unique to each CPU. However, since this is used very early, before per-cpu is setup, per-cpu can't be used. We also have a problem that two of the methods are not called from preempt-safe paths, but thankfully these remain identical between all CPUs in the system. To make sure, we validate that these are identical during boot" * 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: spectre-v2: per-CPU vtables to work around big.Little systems ARM: add PROC_VTABLE and PROC_TABLE macros ARM: clean up per-processor check_bugs method call ARM: split out processor lookup ARM: make lookup_processor_type() non-__init
-
由 Chen Chang 提交于
Link: http://lkml.kernel.org/r/20181107100247.13359-1-rainccrun@gmail.comSigned-off-by: NChen Chang <rainccrun@gmail.com> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NMike Rapoport <rppt@linux.ibm.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
Konstantin has noticed that kvmalloc might trigger the following warning: WARNING: CPU: 0 PID: 6676 at mm/vmstat.c:986 __fragmentation_index+0x54/0x60 [...] Call Trace: fragmentation_index+0x76/0x90 compaction_suitable+0x4f/0xf0 shrink_node+0x295/0x310 node_reclaim+0x205/0x250 get_page_from_freelist+0x649/0xad0 __alloc_pages_nodemask+0x12a/0x2a0 kmalloc_large_node+0x47/0x90 __kmalloc_node+0x22b/0x2e0 kvmalloc_node+0x3e/0x70 xt_alloc_table_info+0x3a/0x80 [x_tables] do_ip6t_set_ctl+0xcd/0x1c0 [ip6_tables] nf_setsockopt+0x44/0x60 SyS_setsockopt+0x6f/0xc0 do_syscall_64+0x67/0x120 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 the problem is that we only check for an out of bound order in the slow path and the node reclaim might happen from the fast path already. This is fixable by making sure that kvmalloc doesn't ever use kmalloc for requests that are larger than KMALLOC_MAX_SIZE but this also shows that the code is rather fragile. A recent UBSAN report just underlines that by the following report UBSAN: Undefined behaviour in mm/page_alloc.c:3117:19 shift exponent 51 is too large for 32-bit type 'int' CPU: 0 PID: 6520 Comm: syz-executor1 Not tainted 4.19.0-rc2 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xd2/0x148 lib/dump_stack.c:113 ubsan_epilogue+0x12/0x94 lib/ubsan.c:159 __ubsan_handle_shift_out_of_bounds+0x2b6/0x30b lib/ubsan.c:425 __zone_watermark_ok+0x2c7/0x400 mm/page_alloc.c:3117 zone_watermark_fast mm/page_alloc.c:3216 [inline] get_page_from_freelist+0xc49/0x44c0 mm/page_alloc.c:3300 __alloc_pages_nodemask+0x21e/0x640 mm/page_alloc.c:4370 alloc_pages_current+0xcc/0x210 mm/mempolicy.c:2093 alloc_pages include/linux/gfp.h:509 [inline] __get_free_pages+0x12/0x60 mm/page_alloc.c:4414 dma_mem_alloc+0x36/0x50 arch/x86/include/asm/floppy.h:156 raw_cmd_copyin drivers/block/floppy.c:3159 [inline] raw_cmd_ioctl drivers/block/floppy.c:3206 [inline] fd_locked_ioctl+0xa00/0x2c10 drivers/block/floppy.c:3544 fd_ioctl+0x40/0x60 drivers/block/floppy.c:3571 __blkdev_driver_ioctl block/ioctl.c:303 [inline] blkdev_ioctl+0xb3c/0x1a30 block/ioctl.c:601 block_ioctl+0x105/0x150 fs/block_dev.c:1883 vfs_ioctl fs/ioctl.c:46 [inline] do_vfs_ioctl+0x1c0/0x1150 fs/ioctl.c:687 ksys_ioctl+0x9e/0xb0 fs/ioctl.c:702 __do_sys_ioctl fs/ioctl.c:709 [inline] __se_sys_ioctl fs/ioctl.c:707 [inline] __x64_sys_ioctl+0x7e/0xc0 fs/ioctl.c:707 do_syscall_64+0xc4/0x510 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Note that this is not a kvmalloc path. It is just that the fast path really depends on having sanitzed order as well. Therefore move the order check to the fast path. Link: http://lkml.kernel.org/r/20181113094305.GM15120@dhcp22.suse.czSigned-off-by: NMichal Hocko <mhocko@suse.com> Reported-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Reported-by: NKyungtae Kim <kt0755@gmail.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Byoungyoung Lee <lifeasageek@gmail.com> Cc: "Dae R. Jeong" <threeearcat@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Uwe Kleine-König 提交于
Without this change the following happens when using Python3 (3.6.6): $ echo "GPL-2.0" | python3 scripts/spdxcheck.py - FAIL: 'str' object has no attribute 'decode' Traceback (most recent call last): File "scripts/spdxcheck.py", line 253, in <module> parser.parse_lines(sys.stdin, args.maxlines, '-') File "scripts/spdxcheck.py", line 171, in parse_lines line = line.decode(locale.getpreferredencoding(False), errors='ignore') AttributeError: 'str' object has no attribute 'decode' So as the line is already a string, there is no need to decode it and the line can be dropped. /usr/bin/python on Arch is Python 3. So this would indeed be worth going into 4.19. Link: http://lkml.kernel.org/r/20181023070802.22558-1-u.kleine-koenig@pengutronix.deSigned-off-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Joe Perches <joe@perches.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yufen Yu 提交于
Other filesystems such as ext4, f2fs and ubifs all return ENXIO when lseek (SEEK_DATA or SEEK_HOLE) requests a negative offset. man 2 lseek says : EINVAL whence is not valid. Or: the resulting file offset would be : negative, or beyond the end of a seekable device. : : ENXIO whence is SEEK_DATA or SEEK_HOLE, and the file offset is beyond : the end of the file. Make tmpfs return ENXIO under these circumstances as well. After this, tmpfs also passes xfstests's generic/448. [akpm@linux-foundation.org: rewrite changelog] Link: http://lkml.kernel.org/r/1540434176-14349-1-git-send-email-yuyufen@huawei.comSigned-off-by: NYufen Yu <yuyufen@huawei.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hugh Dickins <hughd@google.com> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Arnd Bergmann 提交于
gcc-8 complains about the prototype for this function: lib/ubsan.c:432:1: error: ignoring attribute 'noreturn' in declaration of a built-in function '__ubsan_handle_builtin_unreachable' because it conflicts with attribute 'const' [-Werror=attributes] This is actually a GCC's bug. In GCC internals __ubsan_handle_builtin_unreachable() declared with both 'noreturn' and 'const' attributes instead of only 'noreturn': https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84210 Workaround this by removing the noreturn attribute. [aryabinin: add information about GCC bug in changelog] Link: http://lkml.kernel.org/r/20181107144516.4587-1-aryabinin@virtuozzo.comSigned-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: NOlof Johansson <olof@lixom.net> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Janne Huttunen 提交于
Scan through the whole array to see if an update is needed. While we're at it, use sizeof() to be safe against any possible type changes in the future. The bug here is that we wouldn't sync per-cpu counters into global ones if there was an update of numa_stats for higher cpus. Highly theoretical one though because it is much more probable that zone_stats are updated so we would refresh anyway. So I wouldn't bother to mark this for stable, yet something nice to fix. [mhocko@suse.com: changelog enhancement] Link: http://lkml.kernel.org/r/1541601517-17282-1-git-send-email-janne.huttunen@nokia.com Fixes: 1d90ca89 ("mm: update NUMA counter threshold size") Signed-off-by: NJanne Huttunen <janne.huttunen@nokia.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Rapoport 提交于
Commit df06b37f ("mm/gup: cache dev_pagemap while pinning pages") modified the signature of follow_page_mask() but left the parameter description behind. Update the description to make the code and comments agree again. While at it, update formatting of the return value description to match Documentation/doc-guide/kernel-doc.rst guidelines. Link: http://lkml.kernel.org/r/1541603316-27832-1-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wengang Wang 提交于
The write context should also be freed even when direct IO failed. Otherwise a memory leak is introduced and entries remain in oi->ip_unwritten_list causing the following BUG later in unlink path: ERROR: bug expression: !list_empty(&oi->ip_unwritten_list) ERROR: Clear inode of 215043, inode has unwritten extents ... Call Trace: ? __set_current_blocked+0x42/0x68 ocfs2_evict_inode+0x91/0x6a0 [ocfs2] ? bit_waitqueue+0x40/0x33 evict+0xdb/0x1af iput+0x1a2/0x1f7 do_unlinkat+0x194/0x28f SyS_unlinkat+0x1b/0x2f do_syscall_64+0x79/0x1ae entry_SYSCALL_64_after_hwframe+0x151/0x0 This patch also logs, with frequency limit, direct IO failures. Link: http://lkml.kernel.org/r/20181102170632.25921-1-wen.gang.wang@oracle.comSigned-off-by: NWengang Wang <wen.gang.wang@oracle.com> Reviewed-by: NJunxiao Bi <junxiao.bi@oracle.com> Reviewed-by: NChangwei Ge <ge.changwei@h3c.com> Reviewed-by: NJoseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-