- 25 8月, 2020 1 次提交
-
-
由 Luke Hsiao 提交于
For TCP tx zero-copy, the kernel notifies the process of completions by queuing completion notifications on the socket error queue. This patch allows reading these notifications via recvmsg to support TCP tx zero-copy. Ancillary data was originally disallowed due to privilege escalation via io_uring's offloading of sendmsg() onto a kernel thread with kernel credentials (https://crbug.com/project-zero/1975). So, we must ensure that the socket type is one where the ancillary data types that are delivered on recvmsg are plain data (no file descriptors or values that are translated based on the identity of the calling process). This was tested by using io_uring to call recvmsg on the MSG_ERRQUEUE with tx zero-copy enabled. Before this patch, we received -EINVALID from this specific code path. After this patch, we could read tcp tx zero-copy completion notifications from the MSG_ERRQUEUE. Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NArjun Roy <arjunroy@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Reviewed-by: NJann Horn <jannh@google.com> Reviewed-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NLuke Hsiao <lukehsiao@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 8月, 2020 1 次提交
-
-
由 Tom Parkin 提交于
The l2tp subsystem now uses standard kernel logging APIs for informational and warning messages, and tracepoints for debug information. Now that the tunnel and session debug flags are unused, remove the field from the core structures. Various system calls (in the case of l2tp_ppp) and netlink messages handle the getting and setting of debug flags. To avoid userspace breakage don't modify the API of these calls; simply ignore set requests, and send dummy data for get requests. Signed-off-by: NTom Parkin <tparkin@katalix.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 8月, 2020 1 次提交
-
-
由 Tobias Klauser 提交于
Also remove trailing whitespaces in bpf_skb_get_tunnel_key example code. Signed-off-by: NTobias Klauser <tklauser@distanz.ch> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200821133642.18870-1-tklauser@distanz.ch
-
- 21 8月, 2020 1 次提交
-
-
由 Anup Patel 提交于
We add a separate CLINT timer driver for Linux RISC-V M-mode (i.e. RISC-V NoMMU kernel). The CLINT MMIO device provides three things: 1. 64bit free running counter register 2. 64bit per-CPU time compare registers 3. 32bit per-CPU inter-processor interrupt registers Unlike other timer devices, CLINT provides IPI registers along with timer registers. To use CLINT IPI registers, the CLINT timer driver provides IPI related callbacks to arch/riscv. Signed-off-by: NAnup Patel <anup.patel@wdc.com> Tested-by: NEmil Renner Berhing <kernel@esmil.dk> Acked-by: NDaniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: NAtish Patra <atish.patra@wdc.com> Reviewed-by: NPalmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: NPalmer Dabbelt <palmerdabbelt@google.com>
-
- 20 8月, 2020 4 次提交
-
-
由 Kurt Kanzenbach 提交于
The offset for the control field is not needed anymore. Remove it. Signed-off-by: NKurt Kanzenbach <kurt@linutronix.de> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kurt Kanzenbach 提交于
The message type is located at different offsets within the ptp header depending on the ptp version (v1 or v2). Therefore, drivers which also deal with ptp v1 have some code for it. Extract this into a helper function for drivers to be used. Signed-off-by: NKurt Kanzenbach <kurt@linutronix.de> Reviewed-by: NRichard Cochran <richardcochran@gmail.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kurt Kanzenbach 提交于
Reason: A lot of the ptp drivers - which implement hardware time stamping - need specific fields such as the sequence id from the ptp v2 header. Currently all drivers implement that themselves. Introduce a generic function to retrieve a pointer to the start of the ptp v2 header. Suggested-by: NRussell King <rmk+kernel@armlinux.org.uk> Signed-off-by: NKurt Kanzenbach <kurt@linutronix.de> Reviewed-by: NRichard Cochran <richardcochran@gmail.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Reviewed-by: NRussell King <rmk+kernel@armlinux.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 brookxu 提交于
In the scenario of writing sparse files, the per-inode prealloc list may be very long, resulting in high overhead for ext4_mb_use_preallocated(). To circumvent this problem, we limit the maximum length of per-inode prealloc list to 512 and allow users to modify it. After patching, we observed that the sys ratio of cpu has dropped, and the system throughput has increased significantly. We created a process to write the sparse file, and the running time of the process on the fixed kernel was significantly reduced, as follows: Running time on unfixed kernel: [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat real 0m2.051s user 0m0.008s sys 0m2.026s Running time on fixed kernel: [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat real 0m0.471s user 0m0.004s sys 0m0.395s Signed-off-by: NChunguang Xu <brookxu@tencent.com> Link: https://lore.kernel.org/r/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 19 8月, 2020 2 次提交
-
-
由 Xin Long 提交于
This patch is to do 3 things for ipv6_dev_find(): As David A. noticed, - rt6_lookup() is not really needed. Different from __ip_dev_find(), ipv6_dev_find() doesn't have a compatibility problem, so remove it. As Hideaki suggested, - "valid" (non-tentative) check for the address is also needed. ipv6_chk_addr() calls ipv6_chk_addr_and_flags(), which will traverse the address hash list, but it's heavy to be called inside ipv6_dev_find(). This patch is to reuse the code of ipv6_chk_addr_and_flags() for ipv6_dev_find(). - dev parameter is passed into ipv6_dev_find(), as link-local addresses from user space has sin6_scope_id set and the dev lookup needs it. Fixes: 81f6cb31 ("ipv6: add ipv6_dev_find()") Suggested-by: NYOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Reported-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
Add range validation for NLA_BINARY, allowing validation of any combination of combination minimum or maximum lengths, using the existing NLA_POLICY_RANGE()/NLA_POLICY_FULL_RANGE() macros, just like for integers where the value is checked. Also make NLA_POLICY_EXACT_LEN(), NLA_POLICY_EXACT_LEN_WARN() and NLA_POLICY_MIN_LEN() special cases of this, removing the old types NLA_EXACT_LEN and NLA_MIN_LEN. This allows us to save some code where both minimum and maximum lengths are requires, currently the policy only allows maximum (NLA_BINARY), minimum (NLA_MIN_LEN) or exact (NLA_EXACT_LEN), so a range of lengths cannot be accepted and must be checked by the code that consumes the attributes later. Also, this allows advertising the correct ranges in the policy export to userspace. Here, NLA_MIN_LEN and NLA_EXACT_LEN already were special cases of NLA_BINARY with min and min/max length respectively. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 8月, 2020 3 次提交
-
-
由 Jessica Clarke 提交于
IA-64 is special and treats pgd_offset_k() differently to pgd_offset(), using different formulae to calculate the indices into the kernel and user PGDs. The index into the user PGDs takes into account the region number, but the index into the kernel (init_mm) PGD always assumes a predefined kernel region number. Commit 974b9b2c ("mm: consolidate pte_index() and pte_offset_*() definitions") made IA-64 use a generic pgd_offset_k() which incorrectly used pgd_index() for kernel page tables. As a result, the index into the kernel PGD was going out of bounds and the kernel hung during early boot. Allow overrides of pgd_offset_k() and override it on IA-64 with the old implementation that will correctly index the kernel PGD. Fixes: 974b9b2c ("mm: consolidate pte_index() and pte_offset_*() definitions") Reported-by: NJohn Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: NJessica Clarke <jrtc27@jrtc27.com> Tested-by: NJohn Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Acked-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
-
由 Randy Dunlap 提交于
Fix a kernel-doc warning for the pcs_config() function prototype: ../include/linux/phylink.h:406: warning: Excess function parameter 'permit_pause_to_mac' description in 'pcs_config' Fixes: 7137e18f ("net: phylink: add struct phylink_pcs") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Howells 提交于
Impose a limit on the number of watches that a user can hold so that they can't use this mechanism to fill up all the available memory. This is done by putting a counter in user_struct that's incremented when a watch is allocated and decreased when it is released. If the number exceeds the RLIMIT_NOFILE limit, the watch is rejected with EAGAIN. This can be tested by the following means: (1) Create a watch queue and attach it to fd 5 in the program given - in this case, bash: keyctl watch_session /tmp/nlog /tmp/gclog 5 bash (2) In the shell, set the maximum number of files to, say, 99: ulimit -n 99 (3) Add 200 keyrings: for ((i=0; i<200; i++)); do keyctl newring a$i @s || break; done (4) Try to watch all of the keyrings: for ((i=0; i<200; i++)); do echo $i; keyctl watch_add 5 %:a$i || break; done This should fail when the number of watches belonging to the user hits 99. (5) Remove all the keyrings and all of those watches should go away: for ((i=0; i<200; i++)); do keyctl unlink %:a$i; done (6) Kill off the watch queue by exiting the shell spawned by watch_session. Fixes: c73be61c ("pipe: Add general notification queue support") Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NDavid Howells <dhowells@redhat.com> Reviewed-by: NJarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 8月, 2020 14 次提交
-
-
由 Krzysztof Kozlowski 提交于
Patch series "iomap: Constify ioreadX() iomem argument", v3. The ioread8/16/32() and others have inconsistent interface among the architectures: some taking address as const, some not. It seems there is nothing really stopping all of them to take pointer to const. This patch (of 4): The ioreadX() and ioreadX_rep() helpers have inconsistent interface. On some architectures void *__iomem address argument is a pointer to const, on some not. Implementations of ioreadX() do not modify the memory under the address so they can be converted to a "const" version for const-safety and consistency among architectures. [krzk@kernel.org: sh: clk: fix assignment from incompatible pointer type for ioreadX()] Link: http://lkml.kernel.org/r/20200723082017.24053-1-krzk@kernel.org [akpm@linux-foundation.org: fix drivers/mailbox/bcm-pdc-mailbox.c] Link: http://lkml.kernel.org/r/202007132209.Rxmv4QyS%25lkp@intel.comSuggested-by: NGeert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NKrzysztof Kozlowski <krzk@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NGeert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: NArnd Bergmann <arnd@arndb.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jon Mason <jdmason@kudzu.us> Cc: Allen Hubbe <allenbh@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Link: http://lkml.kernel.org/r/20200709072837.5869-1-krzk@kernel.org Link: http://lkml.kernel.org/r/20200709072837.5869-2-krzk@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Romain Naour 提交于
Since the patch [1], building the kernel using a toolchain built with binutils 2.33.1 prevents booting a sh4 system under Qemu. Apply the patch provided by Alan Modra [2] that fix alignment of rodata. [1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=ebd2263ba9a9124d93bbc0ece63d7e0fae89b40e [2] https://www.sourceware.org/ml/binutils/2019-12/msg00112.htmlSigned-off-by: NRomain Naour <romain.naour@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Alan Modra <amodra@gmail.com> Cc: Bin Meng <bin.meng@windriver.com> Cc: Chen Zhou <chenzhou10@huawei.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Rich Felker <dalias@libc.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Arnd Bergmann <arnd@arndb.de> Cc: <stable@vger.kernel.org> Link: https://marc.info/?l=linux-sh&m=158429470221261Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Qian Cai 提交于
BUG: KCSAN: data-race in page_cpupid_xchg_last / put_page write (marked) to 0xfffffc0d48ec1a00 of 8 bytes by task 91442 on cpu 3: page_cpupid_xchg_last+0x51/0x80 page_cpupid_xchg_last at mm/mmzone.c:109 (discriminator 11) wp_page_reuse+0x3e/0xc0 wp_page_reuse at mm/memory.c:2453 do_wp_page+0x472/0x7b0 do_wp_page at mm/memory.c:2798 __handle_mm_fault+0xcb0/0xd00 handle_pte_fault at mm/memory.c:4049 (inlined by) __handle_mm_fault at mm/memory.c:4163 handle_mm_fault+0xfc/0x2f0 handle_mm_fault at mm/memory.c:4200 do_page_fault+0x263/0x6f9 do_user_addr_fault at arch/x86/mm/fault.c:1465 (inlined by) do_page_fault at arch/x86/mm/fault.c:1539 page_fault+0x34/0x40 read to 0xfffffc0d48ec1a00 of 8 bytes by task 94817 on cpu 69: put_page+0x15a/0x1f0 page_zonenum at include/linux/mm.h:923 (inlined by) is_zone_device_page at include/linux/mm.h:929 (inlined by) page_is_devmap_managed at include/linux/mm.h:948 (inlined by) put_page at include/linux/mm.h:1023 wp_page_copy+0x571/0x930 wp_page_copy at mm/memory.c:2615 do_wp_page+0x107/0x7b0 __handle_mm_fault+0xcb0/0xd00 handle_mm_fault+0xfc/0x2f0 do_page_fault+0x263/0x6f9 page_fault+0x34/0x40 Reported by Kernel Concurrency Sanitizer on: CPU: 69 PID: 94817 Comm: systemd-udevd Tainted: G W O L 5.5.0-next-20200204+ #6 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 A page never changes its zone number. The zone number happens to be stored in the same word as other bits which are modified, but the zone number bits will never be modified by any other write, so it can accept a reload of the zone bits after an intervening write and it don't need to use READ_ONCE(). Thus, annotate this data race using ASSERT_EXCLUSIVE_BITS() to also assert that there are no concurrent writes to it. Suggested-by: NMarco Elver <elver@google.com> Signed-off-by: NQian Cai <cai@lca.pw> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Link: http://lkml.kernel.org/r/1581619089-14472-1-git-send-email-cai@lca.pwSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Qian Cai 提交于
struct mem_cgroup_per_node mz.lru_zone_size[zone_idx][lru] could be accessed concurrently as noticed by KCSAN, BUG: KCSAN: data-race in lruvec_lru_size / mem_cgroup_update_lru_size write to 0xffff9c804ca285f8 of 8 bytes by task 50951 on cpu 12: mem_cgroup_update_lru_size+0x11c/0x1d0 mem_cgroup_update_lru_size at mm/memcontrol.c:1266 isolate_lru_pages+0x6a9/0xf30 shrink_active_list+0x123/0xcc0 shrink_lruvec+0x8fd/0x1380 shrink_node+0x317/0xd80 do_try_to_free_pages+0x1f7/0xa10 try_to_free_pages+0x26c/0x5e0 __alloc_pages_slowpath+0x458/0x1290 __alloc_pages_nodemask+0x3bb/0x450 alloc_pages_vma+0x8a/0x2c0 do_anonymous_page+0x170/0x700 __handle_mm_fault+0xc9f/0xd00 handle_mm_fault+0xfc/0x2f0 do_page_fault+0x263/0x6f9 page_fault+0x34/0x40 read to 0xffff9c804ca285f8 of 8 bytes by task 50964 on cpu 95: lruvec_lru_size+0xbb/0x270 mem_cgroup_get_zone_lru_size at include/linux/memcontrol.h:536 (inlined by) lruvec_lru_size at mm/vmscan.c:326 shrink_lruvec+0x1d0/0x1380 shrink_node+0x317/0xd80 do_try_to_free_pages+0x1f7/0xa10 try_to_free_pages+0x26c/0x5e0 __alloc_pages_slowpath+0x458/0x1290 __alloc_pages_nodemask+0x3bb/0x450 alloc_pages_current+0xa6/0x120 alloc_slab_page+0x3b1/0x540 allocate_slab+0x70/0x660 new_slab+0x46/0x70 ___slab_alloc+0x4ad/0x7d0 __slab_alloc+0x43/0x70 kmem_cache_alloc+0x2c3/0x420 getname_flags+0x4c/0x230 getname+0x22/0x30 do_sys_openat2+0x205/0x3b0 do_sys_open+0x9a/0xf0 __x64_sys_openat+0x62/0x80 do_syscall_64+0x91/0xb47 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reported by Kernel Concurrency Sanitizer on: CPU: 95 PID: 50964 Comm: cc1 Tainted: G W O L 5.5.0-next-20200204+ #6 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 The write is under lru_lock, but the read is done as lockless. The scan count is used to determine how aggressively the anon and file LRU lists should be scanned. Load tearing could generate an inefficient heuristic, so fix it by adding READ_ONCE() for the read. Signed-off-by: NQian Cai <cai@lca.pw> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Link: http://lkml.kernel.org/r/20200206034945.2481-1-cai@lca.pwSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Xiaoming Ni 提交于
Since commit 61a47c1a ("sysctl: Remove the sysctl system call"), sys_sysctl is actually unavailable: any input can only return an error. We have been warning about people using the sysctl system call for years and believe there are no more users. Even if there are users of this interface if they have not complained or fixed their code by now they probably are not going to, so there is no point in warning them any longer. So completely remove sys_sysctl on all architectures. [nixiaoming@huawei.com: s390: fix build error for sys_call_table_emu] Link: http://lkml.kernel.org/r/20200618141426.16884-1-nixiaoming@huawei.comSigned-off-by: NXiaoming Ni <nixiaoming@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: Will Deacon <will@kernel.org> [arm/arm64] Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bin Meng <bin.meng@windriver.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: chenzefeng <chenzefeng2@huawei.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christian Brauner <christian@brauner.io> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Diego Elio Pettenò <flameeyes@flameeyes.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kars de Jong <jongk@linux-m68k.org> Cc: Kees Cook <keescook@chromium.org> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Nick Piggin <npiggin@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Olof Johansson <olof@lixom.net> Cc: Paul Burton <paulburton@kernel.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Sven Schnelle <svens@stackframe.org> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zhou Yanjie <zhouyanjie@wanyeetech.com> Link: http://lkml.kernel.org/r/20200616030734.87257-1-nixiaoming@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
Mirroring offset_in_page(), this gives you the offset within this particular page, no matter what size page it is. It optimises down to offset_in_page() if CONFIG_TRANSPARENT_HUGEPAGE is not set. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20200629151959.15779-8-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
This is like compound_head() but compiles away when CONFIG_TRANSPARENT_HUGEPAGE is not enabled. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20200629151959.15779-7-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
The thp prefix is more frequently used than hpage and we should be consistent between the various functions. [akpm@linux-foundation.org: fix mm/migrate.c] Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20200629151959.15779-6-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
This function returns the number of bytes in a THP. It is like page_size(), but compiles to just PAGE_SIZE if CONFIG_TRANSPARENT_HUGEPAGE is disabled. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20200629151959.15779-5-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
This function returns the order of a transparent huge page. It compiles to 0 if CONFIG_TRANSPARENT_HUGEPAGE is disabled. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20200629151959.15779-4-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
Give up on the notion that we can remove page-flags.h from mm.h. There are currently 14 inline functions which use a PageFoo function. Also, two of the files directly included by mm.h include page-flags.h themselves, and there are probably more indirect inclusions. So just include it at the top like any other header file. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20200629151959.15779-3-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Matthew Wilcox (Oracle) 提交于
Patch series "THP prep patches". These are some generic cleanups and improvements, which I would like merged into mmotm soon. The first one should be a performance improvement for all users of compound pages, and the others are aimed at getting code to compile away when CONFIG_TRANSPARENT_HUGEPAGE is disabled (ie small systems). Also better documented / less confusing than the current prefix mixture of compound, hpage and thp. This patch (of 7): This removes a few instructions from functions which need to know how many pages are in a compound page. The storage used is either page->mapping on 64-bit or page->index on 32-bit. Both of these are fine to overlay on tail pages. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20200629151959.15779-1-willy@infradead.org Link: http://lkml.kernel.org/r/20200629151959.15779-2-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Rapoport 提交于
The #ifdef statement that guards the generic version of pud_alloc_one() by mistake used __HAVE_ARCH_PUD_FREE instead of __HAVE_ARCH_PUD_ALLOC_ONE. Fix it. Fixes: d9e8b929 ("asm-generic: pgalloc: provide generic pud_alloc_one() and pud_free_one()") Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NMike Rapoport <rppt@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200812191415.GE163101@linux.ibm.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
This remoes the code from the COW path to call debug_dma_assert_idle(), which was added many years ago. Google shows that it hasn't caught anything in the 6+ years we've had it apart from a false positive, and Hugh just noticed how it had a very unfortunate spinlock serialization in the COW path. He fixed that issue the previous commit (a85ffd59: "dma-debug: fix debug_dma_assert_idle(), use rcu_read_lock()"), but let's see if anybody even notices when we remove this function entirely. NOTE! We keep the dma tracking infrastructure that was added by the commit that introduced it. Partly to make it easier to resurrect this debug code if we ever deside to, and partly because that tracking by pfn and offset looks quite reasonable. The problem with this debug code was simply that it was expensive and didn't seem worth it, not that it was wrong per se. Acked-by: NDan Williams <dan.j.williams@intel.com> Acked-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 8月, 2020 2 次提交
-
-
由 Christoph Hellwig 提交于
When allocating coherent pool memory for an IOMMU mapping we don't care about the DMA mask. Move the guess for the initial GFP mask into the dma_direct_alloc_pages and pass dma_coherent_ok as a function pointer argument so that it doesn't get applied to the IOMMU case. Signed-off-by: NChristoph Hellwig <hch@lst.de> Tested-by: NAmit Pundir <amit.pundir@linaro.org>
-
由 Eric Dumazet 提交于
There has been some heat around prandom_u32() lately, and some people were wondering if there was a simple way to determine how often it was used, before considering making it maybe 10 times more expensive. This tracepoint exports the generated pseudo random value. Tested: perf list | grep prandom_u32 random:prandom_u32 [Tracepoint event] perf record -a [-g] [-C1] -e random:prandom_u32 sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 259.748 MB perf.data (924087 samples) ] perf report --nochildren ... 97.67% ksoftirqd/1 [kernel.vmlinux] [k] prandom_u32 | ---prandom_u32 prandom_u32 | |--48.86%--tcp_v4_syn_recv_sock | tcp_check_req | tcp_v4_rcv | ... --48.81%--tcp_conn_request tcp_v4_conn_request tcp_rcv_state_process ... perf script Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: NSedat Dilek <sedat.dilek@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 8月, 2020 11 次提交
-
-
由 Oleksandr Andrushchenko 提交于
This is the sync up with the canonical definition of the display protocol in Xen. 1. Add protocol version as an integer Version string, which is in fact an integer, is hard to handle in the code that supports different protocol versions. To simplify that also add the version as an integer. 2. Pass buffer offset with XENDISPL_OP_DBUF_CREATE There are cases when display data buffer is created with non-zero offset to the data start. Handle such cases and provide that offset while creating a display buffer. 3. Add XENDISPL_OP_GET_EDID command Add an optional request for reading Extended Display Identification Data (EDID) structure which allows better configuration of the display connectors over the configuration set in XenStore. With this change connectors may have multiple resolutions defined with respect to detailed timing definitions and additional properties normally provided by displays. If this request is not supported by the backend then visible area is defined by the relevant XenStore's "resolution" property. If backend provides extended display identification data (EDID) with XENDISPL_OP_GET_EDID request then EDID values must take precedence over the resolutions defined in XenStore. 4. Bump protocol version to 2. Signed-off-by: NOleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Reviewed-by: NJuergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20200813062113.11030-5-andr2000@gmail.comSigned-off-by: NJuergen Gross <jgross@suse.com>
-
由 Alexander A. Klimov 提交于
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: NAlexander A. Klimov <grandmaster@al2klimov.de> Acked-by: NRob Herring <robh@kernel.org> Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Charles Keepax 提交于
Currently, the only way to remove MFD children is with a call to mfd_remove_devices, which will remove all the children. Under some circumstances it is useful to remove only a subset of the child devices. For example if some additional clean up is required between removal of certain child devices. To accomplish this a level field is added to mfd_cell, the normal mfd_remove_devices is modified to not remove devices that are set to a higher level and a corresponding mfd_remove_devices_late function is added to remove those children. See further discussion at: https://lore.kernel.org/lkml/20200616075834.GF2608702@dell/Suggested-by: NLee Jones <lee.jones@linaro.org> Signed-off-by: NCharles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Randy Dunlap 提交于
Drop the repeated word "in" in a comment. Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Randy Dunlap 提交于
Drop the repeated word "that" in a comment. Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Acked-by: NAdam Thomson <Adam.Thomson.Opensource@diasemi.com> Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Adam Thomson 提交于
This update adds new regmap tables to support the latest DA silicon which will automatically be selected based on the chip and variant information read from the device. Signed-off-by: NAdam Thomson <Adam.Thomson.Opensource@diasemi.com> Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Adam Thomson 提交于
The current implementation performs checking in the i2c_probe() function of the variant_code but does this immediately after the containing struct has been initialised as all zero. This means the check for variant code will always default to using the BB tables and will never select AD. The variant code is subsequently set by device_init() and later used by the RTC so really it's a little fortunate this mismatch works. This update adds raw I2C read access functionality to read the chip and variant/revision information (common to all revisions) so that it can subsequently correctly choose the proper regmap tables for real initialisation. Signed-off-by: NAdam Thomson <Adam.Thomson.Opensource@diasemi.com> Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Michael Walle 提交于
This MFD driver has no user. The keypad driver of this device never made it into the kernel. Therefore, this driver is useless. Remove it. Signed-off-by: NMichael Walle <michael@walle.cc> Cc: Sourav Poddar <sourav.poddar@ti.com> Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Lee Jones 提交于
Extend current list of helpers to provide support for parent drivers wishing to match specific child devices to particular OF nodes. Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Lee Jones 提交于
Remove unnecessary '\'s and leading tabs. This will help to clean-up future diffs when subsequent changes are made. Hint: The aforementioned changes follow this patch. Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Lee Jones 提交于
Currently, when a child platform device (sometimes referred to as a sub-device) is registered via the Multi-Functional Device (MFD) API, the framework attempts to match the newly registered platform device with its associated Device Tree (OF) node. Until now, the device has been allocated the first node found with an identical OF compatible string. Unfortunately, if there are, say for example '3' devices which are to be handled by the same driver and therefore have the same compatible string, each of them will be allocated a pointer to the *first* node. An example Device Tree entry might look like this: mfd_of_test { compatible = "mfd,of-test-parent"; #address-cells = <0x02>; #size-cells = <0x02>; child@aaaaaaaaaaaaaaaa { compatible = "mfd,of-test-child"; reg = <0xaaaaaaaa 0xaaaaaaaa 0 0x11>, <0xbbbbbbbb 0xbbbbbbbb 0 0x22>; }; child@cccccccc { compatible = "mfd,of-test-child"; reg = <0x00000000 0xcccccccc 0 0x33>; }; child@dddddddd00000000 { compatible = "mfd,of-test-child"; reg = <0xdddddddd 0x00000000 0 0x44>; }; }; When used with example sub-device registration like this: static const struct mfd_cell mfd_of_test_cell[] = { OF_MFD_CELL("mfd-of-test-child", NULL, NULL, 0, 0, "mfd,of-test-child"), OF_MFD_CELL("mfd-of-test-child", NULL, NULL, 0, 1, "mfd,of-test-child"), OF_MFD_CELL("mfd-of-test-child", NULL, NULL, 0, 2, "mfd,of-test-child") }; ... the current implementation will result in all devices being allocated the first OF node found containing a matching compatible string: [0.712511] mfd-of-test-child mfd-of-test-child.0: Probing platform device: 0 [0.712710] mfd-of-test-child mfd-of-test-child.0: Using OF node: child@aaaaaaaaaaaaaaaa [0.713033] mfd-of-test-child mfd-of-test-child.1: Probing platform device: 1 [0.713381] mfd-of-test-child mfd-of-test-child.1: Using OF node: child@aaaaaaaaaaaaaaaa [0.713691] mfd-of-test-child mfd-of-test-child.2: Probing platform device: 2 [0.713889] mfd-of-test-child mfd-of-test-child.2: Using OF node: child@aaaaaaaaaaaaaaaa After this patch each device will be allocated a unique OF node: [0.712511] mfd-of-test-child mfd-of-test-child.0: Probing platform device: 0 [0.712710] mfd-of-test-child mfd-of-test-child.0: Using OF node: child@aaaaaaaaaaaaaaaa [0.713033] mfd-of-test-child mfd-of-test-child.1: Probing platform device: 1 [0.713381] mfd-of-test-child mfd-of-test-child.1: Using OF node: child@cccccccc [0.713691] mfd-of-test-child mfd-of-test-child.2: Probing platform device: 2 [0.713889] mfd-of-test-child mfd-of-test-child.2: Using OF node: child@dddddddd00000000 Which is fine if all OF nodes are identical. However if we wish to apply an attribute to particular device, we really need to ensure the correct OF node will be associated with the device containing the correct address. We accomplish this by matching the device's address expressed in DT with one provided during sub-device registration. Like this: static const struct mfd_cell mfd_of_test_cell[] = { OF_MFD_CELL_REG("mfd-of-test-child", NULL, NULL, 0, 1, "mfd,of-test-child", 0xdddddddd00000000), OF_MFD_CELL_REG("mfd-of-test-child", NULL, NULL, 0, 2, "mfd,of-test-child", 0xaaaaaaaaaaaaaaaa), OF_MFD_CELL_REG("mfd-of-test-child", NULL, NULL, 0, 3, "mfd,of-test-child", 0x00000000cccccccc) }; This will ensure a specific device (designated here using the platform_ids; 1, 2 and 3) is matched with a particular OF node: [0.712511] mfd-of-test-child mfd-of-test-child.0: Probing platform device: 0 [0.712710] mfd-of-test-child mfd-of-test-child.0: Using OF node: child@dddddddd00000000 [0.713033] mfd-of-test-child mfd-of-test-child.1: Probing platform device: 1 [0.713381] mfd-of-test-child mfd-of-test-child.1: Using OF node: child@aaaaaaaaaaaaaaaa [0.713691] mfd-of-test-child mfd-of-test-child.2: Probing platform device: 2 [0.713889] mfd-of-test-child mfd-of-test-child.2: Using OF node: child@cccccccc This implementation is still not infallible, hence the mention of "best effort" in the commit subject. Since we have not *insisted* on the existence of 'reg' properties (in some scenarios they just do not make sense) and no device currently uses the new 'of_reg' attribute, we have to make an on-the-fly judgement call whether to associate the OF node anyway. Which we do in cases where parent drivers haven't specified a particular OF node to match to. So there is a *slight* possibility of the following result (note: the implementation here is convoluted, but it shows you one means by which this process can still break): /* * First entry will match to the first OF node with matching compatible * Second will fail, since the first took its OF node and is no longer available * Third will succeed */ static const struct mfd_cell mfd_of_test_cell[] = { OF_MFD_CELL("mfd-of-test-child", NULL, NULL, 0, 1, "mfd,of-test-child"), OF_MFD_CELL_REG("mfd-of-test-child", NULL, NULL, 0, 2, "mfd,of-test-child", 0xaaaaaaaaaaaaaaaa), OF_MFD_CELL_REG("mfd-of-test-child", NULL, NULL, 0, 3, "mfd,of-test-child", 0x00000000cccccccc) }; The result: [0.753869] mfd-of-test-parent mfd_of_test: Registering 3 devices [0.756597] mfd-of-test-child: Failed to locate of_node [id: 2] [0.759999] mfd-of-test-child mfd-of-test-child.1: Probing platform device: 1 [0.760314] mfd-of-test-child mfd-of-test-child.1: Using OF node: child@aaaaaaaaaaaaaaaa [0.760908] mfd-of-test-child mfd-of-test-child.2: Probing platform device: 2 [0.761183] mfd-of-test-child mfd-of-test-child.2: No OF node associated with this device [0.761621] mfd-of-test-child mfd-of-test-child.3: Probing platform device: 3 [0.761899] mfd-of-test-child mfd-of-test-child.3: Using OF node: child@cccccccc We could code around this with some pre-parsing semantics, but the added complexity required to cover each and every corner-case is not justified. Merely patching the current failing (via this patch) is already working with some pretty small corner-cases. Other issues should be patched in the parent drivers which can be achieved simply by implementing OF_MFD_CELL_REG(). Signed-off-by: NLee Jones <lee.jones@linaro.org>
-