- 01 5月, 2020 3 次提交
-
-
由 Bijan Mottahedeh 提交于
Use ctx->fallback_req address for test_and_set_bit_lock() and clear_bit_unlock(). Signed-off-by: NBijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
We do blocking retry from our poll handler, if the file supports polled notifications. Only mark the request as needing an async worker if we can't poll for it. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
We can have files like eventfd where it's perfectly fine to do poll based retry on them, right now io_file_supports_async() doesn't take that into account. Pass in data direction and check the f_op instead of just always needing an async worker. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 28 4月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
Clay reports that OP_STATX fails for a test case with a valid fd and empty path: -- Test 0: statx:fd 3: SUCCEED, file mode 100755 -- Test 1: statx:path ./uring_statx: SUCCEED, file mode 100755 -- Test 2: io_uring_statx:fd 3: FAIL, errno 9: Bad file descriptor -- Test 3: io_uring_statx:path ./uring_statx: SUCCEED, file mode 100755 This is due to statx not grabbing the process file table, hence we can't lookup the fd in async context. If the fd is valid, ensure that we grab the file table so we can grab the file from async context. Cc: stable@vger.kernel.org # v5.6 Reported-by: NClay Harris <bugs@claycon.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 20 4月, 2020 1 次提交
-
-
由 Xiaoguang Wang 提交于
When testing io_uring IORING_FEAT_FAST_POLL feature, I got below panic: BUG: kernel NULL pointer dereference, address: 0000000000000030 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 5 PID: 2154 Comm: io_uring_echo_s Not tainted 5.6.0+ #359 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:io_wq_submit_work+0xf/0xa0 Code: ff ff ff be 02 00 00 00 e8 ae c9 19 00 e9 58 ff ff ff 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 49 89 fc 55 53 48 8b 2f <8b> 45 30 48 8d 9d 48 ff ff ff 25 01 01 00 00 83 f8 01 75 07 eb 2a RSP: 0018:ffffbef543e93d58 EFLAGS: 00010286 RAX: ffffffff84364f50 RBX: ffffa3eb50f046b8 RCX: 0000000000000000 RDX: ffffa3eb0efc1840 RSI: 0000000000000006 RDI: ffffa3eb50f046b8 RBP: 0000000000000000 R08: 00000000fffd070d R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffa3eb50f046b8 R13: ffffa3eb0efc2088 R14: ffffffff85b69be0 R15: ffffa3eb0effa4b8 FS: 00007fe9f69cc4c0(0000) GS:ffffa3eb5ef40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000030 CR3: 0000000020410000 CR4: 00000000000006e0 Call Trace: task_work_run+0x6d/0xa0 do_exit+0x39a/0xb80 ? get_signal+0xfe/0xbc0 do_group_exit+0x47/0xb0 get_signal+0x14b/0xbc0 ? __x64_sys_io_uring_enter+0x1b7/0x450 do_signal+0x2c/0x260 ? __x64_sys_io_uring_enter+0x228/0x450 exit_to_usermode_loop+0x87/0xf0 do_syscall_64+0x209/0x230 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x7fe9f64f8df9 Code: Bad RIP value. task_work_run calls io_wq_submit_work unexpectedly, it's obvious that struct callback_head's func member has been changed. After looking into codes, I found this issue is still due to the union definition: union { /* * Only commands that never go async can use the below fields, * obviously. Right now only IORING_OP_POLL_ADD uses them, and * async armed poll handlers for regular commands. The latter * restore the work, if needed. */ struct { struct callback_head task_work; struct hlist_node hash_node; struct async_poll *apoll; }; struct io_wq_work work; }; When task_work_run has multiple work to execute, the work that calls io_poll_remove_all() will do req->work restore for non-poll request always, but indeed if a non-poll request has been added to a new callback_head, subsequent callback will call io_async_task_func() to handle this request, that means we should not do the restore work for such non-poll request. Meanwhile in io_async_task_func(), we should drop submit ref when req has been canceled. Fix both issues. Fixes: b1f573bd ("io_uring: restore req->work when canceling poll request") Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Use io_double_put_req() Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 4月, 2020 3 次提交
-
-
由 Pavel Begunkov 提交于
When checking for draining with __req_need_defer(), it tries to match how many requests were sent before a current one with number of already completed. Dropped SQEs are included in req->sequence, and they won't ever appear in CQ. To compensate for that, __req_need_defer() substracts ctx->cached_sq_dropped. However, what it should really use is number of SQEs dropped __before__ the current one. In other words, any submitted request shouldn't shouldn't affect dequeueing from the drain queue of previously submitted ones. Instead of saving proper ctx->cached_sq_dropped in each request, substract from req->sequence it at initialisation, so it includes number of properly submitted requests. note: it also changes behaviour of timeouts, but 1. it's already diverge from the description because of using SQ 2. the description is ambiguous regarding dropped SQEs Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
req->timeout.count and req->io->timeout.seq_offset store the same value, which is sqe->off. Kill the second one Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
io_timeout() can be executed asynchronously by a worker and without holding ctx->uring_lock 1. using ctx->cached_sq_head there is racy there 2. it should count events from a moment of timeout's submission, but not execution Use req->sequence. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 14 4月, 2020 3 次提交
-
-
由 Jens Axboe 提交于
syzbot reports this crash: BUG: unable to handle page fault for address: ffffffffffffffe8 PGD f96e17067 P4D f96e17067 PUD f96e19067 PMD 0 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI CPU: 55 PID: 211750 Comm: trinity-c127 Tainted: G B L 5.7.0-rc1-next-20200413 #4 Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 04/12/2017 RIP: 0010:__wake_up_common+0x98/0x290 el/sched/wait.c:87 Code: 40 4d 8d 78 e8 49 8d 7f 18 49 39 fd 0f 84 80 00 00 00 e8 6b bd 2b 00 49 8b 5f 18 45 31 e4 48 83 eb 18 4c 89 ff e8 08 bc 2b 00 <45> 8b 37 41 f6 c6 04 75 71 49 8d 7f 10 e8 46 bd 2b 00 49 8b 47 10 RSP: 0018:ffffc9000adbfaf0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffffffffffffe8 RCX: ffffffffaa9636b8 RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffffffffffffffe8 RBP: ffffc9000adbfb40 R08: fffffbfff582c5fd R09: fffffbfff582c5fd R10: ffffffffac162fe3 R11: fffffbfff582c5fc R12: 0000000000000000 R13: ffff888ef82b0960 R14: ffffc9000adbfb80 R15: ffffffffffffffe8 FS: 00007fdcba4c4740(0000) GS:ffff889033780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffe8 CR3: 0000000f776a0004 CR4: 00000000001606e0 Call Trace: __wake_up_common_lock+0xea/0x150 ommon_lock at kernel/sched/wait.c:124 ? __wake_up_common+0x290/0x290 ? lockdep_hardirqs_on+0x16/0x2c0 __wake_up+0x13/0x20 io_cqring_ev_posted+0x75/0xe0 v_posted at fs/io_uring.c:1160 io_ring_ctx_wait_and_kill+0x1c0/0x2f0 l at fs/io_uring.c:7305 io_uring_create+0xa8d/0x13b0 ? io_req_defer_prep+0x990/0x990 ? __kasan_check_write+0x14/0x20 io_uring_setup+0xb8/0x130 ? io_uring_create+0x13b0/0x13b0 ? check_flags.part.28+0x220/0x220 ? lockdep_hardirqs_on+0x16/0x2c0 __x64_sys_io_uring_setup+0x31/0x40 do_syscall_64+0xcc/0xaf0 ? syscall_return_slowpath+0x580/0x580 ? lockdep_hardirqs_off+0x1f/0x140 ? entry_SYSCALL_64_after_hwframe+0x3e/0xb3 ? trace_hardirqs_off_caller+0x3a/0x150 ? trace_hardirqs_off_thunk+0x1a/0x1c entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x7fdcb9dd76ed Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6b 57 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe7fd4e4f8 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9 RAX: ffffffffffffffda RBX: 00000000000001a9 RCX: 00007fdcb9dd76ed RDX: fffffffffffffffc RSI: 0000000000000000 RDI: 0000000000005d54 RBP: 00000000000001a9 R08: 0000000e31d3caa7 R09: 0082400004004000 R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000002 R13: 00007fdcb842e058 R14: 00007fdcba4c46c0 R15: 00007fdcb842e000 Modules linked in: bridge stp llc nfnetlink cn brd vfat fat ext4 crc16 mbcache jbd2 loop kvm_intel kvm irqbypass intel_cstate intel_uncore dax_pmem intel_rapl_perf dax_pmem_core ip_tables x_tables xfs sd_mod tg3 firmware_class libphy hpsa scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: binfmt_misc] CR2: ffffffffffffffe8 ---[ end trace f9502383d57e0e22 ]--- RIP: 0010:__wake_up_common+0x98/0x290 Code: 40 4d 8d 78 e8 49 8d 7f 18 49 39 fd 0f 84 80 00 00 00 e8 6b bd 2b 00 49 8b 5f 18 45 31 e4 48 83 eb 18 4c 89 ff e8 08 bc 2b 00 <45> 8b 37 41 f6 c6 04 75 71 49 8d 7f 10 e8 46 bd 2b 00 49 8b 47 10 RSP: 0018:ffffc9000adbfaf0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffffffffffffe8 RCX: ffffffffaa9636b8 RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffffffffffffffe8 RBP: ffffc9000adbfb40 R08: fffffbfff582c5fd R09: fffffbfff582c5fd R10: ffffffffac162fe3 R11: fffffbfff582c5fc R12: 0000000000000000 R13: ffff888ef82b0960 R14: ffffc9000adbfb80 R15: ffffffffffffffe8 FS: 00007fdcba4c4740(0000) GS:ffff889033780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffe8 CR3: 0000000f776a0004 CR4: 00000000001606e0 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x29800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception ]— which is due to error injection (or allocation failure) preventing the rings from being setup. On shutdown, we attempt to remove any pending requests, and for poll request, we call io_cqring_ev_posted() when we've killed poll requests. However, since the rings aren't setup, we won't find any poll requests. Make the calling of io_cqring_ev_posted() dependent on actually having completed requests. This fixes this setup corner case, and removes spurious calls if we remove poll requests and don't find any. Reported-by: NQian Cai <cai@lca.pw> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
If the request has been marked as canceled, don't try and issue it. Instead just fill a canceled event and finish the request. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
We added this for just the regular poll requests in commit a6ba632d ("io_uring: retry poll if we got woken with non-matching mask"), we should do the same for the poll handler used pollable async requests. Move the re-wait check and arm into a helper, and call it from io_async_task_func() as well. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 13 4月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
The splice file punt check uses file->f_mode to check for O_NONBLOCK, but it should be checking file->f_flags. This leads to punting even for files that have O_NONBLOCK set, which isn't necessary. This equates to checking for FMODE_PATH, which will never be set on the fd in question. Fixes: 7d67af2c ("io_uring: add splice(2) support") Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 12 4月, 2020 6 次提交
-
-
由 Xiaoguang Wang 提交于
When running liburing test case 'accept', I got below warning: RED: Invalid credentials RED: At include/linux/cred.h:285 RED: Specified credentials: 00000000d02474a0 RED: ->magic=4b, put_addr=000000005b4f46e9 RED: ->usage=-1699227648, subscr=-25693 RED: ->*uid = { 256,-25693,-25693,65534 } RED: ->*gid = { 0,-1925859360,-1789740800,-1827028688 } RED: ->security is 00000000258c136e eneral protection fault, probably for non-canonical address 0xdead4ead00000000: 0000 [#1] SMP PTI PU: 21 PID: 2037 Comm: accept Not tainted 5.6.0+ #318 ardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014 IP: 0010:dump_invalid_creds+0x16f/0x184 ode: 48 8b 83 88 00 00 00 48 3d ff 0f 00 00 76 29 48 89 c2 81 e2 00 ff ff ff 48 81 fa 00 6b 6b 6b 74 17 5b 48 c7 c7 4b b1 10 8e 5d <8b> 50 04 41 5c 8b 30 41 5d e9 67 e3 04 00 5b 5d 41 5c 41 5d c3 0f SP: 0018:ffffacc1039dfb38 EFLAGS: 00010087 AX: dead4ead00000000 RBX: ffff9ba39319c100 RCX: 0000000000000007 DX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8e10b14b BP: ffffffff8e108476 R08: 0000000000000000 R09: 0000000000000001 10: 0000000000000000 R11: ffffacc1039df9e5 R12: 000000009552b900 13: 000000009319c130 R14: ffff9ba39319c100 R15: 0000000000000246 S: 00007f96b2bfc4c0(0000) GS:ffff9ba39f340000(0000) knlGS:0000000000000000 S: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 R2: 0000000000401870 CR3: 00000007db7a4000 CR4: 00000000000006e0 all Trace: __invalid_creds+0x48/0x4a __io_req_aux_free+0x2e8/0x3b0 ? io_poll_remove_one+0x2a/0x1d0 __io_free_req+0x18/0x200 io_free_req+0x31/0x350 io_poll_remove_one+0x17f/0x1d0 io_poll_cancel.isra.80+0x6c/0x80 io_async_find_and_cancel+0x111/0x120 io_issue_sqe+0x181/0x10e0 ? __lock_acquire+0x552/0xae0 ? lock_acquire+0x8e/0x310 ? fs_reclaim_acquire.part.97+0x5/0x30 __io_queue_sqe.part.100+0xc4/0x580 ? io_submit_sqes+0x751/0xbd0 ? rcu_read_lock_sched_held+0x32/0x40 io_submit_sqes+0x9ba/0xbd0 ? __x64_sys_io_uring_enter+0x2b2/0x460 ? __x64_sys_io_uring_enter+0xaf/0x460 ? find_held_lock+0x2d/0x90 ? __x64_sys_io_uring_enter+0x111/0x460 __x64_sys_io_uring_enter+0x2d7/0x460 do_syscall_64+0x5a/0x230 entry_SYSCALL_64_after_hwframe+0x49/0xb3 After looking into codes, it turns out that this issue is because we didn't restore the req->work, which is changed in io_arm_poll_handler(), req->work is a union with below struct: struct { struct callback_head task_work; struct hlist_node hash_node; struct async_poll *apoll; }; If we forget to restore, members in struct io_wq_work would be invalid, restore the req->work to fix this issue. Signed-off-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Get rid of not needed 'need_restore' variable. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Requests initialisation is scattered across several functions, namely io_init_req(), io_submit_sqes(), io_submit_sqe(). Put it in io_init_req() for better data locality and code clarity. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
It's a good idea to not read sqe->flags twice, as it's prone to security bugs. Instead of passing it around, embeed them in req->flags. It's already so except for IOSQE_IO_LINK. 1. rename former REQ_F_LINK -> REQ_F_LINK_HEAD 2. introduce and copy REQ_F_LINK, which mimics IO_IOSQE_LINK And leave req_set_fail_links() using new REQ_F_LINK, because it's more sensible. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Having only one place for cleaning up a request after a link assembly/ submission failure will play handy in the future. At least it allows to remove duplicated cleanup sequence. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
As a preparation for extracting request init bits, remove self-coded mm tracking from io_submit_sqes(), but rely on current->mm. It's more convenient, than passing this piece of state in other functions. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
If io_submit_sqes() can't grab an mm, it fails and exits right away. There is no need to track the fact of the failure. Remove @mm_fault. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 11 4月, 2020 22 次提交
-
-
由 Linus Torvalds 提交于
Merge yet more updates from Andrew Morton: - Almost all of the rest of MM (memcg, slab-generic, slab, pagealloc, gup, hugetlb, pagemap, memremap) - Various other things (hfs, ocfs2, kmod, misc, seqfile) * akpm: (34 commits) ipc/util.c: sysvipc_find_ipc() should increase position index kernel/gcov/fs.c: gcov_seq_next() should increase position index fs/seq_file.c: seq_read(): add info message about buggy .next functions drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings change email address for Pali Rohár selftests: kmod: test disabling module autoloading selftests: kmod: fix handling test numbers above 9 docs: admin-guide: document the kernel.modprobe sysctl fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() kmod: make request_module() return an error when autoloading is disabled mm/memremap: set caching mode for PCI P2PDMA memory to WC mm/memory_hotplug: add pgprot_t to mhp_params powerpc/mm: thread pgprot_t through create_section_mapping() x86/mm: introduce __set_memory_prot() x86/mm: thread pgprot_t through init_memory_mapping() mm/memory_hotplug: rename mhp_restrictions to mhp_params mm/memory_hotplug: drop the flags field from struct mhp_restrictions mm/special: create generic fallbacks for pte_special() and pte_mkspecial() mm/vma: introduce VM_ACCESS_FLAGS mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS ...
-
git://git.lwn.net/linux由 Linus Torvalds 提交于
Pull Documentation fixes from Jonathan Corbet: "A handful of late-arriving fixes for the documentation tree" * tag 'docs-5.7-2' of git://git.lwn.net/linux: Documentation: android: binderfs: add 'stats' mount option Documentation: driver-api/usb/writing_usb_driver.rst Updates documentation links docs: driver-api: address duplicate label warning Documentation: sysrq: fix RST formatting docs: kernel-parameters.txt: Fix broken references docs: kernel-parameters.txt: Remove nompx docs: filesystems: fix typo in qnx6.rst
-
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux由 Linus Torvalds 提交于
Pull orangefs updates from Mike Marshall: "A fix and two cleanups. Fix: - Christoph Hellwig noticed that some logic I added to orangefs_file_read_iter introduced a race condition, so he sent a reversion patch. I had to modify his patch since reverting at this point broke Orangefs. Cleanups: - Christoph Hellwig noticed that we were doing some unnecessary work in orangefs_flush, so he sent in a patch that removed the un-needed code. - Al Viro told me he had trouble building Orangefs. Orangefs should be easy to build, even for Al :-). I looked back at the test server build notes in orangefs.txt, just in case that's where the trouble really is, and found a couple of typos and made a couple of clarifications" * tag 'for-linus-5.7-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: orangefs: clarify build steps for test server in orangefs.txt orangefs: don't mess with I_DIRTY_TIMES in orangefs_flush orangefs: get rid of knob code...
-
git://github.com/jcmvbkbc/linux-xtensa由 Linus Torvalds 提交于
Pull xtensa updates from Max Filippov: - replace setup_irq() by request_irq() - cosmetic fixes in xtensa Kconfig and boot/Makefile * tag 'xtensa-20200410' of git://github.com/jcmvbkbc/linux-xtensa: arch/xtensa: fix grammar in Kconfig help text xtensa: remove meaningless export ccflags-y xtensa: replace setup_irq() by request_irq()
-
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip由 Linus Torvalds 提交于
Pull more xen updates from Juergen Gross: - two cleanups - fix a boot regression introduced in this merge window - fix wrong use of memory allocation flags * tag 'for-linus-5.7-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: fix booting 32-bit pv guest x86/xen: make xen_pvmmu_arch_setup() static xen/blkfront: fix memory allocation flags in blkfront_setup_indirect() xen: Use evtchn_type_t as a type for event channels
-
由 Vasily Averin 提交于
If seq_file .next function does not change position index, read after some lseek can generate unexpected output. https://bugzilla.kernel.org/show_bug.cgi?id=206283Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NWaiman Long <longman@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ingo Molnar <mingo@redhat.com> Cc: NeilBrown <neilb@suse.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/b7a20945-e315-8bb0-21e6-3875c14a8494@virtuozzo.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vasily Averin 提交于
If seq_file .next function does not change position index, read after some lseek can generate unexpected output. https://bugzilla.kernel.org/show_bug.cgi?id=206283Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NPeter Oberparleiter <oberpar@linux.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: NeilBrown <neilb@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Waiman Long <longman@redhat.com> Link: http://lkml.kernel.org/r/f65c6ee7-bd00-f910-2f8a-37cc67e4ff88@virtuozzo.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vasily Averin 提交于
Patch series "seq_file .next functions should increase position index". In Aug 2018 NeilBrown noticed commit 1f4aace6 ("fs/seq_file.c: simplify seq_file iteration code and interface") "Some ->next functions do not increment *pos when they return NULL... Note that such ->next functions are buggy and should be fixed. A simple demonstration is dd if=/proc/swaps bs=1000 skip=1 Choose any block size larger than the size of /proc/swaps. This will always show the whole last line of /proc/swaps" Described problem is still actual. If you make lseek into middle of last output line following read will output end of last line and whole last line once again. $ dd if=/proc/swaps bs=1 # usual output Filename Type Size Used Priority /dev/dm-0 partition 4194812 97536 -2 104+0 records in 104+0 records out 104 bytes copied $ dd if=/proc/swaps bs=40 skip=1 # last line was generated twice dd: /proc/swaps: cannot skip to specified offset v/dm-0 partition 4194812 97536 -2 /dev/dm-0 partition 4194812 97536 -2 3+1 records in 3+1 records out 131 bytes copied There are lot of other affected files, I've found 30+ including /proc/net/ip_tables_matches and /proc/sysvipc/* I've sent patches into maillists of affected subsystems already, this patch-set fixes the problem in files related to pstore, tracing, gcov, sysvipc and other subsystems processed via linux-kernel@ mailing list directly https://bugzilla.kernel.org/show_bug.cgi?id=206283 This patch (of 4): Add debug code to seq_read() to detect missed or out-of-tree incorrect .next seq_file functions. [akpm@linux-foundation.org: s/pr_info/pr_info_ratelimited/, per Qian Cai] https://bugzilla.kernel.org/show_bug.cgi?id=206283Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: NeilBrown <neilb@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Waiman Long <longman@redhat.com> Link: http://lkml.kernel.org/r/244674e5-760c-86bd-d08a-047042881748@virtuozzo.com Link: http://lkml.kernel.org/r/7c24087c-e280-e580-5b0c-0cdaeb14cd18@virtuozzo.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 kbuild test robot 提交于
Remove dev_err() messages after platform_get_irq*() failures. platform_get_irq() already prints an error. Generated by: scripts/coccinelle/api/platform_get_irq.cocci Fixes: 6c41ac96 ("dmaengine: tegra-apb: Support COMPILE_TEST") Signed-off-by: Nkbuild test robot <lkp@intel.com> Signed-off-by: NJulia Lawall <julia.lawall@inria.fr> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDmitry Osipenko <digetx@gmail.com> Acked-by: NThierry Reding <treding@nvidia.com> Cc: Laxman Dewangan <ldewangan@nvidia.com> Cc: Vinod Koul <vinod.koul@linux.intel.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Jon Hunter <jonathanh@nvidia.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.21.2002271133450.2973@hadrienSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pali Rohár 提交于
For security reasons I stopped using gmail account and kernel address is now up-to-date alias to my personal address. People periodically send me emails to address which they found in source code of drivers, so this change reflects state where people can contact me. [ Added .mailmap entry as per Joe Perches - Linus ] Signed-off-by: NPali Rohár <pali@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joe Perches <joe@perches.com> Link: http://lkml.kernel.org/r/20200307104237.8199-1-pali@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Biggers 提交于
Test that request_module() fails with -ENOENT when /proc/sys/kernel/modprobe contains (a) a nonexistent path, and (b) an empty path. Case (b) is a regression test for the patch "kmod: make request_module() return an error when autoloading is disabled". Tested with 'kmod.sh -t 0010 && kmod.sh -t 0011', and also simply with 'kmod.sh' to run all kmod tests. Signed-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NLuis Chamberlain <mcgrof@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: NeilBrown <neilb@suse.com> Link: http://lkml.kernel.org/r/20200312202552.241885-5-ebiggers@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Biggers 提交于
get_test_count() and get_test_enabled() were broken for test numbers above 9 due to awk interpreting a field specification like '$0010' as octal rather than decimal. Fix it by stripping the leading zeroes. Signed-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NLuis Chamberlain <mcgrof@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: NeilBrown <neilb@suse.com> Link: http://lkml.kernel.org/r/20200318230515.171692-5-ebiggers@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Biggers 提交于
Document the kernel.modprobe sysctl in the same place that all the other kernel.* sysctls are documented. Make sure to mention how to use this sysctl to completely disable module autoloading, and how this sysctl relates to CONFIG_STATIC_USERMODEHELPER. [ebiggers@google.com: v5] Link: http://lkml.kernel.org/r/20200318230515.171692-4-ebiggers@kernel.orgSigned-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: NeilBrown <neilb@suse.com> Link: http://lkml.kernel.org/r/20200312202552.241885-4-ebiggers@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Biggers 提交于
After request_module(), nothing is stopping the module from being unloaded until someone takes a reference to it via try_get_module(). The WARN_ONCE() in get_fs_type() is thus user-reachable, via userspace running 'rmmod' concurrently. Since WARN_ONCE() is for kernel bugs only, not for user-reachable situations, downgrade this warning to pr_warn_once(). Keep it printed once only, since the intent of this warning is to detect a bug in modprobe at boot time. Printing the warning more than once wouldn't really provide any useful extra information. Fixes: 41124db8 ("fs: warn in case userspace lied about modprobe return") Signed-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJessica Yu <jeyu@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: NeilBrown <neilb@suse.com> Cc: <stable@vger.kernel.org> [4.13+] Link: http://lkml.kernel.org/r/20200312202552.241885-3-ebiggers@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Biggers 提交于
Patch series "module autoloading fixes and cleanups", v5. This series fixes a bug where request_module() was reporting success to kernel code when module autoloading had been completely disabled via 'echo > /proc/sys/kernel/modprobe'. It also addresses the issues raised on the original thread (https://lkml.kernel.org/lkml/20200310223731.126894-1-ebiggers@kernel.org/T/#u) bydocumenting the modprobe sysctl, adding a self-test for the empty path case, and downgrading a user-reachable WARN_ONCE(). This patch (of 4): It's long been possible to disable kernel module autoloading completely (while still allowing manual module insertion) by setting /proc/sys/kernel/modprobe to the empty string. This can be preferable to setting it to a nonexistent file since it avoids the overhead of an attempted execve(), avoids potential deadlocks, and avoids the call to security_kernel_module_request() and thus on SELinux-based systems eliminates the need to write SELinux rules to dontaudit module_request. However, when module autoloading is disabled in this way, request_module() returns 0. This is broken because callers expect 0 to mean that the module was successfully loaded. Apparently this was never noticed because this method of disabling module autoloading isn't used much, and also most callers don't use the return value of request_module() since it's always necessary to check whether the module registered its functionality or not anyway. But improperly returning 0 can indeed confuse a few callers, for example get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit: if (!fs && (request_module("fs-%.*s", len, name) == 0)) { fs = __get_fs_type(name, len); WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name); } This is easily reproduced with: echo > /proc/sys/kernel/modprobe mount -t NONEXISTENT none / It causes: request_module fs-NONEXISTENT succeeded, but still no fs? WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0 [...] This should actually use pr_warn_once() rather than WARN_ONCE(), since it's also user-reachable if userspace immediately unloads the module. Regardless, request_module() should correctly return an error when it fails. So let's make it return -ENOENT, which matches the error when the modprobe binary doesn't exist. I've also sent patches to document and test this case. Signed-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NKees Cook <keescook@chromium.org> Reviewed-by: NJessica Yu <jeyu@kernel.org> Acked-by: NLuis Chamberlain <mcgrof@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Ben Hutchings <benh@debian.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org Link: http://lkml.kernel.org/r/20200312202552.241885-1-ebiggers@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Logan Gunthorpe 提交于
PCI BAR IO memory should never be mapped as WB, however prior to this the PAT bits were set WB and it was typically overridden by MTRR registers set by the firmware. Set PCI P2PDMA memory to be UC as this is what it currently, typically, ends up being mapped as on x86 after the MTRR registers override the cache setting. Future use-cases may need to generalize this by adding flags to select the caching type, as some P2PDMA cases may not want UC. However, those use-cases are not upstream yet and this can be changed when they arrive. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-8-logang@deltatee.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Logan Gunthorpe 提交于
devm_memremap_pages() is currently used by the PCI P2PDMA code to create struct page mappings for IO memory. At present, these mappings are created with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on x86, an mtrr register will typically override this and force the cache type to be UC-. In the case firmware doesn't set this register it is effectively WB and will typically result in a machine check exception when it's accessed. Other arches are not currently likely to function correctly seeing they don't have any MTRR registers to fall back on. To solve this, provide a way to specify the pgprot value explicitly to arch_add_memory(). Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a simple change to pass the pgprot_t down to their respective functions which set up the page tables. For x86_32, set the page tables explicitly using _set_memory_prot() (seeing they are already mapped). For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this should be fine, for now, seeing these architectures don't support ZONE_DEVICE. A check in __add_pages() is also added to ensure the pgprot parameter was set for all arches. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NDan Williams <dan.j.williams@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Logan Gunthorpe 提交于
In prepartion to support a pgprot_t argument for arch_add_memory(). Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-6-logang@deltatee.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Logan Gunthorpe 提交于
For use in the 32bit arch_add_memory() to set the pgprot type of the memory to add. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDan Williams <dan.j.williams@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-5-logang@deltatee.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Logan Gunthorpe 提交于
In preparation to support a pgprot_t argument for arch_add_memory(). It's required to move the prototype of init_memory_mapping() seeing the original location came before the definition of pgprot_t. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDan Williams <dan.j.williams@intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-4-logang@deltatee.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Logan Gunthorpe 提交于
The mhp_restrictions struct really doesn't specify anything resembling a restriction anymore so rename it to be mhp_params as it is a list of extended parameters. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NDan Williams <dan.j.williams@intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Logan Gunthorpe 提交于
Patch series "Allow setting caching mode in arch_add_memory() for P2PDMA", v4. Currently, the page tables created using memremap_pages() are always created with the PAGE_KERNEL cacheing mode. However, the P2PDMA code is creating pages for PCI BAR memory which should never be accessed through the cache and instead use either WC or UC. This still works in most cases, on x86, because the MTRR registers typically override the caching settings in the page tables for all of the IO memory to be UC-. However, this tends not to work so well on other arches or some rare x86 machines that have firmware which does not setup the MTRR registers in this way. Instead of this, this series proposes a change to arch_add_memory() to take the pgprot required by the mapping which allows us to explicitly set pagetable entries for P2PDMA memory to UC. This changes is pretty routine for most of the arches: x86_64, arm64 and powerpc simply need to thread the pgprot through to where the page tables are setup. x86_32 unfortunately sets up the page tables at boot so must use _set_memory_prot() to change their caching mode. ia64, s390 and sh don't appear to have an easy way to change the page tables so, for now at least, we just return -EINVAL on such mappings and thus they will not support P2PDMA memory until the work for this is done. This should be fine as they don't yet support ZONE_DEVICE. This patch (of 7): This variable is not used anywhere and should therefore be removed from the structure. Signed-off-by: NLogan Gunthorpe <logang@deltatee.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NDan Williams <dan.j.williams@intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Link: http://lkml.kernel.org/r/20200306170846.9333-2-logang@deltatee.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-