- 28 6月, 2019 1 次提交
-
-
由 Jason Gunthorpe 提交于
If the trylock on the hmm->mirrors_sem fails the function will return without decrementing the notifiers that were previously incremented. Since the caller will not call invalidate_range_end() on EAGAIN this will result in notifiers becoming permanently incremented and deadlock. If the sync_cpu_device_pagetables() required blocking the function will not return EAGAIN even though the device continues to touch the pages. This is a violation of the mmu notifier contract. Switch, and rename, the ranges_lock to a spin lock so we can reliably obtain it without blocking during error unwind. The error unwind is necessary since the notifiers count must be held incremented across the call to sync_cpu_device_pagetables() as we cannot allow the range to become marked valid by a parallel invalidate_start/end() pair while doing sync_cpu_device_pagetables(). Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
- 25 6月, 2019 3 次提交
-
-
由 Jason Gunthorpe 提交于
hmm_release() is called exactly once per hmm. ops->release() cannot accidentally trigger any action that would recurse back onto hmm->mirrors_sem. This fixes a use after-free race of the form: CPU0 CPU1 hmm_release() up_write(&hmm->mirrors_sem); hmm_mirror_unregister(mirror) down_write(&hmm->mirrors_sem); up_write(&hmm->mirrors_sem); kfree(mirror) mirror->ops->release(mirror) The only user we have today for ops->release is an empty function, so this is unambiguously safe. As a consequence of plugging this race drivers are not allowed to register/unregister mirrors from within a release op. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
由 Jason Gunthorpe 提交于
Trying to misuse a range outside its lifetime is a kernel bug. Use poison bytes to help detect this condition. Double unregister will reliably crash. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Acked-by: NSouptick Joarder <jrdr.linux@gmail.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
由 Jason Gunthorpe 提交于
No other register/unregister kernel API attempts to provide this kind of protection as it is inherently racy, so just drop it. Callers should provide their own protection, and it appears nouveau already does. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
- 18 6月, 2019 5 次提交
-
-
由 Jason Gunthorpe 提交于
So we can check locking at runtime. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Acked-by: NSouptick Joarder <jrdr.linux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
由 Jason Gunthorpe 提交于
Range functions like hmm_range_snapshot() and hmm_range_fault() call find_vma, which requires hodling the mmget() and the mmap_sem for the mm. Make this simpler for the callers by holding the mmget() inside the range for the lifetime of the range. Other functions that accept a range should only be called if the range is registered. This has the side effect of directly preventing hmm_release() from happening while a range is registered. That means range->dead cannot be false during the lifetime of the range, so remove dead and hmm_mirror_mm_is_alive() entirely. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
由 Jason Gunthorpe 提交于
This list is always read and written while holding hmm->lock so there is no need for the confusing _rcu annotations. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Acked-by: NSouptick Joarder <jrdr.linux@gmail.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NIra Weiny <iweiny@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
由 Jason Gunthorpe 提交于
The wait_event_timeout macro already tests the condition as its first action, so there is no reason to open code another version of this, all that does is skip the might_sleep() debugging in common cases, which is not helpful. Further, based on prior patches, we can now simplify the required condition test: - If range is valid memory then so is range->hmm - If hmm_release() has run then range->valid is set to false at the same time as dead, so no reason to check both. - A valid hmm has a valid hmm->mm. Allowing the return value of wait_event_timeout() (along with its internal barriers) to compute the result of the function. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
由 Jason Gunthorpe 提交于
As coded this function can false-fail in various racy situations. Make it reliable and simpler by running under the write side of the mmap_sem and avoiding the false-failing compare/exchange pattern. Due to the mmap_sem this no longer has to avoid racing with a 2nd parallel hmm_get_or_create(). Unfortunately this still has to use the page_table_lock as the non-sleeping lock protecting mm->hmm, since the contexts where we free the hmm are incompatible with mmap_sem. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
- 10 6月, 2019 2 次提交
-
-
由 Jason Gunthorpe 提交于
So long as a struct hmm pointer exists, so should the struct mm it is linked too. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it once the hmm refcount goes to zero. Since mmdrop() (ie a 0 kref on struct mm) is now impossible with a !NULL mm->hmm delete the hmm_hmm_destroy(). Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
由 Jason Gunthorpe 提交于
Ralph observes that hmm_range_register() can only be called by a driver while a mirror is registered. Make this clear in the API by passing in the mirror structure as a parameter. This also simplifies understanding the lifetime model for struct hmm, as the hmm pointer must be valid as part of a registered mirror so all we need in hmm_register_range() is a simple kref_get. Suggested-by: NRalph Campbell <rcampbell@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
- 07 6月, 2019 6 次提交
-
-
由 Jason Gunthorpe 提交于
mmu_notifier_unregister_no_release() is not a fence and the mmu_notifier system will continue to reference hmm->mn until the srcu grace period expires. Resulting in use after free races like this: CPU0 CPU1 __mmu_notifier_invalidate_range_start() srcu_read_lock hlist_for_each () // mn == hmm->mn hmm_mirror_unregister() hmm_put() hmm_free() mmu_notifier_unregister_no_release() hlist_del_init_rcu(hmm-mn->list) mn->ops->invalidate_range_start(mn, range); mm_get_hmm() mm->hmm = NULL; kfree(hmm) mutex_lock(&hmm->lock); Use SRCU to kfree the hmm memory so that the notifiers can rely on hmm existing. Get the now-safe hmm struct through container_of and directly check kref_get_unless_zero to lock it against free. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPhilip Yang <Philip.Yang@amd.com>
-
由 Kuehling, Felix 提交于
Don't set this flag by default in hmm_vma_do_fault. It is set conditionally just a few lines below. Setting it unconditionally can lead to handle_mm_fault doing a non-blocking fault, returning -EBUSY and unlocking mmap_sem unexpectedly. Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Philip Yang 提交于
While the page is migrating by NUMA balancing, HMM failed to detect this condition and still return the old page. Application will use the new page migrated, but driver pass the old page physical address to GPU, this crash the application later. Use pte_protnone(pte) to return this condition and then hmm_vma_do_fault will allocate new page. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Ralph Campbell 提交于
There are no functional changes, just some coding style clean ups and minor comment changes. Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Ralph Campbell 提交于
Update the HMM documentation to reflect the latest API and make a few minor wording changes. Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NJérôme Glisse <jglisse@redhat.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
gcc reports that several variables are defined but not used. For the first hunk CONFIG_HUGETLB_PAGE the entire if block is already protected by pud_huge() which is forced to 0. None of the stuff under the ifdef causes compilation problems as it is already stubbed out in the header files. For the second hunk the dummy huge_page_shift macro doesn't touch the argument, so just inline the argument. Link: http://lkml.kernel.org/r/20190522195151.GA23955@ziepe.caSigned-off-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 03 6月, 2019 13 次提交
-
-
由 Linus Torvalds 提交于
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull x86 fixes from Ingo Molnar: "Two fixes: a quirk for KVM guests running on certain AMD CPUs, and a KASAN related build fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor x86/boot: Provide KASAN compatible aliases for string routines
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull perf fixes from Ingo Molnar: "On the kernel side there's a bunch of ring-buffer ordering fixes for a reproducible bug, plus a PEBS constraints regression fix. Plus tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools headers UAPI: Sync kvm.h headers with the kernel sources perf record: Fix s390 missing module symbol and warning for non-root users perf machine: Read also the end of the kernel perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms perf session: Add missing swap ops for namespace events perf namespace: Protect reading thread's namespace tools headers UAPI: Sync drm/drm.h with the kernel tools headers UAPI: Sync drm/i915_drm.h with the kernel tools headers UAPI: Sync linux/fs.h with the kernel tools headers UAPI: Sync linux/sched.h with the kernel tools arch x86: Sync asm/cpufeatures.h with the with the kernel tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel perf data: Fix 'strncat may truncate' build failure with recent gcc perf/ring-buffer: Use regular variables for nesting perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb->user_page data perf/ring_buffer: Add ordering to rb->nest increment perf/ring_buffer: Fix exposing a temporarily decreased data_head perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull EFI fixes from Ingo Molnar: "Two EFI fixes: a quirk for weird systabs, plus add more robust error handling in the old 1:1 mapping code" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Allow the number of EFI configuration tables entries to be zero efi/x86/Add missing error handling to old_memmap 1:1 mapping code
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull stacktrace fix from Ingo Molnar: "Fix a stack_trace_save_tsk_reliable() regression" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: stacktrace: Unbreak stack_trace_save_tsk_reliable()
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core由 Linus Torvalds 提交于
Pull SPDX fixes from Greg KH: "Here are just two small patches, that fix up some found SPDX identifier issues. The first patch fixes an error in a previous SPDX fixup patch, that causes build errors when doing 'make clean' on the tree (the fact that almost no one noticed it reflects the fact that kernel developers don't like doing that option very often...) The second patch fixes up a number of places in the tree where people mistyped the string "SPDX-License-Identifier". Given that people can not even type their own name all the time without mistakes, this was bound to happen, and odds are, we will have to add some type of check for this to checkpatch.pl to catch this happening in the future. Both of these have passed testing by 0-day" * tag 'spdx-5.2-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: treewide: fix typos of SPDX-License-Identifier crypto: ux500 - fix license comment syntax error
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux由 Linus Torvalds 提交于
Pull powerpc fixes from Michael Ellerman: "A minor fix to our IMC PMU code to print a less confusing error message when the driver can't initialise properly. A fix for a bug where a user requesting an unsupported branch sampling filter can corrupt PMU state, preventing the PMU from counting properly. And finally a fix for a bug in our support for kexec_file_load(), which prevented loading a kernel and initramfs. Most versions of kexec don't yet use kexec_file_load(). Thanks to: Anju T Sudhakar, Dave Young, Madhavan Srinivasan, Ravi Bangoria, Thiago Jung Bauermann" * tag 'powerpc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/kexec: Fix loading of kernel + initramfs with kexec_file_load() powerpc/perf: Fix MMCRA corruption by bhrb_filter powerpc/powernv: Return for invalid IMC domain
-
git://git.kernel.org/pub/scm/virt/kvm/kvm由 Linus Torvalds 提交于
Pull KVM fixes from Paolo Bonzini: "Fixes for PPC and s390" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S HV: Restore SPRG3 in kvmhv_p9_guest_entry() KVM: PPC: Book3S HV: Fix lockdep warning when entering guest on POWER9 KVM: PPC: Book3S HV: XIVE: Fix page offset when clearing ESB pages KVM: PPC: Book3S HV: XIVE: Take the srcu read lock when accessing memslots KVM: PPC: Book3S HV: XIVE: Do not clear IRQ data of passthrough interrupts KVM: PPC: Book3S HV: XIVE: Introduce a new mutex for the XIVE device KVM: PPC: Book3S HV: XIVE: Fix the enforced limit on the vCPU identifier KVM: PPC: Book3S HV: XIVE: Do not test the EQ flag validity when resetting KVM: PPC: Book3S HV: XIVE: Clear file mapping when device is released KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list KVM: PPC: Book3S HV: Use new mutex to synchronize MMU setup KVM: PPC: Book3S HV: Avoid touching arch.mmu_ready in XIVE release functions KVM: s390: Do not report unusabled IDs via KVM_CAP_MAX_VCPU_ID kvm: fix compile on s390 part 2
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux由 Linus Torvalds 提交于
Pull i2c fixes from Wolfram Sang: "A memleak fix for the core, two driver bugfixes, as well as fixing missing file patterns to MAINTAINERS" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: MAINTAINERS: add I2C DT bindings to ARM platforms MAINTAINERS: add DT bindings to i2c drivers i2c: synquacer: fix synquacer_i2c_doxfer() return value i2c: mlxcpld: Fix wrong initialization order in probe i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr
-
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal由 Linus Torvalds 提交于
Pull thermal SoC fix from Eduardo Valentin: "A single revert, detected to cause issues on the tsens driver" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: Revert "drivers: thermal: tsens: Add new operation to check if a sensor is enabled"
-
由 Linus Torvalds 提交于
Merge tag 'led-fixes-for-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED fix from Jacek Anaszewski: "Fix for a recent change in LED core, that didn't take into account the possibility of calling led_blink_setup() from atomic context" * tag 'led-fixes-for-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: leds: avoid flush_work in atomic context
-
git://git.kernel.dk/linux-block由 Linus Torvalds 提交于
Pull block fixes from Jens Axboe: - A set of patches fixing code comments / kerneldoc (Bart) - Don't allow loop file change for exclusive open (Jan) - Fix revalidate of hidden genhd (Jan) - Init queue failure memory free fix (Jes) - Improve rq limits failure print (John) - Fixup for queue removal/addition (Ming) - Missed error progagation for io_uring buffer registration (Pavel) * tag 'for-linus-20190601' of git://git.kernel.dk/linux-block: block: print offending values when cloned rq limits are exceeded blk-mq: Document the blk_mq_hw_queue_to_node() arguments blk-mq: Fix spelling in a source code comment block: Fix bsg_setup_queue() kernel-doc header block: Fix rq_qos_wait() kernel-doc header block: Fix blk_mq_*_map_queues() kernel-doc headers block: Fix throtl_pending_timer_fn() kernel-doc header block: Convert blk_invalidate_devt() header into a non-kernel-doc header block/partitions/ldm: Convert a kernel-doc header into a non-kernel-doc header blk-mq: Fix memory leak in error handling block: don't protect generic_make_request_checks with blk_queue_enter block: move blk_exit_queue into __blk_release_queue block: Don't revalidate bdev of hidden gendisk loop: Don't change loop device under exclusive opener io_uring: Fix __io_uring_register() false success
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi由 Linus Torvalds 提交于
Pull SCSI fixes from James Bottomley: "Six minor fixes to device drivers and one to the multipath alua handler. The most extensive fix is the zfcp port remove prevention one, but it's impact is only s390" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: libsas: delete sas port if expander discover failed scsi: libsas: only clear phy->in_shutdown after shutdown event done scsi: scsi_dh_alua: Fix possible null-ptr-deref scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs) scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route()
-
- 02 6月, 2019 10 次提交
-
-
由 Linus Torvalds 提交于
Merge misc fixes from Andrew Morton: "Various fixes and followups" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, compaction: make sure we isolate a valid PFN include/linux/generic-radix-tree.h: fix kerneldoc comment kernel/signal.c: trace_signal_deliver when signal_group_exit drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used spdxcheck.py: fix directory structures kasan: initialize tag to 0xff in __kasan_kmalloc z3fold: fix sheduling while atomic scripts/gdb: fix invocation when CONFIG_COMMON_CLK is not set mm/gup: continue VM_FAULT_RETRY processing even for pre-faults ocfs2: fix error path kobject memory leak memcg: make it work on sparse non-0-node systems mm, memcg: consider subtrees in memory.events prctl_set_mm: downgrade mmap_sem to read lock prctl_set_mm: refactor checks from validate_prctl_map kernel/fork.c: make max_threads symbol static arch/arm/boot/compressed/decompress.c: fix build error due to lz4 changes arch/parisc/configs/c8000_defconfig: remove obsoleted CONFIG_DEBUG_SLAB_LEAK mm/vmalloc.c: fix typo in comment lib/sort.c: fix kernel-doc notation warnings mm: fix Documentation/vm/hmm.rst Sphinx warnings
-
由 Suzuki K Poulose 提交于
When we have holes in a normal memory zone, we could endup having cached_migrate_pfns which may not necessarily be valid, under heavy memory pressure with swapping enabled ( via __reset_isolation_suitable(), triggered by kswapd). Later if we fail to find a page via fast_isolate_freepages(), we may end up using the migrate_pfn we started the search with, as valid page. This could lead to accessing NULL pointer derefernces like below, due to an invalid mem_section pointer. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 [47/1825] Mem abort info: ESR = 0x96000004 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000082f94ae9 [0000000000000008] pgd=0000000000000000 Internal error: Oops: 96000004 [#1] SMP ... CPU: 10 PID: 6080 Comm: qemu-system-aar Not tainted 510-rc1+ #6 Hardware name: AmpereComputing(R) OSPREY EV-883832-X3-0001/OSPREY, BIOS 4819 09/25/2018 pstate: 60000005 (nZCv daif -PAN -UAO) pc : set_pfnblock_flags_mask+0x58/0xe8 lr : compaction_alloc+0x300/0x950 [...] Process qemu-system-aar (pid: 6080, stack limit = 0x0000000095070da5) Call trace: set_pfnblock_flags_mask+0x58/0xe8 compaction_alloc+0x300/0x950 migrate_pages+0x1a4/0xbb0 compact_zone+0x750/0xde8 compact_zone_order+0xd8/0x118 try_to_compact_pages+0xb4/0x290 __alloc_pages_direct_compact+0x84/0x1e0 __alloc_pages_nodemask+0x5e0/0xe18 alloc_pages_vma+0x1cc/0x210 do_huge_pmd_anonymous_page+0x108/0x7c8 __handle_mm_fault+0xdd4/0x1190 handle_mm_fault+0x114/0x1c0 __get_user_pages+0x198/0x3c0 get_user_pages_unlocked+0xb4/0x1d8 __gfn_to_pfn_memslot+0x12c/0x3b8 gfn_to_pfn_prot+0x4c/0x60 kvm_handle_guest_abort+0x4b0/0xcd8 handle_exit+0x140/0x1b8 kvm_arch_vcpu_ioctl_run+0x260/0x768 kvm_vcpu_ioctl+0x490/0x898 do_vfs_ioctl+0xc4/0x898 ksys_ioctl+0x8c/0xa0 __arm64_sys_ioctl+0x28/0x38 el0_svc_common+0x74/0x118 el0_svc_handler+0x38/0x78 el0_svc+0x8/0xc Code: f8607840 f100001f 8b011401 9a801020 (f9400400) ---[ end trace af6a35219325a9b6 ]--- The issue was reported on an arm64 server with 128GB with holes in the zone (e.g, [32GB@4GB, 96GB@544GB]), with a swap device enabled, while running 100 KVM guest instances. This patch fixes the issue by ensuring that the page belongs to a valid PFN when we fallback to using the lower limit of the scan range upon failure in fast_isolate_freepages(). Link: http://lkml.kernel.org/r/1558711908-15688-1-git-send-email-suzuki.poulose@arm.com Fixes: 5a811889 ("mm, compaction: use free lists to quickly locate a migration target") Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Reported-by: NMarc Zyngier <marc.zyngier@arm.com> Reviewed-by: NMel Gorman <mgorman@techsingularity.net> Reviewed-by: NAnshuman Khandual <anshuman.khandual@arm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Qian Cai <cai@lca.pw> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jonathan Corbet 提交于
The DOC comment block section in include/linux/generic-radix-tree.h contained a spurious colon, causing this warning in the documentation build: include/linux/generic-radix-tree.h:1: warning: no structured comments found Remove the colon and make the docs build happy. Link: http://lkml.kernel.org/r/20190524141933.74ae9050@lwn.netSigned-off-by: NJonathan Corbet <corbet@lwn.net> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhenliang Wei 提交于
In the fixes commit, removing SIGKILL from each thread signal mask and executing "goto fatal" directly will skip the call to "trace_signal_deliver". At this point, the delivery tracking of the SIGKILL signal will be inaccurate. Therefore, we need to add trace_signal_deliver before "goto fatal" after executing sigdelset. Note: SEND_SIG_NOINFO matches the fact that SIGKILL doesn't have any info. Link: http://lkml.kernel.org/r/20190425025812.91424-1-weizhenliang@huawei.com Fixes: cf43a757 ("signal: Restore the stop PTRACE_EVENT_EXIT") Signed-off-by: NZhenliang Wei <weizhenliang@huawei.com> Reviewed-by: NChristian Brauner <christian@brauner.io> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Ivan Delalande <colona@arista.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Deepa Dinamani <deepa.kernel@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Qian Cai 提交于
Commit cf04eee8 ("iommu/vt-d: Include ACPI devices in iommu=pt") added for_each_active_iommu() in iommu_prepare_static_identity_mapping() but never used the each element, i.e, "drhd->iommu". drivers/iommu/intel-iommu.c: In function 'iommu_prepare_static_identity_mapping': drivers/iommu/intel-iommu.c:3037:22: warning: variable 'iommu' set but not used [-Wunused-but-set-variable] struct intel_iommu *iommu; Fixed the warning by appending a compiler attribute __maybe_unused for it. Link: http://lkml.kernel.org/r/20190523013314.2732-1-cai@lca.pwSigned-off-by: NQian Cai <cai@lca.pw> Suggested-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vincenzo Frascino 提交于
The LICENSE directory has recently changed structure and this makes spdxcheck fails as per below: FAIL: "Blob or Tree named 'other' not found" Traceback (most recent call last): File "scripts/spdxcheck.py", line 240, in <module> spdx = read_spdxdata(repo) File "scripts/spdxcheck.py", line 41, in read_spdxdata for el in lictree[d].traverse(): [...] KeyError: "Blob or Tree named 'other' not found" Fix the script to restore the correctness on checkpatch License checking. References: 62be257e ("LICENSES: Rename other to deprecated") References: 8ea8814f ("LICENSES: Clearly mark dual license only licenses") Link: http://lkml.kernel.org/r/20190523084755.56739-1-vincenzo.frascino@arm.comSigned-off-by: NVincenzo Frascino <vincenzo.frascino@arm.com> Cc: Joe Perches <joe@perches.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeremy Cline <jcline@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nathan Chancellor 提交于
When building with -Wuninitialized and CONFIG_KASAN_SW_TAGS unset, Clang warns: mm/kasan/common.c:484:40: warning: variable 'tag' is uninitialized when used here [-Wuninitialized] kasan_unpoison_shadow(set_tag(object, tag), size); ^~~ set_tag ignores tag in this configuration but clang doesn't realize it at this point in its pipeline, as it points to arch_kasan_set_tag as being the point where it is used, which will later be expanded to (void *)(object) without a use of tag. Initialize tag to 0xff, as it removes this warning and doesn't change the meaning of the code. Link: https://github.com/ClangBuiltLinux/linux/issues/465 Link: http://lkml.kernel.org/r/20190502163057.6603-1-natechancellor@gmail.com Fixes: 7f94ffbc ("kasan: add hooks implementation for tag-based mode") Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Reviewed-by: NAndrey Konovalov <andreyknvl@google.com> Reviewed-by: NAndrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vitaly Wool 提交于
kmem_cache_alloc() may be called from z3fold_alloc() in atomic context, so we need to pass correct gfp flags to avoid "scheduling while atomic" bug. Link: http://lkml.kernel.org/r/20190523153245.119dfeed55927e8755250ddd@gmail.com Fixes: 7c2b8baa ("mm/z3fold.c: add structure for buddy handles") Signed-off-by: NVitaly Wool <vitaly.vul@sony.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Fabiano Rosas 提交于
CLK_GET_RATE_NOCACHE depends on CONFIG_COMMON_CLK. Importing constants.py when CONFIG_COMMON_CLK is not defined causes: (gdb) lx-symbols (...) File "scripts/gdb/linux/proc.py", line 15, in <module> from linux import constants File "scripts/gdb/linux/constants.py", line 2, in <module> LX_CLK_GET_RATE_NOCACHE = gdb.parse_and_eval("CLK_GET_RATE_NOCACHE") gdb.error: No symbol "CLK_GET_RATE_NOCACHE" in current context. Link: http://lkml.kernel.org/r/20190523195313.24701-1-farosas@linux.ibm.com Fixes: e7e6f462 ("scripts/gdb: print cached rate in lx-clk-summary") Signed-off-by: NFabiano Rosas <farosas@linux.ibm.com> Reviewed-by: NStephen Boyd <swboyd@chromium.org> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Leonard Crestez <leonard.crestez@nxp.com> Cc: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Rapoport 提交于
When get_user_pages*() is called with pages = NULL, the processing of VM_FAULT_RETRY terminates early without actually retrying to fault-in all the pages. If the pages in the requested range belong to a VMA that has userfaultfd registered, handle_userfault() returns VM_FAULT_RETRY *after* user space has populated the page, but for the gup pre-fault case there's no actual retry and the caller will get no pages although they are present. This issue was uncovered when running post-copy memory restore in CRIU after d9c9ce34 ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails"). After this change, the copying of FPU state to the sigframe switched from copy_to_user() variants which caused a real page fault to get_user_pages() with pages parameter set to NULL. In post-copy mode of CRIU, the destination memory is managed with userfaultfd and lack of the retry for pre-fault case in get_user_pages() causes a crash of the restored process. Making the pre-fault behavior of get_user_pages() the same as the "normal" one fixes the issue. Link: http://lkml.kernel.org/r/1557844195-18882-1-git-send-email-rppt@linux.ibm.com Fixes: d9c9ce34 ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails") Signed-off-by: NMike Rapoport <rppt@linux.ibm.com> Tested-by: Andrei Vagin <avagin@gmail.com> [https://travis-ci.org/avagin/linux/builds/533184940] Tested-by: NHugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Borislav Petkov <bp@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-