1. 06 3月, 2020 8 次提交
    • V
      mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled · c87cbc1f
      Vlastimil Babka 提交于
      Commit cd02cf1a ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
      fixed memory hotplug with debug_pagealloc enabled, where onlining a page
      goes through page freeing, which removes the direct mapping.  Some arches
      don't like when the page is not mapped in the first place, so
      generic_online_page() maps it first.  This is somewhat wasteful, but
      better than special casing page freeing fast paths.
      
      The commit however missed that DEBUG_PAGEALLOC configured doesn't mean
      it's actually enabled.  One has to test debug_pagealloc_enabled() since
      031bc574 ("mm/debug-pagealloc: make debug-pagealloc boottime
      configurable"), or alternatively debug_pagealloc_enabled_static() since
      8e57f8ac ("mm, debug_pagealloc: don't rely on static keys too early"),
      but this is not done.
      
      As a result, a s390 kernel with DEBUG_PAGEALLOC configured but not enabled
      will crash:
      
      Unable to handle kernel pointer dereference in virtual kernel address space
      Failing address: 0000000000000000 TEID: 0000000000000483
      Fault in home space mode while using kernel ASCE.
      AS:0000001ece13400b R2:000003fff7fd000b R3:000003fff7fcc007 S:000003fff7fd7000 P:000000000000013d
      Oops: 0004 ilc:2 [#1] SMP
      CPU: 1 PID: 26015 Comm: chmem Kdump: loaded Tainted: GX 5.3.18-5-default #1 SLE15-SP2 (unreleased)
      Krnl PSW : 0704e00180000000 0000001ecd281b9e (__kernel_map_pages+0x166/0x188)
      R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
      Krnl GPRS: 0000000000000000 0000000000000800 0000400b00000000 0000000000000100
      0000000000000001 0000000000000000 0000000000000002 0000000000000100
      0000001ece139230 0000001ecdd98d40 0000400b00000100 0000000000000000
      000003ffa17e4000 001fffe0114f7d08 0000001ecd4d93ea 001fffe0114f7b20
      Krnl Code: 0000001ecd281b8e: ec17ffff00d8 ahik %r1,%r7,-1
      0000001ecd281b94: ec111dbc0355 risbg %r1,%r1,29,188,3
      >0000001ecd281b9e: 94fb5006 ni 6(%r5),251
      0000001ecd281ba2: 41505008 la %r5,8(%r5)
      0000001ecd281ba6: ec51fffc6064 cgrj %r5,%r1,6,1ecd281b9e
      0000001ecd281bac: 1a07 ar %r0,%r7
      0000001ecd281bae: ec03ff584076 crj %r0,%r3,4,1ecd281a5e
      Call Trace:
      [<0000001ecd281b9e>] __kernel_map_pages+0x166/0x188
      [<0000001ecd4d9516>] online_pages_range+0xf6/0x128
      [<0000001ecd2a8186>] walk_system_ram_range+0x7e/0xd8
      [<0000001ecda28aae>] online_pages+0x2fe/0x3f0
      [<0000001ecd7d02a6>] memory_subsys_online+0x8e/0xc0
      [<0000001ecd7add42>] device_online+0x5a/0xc8
      [<0000001ecd7d0430>] state_store+0x88/0x118
      [<0000001ecd5b9f62>] kernfs_fop_write+0xc2/0x200
      [<0000001ecd5064b6>] vfs_write+0x176/0x1e0
      [<0000001ecd50676a>] ksys_write+0xa2/0x100
      [<0000001ecda315d4>] system_call+0xd8/0x2c8
      
      Fix this by checking debug_pagealloc_enabled_static() before calling
      kernel_map_pages(). Backports for kernel before 5.5 should use
      debug_pagealloc_enabled() instead. Also add comments.
      
      Fixes: cd02cf1a ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
      Reported-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NVlastimil Babka <vbabka@suse.cz>
      Reviewed-by: NDavid Hildenbrand <david@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Qian Cai <cai@lca.pw>
      Link: http://lkml.kernel.org/r/20200224094651.18257-1-vbabka@suse.czSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c87cbc1f
    • S
      mm/z3fold.c: do not include rwlock.h directly · a8198fed
      Sebastian Andrzej Siewior 提交于
      rwlock.h should not be included directly. Instead linux/splinlock.h
      should be included. One thing it does is to break the RT build.
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20200224133631.1510569-1-bigeasy@linutronix.deSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a8198fed
    • O
      fat: fix uninit-memory access for partial initialized inode · bc87302a
      OGAWA Hirofumi 提交于
      When get an error in the middle of reading an inode, some fields in the
      inode might be still not initialized.  And then the evict_inode path may
      access those fields via iput().
      
      To fix, this makes sure that inode fields are initialized.
      
      Reported-by: syzbot+9d82b8de2992579da5d0@syzkaller.appspotmail.com
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/871rqnreqx.fsf@mail.parknet.co.jpSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bc87302a
    • K
      mm: avoid data corruption on CoW fault into PFN-mapped VMA · c3e5ea6e
      Kirill A. Shutemov 提交于
      Jeff Moyer has reported that one of xfstests triggers a warning when run
      on DAX-enabled filesystem:
      
      	WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
      	...
      	wp_page_copy+0x98c/0xd50 (unreliable)
      	do_wp_page+0xd8/0xad0
      	__handle_mm_fault+0x748/0x1b90
      	handle_mm_fault+0x120/0x1f0
      	__do_page_fault+0x240/0xd70
      	do_page_fault+0x38/0xd0
      	handle_page_fault+0x10/0x30
      
      The warning happens on failed __copy_from_user_inatomic() which tries to
      copy data into a CoW page.
      
      This happens because of race between MADV_DONTNEED and CoW page fault:
      
      	CPU0					CPU1
       handle_mm_fault()
         do_wp_page()
           wp_page_copy()
             do_wp_page()
      					madvise(MADV_DONTNEED)
      					  zap_page_range()
      					    zap_pte_range()
      					      ptep_get_and_clear_full()
      					      <TLB flush>
      	 __copy_from_user_inatomic()
      	 sees empty PTE and fails
      	 WARN_ON_ONCE(1)
      	 clear_page()
      
      The solution is to re-try __copy_from_user_inatomic() under PTL after
      checking that PTE is matches the orig_pte.
      
      The second copy attempt can still fail, like due to non-readable PTE, but
      there's nothing reasonable we can do about, except clearing the CoW page.
      Reported-by: NJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Justin He <Justin.He@arm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Link: http://lkml.kernel.org/r/20200218154151.13349-1-kirill.shutemov@linux.intel.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c3e5ea6e
    • H
      mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() · 8a8683ad
      Huang Ying 提交于
      In set_pmd_migration_entry(), pmdp_invalidate() is used to change PMD
      atomically.  But the PMD is read before that with an ordinary memory
      reading.  If the THP (transparent huge page) is written between the PMD
      reading and pmdp_invalidate(), the PMD dirty bit may be lost, and cause
      data corruption.  The race window is quite small, but still possible in
      theory, so need to be fixed.
      
      The race is fixed via using the return value of pmdp_invalidate() to get
      the original content of PMD, which is a read/modify/write atomic
      operation.  So no THP writing can occur in between.
      
      The race has been introduced when the THP migration support is added in
      the commit 616b8371 ("mm: thp: enable thp migration in generic path").
      But this fix depends on the commit d52605d7 ("mm: do not lose dirty
      and accessed bits in pmdp_invalidate()").  So it's easy to be backported
      after v4.16.  But the race window is really small, so it may be fine not
      to backport the fix at all.
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: N"Huang, Ying" <ying.huang@intel.com>
      Reviewed-by: NZi Yan <ziy@nvidia.com>
      Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com>
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Link: http://lkml.kernel.org/r/20200220075220.2327056-1-ying.huang@intel.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a8683ad
    • M
      mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa · 8b272b3c
      Mel Gorman 提交于
      : A user reported a bug against a distribution kernel while running a
      : proprietary workload described as "memory intensive that is not swapping"
      : that is expected to apply to mainline kernels.  The workload is
      : read/write/modifying ranges of memory and checking the contents.  They
      : reported that within a few hours that a bad PMD would be reported followed
      : by a memory corruption where expected data was all zeros.  A partial
      : report of the bad PMD looked like
      :
      :   [ 5195.338482] ../mm/pgtable-generic.c:33: bad pmd ffff8888157ba008(000002e0396009e2)
      :   [ 5195.341184] ------------[ cut here ]------------
      :   [ 5195.356880] kernel BUG at ../mm/pgtable-generic.c:35!
      :   ....
      :   [ 5195.410033] Call Trace:
      :   [ 5195.410471]  [<ffffffff811bc75d>] change_protection_range+0x7dd/0x930
      :   [ 5195.410716]  [<ffffffff811d4be8>] change_prot_numa+0x18/0x30
      :   [ 5195.410918]  [<ffffffff810adefe>] task_numa_work+0x1fe/0x310
      :   [ 5195.411200]  [<ffffffff81098322>] task_work_run+0x72/0x90
      :   [ 5195.411246]  [<ffffffff81077139>] exit_to_usermode_loop+0x91/0xc2
      :   [ 5195.411494]  [<ffffffff81003a51>] prepare_exit_to_usermode+0x31/0x40
      :   [ 5195.411739]  [<ffffffff815e56af>] retint_user+0x8/0x10
      :
      : Decoding revealed that the PMD was a valid prot_numa PMD and the bad PMD
      : was a false detection.  The bug does not trigger if automatic NUMA
      : balancing or transparent huge pages is disabled.
      :
      : The bug is due a race in change_pmd_range between a pmd_trans_huge and
      : pmd_nond_or_clear_bad check without any locks held.  During the
      : pmd_trans_huge check, a parallel protection update under lock can have
      : cleared the PMD and filled it with a prot_numa entry between the transhuge
      : check and the pmd_none_or_clear_bad check.
      :
      : While this could be fixed with heavy locking, it's only necessary to make
      : a copy of the PMD on the stack during change_pmd_range and avoid races.  A
      : new helper is created for this as the check if quite subtle and the
      : existing similar helpful is not suitable.  This passed 154 hours of
      : testing (usually triggers between 20 minutes and 24 hours) without
      : detecting bad PMDs or corruption.  A basic test of an autonuma-intensive
      : workload showed no significant change in behaviour.
      
      Although Mel withdrew the patch on the face of LKML comment
      https://lkml.org/lkml/2017/4/10/922 the race window aforementioned is
      still open, and we have reports of Linpack test reporting bad residuals
      after the bad PMD warning is observed.  In addition to that, bad
      rss-counter and non-zero pgtables assertions are triggered on mm teardown
      for the task hitting the bad PMD.
      
       host kernel: mm/pgtable-generic.c:40: bad pmd 00000000b3152f68(8000000d2d2008e7)
       ....
       host kernel: BUG: Bad rss-counter state mm:00000000b583043d idx:1 val:512
       host kernel: BUG: non-zero pgtables_bytes on freeing mm: 4096
      
      The issue is observed on a v4.18-based distribution kernel, but the race
      window is expected to be applicable to mainline kernels, as well.
      
      [akpm@linux-foundation.org: fix comment typo, per Rafael]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NRafael Aquini <aquini@redhat.com>
      Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
      Cc: <stable@vger.kernel.org>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@suse.com>
      Link: http://lkml.kernel.org/r/20200216191800.22423-1-aquini@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8b272b3c
    • L
      Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux · 9f65ed5f
      Linus Torvalds 提交于
      Pull Hyper-V update from Wei Liu:
      
       - Update MAINTAINERS file for Hyper-V
      
       - One cleanup patch for Hyper-V HID driver
      
      * tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
        HID: hyperv: NULL check before some freeing functions is not needed.
        Hyper-V: add myself as a maintainer
        Hyper-V: Drop Sasha Levin from the Hyper-V maintainers
      9f65ed5f
    • L
      Merge tag 'dmaengine-fix-5.6-rc5' of git://git.infradead.org/users/vkoul/slave-dma · 6fd145da
      Linus Torvalds 提交于
      Pull dmaengine fixes from Vinod Koul:
       "A bunch of driver fixes:
      
         - Doc updates to clean warnings for dmaengine
      
         - Fixes for newly added Intel idxd driver
      
         - More fixes for newly added TI k3-udma driver
      
         - Fixes for IMX and Tegra drivers"
      
      * tag 'dmaengine-fix-5.6-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: imx-sdma: Fix the event id check to include RX event for UART6
        dmaengine: tegra-apb: Prevent race conditions of tasklet vs free list
        dmaengine: tegra-apb: Fix use-after-free
        dmaengine: imx-sdma: fix context cache
        dmaengine: idxd: wq size configuration needs to check global max size
        dmaengine: idxd: sysfs input of wq incorrect wq type should return error
        dmaengine: coh901318: Fix a double lock bug in dma_tc_handle()
        dmaengine: idxd: correct reserved token calculation
        dmaengine: ti: k3-udma: Fix terminated transfer handling
        dmaengine: ti: k3-udma: Use the channel direction in pause/resume functions
        dmaengine: ti: k3-udma: Use the TR counter helper for slave_sg and cyclic
        dmaengine: ti: k3-udma: Move the TR counter calculation to helper function
        dmaengine: ti: k3-udma: Workaround for RX teardown with stale data in peer
        dmaengine: ti: k3-udma: Use ktime/usleep_range based TX completion check
        dmaengine: idxd: Fix error handling in idxd_wq_cdev_dev_setup()
        dmaengine: doc: fix warnings/issues of client.rst
        dmaengine: idxd: fix runaway module ref count on device driver bind
      6fd145da
  2. 05 3月, 2020 4 次提交
  3. 04 3月, 2020 3 次提交
    • L
      Merge tag '5.6-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 8b614cb8
      Linus Torvalds 提交于
      Pull cifs fixes from Steve French:
       "Five small cifs/smb3 fixes, two for stable (one for a reconnect
        problem and the other fixes a use case when renaming an open file)"
      
      * tag '5.6-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Use #define in cifs_dbg
        cifs: fix rename() by ensuring source handle opened with DELETE bit
        cifs: add missing mount option to /proc/mounts
        cifs: fix potential mismatch of UNC paths
        cifs: don't leak -EAGAIN for stat() during reconnect
      8b614cb8
    • M
      dm: bump version of core and various targets · 636be424
      Mike Snitzer 提交于
      Changes made during the 5.6 cycle warrant bumping the version number
      for DM core and the targets modified by this commit.
      
      It should be noted that dm-thin, dm-crypt and dm-raid already had
      their target version bumped during the 5.6 merge window.
      
      Signed-off-by; Mike Snitzer <snitzer@redhat.com>
      636be424
    • H
      dm: fix congested_fn for request-based device · 974f51e8
      Hou Tao 提交于
      We neither assign congested_fn for requested-based blk-mq device nor
      implement it correctly. So fix both.
      
      Also, remove incorrect comment from dm_init_normal_md_queue and rename
      it to dm_init_congested_fn.
      
      Fixes: 4aa9c692 ("bdi: separate out congested state into a separate struct")
      Cc: stable@vger.kernel.org
      Signed-off-by: NHou Tao <houtao1@huawei.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      974f51e8
  4. 03 3月, 2020 2 次提交
  5. 02 3月, 2020 8 次提交
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2873dc25
      Linus Torvalds 提交于
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes: a pkeys fix for a bug that triggers with weird BIOS
        settings, and two Xen PV fixes: a paravirt interface fix, and
        pagetable dumping fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm: Fix dump_pagetables with Xen PV
        x86/ioperm: Add new paravirt function update_io_bitmap()
        x86/pkeys: Manually set X86_FEATURE_OSPKE to preserve existing changes
      2873dc25
    • L
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c105df5d
      Linus Torvalds 提交于
      Pull scheduler fix from Ingo Molnar:
       "Fix a scheduler statistics bug"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Fix statistics for find_idlest_group()
      c105df5d
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 852fb4a7
      Linus Torvalds 提交于
      Pull perf fixes from Ingo Molnar:
       "No kernel side changes, all tooling fixes plus two tooling cleanups
        that were committed late in the merge window alongside the perf
        annotate fixes, delayed by Arnaldo's European trip"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
        perf annotate: Fix segfault with source toggle
        perf annotate: Align struct annotate_args
        perf annotate: Simplify disasm_line allocation and freeing code
        perf annotate: Remove privsize from symbol__annotate() args
        perf probe: Check return value of strlist__add() for -ENOMEM
        perf config: Document missing config options
        perf annotate: Fix perf config option description
        perf annotate: Prefer cmdline option over default config
        perf annotate: Make perf config effective
        perf config: Introduce perf_config_u8()
        perf annotate: Fix --show-nr-samples for tui/stdio2
        perf annotate: Fix --show-total-period for tui/stdio2
        perf annotate/tui: Re-render title bar after switching back from script browser
        tools headers UAPI: Update tools's copy of kvm.h headers
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        perf arch powerpc: Sync powerpc syscall.tbl with the kernel sources
        perf auxtrace: Add auxtrace_record__read_finish()
        perf arm-spe: Fix endless record after being terminated
        perf cs-etm: Fix endless record after being terminated
        perf intel-bts: Fix endless record after being terminated
        ...
      852fb4a7
    • L
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e130a920
      Linus Torvalds 提交于
      Pull EFI fixes from Ingo Molnar:
       "Three fixes to EFI mixed boot mode, mostly related to x86-64 vmap
        stacks activated years ago, bug-fixed recently for EFI, which had
        knock-on effects of various 1:1 mapping assumptions in mixed mode.
      
        There's also a READ_ONCE() fix for reading an mmap-ed EFI firmware
        data field only once, out of caution"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: READ_ONCE rng seed size before munmap
        efi/x86: Handle by-ref arguments covering multiple pages in mixed mode
        efi/x86: Remove support for EFI time and counter services in mixed mode
        efi/x86: Align GUIDs to their size in the mixed mode runtime wrapper
      e130a920
    • L
      Linux 5.6-rc4 · 98d54f81
      Linus Torvalds 提交于
      98d54f81
    • L
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · e7086982
      Linus Torvalds 提交于
      Pull ext4 fixes from Ted Ts'o:
       "Two more bug fixes (including a regression) for 5.6"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()
        jbd2: fix data races at struct journal_head
      e7086982
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · f853ed90
      Linus Torvalds 提交于
      Pull KVM fixes from Paolo Bonzini:
       "More bugfixes, including a few remaining "make W=1" issues such as too
        large frame sizes on some configurations.
      
        On the ARM side, the compiler was messing up shadow stacks between EL1
        and EL2 code, which is easily fixed with __always_inline"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: VMX: check descriptor table exits on instruction emulation
        kvm: x86: Limit the number of "kvm: disabled by bios" messages
        KVM: x86: avoid useless copy of cpufreq policy
        KVM: allow disabling -Werror
        KVM: x86: allow compiling as non-module with W=1
        KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis
        KVM: Introduce pv check helpers
        KVM: let declaration of kvm_get_running_vcpus match implementation
        KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter
        arm64: Ask the compiler to __always_inline functions used by KVM at HYP
        KVM: arm64: Define our own swab32() to avoid a uapi static inline
        KVM: arm64: Ask the compiler to __always_inline functions used at HYP
        kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe()
        KVM: arm/arm64: Fix up includes for trace.h
      f853ed90
    • O
      KVM: VMX: check descriptor table exits on instruction emulation · 86f7e90c
      Oliver Upton 提交于
      KVM emulates UMIP on hardware that doesn't support it by setting the
      'descriptor table exiting' VM-execution control and performing
      instruction emulation. When running nested, this emulation is broken as
      KVM refuses to emulate L2 instructions by default.
      
      Correct this regression by allowing the emulation of descriptor table
      instructions if L1 hasn't requested 'descriptor table exiting'.
      
      Fixes: 07721fee ("KVM: nVMX: Don't emulate instructions in guest mode")
      Reported-by: NJan Kiszka <jan.kiszka@web.de>
      Cc: stable@vger.kernel.org
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Jim Mattson <jmattson@google.com>
      Signed-off-by: NOliver Upton <oupton@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      86f7e90c
  6. 01 3月, 2020 4 次提交
    • L
      Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · fb279f4e
      Linus Torvalds 提交于
      Pull i2c fixes from Wolfram Sang:
       "I2C has three driver bugfixes for you. We agreed on the Mac regression
        to go in via I2C"
      
      * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        macintosh: therm_windtunnel: fix regression when instantiating devices
        i2c: altera: Fix potential integer overflow
        i2c: jz4780: silence log flood on txabrt
      fb279f4e
    • D
      ext4: potential crash on allocation error in ext4_alloc_flex_bg_array() · 37b0b6b8
      Dan Carpenter 提交于
      If sbi->s_flex_groups_allocated is zero and the first allocation fails
      then this code will crash.  The problem is that "i--" will set "i" to
      -1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1
      is type promoted to unsigned and becomes UINT_MAX.  Since UINT_MAX
      is more than zero, the condition is true so we call kvfree(new_groups[-1]).
      The loop will carry on freeing invalid memory until it crashes.
      
      Fixes: 7c990728 ("ext4: fix potential race between s_flex_groups online resizing and access")
      Reviewed-by: NSuraj Jitindar Singh <surajjs@amazon.com>
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@kernel.org
      Link: https://lore.kernel.org/r/20200228092142.7irbc44yaz3by7nb@kili.mountainSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
      37b0b6b8
    • W
      macintosh: therm_windtunnel: fix regression when instantiating devices · 38b17afb
      Wolfram Sang 提交于
      Removing attach_adapter from this driver caused a regression for at
      least some machines. Those machines had the sensors described in their
      DT, too, so they didn't need manual creation of the sensor devices. The
      old code worked, though, because manual creation came first. Creation of
      DT devices then failed later and caused error logs, but the sensors
      worked nonetheless because of the manually created devices.
      
      When removing attach_adaper, manual creation now comes later and loses
      the race. The sensor devices were already registered via DT, yet with
      another binding, so the driver could not be bound to it.
      
      This fix refactors the code to remove the race and only manually creates
      devices if there are no DT nodes present. Also, the DT binding is updated
      to match both, the DT and manually created devices. Because we don't
      know which device creation will be used at runtime, the code to start
      the kthread is moved to do_probe() which will be called by both methods.
      
      Fixes: 3e7bed52 ("macintosh: therm_windtunnel: drop using attach_adapter")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723Reported-by: NErhard Furtner <erhard_f@mailbox.org>
      Tested-by: NErhard Furtner <erhard_f@mailbox.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Signed-off-by: NWolfram Sang <wsa@the-dreams.de>
      Cc: stable@kernel.org # v4.19+
      38b17afb
    • Q
      jbd2: fix data races at struct journal_head · 6c5d9112
      Qian Cai 提交于
      journal_head::b_transaction and journal_head::b_next_transaction could
      be accessed concurrently as noticed by KCSAN,
      
       LTP: starting fsync04
       /dev/zero: Can't open blockdev
       EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
       EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
       ==================================================================
       BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2]
      
       write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70:
        __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2]
        __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569
        jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2]
        (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034
        kjournald2+0x13b/0x450 [jbd2]
        kthread+0x1cd/0x1f0
        ret_from_fork+0x27/0x50
      
       read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68:
        jbd2_write_access_granted+0x1b2/0x250 [jbd2]
        jbd2_write_access_granted at fs/jbd2/transaction.c:1155
        jbd2_journal_get_write_access+0x2c/0x60 [jbd2]
        __ext4_journal_get_write_access+0x50/0x90 [ext4]
        ext4_mb_mark_diskspace_used+0x158/0x620 [ext4]
        ext4_mb_new_blocks+0x54f/0xca0 [ext4]
        ext4_ind_map_blocks+0xc79/0x1b40 [ext4]
        ext4_map_blocks+0x3b4/0x950 [ext4]
        _ext4_get_block+0xfc/0x270 [ext4]
        ext4_get_block+0x3b/0x50 [ext4]
        __block_write_begin_int+0x22e/0xae0
        __block_write_begin+0x39/0x50
        ext4_write_begin+0x388/0xb50 [ext4]
        generic_perform_write+0x15d/0x290
        ext4_buffered_write_iter+0x11f/0x210 [ext4]
        ext4_file_write_iter+0xce/0x9e0 [ext4]
        new_sync_write+0x29c/0x3b0
        __vfs_write+0x92/0xa0
        vfs_write+0x103/0x260
        ksys_write+0x9d/0x130
        __x64_sys_write+0x4c/0x60
        do_syscall_64+0x91/0xb05
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
       5 locks held by fsync04/25724:
        #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260
        #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4]
        #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2]
        #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4]
        #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2]
       irq event stamp: 1407125
       hardirqs last  enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790
       hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790
       softirqs last  enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c
       softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0
      
       Reported by Kernel Concurrency Sanitizer on:
       CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7
       Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
      
      The plain reads are outside of jh->b_state_lock critical section which result
      in data races. Fix them by adding pairs of READ|WRITE_ONCE().
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NQian Cai <cai@lca.pw>
      Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pwSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
      6c5d9112
  7. 29 2月, 2020 11 次提交