1. 29 12月, 2018 40 次提交
    • M
      mm, memory_hotplug: initialize struct pages for the full memory section · 7592dbfa
      Mikhail Zaslonko 提交于
      commit 2830bf6f05fb3e05bc4743274b806c821807a684 upstream.
      
      If memory end is not aligned with the sparse memory section boundary,
      the mapping of such a section is only partly initialized.  This may lead
      to VM_BUG_ON due to uninitialized struct page access from
      is_mem_section_removable() or test_pages_in_a_zone() function triggered
      by memory_hotplug sysfs handlers:
      
      Here are the the panic examples:
       CONFIG_DEBUG_VM=y
       CONFIG_DEBUG_VM_PGFLAGS=y
      
       kernel parameter mem=2050M
       --------------------------
       page:000003d082008000 is uninitialized and poisoned
       page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
       Call Trace:
       ( test_pages_in_a_zone+0xde/0x160)
         show_valid_zones+0x5c/0x190
         dev_attr_show+0x34/0x70
         sysfs_kf_seq_show+0xc8/0x148
         seq_read+0x204/0x480
         __vfs_read+0x32/0x178
         vfs_read+0x82/0x138
         ksys_read+0x5a/0xb0
         system_call+0xdc/0x2d8
       Last Breaking-Event-Address:
         test_pages_in_a_zone+0xde/0x160
       Kernel panic - not syncing: Fatal exception: panic_on_oops
      
       kernel parameter mem=3075M
       --------------------------
       page:000003d08300c000 is uninitialized and poisoned
       page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
       Call Trace:
       ( is_mem_section_removable+0xb4/0x190)
         show_mem_removable+0x9a/0xd8
         dev_attr_show+0x34/0x70
         sysfs_kf_seq_show+0xc8/0x148
         seq_read+0x204/0x480
         __vfs_read+0x32/0x178
         vfs_read+0x82/0x138
         ksys_read+0x5a/0xb0
         system_call+0xdc/0x2d8
       Last Breaking-Event-Address:
         is_mem_section_removable+0xb4/0x190
       Kernel panic - not syncing: Fatal exception: panic_on_oops
      
      Fix the problem by initializing the last memory section of each zone in
      memmap_init_zone() till the very end, even if it goes beyond the zone end.
      
      Michal said:
      
      : This has alwways been problem AFAIU.  It just went unnoticed because we
      : have zeroed memmaps during allocation before f7f99100 ("mm: stop
      : zeroing memory during allocation in vmemmap") and so the above test
      : would simply skip these ranges as belonging to zone 0 or provided a
      : garbage.
      :
      : So I guess we do care for post f7f99100 kernels mostly and
      : therefore Fixes: f7f99100 ("mm: stop zeroing memory during
      : allocation in vmemmap")
      
      Link: http://lkml.kernel.org/r/20181212172712.34019-2-zaslonko@linux.ibm.com
      Fixes: f7f99100 ("mm: stop zeroing memory during allocation in vmemmap")
      Signed-off-by: NMikhail Zaslonko <zaslonko@linux.ibm.com>
      Reviewed-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Suggested-by: NMichal Hocko <mhocko@kernel.org>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Reported-by: NMikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
      Tested-by: NMikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7592dbfa
    • J
      media: ov5640: Fix set format regression · 3fbd4d87
      Jacopo Mondi 提交于
      commit 07115449919383548d094ff83cc27bd08639a8a1 upstream.
      
      The set_fmt operations updates the sensor format only when the image format
      is changed. When only the image sizes gets changed, the format do not get
      updated causing the sensor to always report the one that was previously in
      use.
      
      Without this patch, updating frame size only fails:
        [fmt:UYVY8_2X8/640x480@1/30 field:none colorspace:srgb xfer:srgb ...]
      
      With this patch applied:
        [fmt:UYVY8_2X8/1024x768@1/30 field:none colorspace:srgb xfer:srgb ...]
      
      Fixes: 6949d864 ("media: ov5640: do not change mode if format or frame interval is unchanged")
      Signed-off-by: NJacopo Mondi <jacopo+renesas@jmondi.org>
      Signed-off-by: NMaxime Ripard <maxime.ripard@bootlin.com>
      Tested-by: Adam Ford <aford173@gmail.com> #imx6 w/ CSI2 interface on 4.19.6 and 4.20-RC5
      Signed-off-by: NSakari Ailus <sakari.ailus@linux.intel.com>
      Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3fbd4d87
    • I
      iwlwifi: add new cards for 9560, 9462, 9461 and killer series · 7f30924b
      Ihab Zhaika 提交于
      commit f108703cb5f199d0fc98517ac29a997c4c646c94 upstream.
      
      add few PCI ID'S for 9560, 9462, 9461 and killer series.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NIhab Zhaika <ihab.zhaika@intel.com>
      Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7f30924b
    • B
      Revert "mwifiex: restructure rx_reorder_tbl_lock usage" · 9007fba7
      Brian Norris 提交于
      commit 1aa48f088615ebfa5e139951a0d3e7dc2c2af4ec upstream.
      
      This reverts commit 5188d545, because it
      introduced lock recursion:
      
        BUG: spinlock recursion on CPU#2, kworker/u13:1/395
         lock: 0xffffffc0e28a47f0, .magic: dead4ead, .owner: kworker/u13:1/395, .owner_cpu: 2
        CPU: 2 PID: 395 Comm: kworker/u13:1 Not tainted 4.20.0-rc4+ #2
        Hardware name: Google Kevin (DT)
        Workqueue: MWIFIEX_RX_WORK_QUEUE mwifiex_rx_work_queue [mwifiex]
        Call trace:
         dump_backtrace+0x0/0x140
         show_stack+0x20/0x28
         dump_stack+0x84/0xa4
         spin_bug+0x98/0xa4
         do_raw_spin_lock+0x5c/0xdc
         _raw_spin_lock_irqsave+0x38/0x48
         mwifiex_flush_data+0x2c/0xa4 [mwifiex]
         call_timer_fn+0xcc/0x1c4
         run_timer_softirq+0x264/0x4f0
         __do_softirq+0x1a8/0x35c
         do_softirq+0x54/0x64
         netif_rx_ni+0xe8/0x120
         mwifiex_recv_packet+0xfc/0x10c [mwifiex]
         mwifiex_process_rx_packet+0x1d4/0x238 [mwifiex]
         mwifiex_11n_dispatch_pkt+0x190/0x1ac [mwifiex]
         mwifiex_11n_rx_reorder_pkt+0x28c/0x354 [mwifiex]
         mwifiex_process_sta_rx_packet+0x204/0x26c [mwifiex]
         mwifiex_handle_rx_packet+0x15c/0x16c [mwifiex]
         mwifiex_rx_work_queue+0x104/0x134 [mwifiex]
         worker_thread+0x4cc/0x72c
         kthread+0x134/0x13c
         ret_from_fork+0x10/0x18
      
      This was clearly not tested well at all. I simply performed 'wget' in a
      loop and it fell over within a few seconds.
      
      Fixes: 5188d545 ("mwifiex: restructure rx_reorder_tbl_lock usage")
      Cc: <stable@vger.kernel.org>
      Cc: Ganapathi Bhat <gbhat@marvell.com>
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9007fba7
    • E
      iwlwifi: mvm: don't send GEO_TX_POWER_LIMIT to old firmwares · c151740f
      Emmanuel Grumbach 提交于
      commit eca1e56ceedd9cc185eb18baf307d3ff2e4af376 upstream.
      
      Old firmware versions don't support this command. Sending it
      to any firmware before -41.ucode will crash the firmware.
      
      This fixes https://bugzilla.kernel.org/show_bug.cgi?id=201975
      
      Fixes: 66e839030fd6 ("iwlwifi: fix wrong WGDS_WIFI_DATA_SIZE")
      CC: <stable@vger.kernel.org> #4.19+
      Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c151740f
    • L
      rtlwifi: Fix leak of skb when processing C2H_BT_INFO · fed44d6c
      Larry Finger 提交于
      commit 8cfa272b0d321160ebb5b45073e39ef0a6ad73f2 upstream.
      
      With commit 0a9f8f0a ("rtlwifi: fix btmpinfo timeout while processing
      C2H_BT_INFO"), calling rtl_c2hcmd_enqueue() with rtl_c2h_fast_cmd() true,
      the routine returns without freeing that skb, thereby leaking it.
      
      This issue has been discussed at https://github.com/lwfinger/rtlwifi_new/issues/401
      and the fix tested there.
      
      Fixes: 0a9f8f0a ("rtlwifi: fix btmpinfo timeout while processing C2H_BT_INFO")
      Reported-and-tested-by: NFrancisco Machado Magalhães Neto <franmagneto@gmail.com>
      Cc: Francisco Machado Magalhães Neto <franmagneto@gmail.com>
      Cc: Ping-Ke Shih <pkshih@realtek.com>
      Cc: Stable <stable@vger.kernel.org> # 4.18+
      Signed-off-by: NLarry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: NKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fed44d6c
    • M
      xfrm_user: fix freeing of xfrm states on acquire · 5ecdfbb0
      Mathias Krause 提交于
      commit 4a135e538962cb00a9667c82e7d2b9e4d7cd7177 upstream.
      
      Commit 565f0fa9 ("xfrm: use a dedicated slab cache for struct
      xfrm_state") moved xfrm state objects to use their own slab cache.
      However, it missed to adapt xfrm_user to use this new cache when
      freeing xfrm states.
      
      Fix this by introducing and make use of a new helper for freeing
      xfrm_state objects.
      
      Fixes: 565f0fa9 ("xfrm: use a dedicated slab cache for struct xfrm_state")
      Reported-by: NPan Bian <bianpan2016@163.com>
      Cc: <stable@vger.kernel.org> # v4.18+
      Signed-off-by: NMathias Krause <minipli@googlemail.com>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5ecdfbb0
    • M
      mm: introduce mm_[p4d|pud|pmd]_folded · 89d6fff0
      Martin Schwidefsky 提交于
      [ Upstream commit 1071fc5779d9846fec56a4ff6089ab08cac1ab72 ]
      
      Add three architecture overrideable functions to test if the
      p4d, pud, or pmd layer of a page table is folded or not.
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      89d6fff0
    • M
      mm: make the __PAGETABLE_PxD_FOLDED defines non-empty · ba38c3e7
      Martin Schwidefsky 提交于
      [ Upstream commit a8874e7e8a8896f2b6c641f4b8e2473eafd35204 ]
      
      Change the currently empty defines for __PAGETABLE_PMD_FOLDED,
      __PAGETABLE_PUD_FOLDED and __PAGETABLE_P4D_FOLDED to return 1.
      This makes it possible to use __is_defined() to test if the
      preprocessor define exists.
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      ba38c3e7
    • M
      mm: add mm_pxd_folded checks to pgtable_bytes accounting functions · 28a3b553
      Martin Schwidefsky 提交于
      [ Upstream commit 6d212db11947ae5464e4717536ed9faf61c01e86 ]
      
      The common mm code calls mm_dec_nr_pmds() and mm_dec_nr_puds()
      in free_pgtables() if the address range spans a full pud or pmd.
      If mm_dec_nr_puds/mm_dec_nr_pmds are non-empty due to configuration
      settings they blindly subtract the size of the pmd or pud table from
      pgtable_bytes even if the pud or pmd page table layer is folded.
      
      Add explicit mm_[pmd|pud]_folded checks to the four pgtable_bytes
      accounting functions mm_inc_nr_puds, mm_inc_nr_pmds, mm_dec_nr_puds
      and mm_dec_nr_pmds. As the check for folded page tables can be
      overwritten by the architecture, this allows to keep a correct
      pgtable_bytes value for platforms that use a dynamic number of
      page table levels.
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      28a3b553
    • S
      panic: avoid deadlocks in re-entrant console drivers · 384221cb
      Sergey Senozhatsky 提交于
      commit c7c3f05e341a9a2bd1a92993d4f996cfd6e7348e upstream.
      
      From printk()/serial console point of view panic() is special, because
      it may force CPU to re-enter printk() or/and serial console driver.
      Therefore, some of serial consoles drivers are re-entrant. E.g. 8250:
      
      serial8250_console_write()
      {
      	if (port->sysrq)
      		locked = 0;
      	else if (oops_in_progress)
      		locked = spin_trylock_irqsave(&port->lock, flags);
      	else
      		spin_lock_irqsave(&port->lock, flags);
      	...
      }
      
      panic() does set oops_in_progress via bust_spinlocks(1), so in theory
      we should be able to re-enter serial console driver from panic():
      
      	CPU0
      	<NMI>
      	uart_console_write()
      	serial8250_console_write()		// if (oops_in_progress)
      						//    spin_trylock_irqsave()
      	call_console_drivers()
      	console_unlock()
      	console_flush_on_panic()
      	bust_spinlocks(1)			// oops_in_progress++
      	panic()
      	<NMI/>
      	spin_lock_irqsave(&port->lock, flags)   // spin_lock_irqsave()
      	serial8250_console_write()
      	call_console_drivers()
      	console_unlock()
      	printk()
      	...
      
      However, this does not happen and we deadlock in serial console on
      port->lock spinlock. And the problem is that console_flush_on_panic()
      called after bust_spinlocks(0):
      
      void panic(const char *fmt, ...)
      {
      	bust_spinlocks(1);
      	...
      	bust_spinlocks(0);
      	console_flush_on_panic();
      	...
      }
      
      bust_spinlocks(0) decrements oops_in_progress, so oops_in_progress
      can go back to zero. Thus even re-entrant console drivers will simply
      spin on port->lock spinlock. Given that port->lock may already be
      locked either by a stopped CPU, or by the very same CPU we execute
      panic() on (for instance, NMI panic() on printing CPU) the system
      deadlocks and does not reboot.
      
      Fix this by removing bust_spinlocks(0), so oops_in_progress is always
      set in panic() now and, thus, re-entrant console drivers will trylock
      the port->lock instead of spinning on it forever, when we call them
      from console_flush_on_panic().
      
      Link: http://lkml.kernel.org/r/20181025101036.6823-1-sergey.senozhatsky@gmail.com
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Daniel Wang <wonderfly@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: linux-serial@vger.kernel.org
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      384221cb
    • R
      x86/intel_rdt: Ensure a CPU remains online for the region's pseudo-locking sequence · 0a95cba5
      Reinette Chatre 提交于
      commit 80b71c340f17705ec145911b9a193ea781811b16 upstream.
      
      The user triggers the creation of a pseudo-locked region when writing
      the requested schemata to the schemata resctrl file. The pseudo-locking
      of a region is required to be done on a CPU that is associated with the
      cache on which the pseudo-locked region will reside. In order to run the
      locking code on a specific CPU, the needed CPU has to be selected and
      ensured to remain online during the entire locking sequence.
      
      At this time, the cpu_hotplug_lock is not taken during the pseudo-lock
      region creation and it is thus possible for a CPU to be selected to run
      the pseudo-locking code and then that CPU to go offline before the
      thread is able to run on it.
      
      Fix this by ensuring that the cpu_hotplug_lock is taken while the CPU on
      which code has to run needs to be controlled. Since the cpu_hotplug_lock
      is always taken before rdtgroup_mutex the lock order is maintained.
      
      Fixes: e0bdfe8e ("x86/intel_rdt: Support creation/removal of pseudo-locked region")
      Signed-off-by: NReinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: gavin.hindman@intel.com
      Cc: jithu.joseph@intel.com
      Cc: stable <stable@vger.kernel.org>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/b7b17432a80f95a1fa21a1698ba643014f58ad31.1544476425.git.reinette.chatre@intel.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0a95cba5
    • A
      x86/vdso: Pass --eh-frame-hdr to the linker · 56f7bfac
      Alistair Strachan 提交于
      commit cd01544a268ad8ee5b1dfe42c4393f1095f86879 upstream.
      
      Commit
      
        379d98dd ("x86: vdso: Use $LD instead of $CC to link")
      
      accidentally broke unwinding from userspace, because ld would strip the
      .eh_frame sections when linking.
      
      Originally, the compiler would implicitly add --eh-frame-hdr when
      invoking the linker, but when this Makefile was converted from invoking
      ld via the compiler, to invoking it directly (like vmlinux does),
      the flag was missed. (The EH_FRAME section is important for the VDSO
      shared libraries, but not for vmlinux.)
      
      Fix the problem by explicitly specifying --eh-frame-hdr, which restores
      parity with the old method.
      
      See relevant bug reports for additional info:
      
        https://bugzilla.kernel.org/show_bug.cgi?id=201741
        https://bugzilla.redhat.com/show_bug.cgi?id=1659295
      
      Fixes: 379d98dd ("x86: vdso: Use $LD instead of $CC to link")
      Reported-by: NFlorian Weimer <fweimer@redhat.com>
      Reported-by: NCarlos O'Donell <carlos@redhat.com>
      Reported-by: N"H. J. Lu" <hjl.tools@gmail.com>
      Signed-off-by: NAlistair Strachan <astrachan@google.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Carlos O'Donell <carlos@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joel Fernandes <joel@joelfernandes.org>
      Cc: kernel-team@android.com
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: X86 ML <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20181214223637.35954-1-astrachan@google.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      56f7bfac
    • D
      x86/mm: Fix decoy address handling vs 32-bit builds · 1e3b98b2
      Dan Williams 提交于
      commit 51c3fbd89d7554caa3290837604309f8d8669d99 upstream.
      
      A decoy address is used by set_mce_nospec() to update the cache attributes
      for a page that may contain poison (multi-bit ECC error) while attempting
      to minimize the possibility of triggering a speculative access to that
      page.
      
      When reserve_memtype() is handling a decoy address it needs to convert it
      to its real physical alias. The conversion, AND'ing with __PHYSICAL_MASK,
      is broken for a 32-bit physical mask and reserve_memtype() is passed the
      last physical page. Gert reports triggering the:
      
          BUG_ON(start >= end);
      
      ...assertion when running a 32-bit non-PAE build on a platform that has
      a driver resource at the top of physical memory:
      
          BIOS-e820: [mem 0x00000000fff00000-0x00000000ffffffff] reserved
      
      Given that the decoy address scheme is only targeted at 64-bit builds and
      assumes that the top of physical address space is free for use as a decoy
      address range, simply bypass address sanitization in the 32-bit case.
      
      Lastly, there was no need to crash the system when this failure occurred,
      and no need to crash future systems if the assumptions of decoy addresses
      are ever violated. Change the BUG_ON() to a WARN() with an error return.
      
      Fixes: 510ee090 ("x86/mm/pat: Prepare {reserve, free}_memtype() for...")
      Reported-by: NGert Robben <t2@gert.gr>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NGert Robben <t2@gert.gr>
      Cc: stable@vger.kernel.org
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: platform-driver-x86@vger.kernel.org
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/154454337985.789277.12133288391664677775.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e3b98b2
    • C
      x86/mtrr: Don't copy uninitialized gentry fields back to userspace · c623326a
      Colin Ian King 提交于
      commit 32043fa065b51e0b1433e48d118821c71b5cd65d upstream.
      
      Currently the copy_to_user of data in the gentry struct is copying
      uninitiaized data in field _pad from the stack to userspace.
      
      Fix this by explicitly memset'ing gentry to zero, this also will zero any
      compiler added padding fields that may be in struct (currently there are
      none).
      
      Detected by CoverityScan, CID#200783 ("Uninitialized scalar variable")
      
      Fixes: b263b31e ("x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls")
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NTyler Hicks <tyhicks@canonical.com>
      Cc: security@kernel.org
      Link: https://lkml.kernel.org/r/20181218172956.1440-1-colin.king@canonical.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c623326a
    • T
      futex: Cure exit race · 9933bfb6
      Thomas Gleixner 提交于
      commit da791a667536bf8322042e38ca85d55a78d3c273 upstream.
      
      Stefan reported, that the glibc tst-robustpi4 test case fails
      occasionally. That case creates the following race between
      sys_exit() and sys_futex_lock_pi():
      
       CPU0				CPU1
      
       sys_exit()			sys_futex()
        do_exit()			 futex_lock_pi()
         exit_signals(tsk)		  No waiters:
          tsk->flags |= PF_EXITING;	  *uaddr == 0x00000PID
        mm_release(tsk)		  Set waiter bit
         exit_robust_list(tsk) {	  *uaddr = 0x80000PID;
            Set owner died		  attach_to_pi_owner() {
          *uaddr = 0xC0000000;	   tsk = get_task(PID);
         }				   if (!tsk->flags & PF_EXITING) {
        ...				     attach();
        tsk->flags |= PF_EXITPIDONE;	   } else {
      				     if (!(tsk->flags & PF_EXITPIDONE))
      				       return -EAGAIN;
      				     return -ESRCH; <--- FAIL
      				   }
      
      ESRCH is returned all the way to user space, which triggers the glibc test
      case assert. Returning ESRCH unconditionally is wrong here because the user
      space value has been changed by the exiting task to 0xC0000000, i.e. the
      FUTEX_OWNER_DIED bit is set and the futex PID value has been cleared. This
      is a valid state and the kernel has to handle it, i.e. taking the futex.
      
      Cure it by rereading the user space value when PF_EXITING and PF_EXITPIDONE
      is set in the task which 'owns' the futex. If the value has changed, let
      the kernel retry the operation, which includes all regular sanity checks
      and correctly handles the FUTEX_OWNER_DIED case.
      
      If it hasn't changed, then return ESRCH as there is no way to distinguish
      this case from malfunctioning user space. This happens when the exiting
      task did not have a robust list, the robust list was corrupted or the user
      space value in the futex was simply bogus.
      Reported-by: NStefan Liebler <stli@linux.ibm.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=200467
      Link: https://lkml.kernel.org/r/20181210152311.986181245@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9933bfb6
    • D
      Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels · c1f8e7ac
      Dexuan Cui 提交于
      commit fc96df16a1ce80cbb3c316ab7d4dc8cd5c2852ce upstream.
      
      Before 98f4c651, we returned zeros for unopened channels.
      With 98f4c651, we started to return random on-stack values.
      
      We'd better return -EINVAL instead.
      
      Fixes: 98f4c651 ("hv: move ringbuffer bus attributes to dev_groups")
      Cc: stable@vger.kernel.org
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: NDexuan Cui <decui@microsoft.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c1f8e7ac
    • C
      KVM: Fix UAF in nested posted interrupt processing · 1972ca04
      Cfir Cohen 提交于
      commit c2dd5146e9fe1f22c77c1b011adf84eea0245806 upstream.
      
      nested_get_vmcs12_pages() processes the posted_intr address in vmcs12. It
      caches the kmap()ed page object and pointer, however, it doesn't handle
      errors correctly: it's possible to cache a valid pointer, then release
      the page and later dereference the dangling pointer.
      
      I was able to reproduce with the following steps:
      
      1. Call vmlaunch with valid posted_intr_desc_addr but an invalid
      MSR_EFER. This causes nested_get_vmcs12_pages() to cache the kmap()ed
      pi_desc_page and pi_desc. Later the invalid EFER value fails
      check_vmentry_postreqs() which fails the first vmlaunch.
      
      2. Call vmlanuch with a valid EFER but an invalid posted_intr_desc_addr
      (I set it to 2G - 0x80). The second time we call nested_get_vmcs12_pages
      pi_desc_page is unmapped and released and pi_desc_page is set to NULL
      (the "shouldn't happen" clause). Due to the invalid
      posted_intr_desc_addr, kvm_vcpu_gpa_to_page() fails and
      nested_get_vmcs12_pages() returns. It doesn't return an error value so
      vmlaunch proceeds. Note that at this time we have a dangling pointer in
      vmx->nested.pi_desc and POSTED_INTR_DESC_ADDR in L0's vmcs.
      
      3. Issue an IPI in L2 guest code. This triggers a call to
      vmx_complete_nested_posted_interrupt() and pi_test_and_clear_on() which
      dereferences the dangling pointer.
      
      Vulnerable code requires nested and enable_apicv variables to be set to
      true. The host CPU must also support posted interrupts.
      
      Fixes: 5e2f30b7 "KVM: nVMX: get rid of nested_get_page()"
      Cc: stable@vger.kernel.org
      Reviewed-by: NAndy Honig <ahonig@google.com>
      Signed-off-by: NCfir Cohen <cfir@google.com>
      Reviewed-by: NLiran Alon <liran.alon@oracle.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1972ca04
    • E
      kvm: x86: Add AMD's EX_CFG to the list of ignored MSRs · 229468c6
      Eduardo Habkost 提交于
      commit 0e1b869fff60c81b510c2d00602d778f8f59dd9a upstream.
      
      Some guests OSes (including Windows 10) write to MSR 0xc001102c
      on some cases (possibly while trying to apply a CPU errata).
      Make KVM ignore reads and writes to that MSR, so the guest won't
      crash.
      
      The MSR is documented as "Execution Unit Configuration (EX_CFG)",
      at AMD's "BIOS and Kernel Developer's Guide (BKDG) for AMD Family
      15h Models 00h-0Fh Processors".
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      229468c6
    • W
      KVM: X86: Fix NULL deref in vcpu_scan_ioapic · 76281d12
      Wanpeng Li 提交于
      commit dcbd3e49c2f0b2c2d8a321507ff8f3de4af76d7c upstream.
      
      Reported by syzkaller:
      
          CPU: 1 PID: 5962 Comm: syz-executor118 Not tainted 4.20.0-rc6+ #374
          Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
          RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:169 [inline]
          RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:7449 [inline]
          RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:7602 [inline]
          RIP: 0010:vcpu_run arch/x86/kvm/x86.c:7874 [inline]
          RIP: 0010:kvm_arch_vcpu_ioctl_run+0x5296/0x7320 arch/x86/kvm/x86.c:8074
          Call Trace:
      	 kvm_vcpu_ioctl+0x5c8/0x1150 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2596
      	 vfs_ioctl fs/ioctl.c:46 [inline]
      	 file_ioctl fs/ioctl.c:509 [inline]
      	 do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
      	 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
      	 __do_sys_ioctl fs/ioctl.c:720 [inline]
      	 __se_sys_ioctl fs/ioctl.c:718 [inline]
      	 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
      	 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The reason is that the testcase writes hyperv synic HV_X64_MSR_SINT14 msr
      and triggers scan ioapic logic to load synic vectors into EOI exit bitmap.
      However, irqchip is not initialized by this simple testcase, ioapic/apic
      objects should not be accessed.
      
      This patch fixes it by also considering whether or not apic is present.
      
      Reported-by: syzbot+39810e6c400efadfef71@syzkaller.appspotmail.com
      Cc: stable@vger.kernel.org
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76281d12
    • T
      posix-timers: Fix division by zero bug · 82c8dbb3
      Thomas Gleixner 提交于
      commit 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 upstream.
      
      The signal delivery path of posix-timers can try to rearm the timer even if
      the interval is zero. That's handled for the common case (hrtimer) but not
      for alarm timers. In that case the forwarding function raises a division by
      zero exception.
      
      The handling for hrtimer based posix timers is wrong because it marks the
      timer as active despite the fact that it is stopped.
      
      Move the check from common_hrtimer_rearm() to posixtimer_rearm() to cure
      both issues.
      
      Reported-by: syzbot+9d38bedac9cc77b8ad5e@syzkaller.appspotmail.com
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: sboyd@kernel.org
      Cc: stable@vger.kernel.org
      Cc: syzkaller-bugs@googlegroups.com
      Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1812171328050.1880@nanos.tec.linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      82c8dbb3
    • H
      gpiolib-acpi: Only defer request_irq for GpioInt ACPI event handlers · 1f51527d
      Hans de Goede 提交于
      commit e59f5e08ece1060073d92c66ded52e1f2c43b5bb upstream.
      
      Commit 78d3a92e ("gpiolib-acpi: Register GpioInt ACPI event handlers
      from a late_initcall") deferred the entire acpi_gpiochip_request_interrupt
      call for each event resource.
      
      This means it also delays the gpiochip_request_own_desc(..., "ACPI:Event")
      call. This is a problem if some AML code reads the GPIO pin before we
      run the deferred acpi_gpiochip_request_interrupt, because in that case
      acpi_gpio_adr_space_handler() will already have called
      gpiochip_request_own_desc(..., "ACPI:OpRegion") causing the call from
      acpi_gpiochip_request_interrupt to fail with -EBUSY and we will fail to
      register an event handler.
      
      acpi_gpio_adr_space_handler is prepared for acpi_gpiochip_request_interrupt
      already having claimed the pin, but the other way around does not work.
      
      One example of a problem this causes, is the event handler for the OTG
      ID pin on a Prowise PT301 tablet not registering, keeping the port stuck
      in whatever mode it was in during boot and e.g. only allowing charging
      after a reboot.
      
      This commit fixes this by only deferring the request_irq call and the
      initial run of edge-triggered IRQs instead of deferring all of
      acpi_gpiochip_request_interrupt.
      
      Cc: stable@vger.kernel.org
      Fixes: 78d3a92e ("gpiolib-acpi: Register GpioInt ACPI event ...")
      Signed-off-by: NHans de Goede <hdegoede@redhat.com>
      Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Acked-by: NMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f51527d
    • C
      gpio: max7301: fix driver for use with CONFIG_VMAP_STACK · 85ac860a
      Christophe Leroy 提交于
      commit abf221d2f51b8ce7b9959a8953f880a8b0a1400d upstream.
      
      spi_read() and spi_write() require DMA-safe memory. When
      CONFIG_VMAP_STACK is selected, those functions cannot be used
      with buffers on stack.
      
      This patch replaces calls to spi_read() and spi_write() by
      spi_write_then_read() which doesn't require DMA-safe buffers.
      
      Fixes: 0c36ec31 ("gpio: gpio driver for max7301 SPI GPIO expander")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      85ac860a
    • R
      mmc: omap_hsmmc: fix DMA API warning · 0867cfaa
      Russell King 提交于
      commit 0b479790684192ab7024ce6a621f93f6d0a64d92 upstream.
      
      While booting with rootfs on MMC, the following warning is encountered
      on OMAP4430:
      
      omap-dma-engine 4a056000.dma-controller: DMA-API: mapping sg segment longer than device claims to support [len=69632] [max=65536]
      
      This is because the DMA engine has a default maximum segment size of 64K
      but HSMMC sets:
      
              mmc->max_blk_size = 512;       /* Block Length at max can be 1024 */
              mmc->max_blk_count = 0xFFFF;    /* No. of Blocks is 16 bits */
              mmc->max_req_size = mmc->max_blk_size * mmc->max_blk_count;
              mmc->max_seg_size = mmc->max_req_size;
      
      which ends up telling the block layer that we support a maximum segment
      size of 65535*512, which exceeds the advertised DMA engine capabilities.
      
      Fix this by clamping the maximum segment size to the lower of the
      maximum request size and of the DMA engine device used for either DMA
      channel.
      Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0867cfaa
    • U
      mmc: core: Use a minimum 1600ms timeout when enabling CACHE ctrl · b38f6898
      Ulf Hansson 提交于
      commit e3ae3401aa19432ee4943eb0bbc2ec704d07d793 upstream.
      
      Some eMMCs from Micron have been reported to need ~800 ms timeout, while
      enabling the CACHE ctrl after running sudden power failure tests. The
      needed timeout is greater than what the card specifies as its generic CMD6
      timeout, through the EXT_CSD register, hence the problem.
      
      Normally we would introduce a card quirk to extend the timeout for these
      specific Micron cards. However, due to the rather complicated debug process
      needed to find out the error, let's simply use a minimum timeout of 1600ms,
      the double of what has been reported, for all cards when enabling CACHE
      ctrl.
      Reported-by: NSjoerd Simons <sjoerd.simons@collabora.co.uk>
      Reported-by: NAndreas Dannenberg <dannenberg@ti.com>
      Reported-by: NFaiz Abbas <faiz_abbas@ti.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b38f6898
    • U
      mmc: core: Allow BKOPS and CACHE ctrl even if no HPI support · 12df9797
      Ulf Hansson 提交于
      commit ba9f39a785a9977e72233000711ef1eb48203551 upstream.
      
      In commit 5320226a ("mmc: core: Disable HPI for certain Hynix eMMC
      cards"), then intent was to prevent HPI from being used for some eMMC
      cards, which didn't properly support it. However, that went too far, as
      even BKOPS and CACHE ctrl became prevented. Let's restore those parts and
      allow BKOPS and CACHE ctrl even if HPI isn't supported.
      
      Fixes: 5320226a ("mmc: core: Disable HPI for certain Hynix eMMC cards")
      Cc: Pratibhasagar V <pratibha@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      12df9797
    • U
      mmc: core: Reset HPI enabled state during re-init and in case of errors · f465300a
      Ulf Hansson 提交于
      commit a0741ba40a009f97c019ae7541dc61c1fdf41efb upstream.
      
      During a re-initialization of the eMMC card, we may fail to re-enable HPI.
      In these cases, that isn't properly reflected in the card->ext_csd.hpi_en
      bit, as it keeps being set. This may cause following attempts to use HPI,
      even if's not enabled. Let's fix this!
      
      Fixes: eb0d8f13 ("mmc: core: support HPI send command")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f465300a
    • J
      scsi: sd: use mempool for discard special page · 024d515a
      Jens Axboe 提交于
      commit 61cce6f6eeced5ddd9cac55e807fe28b4f18c1ba upstream.
      
      When boxes are run near (or to) OOM, we have a problem with the discard
      page allocation in sd. If we fail allocating the special page, we return
      busy, and it'll get retried. But since ordering is honored for dispatch
      requests, we can keep retrying this same IO and failing. Behind that IO
      could be requests that want to free memory, but they never get the
      chance. This means you get repeated spews of traces like this:
      
      [1201401.625972] Call Trace:
      [1201401.631748]  dump_stack+0x4d/0x65
      [1201401.639445]  warn_alloc+0xec/0x190
      [1201401.647335]  __alloc_pages_slowpath+0xe84/0xf30
      [1201401.657722]  ? get_page_from_freelist+0x11b/0xb10
      [1201401.668475]  ? __alloc_pages_slowpath+0x2e/0xf30
      [1201401.679054]  __alloc_pages_nodemask+0x1f9/0x210
      [1201401.689424]  alloc_pages_current+0x8c/0x110
      [1201401.699025]  sd_setup_write_same16_cmnd+0x51/0x150
      [1201401.709987]  sd_init_command+0x49c/0xb70
      [1201401.719029]  scsi_setup_cmnd+0x9c/0x160
      [1201401.727877]  scsi_queue_rq+0x4d9/0x610
      [1201401.736535]  blk_mq_dispatch_rq_list+0x19a/0x360
      [1201401.747113]  blk_mq_sched_dispatch_requests+0xff/0x190
      [1201401.758844]  __blk_mq_run_hw_queue+0x95/0xa0
      [1201401.768653]  blk_mq_run_work_fn+0x2c/0x30
      [1201401.777886]  process_one_work+0x14b/0x400
      [1201401.787119]  worker_thread+0x4b/0x470
      [1201401.795586]  kthread+0x110/0x150
      [1201401.803089]  ? rescuer_thread+0x320/0x320
      [1201401.812322]  ? kthread_park+0x90/0x90
      [1201401.820787]  ? do_syscall_64+0x53/0x150
      [1201401.829635]  ret_from_fork+0x29/0x40
      
      Ensure that the discard page allocation has a mempool backing, so we
      know we can make progress.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      024d515a
    • M
      scsi: t10-pi: Return correct ref tag when queue has no integrity profile · 690699b2
      Martin K. Petersen 提交于
      commit 60a89a3ce0cce515dc663bc1b45ac89202ad6c79 upstream.
      
      Commit ddd0bc75 ("block: move ref_tag calculation func to the block
      layer") moved ref tag calculation from SCSI to a library function. However,
      this change broke returning the correct ref tag for devices operating in
      DIF mode since these do not have an associated block integrity profile.
      This in turn caused read/write failures on PI-formatted disks attached to
      an mpt3sas controller.
      
      Fixes: ddd0bc75 ("block: move ref_tag calculation func to the block layer")
      Cc: stable@vger.kernel.org # 4.19+
      Reported-by: NJohn Garry <john.garry@huawei.com>
      Tested-by: NXiang Chen <chenxiang66@hisilicon.com>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      690699b2
    • R
      ubifs: Handle re-linking of inodes correctly while recovery · 07364588
      Richard Weinberger 提交于
      commit e58725d51fa8da9133f3f1c54170aa2e43056b91 upstream.
      
      UBIFS's recovery code strictly assumes that a deleted inode will never
      come back, therefore it removes all data which belongs to that inode
      as soon it faces an inode with link count 0 in the replay list.
      Before O_TMPFILE this assumption was perfectly fine. With O_TMPFILE
      it can lead to data loss upon a power-cut.
      
      Consider a journal with entries like:
      0: inode X (nlink = 0) /* O_TMPFILE was created */
      1: data for inode X /* Someone writes to the temp file */
      2: inode X (nlink = 0) /* inode was changed, xattr, chmod, … */
      3: inode X (nlink = 1) /* inode was re-linked via linkat() */
      
      Upon replay of entry #2 UBIFS will drop all data that belongs to inode X,
      this will lead to an empty file after mounting.
      
      As solution for this problem, scan the replay list for a re-link entry
      before dropping data.
      
      Fixes: 474b9370 ("ubifs: Implement O_TMPFILE")
      Cc: stable@vger.kernel.org
      Cc: Russell Senior <russell@personaltelco.net>
      Cc: Rafał Miłecki <zajec5@gmail.com>
      Reported-by: NRussell Senior <russell@personaltelco.net>
      Reported-by: NRafał Miłecki <zajec5@gmail.com>
      Tested-by: NRafał Miłecki <rafal@milecki.pl>
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      07364588
    • J
      USB: serial: option: add Telit LN940 series · 507a953a
      Jörgen Storvist 提交于
      commit 28a86092b1753b802ef7e3de8a4c4a69a9c1bb03 upstream.
      
      Added USB serial option driver support for Telit LN940 series cellular
      modules. Covering both QMI and MBIM modes.
      
      usb-devices output (0x1900):
      T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 21 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=1bc7 ProdID=1900 Rev=03.10
      S:  Manufacturer=Telit
      S:  Product=Telit LN940 Mobile Broadband
      S:  SerialNumber=0123456789ABCDEF
      C:  #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      
      usb-devices output (0x1901):
      T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 20 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=1bc7 ProdID=1901 Rev=03.10
      S:  Manufacturer=Telit
      S:  Product=Telit LN940 Mobile Broadband
      S:  SerialNumber=0123456789ABCDEF
      C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 4 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
      I:  If#= 5 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
      Signed-off-by: NJörgen Storvist <jorgen.storvist@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NJohan Hovold <johan@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      507a953a
    • J
      USB: serial: option: add Fibocom NL668 series · 81dfcd0b
      Jörgen Storvist 提交于
      commit 30360224441ce89a98ed627861e735beb4010775 upstream.
      
      Added USB serial option driver support for Fibocom NL668 series cellular
      modules. Reserved USB endpoints 4, 5 and 6 for network + ADB interfaces.
      
      usb-devices output (QMI mode)
      T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 16 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=1508 ProdID=1001 Rev=03.18
      S:  Manufacturer=Nodecom NL668 Modem
      S:  Product=Nodecom NL668-CN Modem
      S:  SerialNumber=
      C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
      
      usb-devices output (ECM mode)
      T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 17 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=1508 ProdID=1001 Rev=03.18
      S:  Manufacturer=Nodecom NL668 Modem
      S:  Product=Nodecom NL668-CN Modem
      S:  SerialNumber=
      C:  #Ifs= 7 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 4 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether
      I:  If#= 5 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
      I:  If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
      Signed-off-by: NJörgen Storvist <jorgen.storvist@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NJohan Hovold <johan@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81dfcd0b
    • J
      USB: serial: option: add Simcom SIM7500/SIM7600 (MBIM mode) · 4e0f5002
      Jörgen Storvist 提交于
      commit cc6730df08a291e51e145bc65e24ffb5e2f17ab6 upstream.
      
      Added USB serial option driver support for Simcom SIM7500/SIM7600 series
      cellular modules exposing MBIM interface (VID 0x1e0e,PID 0x9003)
      
      T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 14 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=1e0e ProdID=9003 Rev=03.18
      S:  Manufacturer=SimTech, Incorporated
      S:  Product=SimTech, Incorporated
      S:  SerialNumber=0123456789ABCDEF
      C:  #Ifs= 7 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 5 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
      I:  If#= 6 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
      Signed-off-by: NJörgen Storvist <jorgen.storvist@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NJohan Hovold <johan@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4e0f5002
    • T
      USB: serial: option: add HP lt4132 · cc0667b5
      Tore Anderson 提交于
      commit d57ec3c83b5153217a70b561d4fb6ed96f2f7a25 upstream.
      
      The HP lt4132 is a rebranded Huawei ME906s-158 LTE modem.
      
      The interface with protocol 0x16 is "CDC ECM & NCM" according to the *.inf
      files included with the Windows driver. Attaching the option driver to it
      doesn't result in a /dev/ttyUSB* device being created, so I've excluded it.
      Note that it is also excluded for corresponding Huawei-branded devices, cf.
      commit d544db29 ("USB: support new huawei devices in option.c").
      
      T:  Bus=01 Lev=01 Prnt=01 Port=02 Cnt=02 Dev#=  3 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=ff MxPS=64 #Cfgs=  3
      P:  Vendor=03f0 ProdID=a31d Rev=01.02
      S:  Manufacturer=HP Inc.
      S:  Product=HP lt4132 LTE/HSPA+ 4G Module
      S:  SerialNumber=0123456789ABCDEF
      C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=2mA
      I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=06 Prot=10 Driver=option
      I:  If#=0x1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=06 Prot=13 Driver=option
      I:  If#=0x2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=06 Prot=12 Driver=option
      I:  If#=0x3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=06 Prot=16 Driver=(none)
      I:  If#=0x4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=06 Prot=14 Driver=option
      I:  If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=06 Prot=1b Driver=option
      
      T:  Bus=01 Lev=01 Prnt=01 Port=02 Cnt=02 Dev#=  3 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=ff MxPS=64 #Cfgs=  3
      P:  Vendor=03f0 ProdID=a31d Rev=01.02
      S:  Manufacturer=HP Inc.
      S:  Product=HP lt4132 LTE/HSPA+ 4G Module
      S:  SerialNumber=0123456789ABCDEF
      C:  #Ifs= 7 Cfg#= 2 Atr=a0 MxPwr=2mA
      I:  If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether
      I:  If#=0x1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=06 Prot=00 Driver=cdc_ether
      I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=06 Prot=10 Driver=option
      I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=06 Prot=13 Driver=option
      I:  If#=0x4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=06 Prot=12 Driver=option
      I:  If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=06 Prot=14 Driver=option
      I:  If#=0x6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=06 Prot=1b Driver=option
      
      T:  Bus=01 Lev=01 Prnt=01 Port=02 Cnt=02 Dev#=  3 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=ff MxPS=64 #Cfgs=  3
      P:  Vendor=03f0 ProdID=a31d Rev=01.02
      S:  Manufacturer=HP Inc.
      S:  Product=HP lt4132 LTE/HSPA+ 4G Module
      S:  SerialNumber=0123456789ABCDEF
      C:  #Ifs= 3 Cfg#= 3 Atr=a0 MxPwr=2mA
      I:  If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
      I:  If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
      I:  If#=0x2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=06 Prot=14 Driver=option
      Signed-off-by: NTore Anderson <tore@fud.no>
      Cc: stable@vger.kernel.org
      [ johan: drop id defines ]
      Signed-off-by: NJohan Hovold <johan@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cc0667b5
    • J
      USB: serial: option: add GosunCn ZTE WeLink ME3630 · 7a370193
      Jörgen Storvist 提交于
      commit 70a7444c550a75584ffcfae95267058817eff6a7 upstream.
      
      Added USB serial option driver support for GosunCn ZTE WeLink ME3630
      series cellular modules for USB modes ECM/NCM and MBIM.
      
      usb-devices output MBIM mode:
      T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 10 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=19d2 ProdID=0602 Rev=03.18
      S:  Manufacturer=Android
      S:  Product=Android
      S:  SerialNumber=
      C:  #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 3 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
      I:  If#= 4 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
      
      usb-devices output ECM/NCM mode:
      T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 11 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=19d2 ProdID=1476 Rev=03.18
      S:  Manufacturer=Android
      S:  Product=Android
      S:  SerialNumber=
      C:  #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 3 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether
      I:  If#= 4 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
      Signed-off-by: NJörgen Storvist <jorgen.storvist@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NJohan Hovold <johan@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7a370193
    • N
      USB: xhci: fix 'broken_suspend' placement in struct xchi_hcd · a67fb441
      Nicolas Saenz Julienne 提交于
      commit 2419f30a4a4fcaa5f35111563b4c61f1b2b26841 upstream.
      
      As commented in the struct's definition there shouldn't be anything
      underneath its 'priv[0]' member as it would break some macros.
      
      The patch converts the broken_suspend into a bit-field and relocates it
      next to to the rest of bit-fields.
      
      Fixes: a7d57abcc8a5 ("xhci: workaround CSS timeout on AMD SNPS 3.0 xHC")
      Reported-by: NOliver Neukum  <oneukum@suse.com>
      Signed-off-by: NNicolas Saenz Julienne <nsaenzjulienne@suse.de>
      Acked-by: NMathias Nyman <mathias.nyman@linux.intel.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a67fb441
    • M
      xhci: Don't prevent USB2 bus suspend in state check intended for USB3 only · e13bfb35
      Mathias Nyman 提交于
      commit 45f750c16cae3625014c14c77bd9005eda975d35 upstream.
      
      The code to prevent a bus suspend if a USB3 port was still in link training
      also reacted to USB2 port polling state.
      This caused bus suspend to busyloop in some cases.
      USB2 polling state is different from USB3, and should not prevent bus
      suspend.
      
      Limit the USB3 link training state check to USB3 root hub ports only.
      The origial commit went to stable so this need to be applied there as well
      
      Fixes: 2f31a67f01a8 ("usb: xhci: Prevent bus suspend if a port connect change or polling state is detected")
      Cc: stable@vger.kernel.org
      Signed-off-by: NMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e13bfb35
    • H
      USB: hso: Fix OOB memory access in hso_probe/hso_get_config_data · 8f980122
      Hui Peng 提交于
      commit 5146f95df782b0ac61abde36567e718692725c89 upstream.
      
      The function hso_probe reads if_num from the USB device (as an u8) and uses
      it without a length check to index an array, resulting in an OOB memory read
      in hso_probe or hso_get_config_data.
      
      Add a length check for both locations and updated hso_probe to bail on
      error.
      
      This issue has been assigned CVE-2018-19985.
      Reported-by: NHui Peng <benquike@gmail.com>
      Reported-by: NMathias Payer <mathias.payer@nebelwelt.net>
      Signed-off-by: NHui Peng <benquike@gmail.com>
      Signed-off-by: NMathias Payer <mathias.payer@nebelwelt.net>
      Reviewed-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8f980122
    • C
      Revert "vfs: Allow userns root to call mknod on owned filesystems." · 9c5ccadb
      Christian Brauner 提交于
      commit 94f82008ce30e2624537d240d64ce718255e0b80 upstream.
      
      This reverts commit 55956b59.
      
      commit 55956b59 ("vfs: Allow userns root to call mknod on owned filesystems.")
      enabled mknod() in user namespaces for userns root if CAP_MKNOD is
      available. However, these device nodes are useless since any filesystem
      mounted from a non-initial user namespace will set the SB_I_NODEV flag on
      the filesystem. Now, when a device node s created in a non-initial user
      namespace a call to open() on said device node will fail due to:
      
      bool may_open_dev(const struct path *path)
      {
              return !(path->mnt->mnt_flags & MNT_NODEV) &&
                      !(path->mnt->mnt_sb->s_iflags & SB_I_NODEV);
      }
      
      The problem with this is that as of the aforementioned commit mknod()
      creates partially functional device nodes in non-initial user namespaces.
      In particular, it has the consequence that as of the aforementioned commit
      open() will be more privileged with respect to device nodes than mknod().
      Before it was the other way around. Specifically, if mknod() succeeded
      then it was transparent for any userspace application that a fatal error
      must have occured when open() failed.
      
      All of this breaks multiple userspace workloads and a widespread assumption
      about how to handle mknod(). Basically, all container runtimes and systemd
      live by the slogan "ask for forgiveness not permission" when running user
      namespace workloads. For mknod() the assumption is that if the syscall
      succeeds the device nodes are useable irrespective of whether it succeeds
      in a non-initial user namespace or not. This logic was chosen explicitly
      to allow for the glorious day when mknod() will actually be able to create
      fully functional device nodes in user namespaces.
      A specific problem people are already running into when running 4.18 rc
      kernels are failing systemd services. For any distro that is run in a
      container systemd services started with the PrivateDevices= property set
      will fail to start since the device nodes in question cannot be
      opened (cf. the arguments in [1]).
      
      Full disclosure, Seth made the very sound argument that it is already
      possible to end up with partially functional device nodes. Any filesystem
      mounted with MS_NODEV set will allow mknod() to succeed but will not allow
      open() to succeed. The difference to the case here is that the MS_NODEV
      case is transparent to userspace since it is an explicitly set mount option
      while the SB_I_NODEV case is an implicit property enforced by the kernel
      and hence opaque to userspace.
      
      [1]: https://github.com/systemd/systemd/pull/9483Signed-off-by: NChristian Brauner <christian@brauner.io>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Seth Forshee <seth.forshee@canonical.com>
      Cc: Serge Hallyn <serge@hallyn.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c5ccadb
    • D
      iomap: Revert "fs/iomap.c: get/put the page in iomap_page_create/release()" · 38d072a4
      Dave Chinner 提交于
      [ Upstream commit a837eca2412051628c0529768c9bc4f3580b040e ]
      
      This reverts commit 61c6de667263184125d5ca75e894fcad632b0dd3.
      
      The reverted commit added page reference counting to iomap page
      structures that are used to track block size < page size state. This
      was supposed to align the code with page migration page accounting
      assumptions, but what it has done instead is break XFS filesystems.
      Every fstests run I've done on sub-page block size XFS filesystems
      has since picking up this commit 2 days ago has failed with bad page
      state errors such as:
      
      # ./run_check.sh "-m rmapbt=1,reflink=1 -i sparse=1 -b size=1k" "generic/038"
      ....
      SECTION       -- xfs
      FSTYP         -- xfs (debug)
      PLATFORM      -- Linux/x86_64 test1 4.20.0-rc6-dgc+
      MKFS_OPTIONS  -- -f -m rmapbt=1,reflink=1 -i sparse=1 -b size=1k /dev/sdc
      MOUNT_OPTIONS -- /dev/sdc /mnt/scratch
      
      generic/038 454s ...
       run fstests generic/038 at 2018-12-20 18:43:05
       XFS (sdc): Unmounting Filesystem
       XFS (sdc): Mounting V5 Filesystem
       XFS (sdc): Ending clean mount
       BUG: Bad page state in process kswapd0  pfn:3a7fa
       page:ffffea0000ccbeb0 count:0 mapcount:0 mapping:ffff88800d9b6360 index:0x1
       flags: 0xfffffc0000000()
       raw: 000fffffc0000000 dead000000000100 dead000000000200 ffff88800d9b6360
       raw: 0000000000000001 0000000000000000 00000000ffffffff
       page dumped because: non-NULL mapping
       CPU: 0 PID: 676 Comm: kswapd0 Not tainted 4.20.0-rc6-dgc+ #915
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
       Call Trace:
        dump_stack+0x67/0x90
        bad_page.cold.116+0x8a/0xbd
        free_pcppages_bulk+0x4bf/0x6a0
        free_unref_page_list+0x10f/0x1f0
        shrink_page_list+0x49d/0xf50
        shrink_inactive_list+0x19d/0x3b0
        shrink_node_memcg.constprop.77+0x398/0x690
        ? shrink_slab.constprop.81+0x278/0x3f0
        shrink_node+0x7a/0x2f0
        kswapd+0x34b/0x6d0
        ? node_reclaim+0x240/0x240
        kthread+0x11f/0x140
        ? __kthread_bind_mask+0x60/0x60
        ret_from_fork+0x24/0x30
       Disabling lock debugging due to kernel taint
      ....
      
      The failures are from anyway that frees pages and empties the
      per-cpu page magazines, so it's not a predictable failure or an easy
      to debug failure.
      
      generic/038 is a reliable reproducer of this problem - it has a 9 in
      10 failure rate on one of my test machines. Failure on other
      machines have been at random points in fstests runs but every run
      has ended up tripping this problem. Hence generic/038 was used to
      bisect the failure because it was the most reliable failure.
      
      It is too close to the 4.20 release (not to mention holidays) to
      try to diagnose, fix and test the underlying cause of the problem,
      so reverting the commit is the only option we have right now. The
      revert has been tested against a current tot 4.20-rc7+ kernel across
      multiple machines running sub-page block size XFs filesystems and
      none of the bad page state failures have been seen.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Cc: Piotr Jaroszynski <pjaroszynski@nvidia.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: William Kucharski <william.kucharski@oracle.com>
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Cc: Brian Foster <bfoster@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      38d072a4