1. 29 Oct, 2019 40 commits
    • R
      PCI: PM: Fix pci_power_up() · 2ada4030
      Rafael J. Wysocki committed
      commit 45144d42f299455911cc29366656c7324a3a7c97 upstream.
      
      There is an arbitrary difference between the system resume and
      runtime resume code paths for PCI devices regarding the delay to
      apply when switching the devices from D3cold to D0.
      
      Namely, pci_restore_standard_config() used in the runtime resume
      code path calls pci_set_power_state() which in turn invokes
      __pci_start_power_transition() to power up the device through the
      platform firmware and that function applies the transition delay
      (as per PCI Express Base Specification Revision 2.0, Section 6.6.1).
      However, pci_pm_default_resume_early() used in the system resume
      code path calls pci_power_up() which doesn't apply the delay at
      all and that causes issues to occur during resume from
      suspend-to-idle on some systems where the delay is required.
      
      Since there is no reason for that difference to exist, modify
      pci_power_up() to follow pci_set_power_state() more closely and
      invoke __pci_start_power_transition() from there to call the
      platform firmware to power up the device (in case that's necessary).
      
      Fixes: db288c9c ("PCI / PM: restore the original behavior of pci_set_power_state()")
      Reported-by: Daniel Drake <drake@endlessm.com>
      Tested-by: Daniel Drake <drake@endlessm.com>
      Link: https://lore.kernel.org/linux-pm/CAD8Lp44TYxrMgPLkHCqF9hv6smEurMXvmmvmtyFhZ6Q4SE+dig@mail.gmail.com/T/#m21be74af263c6a34f36e0fc5c77c5449d9406925
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      xen/netback: fix error path of xenvif_connect_data() · ccb02adf
      Juergen Gross committed
      commit 3d5c1a037d37392a6859afbde49be5ba6a70a6b3 upstream.
      
      xenvif_connect_data() calls module_put() in case of error. This is
      wrong as there is no related module_get().
      
      Remove the superfluous module_put().
      
      Fixes: 279f438e ("xen-netback: Don't destroy the netdev until the vif is shut down")
      Cc: <stable@vger.kernel.org> # 3.12
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Paul Durrant <paul@xen.org>
      Reviewed-by: Wei Liu <wei.liu@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • R
      cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown · 89ab39da
      Rafael J. Wysocki committed
      commit 65650b35133ff20f0c9ef0abd5c3c66dbce3ae57 upstream.
      
      It is incorrect to set the cpufreq syscore shutdown callback pointer
      to cpufreq_suspend(), because that function cannot be run in the
      syscore stage of system shutdown for two reasons: (a) it may attempt
      to carry out actions depending on devices that have already been shut
      down at that point and (b) the RCU synchronization carried out by it
      may not be able to make progress then.
      
      The latter issue has been present since commit 45975c7d21a1 ("rcu:
      Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds"),
      but the former one has been there since commit 90de2a4a ("cpufreq:
      suspend cpufreq governors on shutdown") regardless.
      
      Fix that by dropping cpufreq_syscore_ops altogether and making
      device_shutdown() call cpufreq_suspend() directly before shutting
      down devices, which is along the lines of what system-wide power
      management does.
      
      Fixes: 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds")
      Fixes: 90de2a4a ("cpufreq: suspend cpufreq governors on shutdown")
      Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: 4.0+ <stable@vger.kernel.org> # 4.0+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • C
      memstick: jmb38x_ms: Fix an error handling path in 'jmb38x_ms_probe()' · 5f19cbb3
      Christophe JAILLET committed
      commit 28c9fac09ab0147158db0baeec630407a5e9b892 upstream.
      
      If 'jmb38x_ms_count_slots()' returns 0, we must undo the previous
      'pci_request_regions()' call.
      
      Goto 'err_out_int' to fix it.
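
      The unwind pattern described above can be sketched in user-space C. This is
      only an illustration under stated assumptions: request_regions(),
      release_regions() and probe() are hypothetical stand-ins, not the driver's
      functions; only the 'err_out_int' label name is taken from the message.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the resource pci_request_regions() would claim. */
static bool regions_held;

static int request_regions(void)
{
    regions_held = true;
    return 0;
}

static void release_regions(void)
{
    regions_held = false;
}

/* Hypothetical probe: slot_count plays the role of the slot-count result. */
static int probe(int slot_count)
{
    int rc = request_regions();

    if (rc)
        return rc;

    if (slot_count == 0) {
        rc = -ENODEV;
        goto err_out_int;   /* undo the request instead of returning directly */
    }

    return 0;               /* success: the regions stay held */

err_out_int:
    release_regions();
    return rc;
}
```

      Returning directly from the zero-slot branch would leave regions_held set,
      which is the kind of leak the commit describes.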
      
      Fixes: 60fdd931 ("memstick: add support for JMicron jmb38x MemoryStick host controller")
      Cc: stable@vger.kernel.org
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Q
      btrfs: tracepoints: Fix bad entry members of qgroup events · 0b95aaae
      Qu Wenruo committed
      commit 1b2442b4ae0f234daeadd90e153b466332c466d8 upstream.
      
      [BUG]
      For btrfs:qgroup_meta_reserve event, the trace event can output garbage:
      
        qgroup_meta_reserve: 9c7f6acc-b342-4037-bc47-7f6e4d2232d7: refroot=5(FS_TREE) type=DATA diff=2
        qgroup_meta_reserve: 9c7f6acc-b342-4037-bc47-7f6e4d2232d7: refroot=5(FS_TREE) type=0x258792 diff=2
      
      The @type can be completely garbage, as DATA type is not possible for
      trace_qgroup_meta_reserve() trace event.
      
      [CAUSE]
      There are several problems related to qgroup trace events:
      - Unassigned entry member
        Member entry::type of trace_qgroup_update_reserve() and
        trace_qgroup_meta_reserve() is not assigned
      
      - Redundant entry member
        Member entry::type is completely useless in
        trace_qgroup_meta_convert()
      
      Fixes: 4ee0d883 ("btrfs: qgroup: Update trace events for metadata reservation")
      CC: stable@vger.kernel.org # 4.10+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • F
      Btrfs: check for the full sync flag while holding the inode lock during fsync · 1b921b5b
      Filipe Manana committed
      commit ba0b084ac309283db6e329785c1dc4f45fdbd379 upstream.
      
      We were checking for the full fsync flag in the inode before locking the
      inode, which is racy, since at that time it might not be set but
      after we acquire the inode lock some other task may have set it. One case where
      this can happen is on a system low on memory and some concurrent task
      failed to allocate an extent map and therefore set the full sync flag on
      the inode, to force the next fsync to work in full mode.
      
      A consequence of missing the full fsync flag set is hitting the problems
      fixed by commit 0c713cbab620 ("Btrfs: fix race between ranged fsync and
      writeback of adjacent ranges"), BUG_ON() when dropping extents from a log
      tree, hitting assertion failures at tree-log.c:copy_items() or all sorts
      of weird inconsistencies after replaying a log due to file extents items
      representing ranges that overlap.
      
      So just move the check such that it's done after locking the inode and
      before starting writeback again.
      
      Fixes: 0c713cbab620 ("Btrfs: fix race between ranged fsync and writeback of adjacent ranges")
      CC: stable@vger.kernel.org # 5.2+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • F
      Btrfs: add missing extents release on file extent cluster relocation error · ac6bae2b
      Filipe Manana committed
      commit 44db1216efe37bf670f8d1019cdc41658d84baf5 upstream.
      
      If we error out when finding a page at relocate_file_extent_cluster(), we
      need to release the outstanding extents counter on the relocation inode,
      set by the previous call to btrfs_delalloc_reserve_metadata(), otherwise
      the inode's block reserve size can never decrease to zero and metadata
      space is leaked. Therefore add a call to btrfs_delalloc_release_extents()
      in case we can't find the target page.
      
      Fixes: 8b62f87b ("Btrfs: rework outstanding_extents")
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Q
      btrfs: block-group: Fix a memory leak due to missing btrfs_put_block_group() · 6cd5be98
      Qu Wenruo committed
      commit 4b654acdae850f48b8250b9a578a4eaa518c7a6f upstream.
      
      In btrfs_read_block_groups(), if we have an invalid block group which
      has mixed type (DATA|METADATA) while the fs doesn't have MIXED_GROUPS
      feature, we error out without freeing the block group cache.
      
      This patch will add the missing btrfs_put_block_group() to prevent
      memory leak.
      
      Note for stable backports: the file to patch in versions <= 5.3 is
      fs/btrfs/extent-tree.c
      
      Fixes: 49303381 ("Btrfs: bail out if block group has different mixed flag")
      CC: stable@vger.kernel.org # 4.9+
      Reviewed-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • P
      pinctrl: armada-37xx: swap polarity on LED group · a5a10f78
      Patrick Williams committed
      commit b835d6953009dc350d61402a854b5a7178d8c615 upstream.
      
      The configuration registers for the LED group have inverted
      polarity, which puts the GPIO into open-drain state when used in
      GPIO mode.  Switch to '0' for GPIO and '1' for LED modes.
      
      Fixes: 87466ccd ("pinctrl: armada-37xx: Add pin controller support for Armada 37xx")
      Signed-off-by: Patrick Williams <alpawi@amazon.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20191001155154.99710-1-alpawi@amazon.com
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • P
      pinctrl: armada-37xx: fix control of pins 32 and up · e0e489aa
      Patrick Williams committed
      commit 20504fa1d2ffd5d03cdd9dc9c9dd4ed4579b97ef upstream.
      
      The 37xx configuration registers are only 32 bits long, so
      pins 32-35 spill over into the next register.  The calculation
      for the register address was done, but the bitmask was not, so
      any configuration to pin 32 or above resulted in a bitmask that
      overflowed and performed no action.
      
      Fix the register / offset calculation to also adjust the offset.
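
      A minimal sketch of the register / offset arithmetic, assuming one
      configuration bit per pin and consecutive 32-bit registers. The helper
      name and its signature are hypothetical, not the driver's code.

```c
#include <stdint.h>

#define PINS_PER_REG 32u   /* one bit per pin in a 32-bit register */

/*
 * Hypothetical helper: map a pin number to the byte offset of its
 * register and to the bit mask inside that register.
 */
static void pin_to_reg_and_mask(unsigned int pin, unsigned int *reg_off,
                                uint32_t *mask)
{
    /* Pins 32 and up live in the next register ... */
    *reg_off = (pin / PINS_PER_REG) * (unsigned int)sizeof(uint32_t);
    /* ... and the bit position must be reduced as well. */
    *mask = UINT32_C(1) << (pin % PINS_PER_REG);
}
```

      Adjusting only reg_off while still shifting by the raw pin number would
      shift past bit 31 for pins 32-35, which is the failure mode the message
      describes.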
      
      Fixes: 5715092a ("pinctrl: armada-37xx: Add gpio support")
      Signed-off-by: Patrick Williams <alpawi@amazon.com>
      Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20191001154634.96165-1-alpawi@amazon.com
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • D
      pinctrl: cherryview: restore Strago DMI workaround for all versions · 5e9d7180
      Dmitry Torokhov committed
      commit 260996c30f4f3a732f45045e3e0efe27017615e4 upstream.
      
      This is essentially a revert of:
      
      e3f72b749da2 pinctrl: cherryview: fix Strago DMI workaround
      86c5dd68 pinctrl: cherryview: limit Strago DMI workarounds to version 1.0
      
      because even with 1.1 versions of BIOS there are some pins that are
      configured as interrupts but not claimed by any driver, and they
      sometimes fire and result in interrupt storms that cause the touchpad
      to stop functioning, among other issues.
      
      Given that we are unlikely to qualify another firmware version for a
      while it is better to keep the workaround active on all Strago boards.
      
      Reported-by: Alex Levin <levinale@chromium.org>
      Fixes: 86c5dd68 ("pinctrl: cherryview: limit Strago DMI workarounds to version 1.0")
      Cc: stable@vger.kernel.org
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Tested-by: Alex Levin <levinale@chromium.org>
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • S
      x86/apic/x2apic: Fix a NULL pointer deref when handling a dying cpu · 4dedaa73
      Sean Christopherson committed
      commit 7a22e03b0c02988e91003c505b34d752a51de344 upstream.
      
      Check that the per-cpu cluster mask pointer has been set prior to
      clearing a dying cpu's bit.  The per-cpu pointer is not set until the
      target cpu reaches smp_callin() during CPUHP_BRINGUP_CPU, whereas the
      teardown function, x2apic_dead_cpu(), is associated with the earlier
      CPUHP_X2APIC_PREPARE.  If an error occurs before the cpu is awakened,
      e.g. if do_boot_cpu() itself fails, x2apic_dead_cpu() will dereference
      the NULL pointer and cause a panic.
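
      The guard can be modelled in user-space C as follows. This is a sketch
      under assumptions: the per-cpu pointer is represented by a plain array,
      and cluster_mask[] and dead_cpu() are hypothetical names rather than the
      kernel's data structures.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical model: one mask pointer per CPU, NULL until the CPU came up. */
static uint64_t *cluster_mask[NR_CPUS];

/* Teardown step: clear the dying CPU's bit only if its pointer was ever set. */
static int dead_cpu(unsigned int cpu)
{
    uint64_t *mask = cluster_mask[cpu];

    if (mask)
        *mask &= ~(UINT64_C(1) << cpu);
    return 0;
}
```

      Without the NULL check, a teardown that runs for a CPU that never reached
      the point where the pointer is assigned would dereference NULL.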
      
        smpboot: do_boot_cpu failed(-22) to wakeup CPU#1
        BUG: kernel NULL pointer dereference, address: 0000000000000008
        RIP: 0010:x2apic_dead_cpu+0x1a/0x30
        Call Trace:
         cpuhp_invoke_callback+0x9a/0x580
         _cpu_up+0x10d/0x140
         do_cpu_up+0x69/0xb0
         smp_init+0x63/0xa9
         kernel_init_freeable+0xd7/0x229
         ? rest_init+0xa0/0xa0
         kernel_init+0xa/0x100
         ret_from_fork+0x35/0x40
      
      Fixes: 023a6117 ("x86/apic/x2apic: Simplify cluster management")
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20191001205019.5789-1-sean.j.christopherson@intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • S
      x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area · 17099172
      Steve Wahl committed
      commit 2aa85f246c181b1fa89f27e8e20c5636426be624 upstream.
      
      Our hardware (UV aka Superdome Flex) has address ranges marked
      reserved by the BIOS. Access to these ranges is caught as an error,
      causing the BIOS to halt the system.
      
      Initial page tables mapped a large range of physical addresses that
      were not checked against the list of BIOS reserved addresses, and
      sometimes included reserved addresses in part of the mapped range.
      Including the reserved range in the map allowed processor speculative
      accesses to the reserved range, triggering a BIOS halt.
      
      Used early in booting, the page table level2_kernel_pgt addresses 1
      GiB divided into 2 MiB pages, and it was set up to linearly map a full
      1 GiB of physical addresses that included the physical address range
      of the kernel image, as chosen by KASLR.  But this also included a
      large range of unused addresses on either side of the kernel image.
      And unlike the kernel image's physical address range, this extra
      mapped space was not checked against the BIOS tables of usable RAM
      addresses.  So there were times when the addresses chosen by KASLR
      would result in processor accessible mappings of BIOS reserved
      physical addresses.
      
      The kernel code did not directly access any of this extra mapped
      space, but having it mapped allowed the processor to issue speculative
      accesses into reserved memory, causing system halts.
      
      This was encountered somewhat rarely on a normal system boot, and much
      more often when starting the crash kernel if "crashkernel=512M,high"
      was specified on the command line (this heavily restricts the physical
      address of the crash kernel, in our case usually within 1 GiB of
      reserved space).
      
      The solution is to invalidate the pages of this table outside the kernel
      image's space before the page table is activated. It fixes this problem
      on our hardware.
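
      The idea can be sketched as a loop over a 512-entry table of 2 MiB
      mappings. This is a simplified user-space model, not the boot code: the
      helper name, the byte-offset parameters and the use of bit 0 as the
      "present" bit are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define PTRS_PER_PMD 512u          /* 512 entries x 2 MiB = 1 GiB */
#define PMD_SHIFT    21            /* each entry maps a 2 MiB page */
#define PAGE_PRESENT UINT64_C(1)   /* assumed: bit 0 marks a valid entry */

/*
 * Hypothetical helper: text_off and end_off are byte offsets of the image
 * inside the 1 GiB window covered by the table (end_off > text_off).
 */
static void invalidate_outside_image(uint64_t pmd[PTRS_PER_PMD],
                                     uint64_t text_off, uint64_t end_off)
{
    size_t first = (size_t)(text_off >> PMD_SHIFT);
    size_t last = (size_t)((end_off - 1) >> PMD_SHIFT); /* last entry used */

    for (size_t i = 0; i < PTRS_PER_PMD; i++)
        if (i < first || i > last)
            pmd[i] &= ~PAGE_PRESENT;   /* leave no mapping to speculate into */
}
```

      Entries that overlap the image keep their mapping; every other entry
      becomes invalid before the table would be activated.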
      
       [ bp: Touchups. ]
      Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: dimitri.sivanich@hpe.com
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jordan Borgner <mail@jordan-borgner.de>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: mike.travis@hpe.com
      Cc: russ.anderson@hpe.com
      Cc: stable@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com>
      Link: https://lkml.kernel.org/r/9c011ee51b081534a7a15065b1681d200298b530.1569358539.git.steve.wahl@hpe.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • M
      dm cache: fix bugs when a GFP_NOWAIT allocation fails · e49c84c5
      Mikulas Patocka committed
      commit 13bd677a472d534bf100bab2713efc3f9e3f5978 upstream.
      
      A GFP_NOWAIT allocation can fail at any time: it doesn't wait for memory
      to become available, and it fails if the mempool is exhausted and there
      is not enough memory.
      
      If we go down this path:
        map_bio -> mg_start -> alloc_migration -> mempool_alloc(GFP_NOWAIT)
      we can see that map_bio() doesn't check the return value of mg_start(),
      and the bio is leaked.
      
      If we go down this path:
        map_bio -> mg_start -> mg_lock_writes -> alloc_prison_cell ->
        dm_bio_prison_alloc_cell_v2 -> mempool_alloc(GFP_NOWAIT) ->
        mg_lock_writes -> mg_complete
      the bio is ended with an error - it is unacceptable because it could
      cause filesystem corruption if the machine ran out of memory
      temporarily.
      
      Change GFP_NOWAIT to GFP_NOIO, so that the mempool code will properly
      wait until memory becomes available. mempool_alloc with GFP_NOIO can't
      fail, so remove the code paths that deal with allocation failure.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • P
      tracing: Fix race in perf_trace_buf initialization · 5ce7528c
      Prateek Sood committed
      commit 6b1340cc00edeadd52ebd8a45171f38c8de2a387 upstream.
      
      A race condition exists while initializing perf_trace_buf from
      perf_trace_init() and perf_kprobe_init().
      
            CPU0                                        CPU1
      perf_trace_init()
        mutex_lock(&event_mutex)
          perf_trace_event_init()
            perf_trace_event_reg()
              total_ref_count == 0
              buf = alloc_percpu()
              perf_trace_buf[i] = buf
              tp_event->class->reg() //fails       perf_kprobe_init()
              goto fail                              perf_trace_event_init()
                                                       perf_trace_event_reg()
              fail:
                total_ref_count == 0
      
                                                         total_ref_count == 0
                                                         buf = alloc_percpu()
                                                         perf_trace_buf[i] = buf
                                                         tp_event->class->reg()
                                                         total_ref_count++
      
                free_percpu(perf_trace_buf[i])
                perf_trace_buf[i] = NULL
      
      Any subsequent call to perf_trace_event_reg() will observe total_ref_count > 0,
      causing the perf_trace_buf to be always NULL. This can result in perf_trace_buf
      getting accessed from perf_trace_buf_alloc() without being initialized. Acquiring
      event_mutex in perf_kprobe_init() before calling perf_trace_event_init() should
      fix this race.
      
      The race caused the following bug:
      
       Unable to handle kernel paging request at virtual address 0000003106f2003c
       Mem abort info:
         ESR = 0x96000045
         Exception class = DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
       Data abort info:
         ISV = 0, ISS = 0x00000045
         CM = 0, WnR = 1
       user pgtable: 4k pages, 39-bit VAs, pgdp = ffffffc034b9b000
       [0000003106f2003c] pgd=0000000000000000, pud=0000000000000000
       Internal error: Oops: 96000045 [#1] PREEMPT SMP
       Process syz-executor (pid: 18393, stack limit = 0xffffffc093190000)
       pstate: 80400005 (Nzcv daif +PAN -UAO)
       pc : __memset+0x20/0x1ac
       lr : memset+0x3c/0x50
       sp : ffffffc09319fc50
      
        __memset+0x20/0x1ac
        perf_trace_buf_alloc+0x140/0x1a0
        perf_trace_sys_enter+0x158/0x310
        syscall_trace_enter+0x348/0x7c0
        el0_svc_common+0x11c/0x368
        el0_svc_handler+0x12c/0x198
        el0_svc+0x8/0xc
      
      Ramdumps showed the following:
        total_ref_count = 3
        perf_trace_buf = (
            0x0 -> NULL,
            0x0 -> NULL,
            0x0 -> NULL,
            0x0 -> NULL)
      
      Link: http://lkml.kernel.org/r/1571120245-4186-1-git-send-email-prsood@codeaurora.org
      
      Cc: stable@vger.kernel.org
      Fixes: e12f03d7 ("perf/core: Implement the 'perf_kprobe' PMU")
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Prateek Sood <prsood@codeaurora.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • A
      perf/aux: Fix AUX output stopping · 96202569
      Alexander Shishkin committed
      commit f3a519e4add93b7b31a6616f0b09635ff2e6a159 upstream.
      
      Commit:
      
        8a58ddae2379 ("perf/core: Fix exclusive events' grouping")
      
      allows CAP_EXCLUSIVE events to be grouped with other events. Since all
      of those also happen to be AUX events (which is not the case the other
      way around, because arch/s390), this changes the rules for stopping the
      output: the AUX event may not be on its PMU's context any more, if it's
      grouped with a HW event, in which case it will be on that HW event's
      context instead. If that's the case, munmap() of the AUX buffer can't
      find and stop the AUX event, potentially leaving the last reference with
      the atomic context, which will then end up freeing the AUX buffer. This
      will then trip warnings:
      
      Fix this by using the context's PMU context when looking for events
      to stop, instead of the event's PMU context.
      
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20191022073940.61814-1-alexander.shishkin@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • P
      CIFS: Fix use after free of file info structures · 01332b03
      Pavel Shilovsky committed
      commit 1a67c415965752879e2e9fad407bc44fc7f25f23 upstream.
      
      Currently the code assumes that if a file info entry belongs
      to lists of open file handles of an inode and a tcon then
      it has non-zero reference. The recent changes broke that
      assumption when putting the last reference of the file info.
      There may be a situation when a file is being deleted but
      nothing prevents another thread from referencing it again
      and starting to use it. This happens because we do not hold
      the inode list lock while checking the number of references
      of the file info structure. Fix this by doing the proper
      locking when doing the check.
      
      Fixes: 487317c99477d ("cifs: add spinlock for the openFileList to cifsInodeInfo")
      Fixes: cb248819d209d ("cifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic")
      Cc: Stable <stable@vger.kernel.org>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • R
      CIFS: avoid using MID 0xFFFF · 71cf8816
      Roberto Bergantinos Corpas committed
      commit 03d9a9fe3f3aec508e485dd3dcfa1e99933b4bdb upstream.
      
      According to the MS-CIFS specification, MID 0xFFFF should not be used by
      the CIFS client, but we actually do use it. Besides, this has proven to
      cause races leading to oopses between SendReceive2/cifs_demultiplex_thread.
      On SMB1, the MID is a 2-byte value, so 0xFFFF is easy to reach in CurrentMid,
      where it may conflict with an oplock break notification request coming
      from the server.
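
      A minimal sketch of a 16-bit MID counter that never hands out 0xFFFF.
      The function name is hypothetical, and the choice to continue with 0
      after skipping the reserved value is an assumption of this model, not
      something the message states.

```c
#include <stdint.h>

/* Hypothetical model of a 2-byte SMB1 multiplex ID counter. */
static uint16_t next_mid(uint16_t *current_mid)
{
    uint16_t mid = (uint16_t)(*current_mid + 1);   /* wraps modulo 2^16 */

    if (mid == 0xFFFF)   /* reserved value: skip it */
        mid = 0;         /* assumption: continue with the wrapped value */
    *current_mid = mid;
    return mid;
}
```

      Because the counter is only 16 bits wide, a busy client reaches the
      reserved value quickly, so the skip has to be part of every increment.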
      Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • M
      arm64: Enable workaround for Cavium TX2 erratum 219 when running SMT · 42927455
      Marc Zyngier committed
      commit 93916beb70143c46bf1d2bacf814be3a124b253b upstream.
      
      It appears that the only case where we need to apply the TX2_219_TVM
      mitigation is when the core is in SMT mode. So let's condition the
      enabling on detecting a CPU whose MPIDR_EL1.Aff0 is non-zero.
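
      The condition can be sketched as a simple bit test. Aff0 occupies bits
      [7:0] of MPIDR_EL1; the helper name is hypothetical, and the register
      value is assumed to have been read elsewhere and passed in.

```c
#include <stdbool.h>
#include <stdint.h>

#define MPIDR_AFF0_MASK UINT64_C(0xff)   /* Aff0: MPIDR_EL1 bits [7:0] */

/* Hypothetical helper: true when the CPU reports a non-zero Aff0. */
static bool needs_tx2_219_tvm(uint64_t mpidr)
{
    return (mpidr & MPIDR_AFF0_MASK) != 0;
}
```

      In this model a non-zero Aff0 is taken as the sign that the core runs
      in SMT mode, which is the condition the message ties the mitigation to.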
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      EDAC/ghes: Fix Use after free in ghes_edac remove path · d97e4a6d
      James Morse committed
      commit 1e72e673b9d102ff2e8333e74b3308d012ddf75b upstream.
      
      ghes_edac models a single logical memory controller, and uses a global
      ghes_init variable to ensure only the first ghes_edac_register() will
      do anything.
      
      ghes_edac is registered the first time a GHES entry in the HEST is
      probed. There may be multiple entries, so subsequent attempts to
      register ghes_edac are silently ignored as the work has already been
      done.
      
      When a GHES entry is unregistered, it calls ghes_edac_unregister(),
      which free()s the memory behind the global variables in ghes_edac.
      
      But there may be multiple GHES entries, so the next call to
      ghes_edac_unregister() will dereference the free()d memory, and attempt
      to free it a second time.
      
      This may also be triggered on a platform with one GHES entry, if the
      driver is unbound/re-bound and unbound. The re-bind step will do
      nothing because of ghes_init, the second unbind will then do the same
      work as the first.
      
      Doing the unregister work on the first call is unsafe, as another
      CPU may be processing a notification in ghes_edac_report_mem_error(),
      using the memory we are about to free.
      
      ghes_init is already half of the reference counting. We only need
      to do the register work for the first call, and the unregister work
      for the last. Add the unregister check.
      
      This means we no longer free ghes_edac's memory while there are
      GHES entries that may receive a notification.
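
      The "first registers, last unregisters" rule can be sketched with a
      plain counter. This is a single-threaded illustration: the names are
      hypothetical, and a boolean stands in for the shared allocations.

```c
#include <stdbool.h>

static int ghes_refcount;     /* hypothetical count of registered entries */
static bool resources_live;   /* stands in for the shared allocations */

static void edac_register(void)
{
    if (ghes_refcount++ == 0)
        resources_live = true;    /* only the first caller sets things up */
}

static void edac_unregister(void)
{
    if (--ghes_refcount == 0)
        resources_live = false;   /* only the last caller tears them down */
}
```

      With two registered entries, the first unregister leaves the shared
      state alone, so a notification arriving for the remaining entry still
      finds valid memory.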
      
      This was detected by KASAN and DEBUG_TEST_DRIVER_REMOVE.
      
       [ bp: merge into a single patch. ]
      
      Fixes: 0fe5f281 ("EDAC, ghes: Model a single, logical memory controller")
      Reported-by: John Garry <john.garry@huawei.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Robert Richter <rrichter@marvell.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20191014171919.85044-2-james.morse@arm.com
      Link: https://lkml.kernel.org/r/304df85b-8b56-b77e-1a11-aa23769f2e7c@huawei.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • H
      parisc: Fix vmap memory leak in ioremap()/iounmap() · ca65fe21
      Helge Deller committed
      commit 513f7f747e1cba81f28a436911fba0b485878ebd upstream.
      
      Sven noticed that calling ioremap() and iounmap() multiple times leads
      to a vmap memory leak:
      	vmap allocation for size 4198400 failed:
      	use vmalloc=<size> to increase size
      
      It seems we missed calling vunmap() in iounmap().
      
      Signed-off-by: Helge Deller <deller@gmx.de>
      Noticed-by: Sven Schnelle <svens@stackframe.org>
      Cc: <stable@vger.kernel.org> # v3.16+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • M
      xtensa: drop EXPORT_SYMBOL for outs*/ins* · 19e2ed7b
      Max Filippov committed
      commit 8b39da985194aac2998dd9e3a22d00b596cebf1e upstream.
      
      Custom outs*/ins* implementations are long gone from the xtensa port;
      remove the matching EXPORT_SYMBOLs.
      This fixes the following build warnings issued by modpost since commit
      15bfc2348d54 ("modpost: check for static EXPORT_SYMBOL* functions"):
      
        WARNING: "insb" [vmlinux] is a static EXPORT_SYMBOL
        WARNING: "insw" [vmlinux] is a static EXPORT_SYMBOL
        WARNING: "insl" [vmlinux] is a static EXPORT_SYMBOL
        WARNING: "outsb" [vmlinux] is a static EXPORT_SYMBOL
        WARNING: "outsw" [vmlinux] is a static EXPORT_SYMBOL
        WARNING: "outsl" [vmlinux] is a static EXPORT_SYMBOL
      
      Cc: stable@vger.kernel.org
      Fixes: d38efc1f ("xtensa: adopt generic io routines")
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      mm/memory-failure: poison read receives SIGKILL instead of SIGBUS if mmaped more than once · 30cff8ab
      Jane Chu committed
      commit 3d7fed4ad8ccb691d217efbb0f934e6a4df5ef91 upstream.
      
      Mmap /dev/dax more than once, then read the poison location using an
      address from one of the mappings.  The other mappings, which do not have
      the page mapped in, cause SIGKILL to be delivered to the process.
      SIGKILL takes precedence over SIGBUS, so the user process loses the
      opportunity to handle the UE.
      
      Although one may add MAP_POPULATE to mmap(2) to work around the issue,
      MAP_POPULATE makes mapping 128GB of pmem several magnitudes slower, so
      isn't always an option.
      
      Details -
      
        ndctl inject-error --block=10 --count=1 namespace6.0
      
        ./read_poison -x dax6.0 -o 5120 -m 2
        mmaped address 0x7f5bb6600000
        mmaped address 0x7f3cf3600000
        doing local read at address 0x7f3cf3601400
        Killed
      
      Console messages in instrumented kernel -
      
        mce: Uncorrected hardware memory error in user-access at edbe201400
        Memory failure: tk->addr = 7f5bb6601000
        Memory failure: address edbe201: call dev_pagemap_mapping_shift
        dev_pagemap_mapping_shift: page edbe201: no PUD
        Memory failure: tk->size_shift == 0
        Memory failure: Unable to find user space address edbe201 in read_poison
        Memory failure: tk->addr = 7f3cf3601000
        Memory failure: address edbe201: call dev_pagemap_mapping_shift
        Memory failure: tk->size_shift = 21
        Memory failure: 0xedbe201: forcibly killing read_poison:22434 because of failure to unmap corrupted page
          => to deliver SIGKILL
        Memory failure: 0xedbe201: Killing read_poison:22434 due to hardware memory corruption
          => to deliver SIGBUS
      
      Link: http://lkml.kernel.org/r/1565112345-28754-3-git-send-email-jane.chu@oracle.com
      Signed-off-by: Jane Chu <jane.chu@oracle.com>
      Suggested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      30cff8ab
    • D
      hugetlbfs: don't access uninitialized memmaps in pfn_range_valid_gigantic() · 91eec769
      David Hildenbrand committed
      commit f231fe4235e22e18d847e05cbe705deaca56580a upstream.
      
      Uninitialized memmaps contain garbage and in the worst case trigger
      kernel BUGs, especially with CONFIG_PAGE_POISONING.  They should not get
      touched.
      
      Let's make sure that we only consider online memory (managed by the
      buddy) that has initialized memmaps.  ZONE_DEVICE is not applicable.
      
      page_zone() will call page_to_nid(), which will trigger
      VM_BUG_ON_PGFLAGS(PagePoisoned(page), page) with CONFIG_PAGE_POISONING
      and CONFIG_DEBUG_VM_PGFLAGS when called on uninitialized memmaps.  This
      can be the case when an offline memory block (e.g., never onlined) is
      spanned by a zone.
      
      Note: As explained by Michal in [1], alloc_contig_range() will verify
      the range.  So it boils down to the wrong access in this function.
      
      [1] http://lkml.kernel.org/r/20180423000943.GO17484@dhcp22.suse.cz
      
      Link: http://lkml.kernel.org/r/20191015120717.4858-1-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reported-by: Michal Hocko <mhocko@kernel.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      91eec769
    • Q
      mm/page_owner: don't access uninitialized memmaps when reading /proc/pagetypeinfo · f712e306
      Qian Cai committed
      commit a26ee565b6cd8dc2bf15ff6aa70bbb28f928b773 upstream.
      
      Uninitialized memmaps contain garbage and in the worst case trigger
      kernel BUGs, especially with CONFIG_PAGE_POISONING.  They should not get
      touched.
      
      For example, when not onlining a memory block that is spanned by a zone
      and reading /proc/pagetypeinfo with CONFIG_DEBUG_VM_PGFLAGS and
      CONFIG_PAGE_POISONING, we can trigger a kernel BUG:
      
        :/# echo 1 > /sys/devices/system/memory/memory40/online
        :/# echo 1 > /sys/devices/system/memory/memory42/online
        :/# cat /proc/pagetypeinfo > test.file
         page:fffff2c585200000 is uninitialized and poisoned
         raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
         raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
         page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
         There is not page extension available.
         ------------[ cut here ]------------
         kernel BUG at include/linux/mm.h:1107!
         invalid opcode: 0000 [#1] SMP NOPTI
      
      Please note that this change does not affect ZONE_DEVICE, because
      pagetypeinfo_showmixedcount_print() is called from
      mm/vmstat.c:pagetypeinfo_showmixedcount() only for populated zones, and
      ZONE_DEVICE is never populated (zone->present_pages always 0).
      
      [david@redhat.com: move check to outer loop, add comment, rephrase description]
      Link: http://lkml.kernel.org/r/20191011140638.8160-1-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visible after d0dc12e8
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: Miles Chen <miles.chen@mediatek.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f712e306
    • Q
      mm/slub: fix a deadlock in show_slab_objects() · bb6932c5
      Qian Cai committed
      commit e4f8e513c3d353c134ad4eef9fd0bba12406c7c8 upstream.
      
      A long time ago we fixed a similar deadlock in show_slab_objects() [1].
      However, apparently due to commits like 01fb58bc ("slab: remove
      synchronous synchronize_sched() from memcg cache deactivation path")
      and 03afc0e2 ("slab: get_online_mems for
      kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back:
      just reading files in /sys/kernel/slab generates the lockdep splat
      below.
      
      Since the "mem_hotplug_lock" here is only to obtain a stable online node
      mask while racing with NUMA node hotplug, in the worst case, the results
      may be miscalculated while doing NUMA node hotplug, but they shall be
      corrected by later reads of the same files.
      
        WARNING: possible circular locking dependency detected
        ------------------------------------------------------
        cat/5224 is trying to acquire lock:
        ffff900012ac3120 (mem_hotplug_lock.rw_sem){++++}, at:
        show_slab_objects+0x94/0x3a8
      
        but task is already holding lock:
        b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #2 (kn->count#45){++++}:
               lock_acquire+0x31c/0x360
               __kernfs_remove+0x290/0x490
               kernfs_remove+0x30/0x44
               sysfs_remove_dir+0x70/0x88
               kobject_del+0x50/0xb0
               sysfs_slab_unlink+0x2c/0x38
               shutdown_cache+0xa0/0xf0
               kmemcg_cache_shutdown_fn+0x1c/0x34
               kmemcg_workfn+0x44/0x64
               process_one_work+0x4f4/0x950
               worker_thread+0x390/0x4bc
               kthread+0x1cc/0x1e8
               ret_from_fork+0x10/0x18
      
        -> #1 (slab_mutex){+.+.}:
               lock_acquire+0x31c/0x360
               __mutex_lock_common+0x16c/0xf78
               mutex_lock_nested+0x40/0x50
               memcg_create_kmem_cache+0x38/0x16c
               memcg_kmem_cache_create_func+0x3c/0x70
               process_one_work+0x4f4/0x950
               worker_thread+0x390/0x4bc
               kthread+0x1cc/0x1e8
               ret_from_fork+0x10/0x18
      
        -> #0 (mem_hotplug_lock.rw_sem){++++}:
               validate_chain+0xd10/0x2bcc
               __lock_acquire+0x7f4/0xb8c
               lock_acquire+0x31c/0x360
               get_online_mems+0x54/0x150
               show_slab_objects+0x94/0x3a8
               total_objects_show+0x28/0x34
               slab_attr_show+0x38/0x54
               sysfs_kf_seq_show+0x198/0x2d4
               kernfs_seq_show+0xa4/0xcc
               seq_read+0x30c/0x8a8
               kernfs_fop_read+0xa8/0x314
               __vfs_read+0x88/0x20c
               vfs_read+0xd8/0x10c
               ksys_read+0xb0/0x120
               __arm64_sys_read+0x54/0x88
               el0_svc_handler+0x170/0x240
               el0_svc+0x8/0xc
      
        other info that might help us debug this:
      
        Chain exists of:
          mem_hotplug_lock.rw_sem --> slab_mutex --> kn->count#45
      
         Possible unsafe locking scenario:
      
               CPU0                    CPU1
               ----                    ----
          lock(kn->count#45);
                                       lock(slab_mutex);
                                       lock(kn->count#45);
          lock(mem_hotplug_lock.rw_sem);
      
         *** DEADLOCK ***
      
        3 locks held by cat/5224:
         #0: 9eff00095b14b2a0 (&p->lock){+.+.}, at: seq_read+0x4c/0x8a8
         #1: 0eff008997041480 (&of->mutex){+.+.}, at: kernfs_seq_start+0x34/0xf0
         #2: b8ff009693eee398 (kn->count#45){++++}, at:
        kernfs_seq_start+0x44/0xf0
      
        stack backtrace:
        Call trace:
         dump_backtrace+0x0/0x248
         show_stack+0x20/0x2c
         dump_stack+0xd0/0x140
         print_circular_bug+0x368/0x380
         check_noncircular+0x248/0x250
         validate_chain+0xd10/0x2bcc
         __lock_acquire+0x7f4/0xb8c
         lock_acquire+0x31c/0x360
         get_online_mems+0x54/0x150
         show_slab_objects+0x94/0x3a8
         total_objects_show+0x28/0x34
         slab_attr_show+0x38/0x54
         sysfs_kf_seq_show+0x198/0x2d4
         kernfs_seq_show+0xa4/0xcc
         seq_read+0x30c/0x8a8
         kernfs_fop_read+0xa8/0x314
         __vfs_read+0x88/0x20c
         vfs_read+0xd8/0x10c
         ksys_read+0xb0/0x120
         __arm64_sys_read+0x54/0x88
         el0_svc_handler+0x170/0x240
         el0_svc+0x8/0xc
      
      I think it is important to mention that this doesn't expose
      show_slab_objects() to a use-after-free.  There is only a single path
      that might really race here, and that is the slab hotplug notifier
      callback __kmem_cache_shrink (via slab_mem_going_offline_callback), but
      that path doesn't really destroy kmem_cache_node data structures.
      
      [1] http://lkml.iu.edu/hypermail/linux/kernel/1101.0/02850.html
      
      [akpm@linux-foundation.org: add comment explaining why we don't need mem_hotplug_lock]
      Link: http://lkml.kernel.org/r/1570192309-10132-1-git-send-email-cai@lca.pw
      Fixes: 01fb58bc ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path")
      Fixes: 03afc0e2 ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bb6932c5
    • D
      mm/memory-failure.c: don't access uninitialized memmaps in memory_failure() · 9792afbd
      David Hildenbrand committed
      commit 96c804a6ae8c59a9092b3d5dd581198472063184 upstream.
      
      We should check pfn_to_online_page() so that we do not access
      uninitialized memmaps.  Reshuffle the code so we don't have to duplicate
      the error message.
      
      Link: http://lkml.kernel.org/r/20191009142435.3975-3-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9792afbd
    • F
      mmc: cqhci: Commit descriptors before setting the doorbell · 01a44055
      Faiz Abbas committed
      commit c07d0073b9ec80a139d07ebf78e9c30d2a28279e upstream.
      
      Add a write memory barrier to make sure that descriptors are actually
      written to memory, before ringing the doorbell.
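      
      The ordering requirement can be sketched in portable user-space C11.
      This is an illustration only, not the cqhci driver code: struct
      task_desc, desc_ring, ring_doorbell_slot() and the doorbell variable
      are hypothetical stand-ins, and the C11 release fence takes the place
      of the kernel's write memory barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor layout; the real cqhci layout differs. */
struct task_desc {
	uint64_t addr;
	uint32_t len;
};

#define NUM_SLOTS 32

static struct task_desc desc_ring[NUM_SLOTS];
/* Stand-in for the memory-mapped doorbell register. */
static _Atomic uint32_t doorbell;

/* Fill the descriptor for @slot, then publish it by setting its doorbell bit. */
void ring_doorbell_slot(unsigned int slot, uint64_t addr, uint32_t len)
{
	assert(slot < NUM_SLOTS);

	desc_ring[slot].addr = addr;
	desc_ring[slot].len = len;

	/*
	 * Order the descriptor stores before the doorbell store, so an
	 * observer that sees the doorbell bit also sees the descriptor.
	 * The kernel would use a write barrier at this point.
	 */
	atomic_thread_fence(memory_order_release);

	atomic_fetch_or_explicit(&doorbell, UINT32_C(1) << slot,
				 memory_order_relaxed);
}
```

      Without the fence, the compiler or CPU would be free to make the
      doorbell store visible before the descriptor stores.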
      
      Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      01a44055
    • D
      fs/proc/page.c: don't access uninitialized memmaps in fs/proc/page.c · 6ea856ef
      David Hildenbrand committed
      commit aad5f69bc161af489dbb5934868bd347282f0764 upstream.
      
      There are three places where we access uninitialized memmaps, namely:
      - /proc/kpagecount
      - /proc/kpageflags
      - /proc/kpagecgroup
      
      We have initialized memmaps either when the section is online or when the
      page was initialized to the ZONE_DEVICE.  Uninitialized memmaps contain
      garbage and in the worst case trigger kernel BUGs, especially with
      CONFIG_PAGE_POISONING.
      
      For example, not onlining a DIMM during boot and calling /proc/kpagecount
      with CONFIG_PAGE_POISONING:
      
        :/# cat /proc/kpagecount > tmp.test
        BUG: unable to handle page fault for address: fffffffffffffffe
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 114616067 P4D 114616067 PUD 114618067 PMD 0
        Oops: 0000 [#1] SMP NOPTI
        CPU: 0 PID: 469 Comm: cat Not tainted 5.4.0-rc1-next-20191004+ #11
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
        RIP: 0010:kpagecount_read+0xce/0x1e0
        Code: e8 09 83 e0 3f 48 0f a3 02 73 2d 4c 89 e7 48 c1 e7 06 48 03 3d ab 51 01 01 74 1d 48 8b 57 08 480
        RSP: 0018:ffffa14e409b7e78 EFLAGS: 00010202
        RAX: fffffffffffffffe RBX: 0000000000020000 RCX: 0000000000000000
        RDX: 0000000000000001 RSI: 00007f76b5595000 RDI: fffff35645000000
        RBP: 00007f76b5595000 R08: 0000000000000001 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
        R13: 0000000000020000 R14: 00007f76b5595000 R15: ffffa14e409b7f08
        FS:  00007f76b577d580(0000) GS:ffff8f41bd400000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: fffffffffffffffe CR3: 0000000078960000 CR4: 00000000000006f0
        Call Trace:
         proc_reg_read+0x3c/0x60
         vfs_read+0xc5/0x180
         ksys_read+0x68/0xe0
         do_syscall_64+0x5c/0xa0
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      For now, let's drop support for ZONE_DEVICE from the three pseudo files
      in order to fix this.  To distinguish offline memory (with garbage
      memmap) from ZONE_DEVICE memory with properly initialized memmaps, we
      would have to check get_dev_pagemap() and pfn_zone_device_reserved()
      right now.  The usage of both (especially, special casing devmem) is
      frowned upon and needs to be reworked.
      
      The fundamental issue we have is:
      
      	if (pfn_to_online_page(pfn)) {
      		/* memmap initialized */
      	} else if (pfn_valid(pfn)) {
      		/*
      		 * ???
      		 * a) offline memory. memmap garbage.
      		 * b) devmem: memmap initialized to ZONE_DEVICE.
      		 * c) devmem: reserved for driver. memmap garbage.
      		 * (d) devmem: memmap currently initializing - garbage)
      		 */
      	}
      
      We'll leave the pfn_zone_device_reserved() check in stable_page_flags()
      in place as that function is also used from memory failure.  We now no
      longer dump information about pages that are not in use anymore -
      offline.
      
      Link: http://lkml.kernel.org/r/20191009142435.3975-2-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reported-by: Qian Cai <cai@lca.pw>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com>
      Cc: Pankaj gupta <pagupta@redhat.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Anthony Yznaga <anthony.yznaga@oracle.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6ea856ef
    • D
      drivers/base/memory.c: don't access uninitialized memmaps in soft_offline_page_store() · 43a2a6c2
      David Hildenbrand committed
      commit 641fe2e9387a36f9ee01d7c69382d1fe147a5e98 upstream.
      
      Uninitialized memmaps contain garbage and in the worst case trigger kernel
      BUGs, especially with CONFIG_PAGE_POISONING.  They should not get touched.
      
      Right now, when trying to soft-offline a PFN that resides on a memory
      block that was never onlined, one gets a misleading error with
      CONFIG_PAGE_POISONING:
      
        :/# echo 5637144576 > /sys/devices/system/memory/soft_offline_page
        [   23.097167] soft offline: 0x150000 page already poisoned
      
      But the actual result depends on the garbage in the memmap.
      
      soft_offline_page() can only work with online pages; it returns -EIO in
      case of ZONE_DEVICE.  Make sure to only forward pages that are online
      (in other words, managed by the buddy) and, therefore, have an
      initialized memmap.
      
      Add a check against pfn_to_online_page() and similarly return -EIO.
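      
      The check can be modelled in user-space C.  This is a toy model, not
      the kernel code: pfn_is_online() is a hypothetical stand-in for
      pfn_to_online_page(), the block_online table is invented, and 4 KiB
      pages with 128 MiB memory blocks are assumptions.  Under those
      assumptions the address 5637144576 from the example above maps to
      PFN 0x150000, which lies in memory block 42.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12		/* assumption: 4 KiB pages */
#define PAGES_PER_BLOCK 0x8000	/* assumption: 128 MiB memory blocks */
#define NUM_BLOCKS 64

/* Toy state: which memory blocks have been onlined. */
static bool block_online[NUM_BLOCKS];

/* Hypothetical stand-in for "pfn_to_online_page(pfn) != NULL". */
bool pfn_is_online(uint64_t pfn)
{
	uint64_t block = pfn / PAGES_PER_BLOCK;

	return block < NUM_BLOCKS && block_online[block];
}

/* Returns 0 on success, -EIO if the PFN has no initialized memmap. */
int soft_offline_store(uint64_t phys_addr)
{
	uint64_t pfn = phys_addr >> PAGE_SHIFT;

	if (!pfn_is_online(pfn))
		return -EIO;	/* never look at the (garbage) memmap */

	/* ... the real code would soft-offline the page here ... */
	return 0;
}
```

      The point of the early return is that the result no longer depends on
      whatever happens to be stored in an uninitialized memmap.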
      
      Link: http://lkml.kernel.org/r/20191010141200.8985-1-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      43a2a6c2
    • H
      drm/amdgpu: Bail earlier when amdgpu.cik_/si_support is not set to 1 · 4d5307c0
      Hans de Goede committed
      commit 984d7a929ad68b7be9990fc9c5cfa5d5c9fc7942 upstream.
      
      Bail from the pci_driver probe function instead of from the drm_driver
      load function.
      
      This avoids /dev/dri/card0 temporarily getting registered and then
      unregistered again, sending unwanted add / remove udev events to
      userspace.
      
      Specifically this avoids triggering the (userspace) bug fixed by this
      plymouth merge-request:
      https://gitlab.freedesktop.org/plymouth/plymouth/merge_requests/59
      
      Note that despite that being a userspace bug, not sending unnecessary
      udev events is a good idea in general.
      
      BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1490490
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d5307c0
    • T
      drm/ttm: Restore ttm prefaulting · 11377c3e
      Thomas Hellstrom committed
      commit 941f2f72dbbe0cf8c2d6e0b180a8021a0ec477fa upstream.
      
      Commit 4daa4fba ("gpu: drm: ttm: Adding new return type vm_fault_t")
      broke TTM prefaulting. Since vmf_insert_mixed() typically always returns
      VM_FAULT_NOPAGE, prefaulting stops after the second PTE.
      
      Restore (almost) the original behaviour. Unfortunately, with the new
      vm_fault_t return type we can no longer determine whether a prefaulting
      PTE insertion hit an already populated PTE and terminate the insertion
      loop. Instead we continue with the pre-determined number of prefaults.
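      
      The restored loop shape can be sketched as follows.  This is a toy
      model under stated assumptions, not the TTM fault handler: the
      TOY_FAULT_* values, toy_insert_pte() (standing in for
      vmf_insert_mixed()) and toy_fault_with_prefault() are all
      hypothetical names.

```c
#include <assert.h>

/* Toy fault codes; the values are illustrative, not the kernel's. */
#define TOY_FAULT_NOPAGE 0x1u
#define TOY_FAULT_ERROR  0x2u

#define TOY_NUM_PTES 64

static int pte_populated[TOY_NUM_PTES];

/* Hypothetical insert helper: success is reported as NOPAGE. */
unsigned int toy_insert_pte(unsigned int idx)
{
	if (idx >= TOY_NUM_PTES)
		return TOY_FAULT_ERROR;
	pte_populated[idx] = 1;	/* fresh and already-populated PTEs look the same */
	return TOY_FAULT_NOPAGE;
}

/*
 * Fault in the PTE at @first plus up to @num_prefault - 1 following ones.
 * Only an error terminates the loop; NOPAGE is the normal success value.
 * Returns the number of PTEs inserted.
 */
unsigned int toy_fault_with_prefault(unsigned int first, unsigned int num_prefault)
{
	unsigned int i;

	for (i = 0; i < num_prefault; i++) {
		unsigned int ret = toy_insert_pte(first + i);

		if (ret & TOY_FAULT_ERROR)
			break;
	}
	return i;
}
```

      A loop that treated TOY_FAULT_NOPAGE as a stop condition would give up
      almost immediately, which is the regression described above.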
      
      Fixes: 4daa4fba ("gpu: drm: ttm: Adding new return type vm_fault_t")
      Cc: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Christian König <christian.koenig@amd.com>
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Cc: stable@vger.kernel.org # v4.19+
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Link: https://patchwork.freedesktop.org/patch/330387/
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      11377c3e
    • K
      drm/edid: Add 6 bpc quirk for SDC panel in Lenovo G50 · 33af2a8e
      Kai-Heng Feng committed
      commit 11bcf5f78905b90baae8fb01e16650664ed0cb00 upstream.
      
      Another panel that needs 6BPC quirk.
      
      BugLink: https://bugs.launchpad.net/bugs/1819968
      Cc: <stable@vger.kernel.org> # v4.8+
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190402033037.21877-1-kai.heng.feng@canonical.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      33af2a8e
    • W
      mac80211: Reject malformed SSID elements · 24ca6289
      Will Deacon committed
      commit 4152561f5da3fca92af7179dd538ea89e248f9d0 upstream.
      
      Although this shouldn't occur in practice, it's a good idea to bounds
      check the length field of the SSID element prior to using it for things
      like allocations or memcpy operations.
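      
      A minimal sketch of such a bounds check, written as stand-alone
      user-space C rather than the mac80211 code.  copy_ssid() and the
      macro names are hypothetical; the element layout (one ID byte, one
      length byte, then the payload), element ID 0 and the 32-byte SSID
      limit follow IEEE 802.11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SSID_MAX_LEN 32	/* maximum SSID length in IEEE 802.11 */
#define EID_SSID 0	/* element ID of the SSID element */

/*
 * Copy the SSID out of a raw element laid out as: ID (1 byte),
 * length (1 byte), payload.  Returns the SSID length, or -1 if the
 * element is truncated, is not an SSID element, or is too long.
 */
int copy_ssid(const uint8_t *elem, size_t elem_size, uint8_t out[SSID_MAX_LEN])
{
	uint8_t len;

	if (elem_size < 2 || elem[0] != EID_SSID)
		return -1;

	len = elem[1];
	if (len > SSID_MAX_LEN || (size_t)len + 2 > elem_size)
		return -1;	/* reject before the length reaches memcpy() */

	memcpy(out, elem + 2, len);
	return len;
}
```

      Both checks matter: the first keeps the copy inside the destination
      buffer, the second keeps the read inside the received frame.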
      
      Cc: <stable@vger.kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Reported-by: Nicolas Waisman <nico@semmle.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20191004095132.15777-1-will@kernel.org
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      24ca6289
    • W
      cfg80211: wext: avoid copying malformed SSIDs · 73c066a9
      Will Deacon committed
      commit 4ac2813cc867ae563a1ba5a9414bfb554e5796fa upstream.
      
      Ensure the SSID element is bounds-checked prior to invoking memcpy()
      with its length field, when copying to userspace.
      
      Cc: <stable@vger.kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Reported-by: Nicolas Waisman <nico@semmle.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20191004095132.15777-2-will@kernel.org
      [adjust commit log a bit]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      73c066a9
    • J
      ACPI: CPPC: Set pcc_data[pcc_ss_id] to NULL in acpi_cppc_processor_exit() · 83dc1670
      John Garry committed
      commit 56a0b978d42f58c7e3ba715cf65af487d427524d upstream.
      
      When enabling KASAN and DEBUG_TEST_DRIVER_REMOVE, I find this KASAN
      warning:
      
      [   20.872057] BUG: KASAN: use-after-free in pcc_data_alloc+0x40/0xb8
      [   20.878226] Read of size 4 at addr ffff00236cdeb684 by task swapper/0/1
      [   20.884826]
      [   20.886309] CPU: 19 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc1-00009-ge7f7df3db5bf-dirty #289
      [   20.894994] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
      [   20.903505] Call trace:
      [   20.905942]  dump_backtrace+0x0/0x200
      [   20.909593]  show_stack+0x14/0x20
      [   20.912899]  dump_stack+0xd4/0x130
      [   20.916291]  print_address_description.isra.9+0x6c/0x3b8
      [   20.921592]  __kasan_report+0x12c/0x23c
      [   20.925417]  kasan_report+0xc/0x18
      [   20.928808]  __asan_load4+0x94/0xb8
      [   20.932286]  pcc_data_alloc+0x40/0xb8
      [   20.935938]  acpi_cppc_processor_probe+0x4e8/0xb08
      [   20.940717]  __acpi_processor_start+0x48/0xb0
      [   20.945062]  acpi_processor_start+0x40/0x60
      [   20.949235]  really_probe+0x118/0x548
      [   20.952887]  driver_probe_device+0x7c/0x148
      [   20.957059]  device_driver_attach+0x94/0xa0
      [   20.961231]  __driver_attach+0xa4/0x110
      [   20.965055]  bus_for_each_dev+0xe8/0x158
      [   20.968966]  driver_attach+0x30/0x40
      [   20.972531]  bus_add_driver+0x234/0x2f0
      [   20.976356]  driver_register+0xbc/0x1d0
      [   20.980182]  acpi_processor_driver_init+0x40/0xe4
      [   20.984875]  do_one_initcall+0xb4/0x254
      [   20.988700]  kernel_init_freeable+0x24c/0x2f8
      [   20.993047]  kernel_init+0x10/0x118
      [   20.996524]  ret_from_fork+0x10/0x18
      [   21.000087]
      [   21.001567] Allocated by task 1:
      [   21.004785]  save_stack+0x28/0xc8
      [   21.008089]  __kasan_kmalloc.isra.9+0xbc/0xd8
      [   21.012435]  kasan_kmalloc+0xc/0x18
      [   21.015913]  pcc_data_alloc+0x94/0xb8
      [   21.019564]  acpi_cppc_processor_probe+0x4e8/0xb08
      [   21.024343]  __acpi_processor_start+0x48/0xb0
      [   21.028689]  acpi_processor_start+0x40/0x60
      [   21.032860]  really_probe+0x118/0x548
      [   21.036512]  driver_probe_device+0x7c/0x148
      [   21.040684]  device_driver_attach+0x94/0xa0
      [   21.044855]  __driver_attach+0xa4/0x110
      [   21.048680]  bus_for_each_dev+0xe8/0x158
      [   21.052591]  driver_attach+0x30/0x40
      [   21.056155]  bus_add_driver+0x234/0x2f0
      [   21.059980]  driver_register+0xbc/0x1d0
      [   21.063805]  acpi_processor_driver_init+0x40/0xe4
      [   21.068497]  do_one_initcall+0xb4/0x254
      [   21.072322]  kernel_init_freeable+0x24c/0x2f8
      [   21.076667]  kernel_init+0x10/0x118
      [   21.080144]  ret_from_fork+0x10/0x18
      [   21.083707]
      [   21.085186] Freed by task 1:
      [   21.088056]  save_stack+0x28/0xc8
      [   21.091360]  __kasan_slab_free+0x118/0x180
      [   21.095445]  kasan_slab_free+0x10/0x18
      [   21.099183]  kfree+0x80/0x268
      [   21.102139]  acpi_cppc_processor_exit+0x1a8/0x1b8
      [   21.106832]  acpi_processor_stop+0x70/0x80
      [   21.110917]  really_probe+0x174/0x548
      [   21.114568]  driver_probe_device+0x7c/0x148
      [   21.118740]  device_driver_attach+0x94/0xa0
      [   21.122912]  __driver_attach+0xa4/0x110
      [   21.126736]  bus_for_each_dev+0xe8/0x158
      [   21.130648]  driver_attach+0x30/0x40
      [   21.134212]  bus_add_driver+0x234/0x2f0
      [   21.0x10/0x18
      [   21.161764]
      [   21.163244] The buggy address belongs to the object at ffff00236cdeb600
      [   21.163244]  which belongs to the cache kmalloc-256 of size 256
      [   21.175750] The buggy address is located 132 bytes inside of
      [   21.175750]  256-byte region [ffff00236cdeb600, ffff00236cdeb700)
      [   21.187473] The buggy address belongs to the page:
      [   21.192254] page:fffffe008d937a00 refcount:1 mapcount:0 mapping:ffff002370c0fa00 index:0x0 compound_mapcount: 0
      [   21.202331] flags: 0x1ffff00000010200(slab|head)
      [   21.206940] raw: 1ffff00000010200 dead000000000100 dead000000000122 ffff002370c0fa00
      [   21.214671] raw: 0000000000000000 00000000802a002a 00000001ffffffff 0000000000000000
      [   21.222400] page dumped because: kasan: bad access detected
      [   21.227959]
      [   21.229438] Memory state around the buggy address:
      [   21.234218]  ffff00236cdeb580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   21.241427]  ffff00236cdeb600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   21.248637] >ffff00236cdeb680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   21.255845]                    ^
      [   21.259062]  ffff00236cdeb700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   21.266272]  ffff00236cdeb780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   21.273480] ==================================================================
      
      It seems that global pcc_data[pcc_ss_id] can be freed in
      acpi_cppc_processor_exit(), but we may later reference this value, so
      NULLify it when freed.
      
      Also remove the useless setting of data "pcc_channel_acquired", which
      we're about to free.
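      
      The pattern can be illustrated with a small user-space model.  It is
      a sketch under stated assumptions, not the ACPI CPPC code: struct
      toy_pcc_data, toy_pcc_table, toy_pcc_get() and toy_pcc_put() are
      hypothetical names, and malloc-family calls stand in for the kernel
      allocator.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_SUBSPACES 8

/* Hypothetical per-subspace data; the real structure has more fields. */
struct toy_pcc_data {
	int refcount;
};

static struct toy_pcc_data *toy_pcc_table[MAX_SUBSPACES];

/* Take a reference on slot @id, allocating it on first use. 0 on success. */
int toy_pcc_get(int id)
{
	if (id < 0 || id >= MAX_SUBSPACES)
		return -1;

	if (toy_pcc_table[id]) {
		/* Would touch freed memory if the slot were left dangling. */
		toy_pcc_table[id]->refcount++;
		return 0;
	}

	toy_pcc_table[id] = calloc(1, sizeof(*toy_pcc_table[id]));
	if (!toy_pcc_table[id])
		return -1;
	toy_pcc_table[id]->refcount = 1;
	return 0;
}

/* Drop a reference; on the last put, free the slot and clear the pointer. */
void toy_pcc_put(int id)
{
	if (id < 0 || id >= MAX_SUBSPACES || !toy_pcc_table[id])
		return;

	if (--toy_pcc_table[id]->refcount == 0) {
		free(toy_pcc_table[id]);
		toy_pcc_table[id] = NULL;	/* no dangling pointer left behind */
	}
}
```

      With the pointer cleared, a later toy_pcc_get() on the same slot takes
      the allocation path instead of incrementing a counter in freed memory.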
      
      Fixes: 85b1407b ("ACPI / CPPC: Make CPPC ACPI driver aware of PCC subspace IDs")
      Signed-off-by: John Garry <john.garry@huawei.com>
      Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      83dc1670
    • J
      ASoC: rsnd: Reinitialize bit clock inversion flag for every format setting · 8e367b02
      Junya Monden committed
      commit 22e58665a01006d05f0239621f7d41cacca96cc4 upstream.
      
      Unlike other format-related DAI parameters, the rdai->bit_clk_inv flag
      is not properly re-initialized when setting the format for new stream
      processing. The inversion, if requested, is then applied not to the
      default but to the previous value, which leads to the SCKP bit in the
      SSICR register being set incorrectly.
      Fix this by resetting the flag to its initial value, determined by the
      format.
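      
      The effect of the reset can be shown with a toy model.  Everything in
      it is an assumption made for illustration: enum toy_fmt, struct
      toy_dai, the per-format defaults in toy_default_bit_clk_inv() and
      toy_set_fmt() are hypothetical and do not mirror the rsnd driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical formats; the real driver works with SND_SOC_DAIFMT_* values. */
enum toy_fmt { TOY_FMT_I2S, TOY_FMT_LEFT_J };

struct toy_dai {
	bool bit_clk_inv;
};

/* Assumed per-format default for the bit clock inversion flag. */
bool toy_default_bit_clk_inv(enum toy_fmt fmt)
{
	return fmt != TOY_FMT_I2S;
}

/* Apply a format. @invert requests inversion relative to the format default. */
void toy_set_fmt(struct toy_dai *dai, enum toy_fmt fmt, bool invert)
{
	/* Start from the default every time instead of the previous value. */
	dai->bit_clk_inv = toy_default_bit_clk_inv(fmt);

	if (invert)
		dai->bit_clk_inv = !dai->bit_clk_inv;
}
```

      Calling toy_set_fmt() twice with the same arguments now yields the
      same flag; without the reset, the second call would toggle the result
      of the first.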
      
      Fixes: 1a7889ca ("ASoC: rsnd: fixup SND_SOC_DAIFMT_xB_xF behavior")
      Cc: Andrew Gabbasov <andrew_gabbasov@mentor.com>
      Cc: Jiada Wang <jiada_wang@mentor.com>
      Cc: Timo Wischer <twischer@de.adit-jv.com>
      Cc: stable@vger.kernel.org # v3.17+
      Signed-off-by: Junya Monden <jmonden@jp.adit-jv.com>
      Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
      Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Link: https://lore.kernel.org/r/20191016124255.7442-1-erosca@de.adit-jv.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8e367b02
    • E
      Input: synaptics-rmi4 - avoid processing unknown IRQs · b0dd6a24
      Evan Green committed
      commit 363c53875aef8fce69d4a2d0873919ccc7d9e2ad upstream.
      
      rmi_process_interrupt_requests() calls handle_nested_irq() for
      each interrupt status bit it finds. If the irq domain mapping for
      this bit had not yet been set up, then it ends up calling
      handle_nested_irq(0), which causes a NULL pointer dereference.
      
      There's already code that masks the irq_status bits coming out of the
      hardware with current_irq_mask, presumably to avoid this situation.
      However, current_irq_mask seems to reflect the mask actually set in
      the hardware rather than the IRQs that software has set up and registered
      for. For example, in rmi_driver_reset_handler(), the current_irq_mask
      is initialized based on what is read from the hardware. If the reset
      value of this mask enables IRQs that Linux has not set up yet, then
      we end up in this situation.
      
      There appears to be a third unused bitmask that used to serve this
      purpose, fn_irq_bits. Use that bitmask instead of current_irq_mask
      to avoid calling handle_nested_irq() on IRQs that have not yet been
      set up.
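The masking step described above can be sketched with plain integers. This is a simplified model: the driver works with kernel bitmaps, whereas this example uses a single 32-bit word, and `dispatch_irqs()` and `irq_handler_sketch` are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-bit handler; stands in for the nested IRQ dispatch. */
typedef void (*irq_handler_sketch)(unsigned int bit);

/*
 * Dispatch only those status bits that software has registered.
 * irq_status:  bits reported by the hardware
 * fn_irq_bits: bits for which a handler has been set up
 * Returns the number of bits dispatched.
 */
unsigned int dispatch_irqs(uint32_t irq_status, uint32_t fn_irq_bits,
                           irq_handler_sketch handler)
{
    uint32_t pending = irq_status & fn_irq_bits; /* drop unregistered bits */
    unsigned int handled = 0;

    for (unsigned int bit = 0; bit < 32; bit++) {
        if (pending & (UINT32_C(1) << bit)) {
            if (handler)
                handler(bit);
            handled++;
        }
    }
    return handled;
}
```

Masking with the set of registered bits, rather than with whatever the hardware reports as enabled, means a status bit with no mapping is never dispatched.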

      Signed-off-by: Evan Green <evgreen@chromium.org>
      Reviewed-by: Andrew Duggan <aduggan@synaptics.com>
      Link: https://lore.kernel.org/r/20191008223657.163366-1-evgreen@chromium.org
      Cc: stable@vger.kernel.org
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b0dd6a24
    • M
      Input: da9063 - fix capability and drop KEY_SLEEP · aa9402c1
      Marco Felsch committed
      commit afce285b859cea91c182015fc9858ea58c26cd0e upstream.
      
      Since commit f889beaa ("Input: da9063 - report KEY_POWER instead of
      KEY_SLEEP during power key-press"), KEY_SLEEP is no longer reported. As a
      result, the input device generates no events at all if
      "dlg,disable-key-power" is set.
      
      Fix this by unconditionally setting KEY_POWER capability, and not
      declaring KEY_SLEEP.
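Why a missing capability silences the device can be shown with a small model in which an event is delivered only if its key code was declared beforehand. Everything here is a hypothetical sketch: the `_SKETCH` key codes, `input_dev_sketch`, `setup_before()` and `setup_after()` are invented names, and `setup_before()` reflects one reading of the commit message (KEY_POWER declared only when the power key is enabled, KEY_SLEEP always declared), not the driver source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key codes for this sketch. */
enum { KEY_POWER_SKETCH = 0, KEY_SLEEP_SKETCH = 1 };

struct input_dev_sketch {
    uint32_t keybit; /* one bit per declared key code */
};

/* Declare that the device can emit the given key code. */
static void set_capability(struct input_dev_sketch *dev, unsigned int code)
{
    dev->keybit |= UINT32_C(1) << code;
}

/* An event reaches userspace only if its code was declared. */
bool event_delivered(const struct input_dev_sketch *dev, unsigned int code)
{
    return (dev->keybit >> code) & 1u;
}

/* Assumed pre-fix setup: KEY_POWER is conditional, KEY_SLEEP always declared. */
void setup_before(struct input_dev_sketch *dev, bool key_power)
{
    dev->keybit = 0;
    if (key_power)
        set_capability(dev, KEY_POWER_SKETCH);
    set_capability(dev, KEY_SLEEP_SKETCH);
}

/* Post-fix setup: KEY_POWER is always declared, KEY_SLEEP never. */
void setup_after(struct input_dev_sketch *dev)
{
    dev->keybit = 0;
    set_capability(dev, KEY_POWER_SKETCH);
}
```

Since the driver now reports only KEY_POWER, the device must always declare that code, otherwise every report is dropped.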
      
      Fixes: f889beaa ("Input: da9063 - report KEY_POWER instead of KEY_SLEEP during power key-press")
      Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      aa9402c1
    • B
      scsi: ch: Make it possible to open a ch device multiple times again · e254d435
      Bart Van Assche committed
      commit 6a0990eaa768dfb7064f06777743acc6d392084b upstream.
      
      Clearing ch->device in ch_release() is wrong because that pointer must
      remain valid until ch_remove() is called. This patch fixes the following
      crash the second time a ch device is opened:
      
      BUG: kernel NULL pointer dereference, address: 0000000000000790
      RIP: 0010:scsi_device_get+0x5/0x60
      Call Trace:
       ch_open+0x4c/0xa0 [ch]
       chrdev_open+0xa2/0x1c0
       do_dentry_open+0x13a/0x380
       path_openat+0x591/0x1470
       do_filp_open+0x91/0x100
       do_sys_open+0x184/0x220
       do_syscall_64+0x5f/0x1a0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
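The pointer lifetime rule behind this crash can be sketched in userspace C: the device pointer is set once, survives every open/release pair, and is cleared only on removal. The names `changer_sketch`, `scsi_dev_sketch`, `changer_open()`, `changer_release()` and `changer_remove()` are hypothetical, and the explicit NULL check in `changer_open()` stands in for the dereference that crashes in the trace above.

```c
#include <stddef.h>

struct scsi_dev_sketch {
    int users; /* number of current openers */
};

struct changer_sketch {
    struct scsi_dev_sketch *device; /* must stay valid until removal */
};

/* Open the changer: take a reference on the underlying device. */
int changer_open(struct changer_sketch *ch)
{
    if (!ch->device)
        return -1; /* in the kernel, this is where the NULL dereference occurred */
    ch->device->users++;
    return 0;
}

/* Release one opener. The device pointer is deliberately left untouched. */
void changer_release(struct changer_sketch *ch)
{
    ch->device->users--;
}

/* Tear down the changer; only here may the pointer be cleared. */
void changer_remove(struct changer_sketch *ch)
{
    ch->device = NULL;
}
```

If `changer_release()` cleared `ch->device`, the second `changer_open()` would fail in exactly the way the trace shows.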
      
      Fixes: 085e5676 ("scsi: ch: add refcounting")
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20191009173536.247889-1-bvanassche@acm.org
      Reported-by: Rob Turk <robtu@rtist.nl>
      Suggested-by: Rob Turk <robtu@rtist.nl>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e254d435