1. 08 5月, 2019 40 次提交
    • T
      x86/mce: Improve error message when kernel cannot recover, p2 · 61ff4406
      Tony Luck 提交于
      commit 41f035a86b5b72a4f947c38e94239d20d595352a upstream.
      
      In
      
        c7d606f5 ("x86/mce: Improve error message when kernel cannot recover")
      
      a case was added for a machine check caused by a DATA access to poison
      memory from the kernel. A case should have been added also for an
      uncorrectable error during an instruction fetch in the kernel.
      
      Add that extra case so the error message now reads:
      
        mce: [Hardware Error]: Machine check: Instruction fetch error in kernel
      
      Fixes: c7d606f5 ("x86/mce: Improve error message when kernel cannot recover")
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190225205940.15226-1-tony.luck@intel.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      61ff4406
    • A
      powerpc/mm/hash: Handle mmap_min_addr correctly in get_unmapped_area topdown search · c7e220ef
      Aneesh Kumar K.V 提交于
      commit 3b4d07d2674f6b4a9281031f99d1f7efd325b16d upstream.
      
      When doing top-down search the low_limit is not PAGE_SIZE but rather
      max(PAGE_SIZE, mmap_min_addr). This handle cases in which mmap_min_addr >
      PAGE_SIZE.
      
      Fixes: fba2369e ("mm: use vm_unmapped_area() on powerpc architecture")
      Reviewed-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7e220ef
    • A
      mac80211: Honor SW_CRYPTO_CONTROL for unicast keys in AP VLAN mode · a78c3898
      Alexander Wetzel 提交于
      commit 78ad2341521d5ea96cb936244ed4c4c4ef9ec13b upstream.
      
      Restore SW_CRYPTO_CONTROL operation on AP_VLAN interfaces for unicast
      keys, the original override was intended to be done for group keys as
      those are treated specially by mac80211 and would always have been
      rejected.
      
      Now the situation is that AP_VLAN support must be enabled by the driver
      if it can support it (meaning it can support software crypto GTK TX).
      
      Thus, also simplify the code - if we get here with AP_VLAN and non-
      pairwise key, software crypto must be used (driver doesn't know about
      the interface) and can be used (driver must've advertised AP_VLAN if
      it also uses SW_CRYPTO_CONTROL).
      
      Fixes: db3bdcb9 ("mac80211: allow AP_VLAN operation on crypto controlled devices")
      Signed-off-by: NAlexander Wetzel <alexander@wetzel-home.de>
      [rewrite commit message]
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a78c3898
    • O
      selinux: never allow relabeling on context mounts · 574be221
      Ondrej Mosnacek 提交于
      commit a83d6ddaebe541570291205cb538e35ad4ff94f9 upstream.
      
      In the SECURITY_FS_USE_MNTPOINT case we never want to allow relabeling
      files/directories, so we should never set the SBLABEL_MNT flag. The
      'special handling' in selinux_is_sblabel_mnt() is only intended for when
      the behavior is set to SECURITY_FS_USE_GENFS.
      
      While there, make the logic in selinux_is_sblabel_mnt() more explicit
      and add a BUILD_BUG_ON() to make sure that introducing a new
      SECURITY_FS_USE_* forces a review of the logic.
      
      Fixes: d5f3a5f6 ("selinux: add security in-core xattr support for pstore and debugfs")
      Signed-off-by: NOndrej Mosnacek <omosnace@redhat.com>
      Reviewed-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      574be221
    • S
      selinux: avoid silent denials in permissive mode under RCU walk · 6b13ae52
      Stephen Smalley 提交于
      commit 3a28cff3bd4bf43f02be0c4e7933aebf3dc8197e upstream.
      
      commit 0dc1ba24 ("SELINUX: Make selinux cache VFS RCU walks safe")
      results in no audit messages at all if in permissive mode because the
      cache is updated during the rcu walk and thus no denial occurs on
      the subsequent ref walk.  Fix this by not updating the cache when
      performing a non-blocking permission check.  This only affects search
      and symlink read checks during rcu walk.
      
      Fixes: 0dc1ba24 ("SELINUX: Make selinux cache VFS RCU walks safe")
      Reported-by: NBMK <bmktuwien@gmail.com>
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b13ae52
    • A
      gpio: mxc: add check to return defer probe if clock tree NOT ready · 53ffa564
      Anson Huang 提交于
      commit a329bbe707cee2cf8c660890ef2ad0d00ec7e8a3 upstream.
      
      On i.MX8MQ platform, clock driver uses platform driver
      model and it is probed after GPIO driver, so when GPIO
      driver fails to get clock, it should check the error type
      to decide whether to return defer probe or just ignore
      the clock operation.
      
      Fixes: 2808801a ("gpio: mxc: add clock operation")
      Signed-off-by: NAnson Huang <Anson.Huang@nxp.com>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      53ffa564
    • D
      Input: stmfts - acknowledge that setting brightness is a blocking call · a10c88bf
      Dmitry Torokhov 提交于
      commit 937c4e552fd1174784045684740edfcea536159d upstream.
      
      We need to turn regulators on and off when switching brightness, and
      that may block, therefore we have to set stmfts_brightness_set() as
      LED's brightness_set_blocking() method.
      
      Fixes: 78bcac7b ("Input: add support for the STMicroelectronics FingerTip touchscreen")
      Acked-by: NAndi Shyti <andi@etezian.org>
      Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a10c88bf
    • A
      Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ · a99b9c82
      Anson Huang 提交于
      commit bf2a7ca39fd3ab47ef71c621a7ee69d1813b1f97 upstream.
      
      SNVS IRQ is requested before necessary driver data initialized,
      if there is a pending IRQ during driver probe phase, kernel
      NULL pointer panic will occur in IRQ handler. To avoid such
      scenario, just initialize necessary driver data before enabling
      IRQ. This patch is inspired by NXP's internal kernel tree.
      
      Fixes: d3dc6e23 ("input: keyboard: imx: add snvs power key driver")
      Signed-off-by: NAnson Huang <Anson.Huang@nxp.com>
      Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a99b9c82
    • Y
      IB/core: Destroy QP if XRC QP fails · 8d5c1c03
      Yuval Avnery 提交于
      commit 535005ca8e5e71918d64074032f4b9d4fef8981e upstream.
      
      The open-coded variant missed destroy of SELinux created QP, reuse already
      existing ib_detroy_qp() call and use this opportunity to clean
      ib_create_qp() from double prints and unclear exit paths.
      Reported-by: NParav Pandit <parav@mellanox.com>
      Fixes: d291f1a6 ("IB/core: Enforce PKey security on QPs")
      Signed-off-by: NYuval Avnery <yuvalav@mellanox.com>
      Reviewed-by: NParav Pandit <parav@mellanox.com>
      Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8d5c1c03
    • D
      IB/core: Fix potential memory leak while creating MAD agents · 84148743
      Daniel Jurgens 提交于
      commit 6e88e672b69f0e627acdae74a527b730ea224b6b upstream.
      
      If the MAD agents isn't allowed to manage the subnet, or fails to register
      for the LSM notifier, the security context is leaked. Free the context in
      these cases.
      
      Fixes: 47a2b338 ("IB/core: Enforce security on management datagrams")
      Signed-off-by: NDaniel Jurgens <danielj@mellanox.com>
      Reviewed-by: NParav Pandit <parav@mellanox.com>
      Reported-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      84148743
    • D
      IB/core: Unregister notifier before freeing MAD security · dabcbe58
      Daniel Jurgens 提交于
      commit d60667fc398ed34b3c7456b020481c55c760e503 upstream.
      
      If the notifier runs after the security context is freed an access of
      freed memory can occur.
      
      Fixes: 47a2b338 ("IB/core: Enforce security on management datagrams")
      Signed-off-by: NDaniel Jurgens <danielj@mellanox.com>
      Reviewed-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dabcbe58
    • R
      platform/x86: intel_pmc_core: Handle CFL regmap properly · d1698f74
      Rajneesh Bhardwaj 提交于
      commit e50af8332785355de3cb40d9f5e8c45dbfc86f53 upstream.
      
      Only Coffeelake should use Cannonlake regmap other than Cannonlake
      platform. This allows Coffeelake special handling only when there is no
      matching PCI device and default reg map selected as per CPUID is for
      Sunrisepoint PCH. This change is needed to enable support for newer SoCs
      such as Icelake.
      
      Cc: "David E. Box" <david.e.box@intel.com>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Fixes: 661405bd ("platform/x86: intel_pmc_core: Special case for Coffeelake")
      Acked-by: N"David E. Box" <david.e.box@linux.intel.com>
      Signed-off-by: NRajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com>
      Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1698f74
    • R
      platform/x86: intel_pmc_core: Fix PCH IP name · 51e777c7
      Rajneesh Bhardwaj 提交于
      commit d6827015e671cd17871c9b7a0fabe06c044f7470 upstream.
      
      For Cannonlake and Icelake, the IP name for Res_6 should be SPF i.e.
      South Port F. No functional change is intended other than just renaming
      the IP appropriately.
      
      Cc: "David E. Box" <david.e.box@intel.com>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Fixes: 291101f6 ("platform/x86: intel_pmc_core: Add CannonLake PCH support")
      Signed-off-by: NRajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com>
      Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      51e777c7
    • A
      ASoC: stm32: fix sai driver name initialisation · d4f1e3ef
      Arnaud Pouliquen 提交于
      commit 17d3069ccf06970e2db3f7cbf4335f207524279e upstream.
      
      This patch fixes the sai driver structure overwriting which results in
      a cpu dai name equal NULL.
      
      Fixes: 3e086edf ("ASoC: stm32: add SAI driver")
      Signed-off-by: NArnaud Pouliquen <arnaud.pouliquen@st.com>
      Signed-off-by: NMark Brown <broonie@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d4f1e3ef
    • C
      ASoC: wm_adsp: Correct handling of compressed streams that restart · 7d3f7107
      Charles Keepax 提交于
      commit 639e5eb3c7d67e407f2a71fccd95323751398f6f upstream.
      
      Previously support was added to allow streams to be stopped and
      started again without the DSP being power cycled and this was done
      by clearing the buffer state in trigger start. Another supported
      use-case is using the DSP for a trigger event then opening the
      compressed stream later to receive the audio, unfortunately clearing
      the buffer state in trigger start destroys the data received
      from such a trigger. Correct this issue by moving the call to
      wm_adsp_buffer_clear to be in trigger stop instead.
      
      Fixes: 61fc060c ("ASoC: wm_adsp: Support streams which can start/stop with DSP active")
      Signed-off-by: NCharles Keepax <ckeepax@opensource.cirrus.com>
      Signed-off-by: NMark Brown <broonie@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7d3f7107
    • H
      ASoC: Intel: bytcr_rt5651: Revert "Fix DMIC map headsetmic mapping" · 3b958d5e
      Hans de Goede 提交于
      commit aee48a9ffa5a128bf4e433c57c39e015ea5b0208 upstream.
      
      Commit 37c7401e ("ASoC: Intel: bytcr_rt5651: Fix DMIC map
      headsetmic mapping"), changed the headsetmic mapping from IN3P to IN2P,
      this was based on the observation that all bytcr_rt5651 devices I have
      access to (7 devices) where all using IN3P for the headsetmic. This was
      an attempt to unifify / simplify the mapping, but it was wrong.
      
      None of those devices was actually using a digital internal mic. Now I've
      access to a Point of View TAB-P1006W-232 (v1.0) tabler, which does use a
      DMIC and it does have its headsetmic connected to IN2P, showing that the
      original mapping was correct, so this commit reverts the change changing
      the mapping back to IN2P.
      
      Fixes: 37c7401e ("ASoC: Intel: bytcr_rt5651: Fix DMIC map ... mapping")
      Acked-by: NPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: NHans de Goede <hdegoede@redhat.com>
      Signed-off-by: NMark Brown <broonie@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3b958d5e
    • B
      scsi: RDMA/srpt: Fix a credit leak for aborted commands · 9d696f40
      Bart Van Assche 提交于
      commit 40ca8757291ca7a8775498112d320205b2a2e571 upstream.
      
      Make sure that the next time a response is sent to the initiator that the
      credit it had allocated for the aborted request gets freed.
      
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Nicholas Bellinger <nab@linux-iscsi.org>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Fixes: 131e6abc ("target: Add TFO->abort_task for aborted task resources release") # v3.15
      Signed-off-by: NBart Van Assche <bvanassche@acm.org>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9d696f40
    • J
      staging: iio: adt7316: fix the dac write calculation · f16e8317
      Jeremy Fertic 提交于
      commit 78accaea117c1ae878774974fab91ac4a0b0e2b0 upstream.
      
      The lsb calculation is not masking the correct bits from the user input.
      Subtract 1 from (1 << offset) to correctly set up the mask to be applied
      to user input.
      
      The lsb register stores its value starting at the bit 7 position.
      adt7316_store_DAC() currently assumes the value is at the other end of the
      register. Shift the lsb value before storing it in a new variable lsb_reg,
      and write this variable to the lsb register.
      
      Fixes: 35f6b6b8 ("staging: iio: new ADT7316/7/8 and ADT7516/7/9 driver")
      Signed-off-by: NJeremy Fertic <jeremyfertic@gmail.com>
      Signed-off-by: NJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f16e8317
    • J
      staging: iio: adt7316: fix the dac read calculation · ad774285
      Jeremy Fertic 提交于
      commit 45130fb030aec26ac28b4bb23344901df3ec3b7f upstream.
      
      The calculation of the current dac value is using the wrong bits of the
      dac lsb register. Create two macros to shift the lsb register value into
      lsb position, depending on whether the dac is 10 or 12 bit. Initialize
      data to 0 so, with an 8 bit dac, the msb register value can be bitwise
      ORed with data.
      
      Fixes: 35f6b6b8 ("staging: iio: new ADT7316/7/8 and ADT7516/7/9 driver")
      Signed-off-by: NJeremy Fertic <jeremyfertic@gmail.com>
      Signed-off-by: NJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad774285
    • J
      staging: iio: adt7316: allow adt751x to use internal vref for all dacs · 7041e3d6
      Jeremy Fertic 提交于
      commit 10bfe7cc1739c22f0aa296b39e53f61e9e3f4d99 upstream.
      
      With adt7516/7/9, internal vref is available for dacs a and b, dacs c and
      d, or all dacs. The driver doesn't currently support internal vref for all
      dacs. Change the else if to an if so both bits are checked rather than
      just one or the other.
      Signed-off-by: NJeremy Fertic <jeremyfertic@gmail.com>
      Fixes: 35f6b6b8 ("staging: iio: new ADT7316/7/8 and ADT7516/7/9 driver")
      Signed-off-by: NJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7041e3d6
    • J
      clk: qcom: Add missing freq for usb30_master_clk on 8998 · 2ccaef71
      Jeffrey Hugo 提交于
      commit 0c8ff62504e3a667387e87889a259632c3199a86 upstream.
      
      The usb30_master_clk supports a 60Mhz frequency, but that is missing from
      the table of supported frequencies.  Add it.
      
      Fixes: b5f5f525 (clk: qcom: Add MSM8998 Global Clock Control (GCC) driver)
      Signed-off-by: NJeffrey Hugo <jhugo@codeaurora.org>
      Signed-off-by: NStephen Boyd <sboyd@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2ccaef71
    • S
      Bluetooth: mediatek: fix up an error path to restore bdev->tx_state · 8897bf03
      Sean Wang 提交于
      commit 77f328dbc6cf42f22c691a164958a5452142a542 upstream.
      
      Restore bdev->tx_state with clearing bit BTMTKUART_TX_WAIT_VND_EVT
      when there is an error on waiting for the corresponding event.
      
      Fixes: 7237c4c9 ("Bluetooth: mediatek: Add protocol support for MediaTek serial devices")
      Signed-off-by: NSean Wang <sean.wang@mediatek.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8897bf03
    • B
      Bluetooth: btusb: request wake pin with NOAUTOEN · f5ad05e6
      Brian Norris 提交于
      commit 771acc7e4a6e5dba779cb1a7fd851a164bc81033 upstream.
      
      Badly-designed systems might have (for example) active-high wake pins
      that default to high (e.g., because of external pull ups) until they
      have an active firmware which starts driving it low.  This can cause an
      interrupt storm in the time between request_irq() and disable_irq().
      
      We don't support shared interrupts here, so let's just pre-configure the
      interrupt to avoid auto-enabling it.
      
      Fixes: fd913ef7 ("Bluetooth: btusb: Add out-of-band wakeup support")
      Fixes: 5364a0b4f4be ("arm64: dts: rockchip: move QCA6174A wakeup pin into its USB node")
      Signed-off-by: NBrian Norris <briannorris@chromium.org>
      Reviewed-by: NMatthias Kaehlcke <mka@chromium.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5ad05e6
    • K
      perf/x86/amd: Update generic hardware cache events for Family 17h · 3f8497cf
      Kim Phillips 提交于
      commit 0e3b74e26280f2cf8753717a950b97d424da6046 upstream.
      
      Add a new amd_hw_cache_event_ids_f17h assignment structure set
      for AMD families 17h and above, since a lot has changed.  Specifically:
      
      L1 Data Cache
      
      The data cache access counter remains the same on Family 17h.
      
      For DC misses, PMCx041's definition changes with Family 17h,
      so instead we use the L2 cache accesses from L1 data cache
      misses counter (PMCx060,umask=0xc8).
      
      For DC hardware prefetch events, Family 17h breaks compatibility
      for PMCx067 "Data Prefetcher", so instead, we use PMCx05a "Hardware
      Prefetch DC Fills."
      
      L1 Instruction Cache
      
      PMCs 0x80 and 0x81 (32-byte IC fetches and misses) are backward
      compatible on Family 17h.
      
      For prefetches, we remove the erroneous PMCx04B assignment which
      counts how many software data cache prefetch load instructions were
      dispatched.
      
      LL - Last Level Cache
      
      Removing PMCs 7D, 7E, and 7F assignments, as they do not exist
      on Family 17h, where the last level cache is L3.  L3 counters
      can be accessed using the existing AMD Uncore driver.
      
      Data TLB
      
      On Intel machines, data TLB accesses ("dTLB-loads") are assigned
      to counters that count load/store instructions retired.  This
      is inconsistent with instruction TLB accesses, where Intel
      implementations report iTLB misses that hit in the STLB.
      
      Ideally, dTLB-loads would count higher level dTLB misses that hit
      in lower level TLBs, and dTLB-load-misses would report those
      that also missed in those lower-level TLBs, therefore causing
      a page table walk.  That would be consistent with instruction
      TLB operation, remove the redundancy between dTLB-loads and
      L1-dcache-loads, and prevent perf from producing artificially
      low percentage ratios, i.e. the "0.01%" below:
      
              42,550,869      L1-dcache-loads
              41,591,860      dTLB-loads
                   4,802      dTLB-load-misses          #    0.01% of all dTLB cache hits
               7,283,682      L1-dcache-stores
               7,912,392      dTLB-stores
                     310      dTLB-store-misses
      
      On AMD Families prior to 17h, the "Data Cache Accesses" counter is
      used, which is slightly better than load/store instructions retired,
      but still counts in terms of individual load/store operations
      instead of TLB operations.
      
      So, for AMD Families 17h and higher, this patch assigns "dTLB-loads"
      to a counter for L1 dTLB misses that hit in the L2 dTLB, and
      "dTLB-load-misses" to a counter for L1 DTLB misses that caused
      L2 DTLB misses and therefore also caused page table walks.  This
      results in a much more accurate view of data TLB performance:
      
              60,961,781      L1-dcache-loads
                   4,601      dTLB-loads
                     963      dTLB-load-misses          #   20.93% of all dTLB cache hits
      
      Note that for all AMD families, data loads and stores are combined
      in a single accesses counter, so no 'L1-dcache-stores' are reported
      separately, and stores are counted with loads in 'L1-dcache-loads'.
      
      Also note that the "% of all dTLB cache hits" string is misleading
      because (a) "dTLB cache": although TLBs can be considered caches for
      page tables, in this context, it can be misinterpreted as data cache
      hits because the figures are similar (at least on Intel), and (b) not
      all those loads (technically accesses) technically "hit" at that
      hardware level.  "% of all dTLB accesses" would be more clear/accurate.
      
      Instruction TLB
      
      On Intel machines, 'iTLB-loads' measure iTLB misses that hit in the
      STLB, and 'iTLB-load-misses' measure iTLB misses that also missed in
      the STLB and completed a page table walk.
      
      For AMD Family 17h and above, for 'iTLB-loads' we replace the
      erroneous instruction cache fetches counter with PMCx084
      "L1 ITLB Miss, L2 ITLB Hit".
      
      For 'iTLB-load-misses' we still use PMCx085 "L1 ITLB Miss,
      L2 ITLB Miss", but set a 0xff umask because without it the event
      does not get counted.
      
      Branch Predictor (BPU)
      
      PMCs 0xc2 and 0xc3 continue to be valid across all AMD Families.
      
      Node Level Events
      
      Family 17h does not have a PMCx0e9 counter, and corresponding counters
      have not been made available publicly, so for now, we mark them as
      unsupported for Families 17h and above.
      
      Reference:
      
        "Open-Source Register Reference For AMD Family 17h Processors Models 00h-2Fh"
        Released 7/17/2018, Publication #56255, Revision 3.03:
        https://www.amd.com/system/files/TechDocs/56255_OSRR.pdf
      
      [ mingo: tidied up the line breaks. ]
      Signed-off-by: NKim Phillips <kim.phillips@amd.com>
      Cc: <stable@vger.kernel.org> # v4.9+
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-perf-users@vger.kernel.org
      Fixes: e40ed154 ("perf/x86: Add perf support for AMD family-17h processors")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f8497cf
    • T
      block: pass no-op callback to INIT_WORK(). · 96e4471d
      Tetsuo Handa 提交于
      [ Upstream commit 2e3c18d0ada16f145087b2687afcad1748c0827c ]
      
      syzbot is hitting flush_work() warning caused by commit 4d43d395fed12463
      ("workqueue: Try to catch flush_work() without INIT_WORK().") [1].
      Although that commit did not expect INIT_WORK(NULL) case, calling
      flush_work() without setting a valid callback should be avoided anyway.
      Fix this problem by setting a no-op callback instead of NULL.
      
      [1] https://syzkaller.appspot.com/bug?id=e390366bc48bc82a7c668326e0663be3b91cbd29Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-and-tested-by: Nsyzbot <syzbot+ba2a929dcf8e704c180e@syzkaller.appspotmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      [sl: rename blk_timeout_work]
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      96e4471d
    • A
      ARM: iop: don't use using 64-bit DMA masks · 14f3c36b
      Arnd Bergmann 提交于
      [ Upstream commit 2125801ccce19249708ca3245d48998e70569ab8 ]
      
      clang warns about statically defined DMA masks from the DMA_BIT_MASK
      macro with length 64:
      
       arch/arm/mach-iop13xx/setup.c:303:35: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
       static u64 iop13xx_adma_dmamask = DMA_BIT_MASK(64);
                                        ^~~~~~~~~~~~~~~~
       include/linux/dma-mapping.h:141:54: note: expanded from macro 'DMA_BIT_MASK'
       #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
                                                            ^ ~~~
      
      The ones in iop shouldn't really be 64 bit masks, so changing them
      to what the driver can support avoids the warning.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      14f3c36b
    • A
      ARM: orion: don't use using 64-bit DMA masks · 39839f3e
      Arnd Bergmann 提交于
      [ Upstream commit cd92d74d67c811dc22544430b9ac3029f5bd64c5 ]
      
      clang warns about statically defined DMA masks from the DMA_BIT_MASK
      macro with length 64:
      
      arch/arm/plat-orion/common.c:625:29: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
                      .coherent_dma_mask      = DMA_BIT_MASK(64),
                                                ^~~~~~~~~~~~~~~~
      include/linux/dma-mapping.h:141:54: note: expanded from macro 'DMA_BIT_MASK'
       #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
      
      The ones in orion shouldn't really be 64 bit masks, so changing them
      to what the driver can support avoids the warning.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      39839f3e
    • K
      fs: stream_open - opener for stream-like files so that read and write can run... · 04b4d5f7
      Kirill Smelkov 提交于
      fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock
      
      [ Upstream commit 10dce8af34226d90fa56746a934f8da5dcdba3df ]
      
      Commit 9c225f26 ("vfs: atomic f_pos accesses as per POSIX") added
      locking for file.f_pos access and in particular made concurrent read and
      write not possible - now both those functions take f_pos lock for the
      whole run, and so if e.g. a read is blocked waiting for data, write will
      deadlock waiting for that read to complete.
      
      This caused regression for stream-like files where previously read and
      write could run simultaneously, but after that patch could not do so
      anymore. See e.g. commit 581d21a2 ("xenbus: fix deadlock on writes
      to /proc/xen/xenbus") which fixes such regression for particular case of
      /proc/xen/xenbus.
      
      The patch that added f_pos lock in 2014 did so to guarantee POSIX thread
      safety for read/write/lseek and added the locking to file descriptors of
      all regular files. In 2014 that thread-safety problem was not new as it
      was already discussed earlier in 2006.
      
      However even though 2006'th version of Linus's patch was adding f_pos
      locking "only for files that are marked seekable with FMODE_LSEEK (thus
      avoiding the stream-like objects like pipes and sockets)", the 2014
      version - the one that actually made it into the tree as 9c225f26 -
      is doing so irregardless of whether a file is seekable or not.
      
      See
      
          https://lore.kernel.org/lkml/53022DB1.4070805@gmail.com/
          https://lwn.net/Articles/180387
          https://lwn.net/Articles/180396
      
      for historic context.
      
      The reason that it did so is, probably, that there are many files that
      are marked non-seekable, but e.g. their read implementation actually
      depends on knowing current position to correctly handle the read. Some
      examples:
      
      	kernel/power/user.c		snapshot_read
      	fs/debugfs/file.c		u32_array_read
      	fs/fuse/control.c		fuse_conn_waiting_read + ...
      	drivers/hwmon/asus_atk0110.c	atk_debugfs_ggrp_read
      	arch/s390/hypfs/inode.c		hypfs_read_iter
      	...
      
      Despite that, many nonseekable_open users implement read and write with
      pure stream semantics - they don't depend on passed ppos at all. And for
      those cases where read could wait for something inside, it creates a
      situation similar to xenbus - the write could be never made to go until
      read is done, and read is waiting for some, potentially external, event,
      for potentially unbounded time -> deadlock.
      
      Besides xenbus, there are 14 such places in the kernel that I've found
      with semantic patch (see below):
      
      	drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write()
      	drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write()
      	drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write()
      	drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write()
      	net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write()
      	drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write()
      	drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write()
      	drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write()
      	net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write()
      	drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write()
      	drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write()
      	drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write()
      	drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write()
      	drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write()
      
      In addition to the cases above another regression caused by f_pos
      locking is that now FUSE filesystems that implement open with
      FOPEN_NONSEEKABLE flag, can no longer implement bidirectional
      stream-like files - for the same reason as above e.g. read can deadlock
      write locking on file.f_pos in the kernel.
      
      FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990 ("fuse:
      implement nonseekable open") to support OSSPD. OSSPD implements /dev/dsp
      in userspace with FOPEN_NONSEEKABLE flag, with corresponding read and
      write routines not depending on current position at all, and with both
      read and write being potentially blocking operations:
      
      See
      
          https://github.com/libfuse/osspd
          https://lwn.net/Articles/308445
      
          https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406
          https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477
          https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510
      
      Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as
      "somewhat pipe-like files ..." with read handler not using offset.
      However that test implements only read without write and cannot exercise
      the deadlock scenario:
      
          https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L124-L131
          https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L146-L163
          https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L209-L216
      
      I've actually hit the read vs write deadlock for real while implementing
      my FUSE filesystem where there is /head/watch file, for which open
      creates separate bidirectional socket-like stream in between filesystem
      and its user with both read and write being later performed
      simultaneously. And there it is semantically not easy to split the
      stream into two separate read-only and write-only channels:
      
          https://lab.nexedi.com/kirr/wendelin.core/blob/f13aa600/wcfs/wcfs.go#L88-169
      
      Let's fix this regression. The plan is:
      
      1. We can't change nonseekable_open to include &~FMODE_ATOMIC_POS -
         doing so would break many in-kernel nonseekable_open users which
         actually use ppos in read/write handlers.
      
      2. Add stream_open() to kernel to open stream-like non-seekable file
         descriptors. Read and write on such file descriptors would never use
         nor change ppos. And with that property on stream-like files read and
         write will be running without taking f_pos lock - i.e. read and write
         could be running simultaneously.
      
      3. With semantic patch search and convert to stream_open all in-kernel
         nonseekable_open users for which read and write actually do not
         depend on ppos and where there is no other methods in file_operations
         which assume @offset access.
      
      4. Add FOPEN_STREAM to fs/fuse/ and open in-kernel file-descriptors via
         steam_open if that bit is present in filesystem open reply.
      
         It was tempting to change fs/fuse/ open handler to use stream_open
         instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but
         grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE,
         and in particular GVFS which actually uses offset in its read and
         write handlers
      
      	https://codesearch.debian.net/search?q=-%3Enonseekable+%3D
      	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080
      	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346
      	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481
      
         so if we would do such a change it will break a real user.
      
      5. Add stream_open and FOPEN_STREAM handling to stable kernels starting
         from v3.14+ (the kernel where 9c225f26 first appeared).
      
         This will allow to patch OSSPD and other FUSE filesystems that
         provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE
         in their open handler and this way avoid the deadlock on all kernel
         versions. This should work because fs/fuse/ ignores unknown open
         flags returned from a filesystem and so passing FOPEN_STREAM to a
         kernel that is not aware of this flag cannot hurt. In turn the kernel
         that is not aware of FOPEN_STREAM will be < v3.14 where just
         FOPEN_NONSEEKABLE is sufficient to implement streams without read vs
         write deadlock.
      
      This patch adds stream_open, converts /proc/xen/xenbus to it and adds
      semantic patch to automatically locate in-kernel places that are either
      required to be converted due to read vs write deadlock, or that are just
      safe to be converted because read and write do not use ppos and there
      are no other funky methods in file_operations.
      
      Regarding semantic patch I've verified each generated change manually -
      that it is correct to convert - and each other nonseekable_open instance
      left - that it is either not correct to convert there, or that it is not
      converted due to current stream_open.cocci limitations.
      
      The script also does not convert files that should be valid to convert,
      but that currently have .llseek = noop_llseek or generic_file_llseek for
      unknown reason despite file being opened with nonseekable_open (e.g.
      drivers/input/mousedev.c)
      
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yongzhi Pan <panyongzhi@gmail.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Nikolaus Rath <Nikolaus@rath.org>
      Cc: Han-Wen Nienhuys <hanwen@google.com>
      Signed-off-by: NKirill Smelkov <kirr@nexedi.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      04b4d5f7
    • G
      xsysace: Fix error handling in ace_setup · a82cfd77
      Guenter Roeck 提交于
      [ Upstream commit 47b16820c490149c2923e8474048f2c6e7557cab ]
      
      If xace hardware reports a bad version number, the error handling code
      in ace_setup() calls put_disk(), followed by queue cleanup. However, since
      the disk data structure has the queue pointer set, put_disk() also
      cleans and releases the queue. This results in blk_cleanup_queue()
      accessing an already released data structure, which in turn may result
      in a crash such as the following.
      
      [   10.681671] BUG: Kernel NULL pointer dereference at 0x00000040
      [   10.681826] Faulting instruction address: 0xc0431480
      [   10.682072] Oops: Kernel access of bad area, sig: 11 [#1]
      [   10.682251] BE PAGE_SIZE=4K PREEMPT Xilinx Virtex440
      [   10.682387] Modules linked in:
      [   10.682528] CPU: 0 PID: 1 Comm: swapper Tainted: G        W         5.0.0-rc6-next-20190218+ #2
      [   10.682733] NIP:  c0431480 LR: c043147c CTR: c0422ad8
      [   10.682863] REGS: cf82fbe0 TRAP: 0300   Tainted: G        W          (5.0.0-rc6-next-20190218+)
      [   10.683065] MSR:  00029000 <CE,EE,ME>  CR: 22000222  XER: 00000000
      [   10.683236] DEAR: 00000040 ESR: 00000000
      [   10.683236] GPR00: c043147c cf82fc90 cf82ccc0 00000000 00000000 00000000 00000002 00000000
      [   10.683236] GPR08: 00000000 00000000 c04310bc 00000000 22000222 00000000 c0002c54 00000000
      [   10.683236] GPR16: 00000000 00000001 c09aa39c c09021b0 c09021dc 00000007 c0a68c08 00000000
      [   10.683236] GPR24: 00000001 ced6d400 ced6dcf0 c0815d9c 00000000 00000000 00000000 cedf0800
      [   10.684331] NIP [c0431480] blk_mq_run_hw_queue+0x28/0x114
      [   10.684473] LR [c043147c] blk_mq_run_hw_queue+0x24/0x114
      [   10.684602] Call Trace:
      [   10.684671] [cf82fc90] [c043147c] blk_mq_run_hw_queue+0x24/0x114 (unreliable)
      [   10.684854] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c
      [   10.685002] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68
      [   10.685154] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c
      [   10.685306] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508
      [   10.685445] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8
      [   10.685592] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c
      [   10.685728] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464
      [   10.685877] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4
      [   10.686024] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc
      [   10.686161] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0
      [   10.686314] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234
      [   10.686457] [cf82fe50] [c052c46c] driver_register+0x88/0x15c
      [   10.686610] [cf82fe60] [c09de288] ace_init+0x4c/0xac
      [   10.686742] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330
      [   10.686888] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478
      [   10.687043] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114
      [   10.687188] [cf82ff40] [c000f2f0] ret_from_kernel_thread+0x14/0x1c
      [   10.687349] Instruction dump:
      [   10.687435] 3863ffd4 4bfffd70 9421ffd0 7c0802a6 93c10028 7c9e2378 93e1002c 38810008
      [   10.687637] 7c7f1b78 90010034 4bfffc25 813f008c <81290040> 75290100 4182002c 80810008
      [   10.688056] ---[ end trace 13c9ff51d41b9d40 ]---
      
      Fix the problem by setting the disk queue pointer to NULL before calling
      put_disk(). A more comprehensive fix might be to rearrange the code
      to check the hardware version before initializing data structures,
      but I don't know if this would have undesirable side effects, and
      it would increase the complexity of backporting the fix to older kernels.
      
      Fixes: 74489a91 ("Add support for Xilinx SystemACE CompactFlash interface")
      Acked-by: NMichal Simek <michal.simek@xilinx.com>
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      a82cfd77
    • R
      sh: fix multiple function definition build errors · 54ad0956
      Randy Dunlap 提交于
      [ Upstream commit acaf892ecbf5be7710ae05a61fd43c668f68ad95 ]
      
      Many of the sh CPU-types have their own plat_irq_setup() and
      arch_init_clk_ops() functions, so these same (empty) functions in
      arch/sh/boards/of-generic.c are not needed and cause build errors.
      
      If there is some case where these empty functions are needed, they can
      be retained by marking them as "__weak" while at the same time making
      builds that do not need them succeed.
      
      Fixes these build errors:
      
      arch/sh/boards/of-generic.o: In function `plat_irq_setup':
      (.init.text+0x134): multiple definition of `plat_irq_setup'
      arch/sh/kernel/cpu/sh2/setup-sh7619.o:(.init.text+0x30): first defined here
      arch/sh/boards/of-generic.o: In function `arch_init_clk_ops':
      (.init.text+0x118): multiple definition of `arch_init_clk_ops'
      arch/sh/kernel/cpu/sh2/clock-sh7619.o:(.init.text+0x0): first defined here
      
      Link: http://lkml.kernel.org/r/9ee4e0c5-f100-86a2-bd4d-1d3287ceab31@infradead.orgSigned-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Reported-by: Nkbuild test robot <lkp@intel.com>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      54ad0956
    • M
      hugetlbfs: fix memory leak for resv_map · b51fdcbe
      Mike Kravetz 提交于
      [ Upstream commit 58b6e5e8f1addd44583d61b0a03c0f5519527e35 ]
      
      When mknod is used to create a block special file in hugetlbfs, it will
      allocate an inode and kmalloc a 'struct resv_map' via resv_map_alloc().
      inode->i_mapping->private_data will point the newly allocated resv_map.
      However, when the device special file is opened bd_acquire() will set
      inode->i_mapping to bd_inode->i_mapping.  Thus the pointer to the
      allocated resv_map is lost and the structure is leaked.
      
      Programs to reproduce:
              mount -t hugetlbfs nodev hugetlbfs
              mknod hugetlbfs/dev b 0 0
              exec 30<> hugetlbfs/dev
              umount hugetlbfs/
      
      resv_map structures are only needed for inodes which can have associated
      page allocations.  To fix the leak, only allocate resv_map for those
      inodes which could possibly be associated with page allocations.
      
      Link: http://lkml.kernel.org/r/20190401213101.16476-1-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Reported-by: NYufen Yu <yuyufen@huawei.com>
      Suggested-by: NYufen Yu <yuyufen@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      b51fdcbe
    • C
      kmemleak: powerpc: skip scanning holes in the .bss section · 6a62bbe8
      Catalin Marinas 提交于
      [ Upstream commit 298a32b132087550d3fa80641ca58323c5dfd4d9 ]
      
      Commit 2d4f5671 ("KVM: PPC: Introduce kvm_tmp framework") adds
      kvm_tmp[] into the .bss section and then free the rest of unused spaces
      back to the page allocator.
      
      kernel_init
        kvm_guest_init
          kvm_free_tmp
            free_reserved_area
              free_unref_page
                free_unref_page_prepare
      
      With DEBUG_PAGEALLOC=y, it will unmap those pages from kernel.  As the
      result, kmemleak scan will trigger a panic when it scans the .bss
      section with unmapped pages.
      
      This patch creates dedicated kmemleak objects for the .data, .bss and
      potentially .data..ro_after_init sections to allow partial freeing via
      the kmemleak_free_part() in the powerpc kvm_free_tmp() function.
      
      Link: http://lkml.kernel.org/r/20190321171917.62049-1-catalin.marinas@arm.comSigned-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reported-by: NQian Cai <cai@lca.pw>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: NQian Cai <cai@lca.pw>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krcmar <rkrcmar@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      6a62bbe8
    • D
      KVM: SVM: prevent DBG_DECRYPT and DBG_ENCRYPT overflow · 82e8da1f
      David Rientjes 提交于
      [ Upstream commit b86bc2858b389255cd44555ce4b1e427b2b770c0 ]
      
      This ensures that the address and length provided to DBG_DECRYPT and
      DBG_ENCRYPT do not cause an overflow.
      
      At the same time, pass the actual number of pages pinned in memory to
      sev_unpin_memory() as a cleanup.
      Reported-by: NCfir Cohen <cfir@google.com>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      82e8da1f
    • V
      libcxgb: fix incorrect ppmax calculation · 57186663
      Varun Prakash 提交于
      [ Upstream commit cc5a726c79158bd307150e8d4176ec79b52001ea ]
      
      BITS_TO_LONGS() uses DIV_ROUND_UP() because of
      this ppmax value can be greater than available
      per cpu page pods.
      
      This patch removes BITS_TO_LONGS() to fix this
      issue.
      Signed-off-by: NVarun Prakash <varun@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      57186663
    • Y
      net: hns: Fix WARNING when remove HNS driver with SMMU enabled · 5c5e9f23
      Yonglong Liu 提交于
      [ Upstream commit 8601a99d7c0256b7a7fdd1ab14cf6c1f1dfcadc6 ]
      
      When enable SMMU, remove HNS driver will cause a WARNING:
      
      [  141.924177] WARNING: CPU: 36 PID: 2708 at drivers/iommu/dma-iommu.c:443 __iommu_dma_unmap+0xc0/0xc8
      [  141.954673] Modules linked in: hns_enet_drv(-)
      [  141.963615] CPU: 36 PID: 2708 Comm: rmmod Tainted: G        W         5.0.0-rc1-28723-gb729c57de95c-dirty #32
      [  141.983593] Hardware name: Huawei D05/D05, BIOS Hisilicon D05 UEFI Nemo 1.8 RC0 08/31/2017
      [  142.000244] pstate: 60000005 (nZCv daif -PAN -UAO)
      [  142.009886] pc : __iommu_dma_unmap+0xc0/0xc8
      [  142.018476] lr : __iommu_dma_unmap+0xc0/0xc8
      [  142.027066] sp : ffff000013533b90
      [  142.033728] x29: ffff000013533b90 x28: ffff8013e6983600
      [  142.044420] x27: 0000000000000000 x26: 0000000000000000
      [  142.055113] x25: 0000000056000000 x24: 0000000000000015
      [  142.065806] x23: 0000000000000028 x22: ffff8013e66eee68
      [  142.076499] x21: ffff8013db919800 x20: 0000ffffefbff000
      [  142.087192] x19: 0000000000001000 x18: 0000000000000007
      [  142.097885] x17: 000000000000000e x16: 0000000000000001
      [  142.108578] x15: 0000000000000019 x14: 363139343a70616d
      [  142.119270] x13: 6e75656761705f67 x12: 0000000000000000
      [  142.129963] x11: 00000000ffffffff x10: 0000000000000006
      [  142.140656] x9 : 1346c1aa88093500 x8 : ffff0000114de4e0
      [  142.151349] x7 : 6662666578303d72 x6 : ffff0000105ffec8
      [  142.162042] x5 : 0000000000000000 x4 : 0000000000000000
      [  142.172734] x3 : 00000000ffffffff x2 : ffff0000114de500
      [  142.183427] x1 : 0000000000000000 x0 : 0000000000000035
      [  142.194120] Call trace:
      [  142.199030]  __iommu_dma_unmap+0xc0/0xc8
      [  142.206920]  iommu_dma_unmap_page+0x20/0x28
      [  142.215335]  __iommu_unmap_page+0x40/0x60
      [  142.223399]  hnae_unmap_buffer+0x110/0x134
      [  142.231639]  hnae_free_desc+0x6c/0x10c
      [  142.239177]  hnae_fini_ring+0x14/0x34
      [  142.246540]  hnae_fini_queue+0x2c/0x40
      [  142.254080]  hnae_put_handle+0x38/0xcc
      [  142.261619]  hns_nic_dev_remove+0x54/0xfc [hns_enet_drv]
      [  142.272312]  platform_drv_remove+0x24/0x64
      [  142.280552]  device_release_driver_internal+0x17c/0x20c
      [  142.291070]  driver_detach+0x4c/0x90
      [  142.298259]  bus_remove_driver+0x5c/0xd8
      [  142.306148]  driver_unregister+0x2c/0x54
      [  142.314037]  platform_driver_unregister+0x10/0x18
      [  142.323505]  hns_nic_dev_driver_exit+0x14/0xf0c [hns_enet_drv]
      [  142.335248]  __arm64_sys_delete_module+0x214/0x25c
      [  142.344891]  el0_svc_common+0xb0/0x10c
      [  142.352430]  el0_svc_handler+0x24/0x80
      [  142.359968]  el0_svc+0x8/0x7c0
      [  142.366104] ---[ end trace 60ad1cd58e63c407 ]---
      
      The tx ring buffer map when xmit and unmap when xmit done. So in
      hnae_init_ring() did not map tx ring buffer, but in hnae_fini_ring()
      have a unmap operation for tx ring buffer, which is already unmapped
      when xmit done, than cause this WARNING.
      
      The hnae_alloc_buffers() is called in hnae_init_ring(),
      so the hnae_free_buffers() should be in hnae_fini_ring(), not in
      hnae_free_desc().
      
      In hnae_fini_ring(), adds a check is_rx_ring() as in hnae_init_ring().
      When the ring buffer is tx ring, adds a piece of code to ensure that
      the tx ring is unmap.
      Signed-off-by: NYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: NPeng Li <lipeng321@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      5c5e9f23
    • Y
      net: hns: fix ICMP6 neighbor solicitation messages discard problem · c9f43101
      Yonglong Liu 提交于
      [ Upstream commit f058e46855dcbc28edb2ed4736f38a71fd19cadb ]
      
      ICMP6 neighbor solicitation messages will be discard by the Hip06
      chips, because of not setting forwarding pool. Enable promisc mode
      has the same problem.
      
      This patch fix the wrong forwarding table configs for the multicast
      vague matching when enable promisc mode, and add forwarding pool
      for the forwarding table.
      Signed-off-by: NYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      c9f43101
    • Y
      net: hns: Fix probabilistic memory overwrite when HNS driver initialized · 1ff38d33
      Yonglong Liu 提交于
      [ Upstream commit c0b0984426814f3a9251873b689e67d34d8ccd84 ]
      
      When reboot the system again and again, may cause a memory
      overwrite.
      
      [   15.638922] systemd[1]: Reached target Swap.
      [   15.667561] tun: Universal TUN/TAP device driver, 1.6
      [   15.676756] Bridge firewalling registered
      [   17.344135] Unable to handle kernel paging request at virtual address 0000000200000040
      [   17.352179] Mem abort info:
      [   17.355007]   ESR = 0x96000004
      [   17.358105]   Exception class = DABT (current EL), IL = 32 bits
      [   17.364112]   SET = 0, FnV = 0
      [   17.367209]   EA = 0, S1PTW = 0
      [   17.370393] Data abort info:
      [   17.373315]   ISV = 0, ISS = 0x00000004
      [   17.377206]   CM = 0, WnR = 0
      [   17.380214] user pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____)
      [   17.386926] [0000000200000040] pgd=0000000000000000
      [   17.391878] Internal error: Oops: 96000004 [#1] SMP
      [   17.396824] CPU: 23 PID: 95 Comm: kworker/u130:0 Tainted: G            E     4.19.25-1.2.78.aarch64 #1
      [   17.414175] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.54 08/16/2018
      [   17.425615] Workqueue: events_unbound async_run_entry_fn
      [   17.435151] pstate: 00000005 (nzcv daif -PAN -UAO)
      [   17.444139] pc : __mutex_lock.isra.1+0x74/0x540
      [   17.453002] lr : __mutex_lock.isra.1+0x3c/0x540
      [   17.461701] sp : ffff000100d9bb60
      [   17.469146] x29: ffff000100d9bb60 x28: 0000000000000000
      [   17.478547] x27: 0000000000000000 x26: ffff802fb8945000
      [   17.488063] x25: 0000000000000000 x24: ffff802fa32081a8
      [   17.497381] x23: 0000000000000002 x22: ffff801fa2b15220
      [   17.506701] x21: ffff000009809000 x20: ffff802fa23a0888
      [   17.515980] x19: ffff801fa2b15220 x18: 0000000000000000
      [   17.525272] x17: 0000000200000000 x16: 0000000200000000
      [   17.534511] x15: 0000000000000000 x14: 0000000000000000
      [   17.543652] x13: ffff000008d95db8 x12: 000000000000000d
      [   17.552780] x11: ffff000008d95d90 x10: 0000000000000b00
      [   17.561819] x9 : ffff000100d9bb90 x8 : ffff802fb89d6560
      [   17.570829] x7 : 0000000000000004 x6 : 00000004a1801d05
      [   17.579839] x5 : 0000000000000000 x4 : 0000000000000000
      [   17.588852] x3 : ffff802fb89d5a00 x2 : 0000000000000000
      [   17.597734] x1 : 0000000200000000 x0 : 0000000200000000
      [   17.606631] Process kworker/u130:0 (pid: 95, stack limit = 0x(____ptrval____))
      [   17.617438] Call trace:
      [   17.623349]  __mutex_lock.isra.1+0x74/0x540
      [   17.630927]  __mutex_lock_slowpath+0x24/0x30
      [   17.638602]  mutex_lock+0x50/0x60
      [   17.645295]  drain_workqueue+0x34/0x198
      [   17.652623]  __sas_drain_work+0x7c/0x168
      [   17.659903]  sas_drain_work+0x60/0x68
      [   17.666947]  hisi_sas_scan_finished+0x30/0x40 [hisi_sas_main]
      [   17.676129]  do_scsi_scan_host+0x70/0xb0
      [   17.683534]  do_scan_async+0x20/0x228
      [   17.690586]  async_run_entry_fn+0x4c/0x1d0
      [   17.697997]  process_one_work+0x1b4/0x3f8
      [   17.705296]  worker_thread+0x54/0x470
      
      Every time the call trace is not the same, but the overwrite address
      is always the same:
      Unable to handle kernel paging request at virtual address 0000000200000040
      
      The root cause is, when write the reg XGMAC_MAC_TX_LF_RF_CONTROL_REG,
      didn't use the io_base offset.
      Signed-off-by: NYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      1ff38d33
    • Y
      net: hns: Use NAPI_POLL_WEIGHT for hns driver · 7713ee69
      Yonglong Liu 提交于
      [ Upstream commit acb1ce15a61154aa501891d67ebf79bc9ea26818 ]
      
      When the HNS driver loaded, always have an error print:
      "netif_napi_add() called with weight 256"
      
      This is because the kernel checks the NAPI polling weights
      requested by drivers and it prints an error message if a driver
      requests a weight bigger than 64.
      
      So use NAPI_POLL_WEIGHT to fix it.
      Signed-off-by: NYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: NPeng Li <lipeng321@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      7713ee69
    • L
      net: hns: fix KASAN: use-after-free in hns_nic_net_xmit_hw() · 7e7befd8
      Liubin Shu 提交于
      [ Upstream commit 3a39a12ad364a9acd1038ba8da67cd8430f30de4 ]
      
      This patch is trying to fix the issue due to:
      [27237.844750] BUG: KASAN: use-after-free in hns_nic_net_xmit_hw+0x708/0xa18[hns_enet_drv]
      
      After hnae_queue_xmit() in hns_nic_net_xmit_hw(), can be
      interrupted by interruptions, and than call hns_nic_tx_poll_one()
      to handle the new packets, and free the skb. So, when turn back to
      hns_nic_net_xmit_hw(), calling skb->len will cause use-after-free.
      
      This patch update tx ring statistics in hns_nic_tx_poll_one() to
      fix the bug.
      Signed-off-by: NLiubin Shu <shuliubin@huawei.com>
      Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com>
      Signed-off-by: NYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: NPeng Li <lipeng321@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      7e7befd8
    • W
      arm64: fix wrong check of on_sdei_stack in nmi context · 98d6651f
      Wei Li 提交于
      [ Upstream commit 1c41860864c8ae0387ef7d44f0000e99cbb2e06d ]
      
      When doing unwind_frame() in the context of pseudo nmi (need enable
      CONFIG_ARM64_PSEUDO_NMI), reaching the bottom of the stack (fp == 0,
      pc != 0), function on_sdei_stack() will return true while the sdei acpi
      table is not inited in fact. This will cause a "NULL pointer dereference"
      oops when going on.
      Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
      Signed-off-by: NWei Li <liwei391@huawei.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      98d6651f