1. 14 Nov, 2018 40 commits
    • crypto: lrw - Fix out-of bounds access on counter overflow · c2ff3949
      Ondrej Mosnacek committed
      commit fbe1a850b3b1522e9fc22319ccbbcd2ab05328d2 upstream.
      
      When the LRW block counter overflows, the current implementation returns
      128 as the index to the precomputed multiplication table, which has 128
      entries. This patch fixes it to return the correct value (127).
      
      Fixes: 64470f1b ("[CRYPTO] lrw: Liskov Rivest Wagner, a tweakable narrow block cipher mode")
      Cc: <stable@vger.kernel.org> # 2.6.20+
      Reported-by: Eric Biggers <ebiggers@kernel.org>
      Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
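
The index computation described in this commit can be sketched in userspace as follows. This is a minimal illustration, not the kernel code: the four-word counter layout (least significant word first) and the function name `lowest_zero_bit_index` are assumptions made for the example. The only point it carries over from the commit message is that an all-ones counter must map to 127, the last entry of the 128-entry table, rather than 128.

```c
#include <stdint.h>

/*
 * Hypothetical 128-bit counter: w[0] holds the least significant 32 bits.
 * Returns the position of the lowest zero bit, used as the index into a
 * 128-entry table. When every bit is set there is no zero bit, so the
 * index is clamped to 127 (the last valid entry) instead of 128.
 */
static int lowest_zero_bit_index(const uint32_t w[4])
{
    for (int word = 0; word < 4; word++) {
        uint32_t v = w[word];

        if (v == UINT32_MAX)
            continue;               /* no zero bit in this word */

        int bit = 0;
        while (v & 1u) {            /* skip the trailing one bits */
            v >>= 1;
            bit++;
        }
        return word * 32 + bit;
    }
    return 127;                     /* counter overflow: stay in bounds */
}
```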
    • signal: Guard against negative signal numbers in copy_siginfo_from_user32 · 04eb7194
      Eric W. Biederman committed
      commit a36700589b85443e28170be59fa11c8a104130a5 upstream.
      
      While fixing an out of bounds array access in known_siginfo_layout
      reported by the kernel test robot it became apparent that the same bug
      exists in siginfo_layout and affects copy_siginfo_from_user32.
      
      The straightforward fix, which guards against making this mistake
      in the future and should keep the code size small, is to just take an
      unsigned signal number instead of a signed signal number, as I did to
      fix known_siginfo_layout.
      
      Cc: stable@vger.kernel.org
      Fixes: cc731525 ("signal: Remove kernel interal si_code magic")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
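
Why an unsigned parameter is enough can be seen in a small userspace sketch. The table contents, `LAYOUT_DEFAULT`, and the function name `lookup_layout` are hypothetical and chosen only for illustration; the technique shown is the one the message describes, namely that a negative value converted to unsigned becomes very large and is rejected by the ordinary upper-bound check.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical per-signal table; the values are placeholders. */
static const int layout_table[] = { 0, 10, 20, 30 };

#define LAYOUT_DEFAULT (-1)

/*
 * Taking the signal number as unsigned means a negative value passed by
 * the caller converts to a very large number, so the single upper-bound
 * check rejects it along with every other out-of-range value.
 */
static int lookup_layout(unsigned int sig)
{
    if (sig < ARRAY_SIZE(layout_table))
        return layout_table[sig];
    return LAYOUT_DEFAULT;
}
```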
    • signal/GenWQE: Fix sending of SIGKILL · 392d51e0
      Eric W. Biederman committed
      commit 0ab93e9c99f8208c0a1a7b7170c827936268c996 upstream.
      
      The genwqe_add_file and genwqe_del_file functions, by caching current
      without using reference counting, embed the assumption that a file
      descriptor will never be passed from one process to another.  They even
      embed the assumption that the thread that opened the file will be in
      existence when the process terminates.  Neither of these is
      guaranteed to be true.
      
      Therefore replace caching the task_struct of the opener with the
      pid of the opener's thread group id.  All the knowledge of the
      opener is used for is as the target of SIGKILL, and a SIGKILL
      will kill the entire process group.
      
      Rename genwqe_force_sig to genwqe_terminate, remove its unnecessary
      signal argument, update its only caller, and use kill_pid
      instead of force_sig.
      
      The work force_sig does in changing signal handling state is not
      relevant to SIGKILL sent as SEND_SIG_PRIV.  The exact same processes
      will be killed just with less work, and less confusion.  The work done
      by force_sig is really only needed for handling synchronous
      exceptions.
      
      It will still be possible to cause genwqe_device_remove to wait
      8 seconds by passing a file descriptor to another process but
      the possible use after free is fixed.
      
      Fixes: eaf4722d ("GenWQE Character device and DDCB queue")
      Cc: stable@vger.kernel.org
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Frank Haverkamp <haver@linux.vnet.ibm.com>
      Cc: Joerg-Stephan Vogt <jsvogt@de.ibm.com>
      Cc: Michael Jung <mijung@gmx.net>
      Cc: Michael Ruettger <michael@ibmra.de>
      Cc: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Eberhard S. Amann <esa@linux.vnet.ibm.com>
      Cc: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Cc: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PCI: Add Device IDs for Intel GPU "spurious interrupt" quirk · 33d19d93
      Bin Meng committed
      commit d0c9606b31a21028fb5b753c8ad79626292accfd upstream.
      
      Add Device IDs to the Intel GPU "spurious interrupt" quirk table.
      
      For these devices, unplugging the VGA cable and plugging it in again causes
      spurious interrupts from the IGD.  Linux eventually disables the interrupt,
      but of course that disables any other devices sharing the interrupt.
      
      The theory is that this is a VGA BIOS defect: it should have disabled the
      IGD interrupt but failed to do so.
      
      See f67fd55f ("PCI: Add quirk for still enabled interrupts on Intel
      Sandy Bridge GPUs") and 7c82126a ("PCI: Add new ID for Intel GPU
      "spurious interrupt" quirk") for some history.
      
      [bhelgaas: See link below for discussion about how to fix this more
      generically instead of adding device IDs for every new Intel GPU.  I hope
      this is the last patch to add device IDs.]
      
      Link: https://lore.kernel.org/linux-pci/1537974841-29928-1-git-send-email-bmeng.cn@gmail.com
      Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
      [bhelgaas: changelog]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: stable@vger.kernel.org	# v3.4+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PCI/ASPM: Fix link_state teardown on device removal · 1e37e70d
      Lukas Wunner committed
      commit aeae4f3e5c38d47bdaef50446dc0ec857307df68 upstream.
      
      Upon removal of the last device on a bus, the link_state of the bridge
      leading to that bus is sought to be torn down by having pci_stop_dev()
      call pcie_aspm_exit_link_state().
      
      When ASPM was originally introduced by commit 7d715a6c ("PCI: add
      PCI Express ASPM support"), it determined whether the device being
      removed is the last one by calling list_empty() on the bridge's
      subordinate devices list.  That didn't work because the device is only
      removed from the list slightly later in pci_destroy_dev().
      
      Commit 3419c75e ("PCI: properly clean up ASPM link state on device
      remove") attempted to fix it by calling list_is_last(), but that's not
      correct either because it checks whether the device is at the *end* of
      the list, not whether it's the last one *left* in the list.  If the user
      removes the device which happens to be at the end of the list via sysfs
      but other devices are preceding the device in the list, the link_state
      is torn down prematurely.
      
      The real fix is to move the invocation of pcie_aspm_exit_link_state() to
      pci_destroy_dev() and reinstate the call to list_empty().  Remove a
      duplicate check for dev->bus->self because pcie_aspm_exit_link_state()
      already contains an identical check.
      
      Fixes: 7d715a6c ("PCI: add PCI Express ASPM support")
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Shaohua Li <shaohua.li@intel.com>
      Cc: stable@vger.kernel.org # v2.6.26
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
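
The difference between "at the end of the list" and "the last one left" can be made concrete with a minimal circular list. This is a userspace sketch modeled loosely on the kernel's list_head helpers; `struct list_node` and `list_init` are names chosen for the example, and the helper bodies are simplified re-implementations rather than the kernel sources.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list with a sentinel head. */
struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

static void list_add_tail(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void list_del(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;
}

/* True when n sits at the end of the list, whatever precedes it. */
static bool list_is_last(const struct list_node *n,
                         const struct list_node *head)
{
    return n->next == head;
}

/* True only when no entries remain on the list. */
static bool list_empty(const struct list_node *head)
{
    return head->next == head;
}
```

With two devices on the list, `list_is_last()` is true for the second one even though the first is still present, which is the premature-teardown case. `list_empty()` only becomes true after both have been unlinked, which is why the check has to run after the device has been removed from the list.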
    • ARM: dts: dra7: Fix up unaligned access setting for PCIe EP · 6be746c4
      Vignesh R committed
      commit 6d0af44a82be87c13f2320821e9fbb8b8cf5a56f upstream.
      
      Bit positions of PCIE_SS1_AXI2OCP_LEGACY_MODE_ENABLE and
      PCIE_SS1_AXI2OCP_LEGACY_MODE_ENABLE in CTRL_CORE_SMA_SW_7 are
      incorrectly documented in the TRM. In fact, the bit positions are
      swapped. Update the DT bindings for PCIe EP to reflect the same.
      
      Fixes: d23f3839 ("ARM: dts: DRA7: Add pcie1 dt node for EP mode")
      Cc: stable@vger.kernel.org
      Signed-off-by: Vignesh R <vigneshr@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • EDAC, skx_edac: Fix logical channel intermediate decoding · 51819131
      Qiuxu Zhuo committed
      commit 8f18973877204dc8ca4ce1004a5d28683b9a7086 upstream.
      
      The code "lchan = (lchan << 1) | ~lchan" for logical channel
      intermediate decoding is wrong. The wrong intermediate decoding
      result is {0xffffffff, 0xfffffffe}.
      
      Fix it by replacing '~' with '!'. The correct intermediate
      decoding result is {0x1, 0x2}.
      Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      CC: Aristeu Rozanski <aris@redhat.com>
      CC: Mauro Carvalho Chehab <mchehab@kernel.org>
      CC: linux-edac <linux-edac@vger.kernel.org>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181009172025.18594-1-tony.luck@intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
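
The two intermediate results quoted in this message can be reproduced directly. The sketch below assumes a 32-bit unsigned `lchan` that is either 0 or 1 on entry; the function names are made up for the comparison and do not appear in the driver.

```c
#include <stdint.h>

/* Buggy form: bitwise NOT sets every bit that was clear in lchan. */
static uint32_t decode_lchan_bitwise_not(uint32_t lchan)
{
    return (lchan << 1) | ~lchan;
}

/* Fixed form: logical NOT yields exactly 0 or 1. */
static uint32_t decode_lchan_logical_not(uint32_t lchan)
{
    return (lchan << 1) | !lchan;
}
```

For inputs 0 and 1 the first function returns 0xffffffff and 0xfffffffe, while the second returns 0x1 and 0x2, matching the values given in the commit message.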
    • EDAC, {i7core,sb,skx}_edac: Fix uncorrected error counting · 5d1267c6
      Tony Luck committed
      commit 432de7fd7630c84ad24f1c2acd1e3bb4ce3741ca upstream.
      
      The count of errors is picked up from bits 52:38 of the machine check
      bank status register. But this is the count of *corrected* errors. If an
      uncorrected error is being logged, the h/w sets this field to 0. Which
      means that when edac_mc_handle_error() is called, the EDAC core will
      carefully add zero to the appropriate uncorrected error counts.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      [ Massage commit message. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: stable@vger.kernel.org
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20180928213934.19890-1-tony.luck@intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
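
A rough sketch of the field extraction follows. The message above only states the problem (bits 52:38 hold the corrected error count and read as 0 for an uncorrected error) and does not spell out the fix, so the handling in `error_count` is an assumption: it reports a count of 1 whenever the uncorrected flag is set. The flag position (bit 61) and the helper names are likewise assumptions made for the example.

```c
#include <stdint.h>

/* Extract bits hi:lo (inclusive) from a 64-bit register value. */
static uint64_t get_bitfield(uint64_t v, unsigned int lo, unsigned int hi)
{
    unsigned int width = hi - lo + 1;
    uint64_t mask = (width >= 64) ? ~0ULL : ((1ULL << width) - 1);

    return (v >> lo) & mask;
}

/* Assumption: bit 61 of the status value flags an uncorrected error. */
#define STATUS_UC (1ULL << 61)

static unsigned int error_count(uint64_t status)
{
    /* Bits 52:38 carry the count of corrected errors. */
    unsigned int count = (unsigned int)get_bitfield(status, 38, 52);

    /*
     * Assumed policy: the field reads 0 for an uncorrected error, so
     * report one error instead of adding zero to the counters.
     */
    if (status & STATUS_UC)
        count = 1;
    return count;
}
```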
    • EDAC, amd64: Add Family 17h, models 10h-2fh support · 468d9f01
      Michael Jin committed
      commit 8960de4a5ca7980ed1e19e7ca5a774d3b7a55c38 upstream.
      
      Add new device IDs for family 17h, models 10h-2fh.
      
      This is required by amd64_edac_mod in order to properly detect PCI
      device functions 0 and 6.
      Signed-off-by: Michael Jin <mikhail.jin@gmail.com>
      Reviewed-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20180816192840.31166-1-mikhail.jin@gmail.com
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • HID: hiddev: fix potential Spectre v1 · b599ba13
      Breno Leitao committed
      commit f11274396a538b31bc010f782e05c2ce3f804c13 upstream.
      
      uref->usage_index can be indirectly controlled by userspace, hence leading
      to a potential exploitation of the Spectre variant 1 vulnerability.
      
      This field is used as an array index by the hiddev_ioctl_usage() function,
      when 'cmd' is either HIDIOCGCOLLECTIONINDEX, HIDIOCGUSAGES or
      HIDIOCSUSAGES.
      
      For cmd == HIDIOCGCOLLECTIONINDEX case, uref->usage_index is compared to
      field->maxusage and then used as an index to dereference field->usage
      array. The same thing happens to the cmd == HIDIOC{G,S}USAGES cases, where
      uref->usage_index is checked against an array maximum value and then it is
      used as an index in an array.
      
      This is a summary of the HIDIOCGCOLLECTIONINDEX case, which matches the
      traditional Spectre V1 first load:
      
      	copy_from_user(uref, user_arg, sizeof(*uref))
      	if (uref->usage_index >= field->maxusage)
      		goto inval;
      	i = field->usage[uref->usage_index].collection_index;
      	return i;
      
      This patch fixes this by sanitizing field uref->usage_index before using it
      to index field->usage (HIDIOCGCOLLECTIONINDEX) or field->value in
      HIDIOC{G,S}USAGES arrays, thus, avoiding speculation in the first load.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Breno Leitao <leitao@debian.org>
      v2: Contemplate cmd == HIDIOC{G,S}USAGES case
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
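
The sanitizing step mentioned above is usually done in the kernel with array_index_nospec(). The following is a userspace imitation of the masking idea behind it, not the kernel macro itself: the function names are invented for the example, both arguments are assumed to be at most LONG_MAX, and the code relies on arithmetic right shift of negative signed values, which is implementation-defined in C but is what GCC and Clang provide.

```c
#include <limits.h>

/*
 * Returns all ones when index < size and zero otherwise, computed
 * without a conditional branch. Both values must be <= LONG_MAX.
 */
static unsigned long index_mask_nospec(unsigned long index,
                                       unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

/*
 * Clamps an out-of-range index to 0 so that a mispredicted bounds
 * check cannot feed an attacker-chosen index into the array load.
 */
static unsigned long clamp_index_nospec(unsigned long index,
                                        unsigned long size)
{
    return index & index_mask_nospec(index, size);
}
```

The clamp is applied after the ordinary bounds check and before the array access, so in-range indices pass through unchanged and out-of-range ones collapse to a harmless value even under speculation.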
    • HID: wacom: Work around HID descriptor bug in DTK-2451 and DTH-2452 · 19785f4c
      Jason Gerecke committed
      commit 11db8173dbab7a94cf5ba5225fcedbfc0f3b7e54 upstream.
      
      The DTK-2451 and DTH-2452 have a buggy HID descriptor which incorrectly
      contains a Cintiq-like report, complete with pen tilt, rotation, twist, serial
      number, etc. The hardware doesn't actually support this data but our driver
      dutifully sets up the device as though it does. To ensure userspace has a
      correct view of devices without updated firmware, we clean up this incorrect
      data in wacom_setup_device_quirks.
      
      We're also careful to clear the WACOM_QUIRK_TOOLSERIAL flag since its presence
      causes the driver to wait for serial number information (via
      wacom_wac_pen_serial_enforce) that never comes, resulting in
      the pen being non-responsive.
      Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
      Fixes: 83417206 ("HID: wacom: Queue events with missing type/serial data for later processing")
      Cc: stable@vger.kernel.org # v4.16+
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selinux: fix mounting of cgroup2 under older policies · f77c8467
      Stephen Smalley committed
      commit 7bb185edb0306bb90029a5fa6b9cff900ffdbf4b upstream.
      
      commit 901ef845 ("selinux: allow per-file labeling for cgroupfs")
      broke mounting of cgroup2 under older SELinux policies which lacked
      a genfscon rule for cgroup2.  This prevents mounting of cgroup2 even
      when SELinux is permissive.
      
      Change the handling when there is no genfscon rule in policy to
      just mark the inode unlabeled and not return an error to the caller.
      This permits mounting and access if allowed by policy, e.g. to
      unconfined domains.
      
      I also considered changing the behavior of security_genfs_sid() to
      never return -ENOENT, but the current behavior is relied upon by
      other callers to perform caller-specific handling.
      
      Fixes: 901ef845 ("selinux: allow per-file labeling for cgroupfs")
      CC: <stable@vger.kernel.org>
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Reported-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Tested-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: fix use-after-free race in ext4_remount()'s error path · 15f255ec
      Theodore Ts'o committed
      commit 33458eaba4dfe778a426df6a19b7aad2ff9f7eec upstream.
      
      It's possible for ext4_show_quota_options() to try reading
      s_qf_names[i] while it is being modified by ext4_remount() --- most
      notably, in ext4_remount's error path when the original values of the
      quota file names get restored.
      
      Reported-by: syzbot+a2872d6feea6918008a9@syzkaller.appspotmail.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org # 3.2+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: propagate error from dquot_initialize() in EXT4_IOC_FSSETXATTR · ce1daaa8
      Wang Shilong committed
      commit 182a79e0 upstream.
      
      We return most failures of dquot_initialize() except during
      inode eviction. This makes some sense: for example,
      we allow file removal even if the quota files are broken.
      
      But it doesn't make sense to allow setting the project
      if the quota files etc. are broken.
      Signed-off-by: Wang Shilong <wshilong@ddn.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: fix setattr project check in fssetxattr ioctl · 0d0413e9
      Wang Shilong committed
      commit dc7ac6c4 upstream.
      
      Currently, project quota can be changed by the fssetxattr
      ioctl, and the existing permission check inode_owner_or_capable()
      is obviously not enough: ordinary users could
      change the project id of a file, which would let them
      break project quota easily.
      
      This patch tries to follow the same rule as xfs project
      quota:
      
      "Project Quota ID state is only allowed to change from
      within the init namespace. Enforce that restriction only
      if we are trying to change the quota ID state.
      Everything else is allowed in user namespaces."
      
      Besides that, checking and setting the project id's state should
      be an atomic operation, so protect the whole operation with the
      inode lock. ext4_ioctl_setproject() is only used for the
      EXT4_IOC_FSSETXATTR ioctl; we already hold mnt_want_write_file()
      before ext4_ioctl_setflags(), and ext4_ioctl_setproject()
      is called after ext4_ioctl_setflags(), so we can share the
      code and remove it from inside ext4_ioctl_setproject().
      Signed-off-by: Wang Shilong <wshilong@ddn.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: initialize retries variable in ext4_da_write_inline_data_begin() · 99a3b224
      Lukas Czerner committed
      commit 625ef8a3 upstream.
      
      The variable retries is not initialized in ext4_da_write_inline_data_begin(),
      which can lead to a nondeterministic number of retries in case we hit
      ENOSPC. Initialize retries to zero as we do everywhere else.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Fixes: bc0ca9df ("ext4: retry allocation when inline->extent conversion failed")
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
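
The retry pattern behind this fix can be sketched in userspace as below. Everything here is hypothetical scaffolding: `MAX_RETRIES`, `should_retry`, `always_enospc`, and `write_begin` are names invented for the example, and the fixed retry budget is an assumption. The point is only that the counter must start at zero for the number of attempts to be predictable.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_RETRIES 3   /* hypothetical retry budget */

/* Hypothetical policy helper: allow another attempt until the budget
 * is spent. */
static bool should_retry(int *retries)
{
    return (*retries)++ < MAX_RETRIES;
}

/* Hypothetical allocator that always reports ENOSPC and counts calls. */
static int always_enospc(int *calls)
{
    (*calls)++;
    return -ENOSPC;
}

static int write_begin(int *calls)
{
    int retries = 0;    /* starts at zero so the retry count is fixed */
    int ret;

retry:
    ret = always_enospc(calls);
    if (ret == -ENOSPC && should_retry(&retries))
        goto retry;
    return ret;
}
```

With the counter initialized, the allocator is tried once plus `MAX_RETRIES` more times before the error is returned. Left uninitialized, the starting value would be whatever happened to be on the stack, so the loop could stop immediately or run far longer than intended.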
    • ext4: fix EXT4_IOC_SWAP_BOOT · b2af09dd
      Theodore Ts'o committed
      commit 18aded17492088962ef43f00825179598b3e8c58 upstream.
      
      The code for the EXT4_IOC_SWAP_BOOT ioctl hasn't been updated in a while,
      and it's a bit broken with respect to more modern ext4 kernels, especially
      metadata checksums.
      
      Other problems fixed with this commit:
      
      * Don't allow installing a DAX, swap file, or an encrypted file as a
        boot loader.
      
      * Respect the immutable and append-only flags.
      
      * Wait until any DIO operations are finished *before* calling
        truncate_inode_pages().
      
      * Don't swap inode->i_flags, since these flags have nothing to do with
        the inode blocks --- and it will give the IMA/audit code heartburn
        when the inode is evicted.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Reported-by: syzbot+e81ccd4744c6c4f71354@syzkaller.appspotmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • gfs2_meta: ->mount() can get NULL dev_name · 8c448126
      Al Viro committed
      commit 3df629d873f8683af6f0d34dfc743f637966d483 upstream.
      
      get in sync with mount_bdev() handling of the same
      
      Reported-by: syzbot+c54f8e94e6bba03b04e9@syzkaller.appspotmail.com
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • jbd2: fix use after free in jbd2_log_do_checkpoint() · 25881163
      Jan Kara committed
      commit ccd3c437 upstream.
      
      The code cleaning transaction's lists of checkpoint buffers has a bug
      where it increases bh refcount only after releasing
      journal->j_list_lock. Thus the following race is possible:
      
      CPU0					CPU1
      jbd2_log_do_checkpoint()
      					jbd2_journal_try_to_free_buffers()
      					  __journal_try_to_free_buffer(bh)
        ...
        while (transaction->t_checkpoint_io_list)
        ...
          if (buffer_locked(bh)) {
      
      <-- IO completes now, buffer gets unlocked -->
      
            spin_unlock(&journal->j_list_lock);
      					    spin_lock(&journal->j_list_lock);
      					    __jbd2_journal_remove_checkpoint(jh);
      					    spin_unlock(&journal->j_list_lock);
      					  try_to_free_buffers(page);
            get_bh(bh) <-- accesses freed bh
      
      Fix the problem by grabbing bh reference before unlocking
      journal->j_list_lock.
      
      Fixes: dc6e8d66 ("jbd2: don't call get_bh() before calling __jbd2_journal_remove_checkpoint()")
      Fixes: be1158cc ("jbd2: fold __process_buffer() into jbd2_log_do_checkpoint()")
      Reported-by: syzbot+7f4a27091759e2fe7453@syzkaller.appspotmail.com
      CC: stable@vger.kernel.org
      Reviewed-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
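
The ordering that the fix establishes (take the reference first, unlock second) can be illustrated with a single-threaded simulation. This is only a model of the race in the diagram above: `struct buf`, its `freed` flag, and all function names are hypothetical, the lock exists only as comments, and the "other CPU" is a plain function call placed where the lock would be dropped.

```c
#include <stdbool.h>

/* Hypothetical refcounted buffer; 'freed' stands in for the memory
 * actually being released. */
struct buf {
    int refcount;
    bool freed;
};

static void get_buf(struct buf *b)
{
    b->refcount++;
}

static void put_buf(struct buf *b)
{
    if (--b->refcount == 0)
        b->freed = true;
}

/* Stands in for the other CPU unlinking the buffer and dropping the
 * only reference it knows about. */
static void concurrent_release(struct buf *b)
{
    put_buf(b);
}

/*
 * Fixed ordering: take our own reference while the list lock is still
 * (conceptually) held, and only then unlock. Returns true if the
 * buffer was still alive when we came back to it.
 */
static bool wait_on_buffer_fixed(struct buf *b)
{
    /* list lock held here */
    get_buf(b);
    /* list lock released here; the other CPU may now run */
    concurrent_release(b);

    bool alive = !b->freed;

    put_buf(b);     /* drop our reference; this may free the buffer */
    return alive;
}
```

Because the extra reference is taken before the release can run, the buffer survives until the final `put_buf()`. Moving `get_buf()` after `concurrent_release()` in this model would leave the reference count at zero at the point of access, which corresponds to the freed `bh` in the diagram.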
    • IB/rxe: Revise the ib_wr_opcode enum · 3b0b2820
      Jason Gunthorpe committed
      commit 9a59739b upstream.
      
      This enum has become part of the uABI, as both RXE and the
      ib_uverbs_post_send() command expect userspace to supply values from this
      enum. So it should be properly placed in include/uapi/rdma.
      
      In userspace this enum is called 'enum ibv_wr_opcode' as part of
      libibverbs.h. That enum defines different values for IB_WR_LOCAL_INV,
      IB_WR_SEND_WITH_INV, and IB_WR_LSO. These were introduced (incorrectly, it
      turns out) into libibverbs in 2015.
      
      The kernel has changed its mind on the numbering for several of the IB_WC
      values over the years, but has remained stable on IB_WR_LOCAL_INV and
      below.
      
      Based on this we can conclude that there is no real user space user of the
      values beyond IB_WR_ATOMIC_FETCH_AND_ADD, as they have never worked via
      rdma-core. This is confirmed by inspection, only rxe uses the kernel enum
      and implements the latter operations. rxe has clearly never worked with
      these attributes from userspace. Other drivers that support these opcodes
      implement the functionality without calling out to the kernel.
      
      To make IB_WR_SEND_WITH_INV and related work for RXE in userspace we
      choose to renumber the IB_WR enum in the kernel to match the uABI that
      userspace has been using since before Soft RoCE was merged. This is an
      overall simpler configuration for the whole software stack, and obviously
      can't break anything existing.
      Reported-by: Seth Howell <seth.howell@intel.com>
      Tested-by: Seth Howell <seth.howell@intel.com>
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IB/mlx5: Fix MR cache initialization · b8a7fdb1
      Artemy Kovalyov committed
      commit 013c2403bf32e48119aeb13126929f81352cc7ac upstream.
      
      Schedule MR cache work only after bucket was initialized.
      
      Cc: <stable@vger.kernel.org> # 4.10
      Fixes: 49780d42 ("IB/mlx5: Expose MR cache for mlx5_ib")
      Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
      Reviewed-by: Majd Dibbiny <majd@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: sta32x: set ->component pointer in private struct · 8d564c8c
      Daniel Mack committed
      commit 747df19747bc9752cd40b9cce761e17a033aa5c2 upstream.
      
      The ESD watchdog code in sta32x_watchdog() dereferences the pointer
      which is never assigned.
      
      This is a regression from a1be4cea ("ASoC: sta32x: Convert to direct
      regmap API usage.") which went unnoticed since nobody seems to use that ESD
      workaround.
      
      Fixes: a1be4cea ("ASoC: sta32x: Convert to direct regmap API usage.")
      Signed-off-by: Daniel Mack <daniel@zonque.org>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: intel: skylake: Add missing break in skl_tplg_get_token() · 77a65118
      Takashi Iwai committed
      commit 9c80c5a8831471e0a3e139aad1b0d4c0fdc50b2f upstream.
      
      skl_tplg_get_token() misses a break in the big switch() block for
      SKL_TKN_U8_CORE_ID entry.
      Spotted nicely by -Wimplicit-fallthrough compiler option.
      
      Fixes: 6277e832 ("ASoC: Intel: Skylake: Parse vendor tokens to build module data")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
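
The class of bug fixed here is easy to reproduce in isolation. The sketch below uses made-up token names and a made-up `struct module_cfg`; it is not the driver's switch block, only a small instance of a case that needs its own `break`.

```c
/* Hypothetical token names, for illustration only. */
enum token { TKN_CORE_ID, TKN_MOD_TYPE };

struct module_cfg {
    int core_id;
    int mod_type;
};

static void apply_token(struct module_cfg *cfg, enum token tok, int value)
{
    switch (tok) {
    case TKN_CORE_ID:
        cfg->core_id = value;
        break;  /* without this, control falls into the next case */
    case TKN_MOD_TYPE:
        cfg->mod_type = value;
        break;
    }
}
```

If the first `break` is dropped, setting the core id also overwrites `mod_type` with the same value. Compiling with -Wimplicit-fallthrough, as the message notes, flags that situation.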
    • libnvdimm, pmem: Fix badblocks population for 'raw' namespaces · 6c1400b3
      Dan Williams committed
      commit 91ed7ac444ef749603a95629a5ec483988c4f14b upstream.
      
      The driver is only initializing bb_res in the devm_memremap_pages()
      paths, but the raw namespace case is passing an uninitialized bb_res to
      nvdimm_badblocks_populate().
      
      Fixes: e8d51348 ("memremap: change devm_memremap_pages interface...")
      Cc: <stable@vger.kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Reported-by: Jacek Zloch <jacek.zloch@intel.com>
      Reported-by: Krzysztof Rusocki <krzysztof.rusocki@intel.com>
      Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • libnvdimm, region: Fail badblocks listing for inactive regions · 8f696986
      Dan Williams committed
      commit 5d394eee2c102453278d81d9a7cf94c80253486a upstream.
      
      While experimenting with region driver loading the following backtrace
      was triggered:
      
       INFO: trying to register non-static key.
       the code is fine but needs lockdep annotation.
       turning off the locking correctness validator.
       [..]
       Call Trace:
        dump_stack+0x85/0xcb
        register_lock_class+0x571/0x580
        ? __lock_acquire+0x2ba/0x1310
        ? kernfs_seq_start+0x2a/0x80
        __lock_acquire+0xd4/0x1310
        ? dev_attr_show+0x1c/0x50
        ? __lock_acquire+0x2ba/0x1310
        ? kernfs_seq_start+0x2a/0x80
        ? lock_acquire+0x9e/0x1a0
        lock_acquire+0x9e/0x1a0
        ? dev_attr_show+0x1c/0x50
        badblocks_show+0x70/0x190
        ? dev_attr_show+0x1c/0x50
        dev_attr_show+0x1c/0x50
      
      This results from a missing successful call to devm_init_badblocks()
      from nd_region_probe(). Block attempts to show badblocks while the
      region is not enabled.
      
      Fixes: 6a6bef90 ("libnvdimm: add mechanism to publish badblocks...")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • libnvdimm: Hold reference on parent while scheduling async init · 4f1a55a4
      Alexander Duyck committed
      commit b6eae0f61db27748606cc00dafcfd1e2c032f0a5 upstream.
      
      Unlike asynchronous initialization in the core we have not yet associated
      the device with the parent, and as such the device doesn't hold a reference
      to the parent.
      
      In order to resolve that we should be holding a reference on the parent
      until the asynchronous initialization has completed.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 4d88a97a ("libnvdimm: ...base ... infrastructure")
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: target: Fix target_wait_for_sess_cmds breakage with active signals · 7a338b87
      Nicholas Bellinger committed
      commit 38fe73cc2c96fbc9942b07220f2a4f1bab37392d upstream.
      
      With the addition of commit 00d909a1 ("scsi: target: Make the session
      shutdown code also wait for commands that are being aborted") in v4.19-rc, it
      incorrectly assumes no signals will be pending for task_struct executing the
      normal session shutdown and I/O quiesce code-path.
      
      For example, iscsi-target and iser-target issue SIGINT to all kthreads as part
      of session shutdown.  This has been the behaviour since day one.
      
      As-is when signals are pending with se_cmds active in se_sess->sess_cmd_list,
      wait_event_interruptible_lock_irq_timeout() returns a negative number and
      immediately kills the machine because of the do while (ret <= 0) loop that was
      added in commit 00d909a1 to spin while backend I/O is taking any amount of
      extended time (say 30 seconds) to complete.
      
      Here's what it looks like in action with debug plus delayed backend I/O
      completion:
      
      [ 4951.909951] se_sess: 000000003e7e08fa before target_wait_for_sess_cmds
      [ 4951.914600] target_wait_for_sess_cmds: signal_pending: 1
      [ 4951.918015] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 0
      [ 4951.921639] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 1
      [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 2
      [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 3
      [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 4
      [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 5
      [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 6
      [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 7
      [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 8
      [ 4951.921944] wait_event_interruptible_lock_irq_timeout ret: -512 signal_pending: 1 loop count: 9
      
      ... followed by the usual RCU CPU stalls and deadlock.
      
      There was never a case pre commit 00d909a1 where
      wait_for_complete(&se_cmd->cmd_wait_comp) was able to be interrupted, so to
      address this for v4.19+ moving forward go ahead and use
      wait_event_lock_irq_timeout() instead so new code works with all fabric
      drivers.
      
      Also for commit 00d909a1, fix a minor regression in
      target_release_cmd_kref() to wake_up the new se_sess->cmd_list_wq only
      when shutdown has actually been triggered via se_sess->sess_tearing_down.
      
      Fixes: 00d909a1 ("scsi: target: Make the session shutdown code also wait for commands that are being aborted")
      Cc: <stable@vger.kernel.org> # v4.19+
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
      Tested-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Reviewed-by: Bryant G. Ly <bly@catalogicsoftware.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7a338b87
    • N
      scsi: sched/wait: Add wait_event_lock_irq_timeout for TASK_UNINTERRUPTIBLE usage · fd594155
      Nicholas Bellinger committed
      commit 25ab0bc334b43bbbe4eabc255006ce42a9424da2 upstream.
      
      Short of reverting commit 00d909a1 ("scsi: target: Make the session
      shutdown code also wait for commands that are being aborted") for v4.19,
      target-core needs a wait_event_*() macro that can be executed using
      TASK_UNINTERRUPTIBLE to function correctly with existing fabric drivers
      that expect to run with signals pending during session shutdown and active
      se_cmd I/O quiesce.
      
      The most notable are iscsi-target and iser-target, while ibmvscsi_tgt
      invokes session shutdown logic from userspace via a configfs attribute and
      could therefore also have signals pending.
      
      So go ahead and introduce wait_event_lock_irq_timeout() to achieve this, and
      update + rename __wait_event_lock_irq_timeout() to make it accept 'state' as a
      parameter.
      
      Fixes: 00d909a1 ("scsi: target: Make the session shutdown code also wait for commands that are being aborted")
      Cc: <stable@vger.kernel.org> # v4.19+
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Reviewed-by: Bryant G. Ly <bly@catalogicsoftware.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fd594155
    • C
      dmaengine: ppc4xx: fix off-by-one build failure · 1ff43509
      Christian Lamparter committed
      commit 27d8d2d7a9b7eb05c4484b74b749eaee7b50b845 upstream.
      
      There are two definitions of poly_store, but one should have been
      poly_show.
      
      |adma.c:4382:16: error: conflicting types for 'poly_store'
      | static ssize_t poly_store(struct device_driver *dev, const char *buf,
      |                ^~~~~~~~~~
      |adma.c:4363:16: note: previous definition of 'poly_store' was here
      | static ssize_t poly_store(struct device_driver *dev, char *buf)
      |                ^~~~~~~~~~
      
      CC: stable@vger.kernel.org
      Fixes: 13efe1a0 ("dmaengine: ppc4xx: remove DRIVER_ATTR() usage")
      Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ff43509
    • S
      net/ipv4: defensive cipso option parsing · 247a9fa4
      Stefan Nuernberger committed
      commit 076ed3da0c9b2f88d9157dbe7044a45641ae369e upstream.
      
      commit 40413955 ("Cipso: cipso_v4_optptr enter infinite loop") fixed
      a possible infinite loop in the IP option parsing of CIPSO. The fix
      assumes that ip_options_compile filtered out all zero length options and
      that no other one-byte options besides IPOPT_END and IPOPT_NOOP exist.
      While this assumption currently holds true, add explicit checks for zero
      length and invalid length options to be safe for the future. Even though
      ip_options_compile should have validated the options, the introduction of
      new one-byte options can still confuse this code without the additional
      checks.
      
      Signed-off-by: Stefan Nuernberger <snu@amazon.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Simon Veith <sveith@amazon.de>
      Cc: stable@vger.kernel.org
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      247a9fa4
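
The explicit length checks described in the cipso commit above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel's code: the function name find_cipso() and the OPT_* constants are local to this example, and the walker takes a raw option buffer instead of an sk_buff.

```c
#include <stddef.h>

/* Local stand-ins for the IP option types named in the commit message. */
#define OPT_END   0    /* end of option list, one byte, no length field */
#define OPT_NOOP  1    /* padding, one byte, no length field */
#define OPT_CIPSO 134  /* CIPSO option type */

/*
 * Walk a buffer of IP options and return a pointer to the CIPSO option,
 * or NULL if none is found or the buffer is malformed.
 */
const unsigned char *find_cipso(const unsigned char *opt, int optlen)
{
    while (optlen > 1) {
        int taglen;

        switch (opt[0]) {
        case OPT_END:
            return NULL;
        case OPT_NOOP:
            taglen = 1;
            break;
        default:
            taglen = opt[1];  /* second byte carries the option length */
        }

        /* Defensive checks: a zero length would loop forever, and a
         * length larger than the remaining buffer would read past it. */
        if (taglen == 0 || taglen > optlen)
            return NULL;

        if (opt[0] == OPT_CIPSO)
            return opt;

        optlen -= taglen;
        opt += taglen;
    }
    return NULL;
}
```

With these checks the loop always advances by at least one byte, so it terminates even if an unknown option reports a length of zero.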
    • L
      iwlwifi: mvm: check return value of rs_rate_from_ucode_rate() · 3fa27214
      Luca Coelho committed
      commit 3d71c3f1f50cf309bd20659422af549bc784bfff upstream.
      
      The rs_rate_from_ucode_rate() function may return -EINVAL if the rate
      is invalid, but none of the callsites check for the error, potentially
      making us access arrays with index IWL_RATE_INVALID, which is larger
      than the arrays, causing an out-of-bounds access.  This will trigger
      KASAN warnings, such as the one reported in the bugzilla issue
      mentioned below.
      
      This fixes https://bugzilla.kernel.org/show_bug.cgi?id=200659
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3fa27214
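
The pattern behind the iwlwifi fix above, checking a conversion's return value before using its result as an array index, can be sketched as follows. All names here (rate_from_code(), lookup_kbps(), the RATE_* values and the table contents) are hypothetical and chosen for the example; they are not the driver's identifiers.

```c
#include <errno.h>

enum {
    RATE_1M,
    RATE_2M,
    RATE_COUNT,
    RATE_INVALID = RATE_COUNT  /* one past the last valid table index */
};

static const int rate_kbps[RATE_COUNT] = { 1000, 2000 };

/* Convert a raw code to a table index; returns -EINVAL for unknown codes. */
int rate_from_code(unsigned int code, int *index)
{
    if (code >= RATE_COUNT) {
        *index = RATE_INVALID;
        return -EINVAL;
    }
    *index = (int)code;
    return 0;
}

/* Caller that checks the error instead of indexing with RATE_INVALID. */
int lookup_kbps(unsigned int code)
{
    int idx;

    if (rate_from_code(code, &idx) != 0)
        return -1;  /* bail out before touching the table */

    return rate_kbps[idx];
}
```

Without the check in lookup_kbps(), an invalid code would index rate_kbps[] one element past its end, which is the kind of access the commit message describes.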
    • F
      mt76: mt76x2: fix multi-interface beacon configuration · 77f61e70
      Felix Fietkau committed
      commit 5289976a upstream.
      
      If the first virtual interface is a station (or an AP with beacons
      temporarily disabled), the beacon of the second interface needs to
      occupy hardware beacon slot 0.
      For some reason the beacon index was incorrectly masked with the
      virtual interface beacon mask, which prevents the secondary
      interface from sending beacons unless the first one also does.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      77f61e70
    • Y
      usb: gadget: udc: renesas_usb3: Fix b-device mode for "workaround" · 11abbcd3
      Yoshihiro Shimoda committed
      commit afc92514a34c7414b28047b1205a6b709103c699 upstream.
      
      If "workaround_for_vbus" is true, the driver will not call
      usb_disconnect(). Because the controller then keeps the values of some
      registers, the driver does not re-enumerate at a suitable speed after
      the b-device mode is disabled. To fix the issue, this patch
      adds a usb_disconnect() call in renesas_usb3_b_device_write()
      if workaround_for_vbus is true.
      
      Fixes: 43ba968b ("usb: gadget: udc: renesas_usb3: add debugfs to set the b-device mode")
      Cc: <stable@vger.kernel.org> # v4.14+
      Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      11abbcd3
    • A
      usb: typec: tcpm: Fix APDO PPS order checking to be based on voltage · 512307dd
      Adam Thomson committed
      commit 1b6af2f58c2b1522e0804b150ca95e50a9e80ea7 upstream.
      
      Current code mistakenly checks against max current to determine
      order but this should be max voltage. This commit fixes the issue
      so order is correctly determined, thus avoiding failure based on
      a higher voltage PPS APDO having a lower maximum current output,
      which is actually valid.
      
      Fixes: 2eadc33f ("typec: tcpm: Add core support for sink side PPS")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
      Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      512307dd
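
The ordering rule described in the tcpm commit above can be illustrated with a small sketch. It assumes a simplified representation: struct pps_apdo and pps_apdos_sorted() are hypothetical names for this example, whereas the real driver works on packed 32-bit PDO words.

```c
#include <stddef.h>

/* Simplified view of a PPS APDO (hypothetical layout for illustration). */
struct pps_apdo {
    unsigned int min_mv;  /* minimum voltage in millivolts */
    unsigned int max_mv;  /* maximum voltage in millivolts */
    unsigned int max_ma;  /* maximum current in milliamps */
};

/*
 * Return 1 if the APDOs are in ascending order of maximum voltage,
 * 0 otherwise. Maximum current deliberately plays no part in the check.
 */
int pps_apdos_sorted(const struct pps_apdo *apdo, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (apdo[i].max_mv < apdo[i - 1].max_mv)
            return 0;
    }
    return 1;
}
```

A list such as 5.9 V at 3 A followed by 11 V at 2 A passes this check, while a comparison on max_ma would have rejected it.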
    • S
      usbip:vudc: BUG kmalloc-2048 (Not tainted): Poison overwritten · 08c7103d
      Shuah Khan (Samsung OSG) committed
      commit e28fd56a upstream.
      
      In the rmmod path, usbip_vudc does platform_device_put() twice: once from
      platform_device_unregister() and again from put_vudc_device().
      
      The second put results in:
      
      BUG kmalloc-2048 (Not tainted): Poison overwritten error or
      BUG: KASAN: use-after-free in kobject_put+0x1e/0x230 if KASAN is
      enabled.
      
      [  169.042156] calling  init+0x0/0x1000 [usbip_vudc] @ 1697
      [  169.042396] =============================================================================
      [  169.043678] probe of usbip-vudc.0 returned 1 after 350 usecs
      [  169.044508] BUG kmalloc-2048 (Not tainted): Poison overwritten
      [  169.044509] -----------------------------------------------------------------------------
      ...
      [  169.057849] INFO: Freed in device_release+0x2b/0x80 age=4223 cpu=3 pid=1693
      [  169.057852] 	kobject_put+0x86/0x1b0
      [  169.057853] 	0xffffffffc0c30a96
      [  169.057855] 	__x64_sys_delete_module+0x157/0x240
      
      Fix it to call platform_device_del() instead and let put_vudc_device() do
      the platform_device_put().
      
      Reported-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      08c7103d
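
The double put described in the usbip commit above can be followed with a toy reference-count model. This is a sketch under loud assumptions: struct toy_pdev and the toy_* functions are invented for this example and only mimic the behaviour the commit message attributes to the real calls, relying on the fact that platform_device_unregister() is a delete followed by a put.

```c
/* Toy stand-in for a platform device (hypothetical, illustration only). */
struct toy_pdev {
    int refcount;    /* references currently held on the device */
    int registered;  /* 1 while the device is linked into the bus */
};

/* Models platform_device_put(): drop one reference. */
void toy_put(struct toy_pdev *p)
{
    p->refcount--;
}

/* Models platform_device_del(): unlink only, the reference is kept. */
void toy_del(struct toy_pdev *p)
{
    p->registered = 0;
}

/* Models platform_device_unregister(): delete followed by put. */
void toy_unregister(struct toy_pdev *p)
{
    toy_del(p);
    toy_put(p);
}

/* Models put_vudc_device(), which per the commit does the final put. */
void toy_put_vudc(struct toy_pdev *p)
{
    toy_put(p);
}
```

Starting from one reference, toy_unregister() followed by toy_put_vudc() drives the count below zero, which corresponds to the use-after-free in the report. toy_del() followed by toy_put_vudc() leaves it at exactly zero.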
    • L
      libertas: don't set URB_ZERO_PACKET on IN USB transfer · f253f6de
      Lubomir Rintel committed
      commit 6528d88047801b80d2a5370ad46fb6eff2f509e0 upstream.
      
      The USB core gets rightfully upset:
      
        usb 1-1: BOGUS urb flags, 240 --> 200
        WARNING: CPU: 0 PID: 60 at drivers/usb/core/urb.c:503 usb_submit_urb+0x2f8/0x3ed
        Modules linked in:
        CPU: 0 PID: 60 Comm: kworker/0:3 Not tainted 4.19.0-rc6-00319-g5206d00a45c7 #39
        Hardware name: OLPC XO/XO, BIOS OLPC Ver 1.00.01 06/11/2014
        Workqueue: events request_firmware_work_func
        EIP: usb_submit_urb+0x2f8/0x3ed
        Code: 75 06 8b 8f 80 00 00 00 8d 47 78 89 4d e4 89 55 e8 e8 35 1c f6 ff 8b 55 e8 56 52 8b 4d e4 51 50 68 e3 ce c7 c0 e8 ed 18 c6 ff <0f> 0b 83 c4 14 80 7d ef 01 74 0a 80 7d ef 03 0f 85 b8 00 00 00 8b
        EAX: 00000025 EBX: ce7d4980 ECX: 00000000 EDX: 00000001
        ESI: 00000200 EDI: ce7d8800 EBP: ce7f5ea8 ESP: ce7f5e70
        DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 EFLAGS: 00210292
        CR0: 80050033 CR2: 00000000 CR3: 00e80000 CR4: 00000090
        Call Trace:
         ? if_usb_fw_timeo+0x64/0x64
         __if_usb_submit_rx_urb+0x85/0xe6
         ? if_usb_fw_timeo+0x64/0x64
         if_usb_submit_rx_urb_fwload+0xd/0xf
         if_usb_prog_firmware+0xc0/0x3db
         ? _request_firmware+0x54/0x47b
         ? _request_firmware+0x89/0x47b
         ? if_usb_probe+0x412/0x412
         lbs_fw_loaded+0x55/0xa6
         ? debug_smp_processor_id+0x12/0x14
         helper_firmware_cb+0x3c/0x3f
         request_firmware_work_func+0x37/0x6f
         process_one_work+0x164/0x25a
         worker_thread+0x1c4/0x284
         kthread+0xec/0xf1
         ? cancel_delayed_work_sync+0xf/0xf
         ? kthread_create_on_node+0x1a/0x1a
         ret_from_fork+0x2e/0x38
        ---[ end trace 3ef1e3b2dd53852f ]---
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f253f6de
    • J
      xen/pvh: don't try to unplug emulated devices · c43c81be
      Juergen Gross committed
      commit e6111161c0a02d58919d776eec94b313bb57911f upstream.
      
      A Xen PVH guest has no associated qemu device model, so trying to
      unplug emulated devices makes no sense.
      
      Bail out early from xen_unplug_emulated_devices() when running as a PVH
      guest. This avoids issuing the boot message:
      
      [    0.000000] Xen Platform PCI: unrecognised magic value
      
      Cc: <stable@vger.kernel.org> # 4.11
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c43c81be
    • R
      xen/pvh: increase early stack size · 82f76b05
      Roger Pau Monne committed
      commit 7deecbda3026f5e2a8cc095d7ef7261a920efcf2 upstream.
      
      While booting on an AMD EPYC box the stack canary would detect stack
      overflows when using the current PVH early stack size (256). Switch to
      using the value defined by BOOT_STACK_SIZE, which prevents the stack
      overflow.
      
      Cc: <stable@vger.kernel.org> # 4.11
      Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      82f76b05
    • J
      xen: make xen_qlock_wait() nestable · 99d25586
      Juergen Gross committed
      commit a856531951dc8094359dfdac21d59cee5969c18e upstream.
      
      xen_qlock_wait() isn't safe for nested calls due to interrupts. A call
      of xen_qlock_kick() might be ignored in case a deeper nesting level
      was active right before the call of xen_poll_irq():
      
      CPU 1:                                   CPU 2:
      spin_lock(lock1)
                                               spin_lock(lock1)
                                               -> xen_qlock_wait()
                                                  -> xen_clear_irq_pending()
                                                  Interrupt happens
      spin_unlock(lock1)
      -> xen_qlock_kick(CPU 2)
      spin_lock_irqsave(lock2)
                                               spin_lock_irqsave(lock2)
                                               -> xen_qlock_wait()
                                                  -> xen_clear_irq_pending()
                                                     clears kick for lock1
                                                  -> xen_poll_irq()
      spin_unlock_irqrestore(lock2)
      -> xen_qlock_kick(CPU 2)
                                                  wakes up
                                               spin_unlock_irqrestore(lock2)
                                               IRET
                                                 resumes in xen_qlock_wait()
                                                 -> xen_poll_irq()
                                                 never wakes up
      
      The solution is to disable interrupts in xen_qlock_wait() and not to
      poll for the irq if xen_qlock_wait() is called in NMI context.
      
      Cc: stable@vger.kernel.org
      Cc: Waiman.Long@hp.com
      Cc: peterz@infradead.org
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      99d25586
    • J
      xen: fix race in xen_qlock_wait() · 9dc8cf0d
      Juergen Gross committed
      commit 2ac2a7d4d9ff4e01e36f9c3d116582f6f655ab47 upstream.
      
      In the following situation a vcpu waiting for a lock might not be
      woken up from xen_poll_irq():
      
      CPU 1:                CPU 2:                      CPU 3:
      takes a spinlock
                            tries to get lock
                            -> xen_qlock_wait()
      frees the lock
      -> xen_qlock_kick(cpu2)
                              -> xen_clear_irq_pending()
      
      takes lock again
                                                        tries to get lock
                                                        -> *lock = _Q_SLOW_VAL
                              -> *lock == _Q_SLOW_VAL ?
                              -> xen_poll_irq()
      frees the lock
      -> xen_qlock_kick(cpu3)
      
      And cpu 2 will sleep forever.
      
      This can be avoided easily by modifying xen_qlock_wait() to call
      xen_poll_irq() only if the related irq was not pending and to call
      xen_clear_irq_pending() only if it was pending.
      
      Cc: stable@vger.kernel.org
      Cc: Waiman.Long@hp.com
      Cc: peterz@infradead.org
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9dc8cf0d
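
The race in the xen_qlock_wait() commit above, and the fix it describes, can be followed in a single-threaded toy model. Everything here is an assumption made for illustration: struct toy_vcpu, toy_poll(), wait_racy() and wait_fixed() are invented names, the pending interrupt is a plain flag, and "blocked" merely records that the model would have gone to sleep. It is not the kernel implementation.

```c
#include <stdbool.h>

struct toy_vcpu {
    bool irq_pending;  /* a kick was delivered and not yet consumed */
    bool blocked;      /* set when the model would sleep in the poll call */
};

/* Stand-in for xen_poll_irq(): returns at once if a kick is pending,
 * otherwise records that the vcpu would block. */
static void toy_poll(struct toy_vcpu *v)
{
    if (!v->irq_pending)
        v->blocked = true;
}

/* Ordering described as racy: clear unconditionally, then poll. */
void wait_racy(struct toy_vcpu *v, unsigned char byte, unsigned char val)
{
    v->irq_pending = false;  /* may discard a kick that already arrived */
    if (byte != val)
        return;
    toy_poll(v);
}

/* Ordering described in the fix: clear only if pending and return,
 * poll only if nothing was pending. */
void wait_fixed(struct toy_vcpu *v, unsigned char byte, unsigned char val)
{
    if (v->irq_pending) {
        v->irq_pending = false;
        return;  /* the caller goes back to spinning on the lock */
    }
    if (byte != val)
        return;
    toy_poll(v);
}
```

If a kick is already pending when the wait begins and the lock byte still holds the slow-path value, wait_racy() throws the kick away and blocks, while wait_fixed() consumes it and returns.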