1. 25 Oct 2021, 9 commits
  2. 23 Oct 2021, 1 commit
  3. 27 Sep 2021, 1 commit
  4. 25 Sep 2021, 1 commit
  5. 22 Sep 2021, 3 commits
    • s390/qeth: fix deadlock during failing recovery · d2b59bd4
      Authored by Alexandra Winter
      Commit 0b9902c1 ("s390/qeth: fix deadlock during recovery") removed
      taking discipline_mutex inside qeth_do_reset(), fixing potential
      deadlocks. However, one error path was missed that still takes
      discipline_mutex and thus retains the original deadlock potential.
      
      Intermittent deadlocks were seen when a qeth channel path was
      configured offline, causing a race between qeth_do_reset() and
      ccwgroup_remove(). Fix this by calling qeth_set_offline() directly
      in the qeth_do_reset() error case and then a new variant of
      ccwgroup_set_offline() that does not take discipline_mutex (a
      minimal sketch of the resulting error path follows this commit).
      
      Fixes: b41b554c ("s390/qeth: fix locking for discipline setup / removal")
      Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
      Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
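      A minimal, compilable sketch of the reworked error path this commit
      describes. The simplified struct and the name
      ccwgroup_set_offline_nolock() are illustrative assumptions, not the
      exact upstream API:

          #include <stdio.h>

          /* Stand-in for the real qeth_card (drivers/s390/net/qeth_core.h). */
          struct qeth_card { int online; };

          static void qeth_set_offline(struct qeth_card *card)
          {
              card->online = 0;    /* take the card offline directly */
          }

          /* Hypothetical name for the new ccwgroup_set_offline() variant
           * that skips discipline_mutex, so it cannot deadlock against
           * ccwgroup_remove(). */
          static void ccwgroup_set_offline_nolock(struct qeth_card *card)
          {
              (void)card;
              printf("ccwgroup -> offline, no discipline_mutex taken\n");
          }

          /* Fixed qeth_do_reset() error case: no path left that re-takes
           * discipline_mutex. */
          static void qeth_do_reset_error_case(struct qeth_card *card)
          {
              qeth_set_offline(card);
              ccwgroup_set_offline_nolock(card);
          }

          int main(void)
          {
              struct qeth_card card = { .online = 1 };
              qeth_do_reset_error_case(&card);
              return 0;
          }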
    • s390/qeth: Fix deadlock in remove_discipline · ee909d0b
      Authored by Alexandra Winter
      Problem: qeth_close_dev_handler is a worker that tries to acquire
      card->discipline_mutex via drv->set_offline() in ccwgroup_set_offline().
      Since commit b41b554c
      ("s390/qeth: fix locking for discipline setup / removal")
      qeth_remove_discipline() is called under card->discipline_mutex; it
      cancels the work and waits for it to finish. The worker can never
      acquire the mutex its canceller holds, so both sides block forever.
      
      STOPLAN reception with reason code IPA_RC_VEPA_TO_VEB_TRANSITION is
      the only situation that schedules close_dev_work. In that situation,
      scheduling qeth recovery will also result in an offline interface
      when resetting the isolation mode fails, i.e. if the external switch
      is still set to VEB. And since commit 0b9902c1 ("s390/qeth: fix
      deadlock during recovery"), qeth recovery no longer acquires
      card->discipline_mutex.
      
      So we accept the longer path length of qeth_schedule_recovery() in
      this error situation and re-use the existing function (the deadlock
      shape is sketched in miniature after this commit).
      
      As a side benefit, this makes the hwtrap behave as it does during
      recovery instead of as during a user-triggered set_offline.
      
      Fixes: b41b554c ("s390/qeth: fix locking for discipline setup / removal")
      Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
      Acked-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
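      The deadlock shape, reduced to a runnable user-space miniature
      (pthreads stand in for the kernel mutex and workqueue; switching
      fixed to false makes the program hang by design):

          #include <pthread.h>
          #include <stdbool.h>
          #include <stdio.h>

          static pthread_mutex_t discipline_mutex = PTHREAD_MUTEX_INITIALIZER;

          /* Old worker: close_dev_work went offline via a path that
           * takes discipline_mutex. */
          static void *close_dev_work_old(void *arg)
          {
              (void)arg;
              pthread_mutex_lock(&discipline_mutex);   /* waits on canceller */
              pthread_mutex_unlock(&discipline_mutex);
              return NULL;
          }

          /* Fixed worker: schedule recovery instead, which since commit
           * 0b9902c1 runs without taking discipline_mutex. */
          static void *close_dev_work_new(void *arg)
          {
              (void)arg;
              printf("qeth_schedule_recovery: no discipline_mutex needed\n");
              return NULL;
          }

          int main(void)
          {
              pthread_t worker;
              bool fixed = true;    /* set to false to observe the hang */

              /* qeth_remove_discipline(): holds the mutex, then cancels
               * the work and waits for it. With the old worker this
               * joins a thread that needs the very mutex we hold. */
              pthread_mutex_lock(&discipline_mutex);
              pthread_create(&worker, NULL,
                             fixed ? close_dev_work_new : close_dev_work_old,
                             NULL);
              pthread_join(worker, NULL);
              pthread_mutex_unlock(&discipline_mutex);
              printf("remove_discipline finished\n");
              return 0;
          }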
    • s390/qeth: fix NULL deref in qeth_clear_working_pool_list() · 248f064a
      Authored by Julian Wiedmann
      When qeth_set_online() calls qeth_clear_working_pool_list() to roll
      back after an error exit from qeth_hardsetup_card(), we are at risk of
      accessing card->qdio.in_q before it was allocated by
      qeth_alloc_qdio_queues() via qeth_mpc_initialize().
      
      qeth_clear_working_pool_list() then dereferences NULL, and by
      writing to queue->bufs[i].pool_entry scribbles all over the CPU's
      lowcore, resulting in a crash when those lowcore areas are next
      used (e.g. on the next machine-check interrupt).
      
      Such a scenario would typically happen when the device is first set
      online and its queues aren't allocated yet. An early I/O error or
      certain misconfigurations (e.g. mismatched transport mode, bad
      portno) then cause us to error out from qeth_hardsetup_card() with
      card->qdio.in_q still being NULL.
      
      Fix it by checking the pointer for NULL before accessing it (see
      the sketch after this commit).
      
      Note that we also have (rare) paths inside qeth_mpc_initialize() where
      a configuration change can cause us to free the existing queues,
      expecting that subsequent code will allocate them again. If we then
      error out before that re-allocation happens, the same bug occurs.
      
      Fixes: eff73e16 ("s390/qeth: tolerate pre-filled RX buffer")
      Reported-by: Stefan Raspl <raspl@linux.ibm.com>
      Root-caused-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
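      A self-contained sketch of the fix; the struct layouts and the
      buffer count are invented for illustration and do not match the
      real qeth definitions:

          #include <stddef.h>
          #include <stdio.h>

          #define QDIO_MAX_BUFFERS 128    /* illustrative queue depth */

          struct qeth_pool_entry { void *data; };
          struct qeth_qdio_buffer { struct qeth_pool_entry *pool_entry; };
          struct qeth_qdio_q { struct qeth_qdio_buffer bufs[QDIO_MAX_BUFFERS]; };
          struct qeth_qdio_info { struct qeth_qdio_q *in_q; };
          struct qeth_card { struct qeth_qdio_info qdio; };

          /* Sketch of the fix: bail out when the RX queue was never
           * allocated, instead of dereferencing the NULL in_q pointer
           * and scribbling over whatever sits at low addresses (on
           * s390: the CPU's lowcore). */
          static void qeth_clear_working_pool_list(struct qeth_card *card)
          {
              struct qeth_qdio_q *queue = card->qdio.in_q;
              size_t i;

              if (!queue)    /* early error exit: nothing to clear */
                  return;

              for (i = 0; i < QDIO_MAX_BUFFERS; i++)
                  queue->bufs[i].pool_entry = NULL;
          }

          int main(void)
          {
              struct qeth_card card = { .qdio = { .in_q = NULL } };

              qeth_clear_working_pool_list(&card);   /* returns early now */
              printf("no NULL dereference\n");
              return 0;
          }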
  6. 15 Sep 2021, 2 commits
  7. 14 Sep 2021, 4 commits
  8. 08 Sep 2021, 3 commits
  9. 07 Sep 2021, 5 commits
  10. 31 Aug 2021, 1 commit
  11. 27 Aug 2021, 1 commit
    • s390/ap: fix state machine hang after failure to enable irq · cabebb69
      Authored by Harald Freudenberger
      If enabling the interrupt for an ap queue fails for any reason, the
      state machine run for the queue returned wrong return codes to the
      caller. The caller then assumed that interrupt support for this
      queue is enabled and thus did not re-establish the high-resolution
      timer used for polling. In the end this led to a hang for the
      user-space process waiting "forever" for the reply.
      
      This patch reworks these return codes to give the caller a correct
      indication to re-establish the timer when a queue runs without
      interrupt support (an illustrative sketch follows this commit).
      
      Note that this fixes wrong behavior after an initial failure to
      enable interrupt support for the queue. However, this apparently
      does happen occasionally on KVM systems.
      Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
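      An illustrative reduction of the fix; the enum and function names
      are assumptions, not the real code in drivers/s390/crypto/ap_queue.c:

          #include <stdbool.h>
          #include <stdio.h>

          /* Wait indication a state machine run reports to its caller. */
          enum ap_sm_wait {
              AP_SM_WAIT_INTERRUPT,    /* an IRQ will signal the reply */
              AP_SM_WAIT_HRTIMER,      /* no IRQ: caller must keep polling */
          };

          /* Sketch of the fix: if enabling the interrupt failed, report
           * that, instead of claiming IRQ mode and leaving the caller
           * without a poll timer (and the process waiting forever). */
          static enum ap_sm_wait ap_sm_run(bool irq_enabled)
          {
              return irq_enabled ? AP_SM_WAIT_INTERRUPT
                                 : AP_SM_WAIT_HRTIMER;
          }

          int main(void)
          {
              if (ap_sm_run(false) == AP_SM_WAIT_HRTIMER)
                  printf("re-arming high-resolution poll timer\n");
              return 0;
          }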
  12. 26 Aug 2021, 1 commit
  13. 25 Aug 2021, 7 commits
  14. 24 Aug 2021, 1 commit