1. 16 Dec, 2021 1 commit
    • xen/console: harden hvc_xen against event channel storms · fe415186
      Juergen Gross committed
      The Xen console driver is still vulnerable to an attack via an excessive
      number of events sent by the backend. Fix that by using a lateeoi event
      channel.
      
      For the normal domU initial console this requires the introduction of
      bind_evtchn_to_irq_lateeoi() as there is no xenbus device available
      at the time the event channel is bound to the irq.
      
      As deciding whether an interrupt was spurious requires testing whether
      bytes have been read from the backend, move sending the event into the
      if statement: sending an event without having found any bytes to read
      makes no sense at all.
      
      This is part of XSA-391.

      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      ---
      V2:
      - slightly adapt spurious irq detection (Jan Beulich)
      V3:
      - fix spurious irq detection (Jan Beulich)
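The spurious-event rule from the commit message above can be sketched in user-space C. This is only an illustration under stated assumptions, not the hvc_xen code: `struct ring`, `console_read()` and the value of `EOI_FLAG_SPURIOUS` (a stand-in for XEN_EOI_FLAG_SPURIOUS) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for XEN_EOI_FLAG_SPURIOUS; the value is illustrative. */
#define EOI_FLAG_SPURIOUS 0x1u

/* Hypothetical, simplified console ring: cons/prod are free-running indices. */
struct ring {
    char buf[16];
    unsigned int cons;
    unsigned int prod;
    unsigned int notifications; /* events sent back to the backend */
};

/*
 * Drain up to len bytes from the ring. The backend is notified only when
 * at least one byte was consumed; the returned EOI flags mark the event
 * as spurious when nothing was read.
 */
static unsigned int console_read(struct ring *r, char *out, size_t len,
                                 size_t *nread)
{
    size_t n = 0;

    while (r->cons != r->prod && n < len)
        out[n++] = r->buf[r->cons++ % sizeof(r->buf)];

    *nread = n;
    if (n > 0) {
        r->notifications++;   /* tell the backend there is room again */
        return 0;             /* real work was done */
    }
    return EOI_FLAG_SPURIOUS; /* nothing to read: report as spurious */
}
```

The caller would pass the returned flags to its end-of-interrupt call, so that an event which delivered no data is counted as spurious.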
  2. 12 Aug, 2021 1 commit
    • xen/events: Fix race in set_evtchn_to_irq · 88ca2521
      Maximilian Heyne committed
      There is a TOCTOU issue in set_evtchn_to_irq. Rows in the evtchn_to_irq
      mapping are lazily allocated in this function. The check whether the row
      is already present and the row initialization are not synchronized. Two
      threads can at the same time allocate a new row for evtchn_to_irq and
      add the irq mapping to their newly allocated row. One thread will
      overwrite what the other has set for evtchn_to_irq[row] and therefore
      the irq mapping is lost. This will trigger a BUG_ON later in
      bind_evtchn_to_cpu:
      
        INFO: pci 0000:1a:15.4: [1d0f:8061] type 00 class 0x010802
        INFO: nvme 0000:1a:12.1: enabling device (0000 -> 0002)
        INFO: nvme nvme77: 1/0/0 default/read/poll queues
        CRIT: kernel BUG at drivers/xen/events/events_base.c:427!
        WARN: invalid opcode: 0000 [#1] SMP NOPTI
        WARN: Workqueue: nvme-reset-wq nvme_reset_work [nvme]
        WARN: RIP: e030:bind_evtchn_to_cpu+0xc2/0xd0
        WARN: Call Trace:
        WARN:  set_affinity_irq+0x121/0x150
        WARN:  irq_do_set_affinity+0x37/0xe0
        WARN:  irq_setup_affinity+0xf6/0x170
        WARN:  irq_startup+0x64/0xe0
        WARN:  __setup_irq+0x69e/0x740
        WARN:  ? request_threaded_irq+0xad/0x160
        WARN:  request_threaded_irq+0xf5/0x160
        WARN:  ? nvme_timeout+0x2f0/0x2f0 [nvme]
        WARN:  pci_request_irq+0xa9/0xf0
        WARN:  ? pci_alloc_irq_vectors_affinity+0xbb/0x130
        WARN:  queue_request_irq+0x4c/0x70 [nvme]
        WARN:  nvme_reset_work+0x82d/0x1550 [nvme]
        WARN:  ? check_preempt_wakeup+0x14f/0x230
        WARN:  ? check_preempt_curr+0x29/0x80
        WARN:  ? nvme_irq_check+0x30/0x30 [nvme]
        WARN:  process_one_work+0x18e/0x3c0
        WARN:  worker_thread+0x30/0x3a0
        WARN:  ? process_one_work+0x3c0/0x3c0
        WARN:  kthread+0x113/0x130
        WARN:  ? kthread_park+0x90/0x90
        WARN:  ret_from_fork+0x3a/0x50
      
      This patch sets evtchn_to_irq rows via a cmpxchg operation so that they
      will be set only once. The row is now cleared before writing it to
      evtchn_to_irq in order to not create a race once the row is visible for
      other threads.
      
      While at it, do not require the page to be zeroed, because it will be
      overwritten with -1's in clear_evtchn_to_irq_row anyway.
      Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
      Fixes: d0b075ff ("xen/events: Refactor evtchn_to_irq array to be dynamically allocated")
      Link: https://lore.kernel.org/r/20210812130930.127134-1-mheyne@amazon.de
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
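The "install the row only once" idea from the commit message above can be sketched with C11 atomics. This is a user-space illustration, not the kernel patch: `table`, `clear_row()`, `set_mapping()` and the sizes are hypothetical, and `atomic_compare_exchange_strong()` stands in for the kernel's cmpxchg.

```c
#include <stdatomic.h>
#include <stdlib.h>

#define ROWS 4
#define COLS 8

/* Lazily allocated rows, analogous in spirit to evtchn_to_irq. */
static _Atomic(int *) table[ROWS];

/* Fill a freshly allocated row with -1 ("no irq") before publishing it. */
static void clear_row(int *row)
{
    for (int i = 0; i < COLS; i++)
        row[i] = -1;
}

/*
 * Store irq at table[row][col], allocating the row on first use.
 * Returns 0 on success, -1 on a bad index or allocation failure.
 * The cell write itself is a plain store; this sketch assumes no two
 * threads write the same cell concurrently.
 */
static int set_mapping(unsigned int row, unsigned int col, int irq)
{
    if (row >= ROWS || col >= COLS)
        return -1;

    int *cur = atomic_load(&table[row]);
    if (cur == NULL) {
        int *fresh = malloc(COLS * sizeof(*fresh)); /* need not be zeroed */
        if (fresh == NULL)
            return -1;
        clear_row(fresh); /* initialise before the row becomes visible */

        int *expected = NULL;
        if (atomic_compare_exchange_strong(&table[row], &expected, fresh)) {
            cur = fresh;      /* we won: our row is installed */
        } else {
            free(fresh);      /* another thread won: use its row */
            cur = expected;
        }
    }
    cur[col] = irq;
    return 0;
}
```

Because the losing thread discards its own allocation and writes into the row that is already installed, no mapping is lost when two threads race on the same empty row.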
  3. 22 Jul, 2021 1 commit
  4. 24 Jun, 2021 1 commit
  5. 07 Apr, 2021 1 commit
  6. 11 Mar, 2021 2 commits
  7. 10 Mar, 2021 1 commit
  8. 24 Feb, 2021 1 commit
  9. 12 Feb, 2021 1 commit
  10. 13 Jan, 2021 1 commit
    • xen: Fix event channel callback via INTX/GSI · 3499ba81
      David Woodhouse committed
      For a while, event channel notification via the PCI platform device
      has been broken, because we attempt to communicate with xenstore before
      we even have notifications working, with the xs_reset_watches() call
      in xs_init().
      
      We tend to get away with this on Xen versions below 4.0 because we avoid
      calling xs_reset_watches() anyway, because xenstore might not cope with
      reading a non-existent key. And newer Xen *does* have the vector
      callback support, so we rarely fall back to INTX/GSI delivery.
      
      To fix it, clean up a bit of the mess of xs_init() and xenbus_probe()
      startup. Call xs_init() directly from xenbus_init() only in the !XS_HVM
      case, deferring it to be called from xenbus_probe() in the XS_HVM case
      instead.
      
      Then fix up the invocation of xenbus_probe() to happen either from its
      device_initcall if the callback is available early enough, or when the
      callback is finally set up. This means that the hack of calling
      xenbus_probe() from a workqueue after the first interrupt, or directly
      from the PCI platform device setup, is no longer needed.
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Link: https://lore.kernel.org/r/20210113132606.422794-2-dwmw2@infradead.org
      Signed-off-by: Juergen Gross <jgross@suse.com>
  11. 15 Dec, 2020 6 commits
  12. 23 Oct, 2020 3 commits
  13. 20 Oct, 2020 5 commits
    • xen/events: block rogue events for some time · 5f7f7740
      Juergen Gross committed
      In order to avoid high dom0 load due to rogue guests sending events at
      high frequency, block those events in case there was no action needed
      in dom0 to handle the events.
      
      This is done by adding a per-event counter, which is set to zero in case
      an EOI without the XEN_EOI_FLAG_SPURIOUS is received from a backend
      driver, and incremented when this flag has been set. In case the
      counter is 2 or higher, delay the EOI by 1 << (cnt - 2) jiffies, but
      not more than 1 second.
      
      In order not to waste memory shorten the per-event refcnt to two bytes
      (it should normally never exceed a value of 2). Add an overflow check
      to evtchn_get() to make sure the 2 bytes really won't overflow.
      
      This is part of XSA-332.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
      Reviewed-by: Wei Liu <wl@xen.org>
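The back-off rule in the commit message above (delay of 1 << (cnt - 2) jiffies, capped at one second) can be worked through in a small C sketch. The names `struct evt_state` and `eoi_delay()` are hypothetical, `HZ` is assumed to be 250 purely for illustration, and the upper bound on the counter is a choice of this sketch rather than something taken from the commit.

```c
/* Assumed ticks per second; the real value is a kernel configuration choice. */
#define HZ 250u

/* Hypothetical per-event state holding the spurious-event counter. */
struct evt_state {
    unsigned int spurious_cnt;
};

/*
 * Update the counter for one EOI and return the delay in jiffies:
 * a non-spurious EOI resets the counter, a spurious one increments it,
 * and from a count of 2 upwards the delay doubles each time up to HZ
 * (one second).
 */
static unsigned int eoi_delay(struct evt_state *s, int spurious)
{
    if (!spurious) {
        s->spurious_cnt = 0;
        return 0;
    }

    if (s->spurious_cnt < 64)      /* sketch-only bound against wrap-around */
        s->spurious_cnt++;
    if (s->spurious_cnt < 2)
        return 0;

    unsigned int shift = s->spurious_cnt - 2;
    unsigned int delay = shift < 31 ? 1u << shift : HZ; /* avoid oversized shift */

    return delay > HZ ? HZ : delay;
}
```

With these assumptions, consecutive spurious events yield delays of 0, 1, 2, 4, ... jiffies until the cap is reached, and a single useful event resets the sequence.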
    • xen/events: defer eoi in case of excessive number of events · e99502f7
      Juergen Gross committed
      In case rogue guests are sending events at high frequency it might
      happen that xen_evtchn_do_upcall() won't stop processing events in
      dom0. As this is done in irq handling a crash might be the result.
      
      In order to avoid that, delay further inter-domain events after some
      time in xen_evtchn_do_upcall() by forcing eoi processing into a
      worker on the same cpu, thus inhibiting new events coming in.
      
      The time after which eoi processing is to be delayed is configurable
      via a new module parameter "event_loop_timeout" which specifies the
      maximum event loop time in jiffies (default: 2, the value was chosen
      after some tests showing that a value of 2 was the lowest with an
      only slight drop of dom0 network throughput while multiple guests
      performed an event storm).
      
      How long eoi processing will be delayed can be specified via another
      parameter "event_eoi_delay" (again in jiffies, default 10, again the
      value was chosen after testing with different delay values).
      
      This is part of XSA-332.
      
      Cc: stable@vger.kernel.org
      Reported-by: Julien Grall <julien@xen.org>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
      Reviewed-by: Wei Liu <wl@xen.org>
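The interplay of the two parameters described above can be illustrated with a simplified model. This is a sketch under assumptions, not the kernel logic: `struct loop_ctrl` and `handle_event()` are hypothetical, time is a plain tick counter, and only the default values (2 and 10 jiffies) come from the commit message.

```c
#define EVENT_LOOP_TIMEOUT 2u   /* default from the commit message, in jiffies */
#define EVENT_EOI_DELAY    10u  /* default from the commit message, in jiffies */

/* Hypothetical per-cpu loop state: when the loop began, and whether
 * EOIs are currently being deferred. */
struct loop_ctrl {
    unsigned long start;
    int defer_eoi;
};

/*
 * Called for each event handled at tick "now". Returns 0 if the EOI may
 * be issued immediately, otherwise the tick at which the deferred EOI
 * should be issued. Once the loop has run longer than the timeout, all
 * further events in this loop are deferred.
 */
static unsigned long handle_event(struct loop_ctrl *c, unsigned long now)
{
    if (!c->defer_eoi && now - c->start > EVENT_LOOP_TIMEOUT)
        c->defer_eoi = 1;

    return c->defer_eoi ? now + EVENT_EOI_DELAY : 0;
}
```

In this model a loop that started at tick 100 issues EOIs immediately up to tick 102 and schedules every later EOI 10 ticks into the future, which keeps new events from the same channels out of the loop in the meantime.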
    • xen/events: use a common cpu hotplug hook for event channels · 7beb290c
      Juergen Gross committed
      Today only fifo event channels have a cpu hotplug callback. In order
      to prepare for more percpu (de)init work move that callback into
      events_base.c and add percpu_init() and percpu_deinit() hooks to
      struct evtchn_ops.
      
      This is part of XSA-332.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Reviewed-by: Wei Liu <wl@xen.org>
    • xen/events: add a new "late EOI" evtchn framework · 54c9de89
      Juergen Gross committed
      In order to avoid tight event channel related IRQ loops add a new
      framework of "late EOI" handling: the IRQ the event channel is bound
      to will be masked until the event has been handled and the related
      driver is capable of handling another event. The driver is responsible
      for unmasking the event channel via the new function xen_irq_lateeoi().
      
      This is similar to binding an event channel to a threaded IRQ, but
      without having to structure the driver accordingly.
      
      In order to support a future special handling in case a rogue guest
      is sending lots of unsolicited events, add a flag to xen_irq_lateeoi()
      which can be set by the caller to indicate the event was a spurious
      one.
      
      This is part of XSA-332.
      
      Cc: stable@vger.kernel.org
      Reported-by: Julien Grall <julien@xen.org>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
      Reviewed-by: Wei Liu <wl@xen.org>
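The "masked until the driver signals EOI" behaviour described above can be modelled as a tiny state machine. This is a conceptual sketch only: `struct chan`, `deliver()` and `late_eoi()` are hypothetical names, with `late_eoi()` playing the role that xen_irq_lateeoi() has for a real driver.

```c
/* Hypothetical model of one event channel under late-EOI handling. */
struct chan {
    int masked;            /* 1 while the driver is still busy with an event */
    int pending;           /* 1 if an event arrived while masked */
    unsigned int handled;  /* number of times the handler actually ran */
};

/* An event arrives: run the handler only if the channel is unmasked,
 * and keep the channel masked until the driver calls late_eoi(). */
static void deliver(struct chan *c)
{
    if (c->masked) {
        c->pending = 1;    /* coalesce: remember it, do not re-enter */
        return;
    }
    c->masked = 1;
    c->handled++;
}

/* The driver is ready for more work: unmask and replay a pending event. */
static void late_eoi(struct chan *c)
{
    c->masked = 0;
    if (c->pending) {
        c->pending = 0;
        deliver(c);
    }
}
```

However many events arrive while the channel is masked, the handler runs at most once per EOI, which is what prevents a tight IRQ loop.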
    • xen/events: avoid removing an event channel while handling it · 073d0552
      Juergen Gross committed
      Today it can happen that an event channel is being removed from the
      system while the event handling loop is active. This can lead to a
      race resulting in crashes or WARN() splats when trying to access the
      irq_info structure related to the event channel.
      
      Fix this problem by using a rwlock taken as reader in the event
      handling loop and as writer when deallocating the irq_info structure.
      
      As the observed problem was a NULL dereference in evtchn_from_irq(),
      make this function more robust against races by checking that the
      irq_info pointer is not NULL before dereferencing it.
      
      And finally make all accesses to evtchn_to_irq[row][col] atomic ones
      in order to avoid seeing partial updates of an array element in irq
      handling. Note that irq handling can be entered only for event channels
      which have been valid before, so any not populated row isn't a problem
      in this regard, as rows are only ever added and never removed.
      
      This is XSA-331.
      
      Cc: stable@vger.kernel.org
      Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Reported-by: Jinoh Kang <luke1337@theori.io>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
      Reviewed-by: Wei Liu <wl@xen.org>
  14. 01 Oct, 2020 1 commit
  15. 27 Aug, 2020 1 commit
  16. 11 Jun, 2020 2 commits
  17. 07 Apr, 2020 1 commit
  18. 02 Dec, 2019 1 commit
  19. 27 Sep, 2019 1 commit
  20. 17 Jul, 2019 1 commit
  21. 21 May, 2019 1 commit
  22. 17 Apr, 2019 1 commit
    • x86/irq/32: Invoke irq_ctx_init() from init_IRQ() · 451f743a
      Thomas Gleixner committed
      irq_ctx_init() is invoked from native_init_IRQ() or from xen_init_IRQ()
      code. There is no reason to have this split. The interrupt stacks must be
      allocated no matter what.
      
      Invoke it from init_IRQ() before invoking the native or XEN init
      implementation.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Josh Abraham <j.abraham1776@gmail.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Nicolai Stange <nstange@suse.de>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: x86-ml <x86@kernel.org>
      Cc: xen-devel@lists.xenproject.org
      Link: https://lkml.kernel.org/r/20190414160146.001162606@linutronix.de
  23. 17 Jan, 2019 1 commit
    • xen: Fix x86 sched_clock() interface for xen · 867cefb4
      Juergen Gross committed
      Commit f94c8d11 ("sched/clock, x86/tsc: Rework the x86 'unstable'
      sched_clock() interface") broke Xen guest time handling across
      migration:
      
      [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
      [  187.251137] OOM killer disabled.
      [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
      [  187.252299] suspending xenstore...
      [  187.266987] xen:grant_table: Grant tables using version 1 layout
      [18446743811.706476] OOM killer enabled.
      [18446743811.706478] Restarting tasks ... done.
      [18446743811.720505] Setting capacity to 16777216
      
      Fix that by setting xen_sched_clock_offset at resume time to ensure a
      monotonic clock value.
      
      [boris: replaced pr_info() with pr_info_once() in xen_callback_vector()
       to avoid printing with incorrect timestamp during resume (as we
       haven't re-adjusted the clock yet)]
      
      Fixes: f94c8d11 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
      Cc: <stable@vger.kernel.org> # 4.11
      Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Tested-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
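The huge timestamps in the log above are consistent with an unsigned 64-bit subtraction wrapping around when the raw clock on the new host is smaller than the old offset. The following sketch shows how re-deriving the offset at resume keeps the reported clock monotonic. All names (`clock_offset`, `sched_clock_ns()`, `save_clock()`, `restore_clock()`) and the nanosecond values are hypothetical; only the idea of resetting the offset at resume time comes from the commit message.

```c
#include <stdint.h>

/* Offset subtracted from the raw clock so that the reported time starts at 0. */
static uint64_t clock_offset;

/* Reported clock value kept across suspend/migration. */
static uint64_t clock_value_saved;

/* Reported scheduler clock: raw clock minus the current offset. */
static uint64_t sched_clock_ns(uint64_t raw)
{
    return raw - clock_offset;
}

/* Before suspend: remember what the reported clock showed. */
static void save_clock(uint64_t raw)
{
    clock_value_saved = sched_clock_ns(raw);
}

/* After resume: choose the offset so the reported clock continues from
 * the saved value, whatever the raw clock on the new host reads. */
static void restore_clock(uint64_t raw_after_resume)
{
    clock_offset = raw_after_resume - clock_value_saved;
}
```

Unsigned wrap-around in `restore_clock()` is intentional here: the same modular arithmetic cancels out again in `sched_clock_ns()`, so the reported value resumes exactly where it stopped.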
  24. 31 Oct, 2018 1 commit
  25. 14 Sep, 2018 1 commit
  26. 22 Jun, 2018 1 commit
  27. 01 Mar, 2018 1 commit