1. 05 June 2020, 2 commits
  2. 20 May 2020, 1 commit
      ACPI: APEI: Kick the memory_failure() queue for synchronous errors · 7f17b4a1
      Committed by James Morse
      memory_failure() offlines or repairs pages of memory that have been
      discovered to be corrupt. These may be detected by an external
      component (e.g. the memory controller) and notified via an IRQ.
      In this case the work is queued, as not all of memory_failure()'s work
      can happen in IRQ context.
      
      If the error was detected as a result of user-space accessing a
      corrupt memory location, the CPU may take an abort instead. On arm64
      this is a 'synchronous external abort', and on a firmware-first
      system it is replayed using NOTIFY_SEA.
      
      This notification has NMI-like properties (it can interrupt
      IRQ-masked code), so the memory_failure() work is queued. If we
      return to user-space before the queued memory_failure() work is
      processed, we will take the fault again. This loop may cause platform
      firmware to exceed some threshold and reboot when Linux could have
      recovered from this error.
      
      For NMI-like notifications, keep track of whether memory_failure() work
      was queued, and make task_work pending to flush out the queue.
      To save memory allocations, the task_work is allocated as part of
      the ghes_estatus_node, and free()ing it back to the pool is deferred
      (see the sketch after this entry).
      Signed-off-by: James Morse <james.morse@arm.com>
      Tested-by: Tyler Baicar <baicar@os.amperecomputing.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
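      A minimal sketch of the flow described above, as it might look inside
      drivers/acpi/apei/ghes.c. It assumes the task_work/task_work_cpu fields this
      change adds to struct ghes_estatus_node and the memory_failure_queue_kick()
      helper from the same series; function names and the task_work_add() notify
      argument follow the 5.8-era API and are illustrative, not verbatim mainline code:

      #include <linux/task_work.h>
      #include <linux/genalloc.h>
      #include <linux/mm.h>
      #include <acpi/ghes.h>

      /* Runs in task context, just before the faulting task returns to user-space. */
      static void ghes_kick_task_work(struct callback_head *head)
      {
              struct ghes_estatus_node *node =
                      container_of(head, struct ghes_estatus_node, task_work);
              u32 len = cper_estatus_len(GHES_ESTATUS_FROM_NODE(node));

              /* Drain the memory_failure() work queued for this CPU ... */
              memory_failure_queue_kick(node->task_work_cpu);

              /* ... then do the deferred free of the estatus node back to the pool. */
              gen_pool_free(ghes_estatus_pool, (unsigned long)node,
                            GHES_ESTATUS_NODE_LEN(len));
      }

      /* Called from the NMI-like (e.g. NOTIFY_SEA) path once memory_failure()
       * work has been queued, so the queue is flushed before user-space can
       * take the same fault again. */
      static void ghes_flag_task_work(struct ghes_estatus_node *node)
      {
              node->task_work.func = ghes_kick_task_work;
              node->task_work_cpu = smp_processor_id();

              /* task_work_add() fails only if the task is exiting; fall back
               * to the normal deferred processing in that case. */
              if (task_work_add(current, &node->task_work, true))
                      node->task_work.func = NULL;
      }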
  3. 09 May 2020, 2 commits
  4. 04 April 2020, 1 commit
      x86: ACPI: fix CPU hotplug deadlock · 696ac2e3
      Committed by Qian Cai
      Similar to commit 0266d81e ("acpi/processor: Prevent cpu hotplug
      deadlock") except this is for acpi_processor_ffh_cstate_probe():
      
      "The problem is that the work is scheduled on the current CPU from the
      hotplug thread associated with that CPU.
      
      It's not required to invoke these functions via the workqueue because
      the hotplug thread runs on the target CPU already.
      
      Check whether current is a per cpu thread pinned on the target CPU and
      invoke the function directly to avoid the workqueue." A sketch of this
      check follows the lockdep report below.
      
       WARNING: possible circular locking dependency detected
       ------------------------------------------------------
       cpuhp/1/15 is trying to acquire lock:
       ffffc90003447a28 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: __flush_work+0x4c6/0x630
      
       but task is already holding lock:
       ffffffffafa1c0e8 (cpuidle_lock){+.+.}-{3:3}, at: cpuidle_pause_and_lock+0x17/0x20
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (cpu_hotplug_lock){++++}-{0:0}:
       cpus_read_lock+0x3e/0xc0
       irq_calc_affinity_vectors+0x5f/0x91
       __pci_enable_msix_range+0x10f/0x9a0
       pci_alloc_irq_vectors_affinity+0x13e/0x1f0
       pci_alloc_irq_vectors_affinity at drivers/pci/msi.c:1208
       pqi_ctrl_init+0x72f/0x1618 [smartpqi]
       pqi_pci_probe.cold.63+0x882/0x892 [smartpqi]
       local_pci_probe+0x7a/0xc0
       work_for_cpu_fn+0x2e/0x50
       process_one_work+0x57e/0xb90
       worker_thread+0x363/0x5b0
       kthread+0x1f4/0x220
       ret_from_fork+0x27/0x50
      
       -> #0 ((work_completion)(&wfc.work)){+.+.}-{0:0}:
       __lock_acquire+0x2244/0x32a0
       lock_acquire+0x1a2/0x680
       __flush_work+0x4e6/0x630
       work_on_cpu+0x114/0x160
       acpi_processor_ffh_cstate_probe+0x129/0x250
       acpi_processor_evaluate_cst+0x4c8/0x580
       acpi_processor_get_power_info+0x86/0x740
       acpi_processor_hotplug+0xc3/0x140
       acpi_soft_cpu_online+0x102/0x1d0
       cpuhp_invoke_callback+0x197/0x1120
       cpuhp_thread_fun+0x252/0x2f0
       smpboot_thread_fn+0x255/0x440
       kthread+0x1f4/0x220
       ret_from_fork+0x27/0x50
      
       other info that might help us debug this:
      
       Chain exists of:
       (work_completion)(&wfc.work) --> cpuhp_state-up --> cpuidle_lock
      
       Possible unsafe locking scenario:
      
       CPU0                    CPU1
       ----                    ----
       lock(cpuidle_lock);
                               lock(cpuhp_state-up);
                               lock(cpuidle_lock);
       lock((work_completion)(&wfc.work));
      
       *** DEADLOCK ***
      
       3 locks held by cpuhp/1/15:
       #0: ffffffffaf51ab10 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x69/0x2f0
       #1: ffffffffaf51ad40 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x69/0x2f0
       #2: ffffffffafa1c0e8 (cpuidle_lock){+.+.}-{3:3}, at: cpuidle_pause_and_lock+0x17/0x20
      
       Call Trace:
       dump_stack+0xa0/0xea
       print_circular_bug.cold.52+0x147/0x14c
       check_noncircular+0x295/0x2d0
       __lock_acquire+0x2244/0x32a0
       lock_acquire+0x1a2/0x680
       __flush_work+0x4e6/0x630
       work_on_cpu+0x114/0x160
       acpi_processor_ffh_cstate_probe+0x129/0x250
       acpi_processor_evaluate_cst+0x4c8/0x580
       acpi_processor_get_power_info+0x86/0x740
       acpi_processor_hotplug+0xc3/0x140
       acpi_soft_cpu_online+0x102/0x1d0
       cpuhp_invoke_callback+0x197/0x1120
       cpuhp_thread_fun+0x252/0x2f0
       smpboot_thread_fn+0x255/0x440
       kthread+0x1f4/0x220
       ret_from_fork+0x27/0x50
      Signed-off-by: Qian Cai <cai@lca.pw>
      Tested-by: Borislav Petkov <bp@suse.de>
      [ rjw: Subject ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
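      A sketch of the check quoted above, roughly the shape of the helper the
      earlier acpi/processor fix introduced. Treat the helper body and the way
      acpi_processor_ffh_cstate_probe() would use it as illustrative rather than
      the exact mainline code:

      #include <linux/sched.h>      /* is_percpu_thread() */
      #include <linux/workqueue.h>  /* work_on_cpu() */

      /* Run fn(arg) on @cpu. If the caller is already a per-CPU kthread pinned
       * to @cpu (e.g. the cpuhp/%u hotplug thread), call it directly instead of
       * flushing a work item, which would deadlock under cpu_hotplug_lock. */
      static long call_on_cpu(int cpu, long (*fn)(void *), void *arg, bool direct)
      {
              if (direct || (is_percpu_thread() && cpu == smp_processor_id()))
                      return fn(arg);
              return work_on_cpu(cpu, fn, arg);
      }

      /* acpi_processor_ffh_cstate_probe() then replaces its unconditional
       * work_on_cpu() call along these lines:
       *     retval = call_on_cpu(cpu, acpi_processor_ffh_cstate_probe_cpu, cx, false);
       */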
  5. 30 March 2020, 7 commits
  6. 25 March 2020, 1 commit
  7. 21 March 2020, 1 commit
  8. 22 February 2020, 1 commit
  9. 16 February 2020, 1 commit
  10. 14 February 2020, 2 commits
  11. 12 February 2020, 1 commit
  12. 13 January 2020, 2 commits
  13. 20 December 2019, 2 commits
  14. 29 October 2019, 3 commits
  15. 28 October 2019, 1 commit
  16. 21 October 2019, 1 commit
  17. 15 October 2019, 2 commits
  18. 28 August 2019, 1 commit
  19. 21 August 2019, 6 commits
  20. 23 July 2019, 1 commit
  21. 06 July 2019, 1 commit
      ACPI: PM: Make acpi_sleep_state_supported() non-static · ad5a449b
      Committed by Dexuan Cui
      With some upcoming patches to save/restore the Hyper-V driver-related
      states, a Linux VM running on Hyper-V will be able to hibernate. When
      a Linux VM hibernates, we unfortunately must disable the memory hot-add/remove
      and balloon up/down capabilities in the hv_balloon driver
      (drivers/hv/hv_balloon.c), because these cannot really work, given the
      design of the related back-end driver on the host.
      
      By default, Hyper-V does not enable the virtual ACPI S4 state for a VM;
      on recent Hyper-V hosts, the administrator can enable it, so we hope to
      use the presence of the virtual ACPI S4 state as a hint for hv_balloon
      to disable the aforementioned capabilities. In this way, hibernation
      will work more reliably from the user's perspective.
      
      By marking acpi_sleep_state_supported() non-static, we'll be able to
      implement a hv_is_hibernation_supported() API in the always-built-in
      module arch/x86/hyperv/hv_init.c, and the API will be called by hv_balloon
      (sketched after this entry).
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
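      A minimal sketch of the consumer this change enables. The Hyper-V side
      landed in separate patches, so treat the function body below as an
      assumption about how hv_init.c could use the now non-static helper:

      #include <linux/acpi.h>
      #include <linux/export.h>

      /* In the always-built-in arch/x86/hyperv/hv_init.c: report whether the
       * virtual ACPI S4 state is present, so hv_balloon can disable memory
       * hot-add and ballooning when the VM may hibernate. */
      bool hv_is_hibernation_supported(void)
      {
              return acpi_sleep_state_supported(ACPI_STATE_S4);
      }
      EXPORT_SYMBOL_GPL(hv_is_hibernation_supported);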