1. 02 Mar, 2017 1 commit
  2. 02 Dec, 2016 1 commit
    • ACPI / APEI: Fix NMI notification handling · a545715d
      Committed by Prarit Bhargava
      When removing and adding cpu 0 on a system with GHES NMI the following stack
      trace is seen when re-adding the cpu:
      
      WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1349 setup_local_APIC+
      Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache coretemp intel_ra
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ #2
      Call Trace:
       dump_stack+0x63/0x8e
       __warn+0xd1/0xf0
       warn_slowpath_null+0x1d/0x20
       setup_local_APIC+0x275/0x370
       apic_ap_setup+0xe/0x20
       start_secondary+0x48/0x180
       set_init_arg+0x55/0x55
       early_idt_handler_array+0x120/0x120
       x86_64_start_reservations+0x2a/0x2c
       x86_64_start_kernel+0x13d/0x14c
      
      During the cpu bringup, wakeup_cpu_via_init_nmi() is called and issues an
      NMI on CPU 0.  The GHES NMI handler, ghes_notify_nmi(), runs the
      ghes_proc_irq_work work queue, which ends up setting IRQ_WORK_VECTOR
      (0xf6).  The "faulty" IR line set at arch/x86/kernel/apic/apic.c:1349 is also
      0xf6 (specifically APIC IRR for irqs 255 to 224 is 0x400000) which confirms
      that something has set the IRQ_WORK_VECTOR line prior to the APIC being
      initialized.
      
      Commit 2383844d ("GHES: Elliminate double-loop in the NMI handler")
      incorrectly modified the behavior such that the handler returns
      NMI_HANDLED only if an error was processed, and incorrectly runs the ghes
      work queue for every NMI.
      
      This patch modifies the ghes_proc_irq_work() to run as it did prior to
      2383844d ("GHES: Elliminate double-loop in the NMI handler") by
      properly returning NMI_HANDLED and only calling the work queue if
      NMI_HANDLED has been set.
      
      Fixes: 2383844d (GHES: Elliminate double-loop in the NMI handler)
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
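The control flow described in this commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel's ghes_notify_nmi(): the `sketch_` names, the `struct sketch_ghes` type, and the `irq_work_queued` counter (standing in for a call to irq_work_queue()) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return codes modelled on the kernel's NMI_DONE / NMI_HANDLED convention. */
enum sketch_nmi_result { SKETCH_NMI_DONE = 0, SKETCH_NMI_HANDLED = 1 };

/* Hypothetical stand-in for one generic hardware error source. */
struct sketch_ghes {
    bool error_pending; /* firmware has posted an error record */
};

/* Counts how often the deferred work would have been queued (assumption). */
static int irq_work_queued;

/* Hypothetical helper: consume one pending record, report whether there was one. */
static bool sketch_process_error(struct sketch_ghes *g)
{
    if (!g->error_pending)
        return false;
    g->error_pending = false;
    return true;
}

/* Model of the handler: claim the NMI only if some source had an error. */
static enum sketch_nmi_result sketch_notify_nmi(struct sketch_ghes *sources, size_t n)
{
    enum sketch_nmi_result ret = SKETCH_NMI_DONE;

    assert(sources != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        if (sketch_process_error(&sources[i]))
            ret = SKETCH_NMI_HANDLED;

    /* Queue the follow-up work only when this NMI really carried an error. */
    if (ret == SKETCH_NMI_HANDLED)
        irq_work_queued++;

    return ret;
}
```

In this model, an NMI that belongs to someone else (for example the CPU wake-up NMI mentioned above) returns SKETCH_NMI_DONE and leaves `irq_work_queued` untouched, so no interrupt vector would be raised before the APIC is set up.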
  3. 24 Oct, 2016 1 commit
  4. 21 Sep, 2016 1 commit
  5. 10 Mar, 2016 1 commit
    • drivers/acpi: make apei/ghes.c more explicitly non-modular · 020bf066
      Committed by Paul Gortmaker
      The Kconfig currently controlling compilation of this code is:
      
      config ACPI_APEI_GHES
            bool "APEI Generic Hardware Error Source"
      
      ...meaning that it currently is not being built as a module by anyone.
      
      Let's remove the modular code that is essentially orphaned, so that
      when reading the driver there is no doubt it is builtin-only.
      
      Since module_init translates to device_initcall in the non-modular
      case, the init ordering remains unchanged with this commit.
      
      We replace module.h with moduleparam.h as we are keeping the
      pre-existing module_param that the file has, as currently that is
      the easiest way to maintain compatibility with the existing boot
      arg use cases.
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  6. 14 Sep, 2015 1 commit
    • acpi/apei: Use appropriate pgprot_t to map GHES memory · 8ece249a
      Committed by Jonathan (Zhixiong) Zhang
      If the ACPI APEI firmware handles a hardware error first (called
      "firmware first handling"), the firmware updates the GHES memory
      region with a hardware error record (called a "generic hardware
      error record"). Essentially, the firmware writes hardware error
      records to the GHES memory region and triggers an NMI/interrupt;
      the GHES driver then reads the error record from the GHES
      region.
      
      The kernel currently maps the GHES memory region as cacheable
      (PAGE_KERNEL) for all architectures. However, on some arm64
      platforms, there is a mismatch between how the kernel maps the
      GHES region (PAGE_KERNEL) and how the firmware maps it
      (EFI_MEMORY_UC, i.e. uncacheable), leading to the possibility of
      the kernel GHES driver reading stale data from the cache when it
      receives the interrupt.
      
      When stale data is read, the kernel is unaware that there is a new
      hardware error to be handled; this may lead to further damage in
      various scenarios, such as data corruption caused by error
      propagation. If an uncorrected error (such as a double-bit ECC
      error) occurs during a memory operation and the kernel is unaware
      of it, erroneous data may be propagated to the disk.
      
      Instead, the GHES memory region should be mapped with the page
      protection type returned by arch_apei_get_mem_attribute().
      Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      [ Small stylistic tweaks. ]
      Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
      Acked-by: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1441372302-23242-3-git-send-email-matt@codeblueprint.co.uk
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
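The idea of choosing the kernel mapping from the firmware's memory attribute can be sketched in plain C. This is a hedged illustration, not the real arch_apei_get_mem_attribute(): the `sketch_` names and the `enum sketch_prot` type are hypothetical, and the priority order between attributes is an assumption made for the example. Only the EFI_MEMORY_* bit values are taken from the UEFI specification.

```c
#include <assert.h>
#include <stdint.h>

/* UEFI memory attribute bits (values as defined by the UEFI specification). */
#define EFI_MEMORY_UC 0x1ULL /* uncacheable */
#define EFI_MEMORY_WC 0x2ULL /* write-combining */
#define EFI_MEMORY_WT 0x4ULL /* write-through */
#define EFI_MEMORY_WB 0x8ULL /* write-back */

/* Hypothetical stand-in for a pgprot_t value. */
enum sketch_prot {
    SKETCH_PROT_CACHED,
    SKETCH_PROT_WRITE_THROUGH,
    SKETCH_PROT_WRITE_COMBINE,
    SKETCH_PROT_UNCACHED
};

/*
 * Pick a mapping type that matches how firmware maps the region.
 * Assumption: the most cacheable attribute present wins, and anything
 * unknown falls back to the safest choice (uncached).
 */
static enum sketch_prot sketch_prot_for_attr(uint64_t efi_attr)
{
    if (efi_attr & EFI_MEMORY_WB)
        return SKETCH_PROT_CACHED;
    if (efi_attr & EFI_MEMORY_WT)
        return SKETCH_PROT_WRITE_THROUGH;
    if (efi_attr & EFI_MEMORY_WC)
        return SKETCH_PROT_WRITE_COMBINE;
    return SKETCH_PROT_UNCACHED;
}

/* Example check: a region firmware maps as UC must not be mapped cached. */
static void sketch_check_uc(void)
{
    assert(sketch_prot_for_attr(EFI_MEMORY_UC) == SKETCH_PROT_UNCACHED);
}
```

With a mapping chosen this way, a region that firmware treats as uncacheable is never read through the CPU cache, which is the stale-data case the commit message describes.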
  7. 08 Jul, 2015 1 commit
  8. 28 Apr, 2015 5 commits
  9. 22 Oct, 2014 2 commits
  10. 20 Oct, 2014 1 commit
  11. 23 Jul, 2014 3 commits
  12. 17 Jun, 2014 1 commit
    • ACPICA: Restore error table definitions to reduce code differences between... · 0a00fd5e
      Committed by Lv Zheng
      ACPICA: Restore error table definitions to reduce code differences between Linux and ACPICA upstream.
      
      The following commit has changed ACPICA table header definitions:
      
       Commit: 88f074f4
       Subject: ACPI, CPER: Update cper info
      
      Such definitions are currently maintained in ACPICA. Because
      modifications to the table definitions affect other OSPMs'
      drivers, it is very difficult for ACPICA to initiate a process to
      complete the merge. Thus that commit ultimately leaves us only with divergences.
      
      Revert such naming modifications to reduce the source code differences
      between Linux and ACPICA upstream. No functional changes.
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Cc: Bob Moore <robert.moore@intel.com>
      Cc: Chen, Gong <gong.chen@linux.intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  13. 21 Dec, 2013 2 commits
  14. 07 Dec, 2013 1 commit
  15. 24 Oct, 2013 1 commit
  16. 22 Oct, 2013 1 commit
  17. 11 Jul, 2013 1 commit
  18. 07 Jun, 2013 1 commit
  19. 05 Jun, 2013 1 commit
  20. 31 May, 2013 1 commit
  21. 26 Feb, 2013 1 commit
  22. 22 Feb, 2013 1 commit
  23. 29 Nov, 2012 1 commit
  24. 22 Nov, 2012 1 commit
  25. 12 Jun, 2012 1 commit
  26. 17 Jan, 2012 4 commits
    • ACPI APEI: Convert atomicio routines · 700130b4
      Committed by Myron Stowe
      APEI needs memory access in interrupt context.  The obvious choice is
      acpi_read(), but originally it couldn't be used in interrupt context
      because it makes temporary mappings with ioremap().  Therefore, we added
      drivers/acpi/atomicio.c, which provides:
          acpi_pre_map_gar()     -- ioremap in process context
          acpi_atomic_read()     -- memory access in interrupt context
          acpi_post_unmap_gar()  -- iounmap
      
      Later we added acpi_os_map_generic_address() (29718521) and enhanced
      acpi_read() so it works in interrupt context as long as the address has
      been previously mapped (620242ae).  Now this sequence:
          acpi_os_map_generic_address()    -- ioremap in process context
          acpi_read()/apei_read()          -- now OK in interrupt context
          acpi_os_unmap_generic_address()
      is equivalent to what atomicio.c provides.
      
      This patch introduces apei_read() and apei_write(), which currently are
      functional equivalents of acpi_read() and acpi_write().  This is mainly
      proactive, to prevent APEI breakages if acpi_read() and acpi_write()
      are ever augmented to support the 'bit_offset' field of GAS, as APEI's
      __apei_exec_write_register() precludes splitting up functionality
      related to 'bit_offset' and APEI's 'mask' (see its
      APEI_EXEC_PRESERVE_REGISTER block).
      
      With apei_read() and apei_write() in place, usages of atomicio routines
      are converted to apei_read()/apei_write() and existing calls within
      osl.c and the CA, based on the re-factoring that was done in an earlier
      patch series - http://marc.info/?l=linux-acpi&m=128769263327206&w=2:
          acpi_pre_map_gar()     -->  acpi_os_map_generic_address()
          acpi_post_unmap_gar()  -->  acpi_os_unmap_generic_address()
          acpi_atomic_read()     -->  apei_read()
          acpi_atomic_write()    -->  apei_write()
      
      Note that acpi_read() and acpi_write() currently use 'bit_width'
      for accessing GARs which seems incorrect.  'bit_width' is the size of
      the register, while 'access_width' is the size of the access the
      processor must generate on the bus.  The 'access_width' may be larger,
      for example, if the hardware only supports 32-bit or 64-bit reads.  I
      wanted to minimize any possible impacts with this patch series so I
      did *not* change this behavior.
      Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
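The distinction between 'bit_width' and 'access_width' drawn at the end of this commit message can be illustrated with a short sketch. It is not code from apei_read() or acpi_read(): `struct sketch_gas` and both helpers are hypothetical. The only fact relied on is the ACPI encoding of the access size field (1 = byte, 2 = word, 3 = dword, 4 = qword).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a Generic Address Structure. */
struct sketch_gas {
    uint8_t bit_width;    /* size of the register, in bits */
    uint8_t bit_offset;   /* offset of the register within the first access */
    uint8_t access_width; /* encoded access size: 1=byte .. 4=qword */
};

/* Decode the access size field into bits; 0 means undefined or invalid. */
static unsigned sketch_access_bits(uint8_t access_width)
{
    if (access_width < 1 || access_width > 4)
        return 0;
    return 1u << (access_width + 2); /* 1 -> 8, 2 -> 16, 3 -> 32, 4 -> 64 */
}

/* Number of bus accesses of the decoded size needed to cover the register. */
static unsigned sketch_access_count(const struct sketch_gas *gas)
{
    unsigned bits = sketch_access_bits(gas->access_width);

    assert(bits != 0); /* caller must pass a valid access size */
    return (gas->bit_offset + gas->bit_width + bits - 1u) / bits;
}
```

For example, an 8-bit register on hardware that only supports 32-bit reads has bit_width 8 and access_width 3: one 32-bit access covers it, even though the register itself is narrower than the access.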
    • ACPI, APEI, Printk queued error record before panic · 46d12f0b
      Committed by Huang Ying
      Because printk is not safe inside an NMI handler, recoverable error
      records received in the NMI handler are queued and printed later
      in IRQ context via irq_work.  If a fatal error occurs after the
      recoverable error and before the irq_work is processed, we lose an
      error report.
      
      To solve the issue, the queued error records are printed in the NMI
      handler if the system is about to panic.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, APEI, GHES, Distinguish interleaved error report in kernel log · 5ba82ab5
      Committed by Huang Ying
      In most cases, printk only guarantees that messages from different
      printk calls will not be interleaved with each other.  But one APEI
      GHES hardware error report involves multiple printk calls,
      normally one per line.  So it is possible that hardware error
      reports from different generic hardware error sources will be
      interleaved.
      
      In this patch, a sequence number is prefixed to each line of an error
      report, so that even if reports are interleaved, they can still be
      distinguished by the prefixed sequence number.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
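The sequence-number prefix described above can be sketched in user-space C. This is a minimal sketch under assumptions, not the GHES driver's code: the `sketch_` names are hypothetical, and the "{N}" prefix format is chosen here for illustration. One number is reserved per report and reused for every line of that report.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Shared counter; each error report reserves one value from it. */
static atomic_uint sketch_seqno;

/* Reserve the sequence number for one report (returns the previous value). */
static unsigned sketch_next_seqno(void)
{
    return atomic_fetch_add(&sketch_seqno, 1u);
}

/*
 * Format one line of a report with its "{seqno}" prefix.
 * Returns what snprintf() returns: the length the full line would need.
 */
static int sketch_format_line(char *buf, size_t len, unsigned seqno,
                              const char *text)
{
    assert(buf != NULL && text != NULL);
    return snprintf(buf, len, "{%u}%s", seqno, text);
}
```

A caller would fetch the number once with sketch_next_seqno() and pass it to sketch_format_line() for each line. Two reports printed concurrently then carry different prefixes, so their lines can be separated again even when they alternate in the log.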
    • ACPI, APEI, GHES: Add PCIe AER recovery support · a654e5ee
      Committed by Huang Ying
      aer_recover_queue() is called to do the recovery work when
      firmware reports recoverable PCIe AER errors.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
  27. 13 Jan, 2012 1 commit
  28. 10 Oct, 2011 1 commit
  29. 03 Aug, 2011 1 commit