1. 02 Dec, 2016 1 commit
    • ACPI / APEI: Fix NMI notification handling · a545715d
      Committed by Prarit Bhargava
      When removing and adding cpu 0 on a system with GHES NMI the following stack
      trace is seen when re-adding the cpu:
      
      WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1349 setup_local_APIC+
      Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache coretemp intel_ra
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ #2
      Call Trace:
       dump_stack+0x63/0x8e
       __warn+0xd1/0xf0
       warn_slowpath_null+0x1d/0x20
       setup_local_APIC+0x275/0x370
       apic_ap_setup+0xe/0x20
       start_secondary+0x48/0x180
       set_init_arg+0x55/0x55
       early_idt_handler_array+0x120/0x120
       x86_64_start_reservations+0x2a/0x2c
       x86_64_start_kernel+0x13d/0x14c
      
      During the cpu bringup, wakeup_cpu_via_init_nmi() is called and issues an
      NMI on CPU 0.  The GHES NMI handler, ghes_notify_nmi() runs the
      ghes_proc_irq_work work queue which ends up setting IRQ_WORK_VECTOR
      (0xf6).  The "faulty" IR line set at arch/x86/kernel/apic/apic.c:1349 is also
      0xf6 (specifically APIC IRR for irqs 255 to 224 is 0x400000) which confirms
      that something has set the IRQ_WORK_VECTOR line prior to the APIC being
      initialized.
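
The arithmetic behind that observation can be checked with a short sketch. It assumes, as the message states, that the IRR word in question covers vectors 224 to 255, and that each IRR word covers 32 consecutive vectors. The helper name irr_highest_vector() is hypothetical and is not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: each 32-bit IRR word covers 32 consecutive vectors,
 * so word 7 covers vectors 224..255. */
#define IRR_BITS_PER_WORD 32

/* Hypothetical helper: return the highest pending vector recorded in one
 * IRR word, or -1 if no bit is set. */
int irr_highest_vector(unsigned int word_index, uint32_t irr_word)
{
    assert(word_index < 8);
    for (int bit = IRR_BITS_PER_WORD - 1; bit >= 0; bit--) {
        if (irr_word & (UINT32_C(1) << bit))
            return (int)(word_index * IRR_BITS_PER_WORD) + bit;
    }
    return -1;
}
```

With the value from the message, 0x400000 is bit 22, and 224 + 22 = 246 = 0xf6, which matches IRQ_WORK_VECTOR.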
      
      Commit 2383844d ("GHES: Elliminate double-loop in the NMI handler")
      incorrectly modified the behavior such that the handler returns
      NMI_HANDLED only if an error was processed, and incorrectly runs the ghes
      work queue for every NMI.
      
      This patch modifies the ghes_proc_irq_work() to run as it did prior to
      2383844d ("GHES: Elliminate double-loop in the NMI handler") by
      properly returning NMI_HANDLED and only calling the work queue if
      NMI_HANDLED has been set.
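
A minimal sketch of the control flow described above, written as a user-space model rather than kernel code. The struct fake_ghes type, the irq_work_queued counter and sketch_ghes_notify_nmi() are hypothetical stand-ins; only the rule "queue the work only when an error was handled" comes from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's NMI return codes (values assumed here). */
enum { NMI_DONE = 0, NMI_HANDLED = 1 };

/* Hypothetical model of one error source: does it have a pending error? */
struct fake_ghes {
    bool has_error;
};

/* Counts how often the deferred work would have been queued. */
int irq_work_queued;

/* Model of the fixed handler: report NMI_HANDLED only if some source had
 * an error, and queue the deferred work only in that case. */
int sketch_ghes_notify_nmi(struct fake_ghes *sources, size_t n)
{
    int ret = NMI_DONE;

    assert(sources != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (sources[i].has_error) {
            sources[i].has_error = false; /* error consumed */
            ret = NMI_HANDLED;
        }
    }
    if (ret == NMI_HANDLED)
        irq_work_queued++; /* the real code would queue irq work here */
    return ret;
}
```

In this model an NMI that finds no pending error, such as the wakeup NMI sent during CPU bringup, returns NMI_DONE and leaves the work queue untouched.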
      
      Fixes: 2383844d ("GHES: Elliminate double-loop in the NMI handler")
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  2. 24 Oct, 2016 1 commit
  3. 21 Sep, 2016 1 commit
  4. 09 Sep, 2016 1 commit
    • pstore: Split pstore fragile flags · c950fd6f
      Committed by Namhyung Kim
      This patch adds new PSTORE_FLAGS for each pstore type so that they can
      be enabled separately.  This is a preparation for ongoing virtio-pstore
      work to support those types flexibly.
      
      The PSTORE_FLAGS_FRAGILE is changed to PSTORE_FLAGS_DMESG to preserve the
      original behavior.
      
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Colin Cross <ccross@android.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: linux-acpi@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      [kees: retained "FRAGILE" for now to make merges easier]
      Signed-off-by: Kees Cook <keescook@chromium.org>
  5. 30 Jun, 2016 1 commit
    • ACPI / APEI: Add Boot Error Record Table (BERT) support · a3e2acc5
      Committed by Huang Ying
      ACPI/APEI is designed to verify and report hardware errors, such as
      Corrected Errors (CE) and Uncorrected Errors (UC). It defines four
      tables: HEST, ERST, EINJ and BERT. Support for the first three was
      merged a long time ago, but BERT support has been pending until now
      for lack of BIOS support. BERT has recently become supported on the
      ARM64 platform, so here it is.
      
      Under normal circumstances, when a hardware error occurs, the kernel
      is notified via NMI, MCE or some other method; the kernel then
      processes the error condition, reports it, and recovers from it if
      possible. But sometimes the situation is so bad that firmware may
      choose to reset directly without notifying the Linux kernel.
      
      Linux kernel can use the Boot Error Record Table (BERT) to get the
      un-notified hardware errors that occurred in a previous boot. In this
      patch, the error information is reported via printk.
      
      For more information about BERT, please refer to ACPI Specification
      version 6.0, section 18.3.1:
        http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
      
      The following log shows a BERT record reported after the system
      rebooted because of a fatal memory error:
      BERT: Error records from previous boot:
      [Hardware Error]: It has been corrected by h/w and requires no further action
      [Hardware Error]: event severity: corrected
      [Hardware Error]:  Error 0, type: recoverable
      [Hardware Error]:   section_type: memory error
      [Hardware Error]:   error_status: 0x0000000000000400
      [Hardware Error]:   physical_address: 0xffffffffffffffff
      [Hardware Error]:   card: 1 module: 2 bank: 3 row: 1 column: 2 bit_position: 5
      [Hardware Error]:   error_type: 2, single-bit ECC
      
      [Tomasz Nowicki: Clear error status at the end of error handling]
      [Tony: Applied some cleanups suggested by Fu Wei]
      [Fu Wei: delete EXPORT_SYMBOL_GPL(bert_disable), improve the code]
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
      Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
      Tested-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Signed-off-by: Fu Wei <fu.wei@linaro.org>
      Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  6. 24 Jun, 2016 2 commits
  7. 03 Jun, 2016 1 commit
    • pstore: add lzo/lz4 compression support · 8cfc8ddc
      Committed by Geliang Tang
      Like zlib compression in pstore, this patch adds lzo and lz4
      compression support so that users have more options and can get a
      better compression ratio.
      
      The original code passes the compressed data together with the
      uncompressed ECC correction notice to zlib decompression, so the
      ECC correction notice is lost in the decompression process. That
      treatment also prevents lzo and lz4 from working. So handle them
      separately: use pstore_decompress() for the compressed data and
      memcpy() for the uncompressed ECC correction notice.
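
The split handling can be illustrated with a small user-space sketch. It assumes a record laid out as compressed data followed by an uncompressed ECC notice of known length. toy_decompress() is a hypothetical run-length decoder standing in for pstore_decompress(), and unpack_record() is likewise an illustrative name; neither reflects the kernel's actual signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a real decompressor: decodes (count, byte)
 * pairs. Returns the number of bytes written, or -1 on error. */
int toy_decompress(const unsigned char *in, size_t in_len,
                   char *out, size_t out_cap)
{
    size_t o = 0;

    if (in_len % 2)
        return -1;
    for (size_t i = 0; i < in_len; i += 2) {
        size_t run = in[i];
        if (o + run > out_cap)
            return -1;
        memset(out + o, in[i + 1], run);
        o += run;
    }
    return (int)o;
}

/* Unpack a record laid out as [compressed data][uncompressed ECC notice]:
 * decompress only the first part, then append the notice verbatim. */
int unpack_record(const unsigned char *rec, size_t rec_len, size_t ecc_len,
                  char *out, size_t out_cap)
{
    int n;

    assert(rec != NULL && out != NULL);
    if (ecc_len > rec_len)
        return -1;
    n = toy_decompress(rec, rec_len - ecc_len, out, out_cap);
    if (n < 0 || (size_t)n + ecc_len > out_cap)
        return -1;
    memcpy(out + n, rec + rec_len - ecc_len, ecc_len);
    return n + (int)ecc_len;
}
```

Because the notice never passes through the decompressor, the same unpacking logic works regardless of which compression backend produced the first part.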
      Signed-off-by: Geliang Tang <geliangtang@163.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
  8. 11 Mar, 2016 2 commits
  9. 10 Mar, 2016 1 commit
    • drivers/acpi: make apei/ghes.c more explicitly non-modular · 020bf066
      Committed by Paul Gortmaker
      The Kconfig currently controlling compilation of this code is:
      
      config ACPI_APEI_GHES
            bool "APEI Generic Hardware Error Source"
      
      ...meaning that it currently is not being built as a module by anyone.
      
      Let's remove the modular code that is essentially orphaned, so that
      when reading the driver there is no doubt it is builtin-only.
      
      Since module_init translates to device_initcall in the non-modular
      case, the init ordering remains unchanged with this commit.
      
      We replace module.h with moduleparam.h as we are keeping the
      pre-existing module_param that the file has, as currently that is
      the easiest way to maintain compatibility with the existing boot
      arg use cases.
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  10. 30 Jan, 2016 1 commit
    • ACPI/EINJ: Allow memory error injection to NVDIMM · 4650bac1
      Committed by Toshi Kani
      In the case of memory error injection, einj_error_inject()
      checks if a target address is System RAM. Change this check to
      allow injecting a memory error into NVDIMM memory by calling
      region_intersects() with IORES_DESC_PERSISTENT_MEMORY. This
      enables memory error testing on both System RAM and NVDIMM.
      
      In addition, page_is_ram() is replaced with region_intersects()
      with IORESOURCE_SYSTEM_RAM, so that it can verify a target
      address range with the requested size.
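
A simplified user-space model of that range check follows. It is only a sketch: the struct range table, the enum names and both functions are hypothetical, and the classification is a reduced version of what a resource-tree lookup would do. The point it illustrates is that a whole [start, start + size) range is validated against either memory type, instead of a single page being tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory map entry; 'end' is inclusive. */
enum kind { KIND_RAM, KIND_PMEM, KIND_OTHER };
struct range {
    uint64_t start, end;
    enum kind kind;
};

enum verdict { DISJOINT, INTERSECTS, MIXED };

/* Classify [start, start + size) against entries of type 'want':
 * DISJOINT if it touches none, MIXED if it also touches other types. */
enum verdict classify(const struct range *map, size_t n,
                      uint64_t start, uint64_t size, enum kind want)
{
    uint64_t end = start + size - 1;
    int hit = 0, other = 0;

    assert(size > 0);
    for (size_t i = 0; i < n; i++) {
        if (map[i].start <= end && map[i].end >= start) {
            if (map[i].kind == want)
                hit++;
            else
                other++;
        }
    }
    if (!hit)
        return DISJOINT;
    return other ? MIXED : INTERSECTS;
}

/* Accept the target only if the whole range is System RAM or the whole
 * range is persistent memory. */
bool injection_target_ok(const struct range *map, size_t n,
                         uint64_t start, uint64_t size)
{
    return classify(map, n, start, size, KIND_RAM) == INTERSECTS ||
           classify(map, n, start, size, KIND_PMEM) == INTERSECTS;
}
```

In this model a range that straddles two memory types is rejected, which a per-page test on the start address alone could not detect.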
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: linux-acpi@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mm <linux-mm@kvack.org>
      Cc: linux-nvdimm@lists.01.org
      Link: http://lkml.kernel.org/r/1453841853-11383-18-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  11. 23 Jan, 2016 1 commit
  12. 14 Sep, 2015 1 commit
    • acpi/apei: Use appropriate pgprot_t to map GHES memory · 8ece249a
      Committed by Jonathan (Zhixiong) Zhang
      If the ACPI APEI firmware handles hardware error first (called
      "firmware first handling"), the firmware updates the GHES memory
      region with hardware error record (called "generic hardware
      error record"). Essentially the firmware writes hardware error
      records in the GHES memory region, triggers an NMI/interrupt,
      then the GHES driver goes off and grabs the error record from
      the GHES region.
      
      The kernel currently maps the GHES memory region as cacheable
      (PAGE_KERNEL) for all architectures. However, on some arm64
      platforms, there is a mismatch between how the kernel maps the
      GHES region (PAGE_KERNEL) and how the firmware maps it
      (EFI_MEMORY_UC, ie. uncacheable), leading to the possibility of
      the kernel GHES driver reading stale data from the cache when it
      receives the interrupt.
      
      With stale data being read, the kernel is unaware that there is a
      new hardware error to handle when there actually is one; this may
      lead to further damage in various scenarios, such as data corruption
      caused by error propagation. If an uncorrected error (such as a
      double-bit ECC error) happens during a memory operation and the
      kernel is unaware of it, erroneous data may be propagated to the
      disk.
      
      Instead, the GHES memory region should be mapped with the page
      protection type returned by arch_apei_get_mem_attribute().
      Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      [ Small stylistic tweaks. ]
      Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
      Acked-by: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1441372302-23242-3-git-send-email-matt@codeblueprint.co.uk
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  13. 08 Jul, 2015 1 commit
  14. 03 Jun, 2015 1 commit
    • x86/mm: Decouple <linux/vmalloc.h> from <asm/io.h> · d6472302
      Committed by Stephen Rothwell
      Nothing in <asm/io.h> uses anything from <linux/vmalloc.h>, so
      remove it from there and fix up the resulting build problems
      triggered on x86 {64|32}-bit {def|allmod|allno}configs.
      
      The breakages were triggering in places where x86 builds relied
      on vmalloc() facilities but did not include <linux/vmalloc.h>
      explicitly and relied on the implicit inclusion via <asm/io.h>.
      
      Also add:
      
        - <linux/init.h> to <linux/io.h>
        - <asm/pgtable_types.h> to <asm/io.h>
      
      ... which were two other implicit header file dependencies.
      Suggested-by: David Miller <davem@davemloft.net>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      [ Tidied up the changelog. ]
      Acked-by: David Miller <davem@davemloft.net>
      Acked-by: Takashi Iwai <tiwai@suse.de>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Acked-by: Vinod Koul <vinod.koul@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Colin Cross <ccross@android.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: James E.J. Bottomley <JBottomley@odin.com>
      Cc: Jaroslav Kysela <perex@perex.cz>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Suma Ramars <sramars@cisco.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  15. 28 Apr, 2015 5 commits
  16. 16 Dec, 2014 1 commit
    • ACPI, EINJ: Enhance error injection tolerance level · d91525eb
      Committed by Chen, Gong
      Some BIOSes use PCI MMCFG space read/write operations to trigger
      specific errors. EINJ reports errors like the one below when it hits
      such cases:
      
      APEI: Can not request [mem 0x83f990a0-0x83f990a3] for APEI EINJ Trigger registers
      
      This is because, on the x86 platform, the ACPI-based PCI MMCFG logic
      has reserved all MMCFG space, so EINJ can't reserve it again.
      We already trust the ACPI/APEI code when using the EINJ interface
      so it is not a big leap to also trust it to access the right
      MMCFG addresses. Skip address checking to allow the access.
      Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  17. 22 Oct, 2014 2 commits
  18. 20 Oct, 2014 1 commit
  19. 23 Jul, 2014 3 commits
  20. 17 Jun, 2014 1 commit
    • ACPICA: Restore error table definitions to reduce code differences between Linux and ACPICA upstream. · 0a00fd5e
      Committed by Lv Zheng
      The following commit has changed ACPICA table header definitions:
      
       Commit: 88f074f4
       Subject: ACPI, CPER: Update cper info
      
      These definitions are currently maintained in ACPICA. Because
      modifications to the table definitions affect the drivers of other
      OSPMs, it is very difficult for ACPICA to initiate a process to
      complete the merge, so that commit ends up only leaving divergences.
      
      Revert those naming modifications to reduce the source code differences
      between Linux and ACPICA upstream. No functional changes.
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Cc: Bob Moore <robert.moore@intel.com>
      Cc: Chen, Gong <gong.chen@linux.intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  21. 28 May, 2014 1 commit
    • ACPI: Clean up acpi_os_map/unmap_memory() to eliminate __iomem. · a238317c
      Committed by Lv Zheng
      ACPICA doesn't include protections around address space checking, so
      Linux build tests always complain about increased sparse warnings around
      ACPICA-internal acpi_os_map/unmap_memory() invocations.  This patch tries
      to fix this issue permanently.
      
      There are 2 choices left for us to solve this issue:
       1. Add __iomem address space awareness into ACPICA.
       2. Remove sparse checker of __iomem from ACPICA source code.
      
      This patch chooses solution 2, because:
       1.  Most of the acpi_os_map/unmap_memory() invocations are used for ACPICA
           table mappings, which in fact are not IO addresses.
       2.  The only IO address usage is in the "system memory space" mapping code in:
            drivers/acpi/acpica/exregion.c
            drivers/acpi/acpica/evrgnini.c
            drivers/acpi/acpica/exregion.c
          The mapped address is accessed in the handler of "system memory space"
          - acpi_ex_system_memory_space_handler().  This function in fact can be
          changed to invoke acpi_os_read/write_memory() so that __iomem can
          always be type-casted in the OSL layer.
      
      According to the above investigation, we drew the following conclusion:
      It is not a good idea to introduce __iomem address space awareness into
      ACPICA mostly in order to protect non-IO addresses.
      
      We can simply remove __iomem for acpi_os_map/unmap_memory() to remove
      __iomem checker for ACPICA code. Then we need to enforce external usages
      to invoke other APIs that are aware of __iomem address space.
      The external usages are:
       drivers/acpi/apei/einj.c
       drivers/acpi/acpi_extlog.c
       drivers/char/tpm/tpm_acpi.c
       drivers/acpi/nvs.c
      
      This patch thus performs cleanups in this way:
       1. Add acpi_os_map/unmap_iomem() to be invoked by non-ACPICA code.
       2. Remove __iomem from acpi_os_map/unmap_memory().
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  22. 18 Feb, 2014 1 commit
  23. 21 Dec, 2013 4 commits
  24. 20 Dec, 2013 1 commit
  25. 18 Dec, 2013 1 commit
    • ACPI, APEI, EINJ: Changes to the ACPI/APEI/EINJ debugfs interface · 3482fb5e
      Committed by Luck, Tony
      When I added support for ACPI5 I made the assumption that
      injected processor errors would just need to know the APICID,
      memory errors just the address and mask, and PCIe errors just the
      segment/bus/device/function. So I had the code check the type of injection
      and multiplex the "param1" value appropriately.
      
      This was not a good assumption :-(
      
      There are injection scenarios where we need to specify more than one of
      these items. E.g. injecting a cache error we need to specify an APICID
      of the cpu that owns the cache, and also an address (so that we can trip
      the error by accessing the address).
      
      Add a "flags" file to give the user direct access to specify which items
      are valid in the ACPI SET_ERROR_TYPE_WITH_ADDRESS structure. Also add
      new files param3 and param4 to hold all these values.
      
      For backwards compatibility with old injection scripts we maintain the
      old behaviour if flags remains set at zero (or is reset to 0).
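
The two modes can be sketched as follows. This is an illustrative model, not the driver's code: the flag values, struct inj_request, the enum names and build_request() are all hypothetical, and the assignment of param3 to the APIC ID and param4 to the PCIe segment/bus/device/function is an assumption the message does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "which items are valid" bits. */
enum { F_APICID = 1, F_MEM = 2, F_PCIE_SBDF = 4 };

/* Hypothetical coarse classes of injected error. */
enum inj_class { INJ_PROCESSOR, INJ_MEMORY, INJ_PCIE };

struct inj_request {
    uint32_t flags;
    uint64_t mem_addr, mem_mask;
    uint32_t apicid, pcie_sbdf;
};

/* flags != 0: the user states which fields are valid and every parameter
 * has a fixed meaning. flags == 0: legacy mode, param1 is multiplexed
 * according to the error class. */
struct inj_request build_request(enum inj_class cls, uint32_t flags,
                                 uint64_t p1, uint64_t p2,
                                 uint64_t p3, uint64_t p4)
{
    struct inj_request r = { 0 };

    assert((flags & ~(uint32_t)(F_APICID | F_MEM | F_PCIE_SBDF)) == 0);
    if (flags) {
        r.flags = flags;
        r.mem_addr = p1;
        r.mem_mask = p2;
        r.apicid = (uint32_t)p3;    /* assumed mapping */
        r.pcie_sbdf = (uint32_t)p4; /* assumed mapping */
        return r;
    }
    switch (cls) {
    case INJ_PROCESSOR:
        r.flags = F_APICID;
        r.apicid = (uint32_t)p1;
        break;
    case INJ_MEMORY:
        r.flags = F_MEM;
        r.mem_addr = p1;
        r.mem_mask = p2;
        break;
    case INJ_PCIE:
        r.flags = F_PCIE_SBDF;
        r.pcie_sbdf = (uint32_t)p1;
        break;
    }
    return r;
}
```

The cache-error scenario from the message corresponds to passing F_APICID | F_MEM, which supplies both an APIC ID and an address in one request, something the legacy multiplexing cannot express.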
      Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  26. 07 Dec, 2013 2 commits
    • ACPI / i915: Fix incorrect <acpi/acpi.h> inclusions via <linux/acpi_io.h> · 27d50c82
      Committed by Lv Zheng
      To avoid build problems and breaking dependencies between ACPI header
      files, <acpi/acpi.h> should not be included directly by code outside
      of the ACPI core subsystem.  However, that is possible if
      <linux/acpi_io.h> is included, because that file contains
      a direct inclusion of <acpi/acpi.h>.
      
      For this reason, remove the direct <acpi/acpi.h> inclusion from
      <linux/acpi_io.h>, move that file from include/linux/ to include/acpi/
      and make <linux/acpi.h> include it for CONFIG_ACPI set along with the
      other ACPI header files.  Accordingly, remove the inclusions of
      <linux/acpi_io.h> from everywhere.
      
      Of course, that causes the contents of the new <acpi/acpi_io.h> file
      to be available for CONFIG_ACPI set only, so intel_opregion.o that
      depends on it should also depend on CONFIG_ACPI (and it really should
      not be compiled for CONFIG_ACPI unset anyway).
      
      References: https://01.org/linuxgraphics/sites/default/files/documentation/acpi_igd_opregion_spec.pdf
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      [rjw: Subject and changelog]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • ACPI: Clean up inclusions of ACPI header files · 8b48463f
      Committed by Lv Zheng
      Replace direct inclusions of <acpi/acpi.h>, <acpi/acpi_bus.h> and
      <acpi/acpi_drivers.h>, which are incorrect, with <linux/acpi.h>
      inclusions and remove some inclusions of those files that aren't
      necessary.
      
      First of all, <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h>
      should not be included directly from any files that are built for
      CONFIG_ACPI unset, because that generally leads to build warnings about
      undefined symbols in !CONFIG_ACPI builds.  For CONFIG_ACPI set,
      <linux/acpi.h> includes those files and for CONFIG_ACPI unset it
      provides stub ACPI symbols to be used in that case.
      
      Second, there are ordering dependencies between those files that always
      have to be met.  Namely, it is required that <acpi/acpi_bus.h> be included
      prior to <acpi/acpi_drivers.h> so that the acpi_pci_root declarations the
      latter depends on are always there.  And <acpi/acpi.h> which provides
      basic ACPICA type declarations should always be included prior to any other
      ACPI headers in CONFIG_ACPI builds.  That is also taken care of by
      including <linux/acpi.h> as appropriate.
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci stuff)
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> (Xen stuff)
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  27. 01 Nov, 2013 1 commit