1. 14 Jul, 2011 1 commit
    • ACPI, APEI, Add apei_exec_run_optional · eecf2f71
      Huang Ying committed
      Some actions in the APEI ERST and EINJ tables are optional.  For
      example, the ACPI_EINJ_BEGIN_OPERATION action does some preparation
      for error injection, and firmware may choose to do nothing there.
      Other actions are mandatory: for example, firmware must provide an
      ACPI_EINJ_GET_ERROR_TYPE implementation.
      
      The original implementation treats all actions as optional (that is,
      they may have no instructions), which may cause problems if firmware
      does not provide some mandatory actions.  To fix this, this patch adds
      apei_exec_run_optional, which should be used for optional actions.
      The original apei_exec_run should be used for mandatory actions.
      
      Cc: Thomas Renninger <trenn@novell.com>
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
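The split between mandatory and optional actions described in this commit can be sketched as follows. This is a minimal, self-contained illustration, not the kernel code: the `sketch_` types and functions, the `DEMO_` action codes, and the use of `-ENOENT` for a missing mandatory action are all assumptions made for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the interpreter's types. */
struct sketch_action_entry {
    int action;          /* action code this entry implements */
    int (*run)(void);    /* instruction body supplied by "firmware" */
};

struct sketch_exec_context {
    const struct sketch_action_entry *entries;
    size_t count;
};

/* Run every entry registered for `action`. When no entry exists,
 * succeed for optional actions and fail for mandatory ones. */
static int sketch_exec_run_common(const struct sketch_exec_context *ctx,
                                  int action, bool optional)
{
    bool found = false;

    for (size_t i = 0; i < ctx->count; i++) {
        if (ctx->entries[i].action != action)
            continue;
        found = true;
        int rc = ctx->entries[i].run();
        if (rc < 0)
            return rc;
    }
    if (!found && !optional)
        return -ENOENT;  /* assumed error code for a missing action */
    return 0;
}

/* Mandatory actions: a missing implementation is an error. */
int sketch_exec_run(const struct sketch_exec_context *ctx, int action)
{
    return sketch_exec_run_common(ctx, action, false);
}

/* Optional actions: a missing implementation is silently accepted. */
int sketch_exec_run_optional(const struct sketch_exec_context *ctx, int action)
{
    return sketch_exec_run_common(ctx, action, true);
}

/* Demo fixture: firmware that implements only the "query" action.
 * The action codes are placeholders, not real ACPI values. */
enum { DEMO_ACTION_PREPARE = 100, DEMO_ACTION_QUERY = 101 };

static int demo_query(void) { return 0; }

static const struct sketch_action_entry demo_entries[] = {
    { DEMO_ACTION_QUERY, demo_query },
};

static const struct sketch_exec_context demo_ctx = { demo_entries, 1 };
```

With this fixture, running `DEMO_ACTION_PREPARE` through `sketch_exec_run_optional` succeeds even though no entry exists, while `sketch_exec_run` reports the missing entry to the caller.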
  2. 14 Dec, 2010 1 commit
    • ACPI, APEI, Add APEI generic error status printing support · f59c55d0
      Huang Ying committed
      In APEI, hardware error information reported by firmware to the Linux
      kernel is held in the APEI generic error status data structure (struct
      acpi_hest_generic_status).  The Linux kernel currently uses printk to
      report hardware error information to user space.
      
      So, this patch adds printing support for the data structure, so that
      the corresponding hardware error information can be reported to user
      space via printk.
      
      PCIe AER information printing is not implemented yet.  The original
      PCIe AER information printing code will be refactored to avoid code
      duplication.
      
      The output format is as follows:
      
      <error record> :=
      APEI generic hardware error status
      severity: <integer>, <severity string>
      section: <integer>, severity: <integer>, <severity string>
      flags: <integer>
      <section flags strings>
      fru_id: <uuid string>
      fru_text: <string>
      section_type: <section type string>
      <section data>
      
      <severity string>* := recoverable | fatal | corrected | info
      
      <section flags strings># :=
      [primary][, containment warning][, reset][, threshold exceeded]\
      [, resource not accessible][, latent error]
      
      <section type string> := generic processor error | memory error | \
      PCIe error | unknown, <uuid string>
      
      <section data> :=
      <generic processor section data> | <memory section data> | \
      <pcie section data> | <null>
      
      <generic processor section data> :=
      [processor_type: <integer>, <proc type string>]
      [processor_isa: <integer>, <proc isa string>]
      [error_type: <integer>
      <proc error type strings>]
      [operation: <integer>, <proc operation string>]
      [flags: <integer>
      <proc flags strings>]
      [level: <integer>]
      [version_info: <integer>]
      [processor_id: <integer>]
      [target_address: <integer>]
      [requestor_id: <integer>]
      [responder_id: <integer>]
      [IP: <integer>]
      
      <proc type string>* := IA32/X64 | IA64
      
      <proc isa string>* := IA32 | IA64 | X64
      
      <proc error type strings># :=
      [cache error][, TLB error][, bus error][, micro-architectural error]
      
      <proc operation string>* := unknown or generic | data read | data write | \
      instruction execution
      
      <proc flags strings># :=
      [restartable][, precise IP][, overflow][, corrected]
      
      <memory section data> :=
      [error_status: <integer>]
      [physical_address: <integer>]
      [physical_address_mask: <integer>]
      [node: <integer>]
      [card: <integer>]
      [module: <integer>]
      [bank: <integer>]
      [device: <integer>]
      [row: <integer>]
      [column: <integer>]
      [bit_position: <integer>]
      [requestor_id: <integer>]
      [responder_id: <integer>]
      [target_id: <integer>]
      [error_type: <integer>, <mem error type string>]
      
      <mem error type string>* :=
      unknown | no error | single-bit ECC | multi-bit ECC | \
      single-symbol chipkill ECC | multi-symbol chipkill ECC | master abort | \
      target abort | parity error | watchdog timeout | invalid address | \
      mirror Broken | memory sparing | scrub corrected error | \
      scrub uncorrected error
      
      <pcie section data> :=
      [port_type: <integer>, <pcie port type string>]
      [version: <integer>.<integer>]
      [command: <integer>, status: <integer>]
      [device_id: <integer>:<integer>:<integer>.<integer>
      slot: <integer>
      secondary_bus: <integer>
      vendor_id: <integer>, device_id: <integer>
      class_code: <integer>]
      [serial number: <integer>, <integer>]
      [bridge: secondary_status: <integer>, control: <integer>]
      
      <pcie port type string>* := PCIe end point | legacy PCI end point | \
      unknown | unknown | root port | upstream switch port | \
      downstream switch port | PCIe to PCI/PCI-X bridge | \
      PCI/PCI-X to PCIe bridge | root complex integrated endpoint device | \
      root complex event collector
      
      Where [] designates that the corresponding content is optional.
      
      All <field string> descriptions marked with * have the following format:
      
      field: <integer>, <field string>
      
      Where the value of <integer> should be the position of "string" in the
      <field string> description. Otherwise, <field string> will be "unknown".
      
      All <field strings> descriptions marked with # have the following format:
      
      field: <integer>
      <field strings>
      
      Where each string in <field strings> corresponds to one set bit of
      <integer>. The bit position is the position of "string" in the <field
      strings> description.
      
      For a more detailed explanation of every field, please refer to
      Appendix N: Common Platform Error Record, of the UEFI specification
      version 2.3 or later.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
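The two field conventions in this commit message (`*` for a single string chosen by position, `#` for one string per set bit) can be sketched as follows. This is an illustration under stated assumptions rather than the kernel's printing code: the function names are hypothetical, positions are assumed to be zero-based, and the string tables are copied from the `<severity string>` and `<proc flags strings>` lists in the message.

```c
#include <stddef.h>
#include <stdio.h>

/* String tables taken from the format description in the message. */
static const char *const severity_strs[] = {
    "recoverable", "fatal", "corrected", "info",
};

static const char *const proc_flag_strs[] = {
    "restartable", "precise IP", "overflow", "corrected",
};

/* '*' fields: the integer is a (zero-based, assumed) position in the
 * list; any value outside the list is reported as "unknown". */
static const char *field_str(unsigned int value,
                             const char *const *strs, size_t n)
{
    return value < n ? strs[value] : "unknown";
}

/* '#' fields: bit i of `bits` selects strs[i]. Selected strings are
 * joined with ", " into buf, which is always NUL-terminated. */
static void field_bits_str(unsigned int bits, const char *const *strs,
                           size_t n, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return;
    buf[0] = '\0';
    for (size_t i = 0; i < n && i < 32; i++) {
        if (!(bits & (1u << i)))
            continue;
        int w = snprintf(buf + used, len - used, "%s%s",
                         used ? ", " : "", strs[i]);
        if (w < 0 || (size_t)w >= len - used)
            break;  /* output truncated: stop appending */
        used += (size_t)w;
    }
}
```

For example, a severity value of 1 maps to the second entry of the severity list, and a flags value with bits 0 and 2 set yields the first and third entries of the flags list joined by a comma.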
  3. 20 May, 2010 2 commits
    • ACPI, APEI, UEFI Common Platform Error Record (CPER) header · 06d65dea
      Huang Ying committed
      CPER stands for Common Platform Error Record.  It is the hardware error
      record format used by various APEI tables, such as ERST, BERT and HEST,
      to describe platform hardware errors.
      
      For more information about CPER, please refer to Appendix N of UEFI
      Specification version 2.3.
      
      This patch mainly includes the data structure definition header file
      used by other files.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, APEI, APEI supporting infrastructure · a643ce20
      Huang Ying committed
      APEI stands for ACPI Platform Error Interface, which allows errors
      (for example from the chipset) to be reported to the operating system.
      This especially improves NMI handling. In addition, it supports error
      serialization and error injection.
      
      For more information about APEI, please refer to ACPI Specification
      version 4.0, chapter 17.
      
      This patch provides some common functions used by more than one APEI
      table, mainly the framework of the interpreter for EINJ and ERST.
      
      A machine-readable language is defined for EINJ and ERST for the OS to
      execute, and thereby drive the firmware to perform the corresponding
      functions. The machine languages for EINJ and ERST are compatible, so a
      common framework is defined for them.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
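The idea of a common interpreter that each table drives with its own instruction set can be sketched roughly as below. This is an illustration only, not the kernel framework: every `sk_` name, the three demo opcodes, and the simulated register are assumptions made for the example. The point it shows is that the run loop is shared, while the table of instruction handlers is supplied through the context.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct sk_ctx;

/* One instruction: an opcode plus an immediate operand. */
struct sk_ins {
    uint8_t opcode;
    uint64_t operand;
};

typedef int (*sk_ins_func)(struct sk_ctx *ctx, const struct sk_ins *ins);

/* Interpreter state. Each user (an EINJ-like or ERST-like driver)
 * would plug in its own handler table here. */
struct sk_ctx {
    const sk_ins_func *table;  /* handlers indexed by opcode */
    size_t table_len;
    uint64_t value;            /* interpreter's working value */
    uint64_t reg;              /* simulated hardware register */
};

/* Hypothetical opcodes for the demo. */
enum { SK_READ_REGISTER, SK_WRITE_REGISTER_VALUE, SK_NOOP, SK_NR_INS };

static int sk_read_register(struct sk_ctx *ctx, const struct sk_ins *ins)
{
    (void)ins;
    ctx->value = ctx->reg;
    return 0;
}

static int sk_write_register_value(struct sk_ctx *ctx, const struct sk_ins *ins)
{
    ctx->reg = ins->operand;
    return 0;
}

static int sk_noop(struct sk_ctx *ctx, const struct sk_ins *ins)
{
    (void)ctx;
    (void)ins;
    return 0;
}

static const sk_ins_func sk_table[SK_NR_INS] = {
    [SK_READ_REGISTER] = sk_read_register,
    [SK_WRITE_REGISTER_VALUE] = sk_write_register_value,
    [SK_NOOP] = sk_noop,
};

/* Shared run loop: dispatch each instruction through the table and
 * stop at the first unknown opcode or failing handler. */
int sk_run(struct sk_ctx *ctx, const struct sk_ins *prog, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t op = prog[i].opcode;

        if (op >= ctx->table_len || !ctx->table[op])
            return -EINVAL;
        int rc = ctx->table[op](ctx, &prog[i]);
        if (rc < 0)
            return rc;
    }
    return 0;
}

/* Demo program: write a value to the register, then read it back. */
static const struct sk_ins demo_prog[] = {
    { SK_WRITE_REGISTER_VALUE, 0x2a },
    { SK_NOOP, 0 },
    { SK_READ_REGISTER, 0 },
};
```

Keeping the handler table in the context is the design choice that lets two compatible instruction sets reuse one run loop; a second user would only need to provide a different table.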