- 11 2月, 2019 1 次提交
-
-
由 James Morse 提交于
If the GHES notification type is SDEI, register the provided event using the SDEI-GHES helper. SDEI may be one of two types of event, normal and critical. Critical events can interrupt normal events, so these must have separate fixmap slots and locks in case both event types are in use. Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 08 2月, 2019 19 次提交
-
-
由 James Morse 提交于
Now that ghes notification helpers provide the fixmap slots and take the lock themselves, multiple NMI-like notifications can be used on arm64. These should be named after their notification method as they can't all be called 'NMI'. x86's NOTIFY_NMI already is, change the SEA fixmap entry to be called FIX_APEI_GHES_SEA. Future patches can add support for FIX_APEI_GHES_SEI and FIX_APEI_GHES_SDEI_{NORMAL,CRITICAL}. Because all of ghes.c builds on both architectures, provide a constant for each fixmap entry that the architecture will never use. Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
Each struct ghes has an worst-case sized buffer for storing the estatus. If an error is being processed by ghes_proc() in process context this buffer will be in use. If the error source then triggers an NMI-like notification, the same buffer will be used by in_nmi_queue_one_entry() to stage the estatus data, before __process_error() copys it into a queued estatus entry. Merge __process_error()s work into in_nmi_queue_one_entry() so that the queued estatus entry is used from the beginning. Use the new ghes_peek_estatus() to know how much memory to allocate from the ghes_estatus_pool before reading the records. Reported-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Change since v6: * Added a comment explaining the 'ack-error, then goto no_work'. * Added missing esatus-clearing, which is necessary after reading the GAS, Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
ghes_read_estatus() reads the record address, then the record's header, then performs some sanity checks before reading the records into the provided estatus buffer. To provide this estatus buffer the caller must know the size of the records in advance, or always provide a worst-case sized buffer as happens today for the non-NMI notifications. Add a function to peek at the record's header to find the size. This will let the NMI path allocate the right amount of memory before reading the records, instead of using the worst-case size, and having to copy the records. Split ghes_read_estatus() to create __ghes_peek_estatus() which returns the address and size of the CPER records. Signed-off-by: NJames Morse <james.morse@arm.com> Changes since v7: * Grammar * concistent argument ordering Changes since v6: * Additional buf_addr = 0 error handling * Moved checking out of peek-estatus * Reworded an error message so we can tell them apart Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
ghes_read_estatus() checks various lengths in the top-level header to ensure the CPER records to be read aren't obviously corrupt. Take the opportunity to make this more user-friendly, printing a (ratelimited) message about the nature of the header format error. Suggested-by: NBorislav Petkov <bp@alien8.de> Signed-off-by: NJames Morse <james.morse@arm.com> [ rjw: Add missing 'static' ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
The NMI-like notifications scribble over ghes->estatus, before copying it somewhere else. If this interrupts the ghes_probe() code calling ghes_proc() on each struct ghes, the data is corrupted. All the NMI-like notifications should use a queued estatus entry from the beginning, instead of the ghes version, then copying it. To do this, break up any use of "ghes->estatus" so that all functions take the estatus as an argument. This patch just moves these ghes->estatus dereferences into separate arguments, no change in behaviour. struct ghes becomes unused in ghes_clear_estatus() as it only wanted ghes->estatus, which we now pass directly. This is removed. Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
ghes_copy_tofrom_phys() uses a different fixmap slot depending on in_nmi(). This doesn't work when there are multiple NMI-like notifications, that could interrupt each other. As with the locking, move the chosen fixmap_idx to the notification helper. This only matters for NMI-like notifications, anything calling ghes_proc() can use the IRQ fixmap slot as its already holding an irqsave spinlock. This lets us collapse the ghes_ioremap_pfn_*() helpers. Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
ghes_copy_tofrom_phys() takes different locks depending on in_nmi(). This doesn't work if there are multiple NMI-like notifications, that can interrupt each other. Now that NOTIFY_SEA is always called in the same context, move the lock-taking to the notification helper. The helper will always know which lock to take. This avoids ghes_copy_tofrom_phys() taking a guess based on in_nmi(). This splits NOTIFY_NMI and NOTIFY_SEA to use different locks. All the other notifications use ghes_proc(), and are called in process or IRQ context. Move the spin_lock_irqsave() around their ghes_proc() calls. Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
Now that the estatus queue can be used by more than one notification method, we can move notifications that have NMI-like behaviour over. Switch NOTIFY_SEA over to use the estatus queue. This makes it behave in the same way as x86's NOTIFY_NMI. Remove Kconfig's ability to turn ACPI_APEI_SEA off if ACPI_APEI_GHES is selected. This roughly matches the x86 NOTIFY_NMI behaviour, and means each architecture has at least one user of the estatus-queue, meaning it doesn't need guarding with ifdef. Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
The estatus-queue code is currently hidden by the NOTIFY_NMI #ifdefs. Once NOTIFY_SEA starts using the estatus-queue we can stop hiding it as each architecture has a user that can't be turned off. Split the existing CONFIG_HAVE_ACPI_APEI_NMI block in two, and move the SEA code into the gap. Move the code around ... and changes the stale comment describing why the status queue is necessary: printk() is no longer the issue, its the helpers like memory_failure_queue() that aren't nmi safe. Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
During ghes_proc() we use ghes_ack_error() to tell an external agent we are done with these records and it can re-use the memory. rc may hold an error returned by ghes_read_estatus(), ENOENT causes us to skip ghes_ack_error() (as there is nothing to ack), but rc may also by EIO, which gets supressed. ghes_clear_estatus() is where we mark the records as processed for non GHESv2 error sources, and already spots the ENOENT case as buf_paddr is set to 0 by ghes_read_estatus(). Move the ghes_ack_error() call in here to avoid extra logic with the return code in ghes_proc(). This enables GHESv2 acking for NMI-like error sources. This is safe as the buffer is pre-mapped by map_gen_v2() before the GHES is added to any NMI handler lists. This same pre-mapping step means we can't receive an error from apei_read()/write() here as apei_check_gar() succeeded when it was mapped, and the mapping was cached, so the address can't be rejected at runtime. Remove the error-returns as this is now called from a function with no return. Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
Refactor the estatus queue's pool notification routine from NOTIFY_NMI's handlers. This will allow another notification method to use the estatus queue without duplicating this code. Add rcu_read_lock()/rcu_read_unlock() around the list list_for_each_entry_rcu() walker. These aren't strictly necessary as the whole nmi_enter/nmi_exit() window is a spooky RCU read-side critical section. in_nmi_queue_one_entry() is separate from the rcu-list walker for a later caller that doesn't need to walk a list. Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NPunit Agrawal <punit.agrawal@arm.com> Tested-by: NTyler Baicar <tbaicar@codeaurora.org> [ rjw: Drop unnecessary err variable in two places ] Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
ghes_read_estatus() sets a flag in struct ghes if the buffer of CPER records needs to be cleared once the records have been processed. This flag value is a problem if a struct ghes can be processed concurrently, as happens at probe time if an NMI arrives for the same error source. The NMI clears the flag, meaning the interrupted handler may never do the ghes_estatus_clear() work. The GHES_TO_CLEAR flags is only set at the same time as buffer_paddr, which is now owned by the caller and passed to ghes_clear_estatus(). Use this value as the flag. A non-zero buf_paddr returned by ghes_read_estatus() means ghes_clear_estatus() should clear this address. ghes_read_estatus() already checks for a read of error_status_address being zero, so CPER records cannot be written here. Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
ghes_notify_nmi() checks ghes->flags for GHES_TO_CLEAR before going on to __process_error(). This is pointless as ghes_read_estatus() will always set this flag if it returns success, which was checked earlier in the loop. Remove it. Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
When CPER records are found the address of the records is stashed in the struct ghes. Once the records have been processed, this address is overwritten with zero so that it won't be processed again without being re-populated by firmware. This goes wrong if a struct ghes can be processed concurrently, as can happen at probe time when an NMI occurs. If the NMI arrives on another CPU, the probing CPU may call ghes_clear_estatus() on the records before the handler had finished with them. Even on the same CPU, once the interrupted handler is resumed, it will call ghes_clear_estatus() on the NMIs records, this memory may have already been re-used by firmware. Avoid this stashing by letting the caller hold the address. A later patch will do away with the use of ghes->flags in the read/clear code too. Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
Adding new NMI-like notifications duplicates the calls that grow and shrink the estatus pool. This is all pretty pointless, as the size is capped to 64K. Allocate this for each ghes and drop the code that grows and shrinks the pool. Suggested-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
ghes.c has a memory pool it uses for the estatus cache and the estatus queue. The cache is initialised when registering the platform driver. For the queue, an NMI-like notification has to grow/shrink the pool as it is registered and unregistered. This is all pretty noisy when adding new NMI-like notifications, it would be better to replace this with a static pool size based on the number of users. As a precursor, move the call that creates the pool from ghes_init(), into hest.c. Later this will take the number of ghes entries and consolidate the queue allocations. Remove ghes_estatus_pool_exit() as hest.c doesn't have anywhere to put this. The pool is now initialised as part of ACPI's subsys_initcall(): (acpi_init(), acpi_scan_init(), acpi_pci_root_init(), acpi_hest_init()) Before this patch it happened later as a GHES specific device_initcall(). Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
The ghes code is careful to parse and round firmware's advertised memory requirements for CPER records, up to a maximum of 64K. However when ghes_estatus_pool_expand() does its work, it splits the requested size into PAGE_SIZE granules. This means if firmware generates 5K of CPER records, and correctly describes this in the table, __process_error() will silently fail as it is unable to allocate more than PAGE_SIZE. Switch the estatus pool to vmalloc() memory. On x86 vmalloc() memory may fault and be fixed up by vmalloc_fault(). To prevent this call vmalloc_sync_all() before an NMI handler could discover the memory. Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
Subsequent patches will split up ghes_read_estatus(), at which point passing around the 'silent' flag gets annoying. This is to suppress prink() messages, which prior to commit 42a0bb3f ("printk/nmi: generic solution for safe printk in NMI"), were unsafe in NMI context. This is no longer necessary, remove the flag. printk() messages are batched in a per-cpu buffer and printed via irq-work, or a call back from panic(). Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 James Morse 提交于
oops_begin() exists to group printk() messages with the oops message printed by die(). To reach this caller we know that platform firmware took this error first, then notified the OS via NMI with a 'panic' severity. Don't wait for another CPU to release the die-lock before panic()ing, our only goal is to print this fatal error and panic(). This code is always called in_nmi(), and since commit 42a0bb3f ("printk/nmi: generic solution for safe printk in NMI"), it has been safe to call printk() from this context. Messages are batched in a per-cpu buffer and printed via irq-work, or a call back from panic(). Link: https://patchwork.kernel.org/patch/10313555/Acked-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 21 12月, 2018 1 次提交
-
-
由 Lenny Szubowicz 提交于
In __ghes_panic() clear the block status in the APEI generic error status block for that generic hardware error source before calling panic() to prevent a second panic() in the crash kernel for exactly the same fatal error. Otherwise ghes_probe(), running in the crash kernel, would see an unhandled error in the APEI generic error status block and panic again, thereby precluding any crash dump. Signed-off-by: NLenny Szubowicz <lszubowi@redhat.com> Signed-off-by: NDavid Arcari <darcari@redhat.com> Tested-by: NTyler Baicar <baicar.tyler@gmail.com> Acked-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 12 5月, 2018 1 次提交
-
-
由 Alexandru Gagniuc 提交于
The use of the @ghes argument was removed in a previous commit, but function signature was not updated to reflect this. Signed-off-by: NAlexandru Gagniuc <mr.nuke.me@gmail.com> Acked-by: N"Rafael J. Wysocki" <rafael@kernel.org> Cc: linux-acpi@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180430213358.8319-1-mr.nuke.me@gmail.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 02 5月, 2018 1 次提交
-
-
由 Borislav Petkov 提交于
Tony reported seeing "Internal error: Can't find EDAC structure" when injecting correctable errors due to the fact that ghes_edac would still load even if the whitelist won't hit. Drop the pr_err() in ghes_edac_report_mem_error() for now due to the hacky way how ghes_edac depends on ghes.c. While at it, make ghes_edac_register() return an error if it doesn't hit in the whitelist as it is the only sensible thing to do in that situation. Furthermore, move the call to it to happen last in ghes_probe() so that GHES initializing properly does not depend on ghes_edac init at all as latter is only reporting errors and not required for GHES's proper functioning. Reviewed-by: NToshi Kani <toshi.kani@hpe.com> Tested-by: NSughosh Ganu <sughosh.ganu@arm.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20180420182015.zao3olss4tvvlxki@agluck-desk
-
- 24 1月, 2018 1 次提交
-
-
由 Eric W. Biederman 提交于
Today 4 architectures set ARCH_SUPPORTS_MEMORY_FAILURE (arm64, parisc, powerpc, and x86), while 4 other architectures set __ARCH_SI_TRAPNO (alpha, metag, sparc, and tile). These two sets of architectures do not interesect so remove the trapno paramater to remove confusion. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 05 12月, 2017 3 次提交
-
-
由 Colin Ian King 提交于
Variables len and node_len are redundant and can be removed. Cleans up clang warning: node_len = GHES_ESTATUS_NODE_LEN(len); Signed-off-by: NColin Ian King <colin.king@canonical.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Tyler Baicar 提交于
Currently the GHES code only calls into the AER driver for recoverable type errors. This is incorrect because errors of other severities do not get logged by the AER driver and do not get exposed to user space via the AER trace event. So, call into the AER driver for PCIe errors regardless of the severity Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Tyler Baicar 提交于
Move PCIe AER error handling code into a separate function. Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 07 11月, 2017 2 次提交
-
-
由 James Morse 提交于
Now that nothing is using the ghes_ioremap_area pages, rip them out. Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Tested-by: NTyler Baicar <tbaicar@codeaurora.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: All applicable <stable@vger.kernel.org>
-
由 James Morse 提交于
Replace ghes_io{re,un}map_pfn_{nmi,irq}()s use of ioremap_page_range() with __set_fixmap() as ioremap_page_range() may sleep to allocate a new level of page-table, even if its passed an existing final-address to use in the mapping. The GHES driver can only be enabled for architectures that select HAVE_ACPI_APEI: Add fixmap entries to both x86 and arm64. clear_fixmap() does the TLB invalidation in __set_fixmap() for arm64 and __set_pte_vaddr() for x86. In each case its the same as the respective arch_apei_flush_tlb_one(). Reported-by: NFengguang Wu <fengguang.wu@intel.com> Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NJames Morse <james.morse@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Tested-by: NTyler Baicar <tbaicar@codeaurora.org> Tested-by: NToshi Kani <toshi.kani@hpe.com> [ For the arm64 bits: ] Acked-by: NWill Deacon <will.deacon@arm.com> [ For the x86 bits: ] Acked-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: All applicable <stable@vger.kernel.org>
-
- 03 11月, 2017 1 次提交
-
-
由 Kees Cook 提交于
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Tyler Baicar <tbaicar@codeaurora.org> Cc: Will Deacon <will.deacon@arm.com> Cc: James Morse <james.morse@arm.com> Cc: "Jonathan (Zhixiong) Zhang" <zjzhang@codeaurora.org> Cc: Shiju Jose <shiju.jose@huawei.com> Cc: linux-acpi@vger.kernel.org Signed-off-by: NKees Cook <keescook@chromium.org> Tested-by: NTyler Baicar <tbaicar@codeaurora.org>
-
- 23 10月, 2017 1 次提交
-
-
由 Dongjiu Geng 提交于
For the SEA notification, the two functions ghes_sea_add() and ghes_sea_remove() are only called when CONFIG_ACPI_APEI_SEA is defined. If not, it will return errors in the ghes_probe() and not continue. If the probe is failed, the ghes_sea_remove() also has no chance to be called. Hence, remove the unnecessary handling when CONFIG_ACPI_APEI_SEA is not defined. For the NMI notification, it has the same issue as SEA notification, so also remove the unused dead-code for it. Signed-off-by: NDongjiu Geng <gengdongjiu@huawei.com> Tested-by: NTyler Baicar <tbaicar@codeaurora.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 11 10月, 2017 1 次提交
-
-
由 Jan Beulich 提交于
Match up with what 7edda088 ("acpi: apei: handle SEA notification type for ARMv8") did for ghes_ioremap_pfn_nmi(). Signed-off-by: NJan Beulich <jbeulich@suse.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 28 9月, 2017 1 次提交
-
-
由 Tyler Baicar 提交于
Currently we acknowledge errors before clearing the error status. This could cause a new error to be populated by firmware in-between the error acknowledgment and the error status clearing which would cause the second error's status to be cleared without being handled. So, clear the error status before acknowledging the errors. Also, make sure to acknowledge the error if the error status read fails. Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 30 8月, 2017 1 次提交
-
-
由 Punit Agrawal 提交于
According to the ACPI specification, firmware is not required to provide the Hardware Error Source Table (HEST). When HEST is not present, the following superfluous message is printed to the kernel boot log - [ 3.460067] GHES: HEST is not enabled! Extend hest_disable variable to track whether the firmware provides this table and if it is not present skip any log output. The existing behaviour is preserved in all other cases. Suggested-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NPunit Agrawal <punit.agrawal@arm.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 31 7月, 2017 1 次提交
-
-
由 Loc Ho 提交于
X-Gene platforms describe multiple GHES error sources with the same hardware error notification type (external interrupt) and interrupt number. Change the GHES interrupt request to support sharing the same IRQ. This change includs contributions from Tuan Phan <tphan@apm.com>. Signed-off-by: NLoc Ho <lho@apm.com> Acked-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 23 6月, 2017 5 次提交
-
-
由 Tyler Baicar 提交于
Check for pending errors when probing GHES entries. It is possible that a fatal error is already pending at this point, so we should handle it as soon as the driver is probed. This also avoids a potential issue if there was an interrupt that was already cleared for an error since the GHES driver wasn't present. Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Tyler Baicar 提交于
Currently external aborts are unsupported by the guest abort handling. Add handling for SEAs so that the host kernel reports SEAs which occur in the guest kernel. When an SEA occurs in the guest kernel, the guest exits and is routed to kvm_handle_guest_abort(). Prior to this patch, a print message of an unsupported FSC would be printed and nothing else would happen. With this patch, the code gets routed to the APEI handling of SEAs in the host kernel to report the SEA information. Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Acked-by: NMarc Zyngier <marc.zyngier@arm.com> Acked-by: NChristoffer Dall <cdall@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Tyler Baicar 提交于
Currently there are trace events for the various RAS errors with the exception of ARM processor type errors. Add a new trace event for such errors so that the user will know when they occur. These trace events are consistent with the ARM processor error section type defined in UEFI 2.6 spec section N.2.4.4. Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Tyler Baicar 提交于
The UEFI spec includes non-standard section type support in the Common Platform Error Record. This is defined in section N.2.3 of UEFI version 2.5. Currently if the CPER section's type (UUID) does not match any section type that the kernel knows how to parse, a trace event is not generated. Generate a trace event which contains the raw error data for non-standard section type error records. Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Tested-by: NShiju Jose <shiju.jose@huawei.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
Even if an error status block's severity is fatal, the kernel does not honor the severity level and panic. With the firmware first model, the platform could inform the OS about a fatal hardware error through the non-NMI GHES notification type. The OS should panic when a hardware error record is received with this severity. Call panic() after CPER data in error status block is printed if severity is fatal, before each error section is handled. Signed-off-by: NJonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> Reviewed-by: NJames Morse <james.morse@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-