- 02 9月, 2020 40 次提交
-
-
由 Daniele Albano 提交于
to #29608102 commit 61710e437f2807e26a3402543bdbb7217a9c8620 upstream. We currently filter these for timeout_remove/async_cancel/files_update, but we only should be filtering for fixed file and buffer select. This also causes a second read of sqe->flags, which isn't needed. Just check req->flags for the relevant bits. This then allows these commands to be used in links, for example, like everything else. Signed-off-by: NDaniele Albano <d.albano@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NJiufei Xue <jiufei.xue@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Jens Axboe 提交于
to #29608102 commit 807abcb0883439af5ead73f3308310453b97b624 upstream. The double poll additions were centered around doing POLL_ADD on file descriptors that use more than one waitqueue (typically one for read, one for write) when being polled. However, it can also end up being triggered for when we use poll triggered retry. For that case, we cannot safely use req->io, as that could be used by the request type itself. Add a second io_poll_iocb pointer in the structure we allocate for poll based retry, and ensure we use the right one from the two paths. Fixes: 18bceab101ad ("io_uring: allow POLL_ADD with double poll_wait() users") Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NJiufei Xue <jiufei.xue@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
-
由 Wetp Zhang 提交于
fix #29415191 commit 03151c6e0b66c63c3e9980edf78c3a7a99801764 upstream Action Required memory error should happen only when a processor is about to access to a corrupted memory, so it's synchronous and only affects current process/thread. Recently commit 872e9a205c84 ("mm, memory_failure: don't send BUS_MCEERR_AO for action required error") fixed the issue that Action Required memory could unnecessarily send SIGBUS to the processes which share the error memory. But we still have another issue that we could send SIGBUS to a wrong thread. This is because collect_procs() and task_early_kill() fails to add the current process to "to-kill" list. So this patch is suggesting to fix it. With this fix, SIGBUS(BUS_MCEERR_AR) is never sent to non-current process/thread. Signed-off-by: NNaoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NTony Luck <tony.luck@intel.com> Acked-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Link: http://lkml.kernel.org/r/1591321039-22141-3-git-send-email-naoya.horiguchi@nec.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Wetp Zhang 提交于
fix #29415191 commit 872e9a205c8491daf1a51ea3733c8c1d15d51e10 upstream Some processes dont't want to be killed early, but in "Action Required" case, those also may be killed by BUS_MCEERR_AO when sharing memory with other which is accessing the fail memory. And sending SIGBUS with BUS_MCEERR_AO for action required error is strange, so ignore the non-current processes here. Suggested-by: NNaoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NNaoya Horiguchi <naoya.horiguchi@nec.com> Acked-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Link: http://lkml.kernel.org/r/1590817116-21281-1-git-send-email-wetp.zy@linux.alibaba.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Qiuxu Zhuo 提交于
fix #29307272 commit ce20670828c1228ecd37befbdda87a1f87a803b9 upstream The i10nm_edac driver failed to load on Ice Lake and Tremont/Jacobsville servers if their CPU stepping >= 4 and failed on Ice Lake-D servers from stepping 0. The root cause was that for Ice Lake and Tremont/Jacobsville servers with CPU stepping >=4, the offset for bus number configuration register was updated from 0xcc to 0xd0. For Ice Lake-D servers, all the steppings use the updated 0xd0 offset. Fix the issue by using the appropriate offset for bus number configuration register according to the CPU model number and stepping. Reported-by: NJerry Chen <jerry.t.chen@intel.com> Reported-and-tested-by: NJin Wen <wen.jin@intel.com> Signed-off-by: NQiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: NTony Luck <tony.luck@intel.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/linux-edac/20200427084022.GC11036@zn.tnicSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Youquan Song 提交于
fix #29307272 commit ee5340abab3babb91c1807cea47de4468b2dfc91 upstream The device ID for configuration agent PCI device and the offset for bus number configuration register can be CPU model specific. So add a new structure res_config to make them configurable and pass res_config to {skx,i10nm}_init() and skx_get_all_bus_mappings() for use. Signed-off-by: NQiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: NTony Luck <tony.luck@intel.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20200427083246.GB11036@zn.tnicSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Borislav Petkov 提交于
fix #29415191 commit 1df73b2131e3b33d518609769636b41ce00212de upstream the severity grading code returns in_kernel_recov error context for errors which have happened in kernel space but from which the kernel can recover. whether the recovery can happen is determined by the exception table entry having as handler ex_handler_fault() and which has been declared at build time using _asm_extable_fault(). in_kernel_recov is used in mce_severity_intel() to lookup the corresponding error severity in the severities table. however, the mapping back from error severity to whether the error is in_kernel_recov is ambiguous and in the very paranoid case - which might not be possible right now - but be better safe than sorry later, an exception fixup could be attempted for another mce whose address is in the exception table and has the proper severity. which would be unfortunate, to say the least. therefore, mark such mces explicitly as mce_in_kernel_recov so that the recovery attempt is done only for them. document the whole handling, while at it, as it is not trivial. reported-by: Nthomas gleixner <tglx@linutronix.de> signed-off-by: Nborislav petkov <bp@suse.de> tested-by: Ntony luck <tony.luck@intel.com> link: https://lkml.kernel.org/r/20200407163414.18058-10-bp@alien8.deSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Tony Luck 提交于
fix #29415191 commit 43505646941bee217b91d064756975aa1ab6ee3b upstream Sometimes, when logs are getting lost, it's nice to just have everything dumped to the serial console. Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Tested-by: NTony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200214222720.13168-7-tony.luck@intel.comSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Tony Luck 提交于
fix #29415191 commit 925946cfa715a5a71639528f82b98e58f14dd4cb upstream Instead of keeping count of how many handlers are registered on the MCE notifier chain and printing if below some magic value, look at mce->kflags to see if anyone claims to have handled/logged this error. [ bp: Do not print ->kflags in __print_mce(). ] Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Tested-by: NTony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200214222720.13168-6-tony.luck@intel.comSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Tony Luck 提交于
fix #29415191 commit 23ba710a0864108910c7531dc4c73ef65eca5568 upstream If the handler took any action to log or deal with the error, set a bit in mce->kflags so that the default handler on the end of the machine check chain can see what has been done. Get rid of NOTIFY_STOP returns. Make the EDAC and dev-mcelog handlers skip over errors already processed by CEC. Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Tested-by: NTony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200214222720.13168-5-tony.luck@intel.comSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Tony Luck 提交于
fix #29415191 commit 1de08dccd383482a3e88845d3554094d338f5ff9 upstream There can be many different subsystems register on the mce handler chain. Add a new bitmask field and define values so that handlers can indicate whether they took any action to log or otherwise handle an error. The default handler at the end of the chain can use this information to decide whether to print to the console log. Boris suggested a generic name and leaving plenty of spare bits for possible future use. [ bp: Move flag bits to the internal mce.h header and use BIT_ULL(). ] Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Tested-by: NTony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200214222720.13168-4-tony.luck@intel.comSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Peter Zijlstra 提交于
fix #29415191 commit 0d00449c7a28a1514595630735df383dec606812 upstream A few exceptions (like #DB and #BP) can happen at any location in the code, this then means that tracers should treat events from these exceptions as NMI-like. The interrupted context could be holding locks with interrupts disabled for instance. Similarly, #MC is an actual NMI-like exception. All of them use ist_enter() which only concerns itself with RCU, but does not do any of the other setup that NMIs need. This means things like: printk() raw_spin_lock_irq(&logbuf_lock); <#DB/#BP/#MC> printk() raw_spin_lock_irq(&logbuf_lock); are entirely possible (well, not really since printk tries hard to play nice, but the concept stands). So replace ist_enter() with nmi_enter(). Also observe that any nmi_enter() caller must be both notrace and NOKPROBE, or in the noinstr text section. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200505134101.525508608@linutronix.deSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Peter Zijlstra 提交于
fix #29415191 commit 5567d11c21a1d508a91a8cb64a819783a0835d9f upstream Convert #MC over to using task_work_add(); it will run the same code slightly later, on the return to user path of the same exception. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NFrederic Weisbecker <frederic@kernel.org> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/20200505134100.957390899@linutronix.deSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Youquan Song 提交于
fix #29415191 commit b052df3da821adfd6be26a6eb16624fb50e90e56 upstream This is completely overengineered and definitely not an interface which should be made available to anything else than this particular MCE case. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NAlexandre Chartre <alexandre.chartre@oracle.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200505134059.462640294@linutronix.deSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Tony Luck 提交于
fix #29415191 commit 17fae1294ad9d711b2c3dd0edef479d40c76a5e8 upstream An interesting thing happened when a guest Linux instance took a machine check. The VMM unmapped the bad page from guest physical space and passed the machine check to the guest. Linux took all the normal actions to offline the page from the process that was using it. But then guest Linux crashed because it said there was a second machine check inside the kernel with this stack trace: do_memory_failure set_mce_nospec set_memory_uc _set_memory_uc change_page_attr_set_clr cpa_flush clflush_cache_range_opt This was odd, because a CLFLUSH instruction shouldn't raise a machine check (it isn't consuming the data). Further investigation showed that the VMM had passed in another machine check because is appeared that the guest was accessing the bad page. Fix is to check the scope of the poison by checking the MCi_MISC register. If the entire page is affected, then unmap the page. If only part of the page is affected, then mark the page as uncacheable. This assumes that VMMs will do the logical thing and pass in the "whole page scope" via the MCi_MISC register (since they unmapped the entire page). [ bp: Adjust to x86/entry changes. ] Fixes: 284ce401 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()") Reported-by: NJue Wang <juew@google.com> Signed-off-by: NTony Luck <tony.luck@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NJue Wang <juew@google.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200520163546.GA7977@agluck-desk2.amr.corp.intel.comSigned-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NWetp Zhang <wetp.zy@linux.alibaba.com> Reviewed-by: NArtie Ding <artie.ding@linux.alibaba.com>
-
由 Alexandru Gagniuc 提交于
task #29600094 commit f496648b99f8f7f6711f7c30a6327381f37dd1e8 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. When in-band presence detect is disabled, PDS may come up at any time or not at all. PDS being low may indicate that the card is still mating, and we could expect contact bounce to bring down the link as well. It is reasonable to assume that most cards will mate in a hotplug slot in about a second. Thus, when we know PDS only reflects out-of-band presence detect, it's worthwhile to wait the extra second or so to make sure the card is properly mated before loading the driver and to prevent the hotplug code from disabling a device if the presence detect change goes active after the device is enabled. Link: https://lore.kernel.org/r/20191025190047.38130-3-stuart.w.hayes@gmail.com [bhelgaas: use ctrl_info() instead of pci_info()] Signed-off-by: NAlexandru Gagniuc <mr.nuke.me@gmail.com> Signed-off-by: NStuart Hayes <stuart.w.hayes@gmail.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: NLukas Wunner <lukas@wunner.de> (cherry picked from commit f496648b99f8f7f6711f7c30a6327381f37dd1e8) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Alexandru Gagniuc 提交于
task #29600094 commit 202853595e53f981c86656c49fc1cc1e3620f558 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. The presence detect state (PDS) is normally a logical OR of in-band and out-of-band (OOB) presence detect. As of PCIe 4.0, there is the option to disable in-band presence so that the PDS bit always reflects the state of the out-of-band presence. The recommendation of the PCIe spec is to disable in-band presence whenever supported (PCIe r5.0, appendix I implementation note): Due to architectural issues, the in-band (Physical-Layer-based) portion of the PD mechanism is deprecated for use with async hot-plug. One issue is that in-band PD as architected does not detect adapter removal during certain LTSSM states, notably the L1 and Disabled States. Another issue is that when both in-band and OOB PD are being used together, the Presence Detect State bit and its associated interrupt mechanism always reflect the logical OR of the inband and OOB PD states, and with some hot-plug hardware configurations, it is important for software to detect and respond to in-band and OOB PD events independently. If OOB PD is being used and the associated DSP supports In-Band PD Disable, it is recommended that the In-Band PD Disable bit be Set, and the Presence Detect State bit and its associated interrupt mechanism be used exclusively for OOB PD. As a substitute for in-band PD with async hot-plug, the reference model uses either the DPC or the DLL Link Active mechanism. Link: https://lore.kernel.org/r/20191025190047.38130-2-stuart.w.hayes@gmail.com [bhelgaas: move PCI_EXP_SLTCAP2 read earlier & print PCI_EXP_SLTCAP2_IBPD value (suggested by Lukas)] Signed-off-by: NAlexandru Gagniuc <mr.nuke.me@gmail.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: NLukas Wunner <lukas@wunner.de> (cherry picked from commit 202853595e53f981c86656c49fc1cc1e3620f558) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Conflicts: drivers/pci/hotplug/pciehp.h drivers/pci/hotplug/pciehp_hpc.c Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Thomas Gleixner 提交于
task #29600094 commit 9ae0522537852408f0f48af888e44d6876777463 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. The AER error injection mechanism just blindly abuses generic_handle_irq() which is really not meant for consumption by random drivers. The include of linux/irq.h should have been a red flag in the first place. Driver code, unless implementing interrupt chips or low level hypervisor functionality has absolutely no business with that. Invoking generic_handle_irq() from non interrupt handling context can have nasty side effects at least on x86 due to the hardware trainwreck which makes interrupt affinity changes a fragile beast. Sathyanarayanan triggered a NULL pointer dereference in the low level APIC code that way. While the particular pointer could be checked this would only paper over the issue because there are other ways to trigger warnings or silently corrupt state. Invoke the new irq_inject_interrupt() mechanism, which has the necessary sanity checks in place and injects the interrupt via the irq_retrigger() mechanism, which is at least halfways safe vs. the fragile x86 affinity change mechanics. It's safe on x86 as it does not corrupt state, but it still can cause a premature completion of an interrupt affinity change causing the interrupt line to become stale. Very unlikely, but possible. For regular operations this is a non issue as AER error injection is meant for debugging and testing and not for usage on production systems. People using this should better know what they are doing. Fixes: 390e2db82480 ("PCI/AER: Abstract AER interrupt handling") Reported-by: sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NKuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: NKuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: https://lkml.kernel.org/r/20200306130624.098374457@linutronix.de (cherry picked from commit 9ae0522537852408f0f48af888e44d6876777463) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Conflicts: drivers/pci/pcie/Kconfig Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Thomas Gleixner 提交于
task #29600094 commit acd26bcf362708594ea081ef55140e37d0854ed2 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Error injection mechanisms need a half ways safe way to inject interrupts as invoking generic_handle_irq() or the actual device interrupt handler directly from e.g. a debugfs write is not guaranteed to be safe. On x86 generic_handle_irq() is unsafe due to the hardware trainwreck which is the base of x86 interrupt delivery and affinity management. Move the irq debugfs injection code into a separate function which can be used by error injection code as well. The implementation prevents at least that state is corrupted, but it cannot close a very tiny race window on x86 which might result in a stale and not serviced device interrupt under very unlikely circumstances. This is explicitly for debugging and testing and not for production use or abuse in random driver code. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NKuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: NKuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lkml.kernel.org/r/20200306130623.990928309@linutronix.de (cherry picked from commit acd26bcf362708594ea081ef55140e37d0854ed2) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Conflicts: include/linux/interrupt.h kernel/irq/debugfs.c Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Thomas Gleixner 提交于
task #29600094 commit da90921acc62c71d27729ae211ccfda5370bf75b upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. The code sets IRQS_REPLAY unconditionally whether the resend happens or not. That doesn't have bad side effects right now, but inconsistent state is always a latent source of problems. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lkml.kernel.org/r/20200306130623.882129117@linutronix.de (cherry picked from commit da90921acc62c71d27729ae211ccfda5370bf75b) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Thomas Gleixner 提交于
task #29600094 commit 1f85b1f5e1f5541272abedc19ba7b6c5b564c228 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. In preparation for an interrupt injection interface which can be used safely by error injection mechanisms. e.g. PCI-E/ AER, add a return value to check_irq_resend() so errors can be propagated to the caller. Split out the software resend code so the ugly #ifdef in check_irq_resend() goes away and the whole thing becomes readable. Fix up the caller in debugfs. The caller in irq_startup() does not care about the return value as this is unconditionally invoked for all interrupts and the resend is best effort anyway. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lkml.kernel.org/r/20200306130623.775200917@linutronix.de (cherry picked from commit 1f85b1f5e1f5541272abedc19ba7b6c5b564c228) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Thomas Gleixner 提交于
task #29600094 commit c16816acd08697b02a53f56f8936497a9f6f6e7a upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. In general calling generic_handle_irq() with interrupts disabled from non interrupt context is harmless. For some interrupt controllers like the x86 trainwrecks this is outright dangerous as it might corrupt state if an interrupt affinity change is pending. Add infrastructure which allows to mark interrupts as unsafe and catch such usage in generic_handle_irq(). Reported-by: sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NMarc Zyngier <maz@kernel.org> Link: https://lkml.kernel.org/r/20200306130623.590923677@linutronix.de (cherry picked from commit c16816acd08697b02a53f56f8936497a9f6f6e7a) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Conflicts: include/linux/irq.h Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Dongdong Liu 提交于
task #29600094 commit d95f20c4f07020ebc605f3b46af4b6db9eb5fc99 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Previously we did not call INIT_KFIFO() for aer_fifo. This leads to kfifo_put() sometimes returning 0 (queue full) when in fact it is not. It is easy to reproduce the problem by using aer-inject: $ aer-inject -s :82:00.0 multiple-corr-nonfatal The content of the multiple-corr-nonfatal file is as below: AER COR RCVR HL 0 1 2 3 AER UNCOR POISON_TLP HL 4 5 6 7 Fixes: 27c1ce8bbed7 ("PCI/AER: Use kfifo for tracking events instead of reimplementing it") Link: https://lore.kernel.org/r/1579767991-103898-1-git-send-email-liudongdong3@huawei.comSigned-off-by: NDongdong Liu <liudongdong3@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> (cherry picked from commit d95f20c4f07020ebc605f3b46af4b6db9eb5fc99) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Olof Johansson 提交于
task #29600094 commit 35a0b2378c199d4f26e458b2ca38ea56aaf2d9b8 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Prior to eed85ff4 ("PCI/DPC: Enable DPC only if AER is available"), Linux handled DPC events regardless of whether firmware had granted it ownership of AER or DPC, e.g., via _OSC. PCIe r5.0, sec 6.2.10, recommends that the OS link control of DPC to control of AER, so after eed85ff4, Linux handles DPC events only if it has control of AER. On platforms that do not grant OS control of AER via _OSC, Linux DPC handling worked before eed85ff4 but not after. To make Linux DPC handling work on those platforms the same way they did before, add a "pcie_ports=dpc-native" kernel parameter that makes Linux handle DPC events regardless of whether it has control of AER. [bhelgaas: commit log, move pcie_ports_dpc_native to drivers/pci/] Link: https://lore.kernel.org/r/20191023192205.97024-1-olof@lixom.netSigned-off-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> (cherry picked from commit 35a0b2378c199d4f26e458b2ca38ea56aaf2d9b8) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Andy Shevchenko 提交于
task #29600094 commit 161eea1b25268a1f7fa8d220a3f0c350b4cc491c upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Kernel-doc validator complains: aer.c:207: warning: Function parameter or member 'str' not described in 'pcie_ecrc_get_policy' aer.c:1209: warning: Function parameter or member 'irq' not described in 'aer_isr' aer.c:1209: warning: Function parameter or member 'context' not described in 'aer_isr' aer.c:1209: warning: Excess function parameter 'work' description in 'aer_isr' Fix the above accordingly. Link: https://lore.kernel.org/r/20190827151823.75312-2-andriy.shevchenko@linux.intel.comSigned-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> (cherry picked from commit 161eea1b25268a1f7fa8d220a3f0c350b4cc491c) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Andy Shevchenko 提交于
task #29600094 commit 6a8c97345a15f9c60ff6c6ac1629a3e9ec140320 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Simplify error counting code by using for_each_set_bit() library function. Link: https://lore.kernel.org/r/20190827151823.75312-1-andriy.shevchenko@linux.intel.comSigned-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> (cherry picked from commit 6a8c97345a15f9c60ff6c6ac1629a3e9ec140320) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Rajat Jain 提交于
task #29600094 commit 6458b438ebc12bec732290bf80c53c4eeeaed1c0 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. The elements in the aer_uncorrectable_error_string[] refer to the bit names in Uncorrectable Error Status Register. Add PoisonTLPBlocked, which was added in PCIe r3.1, sec 7.10.2. Link: https://lore.kernel.org/r/20190827222145.32642-1-rajatja@google.comSigned-off-by: NRajat Jain <rajatja@google.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> (cherry picked from commit 6458b438ebc12bec732290bf80c53c4eeeaed1c0) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Patel, Mayurkumar 提交于
task #29600094 commit af65d1ad416bc6e069ccb9e649faeda224248f96 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Previously we did not save and restore the AER configuration on suspend/resume, so the configuration may be lost after resume. Save the AER configuration during suspend and restore it during resume. [bhelgaas: commit log] Link: https://lore.kernel.org/r/92EBB4272BF81E4089A7126EC1E7B28492C3B007@IRSMSX101.ger.corp.intel.comSigned-off-by: NMayurkumar Patel <mayurkumar.patel@intel.com> Signed-off-by: NKuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> (cherry picked from commit af65d1ad416bc6e069ccb9e649faeda224248f96) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Mika Westerberg 提交于
task #29600094 commit ca78410403dd64ac0ee0e3cc8646b38335271bfd upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. In some systems, the Device/Port Type in the PCI Express Capabilities register incorrectly identifies upstream ports as downstream ports. d0751b98 ("PCI: Add dev->has_secondary_link to track downstream PCIe links") addressed this by adding pci_dev.has_secondary_link, which is set for downstream ports. But this is confusing because pci_pcie_type() sometimes gives the wrong answer, and it's not obvious that we should use pci_dev.has_secondary_link instead. Reduce the confusion by correcting the type of the port itself so that pci_pcie_type() returns the actual type regardless of what the Device/Port Type register claims it is. Update the users to call pci_pcie_type() and pcie_downstream_port() accordingly, and remove pci_dev.has_secondary_link completely. Link: https://lore.kernel.org/linux-pci/20190703133953.GK128603@google.com/Suggested-by: NBjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20190822085553.62697-2-mika.westerberg@linux.intel.comSigned-off-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> (cherry picked from commit ca78410403dd64ac0ee0e3cc8646b38335271bfd) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Mika Westerberg 提交于
task #29600094 commit 984998e3404e9073479281dbba8af36b104e8c00 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. pcie_downstream_port() is useful in other places where code needs to determine whether the PCIe port is downstream so make it available outside of access.c. Link: https://lore.kernel.org/r/20190822085553.62697-1-mika.westerberg@linux.intel.comSigned-off-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> (cherry picked from commit 984998e3404e9073479281dbba8af36b104e8c00) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Subbaraya Sundeep 提交于
task #29600094 commit 2dbce590117981196fe355efc0569bc6f949ae9b upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. The "Enhanced Allocation (EA) for Memory and I/O Resources" ECN, approved 23 October 2014, sec 6.9.1.2, specifies a second DW in the capability for type 1 (bridge) functions to describe fixed secondary and subordinate bus numbers. This ECN was included in the PCIe r4.0 spec, but sec 6.9.1.2 was omitted, presumably by mistake. Read fixed bus numbers from the EA capability for bridges. Signed-off-by: NSubbaraya Sundeep <sbhatta@marvell.com> [bhelgaas: add pci_ea_fixed_busnrs() return value] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> (cherry picked from commit 2dbce590117981196fe355efc0569bc6f949ae9b) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Alex Williamson 提交于
task #29600094 commit 15d2aba7c602cd9005b20ff011b670547b3882c4 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. The Interrupt Message Number in the PCIe Capabilities register (PCIe r4.0, sec 7.5.3.2) indicates which MSI/MSI-X vector is shared by interrupts related to the PCIe Capability, including Link Bandwidth Management and Link Autonomous Bandwidth Interrupts (Link Control, 7.5.3.7), Command Completed and Hot-Plug Interrupts (Slot Control, 7.5.3.10), and the PME Interrupt (Root Control, 7.5.3.12). pcie_message_numbers() checked whether we want to enable PME or Hot-Plug interrupts but neglected to check for Link Bandwidth Management, so if we only wanted the Bandwidth Management interrupts, it decided we didn't need any vectors at all. Then pcie_port_enable_irq_vec() tried to reallocate zero vectors, which failed, resulting in fallback to INTx. On some systems, e.g., an X79-based workstation, that INTx seems broken or not handled correctly, so we got spurious IRQ16 interrupts for Bandwidth Management events. Change pcie_message_numbers() so that if we want Link Bandwidth Management interrupts, we use the shared MSI/MSI-X vector from the PCIe Capabilities register. Fixes: e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth notification") Link: https://lore.kernel.org/lkml/155597243666.19387.1205950870601742062.stgit@gimli.homeSigned-off-by: NAlex Williamson <alex.williamson@redhat.com> [bhelgaas: changelog] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> (cherry picked from commit 15d2aba7c602cd9005b20ff011b670547b3882c4) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Alexandru Gagniuc 提交于
task #29600094 commit e8303bb7a75c113388badcc49b2a84b4121c1b3e upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Frederick Lawler 提交于
task #29600094 commit 9cc6f75b27e76d38fa7c2825c4a9a64fe26e4c77 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Log messages with pci_dev, not pcie_device. Factor out common message prefixes with dev_fmt(). Example output change: - aer 0000:00:00.0:pci002: AER enabled with IRQ ... + pcieport 0000:00:00.0: AER: enabled with IRQ ... Link: https://lore.kernel.org/lkml/20190509141456.223614-5-helgaas@kernel.orgSigned-off-by: NFrederick Lawler <fred@fredlawl.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NAndy Shevchenko <andy.shevchenko@gmail.com> (cherry picked from commit 9cc6f75b27e76d38fa7c2825c4a9a64fe26e4c77) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Conflicts: drivers/pci/pcie/aer.c Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Frederick Lawler 提交于
task #29600094 commit 10a9990c10447a7bfe9dc016629898814741d090 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Log messages with pci_dev, not pcie_device. Factor out common message prefixes with dev_fmt(). Example output change: - dpc 0000:00:01.1:pcie008: DPC error containment capabilities... + pcieport 0000:00:01.1: DPC: error containment capabilities... Link: https://lore.kernel.org/lkml/20190509141456.223614-4-helgaas@kernel.orgSigned-off-by: NFrederick Lawler <fred@fredlawl.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> (cherry picked from commit 10a9990c10447a7bfe9dc016629898814741d090) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Mohan Kumar 提交于
task #29600094 commit 34c6b7105e5a11174f856483cde8ad6e61b7236a upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Replace dev_printk(KERN_DEBUG) with dev_info(), etc to be more consistent with other logging and avoid checkpatch warnings. The KERN_DEBUG messages could be converted to dev_dbg(), but that depends on CONFIG_DYNAMIC_DEBUG and DEBUG, and we want most of these messages to *always* be in the dmesg log. Link: https://lore.kernel.org/lkml/1555733240-19875-1-git-send-email-mohankumar718@gmail.comSigned-off-by: NMohan Kumar <mohankumar718@gmail.com> [bhelgaas: commit log] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> (cherry picked from commit 34c6b7105e5a11174f856483cde8ad6e61b7236a) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Mohan Kumar 提交于
task #29600094 commit 25da8dbaaf0679b3b22c783952a8392071cfa135 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Replace printk() with pr_*() to be more consistent with other logging and avoid checkpatch warnings. Link: https://lore.kernel.org/lkml/1555733026-19609-1-git-send-email-mohankumar718@gmail.com Link: https://lore.kernel.org/lkml/1555733130-19804-1-git-send-email-mohankumar718@gmail.comSigned-off-by: NMohan Kumar <mohankumar718@gmail.com> [bhelgaas: squash in similar changes from second patch in series] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> (cherry picked from commit 25da8dbaaf0679b3b22c783952a8392071cfa135) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Bjorn Helgaas 提交于
task #29600094 commit 7db4af43c97b68dc65394c799b86cdd0fffe5f8d upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. Use dev_printk() when possible. This makes messages more consistent with other device-related messages and, in some cases, adds useful information. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> (cherry picked from commit 7db4af43c97b68dc65394c799b86cdd0fffe5f8d) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Honghui Zhang 提交于
task #29600094 commit f0cfecea8d1e8e0cd5d5053f9452b3a450f49eb5 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. The Class Code for subtractive decode PCI-to-PCI bridge is 060401h; add an entry to make portdrv support this type of bridge. This allows use of PCIe services on subtractive decode ports. Signed-off-by: NHonghui Zhang <honghui.zhang@mediatek.com> [bhelgaas: add braces surrounding entry] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> (cherry picked from commit f0cfecea8d1e8e0cd5d5053f9452b3a450f49eb5) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
由 Bjorn Helgaas 提交于
task #29600094 commit c89f7f98c971e0cabc819b6c0fe6bf509287b7e0 upstream. Backport summary: for 4.19 kernel ICX PCIe Gen4 support. The pci_device_id table was technically correct, but unusually formatted, which made adding entries error-prone. Change the format so it's obvious how to add entries. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> (cherry picked from commit c89f7f98c971e0cabc819b6c0fe6bf509287b7e0) Signed-off-by: NEthan Zhao <haifeng.zhao@intel.com> Signed-off-by: NArtie Ding <artie.ding@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-