1. 10 Sep 2019, 1 commit
    • Revert "x86/apic: Include the LDR when clearing out APIC registers" · 991467a4
      Linus Torvalds committed
      [ Upstream commit 950b07c14e8c59444e2359f15fd70ed5112e11a0 ]
      
      This reverts commit 558682b5291937a70748d36fd9ba757fb25b99ae.
      
      Chris Wilson reports that it breaks his CPU hotplug test scripts.  In
      particular, it breaks offlining and then re-onlining the boot CPU, which
      we treat specially (and the BIOS does too).
      
      The symptoms are that we can offline the CPU, but it then does not come
      back online again:
      
          smpboot: CPU 0 is now offline
          smpboot: Booting Node 0 Processor 0 APIC 0x0
          smpboot: do_boot_cpu failed(-1) to wakeup CPU#0
      
      Thomas says he knows why it's broken (my personal suspicion: our magic
      handling of the "cpu0_logical_apicid" thing), but for 5.3 the right fix
      is to just revert it, since we've never touched the LDR bits before, and
      it's not worth the risk to do anything else at this stage.
      
      [ Hotplugging of the boot CPU is special anyway, and should be off by
        default. See the "BOOTPARAM_HOTPLUG_CPU0" config option and the
        cpu0_hotplug kernel parameter.
      
        In general you should not do it, and it has various known limitations
        (hibernate and suspend require the boot CPU, for example).
      
        But it should work, even if the boot CPU is special and needs careful
        treatment  - Linus ]
      
      Link: https://lore.kernel.org/lkml/156785100521.13300.14461504732265570003@skylake-alporthouse-com/
      Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Bandan Das <bsd@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      991467a4
  2. 06 Sep 2019, 2 commits
  3. 29 Aug 2019, 1 commit
  4. 07 Aug 2019, 1 commit
    • x86/apic: Silence -Wtype-limits compiler warnings · 242666b2
      Qian Cai committed
      [ Upstream commit ec6335586953b0df32f83ef696002063090c7aef ]
      
      There are many compiler warnings like this,
      
      In file included from ./arch/x86/include/asm/smp.h:13,
                       from ./arch/x86/include/asm/mmzone_64.h:11,
                       from ./arch/x86/include/asm/mmzone.h:5,
                       from ./include/linux/mmzone.h:969,
                       from ./include/linux/gfp.h:6,
                       from ./include/linux/mm.h:10,
                       from arch/x86/kernel/apic/io_apic.c:34:
      arch/x86/kernel/apic/io_apic.c: In function 'check_timer':
      ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned
      expression >= 0 is always true [-Wtype-limits]
         if ((v) <= apic_verbosity) \
                 ^~
      arch/x86/kernel/apic/io_apic.c:2160:2: note: in expansion of macro
      'apic_printk'
        apic_printk(APIC_QUIET, KERN_INFO "..TIMER: vector=0x%02X "
        ^~~~~~~~~~~
      ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned
      expression >= 0 is always true [-Wtype-limits]
         if ((v) <= apic_verbosity) \
                 ^~
      arch/x86/kernel/apic/io_apic.c:2207:4: note: in expansion of macro
      'apic_printk'
          apic_printk(APIC_QUIET, KERN_ERR "..MP-BIOS bug: "
          ^~~~~~~~~~~
      
      APIC_QUIET is 0, so silence them by making apic_verbosity type int.
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/1562621805-24789-1-git-send-email-cai@lca.pw
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      242666b2
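      For reference, a minimal standalone sketch of the warning discussed above (plain C, not the kernel code; the names only mirror the changelog). With gcc -Wextra/-Wtype-limits the unsigned variant is expected to trigger the "always true" warning, while a plain int does not:

        /*
         * Minimal sketch, assuming only standard C and gcc: an unsigned
         * verbosity variable makes "(0) <= apic_verbosity" always true,
         * which -Wtype-limits flags; a plain int avoids the warning.
         */
        #include <stdio.h>

        #define APIC_QUIET 0

        static unsigned int apic_verbosity;    /* warns: unsigned >= 0 is always true */
        /* static int apic_verbosity; */       /* the fix: signed comparison, no warning */

        #define apic_printk(v, fmt, ...)                        \
                do {                                            \
                        if ((v) <= apic_verbosity)              \
                                printf(fmt, __VA_ARGS__);       \
                } while (0)

        int main(void)
        {
                apic_printk(APIC_QUIET, "..TIMER: vector=0x%02X\n", 0x22);
                return 0;
        }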
  5. 21 Jul 2019, 4 commits
    • x86/irq: Seperate unused system vectors from spurious entry again · fc6975ee
      Thomas Gleixner committed
      commit f8a8fe61fec8006575699559ead88b0b833d5cad upstream
      
      Quite some time ago the interrupt entry stubs for unused vectors in the
      system vector range got removed and directly mapped to the spurious
      interrupt vector entry point.
      
      Sounds reasonable, but it's subtly broken. The spurious interrupt vector
      entry point pushes vector number 0xFF on the stack which makes the whole
      logic in __smp_spurious_interrupt() pointless.
      
      As a consequence any spurious interrupt which comes from a vector != 0xFF
      is treated as a real spurious interrupt (vector 0xFF) and not
      acknowledged. That subsequently stalls all interrupt vectors of equal and
      lower priority, which brings the system to a grinding halt.
      
      This can happen because even on 64-bit the system vector space is not
      guaranteed to be fully populated. A full compile time handling of the
      unused vectors is not possible because quite some of them are conditionally
      populated at runtime.
      
      Bring the entry stubs back, which wastes 160 bytes if all stubs are unused,
      but gains the proper handling back. There is no point to selectively spare
      some of the stubs which are known at compile time as the required code in
      the IDT management would be way larger and convoluted.
      
      Do not route the spurious entries through common_interrupt and do_IRQ() as
      the original code did. Route it to smp_spurious_interrupt() which evaluates
      the vector number and acts accordingly now that the real vector numbers are
      handed in.
      
      Fix up the pr_warn so the actual spurious vector (0xff) is clearly
      distinguished from the other vectors, and also note for the vectored case
      whether it was pending in the ISR or not.
      
       "Spurious APIC interrupt (vector 0xFF) on CPU#0, should never happen."
       "Spurious interrupt vector 0xed on CPU#1. Acked."
       "Spurious interrupt vector 0xee on CPU#1. Not pending!."
      
      Fixes: 2414e021 ("x86: Avoid building unused IRQ entry stubs")
      Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Link: https://lkml.kernel.org/r/20190628111440.550568228@linutronix.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      fc6975ee
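      A simplified sketch of the handling described above, assuming the usual local APIC ISR layout of eight 32-bit registers covering 32 vectors each; this is illustrative only, not the exact upstream handler:

        #include <linux/kernel.h>
        #include <linux/smp.h>
        #include <asm/apic.h>

        /* Illustrative sketch, not the upstream handler. */
        static void sketch_handle_spurious(unsigned int vector)
        {
                u32 isr;

                if (vector == 0xff) {
                        pr_warn("Spurious APIC interrupt (vector 0xFF) on CPU#%d, should never happen.\n",
                                smp_processor_id());
                        return;
                }

                /* Read the 32-bit ISR word containing this vector's bit. */
                isr = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1));

                if (isr & (1U << (vector & 0x1f))) {
                        ack_APIC_irq();         /* really pending: ack it */
                        pr_warn("Spurious interrupt vector 0x%02x on CPU#%d. Acked\n",
                                vector, smp_processor_id());
                } else {
                        pr_warn("Spurious interrupt vector 0x%02x on CPU#%d. Not pending!\n",
                                vector, smp_processor_id());
                }
        }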
    • x86/irq: Handle spurious interrupt after shutdown gracefully · 9494cd39
      Thomas Gleixner committed
      commit b7107a67f0d125459fe41f86e8079afd1a5e0b15 upstream
      
      Since the rework of the vector management, warnings about spurious
      interrupts have been reported. Robert provided some more information and
      did an initial analysis. The following situation leads to these warnings:
      
         CPU 0                  CPU 1               IO_APIC
      
                                                    interrupt is raised
                                                    sent to CPU1
      			  Unable to handle
      			  immediately
      			  (interrupts off,
      			   deep idle delay)
         mask()
         ...
         free()
           shutdown()
           synchronize_irq()
           clear_vector()
                                do_IRQ()
                                  -> vector is clear
      
      Before the rework the vector entries of legacy interrupts were statically
      assigned and occupied precious vector space while most of them were
      unused. Due to that the above situation was handled silently because the
      vector was handled and the core handler of the assigned interrupt
      descriptor noticed that it is shut down and returned.
      
      While this has been usually observed with legacy interrupts, this situation
      is not limited to them. Any other interrupt source, e.g. MSI, can cause the
      same issue.
      
      After adding proper synchronization for level triggered interrupts, this
      can only happen for edge triggered interrupts where the IO-APIC obviously
      cannot provide information about interrupts in flight.
      
      While the spurious warning is actually harmless in this case it worries
      users and driver developers.
      
      Handle it gracefully by marking the vector entry as VECTOR_SHUTDOWN instead
      of VECTOR_UNUSED when the vector is freed up.
      
      If the above late handling happens, the spurious detector will not complain
      and will switch the entry to VECTOR_UNUSED. Any subsequent spurious interrupt on
      that line will trigger the spurious warning as before.
      
      Fixes: 464d1230 ("x86/vector: Switch IOAPIC to global reservation mode")
      Reported-by: Robert Hodaszi <Robert.Hodaszi@digi.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Robert Hodaszi <Robert.Hodaszi@digi.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Link: https://lkml.kernel.org/r/20190628111440.459647741@linutronix.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      9494cd39
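      A hypothetical plain-C model of the idea (the sentinel values and names are illustrative, not the upstream definitions): keep a per-CPU vector table with two distinct sentinels so a late interrupt on a just-freed vector is told apart from one on a never-used vector:

        #include <stdio.h>

        #define NR_VECTORS      256
        #define VECTOR_UNUSED   ((void *)0)     /* never allocated: warn as spurious       */
        #define VECTOR_SHUTDOWN ((void *)-1L)   /* freed recently: expected late interrupt */

        static void *vector_irq[NR_VECTORS];    /* per-CPU in the real thing */

        static void sketch_free_vector(unsigned int vector)
        {
                vector_irq[vector] = VECTOR_SHUTDOWN;   /* instead of VECTOR_UNUSED */
        }

        static void sketch_do_irq(unsigned int vector)
        {
                void *desc = vector_irq[vector];

                if (desc == VECTOR_SHUTDOWN) {
                        /* Late interrupt after shutdown: stay silent, then forget it. */
                        vector_irq[vector] = VECTOR_UNUSED;
                } else if (desc == VECTOR_UNUSED) {
                        printf("spurious interrupt on vector %u\n", vector);
                } else {
                        /* dispatch to the interrupt descriptor's handler */
                }
        }

        int main(void)
        {
                sketch_free_vector(0x30);
                sketch_do_irq(0x30);    /* silent: vector was just shut down  */
                sketch_do_irq(0x30);    /* now warns: vector is really unused */
                return 0;
        }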
    • x86/ioapic: Implement irq_get_irqchip_state() callback · 7897f5a4
      Thomas Gleixner committed
      commit dfe0cf8b51b07e56ded571e3de0a4a9382517231 upstream
      
      When an interrupt is shut down in free_irq() there might be an inflight
      interrupt pending in the IO-APIC remote IRR which is not yet serviced. That
      means the interrupt has been sent to the target CPUs local APIC, but the
      target CPU is in a state which delays the servicing.
      
      So free_irq() would proceed to free resources and to clear the vector
      because synchronize_hardirq() does not see an interrupt handler in
      progress.
      
      That can trigger a spurious interrupt warning, which is harmless and just
      confuses users, but it also can leave the remote IRR in a stale state
      because once the handler is invoked the interrupt resources might be freed
      already and therefore acknowledgement is not possible anymore.
      
      Implement the irq_get_irqchip_state() callback for the IO-APIC irq chip. The
      callback is invoked from free_irq() via __synchronize_hardirq(). Check the
      remote IRR bit of the interrupt and return 'in flight' if it is set and the
      interrupt is configured in level mode. For edge mode the remote IRR has no
      meaning.
      
      As this is only meaningful for level triggered interrupts this won't cure
      the potential spurious interrupt warning for edge triggered interrupts, but
      the edge trigger case does not result in stale hardware state. This has to
      be addressed at the vector/interrupt entry level separately.
      
      Fixes: 464d1230 ("x86/vector: Switch IOAPIC to global reservation mode")
      Reported-by: Robert Hodaszi <Robert.Hodaszi@digi.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Link: https://lkml.kernel.org/r/20190628111440.370295517@linutronix.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      7897f5a4
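      A plain-C model of the decision the callback makes (illustrative only; struct and field names are assumptions): report "in flight" only when the entry is level triggered and its remote IRR bit is set:

        #include <stdbool.h>
        #include <stdio.h>

        struct sketch_rte {
                bool level_triggered;   /* trigger mode from the routing entry    */
                bool remote_irr;        /* set until the handler EOIs the IO-APIC */
        };

        /* Remote IRR only means something for level mode; edge reports "not in flight". */
        static bool sketch_irq_in_flight(const struct sketch_rte *rte)
        {
                return rte->level_triggered && rte->remote_irr;
        }

        int main(void)
        {
                struct sketch_rte level = { .level_triggered = true,  .remote_irr = true };
                struct sketch_rte edge  = { .level_triggered = false, .remote_irr = true };

                printf("level in flight: %d\n", sketch_irq_in_flight(&level));  /* 1 */
                printf("edge  in flight: %d\n", sketch_irq_in_flight(&edge));   /* 0 */
                return 0;
        }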
    • x86/apic: Fix integer overflow on 10 bit left shift of cpu_khz · 2a6ee369
      Colin Ian King committed
      [ Upstream commit ea136a112d89bade596314a1ae49f748902f4727 ]
      
      The left shift of unsigned int cpu_khz will overflow for large values of
      cpu_khz, so cast it to a long long before shifting it to avoid overflow.
      For example, this can happen when cpu_khz is 4194305, i.e. ~4.2 GHz.
      
      Addresses-Coverity: ("Unintentional integer overflow")
      Fixes: 8c3ba8d0 ("x86, apic: ack all pending irqs when crashed/on kexec")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: kernel-janitors@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190619181446.13635-1-colin.king@canonical.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2a6ee369
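      A standalone illustration of the arithmetic (not the kernel code): for cpu_khz = 4194305 the 10-bit left shift needs 33 bits, so it wraps in 32-bit unsigned arithmetic unless widened first:

        #include <stdio.h>

        int main(void)
        {
                unsigned int cpu_khz = 4194305;         /* ~4.2 GHz, as in the changelog */

                unsigned int wrapped = cpu_khz << 10;               /* wraps to 1024            */
                long long    widened = (long long)cpu_khz << 10;    /* 4294968320, as intended  */

                printf("wrapped: %u\nwidened: %lld\n", wrapped, widened);
                return 0;
        }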
  6. 06 Mar 2019, 1 commit
    • irq/matrix: Spread managed interrupts on allocation · 8cae7757
      Dou Liyang committed
      [ Upstream commit 76f99ae5b54d48430d1f0c5512a84da0ff9761e0 ]
      
      Linux spreads out the non-managed interrupts across the possible target CPUs
      to avoid vector space exhaustion.
      
      Managed interrupts are treated differently, as for them the vectors are
      reserved (with guarantee) when the interrupt descriptors are initialized.
      
      When the interrupt is requested a real vector is assigned. The assignment
      logic uses the first CPU in the affinity mask for assignment. If the
      interrupt has more than one CPU in the affinity mask, which happens when a
      multi queue device has less queues than CPUs, then doing the same search as
      for non managed interrupts makes sense as it puts the interrupt on the
      least interrupt plagued CPU. For single CPU affine vectors that's obviously
      a NOOP.
      
      Restructure the matrix allocation code so it does the 'best CPU' search, add
      the sanity check for an empty affinity mask and adapt the call site in the
      x86 vector management code.
      
      [ tglx: Added the empty mask check to the core and improved change log ]
      Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: hpa@zytor.com
      Link: https://lkml.kernel.org/r/20180908175838.14450-2-dou_liyang@163.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8cae7757
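      A plain-C sketch of the "best CPU" idea from the changelog (not the irq_matrix code; names are illustrative): among the CPUs in the affinity mask, pick the one with the fewest vectors in use, and reject an empty mask:

        #include <stdio.h>

        #define NR_CPUS 8

        static unsigned int allocated[NR_CPUS];         /* vectors in use per CPU */

        static int sketch_find_best_cpu(unsigned int affinity_mask)
        {
                int cpu, best = -1;
                unsigned int best_load = ~0U;

                if (!affinity_mask)
                        return -1;                      /* empty mask: sanity check */

                for (cpu = 0; cpu < NR_CPUS; cpu++) {
                        if (!(affinity_mask & (1U << cpu)))
                                continue;
                        if (allocated[cpu] < best_load) {
                                best_load = allocated[cpu];
                                best = cpu;
                        }
                }
                return best;
        }

        int main(void)
        {
                allocated[0] = 30; allocated[1] = 5; allocated[2] = 12;
                printf("best CPU: %d\n", sketch_find_best_cpu(0x7));    /* -> 1 */
                return 0;
        }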
  7. 08 Sep 2018, 1 commit
  8. 15 Aug 2018, 1 commit
  9. 05 Aug 2018, 1 commit
    • x86: Don't include linux/irq.h from asm/hardirq.h · 447ae316
      Nicolai Stange committed
      The next patch in this series will have to make the definition of
      irq_cpustat_t available to entering_irq().
      
      Inclusion of asm/hardirq.h into asm/apic.h would cause circular header
      dependencies like
      
        asm/smp.h
          asm/apic.h
            asm/hardirq.h
              linux/irq.h
                linux/topology.h
                  linux/smp.h
                    asm/smp.h
      
      or
      
        linux/gfp.h
          linux/mmzone.h
            asm/mmzone.h
              asm/mmzone_64.h
                asm/smp.h
                  asm/apic.h
                    asm/hardirq.h
                      linux/irq.h
                        linux/irqdesc.h
                          linux/kobject.h
                            linux/sysfs.h
                              linux/kernfs.h
                                linux/idr.h
                                  linux/gfp.h
      
      and others.
      
      This causes compilation errors because of the header guards becoming
      effective in the second inclusion: symbols/macros that had been defined
      before wouldn't be available to intermediate headers in the #include chain
      anymore.
      
      A possible workaround would be to move the definition of irq_cpustat_t
      into its own header and include that from both, asm/hardirq.h and
      asm/apic.h.
      
      However, this wouldn't solve the real problem, namely asm/hardirq.h
      unnecessarily pulling in all the linux/irq.h cruft: nothing in
      asm/hardirq.h itself requires it. Also, note that there are some other
      archs, like e.g. arm64, which don't have that #include in their
      asm/hardirq.h.
      
      Remove the linux/irq.h #include from x86' asm/hardirq.h.
      
      Fix resulting compilation errors by adding appropriate #includes to *.c
      files as needed.
      
      Note that some of these *.c files could be cleaned up a bit wrt. to their
      set of #includes, but that should better be done from separate patches, if
      at all.
      Signed-off-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      447ae316
  10. 31 Jul 2018, 2 commits
  11. 24 Jul 2018, 1 commit
  12. 02 Jul 2018, 1 commit
    • Revert "x86/apic: Ignore secondary threads if nosmt=force" · 506a66f3
      Thomas Gleixner committed
      Dave Hansen reported that it's outright dangerous to keep SMT siblings
      disabled completely so they are stuck in the BIOS and wait for SIPI.
      
      The reason is that Machine Check Exceptions are broadcast to siblings and
      the soft-disabled sibling has CR4.MCE = 0. If an MCE is delivered to a
      logical core with CR4.MCE = 0, it asserts IERR#, which shuts down or
      reboots the machine. The MCE chapter in the SDM contains the following
      blurb:
      
          Because the logical processors within a physical package are tightly
          coupled with respect to shared hardware resources, both logical
          processors are notified of machine check errors that occur within a
          given physical processor. If machine-check exceptions are enabled when
          a fatal error is reported, all the logical processors within a physical
          package are dispatched to the machine-check exception handler. If
          machine-check exceptions are disabled, the logical processors enter the
          shutdown state and assert the IERR# signal. When enabling machine-check
          exceptions, the MCE flag in control register CR4 should be set for each
          logical processor.
      
      Reverting the commit which ignores siblings at enumeration time solves only
      half of the problem. The core cpuhotplug logic needs to be adjusted as
      well.
      
      This thoughtfully engineered mechanism also turns the boot process on all
      Intel HT enabled systems into a MCE lottery. MCE is enabled on the boot CPU
      before the secondary CPUs are brought up. Depending on the number of
      physical cores the window in which this situation can happen is smaller or
      larger. On a HSW-EX it's about 750ms:
      
      MCE is enabled on the boot CPU:
      
      [    0.244017] mce: CPU supports 22 MCE banks
      
      The corresponding sibling #72 boots:
      
      [    1.008005] .... node  #0, CPUs:    #72
      
      That means if an MCE hits on physical core 0 (logical CPUs 0 and 72)
      between these two points the machine is going to shutdown. At least it's a
      known safe state.
      
      It's obvious that the early boot can be hit by an MCE as well and then runs
      into the same situation because MCEs are not yet enabled on the boot CPU.
      But after enabling them on the boot CPU, it does not make any sense to
      prevent the kernel from recovering.
      
      Adjust the nosmt kernel parameter documentation as well.
      
      Reverts: 2207def7 ("x86/apic: Ignore secondary threads if nosmt=force")
      Reported-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Tony Luck <tony.luck@intel.com>
      506a66f3
  13. 21 Jun 2018, 4 commits
    • x86/platform/UV: Add kernel parameter to set memory block size · d7609f42
      mike.travis@hpe.com committed
      Add a kernel parameter that allows setting UV memory block size.  This
      is to provide an adjustment for new forms of PMEM and other DIMM memory
      that might require alignment restrictions other than scanning the global
      address table for the required minimum alignment.  The value set will be
      further adjusted by both the GAM range table scan as well as restrictions
      imposed by set_memory_block_size_order().
      Signed-off-by: Mike Travis <mike.travis@hpe.com>
      Reviewed-by: Andrew Banman <andrew.banman@hpe.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <russ.anderson@hpe.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dan.j.williams@intel.com
      Cc: jgross@suse.com
      Cc: kirill.shutemov@linux.intel.com
      Cc: mhocko@suse.com
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/lkml/20180524201711.854849120@stormcage.americas.sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d7609f42
    • x86/platform/UV: Use new set memory block size function · bbbd2b51
      mike.travis@hpe.com committed
      Add a call to the new function to "adjust" the current fixed UV memory
      block size of 2GB so it can be changed to a different physical boundary.
      This accommodates changes in the Intel BIOS, and therefore UV BIOS,
      which now can align boundaries different than the previous UV standard
      of 2GB.  It also flags any UV Global Address boundaries from BIOS that
      cause a change in the mem block size (boundary).
      
      The current boundary of 2GB has been used on UV since the first system
      release in 2009 with Linux 2.6 and has worked fine.  But the new NVDIMM
      persistent memory modules (PMEM), along with the Intel BIOS changes to
      support these modules caused the memory block size boundary to be set
      to a lower limit.  Intel only guarantees this minimum boundary to be 64MB,
      though the current Linux limit is 128MB.
      
      Note that the default remains 2GB if no changes occur.
      Signed-off-by: Mike Travis <mike.travis@hpe.com>
      Reviewed-by: Andrew Banman <andrew.banman@hpe.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <russ.anderson@hpe.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dan.j.williams@intel.com
      Cc: jgross@suse.com
      Cc: kirill.shutemov@linux.intel.com
      Cc: mhocko@suse.com
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/lkml/20180524201711.732785782@stormcage.americas.sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      bbbd2b51
    • x86/apic: Ignore secondary threads if nosmt=force · 2207def7
      Thomas Gleixner committed
      nosmt on the kernel command line merely prevents the onlining of the
      secondary SMT siblings.
      
      nosmt=force makes the APIC detection code ignore the secondary SMT siblings
      completely, so they even do not show up as possible CPUs. That reduces the
      amount of memory allocations for per cpu variables and saves other
      resources from being allocated too large.
      
      This is not fully equivalent to disabling SMT in the BIOS because the low
      level SMT enabling in the BIOS can result in partitioning of resources
      between the siblings, which is not undone by just ignoring them. Some CPUs
      can use the full resources when their sibling is not onlined, but this is
      depending on the CPU family and model and it's not well documented whether
      this applies to all partitioned resources. That means depending on the
      workload disabling SMT in the BIOS might result in better performance.
      
      Linus' analysis of the Intel manual:
      
        The intel optimization manual is not very clear on what the partitioning
        rules are.
      
        I find:
      
          "In general, the buffers for staging instructions between major pipe
           stages  are partitioned. These buffers include µop queues after the
           execution trace cache, the queues after the register rename stage, the
           reorder buffer which stages instructions for retirement, and the load
           and store buffers.
      
           In the case of load and store buffers, partitioning also provided an
           easier implementation to maintain memory ordering for each logical
           processor and detect memory ordering violations"
      
        but some of that partitioning may be relaxed if the HT thread is "not
        active":
      
          "In Intel microarchitecture code name Sandy Bridge, the micro-op queue
           is statically partitioned to provide 28 entries for each logical
           processor,  irrespective of software executing in single thread or
           multiple threads. If one logical processor is not active in Intel
           microarchitecture code name Ivy Bridge, then a single thread executing
           on that processor  core can use the 56 entries in the micro-op queue"
      
        but I do not know what "not active" means, and how dynamic it is. Some of
        that partitioning may be entirely static and depend on the early BIOS
        disabling of HT, and even if we park the cores, the resources will just be
        wasted.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      2207def7
    • x86/smp: Provide topology_is_primary_thread() · 6a4d2657
      Thomas Gleixner committed
      If the CPU is supporting SMT then the primary thread can be found by
      checking the lower APIC ID bits for zero. smp_num_siblings is used to build
      the mask for the APIC ID bits which need to be taken into account.
      
      This uses the MPTABLE or ACPI/MADT supplied APIC ID, which can be different
      than the initial APIC ID in CPUID. But according to AMD the lower bits have
      to be consistent. Intel gave a tentative confirmation as well.
      
      Preparatory patch to support disabling SMT at boot/runtime.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      6a4d2657
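      A plain-C sketch of the check described above (not the kernel helper itself; names are illustrative): build a mask covering the SMT bits of the APIC ID from the sibling count and call the thread primary when those bits are zero:

        #include <stdbool.h>
        #include <stdio.h>

        /* Round the sibling count up to a power of two and return a mask of those bits. */
        static unsigned int smt_mask(unsigned int smp_num_siblings)
        {
                unsigned int bits = 0;

                while ((1U << bits) < smp_num_siblings)
                        bits++;
                return (1U << bits) - 1;
        }

        static bool sketch_is_primary_thread(unsigned int apicid, unsigned int siblings)
        {
                return (apicid & smt_mask(siblings)) == 0;
        }

        int main(void)
        {
                /* 2-way SMT: APIC IDs 0,2,4,... are primaries, 1,3,5,... are siblings. */
                printf("%d %d\n", sketch_is_primary_thread(4, 2),
                                  sketch_is_primary_thread(5, 2));      /* 1 0 */
                return 0;
        }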
  14. 06 Jun 2018, 4 commits
    • x86/apic/vector: Print APIC control bits in debugfs · a07771ac
      Thomas Gleixner committed
      Extend the debuggability of the vector management by adding the state bits
      to the debugfs output.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Song Liu <songliubraving@fb.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <liu.song.a23@gmail.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Link: https://lkml.kernel.org/r/20180604162224.908136099@linutronix.de
      a07771ac
    • x86/ioapic: Use apic_ack_irq() · 2b04e46d
      Thomas Gleixner committed
      To address the EBUSY fail of interrupt affinity settings in case that the
      previous setting has not been cleaned up yet, use the new apic_ack_irq()
      function instead of directly invoking ack_APIC_irq().
      
      Preparatory change for the real fix
      
      Fixes: dccfe314 ("x86/vector: Simplify vector move cleanup")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Song Liu <songliubraving@fb.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <liu.song.a23@gmail.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Link: https://lkml.kernel.org/r/20180604162224.639011135@linutronix.de
      2b04e46d
    • x86/apic: Provide apic_ack_irq() · c0255770
      Thomas Gleixner committed
      apic_ack_edge() is explicitly for handling interrupt affinity cleanup when
      interrupt remapping is not available or disabled.
      
      Remapped interrupts and also some of the platform specific special
      interrupts, e.g. UV, invoke ack_APIC_irq() directly.
      
      To address the issue of failing an affinity update with -EBUSY the delayed
      affinity mechanism can be reused, but ack_APIC_irq() does not handle
      that. Adding this to ack_APIC_irq() is not possible, because that function
      is also used for exceptions and directly handled interrupts like IPIs.
      
      Create a new function, which just contains the conditional invocation of
      irq_move_irq() and the final ack_APIC_irq().
      
      Reuse the new function in apic_ack_edge().
      
      Preparatory change for the real fix.
      
      Fixes: dccfe314 ("x86/vector: Simplify vector move cleanup")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Song Liu <songliubraving@fb.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <liu.song.a23@gmail.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Link: https://lkml.kernel.org/r/20180604162224.471925894@linutronix.de
      c0255770
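      Going by the description, the helper amounts to the conditional move handling followed by the final ack; a sketch of that shape (the exact upstream body may differ, and the function name here is illustrative):

        #include <linux/irq.h>
        #include <asm/apic.h>

        /* Sketch of the helper's shape; apic_ack_edge() then reuses it, per the changelog. */
        static void sketch_apic_ack_irq(struct irq_data *irqd)
        {
                irq_move_irq(irqd);     /* perform a pending affinity move, if any */
                ack_APIC_irq();         /* EOI to the local APIC */
        }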
    • x86/apic/vector: Prevent hlist corruption and leaks · 80ae7b1a
      Thomas Gleixner committed
      Several people observed the WARN_ON() in irq_matrix_free() which triggers
      when the caller tries to free a vector which is not in the allocation
      range. Song provided the trace information which allowed the root cause to
      be decoded.
      
      The rework of the vector allocation mechanism failed to preserve a sanity
      check, which prevents setting a new target vector/CPU when the previous
      affinity change has not fully completed.
      
      As a result a half-finished affinity change can be overwritten, which can
      cause the leak of an irq descriptor pointer on the previous target CPU and
      double enqueue of the hlist head into the cleanup lists of two or more
      CPUs. After one CPU cleaned up its vector the next CPU will invoke the
      cleanup handler with vector 0, which triggers the out of range warning in
      the matrix allocator.
      
      Prevent this by checking the apic_data of the interrupt whether the
      move_in_progress flag is false and the hlist node is not hashed. Return
      -EBUSY if not.
      
      This prevents the damage and restores the behaviour before the vector
      allocation rework, but due to other changes in that area it also widens the
      chance that user space can observe -EBUSY. In theory this should be fine,
      but actually not all user space tools handle -EBUSY correctly. Addressing
      that is not part of this fix, but will be addressed in follow up patches.
      
      Fixes: 69cde000 ("x86/vector: Use matrix allocator for vector assignment")
      Reported-by: Dmitry Safonov <0x7f454c46@gmail.com>
      Reported-by: Tariq Toukan <tariqt@mellanox.com>
      Reported-by: Song Liu <liu.song.a23@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Song Liu <songliubraving@fb.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: https://lkml.kernel.org/r/20180604162224.303870257@linutronix.de
      80ae7b1a
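      A sketch of the guard described above (struct and field names are assumptions for illustration): refuse a new target while the previous move is still in progress or its cleanup node is still queued:

        #include <linux/list.h>
        #include <linux/errno.h>

        /* Field names here are illustrative, not the upstream struct layout. */
        struct sketch_apic_chip_data {
                struct hlist_node clist;        /* node on a CPU's cleanup list  */
                bool move_in_progress;          /* previous affinity move active */
        };

        static int sketch_set_new_target(struct sketch_apic_chip_data *apicd)
        {
                /* Previous change not fully completed yet? Tell the caller to retry. */
                if (apicd->move_in_progress || !hlist_unhashed(&apicd->clist))
                        return -EBUSY;

                /* ... safe to assign the new vector/CPU here ... */
                return 0;
        }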
  15. 19 May 2018, 1 commit
  16. 18 May 2018, 1 commit
  17. 10 Apr 2018, 1 commit
    • x86/apic: Fix signedness bug in APIC ID validity checks · a774635d
      Li RongQing committed
      The APIC ID as parsed from ACPI MADT is validity checked with the
      apic->apic_id_valid() callback, which depends on the selected APIC type.
      
      For non X2APIC types APIC IDs >= 0xFF are invalid, but values > 0x7FFFFFFF
      are detected as valid. This happens because the 'apicid' argument of the
      apic_id_valid() callback is type 'int'. So the resulting comparison
      
         apicid < 0xFF
      
      evaluates to true for all unsigned int values > 0x7FFFFFFF which are handed
      to default_apic_id_valid(). As a consequence, invalid APIC IDs in !X2APIC
      mode are considered valid and accounted as possible CPUs.
      
      Change the apicid argument type of the apic_id_valid() callback to u32 so
      the evaluation is unsigned and returns the correct result.
      
      [ tglx: Massaged changelog ]
      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: jgross@suse.com
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: hpa@zytor.com
      Link: https://lkml.kernel.org/r/1523322966-10296-1-git-send-email-lirongqing@baidu.com
      a774635d
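      A standalone illustration of the signedness issue (not the kernel callback): with an int argument, any value above 0x7FFFFFFF becomes negative and passes the "< 0xFF" check, while u32 keeps the comparison unsigned:

        #include <stdint.h>
        #include <stdio.h>

        static int buggy_apic_id_valid(int apicid)      { return apicid < 0xFF; }
        static int fixed_apic_id_valid(uint32_t apicid) { return apicid < 0xFF; }

        int main(void)
        {
                uint32_t bogus = 0xFFFFFFFFu;   /* clearly not a valid xAPIC ID */

                printf("buggy: %d\n", buggy_apic_id_valid(bogus));   /* 1: accepted */
                printf("fixed: %d\n", fixed_apic_id_valid(bogus));   /* 0: rejected */
                return 0;
        }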
  18. 01 Mar 2018, 3 commits
  19. 23 Feb 2018, 1 commit
  20. 20 Feb 2018, 1 commit
  21. 17 Feb 2018, 5 commits
  22. 16 Feb 2018, 1 commit
  23. 15 Feb 2018, 1 commit