1. 09 9月, 2015 1 次提交
    • M
      x86: use generic early mem copy · 5dd2c4bd
      Mark Salter 提交于
      The early_ioremap library now has a generic copy_from_early_mem()
      function.  Use the generic copy function for x86 relocate_initrd().
      
      [akpm@linux-foundation.org: remove MAX_MAP_CHUNK define, per Yinghai Lu]
      Signed-off-by: NMark Salter <msalter@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5dd2c4bd
  2. 05 9月, 2015 3 次提交
  3. 28 8月, 2015 2 次提交
    • T
      x86/irq: Do not dereference irq descriptor before checking it · a47d4576
      Thomas Gleixner 提交于
      Having the IS_NULL_OR_ERR() check after dereferencing the pointer is
      not really working well.
      
      Move the dereference after the check.
      
      Fixes: a782a7e4 'x86/irq: Store irq descriptor in vector array'
      Reported-and-tested-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      a47d4576
    • L
      x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del() · 2baa891e
      Luis R. Rodriguez 提交于
      The effort to replace mtrr_add() with architecture agnostic
      arch_phys_wc_add() is complete, this will ensure write-combining
      implementations (PAT on x86) is taken advantage instead of using
      MTRR. With the effort done now, hide direct MTRR access for
      drivers.
      
      The legacy user-space /proc/mtrr ABI is not affected.
      
      Update x86 documentation on MTRR to reflect the completion of
      the phasing out of direct access to MTRR, also add a note on
      platform firmware code use of MTRRs based on the obituary
      discussion of MTRRs on Linux [0].
      
        [0] http://lkml.kernel.org/r/1438991330.3109.196.camel@hp.comSigned-off-by: NLuis R. Rodriguez <mcgrof@suse.com>
      Cc: <syrjala@sci.fi>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Walls <awalls@md.metrocast.net>
      Cc: Antonino Daplas <adaplas@gmail.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Ville Syrjälä <syrjala@sci.fi>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: airlied@linux.ie
      Cc: benh@kernel.crashing.org
      Cc: bhelgaas@google.com
      Cc: dan.j.williams@intel.com
      Cc: konrad.wilk@oracle.com
      Cc: linux-fbdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: mst@redhat.com
      Cc: netdev@vger.kernel.org
      Cc: vinod.koul@intel.com
      Cc: xen-devel@lists.xensource.com
      Link: http://lkml.kernel.org/r/1440443613-13696-12-git-send-email-mcgrof@do-not-panic.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2baa891e
  4. 27 8月, 2015 1 次提交
    • J
      ACPI, PCI: Penalize legacy IRQ used by ACPI SCI · 5d0ddfeb
      Jiang Liu 提交于
      Nick Meier reported a regression with HyperV that "
        After rebooting the VM, the following messages are logged in syslog
        when trying to load the tulip driver:
          tulip: Linux Tulip drivers version 1.1.15 (Feb 27, 2007)
          tulip: 0000:00:0a.0: PCI INT A: failed to register GSI
          tulip: Cannot enable tulip board #0, aborting
          tulip: probe of 0000:00:0a.0 failed with error -16
        Errors occur in 3.19.0 kernel
        Works in 3.17 kernel.
      "
      
      According to the ACPI dump file posted by Nick at
      https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072
      
      The ACPI MADT table includes an interrupt source overridden entry for
      ACPI SCI:
      [236h 0566  1]                Subtable Type : 02 <Interrupt Source Override>
      [237h 0567  1]                       Length : 0A
      [238h 0568  1]                          Bus : 00
      [239h 0569  1]                       Source : 09
      [23Ah 0570  4]                    Interrupt : 00000009
      [23Eh 0574  2]        Flags (decoded below) : 000D
                                         Polarity : 1
                                     Trigger Mode : 3
      
      And in DSDT table, we have _PRT method to define PCI interrupts, which
      eventually goes to:
              Name (PRSA, ResourceTemplate ()
              {
                  IRQ (Level, ActiveLow, Shared, )
                      {3,4,5,7,9,10,11,12,14,15}
              })
              Name (PRSB, ResourceTemplate ()
              {
                  IRQ (Level, ActiveLow, Shared, )
                      {3,4,5,7,9,10,11,12,14,15}
              })
              Name (PRSC, ResourceTemplate ()
              {
                  IRQ (Level, ActiveLow, Shared, )
                      {3,4,5,7,9,10,11,12,14,15}
              })
              Name (PRSD, ResourceTemplate ()
              {
                  IRQ (Level, ActiveLow, Shared, )
                      {3,4,5,7,9,10,11,12,14,15}
              })
      
      According to the MADT and DSDT tables, IRQ 9 may be used for:
       1) ACPI SCI in level, high mode
       2) PCI legacy IRQ in level, low mode
      So there's a conflict in polarity setting for IRQ 9.
      
      Prior to commit cd68f6bd ("x86, irq, acpi: Get rid of special
      handling of GSI for ACPI SCI"), ACPI SCI is handled specially and
      there's no check for conflicts between ACPI SCI and PCI legagy IRQ.
      And it seems that the HyperV hypervisor doesn't make use of the
      polarity configuration in IOAPIC entry, so it just works.
      
      Commit cd68f6bd gets rid of the specially handling of ACPI SCI,
      and then the pin attribute checking code discloses the conflicts
      between ACPI SCI and PCI legacy IRQ on HyperV virtual machine,
      and rejects the request to assign IRQ9 to PCI devices.
      
      So penalize legacy IRQ used by ACPI SCI and mark it unusable if ACPI
      SCI attributes conflict with PCI IRQ attributes.
      
      Please refer to following links for more information:
      https://bugzilla.kernel.org/show_bug.cgi?id=101301
      https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072
      
      Fixes: cd68f6bd ("x86, irq, acpi: Get rid of special handling of GSI for ACPI SCI")
      Reported-and-tested-by: NNick Meier <nmeier@microsoft.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: 3.19+ <stable@vger.kernel.org> # 3.19+
      Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      5d0ddfeb
  5. 25 8月, 2015 1 次提交
  6. 23 8月, 2015 1 次提交
  7. 22 8月, 2015 5 次提交
    • T
      x86/apic: Fix fallout from x2apic cleanup · a57e456a
      Thomas Gleixner 提交于
      In the recent x2apic cleanup I got two things really wrong:
      1) The safety check in __disable_x2apic which allows the function to
         be called unconditionally is backwards. The check is there to
         prevent access to the apic MSR in case that the machine has no
         apic. Though right now it returns if the machine has an apic and
         therefor the disabling of x2apic is never invoked.
      
      2) x2apic_disable() sets x2apic_mode to 0 after registering the local
         apic. That's wrong, because register_lapic_address() checks x2apic
         mode and therefor takes the wrong code path.
      
      This results in boot failures on machines with x2apic preenabled by
      BIOS and can also lead to an fatal MSR access on machines without
      apic.
      
      The solutions are simple:
      1) Correct the sanity check for apic availability
      2) Clear x2apic_mode _before_ calling register_lapic_address()
      
      Fixes: 659006bf 'x86/x2apic: Split enable and setup function'
      Reported-and-tested-by: NJavier Monteagudo <javiermon@gmail.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Link: https://bugzilla.redhat.com/show_bug.cgi?id=1224764
      Cc: stable@vger.kernel.org # 4.0+
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      a57e456a
    • H
      x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer · b466bdb6
      Huang Rui 提交于
      MWAITX can enable a timer and a corresponding timer value
      specified in SW P0 clocks. The SW P0 frequency is the same as
      TSC. The timer provides an upper bound on how long the
      instruction waits before exiting.
      
      This way, a delay function in the kernel can leverage that
      MWAITX timer of MWAITX.
      
      When a CPU core executes MWAITX, it will be quiesced in a
      waiting phase, diminishing its power consumption. This way, we
      can save power in comparison to our default TSC-based delays.
      
      A simple test shows that:
      
      	$ cat /sys/bus/pci/devices/0000\:00\:18.4/hwmon/hwmon0/power1_acc
      	$ sleep 10000s
      	$ cat /sys/bus/pci/devices/0000\:00\:18.4/hwmon/hwmon0/power1_acc
      
      Results:
      
      	* TSC-based default delay:      485115 uWatts average power
      	* MWAITX-based delay:           252738 uWatts average power
      
      Thus, that's about 240 milliWatts less power consumption. The
      test method relies on the support of AMD CPU accumulated power
      algorithm in fam15h_power for which patches are forthcoming.
      Suggested-by: NAndy Lutomirski <luto@amacapital.net>
      Suggested-by: NBorislav Petkov <bp@suse.de>
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NHuang Rui <ray.huang@amd.com>
      [ Fix delay truncation. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@gmail.com>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Li <tony.li@amd.com>
      Link: http://lkml.kernel.org/r/1438744732-1459-3-git-send-email-ray.huang@amd.com
      Link: http://lkml.kernel.org/r/1439201994-28067-4-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b466bdb6
    • A
      x86/traps: Weaken context tracking entry assertions · f0a97af8
      Andy Lutomirski 提交于
      We were asserting that we were all the way in CONTEXT_KERNEL
      when exception handlers were called.  While having this be true
      is, I think, a nice goal (or maybe a variant in which we assert
      that we're in CONTEXT_KERNEL or some new IRQ context), we're not
      quite there.
      
      In particular, if an IRQ interrupts the SYSCALL prologue and the
      IRQ handler in turn causes an exception, the exception entry
      will be called in RCU IRQ mode but with CONTEXT_USER.
      
      This is okay (nothing goes wrong), but until we fix up the
      SYSCALL prologue, we need to avoid warning.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/c81faf3916346c0e04346c441392974f49cd7184.1440133286.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f0a97af8
    • I
      x86/fpu/math-emu: Fix crash in fork() · 827409b2
      Ingo Molnar 提交于
      During later stages of math-emu bootup the following crash triggers:
      
      	 math_emulate: 0060:c100d0a8
      	 Kernel panic - not syncing: Math emulation needed in kernel
      	 CPU: 0 PID: 1511 Comm: login Not tainted 4.2.0-rc7+ #1012
      	 [...]
      	 Call Trace:
      	  [<c181d50d>] dump_stack+0x41/0x52
      	  [<c181c918>] panic+0x77/0x189
      	  [<c1003530>] ? math_error+0x140/0x140
      	  [<c164c2d7>] math_emulate+0xba7/0xbd0
      	  [<c100d0a8>] ? fpu__copy+0x138/0x1c0
      	  [<c1109c3c>] ? __alloc_pages_nodemask+0x12c/0x870
      	  [<c136ac20>] ? proc_clear_tty+0x40/0x70
      	  [<c136ac6e>] ? session_clear_tty+0x1e/0x30
      	  [<c1003530>] ? math_error+0x140/0x140
      	  [<c1003575>] do_device_not_available+0x45/0x70
      	  [<c100d0a8>] ? fpu__copy+0x138/0x1c0
      	  [<c18258e6>] error_code+0x5a/0x60
      	  [<c1003530>] ? math_error+0x140/0x140
      	  [<c100d0a8>] ? fpu__copy+0x138/0x1c0
      	  [<c100c205>] arch_dup_task_struct+0x25/0x30
      	  [<c1048cea>] copy_process.part.51+0xea/0x1480
      	  [<c115a8e5>] ? dput+0x175/0x200
      	  [<c136af70>] ? no_tty+0x30/0x30
      	  [<c1157242>] ? do_vfs_ioctl+0x322/0x540
      	  [<c104a21a>] _do_fork+0xca/0x340
      	  [<c1057b06>] ? SyS_rt_sigaction+0x66/0x90
      	  [<c104a557>] SyS_clone+0x27/0x30
      	  [<c1824a80>] sysenter_do_call+0x12/0x12
      
      The reason is the incorrect assumption in fpu_copy(), that FNSAVE
      can be executed from math-emu kernels as well.
      
      Don't try to copy the registers, the soft state will be copied
      by fork anyway, so the child task inherits the parent task's
      soft math state.
      
      With this fix applied math-emu kernels boot up fine on modern
      hardware and the 'no387 nofxsr' boot options.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Bobby Powers <bobbypowers@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      827409b2
    • I
      x86/fpu/math-emu: Fix math-emu boot crash · 5fc96038
      Ingo Molnar 提交于
      On a math-emu bootup the following crash occurs:
      
      	Initializing CPU#0
      	------------[ cut here ]------------
      	kernel BUG at arch/x86/kernel/traps.c:779!
      	invalid opcode: 0000 [#1] SMP
      	[...]
      	EIP is at do_device_not_available+0xe/0x70
      	[...]
      	Call Trace:
      	 [<c18238e6>] error_code+0x5a/0x60
      	 [<c1002bd0>] ? math_error+0x140/0x140
      	 [<c100bbd9>] ? fpu__init_cpu+0x59/0xa0
      	 [<c1012322>] cpu_init+0x202/0x330
      	 [<c104509f>] ? __native_set_fixmap+0x1f/0x30
      	 [<c1b56ab0>] trap_init+0x305/0x346
      	 [<c1b548af>] start_kernel+0x1a5/0x35d
      	 [<c1b542b4>] i386_start_kernel+0x82/0x86
      
      The reason is that in the following commit:
      
        b1276c48 ("x86/fpu: Initialize fpregs in fpu__init_cpu_generic()")
      
      I failed to consider math-emu's limitation that it cannot execute the
      FNINIT instruction in kernel mode.
      
      The long term fix might be to allow math-emu to execute (certain) kernel
      mode FPU instructions, but for now apply the safe (albeit somewhat ugly)
      fix: initialize the emulation state explicitly without trapping out to
      the FPU emulator.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      5fc96038
  8. 21 8月, 2015 3 次提交
  9. 19 8月, 2015 2 次提交
    • D
      libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option · 7a67832c
      Dan Williams 提交于
      We currently register a platform device for e820 type-12 memory and
      register a nvdimm bus beneath it.  Registering the platform device
      triggers the device-core machinery to probe for a driver, but that
      search currently comes up empty.  Building the nvdimm-bus registration
      into the e820_pmem platform device registration in this way forces
      libnvdimm to be built-in.  Instead, convert the built-in portion of
      CONFIG_X86_PMEM_LEGACY to simply register a platform device and move the
      rest of the logic to the driver for e820_pmem, for the following
      reasons:
      
      1/ Letting e820_pmem support be a module allows building and testing
         libnvdimm.ko changes without rebooting
      
      2/ All the normal policy around modules can be applied to e820_pmem
         (unbind to disable and/or blacklisting the module from loading by
         default)
      
      3/ Moving the driver to a generic location and converting it to scan
         "iomem_resource" rather than "e820.map" means any other architecture can
         take advantage of this simple nvdimm resource discovery mechanism by
         registering a resource named "Persistent Memory (legacy)"
      
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      7a67832c
    • J
      x86/irq: Build correct vector mapping for multiple MSI interrupts · 527f0a91
      Jiang Liu 提交于
      Alex Deucher, Mark Rustad and Alexander Holler reported a regression
      with the latest v4.2-rc4 kernel, which breaks some SATA controllers.
      With multi-MSI capable SATA controllers, only the first port works,
      all other ports time out when executing SATA commands.
      
      This happens because the first argument to assign_irq_vector_policy()
      is always the base linux irq number of the multi MSI interrupt block,
      so all subsequent vector assignments operate on the base linux irq
      number, so all MSI irqs are handled as the first irq number. Therefor
      the other MSI irqs of a device are never set up correctly and never
      fire.
      
      Add the loop iterator to the base irq number so all vectors are
      assigned correctly.
      
      Fixes: b5dc8e6c "x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors"
      Reported-and-tested-by: NAlex Deucher <alexdeucher@gmail.com>
      Reported-and-tested-by: NMark Rustad <mrustad@gmail.com>
      Reported-and-tested-by: NAlexander Holler <holler@ahsoftware.de>
      Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Link: http://lkml.kernel.org/r/1439911228-9880-1-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      527f0a91
  10. 17 8月, 2015 4 次提交
    • L
      x86/smpboot: Remove APIC.wait_for_init_deassert and atomic init_deasserted · 656bba30
      Len Brown 提交于
      Both the per-APIC flag ".wait_for_init_deassert",
      and the global atomic_t "init_deasserted"
      are dead code -- remove them.
      
      For all APIC types, "wait_for_master()"
      prevents an AP from proceeding until the BSP has set
      cpu_callout_mask, making "init_deasserted" {unnecessary}:
      
      	BSP: <de-assert INIT>
      	...
      	BSP: {set init_deasserted}
      	AP: wait_for_master()
      		set cpu_initialized_mask
      		wait for cpu_callout_mask
      	BSP: test cpu_initialized_mask
      	BSP: set cpu_callout_mask
      	AP: test cpu_callout_mask
      	AP: {wait for init_deasserted}
      	...
      	AP: <touch APIC>
      
      Deleting the {dead code} above is necessary to enable
      some parallelism in a future patch.
      Signed-off-by: NLen Brown <len.brown@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Jan H. Schönherr <jschoenh@amazon.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
      Link: http://lkml.kernel.org/r/de4b3a9bab894735e285870b5296da25ee6a8a5a.1439739165.git.len.brown@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      656bba30
    • L
      x86/smpboot: Remove SIPI delays from cpu_up() · a9bcaa02
      Len Brown 提交于
      MPS 1.4 example code shows the following required delays during processor
      on-lining:
      
      	INIT
      	 udelay(10,000)
      	SIPI
      	 udelay(200)
      	SIPI
      	 udelay(200) /* Linux actually implements this as udelay(300) */
      
      Linux skips the udelay(10,000) on modern processors.
      This patch removes the udelay(200) after each SIPI
      on those same processors.
      
      All three legacy delays can be restored by the cmdline
      "cpu_init_udelay=10000".
      
      As measured by analyze_suspend.py, this patch speeds
      processor resume time on my desktop from 2.4ms to 1.8ms, per AP.
      Signed-off-by: NLen Brown <len.brown@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Jan H. Schönherr <jschoenh@amazon.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
      Link: http://lkml.kernel.org/r/a5dfdbc8fbfdd813784da204aad5677fe459ac37.1439739165.git.len.brown@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a9bcaa02
    • L
      x86/smpboot: Remove udelay(100) when polling cpu_callin_map · 2d99af8e
      Len Brown 提交于
      After the BSP sends INIT/SIPI/SIP to the AP and sees the AP
      in the cpu_initialized_map, it sets the AP loose via the
      cpu_callout_map, and waits for it via the cpu_callin_map.
      
      The BSP polls the cpu_callin_map with a udelay(100)
      and a schedule() in each iteration.
      
      The udelay(100) adds no value.
      
      For example, on my 4-CPU dekstop, the AP finishes
      cpu_callin() in under 70 usec and sets the cpu_callin_mask.
      The BSP, however, doesn't see that setting until over 30 usec
      later, because it was still running its udelay(100)
      when the AP finished.
      
      Deleting the udelay(100) in the cpu_callin_mask polling loop,
      saves from 0 to 100 usec per Application Processor.
      Signed-off-by: NLen Brown <len.brown@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Jan H. Schönherr <jschoenh@amazon.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
      Link: http://lkml.kernel.org/r/0aade12eabeb89a688c929fe80856eaea0544bb7.1439739165.git.len.brown@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2d99af8e
    • L
      x86/smpboot: Remove udelay(100) when polling cpu_initialized_map · 6e38f1e7
      Len Brown 提交于
      After the BSP sends the APIC INIT/SIPI/SIPI to the AP,
      it waits for the AP to come up and indicate that it is alive
      by setting its own bit in the cpu_initialized_mask.
      
      Linux polls for up to 10 seconds for this to happen.
      Each polling loop has a udelay(100) and a call to schedule().
      
      The udelay(100) adds no value.
      
      For example, on my desktop, the BSP waits for the
      other 3 CPUs to come on line at boot for 305, 404, 405 usec.
      For resume from S3, it waits 317, 404, 405 usec.
      
      But when the udelay(100) is removed, the BSP waits
      305, 310, 306 for boot, and 305, 307, 306 for resume.
      
      So for both boot and resume, removing the udelay(100)
      speeds online by about 100us in 2 of 3 cases.
      Signed-off-by: NLen Brown <len.brown@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: Jan H. Schönherr <jschoenh@amazon.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
      Link: http://lkml.kernel.org/r/33ef746c67d2489cad0a9b1958cf71167232ff2b.1439739165.git.len.brown@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6e38f1e7
  11. 14 8月, 2015 1 次提交
  12. 13 8月, 2015 10 次提交
  13. 12 8月, 2015 4 次提交
    • T
      perf/x86/intel/pt: Clean up files of Intel Processor Trace · 709bc871
      Takao Indoh 提交于
      This patch just cleans up some files of Intel Processor Trace, does not
      change its behavior. This patch removes unused definitions and replaces a
      constant value with a macro.
      Signed-off-by: NTakao Indoh <indou.takao@jp.fujitsu.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin<alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: H.Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1438681015-5124-1-git-send-email-indou.takao@jp.fujitsu.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      709bc871
    • P
      perf/x86: Fix MSR PMU driver · 19b3340c
      Peter Zijlstra 提交于
      Currently we only update the sysfs event files per available MSR, we
      didn't actually disallow creating unlisted events.
      
      Rework things such that the dectection, sysfs listing and event
      creation are better coordinated.
      
      Sadly it appears it's impossible to probe R/O MSRs under virt. This
      means we have to do the full model table to avoid listing all MSRs all
      the time.
      Tested-by: NKan Liang <kan.liang@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      19b3340c
    • M
      perf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler · d7a702f0
      Matt Fleming 提交于
      Tony reports that booting his 144-cpu machine with maxcpus=10 triggers
      the following WARN_ON():
      
      [   21.045727] WARNING: CPU: 8 PID: 647 at arch/x86/kernel/cpu/perf_event_intel_cqm.c:1267 intel_cqm_cpu_prepare+0x75/0x90()
      [   21.045744] CPU: 8 PID: 647 Comm: systemd-udevd Not tainted 4.2.0-rc4 #1
      [   21.045745] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0066.R00.1506021730 06/02/2015
      [   21.045747]  0000000000000000 0000000082771b09 ffff880856333ba8 ffffffff81669b67
      [   21.045748]  0000000000000000 0000000000000000 ffff880856333be8 ffffffff8107b02a
      [   21.045750]  ffff88085b789800 ffff88085f68a020 ffffffff819e2470 000000000000000a
      [   21.045750] Call Trace:
      [   21.045757]  [<ffffffff81669b67>] dump_stack+0x45/0x57
      [   21.045759]  [<ffffffff8107b02a>] warn_slowpath_common+0x8a/0xc0
      [   21.045761]  [<ffffffff8107b15a>] warn_slowpath_null+0x1a/0x20
      [   21.045762]  [<ffffffff81036725>] intel_cqm_cpu_prepare+0x75/0x90
      [   21.045764]  [<ffffffff81036872>] intel_cqm_cpu_notifier+0x42/0x160
      [   21.045767]  [<ffffffff8109a33d>] notifier_call_chain+0x4d/0x80
      [   21.045769]  [<ffffffff8109a44e>] __raw_notifier_call_chain+0xe/0x10
      [   21.045770]  [<ffffffff8107b538>] _cpu_up+0xe8/0x190
      [   21.045771]  [<ffffffff8107b65a>] cpu_up+0x7a/0xa0
      [   21.045774]  [<ffffffff8165e920>] cpu_subsys_online+0x40/0x90
      [   21.045777]  [<ffffffff81433b37>] device_online+0x67/0x90
      [   21.045778]  [<ffffffff81433bea>] online_store+0x8a/0xa0
      [   21.045782]  [<ffffffff81430e78>] dev_attr_store+0x18/0x30
      [   21.045785]  [<ffffffff8126b6ba>] sysfs_kf_write+0x3a/0x50
      [   21.045786]  [<ffffffff8126ad40>] kernfs_fop_write+0x120/0x170
      [   21.045789]  [<ffffffff811f0b77>] __vfs_write+0x37/0x100
      [   21.045791]  [<ffffffff811f38b8>] ? __sb_start_write+0x58/0x110
      [   21.045795]  [<ffffffff81296d2d>] ? security_file_permission+0x3d/0xc0
      [   21.045796]  [<ffffffff811f1279>] vfs_write+0xa9/0x190
      [   21.045797]  [<ffffffff811f2075>] SyS_write+0x55/0xc0
      [   21.045800]  [<ffffffff81067300>] ? do_page_fault+0x30/0x80
      [   21.045804]  [<ffffffff816709ae>] entry_SYSCALL_64_fastpath+0x12/0x71
      [   21.045805] ---[ end trace fe228b836d8af405 ]---
      
      The root cause is that CPU_UP_PREPARE is completely the wrong notifier
      action from which to access cpu_data(), because smp_store_cpu_info()
      won't have been executed by the target CPU at that point, which in turn
      means that ->x86_cache_max_rmid and ->x86_cache_occ_scale haven't been
      filled out.
      
      Instead let's invoke our handler from CPU_STARTING and rename it
      appropriately.
      Reported-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vikas Shivappa <vikas.shivappa@intel.com>
      Link: http://lkml.kernel.org/r/1438863163-14083-1-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d7a702f0
    • P
      perf/x86/intel: Fix memory leak on hot-plug allocation fail · dbc72b7a
      Peter Zijlstra 提交于
      We fail to free the shared_regs allocation if the constraint_list
      allocation fails.
      
      Cure this and be more consistent in NULL-ing the pointers after free.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      dbc72b7a
  14. 10 8月, 2015 1 次提交
  15. 08 8月, 2015 1 次提交