1. 03 Feb 2016, 1 commit
  2. 19 Dec 2015, 1 commit
  3. 28 Aug 2015, 1 commit
    • x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del() · 2baa891e
      Luis R. Rodriguez committed
      The effort to replace mtrr_add() with the architecture-agnostic
      arch_phys_wc_add() is complete; this ensures that write-combining
      implementations (PAT on x86) are taken advantage of instead of
      MTRR. With that effort done, hide direct MTRR access from
      drivers.
      
      The legacy user-space /proc/mtrr ABI is not affected.
      
      Update the x86 documentation on MTRR to reflect the completion of
      the phase-out of direct access to MTRR, and add a note on
      platform firmware use of MTRRs based on the obituary discussion
      of MTRRs on Linux [0].
      
        [0] http://lkml.kernel.org/r/1438991330.3109.196.camel@hp.com
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: <syrjala@sci.fi>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Walls <awalls@md.metrocast.net>
      Cc: Antonino Daplas <adaplas@gmail.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Ville Syrjälä <syrjala@sci.fi>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: airlied@linux.ie
      Cc: benh@kernel.crashing.org
      Cc: bhelgaas@google.com
      Cc: dan.j.williams@intel.com
      Cc: konrad.wilk@oracle.com
      Cc: linux-fbdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: mst@redhat.com
      Cc: netdev@vger.kernel.org
      Cc: vinod.koul@intel.com
      Cc: xen-devel@lists.xensource.com
      Link: http://lkml.kernel.org/r/1440443613-13696-12-git-send-email-mcgrof@do-not-panic.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2baa891e
  4. 27 May 2015, 4 commits
    • x86/mm/pat: Wrap pat_enabled into a function API · cb32edf6
      Luis R. Rodriguez committed
      We use pat_enabled in x86-specific code to see if PAT is enabled
      or not, but we're granting full access to it even though readers
      do not need to set it. If, for instance, we later granted modules
      access to it, they could override the variable setting... no
      bueno.
      
      This renames pat_enabled to a new static variable __pat_enabled.
      Folks are redirected to use pat_enabled() now.
      
      Code that sets this can only be internal to pat.c. Apart from
      the early kernel parameter "nopat" to disable PAT, we also have
      a few cases that disable it later and make use of a helper,
      pat_disable(). It is wrapped in an ifdef, but since that code
      cannot run unless PAT was enabled, it's not required to wrap it
      with ifdefs; unwrap that. Likewise, since "nopat" doesn't really
      change non-PAT systems, just remove that ifdef as well.
      
      Although we could add and use an early_param_off(), these
      helpers don't use __read_mostly but we want to keep
      __read_mostly for __pat_enabled as this is a hot path -- upon
      boot, for instance, a simple guest may see ~4k accesses to
      pat_enabled(). Since __read_mostly early boot params are not
      that common we don't add a helper for them just yet.
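
      For illustration, a minimal sketch of the resulting API shape
      (simplified, not the exact pat.c code):

        static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);

        /* only pat.c itself may clear the flag, e.g. for "nopat" */
        void pat_disable(const char *reason)
        {
                __pat_enabled = 0;
                pr_info("x86/PAT: %s\n", reason);
        }

        bool pat_enabled(void)
        {
                return !!__pat_enabled;
        }
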
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Walls <awalls@md.metrocast.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kyle McMartin <kyle@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1430425520-22275-3-git-send-email-mcgrof@do-not-panic.com
      Link: http://lkml.kernel.org/r/1432628901-18044-13-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cb32edf6
    • x86/mm/mtrr: Generalize runtime disabling of MTRRs · f9626104
      Luis R. Rodriguez committed
      It is possible to enable CONFIG_MTRR and CONFIG_X86_PAT and end
      up with a system with MTRR functionality disabled but PAT
      functionality enabled. This can happen, for instance, when the
      Xen hypervisor is used where MTRRs are not supported but PAT is.
      This can happen on Linux as of commit
      
        47591df5 ("xen: Support Xen pv-domains using PAT")
      
      by Juergen, introduced in v3.19.
      
      Technically, we should be able to assume that the proper CPU
      bits would be set to disable MTRRs, but we can't always rely on
      this. On the Xen hypervisor, for instance, only X86_FEATURE_MTRR
      was disabled as of Xen 4.4, through Xen commit 586ab6a [0], but
      not X86_FEATURE_K6_MTRR, X86_FEATURE_CENTAUR_MCR, or
      X86_FEATURE_CYRIX_ARR.
      
      Roger Pau Monné has clarified though that although this is
      technically true we will never support PVH on these CPU types so
      Xen has no need to disable these bits on those systems. As per
      Roger, AMD K6, Centaur and VIA chips don't have the necessary
      hardware extensions to allow running PVH guests [1].
      
      As per Toshi, it is also possible for the BIOS to disable MTRR
      support; in such cases get_mtrr_state() would update the MTRR
      state as per the BIOS, and we need to propagate this information
      as well.
      
      The x86 MTRR code relies on quite a few checks of mtrr_if being
      set to see whether MTRRs did get set up. Instead, let's provide
      a generic getter for that. This also adds a few checks where
      there were none before, which could potentially safeguard us
      against incorrect usage of MTRR where this was not desirable.
      
      Where possible, match the error codes used when MTRRs are
      disabled in arch/x86/include/asm/mtrr.h.
      
      Lastly, since disabling MTRRs can happen at run time and we
      could end up with PAT enabled, it is best to record in our logs
      when MTRRs are disabled.
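
      A minimal sketch of the getter idea (condensed; everything except
      mtrr_if and the -ENODEV convention is illustrative):

        static bool mtrr_enabled(void)
        {
                /* mtrr_if is set only when an MTRR driver was probed */
                return !!mtrr_if;
        }

      Callers such as mtrr_add()/mtrr_del() can then bail out early with
      -ENODEV when !mtrr_enabled(), matching the stubs used when MTRRs
      are compiled out in asm/mtrr.h.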
      
        [0] ~/devel/xen (git::stable-4.5)$ git describe --contains 586ab6a
            4.4.0-rc1~18
        [1] http://lists.xenproject.org/archives/html/xen-devel/2015-03/msg03460.html
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Antonino Daplas <adaplas@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Roger Pau Monné <roger.pau@citrix.com>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Ville Syrjälä <syrjala@sci.fi>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: bhelgaas@google.com
      Cc: david.vrabel@citrix.com
      Cc: jbeulich@suse.com
      Cc: konrad.wilk@oracle.com
      Cc: venkatesh.pallipadi@intel.com
      Cc: ville.syrjala@linux.intel.com
      Cc: xen-devel@lists.xensource.com
      Link: http://lkml.kernel.org/r/1426893517-2511-3-git-send-email-mcgrof@do-not-panic.com
      Link: http://lkml.kernel.org/r/1432628901-18044-12-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f9626104
    • x86/mm/mtrr: Avoid #ifdeffery with phys_wc_to_mtrr_index() · 7d010fdf
      Luis R. Rodriguez committed
      There is only one user, but since we're next going to hide MTRR
      access from drivers, expose this last piece of API to drivers in
      a general fashion, requiring only io.h for access to the
      helpers.
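
      A sketch of the kind of io.h wrapper this enables (the ifdef moves
      into the header; the exact helper the commit exposes may be named
      differently):

        #ifdef CONFIG_MTRR
        extern int phys_wc_to_mtrr_index(int handle);
        #else
        static inline int phys_wc_to_mtrr_index(int handle)
        {
                return -1;      /* no MTRR in this configuration */
        }
        #endif

      A driver can then pass the cookie returned by arch_phys_wc_add()
      to such a helper unconditionally, without any #ifdef of its own.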
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Abhilash Kesavan <a.kesavan@samsung.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Antonino Daplas <adaplas@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Cristian Stoica <cristian.stoica@freescale.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthias Brugger <matthias.bgg@gmail.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Ville Syrjälä <syrjala@sci.fi>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: dri-devel@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/1429722736-4473-1-git-send-email-mcgrof@do-not-panic.com
      Link: http://lkml.kernel.org/r/1432628901-18044-11-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      7d010fdf
    • x86/mm/mtrr, pat: Document Write Combining MTRR type effects on PAT / non-PAT pages · 2f9e8973
      Luis R. Rodriguez committed
      As part of the effort to phase out MTRR use, document the effects
      of write-combining MTRRs on pages with different non-PAT page
      attribute flags and different PAT entry values. Extend the
      arch_phys_wc_add() documentation to clarify the power-of-two
      size / boundary requirements as we phase out mtrr_add() use.
      
      Lastly, hint towards ioremap_uc() for corner cases in device
      drivers working with devices with mixed regions, where MTRR size
      requirements would otherwise not yield a write-combining
      effective memory type.
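
      A hedged example of the documented usage pattern (struct and
      variable names here are made up for illustration):

        struct example_par {
                void __iomem    *screen_base;
                int             wc_cookie;
        };

        static int example_map_fb(struct example_par *par,
                                  resource_size_t fb_phys,
                                  resource_size_t fb_size)
        {
                par->screen_base = ioremap_wc(fb_phys, fb_size);
                if (!par->screen_base)
                        return -ENOMEM;

                /* power-of-two size and alignment matter only if an MTRR is
                 * actually used; effectively a no-op when PAT provides WC */
                par->wc_cookie = arch_phys_wc_add(fb_phys, fb_size);
                return 0;
        }

      For mixed regions where those size/boundary requirements cannot be
      met, the documentation points at ioremap_uc() so that a covering WC
      MTRR cannot change the mapping's effective memory type.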
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Antonino Daplas <adaplas@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Ville Syrjälä <syrjala@sci.fi>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: linux-fbdev@vger.kernel.org
      Link: http://lkml.kernel.org/r/1430343851-967-3-git-send-email-mcgrof@do-not-panic.com
      Link: http://lkml.kernel.org/r/1432628901-18044-10-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2f9e8973
  5. 12 Sep 2014, 1 commit
  6. 26 Jun 2013, 1 commit
    • x86: Fix /proc/mtrr with base/size more than 44bits · d5c78673
      Yinghai Lu committed
      On one system whose MTRR ranges need more than 44 bits, dmesg shows:
      [    0.000000] MTRR default type: write-back
      [    0.000000] MTRR fixed ranges enabled:
      [    0.000000]   00000-9FFFF write-back
      [    0.000000]   A0000-BFFFF uncachable
      [    0.000000]   C0000-DFFFF write-through
      [    0.000000]   E0000-FFFFF write-protect
      [    0.000000] MTRR variable ranges enabled:
      [    0.000000]   0 [000080000000-0000FFFFFFFF] mask 3FFF80000000 uncachable
      [    0.000000]   1 [380000000000-38FFFFFFFFFF] mask 3F0000000000 uncachable
      [    0.000000]   2 [000099000000-000099FFFFFF] mask 3FFFFF000000 write-through
      [    0.000000]   3 [00009A000000-00009AFFFFFF] mask 3FFFFF000000 write-through
      [    0.000000]   4 [381FFA000000-381FFBFFFFFF] mask 3FFFFE000000 write-through
      [    0.000000]   5 [381FFC000000-381FFC0FFFFF] mask 3FFFFFF00000 write-through
      [    0.000000]   6 [0000AD000000-0000ADFFFFFF] mask 3FFFFF000000 write-through
      [    0.000000]   7 [0000BD000000-0000BDFFFFFF] mask 3FFFFF000000 write-through
      [    0.000000]   8 disabled
      [    0.000000]   9 disabled
      
      but /proc/mtrr reports it wrongly:
      reg00: base=0x080000000 ( 2048MB), size= 2048MB, count=1: uncachable
      reg01: base=0x80000000000 (8388608MB), size=1048576MB, count=1: uncachable
      reg02: base=0x099000000 ( 2448MB), size=   16MB, count=1: write-through
      reg03: base=0x09a000000 ( 2464MB), size=   16MB, count=1: write-through
      reg04: base=0x81ffa000000 (8519584MB), size=   32MB, count=1: write-through
      reg05: base=0x81ffc000000 (8519616MB), size=    1MB, count=1: write-through
      reg06: base=0x0ad000000 ( 2768MB), size=   16MB, count=1: write-through
      reg07: base=0x0bd000000 ( 3024MB), size=   16MB, count=1: write-through
      reg08: base=0x09b000000 ( 2480MB), size=   16MB, count=1: write-combining
      
      so bits 44 and 45 get cut off.
      
      We have problems in arch/x86/kernel/cpu/mtrr/generic.c::generic_get_mtrr():
      1. For the base, we miss casting base_lo to 64 bit before shifting.
      Fix that by adding a u64 cast.
      
      2. For the size, it can only handle 44 bits, aka 32 bits + page_shift.
      Fix that with a 64-bit mask instead of the 32-bit mask_lo; then the range
      can be more than 44 bits.
      At the same time, we need to update size_or_mask for old CPUs that do
      support cpuid 0x80000008 to get phys_addr: the high 32 bits need to be
      set to all 1s, otherwise we will not get the correct size for them.
      
      Also fix mtrr_add_page(): it should check base and (base + size - 1)
      instead of base and size, as base and size could each be small but
      base + size could be big enough to be out of bounds. We can use
      boot_cpu_data.x86_phys_bits directly to avoid size_or_mask.
      
      So when are we going to have sizes of more than 44 bits? That is 16 TiB.
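
      To illustrate the truncation (a standalone user-space sketch, not
      the kernel diff), shifting the 32-bit MSR half without widening it
      to 64 bits first reproduces the bogus reg01 base seen above:

        #include <stdint.h>
        #include <stdio.h>

        #define PAGE_SHIFT 12

        int main(void)
        {
                uint32_t base_hi = 0x3800, base_lo = 0; /* base 0x380000000000 */

                /* 32-bit arithmetic: base_hi << 20 wraps, bits 44+ are lost */
                uint64_t bad  = (base_hi << (32 - PAGE_SHIFT)) | (base_lo >> PAGE_SHIFT);
                /* widen to 64 bit before shifting: the full PFN survives */
                uint64_t good = ((uint64_t)base_hi << (32 - PAGE_SHIFT)) | (base_lo >> PAGE_SHIFT);

                printf("bad  base: %#llx\n", (unsigned long long)(bad << PAGE_SHIFT));
                printf("good base: %#llx\n", (unsigned long long)(good << PAGE_SHIFT));
                return 0;
        }

      This prints 0x80000000000 and 0x380000000000 respectively, matching
      the reg01 lines before and after the patch.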
      
      After the patch we have the right output:
      reg00: base=0x080000000 ( 2048MB), size= 2048MB, count=1: uncachable
      reg01: base=0x380000000000 (58720256MB), size=1048576MB, count=1: uncachable
      reg02: base=0x099000000 ( 2448MB), size=   16MB, count=1: write-through
      reg03: base=0x09a000000 ( 2464MB), size=   16MB, count=1: write-through
      reg04: base=0x381ffa000000 (58851232MB), size=   32MB, count=1: write-through
      reg05: base=0x381ffc000000 (58851264MB), size=    1MB, count=1: write-through
      reg06: base=0x0ad000000 ( 2768MB), size=   16MB, count=1: write-through
      reg07: base=0x0bd000000 ( 3024MB), size=   16MB, count=1: write-through
      reg08: base=0x09b000000 ( 2480MB), size=   16MB, count=1: write-combining
      
      -v2: simply check in mtrr_add_page(), according to hpa.
      
      [ hpa: This probably wants to go into -stable only after having sat in
        mainline for a bit.  It is not a regression. ]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/1371162815-29931-1-git-send-email-yinghai@kernel.org
      Cc: <stable@vger.kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      d5c78673
  7. 31 May 2013, 1 commit
    • Add arch_phys_wc_{add, del} to manipulate WC MTRRs if needed · d0d98eed
      Andy Lutomirski committed
      Several drivers currently use mtrr_add() through various #ifdef
      guards and/or drm wrappers.  The vast majority of them want to add
      WC MTRRs on x86 systems and don't actually need the MTRR if PAT
      (i.e. ioremap_wc, etc.) is working.
      
      arch_phys_wc_add and arch_phys_wc_del are new functions, available
      on all architectures and configurations, that add WC MTRRs on x86 if
      needed (and handle errors) and do nothing at all otherwise.  They're
      also easier to use than mtrr_add and mtrr_del, so the call sites can
      be simplified.
      
      As an added benefit, this will avoid wasting MTRRs and possibly
      warning pointlessly on PAT-supporting systems.
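
      A sketch of the intended driver-side conversion (the driver fields
      shown are hypothetical):

        /* before: x86-only, usually wrapped in #ifdef CONFIG_MTRR */
        dev->mtrr_handle = mtrr_add(bar_start, bar_len, MTRR_TYPE_WRCOMB, true);

        /* after: available on every architecture and configuration;
         * does nothing where an MTRR is not needed (e.g. PAT) */
        dev->wc_cookie = arch_phys_wc_add(bar_start, bar_len);

        /* on teardown; safe even if no MTRR was added */
        arch_phys_wc_del(dev->wc_cookie);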
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      d0d98eed
  8. 07 Dec 2012, 1 commit
  9. 15 Nov 2012, 1 commit
  10. 26 Aug 2011, 1 commit
  11. 02 Jul 2011, 1 commit
  12. 28 Jun 2011, 2 commits
    • x86, mtrr: use stop_machine APIs for doing MTRR rendezvous · 192d8857
      Suresh Siddha committed
      The MTRR rendezvous sequence was not implemented using stop_machine()
      before, as it gets called both from process context as well as from
      the cpu online paths (where the cpu has not yet come online and
      interrupts are disabled, etc).
      
      Now that we have a new stop_machine_from_inactive_cpu() API, use it for
      the rendezvous during MTRR init of a logical processor that is coming online.
      
      For the rest (runtime MTRR modification, system boot, resume paths), use
      stop_machine() to implement the rendezvous sequence. This consolidates
      and cleans up the code.
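
      A condensed sketch of the resulting split, assuming a struct
      set_mtrr_data that carries the register, base, size and type; the
      cpumask choices are assumptions (simplified, not the exact main.c
      code):

        /* runs on every CPU, in lock-step, with interrupts disabled */
        static int mtrr_rendezvous_handler(void *info)
        {
                struct set_mtrr_data *data = info;

                mtrr_if->set(data->smp_reg, data->smp_base,
                             data->smp_size, data->smp_type);
                return 0;
        }

        /* runtime modification, boot, resume: all participants are online */
        stop_machine(mtrr_rendezvous_handler, &data, cpu_online_mask);

        /* MTRR init of a CPU that is coming online and is therefore not
         * yet in cpu_online_mask: use the new variant */
        stop_machine_from_inactive_cpu(mtrr_rendezvous_handler, &data,
                                       cpu_callout_mask);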
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/20110623182057.076997177@sbsiddha-MOBL3.sc.intel.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      192d8857
    • x86, mtrr: lock stop machine during MTRR rendezvous sequence · 6d3321e8
      Suresh Siddha committed
      An MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
      happen in parallel with another system-wide rendezvous using
      stop_machine(). This can lead to deadlock (the order in which the
      works are queued can be different on different cpus: some cpus will be
      running the first rendezvous handler and others the second, with each
      set waiting for the other set to join the system-wide rendezvous,
      leading to a deadlock).
      
      The MTRR rendezvous sequence is not implemented using stop_machine(), as
      it gets called both from process context as well as from the cpu online
      paths (where the cpu has not yet come online and interrupts are
      disabled, etc). stop_machine() works only with online cpus.
      
      For now, take the stop_machine mutex in the MTRR rendezvous sequence that
      gets called from an online cpu (here we are in process context and can
      potentially sleep while taking the mutex). The MTRR rendezvous that gets
      triggered during cpu online doesn't need to take this stop_machine lock,
      as stop_machine() already ensures that there is no cpu hotplug going on
      in parallel, by doing get_online_cpus().
      
          TBD: Pursue a cleaner solution of extending the stop_machine()
               infrastructure to handle the case where the calling cpu is
               still not online and use this for MTRR rendezvous sequence.
      
      fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008
      Reported-by: Vadim Kotelnikov <vadimuzzz@inbox.ru>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com
      Cc: stable@kernel.org # 2.6.35+, backport a week or two after this gets more testing in mainline
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      6d3321e8
  13. 30 Mar 2011, 1 commit
    • x86, mtrr, pat: Fix one cpu getting out of sync during resume · 84ac7cdb
      Suresh Siddha committed
      On laptops with Core i5/i7, there were reports that after resume
      graphics workloads were performing poorly on a specific AP, while
      the other cpus were fine. This was observed specifically on a
      32-bit kernel.
      
      Debugging showed that the PAT init was not happening on that AP
      during resume, and hence it was contributing to the poor workload
      performance on that cpu.
      
      On this system, resume flow looked like this:
      
      1. BP starts the resume sequence and we reinit BP's MTRR's/PAT
         early on using mtrr_bp_restore()
      
      2. Resume sequence brings all AP's online
      
      3. Resume sequence now kicks off the MTRR reinit on all the AP's.
      
      4. For some reason, between points 2 and 3, we moved from the BP
         to one of the APs. My guess is that printk() during the resume
         sequence is contributing to this. We don't see similar
         behavior with the 64-bit kernel, but there is no guarantee that
         at this point the remaining resume sequence (after the APs'
         bringup) has to happen on the BP.
      
      5. set_mtrr() was assuming that we are still on the BP and skipped
         the MTRR/PAT init on that cpu (because of 1 above)
      
      6. But we were on an AP, and this led to not reprogramming PAT
         on this cpu, leading to bad performance.
      
      Fix this by doing an unconditional mtrr_if->set_all() in set_mtrr()
      during MTRR/PAT init. This might be unnecessary if we are still
      running on the BP, but it does no harm and guarantees that after
      resume all the cpus will be in sync with respect to the MTRR/PAT
      registers.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1301438292-28370-1-git-send-email-eric@anholt.net>
      Signed-off-by: Eric Anholt <eric@anholt.net>
      Tested-by: Keith Packard <keithp@keithp.com>
      Cc: stable@kernel.org	[v2.6.32+]
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      84ac7cdb
  14. 24 Mar 2011, 1 commit
    • x86: Use syscore_ops instead of sysdev classes and sysdevs · f3c6ea1b
      Rafael J. Wysocki committed
      Some subsystems in the x86 tree need to carry out suspend/resume and
      shutdown operations with one CPU on-line and interrupts disabled and
      they define sysdev classes and sysdevs or sysdev drivers for this
      purpose.  This leads to unnecessarily complicated code and excessive
      memory usage, so switch them to using struct syscore_ops objects for
      this purpose instead.
      
      Generally, there are three categories of subsystems that use
      sysdevs for implementing PM operations: (1) subsystems whose
      suspend/resume callbacks ignore their arguments entirely (the
      majority), (2) subsystems whose suspend/resume callbacks use their
      struct sys_device argument, but don't really need to do that,
      because they can be implemented differently in an arguably simpler
      way (io_apic.c), and (3) subsystems whose suspend/resume callbacks
      use their struct sys_device argument, but the value of that argument
      is always the same and could be ignored (microcode_core.c).  In all
      of these cases the subsystems in question may be readily converted to
      using struct syscore_ops objects for power management and shutdown.
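
      The conversion pattern, as a minimal sketch (subsystem name and
      callback bodies are hypothetical):

        #include <linux/syscore_ops.h>

        static int foo_suspend(void)    /* one CPU online, interrupts off */
        {
                /* save hardware state */
                return 0;
        }

        static void foo_resume(void)
        {
                /* restore hardware state */
        }

        static struct syscore_ops foo_syscore_ops = {
                .suspend = foo_suspend,
                .resume  = foo_resume,
        };

        static int __init foo_init_ops(void)
        {
                register_syscore_ops(&foo_syscore_ops); /* no sysdev class needed */
                return 0;
        }
        device_initcall(foo_init_ops);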
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      f3c6ea1b
  15. 03 Feb 2011, 1 commit
    • x86, mtrr: Avoid MTRR reprogramming on BP during boot on UP platforms · f7448548
      Suresh Siddha committed
      Markus Kohn ran into a hard hang regression on an Acer Aspire
      1310 when ACPI is enabled. git bisect showed the following
      commit as the bad one that introduced the boot regression.
      
      	commit d0af9eed
      	Author: Suresh Siddha <suresh.b.siddha@intel.com>
      	Date:   Wed Aug 19 18:05:36 2009 -0700
      
      	    x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init
      
      Because of the UP configuration of that platform,
      native_smp_prepare_cpus() bailed out (in smp_sanity_check())
      before doing the set_mtrr_aps_delayed_init().
      
      Further down the boot path, native_smp_cpus_done() will call the
      delayed MTRR initialization for the AP's (mtrr_aps_init()) with
      mtrr_aps_delayed_init not set. This resulted in the boot
      processor reprogramming its MTRR's to the values seen during the
      start of the OS boot. While this is not needed ideally, this
      shouldn't have caused any side-effects. This is because the
      reprogramming of MTRR's (set_mtrr_state() that gets called via
      set_mtrr()) will check if the live register contents are
      different from what is being asked to write and will do the actual
      write only if they are different.
      
      BP's mtrr state is read during the start of the OS boot and
      typically nothing would have changed when we ask to reprogram it
      on BP again because of the above scenario on an UP platform. So
      on a normal UP platform no reprogramming of BP MTRR MSR's
      happens and all is well.
      
      However, on this platform, the BIOS seems to be modifying the fixed
      MTRR range registers between the start of OS boot and when we
      double check the live registers for reprogramming the BP MTRR
      registers. And as the live registers are modified, we end up
      reprogramming the MTRRs to the state seen during the start of
      the OS boot.
      
      During ACPI initialization, something in the BIOS (probably an SMI
      handler?) doesn't like this fact and results in a hard lockup.
      
      We didn't see this boot hang issue on this platform before
      commit d0af9eed, because only
      the APs (if any) would program their MTRRs to the values that the
      BP had at the start of the OS boot.
      
      Fix this issue by checking mtrr_aps_delayed_init before
      continuing further in mtrr_aps_init(). Now, only the APs (if
      any) will program their MTRRs to the BP values during boot.
      
      Addresses https://bugzilla.novell.com/show_bug.cgi?id=623393
      
        [ By the way, this behavior of the bios modifying MTRR's after the start
          of the OS boot is not common and the kernel is not prepared to
          handle this situation well. Irrespective of this issue, during
          suspend/resume, linux kernel will try to reprogram the BP's MTRR values
          to the values seen during the start of the OS boot. So suspend/resume might
          be already broken on this platform for all linux kernel versions. ]
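
      The shape of the fix, sketched (simplified from mtrr/main.c; the
      surrounding helpers are paraphrased):

        void mtrr_aps_init(void)
        {
                if (!use_intel())
                        return;

                /* If the SMP preparation bailed out early (UP platform),
                 * set_mtrr_aps_delayed_init() never ran; there are no APs
                 * to initialize and the BP must not be touched again. */
                if (!mtrr_aps_delayed_init)
                        return;

                set_mtrr(~0U, 0, 0, 0);
                mtrr_aps_delayed_init = false;
        }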
      Reported-and-bisected-by: Markus Kohn <jabber@gmx.org>
      Tested-by: Markus Kohn <jabber@gmx.org>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Thomas Renninger <trenn@novell.com>
      Cc: Rafael Wysocki <rjw@novell.com>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Cc: stable@kernel.org # [v2.6.32+]
      LKML-Reference: <1296694975.4418.402.camel@sbsiddha-MOBL3.sc.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f7448548
  16. 31 Jul 2010, 1 commit
    • x86, mtrr: Use stop machine context to rendezvous all the cpu's · 68f202e4
      Suresh Siddha committed
      Use the stop machine context rather than IPI's to rendezvous all the cpus for
      MTRR initialization that happens during cpu bringup or for MTRR modifications
      during runtime.
      
      This avoids a deadlock scenario (reported by Prarit) like:
      
      cpu A holds a read_lock (tasklist_lock for example) with irqs enabled
      cpu B waits for the same lock with irqs disabled using write_lock_irq
      cpu C doing set_mtrr() (during AP bringup for example), which will try to
      rendezvous all the cpus using IPIs
      
      This will result in C and A coming to the rendezvous point and waiting
      for B, while B is stuck forever waiting for the lock and thus never
      reaches the rendezvous point.
      
      Using stop cpu (run in the process context of the per-cpu keventd) to do
      this rendezvous avoids this deadlock scenario.
      
      Also make sure all the cpus are in the rendezvous handler before we
      proceed with the local_irq_save() on each cpu. This lock-step disabling
      of irqs on all the cpus will avoid other deadlock scenarios (for example
      involving the blocking smp_call_function's, etc).
      
         [ This problem is very old. Marking -stable only for 2.6.35 as the
           stop_one_cpu_nowait() API is present only in 2.6.35. Any older
           kernel interested in this fix need to do some more work in backporting
           this patch. ]
      Reported-by: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1280515602.2682.10.camel@sbsiddha-MOBL3.sc.intel.com>
      Acked-by: Prarit Bhargava <prarit@redhat.com>
      Cc: stable@kernel.org	[2.6.35]
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      68f202e4
  17. 06 Mar 2010, 1 commit
  18. 02 Feb 2010, 1 commit
    • x86, mtrr: Constify struct mtrr_ops · 3b9cfc0a
      Emese Revfy committed
      This is part of the ops structure constification
      effort started by Arjan van de Ven et al.
      
      Benefits of this constification:
      
       * prevents modification of data that is shared
         (referenced) by many other structure instances
         at runtime
      
       * detects/prevents accidental (but not intentional)
         modification attempts on archs that enforce
         read-only kernel data at runtime
      
       * potentially better optimized code as the compiler
         can assume that the const data cannot be changed
      
       * the compiler/linker move const data into .rodata
         and therefore exclude them from false sharing
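
      For illustration, the pattern applied to the MTRR ops (struct
      trimmed down to two callbacks):

        struct mtrr_ops {
                u32     vendor;
                void    (*set)(unsigned int reg, unsigned long base,
                               unsigned long size, mtrr_type type);
                void    (*get)(unsigned int reg, unsigned long *base,
                               unsigned long *size, mtrr_type *type);
        };

        static const struct mtrr_ops generic_mtrr_ops = {
                .vendor = X86_VENDOR_UNKNOWN,
                .set    = generic_set_mtrr,
                .get    = generic_get_mtrr,
        };

        /* users hold pointers-to-const as well */
        static const struct mtrr_ops *mtrr_if;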
      Signed-off-by: Emese Revfy <re.emese@gmail.com>
      LKML-Reference: <4B65D712.3080804@gmail.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      3b9cfc0a
  19. 22 Aug 2009, 2 commits
    • x86, mtrr: make mtrr_aps_delayed_init static bool · 5400743d
      H. Peter Anvin committed
      mtrr_aps_delayed_init was declared u32 and made global, but it only
      ever takes boolean values and is only ever used in
      arch/x86/kernel/cpu/mtrr/main.c.  Declare it "static bool" and remove
      the external references.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      5400743d
    • x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init · d0af9eed
      Suresh Siddha committed
      SDM Vol 3a section titled "MTRR considerations in MP systems" specifies
      the need for synchronizing the logical cpu's while initializing/updating
      MTRR.
      
      Currently the Linux kernel does the synchronization of all cpus only
      when a single MTRR register is programmed/updated. During an AP online
      (during boot/cpu-online/resume), where we initialize all the MTRR/PAT
      registers, we don't follow this synchronization algorithm.
      
      This can lead to scenarios where, during a dynamic cpu online, that
      logical cpu is initializing MTRR/PAT with cache disabled (cr0.cd=1)
      etc, while the other logical HT sibling continues to run (also with
      cache disabled because of cr0.cd=1 on its sibling).
      
      Starting from Westmere, VMX transitions with cr0.cd=1 don't work properly
      (because of some VMX performance optimizations) and the above scenario
      (with one logical cpu doing VMX activity and another logical cpu coming online)
      can result in system crash.
      
      Fix the MTRR initialization by doing a rendezvous of all the cpus.
      During boot and resume, we delay the MTRR/PAT init for the APs till all
      the logical cpus have come online, and the rendezvous process at the
      end of the APs' bringup will initialize the MTRR/PAT for all APs.
      
      For dynamic single cpu online, we synchronize all the logical cpus and
      do the MTRR/PAT init on the AP that is coming online.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      d0af9eed
  20. 04 Jul 2009, 1 commit
    • x86: Clean up mtrr/main.c · dbd51be0
      Jaswinder Singh Rajput committed
      Fix the following trivial style problems:
      
        ERROR: trailing whitespace X 25
        WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
        WARNING: Use #include <linux/kvm_para.h> instead of <asm/kvm_para.h>
        ERROR: do not initialise externals to 0 or NULL X 2
        ERROR: "foo * bar" should be "foo *bar" X 5
        ERROR: do not use assignment in if condition X 2
        WARNING: line over 80 characters X 8
        ERROR: return is not a function, parentheses are not required
        WARNING: braces {} are not necessary for any arm of this statement
        ERROR: space required before the open parenthesis '(' X 2
        ERROR: open brace '{' following function declarations go on the next line
        ERROR: space required after that ',' (ctx:VxV) X 8
        ERROR: space required before the open parenthesis '(' X 3
        ERROR: else should follow close brace '}'
        WARNING: space prohibited between function name and open parenthesis '('
        WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable X 2
      
      Also use pr_debug and pr_warning where possible.
      
      total: 50 errors, 14 warnings
      
      arch/x86/kernel/cpu/mtrr/main.o:
      
         text	   data	    bss	    dec	    hex	filename
         3668	    116	   4156	   7940	   1f04	main.o.before
         3668	    116	   4156	   7940	   1f04	main.o.after
      
      md5:
         e01af2fd28deef77c8d01e71acfbd365  main.o.before.asm
         e01af2fd28deef77c8d01e71acfbd365  main.o.after.asm
      Suggested-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <20090703164225.GA21447@elte.hu>
      Cc: Avi Kivity <avi@redhat.com> # Avi, please have a look at the kvm_para.h bit
      [ More cleanups ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      dbd51be0
  21. 15 May 2009, 1 commit
  22. 17 Mar 2009, 1 commit
  23. 13 Mar 2009, 2 commits
  24. 29 Jan 2009, 1 commit
  25. 31 Dec 2008, 1 commit
  26. 26 Dec 2008, 1 commit
  27. 28 Oct 2008, 1 commit
  28. 05 Oct 2008, 3 commits
  29. 03 Oct 2008, 2 commits
  30. 30 Sep 2008, 2 commits
    • x86: fix typo in enable_mtrr_cleanup early parameter · 3dcd7e26
      J.A. Magallón committed
      Correct typo for 'enable_mtrr_cleanup' early boot param name.
      Signed-off-by: J.A. Magallon <jamagallon@ono.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3dcd7e26
    • x86: mtrr_cleanup try gran_size to less than 1M · 46240657
      Yinghai Lu committed
      One system needs a gran_size of less than 1M:
      
      reg00: base=0xd8000000 (3456MB), size= 128MB: uncachable, count=1
      reg01: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
      reg02: base=0x00000000 (   0MB), size=4096MB: write-back, count=1
      reg03: base=0x100000000 (4096MB), size= 512MB: write-back, count=1
      reg04: base=0x120000000 (4608MB), size= 128MB: write-back, count=1
      reg05: base=0xd7f80000 (3455MB), size= 512KB: uncachable, count=1
      
      and will get:
      
      Found optimal setting for mtrr clean up
      gran_size: 512K         chunk_size: 2M  num_reg: 7      lose RAM: 0G
      range0: 0000000000000000 - 00000000d8000000
      Setting variable MTRR 0, base: 0GB, range: 2GB, type WB
      Setting variable MTRR 1, base: 2GB, range: 1GB, type WB
      Setting variable MTRR 2, base: 3GB, range: 256MB, type WB
      Setting variable MTRR 3, base: 3328MB, range: 128MB, type WB
      hole: 00000000d7f00000 - 00000000d7f80000
      Setting variable MTRR 4, base: 3455MB, range: 512KB, type UC
      rangeX: 0000000100000000 - 0000000128000000
      Setting variable MTRR 5, base: 4GB, range: 512MB, type WB
      Setting variable MTRR 6, base: 4608MB, range: 128MB, type WB
      
      So start gran_size from 64K instead of 1M.
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      46240657