1. 28 Dec 2009, 1 commit
  2. 17 Dec 2009, 1 commit
    • x86: Fix checking of SRAT when node 0 ram is not from 0 · 32996250
      Committed by Yinghai Lu
      Found one system that boots from socket1 instead of socket0; its SRAT
      gets rejected...
      
      [    0.000000] SRAT: Node 1 PXM 0 0-a0000
      [    0.000000] SRAT: Node 1 PXM 0 100000-80000000
      [    0.000000] SRAT: Node 1 PXM 0 100000000-2080000000
      [    0.000000] SRAT: Node 0 PXM 1 2080000000-4080000000
      [    0.000000] SRAT: Node 2 PXM 2 4080000000-6080000000
      [    0.000000] SRAT: Node 3 PXM 3 6080000000-8080000000
      [    0.000000] SRAT: Node 4 PXM 4 8080000000-a080000000
      [    0.000000] SRAT: Node 5 PXM 5 a080000000-c080000000
      [    0.000000] SRAT: Node 6 PXM 6 c080000000-e080000000
      [    0.000000] SRAT: Node 7 PXM 7 e080000000-10080000000
      ...
      [    0.000000] NUMA: Allocated memnodemap from 500000 - 701040
      [    0.000000] NUMA: Using 20 for the hash shift.
      [    0.000000] Adding active range (0, 0x2080000, 0x4080000) 0 entries of 3200 used
      [    0.000000] Adding active range (1, 0x0, 0x96) 1 entries of 3200 used
      [    0.000000] Adding active range (1, 0x100, 0x7f750) 2 entries of 3200 used
      [    0.000000] Adding active range (1, 0x100000, 0x2080000) 3 entries of 3200 used
      [    0.000000] Adding active range (2, 0x4080000, 0x6080000) 4 entries of 3200 used
      [    0.000000] Adding active range (3, 0x6080000, 0x8080000) 5 entries of 3200 used
      [    0.000000] Adding active range (4, 0x8080000, 0xa080000) 6 entries of 3200 used
      [    0.000000] Adding active range (5, 0xa080000, 0xc080000) 7 entries of 3200 used
      [    0.000000] Adding active range (6, 0xc080000, 0xe080000) 8 entries of 3200 used
      [    0.000000] Adding active range (7, 0xe080000, 0x10080000) 9 entries of 3200 used
      [    0.000000] SRAT: PXMs only cover 917504MB of your 1048566MB e820 RAM. Not used.
      [    0.000000] SRAT: SRAT not used.
      
      The early_node_map is not sorted, because node 0, with its non-zero
      start address, comes first.
      
      So sort it right away, after all regions are registered.
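      
      A minimal sketch of the kind of sort this implies (helper and comparator
      names are assumptions, modelled on lib/sort.h):
      
      #include <linux/sort.h>
      
      /* assumed helper: order early_node_map[] by start_pfn so a node 0 that
       * does not begin at address 0 no longer leaves the map unsorted */
      static int __init cmp_node_active_region(const void *a, const void *b)
      {
              const struct node_active_region *ra = a, *rb = b;
      
              return ra->start_pfn - rb->start_pfn;
      }
      
      static void __init sort_node_map(void)
      {
              sort(early_node_map, nr_nodemap_entries,
                   sizeof(struct node_active_region),
                   cmp_node_active_region, NULL);
      }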
      
      This also fixes the regression introduced by 8716273c (x86: Export srat
      physical topology).
      
      -v2: make it more robust, to handle cross-node cases like node0 [0,4g),
           [8,12g) and node1 [4g, 8g), [12g, 16g)
      -v3: update comments.
      Reported-and-tested-by: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4B2579D2.3010201@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      32996250
  3. 14 Dec 2009, 1 commit
  4. 10 Dec 2009, 3 commits
    • vfs: Implement proper O_SYNC semantics · 6b2f3d1f
      Committed by Christoph Hellwig
      While Linux has provided an O_SYNC flag basically since day 1, it took
      until Linux 2.4.0-test12pre2 to actually get it implemented for
      filesystems; since that day we have had generic_osync_around with only
      minor changes, and the great "For now, when the user asks for O_SYNC,
      we'll actually give O_DSYNC" comment.  This patch intends to actually
      give us real O_SYNC semantics in addition to the O_DSYNC semantics.
      After Jan's O_SYNC patches, which are required before this patch, it's
      actually surprisingly simple: we just need to figure out when to pass
      the datasync flag to vfs_fsync_range and when not.
      
      This patch renames the existing O_SYNC flag to O_DSYNC while keeping its
      numerical value to preserve binary compatibility, and adds a new real
      O_SYNC flag.  To guarantee backwards compatibility it is defined as
      expanding to both O_DSYNC and the new additional binary flag (__O_SYNC),
      so that we stay backwards-compatible when compiled against the new
      headers.
      
      This also means that all places that don't care about the difference can
      just check O_DSYNC and get the right behaviour for O_SYNC, too - only
      places that actually care need to check __O_SYNC in addition.  Drivers
      and network filesystems have been updated in a fail-safe way to always
      do the full sync magic if O_DSYNC is set.  The few places setting O_SYNC
      for lower layers are kept that way for now to stay fail-safe.
      
      We enforce that O_DSYNC is set whenever __O_SYNC is set, early in the
      open path, to make sure we always get this sane combination of options.
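      
      A rough sketch of the flag layout described above (the numeric values are
      illustrative only; each architecture defines its own, and the helper name
      is hypothetical):
      
      #define O_DSYNC   00010000              /* old O_SYNC value: sync data only    */
      #define __O_SYNC  04000000              /* new bit: additionally sync metadata */
      #define O_SYNC    (__O_SYNC | O_DSYNC)  /* the new O_SYNC implies O_DSYNC      */
      
      /* hypothetical helper: decide the datasync argument for vfs_fsync_range() */
      static int want_datasync_only(unsigned int f_flags)
      {
              /* O_DSYNC alone: syncing data is enough; __O_SYNC set: full sync */
              return (f_flags & O_DSYNC) && !(f_flags & __O_SYNC);
      }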
      
      Note that parisc really screwed up its headers, as it already defines an
      O_DSYNC that has always been a no-op.  We try to repair this by using
      that value for the new O_DSYNC and redefining O_SYNC to carry both the
      traditional O_SYNC numerical value _and_ the O_DSYNC one.
      
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andreas Dilger <adilger@sun.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Kyle McMartin <kyle@mcmartin.ca>
      Acked-by: Ulrich Drepper <drepper@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jan Kara <jack@suse.cz>
      6b2f3d1f
    • x86: mmio-mod.c: Use pr_fmt · 3a0340be
      Committed by Joe Perches
      - Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
       - Remove #define NAME
       - Remove NAME from pr_<level>
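      
      For context, a minimal sketch of what the pr_fmt() definition buys (the
      call site shown is just an example):
      
      /* must be defined before including printk.h / kernel.h */
      #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
      #include <linux/kernel.h>
      
      static void example(void)
      {
              /* expands to printk(KERN_INFO KBUILD_MODNAME ": ioremap failed\n"),
               * so call sites no longer paste a NAME prefix by hand */
              pr_info("ioremap failed\n");
      }
      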
      Signed-off-by: Joe Perches <joe@perches.com>
      LKML-Reference: <009cb214c45ef932df0242856228f4739cc91408.1260383912.git.joe@perches.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3a0340be
    • x86: kmmio.c: Add and use pr_fmt(fmt) · 1bd591a5
      Committed by Joe Perches
      - Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
       - Strip "kmmio: " from pr_<level>s
      Signed-off-by: Joe Perches <joe@perches.com>
      LKML-Reference: <7aa509f8a23933036d39f54bd51e9acc52068049.1260383912.git.joe@perches.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1bd591a5
  5. 06 Dec 2009, 1 commit
  6. 04 Dec 2009, 1 commit
  7. 26 Nov 2009, 1 commit
  8. 24 Nov 2009, 4 commits
  9. 23 Nov 2009, 4 commits
    • x86: Suppress stack overrun message for init_task · 0e7810be
      Committed by Jan Beulich
      init_task doesn't get its stack end location set to
      STACK_END_MAGIC, and hence the message is confusing
      rather than helpful in this case.
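      
      A hedged sketch of the resulting guard (the surrounding check is assumed
      to be the generic stack-overrun test in the fault path):
      
      unsigned long *stackend = end_of_stack(tsk);
      
      /* init_task's stack end is never set to STACK_END_MAGIC, so don't
       * warn about it (assumed form of the check) */
      if (*stackend != STACK_END_MAGIC && tsk != &init_task)
              printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
      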
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      LKML-Reference: <4B06AEFE02000078000211F4@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0e7810be
    • x86, numa: Use near(er) online node instead of roundrobin for NUMA · d9c2d5ac
      Committed by Yinghai Lu
      The CPU to node mapping is set via the following sequence:
      
       1. numa_init_array(): set up a round-robin mapping from each CPU to an
          online node.
      
       2. init_cpu_to_node(): set the mapping according to apicid_to_node[]
          (built from the SRAT); this only handles nodes that are online, and
          leaves CPUs on nodes without RAM (i.e. not online) with the
          round-robin mapping.
      
       3. Later, srat_detect_node() is called for Intel/AMD and uses the
          first online node or a nearby node.
      
      The problem is that setup_per_cpu_areas() is not called between steps 2
      and 3, so the per_cpu area for such a CPU is allocated on a different
      node than its own, possibly two hops away.
      
      So optimize this by adding find_near_online_node() and calling it from
      init_cpu_to_node().
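      
      A minimal sketch of what such a helper could look like (the actual
      implementation in the patch may differ in detail):
      
      /* pick the online node with the smallest SLIT distance to @node */
      static int __init find_near_online_node(int node)
      {
              int n, val, best_node = first_online_node;
              int min_val = INT_MAX;
      
              for_each_online_node(n) {
                      val = node_distance(node, n);
                      if (val < min_val) {
                              min_val = val;
                              best_node = n;
                      }
              }
      
              return best_node;
      }
      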
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4B07A739.3030104@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d9c2d5ac
    • x86, numa, bootmem: Only free bootmem on NUMA failure path · 021428ad
      Committed by Yinghai Lu
      In the NUMA bootmem setup failure path we freed nodedata_phys
      incorrectly.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4B07A739.3030104@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      021428ad
    • x86: apic: Print out SRAT table APIC id in hex · 163d3866
      Committed by Yinghai Lu
      Make it consistent with the APIC MADT printout; for big systems an
      APIC id in hex is more readable.
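      
      Illustratively, the change amounts to printing the APIC id with a hex
      format (the wrapper function here is hypothetical):
      
      static void report_affinity(unsigned int pxm, unsigned int apic_id, int node)
      {
              /* was "... -> APIC %u -> ..."; hex now matches the MADT printout */
              printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%02x -> Node %u\n",
                     pxm, apic_id, node);
      }
      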
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4B07A739.3030104@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      163d3866
  10. 19 Nov 2009, 1 commit
    • x86: Eliminate redundant/contradicting cache line size config options · 350f8f56
      Committed by Jan Beulich
      Rather than having X86_L1_CACHE_BYTES and X86_L1_CACHE_SHIFT
      (with inconsistent defaults), just having the latter suffices as
      the former can be easily calculated from it.
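      
      For reference, roughly how the byte values are derived from the shifts
      (approximately as in arch/x86/include/asm/cache.h):
      
      #define L1_CACHE_SHIFT          (CONFIG_X86_L1_CACHE_SHIFT)
      #define L1_CACHE_BYTES          (1 << L1_CACHE_SHIFT)
      
      #define INTERNODE_CACHE_SHIFT   CONFIG_X86_INTERNODE_CACHE_SHIFT
      #define INTERNODE_CACHE_BYTES   (1 << INTERNODE_CACHE_SHIFT)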
      
      To be consistent, also change X86_INTERNODE_CACHE_BYTES to
      X86_INTERNODE_CACHE_SHIFT, and set it to 7 (128 bytes) for NUMA
      to account for last level cache line size (which here matters
      more than L1 cache line size).
      
      Finally, make sure the default value for X86_L1_CACHE_SHIFT, when
      X86_GENERIC is selected, is seen before the defaults for the individual
      CPU model options (unlike on x86-64, where GENERIC_CPU is part of the
      choice construct, X86_GENERIC is a separate option on ix86).
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Acked-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Acked-by: Nick Piggin <npiggin@suse.de>
      LKML-Reference: <4AFD5710020000780001F8F0@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      350f8f56
  11. 17 Nov 2009, 3 commits
    • x86, mm: Report state of NX protections during boot · 4b0f3b81
      Committed by Kees Cook
      It is possible for x86_64 systems to lack the NX bit either due to the
      hardware lacking support or the BIOS having turned off the CPU capability,
      so NX status should be reported.  Additionally, anyone booting NX-capable
      CPUs in 32bit mode without PAE will lack NX functionality, so this change
      provides feedback for that case as well.
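      
      A hedged sketch of the sort of report this describes (the function name
      and message texts are assumptions):
      
      void x86_report_nx(void)
      {
              if (!cpu_has_nx) {
                      printk(KERN_NOTICE "Notice: NX (Execute Disable) protection "
                             "missing in CPU or disabled in BIOS!\n");
              } else if (__supported_pte_mask & _PAGE_NX) {
                      printk(KERN_INFO "NX (Execute Disable) protection: active\n");
              } else {
                      /* NX-capable CPU, but e.g. 32-bit without PAE */
                      printk(KERN_INFO "NX (Execute Disable) protection: disabled\n");
              }
      }
      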
      Signed-off-by: Kees Cook <kees.cook@canonical.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <1258154897-6770-6-git-send-email-hpa@zytor.com>
      4b0f3b81
    • x86, mm: Clean up and simplify NX enablement · 4763ed4d
      Committed by H. Peter Anvin
      The 32- and 64-bit code used very different mechanisms for enabling
      NX, but even the 32-bit code was enabling NX in head_32.S if it was
      available.  Furthermore, we had a bewildering collection of tests for
      the availability of NX.
      
      This patch:
      
      a) merges the 32-bit set_nx() and the 64-bit check_efer() function
         into a single x86_configure_nx() function.  EFER control is left
         to the head code.
      
      b) eliminates the nx_enabled variable entirely.  Things that need to
         test for NX enablement can verify __supported_pte_mask directly,
         and cpu_has_nx gives the supported status of NX.
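      
      A minimal sketch of the merged helper described in (a), assuming a
      boot-option flag called disable_nx:
      
      void x86_configure_nx(void)
      {
              /* NX supported by the CPU and not disabled on the command line:
               * allow the NX bit in page table entries */
              if (cpu_has_nx && !disable_nx)
                      __supported_pte_mask |= _PAGE_NX;
              else
                      __supported_pte_mask &= ~_PAGE_NX;
      }
      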
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Vegard Nossum <vegardno@ifi.uio.no>
      Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      LKML-Reference: <1258154897-6770-5-git-send-email-hpa@zytor.com>
      Acked-by: Kees Cook <kees.cook@canonical.com>
      4763ed4d
    • x86, pageattr: Make set_memory_(x|nx) aware of NX support · 583140af
      Committed by H. Peter Anvin
      Make set_memory_x()/set_memory_nx() directly aware of whether NX is
      supported in the system, rather than requiring every caller to assess
      that support independently.
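      
      A sketch of what "aware of NX support" means here, based on the existing
      pageattr helpers (the exact form in the patch may differ):
      
      int set_memory_x(unsigned long addr, int numpages)
      {
              /* nothing to do if NX is not enabled in the page tables */
              if (!(__supported_pte_mask & _PAGE_NX))
                      return 0;
      
              return change_page_attr_clear(&addr, numpages, __pgprot(_PAGE_NX), 0);
      }
      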
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Tim Starling <tstarling@wikimedia.org>
      Cc: Hannes Eder <hannes@hanneseder.net>
      LKML-Reference: <1258154897-6770-4-git-send-email-hpa@zytor.com>
      Acked-by: Kees Cook <kees.cook@canonical.com>
      583140af
  12. 10 Nov 2009, 2 commits
    • x86: pat: Remove ioremap_default() · 2fb8f4e6
      Committed by Xiaotian Feng
      Commit:
      
        b6ff32d9: x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype()
      
      consolidated reserve_memtype() and pat_x_mtrr_type(), which made
      ioremap_default() the same as ioremap_cache().
      
      Remove the redundant function and change the only caller to use
      ioremap_cache().
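      
      In caller terms the change is simply (wrapper name hypothetical):
      
      static void __iomem *map_table(resource_size_t phys, unsigned long size)
      {
              /* ioremap_default() had become identical to ioremap_cache(),
               * so the only caller now calls ioremap_cache() directly */
              return ioremap_cache(phys, size);
      }
      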
      Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      LKML-Reference: <1257845005-7938-1-git-send-email-dfeng@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      2fb8f4e6
    • x86: pat: Clean up req_type special case for reserve_memtype() · 83ea05ea
      Committed by Xiaotian Feng
      Commit:
      
        b6ff32d9: x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype()
      
      consolidated code in pat_x_mtrr_type() and reserve_memtype(),
      which removed the special case (req_type == -1) from the
      PAT-enabled path.
      
      We should also update the comments and the PAT-disabled path to match.
      Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      LKML-Reference: <1257844987-7906-1-git-send-email-dfeng@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      83ea05ea
  13. 08 Nov 2009, 1 commit
  14. 03 Nov 2009, 3 commits
  15. 28 Oct 2009, 1 commit
    • tracing: allow to change permissions for text with dynamic ftrace enabled · 883242dd
      Committed by Steven Rostedt
      Commit 74e08179 ("x86-64: align RODATA kernel section to 2MB with
      CONFIG_DEBUG_RODATA") prevents text sections from being made read/write
      using set_memory_rw().
      
      Dynamic ftrace changes all text pages to read/write just before
      converting the tracing calls to nops, and vice versa.
      
      I originally just added a flag to allow this transition when ftrace
      did the change, but I also found that when the CPA testing was running
      it would remove the read/write as well; and since ftrace does not do
      the text conversion on boot up, the CPA changes caused the dynamic
      tracer to fail its self-tests.
      
      The current solution is to simply not prevent change_page_attr() from
      setting the RW bit for kernel text pages.
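      
      A hedged sketch of the idea: compile out the text read-only enforcement
      in the CPA protection check when dynamic ftrace is built in (the helper
      name and the exact guard are assumptions):
      
      /* sketch: only forbid RW on kernel text when dynamic ftrace is not
       * configured, so ftrace can flip text pages to RW while patching */
      static int text_must_stay_ro(unsigned long address)
      {
      #if defined(CONFIG_DEBUG_RODATA) && !defined(CONFIG_DYNAMIC_FTRACE)
              return within(address, (unsigned long)_text, (unsigned long)_etext);
      #else
              return 0;
      #endif
      }
      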
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      883242dd
  16. 23 Oct 2009, 1 commit
  17. 20 Oct 2009, 2 commits
    • x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATA · 74e08179
      Committed by Suresh Siddha
      CONFIG_DEBUG_RODATA chops the large pages spanning boundaries of kernel
      text/rodata/data to small 4KB pages as they are mapped with different
      attributes (text as RO, RODATA as RO and NX etc).
      
      On x86_64, preserve the large page mappings for the kernel
      text/rodata/data boundaries when CONFIG_DEBUG_RODATA is enabled. This is
      done by making the RODATA section hugepage aligned and giving the 2MB
      page boundaries the same RWX attributes.
      
      Extra memory pages padding the sections will be freed at the end of
      boot, and the kernel identity mappings of those pages will have
      different RWX permissions compared to the kernel text mappings.
      
      Kernel identity mappings to these physical pages will use smaller pages,
      but large page mappings are still retained for the kernel text, rodata
      and data mappings.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20091014220254.190119924@sbs-t61.sc.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      74e08179
    • x86-64: preserve large page mapping for 1st 2MB kernel txt with CONFIG_DEBUG_RODATA · b9af7c0d
      Committed by Suresh Siddha
      In the first 2MB, kernel text is co-located with kernel static
      page tables setup by head_64.S.  CONFIG_DEBUG_RODATA chops this
      2MB large page mapping to small 4KB pages as we mark the kernel text as RO,
      leaving the static page tables as RW.
      
      With CONFIG_DEBUG_RODATA disabled, OLTP run on NHM-EP shows 1% improvement
      with 2% reduction in system time and 1% improvement in iowait idle time.
      
      To recover this, move the kernel static page tables to the .data
      section, so that we don't have to break the first 2MB of kernel text
      into small pages with CONFIG_DEBUG_RODATA.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20091014220254.063193621@sbs-t61.sc.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      b9af7c0d
  18. 13 Oct 2009, 5 commits
    • x86: Interleave emulated nodes over physical nodes · adc19389
      Committed by David Rientjes
      Add interleaved NUMA emulation support
      
      This patch interleaves emulated nodes over the system's physical
      nodes. This is required for interleave optimizations since
      mempolicies, for example, operate by iterating over a nodemask and
      act without knowledge of node distances.  It can also be used for
      testing memory latencies and NUMA bugs in the kernel.
      
      There are a couple of ways to do this:
      
       - divide the number of emulated nodes by the number of physical
         nodes and allocate the result on each physical node, or
      
       - allocate each successive emulated node on a different physical
         node until all memory is exhausted.
      
      The disadvantage of the first option is that, depending on the asymmetry
      in capacity among the physical nodes, emulated nodes may differ
      substantially in size on one physical node compared to another.
      
      The disadvantage of the second option is that, also depending on that
      asymmetry, more emulated nodes may be allocated on one physical node
      than on another.
      
      This patch implements the second option; we accept that a particular
      physical node may end up with slightly more emulated nodes than another,
      in exchange for avoiding node size asymmetry.
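      
      A rough sketch of the second strategy, with all names hypothetical:
      
      /* allocate each successive emulated node from the next physical node,
       * wrapping around, so emulated nodes never span physical nodes */
      static void interleave_emulated_nodes(struct bootnode *phys, int nr_phys,
                                            struct bootnode *emu, int nr_emu,
                                            u64 size)
      {
              int i;
      
              for (i = 0; i < nr_emu; i++) {
                      struct bootnode *p = &phys[i % nr_phys];
                      u64 chunk = min(size, p->end - p->start);
      
                      emu[i].start = p->start;
                      emu[i].end   = p->start + chunk;
                      p->start    += chunk;   /* consume from this physical node */
              }
      }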
      
       [ Note that "node capacity" of a physical node is not only a
         function of its addressable range, but also is affected by
         subtracting out the amount of reserved memory over that range.
         NUMA emulation only deals with available, non-reserved memory
         quantities. ]
      
      We ensure there is at least a minimal amount of available memory
      allocated to each node.  We also make sure that at least this
      amount of available memory is available in ZONE_DMA32 for any node
      that includes both ZONE_DMA32 and ZONE_NORMAL.
      
      This patch also cleans the emulation code up by no longer passing the
      statically allocated struct bootnode array among the various functions.
      Since this array may be very large it is not allocated on the stack, but
      placed in init.data and accessed at file scope.
      
      The WARN_ON() for nodes_cover_memory() when faking proximity
      domains is removed since it relies on successive nodes always
      having greater start addresses than previous nodes; with
      interleaving this is no longer always true.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Ankita Garg <ankita@in.ibm.com>
      Cc: Len Brown <len.brown@intel.com>
      LKML-Reference: <alpine.DEB.1.00.0909251519150.14754@chino.kir.corp.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      adc19389
    • x86: Export srat physical topology · 8716273c
      Committed by David Rientjes
      This is the counterpart to "x86: export k8 physical topology" for
      SRAT. It is not as invasive because the acpi code already separates
      node setup into detection and registration steps, with the
      exception of registering e820 active regions in
      acpi_numa_memory_affinity_init().  This is now moved to
      acpi_scan_nodes() if NUMA emulation is disabled or deferred.
      
      acpi_numa_init() now returns a value which specifies whether an
      underlying SRAT was located.  If so, that topology can be used by
      the emulation code to interleave emulated nodes over physical nodes
      or to register the nodes for ACPI.
      
      acpi_get_nodes() may now be used to export the srat physical
      topology of the machine for NUMA emulation.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Ankita Garg <ankita@in.ibm.com>
      Cc: Len Brown <len.brown@intel.com>
      LKML-Reference: <alpine.DEB.1.00.0909251518580.14754@chino.kir.corp.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8716273c
    • x86: Export k8 physical topology · 8ee2debc
      Committed by David Rientjes
      To eventually interleave emulated nodes over physical nodes, we
      need to know the physical topology of the machine without actually
      registering it.  This does the k8 node setup in two parts:
      detection and registration.  NUMA emulation can then use the detected
      physical topology to set up the address ranges of emulated nodes
      accordingly.  If emulation isn't used, the k8 nodes are registered as
      normal.
      
      Two formals are added to the x86 NUMA setup functions: `acpi' and
      `k8'. These represent whether ACPI or K8 NUMA has been detected;
      both cannot be true at the same time.  This specifies to the NUMA
      emulation code whether an underlying physical NUMA topology exists
      and which interface to use.
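      
      Illustratively, the NUMA setup entry point now carries both detection
      results (the signature shown is an approximation):
      
      void __init initmem_init(unsigned long start_pfn, unsigned long last_pfn,
                               int acpi, int k8);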
      
      This patch deals solely with separating the k8 setup path into
      Northbridge detection and registration steps and leaves the ACPI
      changes for a subsequent patch.  The `acpi' formal is added here,
      however, to avoid touching all the header files again in the next
      patch.
      
      This approach also ensures emulated nodes will not span physical
      nodes so the true memory latency is not misrepresented.
      
      k8_get_nodes() may now be used to export the k8 physical topology
      of the machine for NUMA emulation.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Ankita Garg <ankita@in.ibm.com>
      Cc: Len Brown <len.brown@intel.com>
      LKML-Reference: <alpine.DEB.1.00.0909251518400.14754@chino.kir.corp.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8ee2debc
    • x86: Clean up and add missing log levels for k8 · 1af5ba51
      Committed by David Rientjes
      Convert all printk's in arch/x86/mm/k8topology_64.c to use
      pr_info() or pr_err() appropriately.
      
      Adds log levels for messages currently lacking them.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Ankita Garg <ankita@in.ibm.com>
      Cc: Len Brown <len.brown@intel.com>
      LKML-Reference: <alpine.DEB.1.00.0909251517440.14754@chino.kir.corp.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1af5ba51
    • x86, 64-bit: Move K8 B step iret fixup to fault entry asm · ae24ffe5
      Committed by Brian Gerst
      Move the handling of truncated %rip from an iret fault to the fault
      entry path.
      
      This allows x86-64 to use the standard search_extable() function.
      Signed-off-by: Brian Gerst <brgerst@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jan Beulich <jbeulich@novell.com>
      LKML-Reference: <1255357103-5418-1-git-send-email-brgerst@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ae24ffe5
  19. 12 Oct 2009, 1 commit
  20. 24 Sep 2009, 2 commits
    • x86: Reduce verbosity of "PAT enabled" kernel message · e23a8b6a
      Committed by Roland Dreier
      On modern systems, the kernel prints the message
      
          x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
      
      once for every CPU.
      
      This gets kind of ridiculous on huge systems; for example, on a
      64-thread system I was lucky enough to get:
      
          dmesg| grep 'PAT enabled' | wc
               64     704    5174
      
      There is already a BUG() if non-boot CPUs have PAT capabilities
      that don't match the boot CPU, so just print the message on the
      boot CPU. (I kept the print after the wrmsrl() that enables PAT,
      so that the log output continues to mean that the system survived
      enabling PAT on the boot CPU)
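      
      A hedged sketch of the resulting call site (the boot_cpu flag is an
      assumption about how the code distinguishes the boot CPU):
      
      static void pat_report(u64 old_pat, u64 new_pat, bool boot_cpu)
      {
              /* only the boot CPU logs; others are merely checked against it */
              if (!boot_cpu)
                      return;
      
              printk(KERN_INFO "x86 PAT enabled: cpu %d, old 0x%Lx, new 0x%Lx\n",
                     smp_processor_id(), old_pat, new_pat);
      }
      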
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      LKML-Reference: <adavdj92sso.fsf@cisco.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e23a8b6a
    • cpumask: use mm_cpumask() wrapper: x86 · 78f1c4d6
      Committed by Rusty Russell
      Makes code futureproof against the impending change to mm->cpu_vm_mask (to be a pointer).
      
      It's also a chance to use the new cpumask_ ops which take a pointer
      (the older ones are deprecated, but there's no hurry for arch code).
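      
      For illustration, typical arch code after the conversion (the wrapper
      function here is hypothetical; mm_cpumask() and the cpumask_* ops are
      the real interfaces):
      
      static inline void update_mm_cpumask(struct mm_struct *prev,
                                           struct mm_struct *next, int cpu)
      {
              /* go through the accessor instead of touching mm->cpu_vm_mask
               * directly, so this keeps working once it becomes a pointer */
              cpumask_clear_cpu(cpu, mm_cpumask(prev));
              cpumask_set_cpu(cpu, mm_cpumask(next));
      }
      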
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      78f1c4d6
  21. 23 Sep 2009, 1 commit