1. May 2, 2011 (15 commits)
    • x86-32, NUMA: implement temporary NUMA init shims · b0d31080
      Tejun Heo committed
      To help the transition to common NUMA init, implement temporary 32bit
      shims for numa_add_memblk() and numa_set_distance().
      numa_add_memblk() registers the memblk and adjusts
      node_start/end_pfn[].  numa_set_distance() is a noop.
      
      These shims will allow using the 64bit NUMA init functions on 32bit
      and a gradual transition to the common NUMA init path.

      For a detailed description, please read the descriptions of the
      commits which make use of the shim functions.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
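
      As an illustration of the shims described above, a minimal sketch
      (hypothetical and simplified -- the bodies are assumptions; only
      numa_add_memblk()/numa_set_distance() and node_start/end_pfn[] come
      from the commit text):

          /* Register a memblk for @nid and widen the node's pfn range.
           * Assumes node_start_pfn[] was pre-initialized to "empty". */
          int __init numa_add_memblk(int nid, u64 start, u64 end)
          {
                  unsigned long start_pfn = start >> PAGE_SHIFT;
                  unsigned long end_pfn = end >> PAGE_SHIFT;

                  if (start_pfn < node_start_pfn[nid])
                          node_start_pfn[nid] = start_pfn;
                  if (end_pfn > node_end_pfn[nid])
                          node_end_pfn[nid] = end_pfn;
                  return 0;
          }

          /* 32bit has no use for NUMA distances yet, hence a noop. */
          void __init numa_set_distance(int from, int to, int distance)
          {
          }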
    • x86, NUMA: Move numa_nodes_parsed to numa.[hc] · e6df595b
      Tejun Heo committed
      Move numa_nodes_parsed from numa_64.[hc] to numa.[hc] to prepare for
      NUMA init path unification.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86-32, NUMA: Move get_memcfg_numa() into numa_32.c · daf4f480
      Tejun Heo committed
      There's no reason for get_memcfg_numa() to be implemented inline in
      mmzone_32.h.  Move it to numa_32.c and also make
      get_memcfg_numa_flag() static.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86, NUMA: make srat.c 32bit safe · eca9ad31
      Tejun Heo committed
      Make srat.c 32bit safe by removing the assumption that unsigned long
      is 64bit.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86, NUMA: rename srat_64.c to srat.c · 7b2600f8
      Tejun Heo committed
      Rename srat_64.c to srat.c.  This is to prepare for unification of
      NUMA init paths between 32 and 64bit.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86, NUMA: trivial cleanups · 1201e10a
      Tejun Heo committed
      * Kill no longer used struct bootnode.
      
      * Kill dangling declaration of pxm_to_nid() in numa_32.h.
      
      * Make setup_node_bootmem() static.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86-32, NUMA: use sparse_memory_present_with_active_regions() · 797390d8
      Tejun Heo committed
      Instead of calling memory_present() for each region from NUMA init,
      call sparse_memory_present_with_active_regions() from paging_init()
      similarly to x86-64.
      
      For flat and numaq, this results in exactly the same memory_present()
      calls.  For srat, if there are multiple memory chunks for a node,
      after this change, memory_present() will be called separately for each
      chunk instead of being called once to encompass the whole range, which
      doesn't cause any harm and actually is the better behavior.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
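
      For reference, a sketch of the call site this describes, assuming the
      function keeps its usual signature (a node id, with MAX_NUMNODES
      meaning "all nodes"):

          void __init paging_init(void)
          {
                  /* ... zone sizing ... */

                  /* Replaces the per-region memory_present() calls that
                   * used to be made from the NUMA init paths: */
                  sparse_memory_present_with_active_regions(MAX_NUMNODES);

                  /* ... */
          }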
    • x86-32, NUMA: Make apic->x86_32_numa_cpu_node() optional · 84914ed0
      Tejun Heo committed
      NUMAQ is the only meaningful user of this callback and
      setup_local_APIC() the only callsite.  Stop torturing everyone else by
      making the callback optional and removing all the boilerplate
      implementations and assignments.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86, NUMA: Unify 32/64bit numa_cpu_node() implementation · 6bd26273
      Tejun Heo committed
      Currently, the only meaningful user of apic->x86_32_numa_cpu_node() is
      NUMAQ which returns valid mapping only after CPU is initialized during
      SMP bringup; thus, the previous patch to set apicid -> node in
      setup_local_APIC() makes __apicid_to_node[] always contain the correct
      mapping whether custom apic->x86_32_numa_cpu_node() is used or not.
      
      So, there is no reason to keep separate 32bit implementation.  We can
      always consult __apicid_to_node[].  Move 64bit implementation from
      numa_64.c to numa.c and remove 32bit implementation from numa_32.c.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
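
      A sketch of the unified implementation this describes (the per-cpu
      apicid lookup is an assumption; __apicid_to_node[] is from the
      commit text):

          int numa_cpu_node(int cpu)
          {
                  int apicid = early_per_cpu(x86_cpu_to_apicid, cpu);

                  /* Both 32 and 64bit now consult the shared table. */
                  if (apicid != BAD_APICID)
                          return __apicid_to_node[apicid];
                  return NUMA_NO_NODE;
          }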
    • x86-32, NUMA: Automatically set apicid -> node in setup_local_APIC() · c4b90c11
      Tejun Heo committed
      Some x86-32 NUMA implementations (NUMAQ) don't initialize the
      apicid -> node mapping using set_apicid_to_node() during NUMA init
      but implement a custom apic->x86_32_numa_cpu_node() instead.

      This patch automatically initializes the default apicid -> node
      mapping table from apic->x86_32_numa_cpu_node() in setup_local_APIC()
      such that the mapping table is in sync with the actual mapping.
      
      As the table isn't used by custom implementations, this doesn't make
      any difference at this point.  This is in preparation for unifying
      numa_cpu_node() between x86-32 and 64.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
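
      Combined with the companion patch above that made the callback
      optional (84914ed0), the resulting hunk in setup_local_APIC() would
      be shaped roughly like this (a sketch, not the verbatim diff):

          #ifdef CONFIG_X86_32
                  /* NUMAQ learns the cpu -> node mapping only once the
                   * CPU is up; mirror it into the shared table so the
                   * generic numa_cpu_node() sees the same mapping. */
                  if (apic->x86_32_numa_cpu_node)
                          set_apicid_to_node(early_per_cpu(x86_cpu_to_apicid, cpu),
                                             apic->x86_32_numa_cpu_node(cpu));
          #endif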
    • x86-64, NUMA: simplify nodedata allocation · acd26d61
      Tejun Heo committed
      With top-down memblock allocation, the allocation range limits in
      early_node_mem() can be simplified - try node-local first, then any
      node, but in any case don't allocate below the DMA limit.
      
      Remove early_node_mem() and implement simplified allocation directly
      in setup_node_bootmem().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
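
      The simplified policy could be sketched as follows (illustrative
      only; the memblock call and its failure convention are assumptions
      about the API of that era, while the node-local-first,
      never-below-the-DMA-limit logic is from the commit text):

          /* Place the node's pg_data_t: prefer memory on the node
           * itself, else fall back to any node, but never allocate
           * below the DMA limit. */
          nd_pa = memblock_find_in_range(start, end, nd_size, SMP_CACHE_BYTES);
          if (!nd_pa)
                  nd_pa = memblock_find_in_range(MAX_DMA_PFN << PAGE_SHIFT,
                                                 MEMBLOCK_ALLOC_ACCESSIBLE,
                                                 nd_size, SMP_CACHE_BYTES);
          if (!nd_pa)
                  panic("Cannot allocate node data for node %d\n", nid);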
    • x86-64, NUMA: trivial cleanups for setup_node_bootmem() · ebe685f2
      Tejun Heo committed
      Make the following trivial changes in preparation for further updates.
      
      * nodeid -> nid, nid -> tnid
      * use nd_ prefix for nodedata related variables
      * remove start/end_pfn and use start/end directly
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86-64, NUMA: Simplify hotadd memory handling · 9688678a
      Tejun Heo committed
      The only special handling NUMA needs to do for hotadd memory is
      determining the node for the hotadd memory given its address, and
      nothing about that is specific to the config method used.
      
      srat_64.c does somewhat elaborate error checking on
      ACPI_SRAT_MEM_HOT_PLUGGABLE regions, remembers them, and implements
      memory_add_physaddr_to_nid(), which determines the node for a given
      hotadd address.
      
      This is almost completely redundant.  All the information is already
      available to the generic NUMA code, which already performs all the
      sanity checking and merging.  All that's necessary is to stop marking
      numa_meminfo __initdata and to provide a function which uses it to
      map an address to a node.
      
      Drop the specific implementation from srat_64.c and add generic
      memory_add_physaddr_to_nid() in numa_64.c, which is enabled if
      CONFIG_MEMORY_HOTPLUG is set.  Other than dropping the code, srat_64.c
      doesn't need any change as it already calls numa_add_memblk() for hot
      pluggable regions which is enough.
      
      While at it, change CONFIG_MEMORY_HOTPLUG_SPARSE in srat_64.c to
      CONFIG_MEMORY_HOTPLUG; for NUMA on x86-64, the two are always the
      same.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
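
      A generic memory_add_physaddr_to_nid() built on numa_meminfo can be
      as simple as this sketch (field names assume the start/end/nid
      per-memblk layout of struct numa_meminfo):

          #ifdef CONFIG_MEMORY_HOTPLUG
          int memory_add_physaddr_to_nid(u64 start)
          {
                  struct numa_meminfo *mi = &numa_meminfo;
                  int i;

                  /* Walk the parsed memblks; return the owning node. */
                  for (i = 0; i < mi->nr_blks; i++)
                          if (mi->blk[i].start <= start &&
                              start < mi->blk[i].end)
                                  return mi->blk[i].nid;
                  return 0;       /* default to node 0 if nothing matches */
          }
          #endif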
    • x86, NUMA: Fix empty memblk detection in numa_cleanup_meminfo() · 2be19102
      Yinghai Lu committed
      numa_cleanup_meminfo() trims each memblk between low (0) and
      high (max_pfn) limits and discards empty ones.  However, the
      emptiness detection incorrectly used an equality test.  If the
      start of a memblk is higher than max_pfn, the memblk is empty but
      fails the equality test and doesn't get discarded.
      
      The condition triggers when max_pfn is lower than start of a
      NUMA node and results in memory misconfiguration - leading to
      WARN_ON()s and other funnies.  The bug was discovered in devel
      branch where 32bit too uses this code path for NUMA init.  If a
      node is above the addressing limit, max_pfn ends up lower than
      the node triggering this problem.
      
      The failure hasn't been observed on x86-64 but is still possible
      with broken hardware e820/NUMA info.  As the fix is very low
      risk, it would be better to apply it even for 64bit.
      
      Fix it by using >= instead of ==.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      [ Extracted the actual fix from the original patch and rewrote the patch description. ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Link: http://lkml.kernel.org/r/20110501171204.GO29280@htj.dyndns.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
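
      The fix itself is a one-character change to the emptiness check; an
      illustrative before/after (the surrounding loop and helper are
      sketched from context):

          /* before: only an exactly-empty range was discarded */
          if (bi->start == bi->end)
                  numa_remove_memblk_from(i--, mi);

          /* after: a range clamped to start >= end is empty too */
          if (bi->start >= bi->end)
                  numa_remove_memblk_from(i--, mi);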
    • x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors · e20a2d20
      Boris Ostrovsky committed
      Older AMD K8 processors (Revisions A-E) are affected by erratum
      400 (APIC timer interrupts don't occur in C-states greater than
      C1). This, for example, means that the X86_FEATURE_ARAT flag should
      not be set for these parts.

      This addresses a regression introduced by commit
      b87cf80a ("x86, AMD: Set ARAT
      feature on AMD processors"), where the system may become
      unresponsive until an external interrupt (such as keyboard input)
      occurs. This results, for example, in time not being reported
      correctly, lack of progress on the system, and other lockups.
      Reported-by: Joerg-Volker Peetz <jvpeetz@web.de>
      Tested-by: Joerg-Volker Peetz <jvpeetz@web.de>
      Acked-by: Borislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: Boris Ostrovsky <Boris.Ostrovsky@amd.com>
      Cc: stable@kernel.org
      Link: http://lkml.kernel.org/r/1304113663-6586-1-git-send-email-ostr@amd64.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
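
      The shape of such a fix is to gate the feature flag on an erratum
      check; a hedged sketch (cpu_has_amd_erratum()/amd_erratum_400 were
      kernel helpers of this era, but the exact usage here is an
      assumption):

          /* Only advertise an always-running APIC timer when erratum
           * 400 (APIC timer dead in C-states > C1) does not apply. */
          if (!cpu_has_amd_erratum(amd_erratum_400))
                  set_cpu_cap(c, X86_FEATURE_ARAT);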
  2. Apr 29, 2011 (4 commits)
  3. Apr 28, 2011 (2 commits)
  4. Apr 27, 2011 (4 commits)
    • perf, x86, nmi: Move LVT un-masking into irq handlers · 2bce5dac
      Don Zickus committed
      It was noticed that P4 machines were generating double NMIs for
      each perf event.  These extra NMIs lead to 'Dazed and confused'
      messages on the screen.
      
      I tracked this down to a P4 quirk that said the overflow bit had
      to be cleared before re-enabling the apic LVT mask.  My first
      attempt was to move the un-masking inside the perf nmi handler
      from before the chipset NMI handler to after.
      
      This broke Nehalem boxes that seem to like the unmasking before
      the counters themselves are re-enabled.
      
      In order to keep this change simple for 2.6.39, I decided to
      simply move the apic LVT un-masking to the beginning of all
      the chipset NMI handlers, with the exception of Pentium4's, to
      fix the double NMI issue.
      
      Later on we can move the un-masking to later in the handlers to
      save a number of 'extra' NMIs on those particular chipsets.
      
      I tested this change on a P4 machine, an AMD machine, a Nehalem
      box, and a core2quad box.  'perf top' worked correctly along
      with various other small 'perf record' runs.  Anything high
      stress breaks all the machines but that is a different problem.
      
      Thanks to various people for testing different versions of this
      patch.
      Reported-and-tested-by: Shaun Ruffell <sruffell@digium.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Link: http://lkml.kernel.org/r/1303900353-10242-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
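
      The mechanical pattern being described is small; a sketch (the
      handler name is one example of "the chipset NMI handlers", the
      apic_write() un-mask is the documented idiom):

          static int intel_pmu_handle_irq(struct pt_regs *regs)
          {
                  /* Un-mask the perf-counter LVT entry up front on
                   * non-P4 chipsets; P4 must clear the overflow bit
                   * first or it raises a second, spurious NMI. */
                  apic_write(APIC_LVTPC, APIC_DM_NMI);

                  /* ... handle counter overflows ... */
          }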
    • m68k/mm: Set all online nodes in N_NORMAL_MEMORY · 4aac0b48
      Michael Schmitz committed
      For m68k, N_NORMAL_MEMORY represents all nodes that have present
      memory, since it does not support HIGHMEM.  This patch sets the bit
      at the time node_present_pages has been set by free_area_init_node.
      If it were done when the node is brought online instead, the node
      state would have to be set unconditionally, since information about
      present memory has not yet been recorded at that point.
      
      If N_NORMAL_MEMORY is not accurate, slub may encounter errors since it
      uses this nodemask to setup per-cache kmem_cache_node data structures.
      
      This patch is an alternative to the one proposed by David Rientjes
      <rientjes@google.com>, which attempted to set the node state
      immediately when bringing the node online.
      Signed-off-by: Michael Schmitz <schmitz@debian.org>
      Tested-by: Thorsten Glaser <tg@debian.org>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: stable@kernel.org
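
      The core of such a fix is a single node-state update once present
      pages are known; a sketch using the generic nodemask API (the exact
      call site inside the m68k init path is assumed):

          /* free_area_init_node() has filled in node_present_pages by
           * now, so the state can be set precisely: */
          if (node_present_pages(nid))
                  node_set_state(nid, N_NORMAL_MEMORY);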
    • Revert wrong fixes for common misspellings · e9c54999
      Lucas De Marchi committed
      These changes were incorrectly "fixed" by codespell.  They have now
      been corrected manually.
      Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
    • perf events, x86: Work around the Nehalem AAJ80 erratum · ec75a716
      Ingo Molnar committed
      On Nehalem CPUs the retired branch-misses event can be completely bogus
      when there are no branch misses occurring. When there are a lot of
      branch misses then the count is pretty accurate. Still, this leaves us
      with an event that over-counts a lot.

      Detect this erratum and work around it by using BR_MISP_EXEC.ANY events
      instead. These also count speculated branches, but in practice they are
      still a lot more precise than the architectural event.
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/n/tip-yyfg0bxo9jsqxd6a0ovfny27@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. Apr 26, 2011 (7 commits)
  6. Apr 22, 2011 (4 commits)
    • perf, x86: Update/fix Intel Nehalem cache events · f4929bd3
      Peter Zijlstra committed
      Change the Nehalem cache events to use retired memory instruction
      counters (similar to Westmere); this greatly improves the provided
      stats.
      Using:
      
      int main(void)
      {
              int i;

              for (i = 0; i < 1000000000; i++) {
                      asm("mov (%%rsp), %%rbx;"
                          "mov %%rbx, (%%rsp);" : : : "rbx");
              }
              return 0;
      }
      
      We find:
      
       $ perf stat --repeat 10 -e instructions:u -e l1-dcache-loads:u -e l1-dcache-stores:u ./loop_1b_loads+stores
        Performance counter stats for './loop_1b_loads+stores' (10 runs):
            4,000,081,056 instructions:u           #      0.000 IPC ( +-   0.000% )
            4,999,502,846 l1-dcache-loads:u          ( +-   0.008% )
            1,000,034,832 l1-dcache-stores:u         ( +-   0.000% )
               1.565184942  seconds time elapsed   ( +-   0.005% )
      
      The 5b is surprising - we'd expect 1b:
      
       $ perf stat --repeat 10 -e instructions:u -e r10b:u -e l1-dcache-stores:u ./loop_1b_loads+stores
        Performance counter stats for './loop_1b_loads+stores' (10 runs):
            4,000,081,054 instructions:u           #      0.000 IPC ( +-   0.000% )
            1,000,021,961 r10b:u                     ( +-   0.000% )
            1,000,030,951 l1-dcache-stores:u         ( +-   0.000% )
               1.565055422  seconds time elapsed   ( +-   0.003% )
      
      Which this patch thus fixes.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Link: http://lkml.kernel.org/n/tip-q9rtru7b7840tws75xzboapv@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf, x86: P4 PMU - Don't forget to clear cpuc->active_mask on overflow · 1ea5a6af
      Cyrill Gorcunov committed
      It's not enough to simply disable the event on overflow; the
      cpuc->active_mask bit should be cleared as well, otherwise the
      counter may stall as "active" even though in reality it is already
      disabled (which potentially means the user may not be able to use
      this counter further).
      
      Don pointed out that:
      
       " I also noticed this patch fixed some unknown NMIs
         on a P4 when I stressed the box".
      Tested-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Link: http://lkml.kernel.org/r/1303398203-2918-3-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
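
      One way to get both effects at once is to stop the event rather
      than merely disable it; a sketch of the overflow path (whether the
      actual fix used x86_pmu_stop() or cleared the bit directly is an
      assumption here):

          if (overflowed) {
                  handled++;
                  /* x86_pmu_stop() disables the counter *and* clears
                   * its bit in cpuc->active_mask, so the event is not
                   * left stalled in the "active" state: */
                  x86_pmu_stop(event, 0);
          }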
    • x86, perf event: Turn off unstructured raw event access to offcore registers · b52c55c6
      Ingo Molnar committed
      Andi Kleen pointed out that the Intel offcore support patches were merged
      without user-space tool support to the functionality:
      
       |
       | The offcore_msr perf kernel code was merged into 2.6.39-rc*, but the
       | user space bits were not. This made it impossible to set the extra mask
       | and actually do the OFFCORE profiling
       |
      
      Andi submitted a preliminary patch for user-space support, as an
      extension to perf's raw event syntax:
      
       |
       | Some raw events -- like the Intel OFFCORE events -- support additional
       | parameters. These can be appended after a ':'.
       |
       | For example on a multi socket Intel Nehalem:
       |
       |    perf stat -e r1b7:20ff -a sleep 1
       |
       | Profile the OFFCORE_RESPONSE.ANY_REQUEST with event mask REMOTE_DRAM_0
       | that measures any access to DRAM on another socket.
       |
      
      But this kind of usability is absolutely unacceptable - users should not
      be expected to type in magic, CPU and model specific incantations to get
      access to useful hardware functionality.
      
      The proper solution is to expose useful offcore functionality via
      generalized events - that way users do not have to care which specific
      CPU model they are using, they can use the conceptual event and not some
      model specific quirky hexa number.
      
      We already have such generalization in place for CPU cache events,
      and it's all very extensible.
      
      "Offcore" events measure general DRAM access patters along various
      parameters. They are particularly useful in NUMA systems.
      
      We want to support them via generalized DRAM events: either as the
      fourth level of cache (after the last-level cache), or as a separate
      generalization category.
      
      That way user-space support would be very obvious, memory access
      profiling could be done via self-explanatory commands like:
      
        perf record -e dram ./myapp
        perf record -e dram-remote ./myapp
      
      ... to measure DRAM accesses or more expensive cross-node NUMA DRAM
      accesses.
      
      These generalized events would work on all CPUs and architectures that
      have comparable PMU features.
      
      ( Note, these are just examples: the actual implementation could have
        more sophistication and more parameters - as long as they center
        around similarly simple usecases. )
      
      Now we do not want to revert *all* of the current offcore bits, as they
      are still somewhat useful for generic last-level-cache events, implemented
      in this commit:
      
        e994d7d2: perf: Fix LLC-* events on Intel Nehalem/Westmere
      
      But we definitely do not yet want to expose the unstructured raw events
      to user-space, until better generalization and usability is implemented
      for these hardware event features.
      
      ( Note: after generalization has been implemented raw offcore events can be
        supported as well: there can always be an odd event that is marginally
        useful but not useful enough to generalize. DRAM profiling is definitely
        *not* such a category so generalization must be done first. )
      
      Furthermore, PERF_TYPE_RAW access to these registers was not intended
      to go upstream without proper support - it was a side-effect of the above
      e994d7d2 commit, not mentioned in the changelog.
      
      As v2.6.39 is nearing release we go for the simplest approach: disable
      the PERF_TYPE_RAW offcore hack for now, before it escapes into a released
      kernel and becomes an ABI.
      
      Once proper structure is implemented for these hardware events and users
      are offered usable solutions we can revisit this issue.
      Reported-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1302658203-4239-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Support Xeon E7's via the Westmere PMU driver · b2508e82
      Andi Kleen committed
      There's a new public model number, 47, for Xeon E7 (aka Westmere EX).
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: a.p.zijlstra@chello.nl
      Link: http://lkml.kernel.org/r/1303429715-10202-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
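
      Supporting a new model in an existing PMU driver is typically a
      one-line case added to the model switch in the Intel PMU init code;
      a sketch (the neighboring model number is an assumption, only
      47 = Westmere EX comes from the commit):

          switch (boot_cpu_data.x86_model) {
          case 44: /* Westmere-EP */
          case 47: /* Westmere EX, aka Xeon E7 -- the new model */
                  /* reuse the existing Westmere event tables */
                  break;
          /* ... */
          }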
  7. Apr 21, 2011 (4 commits)
    • [PARISC] set memory ranges in N_NORMAL_MEMORY when onlined · d9b41e0b
      David Rientjes committed
      When a DISCONTIGMEM memory range is brought online as a NUMA node, it
      also needs to have its bit set in N_NORMAL_MEMORY.  This is necessary
      for generic kernel code that utilizes N_NORMAL_MEMORY as a subset of
      N_ONLINE for memory savings.
      
      These types of hacks can hopefully be removed once DISCONTIGMEM is either
      removed or abstracted away from CONFIG_NUMA.
      
      Fixes a panic in the slub code which only initializes structures for
      N_NORMAL_MEMORY to save memory:
      
      	Backtrace:
      	 [<000000004021c938>] add_partial+0x28/0x98
      	 [<000000004021faa0>] __slab_free+0x1d0/0x1d8
      	 [<000000004021fd04>] kmem_cache_free+0xc4/0x128
      	 [<000000004033bf9c>] ida_get_new_above+0x21c/0x2c0
      	 [<00000000402a8980>] sysfs_new_dirent+0xd0/0x238
      	 [<00000000402a974c>] create_dir+0x5c/0x168
      	 [<00000000402a9ab0>] sysfs_create_dir+0x98/0x128
      	 [<000000004033d6c4>] kobject_add_internal+0x114/0x258
      	 [<000000004033d9ac>] kobject_add_varg+0x7c/0xa0
      	 [<000000004033df20>] kobject_add+0x50/0x90
      	 [<000000004033dfb4>] kobject_create_and_add+0x54/0xc8
      	 [<00000000407862a0>] cgroup_init+0x138/0x1f0
      	 [<000000004077ce50>] start_kernel+0x5a0/0x840
      	 [<000000004011fa3c>] start_parisc+0xa4/0xb8
      	 [<00000000404bb034>] packet_ioctl+0x16c/0x208
      	 [<000000004049ac30>] ip_mroute_setsockopt+0x260/0xf20
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
    • x86, numa: Fix cpu nodemasks for NUMA emulation and CONFIG_DEBUG_PER_CPU_MAPS · 7a6c6547
      David Rientjes committed
      The cpu<->node mappings under CONFIG_DEBUG_PER_CPU_MAPS=y
      are currently broken when NUMA emulation is enabled, because the
      code does not iterate through every emulated node and bind the cpus
      that have affinity to it.
      
      NUMA emulation should bind each cpu to every local node to
      accurately represent the true NUMA topology of the underlying
      machine.
      
      debug_cpumask_set_cpu() needs to be fixed at the same time so
      that the debugging information that it emits shows the new
      cpumask of the node being assigned when the cpu is being added
      or removed.
      
      It can now take responsibility for setting or clearing the cpu bit
      itself, which removes the need for duplicate code.

      Also change its last parameter, "enable", to have the correct bool
      type, since it can only be true or false.
      
       -v2: Fix the return statements, by Kosaki Motohiro
      Acked-and-Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1104201918470.12634@chino.kir.corp.google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
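
      A sketch of what debug_cpumask_set_cpu() looks like once it owns the
      set/clear step (the mask lookup and message are simplified
      assumptions):

          static void debug_cpumask_set_cpu(int cpu, int node, bool enable)
          {
                  struct cpumask *mask = node_to_cpumask_map[node];

                  /* The helper now flips the bit itself ... */
                  if (enable)
                          cpumask_set_cpu(cpu, mask);
                  else
                          cpumask_clear_cpu(cpu, mask);

                  /* ... so the debug output can show the node's *new*
                   * cpumask after the cpu was added or removed. */
                  pr_debug("cpu %d %s node %d\n", cpu,
                           enable ? "set in" : "cleared from", node);
          }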
    • Revert "x86, NUMA: Fix fakenuma boot failure" · 37f8527d
      David Rientjes committed
      Andreas Herrmann reported that 7d6b4670 ("x86, NUMA: Fix fakenuma
      boot failure") causes certain physical NUMA topologies (for example
      AMD Magny-Cours) to move sibling cpus to a single node when in reality
      they are in separate domains.
      
      This may result in some nodes being completely void of cpus, which
      doesn't accurately represent the correct topology. The system will
      boot, but will have suboptimal NUMA performance.
      
      This commit was intended as a fix for NUMA emulation, but should
      not cause a regression for real NUMA machines as a side effect.
      
      ( There will be a separate fix for the numa-debug code, which
        will not affect physical topologies. )
      Reported-by: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1104201918110.12634@chino.kir.corp.google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • OMAP2/3: hwmod: fix gpio-reset timeouts seen during bootup. · f95440ca
      Avinash.H.M committed
      The GPIO module expects the debounce clocks to be enabled during reset;
      it doesn't reset properly, and timeouts are seen, if these clocks
      aren't enabled.  Add the HWMOD_CONTROL_OPT_CLKS_IN_RESET flag to the
      GPIO hwmods, with which the debounce clocks are enabled during reset.
      
      Cc: Rajendra Nayak <rnayak@ti.com>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Benoit Cousson <b-cousson@ti.com>
      Cc: Kevin Hilman <khilman@ti.com>
      Signed-off-by: Avinash.H.M <avinashhm@ti.com>
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
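
      In hwmod data, a fix like this is just a flag on the module record;
      a hedged sketch (the specific hwmod shown is illustrative, the flag
      name is from the commit):

          static struct omap_hwmod omap3xxx_gpio1_hwmod = {
                  .name   = "gpio1",
                  /* keep optional (debounce) clocks enabled while the
                   * module is being reset: */
                  .flags  = HWMOD_CONTROL_OPT_CLKS_IN_RESET,
                  /* ... clocks, addresses, etc. ... */
          };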