1. 30 5月, 2016 1 次提交
  2. 22 4月, 2016 1 次提交
    • L
      ACPI / SRAT: fix SRAT parsing order with both LAPIC and X2APIC present · 702b07fc
      Lukasz Anaczkowski 提交于
      SRAT maps APIC ID to proximity domains ids (PXM). Mapping from PXM to
      NUMA node ids is based on order of entries in SRAT table.
      SRAT table has just LAPIC entires or mix of LAPIC and X2APIC entries.
      As long as there are only LAPIC entires, mapping from proximity domain
      id to NUMA node id is as assumed by BIOS. However, once APIC entries are
      mixed, X2APIC entries would be first mapped which causes unexpected NUMA
      node mapping.
      
      To fix that, change parsing to check each entry against both LAPIC and
      X2APIC so mapping is in the SRAT/PXM order.
      
      This is supplemental change to the fix made by commit d81056b5
      (Handle apic/x2apic entries in MADT in correct order) and using the
      mechanism introduced by 9b3fedde (ACPI / tables: Add acpi_subtable_proc
      to ACPI table parsers).
      
      Fixes: d81056b5 (Handle apic/x2apic entries in MADT in correct order)
      Signed-off-by: NLukasz Anaczkowski <lukasz.anaczkowski@intel.com>
      [ rjw : Subject & changelog ]
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      702b07fc
  3. 08 7月, 2015 1 次提交
  4. 26 6月, 2015 1 次提交
    • T
      acpi: Add acpi_map_pxm_to_online_node() · 99759869
      Toshi Kani 提交于
      The kernel initializes CPU & memory's NUMA topology from ACPI
      SRAT table.  Some other ACPI tables, such as NFIT and DMAR, also
      contain proximity IDs for their device's NUMA topology.  This
      information can be used to improve performance of these devices.
      
      This patch introduces acpi_map_pxm_to_online_node(), which is
      similar to acpi_map_pxm_to_node(), but always returns an online
      node.  When the mapped node from a given proximity ID is offline,
      it looks up the node distance table and returns the nearest
      online node.
      
      ACPI device drivers, which are called after the NUMA initialization
      has completed in the kernel, can call this interface to obtain their
      device NUMA topology from ACPI tables.  Such drivers do not have to
      deal with offline nodes.  A node may be offline when a device
      proximity ID is unique, SRAT memory entry does not exist, or NUMA is
      disabled, ex. "numa=off" on x86.
      
      This patch also moves the pxm range check from acpi_get_node() to
      acpi_map_pxm_to_node().
      Signed-off-by: NToshi Kani <toshi.kani@hp.com>
      Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com&gt;>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      99759869
  5. 06 2月, 2015 1 次提交
  6. 04 2月, 2014 4 次提交
  7. 07 12月, 2013 1 次提交
    • L
      ACPI: Clean up inclusions of ACPI header files · 8b48463f
      Lv Zheng 提交于
      Replace direct inclusions of <acpi/acpi.h>, <acpi/acpi_bus.h> and
      <acpi/acpi_drivers.h>, which are incorrect, with <linux/acpi.h>
      inclusions and remove some inclusions of those files that aren't
      necessary.
      
      First of all, <acpi/acpi.h>, <acpi/acpi_bus.h> and <acpi/acpi_drivers.h>
      should not be included directly from any files that are built for
      CONFIG_ACPI unset, because that generally leads to build warnings about
      undefined symbols in !CONFIG_ACPI builds.  For CONFIG_ACPI set,
      <linux/acpi.h> includes those files and for CONFIG_ACPI unset it
      provides stub ACPI symbols to be used in that case.
      
      Second, there are ordering dependencies between those files that always
      have to be met.  Namely, it is required that <acpi/acpi_bus.h> be included
      prior to <acpi/acpi_drivers.h> so that the acpi_pci_root declarations the
      latter depends on are always there.  And <acpi/acpi.h> which provides
      basic ACPICA type declarations should always be included prior to any other
      ACPI headers in CONFIG_ACPI builds.  That also is taken care of including
      <linux/acpi.h> as appropriate.
      Signed-off-by: NLv Zheng <lv.zheng@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com> (drivers/pci stuff)
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> (Xen stuff)
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      8b48463f
  8. 24 9月, 2013 1 次提交
  9. 13 8月, 2013 1 次提交
  10. 03 3月, 2013 1 次提交
    • Y
      x86, ACPI, mm: Revert movablemem_map support · 20e6926d
      Yinghai Lu 提交于
      Tim found:
      
        WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x80()
        Hardware name: S2600CP
        sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
        smpboot: Booting Node   1, Processors  #1
        Modules linked in:
        Pid: 0, comm: swapper/1 Not tainted 3.9.0-0-generic #1
        Call Trace:
          set_cpu_sibling_map+0x279/0x449
          start_secondary+0x11d/0x1e5
      
      Don Morris reproduced on a HP z620 workstation, and bisected it to
      commit e8d19552 ("acpi, memory-hotplug: parse SRAT before memblock
      is ready")
      
      It turns out movable_map has some problems, and it breaks several things
      
      1. numa_init is called several times, NOT just for srat. so those
      	nodes_clear(numa_nodes_parsed)
      	memset(&numa_meminfo, 0, sizeof(numa_meminfo))
         can not be just removed.  Need to consider sequence is: numaq, srat, amd, dummy.
         and make fall back path working.
      
      2. simply split acpi_numa_init to early_parse_srat.
         a. that early_parse_srat is NOT called for ia64, so you break ia64.
         b.  for (i = 0; i < MAX_LOCAL_APIC; i++)
      	     set_apicid_to_node(i, NUMA_NO_NODE)
           still left in numa_init. So it will just clear result from early_parse_srat.
           it should be moved before that....
         c.  it breaks ACPI_TABLE_OVERIDE...as the acpi table scan is moved
             early before override from INITRD is settled.
      
      3. that patch TITLE is total misleading, there is NO x86 in the title,
         but it changes critical x86 code. It caused x86 guys did not
         pay attention to find the problem early. Those patches really should
         be routed via tip/x86/mm.
      
      4. after that commit, following range can not use movable ram:
        a. real_mode code.... well..funny, legacy Node0 [0,1M) could be hot-removed?
        b. initrd... it will be freed after booting, so it could be on movable...
        c. crashkernel for kdump...: looks like we can not put kdump kernel above 4G
      	anymore.
        d. init_mem_mapping: can not put page table high anymore.
        e. initmem_init: vmemmap can not be high local node anymore. That is
           not good.
      
      If node is hotplugable, the mem related range like page table and
      vmemmap could be on the that node without problem and should be on that
      node.
      
      We have workaround patch that could fix some problems, but some can not
      be fixed.
      
      So just remove that offending commit and related ones including:
      
       f7210e6c ("mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to
          protect movablecore_map in memblock_overlaps_region().")
      
       01a178a9 ("acpi, memory-hotplug: support getting hotplug info from
          SRAT")
      
       27168d38 ("acpi, memory-hotplug: extend movablemem_map ranges to
          the end of node")
      
       e8d19552 ("acpi, memory-hotplug: parse SRAT before memblock is
          ready")
      
       fb06bc8e ("page_alloc: bootmem limit with movablecore_map")
      
       42f47e27 ("page_alloc: make movablemem_map have higher priority")
      
       6981ec31 ("page_alloc: introduce zone_movable_limit[] to keep
          movable limit for nodes")
      
       34b71f1e ("page_alloc: add movable_memmap kernel parameter")
      
       4d59a751 ("x86: get pg_data_t's memory from other node")
      
      Later we should have patches that will make sure kernel put page table
      and vmemmap on local node ram instead of push them down to node0.  Also
      need to find way to put other kernel used ram to local node ram.
      Reported-by: NTim Gardner <tim.gardner@canonical.com>
      Reported-by: NDon Morris <don.morris@hp.com>
      Bisected-by: NDon Morris <don.morris@hp.com>
      Tested-by: NDon Morris <don.morris@hp.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      20e6926d
  11. 24 2月, 2013 1 次提交
    • T
      acpi, memory-hotplug: parse SRAT before memblock is ready · e8d19552
      Tang Chen 提交于
      On linux, the pages used by kernel could not be migrated.  As a result,
      if a memory range is used by kernel, it cannot be hot-removed.  So if we
      want to hot-remove memory, we should prevent kernel from using it.
      
      The way now used to prevent this is specify a memory range by
      movablemem_map boot option and set it as ZONE_MOVABLE.
      
      But when the system is booting, memblock will allocate memory, and
      reserve the memory for kernel.  And before we parse SRAT, and know the
      node memory ranges, memblock is working.  And it may allocate memory in
      ranges to be set as ZONE_MOVABLE.  This memory can be used by kernel,
      and never be freed.
      
      So, let's parse SRAT before memblock is called first.  And it is early
      enough.
      
      The first call of memblock_find_in_range_node() is in:
      
        setup_arch()
          |-->setup_real_mode()
      
      so, this patch add a function early_parse_srat() to parse SRAT, and call
      it before setup_real_mode() is called.
      
      NOTE:
      
      1) early_parse_srat() is called before numa_init(), and has initialized
         numa_meminfo.  So DO NOT clear numa_nodes_parsed in numa_init() and DO
         NOT zero numa_meminfo in numa_init(), otherwise we will lose memory
         numa info.
      
      2) I don't know why using count of memory affinities parsed from SRAT
         as a return value in original acpi_numa_init().  So I add a static
         variable srat_mem_cnt to remember this count and use it as the return
         value of the new acpi_numa_init()
      
      [mhocko@suse.cz: parse SRAT before memblock is ready fix]
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Reviewed-by: NWen Congyang <wency@cn.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Wu Jianguo <wujianguo@huawei.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: "Brown, Len" <len.brown@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e8d19552
  12. 26 1月, 2013 1 次提交
  13. 11 1月, 2013 1 次提交
  14. 03 8月, 2012 2 次提交
  15. 17 1月, 2012 1 次提交
    • K
      ACPI: Store SRAT table revision · 8df0eb7c
      Kurt Garloff 提交于
      In SRAT v1, we had 8bit proximity domain (PXM) fields; SRAT v2 provides
      32bits for these. The new fields were reserved before.
      According to the ACPI spec, the OS must disregrard reserved fields.
      In order to know whether or not, we must know what version the SRAT
      table has.
      
      This patch stores the SRAT table revision for later consumption
      by arch specific __init functions.
      Signed-off-by: NKurt Garloff <kurt@garloff.de>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      8df0eb7c
  16. 16 2月, 2011 1 次提交
    • T
      x86-64, NUMA: Unify {acpi|amd}_{numa_init|scan_nodes}() arguments and return values · 940fed2e
      Tejun Heo 提交于
      The functions used during NUMA initialization - *_numa_init() and
      *_scan_nodes() - have different arguments and return values.  Unify
      them such that they all take no argument and return 0 on success and
      -errno on failure.  This is in preparation for further NUMA init
      cleanups.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Shaohui Zheng <shaohui.zheng@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      940fed2e
  17. 12 1月, 2011 1 次提交
  18. 24 12月, 2010 1 次提交
    • Y
      x86, acpi: Parse all SRAT cpu entries even above the cpu number limitation · d3bd0588
      Yinghai Lu 提交于
      Recent Intel new system have different order in MADT, aka will list all thread0
      at first, then all thread1.
      But SRAT table still old order, it will list cpus in one socket all together.
      
      If the user have compiled limited NR_CPUS or boot with nr_cpus=, could have missed
      to put some cpus apic id to node mapping into apicid_to_node[].
      
      for example for 4 sockets system with 64 cpus with nr_cpus=32 will get crash...
      
      [    9.106288] Total of 32 processors activated (136190.88 BogoMIPS).
      [    9.235021] divide error: 0000 [#1] SMP
      [    9.235315] last sysfs file:
      [    9.235481] CPU 1
      [    9.235592] Modules linked in:
      [    9.245398]
      [    9.245478] Pid: 2, comm: kthreadd Not tainted 2.6.37-rc1-tip-yh-01782-ge92ef79-dirty #274      /Sun Fire x4800
      [    9.265415] RIP: 0010:[<ffffffff81075a8f>]  [<ffffffff81075a8f>] select_task_rq_fair+0x4f0/0x623
      ...
      [    9.645938] RIP  [<ffffffff81075a8f>] select_task_rq_fair+0x4f0/0x623
      [    9.665356]  RSP <ffff88103f8d1c40>
      [    9.665568] ---[ end trace 2296156d35fdfc87 ]---
      
      So let just parse all cpu entries in SRAT.
      
      Also add apicid checking with MAX_LOCAL_APIC, in case We could out of boundaries of
      apicid_to_node[].
      
      it fixes following bug too.
      https://bugzilla.kernel.org/show_bug.cgi?id=22662
      
      -v2: expand to 32bit according to hpa
         need to add MAX_LOCAL_APIC for 32bit
      Reported-and-Tested-by: NWu Fengguang <fengguang.wu@intel.com>
      Reported-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
      Tested-by: NMyron Stowe <myron.stowe@hp.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4D0AD486.9020704@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      d3bd0588
  19. 15 8月, 2010 1 次提交
  20. 04 4月, 2010 1 次提交
  21. 18 2月, 2010 1 次提交
  22. 16 12月, 2009 1 次提交
  23. 13 10月, 2009 1 次提交
    • D
      x86: Export srat physical topology · 8716273c
      David Rientjes 提交于
      This is the counterpart to "x86: export k8 physical topology" for
      SRAT. It is not as invasive because the acpi code already seperates
      node setup into detection and registration steps, with the
      exception of registering e820 active regions in
      acpi_numa_memory_affinity_init().  This is now moved to
      acpi_scan_nodes() if NUMA emulation is disabled or deferred.
      
      acpi_numa_init() now returns a value which specifies whether an
      underlying SRAT was located.  If so, that topology can be used by
      the emulation code to interleave emulated nodes over physical nodes
      or to register the nodes for ACPI.
      
      acpi_get_nodes() may now be used to export the srat physical
      topology of the machine for NUMA emulation.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Ankita Garg <ankita@in.ibm.com>
      Cc: Len Brown <len.brown@intel.com>
      LKML-Reference: <alpine.DEB.1.00.0909251518580.14754@chino.kir.corp.google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8716273c
  24. 29 8月, 2009 1 次提交
  25. 04 4月, 2009 1 次提交
    • S
      x86, ACPI: add support for x2apic ACPI extensions · 7237d3de
      Suresh Siddha 提交于
      All logical processors with APIC ID values of 255 and greater will have their
      APIC reported through Processor X2APIC structure (type-9 entry type) and all
      logical processors with APIC ID less than 255 will have their APIC reported
      through legacy Processor Local APIC (type-0 entry type) only. This is the
      same case even for NMI structure reporting.
          
      The Processor X2APIC Affinity structure provides the association between the
      X2APIC ID of a logical processor and the proximity domain to which the logical
      processor belongs.
          
      For OSPM, Procssor IDs outside the 0-254 range are to be declared as Device()
      objects in the ACPI namespace.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      7237d3de
  26. 16 3月, 2009 1 次提交
  27. 31 12月, 2008 1 次提交
  28. 11 10月, 2008 1 次提交
  29. 17 7月, 2008 1 次提交
  30. 12 6月, 2008 1 次提交
  31. 07 2月, 2008 1 次提交
    • A
      ACPI: misc cleanups · e5685b9d
      Adrian Bunk 提交于
          This patch contains the following possible cleanups:
          - make the following needlessly global code static:
            - drivers/acpi/bay.c:dev_attr_eject
            - drivers/acpi/bay.c:dev_attr_present
            - drivers/acpi/dock.c:dev_attr_docked
            - drivers/acpi/dock.c:dev_attr_flags
            - drivers/acpi/dock.c:dev_attr_uid
            - drivers/acpi/dock.c:dev_attr_undock
            - drivers/acpi/pci_bind.c:acpi_pci_unbind()
            - drivers/acpi/pci_link.c:acpi_link_lock
            - drivers/acpi/sbs.c:acpi_sbs_callback()
            - drivers/acpi/sbshc.c:acpi_smbus_transaction()
            - drivers/acpi/sleep/main.c:acpi_sleep_prepare()
          - #if 0 the following unused global functions:
            - drivers/acpi/numa.c:acpi_unmap_pxm_to_node()
          - remove the following unused EXPORT_SYMBOL's:
            - acpi_register_gsi
            - acpi_unregister_gsi
            - acpi_strict
            - acpi_bus_receive_event
            - register_acpi_bus_type
            - unregister_acpi_bus_type
            - acpi_os_printf
            - acpi_os_sleep
            - acpi_os_stall
            - acpi_os_read_pci_configuration
            - acpi_os_create_semaphore
            - acpi_os_delete_semaphore
            - acpi_os_wait_semaphore
            - acpi_os_signal_semaphore
            - acpi_os_signal
            - acpi_pci_irq_enable
            - acpi_get_pxm
      Signed-off-by: NAdrian Bunk <bunk@kernel.org>
      Acked-by: NAlexey Starikovskiy <astarikovskiy@suse.de>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      e5685b9d
  32. 14 12月, 2007 1 次提交
    • J
      ACPI: fix modpost warnings · ffada891
      Jan Beulich 提交于
      for sn2_defconfig:
      
      WARNING: vmlinux.o(.text+0x4b8601): Section mismatch: reference to .init.data:node_to_pxm_map (between '__acpi_map_pxm_to_node' and 'acpi_get_pxm')
      WARNING: vmlinux.o(.text+0x4b8741): Section mismatch: reference to .init.data:pxm_to_node_map (between 'acpi_map_pxm_to_node' and 'acpi_get_node')
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      ffada891
  33. 22 7月, 2007 2 次提交
    • D
      x86_64: fake pxm-to-node mapping for fake numa · 3484d798
      David Rientjes 提交于
      For NUMA emulation, our SLIT should represent the true NUMA topology of the
      system but our proximity domain to node ID mapping needs to reflect the
      emulated state.
      
      When NUMA emulation has successfully setup fake nodes on the system, a new
      function, acpi_fake_nodes() is called.  This function determines the proximity
      domain (_PXM) for each true node found on the system.  It then finds which
      emulated nodes have been allocated on this true node as determined by its
      starting address.  The node ID to PXM mapping is changed so that each fake
      node ID points to the PXM of the true node that it is located on.
      
      If the machine failed to register a SLIT, then we assume there is no special
      requirement for emulated node affinity so we use the default LOCAL_DISTANCE,
      which is newly exported to this code, as our measurement if the emulated nodes
      appear in the same PXM.  Otherwise, we use REMOTE_DISTANCE.
      
      PXM_INVAL and NID_INVAL are also exported to the ACPI header file so that we
      can compare node_to_pxm() results in generic code (in this case, the SRAT
      code).
      
      Cc: Len Brown <lenb@kernel.org>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3484d798
    • D
      x86_64: various cleanups in NUMA scan node · ae2c6dcf
      David Rientjes 提交于
      In acpi_scan_nodes(), we immediately return -1 if acpi_numa <= 0, meaning
      we haven't detected any underlying ACPI topology or we have explicitly
      disabled its use from the command-line with numa=noacpi.
      
      acpi_table_print_srat_entry() and acpi_table_parse_srat() are only
      referenced within drivers/acpi/numa.c, so we can mark them as static and
      remove their prototypes from the header file.
      
      Likewise, pxm_to_node_map[] and node_to_pxm_map[] are only used within
      drivers/acpi/numa.c, so we mark them as static and remove their externs
      from the header file.
      
      The automatic 'result' variable is unused in acpi_numa_init(), so it's
      removed.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ae2c6dcf
  34. 02 6月, 2007 1 次提交
  35. 17 5月, 2007 1 次提交