1. 25 9月, 2014 2 次提交
  2. 13 8月, 2014 4 次提交
    • N
      powerpc: reorder per-cpu NUMA information's initialization · 2fabf084
      Nishanth Aravamudan 提交于
      There is an issue currently where NUMA information is used on powerpc
      (and possibly ia64) before it has been read from the device-tree, which
      leads to large slab consumption with CONFIG_SLUB and memoryless nodes.
      
      NUMA powerpc non-boot CPU's cpu_to_node/cpu_to_mem is only accurate
      after start_secondary(), similar to ia64, which is invoked via
      smp_init().
      
      Commit 6ee0578b ("workqueue: mark init_workqueues() as
      early_initcall()") made init_workqueues() be invoked via
      do_pre_smp_initcalls(), which is obviously before the secondary
      processors are online.
      
      Additionally, the following commits changed init_workqueues() to use
      cpu_to_node to determine the node to use for kthread_create_on_node:
      
      bce90380 ("workqueue: add wq_numa_tbl_len and
      wq_numa_possible_cpumask[]")
      f3f90ad4 ("workqueue: determine NUMA node of workers accourding to
      the allowed cpumask")
      
      Therefore, when init_workqueues() runs, it sees all CPUs as being on
      Node 0. On LPARs or KVM guests where Node 0 is memoryless, this leads to
      a high number of slab deactivations
      (http://www.spinics.net/lists/linux-mm/msg67489.html).
      
      Fix this by initializing the powerpc-specific CPU<->node/local memory
      node mapping as early as possible, which on powerpc is
      do_init_bootmem(). Currently that function initializes the mapping for
      the boot CPU, but we extend it to setup the mapping for all possible
      CPUs. Then, in smp_prepare_cpus(), we can correspondingly set the
      per-cpu values for all possible CPUs. That ensures that before the
      early_initcalls run (and really as early as possible), the per-cpu NUMA
      mapping is accurate.
      
      While testing memoryless nodes on PowerKVM guests with a fix to the
      workqueue logic to use cpu_to_mem() instead of cpu_to_node(), with a
      guest topology of:
      
      available: 2 nodes (0-1)
      node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
      node 0 size: 0 MB
      node 0 free: 0 MB
      node 1 cpus: 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
      node 1 size: 16336 MB
      node 1 free: 15329 MB
      node distances:
      node   0   1
        0:  10  40
        1:  40  10
      
      the slab consumption decreases from
      
      Slab:             932416 kB
      SUnreclaim:       902336 kB
      
      to
      
      Slab:             395264 kB
      SUnreclaim:       359424 kB
      
      And we a corresponding increase in the slab efficiency from
      
      slab                                   mem     objs    slabs
                                            used   active   active
      ------------------------------------------------------------
      kmalloc-16384                       337 MB   11.28%  100.00%
      task_struct                         288 MB    9.93%  100.00%
      
      to
      
      slab                                   mem     objs    slabs
                                            used   active   active
      ------------------------------------------------------------
      kmalloc-16384                        37 MB  100.00%  100.00%
      task_struct                          31 MB  100.00%  100.00%
      
      Powerpc didn't support memoryless nodes until recently (64bb80d8
      "powerpc/numa: Enable CONFIG_HAVE_MEMORYLESS_NODES" and 8c272261
      "powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"). Those commits also
      helped improve memory consumption with these kind of environments.
      Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      2fabf084
    • A
      powerpc/ppc476: Disable BTAC · 97b3be1e
      Alistair Popple 提交于
      This patch disables the branch target address CAM which under specific
      circumstances may cause the processor to skip execution of 1-4
      instructions. This fixes IBM Erratum #47.
      Signed-off-by: NAlistair Popple <alistair@popple.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      97b3be1e
    • G
      powerpc/powernv: Fix IOMMU group lost · 763fe0ad
      Gavin Shan 提交于
      When we take full hotplug to recover from EEH errors, PCI buses
      could be involved. For the case, the child devices of involved
      PCI buses can't be attached to IOMMU group properly, which is
      caused by commit 3f28c5af ("powerpc/powernv: Reduce multi-hit of
      iommu_add_device()").
      
      When adding the PCI devices of the newly created PCI buses to
      the system, the IOMMU group is expected to be added in (C).
      (A) fails to bind the IOMMU group because bus->is_added is
      false. (B) fails because the device doesn't have binding IOMMU
      table yet. bus->is_added is set to true at end of (C) and
      pdev->is_added is set to true at (D).
      
         pcibios_add_pci_devices()
            pci_scan_bridge()
               pci_scan_child_bus()
                  pci_scan_slot()
                     pci_scan_single_device()
                        pci_scan_device()
                        pci_device_add()
                           pcibios_add_device()           A: Ignore
                           device_add()                   B: Ignore
                        pcibios_fixup_bus()
                           pcibios_setup_bus_devices()
                              pcibios_setup_device()      C: Hit
            pcibios_finish_adding_to_bus()
               pci_bus_add_devices()
                  pci_bus_add_device()                    D: Add device
      
      If the parent PCI bus isn't involved in hotplug, the IOMMU
      group is expected to be bound in (B). (A) should fail as the
      sysfs entries aren't populated.
      
      The patch fixes the issue by reverting commit 3f28c5af and remove
      WARN_ON() in iommu_add_device() to allow calling the function
      even the specified device already has associated IOMMU group.
      
      Cc: <stable@vger.kernel.org>  # 3.16+
      Reported-by: NThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Acked-by: NWei Yang <weiyang@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      763fe0ad
    • G
      powerpc: Fix "attempt to move .org backwards" error · 11d54904
      Guenter Roeck 提交于
      Once again, we see
      
      arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
      arch/powerpc/kernel/exceptions-64s.S:865: Error: attempt to move .org backwards
      arch/powerpc/kernel/exceptions-64s.S:866: Error: attempt to move .org backwards
      arch/powerpc/kernel/exceptions-64s.S:890: Error: attempt to move .org backwards
      
      when compiling ppc:allmodconfig.
      
      This time the problem has been caused by to commit 0869b6fd
      ("powerpc/book3s: Add basic infrastructure to handle HMI in Linux"),
      which adds functions hmi_exception_early and hmi_exception_after_realmode
      into a critical (size-limited) code area, even though that does not appear
      to be necessary.
      
      Move those functions to a non-critical area of the file.
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      11d54904
  3. 09 8月, 2014 2 次提交
  4. 07 8月, 2014 1 次提交
  5. 06 8月, 2014 2 次提交
  6. 05 8月, 2014 16 次提交
  7. 30 7月, 2014 1 次提交
    • A
      powerpc/e6500: Add support for hardware threads · e16c8765
      Andy Fleming 提交于
      The general idea is that each core will release all of its
      threads into the secondary thread startup code, which will
      eventually wait in the secondary core holding area, for the
      appropriate bit in the PACA to be set. The kick_cpu function
      pointer will set that bit in the PACA, and thus "release"
      the core/thread to boot. We also need to do a few things that
      U-Boot normally does for CPUs (like enable branch prediction).
      Signed-off-by: NAndy Fleming <afleming@freescale.com>
      [scottwood@freescale.com: various changes, including only enabling
       threads if Linux wants to kick them]
      Signed-off-by: NScott Wood <scottwood@freescale.com>
      e16c8765
  8. 28 7月, 2014 12 次提交