1. 18 Nov, 2014 1 commit
  2. 03 Nov, 2014 1 commit
    • C
      powerpc: Replace __get_cpu_var uses · 69111bac
      Christoph Lameter committed
      This still has not been merged and now powerpc is the only arch that does
      not have this change. Sorry about missing linuxppc-dev before.
      
      V2->V2
        - Fix up to work against 3.18-rc1
      
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processor's percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as:

      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and fewer registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout, specialized macros can be defined in non-x86
      arches as well in order to optimize per cpu access, e.g. by using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processor's instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y);
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++;
      
         Converts to
      
      	__this_cpu_inc(y);
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      [mpe: Fix build errors caused by set/or_softirq_pending(), and rework
            assignment in __set_breakpoint() to use memcpy().]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      69111bac
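The transformations listed in this commit message can be tried out in userspace with a rough model. The sketch below is not kernel code: it assumes a per-CPU variable can be modelled as a plain array with one slot per CPU, and every `demo_`/`DEMO_` name (including `demo_current_cpu`) is hypothetical and exists only for illustration. It shows the difference between taking the address of the current CPU's instance and operating on it directly.

```c
#include <assert.h>
#include <string.h>

#define DEMO_NR_CPUS 4

/* Hypothetical stand-in for "the CPU the caller is running on". */
static int demo_current_cpu;

/* Userspace model: one array slot per CPU instead of a real percpu area. */
#define DEMO_DEFINE_PER_CPU(type, name) type name[DEMO_NR_CPUS]

/* Address of the current CPU's instance (models this_cpu_ptr(&var)). */
#define demo_this_cpu_ptr(var)      (&(var)[demo_current_cpu])
/* Direct read, write and increment (model __this_cpu_read/write/inc). */
#define demo_this_cpu_read(var)     ((var)[demo_current_cpu])
#define demo_this_cpu_write(var, v) ((var)[demo_current_cpu] = (v))
#define demo_this_cpu_inc(var)      ((var)[demo_current_cpu]++)

struct demo_state {
	int a;
	int b;
};

static DEMO_DEFINE_PER_CPU(int, demo_counter);
static DEMO_DEFINE_PER_CPU(struct demo_state, demo_cpu_state);

/* Cases 5, 6 and 3: write a value, increment it twice, read it back. */
static int demo_count_twice(int cpu, int start)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	demo_current_cpu = cpu;
	demo_this_cpu_write(demo_counter, start);
	demo_this_cpu_inc(demo_counter);
	demo_this_cpu_inc(demo_counter);
	return demo_this_cpu_read(demo_counter);
}

/* Case 4: copy a whole per-CPU struct out through its address. */
static struct demo_state demo_snapshot(int cpu)
{
	struct demo_state copy;

	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	demo_current_cpu = cpu;
	memcpy(&copy, demo_this_cpu_ptr(demo_cpu_state), sizeof(copy));
	return copy;
}
```

In this model the read, write and increment macros never form a pointer; that is the property the real this_cpu operations exploit when an architecture can address the percpu area through a segment prefix or a dedicated register.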
  3. 27 Aug, 2014 2 commits
    • T
      Revert "powerpc: Replace __get_cpu_var uses" · 23f66e2d
      Tejun Heo committed
      This reverts commit 5828f666 due to
      build failure after merging with pending powerpc changes.
      
      Link: http://lkml.kernel.org/g/20140827142243.6277eaff@canb.auug.org.au
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      23f66e2d
    • C
      powerpc: Replace __get_cpu_var uses · 5828f666
      Christoph Lameter committed
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processor's percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as:
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and fewer registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout, specialized macros can be defined in non-x86
      arches as well in order to optimize per cpu access, e.g. by using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processor's instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y);
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++;
      
         Converts to
      
      	__this_cpu_inc(y);
      
      tj: Folded a fix patch.
          http://lkml.kernel.org/g/alpine.DEB.2.11.1408172143020.9652@gentwo.org
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      5828f666
  4. 13 Aug, 2014 1 commit
    • G
      powerpc/powernv: Fix IOMMU group lost · 763fe0ad
      Gavin Shan committed
      When we take full hotplug to recover from EEH errors, PCI buses
      could be involved. In that case, the child devices of the involved
      PCI buses can't be attached to an IOMMU group properly, which is
      caused by commit 3f28c5af ("powerpc/powernv: Reduce multi-hit of
      iommu_add_device()").
      
      When adding the PCI devices of the newly created PCI buses to
      the system, the IOMMU group is expected to be added in (C).
      (A) fails to bind the IOMMU group because bus->is_added is
      false. (B) fails because the device doesn't have binding IOMMU
      table yet. bus->is_added is set to true at end of (C) and
      pdev->is_added is set to true at (D).
      
         pcibios_add_pci_devices()
            pci_scan_bridge()
               pci_scan_child_bus()
                  pci_scan_slot()
                     pci_scan_single_device()
                        pci_scan_device()
                        pci_device_add()
                           pcibios_add_device()           A: Ignore
                           device_add()                   B: Ignore
                        pcibios_fixup_bus()
                           pcibios_setup_bus_devices()
                              pcibios_setup_device()      C: Hit
            pcibios_finish_adding_to_bus()
               pci_bus_add_devices()
                  pci_bus_add_device()                    D: Add device
      
      If the parent PCI bus isn't involved in hotplug, the IOMMU
      group is expected to be bound in (B). (A) should fail as the
      sysfs entries aren't populated.
      
      The patch fixes the issue by reverting commit 3f28c5af and removing the
      WARN_ON() in iommu_add_device() so that the function can be called
      even if the specified device already has an associated IOMMU group.
      
      Cc: <stable@vger.kernel.org>  # 3.16+
      Reported-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Acked-by: Wei Yang <weiyang@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      763fe0ad
  5. 05 Aug, 2014 1 commit
  6. 11 Feb, 2014 1 commit
    • B
      powerpc/powernv: Add iommu DMA bypass support for IODA2 · cd15b048
      Benjamin Herrenschmidt committed
      This patch adds support for creating a direct iommu "bypass"
      window on IODA2 bridges (such as Power8), which allows 64-bit DMA
      capable devices to bypass iommu page translation completely, thus
      significantly improving DMA performance.
      
      Additionally, this adds a hook to the struct iommu_table so that
      the IOMMU API / VFIO can disable the bypass when external ownership
      is requested, since in that case, the device will be used by an
      environment such as userspace or a KVM guest which must not be
      allowed to bypass translations.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      cd15b048
  7. 15 Jan, 2014 1 commit
  8. 30 Dec, 2013 2 commits
  9. 05 Dec, 2013 1 commit
    • A
      PPC: POWERNV: move iommu_add_device earlier · d905c5df
      Alexey Kardashevskiy committed
      The current implementation of IOMMU on sPAPR does not use iommu_ops
      and therefore does not call IOMMU API's bus_set_iommu() which
      1) sets iommu_ops for a bus
      2) registers a bus notifier
      Instead, PCI devices are added to IOMMU groups from
      subsys_initcall_sync(tce_iommu_init) which does basically the same
      thing without using iommu_ops callbacks.
      
      However, the Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158)
      implements iommu_ops and when tce_iommu_init is called, every PCI device
      is already added to some group so there is a conflict.
      
      This patch does 2 things:
      1. removes the loop in which PCI devices were added to groups and
      adds explicit iommu_add_device() calls to add devices as soon as they get
      the iommu_table pointer assigned to them.
      2. moves a bus notifier to powernv code in order to avoid conflict with
      the notifier from Freescale driver.
      
      iommu_add_device() and iommu_del_device() are public now.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      d905c5df
  10. 03 Oct, 2013 1 commit
    • N
      powerpc/iommu: Use GFP_KERNEL instead of GFP_ATOMIC in iommu_init_table() · 1cf389df
      Nishanth Aravamudan committed
      Under heavy (DLPAR?) stress, we tripped this panic() in
      arch/powerpc/kernel/iommu.c::iommu_init_table():
      
      	page = alloc_pages_node(nid, GFP_ATOMIC, get_order(sz));
      	if (!page)
      		panic("iommu_init_table: Can't allocate %ld bytes\n", sz);
      
      Before the panic() we got a page allocation failure for an order-2
      allocation. There appears to be memory free, but perhaps not in the
      ATOMIC context. I looked through all the call-sites of
      iommu_init_table() and didn't see any obvious reason to need an ATOMIC
      allocation. Most call-sites in fact have an explicit GFP_KERNEL
      allocation shortly before the call to iommu_init_table(), indicating we
      are not in an atomic context. There is some indirection for some paths,
      but I didn't see any locks indicating that GFP_KERNEL is inappropriate.
      
      With this change under the same conditions, we have not been able to
      reproduce the panic.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: <stable@vger.kernel.org>
      1cf389df
  11. 15 Jul, 2013 1 commit
  12. 20 Jun, 2013 1 commit
    • A
      powerpc/vfio: Enable on PowerNV platform · 4e13c1ac
      Alexey Kardashevskiy committed
      This initializes IOMMU groups based on the IOMMU configuration
      discovered during the PCI scan on POWERNV (POWER non virtualized)
      platform.  The IOMMU groups are to be used later by the VFIO driver,
      which is used for PCI pass through.
      
      It also implements an API for mapping/unmapping pages for
      guest PCI drivers and providing DMA window properties.
      This API is going to be used later by QEMU-VFIO to handle
      h_put_tce hypercalls from the KVM guest.
      
      The iommu_put_tce_user_mode() does only a single page mapping
      as an API for adding many mappings at once is going to be
      added later.
      
      Although this driver has been tested only on the POWERNV
      platform, it should work on any platform which supports
      TCE tables.  As the h_put_tce hypercall is received by the host
      kernel and processed by QEMU (which involves calling
      the host kernel again), performance is not the best:
      circa 220MB/s on a 10Gb ethernet network.
      
      To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config
      option and configure VFIO as required.
      
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      4e13c1ac
  13. 18 Apr, 2013 1 commit
  14. 10 Jan, 2013 1 commit
  15. 15 Nov, 2012 1 commit
  16. 04 Oct, 2012 1 commit
  17. 13 Jul, 2012 1 commit
  18. 10 Jul, 2012 1 commit
    • A
      powerpc: IOMMU fault injection · d6b9a81b
      Anton Blanchard committed
      Add the ability to inject IOMMU faults. We enable this per device
      via a fail_iommu sysfs property, similar to fault injection on other
      subsystems.
      
      An example:
      
      ...
      0003:01:00.1 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 02)
      
      To inject one error to this device:
      
      echo 1 > /sys/bus/pci/devices/0003:01:00.1/fail_iommu
      echo 1 > /sys/kernel/debug/fail_iommu/probability
      echo 1 > /sys/kernel/debug/fail_iommu/times
      
      As feared, the first failure injected on the be3 results in an
      unrecoverable error, taking down both functions of the card
      permanently:
      
      be2net 0003:01:00.1: Unrecoverable error in the card
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      d6b9a81b
  19. 03 Jul, 2012 4 commits
  20. 23 Feb, 2012 1 commit
    • M
      fadump: Register for firmware assisted dump. · 3ccc00a7
      Mahesh Salgaonkar committed
      On 2012-02-20 11:02:51 Mon, Paul Mackerras wrote:
      > On Thu, Feb 16, 2012 at 04:44:30PM +0530, Mahesh J Salgaonkar wrote:
      >
      > If I have read the code correctly, we are going to get this printk on
      > non-pSeries machines or on older pSeries machines, even if the user
      > has not put the fadump=on option on the kernel command line.  The
      > printk will be annoying since there is no actual error condition.  It
      > seems to me that the condition for the printk should include
      > fw_dump.fadump_enabled.  In other words you should probably add
      >
      > 	if (!fw_dump.fadump_enabled)
      > 		return 0;
      >
      > at the beginning of the function.
      
      Hi Paul,
      
      Thanks for pointing it out. Please find the updated patch below.
      
      The existing patches above this (4/10 through 10/10) apply cleanly
      on this update.
      
      Thanks,
      -Mahesh.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      3ccc00a7
  21. 23 Sep, 2011 1 commit
  22. 09 Dec, 2010 1 commit
  23. 21 May, 2010 1 commit
  24. 19 Mar, 2010 1 commit
    • F
      powerpc: Remove IOMMU_VMERGE config option · 191aee58
      FUJITA Tomonori committed
      The description says:
      
       Cause IO segments sent to a device for DMA to be merged virtually
       by the IOMMU when they happen to have been allocated contiguously.
       This doesn't add pressure to the IOMMU allocator. However, some
       drivers don't support getting large merged segments coming back
       from *_map_sg().
      
       Most drivers don't have this problem; it is safe to say Y here.
      
      It's out of date. Long ago, drivers didn't have a way to tell IOMMUs
      about their segment length limit (that is, the maximum segment length
      that they can handle). So IOMMUs merged as many segments as possible
      and gave too large segments to drivers.
      
      dma_get_max_seg_size() was introduced to solve the above
      problem. Device drivers can use the API to tell IOMMU about the maximum
      segment length that they can handle. In addition, the default limit
      (64K) should be safe for everyone.
      
      So this config option seems to be unnecessary.
      
      Note that this config option just enables users to disable the virtual
      merging by default. Users can still disable the virtual merging by the
      boot parameter.
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      191aee58
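The interaction between virtual merging and a per-device segment limit, as described in this commit message, can be sketched in userspace. This is a simplified model under stated assumptions, not the powerpc IOMMU code: `struct demo_seg` and `demo_merge_segs()` are hypothetical names, addresses are plain integers, and two segments are merged only when they are contiguous and the merged length stays within the limit a driver would advertise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical segment descriptor: start address and length in bytes. */
struct demo_seg {
	unsigned long addr;
	unsigned long len;
};

/*
 * Merge adjacent, contiguous segments in place as long as the merged
 * length does not exceed max_seg. Returns the new number of segments.
 */
static size_t demo_merge_segs(struct demo_seg *segs, size_t n,
			      unsigned long max_seg)
{
	size_t out = 0;
	size_t i;

	assert(segs != NULL || n == 0);
	if (n == 0)
		return 0;

	for (i = 1; i < n; i++) {
		struct demo_seg *cur = &segs[out];
		int contiguous = cur->addr + cur->len == segs[i].addr;

		if (contiguous && cur->len + segs[i].len <= max_seg)
			cur->len += segs[i].len;	/* grow the merged segment */
		else
			segs[++out] = segs[i];		/* start a new segment */
	}
	return out + 1;
}
```

With a 64K limit (0x10000), two contiguous 4K segments merge into one 8K segment, but a following contiguous 60K segment starts a new entry because the merged length would exceed the limit.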
  25. 16 Dec, 2009 1 commit
    • A
      iommu-helper: use bitmap library · a66022c4
      Akinobu Mita committed
      Use bitmap library and kill some unused iommu helper functions.
      
      1. s/iommu_area_free/bitmap_clear/
      
      2. s/iommu_area_reserve/bitmap_set/
      
      3. Use bitmap_find_next_zero_area instead of find_next_zero_area
      
        This cannot be simple substitution because find_next_zero_area
        doesn't check the last bit of the limit in bitmap
      
      4. Remove iommu_area_free, iommu_area_reserve, and find_next_zero_area
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a66022c4
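The three bitmap operations this commit relies on (set a range, clear a range, find a run of zero bits) can be illustrated with a small userspace sketch. The `demo_` functions below are hypothetical and only loosely modelled on the kernel helpers named in the message; in particular, the search takes no alignment mask and returns `size` when no run fits, which is an assumption of this sketch rather than the kernel's contract. Note that the search examines every bit in `[start, size)`, including the last one, which is the detail the message calls out.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if the given bit is set in the bitmap, 0 otherwise. */
static int demo_test_bit(const unsigned long *map, size_t bit)
{
	return (int)((map[bit / DEMO_BITS_PER_LONG] >>
		      (bit % DEMO_BITS_PER_LONG)) & 1UL);
}

/* Mark nr bits starting at start as allocated. */
static void demo_bitmap_set(unsigned long *map, size_t start, size_t nr)
{
	size_t i;

	for (i = start; i < start + nr; i++)
		map[i / DEMO_BITS_PER_LONG] |= 1UL << (i % DEMO_BITS_PER_LONG);
}

/* Mark nr bits starting at start as free. */
static void demo_bitmap_clear(unsigned long *map, size_t start, size_t nr)
{
	size_t i;

	for (i = start; i < start + nr; i++)
		map[i / DEMO_BITS_PER_LONG] &= ~(1UL << (i % DEMO_BITS_PER_LONG));
}

/*
 * Find the first run of nr zero bits at or after start, looking at every
 * bit below size. Returns the index of the run, or size if none fits.
 */
static size_t demo_find_zero_area(const unsigned long *map, size_t size,
				  size_t start, size_t nr)
{
	size_t run = 0;
	size_t i;

	assert(nr > 0);
	for (i = start; i < size; i++) {
		run = demo_test_bit(map, i) ? 0 : run + 1;
		if (run == nr)
			return i + 1 - nr;
	}
	return size;
}
```

An allocator built on these would call the search first, then `demo_bitmap_set()` on the returned range, and `demo_bitmap_clear()` when the range is released.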
  26. 13 Jan, 2009 1 commit
  27. 31 Oct, 2008 2 commits
    • M
      powerpc: Update remaining dma_mapping_ops to use map/unmap_page · f9226d57
      Mark Nelson committed
      After the merge of the 32 and 64bit DMA code, dma_direct_ops lost
      their map/unmap_single() functions but gained map/unmap_page().  This
      caused a problem for Cell because Cell's dma_iommu_fixed_ops called
      the dma_direct_ops if the fixed linear mapping was to be used or the
      iommu ops if the dynamic window was to be used.  So in order to fix
      this problem we need to update the 64bit DMA code to use
      map/unmap_page.
      
      First, we update the generic IOMMU code so that iommu_map_single()
      becomes iommu_map_page() and iommu_unmap_single() becomes
      iommu_unmap_page().  Then we propagate these changes up through all
      the callers of these two functions and in the process update all the
      dma_mapping_ops so that they have map/unmap_page rather than
      map/unmap_single.  We can do this because on 64bit there is no HIGHMEM
      memory so map/unmap_page ends up performing exactly the same function
      as map/unmap_single, just taking different arguments.
      
      This has no effect on drivers because the dma_map_single_attrs() just
      ends up calling the map_page() function of the appropriate
      dma_mapping_ops and similarly the dma_unmap_single_attrs() calls
      unmap_page().
      
      This fixes an oops on Cell blades, which oops on boot without this
      because they call dma_direct_ops.map_single, which is NULL.
      Signed-off-by: Mark Nelson <markn@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      f9226d57
    • M
      powerpc: Use is_kdump_kernel() · 62a8bd6c
      Milton Miller committed
      linux/crash_dump.h defines is_kdump_kernel() to be used by code that
      needs to know if the previous kernel crashed instead of a (clean) boot
      or reboot.
      
      This updates the just added powerpc code to use it.  This is needed
      for the next commit, which will remove __kdump_flag.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      62a8bd6c
  28. 22 Oct, 2008 1 commit
    • M
      powerpc: Support for relocatable kdump kernel · 54622f10
      Mohan Kumar M committed
      This adds relocatable kernel support for kdump. With this one can
      use the same regular kernel to capture the kdump. A signature (0xfeed1234)
      is passed in r6 from panic code to the next kernel through kexec_sequence
      and purgatory code. The signature is used to differentiate between
      kdump kernel and non-kdump kernels.
      
      The purgatory code compares the signature and sets the __kdump_flag in
      head_64.S.  During the boot up, kernel code checks __kdump_flag and if it
      is set, the kernel will behave as relocatable kdump kernel. This kernel
      will boot at the address where it was loaded by kexec-tools, i.e. at the
      address reserved through crashkernel boot parameter.
      
      CONFIG_CRASH_DUMP depends on CONFIG_RELOCATABLE option to build kdump
      kernel as relocatable. So the same kernel can be used as production and
      kdump kernel.
      
      This patch incorporates the changes suggested by Paul Mackerras to avoid
      GOT use and to avoid two copies of the code.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Mohan Kumar M <mohan@in.ibm.com>
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      54622f10
  29. 17 Oct, 2008 2 commits
  30. 25 Jul, 2008 1 commit
    • R
      powerpc/pseries: iommu enablement for CMO · 6490c490
      Robert Jennings committed
      To support Cooperative Memory Overcommitment (CMO), we need to check
      for failure from some of the tce hcalls.
      
      These changes for the pseries platform affect the powerpc architecture;
      patches for the other affected platforms are included in this patch.
      
      pSeries platform IOMMU code changes:
       * platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and
         return an error.
      
      Architecture IOMMU code changes:
       * Calls to ppc_md.tce_build need to check return values and return
         DMA_MAPPING_ERROR for transient errors.
      
      Architecture changes:
       * struct machdep_calls for tce_build*_pSeriesLP functions need to change
         to indicate failure.
       * all other platforms will need updates to iommu functions to match the new
         calling semantics; they will return 0 on success.  The other platforms
         default configs have been built, but no further testing was performed.
      Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
      Acked-by: Olof Johansson <olof@lixom.net>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      6490c490
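The calling convention described in this commit message, where the platform TCE build hook reports failure and the generic mapping code turns that into a mapping error, might look roughly like the userspace sketch below. Everything here is hypothetical: the `demo_` names, the function-pointer type, the table size and the 4K page shift are assumptions for illustration and do not reproduce the `ppc_md.tce_build` signature.

```c
#include <assert.h>

#define DEMO_DMA_MAPPING_ERROR	(~0UL)
#define DEMO_TABLE_ENTRIES	8UL
#define DEMO_PAGE_SHIFT		12	/* assumes 4K IOMMU pages */

/* Hypothetical platform hook: 0 on success, nonzero on failure. */
typedef int (*demo_tce_build_fn)(unsigned long index, unsigned long npages);

/* A platform whose hook always succeeds (the "return 0" case). */
static int demo_build_ok(unsigned long index, unsigned long npages)
{
	(void)index;
	(void)npages;
	return 0;
}

/* A platform that is out of resources, as CMO may report. */
static int demo_build_no_resources(unsigned long index, unsigned long npages)
{
	(void)index;
	(void)npages;
	return -1;
}

/*
 * Map npages starting at table slot index. The hook's return value is
 * checked and a failure is reported as DEMO_DMA_MAPPING_ERROR instead of
 * being ignored.
 */
static unsigned long demo_iommu_map(demo_tce_build_fn build,
				    unsigned long index, unsigned long npages)
{
	assert(build != 0);
	if (index + npages > DEMO_TABLE_ENTRIES)
		return DEMO_DMA_MAPPING_ERROR;
	if (build(index, npages) != 0)
		return DEMO_DMA_MAPPING_ERROR;
	return index << DEMO_PAGE_SHIFT;
}
```

A caller compares the result against the error value before using it as a DMA address; a fuller model would also release the table range it had reserved when the hook fails.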
  31. 22 Jul, 2008 1 commit
  32. 09 Jul, 2008 2 commits