1. 11 Jun, 2015 4 commits
    • A
      powerpc/spapr: vfio: Replace iommu_table with iommu_table_group · b348aa65
      Alexey Kardashevskiy committed
      Modern IBM POWERPC systems support multiple (currently two) TCE tables
      per IOMMU group (a.k.a. PE). This adds an iommu_table_group container
      for TCE tables. Right now just one table is supported.
      
      This defines iommu_table_group struct which stores pointers to
      iommu_group and iommu_table(s). This replaces iommu_table with
      iommu_table_group where iommu_table was used to identify a group:
      - iommu_register_group();
      - iommudata of generic iommu_group;
      
      This removes @data from iommu_table as it_table_group provides
      same access to pnv_ioda_pe.
      
      For IODA, instead of embedding iommu_table, the new iommu_table_group
      keeps pointers to those. The iommu_table structs are allocated
      dynamically.
      
      For P5IOC2, both iommu_table_group and iommu_table are embedded into
      PE struct. As there is no EEH and SRIOV support for P5IOC2,
      iommu_free_table() should not be called on iommu_table struct pointers
      so we can keep it embedded in pnv_phb::p5ioc2.
      
      For pSeries, this replaces multiple calls of kzalloc_node() with a new
      iommu_pseries_alloc_group() helper and stores the table group struct
      pointer into the pci_dn struct. For release, an iommu_table_free_group()
      helper is added.
      
      This moves iommu_table struct allocation from SR-IOV code to
      the generic DMA initialization code in pnv_pci_ioda_setup_dma_pe and
      pnv_pci_ioda2_setup_dma_pe as this is where DMA is actually initialized.
      This change is here because those lines had to be changed anyway.
      
      This should cause no behavioural change.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      [aw: for the vfio related changes]
      Acked-by: Alex Williamson <alex.williamson@redhat.com>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      b348aa65
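The container relationship described in the commit above (a group that holds pointers to dynamically allocated tables, with each table pointing back at its group) can be sketched in user-space C. This is only an illustration: `table_group`, `tce_table`, `MAX_TABLES`, `group_alloc()`, `group_table()` and `group_free()` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TABLES 2 /* the text mentions "currently two" tables per group */

struct table_group;

/* Simplified stand-in for iommu_table: it points back at its group. */
struct tce_table {
	struct table_group *group;
};

/* Simplified stand-in for iommu_table_group: it keeps pointers to tables. */
struct table_group {
	struct tce_table *tables[MAX_TABLES];
};

/* Allocate a group with a single table in slot 0; returns NULL on failure. */
static struct table_group *group_alloc(void)
{
	struct table_group *g = calloc(1, sizeof(*g));
	struct tce_table *t = calloc(1, sizeof(*t));

	if (!g || !t) {
		free(g);
		free(t);
		return NULL;
	}
	t->group = g;
	g->tables[0] = t;
	return g;
}

/* Look up a table by slot; unused slots yield NULL. */
static struct tce_table *group_table(const struct table_group *g, size_t slot)
{
	assert(g != NULL && slot < MAX_TABLES);
	return g->tables[slot];
}

/* Release every table the group points to, then the group itself. */
static void group_free(struct table_group *g)
{
	size_t i;

	if (!g)
		return;
	for (i = 0; i < MAX_TABLES; i++)
		free(g->tables[i]);
	free(g);
}
```

Because the tables are reached through pointers rather than embedded, a second table could later be placed in slot 1 without changing the group's layout.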
    • A
      powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table · da004c36
      Alexey Kardashevskiy committed
      This adds an iommu_table_ops struct and puts a pointer to it into
      the iommu_table struct. This moves tce_build/tce_free/tce_get/tce_flush
      callbacks from ppc_md to the new struct where they really belong.
      
      This adds the requirement for @it_ops to be initialized before calling
      iommu_init_table() to make sure that we do not leave any IOMMU table
      with iommu_table_ops uninitialized. This is not a parameter of
      iommu_init_table() though as there will be cases when iommu_init_table()
      will not be called on TCE tables, for example - VFIO.
      
      This does s/tce_build/set/, s/tce_free/clear/ and removes "tce_"
      redundant prefixes.
      
      This removes tce_xxx_rm handlers from ppc_md but does not add
      them to iommu_table_ops as this will be done later if we decide to
      support TCE hypercalls in real mode. This removes _vm callbacks as
      only virtual mode is supported by now so this also removes @rm parameter.
      
      For pSeries, this always uses tce_buildmulti_pSeriesLP/
      tce_freemulti_pSeriesLP. This changes the multi callbacks to fall back to
      tce_build_pSeriesLP/tce_free_pSeriesLP if FW_FEATURE_MULTITCE is not
      present. The reason for this is we still have to support "multitce=off"
      boot parameter in disable_multitce() and we do not want to walk through
      all IOMMU tables in the system and replace "multi" callbacks with single
      ones.
      
      For powernv, this defines _ops per PHB type which are P5IOC2/IODA1/IODA2.
      This makes the callbacks for them public. Later patches will extend
      callbacks for IODA1/2.
      
      No change in behaviour is expected.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      da004c36
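The two ideas in the commit above (callbacks stored per table instead of in a global machine descriptor, and a "multi" callback that falls back to the single-entry path when the feature is off) can be modelled in plain C. This is a hedged sketch: `tce_table_ops`, `multitce_enabled`, `table_init()` and the call counter are hypothetical, and the real callbacks talk to the hypervisor rather than to an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_ENTRIES 64

struct tce_table;

/* Per-table callbacks, loosely modelled on the iommu_table_ops idea. */
struct tce_table_ops {
	int (*set)(struct tce_table *tbl, size_t index, size_t npages);
	void (*clear)(struct tce_table *tbl, size_t index, size_t npages);
};

struct tce_table {
	const struct tce_table_ops *ops; /* must be set before table_init() */
	unsigned char mapped[TABLE_ENTRIES];
	unsigned long calls; /* simulated hypervisor calls */
};

/* Hypothetical stand-in for the FW_FEATURE_MULTITCE check. */
static bool multitce_enabled = true;

static int set_single(struct tce_table *tbl, size_t index, size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++) {
		tbl->mapped[index + i] = 1;
		tbl->calls++; /* one call per entry */
	}
	return 0;
}

static int set_multi(struct tce_table *tbl, size_t index, size_t npages)
{
	size_t i;

	if (index + npages > TABLE_ENTRIES)
		return -1;
	if (!multitce_enabled)
		return set_single(tbl, index, npages); /* fallback path */
	for (i = 0; i < npages; i++)
		tbl->mapped[index + i] = 1;
	tbl->calls++; /* one batched call */
	return 0;
}

static void clear_multi(struct tce_table *tbl, size_t index, size_t npages)
{
	size_t i;

	for (i = 0; i < npages && index + i < TABLE_ENTRIES; i++)
		tbl->mapped[index + i] = 0;
}

static const struct tce_table_ops multi_ops = {
	.set = set_multi,
	.clear = clear_multi,
};

/* Refuse to initialise a table whose ops pointer was left unset. */
static int table_init(struct tce_table *tbl)
{
	assert(tbl != NULL);
	if (!tbl->ops)
		return -1;
	tbl->calls = 0;
	return 0;
}
```

Putting the fallback inside the multi callback means that disabling the feature at boot only flips one flag; no table has to have its ops pointer rewritten.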
    • A
      powerpc/iommu: Put IOMMU group explicitly · ac9a5889
      Alexey Kardashevskiy committed
      So far an iommu_table lifetime was the same as PE. Dynamic DMA windows
      will change this and iommu_free_table() will not always require
      the group to be released.
      
      This moves iommu_group_put() out of iommu_free_table().
      
      This adds an iommu_pseries_free_table() helper which does
      iommu_group_put() and iommu_free_table(). Later it will be
      changed to receive a table_group, and we will then have fewer
      lines to change.
      
      This should cause no behavioural change.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ac9a5889
    • A
      powerpc/iommu/powernv: Get rid of set_iommu_table_base_and_group · 4617082e
      Alexey Kardashevskiy committed
      The set_iommu_table_base_and_group() name suggests that the function
      sets the table base and adds a device to an IOMMU group.
      
      The actual purpose of setting the table base is to put a reference
      into a device so that iommu_add_device() can later get the IOMMU group
      reference and add the device to the group.
      
      At the moment a group cannot be explicitly passed to iommu_add_device()
      because we want it to work from the bus notifier. We can fix this later
      and remove the confusing calls of set_iommu_table_base().
      
      This replaces set_iommu_table_base_and_group() with a couple of
      set_iommu_table_base() + iommu_add_device() which makes reading the code
      easier.
      
      This adds a few comments explaining why set_iommu_table_base() and
      iommu_add_device() are called where they are called.
      
      For IODA1/2, this essentially removes iommu_add_device() call from
      the pnv_pci_ioda_dma_dev_setup() as it will always fail at this particular
      place:
      - for physical PE, the device is already attached by iommu_add_device()
      in pnv_pci_ioda_setup_dma_pe();
      - for virtual PE, the sysfs entries are not ready to create all symlinks
      so actual adding is happening in tce_iommu_bus_notifier.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      4617082e
  2. 11 Apr, 2015 1 commit
  3. 04 Mar, 2015 1 commit
  4. 25 Nov, 2014 1 commit
    • G
      of/reconfig: Always use the same structure for notifiers · f5242e5a
      Grant Likely committed
      The OF_RECONFIG notifier callback uses a different structure depending
      on whether it is a node change or a property change. This is silly, and
      not very safe. Rework the code to use the same data structure regardless
      of the type of notifier.
      Signed-off-by: Grant Likely <grant.likely@linaro.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
      Cc: <linuxppc-dev@lists.ozlabs.org>
      f5242e5a
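The change described in the commit above, using one payload type for every notification, can be illustrated with a small model. The names below (`reconfig_action`, `reconfig_data`, `event_target()`) are hypothetical and only mirror the idea; the property pointer is simply left NULL for node-level events.

```c
#include <assert.h>
#include <stddef.h>

enum reconfig_action {
	RECONFIG_DETACH_NODE,
	RECONFIG_REMOVE_PROPERTY,
};

struct node { const char *full_name; };
struct property { const char *name; };

/* One payload for every event; prop is NULL for node-level changes. */
struct reconfig_data {
	struct node *dn;
	struct property *prop;
};

/*
 * A notifier callback no longer has to guess the payload type from the
 * action: it always receives a struct reconfig_data.
 * Returns the name of the object the event refers to.
 */
static const char *event_target(enum reconfig_action action, void *arg)
{
	const struct reconfig_data *rd = arg;

	assert(rd && rd->dn);
	switch (action) {
	case RECONFIG_REMOVE_PROPERTY:
		return rd->prop ? rd->prop->name : "<no property>";
	case RECONFIG_DETACH_NODE:
	default:
		return rd->dn->full_name;
	}
}
```

With a single type, a callback that casts `arg` can never misinterpret a node pointer as a property wrapper, which is the safety concern the commit message raises.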
  5. 10 Nov, 2014 1 commit
  6. 03 Nov, 2014 1 commit
    • C
      powerpc: Replace __get_cpu_var uses · 69111bac
      Christoph Lameter committed
      This still has not been merged and now powerpc is the only arch that does
      not have this change. Sorry about missing linuxppc-dev before.
      
      V2->V2
        - Fix up to work against 3.18-rc1
      
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y);
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++;
      
         Converts to
      
      	__this_cpu_inc(y);
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      [mpe: Fix build errors caused by set/or_softirq_pending(), and rework
            assignment in __set_breakpoint() to use memcpy().]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      69111bac
  7. 15 Oct, 2014 1 commit
  8. 03 Oct, 2014 1 commit
    • A
      powerpc/iommu/ddw: Fix endianness · 9410e018
      Alexey Kardashevskiy committed
      rtas_call() accepts and returns values in CPU endianness.
      The ddw_query_response and ddw_create_response structs members are
      defined and treated as BE but as they are passed to rtas_call() as
      (u32 *) and they get byteswapped automatically, the data is CPU-endian.
      This fixes ddw_query_response and ddw_create_response definitions and use.
      
      of_read_number() is designed to work with device tree cells - it assumes
      the input is big-endian and returns data in CPU-endian. However due
      to the ddw_create_response struct fix, create.addr_hi/lo are already
      CPU-endian so do not byteswap them.
      
      ddw_avail is a pointer to the "ibm,ddw-applicable" property which contains
      3 cells which are big-endian as it is a device tree. rtas_call() accepts
      a RTAS token in CPU-endian. This makes use of of_property_read_u32_array
      to byte swap and avoid the need for a number of be32_to_cpu calls.
      
      Cc: stable@vger.kernel.org # v3.13+
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      [aik: folded Anton's patch with of_property_read_u32_array]
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Acked-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      9410e018
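The distinction drawn in the commit above, between device-tree cells that are big-endian and values that are already CPU-endian, can be shown with a portable sketch. `be32_decode()`, `read_number_be()` and `combine_cpu()` are hypothetical helpers written for this illustration; `read_number_be()` only imitates what the text says of_read_number() does.

```c
#include <assert.h>
#include <stdint.h>

/* Decode one big-endian 32-bit cell, independent of host endianness. */
static uint32_t be32_decode(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Combine up to two big-endian cells into one CPU-endian number. */
static uint64_t read_number_be(const unsigned char *p, int cells)
{
	uint64_t r = 0;

	assert(cells >= 0 && cells <= 2);
	while (cells--) {
		r = (r << 32) | be32_decode(p);
		p += 4;
	}
	return r;
}

/* Values that are already CPU-endian are combined without any swap. */
static uint64_t combine_cpu(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```

Feeding already-swapped hi/lo words through `read_number_be()` would swap them a second time on a little-endian host, which is the class of bug the commit describes; `combine_cpu()` is the right tool once the data is CPU-endian.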
  9. 27 Aug, 2014 2 commits
    • T
      Revert "powerpc: Replace __get_cpu_var uses" · 23f66e2d
      Tejun Heo committed
      This reverts commit 5828f666 due to
      build failure after merging with pending powerpc changes.
      
      Link: http://lkml.kernel.org/g/20140827142243.6277eaff@canb.auug.org.au
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      23f66e2d
    • C
      powerpc: Replace __get_cpu_var uses · 5828f666
      Christoph Lameter committed
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y);
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++;
      
         Converts to
      
      	__this_cpu_inc(y);
      
      tj: Folded a fix patch.
          http://lkml.kernel.org/g/alpine.DEB.2.11.1408172143020.9652@gentwo.org
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      5828f666
  10. 13 Aug, 2014 1 commit
    • G
      powerpc/pseries: Avoid deadlock on removing ddw · 5efbabe0
      Gavin Shan committed
      remove_ddw() can be called from of_reconfig_notifier, where it may
      remove the dynamic DMA window property, which invokes
      of_reconfig_notifier again. This leads to a deadlock, as the
      following backtrace shows.
      
      The patch fixes the above issue by deferring releasing the dynamic
      DMA window property while releasing the device node.
      
      =============================================
      [ INFO: possible recursive locking detected ]
      3.16.0+ #428 Tainted: G        W
      ---------------------------------------------
      drmgr/2273 is trying to acquire lock:
       ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \
       .__blocking_notifier_call_chain+0x40/0x78
      
      but task is already holding lock:
       ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \
       .__blocking_notifier_call_chain+0x40/0x78
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock((of_reconfig_chain).rwsem);
        lock((of_reconfig_chain).rwsem);
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      2 locks held by drmgr/2273:
       #0:  (sb_writers#4){.+.+.+}, at: [<c0000000001cbe70>] \
            .vfs_write+0xb0/0x1f8
       #1:  ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \
            .__blocking_notifier_call_chain+0x40/0x78
      
      stack backtrace:
      CPU: 17 PID: 2273 Comm: drmgr Tainted: G        W     3.16.0+ #428
      Call Trace:
      [c0000000137e7000] [c000000000013d9c] .show_stack+0x88/0x148 (unreliable)
      [c0000000137e70b0] [c00000000083cd34] .dump_stack+0x7c/0x9c
      [c0000000137e7130] [c0000000000b8afc] .__lock_acquire+0x128c/0x1c68
      [c0000000137e7280] [c0000000000b9a4c] .lock_acquire+0xe8/0x104
      [c0000000137e7350] [c00000000083588c] .down_read+0x4c/0x90
      [c0000000137e73e0] [c000000000091890] .__blocking_notifier_call_chain+0x40/0x78
      [c0000000137e7490] [c000000000091900] .blocking_notifier_call_chain+0x38/0x48
      [c0000000137e7520] [c000000000682a28] .of_reconfig_notify+0x34/0x5c
      [c0000000137e75b0] [c000000000682a9c] .of_property_notify+0x4c/0x54
      [c0000000137e7650] [c000000000682bf0] .of_remove_property+0x30/0xd4
      [c0000000137e76f0] [c000000000052a44] .remove_ddw+0x144/0x168
      [c0000000137e7790] [c000000000053204] .iommu_reconfig_notifier+0x30/0xe0
      [c0000000137e7820] [c00000000009137c] .notifier_call_chain+0x6c/0xb4
      [c0000000137e78c0] [c0000000000918ac] .__blocking_notifier_call_chain+0x5c/0x78
      [c0000000137e7970] [c000000000091900] .blocking_notifier_call_chain+0x38/0x48
      [c0000000137e7a00] [c000000000682a28] .of_reconfig_notify+0x34/0x5c
      [c0000000137e7a90] [c000000000682e14] .of_detach_node+0x44/0x1fc
      [c0000000137e7b40] [c0000000000518e4] .ofdt_write+0x3ac/0x688
      [c0000000137e7c20] [c000000000238430] .proc_reg_write+0xb8/0xd4
      [c0000000137e7cd0] [c0000000001cbeac] .vfs_write+0xec/0x1f8
      [c0000000137e7d70] [c0000000001cc3b0] .SyS_write+0x58/0xa0
      [c0000000137e7e30] [c00000000000a064] syscall_exit+0x0/0x98
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      5efbabe0
  11. 15 Jan, 2014 2 commits
    • N
      Revert "pseries/iommu: Remove DDW on kexec" · 97e7dc52
      Nishanth Aravamudan committed
      After reverting 25ebc45b
      ("powerpc/pseries/iommu: remove default window before attempting DDW
      manipulation"), we no longer remove the base window in enable_ddw.
      Therefore, we no longer need to reset the DMA window state in
      find_existing_ddw_windows(). We can instead go back to what was done
      before, which simply reuses the previous configuration, if any. Further,
      this removes the final caller of the reset-pe-dma-windows call, so
      remove those functions.
      
      This fixes an EEH on kdump with the ipr driver. The EEH occurs, because
      the initcall removes the DDW configuration (64-bit DMA window), but
      doesn't ensure the ops are via the IOMMU -- a DMA operation occurs
      during probe (still investigating this) and we EEH.
      
      This reverts commit 14b6f00f.
      Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      97e7dc52
    • N
      Revert "powerpc/pseries/iommu: remove default window before attempting DDW manipulation" · ae69e1ed
      Nishanth Aravamudan committed
      Ben rightfully pointed out that there is a race in the "newer" DDW code.
      Presuming we are running on recent enough firmware that supports the
      "reset" DDW manipulation call, we currently always remove the base
      32-bit DMA window in order to maximize the resources for Phyp when
      creating the 64-bit window. However, this can be problematic for the
      case where multiple functions are in the same PE (partitionable
      endpoint), where some functions might be 32-bit DMA only. All of a
      sudden, the only functional DMA window for such functions is gone. We
      will have serious errors in such situations. The best solution is simply
      to revert the extension to the DDW code where we ever remove the base
      DMA window.
      
      This reverts commit 25ebc45b.
      Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      ae69e1ed
  12. 30 Dec, 2013 2 commits
  13. 05 Dec, 2013 1 commit
    • A
      PPC: POWERNV: move iommu_add_device earlier · d905c5df
      Alexey Kardashevskiy committed
      The current implementation of IOMMU on sPAPR does not use iommu_ops
      and therefore does not call IOMMU API's bus_set_iommu() which
      1) sets iommu_ops for a bus
      2) registers a bus notifier
      Instead, PCI devices are added to IOMMU groups from
      subsys_initcall_sync(tce_iommu_init) which does basically the same
      thing without using iommu_ops callbacks.
      
      However Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158)
      implements iommu_ops and when tce_iommu_init is called, every PCI device
      is already added to some group so there is a conflict.
      
      This patch does 2 things:
      1. removes the loop in which PCI devices were added to groups and
      adds explicit iommu_add_device() calls to add devices as soon as they get
      the iommu_table pointer assigned to them.
      2. moves a bus notifier to powernv code in order to avoid conflict with
      the notifier from Freescale driver.
      
      iommu_add_device() and iommu_del_device() are public now.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      d905c5df
  14. 30 Oct, 2013 1 commit
  15. 27 Aug, 2013 1 commit
  16. 14 Aug, 2013 1 commit
  17. 20 Jun, 2013 1 commit
  18. 18 Apr, 2013 1 commit
    • N
      powerpc/pseries: close DDW race between functions of adapter · 61435690
      Nishanth Aravamudan committed
      Given a PCI device with multiple functions in a DDW capable slot, the
      following situation can be encountered: When the first function sets a
      64-bit DMA mask, enable_ddw() will be called and we can fail to properly
      configure DDW (the most common reason being the new DMA window's size is
      not large enough to map all of an LPAR's memory). With the recent
      changes to DDW, we remove the base window in order to determine if the
      new window is of sufficient size to cover an LPAR's memory. We correctly
      replace the base window if we find that not to be the case. However,
      once we go through and re-configure 32-bit DMA via the IOMMU, the next
      function of the adapter will go through the same process. And since DDW
      is a characteristic of the slot itself, we are most likely going to fail
      again. But to determine we are going to fail the second slot, we again
      remove the base window -- but that is now in-use by the first
      function/driver, which might be issuing I/O already.
      
      To close this window, keep a list of all the struct device_nodes
      that have failed to configure DDW. If the current device_node is in that
      list, just fail out immediately and fall back to 32-bit DMA without
      doing any DDW manipulation.
      Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      61435690
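The "remember which nodes already failed" approach from the commit above can be sketched with a simple linked list. Everything here is a hypothetical model: `failed_node`, `try_enable_window()` and the `window_fits` flag stand in for the real device_node bookkeeping and the real window-size check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* One entry per device node that already failed to configure DDW. */
struct failed_node {
	const void *node; /* opaque identity of the device node */
	struct failed_node *next;
};

static struct failed_node *failed_list;
static int window_attempts; /* times the DMA windows were actually touched */

static bool has_failed_before(const void *node)
{
	const struct failed_node *f;

	for (f = failed_list; f; f = f->next)
		if (f->node == node)
			return true;
	return false;
}

static void remember_failure(const void *node)
{
	struct failed_node *f = malloc(sizeof(*f));

	if (!f)
		return; /* best effort: a later attempt will simply retry */
	f->node = node;
	f->next = failed_list;
	failed_list = f;
}

/*
 * Returns true if the large window is usable. `window_fits` is a
 * hypothetical stand-in for the real size check.
 */
static bool try_enable_window(const void *node, bool window_fits)
{
	assert(node != NULL);
	if (has_failed_before(node))
		return false; /* fall back to 32-bit DMA, windows untouched */
	window_attempts++;
	if (!window_fits) {
		remember_failure(node);
		return false;
	}
	return true;
}
```

The early return is the point of the sketch: a second function on the same slot never reaches the code that manipulates windows, so it cannot disturb a window the first function is already using.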
  19. 08 Feb, 2013 2 commits
    • N
      pseries/iommu: Remove DDW on kexec · 14b6f00f
      Nishanth Aravamudan committed
      pseries/iommu: remove DDW on kexec
      
      We currently insert a property in the device-tree when we successfully
      configure DDW for a given slot. This was meant to be an optimization to
      speed up kexec/kdump, so that we don't need to make the RTAS calls again
      to re-configure DDW in the new kernel.
      
      However, we end up tripping a plpar_tce_stuff failure on kexec/kdump
      because we unconditionally parse the ibm,dma-window property for the
      node at bus/dev setup time. This property contains the 32-bit DMA window
      LIOBN, which is distinct from the DDW window's. We pass that LIOBN (via
      iommu_table_init -> iommu_table_clear -> tce_free ->
      tce_freemulti_pSeriesLP) to plpar_tce_stuff, which fails because that
      32-bit window is no longer present after
      25ebc45b ("powerpc/pseries/iommu: remove
      default window before attempting DDW manipulation").
      
      I believe the simplest, easiest-to-maintain fix is to just change our
      initcall to, rather than detecting and updating the new kernel's DDW
      knowledge, just remove all DDW configurations. When the drivers
      re-initialize, we will set everything back up as it was before.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      14b6f00f
    • N
      pseries/iommu: Restore_default_window does not use liobn parameter · a1dabade
      Nishanth Aravamudan committed
      The parameter is unused, and complicates a following fix. Just remove
      it.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      a1dabade
  20. 29 Jan, 2013 2 commits
    • N
      pseries/iommu: Ensure TCEs are cleared with non-huge DDW · 71cf1def
      Nishanth Aravamudan committed
      There are now two kinds of DMA windows that might be presented by
      PowerVM DDW support -- huge windows (that can map all of system memory
      regardless of the LPAR configuration) and non-huge windows (which
      can't). They are implemented slightly differently in PowerVM, and thus
      have different characteristics. The most obvious is that slot isolate
      doesn't clear the TCEs/window for us with non-huge windows. Thus, when a
      DLPAR operation occurs on a slot using a non-huge window, TCEs are still
      present (the notifier chain doesn't currently remove them explicitly)
      and the DLPAR fails. Fix this by calling remove_ddw() first, which will
      unmap the DDW TCEs.
      
      Note: a corresponding change to drmgr is needed to actually successfully
      DLPAR, such that the device-tree update (which causes the notifier chain
      to fire) occurs before slot isolate.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      71cf1def
    • N
      pseries/iommu: Fix iteration in DDW TCE clearrange · 22b38298
      Nishanth Aravamudan committed
      tce_clearrange_multi_pSeriesLP is attempting to iterate over all TCEs in
      a given range. However, it is not advancing the dma_offset value passed
      to plpar_tce_stuff via the next value. This prevents DLPAR from
      completing, because TCEs are still present at slot isolation time.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      22b38298
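The iteration bug described in the commit above comes down to a chunked loop that must advance its offset after every call. A minimal model, assuming a hypothetical per-call limit and a logging stand-in (`stuff_entries()`) in place of the real hypervisor call, might look like this:

```c
#include <assert.h>
#include <stdint.h>

#define LOG_MAX 16

/* Records what each simulated hypervisor call was asked to clear. */
struct call_log {
	uint64_t offset[LOG_MAX];
	uint64_t count[LOG_MAX];
	int n;
};

/* Hypothetical stand-in for the call that clears `count` entries at `offset`. */
static void stuff_entries(struct call_log *log, uint64_t offset, uint64_t count)
{
	if (log->n < LOG_MAX) {
		log->offset[log->n] = offset;
		log->count[log->n] = count;
		log->n++;
	}
}

/* Clear num_tce entries starting at dma_offset, at most `limit` per call. */
static void clear_range(struct call_log *log, uint64_t dma_offset,
			uint64_t num_tce, uint64_t tce_size, uint64_t limit)
{
	assert(limit > 0);
	while (num_tce > 0) {
		uint64_t chunk = num_tce < limit ? num_tce : limit;

		stuff_entries(log, dma_offset, chunk);
		/* Without this step every call would hit the same entries. */
		dma_offset += chunk * tce_size;
		num_tce -= chunk;
	}
}
```

If the `dma_offset` update is dropped, the loop still terminates because `num_tce` shrinks, but only the first chunk is ever cleared, which matches the symptom of entries still being present at slot isolation time.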
  21. 15 Nov, 2012 2 commits
  22. 05 Sep, 2012 2 commits
  23. 06 Jul, 2012 1 commit
  24. 03 Jul, 2012 2 commits
  25. 29 Jun, 2012 1 commit
  26. 16 Jun, 2012 1 commit
    • G
      devicetree: add helper inline for retrieving a node's full name · efd68e72
      Grant Likely committed
      The pattern (np ? np->full_name : "<none>") is rather common in the
      kernel, but can also make for quite long lines.  This patch adds a new
      inline function, of_node_full_name() so that the test for a valid node
      pointer doesn't need to be open coded at all call sites.
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      efd68e72
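The helper described in the commit above wraps the quoted pattern in an inline function. The sketch below follows the commit text directly; only the one-field `struct device_node` is a simplified stand-in for the kernel's structure.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct device_node. */
struct device_node {
	const char *full_name;
};

/* Return the node's full name, or "<none>" for a NULL node pointer. */
static inline const char *of_node_full_name(const struct device_node *np)
{
	return np ? np->full_name : "<none>";
}
```

A call site can then pass `of_node_full_name(np)` straight to a format string without repeating the NULL test.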
  27. 28 Mar, 2012 1 commit
  28. 25 Nov, 2011 2 commits