1. 06 May, 2013 2 commits
  2. 02 May, 2013 2 commits
  3. 30 Apr, 2013 3 commits
  4. 26 Apr, 2013 5 commits
    • powerpc/pseries: Update CPU maps when device tree is updated · 5d88aa85
      Jesse Larrew authored
      Platform events such as partition migration or the new PRRN firmware
      feature can cause the NUMA characteristics of a CPU to change, and these
      changes will be reflected in the device tree nodes for the affected
      CPUs.
      
      This patch registers a handler for Open Firmware device tree updates
      and reconfigures the CPU and node maps whenever the associativity
      changes. Currently, this is accomplished by marking the affected CPUs in
      the cpu_associativity_changes_mask and allowing
      arch_update_cpu_topology() to retrieve the new associativity information
      using hcall_vphn().
      
      Protecting the NUMA cpu maps from concurrent access during an update
      operation will be addressed in a subsequent patch in this series.
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Update firmware_has_feature() to check architecture vector 5 bits · f0ff7eb4
      Nathan Fontenot authored
      The firmware_has_feature() function makes it easy to check for supported
      features of the hypervisor. This patch extends the capability of
      firmware_has_feature() to include checking for specified bits
      in vector 5 of the architecture vector as reported in the device tree.
      
      As part of this the #defines used for the architecture vector are re-defined
      such that each option has the index into vector 5 and the feature bit encoded
      into it. This makes checking for architecture bits when initializing data
      for firmware_has_feature() much easier.
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
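The encoding idea described in this commit (one #define carrying both the index into vector 5 and the feature bit) can be sketched as follows. This is a minimal illustration, not the kernel's code: the macro names, the high-byte/low-byte split, and the two option values are assumptions made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: high byte = byte index into vector 5, low byte = bit mask. */
#define EXAMPLE_OV5_ENCODE(index, mask) (((index) << 8) | (mask))
#define EXAMPLE_OV5_INDEX(x)            ((x) >> 8)
#define EXAMPLE_OV5_MASK(x)             ((x) & 0xff)

/* Hypothetical options; the values are illustrative, not the kernel's definitions. */
#define EXAMPLE_OV5_OPTION_A EXAMPLE_OV5_ENCODE(2, 0x80)
#define EXAMPLE_OV5_OPTION_B EXAMPLE_OV5_ENCODE(5, 0x01)

/* Return true if the encoded option's bit is set in the vector 5 byte array. */
bool example_ov5_has_option(const uint8_t *vec5, size_t len, unsigned int option)
{
    size_t index = EXAMPLE_OV5_INDEX(option);

    if (index >= len)
        return false; /* option lies beyond the reported vector */
    return (vec5[index] & EXAMPLE_OV5_MASK(option)) != 0;
}
```

With this layout, a table that maps options to firmware feature flags only needs to store one integer per option, and the check decodes the byte index and mask on the fly.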
    • powerpc/pseries: Use ARRAY_SIZE to iterate over firmware_features_table array · 43c0ea60
      Nathan Fontenot authored
      When iterating over the entries in firmware_features_table we only need
      to go over the actual number of entries in the array instead of declaring
      it to be bigger and checking to make sure there is a valid entry in every
      slot.
      
      This patch removes the FIRMWARE_MAX_FEATURES #define and replaces the
      array looping with the use of ARRAY_SIZE().
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
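The pattern this commit adopts can be illustrated with a small stand-alone sketch. The struct, the table contents, and the lookup function below are hypothetical; only the ARRAY_SIZE() idiom itself reflects what the commit message describes.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical table entry: a feature bit paired with a firmware name. */
struct example_feature {
    unsigned long bit;
    const char *name;
};

/* Illustrative entries only; the array has exactly as many slots as entries. */
static const struct example_feature example_table[] = {
    { 0x1, "example-feature-a" },
    { 0x2, "example-feature-b" },
    { 0x4, "example-feature-c" },
};

/* Look up a feature bit by name; returns 0 when the name is not in the table. */
unsigned long example_feature_bit(const char *name)
{
    size_t i;

    /* The bound follows the array, so no fixed maximum or empty-slot check is needed. */
    for (i = 0; i < ARRAY_SIZE(example_table); i++) {
        if (strcmp(example_table[i].name, name) == 0)
            return example_table[i].bit;
    }
    return 0;
}
```

Adding an entry to the table changes the loop bound automatically, which is the maintenance benefit over a separately declared maximum.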
    • powerpc/pseries: Correct buffer parsing in update_dt_node() · 2e9b7b02
      Nathan Fontenot authored
      Correct parsing of the buffer returned from ibm,update-properties. The first
      element is a length and the path to the property, which differs slightly
      from the list of properties in the buffer, so we need to handle it
      specifically.
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/pseries: Expose pseries devicetree_update() · 762ec157
      Nathan Fontenot authored
      Newer firmware on Power systems can transparently reassign platform resources
      (CPU and Memory) in use. For instance, if a processor or memory unit is
      predicted to fail, the platform may transparently move the processing to an
      equivalent unused processor or the memory state to an equivalent unused
      memory unit. However, reassigning resources across NUMA boundaries may alter
      the performance of the partition. When such reassignment is necessary, the
      Platform Resource Reassignment Notification (PRRN) option provides a
      mechanism to inform the Linux kernel of changes to the NUMA affinity of
      its platform resources.
      
      When rtasd receives a PRRN event, it needs to make a series of RTAS
      calls (ibm,update-nodes and ibm,update-properties) to retrieve the
      updated device tree information. These calls are already handled in the
      pseries_devicetree_update() routine used in partition migration.
      
      This patch exposes pseries_devicetree_update() to make it accessible
      to other pseries routines. It also updates pseries_devicetree_update()
      to take a 32-bit scope parameter. The scope value, which was previously
      hard-coded to 1 for partition migration, is used for the RTAS calls
      ibm,update-nodes and ibm,update-properties to update the device tree.
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  5. 23 Apr, 2013 1 commit
  6. 18 Apr, 2013 2 commits
    • powerpc/pseries: close DDW race between functions of adapter · 61435690
      Nishanth Aravamudan authored
      Given a PCI device with multiple functions in a DDW capable slot, the
      following situation can be encountered: When the first function sets a
      64-bit DMA mask, enable_ddw() will be called and we can fail to properly
      configure DDW (the most common reason being the new DMA window's size is
      not large enough to map all of an LPAR's memory). With the recent
      changes to DDW, we remove the base window in order to determine if the
      new window is of sufficient size to cover an LPAR's memory. We correctly
      replace the base window if we find that not to be the case. However,
      once we go through and reconfigure 32-bit DMA via the IOMMU, the next
      function of the adapter will go through the same process. And since DDW
      is a characteristic of the slot itself, we are most likely going to fail
      again. But to determine that we are going to fail for the second
      function, we again remove the base window -- but that is now in use by
      the first function/driver, which might be issuing I/O already.
      
      To close this window, keep a list of all the struct device_nodes
      that have failed to configure DDW. If the current device_node is in that
      list, just fail out immediately and fall back to 32-bit DMA without
      doing any DDW manipulation.
      Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
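The "remember failures and bail out early" approach described above can be sketched in user-space C. Everything here is a hypothetical stand-in: struct example_node replaces struct device_node, the callback replaces the real window setup, and the kernel version would additionally need locking around the list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device_node. */
struct example_node {
    const char *name;
};

/* Singly linked list of nodes whose DDW setup has already failed. */
struct example_failed_entry {
    const struct example_node *node;
    struct example_failed_entry *next;
};

static struct example_failed_entry *example_failed_list;

/* Counts how often the (destructive) setup path was actually attempted. */
int example_attempts;

/* Stand-in setup routine that always fails, e.g. because the window is too small. */
bool example_always_fails(const struct example_node *node)
{
    (void)node;
    example_attempts++;
    return false;
}

static bool example_already_failed(const struct example_node *node)
{
    const struct example_failed_entry *e;

    for (e = example_failed_list; e != NULL; e = e->next) {
        if (e->node == node)
            return true;
    }
    return false;
}

/* Returns true if DDW was configured, false if the caller should use 32-bit DMA. */
bool example_enable_ddw(const struct example_node *node,
                        bool (*try_window)(const struct example_node *))
{
    struct example_failed_entry *e;

    if (example_already_failed(node))
        return false; /* known failure: do no window manipulation at all */

    if (try_window(node))
        return true;

    /* Remember the failure so later functions of the adapter skip the attempt. */
    e = malloc(sizeof(*e));
    if (e != NULL) {
        e->node = node;
        e->next = example_failed_list;
        example_failed_list = e;
    }
    return false;
}
```

The second call for the same node returns immediately, so the base window that the first function is already using is never removed again.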
    • powerpc: Use VPA subfunction macros instead of numbers for vpa calls · bb18b3a4
      Li Zhong authored
      Use macros in vpa calls.
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
  7. 10 Apr, 2013 1 commit
    • procfs: new helper - PDE_DATA(inode) · d9dda78b
      Al Viro authored
      The only part of proc_dir_entry the code outside of fs/proc
      really cares about is PDE(inode)->data.  Provide a helper
      for that; static inline for now, eventually will be moved
      to fs/proc, along with the knowledge of struct proc_dir_entry
      layout.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
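A rough user-space sketch of what such a helper looks like is given below. The structure layouts are heavily simplified assumptions made for the example (the kernel's struct proc_inode, struct inode, and struct proc_dir_entry contain far more), and example_container_of is a local macro, not the kernel's container_of().

```c
#include <stddef.h>

/* Simplified, assumed layouts; the kernel's structures differ. */
struct proc_dir_entry {
    const char *name;
    void *data;
};

struct inode {
    unsigned long i_ino;
};

struct proc_inode {
    struct proc_dir_entry *pde;
    struct inode vfs_inode; /* embedded, so the container can be recovered */
};

/* Local stand-in for container_of(): step back from a member to its container. */
#define example_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static inline struct proc_dir_entry *PDE(const struct inode *inode)
{
    return example_container_of(inode, struct proc_inode, vfs_inode)->pde;
}

/* The helper: callers get the private data without touching proc_dir_entry. */
static inline void *PDE_DATA(const struct inode *inode)
{
    return PDE(inode)->data;
}
```

Code outside fs/proc that only calls PDE_DATA(inode) no longer depends on the layout of struct proc_dir_entry, which is what allows that layout to become private later.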
  8. 09 Apr, 2013 1 commit
  9. 08 Apr, 2013 1 commit
  10. 05 Mar, 2013 1 commit
  11. 23 Feb, 2013 1 commit
  12. 08 Feb, 2013 2 commits
    • pseries/iommu: Remove DDW on kexec · 14b6f00f
      Nishanth Aravamudan authored
      We currently insert a property in the device-tree when we successfully
      configure DDW for a given slot. This was meant to be an optimization to
      speed up kexec/kdump, so that we don't need to make the RTAS calls again
      to reconfigure DDW in the new kernel.
      
      However, we end up tripping a plpar_tce_stuff failure on kexec/kdump
      because we unconditionally parse the ibm,dma-window property for the
      node at bus/dev setup time. This property contains the 32-bit DMA window
      LIOBN, which is distinct from the DDW window's. We pass that LIOBN (via
      iommu_table_init -> iommu_table_clear -> tce_free ->
      tce_freemulti_pSeriesLP) to plpar_tce_stuff, which fails because that
      32-bit window is no longer present after
      25ebc45b ("powerpc/pseries/iommu: remove
      default window before attempting DDW manipulation").
      
      I believe the simplest, easiest-to-maintain fix is to change our
      initcall so that, rather than detecting and updating the new kernel's
      DDW knowledge, it just removes all DDW configurations. When the drivers
      re-initialize, we will set everything back up as it was before.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • pseries/iommu: restore_default_window() does not use liobn parameter · a1dabade
      Nishanth Aravamudan authored
      The parameter is unused, and complicates a following fix. Just remove
      it.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  13. 29 Jan, 2013 2 commits
    • pseries/iommu: Ensure TCEs are cleared with non-huge DDW · 71cf1def
      Nishanth Aravamudan authored
      There are now two kinds of DMA windows that might be presented by
      PowerVM DDW support -- huge windows (that can map all of system memory
      regardless of the LPAR configuration) and non-huge windows (which
      can't). They are implemented slightly differently in PowerVM, and thus
      have different characteristics. The most obvious is that slot isolate
      doesn't clear the TCEs/window for us with non-huge windows. Thus, when a
      DLPAR operation occurs on a slot using a non-huge window, TCEs are still
      present (the notifier chain doesn't currently remove them explicitly)
      and the DLPAR fails. Fix this by calling remove_ddw() first, which will
      unmap the DDW TCEs.
      
      Note: a corresponding change to drmgr is needed to actually successfully
      DLPAR, such that the device-tree update (which causes the notifier chain
      to fire) occurs before slot isolate.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • pseries/iommu: Fix iteration in DDW TCE clearrange · 22b38298
      Nishanth Aravamudan authored
      tce_clearrange_multi_pSeriesLP is attempting to iterate over all TCEs in
      a given range. However, it is not advancing the dma_offset value passed
      to plpar_tce_stuff via the next value. This prevents DLPAR from
      completing, because TCEs are still present at slot isolation time.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
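The class of bug fixed here, a chunked loop that never advances the offset it passes on, can be shown with a small model. The table, the per-call limit, and example_tce_stuff() are hypothetical stand-ins for the real TCE table and the plpar_tce_stuff hypervisor call.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_TCE_COUNT    64
#define EXAMPLE_MAX_PER_CALL 16 /* assumed per-call limit, for illustration */

/* Toy model of a TCE table. */
uint64_t example_tce_table[EXAMPLE_TCE_COUNT];

/* Stand-in for the hypervisor call: write `value` to `count` entries from `start`. */
void example_tce_stuff(uint64_t start, uint64_t value, uint64_t count)
{
    for (uint64_t i = 0; i < count; i++)
        example_tce_table[start + i] = value;
}

/* Clear `num` entries beginning at `start`, in chunks of bounded size. */
void example_clear_range(uint64_t start, uint64_t num)
{
    uint64_t next = start;

    assert(start + num <= EXAMPLE_TCE_COUNT);

    while (num > 0) {
        uint64_t limit = num < EXAMPLE_MAX_PER_CALL ? num : EXAMPLE_MAX_PER_CALL;

        /* Pass the advancing offset; passing `start` here would clear
         * the first chunk repeatedly and leave the rest populated. */
        example_tce_stuff(next, 0, limit);

        next += limit;
        num -= limit;
    }
}
```

If the call used the original start value on every iteration, the loop would still terminate, which is why this kind of error tends to surface only later, when stale entries are found.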
  14. 28 Jan, 2013 1 commit
    • cputime: Generic on-demand virtual cputime accounting · abf917cd
      Frederic Weisbecker authored
      If we want to stop the tick outside of idle, we need to be
      able to account the cputime without using the tick.
      
      Virtual based cputime accounting solves that problem by
      hooking into kernel/user boundaries.
      
      However, implementing CONFIG_VIRT_CPU_ACCOUNTING requires
      low level hooks and involves more overhead. But we already
      have a generic context tracking subsystem that is required
      for RCU needs by archs which plan to shut down the tick
      outside idle.
      
      This patch implements a generic virtual based cputime
      accounting that relies on these generic kernel/user hooks.
      
      There are some upsides of doing this:
      
      - This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
      if context tracking is already built (already necessary for RCU in full
      tickless mode).
      
      - We can rely on the generic context tracking subsystem to dynamically
      (de)activate the hooks, so that we can switch anytime between virtual
      and tick based accounting. This way we don't have the overhead
      of the virtual accounting when the tick is running periodically.
      
      And one downside:
      
      - There is probably more overhead than a native virtual based cputime
      accounting. But this relies on hooks that are already set anyway.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
  15. 10 Jan, 2013 10 commits
  16. 04 Jan, 2013 1 commit
    • POWERPC: drivers: remove __dev* attributes. · cad5cef6
      Greg Kroah-Hartman authored
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, __devinitdata,
      __devinitconst, and __devexit from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  17. 27 Nov, 2012 1 commit
    • cpuidle: Measure idle state durations with monotonic clock · a474a515
      Julius Werner authored
      Many cpuidle drivers measure their time spent in an idle state by
      reading the wallclock time before and after idling and calculating the
      difference. This leads to erroneous results when the wallclock time gets
      updated by another processor in the meantime, adding that clock
      adjustment to the idle state's time counter.
      
      If the clock adjustment was negative, the result is even worse due to an
      erroneous cast from int to unsigned long long of the last_residency
      variable. The negative 32 bit integer will zero-extend and result in a
      forward time jump of roughly four billion microseconds, or about 1.2
      hours, on the idle state residency counter.
      
      This patch changes all affected cpuidle drivers to either use the
      monotonic clock for their measurements or make use of the generic time
      measurement wrapper in cpuidle.c, which was already working correctly.
      Some superfluous CLIs/STIs in the ACPI code are removed (interrupts
      should always already be disabled before entering the idle function, and
      not get reenabled until the generic wrapper has performed its second
      measurement). It also removes the erroneous cast, making sure that
      negative residency values are applied correctly even though they should
      not appear anymore.
      Signed-off-by: Julius Werner <jwerner@chromium.org>
      Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: Len Brown <len.brown@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
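The arithmetic behind the forward jump can be illustrated in isolation. This is not the cpuidle code and does not reproduce its exact cast; the two function names are made up, and the sketch only contrasts zero-extending a negative 32-bit value with widening it as a signed quantity.

```c
#include <stdint.h>

/* Zero-extension: the 32-bit pattern is reinterpreted as unsigned before
 * widening, so a small negative value becomes a value just under 2^32. */
uint64_t example_zero_extend(int32_t residency)
{
    return (uint64_t)(uint32_t)residency;
}

/* Sign-preserving accumulation: widen as signed, so a negative delta
 * reduces the running total instead of adding about 2^32 to it. */
int64_t example_accumulate(int64_t total, int32_t residency)
{
    return total + (int64_t)residency;
}
```

For a residency of -1000, the zero-extended value is 2^32 - 1000, which is where a jump of roughly four billion counter units comes from, while the signed path simply subtracts 1000.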
  18. 26 Nov, 2012 1 commit
  19. 23 Nov, 2012 1 commit
  20. 17 Nov, 2012 1 commit