1. 03 Nov, 2014 1 commit
    • powerpc: Replace __get_cpu_var uses · 69111bac
      Christoph Lameter committed
      This still has not been merged and now powerpc is the only arch that does
      not have this change. Sorry about missing linuxppc-dev before.
      
      V2->V2
        - Fix up to work against 3.18-rc1
      
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processor's percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as:
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and fewer registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non-x86
      arches as well in order to optimize per cpu access by, for example, using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processor's instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y);
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++;
      
         Converts to
      
      	__this_cpu_inc(y);
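
The six conversions above can be tried out in user space. The following is a minimal sketch, not kernel code: `struct percpu_area`, `areas`, `current_cpu`, `PCPU()` and `percpu_demo()` are hypothetical stand-ins, and the `this_cpu_*` macros are simplified mock-ups that only imitate the kernel API by adding a per-CPU offset to a base address. It relies on the GNU `__typeof__` extension (gcc or clang).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS 4

struct mystruct { int a; int b; };

/* Hypothetical layout: every "per-cpu" variable lives here, one copy per CPU. */
struct percpu_area {
	int y;
	int arr[20];
	struct mystruct s;
};

static struct percpu_area areas[NR_CPUS];
static int current_cpu;	/* stand-in for "the CPU we are running on" */

/* A per-cpu symbol names the copy in area 0; other copies sit at an offset. */
#define PCPU(field)	(areas[0].field)

/* Mock of the explicit address calculation: base pointer + this CPU's offset. */
#define this_cpu_ptr(ptr) \
	((__typeof__((ptr) + 0))((char *)(ptr) + \
		(size_t)current_cpu * sizeof(struct percpu_area)))

/*
 * Mock this_cpu operations. They are built on this_cpu_ptr() here; the point
 * of the conversion is that an architecture may implement them without a
 * separate address calculation.
 */
#define __this_cpu_read(var)		(*this_cpu_ptr(&(var)))
#define __this_cpu_write(var, val)	(*this_cpu_ptr(&(var)) = (val))
#define __this_cpu_inc(var)		(++*this_cpu_ptr(&(var)))

/* Runs all six converted forms on the given CPU and returns its final y. */
static int percpu_demo(int cpu)
{
	struct mystruct copy;
	int *p, *q, x;

	current_cpu = cpu;

	p = this_cpu_ptr(&PCPU(y));		/* 1: address of a scalar */
	q = this_cpu_ptr(PCPU(arr));		/* 2: array, so no '&' */
	q[0] = 1;

	__this_cpu_write(PCPU(y), 41);		/* 5: assignment */
	__this_cpu_inc(PCPU(y));		/* 6: increment */
	x = __this_cpu_read(PCPU(y));		/* 3: read a scalar */

	__this_cpu_write(PCPU(s.a), 7);
	memcpy(&copy, this_cpu_ptr(&PCPU(s)), sizeof(copy));	/* 4: struct copy */

	assert(x == *p);	/* the pointer and the read see the same slot */
	assert(copy.a == 7);
	return x;
}
```

Calling `percpu_demo(2)` touches only the copy belonging to CPU 2 and leaves the other copies untouched, which is the property both the old and the new forms must preserve.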
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      [mpe: Fix build errors caused by set/or_softirq_pending(), and rework
            assignment in __set_breakpoint() to use memcpy().]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  2. 25 Sep, 2014 2 commits
  3. 27 Aug, 2014 2 commits
    • Revert "powerpc: Replace __get_cpu_var uses" · 23f66e2d
      Tejun Heo committed
      This reverts commit 5828f666 due to
      build failure after merging with pending powerpc changes.
      
      Link: http://lkml.kernel.org/g/20140827142243.6277eaff@canb.auug.org.au
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Replace __get_cpu_var uses · 5828f666
      Christoph Lameter committed
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processor's percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as:
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and fewer registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non-x86
      arches as well in order to optimize per cpu access by, for example, using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processor's instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y);
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++;
      
         Converts to
      
      	__this_cpu_inc(y);
      
      tj: Folded a fix patch.
          http://lkml.kernel.org/g/alpine.DEB.2.11.1408172143020.9652@gentwo.org
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  4. 13 Aug, 2014 1 commit
  5. 11 Jul, 2014 1 commit
    • powerpc/pseries: Use jump labels for hcall tracepoints · cc1adb5f
      Anton Blanchard committed
      hcall tracepoints add quite a few instructions to our hcall path:
      
      plpar_hcall:
      	mr      r2,r2
      	mfcr    r0
      	stw     r0,8(r1)
      	b       164		<---- start
      	ld      r12,0(r2)
      	std     r12,32(r1)
      	cmpdi   r12,0
      	beq     164		<---- end
      ...
      
      We have an unconditional branch that gets noped out during boot and
      a load/compare/branch. We also store the tracepoint value to the
      stack for the hcall_exit path to use.
      
      By using jump labels we can simplify this to just a single nop that
      gets replaced with a branch when the tracepoint is enabled:
      
      plpar_hcall:
      	mr      r2,r2
      	mfcr    r0
      	stw     r0,8(r1)
      	nop			<----
      ...
      
      If jump labels are not enabled, we fall back to the old method.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  6. 09 Dec, 2013 2 commits
  7. 21 Nov, 2013 1 commit
  8. 27 Aug, 2013 2 commits
  9. 14 Aug, 2013 2 commits
  10. 24 Jul, 2013 1 commit
  11. 01 Jul, 2013 1 commit
  12. 21 Jun, 2013 2 commits
  13. 30 Apr, 2013 2 commits
  14. 08 Apr, 2013 1 commit
  15. 17 Sep, 2012 1 commit
  16. 05 Sep, 2012 1 commit
  17. 21 Mar, 2012 1 commit
  18. 11 Jan, 2012 1 commit
    • powerpc: Fix RCU idle and hcall tracing · a5ccfee0
      Anton Blanchard committed
      Tracepoints should not be called inside an rcu_idle_enter/rcu_idle_exit
      region. Since pSeries calls H_CEDE in the idle loop, we were violating
      this rule.
      
      commit a7b152d5 (powerpc: Tell RCU about idle after hcall tracing)
      tried to work around it by delaying the rcu_idle_enter until after we
      called the hcall tracepoint, but there are a number of issues with it.
      
      The hcall tracepoint trampoline code is called conditionally when the
      tracepoint is enabled. If the tracepoint is not enabled we never call
      rcu_idle_enter. The idle_uses_rcu check was also done at compile time
      which breaks multiplatform builds.
      
      The simple fix is to avoid tracing H_CEDE and rely on other tracepoints
      and the hypervisor dispatch trace log to work out if we called H_CEDE.
      
      This fixes a hang during boot on pSeries.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  19. 03 Jan, 2012 1 commit
  20. 12 Dec, 2011 1 commit
    • powerpc: Tell RCU about idle after hcall tracing · a7b152d5
      Paul E. McKenney committed
      The PowerPC pSeries platform (CONFIG_PPC_PSERIES=y) enables
      hypervisor-call tracing for CONFIG_TRACEPOINTS=y kernels.  One of the
      hypervisor calls that is traced is the H_CEDE call in the idle loop
      that tells the hypervisor that this OS instance no longer needs the
      current CPU.  However, tracing uses RCU, so this combination of kernel
      configuration variables needs to avoid telling RCU about the current CPU's
      idleness until after the H_CEDE-entry tracing completes on the one hand,
      and must tell RCU that the current CPU is no longer idle before the
      H_CEDE-exit tracing starts.
      
      In all other cases, it suffices to inform RCU of CPU idleness upon
      idle-loop entry and exit.
      
      This commit makes the required adjustments.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
  21. 01 Nov, 2011 1 commit
  22. 05 Aug, 2011 2 commits
  23. 29 Jun, 2011 2 commits
    • powerpc/pseries: Re-implement HVSI as part of hvc_vio · 4d2bb3f5
      Benjamin Herrenschmidt committed
      On pseries machines, consoles are provided by the hypervisor using
      a low level get_chars/put_chars type interface. However, this is
      really just a transport to the service processor which implements
      them either as "raw" console (networked consoles, HMC, ...) or as
      "hvsi" serial ports.
      
      The latter is a simple packet protocol on top of the raw character
      interface that is supposed to convey additional "serial port" style
      semantics. In practice however, all it does is provide a way to
      read the CD line and set/clear our DTR line, that's it.
      
      We currently implement the "raw" protocol as an hvc console backend
      (/dev/hvcN) and the "hvsi" protocol using a separate tty driver
      (/dev/hvsi0).
      
      However this is quite impractical. The arbitrary difference between
      the two types of devices has been a major source of user (and distro)
      confusion. Additionally, there's a separate mini-hvsi implementation
      in the pseries platform code for our low level debug console and early
      boot kernel messages, which means code duplication, though that low
      level variant is impractical as it's incapable of doing the initial
      protocol negotiation to establish the link to the FSP.
      
      This essentially replaces the dedicated hvsi driver and the platform
      udbg code completely by extending the existing hvc_vio backend used
      in "raw" mode so that:
      
       - It now supports HVSI as well
       - We add support for hvc backend providing tiocm{get,set}
       - It also provides a udbg interface for early debug and boot console
      
      This is overall less code, though this will only be obvious once we
      remove the old "hvsi" driver, which is still available for now. When
      the old driver is enabled, the new code still kicks in for the low
      level udbg console, replacing the old mini implementation in the platform
      code, it just doesn't provide the higher level "hvc" interface.
      
      In addition to producing generally simpler code, this has several benefits
      over our current situation:
      
       - The user/distro only has to deal with /dev/hvcN for the hypervisor
      console, avoiding all sorts of confusion that has plagued us in the past
      
       - The tty, kernel and low level debug console all use the same code
      base which supports the full protocol establishment process, thus the
      console is now available much earlier than it used to be with the
      old HVSI driver. The kernel console works much earlier and udbg is
      available much earlier too. Hackers can enable a hard coded very-early
      debug console as well that works with HVSI (previously that was only
      supported for the "raw" mode).
      
      I've tried to keep the same semantics as hvsi relative to how I react
      to things like CD changes, with some subtle differences though:
      
       - I clear DTR on close if HUPCL is set
      
       - Current hvsi triggers a hangup if it detects an up->down transition
         on CD (you can still open a console with CD down). My new implementation
         triggers a hangup if the link to the FSP is severed, and severs it upon
         detecting an up->down transition on CD.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/udbg: Register udbg console generically · dd2e356a
      Benjamin Herrenschmidt committed
      When CONFIG_PPC_EARLY_DEBUG is set, call register_early_udbg_console()
      early from generic code.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  24. 04 May, 2011 1 commit
  25. 27 Apr, 2011 1 commit
  26. 07 Feb, 2011 1 commit
  27. 29 Nov, 2010 1 commit
  28. 02 Sep, 2010 2 commits
    • powerpc: Account time using timebase rather than PURR · cf9efce0
      Paul Mackerras committed
      Currently, when CONFIG_VIRT_CPU_ACCOUNTING is enabled, we use the
      PURR register for measuring the user and system time used by
      processes, as well as other related times such as hardirq and
      softirq times.  This turns out to be quite confusing for users
      because it means that a program will often be measured as taking
      less time when run on a multi-threaded processor (SMT2 or SMT4 mode)
      than it does when run on a single-threaded processor (ST mode), even
      though the program takes longer to finish.  The discrepancy is
      accounted for as stolen time, which is also confusing, particularly
      when there are no other partitions running.
      
      This changes the accounting to use the timebase instead, meaning that
      the reported user and system times are the actual number of real-time
      seconds that the program was executing on the processor thread,
      regardless of which SMT mode the processor is in.  Thus a program will
      generally show greater user and system times when run on a
      multi-threaded processor than on a single-threaded processor.
      
      On pSeries systems on POWER5 or later processors, we measure the
      stolen time (time when this partition wasn't running) using the
      hypervisor dispatch trace log.  We check for new entries in the
      log on every entry from user mode and on every transition from
      kernel process context to soft or hard IRQ context (i.e. when
      account_system_vtime() gets called).  So that we can correctly
      distinguish time stolen from user time and time stolen from system
      time, without having to check the log on every exit to user mode,
      we store separate timestamps for exit to user mode and entry from
      user mode.
      
      On systems that have a SPURR (POWER6 and POWER7), we read the SPURR
      in account_system_vtime() (as before), and then apportion the SPURR
      ticks since the last time we read it between scaled user time and
      scaled system time according to the relative proportions of user
      time and system time over the same interval.  This avoids having to
      read the SPURR on every kernel entry and exit.  On systems that have
      PURR but not SPURR (i.e., POWER5), we do the same using the PURR
      rather than the SPURR.
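
The apportioning step described above is plain proportional arithmetic. The sketch below is illustrative only and is not the kernel's implementation: `struct scaled_times` and `apportion_spurr()` are hypothetical names, and the handling of an interval with no recorded user or system time is an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical result type: the two shares of one SPURR interval. */
struct scaled_times {
	uint64_t user;
	uint64_t system;
};

/*
 * Split the SPURR ticks accumulated since the last read between scaled user
 * time and scaled system time, in proportion to the unscaled user and system
 * time measured over the same interval.
 */
static struct scaled_times apportion_spurr(uint64_t spurr_delta,
					   uint64_t user_delta,
					   uint64_t sys_delta)
{
	struct scaled_times t = { 0, 0 };
	uint64_t total = user_delta + sys_delta;

	if (total == 0) {
		/* Assumption: with nothing to weigh against, charge system. */
		t.system = spurr_delta;
		return t;
	}

	/* The product can overflow for very large deltas; fine for a sketch. */
	t.user = spurr_delta * user_delta / total;
	/* Give the remainder to system so that no tick is lost to rounding. */
	t.system = spurr_delta - t.user;
	return t;
}
```

For example, 1000 SPURR ticks over an interval with 300 units of user time and 100 units of system time split into 750 scaled user ticks and 250 scaled system ticks.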
      
      This disables the DTL user interface in /sys/kernel/debug/powerpc/dtl
      for now since it conflicts with the use of the dispatch trace log
      by the time accounting code.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Abstract indexing of lppaca structs · 8154c5d2
      Paul Mackerras committed
      Currently we have the lppaca structs as a simple array of NR_CPUS
      entries, taking up space in the data section of the kernel image.
      In future we would like to allocate them dynamically, so this
      abstracts out the accesses to the array, making it easier to
      change how we locate the lppaca for a given cpu in future.
      Specifically, lppaca[cpu] changes to lppaca_of(cpu).
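
The value of the accessor is that callers no longer depend on how the structs are stored. The following user-space sketch shows one way the storage could later become dynamic without touching callers. Only the name `lppaca_of()` comes from the commit message; the reduced `struct lppaca`, `lppaca_ptrs`, `allocate_lppacas()` and `bump_yield_count()` are hypothetical.

```c
#include <stdlib.h>

#define NR_CPUS 8

/* Placeholder layout; the real struct has many more fields. */
struct lppaca {
	unsigned int yield_count;
};

/* Storage is now a table of pointers filled in at run time. */
static struct lppaca *lppaca_ptrs[NR_CPUS];

/* Callers only ever use this accessor, never the storage directly. */
#define lppaca_of(cpu)	(*lppaca_ptrs[(cpu)])

/* Allocate zeroed structs for the first nr CPUs; returns 0 on success. */
static int allocate_lppacas(int nr)
{
	int cpu;

	if (nr < 0 || nr > NR_CPUS)
		return -1;
	for (cpu = 0; cpu < nr; cpu++) {
		lppaca_ptrs[cpu] = calloc(1, sizeof(struct lppaca));
		if (!lppaca_ptrs[cpu])
			return -1;
	}
	return 0;
}

/* Example caller: it reads the same whether storage is static or dynamic. */
static unsigned int bump_yield_count(int cpu)
{
	return ++lppaca_of(cpu).yield_count;
}
```

With a static array the accessor would simply expand to an array index, and `bump_yield_count()` would not need to change.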
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  29. 21 May, 2010 1 commit
    • powerpc/kexec: Speedup kexec hash PTE tear down · d504bed6
      Michael Neuling committed
      Currently for kexec the PTE tear down on 1TB segment systems normally
      requires 3 hcalls for each PTE removal. On a machine with 32GB of
      memory it can take around a minute to remove all the PTEs.
      
      This optimises the path so that we only remove PTEs that are valid.
      It also uses the read 4 PTEs at once HCALL.  For the common case where
      a PTE is invalid in a 1TB segment, this turns the 3 HCALLs per PTE
      down to 1 HCALL per 4 PTEs.
      
      This gives a >10x speedup in kexec times on PHYP, taking a 32GB
      machine from around 1 minute down to a few seconds.
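
A rough call-count model makes the saving concrete. This is a simplified sketch under stated assumptions, not a measurement: `hcalls_old()` and `hcalls_new()` are hypothetical helpers, the model counts hypervisor calls only, and it assumes that each valid PTE still costs the original 3 calls on top of the batched reads, which the commit message does not spell out.

```c
#include <stdint.h>

/* Old path: 3 hcalls for every PTE slot, valid or not. */
static uint64_t hcalls_old(uint64_t nr_ptes)
{
	return 3 * nr_ptes;
}

/*
 * New path: one "read 4 PTEs" hcall per group of four slots, plus
 * (assumption) the original 3 hcalls for each PTE that turns out valid.
 */
static uint64_t hcalls_new(uint64_t nr_ptes, uint64_t nr_valid)
{
	uint64_t reads = (nr_ptes + 3) / 4;	/* round up to whole groups */

	return reads + 3 * nr_valid;
}
```

Under this model, 1024 slots with no valid entries drop from 3072 calls to 256, a 12x reduction in calls that is at least consistent with the reported speedup.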
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  30. 28 Oct, 2009 1 commit