1. 27 8月, 2014 16 次提交
    • T
      Revert "powerpc: Replace __get_cpu_var uses" · 23f66e2d
      Tejun Heo 提交于
      This reverts commit 5828f666 due to
      build failure after merging with pending powerpc changes.
      
      Link: http://lkml.kernel.org/g/20140827142243.6277eaff@canb.auug.org.auSigned-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      23f66e2d
    • C
      sparc: Replace __get_cpu_var uses · 494fc421
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      Cc: sparclinux@vger.kernel.org
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      494fc421
    • C
      avr32: Replace __get_cpu_var with __this_cpu_write · 8c23af61
      Christoph Lameter 提交于
      Replace the single use of __get_cpu_var in avr32 with
      __this_cpu_write.
      
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Acked-by: NHans-Christian Egtvedt <egtvedt@samfundet.no>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      8c23af61
    • C
      blackfin: Replace __get_cpu_var uses · 7e788ab1
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      CC: Mike Frysinger <vapier@gentoo.org>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      7e788ab1
    • C
      tile: Use this_cpu_ptr() for hardware counters · 81829a96
      Christoph Lameter 提交于
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      81829a96
    • C
      tile: Replace __get_cpu_var uses · b4f50191
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      Acked-by: NChris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      b4f50191
    • C
      powerpc: Replace __get_cpu_var uses · 5828f666
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      tj: Folded a fix patch.
          http://lkml.kernel.org/g/alpine.DEB.2.11.1408172143020.9652@gentwo.org
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      5828f666
    • C
      alpha: Replace __get_cpu_var · 2999a4b3
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Acked-by: NRichard Henderson <rth@twiddle.net>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      2999a4b3
    • C
      ia64: Replace __get_cpu_var uses · 6065a244
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: linux-ia64@vger.kernel.org
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      6065a244
    • C
      s390: Replace __get_cpu_var uses · eb7e7d76
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	this_cpu_inc(y)
      
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      CC: linux390@de.ibm.com
      Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      eb7e7d76
    • C
      mips: Replace __get_cpu_var uses · 35898716
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      35898716
    • C
      MIPS: Replace __get_cpu_var uses in FPU emulator. · d1cd39ad
      Christoph Lameter 提交于
      The use of __this_cpu_inc() requires a fundamental integer type, so
      change the type of all the counters to unsigned long, which is the
      same width they were before, but not wrapped in local_t.
      Signed-off-by: NDavid Daney <david.daney@cavium.com>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      d1cd39ad
    • C
      arm: Replace __this_cpu_ptr with raw_cpu_ptr · 06b96c8b
      Christoph Lameter 提交于
      __this_cpu_ptr is being phased out. So replace with raw_cpu_ptr.
      
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      06b96c8b
    • C
      uv: Replace __get_cpu_var · e1632170
      Christoph Lameter 提交于
      Use __this_cpu_read instead.
      
      Cc: Hedi Berriche <hedi@sgi.com>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      e1632170
    • C
      x86: Replace __get_cpu_var uses · 89cbc767
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86@kernel.org
      Acked-by: NH. Peter Anvin <hpa@linux.intel.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      89cbc767
    • C
      metag: Replace __get_cpu_var uses for address calculation · bd83e65b
      Christoph Lameter 提交于
      Replace __get_cpu_var uses for address calculation with this_cpu_ptr().
      Acked-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      bd83e65b
  2. 16 8月, 2014 1 次提交
  3. 14 8月, 2014 4 次提交
  4. 13 8月, 2014 19 次提交
    • A
      powerpc/thp: Add tracepoints to track hugepage invalidate · 9e813308
      Aneesh Kumar K.V 提交于
      Add tracepoint to track hugepage invalidate. This help us
      in debugging difficult to track bugs.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      9e813308
    • A
      powerpc/mm: Use read barrier when creating real_pte · 85c1fafd
      Aneesh Kumar K.V 提交于
      On ppc64 we support 4K hash pte with 64K page size. That requires
      us to track the hash pte slot information on a per 4k basis. We do that
      by storing the slot details in the second half of pte page. The pte bit
      _PAGE_COMBO is used to indicate whether the second half need to be
      looked while building real_pte. We need to use read memory barrier while
      doing that so that load of hidx is not reordered w.r.t _PAGE_COMBO
      check. On the store side we already do a lwsync in __hash_page_4K
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      85c1fafd
    • A
      powerpc/thp: Use ACCESS_ONCE when loading pmdp · 7e467245
      Aneesh Kumar K.V 提交于
      We would get wrong results in compiler recomputed old_pmd. Avoid
      that by using ACCESS_ONCE
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      7e467245
    • A
      powerpc/thp: Invalidate with vpn in loop · 969b7b20
      Aneesh Kumar K.V 提交于
      As per ISA, for 4k base page size we compare 14..65 bits of VA specified
      with the entry_VA in tlb. That implies we need to make sure we do a
      tlbie with all the possible 4k va we used to access the 16MB hugepage.
      With 64k base page size we compare 14..57 bits of VA. Hence we cannot
      ignore the lower 24 bits of va while tlbie .We also cannot tlb
      invalidate a 16MB entry with just one tlbie instruction because
      we don't track which va was used to instantiate the tlb entry.
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      969b7b20
    • A
      powerpc/thp: Handle combo pages in invalidate · fc047955
      Aneesh Kumar K.V 提交于
      If we changed base page size of the segment, either via sub_page_protect
      or via remap_4k_pfn, we do a demote_segment which doesn't flush the hash
      table entries. We do a lazy hash page table flush for all mapped pages
      in the demoted segment. This happens when we handle hash page fault for
      these pages.
      
      We use _PAGE_COMBO bit along with _PAGE_HASHPTE to indicate whether a
      pte is backed by 4K hash pte. If we find _PAGE_COMBO not set on the pte,
      that implies that we could possibly have older 64K hash pte entries in
      the hash page table and we need to invalidate those entries.
      
      Use _PAGE_COMBO to determine the page size with which we should
      invalidate the hash table entries on unmap.
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      fc047955
    • A
      powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte · 629149fa
      Aneesh Kumar K.V 提交于
      If we changed base page size of the segment, either via sub_page_protect
      or via remap_4k_pfn, we do a demote_segment which doesn't flush the hash
      table entries. We do a lazy hash page table flush for all mapped pages
      in the demoted segment. This happens when we handle hash page fault
      for these pages.
      
      We use _PAGE_COMBO bit along with _PAGE_HASHPTE to indicate whether a
      pte is backed by 4K hash pte. If we find _PAGE_COMBO not set on the pte,
      that implies that we could possibly have older 64K hash pte entries in
      the hash page table and we need to invalidate those entries.
      
      Handle this correctly for 16M pages
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      629149fa
    • A
      powerpc/thp: Don't recompute vsid and ssize in loop on invalidate · fa1f8ae8
      Aneesh Kumar K.V 提交于
      The segment identifier and segment size will remain the same in
      the loop, So we can compute it outside. We also change the
      hugepage_invalidate interface so that we can use it the later patch
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      fa1f8ae8
    • A
      powerpc/thp: Add write barrier after updating the valid bit · b0aa44a3
      Aneesh Kumar K.V 提交于
      With hugepages, we store the hpte valid information in the pte page
      whose address is stored in the second half of the PMD. Use a
      write barrier to make sure clearing pmd busy bit and updating
      hpte valid info are ordered properly.
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      b0aa44a3
    • N
      powerpc: reorder per-cpu NUMA information's initialization · 2fabf084
      Nishanth Aravamudan 提交于
      There is an issue currently where NUMA information is used on powerpc
      (and possibly ia64) before it has been read from the device-tree, which
      leads to large slab consumption with CONFIG_SLUB and memoryless nodes.
      
      NUMA powerpc non-boot CPU's cpu_to_node/cpu_to_mem is only accurate
      after start_secondary(), similar to ia64, which is invoked via
      smp_init().
      
      Commit 6ee0578b ("workqueue: mark init_workqueues() as
      early_initcall()") made init_workqueues() be invoked via
      do_pre_smp_initcalls(), which is obviously before the secondary
      processors are online.
      
      Additionally, the following commits changed init_workqueues() to use
      cpu_to_node to determine the node to use for kthread_create_on_node:
      
      bce90380 ("workqueue: add wq_numa_tbl_len and
      wq_numa_possible_cpumask[]")
      f3f90ad4 ("workqueue: determine NUMA node of workers accourding to
      the allowed cpumask")
      
      Therefore, when init_workqueues() runs, it sees all CPUs as being on
      Node 0. On LPARs or KVM guests where Node 0 is memoryless, this leads to
      a high number of slab deactivations
      (http://www.spinics.net/lists/linux-mm/msg67489.html).
      
      Fix this by initializing the powerpc-specific CPU<->node/local memory
      node mapping as early as possible, which on powerpc is
      do_init_bootmem(). Currently that function initializes the mapping for
      the boot CPU, but we extend it to setup the mapping for all possible
      CPUs. Then, in smp_prepare_cpus(), we can correspondingly set the
      per-cpu values for all possible CPUs. That ensures that before the
      early_initcalls run (and really as early as possible), the per-cpu NUMA
      mapping is accurate.
      
      While testing memoryless nodes on PowerKVM guests with a fix to the
      workqueue logic to use cpu_to_mem() instead of cpu_to_node(), with a
      guest topology of:
      
      available: 2 nodes (0-1)
      node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
      node 0 size: 0 MB
      node 0 free: 0 MB
      node 1 cpus: 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
      node 1 size: 16336 MB
      node 1 free: 15329 MB
      node distances:
      node   0   1
        0:  10  40
        1:  40  10
      
      the slab consumption decreases from
      
      Slab:             932416 kB
      SUnreclaim:       902336 kB
      
      to
      
      Slab:             395264 kB
      SUnreclaim:       359424 kB
      
      And we a corresponding increase in the slab efficiency from
      
      slab                                   mem     objs    slabs
                                            used   active   active
      ------------------------------------------------------------
      kmalloc-16384                       337 MB   11.28%  100.00%
      task_struct                         288 MB    9.93%  100.00%
      
      to
      
      slab                                   mem     objs    slabs
                                            used   active   active
      ------------------------------------------------------------
      kmalloc-16384                        37 MB  100.00%  100.00%
      task_struct                          31 MB  100.00%  100.00%
      
      Powerpc didn't support memoryless nodes until recently (64bb80d8
      "powerpc/numa: Enable CONFIG_HAVE_MEMORYLESS_NODES" and 8c272261
      "powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"). Those commits also
      helped improve memory consumption with these kind of environments.
      Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      2fabf084
    • H
      powerpc/perf/hv-24x7: Use kmem_cache_free · d6589722
      Himangi Saraogi 提交于
      Free memory allocated using kmem_cache_zalloc using kmem_cache_free
      rather than kfree.
      
      The Coccinelle semantic patch that makes this change is as follows:
      
      // <smpl>
      @@
      expression x,E,c;
      @@
      
       x = \(kmem_cache_alloc\|kmem_cache_zalloc\|kmem_cache_alloc_node\)(c,...)
       ... when != x = E
           when != &x
      ?-kfree(x)
      +kmem_cache_free(c,x)
      // </smpl>
      Signed-off-by: NHimangi Saraogi <himangi774@gmail.com>
      Acked-by: NJulia Lawall <julia.lawall@lip6.fr>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      d6589722
    • T
      powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info · 587870e8
      Thomas Falcon 提交于
      A buffer returned by H_VTERM_PARTNER_INFO contains device information
      in big endian format, causing problems for little endian architectures.
      This patch ensures that they are in cpu endian.
      Signed-off-by: NThomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      587870e8
    • A
      powerpc: Hard disable interrupts in xmon · a71d64b4
      Anton Blanchard 提交于
      xmon only soft disables interrupts. This seems like a bad idea - we
      certainly don't want decrementer and PMU exceptions going off when
      we are debugging something inside xmon.
      
      This issue was uncovered when the hard lockup detector went off
      inside xmon. To ensure we wont get a spurious hard lockup warning,
      I also call touch_nmi_watchdog() when exiting xmon.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      a71d64b4
    • N
      powerpc: remove duplicate definition of TEXASR_FS · 56758e3c
      Nishanth Aravamudan 提交于
      It appears that commits 7f06f21d ("powerpc/tm: Add checking to
      treclaim/trechkpt") and e4e38121 ("KVM: PPC: Book3S HV: Add
      transactional memory support") both added definitions of TEXASR_FS.
      Remove one of them. At the same time, fix the alignment of the remaining
      definition (should be tab-separated like the rest of the #defines).
      Signed-off-by: NNishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      56758e3c
    • G
      powerpc/pseries: Avoid deadlock on removing ddw · 5efbabe0
      Gavin Shan 提交于
      Function remove_ddw() could be called in of_reconfig_notifier and
      we potentially remove the dynamic DMA window property, which invokes
      of_reconfig_notifier again. Eventually, it leads to the deadlock as
      following backtrace shows.
      
      The patch fixes the above issue by deferring releasing the dynamic
      DMA window property while releasing the device node.
      
      =============================================
      [ INFO: possible recursive locking detected ]
      3.16.0+ #428 Tainted: G        W
      ---------------------------------------------
      drmgr/2273 is trying to acquire lock:
       ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \
       .__blocking_notifier_call_chain+0x40/0x78
      
      but task is already holding lock:
       ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \
       .__blocking_notifier_call_chain+0x40/0x78
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock((of_reconfig_chain).rwsem);
        lock((of_reconfig_chain).rwsem);
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      2 locks held by drmgr/2273:
       #0:  (sb_writers#4){.+.+.+}, at: [<c0000000001cbe70>] \
            .vfs_write+0xb0/0x1f8
       #1:  ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \
            .__blocking_notifier_call_chain+0x40/0x78
      
      stack backtrace:
      CPU: 17 PID: 2273 Comm: drmgr Tainted: G        W     3.16.0+ #428
      Call Trace:
      [c0000000137e7000] [c000000000013d9c] .show_stack+0x88/0x148 (unreliable)
      [c0000000137e70b0] [c00000000083cd34] .dump_stack+0x7c/0x9c
      [c0000000137e7130] [c0000000000b8afc] .__lock_acquire+0x128c/0x1c68
      [c0000000137e7280] [c0000000000b9a4c] .lock_acquire+0xe8/0x104
      [c0000000137e7350] [c00000000083588c] .down_read+0x4c/0x90
      [c0000000137e73e0] [c000000000091890] .__blocking_notifier_call_chain+0x40/0x78
      [c0000000137e7490] [c000000000091900] .blocking_notifier_call_chain+0x38/0x48
      [c0000000137e7520] [c000000000682a28] .of_reconfig_notify+0x34/0x5c
      [c0000000137e75b0] [c000000000682a9c] .of_property_notify+0x4c/0x54
      [c0000000137e7650] [c000000000682bf0] .of_remove_property+0x30/0xd4
      [c0000000137e76f0] [c000000000052a44] .remove_ddw+0x144/0x168
      [c0000000137e7790] [c000000000053204] .iommu_reconfig_notifier+0x30/0xe0
      [c0000000137e7820] [c00000000009137c] .notifier_call_chain+0x6c/0xb4
      [c0000000137e78c0] [c0000000000918ac] .__blocking_notifier_call_chain+0x5c/0x78
      [c0000000137e7970] [c000000000091900] .blocking_notifier_call_chain+0x38/0x48
      [c0000000137e7a00] [c000000000682a28] .of_reconfig_notify+0x34/0x5c
      [c0000000137e7a90] [c000000000682e14] .of_detach_node+0x44/0x1fc
      [c0000000137e7b40] [c0000000000518e4] .ofdt_write+0x3ac/0x688
      [c0000000137e7c20] [c000000000238430] .proc_reg_write+0xb8/0xd4
      [c0000000137e7cd0] [c0000000001cbeac] .vfs_write+0xec/0x1f8
      [c0000000137e7d70] [c0000000001cc3b0] .SyS_write+0x58/0xa0
      [c0000000137e7e30] [c00000000000a064] syscall_exit+0x0/0x98
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      5efbabe0
    • G
      powerpc/pseries: Failure on removing device node · f1b3929c
      Gavin Shan 提交于
      While running command "drmgr -c phb -r -s 'PHB 528'", following
      backtrace jumped out because the target device node isn't marked
      with OF_DETACHED by of_detach_node(), which caused by error
      returned from memory hotplug related reconfig notifier when
      disabling CONFIG_MEMORY_HOTREMOVE. The patch fixes it.
      
      ERROR: Bad of_node_put() on /pci@800000020000210/ethernet@0
      CPU: 14 PID: 2252 Comm: drmgr Tainted: G        W     3.16.0+ #427
      Call Trace:
      [c000000012a776a0] [c000000000013d9c] .show_stack+0x88/0x148 (unreliable)
      [c000000012a77750] [c00000000083cd34] .dump_stack+0x7c/0x9c
      [c000000012a777d0] [c0000000006807c4] .of_node_release+0x58/0xe0
      [c000000012a77860] [c00000000038a7d0] .kobject_release+0x174/0x1b8
      [c000000012a77900] [c00000000038a884] .kobject_put+0x70/0x78
      [c000000012a77980] [c000000000681680] .of_node_put+0x28/0x34
      [c000000012a77a00] [c000000000681ea8] .__of_get_next_child+0x64/0x70
      [c000000012a77a90] [c000000000682138] .of_find_node_by_path+0x1b8/0x20c
      [c000000012a77b40] [c000000000051840] .ofdt_write+0x308/0x688
      [c000000012a77c20] [c000000000238430] .proc_reg_write+0xb8/0xd4
      [c000000012a77cd0] [c0000000001cbeac] .vfs_write+0xec/0x1f8
      [c000000012a77d70] [c0000000001cc3b0] .SyS_write+0x58/0xa0
      [c000000012a77e30] [c00000000000a064] syscall_exit+0x0/0x98
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      f1b3929c
    • B
      powerpc/boot: Use correct zlib types for comparison · ce8f150a
      Benjamin Herrenschmidt 提交于
      Avoids this warning:
      
      arch/powerpc/boot/gunzip_util.c:118:9: warning: comparison of distinct pointer types lacks a cast
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ce8f150a
    • V
      powerpc/powernv: Interface to register/unregister opal dump region · b09c2ec4
      Vasant Hegde 提交于
      PowerNV platform is capable of capturing host memory region when system
      crashes (because of host/firmware). We have new OPAL API to register/
      unregister memory region to be captured when system crashes.
      
      This patch adds support for new API. Also during boot time we register
      kernel log buffer and unregister before doing kexec.
      Signed-off-by: NVasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      b09c2ec4
    • M
      powerpc: Add POWER8 features to CPU_FTRS_POSSIBLE/ALWAYS · 3609e09f
      Michael Ellerman 提交于
      We have been a bit slack about updating the CPU_FTRS_POSSIBLE and
      CPU_FTRS_ALWAYS masks. When we added POWER8, and also POWER8E we forgot
      to update the ALWAYS mask. And when we added POWER8_DD1 we forgot to
      update both the POSSIBLE and ALWAYS masks.
      
      Luckily this hasn't caused any actual bugs AFAICS. Failing to update the
      ALWAYS mask just forgoes a potential optimisation opportunity. Failing
      to update the POSSIBLE mask for POWER8_DD1 is also OK because it only
      removes a bit rather than adding any.
      
      Regardless they should all be in both masks so as to avoid any future
      bugs when the set of ALWAYS/POSSIBLE bits changes, or the masks
      themselves change.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Acked-by: NMichael Neuling <mikey@neuling.org>
      Acked-by: NJoel Stanley <joel@jms.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      3609e09f
    • A
      powerpc/ppc476: Disable BTAC · 97b3be1e
      Alistair Popple 提交于
      This patch disables the branch target address CAM which under specific
      circumstances may cause the processor to skip execution of 1-4
      instructions. This fixes IBM Erratum #47.
      Signed-off-by: NAlistair Popple <alistair@popple.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      97b3be1e