1. 18 January 2017 (2 commits)
  2. 10 December 2016 (2 commits)
    • powerpc/8xx: Implement support of hugepages · 4b914286
      Authored by Christophe Leroy
      8xx uses a two-level page table and supports two different Linux page
      sizes (4k and 16k). The 8xx also supports two hugepage sizes,
      512k and 8M. In order to support them on Linux we define two different
      page table layouts.
      
      The page size is encoded in the PGD entry, using the PS field (bits 28-29):
      00 : Small pages (4k or 16k)
      01 : 512k pages
      10 : reserved
      11 : 8M pages
      
      For the 512k hugepage size, a PGD entry has the following format
      [<hugepte address>0101]. The allocated hugepte table contains 8
      entries pointing to 512k huge PTEs in 4k pages mode, and 64 entries in
      16k pages mode.
      
      For 8M in 16k mode, a PGD entry has the following format
      [<hugepte address>1101]. The allocated hugepte table contains 8
      entries pointing to 8M huge PTEs.
      
      For 8M in 4k mode, multiple PGD entries point to the same hugepte
      address, and each PGD entry has the following format
      [<hugepte address>1101]. The allocated hugepte table contains only one
      entry.
      
      For the time being, we do not support the CPU15 ERRATA workaround when
      HUGETLB is selected.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3, for the generic bits)
      Signed-off-by: Scott Wood <oss@buserror.net>
    • powerpc: get hugetlbpage handling more generic · 03bb2d65
      Authored by Christophe Leroy
      Today there are two implementations of hugetlbpages, which are managed
      by mutually exclusive #ifdefs:
      * FSL_BOOKE: several directory entries point to the same single hugepage
      * BOOK3S: one upper-level directory entry points to a table of hugepages
      
      In preparation for implementing hugepage support on the 8xx, we
      need a mix of the two solutions above, because the 8xx needs both cases
      depending on the page size:
      * In 4k page size mode, each PGD entry covers a 4M area. This means
      that 2 PGD entries are necessary to cover an 8M hugepage, while a
      single PGD entry covers 8x 512k hugepages.
      * In 16k page size mode, each PGD entry covers a 64M area. This means
      that 8x 8M hugepages are covered by one PGD entry and 64x 512k
      hugepages are covered by one PGD entry.
      
      This patch:
      * removes #ifdefs in favor of if/else based on the range sizes
      * merges the two huge_pte_alloc() functions as they are pretty similar
      * merges the two hugetlbpage_init() functions as they are pretty similar
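
      A standalone sketch (plain C; the function names and example sizes
      are hypothetical, not the kernel's) of the run-time dispatch that
      replaces the #ifdefs: compare the hugepage size with the area covered
      by one PGD entry and pick the layout accordingly.

      	#include <stdio.h>

      	enum layout {
      		MANY_PGD_PER_HUGEPAGE,	/* FSL_BOOKE style */
      		ONE_PGD_PER_HUGEPD	/* BOOK3S style */
      	};

      	static enum layout pick_layout(unsigned long huge_size,
      				       unsigned long pgd_span)
      	{
      		/* hugepage spans one or more whole PGD entries */
      		if (huge_size >= pgd_span)
      			return MANY_PGD_PER_HUGEPAGE;
      		/* one PGD entry points to a table of hugepages */
      		return ONE_PGD_PER_HUGEPD;
      	}

      	int main(void)
      	{
      		/* 8xx in 4k mode: a PGD entry covers 4M, so an 8M
      		 * hugepage needs 2 PGD entries (layout 0), while 512k
      		 * hugepages share a single PGD entry (layout 1) */
      		printf("8M:   %d\n", pick_layout(8UL << 20, 4UL << 20));
      		printf("512k: %d\n", pick_layout(512UL << 10, 4UL << 20));
      		return 0;
      	}
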
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3)
      Signed-off-by: Scott Wood <oss@buserror.net>
  3. 23 September 2016 (1 commit)
  4. 21 July 2016 (1 commit)
  5. 25 June 2016 (1 commit)
  6. 20 May 2016 (1 commit)
  7. 11 May 2016 (2 commits)
  8. 01 May 2016 (2 commits)
  9. 29 March 2016 (1 commit)
    • powerpc/mm: Fixup preempt underflow with huge pages · 08a5bb29
      Authored by Sebastian Siewior
      hugepd_free() used __get_cpu_var() once. Nothing ensured that the code
      accessing the variable did not migrate from one CPU to another and soon
      this was noticed by Tiejun Chen in 94b09d75 ("powerpc/hugetlb:
      Replace __get_cpu_var with get_cpu_var"). So we had it fixed.
      
      Christoph Lameter was doing his __get_cpu_var() replaces and forgot
      PowerPC. Then he noticed this and sent his fixed up batch again which
      got applied as 69111bac ("powerpc: Replace __get_cpu_var uses").
      
      The careful reader will have noticed one little detail: get_cpu_var() got
      replaced with this_cpu_ptr(). So now we have a put_cpu_var() which does
      a preempt_enable() but nothing that does a preempt_disable(), so we
      underflow the preempt counter.
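
      A self-contained toy model of the imbalance (plain C, not kernel
      code; the counter stands in for the kernel's preempt count):

      	#include <stdio.h>

      	static int preempt_count;

      	static void preempt_disable(void) { preempt_count++; }
      	static void preempt_enable(void)  { preempt_count--; }

      	int main(void)
      	{
      		/* correct pairing: get_cpu_var() implies preempt_disable(),
      		 * put_cpu_var() implies preempt_enable() */
      		preempt_disable();
      		preempt_enable();

      		/* the bug: this_cpu_ptr() takes no reference, but the
      		 * leftover put_cpu_var() still drops one */
      		preempt_enable();

      		printf("preempt_count = %d\n", preempt_count);	/* -1 */
      		return 0;
      	}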
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  10. 29 February 2016 (1 commit)
  11. 16 January 2016 (2 commits)
  12. 14 December 2015 (2 commits)
  13. 12 October 2015 (2 commits)
  14. 18 August 2015 (1 commit)
    • powerpc/cell: Drop support for 64K local store on 4K kernels · f444f1f8
      Authored by Michael Ellerman
      Back in the olden days we added support for using 64K pages to map the
      SPU (Synergistic Processing Unit) local store on Cell, when the main
      kernel was using 4K pages.
      
      This was useful at the time because distros were using 4K pages, but
      using 64K pages on the SPUs could reduce TLB pressure there.
      
      However these days the number of Cell users is approaching zero, and
      supporting this option adds unpleasant complexity to the memory
      management code.
      
      So drop the option, CONFIG_SPU_FS_64K_LS, and all related code.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Jeremy Kerr <jk@ozlabs.org>
  15. 25 June 2015 (1 commit)
    • mm/hugetlb: reduce arch dependent code about huge_pmd_unshare · e81f2d22
      Authored by Zhang Zhen
      Currently we have many duplicate definitions of huge_pmd_unshare. In
      all architectures this function just returns 0 when
      CONFIG_ARCH_WANT_HUGE_PMD_SHARE is N.
      
      This patch puts the default implementation in mm/hugetlb.c and lets these
      architectures use the common code.
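
      A self-contained model of the cleanup, assuming a config-guard
      approach (the guard macro and bodies are illustrative): the common
      code supplies the trivial default, and only architectures selecting
      the feature provide a real implementation. Build the sharing variant
      with e.g. gcc -DARCH_WANT_HUGE_PMD_SHARE.

      	#include <stdio.h>

      	#ifdef ARCH_WANT_HUGE_PMD_SHARE
      	/* arch-specific PMD-sharing logic would live here */
      	static int huge_pmd_unshare(void) { return 1; }
      	#else
      	/* the trivial default every non-sharing arch used to duplicate */
      	static int huge_pmd_unshare(void) { return 0; }
      	#endif

      	int main(void)
      	{
      		printf("huge_pmd_unshare() = %d\n", huge_pmd_unshare());
      		return 0;
      	}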
      Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: James Yang <James.Yang@freescale.com>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 17 June 2015 (1 commit)
    • powerpc: don't use module_init for non-modular core hugetlb code · 6f114281
      Authored by Paul Gortmaker
      hugetlbpage.o is obj-y (always built in). It will never
      be modular, so using module_init as an alias for __initcall is
      somewhat misleading.
      
      Fix this up now, so that we can relocate module_init from
      init.h into module.h in the future. If we don't do this, we'd
      have to add module.h to obviously non-modular code, and that
      would be a worse thing.
      
      Note that direct use of __initcall is discouraged in favour of one
      of the priority-categorized subgroups. As __initcall gets
      mapped onto device_initcall, our use of arch_initcall (which
      makes sense for arch code) changes this registration
      from level 6-device to level 3-arch (i.e. slightly earlier).
      No observable impact of that small difference has been seen
      during testing, nor is any expected.
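
      The shape of the change, as a hypothetical excerpt in diff form
      (module_init and arch_initcall are the real kernel macros; the
      function body is elided):

      	 static int __init hugetlbpage_init(void)
      	 {
      	 	/* ... existing setup ... */
      	 	return 0;
      	 }
      	-module_init(hugetlbpage_init);
      	+arch_initcall(hugetlbpage_init);	/* level 3-arch, not 6-device */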
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  17. 20 May 2015 (1 commit)
    • module: add extra argument for parse_params() callback · ecc86170
      Authored by Luis R. Rodriguez
      This adds an extra argument to parse_params() to make the
      unknown-parameter callback more useful and generic by allowing the
      caller to pass a data structure of its choice. An example use case is
      to allow us to easily make module parameters for every module, which
      we will do next.
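
      For illustration, a hypothetical callback using the new void *arg
      (the struct and values are invented; the Coccinelle semantic patch
      below is what performed the tree-wide conversion):

      	#include <stdio.h>

      	struct opts { int verbose; };

      	static int unknown_param(char *param, char *val,
      				 const char *doing, void *arg)
      	{
      		struct opts *o = arg;	/* caller-supplied state */

      		if (o->verbose)
      			printf("%s: unknown parameter '%s'='%s'\n",
      			       doing, param, val ? val : "");
      		return 0;
      	}

      	int main(void)
      	{
      		struct opts o = { .verbose = 1 };
      		char p[] = "foo", v[] = "bar";

      		/* parse_args() would invoke the callback like this: */
      		return unknown_param(p, v, "modname", &o);
      	}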
      
      @ parse @
      identifier name, args, params, num, level_min, level_max;
      identifier unknown, param, val, doing;
      type s16;
      @@
       extern char *parse_args(const char *name,
       			 char *args,
       			 const struct kernel_param *params,
       			 unsigned num,
       			 s16 level_min,
       			 s16 level_max,
      +			 void *arg,
       			 int (*unknown)(char *param, char *val,
      					const char *doing
      +					, void *arg
      					));
      
      @ parse_mod @
      identifier name, args, params, num, level_min, level_max;
      identifier unknown, param, val, doing;
      type s16;
      @@
       char *parse_args(const char *name,
       			 char *args,
       			 const struct kernel_param *params,
       			 unsigned num,
       			 s16 level_min,
       			 s16 level_max,
      +			 void *arg,
       			 int (*unknown)(char *param, char *val,
      					const char *doing
      +					, void *arg
      					))
      {
      	...
      }
      
      @ parse_args_found @
      expression R, E1, E2, E3, E4, E5, E6;
      identifier func;
      @@
      
      (
      	R =
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   func);
      |
      	R =
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   &func);
      |
      	R =
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   NULL);
      |
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   func);
      |
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   &func);
      |
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   NULL);
      )
      
      @ parse_args_unused depends on parse_args_found @
      identifier parse_args_found.func;
      @@
      
      int func(char *param, char *val, const char *unused
      +		 , void *arg
      		 )
      {
      	...
      }
      
      @ mod_unused depends on parse_args_found @
      identifier parse_args_found.func;
      expression A1, A2, A3;
      @@
      
      -	func(A1, A2, A3);
      +	func(A1, A2, A3, NULL);
      
      Generated-by: Coccinelle SmPL
      Cc: cocci@systeme.lip6.fr
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Felipe Contreras <felipe.contreras@gmail.com>
      Cc: Ewan Milne <emilne@redhat.com>
      Cc: Jean Delvare <jdelvare@suse.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: Tejun Heo <tj@kernel.org>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  18. 12 May 2015 (1 commit)
  19. 17 April 2015 (2 commits)
    • powerpc/mm/thp: Return pte address if we find trans_splitting. · 7d6e7f7f
      Authored by Aneesh Kumar K.V
      For a THP that is marked trans splitting, we return the pte.
      This requires callers to handle the pmd_trans_splitting scenario,
      if they care. All the current callers are either looking at pfn or
      write_ok, hence we don't need to update them.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/thp: Make page table walk safe against thp split/collapse · 691e95fd
      Authored by Aneesh Kumar K.V
      We can disable a THP split or a hugepage collapse by disabling irqs.
      We send an IPI to all the cpus in the early part of split/collapse,
      and disabling local irqs ensures we don't make progress with
      split/collapse. If the THP is getting split we return NULL from
      find_linux_pte_or_hugepte(). For all the current callers this should be ok.
      We need to be careful if we want to use the returned pte_t pointer outside
      the irq-disabled region. W.r.t. a THP split, the pfn remains the same,
      but a hugepage collapse will result in a pfn change. There are a few
      steps we can take to avoid a hugepage collapse. One way is to take a
      page reference inside the irq-disabled region. Another option is to take
      mmap_sem so that a parallel collapse will not happen. We can also
      disable collapse by taking pmd_lock. Another method used by the kvm
      subsystem is to check whether we had an mmu_notifier update in between
      using mmu_notifier_retry().
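
      A self-contained model of the prescribed calling pattern (the kernel
      primitives are stubbed so the control flow can be seen;
      find_linux_pte_or_hugepte() is the real function name, everything
      else is illustrative):

      	#include <stdbool.h>
      	#include <stdio.h>

      	/* stubs standing in for the kernel primitives */
      	static void local_irq_save(unsigned long *flags) { *flags = 0; }
      	static void local_irq_restore(unsigned long flags) { (void)flags; }
      	static int *find_linux_pte_or_hugepte_stub(void)
      	{
      		static int pte = 42;
      		return &pte;	/* NULL would mean a split is in progress */
      	}

      	static bool read_pte(int *out)
      	{
      		unsigned long flags;
      		bool ok = false;
      		int *ptep;

      		local_irq_save(&flags);	/* blocks split/collapse progress */
      		ptep = find_linux_pte_or_hugepte_stub();
      		if (ptep) {
      			*out = *ptep;	/* use the pte only in this region */
      			ok = true;
      		}
      		local_irq_restore(flags); /* ptep must not be used past here */
      		return ok;
      	}

      	int main(void)
      	{
      		int v;
      		printf("%d %d\n", read_pte(&v), v);
      		return 0;
      	}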
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  20. 16 April 2015 (1 commit)
  21. 10 April 2015 (1 commit)
  22. 12 February 2015 (1 commit)
    • mm/hugetlb: reduce arch dependent code around follow_huge_* · 61f77eda
      Authored by Naoya Horiguchi
      Currently we have many duplicates in the definitions around
      follow_huge_addr(), follow_huge_pmd(), and follow_huge_pud(), so this
      patch tries to remove them.  The basic idea is to put the default
      implementations for these functions in mm/hugetlb.c as weak symbols
      (regardless of CONFIG_ARCH_WANT_GENERAL_HUGETLB), and to implement
      arch-specific code only when the arch needs it.
      
      For follow_huge_addr(), only powerpc and ia64 have their own
      implementation, and in all other architectures this function just returns
      ERR_PTR(-EINVAL).  So this patch sets returning ERR_PTR(-EINVAL) as
      default.
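
      A self-contained illustration of the weak-symbol mechanism this
      relies on (GCC/Clang attribute; follow_huge_addr() is the real
      function name, the string bodies are placeholders): the common
      definition is weak, and an arch overrides it simply by providing a
      strong definition.

      	#include <stdio.h>

      	/* common-code default, analogous to the one in mm/hugetlb.c */
      	__attribute__((weak)) const char *follow_huge_addr(void)
      	{
      		return "default: ERR_PTR(-EINVAL)";
      	}

      	/* an arch (e.g. powerpc or ia64) overrides by defining a strong
      	 * version; uncomment to see it take precedence:
      	 *
      	 * const char *follow_huge_addr(void) { return "arch-specific"; }
      	 */

      	int main(void)
      	{
      		puts(follow_huge_addr());
      		return 0;
      	}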
      
      As for follow_huge_(pmd|pud)(), if (pmd|pud)_huge() is implemented to
      always return 0 in your architecture (as in ia64 or sparc), it's never
      called (the callsite is optimized away) no matter how it is implemented.
      So in such architectures we don't need an arch-specific implementation.
      
      In some architectures (like mips, s390 and tile) the current
      arch-specific follow_huge_(pmd|pud)() is effectively identical to the
      common code, so this patch lets these architectures use the common code.
      
      One exception is metag, where pmd_huge() can return non-zero but
      follow_huge_pmd() is expected to always return NULL.  This means that we
      need an arch-specific implementation which returns NULL.  This behavior
      looks strange to me (because non-zero pmd_huge() implies that the
      architecture supports PMD-based hugepages, so follow_huge_pmd()
      can/should return some relevant value), but that's beyond this cleanup
      patch, so let's keep it.
      
      Justification of non-trivial changes:
      - in s390, follow_huge_pmd() checks !MACHINE_HAS_HPAGE at first, and this
        patch removes the check. This is OK because we can assume MACHINE_HAS_HPAGE
        is true when follow_huge_pmd() can be called (note that pmd_huge() has
        the same check and always returns 0 for !MACHINE_HAS_HPAGE.)
      - in s390 and mips, we use HPAGE_MASK instead of PMD_MASK as done in common
        code. This patch forces these archs to use PMD_MASK, but it's OK because
        they are identical in both archs.
        In s390, both HPAGE_SHIFT and PMD_SHIFT are 20.
        In mips, HPAGE_SHIFT is defined as (PAGE_SHIFT + PAGE_SHIFT - 3) and
        PMD_SHIFT is defined as (PAGE_SHIFT + PAGE_SHIFT + PTE_ORDER - 3), but
        PTE_ORDER is always 0, so these are identical.
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 19 January 2015 (1 commit)
  24. 02 December 2014 (1 commit)
  25. 19 November 2014 (1 commit)
    • powerpc: Remove more traces of bootmem · e39f223f
      Authored by Michael Ellerman
      Although we are now selecting NO_BOOTMEM, we still have some traces of
      bootmem lying around. That is because even with NO_BOOTMEM there is
      still a shim that converts bootmem calls into memblock calls, but
      ultimately we want to remove all traces of bootmem.
      
      Most of the patch is conversions from alloc_bootmem() to
      memblock_virt_alloc(). In general a call such as:
      
        p = (struct foo *)alloc_bootmem(x);
      
      Becomes:
      
        p = memblock_virt_alloc(x, 0);
      
      We don't need the cast because memblock_virt_alloc() returns a void *.
      The alignment value of zero tells memblock to use the default alignment,
      which is SMP_CACHE_BYTES, the same value alloc_bootmem() uses.
      
      We remove a number of NULL checks on the result of
      memblock_virt_alloc(). That is because memblock_virt_alloc() will panic
      if it can't allocate, in exactly the same way as alloc_bootmem(), so the
      NULL checks are and always have been redundant.
      
      The memory returned by memblock_virt_alloc() is already zeroed, so we
      remove several memsets of the result of memblock_virt_alloc().
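
      Put together, a representative conversion (illustrative fragment;
      struct foo and x stand for the real callers' types and sizes) looks
      like:

      	/* before: */
      	p = (struct foo *)alloc_bootmem(x);
      	if (!p)
      		panic("alloc failed");	/* redundant: alloc_bootmem() panics */
      	memset(p, 0, x);		/* redundant: memory comes back zeroed */

      	/* after: */
      	p = memblock_virt_alloc(x, 0);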
      
      Finally we convert a few uses of __alloc_bootmem(x, y, MAX_DMA_ADDRESS)
      to just plain memblock_virt_alloc(). We don't use memblock_alloc_base()
      because MAX_DMA_ADDRESS is ~0ul on powerpc, so limiting the allocation
      to that is pointless; 16EB ought to be enough for anyone.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  26. 17 November 2014 (1 commit)
    • mmu_gather: move minimal range calculations into generic code · fb7332a9
      Authored by Will Deacon
      On architectures with hardware broadcasting of TLB invalidation
      messages, it makes sense to reduce the range of the mmu_gather
      structure when unmapping page ranges based on the dirty address
      information passed to tlb_remove_tlb_entry.
      
      arm64 already does this by directly manipulating the start/end fields
      of the gather structure, but this confuses the generic code which
      does not expect these fields to change and can end up calculating
      invalid, negative ranges when forcing a flush in zap_pte_range.
      
      This patch moves the minimal range calculation out of the arm64 code
      and into the generic implementation, simplifying zap_pte_range in the
      process (which no longer needs to care about start/end, since they will
      point to the appropriate ranges already). With the range being tracked
      by core code, the need_flush flag is dropped in favour of checking that
      the end of the range has actually been set.
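
      A self-contained model of the generic range tracking (the struct and
      helper mirror the idea, not the kernel's exact types or signatures):
      each unmapped entry grows [start, end), and the former need_flush
      flag becomes the check end != 0.

      	#include <stdio.h>

      	struct mmu_gather { unsigned long start, end; };

      	static void track_tlb_entry(struct mmu_gather *tlb,
      				    unsigned long addr, unsigned long size)
      	{
      		if (tlb->end == 0 || addr < tlb->start)
      			tlb->start = addr;
      		if (addr + size > tlb->end)
      			tlb->end = addr + size;
      	}

      	int main(void)
      	{
      		struct mmu_gather tlb = { 0, 0 };

      		track_tlb_entry(&tlb, 0x2000, 0x1000);
      		track_tlb_entry(&tlb, 0x5000, 0x1000);
      		/* flush is needed iff end != 0; range is [start, end) */
      		printf("flush [%#lx, %#lx)\n", tlb.start, tlb.end);
      		return 0;
      	}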
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Michal Simek <monstr@monstr.eu>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  27. 14 November 2014 (2 commits)
  28. 10 November 2014 (1 commit)
  29. 03 November 2014 (1 commit)
    • powerpc: Replace __get_cpu_var uses · 69111bac
      Authored by Christoph Lameter
      This still has not been merged and now powerpc is the only arch that does
      not have this change. Sorry about missing linuxppc-dev before.
      
      V2->V2
        - Fix up to work against 3.18-rc1
      
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are storing and retrieving data from the current
      processor's percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as:
      
      	#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and fewer
      registers are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout, specialized macros can be defined in non-x86
      arches as well in order to optimize per-cpu access by e.g. using a global
      register that may be set to the per-cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processor's instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      [mpe: Fix build errors caused by set/or_softirq_pending(), and rework
            assignment in __set_breakpoint() to use memcpy().]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  30. 27 August 2014 (2 commits)
    • Revert "powerpc: Replace __get_cpu_var uses" · 23f66e2d
      Authored by Tejun Heo
      This reverts commit 5828f666 due to
      build failure after merging with pending powerpc changes.
      
      Link: http://lkml.kernel.org/g/20140827142243.6277eaff@canb.auug.org.au
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Replace __get_cpu_var uses · 5828f666
      Authored by Christoph Lameter
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are storing and retrieving data from the current
      processor's percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as:
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and fewer
      registers are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout, specialized macros can be defined in non-x86
      arches as well in order to optimize per-cpu access by e.g. using a global
      register that may be set to the per-cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processor's instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      tj: Folded a fix patch.
          http://lkml.kernel.org/g/alpine.DEB.2.11.1408172143020.9652@gentwo.org
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>