  1. 16 Jan, 2018 1 commit
    • powerpc/8xx: Remove _PAGE_USER and handle user access at PMD level · de0f9387
      Christophe Leroy authored
      As the Linux kernel separates KERNEL and USER address spaces, there is
      no need to flag USER access at page level.
      
      Today, the 8xx TLB handlers already handle user access in the L1 entry
      through Access Protection Groups. It is therefore natural to move the
      user access handling to PMD level once _PAGE_NA allows PAGE_NONE
      protection to be handled without _PAGE_USER.
      
      In the meantime, as we free up one bit in the PTE, we can use it to
      include SPS (page size flag) in the PTE and avoid handling it at every
      TLB miss, hence removing special handling based on compiled page size.
      
      For _PAGE_EXEC, we rework it to use PP PTE bits, avoiding the copy
      of the _PAGE_EXEC bit into the L1 entry. Unfortunately we are not
      able to put it at the correct location as it conflicts with
      NA/RO/RW bits for data entries.
      
      Upper bits of APG in L1 entry overlap with PMD base address. In
      order to avoid having to filter that out, we set up all groups so that
      upper bits can have any value.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  2. 22 Dec, 2017 1 commit
  3. 16 Nov, 2017 1 commit
  4. 23 Aug, 2017 1 commit
  5. 17 Aug, 2017 1 commit
    • powerpc/mm: Rename find_linux_pte_or_hugepte() · 94171b19
      Aneesh Kumar K.V authored
      Add newer helpers to make the function usage simpler. It is always
      recommended to use find_current_mm_pte() for walking the page table.
      If we cannot use find_current_mm_pte(), it should be documented why
      the said usage of __find_linux_pte() is safe against a parallel THP
      split.
      
      For now we have KVM code using __find_linux_pte(). This is because kvm
      code ends up calling __find_linux_pte() in real mode with MSR_EE=0 but
      with PACA soft_enabled = 1. We may want to fix that later and make
      sure we keep the MSR_EE and PACA soft_enabled in sync. When we do that
      we can switch kvm to use find_linux_pte().
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  6. 16 Aug, 2017 1 commit
    • powerpc/mm/hugetlb: Add support for reserving gigantic huge pages via kernel command line · 79cc38de
      Aneesh Kumar K.V authored
      With commit aa888a74 ("hugetlb: support larger than MAX_ORDER") we added
      support for allocating gigantic hugepages via kernel command line. Switch
      ppc64 arch specific code to use that.
      
      W.r.t FSL support, we now limit our allocation range using BOOTMEM_ALLOC_ACCESSIBLE.
      
      We use the kernel command line to do reservation of hugetlb pages on powernv
      platforms. On pseries hash mmu mode the supported gigantic huge page size is
      16GB and that can only be allocated with hypervisor assist. For pseries the
      command line option doesn't do the allocation. Instead pseries does gigantic
      hugepage allocation based on hypervisor hint that is specified via
      "ibm,expected#pages" property of the memory node.
      
      Cc: Scott Wood <oss@buserror.net>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  7. 15 Aug, 2017 1 commit
    • powerpc/hugetlb: fix page rights verification in gup_hugepte() · ca8afd40
      Christophe Leroy authored
      gup_hugepte() checks if pages are present and readable, and
      when 'write' is set, also checks if the pages are writable.
      
      Initially this was done by checking if _PAGE_PRESENT and
      _PAGE_READ were set. In addition, _PAGE_WRITE was verified for write
      accesses.
      
      The problem is that we have to handle the following three cases:
      1/ The target defines _PAGE_READ and _PAGE_WRITE
      2/ The target defines _PAGE_RW
      3/ The target defines _PAGE_RO
      
      In case 1/, the check works as intended.
      In case 2/, _PAGE_READ is defined as 0 and _PAGE_WRITE as _PAGE_RW,
      so it works as well.
      But in case 3/, _PAGE_RW is defined as 0, which means _PAGE_WRITE is 0,
      and then the test returns true (page writable) in all cases.
      
      A first correction was attempted in commit 6b8cb66a ("powerpc: Fix
      usage of _PAGE_RO in hugepage"), but that fix is wrong:
      instead of checking that the page is writable when write is requested,
      it checks that the page is NOT writable when write is NOT requested.
      
      This patch adds a new pte_read() helper to check whether a page is
      readable or not. This avoids handling all possible cases in
      gup_hugepte().
      
      Then gup_hugepte() is modified to use pte_present(), pte_read()
      and pte_write() instead of the raw flags.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
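The failure mode of case 3/ can be reproduced outside the kernel. The following is a minimal sketch, assuming a made-up flag layout for a target that defines only a read-only bit. The `DEMO_` macros, their bit values, `raw_flags_allow()`, `helpers_allow()` and the `demo_pte_*()` helpers are all hypothetical stand-ins and are not taken from the kernel sources.

```c
#include <stdbool.h>

/* Assumed layout for a target that only has a read-only bit (case 3/).
 * The DEMO_ names mirror _PAGE_PRESENT, _PAGE_RO, etc.; values are made up. */
#define DEMO_PAGE_PRESENT 0x1UL
#define DEMO_PAGE_RO      0x2UL
#define DEMO_PAGE_RW      0x0UL          /* not provided by the target */
#define DEMO_PAGE_READ    0x0UL          /* not provided by the target */
#define DEMO_PAGE_WRITE   DEMO_PAGE_RW   /* falls back to RW, i.e. 0 */

/* Raw-flag check: build a mask and require every bit in it to be set. */
static bool raw_flags_allow(unsigned long pte, bool write)
{
	unsigned long mask = DEMO_PAGE_PRESENT | DEMO_PAGE_READ;

	if (write)
		mask |= DEMO_PAGE_WRITE;   /* ORs in 0, so the mask is unchanged */
	return (pte & mask) == mask;
}

/* Hypothetical stand-ins for pte_present(), pte_read() and pte_write(). */
static bool demo_pte_present(unsigned long pte)
{
	return (pte & DEMO_PAGE_PRESENT) != 0;
}

static bool demo_pte_read(unsigned long pte)
{
	(void)pte;
	return true;   /* this layout has no separate read bit */
}

static bool demo_pte_write(unsigned long pte)
{
	return (pte & DEMO_PAGE_RO) == 0;
}

/* Helper-based check: each helper hides how its target encodes the right. */
static bool helpers_allow(unsigned long pte, bool write)
{
	if (!demo_pte_present(pte) || !demo_pte_read(pte))
		return false;
	return !write || demo_pte_write(pte);
}
```

For a present, read-only entry (`DEMO_PAGE_PRESENT | DEMO_PAGE_RO`), `raw_flags_allow(pte, true)` wrongly reports the page as writable, while `helpers_allow(pte, true)` rejects the write.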
  8. 07 Jul, 2017 4 commits
  9. 02 Jul, 2017 2 commits
  10. 05 Jun, 2017 1 commit
  11. 31 Mar, 2017 1 commit
    • powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout · a525108c
      Aneesh Kumar K.V authored
      Without this, if firmware reports 1MB page size support, we will crash
      trying to use 1MB as the hugetlb page size.
      
      echo 300 > /sys/kernel/mm/hugepages/hugepages-1024kB/nr_hugepages
      
      kernel BUG at ./arch/powerpc/include/asm/hugetlb.h:19!
      .....
      ....
      [c0000000e2c27b30] c00000000029dae8 .hugetlb_fault+0x638/0xda0
      [c0000000e2c27c30] c00000000026fb64 .handle_mm_fault+0x844/0x1d70
      [c0000000e2c27d70] c00000000004805c .do_page_fault+0x3dc/0x7c0
      [c0000000e2c27e30] c00000000000ac98 handle_page_fault+0x10/0x30
      
      With the fix, we don't enable 1MB as a hugepage size.
      
      bash-4.2# cd /sys/kernel/mm/hugepages/
      bash-4.2# ls
      hugepages-16384kB  hugepages-16777216kB
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  12. 18 Jan, 2017 3 commits
  13. 10 Dec, 2016 2 commits
    • powerpc/8xx: Implement support of hugepages · 4b914286
      Christophe Leroy authored
      8xx uses a two-level page table and supports two different Linux page
      sizes (4k and 16k). 8xx also supports two different hugepage sizes,
      512k and 8M. In order to support them on Linux we define two different
      page table layouts.
      
      The size of pages is in the PGD entry, using PS field (bits 28-29):
      00 : Small pages (4k or 16k)
      01 : 512k pages
      10 : reserved
      11 : 8M pages
      
      For the 512K hugepage size, a pgd entry has the format
      [<hugepte address>0101]. The hugepte table allocated will contain 8
      entries pointing to 512K huge ptes in 4k pages mode and 64 entries in
      16k pages mode.
      
      For 8M in 16k mode, a pgd entry has the format
      [<hugepte address>1101]. The hugepte table allocated will contain 8
      entries pointing to 8M huge ptes.
      
      For 8M in 4k mode, multiple pgd entries point to the same hugepte
      address and pgd entry will have the below format
      [<hugepte address>1101]. The hugepte table allocated will only have one
      entry.
      
      For the time being, we do not support CPU15 ERRATA when HUGETLB is
      selected.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3, for the generic bits)
      Signed-off-by: Scott Wood <oss@buserror.net>
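The PS field encoding listed in this commit message can be decoded with a shift and a mask. The following is a minimal sketch, assuming a 32-bit PGD entry and PowerPC bit numbering (bit 0 is the most significant bit, so bits 28-29 sit just above the two lowest bits). The enum and the function name `pgd_entry_ps()` are hypothetical and do not come from the kernel sources.

```c
#include <stdint.h>

/* PS field values as listed in the commit message. */
enum ps_field {
	PS_SMALL    = 0x0,   /* 00: small pages (4k or 16k) */
	PS_512K     = 0x1,   /* 01: 512k pages */
	PS_RESERVED = 0x2,   /* 10: reserved */
	PS_8M       = 0x3    /* 11: 8M pages */
};

/* Extract bits 28-29 (MSB-0 numbering) of a 32-bit entry.
 * Bit 31 is the least significant bit, so bit 29 is at shift 2. */
static enum ps_field pgd_entry_ps(uint32_t entry)
{
	return (enum ps_field)((entry >> 2) & 0x3);
}
```

With this decoding, an entry ending in binary `0101` yields `PS_512K` and an entry ending in `1101` yields `PS_8M`, which matches the two formats given in the message.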
    • powerpc: get hugetlbpage handling more generic · 03bb2d65
      Christophe Leroy authored
      Today there are two implementations of hugetlbpages which are managed
      by exclusive #ifdefs:
      * FSL_BOOKE: several directory entries point to the same single hugepage
      * BOOK3S: one upper level directory entry points to a table of hugepages
      
      In preparation for the implementation of hugepage support on the 8xx, we
      need a mix of the two above solutions, because the 8xx needs both cases
      depending on the size of pages:
      * In 4k page size mode, each PGD entry covers a 4M bytes area. It means
      that 2 PGD entries will be necessary to cover an 8M hugepage while a
      single PGD entry will cover 8x 512k hugepages.
      * In 16k page size mode, each PGD entry covers a 64M bytes area. It means
      that 8x 8M hugepages will be covered by one PGD entry and 64x 512k
      hugepages will be covered by one PGD entry.
      
      This patch:
      * removes #ifdefs in favor of if/else based on the range sizes
      * merges the two huge_pte_alloc() functions as they are pretty similar
      * merges the two hugetlbpage_init() functions as they are pretty similar
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3)
      Signed-off-by: Scott Wood <oss@buserror.net>
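The "if/else based on the range sizes" idea can be illustrated by comparing the hugepage size with the area covered by one PGD entry. The following is a minimal sketch, not the kernel's code. It assumes sizes are passed as shifts (4M is 2^22, 8M is 2^23, 512k is 2^19, 64M is 2^26), and the struct and function names are hypothetical.

```c
/* Result of comparing a hugepage size with the area one PGD entry covers. */
struct hugepd_layout {
	unsigned int pgd_entries_per_hugepage;  /* > 1: several entries share one hugepage */
	unsigned int hugepages_per_pgd_entry;   /* > 1: one entry points to a table */
};

/* Pick the layout at run time instead of through exclusive #ifdefs. */
static struct hugepd_layout compute_hugepd_layout(unsigned int hugepage_shift,
						  unsigned int pgdir_shift)
{
	struct hugepd_layout layout = { 1, 1 };

	if (hugepage_shift >= pgdir_shift)
		/* Hugepage at least as large as the area of one PGD entry. */
		layout.pgd_entries_per_hugepage = 1u << (hugepage_shift - pgdir_shift);
	else
		/* Several hugepages fit under a single PGD entry. */
		layout.hugepages_per_pgd_entry = 1u << (pgdir_shift - hugepage_shift);
	return layout;
}
```

For the 4k page size mode described in the message, `compute_hugepd_layout(23, 22)` reports 2 PGD entries per 8M hugepage, and `compute_hugepd_layout(19, 22)` reports 8 hugepages of 512k per PGD entry.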
  14. 23 Sep, 2016 1 commit
  15. 21 Jul, 2016 1 commit
  16. 25 Jun, 2016 1 commit
  17. 20 May, 2016 1 commit
  18. 11 May, 2016 2 commits
  19. 01 May, 2016 2 commits
  20. 29 Mar, 2016 1 commit
    • powerpc/mm: Fixup preempt underflow with huge pages · 08a5bb29
      Sebastian Siewior authored
      hugepd_free() used __get_cpu_var() once. Nothing ensured that the code
      accessing the variable did not migrate from one CPU to another and soon
      this was noticed by Tiejun Chen in 94b09d75 ("powerpc/hugetlb:
      Replace __get_cpu_var with get_cpu_var"). So we had it fixed.
      
      Christoph Lameter was doing his __get_cpu_var() replacements and forgot
      PowerPC. Then he noticed this and sent his fixed up batch again which
      got applied as 69111bac ("powerpc: Replace __get_cpu_var uses").
      
      The careful reader will notice one little detail: get_cpu_var() got
      replaced with this_cpu_ptr(). So now we have a put_cpu_var() which does
      a preempt_enable() and nothing that does preempt_disable() so we
      underflow the preempt counter.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
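The underflow described in this commit message comes from pairing an accessor that does not disable preemption with a release that enables it. The following is a toy userspace model, assuming only the preempt-count side effects matter. The `model_` functions and the counter are hypothetical and are not the kernel's per-CPU API.

```c
/* Toy model: only the preempt-count side effect of each accessor is kept. */
static int preempt_count;   /* 0 means preemption is enabled */
static int per_cpu_slot;    /* stands in for a per-CPU variable */

/* get_cpu_var() disables preemption before handing out the variable. */
static int *model_get_cpu_var(void)
{
	preempt_count++;
	return &per_cpu_slot;
}

/* this_cpu_ptr() hands out the variable without touching the counter. */
static int *model_this_cpu_ptr(void)
{
	return &per_cpu_slot;
}

/* put_cpu_var() re-enables preemption. */
static void model_put_cpu_var(void)
{
	preempt_count--;
}

/* Balanced pairing: get_cpu_var() followed by put_cpu_var(). */
static int balanced_access(void)
{
	preempt_count = 0;
	*model_get_cpu_var() += 1;
	model_put_cpu_var();
	return preempt_count;
}

/* Mismatched pairing from the message: this_cpu_ptr() then put_cpu_var(). */
static int mismatched_access(void)
{
	preempt_count = 0;
	*model_this_cpu_ptr() += 1;
	model_put_cpu_var();
	return preempt_count;
}
```

In this model `balanced_access()` leaves the counter at 0, while `mismatched_access()` leaves it at -1, which is the underflow the commit fixes.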
  21. 29 Feb, 2016 1 commit
  22. 16 Jan, 2016 2 commits
  23. 14 Dec, 2015 2 commits
  24. 12 Oct, 2015 2 commits
  25. 18 Aug, 2015 1 commit
    • powerpc/cell: Drop support for 64K local store on 4K kernels · f444f1f8
      Michael Ellerman authored
      Back in the olden days we added support for using 64K pages to map the
      SPU (Synergistic Processing Unit) local store on Cell, when the main
      kernel was using 4K pages.
      
      This was useful at the time because distros were using 4K pages, but
      using 64K pages on the SPUs could reduce TLB pressure there.
      
      However these days the number of Cell users is approaching zero, and
      supporting this option adds unpleasant complexity to the memory
      management code.
      
      So drop the option, CONFIG_SPU_FS_64K_LS, and all related code.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Jeremy Kerr <jk@ozlabs.org>
  26. 25 Jun, 2015 1 commit
    • mm/hugetlb: reduce arch dependent code about huge_pmd_unshare · e81f2d22
      Zhang Zhen authored
      Currently we have many duplicates in definitions of huge_pmd_unshare.  In
      all architectures this function just returns 0 when
      CONFIG_ARCH_WANT_HUGE_PMD_SHARE is N.
      
      This patch puts the default implementation in mm/hugetlb.c and lets these
      architectures use the common code.
      Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: James Yang <James.Yang@freescale.com>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
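The consolidation described in this commit message amounts to one shared fallback that is compiled only when the architecture does not opt in to PMD sharing. The following is a minimal sketch of that pattern. The function signature and the stand-in types are assumptions made so the fragment compiles outside the kernel tree, and it is not the code from mm/hugetlb.c.

```c
#include <stddef.h>

/* Stand-in types so the sketch compiles outside the kernel tree. */
struct mm_struct;
typedef struct { unsigned long pte; } pte_t;

/* Common fallback: without CONFIG_ARCH_WANT_HUGE_PMD_SHARE there is
 * nothing to unshare, so every architecture can reuse the same stub.
 * The signature is assumed for illustration. */
#ifndef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
{
	(void)mm;
	(void)addr;
	(void)ptep;
	return 0;
}
#endif
```

An architecture that selects the option would provide its own definition, and the `#ifndef` keeps the fallback out of that build.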
  27. 17 Jun, 2015 1 commit
    • powerpc: don't use module_init for non-modular core hugetlb code · 6f114281
      Paul Gortmaker authored
      The hugetlbpage.o is obj-y (always built in).  It will never
      be modular, so using module_init as an alias for __initcall is
      somewhat misleading.
      
      Fix this up now, so that we can relocate module_init from
      init.h into module.h in the future.  If we don't do this, we'd
      have to add module.h to obviously non-modular code, and that
      would be a worse thing.
      
      Note that direct use of __initcall is discouraged, vs. one
      of the priority categorized subgroups.  As __initcall gets
      mapped onto device_initcall, our use of arch_initcall (which
      makes sense for arch code) will thus change this registration
      from level 6-device to level 3-arch (i.e. slightly earlier).
      However no observable impact of that small difference has
      been observed during testing, or is expected.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
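The ordering effect mentioned in this commit message (level 6-device versus level 3-arch) can be modelled in a few lines. The following is a toy userspace sketch, assuming that lower levels run first and that registration order breaks ties within a level. The struct, the function names and `some_driver_init` are hypothetical, and only the two level numbers come from the message.

```c
#include <stddef.h>

/* The two levels named in the commit message. */
enum { LEVEL_ARCH = 3, LEVEL_DEVICE = 6 };

struct initcall {
	int level;
	const char *name;
};

/* Return the entry that would run first: the lowest level wins and
 * the earlier registration wins a tie (modelling assumption). */
static const struct initcall *first_to_run(const struct initcall *calls, size_t n)
{
	const struct initcall *best = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!best || calls[i].level < best->level)
			best = &calls[i];
	return best;
}

/* Does hugetlbpage_init run before a device-level call registered earlier? */
static int hugetlb_runs_first(int hugetlb_level)
{
	struct initcall calls[] = {
		{ LEVEL_DEVICE, "some_driver_init" },   /* hypothetical neighbour */
		{ hugetlb_level, "hugetlbpage_init" },
	};

	return first_to_run(calls, 2) == &calls[1];
}
```

In this model, registering at `LEVEL_DEVICE` leaves the call behind the earlier device-level entry, while registering at `LEVEL_ARCH` moves it ahead, which is the "slightly earlier" change the message refers to.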
  28. 20 May, 2015 1 commit
    • module: add extra argument for parse_params() callback · ecc86170
      Luis R. Rodriguez authored
      This adds an extra argument onto parse_params() to be used
      as a way to make the unused callback a bit more useful and
      generic by allowing the caller to pass on a data structure
      of its choice. An example use case is to allow us to easily
      make module parameters for every module which we will do
      next.
      
      @ parse @
      identifier name, args, params, num, level_min, level_max;
      identifier unknown, param, val, doing;
      type s16;
      @@
       extern char *parse_args(const char *name,
       			 char *args,
       			 const struct kernel_param *params,
       			 unsigned num,
       			 s16 level_min,
       			 s16 level_max,
      +			 void *arg,
       			 int (*unknown)(char *param, char *val,
      					const char *doing
      +					, void *arg
      					));
      
      @ parse_mod @
      identifier name, args, params, num, level_min, level_max;
      identifier unknown, param, val, doing;
      type s16;
      @@
       char *parse_args(const char *name,
       			 char *args,
       			 const struct kernel_param *params,
       			 unsigned num,
       			 s16 level_min,
       			 s16 level_max,
      +			 void *arg,
       			 int (*unknown)(char *param, char *val,
      					const char *doing
      +					, void *arg
      					))
      {
      	...
      }
      
      @ parse_args_found @
      expression R, E1, E2, E3, E4, E5, E6;
      identifier func;
      @@
      
      (
      	R =
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   func);
      |
      	R =
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   &func);
      |
      	R =
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   NULL);
      |
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   func);
      |
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   &func);
      |
      	parse_args(E1, E2, E3, E4, E5, E6,
      +		   NULL,
      		   NULL);
      )
      
      @ parse_args_unused depends on parse_args_found @
      identifier parse_args_found.func;
      @@
      
      int func(char *param, char *val, const char *unused
      +		 , void *arg
      		 )
      {
      	...
      }
      
      @ mod_unused depends on parse_args_found @
      identifier parse_args_found.func;
      expression A1, A2, A3;
      @@
      
      -	func(A1, A2, A3);
      +	func(A1, A2, A3, NULL);
      
      Generated-by: Coccinelle SmPL
      Cc: cocci@systeme.lip6.fr
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Felipe Contreras <felipe.contreras@gmail.com>
      Cc: Ewan Milne <emilne@redhat.com>
      Cc: Jean Delvare <jdelvare@suse.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: Tejun Heo <tj@kernel.org>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
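The benefit of the extra `void *arg` is that a callback can receive caller-owned state instead of relying on globals. The following is a simplified userspace sketch of that pattern. `model_parse_args()`, `count_params()` and `count_in()` are hypothetical names, and the tokenizer is far simpler than the kernel's parse_args(); only the callback shape follows the semantic patch above.

```c
#include <string.h>

/* Callback type with the opaque argument appended as the last parameter. */
typedef int (*unknown_fn)(char *param, char *val, const char *doing, void *arg);

/* Simplified model: split space-separated "key=value" tokens in place and
 * pass each one, together with the caller's arg, to the callback. */
static int model_parse_args(const char *doing, char *args, void *arg,
			    unknown_fn unknown)
{
	while (*args) {
		char *param, *val, *end;
		int ret;

		args += strspn(args, " ");          /* skip separators */
		if (!*args)
			break;
		param = args;
		end = param + strcspn(param, " ");  /* end of this token */
		if (*end)
			*end++ = '\0';
		val = strchr(param, '=');
		if (val)
			*val++ = '\0';              /* split key from value */
		ret = unknown(param, val, doing, arg);
		if (ret)
			return ret;
		args = end;
	}
	return 0;
}

/* Example callback: counts parameters in storage supplied by the caller. */
static int count_params(char *param, char *val, const char *doing, void *arg)
{
	(void)param;
	(void)val;
	(void)doing;
	++*(int *)arg;
	return 0;
}

/* Convenience wrapper: copy the string, parse it, return the count. */
static int count_in(const char *cmdline)
{
	char buf[128];
	int count = 0;

	strncpy(buf, cmdline, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';
	model_parse_args("demo", buf, &count, count_params);
	return count;
}
```

Callers that have no state to pass can hand in `NULL`, which is what the semantic patch does for every existing call site.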