1. 05 Jun 2021, 1 commit
  2. 15 May 2021, 1 commit
  3. 14 May 2021, 1 commit
  4. 11 May 2021, 1 commit
  5. 06 May 2021, 2 commits
    • hugetlb/userfaultfd: forbid huge pmd sharing when uffd enabled · c1991e07
      Committed by Peter Xu
      Huge pmd sharing could bring problems to userfaultfd.  Userfaultfd runs
      its logic based on special bits in page table entries, while huge pmd
      sharing can share page table entries across different address ranges.
      That can cause issues in either of these cases:
      
       - When sharing huge pmd page tables for an uffd write protected range,
         the newly mapped huge pmd range will also be write protected
         unexpectedly, or,
      
       - When we try to write protect a huge-pmd-shared range, we'll first do
         huge_pmd_unshare() in hugetlb_change_protection(); that also means
         the UFFDIO_WRITEPROTECT could be silently skipped for the shared
         region, which could lead to data loss.
      
      While at it, a few other things are done as well:
      
       - Move want_pmd_share() from mm/hugetlb.c into linux/hugetlb.h, because
         that's definitely something that arch code would like to use too
      
       - arm64 currently checks CONFIG_ARCH_WANT_HUGE_PMD_SHARE directly when
         trying to share a huge pmd.  Switch to the want_pmd_share() helper.
      
       - Move vma_shareable() from huge_pmd_share() into want_pmd_share().
      
      [peterx@redhat.com: fix build with !ARCH_WANT_HUGE_PMD_SHARE]
        Link: https://lkml.kernel.org/r/20210310185359.88297-1-peterx@redhat.com
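
      As an illustration of the resulting shape, here is a minimal sketch of
      the helper after the move (the exact names, e.g. the
      uffd_disable_huge_pmd_share() predicate on VM_UFFD_WP, and the #ifdef
      layout should be read as illustrative, not as the literal diff):

          static inline bool want_pmd_share(struct vm_area_struct *vma,
                                            unsigned long addr)
          {
          #ifndef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
                  return false;                   /* arch never shares huge pmds */
          #else
          #ifdef CONFIG_USERFAULTFD
                  /* uffd-wp tracks state in pte bits, so refuse to share */
                  if (uffd_disable_huge_pmd_share(vma))
                          return false;
          #endif
                  /* vma_shareable() check folded in from huge_pmd_share() */
                  return vma_shareable(vma, addr);
          #endif
          }

      Arch code (such as the arm64 hugetlb code mentioned above) then calls
      want_pmd_share() instead of testing CONFIG_ARCH_WANT_HUGE_PMD_SHARE
      directly.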
      
      Link: https://lkml.kernel.org/r/20210218231202.15426-1-peterx@redhat.com
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
      Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Cc: Adam Ruprecht <ruprecht@google.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Cannon Matthews <cannonmatthews@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chinwen Chang <chinwen.chang@mediatek.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Lokesh Gidra <lokeshgidra@google.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: "Michal Koutný" <mkoutny@suse.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Oliver Upton <oupton@google.com>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Shawn Anastasio <shawn@anastas.io>
      Cc: Steven Price <steven.price@arm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hugetlb: pass vma into huge_pte_alloc() and huge_pmd_share() · aec44e0f
      Committed by Peter Xu
      Patch series "hugetlb: Disable huge pmd unshare for uffd-wp", v4.
      
      This series tries to disable huge pmd unshare of hugetlbfs-backed memory
      for uffd-wp.  Although uffd-wp for hugetlbfs is still at the RFC stage,
      the idea behind this series may be needed by multiple efforts (Axel's
      uffd minor fault series and Mike's soft dirty series), so I picked it
      out of the larger series.
      
      This patch (of 4):
      
      This is preparatory work to allow the per-architecture huge_pte_alloc()
      to behave differently according to VMA attributes.

      Pass the vma deeper into huge_pmd_share() so that we can avoid the
      find_vma() call.
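
      For reference, the changed prototypes end up looking roughly like the
      sketch below (a sketch of the intended shape; the exact parameter order
      in any given tree may differ):

          /* vma is now passed down so callees can inspect VMA attributes */
          pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
                                unsigned long addr, unsigned long sz);
          pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
                                unsigned long addr, pud_t *pud);

      With the vma handed down from the fault path, huge_pmd_share() no longer
      has to reconstruct it with find_vma(mm, addr).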
      
      [peterx@redhat.com: build fix]
        Link: https://lkml.kernel.org/r/20210304164653.GB397383@xz-x1
      Link: https://lkml.kernel.org/r/20210218230633.15028-1-peterx@redhat.com

      Link: https://lkml.kernel.org/r/20210218230633.15028-2-peterx@redhat.com
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Adam Ruprecht <ruprecht@google.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: Cannon Matthews <cannonmatthews@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chinwen Chang <chinwen.chang@mediatek.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Lokesh Gidra <lokeshgidra@google.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: "Michal Koutný" <mkoutny@suse.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Oliver Upton <oupton@google.com>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Shawn Anastasio <shawn@anastas.io>
      Cc: Steven Price <steven.price@arm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 01 May 2021, 4 commits
  7. 23 Apr 2021, 2 commits
  8. 09 Apr 2021, 1 commit
  9. 29 Mar 2021, 4 commits
  10. 26 Mar 2021, 1 commit
  11. 22 Mar 2021, 1 commit
  12. 20 Mar 2021, 2 commits
  13. 19 Mar 2021, 1 commit
    • KVM: arm64: Prepare the creation of s1 mappings at EL2 · f320bc74
      Committed by Quentin Perret
      When memory protection is enabled, the EL2 code needs the ability to
      create and manage its own page-table. To do so, introduce a new set of
      hypercalls to bootstrap a memory management system at EL2.
      
      This leads to the following boot flow in nVHE Protected mode:
      
       1. the host allocates memory for the hypervisor very early on, using
          the memblock API;
      
       2. the host creates a set of stage 1 page-tables for EL2, installs the
          EL2 vectors, and issues the __pkvm_init hypercall;
      
       3. during __pkvm_init, the hypervisor re-creates its stage 1 page-table
          and stores it in the memory pool provided by the host;
      
       4. the hypervisor then extends its stage 1 mappings to include a
          vmemmap in the EL2 VA space, allowing it to use the buddy
          allocator introduced in a previous patch;
      
       5. the hypervisor jumps back into the idmap page, switches from the
          host-provided page-table to the new one, and wraps up its
          initialization by enabling the new allocator, before returning to
          the host;
      
       6. the host can free the now unused page-table created for EL2, and
          will now need to issue hypercalls to make changes to the EL2 stage 1
          mappings instead of modifying them directly.
      
      Note that for the sake of simplifying the review, this patch focuses on
      the hypervisor side of things. In other words, this only implements the
      new hypercalls, but does not make use of them from the host yet. The
      host-side changes will follow in a subsequent patch.
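
      For orientation only, a hedged sketch of what the host-side call might
      look like once the follow-up patch lands (this patch deliberately does
      not contain it; hyp_mem_base, hyp_mem_size and the abbreviated
      __pkvm_init argument list are assumptions here):

          /* Host side of step 2, nVHE Protected mode bring-up (sketch). */
          static int pkvm_init_sketch(phys_addr_t hyp_mem_base, u64 hyp_mem_size)
          {
                  /*
                   * The memory was reserved very early via the memblock API
                   * (step 1); hand the pool over to EL2 so __pkvm_init can
                   * rebuild its stage 1 page-table inside it (steps 3-5).
                   * The real argument list is longer (CPU count, per-CPU
                   * bases, VA bits) and is omitted in this sketch.
                   */
                  return kvm_call_hyp_nvhe(__pkvm_init, hyp_mem_base,
                                           hyp_mem_size);
          }

      After this point the host no longer owns the EL2 stage 1 tables and has
      to go through the new hypercalls to change EL2 mappings (step 6).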
      
      Credits to Will for __pkvm_init_switch_pgd.
      Acked-by: Will Deacon <will@kernel.org>
      Co-authored-by: Will Deacon <will@kernel.org>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Quentin Perret <qperret@google.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20210319100146.1149909-18-qperret@google.com
  14. 11 Mar 2021, 1 commit
    • arm64: mm: use a 48-bit ID map when possible on 52-bit VA builds · 7ba8f2b2
      Committed by Ard Biesheuvel
      52-bit VA kernels can run on hardware that is only 48-bit capable, but
      configure the ID map as 52-bit by default. This was not a problem until
      recently, because the special T0SZ value for a 52-bit VA space was never
      programmed into the TCR register anyway, and because a 52-bit ID map
      happens to use the same number of translation levels as a 48-bit one.
      
      This behavior was changed by commit 1401bef7 ("arm64: mm: Always update
      TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ
      value for a 52-bit VA to be programmed into TCR_EL1. While some hardware
      simply ignores this, Mark reports that Amberwing systems choke on this,
      resulting in a broken boot. But even before that commit, the unsupported
      idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly
      as well.
      
      Given that we already have to deal with address spaces being either 48-bit
      or 52-bit in size, the cleanest approach seems to be to simply default to
      a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the
      kernel in DRAM requires it. This is guaranteed not to happen unless the
      system is actually 52-bit VA capable.
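
      In C-like terms the resulting policy amounts to the sketch below (the
      real logic lives in the early assembly boot path; kernel_id_map_end is
      an illustrative name, not an actual symbol):

          /* Default to a 48-bit ID map. */
          u64 idmap_t0sz = TCR_T0SZ(VA_BITS_MIN);

          /*
           * Only widen to 52 bits if the ID-mapped kernel region really sits
           * above the 48-bit boundary, which can only happen on hardware
           * that is 52-bit VA capable in the first place.
           */
          if (kernel_id_map_end > GENMASK_ULL(VA_BITS_MIN - 1, 0))
                  idmap_t0sz = TCR_T0SZ(VA_BITS);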
      
      Fixes: 90ec95cd ("arm64: mm: Introduce VA_BITS_MIN")
      Reported-by: Mark Salter <msalter@redhat.com>
      Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.com
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
  15. 10 Mar 2021, 1 commit
  16. 09 Mar 2021, 2 commits
    • arm64/mm: Reorganize pfn_valid() · 093bbe21
      Committed by Anshuman Khandual
      There are multiple instances of pfn_to_section_nr() and __pfn_to_section()
      when CONFIG_SPARSEMEM is enabled.  This can be optimized if the memory
      section is fetched once, earlier.  This also replaces the open-coded PFN
      and address conversions with the PFN_PHYS() and PHYS_PFN() helpers, and
      adds a comment while there.  No functional change.
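
      A minimal sketch of the conversion cleanup (pfn_roundtrips() is just an
      illustrative wrapper, not a function from the patch):

          /*
           * Before: open-coded shifts such as
           *     addr = (phys_addr_t)pfn << PAGE_SHIFT;
           * After: the existing helpers express the same conversions.
           */
          static bool pfn_roundtrips(unsigned long pfn)
          {
                  phys_addr_t addr = PFN_PHYS(pfn);

                  /* Rejects pfns whose upper bits do not survive the round trip. */
                  return PHYS_PFN(addr) == pfn;
          }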
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Link: https://lore.kernel.org/r/1614921898-4099-3-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
    • arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory · eeb0753b
      Committed by Anshuman Khandual
      pfn_valid() validates a pfn, but essentially it checks for a valid
      struct page backing that pfn.  It should always succeed for memory
      ranges backed by a struct page mapping.  Currently, however, pfn_valid()
      fails for all ZONE_DEVICE based memory even though it has a struct page
      mapping.
      
      pfn_valid() asserts that there is a memblock entry for a given pfn
      without the MEMBLOCK_NOMAP flag being set.  The problem with ZONE_DEVICE
      based memory is that it does not have memblock entries, so
      memblock_is_map_memory() will invariably fail via memblock_search() for
      a ZONE_DEVICE based address.  This in turn fails pfn_valid(), which is
      wrong.  memblock_is_map_memory() needs to be skipped for such memory
      ranges.  As ZONE_DEVICE memory gets hotplugged into the system via
      memremap_pages() called from a driver, its memory sections will not have
      SECTION_IS_EARLY set.
      
      Normal hotplug memory will never have MEMBLOCK_NOMAP set in its memblock
      regions, because that flag was specifically designed for
      firmware-reserved memory regions.  memblock_is_map_memory() can
      therefore be skipped for it as well, since the check would always
      succeed, and skipping it is an optimization for normal hotplug memory.
      Like ZONE_DEVICE based memory, normal hotplugged memory will not have
      SECTION_IS_EARLY set for its sections either.
      
      Skipping memblock_is_map_memory() for all non-early memory sections
      fixes the pfn_valid() problem for ZONE_DEVICE based memory and also
      improves performance for normal hotplug memory.
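
      Taken together with the reorganization in the adjacent patch above, the
      check ends up shaped roughly like the sketch below (a sketch rather than
      the literal arm64 code; comments abbreviated):

          int pfn_valid(unsigned long pfn)
          {
                  phys_addr_t addr = PFN_PHYS(pfn);

                  /* Reject pfns whose upper bits get lost in the conversion. */
                  if (PHYS_PFN(addr) != pfn)
                          return 0;

          #ifdef CONFIG_SPARSEMEM
          {
                  struct mem_section *ms;

                  if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
                          return 0;

                  ms = __pfn_to_section(pfn);     /* fetched once, reused */
                  if (!valid_section(ms))
                          return 0;

                  /*
                   * ZONE_DEVICE and hotplugged memory have no (NOMAP)
                   * memblock entries and their sections are never marked
                   * SECTION_IS_EARLY, so skip the memblock lookup for
                   * non-early sections.
                   */
                  if (!early_section(ms))
                          return pfn_section_valid(ms, pfn);
          }
          #endif
                  return memblock_is_map_memory(addr);
          }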
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Acked-by: David Hildenbrand <david@redhat.com>
      Fixes: 73b20c84 ("arm64: mm: implement pte_devmap support")
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Link: https://lore.kernel.org/r/1614921898-4099-2-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
  17. 27 Feb 2021, 4 commits
  18. 25 Feb 2021, 1 commit
  19. 23 Feb 2021, 1 commit
  20. 09 Feb 2021, 2 commits
  21. 08 Feb 2021, 2 commits
  22. 06 Feb 2021, 1 commit
  23. 03 Feb 2021, 2 commits
  24. 27 Jan 2021, 1 commit