1. 24 Jan, 2020 8 commits
  2. 05 Jan, 2020 1 commit
    • D
      mm/memory_hotplug: shrink zones when offlining memory · feee6b29
      David Hildenbrand committed
      We currently try to shrink a single zone when removing memory.  We use
      the zone of the first page of the memory we are removing.  If that
      memmap was never initialized (e.g., memory was never onlined), we will
      read garbage and can trigger kernel BUGs (due to a stale pointer):
      
          BUG: unable to handle page fault for address: 000000000000353d
          #PF: supervisor write access in kernel mode
          #PF: error_code(0x0002) - not-present page
          PGD 0 P4D 0
          Oops: 0002 [#1] SMP PTI
          CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
          Workqueue: kacpi_hotplug acpi_hotplug_work_fn
          RIP: 0010:clear_zone_contiguous+0x5/0x10
          Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
          RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
          RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
          RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
          RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
          R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
          R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
          FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           __remove_pages+0x4b/0x640
           arch_remove_memory+0x63/0x8d
           try_remove_memory+0xdb/0x130
           __remove_memory+0xa/0x11
           acpi_memory_device_remove+0x70/0x100
           acpi_bus_trim+0x55/0x90
           acpi_device_hotplug+0x227/0x3a0
           acpi_hotplug_work_fn+0x1a/0x30
           process_one_work+0x221/0x550
           worker_thread+0x50/0x3b0
           kthread+0x105/0x140
           ret_from_fork+0x3a/0x50
          Modules linked in:
          CR2: 000000000000353d
      
      Instead, shrink the zones when offlining memory or when onlining failed.
      Introduce and use remove_pfn_range_from_zone() for that.  We now
      properly shrink the zones, even if we have DIMMs whereby
      
       - Some memory blocks fall into no zone (never onlined)
      
       - Some memory blocks fall into multiple zones (offlined+re-onlined)
      
       - Multiple memory blocks that fall into different zones
      
      Drop the zone parameter (with a potential dubious value) from
      __remove_pages() and __remove_section().
      
      Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: <stable@vger.kernel.org>	[5.0+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
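The idea in the commit above, shrinking a zone's span at the point where the owning zone is actually known (offlining) rather than guessing it from a possibly uninitialized memmap at removal time, can be illustrated with a toy model. This is a minimal sketch under stated assumptions: `struct toy_zone`, `toy_add_pfn_range()` and `toy_remove_pfn_range_from_zone()` are hypothetical names invented for illustration, and the per-pfn `present` array is a simplification that does not reflect the kernel's real data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of this is the kernel's struct zone. */
#define TOY_MAX_PFN 64UL

struct toy_zone {
    unsigned long start_pfn;      /* first pfn covered by the zone span */
    unsigned long spanned_pages;  /* length of the span, 0 when empty */
    bool present[TOY_MAX_PFN];    /* pfns that currently belong to the zone */
};

/* Recompute the span from the pfns that still belong to the zone. */
static void toy_recompute_span(struct toy_zone *z)
{
    unsigned long first = TOY_MAX_PFN, last = 0;

    for (unsigned long pfn = 0; pfn < TOY_MAX_PFN; pfn++) {
        if (!z->present[pfn])
            continue;
        if (pfn < first)
            first = pfn;
        last = pfn;
    }
    if (first == TOY_MAX_PFN) {   /* nothing left: the zone is empty */
        z->start_pfn = 0;
        z->spanned_pages = 0;
        return;
    }
    z->start_pfn = first;
    z->spanned_pages = last - first + 1;
}

/* Mark [start, start + nr) as present or absent, clamped to the toy range. */
static void toy_set_range(struct toy_zone *z, unsigned long start,
                          unsigned long nr, bool value)
{
    assert(z);
    for (unsigned long pfn = start; pfn < start + nr && pfn < TOY_MAX_PFN; pfn++)
        z->present[pfn] = value;
    toy_recompute_span(z);
}

/* Onlining: the caller knows the zone and grows it. */
static void toy_add_pfn_range(struct toy_zone *z, unsigned long start,
                              unsigned long nr)
{
    toy_set_range(z, start, nr, true);
}

/*
 * Offlining (or failed onlining): the caller passes the zone explicitly,
 * so nothing has to be read back from per-page metadata.
 */
static void toy_remove_pfn_range_from_zone(struct toy_zone *z,
                                           unsigned long start,
                                           unsigned long nr)
{
    toy_set_range(z, start, nr, false);
}
```

In this model, memory that was never onlined never appears in any zone, so removing it requires no zone lookup at all, which mirrors the reason the commit drops the zone parameter from __remove_pages().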
  3. 30 Dec, 2019 1 commit
  4. 23 Dec, 2019 1 commit
    • M
      powerpc/mm: Mark get_slice_psize() & slice_addr_is_low() as notrace · 91a063c9
      Michael Ellerman committed
      These slice routines are called from the SLB miss handler, which can
      lead to warnings from the IRQ code, because we have not reconciled the
      IRQ state properly:
      
        WARNING: CPU: 72 PID: 30150 at arch/powerpc/kernel/irq.c:258 arch_local_irq_restore.part.0+0xcc/0x100
        Modules linked in:
        CPU: 72 PID: 30150 Comm: ftracetest Not tainted 5.5.0-rc2-gcc9x-g7e0165b2 #1
        NIP:  c00000000001d83c LR: c00000000029ab90 CTR: c00000000026cf90
        REGS: c0000007eee3b960 TRAP: 0700   Not tainted  (5.5.0-rc2-gcc9x-g7e0165b2)
        MSR:  8000000000021033 <SF,ME,IR,DR,RI,LE>  CR: 22242844  XER: 20000000
        CFAR: c00000000001d780 IRQMASK: 0
        ...
        NIP arch_local_irq_restore.part.0+0xcc/0x100
        LR  trace_graph_entry+0x270/0x340
        Call Trace:
          trace_graph_entry+0x254/0x340 (unreliable)
          function_graph_enter+0xe4/0x1a0
          prepare_ftrace_return+0xa0/0x130
          ftrace_graph_caller+0x44/0x94	# (get_slice_psize())
          slb_allocate_user+0x7c/0x100
          do_slb_fault+0xf8/0x300
          instruction_access_slb_common+0x140/0x180
      
      Fixes: 48e7b769 ("powerpc/64s/hash: Convert SLB miss handlers to C")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20191221121337.4894-1-mpe@ellerman.id.au
  5. 18 Dec, 2019 1 commit
    • P
      KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisor · d89c69f4
      Paul Mackerras committed
      Commit 22945688 ("KVM: PPC: Book3S HV: Support reset of secure
      guest") added a call to uv_svm_terminate, which is an ultravisor
      call, without any check that the guest is a secure guest or even that
      the system has an ultravisor.  On a system without an ultravisor,
      the ultracall will degenerate to a hypercall, but since we are not
      in KVM guest context, the hypercall will get treated as a system
      call, which could have random effects depending on what happens to
      be in r0, and could also corrupt the current task's kernel stack.
      Hence this adds a test for the guest being a secure guest before
      doing uv_svm_terminate().
      
      Fixes: 22945688 ("KVM: PPC: Book3S HV: Support reset of secure guest")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
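The guard described in the commit above can be sketched as follows. This is only an illustration under stated assumptions: `struct toy_kvm`, `toy_uv_svm_terminate()` and `toy_teardown_vm()` are hypothetical stand-ins, and the stub merely counts calls instead of issuing a real ultracall.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-VM state; not the kernel's struct kvm. */
struct toy_kvm {
    unsigned int secure_guest;  /* non-zero once the guest started going secure */
    unsigned int lpid;          /* logical partition id of the guest */
};

/* Counts how many times the (stubbed) ultracall was issued. */
static int toy_ucall_count;

/* Stub for the ultravisor call; a real ucall needs an ultravisor. */
static int toy_uv_svm_terminate(unsigned int lpid)
{
    (void)lpid;
    toy_ucall_count++;
    return 0;
}

/*
 * Teardown path: issue the ultracall only for secure guests. On a system
 * without an ultravisor no guest can be secure, so the call is skipped.
 */
static int toy_teardown_vm(struct toy_kvm *kvm)
{
    assert(kvm);
    if (kvm->secure_guest)
        return toy_uv_svm_terminate(kvm->lpid);
    return 0;
}
```

Checking the guest state rather than probing for an ultravisor covers both cases named in the commit message with a single test, since a guest can only be secure when an ultravisor is present.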
  6. 17 Dec, 2019 1 commit
  7. 16 Dec, 2019 3 commits
  8. 14 Dec, 2019 1 commit
  9. 13 Dec, 2019 3 commits
  10. 10 Dec, 2019 1 commit
  11. 05 Dec, 2019 6 commits
  12. 04 Dec, 2019 4 commits
  13. 02 Dec, 2019 1 commit
  14. 29 Nov, 2019 1 commit
    • C
      powerpc/fixmap: fix crash with HIGHMEM · 2807273f
      Christophe Leroy committed
      Commit f2bb8693 ("powerpc/fixmap: don't clear fixmap area in
      paging_init()") removed the clearing of fixmap area in order to
      avoid clearing fixmapped areas set earlier.
      
      However unlike all other users of fixmap which use __set_fixmap(),
      HIGHMEM functions directly use __set_pte_at(). This means
      the page table must pre-exist, otherwise the following crash
      can be encountered due to the lack of an entry in the PGD:
      
      Oops: Kernel access of bad area, sig: 11 [#1]
      BE PAGE_SIZE=4K MMU=Hash PowerMac
      Modules linked in:
      CPU: 0 PID: 1 Comm: swapper Not tainted 5.4.0+ #2528
      NIP:  c0144ce8 LR: c0144ccc CTR: 00000080
      REGS: ef0b5aa0 TRAP: 0300   Not tainted  (5.4.0+)
      MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 44282842  XER: 00000000
      DAR: fffdf000 DSISR: 42000000
      GPR00: c0144ccc ef0b5b58 ef0b0000 fffdf000 fffdf000 00000000 c0000f7c 00000000
      GPR08: c0833000 fffdf000 00000000 ef1c53c9 24042842 00000000 00000000 00000000
      GPR16: 00000000 00000000 ef7e7358 effe8160 00000000 c08a9660 c0851644 00000004
      GPR24: c08c70a8 00002dc2 00000000 00000001 00000201 effe8160 effe8160 00000000
      NIP [c0144ce8] prep_new_page+0x138/0x178
      LR [c0144ccc] prep_new_page+0x11c/0x178
      Call Trace:
      [ef0b5b58] [c0144ccc] prep_new_page+0x11c/0x178 (unreliable)
      [ef0b5b88] [c0147218] get_page_from_freelist+0x1fc/0xd88
      [ef0b5c38] [c0148328] __alloc_pages_nodemask+0xd4/0xbb4
      [ef0b5cf8] [c0142ba8] __vmalloc_node_range+0x1b4/0x2e0
      [ef0b5d38] [c0142dd0] vzalloc+0x48/0x58
      [ef0b5d58] [c0301c8c] check_partition+0x58/0x244
      [ef0b5d78] [c02ffe80] blk_add_partitions+0x44/0x2cc
      [ef0b5db8] [c01a32d8] bdev_disk_changed+0x68/0xfc
      [ef0b5de8] [c01a4494] __blkdev_get+0x290/0x460
      [ef0b5e28] [c02fdd40] __device_add_disk+0x480/0x4d8
      [ef0b5e68] [c0810688] brd_init+0xc0/0x188
      [ef0b5e88] [c0005194] do_one_initcall+0x40/0x19c
      [ef0b5ee8] [c07dd4dc] kernel_init_freeable+0x164/0x230
      [ef0b5f28] [c0005408] kernel_init+0x18/0x10c
      [ef0b5f38] [c0014274] ret_from_kernel_thread+0x14/0x1c
      
      Partially revert that commit to still clear the fixmap area dedicated
      to HIGHMEM.
      
      Fixes: f2bb8693 ("powerpc/fixmap: don't clear fixmap area in paging_init()")
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/d42fa9747df5afa41e67b08e374c98d3b40529c9.1574927918.git.christophe.leroy@c-s.fr
  15. 28 Nov, 2019 6 commits
    • A
      powerpc: Ultravisor: Add PPC_UV config option · 013a53f2
      Anshuman Khandual committed
      CONFIG_PPC_UV adds support for ultravisor.
      Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      [ Update config help and commit message ]
      Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
      Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • B
      KVM: PPC: Book3S HV: Support reset of secure guest · 22945688
      Bharata B Rao committed
      Add support for reset of secure guest via a new ioctl KVM_PPC_SVM_OFF.
      This ioctl will be issued by QEMU during reset and includes
      the following steps:
      
      - Release all device pages of the secure guest.
      - Ask UV to terminate the guest via UV_SVM_TERMINATE ucall
      - Unpin the VPA pages so that they can be migrated back to secure
        side when guest becomes secure again. This is required because
        pinned pages can't be migrated.
      - Reinit the partition scoped page tables
      
      After these steps, guest is ready to issue UV_ESM call once again
      to switch to secure mode.
      Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      	[Implementation of uv_svm_terminate() and its call from
      	guest shutdown path]
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      	[Unpinning of VPA pages]
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • B
      KVM: PPC: Book3S HV: Handle memory plug/unplug to secure VM · c3262257
      Bharata B Rao committed
      Register the new memslot with UV during plug and unregister
      the memslot during unplug. In addition, release all the
      device pages during unplug.
      Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • B
      KVM: PPC: Book3S HV: Radix changes for secure guest · 008e359c
      Bharata B Rao committed
      - After the guest becomes secure, when we handle a page fault of a page
        belonging to SVM in HV, send that page to UV via UV_PAGE_IN.
      - Whenever a page is unmapped on the HV side, inform UV via UV_PAGE_INVAL.
      - Ensure all those routines that walk the secondary page tables of
        the guest don't do so in case of secure VM. For secure guest, the
        active secondary page tables are in secure memory and the secondary
        page tables in HV are freed when guest becomes secure.
      Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • B
      KVM: PPC: Book3S HV: Shared pages support for secure guests · 60f0a643
      Bharata B Rao committed
      A secure guest will share some of its pages with the hypervisor (e.g.,
      virtio bounce buffers). Support sharing of pages between hypervisor and
      ultravisor.
      
      A shared page is reachable via both HV and UV side page tables. Once a
      secure page is converted to a shared page, the device page that represents
      the secure page is unmapped from the HV side page tables.
      Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • B
      KVM: PPC: Book3S HV: Support for running secure guests · ca9f4942
      Bharata B Rao committed
      A pseries guest can be run as a secure guest on Ultravisor-enabled
      POWER platforms. On such platforms, this driver will be used to manage
      the movement of guest pages between the normal memory managed by
      hypervisor (HV) and secure memory managed by Ultravisor (UV).
      
      HV is informed about the guest's transition to secure mode via hcalls:
      
      H_SVM_INIT_START: Initiate securing a VM
      H_SVM_INIT_DONE: Conclude securing a VM
      
      As part of H_SVM_INIT_START, register all existing memslots with
      the UV. H_SVM_INIT_DONE call by UV informs HV that transition of
      the guest to secure mode is complete.
      
      These two states (transition to secure mode STARTED and transition
      to secure mode COMPLETED) are recorded in kvm->arch.secure_guest.
      Setting these states will cause the assembly code that enters the
      guest to call the UV_RETURN ucall instead of trying to enter the
      guest directly.
      
      Migration of pages between normal and secure memory of a secure
      guest is implemented in H_SVM_PAGE_IN and H_SVM_PAGE_OUT hcalls.
      
      H_SVM_PAGE_IN: Move the content of a normal page to secure page
      H_SVM_PAGE_OUT: Move the content of a secure page to normal page
      
      Private ZONE_DEVICE memory equal to the amount of secure memory
      available in the platform for running secure guests is created.
      Whenever a page belonging to the guest becomes secure, a page from
      this private device memory is used to represent and track that secure
      page on the HV side. The movement of pages between normal and secure
      memory is done via migrate_vma_pages() using UV_PAGE_IN and
      UV_PAGE_OUT ucalls.
      
      In order to prevent the device private pages (that correspond to pages
      of secure guest) from participating in KSM merging, H_SVM_PAGE_IN
      calls ksm_madvise() under read version of mmap_sem. However
      ksm_madvise() needs to be under write lock.  Hence we call
      kvmppc_svm_page_in with mmap_sem held for writing, and it then
      downgrades to a read lock after calling ksm_madvise.
      
      [paulus@ozlabs.org - roll in patch "KVM: PPC: Book3S HV: Take write
       mmap_sem when calling ksm_madvise"]
      Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
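The two transition states described in the commit above (STARTED and COMPLETED, recorded in kvm->arch.secure_guest) and their effect on guest entry can be sketched as a small flag-based state tracker. This is a minimal sketch, not the kernel code: the `TOY_*` flag values, `struct toy_kvm_arch`, and the three functions are hypothetical names chosen for illustration, and rejecting a DONE request that arrives before START is an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical flag values for the two recorded states. */
#define TOY_SECURE_INIT_START 0x1u  /* transition to secure mode STARTED */
#define TOY_SECURE_INIT_DONE  0x2u  /* transition to secure mode COMPLETED */

/* Hypothetical stand-in for the per-VM architecture state. */
struct toy_kvm_arch {
    unsigned int secure_guest;
};

enum toy_entry_path {
    TOY_ENTER_DIRECT,         /* normal guest: enter directly */
    TOY_ENTER_VIA_UV_RETURN   /* secure guest: hand control back to the UV */
};

/* Models H_SVM_INIT_START: record that securing the VM has begun. */
static int toy_h_svm_init_start(struct toy_kvm_arch *arch)
{
    assert(arch);
    arch->secure_guest |= TOY_SECURE_INIT_START;
    return 0;
}

/* Models H_SVM_INIT_DONE: only valid after START (assumption of this sketch). */
static int toy_h_svm_init_done(struct toy_kvm_arch *arch)
{
    assert(arch);
    if (!(arch->secure_guest & TOY_SECURE_INIT_START))
        return -1;
    arch->secure_guest |= TOY_SECURE_INIT_DONE;
    return 0;
}

/* Any recorded state diverts guest entry to the UV_RETURN path. */
static enum toy_entry_path toy_pick_entry_path(const struct toy_kvm_arch *arch)
{
    return arch->secure_guest ? TOY_ENTER_VIA_UV_RETURN : TOY_ENTER_DIRECT;
}
```

Keeping both states as bits in one word lets the entry path test a single value, which matches the commit's statement that setting either state redirects guest entry to the UV_RETURN ucall.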
  16. 27 Nov, 2019 1 commit