1. 10 6月, 2020 1 次提交
    • M
      mm: don't include asm/pgtable.h if linux/mm.h is already included · e31cf2f4
      Mike Rapoport 提交于
      Patch series "mm: consolidate definitions of page table accessors", v2.
      
      The low level page table accessors (pXY_index(), pXY_offset()) are
      duplicated across all architectures and sometimes more than once.  For
      instance, we have 31 definition of pgd_offset() for 25 supported
      architectures.
      
      Most of these definitions are actually identical and typically it boils
      down to, e.g.
      
      static inline unsigned long pmd_index(unsigned long address)
      {
              return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
      }
      
      static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
      {
              return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
      }
      
      These definitions can be shared among 90% of the arches provided
      XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.
      
      For architectures that really need a custom version there is always
      possibility to override the generic version with the usual ifdefs magic.
      
      These patches introduce include/linux/pgtable.h that replaces
      include/asm-generic/pgtable.h and add the definitions of the page table
      accessors to the new header.
      
      This patch (of 12):
      
      The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
      functions involving page table manipulations, e.g.  pte_alloc() and
      pmd_alloc().  So, there is no point to explicitly include <asm/pgtable.h>
      in the files that include <linux/mm.h>.
      
      The include statements in such cases are remove with a simple loop:
      
      	for f in $(git grep -l "include <linux/mm.h>") ; do
      		sed -i -e '/include <asm\/pgtable.h>/ d' $f
      	done
      Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
      Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e31cf2f4
  2. 09 6月, 2020 1 次提交
    • S
      mm/gup.c: convert to use get_user_{page|pages}_fast_only() · dadbb612
      Souptick Joarder 提交于
      API __get_user_pages_fast() renamed to get_user_pages_fast_only() to
      align with pin_user_pages_fast_only().
      
      As part of this we will get rid of write parameter.  Instead caller will
      pass FOLL_WRITE to get_user_pages_fast_only().  This will not change any
      existing functionality of the API.
      
      All the callers are changed to pass FOLL_WRITE.
      
      Also introduce get_user_page_fast_only(), and use it in a few places
      that hard-code nr_pages to 1.
      
      Updated the documentation of the API.
      Signed-off-by: NSouptick Joarder <jrdr.linux@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NJohn Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Paul Mackerras <paulus@ozlabs.org>		[arch/powerpc/kvm]
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Michal Suchanek <msuchanek@suse.de>
      Link: http://lkml.kernel.org/r/1590396812-31277-1-git-send-email-jrdr.linux@gmail.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dadbb612
  3. 01 6月, 2020 4 次提交
  4. 16 5月, 2020 4 次提交
  5. 14 5月, 2020 1 次提交
    • D
      kvm: Replace vcpu->swait with rcuwait · da4ad88c
      Davidlohr Bueso 提交于
      The use of any sort of waitqueue (simple or regular) for
      wait/waking vcpus has always been an overkill and semantically
      wrong. Because this is per-vcpu (which is blocked) there is
      only ever a single waiting vcpu, thus no need for any sort of
      queue.
      
      As such, make use of the rcuwait primitive, with the following
      considerations:
      
        - rcuwait already provides the proper barriers that serialize
        concurrent waiter and waker.
      
        - Task wakeup is done in rcu read critical region, with a
        stable task pointer.
      
        - Because there is no concurrency among waiters, we need
        not worry about rcuwait_wait_event() calls corrupting
        the wait->task. As a consequence, this saves the locking
        done in swait when modifying the queue. This also applies
        to per-vcore wait for powerpc kvm-hv.
      
      The x86 tscdeadline_latency test mentioned in 8577370f
      ("KVM: Use simple waitqueue for vcpu->wq") shows that, on avg,
      latency is reduced by around 15-20% with this change.
      
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: kvmarm@lists.cs.columbia.edu
      Cc: linux-mips@vger.kernel.org
      Reviewed-by: NMarc Zyngier <maz@kernel.org>
      Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
      Message-Id: <20200424054837.5138-6-dave@stgolabs.net>
      [Avoid extra logic changes. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      da4ad88c
  6. 08 5月, 2020 1 次提交
  7. 25 4月, 2020 1 次提交
  8. 21 4月, 2020 4 次提交
  9. 16 4月, 2020 1 次提交
  10. 31 3月, 2020 1 次提交
  11. 26 3月, 2020 1 次提交
  12. 17 3月, 2020 17 次提交
  13. 12 2月, 2020 1 次提交
  14. 05 2月, 2020 1 次提交
    • Z
      KVM: fix overflow of zero page refcount with ksm running · 7df003c8
      Zhuang Yanying 提交于
      We are testing Virtual Machine with KSM on v5.4-rc2 kernel,
      and found the zero_page refcount overflow.
      The cause of refcount overflow is increased in try_async_pf
      (get_user_page) without being decreased in mmu_set_spte()
      while handling ept violation.
      In kvm_release_pfn_clean(), only unreserved page will call
      put_page. However, zero page is reserved.
      So, as well as creating and destroy vm, the refcount of
      zero page will continue to increase until it overflows.
      
      step1:
      echo 10000 > /sys/kernel/pages_to_scan/pages_to_scan
      echo 1 > /sys/kernel/pages_to_scan/run
      echo 1 > /sys/kernel/pages_to_scan/use_zero_pages
      
      step2:
      just create several normal qemu kvm vms.
      And destroy it after 10s.
      Repeat this action all the time.
      
      After a long period of time, all domains hang because
      of the refcount of zero page overflow.
      
      Qemu print error log as follow:
       …
       error: kvm run failed Bad address
       EAX=00006cdc EBX=00000008 ECX=80202001 EDX=078bfbfd
       ESI=ffffffff EDI=00000000 EBP=00000008 ESP=00006cc4
       EIP=000efd75 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
       ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
       CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
       SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
       DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
       FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
       GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
       LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
       TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
       GDT=     000f7070 00000037
       IDT=     000f70ae 00000000
       CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
       DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
       DR6=00000000ffff0ff0 DR7=0000000000000400
       EFER=0000000000000000
       Code=00 01 00 00 00 e9 e8 00 00 00 c7 05 4c 55 0f 00 01 00 00 00 <8b> 35 00 00 01 00 8b 3d 04 00 01 00 b8 d8 d3 00 00 c1 e0 08 0c ea a3 00 00 01 00 c7 05 04
       …
      
      Meanwhile, a kernel warning is departed.
      
       [40914.836375] WARNING: CPU: 3 PID: 82067 at ./include/linux/mm.h:987 try_get_page+0x1f/0x30
       [40914.836412] CPU: 3 PID: 82067 Comm: CPU 0/KVM Kdump: loaded Tainted: G           OE     5.2.0-rc2 #5
       [40914.836415] RIP: 0010:try_get_page+0x1f/0x30
       [40914.836417] Code: 40 00 c3 0f 1f 84 00 00 00 00 00 48 8b 47 08 a8 01 75 11 8b 47 34 85 c0 7e 10 f0 ff 47 34 b8 01 00 00 00 c3 48 8d 78 ff eb e9 <0f> 0b 31 c0 c3 66 90 66 2e 0f 1f 84 00 0
       0 00 00 00 48 8b 47 08 a8
       [40914.836418] RSP: 0018:ffffb4144e523988 EFLAGS: 00010286
       [40914.836419] RAX: 0000000080000000 RBX: 0000000000000326 RCX: 0000000000000000
       [40914.836420] RDX: 0000000000000000 RSI: 00004ffdeba10000 RDI: ffffdf07093f6440
       [40914.836421] RBP: ffffdf07093f6440 R08: 800000424fd91225 R09: 0000000000000000
       [40914.836421] R10: ffff9eb41bfeebb8 R11: 0000000000000000 R12: ffffdf06bbd1e8a8
       [40914.836422] R13: 0000000000000080 R14: 800000424fd91225 R15: ffffdf07093f6440
       [40914.836423] FS:  00007fb60ffff700(0000) GS:ffff9eb4802c0000(0000) knlGS:0000000000000000
       [40914.836425] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [40914.836426] CR2: 0000000000000000 CR3: 0000002f220e6002 CR4: 00000000003626e0
       [40914.836427] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       [40914.836427] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       [40914.836428] Call Trace:
       [40914.836433]  follow_page_pte+0x302/0x47b
       [40914.836437]  __get_user_pages+0xf1/0x7d0
       [40914.836441]  ? irq_work_queue+0x9/0x70
       [40914.836443]  get_user_pages_unlocked+0x13f/0x1e0
       [40914.836469]  __gfn_to_pfn_memslot+0x10e/0x400 [kvm]
       [40914.836486]  try_async_pf+0x87/0x240 [kvm]
       [40914.836503]  tdp_page_fault+0x139/0x270 [kvm]
       [40914.836523]  kvm_mmu_page_fault+0x76/0x5e0 [kvm]
       [40914.836588]  vcpu_enter_guest+0xb45/0x1570 [kvm]
       [40914.836632]  kvm_arch_vcpu_ioctl_run+0x35d/0x580 [kvm]
       [40914.836645]  kvm_vcpu_ioctl+0x26e/0x5d0 [kvm]
       [40914.836650]  do_vfs_ioctl+0xa9/0x620
       [40914.836653]  ksys_ioctl+0x60/0x90
       [40914.836654]  __x64_sys_ioctl+0x16/0x20
       [40914.836658]  do_syscall_64+0x5b/0x180
       [40914.836664]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
       [40914.836666] RIP: 0033:0x7fb61cb6bfc7
      Signed-off-by: NLinFeng <linfeng23@huawei.com>
      Signed-off-by: NZhuang Yanying <ann.zhuangyanying@huawei.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      7df003c8
  15. 31 1月, 2020 1 次提交