  1. 15 May, 2017 2 commits
    • x86/tsc: Remodel cyc2ns to use seqcount_latch() · 59eaef78
      Peter Zijlstra committed
      Replace the custom multi-value scheme with the more regular
      seqcount_latch() scheme. Along with scrapping a lot of lines, the latch
      scheme is better documented and used in more places.
      
      The immediate benefit, however, is that the update side is no longer
      limited. The current code has a limit at which writers block, and
      future changes would hit that limit.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
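The latch idea referred to in this commit can be sketched in userspace C. This is a minimal illustration, not the kernel's code: the names `cyc2ns_latch`, `latch_write` and `latch_read` are hypothetical, C11 atomics stand in for the kernel's seqcount primitives, a single writer is assumed, and strictly conforming C11 would also need the payload fields themselves to be accessed atomically.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical payload, loosely modelled on cycles-to-nanoseconds parameters. */
struct cyc2ns_data {
    uint32_t mul;
    uint32_t shift;
    uint64_t offset;
};

/* Two copies of the payload plus a sequence counter that selects one of them. */
struct cyc2ns_latch {
    atomic_uint seq;
    struct cyc2ns_data data[2];
};

/*
 * Single-writer update. The writer never waits for readers: it bumps seq,
 * rewrites one copy, bumps seq again, then rewrites the other copy.
 */
static void latch_write(struct cyc2ns_latch *l, struct cyc2ns_data v)
{
    atomic_fetch_add(&l->seq, 1);   /* seq is odd: readers use data[1] */
    l->data[0] = v;
    atomic_fetch_add(&l->seq, 1);   /* seq is even: readers use data[0] */
    l->data[1] = v;
}

/* Lock-free read: retry only if seq changed while the copy was being read. */
static struct cyc2ns_data latch_read(struct cyc2ns_latch *l)
{
    struct cyc2ns_data v;
    unsigned int seq;

    do {
        seq = atomic_load(&l->seq);
        v = l->data[seq & 1];
    } while (seq != atomic_load(&l->seq));

    return v;
}
```

Because readers are always steered to the copy that is not being rewritten, the writer has no point at which it must block, which is the update-side benefit the commit message describes.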
    • x86/tsc: Provide 'tsc=unstable' boot parameter · 8309f86c
      Peter Zijlstra committed
      Since the clocksource watchdog will only detect broken TSC after the
      fact, all TSC based clocks will likely have observed non-continuous
      values before/when switching away from TSC.
      
      Therefore the only way to fully avoid random clock movement when your
      BIOS randomly mucks with TSC values from SMI handlers is to report the
      TSC as unstable at boot.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 13 May, 2017 1 commit
  3. 11 May, 2017 2 commits
  4. 10 May, 2017 3 commits
    • uapi: export all headers under uapi directories · fcc8487d
      Nicolas Dichtel committed
      Regularly, when a new header is created in include/uapi/, the developer
      forgets to add it in the corresponding Kbuild file. This error is usually
      detected after the release is out.
      
      In fact, all headers under uapi directories should be exported, thus it's
      useless to have an exhaustive list.
      
      After this patch, the following files, which were not exported, are now
      exported (with make headers_install_all):
      asm-arc/kvm_para.h
      asm-arc/ucontext.h
      asm-blackfin/shmparam.h
      asm-blackfin/ucontext.h
      asm-c6x/shmparam.h
      asm-c6x/ucontext.h
      asm-cris/kvm_para.h
      asm-h8300/shmparam.h
      asm-h8300/ucontext.h
      asm-hexagon/shmparam.h
      asm-m32r/kvm_para.h
      asm-m68k/kvm_para.h
      asm-m68k/shmparam.h
      asm-metag/kvm_para.h
      asm-metag/shmparam.h
      asm-metag/ucontext.h
      asm-mips/hwcap.h
      asm-mips/reg.h
      asm-mips/ucontext.h
      asm-nios2/kvm_para.h
      asm-nios2/ucontext.h
      asm-openrisc/shmparam.h
      asm-parisc/kvm_para.h
      asm-powerpc/perf_regs.h
      asm-sh/kvm_para.h
      asm-sh/ucontext.h
      asm-tile/shmparam.h
      asm-unicore32/shmparam.h
      asm-unicore32/ucontext.h
      asm-x86/hwcap2.h
      asm-xtensa/kvm_para.h
      drm/armada_drm.h
      drm/etnaviv_drm.h
      drm/vgem_drm.h
      linux/aspeed-lpc-ctrl.h
      linux/auto_dev-ioctl.h
      linux/bcache.h
      linux/btrfs_tree.h
      linux/can/vxcan.h
      linux/cifs/cifs_mount.h
      linux/coresight-stm.h
      linux/cryptouser.h
      linux/fsmap.h
      linux/genwqe/genwqe_card.h
      linux/hash_info.h
      linux/kcm.h
      linux/kcov.h
      linux/kfd_ioctl.h
      linux/lightnvm.h
      linux/module.h
      linux/nbd-netlink.h
      linux/nilfs2_api.h
      linux/nilfs2_ondisk.h
      linux/nsfs.h
      linux/pr.h
      linux/qrtr.h
      linux/rpmsg.h
      linux/sched/types.h
      linux/sed-opal.h
      linux/smc.h
      linux/smc_diag.h
      linux/stm.h
      linux/switchtec_ioctl.h
      linux/vfio_ccw.h
      linux/wil6210_uapi.h
      rdma/bnxt_re-abi.h
      
      Note that I have removed from this list the files which are generated in every
      exported directory (like .install or .install.cmd).
      
      Thanks to Julien Floret <julien.floret@6wind.com> for the tip to get all
      subdirs with a pure makefile command.
      
      For the record, note that exported files for asm directories are a mix of
      files listed by:
       - include/uapi/asm-generic/Kbuild.asm;
       - arch/<arch>/include/uapi/asm/Kbuild;
       - arch/<arch>/include/asm/Kbuild.
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
      Acked-by: Mark Salter <msalter@redhat.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • x86: stop exporting msr-index.h to userland · 25dc1d6c
      Nicolas Dichtel committed
      Even if this file was not in an uapi directory, it was exported because
      it was listed in the Kbuild file.
      
      Fixes: b72e7464 ("x86/uapi: Do not export <asm/msr-index.h> as part of the user API headers")
      Suggested-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • x86, pmem: Fix cache flushing for iovec write < 8 bytes · 8376efd3
      Ben Hutchings committed
      Commit 11e63f6d added cache flushing for unaligned writes from an
      iovec, covering the first and last cache line of a >= 8 byte write and
      the first cache line of a < 8 byte write.  But an unaligned write of
      2-7 bytes can still cover two cache lines, so make sure we flush both
      in that case.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 11e63f6d ("x86, pmem: fix broken __copy_user_nocache ...")
      Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
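The case this fix covers, a short unaligned write that still straddles a cache-line boundary, can be checked with a little address arithmetic. The sketch below is illustrative only: `CACHE_LINE_SIZE`, `cache_line_base` and `write_spans_two_lines` are hypothetical names, and a 64-byte line size is assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u   /* assumption: 64-byte cache lines */

/* Start address of the cache line that contains addr. */
static uintptr_t cache_line_base(uintptr_t addr)
{
    return addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
}

/* True when the byte range [addr, addr + len) touches more than one line. */
static bool write_spans_two_lines(uintptr_t addr, size_t len)
{
    if (len == 0)
        return false;
    return cache_line_base(addr) != cache_line_base(addr + len - 1);
}
```

For example, a 4-byte write starting 2 bytes before a line boundary touches both the line it starts in and the next one, so flushing only the first line would leave data behind in the cache.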
  5. 09 May, 2017 12 commits
  6. 08 May, 2017 2 commits
    • x86/kexec/64: Use gbpages for identity mappings if available · 8638100c
      Xunlei Pang committed
      Kexec sets up all identity mappings before booting into the new
      kernel, and this will cause extra memory consumption for paging
      structures which is quite considerable on modern machines with
      huge memory sizes.
      
      E.g. on a 32TB machine that is kdumping, it could waste around
      128MB (around 4MB/TB) from the reserved memory after kexec sets
      all the identity mappings using the current 2MB page.
      
      Add to that the memory needed for the loaded kdump kernel, initramfs,
      etc., and it causes a kexec syscall -ENOMEM failure.
      
      As a result, we had to enlarge reserved memory via "crashkernel=X"
      to work around this problem.
      
      This causes some trouble for distributions that use policies
      to evaluate the proper "crashkernel=X" value for users.
      
      So enable gbpages for kexec mappings.
      Signed-off-by: Xunlei Pang <xlpang@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: akpm@linux-foundation.org
      Cc: kexec@lists.infradead.org
      Link: http://lkml.kernel.org/r/1493862171-8799-2-git-send-email-xlpang@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Add support for gbpages to kernel_ident_mapping_init() · 66aad4fd
      Xunlei Pang committed
      Kernel identity mappings on x86-64 kernels are created in two
      ways: by the early x86 boot code, or by kernel_ident_mapping_init().
      
      Native kernels (which is the dominant usecase) use the former,
      but the kexec and the hibernation code uses kernel_ident_mapping_init().
      
      There's a subtle difference between these two ways of creating identity
      mappings: the current kernel_ident_mapping_init() code always creates
      identity mappings using 2MB pages (PMD level), while the native kernel
      boot path also utilizes gbpages where available.
      
      This difference is suboptimal both for performance and for memory
      usage: kernel_ident_mapping_init() needs to allocate pages for the
      page tables when creating the new identity mappings.
      
      This patch adds 1GB page (PUD level) support to kernel_ident_mapping_init()
      to address these concerns.
      
      The primary advantage would be better TLB coverage/performance,
      because we'd utilize 1GB TLBs instead of 2MB ones.
      
      It is also useful for machines with large amounts of memory, to
      save paging structure allocations (around 4MB/TB using 2MB pages)
      when setting up identity mappings for all the memory; with 1GB
      pages it consumes only 8KB/TB.
      
      ( Note that this change alone does not activate gbpages in kexec,
        we are doing that in a separate patch. )
      Signed-off-by: Xunlei Pang <xlpang@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: akpm@linux-foundation.org
      Cc: kexec@lists.infradead.org
      Link: http://lkml.kernel.org/r/1493862171-8799-1-git-send-email-xlpang@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
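The 4MB/TB, 8KB/TB and 128MB figures quoted in the two gbpages commits follow from simple arithmetic on leaf page-table entries. The sketch below reproduces them; `leaf_table_bytes` is a hypothetical helper, 8-byte x86-64 page-table entries are assumed, and the comparatively small upper-level tables are ignored.

```c
#include <stdint.h>

#define PT_ENTRY_SIZE 8ull          /* bytes per x86-64 page-table entry */
#define SZ_2M (1ull << 21)
#define SZ_1G (1ull << 30)
#define SZ_1T (1ull << 40)

/*
 * Bytes of leaf-level page-table entries needed to identity-map `mem`
 * bytes with pages of `page_size` bytes. Upper-level tables are ignored.
 */
static uint64_t leaf_table_bytes(uint64_t mem, uint64_t page_size)
{
    return (mem / page_size) * PT_ENTRY_SIZE;
}
```

With these assumptions, 1TB mapped with 2MB pages needs 524288 entries (4MB), the same 1TB mapped with 1GB pages needs 1024 entries (8KB), and a 32TB machine mapped with 2MB pages needs 128MB.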
  7. 07 May, 2017 1 commit
    • x86/boot: Declare error() as noreturn · 60854a12
      Kees Cook committed
      The compressed boot function error() is used to halt execution, but it
      wasn't marked with "noreturn". This fixes that in preparation for
      supporting kernel FORTIFY_SOURCE, which uses the noreturn annotation
      on panic, and calls error(). GCC would warn about a noreturn function
      calling a non-noreturn function:
      
        arch/x86/boot/compressed/misc.c: In function ‘fortify_panic’:
        arch/x86/boot/compressed/misc.c:416:1: warning: ‘noreturn’ function does return
         }
         ^
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Daniel Micay <danielmicay@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Link: http://lkml.kernel.org/r/20170506045116.GA2879@beast
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
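The warning quoted in this commit can be reproduced in a small standalone sketch. This is not the boot code: `halt_point` and the use of `longjmp()` are hypothetical stand-ins for halting the machine, chosen only so the example can be exercised, and the GCC/Clang `noreturn` attribute syntax is assumed.

```c
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical stand-in for "halt the machine" so the sketch can be run. */
static jmp_buf halt_point;

/* Both functions promise the compiler that they never return. */
static void error(const char *msg) __attribute__((noreturn));
static void fortify_panic(const char *name) __attribute__((noreturn));

static void error(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    longjmp(halt_point, 1);     /* control never comes back to the caller */
}

/*
 * If error() lacked the noreturn attribute, the compiler would have to
 * assume it can return, and would warn that this function does return.
 */
static void fortify_panic(const char *name)
{
    error(name);
}
```

Removing the attribute from the `error()` prototype while keeping it on `fortify_panic()` is the situation the commit message describes.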
  8. 05 May, 2017 6 commits
    • xen/x86: Do not call xen_init_time_ops() until shared_info is initialized · d162809f
      Boris Ostrovsky committed
      Routines that are set by xen_init_time_ops() use shared_info's
      pvclock_vcpu_time_info area. This area is not properly available until
      shared_info is mapped in xen_setup_shared_info().
      
      This became especially problematic due to commit dd759d93 ("x86/timers:
      Add simple udelay calibration") where we end up reading tsc_to_system_mul
      from xen_dummy_shared_info (i.e. getting zero value) and then trying
      to divide by it in pvclock_tsc_khz().
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
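The division problem described in this commit can be illustrated with a standalone sketch. The arithmetic below is an assumption about how a TSC frequency is derived from pvclock scaling parameters, not a copy of pvclock_tsc_khz(); `tsc_khz_from_pvclock` is a hypothetical name, and the zero check exists only to make the failure mode visible (the commit itself fixes the problem by changing initialization order, not by adding a check).

```c
#include <stdint.h>

/*
 * Hypothetical helper: derive a TSC frequency in kHz from pvclock-style
 * scaling parameters. Returns 0 instead of dividing when the multiplier
 * has not been populated yet.
 */
static uint64_t tsc_khz_from_pvclock(uint32_t tsc_to_system_mul, int8_t tsc_shift)
{
    uint64_t khz;

    if (tsc_to_system_mul == 0)
        return 0;   /* e.g. read from a dummy, all-zero shared_info */

    khz = (1000000ull << 32) / tsc_to_system_mul;
    if (tsc_shift < 0)
        khz <<= -tsc_shift;
    else
        khz >>= tsc_shift;
    return khz;
}
```

With a zero multiplier, as read from a not-yet-mapped shared_info, an unguarded version of this computation would divide by zero.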
    • x86/xen: fix xsave capability setting · 40f4ac0b
      Juergen Gross committed
      Commit 690b7f10b4f9f ("x86/xen: use capabilities instead of fake cpuid
      values for xsave") introduced a regression as it tried to make use of
      the fixup feature before it was available.
      
      Fall back to the old variant testing via cpuid().
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • kvm: nVMX: Don't validate disabled secondary controls · 2e5b0bd9
      Jim Mattson committed
      According to the SDM, if the "activate secondary controls" primary
      processor-based VM-execution control is 0, no checks are performed on
      the secondary processor-based VM-execution controls.
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
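The rule stated in this commit can be sketched as a small validation routine. This is an illustration under assumptions, not KVM's code: `control_ok` and `exec_controls_ok` are hypothetical helpers, `low` means "bits that must be 1" and `high` means "bits that may be 1", and bit 31 of the primary controls is taken to be "activate secondary controls".

```c
#include <stdbool.h>
#include <stdint.h>

#define ACTIVATE_SECONDARY_CONTROLS 0x80000000u   /* bit 31 of the primary controls */

/* low: bits that must be set; high: bits that are allowed to be set. */
static bool control_ok(uint32_t control, uint32_t low, uint32_t high)
{
    return ((control & high) | low) == control;
}

/* Secondary controls are checked only when the primary controls enable them. */
static bool exec_controls_ok(uint32_t primary, uint32_t p_low, uint32_t p_high,
                             uint32_t secondary, uint32_t s_low, uint32_t s_high)
{
    if (!control_ok(primary, p_low, p_high))
        return false;
    if (!(primary & ACTIVATE_SECONDARY_CONTROLS))
        return true;    /* secondary controls are ignored entirely */
    return control_ok(secondary, s_low, s_high);
}
```

With the activation bit clear, any value in the secondary field passes, which matches the behaviour the commit message attributes to the SDM.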
    • x86/mm/kaslr: Use the _ASM_MUL macro for multiplication to work around Clang incompatibility · 121843eb
      Matthias Kaehlcke committed
      The constraint "rm" allows the compiler to put mix_const into memory.
      When the input operand is a memory location then MUL needs an operand
      size suffix, since Clang can't infer the multiplication width from the
      operand.
      
      Add and use the _ASM_MUL macro which determines the operand size and
      resolves to the MUL instruction with the corresponding suffix.
      
      This fixes the following error when building with clang:
      
        CC      arch/x86/lib/kaslr.o
        /tmp/kaslr-dfe1ad.s: Assembler messages:
        /tmp/kaslr-dfe1ad.s:182: Error: no instruction mnemonic suffix given and no register operands; can't size instruction
      Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
      Cc: Grant Grundler <grundler@chromium.org>
      Cc: Greg Hackmann <ghackmann@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Davidson <md@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170501224741.133938-1-mka@chromium.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds() · fc5f9d5f
      Baoquan He committed
      Jeff Moyer reported that on his system with two memory regions 0~64G and
      1T~1T+192G, and kernel option "memmap=192G!1024G" added, enabling KASLR
      makes the system hang intermittently during boot, while adding 'nokaslr'
      does not.
      
      The back trace is:
      
       Oops: 0000 [#1] SMP
      
       RIP: memcpy_erms()
       [ .... ]
       Call Trace:
        pmem_rw_page()
        bdev_read_page()
        do_mpage_readpage()
        mpage_readpages()
        blkdev_readpages()
        __do_page_cache_readahead()
        force_page_cache_readahead()
        page_cache_sync_readahead()
        generic_file_read_iter()
        blkdev_read_iter()
        __vfs_read()
        vfs_read()
        SyS_read()
        entry_SYSCALL_64_fastpath()
      
      This crash happens because the for loop count calculation in sync_global_pgds()
      is not correct. When a mapping area crosses PGD entries, we should
      calculate the starting address of the region which the next PGD covers and
      assign it to the next for loop count, rather than add PGDIR_SIZE directly. The
      old code works right only if the mapping area is an exact multiple of PGDIR_SIZE,
      otherwise the end region could be skipped so that it can't be synchronized
      to all other processes from the kernel PGD init_mm.pgd.
      
      In Jeff's system, emulated pmem area [1024G, 1216G) is smaller than
      PGDIR_SIZE. 'nokaslr' works because PAGE_OFFSET is 1T aligned, which
      makes this area be mapped inside one PGD entry. With KASLR enabled,
      this area could cross two PGD entries, and then the next PGD entry won't
      be synced to all other processes. That is why we saw an empty PGD.
      
      Fix it.
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jinbum Park <jinb.park7@gmail.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/1493864747-8506-1-git-send-email-bhe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
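The loop-stepping problem described in this commit can be shown with a standalone sketch that only counts which PGD-sized slots a loop visits. The helper names are hypothetical, `end` is treated as inclusive, a 512GB `PGDIR_SIZE` (4-level paging) is assumed, and the "fixed" stepping is one way to realize what the message describes rather than a quote of the actual patch.

```c
#include <stdint.h>

#define PGDIR_SHIFT 39                      /* assumption: 4-level paging */
#define PGDIR_SIZE  (1ull << PGDIR_SHIFT)   /* 512GB per PGD entry */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Old stepping: always advance by PGDIR_SIZE from an unaligned start. */
static unsigned int pgds_visited_old(uint64_t start, uint64_t end)
{
    unsigned int n = 0;

    for (uint64_t addr = start; addr <= end; addr += PGDIR_SIZE)
        n++;
    return n;
}

/* Fixed stepping: jump to the start of the region the next PGD covers. */
static unsigned int pgds_visited_fixed(uint64_t start, uint64_t end)
{
    unsigned int n = 0;

    for (uint64_t addr = start; addr <= end;
         addr = ALIGN_UP(addr + 1, PGDIR_SIZE))
        n++;
    return n;
}
```

For a 192GB area starting at 384GB, which crosses the 512GB boundary, the old stepping visits one PGD entry and skips the second, while the fixed stepping visits both. For an aligned area both loops agree.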
    • x86/asm: Don't use RBP as a temporary register in csum_partial_copy_generic() · 42fc6c6c
      Josh Poimboeuf committed
      Andrey Konovalov reported the following warning while fuzzing the kernel
      with syzkaller:
      
        WARNING: kernel stack regs at ffff8800686869f8 in a.out:4933 has bad 'bp' value c3fc855a10167ec0
      
      The unwinder dump revealed that RBP had a bad value when an interrupt
      occurred in csum_partial_copy_generic().
      
      That function saves RBP on the stack and then overwrites it, using it as
      a scratch register.  That's problematic because it breaks stack traces
      if an interrupt occurs in the middle of the function.
      
      Replace the usage of RBP with another callee-saved register (R15) so
      stack traces are no longer affected.
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Tested-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: David S . Miller <davem@davemloft.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: linux-sctp@vger.kernel.org
      Cc: netdev <netdev@vger.kernel.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Link: http://lkml.kernel.org/r/4b03a961efda5ec9bfe46b7b9c9ad72d1efad343.1493909486.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  9. 04 May, 2017 2 commits
  10. 03 May, 2017 3 commits
  11. 02 May, 2017 6 commits
    • KVM: x86: don't hold kvm->lock in KVM_SET_GSI_ROUTING · 5c0aea0e
      David Hildenbrand committed
      We needed the lock to avoid racing with creation of the irqchip on x86. As
      kvm_set_irq_routing() calls synchronize_srcu_expedited(), this lock
      might be held for a longer time.
      
      Let's introduce an arch specific callback to check if we can actually
      add irq routes. For x86, all we have to do is check if we have an
      irqchip in the kernel. We don't need kvm->lock at that point as the
      irqchip is marked as initialized only when actually fully created.
      Reported-by: Steve Rutherford <srutherford@google.com>
      Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
      Fixes: 1df6dded ("KVM: x86: race between KVM_SET_GSI_ROUTING and KVM_CREATE_IRQCHIP")
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • xen: Implement EFI reset_system callback · e371fd76
      Julien Grall committed
      When rebooting DOM0 with ACPI on ARM64, the kernel is crashing with the stack
      trace [1].
      
      This is happening because when EFI runtimes are enabled, the reset code
      (see machine_restart) will first try to use EFI restart method.
      
      However, the EFI restart code is expecting the reset_system callback to
      be always set. This is not the case for Xen and will lead to a crash.
      
      The EFI restart helper is used in multiple places and some of them do
      not have a fallback (see machine_power_off). So implement the reset_system
      callback as a call to xen_reboot when using EFI Xen.
      
      [   36.999270] reboot: Restarting system
      [   37.002921] Internal error: Attempting to execute userspace memory: 86000004 [#1] PREEMPT SMP
      [   37.011460] Modules linked in:
      [   37.014598] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 4.11.0-rc1-00003-g1e248b60a39b-dirty #506
      [   37.023903] Hardware name: (null) (DT)
      [   37.027734] task: ffff800902068000 task.stack: ffff800902064000
      [   37.033739] PC is at 0x0
      [   37.036359] LR is at efi_reboot+0x94/0xd0
      [   37.040438] pc : [<0000000000000000>] lr : [<ffff00000880f2c4>] pstate: 404001c5
      [   37.047920] sp : ffff800902067cf0
      [   37.051314] x29: ffff800902067cf0 x28: ffff800902068000
      [   37.056709] x27: ffff000008992000 x26: 000000000000008e
      [   37.062104] x25: 0000000000000123 x24: 0000000000000015
      [   37.067499] x23: 0000000000000000 x22: ffff000008e6e250
      [   37.072894] x21: ffff000008e6e000 x20: 0000000000000000
      [   37.078289] x19: ffff000008e5d4c8 x18: 0000000000000010
      [   37.083684] x17: 0000ffffa7c27470 x16: 00000000deadbeef
      [   37.089079] x15: 0000000000000006 x14: ffff000088f42bef
      [   37.094474] x13: ffff000008f42bfd x12: ffff000008e706c0
      [   37.099870] x11: ffff000008e70000 x10: 0000000005f5e0ff
      [   37.105265] x9 : ffff800902067a50 x8 : 6974726174736552
      [   37.110660] x7 : ffff000008cc6fb8 x6 : ffff000008cc6fb0
      [   37.116055] x5 : ffff000008c97dd8 x4 : 0000000000000000
      [   37.121453] x3 : 0000000000000000 x2 : 0000000000000000
      [   37.126845] x1 : 0000000000000000 x0 : 0000000000000000
      [   37.132239]
      [   37.133808] Process systemd-shutdow (pid: 1, stack limit = 0xffff800902064000)
      [   37.141118] Stack: (0xffff800902067cf0 to 0xffff800902068000)
      [   37.146949] 7ce0:                                   ffff800902067d40 ffff000008085334
      [   37.154869] 7d00: 0000000000000000 ffff000008f3b000 ffff800902067d40 ffff0000080852e0
      [   37.162787] 7d20: ffff000008cc6fb0 ffff000008cc6fb8 ffff000008c7f580 ffff000008c97dd8
      [   37.170706] 7d40: ffff800902067d60 ffff0000080e2c2c 0000000000000000 0000000001234567
      [   37.178624] 7d60: ffff800902067d80 ffff0000080e2ee8 0000000000000000 ffff0000080e2df4
      [   37.186544] 7d80: 0000000000000000 ffff0000080830f0 0000000000000000 00008008ff1c1000
      [   37.194462] 7da0: ffffffffffffffff 0000ffffa7c4b1cc 0000000000000000 0000000000000024
      [   37.202380] 7dc0: ffff800902067dd0 0000000000000005 0000fffff24743c8 0000000000000004
      [   37.210299] 7de0: 0000fffff2475f03 0000000000000010 0000fffff2474418 0000000000000005
      [   37.218218] 7e00: 0000fffff2474578 000000000000000a 0000aaaad6b722c0 0000000000000001
      [   37.226136] 7e20: 0000000000000123 0000000000000038 ffff800902067e50 ffff0000081e7294
      [   37.234055] 7e40: ffff800902067e60 ffff0000081e935c ffff800902067e60 ffff0000081e9388
      [   37.241973] 7e60: ffff800902067eb0 ffff0000081ea388 0000000000000000 00008008ff1c1000
      [   37.249892] 7e80: ffffffffffffffff 0000ffffa7c4a79c 0000000000000000 ffff000000020000
      [   37.257810] 7ea0: 0000010000000004 0000000000000000 0000000000000000 ffff0000080830f0
      [   37.265729] 7ec0: fffffffffee1dead 0000000028121969 0000000001234567 0000000000000000
      [   37.273651] 7ee0: ffffffffffffffff 8080000000800000 0000800000008080 feffa9a9d4ff2d66
      [   37.281567] 7f00: 000000000000008e feffa9a9d5b60e0f 7f7fffffffff7f7f 0101010101010101
      [   37.289485] 7f20: 0000000000000010 0000000000000008 000000000000003a 0000ffffa7ccf588
      [   37.297404] 7f40: 0000aaaad6b87d00 0000ffffa7c4b1b0 0000fffff2474be0 0000aaaad6b88000
      [   37.305326] 7f60: 0000fffff2474fb0 0000000001234567 0000000000000000 0000000000000000
      [   37.313240] 7f80: 0000000000000000 0000000000000001 0000aaaad6b70d4d 0000000000000000
      [   37.321159] 7fa0: 0000000000000001 0000fffff2474ea0 0000aaaad6b5e2e0 0000fffff2474e80
      [   37.329078] 7fc0: 0000ffffa7c4b1cc 0000000000000000 fffffffffee1dead 000000000000008e
      [   37.336997] 7fe0: 0000000000000000 0000000000000000 9ce839cffee77eab fafdbf9f7ed57f2f
      [   37.344911] Call trace:
      [   37.347437] Exception stack(0xffff800902067b20 to 0xffff800902067c50)
      [   37.353970] 7b20: ffff000008e5d4c8 0001000000000000 0000000080f82000 0000000000000000
      [   37.361883] 7b40: ffff800902067b60 ffff000008e17000 ffff000008f44c68 00000001081081b4
      [   37.369802] 7b60: ffff800902067bf0 ffff000008108478 0000000000000000 ffff000008c235b0
      [   37.377721] 7b80: ffff800902067ce0 0000000000000000 0000000000000000 0000000000000015
      [   37.385643] 7ba0: 0000000000000123 000000000000008e ffff000008992000 ffff800902068000
      [   37.393557] 7bc0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      [   37.401477] 7be0: 0000000000000000 ffff000008c97dd8 ffff000008cc6fb0 ffff000008cc6fb8
      [   37.409396] 7c00: 6974726174736552 ffff800902067a50 0000000005f5e0ff ffff000008e70000
      [   37.417318] 7c20: ffff000008e706c0 ffff000008f42bfd ffff000088f42bef 0000000000000006
      [   37.425234] 7c40: 00000000deadbeef 0000ffffa7c27470
      [   37.430190] [<          (null)>]           (null)
      [   37.434982] [<ffff000008085334>] machine_restart+0x6c/0x70
      [   37.440550] [<ffff0000080e2c2c>] kernel_restart+0x6c/0x78
      [   37.446030] [<ffff0000080e2ee8>] SyS_reboot+0x130/0x228
      [   37.451337] [<ffff0000080830f0>] el0_svc_naked+0x24/0x28
      [   37.456737] Code: bad PC value
      [   37.459891] ---[ end trace 76e2fc17e050aecd ]---
      Signed-off-by: Julien Grall <julien.grall@arm.com>
      
      --
      
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      
      The x86 code theoretically has a similar issue, although EFI does not
      seem to be the preferred method. I have only build-tested it on x86.
      
      This should also probably be fixed in the stable tree.
      
          Changes in v2:
              - Implement xen_efi_reset_system using xen_reboot
              - Move xen_efi_reset_system in drivers/xen/efi.c
      Signed-off-by: Juergen Gross <jgross@suse.com>
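The crash in this commit ("PC is at 0x0") comes from calling through a callback pointer that was never set. The sketch below illustrates that failure mode and the effect of installing a callback. All names (`runtime_ops`, `try_reset`, `fake_xen_reboot`) are hypothetical, and the NULL check is there only to make the missing-callback case observable; the commit itself fixes the problem by providing the callback.

```c
#include <stddef.h>

typedef void (*reset_fn)(int reset_type);

/* Hypothetical stand-in for a table of runtime service callbacks. */
struct runtime_ops {
    reset_fn reset_system;
};

static int last_reset = -1;     /* records the last requested reset type */

/* Hypothetical stand-in for a platform reboot helper. */
static void fake_xen_reboot(int reset_type)
{
    last_reset = reset_type;
}

/* Returns 1 if a reset callback was invoked, 0 if none is installed. */
static int try_reset(const struct runtime_ops *ops, int reset_type)
{
    if (ops->reset_system == NULL)
        return 0;   /* calling it would jump to address 0 */
    ops->reset_system(reset_type);
    return 1;
}
```

Callers that have no fallback path cannot rely on a check like this one, which is why the commit message argues for always installing a working callback.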
    • xen: Export xen_reboot · 5d9404e1
      Julien Grall committed
      The helper xen_reboot will be called by the EFI code in a later patch.
      
      Note that the ARM version does not yet exist and will be added in a
      later patch too.
      Signed-off-by: Julien Grall <julien.grall@arm.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • xen/x86: Call xen_smp_intr_init_pv() on BSP · f31b9692
      Boris Ostrovsky committed
      Recent code rework that split handling of PV, HVM and PVH guests into
      separate files missed calling xen_smp_intr_init_pv() on CPU0.
      
      Add this call.
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • xen: Revert commits da72ff5b and 72a9b186 · 84d582d2
      Boris Ostrovsky committed
      Recent discussion (http://marc.info/?l=xen-devel&m=149192184523741)
      established that commit 72a9b186 ("xen: Remove event channel
      notification through Xen PCI platform device") (and thus commit
      da72ff5b ("partially revert "xen: Remove event channel
      notification through Xen PCI platform device"")) are unnecessary and,
      in fact, prevent HVM guests from booting on Xen releases prior to 4.0.
      
      Therefore we revert both of those commits.
      
      The summary of that discussion is below:
      
        Here is the brief summary of the current situation:
      
        Before the offending commit (72a9b186):
      
        1) INTx does not work because of the reset_watches path.
        2) The reset_watches path is only taken if you have Xen > 4.0
        3) The Linux Kernel by default will use vector injection if the hypervisor
           supports it. So even though INTx does not work, nobody running the kernel
           with Xen > 4.0 would notice, unless they explicitly disabled this feature
           either in the kernel or in Xen (and this can only be disabled by
           modifying the code; there is no user-supported way to do it).
      
        After the offending commit (+ partial revert):
      
        1) INTx is no longer supported for HVM (only for PV guests).
        2) Any HVM guest kernel will not boot on Xen < 4.0, which does
           not have vector injection support. Since the only other mode
           supported is INTx which.
      
        So based on this summary, I think before commit (72a9b186) we were
        in much better position from a user point of view.
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
    • xen/pvh: Do not fill kernel's e820 map in init_pvh_bootparams() · 5f6a1614
      Boris Ostrovsky committed
      The e820 map is updated with information from the zeropage (i.e. pvh_bootparams)
      by default_machine_specific_memory_setup(). With the way things are done
      now, we end up with a duplicated e820 map.
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>