1. 16 Jun, 2016 1 commit
    • A
      PCI/MSI: irqchip: Fix PCI_MSI dependencies · 3ee80364
      Arnd Bergmann committed
      The PCI_MSI symbol is used inconsistently throughout the tree, with some
      drivers using 'select' and others using 'depends on', or using conditional
      selects.  This keeps causing problems; the latest one is a result of
      ARCH_ALPINE using a 'select' statement to enable its platform-specific MSI
      driver without enabling MSI:
      
        warning: (ARCH_ALPINE) selects ALPINE_MSI which has unmet direct dependencies (PCI && PCI_MSI)
        drivers/irqchip/irq-alpine-msi.c:104:15: error: variable 'alpine_msix_domain_info' has initializer but incomplete type
         static struct msi_domain_info alpine_msix_domain_info = {
      		 ^~~~~~~~~~~~~~~
        drivers/irqchip/irq-alpine-msi.c:105:2: error: unknown field 'flags' specified in initializer
          .flags = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
          ^
        drivers/irqchip/irq-alpine-msi.c:105:11: error: 'MSI_FLAG_USE_DEF_DOM_OPS' undeclared here (not in a function)
          .flags = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
      	     ^~~~~~~~~~~~~~~~~~~~~~~~
      
      There is little reason to enable PCI support for a platform that uses MSI
      but then leave MSI disabled at compile time.
      
      Select PCI_MSI from irqchips that implement MSI, and make PCI host bridges
      that use MSI on ARM depend on PCI_MSI_IRQ_DOMAIN.
      
      For all three architectures that support PCI_MSI_IRQ_DOMAIN (ARM, ARM64,
      X86), enable it by default whenever MSI is enabled.
      
      [bhelgaas: changelog, omit crypto config change]
      Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      3ee80364
  2. 05 Jun, 2016 4 commits
    • H
      58f1c654
    • H
      parisc: Fix pagefault crash in unaligned __get_user() call · 8b78f260
      Helge Deller committed
      One of the debian buildd servers had this crash in the syslog without
      any other information:
      
       Unaligned handler failed, ret = -2
       clock_adjtime (pid 22578): Unaligned data reference (code 28)
       CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G  E  4.5.0-2-parisc64-smp #1 Debian 4.5.4-1
       task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000
      
            YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
       PSW: 00001000000001001111100000001111 Tainted: G            E
       r00-03  000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0
       r04-07  00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff
       r08-11  0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4
       r12-15  000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b
       r16-19  0000000000028800 0000000000000001 0000000000000070 00000001bde7c218
       r20-23  0000000000000000 00000001bde7c210 0000000000000002 0000000000000000
       r24-27  0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0
       r28-31  0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218
       sr00-03  0000000001200000 0000000001200000 0000000000000000 0000000001200000
       sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000
      
       IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88
        IIR: 0ca0d089    ISR: 0000000001200000  IOR: 00000000fa6f7fff
        CPU:        1   CR30: 00000001bde7c000 CR31: ffffffffffffffff
        ORIG_R28: 00000002369fe628
        IAOQ[0]: compat_get_timex+0x2dc/0x3c0
        IAOQ[1]: compat_get_timex+0x2e0/0x3c0
        RP(r2): compat_get_timex+0x40/0x3c0
       Backtrace:
        [<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0
        [<0000000040205024>] syscall_exit+0x0/0x14
      
      This means the userspace program clock_adjtime called the clock_adjtime()
      syscall and then crashed inside the compat_get_timex() function.
      Syscalls should never crash programs, but instead return EFAULT.
      
      The IIR register contains the executed instruction, which disassembles
      into "ldw 0(sr3,r5),r9".
      This load-word instruction is part of __get_user(), which tried to read the
      word at %r5/IOR (0xfa6f7fff). Because that address is not word-aligned, the
      unaligned handler was invoked.  The unaligned handler can emulate all ldw
      instructions, but it fails if it cannot read the source, e.g. because of a
      page fault.
      
      The following program reproduces the problem:
      
      #define _GNU_SOURCE
      #include <unistd.h>
      #include <sys/syscall.h>
      #include <sys/mman.h>
      
      int main(void) {
              /* allocate 8k */
              char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
              /* free second half (upper 4k) and make it invalid. */
              munmap(ptr+4096, 4096);
              /* syscall where first int is unaligned and extends into the invalid memory region */
              /* syscall should return EFAULT */
              return syscall(__NR_clock_adjtime, 0, ptr+4095);
      }
      
      To fix this issue we simply need to check if the faulting instruction address
      is in the exception fixup table when the unaligned handler failed. If it
      is, call the fixup routine instead of crashing.
      
      While looking at the unaligned handler I found another issue as well: The
      target register should not be modified if the handler was unsuccessful.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org
      8b78f260
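The fix described in the commit above (consult the exception fixup table when the unaligned handler fails) can be sketched in plain C. This is only an illustration under stated assumptions: `struct fixup_entry`, `find_fixup()` and `handle_failed_emulation()` are hypothetical names, and the table layout is simplified; none of it is taken from the parisc sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified exception table entry: the address of an
 * instruction that may fault, and the address of its recovery code. */
struct fixup_entry {
    uintptr_t insn;
    uintptr_t fixup;
};

/* Binary search over a table sorted by insn. Returns the fixup address,
 * or 0 when the faulting address has no entry. */
uintptr_t find_fixup(const struct fixup_entry *tbl, size_t n, uintptr_t addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tbl[mid].insn == addr)
            return tbl[mid].fixup;
        if (tbl[mid].insn < addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}

/* Called when emulation of an unaligned access failed. Returns 1 and sets
 * *resume_pc if a fixup exists (the syscall can then return EFAULT);
 * returns 0 if there is no entry and the fault is genuinely fatal. */
int handle_failed_emulation(const struct fixup_entry *tbl, size_t n,
                            uintptr_t fault_pc, uintptr_t *resume_pc)
{
    uintptr_t fix = find_fixup(tbl, n, fault_pc);

    if (fix == 0)
        return 0;
    *resume_pc = fix;
    return 1;
}
```

The point of the design is that the decision is driven by the faulting instruction address alone, so the handler does not need to know which caller issued the access.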
    • H
      parisc: Fix printk time during boot · 0032c088
      Helge Deller committed
      Avoid showing invalid printk time stamps during boot.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Reviewed-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      0032c088
    • M
      parisc: Fix backtrace on PA-RISC · be24a897
      Mikulas Patocka committed
      This patch fixes backtrace on PA-RISC.
      
      There were several problems:
      
      1) The code that decodes instructions handles instructions that subtract
      from the stack pointer incorrectly. If the instruction subtracts the
      number X from the stack pointer the code increases the frame size by
      (0x100000000-X).  This results in invalid accesses to memory and
      recursive page faults.
      
      2) Because gcc reorders blocks, handling instructions that subtract from
      the frame pointer is incorrect. For example, this function
      	int f(int a)
      	{
      		if (__builtin_expect(a, 1))
      			return a;
      		g();
      		return a;
      	}
      is compiled in such a way that the code that decreases the stack
      pointer for the first "return a" is placed before the code for the "g"
      call. If we recognize this decrement, we mistakenly believe that the
      frame size for the "g" call is zero.
      
      To fix problems 1) and 2), the patch doesn't recognize instructions that
      decrease the stack pointer at all. To further safeguard the unwind code
      against nonsense values, we don't allow frame size larger than
      Total_frame_size.
      
      3) The backtrace is not locked. If a stack dump races with a module
      unload, an invalid table can be accessed.
      
      This patch adds a spinlock when processing module tables.
      
      Note that for a correct backtrace, you need recent binutils.
      Binutils 2.18 from Debian 5 produces garbage unwind tables.
      Binutils 2.21 works better (it sometimes forgets function frames, but at
      least it doesn't generate garbage).
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Helge Deller <deller@gmx.de>
      be24a897
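Problem 1) in the commit above is an instance of unsigned wraparound: a decrement of X, read as an unsigned 32-bit value, becomes 0x100000000-X. A minimal sketch follows, assuming hypothetical helper names and assuming that the upper bound is applied by clamping (the commit only says larger frame sizes are not allowed):

```c
#include <stdint.h>

/* Buggy variant: the signed immediate is reinterpreted as unsigned, so a
 * decrement of x grows the frame by 2^32 - x. */
uint64_t frame_size_buggy(uint64_t frame, int32_t imm)
{
    return frame + (uint32_t)imm;
}

/* Variant in the spirit of the fix: stack-pointer decrements are ignored,
 * and the result never exceeds a known upper bound (max_frame stands in
 * for Total_frame_size; clamping is an assumption of this sketch). */
uint64_t frame_size_fixed(uint64_t frame, int32_t imm, uint64_t max_frame)
{
    if (imm > 0)
        frame += (uint64_t)imm;
    return frame > max_frame ? max_frame : frame;
}
```

With `imm = -64`, the buggy variant adds 0xFFFFFFC0 to the frame size, which is exactly the kind of value that leads to invalid memory accesses during unwinding.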
  3. 03 Jun, 2016 6 commits
    • M
      arm64: fix alignment when RANDOMIZE_TEXT_OFFSET is enabled · aed7eb83
      Mark Rutland committed
      With ARM64_64K_PAGES and RANDOMIZE_TEXT_OFFSET enabled, we hit the
      following issue at boot:
      
      kernel BUG at arch/arm64/mm/mmu.c:480!
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.6.0 #310
      Hardware name: ARM Juno development board (r2) (DT)
      task: ffff000008d58a80 ti: ffff000008d30000 task.ti: ffff000008d30000
      PC is at map_kernel_segment+0x44/0xb0
      LR is at paging_init+0x84/0x5b0
      pc : [<ffff000008c450b4>] lr : [<ffff000008c451a4>] pstate: 600002c5
      
      Call trace:
      [<ffff000008c450b4>] map_kernel_segment+0x44/0xb0
      [<ffff000008c451a4>] paging_init+0x84/0x5b0
      [<ffff000008c42728>] setup_arch+0x198/0x534
      [<ffff000008c40848>] start_kernel+0x70/0x388
      [<ffff000008c401bc>] __primary_switched+0x30/0x74
      
      Commit 7eb90f2f ("arm64: cover the .head.text section in the .text
      segment mapping") removed the alignment between the .head.text and .text
      sections, and used the _text rather than the _stext interval for mapping
      the .text segment.
      
      Prior to this commit _stext was always section aligned and didn't cause
      any issue even when RANDOMIZE_TEXT_OFFSET was enabled. Since that
      alignment has been removed and _text is used to map the .text segment,
      we need to ensure _text is always page aligned when RANDOMIZE_TEXT_OFFSET
      is enabled.
      
      This patch adds logic to TEXT_OFFSET fuzzing to ensure that the offset
      is always aligned to the kernel page size. To ensure this, we rely on
      the PAGE_SHIFT being available via Kconfig.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes: 7eb90f2f ("arm64: cover the .head.text section in the .text segment mapping")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      aed7eb83
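The alignment requirement in the commit above amounts to rounding a randomly chosen offset down to a multiple of the page size. The commit says this is done in the TEXT_OFFSET fuzzing logic using the Kconfig page shift; the C function below is only an illustrative sketch of the arithmetic, with a hypothetical name, and is not the patch itself.

```c
#include <stdint.h>

/* Round raw_offset down to a multiple of the page size (1 << page_shift).
 * page_shift is expected to be below 64, e.g. 12 for 4K or 16 for 64K. */
uint64_t align_text_offset(uint64_t raw_offset, unsigned int page_shift)
{
    uint64_t page_mask = ((uint64_t)1 << page_shift) - 1;

    return raw_offset & ~page_mask;
}
```

For example, a raw offset of 0x12345 becomes 0x12000 with 4K pages but 0x10000 with 64K pages, which is why an offset that happens to work for small pages can still trip the BUG on a 64K-page kernel.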
    • M
      arm64: move {PAGE,CONT}_SHIFT into Kconfig · 030c4d24
      Mark Rutland committed
      In some cases (e.g. the awk for CONFIG_RANDOMIZE_TEXT_OFFSET) we would
      like to make use of PAGE_SHIFT outside of code that can include the
      usual header files.
      
      Add a new CONFIG_ARM64_PAGE_SHIFT for this, likewise with
      ARM64_CONT_SHIFT for consistency.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      030c4d24
    • M
      arm64: mm: dump: log span level · 48dd73c5
      Mark Rutland committed
      The page table dump code logs spans of entries at the same level
      (pgd/pud/pmd/pte) which have the same attributes. While we log the
      (decoded) attributes, we don't log the level, which leaves the output
      ambiguous and/or confusing in some cases.
      
      For example:
      
      0xffff800800000000-0xffff800980000000           6G       RW NX SHD AF        BLK UXN MEM/NORMAL
      
      If using 4K pages, this may describe a span of 6 1G block entries at the
      PGD/PUD level, or 3072 2M block entries at the PMD level.
      
      This patch adds the page table level to each output line, removing this
      ambiguity. For the example above, this will produce:
      
      0xffffffc800000000-0xffffffc980000000           6G PUD       RW NX SHD AF        BLK UXN MEM/NORMAL
      
      When 3 level tables are in use, and we use the asm-generic/nopud.h
      definitions, the dump code treats each entry in the PGD as a 1 element
      table at the PUD level, and logs spans as being PUDs, which can be
      confusing. To counteract this, the "PUD" mnemonic is replaced with "PGD"
      when CONFIG_PGTABLE_LEVELS <= 3. Likewise for "PMD" when
      CONFIG_PGTABLE_LEVELS <= 2.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Huang Shijie <shijie.huang@arm.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Steve Capper <steve.capper@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      48dd73c5
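The ambiguity described in the commit above can be checked with a little arithmetic: the same 6G span is 6 entries of 1G or 3072 entries of 2M. A small sketch, where `span_entries()` is a hypothetical helper and the shifts 30 and 21 are the 1G and 2M block sizes for 4K pages:

```c
#include <stdint.h>

/* Number of block entries of size (1 << entry_shift) that cover the
 * half-open address range [start, end). Assumes end >= start. */
uint64_t span_entries(uint64_t start, uint64_t end, unsigned int entry_shift)
{
    return (end - start) >> entry_shift;
}
```

Calling it with the example range 0xffff800800000000 to 0xffff800980000000 gives 6 for a shift of 30 and 3072 for a shift of 21, so the size column alone cannot tell the two layouts apart.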
    • M
      arm64: update stale PAGE_OFFSET comment · a13e3a5b
      Mark Rutland committed
      Commit ab893fb9 ("arm64: introduce KIMAGE_VADDR as the virtual
      base of the kernel region") logically split KIMAGE_VADDR from
      PAGE_OFFSET, and since commit f9040773 ("arm64: move kernel
      image to base of vmalloc area") the two have been distinct values.
      
      Unfortunately, neither commit updated the comment above these
      definitions, which now erroneously states that PAGE_OFFSET is the start
      of the kernel image rather than the start of the linear mapping.
      
      This patch fixes said comment, and introduces an explanation of
      KIMAGE_VADDR.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      a13e3a5b
    • M
      arm64: report CPU number in bad_mode · 8051f4d1
      Mark Rutland committed
      If we take an exception we don't expect (e.g. SError), we report this in
      the bad_mode handler with pr_crit. Depending on the configured log
      level, we may or may not log additional information in functions called
      subsequently. Notably, the messages in dump_stack (including the CPU
      number) are printed with KERN_DEFAULT and may not appear.
      
      Some exceptions have an IMPLEMENTATION DEFINED ESR_ELx.ISS encoding, and
      knowing the CPU number is crucial to correctly decode them. To ensure
      that this is always possible, we should log the CPU number along with
      the ESR_ELx value, so we are not reliant on subsequent logs or
      additional printk configuration options.
      
      This patch logs the CPU number in bad_mode such that it is possible for
      a developer to decode these exceptions, provided access to sufficient
      documentation.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Al Grant <Al.Grant@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      8051f4d1
    • G
      irqchip/gicv3-its: numa: Enable workaround for Cavium thunderx erratum 23144 · fbf8f40e
      Ganapatrao Kulkarni committed
      The workaround for this erratum avoids the ITS SYNC command hang by
      preventing inter-node I/O and collection/CPU mappings on the ThunderX
      dual-socket platform.
      
      This fix is only applicable for Cavium's ThunderX dual-socket platform.
      Reviewed-by: Robert Richter <rrichter@cavium.com>
      Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
      Signed-off-by: Robert Richter <rrichter@cavium.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      fbf8f40e
  4. 02 Jun, 2016 8 commits
    • P
      KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS · d14bdb55
      Paolo Bonzini committed
      MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to
      any of bits 63:32.  However, this is not detected at KVM_SET_DEBUGREGS
      time, and the next KVM_RUN oopses:
      
         general protection fault: 0000 [#1] SMP
         CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1
         Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
         [...]
         Call Trace:
          [<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm]
          [<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
          [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
          [<ffffffff812418a9>] SyS_ioctl+0x79/0x90
          [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
         RIP  [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40
          RSP <ffff88005836bd50>
      
      Testcase (beautified/reduced from syzkaller output):
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[8];
      
          int main()
          {
              struct kvm_debugregs dr = { 0 };
      
              r[2] = open("/dev/kvm", O_RDONLY);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
              r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
      
              memcpy(&dr,
                     "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72"
                     "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8"
                     "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9"
                     "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb",
                     48);
              r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr);
              r[6] = ioctl(r[4], KVM_RUN, 0);
          }
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      d14bdb55
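The commit above states that writing a 1 to any of bits 63:32 of DR6 or DR7 causes a #GP, and that this was not detected at KVM_SET_DEBUGREGS time. A check of that shape can be sketched as follows; the function names are hypothetical and this is not a copy of the KVM code.

```c
#include <errno.h>
#include <stdint.h>

/* A DR6/DR7 value is acceptable only if bits 63:32 are all zero. */
int debugreg_value_ok(uint64_t value)
{
    return (value >> 32) == 0;
}

/* Reject a user-supplied pair of values up front, so that the error
 * surfaces as -EINVAL from the ioctl rather than as a fault later. */
int check_debugregs(uint64_t dr6, uint64_t dr7)
{
    if (!debugreg_value_ok(dr6) || !debugreg_value_ok(dr7))
        return -EINVAL;
    return 0;
}
```

Validating at the ioctl boundary keeps the failure attributable to the caller that supplied the bad value, instead of deferring it to the next KVM_RUN.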
    • P
      KVM: fail KVM_SET_VCPU_EVENTS with invalid exception number · 78e546c8
      Paolo Bonzini committed
      This cannot be returned by KVM_GET_VCPU_EVENTS, so it is okay to return
      EINVAL.  It causes a WARN from exception_type:
      
          WARNING: CPU: 3 PID: 16732 at arch/x86/kvm/x86.c:345 exception_type+0x49/0x50 [kvm]()
          CPU: 3 PID: 16732 Comm: a.out Tainted: G        W       4.4.6-300.fc23.x86_64 #1
          Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
           0000000000000286 000000006308a48b ffff8800bec7fcf8 ffffffff813b542e
           0000000000000000 ffffffffa0966496 ffff8800bec7fd30 ffffffff810a40f2
           ffff8800552a8000 0000000000000000 00000000002c267c 0000000000000001
          Call Trace:
           [<ffffffff813b542e>] dump_stack+0x63/0x85
           [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0
           [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20
           [<ffffffffa0924809>] exception_type+0x49/0x50 [kvm]
           [<ffffffffa0934622>] kvm_arch_vcpu_ioctl_run+0x10a2/0x14e0 [kvm]
           [<ffffffffa091c04d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
           [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
           [<ffffffff812414a9>] SyS_ioctl+0x79/0x90
           [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71
          ---[ end trace b1a0391266848f50 ]---
      
      Testcase (beautified/reduced from syzkaller output):
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
          #include <linux/kvm.h>
      
          long r[31];
      
          int main()
          {
              memset(r, -1, sizeof(r));
              r[2] = open("/dev/kvm", O_RDONLY);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
              r[7] = ioctl(r[3], KVM_CREATE_VCPU, 0);
      
              struct kvm_vcpu_events ve = {
                      .exception.injected = 1,
                      .exception.nr = 0xd4
              };
              r[27] = ioctl(r[7], KVM_SET_VCPU_EVENTS, &ve);
              r[30] = ioctl(r[7], KVM_RUN, 0);
              return 0;
          }
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      78e546c8
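The testcase in the commit above injects exception number 0xd4, which is not a valid exception vector. A validation step along these lines would reject it; the bound of 31 reflects the x86 convention that vectors 0 to 31 are reserved for exceptions, while the function name and exact condition are assumptions of this sketch rather than the KVM implementation.

```c
#include <errno.h>
#include <stdint.h>

/* x86 reserves vectors 0..31 for architectural exceptions. */
#define MAX_EXCEPTION_VECTOR 31u

/* Hypothetical validation of the exception part of a vCPU events request:
 * an injected exception must name a vector in the exception range. */
int validate_injected_exception(uint8_t injected, uint8_t nr)
{
    if (injected && nr > MAX_EXCEPTION_VECTOR)
        return -EINVAL;
    return 0;
}
```

A request that does not inject anything is left alone, since the number is then not used.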
    • P
      KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID · 83676e92
      Paolo Bonzini committed
      This causes an ugly dmesg splat.  Beautified syzkaller testcase:
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <sys/ioctl.h>
          #include <fcntl.h>
          #include <linux/kvm.h>
      
          long r[8];
      
          int main()
          {
              struct kvm_cpuid2 c = { 0 };
              r[2] = open("/dev/kvm", O_RDWR);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
              r[4] = ioctl(r[3], KVM_CREATE_VCPU, 0x8);
              r[7] = ioctl(r[4], KVM_SET_CPUID, &c);
              return 0;
          }
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      83676e92
    • P
      kvm: x86: avoid warning on repeated KVM_SET_TSS_ADDR · b21629da
      Paolo Bonzini committed
      Found by syzkaller:
      
          WARNING: CPU: 3 PID: 15175 at arch/x86/kvm/x86.c:7705 __x86_set_memory_region+0x1dc/0x1f0 [kvm]()
          CPU: 3 PID: 15175 Comm: a.out Tainted: G        W       4.4.6-300.fc23.x86_64 #1
          Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
           0000000000000286 00000000950899a7 ffff88011ab3fbf0 ffffffff813b542e
           0000000000000000 ffffffffa0966496 ffff88011ab3fc28 ffffffff810a40f2
           00000000000001fd 0000000000003000 ffff88014fc50000 0000000000000000
          Call Trace:
           [<ffffffff813b542e>] dump_stack+0x63/0x85
           [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0
           [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20
           [<ffffffffa09251cc>] __x86_set_memory_region+0x1dc/0x1f0 [kvm]
           [<ffffffffa092521b>] x86_set_memory_region+0x3b/0x60 [kvm]
           [<ffffffffa09bb61c>] vmx_set_tss_addr+0x3c/0x150 [kvm_intel]
           [<ffffffffa092f4d4>] kvm_arch_vm_ioctl+0x654/0xbc0 [kvm]
           [<ffffffffa091d31a>] kvm_vm_ioctl+0x9a/0x6f0 [kvm]
           [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
           [<ffffffff812414a9>] SyS_ioctl+0x79/0x90
           [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71
      
      Testcase:
      
          #include <unistd.h>
          #include <sys/ioctl.h>
          #include <fcntl.h>
          #include <string.h>
          #include <linux/kvm.h>
      
          long r[8];
      
          int main()
          {
              memset(r, -1, sizeof(r));
              r[2] = open("/dev/kvm", O_RDONLY|O_TRUNC);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);
              r[5] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul);
              r[7] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul);
              return 0;
          }
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      b21629da
    • D
      KVM: Handle MSR_IA32_PERF_CTL · 0c2df2a1
      Dmitry Bilunov committed
      Intel CPUs having Turbo Boost feature implement an MSR to provide a
      control interface via rdmsr/wrmsr instructions. One could detect the
      presence of this feature by issuing one of these instructions and
      handling the #GP exception which is generated in case the referenced MSR
      is not implemented by the CPU.
      
      KVM's vCPU model behaves exactly as a real CPU in this case by injecting
      a fault when MSR_IA32_PERF_CTL is accessed (which KVM does not support).
      However, some operating systems use this register during an early boot
      stage in which their kernel is not capable of handling #GP correctly,
      causing #DF and finally a triple fault, effectively resetting the vCPU.
      
      This patch implements a dummy handler for MSR_IA32_PERF_CTL to avoid the
      crashes.
      Signed-off-by: Dmitry Bilunov <kmeaw@yandex-team.ru>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      0c2df2a1
    • N
      KVM: x86: avoid write-tearing of TDP · b19ee2ff
      Nadav Amit committed
      In theory, nothing prevents the compiler from tearing or splitting PTE
      writes. Such partially modified PTEs can be fetched by other cores and
      cause mayhem. I have not actually encountered such a case in real life,
      but it does seem possible.
      
      For example, the compiler may try to do something creative for
      kvm_set_pte_rmapp() and perform multiple writes to the PTE.
      Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      b19ee2ff
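The concern in the commit above is that a plain assignment gives the compiler freedom to split a store. A common way to constrain this in C is to perform the store through a volatile-qualified lvalue, which is the idea behind the kernel's WRITE_ONCE(). The sketch below is a userspace analogue with hypothetical names, not the KVM change itself.

```c
#include <stdint.h>

/* Store the whole value through a volatile lvalue. The compiler must then
 * perform exactly one access of this type and may not split, merge or
 * repeat it. Whether the hardware store itself is atomic still depends on
 * the target having a native, aligned store of that width. */
static inline void store_once_u64(uint64_t *slot, uint64_t value)
{
    *(volatile uint64_t *)slot = value;
}

/* Hypothetical page-table-entry update that uses the helper. */
void set_entry(uint64_t *entry, uint64_t new_value)
{
    store_once_u64(entry, new_value);
}
```

A reader on another core then observes either the old entry or the new one, assuming an aligned entry on a 64-bit target.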
    • R
      ARM: fix PTRACE_SETVFPREGS on SMP systems · e2dfb4b8
      Russell King committed
      PTRACE_SETVFPREGS fails to properly mark the VFP register set to be
      reloaded, because it undoes one of the effects of vfp_flush_hwstate().
      
      Specifically vfp_flush_hwstate() sets thread->vfpstate.hard.cpu to
      an invalid CPU number, but vfp_set() overwrites this with the original
      CPU number, thereby rendering the hardware state as apparently "valid",
      even though the software state is more recent.
      
      Fix this by reverting the previous change.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 8130b9d7 ("ARM: 7308/1: vfp: flush thread hwstate before copying ptrace registers")
      Acked-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Simon Marchi <simon.marchi@ericsson.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      e2dfb4b8
    • W
      arm64: unistd32.h: wire up missing syscalls for compat tasks · 10fdf851
      Will Deacon committed
      We're missing entries for mlock2, copy_file_range, preadv2 and pwritev2
      in our compat syscall table, so hook them up. Only the last two need
      compat wrappers.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      10fdf851
  5. 01 Jun, 2016 5 commits
  6. 31 May, 2016 9 commits
  7. 30 May, 2016 4 commits
    • R
      powerpc/pseries/eeh: Refactor the configure_bridge RTAS tokens · bd000b82
      Russell Currey committed
      The RTAS calls "ibm,configure-pe" and "ibm,configure-bridge" perform the
      same actions, however the former can skip configuration if unnecessary.
      The existing code treats them as different tokens even though only one
      will ever be called.  Refactor this by making a single token that is
      assigned during init.
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      bd000b82
    • R
      powerpc/pseries/eeh: Handle RTAS delay requests in configure_bridge · 871e178e
      Russell Currey committed
      In the "ibm,configure-pe" and "ibm,configure-bridge" RTAS calls, the
      spec states that values of 9900-9905 can be returned, indicating that
      software should delay for 10^x (where x is the last digit, i.e. 990x)
      milliseconds and attempt the call again. Currently, the kernel doesn't
      know about this, and respecting it fixes some PCI failures when the
      hypervisor is busy.
      
      The delay is capped at 0.2 seconds.
      
      Cc: <stable@vger.kernel.org> # 3.10+
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      871e178e
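The delay encoding described in the commit above (return values 9900 to 9905 meaning 10^x milliseconds, where x is the last digit) can be decoded as sketched below. The function names are hypothetical, and how the 0.2 second cap is applied (per call or cumulatively across retries) is not stated in the message, so the second helper simply clamps a single delay.

```c
/* Decode a status in the 990x range into a delay in milliseconds.
 * Returns 0 for any status outside 9900..9905 (no delay requested). */
unsigned int rtas_delay_ms(int status)
{
    unsigned int ms = 1;
    int i;

    if (status < 9900 || status > 9905)
        return 0;
    for (i = 0; i < status - 9900; i++)
        ms *= 10;
    return ms;
}

/* Clamp the decoded delay to cap_ms (the commit mentions 0.2 seconds). */
unsigned int capped_delay_ms(int status, unsigned int cap_ms)
{
    unsigned int ms = rtas_delay_ms(status);

    return ms > cap_ms ? cap_ms : ms;
}
```

So 9902 asks for 100 ms, while 9905 asks for 100 seconds, which is far longer than a caller holding up PCI recovery would want to wait without a cap.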
    • D
      sparc64: Fix return from trap window fill crashes. · 7cafc0b8
      David S. Miller committed
      We must handle data access exception as well as memory address unaligned
      exceptions from return from trap window fill faults, not just normal
      TLB misses.
      
      Otherwise we can get an OOPS that looks like this:
      
      ld-linux.so.2(36808): Kernel bad sw trap 5 [#1]
      CPU: 1 PID: 36808 Comm: ld-linux.so.2 Not tainted 4.6.0 #34
      task: fff8000303be5c60 ti: fff8000301344000 task.ti: fff8000301344000
      TSTATE: 0000004410001601 TPC: 0000000000a1a784 TNPC: 0000000000a1a788 Y: 00000002    Not tainted
      TPC: <do_sparc64_fault+0x5c4/0x700>
      g0: fff8000024fc8248 g1: 0000000000db04dc g2: 0000000000000000 g3: 0000000000000001
      g4: fff8000303be5c60 g5: fff800030e672000 g6: fff8000301344000 g7: 0000000000000001
      o0: 0000000000b95ee8 o1: 000000000000012b o2: 0000000000000000 o3: 0000000200b9b358
      o4: 0000000000000000 o5: fff8000301344040 sp: fff80003013475c1 ret_pc: 0000000000a1a77c
      RPC: <do_sparc64_fault+0x5bc/0x700>
      l0: 00000000000007ff l1: 0000000000000000 l2: 000000000000005f l3: 0000000000000000
      l4: fff8000301347e98 l5: fff8000024ff3060 l6: 0000000000000000 l7: 0000000000000000
      i0: fff8000301347f60 i1: 0000000000102400 i2: 0000000000000000 i3: 0000000000000000
      i4: 0000000000000000 i5: 0000000000000000 i6: fff80003013476a1 i7: 0000000000404d4c
      I7: <user_rtt_fill_fixup+0x6c/0x7c>
      Call Trace:
       [0000000000404d4c] user_rtt_fill_fixup+0x6c/0x7c
      
      The window trap handlers are slightly clever: the trap table entries for
      them are composed of two pieces of code.  First comes the code that
      actually performs the window fill or spill trap handling, and then there
      are three instructions at the end which are for exception processing.
      
      The userland register window fill handler is:
      
      	add	%sp, STACK_BIAS + 0x00, %g1;		\
      	ldxa	[%g1 + %g0] ASI, %l0;			\
      	mov	0x08, %g2;				\
      	mov	0x10, %g3;				\
      	ldxa	[%g1 + %g2] ASI, %l1;			\
      	mov	0x18, %g5;				\
      	ldxa	[%g1 + %g3] ASI, %l2;			\
      	ldxa	[%g1 + %g5] ASI, %l3;			\
      	add	%g1, 0x20, %g1;				\
      	ldxa	[%g1 + %g0] ASI, %l4;			\
      	ldxa	[%g1 + %g2] ASI, %l5;			\
      	ldxa	[%g1 + %g3] ASI, %l6;			\
      	ldxa	[%g1 + %g5] ASI, %l7;			\
      	add	%g1, 0x20, %g1;				\
      	ldxa	[%g1 + %g0] ASI, %i0;			\
      	ldxa	[%g1 + %g2] ASI, %i1;			\
      	ldxa	[%g1 + %g3] ASI, %i2;			\
      	ldxa	[%g1 + %g5] ASI, %i3;			\
      	add	%g1, 0x20, %g1;				\
      	ldxa	[%g1 + %g0] ASI, %i4;			\
      	ldxa	[%g1 + %g2] ASI, %i5;			\
      	ldxa	[%g1 + %g3] ASI, %i6;			\
      	ldxa	[%g1 + %g5] ASI, %i7;			\
      	restored;					\
      	retry; nop; nop; nop; nop;			\
      	b,a,pt	%xcc, fill_fixup_dax;			\
      	b,a,pt	%xcc, fill_fixup_mna;			\
      	b,a,pt	%xcc, fill_fixup;
      
      And the way this works is that if any of those memory accesses
      generate an exception, the exception handler can revector to one of
      those final three branch instructions depending upon which kind of
      exception the memory access took.  In this way, the fault handler
      doesn't have to know if it was a spill or a fill that it's handling
      the fault for.  It just always branches to the last instruction in
      the parent trap's handler.
      
      For example, for a regular fault, the code goes:
      
      winfix_trampoline:
      	rdpr	%tpc, %g3
      	or	%g3, 0x7c, %g3
      	wrpr	%g3, %tnpc
      	done
      
      All window trap handlers are 0x80 aligned, so if we "or" 0x7c into the
      trap time program counter, we'll get that final instruction in the
      trap handler.
      
      On return from trap, we have to pull the register window in, but we do
      this by hand instead of just executing a "restore" instruction for
      several reasons, the largest being that from Niagara onward we simply
      don't have enough levels in the trap stack to fully resolve all
      possible exception cases of a window fault when we are already at
      trap level 1 (which we enter to get ready to return from the original
      trap).
      
      This is executed inline via the FILL_*_RTRAP handlers.  rtrap_64.S's
      code branches directly to these to do the window fill by hand if
      necessary.  Now if you look at them, you'll see at the end:
      
      	    ba,a,pt    %xcc, user_rtt_fill_fixup;
      	    ba,a,pt    %xcc, user_rtt_fill_fixup;
      	    ba,a,pt    %xcc, user_rtt_fill_fixup;
      
      And oops, all three cases are handled like a fault.
      
      This doesn't work because each of these trap types (data access
      exception, memory address unaligned, and faults) store their auxiliary
      info in different registers to pass on to the C handler which does the
      real work.
      
      So in the case where the stack was unaligned, the unaligned trap
      handler sets up the arg registers one way, and then we branch to
      the fault handler, which expects them set up another way.
      
      So the FAULT_TYPE_* value ends up basically being garbage, and
      randomly would generate the backtrace seen above.
      Reported-by: Nick Alcock <nix@esperi.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7cafc0b8
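The address arithmetic behind winfix_trampoline in the commit above is compact enough to check by hand: handlers are 0x80 aligned, instructions are 4 bytes, so OR-ing 0x7c into any instruction address inside a handler yields the address of its final instruction. A sketch with hypothetical function names:

```c
#include <stdint.h>

#define HANDLER_SIZE     0x80u  /* each window trap handler is 0x80 aligned */
#define LAST_INSN_OFFSET 0x7cu  /* 0x80 - 4: the final 4-byte instruction */

/* Address of the last instruction of the handler containing tpc. This
 * works for any 4-byte-aligned tpc because every such offset below 0x80
 * only has bits that are already set in 0x7c. */
uint64_t fixup_branch_pc(uint64_t tpc)
{
    return tpc | LAST_INSN_OFFSET;
}

/* Start address of the handler containing tpc. */
uint64_t handler_base(uint64_t tpc)
{
    return tpc & ~(uint64_t)(HANDLER_SIZE - 1);
}
```

For any instruction address inside a handler, `fixup_branch_pc(tpc)` equals `handler_base(tpc) + 0x7c`, which is why the trampoline needs no knowledge of which load or store actually faulted.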
    • D
      sparc: Harden signal return frame checks. · d11c2a0d
      David S. Miller committed
      All signal frames must be at least 16-byte aligned, because that is
      the alignment we explicitly create when we build signal return stack
      frames.
      
      All stack pointers must be at least 8-byte aligned.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d11c2a0d
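The two alignment rules in the commit above (signal frames 16-byte aligned, stack pointers 8-byte aligned) reduce to mask tests. The sketch below uses hypothetical names and deliberately ignores sparc64 details such as the stack bias, so it illustrates the checks rather than the kernel code.

```c
#include <stdint.h>

/* True if addr is a multiple of align, where align is a power of two. */
int is_aligned(uint64_t addr, uint64_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Accept a signal return frame only if the frame address is 16-byte
 * aligned and the saved stack pointer is 8-byte aligned. */
int signal_frame_ok(uint64_t frame_addr, uint64_t saved_sp)
{
    return is_aligned(frame_addr, 16) && is_aligned(saved_sp, 8);
}
```

Since the kernel itself builds signal frames with 16-byte alignment, a frame that fails this test cannot be one the kernel created and can be rejected early.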
  8. 29 May, 2016 3 commits
    • G
      h8300: Add <asm/hash.h> · 4684fe95
      George Spelvin committed
      This will improve the performance of hash_32() and hash_64(), but due
      to complete lack of multi-bit shift instructions on H8, performance will
      still be bad in surrounding code.
      
      Designing H8-specific hash algorithms to work around that is a separate
      project.  (But if the maintainers would like to get in touch...)
      Signed-off-by: George Spelvin <linux@sciencehorizons.net>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      4684fe95
    • G
      microblaze: Add <asm/hash.h> · 7b13277b
      George Spelvin committed
      Microblaze is an FPGA soft core that can be configured various ways.
      
      If it is configured without a multiplier, the standard __hash_32()
      will require a call to __mulsi3, which is a slow software loop.
      
      Instead, use a shift-and-add sequence for the constant multiply.
      GCC knows how to do this, but it's not as clever as some.
      Signed-off-by: George Spelvin <linux@sciencehorizons.net>
      Cc: Alistair Francis <alistair.francis@xilinx.com>
      Cc: Michal Simek <michal.simek@xilinx.com>
      7b13277b
    • G
      m68k: Add <asm/hash.h> · 14c44b95
      George Spelvin committed
      This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647
      for the original mc68000, which lacks a 32x32-bit multiply instruction.
      
      Yes, the amount of optimization effort put in is excessive. :-)
      
      Shift-add chain found by Yevgen Voronenko's Hcub algorithm at
      http://spiral.ece.cmu.edu/mcm/gen.html
      Signed-off-by: George Spelvin <linux@sciencehorizons.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Cc: Philippe De Muyter <phdm@macq.eu>
      Cc: linux-m68k@lists.linux-m68k.org
      14c44b95
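The idea behind the asm/hash.h commits above is to replace a multiply by the constant GOLDEN_RATIO_32 with shifts and adds. The sketch below uses the naive method (one add per set bit of the constant) purely to show that the result matches an ordinary multiply; it is not the optimized chain found with the Hcub algorithm, and the function names are hypothetical.

```c
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u

/* Multiply x by k modulo 2^32 using only shifts and adds: for every set
 * bit i of k, add (x << i) to the running sum. */
uint32_t mul_shift_add(uint32_t x, uint32_t k)
{
    uint32_t sum = 0;

    while (k != 0) {
        if (k & 1u)
            sum += x;
        x <<= 1;
        k >>= 1;
    }
    return sum;
}

/* Constant multiply used by the 32-bit hash, without a multiply insn. */
uint32_t mul_golden_ratio_32(uint32_t x)
{
    return mul_shift_add(x, GOLDEN_RATIO_32);
}
```

The naive method costs one add per set bit of the constant; an optimized shift-add chain reuses intermediate sums to get by with fewer operations, which is where the effort mentioned in the commit message goes.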