1. 05 11月, 2015 2 次提交
    • R
      sparc64: Don't restrict fp regs for no-fault loads · cae9af6a
      Rob Gardner 提交于
      The function handle_ldf_stq() deals with no-fault ASI
      loads and stores, but restricts fp registers to quad
      word regs (ie, %f0, %f4 etc). This is valid for the
      STQ case, but unnecessarily restricts loads, which
      may be single precision, double, or quad. This results
      in SIGFPE being raised for this instruction when the
      source address is invalid:
      	ldda [%g1] ASI_PNF, %f2
      but not for this one:
      	ldda [%g1] ASI_PNF, %f4
      The validation check for quad register is moved to
      within the STQ block so that loads are not affected
      by the check.
      
      An additional problem is that the calculation for freg
      is incorrect when a single precision load is being
      handled. This causes %f1 to be seen as %f32 etc,
      and the incorrect register ends up being overwritten.
      This code sequence demonstrates the problem:
      	ldd [%g1], %f32		! g1 = valid address
      	lda [%i3] ASI_PNF, %f1  ! i3 = invalid address
      	std %f32, [%g1]
      This is corrected by basing the freg calculation on
      the load size.
      Signed-off-by: NRob Gardner <rob.gardner@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cae9af6a
    • D
      iommu-common: Fix error code used in iommu_tbl_range_{alloc,free}(). · d618382b
      David S. Miller 提交于
      The value returned from iommu_tbl_range_alloc() (and the one passed
      in as a fourth argument to iommu_tbl_range_free) is not a DMA address,
      it is rather an index into the IOMMU page table.
      
      Therefore using DMA_ERROR_CODE is not appropriate.
      
      Use a more type matching error code define, IOMMU_ERROR_CODE, and
      update all users of this interface.
      Reported-by: NAndre Przywara <andre.przywara@arm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d618382b
  2. 08 8月, 2015 2 次提交
  3. 07 8月, 2015 3 次提交
    • D
      sparc64: Fix userspace FPU register corruptions. · 44922150
      David S. Miller 提交于
      If we have a series of events from userpsace, with %fprs=FPRS_FEF,
      like follows:
      
      ETRAP
      	ETRAP
      		VIS_ENTRY(fprs=0x4)
      		VIS_EXIT
      		RTRAP (kernel FPU restore with fpu_saved=0x4)
      	RTRAP
      
      We will not restore the user registers that were clobbered by the FPU
      using kernel code in the inner-most trap.
      
      Traps allocate FPU save slots in the thread struct, and FPU using
      sequences save the "dirty" FPU registers only.
      
      This works at the initial trap level because all of the registers
      get recorded into the top-level FPU save area, and we'll return
      to userspace with the FPU disabled so that any FPU use by the user
      will take an FPU disabled trap wherein we'll load the registers
      back up properly.
      
      But this is not how trap returns from kernel to kernel operate.
      
      The simplest fix for this bug is to always save all FPU register state
      for anything other than the top-most FPU save area.
      
      Getting rid of the optimized inner-slot FPU saving code ends up
      making VISEntryHalf degenerate into plain VISEntry.
      
      Longer term we need to do something smarter to reinstate the partial
      save optimizations.  Perhaps the fundament error is having trap entry
      and exit allocate FPU save slots and restore register state.  Instead,
      the VISEntry et al. calls should be doing that work.
      
      This bug is about two decades old.
      Reported-by: NJames Y Knight <jyknight@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44922150
    • A
      signal: fix information leak in copy_siginfo_to_user · 26135022
      Amanieu d'Antras 提交于
      This function may copy the si_addr_lsb, si_lower and si_upper fields to
      user mode when they haven't been initialized, which can leak kernel
      stack data to user mode.
      
      Just checking the value of si_code is insufficient because the same
      si_code value is shared between multiple signals.  This is solved by
      checking the value of si_signo in addition to si_code.
      Signed-off-by: NAmanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      26135022
    • A
      signal: fix information leak in copy_siginfo_from_user32 · 3c00cb5e
      Amanieu d'Antras 提交于
      This function can leak kernel stack data when the user siginfo_t has a
      positive si_code value.  The top 16 bits of si_code descibe which fields
      in the siginfo_t union are active, but they are treated inconsistently
      between copy_siginfo_from_user32, copy_siginfo_to_user32 and
      copy_siginfo_to_user.
      
      copy_siginfo_from_user32 is called from rt_sigqueueinfo and
      rt_tgsigqueueinfo in which the user has full control overthe top 16 bits
      of si_code.
      
      This fixes the following information leaks:
      x86:   8 bytes leaked when sending a signal from a 32-bit process to
             itself. This leak grows to 16 bytes if the process uses x32.
             (si_code = __SI_CHLD)
      x86:   100 bytes leaked when sending a signal from a 32-bit process to
             a 64-bit process. (si_code = -1)
      sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
             64-bit process. (si_code = any)
      
      parsic and s390 have similar bugs, but they are not vulnerable because
      rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
      to a different process.  These bugs are also fixed for consistency.
      Signed-off-by: NAmanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3c00cb5e
  4. 05 8月, 2015 1 次提交
  5. 01 8月, 2015 1 次提交
  6. 31 7月, 2015 6 次提交
    • R
      dmaengine: xgene-dma: Fix the resource map to handle overlapping · cda8e937
      Rameshwar Prasad Sahu 提交于
      There is an overlap in dma ring cmd csr region due to sharing of ethernet
      ring cmd csr region. This patch fix the resource overlapping by mapping
      the entire dma ring cmd csr region.
      Signed-off-by: NRameshwar Prasad Sahu <rsahu@apm.com>
      Signed-off-by: NVinod Koul <vinod.koul@intel.com>
      cda8e937
    • A
      x86/ldt: Make modify_ldt synchronous · 37868fe1
      Andy Lutomirski 提交于
      modify_ldt() has questionable locking and does not synchronize
      threads.  Improve it: redesign the locking and synchronize all
      threads' LDTs using an IPI on all modifications.
      
      This will dramatically slow down modify_ldt in multithreaded
      programs, but there shouldn't be any multithreaded programs that
      care about modify_ldt's performance in the first place.
      
      This fixes some fallout from the CVE-2015-5157 fixes.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org <security@kernel.org>
      Cc: <stable@vger.kernel.org>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      37868fe1
    • A
      x86/xen: Probe target addresses in set_aliased_prot() before the hypercall · aa1acff3
      Andy Lutomirski 提交于
      The update_va_mapping hypercall can fail if the VA isn't present
      in the guest's page tables.  Under certain loads, this can
      result in an OOPS when the target address is in unpopulated vmap
      space.
      
      While we're at it, add comments to help explain what's going on.
      
      This isn't a great long-term fix.  This code should probably be
      changed to use something like set_memory_ro.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Vrabel <dvrabel@cantab.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org <security@kernel.org>
      Cc: <stable@vger.kernel.org>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      aa1acff3
    • J
      x86/irq: Use the caller provided polarity setting in mp_check_pin_attr() · 646c4b75
      Jiang Liu 提交于
      Commit d32932d0 ("x86/irq: Convert IOAPIC to use hierarchical
      irqdomain interfaces") introduced a regression which causes
      malfunction of interrupt lines.
      
      The reason is that the conversion of mp_check_pin_attr() missed to
      update the polarity selection of the interrupt pin with the caller
      provided setting and instead uses a stale attribute value. That in
      turn results in chosing the wrong interrupt flow handler.
      
      Use the caller supplied setting to configure the pin correctly which
      also choses the correct interrupt flow handler.
      
      This restores the original behaviour and on the affected
      machine/driver (Surface Pro 3, i2c controller) all IOAPIC IRQ
      configuration are identical to v4.1.
      
      Fixes: d32932d0 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces")
      Reported-and-tested-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Reported-and-tested-by: NChen Yu <yu.c.chen@intel.com>
      Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Chen Yu <yu.c.chen@intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/1438242695-23531-1-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      646c4b75
    • R
      efi: Check for NULL efi kernel parameters · 9115c758
      Ricardo Neri 提交于
      Even though it is documented how to specifiy efi parameters, it is
      possible to cause a kernel panic due to a dereference of a NULL pointer when
      parsing such parameters if "efi" alone is given:
      
      PANIC: early exception 0e rip 10:ffffffff812fb361 error 0 cr2 0
      [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.2.0-rc1+ #450
      [ 0.000000]  ffffffff81fe20a9 ffffffff81e03d50 ffffffff8184bb0f 00000000000003f8
      [ 0.000000]  0000000000000000 ffffffff81e03e08 ffffffff81f371a1 64656c62616e6520
      [ 0.000000]  0000000000000069 000000000000005f 0000000000000000 0000000000000000
      [ 0.000000] Call Trace:
      [ 0.000000]  [<ffffffff8184bb0f>] dump_stack+0x45/0x57
      [ 0.000000]  [<ffffffff81f371a1>] early_idt_handler_common+0x81/0xae
      [ 0.000000]  [<ffffffff812fb361>] ? parse_option_str+0x11/0x90
      [ 0.000000]  [<ffffffff81f4dd69>] arch_parse_efi_cmdline+0x15/0x42
      [ 0.000000]  [<ffffffff81f376e1>] do_early_param+0x50/0x8a
      [ 0.000000]  [<ffffffff8106b1b3>] parse_args+0x1e3/0x400
      [ 0.000000]  [<ffffffff81f37a43>] parse_early_options+0x24/0x28
      [ 0.000000]  [<ffffffff81f37691>] ? loglevel+0x31/0x31
      [ 0.000000]  [<ffffffff81f37a78>] parse_early_param+0x31/0x3d
      [ 0.000000]  [<ffffffff81f3ae98>] setup_arch+0x2de/0xc08
      [ 0.000000]  [<ffffffff8109629a>] ? vprintk_default+0x1a/0x20
      [ 0.000000]  [<ffffffff81f37b20>] start_kernel+0x90/0x423
      [ 0.000000]  [<ffffffff81f37495>] x86_64_start_reservations+0x2a/0x2c
      [ 0.000000]  [<ffffffff81f37582>] x86_64_start_kernel+0xeb/0xef
      [ 0.000000] RIP 0xffffffff81ba2efc
      
      This panic is not reproducible with "efi=" as this will result in a non-NULL
      zero-length string.
      
      Thus, verify that the pointer to the parameter string is not NULL. This is
      consistent with other parameter-parsing functions which check for NULL pointers.
      Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      9115c758
    • D
      x86/efi: Use all 64 bit of efi_memmap in setup_e820() · 7cc03e48
      Dmitry Skorodumov 提交于
      The efi_info structure stores low 32 bits of memory map
      in efi_memmap and high 32 bits in efi_memmap_hi.
      
      While constructing pointer in the setup_e820(), need
      to take into account all 64 bit of the pointer.
      
      It is because on 64bit machine the function
      efi_get_memory_map() may return full 64bit pointer and before
      the patch that pointer was truncated.
      
      The issue is triggered on Parallles virtual machine and
      fixed with this patch.
      Signed-off-by: NDmitry Skorodumov <sdmitry@parallels.com>
      Cc: Denis V. Lunev <den@openvz.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      7cc03e48
  7. 30 7月, 2015 3 次提交
    • C
      KVM: s390: Fix hang VCPU hang/loop regression · 586b7ccd
      Christian Borntraeger 提交于
      commit 785dbef4 ("KVM: s390: optimize round trip time in request
      handling") introduced a regression. This regression was seen with
      CPU hotplug in the guest and switching between 1 or 2 CPUs. This will
      set/reset the IBS control via synced request.
      
      Whenever we make a synced request, we first set the vcpu->requests
      bit and then block the vcpu. The handler, on the other hand, unblocks
      itself, processes vcpu->requests (by clearing them) and unblocks itself
      once again.
      
      Now, if the requester sleeps between setting of vcpu->requests and
      blocking, the handler will clear the vcpu->requests bit and try to
      unblock itself (although no bit is set). When the requester wakes up,
      it blocks the VCPU and we have a blocked VCPU without requests.
      
      Solution is to always unset the block bit.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
      Fixes: 785dbef4 ("KVM: s390: optimize round trip time in request handling")
      586b7ccd
    • A
      powerpc/eeh-powernv: Fix unbalanced IRQ warning · b8d65e96
      Alistair Popple 提交于
      pnv_eeh_next_error() re-enables the eeh opal event interrupt but it
      gets called from a loop if there are more outstanding events to
      process, resulting in a warning due to enabling an already enabled
      interrupt. Instead the interrupt should only be re-enabled once the
      last outstanding event has been processed.
      Tested-by: NDaniel Axtens <dja@axtens.net>
      Reported-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NAlistair Popple <alistair@popple.id.au>
      Acked-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b8d65e96
    • D
      ebpf, x86: fix general protection fault when tail call is invoked · 2482abb9
      Daniel Borkmann 提交于
      With eBPF JIT compiler enabled on x86_64, I was able to reliably trigger
      the following general protection fault out of an eBPF program with a simple
      tail call, f.e. tracex5 (or a stripped down version of it):
      
        [  927.097918] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
        [...]
        [  927.100870] task: ffff8801f228b780 ti: ffff880016a64000 task.ti: ffff880016a64000
        [  927.102096] RIP: 0010:[<ffffffffa002440d>]  [<ffffffffa002440d>] 0xffffffffa002440d
        [  927.103390] RSP: 0018:ffff880016a67a68  EFLAGS: 00010006
        [  927.104683] RAX: 5a5a5a5a5a5a5a5a RBX: 0000000000000000 RCX: 0000000000000001
        [  927.105921] RDX: 0000000000000000 RSI: ffff88014e438000 RDI: ffff880016a67e00
        [  927.107137] RBP: ffff880016a67c90 R08: 0000000000000000 R09: 0000000000000001
        [  927.108351] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880016a67e00
        [  927.109567] R13: 0000000000000000 R14: ffff88026500e460 R15: ffff880220a81520
        [  927.110787] FS:  00007fe7d5c1f740(0000) GS:ffff880265000000(0000) knlGS:0000000000000000
        [  927.112021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [  927.113255] CR2: 0000003e7bbb91a0 CR3: 000000006e04b000 CR4: 00000000001407e0
        [  927.114500] Stack:
        [  927.115737]  ffffc90008cdb000 ffff880016a67e00 ffff88026500e460 ffff880220a81520
        [  927.117005]  0000000100000000 000000000000001b ffff880016a67aa8 ffffffff8106c548
        [  927.118276]  00007ffcdaf22e58 0000000000000000 0000000000000000 ffff880016a67ff0
        [  927.119543] Call Trace:
        [  927.120797]  [<ffffffff8106c548>] ? lookup_address+0x28/0x30
        [  927.122058]  [<ffffffff8113d176>] ? __module_text_address+0x16/0x70
        [  927.123314]  [<ffffffff8117bf0e>] ? is_ftrace_trampoline+0x3e/0x70
        [  927.124562]  [<ffffffff810c1a0f>] ? __kernel_text_address+0x5f/0x80
        [  927.125806]  [<ffffffff8102086f>] ? print_context_stack+0x7f/0xf0
        [  927.127033]  [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050
        [  927.128254]  [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050
        [  927.129461]  [<ffffffff8119edfa>] ? trace_call_bpf+0x3a/0x140
        [  927.130654]  [<ffffffff8119ee4a>] trace_call_bpf+0x8a/0x140
        [  927.131837]  [<ffffffff8119edfa>] ? trace_call_bpf+0x3a/0x140
        [  927.133015]  [<ffffffff8119f008>] kprobe_perf_func+0x28/0x220
        [  927.134195]  [<ffffffff811a1668>] kprobe_dispatcher+0x38/0x60
        [  927.135367]  [<ffffffff81174b91>] ? seccomp_phase1+0x1/0x230
        [  927.136523]  [<ffffffff81061400>] kprobe_ftrace_handler+0xf0/0x150
        [  927.137666]  [<ffffffff81174b95>] ? seccomp_phase1+0x5/0x230
        [  927.138802]  [<ffffffff8117950c>] ftrace_ops_recurs_func+0x5c/0xb0
        [  927.139934]  [<ffffffffa022b0d5>] 0xffffffffa022b0d5
        [  927.141066]  [<ffffffff81174b91>] ? seccomp_phase1+0x1/0x230
        [  927.142199]  [<ffffffff81174b95>] seccomp_phase1+0x5/0x230
        [  927.143323]  [<ffffffff8102c0a4>] syscall_trace_enter_phase1+0xc4/0x150
        [  927.144450]  [<ffffffff81174b95>] ? seccomp_phase1+0x5/0x230
        [  927.145572]  [<ffffffff8102c0a4>] ? syscall_trace_enter_phase1+0xc4/0x150
        [  927.146666]  [<ffffffff817f9a9f>] tracesys+0xd/0x44
        [  927.147723] Code: 48 8b 46 10 48 39 d0 76 2c 8b 85 fc fd ff ff 83 f8 20 77 21 83
                             c0 01 89 85 fc fd ff ff 48 8d 44 d6 80 48 8b 00 48 83 f8 00 74
                             0a <48> 8b 40 20 48 83 c0 33 ff e0 48 89 d8 48 8b 9d d8 fd ff
                             ff 4c
        [  927.150046] RIP  [<ffffffffa002440d>] 0xffffffffa002440d
      
      The code section with the instructions that traps points into the eBPF JIT
      image of the root program (the one invoking the tail call instruction).
      
      Using bpf_jit_disasm -o on the eBPF root program image:
      
        [...]
        4e:   mov    -0x204(%rbp),%eax
              8b 85 fc fd ff ff
        54:   cmp    $0x20,%eax               <--- if (tail_call_cnt > MAX_TAIL_CALL_CNT)
              83 f8 20
        57:   ja     0x000000000000007a
              77 21
        59:   add    $0x1,%eax                <--- tail_call_cnt++
              83 c0 01
        5c:   mov    %eax,-0x204(%rbp)
              89 85 fc fd ff ff
        62:   lea    -0x80(%rsi,%rdx,8),%rax  <--- prog = array->prog[index]
              48 8d 44 d6 80
        67:   mov    (%rax),%rax
              48 8b 00
        6a:   cmp    $0x0,%rax                <--- check for NULL
              48 83 f8 00
        6e:   je     0x000000000000007a
              74 0a
        70:   mov    0x20(%rax),%rax          <--- GPF triggered here! fetch of bpf_func
              48 8b 40 20                              [ matches <48> 8b 40 20 ... from above ]
        74:   add    $0x33,%rax               <--- prologue skip of new prog
              48 83 c0 33
        78:   jmpq   *%rax                    <--- jump to new prog insns
              ff e0
        [...]
      
      The problem is that rax has 5a5a5a5a5a5a5a5a, which suggests a tail call
      jump to map slot 0 is pointing to a poisoned page. The issue is the following:
      
      lea instruction has a wrong offset, i.e. it should be ...
      
        lea    0x80(%rsi,%rdx,8),%rax
      
      ... but it actually seems to be ...
      
        lea   -0x80(%rsi,%rdx,8),%rax
      
      ... where 0x80 is offsetof(struct bpf_array, prog), thus the offset needs
      to be positive instead of negative. Disassembling the interpreter, we btw
      similarly do:
      
        [...]
        c88:  lea     0x80(%rax,%rdx,8),%rax  <--- prog = array->prog[index]
              48 8d 84 d0 80 00 00 00
        c90:  add     $0x1,%r13d
              41 83 c5 01
        c94:  mov     (%rax),%rax
              48 8b 00
        [...]
      
      Now the other interesting fact is that this panic triggers only when things
      like CONFIG_LOCKDEP are being used. In that case offsetof(struct bpf_array,
      prog) starts at offset 0x80 and in non-CONFIG_LOCKDEP case at offset 0x50.
      Reason is that the work_struct inside struct bpf_map grows by 48 bytes in my
      case due to the lockdep_map member (which also has CONFIG_LOCK_STAT enabled
      members).
      
      Changing the emitter to always use the 4 byte displacement in the lea
      instruction fixes the panic on my side. It increases the tail call instruction
      emission by 3 more byte, but it should cover us from various combinations
      (and perhaps other future increases on related structures).
      
      After patch, disassembly:
      
        [...]
        9e:   lea    0x80(%rsi,%rdx,8),%rax   <--- CONFIG_LOCKDEP/CONFIG_LOCK_STAT
              48 8d 84 d6 80 00 00 00
        a6:   mov    (%rax),%rax
              48 8b 00
        [...]
      
        [...]
        9e:   lea    0x50(%rsi,%rdx,8),%rax   <--- No CONFIG_LOCKDEP
              48 8d 84 d6 50 00 00 00
        a6:   mov    (%rax),%rax
              48 8b 00
        [...]
      
      Fixes: b52f00e6 ("x86: bpf_jit: implement bpf_tail_call() helper")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2482abb9
  8. 28 7月, 2015 2 次提交
    • H
      s390/cachinfo: add missing facility check to init_cache_level() · 0b991f5c
      Heiko Carstens 提交于
      Stephen Powell reported the following crash on a z890 machine:
      
      Kernel BUG at 00000000001219d0 [verbose debug info unavailable]
      illegal operation: 0001 ilc:3 [#1] SMP
      Krnl PSW : 0704e00180000000 00000000001219d0 (init_cache_level+0x38/0xe0)
      	   R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
      Krnl Code: 00000000001219c2: a7840056		brc	8,121a6e
      	   00000000001219c6: a7190000		lghi	%r1,0
      	  #00000000001219ca: eb101000004c	ecag	%r1,%r0,0(%r1)
      	  >00000000001219d0: a7390000		lghi	%r3,0
      	   00000000001219d4: e310f0a00024	stg	%r1,160(%r15)
      	   00000000001219da: a7080000		lhi	%r0,0
      	   00000000001219de: a7b9f000		lghi	%r11,-4096
      	   00000000001219e2: c0a0002899d9	larl	%r10,634d94
      Call Trace:
       [<0000000000478ee2>] detect_cache_attributes+0x2a/0x2b8
       [<000000000097c9b0>] cacheinfo_sysfs_init+0x60/0xc8
       [<00000000001001c0>] do_one_initcall+0x98/0x1c8
       [<000000000094fdc2>] kernel_init_freeable+0x212/0x2d8
       [<000000000062352e>] kernel_init+0x26/0x118
       [<000000000062fd2e>] kernel_thread_starter+0x6/0xc
      
      The illegal operation was executed because of a missing facility check,
      which should have made sure that the ECAG execution would only be executed
      on machines which have the general-instructions-extension facility
      installed.
      Reported-and-tested-by: NStephen Powell <zlinuxman@wowway.com>
      Cc: stable@vger.kernel.org # v4.0+
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      0b991f5c
    • A
      arm64/efi: map the entire UEFI vendor string before reading it · f91b1fea
      Ard Biesheuvel 提交于
      At boot, the UTF-16 UEFI vendor string is copied from the system
      table into a char array with a size of 100 bytes. However, this
      size of 100 bytes is also used for memremapping() the source,
      which may not be sufficient if the vendor string exceeds 50
      UTF-16 characters, and the placement of the vendor string inside
      a 4 KB page happens to leave the end unmapped.
      
      So use the correct '100 * sizeof(efi_char16_t)' for the size of
      the mapping.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Fixes: f84d0275 ("arm64: add EFI runtime services")
      Cc: <stable@vger.kernel.org> # 3.16+
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      f91b1fea
  9. 27 7月, 2015 3 次提交
  10. 26 7月, 2015 2 次提交
    • T
      x86/mm/pat: Revert 'Adjust default caching mode translation tables' · 1a4e8795
      Thomas Gleixner 提交于
      Toshi explains:
      
      "No, the default values need to be set to the fallback types,
       i.e. minimal supported mode.  For WC and WT, UC is the fallback type.
      
       When PAT is disabled, pat_init() does update the tables below to
       enable WT per the default BIOS setup.  However, when PAT is enabled,
       but CPU has PAT -errata, WT falls back to UC per the default values."
      
      Revert: ca1fec58 'x86/mm/pat: Adjust default caching mode translation tables'
      Requested-by: NToshi Kani <toshi.kani@hp.com>
      Cc: Jan Beulich <jbeulich@suse.de>
      Link: http://lkml.kernel.org/r/1437577776.3214.252.camel@hp.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      1a4e8795
    • M
      perf/x86/intel/cqm: Return cached counter value from IRQ context · 2c534c0d
      Matt Fleming 提交于
      Peter reported the following potential crash which I was able to
      reproduce with his test program,
      
      [  148.765788] ------------[ cut here ]------------
      [  148.765796] WARNING: CPU: 34 PID: 2840 at kernel/smp.c:417 smp_call_function_many+0xb6/0x260()
      [  148.765797] Modules linked in:
      [  148.765800] CPU: 34 PID: 2840 Comm: perf Not tainted 4.2.0-rc1+ #4
      [  148.765803]  ffffffff81cdc398 ffff88085f105950 ffffffff818bdfd5 0000000000000007
      [  148.765805]  0000000000000000 ffff88085f105990 ffffffff810e413a 0000000000000000
      [  148.765807]  ffffffff82301080 0000000000000022 ffffffff8107f640 ffffffff8107f640
      [  148.765809] Call Trace:
      [  148.765810]  <NMI>  [<ffffffff818bdfd5>] dump_stack+0x45/0x57
      [  148.765818]  [<ffffffff810e413a>] warn_slowpath_common+0x8a/0xc0
      [  148.765822]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
      [  148.765824]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
      [  148.765825]  [<ffffffff810e422a>] warn_slowpath_null+0x1a/0x20
      [  148.765827]  [<ffffffff811613f6>] smp_call_function_many+0xb6/0x260
      [  148.765829]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
      [  148.765831]  [<ffffffff81161748>] on_each_cpu_mask+0x28/0x60
      [  148.765832]  [<ffffffff8107f6ef>] intel_cqm_event_count+0x7f/0xe0
      [  148.765836]  [<ffffffff811cdd35>] perf_output_read+0x2a5/0x400
      [  148.765839]  [<ffffffff811d2e5a>] perf_output_sample+0x31a/0x590
      [  148.765840]  [<ffffffff811d333d>] ? perf_prepare_sample+0x26d/0x380
      [  148.765841]  [<ffffffff811d3497>] perf_event_output+0x47/0x60
      [  148.765843]  [<ffffffff811d36c5>] __perf_event_overflow+0x215/0x240
      [  148.765844]  [<ffffffff811d4124>] perf_event_overflow+0x14/0x20
      [  148.765847]  [<ffffffff8107e7f4>] intel_pmu_handle_irq+0x1d4/0x440
      [  148.765849]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
      [  148.765853]  [<ffffffff81219bad>] ? vunmap_page_range+0x19d/0x2f0
      [  148.765854]  [<ffffffff81219d11>] ? unmap_kernel_range_noflush+0x11/0x20
      [  148.765859]  [<ffffffff814ce6fe>] ? ghes_copy_tofrom_phys+0x11e/0x2a0
      [  148.765863]  [<ffffffff8109e5db>] ? native_apic_msr_write+0x2b/0x30
      [  148.765865]  [<ffffffff8109e44d>] ? x2apic_send_IPI_self+0x1d/0x20
      [  148.765869]  [<ffffffff81065135>] ? arch_irq_work_raise+0x35/0x40
      [  148.765872]  [<ffffffff811c8d86>] ? irq_work_queue+0x66/0x80
      [  148.765875]  [<ffffffff81075306>] perf_event_nmi_handler+0x26/0x40
      [  148.765877]  [<ffffffff81063ed9>] nmi_handle+0x79/0x100
      [  148.765879]  [<ffffffff81064422>] default_do_nmi+0x42/0x100
      [  148.765880]  [<ffffffff81064563>] do_nmi+0x83/0xb0
      [  148.765884]  [<ffffffff818c7c0f>] end_repeat_nmi+0x1e/0x2e
      [  148.765886]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
      [  148.765888]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
      [  148.765890]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
      [  148.765891]  <<EOE>>  [<ffffffff8110ab66>] finish_task_switch+0x156/0x210
      [  148.765898]  [<ffffffff818c1671>] __schedule+0x341/0x920
      [  148.765899]  [<ffffffff818c1c87>] schedule+0x37/0x80
      [  148.765903]  [<ffffffff810ae1af>] ? do_page_fault+0x2f/0x80
      [  148.765905]  [<ffffffff818c1f4a>] schedule_user+0x1a/0x50
      [  148.765907]  [<ffffffff818c666c>] retint_careful+0x14/0x32
      [  148.765908] ---[ end trace e33ff2be78e14901 ]---
      
      The CQM task events are not safe to be called from within interrupt
      context because they require performing an IPI to read the counter value
      on all sockets. And performing IPIs from within IRQ context is a
      "no-no".
      
      Make do with the last read counter value currently event in
      event->count when we're invoked in this context.
      Reported-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vikas Shivappa <vikas.shivappa@intel.com>
      Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
      Cc: Will Auld <will.auld@intel.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1437490509-15373-1-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      2c534c0d
  11. 25 7月, 2015 1 次提交
  12. 24 7月, 2015 9 次提交
  13. 23 7月, 2015 5 次提交
    • R
      ARM: OMAP2+: hwmod: Fix _wait_target_ready() for hwmods without sysc · 9a258afa
      Roger Quadros 提交于
      For hwmods without sysc, _init_mpu_rt_base(oh) won't be called and so
      _find_mpu_rt_port(oh) will return NULL thus preventing ready state check
      on those modules after the module is enabled.
      
      This can potentially cause a bus access error if the module is accessed
      before the module is ready.
      
      Fix this by unconditionally calling _init_mpu_rt_base() during hwmod
      _init(). Do ioremap only if we need SYSC access.
      
      Eventhough _wait_target_ready() check doesn't really need MPU RT port but
      just the PRCM registers, we still mandate that the hwmod must have an
      MPU RT port if ready state check needs to be done. Else it would mean that
      the module is not accessible by MPU so there is no point in waiting
      for target to be ready.
      
      e.g. this fixes the below DCAN bus access error on AM437x-gp-evm.
      
      [   16.672978] ------------[ cut here ]------------
      [   16.677885] WARNING: CPU: 0 PID: 1580 at drivers/bus/omap_l3_noc.c:147 l3_interrupt_handler+0x234/0x35c()
      [   16.687946] 44000000.ocp:L3 Custom Error: MASTER M2 (64-bit) TARGET L4_PER_0 (Read): Data Access in User mode during Functional access
      [   16.700654] Modules linked in: xhci_hcd btwilink ti_vpfe dwc3 videobuf2_core ov2659 bluetooth v4l2_common videodev ti_am335x_adc kfifo_buf industrialio c_can_platform videobuf2_dma_contig media snd_soc_tlv320aic3x pixcir_i2c_ts c_can dc
      [   16.731144] CPU: 0 PID: 1580 Comm: rpc.statd Not tainted 3.14.26-02561-gf733aa036398 #180
      [   16.739747] Backtrace:
      [   16.742336] [<c0011108>] (dump_backtrace) from [<c00112a4>] (show_stack+0x18/0x1c)
      [   16.750285]  r6:00000093 r5:00000009 r4:eab5b8a8 r3:00000000
      [   16.756252] [<c001128c>] (show_stack) from [<c05a4418>] (dump_stack+0x20/0x28)
      [   16.763870] [<c05a43f8>] (dump_stack) from [<c0037120>] (warn_slowpath_common+0x6c/0x8c)
      [   16.772408] [<c00370b4>] (warn_slowpath_common) from [<c00371e4>] (warn_slowpath_fmt+0x38/0x40)
      [   16.781550]  r8:c05d1f90 r7:c0730844 r6:c0730448 r5:80080003 r4:ed0cd210
      [   16.788626] [<c00371b0>] (warn_slowpath_fmt) from [<c027fa94>] (l3_interrupt_handler+0x234/0x35c)
      [   16.797968]  r3:ed0cd480 r2:c0730508
      [   16.801747] [<c027f860>] (l3_interrupt_handler) from [<c0063758>] (handle_irq_event_percpu+0x54/0x1bc)
      [   16.811533]  r10:ed005600 r9:c084855b r8:0000002a r7:00000000 r6:00000000 r5:0000002a
      [   16.819780]  r4:ed0e6d80
      [   16.822453] [<c0063704>] (handle_irq_event_percpu) from [<c00638f0>] (handle_irq_event+0x30/0x40)
      [   16.831789]  r10:eb2b6938 r9:eb2b6960 r8:bf011420 r7:fa240100 r6:00000000 r5:0000002a
      [   16.840052]  r4:ed005600
      [   16.842744] [<c00638c0>] (handle_irq_event) from [<c00661d8>] (handle_fasteoi_irq+0x74/0x128)
      [   16.851702]  r4:ed005600 r3:00000000
      [   16.855479] [<c0066164>] (handle_fasteoi_irq) from [<c0063068>] (generic_handle_irq+0x28/0x38)
      [   16.864523]  r4:0000002a r3:c0066164
      [   16.868294] [<c0063040>] (generic_handle_irq) from [<c000ef60>] (handle_IRQ+0x38/0x8c)
      [   16.876612]  r4:c081c640 r3:00000202
      [   16.880380] [<c000ef28>] (handle_IRQ) from [<c00084f0>] (gic_handle_irq+0x30/0x5c)
      [   16.888328]  r6:eab5ba38 r5:c0804460 r4:fa24010c r3:00000100
      [   16.894303] [<c00084c0>] (gic_handle_irq) from [<c05a8d80>] (__irq_svc+0x40/0x50)
      [   16.902193] Exception stack(0xeab5ba38 to 0xeab5ba80)
      [   16.907499] ba20:                                                       00000000 00000006
      [   16.916108] ba40: fa1d0000 fa1d0008 ed3d3000 eab5bab4 ed3d3460 c0842af4 bf011420 eb2b6960
      [   16.924716] ba60: eb2b6938 eab5ba8c eab5ba90 eab5ba80 bf035220 bf07702c 600f0013 ffffffff
      [   16.933317]  r7:eab5ba6c r6:ffffffff r5:600f0013 r4:bf07702c
      [   16.939317] [<bf077000>] (c_can_plat_read_reg_aligned_to_16bit [c_can_platform]) from [<bf035220>] (c_can_get_berr_counter+0x38/0x64 [c_can])
      [   16.952696] [<bf0351e8>] (c_can_get_berr_counter [c_can]) from [<bf010294>] (can_fill_info+0x124/0x15c [can_dev])
      [   16.963480]  r5:ec8c9740 r4:ed3d3000
      [   16.967253] [<bf010170>] (can_fill_info [can_dev]) from [<c0502fa8>] (rtnl_fill_ifinfo+0x58c/0x8fc)
      [   16.976749]  r6:ec8c9740 r5:ed3d3000 r4:eb2b6780
      [   16.981613] [<c0502a1c>] (rtnl_fill_ifinfo) from [<c0503408>] (rtnl_dump_ifinfo+0xf0/0x1dc)
      [   16.990401]  r10:ec8c9740 r9:00000000 r8:00000000 r7:00000000 r6:ebd4d1b4 r5:ed3d3000
      [   16.998671]  r4:00000000
      [   17.001342] [<c0503318>] (rtnl_dump_ifinfo) from [<c050e6e4>] (netlink_dump+0xa8/0x1e0)
      [   17.009772]  r10:00000000 r9:00000000 r8:c0503318 r7:ebf3e6c0 r6:ebd4d1b4 r5:ec8c9740
      [   17.018050]  r4:ebd4d000
      [   17.020714] [<c050e63c>] (netlink_dump) from [<c050ec10>] (__netlink_dump_start+0x104/0x154)
      [   17.029591]  r6:eab5bd34 r5:ec8c9980 r4:ebd4d000
      [   17.034454] [<c050eb0c>] (__netlink_dump_start) from [<c0505604>] (rtnetlink_rcv_msg+0x110/0x1f4)
      [   17.043778]  r7:00000000 r6:ec8c9980 r5:00000f40 r4:ebf3e6c0
      [   17.049743] [<c05054f4>] (rtnetlink_rcv_msg) from [<c05108e8>] (netlink_rcv_skb+0xb4/0xc8)
      [   17.058449]  r8:eab5bdac r7:ec8c9980 r6:c05054f4 r5:ec8c9980 r4:ebf3e6c0
      [   17.065534] [<c0510834>] (netlink_rcv_skb) from [<c0504134>] (rtnetlink_rcv+0x24/0x2c)
      [   17.073854]  r6:ebd4d000 r5:00000014 r4:ec8c9980 r3:c0504110
      [   17.079846] [<c0504110>] (rtnetlink_rcv) from [<c05102ac>] (netlink_unicast+0x180/0x1ec)
      [   17.088363]  r4:ed0c6800 r3:c0504110
      [   17.092113] [<c051012c>] (netlink_unicast) from [<c0510670>] (netlink_sendmsg+0x2ac/0x380)
      [   17.100813]  r10:00000000 r8:00000008 r7:ec8c9980 r6:ebd4d000 r5:eab5be70 r4:eab5bee4
      [   17.109083] [<c05103c4>] (netlink_sendmsg) from [<c04dfdb4>] (sock_sendmsg+0x90/0xb0)
      [   17.117305]  r10:00000000 r9:eab5a000 r8:becdda3c r7:0000000c r6:ea978400 r5:eab5be70
      [   17.125563]  r4:c05103c4
      [   17.128225] [<c04dfd24>] (sock_sendmsg) from [<c04e1c28>] (SyS_sendto+0xb8/0xdc)
      [   17.136001]  r6:becdda5c r5:00000014 r4:ecd37040
      [   17.140876] [<c04e1b70>] (SyS_sendto) from [<c000e680>] (ret_fast_syscall+0x0/0x30)
      [   17.148923]  r10:00000000 r8:c000e804 r7:00000122 r6:becdda5c r5:0000000c r4:becdda5c
      [   17.157169] ---[ end trace 2b71e15b38f58bad ]---
      
      Fixes: 6423d6df ("ARM: OMAP2+: hwmod: check for module address space during init")
      Signed-off-by: NRoger Quadros <rogerq@ti.com>
      Signed-off-by: NPaul Walmsley <paul@pwsan.com>
      Cc: <stable@vger.kernel.org>
      9a258afa
    • A
      powerpc/powernv/ioda2: Fix calculation for memory allocated for TCE table · 3ba3a73e
      Alexey Kardashevskiy 提交于
      The existing code stores the amount of memory allocated for a TCE table.
      At the moment it uses @offset which is a virtual offset in the TCE table
      which is only correct for a one level tables and it does not include
      memory allocated for intermediate levels. When multilevel TCE table is
      requested, WARN_ON in tce_iommu_create_table() prints a warning.
      
      This adds an additional counter to pnv_pci_ioda2_table_do_alloc_pages()
      to count actually allocated memory.
      Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      3ba3a73e
    • P
      KVM: x86: rename quirk constants to KVM_X86_QUIRK_* · 0da029ed
      Paolo Bonzini 提交于
      Make them clearly architecture-dependent; the capability is valid for
      all architectures, but the argument is not.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      0da029ed
    • X
      KVM: vmx: obey KVM_QUIRK_CD_NW_CLEARED · fb279950
      Xiao Guangrong 提交于
      OVMF depends on WB to boot fast, because it only clears caches after
      it has set up MTRRs---which is too late.
      
      Let's do writeback if CR0.CD is set to make it happy, similar to what
      SVM is already doing.
      Signed-off-by: NXiao Guangrong <guangrong.xiao@intel.com>
      Tested-by: NAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      fb279950
    • P
      KVM: x86: introduce kvm_check_has_quirk · 41dbc6bc
      Paolo Bonzini 提交于
      The logic of the disabled_quirks field usually results in a double
      negation.  Wrap it in a simple function that checks the bit and
      negates it.
      
      Based on a patch from Xiao Guangrong.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      41dbc6bc