1. 07 8月, 2015 4 次提交
    • V
      ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff · 10971638
      Vineet Gupta 提交于
      The increment of delay counter was 2 instructions:
      Arithmatic Shfit Left (ASL) + set to 1 on overflow
      
      This can be done in 1 using ROtate Left (ROL)
      Suggested-by: NNigel Topham <ntopham@synopsys.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      10971638
    • D
      sparc64: Fix userspace FPU register corruptions. · 44922150
      David S. Miller 提交于
      If we have a series of events from userpsace, with %fprs=FPRS_FEF,
      like follows:
      
      ETRAP
      	ETRAP
      		VIS_ENTRY(fprs=0x4)
      		VIS_EXIT
      		RTRAP (kernel FPU restore with fpu_saved=0x4)
      	RTRAP
      
      We will not restore the user registers that were clobbered by the FPU
      using kernel code in the inner-most trap.
      
      Traps allocate FPU save slots in the thread struct, and FPU using
      sequences save the "dirty" FPU registers only.
      
      This works at the initial trap level because all of the registers
      get recorded into the top-level FPU save area, and we'll return
      to userspace with the FPU disabled so that any FPU use by the user
      will take an FPU disabled trap wherein we'll load the registers
      back up properly.
      
      But this is not how trap returns from kernel to kernel operate.
      
      The simplest fix for this bug is to always save all FPU register state
      for anything other than the top-most FPU save area.
      
      Getting rid of the optimized inner-slot FPU saving code ends up
      making VISEntryHalf degenerate into plain VISEntry.
      
      Longer term we need to do something smarter to reinstate the partial
      save optimizations.  Perhaps the fundament error is having trap entry
      and exit allocate FPU save slots and restore register state.  Instead,
      the VISEntry et al. calls should be doing that work.
      
      This bug is about two decades old.
      Reported-by: NJames Y Knight <jyknight@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44922150
    • A
      signal: fix information leak in copy_siginfo_to_user · 26135022
      Amanieu d'Antras 提交于
      This function may copy the si_addr_lsb, si_lower and si_upper fields to
      user mode when they haven't been initialized, which can leak kernel
      stack data to user mode.
      
      Just checking the value of si_code is insufficient because the same
      si_code value is shared between multiple signals.  This is solved by
      checking the value of si_signo in addition to si_code.
      Signed-off-by: NAmanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      26135022
    • A
      signal: fix information leak in copy_siginfo_from_user32 · 3c00cb5e
      Amanieu d'Antras 提交于
      This function can leak kernel stack data when the user siginfo_t has a
      positive si_code value.  The top 16 bits of si_code descibe which fields
      in the siginfo_t union are active, but they are treated inconsistently
      between copy_siginfo_from_user32, copy_siginfo_to_user32 and
      copy_siginfo_to_user.
      
      copy_siginfo_from_user32 is called from rt_sigqueueinfo and
      rt_tgsigqueueinfo in which the user has full control overthe top 16 bits
      of si_code.
      
      This fixes the following information leaks:
      x86:   8 bytes leaked when sending a signal from a 32-bit process to
             itself. This leak grows to 16 bytes if the process uses x32.
             (si_code = __SI_CHLD)
      x86:   100 bytes leaked when sending a signal from a 32-bit process to
             a 64-bit process. (si_code = -1)
      sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
             64-bit process. (si_code = any)
      
      parsic and s390 have similar bugs, but they are not vulnerable because
      rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
      to a different process.  These bugs are also fixed for consistency.
      Signed-off-by: NAmanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3c00cb5e
  2. 05 8月, 2015 3 次提交
  3. 04 8月, 2015 8 次提交
  4. 03 8月, 2015 15 次提交
    • V
      ARCv2: Fix the peripheral address space detection · e13c42ec
      Vineet Gupta 提交于
      With HS 2.1 release, the peripheral space register no longer contains
      the uncached space specifics, causing the kernel to panic early on.
      So read the newer NON VOLATILE AUX register to get that info.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      e13c42ec
    • J
      MIPS: Replace add and sub instructions in relocate_kernel.S with addiu · a4504755
      James Cowgill 提交于
      Fixes the assembler errors generated when compiling a MIPS R6 kernel with
      CONFIG_KEXEC on, by replacing the offending add and sub instructions with
      addiu instructions.
      
      Build errors:
      arch/mips/kernel/relocate_kernel.S: Assembler messages:
      arch/mips/kernel/relocate_kernel.S:27: Error: invalid operands `dadd $16,$16,8'
      arch/mips/kernel/relocate_kernel.S:64: Error: invalid operands `dadd $20,$20,8'
      arch/mips/kernel/relocate_kernel.S:65: Error: invalid operands `dadd $18,$18,8'
      arch/mips/kernel/relocate_kernel.S:66: Error: invalid operands `dsub $22,$22,1'
      scripts/Makefile.build:294: recipe for target 'arch/mips/kernel/relocate_kernel.o' failed
      Signed-off-by: NJames Cowgill <James.Cowgill@imgtec.com>
      Cc: <stable@vger.kernel.org> # 4.0+
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/10558/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      a4504755
    • J
      MIPS: Flush RPS on kernel entry with EVA · 3aff47c0
      James Hogan 提交于
      When EVA is enabled, flush the Return Prediction Stack (RPS) present on
      some MIPS cores on entry to the kernel from user mode.
      
      This is important specifically for interAptiv with EVA enabled,
      otherwise kernel mode RPS mispredicts may trigger speculative fetches of
      user return addresses, which may be sensitive in the kernel address
      space due to EVA's overlapping user/kernel address spaces.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.15.x-
      Patchwork: https://patchwork.linux-mips.org/patch/10812/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      3aff47c0
    • F
      Revert "MIPS: BCM63xx: Provide a plat_post_dma_flush hook" · 247bfb65
      Florian Fainelli 提交于
      This reverts commit 3cf29543 ("MIPS:
      BCM63xx: Provide a plat_post_dma_flush hook") since this commit was
      found to prevent BCM6358 (early BMIPS4350 cores) and some BCM6368
      (BMIPS4380 cores) from booting reliably.
      
      Alvaro was able to track this down to an issue specifically located to
      devices that use the second thread (TP1) when booting. Since BCM63xx did
      not have a need for plat_post_dma_flush() hook before, let's just keep
      things the way they were.
      Reported-by: NÁlvaro Fernández Rojas <noltari@gmail.com>
      Reported-by: NJonas Gorski <jogo@openwrt.org>
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: Kevin Cernekee <cernekee@gmail.com>
      Cc: Nicolas Schichan <nschichan@freebox.fr>
      Cc: linux-mips@linux-mips.org
      Cc: blogic@openwrt.org
      Cc: noltari@gmail.com
      Cc: jogo@openwrt.org
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: stable@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/10804/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      247bfb65
    • K
      MIPS: BMIPS: Delete unused Kconfig symbol · 3592bb08
      Kevin Cernekee 提交于
      This was left over from an earlier iteration of the BMIPS irqchip changes.
      It doesn't actually have an effect, so let's nuke it.
      Reported-by: NValentin Rothberg <valentinrothberg@gmail.com>
      Signed-off-by: NKevin Cernekee <cernekee@chromium.org>
      Acked-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Cc: stable@vger.kernel.org # v4.1+
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/9910/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      3592bb08
    • F
      MIPS: Export get_c0_perfcount_int() · 0cb0985f
      Felix Fietkau 提交于
      get_c0_perfcount_int is tested from oprofile code. If oprofile is
      compiled as module, get_c0_perfcount_int needs to be exported, otherwise
      it cannot be resolved.
      
      Fixes: a669efc4 ("MIPS: Add hook to get C0 performance counter interrupt")
      Cc: stable@vger.kernel.org # v3.19+
      Signed-off-by: NFelix Fietkau <nbd@openwrt.org>
      Cc: linux-mips@linux-mips.org
      Cc: abrestic@chromium.org
      Patchwork: https://patchwork.linux-mips.org/patch/10763/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      0cb0985f
    • J
      MIPS: show_stack: Fix stack trace with EVA · 1e77863a
      James Hogan 提交于
      The show_stack() function deals exclusively with kernel contexts, but if
      it gets called in user context with EVA enabled, show_stacktrace() will
      attempt to access the stack using EVA accesses, which will either read
      other user mapped data, or more likely cause an exception which will be
      handled by __get_user().
      
      This is easily reproduced using SysRq t to show all task states, which
      results in the following stack dump output:
      
       Stack : (Bad stack address)
      
      Fix by setting the current user access mode to kernel around the call to
      show_stacktrace(). This causes __get_user() to use normal loads to read
      the kernel stack.
      
      Now we get the correct output, like this:
      
       Stack : 00000000 80168960 00000000 004a0000 00000000 00000000 8060016c 1f3abd0c
                 1f172cd8 8056f09c 7ff1e450 8014fc3c 00000001 806dd0b0 0000001d 00000002
                 1f17c6a0 1f17c804 1f17c6a0 8066f6e0 00000000 0000000a 00000000 00000000
                 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
                 00000000 00000000 00000000 00000000 00000000 0110e800 1f3abd6c 1f17c6a0
                 ...
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.15+
      Patchwork: https://patchwork.linux-mips.org/patch/10778/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      1e77863a
    • J
      MIPS: do_mcheck: Fix kernel code dump with EVA · 55c723e1
      James Hogan 提交于
      If a machine check exception is raised in kernel mode, user context,
      with EVA enabled, then the do_mcheck handler will attempt to read the
      code around the EPC using EVA load instructions, i.e. as if the reads
      were from user mode. This will either read random user data if the
      process has anything mapped at the same address, or it will cause an
      exception which is handled by __get_user, resulting in this output:
      
       Code: (Bad address in epc)
      
      Fix by setting the current user access mode to kernel if the saved
      register context indicates the exception was taken in kernel mode. This
      causes __get_user to use normal loads to read the kernel code.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.15+
      Patchwork: https://patchwork.linux-mips.org/patch/10777/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      55c723e1
    • A
      MIPS: SMP: Don't increment irq_count multiple times for call function IPIs · 4ace6139
      Alex Smith 提交于
      The majority of SMP platforms handle their IPIs through do_IRQ()
      which calls irq_{enter/exit}(). When a call function IPI is received,
      smp_call_function_interrupt() is called which also calls
      irq_{enter,exit}(), meaning irq_count is raised twice.
      
      When tick broadcasting is used (which is implemented via a call
      function IPI), this incorrectly causes all CPU idle time on the core
      receiving broadcast ticks to be accounted as time spent servicing
      IRQs, as account_process_tick() will account as such if irq_count is
      greater than 1. This results in 100% CPU usage being reported on a
      core which receives its ticks via broadcast.
      
      This patch removes the SMP smp_call_function_interrupt() wrapper which
      calls irq_{enter,exit}(). Platforms which handle their IPIs through
      do_IRQ() now call generic_smp_call_function_interrupt() directly to
      avoid incrementing irq_count a second time. Platforms which don't
      (loongson, sgi-ip27, sibyte) call generic_smp_call_function_interrupt()
      wrapped in irq_{enter,exit}().
      Signed-off-by: NAlex Smith <alex.smith@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/10770/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      4ace6139
    • R
      MIPS: Partially disable RIXI support. · 55fdcb2d
      Ralf Baechle 提交于
      Execution of break instruction, trap instructions, emulation of unaligned
      loads or floating point instructions - anything that tries to read the
      instruction's opcode from userspace - needs read access to a page.
      
      RIXI (Read Inhibit / Execute Inhibit) support however allows the creation of
      pags that are executable but not readable.  On such a mapping the attempted
      load of the opcode by the kernel is going to cause an endless loop of
      page faults.
      
      The quick workaround for this is to disable the combinations that the kernel
      currently isn't able to handle which are executable mappings.
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      55fdcb2d
    • R
      MIPS: Handle page faults of executable but unreadable pages correctly. · e070dab7
      Ralf Baechle 提交于
      Without this we end taking execeptions in an endless loop hanging the
      thread.
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      e070dab7
    • J
      MIPS: Malta: Don't reinitialise RTC · 106eccb4
      James Hogan 提交于
      On Malta, since commit a87ea88d ("MIPS: Malta: initialise the RTC at
      boot"), the RTC is reinitialised and forced into binary coded decimal
      (BCD) mode during init, even if the bootloader has already initialised
      it, and may even have already put it into binary mode (as YAMON does).
      This corrupts the current time, can result in the RTC seconds being an
      invalid BCD (e.g. 0x1a..0x1f) for up to 6 seconds, as well as confusing
      YAMON for a while after reset, enough for it to report timeouts when
      attempting to load from TFTP (it actually uses the RTC in that code).
      
      Therefore only initialise the RTC to the extent that is necessary so
      that Linux avoids interfering with the bootloader setup, while also
      allowing it to estimate the CPU frequency without hanging, without a
      bootloader necessarily having done anything with the RTC (for example
      when the kernel is loaded via EJTAG).
      
      The divider control is configured for a 32KHZ reference clock if
      necessary, and the SET bit of the RTC_CONTROL register is cleared if
      necessary without changing any other bits (this bit will be set when
      coming out of reset if the battery has been disconnected).
      
      Fixes: a87ea88d ("MIPS: Malta: initialise the RTC at boot")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Reviewed-by: NPaul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.14+
      Patchwork: https://patchwork.linux-mips.org/patch/10739/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      106eccb4
    • J
      MIPS: unaligned: Fix build error on big endian R6 kernels · 531a6d59
      James Cowgill 提交于
      Commit eeb53895 ("MIPS: unaligned: Prevent EVA instructions on kernel
      unaligned accesses") renamed the Load* and Store* defines in unaligned.c
      to _Load* and _Store* as part of its fix. One define was missed out which
      causes big endian R6 kernels to fail to build.
      
      arch/mips/kernel/unaligned.c:880:35:
      error: implicit declaration of function '_StoreDW'
       #define StoreDW(addr, value, res) _StoreDW(addr, value, res)
                                         ^
      Signed-off-by: NJames Cowgill <James.Cowgill@imgtec.com>
      Fixes: eeb53895 ("MIPS: unaligned: Prevent EVA instructions on kernel unaligned accesses")
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: <stable@vger.kernel.org> # 4.0+
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/10575/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      531a6d59
    • F
      MIPS: Fix sched_getaffinity with MT FPAFF enabled · 1d62d737
      Felix Fietkau 提交于
      p->thread.user_cpus_allowed is zero-initialized and is only filled on
      the first sched_setaffinity call.
      
      To avoid adding overhead in the task initialization codepath, simply OR
      the returned mask in sched_getaffinity with p->cpus_allowed.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NFelix Fietkau <nbd@openwrt.org>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/10740/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      1d62d737
    • J
      MIPS: Fix build with CONFIG_OF=y for non OF-enabled targets · d3557d96
      Jonas Gorski 提交于
      Commit 01306aea ("MIPS: prepare for user enabling of CONFIG_OF")
      changed the guards in asm/prom.h from CONFIG_OF to CONFIG_USE_OF, but
      missed the actual function declarations in kernel/prom.c, which have
      additional dependencies.
      
      Fixes the following build error:
      
        CC      arch/mips/kernel/prom.o
      arch/mips/kernel/prom.c: In function '__dt_setup_arch':
      arch/mips/kernel/prom.c:54:2: error: implicit declaration of function 'early_init_dt_scan' [-Werror=implicit-function-declaration]
        if (!early_init_dt_scan(bph))
        ^
      
      Fixes: 01306aea ("MIPS: prepare for user enabling of CONFIG_OF")
      Signed-off-by: NJonas Gorski <jogo@openwrt.org>
      Acked-by: NRob Herring <robh@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: devicetree@vger.kernel.org
      Cc: Grant Likely <grant.likely@linaro.org>
      Patchwork: https://patchwork.linux-mips.org/patch/10741/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      d3557d96
  5. 01 8月, 2015 1 次提交
  6. 31 7月, 2015 6 次提交
    • R
      dmaengine: xgene-dma: Fix the resource map to handle overlapping · cda8e937
      Rameshwar Prasad Sahu 提交于
      There is an overlap in dma ring cmd csr region due to sharing of ethernet
      ring cmd csr region. This patch fix the resource overlapping by mapping
      the entire dma ring cmd csr region.
      Signed-off-by: NRameshwar Prasad Sahu <rsahu@apm.com>
      Signed-off-by: NVinod Koul <vinod.koul@intel.com>
      cda8e937
    • A
      x86/ldt: Make modify_ldt synchronous · 37868fe1
      Andy Lutomirski 提交于
      modify_ldt() has questionable locking and does not synchronize
      threads.  Improve it: redesign the locking and synchronize all
      threads' LDTs using an IPI on all modifications.
      
      This will dramatically slow down modify_ldt in multithreaded
      programs, but there shouldn't be any multithreaded programs that
      care about modify_ldt's performance in the first place.
      
      This fixes some fallout from the CVE-2015-5157 fixes.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org <security@kernel.org>
      Cc: <stable@vger.kernel.org>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      37868fe1
    • A
      x86/xen: Probe target addresses in set_aliased_prot() before the hypercall · aa1acff3
      Andy Lutomirski 提交于
      The update_va_mapping hypercall can fail if the VA isn't present
      in the guest's page tables.  Under certain loads, this can
      result in an OOPS when the target address is in unpopulated vmap
      space.
      
      While we're at it, add comments to help explain what's going on.
      
      This isn't a great long-term fix.  This code should probably be
      changed to use something like set_memory_ro.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Vrabel <dvrabel@cantab.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org <security@kernel.org>
      Cc: <stable@vger.kernel.org>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      aa1acff3
    • J
      x86/irq: Use the caller provided polarity setting in mp_check_pin_attr() · 646c4b75
      Jiang Liu 提交于
      Commit d32932d0 ("x86/irq: Convert IOAPIC to use hierarchical
      irqdomain interfaces") introduced a regression which causes
      malfunction of interrupt lines.
      
      The reason is that the conversion of mp_check_pin_attr() missed to
      update the polarity selection of the interrupt pin with the caller
      provided setting and instead uses a stale attribute value. That in
      turn results in chosing the wrong interrupt flow handler.
      
      Use the caller supplied setting to configure the pin correctly which
      also choses the correct interrupt flow handler.
      
      This restores the original behaviour and on the affected
      machine/driver (Surface Pro 3, i2c controller) all IOAPIC IRQ
      configuration are identical to v4.1.
      
      Fixes: d32932d0 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces")
      Reported-and-tested-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Reported-and-tested-by: NChen Yu <yu.c.chen@intel.com>
      Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Chen Yu <yu.c.chen@intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/1438242695-23531-1-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      646c4b75
    • R
      efi: Check for NULL efi kernel parameters · 9115c758
      Ricardo Neri 提交于
      Even though it is documented how to specifiy efi parameters, it is
      possible to cause a kernel panic due to a dereference of a NULL pointer when
      parsing such parameters if "efi" alone is given:
      
      PANIC: early exception 0e rip 10:ffffffff812fb361 error 0 cr2 0
      [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.2.0-rc1+ #450
      [ 0.000000]  ffffffff81fe20a9 ffffffff81e03d50 ffffffff8184bb0f 00000000000003f8
      [ 0.000000]  0000000000000000 ffffffff81e03e08 ffffffff81f371a1 64656c62616e6520
      [ 0.000000]  0000000000000069 000000000000005f 0000000000000000 0000000000000000
      [ 0.000000] Call Trace:
      [ 0.000000]  [<ffffffff8184bb0f>] dump_stack+0x45/0x57
      [ 0.000000]  [<ffffffff81f371a1>] early_idt_handler_common+0x81/0xae
      [ 0.000000]  [<ffffffff812fb361>] ? parse_option_str+0x11/0x90
      [ 0.000000]  [<ffffffff81f4dd69>] arch_parse_efi_cmdline+0x15/0x42
      [ 0.000000]  [<ffffffff81f376e1>] do_early_param+0x50/0x8a
      [ 0.000000]  [<ffffffff8106b1b3>] parse_args+0x1e3/0x400
      [ 0.000000]  [<ffffffff81f37a43>] parse_early_options+0x24/0x28
      [ 0.000000]  [<ffffffff81f37691>] ? loglevel+0x31/0x31
      [ 0.000000]  [<ffffffff81f37a78>] parse_early_param+0x31/0x3d
      [ 0.000000]  [<ffffffff81f3ae98>] setup_arch+0x2de/0xc08
      [ 0.000000]  [<ffffffff8109629a>] ? vprintk_default+0x1a/0x20
      [ 0.000000]  [<ffffffff81f37b20>] start_kernel+0x90/0x423
      [ 0.000000]  [<ffffffff81f37495>] x86_64_start_reservations+0x2a/0x2c
      [ 0.000000]  [<ffffffff81f37582>] x86_64_start_kernel+0xeb/0xef
      [ 0.000000] RIP 0xffffffff81ba2efc
      
      This panic is not reproducible with "efi=" as this will result in a non-NULL
      zero-length string.
      
      Thus, verify that the pointer to the parameter string is not NULL. This is
      consistent with other parameter-parsing functions which check for NULL pointers.
      Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      9115c758
    • D
      x86/efi: Use all 64 bit of efi_memmap in setup_e820() · 7cc03e48
      Dmitry Skorodumov 提交于
      The efi_info structure stores low 32 bits of memory map
      in efi_memmap and high 32 bits in efi_memmap_hi.
      
      While constructing pointer in the setup_e820(), need
      to take into account all 64 bit of the pointer.
      
      It is because on 64bit machine the function
      efi_get_memory_map() may return full 64bit pointer and before
      the patch that pointer was truncated.
      
      The issue is triggered on Parallles virtual machine and
      fixed with this patch.
      Signed-off-by: NDmitry Skorodumov <sdmitry@parallels.com>
      Cc: Denis V. Lunev <den@openvz.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      7cc03e48
  7. 30 7月, 2015 3 次提交
    • C
      KVM: s390: Fix hang VCPU hang/loop regression · 586b7ccd
      Christian Borntraeger 提交于
      commit 785dbef4 ("KVM: s390: optimize round trip time in request
      handling") introduced a regression. This regression was seen with
      CPU hotplug in the guest and switching between 1 or 2 CPUs. This will
      set/reset the IBS control via synced request.
      
      Whenever we make a synced request, we first set the vcpu->requests
      bit and then block the vcpu. The handler, on the other hand, unblocks
      itself, processes vcpu->requests (by clearing them) and unblocks itself
      once again.
      
      Now, if the requester sleeps between setting of vcpu->requests and
      blocking, the handler will clear the vcpu->requests bit and try to
      unblock itself (although no bit is set). When the requester wakes up,
      it blocks the VCPU and we have a blocked VCPU without requests.
      
      Solution is to always unset the block bit.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
      Fixes: 785dbef4 ("KVM: s390: optimize round trip time in request handling")
      586b7ccd
    • A
      powerpc/eeh-powernv: Fix unbalanced IRQ warning · b8d65e96
      Alistair Popple 提交于
      pnv_eeh_next_error() re-enables the eeh opal event interrupt but it
      gets called from a loop if there are more outstanding events to
      process, resulting in a warning due to enabling an already enabled
      interrupt. Instead the interrupt should only be re-enabled once the
      last outstanding event has been processed.
      Tested-by: NDaniel Axtens <dja@axtens.net>
      Reported-by: NDaniel Axtens <dja@axtens.net>
      Signed-off-by: NAlistair Popple <alistair@popple.id.au>
      Acked-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b8d65e96
    • D
      ebpf, x86: fix general protection fault when tail call is invoked · 2482abb9
      Daniel Borkmann 提交于
      With eBPF JIT compiler enabled on x86_64, I was able to reliably trigger
      the following general protection fault out of an eBPF program with a simple
      tail call, f.e. tracex5 (or a stripped down version of it):
      
        [  927.097918] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
        [...]
        [  927.100870] task: ffff8801f228b780 ti: ffff880016a64000 task.ti: ffff880016a64000
        [  927.102096] RIP: 0010:[<ffffffffa002440d>]  [<ffffffffa002440d>] 0xffffffffa002440d
        [  927.103390] RSP: 0018:ffff880016a67a68  EFLAGS: 00010006
        [  927.104683] RAX: 5a5a5a5a5a5a5a5a RBX: 0000000000000000 RCX: 0000000000000001
        [  927.105921] RDX: 0000000000000000 RSI: ffff88014e438000 RDI: ffff880016a67e00
        [  927.107137] RBP: ffff880016a67c90 R08: 0000000000000000 R09: 0000000000000001
        [  927.108351] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880016a67e00
        [  927.109567] R13: 0000000000000000 R14: ffff88026500e460 R15: ffff880220a81520
        [  927.110787] FS:  00007fe7d5c1f740(0000) GS:ffff880265000000(0000) knlGS:0000000000000000
        [  927.112021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [  927.113255] CR2: 0000003e7bbb91a0 CR3: 000000006e04b000 CR4: 00000000001407e0
        [  927.114500] Stack:
        [  927.115737]  ffffc90008cdb000 ffff880016a67e00 ffff88026500e460 ffff880220a81520
        [  927.117005]  0000000100000000 000000000000001b ffff880016a67aa8 ffffffff8106c548
        [  927.118276]  00007ffcdaf22e58 0000000000000000 0000000000000000 ffff880016a67ff0
        [  927.119543] Call Trace:
        [  927.120797]  [<ffffffff8106c548>] ? lookup_address+0x28/0x30
        [  927.122058]  [<ffffffff8113d176>] ? __module_text_address+0x16/0x70
        [  927.123314]  [<ffffffff8117bf0e>] ? is_ftrace_trampoline+0x3e/0x70
        [  927.124562]  [<ffffffff810c1a0f>] ? __kernel_text_address+0x5f/0x80
        [  927.125806]  [<ffffffff8102086f>] ? print_context_stack+0x7f/0xf0
        [  927.127033]  [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050
        [  927.128254]  [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050
        [  927.129461]  [<ffffffff8119edfa>] ? trace_call_bpf+0x3a/0x140
        [  927.130654]  [<ffffffff8119ee4a>] trace_call_bpf+0x8a/0x140
        [  927.131837]  [<ffffffff8119edfa>] ? trace_call_bpf+0x3a/0x140
        [  927.133015]  [<ffffffff8119f008>] kprobe_perf_func+0x28/0x220
        [  927.134195]  [<ffffffff811a1668>] kprobe_dispatcher+0x38/0x60
        [  927.135367]  [<ffffffff81174b91>] ? seccomp_phase1+0x1/0x230
        [  927.136523]  [<ffffffff81061400>] kprobe_ftrace_handler+0xf0/0x150
        [  927.137666]  [<ffffffff81174b95>] ? seccomp_phase1+0x5/0x230
        [  927.138802]  [<ffffffff8117950c>] ftrace_ops_recurs_func+0x5c/0xb0
        [  927.139934]  [<ffffffffa022b0d5>] 0xffffffffa022b0d5
        [  927.141066]  [<ffffffff81174b91>] ? seccomp_phase1+0x1/0x230
        [  927.142199]  [<ffffffff81174b95>] seccomp_phase1+0x5/0x230
        [  927.143323]  [<ffffffff8102c0a4>] syscall_trace_enter_phase1+0xc4/0x150
        [  927.144450]  [<ffffffff81174b95>] ? seccomp_phase1+0x5/0x230
        [  927.145572]  [<ffffffff8102c0a4>] ? syscall_trace_enter_phase1+0xc4/0x150
        [  927.146666]  [<ffffffff817f9a9f>] tracesys+0xd/0x44
        [  927.147723] Code: 48 8b 46 10 48 39 d0 76 2c 8b 85 fc fd ff ff 83 f8 20 77 21 83
                             c0 01 89 85 fc fd ff ff 48 8d 44 d6 80 48 8b 00 48 83 f8 00 74
                             0a <48> 8b 40 20 48 83 c0 33 ff e0 48 89 d8 48 8b 9d d8 fd ff
                             ff 4c
        [  927.150046] RIP  [<ffffffffa002440d>] 0xffffffffa002440d
      
      The code section with the instructions that traps points into the eBPF JIT
      image of the root program (the one invoking the tail call instruction).
      
      Using bpf_jit_disasm -o on the eBPF root program image:
      
        [...]
        4e:   mov    -0x204(%rbp),%eax
              8b 85 fc fd ff ff
        54:   cmp    $0x20,%eax               <--- if (tail_call_cnt > MAX_TAIL_CALL_CNT)
              83 f8 20
        57:   ja     0x000000000000007a
              77 21
        59:   add    $0x1,%eax                <--- tail_call_cnt++
              83 c0 01
        5c:   mov    %eax,-0x204(%rbp)
              89 85 fc fd ff ff
        62:   lea    -0x80(%rsi,%rdx,8),%rax  <--- prog = array->prog[index]
              48 8d 44 d6 80
        67:   mov    (%rax),%rax
              48 8b 00
        6a:   cmp    $0x0,%rax                <--- check for NULL
              48 83 f8 00
        6e:   je     0x000000000000007a
              74 0a
        70:   mov    0x20(%rax),%rax          <--- GPF triggered here! fetch of bpf_func
              48 8b 40 20                              [ matches <48> 8b 40 20 ... from above ]
        74:   add    $0x33,%rax               <--- prologue skip of new prog
              48 83 c0 33
        78:   jmpq   *%rax                    <--- jump to new prog insns
              ff e0
        [...]
      
      The problem is that rax has 5a5a5a5a5a5a5a5a, which suggests a tail call
      jump to map slot 0 is pointing to a poisoned page. The issue is the following:
      
      lea instruction has a wrong offset, i.e. it should be ...
      
        lea    0x80(%rsi,%rdx,8),%rax
      
      ... but it actually seems to be ...
      
        lea   -0x80(%rsi,%rdx,8),%rax
      
      ... where 0x80 is offsetof(struct bpf_array, prog), thus the offset needs
      to be positive instead of negative. Disassembling the interpreter, we btw
      similarly do:
      
        [...]
        c88:  lea     0x80(%rax,%rdx,8),%rax  <--- prog = array->prog[index]
              48 8d 84 d0 80 00 00 00
        c90:  add     $0x1,%r13d
              41 83 c5 01
        c94:  mov     (%rax),%rax
              48 8b 00
        [...]
      
      Now the other interesting fact is that this panic triggers only when things
      like CONFIG_LOCKDEP are being used. In that case offsetof(struct bpf_array,
      prog) starts at offset 0x80 and in non-CONFIG_LOCKDEP case at offset 0x50.
      Reason is that the work_struct inside struct bpf_map grows by 48 bytes in my
      case due to the lockdep_map member (which also has CONFIG_LOCK_STAT enabled
      members).
      
      Changing the emitter to always use the 4 byte displacement in the lea
      instruction fixes the panic on my side. It increases the tail call instruction
      emission by 3 more byte, but it should cover us from various combinations
      (and perhaps other future increases on related structures).
      
      After patch, disassembly:
      
        [...]
        9e:   lea    0x80(%rsi,%rdx,8),%rax   <--- CONFIG_LOCKDEP/CONFIG_LOCK_STAT
              48 8d 84 d6 80 00 00 00
        a6:   mov    (%rax),%rax
              48 8b 00
        [...]
      
        [...]
        9e:   lea    0x50(%rsi,%rdx,8),%rax   <--- No CONFIG_LOCKDEP
              48 8d 84 d6 50 00 00 00
        a6:   mov    (%rax),%rax
              48 8b 00
        [...]
      
      Fixes: b52f00e6 ("x86: bpf_jit: implement bpf_tail_call() helper")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2482abb9