1. 01 Jan, 2016 1 commit
  2. 25 Dec, 2015 5 commits
    • sparc64: fix FP corruption in user copy functions · a7c5724b
      Rob Gardner committed
      Short story: Exception handlers used by some copy_to_user() and
      copy_from_user() functions do not diligently clean up floating point
      register usage, and this can result in a user process seeing invalid
      values in floating point registers. This sometimes makes the process
      fail.
      
      Long story: Several cpu-specific (NG4, NG2, U1, U3) memcpy functions
      use floating point registers and VIS alignaddr/faligndata to
      accelerate data copying when source and dest addresses don't align
      well. Linux uses a lazy scheme for saving floating point registers; it
      is not done upon entering the kernel since it's a very expensive
      operation. Rather, it is done only when needed. If the kernel ends up
      not using FP regs during the course of some trap or system call, then
      it can return to user space without saving or restoring them.
      
      The various memcpy functions begin their FP code with VISEntry (or a
      variation thereof), which saves the FP regs. They conclude their FP
      code with VISExit (or a variation) which essentially marks the FP regs
      "clean", i.e., they contain no unsaved values. fprs.FPRS_FEF is turned
      off so that a lazy restore will be triggered when/if the user process
      accesses floating point regs again.
      
      The bug is that the user copy variants of memcpy, copy_from_user() and
      copy_to_user(), employ an exception handling mechanism to detect faults
      when accessing user space addresses, and when this handler is invoked,
      an immediate return from the function is forced, and VISExit is not
      executed, thus leaving the fprs register in an indeterminate state,
      but often with fprs.FPRS_FEF set and one or more dirty bits. This
      results in a return to user space with invalid values in the FP regs,
      and since fprs.FPRS_FEF is on, no lazy restore occurs.
      
      This bug affects copy_to_user() and copy_from_user() for NG4, NG2,
      U3, and U1. All are fixed by using a new exception handler for those
      loads and stores that are done during the time between VISEntry and
      VISExit.
      
      n.b. In NG4memcpy, the problematic code can be triggered by a copy
      size greater than 128 bytes and an unaligned source address.  This bug
      is known to be the cause of random user process memory corruptions
      while perf is running with the callgraph option (i.e., perf record -g).
      This occurs because perf uses copy_from_user() to read user stacks,
      and may fault when it follows a stack frame pointer off to an
      invalid page. Validation checks on the stack address just obscure
      the underlying problem.
      Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
      Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
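
The failure mode in this commit can be sketched as a small user-space model. This is a hypothetical simulation, not the kernel's assembly: `struct cpu_model`, `vis_entry()`, `vis_exit()` and `copy_model()` are invented names, and only the FPRS bit values follow the SPARC V9 layout. The `fixed_handler` flag stands in for the new exception handler that cleans up FP state before the forced return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* FPRS bits per SPARC V9: DL = 0x1, DU = 0x2, FEF = 0x4. */
#define FPRS_DL  0x1u
#define FPRS_FEF 0x4u

/* Hypothetical stand-in for the FP state of one CPU. */
struct cpu_model { unsigned int fprs; };

static void vis_entry(struct cpu_model *cpu) { cpu->fprs |= FPRS_FEF; }
static void vis_exit(struct cpu_model *cpu)  { cpu->fprs = 0; }

/*
 * Copy len bytes and pretend the access at index fault_at faults.
 * Returns the number of bytes NOT copied, as the user copy routines do.
 */
size_t copy_model(struct cpu_model *cpu, char *dst, const char *src,
                  size_t len, size_t fault_at, bool fixed_handler)
{
    assert(cpu && dst && src);
    vis_entry(cpu);
    for (size_t i = 0; i < len; i++) {
        if (i == fault_at) {
            /* Buggy path: return at once and leave FEF/dirty bits set. */
            if (fixed_handler)
                vis_exit(cpu);
            return len - i;
        }
        cpu->fprs |= FPRS_DL;   /* FP registers were used for the copy */
        dst[i] = src[i];
    }
    vis_exit(cpu);
    return 0;
}
```

With `fixed_handler` false, a fault leaves `FPRS_FEF` set in the model, which corresponds to the case where no lazy restore would be triggered on return to user space.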
    • sparc64: Perf should save/restore fault info · 83352694
      Rob Gardner committed
      There have been several reports of random processes being killed with
      a bus error or segfault during userspace stack walking in perf.  One
      of the root causes of this problem is an asynchronous modification to
      thread_info fault_address and fault_code, which stems from a perf
      counter interrupt arriving during kernel processing of a "benign"
      fault, such as a TSB miss. Since perf_callchain_user() invokes
      copy_from_user() to read user stacks, a fault is not only possible,
      but probable. Validity checks on the stack address merely cover up the
      problem and reduce its frequency.
      
      The solution here is to save and restore fault_address and fault_code
      in perf_callchain_user() so that the benign fault handler is not
      disturbed by a perf interrupt.
      Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
      Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
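
The save/restore approach described in this commit might look roughly like the following model. The field names `fault_address` and `fault_code` come from the commit message; the struct, the field types, and both function names are assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical subset of thread_info; field types are assumed. */
struct thread_info_model {
    unsigned long fault_address;
    unsigned int  fault_code;
};

/* Stand-in for a user stack read that faults and overwrites fault info. */
static void read_user_stack_model(struct thread_info_model *ti)
{
    ti->fault_address = 0xdeadbeefUL;
    ti->fault_code = 0xffu;
}

/* Save the interrupted fault's state, walk the stack, then restore it. */
void callchain_user_model(struct thread_info_model *ti)
{
    assert(ti);
    unsigned long saved_address = ti->fault_address;
    unsigned int saved_code = ti->fault_code;

    read_user_stack_model(ti);

    ti->fault_address = saved_address;
    ti->fault_code = saved_code;
}
```

The handler that was interrupted then sees the same values it recorded before the perf interrupt arrived.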
    • sparc64: Ensure perf can access user stacks · 3f74306a
      Rob Gardner committed
      When an interrupt (such as a perf counter interrupt) is delivered
      while executing in user space, the trap entry code puts ASI_AIUS in
      %asi so that copy_from_user() and copy_to_user() will access the
      correct memory. But if a perf counter interrupt is delivered while the
      cpu is already executing in kernel space, then the trap entry code
      will put ASI_P in %asi, and this will prevent copy_from_user() from
      reading any useful stack data in either of the perf_callchain_user_X
      functions, and thus no user callgraph data will be collected for this
      sample period. An additional problem is that a fault is guaranteed
      to occur, and though it will be silently covered up, it wastes time
      and could perturb state.
      
      In perf_callchain_user(), we ensure that %asi contains ASI_AIUS
      because we know for a fact that the subsequent calls to
      copy_from_user() are intended to read the user's stack.
      
      [ Use get_fs()/set_fs() -DaveM ]
      Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
      Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
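
A minimal model of the get_fs()/set_fs() pattern mentioned in the bracketed note, assuming a single global "current segment" in place of the real %asi handling. Every name ending in `_model` is hypothetical and only mirrors the shape of the idea: select user addressing, read, then restore whatever was in effect.

```c
#include <assert.h>
#include <stdbool.h>

enum seg_model { SEG_KERNEL_MODEL, SEG_USER_MODEL };

/* Models the state left by trap entry when the CPU was in kernel space. */
static enum seg_model current_seg = SEG_KERNEL_MODEL;

static enum seg_model get_fs_model(void) { return current_seg; }
static void set_fs_model(enum seg_model seg) { current_seg = seg; }

/* User stack reads only return useful data with user addressing selected. */
static bool copy_from_user_model(void)
{
    return current_seg == SEG_USER_MODEL;
}

/* Force user addressing around the read, then restore the old setting. */
bool callchain_read_model(void)
{
    enum seg_model old = get_fs_model();
    bool ok;

    set_fs_model(SEG_USER_MODEL);
    ok = copy_from_user_model();
    set_fs_model(old);
    assert(get_fs_model() == old);
    return ok;
}
```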
    • sparc64: Don't set %pil in rtrap_nmi too early · 1ca04a4c
      Rob Gardner committed
      Commit 28a1f533 delays setting %pil to avoid potential
      hardirq stack overflow in the common rtrap_irq path.
      Setting %pil also needs to be delayed in the rtrap_nmi
      path for the same reason.
      Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
      Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: Add ADI capability to cpu capabilities · 82924e54
      Khalid Aziz committed
      Add ADI (Application Data Integrity) capability to cpu capabilities list.
      ADI capability allows virtual addresses to be encoded with a tag in
      bits 63-60. This tag serves as an access control key for the regions
      of virtual address with ADI enabled and a key set on them. Hypervisor
      encodes this capability as "adp" in "hwcap-list" property in machine
      description.
      Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
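
To make the "tag in bits 63-60" encoding concrete, here is a small sketch of packing and unpacking a 4-bit tag in the top nibble of a 64-bit virtual address. The helper and macro names are hypothetical and are not taken from the kernel; only the bit positions come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define ADI_TAG_SHIFT_MODEL 60
#define ADI_TAG_MASK_MODEL  (UINT64_C(0xF) << ADI_TAG_SHIFT_MODEL)

/* Place a 4-bit tag into bits 63-60, replacing any tag already there. */
uint64_t adi_set_tag_model(uint64_t va, unsigned int tag)
{
    assert(tag <= 0xF);
    return (va & ~ADI_TAG_MASK_MODEL) |
           ((uint64_t)tag << ADI_TAG_SHIFT_MODEL);
}

/* Read the tag back out of bits 63-60. */
unsigned int adi_get_tag_model(uint64_t va)
{
    return (unsigned int)((va & ADI_TAG_MASK_MODEL) >> ADI_TAG_SHIFT_MODEL);
}

/* Recover the untagged address. */
uint64_t adi_strip_tag_model(uint64_t va)
{
    return va & ~ADI_TAG_MASK_MODEL;
}
```

For example, tagging address 0x1000 with tag 0xA gives 0xA000000000001000 in this model.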
  3. 24 Dec, 2015 1 commit
  4. 23 Dec, 2015 1 commit
  5. 22 Dec, 2015 4 commits
  6. 21 Dec, 2015 1 commit
    • parisc: Fix syscall restarts · 71a71fb5
      Helge Deller committed
      On parisc, syscalls which are interrupted by signals sometimes failed to
      restart and instead returned -ENOSYS, which in the worst case led to
      userspace crashes.
      A similar problem existed on MIPS and was fixed by commit e967ef02
      ("MIPS: Fix restart of indirect syscalls").
      
      On parisc the current syscall restart code assumes that all syscall
      callers load the syscall number in the delay slot of the ble
      instruction. That's how it is e.g. done in the unistd.h header file:
      	ble 0x100(%sr2, %r0)
      	ldi #syscall_nr, %r20
      Because of that assumption the current code never restored %r20 before
      returning to userspace.
      
      This assumption is at least not true for code which uses the glibc
      syscall() function, which instead uses this syntax:
      	ble 0x100(%sr2, %r0)
      	copy regX, %r20
      where regX depends on how the compiler optimizes the code and register
      usage.
      
      This patch fixes this problem by adding code to analyze how the syscall
      number is loaded in the delay slot and - if needed - copy the syscall
      number to regX prior to returning to userspace for the syscall restart.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
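
The delay-slot analysis described in this commit could be sketched as below. The instruction patterns are an assumption derived from the PA-RISC instruction formats (`ldi imm,%r20` as an LDO with base %r0 and target %r20, `copy %rX,%r20` as an OR with target %r20 and the source register in bits 16-20 of the word); they are not quoted from the patch, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encodings; see the note above. */
#define INSN_LDI_R20_MASK   0xffff0000u
#define INSN_LDI_R20        0x34140000u  /* ldi imm,%r20  */
#define INSN_COPY_R20_MASK  0xffe0ffffu
#define INSN_COPY_R20       0x08000254u  /* copy %rX,%r20 */

/*
 * Return the register that must receive the syscall number before a
 * restart, or -1 if the delay slot reloads %r20 by itself.
 */
int restart_fixup_reg_model(uint32_t delay_insn)
{
    if ((delay_insn & INSN_LDI_R20_MASK) == INSN_LDI_R20)
        return -1;
    if ((delay_insn & INSN_COPY_R20_MASK) == INSN_COPY_R20)
        return (int)((delay_insn >> 16) & 31);
    return -1;
}

/* Copy the syscall number from %r20 back into regX when needed. */
void restart_fixup_model(unsigned long gr[32], uint32_t delay_insn)
{
    assert(gr);
    int reg = restart_fixup_reg_model(delay_insn);
    if (reg >= 0)
        gr[reg] = gr[20];
}
```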
  7. 18 Dec, 2015 2 commits
    • s390/dis: Fix handling of format specifiers · 272fa59c
      Michael Holzheu committed
      The print_insn() function returns strings like "lghi %r1,0". To escape the
      '%' character in sprintf() a second '%' is used. For example "lghi %%r1,0"
      is converted into "lghi %r1,0".
      
      After print_insn() the output string is passed to printk(). Because format
      specifiers like "%r" or "%f" are ignored by printk() this works by chance
      most of the time. But for instructions with control registers like
      "lctl %c6,%c6,780" this fails because printk() interprets "%c" as
      character format specifier.
      
      Fix this problem and escape the '%' characters twice.
      
      For example "lctl %%%%c6,%%%%c6,780" is then converted by sprintf()
      into "lctl %%c6,%%c6,780" and by printk() into "lctl %c6,%c6,780".
      Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
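
The double escaping can be illustrated with a tiny model of one formatting pass. `collapse_percent_model()` is a hypothetical helper that only handles the `%%` escape, standing in for what sprintf() and then printk() each do to the string; it is not how either function is implemented.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model one printf-style pass over a string that contains only "%%"
 * escapes: each "%%" pair becomes a single '%'. Returns the output length.
 */
size_t collapse_percent_model(char *dst, size_t dstsz, const char *src)
{
    size_t n = 0;

    assert(dst && src && dstsz > 0);
    while (*src && n + 1 < dstsz) {
        if (src[0] == '%' && src[1] == '%')
            src++;              /* skip the first '%' of the pair */
        dst[n++] = *src++;
    }
    dst[n] = '\0';
    return n;
}
```

Feeding the source string `lctl %%%%c6,%%%%c6,780` through this model twice yields `lctl %%c6,%%c6,780` after the first pass and `lctl %c6,%c6,780` after the second, matching the two conversions in the commit message.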
    • powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion" · 036592fb
      Alistair Popple committed
      Commit 25642e14 ("powerpc/opal-irqchip: Fix double endian
      conversion") fixed an endian bug by calling opal_handle_events() in
      opal_event_unmask().
      
      However this introduced a deadlock if we find an event is active
      during unmasking and call opal_handle_events() again. The bad call
      sequence is:
      
        opal_interrupt()
        -> opal_handle_events()
           -> generic_handle_irq()
              -> handle_level_irq()
                 -> raw_spin_lock(&desc->lock)
                    handle_irq_event(desc)
                    unmask_irq(desc)
                    -> opal_event_unmask()
                       -> opal_handle_events()
                          -> generic_handle_irq()
                             -> handle_level_irq()
                                -> raw_spin_lock(&desc->lock)	(BOOM)
      
      When generating multiple opal events in quick succession this would lead
      to the following stall warnings:
      
      EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32
      INFO: rcu_sched detected stalls on CPUs/tasks:
      
               12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065
               15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065
               (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602)
      NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696]
      INFO: rcu_sched detected stalls on CPUs/tasks:
               12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371
               15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371
               (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290)
      
      This patch corrects the problem by queuing the work if an event is
      active during unmasking, which is similar to the pre-endian fix
      behaviour.
      
      Fixes: 25642e14 ("powerpc/opal-irqchip: Fix double endian conversion")
      Signed-off-by: Alistair Popple <alistair@popple.id.au>
      Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
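
The re-entry shown in the call sequence, and the effect of queuing the work instead, can be modelled in a few lines. This is a sketch under simplifying assumptions: a boolean stands in for the non-recursive `desc->lock`, a counter stands in for the work queue, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct irq_model {
    bool lock_held;   /* models desc->lock, which is not recursive */
    bool deadlocked;  /* set when the lock is taken while already held */
    int  handled;
    int  deferred;    /* events queued for later handling */
    int  pending;     /* events still active at unmask time */
};

static void handle_events_model(struct irq_model *m, bool defer);

static void unmask_model(struct irq_model *m, bool defer)
{
    if (m->pending == 0)
        return;
    m->pending--;
    if (defer)
        m->deferred++;                  /* fixed: queue the work */
    else
        handle_events_model(m, defer);  /* buggy: re-enter under the lock */
}

static void handle_events_model(struct irq_model *m, bool defer)
{
    if (m->lock_held) {
        m->deadlocked = true;           /* the (BOOM) in the call trace */
        return;
    }
    m->lock_held = true;
    m->handled++;
    unmask_model(m, defer);
    m->lock_held = false;
}

struct irq_model run_model(int pending, bool defer)
{
    struct irq_model m = { .pending = pending };

    assert(pending >= 0);
    handle_events_model(&m, defer);
    return m;
}
```

In this model, `run_model(1, false)` reports a deadlock, while `run_model(1, true)` defers one event and completes.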
  8. 17 Dec, 2015 6 commits
  9. 16 Dec, 2015 3 commits
    • Partial revert of "powerpc: Individual System V IPC system calls" · 2475c362
      Michael Ellerman committed
      This partially reverts commit a3423615.
      
      While reviewing the glibc patch to exploit the individual IPC calls,
      Arnd & Andreas noticed that we were still requiring userspace to pass
      IPC_64 in order to get the new style IPC API.
      
      With a bit of cleanup in the kernel we can drop that requirement, and
      instead only provide the new style API, which will simplify things for
      userspace.
      
      Rather than try and sneak that patch into 4.4, instead we will drop the
      individual IPC calls for powerpc, and merge them again in 4.5 once the
      cleanup patch has gone in.
      
      Because we've already added sys_mlock2() as syscall #378, we don't do a
      full revert of the IPC calls. Instead we drop the __NR #defines, and
      send those now undefined syscall numbers to sys_ni_syscall(). This
      leaves a gap in the syscall numbers, but we'll reuse them when we merge
      the individual IPC calls.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
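
Routing retired syscall numbers to a "not implemented" stub while keeping later numbers in place can be pictured with a small dispatch table. The slot numbers and every `_model` name below are hypothetical; the real powerpc table and its numbering are not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef long (*syscall_fn_model)(void);

/* Stub for slots whose syscalls were dropped; mirrors sys_ni_syscall(). */
static long sys_ni_syscall_model(void) { return -ENOSYS; }
static long sys_live_model(void)       { return 0; }

/* Hypothetical layout: slots 1 and 2 are retired, slot 3 stays live. */
static const syscall_fn_model table_model[] = {
    sys_live_model,
    sys_ni_syscall_model,
    sys_ni_syscall_model,
    sys_live_model,
};

long dispatch_model(size_t nr)
{
    size_t count = sizeof(table_model) / sizeof(table_model[0]);

    if (nr >= count)
        return -ENOSYS;
    assert(table_model[nr] != NULL);
    return table_model[nr]();
}
```

The gap keeps the later slot at its published number, which is why a full revert was not an option once a higher-numbered syscall had been added.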
    • Revert "scatterlist: use sg_phys()" · 3e6110fd
      Dan Williams committed
      commit db0fa0cb "scatterlist: use sg_phys()" did replacements of
      the form:
      
          phys_addr_t phys = page_to_phys(sg_page(s));
          phys_addr_t phys = sg_phys(s) & PAGE_MASK;
      
      However, this breaks platforms where sizeof(phys_addr_t) >
      sizeof(unsigned long).  Revert for 4.3 and 4.4 to make room for a
      combined helper in 4.5.
      
      Cc: <stable@vger.kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Fixes: db0fa0cb ("scatterlist: use sg_phys()")
      Suggested-by: Joerg Roedel <joro@8bytes.org>
      Reported-by: Vitaly Lavrov <vel21ripn@gmail.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
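
One plausible way the breakage arises is that a page mask with the width of `unsigned long` zero-extends when it is ANDed with a wider physical address. The sketch below models a 32-bit `unsigned long` with `uint32_t` and a 64-bit `phys_addr_t` with `uint64_t`; the type and function names are hypothetical and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_model_t;   /* wider than the mask below */

/* Page mask as a 32-bit quantity, as on a 32-bit unsigned long. */
#define PAGE_MASK_NARROW_MODEL ((uint32_t)~UINT32_C(0xFFF))

/* The mask zero-extends to 64 bits and clears the upper address bits. */
phys_addr_model_t mask_narrow_model(phys_addr_model_t phys)
{
    return phys & PAGE_MASK_NARROW_MODEL;
}

/* Widening before inverting keeps the upper bits of the address. */
phys_addr_model_t mask_wide_model(phys_addr_model_t phys)
{
    phys_addr_model_t low_bits = 0xFFF;

    assert(sizeof(phys_addr_model_t) > sizeof(uint32_t));
    return phys & ~low_bits;
}
```

For physical address 0x123456789, the narrow mask yields 0x23456000 while the wide mask yields 0x123456000.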
    • Fix user-visible spelling error · 173ae9ba
      Linus Torvalds committed
      Pavel Machek reports a warning about W+X pages found in the "Persisent"
      kmap area.  After grepping for it (using the correct spelling), and not
      finding it, I noticed how the debug printk was just misspelled.  Fix it.
      
      The actual mapping bug that Pavel reported is still open.  It's
      apparently a separate issue from the known EFI page tables, looks like
      it's related to the HIGHMEM mappings.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 15 Dec, 2015 4 commits
  11. 14 Dec, 2015 2 commits
  12. 13 Dec, 2015 2 commits
  13. 12 Dec, 2015 8 commits
    • parisc: Disable huge pages on Mako machines · 78c0cbff
      Helge Deller committed
      Mako-based machines (PA8800 and PA8900 CPUs) don't allow aliasing on
      non-equivalent addresses.
      Signed-off-by: Helge Deller <deller@gmx.de>
    • parisc: Wire up mlock2 syscall · 5c477b45
      Helge Deller committed
      Signed-off-by: Helge Deller <deller@gmx.de>
    • parisc: Remove unused pcibios_init_bus() · 5f0e9b4c
      Bjorn Helgaas committed
      There are no callers of pcibios_init_bus(), so remove it.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Helge Deller <deller@gmx.de>
    • V
      c512c6ba
    • ARCv2: perf: Ensure perf intr gets enabled on all cores · c6317bc7
      Vineet Gupta committed
      This was the second perf intr issue.
      
      perf sampling on multicore requires intr to be enabled on all cores.
      ARC perf probe code used helper arc_request_percpu_irq() which calls
       - request_percpu_irq() on core0
       - enable_percpu_irq() on all cores (including core0)
      
      genirq requires that request be made ahead of enable call.
      However if perf probe happened on non core0 (observed on a 3.18 kernel),
      enable would get called ahead of request, failing obviously and
      rendering perf intr disabled on all such cores
      
      [   11.120000] 1 ARC perf       : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
      [   11.130000] 1 -----> enable_percpu_irq() IRQ 20 failed
      [   11.140000] 3 -----> enable_percpu_irq() IRQ 20 failed
      [   11.140000] 2 -----> enable_percpu_irq() IRQ 20 failed
      [   11.140000] 0 =====> request_percpu_irq() IRQ 20
      [   11.140000] 0 -----> enable_percpu_irq() IRQ 20
      
      Fix this fragility by calling request_percpu_irq() on whatever core
      calls probe (there is no requirement on which core calls this anyway)
      and then calling enable on each core.
      
      Interestingly this started as an investigation of STAR 9000838902:
      "sporadically IRQs enabled on perf prob"
      
      which was about occasional boot spew as request_percpu_irq got called
      non-locally (from an IPI), and re-enabled interrupts in following path
      proc_mkdir ->  spin_unlock_irq()
      
      which the irq work code didn't like.
      
      | ARC perf     : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
      |
      | BUG: failure at ../kernel/irq_work.c:135/irq_work_run_list()!
      | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.10-01127-g285efb8e66d1 #2
      |
      | Stack Trace:
      |  arc_unwind_core.constprop.1+0x94/0x104
      |  dump_stack+0x62/0x98
      |  irq_work_run_list+0xb0/0xb4
      |  irq_work_run+0x22/0x3c
      |  do_IPI+0x74/0x9c
      |  handle_irq_event_percpu+0x34/0x164
      |  handle_percpu_irq+0x58/0x78
      |  generic_handle_irq+0x1e/0x2c
      |  arch_do_IRQ+0x3c/0x60
      |  ret_from_exception+0x0/0x8
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: <stable@vger.kernel.org> #4.2+
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
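
The ordering problem in this commit (enable called before request on cores other than core0) can be shown with a small model. The `_model` names and the fixed CPU count are hypothetical; the only rule taken from the commit message is that enable fails unless a request has already been made.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4

struct percpu_irq_model {
    bool requested;
    bool enabled[NR_CPUS_MODEL];
};

static void request_model(struct percpu_irq_model *irq)
{
    irq->requested = true;
}

/* Fails (returns -1) when no request has been made yet. */
static int enable_model(struct percpu_irq_model *irq, int cpu)
{
    if (!irq->requested)
        return -1;
    irq->enabled[cpu] = true;
    return 0;
}

/* Old scheme: only core 0 requests; each core enables when it runs. */
int old_setup_model(struct percpu_irq_model *irq, int cpu)
{
    if (cpu == 0)
        request_model(irq);
    return enable_model(irq, cpu);
}

/* Fixed scheme: the probing core requests, then every core is enabled. */
int probe_model(struct percpu_irq_model *irq, int probing_cpu)
{
    int count = 0;

    assert(probing_cpu >= 0 && probing_cpu < NR_CPUS_MODEL);
    request_model(irq);
    for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
        if (enable_model(irq, cpu) == 0)
            count++;
    return count;
}
```

In the old scheme, a core other than 0 that runs first fails to enable, which mirrors the failure lines in the boot log above.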
    • ARC: intc: No need to clear IRQ_NOAUTOEN · 5bf704c2
      Vineet Gupta committed
      arc_request_percpu_irq() is called by all cores to request/enable percpu
      irq. It has some "prep" calls needed by genirq:
       - setup percpu devid
       - disable IRQ_NOAUTOEN
      
      However, given that enable_percpu_irq() is called anyway, the latter can be
      avoided.
      
      We are now left with irq_set_percpu_devid() quirk and that too for
      ARCompact builds only, since previous patch updated ARCv2 intc to do this
      in the "right" place, i.e. irq map function.
      
      By next release, this will ultimately be fixed for ARCompact as well.
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARCv2: intc: Fix random perf irq disabling in SMP setup · 8eb0984b
      Vineet Gupta committed
      As part of fixing another perf issue, observed that after a perf run,
      the interrupt got disabled on one/more cores.
      
      Turns out that despite requesting perf irq as percpu, the flow handler
      registered was not handle_percpu_irq()
      
      Given that on ARCv2 cores, IRQs < 24 are always private to cpu, we
      register the right handler at the very onset.
      
      Before Fix
      
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0      0      0       0  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0    522      8    51916  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0    522      8   104368  ARCv2 core Intc  20 ARC perf counters
      
      After Fix
      
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0      0      0       0  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:  64198  62012  62697  67803  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20: 126014 122792 123301 133654  ARCv2 core Intc  20 ARC perf counters
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: stable@vger.kernel.org #4.2+
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ls2080a/dts: Add little endian property for GPIO IP block · 65347783
      Liu Gang committed
      The GPIO block on the ls2080a platform has little-endian registers;
      the GPIO driver needs this property to read and write registers through
      the right interface.
      Signed-off-by: Liu Gang <Gang.Liu@freescale.com>
      Signed-off-by: Li Yang <leoli@freescale.com>
      Signed-off-by: Kevin Hilman <khilman@linaro.org>