1. 29 Jan, 2016 3 commits
  2. 22 Jan, 2016 1 commit
    • ARCv2: STAR 9000950267: Handle return from intr to Delay Slot #2 · cbfe74a7
      Vineet Gupta committed
      Returning to a delay slot, riding an interrupt, had one loose end.
      AUX_USER_SP, used for restoring the user mode SP upon RTIE, was not being
      set up from the original task's saved value, causing the task to use the wrong SP,
      leading to ProtV errors.
      
      The reason being:
       - INTERRUPT_EPILOGUE returns to a kernel trampoline, thus not expected to restore it
       - EXCEPTION_EPILOGUE is not used at all
      
      Fix that by restoring AUX_USER_SP explicitly in the trampoline.
      
      This was broken in the original workaround, but the error scenarios have been
      reduced considerably since v3.14 due to the following:
      
       1. The Linuxthreads.old based userspace at the time caused many more
          exceptions in delay slot than the current NPTL based one.
          In fact, with the current userspace the error doesn't happen at all.
      
       2. Return from interrupt (delay slot or otherwise) doesn't get exercised much
          after commit 4de0e528 ("Really Re-enable interrupts to avoid deadlocks")
          since IRQ_ACTIVE.active being clear means most returns are as if from pure
          kernel (even for active interrupts)
      
      In fact, the issue only happened in an experimental branch where I was tinkering with
      4de0e528 reverted.
      
      Cc: stable@kernel.org # v4.2+
      Fixes: 4255b07f ("ARCv2: STAR 9000793984: Handle return from intr to Delay Slot")
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  3. 21 Dec, 2015 3 commits
  4. 17 Dec, 2015 4 commits
  5. 12 Dec, 2015 4 commits
    • c512c6ba
    • ARCv2: perf: Ensure perf intr gets enabled on all cores · c6317bc7
      Vineet Gupta committed
      This was the second perf intr issue.
      
      perf sampling on multicore requires intr to be enabled on all cores.
      ARC perf probe code used helper arc_request_percpu_irq() which calls
       - request_percpu_irq() on core0
       - enable_percpu_irq() on all cores (including core0)
      
      genirq requires that the request be made ahead of the enable call.
      However, if perf probe happened on a core other than core0 (observed on a
      3.18 kernel), enable would get called ahead of request, failing and
      leaving the perf intr disabled on all such cores:
      
      [   11.120000] 1 ARC perf       : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
      [   11.130000] 1 -----> enable_percpu_irq() IRQ 20 failed
      [   11.140000] 3 -----> enable_percpu_irq() IRQ 20 failed
      [   11.140000] 2 -----> enable_percpu_irq() IRQ 20 failed
      [   11.140000] 0 =====> request_percpu_irq() IRQ 20
      [   11.140000] 0 -----> enable_percpu_irq() IRQ 20
      
      Fix this fragility by calling request_percpu_irq() on whatever core
      calls probe (there is no requirement on which core calls this anyway)
      and then calling enable on each core.
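
The ordering constraint described above can be illustrated with a small standalone model. This is only a sketch, not kernel code: `toy_percpu_irq`, `toy_request()`, `toy_enable()` and the two flow functions are hypothetical names invented here, and the model assumes only what the message states, namely that an enable issued before the request fails.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Toy stand-in for the state of one per-cpu IRQ line (hypothetical, not genirq). */
struct toy_percpu_irq {
    bool requested;             /* set by the single request call */
    bool enabled[TOY_NR_CPUS];  /* per-core enable state */
};

/* Models the request step: done once for the IRQ line. */
static void toy_request(struct toy_percpu_irq *irq)
{
    irq->requested = true;
}

/* Models the enable step on one core: fails if nothing was requested yet. */
static bool toy_enable(struct toy_percpu_irq *irq, int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    if (!irq->requested)
        return false;
    irq->enabled[cpu] = true;
    return true;
}

/*
 * Old flow: only core 0 requests, every core enables. `order` is the
 * sequence in which cores reach the helper. Returns the number of
 * failed enables.
 */
static int toy_old_flow(struct toy_percpu_irq *irq, const int *order, int n)
{
    int failures = 0;

    for (int i = 0; i < n; i++) {
        if (order[i] == 0)
            toy_request(irq);
        if (!toy_enable(irq, order[i]))
            failures++;
    }
    return failures;
}

/* Fixed flow: the probing core requests first, then each core enables. */
static int toy_fixed_flow(struct toy_percpu_irq *irq, const int *order, int n)
{
    int failures = 0;

    toy_request(irq);
    for (int i = 0; i < n; i++) {
        if (!toy_enable(irq, order[i]))
            failures++;
    }
    return failures;
}
```

Feeding both flows the core order seen in the log above (1, 3, 2, 0) gives three failed enables in the old flow and none in the fixed one, provided each flow starts from a zero-initialized `struct toy_percpu_irq`.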
      
      Interestingly, this started as an investigation of STAR 9000838902:
      "sporadically IRQs enabled on perf prob"
      
      which was about occasional boot spew, as request_percpu_irq() got called
      non-locally (from an IPI) and re-enabled interrupts in the following path
      proc_mkdir ->  spin_unlock_irq()
      
      which the irq work code didn't like.
      
      | ARC perf     : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
      |
      | BUG: failure at ../kernel/irq_work.c:135/irq_work_run_list()!
      | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.10-01127-g285efb8e66d1 #2
      |
      | Stack Trace:
      |  arc_unwind_core.constprop.1+0x94/0x104
      |  dump_stack+0x62/0x98
      |  irq_work_run_list+0xb0/0xb4
      |  irq_work_run+0x22/0x3c
      |  do_IPI+0x74/0x9c
      |  handle_irq_event_percpu+0x34/0x164
      |  handle_percpu_irq+0x58/0x78
      |  generic_handle_irq+0x1e/0x2c
      |  arch_do_IRQ+0x3c/0x60
      |  ret_from_exception+0x0/0x8
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: <stable@vger.kernel.org> #4.2+
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: intc: No need to clear IRQ_NOAUTOEN · 5bf704c2
      Vineet Gupta committed
      arc_request_percpu_irq() is called by all cores to request/enable percpu
      irq. It has some "prep" calls needed by genirq:
       - setup percpu devid
       - disable IRQ_NOAUTOEN
      
      However, given that enable_percpu_irq() is called anyway, the latter can be
      avoided.
      
      We are now left with the irq_set_percpu_devid() quirk, and that too for
      ARCompact builds only, since the previous patch updated the ARCv2 intc to do this
      in the "right" place, i.e. the irq map function.
      
      By next release, this will ultimately be fixed for ARCompact as well.
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARCv2: intc: Fix random perf irq disabling in SMP setup · 8eb0984b
      Vineet Gupta committed
      As part of fixing another perf issue, it was observed that after a perf run
      the interrupt got disabled on one or more cores.
      
      It turns out that despite requesting the perf irq as percpu, the flow handler
      registered was not handle_percpu_irq().
      
      Given that on ARCv2 cores IRQs < 24 are always private to a cpu, we now
      register the right handler at the very outset.
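
The selection rule stated above can be sketched as a tiny standalone function. This is an illustrative model, not the actual intc code: `toy_flow`, `toy_pick_flow()` and `TOY_FIRST_SHARED_HWIRQ` are hypothetical names, and `TOY_FLOW_OTHER` simply stands for whatever non-percpu flow handler the driver uses for the remaining lines.

```c
/* Per the text: on ARCv2 cores, IRQs below 24 are always private to a cpu. */
#define TOY_FIRST_SHARED_HWIRQ 24u

/* Hypothetical labels for the flow handler chosen for a line. */
enum toy_flow {
    TOY_FLOW_PERCPU,  /* stands for handle_percpu_irq() */
    TOY_FLOW_OTHER    /* stands for the handler used for shared lines */
};

/* Decide the flow handler once, when the line is first mapped. */
static enum toy_flow toy_pick_flow(unsigned int hwirq)
{
    return hwirq < TOY_FIRST_SHARED_HWIRQ ? TOY_FLOW_PERCPU : TOY_FLOW_OTHER;
}
```

With this rule the perf counter line (IRQ 20 in the logs below) gets the percpu flow from the start, regardless of how it is requested later.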
      
      Before Fix
      
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0      0      0       0  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0    522      8    51916  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0    522      8   104368  ARCv2 core Intc  20 ARC perf counters
      
      After Fix
      
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0      0      0       0  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:  64198  62012  62697  67803  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20: 126014 122792 123301 133654  ARCv2 core Intc  20 ARC perf counters
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: stable@vger.kernel.org #4.2+
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  6. 24 Nov, 2015 1 commit
    • ARC: dw2 unwind: Remove fallback linear search thru FDE entries · 2e22502c
      Vineet Gupta committed
      Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls"
      
      | perf record -g -c 15000 -e cycles /sbin/hackbench
      |
      | INFO: rcu_preempt self-detected stall on CPU
      | 1: (1 GPs behind) idle=609/140000000000002/0 softirq=2914/2915 fqs=603
      | Task dump for CPU 1:
      
      The in-kernel dwarf unwinder has a fast binary lookup and a fallback linear
      search, which iterates through each of the ~11K entries and thus takes about
      three orders of magnitude longer (~3 million cycles vs. 2000). Routines written
      in hand assembler lack dwarf info (as we don't support assembler CFI pseudo-ops
      yet), so they fail the unwinder binary lookup, hit the linear search, and
      fail there as well.
      
      However, the linear search is pointless, as the binary lookup tables are created
      from the same entries in the first place. It is impossible for the binary lookup
      to fail while the linear search succeeds. It is a pure waste of cycles and is
      thus removed by this patch.
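
The argument above can be illustrated with a simplified standalone model. This is a sketch under stated assumptions, not the kernel unwinder: `toy_fde` and both lookup functions are hypothetical, and for brevity the binary search runs over the same sorted, non-overlapping array that the linear scan walks, instead of a separate lookup table built from it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified FDE record covering [start, start + len). */
struct toy_fde {
    unsigned long start;
    unsigned long len;
};

/* Fallback-style lookup: visit every entry until one covers pc. */
static const struct toy_fde *toy_linear_find(const struct toy_fde *tbl,
                                             size_t n, unsigned long pc)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pc >= tbl[i].start && pc - tbl[i].start < tbl[i].len)
            return &tbl[i];
    }
    return NULL;
}

/* Fast lookup: tbl must be sorted by start with non-overlapping ranges. */
static const struct toy_fde *toy_binary_find(const struct toy_fde *tbl,
                                             size_t n, unsigned long pc)
{
    size_t lo = 0, hi = n;

    assert(tbl != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (pc < tbl[mid].start)
            hi = mid;                       /* pc lies before this entry */
        else if (pc - tbl[mid].start >= tbl[mid].len)
            lo = mid + 1;                   /* pc lies after this entry */
        else
            return &tbl[mid];               /* pc is covered */
    }
    return NULL;
}
```

For an address in a gap between entries (the situation of a hand-written assembler routine with no dwarf info), both functions return NULL, so in this model the linear pass only adds cost without ever finding something the binary search missed.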
      
      This manifested as RCU stalls / NMI watchdog splat when running
      hackbench under perf with callgraph profiling. The triggering condition
      was the perf counter overflowing in a routine lacking dwarf info (like memset),
      leading to the pathetic 3 million cycle unwinder slow path; by the time it
      returned, new interrupts (Timer, IPI) were already pending and taken
      right away. The original memset didn't make forward progress, and the system kept
      accruing more interrupts and more unwinder delays in a vicious feedback
      loop, ultimately triggering the NMI diagnostic.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  7. 18 Nov, 2015 1 commit
    • ARC: remove SYNC from __switch_to() · e81b75f7
      Vineet Gupta committed
      SYNC in __switch_to() is a historic relic and not needed at all.
      
       - In UP context it is obviously useless: why would we want to stall
         the core for all updates to stack memory of t0 to complete before
         loading kernel mode callee registers from t1 stack's memory?
      
       - In SMP, there could be a potential race in which the outgoing task could
         be concurrently picked for running on a different core, thus writes
         to the stack here need to be visible before the reads from the stack on
         the other core. Peter confirmed that the generic scheduler already has the
         needed barriers (by way of rq lock), so there is no need for an additional
         arch barrier.
      
      This came up when Noam was trying to replace this SYNC with an
      EZChip-specific hardware thread scheduling instruction for their platform
      support.
      
      Link: http://lkml.kernel.org/r/20151102092654.GM17308@twins.programming.kicks-ass.net
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: Noam Camus <noamc@ezchip.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  8. 16 Nov, 2015 1 commit
  9. 14 Nov, 2015 1 commit
  10. 28 Oct, 2015 9 commits
  11. 17 Oct, 2015 5 commits
  12. 16 Sep, 2015 1 commit
    • genirq: Remove irq argument from irq flow handlers · bd0b9ac4
      Thomas Gleixner committed
      Most interrupt flow handlers do not use the irq argument. Those few
      which use it can retrieve the irq number from the irq descriptor.
      
      Remove the argument.
      
      Search and replace was done with coccinelle and some extra helper
      scripts around it. Thanks to Julia for her help!
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
  13. 27 Aug, 2015 6 commits