1. 11 Mar, 2016 1 commit
  2. 21 Jan, 2016 2 commits
  3. 17 Jan, 2016 1 commit
  4. 16 Jan, 2016 1 commit
  5. 21 Dec, 2015 6 commits
    • V
      ARC: dw2 unwind: Catch Dwarf SNAFUs early · 6b538db7
      Committed by Vineet Gupta
      Instead of seeing empty stack traces, let the kernel fail early so dwarf
      issues can be fixed sooner.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      6b538db7
    • V
      ARC: dw2 unwind: Don't bail for CIE.version != 1 · 6d0d5060
      Committed by Vineet Gupta
      The rudimentary CIE.version == 3 handling is already present in the code
      (for return address register specification).
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      6d0d5060
    • V
      Revert "ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing" · 2d64affc
      Committed by Vineet Gupta
      Blindly ignoring CIE.version != 1 was a bad idea.
      It still leaves something to be desired when running perf with callgraphing,
      where libgcc symbols might show up in hotspots.
      
      More importantly, basic CIE.version == 3 support already exists in code:
      
      |
      |   retAddrReg = state.version <= 1 ? *ptr++ : get_uleb128(&ptr, end);
      |
      
      The next commit will simply add continue-not-bail for CIE.version != 1.
      
      This reverts commit 323f41f9.
      2d64affc
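
For context on the `retAddrReg` line quoted in the revert message above: in DWARF call-frame information, a version 1 CIE stores the return address register as a single byte, while a version 3 CIE stores it as an unsigned LEB128 value. The C sketch below illustrates that branch. The names `sketch_get_uleb128()` and `sketch_read_ra_reg()`, and their 0 / -1 return convention, are assumptions made for this example; they are not the kernel's `get_uleb128()` or its unwinder code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one unsigned LEB128 value from [*pcur, end).
 * Returns 0 and advances *pcur on success, -1 if the input is truncated.
 * Hypothetical helper for illustration only.
 */
static int sketch_get_uleb128(const uint8_t **pcur, const uint8_t *end,
			      uint64_t *out)
{
	const uint8_t *cur = *pcur;
	uint64_t value = 0;
	unsigned int shift = 0;

	while (cur < end) {
		uint8_t byte = *cur++;

		/* Each byte carries 7 payload bits, least significant first. */
		if (shift < 64)
			value |= (uint64_t)(byte & 0x7f) << shift;
		shift += 7;

		if (!(byte & 0x80)) {	/* high bit clear: this was the last byte */
			*pcur = cur;
			*out = value;
			return 0;
		}
	}
	return -1;			/* ran out of bytes mid-value */
}

/*
 * Read the return address register field of a CIE.
 * Version <= 1: one plain byte. Later versions: ULEB128.
 */
static int sketch_read_ra_reg(unsigned int cie_version, const uint8_t **pcur,
			      const uint8_t *end, uint64_t *reg)
{
	assert(pcur && *pcur && end && reg);

	if (*pcur >= end)
		return -1;
	if (cie_version <= 1) {
		*reg = *(*pcur)++;
		return 0;
	}
	return sketch_get_uleb128(pcur, end, reg);
}
```

A caller would pass the CIE version it parsed earlier plus the current parse pointer, and treat a -1 return as a malformed CIE.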
    • V
      ARC: Fix linking errors with CONFIG_MODULE + CONFIG_CC_OPTIMIZE_FOR_SIZE · 07fd7d4b
      Committed by Vineet Gupta
      At -Os, ARC gcc generates millicode thunk for function prologue/epilogue,
      which are served by libgcc.
      
      Modules historically are NOT linked with libgcc to avoid code bloat, reducing
      runtime relocation fixups etc. I even once tried doing that but got lost
      in makefile intricacies.
      
      This means modules at -Os don't get the millicode thunks, causing build
      failures below:
      
      | MODPOST 5 modules
      | ERROR: "__ld_r13_to_r18" [crypto/sha256_generic.ko] undefined!
      | ERROR: "__ld_r13_to_r18_ret" [crypto/sha256_generic.ko] undefined!
      | ERROR: "__st_r13_to_r18" [crypto/sha256_generic.ko] undefined!
      | ERROR: "__ld_r13_to_r17_ret" [crypto/sha256_generic.ko] undefined!
      | ERROR: "__st_r13_to_r17" [crypto/sha256_generic.ko] undefined!
      | ERROR: "__ld_r13_to_r16_ret" [crypto/sha256_generic.ko] undefined!
      | ERROR: "__st_r13_to_r16" [crypto/sha256_generic.ko] undefined!
      |....
      |....
      
      Work around that by inhibiting millicode thunks for loadable modules.
      
      Fixes STAR 9000641864:
      ("Linux built with optimizations for size emits errors for modules")
      Reported-by: Anton Kolesov <akolesov@synosys.com>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      07fd7d4b
    • A
      ARC: mm: fix building for MMU v2 · 4b32e89a
      Committed by Alexey Brodkin
      ARC700 cores with MMU v2 don't have IC_PTAG AUX register and so we only
      define ARC_REG_IC_PTAG for MMU versions >= 3.
      
      But current implementation of cache_line_loop_vX() routines assumes
      availability of all of them (v2, v3 and v4) simultaneously.
      
      So with ARC_REG_IC_PTAG left undefined when CONFIG_MMU_VER=2, we hit a
      compilation error:
      ---------------------------------->8-------------------------------
        CC      arch/arc/mm/cache.o
      arch/arc/mm/cache.c: In function '__cache_line_loop_v3':
      arch/arc/mm/cache.c:270:13: error: 'ARC_REG_IC_PTAG' undeclared (first use in this function)
         aux_tag = ARC_REG_IC_PTAG;
                   ^
      arch/arc/mm/cache.c:270:13: note: each undeclared identifier is reported only once for each function it appears in
      scripts/Makefile.build:258: recipe for target 'arch/arc/mm/cache.o' failed
      ---------------------------------->8-------------------------------
      
      The simplest fix is to have ARC_REG_IC_PTAG defined regardless of the MMU
      version being used.
      
      We don't use it in cache_line_loop_v2() anyways so who cares.
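
The build problem and its fix can be illustrated with a small standalone C sketch. All identifiers here (`SKETCH_MMU_VER`, `SKETCH_REG_IC_PTAG`, the `sketch_*` functions) are hypothetical stand-ins for this example, and the register number is only a placeholder. The point is that every loop variant is compiled no matter which MMU version is configured, so any macro a variant names has to exist in all configurations.

```c
#include <assert.h>

/* Assumption: stands in for the kernel's configured MMU version. */
#ifndef SKETCH_MMU_VER
#define SKETCH_MMU_VER 2
#endif

/*
 * Defined unconditionally. If this were wrapped in
 * "#if SKETCH_MMU_VER >= 3", the v3 function below would fail to
 * compile on a v2 configuration even though it is never called there.
 * The value is a placeholder for this sketch.
 */
#define SKETCH_REG_IC_PTAG 0x1e

/* v2 variant: has no tag register to program, reports -1. */
static int sketch_loop_v2_tag_reg(void)
{
	return -1;
}

/* v3 variant: names the tag register, so the macro must be visible. */
static int sketch_loop_v3_tag_reg(void)
{
	return SKETCH_REG_IC_PTAG;
}

/* Both variants are always compiled; only one is selected at run time. */
static int sketch_active_tag_reg(void)
{
	assert(SKETCH_MMU_VER >= 2 && SKETCH_MMU_VER <= 4);

	if (SKETCH_MMU_VER >= 3)
		return sketch_loop_v3_tag_reg();
	return sketch_loop_v2_tag_reg();
}
```

An alternative design would compile each variant only for its own MMU version; the commit message opts for the unconditional define because it is the smaller change.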
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      4b32e89a
    • V
      ARC: mm: HIGHMEM: Fix section mismatch splat · 899cfd2b
      Committed by Vineet Gupta
      | WARNING: vmlinux.o(.text+0xd6c2): Section mismatch in reference from the function alloc_kmap_pgtable() to the function
      | .init.text:__alloc_bootmem_low()
      The function alloc_kmap_pgtable() references the function __init __alloc_bootmem_low().
      This is often because alloc_kmap_pgtable lacks a __init annotation or the annotation of __alloc_bootmem_low is wrong.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      899cfd2b
  6. 17 Dec, 2015 5 commits
  7. 12 Dec, 2015 4 commits
    • V
      c512c6ba
    • V
      ARCv2: perf: Ensure perf intr gets enabled on all cores · c6317bc7
      Committed by Vineet Gupta
      This was the second perf intr issue.
      
      perf sampling on multicore requires intr to be enabled on all cores.
      ARC perf probe code used helper arc_request_percpu_irq() which calls
       - request_percpu_irq() on core0
       - enable_percpu_irq() on all cores (including core0)
      
      genirq requires that request be made ahead of enable call.
      However if perf probe happened on non core0 (observed on a 3.18 kernel),
      enable would get called ahead of request, failing obviously and
      rendering perf intr disabled on all such cores
      
      [   11.120000] 1 ARC perf       : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
      [   11.130000] 1 -----> enable_percpu_irq() IRQ 20 failed
      [   11.140000] 3 -----> enable_percpu_irq() IRQ 20 failed
      [   11.140000] 2 -----> enable_percpu_irq() IRQ 20 failed
      [   11.140000] 0 =====> request_percpu_irq() IRQ 20
      [   11.140000] 0 -----> enable_percpu_irq() IRQ 20
      
      Fix this fragility by calling request_percpu_irq() on whatever core
      calls probe (there is no requirement on which core calls this anyways)
      and then calling enable on each core.
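
The ordering problem described above can be modelled in a few lines of userspace C. This is a toy model, not kernel code: `struct sketch_percpu_irq` and the `sketch_*` functions are hypothetical, and the only rule taken from the message is that an enable issued before the request fails.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* Toy stand-in for per-cpu IRQ state; not the genirq data structure. */
struct sketch_percpu_irq {
	bool requested;			/* handler installed, done once */
	bool enabled[SKETCH_NR_CPUS];	/* per-core enable state */
};

static int sketch_request(struct sketch_percpu_irq *irq)
{
	if (irq->requested)
		return -1;		/* already claimed */
	irq->requested = true;
	return 0;
}

static int sketch_enable(struct sketch_percpu_irq *irq, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

	if (!irq->requested)
		return -1;		/* enable ahead of request fails */
	irq->enabled[cpu] = true;
	return 0;
}

/* Old scheme: only core 0 requests; every core enables for itself. */
static int sketch_old_setup(struct sketch_percpu_irq *irq, int cpu)
{
	if (cpu == 0 && sketch_request(irq) != 0)
		return -1;
	return sketch_enable(irq, cpu);
}

/* New scheme: the probing core requests, then enable runs for each core. */
static int sketch_new_probe(struct sketch_percpu_irq *irq)
{
	int cpu, failed = 0;

	if (sketch_request(irq) != 0)
		return -1;
	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (sketch_enable(irq, cpu) != 0)
			failed++;
	return failed;
}

/* Count how many cores ended up with the interrupt enabled. */
static int sketch_count_enabled(const struct sketch_percpu_irq *irq)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (irq->enabled[cpu])
			n++;
	return n;
}
```

Replaying the core order from the log (1, 3, 2, then 0) through `sketch_old_setup()` leaves only core 0 enabled, while `sketch_new_probe()` enables all four cores regardless of which core runs it.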
      
      Interestingly this started as an investigation of STAR 9000838902:
      "sporadically IRQs enabled on perf prob"
      
      which was about occasional boot spew as request_percpu_irq got called
      non-locally (from an IPI), and re-enabled interrupts in following path
      proc_mkdir ->  spin_unlock_irq()
      
      which the irq work code didn't like.
      
      | ARC perf     : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
      |
      | BUG: failure at ../kernel/irq_work.c:135/irq_work_run_list()!
      | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.10-01127-g285efb8e66d1 #2
      |
      | Stack Trace:
      |  arc_unwind_core.constprop.1+0x94/0x104
      |  dump_stack+0x62/0x98
      |  irq_work_run_list+0xb0/0xb4
      |  irq_work_run+0x22/0x3c
      |  do_IPI+0x74/0x9c
      |  handle_irq_event_percpu+0x34/0x164
      |  handle_percpu_irq+0x58/0x78
      |  generic_handle_irq+0x1e/0x2c
      |  arch_do_IRQ+0x3c/0x60
      |  ret_from_exception+0x0/0x8
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: <stable@vger.kernel.org> #4.2+
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      c6317bc7
    • V
      ARC: intc: No need to clear IRQ_NOAUTOEN · 5bf704c2
      Committed by Vineet Gupta
      arc_request_percpu_irq() is called by all cores to request/enable percpu
      irq. It has some "prep" calls needed by genirq:
       - setup percpu devid
       - disable IRQ_NOAUTOEN
      
      However given that enable_percpu_irq() is called anyways, the latter can be
      avoided.
      
      We are now left with irq_set_percpu_devid() quirk and that too for
      ARCompact builds only, since previous patch updated ARCv2 intc to do this
      in the "right" place, i.e. irq map function.
      
      By next release, this will ultimately be fixed for ARCompact as well.
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      5bf704c2
    • V
      ARCv2: intc: Fix random perf irq disabling in SMP setup · 8eb0984b
      Committed by Vineet Gupta
      As part of fixing another perf issue, observed that after a perf run,
      the interrupt got disabled on one/more cores.
      
      Turns out that despite requesting perf irq as percpu, the flow handler
      registered was not handle_percpu_irq()
      
      Given that on ARCv2 cores, IRQs < 24 are always private to cpu, we
      register the right handler at the very onset.
      
      Before Fix
      
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0      0      0       0  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0    522      8    51916  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0    522      8   104368  ARCv2 core Intc  20 ARC perf counters
      
      After Fix
      
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:    0      0      0       0  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20:  64198  62012  62697  67803  ARCv2 core Intc  20 ARC perf counters
      |
      | [ARCLinux]# perf record -c 20000 /sbin/hackbench
      | Running with 10*40 (== 400) tasks.
      |
      | [ARCLinux]# cat /proc/interrupts | grep perf
      |  20: 126014 122792 123301 133654  ARCv2 core Intc  20 ARC perf counters
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: stable@vger.kernel.org #4.2+
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      8eb0984b
  8. 07 Dec, 2015 1 commit
  9. 05 Dec, 2015 1 commit
  10. 24 Nov, 2015 1 commit
    • V
      ARC: dw2 unwind: Remove fallback linear search thru FDE entries · 2e22502c
      Committed by Vineet Gupta
      Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls"
      
      | perf record -g -c 15000 -e cycles /sbin/hackbench
      |
      | INFO: rcu_preempt self-detected stall on CPU
      | 1: (1 GPs behind) idle=609/140000000000002/0 softirq=2914/2915 fqs=603
      | Task dump for CPU 1:
      
      The in-kernel dwarf unwinder has a fast binary lookup and a fallback linear
      search. The linear search iterates thru each of ~11K entries and thus takes
      about 3 orders of magnitude longer (~3 million cycles vs. 2000). Routines
      written in hand assembler lack dwarf info (as we don't support assembler CFI
      pseudo-ops yet), so they fail the unwinder binary lookup and hit the linear
      search, which fails as well in the end.
      
      However the linear search is pointless, as the binary lookup tables are
      created from the same FDE entries in the first place. It is impossible for
      the binary lookup to fail while the linear search succeeds. It is a pure
      waste of cycles and is thus removed by this patch.
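
The argument that the linear fallback cannot succeed where the binary lookup fails can be checked on a toy table. The sketch below assumes a simplified layout (one sorted, non-overlapping `start`/`len` entry per function); `struct sketch_fde_entry` and both lookup functions are hypothetical and do not mirror the kernel's actual unwind tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: address range covered by one FDE. */
struct sketch_fde_entry {
	uintptr_t start;	/* first address covered */
	uintptr_t len;		/* number of bytes covered */
	int fde;		/* index of the matching FDE */
};

/* Binary lookup over a table sorted by start address: O(log n). */
static int sketch_fde_bsearch(const struct sketch_fde_entry *tbl, size_t n,
			      uintptr_t pc)
{
	size_t lo = 0, hi = n;

	assert(tbl || n == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pc < tbl[mid].start)
			hi = mid;		/* pc lies below this entry */
		else if (pc - tbl[mid].start >= tbl[mid].len)
			lo = mid + 1;		/* pc lies above this entry */
		else
			return tbl[mid].fde;	/* pc is inside this entry */
	}
	return -1;				/* no unwind info for pc */
}

/* Linear scan over the same entries: O(n), same answer. */
static int sketch_fde_linear(const struct sketch_fde_entry *tbl, size_t n,
			     uintptr_t pc)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (pc >= tbl[i].start && pc - tbl[i].start < tbl[i].len)
			return tbl[i].fde;
	return -1;
}
```

An address that falls in a gap between entries, as a hand-written assembler routine without dwarf info would, gets -1 from both functions; the linear scan just takes far longer to reach the same result on a large table.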
      
      This manifested as RCU stalls / NMI watchdog splat when running
      hackbench under perf with callgraph profiling. The triggering condition
      was the perf counter overflowing in a routine lacking dwarf info (like
      memset), leading to the pathetic 3 million cycle unwinder slow path. By the
      time it returned, new interrupts were already pending (Timer, IPI) and were
      taken right away. The original memset didn't make forward progress, and the
      system kept accruing more interrupts and more unwinder delays in a vicious
      feedback loop, ultimately triggering the NMI diagnostic.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      2e22502c
  11. 18 Nov, 2015 1 commit
    • V
      ARC: remove SYNC from __switch_to() · e81b75f7
      Committed by Vineet Gupta
      SYNC in __switch_to() is a historic relic and not needed at all.
      
       - In UP context it is obviously useless: why would we want to stall
         the core for all updates to stack memory of t0 to complete before
         loading kernel mode callee registers from t1 stack's memory?
      
       - In SMP, there could be potential race in which outgoing task could
         be concurrently picked for running on a different core, thus writes
         to stack here need to be visible before the reads from stack on
         other core. Peter confirmed that the generic scheduler already has the
         needed barriers (by way of rq lock) so there is no need for an additional
         arch barrier.
      
      This came up when Noam was trying to replace this SYNC with EZChip
      specific hardware thread scheduling instruction for their platform
      support.
      
      Link: http://lkml.kernel.org/r/20151102092654.GM17308@twins.programming.kicks-ass.net
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: Noam Camus <noamc@ezchip.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      e81b75f7
  12. 16 Nov, 2015 4 commits
  13. 14 Nov, 2015 4 commits
  14. 03 Nov, 2015 1 commit
  15. 29 Oct, 2015 1 commit
  16. 28 Oct, 2015 6 commits