Commits · 964cf28f9d10f4e5229e4365258c292bc5c856b2 · openeuler / raspberrypi-kernel

17 Oct, 2015 12 commits

ARC: boot log: move helper macros to header for reuse · 964cf28f

Authored by Vineet Gupta on Oct 02, 2015

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: mm: compute TLB size as needed from ways * sets · b598e17f

Authored by Vineet Gupta on Oct 02, 2015

This frees up some bits to hold more high-level info, such as PAE being
present, without increasing the size of the already bloated cpuinfo struct.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: mm: MMU v1..v3 only selectable for ARCompact ISA based cores · c583ee4f

Authored by Vineet Gupta on Sep 29, 2015

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: make write_aux_reg safer against macro substitution · 5c35ee64

Authored by Vineet Gupta on Sep 29, 2015

It was generating warnings when called as write_aux_reg(x, paddr >> 32)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
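
Sketch (not taken from the commit): a warning of this kind typically appears when a macro casts its argument without parenthesizing it, so the cast binds to the first operand only. The macro and function names below are hypothetical and only illustrate the substitution hazard; the actual write_aux_reg definition is not shown on this page.

```c
#include <stdint.h>

/* Hypothetical macros, for illustration only. */
#define CAST32_UNSAFE(val) ((uint32_t)val)    /* cast binds to the first operand of val */
#define CAST32_SAFE(val)   ((uint32_t)(val))  /* cast applies to the whole expression */

/* Expands to ((uint32_t)paddr >> 8): truncates to 32 bits first, then shifts. */
uint32_t shifted_unsafe(uint64_t paddr)
{
    return CAST32_UNSAFE(paddr >> 8);
}

/* Expands to ((uint32_t)(paddr >> 8)): shifts the 64-bit value, then truncates. */
uint32_t shifted_safe(uint64_t paddr)
{
    return CAST32_SAFE(paddr >> 8);
}

/* The call shape quoted above (paddr >> 32) is only well defined in the safe form. */
uint32_t high_word(uint64_t paddr)
{
    return CAST32_SAFE(paddr >> 32);
}
```

With the unsafe form, a shift by 32 would be applied to an already truncated 32-bit value, which is the kind of construct compilers warn about.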

ARC: [arcompact] entry.S: Elide extra check/branch in exception ret path · 9fabcc63

Authored by Vineet Gupta on Oct 08, 2015

This is done by improving the laddering logic.

Before:

   if Exception
      goto excep_or_pure_k_ret

   if !Interrupt(L2)
      goto l1_chk
   else
      INTERRUPT_EPILOGUE 2

 l1_chk:
   if !Interrupt(L1)  (i.e. pure kernel mode)
      goto excep_or_pure_k_ret
   else
      INTERRUPT_EPILOGUE 1

 excep_or_pure_k_ret:
   EXCEPTION_EPILOGUE

Now:

   if !Interrupt(L1 or L2) (i.e. exception or pure kernel mode)
      goto excep_or_pure_k_ret

  ; guaranteed to be an interrupt
   if !Interrupt(L2)
      goto l1_ret
   else
      INTERRUPT_EPILOGUE 2

 ; by virtue of above, no need to chk for L1 active
 l1_ret:
    INTERRUPT_EPILOGUE 1

 excep_or_pure_k_ret:
    EXCEPTION_EPILOGUE
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: [arcompact] entry.S: Document preemption games for L2 intr · 5f888087

Authored by Vineet Gupta on Sep 06, 2015

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: [arcompact] entry.S: Improve early return from exception · 55a2ae77

Authored by Vineet Gupta on Sep 05, 2015

The requirement is to
 - Reenable Exceptions (AE cleared)
 - Reenable Interrupts (E1/E2 set)

We need to wiggle these bits into ERSTATUS and call RTIE.

Prev version used the pre-exception STATUS32 as starting point for what
goes into ERSTATUS. This required explicit fixups of U/DE/L bits.

Instead, use the current (in-exception) STATUS32 as starting point.
Being in exception handler U/DE/L can be safely assumed to be correct.
Only AE/E1/E2 need to be fixed.

So the new implementation is slightly better:
 - Avoids a read from memory
 - Is 4 bytes smaller for the typical 1 level of intr configuration
 - Depicts the semantics more clearly
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: [arcompact] don't check for hard isr calling local_irq_enable() · 9dbd3d9b

Authored by Vineet Gupta on Sep 05, 2015

Historically this was done by ARC IDE driver, which is long gone.
IRQ core is pretty robust now and already checks if IRQs are enabled
in hard ISRs. Thus there is no point in checking this in arch code on every
call to local_irq_enable().

Further, if some driver does do that, let it bring down the system so we
notice and fix it sooner, rather than covering up for the offender.

This makes local_irq_enable(), at least for the L1-only case, simple enough
that we can inline it.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: mm: THP: flush_pmd_tlb_range make SMP safe · c7119d56

Authored by Vineet Gupta on Oct 15, 2015

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: mm: THP: Implement flush_pmd_tlb_range() optimization · 722fe8fd

Authored by Vineet Gupta on Feb 27, 2015

Implement the TLB flush routine to evict a specific Super TLB entry,
vs. moving to a new ASID on every such flush.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: mm: THP: boot validation/reporting · 6ce18798

Authored by Vineet Gupta on Mar 12, 2015

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: mm: THP support · fe6c1b86

Authored by Vineet Gupta on Jul 08, 2014

MMUv4 in HS38x cores supports Super Pages which are basis for Linux THP
support.

Normal and Super pages can co-exist (of course not overlap) in the TLB, with a
new bit "SZ" in the TLB page descriptor to distinguish between them.
Super Page size is configurable in hardware (4K to 16M), but fixed once
RTL builds.

The exact THP size a Linux configuration will support is a function of:
 - MMU page size (typical 8K, RTL fixed)
 - software page walker address split between PGD:PTE:PFN (typical
   11:8:13, but can be changed with 1 line)

So for above default, THP size supported is 8K * 256 = 2M

Default Page Walker is 2 levels, PGD:PTE:PFN, which in THP regime
reduces to 1 level (as PTE is folded into PGD and canonically referred
to as PMD).

Thus THP PMD accessors are implemented in terms of PTE (just like sparc).
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
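
The arithmetic above can be checked with a minimal sketch. The function and constant names are hypothetical; only the 11:8:13 split and the 8K page size come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Default software page-walker split of a 32-bit virtual address, as quoted above. */
enum { PGD_BITS = 11, PTE_BITS = 8, PAGE_SHIFT_BITS = 13 };

/* Bytes mapped by one PMD-level (super page) entry: page size * PTEs per table. */
uint32_t thp_size_bytes(unsigned int page_shift, unsigned int pte_bits)
{
    assert(page_shift + pte_bits < 32);   /* keep the shift within 32 bits */
    return (uint32_t)1 << (page_shift + pte_bits);
}
```

With the defaults, thp_size_bytes(13, 8) evaluates to 8K * 256 = 2M, and 11 + 8 + 13 accounts for all 32 address bits.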

09 Oct, 2015 3 commits

ARC: mm: Introduce PTE_SPECIAL · 24830fc7

Authored by Vineet Gupta on Feb 16, 2015

Needed for THP, but will also come in handy for fast GUP later.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: mm: pte flags cosmetic cleanups, comments · 129cbed5

Authored by Vineet Gupta on Dec 05, 2013

No semantic changes
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: mm: switch pgtable_t to pte_t * · e8a75963

Authored by Vineet Gupta on Aug 28, 2015

ARC is the only arch with unsigned long type (vs. struct page *).
Historically this was done to avoid the page_address() calls in various
arch hooks which need to get the virtual/logical address of the table.

Some arches alternatively define it as pte_t *, which is as efficient as
unsigned long (generated code doesn't change).
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

16 Sep, 2015 1 commit

genirq: Remove irq argument from irq flow handlers · bd0b9ac4

Authored by Thomas Gleixner on Sep 14, 2015

Most interrupt flow handlers do not use the irq argument. Those few
which use it can retrieve the irq number from the irq descriptor.

Remove the argument.

Search and replace was done with coccinelle and some extra helper
scripts around it. Thanks to Julia for her help!
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
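
Sketch of the resulting handler shape (not the kernel source): the structure, typedef and function names below are hypothetical stand-ins. The accessor is written in the spirit of the kernel's irq_desc_get_irq() helper; the point is only that a handler which needs the irq number reads it from the descriptor instead of receiving it as an argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct irq_desc. */
struct irq_desc_sketch {
    unsigned int irq;        /* irq number, stored in the descriptor */
    unsigned int last_seen;  /* demo field: irq number observed by the handler */
};

/* New-style flow handler type: the descriptor is the only argument. */
typedef void (*flow_handler_sketch_t)(struct irq_desc_sketch *desc);

/* Accessor that retrieves the irq number from the descriptor. */
unsigned int desc_get_irq_sketch(const struct irq_desc_sketch *desc)
{
    assert(desc != NULL);
    return desc->irq;
}

/* A handler that still needs the irq number fetches it from the descriptor. */
void demo_flow_handler(struct irq_desc_sketch *desc)
{
    desc->last_seen = desc_get_irq_sketch(desc);
}
```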

12 Sep, 2015 1 commit

ARCv2: [axs103_smp] Reduce clk for SMP FPGA configs · 3ebb0540

Authored by Vineet Gupta on Sep 11, 2015

Newer bitfiles need the reduced clk even for SMP builds

Cc: <stable@vger.kernel.org>  #4.2
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

27 Aug, 2015 7 commits

ARCv2: entry: Fix reserved handler · 3d592659

Authored by Vineet Gupta on Aug 27, 2015

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: perf: Finally introduce HS perf unit · 9b28829d

Authored by Vineet Gupta on Nov 18, 2014

With all features in place, the ARC HS pct block can now be effectively
allowed to be probed/used
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: perf: SMP support · e525c37f

Authored by Alexey Brodkin on Aug 24, 2015

* split off pmu info into singleton and per-cpu bits
* setup PMU on all cores
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: perf: implement exclusion of event counting in user or kernel mode · e6b1d126

Authored by Alexey Brodkin on Aug 24, 2015

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: perf: Support sampling events using overflow interrupts · 36481cf7

Authored by Alexey Brodkin on Aug 24, 2015

In the days of ARC 700, performance counters had no interrupt support,
so for ARC we only supported non-sampling events.

Put simply, only "perf stat" was functional.

Now with ARC HS the performance counters support interrupts, and this
change adds support for them.

ARC performance counters act in the following way with regard to
interrupt generation.
 [1] A counter counts starting from value set in PCT_COUNT register pair
 [2] Once counter reaches value set in PCT_INT_CNT interrupt is raised

Basic setup looks like this:
 [1] PCT_COUNT = 0;
 [2] PCT_INT_CNT = __limit_value__;
 [3] Enable interrupts for that counter and let it run
 [4] Let counter reach its limit
 [5] Handle interrupt when it happens

Note that the PCT HW block is built into the CPU core, so its interrupt
line (which is basically the OR of all counters' IRQs) is wired directly to
the top-level IRQC. That means that to de-assert the PCT interrupt it is
required to reset the IRQs from all counters that have reached their limit values.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
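
The last point can be modeled in a few lines. This is a simplified sketch, not the driver code: the pending-bit mask and all function names are assumptions, used only to show why every counter that reached its limit must be acknowledged before the shared interrupt line drops.

```c
#include <assert.h>
#include <stdint.h>

/* Shared line is the OR of all per-counter IRQs: asserted while any bit is set. */
int pct_irq_line(uint32_t pending)
{
    return pending != 0;
}

/* Acknowledge (reset) the IRQ of a single counter. */
uint32_t pct_ack_one(uint32_t pending, unsigned int idx)
{
    assert(idx < 32);
    return pending & ~((uint32_t)1 << idx);
}

/* Handler-style loop: acknowledge every counter that reached its limit.
 * Returns how many counters were handled. */
unsigned int pct_handle_all(uint32_t *pending)
{
    unsigned int handled = 0;

    for (unsigned int idx = 0; idx < 32; idx++) {
        if (*pending & ((uint32_t)1 << idx)) {
            *pending = pct_ack_one(*pending, idx);
            handled++;
        }
    }
    return handled;
}
```

If two counters have overflowed, acknowledging only one of them leaves the line asserted; it drops only after the loop has cleared both.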

ARCv2: perf: implement "event_set_period" · 1fe8bfa5

Authored by Alexey Brodkin on Aug 24, 2015

This generalization prepares for support of overflow interrupts.

Hardware event counters on ARC work this way:
Each counter counts from a programmed start value (set in
ARC_REG_PCT_COUNT) to a limit value (set in ARC_REG_PCT_INT_CNT) and
once the limit value is reached the counter generates an interrupt.

Even though this hardware implementation allows for more flexibility,
in Linux kernel we decided to mimic behavior of other architectures
this way:

 [1] Set limit value as half of counter's max value (to allow counter to
     run after reaching its limit, see below for more explanation):
 ---------->8-----------
 arc_pmu->max_period = (1ULL << counter_size) / 2 - 1ULL;
 ---------->8-----------

 [2] Set start value as "arc_pmu->max_period - sample_period" and then
     count up to the limit

Our event counters don't stop on reaching max value (the one we set in
ARC_REG_PCT_INT_CNT) but continue to count until kernel explicitly
stops each of them.

Setting the limit to half of the counter capacity allows capturing
additional events between the moment the interrupt is triggered and the
moment we actually process PMU interrupts. That way we are trying to be
more precise.

For example if we count CPU cycles we keep track of cycles while
running through generic IRQ handling code:

 [1] We set counter period as say 100_000 events of type "crun"
 [2] Counter reaches that limit and raises its interrupt
 [3] Once we get into the PMU IRQ handler we read the current counter value
     from ARC_REG_PCT_SNAP and see there something like 105_000.

If counters stopped on reaching the limit value, we would miss the
additional 5000 cycles.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
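
The scheme above can be written out as a small sketch. Only the max_period expression comes from the commit message; the function names, and the idea of passing the snapshot value in as a parameter, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Limit value: half of the counter range, as in the expression quoted above. */
uint64_t max_period(unsigned int counter_size)
{
    assert(counter_size > 0 && counter_size < 64);
    return (1ULL << counter_size) / 2 - 1ULL;
}

/* Start value chosen so the limit is reached after sample_period events. */
uint64_t start_value(unsigned int counter_size, uint64_t sample_period)
{
    uint64_t limit = max_period(counter_size);

    assert(sample_period <= limit);
    return limit - sample_period;
}

/* Events counted since the start value, given a snapshot read in the IRQ
 * handler. The counter keeps running past the limit, so the result can
 * exceed sample_period. */
uint64_t events_since_start(uint64_t start, uint64_t snapshot)
{
    return snapshot - start;
}
```

For a period of 100_000 and a snapshot taken 5_000 events past the limit, events_since_start() yields 105_000, matching the example in the message.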

ARC: perf: cap the number of counters to hardware max of 32 · fb7c5725

Authored by Vineet Gupta on Aug 24, 2015

The number of counters in PCT can never be more than 32 (while
countable conditions could be 100+) for both ARCompact and ARCv2

And while at it update copyright dates.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

21 Aug, 2015 1 commit

ARC: Eliminate some ARCv2 specific code for ARCompact build · fd0881a2

Authored by Vineet Gupta on Aug 21, 2015

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

20 Aug, 2015 10 commits

ARC: add/fix some comments in code - no functional change · 09074950

Authored by Vineet Gupta on Aug 19, 2015

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: change some branches to jumps to resolve linkage errors · 6de6066c

Authored by Yuriy Kolerov on Aug 12, 2015

When the kernel binary becomes large enough (32M and more), errors
may occur during the final link stage. This happens because
the build system uses short relocations for ARC by default.
This problem can easily be resolved by passing the -mlong-calls
option to GCC to use long absolute jumps (j) instead of short
relative branches (b).

But there are fragments of pure assembler code which use
branches in inappropriate places and cause a link error because
of relocation overflow.

The first of these fragments is the .fixup insertion in futex.h and
unaligned.c. It inserts code with a branch instruction into a
separate section (.fixup). This leads to a link error when the
kernel becomes large.

The second of these fragments is the call to scheduler functions
(common kernel code) from entry.S in the ARC code. When the kernel
binary becomes large this may lead to a link error because
the scheduler may end up far enough from the ARC code in the final
binary.
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: ensure futex ops are atomic in !LLSC config · eb2cd8b7

Authored by Vineet Gupta on Aug 06, 2015

W/o hardware assisted atomic r-m-w the best we can do is to disable
preemption.

Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: Enable HAVE_FUTEX_CMPXCHG · 5e057429

Authored by Vineet Gupta on Aug 06, 2015

ARC doesn't need the runtime detection of futex cmpxchg op

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: make futex_atomic_cmpxchg_inatomic() return bimodal · 882a95ae

Authored by Vineet Gupta on Aug 06, 2015

Callers of cmpxchg_futex_value_locked() in futex code expect bimodal
return value:
  !0 (essentially -EFAULT as failure)
   0 (success)

Before this patch, the success return value was old value of futex,
which could very well be non zero, causing caller to possibly take the
failure path erroneously.

Fix that by returning 0 for success

(This fix was done back in 2011 for all upstream arches, which ARC
obviously missed)

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
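
A user-space sketch of the calling convention described above. It is not atomic and does not touch user memory: a NULL address stands in for a faulting access, and the function name is hypothetical. The parameter order is modeled on futex_atomic_cmpxchg_inatomic(), which is an assumption here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bimodal return: 0 on success, -EFAULT on failure.
 * The old value of the futex word is reported through *uval and never
 * through the return value, so a non-zero old value cannot be mistaken
 * for an error by the caller.
 */
int futex_cmpxchg_sketch(uint32_t *uval, uint32_t *uaddr,
                         uint32_t expected, uint32_t newval)
{
    uint32_t old;

    assert(uval != NULL);
    if (uaddr == NULL)          /* stand-in for a faulting user access */
        return -EFAULT;

    old = *uaddr;
    if (old == expected)
        *uaddr = newval;

    *uval = old;
    return 0;
}
```

The caller compares *uval with the expected value to learn whether the exchange happened; the return value only says whether the access itself succeeded.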

ARC: futex cosmetics · ed574e2b

Authored by Vineet Gupta on Aug 05, 2015

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARC: add barriers to futex code · 31d30c82

Authored by Vineet Gupta on Aug 05, 2015

The atomic ops on futex need to provide the full barrier just like
regular atomics in kernel.

Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic()
as core code already does that

Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: IOC: Allow boot time disable · 1648c70d

Authored by Alexey Brodkin on Jun 09, 2015

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: SLC: Allow boot time disable · 79335a2c

Authored by Vineet Gupta on Jun 04, 2015

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: Support IO Coherency and permutations involving L1 and L2 caches · f2b0b25a

Authored by Alexey Brodkin on May 25, 2015

In the case of an ARCv2 CPU there could be the following configurations
that affect cache handling for data exchanged with peripherals
via DMA:
 [1] Only L1 cache exists
 [2] Both L1 and L2 exist, but no IO coherency unit
 [3] L1, L2 caches and IO coherency unit exist

Current implementation takes care of [1] and [2].
Moreover support of [2] is implemented with run-time check
for SLC existence which is not super optimal.

This patch introduces support of [3] and rework of DMA ops
usage. Instead of doing run-time check every time a particular
DMA op is executed we'll have 3 different implementations of
DMA ops and select appropriate one during init.

As for IOC, to support it we need to:
 [a] Implement empty DMA ops because IOC takes care of cache
     coherency with DMAed data
 [b] Route dma_alloc_coherent() via dma_alloc_noncoherent()
     This is required to make IOC work in the first place and also
     serves as an optimization, as LD/ST to coherent buffers can be
     serviced from caches w/o going all the way to memory
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
[vgupta:
  -Added some comments about IOC gains
  -Marked dma ops as static,
  -Massaged changelog a bit]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
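
A minimal sketch of the "select once during init" pattern described above, using a function pointer. All names are hypothetical and the three implementations only record which cache levels they would touch; the real DMA ops perform cache maintenance that is not shown on this page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Demo-only record of which cache levels the last operation touched. */
enum { TOUCHED_L1 = 1, TOUCHED_SLC = 2 };
unsigned int last_touched;

typedef void (*dma_cache_op_t)(unsigned long start, unsigned long sz);

/* [1] Only L1 exists: maintain L1. */
static void wback_l1(unsigned long start, unsigned long sz)
{
    (void)start; (void)sz;
    last_touched = TOUCHED_L1;
}

/* [2] L1 and L2 (SLC) exist, no IOC: maintain both levels. */
static void wback_l1_slc(unsigned long start, unsigned long sz)
{
    (void)start; (void)sz;
    last_touched = TOUCHED_L1 | TOUCHED_SLC;
}

/* [3] IOC present: hardware keeps DMA data coherent, so the op is empty. */
static void wback_ioc(unsigned long start, unsigned long sz)
{
    (void)start; (void)sz;
    last_touched = 0;
}

/* Chosen once during init, then called without any run-time check. */
dma_cache_op_t dma_cache_wback_sketch;

void cache_init_sketch(bool has_slc, bool has_ioc)
{
    if (has_ioc)
        dma_cache_wback_sketch = wback_ioc;
    else if (has_slc)
        dma_cache_wback_sketch = wback_l1_slc;
    else
        dma_cache_wback_sketch = wback_l1;
}

void dma_wback_sketch(unsigned long start, unsigned long sz)
{
    assert(dma_cache_wback_sketch != NULL);  /* cache_init_sketch() must run first */
    dma_cache_wback_sketch(start, sz);
}
```

The hot path is a single indirect call; the decision about SLC and IOC presence is paid once at init rather than on every DMA operation.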

11 Aug, 2015 1 commit

ARC: Enable optimistic spinning for LLSC config · 2a440168

Authored by Vineet Gupta on Aug 08, 2015

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

07 Aug, 2015 1 commit

ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff · 10971638

Authored by Vineet Gupta on Aug 07, 2015

The increment of the delay counter was 2 instructions:
Arithmetic Shift Left (ASL) + set to 1 on overflow

This can be done in 1 using ROtate Left (ROL)
Suggested-by: Nigel Topham <ntopham@synopsys.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
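
The equivalence can be sketched in C, assuming a 32-bit delay counter that starts at 1 and therefore always has exactly one bit set. The function names are hypothetical and the code only models the arithmetic, not the actual inline assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Two-step form: shift left, then reset to 1 once the bit has fallen off. */
uint32_t backoff_next_asl(uint32_t delay)
{
    delay <<= 1;
    if (delay == 0)
        delay = 1;
    return delay;
}

/* One-step form: rotate left by one; the top bit wraps around to bit 0. */
uint32_t backoff_next_rol(uint32_t delay)
{
    assert(delay != 0);   /* a zero delay would stay zero under rotation */
    return (delay << 1) | (delay >> 31);
}
```

For the one-hot values produced from a starting delay of 1, both forms yield the same sequence: 1, 2, 4, ... up to the top bit, then back to 1.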

05 Aug, 2015 1 commit

ARC: Make pt_regs regs unsigned · 87ce6280

Authored by Vineet Gupta on Aug 05, 2015

KGDB fails to build after f51e2f19 ("ARC: make sure instruction_pointer()
returns unsigned value")

The hack to force one specific reg to unsigned backfired. There's no
reason to keep the regs signed after all.

|  CC      arch/arc/kernel/kgdb.o
|../arch/arc/kernel/kgdb.c: In function 'kgdb_trap':
| ../arch/arc/kernel/kgdb.c:180:29: error: lvalue required as left operand of assignment
|   instruction_pointer(regs) -= BREAK_INSTR_SIZE;
Reported-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Fixes: f51e2f19 ("ARC: make sure instruction_pointer() returns unsigned value")
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
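
A sketch of why the build error above occurs, assuming the earlier change forced the unsigned type with a cast inside the accessor macro (the actual macro is not shown on this page). The struct, macros and functions below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down register frame; 'ret' holds the return address. */
struct pt_regs_sketch {
    unsigned long ret;
};

/* A cast yields an rvalue, so "IP_CAST(regs) -= n" would fail to compile
 * with "lvalue required as left operand of assignment". */
#define IP_CAST(regs)   ((unsigned long)(regs)->ret)

/* With the field itself unsigned, plain member access stays an lvalue. */
#define IP_LVALUE(regs) ((regs)->ret)

/* Rewind the instruction pointer by the size of one instruction. */
void step_back_sketch(struct pt_regs_sketch *regs, unsigned long insn_size)
{
    assert(regs != NULL);
    IP_LVALUE(regs) -= insn_size;
}

/* Reading through the cast form is still fine. */
unsigned long read_ip_sketch(const struct pt_regs_sketch *regs)
{
    return IP_CAST(regs);
}
```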

04 Aug, 2015 2 commits

ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle · b89aa12c

Authored by Vineet Gupta on Jul 21, 2015

The previous commit for delayed retry of SCOND needs some fine tuning
for spin locks.

The backoff from delayed retry in conjunction with spin looping of lock
itself can potentially cause the delay counter to reach high values.
So to provide fairness to any lock operation, after a lock "seems"
available (i.e. just before the first SCOND try), reset the delay counter
back to the starting value of 1.

Essentially reset delay to 1 for a new spin-wait-loop-acquire cycle.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff · e78fdfef

Authored by Vineet Gupta on Jul 14, 2015

This is to work around the LLOCK/SCOND livelock.

HS38x4 could get into a LLOCK/SCOND livelock in case of multiple overlapping
coherency transactions in the SCU. The exclusive line state keeps rotating
among contending cores leading to a never ending cycle. So break the cycle
by deferring the retry of failed exclusive access (SCOND). The actual delay
needed is function of number of contending cores as well as the unrelated
coherency traffic from other cores. To keep the code simple, start off with
small delay of 1 which would suffice most cases and in case of contention
double the delay. Eventually the delay is sufficient such that the coherency
pipeline is drained, thus a subsequent exclusive access would succeed.

Link: http://lkml.kernel.org/r/1438612568-28265-1-git-send-email-vgupta@synopsys.com
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
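
The retry strategy can be sketched in portable C. This is a model under stated assumptions, not the kernel's inline assembly: the attempt callback stands in for one LLOCK/SCOND try, the busy-wait loop stands in for the delay, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one LLOCK/SCOND attempt: true means SCOND succeeded. */
typedef bool (*attempt_fn)(void *ctx);

/* Demo attempt that fails while *ctx (an int) is positive, decrementing it each time. */
bool fail_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return false;
    }
    return true;
}

/*
 * Retry until the attempt succeeds, waiting 'delay' iterations after each
 * failure and doubling the delay every time. Returns the number of attempts;
 * *spun accumulates the total iterations spent waiting.
 */
unsigned int retry_with_backoff(attempt_fn attempt, void *ctx, uint64_t *spun)
{
    uint32_t delay = 1;          /* small starting delay, as in the text */
    unsigned int attempts = 1;

    assert(attempt != NULL && spun != NULL);
    while (!attempt(ctx)) {
        for (volatile uint32_t i = 0; i < delay; i++)
            ;                    /* busy-wait so other cores can make progress */
        *spun += delay;
        delay = (delay << 1) | (delay >> 31);   /* double, wrapping back to 1 */
        attempts++;
    }
    return attempts;
}
```

With an attempt that fails three times, the waits are 1, 2 and 4 iterations, and the fourth attempt succeeds.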