1. 22 May 2018, 1 commit
    • arm64: lse: Add early clobbers to some input/output asm operands · 32c3fa7c
      Committed by Will Deacon
      For LSE atomics that read and write a register operand, we need to
      ensure that these operands are annotated as "early clobber" if the
      register is written before all of the input operands have been consumed.
      Failure to do so can result in the compiler allocating the same register
      to both operands, leading to splats such as:
      
       Unable to handle kernel paging request at virtual address 11111122222221
       [...]
       x1 : 1111111122222222 x0 : 1111111122222221
       Process swapper/0 (pid: 1, stack limit = 0x000000008209f908)
       Call trace:
        test_atomic64+0x1360/0x155c
      
      where x0 has been allocated as both the value to be stored and also the
      atomic_t pointer.
      
      This patch adds the missing clobbers (an illustrative sketch follows at the
      end of this entry).
      
      Cc: <stable@vger.kernel.org>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Reported-by: Mark Salter <msalter@redhat.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
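
      As a hedged illustration only (the function name and instruction sequence below
      are invented for this note and are not the kernel's atomic_lse.h code), the
      early-clobber modifier '&' is what stops the compiler from reusing an input
      register for an output that is written before the inputs have been consumed:

       /* Sketch: %0 is written by the first instruction, before %1 and %2 are
        * last read.  Without the '&' the compiler may legally give %0 the same
        * register as %1 or %2, producing exactly the corrupted-pointer splat
        * quoted above. */
       static inline long fetch_and_store_sketch(long newval, long *addr)
       {
               long old;

               asm volatile(
               "       ldr     %0, [%2]\n"
               "       str     %1, [%2]\n"
               : "=&r" (old)                   /* '&' = early clobber */
               : "r" (newval), "r" (addr)
               : "memory");

               return old;
       }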
  2. 20 July 2017, 1 commit
  3. 10 May 2017, 1 commit
    • arm64: atomic_lse: match asm register sizes · 8997c934
      Committed by Mark Rutland
      The LSE atomic code uses asm register variables to ensure that
      parameters are allocated in specific registers. In the majority of cases
      we specifically ask for an x register when using 64-bit values, but in a
      couple of cases we use a w register for a 64-bit value.
      
      For asm register variables, the compiler only cares about the register
      index, with wN and xN having the same meaning. The compiler determines
      the register size to use based on the type of the variable. Thus, this
      inconsistency is merely confusing, and not harmful to code generation.
      
      For consistency, this patch updates those cases to use the x register
      alias. There should be no functional change as a result of this patch (a
      sketch at the end of this entry illustrates why).
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
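
      As an illustrative sketch (the function name is invented; this is not the
      kernel macro itself), an asm register variable only pins the register index,
      and the access width follows the C type, so spelling the 64-bit binding with
      the x alias is purely a readability change:

       static inline long pass_through_reg0_sketch(long in)
       {
               /* Previously spelt asm("w0"): same register 0 either way; the
                * 64-bit width comes from the 'long' type, not from the name. */
               register long r asm("x0") = in;

               asm volatile("" : "+r" (r));    /* keep the value pinned in register 0 */
               return r;
       }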
  4. 10 September 2016, 1 commit
    • arm64: lse: convert lse alternatives NOP padding to use __nops · 05492f2f
      Committed by Will Deacon
      The LSE atomics are implemented using alternative code sequences of
      different lengths, and explicit NOP padding is used to ensure the
      patching works correctly.
      
      This patch converts the bulk of the LSE code over to using the __nops
      macro, which makes it slightly clearer what is going on and also
      consolidates all of the padding at the end of the various sequences (see
      the sketch at the end of this entry).
      Signed-off-by: Will Deacon <will.deacon@arm.com>
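
      A minimal sketch of the pattern (the macro and function names are stand-ins
      rather than the exact kernel definitions, and it assumes a toolchain that
      accepts ARMv8.1 LSE instructions): the padding that keeps the short LSE
      sequence the same length as its LL/SC alternative is emitted with a single
      assembler .rept loop at the end instead of a run of hand-written "nop" lines:

       #define SKETCH_NOPS(n)  ".rept " #n "\n\tnop\n.endr\n"

       static inline void sketch_atomic_add(int i, int *counter)
       {
               asm volatile(
               "       stadd   %w[i], %[v]\n"
               SKETCH_NOPS(3)          /* pad to the length of the LL/SC fallback */
               : [v] "+Q" (*counter)
               : [i] "r" (i));
       }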
  5. 16 June 2016, 2 commits
  6. 27 February 2016, 1 commit
    • arm64: lse: deal with clobbered IP registers after branch via PLT · 5be8b70a
      Committed by Ard Biesheuvel
      The LSE atomics implementation uses runtime patching to patch in calls
      to out of line non-LSE atomics implementations on cores that lack hardware
      support for LSE. To avoid paying the overhead cost of a function call even
      if no call ends up being made, the bl instruction is kept invisible to the
      compiler, and the out of line implementations preserve all registers, not
      just the ones that they are required to preserve as per the AAPCS64.
      
      However, commit fd045f6c ("arm64: add support for module PLTs") added
      support for routing branch instructions via veneers if the branch target
      offset exceeds the range of the ordinary relative branch instructions.
      Since this deals with jump and call instructions that are exposed to ELF
      relocations, the PLT code uses x16 to hold the address of the branch target
      when it performs an indirect branch-to-register, something which is
      explicitly allowed by the AAPCS64 (and ordinary compiler generated code
      does not expect register x16 or x17 to retain their values across a bl
      instruction).
      
      Since the lse runtime patched bl instructions don't adhere to the AAPCS64,
      they don't deal with this clobbering of registers x16 and x17. So add them
      to the clobber list of the asm() statements that perform the call
      instructions, and drop x16 and x17 from the list of registers that are
      callee saved in the out of line non-LSE implementations.
      
      In addition, since we have given these functions two scratch registers,
      they no longer need to stack/unstack temp registers (a sketch of the
      clobber-list change follows at the end of this entry).
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      [will: factored clobber list into #define, updated Makefile comment]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
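
      A hedged sketch of the shape of the fix (the macro, function and branch-target
      names below are placeholders, not the kernel's exact definitions): the call
      that is hidden from the compiler must now list x16, x17 and the link register
      as clobbered, because a module PLT veneer is free to use x16/x17 as scratch
      registers on the way to the target:

       /* Veneer scratch registers plus the link register written by bl. */
       #define SKETCH_LLSC_CLOBBERS    "x16", "x17", "x30"

       static inline void sketch_out_of_line_atomic_add(int i, int *counter)
       {
               register int  w0 asm("w0") = i;
               register int *x1 asm("x1") = counter;

               /* 'll_sc_atomic_add_helper' stands in for the out-of-line LL/SC
                * implementation that the runtime-patched bl would really target. */
               asm volatile("  bl      ll_sc_atomic_add_helper"
               : "+r" (w0), "+r" (x1)
               :
               : SKETCH_LLSC_CLOBBERS, "memory");
       }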
  7. 06 November 2015, 1 commit
    • arm64: cmpxchg_dbl: fix return value type · 57a65667
      Committed by Lorenzo Pieralisi
      The current arm64 __cmpxchg_double{_mb} implementations carry out the
      compare exchange by first comparing the old values passed in to the
      values read from the pointer provided and by stashing the cumulative
      bitwise difference in a 64-bit register.
      
      By comparing the register content against 0, it is possible to detect if
      the values read differ from the old values passed in, so that the compare
      exchange detects whether it has to bail out or carry on completing the
      operation with the exchange.
      
      Given the current implementation, to detect the cmpxchg operation
      status, the __cmpxchg_double{_mb} functions should return the 64-bit
      stashed bitwise difference so that the caller can detect cmpxchg failure
      by comparing the return value against 0. The current implementation
      declares the return value as an int, so the 64-bit value stashing the
      bitwise difference is truncated before being returned to the
      __cmpxchg_double{_mb} callers, and any bitwise difference present in the
      top 32 bits goes undetected, triggering false positives and subsequent
      kernel failures.
      
      This patch fixes the issue by declaring the arm64 __cmpxchg_double{_mb}
      return values as a long, so that the bitwise difference is properly
      propagated on failure, restoring the expected behaviour (a worked
      illustration follows at the end of this entry).
      
      Fixes: e9a4b795 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU")
      Cc: <stable@vger.kernel.org> # 4.3+
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
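
      A worked illustration of the truncation (user-space C written for this note,
      not kernel code; it assumes an LP64 target where long is 64 bits wide): a
      difference confined to the upper 32 bits survives a long return value but
      disappears through an int one, which is exactly the false "success" described
      above:

       #include <stdint.h>
       #include <stdio.h>

       static int  status_as_int(uint64_t diff)  { return (int)diff; }   /* old: truncates  */
       static long status_as_long(uint64_t diff) { return (long)diff; }  /* fixed behaviour */

       int main(void)
       {
               /* Old and loaded values differ only in the upper 32 bits. */
               uint64_t diff = (uint64_t)1 << 40;

               printf("int  return: %d  (mismatch lost: cmpxchg looks successful)\n",
                      status_as_int(diff));
               printf("long return: %ld (non-zero: failure correctly reported)\n",
                      status_as_long(diff));
               return 0;
       }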
  8. 13 October 2015, 1 commit
  9. 30 July 2015, 1 commit
  10. 27 July 2015, 7 commits