1. 25 10月, 2017 2 次提交
  2. 24 10月, 2017 1 次提交
    • M
      arm64: Avoid aligning normal memory pointers in __memcpy_{to,from}io · 9ca255bf
      Mark Salyzyn 提交于
      __memcpy_{to,from}io fall back to byte-at-a-time copying if both the
      source and destination pointers are not 8-byte aligned. Since one of the
      pointers always points at normal memory, this is unnecessary and
      detrimental to performance, so only do byte copying until we hit an 8-byte
      boundary for the device pointer.
      
      This change was motivated by performance issues in the pstore driver.
      On a test platform, measuring probe time for pstore, console buffer
      size of 1/4MB and pmsg of 1/2MB, was in the 90-107ms region. Change
      managed to reduce it to 10-25ms, an improvement in boot time.
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: NMark Salyzyn <salyzyn@android.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      9ca255bf
  3. 20 10月, 2017 1 次提交
    • S
      arm64: Fix the feature type for ID register fields · 5bdecb79
      Suzuki K Poulose 提交于
      Now that the ARM ARM clearly specifies the rules for inferring
      the values of the ID register fields, fix the types of the
      feature bits we have in the kernel.
      
      As per ARM ARM DDI0487B.b, section D10.1.4 "Principles of the
      ID scheme for fields in ID registers" lists the registers to
      which the scheme applies along with the exceptions.
      
      This patch changes the relevant feature bits from FTR_EXACT
      to FTR_LOWER_SAFE to select the safer value. This will enable
      an older kernel running on a new CPU detect the safer option
      rather than completely disabling the feature.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      5bdecb79
  4. 19 10月, 2017 1 次提交
    • J
      arm64: Update fault_info table with new exception types · 3f7c86b2
      Julien Thierry 提交于
      Based on: ARM Architecture Reference Manual, ARMv8 (DDI 0487B.b).
      
      ARMv8.1 introduces the optional feature ARMv8.1-TTHM which can trigger a
      new type of memory abort. This exception is triggered when hardware update
      of page table flags is not atomic in regards to other memory accesses.
      Replace the corresponding unknown entry with a more accurate one.
      
      Cf: Section D10.2.28 ESR_ELx, Exception Syndrome Register (p D10-2381),
      section D4.4.11 Restriction on memory types for hardware updates on page
      tables (p D4-2116 - D4-2117).
      
      ARMv8.2 does not add new exception types, however it is worth mentioning
      that when obligatory feature RAS (optional for ARMv8.{0,1}) is implemented,
      exceptions related to "Synchronous parity or ECC error on memory access,
      not on translation table walk" become reserved and should not occur.
      Signed-off-by: NJulien Thierry <julien.thierry@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      3f7c86b2
  5. 18 10月, 2017 2 次提交
  6. 14 10月, 2017 2 次提交
    • J
      arm64: use WFE for long delays · 7b77452e
      Julien Thierry 提交于
      The current delay implementation uses the yield instruction, which is a
      hint that it is beneficial to schedule another thread. As this is a hint,
      it may be implemented as a NOP, causing all delays to be busy loops. This
      is the case for many existing CPUs.
      
      Taking advantage of the generic timer sending periodic events to all
      cores, we can use WFE during delays to reduce power consumption. This is
      beneficial only for delays longer than the period of the timer event
      stream.
      
      If timer event stream is not enabled, delays will behave as yield/busy
      loops.
      Signed-off-by: NJulien Thierry <julien.thierry@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      7b77452e
    • J
      arm_arch_timer: Expose event stream status · ec5c8e42
      Julien Thierry 提交于
      The arch timer configuration for a CPU might get reset after suspending
      said CPU.
      
      In order to reliably use the event stream in the kernel (e.g. for delays),
      we keep track of the state where we can safely consider the event stream as
      properly configured. After writing to cntkctl, we issue an ISB to ensure
      that subsequent delay loops can rely on the event stream being enabled.
      Signed-off-by: NJulien Thierry <julien.thierry@arm.com>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      ec5c8e42
  7. 11 10月, 2017 1 次提交
    • S
      arm64: Expose support for optional ARMv8-A features · f5e035f8
      Suzuki K Poulose 提交于
      ARMv8-A adds a few optional features for ARMv8.2 and ARMv8.3.
      Expose them to the userspace via HWCAPs and mrs emulation.
      
      SHA2-512  - Instruction support for SHA512 Hash algorithm (e.g SHA512H,
      	    SHA512H2, SHA512U0, SHA512SU1)
      SHA3 	  - SHA3 crypto instructions (EOR3, RAX1, XAR, BCAX).
      SM3	  - Instruction support for Chinese cryptography algorithm SM3
      SM4 	  - Instruction support for Chinese cryptography algorithm SM4
      DP	  - Dot Product instructions (UDOT, SDOT).
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      f5e035f8
  8. 09 10月, 2017 1 次提交
  9. 04 10月, 2017 3 次提交
    • M
      dma mapping : export caller to vmallocinfo · 359be678
      Matthieu CASTET 提交于
      For example on arm64 board, this add info to "user" entries in vmallocinfo
      
      Before :
      [...]
      0xffffff8008997000 0xffffff80089d8000 266240 user
      [...]
      
      Afer :
      [...]
      0xffffff8008997000 0xffffff80089d8000 266240 atomic_pool_init+0x0/0x1d8 user
      [...]
      
      This help to debug mapping issues, and is consistent with others entries
      (ioremap, vmalloc, ...) that already provide caller.
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NMatthieu CASTET <matthieu.castet@parrot.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      359be678
    • S
      arm64: Unconditionally support {ARCH_}HAVE_NMI{_SAFE_CMPXCHG} · 396a5d4a
      Stephen Boyd 提交于
      From what I can see there isn't anything about ACPI_APEI_SEA that
      means the arm64 architecture can or cannot support NMI safe
      cmpxchg or NMIs, so the 'if' condition here is not important.
      Let's remove it. Doing that allows us to support ftrace
      histograms via CONFIG_HIST_TRIGGERS that depends on the arch
      having the ARCH_HAVE_NMI_SAFE_CMPXCHG config selected.
      
      Cc: Tyler Baicar <tbaicar@codeaurora.org>
      Cc: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Cc: Dongjiu Geng <gengdongjiu@huawei.com>
      Acked-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      396a5d4a
    • M
      arm64: consistently log boot/secondary CPU IDs · ccaac162
      Mark Rutland 提交于
      Currently we inconsistently log identifying information for the boot CPU
      and secondary CPUs. For the boot CPU, we log the MIDR and MPIDR across
      separate messages, whereas for the secondary CPUs we only log the MIDR.
      
      In some cases, it would be useful to know the MPIDR of secondary CPUs,
      and it would be nice for these messages to be consistent.
      
      This patch ensures that in the primary and secondary boot paths, we log
      both the MPIDR and MIDR in a single message, with a consistent format.
      the MPIDR is consistently padded to 10 hex characters to cover Aff3 in
      bits 39:32, so that IDs can be compared easily.
      
      The newly redundant message in setup_arch() is removed.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Al Stone <ahs3@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      [will: added '0x' prefixes consistently]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      ccaac162
  10. 02 10月, 2017 5 次提交
  11. 29 9月, 2017 2 次提交
    • W
      arm64: fault: Route pte translation faults via do_translation_fault · 760bfb47
      Will Deacon 提交于
      We currently route pte translation faults via do_page_fault, which elides
      the address check against TASK_SIZE before invoking the mm fault handling
      code. However, this can cause issues with the path walking code in
      conjunction with our word-at-a-time implementation because
      load_unaligned_zeropad can end up faulting in kernel space if it reads
      across a page boundary and runs into a page fault (e.g. by attempting to
      read from a guard region).
      
      In the case of such a fault, load_unaligned_zeropad has registered a
      fixup to shift the valid data and pad with zeroes, however the abort is
      reported as a level 3 translation fault and we dispatch it straight to
      do_page_fault, despite it being a kernel address. This results in calling
      a sleeping function from atomic context:
      
        BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313
        in_atomic(): 0, irqs_disabled(): 0, pid: 10290
        Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
        [...]
        [<ffffff8e016cd0cc>] ___might_sleep+0x134/0x144
        [<ffffff8e016cd158>] __might_sleep+0x7c/0x8c
        [<ffffff8e016977f0>] do_page_fault+0x140/0x330
        [<ffffff8e01681328>] do_mem_abort+0x54/0xb0
        Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0)
        [...]
        [<ffffff8e016844fc>] el1_da+0x18/0x78
        [<ffffff8e017f399c>] path_parentat+0x44/0x88
        [<ffffff8e017f4c9c>] filename_parentat+0x5c/0xd8
        [<ffffff8e017f5044>] filename_create+0x4c/0x128
        [<ffffff8e017f59e4>] SyS_mkdirat+0x50/0xc8
        [<ffffff8e01684e30>] el0_svc_naked+0x24/0x28
        Code: 36380080 d5384100 f9400800 9402566d (d4210000)
        ---[ end trace 2d01889f2bca9b9f ]---
      
      Fix this by dispatching all translation faults to do_translation_faults,
      which avoids invoking the page fault logic for faults on kernel addresses.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: NAnkit Jain <ankijain@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      760bfb47
    • W
      arm64: mm: Use READ_ONCE when dereferencing pointer to pte table · f069faba
      Will Deacon 提交于
      On kernels built with support for transparent huge pages, different CPUs
      can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk
      and they must take care to use READ_ONCE to avoid value tearing or caching
      of stale values by the compiler. Unfortunately, these functions call into
      our pgtable macros, which don't use READ_ONCE, and compiler caching has
      been observed to cause the following crash during ext4 writeback:
      
      PC is at check_pte+0x20/0x170
      LR is at page_vma_mapped_walk+0x2e0/0x540
      [...]
      Process doio (pid: 2463, stack limit = 0xffff00000f2e8000)
      Call trace:
      [<ffff000008233328>] check_pte+0x20/0x170
      [<ffff000008233758>] page_vma_mapped_walk+0x2e0/0x540
      [<ffff000008234adc>] page_mkclean_one+0xac/0x278
      [<ffff000008234d98>] rmap_walk_file+0xf0/0x238
      [<ffff000008236e74>] rmap_walk+0x64/0xa0
      [<ffff0000082370c8>] page_mkclean+0x90/0xa8
      [<ffff0000081f3c64>] clear_page_dirty_for_io+0x84/0x2a8
      [<ffff00000832f984>] mpage_submit_page+0x34/0x98
      [<ffff00000832fb4c>] mpage_process_page_bufs+0x164/0x170
      [<ffff00000832fc8c>] mpage_prepare_extent_to_map+0x134/0x2b8
      [<ffff00000833530c>] ext4_writepages+0x484/0xe30
      [<ffff0000081f6ab4>] do_writepages+0x44/0xe8
      [<ffff0000081e5bd4>] __filemap_fdatawrite_range+0xbc/0x110
      [<ffff0000081e5e68>] file_write_and_wait_range+0x48/0xd8
      [<ffff000008324310>] ext4_sync_file+0x80/0x4b8
      [<ffff0000082bd434>] vfs_fsync_range+0x64/0xc0
      [<ffff0000082332b4>] SyS_msync+0x194/0x1e8
      
      This is because page_vma_mapped_walk loads the PMD twice before calling
      pte_offset_map: the first time without READ_ONCE (where it gets all zeroes
      due to a concurrent pmdp_invalidate) and the second time with READ_ONCE
      (where it sees a valid table pointer due to a concurrent pmd_populate).
      However, the compiler inlines everything and caches the first value in
      a register, which is subsequently used in pte_offset_phys which returns
      a junk pointer that is later dereferenced when attempting to access the
      relevant pte.
      
      This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure
      that a stale value is not used. Whilst this is a point fix for a known
      failure (and simple to backport), a full fix moving all of our page table
      accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in
      page_vma_mapped_walk is in the works for a future kernel release.
      
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Timur Tabi <timur@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Fixes: f27176cf ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
      Tested-by: NRichard Ruigrok <rruigrok@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      f069faba
  12. 27 9月, 2017 1 次提交
  13. 18 9月, 2017 4 次提交
  14. 14 9月, 2017 1 次提交
  15. 09 9月, 2017 1 次提交
  16. 05 9月, 2017 2 次提交
  17. 01 9月, 2017 1 次提交
  18. 31 8月, 2017 5 次提交
  19. 30 8月, 2017 4 次提交