1. 02 8月, 2016 2 次提交
    • P
      MIPS: Use per-mm page to execute branch delay slot instructions · 432c6bac
      Paul Burton 提交于
      In some cases the kernel needs to execute an instruction from the delay
      slot of an emulated branch instruction. These cases include:
      
        - Emulated floating point branch instructions (bc1[ft]l?) for systems
          which don't include an FPU, or upon which the kernel is run with the
          "nofpu" parameter.
      
        - MIPSr6 systems running binaries targeting older revisions of the
          architecture, which may include branch instructions whose encodings
          are no longer valid in MIPSr6.
      
      Executing instructions from such delay slots is done by writing the
      instruction to memory followed by a trap, as part of an "emuframe", and
      executing it. This avoids the requirement of an emulator for the entire
      MIPS instruction set. Prior to this patch such emuframes are written to
      the user stack and executed from there.
      
      This patch moves FP branch delay emuframes off of the user stack and
      into a per-mm page. Allocating a page per-mm leaves userland with access
      to only what it had access to previously, and compared to other
      solutions is relatively simple.
      
      When a thread requires a delay slot emulation, it is allocated a frame.
      A thread may only have one frame allocated at any one time, since it may
      only ever be executing one instruction at any one time. In order to
      ensure that we can free up allocated frame later, its index is recorded
      in struct thread_struct. In the typical case, after executing the delay
      slot instruction we'll execute a break instruction with the BRK_MEMU
      code. This traps back to the kernel & leads to a call to do_dsemulret
      which frees the allocated frame & moves the user PC back to the
      instruction that would have executed following the emulated branch.
      In some cases the delay slot instruction may be invalid, such as a
      branch, or may trigger an exception. In these cases the BRK_MEMU break
      instruction will not be hit. In order to ensure that frames are freed
      this patch introduces dsemul_thread_cleanup() and calls it to free any
      allocated frame upon thread exit. If the instruction generated an
      exception & leads to a signal being delivered to the thread, or indeed
      if a signal simply happens to be delivered to the thread whilst it is
      executing from the struct emuframe, then we need to take care to exit
      the frame appropriately. This is done by either rolling back the user PC
      to the branch or advancing it to the continuation PC prior to signal
      delivery, using dsemul_thread_rollback(). If this were not done then a
      sigreturn would return to the struct emuframe, and if that frame had
      meanwhile been used in response to an emulated branch instruction within
      the signal handler then we would execute the wrong user code.
      
      Whilst a user could theoretically place something like a compact branch
      to self in a delay slot and cause their thread to become stuck in an
      infinite loop with the frame never being deallocated, this would:
      
        - Only affect the users single process.
      
        - Be architecturally invalid since there would be a branch in the
          delay slot, which is forbidden.
      
        - Be extremely unlikely to happen by mistake, and provide a program
          with no more ability to harm the system than a simple infinite loop
          would.
      
      If a thread requires a delay slot emulation & no frame is available to
      it (ie. the process has enough other threads that all frames are
      currently in use) then the thread joins a waitqueue. It will sleep until
      a frame is freed by another thread in the process.
      
      Since we now know whether a thread has an allocated frame due to our
      tracking of its index, the cookie field of struct emuframe is removed as
      we can be more certain whether we have a valid frame. Since a thread may
      only ever have a single frame at any given time, the epc field of struct
      emuframe is also removed & the PC to continue from is instead stored in
      struct thread_struct. Together these changes simplify & shrink struct
      emuframe somewhat, allowing twice as many frames to fit into the page
      allocated for them.
      
      The primary benefit of this patch is that we are now free to mark the
      user stack non-executable where that is possible.
      Signed-off-by: NPaul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: Maciej Rozycki <maciej.rozycki@imgtec.com>
      Cc: Faraz Shahbazker <faraz.shahbazker@imgtec.com>
      Cc: Raghu Gandham <raghu.gandham@imgtec.com>
      Cc: Matthew Fortune <matthew.fortune@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13764/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      432c6bac
    • A
      MIPS: Modify error handling · 33799a6d
      Amitoj Kaur Chawla 提交于
      debugfs_create_file returns NULL on error so an IS_ERR test is
      incorrect here and a NULL check is required.
      
      The Coccinelle semantic patch used to make this change is as follows:
      @@
      expression e;
      @@
      
        e = debugfs_create_file(...);
      if(
      -    IS_ERR(e)
      +    !e
          )
          {
        <+...
        return
      - PTR_ERR(e)
      + -ENOMEM
        ;
        ...+>
        }
      Signed-off-by: NAmitoj Kaur Chawla <amitoj1606@gmail.com>
      Cc: julia.lawall@lip6.fr
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/13834/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      33799a6d
  2. 29 7月, 2016 13 次提交
    • J
      MIPS: c-r4k: Use SMP calls for CM indexed cache ops · 11f76903
      James Hogan 提交于
      The MIPS Coherence Manager (CM) can propagate address-based ("hit")
      cache operations to other cores in the coherent system, alleviating
      software of the need to use SMP calls, however indexed cache operations
      are not propagated by hardware since doing so makes no sense for
      separate caches.
      
      Update r4k_op_needs_ipi() to report that only hit cache operations are
      globalized by the CM, requiring indexed cache operations to be
      globalized by software via an SMP call.
      
      r4k_on_each_cpu() previously had a special case for CONFIG_MIPS_MT_SMP,
      intended to avoid the SMP calls when the only other CPUs in the system
      were other VPEs in the same core, and hence sharing the same caches.
      This was changed by commit cccf34e9 ("MIPS: c-r4k: Fix cache
      flushing for MT cores") to apparently handle multi-core multi-VPE
      systems, but it focussed mainly on hit cache ops, so the SMP calls were
      still disabled entirely for CM systems.
      
      This doesn't normally cause problems, but tests can be written to hit
      these corner cases by using multiple threads, or changing task
      affinities to force the process to migrate cores. For example the
      failure of mprotect RW->RX to globally sync icaches (via
      flush_cache_range) can be detected by modifying and mprotecting a code
      page on one core, and migrating to a different core to execute from it.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13807/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      11f76903
    • J
      MIPS: c-r4k: Avoid small flush_icache_range SMP calls · f70ddc07
      James Hogan 提交于
      Avoid SMP calls for flushing small icache ranges. On non-CM platforms,
      and CM platforms too after we make r4k_on_each_cpu() take the cache op
      type into account, it will be called on multiple CPUs due to the
      possibility that local_r4k_flush_icache_range_ipi() could do
      non-globalized indexed cache ops. This rougly copies the range size
      check out into r4k_flush_icache_range(), which can disallow indexed
      cache ops and allow r4k_on_each_cpu() to skip the SMP call.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13805/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      f70ddc07
    • J
      MIPS: c-r4k: Local flush_icache_range cache op override · 27b93d9c
      James Hogan 提交于
      Allow the permitted cache op types used by
      local_r4k_flush_icache_range_ipi() to be overridden by the SMP caller.
      This will allow SMP calls to be avoided under certain circumstances,
      falling back to a single CPU performing globalized hit cache ops only.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13803/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      27b93d9c
    • J
      MIPS: c-r4k: Split r4k_flush_kernel_vmap_range() · a9341ae2
      James Hogan 提交于
      Split the operation of r4k_flush_kernel_vmap_range() into separate
      SMP callbacks for the indexed cache flush and hit cache flush cases,
      since the logic to determine which to use can be determined by the
      initiating CPU prior to doing any SMP calls.
      
      This will help when we change r4k_on_each_cpu() to distinguish indexed
      and hit cache ops in a later patch, preventing globalized hit cache ops
      being performed redundantly on multiple CPUs.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13806/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      a9341ae2
    • J
      MIPS: c-r4k: Exclude sibling CPUs in SMP calls · 640511ae
      James Hogan 提交于
      When performing SMP calls to foreign cores, exclude sibling CPUs from
      the provided map, as we already handle the local core on the current
      CPU. This prevents an SMP call from for example core 0, VPE 1 to VPE 0
      on the same core.
      
      In the process the cpu_foreign_map cpumask is turned into an array of
      cpumasks, so that each CPU has its own version of it which excludes
      sibling CPUs. r4k_op_needs_ipi() is also updated to reflect that cache
      management SMP calls are not needed when all CPUs are siblings (i.e.
      there are no foreign CPUs according to the new cpu_foreign_map[]
      semantics which exclude siblings).
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: Jayachandran C. <jchandra@broadcom.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13801/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      640511ae
    • J
      MIPS: c-r4k: Fix valid ASID optimisation · 6d758bfc
      James Hogan 提交于
      Several cache operations are optimised to return early from the SMP call
      handler if the memory map in question has no valid ASID on the current
      CPU, or any online CPU in the case of MIPS_MT_SMP. The idea is that if a
      memory map has never been used on a CPU it shouldn't have cache lines in
      need of flushing.
      
      However this doesn't cover all cases when ASIDs for other CPUs need to
      be checked:
      - Offline VPEs may have recently been online and brought lines into the
        (shared) cache, so they should also be checked, rather than only
        online CPUs.
      - SMP systems with a Coherence Manager (CM), but with MT disabled still
        have globalized hit cache ops, but don't use SMP calls, so all present
        CPUs should be taken into account.
      - R6 systems have a different multithreading implementation, so
        MIPS_MT_SMP won't be set, but as above may still have a CM which
        globalizes hit cache ops.
      
      Additionally for non-globalized cache operations where an SMP call to a
      single VPE in each foreign core is used, it is not necessary to check
      every CPU in the system, only sibling CPUs sharing the same first level
      cache.
      
      Fix this by making has_valid_asid() take a cache op type argument like
      r4k_on_each_cpu(), so it can determine whether r4k_on_each_cpu() will
      have done SMP calls to other cores. It can then determine which set of
      CPUs to check the ASIDs of based on that, excluding foreign CPUs if an
      SMP call will have been performed.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13804/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      6d758bfc
    • J
      MIPS: c-r4k: Add r4k_on_each_cpu cache op type arg · d374d937
      James Hogan 提交于
      The r4k_on_each_cpu() function calls the specified cache flush helper on
      other CPUs if deemed necessary due to the cache ops not being
      globalized by hardware. However this really depends on the cache op
      addressing type, as the MIPS Coherence Manager (CM) if present will
      globalize "hit" cache ops (addressed by virtual address), but not
      "index" cache ops (addressed by cache index). This results in index
      cache ops only being performed on a single CPU when CM is present.
      
      Most (but not all) of the functions called by r4k_on_each_cpu() perform
      cache operations exclusively with a single cache op type, so add a type
      argument and modify the callers to pass in some combination of R4K_HIT
      (global kernel virtual addressing or user virtual addressing
      conditional upon matching active_mm) and R4K_INDEX (index into cache).
      
      This will allow r4k_on_each_cpu() to later distinguish these cases and
      decide whether to perform an SMP call based on it.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13798/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      d374d937
    • J
      MIPS: c-r4k: Avoid dcache flush for sigtramps · 8bd646e9
      James Hogan 提交于
      Avoid the dcache and scache flush in local_r4k_flush_cache_sigtramp() if
      the icache fills straight from the dcache.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13802/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      8bd646e9
    • J
      MIPS: c-r4k: Fix sigtramp SMP call to use kmap · e523f289
      James Hogan 提交于
      Fix r4k_flush_cache_sigtramp() and local_r4k_flush_cache_sigtramp() to
      flush the delay slot emulation trampoline cacheline through a kmap
      rather than directly when the active_mm doesn't match that of the task
      initiating the flush, a bit like local_r4k_flush_cache_page() does.
      
      This would fix a corner case on SMP systems without hardware globalized
      hit cache ops, where a migration to another CPU after the flush, where
      that CPU did not have the same mm active at the time of the flush, could
      result in stale icache content being executed instead of the trampoline,
      e.g. from a previous delay slot emulation with a similar stack pointer.
      
      This case was artificially triggered by replacing the icache flush with
      a full indexed flush (not globalized on CM systems) and forcing the SMP
      call to take place, with a test program that alternated two FPU delay
      slots with a parent process repeatedly changing scheduler affinity.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13797/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      e523f289
    • J
      MIPS: c-r4k: Fix protected_writeback_scache_line for EVA · 0758b116
      James Hogan 提交于
      The protected_writeback_scache_line() function is used by
      local_r4k_flush_cache_sigtramp() to flush an FPU delay slot emulation
      trampoline on the userland stack from the caches so it is visible to
      subsequent instruction fetches.
      
      Commit de8974e3 ("MIPS: asm: r4kcache: Add EVA cache flushing
      functions") updated some protected_ cache flush functions to use EVA
      CACHEE instructions via protected_cachee_op(), and commit 83fd4344
      ("MIPS: r4kcache: Add EVA case for protected_writeback_dcache_line") did
      the same thing for protected_writeback_dcache_line(), but
      protected_writeback_scache_line() never got updated. Lets fix that now
      to flush the right user address from the secondary cache rather than
      some arbitrary kernel unmapped address.
      
      This issue was spotted through code inspection, and it seems unlikely to
      be possible to hit this in practice. It theoretically affect EVA kernels
      on EVA capable cores with an L2 cache, where the icache fetches straight
      from RAM (cpu_icache_snoops_remote_store == 0), running a hard float
      userland with FPU disabled (nofpu). That both Malta and Boston platforms
      override cpu_icache_snoops_remote_store to 1 suggests that all MIPS
      cores fetch instructions into icache straight from L2 rather than RAM.
      
      Fixes: de8974e3 ("MIPS: asm: r4kcache: Add EVA cache flushing functions")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13800/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      0758b116
    • J
      MIPS: SMP: Drop stop_this_cpu() cpu_foreign_map hack · 92696316
      James Hogan 提交于
      Commit cccf34e9 ("MIPS: c-r4k: Fix cache flushing for MT cores")
      added the cpu_foreign_map cpumask containing a single VPE from each
      online core, and recalculated it when secondary CPUs are brought up.
      
      stop_this_cpu() was also updated to recalculate cpu_foreign_map, but
      with an additional hack before marking the CPU as offline to copy
      cpu_online_mask into cpu_foreign_map and perform an SMP memory barrier.
      
      This appears to have been intended to prevent cache management IPIs
      being missed when the VPE representing the core in cpu_foreign_map is
      taken offline while other VPEs remain online. Unfortunately there is
      nothing in this hack to prevent r4k_on_each_cpu() from reading the old
      cpu_foreign_map, and smp_call_function_many() from reading that new
      cpu_online_mask with the core's representative VPE marked offline. It
      then wouldn't send an IPI to any online VPEs of that core.
      
      stop_this_cpu() is only actually called in panic and system shutdown /
      halt / reboot situations, in which case all CPUs are going down and we
      don't really need to care about cache management, so drop this hack.
      
      Note that the __cpu_disable() case for CPU hotplug is handled in the
      previous commit, and no synchronisation is needed there due to the use
      of stop_machine() which prevents hotplug from taking place while any CPU
      has disabled preemption (as r4k_on_each_cpu() does).
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13796/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      92696316
    • J
      MIPS: SMP: Update cpu_foreign_map on CPU disable · 826e99be
      James Hogan 提交于
      When a CPU is disabled via CPU hotplug, cpu_foreign_map is not updated.
      This could result in cache management SMP calls being sent to offline
      CPUs instead of online siblings in the same core.
      
      Add a call to calculate_cpu_foreign_map() in the various MIPS cpu
      disable callbacks after set_cpu_online(). All cases are updated for
      consistency and to keep cpu_foreign_map strictly up to date, not just
      those which may support hardware multithreading.
      
      Fixes: cccf34e9 ("MIPS: c-r4k: Fix cache flushing for MT cores")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: David Daney <david.daney@cavium.com>
      Cc: Kevin Cernekee <cernekee@gmail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Hongliang Tao <taohl@lemote.com>
      Cc: Hua Yan <yanh@lemote.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13799/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      826e99be
    • J
      MIPS: SMP: Clear ASID without confusing has_valid_asid() · a05c3920
      James Hogan 提交于
      The SMP flush_tlb_*() functions may clear the memory map's ASIDs for
      other CPUs if the mm has only a single user (the current CPU) in order
      to avoid SMP calls. However this makes it appear to has_valid_asid(),
      which is used by various cache flush functions, as if the CPUs have
      never run in the mm, and therefore can't have cached any of its memory.
      
      For flush_tlb_mm() this doesn't sound unreasonable.
      
      flush_tlb_range() corresponds to flush_cache_range() which does do full
      indexed cache flushes, but only on the icache if the specified mapping
      is executable, otherwise it doesn't guarantee that there are no cache
      contents left for the mm.
      
      flush_tlb_page() corresponds to flush_cache_page(), which will perform
      address based cache ops on the specified page only, and also only
      touches the icache if the page is executable. It does not guarantee that
      there are no cache contents left for the mm.
      
      For example, this affects flush_cache_range() which uses the
      has_valid_asid() optimisation. It is required to flush the icache when
      mappings are made executable (e.g. using mprotect) so they are
      immediately usable. If some code is changed to non executable in order
      to be modified then it will not be flushed from the icache during that
      time, but the ASID on other CPUs may still be cleared for TLB flushing.
      When the code is changed back to executable, flush_cache_range() will
      assume the code hasn't run on those other CPUs due to the zero ASID, and
      won't invalidate the icache on them.
      
      This is fixed by clearing the other CPUs ASIDs to 1 instead of 0 for the
      above two flush_tlb_*() functions when the corresponding cache flushes
      are likely to be incomplete (non executable range flush, or any page
      flush). This ASID appears valid to has_valid_asid(), but still triggers
      ASID regeneration due to the upper ASID version bits being 0, which is
      less than the minimum ASID version of 1 and so always treated as stale.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13795/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      a05c3920
  3. 28 7月, 2016 6 次提交
  4. 24 7月, 2016 10 次提交
  5. 21 7月, 2016 1 次提交
  6. 12 7月, 2016 1 次提交
  7. 06 7月, 2016 2 次提交
    • D
      MIPS: Fix page table corruption on THP permission changes. · acd168c0
      David Daney 提交于
      When the core THP code is modifying the permissions of a huge page it
      calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit
      of the page table entry.  The result can be kernel messages like:
      
      mm/memory.c:397: bad pmd 000000040080004d.
      mm/memory.c:397: bad pmd 00000003ff00004d.
      mm/memory.c:397: bad pmd 000000040100004d.
      
      or:
      
      ------------[ cut here ]------------
      WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158()
      Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80
      CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4
      Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0
                0000000000000000 0000000000000000 ffffffff85110000 0000000000000119
                0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f
                0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000
                0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80
                ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001
                0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924
                80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780
                80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f
                0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000
                ...
      Call Trace:
      [<ffffffff80865c4c>] show_stack+0x6c/0xf8
      [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8
      [<ffffffff809207a0>] exit_mmap+0x150/0x158
      [<ffffffff80882d44>] mmput+0x5c/0x110
      [<ffffffff8088b450>] do_exit+0x230/0xa68
      [<ffffffff8088be34>] do_group_exit+0x54/0x1d0
      [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18
      
      ---[ end trace c7b38293191c57dc ]---
      BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536
      
      Fix by not clearing _PAGE_HUGE bit.
      Signed-off-by: NDavid Daney <david.daney@cavium.com>
      Tested-by: NAaro Koskinen <aaro.koskinen@nokia.com>
      Cc: stable@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13687/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      acd168c0
    • R
      MIPS: Remove cpu_has_safe_index_cacheops · c00ab489
      Ralf Baechle 提交于
      Very early versions of the 1004K had an hardware issue that made index
      cache ops unsafe so they had to be avoided and hit ops be used instead.
      This may significantly slow down cache maintenance operations.  Only
      very early FPGA versions of the 1004K were affected so let's get rid
      of the workaround which was only implemented for the DMA cache
      maintenance operations anyway.
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      c00ab489
  8. 02 7月, 2016 1 次提交
    • R
      MIPS: Fix possible corruption of cache mode by mprotect. · 6d037de9
      Ralf Baechle 提交于
      The following testcase may result in a page table entries with a invalid
      CCA field being generated:
      
      static void *bindstack;
      
      static int sysrqfd;
      
      static void protect_low(int protect)
      {
      	mprotect(bindstack, BINDSTACK_SIZE, protect);
      }
      
      static void sigbus_handler(int signal, siginfo_t * info, void *context)
      {
      	void *addr = info->si_addr;
      
      	write(sysrqfd, "x", 1);
      
      	printf("sigbus, fault address %p (should not happen, but might)\n",
      	       addr);
      	abort();
      }
      
      static void run_bind_test(void)
      {
      	unsigned int *p = bindstack;
      
      	p[0] = 0xf001f001;
      
      	write(sysrqfd, "x", 1);
      
      	/* Set trap on access to p[0] */
      	protect_low(PROT_NONE);
      
      	write(sysrqfd, "x", 1);
      
      	/* Clear trap on access to p[0] */
      	protect_low(PROT_READ | PROT_WRITE | PROT_EXEC);
      
      	write(sysrqfd, "x", 1);
      
      	/* Check the contents of p[0] */
      	if (p[0] != 0xf001f001) {
      		write(sysrqfd, "x", 1);
      
      		/* Reached, but shouldn't be */
      		printf("badness, shouldn't happen but does\n");
      		abort();
      	}
      }
      
      int main(void)
      {
      	struct sigaction sa;
      
      	sysrqfd = open("/proc/sysrq-trigger", O_WRONLY);
      
      	if (sigprocmask(SIG_BLOCK, NULL, &sa.sa_mask)) {
      		perror("sigprocmask");
      		return 0;
      	}
      
      	sa.sa_sigaction = sigbus_handler;
      	sa.sa_flags = SA_SIGINFO | SA_NODEFER | SA_RESTART;
      	if (sigaction(SIGBUS, &sa, NULL)) {
      		perror("sigaction");
      		return 0;
      	}
      
      	bindstack = mmap(NULL,
      			 BINDSTACK_SIZE,
      			 PROT_READ | PROT_WRITE | PROT_EXEC,
      			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      	if (bindstack == MAP_FAILED) {
      		perror("mmap bindstack");
      		return 0;
      	}
      
      	printf("bindstack: %p\n", bindstack);
      
      	run_bind_test();
      
      	printf("done\n");
      
      	return 0;
      }
      
      There are multiple ingredients for this:
      
       1) PAGE_NONE is defined to _CACHE_CACHABLE_NONCOHERENT, which is CCA 3
          on all platforms except SB1 where it's CCA 5.
       2) _page_cachable_default must have bits set which are not set
          _CACHE_CACHABLE_NONCOHERENT.
       3) Either the defective version of pte_modify for XPA or the standard
          version must be in used.  However pte_modify for the 36 bit address
          space support is no affected.
      
      In that case additional bits in the final CCA mode may generate an invalid
      value for the CCA field.  On the R10000 system where this was tracked
      down for example a CCA 7 has been observed, which is Uncached Accelerated.
      
      Fixed by:
      
       1) Using the proper CCA mode for PAGE_NONE just like for all the other
          PAGE_* pte/pmd bits.
       2) Fix the two affected variants of pte_modify.
      
      Further code inspection also shows the same issue to exist in pmd_modify
      which would affect huge page systems.
      
      Issue in pte_modify tracked down by Alastair Bridgewater, PAGE_NONE
      and pmd_modify issue found by me.
      
      The history of this goes back beyond Linus' git history.  Chris Dearman's
      commit 35133692 ("[MIPS] Allow setting of
      the cache attribute at run time.") missed the opportunity to fix this
      but it was originally introduced in lmo commit
      d523832cf12007b3242e50bb77d0c9e63e0b6518 ("Missing from last commit.")
      and 32cc38229ac7538f2346918a09e75413e8861f87 ("New configuration option
      CONFIG_MIPS_UNCACHED.")
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      Reported-by: NAlastair Bridgewater <alastair.bridgewater@gmail.com>
      6d037de9
  9. 30 6月, 2016 1 次提交
  10. 29 6月, 2016 1 次提交
    • M
      powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 · 190ce869
      Michael Neuling 提交于
      Currently we have 2 segments that are bolted for the kernel linear
      mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
      stacks. Anything accessed outside of these regions may need to be
      faulted in. (In practice machines with TM always have 1T segments)
      
      If a machine has < 2TB of memory we never fault on the kernel linear
      mapping as these two segments cover all physical memory. If a machine
      has > 2TB of memory, there may be structures outside of these two
      segments that need to be faulted in. This faulting can occur when
      running as a guest as the hypervisor may remove any SLB that's not
      bolted.
      
      When we treclaim and trecheckpoint we have a window where we need to
      run with the userspace GPRs. This means that we no longer have a valid
      stack pointer in r1. For this window we therefore clear MSR RI to
      indicate that any exceptions taken at this point won't be able to be
      handled. This means that we can't take segment misses in this RI=0
      window.
      
      In this RI=0 region, we currently access the thread_struct for the
      process being context switched to or from. This thread_struct access
      may cause a segment fault since it's not guaranteed to be covered by
      the two bolted segment entries described above.
      
      We've seen this with a crash when running as a guest with > 2TB of
      memory on PowerVM:
      
        Unrecoverable exception 4100 at c00000000004f138
        Oops: Unrecoverable exception, sig: 6 [#1]
        SMP NR_CPUS=2048 NUMA pSeries
        CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G                 X 4.4.13-46-default #1
        task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
        NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
        REGS: c000189001d5f730 TRAP: 4100   Tainted: G                 X  (4.4.13-46-default)
        MSR: 8000000100001031 <SF,ME,IR,DR,LE>  CR: 24000048  XER: 00000000
        CFAR: c00000000004ed18 SOFTE: 0
        GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
        GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
        GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
        GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
        GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
        GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
        GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
        GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
        NIP [c00000000004f138] restore_gprs+0xd0/0x16c
        LR [0000000010003a24] 0x10003a24
        Call Trace:
        [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
        [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
        [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
        [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
        [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
        [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
        [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
        [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
        Instruction dump:
        7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
        38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098
        ---[ end trace 602126d0a1dedd54 ]---
      
      This fixes this by copying the required data from the thread_struct to
      the stack before we clear MSR RI. Then once we clear RI, we only access
      the stack, guaranteeing there's no segment miss.
      
      We also tighten the region over which we set RI=0 on the treclaim()
      path. This may have a slight performance impact since we're adding an
      mtmsr instruction.
      
      Fixes: 090b9284 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Reviewed-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      190ce869
  11. 28 6月, 2016 2 次提交