1. 01 Sep 2017, 12 commits
    • powerpc: Emulate the dcbz instruction · b2543f7b
      Committed by Paul Mackerras
      This adds code to analyse_instr() and emulate_step() to understand the
      dcbz (data cache block zero) instruction.  The emulate_dcbz() function
      is made public so it can be used by the alignment handler in future.
      (The apparently unnecessary cropping of the address to 32 bits is
      there because it will be needed in that situation.)
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Emulate load/store floating double pair instructions · 1f41fb79
      Committed by Paul Mackerras
      This adds lfdp[x] and stfdp[x] to the set of instructions that
      analyse_instr() and emulate_step() understand.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Emulate vector element load/store instructions · e61ccc7b
      Committed by Paul Mackerras
      This adds code to analyse_instr() and emulate_step() to handle the
      vector element loads and stores:
      
      lvebx, lvehx, lvewx, stvebx, stvehx, stvewx.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Emulate FP/vector/VSX loads/stores correctly when regs not live · c22435a5
      Committed by Paul Mackerras
      At present, the analyse_instr/emulate_step code checks for the
      relevant MSR_FP/VEC/VSX bit being set when a FP/VMX/VSX load
      or store is decoded, but doesn't recheck the bit before reading or
      writing the relevant FP/VMX/VSX register in emulate_step().
      
      Since we don't have preemption disabled, it is possible that we get
      preempted between checking the MSR bit and doing the register access.
      If that happened, then the registers would have been saved to the
      thread_struct for the current process.  Accesses to the CPU registers
      would then potentially read stale values, or write values that would
      never be seen by the user process.
      
      Another way that the registers can become non-live is if a page
      fault occurs when accessing user memory, and the page fault code
      calls a copy routine that wants to use the VMX or VSX registers.
      
      To fix this, the code for all the FP/VMX/VSX loads gets restructured
      so that it forms an image in a local variable of the desired register
      contents, then disables preemption, checks the MSR bit and either
      sets the CPU register or writes the value to the thread struct.
      Similarly, the code for stores checks the MSR bit, copies either the
      CPU register or the thread struct to a local variable, then reenables
      preemption and then copies the register image to memory.
      
      If the instruction being emulated is in the kernel, then we must not
      use the register values in the thread_struct.  In this case, if the
      relevant MSR enable bit is not set, then emulate_step refuses to
      emulate the instruction.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Make load/store emulation use larger memory accesses · e0a0986b
      Committed by Paul Mackerras
      At the moment, emulation of loads and stores of up to 8 bytes to
      unaligned addresses on a little-endian system uses a sequence of
      single-byte loads or stores to memory.  This is rather inefficient,
      and the code is hard to follow because it has many ifdefs.
      In addition, the Power ISA has requirements on how unaligned accesses
      are performed, which are not met by doing all accesses as
      sequences of single-byte accesses.
      
      Emulation of VSX loads and stores uses __copy_{to,from}_user,
      which means the emulation code has no control on the size of
      accesses.
      
      To simplify this, we add new copy_mem_in() and copy_mem_out()
      functions for accessing memory.  These use a sequence of the largest
      possible aligned accesses, up to 8 bytes (or 4 on 32-bit systems),
      to copy memory between a local buffer and user memory.  We then
      rewrite {read,write}_mem_unaligned and the VSX load/store
      emulation using these new functions.
      
      These new functions also simplify the code in do_fp_load() and
      do_fp_store() for the unaligned cases.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Add emulation for the addpcis instruction · 958465ee
      Committed by Paul Mackerras
      The addpcis instruction adds a constant to the address of the next
      instruction and places the result in a register.  Since the result depends on the
      address of the instruction, it will give an incorrect result if it
      is single-stepped out of line, which is what the *probes subsystem
      will currently do if a probe is placed on an addpcis instruction.
      This fixes the problem by adding emulation of it to analyse_instr().
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Don't update CR0 in emulation of popcnt, prty, bpermd instructions · 5762e083
      Committed by Paul Mackerras
      The architecture shows the least-significant bit of the instruction
      word as reserved for the popcnt[bwd], prty[wd] and bpermd
      instructions, that is, these instructions never update CR0.
      Therefore this changes the emulation of these instructions to
      skip the CR0 update.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Fix emulation of the isel instruction · f1bbb99f
      Committed by Paul Mackerras
      The case added for the isel instruction was added inside a switch
      statement which uses the 10-bit minor opcode field in the 0x7fe
      bits of the instruction word.  However, for the isel instruction,
      the minor opcode field is only the 0x3e bits, and the 0x7c0 bits
      are used for the "BC" field, which indicates which CR bit to use
      to select the result.
      
      Therefore, for the isel emulation to work correctly when BC != 0,
      we need to match on ((instr >> 1) & 0x1f) == 15.  To do this, we
      pull the isel case out of the switch statement and put it in an
      if statement of its own.
      
      Fixes: e27f71e5 ("powerpc/lib/sstep: Add isel instruction emulation")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64: Fix update forms of loads and stores to write 64-bit EA · d120cdbc
      Committed by Paul Mackerras
      When a 64-bit processor is executing in 32-bit mode, the update forms
      of load and store instructions are required by the architecture to
      write the full 64-bit effective address into the RA register, though
      only the bottom 32 bits are used to address memory.  Currently,
      the instruction emulation code writes the truncated address to the
      RA register.  This fixes it by keeping the full 64-bit EA in the
      instruction_op structure, truncating the address in emulate_step()
      where it is used to address memory, rather than in the address
      computations in analyse_instr().
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Handle most loads and stores in instruction emulation code · 350779a2
      Committed by Paul Mackerras
      This extends the instruction emulation infrastructure in sstep.c to
      handle all the load and store instructions defined in the Power ISA
      v3.0, except for the atomic memory operations, ldmx (which was never
      implemented), lfdp/stfdp, and the vector element load/stores.
      
      The instructions added are:
      
      Integer loads and stores: lbarx, lharx, lqarx, stbcx., sthcx., stqcx.,
      lq, stq.
      
      VSX loads and stores: lxsiwzx, lxsiwax, stxsiwx, lxvx, lxvl, lxvll,
      lxvdsx, lxvwsx, stxvx, stxvl, stxvll, lxsspx, lxsdx, stxsspx, stxsdx,
      lxvw4x, lxsibzx, lxvh8x, lxsihzx, lxvb16x, stxvw4x, stxsibx, stxvh8x,
      stxsihx, stxvb16x, lxsd, lxssp, lxv, stxsd, stxssp, stxv.
      
      These instructions are handled both in the analyse_instr phase and in
      the emulate_step phase.
      
      The code for lxvd2ux and stxvd2ux has been taken out, as those
      instructions were never implemented in any processor and have been
      taken out of the architecture, and their opcodes have been reused for
      other instructions in POWER9 (lxvb16x and stxvb16x).
      
      The emulation for the VSX loads and stores uses helper functions
      which don't access registers or memory directly, which can hopefully
      be reused by KVM later.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Don't check MSR FP/VMX/VSX enable bits in analyse_instr() · ee0a54d7
      Committed by Paul Mackerras
      This removes the checks for the FP/VMX/VSX enable bits in the MSR
      from analyse_instr() and adds them to emulate_step() instead.
      
      The reason for this is that we may want to use analyse_instr() in
      a situation where the FP/VMX/VSX register values are stored in the
      current thread_struct and the FP/VMX/VSX enable bits in the MSR
      image in the pt_regs are zero.  Since analyse_instr() doesn't make
      any changes to register state, it is reasonable for it to indicate
      what the effect of an instruction would be even though the relevant
      enable bit is off.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Change analyse_instr so it doesn't modify *regs · 3cdfcbfd
      Committed by Paul Mackerras
      The analyse_instr function currently doesn't just work out what an
      instruction does, it also executes those instructions whose effect
      is only to update CPU registers that are stored in struct pt_regs.
      This is undesirable because optprobes uses analyse_instr to work out
      if an instruction could be successfully emulated in future.
      
      This changes analyse_instr so it doesn't modify *regs; instead it
      stores information in the instruction_op structure to indicate what
      registers (GPRs, CR, XER, LR) would be set and what value they would
      be set to.  A companion function called emulate_update_regs() can
      then use that information to update a pt_regs struct appropriately.
      
      As a minor cleanup, this replaces inline asm using the cntlzw and
      cntlzd instructions with calls to __builtin_clz() and __builtin_clzl().
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  2. 10 Aug 2017, 5 commits
  3. 12 Jul 2017, 2 commits
  4. 23 Apr 2017, 1 commit
    • powerpc/kprobes: Convert __kprobes to NOKPROBE_SYMBOL() · 71f6e58e
      Committed by Naveen N. Rao
      Along similar lines as commit 9326638c ("kprobes, x86: Use NOKPROBE_SYMBOL()
      instead of __kprobes annotation"), convert __kprobes annotation to either
      NOKPROBE_SYMBOL() or nokprobe_inline. The latter forces inlining, in which case
      the caller needs to be added to NOKPROBE_SYMBOL().
      
      Also:
       - blacklist arch_deref_entry_point(), and
       - convert a few regular inlines to nokprobe_inline in lib/sstep.c
      
      A key benefit is the ability to detect such symbols as being
      blacklisted. Before this patch:
      
        $ cat /sys/kernel/debug/kprobes/blacklist | grep read_mem
        $ perf probe read_mem
        Failed to write event: Invalid argument
          Error: Failed to add events.
        $ dmesg | tail -1
        [ 3736.112815] Could not insert probe at _text+10014968: -22
      
      After patch:
        $ cat /sys/kernel/debug/kprobes/blacklist | grep read_mem
        0xc000000000072b50-0xc000000000072d20	read_mem
        $ perf probe read_mem
        read_mem is blacklisted function, skip it.
        Added new events:
          (null):(null)        (on read_mem)
          probe:read_mem       (on read_mem)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe:read_mem -aR sleep 1
      
        $ grep " read_mem" /proc/kallsyms
        c000000000072b50 t read_mem
        c0000000005f3b40 t read_mem
        $ cat /sys/kernel/debug/kprobes/list
        c0000000005f3b48  k  read_mem+0x8    [DISABLED]
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      [mpe: Minor change log formatting, fix up some conflicts]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  5. 03 Mar 2017, 1 commit
  6. 25 Jan 2017, 1 commit
  7. 25 Dec 2016, 1 commit
  8. 18 Nov 2016, 1 commit
  9. 14 Nov 2016, 1 commit
  10. 11 May 2016, 2 commits
    • powerpc/sstep: Fix emulation fall-through · 66707836
      Committed by Oliver O'Halloran
      There is a switch fall-through in analyse_instr() which can cause an
      invalid instruction to be emulated as a different, valid, instruction.
      The rld* (opcode 30) case extracts a sub-opcode from bits 3:1 of the
      instruction word. However, the only valid values of this field are 001
      and 000. These cases are correctly handled, but the others are not which
      causes execution to fall through into case 31.
      
      Breaking out of the switch causes the instruction to be marked as
      unknown and allows the caller to deal with the invalid instruction in a
      manner consistent with other invalid instructions.
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/sstep: Fix sstep.c compile on powerpcspe · dd217310
      Committed by Lennart Sorensen
      Commit be96f633 ("powerpc: Split out instruction analysis part of
      emulate_step()") introduced ldarx and stdcx into the instructions in
      sstep.c, which are not accepted by the assembler on powerpcspe, but does
      seem to be accepted by the normal powerpc assembler even in 32 bit mode.
      
      Wrap these two instructions in a __powerpc64__ check like it is
      everywhere else in the file.
      
      Fixes: be96f633 ("powerpc: Split out instruction analysis part of emulate_step()")
      Signed-off-by: Len Sorensen <lsorense@csclub.uwaterloo.ca>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  11. 12 Nov 2014, 1 commit
  12. 25 Sep 2014, 3 commits
  13. 22 Jul 2014, 1 commit
  14. 11 Jun 2014, 1 commit
  15. 30 Oct 2013, 2 commits
  16. 25 Sep 2013, 1 commit
    • powerpc: Remove ksp_limit on ppc64 · cbc9565e
      Committed by Benjamin Herrenschmidt
      We've been keeping that field in thread_struct for a while, it contains
      the "limit" of the current stack pointer and is meant to be used for
      detecting stack overflows.
      
      It has a few problems however:
      
       - First, it was never actually *used* on 64-bit. Set and updated but
      not actually exploited
      
       - When switching stacks to/from the irq and softirq stacks, its update
      is racy unless we hard-disable interrupts, which is costly.  This
      is fine on 32-bit, as we don't soft-disable there, but not on 64-bit.
      
      Thus, rather than fixing problem 2 in order to implement point 1 at some
      hypothetical future time, let's remove the code completely from 64-bit. In order to avoid
      a clutter of ifdef's, we remove the updates from C code completely
      during interrupt stack switching, and instead maintain it from the
      asm helper that is used to do the stack switching in the first place.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  17. 27 Aug 2013, 1 commit
  18. 20 Jun 2013, 1 commit
  19. 18 Sep 2012, 1 commit
  20. 21 May 2011, 1 commit
    • sanitize <linux/prefetch.h> usage · 268bb0ce
      Committed by Linus Torvalds
      Commit e66eed65 ("list: remove prefetching from regular list
      iterators") removed the include of prefetch.h from list.h, which
      uncovered several cases that had apparently relied on that rather
      obscure header file dependency.
      
      So this fixes things up a bit, using
      
         grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
         grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')
      
      to guide us in finding files that either need <linux/prefetch.h>
      inclusion, or have it despite not needing it.
      
      There are more of them around (mostly network drivers), but this gets
      many core ones.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>