1. 01 11月, 2014 2 次提交
    • S
      ftrace/x86: Show trampoline call function in enabled_functions · 15d5b02c
      Steven Rostedt (Red Hat) 提交于
      The file /sys/kernel/debug/tracing/eneabled_functions is used to debug
      ftrace function hooks. Add to the output what function is being called
      by the trampoline if the arch supports it.
      
      Add support for this feature in x86_64.
      
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Tested-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: NJiri Kosina <jkosina@suse.cz>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      15d5b02c
    • S
      ftrace/x86: Add dynamic allocated trampoline for ftrace_ops · f3bea491
      Steven Rostedt (Red Hat) 提交于
      The current method of handling multiple function callbacks is to register
      a list function callback that calls all the other callbacks based on
      their hash tables and compare it to the function that the callback was
      called on. But this is very inefficient.
      
      For example, if you are tracing all functions in the kernel and then
      add a kprobe to a function such that the kprobe uses ftrace, the
      mcount trampoline will switch from calling the function trace callback
      to calling the list callback that will iterate over all registered
      ftrace_ops (in this case, the function tracer and the kprobes callback).
      That means for every function being traced it checks the hash of the
      ftrace_ops for function tracing and kprobes, even though the kprobes
      is only set at a single function. The kprobes ftrace_ops is checked
      for every function being traced!
      
      Instead of calling the list function for functions that are only being
      traced by a single callback, we can call a dynamically allocated
      trampoline that calls the callback directly. The function graph tracer
      already uses a direct call trampoline when it is being traced by itself
      but it is not dynamically allocated. It's trampoline is static in the
      kernel core. The infrastructure that called the function graph trampoline
      can also be used to call a dynamically allocated one.
      
      For now, only ftrace_ops that are not dynamically allocated can have
      a trampoline. That is, users such as function tracer or stack tracer.
      kprobes and perf allocate their ftrace_ops, and until there's a safe
      way to free the trampoline, it can not be used. The dynamically allocated
      ftrace_ops may, although, use the trampoline if the kernel is not
      compiled with CONFIG_PREEMPT. But that will come later.
      Tested-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: NJiri Kosina <jkosina@suse.cz>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f3bea491
  2. 19 10月, 2014 3 次提交
    • D
      sparc64: Do not define thread fpregs save area as zero-length array. · e2653143
      David S. Miller 提交于
      This breaks the stack end corruption detection facility.
      
      What that facility does it write a magic value to "end_of_stack()"
      and checking to see if it gets overwritten.
      
      "end_of_stack()" is "task_thread_info(p) + 1", which for sparc64 is
      the beginning of the FPU register save area.
      
      So once the user uses the FPU, the magic value is overwritten and the
      debug checks trigger.
      
      Fix this by making the size explicit.
      
      Due to the size we use for the fpsaved[], gsr[], and xfsr[] arrays we
      are limited to 7 levels of FPU state saves.  So each FPU register set
      is 256 bytes, allocate 256 * 7 for the fpregs area.
      Reported-by: NMeelis Roos <mroos@linux.ee>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e2653143
    • D
      sparc64: Fix corrupted thread fault code. · 84bd6d8b
      David S. Miller 提交于
      Every path that ends up at do_sparc64_fault() must install a valid
      FAULT_CODE_* bitmask in the per-thread fault code byte.
      
      Two paths leading to the label winfix_trampoline (which expects the
      FAULT_CODE_* mask in register %g4) were not doing so:
      
      1) For pre-hypervisor TLB protection violation traps, if we took
         the 'winfix_trampoline' path we wouldn't have %g4 initialized
         with the FAULT_CODE_* value yet.  Resulting in using the
         TLB_TAG_ACCESS register address value instead.
      
      2) In the TSB miss path, when we notice that we are going to use a
         hugepage mapping, but we haven't allocated the hugepage TSB yet, we
         still have to take the window fixup case into consideration and
         in that particular path we leave %g4 not setup properly.
      
      Errors on this sort were largely invisible previously, but after
      commit 4ccb9272 ("sparc64: sun4v TLB
      error power off events") we now have a fault_code mask bit
      (FAULT_CODE_BAD_RA) that triggers due to this bug.
      
      FAULT_CODE_BAD_RA triggers because this bit is set in TLB_TAG_ACCESS
      (see #1 above) and thus we get seemingly random bus errors triggered
      for user processes.
      
      Fixes: 4ccb9272 ("sparc64: sun4v TLB error power off events")
      Reported-by: NMeelis Roos <mroos@linux.ee>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      84bd6d8b
    • A
      x86,kvm,vmx: Preserve CR4 across VM entry · d974baa3
      Andy Lutomirski 提交于
      CR4 isn't constant; at least the TSD and PCE bits can vary.
      
      TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
      like it's correct.
      
      This adds a branch and a read from cr4 to each vm entry.  Because it is
      extremely likely that consecutive entries into the same vcpu will have
      the same host cr4 value, this fixes up the vmcs instead of restoring cr4
      after the fact.  A subsequent patch will add a kernel-wide cr4 shadow,
      reducing the overhead in the common case to just two memory reads and a
      branch.
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: Petr Matousek <pmatouse@redhat.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d974baa3
  3. 17 10月, 2014 3 次提交
  4. 16 10月, 2014 5 次提交
  5. 15 10月, 2014 5 次提交
    • S
      arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort · 3d08c629
      Steve Capper 提交于
      Commit:
      b8865767 ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping
      
      introduced some code in user_mem_abort that failed to compile if
      STRICT_MM_TYPECHECKS was enabled.
      
      This patch fixes up the failing comparison.
      Signed-off-by: NSteve Capper <steve.capper@linaro.org>
      Reviewed-by: NKim Phillips <kim.phillips@linaro.org>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      3d08c629
    • O
      ARM: sunxi_defconfig: enable CONFIG_REGULATOR · 9a2ad529
      Olof Johansson 提交于
      Commit 97a13e52 ('net: phy: mdio-sun4i: don't select REGULATOR') removed
      the select of REGULATOR, which means that it now has to be explicitly
      enabled in the defconfig or things won't work very well.
      
      In particular, this fixes a problem with SD/MMC not probing on my A31-based
      board.
      
      Cc: Beniamino Galvani <b.galvani@gmail.com>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      9a2ad529
    • D
      sparc64: Fix FPU register corruption with AES crypto offload. · f4da3628
      David S. Miller 提交于
      The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the
      key material is preloaded into the FPU registers, and then we loop
      over and over doing the crypt operation, reusing those pre-cooked key
      registers.
      
      There are intervening blkcipher*() calls between the crypt operation
      calls.  And those might perform memcpy() and thus also try to use the
      FPU.
      
      The sparc64 kernel FPU usage mechanism is designed to allow such
      recursive uses, but with a catch.
      
      There has to be a trap between the two FPU using threads of control.
      
      The mechanism works by, when the FPU is already in use by the kernel,
      allocating a slot for FPU saving at trap time.  Then if, within the
      trap handler, we try to use the FPU registers, the pre-trap FPU
      register state is saved into the slot.  Then at trap return time we
      notice this and restore the pre-trap FPU state.
      
      Over the long term there are various more involved ways we can make
      this work, but for a quick fix let's take advantage of the fact that
      the situation where this happens is very limited.
      
      All sparc64 chips that support the crypto instructiosn also are using
      the Niagara4 memcpy routine, and that routine only uses the FPU for
      large copies where we can't get the source aligned properly to a
      multiple of 8 bytes.
      
      We look to see if the FPU is already in use in this context, and if so
      we use the non-large copy path which only uses integer registers.
      
      Furthermore, we also limit this special logic to when we are doing
      kernel copy, rather than a user copy.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f4da3628
    • I
    • A
      x86: bpf_jit: fix two bugs in eBPF JIT compiler · e0ee9c12
      Alexei Starovoitov 提交于
      1.
      JIT compiler using multi-pass approach to converge to final image size,
      since x86 instructions are variable length. It starts with large
      gaps between instructions (so some jumps may use imm32 instead of imm8)
      and iterates until total program size is the same as in previous pass.
      This algorithm works only if program size is strictly decreasing.
      Programs that use LD_ABS insn need additional code in prologue, but it
      was not emitted during 1st pass, so there was a chance that 2nd pass would
      adjust imm32->imm8 jump offsets to the same number of bytes as increase in
      prologue, which may cause algorithm to erroneously decide that size converged.
      Fix it by always emitting largest prologue in the first pass which
      is detected by oldproglen==0 check.
      Also change error check condition 'proglen != oldproglen' to fail gracefully.
      
      2.
      while staring at the code realized that 64-byte buffer may not be enough
      when 1st insn is large, so increase it to 128 to avoid buffer overflow
      (theoretical maximum size of prologue+div is 109) and add runtime check.
      
      Fixes: 62258278 ("net: filter: x86: internal BPF JIT")
      Reported-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Tested-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e0ee9c12
  6. 14 10月, 2014 16 次提交
  7. 13 10月, 2014 3 次提交
  8. 12 10月, 2014 1 次提交
    • H
      parisc: Reduce SIGRTMIN from 37 to 32 to behave like other Linux architectures · 1f25df2e
      Helge Deller 提交于
      This patch reduces the value of SIGRTMIN on PARISC from 37 to 32, thus
      increasing the number of available RT signals and bring it in sync with other
      Linux architectures.
      
      Historically we wanted to natively support HP-UX 32bit binaries with the
      PA-RISC Linux port.  Because of that we carried the various available signals
      from HP-UX (e.g. SIGEMT and SIGLOST) and folded them in between the native
      Linux signals.  Although this was the right decision at that time, this
      required us to increase SIGRTMIN to at least 37 which left us with 27 (64-37)
      RT signals.
      
      Those 27 RT signals haven't been a problem in the past, but with the upcoming
      importance of systemd we now got the problem that systemd alloctes (hardcoded)
      signals up to SIGRTMIN+29 which is beyond our NSIG of 64. Because of that we
      have not been able to use systemd on the PARISC Linux port yet.
      
      Of course we could ask the systemd developers to not use those hardcoded
      values, but this change is very unlikely, esp. with PA-RISC being a niche
      architecture.
      
      The other possibility would be to increase NSIG to e.g. 128, but this would
      mean to duplicate most of the existing Linux signal handling code into the
      parisc specific Linux kernel tree which would most likely introduce lots of new
      bugs beside the code duplication.
      
      The third option is to drop some HP-UX signals and shuffle some other signals
      around to bring SIGRTMIN to 32.  This is of course an ABI change, but testing
      has shown that existing Linux installations are not visibly affected by this
      change - most likely because we move those signals around which are rarely used
      and move them to slots which haven't been used in Linux yet. In an existing
      installation I was able to exchange either the Linux kernel or glibc (or both)
      without affecting the boot process and installed applications.
      
      Dropping the HP-UX signals isn't an issue either, since support for HP-UX was
      basically dropped a few months back with Kernel 3.14 in commit
      f5a408d5 already, when we changed EWOULDBLOCK
      to be equal to EAGAIN.
      
      So, even if this is an ABI change, it's better to change it now and thus bring
      PARISC Linux in sync with other architectures to avoid other issues in the
      future.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Cc: Carlos O'Donell <carlos@systemhalted.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: PARISC Linux Kernel Mailinglist <linux-parisc@vger.kernel.org>
      Tested-by: NAaro Koskinen <aaro.koskinen@iki.fi>
      1f25df2e
  9. 11 10月, 2014 2 次提交
    • D
      sparc64: Fix lockdep warnings on reboot on Ultra-5 · bdcf81b6
      David S. Miller 提交于
      Inconsistently, the raw_* IRQ routines do not interact with and update
      the irqflags tracing and lockdep state, whereas the raw_* spinlock
      interfaces do.
      
      This causes problems in p1275_cmd_direct() because we disable hardirqs
      by hand using raw_local_irq_restore() and then do a raw_spin_lock()
      which triggers a lockdep trace because the CPU's hw IRQ state doesn't
      match IRQ tracing's internal software copy of that state.
      
      The CPU's irqs are disabled, yet current->hardirqs_enabled is true.
      
      ====================
      reboot: Restarting system
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3536 check_flags+0x7c/0x240()
      DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
      Modules linked in: openpromfs
      CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: G        W      3.17.0-dirty #145
      Call Trace:
       [000000000045919c] warn_slowpath_common+0x5c/0xa0
       [0000000000459210] warn_slowpath_fmt+0x30/0x40
       [000000000048f41c] check_flags+0x7c/0x240
       [0000000000493280] lock_acquire+0x20/0x1c0
       [0000000000832b70] _raw_spin_lock+0x30/0x60
       [000000000068f2fc] p1275_cmd_direct+0x1c/0x60
       [000000000068ed28] prom_reboot+0x28/0x40
       [000000000043610c] machine_restart+0x4c/0x80
       [000000000047d2d4] kernel_restart+0x54/0x80
       [000000000047d618] SyS_reboot+0x138/0x200
       [00000000004060b4] linux_sparc_syscall32+0x34/0x60
      ---[ end trace 5c439fe81c05a100 ]---
      possible reason: unannotated irqs-off.
      irq event stamp: 2010267
      hardirqs last  enabled at (2010267): [<000000000049a358>] vprintk_emit+0x4b8/0x580
      hardirqs last disabled at (2010266): [<0000000000499f08>] vprintk_emit+0x68/0x580
      softirqs last  enabled at (2010046): [<000000000045d278>] __do_softirq+0x378/0x4a0
      softirqs last disabled at (2010039): [<000000000042bf08>] do_softirq_own_stack+0x28/0x40
      Resetting ...
      ====================
      
      Use local_* variables of the hw IRQ interfaces so that IRQ tracing sees
      all of our changes.
      Reported-by: NMeelis Roos <mroos@linux.ee>
      Tested-by: NMeelis Roos <mroos@linux.ee>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bdcf81b6
    • I
      dtb: Add 10GbE node to APM X-Gene SoC device tree · 5fb32417
      Iyappan Subramanian 提交于
      Added 10GbE interface and clock nodes.
      Signed-off-by: NIyappan Subramanian <isubramanian@apm.com>
      Signed-off-by: NKeyur Chudgar <kchudgar@apm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5fb32417